Current automatic deception detection approaches tend to rely on cues that are based either on specific lexical items or on linguistically abstract features that are not necessarily motivated by the psychology of deception. Notably, while approaches relying on such features can do well when the content domain is similar for training and testing, they suffer when content changes occur. We investigate new linguistically defined features that aim to capture specific details, a psychologically motivated aspect of truthful versus deceptive language that may be diagnostic across content domains. To ascertain the potential utility of these features, we evaluate them on data sets representing a broad sample of deceptive language, including hotel reviews, opinions about emotionally charged topics, and answers to job interview questions. We additionally evaluate these features as part of a deception detection classifier. We find that these linguistically defined specific detail features are most useful for cross-domain deception detection when the training data differ significantly in content from the test data, and particularly benefit classification accuracy on deceptive documents. We discuss implications of our results for general-purpose approaches to deception detection.
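The feature definitions themselves are not given in this abstract; purely as a hypothetical illustration of the general idea, the Python sketch below counts crude surface proxies for specific detail (numerals and spatial prepositions). The proxy choices are assumptions for illustration, not the study's actual linguistically defined features.

```python
import re

# Hypothetical proxies for "specific detail" -- illustrative only,
# not the linguistically defined features used in the study.
SPATIAL_PREPOSITIONS = {"in", "on", "at", "under", "above", "behind",
                        "beside", "between", "near", "inside"}

def specificity_features(text: str) -> dict:
    """Count simple surface cues that tend to accompany specific detail."""
    tokens = re.findall(r"[A-Za-z]+|\d+", text.lower())
    n = max(len(tokens), 1)
    numbers = sum(tok.isdigit() for tok in tokens)
    spatial = sum(tok in SPATIAL_PREPOSITIONS for tok in tokens)
    return {
        "number_rate": numbers / n,   # numerals per token
        "spatial_rate": spatial / n,  # spatial prepositions per token
    }

print(specificity_features("We met at 7 pm in the lobby near gate 22."))
```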
A common task arising in many contexts is rewriting parts of a given input string into another form. Subparts of the input that match specific conditions are replaced by other output parts; in this way, the complete input string is translated to a new output form. Due to the importance of text rewriting, many programming languages offer matching/rewriting operations for subexpressions of strings, also called replace rules. When strictly regular relations and functions are used to represent replace rules, a cascade of replace rules can be composed into a single transducer. If the transducer is functional, an equivalent bimachine or (in some cases) a subsequential transducer can be built, thus achieving theoretically and practically optimal text processing speed. In this chapter we introduce basic constructions for building text rewriting transducers and bimachines from replace rules and provide implementations. A first, simple version in general leads to an ambiguous form of text rewriting with several outputs. A second, more sophisticated construction resolves conflicts using the leftmost-longest match strategy and leads to functional devices.
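As a minimal sketch of the leftmost-longest match strategy (not the transducer or bimachine constructions the chapter builds), the following Python function applies a dictionary of replace rules by scanning left to right and preferring the longest matching left-hand side at each position.

```python
def rewrite_leftmost_longest(text: str, rules: dict) -> str:
    """Apply replace rules with the leftmost-longest match strategy:
    scan left to right; at each position pick the longest matching
    left-hand side; unmatched characters are copied unchanged."""
    out, i = [], 0
    while i < len(text):
        best = None
        for lhs in rules:
            if text.startswith(lhs, i) and (best is None or len(lhs) > len(best)):
                best = lhs
        if best is not None:
            out.append(rules[best])
            i += len(best)
        else:
            out.append(text[i])
            i += 1
    return "".join(out)

# "ab" wins over "a" at position 0: leftmost position first, then longest match.
print(rewrite_leftmost_longest("abc", {"a": "X", "ab": "Y"}))  # -> "Yc"
```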
An important generalization of classical finite-state automata are multi-tape automata, which are used for recognizing relations of a particular type. The so-called regular relations (also referred to as ‘rational relations’) offer a natural way to formalize all kinds of translations and transformations, which makes multi-tape automata interesting for many practical applications and explains the general interest in this kind of device. A natural subclass are monoidal finite-state transducers, which can be defined as two-tape automata where the first tape reads strings. In this chapter we present the most important properties of monoidal multi-tape automata in general and monoidal finite-state transducers in particular. We show that the class of relations recognized by n-tape automata is closed under a number of useful relational operations such as composition, Cartesian product, projection, and inversion. We further present a procedure for deciding the functionality of classical finite-state transducers.
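The closure under composition can be illustrated with a toy product construction. The sketch below assumes letter transducers without epsilon transitions and a simple dictionary encoding; both are simplifications chosen for brevity, not the chapter's constructions.

```python
from itertools import product

# A letter transducer: transitions {(state, in_letter): (out_letter, next_state)}.
# Epsilon transitions are omitted to keep the product construction minimal.

def compose(t1, t2):
    """Product construction for T2 after T1: T1's output letter is fed
    into T2 as input. States of the result are pairs (q1, q2)."""
    trans = {}
    for (q1, a), (b, r1) in t1["trans"].items():
        for (q2, c), (d, r2) in t2["trans"].items():
            if b == c:  # T1's output must match T2's input
                trans[((q1, q2), a)] = (d, (r1, r2))
    return {
        "start": (t1["start"], t2["start"]),
        "finals": set(product(t1["finals"], t2["finals"])),
        "trans": trans,
    }

# T1 maps a -> b, T2 maps b -> c; the composition maps a -> c.
t1 = {"start": 0, "finals": {1}, "trans": {(0, "a"): ("b", 1)}}
t2 = {"start": 0, "finals": {1}, "trans": {(0, "b"): ("c", 1)}}
print(compose(t1, t2)["trans"])  # {((0, 0), 'a'): ('c', (1, 1))}
```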
In this article, based on four decades of experience of using samples in diverse ways in experimental, particularly electroacoustic, compositions, the author investigates the world of what he calls ‘sample-based sound-based music’ and suggests that there is a relative lack of scholarship in this important area. The article’s contextual sections briefly delineate this world of sonic creativity and place it within today’s sampling culture; they also address two political aspects of sampling, namely a musician’s attitude towards the reuse of sonic materials and the legality of sampled sounds, including musical passages, in the discussion of which current legislation related to sampling is challenged. Following this, a number of categories are presented in terms of the types of sampling material being used as well as how sample-based works are presented. The subsequent section is perhaps the most poignant in the article, namely the opening up of this form of innovative composition from a more traditional ‘artist creates work’ mode of operation to a more collaborative one, which is essentially already part of most other forms of sampling culture. The objective here is to suggest that such collaborative approaches will enable sample-based sound-based music to become part of the lives of a much broader group than those currently involved with it.
Reproduction (playback) is responsible for the presentation of the full spectrum of sound character captured during the recording process. The control of this, and the faithfulness to an original sound, has informed modern sound aesthetics. Current modes of reproduction, such as streaming, see the listener more interested in an approximate presentation of sound rather than a broad and more psychoacoustically pleasing one. In the sonic arts, through the practice of sound recycling and its associated methodologies, reproduction is re-contextualised, involving material that is borrowed, reworked and often disconnected from its source. Such issues are considered in this article through the examination of sound recycling in 94 diskont (1995), an album produced by the German act Oval. By studying the use of material and medium in the work, an attempt is made to discuss approaches to sound recycling through conceptual frameworks proposed by Bregman, Deleuze, Guattari and Smalley, so as to provide a basis for interpreting sound recycling in wider sonic arts practices.
In this chapter we introduce C(M), a new programming language. C(M) statements and expressions closely resemble the notation commonly used for presenting formal constructions in a Tarskian-style set-theoretic language. The usual set-theoretic objects such as sets, functions, relations and tuples are naturally integrated in the language. In contrast to imperative languages such as C or Java, C(M) is a functional declarative programming language. C(M) has many similarities with Haskell but, like SETL, makes use of standard mathematical notation. The C(M) compiler translates a well-formed C(M) program into efficient C code, which can be executed after compilation. Since C(M) programs are easy to read, no separate pseudo-code description is needed.
In this chapter we introduce the bimachine, a deterministic finite-state device that represents exactly the class of all regular string functions. We prove this correspondence, using as a key ingredient a procedure for converting transducers to bimachines. We also present methods for pseudo-minimization and direct composition of bimachines.
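For orientation, one standard formulation of the bimachine and its output function is sketched below; notation varies between presentations, so the symbols here should be taken as a common convention rather than the chapter's exact definitions.

```latex
% One standard formulation: a bimachine over alphabets \Sigma, \Gamma is
% B = (A_L, A_R, \psi), where A_L is a deterministic automaton reading the
% input left to right, A_R one reading it right to left, and \psi maps a
% left state, an input letter and a right state to an output string:
\[
\psi : Q_L \times \Sigma \times Q_R \to \Gamma^* .
\]
% The output for input a_1 \cdots a_n concatenates one emission per letter,
% conditioned on the left context (via A_L) and the right context (via A_R):
\[
O_B(a_1 \cdots a_n) \;=\; \prod_{i=1}^{n}
  \psi\bigl(\delta_L^*(q_{0,L},\, a_1 \cdots a_{i-1}),\; a_i,\;
            \delta_R^*(q_{0,R},\, a_n \cdots a_{i+1})\bigr).
\]
```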
The medium and genre of the mixtape, a form dependent upon borrowed, repurposed and re-contextualised sonic material, has recently re-emerged as a vital component of contemporary commercial and creative music culture. This article suggests that analysis of the mixtape’s history and ethos can provide useful insight into ongoing tensions between cultures of active and passive listening, questions of ownership with regard to recorded sound, and the roles of producer and consumer within contemporary audio culture. It also proposes, via reference to relevant contemporary works, that the mixtape may now be considered a vital hybrid creative form, part composition/part compilation, with which composers, producers and sonic artists may actively engage. It historically and culturally contextualises this form, placing it within a lineage of military, political, creative and commercial conflict which, among recorded sound technologies, is unique to tape, and shows that the mixtape may be seen as an expression or utilisation of these factors. Consistent commercial and industrial efforts to suppress tape’s subversive qualities are outlined, as are the creative methodologies that have been adopted to resist such efforts. Contemporary music industry strategies to channel and commodify the aura and ethos of the mixtape, as forerunner of the curated playlists vital to the industry’s current business model, are detailed, and contemporary cassette culture is considered both as an alternative inheritor of the mixtape’s legacy and as an alternative model for future musical creation and distribution. Finally, a set of characteristics that distinguish the mixtape as a creative form are identified and their potential application and implications are discussed.
In this chapter we present C(M) implementations of the main automata constructions. Our aim is to provide full descriptions of the implementations that are clear and easy to follow. In some cases the simplicity of the implementation is achieved at the expense of some inefficiency.
In this chapter we explore deterministic finite-state transducers. Obviously, it only makes sense to ask for determinism if we restrict attention to transducers with a functional input-output behaviour. We focus on transducers that are deterministic on the input tape (called sequential or subsequential transducers). We shall see that only a proper subset of all regular string functions can be represented by this kind of device, and we describe a decision procedure for testing whether a functional transducer can be determinized. Further, we present a subsequential transducer minimization procedure based on the Myhill–Nerode relation for string functions.
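A minimal sketch of running a subsequential transducer may help fix the idea: transitions are deterministic on the input tape, each transition emits an output string, and a final output function contributes a suffix on acceptance. The encoding below is an illustrative assumption, not the chapter's implementation.

```python
# A subsequential transducer: deterministic on the input tape, with an
# output string per transition and a final output appended on acceptance.
def run_subsequential(tr, word):
    """Return the output for `word`, or None if the input is rejected."""
    state, out = tr["start"], [tr["init_out"]]
    for a in word:
        if (state, a) not in tr["trans"]:
            return None
        emit, state = tr["trans"][(state, a)]
        out.append(emit)
    if state not in tr["final_out"]:
        return None
    out.append(tr["final_out"][state])
    return "".join(out)

# Maps a^n to b^n and appends "!" at the end of accepted inputs.
tr = {"start": 0, "init_out": "",
      "trans": {(0, "a"): ("b", 0)},
      "final_out": {0: "!"}}
print(run_subsequential(tr, "aaa"))  # -> "bbb!"
```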
The Empirical Methods in Natural Language Processing (EMNLP) 2018 workshop BlackboxNLP was dedicated to resources and techniques specifically developed for analyzing and understanding the inner workings and representations acquired by neural models of language. Approaches included: systematically manipulating the input to neural networks and investigating the impact on their performance, testing whether interpretable knowledge can be decoded from intermediate representations acquired by neural networks, proposing modifications to neural network architectures to make their knowledge state or generated output more explainable, and examining the performance of networks on simplified or formal languages. Here we review a number of representative studies in each category.
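As an illustration of the second category (decoding interpretable knowledge from intermediate representations), the sketch below trains a simple diagnostic classifier, or probe; the data here are random placeholders standing in for real hidden states and linguistic labels.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Placeholder data: in a real probe, `hidden` would be hidden states
# extracted from a trained network and `labels` a linguistic property
# (e.g. part of speech, number agreement) aligned with those states.
rng = np.random.default_rng(0)
hidden = rng.normal(size=(1000, 128))
labels = (hidden[:, 0] > 0).astype(int)  # toy "property" to decode

X_tr, X_te, y_tr, y_te = train_test_split(hidden, labels, random_state=0)
probe = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)

# High probe accuracy suggests the property is linearly decodable from
# the representation; chance-level accuracy suggests it is not.
print(f"probe accuracy: {probe.score(X_te, y_te):.2f}")
```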
Current approaches to learning semantic representations of sentences often use prior word-level knowledge. The current study aims to leverage visual information in order to capture sentence-level semantics without the need for word embeddings. We use a multimodal sentence encoder trained on a corpus of images with matching text captions to produce visually grounded sentence embeddings. Deep neural networks are trained to map the two modalities to a common embedding space such that, for an image, the corresponding caption can be retrieved and vice versa. We show that our model achieves results comparable to the current state of the art on two popular image-caption retrieval benchmark datasets: Microsoft Common Objects in Context (MSCOCO) and Flickr8k. We evaluate the semantic content of the resulting sentence embeddings using data from the Semantic Textual Similarity (STS) benchmark task and show that the multimodal embeddings correlate well with human semantic similarity judgements. The system achieves state-of-the-art results on several of these benchmarks, which shows that a system trained solely on multimodal data, without assuming any word representations, is able to capture sentence-level semantics. Importantly, this result shows that we do not need prior knowledge of lexical-level semantics in order to model sentence-level semantics. These findings demonstrate the importance of visual information in semantics.
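A common way to train such a joint embedding space is a margin-based ranking loss over cosine similarities, as sketched below; whether this matches the authors' exact objective is an assumption.

```python
import torch
import torch.nn.functional as F

def ranking_loss(img_emb, cap_emb, margin=0.2):
    """Hinge-based ranking loss over a batch: matching image-caption
    pairs sit on the diagonal of the similarity matrix; every other
    entry is a negative that should score at least `margin` lower."""
    img = F.normalize(img_emb, dim=1)
    cap = F.normalize(cap_emb, dim=1)
    sim = img @ cap.t()                               # cosine similarities
    pos = sim.diag().unsqueeze(1)                     # matching pairs
    cost_cap = (margin + sim - pos).clamp(min=0)      # wrong captions for an image
    cost_img = (margin + sim - pos.t()).clamp(min=0)  # wrong images for a caption
    off_diag = 1 - torch.eye(sim.size(0))             # ignore the positives
    return ((cost_cap + cost_img) * off_diag).sum()

loss = ranking_loss(torch.randn(8, 512), torch.randn(8, 512))
print(loss.item())
```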
We present two studies on neural network architectures that learn to represent sentences by composing their words according to automatically induced binary trees, without ever being shown a correct parse tree. We use Tree-structured Long Short-Term Memory networks (Tree-LSTMs) as our composition function, applied along a tree structure found by a differentiable natural language chart parser. The models simultaneously optimise both the composition function and the parser, thus eliminating the need for externally provided parse trees, which are normally required for Tree-LSTMs. They can therefore be seen as tree-based recurrent neural networks that are unsupervised with respect to the parse trees. Being fully differentiable, the models are easily trained with an off-the-shelf gradient descent method and backpropagation.
In the first part of this paper, we introduce a model based on the CKY chart parser, and evaluate its downstream performance on a natural language inference task and a reverse dictionary task. Further, we show how its performance can be improved with an attention mechanism which fully exploits the parse chart, by attending over all possible subspans of the sentence. We find that our approach is competitive against similar models of comparable size and outperforms Tree-LSTMs that use trees produced by a parser.
Finally, we present an alternative architecture based on a shift-reduce parser. We perform an analysis of the trees induced by both our models, to investigate whether they are consistent with each other and across re-runs, and whether they resemble the trees produced by a standard parser.
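For reference, the binary Tree-LSTM composition step that such models typically build on (following Tai et al., 2015) can be sketched as follows; the parser that decides which nodes to compose is omitted.

```python
import torch
import torch.nn as nn

class BinaryTreeLSTMCell(nn.Module):
    """Composes two child states (h, c) into a parent state, following
    the standard binary Tree-LSTM formulation (Tai et al., 2015)."""
    def __init__(self, dim):
        super().__init__()
        # i, o, u and one forget gate per child: 5 * dim outputs.
        self.proj = nn.Linear(2 * dim, 5 * dim)

    def forward(self, hl, cl, hr, cr):
        i, o, u, fl, fr = self.proj(torch.cat([hl, hr], dim=-1)).chunk(5, dim=-1)
        c = torch.sigmoid(i) * torch.tanh(u) \
            + torch.sigmoid(fl) * cl + torch.sigmoid(fr) * cr
        h = torch.sigmoid(o) * torch.tanh(c)
        return h, c

cell = BinaryTreeLSTMCell(64)
h, c = cell(torch.randn(1, 64), torch.randn(1, 64),
            torch.randn(1, 64), torch.randn(1, 64))
print(h.shape)  # torch.Size([1, 64])
```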
Most compositional distributional semantic models represent sentence meaning with a single vector. In this paper, we propose a structured distributional model (SDM) that combines word embeddings with formal semantics and is based on the assumption that sentences represent events and situations. The semantic representation of a sentence is a formal structure derived from discourse representation theory and containing distributional vectors. This structure is dynamically and incrementally built by integrating knowledge about events and their typical participants, as they are activated by lexical items. Event knowledge is modelled as a graph extracted from parsed corpora, encoding roles and relationships between participants that are represented as distributional vectors. SDM is grounded in extensive psycholinguistic research showing that generalized knowledge about events stored in semantic memory plays a key role in sentence comprehension. We evaluate SDM on two recently introduced compositionality data sets, and our results show that combining a simple compositional model with event knowledge consistently improves performance, even with different types of word embeddings.
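As a toy illustration of the core idea, participant expectations stored in an event graph of distributional vectors can be queried and compared as below; the data structures, entries and similarity measure are illustrative assumptions, not the authors' model.

```python
import numpy as np

# Toy event-knowledge store: (predicate, role) -> typical participant
# vectors. In the actual model this graph is extracted from parsed
# corpora; the entries here are random placeholders.
rng = np.random.default_rng(0)
vec = {w: rng.normal(size=50) for w in ["chef", "knife", "onion", "pen"]}
event_graph = {("cut", "agent"): [vec["chef"]],
               ("cut", "instrument"): [vec["knife"]],
               ("cut", "patient"): [vec["onion"]]}

def expectation(predicate, role):
    """Centroid of the typical fillers stored for a (predicate, role) slot."""
    fillers = event_graph.get((predicate, role), [])
    return np.mean(fillers, axis=0) if fillers else None

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# The expected patient of "cut" is closer to "onion" than to "pen".
exp = expectation("cut", "patient")
print(cosine(exp, vec["onion"]), cosine(exp, vec["pen"]))
```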
Sentence-level representations are necessary for various natural language processing tasks. Recurrent neural networks have proven to be very effective in learning distributed representations and can be trained efficiently on natural language inference tasks. We build on top of one such model and propose a hierarchy of bidirectional LSTM and max pooling layers that implements an iterative refinement strategy and yields state-of-the-art results on the SciTail dataset as well as strong results for Stanford Natural Language Inference and Multi-Genre Natural Language Inference. We show that the sentence embeddings learned in this way can be utilized in a wide variety of transfer learning tasks, outperforming InferSent on 7 out of 10 and SkipThought on 8 out of 9 SentEval sentence embedding evaluation tasks. Furthermore, our model beats the InferSent model in 8 out of 10 recently published SentEval probing tasks designed to evaluate sentence embeddings’ ability to capture some of the important linguistic properties of sentences.
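A minimal sketch of the general architecture, stacked BiLSTMs with max pooling over time and concatenation of the pooled vectors, is given below; in the paper the refinement passes hidden states between stages, so the wiring here is an approximation.

```python
import torch
import torch.nn as nn

class HierBiLSTMMaxPool(nn.Module):
    """Sketch: a stack of BiLSTMs; each layer's output sequence is
    max-pooled over time and the pooled vectors are concatenated into
    the sentence embedding. (The paper's refinement passes hidden
    states between stages; this wiring is an approximation.)"""
    def __init__(self, in_dim, hid_dim, layers=3):
        super().__init__()
        dims = [in_dim] + [2 * hid_dim] * (layers - 1)
        self.lstms = nn.ModuleList(
            nn.LSTM(d, hid_dim, batch_first=True, bidirectional=True)
            for d in dims)

    def forward(self, x):                 # x: (batch, time, in_dim)
        pooled = []
        for lstm in self.lstms:
            x, _ = lstm(x)                # (batch, time, 2 * hid_dim)
            pooled.append(x.max(dim=1).values)
        return torch.cat(pooled, dim=-1)  # (batch, layers * 2 * hid_dim)

enc = HierBiLSTMMaxPool(in_dim=300, hid_dim=128)
emb = enc(torch.randn(4, 10, 300))
print(emb.shape)  # torch.Size([4, 768])
```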