This article reports on patterns in the production and perception of New Zealand English r-sandhi. We present two phoneme-monitoring experiments that examine whether listeners from three regions are sensitive to the distribution of r-presence in linking and intrusive environments. The results provide evidence that sound perception is affected by a listener's experience-driven expectations: greater prior experience with a sound in a given context increases the likelihood of perceiving the sound in that context, regardless of whether the sound is present in the stimulus. For listeners with extremely limited prior exposure to a variant, the variant is especially salient, and we also observe an experiment-internal effect of experience. We argue that our results support models that incorporate both word-specific and abstract probabilistic representations.
This study investigates how listeners associate acoustically different vowels with a single linguistic vowel quality. Listeners were asked to identify vowel sounds as /æ/ or /ʌ/ and to indicate the size of the speaker that produced them. Results indicate that perceived vowel quality trades off with the perception of speaker size: different vowels can sound the same, and the same vowel can sound different when a different speaker is perceived. These findings suggest that vowel normalization is broadly similar to perceptual constancy in other domains, and that social, indexical, and linguistic information play an important role in determining even the most fundamental units of linguistic representation.
Listeners have a remarkable ability to adapt to novel speech patterns, such as a new accent or an idiosyncratic pronunciation. In almost all of the previous studies examining this phenomenon, the participating listeners had reason to believe that the speech signal was produced by a human being. However, people are increasingly interacting with voice-activated artificially intelligent (voice-AI) devices that produce speech using text-to-speech (TTS) synthesis. Will listeners also adapt to novel speech input when they believe it is produced by a device? Across three experiments, we investigate this question by exposing American English listeners to shifted pronunciations accompanied by either a ‘human’ or a ‘device’ guise and testing how this exposure affects their subsequent categorization of vowels. Our results show that listeners exhibit perceptual learning even when they believe the speaker is a device. Furthermore, listeners generalize these adjustments to new talkers, and do so particularly strongly when they believe that both old and new talkers are devices. These results have implications for models of speech perception, theories of human-computer interaction, and the interface between social cognition and linguistic theory.
Understanding the relation between speech production and perception is foundational to phonetic theory, and is similarly central to theories of the phonetics of sound change. For sound changes that are arguably perceptually motivated, it is particularly important to establish that an individual listener's selective attention—for example, to the redundant information afforded by coarticulation—is reflected in that individual's own productions. This study reports the results of a pair of experiments designed to test the hypothesis that individuals who produce more consistent and extensive coarticulation will attend to that information especially closely in perception. The production experiment used nasal airflow to measure the time course of participants' coarticulatory vowel nasalization; the perception experiment used an eye-tracking paradigm to measure the time course of those same participants' attention to coarticulated nasality. Results showed that a speaker's coarticulatory patterns predicted, to some degree, that individual's perception, thereby supporting the hypothesis: participants who produced earlier onset of coarticulatory nasalization were, as listeners, more efficient users of nasality as that information unfolded over time. Thus, an individual's perception of coarticulated speech is made public through their productions.
This chapter first focuses on infants’ perception, the process of becoming aware of objects, relations, and events by way of the senses, and then on infant cognition, the processes or faculties by which knowledge is acquired and manipulated. We start with a look at methods that have provided psychologists with insights into what babies perceive and think, mainly what they see and hear. We then examine several aspects of infant visual perception, including visual preferences, depth perception, and face perception. We follow this with a brief look at infant auditory perception and then intermodal (between senses) perception, focusing mainly on the integration of vision and hearing. Our discussion then turns to topics of infant cognition. We look at the concept of core knowledge, followed by reviews of research on object representation and infants’ abilities to make sense of quantitative information. We then examine infants’ memory skills. The chapter concludes with a short discussion about the role of experience in infant perceptual and cognitive development and the relation between infant brain and cognitive development.
This study investigates second language (L2) phonetic categorization and phonological encoding of L2 words (hereafter, phonolexical encoding) with phonemic and allophonic cross-linguistic mismatches. We focus on the acquisition of Spanish /ɾ/-/l/ and /ɾ/-/t/ contrasts among Spanish learners with American English (AE) and Mandarin Chinese (hereafter, Chinese) as first languages (L1s). [ɾ] and [t] are positional allophones in AE but separate phonemes in Spanish. The phoneme /ɾ/ is absent in Chinese. AE learners showed nativelike phonetic categorization and little between-contrast difference in phonolexical encoding, suggesting that L1 positional allophony does not necessarily impede L2 contrast acquisition. Chinese learners showed persistent perceptual difficulties with both contrasts due to perceptual similarity. Phonetic categorization significantly predicted phonolexical encoding for /ɾ/-/t/ contrasts for Chinese learners bidirectionally, while AE learners showed this relationship only when /t/ was incorrectly replaced by /ɾ/ in Spanish words. This asymmetry may be driven by the fact that [t] is the dominant allophone of /t/ in AE, while [ɾ] is a positional allophone. It suggests that L1 allophonic knowledge heightens perceptual monitoring when evaluating substitutions that conflict with L1 phonological expectations. This study calls for more nuanced treatment of L1 influence in L2 phonological acquisition models, especially at the allophonic level.
Music is among the most fundamental facets of the human experience. It draws on core perceptual-cognitive functions, including those most relevant to speech-language processing. Consequently, musicians have served as a model for understanding neuroplasticity and its far-reaching transfer effects on perception, action, cognition, and linguistic brain functions. This chapter provides an overview of the perceptual-cognitive benefits that music confers on the brain, with specific reference to its spillover effects on speech and language functions. We highlight cross-sectional and longitudinal findings on music’s impact on the linguistic brain, ranging from psychophysical benefits to enhancements of higher-order cognition. We also emphasize commonalities and distinctions in the brain plasticity afforded by experience in the speech and music domains, drawing special attention to cross-domain transfer effects (or lack thereof) in how musical training influences linguistic processing and vice versa.
The development of a sound change can be influenced by linguistic and social factors, both within the language community and through language contact. The present study examines the ongoing, internally generated process of tonogenesis in Afrikaans, specifically analyzing the production and perception of word-initial plosives among different age and gender groups. Results show that female speakers devoice significantly more often than male speakers, and that the perception of female voices is influenced more by f0 levels than the perception of male voices. This study finds that gender is overall a larger predictor of tonogenetic patterns than age.
This study revisits the relationship between second-language (L2) learners’ ability to distinguish sounds in non-native phonological contrasts and to recognize spoken words when recognition depends on these sounds, while addressing the role of methodological similarity. Bilingual Catalan/Spanish learners of English were tested on the identification of two vowel contrasts (VI) of differing difficulty, /i/-/ɪ/ (difficult) and /ɛ/-/æ/ (easy), in monosyllabic minimal pairs, and on their recognition of the same pairs in a word-picture matching task (WPM). Learners performed substantially better with /i/-/ɪ/ in VI than in WPM, and individual scores were only weakly correlated. By replicating previous findings with a more symmetrical design, we show that an account of prior work rooted in methodological dissimilarity is improbable, and provide additional support for the claim that accuracy in sound identification does not guarantee improvements in word recognition. This has implications for our understanding of L2-speech acquisition and L2 pronunciation training.
Multilinguals face greater challenges than monolinguals in speech perception tasks, such as processing noisy sentences. Factors related to multilinguals’ language experience, such as age of acquisition, proficiency, exposure and usage, influence their perceptual performance. However, how language experience variability modulates multilinguals’ listening effort remains unclear. We analyzed data from 92 multilinguals who completed a listening task with words and sentences, presented in quiet and noise across participants’ spoken languages (Arabic, Hebrew and English). Listening effort was assessed using pupillometry. The results indicated higher accuracy and reduced effort in quiet than in noise, with greater language experience predicting better accuracy and reduced effort. These effects varied by stimulus and listening condition. For single words, greater language experience most strongly reduced effort in noise; for sentences, it had a more pronounced effect in quiet, especially for high-predictability sentences. These findings emphasize the importance of considering language experience variability when evaluating multilingual effort.
This chapter explores the intricate relationship between music and language, highlighting their shared neural processing in the brain. It looks at the musicality of speech, demonstrating how acoustic features such as pitch, rhythm, and timbre convey meaning and emotion in both music and language. Research reveals that even those who consider themselves unmusical possess an innate musicality, evident in their ability to perceive subtle differences in speech sounds. The chapter emphasizes that language acquisition in infants relies heavily on musical aspects, such as melody, rhythm, and prosody. Brain imaging studies confirm an overlap in neural networks for music and language processing, including Broca’s and Wernicke’s areas, traditionally associated with language. This ‘music-language network’ is active from infancy, suggesting a deep biological connection between these two forms of communication. The chapter also highlights the therapeutic potential of music for language development. Musical activities can enhance speech perception, rhythmic skills, and reading abilities, particularly in children with language disorders or dyslexia. By engaging with music, children can playfully develop essential mental faculties, fostering overall cognitive and emotional growth.
One of the main challenges individuals face when learning an additional language (L2) is learning its sound system, which includes learning to perceive L2 sounds accurately. High variability phonetic training (HVPT) is one method that has proven highly effective at helping individuals develop robust L2 perceptual categories, and recent meta-analytic work suggests that multi-talker training conditions provide a small but statistically reliable benefit compared to single-talker training. However, no study has compared lower and higher variability multi-talker conditions to determine how the number of talkers affects training outcomes, even though such information can shed additional light on how talker variability affects phonetic training. In this study, we randomly assigned 458 L2 Spanish learners to a two-talker or six-talker HVPT group or to a control group that did not receive HVPT. Training focused on L2 Spanish stops. We tested performance on trained talkers and words as well as several forms of generalization. The experimental groups improved more and demonstrated greater generalization than the control group, but neither experimental group outpaced the other. The number of sessions experimental participants completed moderated learning gains.
This study investigated the neural mechanisms underlying bilingual speech perception of competing phonological representations. A total of 57 participants were recruited, consisting of 30 English monolinguals and 27 Spanish-English bilinguals. Participants passively listened to stop consonants while watching movies in English and Spanish. Event-Related Potentials and sLORETA were used to measure and localize brain activity. Comparisons within bilinguals across language contexts examined whether language control mechanisms were activated, while comparisons between groups assessed differences in brain activation. The results showed that bilinguals exhibited stronger activation in the left frontal areas during the English context, indicating greater engagement of executive control mechanisms. Distinct activation patterns were found between bilinguals and monolinguals, suggesting that the Executive Control Network provides the flexibility to manage overlapping phonological representations. These findings offer insights into the cognitive and neural basis of bilingual language control and expand current models of second language acquisition.
Growing evidence suggests that ratings of second language (L2) speech may be influenced by perceptions of speakers’ affective states, yet the size and direction of these effects remain underexplored. To investigate these effects, 83 raters evaluated 30 speech samples using 7-point scales of four language features and ten affective states. The speech samples were 2-min videorecordings from a high-stakes speaking test. An exploratory factor analysis reduced the affect scores to three factors: assuredness, involvement, and positivity. Regression models indicated that affect variables predicted spoken language feature ratings, explaining 18–27% of the variance in scores. Assuredness and involvement corresponded with all language features, while positivity only predicted comprehensibility scores. These findings suggest that listeners’ perceptions of speakers’ affective states intertwine with their spoken language ratings to form a visual component of second-language communication. The study has implications for models of L2 speech, language pedagogy, and assessment practice.
Twenty-five years ago, the publication of an article by Pallier, Colomé, and Sebastián-Gallés (2001) launched a new and rapidly evolving research program on how second language (L2) learners represent the phonological forms of words in their mental lexicons. Many insights are starting to form an overall picture of the unique difficulties in establishing functional and precise phonolexical representations in L2; however, for the field to move forward it is pertinent to outline its major emerging research questions and existing challenges. Among the significant obstacles to further research, the paper explores the current lack of theoretical agreement on the concept of phonolexical representations and the underlying mechanisms involved in establishing them, as well as the variable use of related terminology (e.g., fuzziness and target-likeness). Methodological challenges involved in investigating phonological processing and phonolexical representations, as well as their theoretical implications, are also discussed. To conclude, we explore two potentially fruitful research avenues at the forefront of the current research agenda: the significance of L2-specific phonological representations for bottom-up lexical access during casual, conversational speech, and how our emerging knowledge of L2 lexical representations can be applied in instructional settings.
Study abroad is typically viewed as a catalyst for pronunciation learning because it affords learners both massive amounts of L2 input and abundant opportunities for meaningful L2 use. Yet, even in such an environment, there is substantial variability in learning trajectories and outcomes. The nature of the target structure is also a powerful determinant of learning; some structures seem to develop effortlessly, whereas others do not improve much at all. Additionally, study abroad research brings to light the important issue of speaker identity, as learners often make decisions about how they want to sound and what pronunciation features they will adopt. This chapter examines developmental time frames, trajectories, and turning points in the phonetics and phonology of L2 learners in a study abroad context. We also describe how learners acquire the regional pronunciation variants of their host communities considering the phonetics of the target feature and learners’ attitudes and beliefs. We argue that study abroad should be situated within a dynamic, longitudinal, and context-dependent view of phonetic and phonological learning.
Researchers in bilingualism seek to identify factors that are associated with specific features of bilingual speech. One such predictive factor is language dominance, typically understood as the degree to which one of a bilingual's languages is used more often and more proficiently. In this chapter we review landmark studies that demonstrate the power of language dominance in predicting fine-grained phonetic and phonological characteristics of speech production, as well as perceptual and processing abilities, in one or both languages of bilinguals. We then critically examine the construct of dominance and identify ways that dominance can be and has been measured, as well as challenges inherent in its measurement. Next, we demonstrate the dynamic character of dominance by reviewing research on dominance switches and shifts. This is followed by a review of extant studies on language dominance in bilingual speech production, perception, and processing in both languages. We conclude with four areas where research can be fruitfully directed.
The Automatic Selective Perception (ASP) model posits that listeners make use of selective perceptual routines (SPRs) that are fast and efficient for recovering lexical meaning. These SPRs serve as filters that accentuate relevant cues and minimize irrelevant information. Years of experience with the first language (L1) lead to fairly automatic L1 SPRs; consequently, few attentional resources are needed in processing L1 speech. In contrast, L2 SPRs are less automatic. Under difficult task or stimulus conditions, listeners fall back on more automatic processes, specifically L1 SPRs, and L2 speech perception suffers where there is a mismatch between L1 and L2 phonetics, because L1 SPRs may not extract the cues needed to identify L2 phonemes. This chapter presents behavioral and neurophysiological evidence that supports the ASP model but also indicates the need for some modification. We offer suggestions for future directions in extending this model.
This chapter provides a thorough, up-to-date review of the literature on the phonetics and phonology of early bilinguals. It pulls together studies from a range of bilingual settings, including bilingual societies and heritage language contexts. While the chapter mostly reviews evidence from adolescent and adult participants, it also makes reference to the child bilingualism literature, where appropriate. The chapter first reviews studies on the accents of early versus late bilinguals, followed by a discussion of the various explanatory accounts for the observed differences between these two groups. Subsequently, the critical significance of early linguistic experience on bilingual speech patterns is considered, with particular reference to the evidence from childhood overhearers and international adoptees. The following sections then review studies comparing simultaneous and early sequential bilinguals, and those exploring the role of language dominance, continued use, the language of the environment in bilinguals’ pronunciation patterns, and the role of sociolinguistic factors in early bilingual speech patterns. The chapter concludes with suggestions for future research.
This chapter reviews evidence that the orthographic forms (spellings) of L2 sounds and words affect L2 phonological representation and processing. Orthographic effects are found in speech perception, speech production, phonological awareness, and the learning of words and sounds. Orthographic forms facilitate L2 speakers/listeners – for instance in lexical learning – but also have negative effects, resulting in sound additions, deletions, and substitutions. This happens because L2 speakers’ L2 orthographic knowledge differs from the actual working of the L2 writing system. Orthographic effects are established after little exposure to orthographic forms, are persistent, can be reinforced by factors other than orthography, including spoken input, and are modulated by individual-level and sound/word-level variables. Future research should address gaps in current knowledge, for instance investigating the effects of teaching interventions, and aim at producing a coherent framework.