
Can native Japanese listeners learn to differentiate /r–l/ on the basis of F3 onset frequency?*



Many attempts have been made to teach native Japanese listeners to perceptually differentiate English /r–l/ (e.g. rock–lock). Though improvement is evident, in no case is final performance native English-like. We focused our training on the third formant onset frequency, shown to be the most reliable indicator of /r–l/ category membership. We first presented listeners with instances of synthetic /r–l/ stimuli varying only in F3 onset frequency, in a forced-choice identification training task with feedback. Evidence of learning was limited. The second experiment utilized an adaptive paradigm beginning with non-speech stimuli consisting only of /r/ and /l/ F3 frequency trajectories progressing to synthetic speech instances of /ra–la/; half of the trainees received feedback. Improvement was shown by some listeners, suggesting some enhancement of /r–l/ identification is possible following training with only F3 onset frequency. However, only a subset of these listeners showed signs of generalization of the training effect beyond the trained synthetic context.


Corresponding author

Address for correspondence: Erin Ingvalson, Department of Communication Sciences and Disorders, Northwestern University, 2240 Campus Dr., Evanston, IL 60208, USA



We wish to thank Daniel Dickison for serving as translator and interpreter and Robert Kass for numerous statistical consultations, particularly the suggestion of Fisher's combined probability test. We also wish to thank several anonymous reviewers for their helpful comments. Portions of this work were presented at the 2003 meeting of the Psychonomic Society and the 2005 meeting of the Cognitive Science Society. This work was supported by NIH grant 3R01-DC004674-06S1, NSF grant BCS-0746067, and a grant from The Bank of Sweden Tercentenary Foundation to the second author, and by NIMH grant P50-MH64445 to the third author.






