
Developmental change in children’s speech processing of auditory and visual cues: An eyetracking study

Published online by Cambridge University Press:  08 December 2021

Department of Linguistics, University of Ottawa, Canada
Department of Linguistics, University of Ottawa, Canada
Margarethe MCDONALD
Department of Linguistics, University of Ottawa, Canada School of Psychology, University of Ottawa, Canada
H. Henny YEUNG
Department of Linguistics, Simon Fraser University, Canada Integrative Neuroscience and Cognition Centre, UMR 8002, CNRS and University of Paris, France
Corresponding author. Tania S. Zamuner, Department of Linguistics, University of Ottawa, Hamelin Hall, 70 Laurier Ave. East, Ottawa ON, Canada K1N 6N5. E-mail:


This study investigates how children aged two to eight years (N = 129) and adults (N = 29) use auditory and visual speech for word recognition. The goal was to bridge the gap between the apparent success of visual speech processing shown by young children in visual-looking tasks and the apparent difficulty with visual speech shown by older children on explicit behavioural measures. Participants heard familiar words in audio-visual (AV), audio-only (A-only) or visual-only (V-only) speech modalities, were then presented with target and distractor images, and looking to targets was measured. Adults showed high accuracy, with slightly less target-image looking in the V-only modality. Developmentally, looking was above chance for both the AV and A-only modalities, but not in the V-only modality until 6 years of age (earlier for /k/-initial words). Flexible use of visual cues for lexical access develops throughout childhood.

© The Author(s), 2021. Published by Cambridge University Press


