Analyzing language samples of Spanish–English bilingual children for the automated prediction of language dominance

  • T. SOLORIO (a1), M. SHERMAN (a2), Y. LIU (a2), L. M. BEDORE (a3), E. D. PEÑA (a3) and A. IGLESIAS (a4)...

In this work we study how features typically used in natural language processing tasks, together with measures of syntactic complexity, can be adapted to the problem of developing language profiles of bilingual children. Our experiments show that these features can provide high discriminative value for predicting language dominance from story retells in a Spanish–English bilingual population of children. Moreover, some of our proposed features are even more powerful than measures commonly used by clinical researchers and practitioners for analyzing spontaneous language samples of children. This study shows that the field of natural language processing has the potential to make significant contributions to communication disorders and related areas.

Natural Language Engineering
  • ISSN: 1351-3249
  • EISSN: 1469-8110
  • URL: /core/journals/natural-language-engineering