
Modeling the contribution of phonotactic cues to the problem of word segmentation*



How do infants find the words in the speech stream? Computational models help us understand this feat by revealing the advantages and disadvantages of different strategies that infants might use. Here, we outline a computational model of word segmentation that aims both to incorporate cues proposed by language acquisition researchers and to establish the contributions different cues can make to word segmentation. We present experimental results from modified versions of Venkataraman's (2001) segmentation model that examine the utility of: (1) language-universal phonotactic cues; (2) language-specific phonotactic cues which must be learned while segmenting utterances; and (3) their combination. We show that the language-specific cue improves segmentation performance overall, but the language-universal phonotactic cue does not, and that their combination results in the most improvement. Not only does this suggest that language-specific constraints can be learned simultaneously with speech segmentation, but it is also consistent with experimental research that shows that there are multiple phonotactic cues helpful to segmentation (e.g. Mattys, Jusczyk, Luce & Morgan, 1999; Mattys & Jusczyk, 2001). This result also compares favorably to other segmentation models (e.g. Brent, 1999; Fleck, 2008; Goldwater, 2007; Johnson & Goldwater, 2009; Venkataraman, 2001) and has implications for how infants learn to segment.


Corresponding author

Address for correspondence: Daniel Blanchard, University of Delaware – Computer & Information Sciences, 101 Smith Hall, Newark, Delaware 19716, United States. e-mail:



This work was supported by a University of Delaware Research Foundation grant to the second author, and by NIH (5R01HD050199) and NSF grants (BCS-0642529) to the third author. We thank Vijay Shanker for valuable discussions, and Regine Lai and Aimee Stahl for feedback on the manuscript.



Bernstein-Ratner, N. (1987). The phonology of parent–child speech. In Nelson, K. & van Kleeck, A. (eds), Children's language, Volume 6, 159–74. Hillsdale, NJ: Erlbaum.
Blanchard, D. & Heinz, J. (2008). Improving word segmentation by simultaneously learning phonotactics. In 12th Conference on Computational Natural Language Learning, 65–72. Morristown, NJ: Association for Computational Linguistics.
Bortfeld, H., Morgan, J., Golinkoff, R. & Rathbun, K. (2005). Mommy and me: Familiar names help launch babies into speech-stream segmentation. Psychological Science 16, 298–304.
Brent, M. R. (1999). An efficient, probabilistically sound algorithm for segmentation and word discovery. Machine Learning 34, 71–105.
Brent, M. R. & Cartwright, T. (1996). Distributional regularity and phonotactic constraints are useful for segmentation. Cognition 61, 93–125.
Brent, M. R. & Siskind, J. (2001). The role of exposure to isolated words in early vocabulary development. Cognition 81, B33–B44.
Chomsky, N. & Halle, M. (1965). Some controversial questions in phonological theory. Journal of Linguistics 1, 97–138.
Cole, R. & Jakimik, J. (1980). A model of speech perception. In Cole, R. (ed.), Perception and production of fluent speech, 136–63. Hillsdale, NJ: Lawrence Erlbaum Associates.
Coleman, J. & Pierrehumbert, J. (1997). Stochastic phonological grammars and acceptability. In Proceedings of the Third Meeting of the Association for Computational Linguistics SIGPHON, 49–56. Somerset, NJ: Association for Computational Linguistics.
Cutler, A. & Carter, D. (1987). The predominance of strong initial syllables in the English vocabulary. Computer Speech and Language 2, 133–42.
Demuth, K. (1992). Acquisition of Sesotho. In Slobin, D. (ed.), The cross-linguistic study of language acquisition, Volume 3, 557–638. Hillsdale, NJ: Lawrence Erlbaum Associates.
Dixon, R. M. W. & Aikhenvald, A. Y. (2002). Word: A typological framework. In Dixon, R. M. W. & Aikhenvald, A. Y. (eds), Word: A cross-linguistic typology, 1–41. Cambridge: Cambridge University Press.
Fleck, M. M. (2008). Lexicalized phonotactic word segmentation. In Proceedings of the 46th Annual Meeting of the Association for Computational Linguistics, 130–38. Morristown, NJ: Association for Computational Linguistics.
Friederici, A. & Wessels, J. (1993). Phonotactic knowledge of word boundaries and its use in infant speech perception. Perception and Psychophysics 54, 287–95.
Goldwater, S. (2007). Nonparametric Bayesian models of lexical acquisition. Unpublished doctoral dissertation, Brown University, Department of Cognitive and Linguistic Sciences.
Golinkoff, R. & Hirsh-Pasek, K. (2006). Baby wordsmith: From associationist to social sophisticate. Current Directions in Psychological Science 15, 30–33.
Halle, M. (1978). Knowledge unlearned and untaught: What speakers know about the sounds of their language. In Halle, M., Bresnan, J. & Miller, G. A. (eds), Linguistic theory and psychological reality, 294–303. Cambridge, MA: MIT Press.
Harris, Z. (1954). Distributional structure. Word 10, 146–62.
Hayes, B. & Wilson, C. (2008). A maximum entropy model of phonotactics and phonotactic learning. Linguistic Inquiry 67, 379–440.
Heinz, J. (2007). Inductive learning of phonotactic patterns. Unpublished doctoral dissertation, University of California, Los Angeles, Department of Linguistics.
Hollich, G., Hirsh-Pasek, K., Golinkoff, R., Brand, R. J., Brown, E., Chung, H. L., et al. (2000). Breaking the language barrier: An emergentist coalition model for the origins of word learning. Monographs of the Society for Research in Child Development 65, i–vi, 1–123.
Johnson, M. (2008). Unsupervised word segmentation for Sesotho using adaptor grammars. In Proceedings of the Tenth Meeting of the Association for Computational Linguistics, SIGMORPHON, 20–27. Morristown, NJ: Association for Computational Linguistics.
Johnson, M. & Goldwater, S. (2009). Improving nonparameteric Bayesian inference: Experiments on unsupervised word segmentation with adaptor grammars. In Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, 317–25. Morristown, NJ: Association for Computational Linguistics.
Jurafsky, D. & Martin, J. (2008). Speech and language processing, 2nd edn. Upper Saddle River, NJ: Prentice-Hall.
Jusczyk, P. (1993). From general to language specific capacities: The WRAPSA model of how speech perception develops. Journal of Phonetics 21, 3–28.
Jusczyk, P., Friederici, A., Wessels, J., Svenkerud, V. Y. & Jusczyk, A. M. (1993). Infants' sensitivity to the sound patterns of native language words. Journal of Memory and Language 32, 402–420.
Jusczyk, P., Hohne, E. & Baumann, A. (1999). Infants' sensitivity to allophonic cues for word segmentation. Perception and Psychophysics 61, 1465–76.
Jusczyk, P., Houston, D. & Newsome, M. (1999). The beginnings of word segmentation in English-learning infants. Cognitive Psychology 39, 159–207.
MacWhinney, B. & Snow, C. (1985). The child language data exchange system. Journal of Child Language 12, 271–95.
Matthews, P. (1991). Morphology, 2nd edn. Cambridge: Cambridge University Press.
Mattys, S. & Jusczyk, P. (2001). Phonotactic cues for segmentation of fluent speech by infants. Cognition 78, 91–121.
Mattys, S., Jusczyk, P., Luce, P. & Morgan, J. (1999). Phonotactic and prosodic effects on word segmentation in infants. Cognitive Psychology 38, 465–94.
Mohri, M. (2005). Statistical natural language processing. In Lothaire, M. (ed.), Applied combinatorics on words, 210–40. Cambridge: Cambridge University Press.
Nelson, D. K., Jusczyk, P., Mandel, D., Myers, J., Turk, A. & Gerken, L. (1995). The head-turn preference procedure for testing auditory perception. Infant Behavior and Development 18, 111–16.
Saffran, J., Aslin, R. & Newport, E. (1996). Statistical learning by 8-month-old infants. Science 274, 1926–28.
Saffran, J., Werker, J. & Werner, L. (2006). The infant's auditory world: Hearing, speech, and the beginnings of language. In Siegler, R. & Kuhn, D. (eds), 6th edition of the handbook of child development, Volume 2, 58–108. New York: Wiley.
Sapir, E. (1925). Sound patterns in language. Language 1, 37–51.
Shi, R. & Lepage, M. (2008). The effect of functional morphemes on word segmentation in preverbal infants. Developmental Science 11, 407–413.
Teahan, W. J., McNab, R., Wen, Y. & Witten, I. H. (2000). A compression-based algorithm for Chinese word segmentation. Computational Linguistics 26(3), 375–93.
Thiessen, E. & Saffran, J. (2003). When cues collide: Use of stress and statistical cues to word boundaries by 7- to 9-month-old infants. Developmental Psychology 39, 706–716.
Thiessen, E. & Saffran, J. (2007). Learning to learn: Infants' acquisition of stress-based strategies for word segmentation. Language Learning and Development 3, 73–100.
Toft, Z. (2002). The phonetics and phonology of some syllabic consonants in Southern British English. In Toft, Z. (ed.), Papers on phonetics and phonology: The articulation, acoustics and perception of consonants, Volume 28, 111–144.
Toro, J. M., Nespor, M., Mehler, J. & Bonatti, L. L. (2008). Finding words and rules in a speech stream: Functional differences between vowels and consonants. Psychological Science 19, 137–144.
Venkataraman, A. (2001). A statistical model for word discovery in transcribed speech. Computational Linguistics 27, 352–72.
Xie, Z. & Niyogi, P. (2006). Robust acoustic-based syllable detection. In INTERSPEECH-2006, paper 1327-Wed1BuP.6. Accessed at
