
Modelling Mandarin speakers’ phonotactic knowledge

Published online by Cambridge University Press: 29 September 2021

Shuxiao Gong*
University of Kansas
Jie Zhang*
University of Kansas


This paper investigates the nature of native Mandarin Chinese speakers’ phonotactic knowledge via an experimental study and formal modelling of the experimental results. Results from a phonological well-formedness judgement experiment suggest that Mandarin speakers’ phonotactic knowledge is sensitive not only to lexical statistics, but also to grammatical principles such as systematic and accidental phonotactic constraints, allophonic restrictions and segment–tone co-occurrence restrictions. We employ the UCLA Phonotactic Learner to model Mandarin speakers’ phonotactic knowledge, and compare the model’s well-formedness predictions with speakers’ judgements. The disparity between the model’s predictions and the well-formedness ratings from the experiment indicates that grammatical principles and the lexicon are still not sufficient to explain all of the variation in the speakers’ judgements. We argue that multiple biases, such as naturalness bias, allophony bias and suprasegmental bias, are at work during phonotactic learning.
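The comparison between model predictions and speakers’ judgements can be pictured with a minimal sketch. It assumes the standard maximum entropy formulation, in which a form’s penalty is the weighted sum of its constraint violations and its MaxEnt value is exp(−penalty). The constraint names, weights, violation counts and ratings below are all hypothetical placeholders; they are not the constraints learned in the paper, nor its experimental data, and the code does not reproduce the UCLA Phonotactic Learner itself.

```python
import math

# Hypothetical constraint weights, for illustration only.
WEIGHTS = {
    "hypothetical_constraint_A": 3.0,
    "hypothetical_constraint_B": 1.5,
}


def penalty(violations, weights=WEIGHTS):
    """Weighted sum of constraint violations for one form."""
    return sum(weights[name] * count for name, count in violations.items())


def maxent_value(violations, weights=WEIGHTS):
    """Unnormalised MaxEnt value exp(-penalty); higher means better-formed."""
    return math.exp(-penalty(violations, weights))


def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


# Hypothetical nonce forms: violation profiles paired with made-up mean ratings.
forms = [
    ({}, 6.1),
    ({"hypothetical_constraint_B": 1}, 4.8),
    ({"hypothetical_constraint_A": 1}, 3.0),
    ({"hypothetical_constraint_A": 1, "hypothetical_constraint_B": 1}, 2.2),
]

# Negated penalties serve as predicted well-formedness scores.
predicted = [-penalty(violations) for violations, _ in forms]
observed = [rating for _, rating in forms]
fit = pearson(predicted, observed)
```

A correlation well below 1 in a comparison of this kind would correspond to the disparity between model and speakers that the abstract describes; the paper’s own statistical analysis may differ from this simple correlation.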

Copyright © The Author(s), 2021. Published by Cambridge University Press


We thank an associate editor of Phonology and three anonymous reviewers, whose comments and critiques have improved the quality and clarity of this paper. For helpful discussions, we thank San Duanmu, Bruce Hayes, Allard Jongman, James Myers, Joan Sereno, Annie Tremblay, James White and members of the KU Experimental Linguistics Seminar, as well as audiences at the 7th Annual Meeting on Phonology, the 24th Annual Mid-Continental Phonetics and Phonology Conference and the 27th Annual Meeting of the International Association of Chinese Linguistics, where the experimental results of this paper were presented. We are also grateful to Chulong Liu and Yilei Shen for their support in this project.


Supplementary material: Gong and Zhang supplementary material (PDF, 170.2 KB)