
There is a simplicity bias when generalising from ambiguous data

Published online by Cambridge University Press: 11 August 2020

Karthik Durvasula*
Michigan State University
Adam Liter*
University of Maryland


How exactly do learners generalise in the face of ambiguous data? While there has been a substantial amount of research studying the biases that learners employ, there has been very little work on what sorts of biases are employed in the face of data that is ambiguous between phonological generalisations with different degrees of complexity. In this article, we present the results from three artificial language learning experiments that suggest that, at least for phonotactic sequence patterns, learners are able to keep track of multiple generalisations related to the same segmental co-occurrences; however, the generalisations they learn are only the simplest ones consistent with the data.

Copyright © The Author(s), 2020. Published by Cambridge University Press



We would like to thank the associate editor and three anonymous reviewers for helping to improve this manuscript tremendously. We would also like to thank the audiences at the 2015 Annual Meeting on Phonology, the LSA 2016 Annual Meeting and the 2016 North American Phonology Conference, as well as the Phonology/Phonetics group at Michigan State University for helpful discussions. Many thanks to Mina Hirzel for recording the stimuli used in our experiments. Additionally, we would like to thank Russ Werner and Mike Kramizeh at Michigan State University for help with technical matters. Adam Liter was supported by the NSF NRT award DGE-1449815 during portions of the writing and revising of this paper. The authors have contributed equally to this paper, and share first authorship.

