Learning phonological categories

Published online by Cambridge University Press: 19 February 2026

John Goldsmith*
Affiliation:
University of Chicago
Aris Xanthos*
Affiliation:
University of Lausanne
*
Goldsmith, University of Chicago, Departments of Linguistics and Computer Science, 1010 East 59th St., Chicago, IL 60637 [goldsmith@uchicago.edu]
Xanthos, University of Lausanne, Department of Computer Science and Mathematical Methods, Anthropole, CH-1015 Lausanne, Switzerland [aris.xanthos@unil.ch]

Abstract

This article describes in detail several explicit computational methods for approaching such questions in phonology as the vowel/consonant distinction, the nature of vowel harmony systems, and syllable structure, appealing solely to distributional information. Beginning with the vowel/consonant distinction, we consider a method for its discovery proposed by the Russian linguist Boris Sukhotin, and compare it to two newer methods that are of more general computational and theoretical interest today. The first is based on spectral decomposition of matrices, allowing for dimensionality reduction in a finely controlled way, and the second is based on finding parameters for maximum likelihood in a hidden Markov model. While all three methods work for discovering the fairly robust vowel/consonant distinction, we extend the newer ones to the discovery of vowel harmony, and in the case of the probabilistic model, to the discovery of some aspects of syllable structure.
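
As an illustration of the purely distributional approach, here is a minimal sketch of Sukhotin's algorithm as it is commonly described (for example in Guy 1991, listed in the references). It is not taken from the article's text, and the function name `sukhotin_vowels` and the toy corpus are hypothetical. The idea is that vowels tend to occur next to consonants rather than next to other vowels, so the symbol with the most neighbors among the symbols still assumed to be consonants is promoted to vowel, one at a time.

```python
from collections import defaultdict


def sukhotin_vowels(words):
    """Return the set of symbols classified as vowels (illustrative sketch)."""
    # Symmetric adjacency counts within words; a symbol next to itself
    # is ignored, which corresponds to a zero diagonal.
    adj = defaultdict(lambda: defaultdict(int))
    symbols = set()
    for word in words:
        symbols.update(word)
        for a, b in zip(word, word[1:]):
            if a != b:
                adj[a][b] += 1
                adj[b][a] += 1

    # Every symbol starts as a consonant; its score is its row sum.
    sums = {s: sum(adj[s].values()) for s in symbols}
    vowels = set()
    while True:
        candidates = [s for s in symbols if s not in vowels]
        if not candidates:
            break
        # Ties are broken alphabetically so the result is deterministic.
        best = max(candidates, key=lambda s: (sums[s], s))
        if sums[best] <= 0:
            break
        vowels.add(best)
        # Adjacency to the new vowel now counts against being a vowel.
        for s in candidates:
            if s != best:
                sums[s] -= 2 * adj[s][best]
    return vowels


toy_corpus = ["pata", "tipi", "kuku", "pitu", "taki", "kapu"]
print(sorted(sukhotin_vowels(toy_corpus)))  # → ['a', 'i', 'u']
```

On this strictly alternating toy corpus the procedure stops after three promotions, because every remaining symbol then has a negative score. Real text is noisier, and the sketch makes no claim about how the method performs on the data used in the article.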

Information

Type
Research Article
Copyright
Copyright © 2009 by Linguistic Society of America

References

Bavaud, François, and Xanthos, Aris. 2005. Markov associativities. Journal of Quantitative Linguistics 12. 123–37.
Belkin, Mikhail, and Goldsmith, John A. 2002. Using eigenvectors of the bigram graph to infer morpheme identity. Proceedings of the sixth workshop of the ACL Special Interest Group in Computational Phonology, ed. by Maxwell, Michael, 41–47. East Stroudsburg, PA: Association for Computational Linguistics.
Bezdek, James C. 1981. Pattern recognition with fuzzy objective function algorithms. New York: Plenum.
Biggs, Norman. 1993. Algebraic graph theory. 2nd edn. Cambridge: Cambridge University Press.
Bloomfield, Leonard. 1933. Language. New York: H. Holt and Company.
Charniak, Eugene. 1993. Statistical language learning. Cambridge, MA: MIT Press.
Chomsky, Noam. 1986. Knowledge of language. New York: Praeger.
Chomsky, Noam. 2000. An interview on minimalism. Online: http://www.ling.ed.ac.uk/~s0450647/docs/interview_Chomsky.pdf.
Chung, Fan R. K. 1997. Spectral graph theory. Providence, RI: American Mathematical Society.
Dowman, Mike. 2008. Minimum description length as a solution to the problem of generalization in syntactic theory. Tokyo: University of Tokyo, ms. Online: http://www.ling.ed.ac.uk/~mdowman/mdl-and-generalization.pdf.
Ellison, T. Mark. 1991. The iterative learning of phonological constraints. Crawley: University of Western Australia, ms.
Ellison, T. Mark. 1994. The machine learning of phonological structure. Crawley: University of Western Australia dissertation.
Ellison, T. Mark. 2001. Induction and inherent similarity. Similarity and categorization, ed. by Hahn, Ulrike and Ramscar, Martin C., 29–49. Oxford: Oxford University Press.
Finch, Steven. 1993. Finding structure in language. Edinburgh: University of Edinburgh dissertation.
Goldsmith, John A. 2001. The unsupervised learning of natural language morphology. Computational Linguistics 27. 153–98.
Goldsmith, John A. 2009. The syllable. The handbook of phonological theory, vol. 2, ed. by Goldsmith, John, Riggle, Jason, and Yu, Alan. Oxford: Blackwell, to appear.
Goldsmith, John A., and O'Brien, Jeremy. 2006. Learning inflectional classes. Language Learning and Development 2. 219–50.
Goldsmith, John A., and Riggle, Jason. 2007. Information theoretic approaches to phonology: The case of Finnish vowel harmony. Chicago: University of Chicago, ms. Online: http://hum.uchicago.edu/~jagoldsm//Papers/boltzmann.pdf.
Goldsmith, John A., and Xanthos, Aris. 2008. Three models for learning phonological categories. Technical report 2008–8. Chicago: Department of Computer Science, University of Chicago.
Goldwater, Sharon. 2006. Nonparametric Bayesian models of lexical acquisition. Providence, RI: Brown University dissertation.
Guy, Jacques. 1991. Vowel identification: An old (but good) algorithm. Cryptologia 15. 258–62.
Hayes, Bruce, and Wilson, Colin. 2008. A maximum entropy model of phonotactics and phonotactic learning. Linguistic Inquiry 39. 379–440.
Jelinek, Frederick. 1997. Statistical methods for speech recognition. Cambridge, MA: MIT Press.
Kannan, Ravi, Vempala, Santosh, and Vetta, Adrian. 2000. On clusterings: Good, bad, and spectral. Proceedings of the 41st Annual Symposium on the Foundation of Computer Science, 367–80. Washington, DC: IEEE Computer Society.
Peperkamp, Sharon, Le Calvez, Rozenn, Nadal, Jean-Pierre, and Dupoux, Emmanuel. 2006. The acquisition of allophonic rules: Statistical learning with linguistic constraints. Cognition 101. B31–B41.
Pike, Kenneth. 1943. Phonetics. Ann Arbor: University of Michigan Press.
Powers, David M. W. 1991. How far can self-organization go? Results in unsupervised language learning. Proceedings of AAAI Spring Symposium on Machine Learning of Natural Language and Ontology, ed. by Powers, David M. W. and Reeker, Larry, 131–37.
Powers, David M. W. 1997. Unsupervised learning of linguistic structure: An empirical evaluation. International Journal of Corpus Linguistics 2. 91–132.
Reichenbach, Hans. 1938. Experience and prediction. Chicago: University of Chicago Press.
Rissanen, Jorma. 1989. Stochastic complexity in statistical inquiry. Singapore: World Scientific.
Saffran, Jenny R., Aslin, Richard N., and Newport, Elissa L. 1996. Statistical learning by 8-month-old infants. Science 274. 1926–28.
Schifferdecker, G. 1994. Finding structure in language. Karlsruhe: University of Karlsruhe master's thesis.
Silverstein, Michael (ed.) 1971. Whitney on language: Selected writings of William Dwight Whitney. Cambridge, MA: MIT Press.
Sukhotin, Boris V. 1962. Eksperimental'noe vydelenie klassov bukv s pomoščju EVM. Problemy strukturnoj lingvistiki 234. 189–206.
Sukhotin, Boris V. 1973. Méthode de déchiffrage, outil de recherche en linguistique. T. A. Informations 2. 1–43.
Tesar, Bruce. 1998. An iterative strategy for language learning. Lingua 104. 131–45.
Trubetzkoy, Nicolai S. 1969. Principles of phonology. Berkeley: University of California Press.
van der Hulst, Harry, and van de Weijer, Jeroen. 1995. Vowel harmony. The handbook of phonological theory, ed. by Goldsmith, John A., 495–534. Oxford: Blackwell.
Ward, Joe H. 1963. Hierarchical grouping to optimize an objective function. Journal of the American Statistical Association 58. 236–44.
Xanthos, Aris. 2008. Apprentissage automatique de la morphologie: Le cas des structures racine-schème. (Sciences pour la communication 88.) Berne: Peter Lang.