
Learning opacity in Stratal Maximum Entropy Grammar*

  • Aleksei Nazarov and Joe Pater


Opaque phonological patterns are sometimes claimed to be difficult to learn; specific hypotheses have been advanced about the relative difficulty of particular kinds of opaque processes (Kiparsky 1971, 1973) and about the kinds of data that help in learning an opaque pattern (Kiparsky 2000). In this paper, we present a computationally implemented learning theory for one grammatical theory of opacity, a Maximum Entropy version of Stratal OT (Bermúdez-Otero 1999, Kiparsky 2000), and test it on simplified versions of opaque French tense–lax vowel alternations and the opaque interaction of diphthong raising and flapping in Canadian English. We find that the difficulty of opacity can be influenced by evidence for stratal affiliation: the Canadian English case is easier if the learner encounters application of raising outside the flapping context, or non-application of raising between words (e.g. life with [ʌɪ]; lie for with [aɪ]).
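As a rough illustration of the kind of grammar the abstract describes, the sketch below chains two Maximum Entropy strata (word level, then phrase level) and sums over intermediate forms to get surface probabilities. The constraint names, the weights, and the two-feature encoding of a form as (diphthong, consonant) are assumptions made for this example only; they are not taken from the paper, and no learning is performed here.

```python
import math
from itertools import product

# Hypothetical toy encoding: a form is (diphthong, following consonant).
VOWELS = ("aɪ", "ʌɪ")
CONSONANTS = ("t", "ɾ")
CANDIDATES = list(product(VOWELS, CONSONANTS))

def violations(inp, out):
    """Violation counts for four illustrative (assumed) constraints."""
    (v_in, c_in), (v_out, c_out) = inp, out
    return {
        "*aɪt": int(v_out == "aɪ" and c_out == "t"),  # low diphthong before voiceless stop
        "*VtV": int(c_out == "t"),                    # intervocalic [t]
        "Ident-V": int(v_in != v_out),                # vowel faithfulness
        "Ident-C": int(c_in != c_out),                # consonant faithfulness
    }

def maxent(inp, weights):
    """P(out | inp) is proportional to exp(-weighted violation sum)."""
    harmony = {
        out: -sum(weights[c] * n for c, n in violations(inp, out).items())
        for out in CANDIDATES
    }
    z = sum(math.exp(h) for h in harmony.values())
    return {out: math.exp(h) / z for out, h in harmony.items()}

def stratal(ur, word_w, phrase_w):
    """Surface probability: sum over intermediate forms of the path product."""
    surface = {out: 0.0 for out in CANDIDATES}
    for mid, p_mid in maxent(ur, word_w).items():
        for out, p_out in maxent(mid, phrase_w).items():
            surface[out] += p_mid * p_out
    return surface

# Assumed weights: raising is favoured at the word level, flapping at the phrase level.
WORD_W = {"*aɪt": 10.0, "*VtV": 0.0, "Ident-V": 1.0, "Ident-C": 10.0}
PHRASE_W = {"*aɪt": 0.0, "*VtV": 10.0, "Ident-V": 10.0, "Ident-C": 1.0}

probs = stratal(("aɪ", "t"), WORD_W, PHRASE_W)
best = max(probs, key=probs.get)
print(best)  # the opaque candidate: raised diphthong before a flap
```

With these assumed weights, raising applies before flapping removes its context, so most of the probability mass falls on the opaque form with [ʌɪ] before [ɾ]. Moving the high weight on *aɪt from the word level to the phrase level would change which stratum raising belongs to, which is the sort of stratal affiliation the abstract refers to.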



We would like to thank Ricardo Bermúdez-Otero, Paul Boersma, Jeroen Breteler, Ivy Hauser, Jeff Heinz, Coral Hughto, Gaja Jarosz, Marc van Oostendorp, Olivier Rizzolo, Klaas Seinhorst and Robert Staubs, as well as audiences at the 21st Manchester Phonology Meeting, the University of Massachusetts Amherst and the University of Amsterdam, for their insightful feedback on this paper and for stimulating discussion. We also thank the editors of this volume and two anonymous reviewers for their very helpful comments. We are grateful to the National Science Foundation for supporting this work through grants BCS-0813829 and BCS-1424077. All errors are ours.



Bermúdez-Otero, Ricardo (1999). Constraint interaction in language change: quantity in English and Germanic. PhD dissertation, University of Manchester & University of Santiago de Compostela.
Bermúdez-Otero, Ricardo (2003). The acquisition of phonological opacity. In Spenader et al. (2003). 25–36.
Boersma, Paul (1998). Functional phonology: formalizing the interactions between articulatory and perceptual drives. PhD dissertation, University of Amsterdam.
Boersma, Paul & van Leussen, Jan-Willem (to appear). Efficient evaluation and learning in multilevel parallel constraint grammars. LI 48.
Byrd, Richard H., Lu, Peihuang, Nocedal, Jorge & Zhu, Ciyou (1995). A limited memory algorithm for bound constrained optimization. SIAM Journal on Scientific Computing 16. 1190–1208.
Coetzee, Andries W. & Pater, Joe (2011). The place of variation in phonological theory. In Goldsmith et al. (2011). 401–434.
De Jong, Kenneth J. (2011). Flapping in American English. In van Oostendorp, Marc, Ewen, Colin J., Hume, Elizabeth & Rice, Keren (eds.) The Blackwell companion to phonology. Malden, Mass.: Wiley-Blackwell. 2711–2729.
Dinnsen, Daniel A. & Farris-Trimble, Ashley W. (2008). An opacity-tolerant conspiracy in phonological acquisition. Indiana University Working Papers in Linguistics 6. 99–118.
Eisenstat, Sarah (2009). Learning underlying forms with MaxEnt. MA thesis, Brown University.
Eychenne, Lucien (2014). Schwa and the loi de position in Southern French. Journal of French Language Studies 24. 223–253.
Goldsmith, John A., Riggle, Jason & Yu, Alan C. L. (eds.) (2011). The handbook of phonological theory. 2nd edn. Malden, Mass.: Wiley-Blackwell.
Goldwater, Sharon & Johnson, Mark (2003). Learning OT constraint rankings using a Maximum Entropy model. In Spenader et al. (2003). 111–120.
Hayes, Bruce & Wilson, Colin (2008). A maximum entropy model of phonotactics and phonotactic learning. LI 39. 379–440.
Hoerl, Arthur E. & Kennard, Robert W. (1970). Ridge regression: biased estimation for nonorthogonal problems. Technometrics 12. 55–67.
Idsardi, William J. (2000). Clarifying opacity. The Linguistic Review 17. 337–350.
Jarosz, Gaja (2006). Rich lexicons and restrictive grammars: maximum likelihood learning in Optimality Theory. PhD dissertation, Johns Hopkins University.
Jarosz, Gaja (2014). Serial markedness reduction. In John Kingston, Claire Moore-Cantwell, Joe Pater & Robert Staubs (eds.) Proceedings of the 2013 Meeting on Phonology. Available (May 2017) at
Jarosz, Gaja (2015). Expectation driven learning of phonology. Ms, University of Massachusetts Amherst.
Jarosz, Gaja (2016). Learning opaque and transparent interactions in Harmonic Serialism. In Gunnar Ólafur Hansson, Ashley Farris-Trimble, Kevin McMullin & Douglas Pulleyblank (eds.) Proceedings of the 2015 Annual Meeting on Phonology. Available (May 2017) at
Johnson, Mark (2013). A gentle introduction to maximum entropy, log-linear, exponential, logistic, harmonic, Boltzmann, Markov Random Fields, Conditional Random Fields, etc., models. Slides of paper presented to the Macquarie University Machine Learning Reading Group. Available (May 2017) at
Johnson, Mark, Pater, Joe, Staubs, Robert & Dupoux, Emmanuel (2015). Sign constraints on feature weights improve a joint model of word segmentation and phonology. In Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Association for Computational Linguistics. 303–313.
Joos, Martin (1942). A phonological dilemma in Canadian English. Lg 18. 141–144.
Kaye, Jonathan (1990). What ever happened to Dialect B? In Mascaró, Joan & Nespor, Marina (eds.) Grammar in progress: GLOW essays for Henk van Riemsdijk. Dordrecht: Foris. 259–263.
Kim, Yun Jung (2012). Do learners prefer transparent rule ordering? An artificial language learning study. CLS 48:1. 375–386.
Kiparsky, Paul (1971). Historical linguistics. In Dingwall, William Orr (ed.) A survey of linguistic science. College Park: University of Maryland Linguistics Program. 576–642.
Kiparsky, Paul (1973). Abstractness, opacity, and global rules. In Fujimura, Osamu (ed.) Three dimensions in linguistic theory. Tokyo: TEC. 57–86.
Kiparsky, Paul (2000). Opacity and cyclicity. The Linguistic Review 17. 351–365.
Kullback, S. & Leibler, R. A. (1951). On information and sufficiency. Annals of Mathematical Statistics 22. 79–86.
McCarthy, John J. (2007). Hidden generalizations: phonological opacity in Optimality Theory. Sheffield & Bristol, Conn.: Equinox.
McCarthy, John J. & Pater, Joe (eds.) (2016). Harmonic Grammar and Harmonic Serialism. London: Equinox.
Moreux, Bernard (1985). La ‘Loi de Position’ en français du Midi. I: Synchronie (Béarn). Cahiers de Grammaire 9. 45–138.
Moreux, Bernard (2006). Les voyelles moyennes en français du Midi: une tentative de synthèse en 1985. Cahiers de Grammaire 30. 307–317.
Odden, David (2011). Rules v. constraints. In Goldsmith et al. (2011). 1–39.
Pater, Joe (2014). Canadian raising with language-specific weighted constraints. Lg 90. 230–240.
Pater, Joe (2016). Universal Grammar with weighted constraints. In McCarthy & Pater (2016). 1–46.
Pater, Joe, Staubs, Robert, Jesney, Karen & Smith, Brian (2012). Learning probabilities over underlying representations. In Proceedings of the 12th Meeting of the Special Interest Group on Computational Morphology and Phonology. Montreal: Association for Computational Linguistics. 62–71. Available (May 2017) at
R Core Team (2013). R: a language and environment for statistical computing. Vienna: R Foundation for Statistical Computing.
Rizzolo, Olivier (2002). Du leurre phonétique des voyelles moyennes en français et du divorce entre licenciement et licenciement pour gouverner. PhD dissertation, University of Nice-Sophia Antipolis.
Selkirk, Elisabeth (1978). The French foot: on the status of ‘mute’ e. Studies in French Linguistics 1:2. 141–150.
Smolensky, Paul (1986). Information processing in dynamical systems: foundations of Harmony Theory. In Rumelhart, D. E., McClelland, J. L. & the PDP Research Group (eds.) Parallel Distributed Processing: explorations in the micro-structure of cognition. Vol. 1: Foundations. Cambridge, Mass.: MIT Press. 194–281.
Smolensky, Paul & Legendre, Géraldine (eds.) (2006). The harmonic mind: from neural computation to optimality-theoretic grammar. 2 vols. Cambridge, Mass.: MIT Press.
Spenader, Jennifer, Eriksson, Anders & Dahl, Östen (eds.) (2003). Variation within Optimality Theory: Proceedings of the Stockholm Workshop on ‘Variation within Optimality Theory’. Stockholm: Department of Linguistics, Stockholm University.
Staubs, Robert (2014a). Computational modeling of learning biases in stress typology. PhD dissertation, University of Massachusetts Amherst.
Staubs, Robert (2014b). Stratal MaxEnt Solver. Software package. Available (July 2017) at
Staubs, Robert & Pater, Joe (2016). Learning serial constraint-based grammars. In McCarthy & Pater (2016). 369–388.
Tesar, Bruce & Smolensky, Paul (2000). Learnability in Optimality Theory. Cambridge, Mass.: MIT Press.
