Error-driven learning in Optimality Theory and Harmonic Grammar: a comparison*

  • Giorgio Magri

OT error-driven learning admits guarantees of efficiency, stochastic tolerance and noise robustness which hold independently of any substantive assumptions on the constraints. This paper shows that the HG learner used in the current literature does not admit such constraint-independent guarantees. The HG theory of error-driven learning thus needs to be substantially restricted to specific constraint sets.

Parts of this paper were presented at the 21st Manchester Phonology Meeting in 2013 and at the 11th Old World Conference in Phonology in 2014. I wish to thank Paul Boersma and Joe Pater for useful discussion. Three anonymous reviewers and the associate editor of the journal also provided me with detailed and valuable suggestions. The research reported in this paper was supported by a grant from the Fyssen Research Foundation, as well as by a Marie Curie Intra-European Fellowship within the 7th European Community Framework Programme.

Appendices providing more technical details and simulation results can be found in supplementary online materials at

  • ISSN: 0952-6757
  • EISSN: 1469-8188
  • URL: /core/journals/phonology