A systematic review of unsupervised approaches to grammar induction

Vigneshwaran Muralidaran; Irena Spasić; Dawn Knight

doi:10.1017/S1351324920000327

Published online by Cambridge University Press: 27 October 2020

Affiliations:
Vigneshwaran Muralidaran: School of English, Communication and Philosophy, Cardiff University, John Percival Building, Colum Drive, Cardiff, UK
Irena Spasić: School of Computer Science and Informatics, Cardiff University, Queen’s Buildings, The Parade, Cardiff, UK
Dawn Knight*: School of English, Communication and Philosophy, Cardiff University, John Percival Building, Colum Drive, Cardiff, UK
*Corresponding author. E-mail: knightd5@cardiff.ac.uk

Abstract

This study systematically reviews existing approaches to unsupervised grammar induction in terms of their theoretical underpinnings, practical implementations and evaluation. Our motivation is to identify the influence of functional-cognitive schools of grammar on language processing models in computational linguistics. This is an effort to bridge any gap between these theoretical schools and the computational processing models of grammar induction. Specifically, the review aims to answer the following research questions: Which types of grammar theories have been the subjects of grammar induction? Which methods have been employed to support grammar induction? Which features have been used by these methods for learning? How were these methods evaluated? Finally, in terms of performance, how do these methods compare to one another? Forty-three studies were identified for systematic review, of which 33 described original implementations of grammar induction, three provided surveys and seven focused on theories and experiments related to the acquisition and processing of grammar in humans. The data extracted from the 33 implementations were stratified into seven aspects of analysis: theory of grammar; output representation; how grammatical productivity is processed; how grammatical productivity is represented; features used for learning; evaluation strategy; and implementation methodology. In most of the implementations considered, grammar was treated as a generative-formal system, autonomous and independent of meaning. Parser decoding was done in a non-incremental, head-driven fashion, on the assumption that all words are available to the parsing model, and the output representation of the learnt grammar was hierarchical, typically a dependency or a constituency tree.
However, the theoretical and experimental studies considered suggest that a usage-based, incremental, sequential system of grammar is more appropriate than the formal, non-incremental, hierarchical view of grammar. This gap between the theoretical and experimental studies on the one hand and the computational implementations on the other should be addressed to enable further progress in computational grammar induction research.

Keywords

Natural language processing; Formal grammar; Usage-based grammar; Grammar induction; Parsing

Information

Type: Survey Paper
Information: Natural Language Engineering, Volume 27, Issue 6, November 2021, pp. 647–689

DOI: https://doi.org/10.1017/S1351324920000327
Copyright: © The Author(s), 2020. Published by Cambridge University Press
