
Computational learning of construction grammars

  • JONATHAN DUNN (a1)
Abstract

This paper presents an algorithm for learning the construction grammar of a language from a large corpus. This grammar induction algorithm has two goals: first, to show that construction grammars are learnable without highly specified innate structure; second, to develop a model of which units do or do not constitute constructions in a given dataset. The basic task of construction grammar induction is to identify the minimum set of constructions that represents the language in question with maximum descriptive adequacy. These constructions must (1) generalize across an unspecified number of units while (2) containing mixed levels of representation internally (e.g., both item-specific and schematized representations), and (3) allowing for unfilled and partially filled slots. Additionally, these constructions may (4) contain recursive structure within a given slot that needs to be reduced in order to produce a sufficiently schematic representation. In other words, these constructions are multi-length, multi-level, possibly discontinuous co-occurrences which generalize across internal recursive structures. These co-occurrences are modeled using frequency and the ΔP measure of association, expanded in novel ways to cover multi-unit sequences. This work provides important new evidence for the learnability of construction grammars as well as a tool for the automated corpus analysis of constructions.
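As a rough illustration of the association measure named above, the following sketch computes the standard pairwise ΔP, i.e., P(outcome | cue) − P(outcome | no cue), over adjacent units in a toy token sequence. It covers only the two-unit case in both directions; the multi-unit expansion developed in the paper is not specified in this abstract and is not reproduced here. The function name `delta_p`, the helper `adjacent_pairs`, and the toy sentence are assumptions made for this example, not part of the paper.

```python
def adjacent_pairs(tokens):
    """Return all (left, right) pairs of adjacent units in a token sequence."""
    return list(zip(tokens, tokens[1:]))


def delta_p(pairs, cue, outcome):
    """Pairwise Delta-P: P(outcome | cue) - P(outcome | not cue).

    `pairs` is a list of (cue_slot, outcome_slot) tuples. For the
    left-to-right direction pass (left, right) pairs; for the
    right-to-left direction pass the pairs with their slots swapped.
    (Hypothetical helper for illustration only.)
    """
    a = b = c = d = 0  # cells of the 2x2 contingency table
    for cue_unit, outcome_unit in pairs:
        if cue_unit == cue:
            if outcome_unit == outcome:
                a += 1  # cue present, outcome present
            else:
                b += 1  # cue present, outcome absent
        else:
            if outcome_unit == outcome:
                c += 1  # cue absent, outcome present
            else:
                d += 1  # cue absent, outcome absent
    p_with_cue = a / (a + b) if (a + b) else 0.0
    p_without_cue = c / (c + d) if (c + d) else 0.0
    return p_with_cue - p_without_cue


tokens = "the dog saw the cat and the dog ran".split()
pairs = adjacent_pairs(tokens)

# Left-to-right: how strongly does "the" predict a following "dog"?
left_to_right = delta_p(pairs, "the", "dog")

# Right-to-left: how strongly does "dog" predict a preceding "the"?
swapped = [(right, left) for left, right in pairs]
right_to_left = delta_p(swapped, "dog", "the")

print(round(left_to_right, 4), round(right_to_left, 4))
```

Because ΔP is directional, the two values generally differ. That asymmetry is what distinguishes it from symmetric association measures such as pointwise mutual information.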

Corresponding author
Address for correspondence: 3300 South Federal Street, Chicago, IL 60616; web: www.jdunn.name; e-mail: jonathan.edwin.dunn@gmail.com
Footnotes
*

The author would like to thank Shlomo Argamon and Joshua Trampier for their support and engagement throughout this project. This work was funded in part by the Oak Ridge Institute for Science and Education.



Language and Cognition
  • ISSN: 1866-9808
  • EISSN: 1866-9859
  • URL: /core/journals/language-and-cognition