
Computational learning of construction grammars

Published online by Cambridge University Press: 28 March 2016

JONATHAN DUNN*
Affiliation:
Illinois Institute of Technology, Department of Computer Science
*Address for correspondence: 3300 South Federal Street, Chicago, IL 60616; web: www.jdunn.name; e-mail: jonathan.edwin.dunn@gmail.com

Abstract

This paper presents an algorithm for learning the construction grammar of a language from a large corpus. This grammar induction algorithm has two goals: first, to show that construction grammars are learnable without highly specified innate structure; second, to develop a model of which units do or do not constitute constructions in a given dataset. The basic task of construction grammar induction is to identify the minimum set of constructions that represents the language in question with maximum descriptive adequacy. These constructions must (1) generalize across an unspecified number of units while (2) containing mixed levels of representation internally (e.g., both item-specific and schematized representations), and (3) allowing for unfilled and partially filled slots. Additionally, these constructions may (4) contain recursive structure within a given slot that needs to be reduced in order to produce a sufficiently schematic representation. In other words, these constructions are multi-length, multi-level, possibly discontinuous co-occurrences which generalize across internal recursive structures. These co-occurrences are modeled using frequency and the ΔP measure of association, expanded in novel ways to cover multi-unit sequences. This work provides important new evidence for the learnability of construction grammars as well as a tool for the automated corpus analysis of constructions.
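
The pairwise ΔP measure mentioned above can be sketched as follows. This is only a minimal illustration of the standard two-way ΔP computed over adjacent units; it does not reproduce the multi-unit extensions developed in the paper. The function names, the toy sentence, the choice of adjacent word pairs as observations, and the "left-to-right" and "right-to-left" labels are all assumptions made here for illustration.

```python
def contingency(pairs, left, right):
    """Count the 2x2 contingency cells for an adjacent pair (left, right).

    a: left followed by right        b: left followed by something else
    c: something else before right   d: neither unit present
    """
    a = b = c = d = 0
    for x, y in pairs:
        if x == left and y == right:
            a += 1
        elif x == left:
            b += 1
        elif y == right:
            c += 1
        else:
            d += 1
    return a, b, c, d


def _ratio(n, total):
    # Guard against empty rows or columns in very small samples.
    return n / total if total else 0.0


def delta_p_left_to_right(pairs, left, right):
    """P(right | left) - P(right | not left)."""
    a, b, c, d = contingency(pairs, left, right)
    return _ratio(a, a + b) - _ratio(c, c + d)


def delta_p_right_to_left(pairs, left, right):
    """P(left | right) - P(left | not right)."""
    a, b, c, d = contingency(pairs, left, right)
    return _ratio(a, a + c) - _ratio(b, b + d)


# Toy data (hypothetical): adjacent word pairs from one short sentence.
tokens = "the dog saw the cat".split()
pairs = list(zip(tokens, tokens[1:]))

lr = delta_p_left_to_right(pairs, "the", "dog")
rl = delta_p_right_to_left(pairs, "the", "dog")
```

The two directions generally differ, because ΔP is asymmetric: how well "the" predicts "dog" is not the same as how well "dog" predicts a preceding "the". A realistic estimate would of course need counts from a large corpus rather than a single sentence.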

Information

Type: Research Article
Copyright © UK Cognitive Linguistics Association 2016

Fig. 1. Grammar and grammars.

Table 1. The construction-grammar induction algorithm

Table 2. Calculating ΔP

Table 3. Calculating the Summed ΔP

Table 4. Calculating the Reduced ΔP

Table 5. Calculating the Divided ΔP

Table 6. Calculating the Direction ΔP

Table 7. Summary of measures in vector representing the candidates

Table 8. From potential to actual constructions

Fig. 2. Left-to-Right Correlations.

Fig. 3. Right-to-Left Correlations.

Table 9. Distribution measures for each feature

Fig. 4. Degree of coverage across test sets of 100k sentences.

Table 10. Grammar agreement across corpus sizes

Fig. 5. Stability across simulated learners.