Approaches collectively described as ‘cognitive linguistics’ have been very important in providing a framework in which to embed the emergentist or usage-based (UB) study of children’s language development. The crucial aspects have been the emphasis on constructions and on their identity as mappings of form to function/meaning. Adopting a construction/form-mapping perspective has had the effect of moving us away from seeing sentence production and comprehension as based on the algorithmic assembly of abstract categories. It should, therefore, also focus us not only on the form of what children say and understand but also on the function or meaning that is conveyed by the form. A second important thrust of the UB approach is that most or all of early language learning can be explained by general cognitive processes (e.g., working memory, processing speed, the development of prototypes) rather than by any syntactically dedicated (innate) factors. Thus, the roles of frequency, saliency, processing speed, and memory are crucial to UB explanations of language learning and development.
In this paper, I outline the usage-based theory of language development and detail some of the empirical evidence that supports it. However, even in usage-based approaches to language development there has been rather more attention to the learning of form than to the functions of these forms, as well as less attention to the processing issues that might underlie learning. This is the direct result, I would suggest, of the attempt to counter theoretical perspectives which have tended to see language acquisition as the outcome of a set of innate, and specifically linguistic, modules (Valian, 1986; Gibson & Wexler, 1994; Sakas & Fodor, 2012; Guasti, 2004). In what follows, I try to indicate some research avenues that would redress this balance.
1. A brief outline of the emergentist/usage-based approach
A growing body of research indicates that form–meaning mappings begin to be established in infancy and become attached to emergent pattern identification. Thus, syntactic development does not start out as a separate encapsulated process. In the first year of life, there are many developments in infant speech perception, cognition, and communication which come together in a range of intention-reading behaviours in the last trimester of the first year. Children start to manifest an understanding of other minds – that others have intentions and that these can form the basis of common ground. Thus they start to share attention with others, to point, to inform, and to imitate (Tomasello, 1999). Tomasello has argued that this development of shared intentionality underpins the development of language because it depends on some understanding of common ground, and without common ground the utterances of others would be difficult, if not impossible, to interpret. Children are exposed to many meaningful usage events which they can now begin to interpret in the context of this newly developing understanding of shared intentionality. Grammar is learned through a continuous process of abstraction. Constituency and more complex syntax emerge through this process.
While there are obviously precursors to the development of intention-reading before the so-called 9-month revolution, there does seem to be good evidence that there is a step change at around 8–9 months when intention-reading starts to emerge, and this seems to be true across cultures. Thus Callaghan et al. (2011) reported interviews and experiments across a number of different cultures in India, Peru, and Canada and found that the onset of pointing, helping others, imitation, and collaboration all emerged at roughly the same ages. Lieven and Stoll (2013) found similar results for children growing up in Germany and in a community in East Nepal speaking an endangered Tibeto-Burman language, while Brown (2011) reported similar rates of interactive initiation by babies growing up in the very different cultures of the Chiapas of Mexico and Rossel Island of Papua New Guinea, despite large differences in the rates of interactive initiation with the babies by the adults in the two cultures. This suggests that there may be some ontogenetic timetable for the emergence of these mind-reading skills upon which language development builds. However, it is unlikely that this would happen in isolation from a human social and linguistic environment. Although difficult to interpret, most studies of so-called ‘wild children’ suggest that what Keller (2007, p. 22) characterizes as ‘(universal) parenting systems’ are necessary for normal social and linguistic cognition. On the other hand, it may be possible to get quite a long way with learning aspects of structural language with relatively impaired intention-reading skills.
Children diagnosed with autism spectrum disorder show impairments in early joint attention and later in the pragmatic uses of language, but some of them show relatively intact structural language, though they typically develop it more slowly. Much more research is needed to work out how these children manage this, and to explore the possibility of different routes into the learning of language structure, but for the purposes of this paper I concentrate on the close relationship between the development of the understanding of common ground and the mapping of form to meaning in typical children’s development of language.
2. Emergent categories in language development
In the usage-based approach, linguistic categories such as noun, verb, noun phrase, subject, and object are not pre-given but emerge as the child constructs language by connecting what they already know in terms of the cognitive and intention-reading developments of the first year to the language that they hear. Initially, children’s constructions will consist not only of single words but also of ‘big words’, i.e., rote-learned, unanalyzed strings of words (Peters, 1983; Lieven, Pine, & Dresner-Barnes, 1992; Bannard & Matthews, 2008; Arnon & Snider, 2010; McCauley & Christiansen, 2014) or of stems with a specific array of morphemes. To take an example, when children start to say ‘what’s that’, it may well have been learned as an unanalyzed whole, unconnected to the copula. While adults may also have this as a ‘big word’ in their lexicons (Bybee & Scheibman, 1999), it will almost certainly be connected to analyzed, related forms so that the adult can reply, as appropriate, with other forms of the copula (e.g., Yes, what is that? or I don’t know, I thought it was a whale but it’s actually a seal!).
The development of word categories is tied to children starting to develop low-scope slot-and-frame patterns based on the frequencies in the input. Examples from English are It’s X-ing, I want a Y, That’s a Z. The slots in these patterns are the basis of emergent categories, initially of low-semantic scope such as THING or ACTION but showing increasing evidence of abstraction. An example of this for English-speaking children is the development of the noun phrase (Lieven, Salomo, & Tomasello, 2009), in which children build up the THING slots in schemas such as I want the X, It’s a Y, from bare nouns, then add determiners, and finally adjectives. Bannard, Lieven, and Tomasello (2009) used a computational model based entirely on the distribution of words in child-directed speech to develop grammars of two children’s speech at 2;0 and 3;0. They found, first, that at 2;0 the models did as well as or better than models which started from abstract categories such as noun and verb, but, second, that by 3;0, adding categories of noun and verb significantly reduced the number of operations needed to generate the children’s utterances. This tends to support both naturalistic and experimental results with English-speaking children indicating that, at younger ages, they are more flexible with nouns than with verbs, but that more schematic verb categories are also developing from early on. In principle, morphological learning could follow the same pattern, with children first developing slot-and-frame templates from which morphological categories emerge.
3. What children hear and how it affects what they learn
The idea that children can build slot-and-frame patterns from what they hear in child-directed speech (CDS) is supported by analyses showing how repetitive the first three words in English CDS utterances are. The study by Cameron-Faulkner, Lieven, and Tomasello (2003) of twelve mother–child dyads reported that fifty-two ‘core frames’ (used by more than half the mothers) accounted for 51% of all the CDS utterances, and that 45% of these utterances started with just one of seventeen words. In many ways this is not that surprising: there are a limited number of topics about which one can converse with a two-year-old. As well, since 32% of the utterances were questions, they started with the limited set of question words and auxiliaries available in English, resulting in English-speaking children being presented with highly repetitive frames. Interestingly, this was also found to be the case for Russian and German CDS (Stoll, Abbot-Smith, & Lieven, 2009). Despite the effects of typological differences (e.g., subject drop and the lack of a copula in Russian present tense, and somewhat more varied word order in both German and Russian), sentence-initial frames accounted for between 70 and 80% of CDS utterances. Arnon (p.c.) has found a similar result for Hebrew.
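The kind of frame count described above can be illustrated with a minimal sketch. This is not the procedure used in the studies cited; the function name `initial_frame_coverage`, the two-word frame length, and the toy utterances are all assumptions made for illustration.

```python
from collections import Counter

def initial_frame_coverage(utterances, frame_len=2, top_n=2):
    """Share of utterances that begin with one of the top_n most common
    utterance-initial word sequences of length frame_len."""
    frames = []
    for utt in utterances:
        words = utt.lower().split()
        if len(words) >= frame_len:
            frames.append(tuple(words[:frame_len]))
    top = Counter(frames).most_common(top_n)
    covered = sum(count for _, count in top)
    return top, covered / len(utterances)

# Invented toy sample of child-directed speech (not real corpus data).
toy_cds = [
    "what's that called",
    "what's that noise",
    "are you hungry",
    "are you tired",
    "are you ready",
    "look at the doggie",
    "that's a ball",
    "shall we go",
]

top_frames, coverage = initial_frame_coverage(toy_cds)
print(top_frames)  # the two most frequent utterance-initial frames
print(coverage)    # proportion of utterances they account for
```

In this toy sample, two frames (are you, what's that) cover five of the eight utterances. A real analysis would need decisions about tokenization, frame length, and how many speakers must share a frame before it counts as ‘core’.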
Large numbers of studies, not only for English, have found that frequency in the input is closely associated with what children learn. But three important issues need addressing. First is the problem that, if something is very frequent, it may look as if it emerges first in children’s language, but this could be due to the fact that the types of sampling regime usually employed in child language studies mean that frequent items will be recorded earlier than rare items (Tomasello & Stahl, 2004; Rowland & Fletcher, 2006). Second is the fact that children do not learn everything that is frequent. And, finally, there is the question of what types of frequency affect learning; that is, not all frequencies are equal (see Ambridge, Kidd, Rowland, & Theakston, 2015, and Lieven, 2010, for reviews of frequency effects in acquisition).
Problems of sampling bias are difficult to overcome, given the resources available to most child language researchers in terms of the number of recordings and transcriptions that can be achieved. But this is more of a problem with relatively rare events. If something is very frequent in the input, but does not occur in the child’s speech, this suggests that there is something about the form in terms of complexity or meaning that is slowing learning. I elaborate on this point below in my discussion of children’s acquisition of modal verbs, subject–auxiliary frames, and word order versus case markers.
Whether frequent forms in the input are learned earlier than less frequent forms is affected by a number of factors. An example comes from my study of six children’s learning of English auxiliaries (Lieven, 2008). There was a strong rank order correlation between the frequency of these in the input and the order in which they were found in the children’s speech, but there were a number of exceptions. Frames with could, would, and should were relatively frequent in the input, but in the period studied these emerged either late or not at all in the children’s speech. This is probably because these modals require a subtle semantics which the children did not yet control. Modals are a set of verbs that diverge from simple declarative sentences and questions about factuality, signalling a range of speaker stances towards the information being conveyed. Moreover, they are polysemous (being used to convey both speech acts and logical prediction), and in each usage they signal a slightly different range of speaker stances.
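A rank order correlation of the kind mentioned above can be sketched as follows. The numbers are invented and do not come from the study; the helper names `ranks` and `spearman_rho` are assumptions, and the shortcut formula used here only holds when there are no tied values.

```python
def ranks(values, descending=False):
    """Rank positions (1 = first) for a list with no tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=descending)
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result

def spearman_rho(x_ranks, y_ranks):
    """Spearman's rho from two rank lists (no-ties formula)."""
    n = len(x_ranks)
    d_squared = sum((a - b) ** 2 for a, b in zip(x_ranks, y_ranks))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))

# Invented illustration: five auxiliary frames, their input counts,
# and the order in which each first appears in a child's speech.
frames = ["is", "can", "do", "could", "will"]
input_counts = [900, 700, 600, 500, 300]
emergence_order = [1, 2, 3, 5, 4]  # 'could' emerges last despite its frequency

frequency_ranks = ranks(input_counts, descending=True)
rho = spearman_rho(frequency_ranks, emergence_order)
print(rho)
```

Even with one frame (could) emerging later than its input frequency predicts, the overall correlation in this toy case stays high. This is why exceptions of this kind have to be looked for item by item rather than read off the summary statistic.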
A second example of frequency not being the sole predictor of acquisition is found in a study of the acquisition of the auxiliaries BE and HAVE from Theakston, Lieven, Pine, and Rowland (2005). In this study, we correlated subject–auxiliary frames in CDS with their order of emergence in the speech of eleven children. For third person frames, the correlation was high, but it was low for first and second person frames. Children use first person much more frequently than second person, while the reverse is true for caregivers: both parties are more interested in talking about the child than about the caregiver.
A third example comes from Dittmar, Abbot-Smith, Lieven, and Tomasello’s (2008a) study, which used the Cue Competition framework (Bates & MacWhinney, 1987) to investigate German children’s reliance on word order and case marking in transitive sentences. When presented with sentences with unambiguous case marking, but non-canonical word order, it was only the oldest group of children (7;0s) and adults who were above chance in pointing to the correct picture out of two. This was despite the fact that case marking showed higher reliability than word order in the input. However, Dittmar et al. (2008a) pointed out that case-marked pronouns were included in the reliability measures, and it is possible that these are easily learned but not initially associated with any more abstract representation of the case-marking system. If these pronouns are taken out of the reliability measures, then word order wins out over case marking. The point here is that what is counted as contributing to a frequency measure matters crucially in attempting to explain order of development and that, in this case, children may learn to use case-marked pronouns well before they have a fully developed case system.
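The effect of what is counted can be made concrete with a small sketch. It uses one common formulation of cue measures from the competition model literature (availability, reliability, and validity as their product), which may differ in detail from the measures in the study cited. The `Sentence` record, the toy counts, and the decision to treat pronoun case marking as simply not contributing to the abstract case cue are all assumptions.

```python
from dataclasses import dataclass

@dataclass
class Sentence:
    first_noun_is_agent: bool  # word-order cue points to the right agent
    case_marked: bool          # unambiguous nominative/accusative marking present
    case_on_pronoun: bool      # that marking is carried by a pronoun

def cue_stats(sentences, available, correct):
    """Return (availability, reliability, validity) for one cue."""
    present = [s for s in sentences if available(s)]
    availability = len(present) / len(sentences)
    reliability = (
        sum(1 for s in present if correct(s)) / len(present) if present else 0.0
    )
    return availability, reliability, availability * reliability

# Invented toy input of 10 transitive sentences (not real corpus counts).
toy_input = (
    [Sentence(True, True, True)] * 6      # SVO, case-marked pronoun
    + [Sentence(False, True, True)] * 2   # OVS, case-marked pronoun
    + [Sentence(True, True, False)] * 1   # SVO, case marked on full nouns
    + [Sentence(True, False, False)] * 1  # SVO, case marking ambiguous
)

always = lambda s: True
word_order = cue_stats(toy_input, always, lambda s: s.first_noun_is_agent)
case_all = cue_stats(toy_input, lambda s: s.case_marked, always)
case_no_pronouns = cue_stats(
    toy_input, lambda s: s.case_marked and not s.case_on_pronoun, always
)

print("word order:", word_order)
print("case (pronouns counted):", case_all)
print("case (pronouns excluded):", case_no_pronouns)
```

With pronouns counted, case marking comes out ahead of word order in this toy input; once pronoun marking is excluded, its availability collapses and word order comes out ahead. The numbers are arbitrary, but the reversal shows why the level at which a cue is counted matters.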
One final issue concerning frequency is that, as children develop, the scope and abstractness of their linguistic system will change, and what is measured in terms of frequency should also change. Initially one may have to count at the lexically specific level, while later more abstract categories may be more relevant (Bannard et al., 2009; Ambridge & Lieven, 2014). A study by Dittmar, Abbot-Smith, Lieven, and Tomasello (2014) on active and passive comprehension gives a nice example of this. Two-and-a-half-year-old German-speaking children were presented with passive sentences containing either novel or familiar verbs. It was found that the children did better with the novel verbs than the familiar verbs. The suggestion is that the familiar verbs had become so entrenched in the active that it was difficult for children to process them in the passive, while, once they had mastered the passive construction, the novel verbs were easier.
4. Developing schematic constructions: comprehension
Although children start with rote-learned strings and low-scope schemas and may retain these into adulthood, they clearly also develop the capacity to produce and comprehend at a more abstract level. A large body of research into how this schematicity develops has focused on children’s development of the transitive causative construction. Comprehension measures include preferential looking, eye-tracking, pointing, and act-out, always in response to two videos presented side-by-side with reversed agents and patients. The original version of these paradigms used known verbs (Slobin & Bever, 1982; Hirsh-Pasek, Golinkoff, & Naigles, 1996) but, for a long time now, studies have used novel or low-frequency verbs, in order to control for children’s lexically specific knowledge of the argument structure for well-entrenched verbs. Research follows one broad paradigm in which children are presented with two videos with agents and patients reversed. They hear a transitive sentence and their ability to match it to the correct video is measured. Some studies simply manipulate the order of the arguments; in others, developed in the context of the Cue Competition Model (Bates & MacWhinney, 1987), the word order and inflectional cues that children speaking different languages utilize in interpreting such sentences are investigated.
From comprehension experiments, there is much to suggest that children know something schematic about the transitive by at least 21 months. Under some conditions, they can correctly match an active transitive containing a novel verb and an animate agent and patient to the correct video (Gertner, Fisher, & Eisengart, 2006). The interesting question is what they need to know to do this. Evidence suggests that the effect is quite fragile: at this age, they fail if the same novel verb/action is used in the two screens and if they have not received training with the particular characters used (Dittmar, Abbot-Smith, Lieven, & Tomasello, 2008b). Experiments with conjoined agent intransitives (e.g., Big Bird and Cookie Monster are meeking) (Gertner & Fisher, 2012; Noble, Iqbal, Lieven, & Theakston, 2015) suggest that young children interpret this as a transitive in which the first noun, e.g., Big Bird, is acting on the second, e.g., Cookie Monster, suggesting that there is a bias to interpret the first noun as agent. By around 3;0, this bias can be neutralized if the first noun in the conjoined agent noun phrase is inanimate, indicating that not only the position of the first noun but also its animacy is important at the youngest ages. This finding was supported by Chan, Lieven, and Tomasello’s (2009) experiment in which English-, German-, and Cantonese-speaking children aged 2;6, 3;6, and 4;6 were presented with active causatives in which the entities/nouns acting as agent and patient varied in three conditions (animate–inanimate, A-I, animate–animate, A-A, and inanimate–animate, I-A). In the prototypical condition (A-I), all two-year-olds chose the first noun as agent above chance, but the performance in the other conditions varied as a function of the language.
The youngest English group also chose the first noun as agent in the A-A condition, but this was only true for the older German two-year-olds and not for the Cantonese two-year-olds. The I-A condition was hardest for all groups, with none choosing the first inanimate noun as agent until 3;6. These effects were explained by the typology of the three languages. Namely, German allows OVS word order under certain pragmatic conditions, and Cantonese utilizes extensive argument drop, making identification of arguments more difficult. The important point here, however, is that it was the prototypical semantics of the construction that was guiding the youngest children’s choices, and not the syntactic structure alone.
Of course, word order and animacy are not the only cues to identifying the agents and patients of transitives; there is also case marking, and this is much more crucial in languages other than English, though even English has case-marked pronouns which distinguish the agent and patient roles. And here, too, the evidence is that the youngest children can only correctly identify the agents and patients of transitive causatives if they are presented with a prototypical coalition of cues. Studies in German, Polish, and Finnish show that sentences with SVO word order (by far the most frequent for transitives in all three languages) and unambiguous nominative–accusative case marking can be interpreted correctly by two-year-olds, but SVO word order with case marking neutralized remains challenging for the youngest groups in these three languages (Dittmar et al., 2008a; Krajewski & Lieven, 2014; Lemetyinen, Lieven, & Theakston, unpublished observations). If the reliability and validity of case marking in the input is counted at the level of abstract case, using the definitions in the competition model, this is a surprising result. Since all these languages allow more variable word order than does English, case marking is more reliable. However, as Dittmar et al. (2008a) pointed out, this is almost certainly the wrong level at which to count: the most frequent use of case in German input is on case-marked pronouns – so children might be able to interpret sentences containing these pronouns using lower-scope schemata, without having a full grasp of abstract case marking.
In these experiments, children were also tested on sentences with unambiguous case marking but which were object-initial and, depending on the language and age tested, even five- to seven-year-olds were at chance on these sentences in which the most frequent word order, SVO, conflicted with the object-first but accusative-marked noun. If presented without any context, these sentences are quite odd, and although the adult groups tested in these experiments all scored at ceiling, it is quite possible that they would show slowed reaction times in response to these sentences. Children do much better when the sentences are presented with an appropriate discourse context and stress pattern (contrastive: Grünloh, Lieven, & Tomasello, 2011). Simply considering factors like frequency or cue strength fails to explain this finding. A straightforward ‘performance limitations’ account also seems inadequate to deal with these data, given that the sentences all start with a clearly accusative marked noun. After all, the very first cue in the sentence is a clearly case-marked noun, so one might expect that there would not be any garden-pathing. However, from the point of view of a usage-based account, one can see these results arising from two competing processes: the deep entrenchment of SVO word order (initially with low-scope pronoun schemas) which competes with the much less frequently encountered and highly specific pragmatic contexts in which OVS word order (even with case marking) is used. This latter usage requires a coalition of contextualizing cues for its interpretation until quite late in childhood, when, probably aided by the development of metalinguistic awareness and literacy, older children and adults have access to fully schematic representations of case marking.
We can also interpret results on children’s development of the passive construction in a similar way. Messenger, Branigan, and McLean’s (2011) results from priming studies show that four-year-olds already have a relatively abstract representation of the passive construction. Presumably the abstract schemas demonstrated by the four-year-olds build up from earlier, lower-scope schemas. Evidence for the gradual construction of a more abstract schema comes from a study of two- and three-year-old children’s interpretation of English actives and passives containing case-marked pronouns (Ibbotson, Theakston, Lieven, & Tomasello, 2011). These researchers found that the children could interpret both active and passive causative transitives with two correctly case-marked third person pronouns, but struggled if one was neutralized (with it) or if one or both pronouns were ungrammatical (e.g., Him or Her as subject).
5. Developing schematic constructions: production
Production and comprehension are clearly different. The extent to which children draw on the same heuristics when putting an utterance together, as when they are trying to understand one, is an important and under-studied question. A nice example comes from Matthews, Lieven, Theakston, and Tomasello’s (2009) paper on so-called Principle A/B errors. The children in this study used the non-reflexive pronouns (he, she) and reflexive pronouns (himself, herself) correctly in production, but were quite prepared to accept the reflexive for the non-reflexive in comprehension (e.g., herself for her), presumably because it is close enough in meaning and form.
Obviously, if one thinks of an abstract level of grammar existing independently of processing requirements and other performance limitations, then this would be the starting point for both comprehension and production. But if, rather, we are dealing with a developing network of more or less entrenched interconnections based on a whole range of factors from phonology to pragmatics, then how these various factors become activated may differ as a function of the task involved and, perhaps more particularly, the specific demands set by comprehension tasks and production tasks (see, for instance, Abbot-Smith & Tomasello, 2006). However, there is good evidence that the entrenchment of schemas, and the extent to which cues are in coalition or developing as independent representations, are important factors in both production and in comprehension.
First, there is evidence for the storage of ‘big words’. Bannard and Matthews (2008) showed that children did better on production of 4-word sequences that were frequent in the input than on otherwise identical sequences in which the last word changed to make a less frequent string (e.g., a cup of tea vs. a cup of milk). Second, there is good evidence for the importance of low-scope, pronoun-based schemas particularly in the early stages of sentence production (Ambridge & Lieven, 2014). We know that children are significantly more likely to correct non-grammatical word orders to canonical word order as they get older (Akhtar, 1999). When presented with novel verbs in non-canonical word order, younger children tend to use the same word order when asked to produce the sentence with different nouns. However, when children do change to the correct canonical order, they are very likely to use schemas based on pronouns (e.g., He’s meeking it; Abbot-Smith, Lieven, & Tomasello, 2001; Matthews, Lieven, Theakston, & Tomasello, 2004, 2007). Further support for the importance of low-scope, pronoun-based schemas is shown in training studies in which children are asked to produce SVO transitive sentences with a novel verb, having been trained on other constructions containing the same verb (intransitives, passives, predicate nominal; Dodson & Tomasello, 1998; Childers & Tomasello, 2001). Again, when they correctly provide a transitive, they are significantly more likely to do so using a pronoun-based schema.
Note that this means that children already have some more abstract knowledge of the transitive and aspects of its relation to other constructions, but that this becomes more abstract with development as they become able to use the full noun phrase schema. Once again, the cues that children use to aid their production go well beyond any notion of a ‘core grammar’.
6. Explaining children’s errors
One of the best ways to explore the nature of children’s linguistic representations is to examine their systematic errors. These have often been explained within the generativist framework as arising from the abstract, and specifically linguistic, features of different versions of Universal Grammar. One of the most influential of these theories is the agreement-tense-omission model (ATOM, Schütze & Wexler, 1996; Wexler, 1998). In this model, children are innately equipped with abstract knowledge of both agreement and tense, but initially the system is set so that marking of both is optional rather than obligatory. This results in the non-finite marking of verbs (He go there; I doing it) and, where agreement is not marked, in the use of default subject case (in English, accusative, hence Him doing it). This proposal has faced many criticisms from a variety of theoretical frameworks (see, for instance, in chronological order: Hyams, 1996; Yang, 2004; Pine, Rowland, Lieven, & Theakston, 2005; Freudenthal, Pine, Aguado-Orea, & Gobet, 2007; Legate & Yang, 2007; Freudenthal, Pine, & Gobet, 2010). From the point of view of a usage-based approach, the question is how these errors are to be explained: since adults do not say these things, why do children?
A major part of the explanation lies in the interaction between children’s processing and what they hear. Thus ‘optional infinitive’ (OI) errors are explained in terms of a learning mechanism which learns preferentially from the ends of utterances. In German and Dutch, complex VPs place the non-finite verb at the end, with the finite auxiliary in V2 position. This leads to the very high rates of utterances with non-finite verbs in the speech of young Dutch and German children. Despite the presence of equal numbers of complex VPs in Spanish, the OI rate is very low, since the elements of the VP stay together and do not appear at the end of utterances very often. Part of the explanation for English OI errors is similar: children hear utterances in which the subject appears before a non-finite verb (e.g., Did you see Johnny dancing?, Let’s watch Annie run). Using a simple learning model (MOSAIC), Freudenthal and colleagues have been able to reproduce the differences in the rates of OI errors across languages, and also to show that particular verbs that occur in complex VPs in the input are precisely those in which OI errors occur (Freudenthal et al., 2007; Freudenthal et al., 2010). These results were also replicated for English in an experiment teaching children novel verbs in either questions or declaratives (Theakston, Lieven, & Tomasello, 2003).
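The logic of learning preferentially from the ends of utterances can be illustrated with a deliberately crude sketch. This is not MOSAIC or any part of it: the functions `learn_from_ends` and `nonfinite_rate`, the word-window parameter, and the small list of words treated as marking finiteness are all assumptions introduced for illustration.

```python
def learn_from_ends(utterances, max_words):
    """Keep only the last max_words words of each input utterance."""
    return [" ".join(u.split()[-max_words:]) for u in utterances]

def nonfinite_rate(chunks, finite_markers):
    """Proportion of stored chunks containing none of the finite markers."""
    bare = [c for c in chunks if not any(w in finite_markers for w in c.split())]
    return len(bare) / len(chunks)

# Invented toy input, partly based on the examples in the text.
toy_input = [
    "did you see johnny dancing",
    "let's watch annie run",
    "he is dancing",
    "annie runs",
]
# Hypothetical, hand-picked set of words treated as signalling finiteness.
finite_markers = {"did", "let's", "is", "runs"}

short_window = learn_from_ends(toy_input, max_words=2)
long_window = learn_from_ends(toy_input, max_words=5)

print(short_window)
print(nonfinite_rate(short_window, finite_markers))
print(nonfinite_rate(long_window, finite_markers))
```

With a two-word window, the learner stores strings such as johnny dancing and annie run, which look like OI errors although every input utterance was finite. With a window long enough to hold whole utterances, no such strings arise. The sketch shows only this one point and makes no claim about how the actual model represents or generalizes over the input.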
We can also explain the occurrence of some case-marking errors through a similar mechanism. Children hear strings such as Did you see me dancing, Let’s watch her run and produce utterances such as me dancing or her run. In a study of seventeen English-speaking children’s speech, Kirjavainen, Theakston, and Lieven (2009) showed that this was likely to be the case for utterances with me as subject. The frequency of these errors was correlated with the frequency with which mothers used complex VPs with me and, as in the Freudenthal et al. (2010) study, there was a lexically specific effect in that me-errors for particular children were more likely to be produced with verbs that had appeared in their mother’s complex VPs. These authors also noted that some of the children went beyond this lexical specificity and had clearly generalized the possibility of using me in subject position because they produced utterances that they would never have heard in the input (e.g., me got, me wanna). This is particularly interesting because it demonstrates how the entrenchment of a set of related strings extracted from the input could result in the network generating a ‘creative’ error.
However some errors cannot be explained in this way, and a good example involves errors where a child uses my in subject position (my want it, my do that). In a recent study (McKnight, Lieven, & Theakston, unpublished observations), we explored a wide range of frequency measures to see if these can predict children’s use of my in this construction. There were no straightforward relationships between frequencies of possible source strings in the mothers’ speech (input frequencies of my, am-I+verb strings, and my+noun/verb strings). One input effect that we did find concerned the relative proportion of proper name as opposed to pronominal use in the mothers’ first person reference (i.e., I’ll do it vs. Mummy’ll do it). Children hearing a relatively higher proportion of utterances with a proper name as first person subject produced more my errors. Further, the higher relative frequency of my and lower relative frequency of I in children’s speech before the onset of errors predicted children’s my-for-I-error rates. This supports a suggestion by Rispoli (Reference Rispoli1994) that pronoun case-marking errors reflect the relative strength of representation of different forms meaning roughly the same thing (e.g., I vs. me vs. my). An important point is that all the children were using I as a subject in sentences as well as the erroneous my. So we also explored a suggestion made by Budwig (Reference Budwig1989) that children might be differentiating the meaning of I and my as subject by using my to express a claim to agency. This turned out to do a good job of accounting for the data: my-utterances were significantly more likely to occur in situations in which the child was claiming agency over an object or action than were utterances with the same verb preceded by I. Thus there is no single factor determining the production of these my errors. 
They result from a complex of usage-based, distributional factors: lack of pronoun modelling, a low entrenchment of the correct I form, and the overextension of the possessive meaning of my, and these may have different strengths for different children as well as varying with the precise contexts of use. Errors occur early in development as children have yet to learn the precise phonological, semantic, or pragmatic properties of particular slots and/or particular items. They gradually cease as these are acquired, causing children to no longer use items in inappropriate slots. During development, the particular production task that the child has on hand will determine which usage wins out.
7. Meaning and form
A central tenet of construction-based linguistic theory is that form and meaning are indissolubly linked in constructions, and that constructions have their own meaning – their meaning does not just come from the meaning of the individual words that they contain. On the usage-based assumption that young children learn language in order to communicate, the relationship of form to meaning is obviously a crucial area for research. However, in research on the learning of syntax, there has tended to be more of a focus on structure than on meaning. I think this has been in reaction to the emphasis on abstract structure in generativist theory and the claim that children could not learn this structure from what they hear. Usage-based researchers have been concerned to show how children can indeed abstract a grammar from the language that they hear, and to argue that generativist theories are not able to solve the ‘linking problem’ of how the hypothesized Universal Grammar interacts with the input to produce the grammar of the specific language (Ambridge, Pine, & Lieven, Reference Ambridge, Noble and Lieven2014).
There have, of course, always been exceptions to this focus on abstract syntax. A good example is Clark’s early studies of children’s semantic over- and under-extensions in word learning (Clark, Reference Clark and Moore1973, Reference Clark, Sinclair, Jarvella and Levelt1978). More recent, and more focused on sentence structure, is work by Goldberg and colleagues on the learning of novel constructions with a particular meaning. Although these studies explore argument linking as well as construction meaning, it is usually construction meaning that is assessed at test (Goldberg, Casenhiser, & Sethuraman, Reference Goldberg, Casenhiser and Sethuraman2004).
In a recent study (Ambridge, Noble, & Lieven, Reference Ambridge, Noble and Lieven2014) we have shown that construction meaning can override word meaning. We used a forced-choice pointing task in which we presented adults and children aged 3;0–3;6 with ungrammatical noun–verb–noun uses of intransitive-only verbs (e.g., *Bob laughed Wendy). Participants had to select either a picture mapped to a construction with a causal interpretation (e.g., ‘Bob made Wendy laugh’) or a non-causal repair interpretation (e.g., ‘Bob laughed at Wendy’). On at least 82% of trials both adults and children chose the causal construction meaning regardless of verb frequency. This supports cognitive linguistic approaches which argue that verb argument structure constructions have meanings in and of themselves, and suggests that this construction meaning is powerful enough to override verb meaning when the two are in conflict.
The work of Ambridge and colleagues on the joint roles of frequency and semantics in the recovery from argument structure overgeneralizations (e.g., Ambridge, Pine, & Rowland, Reference Ambridge, Pine and Rowland2012) also shows the importance of meaning in the representation of constructions. For instance, the reversative un-VERB construction shows an idiosyncratic set of semantic restrictions: one can unbutton, unroll, or unscrew, but not ungive, unstand, or uncome. Ambridge (Reference Ambridge2013) used a semantic rating study to show that the un-VERB construction exhibits a cluster of fuzzy, overlapping semantic properties such as circular motion, change of state, fitting together, and manipulative action (Whorf, Reference Whorf1956; Li & MacWhinney, Reference Li and MacWhinney1996). The acceptability of the relevant un-form for both adults and children (aged 5–6 and 9–10) in a judgment task was predicted by the extent to which verbs exhibit these properties. We showed similar findings in an elicited production study with even younger children (aged 3–4 and 5–6; Blything, Ambridge, & Lieven, Reference Blything, Ambridge and Lieven2014).
If children are matching meaning to construction form, then a central issue is what they might be able to extract from the relationship between meaning and form in the input. There has been a great deal of interesting work on the mapping between words and meaning in what children hear (see, for instance, Tomasello & Kruger, Reference Tomasello and Kruger1992; Frank & Goodman, Reference Frank and Goodman2014; Smith, Suanda, & Yu, Reference Smith, Suanda and Yu2014), but much less on syntax. Thus, in the word learning literature, there is research on whether the object or event is the focus of attention when it is being talked about and whether this differs for objects and events (Tomasello & Akhtar, Reference Tomasello and Akhtar1995; Frank, Vul, & Saxe, Reference Frank, Vul and Saxe2012; McMurray, Horst, & Samuelson, Reference McMurray, Horst and Samuelson2012). It ought to be possible to do something similar for constructions, and the study by Ibbotson, Lieven, and Tomasello (Reference Ibbotson, Lieven and Tomasello2014) is a preliminary example. This study looked at the relationship between the use of the English progressive and the timing of the event being talked about. Results showed that when the mother used the progressive, this was much more likely to overlap with the event being talked about than when she used other forms of the same verb. While this doesn’t tell us directly about the child’s learning of the progressive, we do know that It’s X-ing and He’s X-ing are among the earliest low-scope schemas that English-speaking children develop (Theakston et al., Reference Theakston, Lieven, Pine and Rowland2005; Lieven, Reference Lieven and Behrens2008).
It is also important to track the changes in the form–meaning mappings of a construction with development, and to relate this to the relationship between meaning and form in the input. Cameron-Faulkner, Lieven, and Theakston (Reference Cameron-Faulkner, Lieven and Theakston2007) studied the development of early multiword negation by analyzing the dense data corpus of one English-speaking boy, Thomas. The results suggested that, while input frequency was important in the development of this child’s negative utterances, the precise relationships were quite complex. Thus Thomas first used only a creative no+X schema, which is ungrammatical if X is a verb (apart from the imperative, e.g., No-running/laughing which did not appear in the input). No, used as a single word, was the most frequent negator in the mother’s speech and had been used extensively by Thomas before he started to produce multiword utterances. The order in which Thomas moved to expressing different types of negation with verbs, first using not+verb in competition with no+verb, and subsequently auxiliaries such as can’t and don’t, was related to the extent to which there was a one-to-one mapping between the form and function in the input. Thus negators which occur frequently in the input within particular functions (e.g., can’t to express INABILITY) emerge earlier than less frequent form–function pairings. This supports the idea suggested by Slobin (Reference Slobin, Ferguson and Slobin1973), Karmiloff-Smith (Reference Karmiloff-Smith1979), and others, that children attempt to map one form to one function, but extends it by tracing how this mapping changes over development as the child’s semantic representations become more complex, and less immediately salient mappings between form and meaning in the input can be extracted.
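The notion of a one-to-one mapping between form and function in the input can be made concrete with a small sketch. This is not the coding scheme or the analysis used in the study; it is a minimal illustration, assuming the input has already been hand-coded as (form, function) pairs. The toy counts below are invented for illustration, and the function labels other than INABILITY are hypothetical. For each negator, the sketch reports its most frequent function and the proportion of its uses that express that function.

```python
from collections import Counter, defaultdict


def mapping_consistency(pairs):
    """For each form, return (dominant function, share of uses with it).

    A share of 1.0 means the form maps to a single function in the
    (hypothetical) coded input; lower values mean the form is spread
    over several functions.
    """
    by_form = defaultdict(Counter)
    for form, function in pairs:
        by_form[form][function] += 1

    result = {}
    for form, functions in by_form.items():
        dominant, count = functions.most_common(1)[0]
        result[form] = (dominant, count / sum(functions.values()))
    return result


# Invented toy input: each pair is one coded caregiver utterance.
coded_input = (
    [("can't", "INABILITY")] * 4
    + [("no", "REJECTION"), ("no", "PROHIBITION"),
       ("no", "INABILITY"), ("no", "REJECTION")]
)

consistency = mapping_consistency(coded_input)

for form, (function, share) in consistency.items():
    print(form, function, share)
```

In this toy input, can't is used only for INABILITY, whereas no is spread over several functions. On the account above, a form with a highly consistent and frequent mapping of this kind would be expected to emerge earlier in its dominant function.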
Children’s understanding and use of the English determiners the and a/an provides an instructive example of developmental change. The study of determiner use has a long history in the field. A study by Maratsos (Reference Maratsos1974) suggested that children understand the uniqueness function of definite reference by the age of 3;0. However, while these two determiners appear very early in children’s multiword speech, it seems that they are used with a limited set of meanings. Thus, understanding the use of determiners with reference to the listener’s knowledge state seems to take much longer (Power & Dal Martello, Reference Power and Dal Martello1986). More recently, evidence suggests that children’s very early determiner use is low-scope and initially centred around constructions such as the + X and a+ X (e.g., I want the X, It’s a Y), without the child necessarily having a determiner ‘category’ (Pine & Martindale, Reference Pine and Martindale1996; Pine, Freudenthal, Krajewski, & Gobet, Reference Pine, Freudenthal, Krajewski and Gobet2013), though there has been intense debate over this claim (Valian, Solt, & Stewart, Reference Valian, Solt and Stewart2009).
In an experimental task, Schmerse, Lieven, and Tomasello (Reference Schmerse, Lieven and Tomasello2014) showed that young participants had partial, but not full, control over the discourse use of these determiners. Two-year-old and three-year-old children and an experimenter shared toy play with a particular object out of a group of three similar objects (e.g., three different pencils). Subsequently, one group of children was asked to fetch der (=the) Bleistift (=pencil) while a second group was asked to fetch ein (=a) Bleistift. The two-year-olds were at chance in the choice of determiner in both groups. However, the three-year-olds in the definite determiner group reliably fetched the previously shared object, showing understanding of one important discourse condition for using a definite determiner. By contrast, the children in the indefinite determiner group did not show a preference for selecting a ‘new’ referent. This may be related to input, since Rozendaal and Baker (Reference Rozendaal and Baker2008) showed, in an analysis of French, German, and English CDS, that children almost never encounter situations in which their caregivers talk about new referents that are known to the adult but not to the child.
As new and more complex constructions are learned we can see this development from low-scope mappings between form and meaning to more abstract constructions that are mapped to a more complex set of meanings being repeated. For instance, when children initially use complex sentences with finite complements, these consist of a limited set of formulaic matrices such as I think X or D’you know Y? (Diessel & Tomasello, Reference Diessel and Tomasello2001; Brandt, Verhagen, Lieven, & Tomasello, Reference Brandt, Verhagen, Lieven and Tomasello2011). The suggestion is that these are not really examples of matrix plus subordinate clause constructions but of more limited low-scope schemas in which the matrix is mapped to meanings which are hedges rather than references to the content of one’s own mind (e.g., I think), or attention getters rather than references to the contents of others’ minds (Do you know … ?). Experiments suggest that this is the case, with children finding it more difficult to produce and understand third person matrices (He thinks X) than first person. In addition, this comprehension is related to their understanding of first and third person theory of mind (Brandt, Buttelmann, Lieven, & Tomasello, Reference Brandt, Buttelmann, Lieven and Tomaselloin press). It is also supported by a diary study of one child’s (Laura; see Braunwald & Brislin, Reference Braunwald, Brislin, Ochs and Schieffelin1979) development of these types of complex sentences (Köymen, Lieven, & Brandt, Reference Köymen, Lieven and Brandt2016). Initially, Laura had difficulty in coordinating both the syntax and the semantics of these complex sentences. More syntactic errors were made in the complex sentences than in simple sentences of the same length, and there were more errors with the verbs of the subordinate clauses than in simple sentences with the same verbs. 
This suggests that errors were arising from the complexities involved in coordinating the syntax of the matrix and the subordinate clause. But coordinating the semantics of the matrix and subordinate clauses was also difficult. When Laura first started to use her I hope X construction, she often intended a hope that something would NOT happen, but the utterances were often produced without the negator, leading to examples like I hope my room go on fire. Another construction had the form I wish X. Initially these often took the form of I wish I want X. It seemed that she had learned some form–meaning mapping for an I wish X construction and simply embedded the much better established I want X construction within it. Over development, the rate of syntactic errors decreased, as did the problems of coordinating the semantics of the matrix and subordinate clause. In addition, new and more flexible matrices began to be used and there was increasing evidence of flexibility with the different components of the constructions, such that matrix clauses started to appear in non-initial position and interjections such as just and now were also more flexibly positioned within the utterances.
8. Conclusion: Where do we go from here?
I have attempted to sketch the outline of a usage-based account of how children learn language, and the ways in which this becomes more flexible, abstract, and complex over development. I have argued, first, that young children show differential and restricted competence in comprehension and production early on; second, that children’s linguistic productivity is tied closely to their linguistic experience, but this interacts with processing capacity, the developing linguistic system, and children’s communicative goals; and, finally, that the development of more abstract grammar is protracted, and that differing levels of abstraction will give the ability to do different tasks. Thus, children build up a network of constructions in which, during development, form–meaning mappings become more inter-connected along more dimensions (e.g., pragmatic, form-based, sound-based). In the early stages, children use various heuristics in attempting to interpret the speech addressed to them, and in production they rely on memory and current schemas to get things said. The fact that, during development, links between constructions grow on the basis of more and more features explains some of the differences between studies in terms of age of success or the differing effects of frequency. For instance, the route through the network which results in the comprehension or production of an utterance can depend on the precise details of the context, as well as on the current state of the child’s representations, and the relationships between them. These will give rise to individual differences and to results that indicate emergent structures under some conditions and not under others. And it will depend crucially on what the child is being asked to do.
So, what are the most important lines of research that would test the assumptions of the usage-based approach and further develop its theoretical and empirical foundations? First, we clearly need a much more detailed understanding of the development of form–meaning mappings. More research is needed on the relationship between children’s preverbal cognitive representations of the world and how these map onto, or are changed by, learning language. We know from Fernald and Hurtado (Reference Fernald and Hurtado2006) that infants are faster to look at a known object if the name appears in a carrier phrase similar to that of the low-scope constructions that our research has identified. We also know from the work of Bowerman and Choi (Reference Bowerman, Choi, Gentner and Goldin-Meadow2003) that, from the outset of early language, children are reflecting the form–meaning mappings of the language they are learning. Putting these two areas of research together to identify the factors involved in the development of meaning is an important task (see, for instance, Göksun, Hirsh-Pasek, & Golinkoff, Reference Göksun, Hirsh-Pasek and Golinkoff2010).
Second, a major focus within the UB approach is how much of early language learning can be explained by general cognitive processes (e.g., working memory, processing speed, the development of prototypes) rather than by any syntactically dedicated (innate) factors. Goldberg (Reference Goldberg2006) argues this forcefully, but we need to subject this assumption to much closer empirical analysis. To give two examples: there has been a strong emphasis in our work on children’s abstraction of construction prototypes (Dittmar et al., Reference Dittmar, Abbot-Smith, Lieven and Tomasello2008a; Chan et al., Reference Chan, Lieven and Tomasello2009), and some suggestion that these work in the same way as other cognitive prototypes (Ibbotson, Theakston, Lieven, & Tomasello, Reference Ibbotson, Theakston, Lieven and Tomasello2012), but is this merely an analogy or are the formation and development of these prototypes governed by the same cognitive principles as non-linguistic prototypes? There is also increasing interest in how memory mechanisms might be able to account for long-distance dependencies, but again we need much more precise specification of exactly how this would work and how it would account for developmental results (Christiansen & Chater, Reference Christiansen and Chater2015).
Third, the nature of the emergent links between constructions, and the factors that influence this, are a crucial area for further research. There are a number of theories in cognitive linguistics that provide very useful accounts of these links (see, for instance, Goldberg, Reference Goldberg1995, on caused motion constructions, Goldberg & Jackendoff, Reference Goldberg and Jackendoff2004, on resultative constructions, and Verhagen, Reference Verhagen2005, on complementation), but we have no idea if these bear any relationship to how children’s networks of form–meaning mappings actually develop. Experiments may be useful here, but trying to control for the range of potential ways in which these links could develop will be extremely difficult. The control provided by computational modelling may make this more tractable. Chang, Dell, and Bock’s (Reference Chang, Dell and Bock2006) model, which learns links between event semantics and syntactic frames, shows ‘comprehension’ of causative transitives before ‘production’. This provides a proof of concept for the idea that different tasks can elicit different performance on the basis of the same learning history. It is also possible to trace the development of links by analyzing vertical sequences in discourse. For instance, in my study of auxiliary development (Lieven, Reference Lieven and Behrens2008), I showed that it was only towards the age of 3;0 that children started to be able to manipulate different forms of the same auxiliary in discourse settings, for instance by negating an auxiliary produced by the mother in the just prior utterance, or using a different form of the same auxiliary in a question. I suggested that this indicated that the children were developing links between previous low-scope pronoun–auxiliary frames, leading to a richer network of linked auxiliary representations.
Fourth, the UB approach also needs to be tested by research on languages very different from English, in which more grammatical work is carried by inflectional morphology and less by word order (Stoll, Reference Stoll and Baerman2015). In principle, it is possible to see that ‘big words’ and low-scope schemas could be equivalent to amalgams of stems and their surrounding morphology in early learning, but this has yet to be shown in any detail (though see Krajewski & Lieven, Reference Krajewski, Lieven, MacWhinney, Malchukov and Moravcsik2014). A crucial test for the UB approach is to be able to explain the acquisition of languages very different to English and the small number of other languages for which there has been acquisition research (Lieven & Stoll, Reference Lieven, Stoll and Bornstein2009). A language like Chintang (a Tibeto-Burman language of East Nepal) with 1,800 verb forms presents very different challenges to the language-learning child (Stoll & Bickel, Reference Stoll, Bickel, Bavin and Stoll2013). These challenges have to be met by children learning these languages using the same cognitive tools as those outlined in this paper. To understand how they do this we need to collect naturalistic corpora from non-Indo-European languages and to attempt to identify features of distribution that could support learning. Can the trajectory of children’s development in these languages be explained in the terms I have outlined? When we have the data, modelling is going to be indispensable, not only for so-called ‘exotic’ languages where experimentation is almost impossible, but also to explore the influence of the relative weights of semantics, type and token frequencies, and phonological and other neighbourhoods as development proceeds.
The usage-based approach to children’s language learning has its foundations in the work of Brown (Reference Brown1973), Slobin (Reference Slobin, Ferguson and Slobin1973), Braine (Reference Braine1976), Bruner (Reference Bruner1983), and many others. More recently, it has developed as a sustained empirical challenge to the idea of a pre-given, syntactic module. A great deal of empirical evidence has shown: (1) the strong relationships between the language that children hear and the course of their language development; and (2) that children’s language builds up from low-scope patterns and heuristics to an increasingly schematic and abstract network of constructions. To build a comprehensive and psychologically realistic account of children’s language development we now need to concentrate on identifying the processing mechanisms that are involved; to seriously address the relationship between meaning and form; to account for individual differences in learning; and to extend our research to languages that provide specific challenges to the present state of our theories.