The ubiquity of frequency effects in first language acquisition

This review article presents evidence for the claim that frequency effects are pervasive in children's first language acquisition, and hence constitute a phenomenon that any successful account must explain. The article is organized around four key domains of research: children's acquisition of single words, inflectional morphology, simple syntactic constructions, and more advanced constructions. In presenting this evidence, we develop five theses. (i) There exist different types of frequency effect, from effects at the level of concrete lexical strings to effects at the level of abstract cues to thematic-role assignment, as well as effects of both token and type, and absolute and relative, frequency. High-frequency forms are (ii) early acquired and (iii) prevent errors in contexts where they are the target, but also (iv) cause errors in contexts in which a competing lower-frequency form is the target. (v) Frequency effects interact with other factors (e.g. serial position, utterance length), and the patterning of these interactions is generally informative with regard to the nature of the learning mechanism. We conclude by arguing that any successful account of language acquisition, from whatever theoretical standpoint, must be frequency sensitive to the extent that it can explain the effects documented in this review, and outline some types of account that do and do not meet this criterion.

INTRODUCTION
Frequency effects are ubiquitous in virtually every domain of human cognition and behaviour, from the perception of facial attractiveness (Grammer & Thornhill, ) and the processing of musical structure (Temperley, ) to language change (Bybee, ) and adult sentence processing (Ellis, ). Our goal in this target article is to argue that frequency effects are ubiquitous also in children's first language acquisition, and to summarize the different types of frequency effect that are observed across all of its subdomains. We argue, very simply, that frequency effects constitute a phenomenon for which any successful theory must account. Such a theory might be a generativist/nativist account, under which children have innate knowledge of abstract categories, but are sensitive to the frequency with which exemplars of these categories are present in the input (e.g. see Yang, , for a review). It could equally be a constructivist/ usage-based account, under which children build up abstract constructions on the basis of the input, with the aid of little or no innate linguistic knowledge (e.g. Tomasello, ). Regardless of whatever other theoretical assumptions are made, any successful account of language acquisition will need to incorporate frequency-sensitive learning mechanisms.
It is important, at the outset, to clarify our claim. We do not argue that sensitivity to input frequency must be the defining feature, or even the most important feature, of a successful account of acquisition (i.e. we do not argue for a frequency-DRIVEN or frequency-BASED mechanism). It is not difficult to think of factors that are more important than input frequency in at least some scenarios. For example, if we consider the straightforward token frequency of lexical items, there is every reason to believe that children will make more effort to store low-frequency input strings that can be used to obtain desired objects (e.g. cake) than higher-frequency strings that cannot (e.g. the). We argue, instead, for a learning mechanism that is minimally frequency SENSITIVE, under which input frequency need not be the chief determinant of acquisition in all cases.
It is also important to make clear that a frequency-sensitive learning mechanism need not (and most probably does not) entail a mechanism that "computes and matches the frequency of various elements in the input" or acquires "knowledge of frequency" (Bohnacker, , pp. -; see Ambridge, , for discussion). Frequency in this sense (i.e. token frequency) need not be represented per se, but may be instantiated in the strength of representations or neural connections in exactly the same way that explicit and implicit memory for stimuli of all types is boosted by repetition. Similarly, type frequency information may be represented only indirectly, instantiated in the similarity structure of stored exemplars.
Thus far, our claim is relatively uncontroversial: few would disagree that at least some domains of language acquisition show frequency effects at some level (though see Roeper, ). But our claim is much broader: we propose that frequency effects are ubiquitous in every domain of child language acquisition and that any apparent null finding simply reflects a failure to conceptualize frequency appropriately, to find a sufficiently sensitive dependent measure, or to hold constant other relevant factors.
We illustrate this claim with evidence from four core domains: the acquisition of SINGLE WORDS, INFLECTIONAL MORPHOLOGY, SIMPLE SYNTACTIC CONSTRUCTIONS, and MORE ADVANCED CONSTRUCTIONS. Within these sections, our overarching claim takes the form of five inter-related theses:
1. Levels and Kinds Thesis. Frequency effects exist at all levels and are of many different kinds. They are observed not only at the level of CONCRETE LEXICAL STRINGS (perhaps the prototypical frequency effect), but also at the level of ABSTRACT CATEGORIES (e.g. particular orderings of SUBJECT and OBJECT) and cues (e.g. animacy, givenness). There are TOKEN FREQUENCY effects (e.g. at the level of the word, the more often you hear a word, the more likely you are to learn it) and TYPE FREQUENCY effects (e.g. at the level of inflectional morphology, the more verbs you hear with a particular inflectional ending, the more likely you are to learn that ending). There are effects of ABSOLUTE FREQUENCY (e.g. high-frequency words will be learned earlier than low-frequency words) and RELATIVE FREQUENCY (e.g. of two competing forms, the more frequent will be dominant).
2. Age of Acquisition (AoA) Thesis. All other things being equal, frequent forms will be acquired before less-frequent forms. As we will see in more detail, since all other things are rarely, if ever, equal, this claim does not entail a one-to-one relationship between frequency and age of acquisition (and neither is the definition of 'acquisition' straightforward).
3. Prevent Error Thesis. High-frequency forms prevent (or at least reduce) errors in contexts in which they are the target. For example, we will see that third person singular verb forms, almost always the most frequent in the input, are invariably produced correctly in third person singular contexts.
4. Cause Error Thesis. Conversely, high-frequency forms also cause error in contexts in which a competing, related lower-frequency form is the target. For example, we will see that high-frequency third person singular verb forms are often used inappropriately in third person plural contexts.
5. Interaction Thesis. Finally, we propose that frequency effects will interact with other effects. One example is utterance position: high-frequency verbs are generally learned before lower-frequency verbs (a main effect of verb frequency), and this effect is boosted for verbs that occur frequently in utterance-final position (an interaction of verb frequency by utterance position). The downside of these interactions is that they can make frequency effects difficult to detect. The upside is that these interactions are generally informative with regard to the other factors that we need to build into the learning mechanism (e.g. sensitivity to utterance position or temporal ordering).
The remainder of this article synthesizes the considerable empirical support that exists for each of our theses across four domains: single words, inflectional morphology, simple syntactic constructions, and more advanced constructions. This strategy inevitably entails a degree of repetition and overlap, for which we make no apology. The point is that the frequency effects captured by these five theses do not rely on cherry-picking particular domains or debates, but are ubiquitous across first language acquisition.
At this point, we should also clarify that whenever we refer to frequency in this article, we mean INPUT frequency. It is likely that children also show effects of output frequency (e.g. better performance with strings that they produce more often). However, we do not discuss such effects, as, other than in the domain of phonology (e.g. DePaolis, Vihman & Keren-Portnoy, ), few studies have attempted to dissociate effects of input and output frequency. Indeed, this will often prove to be rather difficult, given that the frequency distributions of utterances produced by children and their caregivers are generally extremely similar.
This section presents evidence for perhaps our two most straightforward theses: that, all else being equal, frequent forms are (a) acquired earlier than less frequent ones (AoA Thesis) and (b) associated with lower rates of error, and higher rates of correct use (Prevent Error Thesis). The findings discussed also constitute evidence for our Interaction Thesis.
Similar frequency effects are apparent in children's acquisition (our AoA Thesis). As a rule, children learn frequent words before infrequent ones: American English-speaking children's most common first words in production are (in order) Daddy, Mommy, bye, hi, uh-oh, dog, no, ball, baby, and book (Fenson et al., ), not, for example, coffee and computer (words that children certainly hear, just less frequently).
However, there is an important caveat to be made here, one that has sometimes been misunderstood. Our claim is not that frequency is the only predictor, but that frequent words are learned before infrequent ones, ALL OTHER THINGS BEING EQUAL. Thus, we do not predict that there will be a one-to-one relationship between frequency and age of acquisition (which is just as well, since children's first word is rarely the). There are many other factors that influence acquisition: a word is more likely to be early learned if it is, inter alia, relevant to the child's communicative goals (Ninio, ), associated with an easily identifiable referent (Gentner, ), imageable (Bird, Franklin & Howard, ), aligned with prosodic boundaries (Christophe & Dupoux, ), easy to segment from the continuous speech stream (Monaghan & Christiansen, ), easy to say (Vihman & Vihman, ), and attested in a wide range of contexts (Naigles & Hoff-Ginsberg, ; Küntay & Slobin, ). Our prediction, thus, is that, in a regression analysis, input frequency will make a significant unique contribution to the variance of the outcome measure (in this case, age of acquisition), even when all of these other factors are included in the model. Although few, if any, studies have controlled for ALL of these factors, this prediction is, in general, very well supported. For example, independent effects of input frequency on age of acquisition have been found looking across verbs (Naigles & Hoff-Ginsberg, ; Smiley & Huttenlocher, ; Theakston, Lieven, Pine & Rowland, ), adjectives (Blackwell, ), and nouns and function words (Goodman, Dale & Li, ).
Turning now to our Prevent Error Thesis, the domain of single-word acquisition provides ample evidence that high-frequency forms are associated with lower rates of error, and higher rates of correct production and comprehension, than lower-frequency forms. The most direct evidence comes from studies in which word frequency is manipulated experimentally, which allow researchers to control for confounding factors using counterbalancing procedures. For example, Schwartz and Terrell () taught one- to three-year-old children either four novel nouns or four novel verbs. Each individual word+object/action pair was presented with high frequency (a total of  presentations) for half of the children and low frequency ( presentations) for the remainder. Thus their finding that the high-frequency words were correctly recalled significantly more often than low-frequency words (a finding that held for both nouns and verbs) cannot realistically be attributed to any factor other than input frequency (for similar studies with L2 learners and children with SLI, see ).
At the same time, while it is useful to be able to control factors such as imageability, prosody, and utterance position experimentally, our Interaction Thesis holds that interactions between frequency and one or more of these other effects are informative with regard to the nature of the language learning mechanism. A detailed analysis of all of these potential interactions is beyond the scope of the present article. However, two findings are relevant as an illustration of the informative nature of interactions between frequency and a second factor, here utterance position and utterance length. In their study of verb acquisition, Naigles and Hoff-Ginsberg () found that, in addition to overall input frequency, input frequency in utterance-final position was a significant predictor of age of acquisition.
Relatedly, Brent and Siskind () found that age of acquisition was best predicted not by a word's overall input frequency, but by the frequency with which it appeared as the sole constituent of an utterance.
Consequently, interactions with other factors are not merely a source of noise that must be eliminated in order to observe frequency effects or that can be appealed to in order to explain away null findings. Rather, these interactions can constrain our theories, by informing us about the nature of the learning mechanism. For example, the finding of an interaction between frequency and utterance position (e.g. Naigles & Hoff-Ginsberg, ) suggests that we need to posit a learning mechanism that is sensitive to temporal order, rather than, for example, a mechanism that processes entire input sequences one batch at a time. Thus, our Interaction Thesis allows us to make general predictions about the learning mechanism that can be tested in other domains (e.g. morphosyntax; Freudenthal, Pine, Aguado-Orea & Gobet, ), and perhaps even non-linguistic domains such as memory for musical notes or sequences (e.g. Berz, ).
In this section we consider children's acquisition of morphologically inflected forms (mainly verbs, but also nouns), and the evidence that this domain provides for three of our theses. The first is that high-frequency forms (in this case surface strings) are associated with lower rates of error, and higher rates of correct use (Prevent Error Thesis). The second is that high-frequency forms can cause errors when used in inappropriate contexts, which, in this domain, essentially means inappropriate person/number contexts (Cause Error Thesis). The third is that there are different types of frequency effect (Levels & Kinds Thesis); the specific kinds of error contrasted here being (a) relative versus absolute and (b) type versus token frequency effects.
Many early investigations concluded that no effect of input frequency could be observed in the domain of the acquisition of inflectional morphology. For example, looking across fourteen different morphemes, Brown () found no correlation between input frequency and age of acquisition, whether looking at individual child-caregiver dyads or across the whole group (see also Newport, Gleitman & Gleitman, ; Gleitman & Wanner, ; De Villiers, ; though see Moerk, , for a reanalysis of Brown's data that did yield frequency effects, and Moerk, , and Pinker, , for further discussion).
The problem with this study, however, is the use of age of 'acquisition' (which usually entails first production) in naturalistic speech as the dependent measure. This measure is problematic because children are motivated to talk about certain topics at the expense of others, and thus have little occasion to produce certain inflected forms, even if they know them well. For example, despite their high frequency in the input, children rarely produce second person singular forms. Raw production data simply cannot tell us whether children (a) have failed to learn these forms despite their high frequency or (b) have learned these forms, but find little use for them (e.g. young children are not interested in talking about what their listener is doing).
One solution is to use as our dependent measure not the age at which a particular form is first produced or the raw frequency of these forms in the child's speech but the PROPORTION of correct versus incorrect uses in obligatory contexts. Because this is a proportional measure, it controls for the confound that, for example, first person singular contexts far outnumber third person singular contexts in children's speech. Thus, a better way of examining frequency effects is to test the prediction that the higher the frequency of the individual word form (i.e. the inflected, realized form, as opposed to the lemma), the higher the rate (i.e. proportion) of correct use, and the lower the rate of errors, whether errors of commission or omission (our Prevent Error Thesis).
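The proportional measure described above can be illustrated with a minimal sketch. The coded observations below are invented purely for illustration (they are not data from any study cited here), and the function name `correct_rate_by_context` is hypothetical.

```python
from collections import Counter

# Invented coded observations: (obligatory context, was the form correct?).
# These are illustrative values only, not data from any cited study.
observations = [
    ("1sg", True), ("1sg", True), ("1sg", True), ("1sg", False),
    ("3sg", True), ("3sg", True),
    ("3pl", False), ("3pl", True),
]


def correct_rate_by_context(coded):
    """Return {context: proportion of obligatory contexts filled correctly}."""
    total = Counter()
    correct = Counter()
    for context, is_correct in coded:
        total[context] += 1
        if is_correct:
            correct[context] += 1
    return {context: correct[context] / total[context] for context in total}


rates = correct_rate_by_context(observations)
for context, rate in rates.items():
    print(context, rate)
```

Because each rate is divided by the number of obligatory contexts for that form, a context that the child rarely talks about can still be compared directly with one that is very common in the child's speech.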
When this prediction is tested, clear effects of input frequency are found, in both naturalistic (e.g. Theakston Dabrowska and Szczerbinski () found a correlation between the input frequency of genitive, dative, and accusative Polish noun case-marking inflections, and children's correct performance with novel noun inflection. These frequency effects are not merely an artefact caused by children's memory or processing difficulties. In adult studies of production latency, differences are found between more and less frequent forms of the same lemma (e.g. playing vs. plays; Jescheniak & Levelt, ). Again, though, it is important to bear in mind that, consistent with our Interaction Thesis, frequency interacts with other factors, including serial position (e.g. Freudenthal et al., ; Gagarina, ; Freudenthal, Pine & Gobet, ) and the form most recently produced by an interlocutor (e.g. Krajewski, Theakston, Lieven & Tomasello, ).
A number of findings from this domain illustrate another of our theses: high-frequency forms not only PREVENT errors in contexts where they are the target, but also CAUSE errors in contexts where a lower-frequency form is the target. For example, in a naturalistic study of child Spanish, Aguado-Orea () found high error rates for third person plural target forms (which are very rare in the input), almost all of which involved the substitution of much more frequent third person singular forms (see also Räsänen, Ambridge & Pine, unpublished observations, for Finnish). Similar findings were reported by Dabrowska () for case-marking errors, Theakston and Rowland () for auxiliary is-for-are errors, and Cameron-Faulkner and Kidd () for are-for-am errors (e.g. *I are playing).
Turning now to our Levels and Kinds Thesis, the domain of inflectional morphology also provides a useful illustration of the difference between the effects of TOKEN and TYPE frequency. Token frequency is simply the number of times that a particular string (e.g. Mummy) occurs in the child's input. Type frequency is the number of different items that follow a particular morphosyntactic pattern. Precisely what is meant by the term 'following a particular pattern' varies from domain to domain, but a reasonably straightforward case occurs in the English past tense system (e.g. Bybee & Slobin, ; Bybee & Moder, ). For example, the ow→ew pattern has a high type frequency because many verbs form their past tense in this way (e.g. blow/blew, know/knew, grow/grew, throw/threw), whilst the pattern exemplified by make/made has a very low type frequency (probably a type frequency of ).
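The distinction between token and type frequency can be made concrete with a small sketch. The toy input below is invented for illustration: the counts are not estimates of real child-directed speech, and the pattern labels are hypothetical tags assigned by hand.

```python
from collections import Counter

# Invented toy input: (present stem, past form, hand-assigned pattern label).
# Counts are illustrative only, not estimates from any corpus.
toy_input = [
    ("blow", "blew", "ow->ew"), ("know", "knew", "ow->ew"),
    ("know", "knew", "ow->ew"), ("grow", "grew", "ow->ew"),
    ("throw", "threw", "ow->ew"),
    ("make", "made", "ake->ade"), ("make", "made", "ake->ade"),
    ("make", "made", "ake->ade"), ("make", "made", "ake->ade"),
]

# Token frequency: how often each surface string occurs in the input.
token_freq = Counter(past for _, past, _ in toy_input)

# Type frequency: how many DIFFERENT verbs follow each pattern.
verbs_by_pattern = {}
for stem, _, pattern in toy_input:
    verbs_by_pattern.setdefault(pattern, set()).add(stem)
type_freq = {pattern: len(stems) for pattern, stems in verbs_by_pattern.items()}

print(token_freq["made"], type_freq["ake->ade"])
print(token_freq["blew"], type_freq["ow->ew"])
```

In this toy input, made is the most frequent individual string, yet its pattern is attested by only one verb, whereas the blow/blew pattern is attested by four different verbs, each of which is individually less frequent. The two measures can therefore dissociate.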
There is some evidence to suggest that patterns with high TYPE frequency are more productive (i.e. more open to newcomers), though it is often difficult, when considering morphological systems, to separate the effect of type frequency from phonological heterogeneity ( ). However, there is also evidence to suggest that inflected forms with very high TOKEN frequency (e.g. said) constitute unanalyzed frozen phrases, and so do not contribute to analogical generalization at all (e.g. the existence of say→said does not lead children to produce errors such as play→*pled or obey→*obed); see Baayen and Lieber (), Bybee (), and Wang and Derwing ().
The domain of inflectional morphology, in particular, English verb past tense and noun plural marking, also illustrates a further contrast within our Levels and Kinds Thesis: ABSOLUTE vs. RELATIVE frequency. With regard to absolute frequency, this domain illustrates the common finding that the more frequent the irregular form (in absolute terms), the more likely children are to produce this form, as opposed to an error (also relevant to our Prevent Error Thesis). For example, the high-frequency irregulars blew and feet are less likely to be over-regularized (e.g. *blowed, *foots) than the low-frequency irregulars drank and shelves (e.g. *drinked and *shelfs) (Marchman, ; Marchman, Wulfeck & Weismer, ; Maslen, Theakston, Lieven & Tomasello, ).
With regard to relative frequency, errors are particularly common when the target form is infrequent RELATIVE TO A HIGH-FREQUENCY COMPETITOR FORM (e.g. a 'zero-marked' form, as in Yesterday I wanted/*want an ice-cream). For example, focusing on zero-marking errors in the domain of noun plural marking, Matthews and Theakston () found that children often produced *two mouse, because the target (mice) is less frequent in the input than the competitor (mouse), but rarely produced *two foot, because the target (feet) is more common in the input than the competitor (foot).
The implication of our Levels and Kinds Thesis is that we need an account that incorporates different types of frequency effect: both ABSOLUTE frequency (e.g. to explain why Mummy is learned before coffee or why feet resists overgeneralization better than does shelves) and RELATIVE frequency (e.g. to explain why children substitute low-frequency third person plural verb forms with erroneous high-frequency third person singular forms of the same verb, or mice with mouse, but not feet with foot). This does not necessarily entail positing that children must 'decide' whether to pay attention to absolute or relative frequency in a particular domain (which is just as well, since such a position would be untenable). Children are clearly sensitive to both relative and absolute frequency; the challenge is to posit a learning mechanism that yields effects at both of these levels.
One example is the learning model of Rescorla and Wagner (). In this model, the assumption is that a meaning or entity (e.g. MUMMY) has only a certain amount of associative strength to give out. If this entity is paired with one label (e.g. Mummy), this associative strength does not need to be shared: every pairing of MUMMY and Mummy strengthens the association between the two. If an entity (e.g. MOUSE) is paired with two labels (e.g. Mouse, Mice), its associative strength is shared between the two: every pairing of MOUSE and Mouse strengthens the link between MOUSE and Mouse at the expense of the link between MOUSE and Mice, and vice versa (Ramscar, Dye & McCauley, ; see Legate & Yang, , for a version of this account in the domain of Optional Infinitive errors). Regardless of the merits or otherwise of an associative account of word learning, the point is simply that a learning mechanism can yield effects of both absolute and relative frequency, without it somehow having to 'decide' which to use in each domain.
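The way in which a single error-driven mechanism can yield both kinds of effect can be sketched in a few lines. This is a deliberately simplified illustration and not the implementation used in any of the studies cited: it assumes a single meaning acting as the only cue, a single learning-rate parameter (`rate`, an assumed value of 0.1), and invented input sequences. On every exposure, the link to the label that was heard moves towards 1 and the links to competing labels move towards 0.

```python
def train(trials, labels, rate=0.1):
    """Error-driven updating of meaning-to-label links for ONE meaning.

    trials: sequence of labels heard with the meaning (invented input).
    labels: all labels competing for that meaning.
    rate:   assumed learning-rate parameter (illustrative value).
    """
    strength = {label: 0.0 for label in labels}
    for heard in trials:
        for label in labels:
            target = 1.0 if label == heard else 0.0
            # Delta rule: move the link towards its target on this trial.
            strength[label] += rate * (target - strength[label])
    return strength


# Absolute frequency: one label per meaning, differing only in exposure.
mummy = train(["Mummy"] * 50, ["Mummy"])
coffee = train(["coffee"] * 5, ["coffee"])

# Relative frequency: two labels compete for one meaning (4:1 exposure).
mouse = train((["mouse"] * 4 + ["mice"]) * 10, ["mouse", "mice"])
feet = train((["feet"] * 4 + ["foot"]) * 10, ["feet", "foot"])

print(mummy["Mummy"] > coffee["coffee"])
print(mouse["mouse"] > mouse["mice"], feet["feet"] > feet["foot"])
```

Under these assumptions, the more frequently heard uncontested label ends up with the stronger link (an absolute frequency effect), and, of two competing labels, the less frequent one is weakened by every exposure to its competitor (a relative frequency effect), without the learner needing to select between the two kinds of frequency.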
The moral here is that a sophisticated consideration of different possible types of frequency effect (Levels and Kinds Thesis) allows us to constrain theory building in a way that simplistic correlations between the input and output frequency of particular strings cannot. The need to account for effects of both absolute and relative frequency forces us to posit particular types of acquisition model that we may not otherwise have considered; specifically those that build in some form of competition between words with similar meanings and/or surface forms (MacWhinney, ). Thus a 'frequency effect' can never be an explanation or answer in its own right. Rather, it poses a question: What type of learning mechanism is needed to yield the PARTICULAR TYPES of frequency effect observed?
This section discusses frequency effects at the levels of multiword strings and grammatical (i.e. sentence-level) constructions. This domain is useful in particular for illustrating our claim that there exist many different types of frequency effect (Levels and Kinds Thesis), as well as providing evidence for our Prevent Error, Cause Error, and AoA Theses.

Multiword strings
The first type of frequency effect is one that we have discussed already: frequently occurring strings prevent or reduce errors (Prevent Error). This is true not only of single words (including inflected forms) but also of multiword strings. Bannard and Matthews () found that children are better able to repeat four-word sequences found frequently in child-directed speech (CDS) than less-frequent four-word sequences, even when the frequency of the individual items and bigrams was carefully controlled (e.g. comparing a cup of tea with a cup of milk). Similar findings were observed by Matthews and Bannard (), Arnon and Snider (), and Arnon and Clark (; see also Conklin & Schmitt, , for an overview of such effects in adults). In a different context, a number of studies (Mintz, ; Chemla, Mintz, Bernal, and Christophe, ; Weisleder & Waxman, ; but see Erkelens, ; Stumper, Bannard, Lieven & Tomasello, ) have demonstrated that children are also sensitive to frequent frames: "ordered pairs of words that frequently co-occur with exactly one word position intervening (occupied by any word)" (Mintz, , p. ).
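The definition of a frequent frame quoted above lends itself to a short illustration. The sketch below counts every ordered pair of words separated by exactly one intervening word in a handful of invented utterances. The utterances and the function name `count_frames` are hypothetical, and the sketch makes no claim to reproduce the procedure of any of the studies cited.

```python
from collections import Counter


def count_frames(utterances):
    """Count (A, B) frames: A and B with exactly one word between them."""
    frames = Counter()
    for utterance in utterances:
        words = utterance.lower().split()
        # Slide a three-word window; the middle word is the open slot.
        for first, _, third in zip(words, words[1:], words[2:]):
            frames[(first, third)] += 1
    return frames


# Invented utterances, for illustration only.
toy_cds = [
    "you want it",
    "you like it",
    "you see it",
    "the dog is here",
    "the cat is here",
]

frames = count_frames(toy_cds)
print(frames.most_common(2))
```

In this toy sample the frame you_it occurs three times, each time with a different verb in the intervening position, which is the kind of distributional regularity that frequent frames are claimed to make available to the learner.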
The second type of frequency effect is also one that we have encountered previously: high-frequency strings not only prevent error when used correctly, but seem to cause errors when used incorrectly (Cause Error Thesis). For example, in a study of early negation, Cameron-Faulkner, Lieven, and Theakston () reported that early verbal negation was largely ungrammatical (e.g. no move, no drop it), and therefore reflected creative use on the part of the child (multiword utterances containing the negator no were very rare in the caregiver's speech). However, they argued that these early errors were in fact frequency driven: the child was using the most frequent, functionally generic, and salient single word negator in the input overall (no), which he creatively combined with verbs, resulting in a no+VERB frame. Later in development this made way for a shift towards the use of not+VERB (e.g. not going there, not open the lid), which they argued was due to the high frequency of not in multiword utterances in the input, although not necessarily in combination with verbs. Finally, the child shifted towards the use of auxiliary forms (e.g. Don't sit down here, I can't talk), but this shift was function-dependent (e.g. prohibition, inability) and was closely tied to the frequency of particular AUX+neg forms (e.g. don't, can't) to express particular functions in the input.
These complex effects encompassing frequency of both surface forms and communicative functions pose a challenge for researchers. We currently lack a good understanding of whether and how frequency effects change over the course of development, as a consequence of children's increasing semantic and pragmatic knowledge. Computational models provide one means of investigating how far it is possible to get with relatively simple surface-form learning, provided that the model is sensitive to frequency in an appropriate way (e.g. Freudenthal et al., ). Incorporating semantic and/or pragmatic coding into these kinds of model (e.g. Chang, Dell & Bock, ) would allow researchers to determine what additional benefit this kind of frequency information provides to the learning mechanism, and how closely the corresponding output matches children's language at different stages in development.

Simple syntactic constructions
In the domain of simple grammatical constructions, we see effects of frequency at a variety of levels and of different kinds; frequency of (a) individual verbs, (b) verb+argument/construction combinations, and (c) abstract cues to word order (Levels and Kinds Thesis). For example, with regard to verb+argument combinations, the order in which children acquire verbs within the transitive and intransitive constructions is predicted by both the overall frequency of the verbs and the frequency of those verbs in those same constructions in the input (Ninio, ; Theakston, Lieven, Pine & Rowland, ), consistent with our AoA Thesis. Focusing on arguments, children's use of grammatical objects with verbs that can occur both transitively and intransitively mirrors the relative use of the two constructions with those same verbs in the input (Theakston, Lieven, Pine & Rowland, ). Similar findings are observed in so-called weird-word order studies (e.g. Akhtar, ; Abbot-Smith, Lieven & Tomasello, ; Matthews, Lieven, Theakston & Tomasello, , ), in which children follow an experimenter's ungrammatical word order for low-frequency and novel verbs (e.g. Fox bear rammed, Elmo the car gopping), but correct the use of a high-frequency verb to the word order in which it has frequently been attested in the input (e.g. ).
Continuing our illustration of the Levels and Kinds Thesis, there is evidence that children are sensitive not only to the frequency of particular verb+argument and verb+construction combinations, but also to the frequency of more abstract cues to word order (possibly at different developmental stages).
In particular, investigations of children's developing sensitivity to cues such as word order, case marking, and animacy, in their interpretation of the simple transitive NVN construction, typically show that young children are better able to interpret sentences in which multiple cues indicate the same sentence interpretation than those in which only a single cue operates in isolation or cues conflict. This finding, which has been replicated across a number of languages, reflects the higher frequency of sentences with multiple supporting cues in the input ( , for counter-arguments, and Goldberg, , for a critique of their approach). Later in development, however, children start to grasp the significance of individual, often rather infrequent, cues (e.g. the need to prioritise case marking over word order in German, reflecting a shift from the influence of highly frequent SVO word order, to less-frequent but highly reliable case marking; Dittmar et al., ).
Further illustrating our Levels and Kinds Thesis, the domain of the acquisition of simple constructions exhibits a particularly interesting and well-studied interaction between type and token frequency. Several studies (Goldberg, Casenhiser & Sethuraman, ; Casenhiser & Goldberg, ; Goldberg, Casenhiser & White, ) have found that children show an advantage for learning the meanings of 'skewed' constructions where one or two types constitute the lion's share of all constructional tokens, as compared to 'balanced' constructions where the tokens are divided more evenly amongst the types. The picture has been complicated by the fact that some studies have found no advantage for either type of distribution (Year & Gordon, ), or even an advantage for a more balanced distribution (Siebenborn, Krajewski & Lieven, unpublished observations; see Johnson & Goldberg, unpublished observations, for discussion: online <http://www.princeton.edu/~adele/Princeton_Construction_Site/Publications_files/SkewedInput.pdf>). Whatever the overall pattern, for our present purposes, the important point is that, again, we see a case where careful examination of the different TYPES of frequency effect observed constrains theory development by forcing us to build models that can yield these complex effects; effects that would have been missed entirely by an approach that focused solely on the relationship between the input and output frequency of particular tokens.
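The contrast between 'skewed' and 'balanced' distributions can be made concrete with a minimal sketch. It counts types and tokens for a construction and reports the share of tokens taken by the single most frequent type. The verb lists are invented for illustration and are not the stimuli or data of any study cited here; `top_type_share` is a hypothetical helper name.

```python
from collections import Counter


def top_type_share(tokens):
    """Return (n_types, n_tokens, share of tokens taken by the most frequent type)."""
    counts = Counter(tokens)
    n_tokens = sum(counts.values())
    top_count = counts.most_common(1)[0][1]
    return len(counts), n_tokens, top_count / n_tokens


# Invented toy inputs: both have 5 verb types and 16 tokens in total.
skewed = ["go"] * 8 + ["run"] * 2 + ["fall"] * 2 + ["roll"] * 2 + ["jump"] * 2
balanced = ["go"] * 4 + ["run"] * 3 + ["fall"] * 3 + ["roll"] * 3 + ["jump"] * 3

skewed_summary = top_type_share(skewed)
balanced_summary = top_type_share(balanced)
```

Because the two toy inputs have identical type and token counts, a measure based only on overall token frequency would treat them alike; only a measure of how tokens are spread across types tells them apart.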
Although we have focused in this domain on our Levels and Kinds Thesis, this is not to say that our other theses are not supported here. Work on the development of simple grammatical constructions also illustrates our Cause Error Thesis. Theakston () found that, when producing simple transitive sentences with a discourse-new subject, children as old as five years often produced an underinformative pronoun subject (e.g. He rather than The cat). That is, children seemed to overgeneralize a particularly frequent transitive sentence subject, He (or perhaps even its 'givenness' property) into an inappropriate context (one in which the subject is discourse-new). With regard to the Prevent Error Thesis, Rowland and Noble () found that children showed better comprehension of dative sentences containing novel verbs when the recipient was a proper noun (e.g. I'm blicking Teddy the frog) than a definite determiner phrase (e.g. I'm blicking the Teddy the frog). Although other factors are no doubt relevant too (e.g. consecutive determiner+noun sequences are confusing), one relevant factor seems to be that % of datives in child-directed speech are of the former type. Thus frequency is preventing errors here; but frequency not of individual lexical items or categories, but of cues to thematic role assignment (e.g. 'being a proper noun' is a frequently heard cue to recipienthood).
In summary, whilst input frequency effects are straightforwardly (and hence uncontroversially) observed at the levels of individual words or surface strings, effects at the level of sentence constructions are much more elusive. We have argued, however, that frequency effects (token and type, AoA, and preventing and causing error) are no less ubiquitous in this domain than any other. The reason that they often elude discovery is that they tend to be rather abstract: what is relevant is often the frequency not of surface strings but of pairings between concrete lexical items and abstract constructions, of abstract cues to subjecthood, of type:token ratios within a given construction, and so on. Indeed, even when we might be tempted simply to count the number of occurrences of a particular word (e.g. go), the appropriate frequency measure, and the one that yields correlations between children's speech and their input (Theakston, Lieven, Pine & Rowland, ), is the frequency of each of its different senses. In short, as the saying goes, not everything that can be (easily) counted counts, and vice versa.
Consequently, if we are to make progress in our understanding of children's acquisition of sentence-level constructions, we need to move away from models based only on surface form and towards models that include roles for abstract factors such as verb meaning, animacy, participant roles, construction-level semantics, and so on (e.g.

MORE ADVANCED CONSTRUCTIONS
Both frequency effects in general, and our five theses in particular, scale up to more advanced constructions. Here we consider three construction types that have received considerable attention in the acquisition literature: questions (focusing mainly on wh-questions, which have tended to attract more research attention than yes/no questions), relative clauses, and passives.

Questions
Most agree that the very first questions that English-speaking children produce are rote-learned, frequently heard, probably unanalyzed strings, such as what's+that (often pronounced as whassat?). Many would also agree with Klima and Bellugi () that these very early questions include partially analyzed high-frequency formulae such as What-X-(doing)? and Where-X-(going)? (see also Fletcher, ). However, the role of frequency beyond these earliest formulaic utterances is more controversial. Here we argue that there is ample evidence that children's early question acquisition is moulded by input frequency well into development. We suggest that studies of question acquisition support three of our theses: (i) that frequent items are acquired before infrequent ones, all else being equal (AoA); (ii) that high-frequency question types can Prevent Errors; and (iii) that, under some circumstances, an over-reliance on high-frequency forms can Cause Errors.
First, studying the order in which children start to produce wh-words demonstrates that a word's frequency affects how easily and early it is acquired (AoA). Wh-questions in particular provide a good test bed for investigating the effect of frequency on the acquisition of lexical items because they contain a built-in control for many of the other variables that we know interact with (and can mask the effect of) frequency. For example, in English, wh-words always appear in the same position (at the beginning of the clause), so controlling for the effect of sentence position on an item's salience is not necessary. Similarly, all wh-words are roughly equivalent in ease of production since all are one-syllable words which start with one of two phonemes (/w/ for what, where, why, when, and which and /h/ for how and who).
A number of studies have observed a correlation between order of acquisition and input frequency in a range of languages. For example, Rowland, Pine, Lieven, and Theakston () reported that the order in which the twelve Manchester corpus children began to produce English wh-words correlated with the frequency of the wh-words in their input, even when syntactic and semantic complexity were taken into account. Wode (), Forner (), Savic (), and Clancy () have reported similar findings for German, Serbo-Croatian, and Korean (see also Tyack & Ingram, ; Bloom, Merkin & Wootten, , for English; Okubo, , for Japanese). Once again, input frequency is not the only relevant factor here, since it accounted for only -% of the variance in the order of wh-word acquisition (Rowland et al., ), as predicted by our Interaction Thesis, but it is a significant factor nonetheless.
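The kind of relationship described above, between input frequency and order of acquisition, is typically assessed with a rank correlation. The following is a minimal sketch of Spearman's rho written in plain Python. The item labels and all numbers are invented placeholders, not data from any study cited here, and the cited studies' own analyses (which also controlled for complexity) may well have differed.

```python
def ranks(values):
    """Average ranks (1 = smallest); tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman's rho: the Pearson correlation computed over ranks."""
    rx, ry = ranks(xs), ranks(ys)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Invented placeholder values (hypothetical items, NOT real corpus counts or ages).
input_frequency = {"item_a": 900, "item_b": 400, "item_c": 150, "item_d": 120, "item_e": 30}
age_first_use_months = {"item_a": 22, "item_b": 23, "item_c": 27, "item_d": 26, "item_e": 33}

items = sorted(input_frequency)
rho = spearman(
    [input_frequency[w] for w in items],
    [age_first_use_months[w] for w in items],
)
```

A negative rho in such an analysis would indicate that more frequent items tend to be produced at younger ages; it says nothing by itself about the contribution of other factors such as complexity.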
Research into children's questions (both wh- and yes/no) also demonstrates how highly frequent sequences can help protect children from making syntactic errors when constructing sentences (Prevent Error). Although word order errors are rare in children's early productions, English-learning children make a surprising number of these errors in their early question formation. These errors include subject-auxiliary inversion errors in which the tense- and agreement-marked auxiliary occurs post-, instead of pre-subject (e.g. *What he can do?) and double-marking errors in which tense+agreement is marked twice (*What did he didn't want; *What is he isn't eating?; *Does she doesn't want a drink?). These errors pattern systematically, and therefore cannot be dismissed as momentary lapses or slips of the tongue. For example, they are generally more common with some wh-words (e.g. why) and auxiliaries (e.g. DO and the modal auxiliaries), and with negative questions (e.g. The many different theoretical accounts of these errors that have been proposed need not concern us here (e.g. Stromswold, ; De Villiers, ; Valian, Lasser & Mandelbaum, ; Santelmann, Berk, Austin, Somashekar & Lust, ). The important point is that whatever other factors may affect rates of error (e.g. polarity and auxiliary type, as discussed above), questions are more susceptible to error when certain wh-words are combined with certain auxiliaries. For example, Rowland and Pine () reported that one child, Adam, produced Where shall questions correctly but made errors with What shall. Similarly, he produced errors with How can but not with How do. These findings suggest that, whatever other rules or abstractions young children are using, they are making at least some use of high-frequency lexical frames learned from the input (e.g. How do + X; Rowland & Pine, ; Rowland, ; Ambridge & Rowland, ). The relevant questions are thus protected from error, since the word order of the question is specified directly in the frames.
If this is the case, then one would expect to see higher error rates for lower-frequency question types for which the child has no frame available, and which must therefore be generated using other strategies (e.g. generalizing from existing knowledge). Rowland (; see also Dabrowska & Lieven, ; Ambridge & Rowland, ) directly tested the prediction that question types that had occurred with high frequency in the input would be picked up as frames by children and so would be protected from error. In an analysis of the yes/no and wh-questions produced by ten English-learning children aged two to five years, she reported significantly lower rates of error in question types that were highly frequent in the children's input than in low-frequency question types. Importantly, the analyses ruled out alternative explanations, such as the identity of the wh-word or auxiliary, or the input frequency of the individual words.
The domain of question acquisition also exhibits evidence for our Cause Error Thesis. An over-reliance on frequent frames can not only protect from error, but, in some cases, cause errors, when children use these frames inappropriately, for example by combining a wh-word+auxiliary frame (e.g. Why can) with an inappropriate declarative phrase (she can't drink the milk) to yield a doubling error (Why can she can't drink it the milk?; Dabrowska and Lieven, , found that % of their potentially frame-derived questions were errors). Ambridge and Rowland () tested this prediction in an elicitation experiment with English-learning three- to four-year-olds. They reported that doubling errors were more likely to be produced by children who had already learnt the relevant wh+auxiliary frame (Why can), and speculated that doubling errors occurred when children combined these frames with a declarative fragment (Why can + she can't drink the milk), suggesting that stored high-frequency strings can sometimes cause, as well as protect from, error.
Once again, this is a domain in which frequency interacts with other factors such as cognitive complexity (Interaction Thesis). For example, both Rowland () and Ambridge and Rowland () reported that certain question types (e.g. Why don't, and, indeed, most negative questions) attracted higher rates of error than would be expected solely on the basis of input frequency. Again, the conclusion that other factors are also at play does not obviate the need for a frequency-sensitive learning mechanism and, indeed, constrains theory development by highlighting the need for a mechanism that explains the interaction of frequency with other relevant factors.
Finally, it is important to note that an explanation of the frequency effects outlined in this section need not necessarily incorporate the assumption of item-based frames. For example, under Westergaard's () approach, children are learning and applying grammatical movement rules (as in the generativist theories mentioned above), but these are framed in terms of language-specific micro-cues that specify in detail when and where different grammatical rules apply. Cues for which there is a lot of evidence in the input (i.e. high-frequency cues) will inevitably be learned first. Thus, as we argued in the 'Introduction', a frequency-sensitive account will not necessarily be a constructivist one; a point to which we return in the final section.

Relative clauses
Throughout this article we have emphasized the existence of different types of frequency effect (Levels and Kinds Thesis), from those involving concrete strings to those involving abstract cues and constructions. In this section, we present evidence that frequency effects of the more abstract type are observed for children's acquisition of relative clauses. Thus, frequent forms, when appropriately defined, are associated with earlier acquisition (AoA) and lower error rates (Prevent Error).
At first glance, the bulk of past research on relative clauses (RCs) appears to present a clear counter-argument to the claim that frequency significantly influences acquisition. Most of this research has focused on the acquisition of subject () and object () RCs.
() The girl that chased the boy
() The boy that the girl chased
Let us first concentrate on the language for which we have the most data: English. Naturalistic and experimental studies SUGGEST that children acquire subject RCs before object RCs (e.g. Diessel & Tomasello, ; Kidd & Bavin, ). Additionally, a host of adult sentence processing studies have consistently reported a subject advantage for RC processing (e.g. Gibson, ). These results, especially the experimental data, are consistent, and replicate across typologically similar languages. This pattern is problematic for any argument that frequency influences syntactic acquisition, since, in English, object RCs are MORE FREQUENT than subject RCs in child-directed speech (Diessel, ) and in spoken language in general (Roland et al., ). We argue in this section that, far from constituting evidence against a frequency-sensitive learning mechanism, the case of RCs reveals the multiplicity of levels at which frequency exerts an influence on acquisition (Levels and Kinds Thesis).
Subject and object RCs differ substantially in their functional-distributional properties. Fox and Thompson () first identified a number of dimensions on which the two structures differ. One prominent dimension is the ANIMACY of the head noun: subject RCs are significantly more likely than object RCs to contain an animate head noun, whereas the opposite is the case for inanimate heads. Second, object RCs typically contain discourse-old RC subjects. Finally, both Roland et al. () and Fox and Thompson () have shown that object RCs in spoken English rarely contain a relative pronoun. As such, although most experimental studies tested object RCs like (), which contain two animate NPs and an overt relative pronoun, the types of object RCs that are most frequent in spoken discourse more closely resemble ().
() The film I saw last night


The distributional tendencies of object RCs are attributable to two functional properties of language (Du Bois, ): (i) objects are typically inanimate, whereas subjects tend to be animate (typically human); and (ii) subjects tend to be discourse-old. These are STATISTICAL properties of language. The likelihood of overt relativizer (that, which) use is also subject to frequency constraints: Fox and Thompson () identified several variables that predict the use/non-use of the relativizer, one being whether or not the RC subject was expressed as a pronoun (leading to non-use). Although these distributional facts are often ignored in studies of RC acquisition, they exert significant influences on children's acquisition.
Studies of naturalistic speech show that children quickly converge on these frequency patterns. Diessel () reported on the distributional properties of subject and non-subject (predominantly object) RCs in Adam's (Brown, ) and Abe's (Kuczaj, ) speech from the CHILDES corpus (MacWhinney, ). Non-subject RCs overwhelmingly contained inanimate head nouns (·%) and pronominal RC subjects (·%) (see also Kidd, Brandt, Lieven & Tomasello, ). These numbers closely resembled the frequency of different NP-types in simple transitive clauses in the children's speech, where ·% of all subjects were first or second person pronouns. Therefore, despite the fact that non-subject RCs do not follow canonical word order, they do mark syntactic roles canonically (i.e. subject = animate, given, object = inanimate) and in a manner that matches the distributional properties of simple transitive sentences. Crucially, these frequency estimates from corpora predict children's correct production and comprehension of RCs in controlled experimental contexts. For instance, Kidd et al. () and Brandt, Kidd, Lieven, and Tomasello () showed that the typical subject-object asymmetry is neutralized and in some instances reversed when three- to four-year-old English- and German-speaking children were tested on highly frequent object RC types (i.e. those with an inanimate head noun and a pronominal RC subject) (see also Arnon, ).
Thus, as we saw in 'Simple syntactic constructions', children's acquisition of RCs is influenced by frequency, but at the level of abstract cues (e.g. animacy, givenness) and lexical items (i.e. pronouns) that are frequently associated with particular sentence positions. These distributional frequencies predict earlier acquisition (AoA), as well as lower error rates, and hence higher rates of correct performance, in both comprehension and production (Prevent Error).
Potentially problematic for this conclusion is the finding that subject RCs are actually the first type of RC to emerge in children's speech (Diessel & Tomasello, ). A closer inspection, however, reveals that the vast majority of these early RCs are so-called 'presentational amalgam' constructions, as in () and ().
() Here's a mouse go sleep
() That is a train go go
Lambrecht () described the presentational amalgam construction as a type of truncated RC, where the predicate nominal of the copular clause serves as the subject of the clause-final VP. Their status as true RCs in child language is equivocal: they are monoclausal and lack the obligatory relative pronoun. As such, they closely resemble canonical SV(O) clauses, leading to the possibility that children use their knowledge of frequent structural patterns to break into the syntax of RCs, after which their relative use of subject and object RCs closely approximates adult usage (see Fitz, Chang & Christiansen, , for a connectionist model that uses word-order patterns learned from canonical SVO sentences to acquire the structure of relative clauses). Thus, again, we find that there are many different types of frequency effect (Levels and Kinds Thesis), and that, provided we define 'form' at the appropriate level, more frequent forms are associated with earlier acquisition (AoA Thesis).
One final emerging piece of evidence regarding the role of frequency in RC acquisition comes from languages other than English. Several researchers have suggested that the traditional subject-object asymmetry observed in experimental studies of English (and other typologically similar languages) derives from the fact that subject RCs follow canonical word order, whereas object RCs do not (e.g. Bever, ; MacDonald & Christiansen, ). This account makes the following prediction: object RCs should be acquired first and should be easier to understand in languages where their word order follows canonical word order. Chinese languages such as Mandarin and Cantonese follow this pattern. Although there are many more studies to conduct on these languages, there is some evidence in support of this prediction (Yip & Matthews, ; Chan, Matthews & Yip, ; Chen & Shirai, ; though see Hsu, Hermon & Zukowski, ). Thus, again, we see an effect of frequency, but at a very abstract level: the frequency of particular orderings of SUBJECT and OBJECT roles in the language as a whole; an effect far removed from a view under which the acquisition mechanism is sensitive only to the frequency of particular surface strings.
Whilst the evidence for frequency effects in this domain is clear, what remains unclear is how these effects are represented and implemented on-line. For instance, there is some evidence to suggest that many object RCs are produced using prefabricated chunks (e.g. the one pro VERB; see Fox & Thompson, ; Reali & Christiansen, ), but the processing advantage shown for object RCs that have less prototypical features (e.g. the pen that I bought) raises the possibility that the constraints of animacy and RC subject might be implemented incrementally on-line (see Kidd et al., ). Given the importance of the wider question of the locus of frequency effects observed in first language acquisition, this is clearly an issue that requires further investigation.

Passives
Research on passives illustrates that frequency effects can be found not only within a given language, but also cross-linguistically (Levels and Kinds Thesis): across languages, a negative correlation is often observed between the relative frequency of a particular construction in the language and the age at which it is typically acquired by its speakers (AoA Thesis). Passives are highly dispreferred in languages like English, German, and Hebrew, and thus occur infrequently. Our most comprehensive naturalistic data come from English: in a large corpus study, Xiao, McEnery, and Qian () reported that the percentage of all passive types (full and truncated, using either be or get) in spoken British English is ·%. Using the Brown () corpus (i.e. American English), Gordon and Chafetz () reported that full passives occur in only ·% of all sentences in CDS, whereas truncated passives occur ·% of the time. Not surprisingly, passives are also rare in the spontaneous speech of English-speaking children (Pinker, Lebeaux & Frost, ; Israel, Johnson & Brooks, ), a finding that is similar to reports on German (Mills, ) and Hebrew (Berman, ).
The learnability problem posed by infrequent and more advanced structures is well-worn territory in child language research, and the passive has been central to this debate. One way to evaluate how frequency matters is to compare languages such as English and German, in which the passive is infrequent, to languages where the passive occurs with much higher frequency. Indeed, there are several cases in the literature where higher passive frequency results in earlier acquisition (AoA Thesis). For instance, in Sesotho the passive is estimated to be ten times more frequent than it is in English (Kline & Demuth, ), which appears to result in comparatively earlier acquisition (Demuth, ; Demuth, Moloi & Machobane, ). Similar effects have been reported for Inuktitut (Allen & Crago, ), Bahasa Indonesia (Gil, ), and K'iche' Maya (Pye & Quixtan Poz, ). In every case the high frequency of passive use appears to stem from particular typological properties of the languages, which, in comparison to European languages, make the passive a less marked structure (Interaction Thesis).
Training studies in English complement the cross-linguistic work. In an early study, Whitehurst, Ironsmith, and Goldfein () showed that modelling passives to four- to five-year-olds increased their production and comprehension, a finding corroborated by Vasilyeva, Huttenlocher, and Waterfall () (for a training study of rare subject RCs in Turkish, see Sarilar, Matthews & Küntay, ). The Whitehurst et al. study predates the structural priming literature (e.g. Bock, ; Pickering & Ferreira, ), but nowadays would be interpreted as a priming effect. The passive is the most studied structure in priming studies conducted with developmental populations, showing a consistent priming effect (e.g. Savage, Lieven, Theakston & Tomasello, ; Huttenlocher, Vasilyeva & Shimpi, ; Messenger, Branigan & McLean, ; Kidd, ).
The robust nature of the priming effect for the English passive has been explained with reference to the structure's low frequency: the so-called INVERSE FREQUENCY EFFECT, which describes the tendency for low-frequency structures to yield higher priming effects. Several explanations for this inverse frequency effect have been proposed, but the one that most naturally extends to acquisition is the argument that structural priming effects reflect implicit learning of structure (Chang et al., ): children have a greater tendency to produce low-frequency forms after being primed because priming leads to larger representational change in comparison to more entrenched structures (e.g. the active transitive). Importantly, the account predicts that children will respond to low-frequency forms such as the passive differently across development: representational change in young children following exposure will be greater than in older children (effectively, younger children have more to learn). This leads to a prediction (or even perhaps a caution): we should not expect frequency effects to be uniform across developmental stages and, indeed, individual children (Levels and Kinds Thesis).
Finally, the acquisition of the passive has been shown to be either supported or hindered by its similarity or dissimilarity to other structural patterns. Abbot-Smith and Behrens () showed that a German-speaking child acquired the stative sein-passive before the eventive werden-passive, even though the two forms are roughly equal in frequency in the input. However, the two passives overlap with other structures that serve to either support (in the case of the sein-passive) or hinder acquisition (in the case of the werden-passive). The acquisition of the sein-passive is facilitated by the previously learned morphologically and functionally similar present perfect, whereas the werden-passive cannot build on a previously acquired construction and competes in function with high-frequency modal verb constructions. Thus we have another instance where frequency at multiple levels interacts with other properties of language, in this case structural overlap, to determine acquisition (Interaction Thesis).
To conclude this section, there is ample evidence to suggest that frequency effects are observed not only for lexical strings and simple structures, but also for more advanced structures including questions, relative clauses, and passives. Because, in many cases, these frequency effects occur at the level of abstract categories, patterns, or cues, they are often more difficult to detect than frequency effects at the single-word or even construction level. When the data are analyzed at the appropriate level of abstraction, however, we see exactly the same types of frequency effect that are observed for other domains. One pressing challenge for future research in this domain is to better determine how frequency effects interact with other features of language, such as typology (e.g. see papers in Kidd, ).

THEORETICAL IMPLICATIONS
The present article reviewed frequency effects in four core domains: the acquisition of single words, inflectional morphology, simple syntactic constructions, and more advanced constructions. We argued that frequency effects are ubiquitous across all of these domains, and, indeed, across language acquisition in general. In summarizing this evidence, we argued that there exist different types of frequency effect; for example, effects at the levels of lexical strings and abstract sentence constructions, as well as effects of both type and token frequency and of relative and absolute frequency (Levels and Kinds Thesis). We presented evidence that high-frequency forms are associated with earlier acquisition (AoA Thesis) and lower rates of error (Prevent Error Thesis), but also that they can cause error when used inappropriately (Cause Error Thesis). Finally we argued that frequency effects interact with other effects, such as utterance position, and that such interactions can be informative with regard to the nature of the language acquisition mechanism (Interaction Thesis).
Whether or not we have succeeded in convincing the reader of all of these individual claims, we hope to have marshalled sufficient evidence to convince all but the most hardened classicist (in the sense of Newmeyer, ) of the ubiquity of frequency effects across all domains of child language acquisition, and that frequency effects therefore constitute a phenomenon for which any successful theory must be able to account.
As we noted in the 'Introduction', this might be either a generativist/nativist account that assumes knowledge of innate syntactic categories, principles, and parameters (e.g. Yang, ; Westergaard, ) or a constructivist/usage-based account that does not (e.g. Tomasello, ). In principle, both classes of account could, given certain assumptions, explain the patterns of frequency effects outlined here. This is not to say, however, that all current theories can explain frequency effects, and that, by making reference to accounts that are incompatible with such effects, we are setting up a straw man. We have already mentioned in passing one account that explicitly denies any meaningful effect of frequency (Roeper, ). Much more common are proposals that do not explicitly rule out frequency effects (or, indeed, discuss them at all), but that posit learning procedures that not only (a) yield no frequency effects in their current form, but also (b) COULD yield no frequency effects without abandoning the core learning mechanism assumed. An example is the triggering approach to setting word order parameters. Under such accounts (e.g. Sakas and Fodor, ), children acquire the word order of their language (e.g. SVO for English), not by abstracting across input utterances, but by setting syntactic parameters (e.g. setting the specifier-head and head-complement parameters to the settings that yield SV and VO, respectively). Because the account includes no role for input-based learning, it does not explain the finding that word order is better learned for more frequent verbs (Matthews et al., , ). Neither can the account straightforwardly be modified to yield such effects. It would be necessary to add the assumption that children learn word order by abstracting across input strings, which entirely obviates the need for the parameter-setting mechanism.
The whole point of the account is to explain how children could use triggers to acquire word order rapidly, WITHOUT having to build this knowledge gradually on the basis of the input. Thus there exist at least some accounts with which the type of frequency effects discussed in the present article are INCOMPATIBLE IN PRINCIPLE. However, while some individual accounts are incompatible with frequency effects, this is not true for whole families of accounts. Both constructivist and generativist accounts (including some parameter-setting accounts) can incorporate frequency-sensitive learning mechanisms. That said, we feel that it would be remiss of us to end this review sitting on the fence, and that we owe it to readers who have persisted this far to nail our colours to the theoretical mast. It will come as no surprise to anyone who has read any of our previous papers that these colours are those of the constructivist camp. But this is not a matter of research tradition, terminology, or simple preference; on our view, the constructivist account offers a more parsimonious account of frequency effects.
Let us illustrate this claim by returning to one of the domains that we have discussed here (inflectional morphology) and, specifically, to a phenomenon to which we have already alluded briefly. The phenomenon is that children sometimes produce agreement-/tense-less verb forms in contexts in which an inflected (here third person singular -s) form is required (e.g. *Dolly eat it). Importantly, both sides agree that this phenomenon is related to the input. For example, English and Dutch children hear these agreement-/tense-less verb forms frequently (e.g. in sentences such as Let Dolly eat it and Dolly can eat it), and so produce these errors at high rates. Italian and Spanish children hear these forms much less frequently, and so produce these errors rarely. Thus both generativist and constructivist researchers agree that this phenomenon can be explained only by positing some kind of frequency-sensitive learning mechanism.
Under a generativist account (e.g. Legate & Yang, ), children use the input to set an innately given TENSE parameter to either a positive (the language requires tense/agreement marking) or negative setting (it does not). Because this parameter is set probabilistically on the basis of the input (i.e. in a way that is frequency sensitive), this account can explain why English and Dutch children, who hear these 'bare' forms frequently, produce more errors than Italian and Spanish children, who do not.
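To make the notion of a probabilistically set parameter concrete, the following Python sketch shows one way in which such a learner could be frequency sensitive. It is a toy illustration under stated assumptions, not the model of Legate & Yang: the update rule, the function names `update_tense_parameter` and `simulate`, the learning rate, and the input proportions are all hypothetical.

```python
def update_tense_parameter(p_tense, is_tense_marked, rate=0.05):
    """Return the updated probability of the positive TENSE setting.

    Hypothetical rule (an assumption for illustration only): an utterance
    with overt tense/agreement marking nudges the probability toward 1;
    a 'bare' form is treated as uninformative and leaves it unchanged.
    """
    if is_tense_marked:
        return p_tense + rate * (1.0 - p_tense)
    return p_tense


def simulate(proportion_marked, n_utterances=40, start=0.5, rate=0.05):
    """Run the toy learner over a deterministic input stream.

    The stream is constructed so that roughly `proportion_marked` of the
    utterances carry overt tense/agreement marking.
    """
    p = start
    marked_so_far = 0
    for i in range(1, n_utterances + 1):
        # Emit a marked utterance whenever the running count lags the target.
        is_marked = marked_so_far < proportion_marked * i
        if is_marked:
            marked_so_far += 1
        p = update_tense_parameter(p, is_marked, rate)
    return p


# Invented input proportions, chosen only to contrast two kinds of input.
p_rich_marking = simulate(proportion_marked=0.9)   # few bare forms in the input
p_frequent_bare = simulate(proportion_marked=0.5)  # many bare forms in the input

# If 1 - p is read as the chance of producing a bare form, the learner
# exposed to more bare forms is predicted to make more such errors.
print(1.0 - p_rich_marking, 1.0 - p_frequent_bare)
```

After the same number of utterances, the learner whose input contains more bare forms has moved less far toward the positive setting. This is the sense in which a parameter-setting account can yield a cross-linguistic frequency effect. Note that nothing in the sketch distinguishes one verb from another.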
Under the constructivist account (e.g. Freudenthal et al., ; Räsänen et al., ) children make these errors because they are learning from the input individual lexical forms and multiword strings (e.g. play, plays, Let Dolly play, etc.), which they sometimes use inappropriately (e.g. producing Let Dolly play, in a context where Dolly plays would be appropriate). This proposal not only offers a closer fit to the quantitative cross-linguistic pattern, but also explains why, within a given language, some verbs display higher error rates than others (Freudenthal et al., ). For example, in English, the verbs that children frequently hear in 'bare' versus third person singular -s form, particularly in utterance-final position, are exactly those verbs that children frequently produce in bare form in third singular contexts (Theakston, Lieven & Tomasello, ; Kirjavainen, Theakston & Lieven, ; Freudenthal et al., ; Räsänen et al., ). Now, as we argued above, there is no reason in principle why the generativist account could not be adapted to accommodate these lexical-level frequency findings. One could quite easily propose that, in addition to using input forms to set the TENSE parameter (Legate & Yang, ), children additionally store input strings and, on a non-negligible proportion of occasions, produce utterances by retrieving these stored strings directly. Why, then, do we favour the constructivist alternative? The reason is that the constructivist account yields these lexical input frequency effects naturally, using the core learning mechanism assumed by the account (i.e. the storage and reuse of strings from the input).
In contrast, the generativist account yields these effects by DISCARDING the core mechanism assumed by that account (at least, on a sufficiently large proportion of occasions for the effects to be detectable) and adding ancillary hypotheses that have no independent theoretical motivation within the account, and that serve no purpose other than to explain otherwise recalcitrant findings.
An analogous situation applies in every domain that we have investigated. For example, children could acquire word order by setting innate complement-head and specifier-head parameters that spell out (amongst other things) the target order of the innate categories of SUBJECT, VERB, and OBJECT in the language being learned. But in order to explain the finding that children and adults have detailed knowledge of the frequency with which particular verbs have appeared in this construction, the generativist account would have to add the assumption that, in addition to setting this parameter, children record verb+construction collocation frequencies. Again, whilst for the generativist account this assumption is merely an ancillary hypothesis with no independent theoretical motivation, the phenomenon falls naturally and inevitably out of the constructivist account: if children learn the SUBJECT VERB OBJECT construction by abstracting across particular instances of that construction in the input, then the frequency with which each verb has appeared in this construction is immanent in the generalization. We would be the first to admit that there are many important language acquisition phenomena for which current constructivist accounts do not offer a satisfactory explanation; but, on our view, constructivist accounts, which have frequency sensitivity built into their very fabric, provide the most parsimonious explanation of the multiplicity of frequency effects discussed here.
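The claim that verb frequencies are immanent in the generalization can be illustrated with a small Python sketch. It is a deliberately minimal toy, not an implementation of any published constructivist model: the class `Construction`, its methods, and the example utterances are all hypothetical. The point is only that, if a construction is represented as an abstraction over stored instances, verb-by-construction frequencies require no additional bookkeeping.

```python
from collections import Counter


class Construction:
    """A toy schema built up from individual input instances (hypothetical)."""

    def __init__(self, name):
        self.name = name
        # Token count of each verb attested in this construction.
        self.verb_counts = Counter()

    def add_instance(self, subject, verb, obj):
        """Abstract over one input utterance by filing it under its verb."""
        self.verb_counts[verb] += 1

    @property
    def token_frequency(self):
        """Total number of instances heard."""
        return sum(self.verb_counts.values())

    @property
    def type_frequency(self):
        """Number of distinct verbs attested in the construction."""
        return len(self.verb_counts)

    def verb_strength(self, verb):
        """Relative frequency of a verb in the construction (0.0 if unattested)."""
        total = self.token_frequency
        return self.verb_counts[verb] / total if total else 0.0


# Invented input, for illustration only.
toy_input = [
    ("Dolly", "eat", "it"),
    ("Mummy", "eat", "cake"),
    ("Dolly", "eat", "toast"),
    ("Daddy", "push", "car"),
]

svo = Construction("SUBJECT VERB OBJECT")
for subject, verb, obj in toy_input:
    svo.add_instance(subject, verb, obj)

print(svo.token_frequency, svo.type_frequency, svo.verb_strength("eat"))
```

In this representation the same stored counts deliver token frequency, type frequency, and the relative frequency of each verb. A parameter-setting learner, by contrast, would need a separate store to recover the same information.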
To summarize, the current article has presented evidence of pervasive frequency effects across children's language acquisition. Frequency effects are observed across a variety of different domains, levels (e.g. lexical vs. abstract; type vs. token; absolute vs. relative), and outcome measures (e.g. age of acquisition, rates of error/correct use, types of error), and therefore constitute a phenomenon that demands explanation under any theoretical account. Although we have advocated a constructivist account, this is not to say that alternative approaches are incompatible with frequency effects in principle. The challenge for such accounts is to incorporate motivated mechanisms that yield frequency effects whilst preserving the core mechanistic assumptions of the account.
In conclusion, whilst, as we have tried to stress throughout, frequency isn't everything, frequency CERTAINLY isn't nothing. On the contrary, frequency effects constitute a phenomenon that any successful account of child language acquisition must explain.