6 Crosslinguistic approaches to language acquisition
6.1 Introduction
Human language is the only communication system with extensive variation in form and meaning across the groups of its users. Human language comes in a great many varieties, and the structures we find in grammars of individual languages and in the way meanings are expressed vary to an impressive degree. Currently, there are about 6,000–7,000 languages spoken.1 For only about half of these we have some kind of basic grammatical description and for only about 10 per cent do we have good and elaborate analyses. Yet in-depth description of the adult language is a prerequisite for any acquisition study. Even though in the last forty years a lot of crosslinguistic language acquisition research has been conducted, it is still for only about 2 per cent of the world’s languages that we have at least one acquisition study. For even these 2 per cent, however, we may only have acquisition studies devoted to one individual feature or aspect of language development.
Furthermore, this small sample is heavily biased toward Indo-European languages of Western Europe with the bulk of research still concentrated on English. This bias manifests itself even in the titles of works on language acquisition. English is the default case: if there is a title about the acquisition of language or some feature of language without naming the language, then we can assume the work is on English; if the work bears on any other language, that language is normally named in the title.
A problem of this small biased sample is that we take English and a few other Indo-European languages as the prototype for acquisition. Yet it is well known that these languages are typologically unusual; English and the Indo-European languages of Northwestern Europe for which we have acquisition data (e.g. French, Italian, German) exhibit a large number of linguistically rare phenomena (cf. Dahl 1990, Haspelmath 2001). A prominent example is the relative construction with relative pronouns (e.g. whom in the woman whom I saw or that in the mouse that ate the cheese). This construction raises specific acquisition issues (see Diessel & Tomasello 2000), but it is not attested in many other languages where its function is taken over by structurally different constructions (e.g. Comrie & Kuteva 2005).
Thus a substantial part of our knowledge about language acquisition is built on specific constructions prominent in languages of Europe that have been well described, but we do not have information about how other, more widespread, constructions are acquired. Generalizing from the acquisition of one or a few languages to language in general is comparable to biologists studying one unusual mammal species, such as whales, and making generalizations from that to all other mammals. It is well known that children learn the language of their environment, but languages differ, and we need to include in our research the range of features that children may have to acquire. Acquisition studies of less well-documented languages and, in general, a more crosslinguistic perspective on acquisition are a top priority in the field.
Crosslinguistic language acquisition research is usually understood in two different ways. First, and most frequently, the term is used for acquisition studies of languages other than English. Studies of this type, for instance, investigate how ergative structures are acquired in Quiche Mayan, or how grammatical morphology is acquired in Turkish. Results of such studies are often used to test theories of language acquisition that are developed on the basis of research on English, or that are informed by general speculation about the nature of grammar.
The other type of crosslinguistic research is inherently comparative, and languages for comparison are selected on the basis of typological differences or similarities. I will use the term ‘typological language acquisition research’ for this type of research. The goal is to systematically explore commonalities and differences in the acquisition of specific linguistic features across different languages. Languages are grouped typologically on the basis of shared features. For example, word order has often been used to define types of languages; English has a predominantly subject–verb–object pattern (SVO), whereas Welsh has a predominantly VSO order and Japanese has an SOV order. A variety of features is used to classify languages into typologies, for example case marking. Some languages are classified as Ergative–Absolutive while others are Nominative–Accusative, identified by the pattern of case marking used. A language with ergative case marking typically treats the subject of an intransitive sentence like the object of a transitive sentence while the subject of a transitive sentence is distinct. However, there is variation within this general pattern (Van Valin 1992). The advantage of the ‘typological language acquisition research’ approach is that a range of crosslinguistic variation is covered.
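The two case-marking patterns can be made concrete with a small sketch. It is an idealized illustration only: the role labels S (intransitive subject), A (transitive subject) and P (transitive object) are common typological shorthand rather than terms used in this chapter, the function names are hypothetical, and the variation within the ergative pattern noted above is deliberately ignored.

```python
# Idealized case alignment, ignoring split systems and other variation.
# S = subject of an intransitive sentence
# A = subject of a transitive sentence
# P = object of a transitive sentence
ALIGNMENTS = {
    "nominative-accusative": {"S": "nominative", "A": "nominative", "P": "accusative"},
    "ergative-absolutive": {"S": "absolutive", "A": "ergative", "P": "absolutive"},
}


def case_for(alignment: str, role: str) -> str:
    """Return the case label a role receives under an idealized alignment."""
    return ALIGNMENTS[alignment][role]


def patterns_with_intransitive_subject(alignment: str) -> str:
    """Return the transitive role (A or P) that is marked like S."""
    marking = ALIGNMENTS[alignment]
    return next(role for role in ("A", "P") if marking[role] == marking["S"])


if __name__ == "__main__":
    for name in ALIGNMENTS:
        print(name, ": S is marked like", patterns_with_intransitive_subject(name))
```

In the ergative–absolutive entry, S and P share a case label while A is distinct, which is the grouping described in the paragraph above.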
There has been an increase in the number of studies comparing acquisition across languages. Despite this, most research – even when on less well-studied languages – still focuses on one language; typological acquisition research is relatively rare. Some typological studies are Pye et al. (2007), Slobin (1997b) and Strömqvist et al. (1995). The use of different data sets, different methods or different criteria for coding makes it difficult to compare across languages. This complicates post hoc comparisons and meta-analyses and creates a considerable challenge to a full-scale typological approach.
In the remainder of this chapter I discuss some examples of variation across languages and the theoretical and methodological challenges posed by language variation. I then review one example of an intra-genealogical acquisition study, that is, a study that compares languages within a language family, and one example of an inter-genealogical acquisition study, which compares languages across families.
6.2 Variation across languages
6.2.1 Some theoretical views
Variation is found at all linguistic levels: phonology, morphology, syntax, semantics and pragmatics. In addition, there is considerable variation in the context in which learning occurs. The main question of typological language acquisition research is whether, and if so how, the actual course of language acquisition is affected by differences across languages, as well as cultures. However, language acquisition research is very much guided by what language is understood to be, and this affects how typological research can be conceived.
In approaches to language acquisition which adopt a nativist perspective (see Valian Ch. 2), linguistic diversity and variation originally played a marginal role. This has changed somewhat in current work that incorporates data from a wider range of languages. Within nativist approaches explanations of how children deal with variation range from performance factors to the assumption of innate mechanisms. In one version of the theory to account for variation across languages, a small set of parameters was proposed to limit the possible syntactic variation. For example, the pro-drop parameter distinguishes languages which allow pronoun subjects to be non-overt, as in Italian, and languages which require pronoun subjects, as in English (Hyams 1989b).
In contrast to approaches which assume innate language structures, the cognitive, constructivist or usage-based theories (e.g. Bybee 1985, Langacker 1987, Tomasello 2000b and Ch. 5) assume that children construct their languages from a small set of item-specific and low-scope constructions. For usage-based approaches, crosslinguistic variation is of key importance because item-specific constructions are necessarily also language-specific, and the variation in linguistic structure is likely to have an impact on how individual constructions are learned (Slobin 1985a).
Dan Slobin has been leading a visionary initiative over the past two decades in expanding our understanding of similarities and differences in the acquisition of languages of different types. His work has focused, in part, on how languages differ in what is grammaticized, and the problem of form–function mapping in the acquisition process, that is, detecting linguistic forms and assigning a meaning/function to each. He launched a large pioneering project that culminated in five volumes, with sketch descriptions of the language acquisition of twenty-eight languages spanning a wide range of families (e.g. Slobin 1985a, 1985b, 1992, 1997a, 1997b).2 A number of language acquisition researchers provided selective, mostly uniform, summaries of what we know about the acquisition of these languages. The rationale behind his approach was that different types of languages pose different types of acquisition problems and the crosslinguistic method is a ‘method for the discovery of general principles of acquisition’ (Slobin 1985a: 5).
Slobin’s goal was to use this crosslinguistic data to determine the relative difficulties in acquiring formal devices (Slobin 1973). The assumption that the ‘rate and order of development of the semantic notions expressed by languages are fairly constant across children learning different languages’ (Slobin 1973: 187) is difficult to evaluate. The complexity measure of forms consisted in comparing time of first use and time of mastery. As Bowerman (1985) pointed out, this is a very difficult measure to apply, since it is far from clear how first use should be coded and whether the establishment of time of acquisition can be assessed from very different types of data collected from a small number of children. In addition, the time of acquisition will depend on the criterion used by the researcher, the data and the method. The data used in Slobin’s collections stems from a number of different sources: diaries, experiments and longitudinal studies of children of varying ages, across different time spans and stages of development. That is, the data is heterogeneous. However, the chapters provide valuable insights, and some similarities and differences in the acquisition of different languages emerged.
It has often been assumed that the more complex a feature the more difficult it is to learn (Slobin 1985a). The crucial challenge, however, is to ascertain what complexity consists of. Complexity can be measured along a number of dimensions, and in order to understand development processes, an understanding of the complexity is needed, not just of the form of a structure, but also its function and its interrelation with other structures in the language. Interacting with complexity of form is how consistent and how transparent the functions of a form are. Bates and MacWhinney (1987, 1989) proposed the Competition model to account for some of the different patterns of acquisition found across languages. In this model, mechanisms determining the ways in which cues combine or compete are described and the strength with which a cue is used is directly proportional to the informational value or cue validity. Cue validity is the product of cue availability (proportion of time a cue is present) and cue reliability (proportion of time when the cue is present that it indicates the correct solution) (McDonald 1986, McDonald & MacWhinney 1989). When there are several morphological forms with one function and several functions for one form, cue validity and reliability are affected. For example, if a particular case form is used to mark some nouns but not others, that form is low in validity. The extent to which word order is important in helping children determine who did what to whom has been investigated within the Competition model. Animacy, case marking, agreement or stress may be used in the early stages, depending on the language being acquired (cf. Bates et al. 1982, 1984, MacWhinney & Bates 1989). In English, for instance, word order is the dominant cue for young children, but in Hungarian it is animacy and, in Turkish, case marking. That is, young children learning different languages focus on different cues, not necessarily word order, and they are not necessarily the predominant cues which adult speakers of the language rely on.
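The definition of cue validity as the product of availability and reliability can be worked through with a short sketch. The counts below are invented purely for illustration and do not come from any of the studies cited; the function name is hypothetical. Exact fractions are used so that the arithmetic is easy to check by hand.

```python
from fractions import Fraction


def cue_validity(n_sentences: int, n_cue_present: int, n_cue_correct: int):
    """Return (availability, reliability, validity) for one cue.

    availability = share of sentences in which the cue is present
    reliability  = share of cue-present sentences in which the cue
                   points to the correct interpretation
    validity     = availability * reliability
    """
    if n_sentences <= 0:
        raise ValueError("n_sentences must be positive")
    availability = Fraction(n_cue_present, n_sentences)
    if n_cue_present == 0:
        reliability = Fraction(0)  # a cue that never occurs cannot be relied on
    else:
        reliability = Fraction(n_cue_correct, n_cue_present)
    return availability, reliability, availability * reliability


if __name__ == "__main__":
    # Hypothetical counts: 100 sentences, cue present in 80, correct in 60.
    availability, reliability, validity = cue_validity(100, 80, 60)
    print(availability, reliability, validity)
```

With these invented counts the cue is available in 4/5 of sentences and reliable in 3/4 of the sentences where it occurs, giving a validity of 3/5. A case form that marks only some nouns lowers the first factor even when the second is high.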
6.2.2 Conceptualization and linguistic relativity
A large body of research suggests that language is tightly connected with the conceptualization of the world (e.g. Bowerman & Choi 2003, Lucy 1992, Slobin 1996). This research focuses on linguistic relativity, which states that the grammar and the lexicon of a language systematically influence how a speaker of this language perceives and conceptualizes the world around them. Even concepts like time and space have been shown to be conceptualized differently across languages and cultures. In the spatial domain, Levinson (2003) postulates three major linguistic frames of reference that are grammaticalized or lexicalized in the languages of the world: intrinsic (‘the man is inside the house’), relative (‘the man stands to the right of the house’) and absolute (‘the man is to the north of the house’). Children will need to learn which of these modes of orientation is relevant in the language of their surroundings. Thus finding out how children learn a language also means finding out how their conceptualization of the world develops.
Korean and English differ both in their conceptualization of space and the linguistic expressions that encode spatial distinctions. In a pathbreaking typological study, Choi and Bowerman (1991) compared the acquisition of Korean and English spatial terms. Where Korean uses verbs to encode spatial concepts, English uses predominantly adpositions. In English a distinction is made between in (enclosure of a figure in some container) and on (contact of a figure with some object – for support). In contrast Korean distinguishes the kind of fit. For example, nehhta ‘put loosely in or around’ contrasts with kkita ‘interlock, fit tightly’. Choi et al. (1999) found that children from 18–23 months show sensitivity to these language-specific differences. That is, infants are attuned to the way in which their language conceptualizes space. The linguistic input affects concept formation from the earliest stages.
6.2.3 Phonological systems
Children need to learn individual sounds and their phonological contrasts. There are approximately 3,000 categorically distinct sounds used in living languages and there are quite a few more that would in principle be possible – the IPA generates over 50,000 possible symbol combinations (p.c. Ian Maddieson). In their first year, babies build up language-specific phonetic prototypes which help to organize sounds into categories (Kuhl et al. 1992, also see Curtin & Hufnagle Ch. 7 and Vihman et al. Ch. 10). This also holds for children acquiring a tone language such as Yoruba (Niger-Congo, Nigeria) (Harrison 2000). Languages differ in the number of phonemes in their sound system. Rotokas (North Bougainville family, Papua New Guinea) is the language with the smallest known inventory (11 phonemes), whereas !Xóõ (Tuu family, Botswana) is at the other extreme with approximately 153 phonemes. Out of the 122 consonants of !Xóõ there are about 83 clicks, which are preferred word-initially over nonclicks (Maddieson 2005, Traill 1985). Clicks are known to be complex to produce and rank among the most complex articulatory speech sounds. Children learning such a complex sound system might differ systematically in word-learning strategies from children learning languages with a smaller inventory. Children who still have a small vocabulary may be very selective in their choice of words, that is, either actively avoid words which are difficult to pronounce or substitute consonants systematically (for a summary, see Macken & Ferguson 1983).
In fact, clicks are reported to be acquired late in Xhosa (Mowrer & Burger 1991) and closely related Sesotho (Demuth 1992), but the functional load of clicks in these Bantu languages is considerably lower than in the non-Bantu (‘Khoisan’) languages of Southern Africa. However, the acquisition of ‘Khoisan’ languages has not yet been documented and so it is not known if clicks are acquired earlier than in Xhosa and Sesotho.
6.2.4 Words
There are different types of words, phonological and grammatical words, and their structure and identification differ from language to language. To illustrate why the study of diversity is crucial but difficult, let us consider an example which shows how our theories are driven by the data we use. Morphology directly influences the kind of words we have in a language (more analytic or synthetic – see Behrens Ch. 12) but this interrelation has not been addressed in studies of word acquisition. A study on the acquisition of verbs in five Mayan languages (Pye et al. 2007) showed that even in closely related languages the children’s first verb forms differ, depending on the morphology of the particular language (see Section 6.6.1). Words are language-specific constructions and generalizations are difficult to make without taking a wide range of factors into consideration.
It has been taken as common ground that the order of morphemes within a word is fixed and that free permutation of the morphemes is not possible. Any change in order is assumed to create a word with a different meaning. This assumption was confirmed for the languages that have been documented so far. Recent research on words in Chintang (Sino-Tibetan, Eastern Nepal), however, shows that prefixes can freely permute within a word without any change in meaning or other consequences, such as dialect change or pragmatic differences (Bickel et al. 2007). Thus, speakers freely vary between forms like u-kha-ma-cop-yokt-e (3nonsg.a-1nonsg.p-neg-see-neg-pst excl) and kha-u-ma-cop-yokt-e, ma-kha-u-cop-yokt-e ‘they didn’t see us (excl.)’.3 Free prefix permutation severely reduces the amount of repetition available in the input, but we have at present no idea of how children manage to successfully cope with this feature.
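To see why free prefix ordering thins out repetition in the input, one can enumerate the orders that three prefixes allow. The sketch below uses the prefixes and the stem-plus-suffix string from the Chintang example above; the function name is hypothetical. It lists every logically possible order, of which the text cites three, so it should be read as an upper bound on surface variants and not as a claim about which orders are attested.

```python
from itertools import permutations

# Prefixes and remainder taken from the Chintang example in the text.
PREFIXES = ("u", "kha", "ma")
STEM_AND_SUFFIXES = "cop-yokt-e"


def prefix_orders(prefixes, rest):
    """Return one hyphenated surface string per ordering of the prefixes."""
    return ["-".join(order) + "-" + rest for order in permutations(prefixes)]


if __name__ == "__main__":
    forms = prefix_orders(PREFIXES, STEM_AND_SUFFIXES)
    for form in forms:
        print(form)
    print(len(forms), "logically possible orders for a single meaning")
```

Three prefixes yield six candidate strings for one and the same meaning, so under free variation any single string recurs less often than it would with a fixed order.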
A major finding in word learning has been that children in their early word use tend to prefer nouns over verbs (Gentner 1982). Gentner’s observation is based on a number of languages including English, German, Japanese, Kaluli, Mandarin and Turkish. The generalization, however, is based on a survey of early vocabulary studies collected from a variety of independent studies conducted by different researchers. Subsequent studies on other languages (Tzeltal: Brown 1998a, Mandarin Chinese: Tardif 1996, Korean: Choi & Gopnik 1995), and a reanalysis of the English data, have shown mixed results; verbs seem to be more strongly represented in the early vocabulary of Korean, for example. It is likely that the use of different data sets, such as maternal checklists or spontaneous speech samples, yields different results (Clark 2003). An additional factor is the context in which a spontaneous speech sample is collected (Tardif et al. 1999). Similarities across English and Mandarin have been found if the context is kept constant.
Estimating the frequency of nouns and verbs presupposes that we can easily distinguish between nouns and verbs in the speech of a child. However, this can often be a challenge, both in child language and in some languages in general, for example Riau Indonesian and colloquial Jakarta Indonesian (Gil 2000).
6.2.5 Verb morphology
A considerable challenge to acquisition is posed by morphology. Some languages have a lot of morphology, for instance Mohawk (Iroquoian, United States, Canada); other languages such as English or Mandarin Chinese have very little morphology and Vietnamese has none. In verbs, for instance, languages vary as to how many grammatical categories can be expressed within a single verb form. Based on a world-wide survey, Bickel and Nichols (2005) report a range between 0 (Vietnamese, with no evidence of any inflectional form in the verb), and 13 (Koasati). Grammatical categories expressed in the verb can cover a wide range, from more familiar categories like tense, aspect or negation to less well-known but widespread categories like evidentiality (grammatical marking of evidence for a statement) and mirativity (grammatical marking of new and unexpected information) to less common categories like honorificity or switch-reference. A child learning a language which obligatorily expresses honorificity in verb forms (e.g. Maithili: daur-l-ak ‘run-PST-3nh’, ‘he ran’ (non-honorific), daur-l-aith ‘run-PST-3h’, ‘he ran’ (honorific)) has a more complex task of verb learning in the sense of pattern-to-world matching than a child learning a language which does not even express person systematically.
The more verbal categories encoded, the more verb forms a given language exhibits. English expresses three grammatical categories in the verb: person of subject, number of subject and tense, with only two forms to mark them. For example, in She works the -s encodes the person and number of the subject and tense; in She worked the -ed expresses tense. In contrast, the Sino-Tibetan language Chintang obligatorily encodes eight categories and speakers of the language need to make choices in all eight (tense, mood, aspect, polarity, person of subject, number of subject, person of object, number of object). A transitive verb in this language has up to 983 distinct forms (Bickel et al. 2007). Even though with many verbs, some of these forms are rarely used, they are still part of the grammar of adults, and children will acquire them.
The number of verb forms to acquire adds complexity to the task of acquisition, but the way the forms are encoded also adds complexity. Turkish, for example, is agglutinating: that is, each morpheme encodes one meaning. In contrast, Russian and Polish are inflectional languages, in which forms combine several elements of meaning. Exact repetitions of verbs in agglutinating languages like Turkish (as well as in languages with very little verbal morphology like English) are statistically much more likely than in ‘inflectional’ languages like Polish, and exact repetitions become even rarer if the number of categories increases, as in a polysynthetic ‘inflectional’ language like Chintang (Tibeto-Burman, Eastern Nepal). Thus, in English constructions like I saw you, He saw me, We saw them, the verb form is repeated no matter what person or gender is involved. In Polish there is a different verb form for each person and in addition the gender of the subject is also marked on the verb, e.g. ja go zobaczyłam (I him saw.1sg fem) ‘I saw him’, ty nas zobaczyłaś (you us saw.2sg fem) ‘You saw us’, but ty nas zobaczyłeś (you us saw.2sg masc) if the addressee is masculine. Thus the probability of exact repetitions of verb forms is much lower in a language like Polish than in English.
For languages like Chintang the likelihood of exact repetition is even less. For a sentence like ‘I saw you’, Chintang differentiates the three verb forms copnehẽ, copnace and copnanihẽ, with different suffixal strings depending on whether the object ‘you’ is singular, dual or plural, respectively. ‘You saw me’ involves an altogether different pattern of tense and agreement marking, involving a prefix: acobehẽ ‘You (singular) saw me’, acobaŋcihẽ ‘You (dual) saw me’, acobaŋnihẽ ‘You (plural) saw me’ (Bickel et al. 2007).
In summary, verb forms in morphologically rich languages are more variable and the child has to master many more forms and combinations of forms and the appropriate contexts of use.
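The contrast in exact repetition can be illustrated with a toy calculation. The three miniature samples below are built from the examples in this section (the English verb in I saw you, He saw me, We saw them; the glosses of the three Polish forms; the three Chintang forms for ‘I saw you’). They are far too small to estimate real input frequencies, and the measure itself, the share of tokens whose exact form has already occurred, is an assumption introduced here for illustration.

```python
def exact_repetition_rate(verb_tokens):
    """Share of tokens whose exact form already occurred earlier in the sample."""
    if not verb_tokens:
        return 0.0
    seen = set()
    repeats = 0
    for form in verb_tokens:
        if form in seen:
            repeats += 1
        else:
            seen.add(form)
    return repeats / len(verb_tokens)


# Toy samples based on the examples in the text, not on corpus data.
SAMPLES = {
    "English": ["saw", "saw", "saw"],
    # Polish forms represented by their glosses from the text.
    "Polish": ["saw.1sg.fem", "saw.2sg.fem", "saw.2sg.masc"],
    # 'I saw you' with a singular, dual and plural object.
    "Chintang": ["copnehẽ", "copnace", "copnanihẽ"],
}

if __name__ == "__main__":
    for language, tokens in SAMPLES.items():
        print(language, round(exact_repetition_rate(tokens), 2))
```

In the English sample two of the three tokens repeat an earlier form, whereas the Polish and Chintang samples contain no exact repetition at all.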
An area in which similarities in acquisition patterns have been reported is in the acquisition of tense/aspect. Data on tense and aspect are available from a wide variety of historically unrelated languages (see Li & Shirai 2000). There is a strong correlation between tense and grammatical and lexical aspect. Grammatical aspect is a formal category of some languages encoding the temporal structure of an event (e.g. perfective vs. imperfective aspect). Lexical aspect, also called Aktionsarten, is an inherent property of predicates categorizing events into states, activities, telic (goal-directed) events, and other such types. Perfective verb forms, that is, forms portraying events as unstructured wholes (such as the Russian form dat’ ‘give.PFV’) and telic Aktionsarten, that is, verbs including a goal or result in their lexical semantics (such as buy) typically appear in the past tense form of a verb, whereas imperfective aspect and atelic Aktionsarten typically appear in the present (or nonpast) form (Shirai et al. 1998). However, there is variation in the acquisition of tense and aspect across languages. It is unclear whether the variation is due to differences in the language-specific structures that are being acquired, or because researchers use different criteria for identifying acquisition or different types of data on which to base their conclusions. For example, some data have been collected through observation while other data have been elicited in experimental settings. Another likely source of variation is the discourse context of aspect usage, which has been shown to cause substantial variation in a study on the acquisition of Russian aspect (Stoll 2001, 2005).
6.3 Variation in context
Children learn their language from their environment, and there is much descriptive work on the input that children receive. There is not only variation in the structures that children have to learn, but also in their cultural and linguistic contexts (Lieven 1994, Ochs & Schieffelin 1984). Studying the linguistic environment of children can help answer two important questions. First, are there any commonalities of qualitative changes made by the caretakers when talking to the child, in other words do all cultures somehow facilitate their speech when talking to children (not necessarily in the same way)? Second, does the input influence development; that is, do we find correlations between certain features in the input and the language development of the child?
As discussed by Ochs and Schieffelin (1984), some cultures are more child-centred while others are more situation-centred. The difference relates to the values and beliefs of the society. In a child-centred society, as is typical with urban industrialized Western groups, a child is assumed to be a communicative partner from birth and caregivers will talk to a young baby as if the baby can understand, and will even answer for the baby; in addition, a baby’s vocalization will be interpreted as a word. In contrast, in situation-centred societies, a young baby is not assumed to be a communicative partner and so child-directed speech does not play the same role. In fact, children may not be addressed directly until they start to produce intelligible words (e.g. Quiche Mayan: Ratner & Pye 1984, Kaluli: Schieffelin 1985). Other features also vary, such as prompting a child to use appropriate language or even speaking for the child. However, it is difficult to compare directly across cultures because we may not have captured all the contexts in which adults talk to children (de León 1998). Thus we do not know the extent to which children learn language structures from the language addressed to them and from language they overhear.
Research on the dyadic interaction between mothers and their children in Western, literate, urban contexts (that is, child-centred) has identified a series of features characterizing child-directed speech: shorter and simpler utterances, higher pitch (Fernald & Kuhl 1987, Fernald et al. 1989), exaggerated intonation and few errors (Snow & Ferguson 1977). None of these adaptations, which should facilitate acquisition, applies universally. Higher pitch, for example, was long assumed to be a good candidate for a universal of child-directed speech. It has even been found in tone languages such as Mandarin Chinese (Grieser & Kuhl 1988, Papousek et al. 1991). However, there are societies in which higher pitch seems absent from child-directed speech because it is reserved for other registers, as Ratner and Pye (1984) suggest for Quiche Mayan (though for an alternative interpretation, see Fernald et al. 1989). A study by Fernald et al. (1989), comparing prosodic modifications in mothers’ and fathers’ speech to preverbal children in languages with considerably diverse prosodic structures (French, Italian, German, Japanese and both British and American English), suggests that even though there are common patterns found in the input there are language-specific variations. Repetition has also been reported for the speech addressed to young children in, for example, Tzeltal (Brown 1998b), English (Cameron-Faulkner et al. 2003) and also in a recent comparative study of Russian, English and German (Stoll et al. in press).
6.4 Methods for investigating language acquisition
A main problem for typological research is the comparison across studies. If, for instance, we want to compare the acquisition of aspect in French, Russian and English using the results of already available studies we would encounter a number of difficulties. Researchers may have collected different types of data and with different research methods, number of participants and age range of the children. There is a wide range of methods used in language acquisition research: experimental paradigms, structured elicitations using a uniform stimulus kit, picture identification and observations in naturalistic or laboratory contexts. Experiments are used to test what children can do both in production and comprehension in a specific context, but they raise methodological and practical issues for typological research. Experiments for investigating typological similarities and differences in acquisition patterns need to be equivalent across language groups, but this can be difficult for a number of reasons. For example, one experimental paradigm for research on very young children’s comprehension is the intermodal preferential looking paradigm (IPL) (Golinkoff et al. 1987). In this paradigm, children are simultaneously presented with two pictures and an auditory match for one of the pictures. It is assumed that if children understand the input they will look longer at the matching picture, although there are problems in interpreting what it is the children have actually understood. However, even though the design is relatively simple, the technical and practical prerequisites can be a challenge if one wants to conduct such an experiment in the field. For such an experiment an electricity supply is needed but is not always available. In addition, there needs to be a location where the experiment can be conducted without interruption from others.
This means that IPL testing is more or less restricted to the specific cultural context of technically advanced societies.
Any kind of data collection needs to be conducted in collaboration with a native speaker of the language, and for experimental or comparative research it needs to be conducted in a uniform context for all participants. In various cultures there can be difficulties in finding assistants who can deal with the experimental situation appropriately. Further, the instructions of the experiment need to be equivalent across languages, since any differences can bias the results considerably. Keeping the instructions constant is not a trivial task. For example, one language may have obligatory articles while another does not, which results in differences in the stimuli.
Another problem lies in developing stimuli that can be compared across languages. The use of picture prompts (or videos), for instance, presupposes that children of the culture are familiar with pictures or videos, but this may not be the case. The choice of stimuli can also introduce a bias, since familiarity with the stimuli can bear significantly on the results. Consider the acquisition of ergativity: if we want to compare its acquisition by children learning Quiche Maya (Guatemala), Warlpiri (Australia) and Inuktitut (Canada), we might have difficulties in finding stimuli that are equally common and appropriate in the three societies and ecosystems. As another example, if we want to test children’s understanding of transitivity by comparing Russian with English and other languages, we need to be aware that case marking of objects in Russian is different for masculine animate nouns than for masculine inanimate nouns; neuter nouns and feminine nouns have yet another ending. The researcher must decide which gender groups to use. If all gender/animacy combinations are included, the number of items to test will be large and the task may be too long for young children. However, restricting the stimuli to a single combination would make the data unrepresentative.
Thus it can be a challenge to control the conditions without biasing the results. It is less difficult to conduct an experiment across closely related languages and cultures than across unrelated languages or very different cultures. This does not mean that typological/crosscultural research is impossible, but researchers need to guard against introducing biases that are unrelated to the research questions.
Experimental designs tacitly assume that participants in different cultures understand the test situation in broadly the same way. However, an important point to keep in mind is that there are cultural differences. As Greenfield (1997) has argued, in order to use a test developed for one culture in another, the cultures must share values, knowledge and communication. For example, there needs to be agreement on the merit of particular responses to particular questions. In addition, we cannot assume a universal function of questions; testing a child on something for which we already know the answer may not be appropriate. Also, knowledge may be held jointly in some cultures, so it will not be culturally appropriate to test an individual; a group session would be more appropriate.
Further, the context for an experiment is always quite specific and does not necessarily translate to other linguistic contexts (Stoll 2005, Tardif et al. 1999) or to performance in general (Richards 1994). Depending on the exact design, the stimuli and the procedure, very different results can be obtained, as shown for instance by the varying results on the acquisition of the transitive construction in English (Abbot-Smith & Tomasello 2006).
The goal of longitudinal naturalistic acquisition studies is to gain a representative sample of the language of a child or a group of children and the linguistic context over a specific developmental period. These data constitute an important resource. The main advantage is that we obtain spontaneous speech samples. However, one of the problems is that the resources required are extensive. In addition, the time commitment is huge; data need to be transcribed, translated with glosses for morphemes and also coded so that patterns of development can be analysed. This requires the help of research assistants who are native speakers of the language.
There are several questions that need to be decided in developing such a project: How many children to record? With whom to record them? In which situations? At what time of the day? At what intervals? With or without observer? Are there siblings and will they be in the recording? Answers to these questions have a direct influence on the sample of speech obtained (Hoff-Ginsberg 1991). Four issues are of particular relevance. First, small samples make generalizations to the population problematic, especially since there is variability in how children develop (Bates et al. 1988, Lieven 1997, Lieven, Pine & Barnes 1992). With only a small sample, there is no way of knowing what the normal range of development is. Second, the density of sampling can influence the results. Since the frequency of occurrence of linguistic structures varies, the frequency of sampling influences the probability that a given linguistic feature will be encountered. Thus, if we are interested in a rarely occurring linguistic feature, we might severely overestimate the age of emergence just because our sample is not dense enough (Tomasello & Stahl 2004). Third, the situation in which the sampling occurs influences the data obtained (Hoff-Ginsberg 1991). Bornstein and colleagues (Bornstein et al. 2000, 2002) found that the recording situation strongly affects children’s output. Children acquiring English were more likely to produce longer utterances if they were recorded at a time that the mother judged would provide an optimal sample of speech than when, for example, the child played by herself with the mother nearby.
In order to make generalizations, we need to have an overall picture of the typical day of a child and choose contexts which best allow for comparisons across cultures. Fourth, the interpretation of the child data requires that we know how the output of the child correlates with the input of the caretakers (Stoll & Gries in press). In addition, we need methods for meaningfully comparing the data of children learning different languages, and these methods still need to be developed. This is an important task for future research.
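The effect of sampling density described above can be made concrete with a small calculation. The sketch below is an illustration only: it assumes that a structure occurs independently at a constant average rate per hour of the child's speech (a Poisson simplification introduced here, not a claim taken from the studies cited), and the function names and example rates are hypothetical.

```python
import math


def capture_probability(rate_per_hour: float, hours_recorded: float) -> float:
    """Probability of recording at least one instance of a structure.

    Assumes occurrences follow a Poisson process with a constant rate,
    which is a simplification made for this illustration.
    """
    if rate_per_hour < 0 or hours_recorded < 0:
        raise ValueError("rate and hours must be non-negative")
    return 1.0 - math.exp(-rate_per_hour * hours_recorded)


def expected_sessions_until_capture(rate_per_hour: float, hours_per_session: float) -> float:
    """Expected number of sessions until the structure is first recorded.

    With a per-session capture probability p, the waiting time is
    geometric, so its mean is 1 / p.
    """
    p = capture_probability(rate_per_hour, hours_per_session)
    if p == 0.0:
        return math.inf
    return 1.0 / p


if __name__ == "__main__":
    # Hypothetical rates: a frequent structure (5 per hour)
    # and a rare one (1 per 20 hours of speech).
    for label, rate in [("frequent", 5.0), ("rare", 0.05)]:
        for hours in (1.0, 5.0):
            p = capture_probability(rate, hours)
            print(f"{label:8s} rate={rate:.2f}/h, {hours:.0f} h recorded: "
                  f"P(at least one) = {p:.3f}")
```

Under these assumptions, a structure that occurs on average once in twenty hours of speech has less than a 5 per cent chance of appearing in a one-hour recording, so its first recorded use may lag well behind its actual emergence.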
6.5 Child Language Data Exchange System
An important source of data from a variety of languages is the Child Language Data Exchange System (CHILDES), a project developed in the early 1980s by Brian MacWhinney and Catherine Snow. CHILDES provides a series of tools for transcribing and analysing data in order to facilitate empirical language acquisition research. It hosts corpora on about thirty languages. English is the best represented language, with several corpora that are morphologically glossed. Three other languages, Irish (Guilfoyle), Sesotho (Demuth) and Indonesian (Gil), are represented by corpora that are translated and morphologically glossed, either for both child and interactors (Indonesian and Sesotho) or for the child only (Irish). In addition, CHILDES contains corpora of five languages that are glossed but not translated, and corpora of three languages that are translated but not glossed. The corpora of the remaining languages are transcripts only.
The lack of glossing and translation limits the way the data can be used, since quantitative analysis is then restricted to orthographically identifiable structures. For typological work, glossing and translations are required. Given the amount of resources needed to build up a transcribed, translated and glossed longitudinal corpus, it is clear why not all the corpora in CHILDES have been glossed and translated yet. Nevertheless, the data available help to make crosslinguistic and typological comparisons possible. The data are freely accessible to researchers, as are the tools for analysis.
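What an analysis restricted to orthographically identifiable structures looks like can be sketched in a few lines. The example assumes the CHAT-style layout in which utterance lines begin with an asterisk and a speaker code, dependent tiers begin with a percent sign and headers begin with an at sign. The mini-transcript is invented for this illustration and is not taken from any CHILDES corpus, and the function name is hypothetical.

```python
from collections import Counter

# Invented mini-transcript in a CHAT-like layout (not from any real corpus):
# "*XXX:" lines are utterances by speaker XXX, "%" lines are dependent tiers,
# and "@" lines are headers.
SAMPLE = """\
@Begin
*MOT:\twhere is the ball ?
*CHI:\tball .
%com:\tchild points at the ball
*CHI:\tball gone .
*MOT:\tyes the ball is gone .
@End
"""


def word_counts_by_speaker(transcript: str) -> dict[str, Counter]:
    """Count orthographic word forms on each speaker's utterance lines."""
    counts: dict[str, Counter] = {}
    for line in transcript.splitlines():
        if not line.startswith("*"):
            continue  # skip headers (@) and dependent tiers (%)
        speaker, _, utterance = line[1:].partition(":")
        # Keep only tokens that contain a letter, dropping punctuation marks.
        words = [w.lower() for w in utterance.split()
                 if any(ch.isalpha() for ch in w)]
        counts.setdefault(speaker, Counter()).update(words)
    return counts


if __name__ == "__main__":
    for speaker, counter in word_counts_by_speaker(SAMPLE).items():
        print(speaker, counter.most_common(3))
```

Counts of this kind treat each written form as an unanalysed string. Without a gloss tier they cannot, for example, separate a root from its affixes, which is why unglossed corpora are of limited use for typological comparison.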
6.6 Typological studies of language acquisition
Slobin (1997d) distinguished two major ways of conducting typological language acquisition studies according to how languages are sampled, calling them intra-typological and cross-typological. To avoid confusion with the term ‘crosslinguistic studies’, I will use the standard terms used in typology, namely intra-genealogical for studies which compare languages within language families and inter-genealogical for studies which investigate the acquisition of a feature across language families. Here I focus only on studies that were designed as typological studies, thus excluding studies that evaluate very different data sets.
6.6.1 Intra-genealogical studies
Since the grammars of closely related languages usually do not differ as strongly as the grammars of unrelated languages, we can hold constant several variables that might otherwise influence our results. Intra-genealogical studies (e.g. Smoczyńska 1985, Strömqvist et al. 1995) also constitute an important basis for inter-genealogical studies.
To illustrate how intra-genealogical studies operate, I present the findings of a recent study of early verb forms in five Mayan languages (Pye et al. 2007). The key feature of this study is that the same method of analysing longitudinal data is used in all five languages. The study starts from the observation that children learning Quiche, Q’anjob’al and Yukatek produce many more combinations of verb root plus suffixes than children learning Tzeltal and Tzotzil, who produce a high proportion of bare verb roots. Even though the morphology of the languages is similar, there are differences in the position of some affixes, such as the position of an affix that marks verb transitivity and mood, and there are other differences in the structure of the inflectional paradigms. These fine-grained differences make the comparison of early verb forms in these languages a natural experiment. The data for comparison are early verb forms occurring in natural speech and a sample of child-directed speech. A range of factors in the input were correlated with the use of bare verb forms in the children’s data. The factors include the frequency of verbs occurring without prefixes, verbs in sentence-initial position, the number of imperatives used, and what are called ‘right-edge factors’, that is, the frequency of occurrence of verb forms without suffixes at the right edge of a sentence. The main significant factor turns out to be the frequency with which adults produce verb forms at the right edge of words and sentences. The contexts in which the verb root can occur without an overt suffix vary significantly across the five languages. In Tzeltal and Tzotzil, verb roots can appear simultaneously at the right edge of the verb stem and the right edge of the sentence.
In the other three languages, the verb root occurs only at the right edge of the verb stem but not at the end of the sentence, because these languages have status suffixes that need to appear at the right sentence edge. The study shows that if the researchers had restricted their analysis to Tzeltal and Tzotzil, they would have concluded that children are drawn to the ‘semantic kernels’ of verbs. However, the results from Quiche, Yukatek and Q’anjob’al show that properties of the input explain why Tzeltal and Tzotzil children favour the extraction of verb roots (Pye et al. 2007). This study exemplifies how intra-genealogical studies can reach a high level of precision in testing variables in closely related languages.
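The logic of relating an input factor to children's forms across languages can be sketched as follows. The figures are invented placeholders for five unnamed languages and are not results from Pye et al.; the plain Pearson coefficient is used purely for illustration and is not necessarily the statistic used in that study. All names in the code are hypothetical.

```python
import math


def pearson_r(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient between two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant data")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical proportions per language (placeholders, NOT real measurements):
# share of adult verbs whose bare root stands at the right edge of the sentence,
# and share of bare roots among the children's early verb forms.
INPUT_RIGHT_EDGE_BARE = {"lang_A": 0.05, "lang_B": 0.08, "lang_C": 0.10,
                         "lang_D": 0.40, "lang_E": 0.45}
CHILD_BARE_ROOTS = {"lang_A": 0.10, "lang_B": 0.15, "lang_C": 0.12,
                    "lang_D": 0.60, "lang_E": 0.70}

if __name__ == "__main__":
    languages = sorted(INPUT_RIGHT_EDGE_BARE)
    r = pearson_r([INPUT_RIGHT_EDGE_BARE[lang] for lang in languages],
                  [CHILD_BARE_ROOTS[lang] for lang in languages])
    print(f"r = {r:.2f} across {len(languages)} languages")
```

With only five languages, a coefficient of this kind is at best suggestive. The value of the intra-genealogical design lies in the fact that the languages differ in few respects other than the factor under investigation.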
6.6.2 Inter-genealogical studies
In inter-genealogical studies, features are investigated independently of language families. Studies of this type range from small-scale studies of two languages to larger studies covering a number of languages. Such typological studies provide in-depth insights into how children acquiring different languages compare in the acquisition of a specific feature (e.g. Allen et al. 2006, Bowerman et al. 1995, Imai & Gentner 1997, Johnston & Slobin 1979). A key characteristic here is the justification for the choice of languages, which depends on the variables a researcher is interested in.
A discussion of Slobin’s typological study of motion verbs (Slobin 1997d), which is part of a larger typological study on narratives (Berman & Slobin 1994), illustrates this kind of research. The study was influenced by Talmy’s (1985) typology of the way languages code path and manner of movement. On the one hand, there are what he calls ‘verb-framed’ languages, which encode path in the verb and either leave out the manner of motion completely or express it in a complement (typically a gerund), e.g. Spanish salió (corriendo) ‘he exited (running)’. On the other hand, there are what Talmy calls ‘satellite-framed’ languages, where the verb root expresses manner of motion and particles (adpositions, adverbs) are used to express the path, e.g. She ran out of the house. In Slobin’s study the languages were chosen according to the way they express motion. The use of motion verbs was then investigated in a narrative experiment with a wordless picture book as a stimulus (Frog, Where are you?, Mayer 1969). The experiment was conducted with English-, German-, Spanish-, Turkish- and Hebrew-speaking children. In comparing the narratives of children learning verb-framed and satellite-framed languages, distinct styles emerged. English-speaking children, for instance, devoted more narrative attention to the dynamics of movement along a path because of the availability of verbs of motion that trace out detailed paths in relation to ground elements. This is shown in the number of different verb types used in English and Spanish: English-speaking children used many more verb types expressing motion than did Spanish-speaking children. Spanish speakers, by contrast, gave relatively more attention to static scene setting (Slobin 1997d).
This dichotomy was later extended to a third group of languages, in which manner and path are balanced across different parts of speech (Thai, Warlpiri and several other languages of different families, see Strömqvist & Verhoeven 2004). The inclusion of a wider range of languages helped to develop theories about linguistic categories and also about the acquisition of these categories.
6.7 Conclusions
The past few decades have seen considerable progress in the study of language acquisition across a wide range of languages, including some endangered languages such as Tzeltal, Tzotzil, Yukatek and Inuktitut. This research is a pressing task because more than half of the approximately 7,000 languages (and thus much of the world’s linguistic diversity) are severely endangered. Language acquisition research on little-known languages requires extensive collaboration with field linguists and social anthropologists. This makes typological language acquisition research resource-intensive. However, it is only by conducting such research that the diversity of human language and the effect of this diversity on language acquisition can be fully understood.
My warm thanks go to Edith Bavin, Balthasar Bickel, Gabriella Hermon, Elena Lieven and Dan Slobin for helpful comments.
1 In addition, there is a large number of sign languages (see Sandler & Lillo-Martin Reference Sandler and Lillo-Martin2006) but I limit myself in this chapter to spoken languages.
2 The languages represented include: English, German, Hebrew, Japanese, Kaluli, Polish, Romance languages (with particular emphasis on French), Turkish, ASL, Hungarian, Georgian, West Greenlandic, Quiche Maya, Warlpiri, Mandarin, Sesotho, Scandinavian languages, Finnish, Greek and Korean, as well as a comparison of Estonian, Finnish and Hungarian.
3 3nonsg.a = third person nonsingular agent, 1nonsg.p = first person nonsingular patient, neg = negative, pst = past, excl. = exclusive.