Developing early lexical composition in Mandarin-speaking children: A longitudinal study

This study investigated the developmental pattern of early lexical production and composition in Mandarin-speaking children. Forty Mandarin-speaking children and their parents participated in this one-and-a-half-year longitudinal study, and naturalistic samples of parent-to-child speech in toy play were collected when the children were 1;8, 2;2, and 3;0. The results showed that children's lexical production increased significantly between ages 1;8 and 3;0. The proportion of closed-class words increased significantly with age, whereas the proportion of common nouns showed the inverse pattern, indicating that the role of grammatical words increased as the children grew. Furthermore, nouns and verbs were predominant in Mandarin-speaking children between ages 1;8 and 3;0, and Mandarin-speaking children used more verbs than nouns at 2;2 and 3;0 in the toy play context. This longitudinal study clarifies early lexical development in Mandarin-speaking children and provides a valuable contrast for different language systems.

One of the important issues in early language acquisition research is the developmental pattern of the early lexicon and whether the developmental patterns in children are the same across languages. Numerous studies of children's early lexical development in a variety of languages have reported that children typically produce their first words around the end of their first year, and that after a few months, they experience a rapid growth in vocabulary, the so-called "vocabulary spurt" (Caselli, Bates, Casadio, Fenson, Fenson, Sanderl, & Weir, 1995; Choi & Gopnik, 1995; D'Odorico, Carubbi, Salerni, & Calvo, 2001; Goldfield & Reznick, 1990; Hao, Shu, Xing, & Li, 2008; Nelson, 1973; Tardif et al., 2009). Many researchers have reported noun dominance in early lexical acquisition. For example, Nelson's (1973) study of English-speaking children revealed that nominals were the most common category. Similarly, Goldfield and Reznick's (1990) study of early lexical acquisition of English-speaking children from 1;2 to 1;10 found that almost three-quarters of the vocabulary added in this period comprised nouns. Several studies have reported noun dominance of early vocabulary in the case of English (Bates et al., 1994; Bornstein, Cote, Maital, Painter, Park, Pascual, Pêcheux, Ruel, Venuti, & Vyt, 2004; Caselli et al., 1995; Gentner, 1982; Goldfield & Reznick, 1990), Italian (Bornstein et al., 2004; Caselli et al., 1995), and German (Gentner, 1982). However, more recent studies of Korean (Choi & Gopnik, 1995), Japanese (Ogura, Dale, Yamashita, Murase, & Mahieu, 2006), and Mandarin (Tardif, 1996) have challenged the universality of a noun bias. For example, Choi and Gopnik's (1995) study investigated lexical development of nine Korean children from 1;2 to 1;10 using maternal reports, and their findings revealed that six of the nine children showed the first verb spurt before the first noun spurt. Tardif's (1996) study of 22-month-old Mandarin-speaking children also demonstrated that most of the children in the study produced more verbs than nouns in their spontaneous speech. These studies revealed that nouns were not always predominant over the course of early lexical acquisition, and the use of more verbs than nouns was observed in early lexical development. The question of whether children demonstrate a universal noun bias across languages in early lexical acquisition has aroused researchers' interest, and the possible factors influencing the apparent noun or verb dominance in children's early lexical development have received extensive discussion and investigation.

The developmental changes of early lexical composition
In the last three decades, a growing number of studies have explored the development of early lexical composition, including nouns, verbs, adjectives, adverbs, prepositions, and conjunctions. Studies of lexical composition exploring the distribution of the semantic and syntactic categories of words have provided valuable information about the development of early language acquisition and the relationship between semantic and grammatical development. Previous studies of early lexical development have shown that children's lexical composition changes with an increase in vocabulary or age (Bassano, 2000; Bassano, Eme, & Champaud, 2005; Bates et al., 1994; Caselli et al., 1995; Day & Elison, 2022; D'Odorico et al., 2001; Hao et al., 2015; Kauschke & Hofmeister, 2002; Liu & Chen, 2015; Liu, Zhao, & Li, 2008; Yang, 2015). For example, Bates et al. (1994) conducted a developmental study of early lexical composition using parental reports in English-speaking children aged between 0;8 and 2;6. In their study, they analyzed lexical composition using three variables: the proportions of common nouns, predicates (verbs and adjectives), and closed-class items (pronouns, prepositions, question words, quantifiers, articles, auxiliary verbs, and connectives). Their findings revealed that the proportion of common nouns increased significantly and occupied the highest proportion of total vocabulary in the period from 1 to 100 words; verbs and other predicates increased slowly at first, but showed the greatest gains between 100 and 400 words; for closed-class words, there was less proportional development between 0 and 400 words, followed by a sharp increase after 400 words. Caselli et al. (1995) analyzed the lexical composition of English- and Italian-speaking children between ages 0;8 and 1;4 using parental reports, and found that in both languages common nouns predominated and grew rapidly when children learned their first 50 to 100 words, while verbs, adjectives, and grammatical function words were extremely rare until children had vocabularies of at least 100 words. Bates et al.'s (1994) and Caselli et al.'s (1995) results indicate that early lexical composition shifts in emphasis from common nouns to predicates to grammatical function words as children's vocabulary increases.
Several longitudinal studies of early lexical development based on naturalistic speech samples have pointed out that nouns are not always predominant over the course of early lexical acquisition (Kauschke & Hofmeister, 2002; Ogura et al., 2006; Tardif, 1996). For instance, Kauschke and Hofmeister (2002) conducted a longitudinal study of early lexical development on German children between ages 1;1 and 3;0 using spontaneous speech, and found that relational words and personal-social words were predominant during the first half of the second year of life. Moreover, their findings showed that children used more verbs than nouns when they were 3;0. Ogura et al. (2006) investigated the use of nouns and verbs by Japanese children between ages 1;0 and 2;0 using spontaneous speech. Their study was conducted on children who were similar in age to those in the study by Bates et al. (1994), but they further grouped children by grammatical stage. Their findings revealed that the verb types used in the toy play context increased from the single-word stage to the syntactic stage, and more verb types than noun types were used at the syntactic stage according to the narrower definition of nouns, which excluded people and proper nouns. In addition, Ogura et al. (2006) found that, under the narrower definition of nouns, there were more children with verb dominance than with noun dominance in the toy play context at the syntactic stage. Ogura et al.'s (2006) findings, which differ in some respects from those of Bates et al. (1994), showed that noun dominance was not persistent, and that children used more verbs than nouns in a particular context and grammatical stage. Both Kauschke and Hofmeister's (2002) and Ogura et al.'s (2006) results indicated that noun dominance was not constant across early lexical development and suggested that children's lexical composition changes with age or the emergence of grammar.
Many studies of early lexical composition using parental reports have indicated that there was an apparent noun dominance and that the acquisition of verbs was universally later than that of nouns in children's early production (Bates et al., 1994; Caselli et al., 1995; Hao et al., 2015; Liu & Chen, 2015). However, some studies on early vocabulary development based on naturalistic speech showed that children used more verbs than nouns in their early lexical development (Ogura et al., 2006; Tardif, 1996). Regarding Mandarin-speaking children, Liu and Chen (2015) and Hao et al. (2015) investigated their early lexical development using parental reports and found that they had a greater proportion of nouns than other word categories. However, Tardif (1996) reported that 9 out of 10 twenty-two-month-old Mandarin-speaking children produced more verbs than nouns in their naturalistic speech. Previous studies have indicated that although parent-report and observational measures were significantly correlated, there were also systematic quantitative differences between these two measures (Pine, Lieven, & Rowland, 1996; Tardif, Gelman, & Xu, 1999). Compared to parental reports, spontaneous speech samples may provide more fine-grained measures of children's early vocabulary and usage, which concurrently present more semantic, syntactic, and pragmatic properties of children's language production. Another issue of interest to researchers is the commonality or variability of early lexical development (Bassano et al., 2005; Bates et al., 1994). Bassano et al.'s (2005) study of early lexical development in French children between ages 1;8 and 3;3 using spontaneous speech showed that inter-individual variability in lexical composition was greater at 1;8 than at 2;6 and 3;3, which indicated that individual variation decreased in the course of the third year. However, research has less frequently explored the core lexicons shared by the majority of children and whether they change with age, and these issues remain unclear in Mandarin-speaking children.

Possible influencing factors of lexical composition
As mentioned above, the stage of language development may be an influencing factor of lexical composition. Lexical composition changes with vocabulary growth and the emergence of grammar (Bates et al., 1994; Caselli et al., 1995; Kauschke & Hofmeister, 2002; Ogura et al., 2006; Trudeau & Sutton, 2011). A high interdependence between lexical and grammatical skills across a variety of languages has been reported in previous studies (Caselli, Casadio, & Bates, 1999; Chi, 2002; Dixon & Marchman, 2007; Trudeau & Sutton, 2011; Viana, Pérez-Pereira, Cadime, Silva, Santos, & Ribeiro, 2017). Trudeau and Sutton (2011) investigated expressive vocabulary and early grammar of 16- to 30-month-old children using parental reports. Their results demonstrated that almost 40% of children combined words at 1;8 and nouns occupied the highest proportion of total vocabulary in this period. As the children grew, the frequency of the use of multi-word utterances increased, and the proportions of verbs and grammatical function words used by children also increased. Similarly, Parisse and Le Normand's (2000) study investigated the morphosyntax produced by two-year-old French-speaking children using spontaneous speech. Their findings showed that children primarily use content words in their first word combinations, and then progressively introduce function words into their utterances to produce more complex structures. The findings of these studies indicate that the proportions of different word categories change from the single-word stage to the syntactic stage. However, to the best of our knowledge, longitudinal studies using spontaneous speech to investigate lexical composition from the single-word stage to the syntactic stage in Mandarin-speaking children are relatively scarce.
Language systems may be another factor that influences early lexical composition. Choi and Gopnik (1995) investigated children's early lexical development in English and Korean using parental reports. They found that among Korean-speaking children, both verbs and nouns are dominant categories from the single-word stage. Furthermore, six of the nine Korean-speaking children studied showed the first verb spurt before their first noun spurt, but no such early verb spurt was found in the English-speaking children. Similarly, Tardif et al. (1999) compared the proportions of nouns and verbs in the early vocabularies of English- and Mandarin-speaking toddlers, and found that Mandarin-speaking toddlers used relatively fewer nouns and more verbs than English-speaking toddlers.
Mandarin syntax has some typological characteristics that differ from those of other languages. First, Mandarin is an isolating language. There is very little morphological complexity in Mandarin. Some of the types of morphemes that many languages have are not found in Mandarin. For example, Mandarin pronouns do not have case markers, and Mandarin verbs have no tense markers. Second, Mandarin is a pro-drop language, permitting omission of the subject and/or the object, and is a topic-comment language with more flexible SVO word order than English (Hsu, 1996; Li & Thompson, 2009). The linguistic features of Mandarin, such as lack of tense, the phenomenon of pronoun-dropping, and topic prominence, may explain why more verbs were used by Mandarin-speaking children than by English-speaking children (Tardif, 1996; Tardif, Shatz, & Naigles, 1997; Tardif et al., 1999).
Past research on the early lexical development of Mandarin-speaking children has mostly focused on children aged one to two years (Tardif, 1996; Tardif et al., 1997, 1999; Tardif, Fletcher, Liang, Zhang, Kaciroti, & Marchman, 2008) and used parental reports (Chi, 2002; Hao et al., 2008, 2015; Liu & Chen, 2015; Tardif et al., 2008, 2009). Although there have been a few corpus-based studies of early lexical development in Mandarin-speaking children (Liu et al., 2008; Yang, 2015), most used a cross-sectional design. Only Hsu (1996) used both cross-sectional and longitudinal data. The sample size of each age group in his study, however, was relatively small (n = 12). Longitudinal studies using spontaneous speech and following the same group of Mandarin-speaking children to investigate their lexical development from learning their first words to the production of two-word or even longer sentences are relatively scarce. How does lexical composition change from the single-word stage to the syntactic stage in Mandarin-speaking children? Do the changes in lexical composition follow a universal pattern or a language-specific pattern? Does noun or verb dominance appear during this period? These questions remain unclear and require further longitudinal studies.

Goals of the present study
In this study, we investigate the changes in lexical production and composition in Mandarin-speaking children between ages 1;8 and 3;0 using spontaneous speech samples in order to determine the developmental pattern of early lexical production in Mandarin-speaking children. Three specific goals of this study are: (1) to examine the developmental pattern of early lexical production and composition in Mandarin-speaking children between ages 1;8 and 3;0; (2) to examine whether there is noun or verb dominance in Mandarin-speaking children between ages 1;8 and 3;0; and (3) to identify which words are shared by the majority of the children (core lexicons of children) and which lexical categories the core lexicons belong to.

Method
Participants
Forty 20-month-old children (23 boys, 17 girls) and their parents living in northern Taiwan participated in this one-and-a-half-year longitudinal study. All participating children were first-borns in middle-class families and were typically developing. One child whose mother spoke English with her in the toy play context was excluded from this study. All children in this study spoke Mandarin Chinese as their first language. They were immersed in a Mandarin environment, and bilingual education was not provided during the period of data collection. The mean ages of the mothers and fathers were 32.40 (SD = 3.54) and 34.55 (SD = 3.76) years, respectively. Most participating mothers (75.0%) and fathers (67.5%) had at least a college-level education.

Procedures and data collection
Parents were briefed about the research purpose and procedures, and signed consent was obtained before participation in the study. All children and their parents were visited at home when the children were 1;8 (Time 1), 2;2 (Time 2), and 3;0 (Time 3). Parents were instructed to play with their children as they usually did at home during each visit, and a set of toys, including puppets, jigsaw puzzles, and Legos, was provided for children and parents to play with. Parent-child interactions during play were videotaped, and approximately 15 minutes of each videotaped session were included in the analyses, following previous studies (Ogura et al., 2006; Southwood & Russell, 2004).

Coding and analyses
Language production in parent-child interactions during play was transcribed and analyzed using the Child Language Data Exchange System (CHILDES; MacWhinney, 2000; MacWhinney & Snow, 1990). All transcripts were coded by a group of graduate research assistants majoring in child development using the CHAT format of the CHILDES (MacWhinney, 2000), and the utterances and words in the transcripts were segmented based on the Taiwan Corpus of Child Mandarin (TCCM) word segmentation rules (Cheung, Chang, Ko, & Tsay, 2011). The TCCM word segmentation standard is modeled mainly on the Chinese National Standard (CNS) of segmentation principles for Chinese literature processing, with some revisions made in consideration of the characteristics of children's language development. For example, negative words such as mei2you3 'not' and bu2yao4 'no' are each treated as a single segmentation unit. In addition, dao4 'arrive', as a postverbal directional verb complement, and diao4 'drop', as a postverbal resultative verb complement, are each treated as a segmentation unit separate from the preceding verb.
After utterance and word segmentation, the MOR command of the CLAN programs of the CHILDES (MacWhinney, 2000) was used to automatically conduct morphosyntactic analyses and assign a grammatical label (part of speech) to each word of the transcripts. Although the MOR command could tag most Chinese words correctly, some polysemous words that belong to different lexical categories in different contexts were easily confused and tagged incorrectly. In these cases, we checked and tagged the words manually according to the context in which they occurred.
The measures of early language production included the number of total utterances, total number of words (word tokens), total number of different words (word types), mean length of utterance (MLU), and frequencies and proportions of word types in 12 word classes and three word categories. The 12 word classes included common nouns (such as dian4hua4 'phone', shou3 'hand', and mian4bao1 'bread'), main verbs (such as wan2 'play' and chi1 'eat'), adjectives (such as da4 'big' and hong2 'red'), pronouns (such as wo3 'I', ni3 'you', and zhe4 'this'), prepositions (such as zai4 'at', yong4 'with', and gei3 'for'), conjunctions (such as hai2you3 'as well as' and ran2hou4 'then'), quantifiers (such as hen3duo1 'a lot' and xie1 'some'), question words (such as zen3me 'how', shen2me 'what', and na3li3 'where'), adverbs (such as bu4 'not', dou1 'all', and hao3 'very'), numbers (such as yi1 'one' and er4 'two'), classifiers (such as ge4 used for people or individual things, ke1 used for small or spherical objects, and zhi1 used for long or sticklike objects), and sentence-final particles (such as le and a1). Of these 12 word classes, 8 were further grouped into three word categories for subsequent analyses of lexical composition, based on previous studies (Bates et al., 1994; Hao et al., 2015; Liu & Chen, 2015): common nouns, predicates (main verbs and adjectives), and closed-class words (pronouns, prepositions, conjunctions, quantifiers, and question words) (see Table 1). Prior research indicated that definitions of nouns and verbs affected the presence and extent of a noun or verb bias (Tardif, 1996). To permit comparison with previous studies, two methods of counting nouns and verbs were used to examine noun or verb dominance in Mandarin-speaking children in the study. One used a stricter definition, considering only common nouns as nouns and main verbs as verbs. The other used a broader definition; that is, nouns included common nouns and proper names, and verbs included all verbs (main verbs, copular verbs, modal auxiliary verbs, directional verbs, and resultative verbs).
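The proportion measures described above can be illustrated with a minimal Python sketch. It is not the CLAN-based procedure used in the study; the class labels, the function name `type_proportions`, and the sample tokens are hypothetical, and a word type is assumed here to be a distinct (word, word class) pair.

```python
# Grouping of 8 of the 12 word classes into the three word categories,
# as described in the text. The class labels are illustrative strings,
# not the tag names produced by CLAN's MOR command.
CATEGORY_OF = {
    "common_noun": "common nouns",
    "main_verb": "predicates",
    "adjective": "predicates",
    "pronoun": "closed-class words",
    "preposition": "closed-class words",
    "conjunction": "closed-class words",
    "quantifier": "closed-class words",
    "question_word": "closed-class words",
}


def type_proportions(tagged_tokens):
    """Return (class_props, category_props) computed over word types.

    tagged_tokens: iterable of (word, word_class) pairs for one child at
    one time point. Treating a distinct (word, word_class) pair as one
    word type is an assumption of this sketch.
    """
    types = set(tagged_tokens)  # repeated tokens collapse into one type
    total = len(types)
    if total == 0:
        return {}, {}
    class_counts = {}
    category_counts = {}
    for _word, word_class in types:
        class_counts[word_class] = class_counts.get(word_class, 0) + 1
        category = CATEGORY_OF.get(word_class)
        if category is not None:  # the other 4 classes belong to no category
            category_counts[category] = category_counts.get(category, 0) + 1
    class_props = {c: n / total for c, n in class_counts.items()}
    category_props = {c: n / total for c, n in category_counts.items()}
    return class_props, category_props


# Hypothetical tagged tokens for a single play session.
sample = [
    ("wan2", "main_verb"), ("wan2", "main_verb"), ("chi1", "main_verb"),
    ("shou3", "common_noun"), ("da4", "adjective"), ("zhe4", "pronoun"),
    ("le", "sentence_final_particle"), ("bu4", "adverb"),
]
class_props, category_props = type_proportions(sample)
```

In this sketch the denominator is always the number of all word types, so the three category proportions need not sum to one: adverbs, numbers, classifiers, and sentence-final particles count toward the total but fall outside the three categories.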
Furthermore, CLAN programs were used to extract the core lexicons of children's language production and examine the word categories in which the core lexicons are distributed. Through these analyses, we sought to determine which words and word categories are shared by the majority of the children and to investigate their developmental patterns. The core lexicons were defined as the lexical items used by over 30 children (75%) in the present study.
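The core-lexicon criterion can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the CLAN procedure itself: the function name `core_lexicon`, the child identifiers, and the toy word lists are hypothetical, and "over 30 children" is read as strictly more than the threshold.

```python
from collections import Counter


def core_lexicon(words_by_child, more_than):
    """Return the sorted word types used by more than `more_than` children.

    words_by_child: mapping from a child identifier to the words that
    child produced at one time point.
    """
    users = Counter()
    for words in words_by_child.values():
        users.update(set(words))  # count each child at most once per word
    return sorted(word for word, n in users.items() if n > more_than)


# Hypothetical toy data for four children (the study had 40 children and
# a threshold of 30); the word lists are invented for illustration.
toy = {
    "child_a": ["zhe4", "yao4", "zhe4"],
    "child_b": ["zhe4", "yao4"],
    "child_c": ["zhe4", "che1"],
    "child_d": ["zhe4", "yao4", "che1"],
}
shared = core_lexicon(toy, more_than=2)
```

Counting each child once per word, rather than counting tokens, is what makes the measure reflect how widely a word is shared instead of how often it is said.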
Statistical analyses were conducted using IBM SPSS software (version 23.0; IBM Corp., USA). The distribution of the data was evaluated using the Kolmogorov-Smirnov test for normality, which showed that most of the measures were not normally distributed. Therefore, the non-parametric Friedman test was used to examine the changes over time in the total number of utterances, word tokens (total number of words), word types (total number of different words), MLU (mean length of utterance), and the frequencies and proportions of word types in the 12 word classes and three word categories. If there was a statistically significant difference, the Wilcoxon test with Bonferroni correction (adjusted significance level: p = .05/3 = .017) was used for post hoc analysis. The Wilcoxon test was also used to examine whether there was a significant difference between the proportions of nouns and verbs in the analyses of noun or verb dominance under different counting methods. A chi-square test was used to compare the differences in the numbers of children who demonstrated different patterns of dominance.
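For readers who wish to see how these statistics are formed, the following Python sketch computes a Friedman statistic, the Bonferroni-adjusted significance level, and a chi-square statistic for counts of children across dominance patterns. It is a simplified illustration and not the SPSS procedure used in the study: the Friedman function omits the tie correction that statistical packages typically apply, the chi-square function assumes equal expected counts across patterns, and all function names and input values are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1 to k, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for i in order[pos:end + 1]:
            ranks[i] = mean_rank
        pos = end + 1
    return ranks


def friedman_statistic(rows):
    """Friedman chi-square for n children x k time points (no tie correction)."""
    n, k = len(rows), len(rows[0])
    rank_sums = [0.0] * k
    for row in rows:
        for j, rank in enumerate(average_ranks(row)):
            rank_sums[j] += rank
    return 12.0 / (n * k * (k + 1)) * sum(s * s for s in rank_sums) - 3.0 * n * (k + 1)


def bonferroni_alpha(alpha, n_comparisons):
    """Per-comparison significance level after Bonferroni correction."""
    return alpha / n_comparisons


def chi_square_equal_split(counts):
    """Chi-square statistic against equal expected counts per pattern."""
    expected = sum(counts) / len(counts)
    return sum((c - expected) ** 2 / expected for c in counts)


# Hypothetical inputs: four children whose scores rise at every time point,
# and invented counts of noun-dominant, verb-dominant, and balanced children.
friedman_value = friedman_statistic([[10, 20, 30], [5, 9, 14], [7, 8, 12], [3, 6, 9]])
adjusted_alpha = bonferroni_alpha(0.05, 3)
chi_square_value = chi_square_equal_split([20, 13, 7])
```

With three time points there are three pairwise comparisons, which is why the adjusted level is .05/3; each statistic would still need to be compared against the chi-square distribution with the appropriate degrees of freedom to obtain a p value.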

Results
Children's early language production and lexical composition
First, we analyzed children's basic language measures during play with their parents. The descriptive statistics and Friedman test results for the total number of utterances, word tokens, word types, and MLU at 1;8, 2;2, and 3;0 are displayed in the corresponding table. Furthermore, post-hoc analyses revealed significant increases in both lexical and grammatical indices, including word tokens, word types, and MLU, between ages 1;8 and 2;2 as well as between ages 2;2 and 3;0. Second, we analyzed the frequencies and proportions of word types in each of the 12 word classes across age to depict the developmental pattern of lexical composition in Mandarin-speaking children. The proportion is the total number of word types in each word class divided by the total number of all word types (e.g., proportion of verbs = verb types/all word types). Regarding the frequencies of the 12 word classes, the frequencies of all word classes except quantifiers (χ²(2, n = 40) = 4.000, p > .05) increased significantly with age. In addition, post-hoc analyses revealed that all word classes except common nouns, adjectives, and quantifiers increased significantly between ages 1;8 and 2;2 as well as between ages 2;2 and 3;0, whereas common nouns and adjectives only showed a significant increase between ages 1;8 and 2;2 (common nouns: Z = -4.288, p < .001; adjectives: Z = -4.994, p < .001), but not between ages 2;2 and 3;0 (common nouns: Z = -1.494, p > .05; adjectives: Z = -1.820, p > .05) (see Table 3).
Regarding the proportions of the 12 word classes, at 1;8, common nouns and main verbs, which were the two largest classes, accounted for 20.3% and 15.8% of the children's total production vocabulary, respectively. The median proportions of adjectives and adverbs were 4.7% and 2.5%, respectively. As the children grew, the proportion of common nouns decreased significantly (χ²(2, n = 40) = 6.453, p < .05), while the other word classes except adjectives and quantifiers increased significantly with age. The proportions of quantifiers and adjectives increased and decreased with age, respectively, but the changes did not reach statistical significance (quantifiers: χ²(2, n = 40) = 4.000, p > .05; adjectives: χ²(2, n = 40) = 2.730, p > .05) (see Table 4).
Furthermore, the distributions of the proportions of the three word categories were also examined. Common nouns and predicates were the two largest categories at 1;8, accounting for 20.3% and 20.0% of the total productive vocabulary, respectively. Compared to common nouns and predicates, closed-class words were less frequent at 1;8. As the children grew, the proportion of common nouns decreased significantly with age (χ²(2, n = 40) = 6.453, p < .05), while the proportion of closed-class words increased (χ²(2, n = 40) = 54.650, p < .001). In contrast to these two categories, which changed significantly with age, the median proportion of predicates remained at roughly 20% to 27%, with less change (χ²(2, n = 40) = 2.450, p > .05) (see Figure 1).

Noun and verb dominance in Mandarin-speaking children across age
We examined whether there was noun or verb dominance in Mandarin-speaking children between ages 1;8 and 3;0 in the toy play context. As shown in Table 5, regardless of whether one uses a stricter or a broader definition of nouns and verbs, Mandarin-speaking children used slightly more nouns than verbs at 1;8, but this did not reach statistical significance (stricter definition: Z = -1.394, p > .05; broader definition: Z = -0.248, p > .05). However, as the children grew, the proportion of verbs was significantly higher than that of nouns at 2;2 (stricter definition: Z = -3.047, p < .01; broader definition: Z = -4.731, p < .001) and 3;0 (stricter definition: Z = -5.135, p < .001; broader definition: Z = -5.511, p < .001) (see Figure 2). Furthermore, we examined whether there were significant differences in the numbers of noun-dominant, verb-dominant, and balanced children. Under the stricter definition of nouns and verbs, including only common nouns and main verbs, the number of children who produced more nouns than verbs at 1;8 was significantly larger than the number who produced more verbs than nouns and the number who produced equal numbers of nouns and verbs (χ²(2, n = 40) = 6.350, p < .05). However, there was an inverse pattern at 2;2 and 3;0, such that the number of children who produced more verbs than nouns was significantly larger than the number who produced more nouns than verbs and the number who produced equal numbers of nouns and verbs (at 2;2: χ²(2, n = 40) = 21.350, p < .001; at 3;0: χ²(2, n = 40) = 25.600, p < .001). Similarly, under the broader definition of nouns and verbs, the number of children who produced more verbs than nouns was significantly larger than the number who produced more nouns than verbs and the number who produced equal numbers of nouns and verbs at 1;8, 2;2, and 3;0 (at 1;8: χ²(2, n = 40) = 6.650, p < .05; at 2;2: χ²(2, n = 40) = 22.500, p < .001) (see Figure 3).

Core lexicons of children's language production in the toy play context
We extracted the core lexicons of children's language production and examined which words and lexical categories were shared by the majority of the children in the toy play context. As shown in Table 6, at 1;8, there were no words used by more than 30 children.
As the children grew, their core lexicons mainly comprised predicates and closed-class words.In addition, some words that did not belong to the three main lexical categories, such as bu2yao4 'no' and bu4 'not' among negative words, ge4 among classifiers, and le and a1 among sentence-final particles, were found to be shared by the majority of the children in the toy play context at 2;2 and 3;0.

The developmental changes of early lexical composition
The first goal of the present study was to examine the developmental pattern of early lexical composition in Mandarin-speaking children by analyzing how words were distributed across the three word categories and across the 12 word classes. The results showed that at 1;8, the majority of children's lexicons comprised common nouns and predicates, with closed-class words being less frequent. As the children grew, the proportion of predicates predominated over that of common nouns when the children were 2;2 and 3;0. Previous studies of lexical composition using parental reports demonstrated that the proportion of common nouns is higher than that of predicates, which is inconsistent with our findings (Bates et al., 1994; Caselli et al., 1999; Hao et al., 2015; Liu & Chen, 2015).

Table 6. Core Lexicons of Children's Language Production at 1;8 (Time 1), 2;2 (Time 2), and 3;0 (Time 3)
Word category | 1;8 (Time 1) | 2;2 (Time 2) | 3;0 (Time 3)
Common nouns | - | - | -
Closed-class words | - | zhe4 (this) | zhe4 (this), yi1 (one)
The discrepancy between the findings of this study and those of prior research might lie in three factors, i.e., data collection methods, activity contexts, and language systems. Studies based on observational and parent-report measures of lexical composition tend to generate different results. For example, Bates et al. (1994) analyzed the lexical composition of English-speaking children between ages 1;4 and 2;6 using parental reports and found that common nouns predominated over predicates. Similar findings were also reported in studies of the lexical composition of Mandarin-speaking children based on parental reports (Hao et al., 2015; Liu & Chen, 2015).
However, past studies of early lexical development based on spontaneous speech demonstrated a more even distribution between nouns and verbs, and some of them even found that children produce more verbs than nouns at an early stage (Bassano, 2000; Ogura et al., 2006; Tardif, 1996; Tardif et al., 1999). Pine et al. (1996) investigated the relationship between observational and parent-report measures of lexical composition. They found that, in parent-report measures, common nouns made up over 50% of the words in the lexicons of 80% of children; in observational measures, however, common nouns made up over 50% of the words for only around 20% of children, which supports the existence of differences between observational and parent-report measures of lexical composition. Tardif et al. (1999) examined the consistency and relative biases of observational and parent-report measures in English- and Mandarin-speaking children. Their findings revealed that for both the English- and Mandarin-speaking children, the proportion of nouns that were produced by children and reported by their parents was higher than that of verbs; in addition, the proportion of verbs that were produced by children but not reported by parents was higher than that of nouns. The differences in memory accuracy for different parts of speech may be a reason for the discrepancy between observational and parent-report measures of lexical composition. Furthermore, compared to parental reports, spontaneous speech samples concurrently present semantic, syntactic, and pragmatic properties of children's language production, meaning that lexical composition based on spontaneous speech is more strongly influenced by the structure of sentences used by children. The properties of spontaneous speech may also create the differences between observational and parent-report measures of lexical composition.
The second possible explanation for this discrepancy may be the activity contexts. Previous studies have suggested that activity contexts affect several aspects of children's speech (Hoff-Ginsberg, 1991; Lucariello & Nelson, 1986; Ogura et al., 2006; Tardif et al., 1999). Tardif et al. (1999) and Ogura et al. (2006) indicated that children use more verbs in the toy play context than in the book-reading context. The present study collected spontaneous speech samples in the toy play context, which may be the reason for the higher proportion of predicates.
The third possible explanation for this discrepancy may be the language systems. Mandarin is a pro-drop language; therefore, nouns are frequently omitted. Hsu (1996) investigated early language development in Mandarin-speaking children and found that they produce combined words and simple sentences at around 1;5 to 1;10. In that study, most of the children at the age of 1;5 were able to combine words to form phrases and simple sentences, and most of the structures were imperatives or verb phrases with the subject deleted. Moreover, Mandarin verbs do not carry tense markings and do not have subject-verb agreement. These morphosyntactic properties may make verbs easier for Mandarin-speaking children to acquire or use. Tardif et al. (1999) compared the proportions of nouns and verbs in the early lexicons of English- and Mandarin-speaking toddlers and found that Mandarin-speaking children used fewer nouns and more verbs than English-speaking children, which supports the view that the language system may be an influencing factor in lexical composition.
Regarding developmental changes in lexical composition, the results demonstrated that the proportion of common nouns decreased significantly with age, and the proportions of predicates and closed-class words increased, which was similar to the developmental trajectories found in other languages (Bassano et al., 2005; Bates et al., 1994; Caselli et al., 1999; Kauschke & Hofmeister, 2002). For example, Bates et al. (1994) analyzed the lexical composition of English-speaking children using parental reports and found that the proportion of common nouns increased from 0 to 200 words, followed by a proportional decrease after 200 words. In Bates et al.'s study, the median number of words attained by children at 1;8 was around 170, and our study of children aged between 1;8 and 3;0 demonstrates that the proportion of common nouns decreased with age, which is consistent with Bates et al.'s (1994) findings. In addition, Bates et al. (1994) revealed that predicates showed a steady increase, while closed-class words grew markedly after 400 words, which indicates that the development of closed-class words may depend on a certain critical mass of nouns, verbs, and other content words. Similarly, previous studies based on spontaneous speech have also demonstrated that the proportion of nouns decreases, while the proportions of predicates and grammatical function words increase in early lexical acquisition (Bassano et al., 2005; Kauschke & Hofmeister, 2002).
Furthermore, the present study found that Mandarin-speaking children used prepositions, conjunctions, question words, classifiers, and sentence-final particles as early as 1;8, and that the use of these word classes increased significantly between ages 1;8 and 3;0. According to past research on grammatical development in Mandarin-speaking children, children first begin to use two-word combinations around one and a half to two years of age, then combine three or more words to compose short sentences, and then progress to complex or compound sentences (Cheng, 1986; Chi, 2002; Hsu, 1996; Liu & Tsao, 2010). Thus, the use of grammatical function words, such as conjunctions and prepositions, may increase as the length and complexity of sentences increase. The present study showed that the median MLU of children's language production increased from 1.4 to 3.0 between ages 1;8 and 3;0, which indicates that the children used longer and more complex sentences with age. This developmental process from the single-word stage to the syntactic stage may increase the use of grammatical words and induce changes in lexical composition. Children's early language development takes place not only at the level of vocabulary but also at that of grammar, which advances with age. Lexical and grammatical development are two important components of early childhood language development, and they are closely related and not easily separable. Naturalistic speech samples may provide more information about the links between lexicon and grammar.

The use of nouns and verbs in the toy play context across age
The second goal of the present study was to investigate whether noun or verb dominance exists in Mandarin-speaking children during the period between 1;8 and 3;0. The results indicated that children tended to use more nouns than verbs at 1;8. However, as the children grew, they used significantly more verbs than nouns in toy play at 2;2 and 3;0.
The stage of grammatical development may influence the use of nouns and verbs. Bassano (2000) investigated the development of noun and verb word classes in the free speech of a French child between the ages of 1;2 and 2;6, and found that nouns significantly predominated over verbs in the child's productions, both in types and in tokens, until 1;8. From around 2;0 onward, however, verb types equaled noun types and verb tokens surpassed noun tokens in the child's productions. Similar results were reported by Ogura et al. (2006), who analyzed the use of nouns and verbs in the spontaneous speech of Japanese-speaking children and their caregivers, and found a shift from noun dominance to verb dominance in the toy play context from the single-word stage to the syntactic stage. The present study is consistent with Bassano's (2000) and Ogura et al.'s (2006) studies, suggesting that noun or verb dominance may change with the stage of grammatical development.
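The distinction between counting by types and counting by tokens, on which Bassano's (2000) comparison rests, can be illustrated with a minimal sketch. The function name, the tag labels, and the toy sample below are hypothetical and are not drawn from the present data; the sketch assumes that each word has already been assigned a word class.

```python
from collections import Counter


def noun_verb_proportions(tagged_tokens):
    """Return noun and verb shares of all words, by tokens and by types.

    tagged_tokens: list of (word, word_class) pairs, one per word produced.
    Tokens count every occurrence; types count each distinct word once.
    """
    if not tagged_tokens:
        raise ValueError("tagged_tokens must not be empty")

    distinct = set(tagged_tokens)  # one entry per distinct (word, class) pair
    token_counts = Counter(tag for _, tag in tagged_tokens)
    type_counts = Counter(tag for _, tag in distinct)

    return {
        "tokens": {c: token_counts[c] / len(tagged_tokens) for c in ("noun", "verb")},
        "types": {c: type_counts[c] / len(distinct) for c in ("noun", "verb")},
    }


# Hypothetical sample: one verb repeated three times, two different nouns.
sample = [
    ("yao4", "verb"), ("qiu2", "noun"), ("yao4", "verb"),
    ("che1", "noun"), ("yao4", "verb"), ("zhe4", "other"),
]
result = noun_verb_proportions(sample)
print(result)
```

In this toy sample, nouns outnumber verbs by types while verbs outnumber nouns by tokens, which shows why the two counting methods can lead to different conclusions about noun or verb dominance.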

Core lexicons of children's language production in the toy play context
The third goal of the present study was to examine which words and lexical categories are shared by the majority of children. The results showed that no words were used by more than 30 children in the toy play context at 1;8. Children's core lexicons were mainly composed of predicates and closed-class words when the children were 2;2 and 3;0. A previous study investigated communicative acts during toy play between mothers and children using the Inventory of Communicative Acts-Abridged (INCA-A), and found that statements and responses as well as questions and responses were the two most common speech acts used by Mandarin-speaking toddlers during toy play (Lee, 2019). In parent-child interaction during play, children commonly used words such as bu2yao4 'no' and yao4 'want' to respond to parents' questions, or used words such as zhe4 'this', shi4 'is', and wo3 'I' in simple declarative sentences, which may explain why predicates and closed-class words were shared by a majority of children in the present study. Closed-class words were used by the majority of the children in early language production in the toy play context, and the proportion of closed-class words increased significantly between 1;8 and 3;0. These findings suggest that the role of grammatical function words increases with the development of children's syntactic abilities.
Furthermore, some words that did not belong to the three lexical categories, such as adverbs, classifiers, and sentence-final particles, were found to be shared by the majority of the children at 2;2 and 3;0 in the toy play context.
Classifiers are a distinctive word class in Mandarin. Mandarin classifiers usually occur between a numeral and a noun or between a demonstrative and a noun (Chao, 1968; Li & Thompson, 2009; Tai & Wang, 1990; Tien, Tzeng, & Hung, 2002). The development of classifiers reflects not only children's linguistic abilities but also their cognitive foundations, such as conceptual structure and categorization (Tien et al., 2002). Therefore, classifiers have received much attention in research on Chinese language development. The present study showed that the classifier ge4 was shared by the majority of the children at 2;2 and 3;0, which is consistent with Hsu's (1996) finding that ge4 is the most common and the first acquired classifier in Mandarin-speaking children. In addition, sentence-final particles are widely used in spoken Mandarin to convey speakers' various feelings, emotions, and attitudes, and they play an important semantic and pragmatic role in spoken Mandarin discourse (Li & Thompson, 2009). The present study also found that sentence-final particles such as le and a1 were learned by Mandarin-speaking children quite early and were used by the majority of the children at 2;2 and 3;0.
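The notion of a core lexicon used in this section, that is, words shared by a large number of children, can be sketched as follows. This is an illustrative sketch only, not the procedure used in the present study: the function name, the child identifiers, and the toy word lists are hypothetical, and the cutoff is passed as a parameter so that a value such as 30 could be supplied for a full sample.

```python
from collections import Counter


def core_lexicon(words_by_child, min_children):
    """Return the words produced by more than `min_children` children.

    words_by_child: dict mapping a child identifier to the list of words
    that child produced in the sample (repetitions allowed).
    """
    counts = Counter()
    for words in words_by_child.values():
        counts.update(set(words))  # each child counts at most once per word
    return sorted(word for word, n in counts.items() if n > min_children)


# Hypothetical toy data for four children.
samples = {
    "c1": ["yao4", "zhe4", "ge4", "yao4"],
    "c2": ["yao4", "zhe4"],
    "c3": ["yao4", "ge4", "zhe4"],
    "c4": ["yao4", "che1"],
}
shared = core_lexicon(samples, min_children=2)
print(shared)
```

Counting each child once per word, rather than counting every occurrence, keeps a single talkative child from pushing a word into the shared set.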

Limitations and suggestions for future research
Although the present study traced changes in Mandarin-speaking children's early lexical production and composition using longitudinal language samples, which provide important information about the developmental patterns of early language acquisition in Mandarin-speaking children, it still has certain limitations. First, the sample size is relatively small, and the data are drawn exclusively from language performance during 15 minutes of parent-child interaction in the toy play context, which limits generalizability. Second, all participating children were from middle SES families; therefore, our findings may not be applicable to children from other SES groups. Third, the present study only included children between ages 1;8 and 3;0. However, the lexical and grammatical abilities of children continue to develop after the age of three years, which may cause more complex changes in lexical composition. We recommend that future studies collect spontaneous speech from larger samples, in more varied contexts, and over a longer collection duration, which would increase the representativeness of the language samples. In addition, further longitudinal research may include older children, or analyze the relationships between vocabulary and grammar using more complete and detailed grammatical indicators, which would help provide a broader range of information about early language development.

Conclusion
In summary, the present study suggests that common nouns and predicates are dominant categories in Mandarin-speaking children in the early stages of lexical development. With vocabulary growth and the emergence of grammar, different lexical categories show different developmental patterns. A decrease in the proportion of common nouns and an increase in the proportion of predicates and closed-class words found in Mandarin-speaking children in the present study are similar to the developmental trajectories in English, Italian, and French. Furthermore, closed-class words are shared by the majority of children at 2;2 and 3;0, which demonstrates that the role played by grammatical function words gradually increases with age. These longitudinal data provide important information about the developmental trends of early lexical composition in Mandarin-speaking children, providing a valuable contrast for the examination of children's language in different contexts and language systems.

Table 1.
Classification of Word Categories and Word Classes

Table 2.
Descriptive Statistics and Friedman Test Results for Early Language Production at
a Post-hoc analysis was performed by the Wilcoxon test with Bonferroni correction, with p < .017 being considered statistically significant for multiple comparisons. ***p < .001.

Table 3.
Descriptive Statistics and Friedman Test Results for Frequencies of All Word Classes at

Table 4.
Descriptive Statistics and Friedman Test Results for Proportions of All Word Classes at
Note. Some word classes, such as onomatopoeia, interjections, and kinship terms, were not included. a Post-hoc analysis was performed by the Wilcoxon test with Bonferroni correction, with p < .017 being considered statistically significant for multiple comparisons.

Table 5.
Proportions of Nouns and Verbs in Different Counting Methods at