Diversity in monolingual and multilingual communicative environments and its relation to vocabulary in early childhood

Published online by Cambridge University Press:  10 December 2025

Esmee Miron Aalders*
Affiliation:
Developmental Psychology: Infancy and Childhood, University of Zurich, Zurich, Switzerland; Jacobs Center for Productive Youth Development, Universität Zürich, Zurich, Switzerland
Moritz M. Daum
Affiliation:
Developmental Psychology: Infancy and Childhood, University of Zurich, Zurich, Switzerland; Jacobs Center for Productive Youth Development, Universität Zürich, Zurich, Switzerland
Stephanie Wermelinger
Affiliation:
Developmental Psychology: Infancy and Childhood, University of Zurich, Zurich, Switzerland; Jacobs Center for Productive Youth Development, Universität Zürich, Zurich, Switzerland
*Corresponding author: Esmee Miron Aalders; Email: esmee.aalders@psychologie.uzh.ch

Abstract

Research on multilingualism often assumes homogeneity within monolingual and multilingual groups, overlooking diversity in language environments, such as differences in language exposure and combinations. This study examines three such diversity indicators – language entropy, context entropy and linguistic distance – and their relationship to vocabulary in 4- to 5-year-old mono- and multilingual children (N = 257). Results reveal significantly greater vocabulary in monolinguals than multilinguals when comparing one language, but multilinguals outperform monolinguals on conceptual vocabulary. Vocabulary size in multilinguals showed a quadratic relationship with language and context entropy, initially increasing but declining at higher entropy levels. Additionally, children with greater linguistic distances generally had larger dominant vocabularies. However, within the group with high linguistic distance, further increased distance was linked to smaller dominant vocabularies. These findings suggest that the applied diversity indicators capture meaningful variation in language environments, offering valuable insights into how diversity in these environments relates to vocabulary outcomes in multilingual children.

Information

Type
Research Article
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution-NonCommercial licence (http://creativecommons.org/licenses/by-nc/4.0), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original article is properly cited. The written permission of Cambridge University Press or the rights holder(s) must be obtained prior to any commercial use.
Copyright
© The Author(s), 2025. Published by Cambridge University Press

Highlights

  • Measuring language status dichotomously masks variability between children.

  • Multilingual children show greater conceptual vocabulary than monolinguals.

  • Linguistic distance, language and context entropy partially explain vocabulary size.

  • Medium linguistic distance is related to increased vocabulary.

  • Dominant, non-dominant and total vocabulary in multilinguals show a quadratic relation to entropy measures.

1. Introduction

Being able to communicate is fundamental for children’s later life outcomes (Duncan et al., 2022; Marchman & Fernald, 2008). Children learn to communicate within and from their natural everyday environment. However, these everyday environments differ significantly between children, for instance, in the objects children encounter or the characteristics of the people they meet (Bergelson & Aslin, 2017). As a result, children vary in cognition, behaviour, and the ways they interact, cooperate, and work with others (Anderson et al., 2018; Brink et al., 2015; Neuhauser et al., 2018). Likewise, the diversity in children’s language and vocabulary development has been linked to their environments, specifically to the quantity and quality of language input they receive from their caregivers (Huttenlocher, 1998; Rowe, 2012; Rowe & Goldin-Meadow, 2009). The more words children hear, the more words they know (Hart & Risley, 2003) and the more frequently a word is heard, the earlier it is learned (Goodman et al., 2008). Thus, children’s early vocabulary and language development have a significant impact on their later outcomes, and this development is influenced by their environment.

Children exposed to more than one language experience a communicative environment that differs significantly from that of their monolingual peers (Paradis, 2023; Singh, 2021). To communicate successfully, multilingual children must navigate a highly variable environment and adapt their behaviour accordingly (Howard et al., 2014; Wermelinger et al., 2024). Multilinguals’ language exposure is split between the different languages they are exposed to, and, as a result, they receive less input per language. For this reason, multilingual children and adults tend to know fewer words than monolinguals do in at least one of their languages (Bedore et al., 2005; Bialystok & Barac, 2012), while their combined vocabularies are of comparable size to the vocabularies of monolinguals (Gampe et al., 2018; Hoff et al., 2012; Pearson et al., 1993). To overcome the limitations of single-language vocabulary comparisons and to include possible interaction effects of the multiple languages, researchers advocate measuring total vocabulary (i.e., the vocabularies in each of the languages combined) and conceptual vocabulary (i.e., the number of concepts the child knows in either language; Kohnert, 2010; Pearson et al., 1993). Nevertheless, a simple accumulator model (Kachergis et al., 2022), where more input equals greater vocabulary, does not seem to hold up for multilingual children (Sander-Montant et al., 2023). Other factors likewise predict multilingual vocabularies, such as the types of words children learn (Muszyńska et al., 2024a), the people with whom children interact (e.g., adults or siblings; Hoff et al., 2014) and the number of translation equivalents children use (i.e., words with similar meanings in different languages; Tan et al., 2024). Furthermore, multilingual children’s environments are largely heterogeneous (Gullifer & Titone, 2020; Titone & Tiv, 2023). Multilingual children differ depending on the languages they learn, the prestige associated with their languages, or when and how they are exposed to each language (Wermelinger et al., 2024). Previous research has not captured this diversity and has often treated multilingual children as one homogeneous group. However, to validly capture multilinguals’ communicative environment, we need to adopt measures that assess different aspects of a multilingual environment and understand multilingualism as a continuous variable rather than the dichotomous distinction between monolingualism and multilingualism (Gullifer & Titone, 2020).

In the current project, we characterised the communicative environment of multilingual children using three measures (linguistic distance, language entropy and context entropy) to increase our understanding of the diversity within multilingual environments. We used these measures to explain differences in children’s receptive vocabulary as an early marker of language and communicative development (Huttenlocher, 1998).

2. Linguistic distance

Monolingual children differ in their language development depending on the language they acquire (vocabulary growth, Bleses et al., 2008; Thordardottir, 2005; phonological awareness, Kang, 2012; word segmentation, Antovich & Graf Estes, 2020; Mateu & Sundara, 2022; Orena & Polka, 2019). Multilinguals need to develop multiple interdependent language systems (French & Jacquet, 2004) such that speech in one language activates word recognition in both of their languages (Blumenfeld & Marian, 2013; Lin et al., 2023; Von Holzen & Mani, 2012). Furthermore, each language has specific qualities, which may lead to interference effects (Marian et al., 2008; Tan et al., 2024). For example, words may have different meanings in different languages (e.g., gift means present in English but poison in German), or correct sentence construction in one language can lead to errors in other languages. These interactions between languages influence how multilingual children develop languages. For instance, multilingual infants rely more or less on vowels than consonants for lexical processing, depending on the language (Delle Luche et al., 2014; Mani & Plunkett, 2007; Nishibayashi & Nazzi, 2016).

The linguistic distance between the languages, that is, the level of similarity between languages on the lexical, phonological, and/or grammatical level (Jaekel et al., 2023), is associated with children’s language development. Smaller linguistic distances are associated with larger vocabulary sizes in bilingual toddlers in both of their languages (Gampe et al., 2021) and faster development and greater proficiency in adult second language (L2) learners (Jaekel et al., 2023; Van der Slik, 2010). Using commonalities between languages, such as cognates, may allow multilinguals to bootstrap their word learning via cross-linguistic transfer (Tan et al., 2024; Van der Slik, 2010). Cognates are words that are phonologically similar between two spoken languages (e.g., English house and German Haus), as compared to translational equivalents that are non-cognates (e.g., English dog and German Hund). However, phonological similarities (such as those found in cognates) may also reduce the separation between the two languages, leading to interference and delays in language acquisition, as observed in adults (Marian et al., 2008). In line with this, Squires et al. (2020) report that 25% of the investigated studies on children in their systematic review did not find a positive cognate facilitation effect on vocabulary and that the cognate facilitation effect is influenced by many factors, such as language dominance (Bosma et al., 2019; Chai & Bao, 2023; Garrido-Pozú, 2024; Koutamanis et al., 2025; Poarch & van Hell, 2012; Quirk & Cohen, 2022; Robinson Anthony et al., 2022), language proficiency (Chai & Bao, 2023) and language exposure (Robinson Anthony et al., 2022). In sum, previous findings are mixed and suggest that linguistic distance is associated with both larger and smaller vocabularies.

Linguistic distance can be estimated based on language families (Spolaore & Wacziarg, 2016), expert judgments of language characteristics (Dryer & Haspelmath, 2013), or automated calculations of phonetic similarities (Automatic Similarity Judgement Program, ASJP; Wichmann, 2020). The current study estimates linguistic distance based on the lexico-phonological similarity indicated in the ASJP database (Søren et al., 2022; Wichmann, 2020).

3. Entropy measures

Next to the linguistic distance between their languages, multilingual children differ in how they are exposed to those languages across social contexts (e.g., caregivers, institutional childcare). For example, children can learn their languages in a compartmentalised context (e.g., a child hears one language at home and another one at institutional childcare) or an integrated context (e.g., a child hears their languages regardless of the communicative context; Gullifer & Titone, 2020). Individual differences in how multilinguals use their languages across social contexts may define how they represent, access and control those languages (Abutalebi & Green, 2016; Green & Abutalebi, 2013). In the current study, we measured the differences in children’s language exposure across social contexts using holistic entropy measures. Entropy is a concept rooted in physics and information theory and has been used to quantify uncertainty or diversity (Shannon, 1948). Previous work in psycholinguistics used entropy measures in sentence comprehension research to model how readers or listeners adapt to noisy linguistic input, by quantifying how new information shifts expectations about earlier parts of a sentence (Levy, 2008).

Entropy is computed using the following function (Shannon, 1948): $ H(X) = -\sum_{i=1}^{n} p(x_i) \log_2 p(x_i) $, with p being the proportion of awake time the child is exposed to each language/context (x), and n representing the total number of languages/contexts the child is exposed to. In the current study, we calculated the entropy of children’s language exposure (i.e., language entropy) and contexts (i.e., context entropy). Unlike simple frequency counts, entropy captures both the number of distinct elements (i.e., different languages and contexts a child is exposed to) and their relative probabilities (Cover & Thomas, 2006), offering a nuanced measure of language and contextual diversity. Entropy increases with the number of distinct elements (i.e., different languages or social contexts), meaning that exposure to more varied input naturally leads to greater measured diversity. Higher entropy values, thus, reflect a more diverse and less predictable linguistic environment, which has been linked to improved vocabulary development and language outcomes in children (Rowe, 2012). Further, entropy is a symmetric measure, treating all elements without bias. Lastly, it is sensitive to the distribution of elements: entropy is maximised when input is balanced and evenly distributed, reflecting greater unpredictability and complexity (Gullifer & Titone, 2020). In case of exposure to two languages or two contexts, the entropy is highest (H(2) = 1.00) if exposure is exactly balanced between the two elements. In case of exposure to three languages/contexts, the maximum entropy is higher than with two languages and is reached (H(3) = 1.58) when exposure is equally distributed across the three languages (see Figure 1).
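The entropy function can be illustrated with a short Python sketch. This is not the authors' code (they report using an R package); the function name `shannon_entropy` and the example proportions are hypothetical and serve only to reproduce the two maxima quoted in the text.

```python
import math


def shannon_entropy(proportions):
    """Shannon entropy (base 2) of exposure proportions that sum to 1."""
    total = sum(proportions)
    if not math.isclose(total, 1.0, abs_tol=1e-9):
        raise ValueError("proportions must sum to 1")
    # Elements with p = 0 contribute nothing to the sum.
    return -sum(p * math.log2(p) for p in proportions if p > 0)


# Hypothetical exposure profiles (proportions of awake time).
balanced_two = shannon_entropy([0.5, 0.5])          # two languages, 50/50
balanced_three = shannon_entropy([1 / 3, 1 / 3, 1 / 3])  # three languages, equal
skewed_two = shannon_entropy([0.8, 0.2])            # two languages, 80/20

print(round(balanced_two, 2), round(balanced_three, 2), round(skewed_two, 2))
# prints: 1.0 1.58 0.72
```

The skewed 80/20 profile yields a lower value than the balanced one, which is the sense in which entropy rewards balanced exposure.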

Figure 1. Distribution of entropy for (a) two elements (languages/social contexts) and (b) for three elements. Note that the distributions peak at equal distributions across the given number of elements.

3.1. Language entropy

Language entropy (Gullifer & Titone, 2020; Titone & Tiv, 2023) quantifies the diversity of cumulative everyday language exposure to different languages (Gullifer & Titone, 2020). It is estimated as a function of the probability of certain events occurring, that is, when the child is exposed to a specific language. As described in the previous paragraph, higher language entropy values relate to more balanced language use and greater language diversity across contexts. In adults, increased language entropy is associated with increased language proficiency in the non-dominant language (Gullifer et al., 2021; Gullifer & Titone, 2020). However, while language exposure has been linked to language development in childhood (Hart & Risley, 2003; Huttenlocher, 1998), this relationship is not necessarily linear (Hoff & Ribot, 2017; Sander-Montant et al., 2023). For instance, the number of translation equivalents in a child’s languages can support cross-linguistic word learning (Tan et al., 2024). Hence, an increase in language entropy (i.e., more balanced language input) is not automatically associated with an increased vocabulary in a child’s language. Furthermore, the diversity of social contexts in which languages are used may further contribute to children’s language development (Marvin et al., 2009).

3.2. Context entropy

Language use differs between social contexts (Abdalla, 2022; Straker, 1980). For instance, preschool children use only about one-third of their vocabulary across their home and preschool contexts, while other words are specific to either of the contexts (Marvin et al., 2009; Muszyńska et al., 2024b). In contrast, monolinguals hear context-specific labelled words across home and school contexts (Muszyńska et al., 2024b). The differences in language use between social contexts influence children’s vocabulary, depending on the time spent in these contexts and the subsequent variety of language input children experience (Goodman et al., 2008; Hart & Risley, 2003). In the current study, we measured the diversity of children’s social contexts with context entropy. In parallel to language entropy, context entropy is estimated by calculating the probability of certain events occurring. In the case of context entropy, these events refer to the time children spend in different social contexts (e.g., primary caregivers, institutional childcare). Thus, a balanced measure of contextual entropy indicates children spending similar amounts of time in their different caregiving contexts.

4. The current study

The current study explored how diversity in multilingual children’s communicative environments relates to their language development. This diversity is indicated by children’s linguistic distance and their language and context entropy. We took an exploratory approach to examining how various aspects of children’s language environment associate with their receptive vocabulary. Specifically, we addressed two objectives: (1) to characterise diversity in children’s communicative environment using three measures: linguistic distance, language entropy and context entropy, and (2) relate these measures to their language development. We re-analysed existing data from parental questionnaires on children’s language exposure and receptive vocabulary tests from studies conducted in our research group between 2019 and 2024. We first followed the traditional approach and compared the receptive vocabulary of monolingual and multilingual children on the group level. We then associated children’s linguistic distance, language entropy and context entropy with their receptive vocabulary. Multilingual children’s receptive vocabulary, in the two languages they are most exposed to, was measured and reported as vocabulary per language, total vocabulary (i.e., summed across both languages) and conceptual vocabulary (i.e., the sum of how many concepts a child knows).

5. Method

We preregistered the study prior to data analyses (https://osf.io/fhu5e/?view_only=285a0d10bd814d60b88ad0f11c138cc3) and made the collected data and R code available on the Open Science Framework (see Data Availability Statement).

5.1. Participants

We included data of N = 257 children aged 3.8–5.6 years (M = 4.54 years, SD = 0.36); 49% were girls. Children were monolingual Swiss German (n = 112, i.e., not more than 10% input in another language) or multilingual (n = 134). All multilingual children spoke Swiss German (i.e., the societal language) and were exposed to another language for at least 20% of the time (in accordance with Wermelinger et al., 2017). Of the multilingual children, 19 were exposed to a third language. The multilingual children were exposed to the following languages: English (n = 36), Italian (n = 35), French (n = 21), Spanish (n = 18), Portuguese (n = 8), Swedish, Greek, Hungarian (n = 4 each), Romansh, Mandarin, Czech (n = 3 each), Croatian, Polish, Russian, Serbian (n = 2 each) and Albanian, Arabic, Danish, Dutch, Slovakian, Tagalog (the Philippines), Tigrinya (Eritrea), Turkish and Vietnamese (n = 1 each). Twelve children were excluded from the analyses because they did not meet the criteria for language exposure.

The children in our sample came from a background of high socio-economic status. Household income was above average: 79.5% of caregivers reported above-city average household income, 12.4% reported a household income within the average range and 8.0% reported below-average household income (Bundesamt fuer Statistik, 2024). Parental education was high; 73.0% of children came from households in which both caregivers completed higher education at a university or a university of applied sciences and 16.0% of children came from households in which one of the caregivers completed higher education at a university or a university of applied sciences. The other 1% of children came from households in which caregivers completed other types of education, such as apprenticeships or vocational training.

The participants were recruited via the research unit’s database. The database consists of caregivers interested in participating in studies with their children. The children were healthy, born full term (week of gestation > 37), had a birth weight > 2500 g and had no diagnosis of a developmental disorder, according to parental reports. Each child received a certificate and a small present (approximately USD 5) as compensation. Children’s caregivers gave informed consent. The Ethics Committee of the UZH Faculty of Arts and Social Sciences agreed with the general procedure, which adheres to the ethical standards of the 1964 Declaration of Helsinki and its subsequent amendments.

5.2. Design

This study re-analysed data from three cross-sectional observational studies on monolingual and multilingual children’s communicative development. We used two data sources: a parental questionnaire on children’s cumulative language exposure and a laptop-based test of children’s receptive vocabulary (Gampe et al., 2018).

In the parental questionnaire, caregivers reported, for each caregiving situation since the child’s third birthday and for each day of the week, the child’s interaction partners, the languages spoken, and the amount of time the child spent with each interaction partner and was exposed to each language. Based on this information, children’s language status was determined: monolingual in case of exposure to a second language for less than 10% of awake time, and multilingual in case of exposure to a second language for at least 20% of awake time.
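The classification rule can be sketched in Python as follows. This is an illustrative reading of the thresholds, not the authors' procedure: the function names, the weekly-hours input and the "excluded" label for children falling between 10% and 20% are assumptions made for the example.

```python
def exposure_shares(hours_by_language):
    """Convert hours of exposure per language into proportions of awake time."""
    total = sum(hours_by_language.values())
    if total <= 0:
        raise ValueError("total exposure must be positive")
    return {lang: hours / total for lang, hours in hours_by_language.items()}


def language_status(second_language_share):
    """Classify a child from the share (0-1) of awake time in a second language."""
    if not 0.0 <= second_language_share <= 1.0:
        raise ValueError("share must be between 0 and 1")
    if second_language_share < 0.10:
        return "monolingual"
    if second_language_share >= 0.20:
        return "multilingual"
    # Assumption: children between the two thresholds meet neither criterion.
    return "excluded"


# Hypothetical child: 70 h/week Swiss German, 28 h/week Italian.
shares = exposure_shares({"Swiss German": 70, "Italian": 28})
status = language_status(shares["Italian"])
print(status)  # prints: multilingual
```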

Receptive vocabulary in the children’s languages was assessed by the BILEX (Gampe et al., 2018). The BILEX is a reliable and valid method for measuring receptive vocabulary in children aged 3–5 years. In this test, children hear a noun and are asked to correctly match it to one of six simultaneously presented objects across 48 trials. Children’s vocabulary in Swiss German was always assessed as the first task in the studies, and the vocabulary in their second language as the final task. Hence, no randomisation was performed for the measurements of the current study. However, we checked for possible effects of fatigue in the second administration of the vocabulary test by comparing the children’s performance and reaction times between the two measurements (Heitz, 2014). The results of two dependent t-tests show significant differences in children’s performance, t(158) = 5.99, p < .001, but not in children’s reaction times, t(159) = 1.26, p = .209. Therefore, the difference in performance is unlikely to be explained by differences in reaction times, for instance, due to increased fatigue in completing the second BILEX. Children’s data were included in the current study if data from both the parental questionnaire and the BILEX were available. Children’s BILEX data were discarded in case of values below the second percentile of the norm scores developed within the research unit.

5.3. Measures

5.3.1. Linguistic distance

The children’s linguistic distance measure was based on the lexico-phonological similarity indicated in the ASJP database (Holman et al., 2011; Wichmann, 2020). The ASJP consists of word lists of the 40 most stable lexical items from Swadesh’s 100-item list (Swadesh, 1955), transcribed into a simplified standard orthography which is phonologically informed and consistent across the different languages (e.g., the English words dog and sun are transcribed as dag and s3n and would be compared to the Swiss-German huN and sunne). We calculated the normalised Levenshtein distance (Bakker et al., 2009) from each child’s language to the societal language to estimate the linguistic distance between the languages they are exposed to (Konstantinidis, 2007). The score is calculated as the number of characters that need to change from one word to its equivalent in the other language. The sum of these transformations for all 40 items is the linguistic distance: the higher the linguistic distance, the fewer similarities between languages. Because the societal language consists of a wide variety of dialects and the ASJP offers data on one of those dialects, the linguistic distance obtained approximates the actual distance, as the children in our sample may be exposed to other dialects. If children were exposed to more than two languages, we calculated the linguistic distance for the two languages the child is exposed to most.
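A minimal Python sketch of the word-level computation, using the two transcribed word pairs quoted above. It is not the ASJP software or the authors' code: dividing by the length of the longer word is one common way of normalising a Levenshtein distance and is assumed here, as is averaging across word pairs; the published ASJP procedure and the aggregation over all 40 items may differ in detail.

```python
def levenshtein(a, b):
    """Minimum number of single-character edits that turn string a into string b."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def normalised_levenshtein(a, b):
    """Edit distance divided by the length of the longer word (assumed normalisation)."""
    longest = max(len(a), len(b))
    return levenshtein(a, b) / longest if longest else 0.0


# Word pairs from the text: English vs Swiss German transcriptions of dog and sun.
pairs = [("dag", "huN"), ("s3n", "sunne")]
raw = [levenshtein(x, y) for x, y in pairs]
normalised = [normalised_levenshtein(x, y) for x, y in pairs]
mean_distance = sum(normalised) / len(normalised)

print(raw)  # prints: [3, 3]
```

Both pairs need three edits, but the normalised values differ (3/3 versus 3/5) because sunne is longer, which is why normalisation matters when words vary in length.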

5.3.2. Entropy measures

Language and context entropy were computed based on the proportion of awake time children spent in certain languages or social contexts, based on the data from the parental questionnaire, using the languageEntropy R package (Gullifer & Titone, 2018). The entropy values range from 0 to $ \log_2 n $, with n being the total number of languages or social contexts the function is computed over (Gullifer & Titone, 2020). Higher levels of language and context entropy indicate more balanced language exposure or time spent in different social contexts. When exposure is evenly distributed between two languages or contexts (i.e., 50% each), entropy reaches its maximum value of 1. In the case of three equally balanced languages or contexts (approximately 33% each), the maximum entropy is about 1.58 (see Figure 1). Context entropy was computed across the contexts of primary caregivers, secondary caregivers, institutional childcare and kindergarten.

5.3.3. Vocabulary

Children’s receptive vocabulary in their two most dominant languages was measured as the number of correctly identified nouns in the BILEX per language and summed up across both languages to obtain a measure of total vocabulary. Furthermore, the conceptual vocabulary was calculated as the number of nouns correctly identified in either language. For the analysis, we took a data-driven approach and considered the language the child is exposed to more than 50% of the time to be their first language (the dominant language was not the societal language for n = 42 children). In case a child’s exposure was exactly 50%, we considered the societal language as the child’s first language (n = 1).
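The difference between total and conceptual vocabulary can be made concrete with a small Python sketch. The function names, item labels and exposure shares are hypothetical; the sketch assumes that each BILEX item corresponds to one concept tested in both languages, and it resolves an exact tie in exposure in favour of the societal language, as described above.

```python
import math


def vocabulary_scores(correct_dominant, correct_non_dominant):
    """Scores from the sets of concepts identified correctly in each language."""
    dominant = len(correct_dominant)
    non_dominant = len(correct_non_dominant)
    return {
        "dominant": dominant,
        "non_dominant": non_dominant,
        # Total vocabulary counts a concept once per language in which it is known.
        "total": dominant + non_dominant,
        # Conceptual vocabulary counts a concept once if known in either language.
        "conceptual": len(correct_dominant | correct_non_dominant),
    }


def dominant_language(shares, societal="Swiss German"):
    """Most-heard language; an exact tie goes to the societal language."""
    top_share = max(shares.values())
    leaders = [lang for lang, share in shares.items()
               if math.isclose(share, top_share)]
    if len(leaders) > 1 and societal in leaders:
        return societal
    return leaders[0]


# Hypothetical child who knows three concepts in one language and four in the other.
scores = vocabulary_scores({"dog", "sun", "house"}, {"sun", "house", "tree", "fish"})
print(scores["total"], scores["conceptual"])  # prints: 7 5
```

Because sun and house are known in both languages, they count twice towards total vocabulary but only once towards conceptual vocabulary.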

In sum, children’s language status, age and communication environment (language entropy, context entropy and linguistic distance) were derived from parental questionnaires. Linguistic distance is operationalised as a continuous variable based on lexico-phonological similarity and as a categorical variable following an exploratory cluster analysis (see Results). Continuous vocabulary outcomes (total, conceptual, dominant and non-dominant language) were assessed using BILEX (Gampe et al., 2018). A detailed overview of all variables and their operationalisation is provided in Tables A1 and A2 in the Appendix.

6. Results

We deviated from the preregistered analyses by running group-level analyses (i.e., monolingual versus multilingual) on children’s vocabulary, and by exploring the effects of context entropy on multilingual children and monolingual children separately.

6.1. Group-level analyses

We used independent t-tests to compare the vocabulary scores of monolingual and multilingual children at the group level, for the dominant language and for conceptual vocabulary. We also assessed whether multilingual children show greater variability in their vocabulary, as often suggested in the literature (Hoff et al., 2014; Hoff & Core, 2013; Lauro et al., 2020), by testing the difference in variance between the two groups with an F-test for equality of variance.

Monolingual children scored significantly higher (M = 39.3, SD = 4.2) than multilingual children (M = 37.8, SD = 4.0) in their dominant language, t(236) = 2.49, p = .013, Cohen’s f² = .03. We found a significant difference in variance between monolinguals and multilinguals, F(111,126) = 0.69, p = .049, with higher variance in dominant language vocabulary among multilinguals. Multilingual children scored significantly higher (M = 41.2, SD = 4.0) on conceptual vocabulary than their monolingual peers (M = 39.4, SD = 4.1), t(233) = 3.38, p < .001, Cohen’s f² = .05. However, the variability between the groups showed no significant difference, F(111,130) = 1.07, p = .718 (see Figure 2a,b).

Figure 2. Graphs a and b show group-level analyses comparing the vocabularies of monolingual and multilingual children. Graphs c to f show vocabulary differences in multilingual children by the two identified linguistic distance groups (Low and High).

6.2. Indicators of diversity

To examine diversity in children’s communicative environments and relate this to their vocabulary development, we computed three indicators of diversity (i.e., linguistic distance, language entropy and context entropy) and tested the associations between these indicators and vocabulary outcomes using linear models. We standardised all predictors before running the analyses. For visualisation purposes, however, we used the raw scores. Since we did not expect these associations to be linear, we compared the linear models with the respective quadratic models using the R anova() function and model comparison criteria (i.e., Akaike Information Criterion, Bayesian Information Criterion, adjusted R² and χ²) and report the findings of the better-fitting models in Table 1 (for model comparisons, see Tables S6–S13 in Supplementary Materials).
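The comparison of a linear with a quadratic model can be sketched in plain Python. This is an illustration under stated assumptions rather than the authors' R workflow: the data are invented to follow an inverted U, the helper names `fit_polynomial` and `aic_bic` are hypothetical, covariates such as age are omitted, and the information criteria are given up to an additive constant that is identical for both models (so differences between models are unaffected). The last line computes the F statistic for nested least-squares models, which is the quantity R's `anova()` reports for two nested `lm` fits.

```python
import math


def fit_polynomial(x, y, degree):
    """Least-squares fit of y on 1, x, ..., x**degree; returns (coefficients, RSS)."""
    k = degree + 1
    # Normal equations: (X'X) b = X'y for the polynomial design matrix.
    xtx = [[sum(xi ** (i + j) for xi in x) for j in range(k)] for i in range(k)]
    xty = [sum((xi ** i) * yi for xi, yi in zip(x, y)) for i in range(k)]
    a = [row[:] + [rhs] for row, rhs in zip(xtx, xty)]
    # Gaussian elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, k):
            factor = a[r][col] / a[col][col]
            for c in range(col, k + 1):
                a[r][c] -= factor * a[col][c]
    # Back substitution.
    coefs = [0.0] * k
    for i in reversed(range(k)):
        tail = sum(a[i][j] * coefs[j] for j in range(i + 1, k))
        coefs[i] = (a[i][k] - tail) / a[i][i]
    rss = sum(
        (yi - sum(c * xi ** p for p, c in enumerate(coefs))) ** 2
        for xi, yi in zip(x, y)
    )
    return coefs, rss


def aic_bic(rss, n, n_coefs):
    """Gaussian AIC and BIC, omitting a constant shared by models fit to the same data."""
    fit_term = n * math.log(rss / n)
    n_params = n_coefs + 1  # regression coefficients plus the residual variance
    return fit_term + 2 * n_params, fit_term + math.log(n) * n_params


# Hypothetical entropy values and vocabulary scores, NOT the study data.
x = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0, 1.2, 1.4]
y = [30.1, 33.5, 36.5, 38.3, 39.7, 39.9, 39.7, 38.3]
n = len(x)

coefs_lin, rss_lin = fit_polynomial(x, y, degree=1)
coefs_quad, rss_quad = fit_polynomial(x, y, degree=2)
aic_lin, bic_lin = aic_bic(rss_lin, n, n_coefs=2)
aic_quad, bic_quad = aic_bic(rss_quad, n, n_coefs=3)

# Nested-model F statistic: one extra parameter, n - 3 residual degrees of freedom.
f_stat = ((rss_lin - rss_quad) / 1) / (rss_quad / (n - 3))
```

Lower AIC and BIC values favour a model; in this toy example both criteria prefer the quadratic fit, whose squared term is negative.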

Table 1. Best fitting models

Note: With ‘LD’ for Linguistic Distance and ‘LD group’ referring to the low and high linguistic distance group categories. For model comparisons, see Tables S6 to S13 in Supplementary Materials.

6.2.1. Linguistic distance

Linguistic distance was analysed only among multilingual children. It was not normally distributed in our sample. Therefore, we performed a clustering analysis using Mclust in R (Scrucca et al., 2023) to explore different categories in the data (see Table S3 in Supplementary Materials). We identified two groups, which we use for further analyses: low (M = 72.12, SD = 1.07) and high (M = 93.41, SD = 6.27) linguistic distance. We label the groups ‘low’ and ‘high’ with reference to the ASJP-based computation of the level of similarity. The low linguistic distance group (n = 33) includes the following languages: Danish, Dutch, English and Swedish. The high linguistic distance group (n = 100) includes the languages Italian, French, Spanish, Portuguese, Hungarian, Romansch, Czech, Greek, Mandarin, Croatian, Tagalog, Vietnamese, Arabic, Tigrinya, Albanian, Russian and Slovakian. We ran linear models predicting multilinguals’ dominant language, non-dominant language, conceptual and total vocabulary using linguistic distance group, language entropy and context entropy as predictor variables and children’s age in months as a covariate (see Table 1).
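The kind of model-based clustering used to split the distance scores can be illustrated with a deliberately simplified Python sketch: an expectation-maximisation fit of a two-component, one-dimensional Gaussian mixture. It is not a re-implementation of Mclust, which also compares different numbers of components and covariance structures by BIC; here the number of components is fixed at two by assumption. The function name `fit_two_gaussians`, the starting values and the distance scores are all hypothetical.

```python
import math


def fit_two_gaussians(values, n_iter=200, min_var=1e-6):
    """EM for a two-component 1-D Gaussian mixture.

    Returns (weights, means, variances, responsibilities).
    """
    n = len(values)
    grand_mean = sum(values) / n
    grand_var = sum((v - grand_mean) ** 2 for v in values) / n
    weights = [0.5, 0.5]
    means = [min(values), max(values)]  # crude but deterministic start
    variances = [grand_var, grand_var]
    resp = []
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each value.
        resp = []
        for v in values:
            dens = [
                w * math.exp(-((v - m) ** 2) / (2 * s2)) / math.sqrt(2 * math.pi * s2)
                for w, m, s2 in zip(weights, means, variances)
            ]
            total = sum(dens)
            resp.append([d / total for d in dens])
        # M-step: update weights, means and variances from the posteriors.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            weights[k] = nk / n
            means[k] = sum(r[k] * v for r, v in zip(resp, values)) / nk
            var_k = sum(r[k] * (v - means[k]) ** 2 for r, v in zip(resp, values)) / nk
            variances[k] = max(var_k, min_var)  # guard against a collapsing component
    return weights, means, variances, resp


# Hypothetical linguistic distance scores, NOT the study data.
distances = [71, 72, 72.5, 73, 71.5, 72, 88, 90, 95, 98, 93, 100, 86, 94]
weights, means, variances, resp = fit_two_gaussians(distances)

# Hard assignment: each child goes to the component with the larger posterior.
labels = [0 if r[0] >= r[1] else 1 for r in resp]
```

In this toy data set the first component settles on the tight cluster of low scores and the second on the more dispersed high scores, mirroring the low and high groups described above.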

We found significant effects of linguistic distance on vocabulary size, associating higher linguistic distances with larger receptive vocabulary in the children’s dominant language, F(2,124) = 8.20, p = .026, Cohen’s f² = .06. We did not find a significant relation between linguistic distance and conceptual and total vocabulary scores or the children’s vocabulary scores in their non-dominant language (all p > .060). For the results of each model, see Table A3 in the Appendix and Figure 2C–F.

To better understand the relationship between linguistic distance and vocabulary size, we ran non-preregistered exploratory analyses focusing on the multilingual children in the category with higher linguistic distance. We ran linear models with the linguistic distance as a continuous variable on all measures of vocabulary size. For multilingual children in the high linguistic distance group, greater linguistic distance was negatively associated with vocabulary size in the dominant language, F(2, 93) = 5.32, p = .032, adjusted R² = .08, Cohen’s f² = .11. This effect was not found in non-dominant language vocabulary, conceptual and total vocabularies (all p > .100), see Figure 2C–F and Table A4 in Appendix.

To explore the possible quadratic relationship between linguistic distance and dominant language vocabulary, we ran an additional non-preregistered quadratic model including all multilingual children, predicting their dominant vocabulary based on the continuous measure of the linguistic distance. This model confirmed a quadratic relationship between linguistic distance and children’s receptive vocabulary in their dominant language, F(3, 123) = 7.83, p = .005, adjusted R² = .14, Cohen’s f² = .19.

6.2.2. Language entropy

Language entropy was calculated based on the proportion of children’s awake time spent exposed to different languages. The resulting measure captures the degree of balance in the language exposure, with higher values reflecting a more even distribution of time across languages. For example, when children are exposed to two languages, language entropy ranges from 0 to 1.00, with a value of 1.00 indicating perfectly equal exposure. For children exposed to three languages, entropy can range from 0 to approximately 1.58, where 1.58 represents equal exposure to all three languages.
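The ranges quoted above (a maximum of 1.00 for two languages and about 1.58 for three) are what Shannon entropy with a base-2 logarithm yields, so the following Python sketch assumes that formula. The function name `shannon_entropy` and the exposure profiles are illustrative and not taken from the study.

```python
import math


def shannon_entropy(proportions):
    """Shannon entropy in bits for a set of exposure shares."""
    total = sum(proportions)
    if total <= 0:
        raise ValueError("proportions must sum to a positive value")
    # Normalise the shares to sum to 1; zero shares contribute nothing.
    shares = [p / total for p in proportions if p > 0]
    return sum(-p * math.log2(p) for p in shares)


# Hypothetical shares of awake time per language.
monolingual = shannon_entropy([1.0])
unbalanced_bilingual = shannon_entropy([0.8, 0.2])
balanced_bilingual = shannon_entropy([0.5, 0.5])
balanced_trilingual = shannon_entropy([1 / 3, 1 / 3, 1 / 3])

print(round(unbalanced_bilingual, 2))  # 0.72
print(round(balanced_bilingual, 2))    # 1.0
print(round(balanced_trilingual, 2))   # 1.58
```

Because the maximum for k equally used languages is log2(k), a value above 1.00 can only arise when a child is exposed to at least three languages.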

Language entropy was analysed only among multilingual children. We compared several linear regression models, including linear and higher-order terms of language entropy (for best-fitting model, see Table 1). For full details on the fit statistics and comparison of the tested models, see Table S4 in Supplementary Materials. Furthermore, because the results showed different effects on vocabulary for groups with lower and higher linguistic distances, we included these categories as control variables, deviating from the pre-registration.

Language entropy was quadratically associated with dominant vocabulary, F(4,122) = 4.73, p = .018, R² = .14, Cohen’s f² = .20. Higher language entropy is associated with higher vocabulary scores in the dominant language, up to a threshold of 0.94. Beyond this point, higher language entropy is associated with smaller vocabulary scores in the dominant language. Similarly, we found a curvilinear relationship between language entropy and children’s vocabulary in their non-dominant language, F(3, 120) = 6.54, p = .036, R² = .09, Cohen’s f² = .10. Higher language entropy is associated with higher vocabulary scores in the non-dominant language up to an entropy of 1.00, after which higher entropy is associated with smaller vocabulary scores. We found the same effect for multilingual children’s total vocabulary, F(3, 119) = 9.46, p = .003, R² = .17, Cohen’s f² = .24. Higher language entropy is related to higher total vocabulary scores up to an entropy of 0.98, after which higher entropy scores are associated with lower total vocabulary scores.
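One way to read such a threshold is as the turning point of the fitted parabola. The text does not state how the thresholds were derived, so the Python sketch below is an assumption: it takes the vertex of b0 + b1·z + b2·z², where z is the standardised predictor, and converts it back to raw entropy units. The function name `turning_point`, the coefficients, and the mean and standard deviation are all invented for illustration.

```python
def turning_point(b1, b2, mean=0.0, sd=1.0):
    """Raw-scale predictor value at the vertex of b0 + b1*z + b2*z**2.

    z is the standardised predictor, z = (x - mean) / sd.
    """
    if b2 == 0:
        raise ValueError("no turning point without a quadratic term")
    z_star = -b1 / (2 * b2)  # where the derivative b1 + 2*b2*z equals zero
    return mean + sd * z_star


# Hypothetical coefficients and sample statistics, NOT estimates from the study.
threshold = turning_point(b1=0.6, b2=-1.5, mean=0.8, sd=0.25)
print(round(threshold, 2))  # 0.85
```

A negative quadratic coefficient makes the vertex a maximum, which corresponds to the pattern of vocabulary scores rising up to the threshold and falling beyond it.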

In contrast, we did not find a significant association between language entropy and children’s vocabulary score in their conceptual vocabulary, p > .050, see Table A5 in the Appendix and Figure 3.

Figure 3. Associations of language entropy and children’s vocabulary scores. The solid line shows the predicted vocabulary scores from the statistical model. The shaded area represents the 95% confidence interval around these predictions.

6.2.3. Context entropy

Context entropy was calculated based on the proportion of awake time children spent across different caregiving environments. Entropy values were computed over four contexts: primary caregivers, secondary caregivers, institutional childcare, and kindergarten. The resulting entropy reflects how evenly children’s time was distributed across these contexts, with higher values indicating a more balanced exposure. For example, for children exposed to two different contexts, their context entropy falls anywhere between 0 and 1.00, with 1.00 being exactly balanced exposure to two contexts. For children exposed to three contexts, their entropy lies anywhere between 0 and 1.58, with 1.58 representing their time evenly distributed across three contexts.

The models on context entropy include monolingual and multilingual children. We compared linear regression models with both linear and quadratic terms for context entropy, as preregistered. The model with a linear term for context entropy was the best fit (see Table 1 and Tables S8 and S9 in Supplementary Materials for fit statistics).

The results showed no significant associations between context entropy and total and conceptual vocabulary for both monolingual and multilingual children, F(3,241) = 8.70, p = .288 and F(3,241) = 113.70, p = .154, see Table A6. We ran additional, non-preregistered linear regressions for multilingual children only, to explore whether context entropy is differently related to children’s dominant vocabulary than to their non-dominant vocabulary. For full details on model fit and comparison, see Tables S10–S13 in Supplementary Materials. Context entropy showed a curvilinear relationship to dominant vocabulary for multilingual children, F(3,122) = 5.09, p = .037, R² = .09, Cohen’s f² = .13, meaning that higher context entropy scores are associated with higher vocabulary scores in the children’s dominant language, up to a threshold of 1.03, after which higher context entropy scores are associated with lower vocabulary scores in the children’s dominant language (see Figure 4 and Table A7 in Appendix). We found no significant associations between context entropy and non-dominant vocabulary, as well as conceptual and total vocabulary (all p > .05), see Table A7.

Figure 4. Associations of context entropy and children’s vocabulary scores. The solid line shows the predicted vocabulary scores from the statistical model. The shaded area represents the 95% confidence interval around these predictions. Note that the vocabulary scores on the dominant language (c) and the non-dominant language (d) are based on data of multilingual children only.

7. Discussion

Children’s communicative environments shape their development, with multilingual children experiencing distinct environments compared to their monolingual peers. However, research often simplifies language status as a binary variable, overlooking the variability within multilingual environments and its nuanced influence on developmental outcomes (Kremin & Byers-Heinlein, 2021). Here, we quantified the communicative environment of monolingual and multilingual preschool children using three indicators of communicative complexity (linguistic distance, language entropy and context entropy) and investigated the associations between these measures and vocabulary outcomes. In line with previous studies, monolinguals scored higher than multilinguals on a receptive vocabulary task in their dominant language (Bialystok & Barac, 2012). However, when considering multilinguals’ vocabulary in their two most dominant languages, multilingual children did not lag behind their monolingual peers and in fact scored significantly higher on conceptual vocabulary (see also Hoff et al., 2012; Pearson et al., 1993). Additionally, we observed greater variability in multilinguals’ dominant language vocabulary. This variability appears to be partly due to differences in multilinguals’ linguistic distance, language entropy and context entropy. Our findings indicate that linguistic distance is associated with vocabulary size in children’s dominant but not in their non-dominant language. Regarding language entropy, we found that higher language entropy up to a threshold of around 1.00 is associated with larger vocabulary sizes in total, dominant and non-dominant languages, after which the relationship reverses. As entropy peaks at 1.00 for children exposed to two languages, an entropy value higher than 1.00 is associated with exposure to a third language. Finally, higher context entropy was associated with larger dominant language vocabulary among multilingual children, up to a threshold of 1.03.

7.1. Linguistic distance

Linguistic distance was associated with an increased receptive vocabulary in the dominant language and conceptual vocabulary scores, suggesting that greater separation between languages may enhance vocabulary size or language processing speed (Marian et al., 2008). However, we found a negative correlation between linguistic distance and vocabulary scores for children within the high linguistic distance group. This group consisted of children speaking a non-Germanic language and Swiss German. In other words, while children with high linguistic distances tend to score better on vocabulary than children whose second language is more closely related to the societal language, further increases in linguistic distance are linked to lower vocabulary scores. To explore this further, we examined correlations between the three indicators of diversity. We found only small associations (all r < .20), which makes it unlikely that multicollinearity is driving this effect (for models including all three indicators, see Tables S14–S16 in the Supplementary Material).

Our finding aligns with previous research (Gampe et al., 2021), suggesting that the benefits of higher linguistic distance may diminish beyond a certain point. One possible explanation for this pattern could be the nature of the languages in the low linguistic distance group in our sample, which consists exclusively of Germanic languages. These languages share substantial lexical (53%; Batubara & Widayati, 2022), grammatical and phonological similarities with the societal language (Hansen & Kroonen, 2022). We speculate that this may lead to interference effects that hinder vocabulary acquisition. In contrast, languages with moderate linguistic distance may introduce more distinct structures and facilitate linguistic differentiation, resulting in better vocabulary outcomes. However, as linguistic distance continues to increase, the challenges associated with processing and learning such distinct languages may outweigh the benefits of diversity and could lead to lower vocabulary scores (Squires et al., 2020).

While a positive association between linguistic distance and vocabulary was observed in children’s dominant language, we did not find significant associations between linguistic distance and vocabulary in their non-dominant language. This suggests that linguistic distance may support vocabulary growth in the dominant language to some extent, but it does not necessarily impact vocabulary in a non-dominant language. The finding differs from previous studies that found a negative effect of linguistic distance on learning a second language (Jaekel et al., 2023; Van der Slik, 2010). These studies, however, focused on L2 acquisition later in life. Therefore, rather than one language impacting the acquisition of the other as seen in adult second-language learners, our results suggest that the simultaneous acquisition of both languages in early childhood may lead to a more interconnected process, where the linguistic distance can affect the acquisition of both languages.

In sum, our results suggest that, in early childhood, the acquisition of L1 in multilinguals is affected by the simultaneous acquisition of L2, similar to how L2 acquisition is affected by L1 in adult second-language learners. Nevertheless, the discrepancy found between the dominant and non-dominant language invites further investigation into the factors influencing non-dominant language acquisition, potentially including the role of exposure and language use frequency (e.g., daily exposure to and use of multiple languages or intensive exposure to individual languages during specific periods such as vacations).

7.2. Language entropy

Our analysis revealed that language entropy is not linearly related to vocabulary but follows a curvilinear pattern. Higher language entropy is related to larger vocabulary size, up to a threshold of 0.94 for the dominant language vocabulary, 1.00 for the non-dominant language, and 0.98 for the total vocabulary. Beyond these thresholds, language entropy is associated with lower vocabulary outcomes. This finding aligns with earlier research emphasising the importance of language exposure for vocabulary development (Huttenlocher, 1998; Rowe, 2012), particularly in multilingual contexts (language entropy > 1.00) where reduced exposure to each language can limit vocabulary growth (Hoff & Core, 2013). Our results further suggest that vocabulary acquisition is not solely driven by exposure; other factors, such as the balance between languages, play a significant role (Hoff & Ribot, 2017; Sander-Montant et al., 2023). A language entropy of 1.00 corresponds to the transition in exposure from two to three languages. In the case of bilinguals, that is, children speaking two languages, language entropy ranges from 0.00 to 1.00 and is highest when time is equally split between the two languages. An increase in language entropy means relatively less exposure to the language that is considered dominant and more exposure to the non-dominant language. Our findings, therefore, suggest that increased exposure to the non-dominant language in bilingual children is related to a greater vocabulary in their dominant and non-dominant languages. For the non-dominant language, this relationship is intuitive: more exposure naturally supports greater vocabulary development (Huttenlocher, 1998; Rowe, 2012; Rowe & Goldin-Meadow, 2009).
However, the association is less straightforward for the dominant language, because higher language entropy reflects reduced exposure to that language. One possible explanation for this asymmetry could lie in the interplay between the differences in the strength and stability of linguistic representations between the two languages (Kastenbaum et al., 2018) and the development of meta-linguistic awareness (Cummins, 1978; Huang, 2018). In the dominant language, the representation of words is typically stronger and more stable due to more frequent exposure and use. While meta-linguistic awareness develops across both languages, the stronger representations in the dominant language may allow bilingual children to better leverage this awareness in vocabulary acquisition in their dominant language (Varga, 2021; Zhang et al., 2017) than in their non-dominant language.

The curvilinear relationship between language entropy and multilingual children’s total vocabulary suggests that balanced language exposure supports vocabulary development, up to a threshold of 0.98, after which it declines. In contrast, we did not find a significant association between language entropy and children’s conceptual vocabulary. One possible explanation is that conceptual vocabulary, which reflects children’s knowledge of concepts regardless of the language used to label them, may be less sensitive to the overall balance of language exposure and more influenced by compartmentalisation of language use (Muszyńska et al., 2024b). That is, if certain languages are predominantly used in a specific context or for a specific activity, children may acquire concept-word mappings in a context-dependent manner. This kind of distributional pattern would not be captured by our current entropy measure, which does not account for the functional or contextual separation of languages.

For the multilingual children in our sample who were learning more than two languages, language entropy can exceed 1.00, and entropy above this value is negatively associated with vocabulary size in the dominant language. This may be explained by a reduction in exposure to each language (Pearson et al., 1997; Unsworth, 2013), as a greater language entropy in multilinguals reflects a more balanced exposure across all languages (i.e., 33% exposure to each language). Additionally, multilingualism places demands on cognitive processing (Bialystok, 2009), which may become more pronounced with the number of languages children are exposed to. In sum, our findings support the idea that there is more to vocabulary acquisition than exposure alone (Byers-Heinlein, 2013), and other factors, such as the number of languages spoken, need to be considered.

7.3. Context entropy

Our investigation into context entropy revealed a positive association with dominant language vocabulary in multilingual children up to a context entropy of 1.03, after which the association becomes negative. This suggests that, initially, introducing more different social contexts may positively impact the dominant language vocabulary. For most children in our sample, their dominant language is the societal language. When the number of social contexts increases (e.g., through institutionalised daycare), children’s exposure to the societal language increases and diversifies, which may increase their vocabulary (see Table S14 in the Supplementary Materials; Soderstrom et al., 2018; Zaretsky & Lange, 2017). However, it has been suggested that children in institutionalised daycare get less direct attention and child-directed speech from group leaders than at home from caregivers (Peterson & Peterson, 1986). This could affect their vocabulary acquisition (Weisleder & Fernald, 2013). Hence, adding more different contexts, with reduced child-directed speech at the cost of contexts with more child-directed speech, may affect language development negatively, explaining the downward part of the curvilinear association. In sum, while increasing social context diversity may initially support dominant language vocabulary growth in multilingual children, excessive diversity could hinder development due to possible reduced exposure to child-directed speech in certain contexts.

7.4. Contextual considerations and implications

The relationships observed in the current study highlight the complex interplay between language environments and vocabulary development, suggesting that the early language environment affects vocabulary differently depending on the language. The linguistic landscape in which this study was conducted, characterised by its linguistic and cultural diversity, offers a unique backdrop that contrasts with more homogeneous multilingual studies on language development (mostly English-Spanish; Francisco et al., 2006; Shiro et al., 2020; followed by Canadian English-French; Comeau et al., 2007; Nicoladis et al., 2009). The diversity of languages and cultures in the country in which this study was conducted likely increases individual differences in vocabulary development, highlighting the need to consider varied linguistic environments in multilingual research. Although this diversity can create variability, the specific traits of our study population may help reduce these differences. The homogeneity of our study population, regarding caregiver education and household income, enabled us to examine other influences on vocabulary. This could explain the smaller differences in vocabulary sizes between multilingual and monolingual children. Higher socio-economic status often correlates with larger vocabularies (Dicataldo & Roch, 2020), which may compensate for the smaller vocabulary sizes typically seen in multilingual preschoolers (Bialystok & Barac, 2012).

8. Limitations

This study offers several strengths, including a multidimensional approach to characterising multilingual environments, comprehensive vocabulary assessment across multiple languages and a linguistically and culturally diverse sample. However, some methodological limitations should be noted. First, the data were drawn from lab-based studies conducted between 2019 and 2024, which applied strict inclusion criteria: children were classified as multilingual only if they received more than 20% exposure to a second language, and as monolingual if exposure was below 10%. As a result, children with low levels of second language exposure were excluded, limiting the generalisability of the findings to those with moderate or high bilingual exposure. Future research should consider treating language exposure as a fully continuous variable and include a broader range of language experiences.

Second, the study assessed receptive vocabulary using nouns only. Prior work suggests that linguistic distance may affect productive and receptive vocabularies differently (Floccia et al., 2018; Kelley & Kohnert, 2012; Potapova et al., 2016), an effect we were unable to detect in this study. Further, receptive vocabulary tests do not provide an accurate representation of children’s linguistic competences (Bogue et al., 2014; Ukrainetz & Blomquist, 2002) and tests that sample nouns only may miss other word types a child may or may not know, which could reduce the precision and generalisability of the test results to overall vocabulary size (Stoeckel et al., 2021). Despite these limitations, this measure was chosen for its strong psychometric properties and its suitability for assessing vocabulary across multiple languages in multilingual children (Gampe et al., 2018).

9. Conclusion

In conclusion, we examined how different indicators of diversity (linguistic distance, language entropy and context entropy) relate to vocabulary outcomes in preschool children. Our findings show that these indicators are associated with vocabulary outcomes in multilingual children, particularly in their dominant language. Linguistic distance was positively associated with dominant language vocabulary scores, up to a threshold, after which linguistic distance shows a negative association. A similar pattern was observed for language entropy with vocabulary in children’s dominant and non-dominant language and for their total vocabulary. Moderate entropy, reflecting balanced exposure to two languages, was associated with larger vocabulary sizes, whereas high entropy levels, indicating exposure to more than two languages, were linked to smaller vocabulary sizes. Similarly, context entropy positively predicted vocabulary outcomes only in the dominant language and only up to a point, beyond which higher context entropy was related to smaller vocabulary sizes. These results confirm the need to consider multilingualism as a multidimensional, continuous variable and highlight that balance and structure of exposure matter, not just the amount.

Future research should continue to explore these dynamics, particularly in diverse linguistic environments constantly changing due to factors like globalisation and migration, to better understand multilingual development and address the unique needs of children growing up in diverse language environments.

Supplementary material

The supplementary material for this article can be found at http://doi.org/10.1017/S1366728925100758.

Data availability statement

The data supporting the findings of this study are openly available in Open Science Framework (OSF, https://osf.io/fhu5e/?view_only=285a0d10bd814d60b88ad0f11c138cc3).

Acknowledgements

We thank the Kleine Weltentdecker*innen Lab and the Jacobs Center for Productive Youth Development at Universität Zürich for providing the resources necessary for this research and all children and their caregivers for their participation in our studies.

Competing interests

The authors declare none.

Appendix

Table A1. Operationalisation of independent variables

Table A2. Operationalisation of dependent variables

Table A3. Linguistic distance and vocabulary sizes

Note: LD-group = Linguistic distance group. The reported category is the high linguistic distance group. The reference category is the low linguistic distance group. Due to missing data, sample sizes varied slightly in the different models: n = 127 for the dominant language vocabulary, n = 124 for the non-dominant language vocabulary, n = 131 for the conceptual vocabulary and n = 133 for the total vocabulary.

Table A4. Linguistic distance and vocabulary sizes in high linguistic distance group

Note: Linguistic distance (LD) as a continuous variable. Due to missing data, sample sizes varied slightly in the different models: n = 96 for the dominant language vocabulary, n = 95 for the non-dominant language vocabulary, n = 118 for the conceptual vocabulary and n = 100 for the total vocabulary.

Table A5. Language entropy and vocabulary sizes

Note: LE = Language entropy, LD-group = Linguistic distance group. Reported LD-group is the high linguistic distance group, with the low linguistic distance group as the category of reference. Due to missing data, sample sizes varied slightly in the different models: n = 127 for the dominant language vocabulary, n = 124 for the non-dominant language vocabulary, n = 131 for the conceptual vocabulary and n = 133 for the total vocabulary.

Table A6. Context entropy and vocabulary sizes in monolingual and multilingual children

Note: The reference category for language status was monolingual children.

Table A7. Context entropy and vocabulary sizes in multilingual children only

Note: Due to missing data, sample sizes varied slightly in the different models: n = 127 for the dominant language vocabulary, n = 123 for the non-dominant language vocabulary, n = 129 for the conceptual vocabulary and n = 132 for the total vocabulary. Only the best-fitting models are reported.

Footnotes

S.W. and M.M.D. share last authorship.

This research article was awarded Open Data and Open Materials badges for transparent practices. See the Data Availability Statement for details.

References

Abdalla, A. A. (2022). Analyzing social factors which explain how the social context affects our Choise of a code or a variety, whether language, dialect, or style, with examples from English-other languages and Libyan Arabic. Journal of Literature, Languages and Linguistics, 93. https://doi.org/10.7176/JLLL/93-01.Google Scholar
Abutalebi, J., & Green, D. W. (2016). Neuroimaging of language control in bilinguals: Neural adaptation and reserve. Bilingualism: Language and Cognition, 19(4), 689698. https://doi.org/10.1017/S1366728916000225.CrossRefGoogle Scholar
Anderson, J. A. E., Mak, L., Keyvani Chahi, A., & Bialystok, E. (2018). The language and social background questionnaire: Assessing degree of bilingualism in a diverse population. Behavior Research Methods, 50(1), 250263. https://doi.org/10.3758/s13428-017-0867-9.CrossRefGoogle Scholar
Antovich, D. M., & Graf Estes, K. (2020). One language or two? Navigating cross-language conflict in statistical word segmentation. Developmental Science, 23(6), e12960. https://doi.org/10.1111/desc.12960.CrossRefGoogle ScholarPubMed
Bakker, D., Müller, A., Velupillai, V., Wichmann, S., Brown, C. H., Brown, P., Egorov, D., Mailhammer, R., Grant, A., & Holman, E. W. (2009). Adding typology to lexicostatistics: A combined approach to language classification. 13(1), 169181. https://doi.org/10.1515/LITY.2009.009CrossRefGoogle Scholar
Batubara, N. A., & Widayati, D. (2022). Language kinship of English, German, and Dutch: A comparative historical linguistic study. International Journal Of Humanities Education and Social Sciences, 1(6). https://doi.org/10.55227/ijhess.v1i6.189.Google Scholar
Bedore, L. M., Peña, E. D., García, M., & Cortez, C. (2005). Conceptual versus monolingual scoring. Language, Speech, and Hearing Services in Schools, 36(3), 188200. https://doi.org/10.1044/0161-1461(2005/020.CrossRefGoogle ScholarPubMed
Bergelson, E., & Aslin, R. N. (2017). Nature and origins of the lexicon in 6-mo-olds. Proceedings of the National Academy of Sciences, 114(49), 1291612921. https://doi.org/10.1073/pnas.1712966114.CrossRefGoogle ScholarPubMed
Bialystok, E. (2009). Bilingualism: The good, the bad, and the indifferent. Bilingualism: Language and Cognition, 12(1), 311. https://doi.org/10.1017/S1366728908003477.CrossRefGoogle Scholar
Bialystok, E., & Barac, R. (2012). Emerging bilingualism: Dissociating advantages for metalinguistic awareness and executive control. Cognition, 122(1), 6773. https://doi.org/10.1016/j.cognition.2011.08.003.CrossRefGoogle ScholarPubMed
Bleses, D., Vach, W., Slott, M., Wehberg, S., Thomsen, P., Madsen, T. O., & Basbøll, H. (2008). Early vocabulary development in Danish and other languages: A CDI-based comparison. Journal of Child Language, 35(3), 619650. https://doi.org/10.1017/S0305000908008714.CrossRefGoogle ScholarPubMed
Blumenfeld, H. K., & Marian, V. (2013). Parallel language activation and cognitive control during spoken word recognition in bilinguals. Journal of Cognitive Psychology, 25(5), 547567. https://doi.org/10.1080/20445911.2013.812093.CrossRefGoogle ScholarPubMed
Bogue, E. L., DeThorne, L. S., & Schaefer, B. A. (2014). A psychometric analysis of childhood vocabulary tests. Contemporary Issues in Communication Science and Disorders, 41(Spring), 5569. https://doi.org/10.1044/cicsd_41_S_55.CrossRefGoogle Scholar
Bosma, E., Blom, E., Hoekstra, E., & Versloot, A. (2019). A longitudinal study on the gradual cognate facilitation effect in bilingual children’s Frisian receptive vocabulary. International Journal of Bilingual Education and Bilingualism, 22(4), 371385. https://doi.org/10.1080/13670050.2016.1254152.CrossRefGoogle Scholar
Brink, K. A., Lane, J. D., & Wellman, H. M. (2015). Developmental pathways for social understanding: Linking social cognition to social contexts. Frontiers in Psychology, 6. https://doi.org/10.3389/fpsyg.2015.00719.CrossRefGoogle ScholarPubMed
Byers-Heinlein, K. (2013). Parental language mixing: Its measurement and the relation of mixed input to young bilingual children’s vocabulary size. Bilingualism: Language and Cognition, 16(1), 3248. https://doi.org/10.1017/S1366728912000120.CrossRefGoogle Scholar
Chai, X., & Bao, J. (2023). Linguistic distances between native languages and Chinese influence acquisition of Chinese character, vocabulary, and grammar. Frontiers in Psychology, 13. https://doi.org/10.3389/fpsyg.2022.1083574.CrossRefGoogle ScholarPubMed
Comeau, L., Genesee, F., & Mendelson, M. (2007). Bilingual children’s repairs of breakdowns in communication. Journal of Child Language, 34(1), 159174. https://doi.org/10.1017/S0305000906007690.CrossRefGoogle ScholarPubMed
Cover, T. M., & Thomas, J. A. (2006). Elements of information theory (2nd ed.). Wiley.Google Scholar
Cummins, J. (1978). Bilingualism and the development of metalinguistic awareness. Journal of Cross-Cultural Psychology, 9(2), 131149. https://doi.org/10.1177/002202217892001.CrossRefGoogle Scholar
Delle Luche, C., Poltrock, S., Goslin, J., New, B., Floccia, C., & Nazzi, T. (2014). Differential processing of consonants and vowels in the auditory modality: A cross-linguistic study. Journal of Memory and Language, 72, 115. https://doi.org/10.1016/j.jml.2013.12.001.CrossRefGoogle Scholar
Dicataldo, R., & Roch, M. (2020). Are the effects of variation in quantity of daily bilingual exposure and socioeconomic status on language and cognitive abilities independent in preschool children? International Journal of Environmental Research and Public Health, 17(12), 4570. https://doi.org/10.3390/ijerph17124570.CrossRefGoogle ScholarPubMed
Dryer, M. S., & Haspelmath, M. (2013). The world atlas of language structures online. http://wals.info/.Google Scholar
Duncan, R. J., Anderson, K. L., King, Y. A., Finders, J. K., Schmitt, S. A., & Purpura, D. J. (2022). Predictors of preschool language environments and their relations to children’s vocabulary. Infant and Child Development, 32(1), e2381. https://doi.org/10.1002/icd.2381.CrossRefGoogle Scholar
Floccia, C., Sambrook, T. D., Delle Luche, C., Kwok, R., Goslin, J., White, L., Cattani, A., Sullivan, E., Abbot-Smith, K., Krott, A., Mills, D., Rowland, C., Gervain, J., & Plunkett, K. (2018). I: Introduction. Monographs of the Society for Research in Child Development, 83(1), 729. https://doi.org/10.1111/mono.12348.CrossRefGoogle ScholarPubMed
Francisco, A. R. S., Carlo, M., August, D., & Snow, C. E. (2006). The role of language of instruction and vocabulary in the English phonological awareness of Spanish–English bilingual children. Applied PsychoLinguistics, 27(2), 229246. https://doi.org/10.1017/S0142716406060267.CrossRefGoogle Scholar
French, R. M., & Jacquet, M. (2004). Understanding bilingual memory: Models and data. Trends in Cognitive Sciences, 8(2), 8793. https://doi.org/10.1016/j.tics.2003.12.011.CrossRefGoogle ScholarPubMed
Gampe, A., Endesfelder Quick, A., & Daum, M. M. (2021). Does linguistic similarity affect early simultaneous bilingual language acquisition? Journal of Language Contact, 13(3), 482500. https://doi.org/10.1163/19552629-13030001.CrossRefGoogle Scholar
Gampe, A., Kurthen, I., & Daum, M. M. (2018). BILEX: A new tool measuring bilingual children’s lexicons and translational equivalents. First Language, 38(3), 263283. https://doi.org/10.1177/0142723717736450.CrossRefGoogle Scholar
Garrido-Pozú, J. J. (2024). Cross-linguistic effects of form overlap in aural recognition of Spanish–English cognates. Bilingualism: Language and Cognition, 27(5), 914926. https://doi.org/10.1017/S1366728924000270.CrossRefGoogle Scholar
Goodman, J. C., Dale, P. S., & Li, P. (2008). Does frequency count? Parental input and the acquisition of vocabulary. Journal of Child Language, 35(3), 515531. https://doi.org/10.1017/S0305000907008641.CrossRefGoogle ScholarPubMed
Green, D. W., & Abutalebi, J. (2013). Language control in bilinguals: The adaptive control hypothesis. Journal of Cognitive Psychology, 25(5), 515530. https://doi.org/10.1080/20445911.2013.796377.CrossRefGoogle ScholarPubMed
Gullifer, J. W., Kousaie, S., Gilbert, A. C., Grant, A., Giroud, N., Coulter, K., Klein, D., Baum, S., Phillips, N., & Titone, D. (2021). Bilingual language experience as a multidimensional spectrum: Associations with objective and subjective language proficiency. Applied PsychoLinguistics, 42(2), 245278. https://doi.org/10.1017/S0142716420000521.CrossRefGoogle Scholar
Gullifer, J. W., & Titone, D. (2018). Compute language entropy with languageentropy [Computer software]. https://github.com/jasongullifer/languageEntropy.Google Scholar
Gullifer, J. W., & Titone, D. (2020). Characterizing the social diversity of bilingualism using language entropy. Bilingualism: Language and Cognition, 23(2), 283294. https://doi.org/10.1017/S1366728919000026.CrossRefGoogle Scholar
Hansen, B. S. S., & Kroonen, G. J. (2022). Germanic. In Olander, T. (Ed.), The Indo-European language family (pp. 152172). Cambridge University Press. https://doi.org/10.1017/9781108758666.010.CrossRefGoogle Scholar
Hart, B., & Risley, T. R. (2003). The early catasrophe. The 30 million word gap by age 3. American Educator, 27(1), 49.Google Scholar
Heitz, R. P. (2014). The speed-accuracy tradeoff: History, physiology, methodology, and behavior. Frontiers in Neuroscience, 8. https://doi.org/10.3389/fnins.2014.00150.CrossRefGoogle ScholarPubMed
Hoff, E., & Core, C. (2013). Input and language development in bilingually developing children. Seminars in Speech and Language, 34(4), 215. https://doi.org/10.1055/s-0033-1353448.Google ScholarPubMed
Hoff, E., Core, C., Place, S., Rumiche, R., Señor, M., & Parra, M. (2012). Dual language exposure and early bilingual development. Journal of Child Language, 39(1), 127. https://doi.org/10.1017/S0305000910000759.CrossRefGoogle ScholarPubMed
Hoff, E., & Ribot, K. M. (2017). Language growth in English monolingual and Spanish-English bilingual children from 2.5 to 5 years. The Journal of Pediatrics, 190, 241, e1–245. https://doi.org/10.1016/j.jpeds.2017.06.071.CrossRefGoogle ScholarPubMed
Hoff, E., Rumiche, R., Burridge, A., Ribot, K., & Welsh, S. (2014). Expressive vocabulary development in children from bilingual and monolingual homes: A longitudinal study from two to four years. Early Childhood Research Quarterly, 29. https://doi.org/10.1016/j.ecresq.2014.04.012.CrossRefGoogle Scholar
Holman, E. W., Brown, C. H., Wichmann, S., Müller, A., Velupillai, V., Hammarström, H., Sauppe, S., Jung, H., Bakker, D., Brown, P., Belyaev, O., Urban, M., Mailhammer, R., List, J.-M., & Egorov, D. (2011). Automated dating of the world’s language families based on lexical similarity. Current Anthropology, 52(6), 841875. https://doi.org/10.1086/662127.CrossRefGoogle Scholar
Howard, L. H., Carrazza, C., & Woodward, A. L. (2014). Neighborhood linguistic diversity predicts infants’ social learning. Cognition, 133(2), 474479. https://doi.org/10.1016/j.cognition.2014.08.002.CrossRefGoogle ScholarPubMed
Huang, K.-J. (2018). On bilinguals’ development of metalinguistic awareness and its transfer to L3 learning: The role of language characteristics. International Journal of Bilingualism, 22(3), 330349. https://doi.org/10.1177/1367006916681081.CrossRefGoogle Scholar
Huttenlocher, J. (1998). Language input and language growth. Preventive Medicine, 27(2), 195199. https://doi.org/10.1006/pmed.1998.0301.CrossRefGoogle ScholarPubMed
Jaekel, N., Ritter, M., & Jaekel, J. (2023). Associations of students’ linguistic distance to the language of instruction and classroom composition with English reading and listening skills. Studies in Second Language Acquisition, 45(5), 12871309. https://doi.org/10.1017/S0272263123000268.CrossRefGoogle Scholar
Kachergis, G., Marchman, V. A., & Frank, M. C. (2022). Toward a “standard model” of early language learning. Current Directions in Psychological Science, 31(1), 2027. https://doi.org/10.1177/09637214211057836.CrossRefGoogle Scholar
Kang, J. Y. (2012). Do bilingual children possess better phonological awareness? Investigation of Korean monolingual and Korean-English bilingual children. Reading and Writing, 25(2), 411431. https://doi.org/10.1007/s11145-010-9277-4.CrossRefGoogle Scholar
Kastenbaum, J., Bedore, L., Peña, E., Sheng, L., Mavis, İ., Sebastian, R., Rangmani, G., Vallila-Rohter, S., & Kiran, S. (2018). The influence of proficiency and language combination on bilingual lexical access. Bilingualism: Language and Cognition, 22, 131. https://doi.org/10.1017/S1366728918000366.Google ScholarPubMed
Kelley, A., & Kohnert, K. (2012). Is there a cognate advantage for typically developing Spanish-speaking English-language learners? Language, Speech, and Hearing Services in Schools, 43(2), 191204. https://doi.org/10.1044/0161-1461(2011/10-0022.CrossRefGoogle Scholar
Kohnert, K. (2010). Bilingual children with primary language impairment: Issues, evidence and implications for clinical actions. Journal of Communication Disorders, 43(6), 456473. https://doi.org/10.1016/j.jcomdis.2010.02.002.CrossRefGoogle ScholarPubMed
Konstantinidis, S. (2007). Computing the edit distance of a regular language. Information and Computation, 205(9), 13071316. https://doi.org/10.1016/j.ic.2007.06.001.CrossRefGoogle Scholar
Koutamanis, E., Kootstra, G. J., Dijkstra, T., & Unsworth, S. (2025). The role of cognates and language distance in simultaneous bilingual children’s productive vocabulary acquisition. Language Learning, 75(2), 347378. https://doi.org/10.1111/lang.12666.CrossRefGoogle Scholar
Kremin, L. V., & Byers-Heinlein, K. (2021). Why not both? Rethinking categorical and continuous approaches to bilingualism. International Journal of Bilingualism, 25(6), 15601575. https://doi.org/10.1177/13670069211031986.CrossRefGoogle Scholar
Lauro, J., Core, C., & Hoff, E. (2020). Explaining individual differences in trajectories of simultaneous bilingual development: Contributions of child and environmental factors. Child Development, 91(6), 20632082. https://doi.org/10.1111/cdev.13409.CrossRefGoogle ScholarPubMed
Levy, R. (2008). A Noisy-Channel Model of Human Sentence Comprehension under Uncertain Input. In Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing, pp. 234243, Honolulu, Hawaii: Association for Computational Linguistics.Google Scholar
Lin, Y.-C., Lin, P.-Y., & Yeh, L.-H. (2023). Syllable or phoneme? A mouse-tracking investigation of phonological units in mandarin Chinese and English spoken word recognition. Journal of Experimental Psychology: Learning, Memory, and Cognition, 49(1), 130176. https://doi.org/10.1037/xlm0001128.Google ScholarPubMed
Mani, N., & Plunkett, K. (2007). Phonological specificity of vowels and consonants in early lexical representations. Journal of Memory and Language, 57(2), 252272. https://doi.org/10.1016/j.jml.2007.03.005.CrossRefGoogle Scholar
Marchman, V. A., & Fernald, A. (2008). Speed of word recognition and vocabulary knowledge in infancy predict cognitive and language outcomes in later childhood. Developmental Science, 11(3), F9F16. https://doi.org/10.1111/j.1467-7687.2008.00671.x.CrossRefGoogle ScholarPubMed
Marian, V., Blumenfeld, H. K., & Boukrina, O. V. (2008). Sensitivity to phonological similarity within and across languages. Journal of Psycholinguistic Research, 37(3), 141170. https://doi.org/10.1007/s10936-007-9064-9.CrossRefGoogle ScholarPubMed
Marvin, C., Beukelman, D., & Bilyeu, D. (2009). Vocabulary-use patterns in preschool children: Effects of context and time sampling. Augmentative and Alternative Communication, 10(4), 224236. https://doi.org/10.1080/07434619412331276930.CrossRefGoogle Scholar
Mateu, V., & Sundara, M. (2022). Spanish input accelerates bilingual infants’ segmentation of English words. Cognition, 218, 104936. https://doi.org/10.1016/j.cognition.2021.104936.CrossRefGoogle ScholarPubMed
Muszyńska, K., Kołak, J., Haman, E., Białecka-Pikul, M., & Otwinowska, A. (2024a). Metacognitive verbs do not show a cross-language gap: An investigation of metacognitive and concrete verbs in bilingual children. International Journal of Bilingualism, 28(3), 316336. https://doi.org/10.1177/13670069221149941.CrossRefGoogle Scholar
Muszyńska, K., Łuniewska, M., Dynak, A., Kolak, J., Lohrum, R., Otwinowska, A., Wodniecka, Z., & Haman, E. (2024b). Bilinguals’ knowledge of ‘home’ and ‘school’ words revisited: Evidence from polish-English bilinguals. International Journal of Bilingual Education and Bilingualism, 28(1), 7391. https://doi.org/10.1080/13670050.2024.2399639.CrossRefGoogle Scholar
Neuhauser, A., Ramseier, E., Schaub, S., Burkhardt, S. C. A., & Lanfranchi, A. (2018). Mediating role of maternal sensitivity: Enhancing language development in at-risk families. Infant Mental Health Journal: Infancy and Early Childhood, 39(5), 522536. https://doi.org/10.1002/imhj.21738.CrossRefGoogle ScholarPubMed
Nicoladis, E., Pika, S., & Marentette, P. (2009). Do French–English bilingual children gesture more than monolingual children? Journal of Psycholinguistic Research, 38(6), 573585. https://doi.org/10.1007/s10936-009-9121-7.CrossRefGoogle ScholarPubMed
Nishibayashi, L.-L., & Nazzi, T. (2016). Vowels, then consonants: Early bias switch in recognizing segmented word forms. Cognition, 155, 188203. https://doi.org/10.1016/j.cognition.2016.07.003.CrossRefGoogle ScholarPubMed
Orena, A. J., & Polka, L. (2019). Monolingual and bilingual infants’ word segmentation abilities in an inter-mixed dual-language task. Infancy, 24(5), 718737. https://doi.org/10.1111/infa.12296.CrossRefGoogle Scholar
Paradis, J. (2023). Sources of individual differences in the dual language development of heritage bilinguals. Journal of Child Language, 50(4), 793817. https://doi.org/10.1017/S0305000922000708.CrossRefGoogle ScholarPubMed
Pearson, B. Z., Fernandez, S. C., Lewedeg, V., & Oller, D. K. (1997). The relation of input factors to lexical learning by bilingual infants. Applied PsychoLinguistics, 18(1), 4158. https://doi.org/10.1017/S0142716400009863.CrossRefGoogle Scholar
Pearson, B. Z., Fernández, S. C., & Oller, D. K. (1993). Lexical development in bilingual infants and toddlers: Comparison to monolingual norms. Language Learning, 43(1), 93120. https://doi.org/10.1111/j.1467-1770.1993.tb00174.x.CrossRefGoogle Scholar
Peterson, C., & Peterson, R. (1986). Parent—Child interaction and daycare: Does quality of daycare matter? Journal of Applied Developmental Psychology, 7(1), 115. https://doi.org/10.1016/0193-3973(86)90015-8.CrossRefGoogle Scholar
Poarch, G. J., & van Hell, J. G. (2012). Cross-language activation in children’s speech production: Evidence from second language learners, bilinguals, and trilinguals. Journal of Experimental Child Psychology, 111(3), 419438. https://doi.org/10.1016/j.jecp.2011.09.008.CrossRefGoogle ScholarPubMed
Potapova, I., Blumenfeld, H. K., & Pruitt-Lord, S. (2016). Cognate identification methods: Impacts on the cognate advantage in adult and child Spanish-English bilinguals. International Journal of Bilingualism, 20(6), 714731. https://doi.org/10.1177/1367006915586586.CrossRefGoogle ScholarPubMed
Quirk, E., & Cohen, C. (2022). The development of the cognate advantage from elementary to middle school years in French-English bilinguals attending a dual language program in France. International Journal of Bilingual Education and Bilingualism, 25(10), 38593874. https://doi.org/10.1080/13670050.2022.2087468.CrossRefGoogle Scholar
Robinson Anthony, J. J. D., Blumenfeld, H. K., Potapova, I., & Pruitt-Lord, S. L. (2022). Language dominance predicts cognate effects and metalinguistic awareness in preschool bilinguals. International Journal of Bilingual Education and Bilingualism, 25(3), 922941. https://doi.org/10.1080/13670050.2020.1735990.CrossRefGoogle ScholarPubMed
Rowe, M. L. (2012). A longitudinal investigation of the role of quantity and quality of child-directed speech in vocabulary development. Child Development, 83(5), 17621774. https://doi.org/10.1111/j.1467-8624.2012.01805.x.CrossRefGoogle ScholarPubMed
Rowe, M. L., & Goldin-Meadow, S. (2009). Differences in early gesture explain SES disparities in child vocabulary size at school entry. Science, 323(5916), 951953. https://doi.org/10.1126/science.1167025.CrossRefGoogle ScholarPubMed
Sander-Montant, A., López Pérez, M., & Byers-Heinlein, K. (2023). The more they hear the more they learn? Using data from bilinguals to test models of early lexical development. Cognition, 238, 105525. https://doi.org/10.1016/j.cognition.2023.105525.CrossRefGoogle ScholarPubMed
Scrucca, L., Fraley, C., Murphy, T. B., & Adrian, E., R. (2023). Model-Based Clustering, Classification, and Density Estimation Using mclust in R. Chapman and Hall/CRC. https://doi.org/10.1201/9781003277965CrossRefGoogle Scholar
Shannon, C. E. (1948). A mathematical theory of communication. Bell System Technical Journal, 27(3), 379423. https://doi.org/10.1002/j.1538-7305.1948.tb01338.x.CrossRefGoogle Scholar
Shiro, M., Hoff, E., & Ribot, K. M. (2020). Cultural differences in the content of child talk: Evaluative lexis of English monolingual and Spanish-English bilingual 30-month-olds. Journal of Child Language, 47(4), 844869. https://doi.org/10.1017/S0305000919000990.CrossRefGoogle Scholar
Singh, L. (2021). Evidence for an early novelty orientation in bilingual learners. Child Development Perspectives, 15(2), 110116. https://doi.org/10.1111/cdep.12407.CrossRefGoogle Scholar
Soderstrom, M., Grauer, E., Dufault, B., & McDivitt, K. (2018). Influences of number of adults and adult: Child ratios on the quantity of adult language input across childcare settings. First Language, 38(6), 563581. https://doi.org/10.1177/0142723718785013.CrossRefGoogle Scholar
Søren, W., Holman, E. W., & Brown, C. H.. (2022). The ASJP databse (version 20) (No. ASJP; Version 20) [Dataset].Google Scholar
Spolaore, E., & Wacziarg, R. (2016). Ancestry, language and culture. In Ginsburgh, V. & Weber, S. (Eds.), The Palgrave handbook of economics and language (pp. 174211). Palgrave Macmillan UK. https://doi.org/10.1007/978-1-137-32505-1_7.Google Scholar
Squires, L. R., Ohlfest, S. J., Santoro, K. E., & Roberts, J. L. (2020). Factors influencing cognate performance for young multilingual children’s vocabulary: A research synthesis. American Journal of Speech-Language Pathology, 29(4), 21702188. https://doi.org/10.1044/2020_AJSLP-19-00167.CrossRefGoogle ScholarPubMed
Stoeckel, T., McLean, S., & Nation, P. (2021). Limitations of size and levels tests of written receptive vocabulary knowledge. Studies in Second Language Acquisition, 43(1), 181203. https://doi.org/10.1017/S027226312000025X.CrossRefGoogle Scholar
Straker, D. (1980). Situational variables in language use. 1980 (26), 101122. https://doi.org/10.1515/ijsl.1980.26.101CrossRefGoogle Scholar
Swadesh, M. (1955). Towards greater accuracy in lexicostatistic dating. International Journal of American Linguistics, 21(2), 121137.10.1086/464321CrossRefGoogle Scholar
Tan, A. W. M., Marchman, V. A., & Frank, M. C. (2024). The role of translation equivalents in bilingual word learning. Developmental Science, 27(4), e13476. https://doi.org/10.1111/desc.13476.CrossRefGoogle ScholarPubMed
Thordardottir, E. T. (2005). Early lexical and syntactic development in Quebec French and English: Implications for cross-linguistic and bilingual assessment. International Journal of Language & Communication Disorders, 40(3), 243278. https://doi.org/10.1080/13682820410001729655.CrossRefGoogle ScholarPubMed
Titone, D. A., & Tiv, M. (2023). Rethinking multilingual experience through a systems framework of bilingualism: Response to commentaries. Bilingualism: Language and Cognition, 26(1), 247251. https://doi.org/10.1017/S1366728922000785.CrossRefGoogle Scholar
Ukrainetz, T. A., & Blomquist, C. (2002). The criterion validity of four vocabulary tests compared with a language sample. Child Language Teaching and Therapy, 18(1), 5978. https://doi.org/10.1191/0265659002ct227oa.CrossRefGoogle Scholar
Unsworth, S. (2013). Assessing the role of current and. Bilingualism: Language and Cognition, 16(1), 86110. https://doi.org/10.1017/S1366728912000284.CrossRefGoogle Scholar
Van der Slik, F. W. P. (2010). Acquisition of Dutch as a second language: The explanative power of cognate and genetic linguistic distance measures for 11 West European first languages. Studies in Second Language Acquisition, 32(3), 401432. https://doi.org/10.1017/S0272263110000021.CrossRefGoogle Scholar
Varga, S. (2021). The relationship between reading skills and metalinguistic awareness. Gradus, 8(1), 5257. https://doi.org/10.47833/2021.1.ART.001.CrossRefGoogle Scholar
Von Holzen, K., & Mani, N. (2012). Language nonselective lexical access in bilingual toddlers. Journal of Experimental Child Psychology, 113(4), 569586. https://doi.org/10.1016/j.jecp.2012.08.001.CrossRefGoogle ScholarPubMed
Weisleder, A., & Fernald, A. (2013). Talking to children matters: Early language experience strengthens processing and builds vocabulary. Psychological Science, 24(11), 21432152. https://doi.org/10.1177/0956797613488145.CrossRefGoogle ScholarPubMed
Wermelinger, S., Daum, M. M., & Gampe, A. (2024). From everyday exposure to pragmatic mastery: The COME perspective. International Review of Pragmatics, 16(1), 149161. https://doi.org/10.1163/18773109-01601006.CrossRefGoogle Scholar
Wermelinger, S., Gampe, A., & Daum, M. M. (2017). Bilingual toddlers have advanced abilities to repair communication failure. Journal of Experimental Child Psychology, 155, 8494. https://doi.org/10.1016/j.jecp.2016.11.005.CrossRefGoogle ScholarPubMed
Wichmann, S. (2020). How to distinguish languages and dialects. Computational Linguistics, 45(4), 823831. https://doi.org/10.1162/coli_a_00366.CrossRefGoogle Scholar
Zaretsky, E., & Lange, B. P. (2017). Sociolinguistic factors associated with the subjectively and objectively measured language development in German preschoolers in three follow-up studies. Linguistics, 55(1), 3965. https://doi.org/10.1515/ling-2016-0036.CrossRefGoogle Scholar
Zhang, D., Chin, C.-F., & Li, L. (2017). Metalinguistic awareness in bilingual children’s word reading: A cross-lagged panel study on cross-linguistic transfer facilitation. Applied PsychoLinguistics, 38(2), 395426. https://doi.org/10.1017/S0142716416000278.CrossRefGoogle Scholar
Figure 1. Distribution of entropy for (a) two elements (languages/social contexts) and (b) for three elements. Note that the distributions peak at equal distributions across the given number of elements.
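
The peak at equal distributions noted in the Figure 1 caption can be checked numerically. The following is a minimal sketch, assuming Shannon entropy with a base-2 logarithm (so values are in bits); the function name `shannon_entropy` and the example proportions are hypothetical illustrations, not the authors' implementation or data.

```python
import math


def shannon_entropy(proportions):
    """Return Shannon entropy in bits for proportions that sum to 1.

    Assumption: base-2 logarithm; zero proportions contribute nothing.
    """
    if not math.isclose(sum(proportions), 1.0, abs_tol=1e-9):
        raise ValueError("proportions must sum to 1")
    return sum(-p * math.log2(p) for p in proportions if p > 0)


# (a) Two elements, e.g. two languages or two social contexts.
balanced_two = shannon_entropy([0.5, 0.5])   # equal exposure: the maximum
skewed_two = shannon_entropy([0.9, 0.1])     # unequal exposure: lower
single = shannon_entropy([1.0, 0.0])         # only one element used: zero

# (b) Three elements: the maximum again lies at equal proportions.
balanced_three = shannon_entropy([1 / 3, 1 / 3, 1 / 3])
skewed_three = shannon_entropy([0.8, 0.1, 0.1])

print(f"two, balanced:   {balanced_two:.3f}")
print(f"two, skewed:     {skewed_two:.3f}")
print(f"three, balanced: {balanced_three:.3f}")
print(f"three, skewed:   {skewed_three:.3f}")
```

Under these assumptions the maximum is 1 bit for two elements and log2(3) bits for three, which is why the two panels have different upper bounds.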

Figure 2. Graphs a and b show group-level analyses comparing the vocabularies of monolingual and multilingual children. Graphs c to f show vocabulary differences in multilingual children by the two identified linguistic distance groups (Low and High).

Table 1. Best fitting models

Figure 3. Associations of language entropy and children’s vocabulary scores. The solid line shows the predicted vocabulary scores from the statistical model. The shaded area represents the 95% confidence interval around these predictions.

Figure 4. Associations of context entropy and children’s vocabulary scores. The solid line shows the predicted vocabulary scores from the statistical model. The shaded area represents the 95% confidence interval around these predictions. Note that the vocabulary scores on the dominant language (c) and the non-dominant language (d) are based on data of multilingual children only.

Table A1. Operationalisation of independent variables

Table A2. Operationalisation of dependent variables

Table A3. Linguistic distance and vocabulary sizes

Table A4. Linguistic distance and vocabulary sizes in high linguistic distance group

Table A5. Language entropy and vocabulary sizes

Table A6. Context entropy and vocabulary sizes in monolingual and multilingual children

Supplementary material: File

Aalders et al. supplementary material
