The Development of Canonical Proportion as a Function of Community, Multilingualism, and Target Language’s Syllable Complexity

Published online by Cambridge University Press:  19 February 2026

Kai Jia Tey*
Affiliation:
Département d’Etudes Cognitives, Laboratoire de Sciences Cognitives et de Psycholinguistique, ENS, EHESS, CNRS, Paris, France
Sarah Walker
Affiliation:
School of Economics, University of New South Wales, Australia
Amanda Seidl
Affiliation:
Department of Communication Sciences and Disorders, University of Delaware, USA
Camila Scaff
Affiliation:
Institute of Evolutionary Medicine, Switzerland
Loann Peurey
Affiliation:
Département d’Etudes Cognitives, Laboratoire de Sciences Cognitives et de Psycholinguistique, ENS, EHESS, CNRS, Paris, France
Bridgette L. Kelleher
Affiliation:
Department of Psychological Sciences, Purdue University, USA
William Havard
Affiliation:
Laboratoire Ligérien de Linguistique, University of Orléans, France
Lisa Hamrick
Affiliation:
Department of Psychology, University of South Carolina, USA
Pauline Grosjean
Affiliation:
School of Economics, University of New South Wales, Australia
Margaret Cychosz
Affiliation:
Department of Linguistics, University of California Los Angeles, USA
Heidi Colleran
Affiliation:
BirthRites Lise Meitner Research Group, Max Planck Institute for Evolutionary Anthropology, Germany
Marisa Casillas
Affiliation:
Comparative Human Development, University of Chicago Division of the Social Sciences, USA
Elika Bergelson
Affiliation:
Department of Psychology, Harvard University, USA
Kasia Hitczenko
Affiliation:
Department of Computer Science, The George Washington University, USA
Alejandrina Cristia
Affiliation:
Département d’Etudes Cognitives, Laboratoire de Sciences Cognitives et de Psycholinguistique, ENS, EHESS, CNRS, Paris, France
*Corresponding author: Kai Jia Tey; Email: kaijiatey@gmail.com

Abstract

This study investigates the development of canonical proportion (CP), an indicator of speech development, across diverse language and environmental contexts. Using the Speech Maturity Dataset (SMD), which comprises 366 children aged 0;2–6;4 across 10 languages and cultures, we explore the influence of multilingual exposure, language syllable complexity, and community type (industrialised, non-industrialised) on CP. We find that monolingual children display higher CP measures than their multilingual peers. In addition, CP is higher for children learning languages with simpler syllables than for those learning languages with more complex syllables. We also find no significant differences in the CP trajectory of children from industrialised versus non-industrialised communities. Integrating these findings into the broader literature, we highlight the importance of diversifying participant samples to capture the complex relationship between language exposure, social environment, and language development.

Information

Type
Research Article
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© The Author(s), 2026. Published by Cambridge University Press

Table 1. Number of participants included in each corpus in previous studies and in the present dataset. The three papers largely build on each other. Hitczenko et al. (2023) included all the data in Cychosz et al. (2021), except for USA–California. Our study includes all the data in Hitczenko et al. and more.

Table 2. Number of CP measures across age range, by corpus

Table 3. Summary of corpus characteristics

Figure 1. Canonical proportions by age and multilingual exposure (full sample). The regression line represents the fitted model, and the shaded bands surrounding the line represent the 95% confidence intervals. Each data point represents a single child, with point size representing the total number of vocalisations contributed by that child (larger points represent children who produced more vocalisations).

Figure 2. Canonical proportions by age and syllable complexity in monolingual children. The regression line represents the fitted model, and the shaded bands surrounding the line represent 95% confidence intervals. Each data point represents a single child, with point size indicating the total number of vocalisations contributed by that child (larger points represent children who produced more vocalisations). *Others represent French, Tseltal, and English.

Figure 3. Canonical proportions by age and community. The regression line represents the fitted model, and the shaded bands surrounding the line represent 95% confidence intervals. Each data point represents a single child (N = 115 non-industrialised; N = 33 industrialised), with point size indicating the total number of vocalisations contributed by that child (larger points represent children who produced more vocalisations).

Supplementary material: File

Tey et al. supplementary material
