
Gendered speech development in early childhood: Evidence from a longitudinal study of vowel and consonant acoustics

Published online by Cambridge University Press: 04 April 2025

Eugene Wong*
Affiliation:
Department of Speech-Language-Hearing Sciences, University of Minnesota, Minneapolis, USA
Kiana Koeppe
Affiliation:
Department of Speech-Language-Hearing Sciences, University of Minnesota, Minneapolis, USA
Margaret Cychosz
Affiliation:
Department of Linguistics, University of California, Los Angeles, USA
Benjamin Munson
Affiliation:
Department of Speech-Language-Hearing Sciences, University of Minnesota, Minneapolis, USA
*Corresponding author: Eugene Wong; Email: wong0703@umn.edu

Abstract

Adults rate the speech of children assigned male at birth (AMAB) and assigned female at birth (AFAB) as young as 2.5 years of age differently on a scale of definitely a boy to definitely a girl (Munson et al., 2022), despite the lack of consistent sex dimorphism in children’s speech production mechanisms. This study used longitudinal data to examine the acoustic differences between AMAB and AFAB children and the association between the acoustic measures and perceived gender ratings of children’s speech. We found differences between AMAB and AFAB children in two acoustic parameters that mark gender in adult speech: the spectral centroid of /s/ and the overall scaling of resonant frequencies in vowels. These results demonstrate that children as young as 3 years old speak in ways that reflect their sex assigned at birth. We interpret this as evidence that children manipulate their speech apparatus volitionally to mark gender through speech.

Information

Type
Article
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© The Author(s), 2025. Published by Cambridge University Press

Table 1. Full list of test words at the first time point and last time point, separated by vowel. Asterisks denote words that were used as stimuli in the perceived gender rating study. Words that were used for the /s/ analysis are italicized.

Figure 1. Illustration of vowel-space dispersion (VSD) measurement in one child. VSD is taken as the mean of the Euclidean distances between the centre of the ΔF normalized F1 by F2 vowel space and all vowel tokens.
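The VSD definition in this caption can be illustrated with a minimal sketch. It assumes the formant values have already been ΔF normalized, and it assumes the centre of the vowel space is the mean F1 and mean F2 across all tokens; the caption does not say how the centre is computed, so that choice is an assumption here, as are the function name and the toy values.

```python
import math


def vowel_space_dispersion(tokens):
    """Mean Euclidean distance from the centre of the F1 x F2 space to each token.

    `tokens` is a list of (F1, F2) pairs, assumed to be normalized already.
    The centre is taken as the mean of all tokens (an assumption, see lead-in).
    """
    if not tokens:
        raise ValueError("need at least one vowel token")
    centre_f1 = sum(f1 for f1, _ in tokens) / len(tokens)
    centre_f2 = sum(f2 for _, f2 in tokens) / len(tokens)
    distances = [
        math.hypot(f1 - centre_f1, f2 - centre_f2) for f1, f2 in tokens
    ]
    return sum(distances) / len(distances)


# Hypothetical normalized (F1, F2) values forming a square around (1, 1).
toy_tokens = [(0.0, 0.0), (0.0, 2.0), (2.0, 0.0), (2.0, 2.0)]
toy_vsd = vowel_space_dispersion(toy_tokens)
```

With the toy values, every token lies the same distance from the centre, so the VSD equals that single distance. A larger VSD indicates tokens spread further from the centre of the child's vowel space.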

Table 2. Correlation matrix with confidence intervals of the four scaled acoustic variables at the first time point

Table 3. Correlation matrix with confidence intervals of the four scaled acoustic variables at the last time point

Table 4. Linear mixed-effect model predicting acoustic vocal-tract length from time point and sex assigned at birth, with a random intercept of child

Figure 2. Violin plots of acoustic vocal-tract length of the 110 children, separated by sex assigned at birth and time points.

Table 5. Linear mixed-effect model predicting vowel-space dispersion from time point, sex assigned at birth, and mean vowel duration, with a random intercept of child

Figure 3. Violin plots of vowel-space dispersion of the 110 children, separated by sex assigned at birth and time points.

Table 6. Linear mixed-effect model predicting fundamental frequency (f0) from time point and sex assigned at birth, with random slopes of time point by child and time point by word item

Figure 4. Violin plots of the fundamental frequency of the 110 children, separated by sex assigned at birth and time points.

Table 7. Generalized mixed-effect model predicting /s/ spectral centroid from time point and sex assigned at birth, with random slopes of time point by child and sex assigned at birth by word item

Figure 5. Violin plots of /s/ spectral centroid of the 110 children, separated by sex assigned at birth (SAB) and time points.

Table 8. Generalized mixed-effect model of perceived gender ratings predicted by sex assigned at birth (SAB), time point, acoustic vocal-tract length (aVTL), vowel-space dispersion (VSD), and mean fundamental frequency (f0). Formula = Rating ~ Time point * SAB * Mean f0 + aVTL + VSD + (0 + SAB:Mean f0 + aVTL + VSD | rater) + (SAB:Mean f0 + aVTL + VSD | child) + (aVTL + VSD | word)

Figure 6. Line plot showing children’s perceived gender ratings predicted by sex assigned at birth, time points, and acoustic vocal-tract length.

Figure 7. Line plot showing the relationship between perceived gender ratings of children’s speech by sex assigned at birth, time point, and vowel-space dispersion.

Figure 8. Line plot showing perceived gender ratings of children’s speech predicted by sex assigned at birth, time points, and mean fundamental frequency.

Table 9. Generalized mixed-effect model of perceived gender ratings of /s/-initial words predicted by sex assigned at birth, time points, and mean /s/ spectral centroid. Formula = /s/ Rating ~ Time point * SAB * Mean /s/ spectral centroid + (1 | rater) + (Mean /s/ spectral centroid | child) + (0 + Mean /s/ spectral centroid | word)

Figure 9. Line plot showing children’s perceived gender ratings of /s/-initial words predicted by sex assigned at birth, time points, and mean spectral centroid of /s/.

Supplementary material: File

Wong et al. supplementary material (File, 338.1 KB)