Gender differences in the acoustic realization of lexical, phrasal, and contrastive stress

Published online by Cambridge University Press:  04 May 2026

Sten Knutsen*
Affiliation:
Department of Psychology, Rutgers University–New Brunswick, USA
Karin Stromswold
Affiliation:
Department of Psychology and Center for Cognitive Science, Rutgers University–New Brunswick, USA
*Corresponding author: Sten Knutsen; Email: stenknutsen@gmail.com

Abstract

This study investigates how lexical, phrasal, and contrastive stress are acoustically realized in American English, focusing on whether men and women differ in how they use pitch, amplitude, and duration to convey stress. Thirty-six native speakers completed minimal-pair stress production tasks online. We analyzed the resulting speech using prosodic contour measures, Bayesian ANOVAs, mixed-effects regression, Random Forest Classification, and human coder judgments. Results show greater acoustic overlap between lexical and contrastive stress than between either of those and phrasal stress. Duration was the primary cue for phrasal stress, while lexical and contrastive stress relied more evenly on multiple cues. Gender-based differences were especially evident in contrastive stress, which, to our knowledge, has not previously been studied in relation to gender: women relied more on pitch, while men emphasized amplitude and duration. These findings highlight the multidimensional acoustic nature of stress realization and demonstrate the value of combining computational and perceptual approaches in prosody research.

Information

Type
Original Article
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© The Author(s), 2026. Published by Cambridge University Press
Figure 1. Lexical stress prosodic contours. Note: Panels A and B show mean-centered F0 contours (in semitones) for lexical stress productions by women (n = 18) and men (n = 18), respectively. Red lines represent initial-syllable stress items (INsert); blue lines represent final-syllable stress items (inSERT). Shaded ribbons depict ±1 standard error across speakers. Vertical lines and shaded areas denote the ROIs corresponding to the first and second syllables of the target word. Pitch values were centered within each stress condition for each speaker, and contours include both correct and incorrect trials.
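The centering described in this caption can be illustrated with a minimal sketch. It assumes each speaker's contour has already been time-normalized to the same number of F0 samples in Hz; the function names, the data layout, and the 100 Hz reference are all hypothetical and are not taken from the article. Because each contour is centered on its own mean, the choice of reference frequency cancels out.

```python
import math
from statistics import mean, stdev


def hz_to_semitones(f0_hz, ref_hz=100.0):
    """Convert F0 in Hz to semitones relative to ref_hz (assumed reference)."""
    return 12.0 * math.log2(f0_hz / ref_hz)


def center_contours(contours):
    """Center each (speaker, condition) contour on its own mean.

    contours: dict mapping (speaker, condition) -> list of F0 samples in Hz.
    Returns the same keys mapped to mean-centered semitone values.
    """
    centered = {}
    for key, f0_values in contours.items():
        semitones = [hz_to_semitones(f) for f in f0_values]
        contour_mean = mean(semitones)
        centered[key] = [v - contour_mean for v in semitones]
    return centered


def mean_and_se(centered, condition):
    """Per-sample mean and standard error across speakers for one condition."""
    rows = [vals for (_, cond), vals in centered.items() if cond == condition]
    n = len(rows)
    means, ses = [], []
    for samples in zip(*rows):  # iterate over time points
        means.append(mean(samples))
        ses.append(stdev(samples) / math.sqrt(n) if n > 1 else 0.0)
    return means, ses
```

Plotting `means` with a ribbon of `means ± ses` would give one line of the kind shown in each panel, under the assumptions stated above.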

Figure 2. Phrase stress prosodic contours. Note: Panels A and B show mean-centered F0 contours (in semitones) for phrase stress productions by women (n = 18) and men (n = 18), respectively. Red lines represent compound nouns (e.g., greenhouse); blue lines represent adjective–noun phrases (e.g., green house). Shaded ribbons depict ±1 standard error across speakers. Vertical lines and shaded areas denote the ROIs corresponding to the first and second morphemes of each construction. Pitch values were centered within each stress condition for each speaker, and contours include both correct and incorrect trials. Differences in total duration reflect the faster articulation of compound nouns relative to adjective–noun phrases.

Figure 3. Contrastive stress prosodic contours. Note: Panels A and B show mean-centered F0 contours (in semitones) for contrastive stress productions by women (n = 18) and men (n = 18), respectively. Blue lines represent trials in which contrastive stress fell on the color word; red lines represent trials in which contrastive stress fell on the animal word. Shaded ribbons depict ±1 standard error across speakers. Vertical lines and shaded regions denote the ROIs corresponding to the color and animal words in the utterance. Pitch values were centered within each stress condition for each speaker, and contours include both correct and incorrect trials.

Figure 4. Acoustics of lexical stress regions of interest. Note: Panels A–F show acoustic measures extracted from ROIs for lexical stress productions by women (n = 18) and men (n = 18). White circles represent initial-syllable stress (e.g., INsert); black circles represent final-syllable stress (e.g., inSERT). Panels A and B display mean pitch (in semitones) across three equal-duration intervals within each syllable; Panels C and D display mean amplitude (in dB) for the same intervals; Panels E and F display syllable durations (in seconds). Error bars represent 95% credible intervals.
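The "three equal-duration intervals" summary used in these ROI figures can be sketched as follows. This is only an illustration under stated assumptions: the function name, the argument layout (parallel lists of sample times and values plus ROI onset and offset in seconds), and the handling of empty intervals are hypothetical, not the authors' pipeline.

```python
def interval_means(times, values, onset, offset, n_intervals=3):
    """Average a pitch or amplitude track within equal-duration slices of an ROI.

    times, values: parallel lists of sample times (s) and measurements.
    onset, offset: ROI boundaries in seconds.
    Returns one mean per interval, or None for an interval with no samples.
    """
    width = (offset - onset) / n_intervals
    bins = [[] for _ in range(n_intervals)]
    for t, v in zip(times, values):
        if onset <= t <= offset:
            # A sample exactly at the offset is assigned to the last interval.
            idx = min(int((t - onset) / width), n_intervals - 1)
            bins[idx].append(v)
    return [sum(b) / len(b) if b else None for b in bins]
```

Applied once per syllable (or morpheme, or word) ROI, this would yield the three points per ROI plotted in the pitch and amplitude panels.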

Figure 5. Acoustics of phrase stress regions of interest. Note: Panels A–F show acoustic measures extracted from ROIs for phrase stress productions by women (n = 18) and men (n = 18). White circles represent one-word compound nouns (e.g., greenhouse); black circles represent two-word adjective–noun phrases (e.g., green house). Panels A and B display mean pitch (in semitones) across three equal-duration intervals within each morpheme; Panels C and D display mean amplitude (in dB) for the same intervals; Panels E and F display total duration (in seconds) from the onset of the first ROI to the offset of the second. Error bars represent 95% credible intervals.

Figure 6. Acoustics of contrastive stress regions of interest. Note: Panels A–F show acoustic measures extracted from ROIs for contrastive stress productions by women (n = 18) and men (n = 18). White circles represent animal-stressed trials (e.g., red COW); black circles represent color-stressed trials (e.g., BLACK cow). Panels A and B display mean pitch (in semitones) across three equal-duration intervals within each ROI; Panels C and D display mean amplitude (in dB) for the same intervals; Panels E and F display ROI durations (in seconds). Error bars represent 95% credible intervals.

Table 1. RFC model accuracies and importance scores

Table 2. Mean accuracy rates for human coders and random forest classification

Figure 7. Pitch of ROIs in correct and incorrect lexical stress trials. Note: Panels show mean pitch (in semitones) across three equal-duration intervals for lexical stress productions by women (n = 18) and men (n = 18), separated by coder-determined accuracy. White circles represent initial-stress trials (e.g., INsert); black circles represent final-stress trials (e.g., inSERT). Left panels display trials that human coders classified as correct; right panels display trials coded as incorrect. Error bars represent 95% credible intervals.

Figure 8. Amplitude of ROIs in correct and incorrect lexical stress trials. Note: Panels show mean amplitude (in dB) across three equal-duration intervals for lexical stress productions by women (n = 18) and men (n = 18), separated by coder-determined accuracy. White circles represent initial-stress trials (e.g., INsert); black circles represent final-stress trials (e.g., inSERT). Left panels display trials that human coders classified as correct; right panels display trials coded as incorrect. Error bars represent 95% credible intervals.

Figure 9. Duration of syllables in correct and incorrect lexical stress trials. Note: Panels show syllable durations (in seconds) for lexical stress productions by women (n = 18) and men (n = 18), separated by coder-determined accuracy. White circles represent initial-stress trials (e.g., INsert); black circles represent final-stress trials (e.g., inSERT). Left panels display trials that human coders classified as correct; right panels display trials coded as incorrect. Error bars represent 95% credible intervals.

Figure 10. Pitch of ROIs in correct and incorrect phrase stress trials. Note: Panels show mean pitch (in semitones) across ROIs for phrase stress productions by women (n = 18) and men (n = 18), separated by coder-determined accuracy. White circles represent one-word compounds (e.g., greenhouse); black circles represent two-word adjective–noun phrases (e.g., green house). Left panels display trials classified as correct by human coders; right panels display trials coded as incorrect. Error bars represent 95% credible intervals.

Figure 11. Amplitude of total ROIs in correct and incorrect phrase stress trials. Note: Panels show mean amplitude (in dB) across ROIs for phrase stress productions by women (n = 18) and men (n = 18), separated by coder-determined accuracy. White circles represent one-word compounds (e.g., greenhouse); black circles represent two-word adjective–noun phrases (e.g., green house). Left panels display trials classified as correct by human coders; right panels display trials coded as incorrect. Error bars represent 95% credible intervals.

Figure 12. Duration of total ROIs in correct and incorrect phrase stress trials. Note: Panels show total duration (in seconds) from the onset of the first morpheme to the offset of the second morpheme for phrase stress productions by women (n = 18) and men (n = 18), separated by coder-determined accuracy. Black circles represent correct trials; white circles represent incorrect trials. Error bars represent 95% credible intervals.

Figure 13. Pitch of ROIs in correct and incorrect contrastive stress trials. Note: Panels show mean pitch (in semitones) across three equal-duration intervals for contrastive stress productions by women (n = 18) and men (n = 18), separated by coder-determined accuracy. Black circles represent color-stressed trials; white circles represent animal-stressed trials. Left panels display correctly classified trials; right panels display misclassified trials. Error bars represent 95% credible intervals.

Figure 14. Amplitude of ROIs in correct and incorrect contrastive stress trials. Note: Panels show mean amplitude (in dB) across three equal-duration intervals for contrastive stress productions by women (n = 18) and men (n = 18), separated by coder-determined accuracy. Black circles represent color-stressed trials; white circles represent animal-stressed trials. Left panels display correctly classified trials; right panels display misclassified trials. Error bars represent 95% credible intervals.

Figure 15. Duration of morphemes in correct and incorrect contrastive stress trials. Note: Panels show mean durations (in seconds) for the two morphemes (color word, animal word) in contrastive stress productions by women (n = 18) and men (n = 18), separated by coder-determined accuracy. Black circles represent color-stressed trials; white circles represent animal-stressed trials. Left panels display correctly classified trials; right panels display misclassified trials. Error bars represent 95% credible intervals.

Table 3. Bayesian mixed-effects regression model of participants’ mean accuracy scores for trials