
The timing of an avatar’s beat gestures biases lexical stress perception in vocoded speech

Published online by Cambridge University Press:  10 November 2025

Matteo Maran*
Affiliation:
Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, The Netherlands
Renske M. J. Uilenreef
Affiliation:
Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, The Netherlands
Roos Rossen
Affiliation:
Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, The Netherlands
Hans Rutger Bosker
Affiliation:
Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, The Netherlands Max Planck Institute for Psycholinguistics, PO Box 310, 6500 AH, Nijmegen, The Netherlands
*
Corresponding author: Matteo Maran; Email: matteo.maran@donders.ru.nl

Abstract

Cochlear implants (CIs) are neural prostheses that restore some degree of hearing, albeit conveying a less fine-grained speech signal than normal hearing. For example, CIs convey altered fundamental frequency (F0) information, resulting in atypical lexical stress perception (e.g., distinguishing the noun CONtent from the adjective conTENT) in languages in which this cue rests on F0 modulations. CI users can compensate for the degraded acoustic input by exploiting the audiovisual affordances of human communication, weighting the visual information provided by speakers (e.g., lip movements and gestures) more heavily. Recent studies have shown that, in individuals with normal hearing, the timing of simple up-and-down hand movements (i.e., beat gestures) biases lexical stress perception. The present study tested whether the timing of beat gestures produced by an avatar can bias Dutch lexical stress perception in vocoded speech, which limits the reliability of F0 information in a way that mimics CI hearing conditions. The effect of gestures in vocoded speech was particularly pronounced when hearing an ambiguous token or the least frequent stress pattern in Dutch. These results suggest that (even artificially generated) beat gestures can support the perception of vocoded speech, especially when processing less frequent prosodic features.

Information

Type
Original Article
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© The Author(s), 2025. Published by Cambridge University Press

Table 1. Dutch minimal pairs employed


Figure 1. The Manual McGurk paradigm. Note. Schematic illustration of the paradigm. The same acoustic token was aligned with a beat gesture falling on the first syllable in the BeatOn1 condition and on the second syllable in the BeatOn2 condition. The dashed blue line approximates the avatar’s hand trajectory, while the solid blue line approximates the timing of the beat gesture’s apex (i.e., the point of maximal extension).


Figure 2. Perceived stress data. Note. The manual McGurk effect, for each combination of Stress Pattern and Speech Condition, consists of the difference between BeatOn1 and BeatOn2. Mean value indicates the average of single-subject averaged proportions. Error bars indicate standard error of the mean.


Table 2. Perceived stress judgments: GLMM model fixed effects’ estimates and contrasts in the SW stress pattern


Table 3. Perceived stress judgments: GLMM model random effects in the SW stress pattern


Table 4. Perceived stress judgments: GLMM model fixed effects’ estimates and contrasts in the Ambiguous stress pattern


Table 5. Perceived stress judgments: GLMM model random effects in the Ambiguous stress pattern


Table 6. Perceived stress judgments: GLMM model fixed effects’ estimates and contrasts in the WS stress pattern


Table 7. Perceived stress judgments: GLMM model random effects in the WS stress pattern


Table 8. Reaction time data


Figure 3. Response time data. Note. The figure illustrates the grand average of single-subject mean RTs across combinations of Speech Condition and Stress Pattern (SW: Strong-Weak; Ambiguous; WS: Weak-Strong). The zoom-in illustrates the grand average of single-subject mean RTs across combinations of Speech Condition and Given Response (responding SW or WS) in the Ambiguous condition. Response times are reported in ms for illustration purposes, but the analysis was based on log-transformed RTs. Error bars indicate standard error of the mean.

Supplementary material

Maran et al. supplementary material (File, 397.3 KB)