
Facial expressions in different communication settings: A case of whispering and speaking with a face mask in Farsi

Published online by Cambridge University Press:  02 September 2024

Nasim Mahdinazhad Sardhaei*
Affiliation:
Leibniz-Zentrum für Allgemeine Sprachwissenschaft, Berlin, Germany
Marzena Żygis
Affiliation:
Leibniz-Zentrum für Allgemeine Sprachwissenschaft, Berlin, Germany Humboldt Universität, Berlin, Germany
Hamid Sharifzadeh
Affiliation:
Unitec Institute of Technology, Auckland, New Zealand
Corresponding author: Nasim Mahdinazhad Sardhaei; Email: sardhaei@leibniz-zas.de

Abstract

This study addresses the importance of orofacial gestures and acoustic cues in executing prosodic patterns under different communicative settings in Farsi. Given that Farsi lacks morpho-syntactic markers for polar questions, we aim to determine whether specific facial movements accompany the prosodic correlates of questionhood in Farsi under conditions of degraded information, that is, whispering and wearing face masks. We hypothesise that speakers will employ the most pronounced facial expressions when whispering questions with a face mask, to compensate for the absence of F0, the reduced intensity and the invisibility of the lower face. To this end, we conducted an experiment with 10 Persian speakers producing 10 pairs of statements and questions in normal and whispered speech modes, with and without face masks. Our results support our hypothesis that speakers intensify their orofacial expressions when confronted with marked conditions. We interpret our results in terms of the ‘hand-in-hand’ and ‘trade-off’ hypotheses. In whispered speech, the parallel realisation of longer word duration and orofacial expressions may be a compensatory mechanism for the limited options to convey intonation. Likewise, the coverage of the lower face is compensated for by longer word duration and intensified upper facial expressions, which in turn supports the trade-off hypothesis.

Information

Type
Article
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© The Author(s), 2024. Published by Cambridge University Press

Table 1. Blocks of stimuli


Figure 1. Experimental setup. Reading phase of the experiment (left): a field monitor displays the sentences one by one. Sentence imitation phase of the experiment (right): the confederate and the participant exchange the questions and statements in voiced speech mode. (The data were collected at the peak of the COVID-19 pandemic, so to reduce the risk of infection, the confederate wore a surgical mask throughout the experiment.)


Figure 2. Spectrogram and oscillogram of voiced speech, with annotation labels, for the sentence ‘Diruz shima goft tapesh?’ (‘Yesterday, Shima said “beat”?’). The blue line marks F0 in voiced speech; letters denote the annotated intervals.


Figure 3. Sixty-eight facial landmarks tracked by OpenCV.


Figure 4. Expression of a statement (left) and a question (right) by a speaker with and without a mask.


Table 2. Face vectors used for the analysis


Figure 5. Sixty-eight facial landmarks tracked by OpenCV and Dlib, with four face vector distances (image from the iBUG 300-W dataset by Sagonas et al., 2013).
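The face-vector distances are derived from the 68 tracked landmark points. As a minimal sketch of how such distances could be computed from the standard iBUG 300-W / dlib landmark indexing, the following uses illustrative landmark pairs (e.g., mid-eyebrow to upper eyelid for eyebrow raising, mid upper to mid lower inner lip for lip aperture); the exact vectors analysed by the authors are those listed in Table 2, so the specific index choices here are assumptions.

```python
import math

# Standard iBUG 300-W / dlib 68-landmark indices (0-based):
# 17-21 right eyebrow, 22-26 left eyebrow, 36-41 right eye,
# 42-47 left eye, 60-67 inner lip contour.

def dist(p, q):
    """Euclidean distance between two (x, y) landmark points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def face_vectors(lm):
    """Illustrative face-vector distances from a list of 68 (x, y) points.

    The landmark pairs below are assumptions for illustration; the
    vectors actually analysed are defined in Table 2 of the article.
    """
    return {
        # eyebrow raising: mid-eyebrow to upper eyelid of the same eye
        "right_eyebrow": dist(lm[19], lm[37]),
        "left_eyebrow": dist(lm[24], lm[44]),
        # lip aperture: mid upper inner lip (62) to mid lower inner lip (66)
        "lip_aperture": dist(lm[62], lm[66]),
        # eye opening: upper (38) to lower (40) eyelid of the right eye
        "eye_opening": dist(lm[38], lm[40]),
    }
```

In a full pipeline, the landmark coordinates would come from a detector such as dlib's shape predictor applied to video frames read with OpenCV, frame by frame.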


Figure 6. Maximum F0 values calculated for eight intervals across all questions and statements in voiced speech with and without a face mask. The dots show mean values, and the whiskers show 95% confidence intervals.
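How the per-interval F0 maxima behind Figure 6 might be obtained can be sketched as follows: given a pitch track (e.g., exported from Praat as time/F0 samples) and the annotated interval boundaries, take the maximum voiced F0 value within each interval. The data shapes and function name here are assumptions, not the authors' actual script.

```python
def max_f0_per_interval(pitch_track, intervals):
    """Maximum F0 per labelled interval.

    pitch_track: list of (time_s, f0_hz) samples; unvoiced frames
    (f0 == 0 or None) are skipped.
    intervals: list of (label, start_s, end_s) annotation spans.
    Returns {label: max F0 in Hz, or None if the interval is unvoiced}.
    """
    out = {}
    for label, start, end in intervals:
        voiced = [f0 for t, f0 in pitch_track if start <= t < end and f0]
        out[label] = max(voiced) if voiced else None
    return out
```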


Figure 7. Left eyebrow raising in the sentence-final word: interaction between speech mode and mask condition (left) as well as speech mode and sentence type (right).


Figure 8. Right eyebrow raising in the sentence-final word: interaction between speech mode and sentence type (left) as well as speech mode and mask condition (right).


Figure 9. Lip aperture in the sentence-final word: interaction between speech mode and sentence type.


Figure 10. Duration of sentence-final words: interaction between speech mode and mask condition (left) as well as between speech mode and sentence type (right).


Figure 11. Mean amplitude difference between unstressed and stressed syllable of the sentence-final words.


Figure 12. Scatterplots with z-scored word duration (x-axis) and orofacial parameters (y-axis). Linear regression lines (y ~ x) are given in dark blue (colour online) and confidence bands in grey. Data points corresponding to normal speech are visualised in light green, while data points for whispered speech appear in turquoise blue.
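The z-scored regression in Figure 12 amounts to standardising each variable to mean 0 and SD 1, then fitting an ordinary least-squares line y ~ x. A minimal sketch of both steps, with assumed function names and no dependence on the authors' analysis code:

```python
import statistics

def z_score(xs):
    """Standardise values to mean 0, SD 1 (sample standard deviation)."""
    m, s = statistics.mean(xs), statistics.stdev(xs)
    return [(x - m) / s for x in xs]

def linear_fit(x, y):
    """Ordinary least-squares slope and intercept for y ~ x."""
    mx, my = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return slope, my - slope * mx
```

With both variables z-scored, the fitted slope equals the Pearson correlation coefficient, which is why standardised scatterplots are a convenient way to compare the strength of the duration–orofacial relationships across parameters.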

Supplementary material

Mahdinazhad Sardhaei et al. supplementary material (File, 476.4 KB)