
Hearing once, reading twice: How dual subtitles shape visual attention in bilingual viewing

Published online by Cambridge University Press:  30 January 2026

Inka Romero-Ortells
Affiliation:
Centro de Investigación Nebrija en Cognición (CINC), Universidad Nebrija, Spain
Manuel Perea*
Affiliation:
Centro de Investigación Nebrija en Cognición (CINC), Universidad Nebrija, Spain Department of Methodology and ERI-Lectura, Universitat de València, Spain
Jon Andoni Duñabeitia
Affiliation:
Centro de Investigación Nebrija en Cognición (CINC), Universidad Nebrija, Spain
Corresponding author: Manuel Perea; Email: manuel.perea@uv.es

Abstract

Dual subtitles, which combine captions (a transcription of the audio) with subtitles translated into another language, are increasingly used in language learning. However, how they shape visual attention remains unclear. In the present experiments, we tracked the eye movements of Spanish–English bilinguals as they viewed instructional videos with either no subtitles (Experiment 1) or dual subtitles (Experiment 2), manipulating subtitle position and audio language. Without subtitles, L1 audio focused gaze on the speaker’s eyes, whereas L2 audio distributed it between the eyes and the mouth. With dual subtitles, gaze shifted strongly to the text, with a preference for the top line, which attracted more viewing time regardless of language. Viewers selectively attended to the line matching the audio. Comprehension improved for L2 audio with subtitles, while L1 comprehension was unaffected. Our findings demonstrate that display layout and language alignment jointly govern attentional allocation in bilingual viewing, with direct implications for L2 instructional design.

Information

Type
Research Article
Creative Commons
CC BY
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© The Author(s), 2026. Published by Cambridge University Press

Figure 1. Representation of the on-screen layout in the no-subtitles condition.


Figure 2. Representation of the on-screen layout in the dual-subtitles condition.


Figure 3. Proportional dwell time for the eyes and mouth in the L1- and L2-audio versions. Error bars represent the standard error of the mean.


Table 1. Descriptive statistics for Experiment 1. Relative dwell time (proportion of trial duration) is shown for each area of interest (AOI), and comprehension scores are reported as mean test scores. Values are means (M) with standard errors (SE)


Figure 4. Proportional dwell time for the eyes, mouth, and subtitles (L2 vs. L1) as a function of subtitle position (top vs. bottom) in the L1- and L2-audio videos. Error bars represent the standard error of the mean.


Table 2. Descriptive statistics for Experiment 2: Mean relative dwell time (proportion of trial duration) and comprehension scores are reported for the four within-subjects conditions (2 Audio Languages × 2 Subtitle Positions: L1-Top/L2-Bottom versus L2-Top/L1-Bottom). Values are means (M) with standard errors (SE)


Table A1. Subtitle file characteristics by language and video: total words, duration (s), and words per minute (WPM; computed as total words ÷ duration in minutes). VO = original L1 video; OV = original L2 video; EN/ES = subtitle language (English/Spanish). The names of the video files correspond to the materials available on OSF.
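For reference, a minimal Python sketch of the WPM computation described in the caption (total words ÷ duration in minutes); the function name and example values are illustrative assumptions, not drawn from the study materials.

    def words_per_minute(total_words: int, duration_seconds: float) -> float:
        # WPM = total words divided by duration expressed in minutes
        return total_words / (duration_seconds / 60.0)

    # Hypothetical example values (not taken from Table A1):
    print(words_per_minute(total_words=540, duration_seconds=240.0))  # -> 135.0 WPM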