
Characterizing eye gaze and mental workload for assistive device control

Published online by Cambridge University Press: 03 March 2025

Larisa Y.C. Loke*
Affiliation: Department of Mechanical Engineering, Northwestern University, Evanston, IL, USA; Shirley Ryan AbilityLab, Chicago, IL, USA
Demiana R. Barsoum
Affiliation: Department of Mechanical Engineering, Northwestern University, Evanston, IL, USA; Shirley Ryan AbilityLab, Chicago, IL, USA
Todd D. Murphey
Affiliation: Department of Mechanical Engineering, Northwestern University, Evanston, IL, USA
Brenna D. Argall
Affiliation: Department of Mechanical Engineering, Northwestern University, Evanston, IL, USA; Shirley Ryan AbilityLab, Chicago, IL, USA
*Corresponding author: Larisa Y.C. Loke; Email: larisaycl@u.northwestern.edu

Abstract

Eye gaze tracking is increasingly popular due to improved technology and availability. In the domain of assistive device control, however, eye gaze is often used in discrete ways (e.g., activating buttons on a screen) that do not harness the full potential of the gaze signal. In this article, we present a method for collecting both reactionary and controlled eye gaze signals, via screen-based tasks designed to isolate various types of eye movements. The resulting data allow us to build an individualized characterization for eye gaze interface use. Results from a study conducted with participants with motor impairments are presented, offering insights into maximizing the potential of eye gaze for assistive device control. Importantly, we demonstrate the potential for incorporating direct continuous eye gaze inputs into gaze-based interface designs, an approach generally seen as intractable due to the ‘Midas touch’ problem of differentiating between gaze movements for perception versus for interface operation. Our key insight is to use an individualized measure of smooth pursuit characteristics to differentiate between gaze for control and gaze for environment scanning. We also present results relating to gaze-based metrics for mental workload, and show the potential for the concurrent use of eye gaze for control input as well as for assessing a user’s mental workload, both offline and in real time. These findings can inform the development of continuous control paradigms using eye gaze, as well as the use of eye tracking as the sole input modality to systems that share control between human-generated and autonomy-generated inputs.

Information

Type
Research Article
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© The Author(s), 2025. Published by Cambridge University Press

Figure 1. The screen-based tasks implemented in Unity2D. (a) The Painting Task. Participants control a blue cursor, with the 5-min countdown indicated as a radially filling red outline. (b) The Focus Task without visual feedback on gaze position. A target fades away once the participant fixates on it continuously for 2 s, and the next target appears. (c) The Focus Task with visual feedback on gaze position. As in (b), but now with a blue dot representing the gaze position, as measured by the eye gaze tracking system, for visual feedback. (d) The Tracking Task. Participants track moving targets on the screen. For panels (b–d), red-filled circles represent the current target. Dashed circles and arrows are provided for illustration only; they represent the prior target (circles) and the movement of targets (arrows).
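The Focus Task’s success condition (a continuous 2 s fixation on the target) amounts to a simple dwell detector. The sketch below is illustrative only, assuming gaze samples arrive as timestamped screen coordinates; the names `DwellDetector` and `DWELL_S` are ours, not from the paper’s implementation.

```python
import math

DWELL_S = 2.0  # continuous fixation required for target success (per the task design)

class DwellDetector:
    """Tracks whether gaze has remained inside the current target long enough."""

    def __init__(self, target_xy, target_radius_px):
        self.target_xy = target_xy
        self.radius = target_radius_px
        self.dwell_start = None  # timestamp when gaze last entered the target

    def update(self, t, gaze_xy):
        """Feed one timestamped gaze sample; returns True once the dwell completes."""
        inside = math.hypot(gaze_xy[0] - self.target_xy[0],
                            gaze_xy[1] - self.target_xy[1]) <= self.radius
        if not inside:
            self.dwell_start = None  # any excursion resets the dwell timer
            return False
        if self.dwell_start is None:
            self.dwell_start = t
        return (t - self.dwell_start) >= DWELL_S
```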


Table 1. Details on recruited participants


Figure 2. (a) Tobii Pro Glasses 3 (TPG3) head-mounted eye tracker (Tobii, 2021). (b) The 1-channel ECG lead setup with the SOMNOtouch RESP cardiorespiratory screener (polygraph) ECG sensor (Somnomedics, 2018). (c) A participant performing the Painting Task. ArUco markers on the corners of the screen are used to transform the gaze position in the scene to the gaze position on the screen.
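The scene-to-screen transformation in (c) is a standard planar homography problem: locate the four corner markers in the scene-camera image, then map the gaze point into screen coordinates. A minimal sketch using OpenCV’s ArUco module (API as of OpenCV 4.7+) follows; the marker dictionary, marker IDs, and screen resolution are assumptions, not details from the paper.

```python
import cv2
import numpy as np

# Hypothetical setup: one ArUco marker per screen corner, with IDs 0-3 assumed
# to map to the screen corners in pixels (the actual IDs/dictionary are unknown).
SCREEN_CORNERS = {0: (0, 0), 1: (1920, 0), 2: (1920, 1080), 3: (0, 1080)}

dictionary = cv2.aruco.getPredefinedDictionary(cv2.aruco.DICT_4X4_50)
detector = cv2.aruco.ArucoDetector(dictionary, cv2.aruco.DetectorParameters())

def gaze_scene_to_screen(scene_frame, gaze_px):
    """Map a gaze point from scene-camera pixels to screen pixels."""
    corners, ids, _ = detector.detectMarkers(scene_frame)
    if ids is None or len(ids) < 4:
        return None  # need all four markers to estimate the homography
    src, dst = [], []
    for marker_corners, marker_id in zip(corners, ids.flatten()):
        if int(marker_id) in SCREEN_CORNERS:
            src.append(marker_corners[0].mean(axis=0))  # marker center in the scene image
            dst.append(SCREEN_CORNERS[int(marker_id)])
    H, _ = cv2.findHomography(np.array(src, np.float32), np.array(dst, np.float32))
    pt = np.array([[gaze_px]], dtype=np.float32)  # shape (1, 1, 2) for perspectiveTransform
    return cv2.perspectiveTransform(pt, H)[0, 0]
```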


Figure 3. Schematic of our eye-gaze characterization system, implemented in ROS2. A gaze stream receiver and pre-processing node (green) receives input from eye tracker streamer nodes (gray) that are specific to a given eye tracking device and API. The preprocessing node publishes the processed screen-based gaze position $(x, y)$ to task nodes (blue), which interact with Unity and handle task progress. All data are logged in ROS bagfiles for post-hoc analysis. Communication with Unity2D takes place over Unity’s ROS TCP Endpoint package.
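As a rough illustration of the green preprocessing node, the rclpy sketch below subscribes to a tracker-specific stream and republishes a processed on-screen gaze position. The topic names and message type are assumptions; the paper does not specify them.

```python
import rclpy
from rclpy.node import Node
from geometry_msgs.msg import PointStamped

class GazePreprocessor(Node):
    """Receives raw gaze from a tracker-specific streamer node and republishes
    a processed on-screen (x, y) position for the task nodes."""

    def __init__(self):
        super().__init__('gaze_preprocessor')
        # Topic names are illustrative only.
        self.sub = self.create_subscription(PointStamped, 'gaze/raw', self.on_gaze, 10)
        self.pub = self.create_publisher(PointStamped, 'gaze/screen_xy', 10)

    def on_gaze(self, msg: PointStamped):
        out = PointStamped()
        out.header = msg.header
        # Placeholder for the actual pipeline: scene-to-screen mapping and any
        # filtering/smoothing would happen here before republishing.
        out.point = msg.point
        self.pub.publish(out)

def main():
    rclpy.init()
    rclpy.spin(GazePreprocessor())
    rclpy.shutdown()

if __name__ == '__main__':
    main()
```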


Figure 4. Painting Task. Left: Index of displacement from uniform screen coverage. Each grid cell represents a 100 px by 100 px area. The intensities of the heatmap range from $-b$ to $+b$ displacement from uniform coverage; zero displacement is a user’s baseline screen coverage. Right: Distribution of fixations across the screen. Each grid cell represents a 100 px by 100 px area. The intensities of the heatmap are normalized on a scale from 0 to 1 for each participant, relative to the count in that participant’s most frequently fixated cell. Higher intensities indicate regions of greater demonstrated ability to dwell.
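The left heatmap can be reproduced in spirit by binning gaze samples into 100 px cells and subtracting the uniform-coverage fraction. A sketch, assuming the displacement index is the per-cell difference between observed and uniform coverage fractions (the paper’s exact definition may differ):

```python
import numpy as np

def coverage_displacement(gaze_xy, screen_wh=(1920, 1080), cell=100):
    """Per-cell displacement of observed screen coverage from uniform coverage.

    gaze_xy: (N, 2) array of on-screen gaze samples in pixels.
    Returns a 2D array where 0 means exactly uniform coverage; positive values
    mean the cell was covered more than uniform, negative values less.
    """
    nx, ny = screen_wh[0] // cell, screen_wh[1] // cell
    counts, _, _ = np.histogram2d(
        gaze_xy[:, 0], gaze_xy[:, 1],
        bins=[nx, ny], range=[[0, screen_wh[0]], [0, screen_wh[1]]])
    observed = counts / counts.sum()  # fraction of samples in each cell
    uniform = 1.0 / (nx * ny)         # per-cell fraction under uniform coverage
    return observed - uniform
```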


Figure 5. Target success for the Focus Task. Left: Number of successful targets, grouped by target size, for all participants and under both feedback conditions. We observe an approximate minimum button size for which most participants are able to successfully fixate: 4.0 degrees of visual angle (DVA) with visual feedback, and 5.0 DVA without. Right: Number of successful targets (left) and mean time taken to complete successful targets (right) for all participants, without and with visual feedback on the measured gaze position. We observe a trade-off between success and completion time when visual feedback is present.


Table 2. The first and second derivatives of the curve fit to the median number of successful targets shown in Figure 5 (left)
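A sketch of the underlying computation, using a polynomial fit as a stand-in since the caption does not name the curve family; the data values here are hypothetical:

```python
import numpy as np

# Hypothetical data: median number of successful targets per target size (DVA).
target_size_dva = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
median_success = np.array([1.0, 3.0, 6.0, 9.0, 10.0, 10.0])

fit = np.polynomial.Polynomial.fit(target_size_dva, median_success, deg=3)
d1 = fit.deriv(1)  # first derivative: marginal gain in success per DVA
d2 = fit.deriv(2)  # second derivative: where that gain levels off

for s in target_size_dva:
    print(f"size {s:.1f} DVA: slope {d1(s):+.2f}, curvature {d2(s):+.2f}")
```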


Figure 6. Comparison of smooth pursuits between the Painting Task and Tracking Task. Left: Duration of smooth pursuit segments, for all participants. Right: Length (magnitude) of smooth pursuit segments, in degrees of visual angle, for all participants. Significant differences between the two tasks are seen for most participants, suggesting that the length and duration of smooth pursuits can be used to differentiate between using the eyes for observation versus control.
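A sketch of the per-participant comparison, assuming pursuit segments are already detected upstream; the Mann–Whitney U test is our stand-in, since the caption does not name the statistical test used:

```python
import numpy as np
from scipy.stats import mannwhitneyu

def pursuit_metrics(segments):
    """segments: list of (T_i, 3) arrays of (t, x, y) samples, one per detected
    smooth-pursuit segment. Returns per-segment duration (s) and path length."""
    durations, lengths = [], []
    for seg in segments:
        t, xy = seg[:, 0], seg[:, 1:]
        durations.append(t[-1] - t[0])
        lengths.append(np.sum(np.linalg.norm(np.diff(xy, axis=0), axis=1)))
    return np.array(durations), np.array(lengths)

def tasks_differ(painting_segments, tracking_segments, alpha=0.05):
    """Per-participant test of whether pursuit durations differ between tasks,
    mirroring the per-participant comparisons in Figure 6."""
    dur_paint, _ = pursuit_metrics(painting_segments)
    dur_track, _ = pursuit_metrics(tracking_segments)
    stat, p = mannwhitneyu(dur_paint, dur_track)
    return p < alpha, p
```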


Figure 7. NASA TLX scores for all participants (S01–S11), stacked by task. We observe large inter-participant variability in perceived overall mental workload (MWL), as well as in the relative MWL required of each task.
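For reference, the unweighted (Raw TLX) overall score is simply the mean of the six subscale ratings; whether the study used the raw or pairwise-weighted variant is not stated in this caption.

```python
import numpy as np

# The six NASA TLX subscales, each rated 0-100.
SUBSCALES = ['mental', 'physical', 'temporal', 'performance', 'effort', 'frustration']

def raw_tlx(ratings):
    """Unweighted (Raw TLX) overall workload: mean of the six subscale ratings."""
    return np.mean([ratings[s] for s in SUBSCALES])

print(raw_tlx({'mental': 70, 'physical': 20, 'temporal': 55,
               'performance': 40, 'effort': 65, 'frustration': 30}))  # 46.67
```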


Table 3. Intraclass Correlation Coefficient (ICC) to estimate interrater consistency of the MWL metrics, over all participants
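An ICC of this kind, treating the MWL metrics as “raters” scoring participant-task “targets,” can be computed with pingouin; the long-format layout and values below are hypothetical:

```python
import pandas as pd
import pingouin as pg

# Hypothetical long-format data: each MWL metric acts as a "rater" of each
# participant-task combination, after per-participant min-max normalization.
df = pd.DataFrame({
    'target': ['S01-paint'] * 4 + ['S01-focus'] * 4 + ['S01-track'] * 4,
    'rater':  ['TLX', 'HR', 'RMSSD_neg', 'CPD'] * 3,
    'rating': [0.7, 0.6, 0.8, 0.65,
               0.3, 0.4, 0.2, 0.35,
               0.9, 0.8, 0.7, 0.85],
})

icc = pg.intraclass_corr(data=df, targets='target', raters='rater', ratings='rating')
print(icc[['Type', 'ICC', 'CI95%']])  # consistency forms (e.g., ICC3) are the relevant rows
```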


Figure 8. Illustrative comparison of NASA TLX, HR, RMSSD (negated), and CPD over tasks for three participants. We observe instances of agreement across all measures (left), varying agreement/disagreement between physiological and subjective measures (middle), and no consistent trend between the measures (right). For the purposes of visualization and ICC calculation, all four metrics are normalized by their respective minimum and maximum values for each participant. The participants shown are representative of the variability in agreement between measures; data from other participants are omitted for clarity.
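The per-participant normalization described above is plain min-max scaling; a minimal sketch:

```python
import numpy as np

def minmax_per_participant(x):
    """Scale one participant's metric values to [0, 1], so NASA TLX, HR,
    negated RMSSD, and CPD can share a common axis."""
    x = np.asarray(x, dtype=float)
    lo, hi = np.nanmin(x), np.nanmax(x)
    return (x - lo) / (hi - lo) if hi > lo else np.zeros_like(x)
```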


Figure 9. Visual comparison of agreement and similarity between CPD and HR (top), and CPD and RMSSD (bottom), for S11 during the Focus with Feedback Task. Left: Bland–Altman plot of agreement with limits of agreement (LoA, dashed lines) and mean bias (solid black line). Middle: DTW optimal warp path in red, with the 20% Sakoe–Chiba window constraint shown as the grey band. Right: DTW alignment showing matched points in the time series. A large majority of points lie within the LoA of the Bland–Altman plots when comparing CPD to both HR and RMSSD, and the peaks of the CPD, HR, and RMSSD time series align.
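Both analyses are standard; a compact sketch follows, with DTW implemented directly rather than via a particular library. The 20% Sakoe–Chiba band constrains how far the warp path may stray from the diagonal.

```python
import numpy as np

def bland_altman(a, b):
    """Mean bias, 95% limits of agreement, and fraction of points inside the LoA."""
    d = np.asarray(a, float) - np.asarray(b, float)
    bias, sd = d.mean(), d.std(ddof=1)
    loa = (bias - 1.96 * sd, bias + 1.96 * sd)
    inside = np.mean((d >= loa[0]) & (d <= loa[1]))
    return bias, loa, inside

def dtw_distance(x, y, window_frac=0.2):
    """DTW with a Sakoe-Chiba band limiting |i - j| to a fraction of the length."""
    n, m = len(x), len(y)
    w = max(int(window_frac * max(n, m)), abs(n - m))  # keep the band feasible
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(max(1, i - w), min(m, i + w) + 1):
            cost = abs(x[i - 1] - y[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]
```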


Table 4. Summary of Bland–Altman analysis and DTW Similarity aggregated over all participants


Table A1. Intraclass Correlation Coefficient (ICC) to estimate interrater consistency of the MWL metrics aggregated for each participant


Table A2. HR and CPD: Systematic Bias, 95% LoA, and percentage of points within the LoA for each participant for the Focus with Feedback Task


Table A3. RMSSD and CPD: Systematic Bias, 95% LoA, percentage of points within the LoA, and DTW Similarity Score for each participant for the Focus with Feedback Task

Supplementary material

Loke et al. supplementary material (File, 37.6 MB)