While prior research has established how people read in non-interactive media, little is known about the reading process in interactive multimedia such as video games. In this exploratory eye-tracking study, two levels of reading demand (high vs. low) were created during gameplay. Ninety-eight participants were randomly assigned to play a video game with either an English (L1; low reading demand) or an unintelligible foreign-language (FL; high reading demand) soundtrack. At the subtitle level, the FL group (vs. the L1 group) had higher dwell time percentages, more fixations, higher regression rates, and longer mean fixation durations. No group differences were found in saccade lengths. At the word level, the FL group skipped fewer words. Reading demand interacted with word frequency: the difference in skipping between lower-frequency and higher-frequency words was smaller in the FL group. For gaze duration, only word frequency had a significant effect. The FL group had longer total fixation times on words, with no interaction between reading demand and word frequency. These results indicate that the FL group (vs. the L1 group) expended greater processing effort, as reflected in increased fixation-based measures and regressions. The current study provides empirical insights into task-driven reading in interactive multimedia.