
Partial and synchronized captioning: A new tool to assist learners in developing second language listening skill

Maryam Sadat Mirzaei, Kourosh Meshgi, Yuya Akita and Tatsuya Kawahara


This paper introduces a novel captioning method, partial and synchronized captioning (PSC), as a tool for developing second language (L2) listening skills. Unlike conventional full captioning, which provides the full text and allows learners to comprehend the material merely by reading, PSC promotes listening to the speech by presenting a selected subset of words, each synchronized to its corresponding speech signal. Word-level synchronization is realized by an automatic speech recognition (ASR) system dedicated to the desired corpora. This feature allows learners to become familiar with the correspondences between words and their utterances. Partialization is done by automatically selecting words or phrases likely to hinder listening comprehension. In this work, we presume that infrequent or specific words and fast delivery of speech are major barriers to listening comprehension. The word selection criteria are thus based on three factors: speech rate, word frequency and specificity. The thresholds for these features are adjusted to the proficiency level of the learners. The selected words are presented to aid listening comprehension, while the remaining words are masked to keep learners listening to the audio. PSC was evaluated against no-captioning and full-captioning conditions using TED videos. The results indicate that PSC leads to the same level of comprehension as the full-captioning method while presenting less than 30% of the transcript. Furthermore, compared with the other methods, PSC can serve as an effective medium for decreasing dependence on captions and preparing learners to listen without any assistance.
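The selection step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `TimedWord` structure, the per-word syllables-per-second rate, the frequency-rank and rate thresholds, the dot-masking convention, and all sample values are assumptions made purely for illustration. A word is shown if it is infrequent, specific, or spoken quickly; every other word is masked but keeps its timing, so the caption stays synchronized with the audio.

```python
from dataclasses import dataclass


@dataclass
class TimedWord:
    """One word with word-level timing (hypothetical structure)."""
    text: str
    start: float       # seconds, e.g. taken from an ASR alignment
    end: float         # seconds
    syllables: int     # syllable count, used for the speech-rate estimate
    freq_rank: int     # rank in a reference frequency list (1 = most frequent)
    is_specific: bool  # flagged as a specific (e.g. academic or domain) word


def speech_rate(word):
    """Syllables per second for a single word (an assumed local measure)."""
    duration = word.end - word.start
    return word.syllables / duration if duration > 0 else float("inf")


def should_show(word, max_rank, max_rate):
    """Show a word if any of the three factors marks it as difficult."""
    return (
        word.freq_rank > max_rank          # infrequent word
        or word.is_specific                # specific word
        or speech_rate(word) > max_rate    # fast delivery
    )


def mask(text, mask_char="."):
    """Replace letters and digits with a mask character, keep punctuation."""
    return "".join(mask_char if c.isalnum() else c for c in text)


def partial_caption(words, max_rank, max_rate):
    """Return (start, end, display_text) triples, one per word."""
    return [
        (w.start, w.end,
         w.text if should_show(w, max_rank, max_rate) else mask(w.text))
        for w in words
    ]


# Made-up sample data; ranks, timings and flags are illustrative only.
words = [
    TimedWord("the", 0.00, 0.15, 1, 1, False),
    TimedWord("mitochondria", 0.15, 0.95, 5, 25000, True),
    TimedWord("is", 0.95, 1.10, 1, 10, False),
    TimedWord("small", 1.10, 1.50, 1, 600, False),
]

# A lower max_rank models a less proficient learner, who sees more words.
for start, end, shown in partial_caption(words, max_rank=3000, max_rate=7.0):
    print(f"{start:5.2f}-{end:5.2f}  {shown}")
```

Lowering `max_rank` or `max_rate` reveals more words, which corresponds to adjusting the thresholds to a learner's proficiency level; the specific numbers used here carry no empirical meaning.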




