
Advancing CALL research via data-mining techniques: Unearthing hidden groups of learners in a corpus-based L2 vocabulary learning experiment

Published online by Cambridge University Press:  11 September 2018

Hansol Lee
Affiliation:
University of California, Irvine, United States; Korea Military Academy, Republic of Korea (hansol3@uci.edu)
Mark Warschauer
Affiliation:
University of California, Irvine, United States (markw@uci.edu)
Jang Ho Lee
Affiliation:
Chung-Ang University, Republic of Korea (jangholee@cau.ac.kr)
Corresponding

Abstract

In this study, we used a data-mining approach to identify hidden groups in a corpus-based second-language (L2) vocabulary experiment. After a vocabulary pre-test, 132 participants performed three online reading tasks (in random order), each with a different glossary type: (1) concordance lines and definitions of target lexical items, (2) concordance lines of target lexical items only, and (3) no glossary information. A previous study based on variable-centred analysis (i.e. multiple regression analysis) found that more glossary information could lead to better learning outcomes (Lee, Warschauer & Lee, 2017); using a model-based clustering technique in the present study allowed us to unearth learner types not identified in that analysis. Instead of the single performance pattern found in the previous study (more glossary information led to higher gains), we identified one learner group that made successful use of concordance lines (and is thus well suited to data-driven learning, or DDL; Johns, 1991), and another group that showed limited L2 vocabulary learning when exposed to concordance lines only. Further, our results revealed that L2 proficiency intersects with the vocabulary gains of different learner types in complex ways. Therefore, using this technique in computer-assisted language learning (CALL) research to understand the differential effects of accommodations can help us better identify hidden learner types and provide personalized CALL instruction.
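
To make the idea of model-based clustering concrete, the following is a minimal sketch in Python and not the authors' analysis. It fits Gaussian mixture models with 1 to 4 components to per-learner gain scores under the three glossary conditions and compares them by BIC. Everything in it is an assumption made for illustration: the data are simulated, the two group profiles and the function names (make_synthetic_gains, fit_gmm) are hypothetical, the covariance matrices are restricted to diagonal form, and the EM loop is hand-written with the standard library only. BIC is computed here so that lower is better; some packages report it with the opposite sign.

```python
import math
import random


def make_synthetic_gains(n_per_group=60, seed=0):
    """Simulate gain scores for three glossary conditions (synthetic, NOT study data)."""
    rng = random.Random(seed)
    # Hypothetical group means: (concordance + definition, concordance only, no glossary).
    group_means = [(7.0, 7.0, 1.0), (7.0, 1.0, 1.0)]
    return [[rng.gauss(m, 0.7) for m in means]
            for means in group_means for _ in range(n_per_group)]


def log_density(x, mean, var):
    """Log density of a Gaussian with diagonal covariance at point x."""
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))


def e_step(data, weights, means, variances):
    """Return per-point responsibilities and the total log-likelihood."""
    resp, loglik = [], 0.0
    for x in data:
        logs = [math.log(w) + log_density(x, m, v)
                for w, m, v in zip(weights, means, variances)]
        top = max(logs)  # log-sum-exp trick for numerical stability
        norm = top + math.log(sum(math.exp(l - top) for l in logs))
        loglik += norm
        resp.append([math.exp(l - norm) for l in logs])
    return resp, loglik


def fit_gmm(data, k, n_iter=100, min_var=1e-3):
    """Fit a k-component diagonal Gaussian mixture by EM; return (BIC, hard labels)."""
    n, d = len(data), len(data[0])
    col_mean = [sum(x[j] for x in data) / n for j in range(d)]
    col_var = [max(sum((x[j] - col_mean[j]) ** 2 for x in data) / n, min_var)
               for j in range(d)]
    # Deterministic start: sort on the highest-variance column and cut into k chunks.
    widest = max(range(d), key=lambda j: col_var[j])
    ordered = sorted(data, key=lambda x: x[widest])
    chunks = [ordered[i * n // k:(i + 1) * n // k] for i in range(k)]
    means = [[sum(x[j] for x in c) / len(c) for j in range(d)] for c in chunks]
    variances = [col_var[:] for _ in range(k)]
    weights = [1.0 / k] * k
    for _ in range(n_iter):
        resp, _loglik = e_step(data, weights, means, variances)
        for c in range(k):  # M-step: update each component from its responsibilities
            nc = sum(r[c] for r in resp) + 1e-12
            weights[c] = nc / n
            means[c] = [sum(r[c] * x[j] for r, x in zip(resp, data)) / nc
                        for j in range(d)]
            variances[c] = [max(sum(r[c] * (x[j] - means[c][j]) ** 2
                                    for r, x in zip(resp, data)) / nc, min_var)
                            for j in range(d)]
    resp, loglik = e_step(data, weights, means, variances)
    labels = [max(range(k), key=lambda c: r[c]) for r in resp]
    n_params = (k - 1) + 2 * k * d  # mixing weights + means + variances
    bic = -2.0 * loglik + n_params * math.log(n)  # lower is better in this convention
    return bic, labels


data = make_synthetic_gains()
fits = {k: fit_gmm(data, k) for k in (1, 2, 3, 4)}
best_k = min(fits, key=lambda k: fits[k][0])
print("BIC by number of groups:", {k: round(fits[k][0], 1) for k in fits})
print("number of groups with the lowest BIC:", best_k)
```

In this simulated data both groups gain equally when definitions are provided, so a comparison of condition means alone would not separate them; the clustering step groups learners by their whole profile of gains across conditions. A dedicated package would normally be preferred in practice, since it can also compare richer covariance structures than the diagonal form assumed here.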

Type
Regular papers
Copyright
© European Association for Computer Assisted Language Learning 2018 

References

AbuSeileek, A. F. (2011) Hypermedia annotation presentation: The effect of location and type on the EFL learners’ achievement in reading comprehension and vocabulary acquisition. Computers & Education, 57(1): 1281–1291. https://doi.org/10.1016/j.compedu.2011.01.011
Bergman, L. R. & Magnusson, D. (1997) A person-oriented approach in research on developmental psychopathology. Development and Psychopathology, 9(2): 291–319. https://doi.org/10.1017/S095457949700206X
Boulton, A. (2009) Data-driven learning: Reasonable fears and rational reassurance. Indian Journal of Applied Linguistics, 35(1): 81–106.
Boulton, A. & Cobb, T. (2017) Corpus use in language learning: A meta-analysis. Language Learning, 67(2): 348–393. https://doi.org/10.1111/lang.12224
Chen, I.-J. & Yen, J.-C. (2013) Hypertext annotation: Effects of presentation formats and learner proficiency on reading comprehension and vocabulary learning in foreign languages. Computers & Education, 63: 416–423. https://doi.org/10.1016/j.compedu.2013.01.005
Chun, D. M. (2001) L2 reading on the Web: Strategies for accessing information in hypermedia. Computer Assisted Language Learning, 14(5): 367–403. https://doi.org/10.1076/call.14.5.367.5775
Cobb, T. (1999) Applying constructivism: A test for the learner-as-scientist. Educational Technology Research and Development, 47(3): 15–31. https://doi.org/10.1007/BF02299631
Cobb, T., Greaves, C. & Horst, M. (2001) Can the rate of lexical acquisition from reading be increased? An experiment in reading French with a suite of on-line resources. In Raymond, P. & Cornaire, C. (eds.), Regards sur la didactique des langues seconds. Montréal: Éditions logique, 133–153.
Csizér, K. & Dörnyei, Z. (2005) Language learners’ motivational profiles and their motivated learning behavior. Language Learning, 55(4): 613–659. https://doi.org/10.1111/j.0023-8333.2005.00319.x
Cunningham, S., Moor, P. & Carr, J. C. (2003) Cutting edge: Advanced with phrase builder. Harlow: Pearson Education.
Dolnicar, S. (2002) A review of unquestioned standards in using cluster analysis for data-driven market segmentation. In Shaw, R. N., Adam, S. & McDonald, H. (eds.), ANZMAC 2002: Proceedings of the Australian and New Zealand Marketing Academy Conference 2002. Deakin University, 2–4 December, 31–37.
Doornik, J. A. & Hansen, H. (2008) An omnibus test for univariate and multivariate normality. Oxford Bulletin of Economics and Statistics, 70(s1): 927–939. https://doi.org/10.1111/j.1468-0084.2008.00537.x
Educational Testing Service (2016) TOEIC® listening and reading test scored and the CEFR levels. https://www.etsglobal.org/Tests-Preparation/The-TOEIC-Tests/TOEIC-Listening-Reading-Test/Scores-Overview
Faul, F., Erdfelder, E., Lang, A.-G. & Buchner, A. (2007) G*Power 3: A flexible statistical power analysis program for the social, behavioral, and biomedical sciences. Behavior Research Methods, 39: 175–191. https://doi.org/10.3758/BF03193146
Field, A. P. (2009) Discovering statistics using SPSS (3rd ed.). London: Sage.
Firooz, H. (2015, March 4) When not to use Gaussian mixture model (EM clustering). https://hameddaily.blogspot.com/2015/03/when-not-to-use-gaussian-mixtures-model.html
Fitzmaurice, G. M., Laird, N. M. & Ware, J. H. (2012) Applied longitudinal analysis (2nd ed.). Hoboken: John Wiley & Sons.
Flowerdew, L. (2008) Pedagogic value of corpora: A critical evaluation. In Frankenberg-Garcia, A. (ed.), Proceedings of the 8th Teaching and Language Corpora conference. Associação de Estudos e de Investigação Cientifíca do ISLA-Lisboa, 115–119.
Flowerdew, L. (2015) Data-driven learning and language learning theories: Whither the twain shall meet. In Leńko-Szymańska, A. & Boulton, A. (eds.), Multiple affordances of language corpora for data-driven learning. Amsterdam: John Benjamins, 15–36.
Fraley, C. & Raftery, A. E. (1998) How many clusters? Which clustering method? Answers via model-based cluster analysis. The Computer Journal, 41(8): 578–588. https://doi.org/10.1093/comjnl/41.8.578
Fraley, C. & Raftery, A. E. (2002) Model-based clustering, discriminant analysis, and density estimation. Journal of the American Statistical Association, 97(458): 611–631. https://doi.org/10.1198/016214502760047131
Fraley, C., Raftery, A. E., Scrucca, L., Murphy, T. B. & Fop, M. (2017) mclust: Gaussian mixture modelling for model-based clustering, classification, and density estimation (R package version 5.3). https://CRAN.R-project.org/package=mclust
Frankenberg-Garcia, A. (2012) Learners’ use of corpus examples. International Journal of Lexicography, 25(3): 273–296. https://doi.org/10.1093/ijl/ecs011
Frankenberg-Garcia, A. (2014) The use of corpus examples for language comprehension and production. ReCALL, 26(2): 128–146. https://doi.org/10.1017/S0958344014000093
Fraser, C. A. (1999) Lexical processing strategy use and vocabulary learning through reading. Studies in Second Language Acquisition, 21(2): 225–241. https://doi.org/10.1017/S0272263199002041
Gass, S. M., Behney, J. & Plonsky, L. (2013) Second language acquisition: An introductory course (4th ed.). New York: Routledge.
Godwin-Jones, R. (2001) Tools and trends in corpora use for teaching and learning. Language Learning & Technology, 5(3): 7–12. https://doi.org/10125/44559
Henze, N. & Zirkler, B. (1990) A class of invariant consistent tests for multivariate normality. Communications in Statistics – Theory and Methods, 19(10): 3595–3617. https://doi.org/10.1080/03610929008830400
Huang, L.-S. (2011) Language learners as language researchers: The acquisition of English grammar through a corpus-aided discovery learning approach mediated by intra- and interpersonal dialogues. In Newman, J., Baayen, H. & Rice, S. (eds.), Corpus-based studies in language use, language learning, and language documentation. Amsterdam: Rodopi, 91–122.
Hummel, K. M. & French, L. M. (2016) Phonological memory and aptitude components: Contributions to second language proficiency. Learning and Individual Differences, 51: 249–255. https://doi.org/10.1016/j.lindif.2016.08.016
Johns, T. (1991) Should you be persuaded: Two examples of data-driven learning. In Johns, T. & King, P. (eds.), Classroom concordancing. English Language Research Journal, 4: 1–16.
Jung, Y. G., Kang, M. S. & Heo, J. (2014) Clustering performance comparison using K-means and expectation maximization algorithms. Biotechnology & Biotechnological Equipment, 28(Supp. 1): S44–S48. https://doi.org/10.1080/13102818.2014.949045
Lee, H. & Lee, J. H. (2013) Implementing glossing in mobile-assisted language learning environments: Directions and outlook. Language Learning & Technology, 17(3): 6–22. https://doi.org/10125/44334
Lee, H. & Lee, J. H. (2015) The effects of electronic glossing types on foreign language vocabulary learning: Different types of format and glossary information. The Asia-Pacific Education Researcher, 24(4): 591–601. https://doi.org/10.1007/s40299-014-0204-3
Lee, H., Warschauer, M. & Lee, J. H. (2017) The effects of concordance-based electronic glosses on L2 vocabulary learning. Language Learning & Technology, 21(2): 32–51. https://doi.org/10125/44610
Lee, H., Warschauer, M. & Lee, J. H. (2018) The effects of corpus use on second language vocabulary learning: A multilevel meta-analysis. Applied Linguistics. Advance online publication. https://doi.org/10.1093/applin/amy012
Leńko-Szymańska, A. & Boulton, A. (2015) Introduction: Data-driven learning in language pedagogy. In Leńko-Szymańska, A. & Boulton, A. (eds.), Multiple affordances of language corpora for data-driven learning. Amsterdam: John Benjamins, 1–14.
Lomicka, L. L. (1998) “To gloss or not to gloss”: An investigation of reading comprehension online. Language Learning & Technology, 1(2): 41–50. https://doi.org/10125/25020
Maris, E. (1998) Covariance adjustment versus gain scores—revisited. Psychological Methods, 3(3): 309–327. http://dx.doi.org/10.1037/1082-989X.3.3.309
Martin, K. I. & Ellis, N. C. (2012) The role of phonological short-term memory and working memory in L2 grammar and vocabulary learning. Studies in Second Language Acquisition, 34(3): 379–413. https://doi.org/10.1017/S0272263112000125
Meilă, M. & Heckerman, D. (2001) An experimental comparison of model-based clustering methods. Machine Learning, 42(1/2): 9–29. https://doi.org/10.1023/A:1007648401407
Mun, E. Y., von Eye, A., Bates, M. E. & Vaschillo, E. G. (2008) Finding groups using model-based cluster analysis: Heterogeneous emotional self-regulatory processes and heavy alcohol use risk. Developmental Psychology, 44(2): 481–495. https://doi.org/10.1037/0012-1649.44.2.481
Nassaji, H. (2003) L2 vocabulary learning from context: Strategies, knowledge sources, and their relationship with success in L2 lexical inferencing. TESOL Quarterly, 37(4): 645–670. https://doi.org/10.2307/3588216
Papi, M. & Teimouri, Y. (2014) Language learner motivational types: A cluster analysis study. Language Learning, 64(3): 493–525. https://doi.org/10.1111/lang.12065
Pires, A. M. & Branco, J. A. (2010) Projection-pursuit approach to robust linear discriminant analysis. Journal of Multivariate Analysis, 101(10): 2464–2485. https://doi.org/10.1016/j.jmva.2010.06.017
Plass, J. L., Chun, D. M., Mayer, R. E. & Leutner, D. (1998) Supporting visual and verbal learning preferences in a second-language multimedia learning environment. Journal of Educational Psychology, 90(1): 25–36. https://doi.org/10.1037/0022-0663.90.1.25
Poole, R. (2012) Concordance-based glosses for academic vocabulary acquisition. CALICO Journal, 29(4): 679–693. https://www.jstor.org/stable/pdf/calicojournal.29.4.679.pdf
Royston, P. (1991) sg3.5: Comment on sg3.4 and an improved D’Agostino test. Stata Technical Bulletin, 3: 23–24. https://stata-press.com/journals/stbcontents/stb3.pdf
Rüschoff, B. & Ritter, M. (2001) Technology-enhanced language learning: Construction of knowledge and template-based learning in the foreign language classroom. Computer Assisted Language Learning, 14(3–4): 219–232. https://doi.org/10.1076/call.14.3.219.5789
Schmitt, N. (2000) Vocabulary in language teaching. Cambridge: Cambridge University Press.
Schmitt, N. (2008) Review article: Instructed second language vocabulary learning. Language Teaching Research, 12(3): 329–363. https://doi.org/10.1177/1362168808089921
Scrucca, L., Fop, M., Murphy, T. B. & Raftery, A. E. (2016) mclust 5: Clustering, classification and density estimation using Gaussian finite mixture models. The R Journal, 8(1): 289–317. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5096736
Skehan, P. (1986) Cluster analysis and the identification of learner types. In Cook, V. (ed.), Experimental approaches to second language acquisition. Oxford: Pergamon, 81–94.
Staples, S. & Biber, D. (2015) Cluster analysis. In Plonsky, L. (ed.), Advancing quantitative methods in second language research. New York: Routledge, 243–274.
Tacq, J. (2010) Multivariate normal distribution. In Peterson, P., Baker, E. & McGaw, B. (eds.), International encyclopedia of education (3rd ed.). Oxford: Elsevier, 332–338. https://doi.org/10.1016/B978-0-08-044894-7.01351-8
Tseng, W.-T. & Schmitt, N. (2008) Toward a model of motivated vocabulary learning: A structural equation modeling approach. Language Learning, 58(2): 357–400. https://doi.org/10.1111/j.1467-9922.2008.00444.x
Witten, I. H., Frank, E., Hall, M. A. & Pal, C. J. (2016) Data mining: Practical machine learning tools and techniques (4th ed.). Cambridge, MA: Morgan Kaufmann.
Yamamori, K., Isoda, T., Hiromori, T. & Oxford, R. L. (2003) Using cluster analysis to uncover L2 learner differences in strategy use, will to learn, and achievement over time. International Review of Applied Linguistics in Language Teaching, 41(4): 381–409. https://doi.org/10.1515/iral.2003.017
Yanguas, I. (2009) Multimedia glosses and their effect on L2 text comprehension and vocabulary learning. Language Learning & Technology, 13(2): 48–67. https://doi.org/10125/44180
Yeung, K. Y., Fraley, C., Murua, A., Raftery, A. E. & Ruzzo, W. L. (2001) Model-based clustering and data transformations for gene expression data. Bioinformatics, 17(10): 977–987. https://doi.org/10.1093/bioinformatics/17.10.977
Yoshii, M. (2006) L1 and L2 glosses: Their effects on incidental vocabulary learning. Language Learning & Technology, 10(3): 85–101. https://doi.org/10125/44076

Lee et al. supplementary material
