In this study, we used a data-mining approach to identify hidden groups in a corpus-based second-language (L2) vocabulary experiment. After a vocabulary pre-test, 132 participants performed three online reading tasks (in random order), each equipped with one of the following glossary types: (1) concordance lines and definitions of target lexical items, (2) concordance lines of target lexical items only, and (3) no glossary information. Although a previous study based on variable-centred analysis (i.e. multiple regression analysis) revealed that more glossary information could lead to better learning outcomes (Lee, Warschauer & Lee, 2017), using a model-based clustering technique in the present study allowed us to unearth learner types not identified in the previous analysis. Rather than the performance pattern found in the previous study (more glossary information led to higher gains), we identified one learner group who were able to make successful use of concordance lines (and are thus well suited to data-driven learning, or DDL; Johns, 1991), and another group who showed limited L2 vocabulary learning when exposed to concordance lines only. Further, our results revealed that L2 proficiency intersects with the vocabulary gains of different learner types in complex ways. Therefore, using model-based clustering in computer-assisted language learning (CALL) research to understand the differential effects of accommodations can help us better identify hidden learner types and provide personalized CALL instruction.