1. Introduction
Data-driven learning (DDL) is an approach that uses corpora in the classroom to promote learners’ understanding of language patterns (Boulton & Vyatkina, 2024). DDL reflects constructivist principles (Cobb, 1999a) by engaging learners in exploring authentic data and constructing lexical knowledge through observation and inference (Boulton & Vyatkina, 2024). DDL also aligns with the noticing hypothesis (Schmidt, 1990), as concordance lines present lexical items and their context in a salient way (Liu & Gablasova, 2025). Pedagogical procedures for DDL commonly follow an inductive sequence from corpus observation to generalization, as reflected in Flowerdew’s (2009) “four Is” of illustration, interaction, intervention, and induction, and Johns’ (1991) identify-classify-generalize approach.
Lexical acquisition refers to the process of building large, well-structured, and richly interconnected lexical networks that encompass both knowing the meanings of words and knowing their uses, collocations, and associations (Cobb, 1999b). In writing, lexical acquisition can be reflected in lexical complexity, which is defined as the degree of elaboration, diversity, and range of lexical items and collocations in a text, and is widely recognized as an effective indicator of L2 proficiency (Kyle et al., 2021). Lexical complexity comprises lexical sophistication and lexical diversity, each reflecting a distinct aspect of lexical acquisition (Bulté & Housen, 2012). Lexical sophistication indicates learners’ ability to use less frequent and later-acquired words (Kyle & Crossley, 2015). Lexical diversity reflects the breadth of productive vocabulary through the variety of distinct words used in a text (Nasseri & Thompson, 2021).
Numerous DDL studies have reported positive effects of corpus use on language learning, particularly on lexical acquisition (e.g., Boulton & Cobb, 2017; Dong et al., 2023; Lee et al., 2019; Pérez-Paredes, 2022). In the domain of lexical complexity, previous studies have provided preliminary evidence for the positive impact of DDL on various lexical complexity measures such as frequency, variety, and type-token ratio (TTR; e.g., Muftah, 2023; Şahin Kızıl, 2023; Sun & Hu, 2023). These findings highlight the potential of DDL to improve learners’ lexical complexity in authentic written production.
Despite growing interest in DDL, its effects on lexical complexity from a dynamic perspective remain underexplored. The term dynamic refers to the longitudinal and nonlinear nature of lexical development over time. Informed by complex dynamic systems theory (CDST), this study views L2 lexical development as a dynamic process in which subsystems such as lexical diversity and sophistication fluctuate over time, and individuals may experience periods of rapid growth, plateaus, or regressions (Larsen-Freeman & Cameron, 2008). In this study, lexical complexity was operationalized in terms of lexical diversity and lexical sophistication. Lexical density was not included because it has been shown to relate more closely to register than to proficiency and is therefore rarely used as an indicator of lexical complexity (Díez-Ortega & Kyle, 2024; Engber, 1995; Linnarud, 1986).
2. Literature review
2.1. Research on the effectiveness of DDL for lexical acquisition
Previous studies have examined the effectiveness of DDL for vocabulary and collocation learning and have generally reported positive effects. These studies have assessed DDL effects through vocabulary tests (Chan & Liou, 2005; Karras, 2016), collocation tests (Liu & Gablasova, 2025; Saeedakhtar et al., 2020; Vyatkina, 2016), and comprehension and production tests (Frankenberg-Garcia, 2014). These findings suggest that DDL could affect not only learners’ knowledge of individual words and collocations but also their productive use of lexical resources in writing. DDL may therefore help learners move beyond repetitive vocabulary and internalize more sophisticated lexical patterns, thereby fostering lexical diversity and sophistication in their writing. Examining DDL from the perspective of lexical complexity thus provides both a theoretically grounded lens for understanding how corpus use shapes learners’ productive vocabulary and pedagogical insights into how DDL can promote different aspects of lexical acquisition in L2 writing.
To our knowledge, only a small number of studies have used lexical complexity to assess learners’ lexical acquisition. For instance, Luo (2016) examined the effects of DDL activities on learners’ writing performance using sophisticated TTR and found no significant improvements in complexity measures. Tsai (2021) examined the effects of DDL on EFL learners’ lexical diversity in writing and reported improvements in most measures, such as the number of different words and lexical word variation, with the exception of verb variation. Later, Samoudi and Modirkhamene (2022) compared direct DDL, indirect DDL, and non-DDL instruction across complexity, accuracy, and fluency dimensions, and reported that indirect DDL significantly improved accuracy and fluency, but not complexity. More recently, Hu and Deng (2023) examined the impact of DDL writing practices on learners’ lexical diversity, measured via MTLD, and lexical sophistication, measured via the ratio of low-frequency word types, and found significant improvement in both indices. Şahin Kızıl (2023) incorporated lexical diversity measured by TTR and reported that DDL facilitated effective writing revision. Muftah (2023) used a similar measure and found comparable effects for BNCweb (Hoffmann et al., 2008) and Sketch Engine (Kilgarriff et al., 2014).
2.2. Research on the developmental pattern of lexical complexity
CDST has increasingly been adopted in second language acquisition (SLA) research as a lens to investigate linguistic complexity (Bulté & Housen, 2020). CDST conceptualizes language as a dynamic and adaptive system in which change is characterized by variability, nonlinearity, and sensitivity to contextual factors (Dong et al., 2023; Larsen-Freeman, 2009). Within this framework, language is composed of multiple interconnected subsystems including morphological, lexical, syntactic, and discourse features. Lexical complexity is often understood as developing through fluctuations, spurts, and regressions that reflect the interplay of multiple subsystems (Bulté & Housen, 2020). Importantly, CDST highlights variability itself as a central window into development, treating it not as statistical “noise” but as an integral part of the language learning process and a potential driver of growth (Larsen-Freeman, 2009).
A growing body of longitudinal research has examined developmental trajectories of lexical complexity across different learner groups and instructional contexts. These studies showed that developmental trajectories of lexical complexity were closely tied to instructional environments. More specifically, naturalistic exposure and immersion programs yielded fluctuations and regressions in the growth of lexical diversity (Pfenninger, 2020; Polat & Kim, 2014); classroom instruction showed either linear or nonlinear development, depending on the dimension (Kim et al., 2025; Zheng, 2016); and study-abroad contexts contributed to learners’ nonlinear growth of lexical diversity (Díez-Ortega & Kyle, 2024; McManus et al., 2021). For instance, Polat and Kim (2014) examined the development of lexical diversity among Turkish immigrants learning English under naturalistic exposure and found nonlinear development in lexical diversity. McManus et al. (2021) similarly documented gradual but nonlinear growth in lexical diversity among advanced French and Spanish learners studying abroad. Zheng (2016) also reported nonlinear development in lexical sophistication and diversity among L2 learners receiving classroom instruction. Pfenninger (2020) likewise identified nonlinear growth in lexical diversity among Swiss learners in a bilingual immersion program, and Díez-Ortega and Kyle (2024) found steady linear increases in diversity but mixed, index-dependent trends in sophistication among UK students of Spanish studying abroad. By contrast, Kim et al. (2025) observed significant linear growth in sophistication but no change in diversity among beginning Japanese EFL students in a university English course.
Despite the growing body of research on lexical complexity development, limited attention has been paid to how DDL influences the longitudinal development of lexical complexity. To address this gap, the present study adopted a longitudinal, mixed-methods design involving two types of instruction: DDL and traditional instruction. Specifically, we examine (a) pretest–posttest comparisons under timed conditions, (b) holistic developmental trajectories across timed and untimed writing conditions, and (c) learners’ perceptions of the DDL activities. The study is guided by the following research questions (RQs):
1. What is the effectiveness of DDL on the different dimensions of lexical complexity of learners’ argumentative writing in timed conditions, as compared with traditional instruction?
2. What are the developmental trajectories of the different dimensions of lexical complexity of learners’ argumentative writing in timed and untimed conditions under DDL instruction, as compared with traditional instruction?
3. How do learners perceive the effectiveness, experience, and challenges of DDL activities?
3. Method
3.1. Research setting and participants
The study involved undergraduate students in a compulsory English writing course in China, comprising a DDL class (n = 26) and a non-DDL class (n = 22). Based on their scores in the Test for English Majors Band 4 (TEM-4), the participants’ English proficiency was approximately aligned with the B1-B2 level of the Common European Framework of Reference for Languages (CEFR; Council of Europe, 2001). A pretest of argumentative writing was conducted before the intervention. An independent-samples t-test showed no significant difference between the DDL class (M = 74.48, SD = 5.44) and the non-DDL class (M = 74.17, SD = 4.43), t(46) = 0.22, p = 0.826, indicating that the two classes were comparable in writing proficiency. In the pre-instruction questionnaire, no participant reported prior use of a corpus for language learning.
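The pretest comparison can be approximately re-derived from the summary statistics alone. The following is a minimal sketch, assuming a pooled-variance (Student) independent-samples t-test; the function name `pooled_t` is illustrative and not part of the study's materials. Because the reported means and SDs are rounded, the recomputed statistic is expected to match the reported value only approximately.

```python
import math


def pooled_t(m1, sd1, n1, m2, sd2, n2):
    """Student's t from summary statistics, assuming equal variances."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (m1 - m2) / se, df


# Pretest writing scores reported for the DDL and non-DDL classes.
t_stat, df = pooled_t(74.48, 5.44, 26, 74.17, 4.43, 22)
print(f"t({df}) = {t_stat:.2f}")
```

The degrees of freedom (26 + 22 − 2 = 46) match the reported t(46), and the recomputed statistic falls close to the reported 0.22.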
3.2. Instructional procedures
The course spanned 18 weeks, with four hours of English instruction per week. Both classes followed the same curriculum and undertook the same assessments. Both the DDL and non-DDL classes were taught by the same instructor (one of the authors). Two hours each week were dedicated to the pedagogical intervention described below, while the remaining instructional time focused on general argumentative writing instruction. DDL and non-DDL instruction were embedded in regular class time (see Appendix A; all appendices appear in the supplementary material).
In the DDL instruction, learners used Sketch Engine, a corpus analysis platform chosen for its powerful functions such as wildcard searches and frequency comparisons (Crosthwaite & Steeples, 2024). At the beginning of the semester, a one-hour training session introduced the DDL class to Sketch Engine, focusing mainly on the Concordance and WordSketch modules, which learners then used throughout the corpus-searching activities. The instructor modeled how to conduct searches and interpret the results, and students then practiced under guidance. Concordance (Figure 1) enabled observation of usage patterns in concordance lines, while WordSketch presented collocational profiles grouped by grammatical relations (Figure 2) with contexts (Figure 3).
Figure 1. Screenshot of the Sketch Engine interface showing concordance search results for “claim”.

Figure 2. Screenshot of the Sketch Engine interface showing collocates of “argue”.

Figure 3. Screenshot of the Sketch Engine interface showing concordances of “strongly argued”.

Learners’ corpus searches were guided by worksheets, which were designed by the instructor based on previous DDL practices and pedagogical experience, and further refined through consultation with three other EFL teachers to ensure instructional clarity and appropriateness. The selection of lexical items adhered to the course syllabus and was informed by the prescribed textbook, Writing Critically 3 for Argumentative Writing (Chen et al., 2016), while also taking into account frequency information from the British Academic Written English Corpus (BAWE; Alsop & Nesi, 2009). For example, Units 2 and 3 of this textbook focus on making and supporting claims (see Appendix A); accordingly, target collocations for constructing claims and data (e.g., assert a claim, provide evidence) were incorporated into guided worksheets for classroom practice (see Appendix B for a sample worksheet). Students first searched individually, then discussed their findings in groups of two to four, and were encouraged to share them with the whole class.
In the non-DDL class, students received instruction based on the same lexical items but followed the traditional presentation–practice–production (PPP) approach (Smart, 2014). In the presentation phase, the learners were introduced to the target vocabulary via slides. During the practice phase, they completed controlled exercises. In the production phase, they used the vocabulary in writing-related tasks such as drafting topic sentences.
3.3. Measures of lexical complexity
Guided by the two dimensions of lexical complexity, we adopted four indices (see Table 1) because they have been shown to capture different aspects of learners’ lexical use and have been widely adopted in previous studies (e.g., Nasseri & Thompson, 2021; Zhou et al., 2023). TAALES (Kyle & Crossley, 2015) was used to measure lexical sophistication, and TAALED (Kyle et al., 2021) was used to assess lexical diversity. Further explanation of these indices can be found in Appendix C.
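To illustrate what a moving-average type-token ratio such as mattr50_cw measures, the following is a minimal sketch of the general MATTR technique: the type-token ratio is computed in every window of a fixed length and the window ratios are averaged. It is not TAALED's implementation. The function name `mattr`, the whitespace tokenization, and the short-text fallback are assumptions of this sketch, and the content-word filtering implied by the `_cw` suffix is omitted.

```python
def mattr(tokens, window=50):
    """Moving-average type-token ratio over all windows of a fixed size."""
    if not tokens:
        raise ValueError("tokens must not be empty")
    if len(tokens) < window:
        # Assumption of this sketch: fall back to plain TTR for short texts.
        return len(set(tokens)) / len(tokens)
    ratios = [
        len(set(tokens[i:i + window])) / window
        for i in range(len(tokens) - window + 1)
    ]
    return sum(ratios) / len(ratios)


# Toy sentence with a small window so the arithmetic is easy to follow.
sample = "the cat saw the dog and the dog saw the cat".split()
print(mattr(sample, window=4))
```

Because every window has the same length, the resulting score is less sensitive to text length than a plain type-token ratio, which matters when timed and untimed essays differ in length.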
Table 1. Indices for assessing lexical complexity

3.4. Data collection and analysis
3.4.1. Writing samples
The diachronic effects of DDL on learners’ lexical complexity were evaluated through writing samples. This study adopted both timed in-class writing (Weeks 1 and 18) and untimed out-of-class essays (Weeks 5, 10, and 15) for a more comprehensive evaluation of students’ writing abilities. The writing samples were collected before generative AI tools such as ChatGPT became publicly available; corpus tools were inaccessible during writing, and submissions were checked using plagiarism detection tools (more details can be found in Appendix D).
Before analysis, composite z-scores were computed to represent each learner’s overall lexical complexity at each time point. Specifically, all four indices were z-standardized; the negatively oriented index (COCA_academic_frequency_log_aw) was reverse-coded, and the standardized indices were then averaged to form a composite score, such that higher composite z-scores indicated greater lexical complexity (see the detailed procedures in Appendix E).
To compare lexical complexity before and after instruction (RQ1), we used paired-samples t-tests to compare the pretest and posttest within each class and independent-samples t-tests to compare the two classes at the pretest and at the posttest; both tests were timed in-class writing tasks, completed in Weeks 1 and 18. Effect size thresholds recommended by Plonsky and Oswald (2014) were adopted, namely 0.60 (small), 1.00 (medium), and 1.40 (large) for within-group designs and 0.40 (small), 0.70 (medium), and 1.00 (large) for between-group designs.
To measure changes in lexical complexity over time (RQ2), a series of linear mixed-effects (LME) models was employed. For each lexical complexity measure, we built three separate models to examine both within-group changes and between-group differences. These models were applied across five time points to capture developmental trajectories in the composite measure of overall lexical complexity and in the four individual indices representing lexical diversity and lexical sophistication throughout the intervention. The first two models were run separately for the DDL and non-DDL classes. Each included time as a fixed factor, participant as a random intercept, random slopes for time, and one lexical complexity measure as the dependent variable. The third model included data from both classes, with time, group, and their interaction as fixed effects, and participant as a random intercept with random slopes for time. Together, the three models allowed us to evaluate trends within each pedagogical condition as well as differences between the groups. All models were built using the lme4 package (Bates et al., 2015) in R (R Core Team, 2021).
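The composite-score procedure described above (standardize each index, reverse-code the negatively oriented one, then average) can be sketched as follows. This is an illustration under stated assumptions rather than the procedure in Appendix E: the helper names `zscores` and `composite` are hypothetical, the input values are invented, and standardization is applied across whatever set of texts is passed in.

```python
from statistics import mean, stdev


def zscores(values):
    """Standardize values to mean 0 and sample SD 1."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def composite(indices, reversed_keys):
    """Average z-scored indices per text, flipping negatively oriented ones."""
    standardized = {}
    for name, values in indices.items():
        z = zscores(values)
        standardized[name] = [-v for v in z] if name in reversed_keys else z
    n_texts = len(next(iter(standardized.values())))
    return [mean(col[i] for col in standardized.values()) for i in range(n_texts)]


# Hypothetical raw index values for three texts (not data from the study).
raw = {
    "COCA_academic_frequency_log_aw": [3.0, 2.8, 2.6],
    "Kuperman_AoA_aw": [5.0, 5.5, 6.0],
    "mattr50_cw": [0.70, 0.75, 0.80],
    "mtld_original_aw": [60.0, 70.0, 80.0],
}
scores = composite(raw, reversed_keys={"COCA_academic_frequency_log_aw"})
print([round(s, 2) for s in scores])
```

In this toy input the third text has the lowest word frequency and the highest values on the other three indices, so after reverse-coding all four indices point the same way and it receives the highest composite score.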
3.4.2. Questionnaires
The pre-instruction questionnaire, administered in Week 1, collected learners’ demographic information and prior corpus knowledge. The post-instruction questionnaire in Week 18 assessed learners’ perceptions of DDL activities, following previous DDL research (Crosthwaite & Steeples, 2024; Yoon & Hirvela, 2004). The 14-item questionnaire covered perceived effectiveness, experience of corpus use, and ease and challenges of corpus use. Responses were recorded on a 5-point Likert scale and analyzed using percentages, means, and standard deviations.
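The descriptive summary applied to each questionnaire item can be sketched as below. The responses are hypothetical, the function name `summarize_item` is illustrative, and counting ratings of 4 or 5 as agreement is an assumption of this sketch rather than a rule stated in the study.

```python
from statistics import mean, stdev


def summarize_item(responses, agree_from=4):
    """Return (percent agreeing, mean, sample SD) for one Likert item."""
    agree = sum(1 for r in responses if r >= agree_from)
    return 100 * agree / len(responses), mean(responses), stdev(responses)


# Hypothetical responses on a 1-5 scale (not data from the study).
item = [5, 4, 4, 3, 4, 2, 5, 4, 3, 4]
pct, m, sd = summarize_item(item)
print(f"{pct:.2f}% agree, M = {m:.2f}, SD = {sd:.2f}")
```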
3.4.3. Interviews and reflective journals
To triangulate the quantitative findings, we collected monthly reflective journals from DDL participants and conducted semi-structured interviews with six randomly selected students (Appendix F). The qualitative data were analyzed thematically following Braun and Clarke (2006). The first and second authors coded the data independently, refined the codes into broader themes, and resolved discrepancies through discussion. Percentage agreement reached 89%, indicating strong interrater reliability (Rau & Shih, 2021).
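Percentage agreement between two coders is simply the share of coded segments that received the same code. The sketch below uses hypothetical code labels and an illustrative function name, `percent_agreement`; it does not reproduce the study's coding scheme.

```python
def percent_agreement(codes_a, codes_b):
    """Share of segments (in percent) that two coders labeled identically."""
    if len(codes_a) != len(codes_b):
        raise ValueError("both coders must label the same segments")
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return 100 * matches / len(codes_a)


# Hypothetical codes assigned to five interview segments.
coder_1 = ["word choice", "collocation", "difficulty", "guidance", "word choice"]
coder_2 = ["word choice", "collocation", "difficulty", "word choice", "word choice"]
print(percent_agreement(coder_1, coder_2))
```

Unlike chance-corrected statistics such as Cohen's kappa, this measure does not adjust for agreement expected by chance, which is worth keeping in mind when interpreting the 89% figure.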
4. Results
4.1. Learners’ performances on overall lexical complexity
4.1.1. The pre–post comparisons of overall lexical complexity
Table 2 presents descriptive statistics for overall lexical complexity (z-scores) across the five time points, and Tables 3 and 4 report the paired- and independent-samples t-test results for T1 (pretest) and T5 (posttest). The within-group analysis showed that the DDL class improved from T1 (M = −0.127, SD = 0.345) to T5 (M = 0.182, SD = 0.402), t(25) = 3.695, p = 0.001, d = 0.788, while the non-DDL class declined from T1 (M = 0.150, SD = 0.379) to T5 (M = −0.198, SD = 0.419), t(21) = −2.303, p = 0.033, d = 0.515. The between-group analysis also showed a significant difference at posttest, t(46) = −3.133, p = 0.003, d = 0.926, indicating that the DDL class achieved higher overall lexical complexity than the non-DDL class at posttest.
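The reported effect sizes can be read against the Plonsky and Oswald (2014) benchmarks quoted in Section 3.4.1. The sketch below recomputes the between-group d at posttest from the reported means and SDs, assuming a pooled-SD Cohen's d, and labels each reported value; the function names and the "below small" label are illustrative choices, not terminology from the study.

```python
import math

# Benchmarks quoted in Section 3.4.1 (Plonsky & Oswald, 2014).
THRESHOLDS = {
    "within": (0.60, 1.00, 1.40),
    "between": (0.40, 0.70, 1.00),
}


def pooled_d(m1, sd1, n1, m2, sd2, n2):
    """Cohen's d using the pooled standard deviation of two groups."""
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2)
    return (m1 - m2) / math.sqrt(pooled_var)


def label(d, design):
    """Map an effect size onto the small/medium/large benchmarks."""
    small, medium, large = THRESHOLDS[design]
    size = abs(d)
    if size >= large:
        return "large"
    if size >= medium:
        return "medium"
    if size >= small:
        return "small"
    return "below small"


# Posttest (T5) composite z-scores: DDL class vs. non-DDL class.
d_between = pooled_d(0.182, 0.402, 26, -0.198, 0.419, 22)
print(f"between-group d = {d_between:.2f} ({label(d_between, 'between')})")
print("DDL within-group:", label(0.788, "within"))
print("non-DDL within-group:", label(0.515, "within"))
```

Under these benchmarks, the between-group difference at posttest is a medium effect, the DDL class's gain is a small effect, and the non-DDL class's decline falls below the small threshold even though it is statistically significant.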
Table 2. Descriptive statistics for overall lexical complexity

Table 3. Results of paired-samples t-tests for overall lexical complexity

Note: *p < .05, **p < .01, ***p < .001.
Table 4. Results of independent-samples t-tests for overall lexical complexity

Note: *p < .05, **p < .01, ***p < .001.
4.1.2. Developmental trajectories of overall lexical complexity
The developmental trajectories of overall lexical complexity in the two classes are shown in Figure 4, where the red line represents the DDL class and the blue line represents the non-DDL class. As can be seen, learners in the DDL class showed a gradual increase in overall lexical complexity; in contrast, the non-DDL class exhibited a consistent decline. The within-group LME models (Table 5) illustrated that the change over time was significant for both classes (DDL: β = 0.074, SE = 0.027, t = 2.761, p = 0.007; non-DDL: β = −0.083, SE = 0.026, t = −3.227, p = 0.002). The between-group LME models (Table 6) showed a significant interaction between time and group, β = 0.155, SE = 0.040, t = 3.899, p < .001. Notably, from T3 onward, the DDL class maintained higher mean values than the non-DDL class, and the gap widened over time. This pattern suggests that overall lexical complexity developed more positively in the DDL class than in the non-DDL class.
Figure 4. Developmental trajectories of the composite z-scores.

Table 5. Results of within-group LME models with fixed factor of time for overall lexical complexity

Note: *p < .05, **p < .01, ***p < .001.
Table 6. Results of between-group LME models for overall lexical complexity

Note: *p < .05, **p < .01, ***p < .001.
As regards the developmental trajectories, at the group level, the mean z-scores showed a relatively linear trend across time, likely because the composite score summarized multiple standardized indices. At the individual level, however, learners followed nonlinear trajectories, indicating that lexical complexity development was not uniformly linear across learners.
4.2. Learners’ performances on lexical sophistication
4.2.1. The pre–post comparisons of lexical sophistication
Learners in the DDL class showed a significant increase in lexical sophistication, as evidenced by the significant decrease in the negatively oriented measure COCA_academic_frequency_log_aw from the pretest (T1: M = 0.297, SD = 0.980) to the posttest (T5: M = −0.294, SD = 0.954), t(25) = 2.243, p = 0.036, d = 0.478 (see Tables 7 and 8). In contrast, the non-DDL class showed a decrease in lexical sophistication, with the value of this measure increasing from T1 (M = −0.351, SD = 0.926) to T5 (M = 0.321, SD = 0.969), although this change was not statistically significant (p = 0.140). The between-group comparison (Table 9) further revealed a significant difference at posttest, t(46) = 2.167, p = 0.036, d = 0.640, indicating higher sophistication in the DDL class. For Kuperman_AoA_aw, no significant within-group changes were found in either class (p > .05). As shown by the descriptive results in Table 7, the DDL class increased from T1 (M = −0.292, SD = 1.005) to T5 (M = 0.176, SD = 1.187), whereas the non-DDL class decreased from T1 (M = 0.345, SD = 0.898) to T5 (M = −0.191, SD = 0.726). These results suggest that the DDL class showed more positive development in lexical sophistication than the non-DDL class. Because corpus tools were not available during the tests, this may indicate that learners had internalized the lexical knowledge and could apply it in independent writing.
Table 7. Descriptive statistics for lexical sophistication

Table 8. Results of paired-samples t-tests for lexical sophistication

Note: *p < .05, **p < .01, ***p < .001.
Table 9. Results of independent-samples t-tests for lexical sophistication

Note: *p < .05, **p < .01, ***p < .001.
Examples 1 and 2 (see Table 10) illustrate sentences produced by learners in the DDL class. These two examples contain advanced lexical choices such as reconsider, implement, regulations, and alleviate, along with the phrase from my perspective, which reflect high lexical sophistication. In contrast, Examples 4 and 5, produced by students in the non-DDL class, contain more general lexical choices such as ban, encourage, help, and take, indicating a comparatively lower level of lexical sophistication.
Table 10. Learners’ writing examples from the posttest writing task

4.2.2. Developmental trajectories of lexical sophistication
Figure 5 illustrates the developmental trajectories of lexical sophistication across five time points. The negatively oriented measure COCA_academic_frequency_log_aw showed a significant decrease over time in the DDL class, β = −0.128, SE = 0.061, t = −2.086, p = 0.039 (see Table 11), indicating increased use of less frequent vocabulary. In contrast, the non-DDL class showed a significant increase over time, β = 0.147, SE = 0.064, t = 2.283, p = 0.024. The interaction effect between time and group was also significant, β = −0.275, SE = 0.089, t = −3.082, p = 0.002 (see Table 12), indicating divergent trajectories between the two classes.
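One way to read the time-by-group interaction is as the gap between the two classes' slopes for time. The check below compares the difference between the reported within-group slopes with the reported interaction coefficient, assuming the non-DDL class is the reference level so that the interaction equals the DDL slope minus the non-DDL slope. Estimates from separately fitted models need not match the joint model exactly, so only approximate agreement is expected. The dictionary keys and the function name `slope_gap` are illustrative.

```python
# Fixed-effect slopes for time as reported in Sections 4.1.2 and 4.2.2.
reported = {
    "overall composite": {
        "ddl": 0.074, "non_ddl": -0.083, "interaction": 0.155,
    },
    "COCA_academic_frequency_log_aw": {
        "ddl": -0.128, "non_ddl": 0.147, "interaction": -0.275,
    },
}


def slope_gap(ddl, non_ddl):
    """Difference between the DDL and non-DDL slopes for time."""
    return ddl - non_ddl


for name, est in reported.items():
    gap = slope_gap(est["ddl"], est["non_ddl"])
    print(f"{name}: slope gap = {gap:.3f}, interaction = {est['interaction']:.3f}")
```

For both measures the slope gap and the interaction coefficient agree to within rounding, which is consistent with reading the interaction as the per-time-point divergence between the two classes.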
Figure 5. Developmental trajectories of lexical sophistication measures.

Table 11. Results of within-group LME models with fixed factor of time for lexical sophistication

Note: *p < .05, **p < .01, ***p < .001.
Table 12. Results of between-group LME models for lexical sophistication

Note: *p < .05, **p < .01, ***p < .001.
For Kuperman_AoA_aw, the DDL class exhibited a significant increase over time (β = 0.125, SE = 0.059, t = 2.142, p = 0.034), suggesting greater use of later-acquired words. The interaction between time and group was also significant (β = 0.277, SE = 0.086, t = 3.203, p = 0.002). As can be seen in Figure 5, the non-DDL class showed a decreasing pattern, while the DDL class increased over time. These results indicate that learners in the DDL class used increasingly complex words over time, whereas those in the non-DDL class showed the opposite tendency.
The mean trajectory of COCA_academic_frequency_log_aw showed a continuous decrease in the DDL class, with a steeper decline from T1 to T2 followed by a more gradual decrease, whereas the non-DDL class showed a relatively rapid increase from T1 to T2 and a near-linear, gradual increase from T2 to T5. The mean trajectory of Kuperman_AoA_aw in the DDL class increased slightly from T1 to T2, decreased slightly from T2 to T3, increased from T3 to T4, and then decreased slightly from T4 to T5; in the non-DDL class, it decreased from T1 to T2, increased slightly from T2 to T3, decreased from T3 to T4, and then increased slightly from T4 to T5. At the individual learner level, trajectories showed fluctuation and nonlinearity over time. This suggests that lexical sophistication development was not uniformly linear across learners.
4.3. Learners’ performances on lexical diversity
4.3.1. The pre–post comparisons of lexical diversity
For lexical diversity (Table 13), the descriptive statistics showed an upward trend in mattr50_cw for the DDL class (T1: M = −0.203, SD = 1.067; T5: M = 0.267, SD = 1.000), whereas the non-DDL class showed a decline (T1: M = 0.240, SD = 0.879; T5: M = −0.291, SD = 0.937). Within each group, no statistically significant changes were found (p > .05; see Table 14), and the between-group comparison at posttest (see Table 15) was also not significant, t(46) = −1.952, p = 0.057, d = 0.574. Although these differences did not reach significance, the descriptive pattern suggests that the DDL class tended to show greater lexical diversity at posttest than the non-DDL class. This tendency might be due to their sustained two-hour weekly engagement with corpus activities, which repeatedly exposed them to varied lexical patterns. Additional results for mtld_original_aw are provided in Appendix G in the supplementary material.
Table 13. Descriptive statistics for lexical diversity

Table 14. Results of paired-samples t-tests for lexical diversity

Table 15. Results of independent-samples t-tests for lexical diversity

Note: *p < .05, **p < .01, ***p < .001.
As exemplified in Example 3, lexical diversity was reflected in a varied range of vocabulary, including both general and academic words (e.g., practical solution, unrealistic, stringent). Example 6 from the non-DDL class illustrates more limited lexical diversity, with lexical items mostly being high-frequency and general words (e.g., ease traffic, reduce, air and noise pollution, good decision), with less use of precise, abstract, or domain-specific vocabulary.
4.3.2. Developmental trajectory of lexical diversity
Table 16 shows the within-group LME results for lexical diversity, and Table 17 displays the between-group LME results. Figure 6 shows that the DDL class exhibited a gradual upward trajectory in mattr50_cw, whereas the non-DDL class showed an initial decline followed by a relatively stable pattern. The between-group LME model showed a significant interaction between time and group for mattr50_cw (β = 0.237, SE = 0.085, t = 2.797, p = 0.006), indicating that learners in the DDL class developed lexical diversity more positively than those in the non-DDL class.
Table 16. Results of within-group LME models with fixed factor of time for lexical diversity

Note: *p < .05, **p < .01, ***p < .001.
Table 17. Results of between-group LME models for lexical diversity

Note: *p < .05, **p < .01, ***p < .001.
Figure 6. Developmental trajectories of mattr50_cw.

The trajectories of mean values for mattr50_cw increased in an approximately linear pattern from T1 to T3 and then remained relatively stable from T3 to T5 in the DDL class, whereas in the non-DDL class, the mean score decreased from T1 to T3 and then remained relatively stable from T3 to T5. At the individual level, learners followed highly varied and nonlinear paths, reflecting the dynamic and learner-specific nature of lexical diversity development over time, similar to the results of lexical sophistication.
4.4. Learners’ perceptions of DDL activities
4.4.1. Effectiveness of using corpora for argumentative writing
Figure 7 provides an overview of the three themes identified, namely the effectiveness of using corpora for argumentative writing, experience of corpus use, and ease of using corpora. The results show that 82.61% (M = 3.74, SD = 0.83) of learners found corpora beneficial for improving collocation use, and 73.92% (M = 3.70, SD = 0.82) reported enhanced word choice in argumentative writing. Interview results further supported the benefits of DDL for word choice and accuracy, as illustrated in Excerpts 1 and 2 (Table 18). More than half of the learners (56.53%; M = 3.57, SD = 0.79) agreed that using corpora was beneficial for their argumentative writing. For example, one student explained in the interview that consulting corpora helped them choose lexical items that made their essay appear more coherent and well organized, thus improving the quality of their argumentative writing (see Excerpt 3).
Figure 7. Results of the post-instruction questionnaire.

Table 18. Learner excerpts by theme

4.4.2. The use of corpora
The analysis showed that 60.87% of learners (M = 3.43, SD = 0.98) used corpora during writing practice and 78.26% (M = 3.74, SD = 0.83) used them to resolve writing difficulties, as illustrated in Excerpt 4. The results also showed that 65.21% (M = 3.74, SD = 0.90) could search corpora independently, 73.91% (M = 3.57, SD = 1.00) were confident in learning language use from corpora, and 73.91% (M = 3.83, SD = 0.87) evaluated the structured guidance positively (see Excerpts 5 and 6). In addition, 52.18% (M = 3.57, SD = 1.00) reported that corpus-based writing activities were interesting.
Despite the overall positive evaluations, several learners reported challenges in using corpus tools, particularly at the initial stages. The questionnaire results showed that 86.90% of learners encountered difficulties in using corpora, and 39.10% considered corpus consultation time-consuming.
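The agreement percentages and the M and SD values reported in this section can be read as simple summaries of Likert-type responses. The sketch below shows one way such summaries could be derived. It assumes a 5-point scale on which ratings of 4 and 5 count as agreement and uses the sample standard deviation; the function name `summarize_item` and the example responses are hypothetical and are not the study's data.

```python
from statistics import mean, stdev
from typing import Dict, Sequence


def summarize_item(responses: Sequence[int], agree_from: int = 4) -> Dict[str, float]:
    """Summarize one questionnaire item (illustrative sketch).

    Returns the percentage of ratings at or above `agree_from`,
    together with the mean and the sample standard deviation.
    """
    if len(responses) < 2:
        raise ValueError("at least two responses are needed for a standard deviation")
    agree = sum(1 for r in responses if r >= agree_from)
    return {
        "pct_agree": round(100 * agree / len(responses), 2),
        "mean": round(mean(responses), 2),
        "sd": round(stdev(responses), 2),
    }


# Hypothetical responses on an assumed 5-point scale (not the study's data).
example = [5, 4, 4, 3, 2]
print(summarize_item(example))
```

Reporting the percentage alongside M and SD is useful because two items with the same share of agreement can still differ in how strongly learners agreed.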
5. Discussion
5.1. Effectiveness of DDL on lexical complexity
The analysis of the effectiveness of DDL on lexical complexity (RQ1) showed that the DDL class improved significantly in overall lexical complexity, whereas the non-DDL class showed a significant decline. More specifically, the results suggest that learners in the DDL class exhibited increased lexical sophistication, reflected in a greater use of less frequent vocabulary and words that are typically acquired at later stages of language development. This may be explained by their frequent engagement with concordance lines and authentic language input through corpus tools, exposure that may have helped learners internalize and use academic and low-frequency vocabulary. These findings are consistent with previous studies suggesting that DDL fosters the development of lexical complexity and supports the acquisition of academic and abstract vocabulary (Boulton & Cobb, 2017; Hu & Deng, 2023; Tsai, 2021). The decline observed in the non-DDL class may be attributed to the limited input provided in the PPP approach, which tends to lead learners to recycle familiar high-frequency vocabulary rather than extend their lexical repertoire (Harris & Leeming, 2022). The results further indicate that the DDL class demonstrated stronger development in lexical diversity over time, particularly as reflected in the measures mtld_original_aw and mattr50_cw. This suggests that engagement with DDL activities may have supported learners in expanding the range of lexical items used in their writing across the instructional period. This benefit might be attributed to learners’ exposure to a wider range of vocabulary resources, helping them avoid relying on repetitive vocabulary (Boulton & Cobb, 2017). This is consistent with Şahin Kızıl (2023) and Muftah (2023), who reported that DDL significantly enhanced lexical diversity in L2 writing.
This finding corroborates previous DDL research suggesting that DDL is more effective in enhancing the acquisition of lexical patterns and collocations than traditional instruction (Boulton & Cobb, 2017; Frankenberg-Garcia, 2014; Lee et al., 2019; Liu & Gablasova, 2025; Saeedakhtar et al., 2020; Sun & Hu, 2023; Vyatkina, 2016), which may be explained by the underlying mechanisms of DDL. First, the KWIC format provided repeated exposure to authentic language data, helping learners notice and apply recurring lexical patterns (Frankenberg-Garcia, 2014; Pérez-Paredes, 2010). Second, the discovery-based nature of DDL positioned learners as active agents engaging with language data, which may enhance autonomy, motivation, and metalinguistic reflection (Johns, 1991). Third, repeated engagement with authentic corpus data may have facilitated incidental acquisition of academic and low-frequency vocabulary (Muftah, 2023).
5.2. Developmental trajectory of lexical complexity
This study also examined the developmental trajectories of lexical complexity under DDL and traditional instruction across five time points (RQ2). The results revealed clearly divergent patterns between the two classes: overall lexical complexity, lexical sophistication, and lexical diversity increased over time in the DDL class, whereas measures of these dimensions declined in the non-DDL class. At the individual learner level, the developmental trajectories were not uniformly linear; rather, they exhibited fluctuations and nonlinear patterns across the observation period.
First, the results showed that the two classes followed diverging developmental trajectories across overall lexical complexity as well as the sophistication and diversity indices. Specifically, the overall lexical complexity z-scores showed a linear increase at the mean level in the DDL class and a linear decrease in the non-DDL class. A similar pattern was also observed for all measures of lexical sophistication, including corpus-based and psycholinguistic measures. For instance, the increase in Kuperman_AoA_aw showed that learners gradually used more advanced words, which were more abstract, lower in frequency, and cognitively more demanding. Lexical diversity likewise followed an increasing trajectory in the DDL class but a decreasing one in the non-DDL class. Taken together, the trajectories suggest that DDL was more effective than traditional instruction in supporting the development of overall lexical complexity, lexical sophistication, and lexical diversity.
Second, the two lexical sophistication indices showed different mean-level trajectories over time. More specifically, the frequency-based index COCA_academic_frequency_log_aw showed a relatively steady, unidirectional pattern, whereas the psycholinguistic index Kuperman_AoA_aw showed greater fluctuation and nonlinearity. This difference may be related to the way the two indices are calculated. Frequency-based indices are based on the overall frequency values of many lexical items in a text and therefore provide a broader frequency profile, so changes in a small set of words are less likely to alter the mean score markedly across time points. By contrast, age-of-acquisition indices are more item-dependent and more sensitive to the lexical items sampled in a given text (De Wilde, 2023). As a result, the inclusion or absence of a small number of later-acquired words in a writing sample may lead to greater variation in mean-level trajectories across tasks (Zhou et al., 2023). This result is in line with Kim (2021), who found different trajectories for different lexical measures.
Third, at the individual learner level, all indices showed fluctuation and nonlinearity, with trajectories displaying variability and non-monotonic patterns over time. This is in line with previous studies showing that lexical sophistication and lexical diversity did not develop in a continuous linear manner and that individual trajectories often diverged from group-level trends (Díez-Ortega & Kyle, 2024; Zheng, 2016). This variable and nonlinear pattern is also consistent with a CDST view, which regards variability as an inherent characteristic of L2 development rather than as random noise (Díez-Ortega & Kyle, 2024; Spoelman & Verspoor, 2010).
5.3. Learners’ perceptions of DDL activities
The analysis of learners’ perceptions found overall positive attitudes toward DDL instruction (RQ3). These observations are consistent with previous research showing that learners generally perceived corpus use as beneficial for developing L2 writing skills and useful for their L2 learning (Boulton, 2010; Yoon & Hirvela, 2004). Moreover, many learners reported successful application of corpus skills during writing, indicating that learners were able to transfer corpus-based strategies from classroom instruction to real-world writing tasks. A majority of participants expressed confidence in their ability to independently search corpora and interpret corpus output, suggesting that DDL fostered both corpus skills and writing confidence (Liu & Gablasova, 2025; Yoon & Hirvela, 2004). Learners also viewed corpus-based writing activities as interesting, in line with Boulton (2010).
Despite these benefits, some challenges were identified, particularly at the initial stage of DDL implementation. Learners reported difficulties with corpus interface design and search syntax, which made it challenging to locate relevant examples or interpret the results effectively (Dong & Wang, 2025; Liu & Gablasova, 2025; Saeedakhtar et al., 2020). This points to a broader concern in implementing DDL: although corpus tools are pedagogically valuable, they are often regarded as cognitively demanding and time-consuming, particularly when learners lack sufficient training (Boulton, 2010; Zhao et al., 2026). Several learners in this study also emphasized that effective corpus use requires prior knowledge of grammar and word classes, which may pose a barrier for lower-proficiency learners. These findings support previous arguments that DDL tasks require cognitive and linguistic resources and may overwhelm learners in the absence of structured support (Boulton & Cobb, 2017).
From a pedagogical standpoint, these difficulties underscore the need for explicit corpus literacy training, particularly at the early stages of DDL implementation. Teachers may need to carefully consider the potential challenges of introducing DDL, as learners who are unfamiliar with corpus tools may find it overwhelming. For instance, instructors can start with teacher-prepared corpus samples before transitioning students to independent searches, serving as a more accessible entry point for beginners (Sun & Hu, 2023). Once learners become familiar with corpus use, they tend to find it both effective and intrinsically motivating, highlighting the value of investing in training and task design (Boulton & Cobb, 2017).
6. Conclusion
This study investigated the effectiveness of DDL in promoting different dimensions of lexical complexity, explored their developmental trajectories, and examined learners’ perceptions of DDL instruction. A comparative analysis revealed statistically significant differences, with DDL proving more effective in enhancing lexical complexity. Longitudinal analyses further uncovered diverse developmental trajectories across multiple dimensions of lexical complexity, characterized by fluctuations, regressions, and periods of stability. Moreover, the analysis of learners’ responses to questionnaires and interviews indicated that, despite reported difficulties in using corpus tools, students generally perceived DDL as beneficial.
This study makes several contributions to the field of DDL. Theoretically, this study applies a multidimensional framework of lexical complexity that integrates both corpus-based and psycholinguistic indices. This framework provides a more nuanced approach to analyzing L2 lexical development, contributing to the understanding of lexical complexity in SLA. Empirically, the study suggests benefits of DDL for promoting lexical complexity and provides insights into lexical development. Furthermore, the study’s findings support the view that L2 lexical development is nonlinear, individualized, and dynamic. Learner trajectories did not follow uniform, linear progressions; instead, they may be influenced by individual differences and task-specific factors in L2 lexical acquisition, which may be worth further exploration. Pedagogically, the findings indicate that DDL has the potential to enhance more sophisticated and varied vocabulary use. At the same time, given the challenges students faced in using corpus tools, there is a need for scaffolding, gradual training, and adaptation of corpus-based tasks to learners’ proficiency levels.
It is also necessary to point out several limitations. First, the small sample size limited the generalizability of the findings, and future research may consider including more diverse participants to validate the findings in a larger population. Second, while this study attempted to balance topic familiarity with the need to avoid topic repetition across tasks, the effect of topic could not be fully controlled. Future studies may consider adopting a more controlled design to reduce topic effect. Third, unauthorized assistance during learners’ out-of-class writing tasks could not be entirely ruled out. Future research may consider using online timed writing platforms or tracking revision histories to enhance writing integrity.
Supplementary material
To view supplementary material for this article, please visit https://doi.org/10.1017/S0958344026100494
Data availability statement
The data that support the findings of this study are available from the corresponding author upon reasonable request.
Authorship contribution statement
Yanan Zhao: Methodology; Formal analysis; Writing – original draft. Jihua Dong: Supervision; Investigation; Writing – review & editing.
Funding disclosure statement
This work was supported by the National Social Science Foundation of China (No. 21FYYB052) and the Educational Reform Project of Shandong Province (Grant No. Z2022151).
Competing interests statement
The authors declare no competing interests.
Ethical statement
Ethical approval for this study was granted by Shandong University. Participation was voluntary, and informed consent was obtained from all participants before the study began. Participants were informed that their writing samples, questionnaire responses, interviews, and journals would be anonymized and used only for research purposes. No financial or non-financial competing interests affected the design, conduct, or reporting of the study.
GenAI use disclosure statement
During manuscript preparation, the authors used ChatGPT (OpenAI, accessible at https://chatgpt.com/) with GPT-5.2 in December 2025 and GPT-5.3 in March 2026 for limited language polishing and proofreading of author-written text. The tool was not used to generate data, analyze data, interpret results, or produce references. No modification or custom training of the tool was undertaken. All AI-assisted outputs were carefully reviewed and revised by the authors, who accept full responsibility for the content of the article as submitted.
About the authors
Yanan Zhao is a PhD student in the School of Foreign Languages and Literature at Shandong University, China. She holds a master’s degree in languages and linguistics from the University of Melbourne, Australia. Her research interests include data-driven learning, second language teaching and learning, and corpus linguistics. She has published in ReCALL, Computer Assisted Language Learning, and Assessing Writing.
Jihua Dong is a Professor, Qilu Young Scholar, and Taishan Young Scholar at Shandong University, China. Her research interests include corpus linguistics, data-driven learning and teaching, and academic writing. She has published in Applied Linguistics, Assessing Writing, English for Specific Purposes, Journal of English for Academic Purposes, ReCALL, and System, among others.