1. Introduction
Sentences with interpretive ambiguity often pose challenges for second language (L2) learners, partly because learners must acquire the ability to map a single linguistic form to multiple interpretations. An example of this complexity arises with sentences involving quantifier scope, where the presence of two logical operators can create scope ambiguities, as in ‘Every horse didn’t jump over the fence’, which allows a surface scope (SS) reading (‘none of the horses jumped’) or an inverse scope (IS) reading (‘only some horses jumped’).
IS interpretations pose particular difficulties because (1) their derivation is more complex due to the misalignment between surface syntax and semantic scope (Anderson, 2004; O’Grady & Lee, 2006) and (2) their availability varies across languages (Scontras et al., 2017). Consequently, L2 research has extensively examined scope ambiguity acquisition (e.g., Chu et al., 2014; Chung, 2013; Lee, 2009; Marsden, 2009; Özçelik, 2018; Wu et al., 2019; Wu & Ionin, 2022), though findings remain mixed. Some studies report that L2 learners find IS interpretations less acceptable than native speakers do (e.g., Chu et al., 2014; Chung, 2009; Kim, 2010; Lee, 2009; Wu & Ionin, 2022), while others suggest comparable acceptance levels (e.g., Özçelik, 2018). Several factors have been identified as contributing to this variability, including L1 transfer (Kim, 2010), discourse context (Chung, 2022), L2 proficiency (Chung, 2009; Marsden, 2009) and explicit instructions (Wu & Ionin, 2022).
However, to the best of our knowledge, little research has examined how individual differences in cognitive abilities, such as working memory (WM) and inhibitory control (IC), contribute to the variability in quantifier scope interpretation, particularly the IS interpretation. Recent developments in L2 research advocate moving beyond a simplistic focus on the differences between L1 and L2 processing (cf. Clahsen & Felser, 2006) toward addressing how various linguistic and extralinguistic factors shape language acquisition and processing among L2 speakers. Such an approach permits a more nuanced understanding of the linguistic and cognitive mechanisms underlying L2 acquisition and processing (Hopp, 2022; Jackson, 2023; Juffs & Fang, 2023; Roberts, 2012). Much of the extant research on the role of L2 proficiency and cognitive abilities in L2 acquisition and processing has focused on morphosyntactic phenomena, such as subject–verb agreement, which questions, relative clauses and garden-path structures (e.g., Cotter & Ferreira, 2024; Fang, 2024; Fang & Xu, 2022; Fang & Wu, 2024; Havik et al., 2009; Hopp, 2017, 2024; Juffs, 2004; Santiago & Kim, 2024).
Quantifier scope, lying at the interface of syntax, semantics and pragmatics, is expected to exhibit greater variability in its computation by L2 learners, even if that computation is not qualitatively different from that of native speakers (Chung, 2013). This variability may arise for two main reasons: (1) L1–L2 differences in scope interpretation may lead to negative transfer effects (Chu et al., 2014; Wu et al., 2019; Wu & Ionin, 2022) and (2) L2 learners may struggle to integrate multiple information types due to processing limitations (Hopp, 2022; Sorace & Serratrice, 2009). Against this backdrop, the extent to which individual differences (linguistic and extralinguistic) among L2 learners influence their quantifier scope interpretation remains an open question. To address it, this study employs a picture selection task to investigate whether, and in what ways, L2 proficiency and cognitive capacities (WM and IC) affect quantifier scope interpretation.
2. Literature review
2.1. Linguistic background on quantifier scope
To enhance the generalizability of our results across various scope configurations, we included two types of quantifier scope structures: doubly quantified (DQ) sentences and negatively quantified (NQ) sentences, as exemplified in (1) and (2). Both exhibit scope ambiguity in English (Kurtzman & MacDonald, 1993; Musolino & Lidz, 2006), each allowing both an SS interpretation and an IS interpretation.


In (1), an existential quantifier precedes a universal quantifier. Depending on the relative scope relations between these two logical operators at logical form (LF), this sentence is ambiguous between the SS interpretation (1a) (i.e., a single child climbed multiple trees) and the IS interpretation (1b) (i.e., each tree has a different child climbing it). Similarly, sentence (2) with a universal quantifier and negation is ambiguous between an SS reading (2a) (i.e., none of the horses jumped over the fence) and an IS reading (2b) (i.e., only some, but not all, of the horses jumped over the fence). In both cases, SS is derived based on the c-command relationship between the two logical operators (Musolino & Lidz, 2006), whereas IS is derived through the covert syntactic operation of quantifier raising (QR) at LF (May, 1985).
It is worth noting that if the orders of logical operators in (1) and (2) are reversed, as in sentences like ‘Every child climbed a tree’ and ‘The horse didn’t jump over every fence’, the IS reading would entail the SS reading. In other words, whenever the IS reading is true, the SS reading will also be true. In such cases, the IS reading describes a specific situation already covered by the SS reading, making it impossible to determine whether the IS reading can be independently obtained (Reinhart, 1976). To avoid this entailment confound and ensure that acceptance of IS reflects an independent reading, the current study focuses on configurations like (1) and (2) (Mayr & Spector, 2010; Scontras et al., 2017).
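The entailment asymmetry described above can be checked mechanically. The following is a minimal sketch, not part of the study's materials: it enumerates every possible ‘climbed’ relation over a hypothetical toy domain of two children and two trees and tests which reading entails which. The domain, the function names and the set-based representation are all illustrative assumptions.

```python
from itertools import product

# Hypothetical toy domain (an assumption for illustration only).
CHILDREN = ("c1", "c2")
TREES = ("t1", "t2")
PAIRS = [(c, t) for c in CHILDREN for t in TREES]


def all_models():
    """Yield every possible 'climbed' relation as a set of (child, tree) pairs."""
    for bits in product([False, True], repeat=len(PAIRS)):
        yield {pair for pair, bit in zip(PAIRS, bits) if bit}


def some_child_all_trees(climbed):
    # Existential over universal: one child climbed every tree.
    return any(all((c, t) in climbed for t in TREES) for c in CHILDREN)


def each_tree_some_child(climbed):
    # Universal over existential: every tree was climbed by some child.
    return all(any((c, t) in climbed for c in CHILDREN) for t in TREES)


def each_child_some_tree(climbed):
    # Universal over existential: every child climbed some tree.
    return all(any((c, t) in climbed for t in TREES) for c in CHILDREN)


def some_tree_all_children(climbed):
    # Existential over universal: one tree was climbed by every child.
    return any(all((c, t) in climbed for c in CHILDREN) for t in TREES)


def entails(p, q):
    """True if q holds in every model in which p holds."""
    return all(q(m) for m in all_models() if p(m))


# 'Every child climbed a tree': IS (some tree, all children) entails SS.
reversed_is_entails_ss = entails(some_tree_all_children, each_child_some_tree)

# Existential-before-universal configuration: IS does not entail SS.
target_is_entails_ss = entails(each_tree_some_child, some_child_all_trees)
```

Under these assumptions, `reversed_is_entails_ss` is true and `target_is_entails_ss` is false: in the existential-before-universal configuration, the IS reading can hold in a scenario where the SS reading fails, which is what allows IS acceptance to be diagnosed independently.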
Despite potential ambiguity in English, SS is generally preferred over IS. Anderson (2004) proposed the processing scope economy (PSE) principle, which suggests that IS, arising from a non-isomorphic relationship between surface syntax and semantic scope, requires additional syntactic movement at LF. This incurs extra processing costs, making IS less preferred than SS. However, evidence regarding the preferred reading in NQ remains mixed (e.g., Musolino & Lidz, 2006; Wu et al., 2019), highlighting the need for a native English-speaking group in this study to establish baseline interpretations.
In contrast, Chinese, often considered a scope-rigid language, is claimed to lack scope ambiguity, allowing only SS readings (Aoun & Li, 1989, 1993; Huang, 1982). Strong and consistent empirical evidence supports this claim. Across several studies testing native Chinese speakers on Chinese counterparts of configurations like (1) and (2) (Chen, 2024; Chu et al., 2014; Fan, 2017; Scontras et al., 2017; Wu et al., 2019; Wu & Ionin, 2022; Zhou & Crain, 2009), researchers consistently found that IS is unavailable in Chinese.
However, because these studies often varied in their methodologies and in how English structures were translated into Chinese, more compelling cross-linguistic evidence should come from studies directly comparing native English and native Chinese speakers using the same method in their respective languages. Scontras et al. (2017) and Wu et al. (2019) offer exemplary designs of this kind. Scontras et al. tested English native speakers on sentences like (1) and Chinese native speakers on their Chinese counterparts. Participants completed a web-based judgment task in which they rated the acceptability of an auditorily presented sentence paired with a picture depicting either SS or IS. Ratings were provided on a 7-point Likert scale. Results showed that IS readings received significantly higher ratings in English (average ≈ 4.5) than in Chinese (average ≈ 1.5). Similarly, Wu et al. (2019) employed a context-based acceptability judgment task. Native English speakers evaluated English sentences like (2), and Chinese native speakers judged the corresponding Chinese counterparts. In each trial, participants read a short written story and viewed a picture supporting either SS or IS and then rated the sentence on a 4-point Likert scale. Sentences supporting IS received near-floor ratings in Chinese and significantly lower ratings than in English, where mean ratings exceeded 2 out of 4.
Together, these findings confirm that, unlike English, Chinese does not permit IS interpretations for either DQ or NQ sentences. A detailed discussion of different approaches to scope rigidity in Chinese is beyond the scope of this paper (cf. Aoun & Li, 1989, 1993; Lee, 1991; Xu & Lee, 1989). One of the most influential accounts of scope interpretation in Chinese is Huang’s (1982) isomorphic principle (later termed as such by Aoun & Li, 1989), which posits a strict isomorphism between surface syntactic positions and scope relations at LF. According to this principle, the scope relations between logical operators in both (1) and (2) can be directly read off their surface syntactic configurations. Aoun and Li (1989, 1993) further attributed such cross-linguistic differences in scope interpretation between Chinese and English to structural differences in constituent configuration within a generative framework.
Setting theoretical issues aside, empirical evidence from English and Chinese regarding IS availability suggests that L2 learners of English with Chinese as their L1 may struggle with the interpretation of IS not only due to its inherent processing demands but also because of negative transfer from their L1, resulting in variability shaped by individual differences.
2.2. Previous studies on L2 acquisition of quantifier scope
Quantifier scope has garnered considerable attention in L1 and L2 acquisition research, with most studies conducted from a generative perspective (see Marsden, 2024, for a recent overview). The literature on L1 child acquisition of quantifier scope is extensive, and its review is beyond the scope of this paper. Therefore, we focus on reviewing L2 studies relevant to the current research.
L2 acquisition of quantifier scope has been studied across various language populations and structures (e.g., Marsden, 2009, on DQ in L2 Japanese; Ionin et al., 2014, on DQ in L2 Russian; Wu & Ionin, 2022, on NQ in L2 English; Jo et al., 2021, on NQ in L3 English). Most research examines whether L2 learners can acquire target-like quantifier scope interpretations despite L1–L2 differences. For instance, Marsden (2009) investigated DQ structures in Japanese among L1-English and L1-Korean learners. Unlike English, which allows both SS and IS readings, Japanese and Korean permit only SS. A picture-based truth-value judgment task showed that intermediate English-speaking learners of Japanese struggled, whereas advanced English-speaking learners and Korean-speaking learners of Japanese successfully rejected IS readings, demonstrating native-like patterns. These findings highlight L1 transfer effects and their interaction with L2 proficiency. Similarly, Chung (2013) found that advanced Korean-speaking learners of English could overcome negative L1 transfer, accepting the IS reading of the NQ structure in English, despite its unavailability in Korean.
However, L2 proficiency does not always mitigate L1 transfer effects, which can persist even at advanced levels. Chu et al. (2014) tested Chinese-speaking learners of English on DQ sentences like ‘Someone dropped every plate’, where IS readings are available in English but not in Chinese. Both intermediate and advanced learners rejected IS readings, suggesting persistent L1 transfer effects. Wu and Ionin (2022) similarly found that learners, regardless of proficiency, rejected IS readings of DQ structures in English, despite extensive exposure to high-quality L2 input. Additionally, Wu et al. (2019) examined NQ structures among intermediate-to-advanced Chinese-speaking learners of English, who also failed to exhibit native-like patterns, though this study did not analyze the role of L2 proficiency.
Taken together, the evidence regarding the role of L2 proficiency is mixed, rendering its modulating influence on L1 transfer inconclusive. Possible reasons include small sample sizes with limited proficiency ranges, which may obscure effects, and the use of categorical proficiency variables (e.g., Chu et al., 2014; Ionin et al., 2014), which can reduce statistical power and potentially yield spurious results (Leal, 2018; Plonsky & Oswald, 2017). Notably, these studies were not designed to examine L2 proficiency explicitly. To address this, our study investigates the role of L2 proficiency in scope interpretation using a larger sample with a broad proficiency range and treating proficiency as a gradient construct (Van Hell & Tanner, 2012).
Beyond L1 transfer and L2 proficiency, cognitive factors like WM and IC – both known to influence L2 sentence comprehension (e.g., Linck et al., 2014; Linck & Weiss, 2015) – have received little attention in L2 quantifier scope research. WM’s role in scope interpretation has been sporadically explored in other populations, including child L1 (Wang, 2021), child L2 (Kim et al., 2023), child L3 (Jo et al., 2021) and older adults (Kemtes & Kemper, 1999), with little evidence of its impact on child learners. However, to our knowledge, no studies have examined WM’s role in adult L2 learners’ scope interpretations. Given that IS interpretations involve a non-isomorphic alignment between surface word order and logical representation, they should be cognitively demanding and likely place greater demands on WM than SS readings.
Likewise, no studies have yet investigated the role of other cognitive factors, such as IC, in quantifier scope interpretation. IC is a domain-general function for controlled processing encompassing processes like conflict detection and resolution, context maintenance and inhibition (Friedman & Miyake, 2004). In this study, we focus on one specific control capacity: the ability to suppress interference (interference suppression) or inhibit irrelevant information (response inhibition) (Rodriguez-Fornells et al., 2006). This ability is expected to play a role in acquiring IS interpretations. We hypothesize that learners with stronger IC are less affected by negative L1 transfer, as they can better suppress L1-driven influences and exhibit more native-like patterns. Additionally, IC may help mitigate overreliance on shallow processing strategies, such as word-order heuristics that strongly favor SS interpretations (Dwivedi, 2013). In turn, it may facilitate the acceptance of IS interpretations.
In summary, the factors influencing L2 acquisition of quantifier scope remain unclear. While research on the role of L2 proficiency has yielded mixed and inconclusive results, cognitive factors such as WM and IC remain largely understudied. To address these gaps, this study investigates the L2 acquisition of quantifier scope by examining English quantifier scope interpretations among Chinese-speaking learners, with a particular focus on the relative contributions of L2 proficiency, WM and IC to quantifier scope interpretation.
3. Present study
Given the linguistic background on quantifier scope and the overview of prior L2 studies, this study addressed the following two research questions using a picture selection task in the covered-box paradigm.
1) How do Chinese-speaking learners of English acquire quantifier scope in English, particularly the IS interpretations which are absent in Chinese?
2) How do linguistic (e.g., interpretation complexity and L2 proficiency) and extralinguistic cognitive factors (e.g., WM and IC) influence L2 learners’ scope interpretations?
4. Methods
4.1. Participants
The study involved 70 adult Chinese-speaking learners of English as the L2 group and 40 native English speakers as the control group. The L2 participants were undergraduate students from various universities in Mainland China at the time of testing (61 females, 9 males, mean age = 20.5, standard deviation [SD] = 2.1). They reported being native speakers of Mandarin Chinese and indicated that their exposure to English occurred primarily through formal classroom instruction. None had lived in English-speaking countries or regions. On average, they began learning English at 7.5 years of age (SD = 3) and had received approximately 12 years of classroom English instruction (SD = 2.1). The participants’ English proficiency was assessed using the Lexical Test for Advanced Learners of English (LexTALE), which has been shown to strongly correlate with standardized measures like the Quick Placement Test (QPT) (Lemhöfer & Broersma, 2012). LexTALE scores ranged from 42.5 to 96.3 (mean = 62.2, SD = 11.6), indicating a wide spectrum of proficiency levels from low to advanced, as shown in Figure 1.

Figure 1. Distribution of LexTALE scores within the L2 group.
The L1 group included native English speakers (21 females, 19 males, mean age = 31.3, SD = 7.5), recruited through Prolific (www.prolific.co), a platform widely used in psycholinguistic research (Fang & Juffs, 2024; Juffs & Fang, 2023). The recruitment advertisement was accessible only to individuals who met specific screening criteria: aged 18–45, residing in the United States, having English as their native language and being raised monolingually in English. All L1 and L2 participants had normal or corrected-to-normal vision and reported no history of language-related deficits. Participants were compensated for their time.
4.2. Materials and design
We adopted the covered-box paradigm (CBP) to investigate participants’ knowledge of quantifier scope (Huang et al., 2013). In each trial, participants saw a test sentence and two pictures, one of which was covered. They selected a picture based on their interpretation of the sentence: if the visible picture matched their interpretation, they chose it; otherwise, they could select the covered one, which might better match their interpretation (Figure 2).

Figure 2. Sample item in the CBP.
The CBP has the advantage of minimizing underestimation of less preferred interpretations, such as IS readings. In tasks presenting multiple interpretations simultaneously, participants often select the strongly preferred reading, which may obscure the fact that their grammar permits the less preferred interpretation even though processing demands make it less accessible (Anderson, 2004). The CBP avoids this confound by displaying only one target interpretation at a time, as in the present study.
Unlike sentence-picture verification tasks, which require explicit meta-linguistic judgments and can be challenging for non-linguists, especially for less preferred readings (Fang & Francis, 2025; Fanselow et al., 2022), the CBP simply asks participants to select a picture. This makes it a more accessible and effective method for assessing IS interpretations, particularly for L2 learners, including those with lower proficiency levels.
The study employed a 2 × 2 design with two target structures (DQ and NQ, as in (1) and (2)) and two target interpretations (SS and IS readings), yielding four conditions: SS of DQ, IS of DQ, SS of NQ and IS of NQ. Each condition had 12 items, totaling 48. In each trial, a test sentence (DQ or NQ) was paired with two pictures – one depicting the target interpretation (SS or IS) and the other covered. For example, in Figure 2, the NQ sentence ‘Every horse didn’t jump over the fence’ is shown with a visible image illustrating the IS reading (‘Only some horses jumped’). Participants accepting this reading chose the visible picture; otherwise, they could choose the hidden picture, which, in their mind, could be the correct interpretation.
Critical items were distributed across two lists using a Latin square design, ensuring each participant saw only one interpretation per sentence. For instance, the SS and IS readings of ‘Every horse didn’t jump over the fence’ appeared in different lists to prevent repetition. Each list contained 48 test items and 96 syntactically matched fillers to obscure the experiment’s purpose.
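The counterbalancing described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' actual list-construction script: the item labels, the list names ‘A’ and ‘B’ and the alternating rotation scheme are hypothetical; the source specifies only two lists, 24 sentences per structure and 12 items per condition in each list.

```python
def build_lists(n_per_structure=24):
    """Assign each sentence to one reading per list (two-list Latin square).

    Hypothetical scheme: even-indexed sentences show SS in list A and IS in
    list B; odd-indexed sentences show the opposite pairing.
    """
    lists = {"A": [], "B": []}
    for structure in ("DQ", "NQ"):
        for i in range(n_per_structure):
            in_a, in_b = ("SS", "IS") if i % 2 == 0 else ("IS", "SS")
            item = f"{structure}{i + 1:02d}"  # illustrative item label
            lists["A"].append((item, structure, in_a))
            lists["B"].append((item, structure, in_b))
    return lists


lists = build_lists()
```

With this rotation, each list contains 48 critical items, 12 per condition, and no sentence appears in the same reading in both lists.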
4.3. Procedure
L1 and L2 participants completed the following tasks in sequence: a WM test using the backward digit span task, the CBP task, the LexTALE test (administered only to L2 participants), an IC test using the Simon task and a language background questionnaire. Participants provided their informed consent prior to the experiment.
4.3.1. Backward digit span test
The backward digit span task was used to assess participants’ general WM capacity (Wechsler, 1981). As a non-verbal task, it measures overall WM rather than language-specific components, such as verbal or phonological WM. During the task, participants viewed digit sequences (e.g., 2, 5 and 3) presented at a rate of one digit per second and were instructed to recall the digits in reverse order (e.g., 3, 5 and 2) by typing them into a computerized text box. The test included sequences ranging from two to seven digits, with two trials per sequence length, resulting in a total of 12 trials. The task continued until participants failed both trials for a given sequence length. Additionally, a two-digit practice trial was provided to familiarize participants before the actual digit span test.
4.3.2. Covered-box picture selection
The rationale and details of this task were specified in Section 4.2. As a note, prior to the main experiment, participants completed five practice items to familiarize themselves with the task. Unlike the experimental phase, the practice trials included feedback, allowing participants to learn that either the visible or the covered image could be the correct choice. Following the practice session, participants proceeded to complete the experimental items without feedback.
4.3.3. LexTALE
Participants’ overall English proficiency was assessed using the LexTALE test, an untimed lexical decision task (Lemhöfer & Broersma, 2012). In this task, participants were presented with letter strings and had to determine whether each string was a real English word by selecting Yes or No. They were instructed to select Yes if they believed the string was a valid English word, even if they did not know its meaning. The task comprised 63 trials, including 42 real words, and took approximately 10 minutes to complete. The final score, ranging from 0 to 100, was calculated based on 60 trials (40 words and 20 nonwords, after removing three dummy items) using the formula provided by Lemhöfer and Broersma (2012): ((number of words correct / 40 × 100) + (number of nonwords correct / 20 × 100)) / 2.
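The scoring formula above amounts to averaging the percentage of correct responses on words and on nonwords. A minimal sketch follows; the function and parameter names are illustrative assumptions, while the item counts (40 words, 20 nonwords) come from the formula itself.

```python
def lextale_score(words_correct, nonwords_correct, n_words=40, n_nonwords=20):
    """Average of percent-correct on words and percent-correct on nonwords."""
    if not 0 <= words_correct <= n_words:
        raise ValueError("words_correct out of range")
    if not 0 <= nonwords_correct <= n_nonwords:
        raise ValueError("nonwords_correct out of range")
    pct_words = words_correct / n_words * 100
    pct_nonwords = nonwords_correct / n_nonwords * 100
    return (pct_words + pct_nonwords) / 2


# Hypothetical participant: 30 of 40 words and 10 of 20 nonwords correct.
example_score = lextale_score(30, 10)
```

Because the two percentages are averaged, a participant who answers Yes to every item scores 50 rather than the 67 that raw accuracy would give, so a blanket Yes strategy is not rewarded.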
4.3.4. Simon task
The Simon task is widely used to assess IC, a core aspect of cognitive control (Bialystok et al., 2008; Simon & Rudell, 1967). In this task, blue or brown squares appeared randomly on the left, middle or right of the screen. Participants pressed ‘Q’ with their left hand for blue squares and ‘P’ with their right hand for brown squares. Each trial began with an 800-ms fixation cross, followed by a square that remained until a response. After a six-trial practice session with feedback, participants completed 42 trials without feedback, with accuracy and response times recorded. The task included congruent trials, where the square appeared on the same side as the required response, and incongruent trials, where it appeared on the opposite side. The need to override location-based instincts slows responses in incongruent trials, a delay known as the Simon effect, measured as the performance difference between trial types.
5. Data preparation and analysis
Participants were screened based on their performance on 24 dummy filler items out of a total of 96 fillers, specifically designed to assess task comprehension and engagement. These dummy fillers were unambiguous in meaning (e.g., ‘The boy is taller than the girl’), with clear expected responses. Participants scoring below 80% on these items were excluded from further analysis, resulting in the removal of data from 8 L1 participants and 8 L2 participants. This exclusion left a final sample of 32 L1 and 62 L2 participants for subsequent statistical analyses. The mean accuracy of dummy fillers for the retained participants was 88.5% (SD = 0.3%) for the L1 group and 87.2% (SD = 0.2%) for the L2 group.
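The screening rule can be expressed compactly. The sketch below is illustrative only: the participant identifiers and the dictionary layout are hypothetical, and only the 24-item count and the 80% threshold come from the description above.

```python
def retained_participants(correct_counts, n_items=24, threshold=0.80):
    """Keep participants whose dummy-filler accuracy is at least the threshold."""
    return [
        pid
        for pid, n_correct in correct_counts.items()
        if n_correct / n_items >= threshold
    ]


# Hypothetical data: number of dummy fillers answered as expected.
kept = retained_participants({"p01": 24, "p02": 19, "p03": 20})
```

Here the hypothetical participant with 19 of 24 correct (about 79%) falls below the threshold and is dropped, whereas 20 of 24 (about 83%) is retained.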
DQ and NQ structures were analyzed separately. From the CBP task, participants’ responses were coded based on whether they selected the visible picture or the covered picture. Accordingly, the dependent variable in the analysis is the choice between the visible and the covered picture, summarized descriptively as selection percentages. Regarding the analysis of individual difference factors, all three factors were included as continuous variables in the statistical models. The calculation of LexTALE scores has been detailed in Section 4.3.3. For WM scores and cognitive control capacities, we briefly outline the calculation procedures below.
WM scores were derived from the backward digit span task, consisting of 12 trials. Participants were awarded 1 point per correctly recalled trial, with a maximum possible score of 12. To ensure data quality, responses exceeding 2.5 SDs from each participant’s mean response time were excluded. This data trimming affected less than 1.7% of responses in the L1 group and 1.6% in the L2 group.
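A minimal sketch of the scoring rule follows, assuming trials are stored as pairs of strings (digits shown, digits typed) ordered by sequence length with two trials per length. That data layout and the function name are assumptions; the one-point-per-trial rule and the stop-after-two-failures rule come from the task description.

```python
def score_backward_span(trials):
    """Return the number of correctly reversed sequences.

    trials: list of (shown, typed) string pairs, two per sequence length,
    in ascending length order. Scoring stops once both trials of a given
    length are failed, mirroring the discontinuation rule of the task.
    """
    score = 0
    for start in range(0, len(trials), 2):
        pair = trials[start:start + 2]
        correct = [typed == shown[::-1] for shown, typed in pair]
        score += sum(correct)
        if not any(correct):
            break  # both trials at this length failed
    return score


# Hypothetical participant who fails both five-digit sequences.
example_trials = [
    ("25", "52"), ("71", "17"),
    ("253", "352"), ("918", "819"),
    ("4826", "6284"), ("3917", "7193"),
    ("12345", "54312"), ("67890", "09867"),
    ("123456", "654321"), ("234567", "765432"),
]
example_wm = score_backward_span(example_trials)
```

In this hypothetical case the six-digit trials are never counted, because the task would already have ended after the two failed five-digit trials.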
In this paper, IC is understood and defined through the Simon effect, as measured by the Simon task. Consistent with previous research, the Simon effect is calculated as the difference in mean reaction times between incongruent and congruent trials (Bialystok et al., 2008; Nguyen et al., 2024). A smaller Simon effect (i.e., a smaller difference between the two types of trials) indicates greater IC capacity, as it reflects more efficient conflict resolution and the ability to suppress irrelevant stimuli. Descriptive statistics for the three individual difference factors are presented in Table 1. Note that these statistics are based on the participants retained after data trimming.
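The computation can be sketched as follows. This is an illustration under assumptions rather than the authors' script: trials are represented as hypothetical (trial type, reaction time in ms) pairs, and the 2.5 SD trimming step is assumed here to apply to each participant's Simon task reaction times, which the text does not state explicitly.

```python
from statistics import mean, stdev


def trim_rts(rts, n_sd=2.5):
    """Drop reaction times more than n_sd sample SDs from the participant's mean."""
    m, s = mean(rts), stdev(rts)
    return [rt for rt in rts if abs(rt - m) <= n_sd * s]


def simon_effect(trials):
    """Mean incongruent RT minus mean congruent RT (smaller = stronger IC)."""
    congruent = [rt for kind, rt in trials if kind == "congruent"]
    incongruent = [rt for kind, rt in trials if kind == "incongruent"]
    return mean(incongruent) - mean(congruent)


# Hypothetical reaction times in milliseconds.
example_trials = [
    ("congruent", 400), ("congruent", 420),
    ("incongruent", 450), ("incongruent", 470),
]
example_effect = simon_effect(example_trials)
```

Trials in the middle position, which are neither congruent nor incongruent, are simply ignored by this sketch.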
Table 1. Descriptive results for L2 proficiency, WM and Simon effect scores

Statistical analyses were conducted in R (version 4.4.1) (R Core Team, 2021) using logistic mixed-effects models (lme4, version 1.1.35.5) (Jaeger, 2008). Categorical predictors were sum-coded (−0.5, 0.5), and continuous variables were centered and standardized. Post hoc comparisons were performed with emmeans (version 1.10.2) (Lenth, 2020), applying Tukey’s adjustments. Model fit plots were generated using sjPlot (version 2.8.16) (Lüdecke, 2021). Initial models included the maximal random effects structure allowed by the design – random intercepts for participants and items, by-participant slopes for within-subject factors (e.g., scope interpretation) and interactions and by-item slopes for between-subject factors (e.g., L2 proficiency, WM and IC). In cases where models failed to converge, the random effects structure was simplified iteratively by removing correlations between random effects and eliminating random effects contributing the least variance until convergence was achieved. To assess the model’s predictive performance, marginal (explanatory power of the fixed effects only) and conditional (explanatory power of both the fixed and random effects) R² values were computed using the MuMIn package (version 1.47.5) (Nakagawa & Schielzeth, 2013).
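The predictor coding described above can be illustrated outside R. The sketch below is written in Python purely for illustration (the analyses themselves were run in R); the function names and the choice of which level receives −0.5 are assumptions. Standardization uses the sample standard deviation, which is also what R's scale() uses.

```python
from statistics import mean, stdev


def sum_code(levels, low, high):
    """Map a two-level factor to -0.5 / +0.5 (level assignment is an assumption)."""
    mapping = {low: -0.5, high: 0.5}
    return [mapping[level] for level in levels]


def standardize(values):
    """Center on the mean and divide by the sample standard deviation."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


# Hypothetical values for three trials or participants.
condition = sum_code(["SS", "IS", "SS"], low="SS", high="IS")
proficiency_z = standardize([50.0, 60.0, 70.0])
```

With ±0.5 coding, a coefficient for the factor corresponds to the full difference between the two levels, and the intercept corresponds to the grand mean across them on the log-odds scale.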
When analyzing the influence of individual difference factors, we did not include a three-way interaction among L2 proficiency, WM and IC in our models. Instead, separate models were created to examine the interactive effects of L2 proficiency with WM and of L2 proficiency with IC. This decision was based on a multicollinearity check using the vif() function from the car package (version 3.1.2; Fox & Weisberg, 2011). The logistic mixed-effects regression model that included the three-way interaction returned variance inflation factors exceeding 10 for the interaction between WM and IC, indicating potential multicollinearity issues (Montgomery et al., 2021; see also Linck et al., 2008). Including this interaction term would therefore compromise the reliability of the model. All data and R code used for this study are available at the Open Science Framework (OSF): https://osf.io/t7xh4/?view_only=1ef3f3f5ad01418e8ec329e48e1227cb
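To give a sense of what a variance inflation factor above 10 implies, the sketch below computes the VIF for the simplest case of two predictors, where it reduces to 1 / (1 − r²) with r the Pearson correlation between them. This is a simplified illustration with hypothetical data, not a reimplementation of car's vif(), which handles models with many terms.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def vif_two_predictors(x, y):
    """VIF when a model has exactly two predictors: 1 / (1 - r^2)."""
    r = pearson_r(x, y)
    return 1 / (1 - r ** 2)


# Hypothetical predictor values with r = 0.8.
example_vif = vif_two_predictors([1, 2, 3, 4, 5], [2, 1, 4, 3, 5])
```

In this two-predictor case, a VIF above 10 would require r² above 0.9, that is, one predictor being almost fully predictable from the other.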
6. Results
Figures 3 and 4 show the mean percentages of picture selections for each group across conditions involving DQ and NQ sentences. For DQ sentences (Figure 3), the SS condition reveals a similar pattern in L1 and L2 groups: Both favored the visible picture when it aligned with the SS, with L2 participants showing an even stronger preference. However, in the IS condition, selection patterns diverged – L2 participants were more likely than L1 participants to choose the covered picture when the visible one depicted the IS interpretation, suggesting lower confidence in committing to an IS interpretation. For NQ sentences (Figure 4), the results mirrored those of DQ sentences. In the SS condition, both groups preferred the visible picture. In the IS condition, however, L1 participants tended to select the visible picture, while L2 learners more often chose the covered one. This again suggests that L2 learners were less confident in adopting IS interpretations and, when such interpretations were visible, favored an alternative they perceived as more plausible.

Figure 3. Mean percentage of picture selection for DQ sentences in L1 and L2 groups.

Figure 4. Mean percentage of picture selection for NQ sentences in L1 and L2 groups.
Statistical analyses using a series of logistic mixed-effects models were conducted for each condition within each group, with the picture selection as the dependent variable. The fixed effect represents the overall intercept, indicating the baseline log odds of selecting the visible picture within each condition-by-group subset of the data. The primary goal was to confirm whether participants’ selection of one picture corresponding to a given interpretation differed significantly from their selection of the alternative picture. Table 2 presents the modeling results regarding the comparison between visible and covered picture selections. Both L1 and L2 participants demonstrated distinct patterns in selecting visible versus covered pictures, with divergent tendencies between language groups, particularly for IS readings. These differences statistically confirmed the descriptive results. However, two exceptions were noted for the L1 group: the choice of visible pictures did not differ significantly from that of covered pictures in the SS condition of DQ structures or in the IS condition of NQ structures.
Table 2. Comparison between visible and covered picture selections

Note: *p < .05, **p < .01, ***p < .001.
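As a reading aid for the intercept-only models, the sketch below converts between a selection proportion and log odds. It is a simplification: the intercept of a mixed-effects model is conditional on the random effects and need not equal the logit of the raw proportion. The function names are illustrative, and the example proportion is hypothetical rather than taken from Table 2.

```python
import math


def log_odds(p):
    """Log odds (logit) of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def probability(logit):
    """Inverse logit: map log odds back to a probability."""
    return 1.0 / (1.0 + math.exp(-logit))


# An intercept of 0 corresponds to choosing the visible and covered
# pictures equally often; positive values favor the visible picture.
print(probability(0.0))

# Hypothetical selection rate of 75% for the visible picture.
print(round(log_odds(0.75), 3))
```

Under this reading, testing whether an intercept differs from zero amounts to testing whether the visible picture was chosen more or less often than the covered one.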
Given that picture selection patterns differed between groups, particularly for IS conditions (as shown in Figures 3 and 4), we constructed a logistic mixed-effects model for each quantifier scope structure, with condition (SS vs IS), language and their interaction as fixed effects, to statistically verify these differences. The modeling results are presented in Table 3, with significant interactions observed for each structure. Given their theoretical importance, these interactions were further unpacked. Post hoc pairwise comparisons revealed that for DQ structures, the effect of language group was significant only in the IS condition (β = 2.28, standard error [SE] = 0.204, p < .001), whereas no significant effect was found in the SS condition (β = −0.231, SE = 0.182, p = .204). Specifically, IS was more acceptable than SS for L1 speakers (β = 1.26, SE = 0.18, p < .001), whereas the opposite pattern was observed for L2 speakers (β = −1.26, SE = 0.116, p < .001).
Table 3. Model outputs for the logistic mixed-effects models of picture selection

Note: *p < .05, **p < .01, ***p < .001.
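The p-values reported alongside the estimates in this section are consistent with two-sided Wald tests, in which z = β / SE is referred to the standard normal distribution. The sketch below assumes that this is how they were obtained (the text does not state the test or any multiplicity adjustment) and recomputes two reported values from their estimates and standard errors. The function name is illustrative.

```python
import math


def wald_p_value(beta, se):
    """Two-sided p-value for z = beta / se under a standard normal."""
    z = beta / se
    return math.erfc(abs(z) / math.sqrt(2.0))


# Estimates and standard errors as reported in the text.
print(round(wald_p_value(-0.231, 0.182), 3))  # reported as p = .204
print(round(wald_p_value(1.206, 0.609), 3))   # reported as p = .048
```

A check of this kind only verifies internal consistency between β, SE and p; it says nothing about whether the underlying model is well specified.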
Similarly, for NQ sentences, the significant interaction between condition and language indicated that the effect of language group was again significant only in the IS condition (β = 1.414, SE = 0.237, p < .001) and not in the SS condition (β = 0.553, SE = 0.273, p = .043). Both L1 (β = −2.12, SE = 0.206, p < .001) and L2 (β = −2.98, SE = 0.143, p < .001) speakers showed a preference for the SS reading over the IS reading. Based on the modeling results, these interactions were visualized in Figure 5 for DQ sentences and Figure 6 for NQ sentences. The visualizations indicate that SS readings posed no major challenges for either group. However, L2 learners demonstrated less confidence and comfort in selecting visible pictures corresponding to IS readings compared to native speakers.

Figure 5. Mean percentage of visible picture selection for DQ sentences by group.

Figure 6. Mean percentage of visible picture selection for NQ sentences by group.
Next, we turn to the experimental results concerning the influence of individual difference factors. Since the inquiry focuses on the influence of individual difference factors within the learner population, separate analyses were conducted for the L2 group and the L1 group, with the latter serving as the baseline. We first present the results for the L1 group, followed by those for the L2 group.
For the L1 group, two separate models were constructed to analyze DQ and NQ sentences. The first model included condition (SS versus IS), WM and their interaction as fixed effects. The second model included condition, IC and their interaction as fixed effects. Similarly, for the L2 group, two separate models were constructed for both DQ and NQ sentences. One model included condition, WM, L2 proficiency and their interaction as fixed effects, while the other included condition, IC, L2 proficiency and their interaction as fixed effects. Given the theoretical importance of the interaction between individual difference factors and condition, any significant interactions were further unpacked for elaboration.
Results for the L1 group showed that for DQ sentences, condition emerged as a significant main effect (β = −2.628, SE = 0.693, p < .001), with L1 participants more likely to select visible pictures for IS interpretations than SS interpretations. Neither WM nor its interaction with condition was significant (ps > .05). Similarly, in the IC model, only condition was significant (β = −2.692, SE = 0.705, p < .001), reflecting the same trend, while IC and its interaction with condition were not (ps > .05).
For NQ sentences, both WM and IC models showed significant main effects of condition (WM: β = 2.023, SE = 0.611, p < .001; IC: β = 2.334, SE = 0.225, p < .001), with L1 participants favoring visible pictures for SS over IS interpretations. Condition also interacted significantly with WM (β = 1.206, SE = 0.609, p = .048) and IC (β = 0.445, SE = 0.187, p = .017). The WM model explained 64% of the variance in picture selection (fixed factors: 16%), while the IC model explained 51% (fixed factors: 23%). Figures 7 and 8 illustrate these interactions based on regression models, with the y-axis representing the percentage of visible picture selections. As WM and IC capacities increased, L1 participants were less likely to choose visible pictures for IS interpretations.

Figure 7. Interaction between condition and working memory in the L1 group for NQ sentences (higher WM scores indicating better WM).

Figure 8. Interaction between condition and inhibitory control in the L1 group for NQ sentences (smaller SimonRT scores indicating better IC).
For the L2 group, separate models were built for DQ and NQ sentences, each analyzing the effects of WM and IC. One model included condition, L2 proficiency and WM, while the other examined condition, L2 proficiency and IC. For DQ sentences, the WM model showed a significant main effect of condition (β = 1.510, SE = 0.318, p < .001), indicating that L2 participants preferred visible pictures representing SS interpretations over IS interpretations. A marginal interaction between L2 proficiency and WM (β = 0.226, SE = 0.129, p = .08) suggested that higher WM capacity aided selection in high-proficiency L2 participants, while lower WM capacity hindered selection in low-proficiency participants. The IC model revealed a marginal three-way interaction among condition, L2 proficiency and IC (β = 4.652, SE = 2.517, p = .065), as shown in Figure 9. This trend suggests that greater IC increased acceptance of IS interpretations (i.e., selecting the visible picture), but only among L2 learners with moderate (nor = 0) or high proficiency (nor = 1). In contrast, for low-proficiency learners (nor = −1), IS acceptance decreased as IC capacity improved. Overall, the full model explained 68% of the variance, with fixed factors accounting for 49%.

Figure 9. Three-way interaction of condition, L2 proficiency and inhibitory control in the L2 group for DQ sentences (“nor” indicating L2 proficiency).
For NQ sentences in the L2 group, both WM and IC models showed significant main effects of condition (WM: β = 2.942, SE = 0.150, p < .001; IC: β = 5.776, SE = 0.790, p < .001), indicating that L2 participants favored visible pictures representing SS interpretations over IS interpretations. Condition significantly interacted with WM (β = −0.630, SE = 0.182, p < .001), showing that higher WM capacity increased acceptance of IS readings (Figure 10). A marginal three-way interaction among condition, L2 proficiency and WM (β = −0.397, SE = 0.211, p = .059) suggested that low-proficiency learners were predominantly biased toward SS interpretations, regardless of WM capacity. In contrast, high-proficiency learners with greater WM capacity showed increased access to IS interpretations, as reflected in their higher selection of visible pictures (Figure 11).

Figure 10. Interaction of condition and working memory in the L2 group for NQ sentences.

Figure 11. Three-way interaction of condition, L2 proficiency and working memory in the L2 group for NQ sentences.
For NQ sentences in the L2 group, the model with condition, L2 proficiency and IC showed several significant effects: a main effect of IC (β = −2.323, SE = 1.145, p = .042), a condition × IC interaction (β = 6.977, SE = 2.387, p = .003) and an L2 proficiency × IC interaction (β = −3.477, SE = 1.518, p = .022). The main effect of IC indicates that L2 learners with higher IC capacities were more likely to select visible pictures for IS interpretations. The condition × IC interaction suggests that higher IC capacities increased IS selections while decreasing SS selections (Figure 12). Additionally, the L2 proficiency × IC interaction indicates that IC’s influence on picture selection varied with proficiency. However, no significant three-way interaction was found (β = 4.773, SE = 3.443, p = .166).

Figure 12. Two-way interaction of condition and inhibitory control in the L2 group for NQ sentences.
7. General discussion
This study investigated how Chinese-speaking learners of English acquire quantifier scope, particularly IS, which varies cross-linguistically and poses processing challenges. It also explored how individual differences, including L2 proficiency, WM and IC, modulate scope interpretations. Results from a picture selection experiment using the CBP revealed that L2 learners faced significant difficulties in accessing IS interpretations of both DQ and NQ structures compared to native speakers. However, L2 learners’ success in interpreting IS is not entirely constrained; rather, it is influenced by individual differences. Specifically, learners with higher L2 proficiency, better WM and stronger IC tend to exhibit greater success in accessing this challenging interpretation. Furthermore, these individual differences potentially interact with L1 transfer and L2 input in shaping learners’ interpretation patterns.
7.1. Inverse scope interpretation in L2 learners versus L1 speakers
For English native speakers, a clear distinction between SS and IS readings was observed in both DQ and NQ structures, though the direction of this difference varied by structure. Notably, in the DQ structure, IS was more likely than SS, contradicting Anderson’s (Reference Anderson2004) PSE principle, which predicts that IS should be less accessible due to its higher processing cost from covert movement at the LF level. This finding also contrasts with prior studies (Chu et al., Reference Chu, Gabriele and Minai2014; Fang, Reference Fang2023; Scontras et al., Reference Scontras, Polinsky, Tsai and Mai2017; Wu & Ionin, Reference Wu and Ionin2022), which consistently reported that IS readings were less likely than SS readings. Putting aside the issue of interpretation preference, English speakers can access IS, as evidenced by Fang (Reference Fang2023), where IS received a 4.46/7 rating in a story-based truth-value judgment task. Given this, it is not entirely surprising that IS was also found to be highly acceptable in this study.
Another possible explanation for our findings is that a significant proportion of English speakers in our sample may have exhibited a strong preference for IS, while a smaller subset favored SS. This pattern aligns with the two-grammar hypothesis proposed by Han et al. (Reference Han, Musolino and Lidz2016), which suggests that scope interpretation varies due to the absence of clear linguistic input confirming or rejecting specific quantifier scope interpretations. Consequently, an interpretation accepted by some speakers may be rejected by others, mirroring observations in negation interpretation in Korean (Han et al., Reference Han, Musolino and Lidz2016) and scope interpretation in Chinese relative clauses (Fang et al., Reference Fang, Wu and Zhao2024). Thus, it is plausible that most L1 participants in our sample strongly preferred IS over SS. Nevertheless, our findings extend previous research by offering new empirical evidence on relative preferences for SS versus IS in DQ, using a task paradigm different from those in prior studies. While the theoretical explanation remains tentative, the L1 data serve as a critical baseline for interpreting the L2 results in this study.
Regarding the NQ structure, SS was more likely than IS, aligning with the PSE principle and replicating previous findings that favored SS in offline judgment tasks (Chung, Reference Chung2009; Wu et al., Reference Wu, Ionin, Brown and Dailey2019) and online reading time tasks (Lee, Reference Lee2010). However, IS for NQ is not entirely impossible. Notably, English native speakers selected the visible picture corresponding to IS 54% of the time, as shown in Figure 4. Moreover, previous research has also provided evidence that NQ sentences were globally ambiguous, permitting both readings (Musolino & Lidz, Reference Musolino and Lidz2006). However, the mixed evidence for NQ sentence interpretation should be approached with caution, as different studies have employed varying methodologies.
The results indicate that L2 learners accepted IS to a lesser extent than L1 speakers, whereas SS showed no group differences in acceptance for either structure. The L2 participants’ non-target-like performance in IS aligns with previous findings on the challenges Chinese learners of English face in acquiring and processing IS, even at advanced proficiency levels (Wu et al., Reference Wu, Ionin, Brown and Dailey2019, for NQ; Chu et al., Reference Chu, Gabriele and Minai2014, for DQ; Wu & Ionin, Reference Wu and Ionin2022, for DQ). Our study extends prior research by incorporating a larger sample size and investigating two structures within a single study. Moreover, we tested L2 learners in instructional contexts using a distinct methodological approach.
The persistent difficulty in acquiring IS may stem from three factors: negative L1 transfer, limited L2 input and processing constraints. First, negative L1 transfer likely played a role, as Chinese, which allows only SS, influenced L2 learners’ scope interpretations. Empirical evidence supports the absence of IS in Chinese, as demonstrated in studies on Chinese speakers’ quantifier scope interpretation (e.g., Scontras et al., Reference Scontras, Polinsky, Tsai and Mai2017, for DQ; Zhou & Crain, Reference Zhou and Crain2009, for NQ). However, the L1 transfer explanation remains tentative without evidence from additional L2 learner groups whose native language permits IS, as English does. If L1 transfer is indeed at play, learners whose L1 allows IS should exhibit greater acceptance of English IS than L1-Chinese learners. Nevertheless, evidence for L1 transfer in scope interpretation has been documented in other L2 studies, such as Marsden (Reference Marsden2009), which examined L1-Korean and L1-English learners of Japanese. We leave a fuller investigation of L1 transfer to future work. Second, positive evidence from L2 input plays a crucial role in facilitating scope interpretations. Figures 4 and 5 show that native speakers selected visible pictures representing IS readings more than 50% of the time for both DQ and NQ structures, suggesting that positive evidence for IS is available in the L2 input for learners to draw on. Further confirming the role of L2 input, the acceptance rate of IS was higher for DQ (38%) than for NQ (25%) among L2 learners, mirroring its greater likelihood in L1 (83% for DQ versus 54% for NQ). This parallel pattern highlights the influence of L2 input – particularly its quantity – on L2 learners’ scope interpretations. L2 learners may still struggle to reach an acceptance level of IS comparable to that of L1 speakers because L2 input is not sufficiently frequent to fully override negative L1 transfer.
Third, processing constraints associated with the PSE principle may further hinder L2 learners’ ability to process and acquire IS, despite some evidence suggesting the availability of such readings. Taken together, L1 transfer, L2 input and processing constraints make it challenging for Chinese-speaking L2 learners to acquire IS readings in English.
While IS is globally challenging for L2 learners, they may improve their ability to access it with increased L2 proficiency. Additionally, individual differences in cognitive capacities could modulate the extent to which learners access IS.
7.2. Effects of L2 proficiency, working memory and inhibitory control
Before discussing the findings on the effects of WM and IC in the L1 and L2 groups, an important methodological note is warranted. Although it is statistically feasible to include language group as a categorical predictor alongside continuous variables such as WM and IC, we opted not to do so in our primary analyses. This decision aligns with recent calls for a multilingual turn in L2 research (Ortega, Reference Ortega2013) and a broader shift toward analyzing language processing in diverse populations on their own terms. Rather than relying on monolingual comparative norms, this approach supports a more equitable applied psycholinguistics (e.g., Kutlu & Hayes-Harb, Reference Kutlu and Hayes-Harb2023; McMurray et al., Reference McMurray, Baxelbaum, Colby and Tomblin2023; Rothman et al., Reference Rothman, Bayram, DeLuca, Di Pisa, Duñabeitia, Gharibi and Wulff2023). Accordingly, we analyzed variation separately within the L1 and L2 populations, consistent with prior L2 psycholinguistic research that often treats native and non-native speakers as distinct groups when investigating individual differences (e.g., Aldosari et al., Reference Aldosari, Covey and Gabriele2024; Coughlin & Tremblay, Reference Coughlin and Tremblay2013). That said, to complement this group-specific approach, we also conducted an additional analysis that included language group (L1 versus L2) as a fixed effect in models examining the interaction with WM or IC. This analysis revealed a significant three-way interaction among WM, condition and language group for the NQ structure, indicating that WM modulated SS and IS interpretations differently for L1 and L2 speakers. These results converge with our original group-specific analyses, which likewise demonstrated distinct WM effects across groups.
We first discuss the results from the L1 group and then turn to the L2 group, focusing on how WM and IC influence IS interpretations. These readings pose particular challenges and should be facilitated by the cognitive resources available to individuals. Our results showed that WM and IC affected only NQ sentences, not DQ sentences, in native speakers of English. We argue that while IS is typically cognitively demanding, its high acceptance in DQ sentences suggests that it did not impose a significant load on WM. Regarding IC, because IS was readily available and thus highly activated, there was no need for IC to resolve the competition with SS. This aligns with the proposal that a competing representation must reach a certain activation threshold before conflict arises, thereby necessitating cognitive control (Langlois et al., Reference Langlois, Ness, Kim and Novick2024).
However, WM and IC seem to influence access to IS in NQ sentences in an unexpected way: English native speakers with greater WM and IC were less likely to accept IS, as illustrated in Figures 7 and 8. We argue that the variability in scope interpretation of NQ sentences among native English speakers may be due to individual variation in their relative preference for SS versus IS. As shown in Figures 3 and 4, acceptance of IS in NQ sentences was more variable than in DQ sentences. This pattern aligns with previous findings reporting mixed preferences among native speakers for NQ sentences: Some studies have found a preference for IS (Attali et al., Reference Attali, Scontras and Pearl2021; Musolino et al., Reference Musolino, Crain and Thornton2000; Musolino & Lidz, Reference Musolino and Lidz2006), others for SS (Chung, Reference Chung2009; Conroy et al., Reference Conroy, Fults, Musolino and Lidz2008; Fang, Reference Fang2023; Lee, Reference Lee2010; Wu et al., Reference Wu, Ionin, Brown and Dailey2019), and some have reported no consistent preference (Lee, Reference Lee2009). To capture this individual variability, we further conducted an analysis at the individual level by grouping native English speakers into three categories based on their responses to trials where the visible picture depicted the IS reading. We used a 75% cutoff (9 out of 12 items; as in Slabakova, Reference Slabakova2010, on scalar implicature): Participants who selected the visible picture for IS on 75% or more of the trials were classified as having an ‘Inverse bias’; those who selected such pictures 25% of the time or less were classified as having a ‘Surface bias’; and those in between were labeled ‘Ambiguous’, indicating no clear preference. This analysis confirmed our hypothesis that individuals vary in their interpretation preferences.
The three groups were distributed fairly evenly, with 31% of native speakers categorized as ‘Ambiguous’, 31% as ‘Surface bias’ and 38% as ‘Inverse bias’. Figures 13 and 14 illustrate this individual variation in scope preferences in relation to WM and IC scores, respectively.
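The classification rule described above can be sketched as follows. This is a minimal Python illustration rather than the analysis code used in the study: the function name is hypothetical, the 12-trial default follows the “9 out of 12 items” cutoff mentioned in the text, and the example counts are invented.

```python
def classify_scope_bias(n_is_selected, n_trials=12):
    """Classify a participant by how often they chose the visible IS picture."""
    if not 0 <= n_is_selected <= n_trials:
        raise ValueError("n_is_selected must be between 0 and n_trials")
    # Integer arithmetic avoids floating-point issues at the cutoffs.
    if 4 * n_is_selected >= 3 * n_trials:  # 75% of trials or more
        return "Inverse bias"
    if 4 * n_is_selected <= n_trials:      # 25% of trials or less
        return "Surface bias"
    return "Ambiguous"


# Hypothetical counts of visible-picture choices on the IS trials.
for count in (10, 6, 2):
    print(count, classify_scope_bias(count))
```

With 12 trials, the cutoffs fall at 9 or more selections for ‘Inverse bias’ and 3 or fewer for ‘Surface bias’, so counts from 4 to 8 are labeled ‘Ambiguous’.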

Figure 13. Individual variation in scope preference with NQ sentences sorted by WM in L1 speakers.

Figure 14. Individual variation in scope preference with NQ sentences sorted by IC in L1 speakers.
Such a finding aligns with the growing body of research supporting the multiple grammar hypothesis (Polinsky, Reference Polinsky2025), which suggests that even monolingual native speakers may not share a uniform mental grammar, allowing for individual variation in grammatical representations (Dąbrowska, Reference Dąbrowska2012; Fang et al., Reference Fang, Wu and Zhao2024; Han et al., Reference Han, Musolino and Lidz2016; Polinsky, Reference Polinsky2025). In the case of NQ sentences, as demonstrated in our study, we argue that the observed variation in interpretation stems from a tension between a syntax-driven preference for SS and pragmatic reasoning. According to Anderson’s (Reference Anderson2004) PSE principle, such sentences would be expected to favor SS. However, as argued by Musolino and Lidz (Reference Musolino and Lidz2006), the SS reading of NQ sentences like ‘Every horse didn’t jump over the fence’ can be unambiguously conveyed using an alternative expression such as ‘None of the horses jumped over the fence’. This pragmatic availability of a clearer expression may block the SS interpretation, leading speakers instead to favor the IS reading. As such, each of the three groups of native English speakers may exhibit varying sensitivity to different interpretations, shaped by distinct linguistic constraints due to their individual language experiences.
Such variability in scope interpretation may have complicated the effects of WM and IC on NQ sentence processing among native speakers, in ways that are obscured in the group-level results in Figures 7 and 8, which indicate a negative relationship between cognitive capacity and the acceptance of IS interpretations. Figures 13 and 14 further complement these group findings by highlighting individual differences. Specifically, these figures show that participants with different scope preferences span a wide range of WM and IC capacities. Notably, those with greater cognitive capacities appear more likely to accept SS readings or tolerate scope ambiguity, a tendency particularly evident in WM. This pattern is consistent with the group-level results and may reflect the cognitive effort required to override SS interpretations – typically dispreferred for pragmatic reasons among native speakers with an ‘Inverse bias’. However, as one reviewer correctly noted, our findings regarding the impact of individual difference factors in the L1 group should be interpreted with caution due to the relatively smaller sample size and the limited range of WM and IC scores, compared to the L2 group. Further research is needed with a larger sample and a broader range of cognitive capacity scores to more accurately capture individual variation associated with WM and IC.
For L2 participants, individual difference factors likely affect the interpretation of complex linguistic representations that are computationally demanding and require substantial cognitive resources. This is especially true for IS, which is more complex than SS for Chinese-speaking learners of English. Indeed, as this study found, L2 learners’ acceptance of IS varied as a function of individual differences in WM, IC capacities and L2 proficiency, though not always consistently.
WM significantly predicted L2 learners’ access to IS in the NQ structure, with higher WM linked to greater IS acceptance. This finding can be accommodated by three theoretical accounts of WM in language comprehension. First, Just and Carpenter’s (Reference Just and Carpenter1992) model posits that comprehension relies on a single limited-capacity WM pool, with complex and infrequent structures like IS demanding more resources. Learners with higher WM thus had more capacity to activate and access IS. Second, Caplan and Waters (Reference Caplan and Waters1999) distinguish between online processing and offline interpretation, supported by separate WM pools. Since the picture selection task involved offline comprehension, the WM effect was more likely to surface, consistent with prior L2 research (e.g., Cotter & Ferreira, Reference Cotter and Ferreira2024; Hopp, Reference Hopp2014; Hopp et al., Reference Hopp, Schimke, Gastmann, Öwerdieck and Poarch2024). Third, MacDonald and Christiansen (Reference MacDonald and Christiansen2002) argue that WM effects reflect differences in language experience rather than resource limitations (Hintz et al., Reference Hintz, Voeten, McQueen, Meyer, Culbertson, Perfors, Rabagliati and Ramenzoni2022). Given L2 learners’ limited exposure to IS, variability in their comprehension may stem from individual differences in WM as a function of language experience with IS in the L2 input. This is further suggested by a marginal three-way interaction among condition, L2 proficiency and WM. Specifically, the strong effect of WM on access to IS was primarily observed in higher-proficiency learners. This finding is consistent with the mediating role of L2 proficiency in WM functioning (Linck et al., Reference Linck, Osthus, Koeth and Bunting2014) and aligns with prior research on the positive correlation between WM capacity and L2 proficiency (Juffs & Harrington, Reference Juffs and Harrington2011; Linck & Weiss, Reference Linck and Weiss2015).
Additionally, it reinforces MacDonald and Christiansen’s (Reference MacDonald and Christiansen2002) argument that individual differences in language comprehension, as modulated by WM, are mediated by language experience, which is closely linked to L2 proficiency. Thus, this study underscores the importance of considering L2 proficiency when evaluating the impact of WM.
Although IC has received considerably less attention than WM in the field, it was also found to influence the interpretations of DQ and NQ, with better control capacity predicting more IS interpretations. It also affected interpretations of DQ in combination with L2 proficiency, with better IC predicting more access to IS among intermediate and advanced L2 learners. This three-way interaction is only marginally significant, and the finding appears counterintuitive at first glance, as higher proficiency is generally expected to reduce reliance on IC for language comprehension. However, our result is in line with a recent study by Rao et al. (Reference Rao, Li, Lin and Liang2024), which examined garden-path sentence processing in L2 learners. Their study found that low-proficiency L2 learners were more likely to exhibit lingering misinterpretations, particularly when they had higher IC, compared to those with lower IC. These patterns suggest that the role of IC may be influenced by linguistic knowledge, such that IC functions as expected in learners with higher proficiency, which is associated with greater linguistic knowledge (Ness et al., Reference Ness, Langlois, Kim and Novick2023). The greater IC effect with higher L2 proficiency may also be attributed to the link between bilingualism and enhanced cognitive control (Bialystok et al., Reference Bialystok, Craik and Luk2008), with higher language proficiency further strengthening this advantage (Mishra, Reference Mishra2015). As a result, our higher-proficiency L2 learners, who tend to possess greater linguistic knowledge about IS in English, are more likely to benefit from IC in accessing IS.
As a whole, IC played an important role in modulating scope interpretation by L2 learners for both DQ and NQ, whereas WM influenced only NQ. This suggests that the extent to which IS was available differed by structure and was supported by different types of cognitive resources in distinct ways. Moreover, it indicates that WM and IC played distinct roles in sentence interpretation among L2 learners and should be examined separately (e.g., Hopp et al., Reference Hopp, Schimke, Gastmann, Öwerdieck and Poarch2024; Linck & Weiss, Reference Linck and Weiss2015). Proficiency did not independently affect scope interpretation but rather modulated the way and the extent to which WM and IC exerted their influence. These results have important implications for approaches to L2 acquisition. Our findings align more closely with capacity-based approaches (Hopp, Reference Hopp2010) and more recent identity-oriented approaches (Hopp, Reference Hopp2022). We contend that L1 and L2 comprehension, in the context of quantifier scope interpretation in this study, do not differ in terms of linguistic representations and processing architectures. Any differences or deviations in L2 scope interpretation compared to L1 can be attributed to the same factors that drive individual differences among native speakers, such as WM capacity and IC, as well as factors unique to L2 acquisition, such as proficiency.
8. Limitations and future directions
This study has limitations that warrant further investigation. First, it used only a single test for each individual difference measure, which may not fully capture individual variations. Different tests may tap different aspects of cognitive capacity; for instance, the Simon task may better capture domain-general cognitive control due to its immunity to linguistic interference, whereas the Stroop task better measures language-oriented response inhibition. Therefore, future research could employ a battery of IC tasks (e.g., Simon, Stroop and Flanker) and aggregate their scores for a more robust measure of IC. Second, WM measures similarly vary in reliability. Digit span tasks, for example, have shown lower reliability than written sentence recall tasks (Liu & Murao, Reference Liu and Murao2025), potentially limiting their predictive power. Future studies can benefit from including multiple WM tests to assess overall WM capacity more accurately. Third, this study posited that L2 proficiency mediates the role of WM and IC as a function of language experience. However, Kheder and Kaan (Reference Kheder and Kaan2021) found that L2 proficiency alone does not fully capture bilinguals’ language experience. Future studies should disentangle these constructs to better understand their interaction with cognitive factors.
9. Conclusion
This study examined how Chinese-speaking learners of English interpreted quantifier scope in English, focusing on the role of individual differences in L2 proficiency, WM and IC in interpreting IS, which is cognitively demanding. Results from a covered-box picture selection task showed that L2 learners accepted IS to a lesser extent compared to native speakers. This pattern reflects the influence of L1 transfer, limited L2 input and processing demands associated with IS. Moreover, WM and IC modulated the access to IS, with L2 proficiency mediating their effects. Overall, these findings underscore the contributions of domain-specific and domain-general mechanisms in L2 acquisition.
Data availability statement
All data and R code used for this study are available at OSF: https://osf.io/t7xh4/overview?view_only=13c2b2f0cf7c4a12815fd7fc26c99cdd.
Acknowledgements
We gratefully acknowledge the CLA Ross-Lynn Postdoctoral Fellowship, funded by the Office of the Vice President for Research and Partnerships at Purdue University and awarded to Shaohua Fang. We also thank Dr. Al López, the Department of English and the College of Liberal Arts at Purdue University for their support. We are deeply grateful to Dr. Qilin Tian and Dr. Kangzheng Gao for their assistance with participant recruitment and data collection. We further acknowledge the two anonymous reviewers for their constructive feedback and insightful comments. We also thank the audiences at the 38th Annual Conference on Human Sentence Processing (University of Maryland, 2025) and the Language Learning roundtable conference Cognitive Neuroscience of Individual Differences in Adult Language Learning: Future Directions (CogNeuroIDALL) (University of Illinois Chicago, 2025) for their valuable suggestions. Earlier versions of this work were presented by Shaohua Fang at the Purdue Experimental Linguistics Lab and the Acquilab at the University of Connecticut, and we thank members of both labs for their stimulating questions and helpful discussions. Part of the data in this paper was recently presented by Shaohua Fang at the College of Liberal Arts Research Academy’s Research Day at Purdue University. We thank Dr. Elaine Francis and Dr. Arielle Borovsky for their constructive comments and feedback.
Funding statement
This work was supported by a Ross-Lynn Postdoctoral Fellowship awarded by Purdue University to Shaohua Fang.
Competing interests
The authors declare none.




