To bind or not to bind: Individual differences in pronominal processing among adolescent Mandarin-English heritage speakers

Jiuzhou Hao; Vincent DeLuca; Jason Rothman

doi:10.1017/S0305000926100828


Published online by Cambridge University Press: 29 June 2026


Jiuzhou Hao*: Affiliation:
School of Philosophy, Psychology and Language Sciences, The University of Edinburgh, UK; Department of Language and Culture, UiT The Arctic University of Norway, Norway; Department of Linguistics and English Language, Lancaster University, UK
Vincent DeLuca: Affiliation:
Department of Language and Culture, UiT The Arctic University of Norway, Norway
Jason Rothman: Affiliation:
Department of Language and Culture, UiT The Arctic University of Norway, Norway; Department of Linguistics and English Language, Lancaster University, UK; Nebrija Research Centre in Cognition, Universidad Nebrija, Spain
*Corresponding author: Jiuzhou Hao; Email: jz.hao@outlook.com



Abstract

This study examines how late adolescent Mandarin-English heritage speakers (HSs) process different types of Mandarin pronouns in real time and how individual differences in cognitive and experiential factors modulate this process. Using a web-based visual world eye-tracking paradigm, we tested the interpretation of pronominals (ta), simplex reflexives (ziji), and complex reflexives (taziji), which differ in their reliance on narrow syntax versus syntax-discourse-semantic integration. At the group level, taziji was interpreted locally, while ziji and ta were interpreted as referring to long-distance (LD) antecedents. Working memory and inhibition modulated the processing of ziji and ta, whereas only current heritage language (HL) exposure influenced the processing of taziji. These findings indicate that domain-general cognitive resources are recruited during the resolution of pronouns involving LD and interface-level dependencies, while narrow syntactic structures are more sensitive to variation in language exposure. The results point to structural asymmetries in how cognitive and experiential factors affect real-time HL pronoun resolution.

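Abstract (translated from Chinese)

This study examines how late adolescent Mandarin-English heritage speakers (HSs) process different types of Mandarin pronouns in real time, and how cognitive and experiential individual differences influence this process. Using a web-based visual world eye-tracking technique, we tested the referential resolution of the personal pronoun 他/她 (ta), the simplex reflexive 自己 (ziji), and the complex reflexive 他自己/她自己 (taziji). These three types of pronouns differ in how far their processing relies on narrow syntactic rules versus syntax-discourse-semantic integration. Group-level results show that participants mainly interpreted taziji as referring to a local antecedent, whereas ziji and ta were more often interpreted as referring to a long-distance antecedent. Individual-differences analyses show that working memory and inhibitory control significantly affected the comprehension of ziji and ta, whereas only the current degree of heritage language use significantly affected the processing of taziji. These findings indicate that cognitive factors play an important role in resolving pronouns that involve long-distance dependencies and interface-level integration; by contrast, structures that rely on narrow syntactic constraints are more sensitive to differences in language experience. The results reveal a structural asymmetry in the roles of cognitive factors and language experience in real-time heritage language pronoun processing, and provide new evidence for understanding the mechanisms of heritage language syntactic processing.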

Keywords

individual differences; binding principles; adolescent heritage speakers; simplex and complex anaphors; pronominals

Information

Type: Research Article
Information: Journal of Child Language, First View, pp. 1-25

DOI: https://doi.org/10.1017/S0305000926100828
Creative Commons: This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright: © The Author(s), 2026. Published by Cambridge University Press

1. Introduction

To date, heritage language (HL) acquisition research has primarily focused on either early childhood development or adult competence/performance outcomes (Kupisch & Rothman, 2018; Montrul, 2016; Polinsky, 2018). As a result, HL development and use in adolescence remain understudied, despite compelling reasons to the contrary (Bayram et al., 2021; Korkus & Vihman, 2024; Minkov et al., 2019). Indeed, having a clear picture of HL development over late childhood and adolescence is essential for capturing the full developmental trajectory of HL bilingualism, in particular for linguistic domains where sub-properties are generally known to be gradually acquired in stages over time (e.g., properties related to pronouns). The fact that adolescent HL bilinguals are differentially subject to (drastic) changes in their input and usage patterns, emerging sociolinguistic realities with increased personal agency, as well as maturation in other cognitive and psychological abilities, results in a rather unique natural laboratory whereby studying their development over this transitional period (from childhood to adulthood) can be informative in multifarious ways.

Heritage speakers (HSs) are early bilinguals who acquire a HL from birth through naturalistic exposure at home or in their immediate communities. Importantly, the HL is not the language of the larger society (Montrul, 2016; Rothman, 2009). At the macro-group level, a substantial body of research has documented that adult HSs often perform differently from first language (L1)-dominant speakers of the same language (Montrul, 2016; Polinsky, 2018). To investigate the source of these differences, researchers have proposed a comparative developmental approach, examining child HSs and adult HSs relative to monolingual “baselines” (Montrul & Polinsky, 2021; Polinsky, 2018; Polinsky & Scontras, 2020). The logic behind this approach is that if child and adult HSs pattern alike and both differ from monolinguals, the difference reflects a divergent acquisition trajectory and outcome for HSs relative to L1-dominant users. In contrast, if child HSs are more similar to monolingual children but differ from adult HSs, then HL attrition may explain the adult HSs’ outcomes (Polinsky, 2018).

While this comparative approach has yielded important insights, it implicitly assumes a linear or categorical developmental path and often (unwittingly) treats HS populations as internally homogeneous. However, an accumulating body of research highlights that HSs exhibit striking individual differences (IDs) to a degree that is rarely observed among neurotypical L1-dominant users or even L2 populations (De Houwer, 2023; Rothman et al., 2023). These differences span multiple linguistic domains (Daskalaki et al., 2019) and are shaped by a complex interaction of child-internal factors, for example, cognitive abilities, and child-external factors, for example, input quantity and quality (see Paradis, 2023, for a summary). For the comparative framework to be truly explanatory, it must be embedded within a broader developmental model that not only traces language outcomes across the lifespan but also accounts for the heterogeneity observed within HS groups.

Under the comparative approach, another recurring question in the HL bilingualism literature concerns whether all linguistic domains are equally vulnerable to input reduction (Polinsky & Scontras, 2020). One influential proposal is that linguistic structures situated at the interface of syntax with other domains, especially discourse-pragmatics, which are often the same ones fully converged upon later in L1-dominant child acquisition, are more susceptible to reduced input than properties governed strictly by narrow syntax. This idea is formalized in the Interface Hypothesis (IH), which maintains that morphosyntactic properties requiring integration with discourse-pragmatic information are particularly vulnerable in bilingual grammars (Sorace, 2011), most likely for processing-related reasons. In addition, structures involving long-distance (LD) dependencies, such as object-relative clauses and wh-movement, are thought to impose greater difficulties for HSs, a phenomenon sometimes referred to as “the distance problem” (Polinsky & Scontras, 2020). This distance problem suggests that the integration of elements across larger syntactic spans may interact with the vulnerability of interface structures, creating a cumulative challenge for bilingual grammars. The IH and the distance problem have been empirically tested in both child and adult HSs and other bilinguals, but findings supporting their predictions remain mixed (Daskalaki et al., 2019; Hao et al., 2024; Leal et al., 2014).

These mixed results may not be surprising when considering that the comparative approach tends to focus on group-level patterns while often overlooking the substantial variability within HS populations. By averaging across individuals, this approach may obscure meaningful variability in how different structures are processed and acquired by different individuals. An ID approach offers a powerful complementary lens to refine and test the IH and the distance problem. From the ID perspective, narrow syntactic structures and local dependencies are expected to be more uniform across HSs, whereas interface structures and LD dependencies are more susceptible to variation.

Additionally, while there has been an upsurge in research adopting an ID approach in the HL literature, doing so in combination with online processing methods trails significantly behind. Online processing methods measure how language users respond to linguistic information in real time, offering insights into automatic, time-sensitive mechanisms that underpin language comprehension/use. In contrast, offline methods, such as grammaticality judgment tasks or elicited production tasks, are more likely to be influenced by language users’ metalinguistic awareness, task strategies, and explicit knowledge. This distinction is particularly relevant for HSs, whose performance is known to be highly sensitive to such factors (Polinsky, 2018). Online methods, therefore, offer a more direct window into HL knowledge and are essential for evaluating how ID factors shape language comprehension and use beyond what offline tasks can reveal (Bayram et al., 2021).

The current study investigates how IDs in cognitive and input-related factors affect online HL processing in adolescent Mandarin-English HSs. Using a visual world eye-tracking paradigm, we focus on three types of Mandarin pronouns: pronominals (ta “he/she/it,” an LD dependency located at the syntax-pragmatics interface), simplex reflexives (ziji “self,” an LD dependency located at the syntax-semantics-pragmatics interface), and complex reflexives (taziji “himself/herself/itself,” a local dependency governed by narrow syntax). By examining how adolescent HSs interpret these forms in real time, we evaluate the extent to which working memory (WM), inhibition, and HL exposure and use patterns modulate HL processing. In doing so, we aim to contribute to a more developmentally grounded, cognitively informed model of HL bilingualism.

2. ID factors modulating HL development and processing

ID factors can be broadly categorized into user-internal and user-external dimensions (see Paradis, 2023, for a summary). Internal factors refer to the cognitive capacities that language users bring to the task of language acquisition and use, such as working memory (WM), inhibitory control/inhibition, and other executive functions. These factors influence how efficiently learners can process, store, and retrieve linguistic information. In contrast, external factors encompass the broader socioenvironmental context of language exposure and use. These include proximal factors, such as the amount and quality of direct input in the HL and opportunities for its use, as well as distal factors related to the larger environment, such as socioeconomic status (SES), which can shape the availability and richness of proximal experiences. It is worth noting that while ID factors should, in principle, affect language learners and users more generally, their effects on HL processing need not reflect the same underlying mechanisms across populations. HSs occupy a unique sociolinguistic and developmental position, characterized by early bilingual exposure, long-term dominance shifts, and often sustained reduction in naturalistic HL input. As a result, ID effects in this population are likely to arise from interactions between user-internal cognitive resources and user-external experiential factors that are not directly comparable to those in other bilingual or monolingual populations. For example, an effect of WM in HL processing, where input is reduced, variable, and largely naturalistic, may reflect different underlying mechanisms from WM effects observed in L1-dominant speakers with full input or in L2 learners whose reduced input is often instruction-driven (see also Cunnings, 2017, on the differential role of WM in L1-dominant and L2 users). To avoid generalizations that might incorrectly imply shared underlying mechanisms, we therefore limit our discussion of ID effects to HSs.

Starting with (language) user-internal factors, a growing body of research has demonstrated that WM plays a positive role in bilinguals’ language abilities across a wide range of linguistic domains (e.g., vocabulary, morphosyntax, and narrative skills) and across task modalities (i.e., offline comprehension, production, and online measures). However, the role of WM in HL development and use has received comparatively less attention. Among the few existing studies, Paradis et al. (2020) found that HL vocabulary size in child HSs was positively associated with WM capacity, while Soto-Corominas et al. (2022) reported a similar positive relationship between WM and HL sentence repetition accuracy across a range of morphosyntactic structures. These findings suggest that WM supports HL development at least in young children. While there is good reason to hypothesize a strong relationship between HL processing and WM capacity for domains of grammar that tax memory (e.g., lexical retrieval, LD dependencies, and structures that require integration between grammar and discourse), the relationship between WM and HL processing is severely understudied and thus unclear. In terms of grammar proper, Bice and Kroll (2021) stand out as a singular study directly examining the role of WM in HL processing. Their findings showed that WM capacity correlated with sensitivity to subject-verb agreement violations in L1-dominant users, but not in adult HSs. While this contrasts with Paradis et al. (2020) and Soto-Corominas et al. (2022), the discrepancy may be task-specific, structure-specific, or age-specific. This highlights the need to consider carefully the linguistic domains and developmental stages under investigation.

Another user-internal factor that is particularly relevant to the current study is inhibitory control, or inhibition. While the role of inhibition in bilingualism has been extensively studied in the context of language switching and lexical selection as well as in research on domain-general cognitive advantages associated with bilingual experience, relatively little attention has been paid to how inhibition supports real-time sentence processing, particularly in HSs. Yet, examining inhibition at the processing level holds significant promise for addressing a central question in bilingualism: how does cross-linguistic influence (CLI) manifest during language comprehension, and what cognitive mechanisms help bilinguals manage competing representations from their two languages?

At the processing level, CLI has been reported in terms of HSs’ use of the majority language processing strategies even when processing the HL (see Chondrogianni, 2023, for a summary). In such cases, HSs may rely on cues or parsing strategies that are more consistent with their societal language than with those typically employed by dominant speakers of their HL. Several theoretical models of bilingual sentence processing account for this phenomenon by emphasizing the role of cue-based transfer. For instance, both the cue-based retrieval model (e.g., Cunnings, 2017) and the Unified Competition Model (e.g., MacWhinney, 2018) posit that bilinguals interpret sentences by drawing on cues associated with both of their languages. When the two languages differ in cues, bilinguals may transfer cue preferences from one language to the other, resulting in non-target-like processing. Therefore, when HSs process HL structures that diverge from the majority language, those with stronger inhibitory control may show greater apparent success in suppressing default strategies that align with the societal dominant language.

Turning to user-external factors, a substantial body of evidence highlights the importance of HL input quantity (a proximal factor typically measured as current or cumulative exposure to and use of the HL) in shaping HL development and use, particularly as reflected in offline measures across a wide range of linguistic domains (Chondrogianni & Daskalaki, 2023; Daskalaki et al., 2019; Kubota et al., 2025; Paradis et al., 2020; Soto-Corominas et al., 2022). However, the role of HL input quantity in HL processing remains less clear. The emerging literature presents mixed findings, suggesting that the role of HL input quantity may vary across developmental stages or types of tasks. For example, Hao et al. (2024) found that among pre-teenage Mandarin-English HSs, HL input quantity significantly predicted performance on offline comprehension and production of various non-canonical structures. However, input quantity did not modulate online processing of the same structures in a self-paced listening with picture verification task. In contrast, studies with adult HSs have reported more consistent effects of input. For instance, using the visual world eye-tracking paradigm, Hao, Rossi et al. (2025) found that input modulated the processing strategy preferences (see also Karaca et al., 2024). Moreover, emerging neuroimaging work (EEG) also shows a positive correlation between HL grammatical processing and quantitative as well as qualitative aspects of HL input (Hao, Rossi et al., 2025).

These divergent findings suggest that the role of HL input in processing may be developmentally mediated, and that adolescence could represent a transitional phase in which the relationship between input and real-time processing begins to emerge or reorganize. Moreover, compared to self-paced listening with picture verification tasks, the visual world eye-tracking and EEG paradigms, at least as implemented in the above studies, can offer a more naturalistic and certainly more temporally fine-grained measure of language processing. They track participants’ real-time attention to referents as they listen to/read (spoken) language, allowing the detection of subtle effects of input on processing. Additionally, unlike self-paced listening with picture verification tasks, eye-tracking and EEG do not require an additional metalinguistic or verification task at the end of each trial. This minimizes task-related demands and reduces the likelihood that processing is influenced by response strategies, affective factors (argued to be particularly relevant for HSs; Polinsky, 2018), or metalinguistic knowledge, making these paradigms particularly suitable for capturing the automatic aspects of HL processing.

The present study focuses on three language user-level factors: WM, inhibition, and HL input quantity. While other language user-level factors may also contribute to HL processing IDs, including additional predictors would require a substantially larger sample size to ensure adequate statistical power and avoid overfitting. This is especially true as we are also interested in how IDs potentially differentially manifest themselves across different linguistic domains. Moreover, many user-external factors, such as input quantity and input richness, tend to be highly correlated, complicating model specification and interpretability.

To maintain analytical clarity while achieving a robust understanding of the role of WM, inhibition, and HL input quantity, the present study controls for other user-level factors to the extent possible. As detailed in the Participants section, we accounted for HL education, parental language background, and SES, three distal factors that have been shown to influence HL development and use (see Paradis, 2023, for a review). By focusing on theoretically motivated core predictors while minimizing potential confounds, this study aims to provide a precise and developmentally sensitive account of how internal and external factors shape real-time HL processing in adolescence.

3. Pronoun systems in English and Mandarin

Pronouns are linguistic devices whose referential interpretation depends on other elements within the sentence and/or discourse. That is, their meanings are not fixed but are bound to and/or co-referenced with other noun phrases (NPs) or entities. Pronouns include reflexives (sometimes referred to as anaphors), such as himself, and pronominals (sometimes referred to reductively as pronouns) such as he. Reflexives and pronominals have different distributions. For example, while the reflexive himself in (1) must be interpreted as Joe but not Jack, the pronominal him in (2) must not refer to Joe but can be interpreted as Jack or some other third-person male in the discourse.

(1)	Jack thinks Joe likes himself.
(2)	Jack thinks Joe likes him.

Chomsky (1993) proposed Binding Principles A and B to capture the distribution of reflexives and pronominals, respectively. More specifically, Binding Principle A states that a reflexive must be bound within its governing category, that is, locally bound. The governing category is the minimal category that contains the reflexive, the element that assigns a thematic role or syntactic Case to the reflexive, and a SUBJECT. Binding Principle B stipulates that a pronominal must be free within its governing category. While in both (1) and (2), Jack and Joe c-command himself and him, Joe is realized within the governing category of the pronouns, but Jack is not. In other words, Joe is a local antecedent of the pronouns and Jack is an LD antecedent: pronominals can be bound by an LD antecedent but not by a local antecedent, whereas reflexives must be bound by a local antecedent.
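The locality contrast between Principles A and B can be illustrated with a toy check over examples (1) and (2). This is only a minimal sketch, not part of the study: the `Candidate` class, the `can_bind` and `binders` functions, and the two boolean features (c-command and membership in the governing category) are hypothetical simplifications introduced here, and the sketch covers syntactic binding only, not discourse co-reference.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A candidate antecedent NP, reduced to two structural facts (an assumption)."""
    name: str
    c_commands: bool  # does this NP c-command the pronoun?
    local: bool       # is this NP inside the pronoun's governing category?


def can_bind(pronoun_type: str, candidate: Candidate) -> bool:
    """Return True if the candidate may syntactically bind the pronoun."""
    if not candidate.c_commands:
        return False  # binding presupposes c-command
    if pronoun_type == "reflexive":
        return candidate.local       # Principle A: must be bound locally
    if pronoun_type == "pronominal":
        return not candidate.local   # Principle B: must be free locally
    raise ValueError(f"unknown pronoun type: {pronoun_type!r}")


# Examples (1) and (2): "Jack thinks Joe likes himself/him."
jack = Candidate("Jack", c_commands=True, local=False)  # LD antecedent
joe = Candidate("Joe", c_commands=True, local=True)     # local antecedent


def binders(pronoun_type: str) -> list[str]:
    """Names of the candidates that may bind a pronoun of the given type."""
    return [c.name for c in (jack, joe) if can_bind(pronoun_type, c)]


print(binders("reflexive"))   # ['Joe']
print(binders("pronominal"))  # ['Jack']
```

In this toy setting the reflexive in (1) can only be bound by Joe and the pronominal in (2) only by Jack, mirroring the complementary distribution described above.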

While Binding Principle A reliably accounts for the distribution of reflexives in argument positions in English, Binding Principle B faces some challenges: pronominals often show greater interpretive flexibility than Principle B alone predicts. Consider example (3):

(3)	I know what Jack and Joe have in common. Jack adores him, and Joe adores him too.

In this case, the second pronominal him can, for many speakers, refer to Joe, its apparent local antecedent, seemingly violating Principle B (as a bound variable interpretation). However, such examples are often judged acceptable in context, especially when contrastive focus or parallel discourse structure is present. This example highlights the difference between binding and co-reference (see Reuland, 2006, for a summary). Binding involves a syntactic dependency between a pronoun and its antecedent, constrained by structural rules such as locality and c-command. In contrast, co-reference refers to cases where two expressions refer to the same entity but are not syntactically dependent on each other. Building on this distinction, Grodzinsky and Reinhart (1993) proposed Rule I, which formalizes a principle of interpretive economy: when both a bound variable reading (via syntactic binding) and a coreferential reading (via discourse) are available and yield the same interpretation, the coreferential reading is blocked. In example (2), him can neither be bound by Joe (as this violates Principle B) nor be co-referenced with Joe (this co-reference reading is identical to a bound reading, violating Rule I). In example (3), however, while the binding reading is ruled out by Principle B, the co-referential reading (A and B like B) is not blocked by Rule I as it differs from the bound variable reading (A likes A and B likes B), allowing a local co-referential reading. These patterns demonstrate that (LD) pronominal resolution draws on both syntactic constraints and pragmatic computations: while syntax constrains the space of possible antecedents, pragmatic and discourse principles determine which referent is ultimately selected.
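The comparison that Rule I performs can be sketched in a few lines. The following is a deliberately simplified illustration, not Grodzinsky and Reinhart's formalization: the `reading` and `coreference_allowed` functions are hypothetical, and representing an interpretation as a set of (agent, patient) pairs over the individuals under discussion is an assumption made here for illustration.

```python
def reading(patient_of, individuals):
    """Set of (agent, patient) pairs that a reading asserts over the individuals."""
    return frozenset((x, patient_of(x)) for x in individuals)


def coreference_allowed(individuals, referent):
    """Toy Rule I: co-reference with `referent` is allowed only if it is
    distinguishable from the bound-variable reading (assumed representation)."""
    bound = reading(lambda x: x, individuals)         # bound variable: x V x
    coref = reading(lambda x: referent, individuals)  # co-reference: x V referent
    return bound != coref


# Example (2): only Joe is at issue, so both readings amount to "Joe likes Joe".
print(coreference_allowed({"Joe"}, "Joe"))          # False

# Example (3): Jack and Joe are contrasted, so the two readings come apart
# ("A and B like B" versus "A likes A and B likes B").
print(coreference_allowed({"Jack", "Joe"}, "Joe"))  # True
```

Under these assumptions, local co-reference is blocked in (2) because it adds nothing beyond the bound reading, and permitted in (3) because the contrast with Jack makes the two readings distinct.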

Under such an account, reflexives, governed by Principle A, are interpreted via syntactic binding: once the structural dependency is established, the reflexive’s reference is determined compositionally. In contrast, pronominals are referentially independent and interpreted through discourse co-reference (Rule I) once syntactic constraints (Principle B) have ruled out illicit binding. This asymmetry explains why reflexives are handled within narrow syntax, whereas pronominals require computation at the syntax-pragmatics interface.

Mandarin has three forms of pronouns: the pronominal ta “he/she/it” (ta henceforth), the complex reflexive taziji “himself/herself/itself” (taziji henceforth), and the simplex reflexive ziji “self” (ziji henceforth). The pronominal ta, like its English counterpart him, is governed by Binding Principle B and Rule I (which blocks co-reference between ta and its local antecedent when a complex reflexive taziji yields the same interpretation). The complex reflexive taziji, corresponding closely to English himself, is subject to Binding Principle A, requiring local syntactic binding. In contrast, the simplex reflexive, ziji, lacking in English, can refer either to an LD antecedent or to a local antecedent, violating Binding Principle A. The status of ziji in Mandarin remains highly debated without a theoretical consensus.

Some approaches treat ziji as a syntactic anaphor in both LD and local readings (see Cole et al., 2006, for a summary). The basic idea is that the LD reading of ziji results from Logical Form (LF) movement: ziji moves to the T(ense) position at LF, where it receives its features from the subject through Specifier-Head agreement. Under this approach, even LD binding is ultimately local, conforming to Binding Principle A. However, this movement-based approach fails to account for some empirical observations, such as island constraints and blocking effects (see Wang & Pan, 2021, for a summary of criticisms of the movement approach). One influential alternative approach proposes that ziji functions both as a syntactic anaphor, following Binding Principle A and giving rise to local binding, and as a pragmatic logophor, leading to an LD reading (C. T. J. Huang & Liu, 2000). Logophors impose a consciousness requirement: the antecedent must be conscious of the relevant event being reported. For example, in example (4), the LD antecedent Jack must be aware of the claim made in the embedded clause, as it reflects his own view; the consciousness requirement is therefore met, allowing ziji to take the LD antecedent as its referent. Under this non-uniform approach, the interpretation of LD ziji is influenced by syntactic structure, verb semantics (e.g., whether the verb is logophoric), and discourse-level factors (e.g., perspective alignment).

In the present study, as we are interested in IDs in modulating the processing of structures at the interface versus at the narrow syntax, all main verbs in the experiment are logophoric, leading to a preference for LD binding. Thus, the LD interpretation of ziji arises at the syntax-semantics-pragmatics interface: syntax constrains possible antecedents, verb semantics encode logophoricity, and discourse/pragmatic context ultimately selects the referent.

(4)	杰克	认为	乔	喜欢	自己。
	Jack	renwei	Joe	xihuan	ziji
	Jack	believe	Joe	likes	SELF
	Jack believes that Joe likes him/himself.

4. The acquisition and processing of English and Mandarin pronouns

In terms of the acquisition of pronouns in L1-dominant children, both English- and Mandarin-speaking children have been shown to have acquired Binding Principle A from a very young age (around 5 years old). That is, they correctly link reflexives to a local c-commanding antecedent (Chien, 1992; Chien & Wexler, 1990; Clackson et al., 2011). In contrast, at the same age, English-speaking children have been consistently shown to erroneously accept local bindings of pronominals, a phenomenon often referred to as the Delay of Principle B Effect (see Thornton & Wexler, 1999, for a review). This has been attributed either to children’s immature development of pragmatic principles governing pronominal resolution (e.g., Rule I), despite having acquired Binding Principle B (e.g., Chien & Wexler, 1990), or to domain-general limitations such as WM constraints (e.g., Kim & Yoon, 2020). Studies testing the Delay of Principle B Effect in Mandarin, however, have yielded mixed results (Chien & Lust, 2006). It is typically not until the age of 9 that children begin to show adult-like performance in correctly rejecting local bindings of pronominals.

As for the development of Mandarin ziji, while children before the age of 4 show unsystematic performance patterns, children from the age of 5 predominantly co-index ziji with local antecedents rather than with LD antecedents (Chien & Lust, 2006). This local preference is also found in adults. However, individual studies show a high degree of variation: the mean acceptance rates of LD readings range from under 40% to over 90%, and even the mean acceptance rates of local readings range from under 70% to 90% (see Chen & Ionin, 2023, for a summary). Such inter-study variation could be attributed to the different methodologies used in these studies (e.g., truth value judgement task versus picture-biasing sentence acceptability judgement task; logophoric versus generic verbs), to potential individual variation in executive function (particularly WM), and/or to the importance placed on syntax versus discourse factors (see Kim & Yoon, 2020, for a summary). Importantly, different offline tasks vary in their demands on WM and in the degree to which they require integration of pragmatic information. However, research efforts aimed at addressing task effects and individual variation in this domain remain limited. Indeed, studies adopting online processing methods tend to report a local-reading advantage such that binding ziji to a local antecedent induces smaller processing costs compared to binding ziji to an LD antecedent (Dillon et al., 2016; Lyu & Kaiser, 2021).

Among bilingual speakers, to our knowledge, all existing studies have employed offline tasks. For example, C. Chen and Ionin (2023) used a picture-based truth-value judgment task to examine the acceptability of local and LD readings of ta, ziji, and taziji among two groups of Mandarin proficiency-matched L2 learners: L1-Korean and L1-English speakers. Their results showed that L1-Korean learners were more likely to accept local readings of ta compared to both the L1-English learners and L1-dominant Mandarin speakers, suggesting CLI from Korean, a language that permits locally bound pronominals. In the comprehension of taziji, all three groups demonstrated a preference for local readings, although the L1-English group was somewhat more likely to reject its local readings. For ziji, L1-dominant Mandarin speakers showed a numerical preference for local readings, while both L2 groups were significantly more likely to reject LD readings and accept local readings.

In a related study, C. Chen (2020) used the same methodology to compare the comprehension of all three pronoun forms among Mandarin-English HSs and L1-English L2 learners of Mandarin. The results showed that L2 learners were more likely to reject LD readings and accept local readings of ta (in contrast to Chen & Ionin, 2023), taziji (partially consistent with Chen & Ionin, 2023), and ziji (like Chen & Ionin, 2023). In contrast, HSs patterned more closely with L1-dominant speakers in their interpretation of ta and taziji. However, they were less likely to accept the LD reading of ziji and more likely to accept its local reading, which was partially attributed to CLI from English.

5. The present study

The present study employs a web-based visual world eye-tracking paradigm to investigate the automatic, online processing of three types of Mandarin pronouns, ta, ziji, and taziji, among Mandarin-English late adolescent HSs aged 14 to 18. Importantly, to avoid variable binding preferences of ziji across individuals and to bias an LD reading, we use only logophoric verbs in the experiment. This manipulation allows us to probe whether different pronoun types engage distinct resolution mechanisms during real-time processing. Specifically, we aim to address the following research questions (RQs):

RQ1: How do HSs process the three types of Mandarin pronouns? Do the pronouns elicit distinct processing patterns?

RQ2: Which user-internal and user-external factors modulate IDs in HSs’ processing, and how? Do they do so differentially for different types of pronouns?

As described above, two user-internal factors are of particular interest in the present study: WM and inhibition. As for the user-external factor, we focus on HL input quantity. Our focus on late adolescence offers a particularly informative window for testing the effects of user-external and user-internal factors. Compared to early adolescence and childhood, late adolescence is marked by the maturation of executive functions, enabling us to better distinguish between effects attributable to ongoing cognitive development and those driven by stable individual variability. Importantly, this age range is also when adolescents begin to exert greater autonomy over language use, making more independent choices about when, how, and with whom they engage in their HL. As a result, patterns of HL exposure and use (user-external factors) become more variable and personalized than in childhood, offering richer variation in external input that can be captured within an ID framework.

5.1. Predictions

Starting with RQ1, following C. Chen’s (2020) study with Mandarin-English adult HSs and the broader HL processing literature (e.g., Fuchs, 2022; Hao et al., 2024; Karaca et al., 2024), we predict that adolescent HSs will show distinct processing patterns across pronoun types. More specifically, we expect them to show local binding preferences for taziji and LD preferences for ziji and ta. Nevertheless, according to the IH and the distance problem, HSs may show greater variability in their preferences during the processing of ziji and ta, but relatively robust local preferences for taziji. This is because taziji resolution is governed by narrow syntactic constraints (Binding Principle A) and involves local dependencies, whereas the resolution of ziji and ta requires the integration of information across domains and the establishment of LD dependencies, making these structures more susceptible to input reduction and individual variation. The ID approach adopted in the present study allows us to move beyond a simple group-level analysis and ask why some HSs are more likely than others to exhibit local binding of ziji and ta despite the LD bias (driven by logophoric verbs in the case of ziji and by Binding Principle B in the case of ta) by examining the role of user-internal and user-external factors.

It is worth noting that local interpretations of ziji and ta in the present study need not reflect the same underlying mechanisms. Because ziji allows both local and LD binding in Mandarin and exhibits a default preference for local binding, local interpretations of ziji in the present design may arise from multiple sources, including difficulties with interface/LD structures, reliance on a default local strategy, or CLI from English, where reflexives are locally bound. In contrast, ta does not permit local binding in either Mandarin or English; therefore, any local binding of ta in the present study cannot be attributed to CLI or default strategies and is more likely to reflect difficulties associated with interface/LD integration.

For RQ2, we predict that user-internal and user-external factors will play differential roles depending on the type of pronoun. With respect to user-internal factors, WM is expected to facilitate LD interpretations of ziji and ta, insofar as establishing LD dependencies requires maintaining and integrating multiple candidate antecedents and sources of information (e.g., Rule I, logophoric verb semantics) during online processing. In addition, inhibitory control is expected to modulate the extent to which ziji is interpreted locally. Specifically, HSs with higher inhibitory ability may be better able to suppress English-like local binding strategies and/or the Mandarin local binding preference, leading to a higher likelihood of LD readings. Turning to user-external factors, although it may seem intuitive that HL input will have a default, ubiquitous effect on HL processing, previous findings are mixed (Hao et al., 2024; Hao, Kubota, et al., 2025; Karaca et al., 2024) and likely depend on other factors such as the specific domain of grammar and the time-course of its typical acquisition (Tsimpli, 2014). Two possibilities thus emerge: HL input either modulates HL processing or it does not. Given the limited literature in this specific domain and age range, we treat the effect of HL input as exploratory, without strong directional predictions. However, given that the properties we examine, while inherently related, have independent time courses for full acquisition even in monolingual children (Chien & Lust, 2006), and in light of claims that reduced input would affect interface-related structures more, it is possible, if not likely, that input factors might have a greater influence over some of the properties examined herein (ziji and ta) than others (taziji).

5.2. Participants

In total, 125 eligible participants accessed the experimental platform and provided informed consent. Of these, 44 participants did not successfully complete the main visual-world eye-tracking task, primarily due to repeated calibration failures. An additional nine participants were excluded because their effective eye-tracking sampling rate during the main task was below 15 Hz. Consequently, data from 72 Mandarin-English late adolescent HSs (14–18 y.o) who completed the study online were deemed of sufficient quality for analysis.

However, we further excluded 21 participants to control for variation in several user-level factors that have been shown to modulate HL development and use beyond the factors in focus in the current study. More specifically, we excluded three participants due to relatively low SES, five participants who currently reside in Ireland, eight participants with low Mandarin proficiency as measured by the Peabody Picture Vocabulary Test, Fourth Edition (PPVT-4; Dunn & Dunn, 2012), three participants who speak a language other than Mandarin and English (including other Chinese languages, e.g., Cantonese, Hokkien), and two participants who were first exposed to English, the societally dominant language, after the age of three. The final sample included 51 participants (18 girls, mean age = 15.8 years, SD = 1.7, min = 14 years, max = 18 years). Of these 51 participants, 16 currently reside in the UK and the other 35 reside in the USA. All HSs were exposed to Mandarin from birth at home and to English before the age of 3 years. They were either born and raised in their current country of residence (second-generation immigrants; n = 17) or immigrated there before the age of 3 (first-generation immigrants; n = 34), with a mean age of onset of acquisition of English of 11.57 months (SD = 7.96, min = 0, max = 27). All participants had exposure to formal instruction in Mandarin (e.g., via Saturday Schools, tuition).

5.3. Baseline tasks

5.3.1. Language background questionnaire

To collect participants’ language background and demographic information, we administered the Quantifying Bilingual Experience (Q-BEx) questionnaire (De Cat et al., 2023). Q-BEx is a validated, user-friendly online instrument designed to quantify multilingual language experience. We included all mandatory modules and all optional modules except the detailed attitudinal module. As a result, we obtained a comprehensive assessment of HSs’ language exposure and use, self-rated proficiency, richness of linguistic experiences, and language mixing patterns. The questionnaire provides four composite scores that serve as proxies for HL use and exposure. Specifically, we derived two key measures: current HL exposure and use (Mean = 0.41, SD = 0.11, Range = 0.07–0.60) and cumulative HL exposure and use (Mean = 80.53, SD = 30.61, Range = 11.70–118.38), by summing the respective exposure and use components. These aggregated scores were used as continuous user-external variables in subsequent analyses.

5.3.2. English and Mandarin receptive vocabulary

We administered the Peabody Picture Vocabulary Test, Fourth Edition (PPVT-4; Dunn & Dunn, 2012) to assess participants’ receptive vocabulary ability in both English and Mandarin. Form A was administered in English, and Form B was translated from English into Mandarin and administered in Mandarin. Because the PPVT-4 is neither available nor normed for Mandarin and was not designed or normed for bilingual populations, even in its English version, we report raw scores only, which were used solely for participant screening and exclusion purposes. For the same reasons, we caution readers against interpreting PPVT scores as direct measures of language proficiency or comparing English and Mandarin vocabulary scores within the same participant. Nevertheless, test administration followed the procedures outlined in the PPVT-4 manual, including age-appropriate starting items, ceiling rules, and termination criteria based on error counts. The final sample had a mean PPVT score of 139 in English (SD = 8.08, min = 115, max = 152) and 124 in Mandarin (SD = 7.44, min = 106, max = 151).

5.3.3. Flanker/no-go task

To examine inhibitory control, participants were tested on an engaging variant of a flanker task that also includes Go and No-Go components (Woodard et al., 2016). In this task, a Go trial requires participants to press a key on the keyboard (“Z” for left and “M” for right) in accordance with the direction of the middle fish, which is flanked by two fish on its left and two on its right. There are two conditions in the Go trials: the congruent condition (30 trials) and the incongruent condition (30 trials). In the congruent condition, the flanker fish face the same direction as the middle fish. In the incongruent condition, the flanker fish face the opposite direction to the middle fish. In a No-Go trial (30 trials), the middle fish is surrounded by fishbowls, and participants were instructed to refrain from responding. We counterbalanced the direction of the middle fish. Each trial was preceded by a 1000 ms fixation cross in the middle of the screen, and each trial lasted until a response was recorded or until a maximum display time of 5000 ms was reached. Reaction time and accuracy were recorded. We calculated the No-Go cost (mean = −0.16, SD = 0.18, min = −0.33, max = 0.33) as a proxy for inhibitory control by taking the average accuracy difference between omission errors on Go trials and hits on No-Go trials. This No-Go cost score was carried forward into subsequent analyses, with higher values indexing weaker inhibitory control.

5.3.4. Working memory task

We used a spatial sequence WM task inspired by the Alloway Working Memory Assessment (Alloway et al., 2008) to assess participants’ non-verbal visuospatial WM. In this task, participants were instructed to help a forgetful alien return home by recalling, in reverse order, the sequence of squares the alien had walked through in a 4 × 4 matrix. In each trial, the alien randomly moved across one or more squares in the matrix. In the first block, the alien walked through a single square; in each subsequent block, the number of visited squares increased by one. Each block consisted of six trials, with a maximum of eight blocks in total. Scoring followed the standard procedure outlined in the Alloway assessment. If a participant responded correctly to the first four trials in a block, they automatically progressed to the next block and were awarded the full six points for that block. The task was terminated once the participant responded incorrectly to three trials within the same block. The mean score on the task was 22.69 (SD = 4.76, min = 15, max = 30). This score was carried forward into the analyses as an index of WM capacity, with higher scores indicating greater WM capacity.
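The progression and termination rules above can be illustrated with a minimal sketch. It is not the authors' scoring script: the function name `score_wm` is hypothetical, and two details that the description leaves open are assumed here, namely that each correct trial is worth one point and that correct trials in the block where the task terminates still count.

```python
def score_wm(blocks):
    """Score a backward spatial span task from per-trial correctness.

    `blocks` is a list of blocks in presentation order; each block is a
    list of booleans (True = correct recall) for the trials administered.
    Assumptions (not stated in the text): one point per correct trial,
    and correct trials in the terminating block still earn points.
    """
    total = 0
    for trials in blocks[:8]:                # at most eight blocks
        # First four trials all correct: full six points, next block.
        if len(trials) >= 4 and all(trials[:4]):
            total += 6
            continue
        correct = 0
        errors = 0
        for is_correct in trials[:6]:        # six trials per block
            if is_correct:
                correct += 1
            else:
                errors += 1
                if errors == 3:              # third error ends the task
                    break
        total += correct
        if errors >= 3:
            break
    return total


example = [
    [True, True, True, True],                    # skips ahead, 6 points
    [True, False, True, True, True, True],       # 5 correct, task continues
    [False, True, False, False],                 # third error, task ends
]
print(score_wm(example))
```

Under these assumptions the maximum attainable score is 48 (eight blocks of six points).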

5.3.5. The visual world eye-tracking experiment

The visual world eye-tracking paradigm was adopted to examine participants’ online processing. In the task, participants listened to sentences while viewing three pictures (of potential referents) on the screen (Figure 1). We embedded the pronouns in genitive forms by adding the genitive marker de after the pronouns, and we included three experimental conditions (i.e., ta, ziji, and taziji). Additionally, we included a condition with full NPs followed by the de marker as a control condition to ensure that participants understood genitives. Each condition had nine trials, giving rise to 27 experimental trials and nine control trials. All experimental sentences followed the same format: This morning/last night + Long Distance referent NP + Main Verb + Local referent NP + Embedded Verb + ta/ziji/taziji/control NP + de + NP. We chose three logophoric verbs (xiangrang “want somebody to do something,” yaoqiu “demand somebody do something,” and mengjian “dream about somebody doing something”) as the main verbs, and each appeared three times per condition. For the embedded verbs and the de NP pairs, we chose three pairs, that is, liang … de tiwen “checking someone’s temperature,” mo … de erdu “touch someone’s ear,” and la … de weiba “pull someone’s tail,” and each appeared three times per condition. For all referent NPs (including the control condition NPs), frequently used disyllabic animal names were adopted, and each animal appeared equally often as the LD referent, the local referent, and the third referent (either not mentioned or the control condition NP). The positions of the NPs were counterbalanced such that the LD referent, local referent, and a third potential referent appeared an equal number of times in the Top, Bottom Left, and Bottom Right positions. To avoid item-specific effects, we created four lists such that each item appeared once in each of the ta, ziji, taziji, and NP conditions. For example, Figure 1, as a visual scene, was accompanied by sentence (5a) in List A, (5b) in List B, (5c) in List C, and (5d) in List D.

(5) a	jintianzaoshang	daxiang	yaoqiu	xiaogou	liang	ta	de	tiwen
	This morning	elephant	demand	dog	check	PRO	de	temperature
	“This morning, the elephant demanded the dog to check his temperature.”
b	jintianzaoshang	daxiang	yaoqiu	xiaogou	liang	ziji	de	tiwen
	This morning	elephant	demand	dog	check	SE	de	temperature
	“This morning, the elephant demanded the dog to check his temperature/the temperature of himself.”
c	jintianzaoshang	daxiang	yaoqiu	xiaogou	liang	taziji	de	tiwen
	This morning	elephant	demand	dog	check	SELF	de	temperature
	“This morning, the elephant demanded the dog to check the temperature of himself.”
d	jintianzaoshang	daxiang	yaoqiu	xiaogou	liang	shizi	de	tiwen
	This morning	elephant	demand	dog	check	lion	de	temperature
	“This morning, the elephant demanded the dog to check the lion’s temperature.”
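The rotation of items across the four lists can be sketched as a Latin square. This is an illustrative reconstruction rather than the authors' materials script: the total of 36 items (nine per condition in each list) is inferred from the trial counts above, and the function name `build_lists` and the zero-based item indices are hypothetical.

```python
from collections import Counter

CONDITIONS = ["ta", "ziji", "taziji", "NP"]
LISTS = ["A", "B", "C", "D"]


def build_lists(n_items=36):
    """Assign each item to one condition per list by rotating conditions.

    Item 0 is ta in List A, ziji in List B, taziji in List C and NP in
    List D, mirroring the pattern of sentences (5a) to (5d). The item
    count of 36 is an assumption inferred from nine trials per condition.
    """
    lists = {name: [] for name in LISTS}
    for item in range(n_items):
        for offset, name in enumerate(LISTS):
            condition = CONDITIONS[(item + offset) % len(CONDITIONS)]
            lists[name].append((item, condition))
    return lists


lists = build_lists()
# Each list contains every item once, with nine items per condition.
per_list_counts = {
    name: Counter(cond for _, cond in trials) for name, trials in lists.items()
}
print(per_list_counts["A"])
```

With this rotation every item contributes to all four conditions across lists, while no participant hears the same item in more than one condition.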

Figure 1.

Example of a visual scene in the visual world eye-tracking experiment.

A grid layout featuring three cartoon animal illustrations. A dog is at the top center. An elephant is at the bottom left. A lion cub is at the bottom right. The remaining squares are solid pink.

The experiment also included 18 filler trials where the processing of relative clauses was the focus. Relative clauses were chosen as they allow us to make sure two animate referents can be mentioned while another one can be inferred, following the design of Y. T. Huang et al. (2013). As such, the visual scenes for the relative clauses are comparable to the ones used for pronoun processing in terms of animacy and number of potential referents. For example, for the filler sentence “The fish that catches the lion is singing,” the visual scene consisted of a fish, a lion, and a potential patient of the action not mentioned in the sentence, for example, a shrimp, or a potential agent of the action not mentioned, for example, a crab.

To ensure that the participants paid attention to the sentences and understood genitives, the control trials, along with another 15 randomly selected trials, included an offline comprehension check in the format of picture-sentence verification (no participant was removed due to low/below-chance offline comprehension accuracy). In the comprehension check, participants were shown a picture that either matched the sentence or did not and were asked to press the “Z” key on the keyboard if they matched or the “M” key if not. This comprehension check was embedded in another alien game. We instructed the participants to listen to the sentences and look at some pictures. We also informed them that, occasionally, an alien would try to draw the event described in the sentence. They were instructed that when this happened, their task was to decide whether the alien’s drawing matched the sentence or not by pressing “Z” or “M.” All trials appeared in a completely random order for each participant. Each trial began with a 1,500-ms display of the visual scene, followed by the auditory experimental sentence.

Auditory stimuli were recorded by a male Mandarin L1-dominant user in a soundproof booth. Experimental stimuli were constructed by concatenating extracted tokens of This morning/last night + Long Distance referent NP + Main Verb + Local referent NP + Embedded Verb + ta/ziji/taziji/control NP + de + NP. All recordings were produced with neutral prosody, and no systematic prosodic manipulation (e.g., stress) was implemented on any segments. Two L1-dominant Mandarin users checked the naturalness of all stimuli. The duration of all parts but the ta/ziji/taziji/control NP was held constant across all items. Importantly, the duration of de + NP was exactly 1,200 ms in each experimental item. This 1,200 ms period constitutes the critical region for analysis.

5.4. Procedure

All participants took part in the study from their homes. We implemented all tasks with Gorilla on a webpage (Anwyl-Irvine et al., 2020), which uses WebGazer.js (Papoutsaki et al., 2016) to run webcam-based eye-tracking. To minimize any carry-over effect between the Mandarin and English vocabulary tests, all participants completed the tasks in the following sequence: the eye-tracking task, Mandarin PPVT, Flanker/No-Go task, WM task, English PPVT, and the Q-BEx. The whole experiment lasted approximately 65 min. The study was approved by the institutional ethics committee. All participants were informed in writing of their rights as participants prior to the experiment. Before any tasks, participants were asked to check boxes on the webpage to give consent for their participation.

Prior to participation, participants received an introduction video accompanied by written instructions in both Mandarin and English, detailing how participants could help in optimizing data quality (e.g., close all other applications and webpages except for the experiment page; maximize ambient lighting, etc.). Participants were additionally provided with both video and written instructions on how to complete the calibration procedure. Eye-tracking calibration employed a 9-point calibration routine. Recalibration was performed every nine trials or every 5 min, whichever occurred first. For each calibration phase, participants were allowed up to three attempts. An attempt was classified as unsuccessful if at least two out of the nine calibration points failed to calibrate successfully. Furthermore, to minimize system lag and reduce computational load, eye-tracking data were recorded only from the onset of the LD NPs. As a result, eye-gaze data during the 1,500-ms preview window and during the initial temporal adverbial segment (e.g., “last night” / “this morning”) were not recorded. These quality optimization procedures gave us a mean effective sampling rate of 30.6 Hz (SD = 9.3, Range = 15–60).

6. Results

For plotting and data analyses, we resampled the eye-movement data into 50-ms time bins. Given the mean sampling rate, this bin width minimizes empty bins while avoiding excessive aggregation of multiple samples within a single bin. This resampling yielded 24 time bins (data points) within the critical time window. To ensure data quality, we excluded trials with more than 50% invalid data points (e.g., out of bounds), resulting in 1,285 trials retained out of a total of 1,479 trials. Figure 2 illustrates the mean fixation proportion to the LD referent minus the mean fixation proportion to the local referent throughout the course of a trial. As such, a positive value indicates more looks to the LD referent than to the local referent, and a negative value indicates more looks to the local referent than to the LD referent. The dotted vertical line indicates the onset of the critical time window (the onset of the genitive marker de). As we centred the time information with reference to the onset of the genitive marker, and as different pronouns have different lengths, the onset of each condition differs in the figure. Visual inspection suggests that at the group level (RQ1), HSs preferred the LD referent over the local referent after hearing ta (LD-advantage score: Mean = 10.88, SD = 4.37, Range = 1 to 21) and ziji (LD-advantage score: Mean = 6.76, SD = 6.65, Range = −13 to 21). However, a preference for the local over the LD referent was observed after HSs heard taziji (LD-advantage score: Mean = −7.40, SD = 3.89, Range = −24 to 0).

Figure 2.

Difference in proportion fixations to LD versus local referent by Condition.

A line graph showing the difference in fixation proportions to LD versus local referents over time for three conditions.

The graph features a horizontal x-axis labeled Time Bin ranging from −3000 to 1000 and a vertical y-axis labeled Difference in fixation proportions to LD referent versus local referent ranging from −1.0 to 1.0. A vertical dotted line marks the zero point on the x-axis. A legend on the right identifies three conditions: ta represented by a solid black line, ziji represented by a dotted line, and taziji represented by a dashed line. All lines include a light gray shaded area representing the confidence interval.

* The ta condition (solid line) starts near zero, rises to a peak of approximately 0.35 at −1500, dips back to zero at the vertical dotted line, and then rises sharply toward 0.5 at the end of the timeline.

* The ziji condition (dotted line) remains relatively flat near the zero baseline throughout the negative time bins, showing a slight upward trend toward 0.3 after passing the zero mark.

* The taziji condition (dashed line) fluctuates near zero until −2000, then dips to a trough of −0.2 at −1200, returns to zero at the vertical dotted line, and then drops sharply to −0.5 by the end of the timeline.

To statistically account for the results, we calculated the LD-advantage score (over the local referent) within the critical time window. This LD-advantage score was calculated by subtracting the number of 50 ms time bins within the critical time window that contained looks to the local referent from the number of 50 ms time bins that contained looks to the LD referent. We did not adjust for the 200 ms needed to initiate a ballistic eye movement in response to an acoustic stimulus because we do not make a distinction between prediction and integration. Rather, we are interested in how pronouns are interpreted online in general. For statistical analyses, linear mixed-effects regressions were carried out with the lme4 package (Bates et al., 2015) in R (R Core Team, 2018). Our statistical modelling, whenever possible, follows a confirmatory approach that is subjective and theory-driven (McElreath, 2020), where only theory-driven fixed-effect factors were included. For random effects, we included the maximal random effects justified by the design where possible (Barr et al., 2013), that is, by-subject and by-item random intercepts, as well as by-subject and by-item random slopes for Condition. When the maximal model failed to converge, we iteratively simplified the random-effect structure until convergence was achieved, that is, by removing the random effect(s) accounting for the least variance.
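As an illustration of the LD-advantage score defined above, the following minimal sketch bins gaze samples from one trial and takes the difference in bin counts. It is not the authors' analysis code (their R script is on OSF): the input format, a list of `(time_ms, aoi)` pairs with time measured from the onset of de, and the labels `"LD"`, `"local"` and `"third"` are assumptions made for this example.

```python
BIN_MS = 50        # bin width used for resampling
WINDOW_MS = 1200   # critical region: de + NP


def ld_advantage(samples):
    """Return the LD-advantage score for one trial.

    `samples` is a list of (time_ms, aoi) pairs, with time_ms relative to
    the onset of the genitive marker and aoi naming the picture fixated
    ("LD", "local", "third") or None for an invalid sample. These labels
    are hypothetical. A bin counts for a referent if it contains at least
    one look to that referent, so a bin may count for both referents.
    """
    ld_bins = set()
    local_bins = set()
    for time_ms, aoi in samples:
        if not 0 <= time_ms < WINDOW_MS:
            continue                      # outside the critical window
        bin_index = int(time_ms // BIN_MS)
        if aoi == "LD":
            ld_bins.add(bin_index)
        elif aoi == "local":
            local_bins.add(bin_index)
    return len(ld_bins) - len(local_bins)


trial = [(0, "LD"), (10, "LD"), (60, "local"), (120, "LD"),
         (1300, "LD"), (-20, "local")]
print(ld_advantage(trial))
```

Because the window contains 24 bins, the score is bounded between −24 (local looks in every bin, none to the LD referent) and 24 (the reverse).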

For RQ1 (Do the pronouns elicit distinct processing patterns?), the maximal converged model was derived via the R syntax lmer(LD_Adv ~ Condition + (1 + Condition|Participant) + (1|Item)), with the fixed effect Condition treatment coded and ziji as the reference level (ziji vs. ta and taziji). The model suggests that (1) the ziji condition induced significantly more looks to the LD referent than to the local referent (the intercept is significantly larger than zero: Estimate = 7.41, SE = 0.51, CI [6.40, 8.42], t = 14.42, p < 0.001); (2) the ta condition induced more looks to the LD referent compared to the ziji condition (Estimate = 3.46, SE = 0.48, CI [2.51, 4.41], t = 7.15, p < 0.001); and (3) the taziji condition induced more looks to the local referent compared to the ziji condition (Estimate = −15.38, SE = 0.74, CI [−16.83, −13.92], t = −20.76, p < 0.001). As post hoc analyses, we reran the model with ta and taziji in turn as the reference level to examine HSs’ preference for the LD versus the local referent in each condition. The results suggest that the ta condition induced significantly more looks to the LD referent than to the local referent (the intercept is significantly larger than zero: Estimate = 10.87, SE = 0.35, CI [10.18, 11.56], t = 31.00, p < 0.001). In contrast, the taziji condition induced significantly more looks to the local referent than to the LD referent (the intercept is significantly smaller than zero: Estimate = −7.96, SE = 0.48, CI [−8.91, −7.01], t = −16.42, p < 0.001).

For RQ2 (Which user-internal and user-external factors modulate IDs in HSs’ processing, and how?), we first examined the correlations among all user-internal and user-external factors to avoid multicollinearity in statistical modelling. This step identified strong correlations between Mandarin PPVT score and all other variables, and between current HL exposure and use and cumulative HL exposure and use, among others (see the R script on OSF for more information). As such, aiming to statistically account for the effects of theoretical interest, we decided to include only current HL exposure and use (centred), WM (centred), and inhibition (No-Go cost; centred) as fixed effects interacting with Condition (treatment coded). Given the quantity of data at our disposal, we did not include interaction terms among these background factors in the models. The final converged maximal model has the R syntax lmer(LD_Adv ~ Condition*(Current_Exposure_Use_Mandarin_c + WM_c + NoGo_Cost_c) + (1|Participant) + (1|Item)). We calculated the Variance Inflation Factor (VIF) to check that the model is not affected by problematic multicollinearity (all VIFs < 2). Table 1 summarizes the statistical output where ziji is treated as the reference level for the Condition variable (ziji vs. ta, taziji). It is important to note that, as the categorical variable was treatment-coded, the effects reported in the statistical table represent simple effects, that is, the effect of a variable at the reference level (or the difference from that level), rather than main effects.

Table 1.

The model with Condition (ziji as the reference level) interacting with Current HL Exposure and Use, WM, and NoGo Cost as fixed effects

Predictors	Estimates	SE	CI	t	p
(Intercept)	7.41	0.29	[6.83, 7.98]	25.36	< 0.001
taziji	−15.42	0.41	[−16.23, −14.61]	−37.39	< 0.001
ta	3.46	0.36	[2.76, 4.17]	9.62	< 0.001
Current HL Exposure Use	0.50	0.26	[−0.00, 1.00]	1.95	0.052
WM	1.78	0.25	[1.30, 2.27]	7.25	< 0.001
NoGo Cost	−2.13	0.25	[−2.63, −1.63]	−8.41	< 0.001
taziji × Current HL Exposure Use	−2.62	0.34	[−3.30, −1.95]	−7.62	< 0.001
ta × Current HL Exposure Use	−0.66	0.30	[−1.24, −0.08]	−2.24	0.025
taziji × WM	−2.18	0.35	[−2.86, −1.50]	−6.26	< 0.001
ta × WM	0.14	0.29	[−0.42, 0.70]	0.48	0.628
taziji × NoGo Cost	1.99	0.35	[1.31, 2.67]	5.74	< 0.001
ta × NoGo Cost	2.13	0.29	[1.55, 2.70]	7.27	< 0.001

This model suggests that individual HSs’ look patterns for the ziji condition were modulated by WM and NoGo Cost but not by Current HL Exposure and Use. More specifically, looks to the LD referent increased as WM increased and decreased as NoGo Cost increased. To unpack the significant interaction terms between Condition and WM, NoGo Cost, and Current HL Exposure and Use, we ran post hoc analyses by refitting the model with each level of the categorical variable Condition as the reference level. As can also be seen in Figure 3, the post hoc analyses reveal that participants’ look patterns for the ta condition were modulated by WM, such that participants with higher WM capacity were more likely to look at the LD referent (Estimate = 1.92, SE = 0.24, CI [1.45, 2.39], t = 8.07, p < 0.001). However, look patterns for the ta condition were modulated neither by Current HL Exposure and Use (Estimate = −0.17, SE = 0.25, CI [−0.65, 0.32], t = −0.67, p = 0.51) nor by NoGo Cost (Estimate = −0.01, SE = 0.24, CI [−0.47, 0.47], t = −0.02, p = 0.98). Look patterns for the taziji condition were modulated neither by WM (Estimate = −0.40, SE = 0.31, CI [−1.01, 0.21], t = −1.27, p = 0.20) nor by NoGo Cost (Estimate = −0.14, SE = 0.30, CI [−0.73, 0.46], t = −0.45, p = 0.65), but they were modulated by Current HL Exposure and Use, such that participants with more HL exposure and use were more likely to look at the local referent (Estimate = −2.13, SE = 0.30, CI [−2.72, −1.53], t = −6.99, p < 0.001).
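The link between the treatment-coded coefficients in Table 1 and the post hoc estimates above can be checked by hand: under treatment coding, the slope of a continuous predictor within a non-reference condition is the reference-level slope plus the corresponding interaction term. The sketch below is only an arithmetic illustration in Python, not the authors' analysis (which refitted the model in R); the dictionary keys and the function name `simple_slope` are hypothetical labels, while the numbers are copied from Table 1.

```python
# Fixed-effect estimates copied from Table 1 (ziji is the reference level).
# Keys are hypothetical labels for this sketch, not names from the R script.
REFERENCE = "ziji"
BASE_SLOPES = {"exposure": 0.50, "wm": 1.78, "nogo": -2.13}
INTERACTIONS = {
    ("taziji", "exposure"): -2.62,
    ("ta", "exposure"): -0.66,
    ("taziji", "wm"): -2.18,
    ("ta", "wm"): 0.14,
    ("taziji", "nogo"): 1.99,
    ("ta", "nogo"): 2.13,
}

def simple_slope(condition, predictor):
    """Slope of `predictor` within `condition` under treatment coding."""
    slope = BASE_SLOPES[predictor]
    if condition != REFERENCE:
        # Non-reference levels add their interaction term to the base slope.
        slope += INTERACTIONS[(condition, predictor)]
    return round(slope, 2)

for condition in ("ziji", "ta", "taziji"):
    for predictor in ("exposure", "wm", "nogo"):
        print(condition, predictor, simple_slope(condition, predictor))
```

The sums reproduce the reported post hoc estimates to within 0.01: for example, the WM slope is 1.78 + 0.14 = 1.92 for ta and 1.78 − 2.18 = −0.40 for taziji. The small remaining differences (such as −2.12 here versus the reported −2.13 for exposure in taziji) are expected, because the sketch starts from coefficients already rounded to two decimals. Standard errors cannot be recovered this way, which is one reason to refit the model with a different reference level.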

Figure 3. Effect of WM (left), Inhibition (mid), and Current Exposure and Use of Mandarin on LD advantage.

Figure 3. Long description

A line graph with three panels arranged horizontally. All panels share a y-axis labelled LD_vs_Local, ranging from −10 to 15. A legend at the bottom identifies three conditions: taziji (red), ta (blue), and ziji (green).

* Left panel (Mandarin exposure/use): The x-axis is Current_Exposure_Use_Mandarin, from 0.00 to 0.75. The blue line (ta) remains high and stable near 10. The green line (ziji) shows a slight linear increase from 5 to 10. The red line (taziji) shows a sharp linear decrease from 0 to −15.

* Middle panel (Working memory): The x-axis is WM, from 15 to 30. The blue line (ta) and the green line (ziji) show parallel linear increases, with ta rising from 8 to 14 and ziji rising from 5 to 10. The red line (taziji) is stable and flat near −8.

* Right panel (Inhibition): The x-axis is NoGo_Cost, from −0.2 to 0.2. The blue line (ta) is flat at 11. The red line (taziji) is flat at −8. The green line (ziji) shows a significant linear decrease from 9 to 2.

7. Discussion

The present study investigated how Mandarin-English HSs interpret different types of pronouns in real time and how linguistic-level and individual-level factors shape this process. We focused on a group of late adolescent HSs, an age group that is underrepresented in the literature. Using a web-based visual world eye-tracking paradigm, we examined the online interpretation of three Mandarin pronouns, that is, ta, ziji, and taziji. Importantly, while taziji is governed by narrow syntax, interpreting ta and ziji requires integrating different information sources, with ta requiring syntax-discourse integration and ziji requiring syntax-semantics-discourse integration.

Starting with group-level performance across pronoun types (RQ1), we observed clear distinctions in pronoun processing patterns. Specifically, HSs interpreted taziji as referring to the local antecedent, whereas ta and ziji were interpreted as referring to the LD antecedent. These findings align with those reported by C. Chen (2020), who found L1-dominant-like offline interpretations of ta and taziji among adult HSs. Our results, combined with those of C. Chen (2020), suggest that Mandarin HSs successfully process and interpret pronouns, even when pronominal resolution requires establishing complex linguistic dependencies, such as those that involve LD antecedents and lie at the interface between syntax (semantics) and discourse (as in ta and ziji). This observation is particularly noteworthy in light of claims that linguistic dependencies, especially LD ones, constitute a vulnerable domain in HL grammars (Polinsky & Scontras, 2020). Moreover, under the IH (Sorace, 2011), structures that require integration across grammatical interfaces between syntax and discourse are predicted to be especially prone to attrition or arrested development in bilingual populations. Both ta and ziji fall into this category, as their resolution depends on information at the syntax-(semantics-)discourse interface.

The absence of evidence for such vulnerability in our data, therefore, challenges the scope of these theoretical predictions, highlighting the need to better understand the conditions under which interface-dependent phenomena/LD dependencies may or may not be vulnerable in HSs (Daskalaki et al., 2019; Leal et al., 2014). One plausible explanation for this discrepancy is methodological. Whereas previous studies reporting vulnerability in pronominal interpretation among HSs have primarily used offline comprehension or production tasks and focused on adult HSs (e.g., Kim & Yoon, 2020), our study employed a real-time eye-tracking paradigm and targeted late-adolescent HSs. The findings thus underscore the need for future research that adopts a developmental perspective and systematically varies task modality to capture a more comprehensive picture of HL bilingual processing and representation across the lifespan (Fuchs, 2022).

Additionally, the current study found that ta elicited more looks to the LD referent than ziji, suggesting stronger and more consistent LD resolution for ta across participants. Strikingly, all LD advantage scores for the ta condition were positive, indicating that every participant consistently interpreted ta as referring to the LD antecedent across all items. In contrast, the LD advantage scores for ziji showed much more variability. This suggests that although there was a group-level preference for LD interpretations of ziji, individual participants occasionally interpreted ziji as referring to the local antecedent. This variability is not entirely unexpected, even though all the verbs used in the study were logophoric and thus biased LD readings. It may reflect differences in the linguistic architecture supporting each pronoun. The resolution of ta relies on syntax-discourse interface mechanisms. In contrast, ziji resolution requires integration across multiple interfaces (syntax, semantics, and discourse), potentially making it more susceptible to individual variation, whereby HSs may weigh sources of information differently depending on their individual language experiences and cognitive resources (Hao, Kubota, et al., 2025; Hao, Rossi, et al., 2025; Kim & Yoon, 2020). These findings underscore the importance of examining not only group-level trends but also participant-level variability in HL bilingualism (De Houwer, 2023; Paradis, 2023; Rothman et al., 2023), a point we now turn to in the discussion of RQ2.

Our second research question (RQ2) investigated which user-internal (i.e., WM, inhibition) and user-external (i.e., current HL exposure and use) factors modulate IDs in HSs’ pronoun processing and whether these effects differ across pronoun types. The results reveal distinct patterns of modulation for each pronoun. For taziji, only the user-external factor of current HL exposure and use significantly modulated participants’ look patterns: HSs with greater current exposure to and use of the HL were more likely to fixate on the local antecedent. In contrast, for ta, only the user-internal factor of WM, but not inhibition or HL exposure and use, predicted processing behaviour. That is, participants with higher WM capacity showed stronger preferences for the LD antecedent. Lastly, for ziji, both WM and inhibition modulated look patterns: participants with higher WM capacity and stronger inhibitory control (indicated by lower NoGo cost) showed a greater tendency to fixate on the LD antecedent.

The role of user-internal factors aligns well with our predictions and broader cognitive accounts of sentence processing. Specifically, WM emerged as a key modulator of successful LD resolution for both ta and ziji, but not for taziji. This pattern suggests that pronouns that require resolution across longer syntactic distances or across multiple domains (e.g., syntax, discourse, semantics) impose greater cognitive demands and thus rely more heavily on WM resources. In contrast, the processing of taziji, which is governed primarily by narrow syntactic constraints and typically resolved locally, does not appear to require substantial WM resources. This accords well with findings from Bice and Kroll (2021), who also reported no WM effects on morphosyntactic processing (subject-verb agreement) among HSs, reinforcing the view that local dependencies/narrow syntax may not engage domain-general cognitive systems to the same extent as LD dependencies/interface structures.

Unlike ta, ziji exhibited sensitivity to both WM and inhibition, suggesting that processing ziji may place demands not only on memory resources but also on participants’ ability to manage competing interpretations. One possibility is that inhibition is involved in suppressing CLI. English lacks a direct equivalent of Mandarin ziji, which might lead HSs to map English reflexives onto Mandarin ziji, giving rise to a preference for local interpretations. However, this does not mean that Mandarin HSs have a reduced inventory of pronouns (ta vs. reflexives), a possibility suggested by Polinsky and Scontras (2020), as the current study does show a clear distinction in how HSs process the three different pronouns. Alternatively, or perhaps additionally, inhibition may be required to suppress a dominant or default preference for local binding of ziji, even when the semantic cues from the logophoric verb bias an LD reading. This is supported by prior research suggesting that local interpretations of ziji are more accessible and less costly to process (Dillon et al., 2016; Lyu & Kaiser, 2021). Under such an account, participants with better inhibitory control were more successful in suppressing the local (default) interpretation, especially when it conflicted with the logophoric semantics of the main verbs.

The fact that inhibition modulated ziji but not ta, despite both requiring LD interpretation, further suggests that ziji may involve more interpretive conflict. An open question, however, concerns which of these mechanisms, that is, CLI versus default local binding preference, is more influential in shaping ziji interpretation among HSs. Future studies could address this question by directly manipulating verb type (e.g., logophoric vs. generic), thereby testing whether the availability of strong semantic cues reduces inhibitory demands in ziji interpretation. Additionally, examining the processing of other LD dependencies that differ in across-language similarities/differences and within-language defaults may help dissociate CLI effects from interpretive biases.

Lastly, regarding the effect of the user-external factor, that is, current HL exposure and use, we found that it modulated look patterns only when HSs processed taziji, but not ta or ziji. While it is seemingly intuitive that more HL exposure and use would lead to better performance in general, as observed in offline tasks (Chondrogianni & Daskalaki, 2023; Daskalaki et al., 2019; Kubota et al., 2025; Paradis et al., 2020; Soto-Corominas et al., 2022), findings from online processing studies have been mixed. HL input effects have been reported in adult HSs (e.g., Hao, Kubota, et al., 2025; Hao, Rossi, et al., 2025), yet such effects have been shown to be absent in children (e.g., Hao et al., 2024). While one interpretation is that HL input effects are developmentally mediated (i.e., as HSs grow older, and possibly as the average HS becomes more distant from the HL in multiple senses, IDs in exposure become (more) deterministic), the current findings suggest a more nuanced possibility: HL input may have selective effects on real-time processing depending on the grammatical domain.

Specifically, we propose that structures involving interface-level integration and/or LD dependencies (ziji and ta) place greater demands on domain-general cognitive resources. As such, their processing may rely less on the amount of HL input alone and more on individual cognitive capacities like WM and inhibition. In contrast, narrow syntactic structures, such as taziji, which is constrained by Binding Principle A and requires local binding, may benefit more directly and robustly from increased HL exposure and use, as these structures are more rule-governed, frequent in input, and less cognitively taxing.

This interpretation, however, stands in contrast to the predictions of the IH (Sorace, 2011), which posits that interface phenomena are more vulnerable to variability and should be more sensitive to input (reduction). One possible resolution is to distinguish between representation and processing efficiency, especially when most studies supporting the IH come from offline measures. It may be that increased HL use enhances the automaticity and efficiency of processing well-established syntactic representations (e.g., Hao et al., 2024; Hao, Kubota, et al., 2025) but is less effective at resolving the more variable and inferential demands of interface phenomena, where cognitive effort, rather than input frequency, may be the bottleneck.

There is evidence from the present study that supports this view. When we examined individual participants’ LD versus local antecedent preferences for taziji across items, we found that all participants consistently interpreted taziji as referring to the local antecedent, as LD advantage scores were uniformly negative. This suggests categorical application of Binding Principle A. HL exposure and use may therefore not have influenced which interpretation participants arrived at, but rather how efficiently they processed and resolved the dependency in real time. In other words, HL exposure may have facilitated more rapid retrieval or application of binding constraints, even when interpretive outcomes were uniform across participants. Future research could test this by manipulating HL use, comparing real-time processing and interpretive accuracy across pronoun types, using longitudinal or training designs, and varying verb semantics to isolate input effects from cognitive demands.

Overall, the current findings strongly suggest that not all interface phenomena/LD dependencies are equally vulnerable, nor do they uniformly respond to input variation or cognitive factors (Leal et al., 2014). Instead, this study highlights the nuanced role cognitive and experiential factors play in shaping real-time pronoun processing among adolescent HSs.

Data availability statement

Supplementary materials, including the full experimental lists and the data that support the findings of this study, are openly available in OSF at https://osf.io/xs23z.

Acknowledgements

We thank the participants who made this research possible. Our special thanks go to the enthusiastic families who advertised the study on our behalf.

Funding statement

This project was funded by the European Union’s Horizon Europe research and innovation programme under the Marie Sklodowska-Curie grant agreement No 101104834, and the Trond Mohn Foundation, under the Center for Language, Brain, and Learning (C-LaBL) grant No. TMS2023UiT01. Views and opinions expressed are, however, those of the author(s) only and do not necessarily reflect those of the European Union. Neither the European Union nor the granting authority can be held responsible for them.

Competing interests

The author(s) declare none.

Disclosure of use of AI tools

None declared.

Ethics statement

The Research Ethics Committee at the Faculty of Humanities, Social Sciences, and Education at UiT The Arctic University of Norway has assessed the study protocol, including the methodology, recruitment of participants, data processing, as well as the information letter and informed consent. The study protocol is approved by the committee in accordance with the Guidelines for Research Ethics in the Social Sciences and the Humanities (Ref: 12–2024).

References

Alloway, T. P., Gathercole, S. E., Kirkwood, H., & Elliott, J. (2008). Evaluating the validity of the Automated Working Memory Assessment. Educational Psychology, 28(7), 725–734. https://doi.org/10.1080/01443410802243828

Anwyl-Irvine, A. L., Massonnié, J., Flitton, A., Kirkham, N., & Evershed, J. K. (2020). Gorilla in our midst: An online behavioral experiment builder. Behavior Research Methods, 52(1), 388–407. https://doi.org/10.3758/s13428-019-01237-x

Barr, D. J., Levy, R., Scheepers, C., & Tily, H. J. (2013). Random effects structure for confirmatory hypothesis testing: Keep it maximal. Journal of Memory and Language, 68(3), 255–278. https://doi.org/10.1016/j.jml.2012.11.001

Bates, D., Mächler, M., Bolker, B., & Walker, S. (2015). Fitting linear mixed-effects models using lme4. Journal of Statistical Software, 67(1), 1–48. https://doi.org/10.18637/jss.v067.i01

Bayram, F., Di Pisa, G., Rothman, J., & Slabakova, R. (2021). Current trends and emerging methodologies in charting heritage language grammars. In Montrul, S. & Polinsky, M. (Eds.), The Cambridge handbook of heritage languages and linguistics (1st ed., pp. 545–578). Cambridge University Press. https://doi.org/10.1017/9781108766340.025

Bice, K., & Kroll, J. F. (2021). Grammatical processing in two languages: How individual differences in language experience and cognitive abilities shape comprehension in heritage bilinguals. Journal of Neurolinguistics, 58, 100963. https://doi.org/10.1016/j.jneuroling.2020.100963

Chen, C. (2020). The acquisition of Mandarin relative clauses and binding by heritage speakers and second language learners. Proceedings of the 44th Boston University Conference on Language Development, 77–90.

Chen, C., & Ionin, T. (2023). Interpretation of Mandarin pronouns and reflexives by L1-Korean and L1-English learners of Mandarin. Second Language Research, 39(4), 941–968. https://doi.org/10.1177/02676583221103744

Chien, Y.-C. (1992). Theoretical implications of the principles and parameters model for language acquisition in Chinese. In Advances in psychology (Vol. 90, pp. 313–345). Elsevier. https://doi.org/10.1016/S0166-4115(08)61896-8

Chien, Y.-C., & Lust, B. (2006). Chinese children’s knowledge of the Binding Principles. In Li, P., Tan, L. H., Bates, E., & Tzeng, O. J. L. (Eds.), The handbook of East Asian psycholinguistics (1st ed., pp. 23–38). Cambridge University Press. https://doi.org/10.1017/CBO9780511550751.004

Chien, Y.-C., & Wexler, K. (1990). Children’s knowledge of locality conditions in Binding as evidence for the modularity of syntax and pragmatics. Language Acquisition, 1(3), 225–295. https://doi.org/10.1207/s15327817la0103_2

Chomsky, N. (1993). Lectures on government and binding: The Pisa lectures (7th ed.). De Gruyter Mouton. https://doi.org/10.1515/9783110884166

Chondrogianni, V. (2023). Cross-linguistic influences in bilingual morphosyntactic acquisition. In Elgort, I., Siyanova-Chanturia, A., & Brysbaert, M. (Eds.), Cross-linguistic influences in bilingual morphosyntactic acquisition (Vol. 16, pp. 294–315). John Benjamins Publishing Company.

Chondrogianni, V., & Daskalaki, E. (2023). Heritage language use in the country of residence matters for language maintenance, but short visits to the homeland can boost heritage language outcomes. Frontiers in Language Sciences, 2, 1230408. https://doi.org/10.3389/flang.2023.1230408

Clackson, K., Felser, C., & Clahsen, H. (2011). Children’s processing of reflexives and pronouns in English: Evidence from eye-movements during listening. Journal of Memory and Language, 65(2), 128–144. https://doi.org/10.1016/j.jml.2011.04.007

Cole, P., Hermon, G., & Huang, C.-T. J. (2006). Long-distance binding in Asian languages. In Everaert, M. & van Riemsdijk, H. (Eds.), The Blackwell companion to syntax (pp. 21–84). Wiley. https://doi.org/10.1002/9780470996591.ch39

Cunnings, I. (2017). Parsing and working memory in bilingual sentence processing. Bilingualism: Language and Cognition, 20(4), 659–678. https://doi.org/10.1017/S1366728916000675

Daskalaki, E., Chondrogianni, V., Blom, E., Argyri, F., & Paradis, J. (2019). Input effects across domains: The case of Greek subjects in child heritage language. Second Language Research, 35(3), 421–445. https://doi.org/10.1177/0267658318787231

De Cat, C., Kašćelan, D., Prévost, P., Serratrice, L., Tuller, L., Unsworth, S., & The Q-BEx Consortium. (2023). How to quantify bilingual experience? Findings from a Delphi consensus survey. Bilingualism: Language and Cognition, 26(1), 112–124. https://doi.org/10.1017/S1366728922000359

De Houwer, A. (2023). The danger of bilingual–monolingual comparisons in applied psycholinguistic research. Applied Psycholinguistics, 44(3), 343–357. https://doi.org/10.1017/S014271642200042X

Dillon, B., Chow, W.-Y., & Xiang, M. (2016). The relationship between anaphor features and antecedent retrieval: Comparing Mandarin Ziji and Ta-Ziji. Frontiers in Psychology, 6. https://doi.org/10.3389/fpsyg.2015.01966

Dunn, L. M., & Dunn, D. M. (2012). Peabody Picture Vocabulary Test—Fourth Edition [Dataset]. https://doi.org/10.1037/t15144-000

Fuchs, Z. (2022). Eyetracking evidence for heritage speakers’ access to abstract syntactic agreement features in real-time processing. Frontiers in Psychology, 13, 960376. https://doi.org/10.3389/fpsyg.2022.960376

Grodzinsky, Y., & Reinhart, T. (1993). The innateness of binding and coreference. Linguistic Inquiry, 24, 69–102.

Hao, J., & Chondrogianni, V. (2024). Comprehension and production of non-canonical word orders in Mandarin-speaking child heritage speakers. Linguistic Approaches to Bilingualism, 13(4), 468–499. https://doi.org/10.1075/lab.20096.hao

Hao, J., Chondrogianni, V., & Sturt, P. (2024). Heritage language development and processing: Non-canonical word orders in Mandarin–English child heritage speakers. Bilingualism: Language and Cognition, 27(3), 334–349. https://doi.org/10.1017/S1366728923000639

Hao, J., Kubota, M., Bayram, F., González Alonso, J., Grüter, T., Li, M., & Rothman, J. (2025). Schooling and home language usage matter in heritage bilingual processing: Sortal classifiers in Mandarin. Second Language Research, 41(4), 649–674. https://doi.org/10.1177/02676583241270900

Hao, J., Rossi, E., Nakamura, M., Luque, A., & Rothman, J. (2025). Individual differences matter in heritage language bilingual processing: An electroencephalography (EEG) study of grammatical gender. Studies in Second Language Acquisition, 47(5), 1230–1249. https://doi.org/10.1017/S0272263125101149

Huang, C. T. J., & Liu, C. L. (2000). Logophoricity, attitudes, and Ziji at the interface. Syntax and Semantics, 33, 141–195. https://doi.org/10.1108/S0092-4563(2000)0000033007

Huang, Y. T., Zheng, X., Meng, X., & Snedeker, J. (2013). Children’s assignment of grammatical roles in the online processing of Mandarin passive sentences. Journal of Memory and Language, 69(4), 589–606. https://doi.org/10.1016/j.jml.2013.08.002

Karaca, F., Brouwer, S., Unsworth, S., & Huettig, F. (2024). Morphosyntactic predictive processing in adult heritage speakers: Effects of cue availability and spoken and written language experience. Language, Cognition and Neuroscience, 39(1), 118–135. https://doi.org/10.1080/23273798.2023.2254424

Kim, E. H., & Yoon, J. (2020). Experimental evidence supporting the overlapping distribution of core and exempt anaphors: Re-examination of long-distance bound caki-casin in Korean. Linguistics, 58(6), 1775–1806. https://doi.org/10.1515/ling-2020-0233

Korkus, M.-L., & Vihman, V.-A. (2024). Adolescent heritage speakers: Morphosyntactic divergence in Estonian Youth Language Usage in Sweden. Languages, 9(12), 366. https://doi.org/10.3390/languages9120366

Kubota, M., Goto, Y., Kurokawa, S., Matsuoka, Y., Otani, M., & Rothman, J. (2025). Different variables hold varying significance from childhood to adolescence. Studies in Second Language Acquisition, 47(1), 104–135. https://doi.org/10.1017/S0272263124000615

Kupisch, T., & Rothman, J. (2018). Terminology matters! Why difference is not incompleteness and how early child bilinguals are heritage speakers. International Journal of Bilingualism, 22(5), 564–582. https://doi.org/10.1177/1367006916654355

Leal, T., Rothman, J., & Slabakova, R. (2014). A rare structure at the syntax-discourse interface: Heritage and Spanish-Dominant native speakers weigh in. Language Acquisition, 21(4), 411–429. https://doi.org/10.1080/10489223.2014.892946

Lyu, J., & Kaiser, E. (2021). Unpacking the blocking effect: Syntactic prominence and perspective-taking in antecedent retrieval in Mandarin Chinese. Glossa: A Journal of General Linguistics, 6(1). https://doi.org/10.16995/glossa.5781

MacWhinney, B. (2018). A unified model of first and second language learning. In Hickmann, M., Veneziano, E., & Jisa, H. (Eds.), Trends in Language Acquisition Research (Vol. 22, pp. 287–312). John Benjamins Publishing Company.

McElreath, R. (2020). Statistical rethinking: A Bayesian course with examples in R and Stan (2nd ed.). Chapman and Hall/CRC. https://doi.org/10.1201/9780429029608

Minkov, M., Kagan, O., Protassova, E., & Schwartz, M. (2019). Towards a better understanding of a continuum of heritage language proficiency: The case of adolescent Russian Heritage Speakers. Heritage Language Journal, 16(2), 211–237. https://doi.org/10.46538/hlj.16.2.5

Montrul, S. (2016). The acquisition of heritage languages. Cambridge University Press.Google Scholar

Montrul, S., & Polinsky, M. (Eds.). (2021). Research approaches to heritage languages: In The Cambridge Handbook of Heritage languages and linguistics (1st ed., pp. 373–578). Cambridge University Press.10.1017/9781108766340.017CrossRef Google Scholar

Papoutsaki, A., Sangkloy, P., Laskey, J., Daskalova, N., Huang, J., & Hays, J. (2016). Webgazer: Scalable webcam eye tracking using user interactions. Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence, 3839–3845.Google Scholar

Paradis, J. (2023). Sources of individual differences in the dual language development of heritage bilinguals. Journal of Child Language, 50(4), 793–817. https://doi.org/10.1017/S0305000922000708CrossRef Google Scholar PubMed

Paradis, J., Soto-Corominas, A., Chen, X., & Gottardo, A. (2020). How language environment, age, and cognitive capacity support the bilingual development of Syrian refugee children recently arrived in Canada. Applied Psycholinguistics, 41(6), 1255–1281. https://doi.org/10.1017/S014271642000017XCrossRef Google Scholar

Polinsky, M. (2018). Bilingual children and adult heritage speakers: The range of comparison. International Journal of Bilingualism, 22(5), 547–563. https://doi.org/10.1177/1367006916656048CrossRef Google Scholar

Polinsky, M., & Scontras, G. (2020). Understanding heritage languages. Bilingualism: Language and Cognition, 23(1), 4–20. https://doi.org/10.1017/S1366728919000245CrossRef Google Scholar

R Core Team. (2018). R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing. https://www.R-project.org/Google Scholar

Reuland, E. (2006). Binding theory. In Everaert, M. & van Riemsdijk, H. (Eds.), The blackwell companion to syntax (pp. 260–283). Wiley.10.1002/9780470996591.ch9CrossRef Google Scholar

Rothman, J. (2009). Understanding the nature and outcomes of early bilingualism: Romance languages as heritage languages. International Journal of Bilingualism, 13(2), 155–163. https://doi.org/10.1177/1367006909339814CrossRef Google Scholar

Rothman, J., Bayram, F., DeLuca, V., Di Pisa, G., Duñabeitia, J. A., Gharibi, K., Hao, J., Kolb, N., Kubota, M., Kupisch, T., Laméris, T., Luque, A., Van Osch, B., Pereira Soares, S. M., Prystauka, Y., Tat, D., Tomić, A., Voits, T., & Wulff, S. (2023). Monolingual comparative normativity in bilingualism research is out of “control ”: Arguments and alternatives. Applied Psycholinguistics, 44(3), 316–329. https://doi.org/10.1017/S0142716422000315CrossRef Google Scholar

Sorace, A. (2011). Pinning down the concept of “interface” in bilingualism. Linguistic Approaches to Bilingualism, 1(1), 1–33. https://doi.org/10.1075/lab.1.1.01sorCrossRef Google Scholar

Soto-Corominas, A., Daskalaki, E., Paradis, J., Winters-Difani, M., & Janaideh, R. A. (2022). Sources of variation at the onset of bilingualism: The differential effect of input factors, AOA, and cognitive skills on HL Arabic and L2 English syntax. Journal of Child Language, 49(4), 741–773. https://doi.org/10.1017/S0305000921000246CrossRef Google Scholar PubMed

Thornton, R., & Wexler, K. (1999). Principle B, VP ellipsis, and interpretation in child grammar. MIT Press.10.7551/mitpress/5550.001.0001CrossRef Google Scholar

Tsimpli, I. M. (2014). Early, late or very late?: Timing acquisition and bilingualism. Linguistic Approaches to Bilingualism, 4(3), 283–313. https://doi.org/10.1075/lab.4.3.01tsiCrossRef Google Scholar

Wang, Y., & Pan, H. (2021). Chinese reflexives. In Wang, Y. & Pan, H. (Eds.), Oxford Research Encyclopedia of Linguistics. Oxford University Press. https://doi.org/10.1093/acrefore/9780199384655.013.887

Woodard, K., Pozzan, L., & Trueswell, J. (2016). Taking your own path: Individual differences in executive function and language processing skills in child learners. Journal of Experimental Child Psychology, 141, 187–209. https://doi.org/10.1016/j.jecp.2015.08.005

Figure 1. Example of a visual scene in the visual world eye-tracking experiment.

Figure 2. Difference in proportion fixations to LD versus local referent by Condition.

Table 1. The model with Condition (ziji as the reference level) interacting with Current HL Exposure and Use, WM, and NoGo Cost as fixed effects.

Figure 3. Effect of WM (left), Inhibition (mid), and Current Exposure and Use of Mandarin on LD advantage.

