1. Introduction
Language representation during sentence processing has been extensively studied in monolingual speakers, but with over half the world speaking more than one language (Grosjean, 2010), it is important to understand how bilinguals process and represent language. Bilinguals do not entirely separate their languages, as evidenced by language contact phenomena and bilingual code-switching (Myers-Scotton & Jake, 2009). Some researchers suggest bilinguals process or produce languages with a co-activation mechanism, drawing on knowledge of both languages stored in one mind (e.g., Goldrick et al., 2016; Veríssimo, 2016). Theoretical models of bilingualism have also proposed that the bilingual language faculty consists of either a shared computational system or a shared vocabulary inventory that indexes entities with feature bundles in syntax (López, 2020; MacSwan, 2000). These models are grounded in frameworks like the Minimalist Program (Chomsky, 1995), where items from the mental lexicon are merged into structures via grammatical operations and sent to phonological and semantic modules for comprehension or production. Such models rely heavily on empirical psycholinguistic evidence, emphasizing the need for further research into bilingual language representation.
Studies on cross-linguistic syntactic priming have provided empirical evidence for the hypothesis of a shared-syntax system (Hartsuiker, 2013; Hartsuiker et al., 2004; Hartsuiker & Bernolet, 2017; Hopp & Grüter, 2023; Kidd et al., 2015; Loebell & Bock, 2003; Muylle et al., 2021). Many researchers have also used electrophysiological and eye-tracking methodologies with various sentence-processing tasks. Based on the results, it is argued that one language can be activated automatically during processing of another simultaneously (or very early) acquired language (Luque et al., 2018; Rankin et al., 2016; Sanoudaki & Thierry, 2014, 2015; Thierry & Wu, 2007; Vaughan-Evans et al., 2014, 2020).
It remains unclear how sentence processing in a first or dominant language adapts to experiences with a second or less dominant language. Further research is also needed to extend the shared-syntax hypothesis to more complex structures and diverse language pairs. This study addresses these issues by testing English–French bilinguals in a grammatical maze task (Witzel et al., 2012), focusing on adverb placement rules, which differ between the two languages due to distinct head movement operations across multiple syntactic nodes in deep structure (Embick & Noyer, 2001; Pollock, 1989). These rules involve more complex structures than the adjective placement differences studied in Welsh–English bilinguals (Sanoudaki & Thierry, 2014, 2015). Specifically, we present English sentences that reflect both English and French adverb placement orders to English–French bilinguals and investigate whether the two different word orders are co-activated as participants read. We further explore whether the age of immersion (AoI) in French modulates co-activation patterns.
2. Background
2.1. The shared-syntax hypothesis
Early bilinguals (or highly proficient bilinguals in some studies) are argued to maintain a shared mental lexicon (i.e., a collection of entries for known words) for their two spoken languages, from the perspectives of both performance and competence (DeAnda et al., 2016; Hartsuiker et al., 2004; Kroll & Stewart, 1994; López, 2020; MacSwan, 2000; Sabourin et al., 2014, 2016a). However, the organization of bilingual syntactic systems seems more complicated.
From a production perspective, research on bilingual grammar integration often relies on syntactic priming, that is, the facilitation of processing or producing a structure after exposure to a similar one (Bock, 1986). Cross-language syntactic priming occurs when this effect happens between similar structures in different languages. Loebell & Bock (2003) tested German–English bilinguals on active (similar word order) and passive (different word order) sentences, having participants hear a sentence in one language and then describe a picture in the other. They found significant priming for active but not passive sentences, suggesting that cross-language priming, and thus co-activation, is more likely when structures share similar word order.
Likewise, Hartsuiker et al. (2004) found that Spanish–English bilinguals produced more English passives after hearing a similarly structured Spanish passive than after active or intransitive Spanish sentences, suggesting shared representation due to similar word order. In another study, Muylle et al. (2021) trained Dutch speakers on an artificial language with both SVO and SOV word orders, finding priming for the similar cross-language SVO sentences but not for SOV sentences. These results support Hartsuiker & Bernolet’s (2017) hypothesis that L2 development leads to increasingly shared representations between L1 and L2. They also suggest that early acquisition may play a key role in bilingual grammatical sharing.
The shared-syntax hypothesis has also been examined from a comprehension perspective. Kidd et al. (2015) found that L2 German processing was facilitated by similar structures in L1 English. Hopp and Grüter (2023) used cross-linguistic structural priming to distinguish syntactic transfer from general L2 effects, comparing L1 German and Japanese learners of English. They found differing L1 influences, attributing variation to grammatical differences between the languages. Their results highlight that L1 transfer significantly affects L2 processing, underscoring the need for both general L2 processing theories and language-specific models of grammatical representation.
Other studies have explored co-activation effects on sentence comprehension using event-related potentials (ERPs). The N2 component, which typically signals attentional shifts to non-target stimuli (Luck & Hillyard, 1994), has been used because studies with picture–sentence matching tasks have shown that N2 patterns can reveal subtle cognitive processes in bilingual sentence processing (Luque et al., 2018; Sanoudaki & Thierry, 2014, 2015; Thierry & Wu, 2007). Sanoudaki & Thierry (2015) tested high- and low-fluency Welsh–English bilinguals using a picture–sentence matching task with grammatical and ungrammatical adjective–noun structures. The task was designed as a special version of a go/no-go task (known to elicit the N2 component). High-fluency bilinguals showed a significant N2 modulation specifically for the ungrammatical noun–adjective word order of English phrases, indicating nonselective activation of Welsh grammar during English processing. Low-fluency bilinguals and monolingual participants did not show this modulation. While these results support grammar co-activation, stronger evidence would come from less controlled reading paradigms or more complex syntactic structures, such as adverb placement. Further research is also needed to test co-activation across a wider range of syntactic forms.
Vaughan-Evans et al. (2020) conducted an eye-tracking study in natural reading mode: participants were English monolinguals and highly proficient Welsh–English bilinguals who had a mean age of acquisition (AoA) of 3.9 years for L2 English. The authors investigated whether Welsh grammar was co-active when bilingual participants read English sentences. English stimuli were manipulated according to the soft mutation rules of Welsh. One of these rules is that voiceless stops such as /p, t, k/ are voiced when they occur in the initial position of a feminine noun preceded by the definite article y ‘the’ (e.g., cannwyll ‘candle’ ➔ y gannwyll ‘the candle’). When the noun is masculine, this process is not triggered (e.g., teledu ‘television’ ➔ y teledu ‘the television’). Bilinguals showed a rereading time difference between the mutation and non-mutation conditions, which indicates both reanalysis and grammatical interference from Welsh; this difference was not seen in monolingual English speakers. The results were interpreted as an indicator of active Welsh grammar even when the target word was English and integrated into the English phrase structure. Findings from these studies suggest that bilinguals with high fluency or early language acquisition tend to show grammatical co-activation during processing.
Most studies argue for a shared bilingual syntactic system: representations of one syntactic structure would be shared between two languages (even when they have different word orders). However, there are still open questions. The first concern is whether and how this hypothesis can be tested in other paradigms and generalized to other language pairs, syntactic structures and so forth. Previous studies have tested bilinguals of German–English, Japanese–English, German–Dutch, Spanish–English and Welsh–English with simple SVO structures, head-directionality of noun phrases, phono-syntactic rules (e.g., soft mutation in Welsh), subordinate clauses and so forth. No one has tested the co-activation effect with English–French bilinguals, nor with adverb placement (which is directly affected by syntactic movement). Secondly, it is unclear how within-group individual differences (such as AoA) could affect the blending of syntactic representations. Even late learners have shown a cross-language syntactic priming effect in production (e.g., Muylle et al., 2021), whereas early language acquisition and fluency appear important for comprehension (Sanoudaki & Thierry, 2015; Vaughan-Evans et al., 2020). Therefore, if stronger L2 knowledge facilitates the formation of more prominent shared representations, as Hartsuiker & Bernolet (2017) propose, our question is as follows: Does this adaptive change follow the change of neural plasticity from birth to puberty or post-puberty?
The maturation or age factor has been shown to be critical for the ultimate attainment of both L1 and L2 grammatical knowledge (Birdsong, 1992; Johnson & Newport, 1989, 1991; Lenneberg, 1967). On this view, grammatical competence will not reach a native-like level if a language is acquired during or after puberty, or even after 6–7 years of age. As the current study focuses on grammatical co-activation during sentence processing of the L1 or more dominant language, the AoA effect of the L2 or less dominant language cannot be ignored.
Considering that grammatical competence is the basic required input for the performance of syntactic parsing (Frazier, 1978; Frazier & Fodor, 1978), L2 AoA can be expected to shape bilingual speakers’ L2 sentence processing. According to the shallow structure hypothesis (Clahsen & Felser, 2006a, 2006b, 2018), L2 sentence processing is non-native-like for adult language learners: L2 learners with late AoA tend to rely more on declarative memory than on procedural memory when processing a sentence in the L2 (Ullman, 2001). Early bilinguals tend to be more native-like in their L2 sentence processing (Wartenburger et al., 2003; Weber-Fox & Neville, 1996). However, the results of these studies could not address how the sentence-processing mechanisms of the L1 or more dominant language can be affected by AoA of the L2 or less dominant language, especially for bilingual speakers who frequently use both languages in a bilingual community.
Previous studies have suggested that adaptive changes will occur in one’s L1 performance due to competition from the L2 (see Kroll et al., 2014 for a review). It has also been found that the co-activation of the interlinked bilingual lexicon starts quite early in life (DeAnda et al., 2016). For more balanced bilinguals who are exposed to two languages in their daily life, it seems that the AoI and manner of acquisition (MoA) of one language are prominent in both languages’ development and executive functions. A typical case is the English–French bilinguals residing in the Ottawa area in Ontario, Canada, who are immersed early in a French program in elementary school (Sabourin et al., 2014, 2016a). AoI is here defined as the starting age of being exposed to a language at least 20% of the day in a balanced bilingual community. The MoA refers to how a language is acquired. The school setting environment (i.e., classroom exposure) is usually categorized as a formalistic MoA, while exposure to a language in the home and community environments represents a naturalistic MoA (Sabourin et al., 2016b). In the literature, it was shown that having an early AoI in French was critical for bilinguals to have an L2 to L1 lexical priming effect, suggesting a shared bilingual lexicon in very early English–French bilinguals (Sabourin et al., 2014). Therefore, such individual differences should be considered during the investigation of the bilingual language faculty and bilingual syntactic integration/co-activation.
2.2. The current study and adverb placement
To test whether and how bilinguals unconsciously activate the grammars of their two spoken languages during reading comprehension of the L1 or more dominant language, we investigate how English–French bilinguals process the adverb placement rule in English sentences in a grammatical maze (G-maze) task. As preregistered in OSF (Footnote 1), the investigated predictors are AoI and MoA, which are two important variables in previous studies of balanced English–French bilinguals. In the current paper, we first address the AoI effect (Footnote 2).
Frequency and manner adverbs are distributed differently in French and English. In English, frequency and manner adverbs cannot intervene between the main verb and an object (e.g., John often watches TV; *John watches often TV). In contrast, these two types of adverbs in French can only be used post-verbally (Footnote 3): John regarde souvent la télévision (*John souvent regarde la télévision) ‘John often watches TV’. Notably, the usage frequencies seem to differ between frequency and manner adverbs in Québec French (Lealess, 2014). Specifically, manner adverbs are more variable than frequency adverbs in whether they are located at the MID-VP (44%) or post-verbal (56%) position (Lealess, 2014, p. 149). The MID-VP usage (i.e., an adverb following an auxiliary verb but preceding the main verb) is more like the distribution of its English counterpart (see Waters, 2011, for a study on English adverb distributions). Frequency adverbs, on the other hand, are used post-verbally in the large majority of cases in French (83%). This difference could correlate with the different distributions of these two adverb types in French grammar. Therefore, adverb type might add potential variability to the predictions of how the two grammars compete when bilinguals process this structure.
It has been shown that English adverb placement is not easy for L1-French-speaking children without long-term immersion in an L2 English environment. Their L2 (English) competence and use of adverb placement were thought to reflect their L1 (French) order (White, 1990, 1991). This type of pattern has also been found in L1 Cantonese–L2 English speakers (Chan, 2004) and heritage Spanish speakers living in the USA (Camacho & Kirova, 2018). In these studies, the task most often used to test grammatical competence in a second language is the grammaticality judgment task (GJT), an offline measure that reveals little about syntactic representations during online automatic processing. The current study explores the co-activation patterns of English–French bilinguals through a maze task testing online sentence processing. For frequency and manner adverbs in a verb phrase domain, we assume that English only allows the word order of adverb preceding verb and object (advVO), while French only allows the order of adverb following verb (VadvO). Therefore, our manipulation of the English stimuli creates two conditions. The English word order condition is realized as the advVO order, like example 1(a), while the French word order condition is realized as the VadvO order, like example 1(b).
We hypothesize that while all participants will find the French word order less acceptable than the English word order, bilinguals will process these two conditions differently from monolingual English speakers. Considering the effect of AoI, we expect that bilinguals will process the verb phrase *watches often as being more natural than English monolinguals will. Such a pattern would count as evidence for a shared grammar of the English and French orders of adverb placement. Additionally, we expect an earlier AoI in French to result in more co-activation of the grammars when French proficiency is controlled for. In other words, we ask two research questions (RQs).
RQ (1). Are English–French bilinguals less sensitive than English monolinguals to the syntactic violation of the French adverb placement rule used in English sentences?
RQ (2). Does AoI continuously predict the processing times of English–French bilinguals?
3. Methods
3.1. Experimental design
To address the RQs, we elicited data from a maze task, which can indicate the cost of incremental sentence processing in a self-paced reading (SPR) mode (Forster et al., 2009). Witzel et al. (2012) found that the effect size of ambiguous sentence processing in a grammatical maze task is as prominent as the one elicited in the eye-tracking approach and larger than that of an SPR task. Therefore, a maze task is applied as the primary method for investigating behavioral performance in syntactic co-activation. The findings of the current study contribute to a broader picture that will later be combined with neural activity studies.
On each trial in the maze task, participants were presented with a sentence phrase by phrase. Each target phrase was paired with a distractor, a non-target word or chunk of words (see Figure 1 for the time frames of a trial). The target phrase and distractor appeared randomly on either the left or the right side of the screen. Participants had to select either the left or the right phrase (or word, if there was only one word in that time frame) to continue the sentence. To increase the ecological validity of natural processing, a contextual sentence preceded each target sentence. For instance, as shown in Figure 1, following the contextual sentence There is a river nearby their house, the target sentence The boy always catches big fish in that river would be read in a self-paced manner in four time frames (i.e., four sequential regions): The boy / always catches / big fish / in that river. The paired distractors for these four regions are +++ / there are / went down / went to and, each of which usually contains the same number of words as the correct phrase (i.e., the target) in the sentence. A pair with the symbol ‘+++’ indicates the start of a new trial, which is the subject noun phrase of a sentence in this case. In region 2, participants should choose always catches or catches always (in the French word order condition) instead of there are, which, unlike always catches/catches always, is not a grammatical continuation of the subject The boy. By pressing a particular button (i.e., either the letter ‘F’ or ‘J’ on the keyboard) to select the left or right phrase in a time frame, participants were directed to the next chunk of word(s) or time frame.

Figure 1. A schematic view of the maze task.
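The trial structure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Gorilla implementation used in the study: the names Frame, present and is_correct are hypothetical, and the mapping of ‘F’ to the left phrase and ‘J’ to the right phrase is assumed from the keyboard layout rather than stated in the text.

```python
import random
from dataclasses import dataclass


@dataclass
class Frame:
    target: str      # grammatical continuation of the sentence
    distractor: str  # non-target word(s) shown alongside the target


# Example trial from the text (English word order condition).
TRIAL = [
    Frame("The boy", "+++"),
    Frame("always catches", "there are"),
    Frame("big fish", "went down"),
    Frame("in that river", "went to and"),
]


def present(frame, rng):
    """Place the target on a randomly chosen side of the screen."""
    target_side = rng.choice(["left", "right"])
    if target_side == "left":
        left, right = frame.target, frame.distractor
    else:
        left, right = frame.distractor, frame.target
    return {"left": left, "right": right, "target_side": target_side}


def is_correct(display, key):
    """Score a key press. ASSUMPTION: 'F' selects left, 'J' selects right."""
    side = {"F": "left", "J": "right"}[key.upper()]
    return side == display["target_side"]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
displays = [present(frame, rng) for frame in TRIAL]
```

Each element of displays corresponds to one time frame in Figure 1; a response time would be recorded alongside the outcome of is_correct for that frame.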
It was predicted that monolingual speakers would have longer reaction times (RTs) processing region 2 of the French word order condition (such as *watches often in *John watches often television at home) than the English word order condition (such as often watches). However, early bilinguals (Footnote 4) should have a smaller or even non-significant between-condition difference compared to late bilinguals, who would in turn have a smaller between-condition difference than monolingual speakers. This pattern of effect difference among the three groups should reflect that monolingual speakers would have the largest surprise processing the ungrammatical French word order condition, while early bilinguals would have the lowest level of surprise. Also, the AoI should continuously predict the between-condition difference of RTs of bilingual participants. Specifically, bilinguals with later AoI in French should have larger between-condition differences of RTs on the critical region phrase (often watches versus *watches often).
In addition to the maze task, a separate GJT in English was designed for all participants to complete after the maze task. The main purpose of this task was to evaluate their grammatical knowledge (based on their offline responses) in addition to looking at their online processing behaviors in the maze task. In the GJT, participants were required to click ‘Yes’ or ‘No’ as fast as they could to indicate whether the presented sentence was grammatical or not.
3.2. Participants
We proposed to recruit 30 participants for each group in the preregistration. The effect sizes and post hoc power were computed in R with the sjstats package (Lüdecke, 2022). Ultimately, all participants (N = 114) tested were native speakers of English who were current undergraduate students or employees of the University of Ottawa. The student participants were recruited through the Integrated System of Participation in Research (ISPR) Student Pool of the University of Ottawa, and each of them was granted one credit point after participation. All participants were born into an English-speaking family where at least one of their caregivers was a native speaker of English. These English speakers were put into one of three groups. One group (n = 53: 49 females, 4 males) consisted of early English–French bilinguals (Footnote 5) who had English as their native language (L1) or as one of their native languages (if they were simultaneous English–French bilinguals) and who had been immersed in French between birth and age 6. The second group (n = 31: 24 females, 7 males, 1 other) consisted of late English–French bilinguals who spoke English as their L1 and had been immersed in French as their L2 between the ages of 7 and 15. The third group (n = 30: 23 females, 6 males, 1 other) consisted of functional English monolingual participants who had little to no knowledge of French.
A summary of their mean age, AoI, MoA (Footnote 6), cloze task scores and GJT d-prime scores (Footnote 7) is shown in Table 1. The group effect in a one-way ANOVA on the d-prime scores was significant (p = 0.002), and the post hoc pairwise comparisons revealed that this was due to late bilinguals having a significantly lower mean score than both early bilinguals (p = 0.028) and monolinguals (p = 0.002). No significant differences were observed between early bilinguals and monolinguals (p = 0.505). This means late bilinguals were less sensitive to the distinction between the two conditions.
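For readers unfamiliar with d-prime, the following Python sketch shows the standard signal-detection computation, d' = z(hit rate) - z(false-alarm rate). It is a minimal illustration, not the authors' scoring script: the function name, the coding of a ‘Yes’ response to a grammatical sentence as a hit and to an ungrammatical sentence as a false alarm, and the log-linear correction are all assumptions introduced here.

```python
from statistics import NormalDist


def d_prime(hits, misses, false_alarms, correct_rejections):
    """Sensitivity index d' = z(hit rate) - z(false-alarm rate).

    ASSUMPTION: a log-linear correction (add 0.5 to each count and 1 to
    each total) keeps the z-scores finite when a rate is exactly 0 or 1.
    The correction actually used in the study is not stated in the text.
    """
    hit_rate = (hits + 0.5) / (hits + misses + 1)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(fa_rate)


# Hypothetical participant: 8 grammatical and 8 ungrammatical GJT sentences.
score = d_prime(hits=7, misses=1, false_alarms=2, correct_rejections=6)
```

A score near zero means the participant accepted grammatical and ungrammatical sentences at similar rates; larger positive scores indicate better discrimination between the two conditions.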
Table 1. A summary of mean age and cloze task scores

3.3. Materials
All the testing materials were written in English. For the maze task, two counterbalanced conditions were designed for each trial. Each participant was tested with sixteen trials of the English word order (EN) condition, which is an SadvVO structure, and another sixteen trials of the French word order (FR) condition, which is an SVadvO structure. The two sentences in each pair have different contextual sentences, subjects and noun objects from each other. In other words, participants would not see the same sentence appearing in both conditions. Thirty filler trials (without contextual sentences) were inserted between the target trials to avoid any order effect. Fillers and trials were fully randomized by the program. Two lists of stimuli were prepared, and the program randomly assigned one of them to each participant for the maze task. Another sixteen sentences (with the same conditions as those in the maze task) and ten fillers were designed for the GJT.
In total, each participant completed 32 experimental trials and 30 fillers in the maze task. After that, they were asked to judge 26 sentences in the GJT (16 trials and 10 fillers). The manner and frequency adverbs in the trials were the same as those tested in the study of White (1991): frequency – often, always, sometimes, usually; manner – quickly, slowly, quietly, carefully. In addition, we included the manner adverb suddenly. All the trials in the formal maze and GJT tasks are listed in the Supplementary Material.
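The assembly of a maze session (16 EN trials, 16 FR trials and 30 fillers, fully randomized, drawn from one of two lists) can be sketched as follows. This is an illustrative sketch only. In particular, the text does not say how the two lists differ, so the idea that an item shown in the EN condition in one list appears in the FR condition in the other is an assumption, as are all item identifiers and function names.

```python
import random

N_ITEMS, N_FILLERS = 32, 30


def make_list(flip):
    """Return 32 (item_id, condition) pairs: 16 EN and 16 FR trials.

    ASSUMPTION: the two lists assign opposite conditions to each item.
    """
    trials = []
    for item in range(N_ITEMS):
        english_order = (item % 2 == 0) != flip
        trials.append((f"item{item:02d}", "EN" if english_order else "FR"))
    return trials


LIST_A, LIST_B = make_list(False), make_list(True)
FILLERS = [(f"filler{i:02d}", "filler") for i in range(N_FILLERS)]


def build_session(rng):
    """Pick one list at random and shuffle its trials together with the fillers."""
    chosen = rng.choice([LIST_A, LIST_B])
    session = chosen + FILLERS  # new list, so the originals stay untouched
    rng.shuffle(session)
    return session


session = build_session(random.Random(1))
```

Because the target trials and fillers are shuffled together, the sketch does not guarantee a filler between every pair of target trials; whether the original program enforced such a constraint is not stated.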
3.4. Procedure
The entire experiment (Footnote 8) was created and hosted on Gorilla Experiment Builder (www.gorilla.sc) (Anwyl-Irvine et al., 2020). Participation was completely remote. Each participant was asked to sign an online consent form before starting the formal experiment. Following this, they were requested to fill out the Language Background Questionnaire (LBQ) (Sabourin et al., 2016b), where they reported their linguistic background, including history of language exposure and immersion, self-rated English and French proficiency levels and the detailed percentages of the use of all their spoken languages from birth until high school. The LBQ was written in English.
Before starting the maze task, all bilingual participants also completed a cloze task in French (Tremblay & Garrison, 2008), which was used to test their French language proficiency. Next, in the maze task, participants were told that they were going to play a maze game of reading English sentences. Each participant needed to finish five practice trials before continuing with the target trials and fillers. Only practice trials and target trials were preceded by contextual sentences. Participants were informed that some of the sentences would have a context for ease of comprehension and to help them achieve a higher score. Each participant was directed to the final GJT after finishing the maze game. In this task, they were simply instructed to read each sentence as fast as they could and select whether it was grammatical or not. Participation in the entire experiment involved only one session and took approximately 45 minutes.
3.5. Data analysis
3.5.1. Data preprocessing
The maze task required participants to select the best-fitting item for each chunked word or phrase of each trial as the most acceptable continuation to the previously presented word or phrase of the sentence. The RT for each selection was recorded and used to determine any processing cost. Therefore, RT was set as the dependent variable (DV) for all statistical models in the current study.
As we expected a large level of noise in the raw data due to this being an online experiment with remote participation, data preprocessing steps were mostly determined post-data collection. Only responses to accurate trials were retained (which excluded 1% of the data points). Responses less than 300 ms for any part of the sentence were considered ineffective processing and excluded from the final analysis. This resulted in a further 1% of data being removed. Additionally, RT data points that were more than two standard deviations from the mean of RTs in each region were also removed (5% of the data). In total, 7% of the data points were removed as outliers before statistical analyses. The skewness levels of the RT distribution of the four regions were between 0 and 2.0.
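The three exclusion steps can be expressed compactly in Python. The sketch below is a hedged illustration rather than the authors' script: the field names are hypothetical, and it assumes that the mean and standard deviation for the 2 SD criterion are computed per region over the data remaining after the first two steps, pooled across participants, which the text does not specify.

```python
from collections import defaultdict
from statistics import mean, stdev


def preprocess(rows, min_rt=300, sd_cutoff=2.0):
    """Filter maze responses; rows are dicts with 'region', 'rt' (ms), 'correct'."""
    # Steps 1 and 2: keep accurate responses that are not faster than 300 ms.
    kept = [r for r in rows if r["correct"] and r["rt"] >= min_rt]

    # Step 3: compute per-region bounds at mean +/- 2 SD (ASSUMED pooling).
    by_region = defaultdict(list)
    for r in kept:
        by_region[r["region"]].append(r["rt"])
    bounds = {}
    for region, rts in by_region.items():
        centre = mean(rts)
        spread = stdev(rts) if len(rts) > 1 else 0.0
        bounds[region] = (centre - sd_cutoff * spread, centre + sd_cutoff * spread)

    return [r for r in kept
            if bounds[r["region"]][0] <= r["rt"] <= bounds[r["region"]][1]]
```

Calling preprocess on the raw responses returns only the rows that survive all three steps; the proportion removed at each step can be obtained by comparing list lengths before and after.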
3.5.2. ANOVA
As preregistered, the RTs (DVs) on the critical and post-critical regions (i.e., regions 2 and 3) were of the most interest. The level of processing difficulty can be identified by comparing the critical region RTs between the two counterbalanced conditions of adverb placement (i.e., English versus French, such as often watches versus *watches often). The post-critical region can both show a continued effect and provide inferences of between-group differences. Additionally, regions 1 and 4 were analyzed to make sure there was no extra effect from either of them. This extension beyond the preregistered statistical models is likewise incorporated into the mixed-effects analysis presented in the following section. To answer the first research question, AoI was first treated as a categorical variable. An ANOVA was conducted for each region of the sentence to test the between-group (early bilinguals, late bilinguals and monolinguals) and between-condition (English versus French word order) differences. Each ANOVA included both a by-participant (F1) and a by-item (F2) test. As introduced in Section 2.2, in Québec French, frequencies of adverb usage correlate with the distributions of the two adverb types in the current study, despite the adverb placement rules being very well described in syntactic theories (Lealess, 2014). Therefore, we also added adverb type as one of the independent variables (IVs), although this was not preregistered. In the by-participant (F1) tests, group (three levels: early bilinguals, late bilinguals and monolinguals) was the between-subject IV. The adverb placement (two levels: English and French) and adverb type (two levels: frequency and manner) were treated as repeated measures. The by-item (F2) tests used adverb placement and adverb type as between-item IVs and group as the repeated measure.
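The difference between the F1 and F2 tests lies in how trial-level RTs are averaged before the ANOVA is run. A minimal Python sketch of that aggregation step is given below; the field names and the cell_means helper are hypothetical, and the ANOVA itself is not reproduced here.

```python
from collections import defaultdict
from statistics import mean

# F1: one mean per participant and within-participant cell
# (group is constant for a given participant).
F1_KEYS = ("participant", "placement", "adverb_type")
# F2: one mean per item and group
# (placement and adverb type are constant for a given item).
F2_KEYS = ("item", "group")


def cell_means(rows, keys):
    """Average RT over all trials that share the same values for `keys`."""
    cells = defaultdict(list)
    for r in rows:
        cells[tuple(r[k] for k in keys)].append(r["rt"])
    return {cell: mean(rts) for cell, rts in cells.items()}
```

Passing F1_KEYS yields the participant-level cell means that feed the by-participant test, and passing F2_KEYS yields the item-level means for the by-item test.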
To compare the effect size of word order between groups, contrastive comparisons of the estimated marginal means (by t-tests) for both adverb placement conditions of each group were further computed with the contrast() function in the R package emmeans (Lenth et al., Reference Lenth, Singmann, Love, Buerkner and Herve2018). These post hoc comparisons (with the Bonferroni method for p-value correction) were coded with contrast matrices that were conceptually planned in the preregistration.
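The Bonferroni method mentioned above multiplies each p-value by the number of comparisons in the family and caps the result at 1. The sketch below shows only that arithmetic; the function name is hypothetical, and how comparisons are grouped into families in the actual analysis is not specified here.

```python
def bonferroni(p_values):
    """Return Bonferroni-adjusted p-values for one family of comparisons."""
    m = len(p_values)
    # Multiply by the family size and cap at 1.0, since p cannot exceed 1.
    return [min(1.0, p * m) for p in p_values]
```

With three comparisons, for example, an unadjusted p of 0.04 no longer falls below 0.05 after adjustment, which is why the correction is regarded as conservative.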
3.5.3. Mixed-model analysis
To answer the two research questions and evaluate the continuum of effects of French AoI as preregistered, we built a linear mixed-effect regression model (LMM) for each region and for the total RT of the sentence. AoI to French (range: 0–15 years) was fitted as a continuous variable rather than a categorical one. The maximal model included the fixed effects of AoI (continuous), cloze task scores (continuous), self-report proficiency (categorical with five levels: 0, 1, 2, 3, 4), adverb type and list. Two random-effect structures were included (i.e., item and participant). The by-participant and by-item random slopes were reduced until the models converged. Model comparison was done using the anova function in the car package in R (Fox & Weisberg, Reference Fox and Weisberg2019). Model fit was checked by computing the marginal R-squared and conditional R-squared values using the MuMIn package in R (Bartoń, Reference Bartoń2022). The list variable was dropped during model selection since it accounted for little variance in the model. Adverb type was dropped from the fixed effects of some of the optimal models when it showed no significant effect. Eventually, no random slopes were left in the optimal models. The p-values/df were calculated using the Satterthwaite approximation with the lmerTest package in R (Kuznetsova et al., Reference Kuznetsova, Brockhoff and Christensen2017). The profile method was used to compute 95% confidence intervals. All the linear mixed-effect models were created and run with the lme4 package in R (Bates et al., Reference Bates, Mächler, Bolker and Walker2015). The contrastive comparisons of the estimated means for interactions and significant predictors were further computed with the contrast and emtrends functions in the R package emmeans (Lenth et al., Reference Lenth, Singmann, Love, Buerkner and Herve2018). An observation-level validation was performed by plotting the residuals of all the LMMs (Sonderegger, Reference Sonderegger2020).
The residuals were all slightly skewed (but not far from a normal distribution). This is due to the skewed distribution of the response values (i.e., RTs) in the final dataset. The skewness values of residuals in the mixed-effect models were close to those of the RT for each region. All the model results and R scripts are available in our project repository on OSF. Both the data preprocessing and inferential statistics were completed in RStudio (R version 4.2.1).
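For readers unfamiliar with the marginal and conditional R-squared values used for the model-fit check, the commonly used variance-partitioning definitions are given below. This is a sketch under the assumption of a model with random intercepts for participant and item only; the symbols are introduced here for illustration, with \sigma^2_f the variance of the fixed-effect predictions and \sigma^2_\varepsilon the residual variance.

```latex
R^2_{\text{marginal}} =
  \frac{\sigma^2_f}
       {\sigma^2_f + \sigma^2_{\text{participant}} + \sigma^2_{\text{item}} + \sigma^2_\varepsilon},
\qquad
R^2_{\text{conditional}} =
  \frac{\sigma^2_f + \sigma^2_{\text{participant}} + \sigma^2_{\text{item}}}
       {\sigma^2_f + \sigma^2_{\text{participant}} + \sigma^2_{\text{item}} + \sigma^2_\varepsilon}
```

Under these definitions, the marginal value reflects variance explained by the fixed effects alone, whereas the conditional value reflects variance explained by the fixed and random effects together, so the conditional value is never smaller than the marginal one.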
4. Results
4.1. ANOVA Results
The results of the ANOVA were able to address the first research question. Overall, the sentences that were displayed in the French order took longer to read than the English order. For all three groups of participants, there were significant between-condition (FR versus EN) differences in region 2 (critical) and region 3 (post-critical). The between-condition comparisons for each specific group are plotted in Figure 2. It reveals that early bilinguals (EB) had the smallest difference among the three groups in region 2 (e.g., *watches often versus often watches), while the mean differences of late bilinguals (LB) and monolinguals (MONO) were at the same level. However, the interaction effect of group by condition was only marginally significant in the ANOVA. Therefore, it is interpreted cautiously in combination with the post hoc contrastive comparisons within each group. No such effects occurred in regions 1 and 4. Detailed statistical results are reported in the following paragraphs.

Figure 2. An overview of between-condition comparisons across three groups.
4.1.1. Region 1
There was a main effect of group in this region (F1(2, 108) = 6.59, η2 = 0.073, p = 0.002; F2(1.96, 117.37) = 40.97, η2 = 0.15, p < 0.001). Results of a post hoc pairwise comparison suggest that both early and late bilinguals processed this region significantly slower than monolinguals, while they did not significantly differ from each other. No differences were found between the two conditions of adverb placement in this region, and there was no significant effect of condition-related interactions (see Figure 2).
4.1.2. Region 2 (critical)
Overall, a main effect of adverb placement condition was revealed in this critical region (F1(1, 108) = 156.46, η2 = 0.137, p < 0.001; F2(1, 60) = 30.36, η2 = 0.273, p < 0.001), as seen in Figure 2: the (ungrammatical) FR condition was processed significantly more slowly than the EN condition. The results also show a significant effect of group (F1(2, 108) = 6.59, η2 = 0.073, p = 0.002; F2(1.96, 117.37) = 40.97, η2 = 0.15, p < 0.001). Results of the contrastive comparison from the by-item ANOVA revealed that the early bilingual group processed this region significantly more slowly than both late bilinguals (diff = 94 ms, SE = 16.3, p < 0.001) and monolinguals (diff = 156 ms, SE = 17.1, p < 0.001) (Table 2). Meanwhile, late bilinguals were significantly slower than monolinguals (diff = 61 ms, SE = 18.5, p = 0.0047).
Table 2. The contrastive comparison of the estimated marginal means in region 2 (AoI)

The interaction effect between group and condition was marginally significant in the by-participant analysis (F1(2, 108) = 2.72, η2 = 0.005, p = 0.07; F2(1.96, 117.37) = 2.11, η2 = 0.009, p = 0.127). Meanwhile, the contrastive comparisons of the estimated marginal means showed that the between-condition difference of early bilinguals differed clearly from those of the other two groups. Specifically, early bilinguals showed a smaller mean difference (diff (FR-EN) = 153 ms, SE = 23, p < 0.001) than both monolinguals (diff (FR-EN) = 224 ms, SE = 30, p < 0.001) and late bilinguals (diff (FR-EN) = 227 ms, SE = 30, p < 0.001). Since adverb type also showed a main effect that was significant by participants and marginal by items (F1(1, 108) = 11.27, η2 = 0.015, p = 0.001; F2(1, 60) = 3.22, η2 = 0.038, p = 0.078), a post hoc contrastive comparison of the estimated marginal means was applied to the interaction of group × condition × advType in both the by-participant and by-item analyses. Results revealed different effects of the two adverb types (manner versus frequency) on the between-condition differences for our bilingual participants (see the detailed planned contrasts in the ANOVA results file in our OSF repository). Both bilingual groups showed a smaller between-condition difference with a manner adverb (e.g., suddenly) than with a frequency adverb (e.g., often). This might suggest that sentences with manner adverbs are more likely to elicit grammatical co-activation than those with frequency adverbs. Details will be discussed later.
4.1.3. Region 3 (post-critical)
As seen in Figure 2, there was a significant spillover effect of condition in this region (F1(1, 109) = 57.04, η2 = 0.056, p < 0.001; F2(1, 109) = 8.31, η2 = 0.1, p = 0.005). As in region 2, all participants showed a longer RT when reading the FR condition than the EN condition. The overall group effect was also significant (F1(2, 109) = 3.14, η2 = 0.04, p = 0.047; F2(1.99, 119.69) = 21.15, η2 = 0.066, p < 0.001). The contrastive comparison of the estimated marginal means revealed that the between-group differences were smaller than those in the critical region but were still significant (in the by-item analysis) (see Table 3).
Table 3. The contrastive comparison of the estimated marginal means in region 3 (AoI)

The interaction effect between group and condition was not significant (F1 = 0.52, F2 = 0.64). Results of the contrastive comparison indicate that the between-condition difference of early bilinguals was slightly smaller than that of monolinguals but was not prominently different from late bilinguals (see Table 3). No significant effect of any other factor was found on the interaction of group by condition.
4.1.4. Region 4
As in region 1, no significant effect of condition (F1 = 0.11, F2 = 0.04) was found in region 4 (Figure 2), while there was still a significant effect of group (F1(2, 109) = 6.52, η2 = 0.081, p = 0.002; F2(1.81, 108.31) = 74.6, η2 = 0.149, p < 0.001). Results of the pairwise comparison of the estimated marginal means (for the by-item ANOVA) showed significant differences among the three groups.
4.1.5. Total RT
For the total RT, there was a main effect of condition (F1(1, 108) = 91.36, η2 = 0.045, p < 0.001; F2(1, 60) = 16.51, η2 = 0.167, p < 0.001), showing that the EN condition was processed significantly faster overall than the ungrammatical FR condition. The group effect was also significant, showing the same pattern as region 4. The pairwise comparison discloses significant mean differences (of the by-item ANOVA) among the three groups:
• early bilinguals–late bilinguals = 334 ms (SE = 53.2, p < 0.001);
• early bilinguals–monolinguals = 547 ms (SE = 48.9, p < 0.001);
• late bilinguals–monolinguals = 213 ms (SE = 56.1, p = 0.001).
4.2. LMM results
Results of the LMM are analyzed and discussed in relation to the second research question: whether AoI has a continuum effect on the grammatical co-activation of English and French. Adverb placement (condition) was significant in the mixed-effect models for region 2, region 3 and total RT, which is consistent with the ANOVA results. The AoI factor was not significant in any of the five mixed-effect models. However, the post hoc test results showed a significant decreasing trend of RTs in the EN (i.e., grammatical) condition as AoI increased in region 2 (i.e., the critical region) (see Table 4). The RT distribution of both conditions in this region is plotted in Figure 3. The plots of the predicted AoI trend can be found in the knitted HTML file in our OSF repository. Specifically, the model for region 2 predicts that as AoI increases by 1 year, the RT in the critical region of the EN condition (e.g., often watches) will decrease by 17.2 ms (SE = 7.96, df = 81.0, t = −2.239, p = 0.028). This trend was not significant for AoI in other regions. For the critical region of the FR (i.e., ungrammatical) condition (e.g., *watches often), AoI did not show a significant decreasing trend. Adverb type was significant in region 2 (β̂ = 182.4, SE = 69.4, t = 2.63, p = 0.009) (as shown in Table 5).
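To make the size of the reported AoI trend concrete, the per-year slope can be extrapolated over a span of AoI values. The sketch below is plain arithmetic on the reported estimate and assumes the trend is linear across the whole AoI range; the function name is hypothetical and the calculation is not part of the reported analysis.

```python
def predicted_rt_change(slope_ms_per_year, aoi_from, aoi_to):
    """Linear change in predicted RT (ms) between two AoI values."""
    return slope_ms_per_year * (aoi_to - aoi_from)


# Reported EN-condition slope in region 2: -17.2 ms per year of AoI.
change = predicted_rt_change(-17.2, 0, 15)
print(f"Predicted change across AoI 0 to 15: {change:.1f} ms")
```

Under this linearity assumption, the predicted RT in the grammatical condition would differ by roughly 258 ms between the two ends of the sampled AoI range (0 and 15), although the standard error of the slope implies considerable uncertainty around that figure.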
Table 4. The contrastive comparison of the estimated marginal means of LMM (AoI)

Note: Bold values indicate statistically significant effects (p < 0.05) and marginally significant effects (p < 0.10) after Bonferroni correction.

Figure 3. The AoI trends of RT distributions on both the FR and EN word orders.
Table 5. The fixed-effect results on region 2 (AoI)

Note: Number of obs.: 2217, groups: participant (82), item (64). p-values/df calculated using the Satterthwaite approximation. Formula: reaction ~ advPlacement * AoI * advTYPE + (1 | item_no) + (1+advTYPE+advPlacement | participant). Marginal R2 = 0.06, conditional R2 = 0.35. Bold values indicate statistically significant effects (p < 0.05).
4.3. General discussion
In the current study, we investigated whether English–French bilinguals showed grammatical co-activation effects when processing English sentences with verb phrase internal adverbs (such as often in the sentence John often watches television). We also explored whether French AoI has a continuous effect on grammatical co-activation, as we assume that AoI critically determines learners’ French grammatical competence, thereby influencing its competitive strength against the English grammar encapsulated within the same brain. The ANOVA results in the critical region showed a marginally significant interaction of group by condition, suggesting a potential grammatical co-activation pattern of the early bilingual participants. On the other hand, as revealed by the between-group difference in the GJT results in Table 1, varieties of English might exist in the bilingual community. Therefore, we cautiously interpret the results in terms of the grammatical co-activation hypothesis. Previous studies have shown grammatical co-activation effects in bilingual populations, particularly with respect to morphosyntactic and phonosyntactic rules such as agreement, inflection, case or phonological mutation. However, the current study focuses specifically on verb–adverb word order differences in English and French, which is a syntactic-level effect. This distinguishes our work from previous research and provides a complementary perspective on cross-linguistic influence in bilingual processing. Overall, the results suggest that grammatical co-activation in L1 sentence processing is modulated by L2 AoI.
4.4. The main group effect
There is a main effect of group in every region and in total RTs, suggesting that early bilinguals with an AoI to French before age 7 tend to process sentences significantly more slowly than late bilinguals. Monolinguals responded significantly faster than the two bilingual groups. This is consistent with previous findings that simultaneous bilinguals tend to have slower access to the mental lexicon than sequential bilinguals, and that a strong correlation exists between the L2 AoA and the organization of the bilingual lexicon (e.g., Sabourin et al., Reference Sabourin, Brien and Burkholder2014).
This pattern can be accounted for by the Competition Model (CM), which suggests that competing cues lead to longer RTs (while converging cues facilitate processing with shorter RTs) (Li & MacWhinney, Reference Li, MacWhinney and Chapelle2013). In light of Costa & Santesteban (Reference Costa and Santesteban2004), our result can also be explained from the perspective of cross-linguistic inhibition. Higher proficiency in French for bilinguals with early AoI may involve greater inhibitory control for lexical-semantic access, leading to longer RTs for a whole sentence. In other words, early bilinguals seem to have more information parsed from their mental lexicon than monolingual speakers while showing more co-activation and competition of their two languages than late bilinguals.
4.5. The age of immersion effect
In the critical region (verb and adverb), the (word order) condition effect was significant. All the participants showed a significant level of surprise when reading the ungrammatical condition, compared to the grammatical condition. Most importantly, the between-condition difference (i.e., French versus English word order) of early bilinguals was smaller than those of the other two groups by more than 50 ms, while late bilinguals and monolinguals showed a comparable violation effect (which was not predicted). However, this does support the hypothesis that for late bilinguals, less syntactic co-activation and competition might lead to greater surprise when processing the French word order sentences, since they were neither interfered with nor facilitated by the French grammar as much as early bilinguals were.
The post hoc trend of AoI on the processing time of the grammatical condition was significant in the critical region, as revealed by the LMM. Despite a non-significant trend in the ungrammatical condition, this result could still indicate a continuum of AoI (to a second or less dominant language, French) that affects the processing of a first or more dominant language for English–French bilinguals. In addition to the non-significant result for French proficiency in the mixed model, we conducted a t-test comparing the French cloze task scores of early and late bilinguals. While statistically significant, the difference was minimal (mean scores: 26.4 versus 25.8 out of 45), indicating that proficiency differences alone are unlikely to account for the AoI effects observed. However, it should be noted that language proficiency could still play a role in either of the two spoken languages for morphosyntactic processing (Steinhauer, Reference Steinhauer2014; but also see the non-significant effect of L2 Mandarin proficiency on L1 English processing in the results of Wolpert et al., Reference Wolpert, Zhang, Baum and Steinhauer2024).
Theories have been developed to explain the continuous adaptation of bilingual languages. For example, Kroll et al. (Reference Kroll, Bobb and Hoshino2014) suggest that there are adaptive changes in a native language due to the continuous acquisition of a second language or the other simultaneously acquired language that is less or slightly less dominant. The unified competition model also broadly implies that AoA acts as a biological constraint on language development (Li & MacWhinney, Reference Li, MacWhinney and Chapelle2013; MacWhinney, Reference MacWhinney, Hickmann, Veneziano and Jisa2018, Reference MacWhinney, Gervain, Csibra and Kovács2022). The observed differences in the violation effects between early and late bilinguals are consistent with the CM’s emergentist view that varying inputs lead to different patterns of language development (MacWhinney, Reference MacWhinney, Dabrowska and Divjak2015). Specifically, for early or simultaneous English–French bilinguals, the strong post-verbal cue validity of L2 French adverb placement significantly influences their L1 English adverb placement processing, despite English’s predominant pre-verbal cue. Meanwhile, late bilinguals with later French immersion tend to develop weaker cue validity for French adverb placement, leading to a greater between-condition difference (i.e., a larger syntactic violation effect). The results suggest that the biological constraint factor of late versus early L2 immersion is measurable in the adaptive reorganization of L1.
To summarize, in the very early stage of processing (i.e., the critical region), late bilinguals seem to show less or no co-activation of the adverb placement rules compared to early bilinguals when processing a simple SVO sentence in English. Considering that the LMM results did not show a strong main effect of AoI as a continuous variable, our finding suggests a possible direction for future investigation into how AoI shapes processing patterns. The post hoc comparison results may imply that, with earlier AoI to French, English–French bilinguals show more co-activation and thus a smaller violation effect, but also longer RTs when reading English sentences, even though they all speak English as their first language or one of their first languages.
4.6. The spillover effect
The spillover effect of condition was present for all participants in the object noun position, while less prominent between-condition differences were found across all three groups of participants (see the by-item results in Table 3). It should be emphasized that this region is the point at which each trial with a post-verbal adverb (i.e., the ungrammatical condition) becomes fully ungrammatical, as this region constitutes the object of the verb phrase. For example, some intransitive verbs in English, such as eat, can be followed by an adverb like often (e.g., John eats often) grammatically. Taking this transitivity property into account, one should expect more prominent differences in the violation effect between bilinguals and functional monolinguals. However, our results might suggest that in English, the grammatical co-activation effect starts earlier in the position of verb and adverb and then fades away in the object noun position (i.e., the post-critical region).
4.7. Other effects
Lastly, the adverb type seems to affect the processing patterns of bilingual speakers. In the critical region, the between-condition difference (i.e., violation effect) for both groups of bilingual participants is always larger for sentences with a frequency adverb (than for those with a manner adverb). This may be because the distribution of French manner adverbs more closely resembles that of English (i.e., 44% precede the main verb) than does the distribution of French frequency adverbs (less than 20%) (Lealess, Reference Lealess2014, p. 149), leading the bilinguals to be more accepting of manner adverbs (than frequency adverbs) when they are placed in the ungrammatical position.
Beyond all these discussions, it is still controversial whether the activation of one language when using the other is nonselective (Pu et al., Reference Pu, Medina, Holcomb and Midgley2019) or just a type of transfer learning from L1 to L2 (Costa et al., Reference Costa, Pannunzi, Deco and Pickering2017) for lexical access. The present study of L1 (or more dominant language) English processing with the co-activation of L2 French may contribute to this debate. We predict that other similar paradigms should also be able to yield this effect. For instance, an SPR task in the moving-window paradigm may show smaller between-condition differences at the verb, adverb and object positions for early bilinguals than for late bilinguals. An EEG study could reveal group differences in ERP components (such as the P600) and in power distribution across the time–frequency domain during the processing of the verb and adverb region.
5. Conclusions
Our research questions were whether there is a grammatical co-activation effect in English–French bilinguals on reading comprehension of adverb placement and how it correlates with a key aspect of language background (i.e., AoI). In the current study, late bilinguals tended to show greater sensitivity to grammatical violation than early bilinguals. We also observed marginally significant trends of the AoI effect on processing the grammatical condition of adverb placement in English (e.g., often watches). Therefore, we argue that the AoI of a second or less dominant language significantly affects how bilinguals process a sentence with grammatical co-activation in their first or more dominant language.
Two questions remain open. First, the effect of adverb type needs to be further investigated. Second, we have not yet investigated French–English bilinguals (residing in an English-dominant region such as Ottawa) who speak French as their first language while attending an English school later in their life (e.g., at 6 years old). These populations tend to have a more naturalistic MoA of their French. Future studies should expand the populations to these bilinguals to explore the effect of MoA. Our future exploration will also include what neural correlates can represent the underlying mechanism of the behavioral patterns of bilingual speakers in the current study. To do that, we will apply electrophysiological methods (e.g., EEG), which can disclose more subtle information about bilingual sentence processing with two competing but possibly integrated grammars in the brain. Testing other language pairs would be another direction of research to generalize our findings to other bilingual populations.
Supplementary material
To view supplementary material for this article, please visit http://doi.org/10.1017/S1366728926101254.
Data availability statement
The full study design (including consent form, questionnaires, and tasks) is available for preview and cloning at: https://app.gorilla.sc/openmaterials/1244712. The materials, R scripts, and anonymized data are available on OSF at: https://osf.io/ajb5e/?view_only=ccd2c781a14f4378b1818759451a0fd6.
Acknowledgements
We thank Dr. Tania Zamuner of the CCLR Lab (University of Ottawa) for her insightful feedback throughout the period of this study. We are also thankful to the ERPLing lab members for their discussions that enriched the quality of this research. Special thanks go to our lab colleague Ariane Senécal, who reviewed the experimental stimuli and paper wording, and the participants, who volunteered their time to participate in the experiment. Their contribution made the success of this project possible. Lastly, we are grateful to the reviewers for their constructive comments and suggestions, which helped to improve this manuscript. Laura Sabourin was supported by a Social Sciences and Humanities Research Council of Canada Insight Research Grant.
Competing interests
No conflict of interest was reported by the authors.

