Interactions between lexical and syntactic L1-L2 overlap: Effects of gender congruency on L2 sentence processing in L1 Spanish-L2 German speakers

Abstract Bringing together lines of research from sentence processing and lexical access, this empirical study investigates the interplay between lexical (grammatical gender) and syntactic (word order) cross-linguistic overlap in L2 German. Eighty-six L1 Spanish-L2 German and thirty-six monolingual German adults completed a German self-paced reading task with noun phrases (NPs) manipulated by L1-L2 gender congruency (congruent, incongruent, neuter) and L1-L2 adjective-noun word order (pre- vs. postnominal adjectives). The study examines the effects of gender congruency, the type of L1-L2 gender mapping (i.e., presence vs. absence of each class in L1 and L2), and L2 proficiency level. Results show that the detection of ungrammatical word order in L2 German interacts with gender congruency, in that L2 speakers are only sensitive to word order violations for sentences with gender-congruent nouns. The detection of ungrammaticality for sentences containing gender-incongruent nouns only emerges at higher L2 proficiency levels. These findings underscore the role of cross-linguistic lexical overlap in syntactic processing.


Introduction
Research on the bilingual mental lexicon shows that bilinguals activate both languages when speaking and understanding the L1 or the L2 (for a review see Tokowicz, 2015). Such "fundamental permeability" (Kroll, 2015) between languages, evident in cross-linguistic influence, is often considered a hallmark of bilingual language processing. Permeability extends into grammar, with learners, for instance, being affected by the grammatical gender of nouns in one language even when processing sentences in another (gender congruency effect; Morales et al., 2016). At the same time, there is mixed evidence of cross-linguistic effects at the sentence level. While work on L2 sentence productionmostly from crosslinguistic priming studiessuggests a high degree of interactivity between the L1 and L2 also at a structural level (e.g., Hartsuiker & Bernolet, 2017), research on L2 sentence comprehension often reports that L2 comprehenders demonstrate analogous processing patterns despite L1 differences (see Hopp, 2022 for a review). Such absence of cross-linguistic effects has been interpreted as reflecting an overall tendency among adult L2 learners to underuse syntactic information in L2 real-time comprehension (e.g., Clahsen & Felser, 2006, 2018. This general observation of cross-linguistic influence at the lexical level and its comparative absence at the syntactic leveltogether with a general underreliance on syntaxhas spurred research to consider how lexical and syntactic processing may interact in L2 adults. According to the Shared Syntax model for language production (Hartsuiker & Bernolet, 2017), L2 grammatical representations are initially lexically specific (i.e., tied to particular lexical items), and learners only gradually abstract grammatical structure. At a higher proficiency, learners connect abstract structural properties across languages, which leads to cross-linguistically shared syntax. In this way, lexical acquisition paves the way to L2 syntax. Other approaches have focused on how lexical processing can impede target sentence processing. For instance, the Lexical Bottleneck Hypothesis (Hopp, 2018) stipulates that incomplete parsing arises partially from greater demands on lexical processing in bilinguals, as evidenced by slowdowns in lexical access (e.g., Hopp, 2016;Miller, 2014) and cross-linguistic influence in lexical processing (Hopp & Lemmerth, 2018).
The present study investigates the interaction between syntactic and lexical gender congruency in Spanish as the L1 and German as the L2, that is, two languages with a different number of gender classes. We developed a self-paced reading task in German where we manipulated lexical and syntactic cross-linguistic overlaps in Spanish and German with a particular focus on attributive adjectives. To investigate syntactic congruency, attributive adjectives appeared either pre-or postnominally within the noun phrase (NP). While postnominal adjectives are grammatical in Spanish, they are ungrammatical in German, as the language requires attributive adjectives to appear prenominally. To investigate lexical congruency, the grammatical gender of the nouns in the NPs was either congruent between Spanish and German, incongruent (e.g., Spanish masculine-German feminine), or neuter in German. We examine (a) how lexical gender and syntactic overlap interact during L2 sentence processing; (b) whether the type of L1-L2 mappings of gender affects L2 sentence processing (i.e., whether the gender has an analogous value in the L1); and (c) whether L2 proficiency plays a role. The findings demonstrate that lexical gender congruency interacts with incremental sensitivity to syntactic ungrammaticality, in that L1 Spanish learners of German only show slowdowns for sentences with ungrammatical Spanish-like postnominal attributive adjectives if the gender of the nouns in the NPs is congruent with Spanish. For gender-incongruent nouns, only higher proficiency learners come to be sensitive to the ungrammaticality of Spanish NP word order in German.

Background
Interactions between lexical and syntactic processing When reading or listening to sentences in the L2, late learners often demonstrate attenuated, slower, or absent reflexes of syntactic structure building compared to native speakers. These are manifested in difficulties or delays in the resolution of syntactic ambiguities or lower and delayed sensitivity to ungrammatical syntax during real-time comprehension (for a review, see Roberts, 2013).
Many approaches relate these difficulties in processing L2 morphosyntax to lower degrees of the availability of grammar to L2 learners as compared to L1 speakers (Clahsen & Felser, 2006, 2018, stronger interference from competing information (e.g., Cunnings, 2017), or lower degrees of integration of morphosyntactic information in a late-learnt L2 (e.g., Jiang, 2007). Yet others have raised the possibility that some of these difficulties may be natural consequences of bilingualism; that is, the distribution of language use across two languages and the concurrent activation of all languages during bilingual language processing (see Hopp, 2018 for a review).
On the one hand, bilinguals divide their time across two languages such that they have less experience with either language than a monolingual speaker of that language. Consequently, they encounter words and sentences in each language less frequently. As argued by the Weaker Links (or frequency lag) hypothesis by Gollan and colleagues (e.g., Gollan et al., 2008), less use translates into larger frequency effects in lexical retrieval, with bilinguals suffering delays in word recognition, particularly for lower-frequency words. Several studies have examined potential consequences of slower lexical processing for sentence comprehension.
One line of work manipulates lexical processing by item-level factors, such as the lexical frequency of words in the sentences. As in L1 processing (Tily et al., 2010), L2 readers show earlier evidence of structure building with high-frequency than with low(er)-frequency nouns (Hopp, 2017;Miller, 2014). Another line of work relates individual differences in lexical processing at the participant level to differences in sentence processing. For instance, L2 learners with faster lexical decoding skills demonstrate more target-like processing of ambiguous sentences, while less efficient lexical decoders are not sensitive to L1-L2 structural differences in sentence processing (Cheng et al., 2021;Hopp, 2014). Such findings extend to the processing of grammatical gender (Hopp, 2013), suggesting that efficient lexical access is a prerequisite for target grammatical processing. In turn, when lexical processing is slower and more taxing in an L2 compared to an L1, it may have knock-on effects on syntactic structure-building operations in real time. These operations can only be successfully executed once the lexical items incorporated in the parse have been processed to some degree.
On the other hand, the integrated (or non-selective) nature of lexical access in bilinguals can impact sentence processing. As bilinguals automatically activate all of the languages in their lexiconincluding grammatical information such as grammatical genderthis spreading activation can lead to different, more diffuse, or delayed use of grammatical information in sentence comprehension. In a series of studies with Russian-German bilinguals, Hopp and Lemmerth (2018;Lemmerth & Hopp, 2019) examined how the predictive processing of grammatical gender agreement in an L2 is affected by gender congruency with the L1. In a visualworld eye-tracking study, they studied whether Russian-German bilinguals could use grammatical gender marked on articles (e.g., der M /die F /das N ) or adjectives (blauer M /blaues N "blue") to anticipate a following noun (e.g., Tisch M /Lampe F / Haus N ). Of note, the study contrasted the gender congruency of the German nouns with Russian such that nouns and their translation equivalents were either gendercongruent (Lampe F -lampa F "lamp") or gender-incongruent (Haus N -dom M "house"). In addition, the conditions varied as to whether they were syntactically congruent (i.e., both German and Russian have gender marking on prenominal adjectives), or syntactically incongruent (i.e., only German marks gender on articles since Russian does not have articles). The results show that, for intermediate adult L2 learners, lexical and syntactic congruency interacted: they could only use gender for predictive agreement processing in the syntactically incongruent condition (articles) if the nouns were gender-congruent between German and Russian, even when target knowledge of the nouns and their genders was controlled (see also Weber & Paris, 2004). For successive bilingual Russian-German children, Lemmerth and Hopp (2019) also found that gender congruency was a prerequisite for target syntactic prediction according to gender agreement. These findings indicate that lexical gender congruency effects observed in predictive processing interact with syntactic processing, leading to delayed and less robust referent identification when the gender of the nouns implicated in the agreement relation does not overlap between L1 and L2.
This body of evidence led to the formulation of the Lexical Bottleneck hypothesis (LBH; Hopp, 2018), arguing that incomplete parsing in an L2 can partially arise from slowdowns at the lexical level and the non-language-selective nature of lexical access in bilinguals. Lexical processing constitutes a bottleneck, in that lexical retrieval consumes time and resources that then cut short the subsequent target computation of syntax or lead to differences in the activation of grammatical information in bilinguals. In this paper, we test the scope of the Lexical Bottleneck hypothesis in the processing of word order violations in NPs containing L1-L2 gender-congruent, incongruent, or neuter nouns. In this way, we investigate whether the effects of lexical gender congruency observed in predictive L2 processing of gender agreementwhere incongruency in gender leads to less successful referent identificationcan also be seen in slower processing of non-target syntax in the L2 when the latter corresponds to licit word orders in the L1.
Interplay between L1-L2 gender mapping and lexical congruency effects Previous work from individual word production and processing has further shown that it is not simply a matter of whether gender is cross-linguistically congruent or incongruent, but that the nature of the gender mapping impacts L2 lexical access. Most of the studies that have examined gender interactions in asymmetric systems, where there is no straightforward mapping for one of the gender classes (e.g., German neuter for Spanish-German bilinguals), yield different findings for incongruent versus asymmetric gender classes.
Testing bilinguals with a three-gender L1 and a two-gender L2, Manolescu and Jarema (2015) conducted an L2 picture naming task and a timed L2 translation task with L1 Romanian-high proficient L2 French adults. In naming, reaction times (RTs) were significantly faster for both congruent and neuter nouns (i.e., Romanian neuter and masculine/feminine in French) compared to incongruent (masculine-feminine mismatched) nouns. The same pattern of results was found in the translation task, though in this case neuter did not differ statistically from incongruent. In a similar vein, Paolieri and colleagues (2019) examined gender representation in L1 Russian-high proficient L2 Spanish adults using two timed L2 translation tasks. Bare noun translation revealed significantly faster RTs for congruent than incongruent, for neuter than incongruent, and also for congruent than neuter nouns (with only the latter differing from findings with L1 Romanian-L2 French speakers). NP (articlenoun) translation offered a similar pattern of results, though the difference between incongruent and neuter nouns was not significant.
Focusing on bilinguals with a two-gender L1 and a three-gender L2, Klassen (2016b) conducted a timed binary choice NP (articlenoun) grammaticality judgment task with L1 French-intermediate L2 German adults. Results showed significantly faster RTs for congruent nouns compared to incongruent ones, with no significant difference between incongruent and neuter or congruent and neuter. Finally, in the most relevant previous study, L1 Spanish-intermediate L2 German adults completed an L2 picture naming task (Klassen 2016a). RTs were significantly faster with congruent than incongruent and with neuter than incongruent, while there was no significant difference between congruent and neuter nouns.
Across studies, there is a trend for faster RTs for neuter nouns than incongruent ones, although technically both of these conditions consist of gender mismatches between the L1 and L2. Klassen (2016a,b) argues that the asymmetry between neuter and incongruent nouns can be accounted for by the language-specific nature of the neuter gender node in a system where symmetric genders across the L1 and L2 have a shared representation. Within this L1-L2 integrated representation (Salamoura & Williams, 2007), the activation of masculine and feminine gender nodes that are common to both the L1 and the L2as is the case with incongruent nounscreates interference in the response. This interference arises due to the competition for selection between the shared nodes, as the mismatching genders of the lexical items conflict symmetrically. In other words, the masculine gender of a word presented in the task interferes with lexical selection, as its translation equivalent in the other language has feminine gender. In contrast, with neuter nouns, neuter gender does not interfere with lexical selection, since it does not compete with a node available in the other language. Compared to congruent nouns (in which selection is facilitated by the L1 and L2 both activating the same shared gender node), and incongruent nouns (where selection is symmetrically inhibited), neuter nouns lead to asymmetric, one-sided interference only, which generates lower levels of response interference due to the gender class that is unique to the L2 (Figure 1). To examine whether this pattern of results emerging from lexical access studies extends beyond words produced and processed in isolation, this study also examines the role of neuter nouns in sentence processing in Spanish-German learners.
Gender and the NP in Spanish and German Both Spanish and German instantiate grammatical gender, with Spanish making a two-way distinction between masculine and feminine, while German displays the three-way distinction masculine, feminine, and neuter. Approximately half of the nouns in each of the languages are assigned to masculine gender, with the remaining half constituting either feminine or feminine and neuter (for Spanish: 52% masculine, 45% feminine (Bull, 1965); for German: 50% masculine, 30% feminine, 20% neuter (Bauch, 1971)).
Spanish is considered to have a largely transparent gender system: approximately two-thirds of nouns end in -o or -acorresponding to masculine or feminine, respectively, in more than 96% of instances (Teschner, 1987)while the remaining third end in -e or a consonant (Harris, 1991). Gender is similarly transparently marked on articles and adjectives. In contrast, German gender marking is rather opaque and non-predictable, offering only some probabilistic semantic or morphophonological regularities (e.g., Köpcke & Zubin, 1996). However, there are so many exceptions that L2 speakers must typically learn each noun gender individually. For this reason, gender is only clearly marked on articles and attributive adjectives, even though the transparency of gender marking is compromised by high levels of syncretism among case, number, and gender.
Of particular relevance to the present study are indefinite articles and attributive adjectives in nominative case. Spanish marks gender on these elements transparently and with unique entries in each instance, while German displays syncretism between masculine and neuter with indefinite articles, but unique adjectival endings by gender. These paradigms are illustrated in Table 1.
In addition, Spanish and German differ syntactically in the realization of attributive adjectives with respect to the relative order of the noun and adjective within the NP. Canonically, attributive adjectives appear postnominally in Spanish (Bosque & Picallo, 1996) but prenominally in German (Behaghel, 1923). As seen in Table 1, there is no overlap in the syntactic realization of adjectives between Spanish and German, while there can be lexical overlap in the gender of nouns for gendercongruent nouns. In addition, nouns can differ cross-linguistically with respect to gender in two ways: on the one hand, German nouns can have the opposite gender of their translation equivalent in Spanish; on the other hand, they can have an asymmetric (neuter) gender which is not available in Spanish.  The present study

Research questions
Against this backdrop, the present study investigates possible connections between lexical and syntactic processing. Specifically, we focus on the following research questions: RQ1: How do lexical gender and syntactic congruency interact during sentence processing in an L2?
As suggested by the Lexical Bottleneck Hypothesis, we expect to find that interactions of lexical and syntactic processing extend to the processing of ungrammatical sentences during reading to the extent that less taxing lexical processing will facilitate target syntactic processing. Specifically, non-target syntax, that is, postnominal attributive adjectives in German, should be easier to detect when the nouns are congruent in lexical gender class.
RQ2: Does the type of L1-L2 mappings of gender affect sentence processing?
Given the results emerging from studies on words processed in isolation (e.g., Klassen, 2016a,b), we expect that neuter nouns, for which there is no analogue in Spanish, will be processed differently from feminine and masculine incongruent nouns, given the contrast in the nature of cross-linguistic competition between masculine and feminine and the competition between masculine/feminine and the asymmetric gender with no representation in the L1 (neuter). Hence, ungrammaticality detection should be easier for neuter nouns than incongruent nouns. RQ3: How does L2 proficiency affect the processing of cross-linguistic overlap?
On the basis of previous research on word recognition and sentence processing, we expect that learners at lower proficiency levels will show greater congruency effects (e.g., Hopp & Lemmerth, 2018;Sá-Leite et al., 2020), since the L1 affects L2 processing to a greater extent at lower proficiency. Larger effects of gender congruency will attenuate or delay their sensitivity to ungrammaticality during sentence processing (e.g., Hopp, 2006;Jackson, 2008). As a consequence, we explore effects of proficiency in the present study by adding proficiency as a continuous predictor variable.

Design
To address these research questions, we developed a self-paced reading task in German in which lexical and syntactic cross-linguistic overlaps in Spanish (L1) and German (L2) were manipulated. In the experimental items, target manipulations focused on the gender of nouns at the lexical level and the relative order of attributive adjectives and nouns in NPs at the syntactic level. The manipulations at the level of syntactic overlap resulted in grammatical and ungrammatical sentences in German. Attributive adjectives appear prenominally in German, but postnominally in Spanishwith few exceptions. Thus, sentences that were grammatical in German (Adj-N) would be ungrammatical in Spanish, and those that were ungrammatical in German (N-Adj) would be grammatical in Spanish.
With respect to lexical gender overlap, nouns were either congruent between German and Spanish (masculine or feminine in both L1 and L2; MM & FF), incongruent (masculine-feminine mismatches between L1 and L2; MF & FM), or neuter (masculine or feminine in L1 and neuter in L2; MN & FN).

Participants
In total, 122 adults participated in this study: L1 Spanish speakers who were L2 learners of German (n=86) and L1 German speakers as the control group (n=36). Participants were recruited through German-teaching colleagues in Spain and via the authors' networks in both Spain and Germany. Due to travel restrictions at the time of testing, all participants completed the study over the internet, using the Gorilla Experiment Builder platform (Anwyl-Irvine et al., 2019).
All L2 German participants were adult L2 learners with no significant exposure to another language with grammatical gender. Prior to data analysis, we excluded L2 participants who had L1s in addition to Spanish (n=3), who did not complete the post-task (n=15), and who did not adhere to the instructions in the experiment (n=2). Table 2 shows the age and gender information for all remaining participants, as well as the proficiency means for the L2 group. Proficiency was assessed by a standardized 30-item written placement test of German (Goethe- Institut, 2010).
All speakers in the L1 control group were living in Germany at the time of testing and did not have any knowledge of Spanish. From the 36 native speakers, we excluded one participant who had an additional L1 to German.

Materials
For the reading study, we created 48 pairs of experimental items, as in (1), which contained a complex NP as a subject in an embedded clause. The main clause always consisted of a predicative adjective, followed by the conjunction denn ("because"), the NP, a copula verb, and two prepositional phrases (PPs), serving as spillover regions 1 . The order of nouns and adjectives was either grammatical (prenominal adjectives as in (1a)) or ungrammatical (postnominal adjectives as in (1b)) in German.  Jakubíček et al., 2013), number of letters, and number of syllables. In addition, 48 adjectives were selected and paired with the target nouns, also matching them as closely as possible by frequency, number of letters, and number of syllables across conditions (Appendix A). None of the adjectives selected were typically prenominal in Spanish. Subsequently, a total of 48 experimental sentences were created from the nounadjective pairings, each with a grammatical (1a) and ungrammatical (1b) version.
The task also included 16 grammatical filler sentences containing grammatical (postnominal) predicative adjectives (2) that served to mitigate possible task effects created by the ungrammaticality of postnominal attributive adjectives in the experimental items. There were a further 8 grammatical and 8 ungrammatical sentences containing negation. In all, the task consisted of 80 sentences, 48 (60%) of which were grammatical and 32 ungrammatical (40%) (Appendix B).
(2) Ich bin besorgt, denn die Wolke ist dunkel und gewaltig. I am worried, because the cloud is dark and huge "I am worried because the cloud is dark and huge." To ensure that the L2 speakers had sufficient knowledge of German gender, a gender assignment task including all German nouns in the experimental items was carried out following the reading task. In this 48-item post-task, participants selected the correct nominative definite article form (der M , die F or das N ) for each of the target nouns.

Procedure
Two lists were created such that each participant saw either the grammatical or ungrammatical version of each of the experimental sentences (1a vs 1b) in addition to the grammatical and ungrammatical filler items. In total, each participant was presented with a total of 48 grammatical and 32 ungrammatical sentences, preceded by five practice sentences to familiarize participants with the task. Sentences were segmented by phrases as in (3) 2 and presented as a noncumulative, moving-window self-paced reading task programmed using Gorilla Experiment Builder (Anwyl-Irvine et al., 2019). All sentences were presented in 18-pt Open Sans font.
( 3)  Each trial began with a fixation cross and the first segment appeared after 500ms, with the participants using the spacebar to advance through the segments. Following each sentence, either a yes/no comprehension question targeting the first PP (segment 5) appeared (for 28 out of 80 trials) or a blank screen displayed for 1000ms. All sentences were randomized for each participant by the software, and participants were offered a break at the midpoint in the task.
The experimental session was completed via the internet by each participant using their personal computer. Technical (timeouts) and attention (intermittent instructions to click a specific button) checkpoints were included in the programming in order to ensure the quality of remotely collected data. Participants who scored less than 75% on the attention checks prior to the reading task as well as those who took less than 3 or more than 30 minutes to complete the proficiency task were automatically prevented from continuing the experiment. 3 Participants provided informed consent, completed the proficiency test, the self-paced reading task, and the gender assignment task. Finally, they filled out a language background questionnaire. The entire session lasted approximately 45 minutes, and participants were sent 20 Euro gift cards upon completion.

Results
We excluded participants in the L2 group who had fewer than two correct gender assignments in at least one condition in the post-task (n=6). We used the data from the post-task to exclude participants based on the number of data points in each conditionthat is, participants had to have at least 2 correct gender assignments per conditionrather than excluding specific sentences for which the individual participant did not assign the correct gender, since target assignment is not relevant for gender congruency effects. L2 learners can link whatever gender of the noun is given in the input (i.e., the target German gender) to the gender of the Spanish translation equivalent, irrespective of whether they have target knowledge of German gender. Since the L2 participants are native Spanish speakers, we can be confident they know the gender of the Spanish translation equivalents. So if they see a German noun with feminine gender, they can link it to the translation equivalent in Spanish, which is either congruent or incongruent in gender. The congruency effects arising from this mapping are independent of whether the participants have a target representation of German gender. Data for the remaining 60 adult L2 learners of German and the 35 native speakers were analyzed.
In the gender assignment task, gender accuracy among the L2 speakers varied across gender classes, with feminine nouns having higher accuracy (77%) than masculine (67%) and neuter (65%) nouns. Gender accuracy did not vary by gender congruency, with congruent nouns showing similar accuracy (68%) to incongruent nouns, including neuter items (70%). In all, the L2 group demonstrated considerably above-chance (> 33%) gender knowledge of the critical nouns used in the experimental sentences.
For the analysis of the reading times, we excluded all segments with reading times below 200 ms and above 5000 ms. In total, these exclusions removed less than 7.2% of the data. We then log-transformed the reading times (natural logarithm) to adjust for the skewness of their distribution, since the Box-Cox procedure (Box & Cox, 1964) confirmed that a log transformation was appropriate. Table 3 shows the mean reading times by conditions and group.
To address RQ1 about effects of gender congruency, we first compared the read-   Jakubíček et al., 2013) as a scaled fixed effect. Since reading times in self-paced reading are subject to spillovers from one segment to the next, we also added the RTs of the previous segment as a fixed effect. Initially, we estimated models with a maximal random effects structure containing random slopes for Grammaticality, Congruency, and Frequency and their interactions on the random intercept of participants and random slopes for Grammaticality and Group and their interaction on the item intercept. We then used the "order" command in the buildmer package (version 2.3; Voeten, 2021) to obtain the maximal models that converge. The final models are reported by segment in Table 4. On top of main effects of reading times on the previous segment, frequency, group, and grammaticality, the model returned significant interactions of group and grammaticality in segments 3 and 4, with this interaction being qualified by a three-way interaction between group, grammaticality, and congruency in segment 4. In light of the interactions with group, we performed separate analyses of segments 3 and 4 for the L1 group and the L2 group. For the analyses of the L2 group, we added the proficiency score as a scaled fixed effect, including its interactions with grammaticality and congruency. For segment 3, analyses by group revealed that the L1 speakers demonstrated a highly significant main effect of grammaticality (ß = −0.069; SE = 0.012; z = −5.611, p < .001), while the L2 group did not demonstrate a main effect of grammaticality (ß = −0.013; SE = 0.011; z = −1.211, p = .226) or any interactions with it. No further effects or interactions reached significance in segment 3. For segment 4, Table 5 lists the models by group and by congruency for the L2 group. The L1 group did not show any significant effects beyond the main effect of grammaticality, while the L2 group demonstrated a main effect of grammaticality qualified by an interaction with congruency. As the subsequent comparisons by congruency show, the L2 group only evinced a main effect of grammaticality for sentences with congruent NPs, while there was no significant effect of grammaticality for sentences with incongruent NPs. For the latter, there was a trend for more highly proficient learners to make a difference between grammatical and ungrammatical sentences, as suggested by the marginally significant interaction of grammaticality and proficiency. In sum, the analyses suggest that the L2 group is less sensitive to the difference between grammatical and ungrammatical sentences than the L1 group, in that the L2 group only shows effects of grammaticality on segment 4 for sentences with gender-congruent NPs.   In order to answer RQ2 regarding the type of L1-L2 mappings, we compared sentences in the neuter condition with those containing incongruent nouns on the basis of our hypothesis that neuter nouns behave differently from incongruent nouns for L2 learners. Figure 4 graphs the reading times for neuter and incongruent nouns for the L1 group, and Figure 5 does so for the L2 group. As can be seen for the L2 group, the reading time differences are largely similar for neuter and incongruent nouns.
We fitted the same omnibus model as above to the data, the only difference being that the fixed effect of Congruency was defined as neuter (0.5) vs incongruent (−0.5). Table 6 shows the model output.
Beyond main effects of reading times on the previous segment, frequency, congruency, grammaticality, and group, the models returned two-way interactions between grammaticality and group on segments 3 and 4. Subsequent by-group models demonstrate that the main effects of grammaticality became significant for the L1 group in both segments (S3: β = −0.080; SE = 0.013; z = −6.183; p < .001; S4: β = −0.066; SE = 0.0149; z = −4.445; p < .001). In contrast, the L2 group did not show an effect of grammaticality in segment 3 (β = −0.010; SE = 0.013; z = −0.777; p = .437) and only a trend towards a main effect of grammaticality in segment 4 (β = −0.013; SE = 0.007; z = −1.791; p = .0733), which was qualified by an interaction with proficiency (β = −0.016; SE = 0.007; z = −2.237; p = .0253). Figure 6 plots the model output for the interaction between proficiency and the grammaticality effect in the L2 group across neuter and incongruent items in segment 4. It shows that more proficient L2 learners begin to make a difference between grammatical and ungrammatical sentences, while lower proficiency learners are not sensitive to grammaticality for sentences containing incongruent and neuter nouns.
As regards RQ3, then, the study finds an interaction between grammaticality and proficiency for sentences with incongruent and neuter NPs. This interaction illustrates that more highly proficient L2 learners tend to demonstrate longer reading times on segment 4 for ungrammatical versus grammatical sentences containing NPs that differ in gender between Spanish and German in being either of opposite or asymmetrical gender cross-linguistically.

Discussion
The aim of the current study was to investigate lexical gender effects in L2 syntactic processing. We manipulated the type of gender mapping between German and Spanish at the lexical level, as well as the syntactic grammaticality in the relative order of the noun and adjectives. We found that gender congruency interacts with the detection of syntactic ungrammaticality. L2 learners demonstrated grammaticality effects in processing sentences containing congruent nouns compared to those comprising incongruent nouns (RQ1). Second, there were no significant differences  in the processing of neuter nouns compared to incongruent nouns (RQ2). For both incongruent and neuter nouns, grammaticality effects emerged only at higher L2 proficiency levels (RQ3). With respect to RQ1, the study found that L2 learners demonstrate earlier and stronger slowdowns for ungrammatical sentences comprising congruent as compared to incongruent nouns, while there was no such effect for the monolingual controls. Hence, the difference between conditions likely reflects effects of the L1, in that the activation of the same gender node by both Spanish and German appeared to speed up the integration and detection of the syntactic ungrammaticality of postnominal attributive adjectives in German. For gender-congruent nouns, learners demonstrated slowdowns associated with ungrammatical syntax on the segment following the NP, unlike the native control group that evinced the effects on both the NP and the following segment. Such a delay in incremental reading effects is common among L2 learners, especially at less than near-native proficiency levels (e.g., Hopp, 2010). Overall, this finding of congruency effects suggests that L2 learners are more sensitive to mismatches between L1 and L2 word order during L2 sentence comprehension when the nouns embedded within the ungrammatical segment match in gender between the L1 and the L2.
In principle, these lexical effects of gender are consistent with gender congruency effects reported in word recognition and production (e.g., Lemhöfer et al., 2008;Paolieri et al., 2010) and extend these findings to sentence contexts. Crucially, in this study, lexical congruency interacts with syntactic grammaticality detection. These results underscore that gender congruency has consequences for syntactic processing.
This pattern of results cannot be exhaustively accommodated by genderintegrated approaches to the bilingual mental lexicon. If gender congruency was limited to lexical co-activation of both languages due to non-selective access to the lexicon, congruent nouns should lead to globally faster reading times compared to incongruent nouns, irrespective of grammaticality differences. While compatible with gender-integrated models, the finding that gender congruency had specific effects for the detection of ungrammaticality requires additional explanation, as for instance, that provided by the Lexical Bottleneck Hypothesis (LBH). According to the LBH, fast lexical access leaves more time and resources for syntactic computations in real-time sentence processing, while greater demands on lexical processing can delay or attenuate syntactic structure building in real time among L2 learners. The present findings are compatible with this account, in that gender-congruent nouns are processed with greater ease, and thus, readers show sensitivity to syntactic ungrammaticality. Gender-incongruent nouns, in contrast, consume greater lexical processing resources, as learners need to inhibit the concurrent activation of the mismatching gender node in L1 Spanish as they process the German noun. As a consequence, reading slowdowns for ungrammatical versus grammatical word orders are attenuated or absent compared to congruent nouns and give rise to non-native syntactic processing.
In this respect, the findings are broadly similar to the ones reported for predictive gender processing among L1 Russian learners of German by Hopp and Lemmerth (2018). In their study, high-intermediate L2 learners showed predictive use of gender only for lexically congruent nouns in the syntactically incongruent (article) condition, while there were no effects of lexical congruency for gender agreement processing in syntactically congruent contexts, that is, gender marking on prenominal adjectives. The authors account for this asymmetry by arguing that L2 learners' use of inflection is limited when the syntactic realizations of agreement differ between the L1 and L2 (see also Tokowicz & MacWhinney, 2005). However, lexical gender congruency can ease agreement processing by lowering the processing demands associated with the non-overlapping syntax.
The present findings suggest that these facilitative effects of lexical congruency attested for the predictive use of gender information in syntactic agreement processing extend to identifying ungrammaticalities in syntactic processing, even if these do not involve mismatches in grammatical gender agreement between the L1 and L2. Instead, gender congruency at the lexical level facilitates the sensitivity to word order differences between the L1 and L2. From a broader perspective, the study adds to previous findings that L1-L2 lexical overlap, for example, by virtue of cognates, leads to more target-like L2 sentence processing (e.g., Miller, 2014) and reduces the interference of the L1 syntax (Hopp, 2017). As such, it highlights interdependencies between lexical and syntactic processing during L2 sentence comprehension.
As regards RQ2, this study goes beyond previous research on L2 sentence processing by addressing the question of how the presence (or absence) of an analogous L1 gender for the gender class in the L2 affects syntactic processing. Masculine and feminine are present in both Spanish and German and can be straightforwardly mapped onto each other, while there is no clear mapping for the German neuter given the absence of a third gender class in Spanish. 4 In this study, there were no differences in reading times for sentences with neuter nouns compared to those with incongruent nouns. For both incongruent and neuter nouns, effects of grammaticality detection only surfaced at higher proficiency levels, which suggests that the lexical bottleneck posed by lexical incongruency in gender class widens with more experience and greater facility in the L1. As can be seen in Figure 6, as proficiency rises, the reading times for grammatical sentences tend to reduce. This seems to indicate that, as proficiency rises, the lexical competition between different gender classes in the L2 and L1 can be resolved more easily, which in turn allows the parser to integrate syntactic ungrammaticality signals incrementally. When the effects of lexical competition abate, evidence of target syntactic processing emerges.
The finding that neuter nouns pattern with incongruent nouns is inconsistent with Klassen's (2016a,b) findings for words processed in isolation. In an L2 picture naming task, L2 German learners of Spanish showed significantly less response interference arising from the co-activation of the Spanish gender when asked to produce bare nouns or NPs containing neuter nouns compared to incongruent feminine or masculine nouns. To the best of our knowledge, the present study is the first to examine the effect of such an asymmetry in the L1 and L2 gender systems at the level of L2 sentence processing. Unlike isolated picture naming, in the present study, processing does not differ for neuter and incongruent nouns in sentence contexts in terms of affecting L2 learners' ability to detect syntactic violations. This could be due to the fact that more processes are involved in the sentences presented to participants in this reading experiment as compared to picture naming, which is largely limited to lexical retrieval. Further research is required to determine the precise locus of the asymmetric neuter effect in word recognition and its possible replication with other asymmetric gender language pairings.
Overall, the interactions between lexical and syntactic processing observed in the present study attest that the creation of syntactic structure in real-time L2 comprehension is sensitive to the lexical properties of the elements involved therein, even when these properties are not relevant for building the structure as such (see also Hopp, 2016;Miller, 2014; for L1 processing, see Tily et al., 2010). In L2 processing, lexical aspects unique to bilingualism, for example, cross-linguistic gender congruency effects, impact the time-course and degree of ungrammaticality detection in word order. In this respect, they also underscore that non-native-like processing signatures in syntax are not necessarily rooted in problems with using syntax in real time (e.g., Clahsen & Felser, 2006), but may be caused by difficulties at processing stages that precede and subserve syntactic processing. Accordingly, the present findings highlight the need for an integrated study of the bilingual language system, encompassing both the lexicon and the grammar, in order to delineate aspects of adult L2 grammatical processing that are natural consequences of bilingualism from those that may reflect maturational constraints on late L2 acquisition.
A potential limitation of this study is the collection of data via the internet with participants using their personal computers. While recent analyses of online versus laboratory data collection have shown reliable results with both methods (e.g., Anwyl-Irvine et al., 2020), we further minimized risks associated with unsupervised participation by controlling the type of device permitted by the software (computers only), introducing technical and attentional checkpoints into the programming, and imposing strict inclusion criteria on the data in the analyses (see the Participants and Results sections for details). Even with discarding data from 27 participants, our study included a large sample (n = 95) that found robust effects comparable to laboratory-based experiments.
Another potential limitation lies in the fact that we only tested one bilingual group and that we therefore cannot definitively tie the findings to effects coming from L1 Spanish. However, given that the native German control group did not show congruency effects, noun and adjective frequency were controlled across conditions, and the congruent and incongruent nouns were indistinguishable in gender assignment accuracy in the post-task, there is no reason to believe that the different pattern of results across the conditions reflects item-level differences. Hence, we confidently relate the differential results between congruent, incongruent, and neuter nouns to L1 effects. Nevertheless, a further study could test a different L1 group whose L1 has prenominal adjectives to explore if similar effects of gender congruency on syntactic ungrammaticality detection can be observed, even if the ungrammatical word order in the L2 does not map to a licit word order in the L1.

Conclusion
In sum, this study reveals lexical gender effects in bilinguals' detection of ungrammatical word order in L2 sentence processing. It suggests that lexical bottlenecks in L2 sentence processing reduce sensitivity to ungrammatical syntax. These findings not only advance the state of the art but also open new avenues for further research at the crossroads between lexical gender and syntactic congruency in bilinguals.

1.
Denn was chosen in order to avoid verb-final word order in the embedded clause. 2. The mean number of letters and syllables in critical and spillover regions was balanced across sentences according to gender congruency and grammaticality conditions. 3. The minimum time to complete the proficiency test was based on the fastest native speaker completion times in piloting, and a maximum was established in order to prevent participants from stopping frequently to search for answers on the internet. 4. Considerable evidence from 2L1 children's code-switches between the article and the noun shows that pre-school-age children draw connections between masculine and feminine across Romance and Germanic languages. Findings such as those in Cantone and Müller (2008) and Eichler, Hager and Müller (2012) illustrate that in spontaneous speech, 2L1 Spanish-German, Italian-German, and French-German children have a general tendency to produce code-switched NPs in which there is cross-linguistic gender agreement between the article and the noun (e.g., una SP-fem Schlange GER-fem "a snake").
Appendix A A1. Frequency data for nouns (means) by language and condition.

Condition
German Spanish Frequency per million (log10)