Introduction
For successful language comprehension, new information has to be stored and linked to already existing information. The use of a pronoun suggests that its referent is already known, but pronouns themselves generally encode little information (apart from, e.g., person and number). Therefore, linking the pronoun to a referent is necessary to retrieve more information for the comprehension process. But how do comprehenders decide to which referent they link a particular pronoun? That is the central question in pronoun resolution research. To answer this question, many theoretical accounts appeal to a ranking of referents in terms of their accessibility or prominence (e.g., Grosz et al., 1995a; Ariel, 2001; Arnold, 2001; Arnold et al., 2007; von Heusinger and Schumacher, 2019). Others, however, have pointed out that accounts of prominence or accessibility as the determining factor in pronoun resolution risk being circular, as well as being unable to fully account for the complexity of the emerging picture (Kaiser and Trueswell, 2008; Hert et al., 2024; Bader and Portele, 2025).
A wide variety of factors can affect the likelihood of a certain referent being linked to a pronominal form, among them grammatical role (e.g., Alonso-Ovalle et al., 2002; Carminati, 2002; Crawley and Stevenson, 1990; Frederiksen, 1981; Fukumura and van Gompel, 2015; Gordon and Chan, 1995; Hert et al., 2024; Järvikivi et al., 2005; Kaiser, 2011a; Okuma, 2011; Song and Fisher, 2005) and information structure (e.g., Colonna et al., 2012, 2014, 2015; de la Fuente and Hemforth, 2013; Ellert, 2013; Xu, 2015). However, it is not quite clear why certain factors seem to have stronger effects than others. Moreover, the factors and their relative weights have been shown to differ between languages and individual pronominal forms (e.g., Bader and Portele, 2019a,b; Ellert, 2010; Ellert et al., 2011; Kaiser and Trueswell, 2008). Kaiser and Trueswell (2008) captured these findings in the form-specific approach, which states that multiple factors (e.g., subjecthood, focus) play a role in linking a specific referent to a specific pronoun, and that the degree of sensitivity to these factors varies across pronominal forms (e.g., personal vs. demonstrative pronouns, or overt vs. null pronouns).
In other words, what makes a referent more likely to form a link with the pronoun depends not only on the factors’ contribution to the referent’s accessibility, but also on the pronoun’s sensitivity to these factors.
The current study examines how changes in information structure affect the resolution of different pronouns in German for first (L1) and second language (L2) speakers. Experiment 1 tests how focus marking on preceding subject and object referents influences referential choice for the personal pronoun er and the demonstrative pronoun der for L2 speakers of German. In experiment 2, we employ prosodic focus marking in the form of accents on the pronouns themselves, comparing the subject pronoun er (“he”) and the object (accusative case-marked) pronoun ihn, and examine the effect of prosodic focus marking on referent selection for both L1 and L2 speakers. We concentrate on information structure for three reasons. First, the notions of prominence and accessibility are more or less explicitly connected to information structural concepts such as focus (Gundel et al., 1993; von Heusinger and Schumacher, 2019; Ladd and Arvaniti, 2023). Therefore, investigating the effect of information structure on pronoun resolution is a promising way of assessing prominence- and accessibility-based accounts of pronoun resolution (also see Hert et al., 2024). Second, information structure is a central factor connecting several other factors that have been found to affect pronoun resolution, such as grammatical role and position. Thus, while grammatical role and position generally coincide in fixed word order languages like English (e.g., in The baker called the tailor, the baker is both the subject and the first-mentioned entity, whereas the tailor is the object and second-mentioned), they can be disentangled in a flexible word order language like German (Järvikivi et al., 2005; Sauermann et al., 2011; Kaiser and Trueswell, 2004).
However, changes in word order mark differences in information structure (Frey, 2006; Weskott et al., 2011; Fanselow, 2015), which should therefore be manipulated directly. Third, as reviewed below, several studies have suggested that L2 speakers are more affected by information structure than L1 speakers (Okuma, 2011; Schimke and Colonna, 2016; Ellert et al., 2011), whereas one study suggests they are less likely to take information structure into account than L1 speakers (Abashidze et al., 2023). This disagreement warrants another look, especially since none of the existing studies explicitly controlled for prosody, which is the most natural way to mark information structure in many languages, including German. Our objective is therefore to contribute new insights to the literature on pronoun resolution and on L2 processing.
Pronoun resolution in L1 German
In German, the personal subject pronoun er (“he”) usually refers to the preceding subject referent (e.g., Bader and Portele, 2019b; Bouma and Hopp, 2007; Colonna et al., 2012; Hert et al., 2024). In addition to the personal pronoun, demonstrative pronouns like der are used anaphorically as well. The two pronominal forms (personal and demonstrative pronouns) have been found to differ in their preference regarding choice of referents. The difference between the two pronominal forms has been described in terms of complementary preferences for grammatical role as well as for information structure. Unstressed personal pronouns have been shown to prefer the subject/topical referent, whereas demonstratives are more likely to be linked to the object/non-topical referent (Bosch et al., 2003, 2007; Comrie, 1997; Diessel, 1999; Kaiser, 2011b). Importantly, in line with Kaiser and Trueswell’s form-specific approach, studies have shown that the extent of sensitivity toward these factors varies between the two pronominal forms: demonstratives are affected more by information structure than personal pronouns, whereas personal pronouns are influenced by grammatical role to a greater degree than demonstrative pronouns (e.g., Bader and Portele, 2019a, 2025; Hert et al., 2024; Kaiser and Trueswell, 2008; Kaiser, 2011c; Portele and Bader, 2016).
Regarding the difference between subject and object pronouns, Sauermann and Gagarina (2017) investigated the effects of word order and grammatical role parallelism (i.e., a subject referent preference for subject pronouns like er “he” and an object referent preference for object pronouns like ihn “him”) during online pronoun processing using eye-tracking. The gaze data showed an effect of grammatical role parallelism: with the subject pronoun er, participants fixated subject referents more than object referents, whereas the opposite pattern was found for the object pronoun ihn. This effect was present 750–2500 ms from pronoun onset, irrespective of word order. However, since recent research suggests that online preferences in the visual world do not necessarily reflect the final interpretation (Blything et al., 2021; Hert et al., 2024; Schumacher et al., 2016, 2017), it is not clear to what extent these findings reflect the final pronoun interpretation. In fact, Sauermann and Gagarina, who investigated eye gaze during 12 consecutive 250-ms time segments after pronoun onset, note that the effect of grammatical role was not present in the last two time segments, comprising 2500–3000 ms after pronoun onset. They further assumed that this effect might decrease during later processing, but did not collect offline responses on final interpretations.
Following Sauermann and Gagarina’s experimental design, Abashidze, Gagarina, and Bittner (2023) examined the influence of grammatical role and positional parallelism (i.e., a preference for the referent occupying the same syntactic position as the pronoun, regardless of grammatical role; see, e.g., Smyth, 1994) during online and offline resolution of subject and object pronouns in German. Their gaze data revealed an initial preference toward the subject referent for subject and object pronouns, which was higher for the object pronoun at first, but increased further over time for the subject pronoun. The offline results showed a preference for the subject referent with the subject pronoun, while for the object pronoun, referent choice was at chance level. Thus, similar to Sauermann and Gagarina, Abashidze et al. (2023) found that grammatical role strongly affects online processing. Unlike in Sauermann and Gagarina’s study, however, the gaze pattern for object pronouns also showed a subject referent preference. Abashidze et al. explain their results in terms of topicality. They see personal pronouns as a tool for topic continuation, meaning participants would be biased toward interpreting personal pronouns as topics. Resolution of subject pronouns in the sentence-initial topic position would be straightforward because grammatical role and topicality align. The preferences for object pronouns, on the other hand, would not be as clear, since grammatical role would point to the object as the referent, whereas the topic bias would point to the subject (the topic), which might explain the chance-level performance in the offline interpretation. However, unlike Sauermann and Gagarina (2017), Abashidze et al. (2023) only tested subject-verb-object (SVO) word order, which impedes the comparison between the results of the two studies.
Further, as word order is known to express information structure in German (Fanselow, 2015; Frey, 2006), the fact that Abashidze et al. (2023) did not manipulate word order or vary information structure in any other way makes it difficult to directly assess their claim that both subject and object pronouns preferentially refer to topics.
To date, there is no research on German concerning the difference in referent selection between unaccented and accented pronouns. The theoretical literature generally assumes that accented pronouns reverse the referent preference (Akmajian and Jackendoff, 1970). For example, in Rachel texted Monica and Ross called her/HER, the unaccented pronoun would refer to the object Monica, but the accented one to the subject Rachel. This pattern has been derived in both coherence-based accounts (e.g., Kehler, 2005) and Accessibility Theory (Ariel, 1988, 1990, 2001) (see also, e.g., Givón, 1983; Gundel et al., 1993; Kameyama, 1999; Smyth, 1994, for similar accounts). Experimental studies suggest that the reversal effect of accented pronouns is only present in certain contexts (Mozuraitis and Heller, 2017; Taylor et al., 2013), but there is no agreement on the precise conditions for it. Moreover, most of these findings are based on English, which only has one type of subject and object pronoun. Since German has different types of pronouns that differ in their degree of sensitivity to different factors (e.g., er vs. der, as detailed above), we may find differences in the resolution of their accented versions as well.
To summarize, the studies above show not only that personal subject pronouns are preferably interpreted as the preceding subject/topic and demonstrative pronouns as the preceding object/non-topic, but also that these pronouns differ in their sensitivity toward grammatical role and information structure (Abashidze et al., 2023; Bader and Portele, 2019a,b; Bosch et al., 2003, 2007; Kaiser, 2011b; Sauermann and Gagarina, 2017). As with the personal subject pronoun, the two determining factors for the resolution of the (personal) object pronoun are grammatical role (object) and topicality. However, it is not quite clear which of these two factors exerts a stronger influence. Finally, the effect of accenting the pronoun has not been investigated for German so far.
Pronoun resolution in L2
In research on L2 pronoun resolution, the focus has been on whether L2 speakers can perform like native speakers. In particular, null subject or pro-drop languages have been contrasted with languages that generally only allow overt pronouns. Since they differ in their referential options (while pro-drop languages utilize overt and null pronominal forms, non-pro-drop languages only employ overt forms), it has been questioned whether learners can fully acquire the differences in the use of the pronominal forms. However, when compared to native speaker control groups, differences in the use, interpretation, and processing of null and overt pronouns have been found for L2 speakers regardless of whether their L1 is a non-pro-drop or a pro-drop language (Belletti et al., 2007; Lozano, 2018; Okuma, 2011; Polio, 1995; Roberts et al., 2008; Sorace and Filiaci, 2006). While some findings suggest that L1 influence can emerge when trying to resolve ambiguity (e.g., Roberts et al., 2008), other findings reveal that even when both L1 and L2 are pro-drop languages, L2 learners do not necessarily benefit from this similarity (e.g., Lozano, 2018; Polio, 1995). This means that differences in L2 pronoun resolution cannot entirely be explained by the difficulty of switching from a non-pro-drop to a pro-drop system or vice versa.
Turning now to German, several studies have investigated pronoun resolution in L2 learners. While German generally does not allow null subject pronouns, demonstrative pronouns can be used anaphorically, as previously mentioned. Thus, in German too, there are different referential forms with different underlying referential functions, i.e., similar to null and overt pronouns, they are affected to different degrees by grammatical roles and information structural roles (see section 1.1). The differences in interpretation between personal and demonstrative pronouns may not be obvious to learners because the anaphoric use of the demonstrative pronoun often gets overlooked in grammars (cf. Ahrenholz, 2007) and may therefore not even be part of the L2 acquisition process. So far, three studies have investigated the different pronominal forms explicitly and considered the effect of information structure in non-native German speakers’ pronoun resolution.
Ellert, Roberts, and Järvikivi (2011) investigated the effect of topicality on the resolution of subject personal and demonstrative pronouns (er and der) in L2 German. The L2 speakers’ L1 was Dutch, which allows demonstrative pronouns to be used anaphorically and shows the same referent preferences of personal and demonstrative pronouns as German (e.g., Bosch et al., 2007; Kaiser, 2011c). In their items (e.g., Der Schrank ist schwerer als der Tisch. Er/Der stammt […]. “The cabinet is heavier than the table. It comes […].”), the first-mentioned NP (Schrank “cabinet”) is assumed to be topical, whereas the second-mentioned NP (Tisch “table”) is non-topical. The results revealed that L2 learners linked both pronouns to the topical referent, which is different from the L1 preference (both in Dutch and German) to resolve the personal pronoun toward the topical referent and the demonstrative toward the non-topical referent. Nevertheless, even though L2 speakers showed a topic preference for both pronouns, the preference was stronger for personal than for demonstrative pronouns. This was observed in their online eye-tracking data as well as in the offline comprehension questionnaire. These results suggest that the referential function of the individual pronouns also played a decisive role for L2 speakers. Ellert (2010) further suggests that proficiency is a crucial factor in L2 resolution of personal and demonstrative pronouns. She observes that less proficient learners used both pronominal forms for the same function (i.e., linked to topical referents), whereas highly proficient L2 learners differentiated distinct functions for the personal pronoun (i.e., linked to topics) and for demonstrative pronouns (i.e., linked to non-topics).
In contrast to Ellert et al. (2011), who investigated topicality, Patterson, Esaulova, and Felser (2017) conducted three experiments to examine how focus affects the resolution of the within-sentence subject pronoun er in both native and non-native German speakers, as well as native Russian speakers. Focus was established through the use of cleft constructions and focus-sensitive particles. The results indicated a distinct contrast between native and non-native speakers. Specifically, native speakers of German and Russian were less likely to link a pronoun with a referent in focus (via cleft) when compared to a non-focused referent in the same position. In contrast, non-native speakers did not display this effect, but rather tended to resolve a pronoun toward referents appearing with a focus-sensitive particle. Thus, L2 speakers showed sensitivity to focus marking. Since results were the same for both L1 groups, the L2 group’s divergence cannot be explained by possible L1 influences. Lastly, Wilson (2009) tested word order and grammatical role effects on the resolution of German personal and demonstrative subject pronouns in L1 and L2 speakers (L1 English). The results showed that L2 speakers preferred the first-mentioned (topical) referent with the personal pronoun, while L1 speakers showed no preference. For the demonstrative pronoun, L2 speakers had no preference, whereas L1 speakers linked it to the second-mentioned (non-topical) referent. However, Wilson mentions that her manipulation of word order may not have triggered changes in information structure as intended, since her stimuli did not employ appropriate prosody.
As for the resolution of object pronouns, German as L2 has not yet been extensively researched. The above-mentioned study by Abashidze et al. (2023) contrasted L2 speakers (L1 Georgian) with native German speakers. In the gaze data, they found L2 speakers to attend more to the subject referent than the object referent after a subject pronoun, which corresponded to L1 speakers’ gaze pattern. For object pronouns, L2 speakers fixated the object referent more than L1 speakers. In the offline results, L2 speakers showed the same tendency as L1 speakers, namely, selecting the subject referent more often for the subject pronoun than for the object pronoun. However, L2 speakers preferred the object referent with the object pronoun, whereas L1 speakers did not show a preference for either referent. Abashidze et al. conclude that while L1 speakers’ preference is affected by grammatical role and topicality, L2 speakers may have difficulties employing information structural cues and hence rely only on grammatical parallelism during pronoun resolution. Note, however, that their study did not manipulate information structure directly, whereas studies that directly investigated information structural effects showed that L2 German speakers were sensitive to changes in information structure (Ellert et al., 2011; Patterson et al., 2017). This has also been shown for L2 speakers of other languages. For example, Schimke and Colonna (2016) suggest that L2 learners might rely on discourse-level information to a greater extent than L1 speakers when interpreting pronouns (also see Okuma, 2011).
In sum, the studies presented in this section show that native and non-native speakers are affected by grammatical role and information structure when interpreting pronouns. However, these factors seem to be weighted differently in L2 than in L1 speakers. The underlying cause for the differences in weighting is not obvious. While proficiency plays an important role in L2 speakers achieving a more native-like performance (e.g., Ellert, 2010; Lozano, 2018; Polio, 1995), the role of L1 influence is not certain (cf. Ellert et al., 2011; Patterson et al., 2017; Roberts et al., 2008; Lozano, 2018; Polio, 1995). However, language influence in the sense of dominance/proficiency cannot completely be disregarded, as Tsimpli et al. (2004) have shown that even native speakers under attrition can behave more like L2 than L1 speakers. This is also indicated in Roberts et al.’s (2008) study, where L2 speakers of different L1s showed similar online processing patterns, but deviated in their final interpretation. Thus, these findings suggest that the effect of language proficiency should be further investigated together with the role of information structure.
The role of prosodic focus marking in L1 and L2 speakers
Intonation is commonly used as an indication of information structure. For instance, it can mark whether an element of a sentence has been introduced in the previous discourse (Schwarzschild, 1999), whether that element is new (i.e., update of the common ground, Lambrecht, 1994) or whether that element indicates the relevance of alternatives (Roberts, 2012; Rooth, 1985, 1992). In German, while focus is associated with a falling accent (H*(+L)) (e.g., Baumann, 2006; Büring, 1997; Féry, 1993), topics are connected to rising accents (L*+H), especially when contrastive (e.g., Féry, 1993; Büring, 1997; Braun, 2006; Repp and Drenhaus, 2015). Focus is acoustically marked with a wider pitch range, an increased intensity, and increased duration compared to other speech elements that are not in focus (Féry and Kügler, 2008).
The literature suggests that native speakers are able to identify and integrate prosodic information to build information structure in real time (Heim and Alter, 2006; Wang et al., 2011). An event-related potential (ERP) experiment in German (Hruska and Alter, 2004) revealed an increased N400 response at words that were expected to carry a focus pitch accent but did not. In eye-tracking studies, contrastive focus marking has been shown to trigger anticipatory eye movements, e.g., hearing blue ball followed by GREY raises expectations that the upcoming noun will also be ball (Ito and Speer, 2008; Ito et al., 2014), which in turn can support target search. Similarly, it has been observed that in H*L (focus) conditions, the initially higher proportion of looks directed toward the competitor decreases earlier compared to L*H (non-focus) conditions (Chen et al., 2007; Sedivy et al., 1999). This shows that native listeners can make predictions about upcoming referents in real time using prosodic cues.
For L2, on the one hand, some studies suggest that L2 learners might have difficulties producing and perceiving prosodic cues, particularly if these differ from prosodic cues in their L1 (Mennen and De Leeuw, 2014). Akker and Cutler (2003) found that L2 Dutch learners of English were not able to map pitch accents to semantic information as effectively as native speakers of English, even though the use of prosodic cues for information structure is similar in Dutch and English (also see Chen and Lai, 2011). On the other hand, Takahashi et al. (2018) found shorter reaction times for sentences with felicitous contrastive pitch accent as compared to infelicitous use of contrastive pitch for L2 Chinese learners of English. The authors assumed that the effective use of English contrastive prosodic cues in L2 speakers stemmed from similarities in pitch cues to focus in English and Mandarin Chinese. ERP studies (Reichle, 2010; Reichle and Birdsong, 2014) revealed that L2 proficiency can affect the online perception of information structure. Unlike low-proficiency L2 English learners of French, high-proficiency learners showed a native-like anterior negativity response for contrastive focus. Perdomo and Kaan (2021) looked into the effects of proficiency and working memory on L2 information structure processing. They found that while L2 speakers used prosodic information to build information structure during listening, neither proficiency nor working memory influenced L2 speakers’ use of contrastive pitch accent to predict or process the following noun phrase.
Thus, the evidence so far suggests that L2 speakers show sensitivity to modulations of L2 prosody, but may experience difficulties in effectively using the prosodic information for the subsequent discourse. As to the role of proficiency, its effect on L2 performance is not yet clear.
The role of prosodic focus marking in L2 pronoun resolution has not been investigated so far (but see Tsoukala et al., 2024, for effects of implicit prosodic rhythm cues in L2 English). The present study is intended to fill this gap.
Hypotheses
Various accounts have been put forward attempting to explain observed differences between L1 and L2 language processing, such as the interface hypothesis (Sorace and Filiaci, 2006; Sorace, 2011) or the shallow structure hypothesis (Clahsen and Felser, 2006, 2018). Despite making different predictions about when difficulties for L2 speakers arise, what these accounts have in common is that they describe L1-L2 differences in terms of difficulties in applying information during online processing. Cunnings (2017) proposes that during the cue-based memory retrieval processes, similarities among the cues may interfere and lead to differences in weighting of the cues, which in turn may result in differences during processing. Cunnings’s approach fits particularly well with the idea of the form-specific account that different syntactic, pragmatic, and discourse-level cues are weighted to render each pronoun’s referent (cf. Kaiser and Trueswell, 2008; Kaiser, 2017). Since languages and individual pronouns may differ in how they weight cues, bilingual speakers may have more weighting options available. Moreover, discourse-based cues seem to generally be weighted more strongly in L2 than L1 processing (cf. Schimke and Colonna, 2016). This could be explained in terms of cue-weighting as follows: Prosodic focus marking results in increased attention to focused referents during processing and memory retention (Hert et al., 2024; Káldi and Babarczy, 2021). Therefore, using attention, an explicit link can be established to pronoun resolution: A prosodically focus-marked referent may receive a boost in its memory representation, which in turn will make establishing the link between it and a pronoun easier.
In L1 processing, this does not determine the ultimate likelihood of establishing such a link, since pronoun resolution is generally determined by grammatical characteristics of individual pronouns, e.g., subject pronouns like English he and German er are linked to preceding subject referents (Foraker and McElree, 2007; Blything et al., 2021; Hert et al., 2024). For L2 speakers, for whom cue-weighting is generally expected to be challenging, the effect of focus marking can be expected to be even stronger, and it is possible that, unlike L1 speakers, L2 speakers are generally more likely to select focused referents in pronoun resolution.
Thus, the first overarching hypothesis tested in the present study is that L2 speakers are more sensitive to information structure than L1 speakers in processing pronouns. In particular, we predict them to be more likely to select focused entities as pronoun referents than L1 speakers. Our second overarching hypothesis, based on previous findings (e.g., Ellert, 2010; Lozano, 2018; Wilson, 2009), is that proficiency will affect L2 speakers’ pronoun processing, such that more proficient L2 speakers will be more similar to L1 speakers than less proficient ones. Further, we predict proficiency to be more important than type of L1 in the L2 speaker group, since L1 type does not necessarily lead to a native-like performance (see Polio, 1995) and even native speakers under attrition behaved more similarly to L2 speakers than to native controls (Polio, 1995; Tsimpli et al., 2004). The third overarching hypothesis to be tested is therefore that proficiency is more important for L2 speakers’ performance than whether their L1 is a pro-drop or non-pro-drop language.
More detailed predictions are derived in the individual sections for the two experiments below. Both experiments were approved by the Research Ethics Board 2 of the University of Alberta (study ID Pro00105075).
Experiment 1: Interpretation of er and der in L2 speakers
In experiment 1, we investigate whether L2 speakers’ referential choice can be aided by focusing on possible referents. That is, can their referent preference be biased toward one referent if prosody explicitly marks that referent as focused in the discourse context (see section 3.1.2 for an example)? In accordance with the first overarching hypothesis, we predict that the answer is yes, and that focused referents will be chosen more often than those that are not focused.
Further, we want to examine whether L2 speakers are sensitive to the different referential functions of the subject pronouns er and der. Previous L2 research suggests that L2 speakers differ from L1 speakers in their resolution of the demonstrative pronoun der, but resemble L1 speakers more closely with the personal pronoun er (e.g., Ellert et al., Reference Ellert, Roberts, Järvikivi, Spiegel and Krafft2011). Therefore, we predict interpretation of the personal pronoun er in L2 speakers to be similar to L1 speakers, i.e. they will show a preference for the subject referent, but we predict a less clear preference for the demonstrative pronoun der.
As for proficiency, we predict that L2 speakers’ referent selection interacts with their level of proficiency, resulting in an increasingly native-like performance with increasing levels of proficiency, as per the second overarching hypothesis. This means that more proficient L2 speakers should select the subject referent more often than less proficient ones, especially for er. Finally, in accordance with the third overarching hypothesis, we predict that proficiency will have a stronger effect on the results than whether participants’ L1 is a pro-drop language.
Method
Design and materials were identical to those used with L1 speakers in experiment 2 reported in Hert et al. (Reference Hert, Järvikivi and Arnhold2024).
Participants
In total, 80 participants with various L1s (see Table 1 and footnote 3) completed the experiment via Prolific for monetary compensation (£9/h). For the analysis, we excluded 15 participants based on their high error rate with filler items (more than 60% incorrect), resulting in 65 participants (age range: 19–62, mean: 33, sd: 12.4). Participants indicated that they had learned German in language classes in school or university. At the time of testing, participants had spent all or the majority of their lives in their respective L1 country.
Table 1. L1s and number of speakers for experiment 1

Materials
We designed a comprehension task where participants listened to short dialogues. Two experimental factors were manipulated: (i) whether the dialogue contained an unaccented personal pronoun er or an unaccented demonstrative pronoun der; (ii) which possible referent for the pronoun was focused (subject, object; see Table 2 for an example). Information structure was manipulated in a twofold way, coupling prosodic focus marking with changes in the context that licensed the prosodically indicated information structure. A total of four conditions were tested, and we used ten sentences per condition, which resulted in 40 experimental dialogues.
Table 2. Example dialogue with critical manipulation in all four conditions. Prosodic focus marking in italics, unaccented ambiguous pronoun in bold. Contexts were identical for all conditions except where indicated with slashes and condition names in brackets. Note that information structure was manipulated prosodically in the critical sentence, as well as in the preceding context

The dialogues were recorded using Shure SM10A headset microphones in a sound-attenuated booth by two native speakers of German, one female and one male. The female speaker (the second author, a prosody researcher) recorded all the introductions and critical sentences (A-turns in Table 2) for the experimental items, while the male speaker recorded these parts for the filler items. The speaker uniformly produced a single falling accent on the focused constituent, while the rest of the sentence remained unaccented (represented as H* L-% in GToBI notation; Grice et al., Reference Grice, Baumann, Benzmüller and Jun2005, Reference Grice, Baumann, Ritter and Röhr2017).
The experimental items were distributed across four lists in a Latin square design. Additionally, we constructed 40 filler items that were the same across all lists. These fillers also contained four possible referents, but unlike the experimental items, we did not include any ambiguous pronouns.
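The Latin square distribution described above can be illustrated with a minimal sketch. The rotation scheme and all names below are assumptions made for illustration; the text does not specify how items were actually assigned to lists. The sketch only shows the defining property of the design: each list contains ten items per condition, and each item appears in a different condition on each list.

```python
from collections import Counter

# The four conditions of experiment 1 (Pronoun x focused referent).
CONDITIONS = [
    ("er", "subject focus"),
    ("er", "object focus"),
    ("der", "subject focus"),
    ("der", "object focus"),
]
N_ITEMS = 40  # experimental dialogues


def build_lists(n_items, conditions):
    """Rotate conditions over items so that each list is balanced.

    Hypothetical assignment rule: item i on list l receives
    condition (i + l) mod number_of_conditions.
    """
    n_conditions = len(conditions)
    return [
        [(item, conditions[(item + list_idx) % n_conditions])
         for item in range(n_items)]
        for list_idx in range(n_conditions)
    ]


lists = build_lists(N_ITEMS, CONDITIONS)

# Number of items per condition on each list.
per_list_counts = [Counter(cond for _, cond in lst) for lst in lists]
for list_idx, counts in enumerate(per_list_counts, start=1):
    print(f"List {list_idx}:", dict(counts))
```

Under this scheme, a participant assigned to one list hears every item exactly once and contributes the same number of observations to each condition.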
Note that, like prior research on German pronouns, we only use masculine third-person singular pronouns in the experimental items. This is because the feminine third person singular pronouns sie and die are the same in nominative and accusative cases, and are additionally homonymous with the third person plural and the polite second person singular pronouns, which would induce additional ambiguity. The possible referents in all target items were grammatically masculine occupation names, which are interpreted as referring to men in German (Horvath et al., Reference Horvath, Merkel, Maass and Sczesny2016), meaning all of them were equally suitable referents for the target pronouns.
Procedure
The experiment was created with the jsPsych framework for carrying out online experiments (version 7.2.1, de Leeuw, Reference de Leeuw2015). The participants were given a brief written explanation of the tasks they were about to complete. First, participants filled out a questionnaire about their language background. Additionally, we included the German LexTALE (Lemhöfer and Broersma, Reference Lemhöfer and Broersma2012) as a measure of L2 speakers’ vocabulary knowledge (see Table 3). Afterwards, a screen with instructions appeared, asking participants to carefully listen to the dialogues. They were also given the chance to check their speakers’/headphones’ volume before starting the task.
Table 3. LexTALE scores (raw) for L2 speakers, including range, mean, and standard deviation

While listening to the dialogues, participants saw the names of the four mentioned referents on the screen. Following each dialogue, they saw a question on the screen probing to which of the two target referents, subject or object, the pronoun referred (see last row in Table 2). We also included the other two referents as possible responses to ensure that participants paid attention during the experiment. Participants gave their answer by clicking on one of the names on the screen. The positions of the referents’ names on the screen were randomized for each list. Halfway through the experiment, participants were given a break.
Results
We performed generalized linear mixed-effects regression modeling (GLMER) using the lme4 package (version 1.1-35.1, Bates et al., Reference Bates, Mächler, Bolker and Walker2015) in the software R (version 4.3.0, R Core Team, 2023) to analyze the participants’ responses. The models included a binomial dependent variable coding whether the participant chose the subject or the object as the referent of the pronoun. We therefore excluded 111 responses choosing a distractor referent, i.e. 4.27% of the data, leaving 2489 data points for analysis. We added a three-way interaction for Condition (subject focus vs object focus), Pronoun (er vs der), and LexTALE (centered). L1 type (a binary variable coding whether participants’ L1 was a pro-drop language) was also included as a fixed effect (a more complex model with a four-way interaction did not converge). Opting for a backward-fitting procedure, we excluded fixed factors one by one to see whether they significantly contributed to the model fit. Neither L1 type nor any interactions between Pronoun, Condition, and LexTALE affected the model’s fit significantly and therefore the fixed effect L1 type and the interactions were excluded from the final model shown in Table 4, which only indicates significant effects of Condition, Pronoun, and LexTALE. Moreover, the model contained by-Participant random slopes for Pronoun, as well as a random intercept for Item. Models with more complex random effects structures did not converge.
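As a quick check on the figures reported above, the following sketch recomputes the number of analyzed data points and the percentage of excluded responses, and illustrates what centering the LexTALE score means. It assumes that each of the 65 retained participants responded to all 40 experimental dialogues, which is consistent with the reported totals; the example scores are hypothetical and are not the study's data.

```python
# Figures reported in the text for experiment 1.
N_PARTICIPANTS = 65   # participants retained after exclusions
N_ITEMS = 40          # experimental dialogues per participant (assumed complete)
N_DISTRACTOR = 111    # responses choosing a distractor referent


def exclusion_summary(n_participants, n_items, n_excluded):
    """Return (total responses, responses kept, percentage excluded)."""
    total = n_participants * n_items
    kept = total - n_excluded
    pct_excluded = round(100 * n_excluded / total, 2)
    return total, kept, pct_excluded


def center(scores):
    """Subtract the sample mean so that 0 corresponds to the average score."""
    mean = sum(scores) / len(scores)
    return [s - mean for s in scores]


total, kept, pct_excluded = exclusion_summary(N_PARTICIPANTS, N_ITEMS, N_DISTRACTOR)
print(total, kept, pct_excluded)

# Hypothetical LexTALE scores, for illustration only.
example_scores = [50.0, 60.0, 70.0]
print(center(example_scores))
```

With a centered predictor, the model intercept describes a participant with an average LexTALE score rather than a participant with a score of zero.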
Table 4. Fixed effects for best fitting generalized linear mixed-effects model of referent choice for er and der

As illustrated in Figure 1, the subject referent was chosen significantly less often in the object focus condition compared to the subject focus condition (the intercept in Table 4). With regard to the pronouns, participants selected the subject referent significantly less often for der than for er. Moreover, participants’ LexTALE score predicted their referent choice significantly. Higher scores correlated with an increase in subject preference.

Figure 1. Referent choice for er and der by condition, with error bars for standard error.
In addition, we compared the subject preference for the two pronouns in both prosody conditions to chance level using one-sample Wilcoxon signed rank tests and found the difference to be significant only for er in subject focus, where subject preference was significantly above chance level, and der in object focus conditions, where it was significantly below chance level (for both p < 0.05).
Discussion
As predicted by our first hypothesis, the L2 speakers were sensitive to focus marking and chose focused referents more often than non-focused ones. Unlike for L1 speakers (cf. Hert et al., Reference Hert, Järvikivi and Arnhold2024), the effect of information structure did not significantly interact with the factor Pronoun. Thus, while effects of focus were attenuated and ultimately overridden by the individual pronoun’s referent preference in L1 speakers, the effect of focus on L2 speakers was the same for both pronouns. Figure 1 depicts a preference for the subject referent in the subject focus condition with the personal pronoun er, and a preference for the object referent in the object focus condition for the demonstrative pronoun der. These two preferences were not as pronounced as for L1 speakers (cf. Hert et al., Reference Hert, Järvikivi and Arnhold2024). Nonetheless, the results show that L2 speakers were sensitive to the two different referential forms: er was more likely to be resolved toward the subject referent, and der was more often linked to the object referent. Similar to L1 speakers, there was no preference for either referent with der in the subject focus condition for the L2 group. However, unlike L1 speakers who still preferred the subject referent, L2 speakers did not show a referent preference for er in the object focus condition. As hypothesized, L2 speakers do not show a strong preference for der, but, surprisingly, there was also no clear preference for er. This can be ascribed to the effect of information structure: In short, for both pronouns, a preference for one referent emerged only when grammatical role and focus marking were combined, and, as hypothesized, L2 speakers were more swayed by focus marking than L1 speakers.
In line with the second hypothesis, more proficient L2 speakers chose the subject referent more often than less proficient speakers. In addition, while proficiency had a significant effect, L1 type did not, supporting the third hypothesis.
Experiment 2: Interpretation of er and ihn in L1 and L2 speakers
In experiment 2, we test L1 and L2 speakers’ referential choice for the subject pronoun er and object pronoun ihn. We manipulated whether pronouns were unaccented or accented, as for the subject pronoun in (1), to investigate whether prosody would affect referent selection and if L1 speakers would differ from L2 speakers (see section 4.1.2 for full example and all pronoun conditions).
(1) Der Arzt bringt den Koch mit einer Clownsnase zum Lachen, als er/ER die Musikerin mit der Kamera filmt.
“The doctor makes the cook laugh with a clown’s nose, when he/HE films the musician with the camera.”
As discussed in section 1.1, unaccented personal subject pronouns are preferably resolved toward the preceding subject referents (e.g., Abashidze et al., Reference Abashidze, Gagarina and Bittner2023; Bader and Portele, Reference Bader and Portele2019b; Bouma and Hopp, Reference Bouma and Hopp2007; Colonna et al., Reference Colonna, Schimke and Hemforth2012; Hert et al., Reference Hert, Järvikivi and Arnhold2024), whereas for object pronouns, a preference for object referents (Sauermann and Gagarina, Reference Sauermann and Gagarina2017) or subject and object referents (Abashidze et al., Reference Abashidze, Gagarina and Bittner2023) was observed in L1. Therefore, for L1 speakers, we expect a clear preference for linking the unaccented subject pronoun to the subject referent. For the unaccented object pronoun, if its interpretation is indeed affected to the same extent by multiple factors as proposed by Abashidze et al. (Reference Abashidze, Gagarina and Bittner2023), then we expect no preference for either referent. With respect to accented pronouns, previous findings suggest that whether or not the accent leads to a reversal in referential choices depends on alternatives being explicitly available in the previous discourse (Mozuraitis and Heller, Reference Mozuraitis and Heller2017). Our experimental items include alternative referents, which should enable the reversal of accented subject pronouns. For the object pronoun, we assume no preference following Abashidze et al. (Reference Abashidze, Gagarina and Bittner2023) and Sauermann and Gagarina (Reference Sauermann and Gagarina2017), and accenting the pronoun should therefore not lead to a reversal in referent preference.
Turning to L2 speakers, considering the research presented in section 1.2, we assume referent choice to be similar to that of L1 speakers with unaccented pronouns, but the preference may not be as pronounced as with L1 speakers. Regarding accented pronouns, in line with our first hypothesis, experiment 1 showed that L2 speakers are highly sensitive to focus marking, where it helps them with the task of resolving ambiguity of pronouns, even though this can lead to non-native like preferences. But will the effect of information structure marking be equally pronounced when it applies to the pronoun itself rather than highlighting one of the possible referents, and thus not directly aiding with ambiguity resolution?
In addition to the accent, we included an order of mention manipulation for the object pronoun ihn. This was done so that we could target possible effects of parallel position (cf. Abashidze et al., Reference Abashidze, Gagarina and Bittner2023; Sauermann and Gagarina, Reference Sauermann and Gagarina2017). Note that while Abashidze et al. (Reference Abashidze, Gagarina and Bittner2023) and Sauermann and Gagarina (Reference Sauermann and Gagarina2017) have looked into the effect of parallel position as well, they did not manipulate the order of mention of the pronoun.
Lastly, in relation to previous work on L2 pronoun resolution mentioned in section 1.2, we also considered the role of proficiency and included a measure of vocabulary knowledge. Following previous findings (e.g., Ellert et al., Reference Ellert, Roberts, Järvikivi, Spiegel and Krafft2011; Lozano, Reference Lozano2018; Wilson, Reference Wilson2009), we assume that L2 speakers’ performance correlates with their level of proficiency. This means the higher their score on the proficiency measure, the closer their performance on referent selection should be to that of the native speakers, as stated by our second overarching hypothesis. Moreover, following the third hypothesis, we expect proficiency to be of greater importance for pronoun resolution than L2 speakers’ L1 type.
Methods
Participants
A total of 249 participants were recruited via Prolific as well as from the University of Kaiserslautern. After excluding German bilingual participants and participants who scored less than 60% correct on the non-ambiguous filler items (see below), data from a total of 220 L1 (n = 113; age range: 18–69, mean: 31, sd: 10.47) and L2 (n = 107; age range: 19–66, mean: 30, sd: 10.37) participants were analyzed. L2 participants varied in their L1s (see Table 5; pro-drop: n = 94). All L2 participants learned German in school or university language classes. Prolific participants received monetary compensation (£9/h), while participants from the University of Kaiserslautern received a 10€ gift card as compensation for their participation.
Table 5. L1s and number of speakers for L2 participants in experiment 2

Materials
For the experimental items, we created 42 mini stories containing either a subject or an object pronoun; see example in (2). The first sentence introduced a feminine referent, followed by two masculine referents. The second sentence contained additional information about the first sentence, but did not include any of the referents. The third sentence was made up of a main clause, which repeated the two masculine referents, one as subject and the other as object (marked with italics in (2)), and a subordinate clause, which contained an ambiguous pronoun (bold): either a subject pronoun (see 2a) or an object pronoun in first- or second-mention position (see 2b and 2c). Additionally, the pronouns were either unaccented or accented (upper-case letters).
(2) Die Fotografin, der Buchhalter und der Radiosprecher haben für Silvester eine kleine Feier geplant. Es gibt auch ein Feuerwerk. Der Buchhalter überwacht den Radiosprecher beim Zünden der Raketen, als
“The photographer (feminine), the bookkeeper (masculine) and the radio announcer (masculine) have planned a small party for New Year’s Eve. There will also be fireworks. The bookkeeper supervises the radio announcer firing the rockets, when”
a. er/ER die Fotografin achtsam anstupst. (subject pronoun) “he/HE carefully nudges the photographer.”
b. ihn/IHN die Fotografin achtsam anstupst. (object pronoun first) “the photographer carefully nudges him/HIM.”
c. die Fotografin ihn/IHN achtsam anstupst. (object pronoun second) “the photographer carefully nudges him/HIM.”
The mini stories were recorded using a Shure SM10A headset microphone in a sound-attenuated booth by a native speaker of German. The speaker (the second author, a prosody researcher) uniformly produced a contrastive rising accent on the accented pronoun, which would be represented as L*H following Féry (Reference Féry1993) and as L+H* in GToBI notation (Grice et al., Reference Grice, Baumann, Benzmüller and Jun2005, Reference Grice, Baumann, Ritter and Röhr2017). Accented pronouns were approximately four times longer in duration than unaccented pronouns. Figure 2 shows an example of a pronoun in unaccented (Figure 2a) and accented (Figure 2b) conditions.

Figure 2. Prosodic contours for unaccented and accented pronouns.
The target items were distributed across six lists following a Latin square design. Additionally, we constructed 28 filler items that were the same across all lists. These fillers also contained three referents, but unlike the experimental items, they had two feminine and one masculine referent and were followed by a question asking about the referent of a full noun phrase instead of a pronoun.
As in experiment 1, the target items only contained masculine pronouns due to the syncretism of the feminine pronouns with other forms. Also, we again used occupation names, which are grammatically unambiguously gendered. Therefore, the feminine referent was not a possible referent for the pronoun due to grammatical gender mismatch (and because the feminine referent was already present in the same clause as the other argument of the verb and could therefore only be referred to by a reflexive pronoun). Like previous research, we are interested in whether the pronoun is resolved toward the subject or object of the preceding sentence (i.e. the two masculine referents). We opted to include a third referent in the discourse context not to make this choice more difficult, but to make the use of accented pronouns more natural. While the feminine-marked referent is not a possible referent for the pronoun itself, it is a plausible member of a set of alternative referents with which the pronoun referent is contrasted, licensing the use of contrastive accent on the pronoun.
Procedure
The experiment was created using jsPsych (version 7.2.1, de Leeuw, Reference de Leeuw2015). As in experiment 1, participants completed a questionnaire and the LexTALE (see Table 6), followed by the main task. While listening to the stories, participants saw the names of the three mentioned referents on the screen. Following each story, they saw a question on the screen asking them to choose the subject, object, or another masculine referent that was not mentioned in the story as the referent of the pronoun. For subject pronouns, we asked a subject question (e.g. “Who filmed the musician?”), and for the object pronouns, we asked an object question (e.g. “Who did the musician film?”). The positions of the referents’ names on the screen were randomized for each list.
Table 6. LexTALE scores (raw) for L1 and L2 speakers, including range, mean, and standard deviation

Results
We performed GLMER using the lme4 package (version 1.1-25, Bates et al., Reference Bates, Mächler, Bolker and Walker2015) in the software R (version 4.3.0, R Core Team, 2023) to analyze responses. The models included a binomial dependent variable coding whether the participant chose the subject or the object as the referent of the pronoun. We excluded answers that selected a distractor (0.84%), which resulted in a total of 9162 observations. Fixed effects included a 4-way interaction between Language_Type (L1, L2 pro-drop, or L2 non-pro-drop; see footnote 4), Pronoun (er, 1_ihn, or 2_ihn), Prosody (unaccented or accented), and the centered LexTALE score. Selecting a backward-fitting procedure, we excluded fixed factors one by one to see whether they significantly contributed to the model fit. For random effects, as more complex random effects structures led to convergence issues (we initially included the interaction term as a by-participant random slope), we followed Sonderegger’s (Reference Sonderegger2023) steps for non-convergence issues. After some nonintrusive steps (e.g. changing the maximum number of iterations for the optimization), which did not help with model convergence, we ultimately dropped one random effect at a time to obtain the maximal model possible. The final model included a by-Participant random slope for Pronoun and a random intercept for Item. LexTALE did not contribute to any significant interactions and only remained as a fixed effect. The final model in Table 7 shows an effect for LexTALE, as well as for the interaction of Language_Type, Pronoun, and Prosody.
Table 7. Fixed effects for best fitting generalized linear mixed-effects model of referent choice for er and ihn

Both Language_Type and Pronoun consist of three levels. In order to see where the significant differences between the levels of the factors involved in the interaction lie, we ran pairwise comparisons using emmeans (version 1.10-0, Lenth, Reference Lenth2024), applying Bonferroni adjustment. Table 8 shows the significant differences among the factor level combinations.
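To make the Bonferroni adjustment concrete, the sketch below multiplies each p-value by the number of comparisons in the family and caps the result at 1. It also counts how many pairwise comparisons would arise if all cells of the 3 × 3 × 2 design were compared with each other; whether the reported analysis compared all pairs or only a subset is not stated, so that count is an assumption. The p-values are hypothetical and are not taken from Table 8.

```python
from math import comb


def bonferroni(p_values):
    """Bonferroni adjustment: multiply each p-value by the family size, cap at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Cells of the interaction: Language_Type (3) x Pronoun (3) x Prosody (2).
n_cells = 3 * 3 * 2
# Assumption: every cell is compared with every other cell.
n_pairs = comb(n_cells, 2)
print(n_cells, n_pairs)

# Hypothetical unadjusted p-values, for illustration only.
example_p = [0.0001, 0.01, 0.2]
adjusted = bonferroni(example_p)
print(adjusted)
```

Because the adjustment scales with the number of comparisons, a large family of pairwise tests makes the procedure conservative, which is worth keeping in mind when only a few contrasts reach significance.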
Table 8. Multilevel comparison of the interaction term Language * Pronoun * Prosody of the generalized mixed-effects model. Positive estimates indicate a higher bias for the subject referent for the left factor level combination in the pair

As can be seen in Figure 3, overall, there was a preference for the subject referent for both pronouns with all groups. However, within L1 speakers, there was a difference for er; the subject referent was selected significantly less often for accented than unaccented er (see Table 8). L1 speakers also selected the subject referent significantly less often for accented er compared to accented first- and second-mention ihn. While there were no significant differences between the two L2 speaker groups, only L2 speakers with pro-drop L1s showed significant differences from the L1 group: For unaccented first-mention ihn and accented second-mention ihn, L1 speakers chose the subject referent more often than L2 speakers with pro-drop L1s. Finally, the positive effect of LexTALE in Table 7 indicates that the overall subject preference was larger for higher-proficiency speakers.

Figure 3. Referent choice for L1 and L2 speakers, accented and unaccented for er and ihn, with error bars for standard error.
Discussion
The results of this experiment revealed both differences and similarities between L1 and L2 speakers’ choices for the subject pronoun er and the object pronoun ihn. L1 speakers showed an overall subject preference for all pronouns, as hypothesized with the exception of accented er. In contrast to our prediction based on findings of Sauermann and Gagarina (Reference Sauermann and Gagarina2017) and Abashidze et al. (Reference Abashidze, Gagarina and Bittner2023), the subject preference was not weaker for ihn than for er.
L2 speakers also showed an overall subject preference. However, unlike the L1 group, L2 speakers’ preference did not decrease when er was accented. In addition, as predicted, proficiency affected referent choice, in that higher LexTALE scores correlated with more subject choices overall, like in experiment 1, which generally led to a more native-like performance (cf. Ellert et al., Reference Ellert, Roberts, Järvikivi, Spiegel and Krafft2011; Lozano, Reference Lozano2018; Wilson, Reference Wilson2009). Moreover, type of L1 affected referent selection in L2 speakers. L2 speakers with a non-pro-drop L1 performed more native-like than L2 speakers with a pro-drop L1, since significant differences between L1 and L2 speakers were only observed for the pro-drop group. However, there were no significant differences between the two L2 groups, consistent with our hypothesis that proficiency is of greater importance than L1 type (cf. Tsimpli et al., Reference Tsimpli, Sorace, Heycock and Filiaci2004).
General discussion
The two experiments in this study investigated the effects of prosodic focus marking in the preceding context on referent choice for the subject pronouns er (personal) and der (demonstrative), as well as the effects of accents on the subject pronoun er and the object pronoun ihn (personal) in L1 and L2 speakers. We additionally varied the position of the pronoun ihn, allowing us to disentangle potential effects of grammatical role parallelism from positional parallelism.
Our first overarching hypothesis was that L2 speakers are more sensitive to information structure than L1 speakers in processing pronouns. In the first experiment, we asked if L2 speakers’ referent selection would be affected if the referents in the previous discourse were prosodically marked as focused, hypothesizing that focus marking may guide L2 speakers’ attention toward the focus-marked referent, which would render this referent more available in memory and in turn lead the referent to be selected more often. The results supported this hypothesis. We found that L2 speakers preferred the focused subject referent with the personal pronoun er, and chose the focused object referent more often for the demonstrative der. However, the results suggest that clear preferences only emerge when information structure and pronoun effects combine. Interestingly, while converging effects may help to achieve a more native-like performance, this is not guaranteed, as the two factors are not simply additive (as they appear to be in L2), but show a complex interplay in L1 processing, in line with the form-specific account (Kaiser and Trueswell, Reference Kaiser and Trueswell2008). Thus, L1 speakers prefer the subjects and topics as referents for er and objects and non-topics as referents for der, but subjecthood is the determining factor for the default pronoun er, whereas the effect of information structure is stronger for the more marked der (e.g., Bader and Portele, Reference Bader, Portele, Gattnar, Hörning, Störzer and Featherston2019a, 2025; Hert et al., Reference Hert, Järvikivi and Arnhold2024; Kaiser and Trueswell, Reference Kaiser and Trueswell2008; Kaiser, Reference Kaiser2011c; Portele and Bader, Reference Portele and Bader2016). By contrast, our findings point toward L2 learners weighting information structural (discourse-based) cues more heavily than grammatical role cues overall, in line with Cunnings’s account (2017) and our hypothesis.
In experiment 2, we investigated referential choice for unaccented and accented subject pronoun er and object pronoun ihn in L1 and L2 speakers. The results showed an overall preference to link both pronouns to the subject referent, which was only reduced with accented er for L1 speakers. This suggests that L1 speakers interpreted the accent on this pronoun as a signal for a topic shift, i.e. for a reversal of the usual preference of er to be resolved toward the subject (and topic) of the preceding sentence, but L2 speakers did not. How does this fit with our hypothesis that L2 speakers are more affected by information structure than L1 speakers?
One possible explanation for the absence of reversal in L2 follows from Robenalt and Goldberg (Reference Robenalt and Goldberg2016)’s findings that only highly proficient L2 speakers can take competing alternative forms into account during language processing. Thus, when less proficient L2 speakers encounter an accented pronoun, they are still unable to consider resolving it toward the alternative referent – a process that is acquired over time (cf. Robenalt and Goldberg, Reference Robenalt and Goldberg2016). However, L2 proficiency was notably higher for participants in experiment 2 (mean scores of 71.25% for non-pro-drop and 69.99% for the pro-drop group) than in experiment 1 (44.00%), yet information structure affected participants in experiment 1, but not in experiment 2.
Therefore, another explanation is more likely: We suggest that our manipulation of information structure in experiment 1 affected L2 participants because it provided a salient cue that (seemingly) helped them resolve the ambiguous pronoun, even if this led to a non-native like result. Thus, when participants encounter an ambiguous pronoun, they have to pick one referent with which to associate it. As argued above, focus marking in experiment 1 highlighted one possible referent, leading participants to resolve the ambiguous pronoun toward this referent. Even though focus marking did not necessarily help L2 speakers by allowing them to resolve pronouns the way that native speakers do, it allowed them to make a choice, resolve the pronoun, and move on. In contrast, accenting the pronoun in experiment 2 did not have the same effect because it did not help resolve the ambiguity. For L1 speakers, who already have a stable preference for resolving unaccented er, accenting the pronoun signals a contrast, which allows them to adjust their normal interpretation of this pronoun. As results of both of our experiments show, L2 speakers do not have an equally strong subject preference for unaccented er, meaning they do not start from the same point as L1 speakers and are therefore unable to follow the same process. Instead, they are still faced with the basic task of pronoun resolution—pick a referent—and unlike highlighting one of the possible referents, accenting the pronoun itself does not help with this task.
In fact, it is possible that accounts of pronoun resolution in terms of accessibility, prominence or salience (e.g. Ariel, Reference Ariel, Sanders, Schilperoord and Spooren2001; Arnold, Reference Arnold2001; von Heusinger and Schumacher, Reference von Heusinger and Schumacher2019; Grosz et al., Reference Grosz, Weinstein and Joshi1995b) may work well for L2 pronoun resolution, precisely for the reasons why they do not work for explaining the behavior of L1 speakers. L1 speakers have established (grammatical) preferences, which differ for individual pronouns like personal er vs demonstrative der. There is therefore no need to simply pick the referent that currently happens to be at the forefront of their minds (i.e. the most accessible/prominent/salient) to deal with the apparent ambiguity. In contrast, L2 speakers do not have the preferences of individual pronouns and the differences between them firmly acquired. In the absence of the detailed knowledge of different weights of various grammatical factors, they seem more likely to assign the most accessible/prominent/salient referent to any pronoun they encounter. If this is true, it is not clear that L2 speakers are necessarily following information structure as such, in the sense that they would decode its marking and compute its implications. Instead, they may simply be affected by emphasis, in other words, by the fact that focus marking makes one referent perceptibly more prominent, which in turn renders the referent more distinct in working memory (Birch et al., Reference Birch, Albrecht and Myers2000; Foraker and McElree, Reference Foraker and McElree2007; Norberg and Fraundorf, Reference Norberg and Fraundorf2021). Thus, even though both L1 and L2 speakers are seemingly affected by information structure, it is possible that they are affected by it in different ways. For L1 speakers, effects of information structure are grammatically driven, part of their linguistic knowledge, e.g. that der prefers non-topic referents whereas er prefers subjects. For L2 speakers, prosodic focus marking as a cue may be easier to access than form-based preferences and subjecthood, as these would involve accessing grammar. This fits with results from experiment 1, suggesting that less proficient participants were more likely to be guided by information structure marking than more proficient participants. It may be easier to rely on overt information structure marking as a cue, especially at earlier stages of L2 acquisition, since it is readily, perceptually available and its use would not need access to grammar (cf. Clackson et al., Reference Clackson, Felser and Clahsen2011; Felser and Clahsen, Reference Felser and Clahsen2009; Jacob et al., Reference Jacob, Şafak, Demir and Kırkıcı2019). So, what has previously been defined as an influence of L2 learners’ L1 may actually be more accurately described as an over-reliance on prominence. This would then also explain why in some studies where both the L2 and L1 shared a referent preference pattern, L2 learners did not benefit and their performance deviated from native speakers’ (e.g. Polio, Reference Polio1995).
Our second overarching hypothesis was that more proficient L2 speakers would be more similar to L1 speakers (see also Ellert et al., Reference Ellert, Roberts, Järvikivi, Spiegel and Krafft2011; Lozano, Reference Lozano2018; Polio, Reference Polio1995). In line with this hypothesis, there was a significant effect of proficiency, with higher LexTALE scores leading to an increase in the subject referent choices for both pronouns in both experiments. Still, in both experiments, L2 speakers differed from the L1 group, although pairwise comparisons were only significant for L2 speakers from the pro-drop group in experiment 2. This is in line with previous research as reviewed in section 1.2 and suggests that pronoun resolution is a challenging task, reflecting the complexity of cue-weighting for individual pronouns as described in the form-based account (Kaiser, Reference Kaiser, Branco, McEnery and Mitkov2005).
Even though the effect of proficiency was the same in both experiments, the second experiment seems to show a more native-like performance than the first experiment. The difference in performance may be attributed to the overall level of proficiency being higher in experiment 2 than in experiment 1. We therefore assume that proficiency plays an important role in achieving a more native-like pattern of pronoun interpretation. In fact, if we take proficiency to reflect language dominance more generally, then this could also explain why speakers under L1 attrition perform less native-like (Tsimpli et al., Reference Tsimpli, Sorace, Heycock and Filiaci2004) when compared to native speakers. This would predict that language dominance affects even native-like speakers’ performance during pronoun resolution.
The results further support our third hypothesis that proficiency is more important for L2 pronoun resolution than the learner’s L1, whether the L1 is a pro-drop language or not. In contrast to the effect of proficiency, which was significant and consistent across both experiments, L1 background did not have a significant effect in experiment 1. Experiment 2 showed some significant differences between L1 speakers and pro-drop L2 speakers, whereas there were no differences between L1 speakers and non-pro-drop L2 speakers. While this seems to suggest slight differences between the two L2 groups, there were no significant differences between them. Note that the group of L2 non-pro-drop speakers is much smaller than the other L2 or L1 groups, which may have obscured some possible differences. However, Figure 3 shows very similar patterns for both groups, and both groups had very similar mean LexTALE scores (cf. Table 6), meaning this factor at least did not obscure any possible effect of L1 type. Thus, on the basis of the present evidence, proficiency indeed seems to be a stronger predictor of L2 performance than whether the L1 is a pro-drop language.
To be clear, we are not arguing that proficiency is the only predictor of L2 performance. We likewise acknowledge that proficiency may itself be predicted or modulated by other factors, such as length of residence in a German-speaking country or quantity and quality of L2 input (see e.g. Xu et al., Reference Xu, Case and Wang2009; Granena and Long, Reference Granena and Long2013; Paradis et al., Reference Paradis, Rusk, Duncan and Govindarajan2017). Here, we were specifically interested in the relative influence of proficiency and the pro-drop status of L1, since both of these factors have been suggested as major influences on L2 pronoun resolution (recall references in section 1.2).
A final aspect of the present results that deserves discussion concerns the L1 speakers’ interpretation of the object pronoun ihn. Regarding unaccented pronouns, referent preference for subject and object pronouns has been mostly described in terms of grammatical role parallelism (Kehler et al., Reference Kehler, Kertz, Rohde and Elman2008; Sauermann and Gagarina, Reference Sauermann and Gagarina2017; Smyth, Reference Smyth1994) or, more recently (Abashidze et al., Reference Abashidze, Gagarina and Bittner2023), in terms of a combination of grammatical role parallelism, positional parallelism, and topic bias. Abashidze et al. argue that this combination would lead L1 speakers not to show a preference for the object pronoun, since the general topic/subject-bias is pitted against grammatical role parallelism. But in contrast to Abashidze et al.’s findings, our participants preferred the subject referent also for object pronouns. Thus, the subject referent was selected regardless of the grammatical role of the pronoun. Therefore, our results do not support grammatical role or positional parallelism. Further, Abashidze et al. (Reference Abashidze, Gagarina and Bittner2023) claim that moving the object pronoun into first mention position, as done in their items, would lead to an enhanced topicality effect. However, our results do not support this assumption as there was no difference in preference when the object pronoun was in first mention position compared to when it was in second mention position.
So, how can we explain the different results in our study compared to Abashidze et al. (Reference Abashidze, Gagarina and Bittner2023) (and Sauermann and Gagarina, Reference Sauermann and Gagarina2017, who found differences between subject and object pronouns in eye-gaze data)? A possible explanation stems from the experimental items. All the items in Sauermann and Gagarina (Reference Sauermann and Gagarina2017) and the two example items that are shown in Abashidze et al. (Reference Abashidze, Gagarina and Bittner2023) contain experiencer verbs and an inanimate second argument in the pronoun-clause, presumably to license the sentence-initial position for object pronouns (e.g. Ihn ermuntern die Farben “He is encouraged by the colors (literally: him encourage the colors)” and Er mag die Farben “He likes the colors”). This may have affected the results, although it is not discussed by the authors: Abashidze et al. (Reference Abashidze, Gagarina and Bittner2023) report a bias test for the verbs in the preceding clause, which were apparently relatively prototypical agent-patient verbs such as fangen “catch” or malen “paint,” but not for those in the pronoun clause (for the effect of agentivity on pronoun resolution, see Schumacher et al., Reference Schumacher, Dangl, Uzun, Holler and Suckow2016, Reference Schumacher, Roberts and Järvikivi2017, though these studies manipulated agentivity of the preceding referents only). In our materials, both the preceding clause and the pronoun-clause contained similar agent-patient verbs with animate referents as both subject and object. Another difference is that Abashidze et al. and Sauermann and Gagarina used contexts for inter-sentential pronoun resolution (see Colonna et al., Reference Colonna, Schimke and Hemforth2015, for differences between intra- and inter-sentential pronoun resolution in German, though this study focused on subject pronouns). 
Finally, Sauermann and Gagarina (Reference Sauermann and Gagarina2017) and Abashidze et al. (Reference Abashidze, Gagarina and Bittner2023) used contexts containing only two referents in their experimental items, whereas our items included a third referent that differed in gender. As already noted by Patterson and Schumacher (Reference Patterson and Schumacher2021), a limitation in many experiments on pronoun resolution is that the contexts are limited to two potential referents. However, including a third referent can give a more complex base for the interpretation of focus, as there are more alternatives present in the discourse (see Krifka, Reference Krifka2008, for a review on alternatives and focus), and lead to a more detailed picture of how sensitive different pronouns are to the factors involved in pronoun resolution (e.g. grammatical role, information structure). The difference in referent preference for the object pronoun observed in our study may be the result of a “richer” discourse context.
Turning now to accented pronouns, previous research has established (Mozuraitis and Heller, Reference Mozuraitis and Heller2017; Taylor et al., Reference Taylor, Stowe, Redeker and Hoeks2013) that accented pronouns do not always lead to a reversal in referent preference, as had been initially claimed (e.g., Akmajian and Jackendoff, Reference Akmajian and Jackendoff1970). Indeed, our findings for the object pronoun do not show a switch in referent preference for the accented pronouns. However, we do find a switch for the accented subject pronoun. Neither Taylor et al.’s nor Mozuraitis and Heller’s explanations as to why reversal patterns may not be triggered can account for the difference between subject and object pronouns, because their assumptions are not about pronoun type, but about the accessibility of alternatives. Heeding their analyses, we designed experiment 2 to explicitly have alternative referents available, and the fact that we observed a preference switch for the subject pronoun er suggests that this effort was successful. The differences in findings could be partly due to the sentence structure: intra- (current study) vs inter-sentential (Taylor et al. and Mozuraitis and Heller’s studies) pronoun resolution, or there could be true differences between German on the one hand and English and Spanish on the other. Further, our results may actually paint a more detailed picture of the pronoun types’ sensitivity to factors such as information structure and subjecthood (cf. Kaiser and Trueswell, Reference Kaiser and Trueswell2008). This would imply that object pronouns are more restricted regarding the degree to which some of these factors can affect referential links than subject pronouns. Results of our study suggest that object pronouns are more sensitive to subjecthood than to information structure, since changes in information structure (i.e., unaccented vs accented pronouns) did not influence referential choice. 
This would also explain why there was no effect of “enhanced topicality” (cf. Abashidze et al., Reference Abashidze, Gagarina and Bittner2023) when the object pronoun was first mentioned compared to when it was second mentioned.
Conclusion
In conclusion, the two experiments showed that L2 speakers are sensitive to focus marking during pronoun resolution. Moreover, in accordance with our first hypothesis, the results support Cunnings’ (Reference Cunnings2017) cue-based memory retrieval account, which explains that non-native performance occurs in L2 speakers because they over-rely on discourse-based cues during pronoun processing. It is also possible that they are guided by prominence, as is frequently suggested for L1 pronoun resolution (most recently von Heusinger and Schumacher, Reference von Heusinger and Schumacher2019). To avoid the circularity inherent in these approaches, however, it should be assumed that L2 speakers are attuned to prominence in the sense of prosodic or other enhancements that make certain parts of an utterance “stick out,” instead of declaring all factors that have been shown to influence pronoun resolution, such as subjecthood, “prominence-enhancing.”
The study further supports the importance of proficiency for achieving native-like patterns of pronoun interpretation, in accordance with our second hypothesis. As stated in our third hypothesis, proficiency was more important than whether the L2 speakers’ L1 was a pro-drop language.
For stressed pronouns, we provide evidence that the reversal pattern cannot simply be explained by whether information present in the discourse context makes a switch in preference probable (Mozuraitis and Heller, Reference Mozuraitis and Heller2017; Taylor et al., Reference Taylor, Stowe, Redeker and Hoeks2013), but that other factors must be included. For instance, the effect of grammatical role needs to be further investigated, as we found referent preference to switch only with the accented subject pronoun, but not with the accented object pronoun. Further, we did not find evidence that the position of the object pronoun affects its degree of topicality, as has been claimed by Abashidze et al. (Reference Abashidze, Gagarina and Bittner2023). Our results call for further investigation of pronoun resolution, not only in L2 but also in L1 speakers, testing the effects of the number of possible referents in the discourse, verb type in the pronoun sentence, and the difference between inter- and intra-sentential object pronouns.
Replication package
The data, analysis code, and materials can be found at: https://osf.io/s6cb7/.
Acknowledgements
We would like to thank Lindsay Griener and John Gamboa for their help with participant recruitment.
Funding Statement
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the Social Sciences and Humanities Research Council of Canada [grant number 435-2017-0692].
Competing interests
The authors declare that there is no conflict of interest.