Interpretation preferences in contexts with three antecedents: examining the role of prominence in German pronouns

Abstract This paper focuses on the relational notion of prominence, in which entities of equal type are ranked according to certain prominence-lending features. In German two demonstrative forms, “der” and “dieser”, can function like personal pronouns in English. It has been proposed that processing “der” involves computing a prominence hierarchy of the prior referents, and excluding the referent with the highest prominence rank. The demonstrative “dieser” has not been extensively tested. In the current study, personal and demonstrative pronominal forms were investigated following ditransitive contexts, where three potential antecedents are available, in two rating experiments. The personal pronoun showed flexibility in that it received equally high ratings for all three antecedents in canonical configurations. The ratings for dieser followed a graded sensitivity to thematic role prominence, with lowest scores when referring to prominent antecedents (agents) and the highest scores for the least prominent antecedents (patients), with scores for the medium prominence candidate (recipients) differing from both. Der followed a similar but not identical pattern, with a less marked difference between lower prominence candidates. Positional information also has a strong influence on demonstratives. In sum, final interpretation is sensitive to fine-grained differences in prominence hierarchies.

The resolution of personal pronouns has attracted a great deal of attention in the psycholinguistics literature. It has been established that pronoun resolution can be affected by various factors, and that many of these factors contribute to the relative prominence of potential antecedents in the prior discourse (Arnold, 2010; Arnold et al., 2000; Cowles et al., 2007; Järvikivi et al., 2005; Kaiser & Trueswell, 2008; Rohde, 2019; Schumacher et al., 2016). Thus, we can generalize that the process of resolving a pronoun is driven, at least to some extent, by the prominence of its antecedent.
The establishment of coreference relations is here viewed within the prominence framework (Himmelmann & Primus, 2015; von Heusinger & Schumacher, 2019). Prominence is characterized as an organizational principle that is relational and dynamic and attracts structural operations. For the purposes of this paper, we focus on the relational notion, that is, that entities of equal type (here reference to individuals) are ranked according to certain prominence-lending features. The criterion of dynamicity posits that linguistic units may shift their (prominence) status as discourse unfolds; that is, the prominence status of an entity is not static, and entities can be promoted or demoted with regard to their prominence status. The criterion of structural attraction assumes that prominent entities attract more operations than less prominent entities. This is, for instance, reflected in the availability of alternative structures for prominent entities: for example, prominent arguments license passive constructions or clefts, and prominent referents express coreference with more referential forms.
A core prerequisite for the relational notion is that referents are represented as an ordered set in discourse representation (e.g., Grosz et al., 1995; von Heusinger, 2006). The ordering of referents at a given point in time relies on prominence-lending cues. While various factors have been identified to interact during the computation of prominence rankings (see below), we test the hypothesis that agentivity serves as a central prominence-lending cue for reference resolution (Schumacher et al., 2016, 2017). We further assume that coreference resolution is affected by the prominence ranking. We thus utilize coreference via different referential expressions (i.e., personal vs. demonstrative pronouns) to observe the prominence ranking. 1 In contrast to accounts that posit a strict correspondence between referential expressions and the status of an entity in discourse (e.g., Ariel, 1990; Gundel et al., 1993), the current proposal adopts a more dynamic correspondence between form and function.
We thus ask how comprehenders choose from an ordered set during reference resolution. Some theories assume one backward looking center, such as Centering Theory (Grosz et al., 1995), which has been widely applied to the resolution of personal pronouns. 2 In its original proposal, the prominence framework proposed that one entity is singled out from a set of entities of equal type. Here, we test whether the role of prominence ranking is merely to single out one referent from a set or whether it makes available an ordered set that interacts with different referential expressions in specific ways.
Earlier approaches to pronoun resolution sought to identify the individual factors that determined the prominence of particular antecedents. Later, a more nuanced view of prominence developed, in which prominence is determined via the combination of several factors such as grammatical role, position, or thematic role (Bader & Portele, 2019; Järvikivi et al., 2005; Kaiser & Trueswell, 2004; Schumacher et al., 2015, 2016). More recent approaches have gone beyond this and identified how pronoun resolution is part of a larger process of establishing coherence in a text or discourse, but is at the same time affected by certain biases. For instance, Kehler and Rohde (2013) show that pronoun interpretation is influenced by form-related production biases (how likely it is that a pronoun will be produced to refer to a particular entity) but also by next-mention biases (the likelihood that a particular entity will be mentioned again); these biases are related in a Bayesian model (see also Kehler et al., 2008).
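The Bayesian relation mentioned above can be illustrated with a minimal sketch. It assumes the general form P(referent | pronoun) ∝ P(pronoun | referent) × P(referent) described by Kehler et al. (2008); the function name and all probability values are hypothetical and serve only to show how a production bias and a next-mention bias combine.

```python
def interpretation_bias(next_mention, pronoun_production):
    """Combine two biases via Bayes' rule.

    next_mention:        P(referent), the next-mention bias
    pronoun_production:  P(pronoun | referent), the production bias
    Returns P(referent | pronoun) for every referent.
    """
    # Unnormalized posterior: P(pronoun | referent) * P(referent)
    joint = {ref: pronoun_production[ref] * next_mention[ref]
             for ref in next_mention}
    total = sum(joint.values())
    if total == 0:
        raise ValueError("the pronoun is never produced for any referent")
    return {ref: value / total for ref, value in joint.items()}


# Hypothetical values: the object is more likely to be mentioned next,
# but a pronoun is more likely to be produced for the subject.
posterior = interpretation_bias(
    next_mention={"subject": 0.4, "object": 0.6},
    pronoun_production={"subject": 0.8, "object": 0.4},
)
```

With these illustrative numbers the subject ends up as the more likely interpretation of the pronoun even though the object is the more likely next mention, which is the kind of dissociation the model is meant to capture.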
While much research in pronoun resolution has been devoted to third-person personal pronouns (he/she/they) in English, in recent years, a broader spectrum of languages and pronoun types has been investigated (e.g., Carminati, 2002; Çokal et al., 2018; Colonna et al., 2012; de la Fuente, 2015; Ellert, 2010; Fossard et al., 2012; Hemforth et al., 2010; Kaiser, 2011b; Kaiser & Trueswell, 2008; Keating et al., 2016). The current study continues this trend by investigating personal and demonstrative pronouns in German. Broadening research across a wider variety of languages not only increases the cross-linguistic validity of existing proposals, it also gives access to a wider range of pronoun types, which leads us to consider different questions with respect to pronoun resolution. For instance, studies in Finnish have shown that the pronouns hän and tämä are differently sensitive to syntactic role and word order/information structure (Kaiser, 2003, 2005; Kaiser & Trueswell, 2008). The contrast between null and overt pronouns (for instance in Spanish, Italian, Greek, Turkish) points to a division of labor between these two forms (e.g., Alonso-Ovalle et al., 2002; Carminati, 2002; Dimitriadis, 1996; Filiaci, 2010; Turan, 1995); moreover, this division of labor is subtly different in different languages and language varieties (e.g., Barbosa et al., 2005; Filiaci et al., 2014). A similar division of labor may also be observed in the contrast between personal and demonstrative forms in Germanic languages such as Dutch (e.g., Haeseryn et al., 1997; Kaiser, 2011a).
Our investigation of German personal and demonstrative pronouns addresses a major limitation in the research to date: namely that many experiments on pronouns are limited to contexts in which only two potential antecedents are available. This limitation gives the impression that prominence is binary: a potential antecedent is either prominent or not prominent for the purposes of pronoun resolution, leaving open the question of how pronoun resolution plays out when more than two antecedents are in the discourse context. Our study starts to address this limitation by investigating the resolution of pronouns in ditransitive constructions where three potential antecedents are available. The aim of our study is to test the sensitivity of the different pronouns to the relative prominence of the antecedent when more than two antecedents are available, and also to probe the division of labor between the demonstrative forms. This is a particularly relevant question given that demonstratives in German seem to target less prominent antecedents.
In the next section, we review previous research on personal and demonstrative pronouns in German and clarify our assumptions about the notion of prominence in pronoun resolution; following that we review the relevant aspects of ditransitive constructions and work on pronoun resolution in these contexts. We then present hypotheses about division of labor and targeting less prominent antecedents which are addressed in the current study.

Personal and demonstrative pronouns in German
German has a richer pronominal system than English for referring to animate entities. The personal pronouns, which are inflected for gender, number, and case, are complemented by several types of demonstrative pronoun. In the current paper, we focus on masculine and feminine forms of the third-person singular personal pronouns er/sie ("he/she"), contrasting them with two types of demonstrative, der/die and dieser/diese (for convenience, we refer subsequently only to the masculine form of these pronouns).
The German demonstrative pronouns are regularly used to refer to human referents (as well as inanimate referents), in contrast to English this/that. 3 The two demonstrative forms in German, der and dieser, do not convey distance-based information such as proximate/distal (but note that there is such a contrast between dieser and the more obsolete jener). Dieser further expresses contrast or delimitation (e.g., Bisle-Müller, 1991). The two demonstrative pronouns der and dieser are claimed to differ in their potential to shape the upcoming discourse (Ahrenholz, 2007; Ehlich, 1983; Weinrich, 1993; Zifonun et al., 1997), but interpretive differences between these forms have not been empirically demonstrated, with the exception of Fuchs and Schumacher (2020), who show discrete referential shift functions.
Previous studies on the interpretation of different pronominal forms have focused mainly on der in contrast to the personal pronoun er. Dieser has received relatively less attention. Studies on the interpretation of er have found that this form refers preferentially to a prominent antecedent from the prior discourse (Bader & Portele, 2019; Bouma & Hopp, 2006, 2007; Ellert & Holler, 2011; Schumacher et al., 2016), while it retains flexibility in its interpretation and can also refer felicitously to less prominent antecedents, in particular when multiple prominence hierarchies come into play (Schumacher et al., 2016, 2017; Wilson, 2009). Findings for der mainly agree that it refers to a less prominent antecedent and is more rigid in its interpretation (Bader & Portele, 2019; Kaiser, 2011b; Schumacher et al., 2015, 2016, 2017; Wilson, 2009). However, there is no firm consensus across studies on the features contributing to prominence as it relates to pronoun resolution in German. Earlier studies proposed that grammatical role was the critical factor, with personal pronouns preferring subject antecedents and demonstratives preferring nonsubjects (Bosch et al., 2003). This was disputed in later studies, which proposed instead that topichood was the critical factor (Wilson, 2009; see also Abraham, 2002). For instance, Bosch and Hinterwimmer (2016) proposed that the semantic representation of er and der differs by way of an "avoid topic" feature for der.
A different proposal is that thematic role is an important component of prominence when it comes to German pronouns, particularly when contrasted with the contribution of grammatical role. This has been supported by a number of studies that manipulated both thematic role hierarchy and grammatical role hierarchy by contrasting anaphoric resolution preferences in active-accusative verbs versus dative-experiencer verbs. Schumacher and colleagues (2016) showed that er was resolved most often to antecedents with the thematic role of (proto-)agent, 4 while der was resolved most often to antecedents with the role of (proto-)patient. These findings have been confirmed by ERP evidence and sentence completions (Schumacher et al., 2015) as well as visual world eye tracking (Schumacher et al., 2017) and acceptability ratings and eye tracking during reading (Patterson & Schumacher, 2020). Nevertheless, these studies also showed that it was not thematic role alone that determined resolution preferences; rather, it was thematic role in combination with grammatical role and linear order. Relatedly, Portele and Bader (2016) claim that a combination of syntactic prominence and givenness is a key component in differentiating the two pronouns, and Bader and Portele (2019) claim that d-pronouns favor the least prominent antecedent that can be determined by a combination of topichood, grammatical role, and linear position, while the personal pronoun in contrast appears to favor the subject of the previous clause.
In contrast with er and der, the interpretation preferences for dieser have received very little attention to date. Zifonun et al. (1997) claim that dieser is restricted to referring to the last-mentioned antecedent, but they do not present any empirical evidence for this claim. Preliminary studies suggest, however, that dieser patterns similarly to der in preferring an antecedent with lower prominence, even when the lower prominence antecedent was not the last-mentioned antecedent (Fuchs & Schumacher, 2020; Lange, 2016; Özden, 2016). In Özden's (2016) sentence completion study, two potential antecedents, with the roles agent and patient, were introduced. 5 The order of antecedents was manipulated (agent-before-patient vs. patient-before-agent). Note that the patient-before-agent order gives rise to a change in information structure such that the patient is now the assumed topic. Dieser was mainly interpreted as referring to the patient, irrespective of antecedent order. Similarly, Lange (2016) found that dieser was usually interpreted as the (proto-)patient with dative-experiencer verbs, irrespective of antecedent order. These preliminary findings are not compatible with Zifonun et al.'s (1997) description that dieser refers to the last-mentioned antecedent and are suggestive instead of other prominence cues being involved in the interpretation of dieser.
Based on the evidence presented above, we can characterize the resolution preferences of German personal and demonstrative pronouns as follows. Personal pronouns are somewhat flexible in their reference, but show a tendency to resolve to more prominent antecedents, such as topics, subjects or those with the (proto-)agent thematic role. The demonstrative der strongly avoids referring to prominent antecedents, and has a strong tendency to refer to antecedents with a less prominent role such as (proto-)patient. The demonstrative dieser patterns with der in referring to less prominent antecedents, with preliminary evidence pointing away from a purely position-based account for dieser.
It is not our aim in this study to directly examine the precise nature of prominence-lending cues; however, it is important here to clarify our own assumptions for the purpose of the current study. We follow the majority of recent studies in rejecting the outdated notion that prominence is determined by a single factor. Rather, we follow Schumacher et al. (2015, 2016) and Bader and Portele (2019) (who limit this notion to demonstrative pronouns) in assuming that a single prominence hierarchy is determined by a combination of several factors including, but not limited to, thematic role and linear order. Furthermore, we follow Schumacher et al. (2015, 2016) in assuming that there is some commonality between the factors that determine the prominence hierarchy for personal pronouns and demonstrative pronouns, contra Kaiser and Trueswell (2008), who argue that different pronoun types are affected by different factors. Finally, we acknowledge the body of work by Kehler, Rohde, and colleagues (e.g., Kehler et al., 2008; Kehler & Rohde, 2013; Rohde & Kehler, 2014), which suggests that the importance of prominence is relegated to a subject or topic resolution bias for personal pronouns (in English); this overlays a general production bias toward mentioning entities that are rendered semantically prominent through coherence, discourse properties, world knowledge, etc. The semantic component is not specific to pronouns, but is predicated on the notion of which entity is likely to be mentioned next, in whatever form. While these claims are important for understanding the underlying mechanisms supporting pronoun resolution, the current study does not support or contradict the claims made in this literature, and the focus of this study is not to apply the Bayesian model to German pronouns (for a discussion of the Bayesian model in German pronouns, see Bader and Portele, 2019).
The focus of this study is rather to probe the behavior of personal and demonstrative pronouns in contexts with three antecedents, a question that has not been much addressed to date.
The evidence reviewed so far is limited to scenarios where two potential antecedents are presented. It is, therefore, unclear to what extent the above findings are generalizable to a broader set of contexts. This limitation may also mask the precise division of labor among the different pronoun types, particularly der and dieser. In the current study, we address this limitation by presenting pronouns following ditransitive constructions. By using ditransitive verbs, three potential antecedents are presented, which allows us to extend our understanding of the resolution of personal and demonstrative pronouns. Ditransitive constructions are particularly useful to address this issue because they introduce three potential antecedents that are all arguments of the same verb; these arguments are thought to have an internal hierarchy in terms of the prominence of their thematic roles. Other ways of introducing three potential antecedents, for instance, a matrix verb with a clausal complement such as "Marie thinks [that Louise likes Jasmine]", have an additional hierarchical clause structure that may cause a confound when examining prominence hierarchies; using ditransitive constructions avoids this problem.
Using ditransitives allows us to test hypotheses that are not testable with two antecedents alone. Our question is, how does the comprehension system choose from an ordered set of potential antecedents? The prominence framework, as discussed at the beginning of this paper, originally proposed that one entity was singled out from a set of entities of equal type. Here, we test whether the role of prominence ranking is merely to single out one referent from a set, rendering the other referents "invisible" to the referential expression, or whether referential expressions interact with the complete (ordered) set of referents. Take, as a basis, a ditransitive context with three arguments, A, B, and C. These arguments form an ordered set, with A the most prominent and C the least prominent (setting aside discussion of precisely which factors determine the prominence ranking). Encountering a demonstrative pronoun involves selecting an antecedent from this ordered set. Now imagine that the "rule" for resolving a demonstrative pronoun involves excluding the highest-ranked candidate. In a transitive context when only two potential antecedents are available (A, B), excluding the highest-ranked candidate, A, only leaves B available for reference. But in a ditransitive context, such a description is underspecified. Excluding A leaves both B and C available. Are these candidates equally likely to be chosen? Existing accounts of demonstrative resolution do not explicitly state their expectations with regard to three-candidate contexts, but we can identify two contrasting scenarios that are implied by the characterization of demonstrative resolution preferences discussed earlier, as well as proposing a third alternative. The most common way of describing the preferences for demonstratives in German is to state that they avoid the most prominent antecedent; for instance, the "avoid topic" feature developed in Bosch et al. (2003) and Bosch and Hinterwimmer (2016).
This implies that, once candidate A has been taken out of play, any other available antecedents (B and C) are equal candidates for demonstrative reference. An alternative, contrasting possibility is that the demonstrative seeks out the least prominent antecedent. This is most closely implied in Bader and Portele (2019, p. 185), who state that "a d-pronoun prefers the antecedent that is least favored by the structural biases". Targeting a single (lowest prominence) antecedent (C in our example) implies that A and B are unavailable or not chosen during resolution. In a three-antecedent scenario, then, the two remaining antecedents that are not the least prominent are equally unavailable.
There is one further possibility. There could be a graded preference that directly reflects the prominence ranking, whereby the least prominent antecedent is most favored (for demonstrative resolution), the most prominent antecedent is least favored, and the medium prominent antecedent sits between the two extremes. This final possibility, which assumes graded preferences, is not specifically excluded by existing accounts of demonstrative resolution. But neither is it proposed or discussed, probably because it is not possible to test when only two possible antecedents are presented. Directly accessing a set of potential referents with graded preferences represents an extension of the prominence framework. This possibility involves further specifying demonstrative preferences to include sensitivity to prominence ranking, beyond excluding the most prominent or targeting the least prominent candidate.
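The three scenarios can be made concrete with a small sketch over an ordered set of antecedents. The scenario labels and the 0-to-1 preference scores are hypothetical and only indicate the qualitative shape of each prediction; they are not values proposed by any of the cited accounts.

```python
def predicted_preferences(ranked, scenario):
    """Relative preference of a demonstrative for each antecedent.

    ranked:   antecedents ordered from most to least prominent,
              e.g. ["A", "B", "C"]
    scenario: one of the three (hypothetically named) scenarios below
    """
    n = len(ranked)
    if n < 2:
        raise ValueError("need at least two antecedents")
    if scenario == "exclude_most_prominent":
        # Only the top-ranked candidate is rejected; the rest are equal.
        return {ref: 0.0 if i == 0 else 1.0 for i, ref in enumerate(ranked)}
    if scenario == "target_least_prominent":
        # Only the bottom-ranked candidate is accepted; the rest are equal.
        return {ref: 1.0 if i == n - 1 else 0.0 for i, ref in enumerate(ranked)}
    if scenario == "graded":
        # Preference rises step by step as prominence falls.
        return {ref: i / (n - 1) for i, ref in enumerate(ranked)}
    raise ValueError(f"unknown scenario: {scenario}")


SCENARIOS = ("exclude_most_prominent", "target_least_prominent", "graded")
two = {s: predicted_preferences(["A", "B"], s) for s in SCENARIOS}
three = {s: predicted_preferences(["A", "B", "C"], s) for s in SCENARIOS}
```

With two antecedents all three scenarios yield the same pattern, whereas with three antecedents they diverge on the middle candidate B, which is why a ditransitive context is needed to tell them apart.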
Our current study uses ditransitive constructions to present three potential antecedents in order to assess the resolution preferences of personal and demonstrative pronouns in German. Below we review existing relevant work on ditransitives.

Ditransitive constructions and pronoun resolution
Ditransitive constructions are those in which a predicate takes three arguments; the prototypical examples are transfer of possession events with the verb "give", as in (1), which involves the girl, the boy, and a book:
(1) The girl gave the boy a book.
This is expressed in German in a similar way, with the recipient argument (the boy in (1)) taking dative case and the theme (a book in (1)) taking the accusative case:
(2) Das Mädchen gab dem Jungen ein Buch.
"The girl gave the boy a book."
In English, the same event can also be expressed using a prepositional object for the recipient argument (The girl gave a book to the boy). 6 There has been a great deal of debate in the theoretical literature about the motivation for this alternation. What is relevant for the current study, however, are the thematic roles assigned to the three arguments in constructions such as (2) and their relative prominence. The nominative-marked argument is generally understood to have the thematic role of source or proto-agent, the accusative argument is the theme or proto-patient, and the dative argument is the proto-recipient. Proto-recipients (the boy in (1)) share features with the other two proto-roles; for example, in transfer of possession events, they are "proto-patient-like" by undergoing a change in possession relative to the proto-agent (the girl), but they are "proto-agent-like" in their function as possessor relative to the proto-patient (the book). Primus (1999), in her analysis of thematic roles, assigns the three main roles the hierarchy PROTO-AGENT > PROTO-RECIPIENT > PROTO-PATIENT (for simplicity, we refer to these proto-roles as agent, recipient, and patient throughout the paper). Primus highlights the shared features between the agent and the recipient by placing the recipient higher in the hierarchy than the patient. A similar constellation holds for verbs with benefactives. 7 Earlier psycholinguistic work on English pronoun resolution in ditransitive contexts focused mainly on the observation that there is a bias toward resolving pronouns to referents with the role of goal (recipient) in transfer of possession events (e.g., Stevenson et al., 1994).
Arnold (2001) used sentence completion and corpus data to demonstrate that the bias arises from a general likelihood of a particular discourse referent being mentioned again; she suggested that the activation of the recipient in the discourse was increased because of the increased expectation that it would be mentioned again, making pronoun resolution to the recipient more likely. Data from Rohde et al. (2006) and Kehler et al. (2008) suggest that it is not the event itself, but the unfolding of the discourse via different coherence relations that in turn makes either the agent or the recipient more prominent with respect to pronoun resolution. Kehler and Rohde (2017) demonstrate that expectations about the unfolding discourse (in this case, manipulating the expected Question-under-Discussion) can influence the preference for source or goal interpretations of pronouns in real time before the critical information has even been encountered. Based on these findings, we took account of potential biases arising from the verb/scenario/discourse context in the current study by paying particular attention to the discourse coherence of the items, and by pretesting the materials.
Prior research on the behavior of pronouns in dative constructions in German is fairly limited by comparison. Bouma and Hopp (2006) tested preferences for resolving the personal pronoun er to the recipient and patient arguments in German (although ditransitive verbs were only a subset of their experimental items), using a referent-choice task. They did not find a strong preference for resolving er to the recipient or patient argument, regardless of argument order (recipient before patient or patient before recipient). In Bouma and Hopp (2007) there was again no preference, except in conditions when one of the arguments was topicalized by moving it to initial position, in which case a slight preference for resolving to the recipient was observed. These results support the notion that the personal pronoun is flexible in its interpretation; however, because they did not test resolution to the agent argument in the ditransitive constructions, it is not possible to assess the true extent of this flexibility among three potential referents. Demonstrative pronouns were not included in these studies. Two previous studies that did include demonstratives in ditransitive contexts were conducted by Uzun (2015a, 2015b). In a forced-choice pronoun resolution task and a sentence completion task, Uzun (2015a) compared er, der, and dieser in ditransitive contexts while manipulating the animacy of the patient argument and the word order. This resulted in a strong general preference to resolve all pronouns to the patient argument. This result is unexpected given the large number of previous studies in two-argument constructions, where clear differences emerge between preferences for personal and demonstrative pronouns. 8 In Uzun's studies, the manipulation of both animacy and word order led to a very high number of experimental conditions; additionally, the ditransitive contexts with three animate arguments did not all sound very natural.
In the current study, we hope to improve upon the design and materials by restricting the number of conditions and by making the experimental items sound more natural.

Current study
The aim of the current study is to test the resolution preferences of German personal and demonstrative pronouns in ditransitive constructions, when three potential antecedents are available. This enables us to examine how preferences for less prominent antecedents are worked out for demonstrative pronouns in particular, in nonbinary contexts. We test three alternative scenarios, as outlined above. One scenario for demonstratives is that the most prominent candidate antecedent is identified and rejected, but all other candidates are equally available. The second scenario is that the candidate antecedent with the lowest prominence rank is identified as the most suitable antecedent, meaning that reference to all other candidate antecedents is rejected. The final scenario is that the resolution preferences of the pronominal forms are graded, directly reflecting their prominence ranking. In addition to uncovering preferences for demonstratives in nonbinary contexts, we can also observe whether there is a division of labor between the two forms of the demonstrative, der and dieser (cf. Brown-Schmidt et al., 2005; Çokal et al., 2018; Kaiser & Trueswell, 2004, 2008; Wilson, 2009). We conducted three acceptability rating experiments (1a, 1b, and 2) with passages consisting of two sentences that included a referentially unambiguous pronoun. The use of acceptability ratings, rather than a task that asks the participant to choose a referent, ensures that we have sufficient and balanced data not only about the favored antecedent but also about less favored antecedents, which is important for the assessment of and comparisons between dispreferred candidates.
The purpose of Experiment 1 (1a and 1b) was to test resolution preferences for demonstrative pronouns (and personal pronouns as a point of comparison) in ditransitive contexts in which three candidate antecedents are available; Experiment 2 more closely examines the last-mention preference for dieser (Zifonun et al., 1997) by repeating Experiment 1, but reversing the order of the last two antecedents.
Following Primus (1999), we assume that the linearization of thematic roles in canonical order gives rise to a clear prominence ranking among the three potential antecedents in ditransitive contexts (with an inanimate patient): Agent > Recipient > Patient. Note that, while animacy has been viewed as a prominence-lending feature in previous research (e.g., Aissen, 1999, 2003), we assume that animacy is an epiphenomenon of agentivity and its feature specification. If it were to exert an independent influence on prominence, animacy would, in any case, align with the ranking shown above. 9

Experiment 1a
Hypotheses, predictions, participant numbers, and a data analysis plan (including exclusion criteria for participants) were registered in advance on aspredicted.org (see supplemental materials in the Data Availability section).

Materials
Items consisted of two sentences each, and the second sentence included an unambiguous pronoun, which was disambiguated on the basis of gender agreement or plausibility information. The first sentence (S1) was a ditransitive construction, that is, a three-argument construction containing an agent, a recipient, and a patient. 10 Agent and recipient were always animate role nouns and the patient was always inanimate. 11 The thematic roles of the recipient argument include recipient-possessor, goal-possessor, addressee-listener, and addressee-viewer. Benefactive constructions were also included. 12 The second sentence (S2) began with a pronoun referring unambiguously to one of the three NPs and described a state or event relating to that referent. Two three-level factors were fully crossed to create nine conditions. The factor Pronoun manipulated the pronoun type that was presented: the personal pronoun er, the demonstrative pronoun der, or the demonstrative pronoun dieser; in half the items, the pronouns were masculine, otherwise feminine. The factor Referent manipulated which NP the pronoun referred to (agent, recipient, or patient). This was achieved via gender and plausibility cues. In the Agent conditions, the pronoun matched in gender with the agent and patient but not the recipient; the event/state described in S2 was only compatible with an animate agent, thus ruling out the patient as a possible referent. In the Recipient conditions, the pronoun matched only the recipient in gender, and the patient was also ruled out on grounds of plausibility. In the Patient conditions, the gender of the pronoun matched all three NPs, but the event/state described in S2 was only compatible with an inanimate entity, thus ruling out the agent and recipient as possible referents.
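The design described above can be sketched in a few lines: crossing the two factors yields the nine conditions, and combining the gender cue with the plausibility cue leaves exactly one possible referent at each level of Referent. All identifiers are hypothetical; the cue tables simply encode the description given in the text.

```python
from itertools import product

PRONOUNS = ("er", "der", "dieser")
REFERENTS = ("agent", "recipient", "patient")

# Animacy of the three NPs in S1 (agent and recipient animate, patient inanimate).
ANIMATE = {"agent": True, "recipient": True, "patient": False}

# For each Referent condition: which NPs the pronoun matches in gender.
GENDER_MATCH = {
    "agent": {"agent", "patient"},
    "recipient": {"recipient"},
    "patient": {"agent", "recipient", "patient"},
}

# For each Referent condition: does S2 require an animate referent?
S2_NEEDS_ANIMATE = {"agent": True, "recipient": True, "patient": False}


def conditions():
    """Fully cross Pronoun (3 levels) with Referent (3 levels)."""
    return list(product(PRONOUNS, REFERENTS))


def possible_referents(referent_condition):
    """NPs that survive both the gender cue and the plausibility cue."""
    needs_animate = S2_NEEDS_ANIMATE[referent_condition]
    return {
        np
        for np in GENDER_MATCH[referent_condition]
        if ANIMATE[np] == needs_animate
    }
```

Calling `possible_referents` for each level of Referent returns a single-element set containing the intended antecedent, which is the sense in which the pronoun is referentially unambiguous.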
The event or state described in S2 was always different for the Patient conditions in comparison to the Agent and Recipient conditions to ensure that it was compatible with an inanimate referent. In order to retain a high degree of comparability between the Agent and Recipient conditions, S2 was kept as similar as possible between these two conditions (in 13 items it was identical), with the same event or state being described. However, it was also crucial to maximize the discourse coherence between the two sentences so that the participants' ratings were not adversely affected by unnatural-sounding stimuli. For this reason, where necessary S2 contained an adverb that differed between Agent and Recipient conditions, or the event/state was slightly different, ensuring that the scenario sounded natural and coherent in each of the conditions. An example of the nine conditions is shown in Table 1. The full set of materials can be found in the supplemental materials linked to in the Data Availability section. 13

Table 1, English translations of the example item:
Agent: "The (female) owner rented the parking space to the (male) resident. She was delighted about the long-term contract."
Recipient: "The (male) owner rented the parking space to the (female) resident. She was delighted about the long-term contract."
Patient: "The (female) owner rented the parking space to the (female) resident. Fortunately it was a shady spot."

Procedure
Thirty-six items, which had been pretested for plausibility 14 , were distributed across nine lists in a Latin square design; participants saw each of the 36 items in one condition only, and saw four items per condition. The 36 items were interspersed with 36 filler scenarios that were one or two sentences long (a list of fillers can be found in the supplemental materials linked to in the Data Availability section). Fourteen fillers contained no pronoun, and the remainder contained a variety of pronoun types. Fourteen fillers were deliberately made unnatural or implausible, in order to encourage participants to use the full range of the rating scale and to test whether participants were paying attention to the task. 15 The presentation order of the items was randomized for each participant. Participants completed the questionnaire remotely using the Qualtrics survey platform (Qualtrics, Provo, UT). The task was an acceptability rating task. Participants were asked to read each text carefully and rate how good each scenario sounded on a scale from 1 sehr seltsam ("very strange") to 7 perfekt ("perfect"). They indicated their response by clicking on stars below each item.
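The list construction described above can be illustrated with a minimal sketch. It assumes a simple rotation scheme (item index plus list index, modulo nine); the actual assignment used for the experiment is not stated here, and the function and variable names are hypothetical.

```python
from collections import Counter

PRONOUNS = ["er", "der", "dieser"]
REFERENTS = ["agent", "recipient", "patient"]
# Nine conditions obtained by fully crossing the two three-level factors.
CONDITIONS = [(p, r) for p in PRONOUNS for r in REFERENTS]


def latin_square_lists(n_items=36, conditions=CONDITIONS):
    """Return one {item index: condition} mapping per list.

    Assumed rotation: item i appears in condition (i + list index) mod 9,
    so every item occurs once per list and in each condition across lists.
    """
    n_lists = len(conditions)
    return [
        {item: conditions[(item + list_idx) % n_lists] for item in range(n_items)}
        for list_idx in range(n_lists)
    ]


lists = latin_square_lists()
# Number of items per condition in the first list.
per_condition = Counter(lists[0].values())
```

With 36 items and nine conditions, each list contains four items per condition, which matches the design described in the Procedure.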

Participants
Based on a power calculation 16 , 60 participants were recruited via the Prolific platform (www.prolific.ac). Two participants were excluded based on their responses to the test fillers. The remaining 58 participants were all native speakers of German (35 male, 22 female, 1 nonbinary; no reported language-related disorders) and had a mean age of 31 years (range 18-66). All participants gave written informed consent.

Agent conditions
The personal pronoun er should be the most suitable pronoun for referring to the Agent. In all three outlined scenarios, both der and dieser should elicit significantly lower scores than er because demonstratives tend to avoid reference to prominent referents.

Recipient conditions
The recipient is intermediate between the agent and the patient with respect to role prominence. If er is flexible in its reference, we expect high ratings in this condition (though possibly not as high as in the Agent conditions, if er truly prefers prominent antecedents). If der avoids reference to all but the least prominent candidate, then ratings for der should be equally low in this condition as in the Agent condition. If, on the other hand, der simply avoids reference to the most prominent candidate, then it should receive higher ratings in this condition than in the Agent condition. Dieser should pattern with der (Lange, 2016;Özden, 2016). However, low ratings for dieser would also be compatible with Zifonun et al.'s (1997) claim that dieser is reserved for referring to the last-mentioned antecedent.

Patient conditions
The patient has the lowest role prominence (Primus, 1999); it is also last-mentioned, and inanimate. Therefore, both der and dieser should receive high scores in this condition, which would be attributable either to role prominence or a last-mention effect for dieser (Zifonun et al., 1997). If er is truly flexible in its reference it should also receive high ratings here; however, if er prefers a higher prominence antecedent the scores for er will be significantly lower here than scores for der and dieser.

Data analysis
Ratings (1-7) for each item (including fillers) from each participant were converted to z-scores, as recommended in Schütze and Sprouse (2014), to account for variation in participants' use of the scale. To establish that the task was carried out correctly, mean z-scores for the unnatural fillers were compared with those from the normal fillers, using a linear mixed-effects model with a fixed effect of filler type (normal; unnatural) and random intercepts for participant and item. Additionally, each participant's mean z-scores for the two filler types were compared to check whether any participants should be excluded (the exclusion criterion was a higher mean z-score for the unnatural fillers than for the normal fillers). Two participants were excluded on the basis of their ratings for the unnatural fillers, leaving 58 participants for the analysis. Participants' z-scores for the experimental items were analyzed using a series of linear mixed-effects models. As the predictions relate primarily to the reference forms for each NP separately, and because there are slight differences in the materials between the three Referent conditions, a separate model was run for each level of Referent to ensure maximum comparability. The model contained the fixed factor Pronoun (er; der; dieser) and random intercepts for participant and item; the inclusion of random slopes for Pronoun was determined by assessment of singularity and by using the rePCA function in the package RePsychLing (Baayen et al., 2015).
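The within-participant standardization and the filler-based exclusion criterion can be sketched as follows. This is an illustration only: the data layout, the function names, and the toy ratings are hypothetical, and the use of the sample standard deviation is an assumption (the text does not say which estimator was used).

```python
from statistics import mean, stdev


def zscore_by_participant(rows):
    """Add a 'z' field to each row, standardizing ratings within participant.

    All of a participant's ratings (experimental items and fillers) enter
    the mean and SD. Assumes each participant's ratings are not all equal.
    """
    ratings = {}
    for row in rows:
        ratings.setdefault(row["participant"], []).append(row["rating"])
    stats = {p: (mean(v), stdev(v)) for p, v in ratings.items()}
    out = []
    for row in rows:
        m, sd = stats[row["participant"]]
        out.append({**row, "z": (row["rating"] - m) / sd})
    return out


def participants_to_exclude(zrows):
    """Flag participants whose unnatural fillers outscore their normal fillers."""
    excluded = []
    for p in sorted({row["participant"] for row in zrows}):
        own = [r for r in zrows if r["participant"] == p]
        normal = [r["z"] for r in own if r["kind"] == "normal_filler"]
        unnatural = [r["z"] for r in own if r["kind"] == "unnatural_filler"]
        if mean(unnatural) > mean(normal):
            excluded.append(p)
    return excluded


# Hypothetical toy ratings on the 1-7 scale.
toy = [
    ("P1", "normal_filler", 6), ("P1", "normal_filler", 7),
    ("P1", "unnatural_filler", 2), ("P1", "unnatural_filler", 1),
    ("P1", "experimental", 5),
    ("P2", "normal_filler", 3), ("P2", "normal_filler", 2),
    ("P2", "unnatural_filler", 6), ("P2", "unnatural_filler", 5),
    ("P2", "experimental", 4),
]
rows = [{"participant": p, "kind": k, "rating": r} for p, k, r in toy]
zrows = zscore_by_participant(rows)
excluded = participants_to_exclude(zrows)
```

In the toy data, the second participant rates the unnatural fillers above the normal ones and is therefore flagged.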
Further to the main analysis, ratings were also explored across the three Referent conditions. First, to explore the flexibility of the personal pronoun er, ratings for er across the three Referent conditions were compared. A linear mixed-effects model containing the fixed factor Referent was run on the ratings for er. Second, to compare the demonstrative forms der and dieser in the lower prominence Referent conditions, a model containing the fixed factors Pronoun (der; dieser) and Referent (Recipient; Patient) was run on the der and dieser ratings. For both models, random intercepts for participant and item were included and slopes were determined as described for the primary analysis above.

Results
All data, scripts, and model outputs are available in the supplemental materials linked to in the Data Availability section. Unnatural fillers received significantly lower ratings than normal fillers (β = −1.54, SE = 0.14, t = −11.22, p < 0.001). For the experimental items, mean ratings (raw scores and z-scores) per condition are shown in Table 2. Z-scores are plotted in Figure 1.

Primary analysis
A likelihood ratio test of an overall model containing a Referent × Pronoun interaction against the same model without the interaction showed a significant difference between models (χ²(4) = 152.72, p < 0.0001), justifying the separate inspection of the data at each level of Referent. 17 Model outputs for each level of Referent are shown in Table 3. In the Agent conditions, ratings for er were significantly better than for der and dieser. This was also true for the Recipient conditions. In the Patient conditions, ratings for er were marginally better than for der. Ratings for dieser, however, were significantly better than for er.
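As a rough check on the likelihood ratio test reported above, the p-value can be recomputed from the test statistic. The interaction of two three-level factors adds (3 − 1) × (3 − 1) = 4 parameters, hence four degrees of freedom. The sketch below uses the closed-form chi-square survival function for even degrees of freedom; the function names are hypothetical, and the log-likelihoods of the fitted models are not given here, so only the reported statistic is used.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-square variable with even df.

    Uses the closed form exp(-x/2) * sum_{k < df/2} (x/2)^k / k!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form only holds for positive even df")
    half = x / 2.0
    term = 1.0
    total = 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total


def likelihood_ratio_statistic(loglik_full, loglik_reduced):
    """Twice the log-likelihood difference between nested models."""
    return 2.0 * (loglik_full - loglik_reduced)


# Statistic reported for the Referent x Pronoun interaction in Experiment 1a.
p_interaction = chi2_sf_even_df(152.72, 4)
```

The resulting value lies far below the reported threshold of 0.0001.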

Secondary analysis (preregistered)
While the primary analysis reflects the maximal comparability between the Pronoun conditions within one level of the Referent factor, it is also interesting to look at the pattern of responses across the three Referent conditions. In order to explore the flexibility of reference for the personal pronoun er, a model containing the fixed factor Referent was computed for the er data. This model showed no differences in ratings for er across the three antecedent conditions (all ts < 2). In order to explore the pattern of both demonstrative pronouns across the lower prominence antecedents, a model containing the fixed factors Pronoun (der; dieser) and Referent (Recipient; Patient) was computed. The output is shown in Table 4. This analysis showed that there was no significant difference between der and dieser in the Recipient condition. Ratings for der in the Patient condition are significantly better than in the Recipient condition; this is also true for dieser. Finally, two more models were calculated to address specific questions arising from the results that were not covered by the preregistered analysis. First, to directly test whether dieser received the best ratings of all pronouns in the Patient conditions, the Patient model was rerun with dieser as the baseline. This model showed that ratings for dieser in the Patient condition were significantly better than ratings for er (t = −2.127) and for der (t = −3.487). Second, a model was calculated to further explore the results across all three referent conditions for the demonstrative pronouns, because the preregistered analysis does not test specifically whether ratings for the demonstratives significantly improved sequentially across the three Referent conditions. This question was addressed using a model with forward-contrast coding for the Referent conditions, which compares each level with the subsequent level. The output of this model is shown in Table 5. 
This model shows that overall ratings for dieser were better than ratings for der. Furthermore, ratings for both pronouns improved significantly from Agent to Recipient and from Recipient to Patient. The degree of improvement did not differ between der and dieser.
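To make the forward-contrast coding concrete, the sketch below shows one common way to code a three-level factor so that each coefficient compares a level with the next one (successive-difference coding). The sign convention (later level minus earlier level), the contrast values, and the cell means are assumptions chosen for illustration; they are not taken from the authors' scripts or data.

```python
LEVELS = ["agent", "recipient", "patient"]

# Successive-difference contrast columns (assumed convention: later minus earlier).
# Column 1 encodes recipient vs. agent, column 2 encodes patient vs. recipient.
CONTRASTS = {
    "agent": (-2 / 3, -1 / 3),
    "recipient": (1 / 3, -1 / 3),
    "patient": (1 / 3, 2 / 3),
}


def coefficients_from_cell_means(means):
    """Return (intercept, step 1, step 2) implied by the coding above."""
    a, r, p = (means[level] for level in LEVELS)
    intercept = (a + r + p) / 3  # grand mean of the three cells
    return intercept, r - a, p - r


def predict(level, coefs):
    """Reconstruct a cell mean from the coefficients and the contrast codes."""
    b0, b1, b2 = coefs
    c1, c2 = CONTRASTS[level]
    return b0 + b1 * c1 + b2 * c2


# Hypothetical z-score cell means for a demonstrative pronoun.
cell_means = {"agent": -0.9, "recipient": -0.3, "patient": 0.1}
coefs = coefficients_from_cell_means(cell_means)
reconstructed = {level: predict(level, coefs) for level in LEVELS}
```

Under this coding the two slope coefficients are exactly the Agent-to-Recipient and Recipient-to-Patient differences, which is why the model can test sequential improvement directly.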

Experiment 1b
Before discussing the outcome of Experiment 1a, we present here the results of a follow-up experiment (Experiment 1b). In Experiment 1a, pronouns were made to refer unambiguously to one of the three potential antecedents through a combination of gender cues and plausibility cues. While reference to the two animate entities, the agent and recipient, was always disambiguated through gender, in the Patient conditions the gender of the pronoun matched all three NPs, but the event/state described in S2 was only compatible with an inanimate entity. This should have ruled out the agent and recipient as possible referents. However, it leaves open the possibility that the pronoun in this condition was nonetheless interpreted as referring to the agent or recipient. If so, the event/state in S2 would be seen as incongruous (because it is not compatible with an animate referent). On this basis, participants could have given bad ratings in this condition for seemingly incongruent scenarios. While we think that this is unlikely, we cannot rule out the possibility that this took place, and the ratings therefore come under question. We therefore conducted a follow-up experiment in which the gender cues fully disambiguate the reference in all three conditions. Example sentences are given in Table 6. The full set of materials can be found in the supplemental materials linked to in the Data Availability section.

Methods
All methodological and analysis details for Experiment 1b are identical to Experiment 1a, with the exception of the amended items as described above and a new set of participants. Participant details can be found in the supplemental materials linked to in the Data Availability section.

Results
The results from Experiment 1b are illustrated in Figure 2. A full summary of results for this experiment can be found in the supplemental materials linked to in the Data Availability section. Results for Experiment 1b were identical to the results for Experiment 1a, with the following exceptions: there was no significant difference between ratings for the three pronouns in the Recipient conditions; in the Patient conditions, there was no significant difference between dieser and er; the secondary and exploratory analysis showed that the improvement in scores for der between the Recipient and Patient conditions was only marginal.

Experiment 1 discussion
The results showed that for the Agent conditions, the personal pronoun er was rated significantly better than both the demonstrative pronouns. This aligns with the prediction that the personal pronoun is more suitable than the demonstrative pronouns for referring to a prominent antecedent and is compatible with all three scenarios outlined for demonstratives. In the Recipient conditions there is no difference between the three pronouns (Experiment 1b), but, as the exploratory analyses show, ratings for the demonstratives in Recipient conditions significantly improved compared to the Agent conditions while the ratings for the personal pronoun did not change. This result shows that using a demonstrative to refer to a medium prominence antecedent was not considered as bad as referring to the most prominent antecedent. But it is considered less good than referring to the least prominent antecedent, at least for dieser. This result is informative for two reasons. First, the description that demonstratives simply avoid the most prominent antecedent appears to be inadequate in light of our results. Second, the results show that the demonstrative pronouns do not simply seek an antecedent that is the least prominent and strongly reject all others. Rather, it suggests that sensitivity to the prominence of an antecedent is a gradable phenomenon. This is discussed in more detail in the general discussion. Finally, we turn to the prediction based on Zifonun et al. (1997) that dieser is reserved for referring to the last-mentioned antecedent. Based on the current results, this appears to be too strong a formulation. Dieser was sensitive to prominence in a graded way across the three Referent conditions. This does not accord with the characterization of a form being reserved for the last-mentioned antecedent, but suggests that dieser is sensitive to distinctions in prominence when referring to antecedents that are not in final position. That being said, our results cannot tell us the degree to which the ratings were affected by the last-mention effect because the last-mentioned referent was also lowest in terms of its thematic role prominence (and was also inanimate). In order to further explore the importance of the last-mention position for dieser, we conducted Experiment 2. It was noted by an anonymous reviewer that it is unclear whether the medial position is really more prominent than the final position (contra Primus, 1999); if so, then the demonstrative may have a preference for the medial rather than the final referent if prominence ranking is sensitive to position. This scenario appears to be ruled out for dieser, given the clear preference for the last-mentioned antecedent, although it may be a relevant factor for der, which did not show such a strong distinction between the recipient and patient. Antecedent order is explored further in Experiment 2.

Table 6, English translations of the example item:
Agent: "The (male) owner rented the parking space to the (female) resident. He was delighted about the long-term contract."
Recipient: "The (female) owner rented the parking space to the (male) resident. He was delighted about the long-term contract."
Patient: "The (male) owner rented the parking space to the (male) resident. Fortunately it was a shady spot."

Experiment 2
Hypotheses, predictions, participant numbers, and a data analysis plan (including exclusion criteria for participants) were registered in advance of carrying out the experiment on aspredicted.org (see the supplemental materials linked to in the Data Availability section). The purpose of this experiment was to examine the extent to which the final position contributes to resolution preferences of demonstratives, particularly dieser, against thematic role prominence. This was done by using materials from Experiment 1a and changing the order of the NPs so that the least prominent antecedent (based on animacy and thematic role assignment, i.e., the patient) appeared in second position and the medium prominence antecedent (i.e., the recipient) appeared as last-mentioned. It should be noted that swapping the order of the recipient and patient arguments results in a noncanonical order. The new scrambled order is, according to our intuitions, somewhat marked but not strongly ungrammatical. There is a great deal of discussion in the theoretical literature about the motivation for scrambled orders such as the one we are using for Experiment 2 (see Müller, 1999 for an overview). Furthermore, some authors have argued that there is an animacy constraint placing animate arguments before inanimate ones (e.g., Fanselow, 2003;Lenerz, 1977;Vogel & Steinbach, 1998). What is important to our study, however, is the relative prominence of the three arguments, that is, whether changing the argument order to PATIENT - RECIPIENT impacts the relative prominence of the arguments. To our knowledge, there is no discussion in the literature of the relative prominence of the arguments as a result of the scrambled order. However, final position is typically seen as a focus position, so it is possible that moving the patient to the medial position, leaving the recipient in the final position, puts the recipient in focus (Fanselow, 2003 and references therein).
If edge positions confer extra prominence (Himmelmann & Primus, 2015), this would mean that the patient (now in NP2 position) remains the lowest ranked in terms of prominence. Thus, in Experiment 2, the (relative) prominence status of the arguments remains the same as in Experiment 1, but the linear order has changed. The lowest prominence argument is now in NP2 position.
An alternative scenario is that putting the recipient in focus confers a more topical status to the other two referents, and thus makes them more suitable than the recipient as referents for the pronoun (this is the basis of the so-called "anti-focus" effect, see Colonna et al., 2015;de la Fuente, 2015;Patterson et al., 2017). 18 This is why it is critical to retain the other two pronouns in this experiment and not just test dieser alone. Any changes to the availability of particular antecedents based on the altered information structure should be reflected in responses for er and der, but these two pronouns should not be affected by a strong preference for the final position per se.
To avoid ambiguity between genitive and dative case when feminine NPs were placed in final position, the gender of the pronouns and antecedents had to be manipulated differently from Experiment 1; however, this was counterbalanced in other conditions so that there was still an even split between masculine and feminine pronouns. 19 The same factors (Pronoun, Referent) were manipulated as in Experiment 1 to give nine conditions as before. Sample materials are shown in Table 7.

Procedure
The design and procedure were as per Experiment 1. An additional nine filler items, which started with masculine NPs and resembled the experimental items, were added in order to counterbalance the experimental items that always start with a feminine NP.

Table 7, English translations of the sample item:
Agent: "The (female) owner rented the parking space to the (male) resident. She was delighted about the long-term contract."
Patient: "The (female) owner rented the parking space to the (male) resident. Fortunately it was a shady spot."
Recipient: "The (female) owner rented the parking space to the (male) resident. He was delighted about the long-term contract."

Participants
Sixty participants were recruited via the Prolific platform (www.prolific.ac). They all gave written informed consent. Two were excluded because they were not native German speakers. One further participant was excluded based on responses to the unnatural fillers. The remaining 57 participants (32 male, 25 female; no reported language-related disorders) had a mean age of 30 years (range 18-66).

Predictions
In comparison to Experiment 1, we expected ratings for Experiment 2 to be slightly lower due to the noncanonical constituent ordering. Er was shown to be flexible in its reference in Experiment 1 and should therefore score well in all three conditions of Experiment 2. Predictions for the demonstratives der and dieser are as follows.

Agent conditions
Der and dieser should elicit low scores in this condition because they tend to avoid reference to prominent referents, as per Experiment 1.

Patient conditions
According to the last-mention account (Zifonun et al., 1997), scores for dieser should be low (comparable to the scores for dieser in the Agent condition). Conversely, the prominence hierarchy account predicts that der and dieser are sensitive to the hierarchy from the thematic roles, which would lead to very high ratings for der and dieser here. Ratings should be as high or higher than ratings for er, and comparable to der and dieser ratings for the Patient condition in Experiment 1. If both constituent order and prominence are taken into account for the interpretation of dieser, higher ratings based on thematic role prominence should be somewhat tempered by the medial placement of the constituent. If der is more sensitive to prominence than constituent order, we expect high ratings in this condition, comparable to the der ratings for the patient in Experiment 1.

Recipient conditions
According to the last-mention account (Zifonun et al., 1997), dieser should receive high ratings in this condition, possibly higher than ratings for er, as per Experiment 1. According to the prominence hierarchy account, der and dieser should receive lower ratings than er here but not as low as in the Agent condition. If both constituent order and prominence play a role, scores for dieser should fall somewhere between the level of recipient and patient scores from Experiment 1. If der is sensitive only to the prominence hierarchy, it should get lower scores here than in the Patient conditions.

Data analysis
The data were analyzed as in Experiment 1.

Results
All data, scripts, and model outputs can be found in the supplemental materials linked to in the Data Availability section. One participant rated the catch fillers slightly better than the normal fillers; this participant was removed from the analysis. All other participants gave the catch fillers lower ratings than the normal fillers; group mean ratings were 3.11 (SD 1.81) for the catch fillers and 5.55 (SD 1.72) for the normal fillers (z-scores −0.68 and 0.60), a difference that was significant in the LME model (β = −1.28, SE = 0.15, t = −8.38, p < 0.001). For the experimental items, mean ratings (raw scores and z-scores) per condition are shown in Table 8. Z-scores are plotted in Figure 3.

Primary analysis (preregistered)
A likelihood ratio test of an overall model containing a Referent × Pronoun interaction against the same model without the interaction showed a significant difference between models (χ²(4) = 40.81, p < 0.0001), justifying the separate inspection of the data at each level of Referent. 20 Model outputs for each level of Referent are shown in Table 9. In the Agent condition, ratings for er were significantly better than for der and dieser. In the Patient condition, ratings for er were better than for der but not for dieser. In the Recipient condition, there were no significant differences between ratings.

Secondary analysis (preregistered)
In order to explore the flexibility of reference for the personal pronoun er, a model containing the fixed factor Referent was computed for the er data. This model showed that er was rated significantly lower in the Patient condition when compared to the Agent condition (t = −2.035). There was no difference between ratings in the Agent and Recipient conditions (t = −0.787). In order to explore the pattern of both demonstrative pronouns across the three Referent conditions, a model containing the fixed factors Pronoun (der; dieser) and Referent (Agent; Patient; Recipient) was computed. The output is shown in Table 10. This analysis showed no difference in ratings between der and dieser in the Agent conditions. Both der and dieser received better ratings when referring to the Patient compared to the Agent, and when referring to the Recipient compared to the Agent.

Exploratory analysis (not-preregistered)
Finally, a model for the demonstratives with forward-contrast coding for the Referent conditions was computed as per Experiment 1. The output of this model is shown in Table 11. This model shows that overall ratings for dieser were better than ratings for der. Furthermore, ratings for both pronouns improved significantly from Agent to Patient and from Patient to Recipient. The degree of improvement did not differ between der and dieser.

Experiment 2 discussion
We predicted that the ratings for Experiment 2 would be slightly lower than those in Experiment 1, due to the noncanonical ordering of the constituents. This was indeed the case, as can be seen from the z-scores, which were all negative, meaning that the experimental items in Experiment 2 were given lower ratings than the fillers in the same experiment (which had a canonical argument order). Our interest lies in examining the difference between the conditions in the experiment. We expected that the personal pronoun er would be as flexible as it was in Experiment 1, but this prediction was not completely borne out; er was given significantly lower ratings in the Patient condition. This finding was unexpected. It is possible that a combination of the low prominence of the patient role and the medial position in the sentence renders the patient in this experiment particularly nonprominent, making er a less suitable referential form. However, if this were the case we would expect the demonstrative pronouns to do better in this condition; in fact, der received even lower ratings than er, and dieser ratings were similar to those for er. This suggests perhaps that using any type of pronoun is sub-optimal for this referent, because participants were not expecting the second sentence to continue with that referent. We speculate that using the full NP to refer to the patient in this context may have been more felicitous. In this situation, it would be advantageous to have insight from production and interpretation data in the style of Kehler and Rohde's experiments (e.g., Kehler & Rohde, 2013), to be able to separate the likelihood of continuing with a particular entity from the likelihood of using a pronoun.
Relatedly, the movement of the recipient to the final position may have given it a boost in prominence. For example, Himmelmann & Primus (2015) discuss edge placement as a potential prominence-lending cue. In this regard, initial and final arguments are more privileged than an argument in medial position. This may have two causes: either edge placement is a purely serial criterion that marks the borders of certain grammatical structures, or edge placement is motivated by information structure, representing the topic or focus. In our experiment, this placement may have increased the expectation that the second sentence would continue with the recipient, leading to higher ratings in the Recipient condition and conversely to lower ratings in the Patient condition.
An alternative explanation of the lower ratings for the Patient condition, also motivated by information structure, is that the inanimate patient is licensed to move in front of the recipient if it is a topic. 21 If demonstratives strongly avoid topics then the lower ratings for demonstratives could be explained by an anti-topic preference. However, if the patient is a topic, we would not expect ratings for the personal pronoun also to be low in this condition.
The existence of several conflicting interpretations for the results of Experiment 2, which was designed principally to test predictions based on Zifonun et al.'s (1997) claim that dieser is reserved for referring to the last-mentioned referent, highlights a drawback in the design of this experiment. Untangling the information-structural influences underlying the behavior would require several further targeted experiments, which is outside the scope of the current study. This limitation does not prevent us from extracting some useful and interpretable findings.
The primary analysis showed that when referring to the Agent, the personal pronoun er was rated significantly better than both the demonstrative pronouns. This replicates our finding from Experiment 1 and confirms that the personal pronoun is more suitable than the demonstrative pronouns for referring to the most prominent antecedent. Results from the Patient and Recipient conditions give us some indication as to whether the demonstrative pronouns are sensitive to prominence from role hierarchies, to order of mention (last mention), or to both factors. Reference to the Recipient, which is in the final position but has medium role prominence (possibly even boosted by its edge position as discussed above), elicited the best ratings for both demonstrative pronouns. This points to a strong role of order of mention (possibly motivated by information structural factors) and suggests, importantly, that the demonstratives are very well suited to referring to a last-mentioned entity even when it is not the referent with the lowest role prominence. However, the ratings in the Patient condition are also informative here. Unlike the Agent conditions, there is no difference in ratings between dieser and er in the Patient conditions; such a difference would be expected if dieser were only suitable for last-mentioned referents. Additionally, the ratings for both dieser and der are significantly better in the Patient condition than in the Agent condition; again, this is unexpected on a strong last-mention account, where all other options should be strongly rejected. These results are discussed further in the general discussion.

General discussion
Many studies of pronoun resolution preferences are limited to contexts in which two candidate antecedents are presented, leaving open the question of how pronoun resolution plays out when more than two possible referents are available in the discourse context. This is particularly important to examine given that many pronoun resolution studies rely on the assumption of an ordered set of potential candidate antecedents. While demonstratives appear to prefer less prominent antecedents, it is unclear if they reject all but the least prominent antecedent, or if they reject only the most prominent antecedent. It is not possible to assess these possibilities using only two potential antecedents. In the current study, we addressed this limitation by examining pronoun interpretation preferences in ditransitive contexts in which there are three potential antecedents. We examined both personal and demonstrative pronouns in German because they are assumed to be sensitive to the prominence of potential antecedents, while relatively little is known about the interpretations of dieser in particular.
The main finding from our experiments was a graded sensitivity to referent prominence for both demonstrative pronouns. This was seen in the response patterns for the demonstratives, particularly dieser, in the lower prominence conditions (Recipient and Patient), which showed differences in both experiments, further confirmed in the exploratory analyses. This finding is important because it furthers our understanding about the preference of demonstrative pronouns for less prominent antecedents. From previous studies, it was not clear whether demonstratives avoid reference to all but the least prominent candidate, or simply avoid reference to the most prominent candidate. It seems that neither of these characterizations is complete, since they imply that the remaining candidates are considered to be equal. Our results show that in fact, prominence gives rise to more nuanced preferences among the three candidate antecedents. However, the outcome of Experiment 2 demonstrated that referring to the last-mentioned candidate is preferable for both demonstrative forms even when the last-mentioned candidate is not the least prominent candidate in terms of thematic role. Given that the literature on demonstratives in German has largely concentrated on prominence of grammatical or thematic roles or on notions such as topichood rather than positional preferences, it was surprising to find such a strong influence of position in our results. It seems that the description in Zifonun et al. (1997) is partially correct. However, despite the preference for demonstratives to refer to referents in final position, the medially placed referents were rated better than the agent in both experiments, confirming that the demonstrative preferences cannot simply be described as a final position preference, but are also influenced by a referent's role prominence.
We noted in the introduction that three studies had found that dieser referred preferentially to the patient even when it was not in final position (Fuchs & Schumacher, 2020; Lange, 2016; Özden, 2016); on the surface this is a different pattern than was found in our data where final position was important. But in these studies, only two potential antecedents were tested, an agent and a patient, in both agent-before-patient and patient-before-agent order. In the latter order, the agent was in final position, which was never the case in the current study, so we cannot directly compare the two patterns. In our experiments referring to an agent with a demonstrative pronoun always received the lowest ratings. It remains to be seen whether using a demonstrative to refer to an agent in final position in ditransitive contexts would result in equally low ratings. But it should be noted that this would result in a very marked change to the information structure of the context sentence.
By using ditransitive contexts, we made more than two potential antecedents available in the context, which also allowed us to test whether there is a division of labor between der and dieser with respect to their interpretation preferences. This is especially important given the lack of a significant literature on dieser.
The results from Experiment 1 show that both forms have a similar pattern with respect to the three antecedents, which is not suggestive of a strong division of labor. However, in Experiment 1b, the difference between recipient and patient was only marginally significant for der. This could indicate that der is less strongly influenced by a final position preference than dieser. Furthermore, in both experiments, dieser received better scores than der overall. This may be due to dieser being the more formal counterpart of der (Patil et al., 2020); given that the experiment was conducted as judgments on written material, this may have led to participants assigning higher ratings to dieser. The language modality may also play a role, with der being more suitable to spoken language. The interaction of register and modality on the occurrence and resolution of demonstrative forms is a topic for future research (Patil et al., 2020). Another possibility (albeit speculative) is that dieser has a stronger singling out function that is particularly suitable to contexts in which several antecedents are available.
As regards the division of labor between personal and demonstrative forms, which has been found in previous studies, we made the basic assumption that in German, personal and demonstrative pronouns are sensitive to the same set of factors affecting prominence, as was the case in Schumacher et al. (2015, 2016). While our study was not set up directly to test this, we can see that the final position may well be more important for demonstratives, in particular dieser, than for personal pronouns. However, future studies further probing the information-structural motivation for the importance of final position may sharpen our insight in this direction. An important difference between this study and previous ones is the use of a rating task instead of an antecedent choice task, which is more commonly used in studying pronoun resolution preferences. We find the rating method to be very useful for assessing pronoun resolution preferences because it reveals not only the most preferred option in a particular condition, but also more subtle information such as relative acceptability of dispreferred alternatives that are not outright rejected. In a forced-choice or sentence completion task, a lot of data points are gathered about the most preferred option, while very little data are gathered about less preferred options. It is difficult to tell apart options that are completely rejected from those that are simply dispreferred, as the behavior (choosing the preferred answer option) is the same in both cases. For the assessment of the hypotheses we presented, it was important to have sufficient data about less preferred antecedents as well. That being said, the judgment task does not tap into a participant's preferred interpretations as directly as a forced-choice or sentence completion task.
Nevertheless, we think that having more information about lesser preferred candidates is also a valid point of investigation, as readers and listeners are often confronted with interpretation situations in which the preferred candidate is not available.
As far as the relational notion of prominence of individual referents is concerned, the experiments support the assumption of an ordered set of potential candidate antecedents that give rise to graded preferences. This is particularly evident from the ratings for dieser, which strengthen the prominence-lending function of thematic role by turning down the agent as a potential candidate for coreference, while at the same time showing a gradient resolution pattern with respect to the two nonagent roles. The prominence framework offers a broader perspective on reference resolution by making available an ordered set of referential candidates. Rather than merely focusing on the most prominent entity from the set, the current data indicate that the full set is utilized during reference resolution. Overall, this suggests that gradience should be taken more seriously in research on reference resolution. Furthermore, since the prominence framework is geared toward a general organizational principle of the language system, gradience should also be accounted for when prominence relations are considered at other levels of linguistic description (see, e.g., Roessig et al., 2019 for prosody; Kretzschmar et al., 2019 for an agentivity cline).
Finally, the data indicate a considerable contribution of order, or rather edge placement. However, we cannot draw strong conclusions about the information-structural triggers without further testing. The pattern is in line with Himmelmann and Primus's (2015) discussion of edge as a potential prominence-lending cue, which would be a fruitful avenue for further research. In the current case, the sentence boundary serves as a privileged position (possibly contributing information-structural cues triggered by the marked argument order in Experiment 2). The medial position in turn may function as a landing site for entities that are demoted in terms of their accessibility, which would explain why even the personal pronoun, which was otherwise very flexible, received lower ratings in this position for Experiment 2. It is also noteworthy that the personal pronoun showed a high degree of flexibility in its antecedent selection. This suggests that using personal pronouns to identify the features that lend prominence to an entity may not be the most promising avenue for investigation.
In sum, presenting more than two potential antecedents for German pronouns enables us to better characterize their interpretation preferences. The personal pronoun is highly flexible, being equally acceptable when referring to any of the three candidates in canonical conditions, while not always being the preferred pronoun. The interpretation of demonstrative pronouns follows a graded sensitivity to referent prominence, ranging from lower acceptability when referring to highly prominent antecedents to the highest acceptability for the least prominent antecedents. Importantly, the ratings for the medium prominence candidate differed from both the most prominent and the least prominent candidate, revealing a graded pattern. Models for reference resolution should thus capture the gradient nature underlying competing referential candidates.
Data Availability. Supplementary data and materials can be found on OSF: https://osf.io/zrkvx/.