5.0 Questions and Answers
(1) Why do you think that discourse relations and structure have been interesting to both linguists and philosophers?
Discourse relations allow us to combine propositional contents in semantically significant ways in order to achieve a wide variety of discourse goals, from exposition and description to entertainment and even deception. In doing so, they add semantic content above and beyond the individual propositions expressed by the utterances in a discourse. Discourse relations and, importantly, the complex structures to which they give rise, can also influence the interpretations of individual utterances, having an effect on the very propositions the utterances are understood to express. In this way, they help us better understand content that is explicitly expressed through language, as well as the way in which language connects with the extralinguistic world.
(2) What recent developments in linguistics and philosophy do you think are most exciting in thinking about discourse relations and structure?
Recent work applies the machinery of discourse structure and interpretation to model a diverse range of phenomena, from at-issue (ai) and not-at-issue (nai) content, to discourse goals, to multimodal interactions. The first body of work casts the ai/nai distinction as a byproduct of constraints that guide discourse attachment and the construction of complex discourse structures, and provides an independently motivated and flexible notion of ai/nai content that captures the variable discourse status of a variety of constructions, including appositive relative clauses and speech reports. The second body of work considers how discourse relations and structures can be used to model discourse goals as well as certain kinds of subjectivity in discourse interpretation, and it looks at how the biases that lead to subjective interpretations can be self-reinforcing. The third body of work argues that discourse relations can take propositional contents contributed by nonlinguistic eventualities as arguments. In such cases, nonlinguistic contents are introduced into discourse structure via the same reasoning processes that lead an interpreter to infer a discourse relation between linguistically expressed propositions, rather than through any sort of lexically based anaphora. This leads to a two-way flow of information: integrating nonlinguistic events can impact discourse content and structure, and conversely, discourse structure and interpretation can help guide interpretation of the nonlinguistic environment in conversationally significant ways.
(3) What do you consider to be the key ingredients in adequately analyzing discourse relations and structure?
Because discourse interpretation is influenced by how an interpreter understands a discourse context and what they infer about a speaker’s discourse goals, work on discourse structure must control – to the extent that this is possible – for subjectivity in discourse interpretation, if judgments about discourse-sensitive phenomena are to be reliable and informative. Moreover, because constraints on discourse structure or the behavior of discourse-sensitive phenomena might only reveal themselves over discourses containing at least three, but sometimes many more, discourse units, it is important to be able to draw on data involving extended discourses. Inventing complex discourses, not to mention minimal pairs of such discourses, is no simple task. For this reason, much work on discourse structure has depended, and will continue to depend, on corpus study. However, corpus work comes with its own set of problems: corpora must be sufficiently large and be annotated by people with sufficient knowledge. To ease the annotation task, weak supervision approaches that draw on linguistic expertise to guide a process of automatic annotation may prove promising.
(4) What do you consider to be the outstanding questions pertaining to discourse relations and structure?
Current work on discourse structure raises many exciting questions for future research. Some of these questions concern the relation between discourse relations and Questions Under Discussion (QUDs), an alternative approach to discourse analysis that posits that discourse is centered around often implicit questions that conversational participants work together to answer. While our chapter focuses on discourse relations, there are good reasons for thinking that some implicit aspects of what drives discourse development and structure are left unaccounted for by a discourse relations approach. How can we bring some of the elements posited by QUD together with this approach? And what, for example, will be the consequences for a discourse-based theory of goals?
Other questions concern ways to make more precise predictions about discourse attachment. At the moment, the Right Frontier of a discourse graph determines a set of nodes available for discourse attachment, but information about, say, prosody, or different kinds of constructions (such as appositives), or lexical facts, might help improve predictions about where a discourse unit actually will attach. Finally, questions remain about the ways that nonlinguistic eventualities can influence the structure and interpretation of discourse, and the processes by which conversation can help us to ground more precise interpretations of the nonlinguistic context.
5.1 Introduction
We communicate for a variety of reasons, be it to exchange information, to persuade someone of a certain point of view, or simply to entertain each other. In each case, achieving our goals requires linking together the contents of multiple discourse units, which include the contents of individual speech acts and, for conversations situated in a shared visual environment, the contents contributed by physical gestures and other nonlinguistic events. A fundamental insight that guides work on discourse and dialogue interpretation is that the way in which discourse units are related to one another within the context of a conversation is essential to the conversation’s meaning.
To develop an intuition of what we mean when we talk about relations between discourse units, we will start with a simple example. Suppose a friend says to you:
(1)
I need my hat back. I’m leaving for São Paulo in two days.
Your friend’s utterance contains two sentences, each of which contributes a single discourse unit, understood here roughly as a single proposition. [Footnote 1] Although there is no lexical or syntactic indication that the two discourse units are related, given that they are uttered together you will automatically try to find a connection between them that explains what your friend’s needing their hat back has to do with their leaving in two days. Perhaps the most reasonable explanation is that they want to wear the hat in São Paulo, in which case your interpretation will be roughly the same as if they had used an explicit discourse marker as in (2):
(2)
I need my hat back because I’m leaving for São Paulo in two days.
Interpreting (1) along the lines of (2) places the discourse units in a semantic relation of explanation, but there are other types of relations you could infer between the same discourse units given different contextual factors. For instance, a change in intonation between the two moves might signal that (1) merely conveys a list of your friend’s thoughts, linked by a conjunction relation, and would be roughly equivalent to:
(3)
I need my hat back. Also, I’m leaving for São Paulo in two days.
Semantic relations can also be inferred between the contents of discourse units made by different speakers. Questions and answers are primary examples of this.
(4)
a. What are you going to do downtown?
b. I’m going to the bookstore.
It is a basic assumption of conversational exchange that if someone is asked a question they will answer it promptly, if not directly, in the next discourse move, as seen in (4). However, just as with the discourse units in (1), nothing about the content or surface form of (4b) when considered in isolation indicates that it is an answer to any question, let alone (4a). The question–answer relation between the two units is inferred from their content plus the assumption that people tend to promptly answer questions when asked.
The central role of inference in interpreting question–answer relations between discourse units becomes more apparent if we imagine more moves intervening between a question and its answer:
(5)
a. What are you going to do downtown?
b. Ugh, I’m so mad! My brother lost my copy of The Watchmen, and I need to reread it for class. I’m going to the bookstore.
The answer to the question asked in (5a) is the same as the answer provided in (4b), but in (5), the speaker first provides unsolicited background information before giving the answer. Despite this detour, the speaker who asked (5a), or someone just listening in on the conversation, would be able to identify this answer by reasoning about the content of each intervening discourse unit expressed after the question. [Footnote 2] In order to provide a systematic account of the inferences needed for the interpretation of discourse, such as those seen in the foregoing examples, formal approaches to discourse structure and interpretation such as Rhetorical Structure Theory (RST; Mann & Thompson 1987) and Segmented Discourse Representation Theory (SDRT; Asher 1993; Asher & Lascarides 2003), building on the work of Hobbs (1979, 1985), incorporate semantic relations, called discourse relations, into their models of discourse. [Footnote 3] They focus on determining the variety of relations that can connect discourse units, e.g. Explanation or Question–Answer Pair, and the kinds of information – semantic, discursive, or otherwise – that speakers use to determine them. Discourse relations are considered to contribute truth-conditional content above and beyond that conveyed by the collection of discourse moves alone, which has an effect on the logical form of a discourse. Moreover, going beyond single relation instances, theories of discourse structure seek to identify the structural constraints on discourse development which limit how discourse structures built from multiple discourse units can evolve as a discourse proceeds, and to describe the nature and interpretation of full discourse structures.
In the discussion that follows, we start by taking a look in Section 5.2 at why discourse relations are important for philosophy and linguistics and by situating theories of discourse structure in the larger field of dynamic semantics. In Section 5.3, we show how recent work on discourse relations and structure has been used to model phenomena along the semantics–pragmatics interface, to analyze multimodal discourse, and to provide an account of discourse goals and the interaction of bias and discourse interpretation. We conclude in Section 5.4 with a set of open questions for theories of discourse structure and their role in semantics and pragmatics debates as well as a discussion of what kinds of tools will be most helpful for answering these questions.
5.2 The Semantic Effects of Discourse Structure
A simple sentence consisting of a single clause is the minimal tool for conveying a description of the world. [Footnote 4] In modern philosophy and linguistics, specifically in truth-conditional semantics, the meaning of a clause is modeled as a proposition, which is often defined as the set of possible worlds in which the state of affairs, or eventuality, described by the sentence holds. Propositions have held the interest of philosophers and linguists because they are the minimal bearers of truth or falsity, allowing us to exchange information and learn new things about the world. From Plato and Aristotle up through modern-day model-theoretic accounts of linguistic meaning stemming from the work of Frege and Russell, simple sentences, and the propositions they express, have been the primary units of study in semantics and philosophy of language; likewise they are the starting point for discourse-based language modeling.
It is clear that a discourse or conversation proffers a more complex representation of the world than does a simple sentence. Discourse units combine recursively to create more and more complex semantic structures, giving rise to both “bottom-up” and “top-down” effects. This section introduces some of these effects, starting with those that arise with individual discourse relations and finishing with effects related to complex discourse structures built from more than two discourse units. [Footnote 5]
5.2.1 Discourse Relations
A discourse will generally support inferences which would not be entailed by the set of its constituent propositions alone, [Footnote 6] allowing us to communicate, and infer, more complicated messages. Let’s consider again our introductory examples (1)–(3), about the hat and leaving for São Paulo in two days. Inferring the relation Explanation between the two discourse units of (1), as made explicit in (2), will entail that the speaker needs the hat back in less than two days. This inference, however, cannot be attributed solely to the discourse units involved, for it is not supported if we infer a different relation between them: if (1) were interpreted along the lines of (3), the inference would not be supported and the speaker could reasonably continue (3) with “I can pick up the hat once I get back from Brazil.”
To some extent, we can say the same thing about complex propositions formed from Boolean operators: while $p \wedge q$ and $p \vee q$ are built up from the same constituent propositions, they do not support the same set of entailments because they do not involve the same relation. Only $p \wedge q$ entails the set $\{p, q\}$, for example. But it is important to note that, whereas the truth value of a complex formula formed by a Boolean operator is determined entirely by the truth values of its arguments, the truth value of an instance of a discourse relation cannot be so reduced: if two propositions, $p$ and $q$, are both true, this automatically entails the truth of $p \wedge q$, but not that of Explanation$(p, q)$. The Explanation relation adds additional semantic content that is itself truth-evaluable. Another crucial difference is that discourse relations can add substantial semantic content to discourse even in the absence of an explicit relation marker, as illustrated by (1). In this case, content is added to the discourse during composition as the result of a reasoning process, not by a logical operation determined by the semantics of a particular operator. In this way, the bottom-up effects of discourse relations go beyond those observed in standard truth-conditional semantics.
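The contrast between a truth-functional connective and a discourse relation can be made concrete with a small Python sketch. It is only an illustration under simplifying assumptions: the function names and the `q_explains_p` flag are hypothetical, and the sketch assumes that Explanation holds only when both of its arguments are true and the explanatory link itself holds in the situation described.

```python
def conjunction(p: bool, q: bool) -> bool:
    # Boolean connective: the result depends only on the two truth values.
    return p and q


def explanation(p: bool, q: bool, q_explains_p: bool) -> bool:
    # Sketch of a discourse relation: besides the truth of both arguments,
    # the explanatory link must itself hold (hypothetical extra parameter).
    return p and q and q_explains_p


# Two situations in which both propositions expressed in (1) are true:
# p = "the speaker needs the hat back", q = "the speaker leaves in two days".
wants_hat_for_trip = {"p": True, "q": True, "q_explains_p": True}
unrelated_thoughts = {"p": True, "q": True, "q_explains_p": False}

for name, s in [("trip", wants_hat_for_trip), ("list", unrelated_thoughts)]:
    print(name, conjunction(s["p"], s["q"]), explanation(**s))
```

In both situations the conjunction is true, while Explanation is true only in the first: fixing the truth values of the arguments does not fix the truth value of the relation instance.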
When discourse is modeled using discourse relations, a variety of “top-down” effects are also observed. That is, information the interpreter has about how discourse units are related – which might come from explicit discourse markers or other lexical information from the discourse units, or even nonlexical contextual information – can be used to interpret the content of an individual discourse unit. In particular, discourse relations have significant effects on various anaphoric phenomena, including the resolution of anaphoric pronouns and the temporal interpretation of individual clauses. This is part of what makes discourse relations so important to philosophy, particularly the philosophy of language, and linguistics: not only do they provide a framework which allows a more complete representation of the complex semantic structures that mediate nearly all human information exchange and knowledge acquisition, but they do so by offering a new perspective on problems, such as anaphora resolution, that have been discussed in the literature for a long time.
As an illustration of top-down effects, consider the minimal pair in (6):
(6)
a. Andy’s bike broke down this morning. He showed up late for work.
b. Andy showed up late for work. His bike broke down this morning.
The past tense employed in (6a) indicates that the events of Andy’s bike breaking down and his showing up late for work happened in the past of the utterance time, but the example as a whole suggests more information about the timing of the individual events. In particular, we infer that the time at which Andy’s bike broke down was in the past of the time at which he showed up late for work. Surely the order in which the events are described plays a role in this interpretation, but, as noted already by ancient rhetoricians (e.g. Quintilian 1963), this cannot be the whole story. If we reverse the arguments, as in (6b), our tendency to understand Andy’s bike troubles as the cause of his tardiness – an interpretation motivated by world knowledge – leads us to understand the event described second as actually having occurred first (Lascarides & Asher 1993). From the perspective of SDRT, these observations are explained by noting that we infer different discourse relations in (6a) and (6b), namely Result and Explanation, respectively. The semantics of these relations then entail the differing temporal interpretations: Result, when its two arguments denote events, requires that the event described by its second argument occur after that described by its first, whereas Explanation requires the opposite order. [Footnote 7]
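The temporal constraints just described can be sketched in a few lines of Python. This is a simplified illustration, not SDRT's actual semantics: the function `temporal_order` and the event labels are hypothetical, and only the two relations discussed for (6) are covered, with Explanation treated simply as the reverse of Result.

```python
def temporal_order(relation: str, first_arg: str, second_arg: str) -> tuple:
    """Return (earlier_event, later_event) for two event-denoting arguments.

    first_arg and second_arg are given in order of mention.
    """
    if relation == "Result":
        # The second argument's event occurs after the first argument's event.
        return (first_arg, second_arg)
    if relation == "Explanation":
        # Opposite order: the event mentioned second occurred first.
        return (second_arg, first_arg)
    raise ValueError(f"no temporal constraint sketched for {relation!r}")


# (6a): bike breakdown mentioned first, relation Result.
order_6a = temporal_order("Result", "bike breaks down", "shows up late")
# (6b): lateness mentioned first, relation Explanation.
order_6b = temporal_order("Explanation", "shows up late", "bike breaks down")
print(order_6a, order_6b)
```

Although the order of mention differs, both calls place the breakdown before the late arrival, which is the interpretation reported for (6a) and (6b).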
Similarly, there are situations in which pronoun resolution is most effectively explained by appealing to discourse relations and world knowledge, as illustrated by (7), taken from Kehler et al. (2008) and adapted from Winograd (1972) (cf. also Hobbs 1979; Kehler 2002).
(7)
The city council denied the demonstrators a permit because …
a. … they feared violence.
b. … they advocated violence.
The pronoun they is understood as referring to the city council in (7a), but to the demonstrators in (7b). Arguably, this is because world knowledge suggests that fearing violence is a good reason for an agent to reject a permit, while advocating violence is a good reason to have one’s request rejected.
The role of world knowledge and reasoning comes out perhaps even more clearly if we consider an ambiguous discourse marker, such as and, which in (8) could support either a Parallel relation or a Result relation, leading to different interpretations of they:
(8)
The city council denied the demonstrators a permit and they advocated violence.
The context of (8) and the world view of the speaker will be what tips the balance in favor of one interpretation or the other. If it is understood that demonstrators would likely react violently to authoritarian obstacles, the Result relation would best support this reading of (8), where they are the demonstrators. But in a context in which one accepts that a body of government might advocate violence against a group of people who are wanting to protest a cause, (8) could equally express a Parallel relation which would make an interpretation of they as picking out the city council more accessible than one in which they picks out the demonstrators. The important point here is that the sentence in (8) supports two different interpretations of the pronoun they, and the choice of interpretation is accounted for by the type of semantic relation inferred.
The foregoing analysis of anaphora resolution and temporal interpretation generalizes insights from dynamic semantics. In dynamic semantics, models of pronominal anaphora take into account the order in which two clauses are added to the discourse context in order to capture the fact that, for instance, reversing the sentences in (6a) would lead to a less felicitous discourse (Kamp & Reyle 2013). Temporal interpretation can likewise be sensitive to update order and also to tense and aspect: were we to change the aspect in the second sentence of (6a) to the past perfect, this would change the inferred temporal relation between the clauses (Kamp 1988). Work on discourse relations, in particular Hobbs (1979), Kehler (2002), Asher (1993), and Asher and Lascarides (2003), incorporates the idea that a model of anaphoric phenomena must take into account the way in which the utterance content is linked to other contents in the incoming information state and the way in which discourse units are described. [Footnote 8] In these accounts, however, update order and tense and aspect influence interpretation only indirectly by helping an interpreter determine what discourse relation is at work. Anaphora and temporal interpretation are thus understood as byproducts of reasoning about discourse relations (and, as we will see in the next subsection, discourse structure). [Footnote 9]
5.2.2 Discourse Structures
So far we have discussed the role of discourse relations in the computation of temporal structure and the resolution of anaphora using examples that contain pairs of discourse units. However, there are some other important anaphoric facts, such as propositional anaphora, which cannot be explained by considering pairs alone. Consider this example from Asher (1993) (and see Snider 2017 for an in-depth discussion of propositional anaphora):
(9)
What is the antecedent for the pronoun this? For most speakers, the only possible antecedents are either the proposition expressed by the combination of (9a)–(9c), a complex discourse unit that we will denote as [(9a)–(9c)], or the proposition expressed by the discourse unit (9c) alone.
Note that how we resolve the pronoun this in (9) goes hand in hand with how we understand the scope of the discourse marker but in (9d): if this picks out the proposition expressed by (9c), then (9c) is also understood as the first argument of but. However, if this is understood as picking up on the complex proposition formed from (9a)–(9c), then it is the complex discourse unit [(9a)–(9c)] that provides the first argument to but. Observations about propositional anaphora take insights from Section 5.2.1 concerning the relation between discourse relations and anaphora to a new level. Now, it’s not just a question of how anaphoric relations between two consecutive clauses are interpreted; in SDRT, RST, and the theory of Polanyi (1985), discourse attachment itself becomes anaphoric.
When a new discourse unit is introduced into an ongoing discourse, we must consider which discourse units already present in the discourse will be able to connect with it via a semantic relation. In a coherent discourse, each new unit of discourse content must attach and bear some semantic relation to some other constituent in the discourse structure; each discourse unit becomes, in effect, a “zero-anaphor” looking for an antecedent discourse unit or complex discourse unit. And as illustrated by (9), it might be that only a subset of the constituents in a discourse representation are salient and available as attachment points when updating the discourse context with new information. [Footnote 10]
To define the set of salient constituents that are accessible for attachment, commonly called the Right Frontier (RF; Polanyi 1985; Asher 1993), we need to represent discourse structure as a graph whose nodes are discourse units and whose edges are instances of discourse relations between constituents. [Footnote 11] We thus introduce a few fundamental features of the SDRT language here. The vocabulary contains a countable set of discourse unit labels $DU = \{\pi, \pi_1, \pi_2, \ldots\}$ for elementary discourse units (edus), which are discourse units that cannot be decomposed into further discourse units, and complex discourse units (cdus), which group together multiple dus (edus or cdus). It further includes a finite set of discourse relation symbols $Rel = \{R, R_1, \ldots, R_n\}$, which we add to the vocabulary of a language $L$, such as the language of dynamic predicate logic, for describing the contents of edus. Formulas in the SDRT language are of the form $\pi : \phi$, where $\phi$ describes the content of $\pi$ and $\phi$ can be: a formula of $L$; a formula of the form $R(\pi_1, \pi_2)$, which says that $\pi_1$ stands in coherence relation $R$ to $\pi_2$; or a conjunction of SDRT formulas. Following Asher and Lascarides (2003), each discourse relation comes with constraints as to when it can be coherently used in context and when it cannot. [Footnote 12]
A discourse structure for a text $d$ can be represented as a graph $G = (V, E_1, E_2, \ell, \mathit{Last})$, where $V$ is a set of vertices each representing a discourse unit; $E_1 \subseteq V \times V$ is a set of directed edges representing links between discourse units that are labeled by $\ell : E_1 \to Rel$ with discourse relations; $E_2 \subseteq V \times V$ describes the membership relation between the set of dus figuring in cdus and the cdus in which they figure; and $\mathit{Last} \in V$ is the last edu in the linear, textual ordering of edus in $d$. An sdrs is spanning in that all elements of $V$ other than the root have at least (and possibly more than) one incoming edge: $\forall x \in V\,(x \neq \mathit{root} \rightarrow \exists y \in V\,((y, x) \in E_1 \cup E_2))$. Note that when discourse units are grouped together in a cdu, they will be related in such a way as to determine a subgraph respecting the foregoing conditions.
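A graph of this kind can be sketched as a small Python data structure. The sketch below is illustrative only: the class name, field names, and unit labels are hypothetical, the root is stored explicitly for convenience, and membership edges are assumed to run from a cdu to each of its members. The sample graph anticipates the analysis of (9) given later in this section, in which (9a)–(9c) are linked by Continuation and grouped in a cdu.

```python
from dataclasses import dataclass


@dataclass
class DiscourseGraph:
    vertices: set      # discourse unit labels (edus and cdus)
    relations: dict    # labeled relation edges: (source, target) -> relation name
    membership: set    # membership edges, assumed here as (cdu, member) pairs
    last: str          # the last edu in textual order
    root: str          # stored explicitly in this sketch

    def is_spanning(self) -> bool:
        # Every vertex other than the root needs at least one incoming edge.
        targets = {t for (_, t) in self.relations}
        targets |= {m for (_, m) in self.membership}
        return all(v == self.root or v in targets for v in self.vertices)


graph_9 = DiscourseGraph(
    vertices={"9a", "9b", "9c", "cdu_9a_9c"},
    relations={("9a", "9b"): "Continuation", ("9b", "9c"): "Continuation"},
    membership={("cdu_9a_9c", "9a"), ("cdu_9a_9c", "9b"), ("cdu_9a_9c", "9c")},
    last="9c",
    root="cdu_9a_9c",
)
print(graph_9.is_spanning())
```

Keeping relation edges and membership edges in separate collections mirrors the separation of the two edge sets in the graph definition and makes the spanning check a direct reading of the condition on incoming edges.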
The Right Frontier Constraint (RFC) requires that given a discourse graph G, a new edu to be attached to G must be attached to a node along the RF of G. (Nodes that are not on the RF can be accessed, but only through what Asher 1993 calls discourse subordination.) The RF evolves dynamically as a discourse proceeds and is sensitive to whether a new du is attached via a subordinating relation or a coordinating relation, as indicated in Figure 5.1.

Figure 5.1 Subordinating and coordinating relations and their relation to the RF
A subordinating relation, including Explanation, Elaboration, and Background, is one in which the second argument seems to provide further information about the first (Asher & Lascarides 2003). Crucially, the addition of the second argument does not render the first argument less salient or inaccessible for anaphora, which means that both discourse units will be on the RF. Let’s return to our first example, repeated here as (10):
(10)
I need my hat back. I’m leaving for São Paulo in two days.
The speaker could easily continue with (11):
(11)
I’m sorry. I know you enjoy wearing it.
In apologizing, the speaker expresses the idea that she feels bad about asking for her hat back; the apology I’m sorry is thus related via the relation Comment to the discourse unit I need my hat back. The pronoun it in (11) likewise depends on the first discourse unit of (10), referring to the hat introduced in that unit. These attachments are possible despite the fact that the second sentence of (10) is uttered in the interim.
The left graph in Figure 5.2 represents the structure of (10). The vertical arrow connecting the discourse units indicates that Explanation is a subordinating relation, and the dashed line (RF1) represents the RF just before the speaker utters (11). The right graph shows how the discourse structure changes when we update with (11). I’m sorry is attached to the top node via Comment, a subordinating relation, and I know you enjoy wearing it, which explains why the speaker is sorry, is attached via Explanation. This graph also shows how updating with (11) changes the RF: the unit I’m leaving for São Paulo in two days is pushed to the left and is no longer on the RF (RF2), though the remaining three units are.
Coordinating relations shut off the accessibility of their first arguments and advance the discourse to a new topic instead of providing further information on the current topic. The discourse units contributed by (9a)–(9b) and (9b)–(9c), for instance, are related by the coordinating relation Continuation in SDRT, whose semantics roughly correspond to Boolean conjunction. This means that (9b) is predicted to be inaccessible for attachment once (9c) is introduced, which is what we observed above. Coordinating relations, such as Continuation, Narration, and Result, are represented with horizontal arrows to show that they push the RF forward, or to the “right” as shown in case 1 of the figure. These assumptions imply that should we insert material in (9) such as:
(12)
These people were really badly treated.
before (9d), the available antecedents for the pronoun this should shift again. And indeed as SDRT predicts, our intuitions change in this new example: (9c) is no longer available as an antecedent.
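Note that the RF, as it can contain numerous discourse units, does not determine where a new discourse unit will attach to the discourse graph to date, but only where it can attach. We can now formally define the RF in the style of SDRT. Let $e(x, y)$ mean that edge $e$ has initial point $x$ and endpoint $y$. A node $x$ is on the RF of a graph $G$, i.e. $\mathrm{rf}_G(x)$, just in case $x$ is Last, $x$ is related to a node in $\mathrm{rf}_G$ via a subordinating (Sub) edge, or $x$ is a cdu that includes a node in $\mathrm{rf}_G$:
Definition 1. Let $G = (V, E_1, E_2, \ell, \mathit{Last})$ be a discourse graph. $\forall x \in V$, $\mathrm{rf}_G(x)$ iff
(i) $x = \mathit{Last}$, or
(ii) $\exists y\,(\mathrm{rf}_G(y) \wedge \exists e \in E_1\,(e(x, y) \wedge \mathrm{Sub}(e)))$, or
(iii) $\exists y\,\exists e \in E_2\,(\mathrm{rf}_G(y) \wedge e(x, y))$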
Note that the rf is updated dynamically each time a new edu is processed; the rf for (attachment of) an edu
will be determined by the graph
. The rf for a cdu
,
, is the rf for
. [Footnote 13] This predicts that (9c) is available for attachment in (9) because it is Last, but (9a) and (9b) are both inaccessible because neither satisfies any of the conditions (i)–(iii). The complex unit [(9a)–(9c)] is correctly predicted to be available, however, because it includes Last, which is a member of the RF, thus satisfying condition (iii). The two possibilities for attachment in (9) are represented in Figure 5.3.
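The three conditions of Definition 1 can be read as a simple closure computation: start from Last and keep adding nodes until nothing changes. The following Python sketch illustrates this reading under stated assumptions. The function name, the unit labels, and the encoding of edges are hypothetical; membership edges are assumed to run from a cdu to its members; and the set of subordinating relations contains only those named as subordinating in this chapter.

```python
# Relations treated as subordinating in the text; all others count as coordinating.
SUBORDINATING = {"Explanation", "Elaboration", "Background", "Comment"}


def right_frontier(relations: dict, membership: set, last: str) -> set:
    """relations: (source, target) -> relation name; membership: (cdu, member) pairs."""
    rf = {last}                                   # condition (i)
    changed = True
    while changed:
        changed = False
        for (x, y), rel in relations.items():     # condition (ii)
            if y in rf and rel in SUBORDINATING and x not in rf:
                rf.add(x)
                changed = True
        for (cdu, member) in membership:          # condition (iii)
            if member in rf and cdu not in rf:
                rf.add(cdu)
                changed = True
    return rf


# Example (10), before the continuation in (11).
rel_10 = {("hat", "leaving"): "Explanation"}
rf_before = right_frontier(rel_10, set(), last="leaving")

# After updating with (11).
rel_11 = {**rel_10, ("hat", "sorry"): "Comment", ("sorry", "enjoy"): "Explanation"}
rf_after = right_frontier(rel_11, set(), last="enjoy")

# Example (9), with (9a)-(9c) linked by Continuation and grouped in a cdu.
rel_9 = {("9a", "9b"): "Continuation", ("9b", "9c"): "Continuation"}
mem_9 = {("cdu", "9a"), ("cdu", "9b"), ("cdu", "9c")}
rf_9 = right_frontier(rel_9, mem_9, last="9c")

print(rf_before, rf_after, rf_9)
```

Under these assumptions the sketch agrees with the discussion above: both units of (10) are on the RF before (11) is added, the unit about leaving for São Paulo drops off afterwards, and in (9) only (9c) and the complex unit [(9a)–(9c)] remain available.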

Figure 5.3 Two discourse graphs for (9)
The picture that emerges from an account of discourse structure is one in which the attachment point of a new discourse unit to an existing discourse graph is itself an anaphoric process guided by a combination of reasoning about world knowledge and linguistic cues. When a speaker makes a new utterance, determining to what part of the conversation their new utterance is relevant is in fact a complex process. Given the hypothesis developed in Section 5.2.1, that pronominal anaphora resolution and temporal interpretation are byproducts of inferring discourse relations and structure, it follows that the former are guided by the same complex reasoning processes as the latter. While the RF cannot on its own determine where a discourse unit will attach – and thus in what unit a pronoun must find its antecedent or a temporal expression must be interpreted – it helps to greatly restrict the possibilities and facilitate discourse comprehension while offering a more comprehensive mechanism for interpretation.
5.3 Complex Discourse Structures
If the nature of dynamically evolving, complex discourse structures can influence the interpretation of semantic phenomena such as pronominal anaphora resolution and temporal interpretation, the question arises as to what other semantic phenomena might be efficiently modeled by exploiting the full machinery of a theory of discourse structure. In this section, we examine three other types of phenomena that we feel are best analyzed through the lens of a discourse theory, namely (certain types of) at-issue and not-at-issue content, discourse goals, and multimodal interactions.
5.3.1 At-Issue and Not-at-Issue Content
Theories of discourse structure in the tradition of SDRT and RST have focused largely on defining the function of a discourse unit in terms of the kind of discourse relation to which it contributes: whether it serves to explain something, to answer a question, to continue a narrative, and so on. But this is not the only way to understand discourse function: fueled by the observations and theory presented in Potts (2005), there is an ongoing and lively debate in linguistics and philosophy of language about how to classify discourse content in terms of how central it is to discourse development and, often, to discourse goals or purposes. In current terminology, the challenge is to determine the conditions under which content is at-issue (ai), and thus central to discourse development and/or discourse goals, or not-at-issue (nai), and thus relevant to a discourse in some more indirect way.
In this subsection, we take a look at recent work that brings theories of discourse structure and interpretation to bear on the ai/nai discussion by focusing on two phenomena that have been said to involve nai content: appositive relative clauses and discourse parenthetical reports. Before addressing these topics in turn, however, we need to clarify what is meant by ai and nai content. Efforts to define these concepts more precisely have led to a variety of diagnostic tests, and because these tests do not always yield the same judgments, the result is that there is more than one way of carving up the ai/nai distinction (Koev 2018). Here we will focus on two ways of categorizing ai and nai content: as backward-looking ai/nai or as forward-looking ai/nai.
To determine if content is backward-looking ai/nai, we look at how it interacts with the preceding discourse. Consider (13):
(13)
Marie, the chemistry teacher at our old high school, is joining our volleyball team.
We say that the main clause of (13) is backward-looking ai while the appositive relative clause is backward-looking nai because the former must be relevant to the preceding discourse in a way that the content of the latter need not be, as shown by the contrast between (14) and (15):
(14)
a. Who is joining your team this year?
b. Marie, the chemistry teacher at our old high school, is joining.
(15)
a. Who is Marie?
b. ?? Marie, the chemistry teacher at our old high school, is joining our volleyball team.
The infelicity of (15b) arguably shows that the main clause of a sentence containing an appositive relative clause must convey main point content, i.e. be backward-looking ai, while the acceptability of (14b) shows that an appositive relative clause can be backward-looking nai.
Forward-looking ai status is diagnosed by looking at possibilities for subsequent discourse continuations, like those in (16a) and (16b):
(16)
Marie, the chemistry teacher at our old high school, is joining our volleyball team.
a. That’s not true! (= It’s not true that Marie is joining the team.)
b. Wait, I thought she was the physics teacher.
The main clause content of (16) is forward-looking ai because it is treated as more salient or discourse central by subsequent discourse moves, as shown by the fact that the pronoun that in (16a) seems to automatically target this content, while ignoring that of the appositive relative clause. Correcting the latter requires more effort, as shown by (16b); here, the speaker must employ explicit descriptive content to show that she is taking issue with the appositive, suggesting that the appositive content is forward-looking nai (cf. Von Fintel 2004).
With these notions of forward-looking and backward-looking ai/nai content in place, we now turn to a discussion of how discourse structure has been exploited to model the behavior of two types of content that sometimes exhibit unexpected ai behavior: appositive relative clauses and the embedded clauses of speech reports. While the foregoing discussion might lead us to conclude that appositive relative clauses are by their very nature vehicles for backward-looking and forward-looking nai content, the following subsection introduces data that show they can be both backward-looking and forward-looking ai in certain cases. We then focus on data that show that the embedded content of a speech report can be backward-looking ai even while syntactically embedded under content that appears to be backward-looking nai. In both cases, we show how a theory of discourse structure can be brought to bear on these phenomena in a way that accounts for their nuanced behavior. While a discourse-based account of appositive relative clauses emerges naturally from existing tools such as the RF, an account of speech reports requires some supplemental assumptions.
Appositive Relative Clauses
As pointed out by numerous authors, appositive relative clauses pass diagnostic tests for forward-looking ai content when they appear in sentence-final position. This is illustrated by the fact that the direct rejection in (17b) targets the content of the appositive as easily as (17a) targets the content of the main clause (AnderBois et al. 2015; Syrett & Koev 2015):
(17)
This year, we’ll be joined by Marie, (who was) the chemistry teacher at our old high school.
a. That’s not true. She’s moving to Germany now.
b. That’s not true. She was the physics teacher.
In fact, even appositive relative clauses in sentence-medial position can arguably convey forward-looking ai content in certain cases. Compare (18) and (19), from Hunter and Asher (2016).
(18)
a. Marie, the best volleyball player in the district, is joining our team.
b. We’re going to be invincible!
(19)
a. Marie, the worst volleyball player in the district, is joining our team.
b. ? We’re going to be invincible!
While a sentence-medial appositive relative clause cannot be targeted by a direct rejection such as That’s not true, (18) and (19) show that such a clause can nevertheless play a central role in the acceptability of discourse continuations. And crucially, it can play this role even if a speaker makes no particular effort to raise this content to salience. In contrast to the appositive in (16), which must be explicitly targeted by a move like (16b) in order to be made salient, the appositive in (18) is automatically understood to be a part of the speaker’s main point – that they’re going to be invincible because the best player in the district is joining their team. Arguably, then, sentence-medial appositives can sometimes be forward-looking ai, even if direct rejection tests fail to diagnose them as such.
Examples similar to (18) suggest that sentence-medial appositive relative clauses can be backward-looking ai as well (Syrett & Koev 2015):
(20)
a. Our team is so much stronger this year.
b. Marie, the best player in the district, joined our team in March.
Without the appositive relative clause, it might have been possible to infer from (20) that the team is stronger because Marie joined, but for an audience who does not know Marie or how good a player she is, this interpretation is greatly aided by making explicit why Marie’s presence would strengthen the team. As with (18), the appositive content in (20b) plays a central role in conveying the speaker’s main point, this time by directly contributing to the explanation of (20a).
In a discourse theory, the above observations fall out naturally by appealing to the nature of subordinating relations and the RF (Hunter & Asher 2016; Jasinskaja 2016; cf. Asher 2000). Let’s begin with (17). Ignoring the frame adverbial This year, which would introduce complexities irrelevant to the current discussion, (17) can be decomposed into two discourse units: a main-clause unit (we’ll be joined by Marie) and an appositive unit ((Marie was) the chemistry teacher at our old high school). In SDRT, these units will be related by the subordinating relation Elaboration, with the main-clause unit as its first argument and the appositive unit as its second, because the content of the appositive unit elaborates on the entity Marie, introduced in the main-clause unit. Recall that the RF includes: (i) Last, (ii) any unit x directly superordinate to a node y on the RF, and (iii) any cdu x that includes a node y on the RF. The appositive unit satisfies condition (i), as it is the most recently uttered discourse unit, while the main-clause unit satisfies condition (ii) because it is superordinate to the appositive unit (e.g. the source of a subordinating relation connecting the appositive unit to the graph). We thus predict that both units are on the RF and available for discourse continuations, as shown in Figure 5.4.

Figure 5.4 Discourse graph for (17) showing that both edus are on the RF
The definition of the RF can also be used to predict that the medial appositive relative clause in (18) cannot be targeted by a direct rejection although it can be relevant for discourse continuations like that in (18b). As illustrated in Figure 5.5, (18) can be decomposed into two discourse units: an appositive unit (the best volleyball player in the district) and a main-clause unit (Marie is joining our team). In this case, the main clause is the last completed unit in (18a), and so it follows from condition (i) of the RF that it can support discourse continuations. The appositive unit, by contrast, fails to satisfy (ii), because it is actually subordinate to the main-clause unit, not superordinate to it. Thus we predict, correctly, that the appositive unit alone cannot be targeted by a discourse continuation like That’s not true. However, if the appositive unit contributes to a complex discourse unit that contains another unit on the RF, then by condition (iii), we predict that the entire cdu can support discourse continuations. And this is what we observe: the cdu formed by the two units of (18a) supports the continuation in (18b). Parallel remarks can be made for the discourse centrality of the appositive relative clause in (20b): the complex discourse unit as a whole provides the explanans, making the appositive discourse central.

Figure 5.5 Discourse graph showing that the appositive relative clause in (18) is inaccessible as the sole target of direct rejection but can figure in a cdu that licenses discourse continuations
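The three RF conditions recalled above can be sketched in a few lines of Python. This is only an illustrative toy, not an implementation of SDRT: the class name DiscourseGraph, the unit labels "main", "appositive" and "cdu", and the decision to record Last as the most recently added unit are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set, Tuple


@dataclass
class DiscourseGraph:
    """Toy discourse graph; every name here is a hypothetical illustration."""
    # Each edge is (source, target, relation name, is_subordinating).
    edges: List[Tuple[str, str, str, bool]] = field(default_factory=list)
    # Complex discourse units: cdu label -> labels of the units it contains.
    cdus: Dict[str, Set[str]] = field(default_factory=dict)
    last: Optional[str] = None  # most recently completed unit ("Last")

    def add_unit(self, label: str) -> None:
        self.last = label

    def relate(self, source: str, target: str, relation: str,
               subordinating: bool) -> None:
        self.edges.append((source, target, relation, subordinating))

    def right_frontier(self) -> Set[str]:
        if self.last is None:
            return set()
        frontier = {self.last}  # condition (i): Last
        changed = True
        while changed:
            changed = False
            # condition (ii): units superordinate to a node on the RF
            for source, target, _relation, subordinating in self.edges:
                if subordinating and target in frontier and source not in frontier:
                    frontier.add(source)
                    changed = True
            # condition (iii): cdus that include a node on the RF
            for label, members in self.cdus.items():
                if label not in frontier and members & frontier:
                    frontier.add(label)
                    changed = True
        return frontier


def graph_17() -> DiscourseGraph:
    # Sentence-final appositive: the appositive is the last completed unit.
    g = DiscourseGraph()
    g.add_unit("main")
    g.add_unit("appositive")
    g.relate("main", "appositive", "Elaboration", subordinating=True)
    return g


def graph_18() -> DiscourseGraph:
    # Sentence-medial appositive: the main clause is the last completed unit.
    g = DiscourseGraph()
    g.add_unit("appositive")
    g.add_unit("main")
    g.relate("main", "appositive", "Elaboration", subordinating=True)
    g.cdus["cdu"] = {"appositive", "main"}
    return g


rf_17 = graph_17().right_frontier()
rf_18 = graph_18().right_frontier()
print(sorted(rf_17))  # ['appositive', 'main']
print(sorted(rf_18))  # ['cdu', 'main']
```

On these assumptions, both units of (17) come out on the RF, whereas in (18) the appositive unit is reachable only through the cdu that contains it, which matches the predictions described in the text.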
Recasting the ai/nai distinction as a byproduct of constraints that guide discourse attachment and the construction of complex discourse structures provides an independently motivated and flexible notion of ai/nai content that accounts for the variable ai status of appositive relative clauses. Within such a framework, there is no need to posit that appositive content is by its very nature nai or that it gives rise to a special interpretation procedure (cf. AnderBois et al. 2015); nor do we need to posit new syntactic constraints as in Koev (2013). In the next section, we consider another phenomenon that raises questions about the ai/nai distinction, namely discourse parenthetical interpretations of indirect speech reports. Like the behavior of appositive relative clauses, the behavior of the embedded clauses of speech reports appears to motivate a discourse-level explanation. Unlike the former, however, a discourse-based analysis of discourse parenthetical reports requires us to adopt some new assumptions.
Discourse Parenthetical Reports
In certain cases, the embedded clause of an indirect speech report seems to convey backward-looking ai content despite being syntactically embedded under content that is less discourse central. Consider the contrast between (21) and (22).
(21)
a. Rose is grumpy.
b. Nicholas said her chocolate cake is dry and bland.
(22)
a. Rose is bringing dessert to the party.
b. Nicholas said she is making a chocolate cake.
In (21), the report in (21b) as a whole explains why Rose is grumpy – regardless of whether or not Rose’s cake actually is dry and bland, Rose is upset simply because Nicholas said it was. Intuitively, we could represent (21) using the first graph in Figure 5.6. A parallel analysis for (22), shown in the second graph, is unsatisfactory, however; the speaker is not suggesting that the event of Rose bringing a dessert to the party is going to furthermore be an event of Nicholas saying that she is making a chocolate cake. The speaker rather seems to be committed to something closer to the elaboration captured by the third graph in Figure 5.6, and the fact that Nicholas said what he did somehow provides evidential support for this elaboration. Following Hunter (2016), we will call speech reports like (22b), in which the embedded content appears to be backward-looking ai while the report clause plays a supportive, evidential role, discourse parenthetical.

Figure 5.6 The main discursive contribution of a discourse parenthetical report (middle graph) is intuitively closer to an example without a report (right graph) than to a nonparenthetical report (left graph)
In an attempt to provide more intuitive annotations for discourse parenthetical reports that more accurately represent the inferences that one can draw from them, numerous discourse theories have proposed that speech reports generate two discourse units, one for the main report clause and one for the embedded clause (Dinesh et al. 2005; Hunter et al. 2006; Buch-Kromann & Korzen 2010; see also Carlson & Marcu 2001). Such an approach has been further supported by experimental work in Simons (2019). As illustrated in (24), for example, the report in (22b) can be decomposed roughly as follows: [Nicholas said [she is making a chocolate cake]], so that the report as a whole introduces one discourse unit, and the embedded clause introduces a separate discourse unit: she is making a chocolate cake. Because the different interpretations of (22b) and (21b) seem to result from how the reports are used in the discourse, rather than some kind of hidden syntactic difference (Simons 2007), we can further assume that all speech reports should be decomposed into two units, even when it is the main report clause that conveys discourse central information, as illustrated in (23).
(23)
a. [Rose is grumpy]
b. [Nicholas said [her chocolate cake is dry and bland]]
(24)
a. [Rose is bringing dessert to the party]
b. [Nicholas said [she is making chocolate cake]]
To derive the distinction between nonparenthetical and parenthetical readings, then, one approach is to posit that they involve two different discourse relations, say Attribution in (21b) and Source in (22b) (Hunter et al. 2006), as shown in Figure 5.7. While Attribution mirrors the syntactic structure of a report, keeping the embedded clause subordinate to the main clause, Source reverses the order of its arguments so that the embedded clause can be directly related to the discourse preceding the report. This reflects the intuition that the embedded clause is backward-looking ai and is thus intuitively central for the incoming discourse context. Semantically, Attribution does not entail the truth of its second argument – i.e. the content of the embedded clause – and so is interpreted as expected for a report involving a nonfactive verb. When the embedded clause of a report contributes the first argument of Source, however, its truth is entailed. Relations such as Elaboration and Explanation are veridical, meaning that they entail the truth of both of their arguments; it thus follows that if the embedded content of a report attaches to the incoming discourse via one of these relations, its truth is entailed.

Figure 5.7 In the relation Attribution, the embedded clause of a report is subordinate to the main clause, mirroring the syntactic structure of the report; in Source, the main clause is subordinate to the embedded content, allowing the latter to enter directly into discourse relations with the incoming discourse context
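The contrast between veridical relations and Attribution can be made concrete with a small sketch. It assumes, for illustration only, that each relation can be summarized by a pair of flags stating which of its arguments it entails; the table VERIDICALITY, the function entailed_units, and the unit labels are hypothetical, and the flag for the first argument of Attribution (that the reporting event itself took place) is an assumption not spelled out in the text. Source is left out because only the status of its first argument is described above.

```python
from typing import Iterable, Set, Tuple

# Relation name -> (first argument entailed?, second argument entailed?).
# Elaboration and Explanation are veridical for both arguments; Attribution
# does not entail its second argument (the embedded clause). Treating its
# first argument as entailed is an assumption made for this sketch.
VERIDICALITY = {
    "Elaboration": (True, True),
    "Explanation": (True, True),
    "Attribution": (True, False),
}

RelationInstance = Tuple[str, str, str]  # (relation, first arg, second arg)


def entailed_units(relations: Iterable[RelationInstance]) -> Set[str]:
    """Collect the units whose truth the listed relations entail."""
    entailed: Set[str] = set()
    for name, first, second in relations:
        first_entailed, second_entailed = VERIDICALITY[name]
        if first_entailed:
            entailed.add(first)
        if second_entailed:
            entailed.add(second)
    return entailed


# (21): the report as a whole explains why Rose is grumpy.
nonparenthetical_21 = [
    ("Explanation", "rose_grumpy", "nicholas_said"),
    ("Attribution", "nicholas_said", "cake_dry"),
]

# (22), with the embedded clause attached directly via Elaboration.
direct_attachment_22 = [
    ("Elaboration", "rose_dessert", "making_cake"),
]

commitments_21 = entailed_units(nonparenthetical_21)
commitments_22 = entailed_units(direct_attachment_22)
print("cake_dry" in commitments_21)     # False
print("making_cake" in commitments_22)  # True
```

Under these assumptions, the nonparenthetical analysis of (21) leaves the truth of the embedded clause open, while attaching the embedded clause of (22b) through a veridical relation entails it.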
While the Source relation addresses the intuition that the embedded content is backward-looking ai, it creates new problems. First, it fails to capture examples in which both the main clause and the embedded clause of a speech report are backward-looking ai, as illustrated by (25) (for extended examples, in which the clauses are related to very different parts of a discourse, see Hunter 2016).
(25)
a. Have you talked to the guests? What are they bringing?
b. Nicholas said Rose is bringing a chocolate cake, and he said that he would bring chips and guacamole. Kate is bringing veggie burgers, but I haven’t heard from Isabel. Do you think I should call her?
In this example, the second speaker uses consecutive utterances to simultaneously provide suites of answers to both questions in (25a), telling the first speaker which guests they have talked to, as well as addressing the question of what each person is bringing. It is thus hard to say for any of the reports in (25b) which is the unit that attaches to the incoming discourse context or licenses discourse continuations, so adopting Source to represent the discourse centrality of the embedded content is unmotivated at best (cf. also two-dimensional accounts such as Maier & Bary 2015). Moreover, adopting Source fails to account for the fact that a speaker who uses a discourse parenthetical report generally hedges their commitment to the embedded content of that report: in (22b), the speaker is not fully committed to the claim that Rose is making a chocolate cake for the party; they are committed at most to a weakened form of Elaboration between (22a) and the embedded content, so unqualified Elaboration is too strong. This observation would be naturally explained by appealing to the fact that the embedded content is in the scope of a speech report – that is, by making the embedded unit subordinate to the report unit via Attribution.
Now we’re back to the drawing board: if we posit Attribution between the report unit and the embedded unit, how can we represent the intuitive Elaboration relation between (22a) and the embedded unit? Directly relating them will lead to a violation of the RF, as can be seen from Figure 5.8. The report unit and the embedded unit form a complex unit that needs to be attached to the incoming discourse, i.e. (22a). The only way to do this without violating the RF is to attach the report unit to (22a), but that would yield the reading of (22b) that we have rejected, namely a reading in which Nicholas saying what he did elaborates on the event of Rose bringing a dessert to the party. Attaching the embedded unit directly to (22a) is not permitted by the RF as introduced in Section 5.2.2: (22a) is not Last for the embedded unit (the report unit is), nor is (22a) superordinate to the report unit via a chain of subordinating relations, as the report unit is not attached to (22a) at all.

Figure 5.8 Connecting the second argument of Attribution directly to a discourse unit preceding the report leads to an RF violation
As argued in Hunter (2016), however, modifying the RFC to allow for such “violations” is independently motivated in the case of third-party speech reports. As Hunter explains, the constraint that a discourse unit attach to another unit along the RF is best understood as a constraint on how a speaker presents her own commitments. When a speaker decides to use someone else’s commitments to make a point, they must first set up this commitment space before using what that person has said to make a contribution to the larger discourse. Of course, while speakers can use things that others have said to, say, elaborate on or explain other discourse units, we have seen that when they do so, they weaken their own commitments to the reported content. It follows that they hedge their commitments to the proposed relations as well: if the speaker of (22b) is not entirely committed to the claim that Rose is making a chocolate cake, they cannot be entirely committed to the claim that the event of Rose’s bringing a dessert to the party is going to be an event of her bringing a chocolate cake to the party.
Accordingly, Hunter (2016) posits that anytime a speaker opts for a discourse parenthetical report, the report will contribute an instance of Attribution, just as a nonparenthetical report would, but when we link the embedded content to the discourse context preceding the report, the Attribution will have the effect of weakening the speaker’s commitment to the relation. That is, rather than an unqualified Elaboration as in the graph above, linking a preceding unit to content inside of an Attribution context weakens the relation to a hedged Elaboration. This proposal allows us to systematically derive the difference between discourse parenthetical and nonparenthetical readings with minimal, well-motivated adjustments to a classic discourse theory: in discourse parenthetical readings, the embedded content is backward-looking ai and thus related directly to the incoming discourse context, although the speaker’s commitment to this relation is hedged; in nonparenthetical readings, the main clause is backward-looking ai and no speaker commitment is entailed to the embedded content of the report (for third-person reports). Furthermore, this approach predicts that both the main and embedded clauses of a speech report can be backward-looking (and forward-looking) ai, as desired.
In the analysis of at-issue and not-at-issue content that emerges from this section, at-issue content is content that is central to discourse development. A discourse unit is ai if it attaches directly to the incoming discourse context via a discourse relation or supports anaphoric continuations that need not explicitly evoke its content; a discourse unit is nai if it is contributed by a syntactically complex discourse move that contributes more than one discourse unit, and it does not (or cannot) attach to the incoming context or cannot support anaphoric continuations on its own. The possibilities for attachment of a discourse unit are in turn governed by the RF and rules limiting discourse development. As mentioned at the outset of this section, however, ai content is sometimes presented as content that directly addresses a speaker’s discourse goals. In the next section, we take on the topic of modeling discourse goals in a theory of discourse structure and interpretation and show how goals and at-issue content are decoupled in more recent work.
5.3.2 Discourse Goals and Subjectivity
Language is a tool for achieving one’s ends, even if one’s goal is merely to pass the time or to make someone laugh. Understanding how language can be used to bring about certain effects on one’s audience has been of interest to the study of language going back to ancient work on rhetoric. The study of discourse goals has also recently become a main topic of interest in discussions of discourse analysis, from SDRT to theories of conversation centered around Questions Under Discussion, in which a discourse goal is understood as a question that a discourse move is expected to address (QUDs; Simons et al. 2010; Ginzburg 2012; Roberts 2012). Modeling goals is important for discourse interpretation for multiple reasons. First, because we can expect a speaker’s discourse goals to guide the discourse moves they make and the way they put them together, we can expect discourse structure and goals to be very closely related. Moreover, an interpreter’s expectations concerning what a speaker aims to achieve with her discourse will affect not only how she chooses to converse with the speaker, but also how she interprets the speaker’s moves when there is ambiguity (as is often the case at the discourse level).
In this section, we take a look at how discourse relations and complex discourse structures can be used to model discourse goals. We also consider how the relation between discourse goals and ai content should be understood given the discourse structural perspective on ai content developed in the previous section. We conclude by showing how different perceptions of discourse goals can lead to different interpretations of what is said in discourse.
Goals
Sometimes, speakers converse to get information from an interlocutor or to persuade someone of a position; in other exchanges, the desired outcome might be an action of some sort, as in (26).
(26)
a. Julie: It’s time to go to bed.
b. Rose: OK, good night.
c. [nonlinguistic action: Rose goes to bed.]
Intuitively, one might simply say that Julie’s discourse goal in (26) is to get Rose to go to bed, as she does in (26c). If (26) is a conversation that goes well for Julie and meets her discourse goals, however, (27), which has the same outcome, is a much less satisfactory conversational exchange:
(27)
a. Julie: It’s time to go to bed.
b. Rose: OK, but I’m still watching my show.
c. [30 mins later] Julie: OK, Rose it’s really time now to go to bed.
d. Rose: I’m still watching my show. You told me I could!
e. Julie: It’s no longer the same show! No story! [followed by half an hour of arguing …]
f. [Rose goes to bed.]
A satisfactory model of the relation between discourse structure and goals must focus not only on whether a conversation successfully achieves a desired outcome, but also on how the outcome is achieved. Speakers are usually trying to satisfy multiple constraints at once.
Because a full understanding of discourse goals usually requires modeling extended discourses and goals can be ranked not only by their final outcomes but by the different paths that the conversation can take to achieve these outcomes, recent work in this area models discourse goals as sets of full discourse structures – the structures in which the conversation “goes well” for a particular conversationalist. Asher et al. (2017) model a conversational goal as a subset of all possible conversations or discourse structures in the sequential game space of all possible discourse moves. The goal of making a conversation coherent, for example, is modeled as the set of all coherent discourse structures or, alternatively, as the set of all conversations or strings of discourse moves that generate such structures. Not angering one’s interlocutor might be another goal, denoting a different set of structures. Exogenously given decision problems or conversations aimed at answering a particular question are also instances of such goals. Goals can be complex, formed from combinations of simpler goals. Where Win_i is the set of goals for a player i, the strategies that i adopts in conversation – the discourse moves that i chooses to make and how they are related – will be adopted to steer the conversation into Win_i.
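The idea of a goal as a set of conversations can be sketched by representing each goal as a membership test over sequences of moves. The sketch below is a deliberately simplified illustration, not the game-theoretic model of Asher et al.: conversations are flattened into strings of labeled moves, and the move labels, the helper functions ends_with, avoids and all_of, and the particular winning condition assigned to Julie are all assumptions made for the example.

```python
from typing import Callable, Tuple

Move = Tuple[str, str]                    # (speaker, hypothetical move label)
Conversation = Tuple[Move, ...]           # a string of discourse moves
Goal = Callable[[Conversation], bool]     # membership test for a set of conversations


def ends_with(act: str) -> Goal:
    """Goal satisfied by conversations whose final move has the given label."""
    return lambda conv: bool(conv) and conv[-1][1] == act


def avoids(act: str) -> Goal:
    """Goal satisfied by conversations in which the given label never occurs."""
    return lambda conv: all(label != act for _speaker, label in conv)


def all_of(*goals: Goal) -> Goal:
    """Complex goal: the intersection of the sets denoted by simpler goals."""
    return lambda conv: all(goal(conv) for goal in goals)


# Rough encodings of (26) and (27); the labels are illustrative only.
conversation_26: Conversation = (
    ("Julie", "request_bedtime"),
    ("Rose", "accept"),
    ("Rose", "goes_to_bed"),
)
conversation_27: Conversation = (
    ("Julie", "request_bedtime"),
    ("Rose", "stall"),
    ("Julie", "request_bedtime"),
    ("Rose", "stall"),
    ("Julie", "argue"),
    ("Rose", "goes_to_bed"),
)

outcome_only = ends_with("goes_to_bed")
win_julie = all_of(outcome_only, avoids("argue"))

print(outcome_only(conversation_26), outcome_only(conversation_27))  # True True
print(win_julie(conversation_26), win_julie(conversation_27))        # True False
```

On this toy encoding, a goal defined by outcome alone cannot distinguish (26) from (27), whereas a goal that also constrains the path taken by the conversation does.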
Now if discourse goals are modeled as (sets of) full discourse structures, and ai content is defined in terms of attachment within larger discourse structures, what is the relation between ai content and goals? Consider the following exchange from the film The Princess Bride, in which the ai/nai distinction is exploited to achieve a discourse goal:
(28)
a. Buttercup: He [Humperdink] … can find a falcon on a cloudy day, he can find you!
b. Wesley: You think your dearest love will save you?
c. Buttercup: I never said he was my dearest love, and yes, he will save me. That I know.
d. Wesley: You admit to me that you do not love your fiancé?
e. Buttercup: He knows I do not love him.
As background for those who haven’t seen the cult classic, Buttercup and Wesley had previously been in love and had planned to marry, but then Wesley was taken hostage by pirates and Buttercup was told that he was dead. A few years later, Prince Humperdink chose Buttercup to be his bride, though she had no desire to marry him. In this scene, Buttercup has been taken hostage by Wesley whom she believes, given his disguise and behavior, to be the pirate who killed Wesley. In the conversation above, Wesley exploits the fact that Buttercup does not recognize him to try to get her to say, without exogenous influence, whether she still loves him.
Let’s now look at how Wesley uses the conversation to achieve this goal. In (28a), Buttercup tries to intimidate Wesley so that he will release her, and Wesley follows up in (28b) with a confirmation question that seems to directly address her goal, but in fact, he is merely seizing the opportunity for his own ends. He aims to find out whether Buttercup loves Humperdink, but surmises that a direct question might make Buttercup suspicious or trigger feelings of guilt, leading to a less than fully reliable answer. Wesley thus opts to disguise his question in a presupposition (your dearest love), a paradigm nai construction, in (28b). Buttercup takes the bait in (28c) and directly responds to the noncentral content of Wesley’s question, which allows Wesley to follow up directly on his real question about whether she loved Humperdink in (28d). Wesley continues to return to the topic in later scenes, and ultimately admits to Buttercup that he disguised himself in order to get an honest answer to his question.
In (28), the presuppositional content in (28b) is arguably more directly related to Wesley’s discourse goal than is the ai content. It’s not just that Wesley wants an answer to the question of whether Buttercup still loves him; he wants to get this answer in the most reliable way possible, and opting for a presuppositional expression figures in an optimal strategy for achieving this outcome. From the discourse-driven perspective developed in this chapter, then, ai content turns out to be a very local notion in the sense that it is understood in terms of how a discourse unit attaches to the incoming discourse or licenses subsequent discourse moves. A discourse goal, on the other hand, will generally be a much larger structure (or set of larger structures) and we do not predict that ai content will reflect a discourse goal in any direct sense.
This understanding of the relation between goals and ai content stands in contrast to that developed in the QUD-based account of Roberts (2012). The latter assumes that conversation is a fundamentally cooperative activity aimed at getting more information about the world and posits that ai content is content that directly addresses a speaker’s discourse goal, which is understood as the question (QUD) that the discourse tries to answer. However, while someone who has a cooperative goal of sharing information with an interlocutor might find ai constructions to be the most straightforward means of conveying discourse central information, people adopt a wide variety of goals that might make use of ai content in less direct ways, and in some cases, hiding one’s central concerns behind nai content might be preferable. It follows that in a discourse-based account of goals and ai content, an ai discourse unit may not have a direct relation to a discourse goal, but merely play an important part in how that goal is realized.
Bias and Subjective Interpretation
In (28), while Wesley and Buttercup seem to agree on what has been said in the conversation, the fact that they come to the conversation with different sets of background beliefs, including their understanding of whom Buttercup is talking to, leads to importantly different perceptions of Wesley’s discourse goal. In other situations, discrepancies in background beliefs and expectations can lead to different interpretations of discourse structure. To illustrate this we revisit (8), repeated here as (29):
(29)
The city council denied the demonstrators a permit and they advocated violence.
Example (29) is ambiguous: it can be interpreted as expressing either an instance of Parallel or of Result depending on the context, the interpreter’s background beliefs, and expectations about the speaker’s discourse goals. Such small-scale ambiguities both at the level of relation type and attachment point arise somewhat regularly in discourse interpretation, and are a familiar phenomenon to anyone who has tried to annotate texts for discourse structure and had to arbitrate inter-annotator disagreement. In conversation, such interpretive differences might not be exposed unless one interpretation comes into contradiction with some other part of the discourse. Thus, a speaker and interpreter might have conflicting interpretations of a discourse without even realizing it. This is not always problematic; it might be completely irrelevant to an interpreter’s goals to settle on one interpretation or another.
In other cases, however, disagreements about discourse interpretation can become central to discourse content and development, and even have legal ramifications. Consider the following exchange, discussed in Asher and Paul (2018), in which a reporter is questioning Sheehan, the spokesperson for the former US senator Norm Coleman:
(30)
a. Reporter: On a different subject is there a reason that the Senator won’t say whether or not someone else bought some suits for him?
b. Sheehan: Rachel, the Senator has reported every gift he has ever received.
c. Reporter: That wasn’t my question, Cullen.
d. Sheehan: (i) The Senator has reported every gift he has ever received. (ii) We are not going to respond to unnamed sources on a blog.
e. Reporter: So Senator Coleman’s friend has not bought these suits for him? Is that correct?
f. Sheehan: The Senator has reported every gift he has ever received …
In (30b), Sheehan responds to the Reporter’s question in (30a). Sheehan acts as though he is answering the question, and an audience biased towards Sheehan or the senator he represents might very well take his response as an answer (and likewise for Sheehan’s other responses). The reporter, however, clearly does not interpret Sheehan’s move as an answer, leading to a repetitive back and forth exchange, as each tries to push their particular discourse goal.
Asher and Paul (2018) provide a way of modeling competing interpretations of a conversation in an epistemic game-theoretic framework, and they show how discourse goals, and interpreters’ views on these goals, influence discourse interpretation. They also show how interpreters of conversations such as (30) can become more and more convinced of their interpretation as the dialogue continues. Supporters of the reporter see Sheehan’s repetition of “the Senator has reported every gift he has ever received” as confirming more and more that he is evading the reporter’s questions, while supporters of Sheehan become more confirmed in their belief or bias that Sheehan has answered the question and that it’s time to move on. This phenomenon of bias-hardening in interpretation gets replayed at the level of beliefs as well, and can be very hard to control, let alone eliminate. It is familiar from political discussions and even personal relationships, and given the impact that it can have on our ability to use language to exchange ideas or learn about our world, it is an important topic for philosophers and linguists to grapple with.
5.3.3 Multimodal Interactions
As hinted at in the discussion of (26) and (27), complex discourse structures can also be employed to model multimodal discourse. Let’s return to example (26), repeated here as (31):
(31)
a. Julie: It’s time to go to bed.
b. Rose: OK, good night.
c. [Rose goes to bed.]
The exchange in (31) culminates in a nonlinguistic event of Rose going to bed, but with a young child who still needs guidance, successfully getting her to go to bed would likely involve multiple multimodal exchanges along the way. (32) offers one such example:
(32)
a. Julie: It’s time to get your pyjamas on.
b. [Rose puts on her pyjamas]
c. Julie: OK. Now let’s go brush your teeth.
In (32), the event of Rose putting on her pyjamas contributes semantic content to discourse in much the same way as (33b) does in (33):
(33)
a. Julie: It’s time for a snack.
b. Rose: I’d like some applesauce and cookies.
c. Julie: OK. Now go wash your hands.
Rose’s response in (33b) contributes a discourse unit that plays a central role in discourse development; were we to take it out, the remaining discourse would be infelicitous in part because there would be no answer for Julie to acknowledge in (33c) and in part because there would be no concluded event to license the discursive use of now, which indicates that the speaker is moving from one eventuality to another in a sequence. In this case, now is licensed because the discussion about a snack has been concluded and it is time to move on to the next topic. Similarly, in (32), we need to understand the nonlinguistic event in (32b) as contributing propositional content that can serve as an argument to a discourse relation. It is this event, and more specifically the event together with semantic content that is understood to describe it, that licenses the Acknowledgment marked by OK and makes it possible to close off the pyjama discussion and move on to teeth-brushing via an instance of the relation Sequence, whose second argument is Now let’s go brush your teeth.
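To make the structural claim concrete, the following is a minimal sketch of (32) as a labeled graph. The class names, unit labels, and the choice of (32b) as the first argument of Sequence are illustrative assumptions, not part of the analysis above; only the Acknowledgment of the event by OK and the second argument of Sequence come from the text.

```python
from dataclasses import dataclass, field


@dataclass
class Unit:
    """A discourse unit; `linguistic` is False for nonlinguistic events."""
    label: str
    content: str
    linguistic: bool = True


@dataclass
class DiscourseGraph:
    units: dict = field(default_factory=dict)
    relations: list = field(default_factory=list)  # (relation, first, second)

    def add_unit(self, unit):
        self.units[unit.label] = unit

    def relate(self, relation, first, second):
        # A relation can only hold between units that are in the graph.
        for label in (first, second):
            if label not in self.units:
                raise KeyError(f"no discourse unit {label!r} to serve as an argument")
        self.relations.append((relation, first, second))


def build_example_32(include_event=True):
    graph = DiscourseGraph()
    graph.add_unit(Unit("32a", "It's time to get your pyjamas on."))
    if include_event:
        # The nonlinguistic event contributes a unit of its own.
        graph.add_unit(Unit("32b", "Rose puts on her pyjamas", linguistic=False))
    graph.add_unit(Unit("32c_ok", "OK."))
    graph.add_unit(Unit("32c_now", "Now let's go brush your teeth."))
    graph.relate("Acknowledgment", "32b", "32c_ok")
    # Assumption: the event is also the first argument of Sequence.
    graph.relate("Sequence", "32b", "32c_now")
    return graph
```

Calling `build_example_32(include_event=False)` raises a `KeyError`: without a unit for the event there is nothing for OK to acknowledge, which mirrors the infelicity described above.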
Given the claim laid out in Section 5.2 that pronoun resolution is guided by reasoning about discourse relations and structure, we should expect reasoning about the discursive role of nonlinguistic eventualities to influence demonstrative reference as well. Suppose that the exchange in (32) continues with (34):
(34)
a. [Rose starts toward the bathroom]
b. Rose: Wait!
c. [Rose goes back to her bed, grabs her teddy bear, and then heads back to the bathroom]
d. Rose: [looking up at Julie] He needs to brush his teeth too.
In (34d), he will clearly refer to the teddy bear. At first glance, this might not sound so surprising – of course third-person pronouns can be used to refer to entities in the nonlinguistic context. But there is a lot more going on here than demonstrative reference. For one thing, Rose need not point to her bear or even look at him to get the demonstrative reference to work. In addition, understanding the relation between (34d) and the nonlinguistic events described in (34c), and how this interaction contributes to the interpretation of the larger interaction between Julie and Rose, requires more than understanding to whom he refers. Rose is not merely saying that her teddy bear needs to brush his teeth; she is explaining why she is taking him to the bathroom. This Explanation relation is crucial in the context: we can easily imagine a different scenario in which Rose goes back to get her bear because she suddenly remembered that she forgot to give him dinner and now wants to go feed him. This scenario is likely to get a negative reaction from Julie. In explaining her actions as she does in (34d), Rose shows that she understands that it’s time for teeth-brushing and that she is cooperating with Julie’s discourse goal, making her more likely to get a positive reaction. The fact that she is explaining her previous action also explains why she doesn’t need to go to any further trouble to make the bear salient. The entire sequence of events in which she went to get him and then started walking with him in her arms is salient, and he is a central figure in that sequence of events (cf. Stojnić et al. 2013).
The important point is that nonlinguistic eventualities do not only influence the interpretation of a linguistically expressed discourse unit, as entities picked out through deixis do; they can actually contribute entire discourse units, and they can do so without being picked out by any kind of referential expression (Hunter et al. 2018). This means that contents contributed by nonlinguistic events need to be taken into account in models of discourse structure – a difficult task given that nonlinguistic events are parts of the actual world and not just denotations of speech acts. This also means that they might impact discourse development. In fact, Hunter et al. (2018) argue that nonlinguistic eventualities do not contribute to the RF in the way that linguistically expressed contents do and thus have different effects on salience (cf. the concluding discussion of Simons 2019, comparing implicated content and explicit content).
Moreover, the top-down effects of multimodal discourse go beyond the interpretation of deictic or temporal expressions in a clause: given that in multimodal conversation, a nonlinguistic eventuality can contribute an entire discourse unit in the absence of any linguistic description of that event, reasoning about discourse relations and structure can determine an entire event-level content. The event in (32b) might be conceptualized differently in a different context, for instance; we might rather think of it as an event in which Rose changes out of her dirty clothes or simply, Rose changes clothes. But in the context of (32), these other conceptualizations will not do: Julie must understand the event as one in which Rose changes into her pyjamas because only that kind of event will satisfy Julie’s request in (32a). A related point is that there are multiple ways of grouping and describing the events that take place in (32b). While all of the actions involved in (32b) were grouped together under the description Rose puts on her pyjamas, in another context, it might have been more pertinent to focus on some part of this larger event, e.g. Rose put on her pyjama top.
This discussion highlights an aspect of multimodal discourse that makes it very difficult to study systematically. The nonlinguistic context consists of a potentially evolving stream of information that must be decomposed into discourse-unit-level segments according to discourse purposes, but there is nothing like grammatical structure or intonation to suggest segment boundaries. And even if we determine such boundaries, individuated eventualities must be assigned semantic contents. The difficulty of assigning content to nonlinguistic eventualities makes studying either direction of information flow – bottom-up or top-down – a daunting task.
Work by Lascarides and Stone (2009a, 2009b) has made some important first steps to understanding how discourse structure and interpretation can guide the conceptualization or description of nonlinguistic events by focusing on the interaction of discourse and coverbal gesture. Their research also reveals, however, that coverbal gesture illustrates yet a different kind of discursive interaction from either those observed between purely linguistically expressed discourse units or those observed above in (32) and (34). On the one hand, while coverbal gestures are nonlinguistic, they exhibit a kind of dependence on linguistic content that is not observed with contributions like that of (32b) in (32): similar to appositive relative clauses and the embedded clauses of discourse parenthetical reports, coverbal gestures are introduced into the discourse context in conjunction with another discourse unit through a complex update. On the other hand, coverbal gestures affect discourse development in ways that call for a radically different notion of the RF, and even of discourse graphs, than described in Section 5.2 for linguistic content.
5.4 Looking Ahead
The foregoing discussion raises a variety of questions that will be important to future research on discourse structure and interpretation. First, what is the relation between discourse relations and Questions Under Discussion (QUDs)? While our main focus has been on discourse relations, QUDs have become a popular tool among formal semanticists for diagnosing the presence of utterance-level phenomena that semantically depend on the incoming discourse context. Much of this work centers on very specific types of discourse dependencies, such as focus structure, which have not been at the center of attention in work on discourse relations, and so might be seen as complementary. Some linguists have posited that QUDs actually play a more fundamental role in determining salience and driving discourse development, however, and that discourse relations are derivative of them (Roberts 2012), while others have argued against such a position (Hunter & Abrusán 2015). Regardless of how this debate turns out, there are good reasons to think that there is something interesting to be said about the interaction between discourse relations and questions. Let’s return to (5), repeated here as (35).
(35)
a. What are you going to do downtown?
b. Ugh I’m so mad! My brother lost my copy of The Watchmen, and I need to reread it for class. I’m going to the bookstore.
In addition to the relations at work, the speaker of (35b) seems to be answering an implicit question of why a trip to the bookstore is necessary. How does either a discourse structure or QUD account handle this example? And what are the effects for the theory of discourse goals presented in this chapter if part of what drives discourse development is left implicit?
Another ongoing discussion that will continue to be important for future study concerns how to make more precise predictions about discourse attachment. The RF determines a set of nodes available for discourse attachment, but it cannot help predict for a given incoming discourse unit which node on the RF will be the best choice. By adding information about, say, prosody, or different kinds of constructions (such as appositives), or lexical facts, we might be able to say more. Following Hirschberg and Pierrehumbert’s (1986) attempt to link discourse structure and relations to the interpretation of prosody, researchers in SDRT have examined links between questions, prosody and discourse structure (Asher & Reese 2007; Reese 2007; Reese & Asher 2007), but there is much, much more to explore.
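The underdetermination at issue can be illustrated with a deliberately simplified sketch. It assumes a tree-shaped structure in which each new unit attaches to exactly one earlier unit, ignores complex discourse units, and uses an assumed split of relations into subordinating and coordinating; the class and unit names are hypothetical, and the full definition of the RF in Section 5.2 is richer than this.

```python
# Assumed classification of relations, for illustration only.
SUBORDINATING = {"Elaboration", "Explanation"}
COORDINATING = {"Narration", "Result", "Parallel", "Sequence"}


class DiscourseTree:
    def __init__(self, first_unit):
        self.order = [first_unit]            # units in order of production
        self.dominator = {first_unit: None}  # unit -> unit that dominates it

    def right_frontier(self):
        # The last unit, plus every unit that dominates it.
        frontier = []
        node = self.order[-1]
        while node is not None:
            frontier.append(node)
            node = self.dominator[node]
        return frontier

    def attach(self, new_unit, relation, target):
        if target not in self.right_frontier():
            raise ValueError(f"{target} is not on the right frontier")
        if relation in SUBORDINATING:
            # The target dominates the new unit.
            self.dominator[new_unit] = target
        elif relation in COORDINATING:
            # The new unit becomes a sibling of the target.
            self.dominator[new_unit] = self.dominator[target]
        else:
            raise ValueError(f"unknown relation: {relation}")
        self.order.append(new_unit)


tree = DiscourseTree("p1")
tree.attach("p2", "Elaboration", "p1")
tree.attach("p3", "Narration", "p2")   # p2 drops off the frontier
tree.attach("p4", "Explanation", "p3")
candidates = tree.right_frontier()     # ["p4", "p3", "p1"]
```

The sketch rules out `p2` as an attachment site, but it returns three candidates for the next unit and gives no ranking among them. Prosodic, constructional, or lexical cues of the kind mentioned above would have to supply that ranking.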
The discussion in Sections 5.2 and 5.3, however, highlights three significant features that will complicate efforts to answer these questions and to systematically study discourse structure in general. First, because background beliefs and discourse goals add a highly subjective element to discourse interpretation that can be hard to pin down and eliminate, any efforts to use experimental or survey data to study discourse-sensitive phenomena have to be very careful to control contextual elements that could influence interpretation. When it comes to judgments about discourse, the question is not only whether a certain discourse structure is acceptable, but what relations are at work in that structure. If two speakers disagree in their judgments, it could be that they disagree about the acceptability of the very same discourse structure, but it could also be, especially if a lot of context is left implicit, that they imagined different discourse contexts or inferred different discourse goals and thus interpreted the discourse differently.
A second hurdle is that providing an analysis of conversation that takes place in a shared perceptual environment will require modeling relations between linguistically expressed discourse units on the one hand and contents assigned to nonlinguistic actions, events and states, on the other. But as we have seen, individuating nonlinguistic eventualities is not a straightforward task, nor is specifying their semantic contents. The way in which discourse structure and interpretation guides the conceptualization of nonlinguistic eventualities, and vice versa, is still very much an open question.
Finally, developing an analysis of discourse-sensitive phenomena often requires considering extended discourse structures, not just pairs of discourse units, as some phenomena only develop over multiple discourse moves. The relation between the nai content in Wesley’s move in (28b) (You think your dearest love will save you?) and his discourse goal, for example, would not have been apparent had we only considered that move together with (28a) (He [Humperdink] … can find a falcon on a cloudy day, he can find you!). Nor does the fact that both discourse units in a discourse parenthetical report can be backward-looking ai come out if we only consider the report and one preceding discourse unit. Even the effects of the RF are hard to test if we only consider two or three units.
When developing accounts of linguistic phenomena in formal semantics and philosophy, the standard tool of choice is the minimal pair. Approaches to modeling intersentential phenomena extend this to looking at minimal pairs of pairs. Tests for forward-looking at-issueness, for example, tend to apply the “that’s not true!” test to a given example and then apply the “wait” (or “hey, wait a minute!” (Von Fintel 2004)) test to the same example. Similarly, backward-looking at-issueness is often diagnosed with question–answer pairs. Such diagnostic tests are very useful for showing the existence of discourse sensitivity and can shed light on some minimal aspects of discourse structure, as in the case of Simons’s (2019) experiments that support the view that indirect speech reports always contribute two discourse units. But the complexity of discourse structure and the subjectivity of discourse interpretation make these tools inapt for developing explanatory accounts of discursive phenomena. For this, we need discourse examples complicated enough to show the full behavior of the phenomenon in question and also to limit the influence of contextual factors. By embedding a target discourse structure inside of a larger discourse structure, we can better control the background context in which the target structure is interpreted and limit the direction in which an interpreter can expect the discourse to develop.
Extended, natural sounding discourses are difficult to invent, however, and much work on discourse structure and interpretation has heavily relied, and will continue to rely, on the annotation of corpora (see also Abrusán’s discussion of the need for corpus data in this volume). Deep learning approaches, or other machine learning methods designed to entirely bypass annotation, have not been successful at learning discourse structure – the lack of good training data, the sparsity of positive attachments in any given data set, and the presence of long-distance attachments make the task particularly difficult. Of course, corpus annotation comes with its own set of well-known problems. It is incredibly time consuming, first of all, and requires annotators who are ready to think carefully about how to most reasonably represent the content of a given discourse, which usually requires a level of experience that makes finding reliable annotators difficult. To further complicate matters, discourse structure often contains long-distance dependencies, where a discourse unit is attached to a unit that was produced many steps back rather than to the unit that was expressed immediately prior to it. This means that you cannot simply divide a discourse into chunks of three or four units and pass them out to different annotators or appeal to crowdsourcing if you want to get good annotations.
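The cost of chunking can be shown with a small sketch. The attachment list below is invented purely for illustration (it is not drawn from any corpus), and the function names are hypothetical; units are numbered in order of production and each pair records an earlier unit and the later unit attached to it.

```python
def long_distance(attachments, min_gap=2):
    """Attachments whose arguments are at least `min_gap` units apart."""
    return [(a, b) for a, b in attachments if b - a >= min_gap]


def lost_by_chunking(attachments, chunk_size):
    """Attachments whose two arguments fall into different chunks."""
    return [(a, b) for a, b in attachments
            if a // chunk_size != b // chunk_size]


# Hypothetical discourse of eight units (0-7): (earlier, later) pairs.
attachments = [(0, 1), (1, 2), (0, 3), (3, 4), (4, 5), (0, 6), (6, 7)]

far = long_distance(attachments)          # [(0, 3), (0, 6)]
lost = lost_by_chunking(attachments, 4)   # [(3, 4), (0, 6)]
```

With chunks of four units, the annotator of the second chunk never sees unit 0 and so cannot record the attachment (0, 6). Even the adjacent pair (3, 4) is lost because it straddles the chunk boundary.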
For these reasons, future work on discourse structure is going to have to get creative, especially for studying multimodal discourse. Recent attempts to apply distant supervision methods to produce automatic discourse annotations on chat discussion are very promising (Badene et al. 2019a, 2019b). Hopefully, future work will prove the general applicability of these methods to other types of discourse so that we can make the systematic study of discourse structure as accessible as the study of more local semantic phenomena has been.
We have defined a story as a narrative of events arranged in their time-sequence. A plot is also a narrative of events, the emphasis falling on causality. “The king died and then the queen died” is a story. “The king died, and then the queen died of grief” is a plot. The time-sequence is preserved, but the sense of causality overshadows it … Consider the death of the queen. If it is in a story we say “and then?” If it is in a plot we ask “why?”
6.0 Questions and Answers
Questions about narrative structure, and discourse structure more generally, ultimately concern whether there are linguistic representations beyond the sentence level, an issue of import to linguists working at the semantics–pragmatics interface, as well as to philosophers of language. The question of modes of discourse goes back to at least Plato, with implications for philosophy of mind if narrative text is delimited in some way, to say nothing of how it is delimited (e.g. by relationship to time, event ontology, or causality). At the same time, issues of point of view in natural language interpretation have loomed large, in both linguistics and philosophy, across several empirical domains. In this chapter, we introduce a puzzle involving an interaction between how tenses and predicates of personal taste (ppts) are used in narrative discourse. After pinning down which notions of point of view are sensible in these domains, we develop a solution that may help us understand larger architectural questions about narrative structure.
(2) What recent developments in linguistics and philosophy do you think are most exciting in thinking about narrative and point of view?
Recent relativist treatments that split utterance and assessment times have provided new tools for understanding the core properties of ppts (MacFarlane 2014). These have also proved useful for tackling certain puzzling tense uses (Schlenker 2004; Sharvit 2008; Anand & Toosarvandani 2017, 2018, 2020; Bary, this volume). We believe the additional degrees of freedom afforded by relativism offer a framework for attacking the puzzle in this chapter and enable an understanding of the interaction between tense and ppts that is more nuanced than would have been possible before.
(3) What do you consider to be the key ingredients in adequately analyzing narrative and point of view?
Relativist semantics for tense and ppts are necessary ingredients for solving the puzzle introduced in this chapter. But a theory of narrative structure is needed, in addition, that yokes together the point of view encoded in these two domains. We offer the beginnings of such a theory grounded in the pragmatic conventions underlying the narrative genre. Building on the results from the psychology of collaborative storytelling (Edwards & Middleton 1986) and from discourse analysis (Labov & Waletzky 1966), this theory provides a top-down structure for narratives, in which events are described from a unitary perspective.
(4) What do you consider to be the outstanding questions pertaining to narrative and point of view?
One set of questions involves the appropriate formalization of the theory of narrative structure offered. What is the appropriate formal framework for encoding perspective in narratives so that it interfaces appropriately with the intentions and expectations of the speaker (author) and hearers (readers)? How does this framework relate to other formal discourse models (based in, for instance, questions under discussion or discourse representation theory)? A more explanatory question is also relevant here: why is narrative structured in the way it is and not another way?
Another set of more specific questions has to do with the semantics for tense and ppts. While we advance relativist semantics for both kinds of linguistic expressions, much remains to be understood. For tense: Is the temporal perspective encoded by present and past tense in English shared by their correlates in other languages? How is the point of view represented in so-called “narrative” tenses related to the notions introduced in the chapter? For ppts: How is the judge for these expressions determined in narratives, and how might this underlie judge selection in other discourse genres? To what extent do related expressions (e.g. epistemic modals) track ppts in narratives or require distinct perspective-taking mechanisms?
6.1 Setting the Scene
As any reader of a novel or short story knows, the events in a narrative can be described in more than one way. The point of view, or perspective, can shift many times in the course of even a single narrative, sometimes from one sentence to another. When theorists use terms like “point of view” or “perspective,” though, they may have different ideas in mind. In many cases, point of view is meant logically, to represent an implicit argument or parameter necessary for the evaluation of a relational predicate, as is sometimes invoked for positionals like left and behind or the temporal landmark for tense. In other cases, the term is meant to invoke something more cognitive or experiential, such as the epistemic or evaluative state of some salient protagonist, or the embodied experience of a situation (as in the inside–outside distinction discussed in work on mimesis, e.g. Vendler 1982; Walton 1990; Recanati 2007).
While undoubtedly all these perspectival notions are constituents of the aesthetic effect of a narrative, from the point of view of philosophy of language and formal semantics the central questions are about how such categories intersect the structure of natural language: Are there forms or constructions that privilege particular kinds of perspective? Do these forms or perspectives interact? And how do they connect with what makes narrative genres so apparently replete with perspectival switching?
In this chapter, we explore these questions by examining a previously undiscussed interaction between temporal perspective, in the form of the historical present, and evaluative perspective, in the form of predicates of personal taste. By historical present, we mean the noncanonical use of a present tense to describe a past event (see also Bary, this volume), as exemplified below.
If the funeral had been yesterday, I could not recollect it better […] Mr. Chillip is in the room, and comes to speak to me. “And how is Master David?” he says, kindly. I cannot tell him very well. I give him my hand, which he holds in his.
While the historical present clearly changes the logical perspective for tense, it is often claimed to do more, giving the effect that the narrator, the reader, or both are witnessing events before their eyes. It is, thus, a fitting vehicle for exploring how logical perspective shifts may coincide with other notions of point of view.
Our puzzle starts from one of the central issues in the literature on predicates of personal taste (ppts): disagreements involving individual-standard-dependent predicates like delicious or fun seem to be faultless (Kölbel 2003), that is, they have no clear fact of the matter. Consider the following toy dialogue:
(2)
[A and B are tasting a bottle of cider at an apple orchard.]
A: This cider is delicious!
B: No, it’s not delicious.
Intuitively, what is delicious to A here need not be delicious to B, and this is sufficient to allow neither A nor B to be making a mistake despite their seemingly contradictory beliefs.
There is little reason to think this kind of perspective-taking has much to do with what an author does by deploying the historical present. And yet, the two interact, as can be seen by embedding the disagreement above in a joint oral narrative like (3), where A and B together describe a shared experience.
(3)
C: [talking to A and B] How was your vacation?
A: Well, after we arrive in Paris, we take a bus to the Normandy coast. We visit an apple orchard.
B: They have their own cider. It’s delicious!
A1: No, it isn’t delicious.
A2: No, it wasn’t delicious.
In this context, the faultlessness canonically associated with ppts varies with the tense of A’s response. If A uses the simple past, as in the A2 response, the sense of faultlessness can persist. However, if A uses the present tense, as in the A1 response, the disagreement never seems faultless: either she or B has made a mistake about the taste of the cider at the orchard. In short, A can only disagree faultlessly by using the past tense.
The solution to this puzzle, we will advance, lies in the pragmatic conventions that shape the narrative genre. To motivate these conventions, we will draw on the literature on joint oral narratives within psychology. A key empirical generalization comes from Edwards and Middleton’s (1986) seminal study of collaborative storytelling. They show that the participants engaged in such enterprises are strongly motivated to collaboratively construct a story line. However, after a consensus version of what happened has been reached, participants are free to (faultlessly) share their own take on the significance of those events to themselves or others. We take this perspectival structure to characterize narratives in general, a generalization which we state as follows:
(4)
In framing this generalization, we draw on Labov and Waletzky’s (1966) theory of narrative structure. This foundational work within the linguistic discipline of discourse analysis includes a place, not just for a sequence of event descriptions, what Labov and Waletzky call a complication, but also for some component conveying the significance of those situations to conversational participants, what they call an evaluation. While a unitary perspective is enforced in the complication, speakers’ perspectives are permitted to diverge when the broader significance of these events is being considered in the evaluation.
The puzzle in (3) forms the empirical foundation for the npg, whose effects might be hard to discern in single-authored written narratives. We argue that those effects are revealed in such joint oral narratives, where there are multiple speakers whose points of view can, in principle, diverge. However, a linguistic theory of this contrast, involving tense and ppts, needs more than just this empirical generalization. It requires a formal system that can represent the pragmatic principles underlying narrative structure in such a way that they meaningfully interact with the semantic theories of the relevant phenomena. The existing theories of discourse structure within formal semantics, reviewed by Bary, Hunter and Thompson, and Pavese (this volume), make nontrivial claims about the point of view invoked by grammatical and lexical aspect, but they do not enable an understanding of the interaction between tense and appraisal. We instead turn, in Section 6.2, to Roberts’s (2012) notion of a strategy of inquiry, a sequence of questions representing the conversational goals of a discourse that directs the contributions that participants can make. We offer a way to encode the division between complication and evaluation, along with the perspectival limitations these come with, in a strategy of inquiry for narratives.
To connect this theory of narrative structure to the contrast in (3), we introduce a semantics for ppts in Section 6.3. First, we survey contextualist and relativist approaches, aiming to uncover their respective understandings of faultless disagreement. We adopt a relativist approach, in which the notion of propositional content is revised to include a place for a perspective point (Kölbel 2003; Lasersohn 2005; MacFarlane 2014). While there are substantive differences amongst relativist accounts, they all attribute faultlessness to heteroperspectival appraisal – evaluation relative to distinct perspectives – while nonfaultless disagreement arises from homoperspectival appraisal – evaluation relative to a single perspective. Ultimately, we build our account on MacFarlane’s bicontextual semantics for ppts, where the relevant perspective point is a parameter not of the context of utterance but of a context of assessment.
To derive the contrast in (3), a semantics for tense is also required. A recent line of work, which we discuss in Section 6.4, has sought to capture certain unexpected tense uses, including the historical present, by deploying a bicontext (Schlenker 2004; Sharvit 2004, 2008; Eckardt 2012; Anand & Toosarvandani 2017, 2018, 2020). In our extension of Sharvit’s bicontextual semantics of tense, present and past tense describe reference time intervals relative to the time of the assessment context. With both ppts and tense sensitive to the context of assessment, albeit to different parameters, a path to the solution for our puzzle opens up. The npg can be cashed out as a requirement, encoded in a strategy of inquiry, that the complication of a narrative be evaluated from a unitary context of assessment. In a nutshell, the present tense leads to nonfaultless disagreement when it describes past events, as in (3), because its semantics tightly binds the temporal location of an event to the contextual parameter relevant for appraisal. The past tense permits a distal temporal point of view on the events described, and so it is compatible, outside of complications, with appraisal involving past events from present perspectives.
It is important to point out that, while the past tense can be used in (3) to disagree faultlessly, it does not have to be. The simple past in English permits faultless disagreement, though a speaker can also use it, like the historical present, to disagree nonfaultlessly. In Section 6.5, we explore this flexibility, tying it to the broader distribution of past tense forms in narrative. While the historical present is restricted to complications, the simple past can be used throughout a narrative (Wolfson 1979: 171–172; Schiffrin 1981). We revise the existing semantics for past tense to enable this flexibility, engendering a new perspective on the crosslinguistic variation in tense usage.
6.2 The Structure of Narratives
We can start with what a narrative is. A narrative can be transmitted in a written form (e.g. David Copperfield) or orally (e.g. Aesop’s fables or the Panchatantra before they were committed to paper). The events described can be part of an imagined world (a novel) or the actual one (a biography). And for oral narratives, these can be narrated by just a single speaker or jointly by more than one person, as (3) is.
Despite these differences, all narratives describe events, the individuals participating in them, and where these events and individuals are located in time and space. There is no necessary correspondence between how these elements are structured within the story world (what narratologists call the fabula) and how they are described in the narrative (the syuzhet). Mismatches between them could in principle involve any aspect of an event or individual that can be described. But, temporal correspondences between the story world and narrative are particularly salient, perhaps due to the important role that events play in scaffolding our understanding of a story.
Since the sequence of descriptions in a narrative is dictated entirely by the act of speaking or writing, the temporal ordering of events in a story world, whether imagined or real, must be inferred by hearers and readers. A narrative can describe a sequence of events iconically in a forward-moving fashion through narrative progression, as in (5a). Or, the temporal order can fail to correspond to the narrative order, with events temporally overlapping or even inverted through backshifting, as in (5b).
(5)
a. Max stood up. John greeted him.
b. Max fell. John pushed him.
It is these temporal mappings which have primarily animated formal semanticists’ investigations of narrative. The theories they have developed can be divided between two main approaches: reference time theories and discourse coherence theories. We review these briefly below, though reference time theories are discussed further by Bary (this volume) and discourse coherence theories by Hunter and Thompson (this volume) and Pavese (this volume).
To make progress on our puzzle, we will argue for a theory of narrative which, unlike reference time or discourse coherence theories, encodes the goals of narrative production. We will review certain empirical generalizations from discourse analysis and psychology that will allow us to begin to understand what the speakers in narratives, both monologic and dialogic, are aiming to do. And these generalizations, once constituted as pragmatic conventions of the genre and formalized in the question-under-discussion framework (Roberts 2012), will provide a path to understanding how the historical present can be deployed in narratives, and how this leads to a lack of faultlessness with ppts.
6.2.1 Formal Semantic Treatments of Narrative
While formal semanticists have investigated the temporal properties of narratives, developing theories to account for them, they have not necessarily aimed for a theory of narrative.
Reference time theories, for instance, have a relatively restricted scope, seeking primarily to derive the temporal inferences in a narrative from how tense finds a referent in the discourse (Partee 1984; Dowty 1986; Hinrichs 1986; Webber 1988; Caenepeel 1989), as in an anaphoric theory of tense (Partee 1973). Within many of these theories, the variability in temporal relations is traced to lexical and grammatical aspect. The first two sentences of (6), for instance, are understood as taking place one after another, because they are eventive. By contrast, the last two sentences in (6) are interpreted as temporally overlapping the preceding sentences, since they are stative.
(6)
He went to the window. He pulled aside the soft drapes. It was a casement window. Both panels were cranked out to let in the night air. (after Hinrichs 1986: 67)
Reference time theories might seem, at first, well furnished to solve the puzzle posed by (3), given the deep connection they posit between narrative structure and tense. However, their notion of perspective is not particularly well-suited to handle a contrast in faultlessness.
Reference time theories assume a single narrator’s perspective, with the narrative representing their beliefs about the temporal order of events (even if this order is also reflected in the perceptions of a protagonist, as Dowty and Caenepeel contemplate). These theories thus posit a relatively slight formal machinery that includes no explicit place for the speaker-author. But this simplification also prevents these theories from extending to joint oral narratives, like the one in (3), which have more than one speaker. If the possibility or impossibility of faultless disagreement with ppts depends on the individualistic perspective inherent to appraisal, then these individuals and their perspectives must find their way, somehow, into the structure for a narrative.
The goals of discourse coherence theories are, by contrast, more general, aiming to uncover the principles that organize texts of all types (Halliday & Hasan 1976; Hobbs 1979, 1990; Mann & Thompson 1988; Lascarides & Asher 1993; Kehler 2002; Asher & Lascarides 2003). They posit an inventory of primitive coherence relations between sentences, containing temporal information as well as other kinds of information (e.g. causal, spatial), as described by Hunter and Thompson (this volume) and Pavese (this volume). The temporal inferences between sentences in a narrative come from which coherence relations are inferred, rather than being rigidly tied to the aspectual properties of the sentences. When no coherence relation can be inferred, a discourse is infelicitous, as in the defective narrative in (7): it is simply not clear why these events are described in the way they are.
Discourse coherence theories have more room, in principle, for developing an account of the faultlessness contrast in (3), since they aim for a general understanding of why texts cohere. In general terms, the historical present would only be coherent when deployed in a joint narrative if the perspective taken precludes the possibility of faultless disagreement. Since coherence, or the lack thereof, depends on the specific inventory of coherence relations adopted, as well as a calculus for combining them, saying something about faultless disagreement would require that discourse coherence theories make reference in some fashion to the primitives underlying faultlessness.
While it may be possible to enrich a discourse coherence theory like Segmented Discourse Representation Theory (Asher & Lascarides 2003) in this way, we pursue a different path here. A core property of narratives relevant for our puzzle, we believe, involves what speakers are trying to do when they describe a sequence of events. This intentional structure suggests a top-down organization for narratives, which we formalize within Roberts’s (2012) question-under-discussion framework. This is, in principle, compatible with an analysis of narrative in terms of discourse coherence, with the intentional structure being layered onto the network of coherence relations connecting a narrative.Footnote 1
6.2.2 Toward a Theory of Narrative Structure
In the question-under-discussion (qud) framework, questions represent the goals of conversational participants (see Westera this volume). For a typical information-seeking exchange, the goal might, for instance, be to answer the question What is the way things are? These questions, which represent the shared goals of speakers and hearers, can be introduced explicitly, signaled covertly through prosody or other linguistic means, or just inferred. Both conversational participants’ contributions and their expectations about these contributions are involved in inferences about the question under discussion.
As Roberts points out, no discourse comprises answers to some randomly selected set of questions. Conversational participants work together in a systematic fashion towards reaching their final goal. She proposes that a strategy of inquiry is the way they do this: it comprises the qud that is the discourse’s overall goal, along with a sequence of other quds that they plan to use to answer it. It is possible, we think, to characterize narrative in terms of a conventionalized strategy of inquiry. In other words, what goes wrong in a defective narrative like (7) is that we, as readers, cannot infer a suitable strategy of inquiry based on just the two sentences provided.
What might this strategy of inquiry be? Labov and Waletzky (1966), in their influential analysis of oral narratives, show that these are conventionally divided into several parts, illustrated by the narrative below. After an initial orientation (8a), the complication describes the main series of events (8b); this is always accompanied by an evaluation, which conveys the broader significance of these events (8c). (These can be followed by a resolution, and then a coda.)
(8)
a. […] We were all going out for lunch // it was our birthdays // and we were C.I.T.’s // so we were allowed to.
b. We borrowed someone’s car // and we got blown out. […] So we asked some guy // t’ come over an’ help us. // So he opens the car // and everyone gets out except me and my girlfriend. // We were in front // and we just didn’t feel like getting out. // And all of a sudden all these sparks // start t’ fly. // So the girl says, // ‘Look, do you know what you’re doing? Because y’ know um … this is not my car // an’ if you don’t know what you’re doing, // just don’t do anything.’ // And he says, // ‘Yeh, I have t’ do it from inside.’ // And all of a sudden he gets in the car, // sits down, // and starts t’ turn on the motor.
c. We thought he was taking off with us // We really thought- h- he was- // he was like real- with all tattoos and smelled- an’ we thought that was it! hhh // But he got out hhh after awhile. I really thought I was gonna die // or be taken someplace far away. It was so crazy, // because we couldn’t call anybody. // It was really funny. (Schiffrin 1981: 47–48)
Formal semanticists have been primarily interested in the complication, which is comprised primarily of event descriptions with an iconic temporal ordering. Changing the order of the sentences in this narrative spine changes their temporal order, though the complication can also contain additional satellite material that is not temporally ordered relative to the narrative spine.
Labov and Waletzky argue that the evaluation is just as integral to the construction of a coherent narrative as the complication. It assigns an external significance to the events described in the story world. They identify two ways in which evaluations can be realized in narratives. In (8), the evaluation is external: it is a distinct textual segment following the complication, in which the speaker exits the story world, characterizing the events contained within it for the hearers. They suggest an evaluation can also be integrated into the complication itself. In such an internal evaluation, the event descriptions themselves give significance to the story, making its point clear. They can do this relatively indirectly, by inviting the addressee to infer the importance of those events on their own, rather than telling them directly.
Building on these empirical generalizations, we suggest that narratives are the product of a conventionalized strategy of inquiry, an initial version of which we state in (9): the questions it contains correspond to the different components of a narrative identified by Labov and Waletzky.
(9)
Narrative Strategy of Inquiry (nsi; initial version): A narrative is the product of a strategy of inquiry to answer a qud, which contains at least the question What is the way things are (in the story world)?
The evaluation emerges from answering whatever qud the entire strategy of inquiry is dedicated to resolving. This must involve some sequence of event descriptions, a requirement that is encoded by having one of the questions in the strategy be What is the way things are (in the story world)? There might be any number of substrategies for answering this question depending on the complexity of the complication. For the forward-moving sequence comprising the narrative spine, the substrategy might be: What happened first? What happened second? …; for satellite descriptions, the substrategy might include questions like What was it like then? or Why did that happen? (see also von Stutterheim & Klein 1989; van Kuppevelt 1995; Onea 2016; Velleman & Beaver 2016; Kamp 2017; Riester 2019). If the evaluation is internal, this might be all that the strategy of inquiry for a narrative contains. But if the evaluation is external, there will need to be additional questions, possibly organized in substrategies of their own, explicitly relating the events described to the highest-level qud.
Under this view, the problem with the defective narrative in (7) is that the qud at the root of the entire strategy of inquiry cannot be inferred based solely on the information that is provided. It is clearly possible to understand how the two sentences are related to one another in order to answer the question What is the way things are? But without saying more, it is simply not possible to understand what higher-level qud this is directed toward answering. For the nsi to be explanatory, actual narrative strategies of inquiry have to be more restrictive than this schematic one. It should be pointed out that there is, in general, no problem with two-sentence narratives,Footnote 2 as the invited six-word science fiction stories in (10)–(11) from Wired magazine demonstrate.Footnote 3
(10)
Corpse parts missing. Doctor buys yacht. (Margaret Atwood)
(11)
Easy. Just touch the match to (Ursula K. LeGuin)
Based on our knowledge about who the authors are and the context in which these stories are presented, we can infer the quds these narratives are dedicated to answering. Klauk et al. (2016) suggest that, for (10), this is Who did it?, the conventional goal of a whodunit detective story. The inference involved here is clearly complex, and Klauk et al. observe that we probably cannot even arrive at this conclusion until after reading both sentences in the narrative.
The short narrative by Ursula K. LeGuin illustrates a different point about what is, and is not, required in a narrative strategy of inquiry. The events described need not reach any sort of intuitive finality, what in literary studies is called narrative closure. In (11), events are, in fact, described only incompletely for humorous effect. Carroll (2007: 4) treats narrative closure informally as a sensation that arises “when all of the questions that have been saliently posed by the narrative get answered.” Klauk et al. make clear that the questions that must be answered for narrative closure to arise are only those that “have the plot … as an object” (p. 45). If we take these, roughly, to resolve the question What is the way things are (in the story world)? in the nsi, it is clear then that this strategy does not require that a narrative provide a “complete” description of events in any sense. What the nsi does require, however, is what Klauk et al. refer to as tellability closure, the sense that the narrative has a point. They refer to Labov and Waletzky’s observation that oral narratives always have an evaluation. This requirement is encoded in the nsi, since a strategy of inquiry’s aim, in Roberts’s sense, is to answer a given qud. So, while narrative closure may not be required, depending on what questions are in the strategy of inquiry, the presence of an evaluation, which gives rise to tellability closure, is necessary for a narrative to be complete.
The nsi is, by design, somewhat schematic. It is silent about the relationship between the question that is answered in the complication and the higher-level qud the entire strategy is dedicated to. This freedom is needed to capture the wide variety of functions that narratives serve. A speaker may describe some sequence of events to convey something about who they are, as in a personal anecdote. Or, a narrative may be used to convey a prescription that the hearer-reader should follow, as in Aesop’s fables. In origin myths, the narrative serves to explain why the world is the way it is within a given ideological or belief system. In the fictional written narratives in (10)–(11), their goal is circumscribed by the relatively narrow conventions of specific literary genres (a whodunit or thriller). Given the wide-ranging goals of narratives, it seems only appropriate that certain aspects of the nsi are filled in by more specific conventions.
At the same time, there are some necessary characteristics of narratives, which have not been included in the initial version of the nsi in (9). These come from looking at joint oral narratives, which exhibit a particularly interesting combination of properties: they are narrated by more than one speaker, whose individual contributions are easily distinguishable. While joint oral narratives are not, as a genre, attended to much by linguists, they are widely studied in research on human psychological processes, including language development, belief formation, episodic recall, collective memory, well-being, and social identity (see, e.g. Edwards & Middleton 1986; Hirst et al. 1997; Holmberg et al. 2004; Kellas 2005; Ekeocha & Brennan 2008; Pinto et al. 2018). One persistent finding in this literature is that the collaborative nature of these enterprises produces a strong motivation for consensus about the story line. For instance, in Edwards and Middleton’s seminal work on the topic, eight acquaintances were asked to recall the plot and memorable episodes of the movie E.T. The resulting narrative was analyzed for a wide variety of linguistic markers of dialogue structure, metanarrative negotiation, and social function. Edwards and Middleton note that participants quickly established a routine: first, providing essentially chronological description, frequently in the historical present, and then after this plot outline, engaging in a more freewheeling, temporally inconsistent sharing of what they found memorable or significant about the film. In other words, participants first collaboratively constructed the complication of the story, interspersed with some evaluative commentary, and then engaged in (external) evaluation.
In the complication portion, the motivation for consensus was so strong that it even frequently carried over into negotiations over the evaluative commentary, which included ppts, so that there was a consensus perspective on those issues as well. In contrast, during the final evaluation, there was far less of this. Participants could share their own private opinions without any negotiation, agreeing to disagree.
Joint oral narratives, it turns out then, hew rather closely to a particular set of pragmatic conventions, stated in (4). The event descriptions in the complication must all be evaluated relative to a single shared perspective. By contrast, contributions in the evaluation are relative to the individual perspectives of speakers, which may coincide or diverge, as the case may be.
(4)
While this generalization is motivated by findings about joint oral narratives, it plausibly characterizes all narratives. Joint oral narratives simply provide a way of seeing the generalization in a way that is not possible with other kinds of narratives. They have multiple speakers who can have, in principle, divergent perspectives. In a monologic narrative, by contrast, where there is a sole speaker-author, there is only ever a single perspective to represent.
This means, then, that the npg should be incorporated into the nsi. We do this by relativizing different quds in the strategy of inquiry to different perspectives. The highest-level qud is evaluated relative to the utterance event, while the subquestion for the complication is evaluated relative to a salient perspective point that we represent, for now, as ρ.
(12)
Narrative Strategy of Inquiry (nsi; revised version): A narrative is the product of a strategy of inquiry to answer a qud relative to the utterance event, which contains at least the question What is the way things are (in the story world) relative to ρ?
This enforces a shared perspective for the event descriptions in the complication. But contributions directed toward resolving the highest-level qud will allow diverging points of view, as these will be evaluated relative to distinct utterance events, whose speakers and their perspectives may diverge.
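One way to picture the revised nsi is as a tree of questions, each tagged with the perspective against which its answers are evaluated. The Python sketch below is only an illustration: the class `Qud`, the constants `UTTERANCE` and `RHO`, and the assumption that the substrategy questions inherit ρ from the complication question are introduced here and are not part of the statement in (12).

```python
from dataclasses import dataclass, field
from typing import List

UTTERANCE = "utterance event"  # perspective of whoever is currently speaking
RHO = "rho"                    # the single shared perspective point of the complication


@dataclass
class Qud:
    """A question under discussion, tagged with its perspective (hypothetical type)."""
    question: str
    perspective: str
    subquestions: List["Qud"] = field(default_factory=list)


def narrative_strategy(root_question: str) -> Qud:
    """Build a schematic narrative strategy of inquiry for a given highest-level qud."""
    complication = Qud(
        "What is the way things are (in the story world)?",
        RHO,
        # Assumption: the narrative-spine substrategy inherits rho.
        [Qud("What happened first?", RHO), Qud("What happened second?", RHO)],
    )
    return Qud(root_question, UTTERANCE, [complication])


def perspectives(qud: Qud) -> List[str]:
    """List the perspective tags of a strategy, top-down and left to right."""
    tags = [qud.perspective]
    for sub in qud.subquestions:
        tags.extend(perspectives(sub))
    return tags


def allows_divergence(qud: Qud) -> bool:
    """Only contributions evaluated at utterance events may reflect diverging views."""
    return qud.perspective == UTTERANCE


nsi = narrative_strategy("Who did it?")
complication_qud = nsi.subquestions[0]
```

On this encoding, `allows_divergence(nsi)` holds for the root qud but fails for `complication_qud` and everything below it, which mirrors the division of labor between the evaluation and the complication.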
6.2.3 Tense in Narratives
The contours of a solution to our puzzle should now be emerging. Disagreement with the historical present is not faultless in (3) because the events in the complication of a narrative are described from a unitary perspective. This attributes the absence of faultlessness, in other words, to the perspectival properties of narratives. Of course, we still need an understanding of how tense and ppts are sensitive to this particular kind of perspective-taking, and the remainder of this chapter will establish just this. Building on recent developments in the formal semantic and philosophical literatures, we will provide a semantics for tense and ppts, which makes them both sensitive to the perspective point invoked by the complication in a narrative, represented simply as ρ above.
For ppts, it is more clear what direction this line of inquiry will take, given their more transparent perspectival sensitivity. For tense, this is perhaps somewhat less obvious. In contemporary theories of tense, which build on the work of Reichenbach (1947) and Klein (1994), it is commonplace for this grammatical category also to encode a type of temporal perspective. Any tense must minimally locate the reference time relative to a time coordinate that can, at least sometimes, be identified with the “now” of an utterance. Fairly standard denotations are given in (13) for present and past tense (cf. Kratzer 1998: 101).
(13)
a. 
b. 
The present tense locates an eventuality at the temporal perspective point, while the past tense locates an eventuality before it. Under attitude predicates, this time coordinate is the “now” of an attitude holder (Abusch 1997).
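Because these meanings are stated relative to a time coordinate rather than to the utterance time as such, their behavior can be illustrated with a small Python sketch. It is not the formal entry in (13): the interval representation, the decision to read "at" as inclusion of the coordinate in the reference time, and the names `Interval`, `present`, and `past` are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """A closed time interval standing in for a reference time (hypothetical type)."""
    start: float
    end: float


def present(ref: Interval, coordinate: float) -> bool:
    # Assumption: "at" means that the reference time includes the coordinate.
    return ref.start <= coordinate <= ref.end


def past(ref: Interval, coordinate: float) -> bool:
    # The reference time wholly precedes the coordinate.
    return ref.end < coordinate


utterance_now = 10.0
holder_now = 2.5          # the "now" of an attitude holder, earlier than the utterance
ref = Interval(2.0, 3.0)  # a reference time that lies before the utterance

root_present = present(ref, utterance_now)   # coordinate identified with the utterance
root_past = past(ref, utterance_now)
embedded_present = present(ref, holder_now)  # coordinate identified with the holder's now
```

With the coordinate set to the utterance time, only the past tense can describe `ref`; once the coordinate is the attitude holder's "now", the present tense can describe the same interval.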
As Bary (this volume) discusses, the relatively simple semantics in (13) confronts a problem with the historical present, which in root clauses describes eventualities that are not located at the time of utterance. In one line of thinking, this variability can be traced to the temporal perspective that is part of the meaning of tense (Schlenker 2004; Eckardt 2012; Anand & Toosarvandani 2017, 2018). Rather than locating the reference time relative to the “now” of the actual utterance, tense locates it with respect to a temporal coordinate that can be located at the utterance event or float free. Under this view, the historical present arises when this coordinate is dissociated from the utterance time, thereby allowing for the description of nonpresent eventualities. It is this temporal perspective point that we will propose is associated with ρ in the nsi in (12).
Some initial evidence in support of this possibility comes from the distribution of the historical present in oral narratives. As Schiffrin (1981) shows, following earlier observations by Wolfson (1979: 171–172), the historical present is essentially found only in complications. In a corpus of 73 oral narratives, she finds no occurrences of the historical present in external evaluations or codas, with only a few instances in orientations (3 percent of verbs). The historical present appears almost entirely in complications (on 30 percent of verbs, or 381 out of 1288). In the narrative in (8), too, it appears only in the complication. This distributional restriction has a plausible source in the perspectival properties of narratives. If the present tense can only describe past events when the temporal perspective point at which it locates events is divorced from the utterance event, and if this temporal coordinate is related, in some fashion, to the unitary perspective point present in the complication, then we might expect the historical present to only show up inside complications.
This is admittedly somewhat suggestive so far. We will be returning to the semantics for tense in Section 6.4, advancing a formal proposal based on our own earlier work, that incorporates an additional time parameter. This will serve, as we will see, to explicitly connect the temporal perspective invoked by the historical present to the appraisal inherent to ppts. But before we do this, we first need a better understanding of these latter expressions.
6.3 Point of View in Predicates of Personal Taste
The past decade and a half has seen a renewed attention, in both formal semantics and the philosophy of language, to subjective expressions in natural language. There has been a particular focus on predicates of personal taste (ppts) (Kölbel 2003; Lasersohn 2005): expressions like tasty or beautiful which, intuitively, describe objects in terms of characteristics that vary from individual to individual. What is tasty or beautiful to one person need not be the same for others, and there are many cases on which there is likely no consensus.
There are three interconnected puzzles that ppts pose for conventional truth-conditional semantics. First, if the standards for taste and beauty are perspectival, the foundational question is how that perspective is represented. Second, whatever that representation of perspective is, it must be flexible enough to allow people not simply to assert perspectivally-situated claims, as A does in (14), but to also disagree with such claims, as B does.
(14)
[A and B are tasting a bottle of cider at an apple orchard.]
A: This cider is delicious!
B: No, it’s not delicious.
Intuitively, in the heteroperspectival dialogue in (14), A and B are making claims about the cider relative to their own perspectival standards. So it is not clear why this should be construed as a coherent disagreement. Compare this to a parallel dialogue using the expression local, which is also, intuitively speaking, perspectival, though relative not to a standard of taste but to a locative origo.
(15)
A: [in Los Angeles] This cider is from a local farm.
B: [in New York] No, it’s {not from a local farm, from the east coast}.
In contrast to (14), (15) is coherent only if A and B are referencing the same origo. If, for example, they reference their different coasts, the polarity particle no is not licensed. Given this contrast, ppts must have some property beyond general perspectival-dependence, which interacts with the pragmatics of dialogue to allow for heteroperspectival disagreements.
This point brings us to the third puzzle, the one of central concern to this chapter. The dialogues in (14) and (15) vary, not only in whether they allow heteroperspectival disagreement, but also in the objectivity of the disagreement. In the case of (15), there does seem to be a fact of the matter that is in dispute: one of the two parties is mistaken. In the case of (14), by contrast, many people report that it allows for instances where there is no mistake: both parties can be equally correct in their claims. It is this state of the discourse that Kölbel (2003) terms faultless disagreement, which he describes as follows:
(16)
The problem, then, is how A and B can believe what seem to be contradictories without one being somehow in error.
These three questions – how perspective is represented for ppts, how heteroperspectival disagreements are possible with ppts, and how heteroperspectival disagreements can be understood as faultless – have led to a rich theoretical landscape (see MacFarlane 2014; Lasersohn 2017 for detailed discussions). For our purposes, it is useful to consider three approaches: contextualism, utterance-sensitive relativism, and bicontextualism.Footnote 4 On all three accounts, ppts are, at least at some conceptual level, dyadic predicates holding of an object and some perspectival component. Suggestive evidence for this position comes from the fact that, in addition to their “bare” uses, many ppts allow overt experiencer phrases such as to me or for her, which make the perspective explicit (Lasersohn 2005; Stephenson 2007; Bylinina 2017).
6.3.1 Contextualist Approaches
In contextualist approaches, the perspectival component is typically treated as a variable in logical form, akin to a pronoun.Footnote 5 In typical usage, this pronominal is identified with the speaker, so that the ppt is interpreted as an assertion from the speaker’s perspective, what Lasersohn (2005) calls an autocentric use.
Under this account, the logical form of a sentence on an autocentric use varies with the utterer, as does the content of the sentence. To illustrate, the logical forms for A and B’s assertions in (14) can be schematized as follows:
(17)
a. pres the cider be delicious x2
b. pres the cider not be delicious x9
Pronunciation notwithstanding, these two propositions are logically independent. A’s assertion is roughly equivalent to The cider is delicious to A (if x2 refers to A), and B’s assertion to The cider is not delicious to B (if x9 refers to B). This makes it possible for them to be simultaneously true, and hence for no fault or mistake to arise on the part of either interlocutor.
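The independence claim can be checked mechanically in a toy model. The Python sketch below is an illustration under simplifying assumptions: a world is reduced to a record of who finds the cider delicious, tense is ignored, and the names `delicious_to` and `negate` are hypothetical helpers rather than part of any contextualist proposal.

```python
from itertools import product
from typing import Callable, Dict

# Toy assumption: a world only records, per individual, whether the cider is delicious to them.
World = Dict[str, bool]
Proposition = Callable[[World], bool]


def delicious_to(experiencer: str) -> Proposition:
    """Content of 'the cider be delicious x' once x is resolved to an experiencer."""
    return lambda world: world[experiencer]


def negate(prop: Proposition) -> Proposition:
    return lambda world: not prop(world)


a_content = delicious_to("A")          # (17a), with x2 referring to A
b_content = negate(delicious_to("B"))  # (17b), with x9 referring to B

# Enumerate every way the toy facts could turn out.
worlds = [{"A": a, "B": b} for a, b in product([True, False], repeat=2)]

# Two propositions are logically independent if all four truth-value pairings occur.
pairings = {(a_content(w), b_content(w)) for w in worlds}
independent = len(pairings) == 4

# In particular, some world makes both assertions true at once.
jointly_true = any(a_content(w) and b_content(w) for w in worlds)
```

Because `independent` comes out true, neither assertion entails or excludes the other, which is the source of both the faultlessness and the missing sense of disagreement on this account.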
However, as Kölbel (2003) notes, contextualism achieves this result without explaining why the dialogue in (14) feels like a disagreement, or why ellipsis and polarity particles (i.e. expressions like yes and no) are possible in heteroperspectival disagreements with ppts but not with perspectival expressions like local. More pointedly, as Lasersohn (2005) notes, heteroperspectival disagreements seem markedly worse with overt experiencers.
(18)
A: The cider is delicious to me. B: #No, it’s not delicious to me.
That overt experiencers do not pattern with the implicit perspective of delicious is a deep problem for contextualist accounts, since they would naturally receive the same treatment as implicit perspectives.
Thus, while simple contextualism avoids fault in disagreements with a ppt, it leaves unclear how there is even a disagreement in the first place. One response is group contextualism (DeRose 1991; Anand 2009; Moltmann 2012; Pearson 2013). It posits that the implicit perspective in these cases belongs to a group containing both A and B (and perhaps others), as illustrated in (19).
(19)
A: The cider is delicious to {us, people like us}. B: No, it’s not delicious to {us, people like us}.
This dialogue is coherent and is, moreover, construed as a disagreement. However, it accomplishes those goals at the cost of giving up the explanation for faultlessness, since now the contents of A and B’s assertions are the same.
The fundamental challenge for contextualist accounts, then, is that the contents of utterances with ppts contain the perspective point. In reaction to this, a large family of approaches has sought to remove the perspective point from propositional content. In this way, the content of two claims might be directly related (as logical opposites) without giving up faultlessness.
6.3.2 Relativist Approaches
For relativists like Kölbel (2003), Lasersohn (2005), and MacFarlane (2014), propositional content is revised to directly include a notion of perspective, to which some expressions are sensitive. Propositions under this view correspond not to world-time pairs, but to judge-world-time triples. Expressions like ppts are sensitive to both the judge and world-time coordinates; nonsubjective expressions, like Californian, are sensitive only to the world-time coordinates.
(20)
a. 
b. 
Thus, as desired, A’s and B’s assertions in (14) are contradictories (i.e. if one is true at index i the other must be false at i): one predicates that the cider is delicious to the judge of the evaluation index and the other that it is not.
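The two properties at stake here, contradictoriness at a single index and joint truth across distinct judges, can be illustrated with a small sketch. It assumes a toy model in which indices are judge-world-time triples and propositions are functions from indices to truth values. The names `Index`, `TASTES`, and `CALIFORNIAN`, and the taste facts themselves, are hypothetical and are not part of any of the cited proposals.

```python
from dataclasses import dataclass
from itertools import product
from typing import Callable


@dataclass(frozen=True)
class Index:
    """A judge-world-time triple (toy stand-in for an evaluation index)."""
    judge: str
    world: str
    time: int


Proposition = Callable[[Index], bool]

# Hypothetical facts, assumed only for illustration.
TASTES = {"A": True, "B": False}   # is the cider delicious to this judge?
CALIFORNIAN = {"w0": True}          # is the cider Californian in this world?


def delicious(i: Index) -> bool:
    # Judge-sensitive: the value depends on the judge coordinate
    # (world and time are held fixed in this toy model).
    return TASTES[i.judge]


def californian(i: Index) -> bool:
    # Judge-insensitive: only the world coordinate matters here.
    return CALIFORNIAN[i.world]


def neg(p: Proposition) -> Proposition:
    return lambda i: not p(i)


def contradictories(p: Proposition, q: Proposition, indices) -> bool:
    # At every index, exactly one of the two propositions is true.
    return all(p(i) != q(i) for i in indices)


INDICES = [Index(j, w, t) for j, w, t in product(["A", "B"], ["w0"], [0])]

# Same content with opposite polarity: contradictory at each index ...
same_index_clash = contradictories(delicious, neg(delicious), INDICES)
# ... yet each assertion can be true relative to its own judge.
both_true_across_judges = (
    delicious(Index("A", "w0", 0)) and neg(delicious)(Index("B", "w0", 0))
)
```

In this sketch the disagreement is located in the content (the two propositions never agree at any one index), while faultlessness is located in the choice of judge used for evaluation.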
In an intensional logic, the truth of an assertion in context is determined by evaluating the propositional content of the assertion relative to a contextually supplied world, typically the world in which the assertion was made. Lasersohn and MacFarlane both propose that assertions are likewise evaluated relative to a contextually supplied judge. But they differ in what sort of context supplies that judge, and what contextual flexibility exists. For Lasersohn, the context of utterance determines the judge, just as it determines the world of evaluation.
(21)
Truth in a Context: 
Under his approach, the context of utterance crucially negotiates how the truth of judge-dependent material is calculated. In MacFarlane’s subtly different view, that task is taken up not by the context of utterance, but the context of assessment, a distinct context whose role is to fix parameters of appraisal and evaluation. For him, then, truth is defined not at a context, but at a bicontext.
MacFarlane’s goal is to capture a range of behaviors linked to individuals standing in a state of disagreement. To understand his concern, consider how relativist treatments of ppts handle the coherence of heteroperspectival disagreements. We have seen that because judges enter propositional content, it is possible to say that the contents of A’s and B’s assertions are contradictories. But the same could be said for a temporally-variant proposition. If A says It is noon, and then hours later B says It isn’t noon, there is no sense of disagreement. What explains this contrast between judges and times? Without a satisfying answer to this question, it is not clear that relativist treatments improve much beyond contextualist ones in deriving a sense of disagreement. For MacFarlane, the answer comes from the bicontextual pragmatics of truth: since the context of assessment supplies the judge, judge-sensitive propositions will differ from those that are purely time-sensitive. Thus, only the former show an ability to consider the truth of an assertion relative to a judge different from the one supplied by the context of utterance. [Footnote 6]
We will ultimately build our account in terms of a bicontextual semantics, though MacFarlane’s particular philosophical commitments lead to a view of the context of assessment that is not empirically borne out. As a result, we will end up arguing for a bicontextual semantics with a bit more expressive freedom, which we will exploit in building an account of faultless disagreement and our core contrast in (3). We can start by scrutinizing how these two flavors of relativism handle cases where the judge is, intuitively, not the speaker.
6.3.3 Relativism and Exocentric Readings
While the contextual world of evaluation is not typically very flexible, Lasersohn argues that the contextual judge has considerable freedom. Beyond autocentric uses, it also has exocentric uses, as in questions posed to the addressee (23) or in discussions of some relevant protagonist (24).
(23)
A: [asking B about a book B is reading] Is the book good?
(24)
Mary: How did Bill like the rides? John: Well, the merry-go-round was fun, but the water slide was a little too scary. (Lasersohn 2005: 672)
In contrast, MacFarlane assumes that the assessment context is quite rigid, providing only the assessment standard of the assessor at the time of assessment. For exocentric uses, he follows Stephenson (2007), who proposes that, while autocentric uses are relative to the contextually supplied judge, exocentric uses are derived via variables in the logical form, as under contextualist accounts. As evidence for this hybrid system, Stephenson observes that exocentric readings do not readily lead to coherent heteroperspectival disagreements.
(25)
Sam: The tuna is tasty. Sue: (#)No, it isn’t! It’s not tasty at all! (Stephenson 2007: 521)
Stephenson notes that if Sam intends exocentrically to reference a salient cat’s judgment of the tuna, and if Sue (knowing this) then brings in her perspective, Sue’s statement is incoherent. Under the theory that exocentric readings require variables that lead to judge-invariant propositional content, such mismatches are predicted, while under the one where exocentric readings arise from the context of utterance, they are not.
But a dialogue like (25) can be felicitous depending on the individuals that are referenced by the interlocutors. Consider the scenario in (26), where two parents are discussing how a certain child enjoyed their birthday party. It seems much more acceptable here for another child to offer their own opinion.
(26)
Parent A: How was the cake at the party? Parent B: It was delicious. Child: No, it wasn’t! It was disgusting.
This suggests that what is going on in exocentric–autocentric mismatches is not as clear-cut as Stephenson suggests, and that the infelicity of (25) is not a matter of mismatching logical forms, but rather of overall discourse coherence. [Footnote 7]
In addition, based on tests furnished by MacFarlane, as well as Anand and Korotkova (2018), exocentric readings of bare ppts can be shown to be distinct from those with overt experiencers. MacFarlane notes that ppts with overt experiencers evaluate the predicate relative to a standard determined by the overt experiencer’s standards of taste in the index of evaluation, while bare ppts do not. One vivid illustration comes from a contrast he observes in counterfactual conditionals. In (27a), the counterfactual state of affairs involves some change in the structure of horse manure that would make it tasty relative to the assessor’s real-world standards of taste. In contrast, (27b) admits a state of affairs where the speaker’s standards of taste are different from their real-world standards.
(27)
a. If horse manure were tasty, I would never go hungry. b. If horse manure were tasty to me, I would never go hungry. (after MacFarlane 2014)
Similarly, Anand and Korotkova show that overt experiencers change the signature of the acquaintance inference that ppts impose. Bare ppts in typical autocentric assertive contexts give rise to the inference that the speaker has some direct evidence for their judgment (Stephenson 2007; Pearson 2013). This inference disappears under operators like epistemic maybe (Ninan 2014).
(28)
a. #The cake was delicious, but I never tasted it. b. The cake maybe was delicious, but I never tasted it. (Anand & Korotkova 2018: 56)
In sharp contrast, ppts with overt experiencers do not lose the acquaintance inference, behaving exactly analogously to other predicates with experiencer arguments, including psych-predicates, such as like.
(29)
a. #The cake maybe was delicious to me, but I never tasted it. b. #I maybe liked the cake, but I never tasted it. (Anand & Korotkova 2018: 56)
These facts do not depend on autocentric judgment: exocentric judgments also require acquaintance and show the same signature of obviation.
(30)
a. #Hobbes’s new food was tasty, but he never ever tried it. b. Hobbes’s new food maybe was tasty, but he never tried it. c. #Hobbes’s new food maybe was tasty to him, but he never tried it. (after Anand & Korotkova 2018: 63)
Returning now to the counterfactual examples in (27), we see the same pattern. Consider a situation where two parents are discussing their child’s picky eating habits. An overt experiencer, as in (31b), allows the parents to consider a state of affairs where the child’s eating habits are different from in the real world.
(31)
a. If our dinner had been tasty, he would have eaten it. b. If our dinner had been tasty to him, he would have eaten it.
Importantly, the bare ppt form in (31a) does not: it only allows consideration of a state of affairs where the subject of the ppt itself changes composition.
In sum, if exocentric readings involve variables, as Stephenson and MacFarlane suggest, bare ppts in counterfactuals and acquaintance-obviation environments should pattern with their overt experiencer counterparts when the ppt is interpreted relative to an exocentric perspective. This prediction does not seem to hold: both exocentric and autocentric perspectives show the same contrast with their corresponding overt experiencer forms.
6.3.4 Relativism and Faultless Disagreement
We take the facts above, about exocentric readings, as evidence for Lasersohn’s approach, where the context may set the judge to a perspective distinct from the speaker’s. This is a position, we should note, that is compatible both with utterance-sensitive relativism and bicontextualism. Importantly, if we adopt this view, faultless disagreement can be blocked with exocentric readings, but only if the exocentric perspectives that speakers are employing are the same: in such a case, it is impossible for a proposition and its negation to be true relative to the contexts of utterance/assessment.
Taking stock now, in surveying the literature on ppts, we have argued that judge contextualism is the most challenged approach and that bicontextual relativism is the least, while utterance-sensitive relativism needs to explain the contrast between ppt disagreements and temporally sensitive sentences like It is noon. At the same time, we have argued, based on contrasts between overt and covert experiencer data, that exocentric and autocentric readings should both be treated relativistically, that is, that the context of assessment should be free to choose judges other than the speaker.
But regardless of what one might conclude from disagreements and overt experiencers, when it comes to explaining the presence or absence of faultlessness, contextualist and relativist accounts are remarkably consonant in their explanation. Faultlessness comes from heteroperspectival evaluation (whatever its source), which allows intuitively contrary propositions to be simultaneously true because they are evaluated relative to distinct perspectives. And, in turn, the lack of faultlessness comes from homoperspectival evaluation (whatever its source), precisely because in such cases the contrary propositions cannot be simultaneously true (relative to the same perspective).
6.4 A Minimal Working Solution
We can now return to the puzzle in (3). It has two elements: on the one hand, the lack of faultlessness with the historical present and, on the other, the possibility of faultlessness with the simple past. We are ultimately committed to three theses to account for both of these:
(1) The complication in a narrative enforces a single perspective, while the evaluation admits diverse perspectives, i.e. the npg in (4).
(2) Faultlessness with ppts arises from heteroperspectival evaluation (independent of auto- vs. exo-centrism), while nonfaultlessness arises from homoperspectival evaluation.
(3) The historical present is only compatible with homoperspectival evaluation, while the simple past is more flexible.
The first two we have already addressed. Only the third remains. Why should the historical present have such a restriction? And why should it differ from the simple past in this regard? Ultimately, we believe the answers to both questions have their roots in the semantics of tense, as it interacts with the structure of a narrative. Thus, we aim to reduce the lack of faultlessness with the historical present to the fact that it is only employed in the complication of a narrative, and the possibility of faultlessness with the simple past to its availability in all parts of a narrative. We have already seen, in Section 6.2.3, that these tenses are indeed distributed in this way. But how should this be expressed formally? To answer this question, we turn to a more extensive formal analysis of tense.
6.4.1 A Bicontextual Semantics for Tense
We introduced a standard semantics for tense in (13) above and saw how it runs into problems with the historical present. If the present tense is sensitive to the time of the context and if this context encodes aspects of the utterance event, then it is hard to understand how this tense form could ever describe a past event. At the same time, there is no evidence for a distinct historical present morpheme. The historical present is just one use of a tense form that is also used for other purposes, including the canonical (utterance-time indexical) present and the so-called play-by-play (or broadcaster) present.
In Anand and Toosarvandani (2017), we argue these three uses can be unified, building on Sharvit’s (2004, 2008) bicontextual semantics for free indirect discourse, as long as: (i) tense is sensitive to a time in the context of assessment, as in (32), and (ii) this time of assessment can be set relatively freely. Pronominal indexicals, e.g. I, are sensitive instead to the utterance context.
(32)
a. 
b. 
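As a rough sketch of the two conditions just stated, the entries might be written along the following lines. The notation is assumed for illustration only: u is the utterance context, a the assessment context, g an assignment, i the index, and n the numerical index on the tense morpheme. The containment condition for the present and the precedence condition for the past follow the prose descriptions in this section rather than any formula given by the authors.

```latex
% Sketch only, under assumed notation (requires stmaryrd for \llbracket):
% u = utterance context, a = assessment context, g = assignment,
% i = index, n = the numerical index on the tense morpheme.
\begin{align*}
\llbracket \textsc{pres}_n \rrbracket^{u,a,g,i} &= g(n),
  \text{ defined only if } g(n) \subseteq \mathrm{time}(a) \\
\llbracket \textsc{past}_n \rrbracket^{u,a,g,i} &= g(n),
  \text{ defined only if } g(n) < \mathrm{time}(a) \\
\llbracket \text{I} \rrbracket^{u,a,g,i} &= \mathrm{speaker}(u)
\end{align*}
```

The point of the contrast is that the tenses look to the assessment context a, while the pronoun looks to the utterance context u.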
Because tenses are sensitive to the assessment context, we cannot maintain MacFarlane’s Truth in a Bicontext (22), which sets the time coordinate of the index based on the utterance context. We need a more general notion, one which explicitly evaluates an expression relative to the assessment context:
(33)
Truth in a Bicontext (revised): 

While Sharvit takes the two contexts to be identical at the root level, we propose, following Schlenker (2004), that the time of assessment is set pragmatically in root contexts (see Bary, this volume for discussion):
(34)
a. 
b. 
c. 
When the time of assessment is the time of utterance, the canonical present results. When it is anterior to the actual speech time, the historical present results. And when it abuts the actual speech time, the play-by-play results.
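The three-way mapping just described can be sketched as a small classifier over time intervals. This is only an illustration under stated assumptions: the `Interval` type, the function name `classify_present`, and the reading of "abuts" as "ends exactly where the utterance time begins" are all hypothetical choices, not part of the proposal itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """A closed time interval [start, end] on a toy numeric timeline."""
    start: int
    end: int


def classify_present(assessment: Interval, utterance: Interval) -> str:
    """Name the use of the present tense given the two times.

    Assumption: 'abuts' means the assessment interval ends exactly
    where the utterance interval begins.
    """
    if assessment == utterance:
        return "canonical present"
    if assessment.end < utterance.start:
        return "historical present"
    if assessment.end == utterance.start:
        return "play-by-play present"
    return "unclassified"


# Usage with hypothetical times: the utterance occupies [100, 100].
now = Interval(100, 100)
story = Interval(10, 40)       # wholly before the utterance
just_before = Interval(95, 100)  # runs right up to the utterance
```

Here `classify_present(story, now)` returns the historical present label and `classify_present(just_before, now)` the play-by-play label, while identical intervals yield the canonical present.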
With bicontextualism, in short, we can retain an indexical theory of the present tense in English, treating its various uses as arising from the mapping between the utterance time and the time that tense is indexical to. One component of this analysis is that the width of the time of assessment is also contextually determined. For the canonical present, the width is infinitesimal, small enough that only stative eventualities can occur. But for noncanonical uses, the interval is set freely, and it is for this reason that both the historical and play-by-play present allow episodic events while the canonical present does not. We suggest that for the historical present, in particular, the interval can be set wide enough to accommodate the entire story. What this means concretely is that sentences in historical present discourses require the same temporal perspective: they are evaluated relative to the same time of assessment.
This suggests, given what we observed about judges above, that the nsi in (12) can be rewritten more precisely. The quds in the complication of a narrative are all evaluated relative to a single context of assessment, while the qud that gives rise to the evaluation is evaluated relative to the utterance context.
(35)
For a qud evaluated relative to a given context, the only relevant answers will be ones that describe eventualities relative to that same context, assuming a sufficiently fine-grained conception of relevance. Thus, all assertions in the complication will be evaluated relative to a single assessment context.
We can see how the semantics of tense interacts with the nsi by looking at a simplified version of the joint oral narrative in (3).
(36)
A: We arrive in Paris. (i)
A: We take a bus to the Normandy coast. (ii)
A: We visit an apple orchard. (iii)
B: They have cider. (iv)
B: It’s delicious. (v)
A: It isn’t delicious. (vi-a)
It wasn’t delicious. (vi-b)
The sentences in (iii) and (iv) have the approximate logical forms in (37a) and (37b), respectively.
(37)
a. pres3 pfv we8 visit an apple orchard b. pres4 pfv they9 have cider
Each sentence is evaluated relative to an utterance context, which is updated throughout the narrative. But it is also evaluated relative to an assessment context, which we have posited does not change across the complication of a narrative. Thus, these two sentences have the following truth conditions:
(38)
a. 

;
b. 
;
Each sentence commits the speaker to the existence of a particular kind of eventuality, with the presuppositions of the indexical elements constraining these eventualities. In both (38a) and (38b), the reference time interval the present tense denotes is inside the assessment time. Perfective aspect further requires, again in both, that the eventuality lie within the reference time interval. Since a is constant across the complication of a narrative, according to the nsi, its time coordinate is as well. So, by the narrative architecture of complications, the present tense locates both the visiting and possessing eventualities within the same assessment interval, which by the pragmatic conventions for historical present precedes the times at which these sentences were uttered.
6.4.2 Adding ppts
We can now turn to the final sentence of the discourse in (36). We treat delicious as a predicate of events, as in (39), for compositional simplicity.
(39)


Sentence (v) accordingly has the logical form in (40a) and the resulting truth conditions in (40b).
(40)
a. pres5 pfv it10 be delicious b. 
;
The perspective for the ppt here is the judge of the index (i), which at the root level is determined by whichever assessment context is relevant for the complication of this narrative. Per Truth in a Bicontext (33), (40a) is evaluated against the sequence
, so that its semantics reduces to the following:
(41)

;
When A follows up with sentence (vi-a), disagreeing using the historical present by saying It isn’t delicious, only a nonfaultless disagreement is possible. To see why, consider the logical form and truth conditions for A’s assertion:
(42)
a. neg pres6 pfv it10 be delicious b. 



which, again given Truth in a Bicontext, yields the following semantics:
(43)




As with the other sentences in the historical present, A’s disagreement here will be added to the complication of the narrative. But then, as the nsi requires, the perspective must be the same as for B’s original assertion. There is, as a result, no way for faultlessness to arise.
6.4.3 Disagreements Using the Simple Past
This deals with half of the puzzle posed by the joint oral narrative in (36). But what happens when A disagrees using the simple past, as in sentence (vi-b)? It seems that, by saying It wasn’t delicious, A can disagree faultlessly.
The truth conditions for this sentence differ from those of its historical present alternative only in the presupposition triggered by tense:
(44)
a. neg past6 pfv it10 be delicious b. 



The past tense requires that the reference time precede the assessment time. Given Truth in a Bicontext, (44) produces the following semantics:
(45)




This allows for a more complex set of interpretative possibilities. One is that the past tense has its canonical use, equating the assessment and utterance times. Then, A’s disagreement cannot be construed as an addition to the complication, since it is not interpreted relative to the relevant assessment context. It does, however, allow A to make an assertion from an autocentric perspective. In this case, since the judges for B’s and A’s assertions are distinct, a heteroperspectival disagreement should result, and thus also a faultless disagreement. [Footnote 8]
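The reasoning so far can be summarized in a short sketch: a disagreement can be faultless only when the two assertions are evaluated relative to distinct judges, and the historical present ties the judge to the complication's assessment context. Everything named here (`Assertion`, `CONSENSUS`, `judge_of`, `can_be_faultless`, and the `autocentric_past` flag) is a hypothetical simplification for illustration, not a formalization offered in the text.

```python
from dataclasses import dataclass

# Hypothetical label for the judge of the complication's assessment context.
CONSENSUS = "consensus judge"


@dataclass(frozen=True)
class Assertion:
    speaker: str
    tense: str      # "hist_pres" or "past"
    polarity: bool  # True = "delicious", False = "not delicious"


def judge_of(assertion: Assertion, autocentric_past: bool = True) -> str:
    """Pick the judge relative to which the assertion is evaluated.

    Assumption: the historical present always uses the complication's
    judge; the simple past may use the speaker (its canonical use).
    """
    if assertion.tense == "hist_pres":
        return CONSENSUS
    return assertion.speaker if autocentric_past else CONSENSUS


def can_be_faultless(first: Assertion, second: Assertion,
                     autocentric_past: bool = True) -> bool:
    # Faultlessness needs opposite polarity and distinct judges.
    opposite = first.polarity != second.polarity
    distinct = (judge_of(first, autocentric_past)
                != judge_of(second, autocentric_past))
    return opposite and distinct


b_v = Assertion("B", "hist_pres", True)      # (v)    It's delicious.
a_via = Assertion("A", "hist_pres", False)   # (vi-a) It isn't delicious.
a_vib = Assertion("A", "past", False)        # (vi-b) It wasn't delicious.
```

On these assumptions, `can_be_faultless(b_v, a_via)` is false, since both assertions share the complication's judge, while `can_be_faultless(b_v, a_vib)` is true on the autocentric use of the simple past.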
Since A’s assertion is not part of the complication, it does not contribute to the consensus description of the story world. This seems intuitively correct. By using the simple past, A reveals her own perspective on the events described. What this contribution means dialogically is less clear, since it can signal a range of intents. The disagreement may be a proposal about the evaluation of the joint narrative; it might register a dissent to the collective appraisal; or, finally, it may be a comment outside the narrative entirely, simply stating the speaker’s opinion. At this point, it is not clear how these differ empirically.
One important question is how the identity of the consensus judge in a complication impacts this reasoning. Based on Edwards and Middleton (1986), it might seem reasonable to assume that this judge is a group containing the appropriate discussants in a conversation. However, this complicates our explanation for the faultlessness made available by using the simple past. Under our proposal, switching to the simple past requires a change in assessment context, which opens up the possibility of a change in judges. But a change may not be enough. If a ppt evaluated relative to a group judge is entailed to be true of its subgroups, when B says the cider is delicious to the group, it will be delicious to B and to A. But then a switch from B’s group judge to A’s autocentric judge is not enough, since B’s claim precludes the truth of A’s claim.
We can see two responses to this objection, the first a more nuanced view of what nonfaultlessness means in these dialogues, and the second a proposal that the common judge need not be the group, but rather a more abstract narrator.
Varieties of Nonfaultlessness
Let us first consider nuancing nonfaultless disagreements. [Footnote 9] The assertions we are considering, and their judges in this context, are given below:
(46)
B: The cider is delicious. 
(47)
A: The cider isn’t delicious. 
(48)
A: The cider wasn’t delicious. 
We have already seen that, relative to any particular bicontext, (46) is contrary to both (47) and (48), so the distinction we are making is not about truth-conditional relations in a bicontext. However, B’s goal in making the assertion in (46) is to make a claim about A and B’s common judgment, a fact represented by the plural judge. In typical information-seeking exchanges, where the aim is to contribute novel information, asserting that one’s interlocutors have a particular judgment runs afoul of the first-personal privilege judgments of taste typically have, and hence comes across as deeply coercive. But in joint oral narrative, making assertions about another author’s judgment may simply be reporting what is a common belief of both authors already. In this regard, the assertions in (46) and (47) both start from the common belief that they share a common judgment. The disagreement is about what the common judgment is, but not whether there is a common judgment.
In contrast, the autocentric use in (48) is limited to A’s judgments alone. In doing this, A makes no commitments as to a common judgment. In addition, because it does not obey the nsi, A’s assertion is made outside the goal of joint narrative. It thus stands apart in two ways from the assertion that prompted it, and may thus be seen as a metanarrative signal about issues with the joint narrative. Indeed, this is precisely our feeling of the import of the disagreement in (48). B has made a claim about the joint judgment of A and B, and A’s goal here is simply to react to the assumption that there is a joint judgment, saying simply that, as for A themselves, the cider is not delicious. In contrast, (47) goes further, claiming that the joint judgment is that the cider is not delicious. Thus, while both (48) and (47) lead to nonfaultless disagreements, their impacts on the development of the narrative are different. (47) will lead to a discussion about what the consensus position was, while (48) is an attempt to deny that there was a consensus to begin with, thus serving as a metanarrative comment about what can be part of the complication of the joint narrative.
Narrator Judges
Another potential option to be considered for the judge in joint oral narratives is an abstract narrator. In this way, we could perhaps preserve faultless disagreement in some sense, since the narrator and any particular speaker would not necessarily be connected in a way that could preclude the narrator and the speaker from having differing judgments.
Such an avenue is especially attractive when we consider storytelling where the goal is to construct a fictional narrative. In such cases, there is no compelling reason to claim that ppts report the judgments of the group. Moreover, the difference between historical present and canonical past disagreements dissolves and the canonical past seems to trigger the same kind of nonfaultless disagreement. Consider a version of our narrative in (36), cast as a fictional account:
(49)
A: Our story begins as a couple arrives in Paris. (i)
A: They take a bus to the Normandy coast. (ii)
A: They wander around, eventually stopping in an apple orchard. (iii)
B: They have cider. (iv)
B: It’s delicious. (v)
A: No, it isn’t delicious. (vi-a)
No, it wasn’t delicious. (vi-b)
Both of A’s possible responses, in (vi-a) or in (vi-b), now read as nonfaultless attempts to impose a different consensus view of the story. If this is the case, there must be a way for speakers to felicitously disagree about some storywide perspective, but where it is not possible to bring in one’s own perspective. If the storywide perspective is the plural individual for the speaking group, it is hard to see why that would be. But if we recognize the possibility of an abstract narrator perspective, then the point would be that in fictional accounts one can disagree about the narrator’s judgment, but talking about one’s own perspective on something one is not acquainted with will be problematic.
The central problem with this account is that there is no clear notion of what the narrator requires, aside from being a perspectival repository (though see Eckardt 2015, 2021). Is this an individual who exists in some particular world or is it something more abstract, like a standard of taste? And do we require an abstract narrator for all narratives, including nonfictional ones? Though these are important narratological questions, we have not been able to operationalize them in a way that allows them to be tested. We thus simply note that, while this option is open to us, advancing it more seriously would require some motivation for the ontological sophistication it may lead to.
6.5 Narratives in the Past
In the preceding, we outlined a solution to our puzzle, one that could handle both why the historical present cannot be used to disagree faultlessly and why the simple past can. But we have yet to address another aspect of the joint oral narrative in (36). While A can make an assertion relative to her own autocentric perspective by using the simple past, A could also convey appraisal relative to the consensus judge with this tense form. That is, she can do with the simple past what she does with the historical present, disagreeing nonfaultlessly.
This perspectival flexibility could simply be a matter of what the judge of the assessment context is. The anteriority encoded in the standard semantics for the past tense in (32) absolutely prohibits a ppt from holding at the time of the context whose judge it is evaluated relative to. So, for sentence (vi-b) in (36), one option would be to allow the judge to remain the consensus judge, even while the assessment time is fixed to the utterance time. (The assessment context would thus not be completely identical to the utterance context.) Under this view, the flexibility in how the simple past is used simply boils down to variation in what the judge of the assessment context can be.
However, we think a more principled account is possible, linking this perspectival flexibility to more general facts about past tense usage in narratives. Consider an alternative version of (36), conducted entirely in the simple past:
(50)
A: We arrived in Paris. (i)
A: We took a bus to the Normandy coast. (ii)
A: We visited an apple orchard. (iii)
B: They had cider. (iv)
B: It was delicious. (v)
A: No, it wasn’t delicious! (vi)
Setting aside the disagreement in (v–vi), it is not clear, given our assumptions so far, how sentences (i) through (iv) comprise a coherent narrative. The restrictive formulation of the nsi requires a single context of assessment for all assertions in the complication. But we have assumed that, in its canonical use, the past tense identifies the assessment context with the utterance context. Since the latter advances in time with each speech act, so will the former. Thus, it should be impossible for a sequence of past tense sentences to comprise a complication, since the assessment context is different for each of them.
This is, of course, simply not the case: while (50) differs from its historical present counterpart, it does not differ in its coherence. This means that one or more of our assumptions must be relaxed. The tension here is between the semantics for the past tense in (32), which translates a fairly standard denotation into a bicontextual framework, and the nsi. In principle, either hypothesis could be loosened or removed. We will try, however, to maintain the nsi in its present form in (35) and revise the semantics for past tense. This enables a common understanding of how the past tense can coherently be used in a narrative like (50) and why it is perspectivally flexible, unlike the present tense.
6.5.1 Sources of Anteriority
In revising the semantics for past tense, we might look to the semantics proffered in the literature for other kinds of past meanings. One case of this is the past perfect, which intuitively conveys two levels of anteriority: it invokes a salient time anterior to the utterance time – what Reichenbach (1947) calls the “reference point” – which the event is itself anterior to. It is tempting to view this as a consequence of two morphemes, the past, responsible for anteriority with respect to the utterance time, and the perfect, responsible for the other case of anteriority. Kamp and Reyle (1993: 483–689) argue that both relations should be encoded in the semantics of tense, since this behavior is independent of the aspectual properties of a sentence. They observe that a sequence of sentences in the past perfect also exhibits narrative progression.
(51)
Fred arrived at 10. He had gotten up at 5; he had taken a long shower, had got dressed, and had eaten a leisurely breakfast. He had left the house at 6:30. (Kamp & Reyle 1993: 594)
Kamp and Reyle introduce another perspectival point beyond the reference and utterance times, which is anchored to an event in the discourse: in (51), it is anchored to the arriving event described by the initial sentence.
Elsewhere (Anand & Toosarvandani 2017), we have argued that this perspective point can be assimilated to the assessment time, since an event described by the historical present can also serve as the anchor for the past perfect.
(52)
Rumors of Berlusconi’s crimes swirl. His advisors confront him. He scoffs. He had paid off the prostitute for her silence already. (Anand & Toosarvandani 2017: 29)
All told, this would suggest the following semantics for the past perfect within a bicontextual framework:
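(53)
[Formula not preserved in the source: the bicontextual denotation of the past perfect (p-past), which locates the reference time anterior to an assessment time that is itself anterior to the utterance time.]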
Intuitively, it could be possible to see the simple past as an instance of this. All-past narratives would be coherent, then, because they are described as past relative to an assessment time that is itself anterior to the utterance time. However, there is a real contrast in temporal perspective taking between (50) and (51). In the past perfect example, there is a sense that there is a temporal vantage point (10 p.m.) relative to which the other events are being viewed. In the simple past narrative, by contrast, that feeling is absent or at least not necessary.
An interesting constellation of properties has been described in this connection for past tense forms in German. Kratzer (1998: 105–106) observes that the German simple past (Präteritum) is, unlike its English counterpart, not felicitous out of the blue, while the German present perfect form (Perfekt) is. She proposes that the German simple past is strongly anaphoric to a salient temporal interval (thereby excluding it from out-of-the-blue uses), and requires that the event be contained inside this interval. She locates this sensitivity in the semantics of aspect (perfect vs. perfective aspect). However, as with the past perfect, this restriction may be better located in the semantics of tense. The simple past in German cannot be used to backshift relative to a salient past time (Dickey 2001: 88), a restriction it shares with the simple past in French and Dutch (Molendijk & de Swart 1999: 90–91).
(54)
a. ?? Max fiel. John schubste ihn. ‘Max fell. John pushed him.’ (Dickey 2001: 88)
b. #Jean mourut. Max l’assassina. ‘Jean died. Max assassinated him.’ (Molendijk & de Swart 1999: 90)
c. ?? Jane verliet me. Ze werd verliefd op een ander. ‘Jane left me. She fell in love with someone else.’ (Dickey 2001: 87)
This is not an idiosyncratic property of “narrative” past tense forms. The historical present also prohibits backshifting (Anand & Toosarvandani 2018): e.g. #John dies. Max assassinates him. This parallelism between the historical present and the simple past in these languages plausibly has its source in a shared sensitivity to the same time parameter.
Let us suppose, then, that in a bicontextual framework the simple past in German (as well as in Dutch and French) locates the reference time in the assessment time, which is itself located anterior to the utterance time. It realizes, in other words, a past tense morpheme that we can call the r(emote)-past. Its semantics would differ from that for p-past in (53) solely in the relation between reference time and assessment time.
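(55)
[Formula not preserved in the source: the denotation of r-past, which locates the reference time inside an assessment time that is itself anterior to the utterance time.]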
This past tense morpheme is a bicontextual cousin of the present, which also locates the reference time inside the assessment time. As the assessment time is not the utterance time, it must be a salient past time, which means the r-past must be temporally anchored. At the same time, since the reference and assessment times are related by inclusion, we do not have the requirement for a salient “intermediate” past that we had for the past perfect. In sum, r-past serves as an excellent candidate for the German simple past and similar “narrative” past tenses like the Dutch and French simple past. Next, we argue that it is also part of the meaning of the simple past in English.
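The contrast between the two morphemes can be made concrete with a small computational sketch. The Python below is only an illustration under stated assumptions: times are modeled as closed numeric intervals, anteriority as strict precedence, and the names `Interval`, `p_past`, and `r_past` are hypothetical rather than part of the chapter's formalism. The clock values are loosely modeled on (51), with an invented utterance time.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """A closed time interval [start, end] on an arbitrary numeric timeline (assumption)."""
    start: float
    end: float

    def precedes(self, other: "Interval") -> bool:
        # Anteriority: this interval ends strictly before the other begins.
        return self.end < other.start

    def within(self, other: "Interval") -> bool:
        # Inclusion: this interval lies inside the other.
        return other.start <= self.start and self.end <= other.end


def p_past(ref: Interval, assess: Interval, utter: Interval) -> bool:
    """Past perfect: reference time < assessment time < utterance time."""
    return ref.precedes(assess) and assess.precedes(utter)


def r_past(ref: Interval, assess: Interval, utter: Interval) -> bool:
    """r-past: reference time inside assessment time, which is < utterance time."""
    return ref.within(assess) and assess.precedes(utter)


# Hypothetical values: hours on a single day, utterance at hour 20.
utterance = Interval(20, 20)
arrival = Interval(10, 10)   # anchor event, as in "Fred arrived at 10"
got_up = Interval(5, 5)      # "He had gotten up at 5"
morning = Interval(4, 11)    # a salient interval containing the events

print(p_past(got_up, arrival, utterance))
print(r_past(got_up, arrival, utterance))
print(r_past(got_up, morning, utterance))
```

On these assumptions, the getting-up event satisfies `p_past` relative to the arrival, because it is wholly anterior to that vantage point. It satisfies `r_past` only when the assessment time is an interval that contains it, such as `morning`, which is why no separate "intermediate" past time is needed in that case.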
6.5.2 A Revised Semantics for Past Tense
If we took the English simple past simply to encode r-past, like its German counterpart, then we would have a straightforward explanation for why an all-past narrative, like (50), is coherent according to the nsi. The assessment time can be set to a salient interval containing all the eventualities described in the complication, precisely as we have argued for narratives in the historical present. In addition, we can account for why the simple past allows for the option of homoperspectival evaluation, relative to the consensus judge. With the r-past, the simple past can describe a past eventuality without having to shift from the assessment context of the complication. A ppt could then be evaluated relative to the judge parameter of this context.
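The reasoning here, together with the problem it answers, can be sketched in a few lines of Python. This is a toy model under loud assumptions: the nsi is reduced to the single requirement that all assertions in a complication share one assessment time, times are numeric intervals, and every name (`Interval`, `single_assessment_context`, `canonical_past_assessments`, `r_past_assessments`) is hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Interval:
    """A closed time interval [start, end] on an arbitrary numeric timeline (assumption)."""
    start: float
    end: float

    def precedes(self, other: "Interval") -> bool:
        return self.end < other.start

    def within(self, other: "Interval") -> bool:
        return other.start <= self.start and self.end <= other.end


def single_assessment_context(assessments: List[Interval]) -> bool:
    """Stand-in for the restrictive nsi: one assessment time for the whole complication."""
    return len(set(assessments)) == 1


def canonical_past_assessments(utterances: List[Interval]) -> List[Interval]:
    """Canonical past: each sentence is assessed at its own utterance time."""
    return list(utterances)


def r_past_assessments(events: List[Interval],
                       utterances: List[Interval]) -> Optional[List[Interval]]:
    """r-past: try one salient interval that contains every event and
    is anterior to every utterance time; return None if that fails."""
    salient = Interval(min(e.start for e in events), max(e.end for e in events))
    if all(e.within(salient) for e in events) and all(salient.precedes(u) for u in utterances):
        return [salient] * len(events)
    return None


# Hypothetical four-sentence narrative: events happen early, are told much later.
events = [Interval(1, 2), Interval(3, 4), Interval(5, 6), Interval(7, 8)]
utterances = [Interval(100, 101), Interval(102, 103), Interval(104, 105), Interval(106, 107)]

canonical = canonical_past_assessments(utterances)
narrative = r_past_assessments(events, utterances)

print(single_assessment_context(canonical))
print(narrative is not None and single_assessment_context(narrative))
```

Under the canonical past, the assessment time advances with each speech act, so the check fails. Under the r-past, a single salient interval covering all four events serves as the assessment time for every sentence, and the check succeeds.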
But the simple past in English cannot merely encode r-past. If it did, there would be no contrast with the historical present in the availability of faultless disagreements with ppts. Said another way, we would not derive the fact that the simple past allows heteroperspectival evaluation (though it does not require it). Additionally, we might expect it to be infelicitous out of the blue, like its German counterpart. It seems that we have to embrace some kind of polysemy for the simple past in English. It could be ambiguous (as Kratzer 1998 and Kamp & Reyle 1993 have, in fact, proposed) between past and r-past morphemes. Or, it could have an underspecified meaning: one candidate for this u(nderspecified)-past is given below.
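(56)
[Formula not preserved in the source: the denotation of u-past, which requires the reference time to be anterior to the time of one coordinate of the bicontextual pair, without specifying which.]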
With this semantics, the u-past simply says that the reference time is anterior to some bicontextual time, leaving underspecified which coordinate this is. For example, if the coordinate is the utterance context, a classical indexical past results that does not mention the assessment context at all. It could thus be used in an out-of-the-blue setting or in a narrative without violating the nsi, since the assessment time does not constrain the tense’s denotation at all. If the coordinate is instead the assessment context, then a backshifted past becomes possible when the assessment time is contextually set to a time anterior to time(u) (see footnote 10). This polysemy, regardless of which version is adopted, corresponds to the perspectival flexibility exhibited by the simple past.
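The two resolutions of the underspecified coordinate can be illustrated with a minimal Python sketch. It is a sketch under assumptions, not the chapter's formal definition: times are numeric intervals, the coordinate is passed as a string flag (`"u"` for the utterance context, `"a"` for the assessment context), and the names `Interval` and `u_past` as well as the clock values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """A closed time interval [start, end] on an arbitrary numeric timeline (assumption)."""
    start: float
    end: float

    def precedes(self, other: "Interval") -> bool:
        # Anteriority: this interval ends strictly before the other begins.
        return self.end < other.start


def u_past(ref: Interval, utter: Interval, assess: Interval, coordinate: str) -> bool:
    """u-past: the reference time is anterior to the time of one coordinate,
    chosen by the (hypothetical) flag 'u' or 'a'."""
    if coordinate == "u":
        anchor = utter    # classical indexical past; the assessment time is ignored
    elif coordinate == "a":
        anchor = assess   # backshifted past relative to the assessment time
    else:
        raise ValueError("coordinate must be 'u' or 'a'")
    return ref.precedes(anchor)


# Hypothetical values: the assessment time (hour 10) is anterior to the utterance time (hour 20).
utterance = Interval(20, 20)
assessment = Interval(10, 10)
got_up = Interval(5, 5)
lunch = Interval(12, 12)

print(u_past(lunch, utterance, assessment, "u"))
print(u_past(lunch, utterance, assessment, "a"))
print(u_past(got_up, utterance, assessment, "a"))
```

With the coordinate set to `"u"`, any event before the utterance time counts as past, whatever the assessment time is. With the coordinate set to `"a"`, only events anterior to the assessment time qualify, so the hypothetical `lunch` event at hour 12 is excluded while `got_up` at hour 5 is not.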
6.5.3 Considering an Alternative
It is important to consider whether this approach, which posits polysemy for the past tense in English, along with crosslinguistic variation in its semantics, is ultimately more explanatory than the alternative. The standard semantics for past tense in (32) could be maintained by restricting the scope of the nsi, making it a claim not about coherent narratives simpliciter, but merely about coherent narratives in the historical present (see footnote 11). This alternative would amount to a pragmatic principle that directly mandates a homoperspectival stance for one use of the simple present. It would be completely silent about other tense forms: it would have nothing to say about a disagreement in the simple past, whether following a sequence of historical present sentences, as in our original joint oral narrative in (36), or whether in an all-past narrative, as in (50).
Empirical coverage aside, there are a couple of reasons to think that the alternative, which posits a direct mapping between a tense use and homoperspectival appraisal, is not on the right track. In the account we have advanced, the connection between the historical present and nonfaultless disagreement is indirect: the historical present is restricted to the complication because of the semantics of present tense. If this restriction means anything, there should be evidence for it outside of disagreements with ppts. We would expect evaluative language, in general, to be treated as part of the “facts” of the story when it is expressed using the historical present. There is some intuitive evidence for this idea. Consider the following historical present story:
(57)
My neighbor and I start hanging out more after work. We go to see the new Star Wars movie later that month. But in the theater, they suddenly seem cold and distant. They stop returning my calls.
a. They are falling in love with me, but I don’t know that.
b. They were falling in love with me, but I didn’t know that.
We have the intuition that, in (57a), the fact that the neighbor is falling in love with the protagonist is part of the story; it is a crucial plot point that will propel some of the story events. For (57b), by contrast, we do not have that feeling: the prominent reading is one where the falling in love is a post facto explanation for why things happened. This contrast is rather subtle, but it does follow from the indirect account as we have advanced it. It is less clear how the same observations would be cashed out in the alternative, which only posits a connection between the historical present and homoperspectival evaluation.
The indirect route, moreover, makes an interesting prediction about ppts in the “narrative” past tenses in German, French, and Dutch. For these languages, we suggested that there was a distinction, parallel to the historical vs. canonical present contrast, that was encoded as a semantic distinction between two past tense morphemes. We thus predict that the simple past in German, French, and Dutch should trigger nonfaultless disagreement, while the present perfect should allow faultless disagreement. Importantly, this is attributed, not to a stipulation about tense–judge interactions, as the direct alternative would have, but to a general constraint on narrative genres. It may be that, ultimately, the account we have advanced is too ambitious. But it does make clear empirical predictions, showing at the same time what work needs to be done next.
6.6 Conclusion
We began this chapter with a novel puzzle about how tense and predicates of personal taste (ppts) interact in the performance of joint oral narratives. We have used this puzzle to mount an argument for the linguistic importance of structures for oral narratives identified in the discourse analysis and psychology literatures. Namely, ppts and other perspectival expressions are evaluated differently in the complication and evaluation portions of narrative, a claim we have called the Narrative Perspectival Generalization (4).
We have cashed out the npg theoretically by combining a bicontextual theory of perspective (MacFarlane 2003, 2014; Sharvit 2004, 2008) with Roberts’s (2012) theory of discourse structure, leading us to a constraint on the strategies of inquiry for narratives, which makes mention of both bicontextual perspectives (35). In turn, we have shown how the linking of grammatical tense to the assessment context allows us to account for the contrast between (historical) present and (canonical) past disagreements in joint oral narratives.
In closing, we would like to reflect on the larger implications of the empirical puzzle we have focused on and the proposals we have advanced. Perhaps the most immediate question that arises is the status of the nsi within a theory of linguistic competence. Ultimately, we see the nsi as a claim about what speakers know about the pragmatics of the narrative genre. Genres are shaped by cultural practice, and hence are matters of convention. It may be that there are very few, if any, cognitive or properly linguistic constraints on possible genres. Nevertheless, we believe that the conventions of a genre can make direct reference to linguistic structures, and thus the study of the structures of genres can provide indirect evidence for underlying linguistic structures. In the present case, the interaction of tense and faultlessness provides, we believe, strong evidence that temporal perspective and evaluative perspective are grammatically linked, a claim we have cashed out by making them both sensitive to the same object (the assessment context). Beyond that, we should understand that much of the surrounding structural dichotomizing – complication vs. evaluation, assessment vs. utterance context – is provisional, absent a theory of sufficient richness.
Hence, while our particular way of implementing that linguistic importance involved bicontextual parameters, our aim in this chapter was more general. We hope to have shown that a richer, more capacious notion of what constitutes narrative perspective is needed, one that engages with the intentional structure of a narrative, and a sense of a narrative as a practice distinct from information exchange. While there is a substantial treatment of Narration as a coherence relation between discourse units within formal semantics and philosophy of language, very little has been done in these traditions to understand narrative as a larger intentional form of language use. We believe that there is real opportunity for progress in this arena. But this progress will come only by closely attending to how particular grammatical formatives (like the “narrative” tenses in German, French, and Dutch) interact with other perspectival notions, as well as by undertaking more serious and sustained attempts to formalize the richer notions of narrative structure found in discourse analytical, narratological, and psychological investigations of narrative discourse.
Such sustained, interdisciplinary examination will be necessary, we believe, to understand what narration is and why it has the character it does. While we have argued for a grammatical interaction between temporal and evaluative perspective in this chapter, our account, in essence, simply stipulates this interaction by making different sets of morphemes dependent on the same perspectival parameter. We have not touched the more important explanatory question of why things are organized this way and not another. In more naive discussions of perspective, the perspectival center is characterized, not as an abstract vantage point, but as some actual individual in the story world (see, e.g. Walton 1990). For such a view, it is not surprising that there is a unity of temporal and evaluative perspective. However, we have in Section 6.4 argued at some length that it is difficult to link the evaluative perspective with any particular set of individuals, even for autobiographical oral narratives, and more complex narratives clearly lack an obvious embodied perspectival center. It is thus surprising that tense and evaluation continue to track together formally, even when they are not linked to any clear individual. It may be that the narrator plays a crucial role here, and that even in cases where there is no actual person, there is some “counterfactual person” from whose vantage point the narration is simulated to take place. While we acknowledge the promise of this idea, moving from informal notions into something more substantive will require much more careful theorizing around narrators, and thus also the intentional structure of narration. We hope that this chapter has illustrated some potential payoffs of that task for linguists and philosophers of language alike.








