Part I Problems and Theories
2 Research paradigms in pragmatics
There is no shortage of definitions for pragmatics. Many have proposed a division of labor between grammatical phenomena with their dedicated accounts and pragmatic phenomena with their dedicated accounts. Few succeeded. The majority failed, because they were aiming too high (see Ariel 2010: Part I; Levinson 1983: Chapters 1, 7; Turner 1999). Two major obstacles blocked the attempts to come up with a coherent definition for pragmatics. First, there was the hope that a multiplicity of criteria simultaneously converge to distinguish between all of grammar (including semantics) and all of pragmatics, e.g., context sensitivity, non-truth-conditionality, implicitness, discourse scope (and many others), all characterizing pragmatics, and context invariability, truth-conditionality, explicitness, sentential scope (and many others), all characterizing grammar. Naturally, the more criteria we can mobilize for drawing the grammar/pragmatics division of labor, the more contentful each of the defined domains is made out to be, and the more significant the distinction between them. The second high hope was that complete topics, such as speech acts, implicatures, politeness, functional syntax, deixis, presupposition, agreement, and argument structure each, en bloc, belongs either in grammar or in pragmatics. Indeed, this would guarantee a very neat division of labor between grammar and pragmatics. Thus, speech acts, implicatures, politeness, and functional syntax should wholly belong on the pragmatics turf, agreement and argument structure should wholly belong on the grammatical turf, and semanticists and pragmatists should do battle over, e.g., presupposition and deixis.
Unfortunately, neither one of these worthy goals can be achieved. Multiple criteria for distinguishing grammar and pragmatics resulted in multiple contradictions between the various criteria, which render the definitions of both grammar and pragmatics incoherent. And aiming for a single-domain account (either pragmatics or grammar) for each topic on the canonical list of pragmatic topics (such as speech acts and implicatures) is a mission impossible, because almost all linguistic expressions require both a grammatical account and a pragmatic account. Therefore, for a division of labor between grammar and pragmatics to work, we have to lower our expectations. We must make do with just a single dichotomy, and we must concede that only aspects of phenomena can receive uniform grammatical or pragmatic analyses: Labeled topics (such as presupposition, reference, politeness, functional syntax) typically straddle both sides of the grammar/pragmatics divide. Whereas the issue of a uniform domain analysis for each topic is hardly ever explicitly addressed (but see Ariel 2010: Part III; Wilson and Sperber 1993a), the multiple definitional problem has been widely recognized, and one criterion has emerged as a clear, even if not exclusive, winner.
On this criterion, grammar is taken to comprise conventional codes pertaining to linguistic forms, and pragmatics comprises plausible (not necessarily logical) inferences based partly on that code. Now, some have advocated this criterion as the single criterion to be used (Ariel 1999, 2008b, 2010; Prince 1988; Sperber and Wilson 1986). Others believe that the code/inference criterion converges with some other criterion, (non-)truth-conditionality being the favorite (Grice 1989; Recanati 2004b). Be that as it may, there is at least some consensus in the field: grammar minimally analyses conventional codes whereby specific linguistic expressions are associated with their semantic interpretations and/or use conditions; pragmatics complements grammar in that it is responsible for speaker-intended meanings or use conditions she complies with, which are rationally derivable when we take into consideration the linguistic meaning, relevant contextual assumptions, and some discourse or cognitive principle(s). This division of labor is the starting point for this chapter.
Assuming we define grammar as a set of codes associating linguistic forms with their interpretations and use conditions and pragmatics as a set of inferences rationally, but only plausibly, derived on the basis of the explicit utterance, a more practical set of questions arises: What are the research questions linguists pose when they do pragmatics? What kind of answers can they give when analyzing pragmatic phenomena? I will briefly outline three prominent research paradigms adopted by pragmatists who set out from the assumption that grammar is encoded and pragmatics is inferred (sections 2.1, 2.2, 2.3 respectively).1 My goal is to offer a bird's-eye view of how these approaches go about doing pragmatic research.2 As will become evident below, however, agreeing on a set of research questions, and even on methodology, does not mean that each research paradigm reduces to a single theoretical approach, nor to uniform linguistic analyses for specific phenomena. Quite the contrary. Typically, the theories and analyses grouped under the same paradigm compete with each other, so that within-paradigm disagreements are quite common. Surprisingly, perhaps, cross-paradigmatic theories and analyses typically (though not invariably) complement each other, since each shines a light on different aspects of linguistic phenomena.
In order to facilitate comparisons between the different accounts within, and more importantly, across the three research paradigms, some common denominator is needed. I have chosen one issue, discourse reference, which I exemplify for all the theories discussed in this chapter. Different analyses of the same issue, each falling within one of the three research paradigms, can be readily compared and contrasted (section 2.4). In the end, the argument I will make is that even if some or even all current pragmatic theories, as well as specific analyses of discourse reference, turn out to be wrong, all three research paradigms are necessary for a complete picture of grammar, of pragmatics, and of the grammar/pragmatics interface (section 2.5).
2.1 Grammar is grammar and pragmatics is pragmatics: Inferential pragmatics
Inferential pragmatics theories are all traceable to Grice (1989).3 The main protagonists within this pragmatics research paradigm are Griceans, neo-Griceans, and Relevance theoreticians. Their goals are: (a) to provide an inferential pragmatic theory to account for speaker-intended implicit interpretations; (b) based on this theory, to establish the division of labor between grammar, specifically semantics, and pragmatics, concentrating on difficult cases, often those termed generalized conversational implicature cases (GCIs); (c) to distinguish between different implicit interpretations: particularized conversational implicatures (PCIs) versus GCIs, implicatures versus inferences that form part of the truth conditions of the proposition expressed (the explicature), and conventional implicatures versus conversational implicatures. Naturally, both codes and inferences are here discussed, the goal being a complementary distribution between the two.
2.1.1 The inferential pragmatics research paradigm
Grice's original insight about the crucial role of inferencing for conveying the speaker's intended message (at the expense of encoding) has been adopted by all inferential pragmatics theories, and inferencing is now seen as a major ingredient in utterance interpretation. In fact, whenever the linguist is debating whether to assign semantic or pragmatic status to some interpretation, the default preference is to treat it as pragmatically inferred. Following Grice's (1989: Chapter 3) Modified Occam's Razor Principle, it is more economical to assign a given interpretation to pragmatics than to semantics, since the former comes “for free”, speakers performing pragmatic inferences anyway. The latter requires grammatical stipulation, which is supposed to burden native speakers.
Inferential pragmatists leave formal grammar mostly intact, except for some “territory seizures”, where some potentially semantic accounts are relegated to pragmatics. As an empirical basis for the analyses offered, inferential pragmatists traditionally, and very often even today, rely on their own intuitions, although recently, some corpus studies (Ariel 2004) and some experimentation (Gibbs and Moise 1997; Noveck 2001; Noveck and Chevaux 2002; Papafragou and Musolino 2003) have been adduced in support of researchers’ positions. The three variant inferential pragmatics theories proposed by Grice, neo-Griceans, and Relevance theoreticians are quite well known. Given the commonalities outlined above, emphasis will here be laid on the differences between them.
We start with Grice. Grice's theory simultaneously addresses two fundamental problems about linguistic interactions. The first one is, how can we distinguish between a natural and an unnatural discourse? In other words, what characterizes coherent discourse progression? Grice's answer is that interlocutors first and foremost abide by the Cooperative Principle, which instructs them to “[m]ake your conversational contribution such as is required, at the stage at which it occurs, by the accepted purpose or direction of the talk exchange in which you are engaged” (Grice 1975: 45). Presuming that this agreement is adhered to, the interlocutors can further assume that four maxims and eight sub-maxims inform speakers’ utterances, and hence, addressees’ interpretations: Quantity (informativity), Quality (truth and reliability), Relation (relevance), and Manner (optimal choice of linguistic form).
The second question is, how can speakers convey more than they explicitly encode? Grice proposed that interlocutors need not always straightforwardly obey the maxims in order to be cooperative. He offered the mechanism of conversational implicature generation to account for such cases. The idea is that speakers guide their addressees to extract more meaning out of their utterances than they actually encode. Taking into consideration the encoded message, the maxims, as well as relevant contextual background, addressees must “read between the lines” in order to process the additional inferred meanings intended by the speaker. These conversational implicatures may be generated in order to make sure that some maxim is abided by, or in order to justify an apparent maxim violation.
The neo-Griceans (most notably, Horn 1984b, 1989; Levinson 1987, 2000) accept the Gricean picture for the most part, but they don't have use for so many maxims and sub-maxims. They leave Quality intact; Relation is kept by Levinson but rejected by Horn. The neo-Griceans’ attention is mainly focused on reducing the six sub-maxims subsumed under Quantity and Manner. Horn proposes two principles instead, based on Grice's Quantity. Whereas Grice saw no particular problem in balancing between his Q1 and Q2, Horn and Levinson see the two as clashing, one instructing the speaker to be maximally informative, thereby blocking potentially inferable enrichments, the other instructing the speaker to be minimally informative, thereby encouraging the addressee to enrich the speaker's linguistic meaning with pragmatic inferences. They then explain how interlocutors resolve this clash, by reference to markedness: “The use of a marked expression when a corresponding unmarked alternate expression is available tends to be interpreted as a marked message (one which the unmarked alternative would not or could not have conveyed)” (Horn 1984b: 22). In other words, enrichment to the stereotype (a derivative of Q2) applies when unmarked forms are involved, and anti-enrichment (a derivative of Q1) applies when marked forms are used.
Sperber and Wilson (1986) are even more reductive than the neo-Griceans. They have suggested that a single principle of optimal relevance can account for discourse coherence, as well as for how addressees go about reading more into speakers’ utterances. According to this theory, Relevant information necessarily hooks up with addressees’ contextual assumptions, yielding contextual implications based on the two computed together. Speakers aim for an adequate quantity of contextual implications (pragmatic inferences), which must be conveyed in a way that requires the least processing cost from the addressee.
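The trade-off just described can be caricatured in code. The sketch below is my own toy illustration, not part of Relevance theory: candidate interpretations are assumed to come pre-ordered by processing cost, and the addressee stops at the first one that yields an adequate number of contextual implications.

```python
# Toy illustration (my own, hypothetical) of a relevance-guided choice among
# candidate interpretations: test them in order of increasing processing cost
# and stop at the first that yields enough contextual implications.

def first_relevant(candidates, adequate=1):
    """candidates: (interpretation, n_contextual_implications) pairs,
    ordered from least to most costly to derive."""
    for interpretation, implications in candidates:
        if implications >= adequate:
            return interpretation
    return None

# Hypothetical example: the enriched 'and then' reading yields contextual
# implications that the bare logical conjunction does not.
choice = first_relevant([("p and q", 0), ("p and then q", 2)])
print(choice)
```

The point of the sketch is only the stopping rule: the addressee never weighs all readings against each other, since the first adequately relevant one wins.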
In this way all three theories account for how discourse is coherent, and what triggers addressees’ inferences. Interestingly, while the theories are quite different, for the most part there is agreement between them on where to draw the grammar/pragmatics division of labor for specific linguistic phenomena. For example, they all agree that and's semantic meaning is equivalent to its logical counterpart ∧, and that the additional interpretations often associated with it (e.g., ‘and then’, ‘and therefore’, etc.) are pragmatically derived. The same is true for (either) or and for scalar expressions (such as some, good). The semantic meanings of the latter are lower-bounded only (e.g., ‘at least some/good’), the upper bound (‘not all’/‘not excellent’) being pragmatically provided. However, there can, theoretically, be disagreements within this inferential pragmatics research paradigm regarding the grammar/pragmatics division of labor.4 Moreover, the fact that we mostly find agreement regarding the semantics/pragmatics division of labor does not mean that the analyses are otherwise identical. Quite the contrary. (Neo-)Gricean and Relevance-theoretic theories classify pragmatic interpretations somewhat differently.
While there is universal agreement that PCIs play a major role in natural discourse, Sperber and Wilson have proposed an additional distinction between implicated assumptions and implicated conclusions. The former are assumptions the speaker intends the addressee to draw on because they are needed for deriving the implicated conclusions. Only the latter are actually a speaker's point in the discourse. Next, (neo-)Griceans assume a special status for conventional implicatures. These are like conversational implicatures in that they don't affect the truth conditions of the proposition they are in, but like semantic meanings, they are conventionally encoded for specific expressions. (Neo-)Griceans here prefer truth-conditionality as the criterion for the grammar/pragmatics divide. Since Relevance theoreticians (see Blakemore 1987 and onwards about but) take the code/inference distinction alone as the criterion distinguishing semantics from pragmatics, conventional implicatures fall squarely on the semantics side of the divide. In other words, there is no place for conventional implicatures as a distinct species of pragmatic interpretation under Relevance theory, since no inference is involved.
Last, neo-Griceans, much more than Grice himself, focus their attention on generalized conversational implicatures (GCIs). These are conversational implicatures, except that they are generated under normal circumstances (as a default, according to Levinson). It is these implicatures which under some contextual circumstances “intrude” on semantics, in that they contribute to the truth-conditional meaning of the relevant proposition, despite the fact that they are pragmatic implicatures. Relevance theoreticians have argued against assuming this additional type of conversational implicature. According to Carston (1990 and onwards), for example, to the extent that GCIs are conversational implicatures, they should be seen as PCIs (which happen to be generated rather often). But for many inferences treated as GCIs by the neo-Griceans, Relevance theoreticians have proposed a different status, that of explicated inferences, i.e., inferences which form part of the Relevance-theoretic concept of explicature.5
Relevance theoreticians are not surprised that pragmatics intrudes on semantics. They take it as given that pragmatic inferences contribute to the truth-conditional content of the proposition. While semanticists and pragmatists are in agreement these days that the linguistically encoded meaning falls short of the meaning actually communicated by the speaker, Griceans have more or less adhered to Grice's original “what is said,” which includes only a minimal quantity of pragmatically inferred interpretations, and they are happy to call this level truth-conditional semantics. Sperber and Wilson disagree. Any pragmatically inferred interpretation is a full-fledged pragmatic inference, which should count as one. The fact that many of these are needed to develop the encoded Logical Form of the speaker's utterance into a complete proposition doesn't alter their pragmatic status. Instead of the minimally enriched “what is said,” Sperber and Wilson offer the pragmatically richer explicature (the idea being that the pragmatic inferences needed to complete the speaker's encoded meaning into a full proposition count as part of the explicit message). Now, these differences mean that pragmatic interpretations associated with the same linguistic expressions may well receive different analyses by proponents of different inferential pragmatics theories. Indeed, whereas for (neo-)Griceans the pragmatic interpretations commonly associated with and are GCIs, they are explicated inferences for Relevance theory. And whereas the upper bound on, e.g., some is a GCI under the neo-Gricean analysis, it is sometimes a PCI, and sometimes part of the explicature for Relevance theoreticians.6
In sum, all three inferential pragmatics theories propose some overarching principle(s) in order to account for discourse coherence and for how pragmatic inferences are routinely drawn over and above the speaker's linguistically encoded message. Differences between the theories are found in the number of principles, and in the types of pragmatic interpretations researchers assume in general and for specific cases (and, some). But these don't alter the common focus within this research paradigm: drawing a grammar/pragmatics division of labor between semantic codes and pragmatic inferences.7
2.1.2 Inferential pragmatics and reference
Reference, just like any other speaker-intended interpretation, requires both a code and a set of inferences. The common goal shared by inferential pragmatics theories is the attempt to relegate as much as possible to pragmatics. In other words, preference is given to analyses involving small codes and big inferences. The role of the specific referential forms tends to be downplayed, while the role of the contextual inferences is upgraded.
Kempson (1988) is a typical Relevance-based proposal. Referring expressions, Kempson reminds us, have quite a variety of uses. For example, only some are referential. Some are impersonal, some involve bound variable anaphora, others require bridging. Kempson proposes a very “thin” code for all definite noun phrases: “Presumed accessibility.” It is then up to the addressee to determine both the type of reading and the referent intended by the speaker, based on Relevance theory (making sure there are sufficient contextual effects for a minimal processing cost). Wilson (1992) emphasizes that reference assignment forms part of the overall process of utterance interpretation, which is Relevance-guided. She focuses on cases where more than one NP can serve as antecedent (e.g., Sean Penn_i attacked a photographer_j. The man_i/j was badly hurt). While both interpretations are conceivable, Relevance theory, she proposes, does not force the addressee to consider all options before he can choose the one intended by the speaker (a costly processing procedure). Interlocutors are tuned to specific coherence patterns, and can immediately choose the appropriate overall interpretation (the state here described in the second utterance is the consequence of the event in the first) based on the content of the utterances. This effortlessly points them to the appropriate reference.
Assuming that coreferential readings are more informative than disjoint readings, Levinson (2000) constructs a Horn scale of the prototypical referring expressions: Lexical NP < Pronoun < Ø (see also Huang 2000). The idea is that less informative forms (on the right) trigger stronger (coreferential) readings (via Q2 – enrichment to the stereotype). If so, using an informative expression when the grammar would allow for a coreferent reading with a less informative expression triggers a disjoint reading (via Q1 – the anti-enrichment principle, e.g., He_i went over there and approached the man_j).
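The two inference rules operating over this scale can be rendered as a toy decision procedure. The sketch below is my own illustration, not Levinson's formalism; only the scale positions and the Q1/Q2 rule names come from the text.

```python
# Toy illustration (mine, not Levinson's formalism) of the referential scale
# Lexical NP < Pronoun < Ø and the two neo-Gricean rules described above.

SCALE = ["lexical NP", "pronoun", "zero"]  # left = more informative

def preferred_reading(form, leaner_form_was_grammatical):
    """Q1 (anti-enrichment): a more informative form used where a leaner one
    would have been grammatical implicates disjoint reference.
    Q2 (enrichment to the stereotype): otherwise, the lean form is read
    as coreferential."""
    if leaner_form_was_grammatical and SCALE.index(form) < len(SCALE) - 1:
        return "disjoint"
    return "coreferential"

# 'He_i ... approached the man_j': a lexical NP where a pronoun was possible.
print(preferred_reading("lexical NP", leaner_form_was_grammatical=True))
```

The design choice worth noting is that the rules are stated over the choice of form relative to its alternatives, not over the form in isolation, which is exactly the markedness logic of section 2.1.1.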
As can be seen, inferential pragmatic analyses naturally assign a primary role to pragmatic inferencing in determining the use and interpretation of referring expressions. While Levinson does draw some formal distinctions between referring expressions, a much finer set of distinctions will be offered by form/function pragmatics theories (see section 2.2.2).
2.2 (Some) pragmatics is grammar: Form/function pragmatics
Form/function pragmatics, as I propose to call the second research paradigm, has altogether different intellectual roots from inferential pragmatics. Whereas the inspiring figure in the first tradition is Paul Grice, form/function pragmatics emerged as a reaction to Chomsky's generative syntax. Hence, researchers within this paradigm focused on syntax initially, where, they argued, not all conditions (on transformations, in those days) could be grammatically specified, because they often pertained to information structure, a pragmatic concept. The thought behind most research within this paradigm is that grammar is at least partly geared towards communication. Hence, it is only natural for factors relevant to communication to play a role in it. Pioneering research was conducted by Susumu Kuno, Ellen Prince, Wallace Chafe, Sandra Thompson, and various generative semanticists (see Chafe 1976, 1994; Green, G. M. 1975, 1976; Hooper and Thompson 1973; Kuno 1972, 1987; Prince 1978, 1988; Thompson, S. A. 1990). In time, linguistic analyses were no longer restricted to syntactic structures, discourse markers too receiving prime attention (following Schiffrin 1987). And some current approaches actually advocate the replacement of formal syntax by form/function theories (most notably, Goldberg 1995).
2.2.1 The form/function pragmatics research paradigm
Form/function pragmatics is concerned with a small subset of pragmatic meanings, those conventionally associated with specific linguistic expressions (constructions and discourse markers for the most part). Since these meanings are conventional, the pragmatic meanings offered are claimed to belong in grammar. Most, though not all, researchers use naturally occurring examples, but hardly any experimentation has been used to support pragmatic form/function claims.8
Here is one example. Originally, the generative account needed to stipulate that the subject (X) “removed” from sentence-initial position in existential sentences (there is an X…) had to be an indefinite NP. This grammatical condition, however, is lacking in both descriptive and explanatory adequacy, it was argued, to use the jargon of the period. The missing explanatory generalization is that existential constructions serve a specific discourse function, to introduce new entities into the discourse. Since indefinite NPs stand for new entities, no wonder this restriction applies to existential sentences. Hence, what seems to be an arbitrary grammatical stipulation turns out to be pragmatically motivated. Moreover, once the grammatical restriction (against definites) was replaced with a functional restriction (against Given entities), formally definite NPs no longer constitute counterexamples to the account, provided they are not Given (Rando and Napoli 1978).
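The point of replacing the formal restriction with the functional one can be made concrete with a small sketch. Everything below is my own hypothetical illustration (the function name and discourse record are invented): felicity is checked against discourse Givenness, not against formal (in)definiteness.

```python
# Hypothetical sketch (mine, not Rando and Napoli's formalism): the felicity
# of an existential 'there is X ...' turns on whether X's referent is already
# Given in the discourse, not on whether X is formally definite.

def existential_ok(referent, given_entities):
    """Accept the existential iff the referent is discourse-new."""
    return referent not in given_entities

given = {"the moon"}                                 # invented discourse record
print(existential_ok("a unicorn", given))            # indefinite, new: fine
print(existential_ok("the moon", given))             # definite and Given: out
print(existential_ok("the weirdest noise", given))   # definite but new: fine
```

The third call is the crucial one: a formally definite NP passes, so formally definite NPs stop being counterexamples as long as their referents are not Given.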
Now, grammar is responsible for form/function correlations, and the analyses here concerned clearly associate specific forms with specific functions. Why should this research be considered pragmatics then? Different researchers provide different reasons. The dilemma here is identical to the one concerning conventional implicatures – in fact, there is no reason not to view the form/function correlations here concerned as conventional implicatures. For some form/function pragmatists, their research is pragmatic, because the functions associated with the syntactic constructions and discourse markers all concern interpretations that do not contribute truth-conditional elements to the proposition expressed. In other words, if we were to present new information where old information is called for, or vice versa, the resulting sentence is not taken to convey a false proposition, nor is it an ungrammatical sentence. Since, especially early on, many pragmatists subscribed to the truth-conditionality criterion as distinguishing semantics from pragmatics, the functions associated with syntactic constructions were automatically classified as pragmatic (see Brinton 2008 for a similar assumption). Later on, in view of the inner contradictions between various definitions for pragmatics, pragmatists tended to give up on a coherent definition for the field, and such research was seen as part of pragmatics, simply because a canonical list of pragmatics topics had been established, and functional syntax formed a (marginal) member on this list. Thus, for the most part, functional syntax, and certainly research on discourse markers, is considered pragmatic, rather than semantic.
At the same time, some researchers insisted that the form/function correlations they were analyzing were conventional, and hence part of grammar. Prince (most explicitly in Prince 1988) treated the non-truth-conditional interpretations she was analyzing as grammatical, because they directly and conventionally associated specific linguistic forms with specific interpretations and/or use conditions.9 Now, although for Prince form/function pragmatics forms part of grammar, she assigns it to a special, discourse component within grammar. Blakemore took this one step further, and treated conventional implicatures as part of semantics proper (see Ariel 2010: Chapter 8 for arguments for and against this position). So, while the status of form/function pragmatic meanings within grammar is not settled, there is no doubt that the conventions involved (or at least a subset of these) are encoded for specific forms, and hence fall on the grammar side of the grammar/pragmatics divide.
Most classical form/function pragmatic analyses leave formal grammar intact. Rarely do they appropriate a grammatical phenomenon as pragmatic (as in the case of the definiteness restriction in existential constructions). Traditionally, attention was focused on so-called optional choices between variant forms (paraphrastic utterances). The argument made was that speakers’ preference for one rather than another of these syntactic paraphrases (or for using or not using some discourse marker) was not random: “every contrast a language permits to survive is relevant, some time or other” (Bolinger 1972: 71). Formal grammar cannot offer an account for such choices between semantic paraphrases (nor between zero discourse markers and explicit discourse markers), but form/function pragmatics can. In fact, what seems an “optional” choice between “free variants” in terms of grammar turns out to be a choice informed by form/function conventions pertaining to non-truth-conditional meanings and/or use conditions. Recently, a new line of research has emerged which incorporates many of the insights introduced by form/function pragmatists, but sees no reason whatsoever to (a) distinguish between + and – truth-conditional meanings and (b) accept formal syntactic analyses: Cognitive Linguistics (Lakoff, G. 1987; Langacker 1987, 1991) and Construction Grammar (Goldberg 1995). On these approaches, form/function correlations are part of grammar, no matter what their nature is.
A typical form/function pragmatics analysis is Prince (1978), where she not only distinguishes between unmarked non-clefted constructions and cleft sentences, she also points to the different discourse functions associated with the two cleft sentences in English. The presupposed component in it-clefts, she argued, introduces Given information in general. But the presupposed component in wh-clefts refers to a subset of the information Given to the addressee, that which can cooperatively be assumed to be currently accessible to him. In Ariel (1983) I analyzed a specific appositive in Hebrew, which is dedicated to the introduction of (very) important persons, and Ward (1990) argued that English VP preposing is restricted to cases where the speaker intends to affirm her commitment to a proposition which has recently been evoked in the discourse. In each of these cases a specific discourse function is associated with a specific syntactic construction.
But these are marked constructions. Most constructions do not manifest a one-to-one form/function correspondence. In fact, even Prince analyzed a few functions for it-clefts, as did Ziv (1982) for existential constructions (and see especially Kuzar 2009; Lakoff, G. 1987). The idea is that each construction has multiple features, each of which is potentially appropriate for a certain discourse function. For example, Prince found that in addition to stressed-focus it-clefts as characterized above, there are also informative it-clefts, where the presupposed information is not actually Given to the addressee. Rather, it is presented as a fact known to some, knowledgeable people. It is not surprising that it-clefts, rather than wh-clefts, are used for this non-Given function. Note that unlike for wh-clefts, the presupposed information in it-clefts is presented in final sentence position, a position more appropriate for non-Given information. Different sentential positions for certain syntactic roles (e.g., non-initial position for subjects in existential constructions), different degrees of verb transitivity and semantic content (existential there takes low-transitivity and low-content verbs), and different optional components in the construction (is there or is there not an additional embedded clause attached to the subject NP in the existential clause?) are all potentially mobilized for conveying various discourse functions. Hence the realization that syntactic constructions are not invariably reducible to single pragmatic “sign” functions. Rather, they constitute structure functions which allow for a variety of sign functions to be realized in them (Du Bois 2003).
Summing up, form/function pragmatic research associates discourse functions and/or use conditions with linguistically specified forms (morphemes, phrases, or whole syntactic constructions). The difference between classical semantic functions and these mostly information-status conditions is that although they are equally grammaticized, they involve extralinguistic factors which normally do not contribute to the truth-conditions of the proposition expressed. Since researchers within this paradigm are committed to the code/inference division of labor between grammar and pragmatics, these form/function correlations are considered grammatical. The result is that they see the pragmatic interpretations they analyze as separate in nature from other pragmatic interpretations, most notably, from conversational implicatures, analyzed by inferential pragmatics theories. But at the same time, many of these researchers also uphold the conviction that being non-truth-conditional, the interpretations at hand are pragmatic. Hence their self-classification as pragmatists.
2.2.2 Form/function pragmatics and reference
Although form/function pragmatists do appreciate the role of inferencing in referential acts, they see a larger role for the specific referring expressions. Both Ariel (Reference Ariel1990) and Gundel et al. (Reference Gundel, Hedberg and Zacharski1993) have proposed much richer scales of referring expressions than Levinson's, the idea being that there is a conventional form/function association between forms and referential interpretations.10 I have proposed that referring expressions each specialize for a different degree of accessibility for the mental representation the addressee is to retrieve. The form/function associations are far from arbitrary. Less informative expressions (she vs the woman), less rigid (uniquely referring) expressions (the woman vs Dana), and more phonetically attenuated forms (Ø vs she, USA vs the United States) are used to access relatively highly accessible referents. Conversely, the more informative, rigid, and phonetically large the form (e.g., June, the woman who just walked out, SBC: 008), the less accessible the mental representation is assumed to be.11 Infinitely many referring expressions indicate various intermediate degrees of mental accessibility (e.g., June, the woman, demonstratives, pronouns, cliticized pronouns, etc.), all arranged on a scale of degrees of accessibility. Gundel et al. (Reference Gundel, Hedberg and Zacharski1993) are more ambitious and propose that each referring expression encodes a specific cognitive status (such as ‘in focus’, ‘activated’, ‘familiar’, etc.). For example, it is chosen in (1), because the colander is extremely accessible according to Ariel, and similarly, it is ‘in focus’ for Gundel et al. (Reference Gundel, Hedberg and Zacharski1993).
That, on the other hand, is used in (2) because the propositional ‘reading you some’ is not sufficiently accessible to justify the use of a pronoun according to Ariel (it is ‘activated’, not ‘in focus’ for Gundel et al., Reference Gundel, Hedberg and Zacharski1993):
(1) MARILYN:…There's a colanderi –…Oh. Iti's gone. Oh here iti is. (SBC: 003)
(2) PAMELA:…I could read you some…I mean is that allowed? (SBC: 005)
As can be seen, a much more central role is assigned to specific linguistic forms under this approach, although both theories fully appreciate the need for pragmatic inferencing on top of decoding for proper reference resolution (see section 2.5.1). Referring expressions are each directly associated with some cognitive concept in a grammatical manner.
2.3 Grammar is (yesterday's) pragmatics: Historical and typological pragmatics
The pragmatists so far considered may infringe on the grammarian's territory, pushing some borders around, but they do not challenge deeply ingrained generative grammar assumptions. The functionalists discussed here do. Most importantly, many reject the innateness hypothesis, arguing that typological universals are better accounted for by reference to extralinguistic factors. Grammar is neither innate nor arbitrary. It evolves in real-time through discourse use. Since communicative needs are relatively similar across different speech communities, it's not surprising that similar grammatical constructions and semantic meanings evolve in a similar fashion from similar sources in unrelated languages. Now, not all extragrammatical forces are pragmatic in the sense being discussed, but we here focus specifically on the role of recurrent pragmatic inferences in shaping grammar.12 Prominent practitioners within this research paradigm include Givón (Reference Givón1979 and onwards), Traugott (Reference Traugott, Lehmann and Malkiel1982 and onwards), Traugott and Heine (Reference Traugott and Heine1991), Traugott and Dasher (Reference Traugott and Dasher2002), Heine (Reference Heine1993), Bybee et al. (Reference Bybee, Perkins and Pagliuca1994), Comrie (Reference Comrie1994), Du Bois (Reference Du Bois1987), Haspelmath (Reference Haspelmath, Fischer, Norde and Perridon2004), and Croft (Reference Croft2000). Note that these researchers do not actually consider themselves pragmatists. Rather, they see themselves as historical linguists, as functionalists, and/or as typologists. 
Nonetheless, I insist that this research count as a relevant pragmatics research paradigm, because a crucial grammar/pragmatics interface is analyzed: the process by which recurrent pragmatic inferences turn into grammatical codes.13 The majority of the research in grammaticization (the creation of some grammatical form) and semanticization (the evolution of new meanings for old forms) is corpus-based, and so is some recent functional typological research (Haspelmath, Reference Haspelmath2008).14
2.3.1 The historical and typological pragmatics research paradigm
Much historical (grammaticization) and typological pragmatics research analyzes the current grammar as pragmatically motivated. As in inferential pragmatics thinking, the assumption here is that in order to best use their grammar to fulfill their interactional goals, interlocutors make heavy use of context. All newly formed form/function correlations are crucially context-bound, and hence the clear role of pragmatics in their initiation. Gradually, a consistent use of some form in some context, which contributes to the derivation of some extralinguistic reading, may bring about the entrenchment of the pragmatically derived form/function correlation into conventional codes. Once conventional, cancelability no longer applies, nor is contextual support needed. A piece of grammar has emerged.
We start with innovations. When intended meanings are not straightforwardly codable in the current grammar, speakers must improvise. Uttering ungrammatical utterances is not an option. But speakers can mobilize current lexemes and constructions to convey their innovative messages, relying on pragmatic inferencing as mediator. A wealth of relevant examples, universally attested, is provided by Heine and Kuteva (Reference Heine, Wischer and Diewald2002): ‘Alone’ can be mobilized to express ‘only’, body parts (e.g. ‘back’) can be mobilized to express spatial relations (‘in back of’), demonstratives can indicate a relative clause construction (that anaphorically referring to the head) or definiteness in general (that > the), ‘or’ can help mark (alternative) questions, ‘ability’ can trigger an interpretation of ‘possibility’, ‘all’ or ‘people’ can indicate ‘plurality’, possessive + ‘head’ or ‘body’ can help indicate an action performed on the self, etc. In all these cases, the assumption is that initially, the innovated meaning was only a conversational implicature derived with the help of a richly supportive context. But, as Grice (Reference Grice1989) originally noted, what is initially generated as a PCI may end up a semantic meaning.
Note that speakers rely on pragmatic implicatures not only when their grammar is incapable of expressing their intended meaning. We often prefer to convey our messages only implicitly. For example, instead of explicitly stating that x is the reason for y (using some because expression), speakers can implicate it. Now, we're here interested in the eventual development of a specialized form for expressing reasons. One form speakers can use in order to implicate a causal relation is a temporal adverbial such as after, because events preceding other events are easily construed as causing or explaining them (Hopper and Traugott, Reference Hopper and Traugott2003). Interestingly, cross-linguistically, many reason conjunctions are either ambiguous (since) or etymologically derived from temporal expressions (Hebrew ekev derived from be=ikvot ‘following’):
(3) And there's a lot of moves that we just know. After being…there for so long (LSAC).
Clearly, “being there for so long” is intended as an explanation for how it is that “there's a lot of moves that we just know.” But the speaker in (3) chose not to use an explicit ‘because’ adverbial, although quite a few of those are available in English: because, since, for, as, on account of. The reason is that speakers sometimes wish to convey their message indirectly. This can account for the fact that many languages possess multiple causal/reason adverbials, most of which are motivated diachronically, since they can easily be seen as having originally triggered a causal implicature. Clearly, grammatical constructions too can be similarly motivated; consider, for example, the creation of what are now called reflexive pronouns. Initially, speakers mobilized an emphatic (complex: pronoun + self) referring expression to indicate what they rightly considered an unexpected/marked coreference relation (see section 2.3.2 below).
Finally, even an (initially) unintended meaning may become associated with some form, given our encyclopedic knowledge, which is automatically brought in when we interpret messages. Consider the reflexive construction, which often turns into a low-transitivity construction (e.g., French se laver, ‘wash oneself’ = ‘wash’ – Kemmer, Reference Kemmer1993). It so happens that self-inflicted actions tend to be associated with a low degree of transitivity (as defined by Hopper and Thompson, Reference Hopper and Thompson1980). They are often unintended (cf. I hit/cut myself with I hit/cut it), internal, rather than external (blame/consider oneself), etc. If that's the case, then even in the absence of an initial intention to convey a low degree of transitivity, interlocutors may come to associate the construction with low-transitivity activities.
The idea is that all contextual enrichments, no matter their source, may penetrate the grammar. The processes here briefly mentioned have one thing in common. Once many speakers start using the same explicit forms (e.g., back, after) as a basis for the same pragmatic inferences (a spatial relationship, a causal interpretation, respectively) a conventional code may be established between these forms and functions. The pragmatically derived meaning semanticizes, and becomes a coded meaning. Similarly, once the syntactic contexts in which reflexive forms tended to recur became a discourse pattern salient to speakers, a grammatical convention could set in, and in fact did set in in many languages. Since reflexive pronouns were very often used for a coreferent co-argument, this could very well explain the nature of Binding Condition A. And once the low transitivity often attributed to reflexive-marked actions becomes salient to speakers (not necessarily consciously) the construction may be reanalyzed as a low-transitivity or even an intransitive construction (cf. French se).
Researchers within this paradigm feel they can offer substantive explanations not only for why grammars of unrelated languages are so similar. They can also account for differences between languages, which the innateness assumption cannot (parametric variation is not enough to account for the rich, fine-grained variety found in the world's languages). The idea is that for the most part what we have are near universals, rather than precisely identical codes (Evans and Levinson, Reference Evans and Levinson2009). If grammar evolves when specific forms gradually come to be associated with specific functions, then differences between languages can be found where some language has already grammaticized some form/function correlation, but another hasn't (yet). Supporting evidence for this claim is the common finding that what is grammatically stipulated in one language is a possible, or sometimes even very common, discourse pattern in another (English after possibly triggers a ‘because’ interpretation, as in (3), but in Hebrew it has grammaticized). And if the Binding Conditions are setting in, they may only be optionally applied in some language (e.g., Old English, where personal pronouns regularly received bound readings), but be obligatory in another (Modern English). Another source of (slight) variation is the somewhat different translation of a pragmatic motivation into grammatical dress. For example, if speakers are indirectly conveying the ‘behind’ spatial relation using a body part, they may choose ‘back’ for triggering this initially pragmatic interpretation, but they may equally plausibly choose ‘buttocks’ for this purpose (this is the source of Hebrew ‘behind’). Such is also the variation between languages which use resumptive pronouns, as analyzed in Keenan and Comrie (Reference Keenan and Comrie1977).
Summing up, researchers within the historical and typological pragmatics paradigm deal with both codes and inferences, although their main interest lies in providing what they consider natural explanations for the universal tendencies in the typology and grammaticization of natural languages. The interest in pragmatic inferences is restricted to explaining the why and how of grammatical codes: the processes leading from inferences and represented discourse patterns to grammatical codes and the principles restricting typological variability. Core grammatical phenomena are here pragmatically motivated, and not only optional choices among equally acceptable forms (cf. form/function pragmatics).
2.3.2 Historical/typological pragmatics and reference
Binding relations are taken as a universal, in fact innate, linguistic concept by generative grammarians. However, not all languages have use for a dedicated marking of these relations. Historical/typological research seeks to explain how it is that such marking can evolve in real discourse. The assumption is that the pragmatic inference at work here is that co-arguments of the same verbs tend to stand for disjoint referents (e.g., Faltz, Reference Faltz1985, and see Ariel Reference Ariel2008a: 6.1.1 for supporting corpus statistics). If so, co-argument personal pronouns should preferably be interpreted as disjoint. However, since speakers do sometimes need to refer to coreferent co-arguments (e.g., Am I killing myself?, SBC: 015), more marked pronouns (what we now call reflexive pronouns) gradually evolved to indicate the marked coreference. Since it is co-argument coreference that is consistently marked, the resulting binding conditions grammaticized specifically around co-arguments (Reinhart and Reuland, Reference Reinhart and Reuland1993). Such an account can explain why the application of the binding rules was slow and gradual. Reflexive forms first appeared for more marked cases, where self actions (i.e., coreference) are less expected, as when destruction events are described (e.g., X destroyed herself). They appeared later for self actions which are more expected (e.g., X dressed her(self); Keenan, Reference Keenan, Moore and Polinsky2003).
Ariel (Reference Ariel, Barlow and Kemmer2000) has argued that the evolution of verbal person agreement markers out of personal pronouns (e.g., Hebrew shavar=t ‘broke=2nd pers. fem.’ from shavar at ‘broke you-fem.’) can be accounted for by Accessibility theory. Recall that Accessibility theory predicts that highly accessible mental representations are retrieved using high-accessibility referring expressions, which are phonetically attenuated. It stands to reason that the speaker and the addressee, both highly accessible referents in face-to-face interactions, would frequently be referred to by high-accessibility markers, quite often cliticized pronouns. A consistent pattern whereby first and second person pronouns are reduced may lead to their cliticization to the verb. This may pave the way for a reanalysis whereby the original pronouns are taken as first/second person verbal bound morphemes. This pragmatic explanation can also account for why typologists have found a cross-linguistic difference between first/second person and third person verbal agreement markers. The latter are far less commonly marked on verbs. Accessibility theory can account for this asymmetric paradigm by noting that unlike the speaker and the addressee, third person referents are not consistently highly accessible in discourse (see the statistics in Ariel, Reference Ariel, Barlow and Kemmer2000). Hence, third person pronouns do not get reduced consistently enough to trigger a gradual process of cliticization.
In sum, typological/historical analyses set out from pragmatically motivated pressures (the need to mark a marked coreference, the tendency to reduce referring expressions denoting highly accessible referents) and explain the grammaticized pattern as resulting from an unintended series of small and gradual changes, starting with optional choices, going through preferred discourse patterns, and ending with an entrenchment whereby the discourse pattern is translated into an obligatory grammatical convention. The focus for this research paradigm is on the crossing from pragmatics to grammar.
2.4 Competition within and across paradigms
We have now briefly surveyed three paradigms of pragmatics research. Inferential pragmatics focuses on drawing the grammar/pragmatics divide at the synchronic level, distinguishing between predominantly truth-conditional codes and pragmatic inferences. Form/function pragmatics focuses on extralinguistic factors, which nonetheless play a grammatical role, because they manifest a coded association with specific linguistic forms. And typological/historical pragmatics focuses on explaining the diachronic relationship between pragmatic inferences and evolving grammatical codes. To give the flavor of the three research paradigms we briefly mentioned a few proposals within each paradigm, mainly ones pertaining to discourse reference. But what is the relationship between these theories? Are all of them needed? Indeed, some theories within and across paradigms stand in conflict. For lack of space, this section again focuses on reference.
2.4.1 Competition within paradigms
The different accounts for discourse reference by the two inferential pragmatics theories mentioned above (the Relevance and neo-Gricean accounts) follow straightforwardly from the general differences between the theories. Typically, the Relevance accounts assign a more significant role to the overall interpretation of the utterance, governed by the single Principle of Relevance. They naturally emphasize how reference resolution is a by-product of this interpretative process. Levinson focuses more on (some) actual forms, and relies on the interaction between the various neo-Gricean Principles (I, Q) to explain referential patterns. The specific context plays less of a role for him, the interpretations being seen as GCIs.
Next, recall that both Ariel and Gundel et al. (Reference Gundel, Hedberg and Zacharski1993) offer form/function accounts for the use and interpretation of referring expressions. While Ariel views referring expressions as arranged on a scale, each indicating a relatively higher (or lower) degree of mental accessibility, Gundel et al.'s theory is much more precise in that it associates each referring expression with a specific cognitive status. I have elsewhere argued that when we examine the definitions given by Gundel et al. (Reference Gundel, Hedberg and Zacharski1993) for each cognitive status, it's no longer clear that they are indeed distinct (Ariel, Reference Ariel, Sanders, Schilperoord and Spooren2001). But a major problem with the Givenness hierarchy is that there simply aren't enough cognitive statuses to go around. Here's one such case:
(4)
REBECCA: .. put the newspaperi on his lap,
RICKIE: Y[eah],
REBECCA: Ø [mas]turbated,
and then lifted the paperi up, (SBC: 008)
It is no coincidence that the first mention of the newspaper is phonetically larger than the second one (this is a consistent finding). But under a six-category Givenness scale (Gundel et al.'s) there is no way to account for such delicate differences (between a full and a reduced definite NP, as well as between full and cliticized pronouns, etc.). An additional theory is needed, which could arguably be Accessibility theory. In fact, this is the point of Gundel et al. (forthcoming), where their Givenness hierarchy is shown to be orthogonal to Accessibility accounts (for the main part). If so, the fact that both theories offer form/function pragmatics accounts for referential forms does not automatically render either one of them redundant. It may well be that referential forms must meet both requirements, for example, that a referent marked by a demonstrative NP must be ‘familiar’, as well as relatively (but not maximally) accessible.
2.4.2 Competition across paradigms
I here cite two cases where the grammar/pragmatics division of labor is under debate. Both form/function pragmatic theories and historical/typological accounts compete with inferential pragmatics as to what to relegate to pragmatics and what to grammar. What is viewed as currently pragmatically inferable under an inferential pragmatics approach may be taken as a form/function code (Hebrew cliticized pronouns). Similarly, a distributional pattern may be analyzed as grammaticized by historical/typological theories, although it seems straightforwardly derivable by inferential pragmatic theories (some reflexive pronouns).
Consider the following Hebrew discourse, where the same entity (the pressi), is already very highly accessible (the first mention here is the sixteenth reference to it):

Given the availability of cliticized pronouns for Hebrew hem ‘they’, it is the avoidance of the shorter forms in the second and fourth references that is puzzling on any inferential pragmatic theory. Minimizing processing cost should have prompted the speaker to use the reduced pronouns throughout according to Relevance theory, for the same interpretation is made available by the two forms (see especially Reboul, Reference Reboul, Connolly, Vismans, Butler and Gatward1997). Levinson actually predicts a disjoint reading here, because the speaker avoided the most minimal form. Form/functionalists Ariel and Gundel et al., however, can account for the alternating uses of full and cliticized pronouns by noting the points at which the speaker switches from one to the other. Discourse connectives, such as ‘but’ and ‘another thing’ here, signal that a potential topic change may occur, thus reducing the accessibility of the current topic. Gundel et al. can similarly explain the data as a change from ‘in focus’ to ‘activated’.15 Hence, the differential preference for full and for cliticized pronouns in different contexts. In this case, it looks like the form/function approach is superior to the inferential approach.
Next, what is the status of the binding conditions regarding the use of reflexives? Levinson (2000: Chapter 4), König and Siemund (Reference König, Siemund, Frajzyngier and Curl2000) and Ariel (Reference Ariel2008b: Chapter 6) are all in agreement that a pragmatic interpretative pattern lies behind these grammaticized conventions (see again sections 2.3.1, 2.3.2). As an inferential pragmatist, Levinson points out that reflexives in different languages don't necessarily share the same grammatical/pragmatic status. For example, in some languages co-arguments can, but need not always be reflexive-marked. Obviously, the pattern is only pragmatic in these languages. But the picture is more complicated than that. Even for a single language, sub-patterns may be either pragmatic or grammatical. For example, according to Ariel's (Reference Ariel2008b: 6.3) grammaticization analysis, some adjuncts obligatorily take reflexive forms (despite herself, with himself) when coreferential with a clause-mate antecedent, although other adjuncts only manifest a pragmatic preference for the extension of the Binding convention beyond co-arguments (except for him(self), picture of her(self), jokes about him(self)). Other sub-constructions obligatorily take a pronoun, even though an obligatorily coreferential co-argument is involved (He didn't have any spots on him(*self/*her)). Although this pattern is an exception to the general Binding principle, and although it is clearly pragmatically motivated (according to the pragmatic motivation here, unmarked coreference is not in need of special marking), a grammatical convention is nonetheless involved (note the unacceptable reflexive form above). In other words, a competition between pragmatic and grammatical analyses is at work not just for the general distributional pattern applicable to the language as a whole, but also at much lower-level generalizations within the same language.
Different pragmatic theories certainly compete for the best account, whether within (2.4.1) or across (2.4.2) research paradigms. While researchers have mainly engaged in intra-paradigm debates, it's time for cross-paradigmatic debates and collaborations to take the stage in pragmatic research.
2.5 Inferential, form/function, and historical/typological pragmatics too
Surprisingly perhaps, there is not much interaction between researchers subscribing to different research paradigms. The goal of this section is to prompt all researchers to open up to the idea that their research must take into consideration questions addressed by other research paradigms. The three paradigms, I argue, complement each other in crucial ways.
2.5.1 Reference in three keys
We've briefly looked at a number of competing pragmatic accounts for the use and interpretation of referring expressions. Although some accounts will ultimately have to be rejected, the reality is that we need all three approaches in order to fully account for natural language reference systems. Each approach provides some of the relevant pieces needed to complete the great grammar/pragmatics puzzle. Form/function pragmatics codes (which may turn out to be neither Gundel et al.'s nor Ariel's) are needed to account for the basic conventions informing reference marking and resolution. But, of course, codes never exhaust actual use. This is true for classical semantic codes, and it is equally true for grammatical pragmatic codes. This is where inferential pragmatic accounts must come in. For example, those two idiots (in (6)) seems to be too low an accessibility marker for an entity just now mentioned by pronominal they. The violation of Accessibility theory can, then, be explained as a special inferred use (epithet), where the speaker's goal is not just to refer, but at the same time to also predicate on the referent:
(6) .. So theyi go barging in on ∼Mar.
.. So Mom felt obligated,
to ask those two idiotsi to lunch (SBC: 006).
The same applies to the reference resolution cases analyzed by Kempson (Reference Kempson, Newmeyer and Robins1988) and Wilson (Reference Wilson1992) (see section 2.1.2). Finally, we need an account for why and how the pragmatics/grammar divide is crossed (e.g., how pronouns evolve into verbal person agreement markers). It is the discourse profiles analyzed by historical and typological pragmatists that constitute potential grammaticization paths. Such theories bridge the gap between temporary integrations between inferred and coded meanings (on-line conveyed meanings) and permanent (grammaticized) form/function correlations. To account for the full range of use and interpretation of referring expressions we therefore need the three different pragmatic approaches, even if some (or possibly all) current theories are factually incorrect. Different research paradigms are needed for handling the different aspects of the grammar and pragmatics of reference. The same applies to all other linguistic phenomena.
2.5.2 Conclusion: The value of multiple paradigms in pragmatics
Each of the pragmatic approaches here surveyed has offered a significant amendment to the limited classical code model of language. The main point of the (originally Gricean) inferential pragmatics critique of the code model was that truth-conditional codes cannot exhaust speakers’ intended on-line meanings, because ad-hoc contextual inferences are generated in addition. According to form/function pragmatists, the classical codes fall short of exhausting all grammatically specified conventions, because not all codes are truth-conditional. Finally, contra generative grammarians, historical/typological pragmatists argue that classical codes cannot account for possible versus impossible natural language grammars. Since each approach finds different flaws in the classical code model, no wonder they each enrich it in a different direction. But it cannot be emphasized enough that these different research directions are not at all contradictory. Quite the contrary. They complement each other. Here are some thoughts on how they could each gain from interactions with the other approaches.
The most basic pragmatic approach, inferential pragmatics, has a strong preference for maximizing pragmatics and minimizing grammar. Anything that can be analyzed as pragmatic inference must so be analyzed. However, once researchers consider form/function pragmatics findings regarding potentially conventional associations between constructions and non-truth-conditional meanings, some analyses in terms of inferences may be relegated to form/function pragmatics within the grammar. Why shouldn't perfectly plausible inferences get entrenched in an automatic, subconscious process? For example, perfectly general inferential pragmatics attempts to explain the use and interpretation of cleft constructions by reference to their compositional meaning accompanied by pragmatic inferences (Wilson and Sperber, Reference Wilson, Sperber, Oh and Dinneen1979) must give way to a form/function pragmatic analysis, where a conventional interpretation is associated with each of the constructions (see section 2.2.1 above). Indeed, Blakemore (Reference Blakemore2002) is a clear example of a Relevance theoretician who has consistently used form/function (procedural) analyses for discourse connectives (and see Ariel, Reference Ariel2010: Part III).
The same is true for the relevance of historical/typological pragmatics findings for inferential pragmatics, specifically, the recognition that inferences may gradually turn into grammatical codes. For example, if GCIs are normal or default inferences associated with specific forms, they are potential semanticization cases (Traugott and Dasher, Reference Traugott and Dasher2002). Indeed, Levinson's (Reference Levinson2000: Chapter 4) is an analysis in this spirit (and see Ariel, Reference Ariel2008b: Part II). At the same time, the fact that the GCIs most discussed in the field (e.g., scalar implicatures, and-associated implicatures) do not seem to semanticize in any language might mean that the inferences are not after all as “normal” or default. Why is that? Such questions will naturally arise once cross-paradigmatic discussion (and debate) become more commonplace.
Next, form/function pragmatists tend to automatically assume that the correlations they find between specific linguistic forms and non-truth-conditional meanings or use conditions are conventional, and hence grammatical. But it is not clear that this is invariably the case. Such pragmatists would do well to integrate the code/inference division of labor with form/function analyses. While, no doubt, many of the correlations they discuss are encoded for specific constructions and forms, others may be better accounted for as inferences (see Ariel, Reference Ariel2010: Chapter 7 for such analyses). Du Bois’ (Reference Du Bois and Tomasello2003) distinction between sign functions (codes) and structure functions, which merely allow certain functions to be fulfilled (with the help of inferencing), should prove useful here. Especially now that less marked constructions are increasingly analyzed for their functions (following Construction Grammar), it is becoming clear that there is no one-to-one relationship between syntactic constructions and discourse functions: often a few functions are associated with a single construction, only some of which are encoded. Form/function pragmatists tend not to problematize such questions about the code/inference division of labor (Goldberg, Reference Goldberg2006: Chapter 8).
Finally, historical/typological pragmatics research too could benefit from integrating inferential pragmatics theories and questions. Proponents must ascertain that they only assume very small steps of grammaticization, each of which is analyzable as depending on a reasonable on-line pragmatic inference. Etymological analyses can serve as excellent pointers to potential paths of grammaticization, but they cannot replace detailed analyses of the actual inferential steps leading from one stage to the next. There is a danger in only noting common source–target pairs (Heine and Kuteva, Reference Heine and Kuteva2002), because these are more often than not connected by a whole series of very small changes. For example, it is a bit misleading to claim that expressions denoting small quantities, such as French pas ‘step’ and Hebrew klum ‘something’, evolve into negators or negative polarity items, because the change never links the initial point and the endpoint directly. Indeed, this is why some of Sweetser's (Reference Sweetser1990) analyses of metaphoric changes have been reanalyzed by Bybee et al. (Reference Bybee, Perkins and Pagliuca1994) and Traugott and Dasher (Reference Traugott and Dasher2002) as processes whereby enrichment inferences have been semanticized. The same applies to the very valuable typological clines and semantic maps (Croft, Reference Croft2001; Haspelmath, Reference Haspelmath and Tomasello2003; Kemmer, Reference Kemmer1993): semantic changes must be shown to have evolved out of theoretically justified on-line pragmatic inferences.
We can also envision further sophistication in explaining the processes leading from pragmatics to grammar by reference to inferential pragmatics theories. Ever since Grice proposed that conversational implicatures can semanticize, the assumption has been that it is implicatures that sometimes turn semantic. Traugott has argued that it is specifically generalized invited inferences (she prefers this term over GCIs), rather than particularized inferences, that are potential semanticization cases. But what about the Relevance-theoretic explicated inferences (see the definition in section 2.1.1 above)? Given that explicated inferences are closer to semantic meanings than PCIs and GCIs (explicated inferences contribute truth-conditional aspects; they are inseparable from the linguistic meaning – Recanati, Reference Recanati and Davis1989), I have suggested that it may be explicated rather than implicated inferences that directly semanticize. PCIs must go through an explicated stage before they semanticize (Ariel, Reference Ariel2008a, Reference Ariel2008b).16 Last, some historical pragmatists have relied on the form/function pragmatics practice of distinguishing truth-conditional from non-truth-conditional meanings. Since some historical changes yield non-truth-conditional functions (as with many discourse markers), they have defined these as cases of pragmaticization, rather than semanticization (Erman and Kotsinas, Reference Erman and Kotsinas1993). Given that the processes concerned here turn a pragmatic pattern into a grammatical convention, the distinction is probably not justified. Greater attention should then be paid to the role of inferential pragmatics theories.
Summing up, while theories within the same research paradigm addressing the same linguistic phenomenon compete with each other (e.g., neo-Gricean and Relevance-theoretic analyses of scalars and and), theories from different research paradigms for the most part complement each other. Precisely because each focuses on a different aspect of linguistic form and use, all three paradigms are necessary for a complete picture of grammatical forms and of what should be assigned pragmatic status. Our main (but all too brief) test cases were competing, as well as complementary, referential theories. I've argued that form/function pragmatic theories (e.g., Gundel et al., Ariel) should account for the conventional interpretations associated with various referring expressions, inferential pragmatic theories (e.g., Relevance, neo-Gricean theories) for the many referential interpretations mediated by inference (e.g., bridged coreference, type versus token readings), and historical/typological pragmatic theories for dominant discourse tendencies regarding the actual use of referring expressions (e.g., marked coreference by marked forms), some of which turn grammatical, at least in some languages.
What the field of pragmatics, indeed of linguistics, needs now is not so much, or not only, inter-paradigmatic debates about who got it right and who got it wrong. More fruitful and exciting insights will be gained from cross-fertilization and cooperative research between proponents of different paradigms. It takes all pragmatic keys to fine-tune the grammar/pragmatics division of labor.17
Funding for this research was received from the Israel Science Foundation, grant #161–09.
3 Saying, meaning, and implicating
You make a few distinctions. You clarify a few concepts. It's a living.
A speaker can say something without meaning it, by meaning something else or perhaps nothing at all. A speaker can mean something without saying it, by merely implicating it. These two truisms are reason enough to distinguish saying, meaning, and implicating. And that's what we'll do here, looking into what each involves and how they interconnect. The aim of this chapter is to clarify the notions of saying, meaning, and implicating and, with the help of some other distinctions, to dispel certain common misunderstandings.
Paul Grice famously developed accounts of what it is for a speaker to mean something and to implicate something. His basic idea was not new, as this oft-quoted passage from Mill illustrates:
If I say to any one, “I saw some of your children today”, he might be justified in inferring that I did not see them all, not because the words mean it, but because, if I had seen them all, it is most likely that I should have said so: even though this cannot be presumed unless it is presupposed that I must have known whether the children I saw were all or not. (Mill Reference Mill1867: 501)
Not only did Mill appreciate the phenomenon of what, thanks to Grice, has come to be known as conversational implicature; in this passage he also points to the importance of distinguishing what is meant by the words a speaker utters from what the speaker means in uttering them. This is perhaps the distinction most basic to pragmatics.
So we have the distinction between linguistic and speaker's meaning, as well as the three-part distinction between saying, meaning, and implicating, as done by a speaker. Why fuss over these distinctions? The main reason is to identify the sorts of information that speakers (or writers) make available to their listeners (or readers), the sorts of intentions that speakers have in so doing, and the means by which this information is made available to or is inferable by the hearer from the fact that the speaker did what she did. We do not use psychokinesis to make ourselves understood or telepathy to figure out what others mean. We rely primarily on the meanings of the words we utter or hear. They carry information and we, as speakers of the same language, share this information and mutually presume that we share it. But we do not rely solely on linguistically encoded information. In communicating to and understanding one another, we rely also on general background information and on specific information about the situation in which the utterance is taking place. Importantly, this includes the very fact that the utterance, that utterance, is being made. As speakers aiming to communicate things, we choose to utter bits of language that make our communicative intentions evident to our hearers. We do so with the tacit expectation that the package of linguistic and extralinguistic information associated with our utterance will enable our listeners to figure out what we mean. Correlatively, as hearers, we rely on what we presume to be the very same information, both linguistic and extralinguistic, to figure out what the speaker means.
In the first three sections we will take up saying, meaning, and implicating, respectively. Our initial discussion of saying will be brief, serving mainly to explain how saying, in the sense tied to linguistic meaning, contrasts with (speaker) meaning and implicating. The discussion of speaker meaning will focus on its two main features, one due to Grice and one due to his critics. Grice's ingenious idea was that in meaning something a speaker has a special sort of hearer-directed intention, which he sometimes called a reflexive intention, because part of its content is that the hearer recognize this very intention. She succeeds in communicating if he does recognize it (from now on, when using pronouns for a pair of interlocutors, I will use “she” for the speaker and “he” for the hearer). As for implicating, it is a case of meaning something without saying it. Grice proposed an extraordinarily influential account of how this works, at least when communication succeeds and the conversational implicature is recognized, by proposing a Cooperative Principle and certain conversational maxims subordinate to it.
Grice's account, as influential as it has been, has also been widely misunderstood and even misrepresented. In section 3.4 we will identify the main misconceptions and thereby clarify just what he was claiming or, in some cases, should have claimed. In section 3.5 we will consider several complications to the distinction between saying, meaning, and implicating, including the phenomena of conventional implicature and conversational impliciture (as opposed to implicature), and, in light of these phenomena and in the face of certain popular objections, modify our notion of saying.
3.1 Saying and what is said
The verb “say” has a variety of everyday uses. We speak not only of speakers saying things but also of sentences, signs, and even clocks saying things. Even limited to acts by speakers, “say” has a range of common uses. On one end of that range, it denotes the act of uttering (a sentence, typically) and, on the other end, acts of stating or asserting (a proposition). Acts of the former sort are reported by direct quotation, of the form “S said ‘…’,” and acts of the latter sort by reports of the form “S stated/asserted that p,” where “p” denotes a proposition. Given that we have these other verbs and given that stating or asserting something entails meaning it (not that this in turn entails believing it), it makes sense to reserve the term “say” for the in-between act that is reported by indirect quotation, with sentences of the form “S said that p,” assuming that what is said is a proposition.
The notion of saying, along with the correlative notion of what is said, comes into the picture for a very simple reason: a speaker can say one thing while meaning something else. She could mean something instead of what she says, or she could mean something in addition to what she says. Indeed, a speaker can say something without meaning anything at all, as in recitation or translation. Acts of saying, in the sense in which we will be using the term, correspond to Austin's (Reference Austin1962) notion of locutionary act. Performing a locutionary act goes beyond merely producing certain sounds, even as belonging to a certain language. On the other hand, it must be distinguished from both the illocutionary act of doing something in saying something and the perlocutionary act of doing something by saying something. To perform a locutionary act is to utter a sentence “with a more or less definite sense and a more or less definite reference” (Austin Reference Austin1962: 93). To be sure, these different categories of speech are abstractions from the total speech act. It is not as though in uttering a sentence a speaker is performing a series of acts. Rather, in uttering, say, “I love turnips,” a speaker would be saying that she loves turnips, probably asserting that she loves turnips, and perhaps wanting and maybe even getting her audience to want to try some.
Grice's stipulated sense of “say” is not quite the same as Austin's. He writes, “I intend what someone has said to be closely related to the conventional meaning of the words (the sentence) he has uttered” (Grice Reference Grice, Cole and Morgan1975/1989: 25). Assuming that what is said must be a unique proposition, he required further that any semantic ambiguities be resolved and references be fixed. So far this sounds like Austin's notion of locutionary act, although, curiously, Grice did not connect his notion with his former teacher's (indeed, as we will see in the next section, Grice's analysis of speaker meaning could have benefited from taking into account Austin's distinction between illocutionary and perlocutionary acts). However, unlike Austin, Grice required that saying something entails meaning it. Otherwise, one merely “makes as if to say” it. This requirement seems odd (it conflicts with the first of our opening truisms), since it entails that one doesn't say anything unless one means it. There is a sense in which that is true, the sense in which “say” is synonymous with “state.” Indeed, in Grice's (Reference Grice1961) preliminary account of implicature, the preferred verb was “state,” not “say.” In my opinion, Grice's main reason for insisting on this stronger sense of “say” was that it supported his controversial view (proposed in Grice Reference Grice1968), not to be discussed here, that what expressions mean in a language ultimately comes down to what speakers mean in using them. In any case, surely there's a perfectly good sense in which one can say something without meaning it.
What is the rationale for adopting a locutionary notion of saying and the correlative notion of what is said? The point of tying what is said closely to the conventional meaning of the uttered sentence is to limit it to information carried by that sentence. We can think of what is said as, in effect, the interpreted logical form of the sentence. Grice's reason for requiring resolution of ambiguity is to further limit what is said to the sense of the sentence that is operative in the speaker's act of uttering it. Otherwise, whenever there is ambiguity (often!), multiple things would be said. Presumably it is the speaker's semantic intention that does the disambiguating. This intention determines what she takes her words to mean as she is using them, and is distinct from her communicative intention, which determines how she intends her audience to take her act of uttering those words.
As for fixing reference, in cases involving indexicals (including pronouns, certain temporal and locational adverbs, and tense), the point is more subtle. With them we need to distinguish their meaning from their reference and take into account the fact that it is the reference, not the meaning, that figures in what is said. The meaning helps pin down the reference but is not itself part of what is said. So, for example, if I utter the sentence “I love turnips,” I thereby say that I love turnips. I do not say that the current utterer of this sentence loves turnips. After all, what I said, that I love turnips, could be true (not that it is true) even if I hadn't uttered the sentence. The meaning rule, that “I” as used by a given speaker on a given occasion refers to that speaker, is not part of what I say, or would say, if I were to utter “I love turnips.” An analogous point applies to the use of the present tense. The general idea here was developed by Kaplan (Reference Kaplan, Almog, Perry and Wettstein1989a), who proposed that the character of an expression determines the content of the expression relative to a given context of use. The character is a meaning rule that provides for how this content, the expression's reference, is determined in the context. Obviously the rule for “I” is different from, for example, the rule for “you” and the one for “yesterday.”
There is an ongoing debate in philosophy regarding the range of expressions whose reference is literally determined, according to a meaning rule, as a function of their context of use. The primary question at issue is whether it is really the context of use, as opposed to the speaker's referential intention, that determines the reference. We will not pursue this issue here. Suffice it to say that there seems to be a basic difference between what determines the reference of terms like “she” and “that” as opposed to the reference of terms like “I” and “today.” Arguably, the difference is great enough to justify Strawson's (Reference Strawson1950) contention that speakers, not expressions, refer (see Bach Reference Bach, Lepore and Smith2006d for discussion of the question “What does it take to refer?”).
The above niceties aside, both ambiguity and indexicality are different ways in which linguistic meaning does not determine speaker meaning even if the speaker is being completely literal. The meaning of an ambiguous sentence underdetermines what a speaker could mean in using it literally, obviously because the sentence has too many meanings, i.e. more than one (a speaker can mean more than one thing by trading on an ambiguity, as with puns, but this is an exceptional case). The case of indexicality is different. When we use an expression like “I,” “tomorrow,” “she,” or “that,” the meaning of the expression merely constrains what we can mean in using it literally. Indexicality is like ambiguity insofar as in both cases linguistic meaning limits but does not fully determine what one can mean in speaking literally. They differ, however, in that with ambiguity there is too much linguistic meaning and with indexicality there is too little.
As we will see later on, following our discussions of meaning and implicature, ambiguity and indexicality are not the only ways in which linguistic meaning can underdetermine literal speaker meaning. It will emerge that finding a suitable notion of saying, together with the correlative notion of what is said, is not as straightforward as Grice supposed, and not just because he needlessly required that to say something entails meaning it. But what is it for a speaker to mean something?
3.2 Speaker meaning
Grice (Reference Grice1957) contrasted “natural” meaning with the sort of meaning (“non-natural,” he called it) involved in language and communication. Smoke means fire because it is naturally correlated with fire, but the word “smoke” means smoke by virtue of being conventionally correlated with smoke. Smoke means fire in the sense of indicating the presence of fire, and that is because it is correlated with fire. However, the word “smoke” is not correlated with the presence of smoke. It is a conventional means for talking about smoke, whether or not smoke is present. Its meaning is a matter of convention, since it could just as well have meant something else (and some other word could just as well have meant smoke). It means smoke because, and only because, speakers normally use it to mean that and expect others who use it to mean that as well.
So within the category of non-natural meaning there is both linguistic meaning, in this case what “smoke” means, and speaker meaning, here what a speaker means in using that word. Take an example. Suppose a speaker utters the sentence, “I smell smoke,” using the pronoun “I” to refer to herself, the verb “smell” (in the present tense) for olfactory sensing, and the noun “smoke” to refer to smoke, presumably some that is nearby. Given what these words mean and how, as syntactically determined, these combine to comprise the (linguistic) meaning of the entire sentence, the semantic content of the sentence, relative to that context, is that she (the speaker) smells smoke. This could well be what she means in uttering that sentence, but it might not be. For she could have been speaking figuratively, in which case she would have meant something else. She could have meant, regarding something her interlocutor had just said, that he was trying to divert her attention from what was at issue. On the other hand, she could have been speaking perfectly literally. Even then, it is one thing for the sentence to mean something and another for a speaker to mean that in uttering it.
3.2.1 Meaning intentions
We will not delve into the long-debated meta-semantic question of what it is for an expression to have meaning, except to note one aspect of that question: which is more basic, expression meaning or speaker meaning? This question was important to Grice, who held not only that speaker meaning is more basic but that expression meaning ultimately reduces to speaker meaning (Grice Reference Grice1968). His was a controversial version of the relatively uncontroversial view that semantics reduces to psychology.
Our question is what it is for a speaker to mean something, whether in using language or in doing something else. It is not merely to produce some effect in one's audience; there are lots of ways of doing that. At the very least, meaning something must be intentional. Besides, there are different ways in which one can intend to produce some effect on others, and most of them are not just matters of successful communication. In communicating something, one has a special sort of intention and intends to produce a special sort of effect.
What is special about the intention? Part of it is that one intends one's audience to recognize the very effect one is trying to produce in them. Moreover, as Grice (Reference Grice1957) argued, one intends to produce that effect precisely by way of their recognizing that intention. This is the gist of Grice's ingenious idea that the special sort of intention involved in meaning something, in trying to communicate something, is in a certain sense reflexive.
Think about what is involved in communicating. You have a certain thought and you wish to “get it across” to someone, so your intention to convey it must not be hidden. Your intention will not be communicative if you intend the hearer to think a certain thing without their taking into account that you intend them to think it. For example, if you yawningly say “I am sleepy,” your intention that they think you are sleepy is not essential to their coming to think that you are sleepy – your yawning manner of speech will do. Of course, they will recognize that you intend them to think that you are sleepy. However, because of how you said it, their recognizing this would not have been necessary. Indeed, they would think that you are sleepy even if you had said something completely different, provided you said it yawningly. In some cases, recognizing your intention may even vitiate it, for example, if you make some self-deprecating remarks in order to get your listener to think you are modest: your intention that they think this won't be fulfilled if they recognize your intention that they think this. But even in our first example, in which recognizing the speaker's intention doesn't interfere with its fulfillment, the hearer's recognition of it is not needed for its fulfillment.
Such examples suggested to Grice that for an intention to be communicative, it must be overt in a specific sort of way. It must be intentionally overt and this feature must play a special role in its fulfillment. That is, in trying to communicate something to others by saying something, a speaker intends the audience to recognize that intention partly by taking into account that they are intended to recognize it. Because this is part of what the speaker intends, communicative intentions are distinctively self-referential or reflexive. A speaker means something by her utterance only if she intends her utterance “to produce some effect in an audience by means of the recognition of this intention” (Grice Reference Grice1957/1989: 220). However, not just any sort of effect will do.
3.2.2 The intended “effect”
If you are communicating something to someone, communicative success does not require that they respond as you wish, such as to believe you, obey you, or forgive you. As Searle pointed out, these are perlocutionary effects, the production of which goes beyond merely communicating (1969: 47). It is enough, as Strawson similarly argued, for the hearer to understand the utterance (1964b: 459), that is, for the speaker to achieve uptake (Grice later (Reference Grice1989: 351–2) objected to this, but did not make clear why). For that the hearer must identify the attitude the speaker is expressing – believing, intending, regretting, etc. – and its content, but the speaker can succeed in communicating even if she does not actually possess that attitude. For example, she can convey an apology without actually having the regret she expresses and without the hearer believing she possesses it. If the speaker says, “I’m sorry I broke your vase,” to succeed in communicating the apology it is enough that the hearer take her to be expressing her regret. This is clear from the fact that the hearer might understand the apology as such even if he doubts that the speaker regrets breaking the vase.
So, it seems, the intended “effect” required for meaning something, for communicating, is for the hearer (or reader) to recognize one's communicative intention. Achieving any further effect, such as being believed or being obeyed, goes beyond communicating successfully. The purely communicative effect is just having one's utterance understood. Bach and Harnish distinguish expressing an attitude (a belief, desire, regret, or whatever) from actually possessing it or at least intending the hearer to think one possesses it. According to their definition, “to express an attitude is reflexively to intend the hearer to take one's utterance as reason to think one has that attitude” (Bach and Harnish Reference Bach and Harnish1979: 15). Accordingly, communicating successfully, being understood, consists simply in having the expressed attitude recognized. It does not require the hearer to respond in any further way, not even to think one actually possesses the attitude. As they say, “the fulfillment of a communicative intention consists simply in its recognition” (ibid.). By isolating the purely communicative effect of an act of utterance, this formulation makes sense of Grice's idea that speaker meaning essentially involves a reflexive intention.
3.2.3 Reflexive paradox?
Now that we have identified the intended effect specific to communication, we can return to Grice's original characterization of the intention itself. After describing it as the intention “to produce an effect in an audience by means of the recognition of this intention,” Grice comments, “this seems to involve a reflexive paradox, but it does not really do so” (1957/1989: 219). It seems to because the intention is self-referential. Moreover, there seems to be something circular about the hearer's inference. After all, the hearer is to identify the speaker's intention partly on the supposition that he is intended to. But is there anything really paradoxical about this?
A reflexive intention is not a series of intentions, each referring to the previous one. Not appreciating this has led to considerable confusion, even on Grice's part. Indeed, earlier in the very paragraph just quoted from, he gives an alternate formulation that requires that a speaker “must also intend his utterance to be recognized as so intended” (1957/1989: 219). Grice (Reference Grice1969), in trying to improve upon his earlier formulation, explicitly abandons reflexive intentions in favor of iterative intentions. So had his critic Strawson (Reference Strawson1964b), and so would his defender Schiffer (Reference Schiffer1972). Their ever more complex formulations, each prompted by counterexamples to the previous formulation, start with an intention to convey something and a further intention that the first be recognized, itself accompanied by a still further intention that it in turn be recognized, and potentially so on ad infinitum. No wonder Grice was eventually led to reject the whole idea and suggest that what is needed instead is the absence of a “sneaky intention” (1989: 302). Sticking with self-referential intentions avoids this complexity and the threat of an infinite regress. For there was nothing wrong with Grice's original idea, assuming the intended effect is properly characterized, as above. It does not lead to the reflexive paradox that worried Grice.
The semblance of reflexive paradox in Grice's original formulation arises from the key phrase “by means of the recognition of this intention.” This might suggest (and has suggested to some) that to understand the speaker the hearer must engage in some sort of circular reasoning. It sounds as though the hearer must already know what the speaker's communicative intention is in order to recognize it. But that misconstrues what the hearer has to take into account in order to recognize the speaker's intention. The hearer does not infer that the speaker means a certain thing from the premise that the speaker intends to convey that very thing. Rather, he operates on the presumption that the speaker, like any speaker, intends to communicate something or other. The hearer takes into account this general fact, not the content of the specific intention, in order to identify that intention.
3.3 Conversational implicature
Grice is best known, in both linguistics and philosophy, for his theory of conversational implicature. It was sketched in a section (III) of his 1961 paper, and developed in his William James Lectures at Harvard in 1967, which were subsequently published individually in disparate places and eventually collected as Part I of the posthumous Grice Reference Grice1989. The main ideas are laid out in “Logic and Conversation” (Grice Reference Grice, Cole and Morgan1975/1989), which, from what I have been able to ascertain from Google Scholar, is the most cited philosophy paper ever published. Grice's basic idea was not new, although his name for it was. What distinguished his work from previous work on “contextual” or “pragmatic” implication (see Hungerland Reference Hungerland1960) was his ingenious account of how it works (it also served as an antidote to the excesses of ordinary language philosophy, in ways chronicled in Chapman Reference Chapman2005). This account was essentially an extension of his theory of speaker meaning, but what made it original, as we will see, was the role of his “Cooperative Principle” and the various “maxims of conversation” that fall under it.
In Grice's view one can mean something either by saying it or by saying (or “making as if to say”) something else. What one implicates by saying something is generally not implied by what one says. That is why Grice used the verb “implicate” rather than “imply” and the neologism “implicature” rather than “implication.” For example, suppose you are asked about a dinner you had at an expensive restaurant, and you reply, “It didn't make me sick.” Your saying this implicates that it was not very good. However, what you said obviously does not imply this. After all, a dinner that does not make you sick can still be excellent. However, it is possible for what is implicated to be implied. If you are asked whether you have more than two children and you reply that you have three girls and a boy, what you say implies the very thing that you implicate, namely, that you have more than two children. That is because you also mean, albeit indirectly, that you have more than two children. There are many other things implied by what you say that you do not mean, hence do not implicate, for example that you have more than one child and that you have more than two girls (much of Davis's (1998) and other critiques of Grice assume that he did not require that implicatures be things speakers mean).
The mediocre meal example illustrates Grice's observation that conversational implicatures are cancelable – you could have added, “I don't mean to suggest that the meal wasn't great,” without taking back your assertion that it didn't make you sick. In fact, there are circumstances in which the implicature (that the dinner was not very good) would not have been in the offing in the first place. Suppose that you and your interlocutor had just learned that there had been an outbreak of food poisoning at the restaurant in question. In that case, your saying that the meal didn't make you sick would not implicate anything about its culinary quality.
How can a speaker implicate something that is not implied by what she says and still manage to convey it? She can do this by exploiting the fact that the hearer presumes her to be cooperative, in particular, to be speaking truthfully, informatively, relevantly, and otherwise appropriately. If taking the utterance at face value is incompatible with this presumption, the hearer, still relying on this presumption, must find some plausible candidate for what else the speaker could have meant. And, crucially, the speaker must intend him to do this. In the case of the speaker asked about dinner, the hearer must figure out what she meant, relying on the presumption that she intended it to be an accurate, informative, and appropriate answer to the question. In effect what the hearer does is, on the presumption that the speaker is being cooperative, to find a plausible explanation for why she said what she said.
3.3.1 The Cooperative Principle and the maxims of conversation
Grice systematized these ideas by formulating an overarching Cooperative Principle and four sets of subordinate maxims of conversation (Grice 1975/1989: 26–7):
COOPERATIVE PRINCIPLE: Make your conversational contribution such as is required, at the stage at which it occurs, by the accepted purpose or direction of the talk exchange.
QUALITY: Try to make your contribution one that is true.
1. Do not say what you believe to be false.
2. Do not say that for which you lack evidence.
QUANTITY:
1. Make your contribution as informative as is required (for the current purposes of the exchange).
2. Do not make your contribution more informative than is required.
RELATION: Be relevant.
MANNER: Be perspicuous.
1. Avoid obscurity of expression.
2. Avoid ambiguity.
3. Be brief. (Avoid unnecessary prolixity.)
4. Be orderly.
We will not fuss over the precise meanings of these maxims, except to note one quirk about the wording of the sub-maxims of Quality. Like the Cooperative Principle itself, most of the maxims and sub-maxims concern a speaker's “conversational contribution.” However, the sub-maxims of Quality specifically concern what (not) to say. This was probably just a slip on Grice's part, since if these two sub-maxims did not constrain what the hearer can plausibly take the speaker to be implicating, they would not motivate an inferential strategy aimed at correcting the appearance, due to what the speaker says, that what she means is false or unwarranted.
There are questions one could ask about the precise formulation of Grice's maxims and about whether the list is incomplete or, for that matter, overly elaborate (for discussion see Harnish 1976 and Grice 1989: 368–72). One might wonder, for example, about what happens when maxims clash, that is, when applying different maxims gives different candidates for what a speaker might be implicating. A common objection to Grice's account is that it is not adequately predictive and, indeed, that different social situations or cultural norms call for different formulations. Such worries presuppose that Grice intended his account to explain precisely how a hearer figures out what the speaker is implicating and, for that matter, how a speaker manages to come up with something to say that will make evident what she means in saying it. These are psychological questions, far beyond the capacity of current psychology to answer.
Philosophically, the important point is that, whatever the particulars, even though what we mean cannot in general be read off of what we say, we as speakers are pretty good at making our communicative intentions evident and that as hearers we are pretty good at identifying such intentions. Grice's primary insight was that unless communication were a kind of telepathy, there must be rational constraints on speakers’ communicative intentions and corresponding constraints on hearers’ inferences about them. As the examples below will illustrate, Grice's maxims point to the sorts of considerations that speakers intend hearers to take into account, and hearers do take into account, if communication is to succeed, not that it always does. With this in mind Bach and Harnish suggest that the maxims are better viewed as presumptions (1979: 62–5), which hearers rely on to guide their inferences as to what speakers mean. So, when a presumption seems not to be in force, a hearer seeks an interpretation of the speaker's utterance such that it does apply after all, and interprets it partly on the supposition that she intends him to. Bach and Harnish propose to replace Grice's CP, the vague and rather unrealistic Cooperative Principle, with a Communicative Presumption: when people speak, presumably they do so with identifiable communicative intentions (1979: 12–15). After all, conversations need not be cooperative – people often argue or have conflicting aims – but successful communication is still generally necessary for whatever else takes place.
3.3.2 Examples
The following examples illustrate how hearers compensate for apparent violations of the different maxims (or, if you prefer, apparent suspensions of different presumptions). If a speaker says something that is obviously false, thereby flouting the first maxim of quality, she could well mean something else. For example, with (1) she would probably mean the opposite of what she says, with (2) something less extreme, and with (3) something more down to earth.
(1) George W. Bush was the most intellectual president in American history.
(2) I could have eaten a million of those chips.
(3) He bungee‐jumped from 85% approval down to 40%, up to 60%, and down to 15%.
In these cases, respectively, of irony, hyperbole, and metaphor, it should be evident what a speaker is likely to mean, even though it is not what she says. Notice that it is possible, however unlikely, that the speaker does mean what she says, but then her communicative intention would be unreasonable, since she could not reasonably expect the hearer to figure this out. It is important to remember that it is one thing for a speaker to mean/implicate something and another thing for the hearer to figure out what she means/implicates, that is, for the utterance to be communicatively successful.
With quantity implicatures a speaker typically means not just what she says but also that she does not mean something stronger. It is her not saying the stronger thing that conveys that she is not in a position to assert it (note that the speaker may implicate instead that she is unwilling to assert something stronger). Consider these examples:
(4) Barry tried hard to lift the 300‐lb barbell.
(5) He thought he was strong enough to lift it.
(6) He had lifted the 250‐lb barbell three times.
(7) Barry finished his workout with a swim or a run.
Keeping in mind that speakers, not sentences, implicate things, we have to imagine uttering such sentences or hearing them uttered in particular contexts in order to get clear cases of implicature, quantity implicature in this case. In uttering (4) you would likely implicate that Barry failed to lift the 300-lb barbell. Otherwise, you would have said that he succeeded. Similarly, with (5) you might implicate that he wasn't sure that he could lift it. With (6) you would probably implicate that he didn't lift the 250-lb barbell more than three times. And, finally, in uttering (7) you would implicate that Barry went for either a swim or a run and not both and that you do not know which. In the case of (4) and (5), what the speaker implicates can be figured out on the presumption that if she was in a position to give stronger or more specific information, she would have. With (6), the presumption is that the speaker is in a position to know how many times Barry lifted the 250-lb barbell, whereas with (7) the presumption is just the opposite, since if she knew whether Barry went for a swim or a run (or both), she would have said so.
Relevance implicatures can also be cases of conveying information by saying one thing and leaving something else out. Grice's two best-known examples are of this type:
(8) There is a garage around the corner. [said in response to “I am out of petrol”]
(9) He is punctual, and his handwriting is excellent. [the entire body of a letter of recommendation]
An utterance of (8) is relevant, and a rational speaker would intend it as such, only if the speaker means also that the garage is open and has petrol for sale. So the hearer is to reason accordingly. (9) is rather different, on account of the speaker's reason for not being more explicit. In this case, the writer intends the reader to figure out that if she had anything more positive to say about the candidate, she would have said it.
If it seems that quantity implicatures are special cases of relevance implicatures, that is because, generally speaking, they are. What makes them special is that they involve the exploitation of a scale. As Horn (1972) first spelled out, conveying a “scalar implicature” takes advantage of the existence of a naturally stronger alternative along a scale, e.g. “some” rather than “all” and “or” rather than “and.” So using “some” typically conveys “not all” and using “or” typically conveys “not both.”
Manner implicatures are probably the least common. They exploit not just the speaker's saying a certain thing but her saying it in a certain way. For that reason, they are exceptions to Grice's nondetachability test, according to which what a speaker implicates would have been implicated even if the speaker had said the same thing in a different way (Grice 1975/1989: 39). With manner implicatures, the way matters. It could be the wording, such as using an elaborate phrase when a single word is obviously available to say the same thing, or perhaps the pronunciation, such as by uttering a certain word in a conspicuously odd way. Obviously, if there are different ways of saying the same thing and how the speaker says it affects what the hearer is likely to take into account in figuring out what the speaker means, the implicature is detachable. The following examples illustrate this.
(10) You have prepared what closely resembles a meal of outstanding quality.
(11) I would like to see more of you.
Imagine a culinary instructor uttering the long-winded (10). Her intention would likely be to convey that the meal is not nearly as good as it appears. A speaker of (11) could exploit its ambiguity (compare “I would like to see you again”) to convey something besides wanting to spend more time with the hearer.
It should be understood that Grice does not suppose that speakers consciously exploit the maxims or that hearers consciously take them into account. However, this raises the interesting question of just what is involved psychologically in the process of communication when the speaker does not mean exactly what she says. There is not only the commonly addressed question of how hearers manage to figure out what speakers mean given that they say what they say (and say it in the way they say it), but also the rarely addressed question of how speakers choose what to say so as to make evident what they mean, even when they do not make it explicit. Grice did not address the latter question, and his account of implicature is commonly misconstrued as an answer to the former question. Being clear on what Grice was up to and what he was not avoids a number of misunderstandings.
3.4 Common misunderstandings about conversational implicature
There are two common misconceptions about the role of the maxims (or presumptions). First, they do not determine implicatures (even Grice (1989: 372) occasionally suggested that they do) but, rather, help explain how they get conveyed. They are considerations that speakers implicitly intend hearers to, and hearers do, take into account to figure out (“determine” in the sense of ascertain) what the speaker is implicating. Since that is a matter of speaker meaning, it is the speaker's communicative intention that determines (in the sense of constitute) what is implicated. Also, it should not be supposed, as it often is, that the maxims apply only to implicatures. This misconception is understandable insofar as the maxims play a key role in Grice's account of how implicatures get conveyed, but in fact they apply equally to completely literal utterances, where the speaker means just what she says. After all, the hearer still has to infer this. It is thus wrong to suppose that the maxims come into play only where linguistic meaning leaves off and speaker meaning and extralinguistic, contextual information take over (for more on context and what it does and doesn't determine, see Bach 2005).
Another misunderstanding concerns what, when a speaker says one thing but means something else, the hearer is to infer. Contrary to what philosophers and linguists seem commonly to suppose (perhaps because of the ambiguous phrase “infer what the speaker implicates”), the hearer does not have to infer the thing the speaker implicates. He merely has to infer that the speaker implicates (means) it. So, for example, if the speaker says and means that a certain new book has a beautiful cover and implicates that it is not worth reading, the hearer needs to infer that the speaker means that it is not worth reading. He does not need to infer that it is in fact not worth reading. The speaker might want him to believe that, but this is not necessary as far as communication is concerned. Indeed, he could even doubt that she believes it (he might think she resents the author's success).
A remarkably widespread misconception is that implicatures are inferences, or at least are determined (and not merely ascertained) by inferences rather than by the speaker. It is based on confusing what is implicated (by a speaker) with what is involved in figuring out what is implicated. Implicatures are things speakers mean, not what hearers, even rational ones, think they mean. Accordingly, if a speaker is to succeed in communicating something, the hearer must figure out that it is meant. That requires inference. Yet some of the most brilliant researchers, including Levinson (2000), Geurts (2010), and Chierchia et al. (forthcoming), write as if implicatures themselves are inferences.1
A further misconception is that linguistic expressions can implicate things. Speakers do. To be sure, there are certain expressions that are characteristically used (by speakers) to implicate things (Davis (1998) regards this as a reason to say that sentences themselves implicate things, but he does so in the course of arguing that what is implicated is generally a matter of convention, not speaker intention). When this occurs we have what Grice calls generalized conversational implicatures (as opposed to particularized ones). These have been investigated in great depth by Levinson (2000), who thinks they give rise to an intermediate level of meaning. In fact, they give rise to an intermediate kind of inference, but inferences are not meanings.
A related misunderstanding leads to Levinson's objection that Grice's approach cannot account for the (alleged) phenomenon of “pragmatic intrusion,” which he thinks is exemplified by so-called embedded implicatures. As Levinson puts it,
Grice's account makes implicature dependent on a prior determination of ‘the said’. The said in turn depends on disambiguation, indexical resolution, reference fixing, not to mention ellipsis unpacking and generality narrowing. But each of these processes, which are prerequisites to determining the proposition expressed, may themselves depend crucially on processes that look indistinguishable from implicatures. Thus what is said seems both to determine and to be determined by implicature. Let us call this Grice's circle. (Levinson 2000: 186; my italics)
This objection is based on confusing the two sorts of determination mentioned above. The first two highlighted words are forms of “determine” in the sense of ascertain, but the last two, where Levinson draws his conclusion, are in the constitutive sense. In that sense, what is said neither determines nor is determined by what is implicated. This is a matter of the speaker's intention.
Levinson and many others misconstrue Grice's account as a psychological model of the hearer's inference, indeed one according to which the hearer must ascertain what the speaker says before figuring out what the speaker is implicating (see Bach 2001: 24–5, and Saul 2002b). But that is not how Grice intended his account. He required that “the presence of a conversational implicature be capable of being worked out” (1975/1989: 31), but he did not require that it must be.
This last misconception (for still more see Bach 2006c) leads to the widespread idea, evident from an extensive literature, that some implicatures are “embedded,” “pre-propositional” (Recanati 2003), or “pre-compositional” (Chierchia et al. forthcoming). This is thought to arise with utterances of sentences like these:
(12) It is better to get married and get pregnant than to get pregnant and get married.
(13) Bill thinks that there were four boys at the party.
Since the two infinitival clauses of (12) are semantically equivalent, a speaker is likely to implicate that what is better is getting married and then getting pregnant. With (13) the implicature is not that there were exactly four boys at the party but that Bill thinks that. In fact, such examples illustrate merely that the process of figuring out what is implicated does not require first ascertaining what is said. They do not show that the implicature is embedded in anything. Indeed, since speakers implicate, it does not even make sense to say that some implicatures are embedded. It is irrelevant that the hearer figures out what is implicated without having first to figure out what is said. Recanati supposes that Grice's account requires (conversational) implicatures to have a “global, post-propositional character,” on the grounds that for Grice “implicatures are generated via an inference whose input is the fact that the speaker has said that p” (Recanati 2003: 300). Recanati's point is that certain implicatures get “computed” before what is said is ascertained. However, this does not show that the implicature itself is somehow embedded. In the case of (13), one of Recanati's examples, the implicature is supposed to be that Bill believes that there were not more than four boys at the party in question. But the implicature is not literally embedded. What the speaker says is the proposition expressed by (13), and what the speaker implicates is this other proposition. How the hearer figures this out is another matter.
Not only that, it seems that the speaker does not really mean two things, the proposition that Bill believes that there were four boys at the party and the proposition that Bill believes that there were not more than four boys at the party in question. It seems, rather, that the speaker means but one thing, that Bill believes that there were exactly four boys at the party in question. This example illustrates that some apparent instances of implicature are really cases of something else.
3.5 Between saying and implicating
A speaker can say something and mean just that. The contrast between saying and implicating allows both for cases in which the speaker means what she says and something else as well and for ones in which the speaker says one thing and means something else instead. Grice counted both as kinds of implicature, although the latter might better be described as speaking figuratively (recall, though, that Grice described this as a case of merely “making as if to say” something, since for him saying entails meaning). Grice seems to have overlooked a phenomenon intermediate between saying and implicating, one that has been investigated by many others (Sperber and Wilson 1986, Bach 1994a, Carston 2002, and Recanati 2004a). As they have observed, there are many sentence forms whose typical uses go beyond their meanings, even with references fixed and ambiguities resolved, but are not cases of implicating (or speaking figuratively). The reason is that what the speaker means, although distinct from what is said (strictly speaking), is too closely tied to what is said to be a case of merely implicating.
3.5.1 Two kinds of impliciture
In homage to Grice I call these in-between cases conversational impliciture (Bach 1994a). That's with an “i” rather than an “a.” It comes in two forms, depending on whether or not what is said fully comprises a proposition. In the first case what the speaker means is a more elaborate proposition than what is expressly said, as with a likely utterance of (14):
(14) Jack and Jill are married.
The speaker is likely to mean that they are married to each other, even though she does not make the last part explicit. Clearly that element is cancelable, since she could have added ‘but not to each other’ without contradicting herself. Even so, she is not implicating that Jack and Jill are married to each other, since she does not mean both that Jack and Jill are married and that they are married to each other. She means one thing, not two. What she means is an embellished version of what she says. Similarly, someone uttering (15),
(15) Harry took two aspirin and got rid of his headache.
would likely mean that Harry took two aspirin and then, as a result, got rid of his headache. Again, the inexplicit part is cancelable, for a speaker uttering (15) could have added, “but not because of the aspirin.” In both cases it is not the linguistic meaning of the uttered sentences but the fact that the speaker said what she said, presumably with maximal relevant informativeness, as per the first maxim of quantity, that provides the hearer with reason to think that the speaker intended to convey something more expansive.
Then there are cases in which what the speaker says is not merely less expansive than what she means but falls short of comprising a proposition (even with references fixed and ambiguities resolved). Suppose you are meeting some friends at a restaurant. You arrive at the appointed time, and all but one of the others are there. After a few minutes, you remark,
(16) Larry is late.
You cannot mean merely that Larry is late, full stop. Presumably you mean that Larry is late for the dinner in question. And if the maître d’ announces,
(17) Your table is ready.
presumably he means that your table is ready for your party to be seated there. In both cases the sentence falls short of fully expressing a proposition – it is semantically incomplete. Yet in each case what the speaker means is a complete proposition. Sentences like (16) and (17) appear to violate the grammar school dictum that a sentence, unlike a mere word, phrase, or “sentence fragment,” must express a “complete thought.” As with (14) and (15), though for a different reason (semantic incompleteness), what the speaker means is more specific than what the sentence means. We might say that whereas what a user of (14) or (15) means is an expansion of the sentence meaning, what a user of (16) or (17) means is a completion of it. These terms are meant to suggest, on the assumption that what a speaker means must be propositional, that in the first case what the speaker said is something that she could have meant (expansion is in a sense optional), whereas in the second case what was said is insufficient to have been meant (it requires completion into a proposition).
Regarding examples like (16) and (17) and many others like them (for numerous examples see Sperber and Wilson 1986 and Bach 1994a), it might be objected that the lexical semantics of “late” or of “ready” requires a complement (or, as it is sometimes put, includes a variable that must be given a value or an argument slot that must be filled), hence that (16) and (17) are not really semantically incomplete but more akin to sentences containing indexicals. Properly replying to this objection would require going through the variety of different lexical items that seem to give rise to semantic incompleteness, but the general idea is very simple. Consider examples like (18) and (19):
(18) It is 9 in the morning.
(19) The earth rotates at more than 1,000 mph.
Time of day is relative to a time zone, but (18), as it stands, that is, without any specification of time zone, is neither true nor false and does not express a proposition. Many speakers, particularly very young ones, are ignorant of time zones, and it would be charitable to attribute to them implicit reference to a time zone or even to their location. It is a fact about time of day, not about lexical semantics, that time of day is relative to a time zone. Similarly, as it stands (19) is neither true nor false and does not express a proposition. Even leaving aside the fact that the earth's rotation is relative to other objects, the sun in particular, its speed of rotation is relative to latitude. If the intended location is at or near the equator, a speaker of (19) would be asserting something true but not if it were at the North or South Pole. However, there seems to be no basis for supposing that the requirement of relativization to latitude is lexically or otherwise linguistically imposed. So, even if it is arguable that some terms that seem to give rise to semantic incompleteness lexically require complements, this is not the case in general.
Now some have contended that semantic incompleteness is the norm, not the exception. Recanati, for example, denies “that semantic interpretation can deliver something as determinate as a proposition. On my view, semantic interpretation, characterized by its deductive character, does not deliver complete propositions: it delivers only semantic schemata” (2004a: 56). Sperber and Wilson (1986) and Carston (2002) have taken a similar stance. However, it seems to me that while it is true that much is left implicit in ordinary speech, they have seriously overgeneralized from this fact and relied on a skewed sample of relatively short sentences. If they were right, then all of the things we mean would be ineffable. For their view entails that no proposition is semantically expressible, even by a sentence too verbose to use in casual conversation. However, the most they can hope to have shown is that the propositions people convey when using short, idiomatic sentences are not semantically expressed by those sentences. They haven't begun to show that there aren't other, less semantically impoverished sentences that speakers could have used to make what they meant fully explicit.
3.5.2 Saying and impliciture
These same critics of Grice have pointed out that expansions and completions are not related closely enough to conventional meaning to fall under Grice's notion of what is said but are too closely related to count as implicatures. Sperber and Wilson (1986: 182) coined the word explicature for this in-between category, since part of what is meant explicates what is said (sometimes they describe it as a “development of the logical form” of the uttered sentence). However, their neologism trades on an association with “explicit,” as in their pet phrase “explicit content” and Carston's (2002) “explicit communication” (this term occurs in the book's subtitle). “Impliciture” seems like a better term for this phenomenon, since it suggests that part of what is meant, the implicit content, is communicated implicitly, whether by expansion or completion (however, the issue here is not merely terminological, for, as explained in Bach 2010, there are some subtle but real differences between impliciture and explicature). David Braun (p.c.) has invented the verb “implicite” for what a speaker does when what she means is an enrichment of what she ‘locutes’ (says in the locutionary sense).
Rather than adopt the term “explicature,” Recanati (2004a) proposes to extend the notion of what is said, hence of saying itself, to cover the above cases. In fact, he offers a series of progressively more liberal notions of saying. It is hard to see that what we mean by “say,” hence by “what is said,” can be anything more than a terminological question, albeit one whose answer depends on theoretical utility. Grice's preferred notion had the constituents of what is said “corresponding to the meanings of the constituents of the sentence and mirroring its syntactic structure” (Grice 1969/1989: 87), but he also insisted that what is said be meant (by the speaker). The latter requirement seems arbitrary, for reasons discussed in section 3.1. Worse than that, it obscures the distinction between saying in the locutionary sense, for which we have independent need, and the illocutionary notion of saying, i.e. stating or asserting. Keeping this distinction in mind allows for a notion, saying in the illocutionary sense of stating, whereby what is said is “enriched” by “pragmatic processes” (Recanati 2004a). Notice, moreover, that this does not result in what Levinson calls “pragmatic intrusion” (2000: 189ff.), since the illocutionary level is inherently pragmatic.
Recanati does not deny that we can notionally draw this distinction, but he has argued, on both intuitive and psychological grounds, against the theoretical utility of adopting the locutionary notion of saying (for fuller discussion of the following issues see Bach 2001: 21–8). He contends that people's intuitions about what is said (and about the truth or falsity of what is said) tend to be responsive to the presence of implicit elements. However, all this goes to show, assuming that Recanati is right about people's intuitions (he has not conducted actual studies), is that people tend to conflate saying with meaning, specifically stating or asserting. Imagining themselves in real-life conversational situations, they would imagine what speakers are likely to mean in making their utterances. It seems likely that subjects would make stereotypical assumptions about the situations in which target sentences are uttered and that their intuitions would be colored accordingly. So of course their intuitions would be responsive to embellishments of the content of the sentence actually uttered.
Recanati also appeals to claims about psychological processes to debunk the locutionary notion of saying/what is said, and Carston has argued similarly (Reference Carston2002: 170–81). The gist of their argument is that what is said in the strict, locutionary sense generally does not get mentally represented in the process of understanding an utterance. They claim, quite plausibly, that hearers figure out what is “implicited” on the fly, not after and without the benefit of ascertaining what is strictly said. However, this is irrelevant to what speakers do when they produce utterances. What is said (again, in the sense at issue) is the content of the locutionary act performed by the speaker. It has nothing to do with what goes on in the mind of the hearer.
This is not to deny the importance of investigating the cognitive processes involved in hearers’ understanding of what speakers mean. Like Recanati, Sperber and Wilson (Reference Sperber and Wilson1986), Carston (Reference Carston2002), and many others have concerned themselves with these processes, but that does not justify equivocating on the term “determination” as it occurs in the phrase “determination of what is said.” As we have seen, this phrase can designate either the process of ascertaining what a speaker says in uttering a certain sentence or whatever it is that makes it the case that the speaker says a certain thing. Obviously ascertaining what a speaker says in uttering something presupposes that there is something that the speaker does say. It plays no role in making it the case that the speaker says what she says.
3.6 Summing up
Clarifying the distinction between saying, meaning, and implicating has required refining Grice's notions of each by way of introducing further distinctions. Borrowing from speech act theory, we invoked the distinction between locutionary and illocutionary acts to drop Grice's counter-intuitive requirement that to say something entails meaning it. In order to make sense of Grice's ingenious idea that speaker meaning involves a kind of self-referential yet audience-directed intention, we needed to distinguish the specifically communicative “effect” of understanding from other, perlocutionary effects on an audience. Next we clarified Grice's notion of conversational implicature, mainly by identifying various misconceptions about it. Many of these can be avoided by heeding the distinction between a speaker's meaning something and the hearer's figuring out that the speaker means it. Also, the conversational maxims or presumptions do not generate or even determine implicatures but, rather, provide considerations that the hearer may be intended to take into account to figure out what a speaker means/implicates. We then pointed out that the distinction between saying and implicating is not exhaustive and defended the use of the term “impliciture” for what falls between what is said (and meant) and what is implicated. Finally, by distinguishing the speaker's semantic intention from her communicative intention and both from what may be intuitive, evident to, or otherwise go on in the mind of the hearer, we defended the notion of a speaker's purely locutionary, or semantic, act of saying. The aim of all this has been to distinguish the linguistic from the extralinguistic information that speakers try to make available to their listeners, to identify the sorts of intentions they have in so doing, and to describe the means by which this information is made available to or is inferable by the hearer from the fact that the speaker did what she did.
We have not covered the many debates that Grice's notions have provoked, the radical alternatives to his approach, the range and variety of implicatures (never mind presuppositions) that philosophers and linguists have discussed, or how all this fits into speech act theory and pragmatics in general. Whereas we have focused on issues raised by Grice's most important and influential ideas, speaker meaning and conversational implicature, Neale (Reference Neale1992) presents a much fuller discussion of these and related ideas, Levinson (Reference Levinson2000) offers an in-depth study of generalized conversational implicature, Bach (Reference Bach1999a, Reference Bach2006a) and Potts (Reference Potts2005, Reference Potts2007a) address Grice's controversial notion of conventional implicature, Chapman (Reference Chapman2005) provides a full-length intellectual biography of Grice, Horn (Reference Horn2009b) gives a forty-year retrospective on implicature, Geurts (Reference Geurts2010) offers an in-depth study of quantity implicatures, the most thoroughly studied kind, and the papers collected in Petrus Reference Petrus2010 present some of the most recent developments.
4 Implying and inferring
To draw inferences has been said to be the great business of life.
4.1 A “vulgar conflation”
In Bach's short manifesto unveiling his list of the top ten misconceptions about implicature (2006c: 23), #2 is the thesis “Implicatures are inferences.” For Bach, such a claim – whether explicit (as in the subtitle of Levinson Reference Levinson2000, “Generalized conversational implicatures as default inferences”) or implicit – is a “misnomer” amounting to a “slight variation on the vulgar conflation of implying with inferring.” The distinction in each case is seen as a straightforward one: implying (or, more specifically, implicating) is something the producer or sender of a message (speaker or writer) does, while inferring pertains to the cognitive effort of the receiver. Bach submits the entry from the American Heritage Book of English Usage:1
When we say that a speaker or sentence implies something, we mean that information is conveyed or suggested without being stated outright…Inference, on the other hand, is the activity performed by a reader or interpreter in drawing conclusions that are not explicit in what is said.
The distinction is vital for pragmatic theory because an interpreter may “recover” an implication (presupposition, implicature) that was not intended by the utterer, and a speaker may imply (presuppose, implicate) something that the interpreter does not grasp. In I. A. Richards's words (MWDEU 1994: 541), “An utterer may imply things which his hearers cannot reasonably infer from what he says”; in other cases, the expectation of inference may be reasonable but nevertheless unfulfilled, whether through inattention or a mismatch of shared beliefs. But what of the straightforward distinction between implying and inferring itself?
Most usage manuals endorse the distinction, but a closer look shows that it's not that simple. WDS (1942), for example, begins with this prescription, complete with pointy finger:
☞Do not confuse infer with imply.
But then the rockier landscape of actual usage is surveyed (WDS 1942: 449–50):
The use of infer in the sense of to hint at or to intimate (as, by his remarks he infers [correctly implies]) is still regarded as erroneous. However, in the past infer sometimes meant, and still to some extent means, to give grounds for believing (something that is stated) or to permit (something) to be inferred. In such use, a personal subject is to be avoided, for in precise English only that which gives the grounds for or permits an inference, or which leads to a given conclusion, can rightly be the subject; as “This doth infer the zeal I had to see him” (Shak.); “Consider first that great Or bright infers not excellence” (Milton); “Matters were by no means so far advanced between the young people as Henchard's jealous grief inferred” (Hardy).
So, in effect, “Don't use infer to mean imply – but if you do, make sure your subject is inanimate.” The secure footing appears to have become a bit slippery, presumably because a misstep with impersonal subjects yields no ambiguity.
In Merriam-Webster's Dictionary of English Usage, the linguistically best informed usage compendium, the extensive entry for infer, imply (MWDEU 1994: 541–4) disentangles three uses of infer, eloquently illustrating that “Real life is not as simple as commentators would like it to be” (541). Besides the universally accepted ‘deduce, conclude’ sense for infer, MWDEU differentiates the impersonal-subject infer-for-imply illustrated in the Shakespeare, Milton, and Hardy examples above2 from the “personal infer” attested in an Ellen Terry letter from 1896: “I should think you did miss my letters. I know it! but…you missed them in another way than you infer, you little minx!” The MWDEU suggests that this latter use, the specific target of twentieth-century prescriptive opprobrium, has always been largely restricted to informal spoken language.
Similarly, the OED's sense 4 for infer reads as follows:
To lead to (something) as a conclusion; to involve as a consequence; to imply. (Said of a fact or statement; sometimes, of the person who makes the statement.)
This use is widely considered to be incorrect, esp. with a person as the subject.
But even if incorrect, not inexistent; cites range from the sixteenth century to the nineteenth (“Socrates argued that a statue inferred the existence of a sculptor”) and the twentieth (“I can't stand fellers who infer things about good clean-living Australian sheilahs”).
It should be noted that the direct target of objections like that of Bach (or Horn Reference Horn, Horn and Ward2004) is the nominal form (inference) rather than the verbal (infer), as in Levinson's implicatures as default inferences. But the speaker-oriented sense here boasts its own distinguished pedigree: the relevant OED sense 2 for inference is glossed as
That which is inferred, a conclusion drawn from data or premises. Also an implication; the conclusion that one is intended to draw. Cf. infer v. 4.
– with no disparagement of the latter usage and with attestations back to a 1612 essay by Francis Bacon on judicial practice warning that “Judges must beware of hard constructions and strained inferences.”
What does not seem to be noted in any of the dictionaries and manuals is the asymmetry exhibited in these purportedly erroneous uses. While infer has been used for almost five centuries in the sense of ‘imply, convey’ (whether with impersonal or personal subjects), imply is always sender-, not receiver-oriented; it is never used for ‘infer’, to refer to the cognitive processes of the hearer or reader. Although this asymmetry may be seen as betokening the primary status of the speaker or producer of the message, we can find other cases in which a predicate exhibits an analogous ambiguity but where the direction of the meaning shift is less clear.3 Thus entendre exhibits a range of meanings in French from the speaker-oriented ‘mean, intend’ to the hearer-oriented ‘understand, hear’. This is especially significant for pragmatic theory because, as we shall see, conversational implicature first appeared in Mill's invocation of the sous-entendu, that which is literally under-meant or under-understood.
4.2 Implication and implicature
Since classical rhetoricians first described figures in which we say less and mean more (minus dicimus et plus significamus),4 semanticists and pragmaticists have explored the boundaries between what is said and what is meant-but-not-said. The latter is the realm of the implied. Recognition of the distinction between the said and the (merely) implied is not, however, limited to philosophers, linguists, and rhetoricians. Consider this exchange between Elinor and Marianne Dashwood, Austen's eponymous Sense and Sensibility, respectively (1811: Chapter 29), concerning the reprehensible but not actionable misbehavior of the latter's erstwhile beau Willoughby:
‘But he told you that he loved you?’
‘Yes – no – never absolutely. It was every day implied, but never professedly declared. Sometimes I thought it had been, but it never was.’
This is a distinction vital to lawyers as well as cads, as seen in the myriad devices for exploiting the difference between what is said and what is implied under oath. An important precedent in this domain is Bronston v. United States. Samuel Bronston, president of a film production company, responded as follows to a cross-examining prosecutor in his 1973 trial (409 U.S. 352–354, cited in Solan and Tiersma Reference Solan and Tiersma2005: 213):
Q: Do you have any bank accounts in Swiss banks?
A: No, sir.
Q: Have you ever?
A: The company had an account there for about six months, in Zurich.
In fact, besides the company account, Bronston had actively maintained a large personal account in a Swiss bank. Thus, while his first response was truthful (depending, as President Clinton might have said, on the meaning of the word do), his second answer was at the very least misleading or “non-responsive.” But was it false? Bronston was convicted of perjury, on the grounds that his last response, while literally true, “falsely implied that he had never had a personal Swiss bank account,” but the judgment was reversed by a unanimous US Supreme Court. The particulars of this and related cases are illuminated by Solan and Tiersma (Reference Solan and Tiersma2005: 212–35), who point out that Bronston's violation concerned what is implicated (via the quantity and relation maxims) rather than what is literally said and endorse the Bronston “literal truth” defense against perjury charges, whether for sleazy movie producers or jesuitical presidents.
The difference between lying (based on the falsity of what is said) and misleading (based on the falsity of what is implied), as instantiated above and in a variety of other fictional and all too real settings over the last two millennia from the Oval Office to everyday conversation, can be taken to support an orthodox Gricean conception of what is said that hugs the syntactic ground of the spoken or written sentence as opposed to an “inflationary” view that incorporates pragmatically derived aspects of the intended communication; see Horn Reference Horn2009b for elaboration.
But not just any (non-logical) implication is an implicature. In particular, conversational implicature in the Gricean model typically arises from what the speaker didn't say but (given rationality and cooperation) would have been expected to say if she had been in a different epistemic position. This point, rightly associated with Grice's William James lectures, was actually made exactly a century earlier.
In the locus classicus, a speaker uttering Some F are G implies that (for all she knows) not all F are G because she would have been expected by the hearer to have expressed the stronger proposition if she had been in a position to do so. The key insight is provided in this passage in which John Stuart Mill rejects Sir William Hamilton's (1860) treatment of some as logically expressing ‘some only, some but not all’:
No shadow of justification is shown…for adopting into logic a mere sous-entendu of common conversation in its most unprecise form. If I say to any one, “I saw some of your children today”, he might be justified in inferring that I did not see them all, not because the words mean it, but because, if I had seen them all, it is most likely that I should have said so: though even this cannot be presumed unless it is presupposed that I must have known whether the children I saw were all or not. (Mill Reference Mill1867: 501)
Mill invokes here the two-stage process allowing the hearer's move from the weaker recovered implication (‘for all the speaker knows, not all…’) to the stronger (‘the speaker knows that not all…’) when epistemically licensed. These are the primary and secondary implicatures of Sauerland (Reference Sauerland2004), built into the rationality-driven Gricean model (cf. Horn Reference Horn1989, Reference Horn2009b: §2; Geurts Reference Geurts2009) but not captured in current alternative grammatical theories of “blind mandatory scalar implicature” (Chierchia et al. Reference Chierchia, Fox and Spector forthcoming; Magri Reference Magri2009; see Geurts Reference Geurts2010 for discussion).
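Mill's two-stage move – from the weak primary implicature (‘for all the speaker knows, not all…’) to the strengthened secondary implicature (‘the speaker knows that not all…’) – can be rendered as a toy computation. Everything in the sketch below is an illustrative assumption (the scale, the string encoding of belief attitudes, the `competent` flag), not a formalization that Mill, Sauerland, or anyone else in this literature has committed to:

```python
# Toy model of the two-stage derivation of scalar implicatures
# (primary vs. secondary implicatures in Sauerland's sense).

# A Horn scale, ordered weakest to strongest by unilateral entailment.
SCALE = ["some", "most", "all"]

def primary_implicatures(uttered):
    """Stage 1: for each stronger alternative A, infer only that the
    speaker does NOT believe A (the weak 'for all she knows' reading)."""
    stronger = SCALE[SCALE.index(uttered) + 1:]
    return [f"not believe({alt})" for alt in stronger]

def secondary_implicatures(uttered, competent=True):
    """Stage 2: if the speaker is presumed opinionated about the stronger
    alternatives, strengthen 'not believe(A)' to 'believe(not A)'."""
    if not competent:
        return primary_implicatures(uttered)
    stronger = SCALE[SCALE.index(uttered) + 1:]
    return [f"believe(not {alt})" for alt in stronger]

print(primary_implicatures("some"))    # speaker makes no claim about most/all
print(secondary_implicatures("some"))  # with competence: speaker knows not most/all
```

The `competent=True` branch corresponds to the epistemic licensing Mill flags: the strengthening “cannot be presumed unless it is presupposed that I must have known whether the children I saw were all or not.”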
Mill's allusion to a tacit principle requiring the speaker's choice of the stronger all over the weaker some when possible, and inviting the hearer to draw the corresponding inference when the stronger term is eschewed, is echoed by others in his own time –
Whenever we think of the class as a whole, we should employ the term All; and therefore when we employ the term Some, it is implied that we are not thinking of the whole, but of a part as distinguished from the whole – that is, of a part only. (Monck Reference Monck1881: 156, emphasis added)
– and in Grice's (e.g. Nowell-Smith Reference Nowell-Smith1954, Fogelin Reference Fogelin1967: 20–22; see Horn Reference Horn1990, Chapman Reference Chapman2005 for discussion).5
The principle tacitly invoked by Mill and Monck for generating such implications (or sous-entendus) is formulated by Strawson (Reference Strawson1952: 178–9) as a “general rule of linguistic conduct” he attributes to “Mr H P Grice”: “One should not make the (logically) lesser, when one could truthfully (and with greater or equal clarity) make the greater claim.” The implicational relation between the subcontraries some and some not is captured independently in this overlooked passage that stresses the role of cancelability in distinguishing what is said from “what can be understood without being said” while also touching on the roles of relevance, economy, and epistemic insecurity:
What can be understood without being said is usually, in the interest of economy, not said…A person making a statement in the form, “Some S is P”, generally wishes to suggest that some S also is not P. For, in the majority of cases, if he knew that all S is P, he would say so…If a person says, “Some grocers are honest”, or “Some books are interesting”, meaning to suggest that some grocers are not honest or that some textbooks are not interesting, he is really giving voice to a conjunctive proposition in an elliptical way.
Though this is the usual manner of speech, there are circumstances, nevertheless, in which the particular proposition should be understood to mean just what it says and not something else over and above what it says. One such circumstance is that in which the speaker does not know whether the subcontrary proposition is also true; another is that in which the truth of the subcontrary is not of any moment. (Doyle Reference Doyle1951: 382)
Grice's contribution, beyond securing the naming rights to the relation in question, was to ground the operation of Mill's “sous-entendu of common conversation”6 within an overall account of speaker meaning and the exploitation of conversational principles based on assumptions of the interlocutors’ rationality and mutual goals. In fact, like presupposition, implicature was (re)introduced into the philosophical literature and thence into the consciousness of linguists not with a specialized label but as a species of implication distinct from logical implication or entailment:
To say, “The king of France is wise” is, in some sense of “imply” to imply that there is a king of France. But this is a very special and odd sense of “imply”. “Implies” in this sense is certainly not equivalent to “entails” (or “logically implies”). (Strawson Reference Strawson1950: III)
If someone says “My wife is either in the kitchen or in the bedroom” it would normally be implied that he did not know in which of the two rooms she was. (Grice Reference Grice1961: 130)
Just as Strawson (Reference Strawson1952) carved out a dedicated relation of presupposition two years after his first broadside at the non-existent French king,7 so too Grice ([Reference Grice1967]Reference Grice1989) advances specialized labels – conventional and non-conventional (specifically conversational) implicature – for what he had earlier (1961: §3) described as varieties of (non-logical) implication delineated by the diagnostics of cancelability and detachability.
Conversational implicature arises from the shared presumption that S and H interact to reach a shared goal. A speaker S saying p and implicating q counts on her interlocutor's ability to compute what was meant from what was said, based on the assumption that both S and H are rational agents. On Grice's view, speakers implicate, hearers infer; such inferences may or may not succeed in recovering the speaker's intended implicature(s), if any. Nevertheless, it is S's assumption that H will draw the appropriate inference that makes implicature a rational possibility.8
The governing dictum is the Cooperative Principle: “Make your conversational contribution such as is required, at the stage at which it occurs, by the accepted purpose or direction of the talk exchange” (Grice Reference Grice1989: 26). This principle is instantiated by a set of general maxims of conversation whose exploitation potentially yields implicatures:
quality: Try to make your contribution one that is true.
1. Do not say what you believe to be false.
2. Do not say that for which you lack evidence.
quantity:
1. Make your contribution as informative as is required (for the current purposes of the exchange).
2. Do not make your contribution more informative than is required.
relation: Be relevant.
manner: Be perspicuous.
1. Avoid obscurity of expression.
2. Avoid ambiguity.
3. Be brief. (Avoid unnecessary prolixity.)
4. Be orderly.
A year after introducing the notion (or, more precisely, the label) of implicature in the William James lectures, Paul Grice published Lecture 6 as Grice Reference Grice1968, situating his take on linguistic semantics within his broader project for speaker meaning (Grice Reference Grice1968: 225; cf. also Grice Reference Grice1989: 118):
The wider programme…arises out of a distinction I wish to make within the total signification of a remark, a distinction between what the speaker has said (in a certain favored and maybe in some degree artificial, sense of “said”), and what he has “implicated” (e.g., implied, indicated, suggested, etc.), taking into account the fact that what he has implicated may be either conventionally implicated (implicated by virtue of the meaning of some word or phrase which he has used) or non-conventionally implicated (in which case the specification of implicature falls outside the specification of the conventional meaning of the words used).
The characterization of implicature as an aspect of speaker meaning – what is meant without being said – was set forth at a time when similar notions were in the air, especially that of Oxford and its ordinary-language sphere of influence. A case in point is “contextual implication” as invoked by Nowell-Smith (Reference Nowell-Smith1954: 80–82) and revisited by Hungerland (Reference Hungerland1960) in her eponymous paper: “A statement p contextually implies q if anyone who knew the normal conventions of the language would be entitled to infer q from p in the context in which they occur.” The locus classicus for contextual implication is the context familiar from Moore: “When a speaker uses a sentence to make a statement, it is contextually implied that he believes it to be true” (Nowell-Smith Reference Nowell-Smith1954: 81). But the relation between (my saying) He has gone out and my believing that he has gone out cannot be assimilated to conversational implicature, for reasons Grice himself would later provide:9
On my account, it will not be true that when I say that p, I conversationally implicate that I believe that p; for to suppose that I believe that p (or rather think of myself as believing that p) is just to suppose that I am observing the first maxim of Quality on this occasion. I think this consequence is intuitively acceptable; it is not a natural use of language to describe one who has said that p as having, for example, “implied”, “indicated”, or “suggested” that he believes that p. The natural thing to say is that he has expressed (or at least purported to express) the belief that p. (Grice Reference Grice1989: 42)
The difficulty of canceling such a (putative) implicature without epistemic or doxastic anomaly also argues against such an analysis. This applies as well to other cases of sincerity conditions Hungerland cites, e.g. the relation between I promise to p and I intend to p. Another of Nowell-Smith's examples of contextual implication does qualify as prefiguring Grice on the maxims, specifically Relation, although not on implicature as such: “What a speaker says may be assumed to be relevant to the interests of the audience.” This maxim may indeed be overridden, as Nowell-Smith and Hungerland both observe. But Hungerland (Reference Hungerland1960: 212) is properly skeptical of the heterogeneity of a construct that extends from this relevance injunction to the sincerity condition on assertions and promises. Nowell-Smith himself concedes that the violation of the latter leads to “logical oddity” – “It's raining, but I don't believe it is” – while the non-observance of the relevance rule, according to him (and Hungerland), runs the mere risk of boredom. Even when relevance violations produce confusion or a recognition of the different conversational goals of speaker and hearer, such consequences do not rise to the level of the “logical oddity” of Moore's paradox.
4.3 Scalar implicature and the maxim map
Conversational implicature differs from contextual implication, and non-demonstrative implication more generally, in being defined as a relation between a speaker (not a sentence!) and a proposition that typically arises from the exploitation of the maxims (Grice Reference Grice1989: 26ff.; cf. Horn Reference Horn, Horn and Ward2004); in the case of scalar implicature in particular, what is implicated depends on what isn't (but could have been) said. The crucial principle of informative strength or quantity – whether formulated à la Strawson Reference Strawson1952 channeling Grice (“One should not make the (logically) lesser, when one could truthfully (and with greater or equal clarity) make the greater claim”), à la Grice Reference Grice1961 (“One should not make a weaker statement rather than a stronger one unless there is a good reason for so doing”) or à la Grice [1967]1989 (“Make your contribution as informative as is required (for the current purposes of the exchange)”) – balances what the speaker can and does say with what she doesn't and hence presumably can't (or shouldn't) say. A speaker may opt for a weaker utterance from a belief that to utter its stronger counterpart might violate considerations of relevance, brevity, clarity, or politeness (note the codicils and parentheticals in each of the formulations above10), but especially – as Mill and Doyle foresaw – from a lack of certainty that the stronger counterpart holds.
This reasoning, turning on Grice's first quantity maxim, is systematically exploited to yield upper-bounding scalar implicatures associated with relatively weak scalar operators, those configurable on a scale defined by unilateral entailment as in <all, most, many, some>. What is said in the use of a weak scalar value like those boldfaced in (2) is the lower bound (…at least…); what is implicated, in the absence of contextual or linguistic cancellation, is the upper bound (…at most…). What is communicated is the “two-sided reading” that combines what is said with what is implicated. Thus in (2c), to quote Mill (Reference Mill1867: 512), “If we assert that a man who has acted in a particular way must be either a knave or a fool, we by no means assert…that he cannot be both” – but we do typically communicate this “exclusive” understanding of the disjunction.

The alternative view on which each scalar predication in (2) is lexically ambiguous between one-sided and two-sided readings contravenes Grice's (Reference Grice1989: 47) Modified Occam's Razor: “Senses are not to be multiplied beyond necessity.” Scalar implicature was introduced and formalized in work by Horn (Reference Horn1972, Reference Horn1989), Gazdar (Reference Gazdar1979), Hirschberg (Reference Hirschberg1991), and Levinson (Reference Levinson2000); cf. also Matsumoto (Reference Matsumoto1995), Katzir (Reference Katzir2007), and Geurts (Reference Geurts2010) for insightful discussions of certain problems arising in the implementation of the central notions involved and Bontly (Reference Bontly2005) for a defense of Modified Occam's Razor based on its role as a heuristic in acquisition.
The implicature-based approach to scalar predications has been vigorously challenged by relevance theorists (see Carston Reference Carston and Kempson1988, Reference Carston2002, Reference Carston2004a and work reviewed therein), who take such sentences to involve propositional ambiguity, with the pragmatically enriched two-sided meanings constituting not implicatures but explicatures, pragmatically derived components of propositional content.11
Two major challenges to the Gricean picture of implicatures involve the number and status of the maxims and the relationship between implicature and propositional content. To begin with the former issue, Grice himself later acknowledged (1989: 371ff.) that the four macroprinciples (inspired by Kant) and nine actual maxims in his inventory are somewhat overlapping and non-coordinate. The number of maxims has been revised both upward (Leech Reference Leech1983) and downward. The dualistic program of Horn (Reference Horn1984b, Reference Horn1989, 2007a) follows Grice (Reference Grice1989: 371) in ascribing a privileged status to Quality, on the grounds that without the observation of Quality, or Lewis's (Reference Lewis1969) convention of truthfulness, any question of the observance of the other maxims fails to arise (though relevance theorists, beginning with Sperber and Wilson Reference Sperber and Wilson1986, offer a dissenting view). The remaining maxims are subsumed under two countervailing functional principles governing the economy of communication. On Horn's Manichaean model, implicatures may be generated by either the Q Principle (essentially ‘Say enough’, generalizing Grice's first sub-maxim of Quantity and collecting the first two ‘clarity’ sub-maxims of Manner) or the R Principle (‘Don't say too much’, subsuming Relation, the second Quantity sub-maxim, and brevity).
The hearer-oriented Q Principle is a lower-bounding guarantee of the sufficiency of informative content, exploited to generate upper-bounding (typically scalar) implicata. The speaker-oriented R Principle reflects Zipf's principle of least effort dictating minimization of form, exploited to induce strengthening implicata; it is responsible for euphemism, indirect speech acts, neg-raising, and meaning change (Horn Reference Horn and Burton-Roberts2007a). Opposition and equilibria between speaker's and hearer's communicative economies have been posited since Paul (Reference Paul1889: 351ff.) and Zipf (1949: 20ff.). According to the division of pragmatic labor (Horn Reference Horn and Schiffrin1984b), a relatively unmarked form – briefer and/or more lexicalized – will tend to become R-associated with a particular unmarked, stereotypical meaning, use, or situation, while its periphrastic or less lexicalized counterpart, typically more complex or prolix, will tend to be Q-restricted by implicature to those situations outside the stereotype, for which the unmarked expression could not have been used appropriately (as in kill vs cause to die, or mother vs father's wife). Formalizations of the division of pragmatic labor have been undertaken within bidirectional optimality theory and game-theoretic pragmatics; cf. e.g. Blutner Reference Blutner, Horn and Ward2004, van Rooij Reference van Rooij2009, and references cited therein.
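The division of pragmatic labor lends itself to a toy model in the spirit of weak bidirectional optimization, pairing cheap forms with stereotypical meanings and costly forms with non-stereotypical ones. The cost tables and the greedy rank-for-rank pairing below are illustrative assumptions for this two-by-two case only, not Blutner's actual formalism:

```python
# Toy sketch of the division of pragmatic labor: unmarked (brief,
# lexicalized) forms get R-associated with stereotypical meanings;
# marked forms get Q-restricted to non-stereotypical meanings.
# All costs are invented for illustration.

FORM_COST = {"kill": 1, "cause to die": 2}    # speaker effort (R side)
MEANING_COST = {"direct": 1, "indirect": 2}   # distance from stereotype (Q side)

def super_optimal_pairs(forms, meanings):
    """Pair forms with meanings so that no already-chosen form or meaning
    is reused: with costs totally ordered, this amounts to matching
    cheapest form with cheapest meaning, next with next, and so on."""
    pairs = []
    for f in sorted(forms, key=FORM_COST.get):
        for m in sorted(meanings, key=MEANING_COST.get):
            if not any(p[0] == f or p[1] == m for p in pairs):
                pairs.append((f, m))
    return pairs

print(super_optimal_pairs(["kill", "cause to die"], ["direct", "indirect"]))
# pairs 'kill' with 'direct' killing and 'cause to die' with 'indirect'
```

The outcome mirrors the kill vs cause to die pattern in the text: the unmarked form claims the stereotypical (direct) causation reading, leaving the periphrastic form for the situations outside the stereotype.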
Levinson's (Reference Levinson2000) framework posits an interaction of three heuristics: Q, I (for Informativeness, ≈ Horn's R), and M (Manner). Levinson's reconstruction of the division of pragmatic labor involves not Q but the M heuristic, given that some differs from all in informative content whereas kill differs from cause to die in complexity of production or processing. As Levinson acknowledges, however, the Q and M patterns are closely related, since each is negatively defined and linguistically motivated: S tacitly knows that H will infer from S's failure to use a more informative and/or briefer form that S was not in a position to have used that form. Unlike Q implicature, R/I-based implicature is not negative in character and is socially rather than linguistically motivated, typically yielding a culturally salient stereotype (cf. Huang Reference Huang2007 for a useful overview).
Relevance theorists (e.g. Sperber and Wilson Reference Sperber and Wilson1986, Carston Reference Carston2002) posit one pragmatic principle, that of Relevance, defined in non-Gricean terms. It may be argued, however, that the RT program is itself covertly Manichaean, given that Relevance itself is calibrated as a minimax of effort and effect. In the words of Carston (Reference Carston1990: 231), “Human cognitive activity is driven by the goal of maximizing relevance: that is…to derive as great a range of contextual effects as possible for the least expenditure of effort.”
4.4 What is meant and what is said: the -plicature family
We now return to the perennial dispute over the shape of the landscape of implied meaning. In recent years a partial consensus has formed as to semantic underspecification and pragmatic enrichment, one that breaks with the view inherited from Grice that the pragmatics can be simply “read off” the semantics. When we turn from the relatively straightforward cases of reference fixing and ambiguity resolution acknowledged by Grice himself to the more problematic phenomena of completion, saturation, and free enrichment (cf. Bach Reference Bach2001; Recanati Reference Recanati2001, Reference Recanati2002; Carston Reference Carston2002; and references therein, as well as the relevant chapters in Horn and Ward Reference Horn and Ward2004), it is clear that we must grant what Bach (Reference Bach and Szabó2005) terms the “contextualist platitude”:
Linguistic meaning generally underdetermines speaker meaning. That is, generally what a speaker means in uttering a sentence, even if the sentence is devoid of ambiguity, vagueness or indexicality, goes beyond what the sentence means.
Thus, the speaker uttering the non-bracketed material in each example in (3) may well communicate the full sentences indicated, enriched by the bracketed addenda. As seen from the cancelability evidence in (4), however, this process, resulting in truth-conditionally relevant propositions not directly expressed, is pragmatic in character.
(3)
a. I haven't had breakfast {today}.
b. John and Mary are married {to each other}.
c. They had a baby and they got married {in that order}.
d. Dana is ready {for the exam}.
(4)
a. John and Mary are married, but not to each other.
b. They had a baby and got married, but not necessarily in that order.
Those enrichments constituting necessary conditions for the expression of truth-evaluable propositions involve what Recanati has called saturation and Bach completion. Recanati (Reference Recanati2002) distinguishes the bottom-up processes linguistically triggered by indexicals (I, today) and other expressions requiring saturation from those top-down modulation and free enrichment processes motivated on purely pragmatic grounds. At issue here are, for example, the underspecification of genitives (John's car – the one he owns? is driving? is following? is painting? is repairing?), unspecified comparison sets (Chris is tall – for an adult? for an adult American of the relevant sex?), and various expressions with apparent free variable slots: You are late (for what?), Robin is too short (for what?). For Recanati – contra Stanley Reference Stanley2000 and King and Stanley Reference King, Stanley and Szabó2005 – “unarticulated constituents” are real and cannot be reduced to independently motivated elements in abstract syntax or logical form.
Since Grice, the pragmatic landscape has exploded with aspects of meaning variously identified as conversational implicatures, conventional implicatures, presuppositions, implicitures, and explicatures. These are not simply diverse labels for given subclasses of implication but different ways of mapping the territory between the said and the meant. Situating “what is said” along this spectrum is itself controversial; what is said for Recanati (Reference Recanati2001), Ariel (Reference Ariel2008b), and the relevance theorists is enriched by pragmatically derived material (hence constituting an explicature). Levinson (Reference Levinson2000), on the other hand, responds to the apparent need to allow “pragmatic intrusion” into what is said by allowing conversational implicatures to have truth-conditional consequences for the propositions in question, contra Grice; in cases like (3c) or Deirdre Wilson's aperçu It's better to meet the love of your life and get married than to get married and meet the love of your life, an implicature (“P precedes Q”) can feed into (rather than just being read off) what is said. (See Carston Reference Carston2002 and Benjamin Russell Reference Russell2006 for illuminating discussions of the complexity of conjunction buttressing.)
For orthodox Griceans, the pragmatically enriched proposition in such cases – what is communicated – is distinct from what is said. As we saw in §4.2, an “austere” conception of what is said (to borrow Jenny Saul's phrase; cf. Borg Reference Borg2004, Horn Reference Horn2009b), corresponding closely to the syntax of the sentence uttered and excluding pragmatically derived material, may have more to recommend it than first appears. Further, as Bach (Reference Bach2001) observes, once we abandon “OSOP” (the One Sentence, One Proposition assumption) we can recognize that a sentence may express not only more than one proposition but fewer than one. What is said in Dana is ready constitutes not a truth-evaluable proposition but a propositional radical. Completing such a radical within a given context to yield e.g. Dana is ready to write a dissertation yields not what is said (which is tightly constrained by the actual syntax) or an explicature (since there is nothing explicit about it), but rather an impliciture, a proposition implicit in what is said in a given context as opposed to a true implicature, a proposition read off what is said (or the way it is said). What Grice failed to recognize, argues Bach, is the non-exhaustive nature of the opposition between what is said and what is implicated.
“Scalar ‘implicatures’ are implicatures” is #9 in Bach's hit parade of misconceptions (2006c: 28–9): since a speaker uttering “Some of the boys went to the party” means not two separate things but just one, i.e. that some but not all of them went, this enriched proposition is an implic-i-ture (built up from what is said), not an implicature. But on the Gricean account (Horn Reference Horn1972, Reference Horn1989; Gazdar Reference Gazdar1979; Hirschberg Reference Hirschberg1991), the strong scalar implicature here is “Not all of the boys went to the party”; this combines with what is said (“Some…”) to yield what is communicated (“Some but not all…”). Thus the impliciture includes the scalar implicature rather than supplanting it.12
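The global architecture just described — what is said, enriched by the Q-based negation of stronger scalemates, yields what is communicated — can be sketched in a few lines. The Horn scales listed, the function names, and the string representation of the implicata are all illustrative assumptions of mine, not an implementation from the literature:

```python
# Toy neo-Gricean calculator: a weak scalar item Q-implicates the
# negation of each stronger item on its Horn scale (ordered weak -> strong).
SCALES = [("some", "all"), ("or", "and"), ("possible", "certain")]

def q_implicatures(item):
    """Upper-bounding implicata: 'not <scalemate>' for each stronger scalemate."""
    return [f"not {strong}"
            for scale in SCALES if item in scale
            for strong in scale[scale.index(item) + 1:]]

def communicated(said_item):
    """What is communicated = what is said plus its Q-implicata."""
    return [said_item] + q_implicatures(said_item)

print(communicated("some"))  # → ['some', 'not all']
print(q_implicatures("all"))  # the strongest item implicates nothing: []
```

Note that on this global picture the implicature (“not all”) is computed over the whole utterance and combined with what is said, rather than replacing it — which is precisely Bach's point that the impliciture includes the scalar implicature.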
While Levinson (Reference Levinson2000) defines generalized conversational implicatures as default inferences, others argue that they are neither inferences – an implicature is an aspect of speaker's meaning, not hearer's interpretation13 – nor true defaults. This last point is especially worth stressing in the light of much recent work in experimental pragmatics (see e.g. Noveck and Posada Reference Noveck and Posada2003; Bott and Noveck Reference Bott and Noveck2004; Breheny et al. Reference Breheny, Katsos and Williams2006; Katsos Reference Katsos2008) suggesting that children and adults do not first automatically construct implicature-based enriched meanings for scalar predications and then, when the “default” interpretation is seen to be inconsistent with the local context, undo such meanings and revert to the minimal implicature-free meaning. To the extent that this empirical work on the processing of implicature recovery can be substantiated and extended, this is a very interesting result, but not (contrary to some claims) one that threatens the actual Gricean tradition, which predicts no automatic enrichment or default interpretation. This is clear from the passage distinguishing generalized and particularized implicature (Grice Reference Grice1989: 37, emphases added):
I have so far considered only cases of what I might call ‘particularized conversational implicature’…in which an implicature is carried by saying that p on a particular occasion in virtue of special features of the context, cases in which there is no room for the idea that an implicature of this sort is normally carried by saying that p. But there are cases of generalized conversational implicature. Sometimes one can say that the use of a certain form of words in an utterance would normally (in the absence of special circumstances) carry such-and-such an implicature or type of implicature.
The classic contrast here dates back to Grice Reference Grice1961: §3 – the particularized implicature with the “Gricean letter of recommendation” for a philosophy job candidate (Jones has beautiful handwriting and his English is grammatical) vs the generalized implicature with logical disjunction (My wife is in Oxford or in London, implicating I don't know which). Crucially, an implicature may arise in an unmarked or default context without thereby constituting a default or automatic inference. (See Bezuidenhout Reference Bezuidenhout, Campbell, O’Rourke and Shier2002a; Jaszczolt Reference Jaszczolt2005; and Geurts Reference Geurts2009 for different views on defaults and their relation to implicature.)
Despite their substantial differences (from each other and from Grice) as to the role of implicature and the relation between what is implicated and what is said, the proponents of the approaches touched on above share Grice's commitment to situating implicature within a rationality-based pragmatics. On a competing view that has recently been elaborated by Chierchia (Reference Chierchia and Belletti2004) and his colleagues, scalar implicatures in particular are generated locally as part of the grammar and/or the conventional lexical semantics of weak scalar operators. Support for this variety of defaultism involves an appeal to cases in which the Gricean model appears to yield the wrong results, thus arguing for local computation of “embedded implicatures.” Others (e.g. Sauerland Reference Sauerland2004; Benjamin Russell Reference Russell2006; Horn Reference Horn, Turner and von Heusinger2006) have challenged these conclusions and defended a global account of implicature along Gricean lines. In particular, Geurts (Reference Geurts2009, Reference Geurts2010) provides a broad survey of the landscape. Drawing a distinction between marked L[evinson]-type cases and unmarked C[hierchia]-type cases of putative locality effects, Geurts (Reference Geurts2009) argues that unlike the latter type, the Levinsonian contrast-induced narrowings represent true problems for a classical Gricean (or neo-Gricean) theory of implicature, but shows that these can be handled by allowing upper-bounding to enter into the reinterpretation of what scalar operators express, a reinterpretation that is itself pragmatic in nature. In his treatise on Q-implicatures, Geurts (Reference Geurts2010) argues that the conventionalist alternative to a Gricean approach is not only stipulative but also empirically flawed in predicting the full range of implicature-related results.
4.5 Conventional implicature from Frege to Grice (and beyond)
Alongside the successful conversational implicature model, Grice's category of conventional implicature – a non-cancelable but truth-conditionally transparent component of encoded content – plays the role of ugly stepsister. The coherence of this category has evoked much skepticism: Bach (Reference Bach1999a) consigns it to the dustbin of mythology, Carston (Reference Carston2002: 134) remarks that “there simply is no such thing as ‘conventional’ implicature in relevance theory (or, we would argue, in reality),” while Potts (Reference Potts2005, Reference Potts2007b) rehabilitates it in a different guise. But Grice's account of conventional content that does not affect the truth conditions of the asserted proposition has a rich lineage. Frege (Reference Frege1892, Reference Frege and Hermes1897, Reference Frege1918) delineates a class of meanings that, while of linguistic interest, do not “affect the thought”:
With the sentence “Alfred has still not come” one really says “Alfred has not come” and, at the same time hints [andeutet] that his arrival is expected, but it is only hinted. It cannot be said that, since Alfred's arrival is not expected, the sense of the sentence is therefore false. The word ‘but’ differs from ‘and’ in that with it one intimates [andeutet] that what follows it is in contrast with what would be expected from what preceded it. Such suggestions in speech make no difference to the thought. A sentence can be transformed by changing the verb from active to passive and making the object the subject at the same time…Naturally such transformations are not indifferent in every respect but they do not touch the thought, they do not touch what is true or false. (Frege Reference Frege1918: 295–6)
While recent scholarship largely follows Dummett (Reference Dummett1973) in dismissing Frege's positive proposals in this area as representing a confused and subjective notion of “tone,” this mischaracterizes Frege's actual account of the relevant phenomena. The two verbs in Geach's rendering highlighted above – hint and intimate – both translate Frege's andeuten, i.e. ‘conventionally implicate’; no subjectivity or confusion is involved.
For a range of constructions including discourse particles (but, even, Ger. ja, doch), subject-oriented adverbs, epithets, and other “loaded” words, a version of the approach proposed by Frege and Grice remains eminently plausible (Barker Reference Barker2003; Horn Reference Horn, Kecskes and Horn2007b, Reference Horn2008; Gutzmann Reference Gutzmann2008; Williamson Reference Williamson, Almog and Leonardi2009). Such an approach extends naturally to a range of other linguistic phenomena, including the familiar vs formal second person singular (“T/V”) pronouns of many modern European languages, evidential markers, and arguably the uniqueness/maximality condition on definite descriptions. In addition, certain syntactic constructions can be profitably analyzed along these lines, such as the southern US English “personal dative,” a non-argument pronominal appearing in transitive clauses that obligatorily coindexes the subject, as exemplified in I love me some squid, truth-conditionally equivalent to, but not fully synonymous with, ‘I love squid’ (Horn Reference Horn2008). In each case, we find aspects of conventional content that are not entailed and do not fall inside the scope of logical operators.
The category of conventional implicatures poses a complication for the distinction between what is said and what is meant. Such expressions present a recalcitrant residue for Grice (who was concerned with delineating what is said and what is conversationally, and hence calculably, implicated) as they did for Frege (who was concerned with the thought, i.e. with sense and potential reference); for both, detecting a conventional implicature facilitates the real work by clearing away the brush. But Grice did undertake to situate this relation within what we refer to (though he did not) as the semantics/pragmatics divide. His contributions in this area, if not always accepted, are widely recognized, as in this passage from Davidson (Reference Davidson1986: 161–2):
It does not seem plausible that there is a strict rule fixing the occasions on which we should attach significance to the order in which conjoined sentences appear in a conjunction: the difference between “They got married and had a child” and “They had a child and got married.” Interpreters certainly can make these distinctions. But part of the burden of this paper is that much that they can do should not count as part of their linguistic competence. The contrast in what is meant or implied by the use of “but” instead of “and” seems to me another matter, since no amount of common sense unaccompanied by linguistic lore would enable an interpreter to figure it out. Paul Grice has done more than anyone else to bring these problems to our attention and help to sort them out.
But how, exactly, does this sorting work? If descriptive content, reflecting what is said, is clearly semantic and if what is conversationally implicated (e.g. the “for all I know, not both p and q” upper-bounding implicatum associated with the utterance of the disjunction p or q or the negative effect of the Gricean letter of recommendation) is pragmatic (pace Chierchia Reference Chierchia, Fox, Spector and Maienborn2004, among others), where is conventional implicature located? One standard view is that by falling outside what is said, the conventionally implicated must be pragmatic (see e.g. Gutzmann Reference Gutzmann2008: 59). One argument on this side is terminological; in Kaplan's words (1999: 20–21):
According to Grice's quite plausible analysis of such logical particles as “but”, “nevertheless”, “although”, and “in spite of the fact”, they all have the same descriptive content as “and” and differ only in expressive content…The arguments I will present are meant to show that even accepting Grice's analysis, the logic is affected by the choice of particle…If this is correct, then generations of logic teachers, including myself, have been misleading the youth. Grice sides with the logic teachers, and though he regards the expressive content as conventional and hence (I would say) semantic (as opposed to being a consequence of his conversational maxims), he categorizes it with the maxim-generated implicatures.
To be sure, conventional implicatures are implicatures. But then again, they are conventional; we are indeed dealing here, unlike in the maxim-based cases, with aspects of content.
Two decades after the William James lectures, Grice revisited these categories in his Retrospective Epilogue (1989: 359–65), where he sought to establish central and non-central modes of meaning through the criteria of formality (“whether or not the relevant signification is part of the conventional meaning of the signifying expression”) and dictiveness (“whether or not the relevant signification is part of what the signifying expression says”). Thus, when a speaker says “p; on the other hand, q” in the absence of any intended contrast of any kind between p and q, “one would be inclined to say that a condition conventionally signified by the presence of the phrase ‘on the other hand’ was in fact not realized and so that the speaker had done violence to the conventional meaning of, indeed had misused, the phrase ‘on the other hand’.” Crucially, however, “the nonrealization of this condition would also be regarded as insufficient to falsify the speaker's statement” (Grice Reference Grice1989: 361). Thus, formality without dictiveness yields conventional implicature. (As for dictiveness without formality, a plausible candidate is the pragmatically enriched content of relevance theorists, the truth-conditional pragmatics of Recanati Reference Recanati2001.)
In uttering a given sentence in a given context, the speaker may intentionally communicate more than one truth-evaluable proposition, but these communicated propositions do not necessarily have equal status; in They are poor but happy, the conjunctive content is truth-conditionally relevant while the contrastive content is not. Yet but and and are not synonyms. Conventional implicatures constitute part of encoded content but not part of truth-conditional content per se; their falsity does not project as falsity of the expression to which they contribute (cf. Barker Reference Barker2003). What they contribute is use-conditional meaning (Kaplan Reference Kaplan1999, Gutzmann Reference Gutzmann2008).
Besides detachability and non-cancelability (Grice Reference Grice1961, Reference Grice1989), additional diagnostics for conventional implicatures include their tendency to project out of embedded contexts, their immunity to certain kinds of objection, and their contextual variability or descriptive ineffability (Potts Reference Potts2007b, Horn Reference Horn2008). The last property is the difficulty of pinning down the precise contribution of but (contrast? unexpectedness?) or even (relative or absolute? unlikelihood or noteworthiness?), or of the intention prompting the use of a second person “familiar” (T[u]) or “formal” (V[ous]) pronoun (T can be affectionate, presumptuous, comradely, or condescending; V can be polite, aloof, diplomatic, or hostile). Ineffability has a plausible source: modulo the well-known problems associated with vagueness, the edges of truth-conditional meaning should for the most part remain discrete, while inconsistency in the mental representation of non-truth-conditionally relevant content is less pernicious. If you know that my use of vous rather than tu signals some aspect of formal respect, distancing, or lack of intimacy, my precise motives can be left underdetermined, but if you don't know whether I’m using a second person or third person pronoun, the indeterminacy is more costly.
For Frege and Grice, identifying the class of conventional implicature-licensing constructions – scalar particles, speaker-oriented sentence adverbs, epithets and slurs, prosodic features, evidential markers, “affected” pronominals, word order effects – serves to characterize them in terms of what they are not: they do not affect the thought or the truth-conditionally relevant meaning of a given expression, and at the same time they are not derivable from general principles of rationality. While they share the former property with conversational implicatures, they differ crucially from them in the latter respect.
4.6 Implication and speaker meaning
Whether conversationally or conventionally triggered, implicatures are generally understood to constitute a proper subset of speaker-meant implications. As we have seen, the assimilation of implicature to inference has been deplored as an instance of the imply/infer confusion. But just as the latter turns out to be more complex than meets the eye, so too the subsumption of conversational implicature within the category of speaker meaning is not entirely straightforward. Saul (Reference Saul2002a) has argued that the full range of Grice's remarks on the topic suggests that we need to allow for a category of audience-implicature along with the traditional utterer-implicature. Others (e.g. Bach Reference Bach2001, Reference Bach2006c), while acknowledging the role of the speaker's expectations about the inferences that the hearer can reasonably be expected to draw, defend the classical view that implicatures constitute part of speaker meaning.
In this respect, it is worth touching on evidence that imply in ordinary language use is not always definable in terms of speaker intention. A Google search on the string didn't mean to imply returns 929 valid hits (retrieved 16 June 2010) – “The President didn't mean to imply that AARP supports current health care legislation”, “Bolivian President Evo Morales Didn't Mean to Imply Straight Guys Go Gay Over Chicken”, etc. – with the sense of ‘x didn't intend in saying p that y should infer q’. So we can unintentionally imply propositions, whether or not we can unintentionally implicate them. This is reinforced by the 785 Google hits instantiating unintentionally imply/implied – “I’m sorry if I unintentionally implied here anything that touched your feelings”, “I realized that I might have unintentionally implied that no Christian would feel right about attending a pagan ceremony” – and the 705 hits for accidentally imply/implied, including “When you accidentally imply she's fat – send chocolates instead” and “Did you know you're supposed to say ‘best wishes’ to the bride instead of ‘congratulations’, lest you accidentally imply that the bride won her groom through trickery or deceit? Like she most likely did. But still.”14
Ideally, imply and infer would map respectively into what the speaker intends and what the hearer grasps. Most likely, that would make for a simpler pragmatics. But still.
I am grateful to Barbara Abbott, Keith Allan, Kent Bach, Betty Birner, Bart Geurts, Kasia Jaszczolt, Nicole Palffy-Muhoray, and Gregory Ward for their helpful comments and pointers; the usual disclaimers apply.
5 Speaker intentions and intentionality
5.1 Introduction
Speaker intentions made their way into contemporary pragmatics through three different but interrelated routes. One of them dates back at least to medieval philosophy and the inquiries into the logic of modal contexts, leading to the study of intentionality. Another begins with ordinary language philosophy of the mid-1950s and the attempts to define meaning through language use, which in turn led to the employment of the concept of the speaker's intended effect of an act of communication. The third, and arguably the most influential, route was that of the attempts to rescue formal semantic analyses by employing a concept of meaning that would incorporate not only the truth-conditional content but also the intended implicated messages, forming the overall concept of communicated content. This introduction to intentionality and intentions is structured as follows. In section 5.2, we present the philosophical idea of intentionality and explain the relation between the intentionality of mental states and linguistic intentionality conveyed through acts of communication. In section 5.3 we move to the role of intentions in communication, focusing on the second and third routes mentioned above, starting with Grice's and Searle's views, and also attempting a typology, explanation and exemplification of the various kinds of speaker intentions distinguished in the literature. Finally, we discuss the question as to where intentions are located, contrasting cognitive, interactional and discursive perspectives on this issue, and conclude in section 5.4 with a brief assessment of the advantages and weaknesses of utilising intentions and intentionality in pragmatic theory.
The first question to ask is why linguists would want to appeal to a concept as murky as intentions and award it the status of an explanandum in pragmatic theory. The main reason is that the possession of the concept of intentional action is crucial for understanding human behaviour. As Anscombe (Reference Anscombe1957: 83) says in her seminal book Intention,
there are many descriptions of happenings which are directly dependent on our possessing the form of description of intentional actions. It is easy not to notice this, because it is perfectly possible for some of these descriptions to be of what is done unintentionally. For example ‘offending someone’; one can do this unintentionally, but there would be no such thing if it were never the description of an intentional action.
The notions of intention and intentionality have since been deployed in a multitude of ways in explaining speaker meaning (Grice Reference Grice1957, Reference Grice1989), speech acts (Searle Reference Searle1969, Reference Searle1983), the development of language (Searle Reference Searle, Kecskes and Horn2007), social development (Tomasello et al. Reference Tomasello, Carpenter, Call, Behne and Moll2005), and the cognitive processes underlying action and meaning interpretation (Bara Reference Bara2010; Pacherie Reference Pacherie, Pockett, Banks and Gallagher2006, Reference Pacherie2008), to name just a few. However, this proliferation has generated challenges for the conceptual and theoretical status of intentionality and intention in pragmatics, as we will discuss.
5.2 Intentionality
Intentionality has a long and celebrated tradition in philosophy. Coming from the Latin term intendere, meaning aiming in a certain direction, directing thoughts to something, on the analogy of drawing a bow at a target, it has been used to name the property of minds of having content, aboutness, being about something (Duranti Reference Duranti1999; Harland Reference Harland1993; Lyons Reference Lyons1995; Jacob Reference Jacob2003; Nuyts Reference Nuyts, Verschueren, Östman, Blommaert and Bulcaen2000; Smith, D. W. Reference Smith2008; Jaszczolt Reference Jaszczolt1999; Woodfield Reference Woodfield and Asher1994). In other words, it means the ability of minds to represent objects, properties or states of affairs. In medieval philosophy, forms of beings consisted of so-called esse naturale, or natural objects, and esse intentionale, or concepts, mental images or thoughts. It is the latter that we are interested in here.
The very idea of intentionality dates back to ancient philosophy, as far back as the fifth century BCE and Parmenides of Elea. It was taken up by Aristotle and the Stoics (Caston Reference Caston2007), and was subsequently extensively used in medieval doctrines of knowledge and revived in nineteenth-century phenomenology by Brentano (Reference Brentano1874) and Husserl (Reference Husserl1900–1901), and later by Meinong, Twardowski, Heidegger, Sartre, Merleau-Ponty and many others. By phenomenology we mean the study of the way in which things (phenomena) are presented in consciousness, or generally the study of forms of conscious experience from the first-person point of view. Mental attitudes such as belief, desire or want are intentional in that they are about something, they have an object. For phenomenologists, things exist as physical objects, but they also have an intentional existence, so to speak, in acts of consciousness. Their intentional existence is revealed in our mental states or acts, as for example in (1) or (2).
(1) I am thinking about my holiday in Australia.
(2) I hope to meet Peter Carey one day.
In the case of hallucinations, the existence is only intentional.
Brentano and Husserl developed intricate arguments concerning the meaning of such mental existence, in particular the question as to whether there are intentional objects that are internal to the thinker's mind. This discussion, albeit fundamental in the study of phenomenology, can only be sketched here. For Brentano, objects of conscious mental acts had the status of mental entities. On this view, the act does not consist of a relation between its subject and an intersubjectively identifiable object but rather can be spelled out in a so-called adverbial theory: ‘Tom sees a horse’ amounts to a property of Tom's ‘seeing horsely’, so to speak (see Smith and Smith Reference Smith, Smith, Smith and Smith1995). However, the adverbial theory comes with a considerable limitation. It does not provide for the fact that our acts of consciousness are about things in the world, real world properties, or states of affairs. It was therefore subsequently replaced with the relational theory by Twardowski and Meinong. In the next logical step, Husserl rejected mental objects tout court. Current theories of intentionality adopt such a relational view.
Now, this is not to say that where the real object does not exist, ‘made-up’ substitutes are never needed in a theory of natural language meaning. To give one pertinent example, Clapp (Reference Clapp2009) discusses so-called negative existentials, i.e. sentences of the type in (3) and (4) (from Clapp Reference Clapp2009: 1422).
(3) The Loch Ness monster does not exist.
(4) Nessie does not exist.
As he puts it, the problem with negative existentials is that they presuppose existence and immediately deny it. We can, of course, stipulate that there is something real to which one refers in (3) and (4), à la Meinong, or say that there is no existential presupposition there, à la Russell. On a modern version of classical phenomenology, mental objects are intentional, mind-independent entities; following McGinn (Reference McGinn2000), we can introduce the theoretical construct of representation-dependent, intentional objects. Then we replace the dichotomy ‘existent – non-existent’ with a new dichotomy ‘existent – intentional’ and thereby vindicate the latter as a theoretical construct.
Intentional content has been the subject of discussions between so-called neo-Fregeans and neo-Russellians (Recanati Reference Recanati1993; Siewert Reference Siewert2006). In the Anglo-American tradition in semantics and pragmatics we often think of the analytic tradition as separate from continental European psychologism with its emphasis on consciousness, mental states, first-person psychology and the like. However, it is worth remembering that the most ground-breaking achievements in phenomenology and in analytic philosophy took place at the same time, at the end of the nineteenth century and the beginning of the twentieth century, and that they are much more interrelated than they are usually taken to be. Frege's (Reference Frege1892) notion of sense, whose offshoots have been so influential in current semantic theory of the Anglo-American tradition, is a direct predecessor of Husserl's phenomenological theory of meaning and object. We briefly attend to this interrelation in what follows.
Husserl held a series of successive views of meaning, culminating in the view in his Ideen (1913) according to which meaning is contained in the objective content of consciousness, called the noema. An act of consciousness is directed at an object determined by noemata. Here Frege's concept of sense can be seen as a clear precursor of Husserl's idea of meaning as distinguished from objects, and hence subsequently also of his notion of noema. However, their respective distinctions between meaning and object, and between sense and reference, led them in very different directions. Whereas Husserl and other phenomenologists concentrated on experience, Frege and analytic philosophers concentrated on developing a theory of meaning that makes use of intersubjective generalisations over ‘meaning-giving mental acts’ such as Frege's sense.
But this departure from psychologism does not reflect a sudden split. It happened gradually, through various reanalyses of intentionality and the subsequent ascription of intentionality to linguistic acts. So, the question to ask is how the theory of meaning became dissociated from Husserlian meaning as noema, that is, from meaning attached to mental acts, discussed above. First of all, for Husserl of his Ideen period (1913), it was language, rather than a particular mental act of a language user as had been claimed in earlier phenomenological accounts, that was the carrier of meaning. Already in Husserl's work of this period meaning acquires the status of an abstraction, noematic meaning, attached to relevant mental acts. It is but one step from there to freeing the theory of meaning from the constraints of psychology, and this step was taken by Frege in his battle against the ‘corrupting intrusion’ of psychology into logic and mathematics, and thereby subsequently into natural language meaning.
Frege's sense (Sinn), as contrasted with reference (Bedeutung), saved formal semantics from the problem of the substitutivity of coreferential expressions. For example, since it is true that Hilary Mantel is the author of Wolf Hall, it should be possible to substitute ‘Hilary Mantel’ for ‘the author of Wolf Hall’ in (5), as in (6), preserving the truth of the sentence.
(5) The author of Wolf Hall is visiting Cambridge this spring.
(6) Hilary Mantel is visiting Cambridge this spring.
However, although Hilary Mantel is identical to the author of Wolf Hall, sentence (7) differs substantially from (8); (7) is informative while (8) is not.
(7) Hilary Mantel is the author of Wolf Hall.
(8) Hilary Mantel is Hilary Mantel.
These two ways of referring to the winner of the 2009 Man Booker Prize come with different ways of thinking, ways of understanding or, to use the celebrated phrase, different modes of presentation of the referent. The senses we grasp when we understand these two expressions are different. Now, when we embed (7) and (8) in contexts expressing mental attitudes, using intentional verbs such as ‘believe’ or ‘doubt’, Frege's concept of sense acquires a new importance for semantic theory. In (9) and (10), to use Frege's own explanation, the sense of ‘Hilary Mantel’ plays the role of the referent.
(9) Harry believes that Hilary Mantel is the author of Wolf Hall.
(10) Harry believes that Hilary Mantel is Hilary Mantel.
Analogously, in (11), the sense of the definite description ‘the author of Wolf Hall’ plays the role of the referent. The description refers not to its customary reference, but instead to the sense. Sense, however, is more than the speaker's way of thinking; it is an intersubjective way of thinking. It can function as the speaker's own mode of presentation of the referent, but equally it can function as someone else's way of thinking about this referent.
(11) Harry believes that the author of Wolf Hall is a man.
This objectivity of sense shifted intentionality from the domain of the individual and the mental to the domain of the external. For Kripke (Reference Kripke1972) and Putnam (Reference Putnam1975), proper names and natural kind words acquire reference through the causal link with the world. The topic of the externalist/internalist debate, albeit very important, is tangential to the present discussion of intentionality vis-à-vis intentions and will not be pursued here. Instead, we will now focus on one aspect of the debate, namely the role of intentionality in the theory of meaning.
Let us now return to the topic of the ‘antipsychologism’ of analytic philosophy. Frege can be credited with developing a new concept of logic. In his Begriffsschrift (Frege Reference Frege and Bynum1879), he developed an analysis of the logical form of sentences in terms of predicates and arguments, where the reference of a predicate is a function from objects to truth values. This development marked the end of the era of psychological logic that studied thought processes and subjective mental representations. According to Frege, the object of study of logic is not the agent who uses the rules of inference, but the rules of logical inference themselves, and thereby the languages of logic themselves. In Grundlagen der Arithmetik, Frege (Reference Frege1884) offers a new, depersonalised stance on truth, definitions, logic, mathematics and indirectly on natural language semantics. He says that ‘[t]here must be a sharp separation of the psychological from the logical, the subjective from the objective’ (p. 90). The ways of thinking about objects must not be confused with the objects themselves. Instead of focusing on someone's ‘thinking that something is true or valid’, we now focus on truth and validity as such. To give further examples, in his Grundgesetze der Arithmetik (Frege Reference Frege1893: 202), he calls the effect of psychology on logic a ‘corrupting intrusion’ because it has to be emphasised that ‘being true is quite different from being held as true’. In Logic, he stresses that ‘Logic is concerned with the laws of truth, not with the laws of holding something to be true, not with the question of how people think, but with the question of how they must think if they are not to miss the truth’ (Frege Reference Frege and Hermes1897: 250).
This ban on psychologism has never been lifted since.1 Frege's function/argument analysis, juxtaposed with Tarski's semantic definition of truth, led to the development of the formal semantics of natural languages. Now, this rejection of psychology can easily be taken to suggest that the very idea of intentionality had to be thrown out with the bathwater as well. After all, what we have here is Frege's forceful rebuttal of Husserl's stance on logic as a study of mental processes.2 As can be seen in the following section, however, intentionality need not necessarily pertain to mental states, and it proves to be a useful theoretical tool that can easily be dissociated from ‘psychologising’.
The follow-up question to ask is whether intentionality is a property of mental states only, or whether it can also be construed as a property of other objects. The answer gleaned from subsequent theorising is definitely positive. For the current purpose, however, we will not be concerned with the intentionality of human organs and limbs, or intentionality of information-processing systems discussed by Millikan and Dretske (see e.g. Lyons Reference Lyons1995; Jacob Reference Jacob2003). Neither will we be interested in Fodor's (e.g. Reference Fodor1975, Reference Fodor1981) discussion of intentionality as a feature of the brain. What we will focus on is the intentionality of linguistic expressions.
For Searle (Reference Searle1983, Reference Searle1992b), our beliefs and intentions have intrinsic, basic intentionality, while linguistic expressions have derived intentionality in the sense that the meaning of acts of speech can be analysed in terms of intentional states, such as belief or intention. In other words, Searle says that the mind ‘imposes’ intentionality, so to speak, on linguistic expressions in that the basic intention to represent is responsible for the derived intention to communicate. The intentionality of the mental state that underlies the act of communication bestows on that act so-called conditions of satisfaction. In brief, beliefs have intrinsic intentionality, while utterances have derived intentionality (Searle Reference Searle, Lepore and van Gulick1991: 84), or ‘I impose Intentionality on my utterances by intentionally conferring on them certain conditions of satisfaction which are the conditions of satisfaction of certain psychological states’ (Searle Reference Searle1983: 28).
What Searle proposes here is a so-called double level of intentionality. Mental states such as beliefs, wishes or hopes impose conditions of satisfaction on the expressions of these states. These conditions, in turn, play a major role in determining the meaning of the linguistic expressions. For example, a request may inherit conditions of satisfaction from the speaker's wish or desire. An important question arises at this point, namely what exactly it means for one object to impose, or confer, conditions of satisfaction on another. How can one transfer them from a wish, a mental state, to a request, a linguistic object? Harnish (Reference Harnish and Burkhardt1990: 189) points out that an intention to confer them does not suffice, because there must be a constraint that one cannot intend and fail. Furthermore, there must be a restriction on the types of objects from and to which intentionality can be transferred. As was proposed in Jaszczolt (Reference Jaszczolt1999: 106), we should divert our attention from the conferment as such and direct it instead to the fact that the same conditions of satisfaction pertain to the mental and to the linguistic. Once we pose the problem in this way, what remains is to take a stance on the relation between the mental and the linguistic, perhaps arguing as follows. Since the double level is supposed to characterise speech acts themselves rather than the relation between a speech act on the one hand and a mental state on the other, we should think of speech acts as having both basic intentionality, qua externalisations of mental states, and derived intentionality, qua linguistic objects. Since language is one of the vehicles of mental states, this seems a natural way of explaining the mysterious and rather unfortunately named ‘double level’ of intentionality.3 By this reasoning, intentionality is always an intrinsic rather than a ‘conferred’ property.
Next, just as intentionality is a property of linguistic acts, so having intentions is a property of the agents who perform them. The latter is the topic to which we now turn.
5.3 Intentions in communication
5.3.1 Intentions and inferences
Intentions and the study of language communication have long and intertwined traditions. For John Locke, language is there to fulfil the intention of expressing the thoughts of those who hold them. In a similar vein, for Grice, the concept of meaning is founded on what is communicated, intentionally, by the speaker. Analysing sentence meaning takes us only part of the way; without invoking the communicative intention the analysis is incomplete. In order to present the principles on which his theory of meaning is founded, we have to begin with Grice's seminal paper ‘Meaning’ (Grice Reference Grice1957). Firstly, he points out the difference between so-called natural meaning and non-natural meaning. When ‘x means that p’ entails that p is the case, we have an instance of natural meaning. (12) is an example of natural meaning in that the symptom and the disease are linked through a natural connection.
(12) These red spots mean meningitis.
This kind of meaning is of no interest to a linguist. Instead, linguists should focus on speaker's meaning or non-natural meaning, also known as meaning-nn. Meaning-nn is conventional, not characterised by the relation of entailment discussed above, and it is this kind of meaning that is the object of study in Gricean and post-Gricean pragmatics. Speaker's intentions are crucial for defining meaning-nn, and so is, to a greater or lesser degree depending on the approach, the recognition of speaker's intentions by the addressee. Grice utilises intentions in the definition as follows:
‘A meant-nn something by x’ is roughly equivalent to ‘A uttered x with the intention of inducing a belief by means of the recognition of this intention’. (Grice Reference Grice1957 [1989]: 219)
and elaborates further in ‘Utterer's Meaning and Intentions’ (Grice Reference Grice1969 [1989]: 92):
‘U meant something by uttering x’ is true iff [if and only if], for some audience A, U uttered x intending:
[1] A to produce a particular response r
[2] A to think (recognize) that U intends [1]
[3] A to fulfill [1] on the basis of his fulfillment of [2].
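The embedded structure of these three clauses can be made explicit in a schematic logical notation. The rendering below is our own gloss, not Grice's formalism; $\mathrm{Int}_U(\phi)$ abbreviates ‘$U$ intends that $\phi$’:

```latex
\begin{aligned}
&U \text{ meant something by uttering } x \iff
  \text{for some audience } A \text{ and response } r,\ U \text{ uttered } x \text{ with}\\
&\qquad i_1 = \mathrm{Int}_U(A \text{ produces } r),\\
&\qquad i_2 = \mathrm{Int}_U(A \text{ recognises } i_1),\\
&\qquad i_3 = \mathrm{Int}_U(A \text{ fulfils } i_1 \text{ on the basis of fulfilling } i_2).
\end{aligned}
```

On this rendering the reflexivity of the communicative intention is visible at a glance: $i_2$ and $i_3$ each take a lower-order intention as part of their own content.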
This definitional role of intention was also essential in the speech-act literature of that period (cf. Austin Reference Austin1962; Searle Reference Searle1969).
The reliance of pragmatic theory on intentions does not mean, however, a return to the psychologism addressed in section 5.2. Making a definitional use of them need not lead to the inclusion of a theory of mental processes in pragmatics. Further developments in the post-Gricean tradition testify to a free choice here: some pragmaticists stayed close to Grice's spirit and upheld the antipsychological stance (e.g. Levinson Reference Levinson2000), focusing on general principles of rational action, but as a matter of methodological assumption, without an investigation of the psychological processes underlying linguistic communication,4 while others ventured into cognitive science and discussions of inferential processes that lead to the intended or recovered utterance meaning (Sperber and Wilson Reference Sperber and Wilson1986; Recanati Reference Recanati2004a; Jaszczolt Reference Jaszczolt2005). It has to be remembered at this juncture that for Grice the recognition of the speaker's intentions need not always mean conscious and laborious processing. The recovery of the intention can be ‘short-circuited’, so to speak, when the meaning is conventionalised in a language and the conventions create a ‘shortcut’ through the recognition of the intentions. It can also be short-circuited when the intended content can be presumed in the particular context. Default meanings of the first type originated in Grice's concept of the generalised conversational implicature and have been developed further in the theory of presumptive meanings (Levinson Reference Levinson2000). Default Semantics (Jaszczolt Reference Jaszczolt2005, Reference Jaszczolt and Cummings2010c) accounts for both types.5 Some examples are given in (13)–(17), with the convention-driven or context-driven defaults in (13a)–(17a).
(13) Some people like jazz.
(13a) Not all people like jazz.
(14) It is possible that soon all cars will run on electricity.
(14a) It is not certain that soon all cars will run on electricity.
(15) A secretary brought us coffee.
(15a) A female secretary brought us coffee.
(16) A Botticelli was stolen from the Uffizi.
(16a) A painting by Botticelli was stolen from the Uffizi Gallery in Florence.
(17) Kate and Leonardo performed superbly in Revolutionary Road.
(17a) Kate Winslet and Leonardo DiCaprio performed superbly in the film Revolutionary Road.
All in all, the possibility of such a non-inferential uptake notwithstanding, it is evident that Gricean accounts explain communication in terms of intentions and inferences. An intention to inform the addressee is fulfilled simply by the recognition of this intention by the addressee. We devote more attention to the types of intentions in communication in section 5.3.2, and the range of inferences underlying communication in section 5.3.3.
Meaning-nn, explained in terms of intentions and inferences, provided the foundations for Grice's theory of cooperative conversational behaviour and thereby his theory of implicature. According to this theory, interlocutors are rational agents whose behaviour is governed by the Cooperative Principle: ‘Make your conversational contribution such as is required, at the stage at which it occurs, by the accepted purpose or direction of the talk exchange in which you are engaged’ (Grice Reference Grice, Cole and Morgan1975 [1989]: 26).6 Implicatures are normally considered to be meanings intended by the speaker and recovered thanks to this rationality assumption. To use Horn's (Reference Horn, Horn and Ward2004: 6) apt dictum, ‘Speakers implicate, hearers infer’ (but see Horn, this volume). However, in literary texts, for example in poetry, the writer may intentionally leave the choice of possible implicatures open for the reader. And even in interaction, speakers may be taken to be implying something that is contrary to their intention (so-called ‘unintended implicature’) (Cummings Reference Cummings2005: 20–21; Haugh Reference Haugh2008b), or may intentionally leave the interpretation of what has been said or implied open to the hearer (Clark, H. Reference Clark1997; Jaszczolt Reference Jaszczolt1999: 85; Haugh Reference Haugh, Davies, Haugh and Merrison2011). It has to be pointed out that originally, in Grice's writings, the term ‘implicature’ referred to the process of intentionally conveying some meaning in addition to what is said (and thus was restricted to the speaker), whilst the product of this process was called an ‘implicatum’ (based on inferences by either the speaker or the hearer). Gradually, however, the term ‘implicature’ took over to serve both functions, obscuring the distinction between ‘utterer-implicature’ and ‘audience-implicature’ (Saul Reference Saul2002a; see also Bach, this volume, and Horn, this volume).
It is evident that implicatures are pragmatic constructs through and through. They pertain to communicated thoughts and need not, or according to some post-Griceans must not, have direct counterparts in uttered sentences. They stand for inferred meanings and one of their most interesting characteristics is their cancellability. For example, we can always cancel the implicature in (15a) as in (15b).
(15b) A secretary brought us coffee. He was smartly dressed and moved swiftly.
This property is particularly interesting in that it clearly allows for gradation. In post-Gricean pragmatics, one of the main bones of contention has been the delimitation of what is said: contextualists, such as Sperber and Wilson (e.g. Reference Sperber and Wilson1995b) or Recanati (e.g. Reference Recanati and Davis1989, Reference Recanati2004a), provide for the development of the logical form of the sentence until it represents the content understood to be the main message communicated by this utterance. Such an extended unit is not, however, dubbed an implicature; it is an explicature or what is said, respectively for these authors. Implicatures have to pertain to separate, additional thoughts with their own pragmatic force (Haugh 2002: 128–30). Default Semantics, a more radical contextualist post-Gricean approach, lifts this restriction on what is said by the utterance and allows for the main intended message, or primary meaning, to be independent from the logical form of the uttered sentence. This is where the interesting property of cancellability comes into the picture. When we try to define implicatures qua separate thoughts as cancellable meanings, while treating what is said qua enriched, modulated, extended sentence meaning as non-cancellable, we soon realise we are barking up the wrong tree and are forced to retreat all the way to Grice (Reference Grice and Cole1978). Implicatures can be dubbed cancellable only if we maintain Grice's original all-encompassing definition of conversational implicature,7 which also pertains to cases of the development of the sentence meaning such as (18a).
(18) Sue took a sharp knife and chopped the onions.
(18a) Sue took a sharp knife and then chopped the onions.
When we allow such cases of sentence embellishment to be classified as what is said or what is explicit, leaving the term implicature only for instances of separate thoughts communicated by means of the uttered sentence, the property of cancellability becomes much less clear-cut than it was on Grice's original account. In theory, what we would want in order to strengthen the rationale for the what is said/what is implicit distinction drawn in this way is for the property of cancellability to apply to implicatures but not to what is said. This, however, is not the case. It is not the ‘enriched said’ vs implicit distinction that marks the boundary between cancellable and non-cancellable meanings; rather, when our aim remains Gricean meaning-nn, we have to acknowledge that cancellability is a gradable property, dependent on the speaker's intentions. When the explicit content corresponds to the main speaker intention, cancellation is unlikely; likewise, when one of the implicit contents corresponds to the main intended meaning, this implicature is also entrenched. All in all, we have either to remain minimalist about semantics, construing what is said as sentence meaning tout court (Borg Reference Borg2004; Cappelen and Lepore Reference Cappelen and Lepore2005a) and denying it the property of cancellability, or to be contextualist about meaning and tie cancellability to the strength of intending, disregarding the explicit/implicit distinction.8
5.3.2 Types of intentions
The intention on which Grice's theory of meaning-nn is founded can be called the communicative intention: an intention to communicate certain content to the audience. It is fulfilled by its recognition. Bach and Harnish (Reference Bach and Harnish1979: 7) call it an illocutionary-communicative intention and found it on the so-called communicative presumption: an assumption that when the speaker utters something to the addressee, the speaker is doing so with an illocutionary intention. Potentially ambiguous utterances result in unambiguous acts of communication thanks to the recognition of the speaker's intentions. Analogously, sentences with indexical terms result in referentially complete utterances because of the recognition of the speaker's referential intention. Referential intention is understood here as part of the overall communicative intention.
The inherent reflexivity of the communicative intention is what, according to Levinson (Reference Levinson2006a: 87), ‘makes open-ended communication, communication beyond a small fixed repertoire of signals’. There is some debate, however, around whether the communicative intention involves one or two degrees of reflexivity.9 Grice's original formulation of communicative intention involved two levels of reflexivity: a first-order intention (to intend to inform or represent something) embedded in a second-order intention (to intend that this first-order intention be recognised by the hearer), which was further embedded in a third-order intention (to intend that the recognition of the first-order intention be based on the hearer recognising the speaker's second-order intention). The utility of this third-order intention, however, has been disputed (Bara Reference Bara2010: 82–3). Searle (Reference Searle, Kecskes and Horn2007: 14), for instance, argues that Grice's third-order intention ties meaning-nn to perlocutionary effects, thereby confounding meaning with ‘successful’ communication.
Communicative intention has subsequently been reanalysed as involving only two kinds of intentions, the latter embedded in the former: the communicative intention and the informative intention. The communicative intention in this view consists of making it obvious to the addressee that the speaker has an intention to inform him/her about something (Sperber and Wilson Reference Sperber and Wilson1986: 61). An informative intention can remain covert when it is not issued as part of the communicative intention. In other words, A may let B know something without letting B know that A wants B to know it. Puzzles raised by examples of situations contrived to demonstrate communication through hidden intentions, and their implications for the definition of meaning-nn, have exercised many philosophers, and it has been proposed that ‘manipulative intentions’ may underlie informative and communicative language use in some instances (Németh Reference Németh2008).
Searle's (Reference Searle1983) work on intentionality introduced a further distinction between prior intentions and intention-in-action, the latter referring to ‘the proximal cause of the physiological chain leading to overt behaviour’ (Ciaramidaro et al. Reference Ciaramidaro, Adenzato, Enrici, Erk, Pia, Bara and Walter2007: 3106; see also Pacherie Reference Pacherie2000: 403). The different types of intention said to underlie communication have subsequently proliferated, as intention has been variously used to refer to ‘action planning and representation, goal-directedness and action control’ (Becchio and Bertone Reference Becchio, Bertone and Cummings2010: 226), thereby encompassing a continuum from states of mind to actions (Pacherie Reference Pacherie and Nadel2003: 599). In the remainder of this section we concentrate on exploring different types of prior intentions, rather than action intentions, as it is the former that are arguably the most relevant to the analysis of (speaker) meaning in pragmatics.
The notion of prior intention was initially proposed by Searle (Reference Searle1983: 165–6) to encompass the communicative intention (or what he prefers to call the meaning intention, encompassing a first-order representation intention and a second-order communication intention). However, subsequent work has indicated that there is a range of different types of prior intentions, including not only communicative and informative/representation intentions, but also future-directed/higher-order intentions, private intentions and we-intentions. The latter are ontologically ambiguous, however, as to whether they are actually, in practice, prior intentions, or are better characterised as (post-facto) ‘emergent’ intentions.
The first distinction made in relation to prior intentions is between those which are present-directed (or proximal) and those which are future-directed (or distal) (Bratman Reference Bratman1987, Reference Bratman1999; Ciaramidaro et al. Reference Ciaramidaro, Adenzato, Enrici, Erk, Pia, Bara and Walter2007: 3106).10 While the communicative intention is essentially present-directed, being used to account for speaker meanings at the utterance level, it has become apparent that higher-order intentions, controlling ‘whole segments of dialogue’ (Tirassa Reference Tirassa1999) or the planning of activity types, including long-term goals (Bratman Reference Bratman1999), may also be relevant to speaker meaning.
The analytical import of considering higher-order intentions is underscored in Ruhi's (Reference Ruhi2007) analysis of compliment responses. In the excerpt below, a compliment (you're a good cook) is deployed by one family member to another in order to imply a request (that the receiver of the compliment cook again for guests).
(19a)
1 Aysun: Ayhan! I didn't know you were so skilled [at cooking] darling
2 Ayhan: Go on canım! You continue thinking I'm inept
(Ruhi Reference Ruhi2007: 138)11
Ayhan responds to Aysun's positive assessment of his cooking in line 1 by questioning the sincerity condition (line 2), thereby displaying uptake of Aysun's communicative intention to compliment him. However, Ayhan's response to the compliment also ‘metarepresents the higher-order intention of the C[ompliment] as a request’ (Ruhi Reference Ruhi2007: 139), as by disputing Aysun's claim that she thinks Ayhan is skilled in cooking, Ayhan forestalls the implication that Aysun wants Ayhan to cook for the guests (ibid: 138–9). In other words, Ayhan undermines the legitimacy of Aysun's implied request that he cook for guests by challenging one of its preparatory conditions (i.e. he can cook well).
This analysis is confirmed in the subsequent turns, reproduced below, where it becomes obvious that Ayhan was attempting to pre-empt any future requests for him to cook again.
3 Aysun: How can that be! From now on we'll discover your talents every time we have guests for dinner
4 Ayhan: Don't exaggerate Aysun! This was just a one‐off
(Ruhi Reference Ruhi2007: 138).
Ruhi goes on to argue that such an example challenges the view that meanings can be analysed solely in terms of (communicative) intentions, and suggests that meanings and actions are ‘hierarchically controlled by higher-order intentions that affect single speech act production and the unfolding of discourse’ (Ruhi Reference Ruhi2007: 139).
A second distinction made is that between private intentions (involving the representation of a goal that involves only the speaker) and social intentions (involving the representation of a social goal, where the speaker's goal involves at least one other person in addition to the speaker). While it might be assumed at first glance that only prior intentions involving the representation of social goals could be relevant to communication, it is evident that private intentions may also become salient in some cases, for example, when inferences about private intentions (e.g. the hearer reaching into her purse to pay for her coffee) may enter into a context salient to formulating a social intention (e.g. the speaker offering to pay for the hearer's coffee).
Finally, the argument that certain joint, cooperative activities, including communication, cannot be straightforwardly reduced to individual intentions has led to the introduction of the notion of we-intention into this increasingly complex landscape (Searle Reference Searle, Cohen, Morgan and Pollack1990). One key question around the notion of we-intentions (sometimes also called shared or joint intentions) is whether they can be reduced to individual intentions supplemented with mutual beliefs (Bratman Reference Bratman1992, Reference Bratman1993, Reference Bratman1999; Pacherie Reference Pacherie, Penco, Beaney and Vignolo2007; Tuomela and Miller Reference Tuomela and Miller1988), or whether they constitute a primitive form of intentionality that cannot be reduced to individual intentions (Becchio and Bertone Reference Becchio and Bertone2004: 126; Searle Reference Searle, Cohen, Morgan and Pollack1990; Tuomela Reference Tuomela2005).
According to Bratman's (Reference Bratman1992, Reference Bratman1993) account, which is defended by Pacherie (Reference Pacherie, Penco, Beaney and Vignolo2007), shared intention can be explicated in terms of individual intentions in the following way:
Where J is a cooperatively neutral joint-act type, our J-ing is a shared cooperative activity only if:
(1)(a)(i) I intend that we J.
(1)(a)(ii) I intend that we J in accordance with and because of meshing subplans of (1) (a) (i) and (1) (b) (i).
(1)(b)(i) You intend that we J.
(1)(b)(ii) You intend that we J in accordance with and because of meshing subplans of (1) (a) (i) and (1) (b) (i).
(1)(c) The intentions in (1)(a) and (1)(b) are minimally cooperatively stable.
While Pacherie (Reference Pacherie, Penco, Beaney and Vignolo2007) argues that this formulation of shared intention captures the commitment of the participants to a joint activity (conditions (1)(a)(i) and (1)(b)(i)), the mutual responsiveness of each participant to the other (conditions (1)(a)(ii) and (1)(b)(ii)), and their commitment to mutual support (condition (1)(c)), Becchio and Bertone (Reference Becchio and Bertone2004: 127–8) argue that a formulation of shared intention built on mutual belief about individual intentions is ‘cognitively implausible’, as the requirement for sharedness is too strenuous.
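The reductive structure of Bratman's account (shared intention as nothing over and above a conjunction of individual attitudes plus a stability condition) can be glossed as a simple checklist. The sketch below is our illustration, not Bratman's formalism; the class, attribute and function names are invented.

```python
from dataclasses import dataclass

@dataclass
class Participant:
    intends_that_we_J: bool         # conditions (1)(a)(i) / (1)(b)(i)
    intends_meshing_subplans: bool  # conditions (1)(a)(ii) / (1)(b)(ii)

def shared_cooperative_activity(me: Participant, you: Participant,
                                minimally_stable: bool) -> bool:
    """On the reductive view, shared intention just is the conjunction of
    these individual attitudes plus the stability condition (1)(c)."""
    return (me.intends_that_we_J and you.intends_that_we_J
            and me.intends_meshing_subplans and you.intends_meshing_subplans
            and minimally_stable)   # (1)(c): minimal cooperative stability

a = Participant(intends_that_we_J=True, intends_meshing_subplans=True)
b = Participant(intends_that_we_J=True, intends_meshing_subplans=False)
print(shared_cooperative_activity(a, b, minimally_stable=True))  # False
```

The point of contention between Bratman and Searle can then be put simply: for Searle, no such conjunction of individual attitudes, however supplemented with mutual beliefs, adds up to a we-intention.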
Searle (Reference Searle, Cohen, Morgan and Pollack1990: 407), on the other hand, defines a we-intention as simply ‘we intend that we perform act A’, which in turn presupposes cooperation on the part of the we-intenders. He argues that ‘we-intentions cannot be analyzed into sets of I[ndividual]-intentions, even I-intentions supplemented with beliefs’ (ibid.: 404), giving the example of playing football to illustrate his point:
Suppose we are on a football team and we are trying to execute a pass play. That is, the team intention, we suppose, is in part expressed by ‘We are executing a pass play.’ But now notice: no individual member of the team has this as the entire content of his intention, for no one can execute a pass play by himself. Each player must make a specific contribution to the overall goal…Each member of the team will share in the collective intention, but will have an individual assignment that is derived from the collective but has a different content from the collective. Where the collective's is ‘We are doing A,’ the individual's will be ‘I am doing B,’ ‘I am doing C,’ and so on. (Searle Reference Searle, Cohen, Morgan and Pollack1990: 403)
The essence of Searle's argument, then, is that the team members' I-intending different parts of the pass play does not simply add up (i.e. summatively) to the collective action of a pass play, because a level of cooperation is presupposed that goes beyond each person ‘doing their part’ (ibid.: 405). Cooperation implies, we would suggest, that each team member's I-intentions are responsive to their perceptions of the I-intentions of other team members; in other words, the I-intentions of team members are both afforded and constrained by the I-intentions of others in making the pass play. For these I-intentions to be responsive in this manner, it is claimed, the team members must be we-intending to engage in this collective activity.
This description of a particular collective action bears remarkable similarities to the cooperative nature of communication assumed by Grice, with a number of scholars noting, in passing, the potential importance of we-intentions for the analysis of communication (Becchio and Bertone Reference Becchio and Bertone2004: 132; Clark H. Reference Clark1996, Reference Clark1997; Gibbs Reference Gibbs1999, Reference Gibbs, Malle, Moses and Baldwin2001; Searle Reference Searle, Cohen, Morgan and Pollack1990: 415). The place of we-intentions in analysing speaker meaning and communication, however, has received only passing attention in pragmatics to date. This is perhaps due to the inherent ontological ambiguity of we-intentions (Haugh Reference Haugh, Kecskes and Mey2008c: 53–4): while they are characterised as prior intentions in the minds of speakers and hearers, there is equivocality about how such an a priori mental state comes to be shared between two or more people in the first place (Fitzpatrick Reference Fitzpatrick2003; Velleman Reference Velleman1997), even in models which argue for their importance (Tomasello and Rakoczy Reference Tomasello and Rakoczy2003; Tomasello et al. Reference Tomasello, Carpenter, Call, Behne and Moll2005).
Consider, for instance, Levinson's (Reference Levinson1983: 358) example of a ‘transparent pre-request’ in line 1 in the excerpt below:
(20)
1 A: Hullo I was wondering whether you were intending to go to Popper's talk this afternoon
2 B: Not today I'm afraid I can't make it to this one
3 A: Ah okay
4 B: You wanted me to record it didn't you heh!
5 A: Yeah heheh
6 B: Heheh no I'm sorry about that…
(Levinson Reference Levinson1983: 358)
While it is not entirely clear from B's response in line 2 whether B was in fact treating A's utterance in line 1 as a pre-request or simply as a request for information, it is evident by line 4 that B has understood A's initial question as a pre-request, given his topicalisation of A's intentions (‘wanted me to record it’) (Haugh Reference Haugh2009: 95). However, in order for B to infer the communicative intention underpinning A's question in line 1 (i.e. the intention to check the preparatory conditions for a forthcoming request that B record Popper's talk), an inference about A's higher-order intention (to launch a request sequence) also appears to be required. This, in turn, presupposes that A and B are we-intending engagement in a particular activity, namely, conversational interaction. The issue, however, and it proves crucial to the interpretation of the communicative intention underlying A's utterance in line 1, is whether they are we-intending the joint activity of phatic/relational talk (consistent with an interpretation of A's communicative intention as simply a request for information about B's plans in relation to Popper's talk), or the joint activity of goal-oriented talk (consistent with A's higher-order intention of negotiating a request that B record the talk for him).12 In other words, A's communicative intention is embedded within his higher-order intention, with both intentions arguably being further embedded within a we-intention.
This requirement for multiple types of prior intention embedded within each other, however, leads to analytical equivocality on two counts. First, the requirement for inferences directed at both A's communicative intention (which is present-directed) and his higher-order intention (which is future-directed) makes it difficult to ascertain when exactly this implied request arises. Second, there is equivocality about how this we-intention is shared in the first place. It is not until line 4 in this interaction that A and B could plausibly be taken to be we-intending the joint activity of goal-oriented talk (in the context of which a request sequence makes sense). Searle's characterisation of a we-intention as a prior intention thus runs into very real problems when applied to the analysis of joint activities such as conversation. It appears, then, that what underpins the request implicature here might be better characterised as a kind of ‘emergent’ intention (Clark H. Reference Clark1997; Gibbs Reference Gibbs1999: 38, Reference Gibbs, Malle, Moses and Baldwin2001; Haugh Reference Haugh2007c: 95, Reference Haugh2008c, Reference Haugh2009; Kecskes Reference Kecskes2010a: 60–61).
Finally, it is also worth noting in passing that in example (20) above, A also makes reference to B's (higher-order) intentions (‘intending to go’) in line 1, while A ratifies in line 5 (‘yeah’) B's preceding topicalisation of his intentions. It appears, then, that the folk notion of intending, encompassing (1) the expression of future plans for self, (2) ascribing to or asking of others their future plans, (3) describing what oneself or others want to achieve by doing or saying something and (4) classifying actions as being done with the speaker's awareness of their implications (Gibbs Reference Gibbs1999: 22–3), may also at times be relevant to the analysis of speaker meaning. The notion of higher-order intention appears to overlap with senses (1) through (3) of the folk notion of intention, while the last two senses appear consistent with the claim in ethnomethodological conversation analysis that speakers are held (morally) accountable for their meanings (Garfinkel Reference Garfinkel1967; Heritage Reference Heritage and Antaki1988; Sacks Reference Sacks1992 [1964]: 4–5).13 While there has been work focusing on intuitive, folk understandings of intention (Breheny Reference Breheny2006; Gibbs Reference Gibbs1999, Reference Gibbs, Malle, Moses and Baldwin2001; Knobe 2003; Knobe and Burra 2006; Malle Reference Malle2004, Reference Malle2006; Malle et al. Reference Malle, Moses and Baldwin2001),14 there remains considerable work to be done in comparing folk understandings of intention with the various analytical concepts of intention developed in philosophical pragmatics.
Drawing a clear distinction between folk and theoretical/analytical notions of intention is important, as it is through their common link to the folk notion of intentional action that intentionality (the directedness or aboutness of linguistic acts) and speaker intentions (the a priori goal-directedness and deliberateness of speaker actions) are often confounded. It is only by making such comparisons that we will better understand the relationship between the intuitive notions of intention/intending and the more technical notions reviewed here.
5.3.3 From mental acts to communicative acts and types of inference
The detailed work on different types of (prior) intention and their relationship to speaker meaning has been expanded upon in two quite different ways in pragmatics, as noted in section 5.3.1. Those more closely aligned with the spirit of Grice's original programme have focused on moving from mental acts to communicative acts. This includes both the so-called neo-Griceans and speech act theorists. Those taking a more psychological or cognitive stance, in contrast, have focused on explicating the inferential processes leading from communicative acts to intended or recovered speaker meaning.
In Searle's (Reference Searle1983, Reference Searle, Kecskes and Horn2007) work on the relationship between speech acts, speaker meaning and intention(ality), he claims that for a speech act to be communicated a double level of intentionality is required, as noted in section 5.2. This double level of intentionality builds on the notion of direction of fit. Word-to-world direction of fit encompasses the assertive class of speech acts (e.g. statements, assertions etc.), which are ‘expressions of beliefs and are supposed, like beliefs, to represent how the world is’ (Searle, e.g. Reference Searle, Kecskes and Horn2007: 14), while world-to-word direction of fit encompasses both the directive class of speech acts (e.g. requests, orders, commands etc.), which are expressions of desires, and the commissive class (e.g. promises, offers etc.), which are expressions of intention (ibid.). Declaratives have a double direction of fit, while expressives (e.g. apologies, thanks etc.) have no direction of fit (ibid.: 14–15). To illustrate this double level of intentionality he gives the example of a speaker wishing to communicate the belief ‘It is raining’:
When the speaker intentionally utters a token of the symbol [It is raining], the production of the token is the condition of satisfaction of his intention to utter it. And when he utters it meaningfully he is imposing a further condition of satisfaction on the token uttered. The condition of satisfaction is: That it is raining. The imposition of conditions of satisfaction on conditions of satisfaction is the essence of speaker meaning. (Searle Reference Searle, Kecskes and Horn2007: 23, original emphasis)
According to Searle, then, the first level of intentionality arises from the prior intention of the speaker, namely, that uttering ‘It is raining’ is a goal-directed and deliberate action. The second level of intentionality arises from the intrinsic aboutness of beliefs. It is by imposing conditions of satisfaction of the belief in question (i.e. that it is indeed raining) upon the conditions of satisfaction of the intention in question (i.e. to utter the token ‘it is raining’) that a speaker can be taken to be meaning this particular assertion. Further work modelling the link between prior intentions (i.e. intentions as mental states) and intention-in-action (i.e. uttering something) has been undertaken (Bara Reference Bara2010; Pacherie Reference Pacherie, Pockett, Banks and Gallagher2006, Reference Pacherie2008), in an attempt to move Searle's analytical work into an empirical, testable reality. But such work, albeit very important, is largely tangential to the current discussion since it focuses on investigating neural mechanisms.15
While Searle makes little reference to inferential work in his analysis,16 Levinson's (Reference Levinson2000) account of implicature is grounded in default logics (for generalised implicatures) and practical reasoning (for particularised implicatures). Default logics aim to capture the notion of reasonable presumption, a ceteris paribus assumption. Levinson (Reference Levinson2000: 46) argues that default logics capture two important features of generalised implicatures: their defeasibility and the way in which they involve preferred presumptions. While a number of models of default logics are discussed by Levinson (Reference Levinson2000: 46–9), he does not offer a definitive answer to the question of how to formally model the inferences underlying generalised implicatures.
Practical reasoning systems, first mentioned by Aristotle, aim to explicate how speakers reason from a particular goal (i.e. a prior intention) to the means by which this goal is achieved (e.g. an utterance). Levinson argues that a modified ‘Kenny logic’, building on Kenny's (Reference Kenny1966) work on practical reasoning, captures two important features of desirability reasoning: its defeasibility and its ampliativeness (see Brown P. and Levinson Reference Brown and Levinson1987: 89). The system formalises inferences that are valid when propositions are ‘satisfactory’ relative to goal wants: ‘A fiat F(a) is satisfactory relative to a set G of desires {F(g1), F(g2),…F(gn)} if and only if, whenever a is true, (g1 and g2…and gn) is true’ (ibid.: 87). In other words, practical reasoning accounts for how speakers move from formulating intentions (ends) to what is uttered (means). However, it has to be acknowledged that a logic-based reasoning system such as this cannot account for how hearers understand speakers’ communicative intentions (ibid.: 8).
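The quoted satisfactoriness condition lends itself to a brute-force illustration: a fiat is satisfactory relative to a set of goals just in case every admissible state of affairs in which the fiat holds is one in which all the goals hold. The sketch below is our gloss, not Kenny's or Levinson's system; the propositions, background rule and candidate fiats are invented for illustration.

```python
from itertools import product

def satisfactory(fiat, goals, propositions, background=()):
    """Kenny-style satisfactoriness by brute force: F(a) is satisfactory
    relative to goals G iff every admissible valuation making the fiat
    true also makes every goal true."""
    for values in product([True, False], repeat=len(propositions)):
        v = dict(zip(propositions, values))
        if not all(rule(v) for rule in background):
            continue  # world ruled out by background knowledge
        if fiat(v) and not all(goal(v) for goal in goals):
            return False  # counterexample: fiat holds but some goal fails
    return True

props = ["go_to_shop", "buy_milk", "have_milk"]
background = [lambda v: (not v["buy_milk"]) or v["have_milk"]]  # buying milk yields milk
goals = [lambda v: v["have_milk"]]                              # the desired end

fiat_buy = lambda v: v["buy_milk"]       # candidate means 1
fiat_shop = lambda v: v["go_to_shop"]    # candidate means 2
print(satisfactory(fiat_buy, goals, props, background))   # True
print(satisfactory(fiat_shop, goals, props, background))  # False
```

Note how defeasibility enters through the background set: adding or removing a background assumption can change which means count as satisfactory relative to the same end.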
In moving from communicative acts to mental acts, in contrast, a number of different types of inference have been claimed to underlie speaker meaning, including the recovery of (intended) implicatures. According to some, an implicature is recovered through inductive inference on Grice's account, and thus constitutes ‘a probabilistic conclusion derived from a set of premises that include the utterance and such contextual information as appears relevant’ (Grundy Reference Grundy2008: 102). It is argued that the probabilistic nature of inductive inferences accounts for mismatches between what the speaker intends and how this is interpreted by the hearer (ibid.). Others have claimed that the recovery of conversational implicatures by hearers is better explicated with reference to abductive reasoning (i.e. inference to best explanation) (Allot Reference Allot2007), or its formal analogue, inference to best interpretation (Atlas Reference Atlas2005; Atlas and Levinson Reference Atlas, Levinson and Cole1981). The key difference between induction and abduction is that the former involves reasoning ‘from particular instances to a general hypothesis’, while the latter involves reasoning where ‘the conclusions are based on a best guess’ (Allan Reference Allan and Brown2006c: 652).
Somewhat controversially, in their Relevance book, relevance theorists Sperber and Wilson have proposed that deductive reasoning is central to utterance interpretation (Sperber and Wilson Reference Sperber and Wilson1995b: 69; see also Wilson and Sperber Reference Wilson, Sperber and Travis1986: 45). In the following example, Mary implies that she wouldn't drive a Mercedes in response to Peter's question.
(21)
Peter: Would you drive a Mercedes?
Mary: I wouldn't drive any expensive car.
(Sperber and Wilson Reference Sperber and Wilson1995b: 194)
It is claimed by Sperber and Wilson that this implication can be deduced through the addition of encyclopaedic information about expensive cars (‘a Mercedes is an expensive car’) as an additional premise together with Mary's assertion that she ‘wouldn't drive any expensive car’ (ibid.).
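The deduction Sperber and Wilson appeal to can be sketched as elementary forward chaining (universal instantiation plus modus ponens) over the two premises. This is a toy gloss of the example, not Relevance theory's actual deductive device; the predicate names are invented.

```python
# Facts are (predicate, argument) pairs; each rule maps a premise predicate
# to a conclusion predicate, standing in for Mary's universally quantified
# assertion 'I wouldn't drive any expensive car'.
facts = {("expensive_car", "mercedes")}            # encyclopaedic premise
rules = [("expensive_car", "mary_wouldnt_drive")]  # Mary's assertion as a rule

def forward_chain(facts, rules):
    """Monotonic deduction: apply each rule to each matching fact until
    no new conclusions follow."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for premise, conclusion in rules:
            for pred, arg in list(derived):
                if pred == premise and (conclusion, arg) not in derived:
                    derived.add((conclusion, arg))
                    changed = True
    return derived

print(("mary_wouldnt_drive", "mercedes") in forward_chain(facts, rules))  # True
```

The strictness of this kind of inference is precisely what invites the objection discussed next: the conclusion follows only once the premise set has been fixed, and nothing in the deduction itself settles which premises belong in that set.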
One key problem with a reliance on deductive reasoning in explicating how hearers recover speaker meaning, however, as Cummings (Reference Cummings1998, Reference Cummings2005) points out, is how to establish the closed set of propositions required for strict deduction (Allan Reference Allan and Brown2006c: 553; Allot Reference Allot2007: 60). As she argues,
no set of deductive premises is ever fully circumscribed in the sense of containing all the information that is relevant to the comprehension of that utterance – for every set of such premises, some factor that is not part of the set can nonetheless be shown to be integral to the comprehension of that utterance. (Cummings Reference Cummings2005: 130)
In other words, Relevance theory offers no solution as to how the hearer decides which premises to include and which to exclude, except the arguably circular argument that the premises selected are determined by calculations of relevance,17 which in turn determine what the hearer understands to have been implied by the speaker. The circularity of the approach as presented in that book lies in the fact that calculations of the relevance of the contextual premises are themselves determined by the hearer's calculations of the relevance of what is implied.18
As Cummings (Reference Cummings2005: 108) has argued, there remains considerable work to be done to clarify the types of inference that underlie the recovery of speaker meaning. In line with the emphasis on non-deductive inference present in the pragmatic literature in the last decade (see e.g. Levinson Reference Levinson2000, or, in a different paradigm, Asher and Lascarides Reference Asher and Lascarides2003), she discusses two further types of inference that are crucial to this process: elaborative inferences and presumptive reasoning. Elaborative inferences are knowledge-based inferences used to ‘establish causal connections between events and construct intentional relations between actions within reasoning’ (Cummings Reference Cummings2005: 91), with this knowledge often being of ‘behaviour tendencies and everyday routines’ that serve ‘to specify normality or typicality conditions on inference’ (ibid.: 93). She claims that the implicature arising in the example below, for instance, cannot be recovered without elaborative inference.
(22)
A: I'm out of milk.
B: There's a shop at the end of the street.
(Cummings Reference Cummings2005: 102)
In order for A to understand that B is implying that A can get milk from the shop at the end of the street, real-world knowledge is required: that corner shops stock commonly used groceries, including milk (ibid.: 103). She goes on to claim that ‘knowledge provides a cohesive link between the utterances of the above exchange – it is through our knowledge of corner shops and their merchandise that we are able to establish the relevance of B's utterance to A's utterance in this exchange’ (ibid.). Such elaborative inferences allow hearers to go from ‘abstract communication norms and principles’ to particular communicated meanings (ibid.: 104).
Cummings (Reference Cummings2005) also suggests that presumptive reasoning is crucial to the recovery of implicatures, since although potential implicatures are always subject to revision based on the addition of further contextual information, they are nevertheless held by hearers to be communicated, at least provisionally. One kind of presumptive reasoning system proposed by Cummings is ‘argument from ignorance’, where a proposition is accepted as true because there is no evidence that it is not true (or vice versa) (ibid.: 109). Thus, while earlier attempts to model the move from communicative acts (utterances) to mental acts (intentions) have tended to rely on traditional forms of inference, both monotonic and non-monotonic (deductive, inductive, abductive), strong arguments have been mounted for more attention to be paid to other non-monotonic forms of inference as well (Cummings Reference Cummings2005: 242; Levinson Reference Levinson2000: 46).19
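The non-monotonic character of such presumptive reasoning can be sketched as a default that is retracted when the body of evidence grows. The following is an invented illustration keyed to example (22), not a system Cummings herself proposes; the implicature and defeater strings are placeholders.

```python
def presume(implicature, evidence, defeaters):
    """Argument from ignorance: hold the implicature to be communicated
    so long as no defeating evidence is on record."""
    return not any(d in evidence for d in defeaters)

implicature = "A can get milk at the shop"
defeaters = {"the shop is closed", "the shop sells no milk"}

evidence = set()
print(presume(implicature, evidence, defeaters))  # True: accepted provisionally

evidence.add("the shop is closed")                # further contextual information
print(presume(implicature, evidence, defeaters))  # False: conclusion withdrawn
```

Unlike the deductive case, the conclusion here is defeasible: a conclusion once licensed can be withdrawn as the premise set expands, which is exactly the revisability that implicatures exhibit.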
One problem facing current models of inference that privilege the speaker's intention in determining whether communication has occurred (whether in moving from speaker intentions to communicative acts or vice versa) is the failure to ‘address how the participants themselves could come to know whether the recipient's inference and attribution regarding that intention is to any extent consistent with it’ (Arundale Reference Arundale2008: 241). In other words, current intention-based models give no account of how speakers and hearers determine that something has indeed been communicated. Approaches that conceptualise the inferential work underlying meaning and communication as contingent and non-summative (Arundale and Good Reference Arundale, Good, Fetzer and Meierkord2002; see also Haugh Reference Haugh2009) thus arguably deserve further consideration as well.
In the next subsection we consider the question of where intentions (and inferences about them) can be located, in particular, whether or not they should be analysed as a purely cognitive phenomenon. While the emphasis in the discussion thus far has largely been on speaker intentions underlying utterances, in broadening our scope to consider their place in interaction and broader society, it is further emphasised that intentions are not only traceable to the mind, but also to interaction, and to broader societal norms and discourses.
5.3.4 Locating intentions
According to the received view of meaning as the intended expression of thoughts, intentions pertain to the mental states of speakers. Recent work in social neuroscience has begun exploring the neural correlates of our ability to attribute mental states, including intentions, to others (Walter et al. Reference Walter, Adenzato, Ciaramidaro, Enrici, Pia and Bara2004: 1854). This capacity to attribute mental states is termed theory of mind (ToM), the assumption being that without ToM ‘other people's behaviour would be meaningless from a third-person perspective: behaviour would be observed, but the meaning of actions would not be understood’ (Ciaramidaro et al. Reference Ciaramidaro, Adenzato, Enrici, Erk, Pia, Bara and Walter2007: 3111). Experiments combining brain imaging with various kinds of language-use prompts have found that a distributed neural system underlies the ToM mechanism, including the right and left temporo-parietal junctions (right TPJ and left TPJ), the precuneus, and the medial prefrontal cortex (MPFC) (Ciaramidaro et al. Reference Ciaramidaro, Adenzato, Enrici, Erk, Pia, Bara and Walter2007: 3105; Enrici et al. Reference Enrici, Adenzato, Cappa, Bara and Tettamanti2011; Walter et al. Reference Walter, Adenzato, Ciaramidaro, Enrici, Pia and Bara2004: 1854). The detection of agency, for instance, has been tied to neural activity in the superior temporal sulcus (STS) (Frith and Frith Reference Frith and Frith2003), while the representation of our own and other people's mental states (including intentions) has been tied to activity in the medial prefrontal cortex (MPFC) (Walter et al. Reference Walter, Adenzato, Ciaramidaro, Enrici, Pia and Bara2004: 1854).
Further work on intentions specifically has established that different parts of this distributed neural network are activated depending on the kind of prior intention involved. This lends some empirical support to the hitherto conceptual distinctions discussed in section 5.3.2, including that between private and social intentions, and within the category of social intentions, between communicative intentions and prospective social intentions, the latter being a form of higher-order intention (Ciaramidaro et al. Reference Ciaramidaro, Adenzato, Enrici, Erk, Pia, Bara and Walter2007: 3105; see also Saxe Reference Saxe2006; Walter et al. Reference Walter, Adenzato, Ciaramidaro, Enrici, Pia and Bara2004), and also between the representation of individual and we-intentions (Becchio and Bertone Reference Becchio and Bertone2004: 132).
Attempts to model the understanding of intentions (Bara Reference Bara2010; Pacherie Reference Pacherie, Pockett, Banks and Gallagher2006, Reference Pacherie2008) have also been given empirical support. Becchio et al. (Reference Becchio, Adenzato and Bara2006), for instance, have argued that the recognition, attribution and representation of action intentions can be traced to different forms of neural activity. They hypothesise that the recognition of the intentions of others is partly based on the same areas of the brain that are activated when one (intentionally) performs actions oneself. The neural basis of this has been argued to be ‘mirror neurons’ (i.e. neurons that fire during action execution and action observation) in the premotor cortex (Fogassi et al. Reference Fogassi, Ferrari, Gesierich, Rozzi, Chersi and Rizzolatti2005; Rizzolatti and Craighero 2004; Rizzolatti and Fabbri Destro 2007), although mirror neurons do not in themselves provide the basis for different forms of agentive understanding and shared intentionality (Pacherie and Dokic Reference Pacherie, Pockett, Banks and Gallagher2006).20 The attribution of intention to a particular agent is traced in Becchio et al.'s (2006) experiments to another area of the brain, the inferior parietal lobe, while the representation of prior intentions, particularly those which are social, has been traced to the anterior paracingulate cortex (Walter et al. Reference Walter, Adenzato, Ciaramidaro, Enrici, Pia and Bara2004). Recent work has also shown that a common neural network is employed when subjects are comprehending communicative intentions, no matter whether the prompt is linguistic or gestural (Enrici et al. Reference Enrici, Adenzato, Cappa, Bara and Tettamanti2011). Although the area of study is in its relative infancy, then, social neuroscientists have begun tracing hypothesised mental states (i.e. intentions and inferences about intentions) to specific neural activity, lending support to the kinds of distinctions made in pragmatics between different kinds of intentions.
However, work in psycholinguistics on the comprehension and attribution of intentions suggests that while these may indeed have neural correlates, speakers consistently over-estimate their ability to project intended meanings onto addressees (Keysar Reference Keysar1994b, Reference Keysar2000; Keysar and Henly Reference Keysar and Henly2002). Moreover, hearers do not routinely consider what the speaker knows (i.e. common ground) or other mental states in interpreting what has been said (Keysar Reference Keysar2007, Reference Keysar, Kecskes and Mey2008). This fundamental egocentrism in the early stages of language processing suggests that due caution should be exercised in interpreting the results of experiments attesting to neural correlates of intentions. While no one would suggest that speakers do not at times have intentions motivating them to say things, or that hearers do not at times make conscious inferences about such intentions, the question is the extent to which such an explanation is sufficient to account for speaker meaning and communication (Haugh Reference Haugh, Kecskes and Mey2008c: 52; Reference Haugh2009: 93). The evidence from psycholinguistics suggests that while a (prior) intention-based account of speaker meaning may be necessary, it may not be sufficient.
Work from an interactional perspective also attests to the difficulty of locating intentions relative to meanings in discourse (rather than simply relative to utterances). Haugh (Reference Haugh2008c, Reference Haugh2009), for instance, argues that the intentions hypothesised to underlie implicatures are temporally, ontologically and epistemologically ambiguous when the analyst attempts to trace them in actual interactional data. In more closely tracing intentions in conversational interaction it becomes apparent that intentions can be characterised as being ‘emergent’, as both the speaker and the hearer jointly co-construct understandings of what is meant (Clark H. Reference Clark1997; Gibbs Reference Gibbs1999: 38, Reference Gibbs, Malle, Moses and Baldwin2001; Haugh Reference Haugh2007c: 95, Reference Haugh2008c, Reference Haugh2009; Kecskes Reference Kecskes2010a: 60–61). Kecskes (Reference Kecskes2010a), for instance, argues that John's initial intention to give Peter a chance to talk about his trip is not realised in the excerpt below.
(23)
John: Want to talk about your trip?
Peter: I don't know. If you have questions…
John: OK, but you should tell me…
Peter: Wait, you want to hear about Irene?
John: Well, what about her?
Peter: She is fine. She has…well…put on some weight, though
(Kecskes Reference Kecskes2010a: 60)
Kecskes suggests that John's original intention is sidelined by Peter talking about Irene, perhaps because he thinks John might want to know about her (being his former girlfriend). He argues that ‘it was the conversational flow that led to this point, at which there appears a kind of emergent, co-constructed intention’ (Kecskes Reference Kecskes2010a: 61, original emphasis).
These ‘emergent’ intentions have interesting parallels with the notion of we-intention discussed in section 5.3.2. However, Haugh (Reference Haugh, Kecskes and Mey2008c) argues that the relatively static characterisation of we-intentions does not do full justice to the contingent and emergent nature of the inferential work underlying joint, cooperative activities, including conversational interaction. The key difference between the notions of we-intention and emergent intention is that the former assumes a monadic view of cognition, involving ‘the summative sequence of individual cognitive activities’ (Arundale and Good Reference Arundale, Good, Fetzer and Meierkord2002: 124), while the latter builds upon a dyadic view of cognition:
Each participant's cognitive processes in using language involve concurrent operations temporally extended both forward in time in anticipation or projection, and backwards in time in hindsight or retroactive assessing of what has already transpired. As participants interact, these concurrent cognitive activities become fully interdependent or dyadic. (Arundale and Good Reference Arundale, Good, Fetzer and Meierkord2002: 122)
Yet while there are differences in their underlying frameworks, the view in philosophical and cognitive pragmatics that we-intentions cannot be reduced to individual intentions due to the sharedness requirement being cognitively implausible (Becchio and Bertone Reference Becchio and Bertone2004: 127–8), and the complementary view in interactional pragmatics that models of individual intentions are ‘formally incapable of explaining the non-summative effects or emergent properties observable when individuals are engaged in interaction’ (Arundale Reference Arundale2008: 243), both strongly suggest that meaning cannot necessarily always be traced analytically to prior intentions of individual speakers (including their communicative intentions). Instead, the interpretation of meaning is ‘doubly-dynamic’ in the sense that it is ‘created in-between the interlocutors and the hearer may also be given freedom to create assumptions rather than recover them’ (Jaszczolt Reference Jaszczolt1999: 76; see also Hirsch Reference Hirsch2010). Gauker (Reference Gauker2001, Reference Gauker2003, Reference Gauker2008) has also argued that the role of situational inferences has been vastly underplayed in communicative intention-based models of meaning and communication.
However, such views should not be simplistically interpreted as constituting an anti-intentionalist stance, as some have recently assumed (Åkerman 2009; Montminy Reference Montminy2010). Instead, there is growing evidence that we need to make clearer distinctions between speaker (intended) meaning, which pertains to the subjective processing domain at the utterance level, and ‘joint meaning’, which pertains to the interpersonal domain at the discourse level (Carassa and Colombetti Reference Carassa and Colombetti2009; Kasper Reference Kasper, Bardovi-Harlig, Felix-Brasdefer and Omar2006; Kecskes Reference Kecskes2008, Reference Kecskes2010a; Kriempardis Reference Kriempardis2009: 186). Different types of intentions arguably have different roles to play relative to these different types of meaning. Instead of remaining committed to a hardline intentionalist or anti-intentionalist stance, greater dialogue between those with different views on intention lies at the heart of advancing our understanding of meaning and communication (Haugh Reference Haugh2008a).
We have discussed thus far how intentions can be located both in the minds of individuals, as well as more diffusely in interaction, where they emerge ‘between the mind and the world’ (Jaszczolt Reference Jaszczolt1999: 117). There is, however, an additional level at which intentions can be productively located, namely, the social (or societal) level of analysis. The focus here is on deontological aspects of intention and intentionality. Philosophers have conceptualised this as commitment to undertake we-intended actions (Gilbert Reference Gilbert2009), for instance, or commitment to the truth conditions of utterances (Searle Reference Searle, Kecskes and Horn2007: 33–4). The notion of speaker accountability in ethnomethodological conversation analysis (Garfinkel Reference Garfinkel1967; Sacks Reference Sacks1992 [1964]: 4–5), where interlocutors hold themselves and others normatively accountable for the meanings that arise from what is said (Heritage Reference Heritage1984), can also be productively explored in regards to both speaker intentions and intentionality (Arundale Reference Arundale2008; Edwards Reference Edwards2006, Reference Edwards2008; Haugh Reference Haugh, Kecskes and Mey2008c, Reference Haugh2009). For instance, the ways in which speakers are held accountable to meanings through topicalising intentions, and how this intersects with interpretative and socio-cultural norms can be explored (Haugh Reference Haugh2008b, Reference Haugh, Kecskes and Mey2008c).
Holding speakers accountable for what they are understood by others to have implied can also enter into broader societal debates, as argued by Haugh (Reference Haugh2008b) in an analysis of the discursive dispute arising in the Australian media as to what was intended by comments in regards to the status of women made by a Muslim cleric.21 In the following excerpt, the cleric is being interviewed in the controversial wake of the publication of his sermon.

Martin (the interviewer) first asks in lines 1–2 why Hilali would say something offensive when he holds a position of responsibility in the Australian Muslim community (the grand mufti), to which Hilali responds by claiming he always speaks honestly, implying that he does not necessarily always say what others want him to say (line 3). Martin then invokes the folk view of meaning as residing in words (Bilmes 1986), in arguing that Hilali is responsible for how people understand him (lines 6–8), and that one cannot be absolved from this responsibility by claiming one was ‘misunderstood’ or ‘misinterpreted’ (lines 4–6). This stance, however, is implicitly rejected by Hilali in line 9, when he claims he stands behind the ‘correct’ interpretation of what he implied, in this case, what he intended by his comments. The deontological dimensions of speaker meaning were thus clearly topicalised in this particular interview.
In considering the question of where intentions can be located, then, it has become apparent that this concept is deployed in pragmatics for a number of different analytical purposes. In philosophical pragmatics it is used in accounting for how speakers mean things (communicative intentions) or undertake joint activities (we-intentions), although there are varying levels of commitment to the psychological reality of those intentions (Jaszczolt Reference Jaszczolt1999). In cognitive pragmatics there is a clearer commitment to the assumption that the recognition and attribution of (communicative) intentions underlies communication, but there is also consideration of a much more fine-grained range of different types of intentions (including the distinction between prior intentions versus intentions-in-action), with work attempting to correlate neural activity with such distinctions. Here, intentions tend to be conceptualised as being firmly located in the minds of speakers. In contrast, in interactional pragmatics the focus is on examining the relationship between speaker (intended) meaning and joint or interactionally achieved meanings, with the notion of ‘emergent intention’ (Kecskes Reference Kecskes2010a) or ‘emergent intentionality’ (Haugh Reference Haugh, Kecskes and Mey2008c, Reference Haugh2009) sometimes being deployed to account for the latter. Here, intentions and inferences are treated as contingent (and non-summative), arising in the course of interaction, and thus better traced with reference to dyadic views of cognition (Arundale and Good Reference Arundale, Good, Fetzer and Meierkord2002) rather than individual minds. Finally, in more discursive approaches to pragmatics, the analytical focus is on the normative work intention does when deployed in discourse or interaction, with a particular emphasis on how speaker commitment or accountability can be disputed. 
In these approaches, intentions can be found to be diffused across social networks, ranging from dyadic units through to larger social groups.
5.4 Concluding remarks: intentions as ‘creatures of darkness’ or a useful tool?
We are now in a position to address the methodological question as to how pragmatic theory, aspiring to high predictive power, can be founded on intentions and intentionality – theoretical notions which are so inherently imprecise and, moreover, possibly not directly empirically testable. In other words, this lack of testability may prove not to be a fleeting state of affairs but an inherent property of intentions. In answering this question one has to point out that the advantages seem to outweigh the shortcomings. We attempt to list here a few arguments in favour of an outlook that maintains the importance of intentions for theorising in pragmatics.
(1) If we ban intentions from pragmatics, we have to use a substitute theoretical tool such as default rules of inference, semanticised pragmatic relations between sentences as in dynamic approaches to meaning, or other similar solutions, e.g. constraints of optimality-theory pragmatics. None of these alternative tools has comparable predictive power as far as speaker meaning is concerned. Instead, we are forced to change the object of study of pragmatics from, so to speak, speaker meaning ‘whatever means the speaker may have used to convey it’ to speaker meaning ‘modelled on the fairly probable semantic patterns’.22
(2) In the current state of experimentation in neuroscience, it seems very unlikely that intentions can remain creatures of darkness. Instead, they are being correlated with neural activity, as discussed in the preceding section. Intentions in communication derive their theoretical status from the intentionality of consciousness. The more we know about intentionality in the brain, the more we will know about intentions. The structure of the explanation is already there in the form of Gricean pragmatics; the scientific flesh is being provided as cognitive science progresses.
(3) Language is a vehicle of thought, and a pragmatic theory of its use in communication should derive from theories of thought. In order to theorise expression meaning (word/sentence meaning), the basic intentionality of thought needs to be taken into account. In this way, the extent to which expression meaning can be productively defined in terms of speaker meaning (intentions), as originally proposed by Grice, may be further explored.23
(4) The notion of intention (and indeed intentionality) is already being productively deployed in many different ways in pragmatics. While this proliferation can at times create analytical confusion, it is also no doubt reflective of the metaphorical power of intentions and intentionality in advancing our understanding of how speakers mean things through the use of language.
We suggest, therefore, that while intentions may be difficult to pin down, it is clear that disciplines do not advance by avoiding slippery questions, particularly when they lie at their very foundations, as do the concepts of intention and intentionality in pragmatics. Ultimately, it is only through refining or even discarding certain views and developing alternatives that we will continue to advance in our theorisation and analysis of meaning and communication.
6 Context and content: Pragmatics in two-dimensional semantics
6.1 Introduction
Two-dimensional semantics in its most general characterization is a semantics that recognizes two kinds of content: narrow content and wide content.1 Narrow contents can take the form of linguistic meanings (e.g., functions from context to content) or descriptive Fregean contents. Wide content is a set of possible worlds or a structured Russellian proposition consisting of properties and/or physical objects. On some approaches (for instance, Kaplan's and Stalnaker's), “narrow content” is not semantic content (it is not what is shared in communication) but belongs to what Stalnaker calls “the metasemantics.” On other approaches (e.g., Chalmers and Jackson), the narrow content is (part of) the semantic content.
Kaplan (Reference Kaplan, Almog, Perry and Wettstein1989a) originally introduced two-dimensional semantics in order to account for the semantics of context-sensitive expressions, primarily indexicals and demonstratives. Names, indexicals, and demonstratives have often been treated as having as their semantic content the individuals to which they refer. But indexicals and demonstratives do not acquire their content in the same way as names. Unlike names, as traditionally construed, their content varies across worlds, times, speakers, and locations. This difference between names and indexicals prompted Kaplan to introduce a further layer of meaning, viz. a function taking worlds, times, speakers, and locations to the expression's content. This function can then be either constant (names) or variable (indexicals and demonstratives).
Two-dimensional semantics has subsequently been employed by Robert Stalnaker (e.g. Reference Stalnaker and Cole1978), David Chalmers (e.g. Reference Chalmers1996, Reference Chalmers, Garcia-Carpintero and Macia2006), Frank Jackson (e.g., Chalmers and Jackson Reference Chalmers and Jackson2001), and others as a tool for explaining how sentences that have necessary contents can be cognitively informative and how sentences that are cognitively uninformative can be contingent, and David Chalmers and Frank Jackson have developed the approach further in debates about the role of conceptual analysis and reductive explanation.
Here I will introduce and assess the notion of context-sensitivity presented by the various two-dimensional frameworks, with a special focus on how it relates to the notion of cognitive significance and whether it includes an intuitively plausible range of expressions within its scope. I will argue that the two phenomena (viz. context-sensitivity and cognitive significance) are, to some extent, inseparable. I will conclude with a discussion of the prospects of using epistemic two-dimensional semantics to account for context-sensitive expressions in dynamic discourse.
6.2 Kaplan
6.2.1 Kaplan's two-stage theory
In “Demonstratives” (Reference Kaplan, Almog, Perry and Wettstein1989a) David Kaplan introduced a two-stage semantic theory of indexicals and demonstratives. In Kaplan's framework, disambiguated expression types have a linguistic meaning or what Kaplan calls a “character.” A character is a function from context to content. The context is a sequence of parameters which include (at least) a world, a speaker, a time, and a location. In the extended (Reference Kaplan, Almog, Perry and Wettstein1989a) version of the previously unpublished text Kaplan included an addressee among the contextual parameters. A demonstration is not directly included in the context as a contextual parameter. Instead demonstratives associated with different demonstrations are treated as different lexical items.2 The character of ‘I am speaking now’ is a function from a world w, a speaker s, and a time t of the context to a content that is true just in case s is speaking at t in w. The content of a sentence is the proposition expressed and also what is said by the sentence in context. Context together with a disambiguated sentence type with a determinate character thus yields a proposition.
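Kaplan's two stages can be made concrete with a small sketch. The following Python fragment is my own illustration (the class and function names are invented, not Kaplan's notation): a character is modelled as a function from a context to a content, and a content is itself modelled as a function from a circumstance of evaluation to a truth value.

```python
# Toy model of Kaplan's two-stage theory (illustrative only).
from dataclasses import dataclass

@dataclass(frozen=True)
class Context:
    """A context as a sequence of parameters: world, speaker, time, location."""
    world: str
    speaker: str
    time: int
    location: str

def character_i_am_speaking_now(c: Context):
    """Character of 'I am speaking now': a function from context to content.

    The returned content is true at a circumstance just in case the
    speaker of the context is speaking at the time of the context there.
    """
    s, t = c.speaker, c.time
    def content(circumstance: dict) -> bool:
        # circumstance["speaking"] is the set of (speaker, time) pairs
        # at which someone is speaking in that circumstance's world.
        return (s, t) in circumstance["speaking"]
    return content

ctx = Context(world="w1", speaker="Kaplan", time=3, location="LA")
prop = character_i_am_speaking_now(ctx)  # stage 1: context -> content
print(prop({"speaking": {("Kaplan", 3)}}))  # stage 2: content at a circumstance -> True
```

The two stages are visible in the two function applications: context fixes the proposition expressed, and the circumstance of evaluation then fixes its truth value.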
In Kaplan's original framework, only pure indexicals (e.g., ‘I’, ‘now’, and ‘here’), true demonstratives (e.g., ‘this’, ‘that’) and perhaps a few other types of expression (including complex demonstratives) are context-sensitive expressions. They are the only expressions that have variable character. Noun phrases have constant character.3 So, the word type ‘water’ used by Oscar on Earth where the clear potable liquid that comes out of faucets is H2O and the word type ‘water’ used by Oscar's physical and phenomenal duplicate Twin-Oscar on Twin-Earth where the clear potable liquid that comes out of faucets is XYZ are different lexical items with different constant characters.
For Kaplan, contexts need not be real speech situations. Strictly speaking, they need not be possible either. A sequence of the actual world, the present author, 3 pm Canberra time, 1535, and The Lounge in Melbourne forms an improper context. However, the sorts of contexts that are relevant for determining contents are possible contexts.
There are two main reasons that contexts, for Kaplan, need not comprise real speech situations. First, this notion of context allows that sentence types that are necessarily true or necessarily false relative to real speech situations can be assigned a different truth-value relative to different contexts. These include sentence types such as ‘I am not speaking now’, ‘I do not exist now’, and ‘I am not here now’. Since there are contexts relative to which these sentence types express true propositions and contexts relative to which these sentence types express false propositions, these sentence types are not logical truths but contingent truths.
Second, if contexts were real speech situations, intuitively valid arguments would come out as invalid. Consider the following argument:
If John is hungry now, then he is grumpy now.
John is hungry now.
So, John is grumpy now.
Utterances take time. So, if contexts were real speech situations, then there would be contexts in which it is true that if John is hungry now, then he is grumpy now, and true that John is hungry now, but false that John is grumpy now because he eats something before we have a chance to utter the conclusion. Below I will sketch a way to deal with dynamic discourses in a two-dimensional framework.
6.2.2 Kaplan and cognitive significance
Kaplan's theory was intended to give an account of context-sensitive expressions. It was not intended as an account of cognitively informative necessities (Kripke's a posteriori necessities) or cognitively uninformative contingencies (Kripke's a priori contingencies). But Kaplan's framework provides a reasonably good explanation of the phenomenon when indexicals and demonstratives are implicated. For example, ‘I am Brit (if I exist)’ expresses, relative to my context, a necessary truth of the form ‘a = a (if a exists)’. Yet it can be cognitively significant. We can explain its cognitive significance by noting that ‘I’ and ‘Brit’ have different characters. The character of ‘I’ is a function from a context to the speaker of the context; the character of ‘Brit’ is a function from a context to Brit. The identity claim, it may be said, is cognitively significant, because the expressions flanking the identity sign have different characters. If we discover that the different characters determine the same content in a context, our overall knowledge state has been enriched. Kaplan's original example of cognitive significance runs as follows:
If I see, reflected in a window, the image of a man whose pants appear to be on fire, my behavior is sensitive to whether I think, “His pants are on fire” or “My pants are on fire”, though the object of thought may be the same. (1989a: 533)
Thinking “his pants are on fire” (referring to myself) and thinking “my pants are on fire” will elicit different behavioral responses. For Kaplan, what explains the difference and hence what explains cognitive significance is the character of the sentence (the way the proposition expressed is presented), not the proposition expressed.
Kaplan's theory does not by itself offer a general explanation of cognitive significance. ‘Hesperus is Phosphorus’ is ordinarily cognitively significant, yet if ‘Hesperus’ and ‘Phosphorus’ are genuine names rather than disguised descriptions, then the character of ‘Hesperus’ is identical to the character of ‘Phosphorus’. It is a constant one-place function mapping contexts to Venus.
Noting that Kaplan's theory does not offer a general explanation of cognitive significance is not a criticism of his framework, considered as a theory of indexicals and demonstratives. However, there is some reason to think that the correct story about indexicals and demonstratives ought to be generalizable to noun phrases. To see this, it may be helpful to introduce the notion of ‘twin-earthability’. David Chalmers defines the notion as follows:
We can say that two possible individuals (at times) are twins if they are physical and phenomenal duplicates; we can say that two possible expression tokens are twins if they are produced by corresponding acts of twin speakers. Then a token is Twin-Earthable if it has a twin with a different 2-intension [Russellian intension]. (Chalmers Reference Chalmers, Garcia-Carpintero and Macia2006: section 3.5)
A twin-earthable expression is one which has a twin token with a different 2-intension (or Kaplan content). For example, ‘water’ is twin-earthable. ‘Water’ has the same 2-intension as ‘H2O’ when I use the word, and it has the same 2-intension as ‘XYZ’, when my phenomenal and physical twin uses the word on Twin-Earth, where the clear potable liquid that flows in oceans, rivers, and lakes has the chemical composition XYZ. Note that, on Chalmers’ definition, it is not required that the word tokens spoken by the twin-speakers be tokens of the same lexical item.
Twin-earthability plays a crucial role in testing whether an expression is semantically neutral. Semantically neutral expressions, roughly, are the non-twin-earthable expressions (e.g. ‘phenomenally conscious’, ‘cause’, and ‘friend’). To a first approximation, they are expressions whose wide content (Kaplan content) supervenes on the physical-phenomenal make-up of the particular speakers (deferential uses aside). Twin-earthable expressions are required to generate (ideally) cognitively significant sentences.4 Semantically neutral expressions cannot be used to generate (ideally) cognitively significant sentences.
Note that in order for twin-earthability and semantic neutrality to be relevantly connected, it must be true that narrow content (1-intension) supervenes on the physical-phenomenal make-up of particular speakers.5 If it does not, then almost any expression (semantically neutral or not) comes out as twin-earthable. Take ‘phenomenally conscious’. There is a scenario in which people speak the language Schmenglish. In Schmenglish ‘phenomenally conscious’ has a somewhat different 1-intension (and hence 2-intension). It means, roughly, alert. So, when twins on Earth and Schmenglish Earth make corresponding acts and say “I am conscious,” their tokens pick out different properties. So ‘phenomenally conscious’ comes out as twin-earthable, which was not the result we wanted.
Assuming Kaplan's original framework, indexical and demonstrative expressions are both twin-earthable and context-sensitive. If Oscar and Twin-Oscar both say “I am human,” the twin tokens of ‘I’ have different 2-intensions (Kaplan contents). One picks out Oscar at every world, the other picks out Twin-Oscar at every world. Likewise, if Oscar and Twin-Oscar demonstrate the clear potable liquid in their drinking glasses and say “that is water,” the twin occurrences of ‘that’ have different 2-intensions. One picks out H2O at every world, the other picks out XYZ at every world.
Given Kaplan's original framework, most noun phrases are twin-earthable but not context-sensitive. ‘Water’ is twin-earthable, because in Kaplan's framework, Oscar and Twin-Oscar's tokens of ‘water’ pick out different chemical substances. ‘Water’ is not context-sensitive because its character is a constant function from contexts to contents.
There is some reason to think that the characters of noun phrases like ‘water’ ought not to be considered constant functions. Here is one argument. Chalmers’ definition of twin-earthability does not require that the twins use tokens of the same lexical item. However, as it stands, it does not rule it out either. But whether we consider Oscar's and Twin-Oscar's uses of the string of letters ‘water’ tokens of one word type or different lexical items seems somewhat of a methodological choice (I say “somewhat” because treating Oscar and Twin-Oscar as speaking the same language allows us to reject the requirement that narrow content supervenes on the intrinsic make-up of particular speakers). Suppose, then, that we consider Oscar's and Twin-Oscar's uses of tokens of a single lexical item. Oscar and Twin-Oscar then use twin tokens to pick out different chemical substances. So, the lexical item ‘water’ then must have a variable character yielding different contents relative to different contexts. Relative to Oscar's context the character of ‘water’ yields H2O, and relative to Twin-Oscar's context it yields XYZ. So, the character of ‘water’ is variable. But if the character of ‘water’ is variable, then it may reasonably be thought to be in the range of broadly context-sensitive expressions. The same sort of argument applies to other noun phrases typically thought to be twin-earthable.
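The contrast between constant and variable character that this argument turns on can be sketched as follows. This is an illustrative toy model with invented names, not Kaplan's or Chalmers' own formalism: the variable character of ‘water’ (treated as a single lexical item) maps the world of the context to the substance the token picks out, while a constant character ignores the context altogether.

```python
# Toy contrast: variable vs. constant character (illustrative only).

def character_water(context_world: str) -> str:
    """Variable character of 'water', treated as one lexical item:
    maps the world of the context to the substance picked out there."""
    watery_stuff = {"Earth": "H2O", "Twin-Earth": "XYZ"}
    return watery_stuff[context_world]

def character_H2O(context_world: str) -> str:
    """Constant character: yields the same content in every context."""
    return "H2O"

# Relative to Oscar's context the character of 'water' yields H2O;
# relative to Twin-Oscar's context it yields XYZ.
print(character_water("Earth"))       # H2O
print(character_water("Twin-Earth"))  # XYZ
print(character_H2O("Twin-Earth"))    # H2O
```

On this modelling choice ‘water’ comes out context-sensitive in the broad sense, since its character is a non-constant function; on the rival "homonymy" treatment sketched next, each world would instead get its own lexical entry with its own constant character.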
Granting this point, we can account for the cognitive significance of informative necessities in the following way. Being less than omniscient with respect to empirical matters we may not know whether we are on Earth or Twin-Earth (or some other exotic planet). If we are on Earth, then the character of ‘water’ determines H2O. If we are at Twin-Earth, then it determines XYZ. Discovering that water is H2O here thus amounts to discovering that ‘water’ determines H2O here, and hence amounts to discovering what the character of ‘water’ is in our context.
Of course, someone fond of Kaplan's original semantics might reasonably baulk at this move. Strings of letters forming lexical entries with a constant character, it might be said, simply form different lexical entries with a different constant character when used elsewhere. The phenomenon, they will continue, is akin to that of homonymy. The English word ‘red’ and the Danish word ‘red’ (meaning: ‘rode’) are not one lexical item with variable character but two distinct lexical entries spelled in the same way but with distinct characters. Likewise, given Kaplan semantics, the Earth word ‘water’ and the Twin-Earth word ‘water’ are distinct lexical items spelled the same way with distinct characters.
Personally I find this sort of response somewhat ad hoc. But granting it, we may reasonably ask whether there is a different way of accounting for cognitive significance within the original Kaplan framework. I think there is a way of partially accounting for the phenomenon. This involves introducing the notion of a diagonalized character. To a first approximation, a diagonalized character is a meaning content that picks out different referents at different contexts. It is similar to a linguistic contextual intension. For example, the diagonalized character of ‘I’ picks out the speaker of the context, and the diagonalized character of ‘now’ picks out the time of speech. For Kaplan, the linguistic contextual intension of names picks out the same referent in all contexts. However, as we will see below, we can define a diagonalized character in the following way. Though the characters of names are constant, the strings of letters forming English expressions could have had different characters. For example, ‘water’ could have been a function from context to XYZ rather than a function from context to H2O. So, if we take sentence types to be associated with different functions from contexts to characters at different worlds, we can take the diagonal character to be a set of functions from contexts to propositions that are true relative to parameters of the context. For example, the diagonal character of ‘water is H2O’ yields the proposition that (1) H2O is H2O, at a context that contains the actual world as a parameter. And it yields the proposition that (2) the clear liquid that fills oceans, rivers, and lakes is XYZ, at a context relative to which the character of ‘water’ is a function from context to the descriptive content ‘the clear liquid that fills rivers, oceans, and lakes’ and the character of ‘H2O’ is a function from context to XYZ.
So, it might seem that we can account for the intuitive cognitive significance of ‘water is H2O’ by noting that the diagonal character yields different propositions at different contexts, and so says different things, relative to different contexts. It is plausible that further investigation into our use of the language is needed in order to discover which linguistic contextual intension (or what Kaplan simply calls “character”) sentences actually have and which propositions they actually express.
Despite their virtues, however, the variable-character approach and the diagonal approach to cognitive significance have at least two shortcomings. Both approaches trade on the idea that we are less than competent speakers and hence are not sure about what the characters of our expressions are. But this then has the consequence that ‘water is water’ might be cognitively significant. If we are not certain what the character of ‘water’ is, then we are not certain what the character of ‘water is water’ is. So, while we know that the sentence expresses necessary propositions relative to contexts, we do not know which proposition it expresses relative to our context.
Furthermore, ‘Hesperus is bright at night’ is contingent because Venus (which ‘Hesperus’ actually refers to) might not have been the brightest object in the evening sky, but if the name ‘Hesperus’ is introduced as a disguised description that stands for ‘the brightest object in the evening sky’, then the sentence is uninformative. Yet both its variable character and its diagonal character yield different propositions relative to different contexts. So, ‘Hesperus is bright at night’ comes out as cognitively informative when it should have come out as cognitively uninformative. So, neither the variable-character approach nor the diagonalization approach provides a good account of cognitive informativeness.
There is independent reason to think that even when limited to sentences containing context-sensitive expressions neither one of the attempted Kaplan explanations of cognitive significance is within the spirit of Kaplan's framework. In Kaplan's framework, the proposition determined by a sentence in a context is evaluated relative to a circumstance of evaluation. The parameters of the default circumstance of evaluation are parameters of the context. But circumstance-shifting operators can shift the parameters of the circumstance of evaluation. For example, in the case of ‘it was the case that John visited Mary’, ‘it was the case that’ functions as a circumstance-shifting operator which shifts the time parameter of the default circumstance of evaluation to some time prior to it. ‘It was the case that John visited Mary’ is true iff (if and only if) John visited Mary at some time prior to the time of speech. However, according to Kaplan, English contains no corresponding operators that operate on character. There are no Kaplanian monsters or context-shifting operators in English (Reference Kaplan, Almog, Perry and Wettstein1989a: 510). For example, ‘John believes that I am hungry’ cannot be used to express the proposition that John believes that he is hungry. ‘John believes’ does not operate on character and so cannot shift the speaker's context to John's context to produce a John-content for ‘I’. If, however, Kaplan's framework is to explain informative necessities involving indexicals and demonstratives, then we need to allow that some sentential operators operate on character. To see this, consider (1).
(1) For all I can rule out from the armchair, I am not Brit.
(1) is a plausible specification of the cognitive significance of ‘I am Brit’. In the actual context, of course, ‘I’ and ‘Brit’ yield the same propositional contents. So, for (1) to be true, the operator ‘for all I can rule out from the armchair’ must take us to a context that yields different propositional contents for ‘I’ and ‘Brit’. The English operator ‘for all I can rule out from the armchair’ must function as a monster. So, if we extend Kaplan's framework in the ways suggested above, then there are monsters after all, contrary to what Kaplan claimed.
6.3 Stalnaker
6.3.1 Assertion
In “Assertion” Robert Stalnaker (Reference Stalnaker and Cole1978, see also 1999) offers a theory of content that is similar in some respects to Kaplan's. However, where Kaplan opts for structured propositions, Stalnaker takes a proposition to be a set of possible worlds that are the way the world is said to be. Stalnaker thinks of possible worlds as the conditions that must obtain for the proposition to be true, hence they are truth conditions. This idea is motivated by the thought that when one asserts a proposition, one seeks to eliminate the set of possible situations at which what one says is false. Unlike Kaplan, Stalnaker seeks to account for cognitive significance or the information that a statement conveys in those cases in which what one says is informative despite being necessarily true. Necessarily true statements do not exclude any possibilities, yet if one utters an informative necessity, then it seems that one is seeking to rule out possibilities.
In “Assertion” Stalnaker's account of cognitive significance is similar to the diagonalization approach I imaginatively outlined above on behalf of Kaplan. For Stalnaker, utterances are associated with propositional concepts, functions from possible worlds containing the utterance to propositions. The diagonal proposition is the set of worlds at which the utterance's propositional concept yields a true proposition. If, for example, context maps ‘water’ to the descriptive material ‘the clear liquid that fills oceans, rivers, and lakes’ and maps ‘H2O’ to XYZ, then the diagonal proposition of ‘water is H2O’ is the proposition that the clear liquid that fills oceans, rivers, and lakes is XYZ. Cognitive informativeness is explained by our uncertainty about which context we occupy and hence which proposition is in fact expressed by our assertions.
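Stalnaker's propositional concepts and diagonal propositions are often pictured as a two-dimensional matrix. Below is a toy Python sketch of my own (the two-world model and all names are invented for illustration): a propositional concept maps each world-as-context to a proposition (a row of truth values across worlds), and the diagonal collects exactly the worlds where the proposition expressed there is true there.

```python
# Toy two-dimensional matrix for 'water is H2O' (illustrative only).
worlds = ["w1", "w2"]

# concept[context_world][eval_world] -> truth value of the proposition
# expressed at context_world, evaluated at eval_world.
# In w1, 'water' picks out H2O, so the sentence expresses a necessary truth;
# in w2, 'water' picks out XYZ, so it expresses a necessary falsehood.
concept = {
    "w1": {"w1": True, "w2": True},
    "w2": {"w1": False, "w2": False},
}

# The diagonal proposition: the set of worlds w at which the proposition
# expressed at w is true at w.
diagonal = {w for w in worlds if concept[w][w]}
print(diagonal)  # {'w1'}
```

The diagonal is contingent (true at w1, false at w2) even though each row of the matrix is necessary, which is how the account makes room for the informativeness of a necessary truth: the hearer learns which row, and hence which world-as-context, she occupies.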
This approach generates the same problems as diagonalizing on character does. We might be uncertain about which proposition is in fact expressed by the a priori necessity ‘water is water’ and the a priori contingency ‘Hesperus is bright at night’, in which case these sentences are cognitively informative despite being a priori.
Of course, our uncertainty might be limited in various ways. We might know that ‘water’ does not refer to tigers but refers to a chemical substance. Likewise, we might know that ‘water = water’ expresses a necessary proposition even if we don't know which one. Still, the approach is bound to yield some intuitively mistaken results.
The early Stalnaker approach has virtues as well. It can explain communication failure. If A is identical to B, but I lack full competence with the names, I may fail to grasp that ‘A is F’ and ‘B is F’ express the same proposition. If you are competent and believe that I am too, communication failure might ensue. Stalnaker's early account can explain why. However, the early Stalnaker account does not yield a good account of informative necessity and uninformative contingency.
6.3.2 Assertion revisited
In “Assertion revisited” Stalnaker (Reference Stalnaker2004) proposes to treat Kaplanian character as a kind of narrow content but understood metasemantically (i.e., not as content proper). He also proposes to treat a wider class of expressions as context-sensitive. Nearly all names and descriptive expressions are treated as having a generalized variable character. For example, ‘Socrates’ and ‘water’ are treated on the model of indexicals. So, their character is a variable function mapping context to individuals. The generalized character of ‘Socrates lived in Athens’ is a variable function mapping context to propositions (or sets of worlds). In this new framework, it is the variability of the character of ‘water is H2O’ that explains its cognitive informativeness. Stalnaker furthermore assumes that there may be cases in which some speakers, but not others, know what a sentence says. So, a sentence may be informative relative to one speaker but not relative to another.
Stalnaker's new approach, of course, is similar in a number of ways to the first approach I outlined above on behalf of Kaplan. However, it differs from this approach in treating some expressions which intuitively are not twin-earthable as context-sensitive. Like the variable-character approach we considered above on behalf of Kaplan, Stalnaker's new approach has difficulty accounting for the differences in cognitive value between ‘water is water’ and ‘water is H2O’, and between ‘Venus is bright at night’ and ‘Hesperus is bright at night’. If ‘water is H2O’ is informative in virtue of the fact that we don't know which content its variable character determines in our context, then so is ‘water is water’. Likewise, if ‘Venus is bright at night’ is informative because we don't know which content its variable character determines in our context, then so is ‘Hesperus is bright at night’.
Stalnaker's proposal has further problems. One potential problem is over-generation. If the expression ‘The Mayor of Boston’ is treated on a par with ‘I’ and ‘now’, which the framework seems to allow, then it has a variable character which maps the expression from context to an individual. So, ‘the Mayor of Boston is rich’ and ‘Thomas M. Menino is rich’ have the same content (though different characters). This is not an especially controversial proposal. It familiarly goes against Bertrand Russell's (Reference Russell1905) theory of descriptions, according to which descriptions are incomplete symbols which do not denote. But it is in broad agreement with Peter Strawson's (Reference Strawson1950) proposal that descriptions are referential and Keith Donnellan's (Reference Donnellan1966) proposal that descriptions have attributive and referential uses. Elsewhere both Stalnaker and Kaplan have welcomed this consequence of an extended two-stage approach to semantics.6
Furthermore, starting with Strawson many have thought that incomplete descriptions must be treated as context-sensitive. Stanley and Szabó (Reference Szabó2000), for example, introduce nominal restriction to account for the context-sensitivity of quantifiers. If I were to say that every bottle is on the table, I would not ordinarily mean that every bottle in the universe is on the table. What I would mean in the envisaged circumstance is that every <bottle, i> is on the table, where ‘i’ is contextually completed. So, ‘<bottle, i>’ denotes, say, the set of bottles in the kitchen. The problem of incomplete descriptions can possibly be resolved in the same way (see e.g. Stanley Reference Stanley, Preyer and Peter2002b). ‘The book is on the table’ can be treated as being of the form ‘the <book, i> is on the <table, j>’, where ‘<book, i>’ and ‘<table, j>’, once the indexical variables are contextually completed, denote, say, the set of books I have in mind and the set of tables in the living room, respectively.
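The restriction mechanism can be sketched set-theoretically: interpreting <N, i> amounts to intersecting the noun's extension with the contextually supplied value of the variable. The sets and names below are illustrative stand-ins, not Stanley and Szabó's formalism.

```python
def restrict(noun_extension, context_domain):
    """Interpret <N, i>: the noun's extension intersected with the value of i."""
    return noun_extension & context_domain

def every(restricted, predicate_extension):
    """'Every <N, i> is P' is true iff every member of <N, i> is in P."""
    return restricted <= predicate_extension

bottles = {"b1", "b2", "b3", "b4"}   # all bottles in the universe (toy)
kitchen_i = {"b1", "b2"}             # contextual completion of the variable i
on_table = {"b1", "b2", "b3"}

restricted_reading = every(restrict(bottles, kitchen_i), on_table)
unrestricted_reading = every(bottles, on_table)
```

The restricted reading comes out true while the absurd unrestricted reading ("every bottle in the universe") comes out false, which is the contrast nominal restriction is meant to capture.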
However, granting that incomplete descriptions are context-sensitive does not vindicate a treatment of characters as generalized. Nominal restriction, for example, does not require treating characters as generalized. It does admittedly require treating the restrictor-plus-hidden variable as having a variable character that returns a property or an extension relative to context (e.g., the property of being a bottle in my kitchen or the set of bottles in my kitchen). But this character variability is arguably just of the sort postulated in the original Kaplan text. Nominal restriction does not indicate a need for a more generalized treatment of character.
Moreover, without further constraints on which expressions can have variable character, it would seem that the generalized proposal extends to other quantificational expressions. For example, ‘every rich person from Boston’ might have a variable character that takes the expression from a context to a set of people. A referential treatment of universally quantified expressions seems less attractive than a referential treatment of definite descriptions.
The generalized character proposal also threatens to turn intuitively non-twin-earthable (or semantically neutral) expressions like ‘consciousness’, ‘friend’, and ‘cause’ into context-sensitive expressions with variable characters. For example, it might be thought that, given some functional description of consciousness, we can define a variable character that yields different properties relative to different contexts. So, relative to the actual world, ‘is conscious’ and ‘is phenomenally conscious’ might have the same content, whereas at a zombie world ‘is conscious’ and ‘is able to report part of the content of their computational states’ might have the same content. Without further constraints in place, Stalnaker's new proposal seems to threaten to make too many expressions context-sensitive. In response to this sort of worry Stalnaker suggests treating fundamental, natural properties and relations as context-insensitive (and hence as having a constant character).
It might be thought that a limited extension of the variable-character approach would be a good way to account for context-sensitivity more generally. For example, it might be thought that expressions such as ‘local’, ‘nearby’, ‘tall’, ‘big’, and so on, which beg to be treated as context-sensitive, could be treated as themselves having variable characters rather than as being associated with a hidden constituent with a variable character.7 But the motivation for this approach is somewhat meager, given that these expressions can be treated, on the model of nominal restriction, as containing a hidden indexical variable (see, e.g. Stanley Reference Stanley, Preyer and Peter2002b).
Philosophers have in recent years argued that the range of context-sensitive expressions must be widened to include epistemic expressions (e.g., ‘know’ and ‘might’), moral expressions (e.g., ‘decent’ and ‘appropriate’), predicates of personal taste (e.g., ‘fun’ and ‘tasty’) and vague expressions (e.g., ‘bald’ and ‘heap’). If they are right about this, then these expressions may well be best treated on the more generalized variable-character model, though it is possible that the less explored options in these areas of philosophy, viz. the hidden-indexical-variable approach or the epistemic two-dimensional approach, ultimately can provide a better account of the contextual nature of these expressions.
6.4 Chalmers
6.4.1 Epistemic two-dimensionalism
David Chalmers (Reference Chalmers1996, Reference Chalmers2002, Reference Chalmers, Garcia-Carpintero and Macia2006, forthcoming a) and Frank Jackson (Chalmers and Jackson Reference Chalmers and Jackson2001) have proposed a different two-dimensional framework, according to which epistemic variability and cognitive significance are to be explained at least partially in terms of a basic notion of apriority (“partially” because they recognize that a priori sentences may be cognitively informative to non-ideal reasoners). If ‘Hesperus’ is introduced as shorthand for ‘the brightest object in the evening sky’, it is a priori that Hesperus is the brightest object in the evening sky. That is, we can figure out that this is true without engaging in empirical investigations of what the world is like. It is also a priori that Hesperus is Hesperus. But it is not a priori that Hesperus is Phosphorus or that water is H2O. Even though ‘Hesperus is Phosphorus’ and ‘water is H2O’ are necessarily true, it wasn't possible for us to just see that this was so. To figure it out, we needed to examine whether the same planet, namely Venus, appeared both in the morning and at night and whether the chemical composition of water was H2O or some other chemical structure. Over the years Chalmers and Jackson have appealed to the notion of apriority to formulate a two-dimensional framework that avoids the shortcomings of the linguistic approaches. Roughly, the idea is that because we cannot know a priori that water is H2O, there are scenarios compatible with what can be known a priori in which water is not H2O. So, the sentence is cognitively significant. There are some differences between Chalmers's and Jackson's approaches. Here I shall focus on Chalmers's (Reference Chalmers1996, Reference Chalmers2002, Reference Chalmers, Garcia-Carpintero and Macia2006, forthcoming a) framework.
Chalmers offers two versions of his epistemic two-dimensional framework, one which treats the space of possible worlds as plentiful enough to provide a model for deep epistemic possibility (i.e., possibilities not ruled out a priori), and one which constructs epistemic space out of sentences. Let us look at the former proposal first. On the former proposal, there is a space of possible worlds which allows us to define 2-intensions (Kaplan contents or intensions) as either sets of worlds or as functions from worlds to truth-values. 2-intensions are necessarily true in the standard sense when they yield the truth-value true at every world and necessarily false when they yield the truth-value false at every world. On the plenitude proposal, however, the space of possible worlds is large enough to model deep epistemic possibility. A scenario is a possible world in which certain features are marked: in most cases an individual and a time. A world in which certain features are marked is called ‘a centered world’. As a helpful heuristic, we can think of a centered world as an n-tuple of parameters. A centered world in which a speaker and a time is marked can be thought of as a triple containing a world, a speaker, and a time-parameter. For every centered world, there is then a maximal hypothesis about the world in question expressed in a canonical language. A canonical language is a semantically neutral language. To a first approximation, an expression is semantically neutral just in case it is not twin-earthable as defined above.8 Part of the hypothesis about a centered world might be that some heavenly body called ‘Venus’ is the brightest object in the evening sky and that some heavenly body called ‘Venus’ is the brightest object in the morning sky. As ‘Venus’ is not semantically neutral, it is not part of the description that Venus is the brightest object in the evening sky.
The important principle on the plenitude conception of deep epistemic possibility is what Chalmers calls ‘Metaphysical Plenitude’:
Metaphysical Plenitude: For all S, if S is a priori possible, there is a centered metaphysically possible world that verifies S.
Suppose it is not ruled out a priori that Hesperus, the brightest object in the evening sky, is not identical to Phosphorus, the brightest object in the morning sky, despite the fact that it is necessary that Hesperus is identical to Phosphorus. It is then deeply epistemically possible that Hesperus is not identical to Phosphorus. While there is no possible world in which Hesperus is not identical to Phosphorus, there is a canonical description of a centered world which a priori implies that Hesperus is not identical to Phosphorus. This description might, for example, be a description of a centered world in which Jupiter is the brightest object in the evening sky and Mars is the brightest object in the morning sky. As Jupiter is not identical to Mars, the world verifies the sentence ‘Hesperus is not identical to Phosphorus’.
On the alternative picture of scenarios, a scenario is an equivalence class of epistemically complete sentences in a canonical language. A specification of a scenario is a sentence in its equivalence class. The rest of the two-dimensional apparatus goes as before. On the constructive treatment of scenarios, one can in principle allow for strong necessities (i.e., deeply epistemically contingent necessities involving semantically neutral terms).
Given either framework, the 1-intension of an expression is defined as a function from scenarios to extensions. A priori (and uninformative) sentences have epistemically necessary 1-intensions, whereas a posteriori sentences have epistemically contingent 1-intensions.9 So, given this framework ‘water is H2O’ has a necessary 2-intension that yields the truth-value true at every possible world, but it has a contingent 1-intension that yields the truth-value true at some scenarios and the truth-value false at some scenarios. For example, it yields the truth-value false at scenarios that have a canonical description which a priori implies that the clear liquid that flows in rivers, oceans, and lakes is XYZ. All a posteriori necessities, in Kripke's sense, have a 1-intension that comes apart from their 2-intension.
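The contrast between a necessary 2-intension and a contingent 1-intension can be sketched directly; the world labels, scenario labels, and the particular intensions assigned to ‘water is H2O’ below are illustrative assumptions.

```python
WORLDS = {"w1", "w2"}
SCENARIOS = {"s_h2o", "s_xyz"}  # centered worlds given neutral canonical descriptions

def two_intension(world):
    """'water is H2O' relative to the actual context: 'water' rigidly
    designates H2O, so the sentence is true at every world of evaluation."""
    return True

def one_intension(scenario):
    """At a scenario whose canonical description a priori implies that the
    watery stuff is XYZ, the sentence comes out false."""
    return scenario == "s_h2o"

necessary_2 = all(two_intension(w) for w in WORLDS)
contingent_1 = (any(one_intension(s) for s in SCENARIOS)
                and not all(one_intension(s) for s in SCENARIOS))
```

The sentence is 2-necessary (true at every world) yet 1-contingent (false at the XYZ scenario), which is the signature of a Kripkean a posteriori necessity in this framework.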
Chalmers's two-dimensional framework does not inherit the problems of the earlier frameworks. But its explanatory power has some limitations. The framework explains cognitive informativeness for ideal reasoners, but as Chalmers recognizes, it does not explain informativeness for non-ideal reasoners. Despite the fact that all mathematical and logical truths are a priori, many true mathematical conjectures are cognitively informative. Moreover, because the space of scenarios is the space of deep epistemic possibilities, the space of scenarios cannot be used as a way to model hyperintensionality more generally, for instance, belief contexts and strict epistemic possibility contexts. For example, ‘I believe that p and not-p’ might be true, but there is no scenario that verifies p and not-p. However, as Chalmers (forthcoming a) observes, the framework of scenarios can be extended to account for both non-ideal cognitive informativeness and hyperintensionality.
In Chalmers's two-dimensional semantics, scenarios play roughly the role contexts play in Kaplan's framework, narrow contents (1-intensions) play roughly the role that characters play, except that they are part of the semantics proper (“the propositional content”), and wide contents (2-intensions) play roughly the role that Kaplan contents play.
Despite the analogy between these approaches, however, there are significant differences. Chalmers explicitly distinguishes epistemic variability from context-sensitivity. An expression is epistemically variable if it has a non-constant 1-intension, whereas an expression is context-sensitive if it has a non-constant character. Epistemic variability is structurally analogous to standard context-sensitivity in certain respects but is conceptually distinct.
Given Chalmers's framework, standard context-sensitive expressions (e.g. pure indexicals and true demonstratives) have 1-intensions that come apart from their 2-intensions. Their 1-intensions are non-constant functions from the marked individuals and times (or features of the individuals) in the centers of scenarios to extensions. Their 2-intensions are constant functions from the speaker's scenario to extensions. For example, the 1-intension of a token of ‘I’ is a non-constant function from scenarios to the individual in the center. The 2-intension of a token of ‘I’ is a constant function from the speaker's scenario to the speaker. Likewise, the 1-intension of a token of ‘this’ is a non-constant function from scenarios to the object demonstrated by the individual in the center. The 2-intension of ‘this’ is a constant function from the speaker's scenario to the object demonstrated by the speaker.
Context-sensitive expressions are twin-earthable. Oscar and Twin-Oscar's tokens of I have the same 1-intension but different 2-intensions. Other twin-earthable expressions, of course, also have a 1-intension that comes apart from their 2-intension (e.g. ‘water’, ‘Hesperus’, etc.). But Chalmers treats these other twin-earthable expressions as epistemically variable, not as context-sensitive. One advantage of doing so is that Kaplan's original division of expressions into context-sensitive and non-context-sensitive is preserved. Indexicals come out as context-sensitive, whereas names come out as non-context-sensitive.
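These clauses for ‘I’ can be sketched as follows: a scenario is a (world, center) pair, the 1-intension is a non-constant function reading off the center, and the 2-intension fixes the actual speaker once and for all. The world labels and names are illustrative.

```python
# A scenario is a centered world: a (world, individual-at-the-center) pair.
scenarios = [("w1", "Oscar"), ("w1", "Twin-Oscar"), ("w2", "Alice")]

def one_intension_I(scenario):
    """Non-constant: returns whoever occupies the center of the scenario."""
    _world, center = scenario
    return center

def two_intension_I(speakers_scenario):
    """Constant: fixes the actual speaker; the resulting content returns
    that speaker at every world of evaluation."""
    _world, speaker = speakers_scenario
    return lambda world: speaker

content_of_I = two_intension_I(("w1", "Oscar"))
```

Oscar's and Twin-Oscar's tokens of ‘I’ share the non-constant 1-intension, but `content_of_I` is rigid: evaluated at any world it still returns Oscar, mirroring the claim that the 2-intension is a constant function from the speaker's scenario.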
6.4.2 Twin-earthability and context-sensitivity broadly speaking
Despite the differences between epistemic variability and context-sensitivity within the original epistemic two-dimensional framework, I believe one could understand epistemic variability as a kind of context-sensitivity. This proposal is similar to the variable-character approach I sketched above on behalf of Kaplan but it avoids some of the most obvious problems with the earlier suggestions. It takes context-sensitivity, broadly construed, to be grounded in twin-earthability. Chalmers, of course, would not endorse this. But here is some motivation for this line of thought. Suppose we have a more localized Twin-Earth phenomenon. Water in and around Australia is XYZ, whereas water in and around America is H2O. It now seems possible to treat the 1-intension of ‘water’ as a function from locations of individuals in scenarios to chemical substances found at those locations and the 2-intension of ‘water’ as a function from the speaker's location to the chemical substance found at that location.
Of course, there is still the option of denying that the string of letters ‘water’ forms a single lexical item. One could hold that we have here a case of homonymy or polysemy. If this were so, then ‘water’ as used by speakers in Australia and ‘water’ as used by speakers in America would be different lexical entries with the same spelling.
But this rejoinder can be empirically falsified. For example, we might discover that competent American and Australian speakers treat the word ‘water’ in the same way that they treat the word ‘I’, not as a string of letters spelling different words, but as a single lexical entry, and that they are oblivious to the context-sensitivity of the word. To illustrate consider the following dialogue:
(A)
American: I am hungry.
Australian: No, you are wrong, I am not hungry.
Just as competent speakers treat (A) as revealing a failure to recognize that ‘I’ is context-sensitive rather than a failure to recognize that the speakers use two different homonymous words, so we can imagine that competent speakers would treat the following dialogue as revealing a failure to recognize that ‘water’ is context-sensitive rather than a failure to recognize that the speakers use two different homonymous words:
(B)
American: Water is H2O.
Australian: No, you are wrong, water is XYZ.
To say that context-sensitivity, broadly construed, is grounded in twin-earthability is not to deny that there are important differences between, say, indexicals and noun phrases. There is, for example, a difference in how we normally use these two types of expression. As we normally use ‘I’, its 1-intension is a function from centered worlds to individuals in the center. As we normally use ‘water’, its 1-intension is, roughly, a function from scenarios to whichever substance satisfies the description ‘the clear potable liquid that comes out of our kitchen faucets’.
But it is not clear how much weight this difference carries. First, like the 1-intension of ‘water’, the 1-intension of ‘I’ is, roughly, equivalent to the 1-intension of a description, perhaps, ‘dthat [the person who utters this token]’ (Kaplan Reference Kaplan, Almog, Perry and Wettstein1989a: 522). For example, the 1-intension of ‘I am not here’ is plausibly equivalent to the 1-intension of ‘dthat [the person who utters this token] is not here’.
Second, it is not difficult to imagine that ‘water’ could have the 2-intension it actually does and yet have a 1-intension that is definable in much the same way as the 1-intension of ‘I’. Suppose water is person-dependent and happens to have a person-specific chemical structure. For example, chemical structures might be causally connected to the individual essences of people. So, when you pour yourself a glass of clear odorless liquid, the chemical structure is H2O, and when I pour myself a glass of clear odorless liquid, the chemical structure is XYZ. In the envisaged circumstances we might be using ‘water’ with a 1-intension that, at each scenario, picks out the chemical substance at the center of the scenario. The fact that this 1-intension, even in this scenario, very well could be (roughly) equivalent to the 1-intension of ‘the clear, odorless liquid that comes out of the faucet in my kitchen’ only emphasizes my point that there is no obvious reason to treat indexicals as context-sensitive and twin-earthable noun-phrases as context-insensitive.
However, as I mentioned above, Chalmers would not endorse this proposal. Moreover, there is an interesting difference between how indexicals and noun phrases are actually used. Presumably Kaplan was right in thinking that there are no operators in English that can change the context to yield a different content for ‘I’. For example, ‘John believes I am hungry’ cannot be used to say that John believes that John is hungry. However, it is possible that there are some monsters in English that can change the context to yield a different content for noun phrases. Here is how.
If twin-earthable expressions can be treated as a kind of context-sensitive expressions, then whenever we have an expression that has a twin token which yields a different 2-intension elsewhere, we have context-sensitivity. If, now, we can have operators relocating speakers from a place where the expression has one 2-intension to a place where it has a different 2-intension, then we have an operator on context, hence a monster in Kaplan's sense. It might plausibly be thought that there are operators of this kind. Consider a color term like ‘redness’. Let us suppose, for simplicity, that its 1-intension picks out a reflectance type in the actual world but some other property in other worlds. For example, in worlds in which colors are purely qualitative properties instantiated by external objects (as color primitivism would have it), the 1-intension of ‘redness’ picks out these purely qualitative properties. Now consider (2).
(2) In Eden objects instantiate redness in virtue of instantiating a purely qualitative or ‘primitive' color property.
If ‘redness’ as it occurs in (2) picked out a reflectance property, (2) would be false. But (2) looks true. So, it must be that ‘In Eden’ is capable of changing the scenario (or centered world) from the actual one to an Edenic one. So, to the extent that scenarios are contexts in a broad sense ‘In Eden’ is a context-shifting operator.
Other cases of context-shifting are more controversial. Consider the following conditional:
(3) Should this turn out to be Twin-Earth, water is not H2O.
Intuitions seem divided on whether (3) is true or false. I am inclined to think it is true (given the fiction). There are numerous ways to deal with conditionals. But on one plausible account the antecedent ‘should this turn out to be Twin-Earth’ functions as a context-shifting operator which shifts the speaker's context (i.e., scenario) from Earth to Twin-Earth (either by shifting the world or the location within the world). Given a variable-character approach to names, the character of ‘water’ is something like a description which, relative to Twin-Earth, determines that the content of ‘water’ is XYZ, not H2O. Given a version of epistemic two-dimensionalism, it is plausible to think that the antecedent ‘Should this turn out to be Twin-Earth’ shifts us to a different centered world (or scenario or context) that has a specification that a priori implies that water is XYZ. So, while the 1-intension of ‘water’ stays the same, the 1-intension now determines a 2-intension that picks out XYZ at every possible world. If Twin-Earth is a planet in our universe, the operator ‘On Twin-Earth’ shifts the speaker's scenario to a scenario with a different center, viz., a center occupied by Twin-Oscar. The 1-intension of ‘water’ stays the same, but because we have a new center, the 1-intension determines a different 2-intension that picks out XYZ at every possible world.
So, given a reasonable extension of epistemic two-dimensionalism, some operators in English (in the broad sense of ‘operator’ that includes antecedents of conditionals) can perhaps be thought of as context-shifting operators which shift context (i.e., the scenario) to determine a new content (2-intension) for broadly context-sensitive expressions (viz., noun phrases).
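On this picture, the shifting behavior can be made concrete: a fixed 1-intension determines different 2-intensions depending on where the operator relocates the center. The scenario representation, the location-indexed table of "watery stuff," and the operator name below are all illustrative assumptions, not part of Kaplan's or Chalmers's apparatus.

```python
# A scenario here pairs a world with the location of the center.
local_stuff = {"Earth": "H2O", "Twin-Earth": "XYZ"}

def water_1_intension(scenario):
    """Fixed 1-intension: whatever plays the watery role at the center's location."""
    _world, location = scenario
    return local_stuff[location]

def water_is_h2o(scenario):
    return water_1_intension(scenario) == "H2O"

def on_twin_earth(clause):
    """A context-shifting operator ('monster'): evaluates its embedded
    clause at a scenario whose center is relocated to Twin-Earth."""
    def shifted(scenario):
        world, _location = scenario
        return clause((world, "Twin-Earth"))
    return shifted

earth = ("w", "Earth")
plain = water_is_h2o(earth)                   # unembedded: evaluated at my scenario
shifted = on_twin_earth(water_is_h2o)(earth)  # embedded: center relocated
```

Unembedded, ‘water is H2O’ is true at my (Earth) scenario; under the operator the same 1-intension determines a 2-intension that picks out XYZ, so the embedded clause comes out false, exactly the monster behavior described above.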
It may be argued that there are no operators like ‘In Eden’ or ‘Should this turn out to be Twin-Earth’ in English. However, I think English is expressively rich, and we can felicitously utter sentences like (2) and (3). Whether they have true readings is an empirical question. If they do, then there are context-shifting operators, in the broad sense, in English.
I also note that while Chalmers does not endorse these considerations and does not propose to treat 1-contingency in terms of context-shift, the foregoing considerations suggest a way for the two-dimensional approach to be understood this way. If (3) is true in my mouth and we evaluate whether (3) is 1-contingent or not, it might reasonably be suggested that we evaluate the sentence within the scope of an envisaged context-shifting operator, for instance, ‘should this turn out to be Twin-Earth’ or ‘given hypothesis H’. On this approach then, an evaluation of a sentence's 1-modal features requires shifting the scenario (i.e., the context) either by shifting the world part of the scenario or by shifting the center of the scenario. We then look to see what content is determined given this context-shift. I believe this way of thinking about two-dimensional semantics is broadly within the spirit of Chalmers's approach.
6.5 Dynamic two-dimensional semantics
As it stands, epistemic two-dimensional semantics does not allow for a treatment of dynamic discourse.10 Consider the following discourse fragment:
(4) John is now entering the room, and he is now taking off his hat and is therefore not now wearing a hat, but he is now putting the hat back on, and is therefore now again wearing a hat.
If the conjuncts in (4) are thought of as parallel-asserted, then (4) is a contradiction; not so if (4) is asserted at the right sort of pace in non-parallel fashion. Call a non-parallel assertion a ‘dynamical assertion’. So, a parallel assertion of (4) and a dynamical assertion of (4) have different intensions.
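The contrast can be made vivid computationally: evaluated at a single shared ‘now’ the offending conjuncts are jointly unsatisfiable, while evaluated at successive times they are not. The toy hat-wearing history below is an invented illustration.

```python
# Toy history: whether John is wearing a hat at each time.
wearing_hat = {1: True, 2: False, 3: True}

conjuncts = [
    lambda t: not wearing_hat[t],  # '... is therefore not now wearing a hat'
    lambda t: wearing_hat[t],      # '... is therefore now again wearing a hat'
]

def parallel_assertion(t):
    """Every conjunct evaluated at one shared 'now': unsatisfiable."""
    return all(c(t) for c in conjuncts)

def dynamical_assertion(times):
    """Each conjunct evaluated at its own successive 'now'."""
    return all(c(t) for c, t in zip(conjuncts, times))

contradiction = any(parallel_assertion(t) for t in wearing_hat)
coherent = dynamical_assertion([2, 3])
```

`parallel_assertion` is false at every time (it has the form p and not-p), whereas `dynamical_assertion` is true in the toy history, so the two modes of assertion determine different intensions.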
The fact that discourse fragments like (4) exist in natural language suggests a need for a dynamic two-dimensional semantics. Given length considerations I can only briefly sketch how such an approach might proceed. Consider:
(5) John1 is now1 spotting Susan2.
Following Irene Heim (Reference Heim1982), let us introduce the notion of a filing system, that is, a system that keeps track of variables, names, and descriptive material introduced by the discourse. Here is an illustration.
Filing system F1:
x, y, t1
Now t1
John x
Susan y
Spot (x, y)
Additions to the discourse give rise to a new system:
(6) He1 is now2 walking over to her2.
Filing system F2:
x, y, t1, t2
Now t2
John x
Susan y
Spot (x, y) at t1
Walk-over-to (x, y)
(7) And is now3 starting a conversation with her2
Filing system F3:
x, y, t1, t2, t3
Now t3
John x
Susan y
Spot (x, y) at t1
Walk-over-to (x, y) at t2
Start-conversation (x, y)
(8) She2 is now4 talking to a man3.
Filing system F4:
x, y, z, t1, t2, t3, t4
Now t4
John x
Susan y
Man z
Spot (x, y) at t1
Walk-over-to (x, y) at t2
Start-conversation (x, y) at t3
Talk (y, z)
(9) Now5 he1 is talking to the man3 she2 talked to just a moment ago4
Filing system F5:
x, y, z, t1, t2, t3, t4, t5
Now t5
John x
Susan y
Man z
Spot (x, y) at t1
Walk-over-to (x, y) at t2
Start-conversation (x, y) at t3
Talk (y, z) at t4
Talk (x, z)
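A Heim-style file can be modeled as a record of discourse referents together with the conditions entered against them; each new sentence updates the file rather than replacing it. The representation below is an illustrative sketch, not Heim's own notation.

```python
def new_file():
    """An empty file: no discourse referents, no conditions."""
    return {"referents": frozenset(), "conditions": ()}

def update(file, referents, condition):
    """Non-destructive update: a new sentence adds its referents and its
    condition while everything already on file is retained."""
    return {
        "referents": file["referents"] | frozenset(referents),
        "conditions": file["conditions"] + (condition,),
    }

# (5) 'John1 is now1 spotting Susan2' sets up F1.
f1 = update(new_file(), {"x", "y", "t1"}, ("Spot", "x", "y", "t1"))
# (6) 'He1 is now2 walking over to her2' grows the file; the pronouns are
# licensed because 'x' and 'y' are already among the file's referents.
f2 = update(f1, {"t2"}, ("Walk-over-to", "x", "y", "t2"))
```

Each subsequent sentence of (7)–(9) would be a further `update`, so the filing systems F1–F5 form a monotonically growing sequence, which is what the dynamic 1- and 2-intensions below quantify over.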
We can introduce a notion of truth for files as follows. Given an understanding of scenarios as centered possible worlds, we can take 1-models to be pairs <D, I> of a domain D of actual individuals, some of which serve as centers, and an interpretation function I.

The 1-interpretation function I maps the non-indexical variables (i.e., discourse referents) of file F to members of D, indexical variables of F to the center of D, and the descriptive material and names (i.e., predicate nominals) of F to properties or relations (or sets of n-tuples) on D. One set of assignments, i@1, contains as an element the distinguished interpretations: the actual sets of assignments of extensions to the expressions (or the set of actual centered worlds). Let a 1-assignment A in a model M = <D, I> be a mapping of non-indexical variables onto elements of D, and indexical variables onto the centered elements of D. We can then say that assignment A verifies filing system F in M if there is an extension E of A such that the elements satisfy the descriptive material and names at the scenario. A 1-information state is a space of scenarios (interpretations) with the same filing system. A dynamic 1-intension is a sequence of 1-information states. We can then say that discourse fragments express dynamic 1-intensions which are sequences of spaces of scenarios that share a filing system. If we introduce a designated sequence of scenarios, we can then say that a dynamic 1-intension is true at a designated sequence of scenarios just in case each scenario in the designated sequence of scenarios is in the space of scenarios (or interpretations) with the same filing system as the designated scenario.
A different model is needed to model dynamic 2-intensions. A 2-model is a pair of an uncentered domain of actual individuals and an interpretation function. The 2-interpretation function I maps the indexical and non-indexical variables and the associated names (if any) of file F to members of D, and the descriptive material of file F to properties or relations (or sets of n-tuples) on D. One assignment, i@2, is the distinguished interpretation, the actual assignment of extensions to the expressions (or the actual world). Let an assignment A in a model M = <D, I> be a mapping of variables and associated names (if any) onto elements of D. We can then say that assignment A verifies filing system F in M if there is an extension E of A such that the elements satisfy the descriptive material at the scenario.
A 2-information state is a space of worlds with the same filing system. A dynamic 2-intension is a sequence of 2-information states. A dynamic 2-intension is true at a distinguished sequence of worlds just in case each world in the designated sequence of worlds is in the space of worlds with the same filing system as the designated world.
We can give the following validation conditions for universal operators:
□h: □hΦ is true just in case, at every h-admissible interpretation i, i assigns true to Φ.
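The clause can be set out schematically as follows (the notation, in particular writing I_h for the set of h-admissible interpretations, is mine, not the chapter's):

```latex
% General validation clause for a universal operator \Box_h,
% where I_h is the set of h-admissible interpretations:
\[
\Box_h \Phi \ \text{is true} \iff \forall i \in I_h :\; i(\Phi) = \mathrm{T}
\]
% Instantiating h: metaphysical necessity quantifies over 2-interpretations,
% apriority over 1-interpretations.
\]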
So, ‘It is metaphysically necessary that Φ’ is true iff at every 2-interpretation i, i assigns T to Φ, and ‘It is a priori that Φ’ is true iff at every 1-interpretation i, i assigns T to Φ. To illustrate, consider (10) and (11).
(10) The brightest heavenly object is now1 Hesperus1, and the brightest heavenly object is now2 Phosphorus2, and Hesperus1 is identical to Phosphorus2.
(11) It is deeply epistemically possible that: the brightest heavenly object is now1 Hesperus1, and that the brightest heavenly object is now2 Phosphorus2, and that Hesperus1 is not identical to Phosphorus2.
(10) is true iff the dynamic 2-intension it expresses has the truth-value true at the designated sequence of worlds. So, (10) is true if the brightest heavenly object is Hesperus at the worlds that assign the utterance time to the variable associated with ‘now1’, and the brightest heavenly object is Phosphorus at a designated world that assigns the utterance time to the variable associated with ‘now2’, and Hesperus is identical to Phosphorus. (11) is true iff there is a sequence of scenarios at which the dynamic 1-intension of the embedded clause is true. This is so if the brightest heavenly object is the brightest object in the evening sky at the scenarios that assign the time in the center to the variable associated with ‘now1’, and the brightest heavenly object is the brightest object in the morning sky at the scenarios that assign the time in the center to the variable associated with ‘now2’, and the brightest object in the evening sky in scenarios in the first set is not identical to the brightest object in the morning sky in the scenarios in the second set.
6.6 Conclusion
As we have seen, in Kaplan's original framework, context-sensitive expressions are expressions with a variable character – a linguistic meaning that determines different contents relative to different contextual parameters. This approach to context-sensitivity has much to recommend it, but it fails to capture a plausible connection between context-sensitivity, broadly understood, and cognitive informativeness. The two phenomena are not unrelated. Cognitively informative necessities arguably contain what David Chalmers calls “twin-earthable expressions,” linguistic strings that have twin tokens with a different content. Twin-earthable expressions are broadly context-sensitive. When I use ‘water’, the term picks out H2O. When my twin on Twin-Earth uses ‘water’, it picks out XYZ. On the assumption that we speak the same language, an assumption which is consistent with the definition of Twin-Earthability, ‘water’ has a variable character and hence is broadly context-sensitive. But if ‘water’ is context-sensitive, we have a partial explanation of the cognitive significance of sentences such as ‘water is H2O’. The sentence, relative to my context, expresses a proposition of the form ‘a = a’, yet, owing to my cognitive limitations, I may not know what content is determined by the character of ‘water’ at my location. Being told that the character of ‘water’ determines H2O is informative for me. Thus, if we keep fixed the language spoken, there is a tight connection between twin-earthability and cognitive informativeness.
I have assessed three types of two-dimensional semantic frameworks in terms of how well they account for the connection between cognitive significance and the broader notion of context-sensitivity. One is a diagonalized character approach that takes names and kind terms to have a constant character but allows for the possibility that a diagonal proposition that functions much like a variable character has some explanatory power with respect to the phenomenon of cognitive informativeness. A second approach is to widen the range of expressions that have a variable character to include various noun phrases and descriptions. A third approach is to treat expressions as having epistemic meaning in addition to semantic meaning. I argued that while all three approaches go some way toward explaining the connections between context-sensitivity and cognitive informativeness, the non-epistemic approaches seem to fall short of offering a fully adequate account of cognitively informative necessities and informative contingencies. I concluded by pondering a new problem for the epistemic approaches posed by indexicals in dynamic discourses and sketched a solution.
I am grateful to Keith Allan, David Chalmers, Melissa Ebbers, and Kasia Jaszczolt for helpful comments on an earlier version of this paper.
7 Contextualism: Some varieties
A number of distinct (though related) issues are raised in the debate over Contextualism. My aim in this chapter is to disentangle them, so as to get a clearer view of the positions available (where a ‘position’ consists of a particular take on each of the relevant issues simultaneously). The position I defend will be apparent at the end of the chapter.
7.1 The modularity issue
According to a view which used to be standard, and which, for reasons that will soon emerge, I call the modular view, knowledge of a language, and especially semantic knowledge or semantic competence, enables language users to ascribe truth conditions to arbitrary sentences of that language. To be sure, when a sentence is context-sensitive (as most sentences are), it only carries truth conditions ‘with respect to context’; so knowledge of the context is required in addition to knowledge of the language. But (according to the view in question) the context at issue involves only limited aspects of the situation of utterance: who speaks, when, where, to whom and so forth. Given a context thus understood, the rules of the language – e.g. the rule that ‘I’ refers to the speaker – suffice to determine the truth-conditional contribution of context-sensitive expressions. There is no need to appeal, in addition, to pragmatic competence. That is the gist of the modular view.
By ‘pragmatic competence’, I mean the ability to understand what the speaker means by his or her utterance. As Grice emphasised, speaker's meaning is a matter of intentions: what someone means is what he or she overtly intends – or, as Grice says, ‘M-intends’ – to get across through his or her utterance. Communication succeeds when the M-intentions of the speaker are recognised by the hearer. Pragmatic competence is needed to determine what the speaker means on the basis of what she says; but what the speaker says is supposed to be autonomously determined by the semantics (with respect to context), irrespective of the speaker's beliefs and intentions. So the modular story goes.
On this conception, semantics and pragmatics are insulated from each other. Pragmatics takes as input the output of semantics, but they do not mix, and in particular, pragmatic processes do not interfere with the process of semantic composition which outputs the truth conditions.1 This makes sense if one construes semantic competence and pragmatic competence as belonging to two distinct ‘modules’ (Borg Reference Borg2004: chapter 2) – hence my name for the view.2 Semantic competence belongs to the language faculty, Borg says; it is an aspect of our ‘knowledge of language’. Pragmatic competence has more to do with the so-called ‘theory of mind’, that faculty in virtue of which human subjects are able to explain other people's behaviour by ascribing intentions to them and reading their mind.
The modular picture I have just described has started to lose its grip in recent years. Nearly everybody nowadays acknowledges that the reference of indexicals and, more generally, the semantic value of context-sensitive expressions cannot be determined without appealing to full-fledged pragmatic factors (e.g. speaker's intentions). The semantic value of a context-sensitive expression varies from occurrence to occurrence, yet it often varies not as a function of some objective feature of the context but as a function of what the speaker means. Pragmatic competence, therefore, is required not only to determine what the speaker means on the basis of what she says, but also to determine what is said in the first place. That means that we have to give up the modular view, and accept that pragmatics and semantics do mix in fixing truth-conditional content.
Of course, if one wants to maintain a semantics pure of pragmatic intrusion, one can, but then one has to construe the goal of semantics differently than it is on the standard conception. Pure semantics will no longer deliver truth conditions, but it will deliver, say, conditional truth conditions, or schemata, or characters, or propositional radicals, or whatever. To get full-blown truth-conditional content, pragmatics will be needed. This non-modular approach to truth-conditional content is one of the key ingredients of contemporary Contextualism.
7.2 The ‘extent of context-sensitivity’ issue
According to most contemporary theorists, context-sensitivity is pervasive in natural language. In addition to the obvious indexicals, many expressions turn out to be context-sensitive in one way or another. This covers two types of case. There are expressions which display hidden indexicality in that, when properly analysed, they turn out to behave very much like standard indexicals, or to contain hidden constituents which do; and there are expressions which display other forms of context-sensitivity. In any case, we don't know in advance whether a given expression is, or is not, context-sensitive: it is an empirical question, to be resolved through linguistic analysis. So we must reject a certain presumption which was still prevalent twenty years ago and which we may call the ‘literalist presumption’.
To introduce the literalist presumption, let us start from the following (uncontroversial) premiss:
There is a ‘basic set’ of expressions whose content is known to depend upon the context in a systematic manner: the indexicals.
The presumption which the pervasiveness of context-sensitivity leads us to reject can now be stated as follows:
Literalist presumption: expressions not in the basic set are (by default) assumed to be context-insensitive.
The literalist presumption is implicitly at work in a number of fallacious arguments using Grice's ‘Modified Occam's Razor’, or an equivalent principle of parsimony, to demonstrate that a semantic analysis in terms of conversational implicature is preferable to an account in terms of truth-conditional content proper (Recanati Reference Recanati and Tsohatzidis1994, Reference Recanati2004a: 155–8). Classic examples involve the use of Modified Occam's Razor against Strawson's (1952) view of the contextually varying truth-conditional contributions of ‘and’, or against Donnellan's view of the contextually varying truth-conditional contributions of definite descriptions. In each case, the possibility that the relevant expression (which seems to carry different contents in different contexts) might be context-sensitive even though it does not belong to the basic set is ignored, in virtue of the literalist presumption, and the argument proceeds as if the only options available to account for the data were lexical ambiguity on the one hand and conversational implicature on the other hand (with Modified Occam's Razor being used to rule out the former option).
Rejection of the literalist presumption is another key ingredient in contemporary Contextualism. It corresponds to a stance I dubbed ‘Methodological Contextualism’ (Recanati Reference Recanati and Tsohatzidis1994). According to Methodological Contextualism, we don't know in advance which expressions are context-sensitive and which aren't. For all we know, every expression might be context-sensitive. Here the universal quantifier takes scope over the epistemic modal, so what generalises is the possibility of context-sensitivity. For every expression e – including ‘and’ or definite descriptions – it may be that e is context-sensitive and contributes different contents in different contexts (even though e is not ambiguous). As a result, we need to draw a general distinction between the linguistic meaning of an expression and its contribution to propositional content, while allowing for special cases in which they will be identical, instead of doing the opposite (i.e. equating conventional meaning and propositional contribution, while allowing for exceptions – the expressions in the ‘basic set’).
Methodological Contextualism goes together with the view that context-sensitivity is a pervasive phenomenon in natural language. Not everybody accepts this view, however. The so-called ‘semantic minimalists’ (e.g. Cappelen and Lepore Reference Cappelen and Lepore2005a) believe that context-sensitivity is a very limited phenomenon, corresponding roughly to expressions in the basic set, so they don't feel compelled to reject the literalist presumption.
7.3 Does context-sensitivity generalise?
The two contextualist ingredients I have described so far correspond to views that are widely shared among contemporary theorists. Those who still believe in modularity and are faithful to the literalist presumption are a rather small minority (see Chapter 25 for more on their view). But there is another minority, at the other end of the spectrum: the radical contextualists. What characterises their view is the generalisation of context-sensitivity.3
There are several possible arguments for the generalisation of context-sensitivity. First, one should distinguish the claim that context-sensitivity generalises at the sentential level from the much stronger claim that it generalises at the constituent or the lexical level. One generalises context-sensitivity at the sentential level if one holds that the truth conditions of a sentence always depend upon the context (so that there are no ‘eternal sentences’). Here is an example of an argument for that conclusion:
(a) A successful sentence (i.e. a sentence that succeeds in expressing a proposition) expresses either a singular proposition or a general proposition.
(b) If a sentence expresses a singular proposition, it does so in a context-sensitive manner because (i) a sentence expresses a singular proposition only if it contains a successful referring expression (i.e. an expression which succeeds in referring), and (ii) reference is inherently context-sensitive. (That is so because an expression-token only refers to some object in virtue of contextual relations between the token and the object.4)
(c) If a sentence expresses a general proposition, it does so in a context-sensitive manner because (i) a sentence expresses a general proposition only if it contains a (successful) quantificational expression, and (ii) quantification is inherently context-sensitive. (That is so because the domain of quantification depends upon the context. An expression succeeds in quantifying only if the context supplies an appropriate domain of quantification.)
(d) Conclusion: Whenever a sentence expresses a proposition, it does so in a context-sensitive manner.
The argument I have just presented relies on many controversial assumptions regarding, inter alia, the nature of reference, the semantics/meta-semantics distinction, the semantic analysis of plurals, proper names and definite descriptions, quantifier domain restriction, and so on and so forth. It is not my intention to go into these thorny issues here, in order to evaluate the argument. Rather, I will focus on arguments for the generalisation of context-sensitivity at the level of sentential constituents, or at the lexical level, rather than at the sentential level. As I already suggested, the claim that context-sensitivity generalises at the constituent or lexical level is much stronger than the claim that it generalises at the sentential level. What we are now considering is the possibility that every expression (not just every sentence) might be context-sensitive. This is a very radical form of Contextualism indeed.
We have already encountered the claim that ‘every expression might be context-sensitive’ in the context of Methodological Contextualism, but then the universal quantifier ‘every expression’ took scope over the epistemic modal. What Methodological Contextualism meant to generalise to every expression was only the possibility of its being context-sensitive. In the context of Radical Contextualism, the claim that ‘every expression might be context-sensitive’ is understood differently: the modal now scopes over the universal quantifier. The possibility that is being considered is the possibility that, for every expression e, e is context-sensitive. Here what tentatively generalises is (actual) context-sensitivity, not the possibility of context-sensitivity.
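The scope contrast at issue can be made explicit in quantified modal notation. The gloss below is my own schematic rendering, with CS(e) abbreviating ‘e is context-sensitive’ and the diamond read as epistemic possibility:

```latex
% Methodological Contextualism: of each expression, it may be context-sensitive.
\[ \forall e\, \Diamond\, \mathrm{CS}(e) \]

% Radical Contextualism: it may be that every expression is context-sensitive.
\[ \Diamond\, \forall e\, \mathrm{CS}(e) \]
```

The first claim is the weaker one: it licenses only case-by-case inquiry into particular expressions, whereas the second entertains wholesale context-sensitivity across the lexicon.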
In what follows I will present two types of argument for the generalisation of context-sensitivity at the lexical or constituent level. One such argument involves the phenomenon of pragmatic modulation. It is of special importance and will be discussed in sections 7.5 and 7.6. Other arguments, to which I now turn, are based on considerations from lexical semantics.
7.4 Arguments from lexical semantics
What is the meaning of a word? Let us assume that utterances express ‘propositions’ or ‘thoughts’, and that these propositions/thoughts are made out of, or can be analysed into, certain building blocks or constituents, to be called senses. The standard assumption regarding word meaning is that the conventions of the language associate expressions with senses. What an expression contributes, when it is used (together with other expressions) in making a complete utterance, is supposed to be the sense which it independently possesses in virtue of the conventions of the language. Indexicals (and context-sensitive expressions more generally) are considered an exception: their sense is not to be equated with their linguistic meaning, but depends upon the context. Now Radical Contextualism rejects the very idea that the conventions of the language associate expressions with full-fledged senses. It generalises the distinction between the lexical meaning of an expression and its sense (or ‘content’ or ‘propositional contribution’), which is said to depend upon the context.
Putnam's ideas about the lexical semantics of nouns can be seen as a forerunner of Radical Contextualism. Putnam criticises the Fregean idea that the meaning of a noun is a (context-independent) sense which determines the noun's reference (i.e. its extension). In many cases, Putnam points out, we start with the reference: the noun is associated with contextually given exemplars known to fall into the extension of the noun, or, on a more refined picture, with a stereotype used to identify the exemplars to be found in one's local environment. What determines the extension of the noun is not the lexically given stereotype, however, but a certain relation R of similarity to the local exemplars. So the extension-determining sense of an expression is context-dependent on two counts: it depends upon the contextually given exemplars (e.g. whether the transparent, thirst-quenching liquid around us is H2O or XYZ), and it depends upon the relation R, which itself may vary according to the interests of the conversational participants (Putnam Reference Putnam1975: 238–9). The sense thus determined in context is utterly different from the lexical meaning of the noun, which Putnam describes as a ‘vector’ consisting of, inter alia, a ‘semantic marker’ (e.g. liquid) and the exemplar-fixing stereotype.
Putnam's story is meant to apply to the nouns that are used to talk about things in the environment, and which we learn by getting acquainted with the things they are used to talk about. Not merely natural-kind terms like ‘water’, but also, he says, nouns for artefacts like ‘pencil’. It could presumably be extended to other categories of lexical items, such as verbs and adjectives. On a Putnamian semantics, context-sensitivity generalises to all the words which have a ‘referential’ dimension and directly connect up with aspects of the world around us. That is arguably the core of the language.
Another class of expressions which might claim to be ‘the core of the language’, though for totally different reasons, are the most frequently used expressions – light verbs (e.g. get, have, take), prepositions and the like. Such expressions exhibit a high degree of polysemy, and this raises a problem for lexical semantics. What do such words mean? A number of scholars believe their meaning is schematic and has to be fleshed out on any particular use. This suggests that, perhaps, their conventional meaning is not a full-fledged sense. Can one argue that they are ambiguous between a number of distinct senses? That is not obvious because it does not seem that there is a discrete list of such senses available but, rather, a continuum of possible senses to which one can creatively add in an open-ended manner. That is not to say that the meaning of such an expression reduces to an abstract schema: the expression is undoubtedly also associated in memory with conventional ways of using it in collocations with (more or less) determinate senses. All this – the abstract schema or schemata, the collocations, the senses – arguably goes into the linguistic meaning of the expression, which starts looking rather messy. On such a view, the meaning of an expression does not have the right ‘format’ to be what the expression contributes to propositional content.5 In other words, linguistic meanings are not senses (though they may involve senses, inter alia).
The two views of lexical semantics I have just mentioned may well be wrong, of course. When it comes to lexical semantics, nearly everything is up for grabs. That, however, is precisely the point. As theorists, we have an idea what senses are, i.e. what words contribute when we speak. We know, more or less, how to model that. But we know very little about what words themselves mean and what relation there is between word meaning and contributed sense. In view of the limits of our knowledge, it is reasonable to give up the simplifying assumption that linguistic meanings are senses, in order at least to start making serious enquiries in that area.
If we give up that simplifying assumption, we are left with the idea that lexical meaning plays some role in determining the sense which is an expression's contribution to the thought expressed. This idea can be expressed by saying that the sense of an expression is a function of the lexical meaning of that expression and some factor x, where ‘x’ is whatever, in addition to lexical meaning, is needed to determine sense.6 If, as seems very likely, ‘x’ includes the context in which the expression is tokened (and in particular the most important among contextual factors: what one is talking about), then we get a radical form of Contextualism that ‘generalises indexicality’.
7.5 Pragmatic modulation
The arguments from lexical semantics I have just mentioned support Radical Contextualism because they cast doubt on the idea that an expression's contribution to truth-conditional content (its sense) is fixed by the rules of the language independently of context. Most expressions are treated as indexical-like in that their semantic content depends upon the context somehow. But this does not (yet) mean that the semantic contribution of every expression is context-dependent. The next argument for Radical Contextualism leads to that stronger conclusion, however. I call it the argument from pragmatic modulation.
The argument from pragmatic modulation is independent of the arguments from lexical semantics, and it can be put forward even if we grant the assumption which the arguments from lexical semantics lead us to doubt: that the conventions of the language directly associate expressions with full-fledged senses.7 Let us, indeed, grant that assumption. One can still deny that the senses which – on this view – are the meanings of expressions are also what these expressions contribute when they are used (together with other expressions) in making a complete utterance. Because of pragmatic modulation, an expression may, but need not, contribute its sense – i.e. the sense it independently possesses in virtue of the conventions of the language (assuming for the sake of argument that it possesses such a sense); it may also contribute an indefinite number of other senses resulting from modulation operations (e.g. free enrichment, predicate transfer, sense-extension etc.) applied to the proprietary sense.
For example, a sentence like (1) has several readings.
(1) There is a lion in the middle of the piazza.
On one reading ‘lion’ is given a non-literal interpretation and means something like ‘statue of a lion’. On that reading (1) may be true even if, literally, there is no lion in the middle of the piazza (but only a statue of a lion). Or consider (2).
(2) The ATM swallowed my credit card.
This can be given a literal reading, if we imagine a context à la Putnam in which ATMs turn out to be living organisms. But the sentence can also and typically will be interpreted non-literally. In an ordinary context, ‘swallow’ will be given an extended sense, corresponding to what ATMs sometimes do with credit cards (something which, indeed, resembles swallowing). The sentence may be true, on such a reading, even though no real swallowing takes place. In a less ordinary context in which there is a person disguised as an ATM, the predicate ‘ATM’ in the description will be given a non-literal reading: through ‘predicate transfer’ (Nunberg Reference Nunberg1995), it will acquire the meaning ‘person disguised as an ATM’ (just as ‘lion’ acquires the meaning ‘statue of a lion’ in the previous example). The sentence may well be true, on that interpretation, even though no real ATM swallows anything, provided the person disguised as an ATM does swallow the credit card.8
Accepting pragmatic modulation (here, the process mapping the literal meaning of ‘lion’ or ‘ATM’ to the relevant representational reading, or the meaning of ‘swallow’ to its extended reading) as a possible determinant of truth-conditional content leads to a radical form of Contextualism, because modulation itself is context-sensitive: whether or not modulation comes into play, and if it does, which modulation operation takes place, is a matter of context. It follows that what an expression actually contributes to the thought expressed by the utterance in which it occurs is always a matter of context.
Of course, not everybody accepts that pragmatic modulation affects the truth-conditional content of an utterance. On the currently dominant picture, pragmatics comes into play in the determination of truth-conditional content but does so only when the semantic rules of the language prescribe it (as when an indexical demands a contextual value). On this view the only truth-conditional role of pragmatics corresponds to what I have called ‘saturation’ (in contrast to ‘modulation’). Saturation is a pragmatic process of contextual value-assignment that is triggered (and made obligatory) by something in the sentence itself, namely the linguistic expression to which a value is contextually assigned. For example, if the speaker uses a demonstrative pronoun and says ‘She is cute’, the hearer must determine who the speaker means by ‘she’ in order to fix the utterance's truth-conditional content. The expression itself acts as a variable in need of contextual instantiation. So pragmatic competence comes into play, but it does so under the guidance of the linguistic material: the pragmatic process of saturation is a ‘bottom-up’ process in the sense that it is signal-driven, not context-driven. In contrast, pragmatic modulation is a ‘top-down’ or context-driven process, i.e. a pragmatic process which is not triggered by an expression in the sentence but takes place for purely pragmatic reasons – in order to make sense of what the speaker is saying. In other words, it is a ‘free’ pragmatic process – free because it is not mandated by the linguistic material but responds to wholly pragmatic considerations. That is clearly the case for the pragmatic processes through which an expression is given a non-literal interpretation: we interpret an expression non-literally in order to make sense of the speech act, not because this is dictated by the linguistic materials in virtue of the rules of the language.
The dominant view is that the only pragmatic process that can affect truth-conditional content is saturation. No ‘top-down’ or free pragmatic process can affect truth-conditions – such a process can only affect what the speaker means (but not what she says). As Stanley puts it, ‘all truth-conditional context-dependence results from fixing the values of contextually sensitive elements in the real structure of natural language sentences’ (Stanley Reference Stanley2000: 392). Or, as King and Stanley put it, there can only be ‘weak’ pragmatic effects on truth-conditional content. They define a weak pragmatic effect as follows:
A weak pragmatic effect on what is communicated by an utterance is a case in which context (including speaker intentions) determines interpretation of a lexical item in accord with the standing meaning of that lexical item. A strong pragmatic effect on what is communicated is a contextual effect on what is communicated that is not merely pragmatic in the weak sense. (King and Stanley Reference King, Stanley and Szabó2005: 118–19; emphasis mine)
Radical Contextualism rejects the view that only weak pragmatic effects can affect what is said. It holds that truth-conditional content may be affected not only by saturation (as when an indexical is assigned a contextual value) but also by free pragmatic processes of modulation. In (1) the non-literal reading of ‘lion’ arguably results from a pragmatic operation that is not dictated by the lexical item lion in virtue of its standing meaning. There is no slot to be filled, no free variable or context-sensitive element whose value is to be fixed, or anything of the sort. Moreover, nothing (except the desire to make sense of the speaker, in a context in which the literal reading is unlikely) prevents the sentence from being interpreted literally. The pragmatic effect here looks like a strong pragmatic effect, yet it affects truth-conditional content. Since the semantic content of any expression can be pragmatically modulated in this way, what an expression actually contributes depends upon the context.
7.6 Why resist Radical Contextualism?
I have presented five distinct issues relevant to the overall debate between Contextualism and Literalism. On each issue there are two sides: the contextualist side and the literalist side.
Modularity. Is pragmatic competence involved in the determination of truth-conditional content? Contextualism: Yes. Literalism: No.
Extent of context-sensitivity. Is context-sensitivity pervasive in natural language? Contextualism: Yes. Literalism: No.
Generalisation of context-sensitivity (1). Is it true that there are no eternal sentences, i.e. no sentence which expresses a proposition independent of context? Contextualism: Yes. Literalism: No.
Generalisation of context-sensitivity (2). Are all/most expressions like indexicals in that their lexical meaning does not add up to a full-fledged sense? Contextualism: Yes. Literalism: No.
Generalisation of context-sensitivity (3). Does pragmatic modulation affect truth-conditional content? Contextualism: Yes. Literalism: No.
On the first two issues (the modularity issue, and the extent of context-sensitivity issue) Contextualism wins in the sense that it is the dominant position. The role of a speaker's intentions and pragmatic competence in fixing contextual values for indexicals etc. is widely acknowledged, as is the pervasiveness of context-sensitivity in natural language. With respect to the ‘generalisation of context-sensitivity’ issues, however, it is the other way round: Literalism wins, sociologically speaking. There is widespread resistance to the most radical forms of Contextualism – those which generalise context-sensitivity. Why?
In this section, I will discuss some of the reasons one might have for resisting the generalisation of context-sensitivity.9 I will argue that they do not carry much weight. This will bring a sixth issue to the fore – the systematicity issue, to which the final section of this chapter will be devoted.
Let us start with the arguments from lexical semantics. Why not simply accept them? I submit that the main source of resistance to the idea that a word's lexical meaning does not add up to a full-fledged sense is the following. Most semanticists worry more about compositional semantics than about lexical semantics, so they make their lives simpler by uncritically accepting the simplifying assumption (inherited from our elders) that lexical meanings are senses. In this way they don't have to care about how senses are generated – they simply take them as given. Admittedly, this is an acceptable idealisation at a certain stage in the development of semantics; but as soon as one gets interested in the foundations of lexical semantics, one should start by lifting the simplifying assumption in order at least to consider the issue with an open mind.
The resistance to pragmatic modulation is a more serious matter. Here it seems that there are substantive reasons to be suspicious of Radical Contextualism. The first reason is this. If free pragmatic processes are allowed to affect semantic content, semantic content leaps out of control – it is no longer determined by the rules of the language but varies freely, à la Humpty Dumpty. But then, how can we account for the success of communication? Communication (content sharing) becomes a miracle since there is nothing to ensure that communicators and their addressees will converge on the same content. Now communication is possible (it takes place all the time), and there is no miracle. It follows that we should give up the view that free pragmatic processes play a role in the determination of semantic content (Cappelen and Lepore Reference Cappelen and Lepore2005a: chapter 8).
This argument fails, I believe, because the problem it raises is a problem for everybody, as soon as one gives up the modular view. Whenever the semantic value of a linguistic expression must be pragmatically inferred, the question arises, what guarantees that the hearer will be able to latch on to the exact same semantic value as the speaker? Whether the pragmatic process at stake is saturation or modulation is irrelevant as far as this issue is concerned, so the argument fails as an argument specifically intended to cast doubt on pragmatic modulation.10
Another argument against pragmatic modulation as a possible determinant of semantic content can be put as follows. ‘What is said’, the truth-conditional content of an utterance, is what is literally said, and that – by definition – has to be determined by the conventions of the language. Pragmatics can enter the picture, provided its role is to assign a contextual value to a lexical item in a bottom-up manner, i.e. in accord with (and under the guidance of) the conventional meaning of that context-sensitive item. In contrast, strong pragmatic effects achieved in order to make sense of the speech act without being linguistically mandated take us into the realm of speaker's meaning, away from literal meaning.
Insofar as this argument is based upon a certain understanding of the phrase ‘what is said’ (or ‘what is literally said’), it is not substantive, but verbal. There is no doubt that one can define ‘what is said’ in such a way that only weak pragmatic effects can affect what is said. But what the advocate of pragmatic modulation means by ‘what is said’ corresponds to the intuitive truth-conditional content of the utterance.11 According to the contextualist side in the debate, the intuitive truth conditions of an utterance of (1) or (2) are affected by free pragmatic processes. Assuming this is true, this does not prevent us from defining another notion of what is said, conforming to literalist standards. Let ‘what is saidmin’ be the proposition expressed by an utterance when strong pragmatic effects have been discounted (the so-called ‘minimal proposition’), and let ‘what is saidint’ correspond to the intuitive truth conditions of the utterance. According to the contextualist view under discussion, what is saidint may be affected by top-down pragmatic processes. This is compatible with the claim that only weak pragmatic effects can affect what is saidmin. So that claim in no way counters the idea that pragmatic modulation affects the (intuitive) truth-conditional content of utterances.
According to yet another argument, if we accept the view that pragmatic modulation affects semantic content, we blur the semantics/pragmatics distinction to the point where there no longer is any difference between what is said and what is meant; so – assuming this distinction is essential – we should reject the view that pragmatic modulation affects truth-conditional content. I find this argument unconvincing, because acknowledging the effects of pragmatic modulation on truth-conditional content (what is saidint) in no way prevents one from distinguishing what is saidint from other things that are conveyed by an utterance without belonging to its intuitive truth-conditional content, e.g. the particularised conversational implicatures (Grice Reference Grice1989) or the effects achieved through ‘staging’ (Clark H. Reference Clark1996). In other words, we can distinguish between ‘primary’ pragmatic processes, such as modulation, and ‘secondary’ pragmatic processes that do not contribute to what is saidint (Recanati Reference Recanati and Davis1989, Reference Recanati2004a). So we do not lose the distinction between what is said and what is meant.
Conclusion: the three arguments against pragmatic modulation I have extracted from the literature and presented in this section are no good.12 However, the most important argument against pragmatic modulation is, by far, the systematicity argument, which I have not yet introduced. That argument, which I take to be the main source of resistance to Radical Contextualism, deserves separate discussion since it raises a new issue relevant to the overall debate.
7.7 The systematicity issue
Many theorists argue as follows. If Radical Contextualism is true, the project of constructing a systematic truth-conditional semantics for natural language is doomed to failure. We should therefore reject Radical Contextualism, since it leads to scepticism. Or rather, we should reject that ingredient which is incompatible with the project of constructing a systematic truth-conditional semantics. That is not the generalisation of indexicality prompted by the lexical semantics considerations; for, if indexicality is compatible with formal semantics, generalised indexicality should be compatible with it, too. The problem, rather, comes from the acceptance of pragmatic modulation as a determinant of semantic content. That is what we should reject.
Of course, one has to say why the acceptance of pragmatic modulation as a determinant of semantic content is incompatible with the project of building a systematic semantics. Here is a first sketch of an argument for that conclusion.
In contrast to the contextual assignment of values to indexicals, modulation is not driven by the linguistic meaning of words. Nothing in the linguistic meaning of the words whose sense is modulated tells us that modulation ought to take place. Modulation takes place purely as a matter of context, of ‘pragmatics’; what drives it is the urge to make sense of what the speaker is saying. So modulation is unsystematic. If we allow it as a determinant of semantic content, we make it impossible to construct a systematic theory of semantic content.
I grant the objector that modulation is unsystematic. Still, I think it is easy to make room for it within a systematic semantics.13 In general, nothing prevents unsystematic factors from being handled systematically, by being assigned their proper place in the theory. In the case at hand, we can define a function mod taking as argument an expression e and the context c in which it occurs: the value of mod is the particular modulation function that is contextually salient/relevant/appropriate for the interpretation of that expression in that context. If no modulation is contextually appropriate and the expression receives its literal interpretation, the value of mod will be the identity function. In this framework, we can distinguish between the literal sense of a simple expression e, namely its semantic interpretation I(e), and the modulated sense M(e)c carried by an occurrence of e in context c. The modulated sense of an expression e (in context c) results from applying the contextually appropriate modulation function mod (e, c) to its semantic interpretation I(e):
M(e)c = mod(e, c)(I(e))
So far, this is very standard: in distinguishing I(e) from M(e)c we are just appealing to the traditional semantics/pragmatics distinction. What is not standard is the claim that the semantic interpretation of a complex expression (e.g. a sentence) is a function of the modulated senses of its parts and the way they are put together (Recanati Reference Recanati2010: chapter 1). This is what examples like (1) and (2) suggest if we take at face value the effects of modulation on truth-conditional content which they seem to display. On the resulting view the semantic process of composition and the pragmatic process of sense modulation are intertwined. For simple expressions, their semantic interpretation is their literal sense, but for complex expressions pragmatic modulation is allowed to enter into the determination of semantic content. This is non-standard, for sure, but there is nothing unsystematic about this view.
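The picture just sketched can be rendered as a toy interpreter (a hypothetical illustration; the names Context, modulated_sense and interpret are assumptions, not from the text): simple expressions receive a literal interpretation I(e), the context supplies a modulation function mod(e, c) defaulting to the identity function, and composition operates on the modulated senses rather than the literal ones.

```python
# Toy sketch of modulated compositional interpretation.
# All names and the 'stone lion' example are illustrative assumptions.

IDENTITY = lambda sense: sense

# Literal interpretations I(e) of simple expressions (toy values).
I = {
    "stone": "stone",
    "lion": "lion",
}

class Context:
    """A context supplies, for each expression, a modulation function.
    If none is contextually appropriate, mod(e, c) is the identity."""
    def __init__(self, modulations=None):
        self.modulations = modulations or {}

    def mod(self, e):
        return self.modulations.get(e, IDENTITY)

def modulated_sense(e, c):
    """M(e)_c = mod(e, c)(I(e)): apply the contextually appropriate
    modulation function to the literal interpretation."""
    return c.mod(e)(I[e])

def interpret(phrase, c):
    """Composition applies to the (possibly modulated) senses of the
    parts, so modulation feeds the compositional machinery."""
    return " ".join(modulated_sense(word, c) for word in phrase)

# A context in which 'lion' gets a non-literal, representational reading.
c1 = Context({"lion": lambda s: s + "-statue"})
c0 = Context()  # no modulation: every word keeps its literal sense

print(interpret(["stone", "lion"], c1))  # stone lion-statue
print(interpret(["stone", "lion"], c0))  # stone lion
```

Note that the composition step sees only modulated senses, which is exactly the non-standard move: pragmatic modulation enters the determination of the content of complex expressions instead of following it.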
The systematicity objection can be understood differently, however. What is not systematic enough, according to the objection, is not so much the radical contextualist's theory of utterance interpretation, but utterance interpretation itself (what the theory is about) as construed by the radical contextualist. We have seen that, in the contextualist framework with its ‘free’ pragmatic processes, interpretation (content recovery) is no longer driven by the linguistic material. In introducing modulation (in contrast to saturation), I said that in saturation ‘pragmatic competence comes into play, but does so under the guidance of the linguistic material’, whereas modulation ‘is not triggered by an expression in the sentence but takes place for purely pragmatic reasons – in order to make sense of what the speaker is saying’. This suggests that, for a radical contextualist, utterance interpretation is pragmatic through and through and does not significantly differ from ‘the kind [of interpretation] involved in interpreting kicks under the table and taps on the shoulder’ (Stanley Reference Stanley2000: 396): it is not the systematic affair which formal semanticists have claimed it to be. In this way, we reach the conclusion that Radical Contextualism is incompatible with the programme of formal semantics.
Thus understood, I think the objection is confused. Even though free pragmatic processes, i.e. pragmatic processes that are not mandated by the standing meaning of any expression in the sentence, are allowed to enter into the determination of truth-conditional content, still, in the framework I have sketched, they come into the picture as part of the compositional machinery. Semantic interpretation remains grammar-driven even if, in the course of semantic interpretation, pragmatics is appealed to not only to assign contextual values to indexicals and free variables but also to freely modulate the senses of the constituents in a top-down manner. Semantic interpretation is still a matter of determining the sense of the whole as a function of the (possibly modulated) senses of the parts and the way they are put together.
If what I have just said is right, the systematicity issue is orthogonal to the other issues I discussed as relevant to the Contextualism/Literalism debate. The systematicity issue can be formulated thus:
Systematicity. Is semantic interpretation a matter of holistic guesswork (like the interpretation of kicks under the table), rather than an algorithmic, grammar-driven process as formal semanticists have claimed? Contextualism: Yes. Literalism: No.
On that issue I am happy to part company with the most radical contextualists – the ‘sceptics’ who would go for the holistic guesswork answer (assuming they exist, which I doubt). Like Stanley and the formal semanticists, I maintain that semantic interpretation is grammar-driven. But this issue is orthogonal to the others! So, without contradicting what I have just said, I can still hold that a good deal of holistic guesswork comes into play in semantic interpretation, e.g. in order to fix the values of context-sensitive elements or to pick the right modulation functions. Both in the case of saturation and in the case of modulation – the two types of contextual process that play a part in the determination of truth-conditional content – one has to rely heavily on pragmatic competence. Accepting this point means that one endorses a quite radical form of Contextualism; but this is compatible with maintaining that semantic interpretation is grammar-driven and proceeds recursively. I conclude that there is no reason to take one's adherence to the project of building a systematic semantics to prevent one from being a contextualist with respect to any of the issues talked about earlier.
Of course, a radical contextualist will be prone to set limits to systematicity in semantics; but that is not the same thing as getting rid of systematicity altogether. As I said, when it comes to fixing the values of context-sensitive elements or to picking the right modulation functions, pragmatic competence takes over: formal semantics has nothing to say regarding the pragmatic mechanisms in play, and the hand-waving word ‘salience’ which semanticists like to use is only a placeholder. But these limitations put on semantics are nothing but the price to pay for giving up the modular view. We have to admit, once and for all, that (intuitive) truth-conditional content or ‘what is said’ is not something purely semantic – something that can be retrieved simply by exercising one's semantic competence. Pragmatic competence massively comes into the picture. To reach that conclusion, however, there is no need to consider anything fancier than deictic pronouns. In particular, there is no need to go into the ‘generalisation of context-sensitivity’ issues which have loomed large in our discussion of Radical Contextualism.
What I have just said shows that there is a continuum of positions with respect to systematicity. The more literalist one is, the stronger the form of systematicity one will be in a position to claim for semantics. This means that even a moderate contextualist – someone who (merely) gives up the modular view and accepts the pervasiveness of context-sensitivity – will have to set limits to systematic semantics. These limitations I think most language theorists currently accept. Radical contextualists who generalise context-sensitivity simply go a bit farther in the same direction, but, I insist, there is only a difference of degree between them and the moderate contextualists. (In this respect I agree with Cappelen and Lepore Reference Cappelen and Lepore2005a.) Radical Contextualism, therefore, is not a revolutionary position which threatens the programme of formal semantics, as many theorists have claimed. If revolution there is, it antedates Radical Contextualism and coincides with the advent of the moderate form of Contextualism which almost everybody embraces.
The research leading to this paper has received funding from the European Research Council under the European Community's Seventh Framework Programme (FP7/2007–2013) / ERC grant agreement n 229 441 – CCC.
8 The psychology of utterance processing: Context vs salience
8.1 Introduction
Consider the cartoon in Figure 8.1,1 which is an optimal innovation: it includes a novel stimulus intended to further activate coded, salient meanings, so that both the novel and the salient, though different, may interact and affect pleasurability (Giora, Reference Giora2003; Giora et al., Reference Giora, Fein, Kronrod, Elnatan, Shuval and Zur2004).
Put differently, this cartoon will be optimally innovative only to those familiar with the coded meanings it echoes (see e.g., the poster in Figure 8.2). The initiated audiences are thus invited to invoke a wide range of coded, salient meanings which might allow an insight into the ironic message of the cartoon. However, those not in the know will only have access to what is literally spelled out, superimposed on the pictorial background (in addition, of course, to the figurative title and the symbol of the crown). Indeed, if taken at face value, the cartoon encourages people to be terrified and to stop what they are doing on account of the nasty weather (referred to in the background). But the poster in Figure 8.2, if retrievable from memory, will bring to mind an altogether different meaning against which this cartoon must be weighed: the British mindset on the eve of World War II – the “British restraint and stiff upper lip.”2

Figure 8.1. Spirit of the Blitz 2009

Figure 8.2. Keep Calm and Carry On
Additionally, one of Lance Corporal Jones's catchphrases in Dad's Army – the British sitcom about the Home Guard in the Second World War (Perry and Croft, Reference Perry and Croft1968–Reference Perry and Ludlow1977) – might also spring to mind, echoing, via another irony, the mindset derided here:
(1) Don't panic!3
The title of the cartoon – Spirit of the Blitz 2009 – allows another ironic turn of the screw, reminding us of the spirit of the blitz exercised by the people of Britain, who, during the 1940 bombing of London, exhibited stoical courage and endurance.4
All these meanings, if coded and available, will be invoked automatically as a direct response to the stimulus in Figure 8.1, despite their apparent irrelevance to the immediate context at hand – the ferocious weather in Britain during December 2009. When weighed against contextual information – the blitz-like storm which forced many in Britain to assume the spirit of the blitz rather than the spirit of Christmas – these meanings strongly deride the panicky spirit of the Brits on Christmas Eve 2009.
The way this cartoon can be interpreted illustrates the need to take various factors into consideration when examining the end-product of utterance interpretation. Indeed, the psychology of utterance processing takes into account a number of factors that shape utterance interpretation, such as (i) salient/coded meanings, (ii) contextual information, and (iii) their unfolding interaction (or lack of it).
Debates within psycholinguistics can, thus, be viewed as divisible into two main approaches (for a review, see Giora, Reference Giora2003: chapters 1–3). At one end of the spectrum are context-based models which assume that a strong context reigns supreme in that it governs early processes and facilitates contextually compatible meanings only. Consequently, the output of the interpretation processes must be seamless, involving no contextually incompatible meanings and interpretations (the connectionist model, e.g., Bates, Reference Bates, Bizzi, Calissano and Volterra1999; Bates and MacWhinney, Reference Bates, MacWhinney, MacWhinney and Bates1989; MacWhinney, Reference MacWhinney and MacWhinney1987; Small et al., 1988; the constraint-based model, e.g., McRae et al., Reference McRae, Spivey-Knowlton and Tanenhaus1998; Pexman et al., Reference Pexman, Ferretti and Katz2000; the direct access view, e.g., Gibbs, Reference Gibbs1979, Reference Gibbs1994; Keysar, Reference Keysar1994a; Ortony et al., Reference Ortony, Schallert, Reynolds and Antos1978).
At the other end are lexicon-based models which hold that coded meanings of stimuli are speedy responses, activated automatically, regardless of contextual information. As a result, initially accessed meanings may fail to meet context fit. Consequently, they may induce incompatible interpretations which will either feature in final outputs alongside the appropriate ones or will be subjected to revisitation or suppression processes (the modular view, e.g., Fodor, Reference Fodor1983; Swinney, Reference Swinney1979; the standard pragmatic model, e.g., Grice, Reference Grice, Cole and Morgan1975; Searle, Reference Searle1979a; the graded salience hypothesis, e.g., Giora, Reference Giora1997, Reference Giora2003; Peleg et al., 2004, Reference Peleg and Eviatar2008). The context-based and the lexicon-based models, then, make different predictions, especially with regard to initial processes, which, in turn, affect later interpretation processes.
Specifically, according to the context-based view, specific and supportive contextual information penetrates lexical access and selects the contextually appropriate meaning only, the consequence of which is contextually compatible interpretations only (section 8.1.1). In contrast, the lexicon-based view predicts that coded meanings cannot be blocked and, therefore, at times, end-product interpretations will also involve contextually incompatible meanings and interpretations (section 8.1.2). To tease apart these two approaches, we need to look at research into utterance interpretation processes.
8.1.1 Context-based approaches
Context-based approaches focus on the facilitative effects of strong contexts which allow them to select only compatible meanings and interpretations. Thus, according to the connectionist model, when words (bulb), ambiguous between salient/dominant (‘light’) and less salient/subordinate (‘flower’) meanings, are preceded by a context strongly biased toward one of the meanings, their processing will result in selecting the contextually appropriate meaning exclusively, regardless of degree of salience. For instance, processing The gardener dug a hole. She inserted the bulb resulted in an exclusive activation of the less-salient, subordinate (‘flower’) meaning of bulb when probed immediately (Vu et al., 1998; Vu et al., Reference Vu, Kellas, Metcalf and Herman2000).
However, as shown by Peleg and Giora (in press) and Peleg et al. (Reference Peleg, Giora and Fein2001, Reference Peleg, Giora, Fein, Noveck and Sperber2004, Reference Peleg and Eviatar2008), this finding need not attest to selective access; it might just as well be the effect of a predictive context, which guesses the intended meaning without interacting with lexical processes. For instance, Peleg et al. (Reference Peleg, Giora and Fein2001) show that guessing the contextually appropriate meaning in a context biased toward the less salient meaning occurred even before the processor encountered the relevant stimulus (bulb). Additionally, as shown by Peleg and Eviatar (Reference Peleg and Eviatar2008, Reference Peleg and Eviatar2009), briefly following the encounter of the stimulus in question at 250 ms stimulus onset asynchrony (SOA), the salient incompatible meaning as well as the less salient compatible meaning were both activated. This was further qualified by the type of homograph. In the case of homophonic homographs (bulb), both meanings were activated at a short (150 ms SOA) delay and remained active even 100 ms afterwards; in the case of heterophonic homographs (tear), the contextually appropriate, less salient meaning was activated exclusively in the left hemisphere at 150 ms SOA, but 100 ms later (at 250 ms SOA), the salient but contextually incompatible meaning also became available (Peleg and Eviatar, Reference Peleg and Eviatar2008, Reference Peleg and Eviatar2009). When probed later, 1000 ms after encountering the homophonic homograph, the left hemisphere selected the contextually appropriate (less salient) meaning, whereas both the salient and less salient meanings were still activated in the right hemisphere. 
At this long delay, 1000 ms following the onset of the ambiguous word, the left hemisphere was unable to suppress the salient contextually inappropriate meaning of heterophonic homographs, while the right hemisphere could (see Peleg et al., Reference Peleg and Eviatar2008). Such findings demonstrate that salient meanings cannot be blocked, not even by a strong context.
According to the constraint-based model, both contextual and lexical “constraints” may affect end-product interpretations, depending on their quantitative strength. The greater the number of constraints favoring a specific meaning/interpretation, the greater the chance that it will be selected exclusively (McRae et al., Reference McRae, Spivey-Knowlton and Tanenhaus1998). For instance, if contextual information is biased toward an ironic interpretation of a target and, in addition, involves other biasing factors (such as a speaker whose profession indicates s/he could be ironic), such a strong context should facilitate the appropriate (ironic) interpretation only, even though its literal meaning may be more salient.
Findings, however, show that such contexts did not facilitate non-coded, inferred or novel interpretations such as irony, but instead slowed down ironic targets compared to more salient literal counterparts (Pexman et al., Reference Pexman, Ferretti and Katz2000; but see Ivanko and Pexman, Reference Ivanko and Pexman2003 for similar but also for somewhat different results, argued against in Giora et al., Reference Giora, Horn and Kecskes2007b).
The direct access view (Gibbs, Reference Gibbs1979, Reference Gibbs1986, Reference Gibbs1994) argues against the temporal priority of utterance-literal interpretation (posited by Grice, Reference Grice, Cole and Morgan1975), contending instead that, in a strongly supportive context, interpretations of literal and non-literal utterances should exhibit similar interpretive processes. Indeed, when Ortony et al. (Reference Ortony, Schallert, Reynolds and Antos1978) embedded statements, ambiguous between literal and novel (metaphoric) interpretations, in poor contexts, literal utterances were faster to read; however, when provided with rich contextual support, both literal and metaphoric statements took similarly long to read, thus testifying to context's facilitative effects on novel metaphors.
Later studies, however, failed to replicate these results. Rather, novel metaphoric items, embedded in supportive contexts, always took longer to read compared to their literal interpretation (Brisard et al., Reference Brisard, Frisson and Sandra2001; Giora and Fein, Reference Giora and Fein1999a; Pexman et al., Reference Pexman, Ferretti and Katz2000; Tartter et al., Reference Tartter, Gomes, Dubrovsky, Molholm and Vala Stewart2002; see also Giora, Reference Giora1997, Reference Giora1999). Similarly, non-coded ironic utterances were always processed literally first despite a strongly supportive context (Giora et al., Reference Giora, Fein, Aschkenazi and Alkabets-Zlozover2007b; Giora et al., Reference Giora, Fein, Kaufman, Eisenberg, Erez, Brône and Vandaele2009).
8.1.2 Lexicon-based approaches
Lexicon-based approaches focus on the insensitivity of lexical processes to contextual information. According to the modular view (Fodor, Reference Fodor1983), cognitive processes are either domain-specific or domain-general. Domain-specific processes (such as lexical access) are modular: they are low-level bottom-up processes, which are sensitive only to relevant stimuli (e.g., lexical items). Among other things, modular processes are informationally encapsulated, that is, impenetrable to processes occurring outside the input system. In contrast, domain-general, central systems, such as contextual information, consist in top-down, integrative, and predictive processes that are receptive to outputs of various domains.
Modular processes such as lexical access, then, are not affected by top-down feedback from higher-level representations such as contextual information or world knowledge. Rather, lexical access is autonomous and, on some views, exhaustive: all the meanings of a lexical stimulus are activated once this stimulus is encountered, regardless of either contextual bias or degree of salience. However, once these meanings are activated, contextual, central system processes may influence them. For instance, the central system may either integrate them with contextual information or discard them from the mental representation as contextually incompatible (for a review of other versions of modular and also hybrid models, see Giora, Reference Giora2003: chapters 1–3).
The standard pragmatic model (Grice, Reference Grice, Cole and Morgan1975; Searle, Reference Searle1979a) may be viewed as a version of a modular view, attributing properties such as imperviousness to contextual information and, consequently, temporal priority to literal meanings. According to the standard pragmatic model, the meanings of a linguistic stimulus to be activated first are literal. On the basis of these literal meanings, utterance-literal interpretations are to be constructed first. However, if literally-based (meanings and) interpretations do not meet contextual fit, suppression of these representations will take place, to be followed by their replacement with contextually appropriate alternatives.
Following the modular view (Fodor, Reference Fodor1983), the graded salience hypothesis (Giora Reference Giora1997, Reference Giora1999, Reference Giora2003; Peleg and Giora, in press; Peleg et al., Reference Peleg, Giora and Fein2001, Reference Peleg, Giora, Fein, Noveck and Sperber2004, Reference Peleg, Giora and Fein2008) assumes two kinds of mechanisms that run parallel: a bottom-up modular system, which is encapsulated and autonomous in that it is impervious to context effects (e.g., lexical access), and a top-down central system (e.g., contextual information), which is integrative but can also be strong enough to predict a compatible meaning or interpretation. Diverging from the modular view, however, the graded salience hypothesis posits that lexical access is ordered: salient meanings are activated faster than less salient ones. In addition, suppression of contextually incompatible meanings is not unconditional but rather functional; it is sensitive to discourse goals and requirements, allowing for contextually incompatible meanings and interpretations to be retained if invited or if supportive or non-intrusive of the intended interpretation (see also Giora and Fein, Reference Giora and Fein1999a; Giora et al., Reference Giora, Horn and Kecskes2007a).
According to the graded salience hypothesis, salience is a matter of degree: a meaning is salient if it is coded in the mental lexicon and enjoys prominence due to cognitive priority (e.g., prototypicality, stereotypicality) or amount of exposure (e.g., experiential familiarity, frequency, or conventionality), regardless of degree of literality; a meaning is less salient if it is coded but low on these variables, regardless of degree of literality; a meaning is non-salient if it is non-coded – either novel or derivable (e.g., on the basis of contextual information), regardless of degree of literality.
Although salience is a property of words and fixed expressions rather than a property of utterances’ compositional meaning and interpretation, utterance interpretation may often rely on the salient meanings of its components. Interpretations that are based on the salient meanings of the utterance components are salience-based interpretations and could be both literal and non-literal (see Giora et al., Reference Giora, Fein, Aschkenazi and Alkabets-Zlozover2007b).
Rich and specific contextual information can be predictive of an oncoming message, as well as supportive and facilitative. However, even when it is rich enough to activate meanings and interpretations on its own accord, it does not penetrate lexical processes but runs parallel. As a result, often salient but inappropriate meanings and, consequently, salience-based but inappropriate interpretations might be involved in utterance interpretation. Such inappropriate interpretations need not be suppressed; they may be retained, provided they do not interfere with the final contextually compatible interpretation. The result is the involvement of such interpretations in the final outputs of utterance interpretation (e.g., the salience-based, often literal interpretation of ironies and metaphors, see Brisard et al., Reference Brisard, Frisson and Sandra2001; Giora and Fein, Reference Giora and Fein1999a; Pexman et al., Reference Pexman, Ferretti and Katz2000; Tartter et al., Reference Tartter, Gomes, Dubrovsky, Molholm and Vala Stewart2002; see also Giora, Reference Giora1997, Reference Giora1999).
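The division of labour that the graded salience hypothesis posits can be pictured with a toy sketch (purely illustrative names and salience scores; this is not an implementation of the hypothesis): lexical access is context-blind and returns coded meanings in order of salience, while a separate, central-system step selects the contextually compatible meaning and may retain, rather than suppress, the incompatible ones.

```python
# Toy illustration of graded salience. Scores and names are assumptions.

LEXICON = {
    # (meaning, salience score): 'light' is dominant, 'flower' subordinate.
    "bulb": [("light", 0.9), ("flower", 0.4)],
}

def lexical_access(word):
    """Bottom-up and context-blind: return all coded meanings,
    ordered by salience (most salient first)."""
    return [m for m, s in sorted(LEXICON[word], key=lambda p: -p[1])]

def integrate(word, context_bias, retain_incompatible=False):
    """Central-system step, applied only after access: select the
    contextually compatible meaning; incompatible meanings may be
    retained (if non-intrusive) rather than unconditionally suppressed."""
    accessed = lexical_access(word)
    selected = context_bias if context_bias in accessed else accessed[0]
    retained = accessed if retain_incompatible else [selected]
    return selected, retained

# A gardener context biases toward the subordinate 'flower' meaning,
# but the salient 'light' meaning is still accessed first.
print(lexical_access("bulb"))                    # ['light', 'flower']
print(integrate("bulb", context_bias="flower"))  # ('flower', ['flower'])
```

On this toy picture, the salient ‘light’ meaning of bulb is accessed first even in the gardener context, matching the claim that salient meanings cannot be blocked; only the later integration step reflects contextual fit.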
8.2 Salient meanings and salience-based interpretations are not necessarily literal
According to the graded salience hypothesis (Giora, Reference Giora2003, 2006; Giora and Fein, Reference Giora and Fein1999a), neither salient meanings nor salience-based interpretations need to be literal. Similarly, non-salient interpretations need not be figurative.
8.2.1 Salient meanings are not necessarily literal
A number of studies demonstrate that salient meanings need not be literal. For instance, in Gibbs (1980), familiar (English) idioms (spill the beans), whose salient meaning is figurative, took less time to read in an idiomatically than in a literally biasing context, the latter inviting a salience-based literal interpretation. In Giora and Fein (Reference Giora and Fein1999b), familiar (Hebrew) ironies, whose salient meanings were both ironic and literal, were processed initially (at 150 ms ISI [interstimulus interval]) both ironically and literally, regardless of context bias; in contrast, less familiar ironies, whose salient components were literal, were processed initially only literally, regardless of context bias.
In Colston and Gibbs (Reference Colston and Gibbs2002), familiar (English) metaphors (This one's really sharp said of a student) were faster to process when embedded in metaphorically than in ironically biasing contexts. Indeed, one of the salient meanings of their keyword (e.g., sharp) is metaphorical, which accounts for their metaphorical salience-based interpretation. When embedded in an irony-biasing context, the salience-based interpretation of the ironic use should be metaphorical and will have to be adjusted to the ironically biased contextual information.
Similarly, in Giora et al. (forthcoming), (healthy) participants were faster to respond to familiar (Hebrew) metaphors (flower bed), whose salient meaning is metaphorical, than to novel ones whose non-salient interpretation is metaphorical (golden laugh), even though the individual words that made up the target word-pairs were similarly highly familiar.
8.2.2 Salience-based interpretations are not necessarily literal
Given that salient meanings need not be literal (section 8.2.1), it follows that salience-based interpretations need not be sensitive to degree of literality either. After all, they are derived on the basis of the salient meanings of the utterance's components, which may be either literal or non-literal. Recall that the ironic This one's really sharp, whose salience-based interpretation is metaphorical (relying on the salient, metaphorical meaning of sharp) took longer to process in an ironically than in a metaphorically biased context (see Colston and Gibbs, Reference Colston and Gibbs2002), since the salience-based (metaphorical) interpretation of the ironic use is contextually incompatible.
8.2.3 Non-salient interpretations are not necessarily non-literal
Consider the example in Figure 8.1, which is literal: its final interpretation includes a number of salient meanings to which it adds a novel twist. Consider further Know Hope (in Figure 8.3)5 which is a literal, non-salient use, harping on the salient, literal “no hope.”

Figure 8.3. Know Hope
In Giora et al. (Reference Giora, Fein, Kronrod, Elnatan, Shuval and Zur2004) we studied such (Hebrew) innovations, termed “optimal innovations.” To be optimally innovative, a stimulus should evoke a novel – less or non-salient – response to a familiar stimulus, alongside a salient one from which it differs (both quantitatively and qualitatively), so that both can interact, regardless of non-literality. Admittedly many optimal innovations may be non-literal (novel metaphors or unfamiliar ironies). However, quite a few are literal (as can be deduced from Figures 8.1 and 8.3). Consider the following examples: Body and sole – the name of a shoe shop – which evokes the salient body and soul, both of which are literal but only the first is non-salient; Curl up and dye (the name of a hair salon) which is non-salient and literal and which evokes the salient but metaphorical Curl up and die (see Giora et al., Reference Giora, Fein, Kronrod, Elnatan, Shuval and Zur2004).
Given that optimal innovation involves processing a salient meaning on top of the novel interpretation, it is no wonder that optimal innovations take longer to process than their salient meanings, regardless of metaphoricality (as shown by Giora et al., Reference Giora, Fein, Kronrod, Elnatan, Shuval and Zur2004). Indeed, in Giora et al. (forthcoming: Experiment 2), both healthy individuals and individuals diagnosed with Asperger’s syndrome took longer to process and more frequently erred on (Hebrew) non-salient interpretations compared to salient ones, regardless of metaphoricality. Thus, literal optimal innovations such as a Tverian horse (meaning ‘a horse from Tiberias’), reminiscent of a salient, metaphorical collocation – a Trojan horse – were slower to induce correct meaningfulness judgments compared to familiar literal collocations, whose meanings are salient. Non-salient interpretations, then, are not necessarily non-literal.
8.3 Opting for the literal interpretation is not necessarily a default strategy
To further test the claim that opting for a non-literal rather than a literal interpretation may be a default strategy, independent of explicit contextual information (including information about the speaker and the addressee), one needs to neutralize factors affecting non-literality such as degree of salience (recall that familiar items may have a lexicalized non-literal meaning; see section 8.2.1), semantic anomaly (known to trigger metaphoricality; see e.g., Beardsley, Reference Beardsley1958), and contextual information (since breach of pragmatic maxims or contextual misfit may invite a non-literal interpretation; see e.g., Grice, Reference Grice, Cole and Morgan1975). It thus follows that for this claim to be experimentally substantiated, testing it should involve novel items susceptible to a literal interpretation and presented outside a specific context.
Indeed, in Giora et al. (2010), materials were affirmative statements of the form X is Y (such as This is Memorial Day; I am your doctor) and their negative versions (This is not Memorial Day; I am not your doctor), all of which could, potentially, be assigned a literal interpretation in that they were all equally novel, semantically intact, and presented in isolation (see Figures 8.4 and 8.5).

Figure 8.4. This is Memorial Day

Figure 8.5. This is not Memorial Day
Participants were asked to rate the interpretation of these targets on a seven-point scale ranging between two specific (either literal or non-literal) interpretations presented randomly at the scale's ends.
Results show that the negative statements were rated as significantly more metaphorical than their affirmative counterparts, supporting the claim that opting for the literal interpretation need not be a default strategy.
Indeed, when the negative statements were embedded in contexts equally supportive of either their metaphorical or their literal interpretation, reading times were faster for items embedded in metaphorically than in literally biasing contexts (Giora et al., 2011a), further supporting the claim that a non-literal interpretation may be a default interpretation.
Additional evidence supportive of this claim comes from research into irony interpretation. Giora et al. (2005b) show that negating affirmative overstatements results in assigning these statements an ironic interpretation, even though these statements are amenable to literal interpretation as well. Specifically, findings demonstrate that negative overstatements (He is not exceptionally bright) come across as ironic even outside a specific context. When presented in isolation, they are rated as more ironic than other alternatives such as affirmative overstatements (He is exceptionally bright) or negated non-overstatements (He is not bright).
Taken together, findings from various negative statements indicate that it is not the case that, when available, the literal interpretation is a default. Instead, our findings demonstrate that even non-conventional utterances, susceptible to literal interpretation, are often perceived as non-literal, even when no information supportive of that interpretation is made manifest.
8.4 Context effects – later interpretation processes
Although the various approaches outlined above differ in their predictions with regard to the effects of a “strong context” on early lexical processes (section 8.1), there seems to be agreement that contextual information should affect later interpretation processes. But what these effects should look like is still a matter of debate. To test the various predictions of these approaches with regard to later processes, I will focus here on late effects of a “strong context” (one that anticipates an ironic utterance) on irony interpretation and on the effects of coherence on later processes of negated information.
8.4.1 Irony interpretation
According to context-based approaches, if context is strongly biased in favor of the appropriate (ironic) interpretation, only that interpretation should be activated immediately and feature exclusively in the final product. According to some lexicon-based approaches, inappropriate meanings as well as inappropriate (literal) interpretations (of irony) should be activated immediately even in the presence of a strong context. However, later, they should be discarded from the mental representation so that the final product features only contextually appropriate (ironic) interpretations (Grice, Reference Grice, Cole and Morgan1975; Fodor, Reference Fodor1983).
Whereas these two approaches have similar predictions with regard to how contextual information should affect the final (ironic) representation, the graded salience hypothesis has different predictions. According to this hypothesis, salience-based yet inappropriate interpretations (the salience-based interpretation of novel metaphors and unfamiliar ironies), which are activated immediately, need not be discarded. They may be retained since they contribute to (or at least do not disrupt) the final interpretation processes (Giora, Reference Giora2003).
But what makes up a strong context? According to Gibbs (Reference Gibbs1986, Reference Gibbs2002), a context may be strong enough to facilitate the ironic interpretation of an utterance exclusively if it sets up an “ironic situation” through contrast between what is expected and the reality that frustrates it. Inducing an expectation for an ironic utterance should allow ironic interpretation to be tapped directly, with no recourse to contextually inappropriate utterance-level interpretations. According to Gibbs, then, a strong context is one that allows addressees to anticipate an ironic utterance. This expectation should, in turn, render irony interpretation frictionless (the expectation hypothesis).
But a close look at “ironic situations” reveals that they need not promote an expectation for an ironic utterance, nor do they facilitate irony interpretation (Giora et al., Reference Giora, Fein, Kaufman, Eisenberg, Erez, Brône and Vandaele2009; see also Ivanko and Pexman, Reference Ivanko and Pexman2003). Rather, such contexts encouraged readers to select literal utterances (This demonstration is a remarkable failure), which were by far the most preferred option, over ironic ones (This demonstration is a remarkable success), whether following a context featuring a frustrated expectation (section 8.4.1.1) or a fulfilled one (section 8.4.1.2); see Giora et al. (Reference Giora, Fein, Kaufman, Eisenberg, Erez, Brône and Vandaele2009).
8.4.1.1 Frustrated expectation
Shirley is a feminist activist. Two weeks ago, she organized a demonstration against the closure of a shelter for victimized women, and invited the press. She hoped that due to her immense efforts many people would show up at the demonstration, and that the media would cover it widely. On the day of the demonstration, twenty activists arrived, and no journalists showed up. In response to the poor turnout, Shirley muttered:
a. This demonstration is a remarkable success. (Ironic)
b. This demonstration is a remarkable failure. (Literal)
8.4.1.2 Realized expectation
Shirley is a feminist activist. Two weeks ago, she organized a demonstration against the closure of a shelter for victimized women, and invited the press. As always, she prepared herself for the idea that despite the hard work, only a few people would show up at the demonstration and the media would ignore it entirely. On the day of the demonstration, twenty activists arrived, and no journalists showed up. In response to the poor turnout, Shirley muttered:
a. This demonstration is a remarkable success. (Ironic)
b. This demonstration is a remarkable failure. (Literal)
In addition, “ironic situations” did not facilitate irony. Rather, ironic statements (A skiing vacation is recommended for your health) took as long to read following a context featuring a contrast between what is expected and the reality that frustrates it (section 8.4.1.3) as following a context in which this expectation is met (section 8.4.1.4).
8.4.1.3 Frustrated expectation
Sagee went on a skiing vacation abroad. He really likes vacations that include sport activities. A relaxed vacation in a quiet ski-resort place looked like the right thing for him. Before leaving, he made sure he had all the equipment and even took training classes on a ski simulator. But as early as the beginning of the second day he lost his balance, fell, and broke his shoulder. He spent the rest of the time in a local hospital ward feeling bored and missing home. When he got back home, his shoulder still in a cast, he said to his fellow workers:
“A skiing vacation is recommended for your health”. (Ironic)
Everyone smiled.
8.4.1.4 Realized expectation
Sagee went on a skiing vacation abroad. He doesn't even like skiing. It looks dangerous to him and staying in such a cold place doesn't feel like a vacation at all. But his girlfriend wanted to go and asked him to join her. As early as the beginning of the second day he lost his balance, fell, and broke his shoulder. He spent the rest of the time in a local hospital ward feeling bored and missing home. When he got back home, his shoulder still in a cast, he said to his fellow workers:
“A skiing vacation is recommended for your health”. (Ironic)
Everyone smiled.
Importantly, however, both ironic targets took longer to read than a salience-based (literal) interpretation which followed a context featuring no expectation (section 8.4.1.5).
8.4.1.5 No-expectation
Sagee went on a skiing vacation abroad. He has never practiced skiing so it was his first time. He wasn't sure whether he would be able to learn to ski and whether he could handle the weather. The minute he got there he understood it was a great thing for him. He learned how to ski in no time and enjoyed it a lot. Besides, the weather was nice and the atmosphere relaxed. When he got back home, he said to his fellow workers:
“A skiing vacation is recommended for your health”. (Salience-based, literal)
Everyone smiled.
8.4.1.6 Will expecting an ironic utterance facilitate it initially?
What, then, can make a strong context, such as would induce an expectation for an ironic utterance? In Giora et al. (Reference Giora, Horn and Kecskes2007b), we showed that the involvement of an ironic speaker in vivo (in context mid-position, in bold for convenience) induced an expectation of another such utterance on the part of that speaker when these contexts were presented without the final utterances and had to be completed by participants. We therefore used these contexts, completed by an utterance biased either toward the ironic interpretation (2) or toward the salience-based (literal) interpretation (3), in a reading experiment.
(2)
Barak: I finish work early today.
Sagit: So, do you want to go to the movies?
Barak: I don't really feel like seeing a movie.
Sagit: So maybe we could go dancing?
Barak: No, at the end of the night my feet will hurt and I'll be tired.
Sagit: You're a really active guy …
Barak: Sorry, but I had a rough week.
Sagit: So what are you going to do tonight?
Barak: I think I'll stay home, read a magazine, and go to bed early.
Sagit: Sounds like you are going to have a really interesting evening.
Barak: So we'll talk sometime this week.
(3)
Barak: I was invited to a film and a lecture by Amos Gitai.
Sagit: That's fun. He is my favorite director.
Barak: I know, I thought we'll go together.
Sagit: Great. When is it on?
Barak: Tomorrow. We will have to be in Metulla in the afternoon.6
Sagit: I see they found a place that is really close to the center.
Barak: I want to leave early in the morning. Do you want to come?
Sagit: I can't, I'm studying in the morning.
Barak: Well, I'm going anyway.
Sagit: Sounds like you are going to have a really interesting evening.
Barak: So we'll talk sometime this week.
Although both contexts raised an expectation for an ironic utterance, identical targets (Sounds like you are going to have a really interesting evening) took longer to read following ironically biasing contexts (2) than following salience-based, literally biasing contexts (3). Strong contexts, then, inducing an expectation for an ironic utterance, did not facilitate ironic interpretations nor did they slow down salience-based, literal interpretations.
To further test the expectation hypothesis we attempted to induce an expectation of an ironic utterance by presenting participants only with contexts that ended in an ironic utterance (4), so that they were trained to anticipate an ironic utterance. This (+Expectation) condition was compared to a weaker (–Expectation) condition in which only half the contexts ended in an ironic utterance; the other half ended in a non-ironic utterance (5). Results from lexical decisions to probes related to ironic (“harmful”) and salience-based (“healthy”) utterance-level interpretations showed facilitation of the salience-based interpretation only, regardless of context bias. This was true when short (250 ms) as well as long (750–1000 ms) processing time was allowed. This pattern of results was no different from the one obtained in the weaker condition. Such results suggest that, contra the direct access view, but in keeping with the graded salience hypothesis, inducing an expectation for an ironic utterance does not facilitate ironic interpretation immediately, nor does it render the interpretation process seamless.
(4) Yuval and Omry went out for their lunch break after a morning of work. They went to the cafeteria in their office building and each filled a platter with food. They stood in line for a long while and were eager to start the meal. When they had sat down, Yuval saw that his colleague chose fried sausage, chips, a glass of coke for a drink, and a sugar‐glazed doughnut for dessert. Then Yuval said: “I see that you picked the ideal meal today!”
(5) Yuval and Omry went out for their lunch break after a morning of work. They went to the cafeteria in their office building and each filled a platter with food. They stood in line for a long while and were eager to start the meal. When they had sat down to eat, Yuval saw that his colleague filled his platter with salad, tofu, and sprouts and chose natural carrot juice for a drink. Then Yuval said: “I see that you picked the ideal meal today!”
Importantly, in Giora et al. (2011b; see Giora 2011), we strengthened the ironically biasing condition used in Giora et al. (Reference Giora, Horn and Kecskes2007b) by introducing an additional constraint, informing participants that the aim of the experiment was to test irony interpretation. The control group, whose experimental design was mixed, raising no expectation, were not informed about this specific aim of the experiment; their contextual information was therefore weaker compared to that in which expectation of ironic utterances was made more pronounced, both implicitly and explicitly. Still, although contextual information was now more strongly biased in favor of the ironic interpretation, this did not affect the pattern of results which replicated those obtained earlier (in Giora et al., Reference Giora, Horn and Kecskes2007b). Even this multiple constraints condition did not facilitate irony interpretation; only salience-based albeit incompatible interpretations were made available in both the strongly and weakly biasing conditions.
Would allowing extra processing time make a difference? In Giora et al. (2011b) we allowed participants longer (1500 ms) processing time. We predicted that even if, at this stage, irony is understood, salience-based but incompatible interpretations would still be available. Indeed, as predicted, even at such a long delay, the pattern of results did not change: the salience-based though incompatible interpretation was never less accessible than the ironic yet compatible interpretation.
Such results demonstrate that understanding utterances in a strong context supportive of and anticipating their non-salient (ironic) interpretation via inducing an expectation of such an interpretation does not unconditionally involve dispensing with the salience-based but incompatible interpretation. Do salience-based interpretations have a role in shaping up utterance-final products?
We have seen that processing utterances in context may involve entertaining meanings and interpretations on account of their salience rather than because of their contextual compatibility. According to the suppression/retention hypothesis (Giora, Reference Giora2003), such meanings and interpretations, when incompatible, will either be retained or discarded from the mental representation depending on the role they might play in shaping the contextually appropriate interpretation. When they might contribute to the final representation they will be retained. For instance, on the indirect negation view, the involvement of salience-based interpretation in irony processing allows computing the gap between what is said and the reality referred to (Giora, 1995; Giora et al., Reference Giora, Fein, Kaufman, Eisenberg, Erez, Brône and Vandaele2009); on the tinge hypothesis (Dews and Winner, Reference Dews and Winner1995, Reference Dews, Winner, Mandell and McCabe1997, Reference Dews and Winner1999; Dews et al., Reference Dews and Winner1995), the involvement of salience-based interpretations in irony processing is functional in mitigating the negativity of ironic criticism and the positivity of ironic praise. Although salience-based interpretations might not be the intended interpretation, they are instrumental in shaping it up and are therefore retained.
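The suppression/retention hypothesis, as just stated, amounts to a conditional filter over activated interpretations. A minimal sketch under that reading, with invented names and predicates (the hypothesis itself says nothing about implementation):

```python
def filter_interpretations(activated, is_compatible, contributes_to_final):
    """Toy filter mirroring the suppression/retention hypothesis:
    contextually compatible interpretations are kept; incompatible
    ones are retained only if they serve the final representation
    (e.g. computing the said/reality gap in irony, or tingeing
    ironic criticism); otherwise they are suppressed."""
    retained = []
    for interp in activated:
        if is_compatible(interp) or contributes_to_final(interp):
            retained.append(interp)   # retained in the mental representation
        # else: suppressed (discarded from the mental representation)
    return retained
```

The point of the sketch is simply that suppression is conditional on function, not triggered automatically by contextual incompatibility.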
The involvement of salience-based but incompatible interpretations in the late stages of irony interpretation supports the graded salience hypothesis and the suppression/retention hypothesis (Giora, Reference Giora1997, Reference Giora1999, Reference Giora2003). However, it argues against both the standard pragmatic model (Grice, Reference Grice, Cole and Morgan1975)7 and the direct access view (Gibbs, Reference Gibbs1986, Reference Gibbs2002), which assume that the products of irony processing will not involve incompatible interpretations.
8.4.2 Negation interpretation
Consistent with the graded salience hypothesis, results obtained from studies of irony interpretation support the view that salient meanings and salience-based interpretations are activated initially, regardless of a strong contextual bias toward the non-salient interpretation. They further demonstrate that, as predicted by the retention/suppression hypothesis (Giora, Reference Giora2003; Giora and Fein, Reference Giora and Fein1999a), suppression of contextually incompatible meanings and interpretations is not unconditional. Instead, suppression is attuned to contextual goals and requirements and would not operate if apparently incompatible information may be conducive to the final interpretation.
A great number of studies probing the effect of negation (no, not) on patterns of activation of negated concepts show that, outside a specific context, negated information (‘fast’ in The train to Boston was no rocket) is activated initially (between 100 and 500 ms), as is non-negated information (Hasson and Glucksberg, Reference Hasson and Glucksberg2006; Giora, Balaban, Fein, and Alkabets, 2005a; Kaup et al., Reference Kaup, Yaxley, Madden, Zwaan and Lüdtke2007; MacDonald and Just, Reference MacDonald and Just1989: Experiments 1–2 reading phase). However, later on (between 500 and 1000 ms), initial levels of activation of negated concepts drop to baseline levels (Hasson and Glucksberg, Reference Hasson and Glucksberg2006). When given extra processing time (1500 ms), negated information is suppressed and replaced by an alternative opposite (Kaup et al., Reference Kaup, Lüdtke and Zwaan2006; for conflicting results, however, see Lüdtke et al., 2008). Outside a specific context, then, the suppressive effect of negation is a default strategy.
However, when a late context, whether coherent or incoherent with the negation (The train to Boston was no rocket. The trip to the city was *fast*, though), was provided, comprehenders did not dispense with the negated, albeit incompatible concept (‘fast’), even as long as 1000 ms following its mention. Retention of apparently irrelevant, salience-based interpretations allows late context to resonate with earlier context, despite the indication to the contrary invited by the negation marker (Giora et al., Reference Giora, Horn and Kecskes2007a; on discourse resonance, see Du Bois, Reference Du Bois1998, Reference Du Bois2001; on discourse resonance following negation, see Giora, Reference Giora, Horn and Kecskes2007).
Contextual effects on the retention of negated information were also found in Kaup (Reference Kaup2001) and Kaup and Zwaan (Reference Kaup and Zwaan2003). Results of these studies show that a concept's accessibility may be affected by its presence in the situation model rather than by negation. In the following (6), the fact that a referent (photographs) is not removed from the situation allows its retention despite it being within the scope of negation (Kaup, Reference Kaup2001):
(6) Elizabeth tidied up her drawers. She burned the old letters but not the photographs. Afterwards she cleaned up.
Similarly, in Kaup and Zwaan (Reference Kaup and Zwaan2003), only concepts absent from the situation described lost accessibility, regardless of negation. In contrast, negated concepts present in the situation described retained their accessibility even after being entertained for as long as 1500 ms.
Suppression and retention are functional, then, and conform to global rather than to local coherence considerations (Giora, 2006).
8.5 Coda
Is our mind “efficient” enough to engage in processing only contextually appropriate interpretations given that contextual information is strongly supportive of that interpretation, as argued by the contextualist school (e.g., Gibbs, Reference Gibbs1994; Vu et al., Reference Vu, Kellas, Metcalf and Herman2000; see section 8.1.1)? Luckily, it is not. In fact, there is enough evidence now to allow the conclusion that processing utterances, even inside highly biasing contexts, may involve entertaining meanings and interpretations solely on account of their meaning salience and consequently their salience-based interpretation, regardless of contextual fit, as argued by the graded salience hypothesis (see section 8.1.2). Additionally, the activation of such incompatible meanings and interpretations in utterance processing is not unconditionally aborted by suppression processes (as assumed by Fodor, Reference Fodor1983 or Grice, Reference Grice, Cole and Morgan1975). Rather, such meanings and interpretations are retained because they are deemed functional in shaping up final interpretations (the suppression/retention hypothesis, Giora, Reference Giora2003). Such a functional view of suppression and retention allows for the poetics of linguistic and non-linguistic stimuli such as optimal innovations (whether literal or non-literal; section 8.1), for discourse resonance, and humor (section 8.4).
It has also become clear that, contra Grice (Reference Grice, Cole and Morgan1975), non-literal interpretations may be a default interpretation even when innovative, free of semantic anomaly, and context-less (section 8.2.3). Indeed, when affirmative statements of the form X is Y are negated (This is not Memorial Day), they are assigned a non-literal interpretation even when presented in isolation. Similarly, negative overstatements such as He is not exceptionally bright are assigned an ironic interpretation even outside a specific context.
The review of the literature introduced in this chapter reveals that the psychology of utterance processing is a multi-faceted phenomenon; its products may, at times, be surprisingly creative and even amusing.
This paper was supported by a grant to the second author by THE ISRAEL SCIENCE FOUNDATION (grant No. 652/07). Thanks also go to Ran Abramson for the cartoons in Figures 8.1 and 8.2 and for example (1), and to Keith Allan and Kasia Jaszczolt for their very valuable comments.
9 Sentences, utterances, and speech acts
A gleam pushed through the sleepiness in his grey eyes, and he sat up a little in his chair, asking: ‘Leggett's been up to something?’
‘Why did you say that?’
‘I didn't say it. I asked it.’
9.1 Introduction
Most of the time, when we speak, we do more than express propositions; we suggest, promise, offer, accept, order, threaten, assert – we perform speech (or illocutionary) acts. The history of the research on this topic – initiated by Austin (Reference Austin1975) – is well-documented, and many textbooks, handbooks and encyclopaedias contain excellent surveys, thus treating speech acts as a major topic (e.g. Levinson Reference Levinson1983: chapter 5; Jaszczolt Reference Jaszczolt2002: chapter 14; Sadock Reference Sadock, Horn and Ward2004). However, the main contemporary pragmatic theories of utterance interpretation devote little space, if any at all, to the way utterances are interpreted as speech acts, that is to the way they are assigned an illocutionary force (see, for instance, Sperber and Wilson Reference Sperber and Wilson1986; Levinson Reference Levinson2000; Carston Reference Carston2002; Recanati Reference Recanati2004a; Jaszczolt Reference Jaszczolt2005). One might think that speech acts went out of fashion simply because the topic had been exhausted by the considerable number of publications spanning from Austin's work in the late fifties to the late eighties – when other topics, such as the pragmatic determinants of literal meaning, came to the fore.
Yet, contemporary literature is rife with confusions stemming from the lack of careful consideration of the role of illocutionary force attribution in utterance interpretation. In particular, two crucial mistakes must be avoided. The first consists in conceiving of illocutionary forces as determined by sentence meaning; the second equates utterance content and speech act content. Interestingly enough, both confusions can be traced back to the founding fathers of modern pragmatics: the former to Searle, the latter to Grice. I will start this chapter by considering these two problematic legacies in turn. Next, we will see how avoiding the confusion between sentence meanings, utterance meanings and illocutionary contents helps to better grasp the major issues related to the analysis of illocutionary force attribution.
9.2 Searle: illocutionary forces as intrinsic to sentence meanings
According to Searle (Reference Searle1969, Reference Searle and Gunderson1975a, b; Searle and Vanderveken Reference Searle and Vanderveken1985), the meaning of a sentence corresponds to the speech act any literal utterance of this sentence constitutes. In Searle's conception, the study of linguistic meaning amounts to the study of speech acts. It follows that in order to determine the literal illocutionary force of an utterance, it suffices to know its linguistic meaning. Searle's view rests on the assumption (explicitly stated by his Expressibility Principle; cf. Searle Reference Searle1969: 20–21) that any illocutionary act type IA can be matched with a certain sentence type s in such a way that IA corresponds to the literal meaning of s. Detailed criticisms of such a ‘literalist’ view – which takes literal utterance meaning to be determined by sentence meaning – can be found in Recanati (Reference Recanati1987: 219–24), Carston (Reference Carston2002: 30–42, 64–70) and Kissine (2011). The important point for present purposes is that Searle's conception implies that illocutionary forces are located at the level of conventional sentence meaning. In his view, if a sentence is used literally, its illocutionary force is directly derivable from its linguistic meaning.
Sentences with imperative mood might seem to provide the strongest case for such a direct derivation of illocutionary force from sentence meaning. This is the reason why I will use imperative sentences to build up my case against incorporating illocutionary forces within sentence meaning. In languages that have a morphological imperative mood – and we will see in a moment that this qualification is important – directive speech acts such as requests, orders, commands etc. are prototypically realised by uttering grammatically imperative sentences. Of course, this does not entail that the linguistic, conventional meaning of the imperative mood is to be analysed in terms of a directive illocutionary force. Yet, I am not flogging a dead horse here, for such an idea underlies many recent accounts of imperative mood: see, for instance, Han (Reference Han2000), Barker (Reference Barker2004), Portner (Reference Portner2007) or (Benjamin) Russell (Reference Russell2007). Without getting into the details, all such theories presuppose that the imperative mood, at the level of sentence meaning, encodes the notion that the speaker (S) is prompting the addressee (A) to bring about the truth of the propositional content (for recent and critical surveys, see Schwager Reference Schwager2006; Iatridou Reference Iatridou2009).
As is well known, many imperative sentences may be used to perform non-directive speech acts (e.g. Wilson and Sperber Reference Wilson, Sperber, Dancy, Moravcsik and Taylor1988). The most obvious examples include permissions (1), advice (2), good wishes (3), and threats (4):
(1) A: May I have this piece of cake?
B: Yes, take it.
(2) Always cut your fingernails round and your toenails square.
(from Hamblin Reference Hamblin1987: 11)
(3) Have a nice journey.
(4) Hit me, and I'll hit you back.
Virtually everyone agrees that in (sincerely) performing a directive speech act FD(p), where FD stands for the directive force and p for the propositional content, S expresses her intention/desire that A bring about the truth of p with FD(p) as a reason (e.g. Searle Reference Searle and Gunderson1975a; Bach and Harnish Reference Bach and Harnish1979; Searle and Vanderveken Reference Searle and Vanderveken1985; Alston Reference Alston2000). Clearly, the examples in (1)–(4) can be felicitously uttered even though it is mutually manifest that S does not intend A to bring about the truth of the propositional content; for instance, because S would prefer A not to (1), because S does not care whether A does so or not (2), because A has no active control over the truth of the propositional content (3) or because A's bringing about the truth of the propositional content is undesirable for S (4).1
Those scholars who analyse the meaning encoded by the imperative mood as including some reference to the directive illocutionary force have two options available in order to explain away examples like (1)–(4).2 The first consists in elaborating a semantic account of the imperative mood that is flexible enough to predict non-directive uses of imperative sentences (e.g. Wilson and Sperber Reference Wilson, Sperber, Dancy, Moravcsik and Taylor1988; Clark B. Reference Clark1993; Allan Reference Allan2006a; Schwager Reference Schwager2006). This amounts to rejecting the equivalence between the directive illocutionary force and the imperative mood – which is precisely a theoretical recommendation I wish to make in this chapter. The second option is to maintain that the imperative mood encodes the illocutionary directive force, and to claim that (1)–(4) are either indirect or non-literal speech acts.
Let us start by considering the possibility that (1)–(4) are indirect speech acts. Traditionally, a speech act is said to be indirect whenever its performance by means of the utterance u requires u to constitute another, direct speech act. (We will qualify this definition later on, but it is perfectly suited for the needs of the present discussion.) For instance, while (5) is literally a question, it will also constitute a request in many contexts.
Likewise, (6) is, literally, an assertion; but it can be interpreted as a directive speech act.
(6) You've had enough beer already.
The important point is that while (5) and (6) often constitute indirect directive speech acts, they still remain a question and an assertion, respectively. In other words, if an illocutionary act IA2 is performed by way of performing another illocutionary act IA1, such that both IA1 and IA2 correspond to the same utterance u, the interpretation of u as IA1 remains available.
In sum, the claim under examination holds that (1)–(4) are, qua direct speech acts, directives (since the imperative mood encodes the directive illocutionary force), but that, qua indirect speech acts, they are interpreted as permission, advice, a good wish and a threat, respectively. However, it is impossible to interpret (1)–(4) as directive speech acts, except in very specific contexts. And whenever one sets up such a context, it becomes impossible to interpret the examples at hand as permission, advice, a good wish and a threat, respectively. Imagine, for instance, that S and A are involved in some kind of sadomasochistic game, and that S utters (4). To be sure, the imperative in (4) then receives the directive illocutionary force; but in such a context, the example cannot be read as a threat anymore.
It thus turns out that, if the imperative mood is to be associated with the directive illocutionary force, (1)–(4) cannot mean what would be literally said. So we are led to the idea that, while (1)–(4) are directive speech acts when taken literally, they are not interpreted as such when S is speaking non-literally. In other words, non-directive imperatives should be treated in the same way as sarcastic or ironic declaratives. By uttering (7), S may mean that the party is boring; similarly, by uttering (1)–(4) non-literally, S will give advice, give permission, express a good wish or make a threat.
(7) This party is great.
The first problem here is that irony may be, and often is, missed. Hearing (7), you may fail to discern the sarcasm and come to believe that I really love the party. Yet, it seems totally implausible to suppose that, when hearing (1)–(4), we may miss the alleged non-literal meaning and interpret these utterances as plain directive speech acts.
Perhaps such an argument does not settle the issue, for it may be argued that the difficulty of accessing the (alleged) literal meaning of (1)–(4) – that is, to interpret them as directive speech acts – stems from the fact that the (alleged) non‑literal readings – viz. permission, advice, good wish and threat – are so conspicuous that one can hardly miss them. In order to bring home the point that in (1)–(4) the imperative mood is used literally, albeit non-directively, let us consider in more detail the idea that the threat in (4) means the opposite of what is literally said (cf. Dominicy and Franken Reference Dominicy, Franken, Vanderveken and Kubo2002). A first possibility would consist in assuming that only the imperative conjunct is non-literal; (4) would then amount to something like (8).
(8) Don't hit me, and I'll hit you back.
This is clearly not what S means by (4). One can envisage, as a second possibility, that both conjuncts are used non-literally such that (4) amounts to something like (9).
(9) Don't hit me and I won't hit you back.
Conjunctions of the form !p and q (where ! stands for the imperative mood) entail the corresponding conditionals if p, q; thus (4) entails (10).
(10) If you hit me, I will hit you back.
However, what (9) entails is (11), not (10):
(11) If you don't hit me, I won't hit you back.
To be sure, (11) can be ‘perfected’, that is pragmatically enriched, into a biconditional (e.g. Geis and Zwicky Reference Geis and Zwicky1971; Horn Reference Horn2000) that would ground the entailment relation between (9) and (10). However, ‘conditional perfection’ is a pragmatic, hence defeasible interpretive process, so that (9) – which is supposed to be what is meant by (4) – is compatible with the falsity of (10). The examples in (12) are typical instances of cancelled pragmatic enrichment.
(12)
a. Don't hit me and I won't hit you back. But/actually, even if you hit me, I won't hit you back.
b. If you don't hit me, I won't hit you back. But/actually, even if you hit me, I won't hit you back.
By contrast, the threat in (4) proves infelicitous if the conditional in (10) turns out to be false, and the examples in (13) are sheer contradictions.
(13)
a. Hit me, and I'll hit you back. ?But/ ?actually, even if you hit me, I won't hit you back.
b. If you hit me, I'll hit you back. ?But/ ?actually, even if you hit me, I won't hit you back.
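The propositional-logic point at stake here can be checked mechanically. The sketch below is my own illustration, not part of the chapter's argument: it encodes p as 'you hit me' and q as 'I hit you back', treats the natural-language conditionals in (10) and (11) as material implications (itself a simplification, of course), and verifies that (11) alone does not entail (10), whereas its 'perfected' biconditional reading does.

```python
from itertools import product

def implies(a: bool, b: bool) -> bool:
    """Material conditional: 'if a, then b'."""
    return (not a) or b

def entails(premise, conclusion) -> bool:
    """premise |= conclusion iff every valuation of (p, q) that makes
    the premise true also makes the conclusion true."""
    return all(
        implies(premise(p, q), conclusion(p, q))
        for p, q in product([True, False], repeat=2)
    )

# p = 'you hit me', q = 'I hit you back'
cond_10 = lambda p, q: implies(p, q)            # (10): if p, then q
cond_11 = lambda p, q: implies(not p, not q)    # (11): if not-p, then not-q
perfected_11 = lambda p, q: (not p) == (not q)  # (11) 'perfected' into a biconditional

print(entails(cond_11, cond_10))       # False: (11) alone does not entail (10)
print(entails(perfected_11, cond_10))  # True: the perfected reading does
```

The counterexample the checker finds is the valuation where you hit me (p true) and I do not hit you back (q false): (11) is true there but (10) is false, which is exactly why (9), read without pragmatic perfection, is compatible with the falsity of (10).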
It can be concluded from the foregoing that non-directive uses of the imperative mood are as literal and direct as the directive ones. This I take to be a clear case against building illocutionary force into sentence meaning (for the same point, see Wilson and Sperber Reference Wilson, Sperber, Dancy, Moravcsik and Taylor1988).
Scholars eager to pair sentence meanings with illocutionary forces sometimes invoke typological data (e.g. Han Reference Han2000: 164). The rationale is along the following lines: ‘If natural languages bother to devote a specific form to directive speech acts, then the directive illocutionary force is not a matter of pragmatic processing, but part and parcel of sentence meanings.’ Sadock and Zwicky (Reference Sadock, Zwicky and Shopen1985) claim that every language has a specific imperative sentence-type associated with directive speech acts. Yet, the morphosyntactic system of many languages lacks – totally or partially – distinctively imperative linguistic forms (this is the case for 122 languages out of the 552 analysed in van der Auwera and Lejeune Reference Auwera, Lejeune and Haspelmath2005). In languages with defective or empty imperative paradigms, various compensatory strategies can be found: aorist (e.g. Georgian), subjunctive mood (e.g. French or Armenian), optative mood (e.g. Eskimo), irrealis mood (e.g. Javanese), indicative mood (e.g. Hebrew) and, perhaps more surprisingly, passive forms (Maori) (see, for instance, Xrakovski Reference Xrakovski2001; Allan Reference Allan2006a; König and Siemund Reference König, Siemund and Shopen2007). If the directive illocutionary force were encoded within the linguistic meaning, how are we to explain these typological facts? Should we accept that, in languages that lack genuine imperative mood, some sentence-types are linguistically ambiguous?
First, such an ‘ambiguity’ thesis violates Grice's Modified Occam's Razor (which recommends avoiding the multiplication of linguistic meanings beyond necessity). Second, the ambiguity thesis proves hard to maintain across languages. To see this clearly, contrast the use of the future in Nunggubuyu and in French. Nunggubuyu lacks any morphosyntactic imperative; the same construction is used to express future time reference and to perform directive speech acts (Heath Reference Heath1984, Reference Heath1986). Out of context, it is impossible to decide whether (14a) should be translated as (14b) or (14c).
(14)
a. Ba‐buraː‐
2sg(class‐A)3‐sit-nonpast
(Verstraete Reference Verstraete2005; after Heath Reference Heath1984: 343)
b. ‘You will sit down.’
c. ‘Sit down.’
In French, an authoritative way to issue orders is to use future constructions. However, French also has a morphological second person imperative; accordingly, a literal translation of (15a) can only be (15b) – not (15c).
(15)
a. Tu partiras demain.
You.sg leave‐ind.fut.simple.2sg tomorrow
b. ‘You will leave tomorrow.’
c. ‘Leave tomorrow.’
As far as I can see, there is no principled ground for the following joint claims: (a) that the Nunggubuyu construction ‘Class-A prefix + non-past’ is ambiguous between two illocutionary forces, each such sentence being either a direct and literal assertion about the future or a direct and literal directive speech act; and (b) that in French requests performed with the future indicative are indirect.
But if future tense constructions are said to be ambiguous between the assertive and the directive forces across languages – thus in French too – why should we refrain from extending this rationale to other morphosyntactic forms? As I have just mentioned, there exists a wide range of linguistic strategies for compensating for defective imperative paradigms. Following this thread, one would have to assume linguistic ambiguity for any such form that happens to be prototypically used to perform directive speech acts in some language. Take Lingala, which has an imperative form for the singular second person only; in directive speech acts with second person plural, subjunctive mood is used instead (van der Auwera and Lejeune Reference Auwera, Lejeune and Haspelmath2005). On the one hand, no sensible semantic theory of mood would consider it plausible that, in Lingala, the ambiguity between a directive and some other illocutionary force might characterise the plural second person subjunctive form only, and not extend to the singular second person subjunctive form. On the other hand, data on the distribution of moods militate against the ad hoc hypothesis of a cross-linguistic ambiguity of the subjunctive mood. For instance, French has both second person imperative and subjunctive forms; but the subjunctive proves unacceptable whenever imperative forms are available. To borrow an example from Schlenker (2005), one can advise the Queen to be prudent using either a third person subjunctive (16) or a second person imperative (17); by contrast, the second person subjunctive in (18) is deviant.

In (16), the Queen is addressed in the third person; directive speech acts addressed in the second person are unacceptable with the subjunctive.

Moreover, the acceptability of the subjunctive in French is not linked to the presence or the absence of the directive force. For instance, in French equivalents of (4), the surface form of the first verb is unambiguously imperative (20a) (cf. note 2), and the first clause clearly has no directive force. In such environments, second person subjunctives are unacceptable (20b); however, in the third person – for which there is no morphological imperative form – the subjunctive is fine (20c).

At this point, I see no justification for arguing that a certain sentence-type – e.g. future in Nunggubuyu or subjunctive in Lingala – is ambiguous between several illocutionary forces, rather than admitting that illocutionary forces belong to the level of utterances rather than to that of sentences.5
9.3 Grice's heritage: illocutionary forces and utterances
In the previous section, we saw that illocutionary forces do not belong to the level of sentence meaning. From this, however, one should not conclude that whenever a sentence is uttered and acquires propositional meaning, a speech act has been performed eo ipso. Yet, a customary – but misguided – shortcut is precisely to recast the opposition between sentences and utterances in terms of the contrast between sentence meanings and speech acts. This opposition plays a major role in a much-discussed issue of contemporary philosophy of language: that of contextual contributions to the propositional content. To take a well-worn example, (21) will mean different things in different contexts – depending on what John is ready for.
(21) John is ready.
Phenomena of this kind have been taken to show that literal content cannot be determined by linguistic meaning alone. Yet, opposing this interpretation, so-called ‘semantic minimalists’ argue (a) that the contents expressed by (21) on different occasions of utterance are speech act contents; (b) that these speech act contents do not correspond to the literal, semantic content of the sentence uttered. In other words, while different utterances of the same sentence correspond to the performance of different speech acts, with possibly different contents, speech act contents are to be distinguished from the semantic content proper, which – except for a restricted set of indexicals – is entirely determined by sentence structure and remains constant across contexts of use (Cappelen and Lepore Reference Cappelen and Lepore2005; Soames Reference Soames2002). Implicit in this view is the assumption that any utterance constitutes a speech act, or, in other words, that for a sentence to be uttered amounts to its acquiring an illocutionary force.
Such an assumption can be traced back to Grice's notion of meaning-nn (‘non-natural’ meaning). Saying, for Grice, is intending to provoke some cognitive response, such that the reason for the cognitive response is the recognition of this very intention (or, at least, cases of saying can always be reconstructed in this way). The nature of the response S intends to provoke by her utterance in the mind of A determines the ‘central’ speech act the instance of saying corresponds to: if it is the belief that p, S's saying will be an assertive speech act; if it is the intention to bring about the truth of p, S's saying will be a directive speech act (cf. Grice Reference Grice1968, Reference Grice2001: 50–55).
Now take a sarcastic utterance like (7) above: S says that the party is great, but, clearly, she doesn't mean it. To the best of my knowledge, no theory of irony would claim that in such a case S asserts that the party is great. Furthermore, a sarcastic utterance that p is clearly not accompanied by an overt intention (to communicate that p) of the kind that, according to Grice, characterises saying. This entails, in Grice's view, that the sarcastic S does not say anything, but just acts as if she was saying something – a very counter-intuitive consequence (see Neale Reference Neale1992; Carston Reference Carston2002: 114–16; Kissine Reference Kissine2009).
The existence of cases where S says that p without asserting that p constitutes a strong argument against equating utterances and illocutionary acts. Austin (Reference Austin1975) explicitly distinguished between sentence meaning (phatic act), contextual meaning (locutionary act) and illocutionary act (for a detailed discussion, see Kissine Reference Kissine2008b, Reference Kissine2009). In other words, even though sentence meaning and utterance meaning are not to be confused, illocutionary force attribution constitutes yet another level of interpretation.
In a series of papers, Kent Bach (e.g. 1994a, 2005) argues that what is said by an utterance does not coincide with the content of the illocutionary act performed by this utterance; what is said corresponds to the locutionary act performed. However, unlike, for instance, Recanati (Reference Recanati2004a and this volume) or Carston (Reference Carston2002 and this volume), Bach defines what is said as the output of the semantic interpretation of the syntactic structure; pragmatic contributions to what is said are limited to determining the reference of a restricted set of indexicals. Moreover, in Bach's view, what is said does not always correspond to a full-blown proposition. Take (21), for instance. As pointed out above, it is impossible to decide what John is ready for in the absence of any contextual information. But such extra-linguistic information does not contribute, according to Bach, to what is said. Illocutionary content is the place where such pragmatic influences come into play. What is said by (21), for Bach, is the sub-propositional radical [John is ready___], whose empty slot has to be completed at the illocutionary level.
One of the reasons Bach – rightly – invokes for distinguishing between locutionary and illocutionary acts is that sometimes we do not mean what we say; irony is a case in point. Accordingly, in such cases, the utterance reduces, at the literal level, to a locutionary act that corresponds, in Bach's view, to the semantic interpretation of the syntactic structure and to the indexical resolution, both of which remain blind to the pragmatic, wide context.
Bach's conception does not conform to Austin's original definition of the term locutionary act (cf. Kissine Reference Kissine2008b, Reference Kissine2009). More importantly, while the existence of cases where we do not mean what we say justifies the rejection of Grice's equation of saying with performing a speech act, it also shows that ‘forceless’ saying – the locutionary act – is already endowed with propositional, pragmatically determined meaning. Imagine that John is notoriously bad at preparing his talks on time. Imagine that, a few hours before John's talk, S sarcastically says:
(22) Of course, John is ready.
Whatever theory of irony one favours, it is clear that in saying (22) S does not mean what she says. But in order to determine what she does mean – something like ‘Of course, John is not ready for his talk’ – one has to know what S does not mean. And at this stage one cannot avoid determining what John is ready for in (22). In other words, S's locutionary act – what she says in (22) – is not, pace Bach, a sub-propositional radical: it is a full-fledged, contextually determined proposition.
There is another argument to support this claim. The verb say, as used in indirect reports of the ‘S said that p’ kind, is ambiguous, according to Bach (Reference Bach and Szabó2005), between locutionary and illocutionary meanings. Therefore, Bach must accept that in reporting the ironic utterance in (22) by (23) one transmits S's locutionary act (since her illocutionary act has a different content):
(23) S said that John was ready.
Some may feel inclined to say that the report in (23) is false in a context where the audience does not have access to the non-literality of S's original utterance. However, no sense could be made of (24) if (23) were not true:
(24) S said that John was ready, but she didn't mean that/it.
Even though the truth of (23) does not depend on the possibility for the audience of recovering the non-literality of the original utterance, it remains impossible to maintain that the embedded clause in (24) is sub-propositional. The main reason is, of course, that the reference of the demonstrative that or of it in (24) must be a full-fledged proposition.
9.4 Locutionary acts
In section 9.2, we saw that illocutionary forces do not belong to sentence meaning. In the previous section, we saw that some utterances do not have any (direct and literal) illocutionary force, even though they are endowed with propositional and context-dependent meaning. However, until now the discussion was limited to grammatically declarative sentences. A chief reason for introducing the notion of a locutionary level – intermediate between sentence meaning and illocutionary level – is, as we have seen, the existence of cases where what is said differs from what is meant. Parallel cases of non-literality can be found with imperative sentences. In the following example, S does not literally request, order, allow or wish A to ruin her carpet.
(25) [A spills his glass of wine over the carpet, and clumsily attempts to wipe it off. S says:] Go on! Ruin my carpet!
Exactly as S does not literally assert anything when she sarcastically says that the party is great, the speaker of (25) does not perform any literal directive speech act. But if so, what is the literal content of (25)?
In section 9.2, I have argued that the meaning of the imperative mood should not be analysed in illocutionary terms. Without getting into details, imperative clauses may just be, at the literal level, expressions of propositional contents under a certain attitude or with a certain mode of presentation. Let me just mention two possible accounts, among many. According to the first, the imperative mood presents the utterance content as desirable and potential (Wilson and Sperber Reference Wilson, Sperber, Dancy, Moravcsik and Taylor1988; Clark B. Reference Clark1993; for critical discussions, see Dominicy and Franken Reference Dominicy, Franken, Vanderveken and Kubo2002; Schwager Reference Schwager2006). According to the second, the imperative mood functions like a necessity operator; roughly, the ‘attitude’ bearing on the propositional content would be derived from the context-dependent base (i.e. from the domain of quantification) of the modal (Schwager Reference Schwager2006). Independently of the account favoured, locutionary acts must be thought of not only as having propositional content, but also as endowed with a certain mode of presentation of this content (Kissine Reference Kissine2009).
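The logical shape of the second, modal-operator account can be sketched schematically. The rendering below is my own simplified notation, not Schwager's actual formulation; the symbols f, g and Best are assumptions borrowed from standard possible-worlds treatments of modality (f a contextually supplied modal base, g an ordering source).

```latex
% Schematic modal analysis of the imperative (simplified sketch, after
% Schwager 2006; notation mine). f = modal base, g = ordering source,
% Best selects the g-optimal worlds among those compatible with f.
\[
  !p \ \text{is accepted at}\ \langle c, w \rangle
  \quad\text{iff}\quad
  \forall w' \in \mathrm{Best}_{g(w)}\!\Bigl(\bigcap f(w)\Bigr):\ p(w') = 1
\]
```

On such a view, a directive reading arises when context resolves f and g to, say, the speaker's commands or preferences, while permission, advice or good-wish readings fall out of different contextual resolutions; no illocutionary force needs to be encoded in the mood itself.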
Assessing the semantic accounts of sentence types – e.g. imperative, indicative, subjunctive, interrogative – falls far beyond the scope of this chapter. The important point is that the semantics of sentence types predicts which locutionary-act types the utterances of the sentence can constitute; the nature of the locutionary act performed constrains, in turn, the range of those direct illocutionary acts the locutionary act can constitute.
9.5 Forceless meaning and indirect speech acts
Conceiving of illocutionary forces as optional properties of utterances allows a fresh perspective on indirect speech acts. Classically, a speech act is said to be indirect whenever its uptake (i.e. A's understanding the utterance as being this speech act) is tied to the uptake of another speech act (Searle Reference Searle, Cole and Morgan1975b; Bach and Harnish Reference Bach and Harnish1979: 70; also Recanati Reference Recanati1987). For instance, if S utters (26) as an answer to A's offer to go to the movies, A can infer, by using various conversational, cooperation-based principles, that in addition to stating that she is tired, by (26) S rejects A's offer.
(26) I'm very tired.
Some indirect speech acts are highly conventionalised. For instance, although (27) has an interrogative syntactic structure, it constitutes, in English, an extremely conventionalised means to request things.
(27) Can you pass me the salt?
After Morgan (Reference Morgan and Cole1978), it is customary to think of such cases as ‘short-circuited’ implicatures. In a nutshell, Morgan's idea is that the link between (27) taken as a literal question about A's ability to pass the salt and (27) taken as a request to pass the salt can be reconstructed in Gricean terms; however, such an inference generally does not take place. This is so because the link between the Can you___? construction and the directive interpretation is highly conventionalised and largely automatic. (Of course, conventionalisation is not arbitrary. In particular, the link between the literal meaning and the indirect force must be easy to grasp. Not every question of the form Can you___? will easily receive a directive interpretation. For instance, that A is able to pass the salt is a preparatory condition which must be fulfilled in order for a request to pass the salt to be successful. According to Searle (Reference Searle, Cole and Morgan1975b), this is why a conversationally irrelevant question about a preparatory condition will be readily interpreted as the performance of the corresponding request.)
A view rival to the ‘short-circuited implicature’ account is that of Sadock (Reference Sadock1974), according to whom each sentence-type is associated with an illocutionary force at the syntactic level. Under such an analysis, (27) is linguistically ambiguous between the interrogative and imperative meanings. As we have seen in section 9.2, the doctrine of a linguistic coupling between sentence types and illocutionary forces is problematic for independent reasons. There is one point of Sadock's analysis that is worth considering here, though. According to him (1974: 97–109), certain grammatical properties distinguish between questions used indirectly as requests and conventionalised forms whose meaning is (allegedly) ambiguous between a question and a directive speech act. Take (28) as an instance of the former case, (29) of the latter.
(28) When will you close the door?
(29) Will you close the door?
Sadock assumes that only grammatically imperative sentences can be followed by please or by an indefinite vocative.
(30) Close the door, please.
(31) Close the door, someone.
The unacceptability of (32)–(33) shows, according to Sadock, that (28) is an indirect request performed by means of uttering an unambiguously interrogative sentence.
(32) ?When will you please close the door?
(33) ?When will you close the door, someone?
By contrast, the acceptability of (34)–(35) (allegedly) reveals that (29) is linguistically ambiguous between being a grammatically imperative sentence – hence a direct request to close the door – and being a grammatically interrogative sentence – hence a direct question about whether A will close the door.
(34) Will you please close the door?
(35) Will you close the door, someone?
In reaction to Sadock's argument, Bach and Harnish (Reference Bach and Harnish1979: 200–202) point out that please is also acceptable in (36).
(36) Can you reach the salt, please?
If (27) is, grammatically, a request to bring about the truth of the propositional content, one should expect (36) to be a request to reach the salt. This is counter-intuitive: (36) is a request to pass the salt, not to reach the salt. Likewise, as pointed out by Bach and Harnish, the following example is clearly not ambiguous between an indicative and an imperative underlying structure, despite the acceptability of please.
(37) I'd like some salt, please.
Following Morgan's lead, Bach and Harnish claim that certain questions of the form Can you___? or Will you___? are standardised means to perform requests. In order to explain away Sadock's grammatical constraints, they argue that (34)–(35) are ungrammatical, although pragmatically acceptable.
Both Sadock's and Bach and Harnish's accounts presuppose that the grammatical acceptability of please is linked to the imperative mood. However, the acceptability of please or of the indefinite vocative does not depend on the utterance's mood, but on whether or not the utterance's (primary) illocutionary force is directive. The following examples are acceptable only if S believes (or pretends to believe) A to have control over his having a nice journey back or getting well soon, that is only if the utterances at hand constitute (pretend) directive speech acts.
(38) ?Have a nice journey back, please.6
(39) ?Get well soon, please.
Even under such a reading, indefinite vocatives prove unacceptable. And when imperative clauses cannot receive a directive illocutionary force, as in (4) and (42), please and the indefinite vocative are clearly ruled out.
(4) Hit me and I'll hit you back.
[repeated]
(42) Be tall and people will be respectful.
(43) ?Hit me, please, and I'll hit you back.
(44) ?Be tall, please, and people will be respectful.
(45) ?Hit me, someone, and I'll hit you back.
(46) ?Be tall, someone, and people will be respectful.
It thus seems that the acceptability of please does not depend on grammatical factors – on the sentence mood – but on pragmatic ones – the utterance's illocutionary force. What do we make of the fact that the adjunction of please and of someone is constrained by the presence of the directive illocutionary force and not of the imperative mood? According to Bach and Harnish's standardisation thesis, when an indirect speech act is conventionalised, hearers automatically derive the secondary indirect force without going through the derivation of the primary, literal speech act. That is, (29) is directly understood as a request to close the door, whereas (28) is interpreted as a question, and it takes supplementary pragmatic reasoning for A to understand it as a request to close the door.
So why not say that (29) has the force of a request only, whereas (28) has the primary force of a question, and, indirectly, constitutes a request? We have seen above that some utterances constitute a locutionary act with a certain content p but do not correspond to any illocutionary act with the content p. The same applies here. In (29), the content of the directive illocutionary act – of the request to close the door – differs from that of the locutionary act which the utterance of the interrogative sentence corresponds to.7 Since the only illocutionary force of (29) is a directive one, it is also, trivially, the primary force. As expected, constructions whose acceptability depends on the primary force being directive are allowed to take please. By contrast, the literal force of (28) is that of a question; since the directive force is not primary in this case, please is not allowed.
To repeat, (29) is syntactically and semantically an interrogative; however, it is not interpreted – nor intended by S to be interpreted – as a question. Such a rationale presupposes that literal and serious utterances of interrogative sentences are not necessarily associated with the act of requesting information, exactly in the same way as not all utterances of imperative sentences constitute directive speech acts. Interrogatives that are neither requests for information nor expressions of ignorance include the following: rhetorical questions (47), exam questions (48), guess questions (49), and surprise questions (50).
(47) [Peter, who had made a New Year resolution to give up smoking, lights up. Mary says:] What was your New Year's resolution?
(from Wilson and Sperber 1988)
(48) Where did Napoleon die?
(49) Which hand is the marble in?
(50) A: The President has resigned.
B: Good heavens. Has he?
(from Wilson and Sperber 1988)
For a careful and illuminating account of the relationship between the interrogative mood and the speech act of questioning, the reader is referred to Fiengo (2007), who analyses interrogative sentences as expressing incomplete or truth-valueless propositions.
Further arguments in support of the view that conventionalised indirect speech acts like (27) or (29) are, at the literal and direct level, cases of saying without performing an illocutionary act can be found in Terkourafi (2009c), who provides an interesting discussion of the relevant experimental data.8 On the one hand, experiments reveal that the putative direct force of a Could you___ request like (27), namely the illocutionary force of questioning, seems to be ignored in favour of the intended speech act of request. On the other hand, the interrogative form of the sentence is processed, as shown by answers such as ‘Yes, I can’ or by the fact that the interrogative form is recalled (for an extensive discussion and references, see Terkourafi 2009c).
We thus arrive at the following picture of the possible relationships between locutionary and illocutionary acts:
a) The locutionary content corresponds to the content of the primary, direct speech act. This is the ordinary case.
b) The locutionary act does not constitute any direct speech act; the only speech act performed by the utterance is indirect. Here, two further sub-categories must be distinguished.
i) The utterance is non-literal; S does not endorse the locutionary content. For instance, this content ‘echoes’ an utterance or a thought of another (possibly virtual) person (e.g. Sperber and Wilson 1981; Wilson 2006). The important point is that, in such cases, S's performance of the indirect speech act cannot be reconstructed as an inference taking as one of its premises the performance of a primary speech act that shares its content with the locutionary act.
ii) The utterance is literal, but the content of the only illocutionary act the utterance constitutes is distinct from that of the corresponding locutionary act. However, it is possible to reconstruct the interpretation process as starting from the performance of an illocutionary act whose content is identical with that of the locutionary act.
The contrast between ironic utterances – point b(i) – and conventionalised indirect speech acts – point b(ii) – deserves a little more discussion. Morgan's (1978) idea in treating conventional indirect speech acts as ‘short-circuited’ implicatures is precisely that even though the indirect speech act can be derived from the putative performance of a direct speech act, such an inference does not actually take place: instead, A jumps directly to the indirect speech act. Take (27).
(27) Could you pass the salt? [repeated]
One possible Gricean reconstruction of A's interpretation of (27) runs as follows (cf. Searle 1975b):
(51)
- Step 1:
S is asking me whether I have the capacity to pass the salt;
- Step 2:
S probably knows that I have this capacity;
- Step 3:
S knows that I know that she knows that;
- Step 4:
So, S believes that I understand that she does not want to be informed as to my capacity to pass the salt;
- Step 5:
We are at a dinner, and it is possible that S needs salt;
- Step 6:
Being able to pass the salt is necessary in order to pass the salt;
- Step 7:
So, by asking me whether I am able to pass the salt S requests me to pass her the salt.
However, A's exposure to conventions of language use allows him to jump directly from his recognition of S's utterance of an interrogative sentence of a certain form – which does not amount to a question – to interpreting S's utterance as a request to pass the salt. The important point about the rational reconstruction of the interpretation of an indirect speech act like (27) is that the first step – S's performance of the primary speech act of questioning – remains compatible with the last step – S's performance of a request. This reconstruction parallels that of genuinely indirect speech acts like (28).
(28) When will you close the door? [repeated]
The fact that, by uttering (28), S asks A when he will close the door is compatible with the fact that S, by means of this same utterance, requests A to close the door.
Now, contrast this with the ironic utterance of (7):
(7) This party is great. [repeated]
In order to understand what S really means by (7), A has to understand that S does not assert what she is saying. As in (51), let us try a rational reconstruction that would start with the premise that S performed a direct and literal speech act.
(52)
- Step 1:
S is asserting that the party is great;
- Step 2:
This party is all that S hates;
- Step 3:
So, most probably, S does not believe that the party is great;
- Step 4:
S is cooperative and would not violate conversational Maxims gratuitously;
- Step 5:
S believes that I believe that S does not believe that the party is great;
- Step 6:
S does not assert that the party is great;
- Step 7:
S means that the party is awful.
Whatever the details, the important point is that, this time, and by contrast with (51), taking the last two steps requires falsifying the first one.
Note also that both rational reconstructions (51) and (52) presuppose that A is capable of making hypotheses about S's beliefs about A's mental states – i.e. that A has the capacity to attribute second-order or third-order mental states. Take first the reconstruction in (51). In order to get to the conclusion that S does not merely want him to say whether he can pass the salt, A must assume that S believes that A knows that S is not interested in (merely) knowing the answer. It would be irrational for S to use a question in order to request something, if she were fairly certain that A would not understand this. That is, A must attribute to S beliefs about A's beliefs about S's beliefs. Things are similar for (52): if A does not understand that S believes that A believes that S does not like the party, A cannot tell the difference between S's telling a lie and S's being sarcastic (for a more detailed discussion see Kissine Reference Kissine2008a).
From the foregoing, we can draw an important empirical reason for not taking rational reconstructions of indirect speech act interpretation as reflecting actual interpretive processes. Children do not master second-order mental state attribution before the age of 7 (Perner and Winner 1985). Importantly, this cognitive ability seems to be required for understanding irony and for lying in an efficient way (Winner and Leekam 1991; Talwar and Gordon 2007). By contrast, it has been repeatedly shown that well before 7, children respond adequately to and produce (conventionalised) indirect requests (e.g. Bates 1976: 275–82; Shatz 1978; Reeder 1978; Carrell 1981; O'Neill 1996), which reveals that this pragmatic ability does not require complex mind-reading skills, contrary to what is implied by rational reconstructions of the kind of (51).
9.6 Indirect speech acts and explicit performatives
Explicit performatives constitute one of the oldest and most vexing topics in the history of theorising about speech acts (for recent surveys, see Harnish 2002, 2004). While, again, I will not attempt an exhaustive review here, the fact, discussed in the previous section, that conventionalised indirect speech acts do not have more than one illocutionary force has an implication for the analysis of performatives that is worth considering.
Prototypical performative sentences have the form ‘I VP ___’, where the VP stands for the illocutionary act the utterances of these sentences constitute under normal conditions. Here are some examples:
(53) I order you to leave this room.
(54) I promise that I'll come to your party.
(55) I name this ship Queen Elizabeth.
Utterances of (53)–(55) are generally self-verifying: they constitute the act named by the matrix verb. By uttering (53), S makes it the case that she has ordered A to leave; by uttering (54), S makes it the case that she has promised to come to the party; by uttering (55), S makes it the case that the ship is named Queen Elizabeth.
The ‘self-verifying’ character of performatives like (55) can be explained by extra-linguistic, culture-specific institutions (for a more extensive discussion, see Kissine forthcoming). However, it is highly doubtful that the same treatment can be applied to (53)–(54) (for an attempt, see Searle 1992a; for a cogent criticism, see Bach and Harnish 1992). The challenge consists in explaining how (53) is interpreted as an order and (54) as a promise without, at the same time, giving up commonplace semantics – i.e. by deriving the truth-conditional content of (53)–(54) through the same compositional semantic principles as for other structures of the form ‘I VP that___’, as in (56), or other present simple forms of order, as in (57).
(56) I hope that you are well.
(57) He orders you to leave.
That explicit performatives do not involve any extraordinary semantic features is further suggested by the observation that in some circumstances a sentence like (53) does not constitute a directive speech act at all, and has the content predicted by standard truth-conditional interpretation.
(58) A: Imagine that I light up a cigarette. What is your reaction?
B: I order you to leave this room.
Another interesting fact is that the following examples (59)–(62) are as well suited as (53)–(54) to performing orders and promises.
(59) Leave the room, and that's an order.
(60) I'll come to your party, and that's a promise.
(61) Leave the room, I order you.
(62) I'll come to your party, I promise.
The analysis of the examples in (59)–(60) is pretty straightforward. The speaker first utters a sentence, and then, with the second conjunct, indicates its illocutionary force, the demonstrative that picking up the first conjunct. As for (61)–(62), according to the account developed by Potts (2005; also Bach 1999a), utterances with parentheticals express two propositions. The first is the main one – ‘at-issue’ – while the second is secondary. This second proposition may have a ‘procedural’ role (cf. Wilson and Sperber 1993a), in that it facilitates the processing of the at-issue information. Following this widely accepted line of thought, in (61)–(62) the secondary propositions – the parenthetical ones – help attribute the illocutionary force to the main content.
The spirit of Davidson's (1979) ‘paratactic’ account of explicit performatives is quite similar. In his view, the surface form of, for instance, (53) hides the following two utterances, where the first demonstrates the second one: I order you that. You leave the room. However, it is pretty clear that the that in (53) is not a demonstrative. Compare with French, where there is no homophony between demonstratives and complementisers or relative pronouns.

Although it is not viable, Davidson's account is attractive in that it puts the explicit performatives in (53)–(54) and (59)–(62) on the same footing; the utterance's main informational content is assigned an illocutionary force through a comment on this content.
As for Bach and Harnish (1979: 203–33; 1992), they claim that explicit performatives are standardised (conventionalised) indirect speech acts, exactly in the same way as indirect requests of the kind of (27).
(27) Could you pass the salt? [repeated]
Recall from the previous section that, on their view, although at the direct level (27) is a question, standardisation allows one to interpret it as a request without going through the inference from the direct to the indirect force. Likewise, they argue, explicit performatives like (53)–(54) are, at the direct level, ‘constative’ speech acts. The constative class includes those speech acts that aim at transferring information, assertion being a paradigmatic case. Bach and Harnish thus claim that by uttering (53) S asserts that she orders A to leave the room; this assertion is interpreted as an indirect order. However, because the inference from the direct to the indirect force is ‘compressed by the precedent’, (53) is a standardised indirect order exactly in the same way as (27). Bach and Harnish thus predict both that the content of (53) is the proposition [S orders that A leaves] and that (53) is an order.
Reimer (1995) objects to Bach and Harnish's account that it does not seem intuitively plausible that S asserts anything by means of (53); the constative part of performatives is introspectively inaccessible. However, it is not obvious that the (alleged) question part of indirect requests like (27) is always accessible either. If, when solving a problem, I ask you ‘Can you tell me the square root of 16?’, it would be highly counter-intuitive to interpret my utterance as a question (e.g. H. Clark 1979). In any event, nothing implies that conventionalised indirect speech acts have two illocutionary forces. We have seen above that indirect speech acts do not necessarily have a direct illocutionary force; in conventionalised indirect speech acts the content of the locutionary act is different from that of the illocutionary act, but only one illocutionary act is performed. Things are not different with explicit performatives. Recall that locutionary acts are endowed with a full-blown propositional content. Therefore, at the locutionary level explicit performatives have exactly those truth conditions that would be predicted by regular compositional semantics. The propositional content of the locutionary act performed in (53) is [S orders A to leave]. Yet, this does not imply that S asserts this content. The only illocutionary act performed in (53) is the order that A (should) leave. We can reconcile two seemingly conflicting intuitions: on the one hand, an explicit performative has the same propositional content as would an assertion constituted by an utterance of the same sentence; on the other hand, it does not constitute such an assertion.9
An interesting example in favour of the account of explicit performatives just sketched can be drawn from Allan (2006a). He points out that (65) is adequately paraphrased by (66):
(65) In the first place I admit to being wrong; and secondly I promise it will never happen again.
(66) The first thing I have to say is that I admit to being wrong; and the second thing I have to say is that I promise it will never happen again.
In (66) the second thing does not refer to a second act of promising, but to the second act of saying, that is to the explicit performative I promise it will never happen again; ditto for secondly in (65). That explicit performatives are acts of saying is consonant with the analysis of saying as the performance of a locutionary act. This is not to say that explicit performatives cannot be subject to a rational reconstruction, starting from the premise that S asserted that she, say, ordered A to leave the room. However, to repeat, rational reconstructions do not aim at modelling actual interpretive processes. Note that the fact that the interpretation of explicit performatives can be reconstructed on the basis of primary assertions does tell us something: that S is ready to endorse the content of the locutionary act – that she's not ironic. That is, it tells us that explicit performatives can be interpreted as assertions; but it remains that, in most cases, explicit performatives are not interpreted as assertions.
9.7 By way of conclusion: illocutionary force attribution
A natural question to ask, at this point, is how illocutionary forces are attributed. We have seen that developmental evidence makes it unlikely that illocutionary force attribution is underpinned by Gricean inferences about multi-layered communicative intentions (for a classical version of such an account, see Bach and Harnish 1979). Very young children are good at attributing illocutionary forces to utterances well before being able to attribute the complex mental states required by Gricean reconstructions. As for contemporary pragmatic theories, as I said in section 9.1, they are fairly elusive about the psychological mechanisms underlying illocutionary force attribution.
In Kissine (2009) I argued that speech acts should be thought of as (not necessarily effective) reasons for S to believe that the propositional content is true (assertives), for A to bring about the truth of the propositional content (directives) or for S to bring about the truth of the propositional content (commissives).10 An important feature of this analysis is that it does not require advanced mind-reading skills for attributing illocutionary forces, except for complex communicative moves like irony (cf. Kissine 2008a).
Whenever the content of a speech act corresponds to the content of the constitutive locutionary act, this speech act is literal and direct. We have seen that utterances often constitute locutionary acts, that is, they express propositional contents under a certain mode of presentation. The type of the mode of presentation constrains the range of the possible direct speech acts the locutionary act may constitute. For instance, if the imperative mood expresses an attitude characteristic of desiderative mental states, the potential direct illocutionary force will be a directive one.
However, there does not necessarily exist a one-to-one correspondence between sentence types and modes of presentation of the propositional content. In particular, it is quite possible that the indicative mood is neutral in this respect (e.g. Recanati 1987; Allan 2006a). An indicative sentence like (67) can be uttered either as an assertion that A leaves the city on Monday or as an order that A leave the city on Monday.
(67) You leave the city Monday.
Let us assume that the indicative mood does not constrain the locutionary-act type. Illocutionary force attribution and the interpretation of the utterance type as a locutionary act with a certain mode of presentation will thus mutually influence each other. Note also that, in both cases, the propositional content is the same; it thus follows that the order in (67) is as direct as the one in (68); cf. Recanati (1987) for the same conclusion.
(68) Leave the city tomorrow morning.
This is in tune with the widely acknowledged fact that most interpretive processes operate on-line. The precise ways sentence types, locutionary modes of presentation and illocutionary forces interact constitute a rich matter for future research.
I’m grateful to Keith Allan, Philippe De Brabanter, Marc Dominicy and Kasia Jaszczolt for helpful comments on previous versions of this paper. My research is supported by a post-doctoral research grant from the Fonds de la Recherche Scientifique de la Communauté Française de Belgique (F.R.S.-FNRS). The results presented here are also part of the research carried out within the scope of the ARC project 06/11–342 Culturally modified organisms: ‘What it means to be human in the age of culture’, funded by the Ministère de la Communauté française – Direction générale de l’Enseignement non obligatoire et de la Recherche scientifique.
10 Pragmatics in update semantics
10.1 Introduction
A person who listens to, understands and accepts the utterance of another acquires new information. There may be new beliefs about the main event described in the utterance, but crucially, understanding and accepting the utterance entails obtaining new information about the intentions of the speaker. And if the intention is to inform, it also entails obtaining new information about the world. As long as there is no intention to correct old information, the new information can be described as an update of the beliefs of the hearer (and as a ‘downdate’ followed by an update if the new information is a correction of existing information). The direct description of the utterance as an update of an existing information state, and the claim that this cannot be reduced to more fundamental operations and is the suitable framework for natural language semantics, are characteristic of update semantics.
Probably the most basic way to think of update semantics in relation to pragmatics is as a natural environment for formalising pragmatics. Nearly everything in pragmatics can be described as (partly or completely) determining a change to an information state: the information state of the hearer, of an idealised hearer, of the speaker, of the speaker's model of the hearer, of the common ground, or of the common ground according to one of the speakers. It seems fair to say that a pragmatic theory that cannot be described as characterising aspects of the up- or downdates performed by (real or imagined) hearers interpreting an utterance in context is hard to imagine. On the other hand, many pragmatic approaches are still not theories that make precise and testable predictions, and formalisation is very much needed to turn this area of study into a proper science. So update semantics and update-semantical approaches to pragmatics are an important field of study even if, for example, one rejects the ideology of update semantics or some of the ideas that have been formalised in it. It would still seem the natural format for formalising any pragmatic ideas, even if one would like to include new mechanisms or reinterpret some of the mechanisms found in the literature.
In this sense, up- and downdating information states is just a technical tool for being precise about what utterances do to the various information states involved in communication (the common ground between speaker and hearer, the hearer's and the speaker's knowledge and beliefs), and the natural successor of the commitment slates of Hamblin (1971) or the version of those that is employed by Gazdar (1979) or by Karttunen (1973) in the definition of filters. The tool has the advantage that whatever semantic assumptions one needs to make for one's pragmatics can be formulated directly in the same framework and in the same way.
What information states one wants to update depends on one's pragmatics. For semantic applications, it does not seem to matter. It suffices to assume somebody exposed to a series of trustworthy speech acts who is updating her own set of beliefs by accepting all these utterances and who happens to be an ideal hearer perfectly understanding the utterances she is exposed to. For pragmatic applications, the obvious candidate seems to be the common ground between speaker and hearer in a conversation. Both parties can update that, and it would be possible to follow the progress of the conversation by taking in the effect of both parties' moves. This would be third-party conversational analysis as proposed by Sacks et al. (1974), but it is well known that it does not quite work for real conversations, since misunderstandings and deceptions occur frequently. The best choice would be to have two separate versions of the common ground (or more if there are more speakers) and to try to track the evolution of all these common grounds. The common ground according to the speaker seems the most relevant information state from a linguistic point of view, since it can account for the speaker's choices in formulation and for the effect the speaker expects her utterances to have on the hearer. It also accounts for the hearer's choices in interpreting the utterances and for modelling the hearer in making choices in her formulation of those utterances, but in this second role it does not account for the variation in linguistic form studied in linguistics. But there is no extra problem involved in formalising the third-party perspective: one just takes the model of the common ground of both parties.
Any serious semantic and pragmatic theory makes empirical predictions. Update semantics would be the theory with the shortest distance between the theoretical construct and what can be tested. What people assume after an utterance which they did not assume before is what can be tested almost directly. Much more than a truth-conditional account, update semantics seems to be about the primary function of linguistic utterances: to transfer information from one person to another (this also covers speech acts other than assertion).
Information states can be and have been modelled in many different ways. A simple and effective way is to start from a logical formalism with a proper model theory. Information states are consistent theories (sets of formulas that can be satisfied on at least one model for the theory), and updating is just adding new formulas to the set. Most of the older approaches, like Hamblin's or Gazdar's, follow this pattern, and Discourse Representation Theory (Kamp and Reyle 1993) can be seen as an instance of this approach with a more sophisticated interface with syntax and maximal decomposition of formulas. It is also not difficult to make connections with tableau-based theorem-proving and to think of information states as sets of open branches with positive and negative literals (a formalism like that of Dekker (1992) comes close to this). Literals are here atomic formulas with constants, variables and Skolem constants as the set of allowed terms. This may well be the best road towards implementing update semantics: updating becomes the closure of those branches that become inconsistent under the new information and the addition of the necessary new literals to the branches that remain. Purely eliminative notions of update semantics are reached if the information state is a set of models, a set of possible worlds or a set of possibilities. Conceptual simplicity is typical of this last group of approaches; concreteness, and thereby simple search procedures, are typical of the first group, while mixed systems are best for implementational purposes. But it is not difficult to translate between the different formalisms. In this chapter, an eliminative approach will be introduced in preference to the others because of its inherent simplicity.
10.1.1 Basic update semantics
The following gives a basic development of update semantics for a propositional language L (the set of atomic propositions). The definitions follow the notation of Veltman (1996).
(1) Models M for L:
Any subset of L is a model M (the subset of true atoms).
Information states σ ∊ Σ:
Any subset σ of the set of models for L is an information state. This makes the set of all information states the powerset of the set of models.
Σ = pow(pow(L))
Atomic updates: (the update of an information state σ with an atomic proposition p)
σ[p] = {M ∊ σ: p ∊ M}
Full updates: (the update of an information state σ with a complex formula ϕ).
σ[ϕ∧ψ] = σ[ϕ][ψ]
σ[¬ϕ] = σ \ σ[ϕ]
σ[ϕ → ψ] = σ[¬(ϕ ∧ ¬ψ)]
σ[ϕ ∨ ψ] = σ[¬(¬ϕ ∧ ¬ψ)]
σ[ϕ ↔ ψ] = σ[(ϕ ∧ ψ) ∨ (¬ϕ ∧ ¬ψ)]
(2) ϕ is inconsistent with σ iff σ[ϕ] = ∅
(3) σ ⊨ ϕ iff σ[ϕ] = σ
The state of no information (1) is pow(L), the set of all models. The inconsistent information state (0) is ∅.
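The definitions in (1)–(3) are easy to make concrete. Below is a minimal Python sketch under my own encoding (the three-atom language, the tuple representation of formulas and all function names are illustrative, not Veltman's): a model is the frozenset of its true atoms, an information state is a frozenset of models, and updates eliminate models.

```python
from itertools import chain, combinations

# Illustrative three-atom language; all names here are mine, not from the text.
L = ("p", "q", "r")

def powerset(xs):
    """All subsets of xs, each as a frozenset."""
    xs = list(xs)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(xs, n) for n in range(len(xs) + 1))]

MODELS = powerset(L)            # a model = the set of its true atoms
ONE = frozenset(MODELS)         # the state of no information, pow(L)
ZERO = frozenset()              # the inconsistent state

def upd(sigma, phi):
    """sigma[phi]. Formulas are atoms (strings) or nested tuples:
    ('not', phi), ('and', phi, psi), ('or', phi, psi),
    ('imp', phi, psi), ('iff', phi, psi)."""
    if isinstance(phi, str):                    # atomic update
        return frozenset(M for M in sigma if phi in M)
    op = phi[0]
    if op == "and":
        return upd(upd(sigma, phi[1]), phi[2])  # sigma[phi][psi]
    if op == "not":
        return sigma - upd(sigma, phi[1])       # sigma \ sigma[phi]
    if op == "imp":
        return upd(sigma, ("not", ("and", phi[1], ("not", phi[2]))))
    if op == "or":
        return upd(sigma, ("not", ("and", ("not", phi[1]), ("not", phi[2]))))
    if op == "iff":
        return upd(sigma, ("or", ("and", phi[1], phi[2]),
                           ("and", ("not", phi[1]), ("not", phi[2]))))
    raise ValueError(op)

def inconsistent(sigma, phi):   # definition (2)
    return upd(sigma, phi) == ZERO

def supports(sigma, phi):       # definition (3): sigma |= phi
    return upd(sigma, phi) == sigma
```

For instance, updating ONE with p eliminates the p-less models; the resulting state supports p, and the further update with ¬p yields ZERO, so ¬p is inconsistent with it.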
This basic system can be extended to first-order logic and modal logics. For first-order logic one changes the set of models to first-order models for a language L. Variables are treated as extra constants for which the models are either defined or not.
A simple development, designed to capture discourse referents and DRT-style semantics, goes as follows. This requires a DRT-like language in which atoms are either variables or atoms as in first-order logic.
If ϕ1 and … and ϕn are formulas so is ϕ1∧…∧ϕn
If ϕ is a formula so is ¬ϕ
If ϕ and ψ are formulas so are ϕ → ψ, ϕ ∨ ψ and ϕ ↔ ψ
The update semantics below assumes ‘normal’ formulas: variable atoms are on the left in a conjunction, conjunctions are always maximal and there is at most one occurrence of an atom x in a formula for each variable.
(4) σ[x] = {M ∊ σ: Mx is defined}
This allows the definition of a function dm(σ) that finds the discourse referents of σ.
(5) dm(σ) = {x: ∀M ∈ σ, Mx is defined}
Atomic updates are redefined in (6).
(6) σ[Px1…xn] = {M ∈ σ: <Mx1, …, Mxn> ∈ MP or <Mx1, …, Mxn> is undefined}
A negation with built-in quantification is given in (7). The two information states σ and σ[ϕ] together determine the set of discourse referents introduced by ϕ. An element of σ is eliminated if it has a variant with respect to these discourse referents in σ[ϕ].
(7) σ[¬ϕ] = {M ∈ σ: ¬∃M′ (M′ =X M and M′ ∈ σ[ϕ]), where X = dm(σ[ϕ]) \ dm(σ)}
(8) defines M′ =X M. Two models M and M′ are exactly the same apart from what their interpretation functions may do with the subset X of the language L.
(8) M′ =X M iff for all α ∈ L \ X, Mα = M′α
The definition of negation builds in quantification over discourse referents in the usual way: σ[ϕ] is existentially closed for its own discourse referents (which are the ones it does not share with σ) before being subtracted from σ. The definitions of the other operations are as before.
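To see how definitions (4)–(8) interact, here is a toy Python rendering. Everything concrete in it is an assumption made for illustration: a single variable x, a two-element domain, and a one-place predicate P with a fixed extension. A model is a partial assignment (a missing pair means "undefined"), and negation eliminates a model when it has a variant, with respect to the discourse referents that σ[ϕ] introduces, inside σ[ϕ].

```python
from itertools import product

# Toy inventory: one variable, a two-element domain, one predicate.
# These names and extensions are illustrative, not from the text.
DOMAIN = ("a", "b")
VARS = ("x",)
P = {("a",)}                        # assumed extension of predicate P

def all_models():
    """Every partial assignment of VARS to DOMAIN (a model = a frozenset
    of (variable, value) pairs; a missing pair means 'undefined')."""
    models = []
    for values in product(list(DOMAIN) + [None], repeat=len(VARS)):
        models.append(frozenset(
            (v, d) for v, d in zip(VARS, values) if d is not None))
    return frozenset(models)

def lookup(M, v):
    return dict(M).get(v)           # None if Mv is undefined

def upd_var(sigma, v):
    """Definition (4): sigma[v] keeps the models where v is defined."""
    return frozenset(M for M in sigma if lookup(M, v) is not None)

def dm(sigma):
    """Definition (5): the discourse referents of sigma."""
    return {v for v in VARS if all(lookup(M, v) is not None for M in sigma)}

def upd_atom(sigma, pred, args):
    """Definition (6): keep M if the argument tuple is in the predicate's
    extension, or is (partly) undefined."""
    keep = []
    for M in sigma:
        tup = tuple(lookup(M, a) for a in args)
        if None in tup or tup in pred:
            keep.append(M)
    return frozenset(keep)

def variant(M1, M2, X):
    """Definition (8): M1 =X M2, identical except possibly on X."""
    return all(lookup(M1, v) == lookup(M2, v) for v in VARS if v not in X)

def upd_neg(sigma, update):
    """Definition (7). 'update' is the update function for phi; the
    referents it introduces (X) are existentially closed before subtraction."""
    tau = update(sigma)
    X = dm(tau) - dm(sigma)
    return frozenset(M for M in sigma
                     if not any(variant(M, Mp, X) for Mp in tau))
```

For instance, with `ex = lambda s: upd_atom(upd_var(s, "x"), P, ("x",))` playing the role of the update with "x ∧ Px", `upd_neg(all_models(), ex)` existentially closes x (since x ∉ dm(σ)), whereas applying it to a state whose models all define x behaves like classical negation on the value of x.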
10.1.2 History
Update semantics came into its own in Karttunen (1976), Heim (1982, 1983), Kamp and Reyle (1993) and Stalnaker (1973), with the breakthrough that is now known as dynamic semantics and which was aimed at three problems: pronoun-binding semantics, presupposition satisfaction and the semantics of tense in discourse. The problems in pronoun-binding semantics go back to Geach (1962) (with medieval predecessors) and become acute in Montague (1974b). The update perspective on presupposition starts with the notion of pragmatic presupposition of Stalnaker (1973), reacting to the essentially semantic view of presupposition in a tradition going back to Frege, where presupposition is a condition on having a truth value, something that is hard to maintain for a class of presupposition triggers. For Stalnaker (1973), presuppositions become the conditions under which it is appropriate to use presupposition triggers. Karttunen (1974) extends this view with a detailed study of the kind of requirements that triggers impose on their environments (the inheritance conditions). Heim (1983) finally managed to reduce these inheritance conditions to the update conditions of the operations involved (see above). The problems of temporal reference were first noted by Reichenbach (1947) and addressed by Kamp and Rohrer (1983) in early DRT, while Partee (1973, 1984) showed the essential similarity between the problems of pronouns and temporal reference.
The common solution turned out to be to go dynamic. That is, first, to explain the truth of a natural language assertion in terms of both the world of evaluation and the informational component of the context (the ‘linguistic context’) and, second, to explain how the utterance (or parts of it) changes the informational context for further processing. These two aspects of dynamic semantics can be combined into a theory of information states and an account of how the expressions of a natural language (or of a suitable formalism designed to capture aspects of natural language) update information states.
An important insight is that truth can be defined in terms of updates. A true information state is one that contains the one true model (the model of the actual world) and (9) defines truth of sentences in terms of true information states.
(9) An utterance is true iff it updates any true information state into a true information state.
This can be generalised into an account of intensions. Let M be any model.
(10) ϕ is true on M iff it maps any information state containing M onto an information state still containing M.
And into an account of logical consequence as in (11):
(11) ϕ1…ϕn ⊨ ψ iff ψ is true on any model on which each of ϕ1 and … and ϕn are true.
More natural in the dynamic setting is to have a sequential definition, as in (12), though this does not correspond with a classical notion.
(12) ϕ1…ϕn ⊨ ψ iff for every information state σ, σ[ϕ1]…[ϕn] is defined and identical to σ[ϕ1]…[ϕn][ψ]
This also has a local version, as in (13).
(13) ϕ1…ϕn ⊨ ψ on σ iff σ[ϕ1]…[ϕn] is defined and identical to σ[ϕ1]…[ϕn][ψ]
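The definitions (9)–(13) can be made concrete in a small eliminative sketch, assuming states are sets of worlds and updates are intersections (a simplification of the general case; the worlds and propositions are invented for the example):

```python
# Eliminative toy update semantics: a state is a set of worlds, and an
# update intersects the state with a classical proposition.

def update(prop):
    """Lift a set of worlds (a classical proposition) to an update."""
    return lambda state: state & prop

def true_on(phi, M, universe):
    """(10): phi is true on M iff every state containing M is mapped to a
    state still containing M; for eliminative updates, checking the total
    state suffices."""
    return M in phi(universe)

def entails_local(premises, psi, state):
    """(13): the premises entail psi on `state` iff updating with the
    premises and then with psi changes nothing further."""
    for phi in premises:
        state = phi(state)
    return psi(state) == state

# Toy universe: four worlds named by which of the atoms p, q hold in them.
universe = {'pq', 'p', 'q', ''}
p = update({'pq', 'p'})            # worlds where p holds
q = update({'pq', 'q'})            # worlds where q holds
p_implies_q = update({'pq', 'q', ''})  # the material conditional as a proposition
```

On this model, `entails_local([p, p_implies_q], q, universe)` holds: after updating with p and the conditional, only the pq-world remains and the q-update is idempotent, mirroring modus ponens in the sequential format of (12)/(13).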
One is a proper update semanticist iff one holds that the meaning of a natural language expression is its contribution to the updates defined by the complete expressions in which it occurs.1 This means, for example, that setting up a new discourse referent belongs to the meaning of an indefinite description, that the meaning of a definite description includes its undefinedness in information states in which it does not single out a unique referent, and that the variation of referents, or undefinedness, of pronouns in particular information states is likewise part of their meaning. But often the update-semantical version of a meaning is an obvious lift of its traditional meaning.
10.1.3 Pragmatics
The interesting aspect of the proposal is that now suddenly many phenomena in the pragmatic waste-bin (which had landed there because they resisted a treatment in terms of truth conditions) can be taken out again and become part of semantics.
The reason is that far more structure is available. Where classical semantics treats sentences as functions from possible worlds to truth values, update semantics assigns to the same sentences functions from information states to information states, with the additional option of letting the function be undefined on certain states. If classical semantics were correct (i.e. if the meaning of a sentence were exhaustively characterised by a function from possible worlds to truth values), the intension [[ϕ]] of classical semantics would suffice for defining the corresponding update: the update of ϕ, written [ϕ], on an information state σ (written σ[ϕ]) could be given as σ restricted to those worlds M such that [[ϕ]](M) = true. The reverse, however, does not hold: updates that are not both intersective (σ[ϕ] ⊆ σ) and distributive (σ[ϕ] = ∪{{M}[ϕ] : M ∈ σ}) are not characterisable by a truth condition [[ϕ]]. The interesting updates are therefore the ones that lack one of the two properties. A simple example is a version of the partial analysis of assertion in Stalnaker (Reference Stalnaker and Cole1978): an update [ASSERT(ϕ)] which updates with ϕ if ϕ is consistent and informative and is the trivial update otherwise:
(14) σ[ASSERT(ϕ)] = σ[ϕ] if ϕ is consistent with σ (σ[ϕ] ≠ ∅) and informative (σ[ϕ] ≠ σ); otherwise σ[ASSERT(ϕ)] = σ.
One cannot see in a single world w whether ϕ is consistent or informative with respect to the whole information state. If ϕ is true in w, it may fail to be informative by not eliminating anything in the larger σ; if it is false, ϕ is informative but it may be inconsistent with σ by eliminating all the worlds of σ.
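A toy implementation of (14) makes the failure of distributivity tangible (states as sets of worlds; the world names are arbitrary):

```python
# ASSERT as a non-distributive update: whether a world survives depends on
# the whole state, not on that world alone.

def assert_update(phi):
    def upd(state):
        proposed = state & phi
        # consistent (non-empty) and informative (a real change)?
        if proposed and proposed != state:
            return proposed
        return state
    return upd

phi = {'w1'}              # the asserted proposition
big = {'w1', 'w2'}        # a two-world information state

upd = assert_update(phi)
whole = upd(big)                       # a real update: {'w1'}
pointwise = upd({'w1'}) | upd({'w2'})  # singleton states never change
```

On singleton states ASSERT never changes anything (the update is either uninformative or inconsistent), so the pointwise union is the whole original state and differs from the update of the state as a whole, exactly the failure of distributivity described above.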
In many cases, the inhabitants of the pragmatic waste-bin should not have been there in the first place. The specification of what it means that a pronoun has a certain antecedent in a context seems a semantic problem (and was taken, e.g. by Montague, to be an important part of natural language semantics). An equally semantic question is which antecedents are possible given the linguistic context. Arguably, only the question which antecedent is the right one for a given pronoun is a pragmatic question. In presupposition theory, the question whether the presupposition is true in the context of a trigger seems semantic (or, the more appropriate question for ‘anaphoric presupposition’: whether the context of the trigger entails a suitable antecedent for the presupposition), but questions of accommodation (for the case that the presupposition is not locally satisfied) would be pragmatic. The meaning of tenses and aspectual operators is also a semantic question, but the anaphoric processes that in many accounts play an important role in characterising their meaning in a particular context are not semantic but pragmatic.
A number of core applications of update semantics are also ‘semantic’ in this sense of enlarging the horizon beyond truth-conditional semantics. A good example is the semantics of epistemic may and must and natural language implication in the work of Veltman (Reference Veltman1996), or the semantics of imperatives and deontic modality in the work of Mastop (Reference Mastop2005) and Nauze (Reference Nauze2008). This work adds more structure to information states: an updatable preference ordering in the case of Veltman and ‘to do lists’ in the case of Mastop and Nauze. Such applications are, however, properly semantic and aim at exploiting update semantics for dealing with semantic phenomena that are not within the reach of truth-conditional semantics.
10.2 Semantics and pragmatics
If pragmatics is incorporated in update semantics the distinction between semantics and pragmatics seems largely lost: in both cases one has updating operations. One can do some classification: some update operations are eliminative and distributive and can be called classical, but semantics seems to go further than just that.
For example, the analysis of may in Veltman (Reference Veltman1996) makes it a test on information states: does the information state have worlds in which its complement holds? If so, [may ϕ] is the trivial update; otherwise it is inconsistent information and leads to an absurd information state.
This would be half of the minimal analysis of assertion by Stalnaker, which requires assertions to be informative and consistent with the common ground. A variant of the analysis in which bad assertions lead to the absurd information state can be defined using may as: σ[may ϕ][may ¬ϕ][ϕ].
The same holds for a proposal by Beaver (Reference Beaver2001) to analyse a presupposition trigger like bachelor as σ[pres(adult(x) ∧ male(x))][¬married(x)]. The semantics proposed for pres ϕ is also a test: σ[pres ϕ] = σ if σ[ϕ] = σ and undefined otherwise. This is close to the semantics for must that is most in line with the semantics for may. Being non-classical therefore does not mean being ‘pragmatic.’ The semantic/pragmatic distinction appealed to in attributing may a semantics is the distinction between lexically coded properties and properties that belong to the use of the expression. But it is questionable whether that distinction can be maintained. Conventional implicatures are lexically coded and so are presuppositions. Many particles are best seen as markers of rather unclear semantic features like additivity, formal contrast, denial of expectation, etc. Their functional explanation seems related not to the expressive possibilities of the language but to the difficulties of maintaining complex ranges of facts in memory.
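The two tests can be sketched together in the same toy setting (an illustrative reconstruction, not Veltman's or Beaver's actual formal systems; the bachelor propositions are stand-ins):

```python
# may as a consistency test, pres as a definedness test, over states as
# sets of worlds. ABSURD is the empty state; None marks undefinedness.

ABSURD = frozenset()

def may(phi):
    """Trivial if some world in the state satisfies phi, absurd otherwise."""
    def upd(state):
        return state if state & phi else ABSURD
    return upd

def pres(phi):
    """Defined (and then trivial) only if the state entails phi."""
    def upd(state):
        return state if state <= phi else None
    return upd

def assert_(phi):
    return lambda state: state & phi

def seq(state, *updates):
    """Run updates left to right, propagating undefinedness."""
    for u in updates:
        if state is None:
            return None
        state = u(state)
    return state

# bachelor(x) along the Beaver-style lines above: presuppose adult & male,
# then assert unmarried. The world names encode the relevant facts.
adult_male = {'am_single', 'am_married'}
unmarried = {'am_single', 'child_single'}
bachelor = lambda state: seq(state, pres(adult_male), assert_(unmarried))
```

On a state entailing adulthood and maleness, bachelor behaves like a plain assertion of unmarriedness; on a state that does not, the update is undefined rather than false, which is the test behaviour described in the text.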
One way to save the distinction would be to revert to a historical picture. An important feature of grammaticalisation processes is that the semantics of the grammaticalised items becomes more pragmatic (‘semantic bleaching’). The existence of modal operators with ‘pragmatic’ meaning could then be explained as the outcome of such a process. The ‘ability’ readings with which ‘may’ started – as is generally assumed – would be classical updates. The process by which ‘ability’ became ‘epistemic possibility’ turned the classical update into a non-classical one. And this holds for the other cases as well. It is a standard assumption in historical linguistics that words start either as deictic elements or as descriptive (‘truth-conditional’ or property-expressing) elements. The elements that give trouble must therefore invariably have come from elements whose truth-conditional characterisation is unproblematic.
10.3 Presupposition
Karttunen's study of presupposition projection starting in the 1960s and carried on into the 1970s can be thought of as the origin of update semantics, but without the name and the slogan ‘the meaning of an expression is the change it brings to an information state’. For Karttunen, presuppositions are the prerequisite for using their triggers: a context that does not entail the presupposition cannot accept the use of the trigger that presupposes it. In Karttunen (Reference Karttunen1973) this leads to a classification of contexts of triggers into plugs, holes and filters. Plugs are sentential operators that do not let any presuppositional requirement out (e.g. verbs like ‘x says that p’), holes let all the presuppositional requirements out (not, maybe, x knows that) and filters are the ones that let some presuppositions out and not others (and, or and if…then are examples). To understand this properly, one must realise that Karttunen is interested in presuppositional requirements.2 The requirement of a trigger like ‘is glad that’ can be fulfilled under ‘says that’ as in (15), and so it would be wrong to make the truth of the complement a requirement of the whole sentence.
(15) John said that Mary left and that Bill is glad that she did.
Filters are formalised in Karttunen (Reference Karttunen1974) in terms of information state, as in (16).
(16) X allows Ap iff X ⊨ p
X allows ¬A iff X allows A
X allows A & B iff X allows A and X + A allows B
X allows A → B iff X allows A and X + A allows B
X allows A ∨ B iff X allows A and X + ¬A allows B
The last of these has given some trouble, due to examples like
(17) Either the toilet is in a funny place or this house does not have a toilet.
There is still no fully satisfactory solution.
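Karttunen's conditions in (16) lend themselves to a direct recursive sketch, assuming formulas as nested tuples over named atoms and a stand-in entailment oracle (the atoms and the background knowledge are invented for the example):

```python
# Formulas: ('atom', name, [presuppositions]), ('not', A),
# ('and' | 'or' | 'if', A, B). A context X is a set of atom names.

BACKGROUND = {'john_has_son': {'john_has_child'}}   # hypothetical world knowledge

def entails(X, p):
    """Stand-in entailment oracle: membership closed under BACKGROUND."""
    closed = set(X)
    for fact in X:
        closed |= BACKGROUND.get(fact, set())
    return p in closed

def extend(X, A):
    """X + A: add what an (atomic or negated atomic) formula contributes."""
    if A[0] == 'atom':
        return X | {A[1]}
    if A[0] == 'not' and A[1][0] == 'atom':
        return X | {'not_' + A[1][1]}
    return X

def allows(X, A):
    op = A[0]
    if op == 'atom':                 # a trigger's own requirements
        return all(entails(X, p) for p in A[2])
    if op == 'not':                  # holes
        return allows(X, A[1])
    if op in ('and', 'if'):          # filters: B is checked under X + A
        return allows(X, A[1]) and allows(extend(X, A[1]), A[2])
    if op == 'or':                   # filters: B is checked under X + not-A
        return allows(X, A[1]) and allows(extend(X, ('not', A[1])), A[2])
    raise ValueError(op)

# 'If John has a son, his child is happy': the antecedent filters the
# presupposition of the consequent.
son = ('atom', 'john_has_son', [])
child_happy = ('atom', 'childs_happiness', ['john_has_child'])
conditional = ('if', son, child_happy)

# (17)-style disjunction: the first disjunct's presupposition is wrongly
# required of the context, illustrating the noted trouble with the last clause.
funny_place = ('atom', 'funny_place', ['there_is_a_toilet'])
no_toilet = ('atom', 'no_toilet', [])
toilet_disj = ('or', funny_place, no_toilet)
```

The conditional is allowed in the empty context because the antecedent, via the background entailment, supplies the presupposition of the consequent; the disjunction is not, which is one face of the problem (17) poses for the disjunction clause.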
From the point of view of update semantics, this is an attempt to characterise presupposition in terms of properties of the input context for their triggers: the use of the trigger is inappropriate unless the presupposition holds in the input state. Update semantics can capture this ‘allowing’ by update restrictions along the lines of (18).
(18) σ[T] is undefined unless σ ⊨ p
The notion is problematic however. For some triggers (too and many other particles; definite descriptions) the place where the presupposition is true can be other than the local context of the trigger. In (19), B's parents presumably have no opinion about whether A is in bed, and John can think that Tim likes him, without having any idea that Tim is the hearer's brother.
(19) A: My parents think I am in bed.
B: My parents believe I am in bed too.
John thinks that your brother likes him.
And it ignores accommodation. Often – but not always and not for all triggers – the missing presupposition can be added, but when this happens it is more likely to happen in the global context than in the direct local context of the presupposition trigger. What one can say is that if the presupposition is not just missing but overtly absent (the speaker says ‘I do not know whether p’) or if it holds in the context of the trigger that p is false, the trigger is not acceptable.
Irene Heim realised that this can be stated better as an update semantics.3 In this way, truth conditions and allowance conditions are captured by one single definition.
Heim brings a second improvement to Karttunen's system: the addition of global accommodation. If a trigger is updated in an auxiliary information state and its presupposition p is missing, it is assumed that either the local state or the global state X can be changed to X[p], with a preference for the global state. This captures the predictions of Gazdar.
While Karttunen uses a concept of update semantics to explain filters, Gazdar's aim was the modern update semantics goal of stating the full change that utterance will bring to an input context. His contribution is flawed, because of an attempt to replace Karttunen's insights with an only seemingly equivalent formulation in which filtering is the effect of clausal implicatures. This does not work, witness (20).
(20) If John has a son, his child must be happy.
(20) would only produce clausal implicatures that the speaker does not know whether John has a son. This is not inconsistent with the presupposition of the definite description ‘his child’ which therefore does not get cancelled. So Gazdar predicts – against intuition – that ‘John has a child’ is projected.
It is, however, not difficult to amend Gazdar's idea with Karttunen-style filtering, as in (21). Here X is the given context (the information state in which the trigger is used).
(21)
(i) The potential presuppositions of a simplex formula A under X are those given by the triggers in A that are not entailed by X.
(ii) For O one of ¬, □, ⋄, x Vs that, the potential presuppositions of OA under X are those of A under X.
(iii) The potential presuppositions of A → B and QAB (for Q a positive quantifier) under X are the potential presuppositions of ¬(A ∧ ¬B) under X, and those of A ∨ B under X those of ¬(¬A ∧ ¬B) under X.
(iv) The potential presuppositions of A ∧ B under X are the potential presuppositions of A together with those of B that are not entailed by A under X.
The exception to the cumulative hypothesis given in Langendoen and Savin (Reference Langendoen, Savin, Fillmore and Langendoen1971) can be fully explained by Gazdar's view of triggers as signs that the speaker accepts the presupposition. Such a sign has no significance if it is locally entailed in the context that the speaker accepts the presupposition. The conversational contribution of utterance u can now be simply stated as
(22) X ∪ {u} ∪! potIMP(u)
where potIMP(u) collects the potential scalar implicatures of u, the potential clausal implicatures of u and the potential presuppositions of u and ∪! is Gazdar's satisfiable incrementation: add an element x of potIMP to X∪{u} unless x makes a set consisting of X∪{u} and other members of potIMP inconsistent. The global projection of presuppositions in this view is treated quite rightly as yet another implicature,4 nowadays attributed by most who share the view that these projections are implicatures to Grice's maxim of relevance.5
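Satisfiable incrementation can be sketched as follows, with propositions as sets of worlds so that consistency is non-empty intersection; the version below is a simplified greedy pass rather than Gazdar's exact definition (which quantifies over satisfiable subsets of the candidates):

```python
from functools import reduce

# Propositions are sets of worlds; a context is a list of propositions.

def consistent(props):
    """A list of propositions is consistent iff some world survives them all."""
    return len(reduce(lambda a, b: a & b, props)) > 0

def satisfiable_increment(base, candidates):
    """A simplified greedy reading of Gazdar's union-exclamation: add each
    candidate implicature/presupposition unless it makes the context built
    so far inconsistent."""
    accepted = list(base)
    for c in candidates:
        if consistent(accepted + [c]):
            accepted.append(c)
    return accepted

# Example: asserting p, with the candidate additions not-q and then q.
universe = {'ab', 'a', 'b', ''}         # worlds named by which atoms hold
p, not_q, q = {'ab', 'a'}, {'a', ''}, {'ab', 'b'}
result = satisfiable_increment([universe, p], [not_q, q])
```

In the example, not-q is added first and q is then rejected as inconsistent with the incremented context, illustrating how earlier candidates can cancel later ones under this ordering.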
While this is an attractive reformulation of Gazdar that improves on Karttunen and Heim by restricting global accommodation in interesting ways (conflicts with implicatures and other presuppositions), it falls short of Heim and Van der Sandt on one point: a cancelled presupposition only stays behind as a local entailment of the simplex clause in which the trigger occurs.6 This is probably too inflexible, and the theories of van der Sandt (Reference Sandt1992) and Heim (Reference Heim, Barlow, Flickinger and Westcoat1983) offer the full space of auxiliary contexts necessary to put the presupposition in other places.
Van der Sandt's theory assimilates presupposition to the dynamic treatment of anaphora and is formulated in DRT (a representational version of update semantics, as assumed in this paper). It incorporates the idea that presupposition triggers (just like pronouns) can only update the context if the context contains the presupposition (an antecedent, in the case of an anaphoric pronoun). Van der Sandt obtains the effects of Gazdar by allowing the insertion of a missing presupposition in any context to which the presupposition trigger has access and to which the presupposition can be consistently added, with a preference for the highest available one, a stronger assumption than in Heim, who just postulates a preference for the main context.7
All the theories treated so far fail on one important presuppositional phenomenon: partial resolution. This is the situation where an antecedent is present, but not quite. Spenader (Reference Spenader, Kuehlein, Rieser and Zeevat2003) investigated corpora for too and Kamp and Rossdeutscher (Reference Kamp and Rossdeutscher1994) formalised some specific texts for the German wieder, ‘again’. In both studies, most antecedents turn out to be incomplete, i.e. one finds material that is intuitively the antecedent, but information needs to be added to it to make it a proper antecedent as required by the trigger. Kamp and Rossdeutscher (Reference Kamp and Rossdeutscher1994) have interesting suggestions about how to make the amendments, and their treatment is more feasible in a representational theory than in an eliminative update semantics. It is equally important to prevent such repairs where they are not wanted. For example, John is walking in the park. Mary will have dinner in New York too. does not lead to a repair in which John will have dinner in New York. (The example is due to Nicholas Asher, p.c.) Partial resolution happens with other triggers as well, as in McCawley's famous example (23).
(23)
a. LBJ dreamt that he was a homosexual and everybody knew that his foreign policy was a failure.
b. LBJ dreamt that he was a homosexual and everybody knew that he was waiting for boys in the restroom of the YMCA.
The examples show that (23b) is a case of (partial) resolution, since the presupposition does not project out. The processing that a sufficiently informed human interpreter of the sentence goes through seems to involve weighing several possibilities against each other, e.g. as in (24).
(24) Is LBJ waiting for boys in restrooms? Seems rather unlikely. Does everybody know this in his dream where he is a homosexual? Quite possibly.
This weighing of possibilities is important. (25a) is an example from Beaver (Reference Beaver2001).
(25)
a. When Spaceman Spiff lands on planet X he will be surprised that he weighs more than on Earth.
b. When Spaceman Spiff lands on planet X he will notice that he weighs more than 200 kilos.
In (25b), Spaceman Spiff can come to weigh 200 kilos when he lands on planet X, but 200 kilos can also be his normal weight. In the latter case (a serious degree of overweight that does occur), it would be extremely unlikely that he would have been chosen to be a spaceman. The more local accommodation, that he weighs 200 kilos on planet X, is just as preferred as in the (a) case, but unlike the (a) case, the preference rests merely on probabilistic reasoning: the other possibility is consistent.
10.4 Implicature
Gazdar's formalisations of clausal and scalar implicatures are the first in an update perspective. They consist in the cancellable addition of the negations of scalar lexical alternatives further up the Horn scale, and in the addition of ¬K¬ϕ and ¬Kϕ for clauses ϕ whose truth value is left undetermined by the matrix sentence in which they occur.
While there are many alternative treatments of the scalar implicatures, the treatment of clausal implicatures does not seem to have drawn much discussion. An obvious criticism (Rob van der Sandt, p.c.) is that the proposal appears to be too strong: it is perfectly possible to believe A, or to believe that A is false, and at the same time state that John believes that A. But this is less of a problem in the context of the revised Gazdar solution from the last section, where it is easy to get cancellation of these implicatures. The implicatures can be connected either to quantity (if A is left undecided while the speaker knows that A or that not A, he is being economical with the truth) or to relevance: using the clause A raises the question whether A, and leaving it open is a sign that one cannot decide the matter. This also seems relevant for example (26), discussed in Karttunen and Peters (Reference Karttunen, Peters, Oh and Dinneen1979), where in (26b) the medical speaker would take into account that there could be a view of the matter in play under which the measles is not a possibility.
(26)
(a) If he has the measles, he would have precisely these symptoms.
(b) If he had the measles, he would have precisely these symptoms.
There are now many versions of update semantics where questions are used to obtain exhaustivity effects, including scalar implicatures. The idea combines an analysis of the topic/focus distinction in terms of answers to a question (Klein and von Stutterheim Reference Klein and von Stutterheim1987; van Kuppevelt Reference Kuppevelt1995) with the exhaustive interpretation of wh-questions (Groenendijk and Stokhof Reference Groenendijk and Stokhof1984). The trouble of course is that very often these questions are not overtly given, while at the same time the scalar implicatures and the exhaustivity effects are intuitively there.
Zeevat (Reference Zeevat, Aloni, Butler and Dekker2007) uses question updates to assign an exhaustive value to the wh-variable involved. In this way, John has two pigs can answer questions like How many pigs does John have? What animals does John have? Who has pigs? Who has two pigs? and give exhaustivity entailments like: the number of John's pigs is 2, the animals that John has are pigs, the one who has pigs is John, the one who has two pigs is John. The context should make it clear which of these questions apply.
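The exhaustivity effects can be sketched in a toy model where a world just is the extension of the questioned predicate (an illustration of the idea of exhaustive answerhood, not Zeevat's actual system):

```python
# Toy exhaustive interpretation: worlds are extensions of 'has pigs'.

def exhaustify(state, answer_set):
    """'Who has pigs? John.' on its exhaustive reading: keep only worlds
    where the pig-owners are exactly the named individuals."""
    return {w for w in state if w == frozenset(answer_set)}

def plain(state, answer_set):
    """Non-exhaustive reading: the named individuals are among the owners."""
    return {w for w in state if frozenset(answer_set) <= w}

# Three candidate extensions of 'has pigs'.
worlds = {frozenset(), frozenset({'john'}), frozenset({'john', 'mary'})}
```

Relative to the question Who has pigs?, the exhaustive update with the answer John eliminates the world where Mary also has pigs, which the plain update leaves in; the exhaustivity entailment 'the one who has pigs is John' is exactly this difference.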
Another way of going about this problem is to structure the set of possibilities in an information state into partitions (e.g. Jäger Reference Jäger, Dekker and Stokhof1996) and more recently in inquisitive semantics (an enterprise that goes well beyond just a treatment of these implicatures) by maintaining a possibility structure over the set of indices (Groenendijk Reference Groenendijk, Bosch, Gabelaia and Lang2009).
Another approach (Schulz Reference Schulz2007) integrates another aspect of relevance, utility for action, into the topic question. The treatment of relevance as a dimension of information structure is probably the biggest theme of update semantics.
Inquisitive semantics achieves a breakthrough by addressing the representation of questions in a way which also deals with the natural emergence of questions out of assertions. John is in Paris or London will lead to an information state that also represents the question whether John is in Paris or in London. And this should be generalisable to a similar mechanism for existentials: not just the information that there is one, but also the question which one.
Theories of topic and focus and many accounts of particles can profit from mechanisms that raise questions as well as give information. One would like to have predictive theories that add questions to the information state, such as a causal question if the information is surprising, a justification question if there is good reason for doubting the information, a background question if the setting is too unclear, more information questions when an object or a concept is not clear etc. This is often not a question of only logic.
10.5 Speech acts and discourse relations
There is surprisingly little work on speech acts within update-semantical pragmatics.8 Yet, the work of Austin (Reference Austin1962) and Searle (Reference Searle1969) fits in remarkably well. First of all, speech acts come with preconditions, which can be captured much like other presuppositions, including accommodation. Then they can be seen as manifestations of intentions: the speaker wants to inform that p, wants to know who ϕ's or whether ψ, wants to commit herself to a course of action p, wants the hearer to do p, etc. Further, the decision to use a speech act indicates a belief in its possible success, which normally can be cashed out as attitudes of the hearer: the hearer does not believe that ¬p, the hearer may know whether p, the hearer would want the speaker to do x, the speaker has the necessary power over the hearer to order that p, etc. Finally, the intention may connect to known goals of the speaker and allow further inferences, especially if the utterance does not stand alone but is connected to an earlier utterance by the same or another speaker.
Then various forms of reaction to the speech act can be characterised in a similar way. The common ground that results from these speech acts may contain shared information, shared goals, shared emotions, contracts, shared questions, shared experience and shared plans.
10.6 Probabilities
A problem that has been ignored in update semantics as much as in other dynamic frameworks is the problem of pronoun resolution. It is normally put out of sight by coindexing or by employing the same variable in different updates. Equally problematic is the choice between different readings of a word, for which – if the problem is noticed at all – a similar enrichment of the update with subscripts is chosen. What if one tried to integrate it? After all, the context should not just make the antecedents available but also serve as the content that allows a good choice between different antecedents. The context is normally supposed to play a similar role for word senses.
It is hard to imagine anything very different from the picture that emerges from recent psychological investigations: the parallel activation of many readings, with elimination of the less likely ones on the fly. An approximation is to have all the possible readings and a mechanism that computes conditional probabilities based on the probabilities of the predications on types (hit_ag(person, stone) vs. hit_th(person, stone) vs. hit_th(stone, person)) and combines them with information coming from the context. The construction requires a theory of how the information state can influence the probabilities, and it is not directly clear how this can be achieved, though quite a number of things can be tried. The principle seems to be that if there is a clear probabilistic winner, it is the update; otherwise the update blocks (and feedback is sought). This is an unusual ingredient of a semantics, but it fits directly into the picture of an update semantics trying to model what happens to a hearer when exposed to utterances. It is also a necessary ingredient of any pragmatics. The process that selects word meanings, parses, antecedents, discourse relations and other unexpressed ingredients of the intended meaning is pragmatic because it stands under the normal pragmatic constraints. In particular, it stands under a constraint which makes the global interpretation as probable as possible among the set of possible global interpretations.
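The winner-or-block regime just described can be sketched as follows (the probabilities, the readings and the margin are illustrative assumptions, not empirical values):

```python
# Commit to a reading only when it clearly beats the runner-up; otherwise
# block the update (and seek feedback).

def select_reading(candidates, margin=2.0):
    """candidates: {reading: P(reading | context)}. Returns the winning
    reading, or None when no reading beats the runner-up by `margin`."""
    ranked = sorted(candidates.items(), key=lambda kv: kv[1], reverse=True)
    if len(ranked) == 1:
        return ranked[0][0]
    best, second = ranked[0], ranked[1]
    if second[1] > 0 and best[1] / second[1] >= margin:
        return best[0]
    return None   # no clear winner: the update blocks

# 'The person hit the stone': typed predications make one role assignment
# far more probable than the other.
readings = {
    'hit(agent=person, theme=stone)': 0.90,
    'hit(agent=stone, theme=person)': 0.05,
}
```

With the toy probabilities the agent-person reading wins outright; with two near-equal candidates the function returns None, modelling the blocked update.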
It can be argued that this entails several operations that have traditionally been studied as part and parcel of pragmatics. One is a preference for connecting interpretations -- formulated in different traditions as ‘Do not miss anaphoric possibilities!’ (Hendriks and de Hoop Reference Hendriks and de Hoop2001), ‘Do not accommodate!’ (Blutner and Jäger Reference Blutner, Jäger, Fabricius-Hansen, Lang and Maienborn2003), ‘*New’ (Zeevat Reference Zeevat2009), ‘The Inertia Principle’ (Hamm and van Lambalgen Reference Hamm and van Lambalgen2005), ‘Minimal Models’ (Schulz Reference Schulz2007) -- over interpretations that are less connected. It can be construed as preferring anaphoric (repeating) referents over bridging referents over brand-new referents, with a preference for connections to the most activated referents. An anaphoric referent is the referent of some earlier expression; a bridging referent is a proper part of a given plural referent or of a given singular referent, or the cause of a given event. This principle has a surprising number of applications in the area of presupposition treatment and in the area of pronoun resolution, but also in the area of discourse structure.
It is not correct that a principle of this kind increases probability on the level of content. If p and p′ only differ in that p makes more contextual connections, p may well have a lower probability given the context. But on the level of what speakers are likely to intend with utterances such as p, they have a clear advantage: speakers seem to operate under the principle that where an identification is not intended, it is overtly marked that the entities are different. The result is that identifying interpretations are more probable, since they have been made impossible if unintended.
An explanation for this speaker strategy may lie in the fact that any kind of perception seems biased to identification and integration with the preceding experience. Natural language interpretation is just a special case of perception, so that speakers need to work against overidentification and overintegration.
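The connection preference described above (anaphoric over bridging over brand-new, with activation as a tie-breaker) can be sketched as a simple ranking; the kinds, weights and candidate names are all illustrative:

```python
# Rank candidate referents first by connection kind, then by activation.

KIND_RANK = {'anaphoric': 2, 'bridging': 1, 'brand-new': 0}

def best_referent(candidates):
    """candidates: list of (name, kind, activation). The connection kind
    dominates; activation only breaks ties within a kind."""
    return max(candidates, key=lambda c: (KIND_RANK[c[1]], c[2]))[0]

candidates = [
    ('the engine', 'bridging', 0.6),    # a proper part of the given car
    ('the car', 'anaphoric', 0.4),      # mentioned before
    ('a truck', 'brand-new', 0.9),      # would have to be newly assumed
]
```

Even a highly activated brand-new candidate loses to an anaphoric one, which is the lexicographic character of the preference: kind first, activation second.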
A second principle is known as ‘Relevance’ (also as ‘Strongest Interpretation’ (Blutner and Jäger Reference Blutner, Jäger, Fabricius-Hansen, Lang and Maienborn2003), ‘Maximise Discourse Coherence’ (Asher and Lascarides Reference Asher and Lascarides2003), ‘Minimal Models’ (Schulz Reference Schulz2007)) in many traditions, though the different traditions give different accounts and it is not directly clear that everybody intends the same. The author's favourite formulation is that interpretations that can be construed as addressing a raised question win over minimally different interpretations that do not, and lose out to otherwise identical interpretations that can be taken as settling that question. (‘Strongest Interpretation’, read strictly, faces a number of counterexamples and should probably not be taken as a correct characterisation; see Beaver and Lee Reference Beaver, Lee, Blutner and Zeevat2003.) ‘Maximise Discourse Coherence’ (Asher and Lascarides Reference Asher and Lascarides2003) seems to work in the same direction as this principle. ‘Minimal Models’ either minimises the domain of the model (new objects are assumed only under force majeure, possibly by assuming that the new object is part of an old one), or else ‘minimal’ refers to the size (extension) of certain relations.
The effect of the relevance principle is restricted by the conservativity principle: one cannot freely invent answers to a raised question. In the formulation given, it will deal with a wide range of phenomena again: scalar implicatures (27a), a side effect of settling quantity questions; exhaustivity implicatures (27b), a side effect of settling identity questions; presupposition projection (27c), a side effect of settling the question whether the presupposition holds taking into account that the speaker gives a signal with her use of the trigger that it does; certain explicature implicatures (27d); strengthening discourse relations (27e); relevance implicatures in the sense of Grice (27f) and even most of the other effects of relevance in Relevance Theory. (The symbol ‘+ >’ in (27) stands for ‘conversationally implicates’.)
(27)
a. John has 2 cats and a canary. + > John does not have three cats.
b. John has 2 cats and a canary. + > John does not have other pets.
c. John hopes that his niece will visit. + > John has a niece.
d. John had a drink. + > John had an alcoholic drink.
e. John pushed Bill. He fell. + > Bill fell because of the pushing.
f. There is a garage around the corner. + > The garage can help you to get some petrol.
Settling raised questions produces more information than keeping these questions open and so again works against probability. The set of minimal models is also a part of the set of all models, so again, probability can at best go down. But again it holds that speakers who want to avoid such inferences should mark against them, i.e. that not marking against them is sufficient for being taken to intend them. Again, the tendency of interpreters to make them can be reduced to a bias in perception (seeing what you want to see).
The author would conjecture that approximations to content probability together with symbolic implementations of the two principles discussed above would give promising results.
The given information state functions three times in such a programme: as a source of information which determines the relevant conditional probability that would decide between different interpretations of the new contribution; as a source of the objects given in the context and of their activation levels, for the second principle; and as a source of activated questions and their activation levels, for the third.
The last role highlights once again the importance of integrating the other speech acts into update semantics. The information state that is shared between speaker and hearer must meet the functional demands arising from the various roles that it has to play.
10.7 Conclusion and prospects
Probably the most important conclusion about update semantics is that it is a central format for any theory of interpretation. Even if one holds that the basic mechanisms of pragmatics are not related to updating, the predictions of any new proposal can only be tested once they are brought into the format of update semantics, since only in that format do those predictions become explicit.
Second, update semantics seems central to accounts of pronouns, presupposition and speech acts, to the extent that alternative treatments tend to be unnatural and problematic. Whatever is interesting about E-type analyses of pronouns can be easily captured in an update semantics. The view that pronouns refer to an object of their local context is simpler and more natural, and much closer to the historical origin: deictic pronouns. Presupposition requires a distinction between old and new, a distinction which is the architectural backbone of update semantics. In other frameworks (cf. Schlenker Reference Schlenker2008) this distinction has to be carefully reconstructed. While non-controversial accounts of accommodation and partial resolution are not currently available, the formalisation of relevance on which they seem to depend is the focus of study in recent update semantics, since it is a central aspect of pragmatic updating. Relevance seems much further away in classical semantics.
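The old/new distinction invoked here can be made concrete in a toy sketch in which a presupposition is a definedness condition on eliminative update. This is a simplified, Stalnaker/Heim-style picture, not the specific proposal of any author discussed; the atoms and helper names are invented, loosely after example (27c):

```python
# Worlds as frozensets of the atomic facts true in them; an information
# state is a set of worlds.

def presuppositional_update(state, presup, content):
    """Update is defined only if the presupposition already holds
    throughout the input state (i.e. is 'old'); otherwise the hearer
    must accommodate before updating with the asserted content."""
    if not all(presup(w) for w in state):
        raise ValueError("presupposition failure: accommodation needed")
    return {w for w in state if content(w)}

def accommodate(state, presup):
    """Global accommodation: silently restrict the state to the worlds
    where the presupposition holds."""
    return {w for w in state if presup(w)}

worlds = {frozenset(s) for s in [
    {"has_niece", "niece_visits"},
    {"has_niece"},
    set(),
]}

has_niece = lambda w: "has_niece" in w
niece_visits = lambda w: "niece_visits" in w

# 'John hopes that his niece will visit' (27c) triggers 'John has a
# niece'; the asserted content is drastically simplified here.
state = accommodate(worlds, has_niece)
state = presuppositional_update(state, has_niece, niece_visits)
assert all(has_niece(w) for w in state)  # the presupposition has projected
```

The point of the sketch is architectural: the old/new distinction is built into the update itself, rather than reconstructed after the fact as in static frameworks.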
Third, there is another side to our problematic attempts to separate semantics from pragmatics. In the classical picture, a historical shift from a classical notion such as ability to a pragmatic notion such as epistemic modality would necessarily be something of a saltus: one moment there is a truth criterion, the next moment there is only an assertability condition. This does not sit well with an evolutionary account of such changes, in which certain numerical parameters would slowly change. Update semantics offers continuity for changes of this kind and would predict in general the possibility of slow and continuous shifts in the meaning of lexical items, which can move from descriptive to more and more pragmatic.
Finally, there is a philosophical point akin to the eliminability of a representational level for semantics in Montague's conception of natural language semantics. Update semantics is about changes to information states, and these are sufficiently accessible to speakers and hearers to allow for experimental validation. This would be a level of representation that is as robust as the ability to communicate nonsense words through speech and hearing: performance can be tested. This does not apply to any of the intervening levels in standard accounts of language, such as syntactic trees, logical forms of various flavours, proto-logical forms without resolution of pronouns, or representations of ambiguous words. These levels have been assumed in accounts of grammar, of the interpretation process, or of the formulation and speaking process, but lack any direct evidence of their own. Update semanticists are not committed to any of these intermediate representations, with the exception of phonetic representation, though – of course – they face the burden of explaining how the assumed processes might work just as much as anybody else and may have to adopt some of these levels in their accounts.
Current update semantics falls short of modelling linguistic agents in various ways. Agents have information (including anything that can be modelled as information, such as perception and desires), defaults, obligations and questions; there is no good reason, it seems, why they should not have other aspects as well, such as emotions, goals, intentions and decisions. It can be surmised that future update semantics will not be about information states anymore, but about agents that are updated by linguistic communication (or about all the aspects of an agent that can be changed by communication).
Another element that was mentioned above is the need for more probabilities in these models (this is also part and parcel of a shift from information states to agents). Current update-semantical pragmatics is not able to weigh different resolutions of the same pronoun against each other, is not able to decide between accommodating and partly resolving a presupposition, and is certainly not capable of the subtle weighing of the literal meaning of an utterance against an ironic interpretation. This last task is difficult. It would be wrong just to proceed with a probabilistic reinterpretation of the bivalent notion. In our perception and in linguistic communication there is perhaps always an element of doubt, but on the whole our intuitions place us firmly in one single environment.
11 The normative dimension of discourse
11.1 Discourse and normativity
It is not too controversial to say that discourse may interact with normative relationships among people, relationships such as obligations or entitlements. Some speech acts may produce novel normative links: if you promise your friend to return the money he lent you, then your speech act institutes an obligation on your part, the obligation to return the money. Some speech acts may presuppose specific already-extant normative relationships: for example, to speak at the banquet of a scientific conference presupposes some entitlement conferred by the speaker's status – such as his being the head of the organizing committee or perhaps the dean of the faculty backing the conference.1
In some cases, a speech act may both presuppose and institute normative relationships. A typical case in point is ordering (commanding): this purports to institute an obligation on the part of its addressee; but it succeeds in instituting it only if it meets the condition that the actor's position is in a relevant sense superior to that of the addressee. Moreover, it would seem that ordering can be completely characterized in terms of the changes of the normative links it brings about: ordering, we can say, is simply the act which, when carried out by an entitled actor, creates a specific obligation on the part of the addressee. However, ordering is not usually thought of as a particularly typical speech act; language, so the usual story goes, is more a matter of something like “encoding and decoding information” or “communicating ideas and feelings”, while giving orders, or other ways of building on or establishing normative links, is little more than a by-product of this.
In this chapter I want to explore the possibility that normativity is far more crucial to language than this. An idea flickering in the theories of several twentieth-century philosophers of language (and seen earlier in Immanuel Kant) is that a certain kind of normativity is constitutive of our distinctively human mind (aka reason), founding our concepts and infiltrating the semantics of our language. If this is true, then normativity is not only an accidental element of some of our speech acts, but rather their essential ingredient. Here we want to expose the motivations supporting this view and search out its consequences.
In section 11.2, we will reconsider the traditional picture of communication as essentially a matter of transferring information and the related picture of language as a collection of representations (of ideas or things of the outer world); and we will consider, in section 11.3, alternative pictures. This will result not only in giving pragmatics pride of place over semantics in explaining the nature of language, but also in endorsing a particular version of pragmatics, which we will call pragmatist pragmatics and which will take words to be first and foremost means of achieving practical ends. However, this will be only an intermediate station before our ultimate terminus: inspection of normative versions of pragmatics. We will reach it in section 11.4 after we reject the possibility of erecting the pragmatist picture of language on the concept of disposition and thus will be driven to what we will call the normative turn. In sections 11.5 and 11.6 we will then consider the consequences of this turn.
To foreshadow this approach to language and discourse, let us take as a basic thesis that meaningfulness is not a (naturalistic) property of a type of sound or an inscription, but rather a propriety: saying that an expression means thus and so is saying that it is correct to use it thus and so, that it is governed by a certain set of rules. The mechanism is supposedly similar to that animating games and sports: saying a piece of rubber is an ice-hockey puck is not ascribing a peculiar (naturalistic) property to the piece, but citing its role vis-à-vis the rules of ice hockey. Hence from this viewpoint, discourse is like a kind of game: it is governed by rules (though in contrast to ice-hockey, not necessarily explicit rules) and meaningfulness is the effect of the rules.
It is crucial to realize that rules, as we understand the term here, are something different from what is studied in Chomskyan linguistics. The latter are certain principles implemented in human brains and thus inevitably governing our linguistic behavior; whereas rules in the sense adopted here guide us always “evitably” – it is a hallmark of the rules in our sense that they can be, as a matter of principle, violated. A rule construed thus is a social fact, consisting in the collective awareness that something ought to be thus and so, manifested in the corresponding behavioral regularities (not only the more or less regular compliance with the rules, but a more or less regular prosecution of deviations, etc.).
11.2 Transferring information?
Consider the received wisdom that language is a matter of transferring information. There seems nothing particularly normative about such transfers (aside from the triviality that it can be carried out, like any other kind of act, correctly or incorrectly, well or badly, appropriately or inappropriately, etc.). What is going on is the moving of something from one head to another – a perfectly naturalistic enterprise.
But is this truly so? Consider the act of “transferring information” accompanying the act of assertion. What happens when I assert something, tell something to somebody, or inform somebody about something? I emit a certain sound (let us disregard the written form of language, for simplicity's sake), and it reaches my audience's ear. What can be effected by a mere sound? Seeing it as merely a sound, rather than an expression, we can imagine that its capabilities might include being able to scare the audience or attract its attention; but using the specific kinds of sounds that constitute language enables us to do much more complicated and very specific things: for example, we might cause somebody at a distant place to open a particular door, go into a room and take something from a particular drawer. How do we achieve this?
There is the temptation to have all the explanatory work done by the terms information and, indeed, meaning. The sounds that constitute language differ from other kinds of sounds, and acquire their almost miraculous abilities, because they have been furnished with meanings. (How? Perhaps – we pull another rabbit out of the hat – by means of a convention!) This grants them the ability to transfer information from one head to another, and as potential pieces of information constitute an incredibly large and fine-grained spectrum, we can achieve, by sending them to our audience, a very large and differentiated spectrum of reactions.
However, using such concepts as information and meaning as unexplained explainers begs the question. Almost everybody would agree that the talk about transferring information by means of words is merely metaphorical, that there are no such things as actual units of information literally hanging on the words that flow from my mouth to your ear.2 But what is it a metaphor of? An immediate answer might be that our brains have the ability of dealing with language in the sense that they discern a vast number of sounds and somehow recognize them as “codes” encoding something. Hence the speaker uses the ability of his brain to “encode” information and the hearer uses it to “decode” it. Thus, nothing need literally hang on the words; it “hangs” on them merely metaphorically, in that they encode it.
But in what precisely does the encoding/decoding ability of our brains consist? Does the brain contain some huge “code table,” which allows it to translate information to and from the linguistic codes? This seems to be precisely the idea put forward by Chomsky, who urges that language is a huge system of pairings of sounds and meanings.3 But how do such “code tables” materialize in our brains?
Clearly, this must happen during the process of learning the language in question. (It cannot be innate, for the codes, the words, differ from society to society.) A forthright picture of how this happens was described by Augustine: we are shown things and told their names, thereby building the table.4 This has developed into what can be called a semiotic or a representationalist picture of language, according to which meaningfulness consists, first and foremost, in standing for something. But is this picture plausible? Two problems immediately surface: First, although we can be shown things or people, and perhaps also, less straightforwardly, we can be introduced to qualities such as redness or manhood by being shown red things or male persons, there still remains a plethora of words where it is utterly unclear what we can be shown (think, for example, of adverbs, such as rapidly or always). Second, to speak a language is to have use of an unlimited number of expressions, so inevitably no amount of pointing can furnish them all with their meanings.
These problems are usually not considered insurmountable. As for the first, it is assumed that only some of the words of our language are learned by ostension and thus come to function as proxies for things, whereas the rest of them constitute some kind of auxiliary scaffolding for employing the “core words” (see, e.g., Weinreich, Reference Weinreich, Householder and Saporta1962: 36). As for the second, learning a language is seen to consist of learning the meanings of words plus mastering ways of composing meanings of complex expressions out of those of simpler ones.
Note, however, that this essentially compromises the picture of our semantic competence as a matter of the possession of a code table. As for the first problem, the remaining worry concerns what it is that is stored in our brains along with words like rapidly or always. As for the second, we can certainly say that what effects the encoding/decoding should be seen more as an algorithm than as a table, which would account for the potential infinity of meaning–sound pairs; but what about the infinity of meanings themselves? Certainly if we are to encode a potentially infinite number of entities, we need to have the entities (if only merely potentially). But how do we generate all the meanings expressible in a language? Of course, we can generate them via generating all the expressions expressing them; but if they are to be encoded by the expressions, then it seems they should be available before the encoding happens. And if we have an unlimited number of meanings, what sense does it make to think that we have them as being assembled in our brain for a potential encoding?
We may try to meet this challenge by claiming that the meanings are something we are born with – that nature and our genes endow us with a “language of thought” that does not consist of expressions codifying meanings, but rather of expressions that directly are meanings. This seems to be the strategy of Fodor (Reference Fodor1975; Reference Fodor2008) and his followers. However, if we see a language as a collection of sound- or inscription-types that become meaningful via their engagement with our complicated discursive practices, then such a detachment of meanings from expressions is basically problematic: once again it leads to unexplained explainers that stand in essential need of explanation themselves.
11.3 The pragmatist turn
At this point, the picture of communication as principally a matter of exchanging encoded information loses its initial plausibility, encouraging us to consider some other approach. And as many philosophers and linguists have recommended, a useful alternative might be to see language not in terms of representing things and encoding/decoding information, but rather in terms of practical ends to which linguistic items can serve as means, to see expressions as tools rather than as codes. This insight is characteristic of philosophical pragmatism (see Haack and Lane, Reference Haack and Lane2006), but it has found its way into various other philosophical and linguistic conceptions of language in the later twentieth century. It is central to the neopragmatist theories of Quine and Davidson, it animates the later Wittgenstein's theory of language games and it is partly also present within the speech act theory initiated by Austin and Grice. I will call this view of language and communication pragmatist pragmatics. From this perspective, questions of meaning and information, of course, are radically altered.
Consider a toolbox (the metaphor for language favored by the later Wittgenstein5). I may learn various ways of using the tools the toolbox contains; and not only each of them alone, but also in combination: the hammer and a nail, the screwdriver and a screw, a bolt and a nut…I may accomplish various useful things. The more skillful I am with the tools, the more practical tasks they can help me solve. Moreover, they render it possible for me to do things I would never have even considered before: to build and mend things I would previously have been unable to imagine. Thus, although sometimes I use tools to cope with tasks that would have faced me independently of whether I had a toolbox or not, very often I use them to do things that I would not come to think about were it not for my experience with the toolbox – it is the skill of using the tools that makes many tasks that can be solved by its means visible in the first place. For this reason it would be absurd to think of the tools and their combinations as responses to tasks given beforehand.
Switching to this pragmatist view of language also prompts us to re-evaluate the traditional view of the semantics/pragmatics boundary. Recall that the traditional definition of the boundary between semantics and pragmatics, as given by Morris (Reference Morris1938) and Carnap (Reference Carnap1942),6 was conceived within the representationalist framework. Semantics was considered to address questions central to the framework, namely what things our words stand for, while pragmatics was relegated to the sphere of peripheral, residual questions of how we use words – how the words endowed with meanings get also peripherally endowed with our habits, moods, or fancies.
Given the pragmatist turn that leads us to see language more as the vehicle of an activity, semantics is effectively swallowed up by pragmatics – everything is a matter of how we use language. Of course, we can still have a semantics/pragmatics boundary, but now pragmatics will not be an unessential appendage of semantics, but rather semantics will be a slightly artificially demarcated part of pragmatics (such as, for example, the part which deals with truth-conditions, as Stalnaker (Reference Stalnaker1970) suggests).
Hence, the pragmatist turn may help us overcome the obstacles generated by the earlier idea that what we principally do with language is transfer information, and by the consequent code conception of language.7 From the pragmatist viewpoint, language is not a mere instrument of dealing with the extralinguistic world: while in some cases it may help us cope with problems we would face independently of our having a language, more often than not we use it for tasks which only came into being with the birth of language – discussing theoretical questions, reciting poetry, buying a book, and so forth. Hence it seems that the tasks for which expressions are fitted co-evolve with language. For this reason it seems to me misleading to imagine expressions as codes of something given independently of them.
Note that, although we claimed that this pragmatist approach to pragmatics is not alien to the speech act theory of Austin and Grice (see the perlocutionary dimension of the speech act), ultimately it is an alternative to the specifically Gricean approach to pragmatics based on the concept of intention, which dominates the current pragmatics scene. Just as this view avoids taking either the concept of information or that of meaning as an unexplained explainer, so it also avoids taking intention as one.8 In this way it is more thoroughly naturalistic.
Hence the impasse into which we were brought by the code conception might be overcome if we take the pragmatist turn; but from the opening comments of this chapter it follows that we want to consider one more turn, namely a normative one. Why is this? Why should we not rest content with the pragmatist turn?
11.4 What ties an observation report to what it reports?
Consider the sentence This is a dog. What does the fact that it means what it does in English consist in? It would seem that whatever else might come into its meaning, what is essential is that English speakers use this sentence when confronted with a dog, and not when confronted with, say, a horse.
However, is this true? For a start, we can think of cases where competent speakers might utter This is a dog when confronted with a horse – as a result of bad sight, of jocularity, irony, or poetic inspiration, etc. But perhaps such cases may be dismissed as statistically insignificant; more substantial is the observation that the majority of English speakers, when confronted with a dog, would be extremely unlikely to actually utter This is a dog. In fact, I suspect the number of positive cases may well be statistically insignificant. (As Chomsky (Reference Chomsky, Davidson and Hintikka1975) conjectured, the statistical probability of uttering almost any specific expression in a given situation will not be significantly higher than zero.)
Hence we are back at the conundrum of what establishes the connection between This is a dog and dogs. The usual rabbit that philosophers and linguists pull out of their hats here is the concept of disposition. This is a dog means that this is a dog (and not, say, a horse) because speakers are disposed to utter it when confronted with a dog. This disposition sometimes provokes the overt utterance, but more often than not it remains covert.
But this rabbit, I feel, is just a trick to delay clarification. What does it mean that a speaker has a covert disposition to utter This is a dog? It amounts to the counterfactual claim that the person would utter it were it not for some hindrance. What would substantiate such a claim? Certainly, it would be well substantiated were we able to identify some mechanism in a person's brain which tends to lead to the utterance, which may be obstructed by certain factors, and which may be shown to be so obstructed in the case in point. But at present we are privy to no such mechanism.
Alternatively, we can interpret the claim as not a claim about an inner mechanism, but rather about empirical regularities. We may report that This is a dog is always uttered in the presence of a dog unless certain “hindering” factors occur. But for such a claim to have empirical content (to be, for example, testable) we would have to be able to specify the relevant factors. Otherwise the claim would be empty: any evidence would be compatible with it. (We would never be able to object that the claim is at variance with the facts: cases of speakers reacting to dogs with This is a dog would be in order, and cases of those not reacting in this way could always be explained away with reference to unspecified hindering factors.)
Are we able to give an exhaustive catalog of things or events that stop one exclaiming This is a dog in the presence of a dog? Can we say, for example, that principal factors are unwillingness to talk, preoccupation with other matters, or not noticing the dog in question? Hardly; we can clearly think of any number of others. Hence it seems that invoking the concept of disposition here is a mere illusion of explanation. Does this mean that there is, despite appearances, no intelligible connection between This is a dog and dogs after all? I do not think so; but I think that we tend to look at this connection in the wrong way. What I think is the case is that the connection is normative rather than causal. This is to say that the link between the occurrence and the utterance is not a matter of any causal mechanism connecting the two, but rather of the fact that to utter This is a dog when a dog is in focus is correct.9
However, without further ado, this would be at most a gesture towards an explanation (if not merely another trick). What is a non-causal, normative connection? Should one imagine some kind of supernatural fiber leading from dogs (and other potential objects of reference) to the minds of speakers? It is clear that unless we give a viable clarification, this alleged explanation would not really be useful. Hence it is the clarification to which we now turn our attention.
11.5 Correctness
To foreshadow where we expect our normative turn to lead us, let us consider an activity seemingly very different from using language: the game of football. What is important for us is that playing football amounts to enjoying a spectrum of actions that are not available to us outside of its framework: getting into an offside position, fouling an opponent, or (joy of joys!) scoring a goal. How do such actions become available to me? Obviously because I, as well as my team-mates and our opponents, submit to the rules of football – it is the rules, and in particular the collective submission to them, that open up the space for the new kind of actions. And the thesis I want to put forward and discuss is that linguistic actions, actions that we tend to describe as cases of meaningful talk, transfer of information, or stating facts (or whatever else one can do with meaningful language), arise analogously: namely as a result of our collective recognition of the rules of language.
This recognition means nothing over and above the fact that we take certain linguistic utterances for correct, and others for incorrect. (This may be the case on several disparate levels – an utterance may be, e.g., grammatically correct while being incorrect as an assertion.) We know that it is correct to say This is a dog when pointing at a dog (whereas it is incorrect when pointing at a bus); we know that it is incorrect to dissent from This is an animal while assenting to This is a dog; and we know that it is correct to raise one's hand when assenting to Will you raise your hand? Thus, a rule in this sense is a matter of a collective awareness, of an awareness that something is correct and something else is incorrect, leading to the appropriate behavior (praising the correct and trying to do away with the incorrect).
Let us start from the question of how we recognize the presence of a normative link of the kind discussed above. (After all, as Quine reminded us, we all learn to speak by means of observing our elders and peers and as what we can perceive is exclusively behavior – hence, we can say, there cannot be anything in meaning that was not in behavior before.) How does a language-learner recognize that This is a dog is “normatively linked” to dogs (rather than horses) and so grasp the meaning of the sentence (and especially of the word dog)?
When learning a language we may witness a demonstration of using This is a dog as accompanying pointing at a dog. However, though this can indicate the existence of a link, it cannot intimate the nature of the link, let alone that it is a normative link. Given our genetic tendency to imitate, we may come to utter This is a dog when pointing at a dog ourselves; but nothing apparently stops us from uttering it when pointing at things other than dogs – say, all furry things, or even at anything whatsoever.
The decisive step here is that we must learn that using it when pointing at something that is not a dog is incorrect. How do we learn this? By experiencing some kind of “social friction,” by facing the “corrective reactions” of other speakers to such misuse (our own or somebody else's). What constitutes such a “corrective reaction”? It may be anything from a true punishment to mild dissatisfaction or puzzlement. In any case, one of the principal “social skills” anybody must master to be an integral part of his or her society is to recognize which kinds of behavior count as “corrective”; being able to feel this kind of “friction” appears to be among the most essential of them.
Hence the original encounter with the normativity of meaning is in this “must not”: We do not learn what we should do, but rather what we should not do. This is important, for this may help clarify one of the most frequent misunderstandings regarding the normativity of meaning: the normativity does not rid us of our freedom in using language and hence does not contradict the obvious fact that using language is a spontaneous activity – it merely restricts the freedom, still leaving a vast number of possibilities. (We will return to this later.)
This also indicates that the normativity of meaning is somehow carried by the corrective, or as we may say more generally, normative attitudes of speakers to other speakers’ pronouncements. And this may bring us to a suspicion that we have not done away with the concept of disposition we deemed suspicious above, but merely shifted it one level up: for do we not need the concept for the characterization of the concept of normative attitude? Can we say that normative attitudes consist in a “corrective behavior” – or do we have to say that they consist in the disposition to corrective behavior? After all, not everybody who uses language incorrectly faces correction by others!
It is true that the occurrence of generic “corrective behavior” is more easily predicted than individual pronouncements; but it is not, of course, the case that a pronouncement is wrong only if some such behavior actually occurs. Here we may avoid the concept of disposition in the same way as before: instead of saying that an utterance is incorrect either if it faces corrective behavior, or if others would have a disposition to “corrective behavior” towards it, we can say that it is incorrect if corrective behavior towards it would be correct.
However, this definitely looks like a trick, and in this case a particularly naive one. (Am I criticizing my colleagues for pulling rabbits out of their hats only to end up pulling one out myself?) Is it not merely shifting the whole problem to a third level and thus possibly starting an infinite regress? The attempt to reduce correctness to behavior or to dispositions to behavior would indeed lead us to an infinite regress; but my answer is that this is to be taken as indicating that the reduction is impossible.
But is this, then, not a mere reductio ad absurdum of the whole idea of the “normativist turn”? Am I suggesting that behind (or above) human behavior (and the whole network of causal relationships in which it is embedded) there looms some different, supernatural stratum of reality where we can encounter correctnesses? Of course not, though admittedly it may sometimes be useful to invoke this picture as a metaphor. The point is rather that there are no such things as correctnesses. The reason they seem to be here is that we seem to state facts about them; but what look like declarative statements about such correctnesses – I will call these statements normatives – are not always really declaratives. Hence what is behind the untranslatability of the normative idiom into the declarative one (and hence the irreducibility of “norms” to “facts”) is not the incommensurability of a “normative” and a “factual” stratum of reality, but simply the more mundane fact that many normatives are not genuine declarative sentences at all, but rather belong to a different species of speech act. They are, as Wilfrid Sellars (Reference Sellars1962: 44) put it, “fraught with ought.”
Return to the case of football, and consider the statement Hands should not touch the ball. This is a normative. There are two ways of employing a statement of this kind. First, one can state the fact that this kind of rule is in force in some community. This is, as it were, an “outsider” statement: a statement made by a disengaged observer describing the practices of the community in question. Second, one can state it as an “insider”, which does not amount (or does not amount merely) to stating a fact, but also to upholding the rule, urging its propriety, or at least confirming its legitimacy. And true normatives are normatives posed precisely from this perspective. It follows that to say that rules exist is, strictly speaking, a metaphor: they do not exist, of course, in the way rocks, trees, or dolphins do. To say that a rule exists is to take some true normatives people use for ordinary declaratives. It seems to be our human way to do this; but we should be aware that this is a sense of existence different from the one in which we use the word when we talk about the existence of spatiotemporal particulars and their constellations.
And here we come to the mystery of how correctnesses, or proprieties, can exist relatively independently of our attitudes, and yet without being situated in some independent stratum of reality. The point is that any verdict we reach regarding correctness is at best tentative: it belongs to the nature of the concept that any such verdict is always amendable by our successors. They can find out that what we hold to be correct is in fact not correct – but unlike in the case of blue, indivisible, or animate, we do not always see such cases as errors in the application of the concept, but rather as discoveries of the true nature of the concept of correctness.
Perhaps we can say that a rule or a norm is always an unfinished and open project (see Gauker, Reference Gauker2007). Usually it grows out of our current practices, is continuous with them, sometimes to the extent that we can understand its statement as a description of the practices as they, as a matter of fact, are; but the statement of a norm is also usually its prolongation, an urge “And we should go on like this!” Hence a rule is never a completed whole, it is always open to the future, not only to prolongations, but also to modifications and amendments. It is like a track that we must go on extending to ever new horizons.
11.6 Norms and convention
I have mentioned the concept of convention as a “rabbit” which is usually pulled out of the semanticist's hat when we need to say what ties an observation report to the observation it reports. But at this point someone might well wonder whether the theory I have been developing in terms of the concepts of norms and normativity is not simply about what semanticists have long been referring to by means of the term “convention.” Well, in one sense it is. However, this is due to the fact that the term is highly ambiguous, and some of its senses do refer to normative phenomena.
As Rescorla (Reference Rescorla2007) points out in his overview article, “in everyday usage, ‘convention’ has various meanings, as suggested by the following list: Republican Party Convention; Geneva Convention; terminological conventions; conventional wisdom; flouting societal convention; conventional medicine; conventional weapons; conventions of the horror genre.” He offers a useful list of possibilities that the “conventional” may be contrasted with: “the natural; the mind-independent; the objective; the universal; the factual; and the truth-evaluable.” This indicates that to deal with the concept of convention, we need to start with disambiguation. I think there are at least three senses of the term relevant for the theory of language. In the first sense, “convention” is something like a habit; in this sense, “the conventional” is, in the words of Goodman (Reference Goodman and Krausz1989: 80), “the ordinary, the usual, the traditional, the orthodox as against the novel, the deviant, the unexpected, the heterodox”. In the second sense, convention is what has been established by man and has not been part of nature all along; in this sense, “the conventional” is, using Goodman's words again, “the artificial, the invented, the optional, as against the natural, the fundamental, the mandatory” (ibid.). In the third sense, “the conventional” is what has been explicitly agreed upon.
I think that the attractiveness of the term “convention” stems largely from the conflation of these three senses. When we encounter a problem concerning the way language hooks onto the world, we often invoke the term in the second sense. Surely the sentence – that is, the sound-type – “This is a dog” is not naturally connected with dogs; hence the relation is conventional. So far so good; but aside from giving the relationship a label, nothing has yet been explained. The next step, however, is often the claim that nothing really needs to be explained, for the concept of convention is more or less self-explanatory. Whereas this might be true for “convention” in the third, or perhaps also in the first, sense, it is definitely not true for “convention” in the second sense, for this sense remains entirely neutral with respect to how conventions come into being and what their nature is. And it is clear that if we use the term to account for how language hooks onto the world, it cannot generally be convention in the third sense of the word: language cannot be based on this kind of convention, for language is presupposed by it.10
Could it be “convention” in the first sense, convention as a habit, that holds the sentence and dogs together? Habits seem perspicuous enough; so maybe it is “convention” in this sense that those who use the term as the universal unexplained explainer have in mind. However, “conventions” in this sense clearly do not overlap with our normative account of language and discourse: habits as such do not have any normative dimension. If my habit is to go for a walk every evening, it may be surprising that I do not go out today, but it is in no sense wrong.
Habits, to be sure, may evolve into norms. Once people start to take the habitual not merely as what usually happens, but rather as what should happen, there emerges a norm – or, you may want to say, the habit becomes a norm. But here the latter step is crucial. For consider chess or football, which we use as our models of our discursive practices. People may acquire the habit of not taking the ball into their hands; but the game cannot really get off the ground until this starts to be felt as what should not be done and until those who keep doing it start to be penalised. Hence the habitual substrate is surely not everything that makes up norms.
When you look into the writings of Austin (Reference Austin1961; Reference Austin1962), Grice (Reference Grice1989), Searle (Reference Searle1969) or other speech act theorists, you find that the terms “convention” and “conventional” are among the most frequently used words; yet none of these authors pays much explicit attention to the question of what conventions are. Of course there are some hints: Austin (Reference Austin1961: 64), for example, when mentioning a “semantic convention,” adds the parenthesis “(implicit, of course),” which indicates that what he has in mind is not “convention” in our third sense of the word. Elsewhere he talks of an assertion justified “not merely by convention, nor merely by nature,” which in turn indicates that he uses “conventional” in opposition to “natural” – hence in the second of our three senses.
There is a flagrant disproportion between the huge explanatory work the concept of convention is supposed to do in such writings and the absence – or near absence – of its own explanation. This can be, to a great extent, justified by the fact that these authors use the terms “convention” and “conventional” in a sense that they do not see as being in need of explanation – hence, I would think, mostly in our second sense. But sooner or later we must face the question of how this kind of conventionality comes into being.
The first person who realized that this is a serious problem was David Lewis (Reference Lewis1969). He clearly saw that the concepts of convention and conventionality that occur so frequently in writings about language cannot generally be construed as explicit agreements, and set himself the task of showing how “tacit conventions” can come into being. His solution to this problem is based on two assumptions: conventions come into being to solve coordination problems, and solutions to such problems can evolve spontaneously along the lines envisaged by game theory.
I think that although Lewis laudably brought the nature and emergence of conventions into the focus of attention and showed how some tacit conventions may emerge spontaneously (thus breaking out of the vicious circle into which we would fall if we wanted to base language on explicit conventions), his approach is not general enough. In particular, I think that by no means all the norms language is based on can be seen as deriving from conventions solving coordination problems – at least not unless we generalize the concept of a coordination problem to the extent that it is no longer explainable in Lewis's simple game-theoretical terms.
Consider chess or football again. Can we say that their rules are a matter of conventions? Obviously, we can; in fact it would seem that the rules of games or sports are prototypes of what we would call conventional. (As we saw, there might be some terminological disputes over whether we should say that the rules themselves are conventional, or whether they evolved from conventions, but this is not important now.) But can we see them as solutions to coordination problems? This does not seem to be too plausible.
The fact is that, as I have argued elsewhere (see Peregrin, 2011), from the game-theoretical viewpoint the basic kind of rules relevant for language is more akin to those governing games of the Prisoner's Dilemma kind (solving genuine conflicts) than to those of coordination.11 Thus I do not think that the sector of game theory Lewis took into account is general enough to account for the problems standardly addressed under the heading of normativity.
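The contrast between Lewis-style coordination games and Prisoner's Dilemma-style conflicts can be made concrete with a small sketch. The payoff numbers below are illustrative assumptions, not figures from the literature; the point is only that a coordination game has two mutually satisfactory equilibria (so a convention can stabilize through self-interest alone), whereas the Prisoner's Dilemma has a single equilibrium that is collectively suboptimal, so sustaining the cooperative rule plausibly requires something beyond convergent self-interest, such as normative enforcement.

```python
# Illustrative payoff matrices (hypothetical numbers): each entry maps a
# pair of actions to (row player's payoff, column player's payoff).
COORDINATION = {  # e.g. both drive on the same side of the road
    ("left", "left"): (2, 2), ("left", "right"): (0, 0),
    ("right", "left"): (0, 0), ("right", "right"): (2, 2),
}
PRISONERS_DILEMMA = {  # a genuine conflict of interest
    ("cooperate", "cooperate"): (3, 3), ("cooperate", "defect"): (0, 5),
    ("defect", "cooperate"): (5, 0), ("defect", "defect"): (1, 1),
}

def pure_nash_equilibria(game):
    """Return action pairs from which neither player gains by deviating alone."""
    actions = sorted({a for pair in game for a in pair})
    equilibria = []
    for r in actions:
        for c in actions:
            row_ok = all(game[(r, c)][0] >= game[(alt, c)][0] for alt in actions)
            col_ok = all(game[(r, c)][1] >= game[(r, alt)][1] for alt in actions)
            if row_ok and col_ok:
                equilibria.append((r, c))
    return equilibria
```

In the coordination game both equilibria are also the best outcomes for both players, which is what lets a tacit convention emerge spontaneously; in the Prisoner's Dilemma the unique equilibrium (mutual defection) is worse for both players than mutual cooperation, so mere convergence does not stabilize the "cooperative" rule.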
On the other hand, I repeat that it is possible to see a large overlap between “normativity” and “conventionality.” One thing to keep in mind, however, is that this overlap is due to the catholic nature of the concept of conventionality. Moreover, the sense of “convention” for which the overlap obtains is itself insufficiently explained, and the present considerations may be seen as a contribution to its explanation.12
11.7 Normative speech act theory?
The ideas expounded in the previous sections have led to the project of normative pragmatics, first explicitly formulated – on a very general level – by Brandom (Reference Brandom1994: Chapter 1).13 Brandom's tenet is that
it is only insofar as it is appealed to in explaining the circumstances under which judgments and inferences are properly made and the proper consequences of doing so that something associated by the theorist with interpreted states or expressions qualifies as a semantic interpretant, or deserves to be called a theoretical concept of a content. (Brandom, Reference Brandom1994: 144)
In this sense semantics must be “answerable to pragmatics,” namely to normative pragmatics.
When Searle (Reference Searle1969), in his classic book about speech acts, elaborated on the Gricean and Austinian speech act theory, his major example, discussed in the third chapter of the book, was the act of promising. His initial characterization of this act reads as follows (Searle Reference Searle1969: 57–61):
Given that a speaker S utters a sentence T in the presence of a hearer H, then, in the literal utterance of T, S sincerely and non-defectively promises that p to H if and only if the following conditions 1–9 obtain:
1. Normal input and output conditions obtain.
2. S expresses the proposition that p in the utterance of T.
3. In expressing that p, S predicates a future act A of S.
4. H would prefer S's doing A to his not doing A, and S believes H would prefer his doing A to his not doing A.
5. It is not obvious to both S and H that S will do A in the normal course of events.
6. S intends to do A.
7. S intends that the utterance of T will place him under an obligation to do A.
8. S intends (i-I) to produce in H the knowledge (K) that the utterance of T is to count as placing S under an obligation to do A. S intends to produce K by means of the recognition of i-I, and he intends i-I to be recognised in virtue of (by means of) H's knowledge of the meaning of T.
9. The semantical rules of the dialect spoken by S and H are such that T is correctly and sincerely uttered if and only if conditions 1–8 obtain.
This characterization involves a normative notion, namely the notion of obligation (condition 7). Searle then points out that his characterization leaves no room for insincere promises, and proposes to replace condition 6 with 6a:
6a. S intends that the utterance of T will make him responsible for intending to do A.
Thus he introduces the normative notion of responsibility. (As the concepts of obligation and responsibility may be interdefinable, this is perhaps not another normative notion, but merely a reiteration of the original one.)
However, is the list, and especially the role of the normative notions in it, formulated adequately? Does one, in making a promise, intend to be placed under an obligation? Of course, as we assume that a typical promise is an intentional act, we would tend to assent; but is this inevitable? Suppose that I agree, say in written form, to return to somebody some money he lends me. Then suppose that I do not do so, and when my creditor sues me, I tell the court that I did not really intend to take on this obligation, hence that my act was not really a promise. (And let us suppose this is all true.) Am I likely to win the trial on the basis of proving that I have not promised anything? (It is hard to imagine how my declaration about my intention – if we understand the term “intention” in the sense of Grice and Searle as a basically internal act – could be challenged, for I alone have direct access to it.)
In view of this fact we can consider replacing 7 with 7*:
7*. The utterance of T will place S under an obligation to do A.
Then, it would seem, some of the other entries on the list may become superfluous. Consider, for that matter, 6 or 6a. Suppose that someone promises to give me some money, but in fact does not intend to be responsible for it. Does this mean that what he does is not promising?
In fact, it would seem that the rest of the list might also dwindle (if not vanish completely). True, the inflation of point 7 into 7* results from using a more robust concept of obligation than the one used by Searle: insofar as I understand him, what he has in mind is obligation as a matter of psychology, whereas what I suggest is obligation in the sociologico-institutional sense. Hence inflating the normative dimension of the act also involves moving the act “out of the head,” into the open. This means that some or all of the differences between Searle's account and the proposal made here may be terminological.
This does not mean, though, that the difference between them is insubstantial. The normative twist given to speech act theory involves a significant reinterpretation of the whole enterprise: instead of having psychological states as its direct concern, it now concentrates on normative statuses. Unlike its traditional version, it breaks away entirely from the psychology of communication; this is the result of the conviction that language, and especially meaning, is more an institution than a psychological reality, and that the psychology of communication is no more directly connected to communication than the psychology of chess is to chess.14
Moreover, a normative speech act theory must make it plausible that not only speech acts like promising or ordering, which it can handle relatively easily, but also speech acts such as asserting can be characterized in normative terms. This is a much harder nut to crack (see, e.g., Pagin, Reference Pagin2004, for a skeptical viewpoint). The idea here is that making an assertion is nothing over and above assuming the commitment to provide a specific justification, and entitling anybody else to reassert the sentence in question while deferring its justification to you.
Kukla and Lance (Reference Kukla and Lance2009) proposed a normative version of speech act theory, according to which speech acts are generally characterized by the normative conditions of their appropriateness and the normative outcomes of their occurrence. Thus, for example, ordering is appropriate only if the orderer is in some sense superior to the orderee; and in such a case its normal felicitous outcome is a commitment on the part of the orderee to do what he or she was ordered to do. Using this unusual key to the classification of speech acts yields an unusual classification: for example, the usual category of assertions divides into declaratives and what Kukla and Lance call observatives. The two acts differ in their conditions of appropriateness. Declaratives, assertions like There is a pig in the yard, are indiscriminately available to anybody, whereas only certain people are entitled to observatives, assertions like Lo, a pig!
Another peculiar kind of speech act which has surfaced since Kukla and Lance put our linguistic practices under the normative lens is what they call vocatives, acts of the kind of Hey, you! While observatives have specific normative conditions (not everybody is entitled to make them) but general normative outcomes (they entitle everybody to make use of them), with vocatives it is the other way around: their normative conditions are general (everybody is entitled to make them), but their outcomes are specific (they entitle a specific individual to enter the ongoing language game). Kukla and Lance claim that the identification of such speech acts, which do not surface in traditional speech act theory, significantly advances our understanding of language.
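Kukla and Lance's two-dimensional classification – who may perform an act, and who acquires what from its performance – can be rendered as a toy inventory. The field names, scope labels, and the inclusion of ordering are my own illustrative simplifications of the idea, not their formal apparatus.

```python
from dataclasses import dataclass

@dataclass
class SpeechActType:
    """A speech act type individuated by its normative profile."""
    name: str
    condition_scope: str  # "general": anyone may perform it; "specific": only some
    outcome_scope: str    # "general": entitles everyone; "specific": one individual

# Hypothetical entries following the two-dimensional scheme described above.
SPEECH_ACTS = [
    SpeechActType("declarative", "general", "general"),   # "There is a pig in the yard"
    SpeechActType("observative", "specific", "general"),  # "Lo, a pig!" (only a witness)
    SpeechActType("vocative", "general", "specific"),     # "Hey, you!" (hails one person)
    SpeechActType("order", "specific", "specific"),       # a superior binds a subordinate
]

def acts_with(condition_scope=None, outcome_scope=None):
    """Filter the toy inventory by normative profile."""
    return [a.name for a in SPEECH_ACTS
            if (condition_scope is None or a.condition_scope == condition_scope)
            and (outcome_scope is None or a.outcome_scope == outcome_scope)]
```

Crossing the two dimensions makes it visible why observatives and vocatives, invisible in a classification by propositional content alone, fall out as distinct types: they occupy the two mixed cells of the grid.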
All of this, of course, presupposes that we accept that there is no meaningfulness without a normative dimension. This is, recall, the result of taking at face value the Wittgensteinian picture of our discursive practices as a cluster of language games, not only in the sense that the practices are heterogeneous, but also in the sense that they are essentially underlain by rules. Given this, any speech act is individuated by the way in which it fits into the normative scaffolding that constitutes the space providing the necessary substrate for such acts. And given this, in turn, our perspective on discursive practices shifts significantly, and may illuminate aspects hardly discernible from the perspective of Austin and Grice.
11.8 Conclusion
The traditional approach to language was based on the assumption that we must first explain how a word means something (which was, in turn, taken as tantamount to explaining how it can stand for that something), and that we would then be able to explain language as a product of the synergic effect of an assembly of meaningful words. The pragmatic turn in the twentieth century (especially in its pragmatist variety) inverted the perspective: the credo came to be that we must explain directly how language works, i.e. our linguistic practices, using the concept of meaning at most as an expedient in this enterprise.
What I have been exploring here is the possibility of giving this pragmatic turn a normative twist: of explaining meanings as roles vis-à-vis the rules of language. Let us return to football. As I noted above, once you accept its rules, you can do things which you were not capable of doing before. Note that this does not mean merely that some things you were capable of doing before (like kicking a round thing through a gate-like thing) now receive new descriptions (“scoring a goal”). Scoring a goal is not reducible to kicking. My team-mates and I might make precisely the same movements we make now, but without being caught in the normative network constituted by the rules of football we would not be scoring goals, and our movements would not have many of the effects they have now (like making us happy, making our opponents annoyed, bringing money to those who laid bets on us while causing those who bet against us to lose their money, etc.). In short, the rules of football constitute a new spectrum of actions not available to us before. And the amazing spectrum of things we can do with words is created analogously – by means of the rules of language.
When Carnap and Morris presented their division of the theory of language into syntax, semantics, and pragmatics, they gave, in effect, pride of place to semantics (relegating syntax to the auxiliary role of honing the vehicles that only semantics discloses as carrying meanings, and relegating pragmatics to the marginal role of sidekick to semantics). Moreover, Carnap then reconstructed semantics as a mostly logico-mathematical, armchair enterprise: the semantic theories he presents in his Introduction to Semantics (1942) or Meaning and Necessity (1947) do not seem interconnected, in any significant way, with any empirical investigation of natural languages. I think this was an unhappy development (and it justified the revolt of the many theoreticians of language who subsequently made pragmatics, rather than semantics, the centerpiece of the study of language15), and I want to ensure that the normative turn discussed here does not lead to a similar consequence.
It is true that normativity seems to be a tool found only in philosophers' toolboxes. Linguistics, one is tempted to say, is a down-to-earth science, and science describes how things are, not how they should be – so what use is there for normativity? The present chapter has tried to offer an answer, an answer as down-to-earth as possible. Human activities, be they chess or football or much more complex and socially important ones, are governed by rules – indeed they are constituted by this rule-governance. This is nothing mysterious or at odds with science – it is simply an empirical fact. And here I want to suggest that insofar as this applies also to language (which presupposes seeing language as a social institution rather than, say, a psychological reality), we may come to see that this enterprise has an important normative dimension, and that understanding this dimension may be essential for understanding language and discourse.
Work on this paper was supported by the research grant No. P401/10/0146 of the Czech Science Foundation.
12 Pragmatics in the (English) lexicon
12.1 Introduction
In this chapter I shall discuss only the lexicon of English, but the general principles seem to apply to many, if not all, other languages even though the minutiae do not. By “lexicon” I mean a rational model of the mental lexicon or dictionary. Although the way a lexicon is organized depends on what it is designed to do, it is minimally necessary for it to have formal (phonological and graphological), morphosyntactic (lexical and morphological categorization), and semantic specifications. Relations are networked such that formal specifications are (bi-directionally) directly linked to morphosyntactic specifications that are directly linked to semantic specifications – which, for the moment, subsumes pragmatic specifications. A lexicon must be accessible from three directions: form, morphosyntax, and meaning; none of which is intrinsically prior. Each of these three access points is, additionally, bi-directionally connected with an encyclopaedia. Haiman (1980: 331) claimed “Dictionaries are encyclopaedias” and certainly many desktop dictionaries contain extensive encyclopaedic information (e.g. Hanks Reference Hanks1979; Kernfeld Reference Kernfeld1994; Pearsall Reference Pearsall1998). The position taken here is that a lexicon is a bin for storing listemes,1 language expressions whose meaning is (normally) not determinable from the meanings (if any) of their constituent forms and which, therefore, a language user must memorize as a combination of form, certain morphosyntactic properties, and meaning. An encyclopaedia is a structured database containing exhaustive information on many (perhaps all) branches of knowledge. It therefore seems more logical that the lexicon forms part of an encyclopaedia than vice versa, but the actual relationship does not significantly affect this chapter. I assume that encyclopaedic information is typically, if not uniquely, pragmatic.
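The organization just described – listemes stored as pairings of form, morphosyntactic category, and meaning, with the lexicon accessible from any of the three directions – can be sketched minimally as follows. The entries, field names, and lookup function are hypothetical illustrations, not a claim about how the mental lexicon is actually implemented.

```python
# A toy lexicon: each listeme is a memorized triple of form, morphosyntactic
# categorization, and semantic specification (entries are illustrative).
LISTEMES = [
    {"form": "dog", "morphosyntax": "noun", "meaning": "canine animal"},
    {"form": "dog", "morphosyntax": "verb", "meaning": "follow persistently"},
    {"form": "cut", "morphosyntax": "verb", "meaning": "sever"},
]

def lookup(**criteria):
    """Access the lexicon from any direction: form, morphosyntax, or meaning.

    No access point is intrinsically prior; any combination of the three
    specifications can serve as the query.
    """
    return [entry for entry in LISTEMES
            if all(entry[key] == value for key, value in criteria.items())]
```

The same store thus serves production (meaning-to-form lookup) and comprehension (form-to-meaning lookup), reflecting the bi-directional links described above; an encyclopaedia would be a further structure linked to each access point.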
A lexicon is a bin for storing listemes for use by language speakers in any and all contexts. This is not to deny that new listemes are occasionally created; but the coining of a new listeme is a rare event, and the resources of a lexicon are normally adequate for all contexts that a speaker faces. Consequently the meanings of listemes are expected to be adapted, by semantic extension or narrowing, both concretely and figuratively, by speakers in using them and by hearers in interpreting them. Such lexical adjustment can be illustrated by the various meanings of the related listemes cut in (1).
(1) cut grass, cut hair, cut steel, cut the thread, cut the cards, cut your losses, cut out the middle man, cut the ties, to cut and run, cut the cackle, cut a class, cut someone socially, be a cut above, she's all cut up by the breakdown in her marriage, be cut to the quick, cut through the obfuscation, cut my finger, cut the tyres, cut the cake, cut a disk, a railway cutting, cut through the back lane, cut a [fine] figure
Most, if not all, of these seem to derive from a basic notion of severing, interpreted in various ways according to what is severed and/or the manner of severing (this could even apply to cut a figure). Similarly, it is well known that a color term may extend to shades very far from the focal color (Berlin and Kay Reference Berlin and Kay1969; MacLaury Reference MacLaury1997) as selected from, say, the Munsell Color Array; we can attribute this to the elasticity that language needs to have in order that it can usefully be applied to the world around us. In certain domains and in certain formulaic expressions color terms are used of hues vastly distant from the focal color. Take the domain of human appearance: terms like white, black, yellow, and brown have all been used to characterize the skin pigmentation of people of different races, often dysphemistically. These color terms are descriptively appropriate not so much in relation to the focal colors as in relation to each other: a white person is typically paler than the others and a black person darker; a yellow person is typically yellower than the others. The peoples of southeast Asia and Austronesia are often referred to as brown, despite the fact that peoples labeled black are often of similar brown skin color. So brown, too, functions by contrast with white, black, and yellow in this domain. In the domain of oenology, red wine does have a (usually dark) red tinge but white wine is only white by virtue of being paler than red wine; white wine is normally pale yellow or pale green. Clearly what determines the meanings of these particular sets of color terms is their comparative function: by means of very rough approximation to the focal color, they distinguish within a semantic field between different species of the kind of entity denoted by the noun they modify.
Pragmatics within the lexicon is largely an addition to the semantic specifications; for instance, it is useful to identify the default meanings and connotations of listemes. Default meanings are those that are applied more frequently by more people and normally with greater certitude than any alternatives. Bauer (Reference Bauer1983: 196) proposed a category of “stylistic specifications” to distinguish between piss, piddle, and micturate, i.e. to reflect the kind of metalinguistic information found in traditional desk-top dictionary tags like ‘colloquial’, ‘slang’, ‘derogatory’, ‘medicine’, ‘zoology’; such metalinguistic information is more encyclopaedic than lexical. So too is etymological information. Pustejovsky (Reference Pustejovsky1995: 101) specifies book as a “physical object” that “holds” “information” created by someone who “write[s]” it and whose function is to be “read.” Certainly, there is a relation between book, write, and read that needs to be accounted for either in the semantic specification or pragmatically – Pustejovsky represents it in terms of a network and networks are also used in frame semantics (Fillmore Reference Fillmore1982; Reference Fillmore and Brown2006; Fillmore and Atkins Reference Fillmore, Atkins, Lehrer and Kittay1992; FrameNet at http://framenet.icsi.berkeley.edu) and by Vigliocco et al. (2009). Category terms like noun, verb, adjective, and feminine are part of the meta-language, not the object language; but they also appear in the lexicon as expressions in the object language and there needs to be a demonstrable relation from object language to meta-language (and vice versa). It would seem incontrovertible that encyclopaedic data is called upon to interpret non-literal expressions like Ella's being a tiger; likewise, to explain the extension of a proper name like Hoover to denote vacuum cleaners and vacuum cleaning or the formation of the verb bowdlerize from the proper name Bowdler. 
I assume that, because many proper names are shared by different name-bearers, there must be a stock of proper names located either partially or wholly in the lexicon, even if they are stored differently in the brain (see section 12.9). The production and interpretation of statements like those in (2)–(3) requires pragmatic input.
(2) Caspar Cazzo is no Pavarotti!
(3) Harry's boss is a bloody little Hitler!
(2) implies that Caspar is not a great singer; we infer this because Pavarotti's salient characteristic was that he was a great singer. (3) is abusive because of the encyclopaedic entry for the name Hitler that carries biographical details of a particular infamous name-bearer. Such comparisons draw on biodata that are appropriate in an encyclopaedia entry for the person who is the standard for comparison but not appropriate in a lexicon entry; the latter should identify the characteristics of the typical name-bearer, such as that Aristotle and Jim are normally names for males, but not (contra Frege Reference Frege1892) the biographical details of any particular name-bearer – any more than the dictionary entry for dog should be restricted to a whippet or poodle rather than the genus as a whole.
One of the earliest investigations of lexical pragmatics was McCawley (Reference McCawley and Cole1978); McCawley (correctly) argued that a listeme (such as pink or kill) and a semantically equivalent paraphrase (such as pale red or cause to die) are subject to different pragmatic conditions of appropriateness that give rise to different interpretations, which he thought could be captured by general conditions of cooperative behavior such as Grice's cooperative maxims. He did not tackle the question of whether pragmatics intrudes on lexical entries. Nor does Blutner (Reference Blutner1998; Reference Blutner, Horn and Ward2004; Reference Blutner and Cummings2009). Blutner discusses pragmatic compositionality, blocking (if a listeme already exists to express a meaning, do not construct another one without good reason to do so),2 and pragmatic anomaly (recognized as early as Apollonius Dyscolus in Peri Suntaxeōs III.149; see Uhlig Reference Uhlig1883). The closest Blutner comes to pragmatics within the lexicon is discussing the interpretation of certain adjectives and institute-type nouns (Blutner Reference Blutner1998).
Carston (Reference Carston2002: Ch. 5), then Wilson and Carston (Reference Carston and Frápolli2007) discuss lexical narrowing (e.g. drink used for ‘alcoholic drink’), approximation (e.g. flat meaning ‘relatively flat’), and metaphorical extension (e.g. bulldozer used to mean ‘forceful person’). They argue that the same interpretive processes as are employed for literal utterances are used for narrowing, broadening, through to approximation and figurative usage in hyperbole and metaphor. Interpretation is triggered by the search for “relevance” constrained by the principle of least effort: “An input is relevant to an individual when it connects with available contextual assumptions to yield positive cognitive effects (e.g. true contextual implications, warranted strengthenings or revisions of existing assumptions)” (Wilson and Carston Reference Carston and Frápolli2007: 245). Inferences deriving from “explicature,” “implicature,” and context-based assumptions satisfy the expectation of relevance, which causes the interpretive process to stop at whatever interpretation a hearer judges satisfactory in the context of utterance.
Huang (Reference Huang2009) also deals with lexical narrowing, lexical blocking, and pragmatic anomaly and, in addition, contrastive focus reduplication. But (despite his title “Neo-Gricean pragmatics and the lexicon”) he has very little more to say about pragmatics in the lexicon than is found in Blutner or Wilson and Carston.
Copestake and Lascarides (Reference Copestake and Lascarides1997) identified the importance of noting in the lexicon the frequency of particular word senses, in a manner very similar to that independently proposed for a broader range of data by Allan (Reference Allan and Peeters2000; Reference Allan2001 and again in this chapter). Copestake and Lascarides (Reference Copestake and Lascarides1997: 140) write “For example, in the bnc [British National Corpus] diet has probability of about 0.9 of occurring in the food sense and 0.005 in the legislature sense (the remainder are metaphorical extensions, e.g. diet of crime).” In section 12.2 below I introduce a credibility metric like that of Copestake and Lascarides which applies to (some) nonmonotonic statements within the lexicon. I argue the case for nonmonotonic statements in the lexicon in entries for nouns in section 12.3 and for verbs in section 12.4. In section 12.5 I discuss the pragmatic intrusions into the interpreting of collectives and collectivized nouns. This leads naturally to a consideration in section 12.6 of the entries for animal nouns that may refer to either the animal's meat or its pelt (after Allan Reference Allan1981; Nunberg and Zaenen Reference Nunberg, Zaenen, Tommola, Varantola, Salmi-Tolonen and Schopp1992). Section 12.7 takes up the dictionary entry for and; 12.8 discusses the pragmatic component of lexicon entries for sorites terms; 12.9 looks at the place of “prefabs” or “formulaic expressions” in the lexicon and 12.10 tackles ways in which connotation might be incorporated into entries for listemes. Section 12.11 summarizes the chapter.
12.2 A credibility metric
In some of what follows it will be helpful to use a credibility metric for a proposition. The truth value of a proposition p hinges on whether or not p is, was, or will be the case. What matters to language users is not so much what is in fact true, but what they believe to be true.3 The credibility of p is what is believed, or is believed to be known, or is in fact known, with respect to the truth of p. Because most so-called “facts” are propositions about phenomena as interpreted by whoever is speaking, we find that so-called “experts” differ as to what the facts are (for instance, with regard to global warming, or what should be done about narcotics, or what is the best linguistic theory). Whether ordinary language users judge a proposition true or false depends partly on its “pragmatic halo” (Lasersohn Reference Lasersohn1999): in any normal situation Sue arrived at three o'clock is treated as true if she arrived close to three o'clock; the slack afforded by the pragmatic halo is restricted by a pragmatic regulator such as precisely or exactly in Sue arrived precisely at three o'clock or Sue arrived at exactly three o'clock.4 Mostly, though, truth or falsity is assigned by the ordinary language user on the basis of how credible the proposition is, and this is reflected in the way that language is produced and understood. There is a credibility metric such as that in Table 12.1, in which complete confidence that a proposition is true rates 1, represented as cred = 1.0, and complete confidence that a proposition is false rates cred = 0.0; indeterminability is midway between these two: cred = 0.5. Other values lie in between. (□ is the necessity operator, ⋄ is the possibility operator, ⊻ symbolizes exclusive disjunction, and ¬p means “not-p.”)
Table 12.1. The credibility metric for a proposition

In reality, one level of the metric overlaps an adjacent level so that the cross-over from one level to another is more often than not entirely subjective; levels 0.1, 0.4, 0.6, 0.9 are as much an artifact of the decimal system as they are independently distinct levels in which I have a great deal of confidence. Nonetheless, I am certain that some variant of the credibility metric exists and is justified by the employment of the adverbials (very) probably, (very) possibly, and perhaps in everyday speech. This metric is needed in some lexical entries, as we shall see.
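The behavior of such a metric can be sketched in Python. The anchor points (1.0, 0.5, 0.0) follow the text; the intermediate level boundaries and adverbial glosses below are illustrative assumptions, not a reproduction of Table 12.1.

```python
# Hypothetical sketch of a credibility metric: map a cred value in [0,1]
# to an everyday hedging adverbial. Boundaries are illustrative.

CRED_TRUE = 1.0          # complete confidence that p is true
CRED_UNDECIDABLE = 0.5   # truth of p is indeterminable
CRED_FALSE = 0.0         # complete confidence that p is false

def gloss_credibility(cred: float) -> str:
    """Map a credibility value to an everyday hedging adverbial."""
    if not 0.0 <= cred <= 1.0:
        raise ValueError("credibility must lie between 0.0 and 1.0")
    if cred == 1.0:
        return "certainly p"
    if cred >= 0.9:
        return "very probably p"
    if cred > 0.5:
        return "probably p" if cred >= 0.7 else "possibly p"
    if cred == 0.5:
        return "p is as likely as not-p"
    if cred > 0.0:
        return "possibly not-p" if cred > 0.3 else "very probably not-p"
    return "certainly not-p"
```

As the text notes, the cross-over between adjacent levels is largely subjective; the sketch simply fixes one arbitrary set of boundaries.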
12.3 Semantic specifications for bird and bull
Birds are feathered, beaked, and bipedal. Most birds can fly. Applied to an owl this attribute of flight is true; applied to a penguin it is false. Birds are sexed and a normal adult female bird can lay eggs. It is a defining characteristic that members of the female sex carry ova; I’ll label this function sxF (which can be glossed ‘sexual female’). Where they don't, or the ova are non-viable, the organism can count for our purposes as a gendered female, genF, but not sxF. Mostly, sexual females are gendered females too; see (4) where → indicates semantic entailment.
(4) Most(x)[sxF(x) → genF(x)]
Although we do speak of human eggs, nonetheless the default egg is from an oviparous genus such as a bird, so I’ll assume this characteristic ought to be noted in the lexicon.5 Based on Allan Reference Allan2001: 252, I propose that the semantic part of the lexicon entry for bird be (5), where ∧ symbolizes logical conjunction, +> indicates (defeasible) nonmonotonic inference (NMI), which could perhaps be referred to as an implicature and which is cancelled for species such as emus and penguins.

The lambda-operator is useful to identify an individual as having a number of properties jointly, e.g. being a member of the set of creatures that are at the same time feathered and beaked and bipedal. In (5) the line bird(x) +> ⋄fly(x) identifies that a bird is most probably capable of flight with a credibility rating of 0.7. In the case of a sparrow, the semantic component of the lexicon entry may look like (6); for a penguin, like (7).


For both (6) and (7) the oviparity of sxF sparrows and penguins is an entailment of their being birds. The credibility of a sparrow being able to fly is estimated at cred ≥ 0.99 (it might be injured), whereas the credibility of a penguin flying is 0 (its not-flying has a credibility of 1).
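The division of labor between necessary entailments and defeasible NMIs in entries like (5)–(7) can be modeled as a small inheritance structure. The cred values follow the text; the data layout and function name are my own hypothetical encoding, not the chapter's formalism.

```python
# Hedged sketch: lexicon entries with necessary entailments vs defeasible
# nonmonotonic inferences (+>), after the pattern of (5)-(7).

BIRD = {
    "entails": {"feathered", "beaked", "bipedal"},
    "nmi": {"can_fly": 0.7},          # bird(x) +> possibly-fly(x); cred = 0.7
}

SPARROW = {"isa": "bird", "nmi": {"can_fly": 0.99}}   # cred >= 0.99 (it might be injured)
PENGUIN = {"isa": "bird", "nmi": {"can_fly": 0.0}}    # flying has cred 0

def credibility_of_flight(entry, parent=BIRD):
    """A species-level NMI overrides (cancels or strengthens) the
    inherited default from the parent entry; otherwise the parent's
    default survives."""
    if "can_fly" in entry.get("nmi", {}):
        return entry["nmi"]["can_fly"]
    return parent["nmi"]["can_fly"]
```

On this encoding the oviparity of sxF birds would sit among the entailments, while flight remains a cancellable default.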
The first entry under bull in the Oxford English Dictionary (1989) is “The male of any bovine animal; most commonly applied to the male of the domestic species (Bos Taurus); also of the buffalo, etc.” Part of this is more formally stated in (8).
(8) ∀x[λy[bull(y) ∧ animal(y)](x) → λz[male(z) ∧ bovine(z)](x)]
I will ignore the facts identified in (9).
(9) male(x) → genM(x) +> sxM(x)
(8) is inaccurate because the noun bull is not restricted in application to bovines; it is also properly used of male elephants, male hippos, male whales, male seals, male alligators, and more. The initial plausibility of (8) is due to the fact that it describes the stereotypical bull. The world in which the English language has developed is such that bull is much more likely to denote a bovine than any other species of animal. Peripheral uses of bull are examples of semantic extension from bovines to certain other kinds of large animals; consequently they require that the context make it abundantly clear that a bovine is not being referred to. This is often achieved by spelling it out in a construction, such as bull elephant or bull whale, which is of greater complexity than the simple noun bull used of bovines – a difference motivated by the principle of least effort (Zipf Reference Zipf1949). There is no regular term for “the class of large animals whose males are called ‘bulls’, females ‘cows’, and young ‘calves’” so in Allan Reference Allan2001: 273 I coined the term *bozine to label it.6 The semantics of English bull is given in (10) from which the NMI of bovinity will be cancelled where the animal is contextually specified as giraffid, hippopotamid, proboscid, pinniped, cetacean, or crocodilian.

Once again we see a default interpretation being recorded as an NMI in the lexicon because of the salience of this particular characteristic, namely bovinity, of the default reference (i.e. the denotatum) for bull. (At first sight a salient meaning should be almost the opposite of a default meaning: something that is salient jumps out at you; by contrast a default is the fall-back state when there is no contextual motivation to prefer any other. On a second look, what qualifies a state to become the default is its salience in the absence of any contextual motivation to prefer another.) The credibility of ≥0.9 is based on my intuition. A search of ten corpora totaling about ten million words (the Australian corpus of English; Australian ICE; the Lancaster–Oslo/Bergen corpus of British texts; the London–Lund corpus; the Freiburg corpus of British texts; the Freiburg corpus of American texts; the Brown corpus of American texts; the Wellington corpus of written New Zealand texts; New Zealand ICE; Kenya–East Africa ICE) revealed no applications of bull to animals other than bovines, nor indeed were such searches useful in confirming or disconfirming any of the other credibility ratings in this chapter.
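The contextual cancellation of the bovinity NMI can be sketched as follows. The species list paraphrases the giraffid/hippopotamid/proboscid/pinniped/cetacean/crocodilian condition; the function and its names are hypothetical.

```python
# Minimal sketch (my formulation) of how the NMI of bovinity in an entry
# like (10) is cancelled when context specifies another *bozine species,
# as in 'bull elephant' or 'bull whale'.

NON_BOVINE_BULLS = {"elephant", "hippopotamus", "whale", "seal",
                    "alligator", "giraffe"}

def interpret_bull(context_species=None):
    """Default (cred >= 0.9): bull denotes a male bovine; a contextually
    specified species cancels the defeasible NMI of bovinity."""
    if context_species in NON_BOVINE_BULLS:
        return ("male", context_species)
    return ("male", "bovine")   # the defeasible default survives
```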
In this section I have shown that a lexicon entry can be constructed to indicate the necessary components of meaning for the entry and also the most probable additional components of meaning that obtain for most occasions of use but which may be canceled as a function of contextual constraints. These can be seen as prototype effects that, for instance, help distinguish cup from mug and bowl (see Labov Reference Labov, Farkas, Jacobsen and Todrys1978). Traditional Arab and Turkish coffee cups are small bowls with no handle, very similar in configuration to Chinese porcelain teacups. The typical Western teacup or coffee cup has a handle and is accompanied by a saucer. All these types of cup are bowl-like in shape though they are smaller, usually have higher sides, and serve a different function than most bowls. Cups are intended to be put to the lips to convey liquid to the mouth whereas liquid in food bowls is spooned into the mouth; otherwise a bowl is used for food preparation. These kinds of conditions (that distinguish cup from mug and bowl) are encyclopaedic and pragmatic rather than purely semantic.
For each lexicon entry the semantic identity of the listeme is presented as a meaning postulate – cf. (10); for instance, the noun bull is semantically represented by the predicate bull ranging over a variable for the entity denoted. Predicates like bull, animal, male, and bovine are not decomposed into semantic primitives but give rise to certain inferences some of which are necessary semantic entailments, others are probabilistic nonmonotonic inferences. Similar conditions apply to the verb climb, as we see in section 12.4.
12.4 Climbing
Jackendoff (Reference Jackendoff1985) identified some interesting characteristics of the verb climb. From (11) we understand that Jim climbed up the mountain – contrast (11) with (12). We also understand that he used his legs and feet – contrast (11) and (12) with (13).
(11) Jim climbed the mountain.
(12) Jim climbed down the mountain.
(13) Jim climbed (down) the mountain on his hands and knees.
Snakes, airplanes, and ambient temperature lack legs and feet they can use when climbing (with these actors climb is presumably a metaphorical extension), and since they cannot be said to climb down, some other verb must be employed.

In (17) the lexicon entry captures the fact that the default interpretation of climb presumes both upward movement, symbolized by ↑,7 and the use of feet (and therefore legs, too).

NMIs apply not just to nouns and verbs but potentially in any lexicon entry.
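The two independently cancellable defaults of climb can be sketched as below; the encoding is my assumption, but the cancellation conditions follow (11)–(13) and the discussion of legless climbers.

```python
# Sketch of the two defeasible defaults of climb in an entry like (17):
# upward motion and clambering with feet/legs. Each NMI is cancelled
# separately: 'down' cancels the upward default; a legless climber
# (snake, airplane, ambient temperature) cancels the use-of-feet default.

def interpret_climb(subject_has_legs=True, direction=None):
    """Return the surviving components of meaning for an occurrence
    of climb."""
    components = {"move"}
    if direction != "down":
        components.add("upward")        # climb +> upward motion
    if subject_has_legs:
        components.add("using_feet")    # climb +> clamber with feet/legs
    return components
```

So Jim climbed the mountain keeps both defaults, Jim climbed down the mountain cancels one, and The snake climbed the tree cancels the other.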
12.5 Collectives and collectivizing
Allan Reference Allan1976 and Reference Allan2001 discuss the semantics of collective nouns such as admiralty, aristocracy, army, assembly, association, audience, board, class, clergy, committee, crowd, flock, government, and collectivized nouns such as those italicized in (18)–(19).
(18) These three elephant my great-grandfather shot in 1920 were good tuskers, such as you never see today.
(19) Four silver birch stand sentinel over the driveway entrance.
A definition of collectivizing will be given shortly, but let's begin with familiar collectives.
Collective nouns allow reference to be made to either the set (collection) as a whole or to the set members. In many dialects of English (but not all) the different interpretations are indicated by NP-external number registration; consider (20).8

Whereas singular NP-external number registration indicates that the set as a holistic unit is being referred to, cf. (21), the plural indicates that the set members are being referred to (22). In these and later examples, X and Y are (possibly null) variables for NP constituents; NPsg is a singular NP, and NPpl is plural; x, y, z are sets, either unit sets (individuals)9 or multi-member sets, so one should understand from (21) and (22) that ∀x[∃y[y⊆x]].
(21) ∀x[NPsg[X Nhead[λy[many(y) ∧ collocated(y)](x)] Y]
→ combined_membership(x)]
(22) ∀x[NPpl[X Nhead[λy[many(y) ∧ collocated(y)](x)] Y]
→ constituent_members(x)]
Thus, (23) identifies the composition of the committee, while (24) identifies dissension among the membership of the committee.


NPs denoting institutions, e.g. the company I work for, the BBC, the university must be singular (NPSG in (27) and (28)) when the institution as a building, location, or single constituent body is referred to, as in (25), but can have plural NP-external registration when referring to the people associated with it (26).

The facts with respect to such collective nouns are represented in (27)–(29), where N0 is the form of the noun unmarked for number.
(27) ∀x∃z[N0[library(x)] → λy[many(y) ∧ book(y) ∧ collocated(y)](z) ∧ x⊇z]
+> ∃x[NPsg[X N0,head[library(x)] Y] ∧ institution(x)]
(28) ∀x[NPsg[X N0,head[institution(x)] Y] → constituent_body(x) ∨ site(x)]
(29) ∀x[NPpl[X N0,head[institution(x)] Y] → staff_members(x)]
There is no evidence in (20)–(29) of probabilistic representation being required in the lexicon. The different interpretations are indicated through morphosyntactic choices.
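The morphosyntactic disambiguation in (21)–(22) can be sketched directly; the function and its return labels mirror combined_membership and constituent_members, but the encoding is hypothetical.

```python
# Sketch of (21)-(22): NP-external number registration selects the
# holistic-set vs set-member reading of a collective noun.

def interpret_collective(noun: str, external_number: str) -> str:
    """Singular registration ('The committee has...') denotes the set as
    a holistic unit; plural ('The committee have...') denotes its
    constituent members."""
    if external_number == "sg":
        return f"combined_membership({noun})"
    if external_number == "pl":
        return f"constituent_members({noun})"
    raise ValueError("number registration must be 'sg' or 'pl'")
```

No credibility values are needed here, which is the point of the passage above: the choice is grammatically, not probabilistically, marked.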
Allan Reference Allan1976 and Reference Allan2001 identify a principle of N0 usage for English, given in (30).
(30) N0, the form of the noun unmarked for number, is used when the denotation for N is perceived not to consist of a number of significant similar units.
In a plural NP headed by N0, the absence of plural inflexion on the head noun marks “collectivizing.” Collectivizing signals hunting, conservation, or farming jargon because N0 is characteristically used of referents that are NOT perceived to be significant as individuals. Early users of the collectivized form were not interested in the individual animals except as a source for food or trophies. Consider the italicized nouns in (18)–(19) and (31)–(34), to which italics have been added.
(31) A three month shooting trip up the White Nile can offer a very good mixed bag, including, with luck, Elephant, Buffalo, Lion, and two animals not found elsewhere: Nile or Saddle-back (Mrs. Gray's) Lechwe and White-eared Kob. (Maydon Reference Maydon1951: 168)
(32) On the way back to camp we sighted two giraffe on the other side of the river, which were coming down to the water's edge to drink. (Arkell-Hardwicke Reference Arkell-Hardwicke1903: 285)
(33) These cucumber are doing well; it's a good year for them.
(34) The cat-fishes, of which there are about fifty distinct forms arranged in four families, constitute the largest group, with probably the greatest number of individuals per species. In some parts of the country where nets are little used and fishing is mainly done with traps and long lines, at least three-quarters of the annual catch is of cat-fish. (Welman Reference Welman1948: 8)
The plural NP “cat-fishes” at the beginning of (34) refers to species of cat-fish, whereas the N0 at the end refers to individuals caught by fishermen. Collectivizing of trees and other plants is much less common than collectivizing animals – from which, perhaps, it derives. Vermin are never collectivized; though individual language users may differ over what counts as vermin. Early uses of the collectivized form were applied to animals hunted for food or trophies. Today, collectivizing occurs in contexts and jargons of hunting, zoology, ornithology, conservation, and cultivation, where N0 is characteristically used of referents that, as I’ve already said, are not perceived to be significant as individuals. Two possible contributing factors to the establishment of N0 as the mark of collectivizing are (1) the unmarked plural of deer – which once meant ‘wild animal, beast’, and (2) the fact that meat nouns are N0 (discussed in the next section). Despite the fact that there is a good deal of variation in the data (see Allan Reference Allan1976: 100f.), collectivizable nouns should be marked as such in the lexicon. Reference will need to be made to the discourse domain being one of the contexts identified above and vermin will need to be excluded. The kind of entry I envisage is (35), which uses giraffe as an example.
(35) IF Domain = conservation THEN ∀x[NPpl[X N0[giraffe(x)] Y]]; cred ≈ 0.6
Clearly, more work is needed.
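An entry like (35) might be operationalized as follows. The licensing domains and the cred value of roughly 0.6 follow the text; the vermin examples and the function itself are illustrative assumptions.

```python
# Sketch of a domain-conditioned collectivizing rule like (35): bare N0
# plurals ('three giraffe') are licensed only in certain discourse
# domains, and vermin are excluded.

COLLECTIVIZING_DOMAINS = {"hunting", "zoology", "ornithology",
                          "conservation", "cultivation"}
VERMIN = {"rat", "cockroach"}   # membership varies across speakers

def collectivized_form(noun: str, domain: str):
    """Return (plural form, cred) for an NP headed by the noun: the bare
    N0 in a licensing domain, else the ordinary inflected plural."""
    if domain in COLLECTIVIZING_DOMAINS and noun not in VERMIN:
        return (noun, 0.6)          # cred ~ 0.6, as in (35)
    return (noun + "s", 1.0)
```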
12.6 Animals for food and fur
In this section I take up a discussion from Allan Reference Allan1981. Consider the sentences in (36)–(37).
(36) Harry prefers lamb to goat.
(37) Jacqueline prefers leopard to fox.
Most likely you will interpret the animal product nouns in (36) to refer to meat, such that (36) is paraphrasable by (38), whereas the animal product nouns in (37) refer to animal pelts such that (37) is paraphrasable by (39).
(38) Harry prefers eating lamb to eating goat.
(39) Jacqueline prefers leopard skin to fox fur.
The converse interpretations are unlikely, especially Jacqueline prefers eating leopard to eating fox.10 The predicate prefer in (36)–(37) offers a neutral context permitting the default animal product to rise to salience. This suggests that the lexicon entries for lamb and goat, and that for other creatures (such as whale, see (40)) should include a specific application of the formula in (41).
(40) In Tokyo, whale gets ever more expensive!

The lexicon entries for leopard and fox should include a specific application of the formula in (43); so will all of the italicized animal product nouns in (42).
(42)
a. Jacqueline was wearing mink.
b. Elspeth's new handbag is crocodile, I think.
c. This settee's made of buffalo.
d. The tannery has loads of impala right now.

A mass NP headed by an animal noun will refer to the pelt of the animal denoted by that NP when the clause contains an NP head or clause predicate describing apparel, accessories to apparel, furniture, the creation of an artifact, or any object likely to be made from leather, or any place or process that involves pelts, hides, or leather; these constrain the domain for the interpretation of N0. Thus the nonmonotonic inference in (41) is canceled by the implications of the lining in (44); from (43) the NMI is canceled by the predicate eat in (45).
(44) I prefer the lining to be made of lamb, because it's softer.
(45) All we had to eat was leopard.
More subtle interpretations are required in (46)–(49).
(47) The girl holding the plate was wearing rabbit.
(48) The girl who wore mink was eating rabbit.
(49) Because she decided she preferred the lamb, Hetty put back the pigskin coat.
In (46) “plate of lamb” identifies meat. Although the most likely interpretation of a plate of steel is ‘a plate made of steel’ (cred ≥ 0.99), a plate of lamb is, with similar credibility, interpreted as ‘a plate bearing food’. The predicate “wearing rabbit” in (47) identifies the rabbit pelts as apparel (again, cred ≥ 0.99) and, likewise, “wore mink” in (48) identifies mink as apparel while the predicate in “eating rabbit” coerces the reference to rabbit meat. In (49) “the lamb” is most likely to be interpreted as meat (cred ≥ 0.8) until this is revealed as a “garden-path” misinterpretation, corrected by the preference for a porcine pelt in the second clause, which cancels the original NMI, replacing it with the coerced interpretation “lambskin coat”.
In this section I have claimed that animal nouns in mass NPs which denote a product from the dead animal typically refer to either the animal's flesh or its pelt, but this probabilistic inference can be canceled by certain contextual elements that condition the domain for interpretation. Credibility rankings can be assigned as shown in (50); however, the rankings in (50) are based on my intuition, whereas they ought to be based on the frequency of interpretations retrieved from large and diverse corpora.
(50) NPmass [N[λy[lamb(y) ∧ animal(y)](x)]] +> meat_of(x); cred ≥ 0.8
IF NOT meat_of(x) THEN pelt_of(x)
NPmass [N[λy[goat(y) ∧ animal(y)](x)]] +> meat_of(x); cred ≥ 0.7
IF NOT meat_of(x) THEN pelt_of(x)
NPmass [N[λy[rabbit(y) ∧ animal(y)](x)]] +> meat_of(x); cred ≥ 0.7
IF NOT meat_of(x) THEN pelt_of(x)
NPmass [N[λy[leopard(y) ∧ animal(y)](x)]] +> pelt_of(x); cred ≥ 0.9
IF NOT pelt_of(x) THEN meat_of(x)
NPmass [N[λy[fox(y) ∧ animal(y)](x)]] +> pelt_of(x); cred ≥ 0.9
IF NOT pelt_of(x) THEN meat_of(x)
NPmass [N[λy[mink(y) ∧ animal(y)](x)]] +> pelt_of(x); cred ≥ 0.9
IF NOT pelt_of(x) THEN meat_of(x)
NPmass [N[λy[buffalo(y) ∧ animal(y)](x)]] +> pelt_of(x); cred ≥ 0.8
IF NOT pelt_of(x) THEN meat_of(x)
NPmass [N[λy[crocodile(y) ∧ animal(y)](x)]] +> pelt_of(x); cred ≥ 0.8
IF NOT pelt_of(x) THEN meat_of(x)
NPmass [N[λy[impala(y) ∧ animal(y)](x)]] +> pelt_of(x); cred ≥ 0.7
IF NOT pelt_of(x) THEN meat_of(x)
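The default-plus-cancellation pattern listed in (50) can be sketched as a single mechanism. The defaults and cred values follow (50); the cue sets and function are my own hypothetical encoding of the domain-constraining elements discussed above.

```python
# Sketch of (50): each animal noun in a mass NP carries a defeasible
# default product reading (meat or pelt) which contextual cues can flip.

DEFAULTS = {   # noun: (default product, cred)
    "lamb": ("meat", 0.8), "goat": ("meat", 0.7), "rabbit": ("meat", 0.7),
    "leopard": ("pelt", 0.9), "fox": ("pelt", 0.9), "mink": ("pelt", 0.9),
    "buffalo": ("pelt", 0.8), "crocodile": ("pelt", 0.8),
    "impala": ("pelt", 0.7),
}
MEAT_CUES = {"eat", "eating", "plate", "taste"}       # illustrative
PELT_CUES = {"wear", "wearing", "lining", "coat", "tannery"}

def product_reading(noun: str, clause_words: set) -> str:
    """Cues for the non-default product cancel the NMI (IF NOT default
    THEN the alternative); otherwise the default survives."""
    default, cred = DEFAULTS[noun]
    other = "pelt" if default == "meat" else "meat"
    if clause_words & (PELT_CUES if default == "meat" else MEAT_CUES):
        return other
    return default
```

On this sketch lamb in a clause containing lining comes out as pelt, and leopard in a clause containing eat comes out as meat, matching (44)–(45).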
It would seem obvious that there should be some generalization over nouns that can refer to either meat or pelts; one might refer to the degree of choice between these two alternatives as “graded salience” (Giora Reference Giora2003: 10 and this volume), but this notion is yet more relevant in the lexicon entry for and.
12.7 And
And may conjoin all sorts of sentence constituents and whatever is felicitously conjoined is grouped together such that there is always some plausible reason for the grouping. This “plausibility” valuation is a coherence metric and necessarily pragmatic because it relies on knowledge of whatever world is spoken of; later, I shall question whether it is relevant to the lexicon entry for and. With the exception of some conjoined NPs that I will refer to as NP-*com-Conjunction (and briefly exemplify in (61)–(65)), the conjoined constituents are synonymous with a conjunction of sentences, e.g. in (51e) ‘Two is a number ∧ Three is a number’.
(51)
a. Sue is tall and slim.
b. Eric was driving too fast and hit a tree.
c. Elspeth always drove slowly and carefully.
d. Joe and Harriet are tall.
e. Two and three are numbers.
On the assumption that Φ and Ψ are well-formed (combinations of) propositions expressed as well-formed conjunctions in English, the semantics of Φ and Ψ is as presented in (52). There is, in addition, a series of nonmonotonic inferences that exemplify Giora's “graded salience” (Giora Reference Giora2003: 10); they are listed with the strongest contextually possible inference as the first to be considered.
(52) Φ and Ψ ↔ Φ ∧Ψ
a. IF cred(¬Φ → ¬Ψ) ≥ 0.9 ∧ cred(cause(Φ,Ψ)) ≥ 0.8 THEN Φ and Ψ +> Φ causes Ψ (e.g. Flick the switch and the light comes on; cause ≺ effect11) ELSE
b. IF cred(enable([do(Ø,Φ)],Ψ)) ≥ 0.9 ∧ cred(¬Φ → ¬Ψ) ≥ 0.8 THEN Φ and Ψ +> Φ enables the consequence Ψ ∨ Φ is a reason for Ψ (e.g. Stop crying and I'll buy you an ice-cream; action ≺ consequence) ELSE
c. IF cred(Φ ≺ Ψ) ≥ 0.8
THEN Φ and Ψ +> Φ and then later Ψ (e.g. Sue got pregnant and married her boyfriend; Φ ≺ Ψ) ELSE
d. IF cred(enable(Φ,[do(S,[say(S,Ψ)])])) ≥ 0.812
THEN Φ and Ψ +> Φ is background for Ψ (e.g. There was once a young prince, and he was very ugly) ELSE
e. Φ and Ψ +> Φ is probably more topical or more familiar to S than Ψ (e.g. On Saturdays my mum cleans the flat and Sue washes the clothes)
Note the conditional relations in (53):
(53) (Φ causes Ψ) → (Φ is a reason for or enables the consequence Ψ) → (Φ temporally precedes Ψ)13
Whether the last two discourse-based implicatures of (52) are part of this sequence remains to be determined. However, it is arguable that if Φ is background for Ψ then Φ is prior to Ψ; and if Φ is more topical or more familiar than Ψ, then again, it is arguable that Φ is prior to Ψ; and should these rather tenuous claims be acceptable, then the fact that Φ precedes Ψ when they are conjoined is normally iconic. However, the choice of sequence is a matter of usage (or pragmatics) and is not obligatory, but it does seem to justify a general statement such as (54):
(54) Φ and Ψ ↔ Φ ∧ Ψ
Φ and Ψ +> Φ is prior to Ψ; cred ≥ 0.9
Consider (from (52c)) Sue got pregnant and married her boyfriend: it is false (cred = 0) that Sue's getting pregnant literally causes her to marry her boyfriend, though it may be her reason for doing so, cred ≈ 0.4; but it is quite probable (cred ≈ 0.75) that her marriage to the boyfriend is a consequence of her being pregnant, whether or not he is the biological father-to-be. It is almost certain (cred ≥ 0.9), even though defeasible, that Sue's pregnancy precedes her marriage. Out of any natural context of use it is not possible to determine whether or not saying Sue got pregnant is a background for going on to say that she married her boyfriend. This aside, it has been possible to propose a (partial) lexicon entry for and which includes its implicatures in grades of salience. There seems to be no good reason to treat and as multiply ambiguous semantically when one core meaning can be identified (logical conjunction) and all other interpretations can be directly related to that as a hierarchy of nonmonotonic inferences processed algorithmically. As Ockham wrote: Numquam ponenda est pluralitas sine necessitate ‘Plurality should never be posited without necessity’ (Ordinatio Distinctio 27, Quaestio 2, Ockham Reference Ockham, Gál and Brown1967–88: I, K)
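The hierarchy in (52) is explicitly algorithmic, so it can be sketched as an ordered cascade; the condition labels and thresholds paraphrase (52a–e), while the dictionary-based encoding is my assumption.

```python
# Sketch of the graded-salience hierarchy in (52) for 'Phi and Psi':
# the strongest contextually supported NMI is selected first, ELSE the
# next, down to the weakest default in (52e).

def interpret_and(cred: dict) -> str:
    """cred maps condition labels to credibility values for the current
    context; missing labels default to 0.0."""
    c = lambda k: cred.get(k, 0.0)
    if c("not_phi_implies_not_psi") >= 0.9 and c("cause") >= 0.8:
        return "Phi causes Psi"                       # (52a)
    if c("enable_consequence") >= 0.9 and c("not_phi_implies_not_psi") >= 0.8:
        return "Phi enables / is a reason for Psi"    # (52b)
    if c("phi_precedes_psi") >= 0.8:
        return "Phi and then later Psi"               # (52c)
    if c("background") >= 0.8:
        return "Phi is background for Psi"            # (52d)
    return "Phi is more topical/familiar than Psi"    # (52e)
```

For Sue got pregnant and married her boyfriend, the cause condition fails (cred = 0) but temporal precedence is near-certain, so the cascade halts at (52c), as in the discussion above.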
Is it possible to define a plausibility measure for Φ and Ψ that is semantically based? I suspect not. At first sight the acceptability of (55) as against the unacceptability of (56) seems explicable semantically because only living things eat and if Max is dead he is no longer living and this is a semantic entailment of die.
(55) Max ate a hearty meal and died.
(56) *Max died and ate a hearty meal.
However, the situation seems pragmatically determined in (57)–(60): it is a matter of conventional beliefs about death, going to hospital, and going to heaven.
(57) Max went to hospital and died there.
(58) *Max died and went to hospital.
(59) Max died and went to heaven.
(60) *Max went to heaven and died there.
In NP-*com-Conjunction, *com is a ≥2-place predicate with a sense ‘is added to, is mixed or combined with, acts jointly or together with, is acted upon jointly or together with’ (Allan Reference Allan and Peeters2000: 196). It is found in (61), which is not semantically equivalent to (62) – contrast the latter with (51e).
(61) Two and three are five.
(62) *Two is five ∧ Three is five
A revealing recipe-like paraphrase of (61) is (63), which accounts for the fact that (64) is a paraphrase of (61).
(63) Take twox and take threey, combine them (*com(x,y)), and you get fivew, cf. Mix flourx and watery to make pastew or just Flour and water make paste.
(64) Two and three make five.
NP-*com-Conjunction is recognized when a conjunction of sentences either cannot apply or is unlikely to apply as in (61) and (65).
(65) Joe and his wife have a couple of kids.
The subject NP of (65) is most likely NP-*com-Conjunction whereas that of (66) is not. That these judgments rest on pragmatic rather than semantic plausibility can be seen by comparing them.
(66) Joe and his sister have a couple of kids.
(66) is, given social constraints on incest, most likely an infelicitous manner of expression where the conjunction is intended to be Φ and Ψ with the weakest of nonmonotonic inferences; preferred would be Joe and his sister each have a couple of kids. With respect to (65), although it is true that each of Joe and his wife has two kids, the sentence Joe and his wife each have a couple of kids suggests these derive from former relationships such that the married couple has four children altogether.
12.8 Sorites
Two horses don't constitute a herd nor do ten grains of sand constitute a heap. For collections such as these, denoted by sorites nouns,14 the number of constituents needed to render the description accurate depends on the nature of the constituents; for example, whereas the least lower bound on a herd of horses might be three, that on a heap of sand is probably more than a hundred. There are sorites predicates like be bald, be tall, be many and sorites adverbs like slowly, loudly. These are invariably gradable and contextually determined as may be seen from the contrasts in (67).
(67) tall for a Pygmy versus tall for a North American basket-ball professional15
many people thought George W Bush was a fool versus many of my students didn't attend class today
a slug moves slowly versus the train went through the station slowly
There is a similar contextual relevance for the nouns: a herd of horses, elephants, or giraffes will typically have fewer members than a herd of wildebeest, though this is not necessarily the case; moreover, it has no bearing on the lexical meaning of herd. The least lower bound on a heap of beans is lower than that on a heap of sand, probably because of the size of the constituent members. Clearly these are facts about the world referred to but are they facts about the meaning of listemes? No, but they are relevant to the propositions in which the listemes occur: for instance, if speakers wish to report the speed at which a slug is moving they need to apply different criteria than when reporting the speed at which a train is moving. It appears from work reported by Hagoort et al. (Reference Hagoort, Hald, Bastiaansen and Petersson2004) that the brain is prepared to do exactly that kind of thing and that contextual information is integrated with semantic information from the start; see also Terkourafi (Reference Terkourafi, Haugh and Bargiela-Chiappini2009b). However, as I’ve said, although this is relevant to the meaning of propositions, we can dispense with such enriched interpretations in the lexicon because they are instances of lexical adjustment: they count as “ad hoc categories” (Barsalou Reference Barsalou1983; Carston Reference Carston2002; Wilson and Carston Reference Carston and Frápolli2007) dependent on a particular domain of discourse. What we see in (67) is a context-induced specification of the meaning for the sorites words. The same holds for bald: various degrees of baldness are characterized in (68)–(70).
(68) His hair is thinning / thin ≈ He is balding / going bald / has a bald patch.
(69) He is bald.
(70) He is completely bald.
The domain of baldness extends from thinning (head) hair to its almost complete absence. It is arguable that (69) is applicable in situations where (68) or else (70) would also hold true, even though the accuracy of (69) might be disputed in favor of either (68) or (70). So, how sorites words should be specified in a lexicon is highly controversial.
Although they are not directly concerned with the lexicon, a large number of relevant proposals are discussed in Williamson (Reference Williamson1994), Beall (Reference Beall2003), and N. J. J. Smith (Reference Smith2008). They include supervaluation, subvaluation, and plurivaluation. Smith suggests “talk of the meanings of some terms must always be relative to a group of speakers, whose dispositions regarding the use of those terms play an essential part in fixing those meanings” (Smith Reference Smith2008: 314). This is a recasting of Quine's “There is nothing in linguistic meaning beyond what is to be gleaned from overt behavior in observable circumstances” (Quine Reference Quine1992: 38). To return to (69): what I suggest for the meaning of bald is the minimal semantics of (71).
Two speakers, or the same speaker on different occasions, may differ as to what counts as ‘not a full complement of hair’ such that x is bald has a range of truth values; i.e. there is no single state of hair-loss for which it is invariably true of x that x is bald for all occasions and all speakers. A modification like (68) is appropriate to the least lower bound and (70) to the greatest upper bound; (69) applies to both.
Defining sorites terms often invokes alternative points on the relevant scale. For instance, many implies a contrast with other points on a quantity scale; more precisely, less than most and greater than a few. In (72), |f∩g| can be glossed ‘the number of Fs that (are) G’.
(72) [many(x): Fx](Gx) → |f∩g| > [a_few(x): Fx](Gx)
+> |f∩g| < [most(x): Fx](Gx)
(I assume that a few x > few x > one x.) The domain referred to significantly affects the actual numbers, as we saw in (67). It is notable that to establish the truth of (73) we cannot look to a specific number because, even if it could ever be known, the precise number that justifies the use of “many” will differ for different speakers and even for the same speaker on different occasions.
(73) Many US citizens live in poverty.
Although the meaning of (73) falls under the definition in (72) there is also an implication, or perhaps connotation, that (according to the speaker) the number of US citizens living in poverty is greater than it ideally ought to be. Similar conditions hold for Many of my students were absent from class today, which does not imply that more than half of them weren't there, but that ‘more than one might have expected to be absent were in fact absent’ – and that could easily be as little as 5 per cent.
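The context-relative definition in (72) can be sketched computationally. The following Python fragment is an illustration only, not Allan's formalism: the threshold for a few and the fraction standing in for most are assumptions supplied by context, and the implicature (+>) is modelled, for simplicity, as a condition that happens to hold here, although a real implicature is of course cancelable.

```python
# Sketch of (72): 'many F are G' entails |F∩G| > the contextual 'a few'
# threshold and implicates |F∩G| < the contextual 'most' bound.
# Threshold values are illustrative assumptions, not claims about English.

def many_applies(F, G, a_few_threshold, most_fraction=0.5):
    """True if 'many F are G' is licensed under these contextual settings."""
    overlap = len(set(F) & set(G))                       # |f ∩ g|
    entailed = overlap > a_few_threshold                 # > a_few(x)
    implicated = overlap < most_fraction * len(set(F))   # +> < most(x)
    return entailed and implicated

# 'Many of my students were absent': in a class of 100, five absentees can
# count as 'many' if the contextual expectation (the a_few threshold) is
# as low as one or two.
students = range(100)
absent = range(5)
print(many_applies(students, absent, a_few_threshold=2))  # True in this context
```

The point of the parameters is precisely the one made above: the same cardinality licenses "many" in one context and fails to in another, because the thresholds are fixed pragmatically, not lexically.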
For sorites like tall and slowly it will be necessary to invoke, respectively, the height scale tall > average height > short and the speed scale slow < average speed < fast on condition that these apply to a particular domain or set of domains as shown in (67).
Sorites like herd and heap (in the sense of Eubulides’ soros) involve configurational criteria.


Suppose that three is the least lower bound for a herd or heap; the number of constituents is often many more, sometimes vastly more. There is
no upper bound. A heap of sand will typically have many more constituents than a heap of logs; though if the domain of discourse is an egg-timer on the one hand and a clear-felled forest on the other, there may not be such a discrepancy. There is no unique quantity that defines a heap, not even a heap of some particular substance; that is, there is no exact number that determines when a quantity of sand constitutes a heap; the roughly conical configuration is a necessary part of the requirement but is insufficient in itself – as is the condition on quantity. However, the semantic extension of heap(s) as in I have a heap of things to do and There were heaps of people at the party has lost all notion of a particular configuration; it is roughly synonymous with lots of or many and must be defined in a manner similar to (72).
12.9 Formulaic language in the lexicon
“A formulaic sequence is a sequence, continuous or discontinuous, that appears prefabricated and stored as a chunk, rather than being generated afresh” (Wray Reference Wray2008: 94). Just as metaphor is pervasive in language, so are “prefabs” – a useful term succinctly defined by Erman and Warren Reference Erman and Warren2000 as “specific conventionalized multiword strings.” Especially in the spoken language, people use thousands of them (just look, for example, at www.phrases.org.uk/index.html); but they are also markers of oral literature, religious texts, best-seller scripts, and popular radio and TV shows (see Allan Reference Allan2001, Reference Allan2006b; Corrigan et al. Reference Corrigan, Moravcsik, Ouali and Wheatley2009; Donahue Reference Donahue1991; Goldman Reference Goldman1990; Jackendoff Reference Jackendoff1995b; Jensen Reference Jensen1980; Kuipers Reference Kuipers2009; Paraskevaides Reference Paraskevaides1984; Schmitt Reference Schmitt2004; Wray Reference Wray2002, Reference Wray2008). Prefabs can be classed into at least three groups.
Idioms are primarily figurative; they include: a bit of the other; Bob's your uncle; by and large; come a cropper; fuck off; go the whole hog; kick the bucket; put a sock in it; rain cats and dogs; set store by; sleep like a log; spill the beans; sweat blood; the key to.
Clichés are primarily nonfigurative; they include: be heavily compromised; be not very well; believe you me; don't do anything I wouldn't do; Good Lord; Happy Birthday! Hot-dog! [= great!]; ladies and gentlemen; out of sight out of mind; reading, writing, and (a)rithmetic; to make a long story short; un je ne sais quoi; you can say that again; you'd better [do A].
Catchphrases include: Beam me up, Scotty; Computer says ‘No’; Frankly, my dear, I don't give a damn; It doesn't amount to a hill of beans; Not that there's anything wrong with it; One potato, two potato, three potato, four…; Play it again Sam; S/he loves me, s/he loves me not.
Subclassifications of these groups sometimes suggest themselves (e.g. imprecations, proverbs) and a prefab can often be classed into more than one of the three (e.g. be worth one's weight in gold).
Prefabs have similar characteristics to compounds and phrasal verbs in that, although they may have a variable slot, they are largely immutable and function as lexical islands phonologically and syntactically (Underwood et al. Reference Underwood, Schmitt, Galpin and Schmitt2004; Van Lancker et al. Reference Van Lancker, Canter and Terbeek1981; Wray Reference Wray2002, Reference Wray2008). Like proper names and tabooed terms (such as fuck) they seem to be stored in a different manner from the normal lexicon, perhaps in the right brain. The evidence for this is that people with left hemisphere trauma often have access to prefabs, proper names, and tabooed terms when they don't have normal access to ordinary language; furthermore, persons with right hemisphere damage use significantly fewer prefabs than normal subjects (Van Lancker Reference Van Lancker, Corrigan, Moravcsik, Ouali and Wheatley2009: 452). Lexicography has ignored the conclusion that different kinds of vocabulary are stored in different hemispheres of the brain, even though it could be relevant to classifying types of lexical data; I shall maintain this tradition.
A simplified lexicon entry for kick the bucket might be something like (76).
(76) /kɪk ðə bʌkət/ – [VP[V[kick]] NP[D[the] N[bucket]]] → die(x)

The ellipse in the figure contains encyclopaedic information that is clearly pragmatic yet according to Allan (Reference Allan2001) is outside of the lexicon. Traditionally such information is located in dictionaries, for instance, the Oxford English Dictionary (1989) labels kick the bucket “slang” and the Macquarie Dictionary (2003) describes it as “Colloquial” (it doesn't appear in Webster Reference Webster and Gove2002). Such descriptions, whether assigned to the lexicon or the networked encyclopaedia, are clearly pragmatic. The explanation for the meaning of kick the bucket is metonymic: in former times a bucket was a ‘beam’ and when an animal (such as a pig) was tied to the beam by its hind legs to be slaughtered, it would kick the bucket in the throes of death. But information about this source for the idiom is an encyclopaedic datum that is not generally known, and plays no part in the interpretation today of the idiom kick the bucket.
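An entry like (76), with its encyclopaedic information networked to but kept outside the lexicon proper, might be sketched as a simple record. This is a hypothetical data structure of my own devising, not a formalism proposed in this chapter; the field names (phon, syn, sem, encyclopaedia) are assumptions for illustration.

```python
# Sketch of a lexicon entry in the spirit of (76). The 'encyclopaedia' field
# stands for the networked encyclopaedic information (register labels,
# etymology) that the text argues is pragmatic and outside the lexicon proper.
from dataclasses import dataclass, field

@dataclass
class LexiconEntry:
    phon: str    # phonological form
    syn: str     # syntactic structure
    sem: str     # semantic specification
    encyclopaedia: dict = field(default_factory=dict)  # linked, not 'in', the lexicon

kick_the_bucket = LexiconEntry(
    phon="/kɪk ðə bʌkət/",
    syn="[VP [V kick] [NP [D the] [N bucket]]]",
    sem="die(x)",
    encyclopaedia={
        "register": "slang/colloquial",  # cf. the OED and Macquarie labels
        "etymology": "metonymic: 'bucket' = beam to which a slaughtered animal was tied",
    },
)
print(kick_the_bucket.sem)  # die(x)
```

On this design, deleting the encyclopaedia field leaves the grammatical entry intact, which mirrors the claim that the etymological datum plays no part in interpreting the idiom today.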
Unlike the meaning of the typical idiom, the meaning of a typical cliché is computable from its constituent parts. What marks the cliché is that it occurs frequently as the clichéd chunk (Bannard and Lieven Reference Bannard, Lieven, Corrigan, Moravcsik, Ouali and Wheatley2009: 300f., 304), and experimental evidence suggests that it is normally processed as a chunk and not according to its constituent parts (Underwood et al. Reference Underwood, Schmitt, Galpin and Schmitt2004; Wray Reference Wray2002, Reference Wray2008). I suggest that clichés should therefore be noted in full in a lexicon and (pragmatically) marked as clichés. Mutatis mutandis, the same goes for catchphrases: their meaning is almost invariably computable from their parts, but they are recalled and used as chunks – or perhaps as articulated chunks in the case of items of play like one potato, two potato, three potato, four…, or the words of a national anthem or of the full version of Happy Birthday to you. It is a debatable matter whether these can count as lexical entries rather than encyclopaedia entries. They seem to be evoked by a particular kind of event that triggers a speech act, e.g. Happy birthday by the occasion of someone's birthday that the speaker wishes to demonstrably recognize; Beam me up, Scotty is triggered by the thought ‘Get me out of here’. It seems feasible to propose that the listeme birthday is linked to the networked encyclopaedia with a free pragmatic condition like (77):
(77) If it is X's birthday then it is appropriate to tell X Happy birthday.
The situation with respect to Beam me up, Scotty is far more constrained: it can perhaps be tagged to the phrasal verb get NP out in some thesaurus-like way on condition that the constituent NP refers to the speaker (perhaps, along with others); it can only be used as a jocular expression and to an addressee likely to understand the utterance as a catchphrase. This latter condition does not apply to all catchphrases: for instance, it doesn't apply to not that there is anything wrong with it, which functions adequately as a non-prefab; the condition that applies is that “it” refers to a mildly tabooed topic (such as being gay).16 This illustrates the squishiness17 of prefabs.
Prefabs are, by definition, multiword expressions. Traditional dictionaries of phrases list them in alphabetical order but the mental lexicon is surely more akin to a database which is searched in a manner similar to a Google search engine operating on key words and combinations of words. The mental lexicon will also be accessed semantically and pragmatically (i.e. via meanings and encyclopaedic information; see Giora, this volume and Katsos, this volume) and not merely through aspects of the form of language expressions.
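The database-like search envisaged here can be illustrated with a toy keyword index over prefabs: each content word indexes every prefab containing it, so that the chunk is retrievable from any of its parts rather than from an alphabetical listing. The indexing scheme (content words minus a small stopword list) is an assumption for illustration only, not a model of the mental lexicon.

```python
# Toy keyword index over prefabs, illustrating search-engine-style access
# to multiword chunks rather than alphabetical lookup.
from collections import defaultdict

prefabs = ["kick the bucket", "spill the beans", "a heap of things to do"]
STOPWORDS = {"the", "a", "of", "to"}  # illustrative assumption

index = defaultdict(set)
for prefab in prefabs:
    for word in prefab.split():
        if word not in STOPWORDS:
            index[word].add(prefab)

# Any content word retrieves the whole stored chunk:
print(index["bucket"])  # {'kick the bucket'}
print(index["spill"])   # {'spill the beans'}
```

A fuller sketch would also index meanings and encyclopaedic information, since, as noted above, the mental lexicon is accessed semantically and pragmatically and not merely through form.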
12.10 Connotation in the lexicon
The connotations of a language expression are pragmatic effects that arise from encyclopaedic knowledge about its denotation (or reference) and also from experiences, beliefs, and prejudices about the contexts in which the expression is typically used. Terms like surgeon, nurse, secretary/receptionist, and motor mechanic evoke connotations of gender from the fact that the typical job-holder in each case is, even today, a gendered stereotype: most surgeons and motor mechanics are male; most nurses and secretary/receptionists are female. These connotations are all, clearly, the pragmatic effects of normative conceptions of typical job-holders.

The most common denotations of bunny and rabbit or doggie and dog are the same, but the connotations are different: the first member of each pair, bearing the diminutive, connotes endearment or childish language; see (80).

To avoid blaspheming (for which the Bible sanctions execution: Leviticus 24:16), people use a variety of euphemistic expletives (see Allan and Burridge Reference Allan and Burridge2006: 15ff., 39). For instance, Jesus is end-clipped to Jeeze! and Gee! (which is also the initial of God); Gee whiz! is a remodelling of either jeeze or jesus. More adventurous remodelings are By jingo! Jiminy cricket! [from Jesus Christ] Christmas! Crust! Crumbs! Crikey! Note that the denotation of Gee!, Jeepers!, and Jesus! is identical. All function as exclamations of surprise, dismay, enthusiasm, or emphasis. From a purely rational viewpoint, if one of them is blasphemous, then all of them are. What is different is that the first two have connotations that are markedly different from the last. Connotation – or, more precisely its pragmatic effect, reaction to connotation – is seen to be a vocabulary generator. But the question here is what goes into the lexicon, and I suggest (81)–(82) (in which statements introduced by a simple + are encyclopaedic).


Whether the encyclopaedic statements should be included within the lexicon is a matter of debate. I personally don't believe they should form a part of the lexicon entry but they must certainly be accessible from and networked with the lexicon.
12.11 Conclusion
In this chapter I have looked at ways in which pragmatics intrudes on the lexicon. I count as “pragmatic” encyclopaedic data and nonmonotonic inferences (NMI), which arguably arise from encyclopaedic data. In section 12.2, I introduced the notion of a credibility metric for a proposition and used it to calibrate NMIs in the lexicon so that they correspond with the degree of confidence one might have in the truth of the inference: its probability. Sections 12.3 and 12.4 demonstrated that in addition to specifying the necessary components of meaning in the semantics for an entry, the lexicon should also specify the most probable additional components of meaning, which are accepted or canceled as a function of contextual constraints. These same sets of conditions were demonstrated for different kinds of entries throughout the rest of the chapter. Section 12.5 looked at lexicon entries for collective and collectivizable nouns. These differ in that the different interpretations of collective nouns arise from their morphosyntactic context, and although this needs to be captured in the lexicon it is not a matter of pragmatics; a noun is collectivizable, on the other hand, only in some defined set of contexts, and these constitute a pragmatic constraint. Section 12.6 discussed the use of animal nouns in mass NPs to denote either the animal's meat or its pelt. Although there are defined morphosyntactic conditions on such interpretations, the choice between them is pragmatically determined because it is contextually induced and open to calibration against a credibility metric. Section 12.7 returned to the much disputed semantics of and. The view taken here is a monosemic one: English and has the semantics of logical conjunction, with graded salience captured in an algorithm that assigns one of a set of nonmonotonic inferences as supplementary meaning on the basis of context.
Section 12.8 discussed the vexed question of how to represent the semantics of sorites terms in the lexicon. A minimalist semantics was proposed. Section 12.9 discussed the matter of prefabs or formulaic expressions. It is only recently that their frequency and ubiquity has been recognized. They pose a challenge to the lexicon principally because they are multiword expressions; many are figurative; many are stylistically marked. These pragmatic characteristics are appropriate to encyclopaedic information linked to the entry. Section 12.10 considered the representation of connotation in the lexicon as a matter of pragmatic intrusion.
In this chapter I have shown different motivations for including pragmatics in the lexicon or linking it to the lexicon, and I have demonstrated how that may be accomplished. This is not to deny that other formalizations are possible.
My thanks to Kasia Jaszczolt for making me clarify bits of this chapter. Kasia is not to blame for remaining infelicities; indeed, she heartily disapproves of some of my claims.
13 Conversational interaction
13.1 Introduction
The study of conversational interaction is now approached in a multitude of different ways in pragmatics, reflecting in part the ever increasing diversity of the field. Approaches range from the study of the structure and management of talk as a form of social order itself in conversation analysis, through to the study of a wide variety of pragmatic phenomena that occur in conversational interaction, including formulaic language, discourse/pragmatic markers, reference and deixis, presupposition, implicature, speech and pragmatic acts, humour, im/politeness and beyond to issues of identity and power, to name just a few. The latter study of pragmatic phenomena in conversational interaction draws from a wide range of approaches, including conversations reconstructed through the introspective methods of philosophical pragmatics, the study of naturally occurring conversations through ethnography of speaking, interactional sociolinguistics, philology, (critical) discourse analysis, interactional pragmatics and more recently corpora, and the study of conversation elicited through devices such as discourse completion tests or role plays.1
Within this complex analytical landscape two key trends can be discerned in relation to the place of conversational interaction in pragmatics. First, the work of the ordinary language philosophers, Austin, Grice and Searle, who were all focused on analysing meaning (and to a lesser extent action) in language from ordinary conversation, has been enormously influential with regard to the ways in which conversation itself, and language data from conversation, are approached by many in pragmatics, particularly those practising cognitive and philosophical forms of pragmatics (so-called Anglo-American pragmatics). In such approaches, the analyst largely abstracts away from the details of conversation itself in order to formalise the rules and principles by which speakers mean (and to a lesser extent do) things in ordinary discourse. For instance, the constitutive rules for different speech acts developed by Searle (Reference Searle1969), and the conversational logic proposed by Grice (Reference Grice1989) to account for meaning that the speaker intended but did not explicitly express, are both examples of formalised systems of abstract reasoning through which speaker meaning can be analysed. This tradition of drawing from conversational data, albeit often reconstructed through introspection, has been inherited by scholars who focus on the analysis of pragmatic meaning (see Allan, Bach, Carston, de Saussure, Horn, Jaszczolt, Sullivan, this volume), or treat speech acts as a form of speaker meaning (see Kissine, Peregrin, this volume).
The second key trend is the formative influence of the work of sociologists and anthropologists, in particular, that of conversation analysis (Sacks et al. Reference Sacks, Schegloff and Jefferson1974) and interactional sociolinguistics (Gumperz Reference Gumperz1982), which has changed the ways conversation is approached by many scholars in pragmatics, particularly those practising interactional and socio-cultural forms of pragmatics (so-called European-Continental pragmatics). In conversation analysis (CA), however, the main focus is on understanding the organisational and social structure of conversation itself. The analyst thus closely examines the fine details of conversational interaction, teasing out how participants themselves understand and experience action, and manage the mechanisms through which talk is accomplished. While this approach could not be more different to cognitive-philosophical pragmatics in its underlying ontological and epistemological commitments, and thus in its treatment and analysis of conversational data, both approaches have nevertheless formed the bedrock of much of the work in pragmatics on or using conversation, ever since they were brought together by Levinson's (Reference Levinson1983) highly influential textbook.
However, while much of the interest in pragmatics and CA was initially focused on conversation in the folk sense of ordinary or everyday uses of talk between family, friends or acquaintances, conversation has since garnered a more technical definition, encompassing all types of face-to-face or telephone-mediated interaction that use language, including that occurring in institutional settings such as the classroom or workplace. More recently still it has been extended again to include various forms of computer-mediated communication (CMC), particularly those which allow (close to) real-time exchange of messages. The latter more technical notion of conversation is sometimes called talk-in-interaction in CA in order to distinguish this broader, academic notion from the ordinary sense of conversation. In this chapter, however, the term conversational interaction will be retained, in part to emphasise that it is a perspective on mundane and institutional conversation rooted in the analytical concerns of pragmatics, which is the primary focus of discussion here, rather than those of CA proper.
The importance of a pragmatics perspective on conversation is emphasised here because although work in CA has been enormously influential in advancing our understanding that conversational interaction is fundamentally emergent or non-summative in nature, the view that talk is situated not only locally in interaction but also in the sociocognitive worlds of participants does not lie within the purview of CA. In not strictly allowing for ‘inferences about what they [participants] are thinking, or why they do what they do, or assumptions about their roles and the wider social context’ (Myers Reference Myers, Culpeper, Katamba, Kerswill, Wodak and McEnery2009: 502), conversation analysts place evaluations, meanings to some extent, as well as the sociocognitive underpinnings of conversational interaction outside of the direct scope of their analytical interests.2 The point here is not to argue for the superiority of one methodology over another, however, as Jucker (Reference Jucker2009) quite rightly points out in regard to the analysis of pragmatic phenomena:
an assessment of a particular method always depends on the specific research question that the researcher tries to answer because the different methods vary enormously in their suitability for specific research questions. One particular method may provide interesting results for one specific question or set of questions while it is of little value for another set of questions. (Jucker Reference Jucker2009: 1633)
Thus, while it is suggested here that developing an approach informed by research and methods in CA in studying pragmatic phenomena that occur in conversational interaction is likely to be more productive than one which eschews such research, it is important to note that to be informed by such research is one thing, to be unduly constrained by it is another. Consequently, in this chapter, it is proposed that although the emergent nature of pragmatic phenomena in conversational interaction should clearly not be neglected, neither should the way in which such phenomena are inherently situated, not only locally in interaction but also within the sociocognitive worlds of participants. In this respect, it is argued that the properties of both emergence and situatedness should be taken into account in any analysis of pragmatic phenomena in conversational interaction. This entails drawing in a principled manner from a range of other analytical traditions, then, including not only conversation analysis, but those of cognitive, philosophical and socio-cultural pragmatics.
This chapter begins by first briefly outlining the landscape in terms of the different types of conversational interaction that have been studied thus far, including emerging computer-mediated forms of conversational interaction. The interactional engine underpinning these different types of conversational interaction, including the ways in which it is structured and managed, is next briefly described. The two key properties of conversational interaction, namely, emergence and situatedness, are then discussed, with particular emphasis placed on the implications of these properties for the analysis of pragmatic meaning, action and evaluation in conversational interactions. In this section, the sociocognitive engine that underpins conversational interaction is also discussed, based on the specific requirements that conversational interaction places on that system. The chapter concludes by sketching a programme for furthering our understanding of the pragmatics of conversational interaction.
13.2 Types of conversational interaction
Conversational interaction defined in the broad sense of all face-to-face or technology-mediated forms of interaction that use language encompasses a wide range of different types of talk. Building on Hakulinen's (Reference Hakulinen, D’hondt, Östman and Verschueren2009) analysis of conversation types, conversational interactions can be classified relative to four different dimensions: degree of institutionality, activity type/genre, channel and participation framework.
As previously noted, the main focus of analysis in pragmatics and CA was initially everyday conversations between family, friends and acquaintances, although the scope was soon extended to include other institutional forms of talk in workplaces, classrooms and the like. Mundane or ordinary conversation is thus generally defined in contrast to institutional talk. Levinson (Reference Levinson1983), for instance, defines (ordinary) conversation as ‘the predominant kind of talk in which two or more participants freely alternate in speaking, which generally occurs outside specific institutional settings’ (p. 284). More specifically, such mundane conversation involves ‘organization of talk which is not subject to functionally specific or context-specific restrictions or specialized practices or conventionalized arrangements’ (Schegloff Reference Schegloff1999: 407, original emphasis). Yet despite being defined relative to institutional forms of conversational interaction, ordinary conversation is regarded as primordial. Schegloff (Reference Schegloff1999), for instance, goes on to argue
[w]hat humans grow up with is an ordinary interaction within the family, within peer groups, neighborhoods, communities, etc. In all of these, it appears most likely that the basic medium of ‘interactional exchange’ is ordinary conversation – in whatever practices it is embodied in those settings. (Schegloff Reference Schegloff1999: 413)
In the sense that everyone engages in it from an early age and constantly throughout their lives, then, ordinary conversation is uncontroversially basic.
However, the claim in CA that institutional forms of conversational interaction are restricted variations of the basic system of ordinary conversation is somewhat more controversial. Schegloff (Reference Schegloff1999) argues that ‘other speech-exchange systems themselves appear to be shaped by the adaptation of the practices and organizations of ordinary conversation to their special functional needs, legal constraints, etc.’ (ibid. 415; see also Goodwin and Heritage Reference Goodwin and Heritage1990: 289). Types of institutional conversational interaction examined in this way range from courtroom talk, classrooms and workplace meetings and interactions through to broadcast interviews and debates, police interviews, medical consultations and various forms of counselling, including phone-in help-lines (Heritage Reference Heritage, Fitch and Sanders2005). Key differences between ordinary and institutional forms of conversational interaction include their overall structure and turn allocation. While institutional talk generally has particular phases (e.g. a recognisable beginning and end), ordinary talk has no such recognisable phases or formal procedures. Moreover, turns in institutional talk tend to be pre-allocated, while in everyday talk turns are allocated on a local basis (Heritage Reference Heritage, Fitch and Sanders2005). However, while the claim that institutional talk involves particular adaptations of the turn-taking and sequential structure of ordinary talk is perhaps intuitively appealing, such a claim still requires further exploration across various institutional settings.
A second, related dimension of conversational interaction is that of activity type (Levinson Reference Levinson1979) or communicative genre (Fairclough Reference Fairclough2003). These involve ‘practised patterns of language use’ that are constitutive of different communicative activity types or genres, such as intimate talk, family dinner-table conversation, troubles telling (or troubles talk), small talk, negotiation talk, consultation, advice giving and so on (Hakulinen Reference Hakulinen, D’hondt, Östman and Verschueren2009). However, while activity types or genres are clearly an important dimension of analysis, there is no principled way of classifying conversational interactions in this way, with such categorisations often being based on commonsense or vernacular terms that inevitably overlap in some respects. The analytical focus is thus generally more on how such activity types or genres are accomplished through conversational interaction rather than attempting to enumerate them as such (Auer Reference Auer, Verschueren and Östman2009: 95).
The channel in which conversational interaction takes place was, up until recently, restricted to auditory (e.g. telephone conversations) or audio-visual (e.g. face-to-face conversations) modes of communication. However, in recent years the notion of conversational interaction has been extended to encompass various forms of computer-mediated communication, particularly those that occur in real-time (or near to it). These include instant messaging via various software applications such as Windows Live Messenger or Yahoo Messenger, on-line chat rooms or forums (synchronous conferencing), text messaging (or SMS) via mobile phones, and the use of email in some instances for close to real-time exchange of messages. The increasing use of software such as Skype that supports audio-visual calls over the internet is also an area of interest in that such software often features text messaging and file sharing functions together with the audio(-visual) channel. Early studies of turn-taking systems in various forms of computer-mediated communication (CMC) indicated that the turn-taking system in ordinary face-to-face conversation cannot always be straightforwardly mapped onto that occurring in CMC. A variety of factors relating to the medium of CMC in question can influence how closely it maps onto spoken conversational interaction (Garcia and Jacobs 1999; Georgakopoulou Reference Georgakopoulou, Verschueren, Östman, Blommaert and Bulcaen2005; Herring Reference Herring1999, Reference Herring, Barab, Kling and Gray2004, Reference Herring2007). Key factors include synchronicity, message transmission (i.e. 
one-way transmission where the receiver cannot see the message until it is sent versus two-way transmission where both the sender and receiver are able to see or hear the message as it is produced), persistence, size of message, channels of communication (including not only text but images, audio and video), degree of anonymity and whether the system enables private or public communication (Herring Reference Herring2007: 14–17).
Generally speaking, synchronous forms of private CMC are structurally the closest in many respects to spoken conversational interaction, as will be discussed in the following section. Yet while on-line interactions can have similarities with spoken conversational interaction, CMC does not constitute one homogeneous variety of discourse, as it also encompasses asynchronous forms such as blogs (= ‘weblogs’), discussion boards and social networks, and also varies in terms of register, style and genre of the forum in question (Georgakopoulou 2005; Herring 2007). The question of which forms of CMC can be legitimately treated as forms of conversational interaction thus remains an empirical one, and an evolving one at that, as emerging technologies continue to afford different configurations of channel and other medium-related factors in communicating.
The fourth dimension influencing the type of conversational interaction in question is the participation framework (Goffman 1979). Multiparty interactions tend to have more complex instantiations of turn allocation practices than dyadic interactions, where turn-taking can (albeit not always) be more smoothly accomplished (Sacks, Schegloff and Jefferson 1974). In addition, multiparty interactions afford different discourse roles for participants, ranging from straightforward speakers and addressees through to virtual speakers (who invoke what could have been said by someone at some point) and non-addressed participants, who include side-participants, bystanders and overhearers (Levinson 1988; Verschueren 1999).
13.3 The interactional machinery of conversational interaction
While conversational interaction encompasses a diverse range of forms of talk (and more recently text), one of the key contributions of CA has been in explicating the interactional mechanisms underlying these different types of conversational interaction. These are generally taken to include turn-taking, sequence organisation, repair organisation, recipient design, and structural organisation (Schegloff 2006). In this section, the two main features of this interactional machinery relevant to analysing pragmatic phenomena in conversational interaction, namely, turn-taking and sequence organisation, are briefly described in relation to an excerpt from a spoken face-to-face conversation, and then extended to an analysis of a conversation conducted through an on-line messaging service.3
The first excerpt, which is from a casual chat between two male undergraduate students who are also friends, can be used to illustrate some of the most salient interactional features of ordinary conversation.45

The first thing to note in this conversation is that the two participants take turns in speaking, illustrating the first key principle of the interactional machinery of conversational interaction, namely, composition (Clift et al. 2006: 11). Although there are no set institutional rules guiding the allocation of turns, the two participants are able to take the speaking floor in a relatively orderly fashion that reflects the principles of turn-taking first outlined by Sacks et al. (1974: 704). These rules, in turn, rest on the ability of both participants to parse the ongoing speech into what are called turn construction units (TCUs) on the basis of grammar, intonational packaging and where it ‘constitutes a recognizable action in context’ (Schegloff 2007: 3–4). The span of a TCU-in-progress that projects imminent possible completion is termed a transition relevance place (TRP) (Schegloff 2007: 4). Such TRPs are not spans where speaker change has to occur, however, but rather are recognisable places where speaker change can occur. Speaker change may, of course, occur at points other than a TRP, but speakers are held interactionally accountable for such changes.
In the excerpt above, we can see that Ben selects Alex as next speaker in lines 1 and 5, while he self-selects as next speaker to continue his speaking turn in line 14. The only point at which there is overlap between semantically meaningful TCUs occurs across lines 9–10, when Ben offers a gloss of Alex's weekend (as ‘just >chilling out<’). There are two things to note about this overlapping turn. First, its design closely aligns with the action in the preceding turn, as through it Ben offers a formulation of what it is that Alex is trying to do in the preceding turn, namely, describe his weekend activities (Garfinkel and Sacks 1970: 342). Second, it occurs at a projectable TRP, thus constituting a form of recognitional onset (Jefferson 1986). It can also be observed more generally that turns are designed to accomplish actions in a manner that maintains progressivity in moving from one element (e.g. a turn or TCU) to the next with as little intervening as possible (Heritage 2009: 308; Schegloff 2007: 14–15), and that respects minimisation with regard to reference to or formulations of persons, places, times, actions and so on (Schegloff 2006: 80). The latter turn-design principle has also been productively applied to the analysis of person reference (or deixis) (Enfield and Stivers 2007; Lerner and Kitzinger 2007).
The second point to note is that the relationship between the turns is orderly, illustrating the second key principle of the interactional machinery, namely, position or sequentiality (Clift et al. 2006: 11). The observation by Sacks that actions in conversational interaction tend to come in pairs was formalised through the notion of adjacency pairs, such as question-answer or invitation-acceptance/declination. Adjacency pairs are not limited to two turns, however, as they can also be expanded in various ways, the main ones being pre-expansion (e.g. pre-request, pre-invitation, pre-offer), insert expansion (e.g. pre-second insert expansion for a request) and post-expansion (Schegloff 2007). In the excerpt above, question-answer adjacency pairs and minimal post-expansions are evident in lines 1–12, while Ben's telling in lines 14–17 occasions an assessment on the part of Alex of the upshot of this telling (lines 19–20), with which Ben displays agreement (line 21) (Pomerantz 1984). Notably, as Alex does not reciprocate Ben's initial question about the weekend's activities (cf. the reciprocal ‘how are you’ routine noted by Schegloff 1986), Ben occasions this telling through a self-directed question about his own weekend's activities. Ben's telling is treated as a possible complaint by Alex (Schegloff 2005), who responds with an ironic positive assessment (line 19). The non-serious frame here is marked through laughter (Glenn 2003; Jefferson 1979; Schenkein 1972), and through Ben's subsequent continuation of the ironic frame in upgrading Alex's preceding positive assessment from ‘fun’ to ‘very interesting’ (line 21).
However, it is important to note that while reference has been made to adjacency pairs in explicating the sequential architecture of this particular excerpt, this is not to say that all conversational interaction can be reduced to adjacency pairs or expansions thereof. Instances of extended storytelling (or personal narratives), for instance, involve a different kind of overall sequential structure (Jefferson 1978; Sacks 1986). One of the key findings in CA, however, is that such sequences, whether with respect to adjacency pairs or the overall structural organisation of the interaction, are for the most part organised around actions rather than topics (Schegloff 1997, 2007).
The interactional machinery of turn-taking and sequence organisation can also be applied to some synchronous forms of private CMC, as seen in the following excerpt from a private conversation between two friends conducted through instant messaging.6

The excerpt above bears a number of striking similarities to the face-to-face conversational interaction discussed in example (1), despite being conducted entirely through a textual channel. Once again there is an orderly taking of turns by the two participants. Rather than relating to the recognition of TRPs, however, next turns are allocated in three main ways by the current sender, who holds the floor. First, in a manner similar to ordinary spoken conversation, the current sender can select the next sender through ‘addressing them with a turn whose action requires a responsive action next’ (Schegloff 2007: 4). In this interaction these are primarily information-seeking questions (Stivers 2010), as seen in lines 1, 13 and 19. Second, the current sender self-selects by continuing her current turn. For instance, Rachel moves from an expression of solidarity in response to Bronwyn's previous turn (constituting one TCU) to a telling of her recent activities (constituting a second TCU) in the same turn (line 6). Unlike in face-to-face conversation, however, interruption or overlap of a current turn is not possible due to limitations of the medium (i.e. the messenger allows only one-way transmission), although a turn can be displaced if another participant sends another message in the meantime. Third, the current sender selects the next sender through posting the message, thereby implicitly relinquishing rights to the next turn, a practice evident throughout the interaction. This implicit relinquishment can, however, easily be reversed if that sender goes on to take the next turn. An example of this latter move, which can be seen in lines 17–18, appears to be occasioned by a lack of reciprocation in asking questions on the part of Bronwyn.
There are also similarities with the underlying sequential structure of the face-to-face conversation in example (1), as a clear pattern of adjacency pairs (question-answer-assessment) is apparent. There is, moreover, a striking resemblance to another practice found in example (1), namely, a speaker launching a telling about herself even though her inquiries about the other have not been reciprocated. Rachel's inquiries about Bronwyn's recent activities (line 1), her plans for the holidays (line 13) and how Bronwyn has been (line 19), for instance, occasion tellings on Rachel's part about her own recent activities (lines 5–8), her plans for the holidays (lines 17–19) and how she has been (line 21). These latter tellings by Rachel occur despite Bronwyn not reciprocating Rachel's questions, which mirrors Ben's telling about his weekend's activities despite his initial inquiry not being reciprocated by Alex in example (1). It appears, then, that the routinely reciprocal nature of such question-answer-assessment sequences can occasion self-initiated tellings in some cases, or what might be glossed as ‘occasioning self-talk through inquiring about others’.
13.4 Characterising conversational interaction
Conversational interaction can arguably be characterised in terms of two key properties. The first is emergence, whereby the activities of participants in conversational interaction are reciprocally linked and conditional upon those of others (Arundale 1999, 2006a, 2008, 2010a, 2010b; Arundale and Good 2002). The upshot of this reciprocal conditionality, or non-summativity, is that pragmatic phenomena cannot always be straightforwardly reduced to an analysis of the mental states, including a priori intentions, of individuals (see Haugh and Jaszczolt, this volume). The second key property is that of situatedness, whereby the activities of participants in conversational interaction are interpretable not only in terms of the local, situated context of that particular interaction (that is, what has come before and what comes afterwards in the interactional sequence to which participants themselves are demonstrably orienting) (Schegloff 1987, 1991, 2005), but also in terms of what lies beyond the here-and-now, encompassing the layers of historicity and orders of indexicality (Blommaert 2005, 2007) that afford pragmatic meanings and actions (Mey 2001: 115, 227; 2010: 445–6). The upshot of the inherent situatedness of conversation is that pragmatic phenomena in conversational interaction cannot always be straightforwardly reduced to emergent interactional phenomena in the here-and-now without proper consideration of their sociocognitive roots.
13.4.1 Emergence
The term interaction is used in two quite distinct ways in pragmatics. In the ordinary or ‘weak’ sense, interaction refers to ‘a situation in which people converse’ (Arundale 2006a: 196), and thus where talk is ‘directed to another person and has potential for affecting that other person’ (Schiffrin 1994: 415). In the folk sense, then, interaction is synonymous with talk or contact. This view of interaction assumes a summative view where the ‘output of one system…serves as an input to a separate, independent system’ (Arundale 2006a: 196). The conceptualisation of interaction as the summative pairing of the behavioural and cognitive states of individuals is dependent on the assumption that no reciprocal conditionality can be identified across the activities of those individuals, and thus ‘explanations of those activities can be reduced without remainder to the simple sum of the independent individual's behaviour and/or cognitive states’ (Arundale 2010a: 2079). The classic view in pragmatics that meaning involves recipients attributing intentions to speakers based on the behaviour, normally an utterance, of that speaker (see Haugh and Jaszczolt, this volume, for further discussion) presupposes a summative conceptualisation of interaction, where speaker meaning is reduced to the speaker's behaviour (output) and the recipient's inferences about intentions underlying the speaker's behaviour (input). It is assumed, therefore, that in cases of successful communication the recipient's inferences about the speaker's intentions match those of the speaker, while miscommunication is characterised as cases where those inferences do not match the actual intentions of the speaker.
While such an approach can account for at least some of what goes on in conversational interaction, especially when language-use data from conversations is abstracted away from its sequential environment, the inadequacy of a summative conceptualisation of interaction for explaining how meanings, actions and evaluations often arise in conversational interaction has been repeatedly emphasised by those examining naturally occurring conversation (see Arundale 2005 for a good summary).
For example, in the following exchange an understanding that Sirl wanted to use the shower first emerged over a number of turns, and was thus only retrospectively attributed to a previous utterance.

While Sirl's initial inquiry in turn 1 might be interpreted as a pre-request in that it constitutes a preparatory condition for making a request to use the bathroom first, and Michael subsequently leaves open the possibility that Sirl may wish to make such a request in turn 2, Sirl does not go on to make the request in turn 3. Instead, he treats his first utterance as simply a request for information, thereby blocking the interpretation of this utterance as a pre-request. However, the marked pause after Sirl's counter-response indicates that something has been left unsaid, and so Michael makes an offer to Sirl in turn 4 that he use the bathroom first, thereby reinterpreting Sirl's utterance in turn 1 as a pre-request. The subsequent acceptance of this offer by Sirl in turn 5 ratifies Michael's retrospective interpreting. Crucially, this interpreting only emerges in this interaction over the course of a number of turns, and rests on contingent inferences made by Michael that are afforded and constrained by the inferences made by Sirl, and vice versa.
According to a summative view of interaction this short conversation is an instance of miscommunication, as Sirl's utterance in turn 1 (the output) initially failed to elicit the ‘correct’ inference on the part of Michael (the input). Yet characterising this interaction as miscommunication neglects the possibility that it was in fact a matter of the two participants ‘sounding each other out’ before coming to an agreed interpretation of Sirl's initial utterance. Moreover, an analyst taking a summative view of interaction would also have to assume that Sirl really did intend to imply that he wanted to use the shower with his first utterance. Yet it is equally possible that Sirl decided he wanted to use the shower only after Michael made the offer. Sirl himself may not have been able to distinguish his own a priori intention from what eventually emerged as their understanding of his utterance in turn 1 (cf. Haugh 2007c: 94–5).
The contingent inferencing underlying the emergence of a particular understanding of Sirl's first utterance in this interaction goes beyond the traditional monadic account of inference located in the autonomous minds of individuals. Instead, it is arguably more productively understood in terms of anticipatory and retroactive inferencing (Arundale 2008; Arundale and Good 2002; Good 1995).
Each participant's processing in using language involves a set of concurrent cognitive operations that are temporally extended, not only forward in time in recipient design of their own utterances and in anticipation of other's talk…but also backwards in time in the retroactive assessing of interpretations of what has already been produced in their own and in other's utterance. (Arundale and Good 2002: 135)
In other words, implicatures and other interpretings are contingent not because the inferences are necessarily always defeasible, as various scholars have pointed out (Carston 2002: 138–9; Haugh 2008d: 445–6; Weiner 2006), but because such inferences are both anticipatory and retroactive in nature (Haugh 2009: 97–102).
In another example of the emergent nature of conversational interaction, Chris is asking Emma about her acupuncture business. The utterance-final disjunction here appears to occasion an instance of ‘not saying’, where the speaker partly leaves the interpretation of what is not said up to the recipient.7

In this excerpt, Chris begins by asking about Emma's customers’ level of satisfaction with her work (lines 1–2). Emma responds in the affirmative (line 4) after a brief pause (line 3), before going on to give a justification for her assertion that her customers are happy, namely, that she is getting business through word of mouth (lines 6–8, 10), which presupposes they must be happy. Careful examination of Chris's utterance in lines 1–2, however, indicates that his question could legitimately have been interpreted either as a polar question (‘or not’) or as an alternative question (‘or something’, for example ‘unhappy’, ‘dissatisfied’ etc.). Chris leaves it open to Emma as to which of these interpretations is meant. In this way, a default implicature of epistemic optionality arises, whereby Chris not only claims a lack of knowledge about the level of satisfaction experienced by Emma's customers, a claim standardly associated with questions (Heritage and Raymond forthcoming; Raymond 2003), but increases the epistemic cline between them by implicitly claiming that he is unable or unwilling even to guess by offering a complete candidate answer (Pomerantz 1988).8 Emma opts for an interpretation of his question as a polar question in responding with ‘YEAH’ in the next turn (Raymond 2003), which is subsequently accepted by Chris (line 5), who also simultaneously prompts her to give an account of her answer through repeating her affirmative response.
Crucially, then, the interpretation of Chris's question in lines 1–2 as a polar question rather than an alternative question emerges over the course of a number of turns, that is, the initial question from Chris, Emma's response and Chris's subsequent response. This interpretation cannot be reduced to a summative explanation based on Chris's intentions at this point, as Chris's intentions were left opaque through his not saying. Thus, while a default implicature of epistemic optionality is interactionally achieved here, this arguably follows from a general presumption of intentionality (Haugh and Jaszczolt, this volume), and an understanding of the practice of not saying through utterance-final disjunctive-type questions, rather than from the ascription of specific a priori intentions to Chris (cf. Haugh 2008c: 59–60). It also draws from inferencing on the part of Chris and Emma which is mutually affording and constraining, or what Arundale and Good (2002) term more broadly ‘dyadic cognizing’:
Each participant's cognitive processes in using language involve concurrent operations temporally extended both forward in time in anticipation or projection, and backwards in time in hindsight or retroactive assessing of what has already transpired. As participants interact, these concurrent cognitive activities become fully interdependent or dyadic. (Arundale and Good 2002: 122)
Such a view precludes a monadic explanation of the inferencing underpinning this interaction, in which each individual participant's inferences would arise independently of the other's, suggesting instead that these cognitive processes are fully interdependent (Arundale and Good 2002: 127).
In both these examples, then, another sense of interaction appears more apt, namely, a technical conceptualisation of interaction as ‘the conjoint, non-summative outcome of two or more factors’ (Arundale 2006a: 196). In these instances, the factors involved are the inferences made by the two participants about what is being meant, done and evaluated through these conversations. The definition of interaction in the technical sense of emergence assumes conversational interaction is ‘a non-summative phenomenon involving two or more cognitively autonomous persons engaged in affording and constraining one another's designing and interpreting of utterances and/or observable behaviours in sequence’ (Arundale 2010a: 2079; cf. Arundale 1999: 125–6, 129; Kidwell and Zimmerman 2006: 7; Krippendorff 2009: 43). To treat conversational interaction as emergent, then, is to treat it as a system that is fundamentally non-summative in nature.
Arundale (2010b) explains this property of non-summativity by appealing to analogies with statistical inference, systems theory and chemical reactions. In statistical terms, summativity corresponds to cases where the effects of factors are purely additive, such that the effect of one factor does not depend on the levels of another, while non-summativity encompasses cases where there are interaction effects between variables that go beyond the main effects when conducting an analysis of variance. In systems theory, a formal system arises when
[t]he state of each unit is constrained by, conditioned by, or dependent on the state of other units. The units are coupled. Moreover, there is at least one measure of the sum of its units which is larger [or less] than the sum of that measure of its units. (Miller 1965: 200–201)
In other words, the emergent property is characteristic of the system as a whole, not of its individual parts (Georgiou 2003: 241). In chemistry, non-summativity refers to the properties of a compound (e.g. sodium chloride, NaCl, or common salt) being qualitatively different from the properties of the two elements (sodium and chlorine) that constitute it. More generally, non-summativity refers to cases where ‘the state(s) of one component become reciprocally linked to and conditional upon those of other component(s) in space and/or time’ (Arundale 2010b: 139).
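The statistical contrast between summative and non-summative systems can be sketched schematically as follows; the notation is illustrative only and does not appear in the sources cited above:

```latex
% Summative system: the joint outcome decomposes into independent
% contributions, so each factor's effect is fixed regardless of the other.
M_{\mathrm{summative}}(x, y) = f(x) + g(y)

% Non-summative system: the interaction term h(x, y) couples the factors,
% so the effect of x is conditional on the level of y (and vice versa),
% and the outcome cannot be recovered from f and g alone.
M_{\mathrm{non\text{-}summative}}(x, y) = f(x) + g(y) + h(x, y)
```

On this sketch, the ‘reciprocal conditionality’ at issue corresponds to a non-vanishing interaction term: the contribution of each participant's activity cannot be assessed independently of the other's.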
In relation to conversation, then, non-summativity is a property of conversational interaction as a system, which is framed by Arundale as involving ‘two [or more] persons’ evolving, reciprocal co-creating of meanings and actions in on-going address and uptake’ (Arundale 2010a: 2079; see also Arundale and Good 2002; Krippendorff 2009: 37–47). This view is developed in considerable detail in Arundale's Co-Constituting Model of Communication (CCM) (Arundale 1999, 2005, 2006a, 2008, 2010a, 2010b; Arundale and Good 2002),9 and to some extent in other interactional achievement models of communication (Clark 1996; Sanders 1987). It also underpins the analysis of social actions in CA (Schegloff 1996, 2007). The key idea is that participants reciprocally afford and constrain interpretations of meanings, actions and evaluations in interaction through the adjacent placement of subsequent utterances, through which they display their understandings of the interactional import of prior and forthcoming utterances (Arundale 2006a: 196). In examples (1) and (2) above, the response of the recipient helps constrain what is implicated amongst the potential interpretings initially afforded by the speaker's first utterance, an interpretation which is subsequently ratified, qualified or rejected by the first speaker.
While a non-summative model of conversational interaction is necessarily complex, it is arguably essential for the analysis of pragmatic phenomena, as summative models are ‘formally incapable of explaining the non-summative effects or emergent properties observable when individuals are engaged in interaction’ (Arundale 2008: 243; see also Krippendorff 1970). The advantage of an approach that accounts for the emergent properties of conversational interaction is that it can also accommodate particular instances of pragmatic phenomena which do not necessarily need to be treated as formally emergent. In other words, a summative explanation of certain pragmatic phenomena is possible within a non-summative approach. The reverse, however, is not true. One implication of treating pragmatic phenomena in conversational interaction as emergent or non-summative phenomena is that while the sociocognitive engine (cf. Levinson 2006a: 86) underlying conversational interaction clearly encompasses monadic (i.e. individual) cognitive processes and states, including attention, (individual) intentions, inference and agency, it also needs to go beyond explanations rooted at the level of individual psychological processing into forms of dyadic cognising (Arundale 2008; Arundale and Good 2002; Haugh 2009; see also Haugh and Jaszczolt, this volume).
13.4.2 Situatedness
While CA is primarily concerned with the local situated context, a stance that is understandable given the analytical concerns of conversation analysts, it is commonly argued in pragmatics that we also need to consider the historical, social and cultural circumstances in which conversational interaction occurs (Grundy 2008: 223). Mey (2001), for instance, argues that language, which lies at the heart of conversational interaction, ‘transcends the historical boundaries of the “here and now”, as well as the subjective limitations of the individual's knowledge and experience’ (115), as it ‘functions both as a repository of earlier experience and as a tool-box for future changes’ (227). Such a stance points to a potential issue for the CA approach to analysing actions (and to a lesser extent meanings) in conversational interaction: while it is claimed that ‘analysis requires demonstrating that the action in question was understood and experienced as such by the participants’ (Schegloff 1996: 172, original emphasis), the identification (and labelling) of actions by the analyst is itself treated as unproblematic. For example, recognising a ‘possible complaint’ as an action is argued to be ‘a matter of position and composition – how the talk is constructed and where it is’, and thus ‘it is not a matter of divining intentions’ (Schegloff 2006: 88). However, it is never made clear how the analyst (or participants) know this constitutes a (possible) complaint in the first place.
In the previous analysis of example (1), the relevant section of which is reproduced below, for instance, the recognition of a ‘complainable’ (namely, having to spend all weekend studying) was claimed to be evident from Alex's subsequent ironic assessment of Ben's weekend spent studying.

In lines 15–17, Ben describes how he spent the whole weekend studying. This is treated as a complainable by Alex, who responds with an ironic assessment in line 19 (‘sounds like fun °there goes° the students life’). It is interpretable as ironic since evidently Alex does not mean that studying all weekend is really a fun thing to be doing, but rather that having to study all weekend can be assessed as being exactly the opposite (that is, not fun), and thus as something about which making a complaint is understandable. However, such an analysis raises the question of just how it is that we can conclude a possible complaint is at issue here, and how we recognise Alex's assessment as ironic.10 Evidently we are drawing from some kind of interactional competence or intuition that lies outside of this particular interaction.
Mey (2010) argues that what makes speech acts possible, including ‘possible complaints’, are particular conditions, or what he terms ‘affordances’:
for any activity to be successful, it has to be ‘expected’, not just in the sense that somebody is waiting for the act to be performed, but rather in a general sense: this particular kind of act is apposite in this particular discursive interaction. (Mey 2010: 445)
In other words, social actions are dependent on the ‘situation being able to “carry” them’ (Mey 2010: 445). In the case of possible complaints and ironic assessments this involves a general understanding of what students do (i.e. not only study, but also engage in recreation and sometimes part-time work), how much time they normally spend studying as opposed to pursuing those other activities, and thus what would constitute a reasonable, as opposed to unreasonable, balance between less desirable activities (i.e. study) and more desirable activities (i.e. recreation).
It appears, then, that attempts to formalise such conditions in philosophical pragmatics perhaps have a more important role to play than generally acknowledged by those preferring empirical analyses of naturally occurring data, since without some ‘precise reflections on what constitutes the nature’ of the speech act in question (Jucker Reference Jucker2009: 1620), analysts may inadvertently reify their own intuitions to the level of theory. Such a problem is admittedly less likely to arise in CA, where such theorising is normally eschewed. However, in pragmatics, where theorising about pragmatic phenomena is one of its primary goals, the reification of informal intuitions can lead to an endless proliferation of categories, internally contradictory schemas, and ultimately, incoherence in theorising about pragmatic phenomena. The potential for such problems suggests that we need to take not only the local, situated context of a particular interaction into account in our analyses of conversational interactions, but also the sociocognitive world that underpins them.
The notion of situatedness proposed here encompasses both the local, here-and-now of conversational interaction, as well as the ‘layered simultaneity’ and horizontal and vertical distribution of ‘orders of indexicality’ characterising that which lies beyond the here-and-now (Blommaert Reference Blommaert2005, Reference Blommaert2007). As Blommaert (Reference Blommaert2005) argues, various forms of discourse, including conversational interaction, are subject to what he describes as ‘layered simultaneity’.
It occurs in a real-time, synchronic event, but it is simultaneously encapsulated in several layers of historicity, some of which are within the grasp of participants while others remain invisible but are nevertheless present. (Blommaert Reference Blommaert2005: 130)
Such a view of interaction goes beyond the locally observable to encompass ‘a polycentric and stratified’ environment where ‘people continuously need to observe “norms” – orders of indexicality – that are attached to a multitude of centres of authority, local as well as translocal, momentary as well as lasting’ (Blommaert Reference Blommaert2007: 2). These orders of indexicality are spread horizontally across social networks, as well as vertically in the sense that they belong to different scales of operation and degrees of validity (Blommaert Reference Blommaert2007: 1).
Norms (see Peregrin, this volume) are often conceptualised as external forces that drive (or cause) particular actions, interpretations or evaluations by individuals, a perspective that is sometimes labelled Parsonian, as it was most famously advanced in the work of Parsons ([Reference Parsons1937]1968). The treatment of norms as external forces was critiqued by Garfinkel (Reference Garfinkel1967) among others,11 and the view has since emerged that norms, or ‘orders of indexicality’, can be understood more productively as distributed and emergent properties that are enabled across networks of speakers, including those broader networks of which conversational interactions constitute an important part. Arundale argues that
we co-constitute anew in each inter-action patterns that we have likely co-constituted in similar form in the past, and it is the continual re-co-constituting or co-maintaining of these patterns that observers attempt to explain using abstractions like ‘ideologies’ or ‘social institutions.’ But from the perspective of the co-constituting model of communication, an ideology or social institution does not exist except as a continual renewing of patterns in inter-action. (Arundale Reference Arundale1999: 141–2, original emphasis)
In other words, the ongoing renewal of inter-action12 patterns across social networks is the mechanism by which orders of indexicality are sustained (and thus evolve). Instances where such inter-action patterns are not renewed thus constitute challenges to orders of indexicality, although the impact of such challenges depends very much on their place of occurrence within the wider social network. A similar view of ‘norms’ is advanced by Eelen (Reference Eelen2001), who suggests they constitute a kind of ‘working consensus’, a set of practices rather than beliefs. Indeed, this view of broader socio-cultural norms reflects a perspective that has long been argued for by ethnomethodologists (Garfinkel Reference Garfinkel1967; Heritage Reference Heritage1984), with Mey (Reference Mey and Brown2006b) succinctly representing this perspective in arguing that the ‘social context not only constrains language use, but is itself constructed through the use of appropriate language: social norms are (re)-instituted through the use of language’ (53). These orders of indexicality, then, form part of a ‘shared memory’ that arises as ‘inter-action within a network across time and space creates a structural form of social memory, independent of the memories of individuals’ (Arundale Reference Arundale1999: 141). Krippendorff (Reference Krippendorff2009) argues that such social memory cannot be reduced, at least not without remainder, to the psychological processes of individuals. Instead, it should be modelled as a system in its own right distributed across social networks. Work on different kinds of knowledge schemata, such as frames and scripts, for instance, represents an attempt to do just that, albeit within a largely structuralist framework.13
Such a perspective clearly has implications for those attempting to characterise the conditions (or affordances) that make particular pragmatic meanings, actions and evaluations possible. It also has implications for the way in which we approach the analysis of various pragmatic phenomena in conversational interaction. In particular, the situatedness of conversational interaction needs to be taken into account, especially in cases where speakers in local, situated interaction also invoke particular schemata and orders of indexicality that are variously distributed across social networks. In the following excerpt, for instance, from a conversation between two friends talking about Chris's recent visit to the dentist, the way in which scripts and orders of indexicality associated with dentists are invoked is illustrated.14

Up until this point in the conversation, Chris has been telling Mark about his dentist, and how impressed he is with him. Mark responds to this over-extolling by teasing Chris that he is probably charged a lot for the dentist's services (line 1). While Chris initially agrees, he goes on to account for his previous extolling by claiming that his dentist only does work that needs to be done (lines 3–5). This presupposes that dentists sometimes do unnecessary dental work, thereby inflating what they can charge. This presupposition is made explicit in a subsequent turn, when Chris begins talking about what some dentists do, namely, offering a fluoride treatment (apparently perceived by Chris as unnecessary) (lines 7–8), before Mark co-constructs a humorous completion to Chris's utterance, where it is suggested that dentists make money by first deliberately causing damage to their patients' teeth (lines 9–10). While it is not clear what Chris might have said had he finished what he was saying in line 8, Mark was able to anticipate what could have been said.15 In this way, Mark co-constructs with Chris a common stance about dentists who do unnecessary work (cf. Haugh Reference Haugh2009: 101–2). This stance is recognisably humorous not only because of the incongruity between the action attributed to such dentists (i.e., deliberately damaging the patient's teeth) and the assumption that dentists are there to help people, but also because it draws upon stereotypical views of dentists as overcharging their patients. In other words, in order to fully appreciate the humour here, we arguably need to go beyond the local, situated interaction to include a consideration of views of dentists that reside in our shared memories or schemata.
In another example, taken from an interaction occurring in an institutional setting (a television broadcast), the way in which multiple layers of historicity can be invoked becomes even more apparent. The following excerpt is from a debate between Sanna Trad, daughter of a prominent leader in the Australian Muslim community in Sydney, and Bronwyn Bishop, a conservative senator, about Australian Muslims wearing the hijab.


This excerpt begins with Trad expressing her frustration that wearing the hijab is interpreted (by many Australians) as a sign of a lack of freedom on the part of Muslim women, and then going on to claim that she is actually exercising her agency in choosing to wear a hijab (lines 1–4). However, she is interrupted by Bishop, who implies that Trad is unaware of the limits placed on her freedom by her Islamic beliefs, drawing an analogy with slaves who, never having experienced freedom, do not understand its meaning (lines 6–7). Trad's subsequent response is very heated. In particular, she tries to counter the implication that she is not part of Australian society (a negative evaluation of Trad displayed by Bishop, and thus a potential challenge to her identity), by claiming she was ‘born and raised’ in Australia (lines 13–14) and has Australian friends who are, importantly, from Anglo-Christian backgrounds (lines 16–18). The way in which this face threat involves multiple layers of historicity is made evident in Trad's response to Bishop's implied accusation, namely, that Trad (and all those who wear hijabs) are not legitimate members of Australian society. In other words, the discourse of ‘Muslims as un-Australian’ as a broader underlying concern is invoked in this interaction. It also draws on a shared understanding that there are supposedly ‘real’ Australians who are born and raised in the country and are stereotypically from Anglo-Christian backgrounds. The point here is not to endorse such understandings, but simply to point out that they are evidently presupposed by both Bishop and Trad.
While conversation analysts often eschew the need for invoking shared knowledge (see the recent debate in McHoul et al. Reference McHoul, Rapley and Antaki2008), it has been argued here that pragmatic phenomena in conversational interaction cannot always be reduced to emergent interactional phenomena in the here-and-now. More serious consideration of their sociocognitive roots, and consequently acknowledgement of the inherent situatedness of conversational interaction, has thus been advocated here.
13.5 Towards a pragmatics of conversational interaction
Conversational interaction is an important concern for any theory of pragmatics. While there are clearly many forms of language use apart from conversational interaction that are also deserving of analytical attention, conversational interaction remains important due to its enduring ubiquity in social life. In this chapter, it has been argued that pragmatic phenomena in conversational interaction should be recognised as emergent and situated in nature. This has implications for the ways in which we might approach the analysis of pragmatic phenomena that occur in conversational interaction.
A basic framework for analysing pragmatic phenomena in conversational interaction has also been assumed in this discussion. This framework is based on a tripartite distinction between pragmatic meaning, action and evaluation, which, along with investigations of the interactional machinery and sociocognitive engine underlying conversation, arguably forms the basis of a programme for investigating the pragmatics of conversational interaction. In advancing this programme, a wide range of methodologies can legitimately be called upon. It has been argued here, however, that this should occur in a principled manner, bearing in mind the properties of emergence and situatedness that characterise conversational interaction.
It is also apparent that more work on conversational types of computer-mediated communication is necessary. While face-to-face conversation remains at the core of much of our language use, variant forms of conversational interaction are emerging as different types of computer-mediated communication are increasingly becoming a part of our daily lives. These emergent forms of conversational interaction need to be accommodated within dyadic as well as multiparty theories of conversational interaction in pragmatics. The assumption that talk is inherently fleeting also needs revisiting in light of the increasing prevalence of conversational forms of computer-mediated communication where the conversational record remains there for all to inspect. The advantage of such a record is that it engenders greater metalinguistic and metapragmatic awareness amongst interactants themselves, with such awareness offering a rich analytical vein for future research on conversational interaction. It appears, then, that there still remains much to explore in better understanding the pragmatics of conversational interaction.
14 Experimental investigations and pragmatic theorising
14.1 Introduction: pragmatics in the mind
Paul Grice's (Reference Grice, Cole and Morgan1975) seminal proposal on the Cooperative Principle and maxims can be seen as a philosophical reconstruction of the conventions that interlocutors assume each other to adhere to. Bach (Reference Bach2001 and this volume), Horn (Reference Horn, Horn and Ward2004, Reference Horn2005 and this volume), Saul (Reference Saul2002b) and other researchers construe the study of pragmatics at such a normative level, subscribing to a view whereby Grice and neo-Gricean accounts are concerned with what interlocutors ought to do, and which implications ought to arise given a set of pragmatic principles and what has been said. Yet a great and still growing part of the pragmatic community sees pragmatics (and semantics) as a subset of the science that investigates human communicative competence (see Carston and Powell, Reference Carston, Powell, Lepore and Smith2006; Chierchia, Reference Chierchia and Belletti2004; Levinson, Reference Levinson2000; Sperber and Wilson, Reference Sperber and Wilson1986; among others). Such theorists consider that the object of description for semantics and pragmatics is the implicit knowledge that competent speakers have which enables them to produce and understand meaningful utterances. The cognitively oriented pragmatician considers that pragmatics, just like syntax or phonology, is related to cognitive psychology and to disciplines that investigate knowledge representation (Marantz, Reference Marantz2005; Ferreira F., Reference Ferreira2005). In this light, whether Grice himself actually thought about the representation of pragmatic norms in the mind is a question that is perhaps more relevant to the historian of pragmatics than to the modern linguist.
It follows from the difference in the object of description that the two traditions also differ in the kind of data that they admit as relevant evidence. For example, in the cognitive tradition it is important to document the dominant interpretation of an utterance as well as any other possible but less preferred ones, and to account for the processes through which these interpretations are derived. As such, the cognitive turn brings pragmatics much closer to the discipline that studies how language is processed in the minds of speakers and listeners, namely psycholinguistics. Recently, the collaboration of theoretical pragmaticians and psychologists of language has been of benefit to all parties. Linguistic phenomena have been a fruitful domain for psychological study, and the empirical data gathered in connection with these phenomena have provided theory-critical evidence beyond the reach of conceptual argumentation and reflective intuition.
In this review, I focus in particular on quantity implicature, and I outline three theory-critical questions that are best addressed through empirical methodologies. These concern (i) the role of context in the derivation of implicature; (ii) differences between generalised and particularised implicatures; and (iii) the post- or sub-propositional nature of the process of implicature derivation. I discuss how these questions can be addressed through experimental methodologies that study the time-course of the process of meaning derivation as well as subtle differences between the degree to which certain interpretations are preferred.
Quantity implicature is but one of the topics that have recently been approached from this interdisciplinary perspective: other topics include reference, speech acts, metaphor and figurative language (see the contributions in Noveck and Sperber, Reference Noveck and Sperber2004). Recent studies on these aspects of language, initially motivated by theoretical linguistic considerations, have generated plentiful psycholinguistic data. In many cases, these data have not only enabled researchers to settle linguistic debates but also contributed to more general models of human cognition. However, quantity implicature is a particularly good case to exemplify the contribution of experimental methods to pragmatics on three grounds. First, for most cases, there is widespread agreement among theoretical accounts about the eventual interpretation of scalar terms. Consequently, the predictions that discriminate between these accounts are subtle and fine-grained and concern the process through which the interpretation is derived. Chierchia (Reference Chierchia and Belletti2004), Levinson (Reference Levinson2000) and Sperber and Wilson (Reference Sperber and Wilson1986) concur that this renders empirical investigation a more appropriate means of evaluating these predictions than using the traditional tools of the theoretical linguist, introspection and intuition. Their theoretical accounts of implicature are tailored to this form of investigation, exhibiting a conceptual clarity and precision which makes it possible to draw testable predictions from the theories. These cases are reviewed in section 14.2.1. Second, for the cases where there is disagreement about the preferred interpretation of an utterance, or the relative preference for a certain interpretation, it is once again advantageous to turn to empirical methods in order to document which interpretation is favoured by linguistically naïve competent speakers. These cases are reviewed in sections 14.2.2–14.2.3.
Third, the theories of implicature that will be put to the test make claims that have implications for the organisation of the entire meaning system and how the semantics/pragmatics distinction is to be conceptualised. As such, it is possible to draw conclusions whose implications are wider than the actual phenomenon under study. Let us briefly review the linguistic phenomena.
14.2 Quantity implicature
It is a commonplace observation that interlocutors may communicate and infer more information than what is explicitly said. Take for example the discourses in (1) and (2).
(1)
a. Mary: Did you dance with John and Bill?
b. Jane: I danced with John.
c. Implicature: Jane did not dance with Bill.
(2)
a. Mary: Did all your class fail the test?
b. Jane: Some of my class failed.
c. Implicature: Not all Jane's class failed.
Given questions (1a) and (2a), the speaker of (1b) and (2b) can be understood as conveying their literal meaning as well as (1c) and (2c) respectively. However, the latter inferences are not part of what is explicitly said. Grice's Cooperative Principle and maxims (Reference Grice, Cole and Morgan1975) have been seminal in providing a framework for how information is communicated in this implicit fashion. Grice proposed that interlocutors should assume each other to be cooperative, and moreover to be informative, truthful, concise and relevant. If it would appear that the information that is explicitly communicated by the speaker would violate any one of these assumptions, listeners are enjoined to infer that some additional information that would repair such a violation is implicitly communicated. These reparatory pragmatic inferences are known as implicatures. (See Horn, this volume; Ariel, this volume; Jucker, this volume.)
Specifically, with regard to (1), according to the Gricean proposal (Grice, Reference Grice, Cole and Morgan1975; see also Atlas and Levinson, Reference Atlas, Levinson and Cole1981; Horn, Reference Horn1972, Reference Horn and Schiffrin1984b; Levinson, Reference Levinson1983) the implicature in (1c) is derived if interlocutors assume each other to adhere to the first maxim of Quantity, which enjoins them not to provide less information than would be required, bearing in mind what is relevant for the purpose of the conversation. The inference would be derived by the following reasoning: Jane said that she danced with John, but there is a more informative statement that she could have made, namely that she danced with John and Bill (the fact that the latter statement is more informative can be easily demonstrated by the observation that ‘Jane danced with John and Bill’ entails ‘Jane danced with John’ – but not the other way round). Given question (1a), it would be relevant to know whether the more informative statement is the case. By virtue of the Cooperative Principle and the first maxim of Quantity, Mary is licensed to assume that Jane would not be underinformative. Therefore, the most likely reason why Jane did not mention that she danced with Bill as well as John would be that this is not the case, and so Jane could be understood to be communicating this implicitly through an implicature. Since this implicature is derived in order to observe the first maxim of Quantity, it could further be called a quantity implicature.
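The derivational steps just described can be sketched as a toy procedure. The propositional representation and every name below are illustrative assumptions introduced for exposition only, not part of Grice's or any neo-Gricean formalisation.

```python
def entails(stronger: set, weaker: set) -> bool:
    """'Jane danced with John and Bill' entails 'Jane danced with John':
    here, the stronger proposition's content includes the weaker one's."""
    return weaker <= stronger

def quantity_implicature(said: set, alternative: set, relevant: bool,
                         speaker_knowledgeable: bool, cooperative: bool):
    """Derive 'not-alternative' when a relevant, strictly more informative
    alternative was available to the speaker but was not used."""
    if not entails(alternative, said) or alternative == said:
        return None  # the alternative is not strictly more informative
    if not (relevant and speaker_knowledgeable and cooperative):
        return None  # a licensing condition for the inference fails
    return ("not", alternative - said)

# Example (1): Jane said she danced with John; 'John and Bill' was relevant.
said = {"danced(jane, john)"}
alt = {"danced(jane, john)", "danced(jane, bill)"}
print(quantity_implicature(said, alt, relevant=True,
                           speaker_knowledgeable=True, cooperative=True))
# → ('not', {'danced(jane, bill)'})
```

On this sketch the inference fails silently (returns `None`) whenever the speaker lacks complete knowledge, is uncooperative, or the stronger alternative is irrelevant, mirroring the licensing conditions discussed below.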
In a similar fashion for the discourse in (2), there exists a proposition, ‘all of Jane's class failed’, that would have been relevant and more informative than what was explicitly said. The fact that Jane did not say so enjoins Mary to infer that Jane is communicating that this is not the case, to the effect of deriving the implicature in (2c). Because (2b) is part of a scale of informativeness formed by propositions with the quantifiers ‘some’, ‘many’, ‘most’, ‘all’, it could further be called a scalar implicature.
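The asymmetry underlying the quantifier scale can be checked with a toy model-theoretic sketch: over any domain, ‘all F are G’ entails ‘some F are G’ (given a non-empty F), but not vice versa. All names below are illustrative.

```python
def some(F: set, G: set) -> bool:
    """'Some F are G': at least one member of F is in G."""
    return any(x in G for x in F)

def all_(F: set, G: set) -> bool:
    """'All F are G', with the non-emptiness assumption on F."""
    return len(F) > 0 and all(x in G for x in F)

students = {"ann", "bob", "cat"}
print(all_(students, {"ann", "bob", "cat"}), some(students, {"ann", "bob", "cat"}))
# → True True   ('all' situations verify 'some')
print(all_(students, {"ann"}), some(students, {"ann"}))
# → False True  ('some' situations need not verify 'all')
```

Because every situation verifying ‘all’ also verifies ‘some’, ‘some’ is the informationally weaker term on the scale, which is what licenses the ‘not all’ scalar inference.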
While both (1c) and (2c) are instances of quantity implicatures, they differ on the grounds that the alternative more informative proposition that could have been used in the case of (1c) cannot be independently established outside of a specific context. By contrast, the scale of informativeness for quantifiers, sentence connectives (‘or’, ‘and’), modals (‘might’, ‘must’) and other expressions can be known without reference to a specific communicative situation. Grice, and subsequent pragmatic theorists, acknowledged this difference. The terms generalised and particularised conversational implicatures captured the difference between implicatures that are usually associated with a certain form of words as opposed to implicatures that are not. In this spirit, (2c) is a scalar implicature (because ‘some’ belongs to a scale which is context-independent) and it can be classified as a generalised quantity implicature. (1c) on the other hand relies on a contrast that is only available in an ad hoc fashion, depending on the context, and it can be classified as a particularised quantity implicature. However, both cases satisfy criteria for pragmatic rather than logical meaning (for a discussion of criteria for implicature-hood see Horn, Reference Horn and Schiffrin1984b; Levinson, Reference Levinson1983; Sadock, Reference Sadock and Cole1978). For example, neither of these implicatures will be generated if the speaker is not in a position to make the stronger statement (because, for example, she does not have complete knowledge of the situation), or if she is opting out of the maxim of Quantity for reasons of diffidence, deceit or politeness (for empirical evidence for the latter see Bonnefon et al., Reference Bonnefon, Feeney and Villejoubert2009). Moreover, the explicit contradiction of such implicatures is considered less infelicitous than contradictions of logical entailment (for empirical evidence see Katsos, Reference Katsos2007).
Recently, certain pragmatic theories have strengthened the distinction between generalised and particularised implicatures, to the effect that the former are considered to be triggered by virtue of the tokening of the lexical expressions that they are associated with. This is done by a short-circuited process that does not employ all the steps of the Gricean derivation. Specifically, Levinson (Reference Levinson2000) and Chierchia (Reference Chierchia and Belletti2004, Reference Chierchia2006) propose that scalar implicatures are derived just as long as they lead to an informationally stronger interpretation. Other considerations, such as whether the implicature is relevant to the context of the conversation, whether the interlocutor is cooperative and whether she is in a position to know whether the stronger statement is true, are important steps in the derivation, but they come into play at a later stage and may call for the cancellation of an implicature that has already been derived. The system that derives these short-circuited but defeasible implicatures is either a specialised default-pragmatic system (according to Levinson, Reference Levinson2000), dedicated to generalised implicatures, or a subcomponent of the grammatical system which has incorporated a mechanism for evaluating the informativeness of a proposition (according to Chierchia, Reference Chierchia and Belletti2004). Both these alternatives call for a distinction between a Gricean-like pragmatic system which deals with the derivation of particularised implicatures (like the ad hoc quantity ones) and a short-circuited system that derives default (lexically associated and defeasible) generalised implicatures. One of the main perceived advantages of these default accounts is that they make use of the intuitive difference between two types of implicatures in a systematic way. 
(It is worth noting from the beginning that these accounts are not the only ones that make use of defaultness as a theoretically and empirically relevant notion in the study of meaning. See Jaszczolt (Reference Jaszczolt2005) for various uses of the term with distinct content. Thus, when referring to ‘the default account’ I shall always be referring to the specific proposals put forward by Levinson (Reference Levinson2000) and Chierchia (Reference Chierchia and Belletti2004)).
However, according to other neo-Gricean and post-Gricean theorists, the distinction between generalised and particularised implicatures does not entail the postulation of two separate pragmatic systems. Unitary accounts such as the ones put forth by Carston (Reference Carston, Carston and Uchida1998), Geurts (Reference Geurts2010), Hirschberg (Reference Hirschberg1991), Horn (Reference Horn and Schiffrin1984b) and Sperber and Wilson (Reference Sperber and Wilson1986) among others interpret the distinction as an empirical generalisation on the degree of association of a certain form of words with a certain implication. However, regardless of whether a certain form of words tends to be associated with a certain implicature more than others do, all conversational implicatures are derived by a common set of pragmatic principles, which takes into account cooperativity, epistemic state and relevance to the discourse context among others.
The predictions of the default, two-systems accounts, and the unitary accounts respectively can be clustered as follows with regard to quantity implicatures. First, according to default accounts, scalar implicatures are generated upon the tokening of a scalar expression and as long as the interpretation with the scalar implicature is informationally stronger than the interpretation without. Should the implicature not be relevant to the context of the conversation, or should the interlocutor not be cooperative or in a position to know whether the stronger statement is true, the implicature which has been generated by default is subsequently cancelled. By contrast, unitary accounts predict that the implicature is simply not generated in the first place if any of the licensing conditions are not met. This prediction is best tested by looking at the time-course of the interpretation of scalar expressions in conditions where one of these licensing factors is not met.
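The contrast between the two processing architectures can be sketched as follows. The step counts are purely illustrative stand-ins for predicted processing cost (the source of the reading-time predictions discussed later), not a model of real reading times; all names are assumptions for exposition.

```python
def default_account(scalar_trigger: bool, licensed: bool):
    """Generate the SI on tokening a scalar term; cancel it later if the
    licensing conditions (relevance, cooperativity, knowledge) fail."""
    steps = 0
    implicature = False
    if scalar_trigger:
        implicature = True   # generated by default on tokening
        steps += 1
        if not licensed:
            implicature = False  # cancellation: extra work where SI is irrelevant
            steps += 1
    return implicature, steps

def unitary_account(scalar_trigger: bool, licensed: bool):
    """Generate the SI only where the licensing conditions are met."""
    if scalar_trigger and licensed:
        return True, 1  # a single inferential step, only where relevant
    return False, 0

# Context where the SI is irrelevant: the default account predicts extra
# work (generate then cancel), the unitary account predicts none.
print(default_account(True, licensed=False))  # → (False, 2)
print(unitary_account(True, licensed=False))  # → (False, 0)
```

Both procedures converge on the same final interpretation in every condition; they differ only in where the work is done, which is why, as the chapter argues, only time-course evidence can discriminate between them.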
Second, according to default accounts, scalar implicatures are more strongly associated with their triggering expressions than ad hoc ones. It follows that young children who are still in the process of acquiring adult-like pragmatic competence should find more evidence in the input to guide their acquisition of default implicatures than ad hoc ones. (A second prediction is that the contradiction of a default implicature should be more infelicitous than the contradiction of a non-default one.) By contrast, unitary accounts predict that children acquire competence with all quantity implicatures across the board. (Moreover, the contradiction of either kind of implicature is equally infelicitous.)
A third topic that will be discussed is whether the locus of implicature generation is sub- or post-propositional. This issue is conceptually distinct from the topic of the defaultness of scalar implicatures. It is relevant to review it here as default theories such as Levinson's and Chierchia's also tend to take a sub-propositional, localist position, on the grounds that this explains why sometimes scalar implicatures seem to intrude into truth conditions. This issue will be reviewed in section 14.2.3. Let us now turn to the debate on the role of context.
14.2.1 The role of context
Unitary and default accounts agree that scalar implicatures (henceforth SIs) are ultimately available when relevant to the discourse goal and not available otherwise; however, they differ as to how this interpretation arises. Default accounts predict that the SI is first generated, regardless of relevance to the context, and then cancelled if it is contextually irrelevant, whereas context-driven accounts predict that the SI is not generated at all if it would not be relevant. Clearly intuition alone cannot settle this question, as we are concerned not with the ultimate interpretation but with the process through which it is derived. However, psycholinguistic investigations of the time-course of scalar implicature can shed light on this.
More generally, a central debate in the area of sentence processing concerns whether there is an initial, encapsulated stage of structure assignment, or whether different categories of information (syntax, semantics, context, co-occurrence frequencies etc.) interact and coordinate from the earliest possible stage (Spivey-Knowlton and Sedivy, Reference Spivey-Knowlton and Sedivy1995; Tanenhaus et al., Reference Tanenhaus, Spivey-Knowlton, Eberhard and Sedivy1995). The question of how SIs are generated could be seen as part of this debate. According to the default approach, more processing time should be required when participants are processing scalar terms in so-called lower-bound contexts (where all that is relevant is the semantic meaning of the scalar expression and the SI is irrelevant) than in upper-bound contexts (where the SI is relevant), because in the latter the inference must be generated and then cancelled. This additional processing should manifest itself as a delay in reading time, in self-paced reading studies. Unitary accounts do not share this prediction, as no cancellation is required in lower-bound contexts; in fact, accounts based on Relevance theory (see Bott and Noveck, Reference Bott and Noveck2004) predict that reading time will be slower in upper-bound contexts, because an additional inference is generated in these contexts. Thus, by investigating the reading times of scalar expressions in upper- and lower-bound contexts, we can gather evidence about the role of context as well as the automaticity of the inference.
To exemplify this, let us return to examples (1) and (2) and look at the effect of manipulating the question that is asked:
(1)
a. Mary: Did you dance with John and Bill?
a′. Mary: Why are you upset?
b. Jane: I danced with John.
c. Implicature: Jane did not dance with Bill.
(2)
a. Mary: Did all your class fail the test?
a′. Mary: Why are you disappointed?
b. Jane: Some of my class failed.
c. Implicature: Not all Jane's class failed.
Both unitary and default accounts predict that if Mary were to ask (1a) and (2a) then she would be licensed to infer that her interlocutor is also communicating (1c) and (2c) in addition to the literal meaning of (1b) and (2b). This is because the specific question she asked raises the issue of whether the stronger proposition, the one with the conjunction (‘John and Bill’) or the one containing ‘all’, is the case. Moreover, no implicature would have been inferred had Mary asked (1a′) and (2a′) as in this context it is not relevant to consider the stronger proposition. The semantic meaning of the utterances in (1b) and (2b) suffice to answer the question that she asked. However, while the unitary account predicts that given (1a′) and (2a′) no implicature is generated in the first place, the default account predicts that in the case of (2a′) the implicature is first generated by virtue of being associated with the lexical expression which creates a scale (‘some’) and then cancelled. This contrasts with (1a′) where the implicature is simply not generated.
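The asymmetry just described can be made explicit in a small sketch, in which (on the default account) generation is tied to lexical scale membership: ‘some’ in (2b) triggers generation even after (2a′), whereas ‘John’ in (1b) does not. The scale inventory and the step labels are hypothetical simplifications introduced purely for illustration.

```python
# Hypothetical sketch of the two accounts' predicted processing steps.
LEXICAL_SCALES = {"some": "all"}  # 'John' has no lexical scalemate

def default_account_steps(trigger: str, si_relevant: bool) -> list:
    steps = []
    if trigger in LEXICAL_SCALES:
        steps.append("generate")       # generated by default, as with (2b)
        if not si_relevant:
            steps.append("cancel")     # cancelled in a context like (2a')
    elif si_relevant:
        steps.append("generate")       # particularised case, e.g. (1b) after (1a)
    return steps

def unitary_account_steps(trigger: str, si_relevant: bool) -> list:
    # A single context-driven mechanism for both implicature types.
    return ["generate"] if si_relevant else []

assert default_account_steps("some", False) == ["generate", "cancel"]  # (2a') + (2b)
assert default_account_steps("John", False) == []                      # (1a') + (1b)
assert unitary_account_steps("some", False) == []                      # no generation at all
```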
In a series of self-paced reading experiments, Breheny, Katsos and Williams (Reference Breheny2006; see also Katsos, Reference Katsos2008; Katsos et al., Reference Katsos, Breheny, Williams, Bara, Barsalou and Bucciarelli2005) investigated the reading time of scalar expressions in so-called upper- and lower-bound contexts where the implicature is and isn't relevant to the context respectively. For example, for the existential quantifier ‘some of the Fs’, Breheny et al. (Reference Breheny, Katsos and Williams2006) constructed upper-bound (UB) and lower-bound (LB) conditions by manipulating whether ‘all of the Fs’ was relevant to the discourse. Using the examples employed so far, a UB context for the utterance ‘some of my class failed’ can be created by a preceding question which explicitly raises the importance of ‘whether all of the class failed’ as in (2a). An LB context can be created by a preceding question which is fully answered by the utterance ‘some of my class failed’ regardless of whether some or all did, such as (2a′). Moreover, Breheny et al. (Reference Breheny, Katsos and Williams2006) included a control condition with ‘only’ (e.g. ‘only some of my class failed’) that explicitly encoded the SI. The reading time was measured on the scalar term itself (‘some of the students…’) and on a dependent phrase that followed (e.g. ‘The rest passed the test’) whose interpretation is facilitated by the SI. If the participants interpret ‘some’ as ‘some but not all’, then the reference set for ‘the rest’ is already available when they read the dependent phrase; if they do not, then extra processing time is required to interpret who ‘the rest’ refers to.
As predicted by all accounts, the dependent phrase was read faster in the UB and control conditions than in the LB condition. This indicates that the SI was ultimately generated in UB and not in LB contexts, confirming that the conditions were appropriately constructed. To answer the theory-critical question – whether the SI is generated and then cancelled in the LB condition – we have to look at the scalar expression itself. In this study (and in Katsos et al., Reference Katsos, Breheny, Williams, Bara, Barsalou and Bucciarelli2005 for English), the trigger phrase ‘some of the Fs’ was read more slowly in UB than in LB contexts. This is hard to reconcile with the default account, which predicts extra processing (generation followed by cancellation) in the LB condition; no such delay was evident. These findings therefore show that contextual factors have a primary role in the process that generates SIs, not merely in cancelling them at a later stage. (For experimental studies that reach different conclusions, see Bezuidenhout and Cutting, Reference Bezuidenhout and Cutting2002; but see also the discussion of that paper in Breheny et al., Reference Breheny, Katsos and Williams2006.) Further evidence that interpreting a scalar expression with an SI is more time-consuming than interpreting it without one is presented by Bott and Noveck (Reference Bott and Noveck2004), Noveck and Posada (Reference Noveck and Posada2003) and de Neys and Schaeken (Reference Neys and Schaeken2007). Again, these findings pose a challenge to the default account's prediction that, in cases where the scalar term is interpreted without an implicature, the implicature has nevertheless been generated and then cancelled.
14.2.2 Implicatures drawn from generalised and ad hoc scales
Accounts of quantity implicature differ in the extent to which they posit a difference between generalised and particularised conversational implicatures. In the previous section I reviewed evidence suggesting that scalar implicatures, a subclass of generalised implicatures, are not generated by default. If this is on the right track, then generalised and particularised implicatures are presumably generated by the same mechanism. However, the evidence for this is so far only indirect. It is therefore desirable to test directly whether the two types of implicature are generated in the same way.
The comparison between scalar and ad hoc quantity implicature was studied by Papafragou and Tantalou (2004). To study whether a participant generated an implicature they used the underinformative utterance paradigm. In this paradigm participants are presented with a situation where a certain proposition is true (e.g. the doll painted all the stars, or the mouse ate the cheese and the ham). However, a puppet describes the situation using an informationally weaker proposition which triggers a quantity implicature (e.g. she would say ‘the doll painted some of the stars’ or ‘the mouse ate the cheese’). Participants who interpret the utterance with a quantity implicature should reject it as a description of the situation.
Specifically, Papafragou and Tantalou examined whether Greek participants reject underinformative utterances with three types of scales / contrasts: the generalised lexical quantifier scale <some, all>, contrasts that rely on encyclopaedic world-knowledge such as <cheese, sandwich>, and so-called ad hoc contrasts that are evoked only in specific conversational contexts and could not in any sense generalise across situations (e.g. <{parrot}, {doll}, {parrot and doll}>). The critical utterances in these cases are those which are strictly speaking true but which have the potential to give rise to false implicatures to the effect that the stronger term of the scale does not hold. The task was oriented towards young children, addressing in particular the claim (typical of default accounts) that children acquire the implicatures of context-independent generalised scales sooner than the truly Gricean particularised implicatures of context-dependent contrasts.
In their experiment, adults and 5-year-old children were presented with act-out scenarios in which a puppet would receive a reward if (s)he performed a task which involved achieving the stronger term of the scale, e.g. if (s)he managed to colour all of the stars, to eat the sandwich, or to wrap up the presents (the parrot and the doll). The puppet went away and performed the action hidden from the participant's view, and then came back to report that (s)he had achieved something less than the goal that was set, by saying e.g. ‘I coloured some of the stars’, ‘I ate the cheese’, ‘I wrapped the parrot’. The participants were then asked to decide whether or not the puppet should receive the reward.
In each critical underinformative condition, the adults always withheld the reward. The children withheld the reward in over 70 per cent of the trials, and they were able to justify their response on the grounds that the puppet did not complete the task. Numerically, children were more sensitive to violations with contrasts that are specific to some context than with logical or encyclopaedic ones (withholding the reward in 90%, 77.5% and 70% of cases respectively), but this difference did not reach statistical significance. The authors interpreted their data as supportive of unitary models of pragmatics, in that the reward was withheld at comparable levels regardless of whether or not the critical term belonged to a generalised scale.
However, in interpreting these findings, it is necessary to sound a note of caution. The experimental design is atypical within the literature (for other applications of this paradigm see Noveck, Reference Noveck2001, experiment 3; Guasti et al., Reference Guasti, Chierchia, Crain, Foppolo, Gualmini and Meroni2005, all experiments; Pouscoulous et al., Reference Pouscoulous, Noveck, Politzer and Bastide2007, experiment 1). In a typical task involving underinformative utterances, participants who do not detect that an utterance is underinformative can straightforwardly accept the utterance, while participants who do detect the underinformativeness can reject it. In Papafragou and Tantalou's task the picture is less clear. If the puppet is taken to be informative, participants should withhold the reward, because they can infer that the task has not been completed. However, if the puppet is understood to be underinformative (e.g. ‘I wrapped the parrot’ is interpreted as ‘I wrapped the parrot, and it is possible that I also wrapped the doll’), then participants are again entitled to withhold the reward, as they have no way of knowing with certainty whether the task has actually been completed. The grounds for withholding the reward are therefore potentially ambiguous, which undermines the results of the study. It must be noted, however, that the justifications given by participants are consistent with the informative interpretation of the utterances, which suggests that the findings are at least indicative.
Building on this study, Katsos and Smith (Reference Katsos, Smith, Franich, Iserman and Keil2010) investigated the same question using the standard paradigm for underinformative utterance tasks. In this paradigm, participants watch the situation unfold, and can therefore tell whether or not an utterance is underinformative with respect to the actual situation. Katsos and Smith tested 7-year-old English-speaking children as well as adults. Corroborating the findings of Papafragou and Tantalou (2004), they found no advantage for generalised scales in the child groups. Indeed, the numerical tendency of Papafragou and Tantalou's study reached significance here: underinformative utterances with ad hoc contrasts were rejected at higher rates than underinformative utterances with generalised scales. This result is clearly not predicted by the default account, but nor is it straightforwardly predicted by unitary accounts, which expect a uniform pattern of development for all scales.
Relevant data are also reported by Katsos and Bishop (2011), who tested a group of 5-year-old English-speaking children as well as a group of adults. In this study, which involved younger children than Katsos and Smith (Reference Katsos, Smith, Franich, Iserman and Keil2010), the difference between generalised and ad hoc contrasts was numerically in favour of ad hoc contrasts but did not reach statistical significance. However, a novel pattern emerged with the adults. While adults objected to underinformative utterances with generalised and ad hoc contrasts at ceiling rates, an indirect, qualitative advantage for generalised expressions over ad hoc ones was obtained. That is, rejections of underinformative utterances were of two types: straightforward rejections, and indirect rejections phrased as revisions, hedging remarks, ambivalent judgements or metalinguistic comments (‘Yes, but he painted the heart as well’; ‘This was half right, half wrong’; ‘It's not false, but he missed something’; ‘This one is tricky!’; ‘This is technically correct’). While roughly 15 per cent of the adult objections to underinformative utterances with scalar contrasts were indirect, over 40 per cent of the adult objections to underinformative utterances with ad hoc contrasts were indirect. If we take the straightforwardness of the response as an index of how participants treat underinformativeness, we can interpret this as evidence that adults treat violations of informativeness with generalised scales as graver than violations with ad hoc contrasts.
To summarise, all three studies – Papafragou and Tantalou (2004), Katsos and Smith (Reference Katsos, Smith, Franich, Iserman and Keil2010) and Katsos and Bishop (2011) – administered a version of the underinformative utterances task and reported differences between context-independent generalised expressions and context-dependent ad hoc expressions. Children performed better with ad hoc expressions than with generalised expressions (rejection rates were always numerically higher, and in Katsos and Smith the difference also reached statistical significance). Moreover, Katsos and Bishop found a qualitative advantage for generalised expressions over ad hoc ones in the adult data. Thus, we arrive at a picture that is not predicted by any of the existing theories.
One suggestion to explain the child data focuses on the kinds of violation evoked by the specific generalised and ad hoc contrasts tested. In this methodology, a speaker who is underinformative with regard to the generalised quantifier scale is correct about the type of object acted upon (e.g. carrots rather than pumpkins) but omits information about the quantity of objects (some rather than all). The speaker has thus met some of the informativity requirements (kind of objects) but failed others (quantity of objects). A speaker who is underinformative with regard to the ad hoc scale, by contrast, omits information both about how many objects were acted upon and about the identity of one of the objects. A child who considers it important to give information first and foremost about the kind of objects acted upon might plausibly tolerate the former kind of underinformativeness while rejecting the latter. The difference between scales may therefore reflect younger children's primary focus on avoiding, and objecting to, violations concerning the kind of objects involved.
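This suggestion can be rendered as an all-or-nothing caricature of what is presumably a graded tendency. The dimension labels and the child ‘strategy’ below are hypothetical, introduced only to make the proposed explanation concrete.

```python
# Which informativity dimensions does an underinformative utterance violate?
def violated_dimensions(scale: str) -> set:
    if scale == "generalised":   # 'some' for 'all': kind right, quantity wrong
        return {"quantity"}
    if scale == "ad_hoc":        # 'the parrot' for 'the parrot and the doll'
        return {"quantity", "kind"}
    raise ValueError(scale)

def child_objects(scale: str) -> bool:
    # Assumed strategy: object whenever the 'kind of object' dimension is violated.
    return "kind" in violated_dimensions(scale)

# Predicts more objections to ad hoc than to generalised underinformativeness,
# in line with the numerical tendency reported above.
assert child_objects("ad_hoc") and not child_objects("generalised")
```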
Turning to the adult group, we note that the pattern of privileged treatment is reversed. The indirect privilege of the generalised scale (in terms of how categorically underinformative utterances were rejected) can be interpreted in two ways: either it reflects a special status of generalised scales in the linguistic system (as per default accounts), or it is due to some other factor. As the default account was not upheld for the child groups, and it is not upheld by the studies on the role of context reviewed in section 14.2.1, one would need to postulate some non-obvious reason why it should nevertheless apply to adults. Among the other possible factors is frequency of contrast: in actual language use, ‘some’ is clearly far more often contrasted with ‘all’ than, for instance, ‘the triangle’ is contrasted with ‘the triangle and the heart’. This may explain why the privileged status of generalised scales is manifest only in the adults, the group with the greatest exposure to language. Horn's (Reference Horn and Schiffrin1984b; see also Reference Horn, Horn and Ward2004) non-default account of implicature and informativeness may be compatible with this explanation. He proposes that the contexts in which the terms of a generalised scale are contrasted with one another are quantitatively more numerous than the contexts in which the terms of an ad hoc scale are contrasted. Thus, context-independent generalised scales are associated not with default implicatures but with default contexts of occurrence. This account is compatible with the data presented here, if we assume that the adults’ far richer experience with language and contexts of use makes them, unlike the children, sensitive to this special property of the terms of a generalised scale. In summary, the differences reported between generalised and ad hoc expressions do not conform with the predictions of the default account.
14.2.3 Scalar implicature and the localist--globalist distinction
Another point of contention between competing theories of quantity implicature concerns the locus of the inference. In the traditional neo-Gricean understanding of implicature, the input to the pragmatics is the output of the semantic system, which is nothing less than a full proposition. For quantity implicature specifically, this means that the linguistic unit that is evaluated for whether it was informative or not is the whole proposition that is expressed by a speech act. To turn to the examples in (3a) and (4a), the standard neo-Gricean process can derive the (c) implicature: these statements are simply the negations of the stronger alternatives to (a), ‘George believes that all of his advisors are crooks’ and ‘Every student passed all of the tests’ respectively. However, in these widely discussed examples, cited here from Benjamin Russell (Reference Russell2006), it has been argued that it is also possible to interpret these utterances as conveying the (b) implicatures. These implicatures can only be generated if the input to the quantity maxim is a unit which is different to the full proposition expressed by (3a) and (4a). In fact, the input has to be a sub-propositional expression, the complement of the belief clause and the object of the verb respectively. In the terminology used, the implicatures in (c) are called global or post-propositional, while the ones in (b) are called local or sub-propositional.
(3)
a. George believes that some of his advisors are crooks.
b. George believes that not all of his advisors are crooks.
c. George does not believe that all of his advisors are crooks.
(4)
a. Every student passed some of the tests.
b. Every student passed some but not all of the tests.
c. Not every student passed all of the tests.
On the strength of the observation that the (b) implicatures are possible, some accounts of scalar implicature take a localist stance, and predict that (3a) and (4a) should be interpreted with the local implicature. Such accounts typically also commit to a default view of scalar implicature, although these are in principle independent considerations. In a similar vein, it has been observed (by Levinson and others) that implicatures appear in some cases to enter into the truth conditions of the proposition which gives rise to them. This is also impossible on the traditional Gricean account. Establishing the presence of these aspects of meaning in the truth conditions of the proposition is a sensitive issue. However, one commonly accepted criterion is passing the scope test, according to which only aspects of meaning that can be part of what is denied or supposed, or generally fall within the scope of logical operators, are truth-conditional. Carston (Reference Carston and Bianchi2004b) discusses the history of this approach. Under this criterion, widely cited examples such as (5a--c) have convinced Levinson (Reference Levinson2000: 198ff.), M. Green (1998) and others that scalar implicatures can intrude into truth conditions.
(5)
a. It is better to eat some of the cake than it is to eat all of it.
b. You shouldn't be too upset about failing some of your exams; it's much better than failing the whole lot.
c. Because the police have recovered some of the gold, they will no doubt recover the lot.
According to these analyses, the scalar implicature (that ‘some’ implies ‘some but not all’) must fall within the scope of logical operators in the above examples in order for the constructions to be felicitous. Since this is not compatible with the classical neo-Gricean account, it appears that some kind of encapsulated default-pragmatic system must be involved in generating these inferences, which are neither semantic nor fully pragmatic.
However, Chierchia (Reference Chierchia and Belletti2004) and Horn (Reference Horn, Horn and Ward2004), among others, doubt that these examples show intrusion into truth conditions per se. They argue that what is involved is post-propositional accommodation of the inference, which is triggered retrospectively once the sentences following the scalar terms are processed in order to avoid a contradiction. Horn (Reference Horn, Horn and Ward2004) further suggests that the extraordinary nature of this process is indicated by the requirement for focus intonation on the scalar term (an observation which is also part of King and Stanley's (Reference King, Stanley and Szabó2005) account; and see Geurts, Reference Geurts2009).
Localism does not necessarily take a position on whether SIs can intrude upon truth conditions: it only makes predictions with regard to the domain (sub- or post-propositional) in which pragmatic principles may operate. It is also conceivable to detach localism from defaultism: a local but non-default theory, in which SIs are generated locally but are context-dependent, is in principle coherent. However, these two claims have tended to go hand-in-hand for Levinson, and perhaps even for Chierchia.
Localist accounts argue that the occurrence of local implicatures is a grave challenge for global accounts. Globalists have addressed this challenge in three ways: by scrutinising the localists' examples case by case, by deriving the apparently local readings through global means, and by questioning the empirical basis of the challenge itself (on which see below). According to Benjamin Russell (Reference Russell2006) and Geurts (Reference Geurts2009), one can consider the instances brought forward by the localists one by one, and assess in each case whether they involve true implicature generation or some other process. (This is similar to Chierchia's and Horn's responses to cases cited in Levinson such as (5) above.) Moreover, the global account can derive apparently local SIs if some non-arbitrary assumptions are taken into account. In the case of examples such as (3a), Russell (Reference Russell2006) provides a globalist derivation of the ‘local’ implicature by adding the assumption that George is opinionated, or at least biased towards taking a stance on his beliefs. That is, George is supposed either to believe that something is the case or to believe that it is not the case; we exclude the possibility that he simply holds no belief either way. With regard to the proposition that all George's advisors are crooks, the global implicature (3c) is compatible with two situations: one in which George simply does not hold the belief that all his advisors are crooks, and one in which he believes that it is not the case that all his advisors are crooks (the latter being the situation in (3b)). The assumption that George is opinionated rules out the first situation, leaving only the situation in which George believes that it is not the case that all his advisors are crooks. Ergo, a global implicature augmented with assumptions about the interlocutor's epistemic stance can generate what looks like a local implicature. The process that generates the SI, however, is fully Gricean.
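Russell's derivation can be sketched by enumerating George's possible epistemic states. The reduction to three states is a simplification introduced here purely for illustration.

```python
# George's possible stances towards 'all of his advisors are crooks'.
states = {"believes_all", "believes_not_all", "no_opinion"}

# Global implicature (3c): George does not believe that all are crooks.
after_global = {s for s in states if s != "believes_all"}

# Auxiliary assumption: George is opinionated -- he either believes the
# proposition or believes its negation, so 'no_opinion' is excluded.
after_opinionated = {s for s in after_global if s != "no_opinion"}

# Only the 'local' reading (3b) survives, via purely global machinery.
assert after_opinionated == {"believes_not_all"}
```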
Besides these responses, it is possible to cast doubt upon the very foundations of the localist challenge. Geurts and Pouscoulous (Reference Geurts, Pouscoulous, Egré and Magri2009a, Reference Geurts and Pouscoulous2009b) have asked whether local implicatures are actually as readily available as has been assumed. To investigate this, they presented participants with unembedded and embedded instances of propositions containing the existential quantifier, as in (6a) and (6b), and asked them to respond to the corresponding questions (6a′) and (6b′).
(6)
a. Fred heard some of the Verdi operas.
b. Betty thinks that Fred heard some of the Verdi operas.
a′. Would you infer from this that Fred didn't hear all the Verdi operas?
b′. Would you infer from this that Betty thinks that Fred didn't hear all the Verdi operas?
While the localist account predicts equal rates of acceptance for (6a′) and (6b′), the global account predicts that participants will be much more prone to accept (6a′) than (6b′). This is indeed what Geurts and Pouscoulous document: participants endorsed unembedded SIs at a rate of 93 per cent, but embedded SIs at only 50 per cent. Moreover, they found that the rates of local implicatures vary substantially between conditions: while the rate of generation is as high as 50 per cent for embedding under ‘think’, it is as low as 3 per cent for embedding under the universal quantifier (as in example 4a). These findings clearly indicate that local implicatures are not derived with the consistency expected by local accounts. In fact, bearing in mind that the local implicature under ‘think’ can be derived using a global process of inference, as proposed by Russell (Reference Russell2006), the evidence from embedding under ‘every’ suggests that there may be no truly local SIs, as predicted by global accounts.
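The logic of this comparison can be stated compactly. The 0.1 ‘comparable rates’ threshold below is an arbitrary illustrative choice, not part of either account.

```python
# Endorsement rates reported by Geurts and Pouscoulous, as cited in the text.
rates = {"unembedded": 0.93, "embedded_think": 0.50, "embedded_every": 0.03}

# Localist prediction: embedded SIs roughly as available as unembedded ones.
localist_ok = abs(rates["unembedded"] - rates["embedded_think"]) < 0.1

# Globalist prediction: markedly fewer embedded than unembedded SIs.
globalist_ok = rates["unembedded"] > rates["embedded_think"] > rates["embedded_every"]

# The observed pattern fits the global account and not the local one.
assert globalist_ok and not localist_ok
```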
Of course, it should further be remarked that these investigations disconfirm the local account without necessarily providing positive evidence for the global one. They do not investigate whether participants generate the global implicature associated with (6b) (which the global account also predicts, just as it predicts the absence of local implicatures), nor do they show that the apparent instances of local implicature do not arise through authentic scalar implicature. While the former can easily be tested with the existing paradigm, simply by asking participants whether the global SI follows from (6b), addressing the latter question is perhaps less straightforward.
14.3 Overview and outlook
In the previous sections we reviewed empirical investigations of quantity implicature that have been motivated by debates in the theoretical pragmatics literature. The evidence accumulating on the role of context (section 14.2.1) suggests that scalar implicatures, an instance of generalised implicatures, are generated only when relevant to the contextual purpose (which is considered a sine qua non condition for particularised implicatures as well). Further direct comparisons between scalar and non-scalar, ad hoc quantity implicatures in child language acquisition (section 14.2.2) document that scalar implicatures are not privileged in development. When we look at these two strands of research in combination, the emerging pattern strongly disconfirms the predictions of the default account, which postulates a special pragmatic system dedicated to the generation of generalised implicatures. Instead, the findings are in line with the predictions of unitary accounts. However, the evidence from the adult data reported by Katsos and Bishop (2011) on the comparison of scalar and non-scalar implicatures calls for a careful treatment of the distinction between generalised and particularised implicatures: rather than being dismissed outright, the distinction can be reinterpreted as a frequency-based observation about contexts of occurrence rather than as a theory-critical difference in mechanisms of derivation.
Finally, with respect to the locus of SI generation, in section 14.2.3 we reviewed empirical evidence that ‘local’, embedded SIs are inferred far less frequently than ‘global’, non-embedded SIs, and that in many cases the former are apparently not available to participants at all. All in all, these experimental investigations contribute evidence against default and local accounts of scalar implicature. It is worth reiterating that not all unitary or default accounts take (or need to take) a position on all these issues: it is possible to have a local account without assuming defaultness (in the sense of independence from discourse context), or vice versa. However, the emerging picture favours accounts which take a more-or-less orthodox Gricean position, where a single pragmatic system is responsible for deriving both generalised and particularised implicatures in a post-propositional process. Crucially, from the perspective of cognitively oriented pragmaticians, these investigations contribute towards a model of language processing and acquisition in which Gricean maxims are not just philosophical norms but psycholinguistically valid principles.
Although we have seen how various methodologies may be gainfully employed when we come to operationalise competing theoretical proposals, we must remain vigilant that the responses of participants are in fact conditioned by the variables that we wish to test. Moreover, I sounded a note of caution about how minor changes in the details of an experimental design can affect the interpretation of the data (section 14.2.2). It is also important to highlight potential terminological pitfalls, such as the use of the term ‘default’ to refer to accounts which associate inferences with lexical expressions. The evidence we reviewed disconfirmed the predictions of these ‘lexical-default’ accounts, without bearing on other accounts which employ the concept of defaultness in different ways and at different levels (e.g. see Jaszczolt's Reference Jaszczolt2005 conceptual-default account).
Notwithstanding these difficulties, we see that the relationship between concept and experiment is a productive one, as far as quantity implicature is concerned. Clear predictions at a theoretical level motivate empirical study, and the drive towards empirical investigation motivates clarity at a theoretical level. Moreover, critical evaluation of the nature of responses to experimental stimuli can broaden our theoretical base by suggesting the relevance of additional factors in actual linguistic contexts, and thus rendering these additional factors amenable to theoretical formalisation. For quantity implicature, as for many topics in semantics and pragmatics, the collaboration of theorist and experimentalist appears set to pave the way for future progress at both levels.