4.1 Introduction
Languages provide speakers with a variety of means for inserting the speech of others into their own discourse, ranging from verbatim recitation of the other’s words (1a), to a recording of the content of what the other said (1b), to brief reports of the other’s speech act, with or without the content of what was said (1c–d):
(1) a. My friend John said, “I can meet you tomorrow at noon if I can find a ride.”
b. John told me that he could probably meet me the next day at noon.
c. He agreed to meet me.
d. He provisionally agreed.
“At its root, speech presentation is a pragmatic issue” (Moore 2016: 482). As Collins (2001: xiv) observes:
One of the reasons why [reported speech] is of significance for pragmatics is that the differences between its formal varieties cannot be understood in any meaningful sense without reference to pragmatic (contextual) factors … the choice of a given strategy is determined by the larger structure of the discourse and by the communicative intentions of the speaker or writer … Speakers and writers choose the form that they perceive as potentially most effective for what they want to communicate and, concomitantly, for how they intend to organize their texts.
According to Bublitz and Bednarek (2006), the central pragmatic function of reported speech is evaluative. The way in which the source of the speech is labeled, described, or evaluated (my friend John, John, he) can influence “the hearer’s judgment of the reliability of what is reported” (550–551). The reporting expressions used (said, told me, (provisionally) agreed) may indicate the degree to which the speaker agrees with what was said or positively or negatively evaluates it. And the category of speech representation chosen (direct speech, indirect speech, report of speech act, etc.) has a range of different expressive functions, as we will discuss in the course of this chapter. Bublitz and Bednarek (2006) also suggest that speech representation can have a social function in establishing and maintaining social relationships and a textual function in foregrounding or backgrounding elements in the narrative. Representation of the thoughts of others can take similar forms and serve many of the same functions.
The historical study of speech and thought representation has focused on three questions as they have changed over time (cf. Grund and Walker 2021a on speech representation):
1. The mechanics of representing speech and thought, such as the reporting verbs used, the placement of reporting clauses, or even the use of quotation marks. As Grund and Walker note, “the present-day system of marking is not directly reflected in historical materials” (2021a: 7).
2. The categories of speech and thought representation existing in earlier stages of the language compared to those of Present-day English, including the extent to which they are found in different genres.
3. The textual, discursive, and sociopragmatic roles served by the different types of speech and thought representation. Grund and Walker point to a number of functions, such as the expression of evidentiality, credibility, authority, detachment or distancing, dramatization, negotiation of relationships, or characterization and theme; they argue that the same effects existed historically as do today.
In this chapter, we begin with the framework of speech and thought representation identified for Present-day English. We then take a diachronic approach to see how speech and thought have been represented over the course of English language history and how pragmatic concerns are central for the use and development of discourse representation resources and functions. You will be introduced to speech representation in Old and Middle English and to speech and thought representation in Early Modern English and Late Modern English. The development of quotation marks in early printing provided a dedicated means of representing direct speech. The rise of free indirect speech is often associated with the rise of the novel in the Late Modern English period. Finally, the appearance of new reporting verbs, such as be like, is characteristic of Present-day English.
Note that the terms scholars use – speaking of the “report,” the “presentation,” or the “representation” of speech and thought – imply different views of the relationship between the original speech or thought act and its record. I have chosen “representation” because the recorder is often not just repeating the earlier speech or thought by rote but actively creating or constituting it.
4.2 Categories of Speech and Thought Representation
One of the most widely recognized categorizations of speech and thought in Present-day English is that of Semino and Short (2004), which is an adaptation of the framework originally proposed by Leech and Short (2007[1981]). Apart from pure narration, they propose a set of parallel categories for speech, thought, and writing on a cline from the most narrator-controlled to the most autonomous categories and also ranging from most to least summarizing (see Table 4.1). It is widely acknowledged that these categories are quite porous, with somewhat fuzzy boundaries, leading to mixed or indeterminate forms. However, the schema is a useful starting point for the understanding of speech and thought representation. (For the sake of space, I will leave the parallel categories of writing representation out of the following discussion.)
Table 4.1 Cline of speech and thought representation

The most autonomous forms are (free) direct speech ((F)DS) and (free) direct thought ((F)DT). The only difference between the free and non-free form is the absence of a reporting clause in the free form. The structure of the form is biclausal, with two semi-independent clauses: the reporting clause (belonging to the narrator or reporter) and the quoted clause (belonging to the speaker or thinker). The reporting clause – often called the inquit (Latin for ‘he/she/it says’) or quotative clause – may precede, follow, or intervene in the direct speech/thought, and in medial and final position it frequently involves inversion of the subject and verb. In (2), an exchange between Jane Eyre and Mr. Rochester in Jane Eyre is presented in free direct speech, ending with direct speech from Mrs. Fairfax, with the reporting clause “remarked Mrs. Fairfax.”
(2)
“You are very cool! No! What! A novice not worship her priest! That sounds blasphemous.” “I disliked Mr. Brocklehurst; and I was not alone in the feeling. He is a harsh man; at once pompous and meddling; he cut off our hair; and for economy’s sake bought us bad needles and thread, with which we could hardly sew.” “That was very false economy,” remarked Mrs. Fairfax, who now again caught the drift of the dialogue. (1847 C. Brontë, Jane Eyre; B. Busse 2020: 115–116)
Direct speech is independent of the reporter, presenting a verbatim – or at least faithful – replication of what the speaker said. All of the expressive material within quotation marks is purported to belong to the original speaker. (We know, of course, that much of what is presented as direct quotation, even in real-life situations, is invented or constructed.) Pragmatically, (F)DS may be used for vividness, liveliness, or dramatic effect, is a foregrounding device, and provides a sense of authenticity or credibility. FDS has no overt narrative intervention and may be used to capture the rapid nature of conversational exchanges.
(F)DT presents thought as internal speech. It is suggestive of an omniscient reporter who has inner knowledge of the minds of others. Again, all of the expressive material apart from the reporting clause (when it is present) belongs to the consciousness of the thinker. The thoughts are generally understood as conscious and may represent a sudden realization or revelation on the part of the thinker. (F)DT is typically not set off by quotation marks (3a) but it may be (3b):
(3) a. Only these, thought I—what an education! (1910 Thackeray, Book of Snobs; B. Busse 2020: 127)
b. “And when,” thought Emma, “will there be a beginning of Mr Churchill?” (1815 Austen, Emma; www.gutenberg.org/files/158/158-h/158-h.htm)
FDT, especially when it is ungrammatical and is disjunctive or disconnected, is often associated with what is called interior monologue or stream-of-consciousness.
Indirect speech (IS) and indirect thought (IT) are presented entirely from the point of view of the reporter, as we can see from the shifted pronominal, deictic, and tense forms, which conform to the time and place or “deictic center” of the reporter. They are fully integrated into the discourse. The expressive elements of DS and DT are eliminated, with the indirect form expressing the content or “gist” but not the exact wording of the speech or thought. Indirect discourse consists of a single main clause (the reporting clause), which always assumes initial position, and a subordinate clause (the reported material), expressed in a nominal or infinitival clause. No non-embeddable structures are permitted. The complementizer (that, if, whether) may or may not be present, and there is no graphic device distinguishing the two parts, as there is in DS. A constructed comparison is a helpful way of identifying the formal differences between direct and indirect speech most clearly, though there is not a derivational or one-to-one relationship between direct and indirect forms:
Elizabeth asked John, “Oh, will you, my dear friend, write that ridiculous note for me tomorrow?”
Elizabeth asked John whether he would write that (“ridiculous”) note for her the next day.
Here we see the shift of the speaker’s you to the reporter’s he, of me to her, of will to would and of tomorrow to the next day. The direct question with auxiliary-subject inversion is replaced by an embeddable form, an indirect whether clause with subject–auxiliary order. Indirect speech may sometimes contain quoted expressive elements from direct speech (“ridiculous”), but most such elements, for example, interjections (“oh”) and vocatives (“my dear friend”), are eliminated. IS is often used for backgrounding, summarizing, or providing factual information.
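The shifts in this constructed pair can be laid out mechanically. The following Python sketch is only an illustration of the four shifts and the loss of expressive elements just described. It is not a general procedure, since, as noted above, there is no derivational relationship between direct and indirect forms. The function name, the lookup tables, and the assumption that the quoted question begins with auxiliary plus subject are all hypothetical choices made for this one example.

```python
import re

# Toy lookup table covering only the shifts named for this constructed pair.
# These mappings are assumptions for illustration, not general rules of English.
SHIFTS = {
    "you": "he",                 # addressee pronoun -> reporter's third person
    "me": "her",                 # speaker pronoun -> reporter's third person
    "will": "would",             # backshifted modal
    "tomorrow": "the next day",  # time deictic re-anchored to the reporter
}

# Expressive elements (interjection, vocative) that indirect speech drops.
EXPRESSIVES = [
    (r"^oh,\s*", ""),
    (r",\s*my dear friend,\s*", " "),
]


def to_indirect_question(reporting_clause, quoted):
    """Recast a direct yes/no question as a whether-clause (hypothetical helper)."""
    text = quoted.strip().rstrip("?")
    for pattern, replacement in EXPRESSIVES:
        text = re.sub(pattern, replacement, text, flags=re.IGNORECASE)
    words = text.split()
    # Undo auxiliary-subject inversion: "will you ..." -> "you will ..."
    words[0], words[1] = words[1], words[0]
    shifted = [SHIFTS.get(word.lower(), word) for word in words]
    return f"{reporting_clause} whether {' '.join(shifted)}."


result = to_indirect_question(
    "Elizabeth asked John",
    "Oh, will you, my dear friend, write that ridiculous note for me tomorrow?",
)
print(result)
```

The sketch keeps ridiculous as an ordinary word; it has no way of deciding whether an expressive adjective should be retained in quotation marks, which is a choice the reporter makes on pragmatic grounds.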
Here are some literary examples of IS (4) and IT (5).
(4) He explained, in gentle and convincing tones, that his wife had started at a moment’s notice for Brittany to her dying mother; (1907 Conrad, The Secret Agent; www.gutenberg.org/files/974/974-h/974-h.htm)
(5) and this persuasion, joined to all the rest, made her think that she must be a little in love with him, in spite of every previous determination against it. (1816 Austen, Emma; B. Busse 2020: 135)
Direct speech in (4) would be “My wife has started …”, whereas direct thought in (5) would be “I must be a little in love with him.” In (5) “a little in love” is ambiguous: it may be attributed to Emma’s own verbalized thought or it could be a narrative paraphrase of the content of her thought.
In between direct and indirect speech/thought is free indirect speech and thought. This form has received a wealth of scholarly attention (see Further Reading). It goes by a number of different names: erlebte Rede, style indirect libre, empathetic narrative, narrated monologue, represented speech and thought, and so on. I will refer to it collectively as free indirect discourse (FID). Several controversies surround this form: how and when it developed (discussed below), whether it is “single” or “dual” voiced, that is, whether we are hearing just the voice of the original thinker/speaker or whether we hear both that and the voice of the narrator, whether it may express empathetic identification with the character or ironic distancing, or both, whether it is restricted to literary genres or can be found in other genres, whether the sentences of FID are “unspeakable” (because they lack the I–you of communication), and whether all or most of the formal features must be present to qualify as FID. Here is a prototypical example of free indirect thought:
(6) Never had she imagined she could look like that. Is mother right? she thought. And now she hoped her mother was right. (1922 Mansfield, “The Garden Party”; Brinton 1980: 366)
In this example, we see many of the characteristic features of the style, including:
a third-person subject of consciousness (expressed pronominally) to whom all of the expressive content belongs (she);
reference to others by means of pronouns, pet names, or familial relationships (mother);
narrative past tense cooccurring with present or future time deictics representing the here and now of the character’s act of consciousness (now she hoped);
shifted modals with past time reference (could);
non-embeddable, independent clauses of direct discourse, such as questions, imperatives, fragments, sentences with fronting, dislocations, initial conjunctions, and so on (Is mother right?); and
an optional reporting clause, typically appearing final or medial (she thought).
Other characteristics include expressive lexical items belonging to the character (e.g., dear, miserably, fool), dialect or pronunciation features, idioms or colloquialisms of the character, the past progressive, (reflexive) pronouns with no antecedent in the preceding discourse, and expressive structures such as interjections, exclamations, sentence adverbs, repetitions, hesitations, pragmatic markers, or fillers.
The advantage of FID is that it creates a seamless juncture between narrative (the external discourse) and speech and thought. It allows the (verbatim) speech and thoughts of characters – of other consciousnesses – to be presented in the third-person narration with no narrative break. It leads to an alignment of character and narrator, thus overcoming the limitations of one narrator/one point of view and leading to the loss of authorial omniscience. It “erases some of the impression of clear hierarchy, of an external teller and an observed character being told about” (Toolan 2006: 705). Inner and outer worlds become one, giving a sense of immediacy. But it does not completely sacrifice narrative control (exerted by narrative tenses and pronouns) and thus retains the possibility of distancing. In free indirect speech, speech is presented as it is experienced (interpreted) by others, and thus the form may be used for the purposes of irony or parody in a way that DS cannot. In free indirect thought, we are given an illusion of the character’s mental state. Thoughts are not assumed to be fully verbalized but may be unarticulated, partly conscious, or on the threshold of verbalization. This avoids the artificiality of direct thought. We are inside the head of the character and hear the inner voice with which consciousness addresses itself.
In the free indirect speech below, we hear the speech of the conjurer (7a) and of Lizzie and the Secretary (7b), and in the free indirect thought, we have direct access to Emma’s (8a) and Arthur’s (8b) private thoughts. FID is often recognized by some cues in the immediate context which direct the reader to the character’s speech or thought:
(7) a. The object of his discourse was a panegyric of himself and a satire on all other conjurors. He was the only conjuror, the real one, a worthy descendant of the magicians of old. (1826 Disraeli, Vivian Grey; B. Busse 2020: 111)
b. Lizzie was very desirous to thank her unknown friend who had sent her the written retraction. Was she, indeed? observed the Secretary. Ah! Bella asked him, had he any notion who that unknown friend might be? He had no notion whatever. (1864–5 Dickens, Our Mutual Friend; Fludernik 1993: 119)
(8) a. She then took a longer time for consideration. Should she proceed no farther? – should she let it pass, and seem to suspect nothing? – Perhaps Harriet might think her cold or angry if she did; or perhaps if she was totally silent, it might only drive Harriet into asking her to hear too much … (1816 Austen, Emma; www.gutenberg.org/files/158/158-h/158-h.htm)
b. When he [Arthur Clennam] got to his lodging, he sat down before the dying fire … and turned his gaze back upon the gloomy vista by which he had come to that stage in his existence: So long, so bare, so blank. No childhood; no youth, except for one remembrance; that one remembrance proved, only that day, to be a piece of folly (1855–7 Dickens, Little Dorrit; Fludernik 1993: 237)
FID, especially free indirect thought, when it is lacking many of the grammatical markers, can be difficult to distinguish from pure narration; that is, we cannot know whether we are hearing the thoughts of the character or of the narrator. But it is this uncertainty which often contributes to the literary depth and complexity of the style.
At the most summarizing end of the scale of speech representation are narrative representation of voice (NV), which conveys in the briefest possible way that some communicative act has occurred (9), and narrative representation of speech act (NRSA), which expresses the illocutionary force of the speech act and sometimes the topic (10). In both cases, the wording is that of the narrator, as is the point of view. The effect is distancing or backgrounding, implying that the character’s speech and actual words are unimportant. Both forms have a summarizing or encapsulating function which serves to move the narrative forward:
(9) He then stepped across the pavement to her, and said something; she seemed embarrassed, and desirous of getting away; (1847 E. Brontë, Wuthering Heights, CLMET3.0; Grund 2021a: 109)
(10) The “How d’ye do’s” were quiet and constrained on each side. She asked after their mutual friends; (1816 Austen, Emma; www.gutenberg.org/files/158/158-h/158-h.htm)
In the realm of thought representation, we find narrative representation of thought (NT), which encapsulates a thinking process (11), and narrative representation of thought act (NRTA), which records a thought act (12) but has less content and immediacy than IT:
(11) He sat really lost in thought for the first few minutes; (1816 Austen, Emma; B. Busse 2020: 139)
(12) She put to herself a series of questions. (1919 Woolf, Night and Day; Semino and Short 2004: 45)
Semino and Short (2004) propose an additional category called internal narration (NI), “the presentation of mental states and changes which involve cognitive and affective phenomena but which do not amount to specific thoughts” (132). Internal narration allows access to a character’s internal viewpoint but does not explicitly specify that a thought act has occurred. This is similar to what Fludernik (1993) calls “psycho-narration.” We see this in (13):
(13) a. She was vexed beyond what could have been expressed – almost beyond what she could conceal. Never had she felt so agitated, mortified, grieved at any circumstance in her life. (1815 Austen, Emma; www.gutenberg.org/files/158/158-h/158-h.htm)
b. and when I could hardly see the dark mountains, I felt still more gloomily. The picture appeared a vast and dim scene of evil … (1818 Shelley, Frankenstein; B. Busse 2020: 142)
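The categories surveyed so far can be summarized as two ordered scales, running from the most narrator-controlled to the most autonomous form. The Python sketch below is one way of writing that ordering down; it is not a reproduction of Table 4.1. The abbreviations FIS and FIT for free indirect speech and thought, the placement of NI at the narrator-controlled end of the thought scale, and the helper name are assumptions made for illustration, and the strict ordering ignores the fuzzy boundaries between categories noted earlier.

```python
# Ordered from most narrator-controlled to most autonomous.
# FIS/FIT and the position of NI are assumptions of this sketch.
SPEECH_CLINE = ["NV", "NRSA", "IS", "FIS", "DS", "FDS"]
THOUGHT_CLINE = ["NI", "NT", "NRTA", "IT", "FIT", "DT", "FDT"]

LABELS = {
    "NV": "narrative representation of voice",
    "NRSA": "narrative representation of speech act",
    "IS": "indirect speech",
    "FIS": "free indirect speech",
    "DS": "direct speech",
    "FDS": "free direct speech",
    "NI": "internal narration",
    "NT": "narrative representation of thought",
    "NRTA": "narrative representation of thought act",
    "IT": "indirect thought",
    "FIT": "free indirect thought",
    "DT": "direct thought",
    "FDT": "free direct thought",
}


def more_autonomous(first, second, cline):
    """Return True if `first` lies closer to the autonomous end than `second`."""
    return cline.index(first) > cline.index(second)


# Free indirect forms sit between the indirect and the direct forms.
print(more_autonomous("FIS", "IS", SPEECH_CLINE))
print(more_autonomous("FIT", "DT", THOUGHT_CLINE))
```

Representing each scale as a simple list makes the relative position of any two categories easy to compare, at the cost of treating a porous cline as if its steps were discrete.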
Note that there are various refinements and alternatives to these categories and their labels. For example, Vandelanotte (2009) proposes a new category of “distancing indirect speech and thought” to account for an example such as the following:
(14) So I suggested we dine. But Priscilla wasn’t hungry. She had eaten too much of the smoked salmon at the reception. I proposed we visit a few of the places we had known together … Dancing, she claimed, would exhaust her utterly. Did I want that? (1961 Fuller, The Father’s Comedy; Vandelanotte 2009: 143)
In (14), we have IS (“So I suggested we dine,” “I proposed …”), but the rest, despite appearances, cannot be interpreted as FID, Vandelanotte argues. The proper name Priscilla and I (in “Did I want that”) would be she and he, respectively, in FID. Here the I-narrator takes over, draws the speech representation into his own perspective, structures the utterance from his own deictic perspective, and appropriates the original speaker’s expressivity. The discourse is from the point of view of the narrator, or current speaker, much as it would be in IS, not of the represented speaker, as it would be in FID. (On alternative schemas, see Further Reading.)
In a corpus study which looks at these categories of speech and thought (and writing) representation in prose fiction, newspaper reports, and (auto)biography in Present-day English, Semino and Short (2004) find that for the presentation of speech, (F)DS is the norm in all genres, followed by NRSA and IS. This is consistent with Leech and Short’s claim that “DS is a norm or baseline for the portrayal of speech” (2007[1981]: 268). Fiction privileges the direct end, (F)DS and free indirect speech, while non-fiction privileges the non-direct end, IS and NRSA. Thought is more often expressed in fiction and (auto)biography, where NI is the norm, followed by free indirect thought. NRTA and (F)DT are the least frequent. Semino and Short’s findings are inconsistent with Leech and Short’s claim (2007[1981]: 276) that IT is the norm for thought, a claim based on the assumption that the thoughts of others are not directly accessible to others and are not verbally formulated, so cannot be reported verbatim. But news reporting and autobiography do favor IT. News reporting rarely uses FDT or NRTA, and free indirect thought is absent. We will see below that the categories found in earlier genres differ from these.
4.3 Speech Representation in Old English
In Old English, only the representation of speech (and writing) is found. Thought representation is absent (Louviot 2016).
Prior to the advent of printing and the (eventual) adoption of quotation marks to denote direct speech, a multiplicity of strategies were used in medieval manuscripts to set direct speech off from narrative: these include physical aspects of the page (mise-en-page), such as rubricated letters, underlining, paragraph marks (¶), special line spacing, marginal notes, and various marks of punctuation (e.g., marks known as the virgule, the punctus, and punctus elevatus), but none of these is used exclusively for the purposes of marking quotation and they are often applied inconsistently even within the same manuscript (see Moore’s (2011, 2016) discussion of ME manuscripts). Punctuation of quotation is “sparse and unsystematic” in Old English (Louviot 2016: 59). The same means can be used for quoting the direct speech of characters and, perhaps even more often, for citing the words of authorities (such as scriptural quotations).
Lacking clear graphic devices, medieval manuscripts make use of reporting verbs, or verba dicendi ‘verbs of speech’, to mark direct speech. These appear in reporting clauses, which in Old English typically precede a passage of quoted speech, and in verse may often be quite long and elaborate (15a). A variety of verbs all meaning ‘to speak, say’ (e.g., maþelian, cweðan, sprecan, frignan, secgan, reordian) are used in Old English. Speeches may be followed by a final inquit which echoes the initial one (15b):
(15) a. Þa spræc guðcyning, Sodoma aldor, secgum befylled, to Abrahame him wæs ara þearf. “Forgif me …” (Gen A,B 2123; Louviot 2016: 50)
‘then spoke the battle king, the prince of Sodom, deprived of his men, to Abraham (he was in need of favours): “Grant me …”’
b. … Swa hleoðrode halig cempa, ðeawum geþancul. (And 461; DOEC)
‘… thus said the holy warrior, mindful of his servants’
Reporting clauses that interrupt or follow the passage of direct speech, what Cichosz (2018) calls “parenthetical reporting clauses,” often show inversion of the subject and the verb, especially with nominal subjects. In Old English, the most common verb here is cweþan, the origin of Modern English quoth (OED, s.vv. queath, v. and quoth, v.):
(16)
Þæt is soð, cwæð Beotius (Bo 26.59.10; Cichosz 2018: 189) ‘that is true, said Boethius’
In a study of direct speech in Old English verse, Louviot (2016) argues that speeches in Old English are lengthy, formal or public, and often spoken in isolation. There are what she calls “pseudo-exchanges” consisting of a series of speeches by the same character resembling a long speech interrupted by inquits. But in verse there are few examples of the back-and-forth exchanges that we expect in contemporary dialogues (though these may occur more often in other genres, such as saints’ lives). Louviot concludes that direct speech is a fundamental part of the narrative, allowing the poet to create salient points in the narrative and to actualize the narrative, but it is not a way to represent conversations (“what characters might have said”) or to supply characterization.
Both direct and indirect discourse have been available from the earliest English. IS in earlier English shows the backshifting of tenses, shifting of personal pronouns and of locative/temporal adverbs, and omission of expressive elements characteristic of the form in contemporary English. But Visser (1972: 775–779) points out the existence of a form consisting of an introductory reporting clause and complementizer that followed by the actual words spoken. We recognize this form by the lack of backshifting of the verbs is (17a) and purposeth (17b), though shifting of pronouns does occur (i.e., I to he in (17b)):
(17) a. Mid ðæm worde he cyðde ðæt hit is se hiehsta cræft, (CP 52.409.19; Visser 1972: 776)
‘he said that it is the highest craft’
b. He toke and tolde him his corage, That he purposeth a viage (1393 Gower, Confessio Amantis; Visser 1972: 780)
‘he took and told him his desire, that he plans a journey’
Visser argues that “[t]his is perhaps the oldest form of reporting speech or thought” (1972: 775), citing examples from Old English through the nineteenth century, and thus should not be considered an “exception.” Among mixed forms in Old English, a form that apparently shows indirect speech morphing into direct speech was termed “slipping” by Schuelke (1958). The passage in (18) begins indirectly (sege him þæt ‘tell him that’) but the unshifted pronouns ðe and ðu (bit ðe þæt ðu cume ‘command you that you come’) suggest direct speech:
(18) Ða cwæð se cyngc: Ga rædlice and sege him þæt se cyngc bit ðe þæt ðu cume to his gereorde. (Apollonius of Tyre 14.11; Visser 1972: 782)
‘then said the king “Go quickly and say to him that the king commands you that you come to his feast”’
Slipping was initially thought to be inadvertent, perhaps due either to a misconstrual of the first part as indirect discourse or to the scribe’s inability to maintain the shifted tenses and pronouns of indirect speech over longer passages. But more recent thinking (e.g., Richman 1986) is that the switch from indirect to direct speech may be a conscious technique, with direct discourse being used – on account of its increased vividness and drama – to emphasize the most important part of the quoted material.
4.4 Speech Representation in Middle English
In Middle English, reporting clauses continue to be the primary means of marking direct speech. Moore (2011) finds, for example, that in Chaucer’s Troilus and Criseyde, 92 percent of the instances of direct speech are marked by reporting clauses: in 64 percent of all instances the reporting clause precedes the quotation, and in 28 percent it is inserted within the quoted speech, typically immediately after its onset, as follows:
(19)
“Noon oother lyf,” seyde he “is worth a bene;” (1387 Chaucer, Canterbury Tales E.Mch 1263) ‘“No other life,” said he “is worth a bean,”’
(Note here that the quotation marks have been added by the modern editor.) Seien ‘to say’ becomes the primary reporting verb in Middle English, and it frequently shows inversion, as in this example. Cichosz suggests that the sudden popularity of seien may be due to the need for a more flexible verb than quoth, with inversion occurring by analogy with quoth as a means to distinguish the inquit from the comment clause I say (2018: 201–202). (In a comment clause, I say does not introduce speech but serves other (inter)subjective functions, such as emphasis, as in This much, I say, is indisputable [COCA]; see Chapter 3 §3.6.) In the Corpus of Middle English Prose and Verse, Moore (2011) finds that 80 percent of the instances of direct speech occur with reporting clauses, with the verb seien used over half the time. Seien has lost its propositional meaning to a great extent, serving primarily as a textual or boundary marker. Evidence for the reduced meaning is the occurrence of conjoined structures such as asked and seide or answerd & seyd, where say is always the second verb. While the second instance of say has been explained as an empty punctuation mark or pragmatic marker, or as a complementizer introducing direct discourse, Herlyn (1999) argues that it serves a cohesive function; it signals the quotedness of the following discourse and ties the material more closely to the narrative frame.
Moore (2011) considers three Middle English genres which have a large proportion of speech representation – defamation depositions, sermons, and histories – and in which we might expect a high degree of faithfulness to the defendant’s words, to scriptural passages and the opinions of church fathers, and to the speeches of historical figures, respectively. She argues, however, that the looser and less determined way of marking direct and indirect speech in Middle English suggests a laxity in faithfulness, perhaps because speakers did not feel that exactness was necessary and did not assume that speeches were reported in a verbatim manner. For example, in depositions, witnesses may represent speech, and scribes report it, in a way that makes it conform more with the legal standard of defamation than with what was actually said (which may be imperfectly remembered). In sermons, one finds biblical quotations that are loosely cited, paraphrased, wrongly attributed, or incorrectly sourced: “the levels of fidelity expected were not those of absolute precision that present-day readers expect from quotations of written sources” (112). In general, Moore argues, a strict distinction between the de dicto (‘about what is said’) quality of DS, its fidelity to the actual linguistic expression of speech, and the de re (‘about the thing’) quality of IS, its fidelity to the content or sense of speech, did not hold; direct speech could allow a de re interpretation. Moore sees this as closer to modern spoken discourse, in which it is often the case that direct speech is not verbatim but in a sense “constructed” or imagined (see also Moore 2002).
In Sir Gawain and the Green Knight, Pons-Sanz (2019) calculates that represented speech constitutes almost half of the lines of the poem. For speech, (F)DS is the norm, as it is in Present-day English. In contrast, the norm for thought is DT, or fully verbalized internal discourse, often accompanied by clauses such as “he said to himself”; thus, Pons-Sanz classifies this as a kind of internal direct speech. Pons-Sanz also groups NRSA together with NV as “narrated speech,” as both are expressed in single clauses and serve to summarize relatively unimportant parts. Based on the assumption that the Gawain-poet chooses carefully the mode of speech representation used, Pons-Sanz argues that the mode chosen has stylistic and pragmatic effects: it serves to emphasize aspects of the narrative and to shape the audience’s interpretation of the text. Moore (2011) also argues that the indeterminacy of modes of speech representation in Sir Gawain and the Green Knight as well as in Pearl highlights “the homiletic insights, moral dilemmas and narrative frames” (134) of the poem. Likewise, the permeability of speech forms in the Canterbury Tales allows for different voices (narrator, character, pilgrim) without clearly disentangling these voices; Chaucer uses this indeterminacy for artistic purposes.
4.5 The Development of Quotation Marks
The rise of quotation marks is associated with the advent of printing, but it took some time for this particular form of punctuation to become regularized and conventionalized. The development of a distinct set of markers for direct speech had not only typographical consequences but pragmatic ones as well, since it allowed authors to clearly distinguish between different voices in the text and to bring direct speech into the foreground of the narrative.
The quotation mark (or inverted comma) arose out of the diple ‘double’, a graphic symbol going back to Greek and Latin manuscripts. As discussed by Parkes (1992: 58–59), the diple, shaped like a semi-circular comma mark (›), was first printed outside the left margin on every relevant line to indicate scriptural quotations or sententiae. This alerted the reader to the presence of the words of an authoritative figure. In the 1570s this mark was extended to indicate direct speech and moved to within the line. It was not until the eighteenth century that printers created out of the diple a new punctuation symbol, the quotation mark, which was gradually accepted by the second half of the century. It was first used to mark the opening of a passage of quoted speech (inverted comma) and only later the closing of the passage (uninverted comma). Crystal (2015) suggests that the quotation mark is associated with the rise of the novel. Single and double quotation marks were originally exploited for different purposes – single for indirect speech and double for direct speech – but they are now differently distinguished, for example, for quotations within quotations or by nationality (the British preferring single quotations, Americans double quotations). As Crystal points out, while direct speech is uniquely marked with quotation marks (though they may be omitted), quotation marks serve a variety of other functions, such as denoting titles of short works, linguistic glosses, scare quotes, citation forms, and so on.
Prior to the conventionalization and acceptance of the quotation mark, printers and writers experimented with a variety of means of displaying quoted material. The quoted material could be set in italics or indented. A common technique, used somewhat unevenly and often inconsistently from the sixteenth to the eighteenth century, was to enclose the reporting clause within parentheses. Reporting clauses could also be set off by commas or dashes. Examples of the use of parentheses begin to appear in the 1520s. For example, an Early Modern printed edition of Chaucer punctuates the line given above in (19) as follows:
(20) non other lyfe (said he) is worthe a bean (Thynne, ed. 1532; Moore 2020: 85)
As Moore (2020) points out, the marking of the backgrounded reporting clause rather than the foregrounded quoted clause is a pragmatic choice which points to differing organization of discourse. Moore (2020, 2021) studies the use of parentheses with say and quoth clauses in Early English Books Online. Overall, nearly 40 percent of the clauses are set off by parentheses in the seventeenth century. Quoth clauses are more likely to appear in parentheses than say clauses since they are more restricted as inquits with direct speech. Say is a more versatile verb, has wider use, and is more frequent. But despite the fairly high correlation of parentheses with reporting clauses, parentheses never became specialized for the quotative use and were, as Moore observes, only “partially pragmaticalized.” We should note also that a variety of typographical means for marking quotation remained in use much longer in private (handwritten) correspondence before the conventionalization of the quotation mark.
4.6 Speech and Thought Representation in Early Modern English
The representation of speech and thought in Early Modern English has been fairly extensively studied, with focus either on the formal marking of direct speech or on the categories of speech and thought representation in different text types.
Reporting clauses, speech-internal markers, and speech descriptors:
The formal marking of speech can be either external to the quoted material (i.e., reporting clauses) or internal to the quoted material.
Reporting verbs in the Corpus of English Dialogues 1560–1760 have been surveyed in two studies, one focusing on the witness depositions (Aijmer 2015) and one on prose fiction (Walker and Grund 2020). Both studies find the neutral verb say to be the most common reporting verb, as it is in Present-day English, and both find a decline in quoth over time. Aijmer identifies no example of quoth after 1639. She also finds no cases where the reporting clause is omitted. Walker and Grund (2020) record an increasing inventory of verba dicendi ‘verbs of speaking’ over time, including “structuring” verbs such as question, answer, and reply and “descriptive” verbs such as continue and cry (out). Double quotatives (e.g., answered and said) or even longer forms (did revile and curse and said), which were observed in Middle English, continue to occur in Early Modern English; Aijmer (2015) argues that and said here has lost its propositional meaning and become a highly routinized and grammaticalized quotative marker. Reporting clauses in initial position almost always have subject–verb order, and medial/final inquits invariably have verb–subject word order. Walker and Grund find quoth to occur only with verb–subject order in medial/final position; overall, verb–subject order is evenly split between nominal and pronominal subjects. Medial position of the reporting clause increases over time, serving to emphasize the shift between speakers. Moore (2002, 2006) finds that a reporting verb distinctive to Early Modern legal language is videlicet (viz., vit., vid.), Latin for ‘namely, that is to say’. This is used in slander depositions to introduce the alleged slanderous utterance in either direct or indirect speech, often marking a shift from Latin to English. Moore sees videlicet becoming grammaticalized as a quotative marker. Latin dixit ‘says’ or deponit ‘testifies’ can also serve this function.
The absence of quotation marks in EModE direct speech as well as the placement of reporting clauses in medial or final position means that the beginning of direct speech is often not explicitly signaled. Identifying the onset of direct speech may depend, therefore, on the presence of what Moore (2011: 46) calls “speech-internal perspective shifters.” These include interjections (alas, ey), vocatives (Sire, Madame), first- and second-person pronouns (I/we, thou/you), deictic pronouns, shifted tenses, conversational routines (yes/no), and pragmatic markers (but, well, why) as well as non-embeddable structures not found in indirect speech (direct imperatives and questions). All of these signal a change in voice from narrative to DS: they focus on the speaker and the here and now of direct speech, place the discourse within a conversational exchange, and/or evoke the colloquial language of direct speech. Examining two Middle English texts (Chaucer’s Troilus and Criseyde and Hoccleve’s The Regiment of Princes), Moore (2011: 46–49) finds that although the vast majority of direct speech instances are marked externally by reporting clauses and inquits, there is also considerable dependence upon vocatives, interjections, deictic and personal pronouns, and pragmatic markers; even in Middle English she finds examples of conversational exchanges which depend exclusively on the presence of vocatives to distinguish the different speakers. Looking at the witness depositions and prose fiction texts of the Corpus of English Dialogues, Lutzky (2015, 2021) finds that three-quarters of all instances of direct speech contain speech-internal perspective shifters. Pragmatic markers (ah, and, but, marry, now, oh, pray, well, what, and why) are the most common form in both genres; they signal a change in speaker but may also show emotional involvement, be interactive, and structure discourse. First- and second-person pronouns predominate in witness depositions, while prose fiction makes use of vocatives. Pragmatic markers are much more common in prose fiction than in witness depositions. Lutzky hypothesizes that pragmatic markers may have been omitted by scribes in witness depositions, perhaps due to the formal nature of the proceedings (on pragmatic markers, see Chapter 3).
Reporting clauses may be accompanied by “speech descriptors,” modifiers that evaluate, clarify, or hedge a reporting clause. They provide pragmatic information about “how the reporters view the speech, what characteristics the speech event had beyond what is signaled by the actual representation and the reporting expression, and how faithful a given representation is to the original speech event” (Grund 2017a: 42). Here, “very gravely” is a speech descriptor accompanying the reporting clause “says the Gentleman”:
(21) You may call her out, there she is, why Sister, says the Gentleman, very gravely, What do you mean? (1722 Defoe, Moll Flanders; CED; Grund 2018: 274)
Studying speech descriptors in EModE witness depositions and prose fiction, Grund (2017a, 2018) argues that they are markers of stance (pragmatic subjectivity) in that they allow the reporters to signal their attitude toward the represented speech (and speaker). They may take the form of a prepositional phrase (said in a trembling voice), an adverbial (phrase) (said sharply), a participle (said smiling), an adjectival (phrase) (said scurrilous things), an “or” construction (said the following or words to this effect), or a noun phrase (said several times). Grund proposes five pragmatic categories of speech descriptors: evaluation, emphasis, frequency/quantity, formulation hedging, and clarification. All the categories are found in witness depositions, though evaluation is the most common. Speech descriptors are less common in prose fiction, but he finds that they increase over time, especially in Late Modern English (Grund 2020, 2021a); their function is strongly evaluative, falling into a number of subcategories, as set out in Table 4.2. Grund attributes the predominance of evaluation in fiction to the fact that represented speech in fiction is often a means by which the narrator can characterize a person or situation and thus inject a subjective attitude.
Table 4.2 Evaluation subtypes in speech descriptors in Late Modern English
| Subtypes of evaluation | Examples of evaluation |
|---|---|
| Intent | scornfully, insistently, disdainfully |
| Language variety | in pretty good English, in the gentlest of accents, in Spanish |
| Length | briefly, very concisely, rather shortly |
| Mental state | pensively, gruffly, impetuously, passionately |
| Pitch | in her deep voice, in a high jocular voice |
| Speech character | recklessly, perversely, repellingly |
| Speech quality | huskily, hoarsely, in a much shaken voice |
| Speed | hurriedly, quickly, slowly, hastily |
| Strength | quietly, faintly, in a lower voice |
| Style of speaking | with emphasis, interrogatively, mechanically |
Categories of speech and thought representation:
A number of studies of the realizations of Semino and Short’s (2004) categories in Early Modern English have been undertaken. It is not always easy to compare these studies as they look at different genres (where we might well expect the speech and thought categories to be differently realized). Moreover, incompatibilities in the findings of existing studies may also result from the categorization of examples by different scholars, which may differ rather widely, or at least are not always entirely clear.
Włodarczyk (2007) examines two EModE trial transcripts, adapting the Semino and Short system to a context in which there is no narrator per se. Walker and Grund (2017) look at speech representation in EModE witness depositions. McIntyre and Walker (2011) is a study of the different categories in EModE news journalism and narrative fiction (1511–1736), and Evans (2021) examines categories of speech representation in sixteenth-century letters.
Overall, speech presentation is much more common in Early Modern English than is thought presentation. Unlike in Present-day English, where the “showing” end – (F)DS – is the norm for speech presentation, in Early Modern English, the “telling” end – NRSA, IS, and NV – predominates, as in these examples from Walker and Grund (2017):
NV (talking), which sets the scene for further verbal behavior or evaluates verbal behavior;
NRSA (ded confesse the truth wyth lamenting), which frames speech events; and
IS (askyd hym where Mr Doctor Barrett was), which summarizes and backgrounds, focusing on actions rather than words.
IS and NRSA may contain bits of quoted direct speech (e.g., he said he did not care a t—d for him, he might kiss is arse). NRSA and IS are equally common in witness depositions while IS predominates in correspondence. McIntyre and Walker find somewhat different results for fiction and news reporting, but compared to Present-day English, DS is the most underrepresented and NV the most overrepresented in Early Modern English, again pointing to EModE’s preference for more “telling” types of speech representation. Thus, McIntyre and Walker see a trend toward less narrator interference over time. Interestingly, Włodarczyk and McIntyre and Walker find a few rare examples of free indirect speech, while Grund and Walker and Evans find none, though Evans does find mixed DS/IS forms.
For the representation of thought, Early Modern English again favors the “telling” end of the spectrum, including NRTA, IT, and NI. (F)DT and free indirect thought are either not found or are extremely rare. In the trial transcripts, Włodarczyk finds that NRTA is most common, but IT is of very low frequency. McIntyre and Walker find that NI is most common in their fiction and newspaper corpus (as it is in Present-day English), with NRTA and IT twice as common as in Present-day English. IT is frequent in news reporting, where it apparently serves to speculate about the reactions of others to reported events.
Włodarczyk (2007) finds occasional slipping from indirect to direct speech (which she sees as inadvertent). Walker and Grund (2021) look specifically at the existence of mixed modes in witness depositions. While infrequent (only 5.6 percent), they are identifiable most often by the presence of subjective vocabulary (swearing, insults, glossed words, dialect features, pragmatic markers, idiomatic phrases, evaluative adjectives), by switches in mode (direct to indirect and vice versa), and by pronoun, tense, and deictic switches (first-person ~ third-person pronoun, past ~ present tense, now ~ then), all of which evoke the voice of the original speaker. In (22a), we see subjective language, “not very wise” and “such beardles boys” in the context of indirect speech (“said that”), while in (22b), also in the context of indirect speech (“Reeve told him”) we see a passage of direct speech (“if you goe to Mrs Jennings”) followed by indirect speech (“she would give him the Guniea”), indicated by the pronoun shift from you to him:
a. said that the Magestrates of Colchester were not very wise to choose such beardles boys to be Constables as Tom Smith the Appoth[e]cary a Constable of St Runwals (1650–75 F_3EC_Colchester_021; ETED; Grund and Walker 2021a: 169)
b. the sd Reeve told him if you goe to Mrs Jennings in St Peters of Mancroft she would give him the Guniea, (1700–54 F_4EC_Norwich_018; ETED; Grund and Walker 2021a: 170)
Walker and Grund reject an explanation of such passages as FID or “slipping” since here there are switches in both directions, not just from indirect to direct discourse. Moore (2011) has suggested that the system of speech representation was one in flux, not yet fully developed and not yet fully distinguishing between direct and indirect discourse; while this view has appeal, Walker and Grund ultimately reject it. They argue that the mixed forms are “artful” and “help in disambiguation, in dramatisation, and in foregrounding or backgrounding a voice” (2021: 180). As we pointed out in the introduction, the way in which speech is represented (the source of speech, the characterization of the speech, the (in)directness of speech) has important pragmatic functions, influencing the reader’s judgment concerning reliability or veracity and their ultimate acceptance (or not) of the content of the speech. Moreover, the form of speech representation can also serve a textual function in foregrounding (with DS) or backgrounding (with IS) the content of speech.
4.7 The Rise of Free Indirect Discourse
The existence of FID in pre-modern texts is highly debated. It is typically associated with the rise of the novel and the expression of consciousness and seen as exclusively literary (e.g., Banfield 1982). However, Fludernik (1993: 93–99) argues that free indirect discourse does indeed exist, at least in proto-form and for speech only, as early as Middle English. She cites an example from Chaucer (23), where the expressive element thanked be God and the unshifted past-time modals moste and sholde point to free indirect speech:
(23)
Daun John … hym told agayn, ful specially,/ How he hadde well yboght and graciously,/ Thanked be God, al hool his merchandise;/ Save that he moste in all maner wise,/ Maken a chevyssaunce, as for his best,/ And thanne he sholde been in joye and reste. (1387 Chaucer, Canterbury Tales B.Sh 342–8; Fludernik 1993: 93–94) ‘Dan John … told him again, very specially, how he had bought well and successfully, thanked be God, all of his merchandise; except that he must no matter what arrange for a loan as for his best, and then he should be in joy and rest’
Pons-Sanz (2019) and Moore (2011) agree that Middle English examples such as (24a–b), while they resemble free indirect discourse, are better seen as “mixed speech” in which the boundary between direct and indirect speech is blurred: “Although medieval works do have passages that evoke a blending of voices and some that permit the intrusion of a character’s thoughts into the narrative, the result is not the application of consistent conventions of a separate discourse mode, but is rather a mixture of incompletely divided discourse modes” (Moore 2011: 131). These examples lack many of the features of the fully developed free indirect discourse form:
a. And he nicked hym “Naye!” – he nolde bi no ways (1390–1400 Sir Gawain and the Green Knight 2471; Pons-Sanz 2019: 215)
‘And he told him “No!” – he would not on any account’
b. And there he swoor on ale and breed/ How that the geaunt shal be deed,/ Bityde what bityde! (1387 Chaucer, Canterbury Tales B.Th 872–4)
‘And there he swore on ale and bread that the giant shall be dead, Come what may!’
Likewise, Walker and Grund (2021) point to EModE instances such as (25), which, while it resembles free indirect discourse because of the third-person pronoun, unshifted wolde, and subjective language, does not exemplify FID as a “full-fledged, separate mode”: it is introduced by a reporting clause (“the said Seaton … said”) and does not create ambiguity of voice (dual voice), which would be expected for FID:
(25) the said Seaton was in a greate rage and said God damn Him He would have another knock att Him, (1724–58 F_4NC_Northern_004; ETED; Walker and Grund 2021: 166)
Fludernik (1993) sees free indirect discourse as appearing in full form, albeit rarely, in the late seventeenth century, particularly in literary imitations of colloquial language (see also Leech and Short 2007[1981]: 266):
(26) When Father Worsley came to discourse [with] Don Tomazo in English, heavens, what a refreshing it was him! For he had not spoken to any person whatever in ten weeks before (1680 Don Tomazo; Fludernik 1993: 95)
Adamson (1994, 2001) agrees with Fludernik’s dating, relating the rise of what she calls “empathetic narrative” to Puritan conversion narratives in the seventeenth century; in (27) one finds was cooccurring with now/at this time, where the past tense and present-time deictic bridge the gap between the remembering self (who has attained grace) and the remembered self (who is not spiritually reborn):
(27) And now was I both a burthen and a terror to myself, nor did I ever so know as now, what it was to be weary of my life and yet afraid to die. Oh, how gladly now would I have been anybody but myself (1666 Bunyan, Grace Abounding; Adamson 1995: 81, 2001: 88)
For Adamson, the extension of this style from first to third person occurred in the Bildungsroman, the secular equivalent of the conversion narrative, and laid the groundwork for FID. McIntyre and Walker (2011) likewise find rare examples of free indirect speech (28) in their EModE corpus of news journalism and fictional prose:
(28) the rogues presented each a pistol to them and bid them deliver, or they would blow the brains out of their head (1736 Country Journal; McIntyre and Walker 2011: 104)
Vandelanotte (2021) sees the rise of FID as a “drift” away from the norms of indirect speech (144), with gradual conventionalization of the style over the course of the nineteenth century. Authors keep the pronouns and tense of ID but allow for the syntactic freedom of DD. Early examples may even retain the that complementizer. In the early part of the century, typographical practices are not yet stable, so quotation marks may be used for DD and FID or even ID, or they may be omitted. This passage from Jane Austen begins with FID (both speech and thought) without quotation marks, followed by free indirect speech with quotation marks, and then FDD and DD using quotation marks.
(29) She asked after their mutual friends; they were all well. – When had he left them? – Only that morning. He must have had a wet ride. – Yes. – He meant to walk with her, she found. “He had just looked into the dining-room, and as he was not wanted there, preferred being out of doors.” …
“You have some news to hear, now you are come back, that will rather surprize you.”
“Have I?” said he quietly, and looking at her; “of what nature?” (1815 Austen, Emma; www.gutenberg.org/files/158/158-h/158-h.htm)
By the time of Dickens (mid-nineteenth century), non-quotation-marked FID seems to have become established.
The pragmatic challenge posed by the representation of speech is to incorporate it without interrupting the narrative frame (as does (F)DS) and yet to preserve its exact wording, dramatic import, and subjectivity of speech (which are not allowed in IS). Free indirect speech allows the speech of characters to be expressed seamlessly within the narrative frame (in the third person and past tense of narration), with all of the speaker’s subjectivity retained; the speech is often portrayed as it is experienced by others. At the same time, because FID is within the narrator’s control, the narrator is able to adopt either an empathetic closeness to or ironic distance from the character or their speech. The pragmatic challenge posed by the representation of thought is to present thoughts with immediacy and subjectivity (not allowed in IT) yet not to suggest that the thoughts are “internal speech” and necessarily conscious (as does (F)DT). FID allows the representation of thought in a way which gives an illusion of the character’s mental state, often with thoughts below the level of consciousness. FID thus seems to be a literary form highly suited to the expression of consciousness, a phenomenon which we associate with the novel.
4.8 Speech and Thought Representation in Late Modern English
B. Busse (2020) is a study of speech, thought, and writing representation in a selection of nineteenth-century novels (by Austen, Scott, C. Brontë, E. Brontë, Thackeray, Gaskell, Kingsley, Dickens, Eliot, Oliphant, Stevenson, Wilde, and Hardy). She counts both the units of speech, thought, and writing and the number of words within each unit. She compares her numerical results with those of Semino and Short (2004) discussed above. Overall, she finds that units of speech representation and pure narration are equally common in nineteenth-century fiction, but units of speech representation are more common in PDE fiction. In both centuries, however, narration comprises the largest number of words. This means that twentieth-century narrators produce longer passages of pure narration, while nineteenth-century speakers give longer speeches.
In terms of speech representation, (F)DS is the most common in both periods, with slightly more frequent and more verbose passages of (F)DS in the nineteenth century. Free indirect speech is still remarkably uncommon in the nineteenth century, under 1 percent of the words, compared to c.19 percent in PDE. Thought representation occurs less often than does speech representation in both periods. In the nineteenth century, NRTA is the most common and IT the second most common means of presenting thought, whereas NI is the most common in Semino and Short’s corpus. (F)DT is much rarer in the nineteenth century than in the twentieth (0.3 percent compared to 28 percent of the words). Free indirect thought is also much rarer (7 percent compared to 26 percent of the words), but the passages are comparatively longer. B. Busse concludes, “In 20th-century thought presentation, the presentation of mental states dominates, whereas in the 19th century it is the summary of a mental act in NRTA” (81). What B. Busse’s results seem to show us is that in general in the nineteenth century the direct, character-centered representation of internal thought does not figure prominently and that FID, especially free indirect speech, is not yet fully developed. Thus, we can say that in the nineteenth century, thought is presented indirectly, from the viewpoint of the narrator and in the narrator’s words; we do not experience thoughts directly as the unfiltered expressions of the subjective consciousness of the character; pragmatically, this determines the extent to which we accept the statements as accurate representations of the character’s thoughts.
4.9 Reporting Verbs in Late Modern English and Present-day English
While the formal marking of direct speech more or less stabilizes by Late Modern English (with the exception of FID), the inventory of reporting verbs continues to grow. Say remains the foremost reporting verb, but Cichosz (2018) finds at least thirty-eight other verbs used in inquits in her corpus (e.g., add, reply, answer, return, ask, exclaim, repeat, go on, remark, resume). Quoth is all but obsolete. The newer and less frequent verbs do not show inversion of the subject and verb, though inversion remains common for say and some of the more frequent verbs, as shown in (30a) with a speech descriptor. The forms says I and, more rarely, says you also appear as reporting clauses in Late Modern English (30b) (OED, s.v. say v.1 and int, def. I1c(b)).
a. “I said so!” cried Morrice triumphantly, “I was sure there was no gentleman but would be happy to accommodate two such ladies!” (1782 Burney, Cecilia; CLMET3.0)
b. “Ah, Betsey,” says I, “you are always building castles in the air.” (1827 Royall, The Tennessean; COHA)
We also see growing specialization of reporting verbs, where some are restricted to DS (go, be like, recite), some to IS (indicate, alert), some to DS and FID (cry, consider, splutter), some to IS and FID (notice, object, gather), and so on (see Fludernik 1993: 292–293).
For the speaker of Present-day English, what seems most striking is the rise of the new reporting verbs and constructions go, be like, and be all:
a. I go, “Dad, why don’t we just put it where it belongs?” (1999 Dr. Katz, Profession …; COHA)
b. And I was like, What have they done to my boy over there? (1987 Jakes, Heaven and Hell; COHA)
c. And I’m all, “You know, I just made some gingerroot gazpacho, come on over.” (1999 Edtv; COHA)
These forms occur in spoken conversation (personal narratives and transcribed interviews) and represented speech (in fiction) but rarely in written English (except in more speech-like written forms such as blogs). They invariably accompany DS, not IS.
Go is the oldest of these three forms. It likely arose out of the use of go to record a sound or noise (see OED, s.v. go, v., def. 11c(b)), which occurs as early as the nineteenth century (see 32a–b); this was then extended to the quoting of direct speech (see 31a and 32d). This is the source suggested by an early commentator (Butters 1980). The first example of quotative go cited in the OED dates from 1967. A related use is to specify the wording of a proverb, saying, song, account, and such (OED, s.v. go, def. 14), which dates from the late sixteenth century (32c). Like other reporting verbs, go can occur in medial and final position (32a, c, d) as well as initial position (31a, 32b).
a. And then the tapping in his head became louder, more metallic, like a carpenter’s mallet. “Toke-toke toke,” it went. (1935 Green, The Body the Earth; COHA)
b. He goes, “Quack, quack, quack. Hello.” (1988 Full House; COHA)
c. We convalescents and exinvalids have a theme song. “Until I got sick,” it goes, “I never dreamed how lovely life could be.” (1943 Good Housekeeping; COHA)
d. “China,” she goes, “your poetry is closer to the surface, just under your skin.” (2002 Frank, Life is Funny; COHA)
The origin and spread of quotative be like has been the subject of exhaustive study in the sociolinguistic literature, which notes its appearance in global varieties of English, in the speech of young people, at roughly the same time (e.g., Buchstaller 2014). According to the OED, be like appears in the early 1980s (s.v.v. like, adj, adv, conj., prep., def. P8 and def. B6c; be, v., def. 21), though it can be dated somewhat earlier:
a. I’m like, “Watch out for that!” I said, “Would you like!” (1976 Saturday Night Live; COHA)
b. It was like “You’re coming. We’re driving away.” (SCVE/f/1945; D’Arcy 2021: 93)
c. And I was all like, “You want me to do a verse on your album? That’s 100 k …” (1976 Saturday Night Live; COHA)
While be like is often associated with “Valley Girl Speak” (see the OED entry), it clearly did not originate in this variety. In her corpora of oral narratives and interviews (unscripted speech materials), D’Arcy (2021) finds examples of quotative be like in speakers born as early as the 1950s (see 33b); be like surpasses say in 1970/75 and is increasing rapidly in frequency. There are also earlier forms such as think like, say like, feel like, and go like (OED, s.v. like, def. B6c).
Quotative be like is not synonymous with either say or go (see Table 4.3). Say typically introduces the direct speech of the self or of others and is seen as the neutral or unmarked form. Both say and go imply vocalization and cannot be used for the expression of thought. Like go, be like may introduce sound, but it is more often used to introduce thought or inner monologue, or one’s own speech (i.e., for self-presentation). Thus, it most often occurs in the first person (I’m like, I was like), while go is more common in the third person. Be like has an expressive or affective dimension and is used to “indicate aspects of speaker subjectivity” (Romaine and Lange 1991: 242). It does not make strong claims to faithfulness of the quoted speech or thought: the “speaker stands in reduced responsibility and commitment to the truth of the report” (Romaine and Lange 1991: 263). It frequently occurs in the historical present, contributing to the sense of dramatized or enacted thought or speech and expressing evaluation. It must always occur in initial position, preceding the quoted material, never medially or finally. Be operates as an auxiliary in subject–auxiliary inversion and negative placement, though questions and negatives with be like are uncommon, perhaps because pragmatically we do not question or negate our own thoughts.
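Table 4.3 Quotative uses of say, go, and be like
| Form | Function |
|---|---|
| say | introduces the direct speech of the self or of others; implies vocalization and is not used for thought; the neutral or unmarked form |
| go | implies vocalization and is not used for thought; may introduce sound; more common in the third person |
| be like | may introduce sound, but more often introduces thought, inner monologue, or one’s own speech; most often in the first person; expressive or affective; makes no strong claim to faithfulness |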
In contrast to be like, be all seems to have had only a short span of popularity, although it is not obsolete. The OED gives its first example from 1982 (OED, s.v. all, adj., pron., and n., adv. and conj., def. C1d). (34b) looks like an unusually early example.
a. I’m not so sure. You were all, “I’m sure he’s heard of styling gel.” (1974 The Life and Times of the Happy Hooker; COHA)
b. but he looked so stupefied. The way he was all, “Great, great, I’m hip. I’m cool.” (1952 Something to Live for; COHA)
Considering the use of reporting verbs exclusively in spoken English, D’Arcy (2017: 16–23, 2021) presents the following scenario. In the late nineteenth century say predominates, primarily in the third person and past tense, for the representation of speech. Over this period a range of other reporting verbs arise. In the early twentieth century, say remains dominant, but think gains ground for the expression of internal monologue, as does first-person quoted speech. The mid- to late twentieth century sees a greater range of content expressed, including speech, thought (real or imagined), writing, sound, and gesture, using a wide range of verbs: say for speech in the third-person past tense, think for first-person thought, go for mimetic content, and be like for first-person thought in the historical present.
There is considerable debate about the origin of be like. Some scholars relate its development to the rise of the pragmatic marker like (e.g., Do you think we need people to be, like witnesses? [COCA] or Like one day I was doing the laundry for example [COCA]; see D’Arcy 2017 and Chapter 3 on pragmatic markers). But others see them as separate developments. Meehan (1991) sees forms of like as grammaticalizing from the OE adjective gelic ‘similar to’. The development of quotative be like is related to both the complementizer and pragmatic marker functions because like is a “quasi-complementizer” having scope over an entire clause and it focuses new information. It is related to the original meaning ‘similar to’, since it denotes that the quoted information need not be exact. Romaine and Lange (1991) see like in be like as a specialization within the textual domain of the grammaticalized complementizer like (as in you sound like you care) meaning ‘as if’, with the addition of a dummy be verb. It derives from the meaning of comparison or exemplification of like: “the speaker presents the clause created for comparison or exemplification so that it can be construed as a report of speech or thought” (262). While they see like in be like as deriving from the complementizer, they note that like is unlike that in say that since it does not effect the deictic shift found in indirect speech. Rather, much like FID (see above, §4.2), it retains in the quoted clause the deictic perspective of the represented speaker. They suggest it might be the “natural historical development” of FID in the spoken channel because it allows the speaker to keep the vividness of direct speech without suggesting that the words were actually spoken.
The view of be like as a case of grammaticalization has been contested, however. D’Arcy (2017) finds that be like does not undergo the contextual expansion found in grammaticalization but is instead grammatically stable, and Vandelanotte (2012) sees no sense in which be like is decategorialized or fossilized. D’Arcy (2017: 16–23, 2021) argues that be like develops from resources already available in the system, namely quotative be, dating from the late nineteenth century; the pragmatic marker like meaning ‘in this way’ or ‘for example’ is added by analogy with earlier forms such as say like, think like, go like, or feel like. The fact that be is pragmatically unrestricted makes be like suitable for representing all types of content, while the discourse marker like’s meaning of exemplification makes it suitable for mimetic representation and implies that the quotation need not be exact. Vandelanotte (2012) argues against the complementizer source, because like does not function as a complementizer; that is, it does not take a nominal complement and it does not introduce indirect but rather direct speech. He sees the entire clause (e.g., I am like) as undergoing constructionalization, with the meaning of be like as deriving transparently from the semantics of like: speakers using I am like announce “that they are about to give a partial or ‘approximative’ imitation of thought, emotion states or words” (183).
4.10 Historical Overview
The history of speech and thought presentation in English yields a complex picture of changes in form and type, which are at least in part pragmatically motivated. Over time, we can see increasing frequency of thought presentation over speech presentation, a change from more indirect (narrator-controlled, summarizing) to more direct (autonomous or non-narrator-controlled, verbatim) ways of presenting speech and thought, and an expanding use of specialized reporting verbs and speech evaluators. While the development of quotation marks in the EModE period allows quotation to be clearly delineated from narrative, which was not the case in medieval manuscripts, the development of FID in the modern period again erases the boundary between speech/thought and narrative. Which form of speech and thought presentation a writer chooses can have pragmatic consequences, especially in influencing intersubjective relations between the writer, speaker/thinker, and reader: are we to accept the speaker/thinker as credible or authoritative, are we to believe their words/thoughts to be accurate or verifiable, what attitude is the writer taking toward the speaker/thinker, how are we expected to respond to the speaker/thinker? The form of speech and thought presentation adopted can also have textual functions in distinguishing voices within the text (or blurring such voices). It can serve purposes of foregrounding or backgrounding speech and thought within the text or of dramatizing speech and thought. Finally, of course, it can serve purposes of characterization and thematic development within a text.
4.11 Chapter Summary
This chapter covered the following topics:
the formal features and pragmatic functions of different categories of speech and thought representation in Present-day English, ranging from least to most summarizing and from most to least autonomous:
◦ with (F)DS found to be the norm for speech representation, and NI (and also free indirect thought) the norm for thought representation;
speech representation in Old English, where medieval manuscripts have no one designated way to mark direct speech but make frequent use of inquits:
◦ with direct speech consisting of long, formal speeches spoken in isolation; and
◦ with “slippage” from indirect to direct speech;
speech representation in Middle English, where speech is more loosely represented, and there may be indeterminacy between de dicto and de re interpretations:
◦ with seien the most frequent reporting verb; and
◦ with thought often presented as internalized speech;
the gradual conventionalization of quotation marks to denote DS in printed texts;
speech and thought representation in Early Modern English, where categories of speech and thought fall on the “telling” end of the spectrum (NRSA, IS, NV and NRTA, IT, IN, respectively):
◦ with an increase in the inventory of reporting verbs and decline of quoth;
◦ with the use of speech-internal perspective shifters and speech descriptors for evaluation; and
◦ with the frequent use of mixed modes;
the development of FID, associated with the expression of consciousness and the rise of the novel, resulting in conventionalization of the form in the nineteenth century;
speech and thought representation in Late Modern English, where speech representation is still more common than thought representation:
◦ with longer passages of speech compared to longer passages of narration in Present-day English; and
◦ with thought on the “telling” end (using NRTA and IT) as opposed to the “showing” end of Present-day English (NI and free indirect thought);
reporting verbs in Present-day English, including go, be like, and be all.
