Phrasal verbs in Early Modern English spoken language: a colloquialization conspiracy?
Paula Rodríguez-Puente and María Obaya-Cueli

Phrasal verbs (e.g. fade away, give up) tend to be associated with spoken, colloquial registers, not only in Present-day English, but also in previous stages of the language. This view has recently been challenged by Thim's (2006a, 2012) ‘colloquialization conspiracy’, according to which the idea that phrasal verbs are colloquial is based on a misconception which first arose in the eighteenth century. In the current study we seek to verify Thim's claim by exploring phrasal verbs in A Corpus of English Dialogues 1560–1760, a 1.2-million-word corpus of Early Modern English (EModE) speech-related text types. Based on a sample of over 7,000 examples, we demonstrate that the linguistic features, distribution and high productivity of phrasal verbs in the EModE period point towards a full entrenchment of these combinations in the spoken language, which leads us to the conclusion that the colloquial status of phrasal verbs in EModE is not merely a matter of a ‘colloquialization conspiracy’.


Introduction
Phrasal verbs (e.g. fade away, give up) tend to be associated with spoken colloquial registers, not only in contemporary English (see, e.g., Cowie & Mackin 1975: iv; McArthur 1989: 40; Biber et al. 1999: 408, 409; Liu 2011: 675), but also in previous stages of the language (see, e.g., Hiltunen 1994; Claridge 2000: 185-97; Kytö & Smitterberg 2006; Smitterberg 2008; Rodríguez-Puente 2017, 2019). As far as Present-day English (PDE) is concerned, Dempsey et al. (2007) used phrasal verbs to distinguish computationally between spoken and written registers, concluding that these combinations can be indicative of the degree of spokenness and formality of a given text. Likewise, phrasal verbs are a linguistic feature with a positive load on Dimension 1 'Oral vs. Literate Discourse' in Biber's (2003) multi-dimensional analysis of variation between university spoken and written registers. As regards earlier periods, in a corpus-based study of phrasal verbs from 1650 to 1999, Rodríguez-Puente (2019) analyzed these combinations in ten different registers, demonstrating that 'phrasal verbs significantly distinguish spoken from written registers and informal from formal registers' (2019: 282). For her, the high occurrence of phrasal verbs in the trial proceedings of the Old Bailey Corpus, a register which closely represents spoken language and also lower social strata of society (Huber 2007), is a clear indication that phrasal verbs were a feature of the spoken language in the Late Modern English (LModE) period (2019: 264-77). However, Rodríguez-Puente (2019) observes that the subject matter and the idiosyncrasies and particular style of writers can also condition the appearance of phrasal verbs, with certain combinations being particularly useful to describe events in the narrative sections (see also Hiltunen 1994: 138).
With reference to the Early Modern English (EModE) period, however, the view that phrasal verbs are typical of spoken, colloquial registers has been challenged by Thim (2006a, 2012), who argues that such an idea is based on a 'colloquialization conspiracy', a misconception which took off in the eighteenth century. He claims that the recurrent assumption that EModE phrasal verbs are typical of colloquial, everyday registers is based on the common view that PDE phrasal verbs are colloquial. Thim (2006a) examines a sample of over 2,000 phrasal verbs extracted from Everyday English (Cusack 1998), a collection of 64 non-literary EModE texts which include the speech production of members of the lower social classes between 1500 and 1684. However, the author does recognize that, while some of the texts in this collection portray renderings of the spoken language, 'others seem conceptionally literate' (Thim 2006a: 292), so that the assignment of levels of orality is dubious in some cases. Nonetheless, from the analysis of phrasal verbs in this collection, Thim concludes that EModE phrasal verbs are 'stylistically neutral' (2006a: 302). For him, the use of these combinations is motivated by the subject matter of the text, rather than the medium (spoken or written) or the degree of formality (see also Thim 2011). He subsequently compared the frequencies of phrasal verbs provided in several diachronic and contemporary studies (2012: 211-14), questioning their reliability and arguing that most statements made so far about the quantitative diachronic development of phrasal verbs throughout the EModE period are 'meaningless' (2012: 213), as it is not possible to find any clear trends in the historical development of these constructions.
For him, the colloquiality of phrasal verbs has been taken for granted in studies such as Hiltunen (1994), Claridge (2000) and Blake (2002), 'even if the evidence does not support that assumption' (Thim 2012: 216; see further Thim 2006a: 298-9).
Despite his criticism of previous research, Thim (2012: 213) recognizes that the analysis of a larger diachronic corpus to trace the development of phrasal verbs is necessary to arrive at new long-term comparisons, a task which he describes as 'inadvisable in an untagged corpus' (2012: 213), in that a great deal of manual analysis would be necessary to discern phrasal particles from their homonymous prepositions. With this article we have taken on such an inadvisable task in an attempt to verify Thim's (2006a, 2012) argument by exploring phrasal verbs in A Corpus of English Dialogues 1560-1760 (CED; Kytö & Culpeper 2006). CED contains EModE speech-related texts in which constructed and real dialogues are represented (see section 3), thus providing a more reliable source of speech-like production of the period than Cusack's (1998) collection. Our goal is to analyze the most salient linguistic features of phrasal verbs, their frequency, usage and productivity in a large sample of EModE speech-related texts, including a wide number of particles and a transparent semantic classification along the lines of Rodríguez-Puente (2019: 75-84). In light of our findings, we also revisit previous studies criticized by Thim in order to provide an alternative explanation for those apparently chaotic paths of distribution which he finds in them. We thus contribute to the existing literature with a robust quantitative diachronic study that allows us to describe phrasal verbs empirically and objectively in the EModE period and to attain solid conclusions about their alleged colloquial status at that time.
The article is structured as follows. Section 2 provides an overview of phrasal verbs and their historical development, paying special attention to their status in the EModE period. Section 3 describes the corpus and the methodology used. In section 4 we present our main results and the analysis of our findings, focusing first on the most salient linguistic aspects of the combinations (section 4.1), and then moving on to an analysis of their frequencies and register distribution (section 4.2). Section 5 closes the article with some conclusions.

On English phrasal verbs
Phrasal verbs in this article are understood as (partially) lexicalised and idiomatised two-word lexical units made up of a verb and a particle of adverbial origin, whose syntactic bondedness can be described in terms of a gradient from lowly lexicalised to fully lexicalised and whose meaning can be analysed along a scale from fully transparent to fully idiomatised. (Rodríguez-Puente 2019: 151)
Phrasal verbs are thus distinguished from other related categories, such as prepositional verbs (look into in (1)), where the particle is a preposition, and phrasal-prepositional verbs (look up to in (2)), which contain both an adverb and a preposition.
(1) They are looking into the matter very carefully.
(2) Most people look up to celebrities.
Although the set of possible particles which can be said to be used for the creation of phrasal verbs varies from one author to another, for our purposes and for the sake of comparison with previous work, we have included those listed by Rodríguez-Puente (2019: 44), which are based on compilations provided in previous studies (see, among others, Bolinger 1971; Cowie & Mackin 1975; Fraser 1976; Quirk et al. 1985: 1151; Hiltunen 1994; Claridge 2000: 46). This list includes the following thirty-four items: aback, aboard, about, above, across, after, ahead, along, apart, around, aside, astray, asunder, away, back, behind, by, counter, down, forth, forward(s), home, in, off, on, out, over, past, round, through, to, together, under, up.
Whereas a set of possible particles is easy enough to establish, verbs are a different matter. In principle, any kind of lexical verb can function as the verbal element in a phrasal combination, although they are most commonly (but not necessarily) of native origin and generally 'monosyllabic or disyllabic verbs with the accent on the first syllable' (Claridge 2000: 54; see also Martin 1990: 115; Thim 2006a: 219).
Despite having relatively clearly identifiable parts, phrasal verbs are a fuzzy category and constitute an example of how categories can be graded, with some combinations being more prototypical than others in terms of their semantic and syntactic properties (see Rodríguez-Puente 2019: 107-52 for discussion). Following traditional classifications, we include literal (non-idiomatic) combinations (e.g. bring up in (3)), metaphorical or figurative combinations, whose meaning is quite transparent but somehow removed from its original connotation (e.g. bring up in (4)), and idiomatic or non-compositional categories, whose interpretation cannot be deduced from the meaning of the parts (e.g. bring up in (5)). Along the lines of Rodríguez-Puente (2019: 55-84), we further distinguish aspectual or aktionsart combinations (e.g. eat up in (6)), where the particle contributes to the compound with an aspectual or aktionsart meaning (see further Denison 1985; Brinton 1988), reiterative combinations (e.g. rise up in (7)), in which the particle repeats part of the meaning already conveyed by the verb, and emphatic combinations (e.g. barrel up in (8)), where the meaning of the verb is exactly the same as when appearing alone, but the particle contributes a more colloquial tone and facilitates the division of labor between the verb and the particle, thus having an effect on the information structure of the clause.
Syntactically speaking, phrasal verbs can be transitive or intransitive. When transitive, the particle can appear at either side of the object when it is a noun phrase (bring [up] the wine [up]), whereas it must follow the object when it is a (non-stressed) pronoun (bring it up vs *bring up it).
Phrasal verbs have been studied widely from a diachronic perspective, in terms of their development, their particular semantic and morphosyntactic features, as well as their alleged association with the spoken, colloquial language. Although these combinations were already attested in OE, it is generally agreed that their frequency (in terms of types and tokens) has increased greatly over time, reflecting the analytic drift which has affected English historically (see Spasov 1966: 18-22; Denison 1998: 223; Hiltunen 1999: 133; Rodríguez-Puente 2019: 175-7). Phrasal verbs are now a common feature of all registers of modern English (see Claridge 2000: 104), although they are more commonly found in colloquial, informal language than in formal, written registers (see, among others, Dempsey et al. 2007).
As regards semantics, it seems that the majority of the EModE phrasal-verb combinations tend to be concrete or literal, 'with only incipient metaphorical developments in certain contexts' (Hiltunen 1994: 132). According to Thim (2006a: 296, 2006b), non-compositional phrasal verbs are rare before 1600 and their degree of opacity may vary. Therefore, although during the EModE period there is an increase of metaphorical and idiomatic meanings, as well as of aspectual/aktionsart uses (Konishi 1958: 122; Claridge 2000: 96), these are probably not as abundant as in PDE.

Corpus and methodology
The features of EModE phrasal verbs described in section 2 are examined in the present article in light of new data extracted from CED.
CED is a 1.2-million-word corpus of EModE speech-related text types running from 1560 to 1760. Although the texts in CED are recorded in the written medium, they contain dialogic exchanges which can be used to explore representations of the spoken language from the past. These can be divided into two broad categories (see Kytö & Walker 2006: 12): authentic dialogue, that is, written records of real speech events (trial proceedings and witness depositions), and constructed or fictional dialogue (drama comedy, handbooks and prose fiction). The group of handbooks consists of instructional or informational texts presented in the form of dialogues between a master and a student, and is further divided into two subcategories, 'Language Teaching' and 'Other'. CED includes an additional subgroup of miscellaneous dialogues of various kinds (Kytö & Walker 2006: 12, 24), but given their mixed nature, they have been omitted from the current analysis. Table 1 summarizes the overall structure of the corpus texts used in this study.
For a more fine-grained analysis of the results, we divide the texts in CED following Culpeper & Kytö's (2010: 17-18) classification of speech-related texts into speech-like (e.g. personal letters and diaries), speech-based (e.g. trial proceedings and witness depositions) and speech-purposed (e.g. dramatic dialogues and sermons).
The examples of phrasal verbs were retrieved automatically from the corpus using WordSmith Tools version 8 (Scott 2020). Since CED is not morphologically tagged, the procedure required searching for the individual particles by means of concordances which included all their possible spellings in the EModE period. 6 This was followed by further manual analysis to identify those cases in which the particles were part of a phrasal verb and to discard homonymous prepositions (for a similar approach, see, among others, Rodríguez-Puente 2019: 44-5).
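The retrieval step can be illustrated with a small Python sketch. This is not the WordSmith procedure used in the study; it only shows, under stated assumptions, how a search over spelling variants of a particle might produce concordance lines for subsequent manual analysis. The variant lists, the function names and the sample sentence in the usage note are hypothetical and chosen for illustration only.

```python
import re

# Hypothetical, deliberately short variant lists. The study's actual lists
# were compiled with WordSmith's WordList feature and the OED and are not
# reproduced here.
PARTICLE_SPELLINGS = {
    "up": ["up", "vp", "uppe", "vppe"],
    "down": ["down", "downe", "doun"],
    "forth": ["forth", "foorth", "forthe"],
}


def build_pattern(spellings):
    """Compile a whole-word, case-insensitive pattern for a set of spellings."""
    # Longest alternatives first, so that longer variants are tried before
    # shorter ones that share the same beginning.
    alternatives = "|".join(
        sorted(map(re.escape, spellings), key=len, reverse=True)
    )
    return re.compile(rf"\b(?:{alternatives})\b", re.IGNORECASE)


def concordance(text, particle, width=30):
    """Return (left context, hit, right context) for each match in the text."""
    pattern = build_pattern(PARTICLE_SPELLINGS[particle])
    lines = []
    for match in pattern.finditer(text):
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        lines.append((left, match.group(0), right))
    return lines
```

Applied to an invented sentence such as "He tooke vp the cup and ran vppe the hill", the search returns the hits vp and vppe but not cup, since only whole words are matched. Every hit would still have to be inspected manually, because a pattern of this kind cannot tell a phrasal particle from a homonymous preposition.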
Differences in the number of words between the various sections of the corpus were considered for the analysis by normalizing the raw data when necessary. We further verified the statistical significance of our results by applying the Kruskal-Wallis test and the Wilcoxon test 7 by means of the free software R version 4.0.3 (R Core Team 2020). As usual in linguistic analyses, the threshold for statistical significance was set at p < 0.05.
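The two quantitative steps mentioned here, normalization and the Kruskal-Wallis test, can be sketched in Python as follows. The study itself used R, so this is an illustration rather than the authors' script. The normalization base of 10,000 words is an assumption (the base is not stated in the text), and the function names are hypothetical. The sketch computes only the H statistic with the usual tie correction; the corresponding p-value would be obtained from a chi-square distribution with k - 1 degrees of freedom, where k is the number of groups, and is not computed here.

```python
from itertools import chain


def normalize(raw_count, word_count, base=10_000):
    """Frequency per `base` words (the base of 10,000 is an assumption)."""
    return raw_count * base / word_count


def average_ranks(values):
    """Rank values from 1 upwards, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic with tie correction for a list of samples."""
    pooled = list(chain.from_iterable(groups))
    n_total = len(pooled)
    ranks = average_ranks(pooled)

    weighted_rank_sums = 0.0
    start = 0
    for group in groups:
        rank_sum = sum(ranks[start:start + len(group)])
        weighted_rank_sums += rank_sum ** 2 / len(group)
        start += len(group)
    h = 12 / (n_total * (n_total + 1)) * weighted_rank_sums - 3 * (n_total + 1)

    # Tie correction: divide H by 1 - sum(t^3 - t) / (N^3 - N).
    tie_sizes = {}
    for value in pooled:
        tie_sizes[value] = tie_sizes.get(value, 0) + 1
    ties = sum(t ** 3 - t for t in tie_sizes.values())
    correction = 1 - ties / (n_total ** 3 - n_total)
    if correction == 0:
        raise ValueError("all observations are identical; H is undefined")
    return h / correction
```

In a setting like the present one, each group would hold the normalized frequencies of the individual texts in one register or subperiod. A significant H only indicates that at least one group differs from the others, which is why pairwise tests such as the Wilcoxon test are applied afterwards.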

Results: phrasal verbs in CED
Corpus searches yielded 7,289 examples, a volume of occurrence which indicates that phrasal verbs are well represented in the spoken language of the EModE period, and which is sufficient to describe developmental trends. In this section we first address the most salient linguistic aspects of the combinations, and then move on to an analysis of their frequencies and register differences.

Linguistic aspects of phrasal verbs in CED
The corpus provided examples for most of the particles included in the analysis (see section 2). Five of them, however, were not attested in the sample texts (above, around, astray, counter and past). As shown in table 2, the least common particles are aback, across, asunder, under, aboard, apart and ahead; conversely, up, out, away, down, in, on and off are the most frequently used particles in the corpus. Table 2 also shows a rising trend in most particles, which indicates that phrasal combinations were growing in number during the period. One notable exception is the particle forth, which decreases considerably in frequency (see also Hiltunen 1994: 134; Martin 1990: 111; Nevalainen 1999: 423; Los 2004: 98; Akimoto 2006; Ishizaki 2009: 45-50; De Smet 2010; Rodríguez-Puente 2019: 167). In our eighteenth-century data forth is only attested in trial proceedings, always in the quite fixed and formulaic combination set forth, with reference to a deposition or an indictment, as in (9).
6 All the possible spellings of the particles were identified by means of the WordList feature of WordSmith, as well as by examining their corresponding entries in the OED.
7 The Kruskal-Wallis rank sum test is a non-parametric method used for comparing more than two samples that are independent or not related. When the Kruskal-Wallis test leads to significant results, then at least one of the samples is different from the other samples. In such case, a non-parametric test, such as the Wilcoxon test, can be applied for a more fine-grained analysis (see further Brezina 2018: 195ff.).
Whereas the set of possible particles is rather limited, the number of verbs which can enter into these combinations in EModE is much wider. In our data there are in fact 420 verbal bases. For a more fine-grained analysis we considered the etymology of the verbs using the OED. As expected, a significant majority of the verbal bases are native (65.2 percent), although verbs of French and Anglo-Norman (AN) origin are also well represented (26.4 percent), as shown in figure 1. Verbs of Latin origin are scarce, which comes as no surprise considering the dialogic nature of the texts in the corpus. Thim (2006b: 219) also finds very low rates of French verbs (5 percent) and no examples of Latin bases in the fifteenth- and sixteenth-century letters of the Corpus of Early English Correspondence (CEEC). Our proportion of Latin verbs is, in fact, half the size of that found by Rodríguez-Puente (2019: 159), whose study also included formal, written texts.
The verbal bases are primarily monosyllabic or disyllabic verbs with the accent on the first syllable. However, twenty-four of our examples do not follow this general tendency. Five verbs contain three syllables and nineteen are disyllabic with the accent on the second syllable. Combinations containing these verbs are scarce and, in most cases, attested only once. Curiously enough, most of them are cases of reiterative combinations, such as return back in (10), and emphatic combinations, such as those illustrated in (11) to (13). As argued by Rodríguez-Puente (2019: 61-70; see also Rodríguez-Puente 2013), emphatic combinations are those which contain an apparently superfluous particle, in that the meaning of the compound is not different from that of the simple verb. However, the addition of such particles helps to render the verb 'more colloquial, informal, or familiar in tone and to make it more salient in the clause' (2019: 62). Reiterative combinations, in turn, contain a particle that repeats the meaning denoted by the verb. In both types of combinations, the particle is somehow unnecessary. It seems, therefore, that speakers spontaneously add those particles to polysyllabic verbs of foreign origin (and other verbs) on analogy with native combinations as a means of rendering them more colloquial and familiar. The prolific attestation of these combinations in the data lends evidence to the idea of the colloquial character of the texts in CED.
Among these combinations, one curious case is subpoena up (see (13)). The combination is produced by a court witness, Samuel Sylvester. The use of the verb subpoena by a witness is in itself surprising, as this is a specialized word typical of legal language. The fact that he uses it with the particle up may be an indication that he is not completely satisfied with such a formal, specialized word, and that he feels the need to add a colloquializing element, such as up.
Having described the two members of the compound separately, we can now move on to the analysis of the combinations. A total of 992 different phrasal verbs were found in CED, including literal combinations but also combinations whose meanings cannot be deduced from those of their parts. Come in appears only in a literal sense in the corpus, whereas fall out and find out are only used idiomatically. Example (14) contains two different uses of fall out, the first meaning 'argue' and the second 'happen'. In turn, come up, for example, can be used literally (15), metaphorically (16) and with various idiomatic meanings, as in (17), where it means 'appear'. Providing a full analysis of the semantic types of phrasal verbs goes beyond the scope of this article. However, our data indicate that metaphorical and idiomatic meanings are well represented in CED, which may be accounted for in terms of the colloquial nature of the texts. Hiltunen (1994), who analyzed eight different registers from the EModE section of the Helsinki Corpus (HC), finds that most collocations there are concrete (1994: 132; see also Thim 2006a: 296, 2006b). He explains that phrasal verbs are less frequent in written than in colloquial language because 'metaphorical combinations tend to be even more marked as colloquial and therefore avoided in the written language' (1994: 139n). In our data, metaphorical and figurative combinations are found as early as the sixteenth century, which is, therefore, another indicator of the [...] (14), other non-compositional combinations found in CED in subperiod 1560-1599 include bring up 'rear', face down 'dispute', find out 'discover', give down 'let flow milk', give in 'admit', set down 'put down in writing', take up 'occupy' and turn to 'apply oneself to some task or occupation'.
Some illustrative examples are provided in (18) [...]
Another indication that phrasal verbs are well entrenched in the spoken language of the EModE period is their occurrence in several idiomatic expressions, such as fall together by the ears ('be at variance, fall out' (OED s.v. by the ear c.(d.) in ear n.)), as in (21), give in verdict and take up arms. Among the combinations attested in CED we also found a great number of hapax legomena and dis legomena, which, as demonstrated by Baayen (1989, 1992, 1993), constitute a good measure of the productivity of a given construction. Applying such a measure, phrasal verbs seem to be highly productive in CED, in that 456 out of 992 combinations are used only once in the corpus, whereas 149 are attested only twice. Many of the single-occurrence items are, however, nonce formations no longer found in PDE (and not even recorded in the OED), such as some of the examples with polysyllabic verbs mentioned above, which seem to be spontaneous creations typical of spoken interaction. Curious examples include weigh up (22) and know through (23), unusual not only for being attested only once in the corpus but also because they are formed with stative verbs, usually described in the literature as rare in the creation of phrasal verbs.
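The counts behind this productivity measure are straightforward to derive from a list of attested combinations. The following Python sketch is an illustration only, not the procedure followed in the study: the function name and the dictionary keys are hypothetical, and the input is assumed to be a list with one lemmatized combination per corpus example. The ratio of hapax legomena to tokens is the quantity on which Baayen's productivity index is based.

```python
from collections import Counter


def productivity_profile(tokens):
    """Count types, hapax legomena and dis legomena in a list of combinations.

    `tokens` is assumed to hold one lemmatized combination per attested
    example, e.g. "fall out" for every occurrence of fell out, falls out, etc.
    """
    if not tokens:
        raise ValueError("the token list is empty")
    type_frequencies = Counter(tokens)
    # Frequency spectrum: how many types occur once, twice, and so on.
    spectrum = Counter(type_frequencies.values())
    n_tokens = len(tokens)
    n_types = len(type_frequencies)
    hapax = spectrum[1]
    dis = spectrum[2]
    return {
        "tokens": n_tokens,
        "types": n_types,
        "hapax": hapax,
        "dis": dis,
        "hapax_per_type": hapax / n_types,
        "hapax_per_token": hapax / n_tokens,
    }
```

With the figures reported above, the share of types attested only once is 456/992, that is, roughly 46 percent of all combinations, and the share attested only twice is 149/992, roughly 15 percent.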

Diachronic distribution and cross-register analysis
In this section we examine the diachronic and register distribution of phrasal verbs in CED. Figure 2 illustrates their diachronic development across the various subperiods represented in the corpus. As can be seen, although there is a slight decrease between the first and second subperiods, phrasal verbs grow significantly from 1600-39 to the second half of the eighteenth century (p = 0.02598), a rise which is particularly pronounced between the last two subperiods.
Previous studies indicate that phrasal verbs begin to increase in frequency from 1700 onwards (see Konishi 1958: 125;Spasov 1966: 125;Martin 1990;Wild 2010: 227), although this depends to a large extent on the type of texts examined (see Rodríguez-Puente 2019: 175-7). In CED the sharpest increase is recorded between the last two subperiods, yet phrasal verbs experience a slow but general growth from 1600 onwards, much earlier than usually reported in the literature, which may be considered another indication of their consolidated status in the speech-like texts of the period. Previous studies also point towards a downturn in the growth of phrasal verbs during the eighteenth and the first half of the nineteenth centuries, probably aided by the prescriptivist ideas of the time, which encouraged the use of Latin forms and criticized particles for being superfluous and for their connection with monosyllables and stranded prepositions (see especially Yáñez-Bouza 2015). Such a reversal is, however, not attested in CED, perhaps because of the dialogic nature of the texts involved or simply because we do not have data beyond 1760 and the change is yet to be seen in spoken registers. However, a closer look at the evolution of individual registers is necessary to confirm this general trend (see below for further discussion).
The data presented in figure 2 indicate that the total frequency of combinations is higher at the end of the period than at the beginning. Yet a look at the frequencies of types (figure 3) shows that this is not accompanied by an increase in their type productivity. However, it must be borne in mind that these data reflect the total number of combinations, but not their different (literal and/or idiomatic) connotations.
Moreover, the data represented in figures 2 and 3 refer to the whole corpus. Although standard English grammar and vocabulary may have a common core, they can vary according to register (Nevalainen & Tieken-Boon van Ostade 2006: 303; Biber & Gray 2013). In order to see whether the use of phrasal verbs varies across registers, the tokens obtained from the corpus have been plotted in a boxplot (figure 4), which displays graphically the location and spread of a variable and provides some indication of data symmetry and skewness (see further Brezina 2018: 22-4). The boxplot shows that phrasal verbs are unequally distributed across the various registers represented in CED. The values of their medians (marked by the horizontal black line) are remarkably different, with fiction presenting the highest median. The application of the Kruskal-Wallis test, however, leads to non-significant results (p = 0.473), probably because all the registers represent dialogic exchanges where phrasal verbs are common (unlike in formal, written documents). The two circles above the whiskers in dramatic texts and witness depositions represent two outliers (i.e. extreme values which are far from other values), which we must bear in mind during data analysis. Figure 5 plots the overall normalized frequencies across registers in a bar line graph. The highest frequencies of phrasal verbs are found in trial proceedings, followed by fictional texts and witness depositions. Conversely, the lowest rates occur in the two groups of didactic works and dramatic texts.
In general, these figures indicate that phrasal verbs are more common in those categories which contain authentic, rather than constructed, dialogue, fiction being the only exception to this general tendency. We acknowledge, however, that this is a perception not supported by the statistical analysis. Our results can be compared with those of Hiltunen (1994) and Claridge (2000), whose studies contain written, formal documents. Hiltunen (1994) analyzes a narrower set of particles (away, back, forth, down, off, out and up) and presents his results in percentages (1994: 136). For a more accurate comparison with our data, we normalized Hiltunen's raw figures in table 3, which also includes Claridge's (2000) data from the Lampeter Corpus and Thim's (2006b) results for the personal letters of CEEC. We cannot draw comparisons with Blake (2002) for several reasons. First, his study is not quantitative, and thus there are no frequencies to compare with ours here. Second, there are certain methodological differences. Blake's study is particularly criticized by Thim (2006a) for its inability to account for the occurrence of phrasal verbs in passages of formal poetry (Blake 2002: 37). However, Thim (2006a) fails to see that the combinations that Blake analyzes under the label 'phrasal verb' also include prepositional verbs and other structures containing prepositional phrases (e.g. scorn'd at me and feared of all).
As can be seen, in Hiltunen's data from the HC, the lowest frequency of phrasal verbs occurs in the most formal and authoritative types of documents, namely official letters and statutes, the successors of fifteenth-century Chancery English and representatives of the evolving standard norm at the beginning of the sixteenth century (Nevalainen & Raumolin-Brunberg 2011). The frequencies of phrasal verbs in those two text types are much lower than those found for all the registers analyzed in CED, which lends support to the idea that phrasal verbs were already colloquial and related to spoken [...] Claridge's (2000) results for the category 'Law' are, however, much higher, but this is mostly due to the fact that this category in the Lampeter Corpus contains not only law documents, but also other types of texts thematically related to legal matters, such as trial proceedings, letters, treatises and even a document portraying a fictional dialogue (Claridge 2000: 190). In fact, all the categories in the Lampeter Corpus contain a mixture of different types of documents. This makes the interpretation of the results difficult, which is probably the reason why Thim (2006a: 299) finds Claridge's figures inconclusive and interpreted from the preconceived position that phrasal verbs are colloquial.
In contrast, the highest figures of phrasal verbs in Hiltunen's data are recorded in handbooks, fictional and biblical texts. As regards handbooks, it must be noted that the handbooks in the HC are of a mixed nature, containing both dialogic and non-dialogic texts, which again makes their analysis problematic. As to fictional and biblical texts, the high frequency of phrasal verbs, though still lower than that of the fictional texts in CED, can be accounted for by the fact that both types of texts combine narratorial sections in the past with conversational exchanges between characters. As observed by Hiltunen (1994: 138) and Rodríguez-Puente (2019: 228 et passim), phrasal verbs, particularly those with literal meanings, are a very useful device to describe the events (both backgrounded and foregrounded eventive clauses) in the narrative sections. This specific function of phrasal verbs, illustrated in (24), is what may account for the fact that fictional texts present the second highest frequency of these combinations in Hiltunen's data and in CED, even higher than in dramatic texts. However, many combinations are also found in the conversational exchanges between the various characters, as shown in (25) and (26).
In contrast to biblical texts, the sermons of the HC have a rather low frequency of phrasal verbs, despite being speech-like texts. This is probably because sermons, although designed to be read aloud, are 'based on a careful written draft (and thus potentially influenced by the norms of written language)' (Claridge & Wilson 2002: 33), in which phrasal verbs are not, in principle, expected. Claridge (2000: 188-90), on a closer inspection of the category 'Religion', observes that the sermons in the Lampeter Corpus contain a higher frequency of phrasal verbs than other religious texts.
The frequency of phrasal verbs in the sermons of the Lampeter Corpus is slightly higher than that of the HC, possibly because of the time period covered by the two corpora. As phrasal verbs increase in frequency from EModE onwards, it is not surprising to find higher figures in the Lampeter Corpus, which covers the years 1640-1740.
The results for personal letters in Hiltunen's data may seem, a priori, 'unexpectedly low,' as Claridge describes them (2000: 187). They contain a very low frequency of phrasal verbs, a figure which is close to that found by Thim (2006b) in CEEC (see table 3). However, if we think of personal letters in earlier stages of the language, the low rates of phrasal combinations may turn out to be less surprising. Though portraying speech-like exchanges, EModE personal letters were not colloquial at all. They were carefully crafted through a conscious process of writing and followed specific conventions learned at home, in grammar schools and in letter-writing manuals (see, e.g., Anderson & Ehrenpreis 1966: 273; Austin 1998: 323; Nevalainen 2001). In fact, personal letters have changed considerably over time; whereas eighteenth-century letters were found to be expository, descriptive or argumentative in purpose, PDE letters are personally involved and interactive (see Biber & Finegan 1989, 1997; Biber 2001: 105). Indeed, as noted by Rodríguez-Puente (2019: 235-6), the frequency of phrasal verbs in personal letters increases significantly over time, particularly from the mid-nineteenth century onwards.
As shown in figure 5 above, the highest frequency of phrasal verbs in CED occurs in trial proceedings, which comes as no surprise, considering that they are based on real (not invented) speech, and hence linguistically close to spoken face-to-face interaction, not pre-planned and highly interactive (Culpeper & Kytö 2010: 63-4). Although the exchanges are produced in a formal setting, the proceedings contain passages which are quite colloquial in tone, some even recording bad language and insults (see Widlitzki & Huber 2016). Moreover, unlike witness depositions, whose narrative is normally in indirect speech, trial proceedings typically include transcripts in dialogue format (Culpeper & Kytö 2010: 49-59). By contrast, in Hiltunen's study trial proceedings display a much lower rate of phrasal verbs. This may be due to the difference in the number of particles analyzed or, more probably, the fact that the category 'Trial proceedings' of the HC also includes witness depositions, so that it is not entirely comparable with the samples of CED. Claridge (2000: 190-2) makes a more in-depth analysis of the real dialogue represented in three texts containing trials: one narrated in direct speech, another in indirect speech and a third which mixes dialogues and monologues (2000: 190). She points out that the highest figures of phrasal verbs in the Lampeter Corpus are found in a text containing transcriptions of direct speech produced by people from low socio-economic sectors which, for her, 'speaks for the fact that these verbs are indeed typical of spoken language, and thus probably also in general of more colloquial styles in the language' (2000: 192). Similar results are reported by Rodríguez-Puente (2019: 264-70) in comparing the trial proceedings of the Old Bailey Corpus with other registers represented in A Corpus of English Historical Registers (ARCHER).
Therefore, a closer inspection of the results in previous studies and the data obtained from the larger sample of CED here seems to confirm that EModE phrasal verbs are indeed a feature of the spoken language, as they are amply represented in speech-like texts, especially those which contain authentic dialogue with minimal narratorial intervention.
Some illustrative examples of how phrasal verbs are used in the spoken language of trial proceedings are presented in (27) to (30). Most of these examples contain features of subjectivity and personal affect, typical of oral interaction, such as the use of first and second person pronouns, private verbs (e.g. think and conceive), direct questions (see Biber 1988: 225; Taavitsainen 1994: 202) and, as shown, also phrasal verbs. Interestingly, although produced in a formal context, trial proceedings acquire a rather colloquial tone in certain passages, especially during the testimonies of witnesses and the accused, where insults and bad language are occasionally found alongside phrasal verbs (see (31)-(33)). In contrast to trial proceedings, witness depositions are normally narrated in indirect speech, as in (34), where phrasal verbs are integrated in the narration, being particularly useful for the description of events.
(34) […] Joan Buts came in, and sat her down upon a Stool, looking with a most frightful and gastly Countenance; and being asked by a Woman that was there present, what she ailed? she answered, She was not well, nor had been out in seven weeks before; why would you come out now then, said the Woman I could not forbear coming to see you, said she; and with that, she threw down her Hat and tumbled down, wallowing on the ground, making a fearful and dismal noise; and being got up, she fell a cursing in a most horrid manner.
(Examination of Joan Buts, 1682)
Narrational sections are not so frequent in didactic works and drama. Dramatic texts include some asides, but these tend to be extremely short and contain few phrasal verbs, most of them indicating movement (e.g. come in and go out). The scarcity or complete lack of such narratorial sections is perhaps what explains the lower rates of phrasal verbs in these registers (especially in contrast to fiction), which otherwise contain combinations that provide their conversational exchanges with a colloquial tone, as in (35) to (37). It must be noted, however, that these texts represent constructed dialogue and, even though authors may decide to represent all kinds of characters, their way of speaking and interacting is crafted rather than natural, which may account for the relatively lower use of phrasal verbs in them.
(35) Yea I will quarter him, and pull all the bones out of his flesh, then will I barrell vp his bowels.
We move on now to the analysis of the diachronic distribution of phrasal verbs across the various registers, with the prediction that their evolution may not have been the same. As is well known, the standardization of English involved a change within the written language which, particularly during the eighteenth century, rendered it less oral and more literate, so that it would be more suitable for the manifold functions of the standard language. Extensive research by Biber (2001) and Biber & Finegan (1989, 1992, 1997) on the development of registers from the seventeenth to the twentieth century in ARCHER has consistently demonstrated that styles tend to become more literate during the eighteenth century, and then registers drift apart linguistically, some becoming more oral and others more literate. McIntosh (1998: 23-4) accounts for the change towards a more polished, written-like style in the eighteenth century in terms of standardization, gentrification, the cleaning-up and modernization of English, and the culmination of the prescriptivist period.
It seems necessary, then, to see whether differences can be detected in this respect when we compare the samples of authentic dialogues with those containing constructed dialogues. In the former, such a turn towards more literate styles is, in principle, not expected. The latter, however, represent dialogues crafted by authors, who might be expected to be more predisposed to adopt the stylistic fashion of the time. The evolution of phrasal verbs across registers is represented in figure 6, which clearly confirms our initial predictions.
Towards the end of the sixteenth century all the registers represented in CED have frequencies higher than 3.5, that is, much higher than the mean of the most formal, written registers of the HC (i.e. statutes and official letters; see table 3), which implies that these combinations were clearly a feature of the spoken language of the time. During the first corpus subperiod, fiction is the register with the highest frequency of phrasal verbs, a rate which remains quite high until subperiod 1600-39, but then begins to decrease progressively until the last corpus subperiod (p = 0.1717). A similar, though not as marked, path is followed by other registers containing constructed dialogues, namely drama (p = 1) and didactic works 'Other' (p = 0.9143), where frequencies remain rather stable from the first to the last subperiod. In turn, didactic works on language teaching have higher rates of phrasal verbs in the last subperiod than in the first, though from subperiod 1680-1719 their frequency evolves slightly downwards as well (p = 0.6667). At the opposite end of the spectrum, trial proceedings contain the lowest rates of phrasal combinations at the beginning of the period, whereas witness depositions present frequencies similar to those of dramatic texts. However, in trial proceedings the frequency of phrasal verbs grows significantly from the first to the last subperiod (p = 0.01714), reaching its peak in subperiod 1720-60. Witness depositions still present relatively low rates of phrasal verbs in the second subperiod, but these grow steadily from then on (p = 0.0303). The sharp and significant escalation in the use of phrasal verbs in registers that contain real language contrasts strongly with the stability seen in registers that contain constructed dialogue. If phrasal verbs were growing in number, they should be growing across all registers equally.
Our hypothesis here is that the change towards more literate styles in the eighteenth century may have affected those registers containing invented speech. These do not present increasing figures of phrasal verbs, perhaps as a reflection of their authors seeking to maintain a more polished and formal style. A striking finding from the data, as presented in figure 6, is the extremely low frequency of phrasal verbs in the trial proceedings of the first subperiod. In order to try to account for this, we looked at the dispersion of our data in this specific register, shown in figure 7, where the y-axis represents the normalized frequencies of individual texts and the x-axis the corpus subperiods. Although no outliers were present in the group of trial proceedings in the boxplot analysis (figure 4), the dispersion graph shows that the three trials in the first subperiod contain very low frequencies of phrasal verbs, whereas in the second subperiod all the trials have frequencies higher than five. In the remaining periods there is considerable variety, with some texts situated below five and the highest being above twenty.
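As a rough illustration of how the values behind a dispersion plot of this kind can be derived, the following Python sketch computes a normalized frequency for each text and groups the values by corpus subperiod. The normalization base (per 1,000 words), the function names and all counts are assumptions made purely for illustration; they are not the figures or the procedure reported in this study.

```python
from collections import defaultdict


def normalized_frequency(hits: int, words: int, base: int = 1000) -> float:
    """Return the number of occurrences per `base` words of running text.

    The default base of 1,000 words is an assumption for this sketch.
    """
    if words <= 0:
        raise ValueError("word count must be positive")
    return hits * base / words


def dispersion_by_subperiod(texts, base: int = 1000):
    """Group per-text normalized frequencies by subperiod (sorted ascending)."""
    groups = defaultdict(list)
    for text in texts:
        freq = normalized_frequency(text["hits"], text["words"], base)
        groups[text["subperiod"]].append(freq)
    return {period: sorted(freqs) for period, freqs in groups.items()}


# Hypothetical texts and counts, invented for illustration only.
texts = [
    {"title": "Trial A", "subperiod": "1600-39", "hits": 12, "words": 8000},
    {"title": "Trial B", "subperiod": "1600-39", "hits": 9, "words": 6000},
    {"title": "Trial C", "subperiod": "1720-60", "hits": 150, "words": 10000},
]

dispersion = dispersion_by_subperiod(texts)
for period, freqs in dispersion.items():
    print(period, freqs)
```

Each list holds one value per text, so plotting the lists against their subperiods would give one point per text, which is what makes it possible to see whether a low register mean reflects all texts or only a few.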
In order to ascertain why the frequencies of phrasal combinations vary so drastically within the same period, we read the individual texts to see whether their subject matter might be a reason, as some authors have argued (Thim 2006a: 302; Rodríguez-Puente 2019: 224). Rather than the subject matter, however, we suspect that this variation is due to the specific style used by the transcribers of the trials or the participants involved in them. Trial proceedings are supposed to portray conversational exchanges with minimal narratorial intervention, but we cannot fully rule out the possibility of scribal manipulation or stylistic bias. The proceeding with the lowest frequency in the corpus is the Trial of Mr. Robert Hickford (1571). Although this text contains conversational exchanges in first-person narration, these are very formal in tone, either because the speech itself was pre-planned by the speaker or because it was rendered as such by the scribe. Note, for example, the highly nominalized discourse illustrated in (38), more typical of formal written language than of natural conversation. The formality of this statement contrasts strongly with the short question and answer exchanges which characterize the trial in which we find the highest frequency of phrasal verbs, namely Minutes taken at the Court Martial, held upon Captain John Ambrose (1745). Extract (39) illustrates that phrasal verbs here are mostly literal, used to describe the movements of the ships, and are also quite repetitive, as they appear first in the questions and are then sometimes repeated in the answers. Although repetitions are part of natural conversation, in this particular case the subject matter of the text (the description of the movements of ships) is also relevant in accounting for the high frequency of phrasal verbs.
Based on a sample of over 7,000 corpus examples, in this article we have demonstrated that phrasal verbs were well entrenched in the spoken, colloquial language of the EModE period. The wide variety of combinations, the attestation of non-compositional meanings and the ample number of hapax legomena all serve to underline the fact that phrasal verbs were firmly established at the time. Moreover, the analysis of the diachronic distribution of phrasal combinations shows that the growth of phrasal verbs in CED begins at a rather early stage (already from 1600 onwards), much earlier than usually reported in the literature (1700 onwards). The growth of phrasal verbs is particularly marked in those registers which most closely represent the spoken language of the past, namely trial proceedings and witness depositions, especially towards the eighteenth century. Arguably, this happens because, contrary to those texts which represent constructed dialogues, they are not conditioned by the stylistic fashion of the time which promoted the adoption of a more literate style. Crucially, this quantitative diachronic study has allowed us to describe phrasal verbs in the EModE period empirically and objectively. In so doing, we revisited earlier studies which were seen to be 'meaningless' by Thim (2012: 213), noting that the proportion of phrasal verbs in the texts of CED is much larger than that found in the most authoritative and formal written types of documents, namely statutes and official letters. Moreover, we provide an alternative explanation for the apparently chaotic distribution of phrasal verbs which originally led Thim to propose his 'colloquialization conspiracy', and we point to methodological differences, as well as differences among registers which were probably not considered in earlier work (e.g. the non-colloquial character of EModE personal letters).
Once more, we demonstrate the importance of register analysis for the study of the development of languages while acknowledging that other features, such as subject matter, as well as the idiosyncrasies and personal style of authors and/or speakers, can also trigger or deflect the use of phrasal verbs. These new insights into the history of phrasal verbs through the data presented here lead us to conclude that the colloquial status of phrasal verbs at the time cannot be understood merely in terms of a 'colloquialization conspiracy'.