Part IV To the Present

Chapter 15 Developments in the Frequency of English Binomials, 1600–2000

15.1 Introduction

It is a common claim in literature on binomials in earlier periods of English that binomials were particularly frequent then (especially in Old and Middle English), and have since declined in frequency. For example, Koskenniemi (1968: 11–12) justifies her choice of the Late Old English and Early Middle English period for an analysis of binomials with the then “rich and varied use of this device, both in poetry and prose”. Mueller (1984: 147) voices the opinion that “word pairs” (in her terminology) were particularly “pervasive” in fourteenth- and fifteenth-century prose, and both Héraucourt (1939: 192) and Markus (2006: 72) maintain that they were popular in Middle English in general. Greenough and Kittredge date the heyday of binomial usage even earlier, claiming that the use of binomials was “an English literary habit of the ninth century” (referring to Bede’s Ecclesiastical History), which “survived in English prose until the end of the eighteenth century”, implying that binomials are no longer as frequent in Present-Day English: “though out of favor at the moment, it [the habit] has left a number of idiomatic colloquial phrases in the language” (1902: 114). However, no empirical evidence so far exists that binomials have indeed become less frequent in English. An empirical description of the development of binomial frequency throughout the history of English is problematic for reasons of data availability. Ideally, the frequency of particular structures ought to be traced through a large-scale diachronic representative corpus, which is not available for the full time span from Old English to today.

In this chapter, the focus will thus be on developments throughout the Early Modern, Late Modern and Present-Day English periods, which will show whether binomial frequency has been subject to discernible changes over the past four hundred years. Necessarily, the chapter will focus on questions of data availability and methodology: which corpus resources are most viable for tracking frequency changes of binomials, how do they compare to each other, and which questions can they help answer and which not? Moreover, a discussion of the register distribution of binomials will be necessary, since it will be shown on the basis of both contemporary and historical corpus data that the frequency of binomials may vary strongly between registers. An analysis of binomials in two small one-register corpora makes clear that binomial frequency is determined both by register and by individual preferences and specific subject matters. Change over time is thus possible if register conventions and preferred styles change. In the following, then, a quantitative study will be attempted on the basis of the only two corpora so far feasible for this purpose, COHA and Google Books, before the discussion moves on to register differences with the help of BNC and ARCHER (all data sources to be introduced in the respective sections). Finally, a more qualitative analysis of ARCHER subcorpora will move the focus to the reasons why binomials are used, and how these may influence overall frequency statistics.

15.2 Quantitative Analysis of Binomial Frequency Developments: Google Books vs COHA

In order to begin answering the question of frequency developments, a quantitative study based on large-scale diachronic corpora would be ideal. The corpus linguist’s dream would thus be a representative, part-of-speech (POS) tagged corpus of British English featuring register continuity, but this dream may be unattainable. In fact, there are only two large-scale tagged historical corpora on offer, both of which will be consulted in this chapter: COHA and Google Books.

First, however, let us consider the definition of the term ‘binomial’ followed here, as well as in the other contributions to this book. Binomials are coordinated word pairs, in which the two lexical elements come from the same word class. In order to facilitate the analysis, only those binomials coordinated with the conjunction and will be investigated here (such as peace and quiet, men and women, or quickly and easily). An important part of the definition thus rests on word class, and since it is not feasible in very large corpora to sift non-binomials from a concordance of and, only corpora that are tagged for parts of speech are of use in a quantitative analysis – hence the corpus linguist’s dream of a POS-tagged historical corpus.

One of the very few large diachronic POS-tagged corpora is COHA, the Corpus of Historical American English (Davies 2010–). COHA covers 400 million words of writing (fiction, non-fiction, newspaper and magazine texts) from 1810 to 2009. The corpus is thus less than ideal on two important counts: the variety of English covered as well as the short time span. Even if it may be assumed that American and British English differ only marginally in the frequency of binomials, the time span is too short to be able to track changes throughout Early and Late Modern English. Nevertheless, the analysis of COHA does still demonstrate interesting facts, especially when compared to Google Books. The corpus was searched for the strings ‘noun and noun’, ‘verb and verb’, ‘adjective and adjective’ as well as ‘adverb and adverb’. Unfortunately, with this non-parsed corpus, it is not possible to restrict hits to true binomials in the sense that only such tokens are included in which and coordinates two lexical items from the same level of grammatical hierarchy (a further criterion introduced in Malkiel’s 1959 definition). An example of a possible false positive, in which and coordinates two clauses rather than two lexical items, would be the hypothetical sentence “She loved truth and untruth was horrible to her” – which the COHA search interface would count as a token of the binomial truth and untruth. However, it needs to be said that such false positives are relatively infrequent, since such clauses are typically divided by commas (and are then not wrongly picked up as binomials), and even if commas are not inserted, it is quite rare for the last word of the first and the first word of a second coordinated clause to share the same word class. Checks of random concordance lines in COHA suggest that false positives hardly ever occur – in fact, not one false positive was found among random sets of twenty concordance lines for each of fifty binomials, but it has, of course, not been possible to check each of the over 1.5 million tokens of binomials in COHA.
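
The ‘X and X’ search described above can be sketched over any list of POS-tagged tokens. What follows is a minimal illustration and not the COHA search interface: the coarse tag labels (NOUN, VERB, ADJ, ADV), the function name find_binomial_candidates and the hand-tagged sentence are all assumptions made for the example. It reproduces the false positive from the hypothetical sentence and shows how a comma before and prevents the match.

```python
# Coarse word-class labels; an assumption for this sketch, not COHA's tagset.
LEXICAL_TAGS = {"NOUN", "VERB", "ADJ", "ADV"}


def find_binomial_candidates(tagged_tokens):
    """Return (first, second, tag) for each 'X and X' trigram with matching tags.

    tagged_tokens is a list of (word, tag) pairs. Only adjacency and word
    class are checked, so clause-level coordination can slip through.
    """
    hits = []
    for i in range(len(tagged_tokens) - 2):
        (first, tag1), (middle, _), (second, tag2) = tagged_tokens[i:i + 3]
        if middle.lower() == "and" and tag1 == tag2 and tag1 in LEXICAL_TAGS:
            hits.append((first.lower(), second.lower(), tag1))
    return hits


# Hand-tagged version of the chapter's hypothetical sentence.
no_comma = [
    ("She", "PRON"), ("loved", "VERB"), ("truth", "NOUN"), ("and", "CCONJ"),
    ("untruth", "NOUN"), ("was", "AUX"), ("horrible", "ADJ"),
    ("to", "ADP"), ("her", "PRON"),
]
# Same sentence with a comma separating the two clauses.
with_comma = no_comma[:3] + [(",", "PUNCT")] + no_comma[3:]

print(find_binomial_candidates(no_comma))    # the false positive is reported
print(find_binomial_candidates(with_comma))  # the comma blocks the match
```

Because the check looks only at adjacency and word class, separating true binomials from clause-level coordination still requires parsing or manual inspection of the hits.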

Now what do the COHA data suggest regarding the frequency of binomials in written American English over the past two hundred years? Figure 15.1 shows the token frequencies of binomials per 10,000 words for each of the twenty decades from the 1810s to the 2000s – for example, seventy-two tokens of binomials per 10,000 words in the 1810s, and seventy in the 2000s. All in all, the number of tokens of binomials decreases over time, as a correlation of the tokens per 10,000 words with time values demonstrates. Kendall’s correlation coefficient, used here because it is less sensitive to outliers than Pearson’s r (cf. Hilpert & Gries 2009: 390), is τ = −0.65**.¹ Correlating the frequencies of different word-class binomials with the variable of time, we only see a significant correlation for adjectival binomials with τ = −0.87**. Therefore, as far as the frequency of binomial tokens in COHA is concerned, there has been a significant decline, due mostly to the less frequent pairing of adjectives, in the past 200 years.²
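
To make the correlation step concrete, the following sketch computes Kendall’s τ between time and normalised frequency by counting concordant and discordant pairs. Several things here are assumptions: the chapter does not state which variant of τ was used (the tie-corrected tau-b is shown), the function name kendall_tau_b is invented for the example, and the five per-decade values are made up for illustration and are not the Figure 15.1 data. No significance test is included.

```python
from math import sqrt


def kendall_tau_b(xs, ys):
    """Kendall's tau-b: (concordant - discordant) pairs, corrected for ties."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    concordant = discordant = ties_x = ties_y = 0
    n = len(xs)
    for i in range(n):
        for j in range(i + 1, n):
            dx = xs[j] - xs[i]
            dy = ys[j] - ys[i]
            if dx == 0:
                ties_x += 1
            if dy == 0:
                ties_y += 1
            if dx != 0 and dy != 0:
                if dx * dy > 0:
                    concordant += 1
                else:
                    discordant += 1
    total_pairs = n * (n - 1) // 2
    denominator = sqrt((total_pairs - ties_x) * (total_pairs - ties_y))
    if denominator == 0:
        raise ValueError("tau is undefined when one variable is constant")
    return (concordant - discordant) / denominator


# Invented per-decade values (tokens per 10,000 words); NOT the Figure 15.1 data.
decades = [1810, 1820, 1830, 1840, 1850]
tokens_per_10k = [72.0, 74.0, 71.0, 69.0, 70.0]

tau = kendall_tau_b(decades, tokens_per_10k)
print(round(tau, 2))  # -0.6 for these invented values
```

Since τ depends only on the ordering of the values, a single extreme decade cannot dominate the coefficient in the way it can with Pearson’s r.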

Figure 15.1 Frequencies of X and X tokens (e.g. noun and noun) per 10,000 words per decade in COHA

The picture looks quite different, however, when one considers the second database, the Google Books n-gram data. This dataset, produced by linguists as a derivative of the commercial Google Books scanning initiative (Michel et al. 2011), makes available n-gram lists³ based on millions of published books from 1500 to today. The main drawback of the data is that no concordances and not even raw frequencies of words or phrases are provided, so that results cannot be verified. All the users have available to them is a graph that shows the development of frequencies, albeit not absolute or normalised frequencies, but relative frequencies in the form of the share (in percentages) among all the n-grams in one year’s corpus. The raw data are downloadable, yet they do not consist of texts, but of lists of n-grams and their frequency in percentages for each year’s corpus, and the size of the data files is so extensive that they cannot be processed by standard computers. A further important drawback of the Google Books n-gram data is that they only list n-grams occurring forty times or more frequently in the whole corpus (the word size of which is not published). This makes it difficult to chart the development of a class of phrases such as binomials, since a large proportion of binomial tokens may be made up of hapaxes or infrequent items in each year’s sampled published writing, so that the overall numbers of occurrence may be heavily distorted. Perhaps the gravest disadvantage is the general lack of verifiability: it is not possible to check which kinds of books form the basis of the corpus, whether the books have been correctly classified as to their year of writing (for similar criticism, cf. Nunberg 2010) or to check the quality of the POS tagging, which has been added to the 2012 version of the dataset (and which proceeded on the basis of the full text, cf. Lin et al. 2012).

Nevertheless, the 2012 Google Books n-gram data are certainly the largest POS-tagged diachronic corpus of English that is available, and were thus employed for this exploratory analysis, if with caution. Their advantage over COHA is that British English (as well as American English) is available and that the time span covered is much longer, so that a study with Google Books can include the Early Modern period as well. The subcorpus used here was the British English subset of the 2012 data, stretching from 1500 to 2009. The size of this component is unknown, but Davies calculated the size of the 2009 British English subset to be 34 billion words (Davies 2011–), and the 2012 subset is claimed to be substantially larger (Lin et al. 2012). Figure 15.2 gives the results of the percentage frequency of nominal, verbal, adjectival and adverbial binomials for selected years (every tenth year starting with 1600, then 1610 etc.). The years before 1600 were excluded because of erratic results, which are probably due to inaccurate POS-tagging in these early decades with a tagger trained on Present-Day English (cf. for example the implausible results for the percentage of nouns in general in the sixteenth century in the n-gram viewer).

Figure 15.2 Relative frequencies of X and X tokens (e.g. adverb and adverb) in percent of the year’s corpus for selected years in the Google Books n-gram data

From this graph, it becomes immediately obvious that these results directly contradict the COHA results for the nineteenth and twentieth centuries. Independent of the fact that COHA reports figures normalized to 10,000 words, and Google Books normalizes to percentages of the corpus (which does not make a difference for overall trends), the corpora show different trends for the time span that both cover – the nineteenth and twentieth centuries. For COHA, we see a significant decline of the use of binomials, for Google Books, a significant increase. All in all, the Google Books n-gram data show a highly significant increase of binomials from the beginning of the seventeenth century to today, for the set of binomials of all word classes (τ = 0.47**) and for nominal (τ = 0.44**), adjectival (τ = 0.44**) and adverbial (τ = 0.40**) binomials, with only verbal binomials demonstrating no significant trend over time (τ = −0.13, n.s.). However, it is difficult to know whether these results can be trusted, owing to the drawbacks of the data mentioned above. As long as we do not possess a more trustworthy large-scale POS-tagged diachronic resource, our insights remain somewhat shaky, or limited to a very short time period (as for COHA).

15.3 Register-Specificity of Binomial Frequency

We are thus thrown back to an examination of smaller historical corpora in order to detect frequency changes for binomials. One crucial factor, however, needs to be borne in mind when one is tempted to compare different corpora from different time periods for this purpose. This is the fact that the proportional frequency of binomials may be expected to vary strongly between registers,⁴ at least if we expect that such register differences as may be found in Present-Day English are likely to have applied in historical periods as well. For instance, the British National Corpus (BNC), representing British English of the late twentieth century, shows a strikingly lower frequency of binomials in speech than in writing, and, in addition, large differences between different written registers. Figure 15.3 shows binomial frequency (again for the word classes noun, verb, adjective and adverb) in five selected BNC genres. Even if speech is left aside, which historical corpora can only approximate (see, e.g., Doty and Wicklund, this volume), the divergences in the written genres are striking. Speech shows by far the lowest frequency of binomials, and only fiction approximates the lower frequency of binomials found in speech. Fiction is often found to linguistically behave more like speech than other types of writing (cf., e.g., McCarthy et al. 1993: 177), since it often contains dialogue or internal monologue crafted to resemble natural speech. As for speech showing fewer binomials than writing, this tallies well with Chafe’s (1982: 42) observation on the higher frequency of coordinated phrases generally in writing than in speech. Chafe argues that this type of coordination allows for more information to be packed into a linguistic unit, which is characteristic of the integrated nature of writing as opposed to the fragmentary nature of speech (cf. 1982: 38–39). In any case, register differences are significant. Hatzidaki (1999: 369) even goes so far as to suggest that the frequency (and type) of binomials used in a given text is a marker of the register membership of this text.

Figure 15.3 Frequency of binomials per 10,000 words in selected BNC registers

It is therefore necessary to track the frequency of binomials over time in single-register corpora (which, however, rarely cover long stretches of time), or the register-specific subcorpora of general diachronic collections. In the following section, this approach will be attempted for two registers included in the ARCHER corpus (A Representative Corpus of Historical English Registers): diaries and sermons. The two text types were chosen because it may reasonably be assumed that there is a certain continuity in the purpose and intended audience of the texts over time. In addition, the two registers are interesting in themselves, diaries constituting a relatively oral register, while sermons are frequently characterized by formulaic constructions. The British English texts of the ARCHER 3.2 version covering seven fifty-year periods from 1650 to 1999 were used, with ten diary texts and five sermons per period, amounting to about 154,000 and 76,000 words, respectively. In these small subcorpora, it was of course possible to consider only cases of true binomials without any false positives thrown up by a mere POS search of the type ‘noun and noun’. Thus, for all the texts, concordances of both and and the ampersand & were manually checked, retaining all tokens of binomials, irrespective of word class, excluding only coordinations of proper names, and coordinations of complex phrases and multi-word lexemes.
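
The first step of such a manual check, pulling every instance of and and & together with its context so that each can be inspected by hand, might look roughly like the sketch below. This is an illustrative assumption, not the procedure or tool actually used with ARCHER: the tokenising pattern, the function name concordance and the sample sentence are all invented for the example.

```python
import re

# Words (with internal apostrophes), the ampersand, or any other single symbol.
TOKEN = re.compile(r"\w+(?:['’]\w+)*|&|[^\w\s]")
COORDINATORS = {"and", "&"}


def concordance(text, width=3):
    """Return (left context, node, right context) for each 'and' or '&'."""
    tokens = TOKEN.findall(text)
    lines = []
    for i, token in enumerate(tokens):
        if token.lower() in COORDINATORS:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append((left, token, right))
    return lines


# Invented sample text, not taken from ARCHER.
sample = "We want peace and quiet, men & women."
for left, node, right in concordance(sample, width=2):
    print(f"{left:>15} [{node}] {right}")
```

Each line would then be accepted or rejected by a human reader according to the criteria above, which is what removes the false positives that a bare POS search lets through.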

For the ARCHER British English diaries, Figure 15.4 shows the frequency of true binomials for all word classes (nouns, verbs, adjectives and adverbs, but also, rarely, prepositions, pronouns and conjunctions). The divergence between the periods is striking, varying from 65 per 10,000 words (close to the figure for BNC fiction, the BNC written register with the smallest proportion of assumed binomials) to over 100 (and thus going beyond the figure for the BNC register with the highest proportion). What is more, there does not seem to be a clear historical trend of either an increase or a decrease of binomial frequency, as Kendall’s correlation coefficient confirms at τ = −0.43, which is not significant.⁵ Thus, the data do not support a hypothesis of a loss of frequency of binomials, at least not in this specific register.

Figure 15.4 Frequency of binomials per 10,000 words in British English diaries (ARCHER 3.2)

A similar picture obtains for sermons, which might be expected to form an even more stable register than diaries (cf. Figure 15.5). There is no significant correlation between the values for binomial frequency and time (τ = −0.05), but again a striking degree of variation. What does become clear, however, is that the use of binomials is highly characteristic of the sermon as a text type, with the section with the lowest frequency figure still lying above the BNC register with the highest ratio, and this does not appear to have changed over the past 350 years. Nevertheless, the fluctuations in frequency between the periods need to be examined more closely, not least because they so blatantly disconfirm the hypothesis of a clear historical trend as maintained by some sources reviewed in the introductory section.

Figure 15.5 Frequency of binomials per 10,000 words in British English sermons (ARCHER 3.2)

The variation between figures for binomial frequency is not only considerable between periods in the ARCHER corpus, but also on a finer level of granularity, between the individual texts. Figure 15.6 demonstrates in even more detail that there is no diachronic trend, but a very high degree of variation between texts – and thus between individual authors. The columns showing the frequency of binomials per 10,000 words in each sermon are ordered as the texts are in the ARCHER corpus, i.e. chronologically (at least roughly, since for some texts the exact year of writing is unknown). Binomial frequency varies between 49 and 230 binomials per 10,000 words in the sermons, and, as the figure makes clear, every period has authors who are prone to using many binomials and authors who are not.

Figure 15.6 Frequency of binomials per 10,000 words in individual British English sermons (ARCHER 3.2)

The scores in Figure 15.6 imply that the frequency of binomials does not depend on period, i.e. the use of binomials is not truly due to ‘fashion’. It does depend to some degree on register – there are some registers which will, on average, have more or fewer binomials proportionally. However, even within these registers, the actual use of binomials depends on the individual writer and is thus a matter of style.

One possible explanation for a frequent use of binomials is thus that this is an individually stable characteristic in the sense that, for some writers, a high incidence of binomials forms part of their idiolect (for idiolect corpus linguistics, cf. Mollin 2009). To illustrate this point, the sermon with the highest frequency of binomials in ARCHER, ‘1732berk’, by one George Berkeley, may be thought of as a prime example of an individual preference for binomials. It is possible that the author was striving for copia verborum, the Latin rhetorical ideal of demonstrating a rich vocabulary. One of the ways to achieve this abundant style in classical rhetoric is through synonymia, which is why synonym pairs are frequent in Latin (Mueller 1984: 150–151). Consider Example (1), an extract from Berkeley’s sermon, in which the relevant tokens of binomials are in italics:

(1) And that there actually is in the mind of man a strong instinct and desire, an appetite and tendency towards another and a better state, incomparably superior to the present, both in point of happiness and duration, is no more than everyone’s experience and inward feeling may inform him.

In just one sentence, we find three binomials and two further complex types of coordination. Not all of them are strictly necessary in the sense that the coordinated elements add different semantic colouring to the statement (except in the case of happiness and duration), but rather we find a style in which the coordinated elements are near-synonymous, differing only in nuances. This style is almost certainly influenced by the Latin example, used in order to demonstrate rhetorical power, and may indeed have been more current in previous periods of English. Koskenniemi (1968: 115–116) mentions language contact with Latin in the Old and Middle English period as one of the most important sources for the frequent use of binomials. This influence on English may have been direct in translations where Latin word pairs were rendered with English word pairs as well as wherever writers explicitly followed Latin rhetorical rules. Furthermore, there may have been an indirect influence in that the exposure to a synonym-rich style, for example in reading Classical authors, strengthened the preference for binomials.

The use of near-synonymous binomials, as in Example (1), may thus be the result of a Latin-like rhetorical style. A further motive for their use, which has been mentioned frequently in the literature, is the binomials’ effect of emphasis. For example, Leisi (1947: 133) considers that the synonymous pairs that he finds in Caxton’s translation of the Eneydos are used for emphasis – in fact he names these binomials ‘tautologous pairs’, and claims that since the two (near-)synonymous elements are not logically necessary, they must have been selected for the sake of emphasis (cf. also Kellner 1894). Koskenniemi (1968: 118) also treats emphasis as one of several important motives for employing binomials, especially in instructive writing (such as religious tracts), where authors seek to impress points on the readers, as is the case in the sermons analysed here. However, both seeking emphasis and striving for copia verborum may only explain the higher frequency of synonymous and near-synonymous binomials, but not other types. Gustafsson (1975: 85–87) distinguishes four main categories of semantic relationships holding between the lexical elements of binomials: (1) (near-)synonymy (semantic homeosemy in her terminology, as in intents and purposes or aches and pains), (2) antonymy (Gustafsson’s semantic opposition, e.g. hot and cold, men and women), (3) hyponymy, in which one element is the hyperonym of the other (e.g. birds and animals), and (4) semantic complementation, in which elements A and B share some type of semantic similarity, but are not antonyms, synonyms or hyponyms (e.g. hand and foot, fair and reasonable). In analysing the semantic structure of highly frequent BNC binomials in Present-Day English (Mollin 2014: 36), I found that only 5% of these have a synonymous internal structure (complementation: 57%; antonymy: 25%; hyponymy: 1%), so that any factors explaining binomial frequency in a given text that are only applicable to synonymous tokens do not cover the whole picture – even if the proportional distribution of semantic types of binomials may have been different in historical registers.

For instance, binomials may also quite simply be frequent in any given text as a result of a specific subject matter. In the sermons, it appears that this is the case for antonymous binomials, which are used when describing real-life opposites. As an example, consider Table 15.1, which lists the most frequent binomials (with a minimum of three occurrences) among the 961 tokens found in all sermons (of all periods) together. While there are a number of binomials in this list that show near-synonymy, possibly bordering on complementation, there are also some that contain clear antonyms, such as God and man, men and women, life and death, etc. The occurrence of these pairs is certainly not due to either copia verborum or striving for emphasis, but to the simple fact that the author evokes binary concepts. As Koskenniemi (1968: 110) points out, “there are individual concepts marked by an inherent duality, which motivates the use of two words”, so that binomials reflect our perception of the world in binary categories. Similarly, for complementation binomials, their use is motivated by the contents, not the style – in arise and eat, two connected actions are alluded to, neither of which could be omitted.

Table 15.1: The most frequent binomials in British English sermons (ARCHER 3.2, all periods), minimum of three occurrences

Rank	Binomial	Absolute token frequency
1	God and man	6
2	men and women	5
3	life and death	4
4	God and Christ	4
5	here and now	3
6	wisdom and goodness	3
7	scribes and pharisees	3
8	heaven and earth	3
9	death and hell	3
10	arise and eat	3
11	God and Jesus	3
12	great and noble	3
13	grace and glory	3
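
As a rough cross-check on the scale of these counts, the 961 tokens can be related to the approximate size of the sermon subcorpus given earlier (about 76,000 words). The result is only an approximation, since the word count is rounded, and it is an average over all periods and texts rather than a value read off any of the figures.

```latex
\[
\frac{961 \text{ tokens}}{76{,}000 \text{ words}} \times 10{,}000 \approx 126 \text{ binomials per 10,000 words}
\]
```

This average lies within the range of 49 to 230 binomials per 10,000 words reported for the individual sermons.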

Further motives for using binomials that have been mentioned in the previous literature are less likely to be able to contribute to a significant degree to potential historical trends in binomial frequency. The best known of these is the interpretation theory, or the “Behrens-Jespersen view” (Mueller 1984: 152). This hypothesis refers to Behrens’s (1886: 8) suggestion that Middle English authors routinely juxtaposed French loanwords with English paraphrases (even though he does not specifically refer to word pairs), to which Jespersen (1905: 96–98) made reference, mentioning that similar combinations of English and French elements in word pairs occur frequently in Chaucer, even though here Jespersen assumes less of an interpretation purpose than that of an intended stylistic effect. The theory, if it deserves the name, has been interpreted by other authors to mean that English-French binomials are used frequently in Middle English in order to gloss the meaning of a French word for English readers. Empirical evidence, however, has shown that only a minority of Middle English binomials are of the etymologically mixed type (e.g. Bugaj [= Kopaczyk] 2006a), so that the influence of this factor on the diachronic frequency of binomials is likely to have been small.

15.4 Conclusion

The question of whether English binomials have become more or less frequent over the past centuries is not easy to answer. A quantitative analysis is difficult, because the largest POS-tagged corpora either only cover a short time span (i.e. COHA) or are not very trustworthy, lacking verifiability of results (i.e. Google Books n-gram data), and, what is more, these two data sources even contradict each other for those centuries that they both cover. A smaller-scale analysis of diachronic corpora needs to pay special attention to the question of register, since, as was shown for both BNC and ARCHER data, binomial frequency varies strongly with register. As an example, writing typically contains many more binomials than speech, and sermons are on average among the registers richest in binomials. The analysis of individual ARCHER sermons, however, has demonstrated clearly that even within one register, there are great individual differences as regards the use of binomials.

Thus, even though we cannot yet definitively answer the question of whether binomials have indeed become less popular over the course of the history of the English language, we can now at least make some educated guesses as to which motives may potentially be driving historical trends in this area. I would like to propose that the use of specific types of binomials (antonymous as well as complementation-based) is due to the contents of a text. Unless there are trends over time as regards the contents of texts in a register (e.g. more or fewer sermons on heaven versus hell or good versus evil), it is unlikely that these binomials will contribute to a great degree to a historical trend in binomial frequency. Antonymous and complementation binomials may, however, characterise a specific register, such as the sermon, since the texts in one register may very well be more or less likely to cover real-life opposites or additions than those in another.

Synonymous or near-synonymous binomials, in contrast, could very well contribute strongly to an increasing or decreasing frequency of binomials. The use of these binomials is stylistically motivated, with authors striving for emphasis as well as potentially emulating a Latin rhetorical ideal, and such a style may well be subject to changing preferences, possibly retaining strongholds in specific registers.

It remains for future studies to empirically analyse different kinds of historical corpora as to their frequency of binomials, but in particular as to their frequency of binomials of particular semantic structures. Only then will it be possible to state definitively whether and why binomials have become more or less current over time.

Chapter 16 Binomials in English Novels of the Late Modern Period: Fixedness, Formulaicity and Style

16.1 Introduction¹

Coordinated word pairs, or binomials, have been studied by linguists under a number of different terms and conceptual paradigms,² with the most basic definition being that a binomial is a pair of words of the same word class connected with the coordinating syntactic element and or or (see Kopaczyk and Sauer, this volume). Although this definition does not specify the word class of the coordinated items, the most typical binomials comprise of two nouns or two verbs, sometimes adjectives and adverbials; the present study will focus exclusively on nouns. As a general rule, binomials are considered multi-word units, lexical bundles or n-grams with an explicit phraseological frame, and the main points of interest have been the way binomials become entrenched and how they function as fixed and formulaic units (see, e.g., Wray Reference Wray2002; Moon Reference Moon1998; Kopaczyk 2008);³ definitions of the two terms are by no means universal within the discipline, and there is much overlap in the terminology.⁴ Like many other types of formulaic sequences, such as idioms, set phrases and quotes, binomials are a natural phenomenon of all language use and can be considered a universal. A widely accepted theory holds that frequently used lexical units are cognitively entrenched in the linguistic repertoire of native speakers and advanced non-native speakers, and this reduces processing effort on the part of both the producer and recipient (see Cooper and Ross Reference Cooper, Ross, Robin, San and Vance1975; Benor and Levy Reference Benor and Levy2006; Renner Reference Renner2014). Thus, for example, when we refer to a culturally salient entity such as women and children, a familiar concept frequently evoked in news reports of wartime atrocities and natural disasters, we place the items in that precise order because it has become entrenched in the language system; saying or writing children and women would be stylistically marked and would require more cognitive effort. 
Recent work by Mollin (2013), in particular, has shed new light on the fixedness of binomial sequences, showing that in addition to being subject to freezing over time, binomials may also unfreeze, and the order of preference may even reverse. Although usually largely synonymous with fixedness, formulaicity arguably has a slightly wider conceptual breadth, extending to discourse and genre studies. For example, the binomials associated with a specific genre or domain can be called formulaic in that particular register, which means that those specific phrases are preferentially used when their referents need to be mentioned and also that mature, competent speakers of the language will expect to see them in a well-formed text of that particular type. In the case of genres with particular cultural or informational gravitas such as legal, business or medical writing, such fixed units may even be mandated by regulations or commonly held standards (see Kopaczyk and Sauer, this volume).

When binomials are employed in literary prose, the domain-specific constraints are much more relaxed, albeit the general cognitive constraints concerning fixedness and formulaicity naturally still apply. However, the literary use of binomials can also be examined from a more explicitly stylistic perspective, asking the question whether or not authors pay attention to their use of binomials. The literary perspective was once the primary line of inquiry into binomials (see Section 16.2 for a brief overview), but in recent times it has fallen out of favour among linguists, especially when it comes to analysing literature published after the early modern period. By contrast, the present study focuses on Late Modern English and adopts as its main approach a quantitative methodology that draws on corpus linguistic evidence to show that the use of coordinated pairs of nouns was a salient and characteristic feature of some late nineteenth-century authors’ writing styles. To that end, the focus of this chapter is on quantifying the use of binomials in the works of twenty-five authors to determine whether binomials remained a stylistically foregrounded feature of language use of which literary authors made conscious use, or whether binomials had become unmarked to the extent that no author in the corpus would stand out from the others by consistently over- or underusing them. The corpus, which comprises nearly 300 full-length novels published around the turn of the twentieth century, provides sufficient evidence to support the argument that there are statistically significant correlations between the frequencies of different types of binomials, and that while most of the authors use binomials in a fashion that may be described as normal or standard, there is a small number of those whose use of binomials is consistently abnormal in a way that suggests a conscious and deliberate stylistic motivation. 
Along the way, I will also offer some remarks on the concreteness and abstractness of the nouns in the binomials, making the tentative claim that abstract nouns show a higher tendency for productivity when it comes to the number of nouns with which they form binomials, while concrete nouns are more likely to be restricted to fewer and more saliently fixed combinations of nouns.

16.2 Binomials, Literary Stylistics and Stylometry

As noted earlier, substantial scholarship already exists on binomials in early canonical literature. Various studies published since the middle of the twentieth century have examined the use of binomials in prestigious literary works, often focusing on major authors such as Shakespeare and Chaucer (see Gerritsen 1958; Nash 1958; Potter 1972; Kohonen 1979; Roscow 1981). In what is one of the first truly extensive studies of binomials in early English literature, Koskenniemi (1968) suggested that the use of such doublets in Middle English prose was typically characterised by redundancy: the second lexical item of the pair of coordinated words echoes the meaning of the first, either fully (to the extent that that is ever possible) or by defining or extending its meaning in some way.⁵ More recent studies in the same vein include Blake (1991), who discusses Caxton’s practice of introducing doublets for stylistic reasons and Orchard (2003), who reports that repetitive word pairs were frequently used in Old English translations from Latin to render the exact meaning of Latin words. Klégr and Čermák (2008), having analysed 700 binomials in Hamlet, argued that “English binomials are first and foremost an aesthetic device which may become frozen by frequent use and only then turns into a collocational unit (a phraseme or idiom). They come into existence as a useful means of conveying a given concept in a particularly forceful way as the occasion arises” (2008: 58). The idea of an aesthetic motivation suggests that the use of binomials is essentially a stylistic choice which, if true, naturally leads to the further realisation that binomials are likely to be a conspicuous feature of language use to at least some members of the reading public.
Biber and Conrad (2009: 23) describe style as “the characteristic way of using language” and importantly keep it separate from genre and register; one can have a particular style of writing, and there may be one or more recognisable styles within a specific genre or register. Thus, one might argue that the frequent use of binomials was a discernible stylistic feature of Middle English prose and drama, but equally that in the Modern period, it had ceased to be a stylistic feature of prose on the whole but could still be recognised as part of an individual author’s personal style.

From the perspective of literary studies, binomials or doublets are sometimes seen as a type of pleonasm, a term that refers to the use of superfluous words or phrases that furnish additional intensity or weight to the message.⁶ Some literary binomials can also be defined as hendiadys (Gr. ἓν διὰ δυοῖν), or pairs of coordinated words used to express a single idea: for example, come and see instead of come to see. Once considered a device of high literary style, pleonastic binomials have been seen by some modern authors and literary critics as a largely unnecessary affectation best avoided in fluent prose. On the other hand, as we will see shortly, some of the most popular authors of the Late Modern period used binomials to great success.

It is worth noting here that the structural repetition inherent to binomials can be read as stylistically meaningful in itself. As Leech and Short (2007: 15) note, in literary texts “the elaboration of form inevitably brings an elaboration of meaning”, suggesting that even in cases of apparent complete redundancy the repetition itself produces a rhetorical effect: to take a simple example, there is a stylistic difference between saying something like I saw men and women and saying I saw men and I saw women. Although the two sentences convey essentially the same information, the first is more neutral and could be interpreted to mean that the men and women were seen together in the same space, while the coordination and the structural repetition in the second sentence might suggest that there is a significance to seeing both men and women. Importantly for the present study, the modern literary author is almost entirely free from observing the kinds of text-typological constraints that overshadowed the choices of Old and Middle English literary authors or, it goes without saying, contemporary authors in more formulaic genres such as legal writing. However, while he or she is not required to use or avoid specific binomials, the stylistic choices will necessarily be made against the backdrop of contemporary linguistic conventions: the decision to reverse a firmly fixed pair of coordinated nouns – for example, writing butter and bread instead of bread and butter – or to use redundant word pairs unusually often will quickly become stylistically marked. This is particularly true if we subscribe to the idea that, when it comes to style, the language of literary texts is often more carefully crafted than that of most other written genres: in literature, stylistic choices are one of the key features by which individual authors are recognised and appreciated, while in most other genres the information content or clarity of expression are more highly prized.
This is not to say that one could not discuss the linguistic characteristics of genres such as legal or administrative writing as styles, but it seems reasonable to claim that the individual author’s personal style is less conspicuous. Consequently, literary texts are read with a particular eye to the author’s personal style and the choices they make are likely to be interpreted by readers as having particular significance. The use of the familiar binomial structure when expressing unique, context-dependent ideas can be considered a creative and writerly act; as Wray (2002: 12) notes, “in most cases ‘novelty’ is much less a question of doing things with grammar than juxtaposing new ideas in commonplace grammatical frames”.

Stylometry is a field of scholarship situated at the meeting point of literary studies and linguistics, of late more particularly corpus and computational linguistics. Over the last two or three decades stylometry has taken substantial leaps forward due to the rapidly expanding availability of computer-readable copies of literary texts and equally rapidly increasing computational power, with methods ranging from statistically driven authorship attribution to distant reading and macroanalysis (see, e.g., Moretti 2005; Eder 2011; Jockers 2013) opening up entirely new ways of understanding how literary texts relate to each other, how literary styles have evolved, and how personal styles manifest in writing.⁷ Although the fundamental objectives of linguistic and literary studies are quite different – a fact Ramsay (2003: 173) sums up with the observation that “empirical validation and hypothesis testing simply make no sense in a discourse where the object is not to be right (in the sense that a biologist is ever ‘right’) but to be interesting (in the sense that a great philosopher is ‘interesting’)” – there can be no doubt that quantitative analyses of large data sets can reveal new and invaluable insights. As Mahlberg (2009: 47) notes, “corpus methodology can help base stylistic studies on exhaustive quantitative and statistical information and add to the amount of detail that a stylistic analysis can achieve”. The frequency-based approaches at the heart of corpus linguistic studies are most suited for identifying quantitative differences and similarities, patterns of repeated behaviour and correlations between co-occurring items, features and clusters of features. Depending on the point of reference, the frequency differences can be interpreted as deviations from a norm which, in turn, may be interpreted in terms more appropriate to stylistic analysis.⁸

16.3 Data and Methods

This study uses as its primary data a corpus of prose fiction, namely the Corpus of English Novels, or CEN. Compiled by Hendrik de Smet, the corpus comprises 290 novels written by twenty-five British and North American authors and published between 1881 and 1922.⁹ The texts, included in the corpus in toto, were collected from the Project Gutenberg web archive. With 26.2 million words, the corpus provides a comprehensive and representative picture of English novel writing at the turn of the century, albeit that the opportunistic sampling method does introduce a selection bias in favour of more-established authors. Arguably, CEN is a particularly suitable corpus for the type of analysis attempted here because it is at once large enough to capture a wide selection of both common and less common binomials, focused enough in terms of both time span and text type not to suffer from the kind of excessive generality common to large generic corpora, and representative enough of individual authors’ texts to allow analysis on the level of idiolect.¹⁰ The novels in the corpus are mostly representative of naturalism, a style popular in the late nineteenth and early twentieth centuries. Naturalism was associated with a fiercely realistic, determinist and often pessimistic tone,¹¹ and a preoccupation with interpreting the human condition through the circumstances of background, social class and experiences of life. The authors included in the corpus range from well-known and enduring literary figures such as Robert Louis Stevenson, Gertrude Atherton and Arthur Conan Doyle to authors like Marie Corelli and Ralph Connor, who may be less familiar to general audiences today but were widely read in their day. The text-linguistic research design of the present study treats each novel as a separate entity that yields a single observation.
The metadata for each novel includes the year of publication, the name of the author and a categorical label, introduced for the present study, with three possible values: serious, leisure or juvenile. The category ‘serious’ refers to what is sometimes described as literary fiction, ‘leisure’ to so-called paraliterary fiction (for example, mysteries and adventure stories) and ‘juvenile’ to works primarily written for children. Some of the authors wrote fiction for a variety of audiences, while others wrote exclusively within a single genre; the column ‘primary category’ in Table 16.1 gives the style of writing with which the author is most closely associated.

Table 16.1: Authors in the Corpus of English Novels

Author	Novels	Combined word count	Primary category
Andy Adams	5	450,564	Leisure
Arthur Conan Doyle	18	1,566,987	Leisure
Edith Nesbit	8	537,969	Juvenile
Edith Wharton	11	872,824	Serious
Emerson Hough	9	751,315	Leisure
Frances Burnett	11	974,948	Juvenile
Francis Marion Crawford	13	1,396,223	Leisure
George Augustus Moore	10	996,682	Serious
George Gissing	20	2,408,767	Serious
Gertrude Atherton	10	634,864	Serious
Gilbert Parker	16	1,398,355	Leisure
Grant Allen	8	590,205	Serious
Hall Caine	4	665,937	Leisure
Henry Rider Haggard	25	2,556,621	Leisure
Henry Seton Merriman	12	988,647	Leisure
Humphry Ward (Mary Augusta Ward)¹²	17	2,252,823	Serious
Irving Bacheller	8	511,064	Leisure
Jerome K. Jerome	10	706,389	Leisure
Kate Douglas Wiggin	14	677,656	Juvenile
Lyman Frank Baum	14	622,700	Juvenile
Marie Corelli	11	1,719,829	Leisure
Ralph Connor	11	974,840	Leisure
Robert Barr	10	731,329	Leisure
Robert Louis Stevenson	9	676,472	Leisure
Stanley J. Weyman	6	563,418	Leisure

To facilitate the retrieval of binomials, CEN was part-of-speech tagged using Yasu Imao’s CasualTagger, a front-end for Engtagger developed by Yoichiro Hasebe.¹³ The tagger makes use of Perl’s Lingua::EN::Tagger module, and the tagset is a very slightly modified version of the one used in Penn Treebank. The retrieval of binomials was operationalised by scripting a query that returned all trigrams with common nouns as items 1 and 3, and either and or or as item 2. In Wray (2002: 32), search sequences like this are described as “normal prefabricated frames, with some fixed items and some gaps for open class items”. The formulaic sequences under investigation are thus the four binomial patterns rather than any specific lexical manifestations thereof. The two nouns in a single binomial had to be either singular (tagged NN) or plural (NNS), and no intervening items such as adjectives were allowed; multinomials were also excluded from this study. The initial query returned a total of 64,058 hits comprising 42,591 different types. A manual semantic analysis was then carried out to prune mistagged items as well as noun pairs which could not be considered binomials following the working definition provided by Kopaczyk and Sauer (this volume). This resulted in 8,454 items being discarded.¹⁴ The relevant figures for each binomial pattern are given in Table 16.2.
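The trigram query described above can be illustrated with a minimal Python sketch. It is not the script used in the study: the function name `extract_binomials`, the toy sentence and the requirement that both nouns share the same tag are assumptions made for illustration, and the manual pruning and the exclusion of multinomials are not reproduced.

```python
from collections import Counter

NOUN_TAGS = {"NN", "NNS"}      # Penn Treebank tags: singular and plural common nouns
COORDINATORS = {"and", "or"}


def extract_binomials(tagged_tokens):
    """Collect noun + and/or + noun trigrams from a list of (word, tag) pairs.

    Assumption: both nouns must carry the same tag (NN + NN or NNS + NNS).
    Each hit is returned as (noun1, coordinator, noun2, tag).
    """
    hits = []
    trigrams = zip(tagged_tokens, tagged_tokens[1:], tagged_tokens[2:])
    for (w1, t1), (w2, _t2), (w3, t3) in trigrams:
        if t1 in NOUN_TAGS and t1 == t3 and w2.lower() in COORDINATORS:
            hits.append((w1.lower(), w2.lower(), w3.lower(), t1))
    return hits


# Hypothetical tagged input, invented for this example.
sample = [
    ("The", "DT"), ("men", "NNS"), ("and", "CC"), ("women", "NNS"),
    ("ate", "VBD"), ("bread", "NN"), ("and", "CC"), ("butter", "NN"),
    ("or", "CC"), ("old", "JJ"), ("cheese", "NN"),
]
hits = extract_binomials(sample)

# Tally hits by pattern: (number, coordinator).
pattern_counts = Counter(
    ("plural" if tag == "NNS" else "singular", coord)
    for _noun1, coord, _noun2, tag in hits
)
```

The intervening adjective in "butter or old cheese" blocks a match, which mirrors the restriction that no items may intervene between the coordinator and the second noun.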

Table 16.2: Initial and pruned search results

Binomial pattern	Raw frequency pre-pruning	Raw frequency post-pruning	St. frequency post-pruning (/1,000 words)	Number of unique types	TTR
Singular noun, and	42,341	35,605	1.35	22,122	0.62
Singular noun, or	5,352	4,476	0.17	3,647	0.81
Plural noun, and	15,569	14,804	0.56	9,224	0.62
Plural noun, or	796	719	0.03	637	0.89

TOTAL	64,058	55,604	2.11	35,630	0.64

The initial results show that the and-pattern is clearly more common with both singular and plural nouns (Table 16.2). The numbers also indicate that the type/token ratio (TTR) is roughly the same regardless of whether the nouns in question are singular or plural, and that the or pattern shows a higher ratio. This suggests that the noun and noun pattern is somewhat more prone to fixedness and the noun or noun pattern is used more flexibly; in other words, assuming we had an equal number of randomly selected noun and noun and noun or noun binomials, we would see a greater number of different binomials in the latter group.
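The two derived measures in Table 16.2 are simple ratios, and the following Python sketch shows one way to compute them from the post-pruning counts. The corpus size is taken as the rounded 26.2 million words given in Section 16.3, which is an assumption; the exact token count behind the published figures is not stated, so results may differ from the table in the second decimal place.

```python
CORPUS_WORDS = 26_200_000  # assumption: rounded corpus size from Section 16.3


def type_token_ratio(types, tokens):
    """Number of distinct binomials divided by the number of occurrences."""
    return types / tokens


def per_thousand(hits, corpus_words=CORPUS_WORDS):
    """Standardised frequency per 1,000 words."""
    return hits / corpus_words * 1000


# Post-pruning tokens and unique types for each pattern (Table 16.2).
patterns = {
    "singular noun, and": (35_605, 22_122),
    "singular noun, or": (4_476, 3_647),
    "plural noun, and": (14_804, 9_224),
    "plural noun, or": (719, 637),
}

summary = {
    name: (per_thousand(tokens), type_token_ratio(types, tokens))
    for name, (tokens, types) in patterns.items()
}

for name, (freq, ttr) in summary.items():
    print(f"{name}: {freq:.2f} per 1,000 words, TTR {ttr:.2f}")
```

A higher ratio means that a given number of occurrences is spread over more distinct binomials, which is the sense in which the or-pattern is described as being used more flexibly.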

16.4 Fixedness and Formulaicity

Before addressing the main question of stylistic variation, let us take a closer look at the data from the perspective of fixedness. Given that there are virtually countless possible binomials in English, and many thousands of different ones in the corpus, it is worth noting that only a few reach high frequencies.

Binomials of the noun and noun pattern show that the most frequent types are all concerned with familiar everyday topics such as kinship terms, parts of the body, temporal expressions or objects such as bread and butter (see Table 16.3). Most are concrete rather than abstract, and in nearly all of them the semantic relationship between the nouns can be analysed as being contrastive or complementary in some way. It is naturally a matter of contextual interpretation whether a word pair such as father and mother should be seen as one or the other, and it is beyond the scope of this study to analyse each observation in context. The frequencies reported here, as with all the other binomials, follow a typical Zipfian distribution with a steep drop-off following the first few items.

Table 16.3: Top twenty noun + and + noun binomials in the Corpus of English Novels

Rank	Singular noun	Raw freq.	Plural noun	Raw freq.
1	father and mother	288	men and women	651
2	day and night	229	women and children	179
3	night and day	187	hands and knees	112
4	life and death	156	boys and girls	85
5	flesh and blood	138	years and years	77
6	man and woman	115	days and nights	75
7	bread and butter	112	hands and feet	70
8	mind and body	108	brothers and sisters	65
9	heart and soul	105	ladies and gentlemen	62
10	husband and wife	102	odds and ends	57
11	gold and silver	95	eyes and ears	56
12	body and soul	94	books and papers	42
13	brother and sister	77	wives and children	41
14	hair and beard	73	arms and legs	40
15	man and wife	71	sights and sounds	40
16	wife and child	65	twos and threes	39
17	body and mind	64	men and horses	38
18	bread and cheese	61	doors and windows	35
19	face and form	58	miles and miles	32
20	father and son	57	sons and daughters	31

Furthermore, Table 16.3 shows that although the overall frequency of plural binomials is lower than that of singular binomials, the former pattern in fact features the binomial with the highest frequency in the corpus, men and women, with 651 hits. This gives a standardised frequency of twenty-five hits per million words. All other binomials are considerably less frequent, with the great majority occurring only once or twice.

A similar picture emerges when we turn to noun or noun binomials (Table 16.4). Many of the same noun pairs that featured in Table 16.3 are seen here, but the frequencies are much lower, particularly for the plural or-pattern where the highest frequency is a mere eleven occurrences.

Table 16.4: Top twenty noun + or + noun binomials in the Corpus of English Novels

Rank	Singular noun	Freq.	Plural noun	Freq.
1	man or woman	130	men or women	11
2	life or death	44	weeks or months	10
3	day or night	36	days or weeks	8
4	word or look	26	months or years	6
5	night or day	19	brothers or sisters	5
6	father or mother	18	friends or enemies	4
7	head or tail	17	girls or women	4
8	success or failure	16	parents or guardians	4
9	heaven or earth	15	beasts or birds	3
10	earth or heaven	14	boys or girls	3
11	friend or foe	13	friends or acquaintances	3
12	man or beast	12	friends or foes	3
13	sea or land	12	relatives or friends	3
14	joy or sorrow	11	advantages or disadvantages	2
15	word or deed	11	beans or peas	2
16	word or sign	11	billiards or cards	2
17	hand or foot	10	books or newspapers	2
18	time or place	10	cabins or huts	2
19	kith or kin	9	castles or manors	2
20	praise or blame	9	cattle or horses	2

Looking at the entire set of binomials, there are 456 occurrences (149 types) in which the second noun repeats the first. The most frequent of these are typically time references such as years and years (seventy-seven hits), hours and hours (twenty) and days and days (sixteen); the only high-frequency non-time-related fully repetitive binomial is man and man with twenty-four hits.

All the frequencies here are low enough to suggest that, if binomials are to be considered fixed phrasal units in literary texts at all, the frequency threshold – if indeed frequency is taken as a requirement for fixedness – cannot be particularly high. For example, the usual cut-off points given in literature for the fixed multi-word sequences known as lexical bundles range from ten to forty hits per million words, which means that even at the lower cut-off point only two binomials, men and women and father and mother, would make the cut.¹⁵
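The threshold argument can be checked with a few lines of Python. This is only a sketch: it assumes the rounded corpus size of 26.2 million words and uses the four highest raw frequencies from Table 16.3, and the helper name `per_million` is introduced here for illustration.

```python
CORPUS_WORDS = 26_200_000        # assumption: rounded corpus size from Section 16.3
LOWER_CUTOFF_PER_MILLION = 10    # lower lexical-bundle cut-off cited in the text

# Highest raw frequencies from Table 16.3.
top_binomials = {
    "men and women": 651,
    "father and mother": 288,
    "day and night": 229,
    "women and children": 179,
}


def per_million(hits, corpus_words=CORPUS_WORDS):
    """Standardised frequency per million words."""
    return hits / corpus_words * 1_000_000


above_cutoff = [
    binomial
    for binomial, hits in top_binomials.items()
    if per_million(hits) >= LOWER_CUTOFF_PER_MILLION
]
print(above_cutoff)
```

Because every other binomial in the corpus is less frequent than these four, checking the top of the ranking is enough to see how many items clear the lower cut-off.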

Apart from the frequencies of specific binomials, we may also use the data to investigate which nouns are particularly prone to forming binomials. This relates directly to the question of fixedness, particularly if the nouns which participate in binomials tend to belong to particular semantic fields. Table 16.5 gives the twenty most common items to appear as the first and second element of a binomial, the corresponding numbers of different binomial types and the resulting type/token ratios. It goes without saying that, due to the low frequencies, the type/token ratios are useful as general indicators of proportions and that they should not be used for direct comparisons.

Table 16.5: First and second nouns in binomials with the highest frequencies in the Corpus of English Novels

Rank	First noun	Freq.	Number of binomial types	TTR	Second noun	Freq.	Number of binomial types	TTR
1	men	869	73	0.08	women	707	26	0.03
2	life	656	201	0.30	mother	344	12	0.03
3	man	525	88	0.17	death	323	71	0.21
4	love	494	182	0.37	night	283	9	0.03
5	father	454	30	0.07	woman	267	15	0.05
6	face	382	151	0.40	soul	262	20	0.07
7	day	333	28	0.08	children	256	26	0.10
8	heart	328	95	0.29	water	236	53	0.22
9	bread	321	40	0.12	manner	214	38	0.17
10	mind	305	87	0.29	fear	212	96	0.45
11	mother	266	40	0.15	day	211	6	0.03
12	time	259	106	0.41	body	197	19	0.09
13	hands	244	33	0.14	love	197	96	0.48
14	night	244	10	0.04	beauty	196	77	0.39
15	strength	239	128	0.54	wife	187	10	0.05
16	fear	234	110	0.47	blood	178	29	0.16
17	head	218	84	0.39	strength	167	70	0.41
18	pain	218	114	0.52	power	156	81	0.52
19	women	218	19	0.09	mind	151	28	0.18
20	friend	211	98	0.46	sister	150	8	0.05

The type/token ratios reveal that there is considerable variation when it comes to how productive different nouns are in the first and second positions. While nouns like life, love, time and fear appear as first elements in numerous different binomials, others, such as night, men and day, although common in binomials, are more likely to be used in fixed expressions. For example, out of the 869 occurrences of binomials with men as the first item, 651 are men and women, leaving 218 tokens for the remaining 72 types; by contrast, there are 128 different types of strength binomials with 239 tokens between them. In general, high-frequency second elements are much more prone to appear in fixed binomials; with a few exceptions, such as fear, love, beauty, strength and power, the second elements show very low type/token ratios.
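The per-noun figures in Table 16.5 can be derived from a frequency list of binomials along the following lines. The sketch is illustrative only: the function name `slot_productivity` and the toy counts are invented for the example and are not corpus data.

```python
from collections import Counter


def slot_productivity(binomial_counts, slot):
    """Summarise nouns in one slot (0 = first noun, 1 = second noun).

    binomial_counts maps (noun1, coordinator, noun2) to a raw frequency.
    Returns {noun: (tokens, types, type/token ratio)}.
    """
    tokens = Counter()
    types = Counter()
    for (noun1, _coord, noun2), freq in binomial_counts.items():
        noun = (noun1, noun2)[slot]
        tokens[noun] += freq   # every occurrence counts as a token
        types[noun] += 1       # every distinct binomial counts once
    return {noun: (tokens[noun], types[noun], types[noun] / tokens[noun])
            for noun in tokens}


# Toy counts, invented for illustration.
toy_counts = {
    ("men", "and", "women"): 6,
    ("men", "and", "horses"): 1,
    ("strength", "and", "courage"): 1,
    ("strength", "and", "beauty"): 1,
    ("strength", "and", "skill"): 1,
}
first_slot = slot_productivity(toy_counts, slot=0)
```

In the toy data, men is dominated by a single combination and so has a low ratio, whereas strength combines with a different noun each time and reaches the maximum ratio of 1.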

The noteworthy feature here is the apparent association between the abstractness of the noun and a tendency to form a wider variety of different binomials. As seen in Table 16.5, most of the concrete nouns show relatively low type/token ratios, while abstract nouns appear on average to be more productive. If we look at the first and second items with the highest number of unique types, nearly all the nouns denote abstract concepts (Table 16.6).

Table 16.6: First and second nouns with the highest type counts in the Corpus of English Novels

Rank	First noun	Unique types	Second noun	Unique types
1	fear	97	life	200
2	love	97	love	182
3	power	82	face	151
4	beauty	78	strength	129
5	pain	77	fear	117
6	pleasure	74	pain	115
7	death	72	time	106
8	excitement	72	thought	101
9	strength	71	beauty	100
10	things	71	friend	99
11	comfort	66	heart	95
12	despair	66	blood	94
13	passion	66	man	89
14	joy	65	mind	88
15	misery	63	shame	86
16	pity	61	head	84
17	sorrow	61	light	84
18	life	56	hope	83
19	love	54	pride	82
20	face	54	surprise	82

What the evidence shows is that fixedness in binomials is largely, though naturally not exclusively, associated with concrete concepts, while linguistic creativity is more often called into play when writers discuss thoughts, emotions, sentiments and other abstract concepts. While concrete binomials are frequently evoked in contexts where the use of a binomial is almost unavoidable because there simply are two entities that need to be mentioned, such as men and women or mother and father, the abstract binomials are more common in situations where the writer would have the option of avoiding the use of a binomial, if he or she so chose, and where the second noun serves to emphasise, elaborate or define the sense of the first. For example, in Ralph Connor’s The Prospector (1904) we encounter the following exchange:

(1) But Betty shook her head decidedly, saying, “I’ll find some way. Tell me, what does she like?”

“Shock.”

“But I mean what amusement and pleasure has she?”

“Amusement! Shades of the mighty past! Why, Miss Betty,” Brown’s tone is sad and severe, “in my young days young people never thought of amusement. We had no time for such follies.”

The two nouns of the binomial amusement and pleasure are sufficiently synonymous here to suggest that the binomial could be read as being semantically repetitive in the sense that the sentence would convey the information perfectly well with no binomial and simply the noun amusement or pleasure. More significantly, we may also note that, because amusement is an abstract noun, it can take a wide variety of such supportive nouns: there are thirty-three binomials with amusement or amusements as the first item, and not a single one of them occurs even twice. Another, similar example is hatred and rage. In Arthur Conan Doyle’s Sir Nigel (1906), we find the following:

(2) Nearer and nearer yet, with stealthy step, and then with a bound and a cry of hatred and rage Paul de la Fosse had sped his blow. It was well judged and well swung, but point would have been wiser than edge against that supple body and those active feet.

Again, the nouns hatred and rage are quite synonymous in this context, but the repetition serves to emphasise the depth of feeling felt by the character. Hatred occurs twenty-eight times as the first noun of a binomial and, as with amusement, every single binomial occurs only once.

16.5 Distribution of Binomials and its Stylistic Implications

The next step in the analysis is to look at the use of binomials across the 290 novels in the corpus. Are binomials used with more or less equal frequency in all the novels, suggesting that the use of binomials is a universal feature which simply comes about in the normal course of speaking or writing English, or can we detect trends or tendencies which suggest that some authors use binomials more or less than the others?

Starting with the specific patterns of binomials, it is clear that there are strong correlations to be seen between the frequencies of specific binomial patterns. With no evidence to the contrary we may assume that the different patterns are independent of each other, that is, that there is no linguistic reason to assume that the combined frequency of noun and noun pattern binomials should correlate with the frequency of noun or noun pattern binomials in the same text. However, with a Spearman’s rank correlation coefficient of 0.44 (p = ***),¹⁶ there is indeed a strong positive correlation between the standardised frequencies of the two types of binomial constructions (Figure 16.1). In the figure, the dots represent single texts in the corpus and their position is determined by the standardised frequency of the N + and + N binomials (horizontal axis) and N + or + N binomials (vertical axis). The correlation is visually apparent in the diagonal shape of the cloud of dots and statistically confirmed with Spearman’s rank correlation test. A strong correlation means that when the frequency of N + and + N type binomials is high in a text, there is a strong tendency that the frequency of N + or + N pattern binomials is also high in the same text. Keeping in mind that we are looking at standardised frequencies that are unaffected by the length of the text, the logical explanation is that this happens because of stylistic reasons, whether conscious or unconscious.

Figure 16.1 Frequency correlation between N + and + N and N + or + N patterns in the Corpus of English Novels
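For readers unfamiliar with the statistic, Spearman's coefficient is the Pearson correlation computed on ranks rather than on the raw values. The Python sketch below shows the calculation with the standard library only; the per-novel frequencies are invented for illustration and the function names are assumptions, not part of the study's tooling.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1 upwards, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1   # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Pearson correlation of the ranks of x and y (inputs must not be constant)."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Invented standardised frequencies (per 1,000 words) for five novels.
and_freq = [1.2, 1.9, 2.4, 3.1, 4.0]
or_freq = [0.10, 0.25, 0.15, 0.30, 0.45]
rho = spearman_rho(and_freq, or_freq)
```

Working on ranks makes the coefficient insensitive to the extreme values that are typical of frequency data, which is one reason it suits a data set of this kind.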

If we turn to the two different patterns individually and see whether correlations can be found between singular and plural nouns in each, the answer is positive in both cases. The Spearman coefficient for singular and plural and-patterns is 0.577 (p = ***), demonstrating that when the and-pattern is used frequently with singular nouns in a novel, there is a very strong tendency for the same novel also to feature a higher frequency of the same pattern with plural nouns (Figure 16.2).

Figure 16.2 Frequency correlation plot of and-pattern binomials in the Corpus of English Novels

A similar, if somewhat weaker, correlation is seen with the or-pattern (Spearman 0.33, p = ***; Figure 16.3). The three correlations can be interpreted as supporting the argument that, whether consciously or not, the authors in CEN make use of binomials as a phraseological frame rather than simply as individual, stylistically foregrounded binomials. If the latter were the case, we would not see these overall patterns which represent the use of at least several dozen and in most cases hundreds of occurrences of binomials per text.

Figure 16.3 Frequency correlation plot of or-pattern binomials in the Corpus of English Novels

The next question to be answered concerns outliers, but before that a brief general note about the nature of corpus linguistic data is in order. A commonly encountered challenge in quantitative corpus linguistics is that linguistic features very often show a distribution with a long right tail, that is, that while most observations show a fairly conservative frequency, there are some that show much higher frequencies. With the possible exception of very high-frequency grammatical items such as articles, auxiliary verbs and prepositions, which would be difficult either to over- or underuse in significant amounts, we should expect that most linguistic phenomena will rarely follow a Gaussian distribution. Consequently, and somewhat paradoxically, because descriptive statistics such as the mean and the standard deviation are easily affected by extreme observations, the identification of outliers can end up being overly conservative.¹⁷ Given that the premise of the present investigation is to determine whether some late modern authors used binomials in a consistently non-normative fashion, which would indicate awareness of binomials as a stylistic device, outliers are one of the main points of interest and thus the method by which they are defined has particular relevance. The normality of a dataset can be investigated using a simple quantile–quantile plot (hereafter QQ plot) which compares the quantiles of two samples by sorting them and then plotting the sorted samples against each other. To test the normality of the data, we plot the quantiles of our sample against the theoretical normal quantiles using the qqnorm() function in R. If the data is normally distributed, the plot forms a more or less straight diagonal line, while skewed or non-normal data leans away from the diagonal. 
Examining the pooled frequency of all noun binomials, we see that, although the majority of the observations fall neatly in line, there are a number of apparent outliers and the data is right-skewed (Figure 16.4).
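The construction of a normal QQ plot can be sketched as follows. This is a minimal illustration in Python rather than the R call named above: the sample values are invented, the function name qq_points is hypothetical, and the plotting positions (i - 0.5) / n are one common convention adopted here as an assumption.

```python
from statistics import NormalDist, fmean, pstdev

def qq_points(sample):
    """Pair each sorted observation with its theoretical normal quantile."""
    n = len(sample)
    ordered = sorted(sample)
    std_normal = NormalDist()
    # Plotting positions (i - 0.5) / n are an assumption of this sketch;
    # other conventions differ slightly for small samples.
    theoretical = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, ordered))

# Hypothetical frequencies per 1,000 words with a long right tail.
freqs = [1.1, 1.4, 1.6, 1.7, 1.9, 2.0, 2.2, 2.5, 4.6, 5.8]
points = qq_points(freqs)

mu, sigma = fmean(freqs), pstdev(freqs)
# Normally distributed data would lie close to the line y = mu + sigma * x.
# A positive gap at the upper end means the largest observation sits above
# that line, as expected when the right tail is long.
top_gap = points[-1][1] - (mu + sigma * points[-1][0])
```

Plotting the pairs in points with any charting tool gives the QQ plot; the gap calculation merely puts a number on what the eye sees as points bending away from the diagonal.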

Figure 16.4 Quantile–quantile plot of binomials in the Corpus of English Novels against normal distribution

Because the data is not normally distributed, it makes sense to adjust our definition of an outlier. One way of dealing with non-Gaussian data is to leave out any aberrant observations, but because we know that all the data points come from legitimate novels, it would not be appropriate to discard the extreme data points.¹⁸ Instead, we will use a robust dispersion statistic. A visual inspection of the QQ plot shows five obvious outliers in the upper right-hand corner of Figure 16.4. Keeping in mind that all 290 novels are included in the corpus in full length, the outliers cannot be explained by an unusually low word count skewing the result.¹⁹ Furthermore, because the dataset is not normally distributed, we use a modified z-score to identify the outliers formally.²⁰ Instead of using standard deviation, we calculate the median absolute deviation (MAD), a very robust statistic that negates the skewing effect of extreme values.²¹ Then, using MAD in the place of standard deviation and defining as outliers standardised frequencies that lie more than 3 MAD above the median, we can identify outliers based on a robust statistic (Table 16.7).

Table 16.7: Outliers in binomials in the Corpus of English Novels

Binomial pattern	MAD	Median	Outlier threshold (+3 MAD)	Number of outliers
All noun binomials	0.71	1.94	4.07	5
Noun + and	0.679	1.77	3.83	5
Noun + or	0.175	0.100	0.62	4
Singular noun + and	0.476	1.21	2.64	6
Singular noun + or	0.08	0.15	0.39	9
Plural noun + and	0.28	0.52	1.36	2
Plural noun + or	0.02	0.022	0.08	9
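The thresholds in Table 16.7 correspond to the median plus three times the MAD, which can be sketched as below. The sketch uses the unscaled MAD because that is what reproduces the tabulated thresholds; the function names and the per-novel sample are hypothetical and are not CEN data.

```python
from statistics import median

def mad(values):
    """Median absolute deviation from the median (unscaled)."""
    centre = median(values)
    return median([abs(v - centre) for v in values])

def upper_outliers(values, k=3.0):
    """Return the threshold median + k * MAD and the values above it."""
    threshold = median(values) + k * mad(values)
    return threshold, [v for v in values if v > threshold]

# Check against the first row of Table 16.7 (all noun binomials):
# median 1.94 and MAD 0.71 give a threshold of 4.07.
table_threshold = 1.94 + 3 * 0.71

# Hypothetical per-novel frequencies per 1,000 words (not CEN data).
freqs = [1.2, 1.5, 1.8, 1.9, 2.0, 2.3, 2.6, 5.1]
threshold, outliers = upper_outliers(freqs)
```

Only the upper threshold is computed, mirroring the "+3 MAD" column of the table; a symmetric lower bound would be the median minus the same amount.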

A small selection of texts stands out. Jerome K. Jerome’s The Great Taboo (1891) and Grant Allen’s The White Company (1891) are outliers when it comes to singular and binomials as well as both plural patterns, Arthur Conan Doyle’s Born in Exile (1892) is an outlier in singular and plural and binomials as well as in plural or binomials, and Marie Corelli’s The Life Everlasting (1911) is an outlier in singular and as well as singular and plural or patterns. Two other texts, Grant Allen’s The British Barbarians (1895) and Kate Douglas Wiggin’s Penelope’s Postscript (1915), show two outliers each, with a few more texts showing a single pattern as an outlier. These observations tell us that some of the authors appear to be flexible about their use of binomials, occasionally using them at a considerably high or low frequency if the context warrants it. In some cases, the frequent use of binomials may arise because of the topic – a story dealing with brothers and sisters – while at other times there may be a single character for whom binomials are a common trope. Although this may signal that the author is using binomials to a stylistic effect, the more interesting question concerns the possibility that some authors would consistently do so, which would make binomial use a general stylistic feature associated with that author.

16.5.1 Authorial Style and Binomials

Finally, we turn to the question of whether some of the individual authors favour binomials more than others. The analysis of dispersion already showed that specific novels appear to account for the majority of the outliers, with several different patterns of binomials in the same text. However, we must be careful not to generalise too soon, because it may well be that these specific novels are merely exceptions in their authors’ overall oeuvres. As Leech and Short (Reference Leech and Short2007: 34–43) point out, a deviance from the norm can be either qualitative or quantitative, and from the perspective of stylistic study the most relevant questions are whether such deviations are prominent in the eyes of the reader and whether they have literary relevance, particularly in the sense of contributing to a broader complex of features which may be interpreted as an identifiable literary style. The authors may have either deliberately or unintentionally used lots of binomials in these specific novels even though they normally would not – which might say something about their ability to alter their style as needed – or there may be a specific contextual reason which requires repeated references to a particular pair of nouns. On the other hand, if an author is found to consistently favour a marked feature such as binomials at frequencies significantly higher than the norm, we may conclude that the feature is used for stylistic embellishment.

To analyse the authors’ styles, we therefore need to look at each individual author’s books as a group. Table 16.8 gives the mean standardised frequencies and population standard deviations for the novels of each author, calculated separately for the N + and + N and N + or + N patterns.

Table 16.8: Binomial frequencies by author in the Corpus of English Novels

Author	Noun and noun mean (/1,000 w)	Noun or noun mean (/1,000 w)
Andy Adams	1.96	0.23
Grant Allen	2.15	0.39
Gertrude Atherton	2.30	0.16
Irving Bacheller	2.29	0.15
Robert Barr	0.71	0.12
Lyman Frank Baum	1.66	0.13
Frances Burnett	1.95	0.18
Hall Caine	2.09	0.17
Ralph Connor	1.97	0.17
Marie Corelli	3.04	0.41
Francis Marion Crawford	1.40	0.13
Arthur Conan Doyle	1.78	0.22
George Gissing	1.36	0.15
Henry Rider Haggard	1.82	0.22
Emerson Hough	1.60	0.28
Jerome K. Jerome	2.13	0.25
Henry Seton Merriman	1.22	0.22
George Augustus Moore	1.83	0.14
Edith Nesbit	1.83	0.10
Gilbert Parker	2.04	0.24
Robert Louis Stevenson	1.68	0.17
Humphry Ward	2.74	0.19
Stanley J. Weyman	1.56	0.18
Edith Wharton	1.51	0.09
Kate Douglas Wiggin	2.64	0.29
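Per-author summaries of this kind can be derived from per-novel frequencies by grouping the novels by author and taking the mean and population standard deviation of each group. The following is a minimal sketch under that assumption; the author labels and values are invented, and summarise_by_author is a hypothetical helper name.

```python
from collections import defaultdict
from statistics import fmean, pstdev

def summarise_by_author(novels):
    """Map each author to (mean, population SD) of per-novel frequencies."""
    grouped = defaultdict(list)
    for author, freq in novels:
        grouped[author].append(freq)
    return {a: (fmean(f), pstdev(f)) for a, f in grouped.items()}

# Hypothetical (author, frequency per 1,000 words) pairs, not CEN data.
novels = [("Author A", 2.0), ("Author A", 3.0), ("Author A", 4.0),
          ("Author B", 0.5), ("Author B", 0.9)]
summary = summarise_by_author(novels)
```

The population form (pstdev) treats each author's novels in the corpus as the complete set of interest rather than as a sample from a larger body of work.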

Box plots of the two binomial patterns are given below in Figures 16.5 and 16.6. In addition to differences between means, it is also easy to see the differences in distribution. Some authors, such as Francis Marion Crawford and Henry Seton Merriman, use binomials in a very consistent fashion with very little variation between novels, while others, like Irving Bacheller and Grant Allen, show considerable variation. Allen’s The White Company (1891) was mentioned earlier for showing outliers in three of the four binomial patterns under investigation, but as seen in Figure 16.5, his overall use of the and pattern is actually quite conservative. Two authors stand out as particularly noteworthy: Marie Corelli and Robert Barr. Corelli uses binomials consistently more than the other authors, while Barr appears to use binomials less than the others, particularly when it comes to the much more common and pattern.

Figure 16.5 Boxplot of and-pattern binomials by author in the Corpus of English Novels

Figure 16.6 Boxplot of or-pattern binomials by author in the Corpus of English Novels

The same data is given in Figure 16.7 as a scatterplot, which makes it easy to see that, while most of the authors cluster together and could thus be described as average or normal users of binomials, there are a few who stand out, most especially Marie Corelli, who overuses both patterns of binomials compared to all other authors. By contrast, Grant Allen is revealed to be an overuser of the or-pattern but not of the and-pattern, while both Humphry Ward and Kate Douglas Wiggin overuse the and-pattern but not the or-pattern. At the other end of the spectrum, Robert Barr underuses the and-pattern and is also an infrequent user of the or-pattern, and both Edith Wharton and Edith Nesbit appear to avoid the or-pattern but to use the and-pattern at a very standard frequency.

Figure 16.7 Scatterplot of and- and or-pattern binomials by author in the Corpus of English Novels

When it comes to evaluating the authors’ use of binomials as a feature characteristic of their overall style, it is worth observing that we are primarily interested in the mean frequency instead of unusually high or low frequencies found in single novels. For example, although Arthur Conan Doyle is a fairly average user of binomials, his novel Born in Exile features a very high frequency of singular and plural and binomials as well as plural or binomials.

There are eleven novels written by Marie Corelli in CEN and the high frequency of binomials is found to be a consistent feature of her style. Corelli was a very popular novelist, who outsold many of her contemporaries, some of whom later found lasting fame, and her melodramatic style and flair for the supernatural are reported to have been popular particularly among the working classes.²² Corelli’s The Life Everlasting (1911) has the highest frequency of binomials in the corpus, but, more significantly, Corelli is consistent in her overuse of binomials. The following example from The Life Everlasting exemplifies the author’s frequent use of binomials:

(3) Age cannot touch them – death has no meaning for them, – life is their air and space and movement – life palpitates through them and warms them with colour and glory as the sunshine warms and reddens the petals of the rose – they grow beyond mortality and are immune from all disaster – they are a world in themselves, involuntarily creating other worlds as they pass from one phase to another of production and fruition.

The author with the lowest overall frequency of nominal binomials in CEN is Robert Barr. His novel One Day’s Courtship, and the Heralds of Fame (1896) has the single lowest frequency of nominal binomials at 0.47/1,000 words. Barr was a British-Canadian author who spent time in both countries, including a spell as a headmaster in Ontario, and befriended many of the literary giants of his time. He wrote crime fiction and mysteries in the mode of Conan Doyle.

We may also note in passing that the four most frequent users of and binomials are female authors, which, given that only seven out of the twenty-five authors are female, is statistically quite improbable; in fact, Edith Wharton is the only female author to use and binomials less than the average. When it comes to or binomials the situation is more even, though, notably, female authors occupy both the highest and lowest mean frequencies. As ever, more research is needed.
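How improbable this is can be gauged with a short calculation. As a minimal sketch, assume that rank is unrelated to the author's gender, so that every set of four authors is equally likely to occupy the top four positions; the variable names are illustrative only.

```python
from math import comb

female_authors, all_authors, top_k = 7, 25, 4

# Ways to fill all four top ranks with female authors, divided by
# all ways to choose four authors out of twenty-five.
p_all_female = comb(female_authors, top_k) / comb(all_authors, top_k)
```

Under that assumption the probability is 35/12,650, or a little under 0.3 per cent. This is a back-of-the-envelope figure, not a formal test, and it ignores how many novels each author contributes to the corpus.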

16.6 Conclusions

What insights do these quantitative results give us into the use of nominal binomials in late modern literature? Firstly, the majority of the more frequent binomials deal with everyday topics, family members, items of food, and so on. By contrast, nouns to do with the senses, emotions and abstract concepts appear to be quite flexible in attaching to other similar nouns. None of the individual binomials stands out as particularly frequent. Secondly, we can say that while there is some natural variation within the sub-corpus of each author’s novels, binomials were used relatively evenly by most of the authors. The observed correlation that when the frequency of one pattern of binomial is higher, it is likely that other patterns show a similar trend, suggests that the authors are at least subconsciously aware of their own use of binomials as a stylistic feature. Only very few of the authors, most particularly Marie Corelli and Robert Barr, used binomials in such a consistently unusual fashion that it seems reasonable to suggest that they did so as a measured part of their literary style.

As is so often the case, at the conclusion of this study more questions are asked than were answered. The data provides a starting point for a more detailed analysis into the productivity of different types of nouns in binomial units and raises some questions about the methodology of identifying fixedness. Questions concerning the dispersion of the nouns used in binomials also had to be left aside, to be taken up in a follow-up study. It would be most interesting to expand the corpus to include less prominent authors as well as authors who write in a wide variety of genres, which would allow an analysis of whether stylistic consistency is a feature associated with experienced or mature authors or whether it is the style itself rather than the consistency of it that is the predictor of literary success. As a general conclusion, this study has perhaps demonstrated some of the potential benefits of studying stylistic features using quantitative methods and larger collections of texts. Most importantly, such approaches help us to formally identify both standard and non-standard features, and to recognise patterns, correlations and causes.

Chapter 17 On the Linguistic and Social Development of a Binomial: The Example of to have and to hold

17.1 Introduction

The collocation of the verbs have and hold would probably not figure high in a statistically robust corpus analysis of habitual binomials in use today. Yet this does not preclude many people from knowing the formulaic to have and to hold because they are familiar with the current frame of the religious matrimonial ceremony. Another frame in which the formulaic to have and to hold is set today, though with much more restricted access, is the law: more precisely, the legally binding written documentation of conveying land or other goods in a will or a lease. As such it confirms the general observation that legal language has a strong affinity with binomials, in particular with near and complete synonyms.¹

However, counter to other verbal elements in legal binomials or even trinomials like, for example, give, devise and bequeath (Tiersma Reference Tiersma2006: 36), both verbal constituents of to have and to hold are of a stunning simplicity, and in fact are so basic that, impressionistically speaking, they might link up to an archaic meaning of ‘possession’ within the legal frame and that of ‘safeguarding’ within the matrimonial frame. Such a straight descent was, for instance, proposed by David Mellinkoff over fifty years ago. In his discussion of to have and to hold he moves swiftly with elegant ease from a (pre)feudal legal meaning, through that of Late Middle English documents of conveyance, to its use in the matrimonial script as fixed in the 1549 Book of Common Prayer (Reference Mellinkoff1963: 93f.). However, as the title of my contribution indicates, the collocational use of have and hold is so well documented that it is worthwhile looking much more closely into its possible origins and its history up to the sixteenth century and later.

One historical specificity, Mellinkoff’s first step, is that the evidence for collocating have and hold goes all the way back to Old English. As we will see, in that period the verbs are coordinated in various congruent inflectional forms and in a range of discourse traditions (see also the Old English part of this volume). This is continued in the Middle English period, with increasing evidence for its narrowed discourse-traditional use in legal documents and in the religious matrimonial ceremony. This narrowing down goes along with the syntactic fixing into the form to have and to hold, but within the two preferred contexts the underlying structure of the formula is different, and I will show that this grammatical difference results from individual developments in the legal and the matrimonial use.

While I was working on (to) have and (to) hold in 2014, the entries for the individual verbs have and hold – and hence also for the binomial have and hold – in the online Oxford English Dictionary (OED) were not only “not fully updated”, but still provided information and documentation taken over from the first into the second edition (²OED). However, during the preparation of this chapter in March 2015, the OED published a completely updated entry for have, and with it also for to have and to hold (³OED have v., P3. Other phrases. a.). The comments on the binomial differ substantially from the old entry, so that I had to take them into consideration here. However, they do not at all make the original core ideas of my contribution obsolete. Quite the contrary.

Before I begin my historical overview, it is necessary to explain my practice of referring to the binomial under discussion. Firstly, unless I give verbatim historical quotes, I use the modern forms have and hold. Secondly, I use coordinated have and hold when I refer to occurrences of the binomial in inflected form or as infinitive that does not have purposive meaning. In contrast, to have and to hold indicates the syntactically frozen use within a given discourse tradition.

17.2 Have and Hold in Old English

In the ²OED, the entry for have, v. (1898) provides historical information on the meaning under “Signification”:²

From a primitive sense ‘to hold (in hand)’, have has passed naturally into that of ‘hold in possession’, ‘possess’, and has thence been extended to express a more general class of relations, of which ‘possession’ is one type, some of which are very vague and intangible.

The ³OED reflects this under “1.a. trans.” as:

To hold in one’s hand, on one’s person, or at one’s disposal; to hold as property; to be in possession of (something received, acquired, earned, etc.); to possess. […]

The ²OED’s historical account for the signification of hold (s.v. hold, v. (1899)) is set out as follows:

In Gothic, haldan is recorded only in the sense ‘to watch over, keep charge of, keep, herd, pasture (cattle)’. […] This is generally accepted as the original sense in the Teutonic langs. (cf. Grimm, s.v. Halten […]), whence have arisen the senses, ‘to rule (people), guard, defend, keep from getting away or falling, preserve, reserve, keep possession of, possess, occupy, contain, detain, entertain, retain, maintain, sustain’, in which it is now used. In some of these hold covers the same conceptual grounds as keep (which has superseded it in reference to cattle), in others it is a stronger synonym of have.

And for the phrase to have and to hold the ²OED (s.v. have, v.;.1c (1898)) argues:

a phrase app. of legal origin (cf. law Latin habendum et tenendum: see habendum n.), retained largely, as in German, Dutch, etc., on account of its alliterative form: To have (or receive) and keep or retain, indicating continuance of possession.

However, this claim of “legal origin” is not substantiated by any quote from a legal text proper. One such piece of evidence could come from Cnut’s Winchester Code (issued 1020–1021), although this may not yet have been accessible at that time:

(1a)
Gif hwa amansodne man oððe utlahne hæbbe & healde, plihte him sylfum & ealre his are.
(Liebermann Reference Liebermann1903–1916: I, 352)³

‘If somebody has and holds an excommunicated or an outlawed man, he shall imperil himself and all his possession.’

In this quote the meaning of have and hold seems the composite one as indicated by the ²OED glossing. And a later Latin translation supports this interpretation. The post-conquest Quadripartitus transposes the binomial thus:

(1b)
[…] excommunicatum uel utlagam habeat et manuteneat […]
(Liebermann Reference Liebermann1903–1916: I, 353)

‘[…] has and maintains an excommunicated or an outlaw […]’

The prosecutable act identified here lies at least as much in the factual initial ‘accepting’ (hæbbe/habeat) as in what follows that ‘having’, namely in subsequently ‘maintaining’ an excommunicated person or an outlaw (healde/manuteneat).

The 2015 update keeps the semantic paraphrase of ²OED, but abstains from giving a definitive account as to the origin of the binomial:

to have and retain; to receive and retain; to continue in possession of. In later use esp. in Christian wedding vows (after quot. 1549). Also (Law) used in a deed of conveyance to define the extent and conditions of ownership (cf. habendum n., tenendum n.) (now chiefly hist.). Chiefly in inf. [In Old English a specific use of an alliterative formula used more widely. With use in context of legal ownership, compare post-classical Latin habere et tenere (from 11th cent. in British sources) …]

The first quote from Old English given in ³OED is the Beowulf line quoted below in (7). But it is set in square brackets indicating that this “quotation is relevant to the development of a sense but not directly illustrative of it” (³OED online “Key to symbols and other conventions”). The second quotation for the binomial is from the late tenth-century Old English translation of Bede’s Historia Ecclesiastica gentis Anglorum:⁴

(2) OE tr. Bede Eccl. Hist. (Corpus Oxf.) v. xvii. 450 Wæs he on iuguþe mon willsumlicre yldo & fægernesse, & ealre his þeode leof heora rice to habbanne & to healdenne [L. ad tenenda seruandaque regni sceptra].

‘He was a young man of pleasant youth and handsomeness and welcomed by all his people to obtain and keep the rule of the realm.’

One may wonder whether the infinitival Old English binomial translating ad tenenda seruandaque better illustrates the meaning ‘have and retain’ than does the line from Beowulf given below in (7). However, the authors of the updated entry are certainly correct in observing that in Old English the “alliterative formula [was] used more widely”, although they give no further Old English evidence. Here are some select quotes substantiating the ³OED’s general statement.

We find a halfline in the Battle of Maldon (not given in any of the respective OED entries) in which the binomial is to be understood literally and/or metonymically, yet, in the latter case not ‘indicating continuance of possession’:

(3)           Us is eallum þearf

þæt ure æghwylc   oþerne bylde

wigan to wige,    þa hwile þe he wæpen mæge

habban and healdan,  heardne mece,

gar and godswurd.

(233b-237a; ASPR 6: 13)

‘It behoves us all that each warrior should encourage the other to fight, as long as he can have and hold a weapon, the hard blade, the spear and the good sword [i.e. as long as he is able to fight].’

But the meaning of ‘physically having and keeping’ is also reflected in the metrical part of the charm named “For Theft of Cattle”:⁵

(4) Garmund,        godes ðegen

find þæt feoh        and fere þæt feoh

and hafa þæt feoh        and heald þæt feoh

and fere ham þæt feoh.

(6–9; ASPR 6: 125)

‘Garmund, God’s thane, find that cattle and lead that cattle and have that cattle and hold that cattle and lead home that cattle.’

As fits the genre, the binomial is in the imperative, yet whether its use in a charm attests to anything other than ubiquity, I leave open. In any event, homilists also avail themselves of the collocation. In the Blickling Homilies (Dominica V. in Quadragesima), for instance, it is used for having the right belief:

(5)
þa þe Godes rices geleafan habbað & healdaþ
(Morris Reference Morris1880: 55)

‘those who have and hold the belief of God’s kingdom’

In fact, the author of this homily is so fond of this collocation that he uses it – along with others such as smeagan & þencan (‘ponder and think’) or reccan & secgan (‘tell and say’) – three times within ten edited lines (Morris Reference Morris1880: 55).⁶

My first quote from Beowulf testifies to a sense that seems to stress the original aspect of guarding encapsulated in hold:

(6) Ic wæs syfanwintre,        þa mec sinca baldor,

freawine folca        æt minum fæder genam;

heold mec ond hæfde        Hreðel cyning,

geaf me sinc ond symbel

(2428–2431a; Klaeber 2008)

‘I was seven-winter old, when the lord of treasure, the lord and friend of the folk, took me from my father, held and had me, King Hrethel, [and] gave me treasure and feasting.’

This reflects the not unusual practice of giving a son into somebody else’s – here: the grandfather’s – care for education.⁷ The binomial is reversed, and as there is no obvious metrical reason for this, the poet might have wanted to highlight the ‘guarding’ aspect, thus tolerating that the institutionalized status of the binomial is broken up. Other than that, the act in which the father gives his son into somebody else’s guardianship might have been accompanied by ceremonious words. These could have sounded similar to the following wording in which Hrothgar hands over his hall Heorot to Beowulf. The Geatish hero has just declared that he has come to help fight the monster that has waged fierce attacks on this hall. At the end of the welcome feasting, Hrothgar announces that he will retire and rest with his queen:

(7) Gegrette þa           guma oþerne,

Hroðgar Beowulf,            ond him hæl abead,

winærnes geweald,          ond þæt word acwæð:

“Næfre ic ænegum men          ær alyfde,

siþðan ic hond ond rond           hebban mihte,

ðryþærn Dena         buton þe nu ða.

Hafa nu ond geheald        husa selest,

gemyne mærþo, […]”

(652–659a; Klaeber 2008)

‘Then greeted the man the other – Hrothgar [greeted] Beowulf – and wished him success, [gave over to] him the control of the banquet-hall and spoke this word: “Never before have I entrusted, since I could lift the hand and the shield, the mighty hall of the Danes, except now to thee. Have now and hold the best of all houses, bear in mind fame […]”’

The fact that we are facing here a ceremonious dispositive act is clear from the imperative forms hafa and geheald, supported by the preceding expressions abeodan geweald (‘give over the control’, ll. 653f.) and the verb alyfan (‘entrust’, l. 655), that spell out the character of the act. What Hrothgar hands over to Beowulf’s protection in this scene is, of course, not just the hall as a building, but also, by way of metonymy, what the hall ‘contains’, namely Hrothgar’s men, and with them the king’s most fundamental social responsibility.

If Examples (6) and (7) mirror residues of an oral culture based on ritual societal bonding, the following quote from ³OED gives evidence of the rising bureaucratic literacy in the wake of the Norman Conquest. It is taken from a bilingual charter of William I issued on 11 May 1068, on behalf of St. Martin-le-Grand at London (Bates Reference Bates1998: 594):

(8) 1309 (►OE) Royal Charter: William I to St. Martin-le-Grand, London in D. Bates Regesta Regum Anglo-Normannorum (Reference Bates1998) 599 Eall þar þar [read þas] þing habbe & healde [L. habeant et teneant] Sanctes Martines mynster & [þa] canonichas a on ecnesse.

‘The minster of Saint Martin and the canons have and hold all these things eternally.’

As the Old English text (roughly) translates the Latin one, linguistic precedence is hard to determine. However, a few lines down the Latin text formulaically concatenates legal binomials expressing privileges in the vernacular such as socnam et sacam (Old English socne & sace ‘right of holding court’; Bates Reference Bates1998: 598f.). This suggests that Latin habeant et teneant also imitates Old English habbe & healde.

To summarize briefly, the Old English evidence clearly shows that the use of the binomial is licensed for the most prominent discourse traditions that made it into writing. Though identical in inflectional form in Examples (4) and (7), only in (7) does hafa […] and geheald function in terms of an act whose socio-moral dimension is quite evident. Does Beowulf thus provide us with a scene of a – temporary – entrustment which could have also taken place in ‘real life’? In other words, was Old English have and hold ‘originally’ a dispositive phrase that by the turn of the millennium had spread to other text traditions in an unmarked form? Because evidence for a safe answer will never be available, I suggest rather that pragmatically the binomial could function either way.

By the time we get to (8), Latin habeant et teneant can reproduce habbe & healde in a royal charter. Although the updated entry in ³OED does not repeat the ²OED’s claim of legal origin, it retains the cross-reference to habendum (and adds tenendum). Yet this is restricted to the legal meaning, as is the remark “With use in context of legal ownership, compare post-classical Latin habere et tenere (from 11th cent. in British sources)”. The authors of the update thus seem to suggest that the legal use is the result of specialization. In doing so they have tacitly cast off a spirit looming large with the late nineteenth-century lexicographers and their treatment of to have and to hold: Grimm and the Deutsches Wörterbuch.

17.3 Binomials and the Law

The connection between binomials and the language of vernacular legal texts was, of course, initially phrased by Jacob Grimm in the introduction to his Deutsche Rechtsalterthümer in 1828. In the introduction he remarks:

Es läßt sich erwarten, daß die in unserer ganzen sprache und dichtkunst eingewurzelte alliterierende form auch in den deutschen gesetzen und gerichtlichen urkunden zu hause sein werde […]. [I]n solchen alliterationen [werden] nur gleichartige redetheile, nicht ungleichartige gebunden […].

(⁴1899.1: 8)

‘It is to be expected that the alliterative form, which is deeply rooted in our whole language and poetic art, will also be at home in the Germanic laws and legal documents. […] Such alliterations only bind equal parts of speech and not unequal ones, […]’⁸

Thirteen years before, i.e. in 1815, Grimm had postulated in his article “Poesie im Recht” (‘poetics in the law’) that vernacular law and vernacular poetry had once been identical, that one contained the other (1881: 154). Apart from all the romantic innuendoes here, it has to be kept in mind that Jacob Grimm was a trained lawyer – or rather: a law historian. So he developed his – ultimately untenable – theory in view of legal history, not of philology, a discipline whose founder he undoubtedly was, though only in hindsight.⁹

An early philologist to take up Grimm’s idea in the later nineteenth century was Moritz Heyne. In 1864 he submitted a Habilitationsschrift at Halle, consisting of a collection of formulae alliterantes (‘alliterating formulae’) from the oldest Frisian laws. In the short introduction to his collection he dares, however, take issue with Grimm’s 1828 statement that “alliterations only bind equal parts of speech and not unequal ones” and falsifies this with the terse statement that “regula illa neque carminibus nec vetustioribus legum libris confirmatur” (‘that rule is confirmed neither by poetry nor by older law books’; Reference Heyne1864: iv). In doing so he willingly (?) overlooks that Grimm made this statement in view of his collection of bi- and trinomials. Apart from this, Heyne is of particular interest for our topic because between 1868 and 1876 (Kirkness Reference Kirkness and Haß2012: 227) – twenty years before the OED’s original entry for to have and to hold – he provided the entries for the letter H in the Deutsches Wörterbuch (WB; e-source). There he underpins the basic meaning of haben as ‘physically hold in hand’ with the observation: “es wird die begriffliche zusammengehörigkeit von haben und halten durch eine enge allitterierende verbindung betont, die weit verbreitet ist” (‘the conceptual solidarity of haben and halten is emphasized by an intimate alliterating connection, which is widely spread’; s.v. haben; WB10.50).¹⁰ To substantiate this he gives two High German prose quotes from legal texts, one from the fourteenth century (gehept und gehalten) and one from the fifteenth century (ze habene und ze behaltenne), and additionally a Frisian legal prose example with reversed order of the collocation. Without further comment he then gives the two respective lines from Beowulf quoted above in (6) and (7). In his halten entry he cross-references to his haben entry, yet this time his binomial quotes all come from non-legal texts, thus perhaps tacitly putting the great master in his place.

Heyne nevertheless follows Grimm in so far as he surmises a semantically reinforcing capacity of alliteration, thus reminding one – avant la lettre – of Jakobson’s “poetic function” of language that is achieved when a paradigmatic relation is projected onto the syntagmatic axis (Jakobson Reference Jakobson and Sebeok1960: 358). Technically, alliteration – itself the binding element in Germanic metrical units – in legal binomials has long been identified as a mnemonic device (e.g. by Sonderegger Reference Sonderegger1962–1963: 270). As research into the formulaicness of Old English and other earlier medieval poetry has shown, formal mnemotechnics and sociocultural memorability seem intricately intertwined. Thus, Paul Kiparsky, in the heyday of the discussion evolving around the ‘Oral-Formulaic Theory’, more generally noted that meter has “both a mechanically mnemonic function […] and a central esthetic function (itself of mnemonic value) of foregrounding” (Reference Kiparsky, Stolz and Shannon1976: 91f.). I myself have tried to grasp this potential by suggesting that the form of what was inherited as memorable may also be reused to mark what one wants to endow with memorability (Schaefer Reference Schaefer1992: 86f.; cf. also Schaefer Reference Schaefer, Nevalainen and Traugott2012). And this also works in cultures that no longer depend on oral mnemonics. Alliterating (and also rhyming) binomials, I would maintain, feed on exactly this and therefore may also be newly produced.

In closing this section, a word of caution is in order with regard to paralleling German and English evidence of the occurrence of binomials in legal language. Their study remained a topic both for scholars of the Germanic languages and for German law historians long into the twentieth century (see Dilcher 1961: 13–15). In the course of these discussions, attempts to account for the legal binomials changed direction from the romanticist myth of a common origin of poetic and legal language to more sound conceptual considerations. Thus, what Grimm interpreted as the force endowing legal terminology with ‘heightened, more vivid sense and more strength and stability’ is understood by Walter Merk (1933) and – with less rhetorical art but more theoretical matter – by Gerhard Dilcher (1961) as the attempt to grasp abstract concepts in concrete words. Complementary to this, Sonderegger’s remark that binomials are early forms of definition (1962–1963: 268) stands to reason. And this ties in well, e.g., with Koskenniemi’s – independently phrased – observation that in “legal language a double expression is generally employed for the sake of precision and not merely for rhetorical emphasis” (1968: 78).

However, beyond this we have to be careful when we interpret any “große Übereinstimmung entlegener örter und zeiten” (‘great congruity of distant places and times’; Grimm ⁴1899: I, 8). In particular, the vernacular legal language in later medieval England cannot and must not be analyzed without its model languages Latin and French. The updated entry for to have and to hold in ³OED hints at this link with the statement “With use in context of legal ownership, compare post-classical Latin habere et tenere (from 11th cent. in British sources); also Anglo-Norman aver et tenir (14th cent.)”.¹¹ But this does not really lead anywhere because, unelaborated as these hints stand, the ³OED users are left with the fundamental question as to how the English binomial is related to the Latin and French parallels.¹² Mellinkoff spoke only of the “grand mixture of languages” (1963: 120), yet this is neither lexically nor syntactically a random matter, as will be illustrated in my discussion to follow.

17.4 Middle English: From have and hold to to have and to hold

With the reemergence of written English after 1200 we also see that the binomial have and hold has persisted. The Middle English Dictionary (MED) gives evidence of a relatively wide range of uses in the first two centuries, while in the fifteenth century the legal use of to have and to hold prevails. As we will see, the MED pays particular attention to the semantic interpretation of legal to have and to hold. In contrast, it does not give any specific prominence to the matrimonial purposive to have and to hold which is also documented as of the fourteenth century. The ³OED’s update, in turn, identifies the two uses separately, but only links the legal use to Latin and French respectively. However, a closer look into the relevant sources as well as syntactic analyses may provide some clearer insights into both the multilingual Middle English scenario and the social backgrounds underlying these uses.

17.4.1 The General Use: 1200 to 1400

The two oldest quotes for have and hold in the MED come from the early thirteenth-century Katherine Group:¹³

(9) c1225(?c1200) St.Kath.(1) (Einenkel) 1867: Þis me were leouere..to habben & to halden þe cwic þen to acwellen þe

‘This I would prefer..to have and to hold thee alive than to kill thee’

(10) c1225(?c1200) St.Marg.(1) (Bod 34) 6/13: Ich hire wule habben & halden to wiue¹⁴

‘I will have and hold her as my wife’

While the binomial in (9) is to illustrate the meaning “to preserve (sb. or sth.), save; keep (life, limbs); protect (sb. from sth.)” (MED s.v. haven vb., 5a(d)), (10) is listed as the binomial variant of simple to haue with the meaning “to accept or receive (sb.) as (one’s king, lord, superior); take (sb.) as (a witness, companion, wife)” (MED s.v. haven vb., 7b(c)). This may be taken as evidence that denotationally the binomial conveys no semantic surplus. On the other hand, if we look at the immediate context of (10), it is tempting to read it as reiterating the matrimonial pledge: ʒef heo his freo wummon. ich hire wule habban & haldan to wiue & ʒef heo þeowe is ich cheose hire to cheuese (d’Ardenne 1977: 56; ‘If she is a free woman I will have and hold her to wife, and if she is a slave I choose her as my concubine’). However, as the whole Katherine Group abounds with binomials (Schaefer 1996), we should probably not overestimate this evidence. A clearly ‘extramarital’ meaning is attested in a line from one of the so-called Harley Lyrics:

(11) c1325 Most i ryden (Hrl 2253) 56: Myhte ich hire haue ant holde, in world wel were me

‘If I could have and hold her, I would feel well in this world’

The fourth meaning for the binomial given in the MED – “To have (sb.) under one; have (a servant, slave, attendant, soldier, subordinate); command (an army or part of an army)” (MED s.v. haven vb., 3(a)) – is identified in a line from the romance Guy of Warwick:

(12) c1330(?c1300) Guy(1) (Auch) 168: Kniȝtes to hauen & holden of pris

‘to have and hold well renowned knights’

In this as in the other earlier Middle English examples the collocation functions stylistically as reinforcing simple (to) have, as we find semantically and contextually equivalent examples for the individual use of both verbs in the MED.

The last example to quote here is adduced by the MED for the meaning “to have (a woman) as wife or mistress” (7b(b)). It is l. 950 of the Legend of St Gregory in the version of the Vernon manuscript, dated 1370–1380 by Keller (1914: vi). I give the quote here with the text immediately preceding the line with the binomial:

(13) To Chircheward heo wenten sone

Barouns two þe lauedy ledde

Al þat men scholde at weddyng don

Þe prest in bok song and redde

As Mon þat wyf wol vndurfon

To haue and holde at bord and bedde

(MS Vernon, ll. 945–950; Keller 1914: 122)

‘Churchward they soon went, two barons led the lady. All that one should do at the wedding the priest sang and read in the book, as a man who will receive a wife, to have and hold at board and bed.’

Here we can be sure that this opens a literary window to the matrimonial ceremony of the period in reiterating part of the wording that by this time had become an integral part of the religious institutionalizing act to be performed by both bridegroom and bride.

17.4.2 ‘Matrimonial’ to have and to hold

The updated entry for to have and to hold in ³OED has it that the binomial can be identified in “later use esp. in Christian wedding vows (after quot. 1549).” If I read this correctly, the ³OED considers the wording in the first Book of Common Prayer to be the oldest evidence. However, written documentation of the binomial in the religious wedding ceremony – almost identical with the ceremony and its wordings in today’s Book of Common Prayer – surfaces at least 150 years earlier.¹⁵ The institutional origin of the ceremonious pledge containing to have and to hold lies in canonical discussions of the eleventh and twelfth centuries as to what constitutes a legal marriage. By the early thirteenth century it was clear that the consensus de praesenti – the agreement in the present (as opposed to that for the future) – was indispensable (Reynolds 2007: 11f.), as it had already been partially practiced before. With the Magdalen Pontifical (Wilson 1910) we have, as Reynolds puts it, a “convenient snapshot of the preliminary rites” from the late twelfth century (2007: 24). Consent was expressed as the answer volo to the priest’s question Vis hanc feminam, asked on the doorsteps to the church. Subsequently the priest asks:

(14)
Vis eam seruare in dei fide et in tua et in sanitate et in infirmitate, sicut christianus homo debet suam sponsam seruare?
(Wilson 1910: 202)

‘Do you will to look after her in God’s faith and in your own, in health and in sickness, as a Christian man ought to look after his wife?’
(Reynolds 2007: 24)

After the bridegroom has answered volo, the bride is asked the same (similiter interroget sponsam), and after her positive answer her right hand is given to the bridegroom, etc. Reynolds supposes that “the priest would probably have conducted the dialogue in the vernacular (i.e. in English or French)” (2007: 24, fn. 88). Reynolds confirms that the “full-fledged dialogical form” – as opposed to the simple volo answer given to the priest – appears in the fourteenth century (2007: 27), and since then it has been well documented in English. Its wording takes up and reverses the Latin binomial in sanitate et in infirmitate, and Late Middle English to have and to hold seems to emanate from Latin servare as a conceptually reinforced equivalent.¹⁶

Here is the relevant passage from the York Missal, whose oldest manuscript is dated “Sec. xiv” (Henderson 1875: xiv):

(15) Here I take the N. to my wedded wyfe, to haue and to holde, at bedde and at borde, for fayrer for fouler, for better for wars, in sekenes and in helth […]

(Henderson 1875: 27)

In other missals from the fifteenth century the vernacular pledge is further documented. In one instance the binomial is reversed (Henderson 1875: xvi), in another it is altogether missing (Henderson 1875: 116*).¹⁷ Nevertheless the pledge stabilizes with the formula to have and to hold, so that it is entered in the Book of Common Prayer of 1549 as

(16) I N. take thee N. to my wedded wife, to have and to holde from this day forwarde, for better, for wurse, for richer, for poorer, in sickenes, and in health, […]

(BCP 1549; e-source)

Here, as elsewhere – and mutatis mutandis – the pledge of the bride to take the husband has the same wording.

Semantically one could contend that hold here too does not just ‘double’ the meaning of have but refers to a consequence of the act of taking. This is confirmed by the ³OED’s paraphrase ‘to have and retain’. And, as a matter of fact, the MED gives evidence for have to wife as ‘marry’ (MED s.v. haven vb., 3(d)) and hold to wife/spouse (MED s.v. holden vb., 8(a)) as ‘keep in matrimony’. This would then suggest the implicational cline take > have > hold. However, I see no way of giving preference to this interpretation over take > [have & hold]. But for its phrasal stabilization in the given cotext and context, it is, I think, much more relevant to look at the construction to wife that may serve as adverbial for all three verbs.

According to the ²OED, the purposive use of the preposition to in PPs becomes “Obs. or arch.” in Modern English with the exception of “certain phrases, as to take to wife, to call to witness, etc.” (s.v. to, prep., conj., and adv.; I. 1.b (1912)). The frozen phrasal (I take thee N.) to my wedded wife/husband to have and to hold thus seems to testify to a reverse development of grammatical constructions involving purposive to. While PPs of the to + NP kind are restricted to ‘archaic’ set phrases, to in the construction to have and to hold is the “original prepositional purposive ‘to’”, testifying to de-grammaticalization of to as infinitive marker in such constructions (Fischer 2000: 155).¹⁸

There remains the question whether ‘matrimonial’ to have and to hold qualifies as an idiom. Going by the narrow definition that an idiomatic expression defies semantic compositional analysis, this binomial is probably not an idiom. But if we allow syntactic anomaly not only as a potential but also as a defining characteristic of an idiomatic expression (cf. Lambrecht 1984: 756), then ‘matrimonial’ to have and to hold would be an apt candidate. And this is not only so because of the to-constructions, but because both have and hold ‘normally’ demand a direct object, unless the to-infinitives carry a passive meaning. This option, however, must be excluded here.

17.4.3 ‘Legal’ to have and to hold in Late Middle English

With the increase of administrative literacy in post-conquest England, distinct legal discourse traditions proliferate from a system of text production established in the chancery of Henry II (1154–1189). Clanchy states that this system “could potentially mass-produce documents from a few stereotypes” (1993: 91). This standardization established text forms with specific formulae allocated to specific places of a document, and the charter takes the most prominent place in the overall administrative-legal discourse tradition. Clanchy defines the charter as “a public letter issued by a donor recording a title to property” (1993: 85). From the early thirteenth century onward, habendum et tenendum has a fixed place in the charter and functions as identified by Hubert Hall over a hundred years ago:

The ‘Habendum et Tenendum’ clause, which appeared in the reign of John, marks the division of the Dispositive Clause into two well-defined parts. The first of these states the nature of the concession, and the second defines its conditions.

(Hall 1908: 25)

We find a famous illustration of this structure in the seeming paragon of charters, the Magna Carta; seeming, that is, because pragmatically the Magna Carta actually contains an agreement between the king and his barons. Yet, as Clanchy notes, it was “issued in the form of an ordinary charter, presumably to emphasize that it was a free gift by the king and not a compromised agreement” (1993: 88). After the initial protocol naming the king in all his functions and the addressees, the king first confirms the liberty of the English Church. Then the Magna Carta reads:

(17)
Concessimus etiam omnibus liberis hominibus regni nostri, pro nobis et haeredibus nostris in perpetuum, omnes libertates subscriptas, habendas et tenendas, eis et heredibus suis, de nobis et heredibus nostris […]
(Stubbs 1913: 293)

‘We have also granted to all freemen of our kingdom, for us and our heirs forever, all the underwritten liberties [i.e. the liberties named below], to be had and held by them and their heirs, of us and our heirs forever […].’

Subsequently, Latin documentary models first developed by the king’s chancery trickled down the administrative scale and spread socially down to the gentry and the emerging merchant class (Clanchy 1993: 44–78). By, say, 1400 the documentary practices had long been firmly stabilized so that their formulaic repertoire could be reproduced in any of the three languages available for documentation (cf. Weber 2010). Therefore, it is only curious on the surface that the two oldest quotations in the MED documenting a legal conveyance are from literary sources. Under “1a (f) in legal phrases: to ~ and to holden, to possess and retain possession” the MED quotes:¹⁹

(18) c1390 PPl.A(1) (Vrn) 2.70: Wiþ þe Erldam of Envye..Wiþ þe kingdom of Couetise..I sese hem to-gedere, To habben [vrr. hauen, haue] and to holden.

‘with the earldom of Envy..with the kingdom of Greed..I seize them together, To have and to hold’

This is part of a mock-charter (cf. Steiner 2003: 107f.) in which False – all the vices associable with False included – is given to Lady Meed, as we read nine lines before the formula:

(19)
That I, Fauuel, feffe Fals to that mayden Meede
(A.II, l. 61; Skeat 1886: I, 47)

‘That I, Deceit, give False into the possession of Meed’

The second late-fourteenth-century quotation is from The Charter of the Abbey of the Holy Ghost, which “purports to be one of the documents associated with the building, confirming the grant of the Abbey and its lands from God” (Boffey 2003: 120). The MED identifies here an alternative construction of the kind to ~ and to holden to and translates this as “to be had and retained by (sb.)”:

(20) (c1390) Chart.Abbey HG (LdMisc 210) 339: To hauen & to holden þis preciouse place..to þe forseyde Adam & to Eue & to alle here eyres.

‘to have and to hold this precious place..to the aforementioned Adam and Eve and all their heirs.’

To hauen & to holden here mirrors, of course, the Latin gerundive formula habendum et tenendum which also immediately precedes the English binomial in two manuscripts of The Charter of the Abbey of the Holy Ghost (see Horstman 1895: 339).

Before we take pains to find out why the MED identifies two different meanings in (18) and (20) respectively, it has to be repeated that the core structure and phrasing of these literary ‘charters’ is not the invention of the English authors. Moreover, the fact that (18) and (20) here figure as earliest examples must be due to the lack of other non-literary evidence from this time. Still, both show that once the textual model was established for a record type, it worked as a template that could, in due course, also serve for the realization of that record type in (Anglo-)French or English. I will illustrate this with a comparison of three samples from medieval charters.

My first quote from a Latin charter containing the habendum et tenendum dates from the year 1235. King Henry III donates Martley Manor – and legal claims ensuing from this – to Geoffrey le Dispencer:

(21)
Sciatis nos dedisse, concessisse et hac carta nostra confirmasse dilecto et fideli nostro Galfrido Dispensario, pro homagio et servicio suo, manerium de Martelega cum advocatione ecclesié ejusdem manerii et omnibus aliis pertinentiis suis. Habendum et tenendum de nobis et heredibus nostris, sibi et heredibus suis, in feodo […]
(Hall 1908: I, 30)

‘Know that we have given, conceded and with this charter confirmed to our esteemed and true Geoffrey le Dispencer, for his homage and service, the manor of Martley with the bailiwick of the manor’s church and everything else belonging to it. To be had and held from us and our heirs, to him and his heirs, in fief […]’

Compare to this a charter in Anglo-French from the year 1409. Although this charter only confirms a forty-year lease of land, the wording follows the same pattern as the royal charter phrased over 170 years before:

(22)
Sachez nous avoir grante et a ferme lesse a Richard Norton certeins croftes de terre, pree et pasture, appelles Dypplyngholme, Penycroft, Pittes, et Blakmanpottez, oue lour appourtenances, deins nostre seignieurie de Rypon. A avoir et tenir, au dit Richard, ses heires, et executours […]
(Ripon, 2_144; e-source)

‘Know that we have granted and let for rent to Richard Norton certain enclosures of earth, grassland and pasture called Dipplingholm, Penycroft, Pitts and Blackmanpots, with the appendages, in our domain at Ripon. To have and to hold, to the said Richard, his heirs and his executors […]’

And finally I quote from a lease issued in 1534 concerning the Manor of Walton in Buckinghamshire:

(23)
witnessith that the said master william ffranklyn, Clerke, hathe dymysed graunted and to fferme letten and by these presentes graunteth dymyseth and to fferme letteth vnto the said william ffranklyn [of Thyrley] and katheryn his wiffe, his Manour place of walton in the countie of buckes, […] and advantagyes […] in eny wise belongyng.

To haue and to hold, and peassebly to occupye and inioye, the said manour of walton […] vnto the said william and katheryn his wiffe, ther executours and assignes […].
(Clark 1914: 171)

‘witness that the said franklin Master William, clerk, hath demised, granted and let for rent and by this document demises, grants and lets for rent unto the said franklin William [of Thyrley] and his wife Katherine, his manor place of Walton in Buckinghamshire […] and privileges […] in any way belonging thereto.

To have and to hold, and peaceably to occupy and enjoy, the said manor of Walton […] unto the said William and his wife Katherine, their executors and assigns […].’

To show the relevant linguistic parallels in these three samples I list them side by side in the following table:

Table 17.1: Comparison of Latin, French and English formulaic syntax in charters

	Royal Donation 1235 (Lat)	Ripon 1409 (AFr)	Lincoln 1534 (EModE)
a.	Sciatis nos	Sachez nous	witnessith that the said master […]
b.	dedisse, concessisse	avoir grante et a ferme lesse	hathe dymysed graunted and to fferme letten
c.	et hac carta nostra confirmasse		and by these presentes graunteth dymyseth and to fferme letteth
d.			vnto the said william ffranklyn and katheryn his wiffe
e.	manerium de Martelega cum […] aliis pertinentiis suis	certeins croftes de terre […] deins nostre seignieurie de Rypon	his Manour place of walton in the countie of buckes […] and advantagyes […] in eny wise belongyng
f.	Galfrido Dispensario	a Richard Norton	
g.	Habendum et tenendum	A avoir et tenir	To haue and to holde and peassebly to occupye and inioye
h.			the said manour of walton […]
i.	de nobis		
j.	sibi et heredibus suis	au dit Richard, ses heires, et executours	vnto the said william and katheryn his wiffe, ther executours and assignes

As stated by Hall, habendum et tenendum functions as a marker to the effect that what follows are the conditions of the bequest. However, the syntactic function of the Latin gerundival binomial is that of a complement to the head of the direct object, here: manerium, and, in fact, in the present instance it is in inflectional concord with this head. Yet this seems to be a coincidence because, as Eileen A. Gooder has stated, expanding “habend’ et tenend’ to agree with the properties makes for more harmonious Latin, but experience shows that habendum et tenendum could not be regarded as an incorrect expansion.” She thus relativizes her own hunch that by abbreviating the endings of the gerundives, the scribes may have covered their uncertainty, because, as the statement of the donation extended in linguistic size, the allocation of the concord became more and more problematic. She ultimately sees an increasing grammatical independence of the Latin binomial in “the practice of […] signposting the successive clauses by a capital letter for the leading word” (1978: 59).

But let us now look more closely at the parallels in the three samples. Preceding the Habendum et tenendum / A avoir et tenir / To haue and to holde, all three texts express what is to be donated in an extensive NP (see line (e)) functioning as the direct object to the preceding verbs (see lines (b) and (c)). The Latin text names the donee in the dative object NP, the French in a prepositional adjunct à + NP following the direct object (see line (f)), while the English text has the equivalent prepositional adjunct to + NP (see line (d)) precede the direct object. The wordings immediately following the formula vary in so far as the French and the English texts do not repeat the grantor here, and the English text is the only one that repeats the grant. This seems to be the rule in English realizations as we have already seen in (20).

Finally, the habendum et tenendum formula itself: (Anglo-)French can obviously realize the Latin gerundival complements of the ‘donation’ NP as purposive infinitives with à, and in English the to-infinitive binomial can do the same. Whether or not the English is a constructional replication of the French solution seems grammatically irrelevant as the ‘replica’ construction in English fits into ‘native’ English syntax (cf. Matras and Sakel 2007). This fit is continued from Old English, where formally active purposive infinitives may have passive meaning when functioning as adjunct to a matrix clause object and “the subject of the infinitive is […] controlled by another NP in the matrix clause” (Fischer 1991: 179f.; see also 1991: 142). Yet some syntactic problems apparently arise from what follows the to-infinitive imitating the French à + infinitive transposition of the Latin gerundive.

In our sample comparison, only the English text repeats the donation in the form of a direct object as syntactically demanded by the preceding matrix VP(s) and here provides the direct object for the infinitives. However, the ‘native’ English syntax apparently breaks down with the repetition of the adjunct PP to + NP (naming the donee) whose NP is the ‘implied subject’ of the infinitives. Again, the parallel reading shows that the French version supplies the same structural solution for the Latin gerundival construction. I cannot judge whether or not this French solution is a quaint one.²⁰ However, in English the formula creates an apo koinou construction with the postponed adverbial adjunct to + NP (reproducing the repeated indirect object of the Latin model) syntactically stranded, as it were.

In brief: with the loss of the inflectionally marked dative for nouns having been completed a long time before, the legal use of to have and to hold seems to have survived as a syntactically opaque phrase whose proliferation is only licensed by its institutionalized use in a set discourse tradition. The MED’s translation “to be had and retained by (sb.)” for the construction to ~ and to holden to thus amounts to identifying an idiom proper, because only in combination with the following PP to + NP does the to have and to hold formula have a ‘passive meaning’.

If we compare ‘legal’ to have and to hold to the matrimonial formula, an interesting difference arises: while there the non-repetition of the direct object of the verb binomial creates a syntactic anomaly, the syntax of the binomial in the legal texts displays an overexplicitness in that it repeats the PP to + NP, naming the donee – which results in a syntactic collapse. Overexplicitness has long been identified as the hallmark of legal texts, a discourse-specific characteristic that Joanna Kopaczyk, following Matti Rissanen, has so nicely called the “paradox of verbosity for the sake of clarity” (2009: 90).

Finally, the intent of my preceding analysis was not to make any contribution from which valid general conclusions for the history of, say, English infinitive constructions may be drawn. As we have seen with ‘legal’ to have and to hold, this construction petrified into the pillar of a specific discourse tradition, or, as Kopaczyk put it, as one of the “text-type markers” in legal documentation (2009: 90). Hence, quantitative approaches to a binomial such as ‘legal’ to have and to hold might turn out to be of relatively confined linguistic relevance, as balanced as the composition of the overall corpus may be. For instance, Luis Iglesias-Rábade’s study of what he calls “twin lexical collocations” in legal Late Middle English provides the finding that in his corpus have and hold is the “second-highest V and V collocation” (2007: 24). Yet this does not really say anything about the collocational cohesiveness of the two verbs or about a preference of the use of this collocation. Instead, a quantitative finding of such specific-purpose binomials found in their ‘natural habitat’ numerically substantiates that a specific formula occurs where and in the form in which it has to occur in this confined discourse tradition.

17.5 Conclusions

Collocating the verbs have and hold obviously has a long tradition in English and beyond, and it is quite likely that Heyne was right in stating that their alliterating quality highlights their close conceptual relation. The evidence from Old English proves the binomial to be largely fixed in sequence yet versatile in its inflectional form and its occurrence in different discourse traditions. Its scenic embedding in (7), taken from Beowulf, suggests a ceremonious use of the binomial in which somebody is not only entrusted with a hall, but also – metonymically – with all the moral duties ensuing from this possession. Despite the evidence for this performative use, there is no way of knowing whether this indicates where the binomial originated.

In contrast, we safely know where the binomial ended as a fixed to-infinitival formula in the late Middle English period. Moreover, my discussion of ‘legal’ and ‘matrimonial’ to have and to hold should have made it clear that here we are not dealing with a straight line of descent, as Mellinkoff – and now the updated entry in the ³OED (by way of postulating specialization) – surmise. And although the recent ³OED entry does distinguish between the ‘legal’ and the ‘matrimonial’ binomial, it treats this split as a simple matter of sequence, not of descent. As I have shown, ‘legal’ to have and to hold has a history clearly running within the confines of high and late medieval documentary practice from Latin through (Anglo-)French to English, resulting in formulaically equivalent building-blocks in the three available languages. ‘Matrimonial’ to have and to hold, in turn, has a Middle English history of its own, once the solemnization script provides the bridegroom and the bride with their own wordings of the pledge in lingua materna (cf. Henderson 1875: 26) in the dialogical form cued on the scene by the priest. From what we have seen with regard to the development of the ‘legal’ to have and to hold, it is quite unlikely that the matrimonial formula was a simple adoption of the legal one. Not only the different discourse traditions in which the ‘legal’ and the ‘matrimonial’ formula evolved, but also their different underlying syntactic structures speak strongly against such a straight line of descent.

All this does not, of course, preclude that in both instances the age-old binomial have and hold resounds faintly. Yet I find it hard to believe that it (still) did so in the ‘legal’ to have and to hold by the time it surfaces in fifteenth-century English-phrased wills and leases. And this doubt is not really dispelled by the likely speculation that the Latin coordinated gerundives of habere et tenere were once the simulacrum of the vernacular binomial, later to be transposed into the (Anglo-)French infinitives aver e tener. In contrast, for ‘matrimonial’ to have and to hold a much more immediate link may be surmised because it is part of a vocal ritual. Moreover, in this particular part of the solemnization script, to have and to hold is not the only binomial: from the Latin et in sanitate et in infirmitate have sprouted other adjuncts enumerating the conjunction of imaginable opposites as scenarios of the conditions for the having and holding. ‘Legal’ to have and to hold, in turn, experiences, for instance, the extension and peassebly to occupye and inioye as in (23).²¹ While these may be attributed to the pedantic mannerism of legalese, the repetitive contrastive rhythm of the conditional binomials in the matrimonial pledge instead seems an apt rhetorical device to intensify the ultimately archaic to have and to hold.²²
