Twentieth-Century Received Pronunciation

doi:10.1017/9781107279865.004

Chapter 3 Twentieth-Century Received Pronunciation: Prevocalic /r/

3.1 Introduction

Because of its dominance in the programming by the British Broadcasting Corporation for many years, Received Pronunciation is a relatively accessible variety for historical corpus collection and sociolinguistic comparison. There is now a large amount of recorded speech data available from a range of online sources, and more is constantly being added to the collections as resources permit material to be salvaged and digitized (see, e.g., www.bbc.co.uk/archive/tv_archive.shtml?chapter=10). Since the BBC and British Library archive collections are made up of exemplars of many types of speech context and content, rather than, for example, consisting solely of autobiographical interviews or group conversations, the challenge for the discipline of sociolinguistics lies in being able to assess this historical recorded data in a sociolinguistically sensitive way. This task requires constant attention to the circumstances of the recordings, to be able to set up systematic corpora which can provide comparable and quantifiable data sets, if the aim, as in this chapter, is to treat the material using proven sociolinguistic methods, including statistical treatment.

For the purposes of this chapter, the author has gathered a small corpus of fourteen speakers gleaned from the BBC archive, from recordings in a variety of settings and types of TV or radio programme, many of which consisted of personal reminiscences of various kinds, while others were documentary features, made between 1939 and 1977. This small corpus consisted of just under four hours of recordings, which yielded 2,511 tokens of /r/ spoken in a range of linguistic contexts.¹ The data were analysed auditorily and explored for patterns of co-variation between speaker profile, linguistic characteristics and time depth in the historical record. The results show that tapped and trilled variants of /r/ (analysed together as ‘taps’) seem to have a different social and linguistic profile to ‘labial’ variants, which were found primarily in the speech of three individuals. The data suggest that these /r/ variants have different statuses, premised on different conditions in time and social space.

Systematic observations of historical recordings can ultimately provide points of quantitative comparison with younger data sets which have yielded evidence of other changing features during the twentieth century. These include many features which have already been examined quantitatively, such as conditioned t-glottalling (Fabricius 2002), L-vocalization (Wells 1997), as well as vocalic variation and change, most notably in the TRAP, FOOT and GOOSE vowels (Harrington et al. 2000, 2011; Fabricius 2007a, 2007b), and the weak-syllable happY-vowel (Harrington 2006). Other candidates for such investigations would be the GOAT-split (Wells 1997) and GOAT-fronting to similarity with FACE, as well as an incipient GOOSE-split which has been reported anecdotally in the south-east of England.² FACE,³ MOUTH and FLEECE, on the other hand, also provide interesting points of comparison. Creaky voice quality, smoothing of diphthongs (Hannisdal 2007) and conservative features such as apical /t/ and /s/ have also been treated sporadically in the literature on RP (for present-day developments in /s/, see Levon and Holmes-Elliott 2013), and could also be considered more extensively in future diachronic comparisons in data sets constructed using online resources.

3.2 Background

If there is one variety of British English which is amenable to quantitative exploration through the construction of a historical sociolinguistically sensitive corpus of early recordings, it is surely the British English variety we have come to know as Received Pronunciation (RP). Its dominance of the airwaves in the early days of the BBC is well documented, so that historical archives such as those of the BBC and the British Library are replete with examples of spoken material. ‘Listening to the past’, in this case, is a relatively accessible activity, although the sheer volume of material means that much background research remains to be done: the recordings that can be analysed here are only a tiny fraction of what could be considered systematically, and the analysis presented here can only be regarded as a first step in this direction. Future research of this type has a great deal of potential to detail the historical trajectories which are suggested by summaries such as Wells (1997), and to contribute to a better theoretical understanding of detailed quantitative linguistic pathways to feature obsolescence. In addition, other historical sources from the time can provide glimpses of rich social context for these linguistic variations, or even specific metalinguistic commentary (see further below).

I will make brief remarks here on a related meta-issue, that of labelling ‘the accent’, since this is a facet of the considerable historical sociolinguistic complexity of RP, although as such it is outside the major focus of the present chapter. Recently, within phonetics especially, authors have adopted the designation SSBE (Standard Southern British English) (Hudson et al. 2007), while other writers continue to refer to (modern) RP (Fabricius 2000, 2002), and the implications of the use of various names are not irrelevant to the discussion. The latter suggests a generational continuity with older forms of RP, while SSBE makes no such tacit claim. The complexity of giving the accent ‘a name’ is increased further by the sheer scale of the speech community we are dealing with here. The population of England is probably presently 56–57 million people,⁴ and if we take as a first estimate Trudgill's reference to around 3% of the population being ‘RP speakers’ of some kind (discussed in Trudgill 2002), we arrive at a likely population of around 1.7 million people. Clearly, empirically grounded linguistic generalizations over that scale are difficult to make, and we can also assume that the likelihood that such a number of speakers represents one completely homogeneous spoken variety is probably small.

Any analysis of RP, moreover, has to confront the inbuilt sociolinguistic ambiguity of the name (Fabricius 2000: 29–36). The name RP has become an enregistered folk concept (see Agha 2003; Fabricius and Mortensen 2013), and the variety has been formally codified in many descriptive phonetic publications since the early twentieth century (for details of the enregisterment history, see, e.g., Leitner 1982; Mugglestone 2007). The result is that the name refers not only to a vernacular variety (or set of features) spoken by a primarily socially defined group of speakers, but also to a more or less overtly codified pronunciation blueprint or norm, a ‘construct’ which is known and recognized by speakers of many other varieties of English, both within the UK and beyond, especially in former colonies. Discussions that arose in the 1990s as to whether Estuary English was ‘replacing’ RP, for instance (cf. Trudgill 2002), remind us that the philosophical question of when a variety ‘ceases to be’, historically, is complex and remains pertinent. In a sense, this type of question can be raised of any ‘enregistered’ variety, be it accent, dialect or language. Enregisterment (Agha 2003) as a process produces a labelled construct that names a way of speaking and circulates as a piece of social knowledge, so that ‘BBC English’ and ‘The Queen's English’ (compare ‘Geordie’, ‘Scouse’ or ‘Cockney’) are culturally defined labels that are attached in the community's minds to linguistic features at a particular point in history, either individually (‘dropping your t's’, for instance, as a folk term for t-glottalling) or collectively as ‘spoken styles’ (Johnstone et al. 2006).

Received Pronunciation's broadcast prominence during the emergence of British public radio (with the founding of the British Broadcasting Corporation in 1926), and for many years after, was the result of its social position as the speech variety used by people within a socially, culturally and economically circumscribed dominant elite group. RP was instituted and remained unquestioned as the ‘proper’ accent in which to conduct broadcasting from the earliest years of the BBC, until its monopoly gradually broke down over the years since the 1960s. Its original claim to prominence seems to have been justified on the basis of perceptions that it had widespread acceptance, or rather, did not provoke the opposite. Leitner (1982: 98) quotes a 1929 letter from John (later Lord) Reith, at the time Managing Director of the BBC, to Robert Bridges (then the Poet Laureate), wherein he claims that the BBC's Advisory Committee aspires to promote pronunciation forms that reflect ‘the type of educated English which can be broadcast without evoking any considerable degree of relevant adverse criticism’. This is of course a claim that stems from a particular class and ideological position; related to that, Leitner (1982) describes the sociolinguistic anxieties of the ‘new man’ of the early part of the twentieth century, the socially upwardly mobile individual concerned with sociolinguistic acceptance (cf. Preston 2013 on linguistic insecurity).

For academic linguists, RP has most often been described in a ‘variety-based’ way that is commensurate with the structuralist paradigm in linguistics, which flourished at the same time and, as Leitner (1982) points out, coincided with a general societal concern with the establishment of a British standard pronunciation. Certain speech patterns and forms came to be considered canonical ‘RP’, while others were definitely excluded from it and indeed railed against. Trudgill (2002: 174) formulates it thus: ‘[w]hen it comes to employing a codified language variety, a miss is as good as a mile’. Indeed, Daniel Jones, lecturer in Phonetics at University College London in 1910, came up against considerable opprobrium from the man who became Poet Laureate in 1913, Robert Bridges (see Collins and Mees 1999: 104–106), as to whether or not professional phoneticians had a duty to recommend certain pronunciation forms and not others. Daniel Jones himself was notably objectivist rather than prescriptivist, apart from some comments in his very earliest book-length publication, The Pronunciation of English from 1909 (Collins and Mees 2001).

In other words, the variety RP came to be treated by society as a whole as a descriptive linguistic monolith, codified in pronunciation dictionaries. This viewpoint became more and more difficult to sustain as time went on, and observations of successive generations of speakers meant that the model regularly needed to be adjusted to represent what seemed to be new mainstream pronunciations; cf. Gimson (1981) and his discussion of criteria for his updates of the English Pronouncing Dictionary, originally Jones (1917). Sensitivity to the developments that RP speech has been undergoing is also evident repeatedly in Gimson's and Wells's publications, and the second and third editions of the Longman Pronunciation Dictionary (Wells 2000, 2008) include reports of speakers’ perceptions of their own usage of certain pronunciation variants.

This ongoing process of change ultimately raises the philosophical questions of when RP might have ceased to be – a new label such as SSBE circumvents this – and how closed the boundaries are to be considered, questions that were especially exercising linguists and lay persons alike in debates on ‘Estuary English’ during the 1990s (cf. Kerswill 2007). Furthermore, the enregisterment of a variety under a new name renders invisible the sociolinguistic process of obsolescence that individual linguistic features undergo within a generational frame, thus cutting these processes off from theoretical consideration and insight. That so much energy has been spent on questions of RP's linguistic enregisterment (for academic treatments of this, see Trudgill 2002; Wells 1997) is probably indicative of its continuing gatekeeper function, which has undoubtedly been diluted since the 1960s; RP's place in the sociolinguistic landscape has been affected by widespread socioeconomic and ideological change over the past fifty years (Coupland 2010; Coupland and Bishop 2007; Kerswill 2007; Armstrong and Mackenzie 2013). Mugglestone (2007: 254–294) describes RP's historical transition from ‘proper’ to ‘posh’, and this alliterative characterization sums up the transformation well. It seems that a variety's demise must depend upon the loss not only of distinctive (majority) variants that characterize the speech form, but also the loss of its ideological place in the sociolinguistic landscape. If RP was established and codified to fulfil a social gatekeeping role, and continues to exert that pressure in certain contexts, even if its phonetic forms have undergone systematic sociolinguistic change, is it then a ‘dead accent’?
The only way out of this conundrum, it seems to the present author, is to regard any accent as an assemblage of a collection of linguistic features as well as an enregistered ideological linguistic construct (a set of postulates about what the accent sounds like). Determining whether RP exists, then, is a tractable empirical question, at least from a micro-linguistic point of view, feature by feature (on the present ideological landscape of RP as expressed through one young speaker's metalinguistic reactions, see Fabricius and Mortensen 2013).

Systematically collected quantitative speech survey data in the UK has long ignored RP, in contrast to other varieties which have been the focus of dialectological studies (the Survey of English Dialects, for example; see Robinson, this volume) and sociolinguistic studies for many years (US urban varieties, Scottish English); for a discussion of the reasons which lie behind this academic sociolinguistic gap, see Fabricius (2000). RP was not generally the subject of systematic quantitative variationist studies until around 2000 (for example, Fabricius 2000; Harrington et al. 2000; Altendorf 2003; Hawkins and Midgley 2005; Hannisdal 2007). Historical sociolinguistic corpora of English, such as those listed on the CoRD database (see www.helsinki.fi/varieng/CoRD/index.html), are an important research resource, but the potential for productive spoken language corpora built with historical materials has scarcely been touched, let alone fulfilled.

3.2.1 Variationist Sociolinguistics and RP

In constructing and analysing a historical corpus, even on the relatively small scale as in this chapter, we are subscribing to the well-known apparent time hypothesis (Bailey 2002), the assumption current within variationist sociolinguistics, that speakers’ phonological patterns and phonetic variation, assuming no great shifts, spatial or otherwise, in their lives, reflect their vernacular and thus the community grammar at the time of their early childhood. The assumption is that this vernacular variety will be evident in speech production throughout their lifespan. The motivation for different types of speech tasks in the classic sociolinguistic interview was that vernacular speech would emerge most strongly in situations where speech production was least monitored. The concomitant of this premise that ‘vernacular situations’ were somehow definable and separable from other more monitored situations is not one which has generally held up to critical scrutiny, as has been shown in publications since Bell (1984) (see, e.g., the papers in Eckert and Rickford 2001).

Nonetheless, I subscribe to an assumption here that variation in certain linguistic features will be ‘under the (vernacular) radar’ for speakers, and, in the absence of evidence to the contrary, include /r/ variation in that category. Determining whether /r/ variation was subject to overt commentary at the time of the recordings will require more extensive research on this type of data and in historical sources than has been possible to date. I make some comments in the analysis below about the types of media event that these examples of early recordings might represent, and the relationship that these might have to vernacular speech of the time.

These caveats notwithstanding, this work proceeds from a quantitative sociolinguistic point of departure. Its analyses are premised on the theoretical claim that examination of quantitative patterns of the distribution of variants reveals the embedding of the variation in a social grammar of the variety. It assumes that speakers are members of a community of some sort, although such speakers do not occupy a clearly defined geographical area, nor do they necessarily interact often with each other, so that in this case the community is widely defined as speakers with upper middle class and upper class backgrounds⁵ (see Appendix), since our knowledge of these individuals’ actual social networks at the time is limited. We work on the premise that /r/ constitutes a linguistic variable with potential sociolinguistic significance within an indexical field (Eckert 2008), and the task at hand in this chapter is to begin to reveal a structure to that significance in the production of speech.

3.2.2 /r/ in Historical RP

Figure 3.1 reproduces the table of English speech sounds from Daniel Jones's The Pronunciation of English, first published in 1909. The interesting feature of this chart is the representation of the consonant /r/, which is split into two forms, the roll (or trill) and the so-called ‘fricative r’ (it is not clear precisely what this is intended to refer to). No explicit mention is made of a ‘tapped’ r, which, as we shall see below, is an important feature of the corpus of recordings that are analysed here.

Figure 3.1 Consonant chart, Jones (1909: xiii).

Wells (1997) dates the loss of tapped /r/ and its replacement by alveolar [ɹ] in intervocalic positions to the ‘early twentieth century’, and while we do not have sufficient data to corroborate this in the present corpus, where no speakers were born later than 1918, it would indeed be interesting to pursue this variant in speakers born later in the century (again, more research is needed). It is noticeable, for instance, in the recording with Lord Cromer, YM1 in this corpus (see below), that the interviewer (Paul Ferris) uses tapped /r/ far more frequently than the interviewee.

Labiodental and generally fronted /r/s are not mentioned in Jones's treatments of RP pronunciation, but are discussed in Wells's description of U-RP (Wells 1982: 282), along with tapped /r/. Labiodental [ʋ] is referred to as being ‘regarded as an upper-class affectation’ (Wells 1982: 282), an image which is corroborated by George Orwell's representation of it in 1936 in Keep the Aspidistra Flying (cited in Foulkes and Docherty 2000), where <w> for orthographic <r> is used to represent /r/ within a consonant cluster in bwowse and poetwy, intervocalic /r/ in tewwible, and initial /r/ in wesist, in a parody of upper-class speech.

We turn now to a discussion of the corpus data and its analysis.

3.2.3 Data

In dealing with the historical material, it has been a principle of the work to select recordings and speakers from the BBC archive who can be independently established to fall within an expected social grouping of RP speakers at the time. Independent social criteria were used (contra, e.g., Gimson 1981 referred to above) so as to circumvent the circularity inherent in choosing RP speakers by means of linguistic criteria alone, a problem discussed in Fabricius (2000, 2002). The group of RP speakers, for the time period covered in the present corpus, where speakers were born between 1880 and 1918, largely consisted of speakers with upper-middle-class or upper-class backgrounds, who, according to Jones (1917: viii), came from families and social circles whose education had typically been at the English public schools. This educational criterion certainly applies in the case of the male speakers in the corpus, many of whom have preparatory and public-school backgrounds, while many of the female speakers were educated by governesses rather than at school, although this is not the case for all of them (Speaker OF2, for example, had a university education).

The data were selected by surveying the publicly available BBC archive corpus www.bbc.co.uk/archive/. Tables 3.1a and 3.1b give details of the selected recordings. Note that OF1 and OF2 are interviewed within the same recording.

Table 3.1a Composition of the data corpus

ID	Speaker	DOB	Recorded	Length
OF1	Baroness Asquith	1887	1968	00:22:11
OF2	Baroness Stocks	1891	1968	00:22:11
OF3	Dame Agatha Christie	1890	1955	00:02:46
			1955	00:02:06
OF4	Bridget Monckton, eleventh Lady Ruthven of Freeland	1896	1977	00:13:37
OM1	Baron Dowding	1882	1968	00:36:41
OM2	Viscount Alanbrooke	1883	1957	00:28:54
OM3	A. P. Herbert	1890	1954	00:09:13
OM4	E. F. L. Wood, first Earl of Halifax	1881	1939	00:17:21
YF1	Doris Langley-Moore	1902	1957	00:13:27
YF2	Lady Alexandra Naldera Curzon	1904	1977	00:07:37
YF3	Dame Daphne du Maurier, Lady Browning DBE	1907	1971	00:12:20
YM1	Lord Cromer, GRS Baring	1918	1964	00:08:18
YM2	Cecil Day-Lewis	1904	1962	00:12:52
YM3	Sir Arthur John Gielgud	1904	1954	00:25:59
	TOTAL TIME			3:55:33

Table 3.1b Links to online recordings

ID	Link to recording
OF1	www.bbc.co.uk/archive/suffragettes/8318.shtml
OF2	www.bbc.co.uk/archive/suffragettes/8318.shtml
OF3	www.bbc.co.uk/archive/agatha_christie/12501.shtml
	www.bbc.co.uk/archive/agatha_christie/12503.shtml
OF4	www.bbc.co.uk/archive/edward_viii/12939.shtml
OM1	www.bbc.co.uk/archive/battleofbritain/11421.shtml
OM2	www.bbc.co.uk/archive/churchill/11010.shtml
OM3	www.bbc.co.uk/archive/churchill/11009.shtml
OM4	www.bbc.co.uk/archive/ww2outbreak/7933.shtml
YF1	www.bbc.co.uk/archive/whatwewore/5610.shtml
YF2	www.bbc.co.uk/archive/edward_viii/12927.shtml
YF3	www.bbc.co.uk/archive/writers/12222.shtml
YM1	www.bbc.co.uk/archive/menandmoney/6800.shtml
YM2	www.bbc.co.uk/archive/van_gogh/10901.shtml
YM3	www.bbc.co.uk/archive/hamlet/8505.shtml

Each individual's social and family background was investigated as far as possible through online sources (primarily Wikipedia) to determine the extent to which speakers could be said to have a social background which would firmly place them within the upper middle class or upper class, and/or a relevant educational background. Some speakers clearly came from aristocratic families (OF4, OM4, YF2, YM1, YM2). Several speakers had attended public schools or Oxbridge, or had received military officer training. Growing up in England was generally considered a requirement, but growing up abroad before education in England was also considered admissible when other background factors were taken into account (aristocratic backgrounds in India, for example). Cecil Day Lewis proved to be an interesting example, as close listening to his recording showed traces of his Anglo-Irish upbringing, which emerged in slight phonetic details (postvocalic /r/, and clear /l/ for syllabic /l/ (by now, 2015, almost a historical variant; Raymond Hickey, p.c.)) in a small passage of a few seconds in the recording (5:31 to 5:38; in the phrases ‘of the highest art’ and ‘lower art’, ‘people are more important’). Day Lewis's biography also suggests that he considered himself a British citizen rather than an Irish one. As these pronunciations were isolated, he was considered eligible to be in the corpus. The Appendix gives further summary biographical details for each speaker included.

The recordings cover a range of situational contexts and speech registers, from personal domestic settings, in the case of Daphne du Maurier, talking about her writing career in a recording made at her private home, to Lord Cromer, Governor of the Bank of England, being interviewed in the bank about the role of the Governor vis-à-vis the government of the day. Interviews of war reminiscences feature as well, with individuals who were high-ranking military officers at the time (Hugh Dowding and Lord Alanbrooke). Baroness Asquith and Baroness Stocks are interviewed together about the history of the suffrage movement some fifty years before the recording. Several recordings are monologues on various topics, ranging from Cecil Day Lewis's presentation of Van Gogh's art and Doris Langley-Moore's documentary on the history of fashion to A. P. Herbert's reminiscences of Winston Churchill's parliamentary career and Sir John Gielgud's discussion of Hamlet. As we saw above, there are hints in the phonetic literature that trilled /r/ (or, as it is called in Daniel Jones's early work, the rolled /r/) had a special status around the turn of the twentieth century. Trilled /r/ is very rare in the corpus of recordings, occurring in all only ten times (while tapped /r/s occur 333 times in all). Five of these occur in Cecil Day Lewis's monologue on the art and life of Vincent van Gogh, and on that basis it may be that the trill has in the past had a special ‘performance-style’ status, but this idea requires more research. There is clearly the potential for a good deal of stylistic variation in the historical data due to speaker stance (Kiesling 2009) and topic.

One small observation of the potential effect of speaker stance on pronunciation can be illustrated from the data: the interview between an interviewer, Baroness Stocks and Baroness Asquith. In the last part of the interview, the interaction moves to more of a conversation between the two interviewees, who begin to discuss animatedly what types of barriers to public life for women have been removed over the years. At this point, there is a case of trilled /r/ within a syllable onset cluster in ‘breached’, occurring at around 19 minutes into the recording, uttered by Baroness Asquith. The same speaker also utters a trilled /r/ in one token of ‘heroism’ at 14:39 minutes into the recording, which immediately follows two medial alveolar tokens of ‘heroic’. A spectrogram of this token is shown in Figure 3.2. These cases demonstrate that there is a potential value in a close qualitative analysis of interactional details, as a supplement to larger-scale quantitative work such as in the present chapter. The fine-grained moves of stance-taking on an individual level may indeed reveal interesting tendencies which need to be taken into account in a more complete analysis.

Figure 3.2 Waveform and spectrogram of trilled /r/ sequence in ‘heroism’ spoken by OF1 (Baroness Asquith).

3.2.4 Methods

The final set of recordings having been chosen and located in the BBC Archive (www.bbc.co.uk/archive/), all recordings were accessed and recorded as WAV files onto a personal computer using SoundTap software (www.nch.com.au/soundtap/). Each sound file was then imported into ELAN (http://tla.mpi.nl/tools/tla-tools/elan/), which enabled the linking of the sound file with its transcription. For each file, tokens of /r/ were identified in four phonological positions: word-initial, medial/intervocalic, potential sites for r-sandhi (‘linking r’), and within clusters at the onset of syllables. Separate tiers for the word and its phonetic transcription were used systematically throughout all ELAN files. Phonological position within the word was noted for each case.⁶

The varying sound quality of the recordings meant that acoustic instrumental analysis of the data was judged to be potentially difficult and in some cases not possible at all. In addition, while some recordings were clearly BBC studio recordings, others were made in differing circumstances (one in a private home, for instance), and in none of the cases was it possible to reconstruct what recording equipment had been used. Since microphones, for example, have an effect on formant measurements (Foget Hansen and Pharao 2006) which are relevant for /r/ (Foulkes and Docherty 2000: 4), it was decided to avoid potential complications in using instrumental measures and to proceed with auditory analysis. The files were played back using a Dell Inspiron 1525 computer's High Definition Audio device and listened to using Sennheiser HD212 Pro headphones. Each token of /r/ was identified and coded phonetically for word position and phonetic character by the author, using one of a set of phonetic possibilities:

1. Strongly alveolar approximant [ɹ], with no auditory evidence of fronting, and auditorily rounded
2. Labialized alveolar tokens with weak rounding, midway between [ɹ] and [ʋ]
3. Labiodental approximant tokens: [ʋ]
4. Tapped /r/ tokens: [ɾ]
5. Trilled /r/: [r]

Additional minor tokens occurring in the data were as follows. Fricated /r/s were found in initial position and in syllable onset clusters. Linking /r/s were occasionally vocalized, as were some medial /r/s between weak syllables. Single backed /r/s were coded as velarized in initial position or as retroflex in linking /r/ position, where they had a quality similar to an American [ɻ]. Some tokens were discarded because of inaudibility due to recording noise or speaker overlap. Three hours and thirty-three minutes of recordings yielded in all 2,511 usable tokens of /r/, divided between the fourteen speakers as shown in Table 3.2.

Table 3.2 Number of /r/ tokens by speaker

Speaker	N /r/ tokens
OF1	150
OF2	97
OF3	72
OF4	112
OM1	293
OM2	311
OM3	106
OM4	222
YF1	241
YF2	88
YF3	125
YM1	81
YM2	202
YM3	411
Total	2,511

The speaker with the smallest number of tokens was OF3 with 72, while YM3's recording contains over 400 tokens. Male speakers dominate the corpus time-wise (2 hours 19 minutes as opposed to 1 hour 14 minutes for female speakers); recordings with female speakers were shorter on average than those with male speakers.⁷
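The duration figures quoted in this section can be checked against Table 3.1a. The following is a minimal sketch in Python, not part of the original study: the dictionary layout and function names are assumptions made for illustration, and the durations are copied from the table. It suggests that the male, female and 3 h 33 min figures count the shared OF1/OF2 recording once, whereas the table's total of 3:55:33 is reached only if that recording is counted once for each of its two speakers.

```python
# Durations (h:mm:ss) copied from Table 3.1a. OF1 and OF2 share one recording,
# so it is stored once here; OF3 has two short recordings.
RECORDINGS = {
    "OF1/OF2": ["00:22:11"],
    "OF3": ["00:02:46", "00:02:06"],
    "OF4": ["00:13:37"],
    "OM1": ["00:36:41"],
    "OM2": ["00:28:54"],
    "OM3": ["00:09:13"],
    "OM4": ["00:17:21"],
    "YF1": ["00:13:27"],
    "YF2": ["00:07:37"],
    "YF3": ["00:12:20"],
    "YM1": ["00:08:18"],
    "YM2": ["00:12:52"],
    "YM3": ["00:25:59"],
}


def to_seconds(hms):
    """Convert an 'hh:mm:ss' string to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def fmt(total_seconds):
    """Format a number of seconds as 'h:mm:ss'."""
    hours, rest = divmod(total_seconds, 3600)
    minutes, seconds = divmod(rest, 60)
    return f"{hours}:{minutes:02d}:{seconds:02d}"


def total_for(gender_letter):
    """Sum durations for IDs whose second character is 'F' or 'M'."""
    return sum(
        to_seconds(d)
        for speaker, durations in RECORDINGS.items()
        if speaker[1] == gender_letter
        for d in durations
    )


female_seconds = total_for("F")
male_seconds = total_for("M")
unique_seconds = female_seconds + male_seconds
# Counting the shared recording once per speaker reproduces the table total.
double_counted_seconds = unique_seconds + to_seconds("00:22:11")

print(fmt(female_seconds))          # 1:14:04
print(fmt(male_seconds))            # 2:19:18
print(fmt(unique_seconds))          # 3:33:22
print(fmt(double_counted_seconds))  # 3:55:33
```

Storing the shared recording under a single key keeps the per-gender sums from counting it twice; whether the chapter's "just under four hours" reflects the double count is an inference from these sums, not something the text states.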

Table 3.3 shows the distribution of the 2,511 tokens according to phonological position. Almost half of the tokens were located in onset clusters (a category which included both word-initial and word-medial clusters). Smaller numbers were available for the other environments, but the data was extensive enough to allow statistical treatment.

Table 3.3 Tokens of /r/ by phonological position

Position	Total
Initial	498
Linking /r/	284
Medial	624
Onset cluster	1,105
Total	2,511
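
The shares behind the description of Table 3.3 can be made explicit with a short calculation. This is a minimal sketch using only the counts in the table; the variable names are hypothetical.

```python
# Token counts by phonological position, copied from Table 3.3.
position_counts = {
    "Initial": 498,
    "Linking /r/": 284,
    "Medial": 624,
    "Onset cluster": 1105,
}

total = sum(position_counts.values())
shares = {pos: n / total for pos, n in position_counts.items()}

# Print positions from largest to smallest share of the corpus.
for pos, share in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{pos:<14}{share:6.1%}")
```

Onset clusters account for about 44 per cent of the tokens on this count, with linking /r/ the smallest category.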

Two interesting phonetic features within the data emerged during analysis. It was decided to test statistically for the distributions of tapped and trilled /r/, and for labial and labialized /r/, and to determine which independent factors seem to be promoting their usage. Initial cross-tabulations showed that most speakers included tapped tokens of /r/, but they were absent from the recording of Lord Halifax (recorded 1939). Labialized tokens were rarer, but, as Figure 3.4 demonstrates, these were particularly prominent in one individual: again, Lord Halifax, OM4. For that reason, the statistical results on tapped and trilled /r/ below are based on thirteen speakers (omitting OM4), while the statistics on labialized tokens include all fourteen speakers.

Figure 3.3 Trends in rates of tapped and trilled /r/ by word position according to decade of recording.

It was determined that a suitable modelling of the data could be obtained using mixed-effects logistic regression with speaker as a random factor (Johnson 2009). For the purposes of statistical analysis using multiple logistic regression as made available in Rbrul,⁸ the data were recoded such that examples of categories 2 and 3 in the list above (labialized alveolar and labiodental tokens) were categorized as having labial character to some degree, while all other tokens were classed as ‘non-labial’. Similarly, in a second run through the data, categories 4 and 5 (taps and trills) were recategorized together as ‘taps’ (since there were few tokens of trills), while all other /r/ tokens were classified as ‘non-taps’. In that way, the tokens were reduced to a binary contrast (tap/non-tap, labial/non-labial) which could be tested statistically.
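
The recoding described above amounts to two binary mappings over the five main phonetic categories. A minimal Python sketch follows, assuming the categories are numbered 1 to 5 in the order in which they are listed earlier in this section. The function names and the sample tokens are hypothetical, and the source does not state how the recoding was actually implemented.

```python
# Category numbers follow the order of the coding list (an assumption):
# 1 alveolar approximant, 2 labialized alveolar, 3 labiodental, 4 tap, 5 trill.
LABIAL_CATEGORIES = {2, 3}
TAP_CATEGORIES = {4, 5}


def recode_labial(category: int) -> str:
    """Collapse a phonetic category into the labial/non-labial contrast."""
    return "labial" if category in LABIAL_CATEGORIES else "non-labial"


def recode_tap(category: int) -> str:
    """Collapse a phonetic category into the tap/non-tap contrast."""
    return "tap" if category in TAP_CATEGORIES else "non-tap"


# Hypothetical coded tokens, for illustration only.
sample = [1, 2, 4, 3, 5, 1]
print([recode_labial(c) for c in sample])
print([recode_tap(c) for c in sample])
```

Any code outside the two named sets, including whatever codes were used for the minor token types, falls into the ‘non-labial’ or ‘non-tap’ class, which matches the ‘all other tokens’ wording above.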

The data was coded with a set of internal linguistic and external social predictors which were tested in the statistical model: gender, date of birth, date of the recording, and type of speech (whether monologue or interview). Date of birth was initially converted into a binary ‘century of birth’ factor, whether nineteenth (1800s) or twentieth century (1900s), and then also tested as a continuous factor. ‘Date of the recording’ was later recoded into a factor called ‘decade of recording’ and tested within the model. Position in the word was the only internal linguistic factor tested. ‘Speaker’ was included in the model as a random factor, a practice that brings sociolinguistic modelling better in line with other social science disciplines in treating individuals as potentially divergent from their social group (Johnson 2009: 365).
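
One common way of writing a model of this kind is given below. It is a generic sketch of a logistic regression with a random intercept for speaker, not a specification taken from the source; the symbols $p_{ij}$, $\beta_k$, $x_{kij}$, $u_j$ and $\sigma_u^2$ are introduced here purely for illustration.

```latex
\[
\log\frac{p_{ij}}{1 - p_{ij}} = \beta_0 + \sum_{k} \beta_k x_{kij} + u_j,
\qquad u_j \sim \mathcal{N}(0, \sigma_u^2)
\]
```

Here $p_{ij}$ is the probability that token $i$ from speaker $j$ carries the application value (tap or labial), the $x_{kij}$ are the coded predictors such as word position or decade of recording, and $u_j$ is the speaker-specific adjustment that allows individuals to diverge from their group.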

3.3 Results

We turn first to examine tapped and trilled /r/ in the data. As Figure 3.3 shows, cross-tabulations of the data revealed a trend of decreasing tapped /r/ usage across the decades under consideration. This decrease applies across medial (intervocalic) contexts and linking /r/ contexts most obviously, while taps and trills are almost absent from the other two environments. Taps and trills in initial position, however, also decrease from a very marginal rate in the 1950s. This trend, which is independent of the speakers’ dates of birth, suggests a changing ‘style of the time’ where taps and trills become more and more rare in the BBC recordings. As we shall see below, this factor does prove significant in modelling taps and trills in the post-war recordings.

These trends suggested that it could be revealing to model the data according to word position and decade of recording, as well as year of birth as a continuous factor and speaker as a random factor. This model turned out to be highly significant, while factors such as gender and speech context (interview versus monologue) were not.

Table 3.4 shows firstly the results for thirteen speakers and 2,289 tokens (omitting OM4, as noted above) for tapped and trilled /r/, recoded and categorized together as ‘taps’. The three independent factors, in order from most to least significant effect, were position in the word (p = 6.26e-132), decade of recording (p = 1.57e-07) and year of birth, examined as a continuous variable (p = 0.000325). Speaker as a random effect was also part of the model and contributes to a strengthening of the results. In the context of this prestigious variety, as it was at the time, tapped /r/ seems to have had the status of majority variant in certain word positions: the model here shows that medial and linking /r/ positions greatly favour the variant. This is, for that matter, not surprising in articulatory terms, since these are environments that always contain /r/ in intervocalic position. Initial /r/ may also immediately follow a vowel, but as preceding environment was not coded on this run through the data, the significance of tapped /r/ within the word-initial category cannot be fully tested. The results for the factor ‘decade of recording’ show that, while in the 1950s data the factor weight favours taps and trills at 0.635, we already see a slight disfavouring of the feature in the 1960s, and a further decrease in the 1970s. The 1950s and 1960s are therefore particularly interesting decades to explore through further data. Year of birth as a continuous variable also shows that taps and trills decrease systematically through later generations, so that the purely speaker-diachronic trend follows the ‘decade of recording’ trend, and both are independently significant.

Table 3.4 Mixed-effects logistic regression modelling for tapped and trilled /r/ (omitting speaker OM4), N = 2,289

Deviance				1283.101
Df				8
Grand mean				0.15
Factors	Log Odds	Tokens (N)	Proportion of application value	Centred factor weight
DECADE of recording
1950s	0.556	1,141	0.187	0.635
1960s	−0.033	823	0.119	0.492
1970s	−0.523	325	0.098	0.372
YEAR OF BIRTH (continuous)
	0.025
POSITION
Medial	2.122	578	0.439	0.893
Linking /r/	1.234	260	0.250	0.775
Initial	−1.163	451	0.029	0.238
Onset cluster	−2.193	1,000	0.011	0.1
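
The log odds and centred factor weights in Table 3.4 are two views of the same estimates: in Rbrul-style output, a factor weight is the inverse logit of the corresponding log odds. The Python sketch below illustrates this for the ‘decade of recording’ rows. The function name is hypothetical, and small differences in the third decimal place can arise because the published log odds are themselves rounded.

```python
import math


def factor_weight(log_odds: float) -> float:
    """Convert a log-odds coefficient to a factor weight (inverse logit)."""
    return 1.0 / (1.0 + math.exp(-log_odds))


# Log odds for 'decade of recording', copied from Table 3.4.
decade_log_odds = {"1950s": 0.556, "1960s": -0.033, "1970s": -0.523}

for decade, lo in decade_log_odds.items():
    print(decade, round(factor_weight(lo), 3))
```

A log odds of zero corresponds to a weight of 0.5, so weights above 0.5 indicate that a factor level favours the variant and weights below 0.5 that it disfavours it.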

We turn now to consider the status of labialized /r/. Table 3.5 below shows the results for labiodental and labialized /r/, recoded and categorized together as ‘labials’. Three independent factors proved significant in the modelling of variation in /r/. These were, in order from most to least significant, position within the word (p = 5.96e-13), speech type (p = 0.00294) and decade of recording (p = 0.00805).

Table 3.5 Mixed-effects logistic regression modelling for labiodental and labialized /r/, N = 2,511

Deviance				1118.775
Df				9
Grand mean				0.101
Factors	Log Odds	Tokens (N)	Proportion of application value	Centred factor weight
SPEECH TYPE
Interview	1.012	1,257	0.093	0.733
Monologue	−1.012	1,254	0.108	0.267
DECADE OF RECORDING
1930s	3.449	222	0.527	0.969
1950s	−0.745	1,141	0.035	0.322
1970s	−0.900	325	0.154	0.289
1960s	−1.803	823	0.056	0.141
POSITION
Initial	1.062	498	0.189	0.743
Medial	0.266	624	0.093	0.566
Onset cluster	−0.076	1,105	0.083	0.481
Linking /r/	−1.252	284	0.032	0.222
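
As a consistency check on Table 3.5, the grand mean can be approximately recovered from the token counts and proportions within any one factor group. The following Python sketch does this for the ‘decade of recording’ rows. The variable names are hypothetical, and because the published proportions are rounded to three decimals, the result should only be expected to match the grand mean approximately.

```python
# (tokens, proportion of labial variants) per decade, copied from Table 3.5.
decade_rows = {
    "1930s": (222, 0.527),
    "1950s": (1141, 0.035),
    "1960s": (823, 0.056),
    "1970s": (325, 0.154),
}

total_tokens = sum(n for n, _ in decade_rows.values())
# Estimated number of labial tokens per decade (tokens x proportion).
labial_tokens = {d: n * p for d, (n, p) in decade_rows.items()}
grand_mean = sum(labial_tokens.values()) / total_tokens

print(total_tokens, round(grand_mean, 3))
print({d: round(x) for d, x in labial_tokens.items()})
```

On these estimates the single 1930s recording contributes nearly half of all labial tokens in the corpus, which is worth bearing in mind when reading the decade effect.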

Note that gender was not proven significant in this model either; labial variation seems to have a different social status to tapped/trilled /r/ given the profile in Figure 3.4, but we cannot find detailed evidence for what that status consisted of here in terms of a possible gender dynamic. Labials (labiodentals and labialized alveolars) are predominantly produced by three individuals only. Figure 3.4 illustrates this. OF2 (Baroness Stokes), OM4 (Lord Halifax) and YF3 (Daphne du Maurier) are the only three speakers whose production of labials is above the average for all speakers in the corpus at just on 10%. Lord Halifax (socially speaking from an aristocratic, upper-class background) is by far the most prolific user of labials for /r/ at 52.7%. While he is the only speaker of this type in the present corpus, the BBC archive potentially holds other examples of comparable recordings which could provide a firmer basis for conclusions in the future.

Figure 3.4 Proportion of labial and non-labial variants in the corpus by individual speaker.

To turn to the factors which did prove significant in the logistic model, word position, decade of recording and speech type, we can consider each in turn. Labials appear most strongly favoured in initial position, and medials, with a factor weight of 0.566, are also favoured, although more weakly. Other word positions do not emerge as favouring labial production. As for decade of recording, labials are strongly favoured only in the 1930s, a decade represented by the single recording of OM4 referred to above. Other decades do not favour labial production, but as Table 3.5 shows, this is a result which is strongly affected by the dominance of a single speaker in this limited corpus. The result for speech type shows a strong factor weight favouring ‘Interview’ as speech context, which may seem anomalous given that OM4's recording is a monologue, but Figure 3.4 shows that a large number of labial tokens also occur in the interviews recorded with YF3 (Daphne du Maurier) and OF2 (Baroness Stokes). Although we cannot tell the final conclusive ‘story of labial /r/’ here, we do have tentative indications that labial variants are an idiosyncratic and individual feature in this corpus rather than a general sociolinguistic feature of the group, as tapped /r/s seem to be.

3.4 Conclusions

As this volume demonstrates, it is not only ongoing, present-day sociolinguistic variation and change that can now be studied empirically and quantitatively. The audible past is accessible and, with sensitive sociolinguistic treatment, can yield many insights into the detailed trajectories of variation at the level of single variables over time. How the different variants, such as different phonetic qualities for /r/, cluster by co-occurrence into varieties can then become a matter of empirical concern, and thereby inform other branches of the historical sociolinguistic and linguistic enterprise. Linguistic variants’ ‘routes to obsolescence’ are actually a relatively under-researched area in sociolinguistics, since the field tends to focus on ‘new and upcoming’ linguistic features.

Systematic observations of historical recordings can ultimately also provide points of quantitative comparison with younger, more contemporary data sets. In that way, we can gain a greater time depth for studies, providing substantial empirical evidence of the changing features of RP during the twentieth century, including, as here, the quantitative profiles of various qualities of /r/. The present study has demonstrated that labial /r/ was a more peripheral and idiosyncratic feature in these recordings from the decades around and after World War II, while tapped /r/ was more solidly socially based, and over time became an archaic feature, associated with older speakers and styles of an older time. By the time of the 1970s recordings, tapped /r/ was peripheral and largely dispreferred. Many other such variable features await diachronic comparisons of this kind which will enable us to track the progression to obsolescence of older features more precisely.

The exemplary analysis here also serves to demonstrate the potential this type of public media-derived data can have for large-scale systematic research. The technologies that are needed, in terms of large digital memory capacities and accessible open source and freeware analytical tools, are more and more widely available. We can look forward to ‘listening to the past’ being a more common pursuit in the future.
