Book contents
- Frontmatter
- Contents
- List of figures
- List of maps
- List of tables
- Preface and acknowledgments
- 1 Introduction
- 2 Data and methods
- 3 The feature catalogue
- 4 Surveying the forest: on aggregate morphosyntactic distances and similarities
- 5 Is morphosyntactic variability gradient? Exploring dialect continua
- 6 Classification: the dialect area scenario
- 7 Back to the features
- 8 Summary and discussion
- 9 Outlook and concluding remarks
- Appendices
- References
- Index
2 - Data and methods
Published online by Cambridge University Press: 05 December 2012
Summary
This chapter provides an overview of the data sources tapped and the methods used in the book. Section 2.1 introduces the Freiburg Corpus of English Dialects, as well as two smallish reference corpora sampling Standard British and American English for benchmarking purposes. Section 2.2 sketches the technicalities behind the study's empirical approach in generic terms (more detailed introductions to particular techniques are reserved for the chapters in which those techniques are actually used).
Data
This study draws on three naturalistic text corpora: first and foremost, the Freiburg Corpus of English Dialects (Section 2.1.1); second, two compact reference corpora sampled from the British component of the International Corpus of English and the Corpus of Spoken American English for benchmarking purposes (Section 2.1.2).
The Freiburg Corpus of English Dialects
The Freiburg Corpus of English Dialects (henceforth: FRED) (see Kortmann and Wagner 2005; Hernández 2006; Anderwald and Wagner 2007; Szmrecsanyi and Hernández 2007) is a major dialect corpus that samples traditional dialect speech all over Great Britain. The version used here (we removed some localities with comparatively thin textual coverage from the full corpus) contains 368 individual texts (that is, interviews) and spans 2,437,000 words of running text, interviewer utterances excluded.
FRED makes second-hand use of so-called “oral history” interviews, sometimes with more than one informant at once (note that some informants also star in more than one interview, and that some interviews have more than one interviewer).
- Type: Chapter
- Information: Grammatical Variation in British English Dialects: A Study in Corpus-Based Dialectometry, pp. 15-31
- Publisher: Cambridge University Press
- Print publication year: 2012