Published online by Cambridge University Press: 05 June 2014
Introduction
The work we report in this chapter began with the aim of finding techniques to minimize the problems that arise from small data samples in fields such as historical sociolinguistics. However, the solutions we propose are not limited to historical sociolinguistics, but are applicable to quantitative sociolinguistic and corpus studies in general. Establishing the frequency of given linguistic forms is a crucial issue in studying differences in linguistic usage between populations or points in time. In its simplest form, the question can be posed as follows: suppose there are two alternative forms, A and B, of a linguistic variable – alternative pronunciations, words or phrases with the same meaning, functionally equivalent grammatical structures – what is the frequency of use of each? The first problem we address is the use of aggregate data and its relation to individual variation when individuals contribute different amounts of data to the aggregate.
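To make the aggregate-versus-individual issue concrete, the following minimal sketch contrasts two ways of estimating the frequency of form A. It is an illustration only, not the method developed in the chapter: the writer labels, the token counts and the function names `pooled_frequency` and `mean_individual_frequency` are all hypothetical.

```python
from typing import Dict, Tuple

# Hypothetical token counts per writer: (tokens of form A, tokens of form B).
# The numbers are invented for illustration and come from no corpus.
Counts = Dict[str, Tuple[int, int]]


def pooled_frequency(counts: Counts) -> float:
    """Share of form A when all tokens are pooled into one aggregate."""
    total_a = sum(a for a, _ in counts.values())
    total = sum(a + b for a, b in counts.values())
    return total_a / total


def mean_individual_frequency(counts: Counts) -> float:
    """Unweighted mean of each writer's own share of form A."""
    shares = [a / (a + b) for a, b in counts.values() if a + b > 0]
    return sum(shares) / len(shares)


counts: Counts = {
    "writer_1": (90, 10),  # prolific writer who strongly favours A
    "writer_2": (1, 4),    # sparse data, mostly B
    "writer_3": (2, 3),    # sparse data, mostly B
}

print(f"pooled: {pooled_frequency(counts):.2f}")                    # pooled: 0.85
print(f"mean of individuals: {mean_individual_frequency(counts):.2f}")  # mean of individuals: 0.50
```

With these invented counts, the pooled figure is dominated by the one writer who contributes most of the tokens, while the mean of individual shares weights every writer equally. Which of the two is the more appropriate estimate depends on the research question.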
The other problem we discuss is similarly a fundamental one: what is the minimum sample size – number of speakers, writers or texts, depending on the research topic – that is required to yield consistent results for a given linguistic variable? For a historical sociolinguist using a public corpus, this may be a question of a scarcity of data due to a high rate of illiteracy in a particular period. For sociolinguists who have to elicit their interview data, it is an issue of research economy. In Tagliamonte’s words (2006: 33): ‘The size of the sample must necessarily be balanced with the available time and resources for data handling.’ Looking back at 40 years of sociolinguistic research, Labov (2006 [1st edn. 1966]: 400–401) notes that the analysis of the stratification by age, gender and social class of a given city has usually required 60–100 speakers. Without introducing any testing of sample size, he considers the 120 speakers used in a Montreal study to be ideal, although he emphasizes the care with which the sampling was designed.