Chapter 13: Creating and using corpora

pp. 257-287

Authors

, University of California, , University of Alberta

Summary

Introduction

Over the last few decades, corpus-linguistic methods have established themselves as among the most powerful and versatile tools to study language acquisition, processing, variation, and change. This development has been driven in particular by the following considerations:

  a. technological progress (e.g., processor speeds as well as hard drive and RAM sizes);

  b. methodological progress (e.g., the development of software tools, programming languages, and statistical methods);

  c. a growing desire by many linguists for (more) objective, quantifiable, and replicable findings as an alternative to, or at least as an addition to, intuitive acceptability judgments (see Chapter 3);

  d. theoretical developments such as the growing interest in cognitively and psycholinguistically motivated approaches to language in which frequency of (co-)occurrence plays an important role for language acquisition, processing, use, and change.

In this chapter, we will discuss a necessarily small selection of issues regarding (i) the creation, or compilation, of new corpora and (ii) the use of corpora once they have been compiled. Although this chapter encompasses both the creation and use of corpora, there is no expectation that any individual researcher would be engaged in both these kinds of activities. Different skills are called for when it comes to creating and using corpora, a point noted by Sinclair (2005: 1), who draws attention to the potential pitfalls of a corpus analyst building a corpus, specifically, the danger that the corpus will be constructed in a way that can only serve to confirm the analyst’s pre-existing expectations. Some of the issues addressed in this chapter are also dealt with in Wynne (2005), McEnery, Xiao, and Tono (2006), and McEnery and Hardie (2012) in a fairly succinct way, and more thoroughly in Lüdeling and Kytö (2008a, 2008b) and Beal, Corrigan, and Moisl (2007a, 2007b).
