6 - Future prospects in corpus linguistics
Published online by Cambridge University Press: 03 December 2009
Summary
In describing the complexity of creating a corpus, Leech (1998: xvii) remarks that “a great deal of spadework has to be done before the research results [of a corpus analysis] can be harvested.” Creating a corpus, he comments, “always takes twice as much time, and sometimes ten times as much effort” because of all the work that is involved in designing a corpus, collecting texts, and annotating them. And then, after a given period of time, Leech (1998: xviii) continues, the corpus becomes “out of date,” requiring the corpus creator “to discard the concept of a static corpus of a given length, and to continue to collect and store corpus data indefinitely into the future …” The process of analyzing a corpus may be easier than the description Leech (1998) gives above of creating a corpus, but still, many analyses have to be done manually, simply because we do not have the technology that can extract complex linguistic structures from corpora, no matter how extensively they are annotated. The challenge in corpus linguistics, then, is to make it easier both to create and analyze a corpus. What is the likelihood that this will happen?
Planning a corpus. As more and more corpora have been created, we have gained considerable knowledge of how to construct a corpus that is balanced and representative and that will yield reliable grammatical information.
- Type: Chapter
- Book: English Corpus Linguistics: An Introduction, pp. 138-141
- Publisher: Cambridge University Press
- Print publication year: 2002