In this paper I discuss a corpus-based approach to the identification of metaphorical expressions in English and consider the insights which can be gained through such an approach. The approach described here has its roots in lexicography, and is influenced by models of language and thought developed within cognitive linguistics by writers such as Lakoff and Johnson (1980) and Gibbs (1994). I begin by arguing that the study of large corpora can give information about the frequency and use of linguistic metaphors which is otherwise difficult to access. I then discuss decisions which must be made in order to develop a sound methodological framework for research into metaphor. I give examples of some aspects of linguistic metaphors which can be investigated using corpora, and finally list some limitations of a corpus-based approach.
A computerised corpus is a large collection of texts held in electronic form, which can be accessed using various types of software package; for the purposes of language description, the software used is often a concordancing program. A concordancing program enables the researcher to study a word form (or forms) by looking at large numbers of citations of that word form, each shown in its linguistic context. (An extract of the concordance of heated, below, shows how this data is presented.) Citations can be sorted in various ways, enabling the researcher to examine different patterns of structure and collocation.
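The core behaviour of a concordancing program, as described above, can be illustrated with a minimal sketch in Python. This is not the software used in the research reported here; the function names (concordance, sort_by_right_context, format_citation), the context width of 30 characters, and the three sample sentences are all assumptions invented for illustration, and the sentences are not drawn from any corpus. The sketch finds each occurrence of a word form, keeps a fixed span of context on either side, and sorts the citations by the first word to the right of the node word, which is one of several possible sort orders.

```python
import re

# Invented sample sentences, for illustration only (not corpus data).
SAMPLE_TEXTS = [
    "The committee held a heated debate on the budget.",
    "She heated the soup on the stove.",
    "There was a heated argument in the corridor.",
]


def concordance(texts, node, width=30):
    """Collect (left context, node, right context) citations for a word form."""
    # Match the node as a whole word, ignoring case.
    pattern = re.compile(r"\b" + re.escape(node) + r"\b", re.IGNORECASE)
    citations = []
    for text in texts:
        for match in pattern.finditer(text):
            left = text[max(0, match.start() - width):match.start()]
            right = text[match.end():match.end() + width]
            citations.append((left, match.group(0), right))
    return citations


def sort_by_right_context(citations):
    """Sort citations alphabetically by the first word after the node."""
    def first_right_word(citation):
        words = citation[2].split()
        return words[0].lower() if words else ""
    return sorted(citations, key=first_right_word)


def format_citation(citation, width=30):
    """Align the node word in a central column, as in a printed concordance."""
    left, node, right = citation
    return f"{left:>{width}} {node} {right:<{width}}"


for citation in sort_by_right_context(concordance(SAMPLE_TEXTS, "heated")):
    print(format_citation(citation))
```

Sorting on the right-hand context groups citations in which the node word is followed by the same item, which is what makes recurrent collocates visible; sorting on the left-hand context instead would require only a different key function.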