Cambridge English Corpus

The Cambridge English Corpus is a multi-billion word collection of written and spoken English.

Our Corpus helps us to understand more about the English language, and how people use it when they speak and when they write.

Our learning materials are developed using our Corpus, which makes them more authentic and useful by illustrating language as it is really used.

World-leading spoken language research

Cambridge University Press and Lancaster University are running a research project to create a world-leading, freely available resource for linguistic research: the British National Corpus 2014.

About the project

We are compiling a very large collection of recordings of real-life, informal, spoken interactions between speakers of British English from across the United Kingdom.

These recordings will be transcribed and made freely available for a wide range of research purposes.

Here at Cambridge, we’ll use the collection to further improve our learning materials.

Why collect spoken language?

The last project of this scale and type was completed in the UK in the early 1990s – before Twitter, selfies, smartphones, and Facebook! 

We think it is important to collect new recordings from the 2010s, so that we can understand British English speech as it is today and not as it was over two decades ago.

To find out more about the Corpus, go to the link in the Extras menu.
