Cambridge English Corpus

The Cambridge English Corpus is a multi-billion word collection of written and spoken English.

Our Corpus helps us to understand more about the English language, and how people use it when they speak and when they write.

Our learning materials are developed using our Corpus, making them more authentic and useful - illustrating language as it is really used.

World-leading spoken language research

Cambridge University Press and Lancaster University are running a research project to create a world-leading, freely available resource for linguistic research: the British National Corpus 2014.

About the project

We are compiling a very large collection of recordings of real-life, informal, spoken interactions between speakers of British English from across the United Kingdom.

These recordings will be transcribed and made freely available for a wide range of research purposes.

Here at Cambridge, we’ll use the collection to further improve our learning materials.

Why collect spoken language?

The last project of this scale and type was completed in the UK in the early 1990s – before Twitter, selfies, smartphones, and Facebook! 

We think it is of great importance to collect new recordings, from the 2010s, in order to understand the nature of British English speech as it is today and not how it was over two decades ago.

What do we need?

In order for this project to succeed, we need your help!

We’re looking for participants to help us to collect:

  • Audio recordings of face-to-face conversations between people whose first language is British English.
  • Natural and authentic conversational speech (rather than, e.g., monologues or speeches).

How can you help?

We'd like people from all over the UK to use whatever equipment they have to record their interactions and send them to us in exchange for a small payment.

For each hour of good-quality digital recordings we receive, along with all associated consent forms and information sheets completed correctly, we will pay £18.

To register your interest in this project, and to find out more, please contact:

About the Cambridge Corpus

The Cambridge English Corpus is a multi-billion word collection of written and spoken English. It includes the Cambridge Learner Corpus, a unique bank of exam candidate papers. Our authors study the Corpus to see how English is really used, and to identify typical learner mistakes. This means that Cambridge materials help students to avoid mistakes, and you can be confident the language taught is useful, natural and fully up to date.

Cambridge learner's dictionaries, grammar and vocabulary learning materials, and examination, business and general English course books have all benefited from the information in the Cambridge English Corpus. We no longer have to rely on intuition to know what people say or write; instead, we can see what hundreds of different speakers or writers have actually said or written. So, materials developed with our Corpus are more authentic and can illustrate language as it is really used.

To find out more about the Corpus, go to the link in the Extras menu.
