Cambridge International Corpus

The Cambridge International Corpus (CIC) is a 900-million-word database of texts developed by Cambridge University Press to help analyse how language is really used. It draws on a variety of sources, including newspapers, literary texts, websites and recordings of everyday conversations.

It also includes the CANCODE corpus, a unique collection of five million words of naturally-occurring spoken English. CANCODE can be searched to find examples of how English is spoken today and to check facts about what people really say when they talk to each other.

In researching and writing the Cambridge Grammar of English, the authors made full use of the CIC, in the belief that a modern grammar should be informed by evidence from an extensive corpus.

English speakers have no difficulty in understanding sentences such as:

His cousin in London, her boyfriend, his parents bought him a Ford Escort for his birthday

even though they are not found in writing.

Find out more in the Cambridge Grammar of English.