Cambridge International Corpus

The Cambridge International Corpus (CIC) is a 900-million-word database of texts developed by Cambridge University Press to help analyse how language is really used. It draws on a variety of sources, including newspapers, literary texts, websites and recordings of everyday conversations.

It also includes the CANCODE corpus, a unique collection of five million words of naturally occurring spoken English. CANCODE can be searched to find examples of how English is spoken today and to check facts about what people really say when they talk to each other.

In researching and writing the Cambridge Grammar of English, the authors made full use of the CIC, in the belief that a modern grammar should be informed by evidence from an extensive corpus.

In the spoken language, 60% of all examples of the verb "know" occur in the phrase "you know".

Find out more in the Cambridge Grammar of English.