Cambridge International Corpus
The Cambridge International Corpus (CIC) is a 900-million-word database of texts developed by Cambridge University Press to help analyse how language is really used. It draws on a variety of sources, including newspapers, literary texts, websites and recordings of everyday conversations.
It also includes the CANCODE corpus, a unique collection of five million words of naturally occurring spoken English. CANCODE can be searched to find examples of how English is spoken today and to check facts about what people really say when they talk to each other.
In researching and writing the Cambridge Grammar of English, the authors made full use of the CIC, in the belief that a modern grammar should be informed by evidence from an extensive corpus.
In the spoken language, 60% of all examples of the verb "know" occur in the phrase "you know".
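A figure of this kind is a simple proportion: the number of times a word appears inside a given phrase, divided by the number of times the word appears at all. The sketch below illustrates the arithmetic only. The function name `phrase_share` and the sample transcript are invented for illustration and are not taken from the CIC or CANCODE, and the code matches the exact form "know" rather than every form of the verb (such as "knows" or "knew"), which a real corpus query would normally handle.

```python
import re


def phrase_share(text, word, preceding):
    """Return the fraction of occurrences of `word` that directly follow `preceding`."""
    # Lowercase the text and split it into simple word tokens.
    tokens = re.findall(r"[a-z']+", text.lower())
    # Count every occurrence of the target word.
    total = sum(1 for t in tokens if t == word)
    # Count occurrences where the previous token is the expected one.
    in_phrase = sum(
        1
        for prev, cur in zip(tokens, tokens[1:])
        if prev == preceding and cur == word
    )
    return in_phrase / total if total else 0.0


# Invented toy transcript (hypothetical data, not from any real corpus).
sample = "You know, I know the answer. Well, you know, it is hard to know."
share = phrase_share(sample, "know", "you")
print(f"{share:.0%}")  # prints "50%" for this toy sample
```

In the toy sample, "know" appears four times and two of those follow "you", so the share is 50%. The same calculation over a large spoken corpus is what yields a figure such as the 60% quoted above.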
Find out more in the Cambridge Grammar of English.
