In the two previous editions of this book, both Reid (1984) and Gardner-Chloros (2007) deplored the fact that, at the time of writing, there were no comprehensive sources of data on the languages that were spoken in Britain and Ireland by people who were not considered indigenous to the region. Until 2001, the questionnaires used to collect census data in Wales, Scotland, Northern Ireland and the Republic of Ireland included questions about respondents’ abilities only in Welsh, Scottish Gaelic and Irish respectively. The questionnaire used in England included no language questions. No questions were included in any of the questionnaires about the languages of European immigrants (or any other immigrants, for that matter) or about the other ‘indigenous’ languages of Britain and Ireland, such as Angloromani or the British and Irish Sign languages; see Figure 19.1. Anyone interested in multilingualism had to collate information from various sources of different types and scopes, the most reliable (although not infallible) of which were surveys of the languages spoken by school pupils in the UK drawing their data from Local Education Authorities (Baker and Eversley 2000) or the Annual School Census (von Ahn et al. 2010). The inclusion in future censuses of ‘a question on language skills’ was therefore put forward by Gardner-Chloros as an ‘extremely useful’ way (2007:329) to address this ‘conspicuous’ omission (Aspinall 2005).


Figure 19.1 Language questions in the questionnaires used in the 2001 and 2011 censuses in the UK nations and the Republic of Ireland.
An important step in this direction was taken for the first time in 2011, when questions about all the languages spoken by the populations of Britain and Ireland were added to the census questionnaires in the two countries. This was generally received as a welcome and groundbreaking development, which officially acknowledged that multilingualism in Britain and Ireland extended beyond the use of the Celtic minority languages and had the potential to offer an extensive and consistent account of linguistic diversity. In the four datasets that were published in the years after the census, the five jurisdictions reported different numbers of languages in addition to Britain and Ireland’s ‘indigenous’ languages (Angloromani, British Sign Language, Cornish, English, Irish, Irish Sign Language, Manx Gaelic, Scots, Scottish Gaelic, Traveller languages, Welsh):
• 66 languages were reported in the data for England and Wales;
• 164 languages were reported in Scotland;
• 79 languages were reported in Northern Ireland; and
• 44 languages were reported in the Republic of Ireland.
All datasets provided the number of responses each language had received in the census. In addition to the numbers above and the ‘indigenous’ languages, the datasets also included categories that aggregated unspecified numbers of unspecified languages under broad headings such as ‘South Asian language (all other)’ in the England and Wales data or ‘Other stated languages (incl. not stated)’ in the Republic of Ireland, and gave the total number of responses for these, as well.
While the practice of aggregating responses into groups may have been justified for statistical disclosure control reasons, it did obscure linguistic diversity to a significant extent. Upon closer examination of the full classification for the language variable in the information documentation for England and Wales (Office for National Statistics (ONS) 2013a, b), we find that the ‘South Asian language (all other)’ category encompassed sixty-one languages, which had received 27,359 responses altogether. Perhaps more surprisingly, the count of the number of languages other than the UK’s ‘indigenous’ ones that were labelled and coded by the ONS for England and Wales revealed a total of 588 languages, a much higher number than the 66 languages that were reported in the published dataset. In contrast to the latter, the full classification categorised responses to the language question in great detail. For example, ‘Albanian (Gheg/Kosovan)’, ‘Albanian (Tosk)’, ‘Albanian (not otherwise specified)’, ‘Arberesh’ and ‘Arvanitika’ all appear as separate labels each having its own code. In the published dataset, the first three codes are reported as a single ‘Other European Language (non-EU): Albanian’ category, presumably because the corresponding varieties are considered by the ONS to be spoken by Albanian and Kosovan nationals, while the last two codes are included in the ‘Any other European language (EU)’ category alongside a number of other coded languages (see Table 19.4), as these varieties are seen as spoken by Italian and Greek nationals, respectively. Arberesh and Arvanitika are Albanian linguistic varieties spoken in the south of Italy, including in Sicily, and in parts of Greece, especially in the south of the country such as in the regions of Attica and Boeotia.
Unpublished data obtained from the ONS in 2020 offer an insight into the three-level processing that language data in the census went through. The responses that citizens wrote in the relevant box of the language question (raw data) were grouped into a long list of language labels (primary data), which were in turn grouped further into the language categories that were ultimately reported in the published dataset (structural data). See Figure 19.2 for the processing of the Albanian data. The exact criteria and methods the ONS used to group write-in responses (Level 1 in Figure 19.2) into language labels (Level 2) and language labels into table categories (Level 3) are not known. It would seem, however, that a combination of information about respondents’ language, ethnicity and nationality was used. Numbers of responses seem to have been less central in the process. For example, Slovenian received 1,235 responses and was included as a table category, presumably because it is one of the official languages of the European Union. Other table categories, like ‘South Asian language (all other)’ mentioned above, were much larger in size.

Figure 19.2 Example of relation between write-in responses, write-in response groupings (labels) and published language data (table categories) in the 2011 census in England and Wales.
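The three-level grouping described above can be sketched as two successive look-ups. The following is a minimal illustration, not a reconstruction of the ONS's actual procedure, whose criteria are not known. The language labels and table categories are those named in the text for the Albanian data; the write-in spellings and the function names `classify` and `tabulate` are hypothetical, invented purely for the example.

```python
from collections import Counter

# Level 2 -> Level 3: language labels and table categories as named in the text.
LABEL_TO_CATEGORY = {
    "Albanian (Gheg/Kosovan)": "Other European Language (non-EU): Albanian",
    "Albanian (Tosk)": "Other European Language (non-EU): Albanian",
    "Albanian (not otherwise specified)": "Other European Language (non-EU): Albanian",
    "Arberesh": "Any other European language (EU)",
    "Arvanitika": "Any other European language (EU)",
}

# Level 1 -> Level 2: HYPOTHETICAL write-in spellings (assumption, not ONS data).
WRITE_IN_TO_LABEL = {
    "albanian": "Albanian (not otherwise specified)",
    "gheg": "Albanian (Gheg/Kosovan)",
    "kosovan": "Albanian (Gheg/Kosovan)",
    "tosk": "Albanian (Tosk)",
    "arberesh": "Arberesh",
    "arvanitika": "Arvanitika",
}


def classify(write_in):
    """Return (label, table category) for a write-in, or None if it is unmapped."""
    label = WRITE_IN_TO_LABEL.get(write_in.strip().lower())
    if label is None:
        return None
    return label, LABEL_TO_CATEGORY[label]


def tabulate(write_ins):
    """Count responses per published table category, skipping unmapped write-ins."""
    counts = Counter()
    for write_in in write_ins:
        result = classify(write_in)
        if result is not None:
            counts[result[1]] += 1
    return counts
```

Because the second look-up is many-to-one, distinctions that are still visible at the level of labels (for example, Gheg versus Tosk) are no longer recoverable from the table categories, which is the loss of detail discussed above.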
In Table 19.1, the top twenty most frequently reported languages across the four datasets are given, excluding the ‘indigenous’ languages and aggregate categories. Polish tops all four lists, with a combined total of 618,091 responses in the UK. A clear reflection of the growth of the Polish community after Poland joined the EU in 2004, this finding was widely reported in the media, which branded Polish as the second (most spoken) language in the UK. In terms of other European languages in the four top 20s:
• French, German, Lithuanian, Portuguese, Russian and Spanish are listed in all four top 20s.
• Italian and Romanian appear in three lists each: Italian in England and Wales, Scotland and the Republic of Ireland; Romanian in England and Wales, Northern Ireland and the Republic of Ireland.
• Czech, Hungarian, Latvian and Slovak are each listed in the two Irish top 20s.
• Bulgarian is listed only in Northern Ireland.
• Dutch appears only in the Scottish top 20.
Table 19.1 The top 20 most frequently reported languages in the 2011 census statistics across the UK nations and the Republic of Ireland
| Rank | England and Wales: Language | Responses | Scotland: Language | Responses | Northern Ireland: Language | Responses | Republic of Ireland: Language | Responses |
|---|---|---|---|---|---|---|---|---|
| 1. | Polish | 546,174 | Polish | 54,186 | Polish | 17,731 | Polish | 119,526 |
| 2. | Panjabi | 273,231 | Urdu | 23,394 | Lithuanian | 6,250 | French | 56,430 |
| 3. | Urdu | 268,680 | Panjabi (not otherwise specified)ᵃ | 23,150 | Portuguese | 2,293 | Lithuanian | 31,635 |
| 4. | Bengali (with Sylheti and Chatgaya) | 221,403 | Chinese (not otherwise specified)ᵇ | 16,830 | Slovak | 2,257 | German | 27,342 |
| 5. | Gujarati | 213,094 | French | 14,623 | Chinese (Not otherwise specified)ᶜ | 2,214 | Russian | 22,446 |
| 6. | Arabic | 159,290 | German | 11,317 | Tagalog/Filipino | 1,895 | Spanish | 21,640 |
| 7. | French | 147,099 | Spanish | 10,556 | Latvian | 1,273 | Romanian | 20,625 |
| 8. | All other Chineseᵈ | 141,052 | Arabic | 9,097 | Russian | 1,191 | Chinese | 15,166 |
| 9. | Portuguese | 133,453 | Italian | 8,252 | Malayalam | 1,174 | Latvian | 12,996 |
| 10. | Spanish | 120,222 | Cantonese | 7,486 | Hungarian | 1,008 | Portuguese | 11,902 |
| 11. | Tamil | 100,689 | Russian | 6,001 | Cantonese | 966 | Arabic | 11,834 |
| 12. | Turkish | 99,423 | Hindi | 5,058 | Spanish | 918 | Italian | 10,344 |
| 13. | Italian | 92,241 | Dutch | 3,750 | French | 850 | Yoruba | 10,093 |
| 14. | Somali | 85,918 | Bengali | 3,626 | Romanian | 791 | Slovak | 9,481 |
| 15. | Lithuanian | 85,469 | Lithuanian | 3,496 | German | 728 | Malayalam | 8,849 |
| 16. | German | 77,240 | Malayalam | 3,397 | Arabic | 549 | Urdu | 8,443 |
| 17. | Persian/Farsi | 76,391 | Tagalog/Filipino | 3,379 | Bulgarian | 535 | Hungarian | 7,625 |
| 18. | Tagalog/Filipino | 70,342 | Portuguese | 3,175 | Czech | 533 | Filipino | 6,680 |
| 19. | Romanian | 67,586 | Turkish | 3,000 | Hindi | 454 | Tagalog | 6,190 |
| 20. | Russian | 67,366 | Mandarin Chinese | 2,963 | Tetun | 429 | Czech | 5,307 |
Notes: (a) Panjabi (India): 80 responses; (b) Min Nan Chinese: 102 responses; (c) Mandarin Chinese: 400 responses; (d) Mandarin Chinese: 22,025 responses, Cantonese Chinese: 44,404 responses.
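The combined UK total for Polish and the overlap between the four top 20s can be checked directly against Table 19.1. The sketch below transcribes only the European languages from each list; treating these particular languages as 'European' follows the groupings used elsewhere in this chapter and is an assumption of the example, as are the variable and function names.

```python
from collections import Counter

# European languages appearing in each top 20 of Table 19.1 (selection is an assumption).
european_in_top20 = {
    "England and Wales": {"Polish", "French", "Portuguese", "Spanish", "Italian",
                          "Lithuanian", "German", "Romanian", "Russian"},
    "Scotland": {"Polish", "French", "German", "Spanish", "Italian", "Russian",
                 "Dutch", "Lithuanian", "Portuguese"},
    "Northern Ireland": {"Polish", "Lithuanian", "Portuguese", "Slovak", "Latvian",
                         "Russian", "Hungarian", "Spanish", "French", "Romanian",
                         "German", "Bulgarian", "Czech"},
    "Republic of Ireland": {"Polish", "French", "Lithuanian", "German", "Russian",
                            "Spanish", "Romanian", "Latvian", "Portuguese", "Italian",
                            "Slovak", "Hungarian", "Czech"},
}

# Polish responses in the three UK datasets (the Republic of Ireland is excluded).
polish_uk_total = 546_174 + 54_186 + 17_731

# Number of lists in which each language appears.
appearances = Counter(
    language for languages in european_in_top20.values() for language in languages
)


def in_n_lists(n):
    """Return, in alphabetical order, the languages that appear in exactly n lists."""
    return sorted(lang for lang, count in appearances.items() if count == n)


for n in (4, 3, 2, 1):
    print(n, in_n_lists(n))
```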
However, multilingualism experts highlighted that the census results (or, more precisely, the published datasets) were not in line with the findings of independent academic surveys conducted in parts of Britain and Ireland. For example, Baker and Eversley (2000) had recorded 283 languages spoken among London’s schoolchildren alone (cf. von Ahn et al.’s (2010) finding of ‘over 300 languages’). There was also research showing that, in addition to the number of languages, the numbers of speakers of named languages were under-reported in the census data (Matras and Robertson 2015).
The formulation of the language question in the questionnaires used in England, Wales and Northern Ireland was put forward as the principal reason for under-reporting. In these three jurisdictions, people were asked to answer the question ‘What is your main language?’. If that was not English, they were asked to write in only one other answer and subsequently self-assess their ability to speak English; see Figure 19.1. Sebba’s (2018) critique identifies several problems with these questions. In the glossary of the terms used in the census, ‘main language’ is defined as ‘a person’s first or preferred language’ (Office for National Statistics 2014a:30). The term ‘main’, however, is open to a range of interpretations by respondents: it can be the language they are more proficient in and/or feel more confident using, the language they first acquired as children, or the language they use the most during the day either at home or at work. Ambiguity aside, there is an assumption that respondents have only one ‘main’ language (or variety) in their repertoires across all contexts and modes of use (inside and outside the home; inside and outside their families, friends, and community networks; formal and informal; spoken and written), which makes multilingualism invisible. The questionnaire also incentivises respondents to choose English as their main language in order to avoid having to report a less than optimal ability to speak the majority language.
The language question was more specific in Scotland, where people were asked if they used a language other than English at home. Despite this, the Scottish questionnaire also allowed respondents to provide only a single answer and designated this as the language of the home, thus obscuring the use of multiple languages in other domains as well as of multiple languages within the same household. The formulation in the Republic of Ireland was similar to that used in Scotland (‘Do you speak a language other than English or Irish at home?’), but the number of reported languages was rather low. This can perhaps be explained by the fact that a high number of unspecified languages were aggregated under the ‘Other stated languages (incl. not stated)’ category, judging by the number of responses listed under it (38,997, or 7.6 per cent of the overall number of people who answered they spoke a language other than English or Irish at home). Akan, the last-named language in the Irish data, received 1,007 responses, which means that each of the languages under the ‘Other’ category received 1,006 responses or fewer.
With these shortcomings in mind and with a view to improving the data on multilingualism, linguists and independent campaigners engaged with statistical authorities in the run-up to the 2021 census, calling for the wording of the language question to change in a way that would allow respondents to give a full(er) account of their linguistic repertoires (Matras et al. 2018; Payne 2018). ONS officials, however, were keen to remind them what the census is and what it is not. The line that Sebba (2018) reports, according to which the census is not an exhaustive survey of UK citizens’ full gamut of languages but is rather aimed at establishing the languages that government can use to provide public services to citizens (Office for National Statistics 2014b:13), remains the same. Indeed, the language questions remained unchanged in the questionnaires that were used across the four UK nations and the Republic of Ireland for the purposes of the 2021 and 2022 censuses (in Scotland and the Republic of Ireland, the censuses that were planned for 2021 took place in 2022 due to the Covid-19 pandemic). In 2021, however, the following definition of ‘main language’ was provided in the UK: ‘This is the language you use most naturally. For example, it could be the language you use at home.’ The definition was available only to people who completed the census questionnaire online; the paper questionnaire did not include the clarification (see Figure 19.3). Details on how the ONS researched, developed, and tested the language question used in the 2021 census for England and Wales are provided by the ONS in a dedicated report (Office for National Statistics 2020).

Figure 19.3 Language questions in the online and paper questionnaires used in the 2021 census in England.
Given that (a) the formulation of the language questions did not change; (b) the clarifications that were given for ‘main language’ limit the domain of usage to the home and introduce the rather vague quality of naturalness; and (c) the methodology for the analysis of the language data did not change significantly in 2021 (the ONS stated that the language data in the 2021 census are highly comparable, meaning that they can be directly compared with the data from the 2011 census), the 2021 census data will have to be approached with the same caveats as the 2011 data. These considerations aside, the publication of data about main language, English language proficiency and household language in England and Wales in late 2022 revealed interesting changes in the number of speakers of European languages (and other languages, for that matter) and in the composition of the ranking of the twenty most frequently reported languages. Table 19.2 shows the top twenty main languages spoken in England and Wales, excluding English and Welsh. It compares the number of responses each language received in the 2011 and 2021 censuses and indicates the increase or decrease in responses as well as the change in each language’s position in the top 20.
Table 19.2 The top 20 main languages spoken in England and Wales, excluding English (English or Welsh in Wales), in the 2021 census
| | Language | Responses 2011 | Responses 2021 | Change (%) | Rank 2011 | Rank 2021 | Rank change |
|---|---|---|---|---|---|---|---|
| 1. | Polish | 546,174 | 611,845 | +12.02 | 1 | 1 | 0 |
| 2. | Romanian | 67,586 | 471,954 | +598.30 | 19 | 2 | +17 |
| 3. | Panjabi | 273,231 | 290,745 | +6.41 | 2 | 3 | −1 |
| 4. | Urdu | 268,680 | 269,849 | +0.44 | 3 | 4 | −1 |
| 5. | Portuguese | 133,453 | 224,719 | +68.39 | 9 | 5 | +4 |
| 6. | Spanish | 120,222 | 215,062 | +78.89 | 10 | 6 | +4 |
| 7. | Arabic | 159,290 | 203,998 | +28.07 | 6 | 7 | −1 |
| 8. | Bengali (with Sylheti and Chatgaya) | 221,403 | 199,495 | −9.90 | 4 | 8 | −4 |
| 9. | Gujarati | 213,094 | 188,956 | −11.33 | 5 | 9 | −4 |
| 10. | Italian | 92,241 | 160,010 | +73.47 | 13 | 10 | +3 |
| 11. | Tamil | 100,689 | 125,363 | +24.51 | 11 | 11 | 0 |
| 12. | French | 147,099 | 120,259 | −18.25 | 7 | 12 | −5 |
| 13. | Lithuanian | 85,469 | 119,656 | +40.00 | 15 | 13 | +2 |
| 14. | All other Chinese | 141,052 | 118,271 | −16.15 | 8 | 14 | −6 |
| 15. | Turkish | 99,423 | 112,978 | +13.63 | 12 | 15 | −3 |
| 16. | Bulgarian | 38,496 | 111,431 | +189.46 | 29 | 16 | +13 |
| 17. | Russian | 67,366 | 91,255 | +35.46 | 20 | 17 | +3 |
| 18. | Persian or Farsi | 76,391 | 87,713 | +14.82 | 17 | 18 | −1 |
| 19. | Hungarian | 44,365 | 87,356 | +96.90 | 27 | 19 | +8 |
| 20. | Greek | 50,205 | 76,675 | +52.72 | 23 | 20 | +3 |
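The two derived columns of Table 19.2 follow from simple arithmetic, sketched below for three rows of the table. The function names are illustrative, and the sign convention for rank change (positive meaning a move up the ranking, as with Romanian's +17) is inferred from the table rather than taken from ONS documentation.

```python
def percent_change(responses_2011, responses_2021):
    """Percentage change in responses between the two censuses, to two decimals."""
    return round((responses_2021 - responses_2011) / responses_2011 * 100, 2)


def rank_change(rank_2011, rank_2021):
    """Places gained (positive) or lost (negative) between 2011 and 2021."""
    return rank_2011 - rank_2021


# (responses 2011, responses 2021, rank 2011, rank 2021), taken from Table 19.2.
rows = {
    "Romanian": (67_586, 471_954, 19, 2),
    "Panjabi": (273_231, 290_745, 2, 3),
    "French": (147_099, 120_259, 7, 12),
}

for language, (r2011, r2021, k2011, k2021) in rows.items():
    change = percent_change(r2011, r2021)
    moved = rank_change(k2011, k2021)
    print(f"{language}: {change:+.2f} per cent, rank change {moved:+d}")
```

Note that a language can gain responses and still lose places, as Panjabi does, because rank depends on how the other languages changed over the same period.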
Overall, the changes reflect demographic developments that England and Wales underwent in the decade between the two censuses, most notably the increase in the number of people who migrated to the two nations from the southern European countries that were worst affected by the 2008 financial crisis, namely Greece, Italy, Spain and Portugal (Pratsinakis et al. 2020). All the languages associated with these countries saw both increases in the number of responses, ranging from 52.72 per cent (Greek) to 78.89 per cent (Spanish), and upward moves in the language ranking. As a result, Greek is now included in the list of the top twenty languages in England and Wales. The most striking changes, however, were seen in the case of Romanian and, to a lesser extent, Bulgarian. The number of responses for Romanian increased by 598.3 per cent, from 67,586 to 471,954, moving the language seventeen places up the ranking, from nineteenth position in 2011 to second position in 2021. The increase reflects the rise in the number of Bulgarian and Romanian citizens living in the UK, which is in turn linked to the lifting in 2014 of the employment restrictions that had been put in place immediately after the accession of Bulgaria and Romania to the European Union in 2007 (Ruhs and Wadsworth 2018). Bulgarian and Romanian speakers also arrived in the UK from Greece, Italy, Spain and Portugal as onward migrants pushed by the financial crisis. At the same time, the number of people responding that they have other European languages as their main languages either decreased, as in the case of French (−18.25 per cent), or increased less markedly, as in the case of Polish (+12.02 per cent). German also fell fifteen places in the ranking, from sixteenth in 2011 to thirty-first in 2021, with the number of responses decreasing by 39.9 per cent, from 77,240 in 2011 to 46,421 in 2021.
Even though studies that address these particular cases were not available at the time of writing, it could be hypothesised that the changes in French, Polish, German, and other languages may be due to a combination of people leaving England and Wales, including EU citizens leaving because of Brexit (Stawarz and Witte 2023), and language shift to English among linguistic groups with a longer presence in the country.
The 2011 England and Wales census was the only one that classified languages on the basis of region. In the published dataset, French, Portuguese, Russian and Spanish appeared as individual categories; eighteen other languages were listed as ‘Other European language (EU)’; Albanian, Serbian/Croatian/Bosnian and Ukrainian were categorised as ‘Other European language (non-EU)’; and Romani and Yiddish as ‘Other European language (non-national)’. There were also three aggregate categories. The number of responses for each of these is shown in Table 19.3. In the full classification, a total of seventy languages were distributed across the seven categories (Table 19.4).
Table 19.3 Numbers of responses European languages received in the 2011 England and Wales census
| Category | Language | Responses |
|---|---|---|
| French | French | 147,099 |
| Other European language (EU) | Polish | 546,174 |
| | Italian | 92,241 |
| | Lithuanian | 85,469 |
| | German | 77,240 |
| | Romanian | 67,586 |
| | Slovak | 50,485 |
| | Greek | 50,205 |
| | Hungarian | 44,365 |
| | Bulgarian | 38,496 |
| | Latvian | 31,523 |
| | Czech | 29,363 |
| | Dutch | 26,657 |
| | Swedish | 19,211 |
| | Danish | 9,971 |
| | Finnish | 6,592 |
| | Estonian | 3,398 |
| | Maltese | 3,108 |
| | Any other European language (EU) | 2,969 |
| | Slovenian | 1,235 |
| Other European language (non-EU) | Albanian | 32,425 |
| | Serbian/Croatian/Bosnian | 14,164 |
| | Northern European language (non-EU) | 10,777 |
| | Ukrainian | 6,578 |
| | Any other Eastern European language (non-EU) | 1,735 |
| Other European language (non-national) | Yiddish | 3,987 |
| | Romani language | 629 |
| Portuguese | Portuguese | 133,453 |
| Russian | Russian | 67,366 |
| Spanish | Spanish | 120,222 |
Table 19.4 Languages coded as European in the full classification of the 2011 England and Wales census
| Category | Language label |
|---|---|
| French | French |
| Other European Language (EU) | Arberesh, Arvanitika, Basque/Euskara, Breton, Bulgarian, Catalan, Corsican, Czech, Danish, Dutch, Estonian, Finnish, Franco-Provencal, Frisian, Friulian, Galician, Gallo-Italian (not otherwise specified), German, Greek, Hungarian, Italian, Kashubian, Ladin, Latvian, Lithuanian, Luxembourgish, Maltese, Napoletano-Calabrese, Occitan, Padanian, Polish, Rhaeto-Romance (not otherwise specified), Romanian, Saami, Sardinian, Sicilian, Slovak, Slovenian, Sorbian (Lower), Sorbian (Upper), Swedish, Venetian |
| Other European Language (non-EU) | Albanian (Gheg/Kosovan), Albanian (Not otherwise specified), Albanian (Tosk), Belarusian, Bosnian, Croatian, Erzya, Faroese, Icelandic, Komi, Macedonian, Moksha, Norwegian, Romansch, Serbian, Serbo-Croat (Not otherwise specified), Sorbian (Not otherwise specified), Swiss German, Ukrainian, Yugoslav |
| Other European Language (non-national) | Romani, Romany (not otherwise specified), Sinti, Yiddish |
| Portuguese | Portuguese |
| Russian | Russian |
| Spanish | Spanish |
The number and range of languages in the full classification are comparable with the EU’s official estimate that a total of eighty-three languages are spoken in its geographical area. These are either the official languages of European nation states (23 languages) or the languages of ‘indigenous’ communities which may or may not enjoy official recognition of their minority status in European states (60 languages; Eurobarometer 2012). However, as McPake et al. (2007) point out, the notion of linguistic diversity should incorporate all the languages that are in use in a society so that, in addition to official and regional languages, migrant languages, non-territorial languages (the languages of travellers and historically displaced groups) and sign languages are also included. In the census, Romani and Yiddish are included as European non-territorial languages (non-national in ONS terminology), whereas sign languages are distributed across three categories: ‘British sign language’, ‘Sign language (all other)’ and ‘Any sign communication system’. Migrant languages, however, are geographically classified on the basis of their origin so that Panjabi, Urdu, Bengali and Gujarati – the four most commonly reported languages after Polish in England and Wales – are under the ‘South Asian language’ category.
While this designation may be accurate in reflecting the historical provenance of these (and many other) languages, it conceals the fact that, in the present era of increased mobility, they are spoken by people for whom Europe is an important part of their trajectories of migration and Europeanness an important dimension of their identities. Consider, for example, the case of onward migrants who relocated to the UK after first having migrated to another European country and whose linguistic repertoires include ranges of ‘European’ and ‘non-European’ languages such as the Italian Bangladeshis (Della Puppa and King 2019), Dutch Somalis, Swedish Iranians, or German Nigerians (Ahrens, Kelly and Van Liempt 2016). Allowing census respondents to name multiple languages in response to the language question would capture multilingualism in Britain and Ireland to an unprecedented extent, while at the same time calling into question labels that reproduce simplistic and hierarchising views about the correspondences between language, ethnicity, nationality, ancestry and identity. If, to name just one example, Bengali is spoken across Europe, including in the UK, by large numbers of people, many of whom were born in and are citizens of one or more European countries, there should be no ideology-free reasons for which Bengali cannot be said to be a European language.



