In 1782, Rüdiger published a book titled Von der Sprache und Herkunft der Zigeuner aus Indien (On the Indic Language and Origin of the Gypsies). In that book, Rüdiger postulated an Indian origin of the Romani language and its speakers and its connection to languages of the Indian subcontinent such as Hindi and Bengali. He used a surprisingly modern methodology, collecting his Romani data directly from a Romani speaker (which he admitted to finding “tiresome and boring”) and his Hindi data from a manual written by a missionary. He examined the vocabularies of Romani and Hindi, including numerals like ekh/ek ‘one’, duj/do ‘two’, trin/tīn ‘three’, and so on (the first numeral in each pair listed here is from Romani and the second is from Hindi). Noticing similarities across numerous vocabulary items, Rüdiger surmised that these similarities must derive from the common origin of the two languages. But Rüdiger did not limit himself to examining the vocabulary of Romani and other languages we now call Indo‑Aryan. He wrote: “As regards the grammatical part of the language the correspondence is no less conspicuous, which is an even more important proof of the close relation between the languages” (Rüdiger [1782] 1990: 7).
In this chapter, we consider two large language families, Uralic and Turkic, as well as several smaller language families and isolates. Section 4.1 examines the Uralic language family. The realm of Uralic languages stretches from northern Scandinavia in the northwest and the Great Hungarian plain in the southwest to east of the Ural Mountains, the mountain chain that separates Europe and Asia. In section 4.2 we examine Turkic languages and in section 4.3 languages of Siberia that do not belong to the Turkic or Uralic family. This vast region – larger than any independent country – is home to dozens of indigenous languages still spoken today. Finally, section 4.4 introduces an unusual linguistic phenomenon that is found in Turkish and several of the Siberian languages (as well as some other languages elsewhere), called grammatical evidentiality (or evidentiality, for short).
Dominated by the imposing Great Caucasus Mountains (which include Europe’s highest mountain, Mount Elbrus), the Caucasus region stretches between the Black Sea and the Caspian Sea. Although historically its location at the center of the Afro-Eurasian ecumene (essentially “the Old World”) made it a vital crossroads for many a century, nowadays it is more often than not overlooked in geographical descriptions. Yet it is fascinating from the ethnolinguistic perspective, not only because of the sheer number of ethnic groups that inhabit, and languages that are spoken in, this relatively small region, much of which is an uninviting terrain, but also because of a number of linguistic peculiarities that the languages here exhibit.
Any map of the languages of Canada and the USA looks like a veritable mosaic, whether it shows indigenous languages or languages that arrived in the Americas in the last half millennium, first with early European colonizers (not only English, but also French, Dutch, and even Swedish made an impact in the early colonial period) and later with more recent waves of immigrants. The Ethnologue list of immigrant languages spoken in the USA alone includes 203 languages, running alphabetically from Adamawa Fulfulde (a Niger-Congo language from West Africa; see Chapter 7) to Zoogocho Zapotec (an Oto-Manguean language from Northern Oaxaca, Mexico; see Chapter 12). There are altogether 255 living languages spoken in Canada and the USA. Thus, like New Guinea and Australia (see Chapter 10), Canada and the USA exhibit a high degree of linguistic diversity, albeit one muted by waves of language extinction in the last few centuries. Coupled with our incomplete understanding of language relatedness in the region, this creates a very muddled picture of the linguistic landscape.
Sub-Saharan Africa is home to numerous languages that belong to several language families. To get a sense of just how linguistically diverse sub-Saharan Africa is, consider the following figures from Ethnologue. The nearly 1 billion people living in sub-Saharan Africa speak over 2,040 languages. On a national level, there are 16 countries in this region with 50 or more languages spoken. For comparison, in Europe the only country with more than 50 languages is the Russian Federation, most of whose languages are spoken in Siberia rather than in European Russia. In Asia, 12 countries with more than 50 languages are listed.
In this chapter we focus on languages spoken on the numerous islands in the Pacific Ocean, as well as one big island in the Indian Ocean: Madagascar. Most of these languages belong to the Austronesian language family, and it will be our primary concern here; other languages in this region belong to the geographical (but not familial!) grouping of Papuan languages, which will be considered in detail in Chapter 10.
Because languages of New Guinea and Australia are considered in the same chapter of this book, one might expect them to be related. Given the geographical proximity of the two areas, several linguists have explored the idea that the languages spoken there constitute one family; however, the question remains open. Until recently, no plausible link had been found, in part because researchers looked in the wrong place: the south coast of New Guinea, immediately north of the Torres Strait and the Arafura Sea. It is a logical place to look, for sure, as it is the area closest to Australia, but unfortunately, no languages spoken there today show any evidence of a link to languages of Australia. More recent studies suggest that a possible link between languages of Australia and New Guinea may be found in an unexpected location: in the highlands of New Guinea (see Foley 1986: 271–275).
Imagine a Martian scholar who aims to understand the linguistic diversity on Planet Earth. One way this imaginary alien scholar might go about this research is by exploring Google Translate. However, doing so would offer our imaginary extraterrestrial scholar only a very biased and rather misleading picture. As of November 2022, Google Translate supports 134 languages, which constitutes about 2 percent of the world’s 7,151 languages identified by Ethnologue. Not only is it a small fraction of the total number of languages, but the list is not representative of global linguistic diversity: of the 134 Google Translate languages, 64 (or nearly half) are Indo-European languages.
Ferdinand de Saussure, a Swiss linguist considered to be one of the fathers of modern linguistics, in his Cours de linguistique générale (Course in General Linguistics, 1916) defines language as “a product of the collective mind of linguistic groups”. But this definition hardly helps us in drawing the boundary between one language and another: how can we tell who is or is not to be included in any given “linguistic group”? Take any two people, even close relatives, and they are sure to speak at least slightly differently. Yet it is not insightful to say that there are as many languages in the world as there are individual people!
The region that we focus on in this chapter is larger than the one traditionally designated as the Middle East, yet smaller than the political term “the Greater Middle East”, which was coined during the second George W. Bush administration in the USA to refer to the contiguous Muslim world, a larger zone extending beyond Egypt’s western and southern borders – in North Africa and the Horn of Africa – and beyond Iran’s eastern borders – in parts of South Asia (Afghanistan and Pakistan). Linguistically speaking, the region of interest to us here stretches from Egypt in the west to the western border of Iran in the east and from the southern border of Turkey in the north to Yemen and Oman in the south. (Languages of Iran and Turkey themselves are discussed in Chapters 3 and 4, respectively.)