Triangulating the Indo-European homeland

Asya Pereltsvaig; Martin W. Lewis

doi:10.1017/CBO9781107294332.013

In the previous chapter, we showed that evidence from linguistic paleontology, especially as it pertains to vocabulary elements associated with wheels and wheeled vehicles, is congruent with the later date, circa 3500 BCE, for the highest-order split of the Indo-European family (which separated the Anatolian languages from the rest of the family). In this chapter, we examine what linguistic (and, to a lesser extent, archeological) evidence can tell us about the location of the Indo-European homeland. In particular, we will examine three types of sources: clues from linguistic paleontology, evidence of language contact with other groups, and reconstructions of the migration histories of the various groups involved.

Clues from linguistic paleontology

Although archeological evidence of a material culture contains no traces of the language or languages spoken by the people who produced the artifacts, language – documented or reconstructed – does contain traces of the speakers’ material life. Linguistic paleontology searches the reconstructed vocabulary of a proto-language for clues about how and where its speakers lived. Firm conclusions about environment and culture, however, can be drawn only if the signals from linguistic paleontology are bolstered by evidence from other fields, such as archeology and palynology (the study of fossil pollen).

As discussed in the previous chapter, which focuses on the timing of Proto-Indo-European (PIE), the logic of linguistic paleontology holds that if a proto-language had a reconstructable word for a certain concept, its speakers were therefore familiar with that concept. For example, reconstructing the terms for wheeled vehicles, which appear in the archeological record only as far back as 3500 BCE, serves to invalidate hypotheses that place PIE farther back in time, such as the Gray–Atkinson neo-Anatolian hypothesis. Linguistic paleontology can shed light not only on when but also on where a given proto-language was spoken. For example, reconstructions of Proto-Salishan vocabulary from western North America include some “two dozen [reconstructed plant and animal terms] represent[ing] species found only on the coast”, including ‘harbour seal’, ‘whale’, ‘cormorant’, ‘seagull’ and the like, which “suggest[s] a coastal, rather than an interior, homeland for the Salish”, a family of languages spoken not only on the Pacific coast but also as far inland as Montana and Idaho (Kinkade 1991: 143–144, cited in Campbell 2013: 427). Additional linguistic paleontology clues allowed Kinkade (1991: 147, cited in Campbell 2013: 427) to narrow down the Proto-Salishan homeland even further:

extend[ing] from the Fraser River southward at least to the Skagit River and possibly as far south as the Stillaguamish or Skykomish rivers […] From west to east, their territory would have extended from the Strait of Georgia and Admiralty Inlet to the Cascade Mountains.

The locations of several other language families’ respective homelands have also been reconstructed on the basis of evidence from linguistic paleontology. Siebert (1967) placed the original homeland of the Algonquian family “between Lake Huron and Georgian Bay and the middle course of the Ottawa River, bounded by Lake Nipissing and the northern shore of Lake Ontario” (Campbell 2013: 425). Fowler (1972) concluded that the homeland of the Numic family (a subgroup of the Uto-Aztecan family) was “in Southern California slightly west of Death Valley” (Campbell 2013: 427). (The homeland of the Uto-Aztecan family as a whole remains more controversial.) Much work has been done on locating the Proto-Uralic homeland (or the homeland of its largest constituent branch, Finno-Ugric), yet the answer remains elusive. (We return to the question of the Proto-Uralic homeland in the next section.)

While the logic of the linguistic paleontology method is self-evident, what exactly can be learned from specific reconstructions is often open to interpretation. For example, most scholars agree that there are no reconstructable PIE words for Mediterranean and Southwest Asian animals and plants, such as ‘cypress tree’, ‘palm tree’, ‘date tree’, ‘olive’, ‘camel’, or ‘donkey’.¹ Although each such argumentum e silentio for any particular meaning is always weak, the combined absences of roots for various southern plants and animals, as well as the presence of the roots for ‘snow’ (*sneigʷʰ- and *g̑ʰei̯om-) or possibly ‘winter’, suggest a northerly location. However, even in Anatolia and the Levant, highland areas receive abundant snowfall. The roots for ‘snow’ could therefore indicate a high-altitude rather than a high-latitude location, especially in conjunction with the root *gʷerH- for ‘mountain’. Clearly, much uncertainty surrounds these kinds of arguments.

Another controversial PIE reconstruction is the root *mori, which presumably means ‘sea’. If this term is indeed traceable all the way back to PIE, then by the logic of linguistic paleontology one might assume that the speakers of PIE must have lived near a large body of water of some type. Note, however, that the relevant cognates come from the northwestern Indo-European languages: Lithuanian māres, Old Church Slavonic morje, Latin mare, Old Irish muir, Gothic marei. No relevant cognates are found in the Anatolian, Tocharian, Greek, Armenian, Albanian, or even Indo-Iranian branches of the family. The Greek word thalassa ‘sea’, for example, almost certainly comes from a pre-Indo-European substrate. As a result of such absences, the root *mori cannot be reliably reconstructed all the way back to PIE. It is possible that the Indo-European branches that lack a reflex of *mori once had it but later lost it, perhaps replacing it with a word acquired from the local substratum language, as has been proposed for the Greek thalassa (as discussed in Chapter 7). Alternatively, it is possible that the root *mori ‘sea’ was coined by – or borrowed into – the common ancestor of a particular branch of the Indo-European family.

As it turns out, determining whether a word that is absent in many descendant languages stems from PIE is often a difficult matter. In the case of ‘sea’, the issue is further complicated by the fact that even in the Germanic and Celtic languages we find other roots meaning the same thing, as is evident in the English word sea itself. Moreover, some of the roots for ‘sea’ can also refer to other types of water bodies. For example, the German cognate of English sea, See, can refer to either ‘lake’ or ‘sea’, whereas German Meer refers to either ‘sea’ or ‘ocean’, while the Dutch word meer generally means ‘lake’. Scottish Gaelic loch refers to either ‘fresh-water lake’ or ‘salt-water sea inlet’. Similarly, Russian more, just like its English counterpart sea, can also refer to a large landlocked body of water, such as the Aral Sea, the Caspian Sea, the Dead Sea, or the Sea of Galilee. Thus, it is possible that PIE speakers were familiar not with the sea in the sense of the ocean, but rather with a large interior body of water.

An additional twist is added to the ‘sea’ mystery by the root *lok̑so-, putatively meaning ‘salmon’ (cf. Tocharian B laks, Proto-Germanic *lahsaz, Old High German lahs, Icelandic lax, Lithuanian lãšis, Russian losos’). Unlike the root for ‘sea’, this word is not limited to the younger branches of the Indo-European family. Its presence in Tocharian means that it must have existed at least in Proto-Nuclear-Indo-European (PNIE), if not in PIE itself. Salmon, however, is found only in northern seas and the rivers that flow into them, a fact that some scholars have used against the idea that the Indo-European family originated in the steppe zone. Yet closely related species in the family Salmonidae, such as trout, are found in the rivers of southern Russia and Ukraine, as well as in the mountain lakes of Armenia. The conclusion that PIE *lok̑so- referred not to a specific species but to ‘any large anadromous salmonid fish’ is reached, for example, by Diebold (1985: 11).

It is also possible that the PIE word *lok̑so- referred not to ‘salmon’ or ‘trout’, but to ‘fish’ in general, eventually narrowing its meaning in descendant languages. An illustrative example of how such narrowing might have happened comes from a different language family: Athabascan. The Proto-Athabascan *łu:q’ə is reconstructed to have meant ‘fish’ because it is the “common denominator” of meanings in modern descendant languages. In Navajo, spoken in landlocked areas, łóó’ means ‘fish’. Its cognate in Hupa, spoken in Northern California, ło:q’ means ‘fish’ or more specifically ‘salmon’. In Tlingit, however, the reflex of the Proto-Athabascan root l'ook means neither ‘fish’ nor ‘salmon’, but specifically ‘silver/coho salmon’ (Oncorhynchus kisutch). Other species of salmon are referred to in Tlingit by separate words: t’á (Oncorhynchus tshawytscha), gaat (Oncorhynchus nerka), téel’ (Oncorhynchus keta), and cháas’ (Oncorhynchus gorbuscha). The generic term for ‘fish’, which by default refers to any kind of salmon, is xáat. In parallel to the Tlingit case, the PIE *lok̑so- may have simply meant ‘fish’, with descendant languages narrowing the meaning to ‘salmon’, ‘trout’, or some other locally available variety of fish. Admittedly, such a scenario is less likely for the Indo-European family than for languages in the Athabascan family spoken where salmon served as the staple food. However, it cannot be excluded altogether because we cannot be certain that the Tocharian B laks meant ‘salmon’ rather than ‘fish’ or ‘trout’ (or some other fish species).

Another much-discussed set of clues about the location of the Indo-European homeland comes from the names of various types of trees. German philologists in the first half of the nineteenth century were preoccupied with finding the PIE Urheimat through tree names, favoring – unsurprisingly – a more northern homeland. But as we shall see below, the linguistic paleontology argument based on tree names is not without flaws. First, let us consider PIE reconstructions of tree names and what can be learned from them. Upon a thorough study of the topic, Friedrich (1970: 1) claims that PIE speakers recognized and named at least eighteen categories of trees, but PIE (or “large groups of PIE dialects”) contained at least thirty names of trees, “attested in varying ways and degrees in languages of the descendant stocks, but particularly in Italic, Germanic, Baltic, and Slavic, and to a lesser degree in Celtic and Greek”. The discrepancy between the number of tree categories and the number of labels used to designate them stems from the fact that “a considerable number of […] arboreal units have two or more alternate names or forms […] specifically, the oak and the willow have three each, and five trees have two each: the yew, the apple, the maple, the elm, and the nut (tree)” (Friedrich 1970: 4). In the case of the nut (tree), the two forms are “complementary geographically” (Friedrich 1970: 4), as one form is reconstructed based on the reflexes in eastern branches (Baltic, Slavic, Greek, and Albanian), while the other form is reconstructed based on derivatives in western branches (Italic, Celtic, and Germanic). In other cases, the two (or more) forms must have referred to distinct species or subspecies, or perhaps distinct functions (e.g. tree in nature vs. material used in religious ritual). Moreover, Friedrich (1970: 1) maintains that the thirty or so tree names

refer to categories of trees that in the main correspond to generic groups such as the birch (Betulus), but are limited in other cases to single species, such as the Scotch pine (Pinus sylvestris); some PIE classes cross-cut the familiar ones of the English language, or of the language of Linnaean botany, as when, for example, *Ker-n- includes both the wild cherry (Prunus padus), and the cornel cherry (Cornus mas).

The eighteen categories of trees whose names have been reconstructed for PIE, their names (in some cases more than one name per category, as discussed above), and the languages/branches that have reflexes of these PIE forms, are listed in Table 9.1 (based on Friedrich 1970: 24–25, Table 2; for the specific reflexes of these PIE roots in the various daughter branches/languages, see Friedrich 1970: 24–25). What can we deduce about the location of the Indo-European homeland from these tree names in the reconstructed PIE vocabulary, assuming for the sake of the argument that the translations in Friedrich's study are correct?

Table 9.1 Tree categories whose names are reconstructed for PIE (based on Friedrich 1970: 24–25)

Tree categories	PIE tree names	IE branches with reflexes of the PIE form
Birch	*bʰerH-g̑-o-	Slavic, Baltic, Italic, Germanic, Indic, Iranian
Conifers	*pitu-	Greek, Italic, Indic, Albanian
	*pi/uK-	Slavic, Baltic, Greek, Italic, Celtic, Germanic
Junipers and cedars	*el-u̯-n-	Slavic, Baltic, Greek, Armenian
Populus	*osp-	Slavic, Baltic, Greek, Germanic, Indic, Iranian
Willows	*s/wVlyk	Greek, Celtic, Germanic, Anatolian
	*wyt-	Slavic, Baltic, Greek, Celtic, Germanic, Indic, Armenian
Apples	*abVl-	Slavic, Baltic, Italic, Celtic, Germanic
	*maHlo-	Greek, Albanian, Anatolian, Tocharian
Maples	*klen-	Slavic, Baltic, Greek, Celtic, Germanic
	*akVrno-	Greek, Italic, Germanic, Indic
Alder	*aliso-	Slavic, Baltic, Greek, Italic, Celtic, Germanic
Hazel	*kos(V)lo-	Slavic, Baltic, Italic, Celtic, Germanic
Nut (tree)s	*knu-	Italic, Celtic, Germanic
	*ar-	Slavic, Baltic, Greek, Albanian
Elms	*Vlmo-	Slavic, Italic, Celtic, Germanic
	*u̯-ig̑-	Slavic, Baltic, Germanic, Iranian
Linden	*lenTā-	Slavic, Baltic, Greek, Italic, Germanic, Albanian
	*lēipā-	Slavic, Baltic, Greek, Celtic
Ash	*os-	Slavic, Baltic, Greek, Italic, Celtic, Germanic, Albanian, Armenian
Hornbeam	*grōbʰ-	Slavic, Baltic, Greek, Italic, Albanian
Beech	*bʰāg̑o-	Slavic, Greek, Italic, Celtic, Germanic, Albanian
Cherry	*K(e)r-n-	Slavic, Baltic, Greek, Italic, Albanian
Yews	*eywo-	Slavic, Baltic, Greek, Italic, Celtic, Germanic, Armenian, Anatolian
	*tVk̑so-	Slavic, Greek, Italic, Iranian
Oaks	*gʷelH-	Slavic, Baltic, Greek, Italic, Indic, Albanian, Armenian
	*ai̯g-	Greek, Italic, Germanic
	*perkʷ-	Slavic, Baltic, Italic, Celtic, Germanic, Indic, Albanian, Armenian, Anatolian
	*dóru-	Slavic, Baltic, Greek, Italic, Celtic, Germanic, Indic, Iranian, Albanian, Armenian, Anatolian, Tocharian

To answer this question, we turn to the map database compiled by EUFORGEN, a European forestry conservation program.² As can be seen from Map 23 in the Appendix, and as has been long noted by advocates of linguistic paleontology, common beech – known scientifically as Fagus sylvatica – is not found east of the line that runs from Klaipeda in western Lithuania down to the Crimea, although a related species, oriental beech, or Fagus orientalis, is found in northern Anatolia, Crimea, on the slopes of the Great Caucasus Mountains, and in the Alborz Range south of the Caspian Sea (see Map 24). The limited distribution of the tree has been taken as prima facie evidence that the original Indo-Europeans must have lived to the west of this line, rather than in Russia or (eastern) Ukraine. Note that this “beech argument” is in direct contradiction to the “horse argument”: areas where beech and horse were found in the relevant time period do not in general overlap.

The distribution of other species for which Indo-European roots have been reconstructed by Friedrich (1970) sheds new light on the problem of the PIE homeland – or throws a wrench into the works, depending on how one looks at it.³ Silver birch (Betula pendula) is found in an extensive zone stretching throughout Northern and Eastern Europe and across the Pontic steppe zone (see Map 25). Common ash (Fraxinus excelsior) grows in Western, Central, and Eastern Europe, the Caucasus, and northern Anatolia (see Map 26). The distribution of European white elm (Ulmus laevis) extends through most of Central and Eastern Europe, including the northern part of the steppe zone, but it is virtually unknown in Anatolia (see Map 27). Eurasian aspen (Populus tremula) is found in Europe, central Russia, western Kazakhstan, and sporadically in the steppe zone of southern Russia and Ukraine, yet is virtually unknown in Anatolia (see Map 28). The distribution of silver fir (Abies alba) is limited to an even smaller area; like beech, it too is not found east of the Klaipeda–Crimea line (see Map 29). Of the various other conifers on which information is available in the EUFORGEN database, only Brutia pine (Pinus brutia) and black pine (Pinus nigra) grow in the areas of interest for us, specifically in coastal Anatolia (see Maps 33 and 34; two other coniferous species, Pinus halepensis and Pinus sylvestris, grow sporadically through southern and northern Anatolia, respectively). Black alder (Alnus glutinosa) grows in Western and Central Europe and sporadically throughout Russia, Ukraine, the North Caucasus, and the mountainous areas of northern and eastern Anatolia (see Map 30). (A related species, Italian alder, or Alnus cordata, is found only in southern Italy.) Field maple (Acer campestre) is found extensively throughout Europe, as well as in northern Ukraine and central Russia, in the North Caucasus, and along a narrow coastal strip in northern Anatolia (see Map 31). Black poplar (Populus nigra) grows almost everywhere in Europe (except the most northern areas), the steppe and forest zones of Russia, Ukraine, and western Kazakhstan, the Caucasus, Anatolia, and the northern Fertile Crescent (see Map 32).⁴ Wild apple (Malus sylvestris) grows in the steppe zone, but only sporadically in Anatolia (see Map 35). Linden (Tilia cordata) grows in parts of the steppe zone, but not in Anatolia. Wild cherry (Prunus avium) is not common to either the steppe zone or Anatolia (see Map 36). As for the oaks, one species (Quercus petraea) grows in Anatolia but not the steppes, while another (Quercus robur) grows sporadically in both the steppe zone and Anatolia (see Maps 37 and 38). The distribution of the various tree species in the two areas identified as the candidate PIE locations according to the Steppe and Anatolian hypotheses in Bouckaert et al. (2012, Supplementary Materials: Figure S5) is summarized in Table 9.2.

Table 9.2 The distribution of tree species in the two candidate PIE locations according to the Steppe and Anatolian hypotheses

Common name	Species	Steppe zone	Southern and Central Anatolia
Beech	Fagus sylvatica	–	–
Birch	Betula pendula	Sporadically in the north, not in the south	Sporadically in central Anatolia
Ash	Fraxinus excelsior	In the west only	–
Elm	Ulmus laevis	In the north only	–
Aspen	Populus tremula	In the north, sporadically elsewhere	–
Conifers	Abies alba	–	–
	Pinus brutia, Pinus nigra	–	Sporadically in coastal Anatolia
Alder	Alnus glutinosa	Sporadically in the north	Sporadically
Maple	Acer campestre	In the northwest only	–
Poplar	Populus nigra	Yes	Yes
Apple	Malus sylvestris	Yes	Sporadically
Linden	Tilia cordata	Yes	–
Cherry	Prunus avium	–	–
Oak	Quercus petraea, Quercus robur	Sporadically	Yes

Comparing the two areas – the steppe zone vs. central and southern Anatolia – it is clear that the former is a better candidate for the PIE homeland than the latter, as it includes many more tree species known to the speakers of PIE. In fact, only the sporadic distribution of some coniferous species in coastal Anatolia and the appearance of certain oak species in parts of Anatolia point towards Anatolia as the Indo-European Urheimat. Birch, alder, and poplar are not informative, as all three grow in both areas (the first two only sporadically, however). Beech, most coniferous species, and cherry present a problem for both hypotheses at first glance, as none of them grows in either of the two areas; we shall revisit the “beech problem” below. The distributions of ash, elm, aspen, maple, wild apple, and linden, however, point to the steppe zone, as these trees are found in parts of the steppe zone but not, or only sporadically, in Anatolia.
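The comparison just described can be sketched as a small tally over Table 9.2. The three-level coverage scale below (absent, sporadic or regionally limited, widespread) is an illustrative coding introduced here and is not part of the source; the blank steppe cell for the two pines is assumed to mean that they are absent there.

```python
# Illustrative coding of Table 9.2 (an assumption, not the authors' method):
# 0 = absent ("–"), 1 = sporadic or regionally limited, 2 = widespread ("Yes").
ABSENT, PARTIAL, WIDESPREAD = 0, 1, 2

# Each entry: (steppe zone, southern and central Anatolia).
TABLE_9_2 = {
    "beech (Fagus sylvatica)": (ABSENT, ABSENT),
    "birch (Betula pendula)": (PARTIAL, PARTIAL),
    "ash (Fraxinus excelsior)": (PARTIAL, ABSENT),
    "elm (Ulmus laevis)": (PARTIAL, ABSENT),
    "aspen (Populus tremula)": (PARTIAL, ABSENT),
    "fir (Abies alba)": (ABSENT, ABSENT),
    # The steppe cell is blank in the table; treated as absent here.
    "pines (Pinus brutia, Pinus nigra)": (ABSENT, PARTIAL),
    "alder (Alnus glutinosa)": (PARTIAL, PARTIAL),
    "maple (Acer campestre)": (PARTIAL, ABSENT),
    "poplar (Populus nigra)": (WIDESPREAD, WIDESPREAD),
    "apple (Malus sylvestris)": (WIDESPREAD, PARTIAL),
    "linden (Tilia cordata)": (WIDESPREAD, ABSENT),
    "cherry (Prunus avium)": (ABSENT, ABSENT),
    "oak (Quercus petraea, Quercus robur)": (PARTIAL, WIDESPREAD),
}


def classify(steppe, anatolia):
    """Say which candidate area a tree's distribution favors, if any."""
    if steppe == ABSENT and anatolia == ABSENT:
        return "neither"          # problematic for both hypotheses
    if steppe == anatolia:
        return "uninformative"    # equally present in both areas
    return "steppe" if steppe > anatolia else "anatolia"


def tally(table):
    """Group the tree categories by the area their distribution favors."""
    groups = {"steppe": [], "anatolia": [], "uninformative": [], "neither": []}
    for name, (steppe, anatolia) in table.items():
        groups[classify(steppe, anatolia)].append(name)
    return groups


if __name__ == "__main__":
    for verdict, trees in tally(TABLE_9_2).items():
        print(f"{verdict:14} {len(trees)}: {', '.join(trees)}")
```

Under this coding, six categories favor the steppe zone, two favor Anatolia, three are uninformative, and three are found in neither area, which matches the grouping given in the text. A different coding of the "sporadic" cells could shift individual rows, so the tally should be read as a summary of the table rather than as independent evidence.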

Although the distribution of the various tree species points to the Pontic steppe zone as the prime candidate for the PIE homeland, this argument is not without flaw. First, as mentioned above, some species with reconstructed PIE roots, most notably beech, are found neither in the steppe zone nor in Anatolia. This so-called “beech argument” has been used by various scholars to limit the area in which PIE might have been spoken (see Schrader 1890; Schrader and Nehring 1923/1929; Mallory 1973; Gamkrelidze and Ivanov 1984; inter alia). However, tree names often provide unreliable evidence. For example, the root for ‘beech’ is reconstructed as *bʰāgo- (Friedrich 1970: 25) on the basis of such cognates as Greek phāgós, Latin fāgus, Old English bōc, and Russian buzina. However, names of plants and animals often acquire new meanings by shifting their referents from one species to another. In fact, the Greek reflex of the PIE root *bʰāgo-, phāgós, also means ‘oak’ (Beekes 1995: 48), while the Russian cognate buzina means ‘elder tree, Sambucus’ rather than ‘beech’.

Several recent cases illustrate how common such name transfers can be. Before the discovery of the New World, Europeans were familiar with spice-producing plants from the Piper genus of the Piperaceae family, especially Piper nigrum, which gives us black pepper (cooked and dried unripe fruit), green pepper (dried unripe fruit), and white pepper (dried ripe seeds). The word pepper comes from pippali, found in several Dravidian languages of southern India, which is used for a closely related species, long pepper (Piper longum). Apparently, the Romans erroneously believed that black pepper and long pepper derived from the same plant, so they used the same word to refer to both. The Latin word is the source not only of the modern English pepper (from the Old English pipor) but also of German Pfeffer, Dutch peper, French poivre, Romanian piper, Italian pepe, and other cognate forms. When the Portuguese traveled to West Africa in the fifteenth century, they used the same term for the spicy seeds of Aframomum melegueta, which they dubbed “Guinea pepper” or “melegueta pepper”. In the sixteenth century, the English word pepper and its counterparts in other languages were applied as well to the unrelated New World chili pepper, the fruit of plants from the genus Capsicum, most of which also have a distinctively spicy taste.

Another example of species-name transfer involves Jerusalem artichoke, a New World plant that acquired the artichoke label because the taste of its edible tuber reminded Europeans of an artichoke. The Jerusalem tag is a folk-etymologized version of the Italian name of the plant, girasole, which literally means ‘turns with/to the sun’ and originally referred to ‘sunflower’, a related plant from the Helianthus genus.

An even more complicated story of terminological confusion and name transfer from one species to another involves the word turkey. Originally, the term referred to the domesticated guineafowl of West African origin that were introduced to Britain in the 1500s by “Turkey merchants” trading with the Ottoman Empire. Subsequently, the first English colonists in New England confused the large native fowl with the African bird, calling it “turkey” as well. Further name-shuffling is illustrated by the fact that the word meleagris (Greek for ‘guineafowl’) is used in the scientific names of the two species, at the species level for the West African guineafowl and at the genus level for the turkey. Note also that – in a grand example of geo-bio-linguistic confusion – many other languages call the North American fowl by a term deriving from the root for ‘India’, after the West Indies (itself a geographical misnomer!): compare the French dinde (originally, d'Inde ‘from India’), Russian indejka, and Hebrew hodu (the same as the name of the country, India).

Returning to tree species, another reconstructed PIE root that must have changed its meaning is *bʰerHĝo-, reconstructed on the basis of such cognates as Sanskrit bhūrjá, Ossetian bärz, English birch (Old English berc), Latvian bērzs, Lithuanian béržas, and Russian berëza, all referring to trees of the genus Betula (cf. Friedrich 1970: 24). However, the Latin cognate, fraxinus, refers to the ash tree, which belongs to a completely different genus, known scientifically as Fraxinus; the two genera belong to different taxonomic orders. Romance languages use the inherited Latin root to designate ‘ash tree’: compare French frêne, Italian frassino, Spanish fresno. Baltic and Slavic languages have another set of cognates for ‘ash tree’: Russian jasen’, Polish jasień, Slovenian jásen, Latvian osis, Lithuanian úosis, Old Prussian woasis. Cognates of the Balto-Slavic words, likewise denoting ‘ash tree’, are found in Old Icelandic askr and Latin ornus ‘type of ash tree’. This PIE root is reconstructed in Friedrich (1970: 25) as *os-, meaning ‘ash tree’. Curiously, a reflex of that PIE root in Albanian, ah, means ‘beech’. As the discussion so far makes amply clear, most of the roots reconstructed to mean a certain type of tree in PIE could easily have referred originally to a different type of tree (or perhaps to some non-species-related properties of trees: e.g. ‘tall tree’, ‘tree with a straight trunk’, or the like). Friedrich's list of reconstructed PIE tree names includes the root *dóru-, whose reflexes in Balto-Slavic languages (e.g. derevo in Russian, dervà in Lithuanian) mean ‘tree’ in general. We thus agree with Diebold (1985: 11), who notes that “where linguistic paleontology frequently errs is in using the reconstructed sense and reference of such signifiers as a source for making macro- or extralinguistic inference without equally careful reconstruction of their meanings”.

Besides the uncertainty about what each reconstructed PIE root referred to, there is also a problem concerning the depth of the reconstruction. Thus far, following Friedrich (1970), we assumed that each root is indeed reconstructible to PIE. However, as can be seen from Table 9.1 above, only five roots have reflexes in Anatolian languages and can be thus reconstructed all the way to PIE. Only one of the roots, *dóru-, has a reflex in Tocharian, but it also has a reflex in Anatolian. Therefore, all but five roots can be reconstructed technically only as far as Proto-Surviving-Indo-European, or PSIE (i.e. non-Anatolian/non-Tocharian branches of Indo-European) (see Figure 4 in the Appendix). Consequently, the majority of the reconstructed tree names can only shed light on the homeland of one of the main branches descending from PIE, not of PIE itself.
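The depth argument can be illustrated with a short sketch that reads the branch lists of Table 9.1 and assigns each root the deepest node it can be projected to. The rule used here (an Anatolian reflex licenses PIE, a Tocharian reflex without Anatolian licenses PNIE, anything else stops at PSIE) is a simplification assumed for illustration: it judges by attestation alone and ignores borrowing and the semantic shifts discussed above.

```python
from collections import Counter

# Branch lists transcribed from Table 9.1 (root, tree category).
TABLE_9_1 = {
    "*bʰerH-g̑-o- (birch)": "Slavic, Baltic, Italic, Germanic, Indic, Iranian",
    "*pitu- (conifers)": "Greek, Italic, Indic, Albanian",
    "*pi/uK- (conifers)": "Slavic, Baltic, Greek, Italic, Celtic, Germanic",
    "*el-u̯-n- (junipers and cedars)": "Slavic, Baltic, Greek, Armenian",
    "*osp- (Populus)": "Slavic, Baltic, Greek, Germanic, Indic, Iranian",
    "*s/wVlyk (willows)": "Greek, Celtic, Germanic, Anatolian",
    "*wyt- (willows)": "Slavic, Baltic, Greek, Celtic, Germanic, Indic, Armenian",
    "*abVl- (apples)": "Slavic, Baltic, Italic, Celtic, Germanic",
    "*maHlo- (apples)": "Greek, Albanian, Anatolian, Tocharian",
    "*klen- (maples)": "Slavic, Baltic, Greek, Celtic, Germanic",
    "*akVrno- (maples)": "Greek, Italic, Germanic, Indic",
    "*aliso- (alder)": "Slavic, Baltic, Greek, Italic, Celtic, Germanic",
    "*kos(V)lo- (hazel)": "Slavic, Baltic, Italic, Celtic, Germanic",
    "*knu- (nut trees)": "Italic, Celtic, Germanic",
    "*ar- (nut trees)": "Slavic, Baltic, Greek, Albanian",
    "*Vlmo- (elms)": "Slavic, Italic, Celtic, Germanic",
    "*u̯-ig̑- (elms)": "Slavic, Baltic, Germanic, Iranian",
    "*lenTā- (linden)": "Slavic, Baltic, Greek, Italic, Germanic, Albanian",
    "*lēipā- (linden)": "Slavic, Baltic, Greek, Celtic",
    "*os- (ash)": "Slavic, Baltic, Greek, Italic, Celtic, Germanic, Albanian, Armenian",
    "*grōbʰ- (hornbeam)": "Slavic, Baltic, Greek, Italic, Albanian",
    "*bʰāg̑o- (beech)": "Slavic, Greek, Italic, Celtic, Germanic, Albanian",
    "*K(e)r-n- (cherry)": "Slavic, Baltic, Greek, Italic, Albanian",
    "*eywo- (yews)": "Slavic, Baltic, Greek, Italic, Celtic, Germanic, Armenian, Anatolian",
    "*tVk̑so- (yews)": "Slavic, Greek, Italic, Iranian",
    "*gʷelH- (oaks)": "Slavic, Baltic, Greek, Italic, Indic, Albanian, Armenian",
    "*ai̯g- (oaks)": "Greek, Italic, Germanic",
    "*perkʷ- (oaks)": "Slavic, Baltic, Italic, Celtic, Germanic, Indic, Albanian, Armenian, Anatolian",
    "*dóru- (oaks)": "Slavic, Baltic, Greek, Italic, Celtic, Germanic, Indic, Iranian, Albanian, Armenian, Anatolian, Tocharian",
}


def branches(cell):
    """Split a comma-separated table cell into a set of branch names."""
    return {name.strip() for name in cell.split(",")}


def depth(cell):
    """Deepest proto-language a root reaches, judged by attestation alone."""
    attested = branches(cell)
    if "Anatolian" in attested:
        return "PIE"    # attested on both sides of the highest-order split
    if "Tocharian" in attested:
        return "PNIE"   # Proto-Nuclear-Indo-European
    return "PSIE"       # Proto-Surviving-Indo-European


if __name__ == "__main__":
    counts = Counter(depth(cell) for cell in TABLE_9_1.values())
    for level in ("PIE", "PNIE", "PSIE"):
        print(level, counts[level])
```

Applied to the twenty-nine rows of the table, this rule sends five roots back to PIE and leaves the remaining twenty-four at PSIE, which is the imbalance the paragraph above describes.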

To recap, the different “facts” deduced from clues provided by linguistic paleontology have resulted in different putative homelands for the Indo-European language family, ranging from Northern Europe and the Baltic area to Hungary and southern Russia. But because names for species are often reassigned, especially by peoples who move into new areas and encounter new forms of life, it is difficult, if not impossible, to determine where PIE must have been spoken on the basis of such inconsistent evidence. In the final analysis, however, reconstructed names for trees and other features of the natural world point away from Anatolia as the original PIE homeland, but not in a conclusive manner.

Clues from linguistic contact

Evidence for the location of a given proto-language can also be derived from signs of contact between languages in different families. If we observe significant borrowing from language A into language B, or vice versa, the speakers of the two tongues must have lived in close proximity, perhaps along established trade routes or in some other geographical configuration that promoted information exchange. In the case of PIE, contacts with languages in three other groupings have been documented: Uralic, Semitic, and the languages of the Caucasus. Although the latter category is geographical rather than phylogenetic in nature, grouping languages belonging to at least three distinct families (Northwest Caucasian, Northeast Caucasian, and Kartvelian), most Caucasian languages share some common features, such as ejective sounds (discussed in detail below).

Evidence of linguistic contacts between PIE and Proto-Semitic and Proto-Kartvelian, and perhaps also Hurrian (a member of the extinct Hurro-Urartian family) and Sumerian, would seemingly place PIE south of the Caucasus region, possibly in Anatolia (see Gamkrelidze 2000: 458; Illič-Svityč 1964; Dolgopolsky 1987a; Dolgopolsky 1987b: 4–12; Dolgopolsky 1988: 12–17, 24–26). Proposed examples of Semitic loanwords in PIE include PIE *septḿ̥ ‘seven’ from Proto-Semitic *šabʕ-at-u-m ‘seven’ and PIE *h₂ster- ‘star’ from Proto-Semitic *ʕaθtar- ‘Venus’. Note that reflexes of these roots are found in both Anatolian and non-Anatolian languages, and thus must be reconstructed all the way back to PIE. Additional examples of proposed Semitic loanwords in PIE are listed and discussed in Dolgopolsky (2000: 406). Contacts with Proto-Kartvelian are reflected in such alleged loanwords as Proto-Kartvelian *otχo- ‘four’ from PIE *Hok̑tō(u̯) ‘tetrad’ (lost in later Indo-European but traceable back by internal reconstruction); see Dolgopolsky (2000: 406).

Arguments from such evidence of linguistic contact have been used extensively to support the Armenian theory of Indo-European origins, an alternative to both the Steppe and Anatolian hypotheses, put forward by Soviet scholars Tamaz Gamkrelidze and Vjačeslav V. Ivanov. The Armenian theory places the PIE homeland just to the east of Anatolia, south of the Great Caucasus Range. Although the Armenian hypothesis has fairly close geographical affinities with the Anatolian school, its historical reconstructions are much closer to those of the Steppe school. According to the Armenian hypothesis, PIE was spoken in the fourth millennium BCE, whereas the Anatolian hypothesis places it in the seventh millennium BCE.

Gamkrelidze and Ivanov's Armenian hypothesis is based largely on presumed contacts with languages spoken to the south of the formidable Great Caucasus Mountain Range. It also relies on proposed PIE reconstructions for the names of plants and animals that suggest a southern latitude, such as ‘panther’, ‘lion’, and ‘elephant’ (see Gamkrelidze and Ivanov 1995: 420–431, 443–444). Most of these reconstructions, however, have been discredited since Gamkrelidze and Ivanov first put forward their theory (see Beekes 1995 for details).

Some of Gamkrelidze and Ivanov's alleged exchanges of loanwords between PIE and languages in other families are problematic as well. The key difficulty is ascertaining the direction of borrowing: which languages borrowed a given loanword from which other tongue. Consider the controversy surrounding the word for ‘wine’. Since similar forms are found in Anatolian languages (Hittite wiyana-, Luwian winiyant-, Hieroglyphic Luwian wiana-), as well as in non-Anatolian languages (e.g. Homeric Greek (w)oinos, Armenian gini, Albanian vēnë, Latin uīnum, Gothic wein, Old Church Slavonic vino, etc.), Gamkrelidze and Ivanov (Reference Gamkrelidze and Ivanov1995: 557–564) reconstruct it back to PIE as *w(e/o)ino-.Footnote ⁵ Similar terms are found in a number of ancient Near Eastern languages. The Proto-Semitic form, for example, is reconstructed as *wayn-‘wine’ (cf. Akkadian īnu-, Arabic wayn-, Ugaritic yn, Hebrew yayin). Similarly, the South Caucasian (Kartvelian) form *γwino- ‘wine’ can be reconstructed on the basis of Georgian γwino, Mingrelian γwin-, Laz γ(w)in-, and Svan γwinel. Gamkrelidze and Ivanov regard this term as a Wanderwort, a word that spreads among numerous languages and cultures in connection with trade or cultural diffusion. This particular word for ‘wine’, they argue, “must have passed from one language to another at a protolanguage level, i.e. prior to the breakup of each protolanguage into separate dialects” (Gamkrelidze and Ivanov Reference Gamkrelidze and Ivanov1995: 559). But which proto-language did this word originate in? 
Based both on the formal phonological/morphological characteristics of their PIE reconstruction and on the importance of grapes and wine in early Indo-European traditions, Gamkrelidze and Ivanov classify the word as a PIE native that spread into Semitic and Kartvelian.⁶ The idea that the word for ‘wine’ is native to PIE and borrowed by Proto-Kartvelian is likewise adopted by Dolgopolsky (2000: 406). Gamkrelidze and Ivanov further derive the word for ‘wine’ from the verb root *u̯ei̯H- ~ *u̯iH- ‘weave, plait, twist’, the connection being the root for ‘grapevine’, *u̯iH-ti-. Based on the work of other Soviet scholars (Vavilov 1959–1965; Kušnareva and Čubinišvili 1970), Gamkrelidze and Ivanov place the center of grape (Vitis vinifera) domestication in southwestern Asia, south of the Caucasus. Consequently, they locate the Indo-European homeland in that area as well.

However, this analysis is highly contentious, and more recent research supports a different one. According to McGovern (2007), the Eurasian wine grape was probably domesticated in the south Caucasus (in modern-day Georgia and Armenia) some 8,000 years ago. It is therefore possible, and perhaps even likely, that Kartvelian was the ultimate source of the ‘wine’ root, and that PIE (as well as the Semitic languages) borrowed it, possibly at a significantly later date. Another possibility is that the forms found in various Indo-European languages are not reflexes of a shared ancestral PIE root, but are rather loanwords borrowed separately into the daughter languages (James Clackson, personal communication). This view is supported by the fact that the Latin word for ‘wine’ is neuter, whereas its Greek counterpart is masculine, as well as by other morphological discrepancies.

The strongest argument put forward by Gamkrelidze and Ivanov in favor of the Armenian hypothesis was based on Gamkrelidze's Glottalic theory, stemming from the work of Pedersen (1951: 10–16) and Martinet (1953: 70).⁷ According to this theory, PIE had glottalized (i.e. ejective) consonants p^ʔ, t^ʔ, and k^ʔ instead of voiced stops b, d, and g.⁸ To pronounce ejective stops, in addition to creating a closure in the mouth, the space between the vocal cords, the “glottis”, is closed and then sharply opened as well.⁹ English speakers are familiar with this glottal closure, as it occurs in the middle of uh-oh (speakers of certain English dialects, particularly Cockney and Estuary English, use the “glottal stop” in place of t in words such as bottle, better, and the like). The additional closure of the glottis during the articulation of ejectives creates the dramatic burst of air that distinguishes an ejective sound from a plain one and gives it a certain “spat out” quality. Cross-linguistically, ejective sounds are fairly common, found in 92 out of 567 languages in the World Atlas of Language Structures Online (WALS) sample (see Maddieson 2013).¹⁰ But languages of the Caucasus are particularly well known for their ejective sounds. For example, Georgian (a member of the Kartvelian family) has four ejective stops and two ejective affricates: p^ʔ, t^ʔ, k^ʔ, q^ʔ, ts^ʔ, and tʃ^ʔ.

According to Gamkrelidze and Ivanov, the hypothesized presence of glottalic consonants in PIE provides a unified explanation for both Lachmann's Law for Latin (proposed by the German Latinist Karl Lachmann in the middle of the nineteenth century) and Winter's Law for Balto-Slavic (advanced by Werner Winter in 1978). Both of these laws describe a similar phenomenon in the languages under consideration: a reflex of a PIE unaspirated voiced stop (b, d, g, or g^w) before a consonant lengthens the preceding vowel. For example, a reflex of the PIE g in PIE *ph₂g-to- ‘fortified’ is responsible for lengthening the preceding vowel in the Latin pāctus (the original short vowel is observable in Sanskrit pajrás). A similar process of vowel lengthening is also observed before PIE laryngeals, which are assumed to have included a glottal stop. Assuming that PIE had glottalic sounds thus allows for a unified analysis of various instances of vowel lengthening.

The Glottalic theory has also been used to rationalize the analysis under which the PIE phoneme inventory was reconstructed to have three series of stops (consonants produced by a complete closure of articulators, such as p, t, and k, and their voiced counterparts b, d, and g). Consider the labial series: prior to the Glottalic theory, it was assumed that PIE had a three-member labial series including p, b, and b^h. Such a system, however, presents a mysterious anomaly in light of the fact that languages typically have either p^h instead of b^h (i.e. they have p, b, and p^h), or both p^h and b^h together. Under the Glottalic theory, the typologically uncommon system p, b, b^h was reformulated as a more expected one: p, p^ʔ, p^h. In other words, the essence of the Glottalic theory is that the sound originally reconstructed as a voiced b was reinterpreted as a voiceless glottalized p^ʔ. While glottalic/ejective articulation is phonetically different from voicing, both processes involve the vocal cords and the space between them, the glottis. Moreover, it has been noted that voiced stops are equivalent to the glottalic series of other language families with respect to sound symbolism (Swadesh 1971: 219).
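
The reinterpretation described above amounts to relabeling one series of stops, which a short sketch can make concrete. The Python snippet below is only an illustration under stated assumptions, not Gamkrelidze and Ivanov's own formalism: the function name `to_glottalic` is hypothetical, and the plain-text notation (`^h` for aspiration, `^ʔ` for glottalization) is chosen to match this chapter. It rewrites only the plain voiced stops b, d, and g as ejectives and leaves the aspirated series untouched.

```python
import re

# Plain voiced stops of the traditional reconstruction and their
# glottalic (ejective) reinterpretation, as described in the text.
VOICED_TO_EJECTIVE = {"b": "p^ʔ", "d": "t^ʔ", "g": "k^ʔ"}

# Match b, d, or g only when it is NOT followed by the aspiration
# mark "^h", so that the aspirated series (b^h, d^h, g^h) is left alone.
PLAIN_VOICED_STOP = re.compile(r"[bdg](?!\^h)")


def to_glottalic(form: str) -> str:
    """Rewrite plain voiced stops in a traditional PIE form as ejectives."""
    return PLAIN_VOICED_STOP.sub(lambda m: VOICED_TO_EJECTIVE[m.group(0)], form)


print(to_glottalic("*ph₂g-to-"))  # the form cited above for Latin pāctus
print(to_glottalic("*b^her-"))    # aspirated stop: left unchanged
```

The sketch deliberately ignores labiovelars and palatovelars, which would need their own entries in the mapping; it is meant only to show that the Glottalic theory changes the phonetic interpretation of a series rather than the number of contrasts.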

The Glottalic theory, however, has its own problems. Its most serious challenge concerns the typological commonality of the PIE consonant system: if the system were typologically common, as proposed by the Glottalic theory, then it would be expected to be stable and, therefore, to have been preserved in at least some Indo-European daughter languages. Such preservation, however, did not occur: no Indo-European language has retained ejective sounds where the Glottalic theory postulates them. Both Ossetian (a member of the Iranian branch) and some dialects of Armenian do have glottalic sounds, but they reflect relatively recent borrowing from neighboring languages in the Caucasus. Significantly, Ossetian is the only Iranian language with such phonetic characteristics. More important is the fact that the distribution of ejectives in modern Armenian and Ossetian does not fit the Glottalic theory. If, in contrast, we assume that PIE had a typologically unusual system, as postulated by the traditional reconstruction, then it might be expected to have been replaced by more typical sound inventories, possibly with different solutions achieved in its various daughter languages, which is exactly what one does find. Because of these and other objections, the Glottalic theory has been rejected by most Indo-Europeanists, though it still has some adherents, such as Robert S. P. Beekes, Frederik Kortlandt, and A. M. Lubotsky (cf. Beekes 1995; Kortlandt 1995, 2010; Lubotsky 2007). Alan Bomhard (2008, 2011) supports the Glottalic theory in connection with the controversial Nostratic hypothesis, which posits a “mega” language family that would include the Indo-European, Uralic, Altaic, Afro-Asiatic, Kartvelian, and Dravidian families.

Gamkrelidze and Ivanov's argument for the Armenian hypothesis rests heavily on the Glottalic theory. These authors further maintain that PIE originally borrowed its ejective consonants from Proto-Kartvelian, a language that was presumably spoken south of the Caucasus Mountains. They are certainly correct in contending that sounds can be “borrowed” from one language into another.¹¹ Examples from the more recent history of Indo-European languages include the “borrowing” of retroflex consonants by many Indo-Aryan languages from their Dravidian neighbors and the “borrowing” of pharyngealized consonants by Domari, another Indo-Aryan language (see Chapter 3 above), from Arabic (Matras 2012: 42–43). However, even if we assume, following Gamkrelidze and Ivanov, that the Glottalic theory is correct and that PIE acquired ejective sounds from some neighboring language, Gamkrelidze and Ivanov's argument rests on two crucial yet implicit and not necessarily valid assumptions: that Proto-Kartvelian had ejective sounds as its descendants do, and that it was spoken in the same area where its descendants have been spoken in historical times. In effect, Gamkrelidze and Ivanov project properties of modern Kartvelian languages (both phonological and geographical) onto Proto-Kartvelian, a step that needs to be at least acknowledged explicitly. But even if we make all these assumptions, the conclusion – that PIE “borrowed” glottalic/ejective sounds from Proto-Kartvelian – is not the only possible one. In addition to the Kartvelian languages spoken to the south of the Great Caucasus crest, languages found on the northern side of the mountains, and historically into the adjacent lowlands as well, also have ejective sounds, including languages in both the Northwest Caucasian family (e.g. Abkhaz) and the Northeast Caucasian family (e.g. Tsez, Avar).
One must therefore consider a possible connection between PIE and the ancestral forms of languages indigenous to the North Caucasus.

The existence of common typological features in PIE and Proto-Northwest Caucasian (Proto-NWC) has long been noted.¹² For instance, Matasović (2012: 283) notes that although no “certain proofs of lexical borrowing between PIE and North Caucasian” were ever found, “there are a few undeniable areal-typological parallels in phonology and grammar”. Those features include “the high consonant-to-vowel ratio, tonal accent, number suppletion in personal pronouns, the presence of gender and the morphological optative and, possibly, the presence of glottalized consonants and ergativity” (Matasović 2012: 283). Those features are generally attributed to PIE but are not found in the majority of languages of North and Northeastern Eurasia, yet they are common, or universally present, in the languages of the Caucasus (especially the North Caucasus). Therefore, linguists have generally analyzed such commonalities as evidence of linguistic contact between the two proto-languages (Kortlandt 1990, 1995). In addition to the above-mentioned glottalic sounds, PIE and Proto-NWC are said to have developed labiovelars (e.g. k^w) in a similar fashion: by reassigning a vowel feature to adjacent consonants. In other words, an ancestral ku may have become k^wə; Northwest Caucasian languages may have done the same thing with respect to palatalization, turning an ancestral ki into k^jə. Taken to an extreme, this sort of historical change reduces vowel inventories and generates large numbers of consonants. The ultimate example of such a process is Ubykh, a now-extinct Northwest Caucasian language, which had 81 consonant phonemes (Colarusso 1992) and a mere two vowels. Yet it is not entirely clear whether PIE and Proto-NWC acquired labiovelars via horizontal transmission (presumably, from Proto-NWC into PIE) or from parallel developments.
Kortlandt (1995: 93–94) takes the former position, supporting it with the observation that “the area around Majkop […] was a cultural center in the formative years of the Indo-European proto-language. It is therefore easily conceivable that the Indo-European sound system originated as a result of strong Caucasian influence.”
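
The vowel-feature reassignment described above (ku becoming k^wə, ki becoming k^jə) can be sketched in a few lines of Python to show why it shrinks the vowel inventory while multiplying consonants. This is a toy model under stated assumptions: the function names and the five-syllable lexicon are hypothetical, the change is applied to t as well as k purely for illustration, and nothing here is a claim about the actual PIE or Proto-NWC sound systems.

```python
# Secondary articulation that each high vowel hands over to the
# preceding consonant: u -> labialization, i -> palatalization.
VOWEL_TO_FEATURE = {"u": "^w", "i": "^j"}


def reassign_vowel_feature(syllable):
    """Shift the rounding/fronting of a high vowel onto the consonant.

    `syllable` is a (consonant, vowel) pair; ("k", "u") becomes ("k^w", "ə").
    """
    consonant, vowel = syllable
    if vowel in VOWEL_TO_FEATURE:
        return (consonant + VOWEL_TO_FEATURE[vowel], "ə")
    return syllable


def inventories(syllables):
    """Return the sets of distinct consonants and vowels in a syllable list."""
    consonants = {c for c, _ in syllables}
    vowels = {v for _, v in syllables}
    return consonants, vowels


# Hypothetical toy lexicon of consonant-vowel syllables (not real data).
before = [("k", "u"), ("k", "i"), ("k", "a"), ("t", "u"), ("t", "a")]
after = [reassign_vowel_feature(s) for s in before]

cons_before, vowels_before = inventories(before)
cons_after, vowels_after = inventories(after)
print(len(cons_before), len(vowels_before))  # 2 consonants, 3 vowels
print(len(cons_after), len(vowels_after))    # 5 consonants, 2 vowels
```

Even in this tiny example the contrast carried by three vowels is redistributed onto five consonants, which is the direction of change that, taken to an extreme, yields a system like that of Ubykh.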

Such strong evidence of linguistic contact between the early Indo-Europeans and the linguistic ancestors of the present-day Abkhaz, Adyghe, and Kabardian peoples makes the Pontic steppes to the north of the Caucasus Mountains a more likely candidate for the Indo-European homeland than either the Armenian Highlands or central Anatolia. To begin with, the Great Caucasus Range forms a formidable physical barrier, making it likely that speakers of Proto-NWC were in much closer contact with groups to the north than with those to the south. We must also consider Nichols's (1992) contention that both Proto-NWC and Proto-Northeast Caucasian (Proto-NEC) must have been spoken in more northerly areas of subdued topography and low altitude than their modern descendants. Later arrivals of other peoples, chiefly those speaking Turkic languages, pushed the speakers of Northwest Caucasian and Northeast Caucasian languages into their highland refuges. As a consequence, the homelands of the Northwest and Northeast Caucasian languages were probably adjacent to the hypothesized Indo-European homeland in the Pontic steppes, which would promote considerable linguistic borrowing between the two families. All such theorizing, however, is muddled by the complex linguistic history and geography of the region. For example, it has been suggested that in remote antiquity Northwest Caucasian languages were spoken in present-day Adjara along the Black Sea Coast, where they may have left substrate traces on Kartvelian toponyms. Moreover, the position of the extinct Hurro-Urartian languages, which may have been related to the Northeast Caucasian family, is uncertain. We should therefore regard the linguistic connections between PIE and the languages found to the north of the Caucasus with some skepticism, although the existing evidence is suggestive.

Much better documented are the linkages between PIE and Uralic, which also support a more northerly PIE homeland (Ringe 1998; Dolgopolsky 2000: 407; Janhunen 2000, 2001; Kallio 2001; Koivulehto 2001; Salminen 2001; Katz 2003). Morphological and lexical resemblances between the two language families are so numerous and striking that some scholars have proposed an Indo-Uralic macro-family, which would encompass all Indo-European and Uralic languages (e.g. Kortlandt 1995). Most linguists, however, believe that such similarities resulted from extensive contact rather than common descent, attributing many resemblances, especially the lexical ones, to borrowing – usually from Indo-European into Uralic. Still, the issue remains disputed, as the border between “good evidence” of contact and evidence “too good” to substantiate mere contact, instead implying common descent, can be a fine line indeed.

In many cases, borrowings into Uralic can be identified because either their phonology or their morphology is “out of place” for the family. An often-cited example is the word *pork̑o- ‘pig, piglet’ in Proto-Finno-Ugric. The palatalization suggests that this word was borrowed from Proto-Iranian rather than PIE itself. However, crucial to our discussion is the fact that this word bears traces of Indo-European (i.e. non-Finno-Ugric) morphology. Specifically, *-os (which became *-as in Finno-Ugric due to an independent sound change) is an Indo-European masculine nominative singular ending, but it has no meaning in Uralic languages. We can therefore conclude that the whole word was borrowed as a unit and is not part of the original Uralic vocabulary, which is unsurprising, as speakers of Proto-Uralic had no domesticated animals other than dogs. In other cases, the evidence is even less clear. For example, Kortlandt (1989) argues that verbs commonly taken to be Indo-European loanwords in Uralic (e.g. Rédei 1986), including ‘to give’, ‘to wash’, ‘to bring’, ‘to drive’, ‘to do’, ‘to lead’, and ‘to take’, were actually inherited from an ancestral language common to both families, Indo-Uralic.

Several significant conclusions can be drawn from borrowings between Indo-European and Uralic languages. First, the contact between the two language groups must have taken place over a long period of time. In a study of the earliest contacts between the two families, Rédei (1986) divides his list of sixty-four words supposedly borrowed from Indo-European into Uralic into three groups based on their presence or absence in major subfamilies of the two groups. He finds that seven Indo-European words are attested in both Finno-Ugric and Samoyedic, eighteen Indo-European or Indo-Iranian words are attested in Finno-Ugric but not in Samoyedic, and thirty-nine Indo-Iranian words are found only in the Finnic branch. Given the history of Uralic, the first set of words must have been borrowed in the earliest period, the second set in a more recent period, and the third set in a more recent period still. Similarly, Häkkinen (2012) differentiates four layers of borrowings from Indo-European into Uralic, listed below from the oldest to the newest. Note that all these borrowings originated from the Indo-Iranian branch of Indo-European, not from PIE. However, such layered borrowings indicate that Uralic languages must have been in contact with Indo-European languages, particularly the Indo-Iranian ones, for millennia.

1. Early Proto-Indo-Iranian *g̑ug̑heu- ‘to pour, libate’ → Early Proto-Uralic *juxi-/jôwxi- ‘to drink’
- < IE *g̑heu-
2. Middle Proto-Indo-Iranian (Pre-Iranian dialect) *dzen- → Middle Proto-Uralic *sen-ti- ‘to be born’
- < IE *g̑enh-
3. Late Proto-Indo-Iranian *ćatam → Late Proto-Uralic *śeta ‘100’
- < IE *k̑m̥tóm
4. Early Iranian zaranya → Late Proto-Uralic *serńa ‘gold’
- < Late Proto-Aryan *źhar- < IE *g̑h(o)l(H)-
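
Rédei's three-way grouping, described above, rests on a simple inference: the wider the distribution of a loanword across the Uralic subfamilies, the earlier it was probably borrowed. The Python sketch below is a minimal illustration of that inference, not Rédei's own procedure. The function name `redei_layer`, the layer labels, and the decision to model Finno-Ugric as the two labels "Finnic" and "Ugric" are simplifying assumptions made here; only the three counts come from the text.

```python
# Counts reported in the text for Rédei's three groups, oldest layer first.
REDEI_COUNTS = {"oldest": 7, "intermediate": 18, "youngest": 39}

# Simplifying assumption: Finno-Ugric is modeled as just these two labels.
FINNO_UGRIC = {"Finnic", "Ugric"}


def redei_layer(branches):
    """Assign a loanword to a relative layer from where it is attested.

    `branches` is a set drawn from {"Samoyedic", "Ugric", "Finnic"}.
    """
    branches = set(branches)
    in_finno_ugric = bool(branches & FINNO_UGRIC)
    if "Samoyedic" in branches and in_finno_ugric:
        return "oldest"        # attested on both sides of the family
    if branches == {"Finnic"}:
        return "youngest"      # confined to the Finnic branch
    if in_finno_ugric:
        return "intermediate"  # Finno-Ugric but not Samoyedic
    raise ValueError("distribution not covered by the three groups")


print(sum(REDEI_COUNTS.values()))            # 64, the size of Rédei's list
print(redei_layer({"Samoyedic", "Finnic"}))  # oldest
print(redei_layer({"Finnic", "Ugric"}))      # intermediate
print(redei_layer({"Finnic"}))               # youngest
```

The check that the three counts add up to sixty-four confirms that the groups partition the list; the classifier itself yields only a relative chronology and says nothing about absolute dates.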

Wiik (2000: 469) distinguishes ten layers of Indo-European loanwords in Finnish, starting from PIE loanwords (circa 4000 BCE) and continuing to the most recent borrowings from English (circa 1960). At least four of the layers predate the Common Era: borrowings from PIE (e.g. jyviä ‘grains’), Proto-Indo-Iranian (e.g. varsalle ‘for the foal’), Pre-Baltic (e.g. puuro- ‘porridge’), and Proto-Germanic (e.g. ruokaa ‘food’).

Importantly, similarities between Indo-European and Uralic are not limited to lexical items; elements of morphology are shared as well. Examples of shared morphemes include the pronominal roots (*m- for first person; *t- for second person; *i- for third person), case markings (accusative *-m, ablative/partitive *-ta), interrogative pronouns (*k^w- ‘who?, which?’), and the negative particle ne. Other, less obvious correspondences have been suggested, such as the Indo-European plural marker *-es and its Uralic counterpart *-t. The same word-final assibilation of *-t to *-s that would link these two markers may also be present in Indo-European second-person singular *-s in comparison with Uralic second-person singular *-t. Some similarities have also been noted between the verb conjugation systems of Uralic languages (e.g. that of Finnish) and of several Indo-European languages (e.g. those of Latin, Russian, and Lithuanian). As mentioned in Chapter 3, although it is common for a language to borrow heavily from the vocabulary of another language, it is extremely unusual for a language to borrow its basic system of verb conjugation from another tongue. In fact, such deep grammatical borrowings are so rare that they are generally interpreted as evidence either for extremely intense and prolonged contact or for common descent.

All told, such linguistic evidence suggests deep and extended contact between the Uralic and Indo-European language families, likely with a high incidence of intermarriage, which would imply close proximity in the period when the borrowings occurred. Proto-Uralic is generally accepted as having been a language of foragers living in the forests to the north of the Pontic Steppes, who, as mentioned above, had no domesticated animals other than dogs. Assuming that the early Indo-Europeans maintained contact with speakers of Proto-Uralic, they must have lived in an area bordering the forest zone. The large number of reconstructed PIE roots for various tree species, discussed in the previous section, points in the same direction.

Although Uralic undoubtedly borrowed heavily from Indo-European, it does not follow that such borrowings came directly from the ancestral proto-language itself rather than from a descendant language (or languages) of PIE. In fact, most linguists reject the idea that PIE served as the source of these transmissions, favoring instead Proto-Indo-Iranian, one of the main Indo-European branches. Here, however, the evidence is solid. We can therefore deduce that speakers of Proto-Indo-Iranian, as well as peoples speaking later forms of the languages in this subfamily, lived in close proximity to the Uralic peoples inhabiting the forest zone just to the north of the Pontic Steppes.

But even if PIE itself had no direct contact with Uralic, the demonstrated relationship between Uralic and Proto-Indo-Iranian still runs counter to the Anatolian hypothesis of Gray and Atkinson. Recall that their model posits Proto-Indo-Iranian genesis on the Iranian plateau, many hundreds of miles to the south of the likely Uralic homeland, with a subsequent eastward migration. Such a scenario maintains a significant distance between the two language groups during the crucial period of linguistic exchange, and is therefore highly unlikely if not outright impossible. It must be admitted, however, that Colin Renfrew's modified Anatolian hypothesis is not contradicted by the evidence of close contacts between the Uralic and Indo-Iranian language groups. Although Renfrew regards PIE as having been limited to the Anatolian Plateau, he speculates that the early Indo-Iranians moved north into the steppe zone, where they could have had close contact with Uralic speakers.

The evidence of contact between these two language families also sheds new light on the problem of the Uralic homeland, which is by no means completely resolved either. Some scholars place the Uralic homeland to the east of the Ural Mountains in western Siberia, others to the west, in European Russia. The Siberian hypothesis was based on two main arguments. The first one concerned the family's highest-order split, which was thought to have separated Samoyedic and Finno-Ugric; more recent analyses, however, take the highest-order split of Uralic to be between the Finno-Permic and the Ugro-Samoyedic branches (Häkkinen 2007 and his later work). The second argument was based on paleolinguistic evidence pertaining to two coniferous tree names in Proto-Uralic (Abies sibirica and Pinus cembra), but these trees have also long been present in easternmost Europe. Because of these problems, most scholars now reject the Siberian theory of Uralic origins. For example, Carpelan and Parpola (2001: 79) associate Proto-Uralic with the archeologically attested Pit-Comb Ware culture found to the west of the Urals. Thus, both the Indo-Iranian and Uralic homelands were probably located in what is now European Russia and environs. Neither the Gray–Atkinson thesis of Indo-Iranian development on the Iranian plateau nor the postulated Siberian homeland of early Uralic makes sense in light of the evidence of linguistic contact between the two language families. It remains possible, however, that PIE originated to the south of the Black Sea and that the initial homeland of Proto-Uralic was in western Siberia, with subsequent migrations generating the proximity necessary for intensive language exchange.
According to this scenario, the Indo-Iranian speakers would have moved north into the southern Russian steppes while the speakers of the Finno-Ugric branch of Uralic simultaneously spread from the east into the forest belt located to the north of the steppes. Note, however, that such large-scale migrations are precluded by the Gray–Atkinson model, as discussed in Chapter 7.

To recap, evidence of linguistic contact with other language families, though copious, does not yield firm conclusions. The etymologies of some putative borrowings remain unclear. In many cases, we cannot be sure whether the word in question is shared via borrowing or because of inheritance from a common ancestral language; in other cases, we are not certain which language family generated a shared word, or how it spread to the other languages. Conclusions about language contact often depend on the precise reconstructions of proto-languages, which have not yet been definitively established. Drawing undisputable conclusions about PIE based on evidence of contact requires certainty about the development not only of Indo-European, but also of the other language families with which it interacted. However, we are no more certain about the deep past of the Uralic, Semitic, and Caucasian languages than we are about the history of the Indo-European tongues. Thus, conclusive proof of the area of language-family origination cannot be drawn from information about the interaction of language families. But even if firm conclusions cannot be reached, the existing evidence is still strongly suggestive, pointing in the direction of the Steppe hypothesis and away from the Anatolian model of Gray and Atkinson.

Clues from the phylogenetic tree and migration history

A third line of evidence concerns the Indo-European phylogenetic tree in light of the reconstructed migration histories of the early speakers of Indo-European languages. As we shall see in the following pages, this type of evidence also leans towards the Steppe hypothesis, but again fails to offer conclusive proof.

As was shown in our discussion in Chapter 4, the exact shape of the Indo-European family tree is far from resolved. Ten major subgroupings, known as “benchmark groupings” (cf. Nichols and Warnow 2008: 777), are well established: Anatolian, Tocharian, Celtic, Italic, Greek, Armenian, Albanian, Germanic, Indo-Iranian, and Balto-Slavic. Consensus is still lacking, however, in regard to the higher levels of the taxonomic structure, as can be seen from the various trees in Figures 1–6, although virtually all scholars agree about the first two divisions. The highest-order split separated the Anatolian languages (Hittite, Luwian, and Lycian) from the rest of the family, and the next split took the ancestor of the Tocharian languages (Tocharian A and Tocharian B) on its own trajectory. Evidence from the phonological and morphological development of these languages, as well as that pertaining to the development of the wheeled-vehicle vocabulary (see Chapter 8), substantiates the consensus view that the Anatolian and Tocharian groups were the first two to separate from the main Indo-European stem.

As Don Ringe (2006 and elsewhere) conclusively demonstrates, the Anatolian and Tocharian branches split off “cleanly” from the main Indo-European stem, meaning that once the division occurred, the speakers of Proto-Anatolian and Proto-Tocharian lost contact with the other Indo-Europeans. As a result, they shared no common innovations with other Indo-European languages, nor did they borrow from or provide loanwords to them.¹³ In other words, these diversification events are nicely depicted by the tree model of linguistic evolution. This kind of complete separation can be most easily effected by the migration of one subgroup, removing it from its original homeland by distance and often by geographical obstacles as well.

The prime candidates for the Indo-European homeland – the southern Russian steppes and Anatolia – are separated from each other by two profound geographical barriers: the Caucasus Mountains and the Black Sea. The east–west-oriented Great Caucasus Range presents a formidable impediment to traffic between the southern Russian steppes and the Transcaucasian region, the Middle East, and Anatolia. It is pierced by few negotiable passes, the most important of which extends along the Darial Gorge through the so-called Caucasian Gates. (It is by no means accidental that the territory of the only ethno-linguistic group historically found on both sides of the Great Caucasus Mountain Range, the Ossetians, spans the Darial Gorge.) The Black Sea was also a major barrier before the development of Bronze-Age shipping, as shown by Davison et al. (2006). Ancient mariners could sail only in the narrow coastal zone around the Black Sea. Similarly, easy overland movement was limited to a narrow strip along the coast, owing to the rugged mountains running parallel to it. It is therefore likely that whatever population movements occurred between the Pontic steppes and Anatolia, in either direction, did not proceed across the Black Sea or the Caucasus Mountains.

Before we can determine which of the two main competing theories of the Indo-European homeland offers a more cohesive account of the early movements that led to the clean Anatolian and Tocharian splits, another warning is in order. Any given division of a linguistic tree can result from migration or demic diffusion of either the branch that splits off from the rest of the tree, or of the branch that constitutes the main stem. (For the sake of convenience, the following discussion will use the term “migration” to refer to any process of substantial movement, whether by way of slow demic diffusion or more rapid “migration”, as the term is conventionally conceptualized.) For example, the Anatolian split could have been a result of the migration of the speakers of Proto-Anatolian or of the speakers of PNIE (the “rest of the tree” at this point). Similarly, the Tocharian split could have been the result of the migration of the Proto-Tocharian speakers themselves or of the remainder of the Indo-European group. Given a binary split on the tree, we cannot decide on the basis of phylogenetic evidence which of the branches relocated. Nor is the relative linguistic conservativeness of any branch helpful in this regard. While it is often assumed, implicitly or explicitly, that the group that remains in situ is more linguistically conservative, this assumption does not necessarily hold, as is illustrated by Icelandic, the most conservative Germanic language, the speakers of which clearly migrated over a particularly large distance in a relatively short period of time.

With such considerations in mind, let us now consider the earliest split from the Indo-European trunk, that of the Anatolian branch. The main argument for Anatolian being the initial branch derives from the fact that it lacks a significant number of grammatical features that characterize PNIE, including such verbal forms as the aorist, perfect, subjunctive, optative, and so on (Fortson 2010: 173). The links between Anatolian and PNIE were therefore probably severed before PNIE underwent these significant innovations. An alternative analysis attributes the lack of these various morphosyntactic categories in Anatolian to losses resulting from the influence of some non-Indo-European language or languages within Anatolia itself. In this scenario, Anatolian is taken to be the “innovator” instead of the more conservative branch. Regardless of whether Anatolian or PNIE is viewed as the innovative branch, speakers of Proto-Anatolian and those of PNIE must have fully parted ways, linguistically and geographically. How might this have happened under either the Steppe or the Anatolian hypothesis?¹⁴

If one assumes a Pontic-Caspian PIE homeland, the highest-order split can be envisaged as resulting from the migration of the linguistic ancestors of the Anatolian branch. Such a mass movement would most likely have passed through the Balkans (Mallory 1989: 241; Anthony 2007: 259; Anthony 2013: 8–10), entering Anatolia by the beginning of the Bronze Age, where the newcomers settled and eventually dominated indigenous non-Indo-European peoples such as the Hattians and the Hurrians (see Chapter 6). Anthony (2013) identifies the Anatolian split with the movement of steppe people into the lower Danube Valley in the wake of the collapse of agricultural tell settlements in that area. Based on archeological evidence, the breakdown of the tell culture is dated circa 4400–4200 BCE; “the archeologically undocumented shepherds who grazed their sheep on the abandoned tells in the Balkan uplands between 4200–3500 BCE could have been the distant antecedents of the Anatolian branch”, Anthony writes (2013: 9). Under this scenario, PNIE would have remained in the Pontic-Caspian steppes, where its characteristic shared innovations developed. Whatever its archeological merit, this hypothesis nicely accounts for the split between Anatolian and PNIE.

The Anatolian hypothesis, however, encounters massive challenges in accounting for the Anatolian split, as it would require chains of highly unlikely events. One possible solution, proposed by Grigoriev (2002: 354–357, 412–415), is to assume that the ancestors of the Anatolian languages migrated away from the Anatolian homeland, leaving speakers of what was to become “the rest of the Indo-European family” in (eastern) Anatolia, and then subsequently returned to the region. In Grigoriev's model, the group ancestral to the Anatolians moved from Asia Minor into the Balkans, and then moved back into Anatolia during the Bronze Age under pressure from the Greeks, who had themselves made their way through the Caucasus, around the Black Sea, and then into the Balkan Peninsula. As discussed above, such a migration would involve crossing formidable geographical obstacles, in the form of high, rugged mountains and deep seas, making it highly unlikely. From the linguistic perspective, while this imaginative thesis meets the minimal requirement of accounting for the separation of the Anatolian branch from the rest of the Indo-European tree, it is unduly complicated and it fits poorly with the archeological record.

The second scenario possible under the Anatolian hypothesis involves the Anatolian group remaining within the homeland while the speakers of PNIE relocated elsewhere. If they had ended up in southern European Russia, further developments would essentially be the same as those postulated under the Steppe hypothesis. This scenario is explored in Colin Renfrew's later work (Renfrew 1999), which Mallory (2013a) refers to as “Anatolian Neolithic Plan B”. While accounting for the geography of the Anatolian split, this model runs into serious problems with respect to chronology. Just as the earlier model developed by Renfrew (1987) requires one to assume that PIE was spoken in an unchanged form for 3,000 years, the revised model requires a similarly untenable assumption with respect to PNIE (see Chapter 8).

But the problems confronted by the Gray–Atkinson diffusion-based model are deeper still. Recall that their model does not allow for large-scale, rapid migration, effectively ruling out the movement of PNIE speakers to the steppes under Renfrew's Neolithic Plan B, as well as the relocations of both the Anatolian speakers and the predecessors of the Greeks in the Grigoriev thesis. The Gray–Atkinson scheme is instead more akin to Renfrew's earlier proposals (i.e. what Mallory calls Anatolian Neolithic Plan A), which would have speakers of what were to become the European branches of the family slowly diffusing westward and speakers of eastern branches gradually moving eastward, while the future Anatolians (in the linguistic sense) stayed put. Under this scenario, the ancestors of Greek would be situated in Greece, to the west of Proto-Anatolian, and the ancestors of Indo-Iranian would be located to the east of Anatolian. But if that were the case, how could the various non-Anatolian branches maintain the unity that characterizes PNIE? Or in the words of James P. Mallory (2013a: 148),

How is one to explain parallel linguistic innovations both to the east and west of the region assigned to proto-Anatolian? The statisticians who devised this model seem to require some form of mutual contact at a distance, one of the stranger aspects of quantum theory that Einstein once dismissed as “Spukhaftige Fernwirkung” [or “spooky action at a distance”].

Barring any spooky, telepathic action, the model presented in “Mapping the Origins and Expansion of the Indo-European Language Family” (Bouckaert et al. 2012) and illustrated by their animated map “movie” fails to provide anything approaching a reasonable explanation for the shared development of all non-Anatolian languages, “the only feature of Indo-European phylogeny that has near universal support” (Mallory 2013a: 148).

Now let us consider the second split, that of the Tocharian languages. Under the Steppe hypothesis, the Tocharian division can be identified with the second archeologically documented movement of people from the Pontic-Caspian steppes, the one that gave rise to the intrusive Afanasievo culture in the western Altai Mountains (Anthony 2011). This migration, supported by evidence from grave inventories and genetic data, proceeded eastward across an astonishingly long distance of approximately 1,200 miles circa 3300–3000 BCE. According to Anthony (2013: 10),

the Afanasievo migrants seem to have introduced a pastoral economy, wheeled vehicles, horses, and an accompanying new social order into mountain meadows formerly occupied by ceramic-making mountain foragers, some (many?) of whom probably were absorbed into the Afanasievo culture.

The Afanasievo material culture, moreover, exhibits traits characteristic of the Yamnaya culture (also known as “Pit Grave Culture” or “Ochre Grave Culture”), which appears to coincide in time and space with PNIE. Yamnaya kurgan grave types, a typical Yamnaya burial pose, Yamnaya-Repin ceramic types and decoration, and sleeved axes and daggers of specific Yamnaya types are all found in Afanasievo sites in the western Altai (Kubarev 1988; Chernykh et al. 2004). The link between the linguistic ancestors of the Tocharians and the Afanasievo culture is further explored in Mallory and Mair (2000).

Although plausible at first glance, the linking of the Tocharians to eastward steppe expansions associated with the Afanasievo or similar cultures of the Altai region and southern Siberia faces one daunting problem: the Tocharian languages retained inherited Indo-European agricultural vocabulary, but there is no evidence of arable farming east of the Urals before 2000 BCE. This is not unlike the “wheel problem”, discussed in Chapter 8: if Proto-Tocharians did not practice agriculture, why did they retain vocabulary pertaining to it? It is well known that the steppe populations ate the seeds of wild plants such as Chenopodium and Amaranthus, but the semantic variance among cognates pertaining to cereal types indicates that the reconstructed roots referred to ‘wheat’, ‘barley’, or ‘millet’, not some type of wild grain or pseudograin (Mallory 2013a: 151). One possibility here is that they retained some forms of cropping but practiced them so sparingly as to leave no traces that have yet been discovered, an archeologically unsatisfying scenario.

While the fact that the Tocharian languages retained agricultural vocabulary despite the absence of evidence for agriculture in presumed Proto-Tocharian-speaking sites poses a problem for the Steppe hypothesis, the Anatolian hypothesis as advocated by the Gray–Atkinson approach faces its own obstacles in this regard. According to the Gray–Atkinson model, the Tocharian split is dated at either circa 5900 BCE (Gray and Atkinson 2003) or 4900 BCE (Bouckaert et al. 2012), long before the arrival of agriculture in the more easterly lands that the ancestors of the Tocharians would have had to pass through. In other words, if the Tocharians are presumed to have engaged in some farming based on their retention of the agricultural vocabulary of their linguistic ancestors, there is no more archeological warrant for postulating movement across the Iranian plateau than there is for migration via the western Altai.

Moreover, the diffusion-based model illustrated by the animated map showcased by the Gray–Atkinson approach runs into a further problem, as it postulates that the ancestors of the Tocharians (and later the ancestors of the Indo-Iranians) moved through the northern Fertile Crescent and thence across the Iranian plateau, areas relatively densely populated by agriculturalist groups speaking non-Indo-European languages – Hurrian, Semitic, Sumerian, and Elamite – and yet remained “unscathed” by any of these languages. It is certainly more plausible that the Tocharian migration occurred through the steppe zone, sparsely populated by nomadic or semi-nomadic groups. To make matters worse, after they have survived diffusion through non-Indo-European-speaking groups, the Tocharians are depicted in the animated map as moving through impassably high altitudes, passing along the crest of the Tian Shan Mountains at elevations in excess of 20,000 feet (6,000 meters) (see also Chapter 7).

To summarize, evidence from migration history, like clues from linguistic paleontology and language contact, generates “facts” that are open to interpretation. As a result, the Indo-European homeland controversy can only be addressed by weighing different pieces of evidence against each other, and even then the conclusions remain tentative. Yet all in all, such data show that although the Steppe hypothesis is not without flaws, much more evidence weighs on its side, as compared to the Anatolian hypothesis, especially if the latter is coupled with the diffusion-based model of the Gray–Atkinson approach.

