Part II Case studies in contact
4 Structural features and language contact in the Isthmo-Colombian area
This chapter examines the role of structural linguistic features as indicators of nested levels of social history in a specific geographic region. The Isthmo-Colombian area, dominated by speakers of Chibchan languages for millennia, is a region of rich resources, long-term settlement, and relative social stability where goods and technologies were exchanged within the region and with neighbors north, south, and east of the Chibcha sphere. For this study, structural features from fourteen languages of the region were coded as stable or unstable, using a composite ranking of relative stability, and as template or contents, using a functional metric. Patterns of similarity indicate that the set of features defined as contents, which involve choosing what to encode in a given structural feature, is more successful than any other set at replicating areal patterns. The analysis suggests that structural features, like lexical items, can be divided into types which are more and less susceptible to conscious manipulation by speakers, and that their role must be interpreted within a specific sociohistorical context.
1 Introduction
By the time of European contact in the early sixteenth century, Chibchan languages were distributed across four non-contiguous regions in Central and South America (Constenla 2012: 419):
from southern Nicaragua to western Panama (the Votic branch and most of the Isthmic branch)
eastern Panama and northwest Colombia (Kuna (Isthmic), and the probably Chibchan extinct languages Catío and Nutabe)
along the Magdalena River from Cundinamarca north to the sea (the Magdalenic branch)
This distribution presents a type of natural laboratory for looking at the effects of contact between particular Chibchan languages and genealogically diverse neighbors on the various borders, including Jicaquean and Misumalpan languages in the north, Chocoan, Barbacoan, and Paezan languages in the south, and Arawakan and Cariban languages in the east. This chapter examines patterns of structural similarity in seven Chibchan languages and seven neighbors to investigate multiple roles of structural data in tracing the history of language contact in the Isthmo-Colombian area.
Quantitative investigations to date of the role of structural features in historical linguistics have focused a great deal on notions of dependency or predictability among features (Dunn et al. 2011, Hammarström and O’Connor 2013) and especially on assessment of the relative stability of individual features or meaningful subsets of features. Analysis by Dediu and Levinson (2012) of the World Atlas of Language Structures (WALS) database suggests that as few as ten to eighteen features may form the operative basis of abstract stability profiles in the language families of the world, yet the small set of crucial features varies across families and, importantly, includes both stable and unstable features. The study in this chapter makes use of a synthesis of stability rankings of individual WALS features, compiled in Dediu and Cysouw (2013), described further in Section 3.1. The limitation of models based solely on frequency of values in a database like WALS is partly inherent, as conclusions can only be based on the incomplete inventory of languages and features that could be included. The limitation is also partly a question of the narrow focus on linguistic factors, without incorporating quantifiable assessments of other fundamental properties of language, especially as a communicative system shaped by human interaction. Patterns of structural stability and change seem to emerge from a cluster of factors that includes the structural resources in particular languages and language families, the physical characteristics of the geographic area of contact among speakers, and the size and tenor of overlapping social networks. We need categories that allow us to consider individual characteristics of specific sociohistorical contexts and that account for the psycholinguistic behavior of speakers in those contexts.
This chapter contributes a multi-faceted approach to assessing the role of structural features in the investigation of language prehistory by expanding the categories for evaluating structural features and profiting from the position of a single language family in a particular set of natural and sociohistorical circumstances. Chibchan languages predominate in a relatively small and cohesive geographic region extending from Central America through the northwest corner of Colombia, and they are surrounded by languages from a variety of unrelated families. Section 2 introduces the region, the languages, and a characterization of the social scenario of contact, in which Isthmo-Colombian societies apparently incorporated non-linguistic objects and practices in a specific type of cultural change and transmission. In Section 3, the linguistic data for analysis are categorized in two ways, with features classified as stable or unstable, based on the Dediu and Cysouw (2013) proposed ranking, and classified independently as template or contents, based on functional characteristics. Details of data categorization and data collection are presented in Section 3, and the analysis and results are discussed in Section 4. The final section offers some concluding remarks.
2 The Isthmo-Colombian area: region and languages
Languages of the Isthmo-Colombian area are spoken at the gateway to South America, across the land bridge that connects the American continents and along the northwest coast of the southern landmass. The territory in question stretches from northern Honduras through the Isthmus of Panama and into the northwestern areas of Colombia, Venezuela, and Ecuador (Map 4.1). The topography varies enormously, from the mountainous Isthmus, across the thickly jungled Darien, to the wide river plains and estuaries along the Caribbean coast. Within Colombia, the sphere of Isthmo-Colombian influence encompasses the Pacific coast region as well as the extensive riverine networks cut by the Atrato, Cauca, and Magdalena rivers and tributaries through the northern reaches of the cordilleras of the Andean mountain chain. The eastern border is traced by the Magdalena River valley north to the Sierra Nevada de Santa Marta and east to the Guajira Peninsula.

Map 4.1 The Isthmo-Colombian area, noting the position of languages in this study
Throughout history the speaker communities have shared borders with powerful Mesoamerican civilizations to the north and with dynamic Caribbean, Amazonian, and Andean groups to the south, and for some time the region was called the “Intermediate Area” to reflect its position between better-documented societies north and south (see Hoopes and Fonseca 2003: 51–54 for a discussion of the motivations and relative appropriateness of various terms used in the literature; the label “Isthmo-Colombian Area” was adopted from this paper). Scholarship in recent decades from across the human sciences has provided a more nuanced view of the region, as a place where technologies did indeed sweep through from all directions, but where the human populations remained relatively stable and intact. Objects, products, and practices probably transformed the material culture periodically, but there was little permanent immigration.
2.1 The historical and cultural context of the region
A language contact scenario can be thought of as “the organized fashion in which multilingual speakers, in certain social settings, deal with the various languages in their repertoire” (Muysken 2008d). This chapter is based on the premise that language is in large part like any other cultural trait, whose practice can be inherited, acquired, modified, or lost by any generation of speakers (Mace et al. 2005; Gray et al. 2007, Gray et al. 2010). Therefore, an assessment of the non-linguistic record that encompasses details of the archaeology, ethnohistory, and ecological history of the region is relevant to our understanding of the language contact scenario: the hypothesis is that patterns of speaker behavior in dealing with non-linguistic cultural practices will shed light on how speakers may have dealt with the various languages that entered their environments, as well. This section provides a brief sketch of the historical and cultural context of the Isthmo-Colombian area, looking at evidence for practices related to subsistence, social organization, and trade, starting from the earliest known populations through to the character and consequences of an apparent watershed moment, roughly 1,500 years ago.
People have been living in the Isthmus of Panama since the late Pleistocene, some 12,000–10,000 years ago (see O’Connor and Kolipakam, this volume). Multidisciplinary research suggests that the Chibchan communities in the region today are in fact the genetic (Barrantes et al. 1990, Melton et al. 2007) and linguistic (Constenla 1991, 2012) descendants of the earliest inhabitants, having spread north into Nicaragua and southeast into Colombia, eventually occupying scattered territories of the Caribbean littoral and the drainage areas of the Cauca and Magdalena rivers. Although these groups apparently engaged in frequent conflict with each other and with non-Chibchan neighbors, they are also described as connected by a “diffuse unity” that encompassed belief systems and associated material practices (Hoopes and Fonseca 2003). Other studies discuss an “Isthmian Interaction Sphere” that extended from central Colombia to the Mexican Yucatan, in overlapping circles or down-the-line chains, within which commerce and the practice of other cultural activities flowed back and forth across stable local boundaries (Myers 1978, Bray 1984, Cooke 2005).
As will be presented in more detail below, Proto-Chibchan probably split from an older Central American stock nearly 10,000 years ago, while all but one contemporary Chibchan language developed from a core branch that emerged 3,000 years later, its speakers gradually filling the narrow isthmian region of Costa Rica and Panama. The diverse ecology of the region, with deep mountain valleys and rich coastal resources on both shores, encouraged long-term settlement of small groups that practiced a wide variety of subsistence strategies. People ate fish, shellfish, and birds, and evidence of early, local agriculture includes traces of bottle gourd, arrowroot, leren, and squash in Panama 9000–7000 BP (Cooke 2005). Raymond (2008) notes maize, manioc, arrowroot, and yams on the Pacific coast, where skeletons show that maize was a primary ingredient of the diet by 6000 BP. The presence of both maize, from Mesoamerica, and manioc, from South America, demonstrates the impact of long-distance exchange in this cultural and commercial nexus.
By the time Chibchan speakers started to radiate away from the core Chibchan region, beginning some 5,000 years ago, a clear record of sustained human settlement had already emerged in multiple places. This development is seen in three archaeological sites mentioned repeatedly in the archaeological literature that nicely delimit our region of focus: Cerro Mangote, from pre-6000 BP, on the Pacific coast of Panama; Las Vegas, from 8500–4600 BP, on the Santa Elena peninsula in western Ecuador; and Puerto Hormiga, from 5100 BP, near the Caribbean coast in the estuaries of the Magdalena River in northern Colombia. Evidence at all three sites points to the existence of foragers and collectors who practiced a type of residential mobility (Raymond 2008: 80–86). These mobile communities trekked in circuits, probably to achieve their attested varied diet of fish, birds, game, and other forest products, returning to the central bases to bury their dead in what Raymond notes as symbolic if not physically permanent homes (p. 81). As noted above for Panamanian societies, in Ecuador, too, we find evidence of very early plant domestication, with traces of squash and leren from nearly 10,000 BP, and maize and bottle gourds by about 8000 BP.
All three regions described above had ceramics fairly early. Pots were found at Valdivia (after 4500 BP) near the older Las Vegas site, and sand-tempered pots of lower quality were found at Monagrillo, around 4400 BP, near the older Cerro Mangote site. Distinctive large and globular ceramic bowls called tecomate, made of clay tempered with fibers, were found throughout Colombia at sites such as Puerto Hormiga, Turbana, Monsu, and San Jacinto, with the earliest tecomate dated pre-5000 BP (Allaire 1999: 679). Similar tecomate ware has been documented throughout the greater Central American region: from 4000 BP, on the Guajira Peninsula and the Pacific coast of Guatemala; from 3000–3500 BP, at Tronadora and Chaparron in Costa Rica; and from 3000 BP, at Momil in the Sinu lagoons of northern Colombia (Allaire 1999). On the evidence of these ceramics, Myers (1978) made the case that overland trade routes could indeed have linked the coasts of Guatemala and Ecuador, and he also suggested that Puerto Hormiga ceramics looked more like those of the Orinoco and Amazon than like those of Ecuador. It is often quite difficult to prove an ultimate origin of particular ceramics; to complicate the question, the word tecomate appears to come from the Nahuatl word tecomatl, which describes this very type of pot. And yet, perhaps the Isthmo-Colombians traded tecomate pots for Saladoid and Barrancoid ceramics coming from the Orinoco region, as well as for jade from Guatemala, both of which entered the area by about 3500 BP.
Despite our increasing knowledge of the region through archaeological and ecological data, we cannot reconstruct with confidence any major social trends or patterns of dominance for much of prehistory. Scholars of the Isthmo-Colombian area do however note a major moment of change sometime around 500 CE (Bray 1984: 331, Allaire 1999: 707, Hoopes 2005). The transition may have been motivated in part by increased consumption of maize, a much better crop for floodplain cultivation (Bray 1984), or by climatic events, such as environmental catastrophe (Hoopes 2005), and many think the sociocultural change was linked to a transition from jade to gold as the precious material of greatest cultural importance. Quilter (2003) describes the paradigm shift. Mesoamerican jade was difficult to obtain, and its hardness meant that working with it was a slow if relatively easy process. Once shaped, jade artifacts were quite durable. Gold was found locally, and the rendering process was complex and somewhat mysterious yet relatively fast. Furthermore, gold artifacts could be melted down and transformed into an entirely different object. The power of gold, as a commodity to be mined, owned, worked, traded, stockpiled, refashioned, and passed on to descendants, played a key role in the emergence between 300 and 600 CE of the ranking, inequality, and complexity that still characterized the societies encountered 1,000 years later by Europeans (e.g. Quilter and Hoopes 2003, Hoopes 2005). Many details of social development remain to be deciphered.
As observed by Bray (1984: 307), early Spanish chronicles from the Gulf of Uraba, between Panama and Colombia, mention witnessing “a thriving business in slaves, fish, salt, cotton cloth, and live peccaries, as well as gold.” Bray continues, “It is worth noting that most of the products on this list will leave no archaeological trace and that pottery does not figure at all.”
Some of the societies that were powerful during the last millennium before the Conquest would be particularly relevant to the linguistic analysis in this chapter. Among these are two whose languages remain uncertain: the Zenu (or Senu), who erected astonishing raised fields over vast expanses of the San Jorge River basin in northwest Colombia, and the Quimbaya, renowned and prolific gold workers from farther south on the upper Cauca, whose gold work has been found throughout the Isthmo-Colombian area. There were also two important centers of Chibchan-speaking groups near the Magdalena River along the eastern edge of the area. The Muisca realm occupied the upper Magdalena, near present-day Bogota, at a site known to early Spaniards as El Dorado for its power and wealth, most notably in gold and emeralds. In the Sierra Nevada of Santa Marta near the mouth of the Magdalena, we find the Tairona civilization. The Tairona melded complex architecture with diverse agricultural practices in the design of terraced villages and fields at various altitudes, and they left a copious iconographic record of their rich religious life. Archaeological evidence suggests all these societies were chiefdoms, entities that demonstrate organization and hierarchy in political, religious, and economic activities, seen in artifacts such as large-scale public works, settlement layout, burial practices, and iconography (Bray 1984: 331).
There is a growing body of literature from recent decades on what constitutes a chiefdom and on the many ways sociopolitical complexity can be manifest, and these definitions shape what we might expect from a given social scenario in terms of language contact. For example, Hoopes (2005: 6–9) discusses the literature on two modes of social power, known as network and corporate. A network type of chiefdom emphasizes such factors as individual power passed through hereditary lines, centralized chiefs, and commercial power achieved and maintained by military means. This type of scenario seems more likely to lead to language shift, as speaker communities are conquered and subjugated to enrich a central power. In contrast, the outcome of a corporate mode of social power may be more consonant with diglossia and language maintenance, as a corporate mode emphasizes the power of the office, non-linear inheritance, and control achieved through ritual and ideological means. In this context, the important leaders could have been not chiefs but priests and shamans, who exercised locally the power of a broadly shared worldview, expressed in common iconography and “routinized ritual” rather than through control of key resources (Hoopes 2005: 31). Cooke (2005: 31) proposes a type of mixed model, in which “above the chiefdom, there were larger, equally important social units – to judge from the ethnographic record, some kind of descent group or groupings of ethnias with closely related languages and memories of common origins, shared songs and praises, and conflicts between real and mythical personalities and social groups.” Thinking again of the linguistic outcome, we might envision the role of Latin as a language of worship that left space for the maintenance of local languages.
Within the Isthmo-Colombian region, chiefdoms of Central America tended toward the network model, perhaps due to influence from Mesoamerica, and chiefdoms of Colombia tended toward the corporate model, a difference that would likely have consequences for patterns of language contact.
This section began with a mention of the “diffuse unity” said to characterize the Isthmo-Colombian area. Bray (1984: 336–337) argues the opposite side of the same coin: that despite constant contact and constant conflict, especially among close neighbors, individual cultures in the region remained distinct. He calls this phenomenon “conservatism in the face of opportunity for change” and proposes that while population stability may play a role, the true barriers to convergence were ideological:
When borrowing does occur, what is usually taken over is the technology (metalworking, pottery painting, crop complexes), but this technology is used for purely local ends. There is surprisingly little direct copying. The more neutral the trait, the wider its distribution and the greater its chances of acceptance. As our comparisons have shown, geometrical designs travel faster and farther than figurative or symbolic themes, which are often strongly regional.
The key notion here is that societies seem to have accepted the basic frameworks of new technologies, practices, or artifacts, and to have adapted and reproduced the new structures with locally relevant contents. With this notion in mind, I will summarize what this brief review of the non-linguistic literature on the Isthmo-Colombian area can bring to the question of the linguistic prehistory of the region, and especially to the investigation of the effects of language contact. Speakers of Chibchan languages have been in situ for millennia, with mostly Chibchan neighbors throughout Costa Rica and Panama and among mostly non-Chibchan speakers in scattered pockets from Nicaragua to central Colombia. There was ongoing contact and conflict, especially among nearest neighbors, which suggests a degree of bilingualism (or multilingualism) and intermarriage (which may have been forced, as an outcome of conflicts). Something happened around 500 CE that led to greater sociopolitical complexity that affected the entire region, shaped societies for the next 1,000 years, and may have involved the imposition of dominant languages, in the form of language shift or of diglossia. Throughout, individual cultures remained relatively distinct, but at the same time, societies did take advantage of new technologies and practices, accepting the frameworks and adapting details of the content to fit local needs.
When looking at language systems, stability may take different forms. Basic vocabulary is expected to be stable and to indicate family relations, while cultural vocabulary is expected to show more effects of cross-family borrowings that reflect the specific cultural and ecological context. A resistance to lexical borrowing is often interpreted as the conscious maintenance of a distinct social identity, a phenomenon documented in contact situations from the Vaupés (Aikhenvald 2002; Epps 2007a, 2008a) to Vanuatu (François 2011), and proposed for languages of Colombia as well (O’Connor 2011). As was discussed in the introduction of this chapter, categories of structural features and their interpretation are less clear, but there are general expectations that stable features will reflect genealogy better than unstable features will. This chapter contributes a perspective from structural data that unpacks relative stability, or relative resistance to borrowing, using two types of metrics, one of which explicitly operationalizes the notions of abstract template and locally relevant content, as described by Bray (1984).
2.2 The Chibchan family
The Chibchan language family is by far the largest family in the Isthmo-Colombian Area, in number of languages and in geographic spread, and we know a great deal about the history of this family thanks especially to the work of Constenla Umaña (e.g. 1981, 1991, 2012) and Quesada (e.g. 1999, 2007). Citing evidence from phonological, lexical, and grammatical comparison and reconstruction, Constenla (2012: 418) suggests that the proto-language split around 9,700 years ago from a Lenmichí “micro-phylum” composed of the Lencan, Misumalpan, and Chibchan families. The Paya language, now spoken in northern Honduras, probably split from the proto-language some 6,700 years ago, leaving what is known as Core Chibchan (see Figure 4.1). Judging by the distribution and degree of present diversity, Constenla (2012: 419) presumes a Chibchan homeland in southern Central America and estimates that it was from this Isthmian homeland that other branches developed as speakers migrated north (Votic branch, ∼5325 BP) and east across northern Colombia (Magdalenic branch, ∼5225 BP), with another migration east by Kuna speakers some 4,800 years ago.

Figure 4.1 The Chibchan language family, after Constenla (2012: 417). Boxed languages appear in this study.
There were at least twenty-one languages, of which sixteen survive. Two of the extinct tongues would have been particularly useful for this study of language contact: the Antioquian languages Catío and Nutabe, thought to have been spoken between the Sinu and Cauca Rivers, near the Zenu and Quimbaya societies mentioned in Section 2. Sadly, they are virtually undocumented and therefore could not be included in Constenla's classification of Chibchan subgroups. Of particular interest in this chapter is the observation that “Tairona… seems not to be another language, but a variant of the still spoken Damana” (Constenla 2012: 391, citing older literature).
2.3 The languages in the study
The fourteen languages for this study, listed in Table 4.1, were chosen for their geographic location on the borders of the Chibcha sphere and because there were sufficient descriptive materials available for the data collection questionnaire, described in Section 3.
Table 4.1 Languages in this study

A fundamental goal of this paper is to determine if any particular subset of features can be identified as a good “trace of contact”: in other words, will any category of feature highlight areal relations among languages and speakers by occurring in patterns that correlate with geographic proximity of languages irrespective of genealogical relationship? Regional subareas are defined here (see Table 4.1) as a Northern group (languages 1–4), an Isthmian group (5–7), a Southern group (8–11), and an Eastern group (12–14).1 The subset of features which best replicates these geographic subareas will therefore be claimed to contain the features most susceptible to the effects of contact in the given social scenario.
It should also be noted that, even in this small inventory of languages, assigning languages to areal groups is itself a matter for investigation and experimentation (see Map 4.1). We might expect Paya to group with the Jicaquean and Misumalpan languages. Jicaquean is a small family unconnected to any other, and not much is known about the history of the people. It is spoken in northern Honduras, along the Caribbean coast, and while it is likely a long-term neighbor of Paya, there is no known interaction. Misumalpan is a small family of languages spoken primarily throughout central and eastern Nicaragua, extending across the border into southern Honduras, and with a small pocket of speakers on the border between Honduras and El Salvador (Constenla 1991). The only representative in our dataset is Misquito, especially interesting for its historical extension all along the Caribbean coast of Nicaragua, where speakers could have had contact with Paya to the north and Rama to the south.
Rama, from the Votic branch of Core Chibchan, could pattern with the Northern group or alternatively with the Isthmian languages Teribe and Guaymi, while the position of the third Isthmian language, Kuna, could be expected to vary between its Chibchan Isthmian cousins and the Chocoan languages with which it has surely been in contact. The Chocoan language family is composed of two living language varieties, Waunana and Embera. Embera is itself described as a set of closely related languages or as a dialect continuum, each variant named for the region where it is spoken, and it is divided into Northern and Southern branches. The sample here includes one Northern Embera variant (called Northern Embera) and one Southern Embera (Epena Pedee). Chocoan languages are spoken today all along the Pacific coast of Colombia and into eastern Panama, and the speakers call themselves by terms that translate as ‘mountain dwellers,’ ‘river dwellers,’ and ‘people of the wild cane’ (Mortensen 1999: 1). Historically, they are known as a “flexible and expanding population” who have settled in regions vacated by other groups during the process of colonization (Adelaar with Muysken 2004: 56–57). Several extinct languages prominent in the discussion of the Isthmo-Colombian prehistory have been associated with Chocoan, though none of these has sufficient documentation to confirm the relationship. These include Cueva, which was spoken on the Isthmus between Kuna to the east and the rest of Isthmian Chibchan to the west, and the extinct Colombian languages Quimbaya, of the Upper Cauca Valley in western Colombia, and Sinúfana, of the Sinú region between the Sinu and Lower Cauca rivers near the Caribbean coast.
The Southern Embera languages may show areal similarities with a Southern group that contains Paezan and Barbacoan languages. Paez is the only language (or only surviving language) in the Paezan family, spoken on the eastern and western slopes of the Andean cordillera central in southwestern Colombia. Paez has had known contact with the surrounding Barbacoan languages Guambiano and Totoro and likely contact with Southern Emberan (Chocoan) languages of the nearby Saija, San Juan, and Cauca River systems. The Barbacoan languages are spoken in separate pockets scattered from the mountainous regions of southwestern Colombia to the coastal lowlands of northwestern Ecuador. The Barbacoan language in this study, Awa Pit, is spoken in the western foothills of the Andes along the Colombia–Ecuador border. This language is perhaps an odd choice for a study of areal contact, as the community is (and may have long been) known for a culture of “secrecy” and inaccessibility (Curnow 1997; Curnow and Liddicoat 1998).
The final subregion in this study is the Eastern group, centered on Ika and Damana, Chibchan Magdalenic languages of the Sierra Nevada de Santa Marta, which may have had contact with the Northern Maipuran Caribbean branch of Arawakan languages. As mentioned previously, Damana may be the modern version of the language spoken in the powerful Tairona chiefdom, a factor which may impact its regional profile. Arawakan is a large family of around thirty languages spread geographically from Belize to Bolivia. The language of interest here is Guajiro, of the Guajira peninsula on the Caribbean coast at the border of Colombia and Venezuela.
Regrettably, no Western Cariban languages could be included in this dataset due to insufficient documentation. Cariban is a large family of forty to sixty languages, many extinct, and mostly spoken from the Orinoco basin of eastern Venezuela across the Guianas to the Amazon, and into central Brazil. There were Cariban speakers in the Magdalena River valley, and some scholars suggest that at least some of the unknown languages spoken throughout the Caribbean lowlands of northern Colombia were also Cariban. Constenla notes that between the Kuna and the Magdalenic group of Chibchan “there was a series of people of proven or supposed Cariban affinities, such as the Opon, the Muzo, the Panche, and the Pijao” (2012: 419). The surviving Cariban language closest to the Isthmo-Colombian region is Yukpa, a language cluster with scant documentation, spoken just west of Lake Maracaibo along the Colombia–Venezuela border. Relying on small descriptions in older sources, Constenla (1991: 60) described Yukpa as SV/SVO, with genitive and demonstrative before the noun and adjective and numeral after the noun. In more recent work, Flores (2002) finds that while constituency is varied, the basic word order of Japreira, one language of the Yukpa group, is SOV. Postpositions are illustrated as suffixes, and nominal constituent order includes Genitive-Noun, possessor-possessed, and both Adjective-Noun and Noun-Adjective.
A typological overview of the relevant language families, based on the Constenla (1991) binary-coded dataset, is presented in Table 4.2.
Table 4.2 Typological profiles, using features from Constenla (1991)

While some feature values seem rather widespread, such as SOV basic word order, postpositions, and case suffixes everywhere but in Arawakan, we can also see a certain amount of variation and indeed several features with “mixed” answers within the Chibchan family.
Interestingly, the history of the genealogical classification of some languages in the study has taken them from sisters to cousins to neighbors. The Chibchan family was identified as such by Uhle (1890). Genealogical classifications involving the Chibchan, Chocoan, Barbacoan, and Paezan families include efforts by Rivet (1924b) and Loukotka (1968), who grouped Barbacoan and Paezan inside Chibchan; by Greenberg (1987), who proposed a Chibchan-Paezan subgroup within the single family Amerind, placing Barbacoan and Chocoan inside the Paezan division; and by Campbell (2012a), who registers the groups as four distinct families without known interrelation.
3 Features and methods
The dataset for this study consists of binary answers (yes = 1, no = 0) to 90 questions about structural features in 14 languages of the Isthmo-Colombian region (see Table 4.5 in the appendix). Features were classified as stable vs. unstable and, independently, as template vs. contents. These categorizations are explained below.
3.1 Stable vs. unstable
The sets of stable and unstable features used in this study were selected from a proposed stability ranking of structural features (Dediu and Cysouw 2013), itself based on a comparative analysis of eight individual approaches to calculating the stability of features archived in the World Atlas of Language Structures (WALS). The eight studies made use of different statistical analyses and operated under different definitions of stability. Some were based on the persistence of features within families, others on estimates of the evolution of feature values through time, and others measured patterns of persistence within the dataset without initial consideration of known language families. Every study devised an estimate of relative stability for each individual WALS feature, up to a total of 132 features, depending on the study. To facilitate comparability, Dediu and Cysouw then converted the stability estimates into relative ranks from 0.00 (least stable) to 1.00 (most stable), reported in their Table 1. Their Table 7 presents a composite ranking, based on principal component analysis, for the 62 features represented in all eight analyses. For the present study, features 1–31 in the 62-item list were classified as stable, and features 32–62 as unstable. To supplement the dataset, 19 additional features were chosen from the 132-item list, and the seven relative rank scores reported for each were averaged. The resulting average score was then compared to the average score of the “cut-off” feature (that is, feature 31 in the 62-item list), calculated by averaging the same seven relative rank scores, in order to situate each supplementary feature in the appropriate stable or unstable category.2
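The assignment of supplementary features to the stable or unstable set can be illustrated with a short sketch. This is a minimal illustration, not the procedure as actually implemented for the chapter: the function name `classify_feature` and the two sets of rank scores are invented for the example, and only the cut-off value of 0.523 comes from note 2. The text does not say how a score exactly equal to the cut-off would be treated; the sketch assumes it would count as stable.

```python
from statistics import mean

# Average stability score of the cut-off feature (item 31 of the 62-item list),
# as reported in note 2 of this chapter.
CUTOFF_SCORE = 0.523


def classify_feature(rank_scores, cutoff=CUTOFF_SCORE):
    """Label a supplementary feature by its average relative rank.

    rank_scores: relative ranks in [0.00, 1.00], one per stability study.
    Treating a score equal to the cut-off as stable is an assumption of this sketch.
    """
    average = mean(rank_scores)
    label = "stable" if average >= cutoff else "unstable"
    return average, label


# Hypothetical rank scores for two invented features (not taken from the source).
example_scores = {
    "feature_A": [0.70, 0.65, 0.80, 0.55, 0.60, 0.75, 0.85],
    "feature_B": [0.20, 0.35, 0.40, 0.30, 0.25, 0.45, 0.15],
}

for name, scores in example_scores.items():
    average, label = classify_feature(scores)
    print(f"{name}: average rank {average:.3f} -> {label}")
```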
The features used in this chapter were chosen with four parameters in mind: (i) relative position in the proposed stability rankings, (ii) presence in the Constenla (1991) dataset of typological features, (iii) presence in WALS, and (iv) likelihood of appearance in existing descriptive materials for the languages in question. The last parameter is subjective yet realistic, given the scarcity and brevity of materials on under-described languages of South America.
The Constenla (1991) dataset consists of binary indications (yes = 1, no = 0) of the presence of 42 morphological features and 39 phonological features in 76 languages of Mesoamerica, Central America, and northwestern South America. The first step in data collection was to incorporate all relevant information from the Constenla dataset for the 35 languages of the region in question, yielding data for 35 features in all languages. Next, all possible information from WALS (accessed 13 August 2012) was added to the dataset, providing information on 59 more features for only some languages (reflecting the uneven coverage in WALS). If WALS contained more recent documentation that contradicted information from Constenla (1991), the WALS feature value was used. These collections were especially fruitful for the subset of stable features. Published grammatical materials were then consulted to code the remaining features, resulting in data for 49 stable features and 49 unstable features for 14 languages. Certain pairs of questions, such as “Is the order of the NP Adj-N?” and “Is the order of the NP N-Adj?” gave only duplicate information for this set of languages, so these questions were eliminated. Nevertheless, dependencies remain in the data, in part because WALS features with multiple possible values were expressed in the questionnaire as multiple questions with yes/no answers. This left a final dataset for the 14 languages of 90 features each, composed of 44 stable and 46 unstable features, including five cells of missing data.
Initial hypotheses are that stable features will more often replicate genealogical patterns than areal patterns in the data, and unstable features will more often replicate areal patterns in the data. This is a conservative view, equating geographic proximity with the probability of language convergence and shared structures.
3.2 Template vs. contents: patterns that connect
The next step was to categorize each feature as template or contents according to its function within a language. Template and contents classifications are comparable to familiar structuralist categories of syntagmatic and paradigmatic relations, respectively. The label “template” appeals to formal, constructional properties, at the level of the phrase or the word. These features describe the sequence of forms in a construction (constituent order, the location of affix or adposition) and the quality of the form as bound or free (affix, clitic, free form). Template features indicate where and in what form a feature is expressed. In contrast, “contents” features indicate which value from a set of values will fill a given formal position. These features appeal to choices and computations made by the speaker, often in response to factors of social cognition, group identity, and cultural rules and preferences. Contents features include those relating to the sound system; to the choice of pronoun, based on gender, politeness, or inclusivity; to choices of nominal marking, based on animacy, inalienability, or other classificatory quality; and to the choice of verbal marking, based on the grammatical role, number, or referent of the participant.
The categories of template and contents were operationalized in part from descriptions of structural change (Heath 1984; Zavala 2002; Winford 2003; Heine and Kuteva 2005; Matras and Sakel 2007), all of which draw on distinct feature types commonly known as “pattern” (the formal construction, corresponding to “template”) and “matter” (the morphophonological content). The key difference between “matter” and “contents” is that “contents” indexes a semantic component rather than a specific morphophonological form, identifying the meaningful distinctions encoded without examining the forms.
The decision-based nature of contents features suggests these are more likely to be shaped and enforced by interaction in specific sociocultural contexts, in effect, computed for each utterance; the initial hypothesis is that contents features will reflect areal relationships. The stencil-like nature of template features suggests two possible outcomes. On a psycholinguistic basis, “template” could describe a property that would be backgrounded and perhaps not easily accessible and manipulable by speakers. In this case, the feature might be relatively stable and reflect inheritance more than contact. On the other hand, in the description of cultural borrowing patterns in Section 2.1, Bray (1984: 337) was quoted as observing “The more neutral the trait, the wider its distribution and the greater its chances of acceptance.” If we connect the pattern of linguistic change to the pattern of cultural change, extending the dynamics of cross-linguistic priming to cross-modal priming of non-material culture, then template features may reflect a more general, regional level of contact, while contents features reflect the most local social groups. In this sense, the template and contents features could possibly both reflect areal patterns in a set of nested levels.
Table 4.3 describes the resulting database of 90 features and two types of categorization, with intersecting subsets (and see appendix).
Table 4.3 Numbers of features in major categories and their intersections

4 Assessment of feature role
The previous section presented hypotheses of what each category of structural features is likely to tell us about relationships among the various languages.

In this section, the hypotheses are explored and tested using quantitative tools. The first subsection uses a linguistic distance matrix calculated with the NeighborNet algorithm in SplitsTree4 (Huson and Bryant 2006). This tool produces a distance matrix based on the presence or absence of each feature for every pair of languages. Patterns of similarity in the data matrix were represented graphically as networks and as bifurcating trees as a first quick look at relationships (not reproduced in this chapter); the examination suggested that the best guide to identifying contact relationships among languages in the dataset is the set of contents features, and particularly the subset of stable contents features. The graphics also suggested that no set or subset of features will identify a family relationship, successfully grouping Chibchan languages apart from other languages.
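To make the notion of a presence/absence distance concrete, the following sketch computes a simple pairwise distance for binary feature profiles. It is an assumption-laden illustration: the measure shown is the proportion of differing values among features coded in both languages (a normalized Hamming distance), which is not necessarily the exact distance SplitsTree4 computed for this study, and the language names and profiles are invented placeholders.

```python
from itertools import combinations


def feature_distance(a, b):
    """Proportion of features on which two languages differ.

    a, b: equal-length lists of 1, 0, or None (None = missing data).
    Features missing in either language are skipped.
    """
    compared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not compared:
        raise ValueError("no features coded in both languages")
    return sum(x != y for x, y in compared) / len(compared)


def distance_matrix(profiles):
    """Return {(lang1, lang2): distance} for every unique pair of languages."""
    return {
        (l1, l2): feature_distance(profiles[l1], profiles[l2])
        for l1, l2 in combinations(sorted(profiles), 2)
    }


# Invented five-feature profiles for three placeholder languages.
toy_profiles = {
    "LangA": [1, 0, 1, 1, 0],
    "LangB": [1, 0, 0, 1, None],
    "LangC": [0, 1, 0, 1, 0],
}

for pair, dist in distance_matrix(toy_profiles).items():
    print(pair, round(dist, 2))
```

Skipping features that are missing in either language is one way to handle the few empty cells in a dataset of this kind; other treatments are possible.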
4.1 Linguistic distance by feature category and pre-defined group
In this subsection, linguistic distances from the matrix calculated by SplitsTree are compared by family grouping and by the regional groupings defined earlier. The graphics represent, for each feature type, the median distance (bar inside the box), the standard deviation spread around the average distance (top and bottom of box), and the maximum and minimum values in the data (whisker tips). The distribution in this dataset did not have extreme outliers, so the median is close to the average value for each type. The lower the linguistic distance, the more similar are the pairs of languages.
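The quantities shown in each box can be sketched as follows. This is a hedged reading of the description above, taking the box edges to be the average distance plus and minus one standard deviation; the function name `box_summary`, the use of the sample standard deviation, and the distances themselves are assumptions made for illustration.

```python
from statistics import mean, median, stdev


def box_summary(distances):
    """Summarize pairwise distances in the way the box graphics are described."""
    avg = mean(distances)
    sd = stdev(distances)  # sample standard deviation; an assumption of this sketch
    return {
        "median": median(distances),   # bar inside the box
        "box_bottom": avg - sd,        # average minus one standard deviation
        "box_top": avg + sd,           # average plus one standard deviation
        "whisker_low": min(distances),
        "whisker_high": max(distances),
    }


# Invented distances for the six unique pairs of a four-language group.
toy_distances = [0.20, 0.25, 0.30, 0.35, 0.40, 0.30]

for key, value in box_summary(toy_distances).items():
    print(f"{key}: {value:.3f}")
```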
Figure 4.2 addresses genealogical relations, illustrating linguistic distance by feature subset for all Chibchan languages (CH) vs. all non-Chibchan languages (NCH), with 21 unique pairs in each calculation, noted as (21).

Figure 4.2 Linguistic distances for Chibchan vs. non-Chibchan languages
The first observation is that average linguistic distance for “all features” is nearly identical between the two groups (although the spread is larger in non-Chibchan languages), perhaps reflecting the known similarity of structural features across the region.
The results also suggest the importance of separating the simple category of stable features into different types of stable features to identify Chibchan family relationship: the best predictors of genealogy in this dataset are stable template features, while the stable contents features show a higher average distance and much greater spread. Within Chibchan, contents features as a whole and in subcategories consistently show slightly higher linguistic distances than unstable features. This suggests that a classification scheme of contents vs. template has more consistently identified properties of relative mutability among features than has the stable vs. unstable metric.
No particular category of features signals any type of relationship among the diverse languages in the non-Chibchan group. Stable template features in this group display an extremely wide spread, suggesting these features are quite poor indicators of a general areal relationship for this set of languages.
The representation of linguistic distances in Figure 4.2 argues that the stable vs. unstable opposition is not useful in delineating the Chibchan languages in the dataset. Template features, particularly the stable template features, predict genealogical rather than areal relationships, and contents features seem most sensitive to change within Chibchan but make no predictions about contact relationships in the non-Chibchan group.

Figure 4.3 Linguistic distances for pre-defined areal groups
Figure 4.3 addresses areal relations, illustrating linguistic distance by feature subset among pairs of languages in northern, Isthmian, southern, and eastern groups. Data numbers are very small: groups with four members have six unique pairs for comparison, shown as (6), and groups with three members have three unique pairs for comparison, (3).
This discussion of Figure 4.3 examines the graphic from left to right within each regional group. The common sense hypothesis is that unstable features will be better than stable as predictors of areal relationship. This means that linguistic distances should be smaller for unstable features than for stable features – and this is only true in the Eastern group. In all other groups, the average distance of stable features is smaller than the average distance of unstable. This contrary outcome actually might make sense for Isthmian, where all three neighbor languages are also Chibchan, but in general we can say that unstable features do not consistently predict areal relations better than stable features in this dataset.
The prediction for the next two categories is that contents features will show local area while template features may reflect a larger area or may reflect genealogy. Results here are quite mixed. Contents distances are smaller than template distances in Northern and (only slightly) in Southern groups, while this is not true in Isthmian and Eastern groups. The outcome is again striking in Isthmian, where the spread among features is quite small and the template distance is much lower than the contents distance. The results in Eastern are harder to interpret, as the spreads are similar and the numbers are so small. The summary statement for these categories is that neither type consistently predicts clear areal relationships in the dataset.
The perspective from the four multi-class categories suggests no common predictor of areal relation for the four pre-defined groups. The results here simply subdivide the unexpected outcome of the stable vs. unstable features described above, with stable features in three of the four regions showing smaller linguistic distances, which could be interpreted as the feature sets most useful in delimiting each areal group. These are stable contents for Northern, stable template for Isthmian, and either of those for Southern. In the Eastern group, unstable contents and unstable template features show the smaller distances.
4.2 Geographic distance by feature category
This subsection departs from the linguistic distance matrix calculated by SplitsTree and from pre-defined regional groups of languages. In this analysis, pairs of languages are compared using geographic distance to calculate a normalized median distance between languages that share values for each feature. There are 90 binary features, which yields 180 possible sets of shared values (pairs that share 1 and pairs that share 0, for each feature). Of the 180 possible sets, 162 were shared by at least one pair of languages. Distances between pairs of languages were calculated using point coordinates for language locations in the WALS database and an online calculator that used the haversine formula to calculate the shortest distance between two points over the earth's surface (the great-circle distance).
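For readers unfamiliar with the haversine formula, a minimal sketch is given here. The study itself used an online calculator, so this code is only an equivalent illustration; the function name `haversine_km`, the mean Earth radius of 6371 km, and the example coordinates are assumptions and are not taken from WALS.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; assumed, not stated in the source


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlambda = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


# Placeholder coordinates for two hypothetical language locations.
print(round(haversine_km(10.0, -84.0, 11.0, -73.5), 1), "km")
```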
Normalized median distance was calculated as the median distance between all pairs that shared the value of a given feature divided by the median distance of all pairs with a value for that feature (see appendix). If all languages shared a value for a given feature, as was the case for example with feature S1 = 0, then the normalized median distance was 1. If no languages or only 1 language had a specific value for a feature, as with S1 = 1, then there was no pair and no median distance. Features S33, S34, U27, U34, and U35 contain missing data for one language each, and therefore the total number of languages with values is 13.
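The calculation just described can be sketched in a few lines. This is a minimal illustration under stated assumptions: the function name, the data layout (a dictionary of feature values and a dictionary of pairwise distances keyed by unordered language pairs), and the four-language toy data are all invented for the example.

```python
from itertools import combinations
from statistics import median


def normalized_median_distance(values, geo_dist, shared_value):
    """Normalized median distance for one value (1 or 0) of one feature.

    values: {language: 1, 0, or None} for a single feature (None = missing).
    geo_dist: {frozenset({lang1, lang2}): geographic distance in km}.
    Returns None when fewer than two languages share the value (no pair exists).
    """
    coded = sorted(lang for lang, v in values.items() if v is not None)
    sharing = [lang for lang in coded if values[lang] == shared_value]
    if len(sharing) < 2:
        return None
    shared_med = median(geo_dist[frozenset(p)] for p in combinations(sharing, 2))
    overall_med = median(geo_dist[frozenset(p)] for p in combinations(coded, 2))
    return shared_med / overall_med


# Invented distances (km) among four placeholder languages.
toy_dist = {
    frozenset({"A", "B"}): 100, frozenset({"A", "C"}): 400,
    frozenset({"A", "D"}): 500, frozenset({"B", "C"}): 300,
    frozenset({"B", "D"}): 600, frozenset({"C", "D"}): 200,
}
toy_feature = {"A": 1, "B": 1, "C": 0, "D": 0}

print(normalized_median_distance(toy_feature, toy_dist, 1))
print(normalized_median_distance(toy_feature, toy_dist, 0))
```

In the toy data, the two languages sharing the value 1 are the closest pair, so the normalized median distance for that value falls well below 1; when every language shares a value, the two medians coincide and the result is 1.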
In all cases, the lower the normalized median distance, the closer the areal relation among the languages that share the value for the feature. Figure 4.4 illustrates the plotting of feature category by the number of languages that share values and the normalized median distance between pairs.3

Figure 4.4 Geographic distribution of feature values
The general impression is of a rather homogeneous distribution of feature types, with a possible cluster of unstable contents features between 0.8 and 1.0 normalized median distance. The spread is characterized in Table 4.4, which shows counts by feature type for cumulative ranges of the 162 features ranked by normalized median distance. The distribution of stable and unstable features in each range is strikingly balanced, which contradicts the hypothesis that unstable features should show clearer areal affinities than stable features. The highest proportion of unstable features (%U, at 55%) occurs at either end of the scale, where 11 of the 20 lowest and 11 of the 20 highest normalized median distances are between shared values of unstable features.
Table 4.4 Ranges of features and types ranked by normalized median distance (N = 162)

The proportions of template and contents feature types show a bit more imbalance, with the percentage of contents features in the low 60% bracket for most of the lower normalized distances. The highest percentage of contents features (%C, at 64%) occurs in the range of the 50 lowest normalized median distances. The final column of Table 4.4 shows the breakdown of stable and unstable secondary categorizations within the contents category, and indeed stable features outnumber unstable features throughout the range of lower rankings. These counts indicate that contents features and specifically stable contents features are best at capturing general areal relationships in this dataset.
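The kind of cumulative tabulation summarized in Table 4.4 can be sketched as below. This is only an illustration of the counting logic: the function name `cumulative_shares`, the tuple layout, the one-letter codes (S/U for stable/unstable, T/C for template/contents), and the six data rows are all invented and do not reproduce the actual 162 feature values.

```python
def cumulative_shares(ranked, cutoffs):
    """Percentages of unstable and contents entries among the k lowest distances.

    ranked: list of (normalized_median_distance, stability, function) tuples,
            e.g. (0.42, "S", "C") for a shared value of a stable contents feature.
    cutoffs: sizes k of the cumulative ranges, counted from the lowest distance.
    """
    ordered = sorted(ranked, key=lambda row: row[0])
    rows = []
    for k in cutoffs:
        subset = ordered[:k]
        n = len(subset)
        unstable = sum(1 for _, stab, _ in subset if stab == "U")
        contents = sum(1 for _, _, func in subset if func == "C")
        rows.append({
            "lowest": n,
            "pct_U": round(100 * unstable / n),
            "pct_C": round(100 * contents / n),
        })
    return rows


# Invented entries for illustration only.
toy_ranked = [
    (0.30, "S", "C"), (0.45, "U", "C"), (0.60, "S", "T"),
    (0.80, "U", "T"), (0.95, "U", "C"), (1.00, "S", "T"),
]

for row in cumulative_shares(toy_ranked, [2, 4, 6]):
    print(row)
```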
4.3 Summary of quantitative assessments
The first subsection examined a matrix of linguistic distance among pairs of languages based on shared features, produced using SplitsTree4. Hypotheses of relationship among the languages were investigated in Section 4.1 with representations of linguistic distances by feature type within pre-defined groups. Here we saw that not all stable features are alike: while the comparison of all stable and unstable features was inconclusive in distinguishing the Chibchan family, the subset of stable template features was best at identifying the genealogical group. The perspective from linguistic distances in the four small regional groups was less fruitful, suggesting in fact that diverse categories of stable features were slightly better at characterizing individual groups. Hypotheses on the roles of unstable and/or contents features in reflecting local areas were not confirmed; instead, the role of stable template features as predictors of family relations was given a small measure of support in the results for the Isthmian group, composed of three Chibchan languages.
The second subsection took a more general approach to the question of areal relations, calculating a normalized median geographic distance between pairs of languages that shared values for each feature. The overall picture suggested a rather homogeneous distribution of feature type. However, a closer look at proportions of feature categories in graded ranges of distance revealed a small but consistent predominance of contents features among the lower distances, with more stable contents than unstable contents, in a trend that peaked and then faded as the distances increased. The consistently uniform distribution of stable and unstable features at every range of distance was surprising; this picture may reflect the negotiated nature of the ranking from which features and labels were drawn, and it certainly reflects the complex character of structural stability. The prevalence of stable contents over unstable contents features as predictors of areal relation was also surprising, or counterintuitive, as one might have logically identified unstable contents as the category of features most susceptible to change. This outcome invites further investigation of the impact of social constraints on linguistic change.
5 Conclusions
If any linguistic feature can be borrowed (Thomason and Kaufman 1988; Curnow 2001), then we need something beyond the linguistic system to predict what will and will not be borrowed. Feature frequencies based on existing descriptions and inventories without weighting for geographic or sociohistorical factors are only part of the story. The fourteen languages in this study are very similar typologically, and at least half of them have occupied stable geographic locations for millennia. However, too many languages with insufficient documentation were missing from an ideal linguistic profile of the Isthmo-Colombian area. Furthermore, geography alone does not dictate the quality of contact; the importance to the linguistic system of simple proximity will vary, and most details of the social history are yet unknown. What are the linguistic impacts of “down-the-line” contact, and among which parts of society did this contact occur? How can the effects of sociopolitical transitions, of the network, corporate, or any other type, be incorporated into models of language change?
Genealogical classification of South American languages based on lexical data does not seem to match up with the interesting stories told by structural features, and indeed the record of genealogical inheritance is only part of the history of a language and of its speakers (e.g. Ross 2003). Among neighboring languages, even lexical items are sometimes consciously conserved to maintain social difference, while at other times these serve as a good reflection of contact relations. How can this psycho-social parameter of speaker communities be operationalized for interpreting patterns of convergence and difference among all types of linguistic features?
This study has attempted to go beyond stability as defined by families and frequencies, proposing other categorizations of structural features that could be useful in tracing areal relations among languages in a specific social setting. The premise was that template and contents categorization, based partly on psychological notions, might mimic proposed contours of cultural exchange, might be a productive way to define linguistic practices that are sensitive to group membership, and might in some way be comparable to differences between basic and non-basic vocabulary. These preliminary proposals on the essence and predictive powers of template and contents features led to conflicting hypotheses about the role of each type in replicating areal patterns in the data, whether reflecting cultural convergence with speakers of neighboring languages or in providing a mechanism for the maintenance of separate cultural identities. Results from such a tiny database can only be suggestive, yet they do suggest that refining categories of “stable” and “unstable” with notions that reflect properties of human interaction will bear fruit. Studies of linguistic prehistory require the quantification and incorporation of data that delineate the social scenario of language contact; this study was meant to take a modest step in that direction.
Appendix
Table 4.5 Feature table






This paper was improved by thoughtful comments from Pieter Muysken, Dan Dediu, and Simon van de Kerke. I am also grateful to Ana Vilacy Galucio for early discussions on database design and to Arnold van der Wal for statistical analyses and Figures 4.2 and 4.3.
1 As such, this study and its goal are a micro-version of the seminal areal study of the Isthmo-Colombian region by Constenla (1991), which arrived at a macro-view of the area. Constenla's goal was to determine if the cultural area designated as the Intermediate Area, defined mostly by anthropologists, also constituted a single linguistic area. He concluded that the languages were better classified into three groups, as members of a Central American-Colombian subarea (CAC), an Ecuadorian-Colombian (Andean) subarea (EC), and a Venezuelan-Antillean (Caribbean) subarea (VA). Nearly half of the structural features collected and analyzed by Constenla were also used in the present study, primarily in the set of stable features. Under Constenla's (1991) scheme, the languages in this study are 1–9 and 12–13 in CAC, 10–11 in EC, and 14 in VA.
2 It should be noted that Dediu and Cysouw (2013) offer the lists as ranked compilations, not as explicit hypotheses to be tested as is being done in this chapter. Dediu (p.c.) stresses that the first principal component “agreement” upon which Table 7 is organized is a global negotiation among the different methods, and he suggests that the rankings in the Parkvall (2008) study, discussed in their paper, might serve as a fruitful basis for future investigations of interactions among vertical and horizontal transmissions. The 19 WALS features drawn from Table 1 for use in this Isthmo-Colombian investigation are features 34, 35, 36, 52, 71, 78, 101, 103, 112, 116 (unstable; with average stability estimates from 0.239 to 0.510) and features 33, 51, 63, 69, 81, 88, 98, 99, 100 (stable; with average stability estimates from 0.564 to 0.819). The corresponding average stability score of the “cut-off” item, item 31 in the 62-item list, is 0.523.
3 Calculations were actually made using pairs of languages, where 14 languages produce 91 unique pairs, 13 languages produce 78 pairs, 12 languages produce 66 pairs, etc. Simple numbers of languages are used on the y axis to ease understanding of the plot.
5 The Andean foothills and adjacent Amazonian fringe
This chapter on the distribution of Andean and Amazonian features in the upper Amazon area shows that the transition from the Andean to the Amazonian area is gradual and complex. This is consistent with the intricate history of contact between the different ethnic groups of the area, and it presents a strong argument for connecting the research traditions associated with these areas. Morphosyntactic influence generally seems to represent older contact situations than phonological influence.
1 Introduction
South America is generally regarded as unusually diverse linguistically, especially in terms of genealogical units (including the exceptionally high number of isolates), but also in terms of the range of possibilities one finds in grammatical constructions. Nevertheless, regional traits of varying extensions that cross family boundaries have also been observed by several authors. Some of these characteristics are shared widely by South American languages in general, and some are restricted to particular areas of varying size.
Two macro-areas within South America have received recurring attention from scholars in terms of shared grammatical features: the Amazon basin and the Andes (see also Birchall, this volume). The middle Andes, ranging from northern Ecuador to central Chile and Argentina, has been described as “a self-contained area that proved resistant to linguistic influences from the outside” (Adelaar 2012b: 586). Contact between the different languages that are and were spoken along the Andean mountain range, especially those spoken in the inter-Andean valleys and along the coast on the western slopes, left its imprint on the languages in the form of a number of shared traits (see e.g. Büttner 1983; Torero 2002; Adelaar 2012b). The Amazon basin is more diffuse typologically than the middle Andes, but several scholars have observed shared traits across language families over large territories (e.g. Derbyshire and Pullum 1986; Derbyshire 1987; Derbyshire and Payne 1990; Payne, D. 1990; Dixon and Aikhenvald 1999).
In spite of the relative self-containedness of the Andean cultural region, and perhaps also in spite of the fact that Andean and Amazonian studies seem to form separate worlds, it is obvious that the transition from the Amazon basin into the Andes is not an abrupt one; the two areas shade off into each other. Moreover, there is archaeological and ethnohistorical evidence that there used to be much more contact between the highlands and upper Amazon area until quite recently, continuing into the post-Columbian era (Taylor 1999).
In this chapter, I take a closer look at the area where the Amazon basin and the Andes meet, an area that I will term the foothill-fringe (FF) area, covering the eastern slopes of the Andes and the westernmost fringe of the Amazon basin. It is an exploratory chapter in the sense that it does not aim to test specific hypotheses about this area (there is, for instance, no underlying claim that the foothill-fringe forms a linguistic area), but rather tries to take stock of the distribution of linguistic features of the FF languages, especially those that have been claimed to be important areal characteristics of the Amazonian and Andean areas. There were certainly close historical connections of many of the FF languages with the Andean cultures (see e.g. Adelaar 2012b), as well as with Amazonian cultures like Arawakan and Tupian, and there were also longer-distance riverine connections (Taylor 1999). In fact, a good many FF languages are classified as Arawakan or Tupian.
The chapter is structured as follows. In Section 2, I first define what I mean by the FF area, and I introduce the languages that represent the area in this paper. Section 3 is devoted to a discussion of “Amazonian” and “Andean” linguistic features, as they have been proposed in the literature. Section 4 describes the approach taken to measuring distances between the languages of the sample, as well as the results. In the last section (5) I come to a conclusion.
2 The foothill-fringe area
The eastern slopes, or the foothills, of the Andean mountain range and the western fringe of the Amazon basin are among the genealogically most diverse areas of the continent. The region is home to many isolates and small language families, as well as representatives of larger families that have extended into this transition zone. Defining this area is not an easy task, because it is essentially an area between two other zones. Therefore we will first direct our attention to the zones that border the FF area.
To the west, a number of successive Andean civilizations have occupied varying parts of the Andean mountains. The last of these indigenous civilizations, the Inca civilization, had its greatest extension as recently as the late fifteenth to early sixteenth century, when its influence stretched along the mountain range all the way from northern Ecuador/southern Colombia to central Chile (see Van de Kerke and Muysken, this volume). This relatively recent expansion has left a firm linguistic mark on the Andean landscape, not only in terms of the spread of the Quechuan languages and the extensive mutual interference with Aymaran languages, but also in terms of shallower contact with languages spoken on the outskirts of the empire, in Chile and Ecuador.
To the east of the FF area, two major expansive movements took place over the last millennia: that of the Arawakan culture (see Eriksen and Danielsen, this volume) and later that of the Tupí-Guaranian culture (see Eriksen and Galucio, this volume). These expansions were mostly by river and promoted the spread of Arawakan and Tupí-Guaranian languages. Different opinions exist about the homeland of these cultures, but it is clear that both expanded (among other directions) west towards the Andes.
Map 5.1 shows the maximum expansion of Quechuan and Aymaran languages in the Andes, as well as the probable maximum extensions of Tupian, Arawakan, and Panoan languages. Given that the different groups expanded at different times (see below), the map should not be regarded as representing the distribution of languages at any given time in history.
Roughly speaking, the FF region as understood in this chapter comprises the strip of land between the Andes and the Amazon, delimited by the river systems that flow together into the Amazon River, resulting in a geographic range from northern Ecuador to southern Bolivia. This territory can be divided into three major sub-areas on the basis of the river systems: a northern system defined by the Napo and upper Marañon Rivers that join together (with the Ucayali) into the Amazon River near Iquitos, a central system where two major rivers (the Huallaga and the Ucayali) flow in a general south–north direction across Peru, joining the Marañon in northern Peru, and finally a southern system (Madre de Dios-Beni-Mamoré) covering southern Peru and Bolivia.
The position of the FF languages in the midst of a number of cultural-linguistic expansions raises the question of how speaker communities have dealt with these expansions and, more particularly, what imprint, if any, this cultural interaction has made on the languages that they speak. Reviewing all languages of this area is at this point beyond our reach, since data are scanty, and the time span for the current chapter was not long enough. Therefore I confine myself to reviewing a representative sample of the languages listed in Table 5.1 (the number refers to the number on Map 5.2).
Table 5.1 The languages in the sample and their sources


Map 5.2 The languages in the sample and their geographic distribution
3 Andean versus Amazonian features
A number of different authors have proposed “areal” or “regional” features both for an Amazonian and for an Andean area. The proposals of these authors are not always easy to compare, since there is no clear consensus with respect to the precise extensions of the areas. This is the case especially for the Amazonian area. Some authors look at a limited number of language families that cover a broad territory (see e.g. Payne, D. 1990); others look at a sample of languages spoken in different parts of Amazonia (Derbyshire and Pullum 1986), and yet others look at the entire Amazon basin that contains a multitude of families and which may also contain smaller linguistic areas (see e.g. Dixon and Aikhenvald 1999). This sometimes makes it hard to compare results, as they can be incompatible. In the discussion of the features, I will indicate the problematic points and the way I treat these problems. First, however, I briefly introduce the sources for the features in Table 5.2.
Table 5.2 Areal studies of the Amazon and Andean regions used in this study

In what follows I will discuss proposals made by these authors for widely shared features in the Amazon and Andes with respect to phonology, morphology, syntax, and lexicon. I favor those characteristics that contrast the Andean area with the Amazonian area. Moreover, I favor those characteristics that pertain to languages and language families that are or were spoken in the FF zone between northern Ecuador and southern Bolivia.
3.1 Phonology and morphophonology
Dixon and Aikhenvald (1999) list the following phonological features, which are explicitly marked as being absent or having different values in the Andean area.
1. one liquid phoneme, frequently a flap
2. affricates outnumber fricatives
3. presence of a high, unrounded central vowel
4. presence of mid vowels
5. vowel nasalization
Andean languages, according to Dixon and Aikhenvald, typically have more than one liquid phoneme and a preference for fricatives over affricates in terms of numbers of phonemes. The high unrounded central vowel is mentioned by Torero (2002) as an Andean characteristic with limited extension, as it occurs in Mapudungun (central Chile) and the extinct northern Peruvian coastal language Mochica. He furthermore mentions that it is possibly reconstructable for Puquina, also extinct, which was spoken around Lake Titicaca at the present-day Bolivian–Peruvian border. It should be borne in mind that the range of the Andean area that Torero talked about has a wider extension than the area talked about in this chapter, as Torero's Andean area extended all the way down to the southern cone and included also the formerly spoken coastal languages. Since Mapudungun and Mochica fall outside the part of the Andes immediately adjacent to the foothill-fringe area, and because the high central vowel is present in members of the most dominant western Amazonian families (Arawakan, Tupí-Guaranian, and Panoan), I take it up in the list of Amazonian features for this chapter. With respect to the mid vowels /e/ and /o/, there are also a number of Andean languages that have mid vowels as phonemes (see Torero 2002: 524; Adelaar 2008a: 26), but the two most dominant language families of the Andes, Quechuan and Aymaran, have three-vowel systems containing only high and low vowels. Nevertheless, this feature should be considered with care, since Adelaar (2008a) reports that some variants of Quechuan and Aymaran have developed phonemic mid vowels, possibly due to Spanish and Portuguese influence. Vowel nasalization is decidedly Amazonian, and does not occur in Andean languages.
Payne (2001) adds a sixth morphophonological feature to the Amazonian list, nasal spreading, noting that the Tupí-Guaranian, Tucanoan, Jê, Panoan, and Makú families all show some form of this characteristic.
6. nasal spread
Moving on to the Andean literature, Torero (2002) distinguishes between a number of levels of feature diffusion (general – wide extension – limited extension – restricted). In the questionnaire I consider the first two groups, plus a subset of the features with limited extension, to the extent that they are found in languages or language families that cover major parts of the Andes along the foothill region as it is considered here. These, however, will be used with some caution, and especially to shed more light on subareas within the general region.1
General and widely extended
1. presence of a palatal nasal
2. frequently closed syllables
3. velar–uvular opposition for voiceless stops
4. presence of retroflex affricate
Limited extension
5. glottalization of stops (some Quechuan, Aymaran, Uru-Chipaya)
6. aspiration of stops (some Quechuan, Aymaran, Uru-Chipaya)
7. three-vowel system (Quechuan, Aymaran)
The palatal nasal, the velar–uvular distinction, and the retroflex affricate are mentioned as traits that distinguish the Andes from the Amazonian area (Torero 2002: 523–524). Glottalization and aspiration of obstruents in Quechuan languages are limited to those languages that are situated in southern Peru and Bolivia. This is very probably an Aymaran substrate feature (see e.g. discussion in Büttner 1983 and Adelaar 2012b). The three-vowel system consisting of phonemes /u/, /i/, and /a/ is also in particular a feature of Quechuan and Aymaran languages (although perhaps not historically – Adelaar 2012b), and is rare in Amazonia.
Andean feature 2 requires a few extra remarks, first of all because it is not an entirely straightforward feature with respect to the Andean area, and second because it must be translated into a question to which an answer can be given in terms of discrete categories. Adelaar (2012b: 601–602) mentions that neither proto-Quechua nor proto-Aymara allowed complex codas in underlying form, but since Aymaran morphophonology contains complex deletion rules of phonetic material, surface forms can contain highly complex consonant clusters. Moreover, Adelaar mentions that proto-Aymara may have been more restrictive in terms of the kinds of elements allowed in the coda, although modern Aymaran languages seem to have acquired greater coda tolerance, possibly as a result of contact with Quechuan languages (see Cerrón-Palomino 2008: 47). In addition, many Amazonian languages do allow a few consonants in the coda (usually nasals or fricatives), but tend to have more severe restrictions on what can be present in the coda. Therefore, rather than looking at abstract syllable structure, I analyze the issue of closed syllables as the degree to which restrictions are placed on segments in the coda of a syllable – not counting phonologically deviating words like ideophones, interjections, etc. and looking at underlying syllable structure.2 The answer to this question can be based on the percentage of consonant phonemes that can occur in coda position, ranging from 0 to 100, divided into three groups: A: 0–30, B: 31–60, C: 61–100. More Andean-type syllable structures will fall into categories B and C, with Amazonian-types in category A.
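The three-way coding of coda restrictions can be illustrated with a minimal sketch. The function name and the inventory counts in the usage lines are hypothetical, and the treatment of fractional percentages falling between the stated bands (e.g. 30.5) is an assumption: they are assigned to the lower band.

```python
def coda_category(n_coda_consonants: int, n_consonants: int) -> str:
    """Return 'A', 'B', or 'C' for the share of consonant phonemes allowed in codas.

    Bands follow the text: A = 0-30%, B = 31-60%, C = 61-100%.
    Assumption: values between two bands (e.g. 30.5%) go to the lower band.
    """
    if n_consonants <= 0:
        raise ValueError("a language needs at least one consonant phoneme")
    if not 0 <= n_coda_consonants <= n_consonants:
        raise ValueError("coda consonants must be a subset of the inventory")
    pct = 100 * n_coda_consonants / n_consonants
    if pct <= 30:
        return "A"  # Amazonian-type: heavily restricted codas
    if pct <= 60:
        return "B"
    return "C"


# Hypothetical inventories of 20 consonants each
print(coda_category(2, 20))   # 10% of consonants allowed in coda
print(coda_category(9, 20))   # 45%
print(coda_category(15, 20))  # 75%
```

Counts rather than percentages are taken as input so that the underlying inventory sizes stay visible in the coding sheet.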
Since Andean characteristic 7 inherently contrasts with Amazonian characteristics 3 and 4, they can be collapsed. This leaves a total of twelve contrastive Andean and Amazonian phonological features for analysis: ten general features plus two which are more restricted (Table 5.3).
Table 5.3 The phonological features

3.2 Morphosyntax
Both Andean and Amazonian languages are by-and-large characterized by having verbs with a highly synthetic, agglutinating morphological structure. Although this is a salient feature, it is not contrastive. Nevertheless, a number of contrasting features can still be listed on the basis of the proposals by the different scholars.
The status of argument cross-referencing on the verb is unclear, since where Derbyshire and Pullum (1986) claim that the tendency for Amazonian languages is to have a set of pronominal affixes for both subject and object participants, Dixon and Aikhenvald (1999) claim that it is typically Amazonian to cross-reference only one core argument on the verb (which may differ according to context). Andean languages often cross-reference both subjects and objects on the verb, so this is potentially a contrastive feature. However, the three families with a large western Amazonian presence (Arawakan, Tupí-Guaranian, Panoan) differ with respect to this parameter. Whereas Arawakan languages usually have cross-reference markers on the verb for both subject and object, Tupí-Guaranian languages conform to Dixon and Aikhenvald's prototypical Amazonian situation in that they mark one core argument on the verb, and Panoan languages, finally, have “no, incipient, or little developed argument marking in the main verb or auxiliary” (Valenzuela 2010: 68).
What is striking, however, is the number of Amazonian languages that have pronominal prefixes (see Payne, D. 1990: 221). This may be part of a more general difference between Amazonian and Andean languages, in that large-scale Andean languages like Quechuan and Aymaran are exclusively suffixing, whereas in Amazonian languages, prefixing is more common and is present in almost all languages to different degrees (see e.g. Payne, D. 1990; Dixon and Aikhenvald 1999: 9; Torero 2002: 526).
Another opposing feature to do with person markers is the fact that isomorphism between possessors and one of the core arguments is common in Amazonia (Dixon and Aikhenvald 1999), and rather rare in the Andes. In Torero's data, this is limited to foothill languages Cholón and Cunza. The isomorphism feature can be extended to languages that do not have bound person markers by taking into account isomorphism on the basis of form parameters such as case marking or special forms of pronouns. I am more wary of basing isomorphism solely on positional encoding, since the possible variation is too limited. Therefore, languages that treat possessive and argument pronouns as the same only in terms of their position with respect to their head are counted as “non-applicable.”
In addition to verbal cross-referencing, many Andean languages employ rich case systems (Torero 2002: 527), including core case markers, whereas Amazonian languages tend to have elaborate applicative systems (Payne, D. 1990), and a rather small set of peripheral case markers (Dixon and Aikhenvald 1999: 8). This characteristic is hard to quantify, since it is difficult to tell what is a rich system and what is a restricted system. Iggesen (2012) classifies languages into nine categories. I will distinguish three categories in a less refined way: (A) small set of case markers or no case marking (0–4), (B) medium set of case markers (5–6), and (C) large set of case markers (>6), where the typical Amazonian profile is “small set of case markers” and the typical Andean profile “large set of case markers.” Aymaran and Quechuan languages moreover have accusative case markers. Core case markers are un-Amazonian, with the exception of Panoan languages, which often have an ergative marker.
Finally, an often mentioned trait of Amazonian languages is their tendency to have ergative alignment, or alignment systems with clear and substantial ergative elements (e.g. Derbyshire 1987). Although the range of the systems encountered in Amazonia is rather great and involves various types of split systems, fully accusative systems appear to be very rare in Amazonia (Dixon and Aikhenvald 1999: 8), so this feature can be contrasted with the Andean languages, such as Quechuan and Aymaran languages, as well as Barbacoan languages, Cunza, and Huarpe, which have accusative systems (Torero 2002: 529). Encoding strategies I consider are constituent order, verbal cross-referencing, and case marking. For a language to be coded as accusative, at least one of these three must follow an accusative pattern, and the others cannot give a contrastive signal. I particularly look at NPs in simple clauses, and do not count as accusative any system that has a major alignment split (e.g. based on definiteness, semantic role, etc.).
In the nominal realm, possessive constructions can be contrasted. The typical Amazonian structure involves a head-marked construction, making use of bound person markers (see e.g. Dixon and Aikhenvald 1999: 8). The Andean type often involves dependent marking, sometimes in combination with head marking (Quechuan, Aymaran – see Torero 2002; Adelaar 2012b), sometimes not (Mochica, Huarpe, Barbacoan – Torero 2002: 528). Puquina and Mapudungun both have Amazonian-type possessive structures in that they mark possessive relations on the head by means of person prefixes. Nevertheless, it is reasonable to contrast this feature in terms of Andean versus Amazonian in that the former tend to have dependent-marking strategies, and the latter not.
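The coding rule for accusative alignment stated above can be written out as a small decision function. This is only a sketch of the rule as worded in the text: the labels ("accusative", "ergative", `None` for a neutral or unused strategy) and the function name are hypothetical conventions, not part of the questionnaire itself.

```python
from typing import Dict, Optional

# The three encoding strategies considered in the text
STRATEGIES = ("constituent_order", "cross_referencing", "case_marking")


def is_accusative(signals: Dict[str, Optional[str]], major_split: bool = False) -> bool:
    """Code a language as accusative under the rule described in the text.

    signals maps each strategy to "accusative", to another alignment label
    (a contrastive signal), or to None when the strategy is neutral or unused.
    """
    if major_split:
        # Systems with a major alignment split are never counted as accusative
        return False
    values = [signals.get(strategy) for strategy in STRATEGIES]
    has_accusative = any(v == "accusative" for v in values)
    has_contrastive = any(v not in (None, "accusative") for v in values)
    return has_accusative and not has_contrastive


# Hypothetical codings
print(is_accusative({"case_marking": "accusative"}))
print(is_accusative({"case_marking": "accusative", "cross_referencing": "ergative"}))
```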
Another salient Amazonian feature is the presence of a noun class or gender system of some sort. Noun class systems are also encountered in some of the languages that Torero counts as being part of the Andean area (Mochica and Cholón), but as mentioned Mochica is a coastal language (on the west side of the Andes) and Cholón is considered here to be part of the FF area.
In terms of negation, Andean languages display several different strategies: a preposed particle, a suffix, or a combination of those. Torero (2002: 528–529) mentions that the first two strategies are also common in Amazonian languages, especially suffixal negation. So this feature is not contrastive enough to take up in the questionnaire.
Apart from the aforementioned subject and object cross-referencing and negation marking, a number of further traits are encountered to different degrees both in the Andean and the Amazonian area, and are therefore not contrastive: evidentiality, nominalized subordinate clauses, switch reference, phrase- or sentence-final particles or enclitics, inclusive–exclusive distinction, alienability, incorporation, and lack of passive. The above considerations leave us with a further seven contrastive morphosyntactic characteristics (Table 5.4).
Table 5.4 The morphosyntactic features

3.3 Constituent order
Derbyshire and Pullum (1986) and Derbyshire (1987) in particular give close attention to issues of constituent order. Among the Amazonian constituent order traits, they include O before S (Derbyshire and Pullum 1986) or O-initial (Derbyshire 1987) constituent order in the sentence and the combination of NA, Pr-Pd orders and postpositions. Torero (2002), on the other hand, mentions SOV clause order as an Andean trait with limited but still wide extension (Quechuan, Aymaran, Chipaya, and also true for Barbacoan languages – see Curnow and Liddicoat 1998: 387), and AN and Pr-Pd orders as widely shared features. This results in two contrastive features (Table 5.5).
Table 5.5 The constituent order features

I have chosen the formulation of feature 20 as noted in Table 5.5 because asking for SOV order would have been too restrictive on the Andean-like languages, and asking for O-initial, too restrictive for Amazonian languages (in the sense that “no” as an answer would encompass many more logical possibilities). I will give more detailed information on word order below.
3.4 Lexicon
A final domain for which proposals have been made on the basis of which we can contrast an Amazonian profile with an Andean profile is the lexicon. One salient feature for Andean languages is a decimal counting system, shared by many of the languages and language families: Quechuan, Aymaran,3 Puquina, Mochica, Cholón, Uru-Chipaya, Cunza, Huarpe, and Mapuche. Although we cannot contrast this trait as such with the Amazonian type of numeral systems, Dixon and Aikhenvald (1999: 9) mention that there is generally only a small class of numerals in Amazonian languages. This means that we can set up an Andean–Amazonian contrast on the basis of elaboration of the numeral system, where a stable numeral system that goes to at least 10 (and that does not contain Spanish or Portuguese loans) is typically Andean, whereas smaller systems are typically Amazonian.
A final lexical characteristic of Amazonian languages is mentioned by Payne (2001): the presence of an elaborate class of ideophones. Ideophones can be defined as “marked words that depict sensory imagery” (Dingemanse 2011: 25), i.e. they are words that typically show deviating characteristics, especially in their phonology and phonetics but often also in their morphological and/or syntactic behavior, that depict a situation in such a way that it evokes a perceptual sensation or perceptual knowledge. This goes well beyond the arguably universal onomatopoeia, as ideophones can depict at higher levels of abstraction, often involving perceptual modalities other than hearing, such as vision, taste, smell, touch, etc.
Table 5.6 completes the list of twenty-three contrastive features.
Table 5.6 The lexicon features

In the comparisons that are discussed in the next sections, I score the features for each of the thirty-two languages in the sample if the available data allow for it. By regarding the Andean profile and the Amazonian profile as “language” profiles, on a par with the profiles of the FF languages, I can calibrate the distance of an FF language to the Andean and Amazonian type.
4 Results and discussion
4.1 Linguistic distance
Figure 5.1 represents the distance between languages by taking into account all of the twenty-three features discussed above, without any differences in weight for the features. The Andean and Amazonian profiles (which are maximally contrastive) are treated as if they were languages; they are boxed in the network. The distances between the languages are visualized in a Neighbor-Net network (Bryant and Moulton 2004), a distance-based method that shows splits between languages, but also signals that go against proposed splits in the form of reticulation or ‘webbing’.

Figure 5.1 NeighborNet representation of the distances between the languages of the sample
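To make the notion of an unweighted distance between feature profiles concrete, the following minimal sketch computes the share of mismatching features between two profiles. The distance measure itself is an assumption (a relative Hamming distance that skips features left unscored in either profile); the text does not specify which measure underlies the network. The three-feature profiles are hypothetical.

```python
from typing import Optional, Sequence


def profile_distance(a: Sequence[Optional[str]], b: Sequence[Optional[str]]) -> float:
    """Share of mismatching features among those scored in both profiles.

    None marks a feature that could not be scored for lack of data.
    Every feature counts equally, i.e. no weighting is applied.
    """
    if len(a) != len(b):
        raise ValueError("profiles must cover the same features")
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        raise ValueError("no feature is scored in both profiles")
    return sum(x != y for x, y in shared) / len(shared)


# Hypothetical three-feature profiles (two binary features, one A/B/C feature)
andean = ["yes", "no", "C"]
amazonian = ["no", "yes", "A"]
language = ["yes", "no", "A"]

print(profile_distance(language, andean))     # one mismatch out of three
print(profile_distance(language, amazonian))  # two mismatches out of three
```

Because the Andean and Amazonian profiles are ordinary rows in such a matrix, the distance of any FF language to either profile falls out of the same function.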
A first major split we can observe is indicated by the horizontal thick dotted line in Figure 5.1. The group of languages above the dotted line contains all the Arawakan languages as well as a few others, like Cholón, Muniche, Movima, and Záparo, and towards the right the isolate Yurakaré and Tupí-Guaranian language Yuki. The group below the dotted line contains the Quechuan, Tacanan, Panoan, and Jivaroan languages, as well as Aymara (Aymaran), Secoya (Tucanoan), Amarakaeri (Harakmbet), Cocama (Tupí-Guaranian) and the (semi-)isolates Leko, Waorani, Cofán, Mosetén, Taushiro, and Urarina. For ease of reference, I will refer to the group above the dotted line as “Amazonian” and to the group below the dotted line as “Andean.”
If we contrast these two blocks, the binary features that contribute most to the contrast between them are (ordered according to contrast, the highest contrast first):
1. presence of core case markers (0% of the ‘Amazonian’ group versus 89% of the ‘Andean’ languages);
2. isomorphism of possessor and core verbal argument (91% Amazonian, 32% Andean);
3. dependent marking for possession (9% Amazonian, 68% Andean);
4. the presence or absence of an elaborate case marking system (64% of Amazonian languages have a small case marker inventory vs. 5% of the Andean group, and 18% of the Amazonian group and 84% of the Andean group have an elaborate case marking system);4
5. the presence of gender/classifier systems (73% Amazonian, 16% Andean);
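The ordering by contrast in the list above can be reproduced with a short calculation. The sketch below assumes group sizes of 11 ("Amazonian") and 19 ("Andean") languages and the per-feature counts shown; both are inferred from the reported percentages and are not given in the text, so they should be read as hypothetical.

```python
def contrast(count_a: int, size_a: int, count_b: int, size_b: int) -> float:
    """Absolute difference, in percentage points, between two groups' shares."""
    return abs(100 * count_a / size_a - 100 * count_b / size_b)


# Assumed group sizes (inferred, not stated in the text)
N_AMAZONIAN, N_ANDEAN = 11, 19

# feature -> (languages with the feature in the "Amazonian" group, in the "Andean" group)
# Hypothetical counts chosen to match the reported percentages
features = {
    "core case markers": (0, 17),
    "possessor/argument isomorphism": (10, 6),
    "dependent-marked possession": (1, 13),
    "gender/classifier system": (8, 3),
}

ranked = sorted(
    features,
    key=lambda f: contrast(features[f][0], N_AMAZONIAN, features[f][1], N_ANDEAN),
    reverse=True,
)

for name in ranked:
    ama, andean = features[name]
    print(f"{name}: {round(100 * ama / N_AMAZONIAN)}% vs {round(100 * andean / N_ANDEAN)}%")
```

The three-valued case inventory feature is left out of the sketch, since it cannot be reduced to a single pair of percentages.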
Both groups of languages show moreover a secondary contrast between languages to the left of the graph and languages to the right. This contrast is much less clear and seems more reminiscent of a continuum, or perhaps a tripartite distinction, and is indicated by the two thin vertical lines. If we take the most contrastive languages on the left–right axis, we can again distinguish an “Andean” group on the left, consisting of Arawakan languages Ashéninka, Nanti, and Yanesha’, and Cholón (Hibito-Cholón) and the isolate Muniche, as well as the Quechuan languages Imbabura and Cuzco Quechua, the Tacanan languages Cavineña and Ese Ejja, Aymara, and the isolate Leko. To the right of the graph we can distinguish an “Amazonian” group consisting of Yurakaré (isolate), Tupí-Guaranian languages Yuki and Cocama, the three Jivaroan languages Aguaruna, Achuar, and Shuar, Panoan Shipibo and Cashibo, isolate Urarina, and Amarakaeri (Harakmbet).
For this axis the most contrastive features are the following:
1. basic adjective-noun order (73% of Andean, 0% of Amazonian);
2. phonemic central high vowel (18% Andean, 90% Amazonian);
3. the presence of more than one liquid phoneme (82% Andean, 10% Amazonian);
4. nasal spread (0% Andean, 70% Amazonian);
6. phonemic palatal nasal (100% Andean, 40% Amazonian).
While the rather clear top-to-bottom split is dominated by morphosyntactic features, the more diffuse left-to-right split seems to be particularly based on phonological features (except for the first). This may at least in part reflect the fact that the phonological features seem to be more sensitive to diffusion through contact, probably through the incorporation of loanwords. If we split the phonological features from the morphosyntactic features,5 we can observe that the distributions of Arawakan and to a lesser extent Quechuan languages (together with Aymara) are rather diffuse in the network based on the phonological features (Figure 5.2), and much closer together in the network based on morphosyntactic features (Figure 5.3). In Figure 5.3, all the Arawakan languages in the sample are in the left “tail” of the figure (with Movima and Muniche). The two Quechuan languages and Aymara are identical with respect to the morphosyntactic features, and converge completely on the Andean profile. In Figure 5.2 on the other hand, the Arawakan languages are spread all over the network, while Quechuan languages and Aymara are still rather close to each other, if not so close as in Figure 5.3.

Figure 5.2 NeighborNet of distances between languages of the sample (phonological features only)
Another interesting difference that can be observed is the fact that Panoan (Cashibo, Shipibo) and Tacanan (Cavineña, Ese Ejja) languages, which are often regarded as being related in a deep sense (see e.g. Key 1968; Girard 1971), are rather close together in the morphosyntactic representation, compared to the phonological feature representation. On the whole, then, the morphosyntactic picture gives the impression of representing a more conservative, genealogical picture than the phonological one. This can possibly be connected to borrowing of linguistic forms (especially lexicon), thus introducing new phonemes to the recipient language. Since grammar is generally assumed to be more resistant to borrowing than the lexicon, we might hypothesize that Figure 5.2 may be read as indicating patterns of (shallow) language contact, whereas Figure 5.3 may be reflecting either genealogical links or deep/intense contact.
In terms of features, the same major contributing factors that were identified for the top–bottom divide in Figure 5.1 are responsible for the main divide in Figure 5.3, which contrasts the same two groups of languages. The main contributing features of Figure 5.2 are nasal spread (79% for the Amazonian side, 0% for the Andean side), the presence of more than one liquid phoneme (5% for the Amazonian side, 79% for the Andean side), and the presence of a phonemic palatal nasal (32% for the Amazonian side, 100% for the Andean side).
4.2 Correlations with geographic factors
This chapter focuses on linguistic issues in the discussion on the foothill-fringe area. I will here touch on some possible geographic correlates, but it is clear that more in-depth research is necessary to give more detailed and definite answers to these matters.
The Andean side of the divide in Figure 5.2 suggests contact between Andean (Quechuan and Aymaran) languages and some of the languages spoken close to the Andes in northern Bolivia (Leko, Cavineña) and Peru (Nanti, Ashéninka, Cholón). From there towards the right of the graph the situation becomes more diffuse. There are a few probable contact pairs (Cofán and Secoya, perhaps Urarina with the Panoan languages – although there is quite a lot of reticulation, including the biggest divide in the graph), but there are more surprising positions: Záparo, Mosetén, and Amarakaeri, whose closest neighbors in the network are not their closest neighbors geographically. On the whole, then, the phonological graph seems to represent a rather specific Andean profile with some possible points of contact-induced change, and a much more diffuse Amazonian zone, where languages may share some traits but not others. The most widespread features in the Amazonian group are the presence of mid vowels (also quite common on the Andean side of Figure 5.2) and nasal spread (both 79 percent).
One clear exception to the pattern that phonology is less stable than morphosyntax is the fact that the Jivaroan languages are distributed much more diffusely in the morphosyntactic picture than in the phonological one. There is no straightforward explanation for this apparent anomaly. A suggestion towards an explanation may come on the one hand from the long-term and complex relations between Jivaroan groups and highland groups,6 and on the other hand from the fact that the Jivaroan groups “show a particularly strong ethnic consciousness” (Adelaar with Muysken 2004: 432). The first factor may contribute to the result that Achuar is found rather close to the Andean profile, and the latter may account for the fact that the Jivaroan languages pattern so closely phonologically, perhaps due to an ethnic consciousness that includes a resistance to lexical (conscious) borrowing.
Figure 5.3 shows a relatively homogeneous “Amazonian” group, containing all Arawakan languages, but also the isolates Movima and Muniche and, at more distance, Cholón and Záparo. The Movima case may be explained by (deep) contact of Movima with Trinitario/Ignaciano and also Baure. Other cases, such as the puzzling position in the midst of the Arawakan languages in the network of Muniche, Záparo, and Cholón, may require less straightforward explanations.7
Given the particular position of the languages in the sample, between two major geographical (and perhaps cultural) zones, a natural question to ask is whether we can find any correlations8 between the linguistic patterns and geographic variables. A first, simple question would be whether there is any correlation between linguistic distance and geographic distance.9 It seems that there is no such correlation, as can be observed in Figure 5.4, which shows geographic distance on the x-axis and linguistic distance on the y-axis. This confirms the observations made (i) that there are a few languages oddly placed in the graph, and (ii) that genealogical signals can be strong without necessarily coinciding with geographic proximity.

Figure 5.4 Correlation between linguistic distance and geographic distance
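A comparison of this kind can be sketched as follows: compute a geographic distance for every language pair, line it up with the linguistic distance for the same pair, and take the correlation coefficient. The sketch assumes great-circle (haversine) distances and a plain Pearson coefficient; the function names, the coordinate format, and the toy data are hypothetical, and the procedure actually used for the figure may differ.

```python
import math
from itertools import combinations


def haversine_km(p, q):
    """Great-circle distance in km between two (latitude, longitude) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*p, *q))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long value lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def geo_ling_correlation(coords, ling_dist):
    """coords: name -> (lat, lon); ling_dist: frozenset({a, b}) -> linguistic distance."""
    pairs = list(combinations(sorted(coords), 2))
    geo = [haversine_km(coords[a], coords[b]) for a, b in pairs]
    ling = [ling_dist[frozenset((a, b))] for a, b in pairs]
    return pearson_r(geo, ling)


# Hypothetical toy data: linguistic distance made proportional to geographic distance
coords = {"X": (0.0, 0.0), "Y": (0.0, 1.0), "Z": (0.0, 3.0)}
ling = {frozenset((a, b)): haversine_km(coords[a], coords[b]) / 1000
        for a, b in combinations(sorted(coords), 2)}
print(geo_ling_correlation(coords, ling))
```

Since pairwise distances are not independent observations, the significance of such a coefficient is usually assessed with a permutation procedure such as a Mantel test rather than read off directly. Replacing `haversine_km` with the absolute difference in elevation between two locations gives the corresponding comparison for elevation.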
Two other geographic factors that seem intuitively important are elevation and river systems, as they are of consequence to how people travel, and perhaps they limit the range of contacts between peoples. Figure 5.5 shows the correlation between geographic elevation and linguistic distance. It should be read as follows: the greater the difference in elevation between a language pair, the greater the linguistic difference between these languages. In very general terms, this can be interpreted as identifying the Andean mountains as a barrier for contact, although the correlation is not very strong (r = 0.37).

Figure 5.5 Correlation between geographic elevation and linguistic distance
The idea that differences in elevation are barriers for contact is corroborated by the graph in Figure 5.6, which shows the correlation between elevation and proximity to the Andean profile. The x-axis indicates the height of the location where the languages are spoken, and the y-axis indicates distance from the Andean profile. As a general tendency, the higher a language is spoken, the more it conforms to the Andean profile.

Figure 5.6 Correlation between elevation and proximity to the Andean profile
A final geographic factor that I want to take into consideration is the river system. Rivers in South America form pathways along which people move around, are in contact with each other, and thus possibly influence each other. The foothill-fringe area as presented here can be said to consist of three major river systems:
1. The northern basin, delimited in the north by the lower Napo and Aguarico Rivers and in the south by the lower Ucayali and Marañon Rivers, encompassing eastern Ecuador and northern Peru.
2. The drainage basin of the upper Ucayali and Huallaga Rivers, covering north-central to southern Peru.
3. The basin defined by the Madre de Dios and Mamoré Rivers, covering Bolivia and a small part of southern Peru.
The languages in the sample can be classified according to the river system they belong to (see above), and the average distance between their linguistic profiles can be compared to the average of the entire sample. However, it seems that the river systems do not have any impact on the average linguistic distance, as is shown in Table 5.7.
Table 5.7 The average linguistic distance per river system

One proviso that we should make with this result is that the genealogical diversity is greater in the Napo/Aguarico (ten families) and Madre de Dios/Mamoré basins (nine families) than in the Ucayali/Marañon basin (four families). This means that perhaps the picture should be adjusted somewhat and there might be a (relative) contact effect after all in the northern and southern basins. However, this is difficult to take into the equation, and must, moreover, await more detailed research.
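The comparison summarized in Table 5.7 can be sketched as follows: the mean linguistic distance over all language pairs is set against the mean over pairs that belong to the same river system. This is only an illustration under stated assumptions; the function name, the labels, and the four placeholder languages with their distances are invented and do not reproduce the table.

```python
from itertools import combinations
from statistics import mean

def average_distances(groups, dist):
    """groups: language -> river system label; dist: frozenset({a, b}) -> distance.

    Returns (mean over all pairs, {label: mean over within-group pairs}).
    """
    langs = sorted(groups)
    overall = mean(dist[frozenset(p)] for p in combinations(langs, 2))
    per_group = {}
    for label in sorted(set(groups.values())):
        members = [lang for lang in langs if groups[lang] == label]
        pairs = list(combinations(members, 2))
        if pairs:  # a group with a single language has no within-group pairs
            per_group[label] = mean(dist[frozenset(p)] for p in pairs)
    return overall, per_group

# Invented placeholder data: four unnamed languages in two river systems.
groups = {"L1": "north", "L2": "north", "L3": "south", "L4": "south"}
dist = {
    frozenset(("L1", "L2")): 0.2,
    frozenset(("L3", "L4")): 0.4,
    frozenset(("L1", "L3")): 0.5,
    frozenset(("L1", "L4")): 0.5,
    frozenset(("L2", "L3")): 0.5,
    frozenset(("L2", "L4")): 0.5,
}
overall, per_group = average_distances(groups, dist)
```

A within-group mean clearly below the overall mean would point to a contact effect within that river system. The proviso about genealogical diversity applies here as well, since a basin with few families can show a low mean for genealogical rather than areal reasons.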
5 Conclusion
When reviewed in terms of areal linguistic features that are considered to be of importance for the Andean and the Amazonian areas, the FF languages conform neither to the Amazonian profile, nor to the Andean profile. Instead, they form a mixed group, which fits well with their position between these two areas, and reflects their complex past of multilateral contacts. The results of the study do clearly show that the FF area, which is mostly associated with the Amazon in traditional terms, does not conform to the Amazonian prototype. On the basis of the results of this preliminary study, we can tentatively draw a few further conclusions (pending more research that incorporates results from ethno-historical and archaeological studies).
In terms of genealogical patterns we cannot say very much on the basis of this sample, which would need to be expanded to allow for more firmly supported conclusions. Nevertheless, with this proviso in mind, some patterns can still be recognized and perhaps serve as hypotheses for future studies. The Quechuan languages (Cuzco and Imbabura) do end up relatively close to each other in all networks, but Aymara is generally closer to Cuzco Quechua than Imbabura Quechua is. This is in line with the conclusions drawn in Van de Kerke and Muysken (this volume) about Ecuadorian variants of Quechua being substantially different due to contact effects. The Tacanan and Panoan languages show rather strong signals, and even end up together when only morphosyntactic features are considered, possibly reflecting an even older connection. Arawakan and Jivaroan languages show ambiguous signals (see below).
Apart from the Andean sphere (including Tacanan and Leko), and a recurring northern group of Cofán, Warao and Secoya, there are no obvious major areal patterns in the data. There may be some further more local areal patterns (Movima and Trinitario, Yurakaré and Yuki, Urarina with the Panoan languages). Closer scrutiny may reveal more of these local patterns.
In general terms, morphosyntactic features seem to represent more stable structural traits than the phonological features, which is possibly attributable to the fact that lexical borrowing, more likely to occur than structural borrowing, can influence phoneme systems. The phonological picture is more diffuse than the morphosyntactic one, which shows a clear – Arawakan-dominated – Amazonian group and a (somewhat more diffuse) Andean group. One of the hypotheses that could be tested further is that the patterns reflect two time layers: an older layer of languages with the longest presence in the area and the longest history of contact with the Andean civilizations, and a group of languages and language families that have moved into the area from Amazonia proper, dominated by the Arawakan profile, and which have undergone less long-term Andean influence. The Jivaroan languages form a notable exception to this pattern, which is possibly due to factors of a more ethno-cultural nature. This issue clearly requires more in-depth research.
Apart from linguistic and cultural-historical considerations, I have reviewed some geographic factors that may be of influence. There is no correlation between geographic and linguistic distance as such, but there is a correlation between difference in elevation between language pairs and their linguistic distance, suggesting that elevation differences form a natural barrier against contact. This is corroborated by the fact that there is a correlation between the elevation of a language and the degree of conformation to the Andean profile. Belonging to the same broad river system seems to be less influential when it comes to predicting linguistic distance, although future research may reveal that there is some impact in the northern and southern river systems, or that smaller river systems may give more meaningful results.
This paper was partly prepared at the Radboud University Nijmegen, supported by NWO grant 275–89–006, which is gratefully acknowledged. I thank the editors for useful comments on earlier drafts of this paper, and Françoise Rose, Lev Michael, and Marine Vuillermet for generously providing unpublished material and/or personal comments on specific data points. I furthermore thank Harald Hammarström for his invaluable help with the statistics. Remaining errors are mine.
1 This particularly means those traits found in at least two of the following languages/language families in Torero's list: Aymaran, Quechuan, Puquina, Uru-Chipaya, Cunza. Traits limited to coastal languages like Mochica and Sechura and/or to southern cone languages/families Mapudungún, Huarpe, and Cunza are not taken into consideration, since they are or were not spoken in areas adjacent to the foothill-fringe.
2 It would actually be preferable to also look at surface codas, since that is the signal that may be transferred from one language to the other, but lack of systematic data prevents this.
3 Aymaran has in fact historically a five-term system, but this is supplemented with Quechua loans (Cerrón-Palomino 2000, Van de Kerke 2009, Adelaar 2012b).
4 In the latter interpretation (i.e. the presence or absence of an elaborate case marking system), this feature is the second-highest contributing factor.
5 The constituent order features are included in the morphosyntactic features; the lexicon features have not been considered in either of these networks.
6 Adelaar with Muysken (2004: 432) suggest that the Jivaroan territory may even have extended into the Andean highlands.
8 I thank Harald Hammarström for the calculations as well as providing me with the data points on geographic position and elevation of the languages in question.
9 I have taken a crude approach here, distances being represented as the crow flies, and the languages being considered points rather than polygons on the map.
6 The Andean matrix
This chapter deals with several long-standing issues in Andean linguistics: What is the best way to classify the Quechua language family internally and what does this classification tell us about the history of the language, contrasting a new morphosyntactic dataset with the lexical data analyzed by Heggarty (2005, 2007)? What do the complex structural relations between Quechua, Aymara, and the other highland languages reveal about their historical relationship? We hope to show that the Quechua language family is quite coherent and stable in many respects. There is structural evidence for a QI/QII split rather than for a more wave- or network-like configuration within the family. The Aymaran languages are clearly set apart from Quechua varieties as a group, but at the same time, the structural distance between the northern varieties of Quechua and those of most of Peru and Bolivia is larger than that between Aymaran and e.g. Quechua I. The other Andean languages clearly have separate structural profiles.
1 Introduction
Over the last three thousand years the central Andean area has seen the rise and fall of different civilizations. Periods of centralization of power were followed by periods of regionalization, but on the whole the direction was towards large-scale cultural integration and increasing state control over an ever-growing territory, connected with the Chavín, Huari/Tiahuanacu, and Inca horizons. Maximal integration was reached when the Inca Empire controlled an area that ran from Ecuador to Argentina, just before the victory of the Spaniards in 1532 CE. The central Andes area is one of the hot spots of human civilization. It has seen the development and specialization of different food supplies: tubers like potato, maize, camelids, the sacred coca leaf, all of them linked to the complex Andean ecosystem, known in the literature as the pisos ecológicos, ecological levels. It also led to a “vertical” social organization, with exchanges between different eco-zones. Living and agricultural conditions called for massive labor. Moving labor forces around has led to mixing populations of different origins. Under Inca rule, lasting no more than two centuries, a Quechuan lingua franca was propagated within the empire. Inca policy generally, however, was not to suppress local cultures and languages, but to overlay them with the state culture and the imperial language. This process was so successful that 500 years later some form of Quechuan, in its many local varieties, is still spoken by millions of people from the south of Colombia, through Ecuador, Peru, and Bolivia, into the north of Argentina. Only one important other language family managed to resist Spanish pressure: Aymaran. Of the other languages of the Andes, only a small community of speakers of Chipaya (Uru-Chipaya family) survives.
The territory from which Quechuan and Aymaran started to expand around 1 CE lies in central Peru. While Quechuan and Aymaran share a homeland, they probably did not share a direct ancestor. In spite of structural and lexical similarities between the two families, the question of whether genealogical relatedness or language contact may explain this has been the subject of debate (see Adelaar with Muysken 2004: 34–36; Muysken 2012a; Adelaar 2012a). Lexical comparisons within and between varieties of the two families suggest intensive language contact, but morphology and syntax have not been analyzed systematically. The current chapter aims at a more refined comparison of the two language families, enhancing the picture with a more structurally oriented analysis of dialectal variation within the Quechua family, taking into account the other relevant languages spoken in the area.
The Andean matrix has played an important role in South American indigenous linguistics, because of the postulated Andean linguistic area and civilization and its impact on its neighbors. Here we critically survey the current state of knowledge regarding the linguistic history of the Andes and present a new set of analyses based on structural rather than lexical or phonological criteria. We will use a data set of coded features for noun phrase structure and argument realization (see the contributions by Krasnoukhova and Birchall, this volume), as well as structural and morphological features specifically selected to distinguish Quechuan and other Andean languages.
We mean to throw new light on several long-standing issues:
(a) What is the best way to classify the Quechua language family internally and what does this classification tell us about the language history, contrasting our morphosyntactic data with the lexical data analyzed by Heggarty (2005, 2007) (Section 4)?
(b) What does the complex structural relation of Quechua, Aymara, and the other highland languages suggest about their historical relationship (Section 5)?
We begin by presenting the distribution of languages in the region in Section 2, while Section 3 contains information about sampling and coding. We conclude and raise the questions remaining in Section 6.
2 Definition and distribution of languages
Although the Andean ridge runs all along the western coast of the South American continent, the Andean matrix proper refers to the geographic area closed off in the north by the Chibchan area, in the east by the Eastern Lowlands, and in the south by the Southern Cone (Map 6.1). This largely coincides with the area brought under Inca rule in the latest phase of the Inca horizon, which lasted from 1300 CE until the collapse of the empire in the mid sixteenth century. Not only the Incas were confined to this central Andean area; the preceding Huari (500–900 CE) and linked Tiahuanacu (500–1000 CE) horizons were as well. Smaller, more localized outbursts of power concentration like Chavín in northern Peru, Moche on the northern coast, and Nazca on the central Peruvian coast also flourished within these confines.

Map 6.1 Approximate distribution of the indigenous languages in the Andes in the mid twentieth century (Map 4, p. 169, from Willem F. H. Adelaar, with Pieter C. Muysken, The languages of the Andes (2004), Cambridge University Press)
Environmental conditions include a generally arid coast, where life was confined to the river valleys that at the same time served as avenues to the highlands. A steep climb leads to altitudes between 2,000 and 3,500 meters, where food production makes larger concentrations of humans possible, either by tilling the land or by herding. Passing over the ridge of the Andes at 5,000 meters, a steep plunge through a very productive area leads one into the forested lower mountain slopes and then the jungle. Only during the larger archaeological horizons was there north to south contact, while in the intermediate periods we find small local kingdoms/cultures that manipulated the west to east link, integrating the different altitudes in the vertical exchange system.
Without written material it is clear that we have to rely fully on archaeological information up to the end of the Huari/Tiahuanacu horizon. After this date, Andean oral traditions, as recorded by the Spanish, start to play a role. On the basis of this historical information, we may conclude that by that time the Bolivian Altiplano was populated with Aymara-speaking strongholds on the western side of Lake Titicaca and Puquina-speaking Collas on the eastern side. It is highly likely that the Urus, then as now, lived as hunter-gatherers in the surrounding river system. Further to the south we find first Atacameño in northern Chile and then Araucanian (Mapuche) in central Chile. Lule-Vilela was spoken in northern Argentina. Going to the north, we enter a large zone where speakers of Aymara, Quechua, and Puquina coexisted up to the area that was the supposed point of origin of Aymara and Quechua expansion, in central Peru. There we find Mochica, the language of the Moche or Chimu culture (1300–1438), and a number of other coastal languages: Tallana, Sechura, Olmos, and Quingnam. Very little is known of these latter languages (Adelaar with Muysken 2004: 397–407). In Southern Cajamarca and Northern Ancash the now extinct language Culli was spoken, possibly related to Cholón. Cholón was spoken further east, on the Amazonian fringe. Further to the north into Ecuador a number of languages were spoken in the pre-Columbian era, most of which have disappeared, largely replaced by Quechuan varieties (see Adelaar with Muysken 2004: 392–397).
Some languages situated on the fringes of Inca state control, such as Atacameño and Coli (a variant of Puquina) on the southern coast, survived for a time before disappearing around 1900. All other languages whose existence is known to modern scholars, including Puquina, the language associated with the Tiahuanacu period, were replaced by Aymara and Quechua. It is only on the borders of the Inca Empire, mainly on the eastern side of the Andes where diseases, armed resistance, and geographic circumstances brought the Inca armies down, that other languages were able to survive (see van Gijn on the Andean foothills, this volume).
3 Research methodology and language varieties studied
The main focus in our project has been on morphosyntactic characteristics. For this reason we use questions and data that were collected for the Noun Phrase (Krasnoukhova, this volume) and Argument realization (Birchall, this volume) questionnaires. We used a subset of the features in these questionnaires because the two overlap and because a number of the features were irrelevant for the languages in our sample. Of the total of 96 features from the Noun Phrase questionnaire we use 58; from the total of 83 features in the Argument realization questionnaire we use 68, giving a total of 126. We added the data for another 13 varieties to the 4 Quechua varieties represented in the Krasnoukhova and Birchall samples.
Apart from that we composed a questionnaire specifically aimed at Quechuan, with features that in earlier studies were identified as distinguishing Quechuan varieties. Most questions, 25 in total, concern the form of morphemes (see the Quechua questionnaire in the appendix, Table 6.6). A small number of questions were more general and could also be used for a comparison of the whole set of Andean languages (see the Andean questionnaire with 12 features in the appendix, Table 6.6). This means that the Quechuan languages (including Kallawaya) may be contrasted on 163 features, and the whole complex of Andean languages on 138 features.
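The feature totals quoted above follow from simple addition, which can be checked with a short sketch. The counts are those reported in the text; the variable names are illustrative only.

```python
# Feature counts as reported in the text; the variable names are illustrative.
noun_phrase_used = 58   # of 96 Noun Phrase features
argument_used = 68      # of 83 Argument realization features
quechua_specific = 25   # Quechua questionnaire (mostly forms of morphemes)
andean_general = 12     # Andean questionnaire (usable for all Andean languages)

typological = noun_phrase_used + argument_used       # shared typological features
andean_total = typological + andean_general          # all Andean languages
quechuan_total = andean_total + quechua_specific     # Quechuan languages only

print(typological, andean_total, quechuan_total)  # prints: 126 138 163
```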
The use of new analytical tools allows for an in-depth analysis of the variation within the Quechuan family on the one hand and between the Andean languages as a possible Sprachbund on the other, and the results may shed light on a number of questions put forward by Heggarty (2005, 2007). To facilitate a comparison with Heggarty, working with lexical data, we aimed at a comparable set of dialects and languages, as shown in Table 6.1. It presents the language varieties included in this study, compared to those used in the lexico-semantic study of Heggarty and the groupings in Adelaar with Muysken (2004).
Table 6.1 Languages of the study, compared to the varieties used by Heggarty (2005) and by Adelaar with Muysken (2004)


For the linguistic data we rely on published descriptions of Andean languages and in a few cases on our own fieldwork data.1
To analyze the material, we used feature distance matrices and NeighborNet analysis (Huson and Bryant 2006), which allows the representation of distances between languages as well as reticulations (shared features between different branches).
4 The internal structure of the Quechua language cluster
The internal structure of the Quechua family has been a subject of study from the moment that the Spanish friars tried to gain a grip on the complex language situation they were confronted with. Another period of attention came in the nineteenth century when large numbers of European explorers arrived to study the Andes, often as forerunners of commercial exploration. Then another century had to pass by before the pioneering work of Parker (1963) and Torero (1964) made it obvious that treating Quechua as if it were one language is mistaken; a reasonable point of comparison would be the variation within the Romance family.
They argued for a split in the family into a more conservative central branch Quechua I (QI) and a more innovative southern and northern branch Quechua II (QII). This view is widely accepted, although the gradualness of distinctions between QI and QII varieties is debated. The QI dialects are spoken in a relatively small uninterrupted area in central Peru; they share a number of differences with the QII dialects but among themselves they diverge widely. The QII dialects have much in common but are subdivided into a number of geographically based sub-branches. Apart from this basic opposition, Parker and Torero have shown that Quechua was not simply spread from imperial Cuzco, but in an earlier phase had emanated from the central Andean area. Subsequently, a debate started between proponents of the “Cuzco origin” and “Central Andean origin” schools. Many comparative studies were carried out on different aspects of the lexicon, phonology, and morphology of the different dialects, to get a better understanding of the developments within the Quechua language family, all using qualitative methods. The lexical data were used to estimate time depth of a possible split using the lexical statistical method (Torero 1972), but it was only in the last decade that Heggarty (2005) initiated a new research line by evaluating the data by means of phylogenetic network trees, a research line pursued in this chapter.
4.1 QI and QII
It is clear that there are important differences between QI and QII varieties (Parker 1963; Torero 1964; Cerrón-Palomino 1987; Adelaar with Muysken 2004). It is also generally agreed upon that the split between them took place quite early, possibly longer than 1,500 years ago. The main question is whether this led to a sharp division, as assumed by the sources listed above, or a more gradual one, as argued by Heggarty (2007: 335). Summarizing the NeighborNet analysis of lexical data reported on in Heggarty (2005) and on a reanalysis of the data presented in Torero (1972), he writes:
All of these graphical outputs look nothing like neatly branching trees, but webs suggestive not of some radical early split in Quechua but a gradual expansion into a broad dialect continuum. Indeed the varieties of Northern Highland Peru, supposedly QIIa, i.e. a sub-branch of QII, in fact appear much closer to QI than to the rest of QII.
Inspection of the graphs presented in Heggarty suggests that the sources of contention concern three datasets:
(a) The northern varieties of Cajamarca and Ferreñafe (Torero) and Cajamarca and Inkawasi (Heggarty) do not cluster with the other QII varieties, but either with Central QI (Torero) or as a separate branch (Heggarty).
(b) The central variety of Pacaraos that was classified as QII by Torero (1964) is actually closer to QI in Torero's (1972) data.
(c) The varieties labeled Yauyos are intermediate between QI and QII in both datasets.
Our data involve systematic morphosyntactic datasets that can help clarify these issues. The main question will be: Is the split between QI and QII so deep that we can speak of genealogical units or do we see a process of dialectalization in a network-like form with numerous early split-offs? We will focus on the position of Cajamarca, Pacaraos, and Yauyos.
4.2 Quantitative results for the Quechua languages
The primary technique we used involves distance matrices, representations of the percentage of shared features between any varieties in the sample.2
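How such a distance matrix and its global average can be derived from coded features is sketched below. The sketch rests on assumptions that the text does not confirm: distance is taken to be the share of features with different values among the features coded for both varieties, uncoded features (None) are skipped, and the global average is the mean over all pairs. The function names and the three placeholder varieties are hypothetical.

```python
from itertools import combinations

def feature_distance(a, b):
    """Share of differing values among features coded in both lists (None = not coded)."""
    compared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not compared:
        raise ValueError("no features coded for both varieties")
    return sum(x != y for x, y in compared) / len(compared)

def distance_matrix(codings):
    """codings: variety -> list of feature values; returns frozenset({p, q}) -> distance."""
    names = sorted(codings)
    return {
        frozenset((p, q)): feature_distance(codings[p], codings[q])
        for p, q in combinations(names, 2)
    }

def global_average_distance(matrix):
    """Mean of all pairwise distances in the matrix."""
    return sum(matrix.values()) / len(matrix)

# Invented placeholder codings for three unnamed varieties and four features.
codings = {
    "V1": [1, 0, 1, 1],
    "V2": [1, 0, 0, 1],
    "V3": [0, 0, None, 1],
}
matrix = distance_matrix(codings)
gad = global_average_distance(matrix)
```

In this toy example V1 and V2 differ on one of four features (0.25), while each differs from V3 on one of the three features coded for both (about 0.33). A matrix of this kind is the input that a NeighborNet implementation takes.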
Applied to the three datasets we have, Noun Phrase, Argument expression, and Quechua 37 (the combined questions of Quechua and Andean), we observe that the internal variation within the Quechuan family is low when the analysis is based either on the Noun Phrase questionnaire (global average distance, GAD, 0.11) or on the Argument expression questionnaire (GAD 0.12). The Noun Phrase feature database yields very few differences between the varieties (distances ranging from 0.05 to 0.20), suggesting that these features are relatively stable across the family. The Argument realization feature matrix results in a low distance on average, but somewhat greater internal variation (range 0.05–0.30). The NeighborNet graphs associated with these figures present the dialects as relatively randomly grouped. This may be interpreted as an indication that the basic morphosyntactic layout of Quechuan languages has remained remarkably stable, assuming that geographic split and subsequent dialect formation started 1,500 years ago.
It may not come as a surprise that the global average distance jumps considerably to 0.41 (range 0.15–0.70) with the 37 combined Quechua and Andean features. The associated NeighborNet strongly contrasts the QI, QIIb, and QIIc dialects. The effect of these features is so strong that even when they are lumped together with the Noun Phrase and Argument expression features, so that the global average of the combination (Quechua-all) falls to 0.18 (range 0.05–0.30), the associated NeighborNet gives the same clear picture: see Figure 6.1.

Figure 6.1 NeighborNet representation for the relative distances of the members of the Quechuan language family
The main branching between QII languages and QI languages is confirmed. The problematic variety of Pacaraos is close to QI, and problematic Cajamarca and Yauyos are close to QII. Ecuadorian Quechua, a sub-group within Quechua II, is always the outlier, with other lowland Quechua languages such as Peruvian Pastaza Quechua and Inga (Colombia) close by.
A first conclusion may be that the forms of morphemes may have changed (the 25 features in the Quechua database) but not the typological frame (Noun Phrase and Argument expression databases). Interestingly, variation is not random but clearly supports the classical division of Quechuan into QI and QII. In that way it provides answers to a number of questions that were formulated above. The split between the Quechua I, including Pacaraos, and the Quechua II dialects is deep and not likely the result of a slow dialectal spread. The northern Peruvian dialects Cajamarca and San Martin, and Yauyos Quechua as well, form a branch of the Quechua II cluster. This may be interpreted as support for the view that they are the result of early Huari expansion, as we will argue below. The loss of morphological complexity observed in the Ecuadorian dialects sets them clearly apart as a subgroup within QII. It is noteworthy that Kallawaya groups with them in this respect.
Kallawaya does not resemble the surrounding Southern Quechua varieties, in contrast with most remarks in the literature, namely that Kallawaya would simply be a relexified variety of Bolivian QIIc. Its intermediate position next to the Ecuadorian Quechua varieties probably results from processes of simplification in the Quechua varieties that were the input to Kallawaya, similar to what happened in Ecuador.
4.3 The QII cluster and Ecuadorian Quechua
The QII varieties of the southern branch (spread from Ayacucho in the Peruvian highlands to northern Argentina) form a genealogical unit together with Cajamarca and Yauyos. However, the distance matrices and the NeighborNet analysis in 4.2 suggest that northern QII languages (including Ecuador and some speakers in southern Colombia) form a group by themselves. The key question concerns the relation between the Peruvian QII varieties and those exported to Ecuador. The varieties of Ecuador have been linked to a lingua franca Quechuan form called lengua general, assumed to have been the Inca imperial expansion variety.
The morphological features of Ecuadorian Quechua, or Quichua, show that it is an off-shoot of QII varieties. It is related to early Chinchay (Torero 1975) or “general Quechua” and has a few features of Cuzco Quechua as well. It was introduced into Ecuador in the Incaic period (see also the testimony of Cieza de León 1984 [1553]), and consolidated during the colonial period. Hocquenchem (2012) argues convincingly that all ethnohistorical and archaeological evidence points to expansion of Quechua into Ecuador during the Inca conquests after 1450 CE. We know of no compelling linguistic arguments which would support an earlier expansion.
Quichua differs considerably from the Quechuan languages it is related to, such as Ayacucho Quechua (Muysken 1977; 2000b). It was consolidated as the community language of the runa peasants of the Ecuadorian highlands during the Incaic period, and particularly the Colonial period. Prior to the advent and genesis of Quichua, other substrate languages were spoken in Ecuador, notably Barbacoan languages in the north and Jivaroan languages in the south. There is some variation in Quichua that may be attributed to different substrates. Table 6.2 presents some of the main differences between Quichua and its Peruvian relatives.
The differences are striking particularly in the nominal domain. Compare verbal constructions (1a) and (1b), where the basic person indexing is maintained, to (2a) and (2b). Ecuador has lost nominal person indexing:


The consequences of this loss are also crucial in the nominalization and hence subordination domain; cf. the contrast between (3a) and (3b):

However, not all Amazonian varieties have lost person marking on nominals, in nominalizations, and in adverbial clauses; cf. the following examples from Waters (1996: 167–169), of equivalents in (a) Pastaza (Peruvian Amazonian Quechua) and (b) Napo (Ecuadorian Amazonian):
(4)

(5)

(6)

Thus the changes in the northern varieties (including Peruvian Amazonian), separating them structurally from all Peruvian Quechua varieties, cannot be uniquely attributed to the loss of nominal person marking.
4.4 Origin and history
Given the results from the NeighborNet analysis and the earlier literature, the scenario which we consider most likely for the origin and spread of Quechua contains the following elements.
Origin. As argued in Adelaar (2012a) and Muysken (2012a), Quechua emerged through interaction with Aymara. The precise details of the interaction are a matter of dispute and need further investigation. The overall typological profile of what we may think of as very early Quechua suggests affinities with languages in the north central Andes such as the Barbacoan and Jivaroan families. This reflects a possible northern origin for Quechuan.
Initial spread. The evidence regarding internal variation in the Quechua language family points to an initial dispersal from the central Peruvian highlands. First, varieties which later gave rise to QII moved towards the south (to roughly the Ayacucho area, associated with the Huari civilization). The original area remained highly differentiated, with some varieties later emerging as the current QI languages (including Pacaraos).
The Huari civilization. Beresford-Jones and Heggarty (2012b: 57–84) argue that the Huari horizon is associated with the expansion of Quechua.4 We agree with this and assume that later consolidation and spread of QII varieties was linked to the Huari civilization. In the Huari period terrace construction gained momentum, increasing the potential for maize cultivation through stone heat retention, but it requires substantial labor and thus population density. Huari settlements were very grid-like and regular in shape. Huari expansion was achieved through military rule rather than through extensive movement of people. We follow Adelaar (2012b) in the idea that Cajamarca and related Quechua varieties were an off-shoot from the Huari civilization.
The Inca Empire. The later spread of Quechua in the Inca period, both northward into Ecuador and southward into Argentina, can be linked to actual population movements. It is not evident that the Inca actually imposed Quechua on their subjects. There was very extensive movement of people, involving various types of subjects, during Inca rule, and in terms of numbers the mitmaqkuna were the most important. Mitmaqkuna were extended families or ethnic groups resettled by the Incas by force from their home territory to recently conquered areas.5 The Inca occupation of northern Argentina took place in the middle of the fifteenth century.
The Spanish occupation. According to Cook (1998) the Andean population was 9 million at the time of conquest, and this went down to 600,000 in 1620, an incredible devastation. The figure of 370,000 can be given for Peru in 1730, which increased to 1.5 million in 1876, and 2.9 million in 1940. At present the population is somewhere near its size at the time of conquest. Thus the current distribution of languages does not necessarily represent the original situation. Quechua and Aymara continued to spread into the foothill regions of Ecuador, Peru, and Bolivia during the colonial period. An overview is given in Figure 6.2.

Figure 6.2 The distribution of languages per region, over time (Q = Quechua)
Whether the Chavín and Huari/Tiahuanacu horizons need to be linked to language spread (Beresford-Jones and Heggarty 2012b) is an open question. The Chavín horizon is very early, 900–200 BCE, and is believed to have been limited in its expansion. It pre-dates the generally believed presence of both Quechua and Aymara in central Peru, which began around 200 CE. As such, Chavín may be linked to pre-proto-Quechua, as has been argued by Heggarty (p.c.). On the other hand, linking Chavín with pre-proto-Aymara, as is done in Beresford-Jones and Heggarty (2012b), is not incompatible with (1) the idea that Quechua was an invading language coming from the north, maybe filling the gap after the fall of the Chavín cultural complex (Adelaar 2012a), and (2) the spread of Aymara down the coast into the Nazca area, where it became the language of the Nazca culture (100 BCE–700 CE). This may also have been the avenue for the attested presence of Aymara in the highland area around 1200 CE, which makes it plausible that Aymara was one of the languages spoken in the Tiahuanacu realm, in addition to Puquina and Uru. There is no evidence that Quechua was present in southern Peru until the end of the Tiahuanacu era, and this raises the question why by 1300 CE Quechua was chosen as the imperial language by the (supposedly Aymara- or Puquina-speaking) Inca elite. The most plausible reason is that large parts of the south central area of Peru were Quechua-speaking by that time, making it likely that during the earlier Huari horizon (500–900 CE) Quechua was spoken in this area, probably in addition to Aymara. This squares with the idea that Quechua- and Aymara-speaking groups, in a herding and agriculture symbiosis, moved southward from north central Peru into the Ayacucho area between 300 and 600 CE.
5 The relation between Quechua and the other languages in the Andean matrix
Our sample of seventeen Quechuan dialects and Kallawaya, set against three Aymaran and two Uru-Chipaya languages as well as Mochica and Cholón, makes any comparison highly skewed towards the Quechua data. However, the distance matrices clearly show that we have introduced a number of unrelated languages. The low global average we have seen in our comparison of the Quechua dialects in the Noun Phrase and Argument realization databases, around 0.12, now jumps to 0.18. The Quechua dialects show scores ranging from low to high (0.05 to 0.35), reflecting little internal variation and a larger distance from the non-Quechua languages, and the same holds for the Aymaran dialects (0.07 to 0.38). The other languages show a narrower range: Mochica, for example (0.32 to 0.43), is unlike any other language in the sample. If we combine the features of the Noun Phrase and Argument realization databases and represent them in a NeighborNet (Allandean NPArg), we see that the whole Quechua cluster, internally relatively unstructured, is set apart from Cholón, Mochica, and Uru-Chipaya and is somewhat closer to the Aymaran dialects.
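The distance scores and the global average reported here can be illustrated with a minimal sketch. It assumes, following the definition given in note 2, that the distance between two varieties is the share of coded features on which they differ, and that the global average distance is the mean over all pairs. The function names, the handling of missing values, and the toy feature codings are hypothetical and are not taken from the chapter's database.

```python
from itertools import combinations


def feature_distance(a, b):
    """Share of features on which two varieties differ.

    Features coded None (missing) in either variety are skipped;
    this treatment of gaps is an assumption, not the chapter's stated method.
    """
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        raise ValueError("no comparable features")
    return sum(x != y for x, y in shared) / len(shared)


def global_average_distance(varieties):
    """Mean pairwise distance over all pairs of varieties in the sample."""
    pairs = list(combinations(varieties, 2))
    total = sum(feature_distance(varieties[p], varieties[q]) for p, q in pairs)
    return total / len(pairs)


# Hypothetical toy codings for ten binary features (not real data).
toy = {
    "variety_A": [1, 0, 1, 1, 0, 1, 0, 1, 1, 0],
    "variety_B": [1, 0, 1, 1, 0, 1, 0, 1, 1, 1],
    "variety_C": [0, 1, 1, 0, 0, 1, 1, 1, 0, 1],
}

print(feature_distance(toy["variety_A"], toy["variety_B"]))
print(global_average_distance(toy))
```

In the toy data, varieties A and B differ on one feature in ten, which corresponds to a distance of 0.10 in the sense of note 2; adding the more divergent variety C raises the global average, in the same way that adding unrelated languages raises it in the Andean sample.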
It is interesting to see that the addition of the twelve features of the Andean questionnaire to the combined features of the Noun Phrase and Argument realization databases into AllAndean not only adds a lot of structure to the Quechuan family with evidence of a clear QI, QIIc, and QIIb group, but at the same time makes the Aymaran dialects shift to a position much closer to the clearly visible QI subgroup: see Figure 6.3.

Figure 6.3 NeighborNet representation for the relative distances of the different Andean languages discussed in this chapter
A number of other conclusions may be drawn. Obviously, the completely different typological make-up of Mochica makes it an outlier. However, Cholón is also much less “Andean” than sometimes suggested, despite obvious Quechua borrowings. Uchumataqu (Uru) and Chipaya cluster together, but the split was not as early in the history of the Altiplano as their structural separation in the splitsgraph would lead us to believe. This may be the effect of language attrition and death in Uchumataqu.
We will briefly discuss the different language groups, starting with the most important one, Aymaran. The intermediate position of the Aymaran varieties in the NeighborNet graph calls for an explanation.
5.1 Aymaran
Apart from two very small and rather distinct older branches of the family in central Peru (Jaqaru and Cauqui), Aymara is spoken in a contiguous area in the south of Peru and northwest of Bolivia, with very little internal differentiation, as far as we know. Given the many similarities and limited structural distance between Quechua and Aymara, questions concerning the relation involve both ultimate relatedness and further contacts between the two families.
Ultimate relatedness. How similar are Quechua and Aymara, and what light does this shed on their genealogical relationship? Both proponents and opponents of such a relationship accept that, if the two are related, the split dates back to at least 2000 BCE, if not much earlier. Adelaar with Muysken (2004: 34ff.) argue that the evidence points to a separate origin for the two language families, and we have no reason to assume differently here. So, unless clear genealogical links of either one of the languages with languages spoken to the north would make a scenario of the movement of either language family to Central Peru plausible, for our current state of knowledge of the early history of the central Andes, this does not make much of a difference. However, from a linguistic point of view it does, since it is a kind of test case for the extent to which languages may converge or split.
Aymaran, QI, and QII. Adelaar assumes that Quechua coexisted in a (pre-)proto-form with (pre-)proto-Aymara-speaking populations, to which this proto-Quechua adjusted its form. Much later this transformed Quechua began its spread during the Huari horizon, both to the north (to the central highlands, where it became a superstrate on Aymara dialects, and to the coast) and to the southeast (Cuzco). However, this expansion can only have involved QII varieties. There are at least two points in which Aymara and QI strongly differ from QII: directionals in verbal derivation and verbal plural cross-reference marking. The Aymaran system is richer than the QI system but both have directional and aspectual suffixes. QII manipulates a number of these suffixes but only one has retained a directional meaning, apart from an aspectual one like the others; -yku, -rqu, -rpa. This does not look like a real innovation but more like a slow loss of complexity. Plural marking is a parallel case. It is an innovation in the QII dialects to create an extra slot for number marking in the verbal matrix and to get rid of the fairly complex event plurality as it is marked in the Aymara and QI dialects. Since Cajamarca, San Martín, and Yauyos share this innovation we must assume that they were later split-offs from Huari (Ayacucho) varieties rather than early remnants.
Substrate. Another intriguing aspect of the Quechua/Aymara contact is the fact that Aymara may have disappeared without leaving many traces when overlaid by Quechua. One of Adelaar's arguments for assuming a much wider Aymara presence is the occurrence of a few Aymara lexical elements in the north-central Quechua-speaking area, where Aymara disappeared in time immemorial. A much more recent case is the disappearance of Aymara in southern Bolivia, where it was widely distributed well into colonial times and where it disappeared without leaving noticeable traces, for example in the Northern Potosi area.
Has the spread of QII southward led to Aymaran substrate influence in more southern varieties such as Cuzco and Puno? Regarding Aymaran influence, it has often been assumed that the glottalization and aspiration of initial stops in the QII varieties found in Cuzco and Bolivia (where historical sources suggest the earlier presence of Aymara) may reflect an Aymaran substrate. Likewise, the use of separate lexical dependent clause markers in southern varieties of QII, such as chayqa ‘that’ and hina ‘like’ may possibly be linked to Aymaran influence. However, a systematic exploration of possible syntactic convergence of southern varieties has not yet been undertaken.
5.2 Uru
The Uru languages Chipaya, still spoken, and Uchumataqu, which survived until recently, were spoken in parts of the Altiplano and the Lake Titicaca basin, mostly on the Bolivian side (Adelaar with Muysken 2004: 362–363). There is no evidence of population movements for the group as a whole, although it is clear that the range of communities where Uru languages were spoken was once much wider than the present, quite reduced aquatic zones along the lakes and rivers of the Altiplano. There is evidence of earlier structural influence of Quechua on the Uru languages, and of current Aymara influence (see e.g. Muysken 2000a), suggesting the possibility of metatypy (Ross 1999, 2006). Nonetheless the Uru languages are clearly distinct from Quechua and Aymara structurally.
5.3 Cholón
Cholón was spoken until fairly recently in the upper Huallaga valley in northern Peru, and together with related Hibito may have occupied a larger area in the Andean foothills. Alexander-Bakkerus (2005) has documented a number of borrowings from Quechua, both lexical and morphological. In spite of this, Cholón appears as a clearly separate entity in the NeighborNet trees.
5.4 Mochica
Mochica (Muchik) or Yunga was traditionally associated with the Moche or Chimu culture in coastal northern Peru. Mochica is typologically quite distinct from the other languages in the area, and has been linked to the Mayan languages (Stark 1972), without general agreement so far.6 Some of the “Mochica” were probably Muchik speakers, but there were multiple competing polities in Moche (Kaulicke 2012). Cerrón-Palomino (1987) has shown multiple contacts between Mochica and Quechua. This may have led to structural convergence, but Mochica remains clearly distinct.
5.5 Arawakan, Puquina, and Kallawaya
Although it is still very much debated, for the sake of exposition we will operate on the assumption here that elements of what came to be known as Puquina came from an Arawakan language, with an early presence in the Altiplano. Relations between the Quechuan, Aymaran, and Arawakan language families are very complex and may span the last two thousand years. We may distinguish at least three stages in the interaction.
First of all, the possibility has been suggested that Arawakan peoples participated in the culture of Tiahuanacu (300–1000 CE) near Lake Titicaca. However, Tiahuanacu may not have been exclusively Arawakan. Archaeologists Eduardo Machicado, Paul Goldstein, and Sarah Baitzel argue that Tiahuanacu was heterarchical rather than hierarchical, as can be seen in the archaeological remains. There were, for example, different styles of cranial modification. In Tiahuanacu itself there were specialized workshops with different food consumption patterns, suggesting multiethnic constituency. Tiahuanacu expansion was through people, with non-contiguous settlements and several co-existing styles in diasporic enclaves, which suggest that settlements were themselves multi-ethnic.
Further north in the Andes, the presence of monkey imagery in Nazca cultural representations (the famous Nazca lines are dated 400–650 CE) may likewise be linked to Amazonian, possibly Arawakan, influence in coastal Peru. Monkeys were not present on the coast, but they appear in the lines in the desert as well as in pottery motifs.
It is clear that there were Arawakan words in Cuzco Quechua, like unu ‘water,’ as well as the term for month, -quiz. This suggests an early influence on Inca civilization from Arawakan societies. Arawakan words have also been found in the Chilean coastal language Mapuche, further to the south, and Bertonio's Aymara dictionary contains words from Puquina. Although it is clear that there was a strong Arawakan influence in the Andes starting at least 1 CE, this does not imply necessarily that there were large Arawakan-speaking populations. This is compatible with the idea that Puquina, identified above as one of the important languages in southern Peru and Bolivia in the pre-Inca and Inca periods, contains a substantial Arawakan component but is not a fully Arawakan language. Puquina is most likely one of the important languages of Tiahuanacu.
Rodolfo Cerrón Palomino argues that Puquina was the early language of the Incas. His evidence for this comes from the names of the first members of the dynasty, from the names for the rituals in Inca civilization, and from the practice of sun worship. Cerrón Palomino also assumes that the term Colla, now used for the Altiplano Aymaran populations, originally referred to Puquina.
Cesar Itier's work on Amarete Quechua in northern Bolivia may reveal still more traces of Puquina vocabulary. In the colonial period there were intensive relations between the Arawakan language Amuesha and local Quechua in the central Peruvian foothills. Currently, Quechuan and Arawakan Campa languages are in close contact in southern Peru.
Of course the most famous example of Puquina–Quechua interaction is Kallawaya, the lexicon of which contains a number of Puquina elements, and the grammar of which, even though primarily Quechua, also has some unusual features, particularly in the early sources (Muysken 2009). Kallawaya is the (almost) extinct ritual language of the Charazani region (northern Bolivia). Katja Hannss is currently exploring the Kallawaya lexicon, attempting to find more Puquina roots. A typical example of the differences between Kallawaya and Quechua is found in (7), cited from Oblitas Poblete (1968: 40). Bold elements are Quechua endings.
(7)

In our interpretation, Kallawaya may be an example of a language changed by processes of metatypy (through which Puquina was gradually restructured under the influence of Quechua), and then its structural frame was fully replaced by Quechua. Quechua underwent relexification with words from Puquina and other languages. Table 6.3 summarizes a proposed history of the complex historical relationships of the region.
Table 6.3 Putative historical development of language use in Kallawaya villages in the Charazani region

6 Stability in the Quechuan family and links to other languages
The time depth of 2000 years postulated for the Quechua family and the internal differentiation of the family into numerous documented branches encourage us to address the question of stability: To what extent are the structural features of the Quechua languages shared by all branches of the family? Can we discern particular components where more and fewer changes have occurred? The division of Quechuan into several branches is based primarily on lexical, morphemic, and phonological criteria. Altogether, there are only a few aspects of morphosyntactic organization that distinguish the different branches.
Quechuan has remained surprisingly stable grammatically, as noted by Parker (1969: 130): “however, the texts available for many dialects show a very high degree of syntactic uniformity except as regards restructuring which has resulted in certain dialects from the borrowing of prepositions and conjunctions from Spanish.” This is illustrated in Table 6.4.
Table 6.4 The forms reconstructed for Proto-Quechua by Parker (1969)

Thus it is striking that on the whole so many features can be reconstructed in the family. Sometimes, the actual forms have changed, but, as we indicated above, the more abstract underlying categories persist. Several explanations can be given for this:
(a) high population density in the region, and subsequently intensive exchange and trade, keeping branches in contact;
(b) internal movements of Quechua peoples within the Inca state, as the result of resettlement policies;
(c) structural similarities at the outset between Quechua and Aymara and possibly other languages, and hence little syntactic change when an Aymaran-speaking population shifted to a Quechuan language, or vice versa, leading to many reconstructable features.7
To discuss this last possibility we may ask ourselves to what extent the most stable features in Quechua have counterparts in Aymara and the other Andean languages, which may have consolidated these features (see Table 6.5).
Table 6.5 Features in more than twelve of the seventeen Quechuan varieties in our database, as compared to their occurrence in Aymaran and other Andean languages

The fact that we find structural overlap with Cholón, Uru-Chipaya, and Puquina may also reflect the influence of Quechuan on these languages. In the case of Cholón this is reflected in the morphological borrowing of additive -pit (Q -pis), apparently with the same range of meanings as in Quechua.
7 Conclusions
We hope to have clarified the issue of the origin and spread of the languages of the Andes by exploring their similarity in terms of structural properties. A number of conclusions can be drawn. First, the Quechua language family is quite coherent and stable in many respects. Second, there is structural evidence for a QI/QII split rather than for a more wave- or network-like configuration within the family. Third, the Aymaran languages are clearly set apart from Quechuan varieties as a group, but, at the same time, the structural distance between the northern varieties of Quechuan and those of most of Peru and Bolivia is larger than that between Aymaran and e.g. Quechua I. The other Andean languages clearly have separate structural profiles, even though they have undergone influence from Quechuan and Aymaran.
Appendix
Table 6.6 Quechua and Andean feature questionnaire




We are grateful to Loretta O’Connor for comments on various earlier versions of this chapter, to Willem Adelaar for several suggestions, and to Harald Hammarström for technical support with the NeighborNet graphs. We also want to acknowledge discussions with Andeanist colleagues, notably Rodolfo Cerrón-Palomino, and Paul Heggarty. None of these people are responsible for the errors of fact and interpretation in this chapter, of course.
1 (Van de Kerke for Bolivian Quechua and Muysken for Ecuadorean Quechua.) Our main sources for the ethnohistorical and archaeological data are the essays in Heggarty and Beresford-Jones (2012).
2 Here a 0.10 value implies that two varieties diverge in only 10 percent of the features in the sample, while the global average distance (GAD) gives an indication of the overall variation between all of the varieties in the sample. A graphical representation of these distances is given in NeighborNet graphs, which give a good visual representation of underlying dependencies.
3 ac = accusative; af = affirmative; ag = agentive; ben = benefactive; ds = different subject; fu = future; ge = genitive; nom = nominalizer; pr = progressive; pro = pronoun; re = reflexive; ss = same subject; to = topicalizer.
4 Heggarty (p.c.) has later suggested associating QI with the Chavín culture, and QII with Huari in the Middle Horizon.
5 Other transplanted groups include Aqllakuna (women removed from their native homes at a young age and brought to state facilities called aqllahuasi, where they learned various crafts) and Yanakuna (people taken out of the ayllu system to work for the Incas as servants).
6 A new ERC project (Willem Adelaar, PI), is exploring possible links between Mochica and Meso-America.
7 Several cases of such shifts have been documented (Torero 1987).
7 The Arawakan matrix
This chapter investigates the cultural and linguistic characteristics of the ethnolinguistic groups in the Arawakan language family, particularly relating to situations of contact and exchange within and outside the family. In 1492, Arawakan languages were distributed from the Greater Antilles in the north to the Gran Chaco area in the south, and from the Amazon River mouth in the east, to the eastern Andean slopes in the west. The Arawakan languages expanded successfully across the South American continental land mass during pre-Columbian times as part of a powerful cultural complex characterized by intensive contact and exchange with neighboring groups: the Arawakan matrix, which this chapter aims to investigate and map. Geographic Information Systems (GIS) and various phylogenetic methods are used to explore the spatial and temporal distribution of cultural and linguistic features of Arawakan-speaking people, to gain a more complete picture of their expansion. The chapter also adds to our current theoretical knowledge about the sociocultural mechanisms of the Arawakan diaspora and the spatial distribution of particular linguistic features characteristic of the Arawakan language family.
1 Introduction
The study of the expansion of Arawakan languages across prehistoric Amazonia has much to gain from the integration of linguistic (Danielsen) and archaeological (Eriksen) perspectives. Previous studies of the Arawakan language family have revealed that its members possess not only highly characteristic lexical and structural features (see Payne 1991; Aikhenvald 1999a; Danielsen et al. 2011) but also a set of cultural features clearly distinguishing them from their indigenous Amazonian neighbors (cf. Hill and Santos-Granero 2002; Eriksen 2011). In order to understand the means and timing of the Arawakan expansion, it is therefore necessary to integrate findings from ethnography, archaeology, and linguistics.
Investigating the timing of the expansion of the Arawakan language family in Amazonia is more difficult than mapping the expansion of archaeological cultures and associated language families in e.g. the Pacific, where the Austronesian languages and the material culture of the speaking communities can be nicely plotted from island to island as the communities migrated across the Pacific, carrying with them material culture as well as language (Gray and Jordan 2000). In contrast, the lexical, structural, and cultural features of the Arawakan family had to be navigated through a cultural landscape already fully populated by such features belonging to other ethnolinguistic entities, making constant negotiations and renegotiations between the speakers an unavoidable component of the Arawakan expansion.
Strikingly, the geographic distance between Arawakan languages predicts only 7 percent of the typological distance between them (the so-called “isolation by distance” measure), indicating that there were contacts between members of the family until fairly recently. The time depth of the ultimate diversification cannot be very great (Danielsen et al. 2011: 183f). This means that the Arawakan languages expanded relatively late in the prehistoric sequence, i.e. during a period when Amazonia had long since experienced advanced ceramic manufacture, intensive crop cultivation, and hierarchical social organization (see below).
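A figure of this kind can be read as the share of variation in pairwise typological distance that is accounted for by pairwise geographic distance. The sketch below shows one simple way such a share could be computed, as the squared Pearson correlation between the two sets of pairwise distances. This is an assumption made for illustration: the chapter does not state how the 7 percent figure was obtained, and the function names and toy values are hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least two values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


def share_of_distance_explained(geographic, typological):
    """Squared correlation: the share of typological distance 'predicted' by geography."""
    return pearson_r(geographic, typological) ** 2


# Hypothetical pairwise values for five language pairs (not real data).
geo_km = [200.0, 800.0, 1500.0, 2400.0, 3100.0]
typ_dist = [0.30, 0.22, 0.35, 0.28, 0.33]

print(round(share_of_distance_explained(geo_km, typ_dist), 2))
```

Because pairwise distances drawn from the same set of languages are not independent of one another, the significance of such a correlation is usually assessed with a permutation procedure such as a Mantel test rather than with a standard regression test.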
The title of this chapter, the Arawakan matrix, refers to the set of cultural features – material as well as non-material – identified in multidisciplinary investigations of Arawak-speaking societies as the set that “constitutes simultaneously the background, framework, and source of information that informs the sociocultural practices of the members of a given language family” (Santos-Granero 2002: 42). The term was coined by Santos-Granero (2002: 42ff.) to refer to a set of five key Arawakan non-material cultural features in societies across Amazonia (Map 7.1):
(1) suppression of endo-warfare,
(2) a tendency to establish sociopolitical alliances with linguistically related groups,
(3) a focus on descent and consanguinity as the basis of social life,
(4) the use of ancestry and inherited rank as the foundation for political leadership, and
(5) an elaborate set of ritual ceremonies that characterizes personal, social, as well as political life.
By conducting a large-scale GIS-mapping of pre-Columbian material culture across Amazonia, Eriksen (2011: 9) was able to add four further points, linked to material culture, to the list:
(6) various types of high-intensity landscape management strategies as the basis of subsistence (cf. Hill 2011),
(7) a tendency to situate their communities in the local and regional landscapes through the use of such techniques as “topographic writing,” ceremonial earthworks, extensive systems of place-naming, or rock art (cf. Santos-Granero 1998),
(8) an elaborate set of rituals including a repertoire of sacred musical instruments and extensive sequences of chanting, often performed as part of place-naming rituals (cf. Hill 2007),
(9) a proclivity to establish settlements along major rivers and to establish trade and other social relations through river transportation (cf. Hornborg 2005).
The current investigation seeks to map the timing of the expansion of these nine features, alongside a similar mapping of the linguistic features of Arawakan languages, thus seeking a detailed, multidisciplinary understanding of the Arawakan expansion. A linguistic database of Arawakan features was created by Danielsen using complex linguistic questionnaires.
Our central theoretical assumption is that the best way to explain the interplay between non-material culture (points 1–5 above), material culture (points 6–9 above), and language features is to view these all as part of one single phenomenon: the ethnic identity of Arawak-speaking communities. Ethnic identities and ethnicity in indigenous Amazonia have recently been explored as an important interdisciplinary field of research (Hornborg and Hill 2011). This involves the formation and renegotiation of Amazonian ethnic identities – the concept of ethnogenesis – (Hill 1996; Hornborg 2005; Hornborg and Eriksen 2011), and results in a new understanding of the multitude of ethnic identities in Amazonia and their role in situations of contact and exchange between indigenous groups. Here, the concept of ethnogenesis is used as a tool to understand the process of spreading of components of the Arawakan matrix to new groups through sociocultural, material, and linguistic exchange, thereby integrating other Amazonian groups into the Arawakan identity, a process inevitably leading to the incorporation of new cultural and linguistic elements among Arawak-speaking communities, and ultimately to a renegotiation of the Arawakan cultural and linguistic identity by small, constant changes and updates of the cultural matrix.
2 The ecology of the Arawakan expansion
2.1 The Amazonian pioneers
Since the first voyage of Christopher Columbus and his followers, South American landscapes, and particularly the Amazon region, have been thought of as the ultimate example of pristine wilderness, encompassing a unique example of rich biodiversity with little human disturbance. Informed by research in anthropology, archaeology, historical ecology, and soil science since the 1980s, the scientific community has slowly adjusted this image towards a view encompassing substantial human influence in the species composition of the world's largest area of tropical rainforest. Since Balée's (1993: 231) finding that up to 12 percent of the Amazonian ecosystem is of anthropogenic origin, scholars have noted that much of the “pristineness” of Amazonian forest is actually an effect of the demographic collapse of the indigenous populations following in the wake of the European colonization. Furthermore, archaeological investigations reveal large-scale earthworks, subsistence systems, and settlements, confirming the hypothesis that the sparsely populated Amazonia of the historical period is a relatively recent anomaly when contrasted to the socio-economic development of the region during the last 3,000 years.
Human subsistence strategies in Amazonia have for at least 9,000 years involved domesticated crops (Piperno and Pearsall 1998: 4; Oliver 2008: 208). When small bands of hunter-gatherers at that time began the domestication of bitter manioc (Manihot esculenta Crantz), it marked the starting point of a landscape modification process that was to continue until the demographic collapse following the European colonization some 8,500 years later.
By 7000 BP1 the indigenous societies along the lower Amazon and the present Brazilian Atlantic coastline were producing ceramic vessels and shell middens, forming the earliest centers of pottery production in the New World. Along the middle and lower Amazon, the archaeological sites of Dona Stella, Pedra Pintada, and Taperinha show initial signs of horticultural activities between 8000 and 7000 BP (Roosevelt et al. 1996; Piperno and Pearsall 1998: 4; Petersen et al. 2004), and at Taperinha forest clearing is indicated by 7000 BP, and clearly documented at Lake Geral, a site located in the same region dated to 5760 BP (Bush et al. 2000). These early signs of food production and accomplished material culture soon spread through a regional exchange system operating along the coastline between the mouth of the Amazon and Orinoco Rivers (Eriksen 2011: 127f). Along the coastline of present-day Colombia, Venezuela, and Guyana, the location of a number of shell mounds with a characteristic lithic assemblage labeled the Ortoiroid series, sharing similarities with the above-mentioned sites of the Amazon river region, indicates the establishment of a wide-reaching exchange system already at this point in time (Boomert 2000: 74).
Sometime between 6500 and 5250 BP, the art of ceramic manufacture was exchanged between the lower Amazon and the Guyana coastline, an event marked by the establishment of the Late Alaka phase of the latter area (Evans and Meggers 1960; Roosevelt 1997: 360; Plew 2005: 13). These two areas continued to share similarities when the Mina phase (5500–4000 BP), another archaeological complex producing crude ceramics and shell mounds, was established along the lower Amazon and at the coastline south of the river mouth (Simões and Araujo-Costa 1978; Roosevelt 1997). Without losing ourselves in details of the early indigenous material culture, subsistence strategies, and exchange systems of northern South America, it is safe to say that many of the early accomplishments in these socio-economic domains took place through the sharing of important achievements between different groups separated by rather large geographic distances, a mechanism in itself indicative of what was to come.
2.2 The birth of the Arawakan matrix
The complexity of pottery production grew steadily along the Guyana coastline and the Orinoco River, a process leading to the establishment of technologically more complex and stylistically elaborated wares in the form of the Saladoid² and Barrancoid³ series along the Orinoco River by around 3000 BP. By that time, agriculture had also been substantially intensified in the same region. Along the Orinoco, the Saladoid- and Barrancoid-producing societies developed a technology for soil fertilization based on the addition of ash, charcoal, and domestic waste to the soil, with increased microbial activity and improved fertility as the result (Oliver 2008: 211; see also Arroyo-Kalin et al. 2009 for technical specifications). This process created sustainable conditions for high-intensity food production, and the black, fertile soils (also known as terras pretas or Amazonian Dark Earths [ADE]) spread widely across Amazonia between 900 BCE and 1500 CE (Lehmann et al. 2003; Glaser and Woods 2004; Woods et al. 2009). Apart from the addition of charcoal and ashes to the agricultural lands, burnt tree bark (Licania sp.) was also being added to the ceramics as a potent tempering material for increased solidity of the vessels. West of the Orinoco River, on the sedimentary soils of the seasonally flooded savannas of the Llanos, proper drainage of the soils was a bigger challenge than lack of available nutrients.
In this area agricultural intensification took place through the construction of elevated cultivation surfaces, so-called raised fields or camellones, which raised parts of the otherwise flat landscape to improve soil conditions, drainage, water management, and nutrient production, and thereby agricultural productivity (Denevan 1970; Darch 1983; Erickson 2006: 251).
The refinements of agricultural technologies and pottery production were not isolated technological advancements, but, as we will argue below, part of a cultural package that was just beginning its march across Amazonia. Interestingly, the cohesive links of this cultural package were not the technological advancements or the surplus production (even though they were both intrinsic parts of it) but language, and more particularly languages of the Arawakan language family. At the time of European contact,⁴ at least sixty Arawakan languages were spoken from the Greater Antilles in the north to northern Argentina in the south, and from the mouth of the Amazon in the east to the eastern Andean slopes in the west (Grimes 2009 lists fifty-nine documented Arawakan languages, not including several extinct ones) (Map 7.1). Arawak-speakers across South America and the Caribbean are united by two main factors: (1) the genealogical relationship of their languages, i.e. their descent from a common proto-language, and (2) a shared set of cultural features, i.e. both material and non-material attributes.
By 1500 CE the Arawak-speaking groups inhabited numerous seasonally flooded environments of the South American tropical lowlands with raised field agriculture or similar technologies, including the Llanos of Venezuela and Colombia (Spencer and Redmond 1992), the Llanos de Moxos of Bolivia (Erickson 2006), Marajó Island at the mouth of the Amazon (Schaan 2008), and the Guyana Littoral (Versteeg 2008). They were also known as the moderators of an elaborate set of ritual ceremonies with the use of sacred musical instruments and chanting as essential ingredients (Izikowitz 1935; Hill 2009). Apart from this, Arawak-speakers such as the Taino of the Greater Antilles, the Lokono of the Guyana Littoral, the Manao of the central Amazon, the Achagua and Caquetío of the Llanos, and the Moxo of the Llanos de Moxos (just to name a few) were well-known traders carrying out exchange between various ethnolinguistic groups (Eriksen 2011: 275).
2.3 The timing of the geographic expansion of the Arawakan matrix
The main ingredients of the Arawakan cultural package were first brought together in the Orinoco region around 900 BCE. By that time, we find high-intensity landscape management systems in the form of raised fields and terras pretas for agricultural production and causeways for transportation, water management, and possibly also ritual functions. Also present were ceramic artifacts rich in painted and plastic decoration, that is to say features indicative of a rich ceremonial life similar to that documented from Arawak-speaking communities of the historical period (Santos-Granero 1998; Heckenberger 2008; Hill 2011). Interestingly, the rich ceremonial life of the Arawak-speakers of the historical period was essentially constructed around two themes: (1) the presence of elaborate techniques for physically and ritually domesticating their surrounding landscapes, and (2) the use of fire in processes of landscape domestication and during other types of ritual ceremonies.
As described above, by-products of fire such as charcoal and ashes were an essential part of the subsistence strategies and ceramic manufacture of the Orinoco region already by 900 BCE. The elaborately decorated ceramics of the Saladoid and Barrancoid series act as indicators of the elaborate ceremonial life of the Orinoco communities, and, interestingly, the rich ethnographic record of Arawak-speaking communities across Amazonia shows that the use of tobacco smoke by Arawakan shamans is considered an essential aspect of ritual ceremonies, including healing processes. Thus, an image of a cultural package capable of transforming the landscape into a highly productive resource, while at the same time providing a powerful ceremonial life for its members, now arises through the archaeological, anthropological, geological, and historical records.
The combination of high-intensity landscape management strategies and a rich ceremonial life would prove to be highly successful during the centuries to follow. By 400 BCE the first evidence of terra preta farming appears in the central Amazon, and shortly thereafter the first earthworks of the Llanos de Moxos represent the initial signs of landscape modification in this region. Judging by the great ecological differences between the habitats colonized by the subsistence strategies of this emergent regional system, there was a great deal of adaptability within the communities involved in this process. In much of the central and lower Amazon, large surfaces of fresh sediments rich in available nutrients, annually washed down from the Andes, were available and formed excellent conditions for highly productive agriculture. These so-called várzeas eliminated the need for raised fields or terras pretas in many areas of central Amazonia and, interestingly, Versteeg (2008: 305) notes how the raised fields can be compared to artificial várzeas that are subjected to controlled inundation, bringing nutrient-rich sediments to the elevated surfaces during parts of the year.
By 200 BCE Barrancoid pottery and terra preta farming were present along the Ucayali River in the Peruvian Amazon (Lathrap 1970: 117; Eden et al. 1984: 126), and at around 100 BCE archaeological dates of the huge earthwork complex of Acre, northwest of the Llanos de Moxos, begin to cluster (Saunaluoma 2010: 106). The geometrical earthworks of Acre, also known as geoglyphs, are perhaps the most visually stunning example of the ceremonial aspects of earth-moving that had been crystallizing across Amazonia from about 900 BCE. The Acre geoglyphs so far discovered consist of more than 200 (an estimated 10 percent of the total number [Mann 2000]) geometrical figures carved out of the soil by ditches and walls extending up to 3 meters deep and 11 meters wide. The earthworks measure up to 300 meters across and their frequency is up to 5 geoglyphs per km² (Hornborg et al. 2013).
The cultural associations of these earthworks are so far unknown, but the presence of pottery related to the Barrancoid series (Saunaluoma 2010: 94), their dating, and the engineering skills and focus on soil moving among the Arawak-speaking communities on the nearby savannah of the Llanos de Moxos make an association with the Arawakan cultural complex highly plausible. As for the functions of the geoglyphs, little is known, but suggestions that they were used for fortification purposes have been made. Although this may be true for the circular structures, particularly those dated to the late pre-Columbian period when military conflicts were expanding across the lowlands (see below), it is unlikely that the communities erected up to five elaborately constructed geometrical structures per square kilometer, with low ditches and little or no defensive capability, because they feared an external threat in the form of spears, arrows, or war clubs. By contrast, the fortified villages of the Arawakan cultural complex documented from the late pre-Columbian periods are semi-circular structures, located on high ground, adapted to the local topography and surrounded by palisades (Rebellato et al. 2009).
Interestingly, new research in the upper Xingú area, another region of Amazonia populated with, and culturally dominated by, Arawak-speakers, is finding support for the notion that Arawak-speakers sometimes devoted themselves to strictly ceremonial domestication of their landscapes. During the late pre-Columbian period, the upper Xingú area had developed integrated patterns of centers organized in multiethnic, or “galactic,” clusters populated by up to 2,500, and perhaps as many as 5,000, inhabitants (Heckenberger 2006: 330; 2008: 955). These multiethnic confederations, referred to as an early form of urbanism by Heckenberger et al. (2008), were integrated by wide road-like causeways resembling the elevated causeways of the Llanos de Moxos, which facilitated cultural, linguistic, and material exchange within and between regions. The landscapes were domesticated by the creation of circular villages with a central plaza and radial road networks with perfectly straight passages connecting a multitude of such population centers to each other in a regional system.
In Arawak-speaking areas where earthworks and other landscape-altering techniques were less prevalent, other strategies of landscape domestication were employed. The Yanesha’, an Arawak-speaking group of the eastern Andean slopes in present-day Peru, apply an intricate system of “topographic writing” in order to maintain an intimate relationship to their landscape (Santos-Granero 1998). Topographic writing is the concept Santos-Granero uses to describe how individual place names (topograms) are connected to extensive systems (topographs) and reiterated, for instance through chanting in ritual ceremonies, in order to strengthen the ties to the local and regional landscape (p. 128). Such ritual place-naming is also well documented from the northwest Amazon Arawakan people (Vidal 2000, 2002; Hill and Chaumeil 2011; Wright 2011) and from Arawakan groups in southern Amazonia such as the Paresi (Schmidt 1917: 21f). Santos-Granero (1998: 132, 139) refers to the landscape domesticating process of topographic writing among the Yanesha’ as a form of proto-writing, also present among other Amazonian groups such as the Paéz (to which it likely diffused through contact with nearby Arawak-speaking communities), a linguistic isolate between the Marañon and Napo Rivers, and the Arawak-speaking Kurripako (Wakuénai) of the northwest Amazon (Hill 1996: 153f; 2002: 235f; 2009: 250).
Returning to the archaeological material, by 300 CE a new ceramic style, labeled the Amazonian Polychrome tradition, had developed out of Barrancoid material along the middle and lower Amazon (Hilbert 1968; Lathrap 1970: 156f; Eden et al. 1984: 137; Myers 2004: 79; Petersen et al. 2004: 9; Eriksen 2011: 181). At the time of European contact, the Arawak-speakers of Marajó Island at the mouth of the Amazon, the Aruã, were still manufacturing an undecorated variant of the Marajoara phase (one of the most elaborate phases of the Amazonian Polychrome tradition),⁵ labeled the Aruã phase (Brochado and Lathrap 1982: 53).
Once again, the significance of burning is reflected in the anthropomorphic burial urns typical of the Amazonian Polychrome tradition, an inventory indicating secondary urn burials in which the cremation of the corpse and the storing of the ashes in the urn were central components. In many instances, even the pottery itself included ashes in the form of caraipé temper, utilized in the Ipavu phase in the upper Xingú (Heckenberger 1996: 136f), the Guarita phase in the middle Amazon (Petersen et al. 2003: 252), the Mazagão phase in Maracá (Meggers and Evans 1957: 596), the Koriabo phase of the Guyanas (Boomert 2004: 259), and, together with crushed sherds, in Marajoara (Brochado and Lathrap 1982: 50). The burial urns, also well known among the northwest Amazon Arawakan people, among the Moxo and the Baure of the Llanos de Moxos (Mann 2000; Métraux 1948c), and among Arawakan groups in the Bolivian Chiquitanía such as the Paunaka (Métraux 1948b), were often stored in caves that could be visited and inspected (Chaumeil 2007: 250ff), indicating close ties to the ancestors and the importance of maintaining a strong relationship to deceased relatives, shamans, and political leaders, a custom typical of the Arawak-speaking communities of the historical period. Another way of maintaining a close link to the ancestors was through ritual consumption of their remains, as illustrated by the Arawak-speaking Guayupe and Sae, who cremated their ancestors and drank their ashes mixed with beer (Kirchhoff 1948: 387f.).
The process of burning was also a central aspect of other Arawak-moderated rituals performed in their sphere of influence. At religious ceremonies performed in the northwest and southwest Amazon and in the upper Xingú area, tobacco smoke is a central element in healing ceremonies conducted by the Arawakan shamans, who blow the tobacco smoke on the patient's body in order to eliminate the patient's symptoms (Hill 2009: 249, 259; Hill and Chaumeil 2011). The blowing of tobacco smoke on patients in shamanic ceremonies has also been reported for the Bolivian Arawakan Paunaka (Danielsen, own observation), and it is also the way the shaman gets in contact with the spirits of the deceased among the Baure (Riedel 2012). According to Goldman (1948: 789), smoke was also blown during funerals, reflecting the association between this custom and the importance of deceased ancestors.
Via the historical and contemporary ethnographical sources, we find another interesting connection between, on the one hand, shamanistic blowing of smoke, and, on the other, the ritual wind instruments also utilized by Arawakan shamans during religious ceremonies. Among contemporary Arawak-speaking communities of the northwest Amazon, the upper Purús River (the Apurinã), and the upper Xingú area, ritual wind instruments play a central role in annual ceremonies and during rites de passage.
Apart from the apparent association to shamanic blowing, the sacred flutes of the Arawakan people also had a striking connection to landscape and fire worth exploring further. According to the legends of the Arawak-speakers of the northwest Amazon, the earth was created from the remains of a mythological hero, Kúwai, after his body had been destroyed in a fire. Besides being the ancestor from whose body the world of humans was created, Kúwai also provided material for the ritual wind instruments used in religious ceremonies. The Yuruparí flutes are artifacts directly derived from the bones of the mythological hero and thus representatives of the ancestors (Steverlynck 2008: 580). In the words of Robin Wright (2011):
After his [Kúwai's] sacrificial death in an enormous conflagration, from the ashes of his body emerged the sickness-giving spirit Iupinai but also a giant tree from which the sacred flutes were made, and it is with these flutes that traditionally the men initiated boys and girls in the major rituals held at the beginning of the rainy season.
Overall, the sacred wind instruments of the Arawakan people were one of their most central characteristics. Sacred flutes have been known to occur among a number of Arawak-speaking groups of the northwest Amazon, including the Achagua, Baniwa, Bare, Cabiyarí,⁶ Kurripako, Maipure, Yucuna,⁷ Pasé,⁸ Resígaro, and Yumana, and they also occur among neighboring non-Arawak-speaking groups who maintain close sociocultural contact with the Arawakan groups (Chaumeil 2007; Wright 2011). Chaumeil (1997, cited in Steverlynck 2008: 579) points to the connection between the sacred flutes complex of the northwest Amazon and the use of ceremonial trumpets by Taino shamans of the Greater Antilles.

Map 7.1 The reconstructed geographical dispersal of the Arawakan and Tupian language families at the time of European contact. For complete references, see Eriksen 2011: 12.
Chaumeil (2007: 265) notes how Arawak-speaking groups dominate the sacred flutes complex throughout Amazonia, and Wright (2011) identifies the sacred flutes as an important element in the expansion of Arawakan languages. Arawak-speaking groups located outside of the northwest Amazon who also use sacred flutes include the Apurinã of the Purús River; the Baure and Moxo in the Llanos de Moxos; the Paresi further west; and the Mehinaku in the upper Xingú. Other groups belonging to the same complex include some Tupian-speaking groups such as Cocama and Omagua, Mundurukú, Tupinambá, and Kamayurá. In the upper Xingú, the complex also spread to the Carib-speaking Kalapalo and Bakairí, who were “Arawakanized” by their Mehinaku, Kustenau, Yawalapití, and Waurá neighbors (Chaumeil 2007: 266).
During female initiation rites among the Arawak-speaking Kurripako (Wakuénai) in the northwest Amazon, the sacred flutes are used during chanting ceremonies lasting up to six hours, in which an enormous series of place names along the rivers of northern South America are reiterated (Hill 1996: 153f; 2002: 235f; 2009: 250). These place names represent nodes in an exchange system once dominated by Arawak-speakers, but they are also part of a geographic network with strong mythological connotations. This exchange network, known as the Kúwai route (borrowing its name from the creator), represents both a trade network constructed on the basis of physical travels over centuries and a collection of mythological places where Arawakan shamans head on their transcendental journeys during séances. Thus, the sacred flutes, the Kúwai routes, and the relationship to the ancestors and the mythological past form a trinity of inseparable components that collectively contributed to a strengthened identity and sociopolitical status of the Arawak-speaking communities.
As in many indigenous cultures around the world, for Arawak-speaking communities the physical and religious aspects of the landscape form a whole. The landscape functions as a single meaningful unit, steadily present in the life and minds of its inhabitants. However, the Arawakan domestication of the landscape was not only meaningful to the Arawakan people themselves but also part of a vast socio-religious and economic exchange system that affected the lives of all inhabitants of northern South America between 1000 BCE and 1000 CE. Together with intensive agriculture, an effective exchange system, and an advanced sociocultural and religious concept based on social hierarchies and ancestor worship, Arawakan languages (which formed intrinsic parts of these three concepts) expanded across an enormous geographic territory on mainland South America and in the Caribbean. The diversity and the power of the Arawakan groups led to the complete or partial adoption of the Arawakan cultural matrix and associated languages by many indigenous groups between 1000 BCE and 1000 CE. At the time of the European arrival Arawakan languages were widely spoken and northern South America showed extensive domesticated landscapes (Map 7.1).
Arawakan culture also came to influence the Andean region substantially, as indicated by the presence of a large number of lowland products brought to the highlands via the Arawak-controlled trade routes (Eriksen 2011: 164). Along the trade routes of the eastern Andean slopes, an ethnolinguistic group known as the Kallawaya transported lowland products with pharmaceutical or hallucinogenic characteristics to the Andean cultures (Rowe 1946: 239; Wassén 1972: 63; Lathrap 1973: 180f; Taylor 1999: 199; Eriksen 2011: 78). The Kallawaya tongue was a mixed language based on the Arawakan language Puquina, the isolated Chipaya language, and Quechua (Gordon 2005, Hannss p.c.). Puquina was a high-status language spoken among the Inca elite (Torero 2002; Dudley 2009: 146). Along the eastern Andean slopes in present-day Peru, a number of Arawakan languages are still spoken, and advanced systems for ritual domestication of the landscape have been documented among the Yanesha’ of that region (Santos-Granero 1998) (Map 7.1). These groups bear testimony to the incredible ability of the Arawakan matrix to maintain its relevance for its users during the sociocultural changes ongoing for centuries.
2.4 The expansion of the Arawakan family from a linguistic perspective
As noted, the Arawakan language family has expanded over a very large area of South America, more than other language families (Map 7.1). A suggested Arawakan language classification is presented in Table 7.1, mainly based on geographic proximity, but also on grammatical features, summarized from Aikhenvald (1999a), Walker and Ribeiro (2011), and Danielsen and Terhart (forthcoming).⁹
Table 7.1 The Arawakan language family

The internal classification of Arawakan languages is difficult to establish (see Facundes 2002), and some reasons for this are discussed in 2.6. This section focuses more on the character of the Arawakan linguistic family as such. Instead of taking the lexicon as the basis of comparison, as was done in other studies (Payne 1991, Walker and Ribeiro 2011), we take grammatical features as our point of departure here and compare our findings to those of lexical comparisons. The similarities of the Arawakan languages in some key linguistic features suggest that the expansion happened rather quickly (see Danielsen et al. 2011). The personal paradigm, for example, is similar in many respects in most Arawakan languages, formally as well as functionally (see Payne, David L. 1987), and it has often served as the first characteristic for assigning the genealogical relationship to a language (Gilij 1780–84). The proto-system of person marking was presumably a “Latin-type” paradigm (see Cysouw 2003: 107): 1sg, 2sg, 1pl, 2pl, 3sg with a gender distinction (masculine/non-feminine and feminine), and 3pl (also applied as a general pl suffix). Person markers are employed to mark the possessor on nouns (prefix), subject on verbs (SA, prefix), object(s) on verbs (suffixes), and the subject on stative or non-verbal predicates (SO, suffix).¹⁰ In addition, free pronouns are derived from these bound personal forms, as well as certain adpositions marked for person.
This general grammatical system can be claimed for all Arawakan languages, and the differences lie mainly in the specific lexemes that make use of a certain kind of marking, such as which nouns are actually part of the category inalienably possessed (with person marking), and which verbs belong to the (active) set with SA marking or the (stative) set with SO marking. There may also be striking differences in the SAP (speech act participants, i.e. first and second person reference) marking system versus the 3rd person in some languages, and the SAP forms tend to be more stable. Here also certain functions only hold for sub-clusters of the language family, such as gender marking or derivation of adjectives or nouns by means of the same personal forms (suffixes) for 3rd person. If we model the Arawakan language family as a NeighborNet splitsgraph only with respect to the person marking system (forms and functions), we get a rather plausible picture of the geographic distribution of the languages and possible migration routes (Figure 7.1).

Figure 7.1 The distribution of the personal paradigms in Arawakan languages¹¹
The star-like splitsgraph in Figure 7.1 shows no clear northern versus southern branching of the language family with respect to the features examined, contrary to what Table 7.1 suggests. There is a tendency for Northern Arawakan languages and for Southern Arawakan languages to group together, and a few sub-clusters can be observed. The same is true for the lexicon, as analyzed in Walker and Ribeiro (2011: 2563). This means that there are some shared Arawakan features almost equally distributed: the personal paradigms to some extent and some of the conservative lexicon. The reason for this presumably goes back to the Arawakan exchange network that functioned for a long time. If the languages had spread like Tupian, we would presumably be able to see certain clear expansion groups, which is not really the case (compare Figure 8.1 in Chapter 8 on the Tupian expansion). However, some groups and subgroups within the language family can be found clustering in the graph in Figure 7.1. Some Northern Arawakan languages cluster, like Lokono with Island Carib, Garifuna, and Paraujano, but also with Resígaro in this graph. While Bare and Tariana appear closely related, Piapoco and Achagua have been separated according to their person marking systems. So, there is some confusion and a geographically unclear picture of northern and central Amazonian Arawakan languages. A similar observation was made about the lexical relations (Walker and Ribeiro 2011: 2563). The Purus group and Campa Arawakan cluster nicely, and the Bolivian Arawakan languages (the Baure, Moxo, and Pauna languages) do so more or less as well. The less Arawakan character of Apolista, Yanesha’, and Chamicuro, presumably due to the Andean influence these languages have undergone, is reflected in their relatively isolated positions in the splitsgraph. This may be a sign of an earlier migration into the region than that of the other Arawakan languages of the Andean foothills.
The complicated position of Resígaro within the Arawakan language family has been discussed in the literature before (Payne 1985). Its odd position among the Northern Arawakan languages in the graph (and also in Figure 7.2) is probably just a sign of a connection at some point in history; this may have been in times when the Arawakan web stretched from the Caribbean to the Andes. The Taino language, excluded from the analysis in Figure 7.1 due to the lack of sufficient data, has lately been described as being, on the one hand, closely related to Northern Arawakan languages such as Island Carib, Lokono, Guajiro, Piapoco, and Achagua (cf. Granberry and Vescelius 2004: 56). On the other hand, it also seems to have certain characteristics found in South-Western Arawakan languages, in particular the Campa group, namely certain nominal (classifying) root formatives (cf. Granberry and Vescelius 2004: 94). Is this another hint at traces of the times of interaction over such distances? The Northern Arawakan language Palikur appears right among the Southern Arawakan languages, from which it is geographically far away. However, being a Brazilian Arawakan language, Palikur seems to display some connection to other eastern and southeastern (Brazilian) Arawakan languages. The latter have not been claimed to form any particular subgroup, but in Walker and Ribeiro (2011: 2563), these languages are included under the name “Central Brazil,” since they cluster in their lexical analysis. In Table 7.1, we call this presumed group South-Eastern Arawakan. The Arawakan languages of the Xingú group (Paresi, Saraveka, Waurá) and possibly those of the Terêna subgroup (Terêna, Kinikinau, Mehinaku) may well be a loose intermediate group with characteristics similar to their northern and northwestern neighbors as well as to their southern genealogical neighbors.
The findings in Granberry and Vescelius (Reference Granberry and Vescelius2004: 55 ff.) also seem to point in this direction, and Walker and Ribeiro (Reference Walker and Ribeiro2011) demarcate Palikur (and Marawan) as a subgroup named “Northeast.” This, however, remains to be further substantiated.

Figure 7.2 Minimum spanning network of the Arawakan language family (also taken from the NeighborNet algorithm, Huson and Bryant 2006)12
To get a better picture of the possible expansion of Arawakan languages, the same feature matrix as used for the NeighborNet splitsgraph in Figure 7.1 can be reduced to a Minimum Spanning Tree (MST), as given in Figure 7.2.
The graph in Figure 7.2 provides us with possible routes of the dispersal of the respective Arawakan languages. Even though this is only one interpretation of the given data as a dispersal route, the scenario is plausible.13 The position of Island Carib and Garifuna shows an excursion of the Arawak-speaking people into the Caribbean Sea, probably at an early stage and starting off from Maipure and Palikur. Another expansion of Arawakan could have led to the northern coast with Lokono at its end. As already mentioned above for Figure 7.1, Resígaro may be a remnant of the Arawakan expansion towards the Andes and a sign that there was still regular exchange at that time between the west (Resígaro) and the east (Lokono and others) of Amazonia through Arawakan peoples.14 In the south, we may conclude that the Moxo languages Trinitario and Ignaciano came into the area through Baure. The relatively conservative character of Baure provides some evidence that it came into the region earlier. It is here also suggested that Apolista and Yanesha’ are part of one migration route, starting off with Chamicuro, which is the northernmost member of this loose group of Andean-influenced Arawakan languages.
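The reduction of a feature matrix to a minimum spanning tree, as described for Figure 7.2, can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the study itself: the feature values below are invented, the distance is a simple proportion of differing features (Hamming distance), and the tree is built with Prim's algorithm rather than with the software used for the figure. The function names are hypothetical.

```python
# Hypothetical binary feature matrix (1 = feature present, 0 = absent).
# The language names occur in the chapter; the values are invented.
FEATURES = {
    "Lokono":     [1, 1, 0, 1, 0, 1],
    "Garifuna":   [1, 1, 0, 1, 1, 1],
    "Baure":      [0, 1, 1, 0, 0, 1],
    "Trinitario": [0, 1, 1, 0, 1, 1],
    "Resígaro":   [1, 0, 0, 1, 0, 0],
}


def hamming(a, b):
    """Proportion of features on which two languages differ."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def minimum_spanning_tree(features):
    """Prim's algorithm over the complete graph of pairwise distances.

    Returns a list of (distance, language_in_tree, newly_linked_language).
    """
    names = sorted(features)
    in_tree = {names[0]}
    edges = []
    while len(in_tree) < len(names):
        # All edges that lead from the current tree to a language outside it.
        candidates = [
            (hamming(features[u], features[v]), u, v)
            for u in in_tree
            for v in names
            if v not in in_tree
        ]
        weight, u, v = min(candidates)  # cheapest edge leaving the tree
        edges.append((weight, u, v))
        in_tree.add(v)
    return edges


if __name__ == "__main__":
    for weight, u, v in minimum_spanning_tree(FEATURES):
        print(f"{u} - {v}: {weight:.2f}")
```

Each edge in the result links a language to its structurally nearest neighbor already in the tree. Reading such edges as dispersal routes is an interpretive step on top of the computation, which is why the tree is only one of several possible readings of the same matrix.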
2.5 The fragmentation of the Arawakan matrix
By approximately 800 CE, the Arawakan languages and the associated cultural complex, here labeled the Arawakan matrix, had reached their maximal territorial extent. By that time, Arawakan languages from the Greater Antilles to northern Argentina and from the Atlantic to the Andes were united by a large and complex regional exchange system. The Arawak-speaking communities had by that time accumulated considerable land-based capital in the form of agricultural earthworks, aquacultural facilities, and infrastructure that was attractive to other indigenous Amazonian groups.15 Since the early centuries of the first millennium CE, another major ethnolinguistic formation had come to the fore in the Tupian language family, which had begun expanding out of its point of origin in the Brazilian state of Rondônia (Rodrigues 1964). Up until then, the geographic distribution of the Tupian family had remained very restricted, despite early internal branching (Eriksen and Galucio, this volume).
The geographic expansion of the Tupian languages took place very differently from the spread of the Arawakan languages. While the Arawakan languages were part of a complex exchange system with strong mythological and ceremonial underpinnings, the Tupian language family was part of an expansionistic military culture. Where the Arawakan societies prioritized ancestry and descent as the bases for political power, the Tupians based their social hierarchies on feats on the battlefield (Eriksen and Galucio, this volume).
Particularly among the communities of the Tupí-Guarani branch, the groups developed an effective ability to absorb cultural traits and technological elements from neighboring groups in order to strengthen their own social status, military power, and agricultural production (Eriksen and Galucio, this volume). Due to these abilities of the Tupian cultures, it was inevitable that the encounter between groups speaking Tupian and Arawakan languages along the shores of the middle and lower Amazon around 700 to 800 CE would lead to extended periods of conflict (Map 7.1). Military aggression was in evidence from the start, and the remains in the archaeological record bear witness to burned villages, destroyed palisades, and ultimately a change in village layout from the circular villages of the Arawakan communities to the linear settlements of the Tupí-speakers documented from the historical period (Heckenberger 2005: 56; Rebellato et al. 2009: 22, 29). The military conflict between Tupí- and Arawak-speaking communities along the main river ultimately led to Tupian control of a large section of the Amazon River from the mouth of the Amazon to the tributaries in eastern Peru by 1200 CE (Map 7.1).
In addition to this, other Tupí-Guarani languages had expanded along the Atlantic coastline, circumscribing the Macro-Jê speakers and restricting their distribution to the Brazilian highlands. The three other expanding branches of the Tupian language family, Mundurukú, Mawé-Sateré, and Yuruna, expanded in the area immediately south of the middle and lower Amazon River, contributing to a strong dominance of Tupian languages in southern Amazonia during the late pre-Columbian period (Eriksen 2011). Overall, the expansion of the Tupian family replaced the Arawakan dominance in many areas of Amazonia and led to the fragmentation of the previously pan-Amazonian character of the Arawakan regional exchange system (Map 7.1).
However, the sociocultural processes through which Tupian languages replaced Arawakan were sometimes more complex than the predatory ethos of the Tupí-speaking groups would lead us to believe. According to several linguists (Cabral 1995; Jensen 1999: 129; Adelaar with Muysken 2004: 432), the structures of the Tupí-Guarani languages (Omagua, Cocama, and Cocamilla) of the upper Amazon indicate that they represent a language shift16 from some non-Tupian language(s) to Tupinambá. This indicates that a new cultural pattern, including both language and material culture, was adopted in the region about 1200 CE. Included in this cultural package was polychrome pottery, locally developed into the Napo and Caimito phases. Epps (2009: 599) has suggested that Cocama and Omagua represent two different language shifts from Arawakan languages to Nheengatú, the Tupinambá-based lingua franca still spoken in the northwest Amazon.
The Tupian expansion at the expense of the Arawakan languages also brought a change in land use. While the Arawakan communities had been heavily sedentary, relying on their earthworks, terras pretas, and aquacultural facilities for long-term subsistence, the military apparatus of the Tupian groups acted as an incentive for more mobile subsistence strategies. Many ethnolinguistic groups of the Tupí-Guarani, Mundurukú, Mawé-Sateré, and Yuruna branches launched annual war expeditions up the major rivers and tributaries. These expeditions, some of which were documented by the early Europeans of the continent, lasted for months and required access to easily transportable food resources. As a result of this, the Tupian groups along the main river came to rely heavily on short-ripening maize varieties that were grown on the annually flooded várzea areas. The Arawakan groups also produced a considerable food surplus, e.g. beer made from maize and manioc, to be consumed during religious ceremonies. Thus, the Tupians could rely on their ability to steal food during their war expeditions and their dominance over tributary populations for part of their subsistence (Santos-Granero 2009).17
It would take another millennium before Amazonia once again experienced an alteration of the landscape similar to the one that took place during the Arawakan expansion from roughly 1000 BCE until 1000 CE. Between 1000 and 1500 CE, the region suffered conflicts and warfare, and the most important socio-economic progress took place in the Andes. During this period, the landscape alteration processes were less intensive than during the first millennium CE. As a consequence of the demographic collapse among the indigenous populations that followed in the wake of the European colonization (an event that eradicated perhaps 90 percent of the population in Amazonia), the anthropogenic environments of Amazonia underwent a reforestation process that in most areas resulted in an advance of the tropical rainforest at the expense of previously maintained grounds. The picture of the reforested Amazonia (a process that was completed in most areas of the lowlands before any Europeans entered) has contributed strongly to the image of Amazonia as an ecosystem with little human historical influence.
2.6 The fragmentation of the Arawakan language family from a linguistic perspective
A subdivision into Northern versus Southern Arawakan is not as straightforward as suggested in Aikhenvald (1999a) (Danielsen et al. 2011; cf. also Walker and Ribeiro 2011: 2563). This is supported by the fact that no major branches can be shown for the language family (cf. Figure 7.1). As we have shown in Section 2.4, Arawakan languages are much alike, at least with respect to selected linguistic features such as the personal paradigms and the lexicon. This then results in the star-like splitsgraph in Figure 7.1. Some other features tend to be more Southern Arawakan, such as the morphological complexity of the verb and applicative marking types on verbs.18 Taking all linguistic features into consideration, however, the picture is much more blurred. The splitsgraph in Figure 7.1 demonstrates that there are not many major splits between groups of languages of the Arawakan family, and the distances between them are relatively even, and much more balanced than a main subdivision into Northern versus Southern Arawakan would suggest. This tells us something about the nature of the Arawakan expansion: Firstly, the Arawakan matrix must have remained intact for quite some time, so that linguistic features could still be exchanged (and spread throughout Amazonia to other languages). This explains why general Amazonian features (see Derbyshire and Pullum 1986: 19; Dixon and Aikhenvald 1999: 8–9) mostly reflect Arawakan features, and why it is almost impossible to find an Arawakan feature that is not also Amazonian or vice versa. Secondly, the expansion of Arawakan was neither unidirectional, nor did it happen in one stroke.
Walker and Ribeiro (2011: 2566) have suggested a more southern point of departure of the Arawakan expansion (western Amazonia, in the area of the Apurinã) than others have proposed before (the Caribbean coast in Aikhenvald 1999a: 75; the Upper Amazon, referring to Lathrap 1970 and Oliver 1989, in Aikhenvald 1999a: 75). Usually, we take the area of most linguistic diversity within the language family as the probable homeland, as in the case of Tupian. However, this presupposes that the diversity reflects divergence within the family. In the case of Arawakan, diversity may mean local linguistic interaction with other unrelated languages and is not directly related to the different migrations of Arawakan languages. The area of the northern Amazon and the Caribbean coast are both examples of intensive language contact, in particular between Arawakan languages and languages of other stocks. Therefore, linguistic diversity alone may not always be taken as the key evidence for a homeland. Later language contact is probably the reason why an analysis of general linguistic features of Arawakan languages gives us the picture it does in Figure 7.3 (see Danielsen et al. 2011).

Figure 7.3 Structural analysis of thirty-one Arawakan languages19
In Figure 7.3, which is again a star without any major branching, we see that general linguistic features are shared by some subgroups within Arawakan (indicated by the dotted lines), but a great number of the languages appear to be simply mixed in the graph. This is the main reason why it has so far been difficult to arrive at a reliable internal classification of the language family. In contrast to Figures 7.1 and 7.2, Resígaro now occurs closer to the languages that are also geographically closest, such as Tariana, and not to the Caribbean Arawakan language Lokono. Thus, local contact effects are stronger than possible historical genealogical relations that may be restored from the personal paradigm. Examples of languages that underwent language contact with genetically unrelated languages and are therefore grammatically quite different from other Arawakan languages are the following:
Garifuna (see Escure 2004): Arawakan with Cariban (and European languages); Garifuna is a language of Arawakan origin (Island Carib) with substantial interaction with Cariban (during colonization) and with English- and French-lexifier pidgins and creoles (during and after colonization).
Tariana: Arawakan with Tucanoan (Aikhenvald 1999b, 2001, 2002); Aikhenvald has done a detailed study of language diffusion in the Vaupés area, and these strong effects of language contact hold for all the languages in this region, not only Arawakan.
Resígaro: Arawakan with Bora (Seifart 2011); Resígaro has in particular changed its nominal morphology and undergone great grammatical changes under the influence of the Bora language; the lexicon and the verbal morphology remained more Arawakan.
Yanesha’ (Wise 1976) and other Andean foothill Arawakan languages (see Table 7.1): Arawakan with Quechua; Yanesha’, Chamicuro, and Apolista were influenced by Quechua, and they are therefore grammatically distinct from other Arawakan languages in the same area, e.g. the Campan Arawakan languages.
Paunaka: Arawakan with Bésiro (Macro-Jê) from the Chiquitanía (Danielsen and Terhart forthc.); the Paunaka language shows regional grammatical constructions in the morphology of borrowed verbs that are typical for the Chiquitanía, and Paunaka has also had lexical influence from Bésiro more recently.
Moxo: Arawakan with possibly Bésiro (Macro-Jê) or already extinct language(s) of the area; the Moxo languages show only one particular non-Arawakan pattern, namely the 3rd person form of the personal paradigm, which distinguishes speaker gender (see Danielsen 2011; Rose, p.c.).
The long list of reported cases of language contact of Arawakan languages with other languages demonstrates why it is difficult to base an internal classification on the same features for all languages. While some Arawakan languages have been influenced in their nominal morphology, others have changed their verbal morphology, and yet others the personal paradigms or the lexicon. These are the results of the fragmentation of the Arawakan matrix after its wide expansion. A more detailed comparative analysis is needed of the grammatical features of the languages that Arawakan languages came into contact with before a more comprehensive analysis of the fragmentation process can be carried out.
3 Conclusion
In this chapter we have sketched the birth, expansion, and fragmentation of the Arawakan matrix, one of the most important cultural systems of prehistoric South America. It is characterized by a surprisingly robust uniformity in its earlier stages, but then in its aftermath, it underwent complex interactions with neighboring systems. The expansion of the Arawakan matrix was characterized by a network of contact and exchange manifested in a regional exchange system that spread the material culture and languages of the matrix to neighboring groups, but also absorbed linguistic and cultural traits – thereby contributing to constant renegotiations and renewal of the system.
The linguistic analysis shows that the regional exchange system of the Arawak-speaking communities must have been intact until late prehistory. This is evident from the distribution of linguistic features typical of the Arawakan family among most of the languages of the family. Many features typical of the Arawakan family are also characteristic of Amazonian languages in general. This is most likely the result of the fact that the Arawakan languages, through the cultural matrix which they were part of and the exchange system which they spread through, came into contact with a very large number of Amazonian languages belonging to other genealogical groupings. The process of contact and exchange between Arawakan and non-Arawakan languages resulted in a diffusion of features between these two. Another archaeological claim (apart from the existence of the regional exchange system) sustained by the linguistic analysis is the tendency of the Arawakan matrix to expand in a multidirectional and irregular fashion (Figures 7.2 and 7.3). According to the archaeological analysis, the Arawakan matrix constantly renegotiated its character through contact with new groups, and new items were added to the matrix as it expanded along the major rivers of the great basin. The regional exchange system facilitated recursive feedback of new features, thereby contributing to the fluent and dynamic character of the matrix, a feature probably contributing to the longevity of the system.
As a result of the dynamic character of the Arawakan matrix, Figure 7.3, which depicts diversity of linguistic features within the Arawakan family, could just as well be used as an illustration of possible routes of contact and exchange of material culture or as the routes of mythological travels of the Arawakan shamans (both phenomena would have contributed to the linguistic exchange). Thus the expansion of Arawakan languages was a complex process where language, material culture, and non-material culture formed an inseparable entity and where all components were crucial for the successful expansion and renewal of the system. It also shows that in order to decipher such a process, a broad, multidisciplinary scientific approach is called for, matching the many different aspects of the system.
Finally, the composition of the language groupings and the distribution of individual linguistic features among Amazonian languages are the result of long-term processes of contact and exchange, in which material culture, social organization, customs and traditions, and language have interacted to form complex sociolinguistic structures that require multidisciplinary research to unravel.
The Arawakan linguistic database was created partly with support from and in interaction with the LinC (Languages in Contact) project at the Radboud University Nijmegen.
1 “Before present,” i.e. years before 1950 according to international standards for the calibration of C14-dates derived from the radiocarbon method.
2 The Saladero phase dates to approximately 1300 BCE, i.e. 3000 BP (Roosevelt 1997; Boomert 2000).
3 The Barrancas and Isla Barrancas phases date to approximately 900 BCE, i.e. 2800 BP (the discrepancy of the BP-dates in footnotes 2 and 3 is due to the non-linear correspondence between BCE/CE and BP in the C14 calibration curves; this phenomenon is in itself an effect of uneven amounts of solar radiation hitting the earth during different time periods) (Cruxent and Rouse 1958, 1959; Sanoja 1979; Sanoja and Vargas 1983; Barse 1989; Oliver 1989; Roosevelt 1997; Boomert 2000; Gassón 2002).
4 The Arawak-speaking Taino of the Greater Antilles were actually the first indigenous group encountered by Columbus on his first voyage (Rouse 1993).
5 Brochado and Lathrap (1982: 51) at one point describe Marajoara as “one of the most complex art styles of the world.”
6 Cabiyarí (Cauyari, Cabuyarí, Acaroa) is classified as a dialect of Tariana (Landar 1977: 454).
7 Yucuna is also known as Matapí (Matapí-Tapuya) (Lewis 2009).
8 Métraux (1948c: 708) writes that the “Pasé were considered the most advanced Indians of the middle Amazon.”
9 Aikhenvald (1999a) is the basis for all subdivisions in Northern Arawakan. Danielsen and Terhart (forthcoming) specifies the Southern Arawakan group, which is less classified in Aikhenvald (1999a), owing to the former lack of information. Some more subgrouping could be done on the basis of the findings in Walker and Ribeiro (2011), which are supported by the analyses in the present chapter. The Purus subgroup has been claimed by Facundes (2002) under the name A-P-I; the name Purus is used in Walker and Ribeiro and on Ethnologue (Lewis 2009).
10 These are generally concepts expressed by adjectives in other languages.
11 In this graph, Southern Arawakan languages are marked by bold grey script, Northern by black bold italics. The grey broken lines encircle the members of possible subgroups, as given in Table 7.1.
12 In this graph, the size of the circles indicates relative frequency of shared features of the present study. The grey shades of the circles refer to Southern Arawakan.
13 Cf. Salipante and Hall (2011) for criticism of the interpretation of these graphs.
14 Unfortunately, we do not have enough data for the inclusion of the Chané Arawakan language into the analysis of personal paradigms. It would indeed be interesting to see where this old and already extinct Arawakan language that reached the north of Argentina would be in the graph. Chané had been replaced by a Tupian language during the early days of the European colonization.
15 For an extended discussion on the relationship between the Arawak-created land-based capital and the socio-economic and cultural development of the region, see Hornborg et al. (2013).
16 The occurrence of multilingualism and language shifts has been documented in various parts of Amazonia (Schmidt 1917; Sorensen 1967; Jackson 1983; Campbell 1997; Aikhenvald 2002, 2003b). For other examples of language contact situations resulting in language shifts, see e.g. Thomason and Kaufman (1988); Sasse (1992).
17 For an extended discussion on indigenous slavery and predation in Amazonia, see Santos-Granero (2009).
18 Taking only features related to whether or not semantic roles are marked on the verb, we do get some Northern versus Southern branching (Danielsen, unpublished).
19 The feature list consists of the Constenla (1991) questionnaire and additional distinctive features selected by Danielsen; for more details see Danielsen et al. (2011). Excluded from the analysis for the combined feature set were Apolista, Enawenê-Nawê, Mehinaku, Saraveka, and Taino because of incomplete data.
8 The Tupian expansion
This chapter explores the expansion of the Tupian languages and culture across greater Amazonia to better understand the mechanisms and processes of cultural and linguistic contact and change. Tupian languages are or were spoken among indigenous groups in Lowland South America from the Brazilian Atlantic coast through Paraguay to the eastern Andean slopes of Peru. The investigation uses Geographic Information Systems (GIS) to map the spatial distribution of cultural and linguistic features associated with Tupí-speaking groups in order to plot the historical expansions of the Tupian languages and to characterize the sociocultural and linguistic context and consequences of these events, particularly relating to internal and external contact situations. Research is directed toward multidisciplinary integration of linguistic data with cultural data derived from anthropology, archaeology, ethnohistory, and geography in order to reach a multifaceted understanding of the history of contact and exchange involving Tupí-speaking groups. The chapter breaks new ground in combining traditional studies of material culture with linguistic data through the use of GIS, as well as in mapping and investigating the spatial distribution of linguistic features and their relationship to cultural attributes.
1 Introduction
Attempts to reconstruct the expansion of the major linguistic families have a long and proud history in the research of the tropical lowlands of South America. Schmidt (1917) described the expansion of Arawakan, while Nordenskiöld (1918–38) dealt with both Arawakan and Tupian, along with other ethnolinguistic groups. Lathrap (1970) described the expansion of Panoan, Arawakan, and Tupian, and his followers Brochado (1984) and Oliver (1989) concerned themselves with Tupian and Arawakan, respectively. Meggers (e.g. 1971) tried to explain the expansion of the major linguistic families in the region as a consequence of population movements triggered by climate fluctuations, and Meggers and Evans (1978) proposed an origin of the Tupian family east of the Madeira River (a hypothesis already advocated by Métraux (1928) and Rodrigues (1964); see below). Noelli (1998, 2008) and Urban (1996) also devoted studies to the Tupian expansion, while Heckenberger (2002) addressed the Arawakan dispersal. More recently, Neves (2011) has attempted to correlate ceramic styles with Arawakan and Tupian languages from an archaeological perspective; Walker and Ribeiro (2011) have modeled the linguistic history of Arawakan, and Eriksen and Danielsen (this volume) have studied the Arawakan dispersal from a transdisciplinary perspective.
There have been several advances in various academic disciplines relevant to our understanding of linguistic expansions in pre-Columbian Amazonia during the last two decades. One decisive theoretical advance in this field of research comes from the work of Hornborg (2005) and Hornborg and Hill (2011), who stress the importance of understanding the development of ethnic identities through the process of ethnogenesis (i.e. the development and continuous renegotiation of ethnic identities through sociocultural interaction) in order to decipher processes of cultural and linguistic exchange among indigenous groups. Another important factor is the use of large-scale computerized databases of spatially distributed cultural and linguistic data (Geographic Information Systems, or GIS), which promotes multidisciplinary comparative studies of the interplay between cultural and linguistic variables through time (Eriksen 2011). And finally, the field of linguistics has seen a veritable boom both in good quality documentation of South American languages and in the use of computational tools and large-scale databases to probe the internal relationships of Amazonian language families as well as the areal diffusion of lexical and structural features between different linguistic groupings (Muysken and O’Connor, this volume).
2 The Tupian language family and its branches
In order to contextualize the current investigation, we start with a basic and non-exhaustive orientation to what has been accomplished in previous Tupian studies. The Tupian family is one of the largest and most widely distributed language families in lowland South America, with languages still spoken in a large geographic area that covers a great part of Brazil as well as adjacent areas in Paraguay, Argentina, French Guiana, Bolivia, and Peru (Map 8.1). It has long been recognized, based on the time depth of regional Tupian diversity, that the vast expansion of the Tupian family stems from a single point of origin (Métraux 1928; Rodrigues 1964; Noelli 2008), located east of the Madeira-Guaporé basin, in the Brazilian state of Rondônia. From this point of origin, the family has expanded into ten branches: Tuparí, Arikém, Puruborá, Ramarama, Mondé, Juruna, Mundurukú, Tupí-Guaraní, Awetí, and Mawé (Figure 8.1) over a time span of roughly 4,000–5,000 years (cf. estimates by Rodrigues 1964 and other researchers). These ten branches encompass about 40–45 languages, not counting the differences among dialects spoken by distinct ethnic groups (Moore et al. 2008).

Map 8.1 The location of Tupí-speaking groups at the time of European contact

Figure 8.1 The branches of the Tupian language family1
The genetic relationship and internal classification of the Tupian family shown schematically in Figure 8.1 incorporate recent historical-comparative studies concerning internal classification and proposals of intermediary stages in the evolution from Proto-Tupí to the current languages, including the results of lexical and grammatical comparison and reconstruction for the different branches of the Tupian family (Rodrigues 1984, 1985; Gabas Jr. 2000; Galucio and Gabas Jr. 2002; Moore 2005; Drude 2006; Picanço 2010; Galucio and Nogueira 2012).
The close relationship between Awetí, Mawé, and the Tupí-Guaraní languages has long been recognized (Rodrigues 1964, 1984, 1985; Rodrigues and Dietrich 1997), and it is by now well established that these languages constitute a large branch inside Tupí, the Mawé-Awetí-Tupí-Guaraní branch (Drude 2006; Correa da Silva 2011; Drude and Meira to appear), termed the Mawetí-Guaraní branch by the latter two authors. This branch represents the major branch of the family, in number of languages and in territorial extension.
Given the enormous diversity within the family in terms of territorial expansion, it is clear that the different Tupian groups have been shaped by distinct and individual historical experiences. An attempt to reconstruct the internal diversification and expansion of Tupian languages must therefore take these experiences into account, aided by a multidisciplinary approach that seeks to understand not only the genealogic relationship and contact history of the languages from a linguistic point of view, but also the particular historical experiences of the groups by mapping the sociocultural features associated with them.
3 Lexical and structural distances between Tupian languages
The genealogical classification of the Tupian family is presented in Figure 8.1, which shows the distinct levels of relationships between the languages and their evolutionary paths from the ancestor language, Proto-Tupí. In this section, we present another view of Tupian language relations, based on distance matrices of shared features. We analyzed the data compiled for this study using quantitative techniques to visualize patterns of relationship in terms of lexical and structural similarity, without presuming an explicit genealogical history. We then compare assessments of similarity presented in network representations (Figures 8.2 and 8.3) to the internal relationships and genealogy of the Tupian family (Figure 8.1).
3.1 Linguistic distance based on lexical similarity analysis
Galucio and colleagues (to appear) present the results of a lexicostatistical and phylogenetic study based on the analysis of the Swadesh list of 100 diagnostic words considered to be most stable over time (Swadesh 1955) for all the nineteen Tupian languages outside the Tupí-Guaraní branch and for four Tupí-Guaraní languages (Guaraní, Parintintim, Tapirapé, and Urubu-Kaapor). Their study shows the degree of distance across Tupian languages, confirms the two more recently established branches of Ramarama-Puruborá and Mawé-Awetí-Tupí-Guaraní, and also supports the internal structure of each branch of the family based on historical-comparative methods. In the case of Tuparí and Mondé, the two most diversified branches outside Tupí-Guaraní, the phylogenetic similarity tree agrees exactly with the independent internal classification of these branches (Moore 2005; Galucio and Nogueira 2012). We took their study and extended it to include a more complete set of languages from the Tupí-Guaraní branch. Using the NeighborNet algorithm implemented in SplitsTree4 (Huson and Bryant 2006), we generated an unrooted network expressing a distance measure among the Tupian languages on the basis of lexical similarity in the basic vocabulary for thirty-one Tupí-Guaraní languages and dialectal varieties and the nineteen languages from the other Tupian branches already established by Galucio et al. (to appear). The distances between the languages based on the percentage of shared lexical items are shown in the NeighborNet representation in Figure 8.2.
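As an illustration of how a distance based on the percentage of shared lexical items can be derived, the following minimal Python sketch turns coded wordlists into a symmetric distance matrix. It is not the procedure used in this study: the language names, meanings, and similarity-class codes are invented toy data, the function names are hypothetical, and the rule that two forms count as shared when they carry the same class code is an assumption made for the example.

```python
from itertools import combinations


def lexical_distance(wordlist_a, wordlist_b):
    """Return 1 minus the proportion of comparable meanings coded as shared.

    Each wordlist maps a meaning to a similarity-class code (assumed coding).
    Meanings missing from either list are left out of the comparison.
    """
    comparable = [m for m in wordlist_a if m in wordlist_b]
    if not comparable:
        raise ValueError("no comparable meanings between the two wordlists")
    shared = sum(wordlist_a[m] == wordlist_b[m] for m in comparable)
    return 1.0 - shared / len(comparable)


def distance_matrix(wordlists):
    """Build a symmetric matrix of pairwise lexical distances."""
    names = sorted(wordlists)
    matrix = {a: {b: 0.0 for b in names} for a in names}
    for a, b in combinations(names, 2):
        d = lexical_distance(wordlists[a], wordlists[b])
        matrix[a][b] = matrix[b][a] = d
    return names, matrix


# Invented toy data: four meanings, integer codes mark similarity classes.
toy = {
    "LangA": {"hand": 1, "water": 1, "stone": 1, "fire": 1},
    "LangB": {"hand": 1, "water": 1, "stone": 2, "fire": 1},
    "LangC": {"hand": 2, "water": 1, "stone": 3, "fire": 2},
}
names, matrix = distance_matrix(toy)
for a in names:
    print(a, [round(matrix[a][b], 2) for b in names])
```

A matrix of this kind is the type of input that distance-based network methods such as NeighborNet operate on; because it records only pairwise similarity, it cannot by itself separate inherited from borrowed resemblances.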
The analysis is not intended to show the historical development of these languages but rather the degree of distance between them, based on lexical similarity that may also reflect the result of horizontal transfer. It is nonetheless remarkable that the major clusters of languages that surface from the distance measure shown in the graphic are comparable to the proposed path of historical development for the Tupian languages, on the basis of the comparative method (cf. Figure 8.1). The NeighborNet representation places Awetí as the closest language to the Tupí-Guaraní cluster, followed by Mawé, and together forming the Mawetí-Guaraní larger cluster, which is consistent with the proposed path of evolution in the history of these languages (Drude 2006; Correa da Silva 2011; Drude and Meira, to appear). The six other lexical clusters (Juruna, Arikém, Ramarama-Puruborá, Mondé, Tuparí, and Mundurukú) and their sub-splits correspond exactly to the more recent genealogic classification of these languages, as clearly seen in the Tuparí and Mondé branches (Moore 2005; Galucio and Nogueira 2012).
The linguistic cohesiveness of the Tupí-Guaraní branch is also prominent in the network representation. Horizontal transfers due to contact and borrowing may be responsible for a great number of the synchronic resemblances in the Tupí-Guaraní lexicon, not all of them due to retention from a common ancestor language. Nonetheless, as expected from the known history of these languages, the thirty-one Tupí-Guaraní languages are closer to each other than to any other language in the Tupian family.
However, the splits inside the Tupí-Guaraní cluster do not correspond exactly to classifications of the Tupí-Guaraní branch based on phonological criteria (Mello 2000; Rodrigues and Cabral 2002) or on a combination of lexical, phonological and grammatical criteria (Dietrich 1990). There is an overall absence of well-delimited lexical clusters inside the Tupí-Guaraní group in Figure 8.2. Among the few specific clusters that surface from the quantitative lexical comparison are the Kawahib languages (Parintintim, Tenharim, Amondawa, and Uru-eu-uau-uau) that are classified as dialectal variants (Sampaio 1997); the Yuki-Sirionó cluster of two closely related Tupian Bolivian languages; the Wayampi-Emérillon cluster of languages spoken in the same geographic area in French Guiana; the Língua Geral Amazônica-Urubu-Kaapor grouping, for which there have been claims of mutual influence through contact; and a Guaraní cluster that includes most of the languages in Rodrigues and Cabral's subgroup I of Tupí-Guaraní (2002) but also includes Guarayo, spoken in Bolivia. The Cocama-Cocamilla language appears close to Xeta. The Cocama lexicon, including the core vocabulary, is primarily Tupian (Cabral 1995), but it also shows lexical traits of Arawakan, Panoan, and Quechuan origin, in addition to Portuguese and Spanish (Muysken 2012b).

Figure 8.2 NeighborNet representation of lexical distances among Tupian languages
3.2 Linguistic distance based on structural similarity analysis
For the structural analysis, we designed a preliminary questionnaire of twenty prominent typological features, divided between phonology, morphology and syntax. Due to the availability of data, our structural sample is smaller than the lexical sample. It consists of thirty languages, including eighteen from the Tupí-Guaraní branch and twelve from the other nine branches of the family. The features were coded on the basis of published material, complemented with direct verification with specialists working on particular languages. The current location of the analyzed languages is shown in Map 8.2. Only two features are identical for all the languages in the sample. All Tupian languages have the order possessor-possessed in the possessive phrase, and all have noun-postposition order in the noun phrase, which is consistent with the general head-marking characteristic of the family. With the exception of Cocama-Cocamilla, which has a causative suffix -ta, all other languages have a causative prefix of the form mV- (< Proto-Tupí *mõ-). The phonological features show few splits throughout the family. The vowel inventories go from four (Língua Geral Amazônica) to seven vowels (Karo, Puruborá and Tenetehara-Tembé), with most of the languages showing five or six vowels. With respect to suprasegmental phenomena, languages of two branches (Mondé and Mundurukú) have contrastive tone, three languages (Juruna, Karitiana, and Karo) have pitch-accent systems, and the remaining languages, including Xipaya, the sister language of Juruna, have stress-only systems (cf. also Storto and Demolin 2012).

Map 8.2 Current location of the languages in our structural sample
Preliminary results of the structural analysis can be visualized in the network representation in Figure 8.3. The discrepancies between the lexical and grammatical analyses are remarkable. The only languages that seem to cluster in both analyses are Kaiowa with Mbyá-Guaraní, and Karo with Puruborá, although there is little evidence for the latter in the grammatical outcome, probably due to the gaps in the Puruborá data. The other lexical clusters that correspond to the known genealogy of the Tupian languages are not found in the structural analysis. The lexical major split opposing a large Mawé-Awetí-Guaraní branch on one side to all the other branches (cf. Figure 8.2) is not present in the structural analysis, either. A split in terms of Eastern and Western languages (Rodrigues 2007) that opposes the language branches spoken in the Rondônian region (Arikém, Ramarama-Puruborá, Tuparí, and Mondé) to the branches spoken outside that region (Mawetí-Guaraní, Juruna, and Mundurukú) does not surface in either the lexical or the grammatical analysis.

Figure 8.3 NeighborNet representation of structural distances among Tupian languages
The output of the structural distance measure does not compare favorably with known genealogical relations. The closely related Juruna branch members Juruna and Xipaya are adjacent in the lexical measure but do not cluster together in the structural analysis. On the other hand, languages that belong to distant genealogical branches such as Makurap (Tuparí) and Asuriní do Tocantins (Tupí-Guaraní) or Mekens (Tuparí) and Yuki (Tupí-Guaraní) show high measures of grammatical similarity. Cocama-Cocamilla and Língua Geral Amazônica constitute another unorthodox cluster in Figure 8.3, sharing fourteen of the twenty analyzed features, including SVO as the basic order of clausal constituents. As the most prominent expansion varieties of Tupian, these two languages played an important role in the contact scenario (see Section 5).
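The relation between shared features and distance can be made concrete with a short Python sketch. It assumes, purely for illustration, that structural distance is the proportion of differing values among the features coded for both languages; the chapter does not specify the measure, and the feature values, variable names, and function name below are hypothetical placeholders rather than the study's data.

```python
def structural_distance(features_a, features_b):
    """Proportion of differing values among features coded for both languages.

    Each argument is a list of feature values in questionnaire order;
    None marks a feature with no data (assumed convention for this sketch).
    """
    compared = [(x, y) for x, y in zip(features_a, features_b)
                if x is not None and y is not None]
    if not compared:
        raise ValueError("no features coded for both languages")
    differing = sum(x != y for x, y in compared)
    return differing / len(compared)


# Invented codings for a twenty-feature questionnaire.
lang_x = ["a"] * 20
lang_y = ["a"] * 14 + ["b"] * 6        # shares 14 of 20 values with lang_x
lang_z = ["a"] * 10 + [None] * 10      # half of the features uncoded

print(structural_distance(lang_x, lang_y))  # 6 differing out of 20 compared
print(structural_distance(lang_z, lang_y))  # only the 10 coded features count
```

Under this assumption, two languages sharing fourteen of twenty features lie at a distance of 0.3. The third toy language shows why gaps in the data, such as those noted for Puruborá, matter: when only a subset of features can be compared, the resulting distance may be unrepresentative of the full questionnaire.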
4 Cultural characteristics of the Tupí-speaking groups
When investigating a language family as widespread as Tupian, great caution must be used in assigning general cultural characteristics to such a large number of groups located in such diverse ecological zones within the lowlands of tropical South America. Let us therefore first point out that we do not propose that all Tupí-speaking groups share a single set of cultural features. What we do propose, however, is that several socio-cultural characteristics, material as well as non-material, with great potential to influence language dispersal, have very interesting distributional patterns within the family and therefore deserve closer investigation. Recent research by Walker et al. (2012) applied quantitative methods to analyze the distribution of cultural features within the Tupian family in order to reconstruct its homeland and the rates of cultural change within it. However, the research by Walker and colleagues does not investigate cultural features that are typical of Tupí-speaking societies, but rather features that are common among Tupian groups (and, not incidentally, just as common among members of many other lowland South American language families). The present investigation seeks instead to isolate cultural features that are more exclusively Tupian, allowing us to identify Tupian versus non-Tupian characteristics in contact situations. The latter method is particularly useful when researching prehistoric contact scenarios, where the primary sources of evidence are the remains of material culture (see below).
Since the publication of Eduardo Viveiros de Castro's (1992) From the enemy's point of view, the so-called “bellicose ethos” identified by the author has become a powerful cultural characterization of Tupí-speaking groups across Amazonia. The expression stems from the observation that the Tupí-speaking groups studied displayed a strong predatory cosmology, where social prestige and status are gained by violent performances such as warfare, enslavement, and anthropophagy. Anthropophagy was a widespread and well-integrated cultural feature of many Tupí-speaking societies at the time of contact (Gareis 2002: 248; Fausto 2007: 83; Santos-Granero 2011: 343). Many of the violent encounters between Europeans and Tupí-speaking groups in the 1500s bear witness to the presence of these practices among Tupians of the Atlantic Coastline. It is then tempting, of course, to assign the predatory cosmology of the Tupí-speaking groups encountered by early Europeans to the whole language family, but such a generalization must nevertheless be corroborated by careful study of a large and representative number of communities. Let us therefore for a moment return to Map 8.1, where the spatial distribution of the Tupian branches at the time of contact is depicted.
Early Europeans who set foot in South America along the Atlantic coastline and later along the lower Amazon River would have come into contact primarily with speakers of the Tupí-Guaraní branch, and secondarily with speakers of the Mundurukú, Mawé, and Juruna branches. These branches are also of great interest for our reconstruction of the expansion of the family. These four branches are responsible for the majority of the territorial expansion of the Tupian languages, and they are, therefore, the most salient ones in terms of indigenous contact scenarios. Fortunately, the ethnographical information available for the members of these four branches is relatively rich.
In an intriguing study of Amazonian captive identities, Santos-Granero (2009: 5) investigates the “native regimes of capture and servitude,” providing extensive detail on the cultural characteristics of the Chiriguano (Tupí-Guaraní branch). Warfare and the taking of captives were central elements in Chiriguano society, and these also played an important role in the contact scenario between these Tupians and their Arawakan neighbors. In the Gran Chaco, the neighboring Arawakan-speaking Chané were repeatedly raided for captives, who were sometimes left alive and integrated into Chiriguano society and sometimes executed as part of anthropophagous rituals (p. 75).
According to Santos-Granero (p. 78), the Chiriguano kept more captives alive than did the Tupinambá (of the Tupí-Guaraní branch), who also had a great reputation as warriors and anthropophages. Furthermore, both the Chiriguano and Tupinambá had war clubs (Métraux 1948a: 95; Santos-Granero 2009: 77), a weapon used exclusively during warfare. The Mundurukú (Mundurukú branch) also had war clubs and were involved in military conflicts (Horton 1948: 271f., 276; Balée 1984: 257). Like the Chiriguano, the Mundurukú were also known captive-takers (Horton 1948: 278), a practice shared with the Mawé of the Mawé branch (Nimuendajú 1948b: 251). The Juruna and Xipaya, members of the fourth Tupian linguistic branch that expanded far from Rondônia, were also known as cannibals and were documented as very hostile by the early Europeans (Nimuendajú 1948a: 218, 235). The Xipaya also carried war clubs (Nimuendajú 1948a: 232). In addition to taking captives, making war trophies out of enemy heads was another common practice of these Tupians, as documented in the Juruna, Mawé, Kuruaya, and Mundurukú groups (Nimuendajú 1948a: 236, 1948b: 251; Santos-Granero 2009: 104, Figure 38).
Thus the ethnolinguistic groups of the four branches responsible for the major territorial expansion of the Tupian family share a certain set of features with respect to their strong emphasis on warfare and the taking of captives. This pattern of violent sociocultural interaction with neighboring groups had very particular consequences in terms of the cultural and linguistic exchange between the combatant societies. Judging from the unequal power relations that often arose between captives and captors, one may get the impression that cultural and linguistic exchange was unidirectional in contact situations, with elements flowing in the direction from the powerful to the powerless only. However, many examples involving the four non-Rondônian branches of the family indicate that contact scenarios were more complex and interesting.
Several ethnographical examples from Tupí-speaking societies provide a picture of communities particularly keen to integrate new cultural elements into their repertoires. Although the Arawak-speaking Chané were Chiriguano-ized by their powerful Tupí-speaking neighbors, there was also a substantial Arawakan cultural element flowing in the opposite direction (Santos-Granero 2009: 33, 186). In terms of agricultural skills, the Chané were much better equipped than the Chiriguano, which led to the abandonment of traditional Chiriguano subsistence strategies in favor of obtaining food produced by subjugated Chané groups (Santos-Granero 2009: 80f., 136). Furthermore, as noted by Iris Gareis (2002: 264) in a study of the cultural encounters between Tupí-speaking groups of the Brazilian Atlantic Coast and the first Europeans to settle there, the Tupians quickly integrated the Europeans and their artifacts into their own cultural sphere through the establishment of exchange relations with the French and the Portuguese.
Another intriguing example of the Tupian tendency to adopt cultural elements from others is given in Gow's (2007) study of the Tupí-Guaraní-speaking Cocama communities of the Peruvian Amazon, who adopt the surnames of groups with high social ranking, e.g. Brazilians, in order to escape their status as “indigenous,” commonly seen as low-status citizens. The dynamic and flexible nature of Tupian exchange relations vis-à-vis other groups helps us understand the dispersal of the non-Rondônian branches of the language family: arenas of linguistic exchange were created which fostered integration and assimilation scenarios among groups and individuals and facilitated the borrowing of cultural and linguistic features. This interaction style is a key criterion in researching the nature of linguistic contacts involving the Tupí-Guaraní branch.
5 The nature and timing of the Tupian language expansions
Of the ten traditional Tupian branches, a single one, Tupí-Guaraní (with approximately twenty-two languages and about forty dialectal variants), is responsible for the major part of the territory conquered by the family (Map 8.1). Some of this spread was accomplished during European colonization. Métraux (1948a: 98–99) reports two great Tupinambá migrations: one from the coast of Brazil to Chachapoyas in Peru (1540–1549), and a second one involving a different Tupinambá group from the coast of Brazil through the Amazon and Madeira rivers up to Bolivia and then back to the mouth of the Madeira, where they settled on the Tupinambarana island. Similarly, Old Guaraní was attested from the end of the seventeenth century until about the middle of the eighteenth century, when reference to the language basically disappears. The language surfaced again around the mid-nineteenth century to become the starting point of modern Guaraní. By then the Guaraní language had already diverged into at least three major branches: Chiriguano, Guaraní, and Mbyá-Guaraní, spoken in Bolivia, Paraguay, and Brazil, respectively (Schleicher 1998: 3). This scenario of Tupian expansion is also linked to the development of línguas francas that surfaced in the context of the Portuguese and Spanish colonization. The best known cases are Guaraní, Língua Geral Paulista, and Língua Geral Amazônica. The first is now a national language of Paraguay. Língua Geral Paulista is extinct, and Língua Geral Amazônica is still spoken by various ethnic groups in the northwest Amazon (Map 8.2).
During the first two centuries of Portuguese colonization, Tupinambá was largely spoken along the coast of Brazil and subsequently spread into the provinces of Maranhão and Grão-Pará in Amazonia. The language evolved in contacts during the earlier colonial period between Tupinambá, Portuguese, mestizos, and speakers of other indigenous languages, and it was used for daily, official, and religious communication. By the mid-eighteenth century, there was already an identifiable variety distinct from Tupinambá that later came to be called Língua Geral Amazônica (Rodrigues 1986; Moore et al. 1993). Official prestige for the language shifted drastically during the centuries, from being the institutionalized official language of the Maranhão and Grão-Pará provinces during the seventeenth century to being prohibited in the mid-eighteenth century, due to the promotion of Portuguese. The nineteenth century saw at the same time a gradual decline of language usage along with a gradual adoption of the language by new groups, first as a second language of Arawakan and other ethnic groups, later as a first language in the upper Rio Negro (Brazil), where it is now spoken by thousands of people mostly in Brazil but also in Venezuela (Cruz 2011). An important factor in the history of Língua Geral Amazônica is its continuous transmission over the centuries (Cabral 2011; Moore forthcoming). It has been modified over time as it adapted to new settings, but its transmission was never interrupted. In the first centuries, the main mechanism of change seems to have been substratum influence during the acquisition of Língua Geral Amazônica by speakers of diverse languages, resulting in grammatical changes rather than lexical and grammatical borrowing (Moore et al. 1993).
Increasing bilingualism, the dislocation of speaker populations, and the mixing of native groups led to increasing levels of Portuguese lexical and grammatical borrowing, observed in the twentieth century (Moore forthcoming: 180).
Another Tupí-Guaraní language that played a central role in the expansion of the family through horizontal transfer (contact and borrowing) is Cocama-Cocamilla. Cocama has been claimed (Cabral 1995) to be the result of incomplete shift in a process of rapid creolization, starting in the fifteenth century, when a group of Tupí-speakers (possibly Tupinambá) migrated from the Atlantic coast inland to the upper Amazon and came into close contact with speakers of one or more other languages, possibly of Arawakan origin. Cabral argues that due to the mixed origin of Cocama and its history of intense language contact, reflected in lexical and grammatical idiosyncrasies, it cannot be assigned Tupian or any other genealogical affiliation (Cabral 2011: 20). However, the genealogical affiliation of Cocama as Tupian has also been defended (Michael 2010; Vallejos-Yopán 2010). Despite traces of non-Tupian contributions in its grammar, regular patterns of grammatical change can be identified in Cocama that support an early development from a Proto-Tupí-Guaranían origin with gradual changes over the centuries (Vallejos-Yopán 2010: 753–758). The role of Arawakan, Panoan, Quechuan, Portuguese, Spanish, and other non-identified languages as suppliers of lexical and grammatical items for Cocama-Cocamilla highlights the complex contact scenario found in the Amazon region since pre-Columbian times. The effect of an earlier Quechua-based pidgin in the upper Amazon (Crevels and Muysken 2005b) might be reflected in the high number of Quechua words in Cocama, including plant and animal names, verbs, adverbials, numerals, and a special Quechua perfective morpheme used in verbs of Spanish origin (Muysken 2012b: 249f.).
There may also have been horizontal transfers from Arawakan (and other languages) to Tupian groups through second language acquisition, as with the Chiriguano who subjugated Arawakan populations. As the powerful Arawakan regional exchange system that had expanded across Amazonia between 900 BCE and 1000 CE declined (Hornborg 2005; Eriksen 2011; Eriksen and Danielsen, this volume), there were both Tupian and Cariban expansions in the period between 1000 and 1500 CE (Eriksen 2011). Tupí-speakers competed with the Caribs for Arawakan trade routes, and the headwaters of the Amazon River were part of early Tupian routes.
Other Tupian groups who left Rondônia include speakers of the Mundurukú, Mawé, and Juruna branches, who expanded their territories, and speakers of the single-language branch Awetí, who are concentrated today in a small geographic area in the Upper Xingú Indigenous Park. In the second half of the eighteenth century, the Mundurukú appeared in the colonial records as inhabitants of the Maué river, a tributary of the Amazon. They expanded their territory through warfare and came to completely dominate the region between the Madeira and Tapajós rivers by the beginning of the nineteenth century. Sateré-Mawé-speakers have inhabited the region of the Tapajós and Madeira rivers for more than 300 years. The first reference to their presence there dates back to 1639 (Carvajal, de Rojas and de Acuña 1941, cited in Franceschini 1999), and their split from a Proto-Mawetí-Guaraní language is certainly much earlier than the Tupinambá migration that left the coast around 1530 and arrived at the Tapajós-Madeira region around 1590 (Métraux 1928). The internal classification of the Mawetí-Guaraní super-branch shows an Awetí-Tupí-Guaraní branch opposed to Mawé (cf. Figure 8.1). This internal classification implies that the Mawé were the first to separate from the ancestral group, which later split into the ancestors of the Awetí and the ancestors of another group that later became the diverse Tupí-Guaraní branch (Rodrigues and Dietrich 1997; Drude and Meira, to appear).
The remaining five Tupian branches are still located within Rondônia and its adjacent areas, and there are no indications that they have ever expanded outside of this region.
5.1 The material culture of the Tupian expansion
Most attempts to reconstruct the Tupian expansions have relied heavily on pottery as a marker of the pre-Columbian spread of Tupian languages (see e.g. Lathrap 1970; Brochado 1984; Noelli 2008; Neves 2011). Lathrap (1970) argued that the dispersal of the Arawakan and Tupian families took place from roughly the same point of origin in central Amazonia. He suggested that Arawak-speaking groups were associated with Barrancoid pottery (also known as Incised Rim), while the Tupian groups could be traced through the presence of ceramics associated with the Amazonian Polychrome Tradition.
The Barrancoid tradition originated in the Orinoco Valley around 900 BCE (Cruxent and Rouse 1958, 1959; Sanoja 1979; Sanoja and Vargas 1983; Barse 1989; Oliver 1989; Roosevelt 1997; Boomert 2000; Gassón 2002), expanded into the central Amazon by 400 BCE (Heckenberger et al. 1999; Rebellato et al. 2009), and progressed further into the upper Amazon (Evans and Meggers 1968: 17, 81; Lathrap 1970: 109; Brochado and Lathrap 1982: 12) and southern Amazonia (Lathrap 1970: 159; Heckenberger 2005: 56; Saunaluoma 2010: 94) during the centuries around 1 CE (see Eriksen 2011 for an inclusive account of the Barrancoid expansion). The timing and expansion of the Barrancoid tradition correlates strongly with the Arawakan linguistic dispersal, supporting their association (Eriksen and Danielsen, this volume).
The historical record on a correlation between Tupian languages and Amazonian Polychrome pottery is less straightforward. The association between this pottery style and Tupian groups originated because the first Europeans who entered Amazonia found polychrome pottery among several indigenous groups of the Tupí-Guaraní branch, including the Omagua, Cocama, and Cocamilla of the upper Amazon (Salazar 2008: 264), the Tupinambá of the Atlantic coastline south of the mouth of the Amazon (Brochado 1984: 283–297), and the southern Tupí-Guaraní groups from the São Paulo area to the Andean foothills of Bolivia, among whom a variant of the Polychrome tradition with corrugated decoration was widespread (Howard 1947; Métraux 1948b: 411). The earliest confirmed datings of the tradition are from the Marajoara phase of Marajó Island at the mouth of the Amazon (Meggers and Evans 1957; Roosevelt 1991; Schaan 2008), from around 300 CE (Brochado and Lathrap 1982: 51), but dates closer to 750 CE are more frequent (Boomert 2004: 259).
Around 500 CE, polychrome decoration also began to develop in the central Amazon, alongside ceramics of the Barrancoid tradition and later of the Paredão phase (700–1200 CE). These are two ceramic styles associated with Arawak-speaking groups (Neves 2011: 48), who by that time were integrated into a vast regional exchange system encompassing large parts of the northern South American lowlands (Hornborg 2005; Eriksen 2011; Neves 2011: 41). Arawak-speaking groups dominated sociocultural exchange in Amazonia during this period, demonstrating status through different but interrelated ceramic styles. The complex historical record of Amazonian Polychrome pottery illustrates the multifaceted character of its association with potentially multiple ethnolinguistic groups.
The Arawakan expansion began in the Orinoco area around 900 BCE (Eriksen 2011) and continued along the Amazon River, via the Madeira onto the Llanos de Moxos, and further into the upper Xingu between 500 and 1 BCE. The vast Arawakan linguistic and cultural dispersal was integrated through a regional exchange system that circumscribed the Tupian family in Rondônia on all sides but the east, so when the Tupian family slowly started to expand, it was indeed eastward (cf. Noelli 2008).
5.2 The chronology of the Tupian expansion
Early proposals (cf. Martius 1867) postulated a relatively recent date for the onset of Tupian linguistic diversification and expansion, placing it shortly before the arrival of Europeans. Rodrigues’ (1964) estimated chronology situated the origin of Proto-Tupí in the Madeira-Guaporé region around 3000 BCE and the beginning of Tupí-Guaraní expansions east and south between 500 and 1 BCE. In a more recent review based on archaeological data, Noelli (2008) assesses available dates for what he calls the Tupí-Guaraní ceramic tradition and concludes that this tradition may have started expanding at least around 2,000 years ago.
It is important to consider that, despite the great territorial extension of Tupí-Guaraní speaking groups (see Map 8.1), the languages reveal a high degree of uniformity. The rate of shared cognates among the Tupí-Guaraní languages is around 70 percent or higher, and they also share many other phonological and morphosyntactic features that distinguish them from other Tupian languages. While the beginning of the diversification process that gave rise to the current languages is placed some 2,500 to 2,000 years ago, linguistic comparative studies indicate more recent subgroupings within the family. Thus, a rapid rate of displacement is necessary to account for both the great spatial distribution of the Tupí-Guaraní groups and their great linguistic similarity.
The complex Tupí-Guaraní migration process that began in prehistoric times (Métraux 1928) was probably accelerated and intensified by the European occupation. At the time of the initial Tupian expansions (specifically the Tupí-Guaraní branch), Arawakan-speaking groups dominated the cultural and linguistic exchange in large parts of Amazonia. The Arawak-speaking communities were river-oriented, settling close to major transportation routes that formed the basis of their exchange system, while the Tupí-speaking groups preferred the headwaters of the rivers, not the riverbeds (Migliazza 1982; Urban 1996). The Amazon River seems to have been particularly important in this sense, serving as a major artery connecting the Arawak-speaking Aruã at the mouth of the Amazon with the Manao at the juncture of the Negro, Madeira, and Amazon Rivers. The connection continued into the upper tributaries, where groups such as the Chamicuro were still located at the time of European contact. Given the dominance of Arawakan culture and languages in Amazonia at around 500 CE, it is highly likely that the early polychrome ceramics found along the main river, far up the Rio Negro and into the Aruã territory on Marajó Island, dating from precisely this period, were associated with the Arawakan cultural complex. This claim is elaborated below.
In this light, the later period of the Amazonian Polychrome tradition tells a very different story. Its early history was very much a product of the Arawakan exchange system that had produced a complex culture, centered around ancestry and inherited rank as the basis for social hierarchies and political power (Santos-Granero 2002: 42ff.), displayed through complex ritual ceremonies, and accompanied by elaborate expressions of material culture such as intricate musical instruments (Izikowitz 1935; Hill 2011), dancing masks (Santos-Granero 2011: 344), and beautifully decorated ceramic artifacts. In contrast, the late history of this ceramic style contains an abrupt shift of cultural context related to the expansion of Tupí-speaking groups. The polychrome ceramics associated with Tupian occupation, sometimes labeled the Tupí-Guaraní tradition, covered a large geographic area from the Ji-Paraná and Aripuanã rivers in Rondônia to the Xingu and Tocantins rivers in Mato Grosso and Tocantins (Cruz 2008).
As noted by archaeologists, the circular villages of the Arawak-speaking communities established along the Amazon River and its major tributaries during the Barrancoid period came to a sudden end around 900 CE. A period of warfare resulted in the replacement of the circular and palisaded Arawakan villages with linear, un-palisaded villages similar to those noted by the early European travelers (Rebellato et al. 2009: 22, 29; see also Roosevelt 1993; Porro 1994; Hemming 2004 [1978]). The change in architectural style likely signals the violent expansion of the speakers of the Tupí-Guaraní branch, enacting an early version of the predatory cosmology described earlier. The outcome of this period of social conflict and warfare was not only a change in village layout (Rebellato et al. 2009: 22) (and most likely also language use), but also a stronger representation of the Amazonian Polychrome tradition along the main river and an elaboration of the painted decoration of the vessels.13
The timing of the expansion of Tupian languages and polychrome pottery south of the Amazon River indicates that these ceramics are another example of the Tupian process, discussed above, of absorbing external cultural elements into their own repertoire to strengthen ethnic identity vis-à-vis other groups. As noted above, the Barrancoid (400 BCE–900 CE) and Paredão (700–1200 CE) occupations in the central Amazon area have been associated with an Arawak-controlled regional exchange system (Eriksen 2011; Neves 2011: 41, 45, 48f.). The period of warfare that led to the replacement of villages and pottery styles also brought Guarita phase ceramics of the Amazonian Polychrome tradition (Neves et al. 2004: 133; Rebellato et al. 2009: 22, 27). However, Guarita phase pottery was not a truly intrusive ware in the region. It had already started to develop out of Barrancoid material in the area around 500–600 CE (Lathrap 1970: 155–157; Brochado 1984: 317; Petersen et al. 2004: 9), but it did not gain an important position until the period of social upheaval around 900 CE. The territory of the Guarita phase increased between 900 and 1550, not only in territories known to have been controlled by Tupí-speaking groups at the time of contact, but also along the Rio Negro14 – in an area known to have been a stronghold of the Manao, one of the most powerful Arawak-speaking groups of Amazonia, and their neighbors, the Arauakí (Eriksen 2011: 112, Figure 4.3.1).
Indeed, the Manao were powerful enough to largely rule out any speculation about a Tupí-controlled lower Rio Negro during late prehistory, but the picture from ceramic evidence is more complicated. Although the pottery of the historical Manao shows influence from the polychrome tradition (Myers 1999: 36f.), the main affiliation of their ceramics is to the Arauquinoid/Incised Punctuated tradition, a ceramic style originating around the Guiana Highlands and most commonly associated with Cariban language speakers15 (Neves 2011: 47). Furthermore, the Manao traded gold to the Tupí-speaking Curuzirari of the Amazon River, who in turn manufactured polychrome pottery that was circulated to neighboring groups (Eriksen 2011: 205). Thus, in gold we find another high-status cultural element that was exchanged between Arawakan and Tupian groups of the area, suggesting further language contact between these groups, even as connections multiply between pottery styles and a variety of speaker groups.
Vessel shapes and functions of the Guarita phase attest to two different wares: a variety that was elaborately decorated with polychrome painting, probably used in ceremonies, and a simple variety, with little or no decoration, probably manufactured for everyday use (Petersen et al. 2001: 97). Considering the relatively complex inter-ethnic relationships of the central Amazon during late prehistory, it is likely that some groups used Guarita ceramics as their everyday ware, while high-status pottery was traded among all influential groups as a sign of social status and ethnic affiliation.
Therefore, the suggested chronology is one in which polychrome pottery was initially manufactured by Arawakans 500–900 CE, alongside Barrancoid and Paredão phase ceramics. In the subsequent period, polychrome ceramics were transformed into more elaborately decorated high-status ware that from 900 to 1200 was acquired by Tupí-speakers as part of the shifting power relations between Arawakans and Tupians in Amazonia. Finally, between 1200 and 1550 the high-prestige polychrome pottery was circulated among the different ethnolinguistic groups of the central Amazon, an area now in the hands of Tupians, as a marker of political and religious power and social status that was also communicated by Tupian language use.
Downriver in the lower Amazon, cultural exchange and dynamics of ethnogenesis between Tupí- and Arawak-speakers began a little earlier, around 700–800 CE, with evidence of the pottery later known as the Tupinambá phase of the Amazonian Polychrome tradition found south of Marajó Island (Brochado 1984: 342). Brochado (1984: 313) notes that Tupian polychrome pottery of eastern Brazil, the Guaraní and Tupinambá ceramic phases, originated in the Guarita sub-tradition. Many of the stylistic elements in Guarita and Marajoara thereafter remained in the Tupinambá phase, but Brochado (1984: 369) is careful to point out that Marajoara dates at least 500 years before Tupinambá, which rules out a reverse cultural exchange between these two groups. Guaraní ceramics, the southernmost distributed branch of the Tupian polychrome pottery, lost some decorative elements in the split from Guarita (Noelli 2008: 661), and there was a certain amount of stylistic exchange with local complexes in southern Brazil, particularly a Kaingang style (manufactured by a Macro-Jê-speaking community) (Brochado 1984: 378).
The so-called Tupí-Guaraní pincer movement, a process that resulted in the circling of the Brazilian Highlands and the meeting of Tupinambá- and Guaraní-speaking groups in the area of present-day São Paulo, took place between 500 and 1000 CE (Brochado 1984: 383). After this, a period of constant warfare between the two groups ensued (Brochado 1984: 386). Unlike most Arawak-speaking groups of Amazonia (cf. Santos-Granero 2002), the Tupian groups do show signs of endo-warfare: at times, they seem to have considered their closest linguistic relatives their worst enemies. Despite frequent hostilities, trade and exchange between the two Tupian groups were maintained, leading to ceramic interaction and facilitating a certain degree of homogenization between their respective ceramic phases (although vessel shapes were sometimes adapted to local conditions) (Brochado 1984: 387f., 394).
As for the sociocultural relationship between Tupí-Guaraní-speakers and non-Tupian groups, several observations point to a scenario in which there was actually less warfare than usually assumed for the expansion of Tupí-Guaraní (cf. Viveiros de Castro 1992). According to Noelli (2008: 664), there was a gradual expansion of Tupian groups into others’ territories, which would explain the ceramic acculturation noted above. Brochado (1984: 402) and Santos-Granero (2009: 33) note that the Tupinization of outside groups was an important way of incorporating new populations into the group, thus also spreading Tupian languages. This fits well with the observation made earlier that Tupí-Guaraní-speaking groups were generally very skilled in incorporating new cultural elements into their own repertoire and often engaged in processes of ethnogenesis with neighboring indigenous groups and later with Europeans. As noted, the Tupí-speaking Chiriguano had an unequal power relation with the Arawak-speaking Chané (Santos-Granero 2009: 186), but there were nevertheless many cultural traits flowing from the Chané to the Chiriguano.
Linking this observation back to the discussion of polychrome pottery and its possible correlation with Tupians, the occurrence of this ceramic tradition in the Peruvian Amazon was initially closely associated with the Cocama, who established themselves together with the polychrome pottery in the region between 1200 and 1400 (Evans and Meggers 1968; Lathrap 1970; Myers 2004; Salazar 2008: 264). The pottery was then transformed from an intrinsic part of the Tupian Cocama, Cocamilla, and Omagua identity into an ethnic marker of the Pano-speaking Shipibo and Conibo who co-existed with the Cocama in the Jesuit missions in colonial times (Lathrap 1970: 184; Myers 1976; Brochado 1984: 304; DeBoer and Raymond 1987: 128f.; DeBoer 1990: 87, 103). Thus, tourists walking the streets of Pucallpa and Iquitos in eastern Peru are addressed in Spanish and offered “traditional” Panoan shirts16 and pottery with polychrome decoration. This is perhaps the ultimate example of the tendency for powerful cultural markers to transform themselves and take on new functions through sociocultural exchange and ethnogenetic processes – always involving language as a crucial component of contact.
6 Conclusion
This study has illustrated that out of the ten branches of the Tupian family, five (Juruna, Mundurukú, Mawé, Awetí, and Tupí-Guaraní) are largely responsible for the territorial expansion, and that one of them (Tupí-Guaraní) has expanded over large parts of South America. Linguistic analysis has shown that three of the five expanding branches (Mawé, Awetí, and Tupí-Guaraní) are closely related, forming a sub-branch of their own labeled Mawetí-Guaraní (cf. Map 8.1 and Figure 8.1). While earlier research emphasized warfare as a central mechanism of expansion for these five branches, this study has shown that although warfare was a central component of many of the expanding Tupí-speaking communities (as it was for many non-Tupí Amazonians, as well), much of the territorial expansion was the result of a gradual process in which Tupians absorbed cultural elements from neighboring groups in order to strengthen themselves and their culture. This process ultimately led to the Tupinization of neighboring communities through contact scenarios dominated by Tupians, which in turn led to the adoption of Tupian languages by non-Tupí-speaking groups.
However, the process was far from unidirectional, as can be seen from the constant updates and adjustments that Tupian cultures underwent through contact with neighboring societies. In this respect, the most expansive branches of Tupian are characterized by hybrid cultures (and sometimes also languages, such as Cocama-Cocamilla), constantly renegotiated through ethnogenetic processes (cf. Hornborg 2005) when in contact with other indigenous Amazonians, the colonial powers, or the current nation states.
The results of linguistic analysis of lexical material showed that patterns of synchronic similarity are consistent with genealogical classifications that incorporate an estimate of history. However, the analysis of typological material showed considerably different results. Patterns of similarity were consistent neither with the known genealogy of the family nor with contact situations suggested by non-linguistic findings. The same mismatch of results has been reported for the Arawakan family (Carling et al. forthcoming), and it is an indication that lexical and typological components of languages are subjected to different historical developments in contact situations. As is well known in historical linguistics, this is due in part to varying degrees of feature borrowability. Basic vocabulary is traditionally considered to be more stable and less prone to replacement, an outcome reflecting the results of the linguistic analysis in Section 3. For example, the Tupí-Guaraní languages have around 70 percent shared cognates in basic vocabulary, while the structural data are much less homogeneous, confirming that lexicon, especially basic vocabulary, is better preserved in contact situations. The high rates of cognacy and, to a lesser extent, of structural similarity found among Tupí-Guaraní languages point to a relatively recent and rapid movement of Tupí-Guaraní speaking groups. They expanded to a vast territorial area but retained a substantial level of linguistic uniformity, even in contact situations where there was a tendency for cultural assimilation.
This conclusion fits well with the archaeological data, which propose a gradual replacement of the Arawak-dominated political control of greater Amazonia by Tupians from about 700–800 to 1200 CE. This process undoubtedly involved diverse aspects of interaction including warfare, cultural exchange, ecological practice, and linguistic evolution that ultimately led to the distribution of Tupian languages depicted in Map 8.1. Finally, the development of lingua francas such as Guaraní, Língua Geral Paulista, and Língua Geral Amazônica, in the contact scenario of early colonization, is a prime example of these processes within the Tupí-Guaraní branch.
Parts of the studies reported on here were carried out under the Tupí Comparative Project, a collaborative project ongoing at the Museu Goeldi/Brazil, since 1998, in cooperation with various Tupian specialists. We thank Pieter Muysken for making his personal notes on Língua Geral Amazônica and Cocama-Cocamilla available to us.
1 The Tupí-Guaraní branch has several languages and sub-branches, represented by the dotted lines, which are not shown in the diagram.
2 Analysis relative to the non-Tupí-Guaraní languages draws directly from Galucio et al. (to appear).
4 Also known as Kokama. In this volume the spelling Cocama-Cocamilla is adopted, following Peruvian usage.
5 The question of Cocama's genetic affiliation is discussed in Section 5.
6 Coded features are: vowel inventory, vowel length, tonal contrast, relational prefix, noun classifiers, positional demonstratives, valence change morphemes, order inside possessive phrases, order of adpositions and nouns, basic word order, subject-verb and object-verb preferred order, alignment systems, verbal argument marking, case markers, and subordination strategies (relative, complement, and adverbial clauses).
7 In a NeighborNet representation the amount of evidence within the data supporting each split is indicated by the length of the lines. Gaps in the data may account for the high number of unresolved or weakly supported clusters in the graph.
8 The Awetí branch, represented by the single Awetí language in the Upper Xingu area, also represents a major territorial expansion from the heartland in Rondônia, but it offers a limited amount of information relevant to the present study due to its minimal territorial extension for the short period for which we have good data. However, its close linguistic relationship to Mawé and Tupí-Guaraní (see above) points to a scenario where this group was once part of the expansion by the Mawetí-Guaraní branch.
9 The study also extends into the Caribbean, but is mainly focused on the tropical lowlands of South America.
10 Corresponding to the current Brazilian states of Pará and Maranhão.
11 Roteiro da viagem da cidade do Para até as ultimas colônias do sertão da Provincia (1768), by Jose Monteiro de Noronha (cited in Picanço 2005).
12 Eurico Miller (1992) has unearthed polychrome pottery claimed to date around 800 BCE along the upper Madeira River (interestingly, an area very close to the proposed homeland of the Tupian family and many of its sub-branches). If confirmed, such early dates would force archaeologists to alter the chronology and internal development of the Amazonian Polychrome tradition.
13 Among the Guaraní in southern Brazil, decorative elements in the polychrome pottery seem to have been lost instead (Noelli 2008: 661). See further discussion below.
14 Along the Rio Negro, polychrome Guarita-like ceramics were still being manufactured by Arawakans into the 1800s (Boomert 2004: 261), and Neves (2001: 274f.) has confirmed the link between contemporary Arawakan pottery and the Amazonian Polychrome tradition along the same river.
15 It is known from historical sources that the Manao traded with the Caribs of the Guiana Highlands (Edmundson 1904: 16).
16 The Panoan bark shirt (cushma) was initially an Andean trait transferred to the lowland groups through the ancient exchange systems of the eastern Andean slopes (Bodley 1981: 54f.), but many textiles are nowadays, as elsewhere in the world, being imported from Southeast Asia.