Cash Crops, Print Technologies, and the Politicization of Ethnicity in Africa

Published online by Cambridge University Press:  27 August 2021

Yannick I. Pengl, Postdoctoral Researcher, International Conflict Research, Department of Humanities, Social and Political Sciences, ETH Zürich, Switzerland
Philip Roessler, Associate Professor, Department of Government, William & Mary, United States
Valeria Rueda, Assistant Professor, Department of Economics, University of Nottingham, United Kingdom


What are the origins of the ethnic landscapes in contemporary states? Drawing on a preregistered research design, we test the influence of dual socioeconomic revolutions that spread throughout Africa during the nineteenth and twentieth centuries—export agriculture and print technologies. We argue these changes transformed ethnicity via their effects on politicization and boundary-making. Print technologies strengthened imagined communities, leading to more salient—yet porous—ethnic identities. Cash crop endowments increased groups’ mobilizational potential but with more exclusionary boundaries to control agricultural rents. Using historical data on cash crops and African language publications, we find that groups exposed to these historical forces are more likely to be politically relevant in the postindependence period, and their members report more salient ethnic identities. We observe heterogeneous effects on boundary-making as measured by interethnic marriage; relative to cash crops, printing fostered greater openness to assimilate linguistically related outsiders. Our findings illuminate not only the historical sources of ethnic politicization but also mechanisms shaping boundary formation.

Research Article
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence, which permits unrestricted re-use, distribution, and reproduction in any medium, provided the original work is properly cited.
© The Author(s), 2021. Published by Cambridge University Press on behalf of the American Political Science Association


What are the origins of the ethnopolitical landscapes that shape contemporary states? A voluminous literature points to the influence that ethnicity—social identity based on shared descent and culture—has on politics and the allocation of state resources. From the provision of public services to civil war, ethnicity is found to structure a wide range of political and economic processes (Chandra 2004; Habyarimana et al. 2009; Horowitz 1985; Roessler 2016). In this paper we address the question of what drives ethnic politicization—that is, why politics revolves around some cultural groups and not others. Despite a rich qualitative and historical literature on the topic (Bates 1983; Posner 2005; Vail 1989a), quantitative studies typically do not engage the endogenous sources of ethnopolitical divisions that shape policy outcomes. This represents an important limitation, as inferences on the consequences of ethnic politics may be vulnerable to selection problems (Birnir et al. 2015; 2018).Footnote 1

We seek to advance knowledge on this question, reporting the results of a preregistered research design.Footnote 2 We distinguish between two interrelated processes that shape ethnic politics: boundary-making and politicization. The former—the sine qua non of ethnicity (Barth 1969)—encapsulates the social boundaries that regulate group membership and shape inter-ethnic ties. Following from Weber (1978) and others (Caselli and Coleman 2013; Fearon 1999; Parkin 1974; Wimmer 2013), we conceive of the porosity of group boundaries as being especially consequential for ethnic politics. Politicization, on the other hand, occurs when members of a cultural group coordinate on their shared identity to compete for state power (Bates 1983; Fearon 1999). In accounting for variation in boundary-making and politicization, our framework focuses on periods of significant material and cultural change that potentially strengthened groups’ mobilizational capabilities and redefined the markers of group membership.

We study these phenomena across countries in Africa, a region in which ethnicity has structured political competition, but only among a subset of ethnolinguistic groups.Footnote 3 Much existing scholarship on the politicization of ethnicity in Africa points either to the lasting effects of colonialism—from the arbitrary territorial partition of the continent to the imposition of indirect rule (Asiwaju 1984; Ekeh 1990; Englebert, Tarango, and Carter 2002; Mamdani 1996)—or to the role of contemporary political competition (Posner 2005). These factors are no doubt important but arguably too widespread to explain significant within-country variation in ethnic identity salience (Vail 1989a). In addition, colonialism was embedded in larger socioeconomic changes. Two of particular importance were the cash crop revolution and the spread of Christianity by missionaries. Both of these transformations preceded the “Scramble for Africa” and may have affected ethnic identities independently of or in interaction with colonial policy making. We argue that these fundamental changes have path-dependent effects on contemporary ethnic mobilization and coalition formation despite significant institutional change over the last 150 years.

First, we posit that both the spread of cash crops and Christian missions contributed to the politicization of ethnicity. We hypothesized that the transition to commercial export agriculture increased the ethnic politicization of groups endowed with cash crops through a resource channel that bolstered these groups’ mobilizational capabilities but also via competition for land and the enforcement of descent-based property rights regimes. While missionaries also brought about important material changes through investments in new infrastructure and provision of education, perhaps even more important was the communication revolution they unleashed. Intent on spreading the Gospel, missionaries invested heavily in standardizing, writing, and printing what were primarily oral languages. This improved treated groups’ communication capabilities, while increasing ethnic salience through the strengthening of “imagined communities” (Anderson 1983)—as the adoption of a standardized language and the consumption of a uniform set of cultural characteristics, texts, and histories enhanced group solidarity.

Even as these dual socioeconomic forces increased ethnic politicization, we hypothesized they differently reshaped ethnic boundaries. The “imagined communities” reconstructed through language standardization created an opportunity for the assimilation of outsiders through language and cultural immersion—leaving a legacy of more inclusionary ethnic boundaries. Cash crop agriculture had a very different effect, as it was tied to control of the land. In the face of growing demand for access to their agricultural homeland, local communities employed ethnicity as a means of “social closure” (Parkin 1974; Weber 1978) to regulate land ownership and control agricultural rents—leaving a legacy of more exclusionary ethnic boundaries.Footnote 4

To test these hypotheses, we combine detailed historical data on cash crop production and the diffusion of print and writing technology (as measured by publications in African languages) with contemporary ethnicity data. Our cash crop data is based on a comprehensive historical map on the source locations of exports in late colonial Africa created by Hance, Kotschar, and Peterec (1961) and digitized by Roessler et al. (2020). To measure language standardization and its dissemination through printing, we compile a novel dataset of historical African language publications from Rowling and Wilson (1923) and Mann and Sanders (1994). Together, these two bibliographic sources cover approximately 10,000 titles in 370 distinct African languages.

We employ group-level and individual-level indicators to measure ethnic politicization. At the group level, we use the Ethnic Power Relations (EPR; Vogt et al. 2015) and the Politically Relevant Ethnic Groups (PREG) datasets (Posner 2004a) to measure which ethnic groups or coalitions have been active in competition for state power during the postindependence period. At the individual level, we use Afrobarometer Rounds 3–6 that include a question on whether respondents self-identify more in ethnic or national terms. To analyze the hypothesized heterogeneous legacies of cash crops and print technologies on boundary-making and social closure, we employ a behavioral measure of ethnic assimilation: inter- and intraethnic marriages from a large sample of couples surveyed by USAID’s Demographic and Health Surveys.

We use linguistic groups identified in the Ethnologue database as our primary unit of analysis to minimize concerns about endogenous sample selection (Laitin 2000a, 142). This enables us to merge our cash crop, publishing, and outcome data, along with a host of control variables, to the Ethnologue groups through spatial overlays or ethnic name matching.Footnote 5 In the survey-based analyses, we use two types of specifications. The first—geographic models—are based on the location of individuals and the Ethnologue polygons in which they reside. These models compare people located in different places with and without historical cash crop production and/or missionary publishing. The second—ethnic models—are based on survey respondents’ affiliation to a given ethnic group rather than place of residence. Thus, they compare individuals residing in the same location but from ethnic groups with differential exposure to historical cash crop production and missionary publishing. This enables separating culturally transmitted attitudes and behaviors from locational effects.
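The logic of the ethnic models can be illustrated with a minimal sketch on synthetic data. It is not the authors' code or specification: the function names, variable names, and data-generating values below are all hypothetical. Demeaning the outcome and the exposure indicator within each location is equivalent to including location fixed effects, so the remaining comparison is between respondents who live in the same place but differ in their group's historical exposure.

```python
import numpy as np

def within_location_slope(y, x, location):
    """Slope of y on x after removing location means (location fixed effects)."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    location = np.asarray(location)
    y_dm, x_dm = y.copy(), x.copy()
    for loc in np.unique(location):
        mask = location == loc
        y_dm[mask] -= y[mask].mean()
        x_dm[mask] -= x[mask].mean()
    return float(x_dm @ y_dm / (x_dm @ x_dm))

def pooled_slope(y, x):
    """Bivariate OLS slope that ignores where respondents live."""
    x_c = x - x.mean()
    return float(x_c @ (y - y.mean()) / (x_c @ x_c))

# Synthetic survey (hypothetical values): 50 locations, 20 respondents each.
rng = np.random.default_rng(0)
n_locations, per_location = 50, 20
location = np.repeat(np.arange(n_locations), per_location)
place_effect = rng.normal(size=n_locations)[location]

# Exposure is more common in places with a high locational effect,
# which confounds any comparison across places.
exposed = (rng.normal(size=location.size) + place_effect > 0).astype(float)
true_effect = 0.5
salience = (true_effect * exposed + 2.0 * place_effect
            + rng.normal(scale=0.1, size=location.size))

fe_estimate = within_location_slope(salience, exposed, location)
pooled_estimate = pooled_slope(salience, exposed)
print(f"within-location estimate: {fe_estimate:.3f}")
print(f"pooled estimate:          {pooled_estimate:.3f}")
```

In this simulated setting the pooled comparison absorbs the locational effect and overstates the exposure effect, whereas the within-location comparison recovers a value close to the one built into the data.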

We employ three main methods to mitigate endogeneity concerns. First, we employ location fixed effects in our ethnic-level specifications to address mission selection into areas with favorable locational fundamentals or those populated by already large and more powerful groups. Second, we use our African-language publishing data to analyze the effects of print technologies at the intensive margin (i.e., estimating the effects of the magnitude of publication records among groups with at least one publication). Third, we instrument actual crop production with agroclimatic suitability to address the potentially endogenous uptake of commercial agriculture. We also conduct additional robustness checks to rule out alternative explanations such as the effects of group size, precolonial centralization, indirect rule, ethnic diversity, and conversion to Christianity.
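The third strategy can be sketched as a two-stage least squares estimate on synthetic data. This is an illustrative sketch under stated assumptions, not the authors' specification: the variable names, the single instrument, the continuous outcome, and all numeric values are hypothetical, and controls and fixed effects are omitted. Only point estimates are computed, because standard errors taken from a manually run second stage would not be valid.

```python
import numpy as np

def ols_coefficients(X, y):
    """Least-squares coefficients of y on the columns of X."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

def two_stage_least_squares(y, treatment, instrument):
    """Point estimate of the treatment effect using one instrument."""
    const = np.ones(len(y))
    Z = np.column_stack([const, instrument])
    # First stage: predict the endogenous treatment from the instrument.
    fitted = Z @ ols_coefficients(Z, treatment)
    # Second stage: regress the outcome on the fitted treatment.
    X_hat = np.column_stack([const, fitted])
    return float(ols_coefficients(X_hat, y)[1])

# Synthetic groups (hypothetical values).
rng = np.random.default_rng(1)
n = 5000
suitability = rng.normal(size=n)   # instrument: agroclimatic suitability
confounder = rng.normal(size=n)    # unobserved group characteristic
production = 0.8 * suitability + confounder + rng.normal(scale=0.5, size=n)
true_effect = 0.3
relevance = true_effect * production + confounder + rng.normal(scale=0.5, size=n)

naive = float(ols_coefficients(np.column_stack([np.ones(n), production]), relevance)[1])
iv = two_stage_least_squares(relevance, production, suitability)
print(f"naive OLS estimate: {naive:.3f}")
print(f"2SLS estimate:      {iv:.3f}")
```

Because the simulated confounder raises both production and the outcome, the naive estimate is biased upward, while the instrumented estimate uses only the variation in production that is driven by suitability.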

We find that groups historically exposed to cash crops or print technologies are significantly more likely to be politically relevant after independence. According to PREG (EPR), groups that cultivated at least one of five major cash crops through the end of colonialism or with a historical publication in their language are, respectively, 129% (54%) and 88% (45%) more likely than the average group to be politically relevant. These results are robust to instrumenting crops with suitability and when focusing only on the subsample of groups exposed to Christian missions.

At the individual level, we find that citizens residing in areas of historical cash crop production or living in Ethnologue polygons with a history of publishing are significantly more likely to self-identify with their ethnic group rather than nationality. Moreover, ethnic salience follows our expectation of cash crops producing location-specific effects among “stayers” and publishing producing broader cultural effects, including among “movers” (i.e., respondents living outside their ancestral ethnic homeland). We do not find evidence, however, that groups treated with cash crops or print technologies have more homogeneous political preferences today.

We find strikingly different effects of cash crops and publishing on the porosity of ethnic boundaries, as measured by observed interethnic marriage rates. Consistent with our expectation that cash crops engendered social closure and less openness to ethnic outsiders, we find interethnic marriage to be significantly lower even with linguistically closely related groups. In contrast, and consistent with the hypothesis that print technologies led to salient but more porous ethnic boundaries, we find null and sometimes positive effects on interethnic marriage with linguistically close ethnic outsiders but negative effects on marriages across large linguistic distances. However, in contrast to our expectations, both exposure to cash crops and print technologies are positively associated with contemporary ethnic-based conflict—suggesting that, even as print technologies opened the door to assimilation of culturally proximate outsiders, their politicizing effects ensured these groups have not escaped cycles of ethnic conflict.

Our findings address different research streams in the social sciences. Despite a strong consensus on the constructivist nature of ethnicity (Chandra 2012; Laitin and Posner 2001), the endogenous sources of ethnogenesis remain understudied. Our paper illuminates the historical role of export agriculture and publishing in Africa. Moreover, our analysis sheds light on the relationship between ethnic politicization and boundary-making (Wimmer 2013). It is generally assumed that these two processes are reinforcing, leading perhaps to convergence in the types of social boundaries regulating politically relevant ethnic groups. This may or may not be the case; as we illustrate, even across politicized groups, boundary policing can vary based on path-dependent effects of material and cultural changes on assimilationist practices and norms of openness.

In advancing this line of inquiry, we draw on classic theories of group formation—Weber’s (1978) notion of social closure, Anderson’s (1983, 46–7, 7) framework on the ethnonational effects of print technologies, and prominent but conflicting accounts of how economic change transforms ethnic identities (Bates 1974; Gellner 1983; Robinson 2014). To date, there have been few systematic tests of Anderson’s “imagined communities” hypothesis.Footnote 6 We find strong support for a link between print technologies, language standardization, and ethnonationalism in Africa. However, as we explain below, the mechanisms through which these processes reconstructed ethnic identity differed from those of nineteenth-century Europe where “print capitalism,” bureaucratic “languages of power,” and state-sponsored nation-building fostered national identities rather than the subnational identities that arose across Africa.

As far as “modernization” is concerned, our results are broadly in line with Bates’s (1974) intuition that competition for economic benefits may deepen ethnic divisions. At the same time, our focus on cash crops produced by African smallholder farmers suggests that rural economic change was just as important as the urban dynamics prominently highlighted in the existing literature (Cohen 1969; Epstein 1958).

Finally, our paper employs a preregistered design to address growing concerns about publication bias and data mining for significant results in historical persistence studies. Beyond guarding against cherry-picking positive findings, preregistration encourages careful ex ante theorizing and hypothesis development. Preregistration does not preclude ex post modifications of the prespecified analyses, but it does necessitate transparency about any changes made. In this vein, we describe all prespecified hypotheses and analyses in Supplementary Information IV.

The Determinants of Africa’s Ethnic Landscape

In this section we more fully advance our theoretical argument on the influence of the cash crop and print revolutions on shaping Africa’s modern ethnic landscape. Before addressing each in turn, we first situate our argument within the broader ethnicity scholarship.

Ethnic Boundary-Making and Politicization

We conceive of a country’s ethnic landscape as shaped by two key processes: boundary-making and politicization. The former encompasses the construction and maintenance of social differences (Barth 1969) in which individuals employ “points of social reference,” such as ascriptive, cultural, or other markers, to place themselves and others into groups to “order” the world (Hale 2004). Boundary-making helps to solidify social groups through the adoption of criteria for membership and their enforcement by in-group members (Wimmer 2013). Following from Weber (1978), we consider a group’s closure or accessibility as one of the most important dimensions of boundary-making (Wimmer 2013). Politicization, on the other hand, entails members of a given group consciously or subconsciously leveraging their shared identity to coordinate their behavior to access political and economic benefits (Bates 1983; Fearon 2006).

Generally, boundary-making and ethnic politicization are theorized to be reinforcing. This is perhaps most starkly illuminated in the civil war literature in which conflict along ethnic lines contributes to the hardening of social boundaries (Fearon and Laitin 2000; Kalyvas 2008). Other forms of political competition, such as elections, are also found to increase ethnic salience (Eifert, Miguel, and Posner 2010; Oucho 2002)—although this does not necessarily translate into higher degrees of closure.Footnote 7 The reverse—that boundary-making facilitates ethnic politicization—is an important assumption in rationalist accounts of ethnic coalition formation that stress the need to exclude outsiders from the returns to collective action (Fearon 1999).Footnote 8 The reinforcing effects of boundary-making and politicization may suggest some degree of convergence in the structure of social boundaries across politicized groups, but as far as we know this has not been empirically assessed.Footnote 9

Existing Literature

What then explains boundary-making and politicization? Following from our conceptual framework, we expect factors shaping boundary-making to drive the construction and enforcement of socially differentiated groups, whereas factors activating politicization likely work through their effects on group coordination and mobilization. Here we briefly synthesize existing research with a focus on sub-Saharan Africa.

Evolutionary and geographic approaches, respectively, attribute Africa’s comparatively high ethnic diversity to the loss of genetic variation as human species migrated from the cradle of humankind (Ahlerup and Olsson 2012; van den Berghe 1981) and ecological variation, leading to economic and cultural differentiation (Michalopoulos 2012; Nettle 1998). What form these groups take and the degree of their politicization then depend on a host of historical, material, and institutional factors.

One factor regularly advanced as contributing to political relevance is group size, following the logic that a minimum support base is necessary to sustain viable political coalitions (Bates 1983; Posner 2004b; 2005). Beyond size, others point to the importance of groups’ sociopolitical structures, in particular legacies of centralized and hierarchical institutions (Michalopoulos and Papaioannou 2013).Footnote 10 In the context of Africa’s multiethnic states, historical statehood may have deepened ethnopolitical cleavages (Paine 2019).

Other research focuses on economic change and its differential effects on groups across the continent. Ekeh (1990), Nunn (2008), and Nunn and Wantchekon (2011) highlight how the slave trades contributed to ethnic fractionalization and strengthened norms of mistrust. The decline of the slave trades corresponded with the spread of export agriculture (Hopkins 1973) and Christian missionaries across the continent (Cagé and Rueda 2016). Bates (1974) mentions both of these factors as examples of spatially concentrated modernization benefits that spurred intergroup inequality, competition, and politicization.Footnote 11 Other relevant economic changes include mining, railway construction, and perhaps most prominently urbanization (Cohen 1969; Horowitz 1985; Nnoli 1978; Vail 1989a).

Beyond their material effects, missionaries, export agriculture, and the colonial state had profound cultural effects. Through tracing the historical process, Ranger (1989) shows how missionary investments in the translation and printing of Bibles in vernacular languages “created rather than merely reflected” extant ethnolinguistic divisions.Footnote 12 Berry (1993) and Lentz (2013) point to the effects of the commercialization of agriculture on the reconstruction of social identities, especially the distinction between “natives”—or “sons of the soil”—and “strangers” (Lentz 2013). Mamdani (1996) argues that the colonial project had much broader cultural effects through social engineering around the “customary.” Reinforced through indirect rule and other colonial policies of social control (Eyoh 1999; Posner 2005), colonialism sharpened communal identitiesFootnote 13 through ideologies of “tribalism” (Ekeh 1975) and “autochthony” (Lentz 2013).Footnote 14

The anticolonial liberation struggle held the promise to reimagine social relations and national communities (Ake 1993; Ekeh 1990; Fanon 1963)—and in some cases, such as Nyerere’s Tanzania, this was achieved (Miguel 2004). But largely, postcolonial competition for state power revolved around ethnopolitical networks, further deepening ethnic politicization (Horowitz 1985; Nnoli 1998; Roessler 2016; Rothchild 1997). The advent of multiparty elections with the end of the Cold War, in some cases, transformed ethnopolitical configurations (Posner 2005), but this often intensified rather than dampened ethnic salience (Eifert, Miguel, and Posner 2010; Oucho 2002) as well as autochthonous mobilization (Ceuppens and Geschiere 2005; Marshall-Fratani 2006). However, there is some evidence that urbanization and demographic change (leading to greater levels of ethnic diversity), as well as democratic institutions, are reducing ethnic favoritism (Burgess et al. 2015; Ichino and Nathan 2013; Kramon et al. 2021).


We build on and extend this literature by developing and systematically testing new hypotheses on how the cash crop and print revolutions shaped processes of ethnic boundary-making and politicization from the nineteenth century onward.

The Cash Crop Revolution

In the nineteenth and twentieth centuries, African economies underwent an important structural transformation away from the slave trades that dominated exchange for the previous four hundred years to commercial export agriculture (Frankema, Williamson, and Woltjer 2018; Hogendorn 1969; Hopkins 1973).Footnote 15 The cash crop revolution led to an important spatial shift in economic production to areas suitable for oil palm, groundnuts, cocoa, coffee, and cotton and enabled millions of African smallholders and traders to benefit from global exchange (Hopkins 1973). Fueled by European-financed transportation infrastructure before and during colonialism, these cash crop zones were vertically integrated with export markets but with weak horizontal linkages with the rest of the colony (Hirschman 1977; Rodney 1972; Roessler et al. 2020).

Consistent with Bates (1974), we posit that the spatial disparities arising from the cash crop revolution had important path-dependent effects on ethnic politicization. The takeoff of export agriculture endowed some groups—those who would be the primary producers of cash crops or the owners of the land on which they were produced—with a common economic niche, much greater wealth potential than others, and clear incentives to defend these advantages in competition with other groups.

A second and closely related channel of ethnic politicization was via the effects of the commercialization of agriculture on land tenure regimes.Footnote 16 Many of the most suitable areas experienced an increase in demand for land as waves of farmers, including enterprising migrant farmers (Hill 1963), adopted cash crops. Labor migration to cash crop areas further increased local diversity, land pressures, and intergroup competition.

The commercialization of agriculture combined with migration-led population growth induced important changes in the social bases of land tenure regimes. In precolonial Africa, land rights were contingent on group membership or allegiance to traditional authorities (Berry 1993). These practices did not change per se with the advent of cash crop agriculture and colonialism. What did change, however, was outsiders’ eligibility for group membership as ethnic boundaries became more tightly regulated (Boni 2006; Lentz 2013). Thus, following from Weber’s (1978) idea of social closure (Parkin 1974), in which social identity is employed as a means of restricting access to economic rents, in the face of rising land values and an influx of migrants, ethnic boundaries were more firmly policed to exclude outsiders from land ownership.Footnote 17 In line with the idea that ethnic differences are constructed, at least partially, as “a boundary-enforcement device” (Caselli and Coleman 2013, 162) to control private goods, contestation over land not only made ethnicity more salient; it likely led sons of the soil to emphasize less accessible criteria of group membership such as ascriptive characteristics and ancestral ties to the land.Footnote 18 In a fascinating ethnography of the effects of the spread of cocoa and migrant farmers to the Sefwi homeland (located in present-day western Ghana) from the early twentieth century onward, Boni (2006) documents this precise dynamic unfolding—resulting in the “ancestralization of land rights” and more stringent enforcement to prevent migrants from permanently owning land.

We expect these mechanisms to only apply to regions of African smallholder production. Where European companies, settlers, or the colonial state dominated production, land alienation and labor coercion likely undercut local control of agricultural rents, weakened ethnic institutions, and reduced opportunities for ethnic boundary-making.

Christian Missions, Print Technologies, and African Language Publications

As the abolition of the slave trade ushered in cash crop agriculture in Africa, it also gave momentum to the spread of Christian missions across the continent. In their endeavor to spread the Gospel, missionaries spearheaded a communication revolution.

Missionaries translated the Bible and education materials into vernacular languages as a vehicle for conversion (Laitin 2007; Ranger 1989; Woodberry 2012). As most African languages were oral languages, missionaries first invested in language standardization and developing Latin-script writing systems (Posner 2003; Ranger 1989). To propagate language knowledge and consumption of the written texts, printing presses were imported to publish Bibles, hymnals, and grammar books that were then used in churches and schools (Cagé and Rueda 2016; Posner 2003). This communication revolution was most intense in British colonies given the preponderance of Protestant missionaries and the promotion of local languages and culture as part of indirect rule arrangements (Albaugh 2014).Footnote 19

Anderson’s (1983) argument on the influence of print capitalism on European nation-building is a valuable reference when considering the effects of Africa’s print revolution on ethnonational communities. However, while language standardization and printing underpinned significant social changes in both regions, some mechanisms differed (Ranger 1989). First, given lower literacy and less integrated markets, in Africa the consolidation of ethnolinguistic consciousness and politicization did not result from the simultaneous mass consumption of newspapers and novels followed by state adoption and enforcement of national languages. Instead, missionary investments in language and printing in Africa instigated much more localized “imagined communities,” which were constructed and sustained by new cultural entrepreneurs (initially indigenous missionaries undertaking the language standardization) and by community members’ exposure to the translated Bibles, conversionary material, and other printed texts in vernacular languages. These activities spurred ethnonational “awakenings” similar to what Anderson (1983, 73) describes in Europe—where the “energetic activities of … professional intellectuals were central to the shaping of nineteenth-century European nationalisms.” In Africa, an intelligentsia of mostly mission-educated linguists, writers, and teachers transmitted ideas of groupness through the churches and schools and, in turn, created new ethnic elites who further promoted the group’s values and solidarity through literature, newspapers, and the formation of cultural associations (Vail 1989a, 11–2).

Another important difference with Europe was the role of the state. According to Anderson (Reference Anderson1983, 76), the expansion of European states increased the importance of official languages and fostered the development of a bureaucratic middle class. At the same time, state-sponsored nationalisms promoted linguistic assimilation and national identities (Weber Reference Weber1976). In contrast, in colonial Africa, the state was run by Europeans with little interest in fostering an African class of bureaucrats. Instead, colonial authorities focused on thwarting rather than promoting any kind of national identity, fearing the rise of revolutionary movements (Vail Reference Vail and Vail1989a).

The Yoruba represent a paradigmatic case of the influence of missionary language investments and publishing on the reconstruction of ethnic identity.Footnote 20 With the collapse of the Oyo empire at the end of the eighteenth century, civil wars and slave raiding divided the Yoruba into rivalrous subgroups (Adediran 1984). From the 1840s onward, however, missionaries from the Church Missionary Society (CMS), including freed slaves such as Samuel Crowther, contributed to the rebuilding of the Yoruba ethnic nation. Intent on spreading Christianity, the CMS missionaries worked on Yoruba orthography, translation, and publishing, even starting a Yoruba newspaper as early as 1859 (Falola 1999). In propagating a standardized language and embracing and promoting the ethnonym “Yoruba,” the Christian missionaries boosted Yoruba ethnic consciousness (Peel 2003). Moreover, as missionaries interpreted Yoruba history and tradition through a Christian lens (most famously Samuel Johnson in The History of the Yorubas), ethnogenesis and religious change reinforced each other. Consistent with Vail (1989b), missionary schools contributed to the propagation of standardized Yoruba through instruction in the language, which then produced new elites who served as champions of Yoruba solidarity and nationalism (Usman and Falola 2019). This is personified in the life of Obafemi Awolowo, one of Nigeria’s founding fathers. Awolowo, born into one of the first Christian families in Ikenne, was educated in missionary schools before leading a pan-Yoruba cultural association (Egbé Ọmọ Odùduwà) dedicated to “reinventing a common Yorùbá identity” (Adebanwi 2014).

The standardization and printing of African languages is therefore expected to have strengthened groups’ ethnonationalism and their mobilizational capabilities—with the rise of new ethnic elites and the writing and printing technologies they could wield as they competed in the political arena. In addition to strengthening groups’ political capacity, the print revolution likely contributed to more expansionary identities than cash crop agriculture, as missionaries encouraged language uptake and provided opportunities for outsiders to learn the language via dissemination of language materials, church-related activities, and schooling. The upshot was the construction of more porous ethnic boundaries and assimilationist cultural practices—at least among those who adopted the group’s language.


Following from our theoretical framework, we preregistered the hypothesis that groups exposed to cash crops or print technologies are more likely to be politically relevant in the postindependence period. We also expected this to lead to more salient ethnic identities among individual group members. Despite these similar effects on ethnic politicization, we expected differential effects on boundary-making. We hypothesized that the commercialization of agriculture led to the construction of less porous ethnic boundaries than vernacular publishing and predicted lower rates of interethnic marriage for the cash crop than for the publication treatment.Footnote 21


In this section we describe the various sets of data we assemble to test our hypotheses. We explain the use of Ethnologue to derive units of analysis, describe our historical data on cash crops and African language publishing, and discuss our proxies for ethnic politicization, salience, and boundary-making.Footnote 22

Historical and Geographic Data

Identifying Potentially Relevant Groups

For a candidate list of nominal ethnic categories, we use Ethnologue, a reference source on living languages. Ethnologue attempts to capture the complete universe of languages regardless of their social or political relevance or demographic size (Simons and Fennig 2017). Having been compiled from the 1950s onward, Ethnologue may nevertheless miss a few small or extinct precolonial ethnolanguage groups. However, selection issues seem minimal in comparison with datasets like AMAR, EPR, or Murdock (1959; 1967).Footnote 23 Identifying potentially salient ethnic categories from Ethnologue restricts our focus to ethnolinguistic rather than racial, religious, or regional markers. The analytical consequences of this restriction are minimal since in our sub-Saharan African sample practically all ethnic categories in EPR, PREG, Afrobarometer, and DHS are equivalent to, or combinations of, language families, languages, or dialects. Another advantage of Ethnologue is that its companion dataset, the World Language Mapping System (WLMS), provides maps demarcating linguistic homelands, which we leverage to spatially aggregate our cash crop data, survey-based outcome measures, and geographic control variables as described in detail below.

Cash Crops

To measure cash crop production, we use a geospatial dataset on the primary commodity revolution in Africa from Roessler et al. (Reference Roessler, Pengl, Marty, Titlow and van de Walle2020), drawing on a historical map produced by Hance, Kotschar, and Peterec (Reference Hance, Kotschar and Peterec1961). The map depicts the source locations of more than 95% of exports in 1957 across 38 states in sub-Saharan Africa.Footnote 24 Each primary commodity production point represents a value of $289,270 in 1957 USD. The dataset covers nine groups of cash crops;Footnote 25 20 minerals and metals; and forest, animal, and manufactured products.

Our main analysis focuses on the five main cash crops: cocoa, coffee, cotton, palm, and groundnut, representing 80% of total cash crop production and no less than half of all exports in 1957 across the countries in our sample. In addition, these five crops were predominantly produced by African smallholders rather than European settlers or on plantations, which makes them more relevant for our stipulated causal mechanism than other resources. Our Supplementary Information (section III.5) presents additional analyses also including other crops and minerals and more precisely coding the mode of production for all country–crop combinations in the Hance data. Figure 1 maps the 4,651 locations that produced one of the five most important export crops.

Figure 1. Publications and Cash Crop Locations

Note: Language homelands are mapped according to Ethnologue. Grayed regions are Ethnologue polygons for which there is no record of publications. Colors indicate the number of publications listed in Rowling and Wilson (1923). Each blue cross locates 289,270 USD (1957) of cash crop export value for either cocoa, coffee, cotton, groundnuts, or palm oil. Solid black country borders describe our sample.

Print Technologies and Publishing Data

To capture exposure to print technologies, we draw on two library databases to construct a record of historical publishing at the language level.Footnote 26 In combination with Ethnologue and WLMS, this represents the first ethnically linked and geocoded database of publishing in African languages throughout the colonial period and after independence.

The first source is a 1923 compilation of 2,480 publications across 168 languages (Rowling and Wilson Reference Rowling and Wilson1923). It was intended to serve as a reference book for publications by Christian missionaries in Africa including not just religious texts but also dictionaries, grammar books, educational materials, and newspapers. It also provides contemporaneous estimates of the number of speakers per included language, which we use to normalize the number of publications.

Our second source (Mann and Sanders Reference Mann and Sanders1994) catalogues “collections of African language texts at SOAS, … the African Department of SOAS, the International Institute for African Languages and Cultures, … and the International Committee on Christian Literature for Africa.” This source complements Rowling and Wilson (Reference Rowling and Wilson1923), especially given its greater temporal coverage. However, Mann and Sanders (Reference Mann and Sanders1994) exclude grammars and dictionaries, which may have been particularly important for constructing salient ethnolinguistic communities. It is much less comprehensive on early printed materials, as it counts 50% fewer pre-1925 titles than Rowling and Wilson (Reference Rowling and Wilson1923). We thus use Rowling and Wilson (Reference Rowling and Wilson1923) as the main source in our analysis and present results using Mann and Sanders (Reference Mann and Sanders1994) in the Online Appendix.

The map in Figure 1 shows the total number of publications per ethnolinguistic polygon as listed in Rowling and Wilson (Reference Rowling and Wilson1923).

Contemporary Data on Ethnic Identities and Political Relevance

We use several data sources to measure the main outcomes of our study: ethnic politicization and boundary-making at the group and individual level.

Group-Level Politicization Measures

To measure which Ethnologue groups serve as bases for contemporary political mobilization, we match Ethnologue to two expert-coded sources on ethnic groups’ relevance in national-level political competition postindependence: the Politically Relevant Ethnic Groups (PREG; Posner 2004a) and the Ethnic Power Relations (EPR; Vogt et al. 2015) datasets. For each, we code a binary outcome indicating whether the Ethnologue group has a one-to-one match in PREG/EPR (e.g., Yoruba and Yoruba) or is a clearly identifiable part of a broader ethnic coalition coded as relevant by the respective dataset (e.g., the Gikuyu language as part of the Kikuyu-Meru-Embu coalition in EPR). All Ethnologue groups without any plausible exclusive or coalition match to the respective dataset are coded zero on the respective PREG or EPR outcome.Footnote 27
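The coding rule can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the two crosswalk dictionaries, the function name `code_relevance`, and the placeholder language "Examplish" are hypothetical, and the actual matching relied on expert judgment rather than a lookup table.

```python
# Hypothetical crosswalks for illustration only; the real matching was done by hand.
ONE_TO_ONE = {"Yoruba": "Yoruba"}            # Ethnologue language -> identically named group
COALITION = {"Gikuyu": "Kikuyu-Meru-Embu"}   # Ethnologue language -> broader coalition

def code_relevance(language, relevant_groups):
    """Return 1 for an exclusive or coalition match among relevant_groups, else 0."""
    match = ONE_TO_ONE.get(language) or COALITION.get(language)
    return int(match is not None and match in relevant_groups)

# Groups coded as politically relevant in a hypothetical PREG/EPR-style list.
relevant = {"Yoruba", "Kikuyu-Meru-Embu"}
outcomes = {lang: code_relevance(lang, relevant)
            for lang in ["Yoruba", "Gikuyu", "Examplish"]}
```

Languages with no plausible match, such as the placeholder entry here, receive a zero on the outcome.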

Individual-Level Politicization Measures

The salience of individual members’ ethnicity vis-à-vis other identities likely varies between and within ethnic groups. To analyze this, we use survey data from rounds 3–6 of Afrobarometer, which ask respondents whether they identify more in ethnic or in national terms (Ali et al. Reference Ali, Fjeldstad, Jiang and Shifa2019; Robinson Reference Robinson2014). We use a dummy variable of whether a respondent identifies more strongly or even only in ethnic rather than national terms as the outcome in our Afrobarometer specifications.


A key dimension of boundary-making is a group’s accessibility to outsiders. Given the importance of marriage in social relations and group maintenance, many scholars view “endogamy [as] the ultimate measure of the salience of boundaries for intergroup relations” (Hechter 1978, 304). The underlying assumption is that members of groups with more exclusionary boundaries are less likely to marry outside their group and more likely to develop norms against such practices. To calculate ethnic exogamy, we use USAID’s Demographic and Health Surveys (DHS), which include data on the ethnicity of individuals and their spouses. These measures are described in more detail below.

Analysis I: Ethnic Politicization and Salience

We first report our specifications and results for the effect of cash crops and publishing on ethnic politicization at the group and individual levels.

Group-Level Specification and Results

To test for group-level effects, we estimate regression equation 1 using OLS.

(1) $$ \mathrm{Pol}_{ec}={\beta}_0+{\beta}_1\mathrm{CashCrops}_{ec}+{\beta}_2\mathrm{Publications}_{ec}+{X}_{ec}^{\prime}\gamma +{\lambda}_c+{\varepsilon}_{ec}. $$

$ \mathrm{Pol}_{ec} $ measures the political relevance of Ethnologue group e in country c, using PREG or EPR. $ \mathrm{CashCrops}_{ec} $ is a binary measure of historical cash crop cultivation in the Ethnologue polygon. $ \mathrm{Publications}_{ec} $ indicates whether Rowling and Wilson (1923) lists at least one publication in Ethnologue language e; $ {\lambda}_c $ represents country fixed effects; and $ {X}_{ec}^{\prime} $ is a set of standard geographic and historical controls including agricultural suitability; tsetse fly and malaria ecology; elevation; ruggedness; average yearly precipitation; average yearly temperature; distances (in logs) to the coast, to navigable rivers, to cities in 1900, to the country capital, to historical missions, and to missionary printing presses; and absolute longitude and latitude.
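To make the structure of equation 1 concrete, the following Python sketch estimates a linear model with two binary treatments and country fixed effects by demeaning all variables within countries. It is a simplified illustration, not the estimation code behind the reported results: it omits the control vector, uses synthetic noise-free data, and the coefficient values 0.16 and 0.12 as well as all function names are arbitrary choices made for the example.

```python
from collections import defaultdict

def demean_by_group(values, groups):
    """Subtract each group's mean (the within transformation that absorbs fixed effects)."""
    sums, counts = defaultdict(float), defaultdict(int)
    for v, g in zip(values, groups):
        sums[g] += v
        counts[g] += 1
    return [v - sums[g] / counts[g] for v, g in zip(values, groups)]

def fe_ols(y, cash, pubs, country):
    """OLS of y on two regressors with country fixed effects; returns (beta1, beta2)."""
    yd = demean_by_group(y, country)
    x1 = demean_by_group(cash, country)
    x2 = demean_by_group(pubs, country)
    s11 = sum(a * a for a in x1)
    s22 = sum(b * b for b in x2)
    s12 = sum(a * b for a, b in zip(x1, x2))
    s1y = sum(a * c for a, c in zip(x1, yd))
    s2y = sum(b * c for b, c in zip(x2, yd))
    det = s11 * s22 - s12 ** 2
    if det == 0:
        raise ValueError("regressors are collinear within countries")
    # Solve the 2x2 normal equations by Cramer's rule.
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det

# Synthetic toy data: two countries with different intercepts (the fixed effects).
country = ["A"] * 4 + ["B"] * 4
cash = [0, 1, 0, 1] * 2
pubs = [0, 0, 1, 1] * 2
intercept = {"A": 0.10, "B": 0.30}
y = [intercept[c] + 0.16 * x + 0.12 * p for c, x, p in zip(country, cash, pubs)]

beta1, beta2 = fe_ols(y, cash, pubs, country)
```

Because the toy outcome is generated without noise, the within estimator recovers the two slopes regardless of the country intercepts, which is the role the fixed effects play in equation 1.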

Figure 2 reports the estimates of regression 1 when the outcome is a binary variable equal to one if the ethnic group is matched to a politically relevant group or coalition in PREG or EPR. Our baseline results indicate that, conditional on controls, a group with historical cash crop production is roughly 16–17 percentage points more likely to be listed as politically relevant in PREG and EPR (a 129% and 54% increase from the sample mean of the dependent variable, respectively). Similarly, languages with historical publishing are 11–13 percentage points more likely to be listed as politically relevant in PREG and EPR (an 88% and 45% increase from the respective outcome mean).

Figure 2. Cash Crops, Print Technologies, and Political Relevance

Note: These figures summarize the results of eight regression models. The two binary outcomes indicate whether an Ethnologue group is matched to a group or coalition listed as politically relevant in PREG (left-hand panel) or EPR (right-hand panel). Lines 1 and 2 report effects using binary treatments, indicating whether Ethnologue groups were exposed to cash crop production and/or print technologies. In lines 3 and 4, cash crops are instrumented with the mean agroclimatic suitability for the five most important export crops by using the spatial 2SLS approach described in the text. In lines 5 and 6, the sample is restricted to Ethnologue polygons that experienced missionary activity. Lines 7 and 8 control for logged historical population per Ethnologue polygon based on HYDE raster data.

Potential endogeneity necessitates caution in causally interpreting the correlations reported in Figure 2. One important concern is that our results are driven by geographic or historical determinants of ethnic groups’ take-up of cash crops and print publishing.Footnote 28 We employ several strategies to address this issue.

First, we instrument Cash Crops ec with indicators of suitability for cash crop agriculture and estimate the effects using a spatial-2SLS (S2SLS) strategy, following Betz, Cook, and Hollenbach (Reference Betz, Cook and Hollenbach2019). The instrument is the average agroclimatic suitability from the FAO GAEZ database across the five most important African cash crops (cocoa, coffee, cotton, groundnuts, and oil palm) in the homeland of ethnic group e. These suitability scores combine soil and climatic characteristics to predict the ecological potential to grow specific crops in rainfed agricultural systems. To serve as a valid instrument, suitability may only affect outcome variables through its influence on actual cash crop production. We argue that this exclusion restriction likely holds, conditional on the rich set of geographic and historical controls in our models, especially general agricultural suitability, temperature, and precipitation, which are included to isolate cash-crop-specific effects from overall agricultural productivity and its social and political consequences.
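The logic of the instrument can be shown with a deliberately stripped-down Python sketch. It implements a plain just-identified IV (Wald) estimator, not the spatial 2SLS procedure used in the analysis, and it has no controls or spatial lag. The data are synthetic, and the variable names and the true effect of 0.5 are assumptions made only for illustration.

```python
def cov(a, b):
    """Covariance with 1/n scaling; the scaling cancels in the ratios below."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / len(a)

def ols_slope(x, y):
    """Bivariate OLS slope of y on x."""
    return cov(x, y) / cov(x, x)

def iv_slope(z, x, y):
    """Just-identified IV estimator: cov(z, y) / cov(z, x)."""
    return cov(z, y) / cov(z, x)

# Toy data: u is an unobserved confounder, uncorrelated with the instrument z by construction.
z = [0, 0, 1, 1]                                  # instrument (e.g., suitability)
u = [0, 1, 0, 1]                                  # confounder
x = [zi + ui for zi, ui in zip(z, u)]             # treatment depends on z and u
y = [0.5 * xi + 2 * ui for xi, ui in zip(x, u)]   # true effect of x is 0.5

naive = ols_slope(x, y)   # biased upward by the confounder
iv = iv_slope(z, x, y)    # uses only the variation in x induced by z
```

In this toy setup the naive slope absorbs the confounder's influence, while the IV estimate isolates the part of the treatment that moves with the instrument. The exclusion restriction corresponds to `z` entering `y` only through `x`.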

The suitability instrument strongly predicts colonial cash crop production in first-stage regressions. The first-stage F-statistic is 13.5 in the EPR and 13.3 in the PREG models. To account for potentially similar spatial patterns in the instrument and outcomes that may threaten the exclusion restriction, the IV models further include a spatial lag of the respective political relevance outcomes instrumented with first- and second-order spatial lags of the baseline controls (Betz, Cook, and Hollenbach Reference Betz, Cook and Hollenbach2019). All spatial lags are based on a binary contiguity matrix that defines ethnic group e’s neighbors as all other ethnic polygons within a 100-km centroid distance.Footnote 29 Line 3 in Figure 2 shows that S2SLS results remain similar to baseline OLS although confidence intervals naturally become wider.
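A binary contiguity matrix of this kind can be sketched as follows in Python. The centroid coordinates are hypothetical, the great-circle (haversine) distance and the inclusive 100-km cutoff are assumptions, and the spatial lag is computed as an unnormalized sum over neighbors because the text does not state whether the matrix is row-standardized.

```python
import math

def haversine_km(p, q):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*p, *q))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(a))

def contiguity_matrix(centroids, cutoff_km=100.0):
    """W[i][j] = 1 if polygons i and j are distinct and within cutoff_km, else 0."""
    n = len(centroids)
    return [[int(i != j and haversine_km(centroids[i], centroids[j]) <= cutoff_km)
             for j in range(n)] for i in range(n)]

def spatial_lag(W, values):
    """Unnormalized spatial lag: for each unit, the sum of its neighbors' values."""
    return [sum(w * v for w, v in zip(row, values)) for row in W]

# Hypothetical polygon centroids (lat, lon): the first two are about 56 km apart.
centroids = [(7.0, 4.0), (7.5, 4.0), (12.0, 8.0)]
W = contiguity_matrix(centroids)
lag = spatial_lag(W, [1, 0, 1])   # e.g., neighbors' political relevance outcomes
```

The third polygon has no neighbors within the cutoff, so its row in `W` is all zeros and its spatial lag is zero.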

A second endogeneity concern is that European missions tended to establish outposts in geographically favorable areas or those with already more intensive colonial presence (Jedwab, zu Selhausen, and Moradi 2018). Subsetting the analysis to groups exposed to missions makes the analysis sample more comparable in terms of geographic fundamentals and other potential determinants of missionaries’ targeting of specific groups and areas. The results, reported in line 6 of Figure 2, remain robust despite the large reduction in observations and correspondingly large standard errors.

Individual-Level Specification and Results

To test for individual-level effects, we use survey data of expressed ethnic salience and estimate the following equation in a geographic and an ethnic variant:

(2) $$ \mathrm{Sal}_{ie\ell cs}={\mu}_0+{\mu}_1\mathrm{Cash\ Crops}_{kcs}+{\mu}_2\mathrm{Publications}_{kcs}+{W}_{i\ell cs}^{\prime}\gamma +{\eta}_{k^{\prime }}+{\varepsilon}_{ie\ell cs}. $$
(3) $$ k\in \left\{e,\ell \right\},\quad {k}^{\prime}\in \left\{\ell, cs\right\}. $$

$ \mathrm{Sal}_{ie\ell cs} $ is a binary Afrobarometer-based survey measure of greater ethnic than national identification. The unit of analysis now is respondent i, who identifies with ethnic group e, residing in survey location $ \ell $, in country c, and is interviewed in Afrobarometer survey round s. We assign Afrobarometer respondent i to ethnic group e based on the language they report speaking at home and use geographic information on Afrobarometer’s survey locations $ \ell $ to assign individual respondents to Ethnologue polygons.

In our geographic specifications ($ k=\ell $), we use the cash crop production value within a 15-km radius of a survey location as treatment variable $ \mathrm{Cash\ Crops}_{\ell cs} $. The variable $ \mathrm{Publications}_{\ell cs} $ is the number of publications in the language of the local Ethnologue polygon normalized by historical group population as provided in Rowling and Wilson (1923). The geographic specifications thus assign treatment variables entirely based on respondents’ place of residence and irrespective of their self-reported ethnic identity. The ethnic specifications ($ k=e $), on the other hand, only use self-reported ethnic affiliation to assign treatments, regardless of individual locations. More specifically, for all members of group e in country c for survey s, $ \mathrm{Cash\ Crops}_{ecs} $ is the value of historical cash crop production per km² in the ethnic polygon of e, and $ \mathrm{Publications}_{ecs} $ is the number of publications in the language of e, again normalized by population.Footnote 30
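The two ethnic-specification treatments amount to simple ratios, sketched here in Python. The function names and all input figures are hypothetical. The per-point value of 289,270 USD (1957) is taken from the description of the cash crop data, and scaling publications per thousand speakers is an assumption that does not affect standardized estimates.

```python
VALUE_PER_POINT_USD = 289_270  # 1957 USD represented by each production point on the map

def cash_crop_value_per_km2(n_points, area_km2):
    """Historical cash crop export value per square kilometer of an ethnic polygon."""
    return n_points * VALUE_PER_POINT_USD / area_km2

def publications_per_thousand(n_publications, speakers):
    """Publications normalized by the historical number of speakers (per 1,000)."""
    return 1000 * n_publications / speakers

# Hypothetical polygon: 10 production points on 5,000 square kilometers,
# with 25 listed publications and 500,000 estimated speakers.
cash_treatment = cash_crop_value_per_km2(10, 5_000)
pub_treatment = publications_per_thousand(25, 500_000)
```

In the ethnic specifications these values are attached to every respondent who reports belonging to the group, wherever that respondent lives.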

Fixed effects $ \eta $ are either at the country-round level (cs) for geographic specifications or at the survey-location level ($ \ell $) for ethnic specifications. The main motivation for these two specifications is to separate location-specific from culturally transmitted groupwide effects. Thus, the geographic specifications investigate whether respondents living in areas historically exposed to cash crop production and/or missionary publishing report more salient ethnic identities. The ethnic specifications examine whether members of historically exposed ethnic groups report salient identities, even when compared with respondents from other groups in the same location. Where not absorbed by location fixed effects, geographic and historical controls are the same as those stated in the previous section and always include an estimate of logged historical population from HYDE (Klein Goldewijk, Beusen, and Janssen 2010). In all Afrobarometer analyses, we also include individual-level controls for gender, age, education levels, and indicators of standards of living.

Table 1 reports the results of our geographic specifications. A one-standard-deviation increase in the value of cash crop production around location $ \mathrm{\ell} $ increases respondents’ ethnic identification by around 1.1% of a standard deviation (approximately 0.4 percentage points or 3% of the outcome mean). Similarly, a one-standard-deviation increase in publications per capita increases ethnic identification by around 3.7% of a standard deviation (approximately 1.3 percentage points or 10% of the outcome mean). This effect is robust to intensive-margin comparisons (column 6). The effects are driven by ethnic stayers—individuals who reside in one of the Ethnologue polygons matched to their self-reported ethnic group e. Column 5 shows that restricting the analysis to ethnic leavers (those who reside outside of their ethnic group’s homeland) results in a null effect of print technologies and a significant negative effect of cash crops.
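The conversion from standardized coefficients to percentage points can be reproduced approximately with a short Python sketch. The outcome mean of 0.13 is an assumption back-calculated from the reported figures (it is not stated in the text), and the results are approximate because the published coefficients are rounded.

```python
import math

ASSUMED_OUTCOME_MEAN = 0.13  # assumption: implied by the reported effect sizes, not given directly

def beta_to_points(beta_std, outcome_mean=ASSUMED_OUTCOME_MEAN):
    """Convert a standardized effect on a binary outcome to (percentage points, % of mean)."""
    sd = math.sqrt(outcome_mean * (1 - outcome_mean))   # SD of a binary variable
    effect = beta_std * sd                               # change in the outcome's raw scale
    return 100 * effect, 100 * effect / outcome_mean

cash_pp, cash_pct = beta_to_points(0.011)   # 1.1% of a standard deviation
pubs_pp, pubs_pct = beta_to_points(0.037)   # 3.7% of a standard deviation
```

Under this assumed mean, the two calls give values close to the reported 0.4 and 1.3 percentage points (roughly 3% and 10% of the outcome mean).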

Table 1. Geographical Persistence in Ethnic Identity

Note: The table reports standardized OLS estimates (beta coefficients). Standard errors are reported in parentheses and clustered at the location level. The dependent variable is a binary variable flagging whether respondents declare stronger ethnic than national identities. In column 4, we instrument cash crop production with agricultural suitability to cash crop production using the spatial 2SLS approach described in the text. Column 5 restricts the sample to ethnic leavers. Column 6 restricts the sample to locations with at least one historical publication. *p < 0.10, ** p < 0.05, *** p < 0.01.

The results of the ethnic specifications are reported in Table 2. Among individuals within the same survey location, ethnic salience is significantly higher among the ethnic groups with a history of publishing. A one-standard-deviation increase in publications per (estimated) thousand people increases respondents’ ethnic identification by around 1.0% of a standard deviation (approximately 0.3 percentage points, or 2.4% of the mean outcome, see columns 2 and 3). In contrast, historical cash crop production now has no significant effect.Footnote 31

Table 2. Cultural Persistence in Ethnic Identity

Note: The table reports standardized OLS estimates (beta coefficients). Standard errors are reported in parentheses and clustered at the location level. The dependent variable is a binary variable equal to one if respondents declare stronger ethnic than national identities. Column 4 restricts the sample to ethnic leavers. Column 5 restricts the sample to ethnic leavers from groups with at least one historical publication. *p < 0.10, ** p < 0.05, *** p < 0.01.

Whereas cash crops increased ethnic salience only among stayers, publishing significantly elevates ethnic identities among movers (column 4). This cultural mover effect is robust to intensive margin-only comparisons (column 5). This suggests a culturally transmitted effect of print technologies—the formation of an “imagined community”—which persists even among migrants (or their descendants).

Overall, we find that ethnic groups with higher levels of historical cash crop production and publishing are more likely to be politically relevant in the postindependence period and that individuals from these groups report more politically salient ethnic identities. The individual-level ethnic salience results suggest we are capturing two different channels of politicization: one tied to place and the other stemming from cultural transmission. That these correlate, respectively, with localized cash crop production and vernacular publishing increases our confidence that these historical processes were at least part of the causal chain shaping ethnic politicization in Africa.

Analysis II: Ethnic Boundary-Making

We now turn to analyzing ethnic boundary-making operationalized through interethnic marriage. To measure ethnic exogamy, we take advantage of the couple recodes of the DHS household surveys, which capture self-reported ethnic identities of married couples. The empirical specifications are equivalent to the Afrobarometer-based geographic and cultural persistence models above, but now the unit of analysis is interviewed couple i residing in location $ \ell $ in country c with spouses identifying with ethnic group(s) $ e_f $ and $ e_m $.

Knowing the appropriate match of practically all raw ethnic categories in DHS on the Ethnologue language tree allows us to analyze interethnic marriages at different levels of ethnolinguistic differentiation.Footnote 32 Ethnologue has 13 levels of language differentiation d in our sample. Differentiation d = 1 distinguishes broad language families, and as d increases, more closely related ethnolinguistic categories are separated. We therefore define 13 binary outcome variables $ \mathrm{Sal}_{ie\ell}^{d} $, indicating whether the two spouses in respondent couple i self-report belonging to different ethnic groups at level of differentiation d.

Two examples from Nigeria illustrate the operationalization of our interethnic marriage outcomes. A marriage between a female respondent identifying as Yoruba and a male Hausa respondent is coded as exogamous on all levels of the language tree. The Yoruba language belongs to the Niger-Congo language family, whereas Hausa is an Afro-Asiatic language. These language families are already separate on the first level, and therefore Yoruba and Hausa do not share any nodes on the language tree. In contrast, a Yoruba–Igala couple is coded as endogamous on levels 1–6 and as exogamous thereafter. The Yoruba and Igala languages share the first six nodes of the language tree but then branch out in different directions.Footnote 33
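The coding of the 13 exogamy outcomes can be sketched in Python as a comparison of language-tree paths. The node labels below are placeholders rather than actual Ethnologue classifications. Only the two facts stated in the text are encoded: Yoruba and Hausa diverge at the first level, and Yoruba and Igala share their first six nodes. Treating a language whose path is shorter than d as keeping its final node is an additional assumption.

```python
N_LEVELS = 13

def exogamy_by_level(path_a, path_b, n_levels=N_LEVELS):
    """Return {d: 1 if the spouses' groups differ at differentiation level d, else 0}."""
    def node_at(path, d):
        # Assumption: a path shorter than d is compared in full (its last node carries over).
        return tuple(path[:d]) if d <= len(path) else tuple(path)
    return {d: int(node_at(path_a, d) != node_at(path_b, d))
            for d in range(1, n_levels + 1)}

# Placeholder paths from the root of the language tree down to each language.
shared = ["Niger-Congo", "node2", "node3", "node4", "node5", "node6"]
yoruba = shared + ["branch-y", "Yoruba"]
igala = shared + ["branch-i", "Igala"]
hausa = ["Afro-Asiatic", "node-h", "Hausa"]

yoruba_hausa = exogamy_by_level(yoruba, hausa)   # exogamous at every level
yoruba_igala = exogamy_by_level(yoruba, igala)   # endogamous on levels 1-6, exogamous after
```

Each couple thus contributes one binary outcome per level, so the same marriage can count as endogamous in a coarse classification and exogamous in a fine one.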

If cash crop agriculture sparked a process of more exclusionary identities, we would expect lower interethnic marriage rates at even the furthest branches of the language tree. A Yoruba respondent from a cash crop region would be similarly less likely to be married to a Hausa as to an Igala speaker. If print technologies led to salient but porous ethnic boundaries, we would expect members of these groups (e.g., Yoruba) to be less likely to choose a spouse from a linguistically distant group (e.g., Hausa) but still open to intermarrying with linguistically related ethnic others (e.g., Igala). We test these hypotheses for both the geographic and ethnic definitions of our treatment, as defined above.

Geographic Persistence

Figure 3 presents coefficient estimates from 13 models based on geographically assigned treatment variables. All 13 exogamy outcomes and both treatment variables are standardized to mean 0 and SD 1 to facilitate comparing coefficient sizes across Ethnologue levels and treatments. The cash crop coefficients in Figure 3 are consistently negative and significant across all linguistic levels of differentiation. Interethnic marriages are between 0.015 and 0.025 standard deviations less likely in locations with one-standard-deviation higher levels of late colonial cash crop production. While these effect sizes may appear small in standard-deviation terms, their coefficients are, again, similar in magnitude to contemporary modernization proxies such as education and formal employment.Footnote 34 The coefficients on the publication variable are negative, significant, and somewhat larger in absolute size on levels 1–8 of the Ethnologue language tree. From level 9 onward, publication coefficients drop substantially and become statistically indistinguishable from zero. This pattern supports our theoretical conjecture that African-language printing heightened the salience of ethnic identities but, compared with cash crop agriculture, led to more porous boundaries and more assimilation among linguistically close ethnic categories. We show in the Appendix (Figure A6) that, similar to the Afrobarometer analysis above, these geographic effects are driven by ethnic stayers.

Figure 3. Geographic Persistence: Cash Crops, Publications, and Ethnic Marriages

Note: The figure reports standardized OLS estimates from 13 regressions with country-round fixed effects. Standard errors are clustered at the survey location level. Each triangle represents the coefficient of geographically assigned cash crops and publications treatments, as described in the text. Bars represent 95% confidence intervals.

Cultural Persistence

Figure 4 summarizes results from models that assign treatment variables by husbands’ ethnic identities and include location fixed effects.Footnote 35 The left-hand panel reports findings from analyses of the entire sample of couples for which both spouses’ ethnic identity was successfully matched to the Ethnologue language tree, whereas the right-hand panel restricts the sample to ethnic movers only and thus compares marital choices by husbands outside of their ancestral homeland. These within-location models yield results that are substantively similar to those from the geographic-persistence analysis above. Effect sizes and the level difference between historical cash crop production and African language publishing appear, if anything, to be more pronounced.

Figure 4. Cultural Persistence: Cash Crops, Publications, and Ethnic Marriages

Note: Each triangle represents the standardized OLS estimates (beta coefficient) of ethnic-level cash crop and print technology treatments, as described in the text. The left panel is based on analyses of the whole sample, and the right panel reports results from models run on the subsample of ethnic movers only. Bars represent 95% confidence intervals.

Robustness and Mechanisms

The empirical results in previous sections suggest that (i) historical cash crop production and the uptake of print technologies increased groups’ mobilizational capabilities and political relevance in the postindependence period; (ii) these historical forces also have had persistent effects on individual ethnic salience but through different channels: cash crop effects appear tied to land and sites of historical cultivation, whereas publishing effects stem from cultural transmission among members of the ethnolinguistic group; and (iii) we observe differential effects on interethnic marriage with linguistically proximate out-groups. Note that in contrast to the Afrobarometer models, we find cultural persistence (ethnic mover) effects of cash crops on ethnic marriages, suggesting perhaps that political ethnicity is easier to change than deep-rooted cultural norms about appropriate marital choices.Footnote 36

In the remainder of this section, we summarize findings from our prespecified analyses to account for potential endogeneity before presenting additional specifications that address a series of potential alternative explanations that might account for the observed empirical patterns.

Addressing Endogeneity

Across most analyses, we address threats that the effects of historical cash crop production and vernacular language publishing are endogenous to underlying geographic factors or ethnic groups’ precolonial characteristics. The effects of cash crops on group-level politicization (Figure 2) and interethnic marriages (Appendix Figure A4) are robust to instrumenting cash crop production with indicators of suitability in a spatial-2SLS setup.Footnote 37 To account for potential selection of missionary and publishing activities into certain areas or groups, we show the results are robust to restricting the analysis to Ethnologue groups with a Christian mission (Figure 2) and publishing at the intensive margin (column 6 in Table 1; column 5 in Table 2; Figures A4 and A5). To address potential geographic confounders of publishing, the results presented in Table 2 and Figure 4 include location fixed effects. This increases our confidence that geographic confounders do not explain away exogamy patterns or cultural persistence in ethnic identity.

Alternative Explanations

Group Size

If larger groups were more likely to cultivate cash crops or have vernacular publications, our results may pick up their size-based advantages in coalition formation (Bates 1983; Posner 2005). We account for this issue in several ways. First, the publications treatment in the survey analyses is normalized by the number of language speakers as estimated by Rowling and Wilson (1923). Second, we use the HYDE population rasters (Klein Goldewijk et al. 2017) to control for precolonial population per ethnic polygon across all three analysis sections (see above). As HYDE only imperfectly captures group-level population, Appendix Table B7 and Figure B8 add precolonial political centralization as a proxy for precolonial group size and political cohesion (Murdock 1967). Results remain generally robust to accounting for group size, although coefficients get significantly smaller in the group-level political relevance models with the HYDE control.Footnote 38

Colonizer Effects

We also show that the effects of cash crop agriculture and publishing on ethnic politicization and marriage patterns are not mere artifacts of British indirect rule (Ali et al. 2019). The results are reported in section III.4 of our Supplementary Information. We do observe that former French colonies have either zero or dampened publication effects, perhaps a consequence of France’s more hegemonic cultural and linguistic policies in its colonies (Albaugh 2014; Cogneau and Moradi 2014). These heterogeneous effects offer additional suggestive evidence of the importance of vernacular language standardization and its propagation through schools and churches as a key mechanism driving ethnic politicization.


We run causal mediation models (Acharya, Blackwell, and Sen 2016) to gauge the mechanisms through which our historical treatments affect contemporary ethnic salience and exogamy. First, we observe that accounting for modernization proxies such as urbanization, education, and wealth does not explain our findings and, if anything, makes them stronger (Figures B12[a], B12[d], and B13). Second, and in line with Cagé and Rueda (2016), political engagement and public sphere variables from Afrobarometer explain up to 17% of the publications effect. Finally, historical group-level advantages in secondary and higher education account for relatively large shares of the publishing effect on interethnic marriages (15–26% in geographic models, 16–43% in ethnic specifications; see Figure B14). These results, while only suggestive, point to the roles of an early intelligentsia in constructing ethnic identities and of continued political engagement in maintaining them.
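To make the notion of a mediator "explaining" a share of an effect concrete, the following sketch computes such a share on simulated data. It assumes a linear model with a single mediator and no intermediate confounders, and it uses a simple demediation step (subtracting the mediator's estimated contribution from the outcome before re-estimating the treatment effect). Whether the reported shares were computed in exactly this way is not stated in this section, so this should be read as an illustration of the general idea rather than a replication; all variable names and coefficients are hypothetical.

```python
import random

def mean(v):
    return sum(v) / len(v)

def cov(a, b):
    """Sample covariance of two equal-length lists."""
    ma, mb = mean(a), mean(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)

def slope(y, x):
    """Slope from a bivariate OLS regression of y on x."""
    return cov(x, y) / cov(x, x)

def two_regressor_ols(y, t, m):
    """Coefficients on t and m from regressing y on both (with intercept)."""
    vtt, vmm, vtm = cov(t, t), cov(m, m), cov(t, m)
    cty, cmy = cov(t, y), cov(m, y)
    det = vtt * vmm - vtm ** 2
    b_t = (cty * vmm - cmy * vtm) / det
    b_m = (cmy * vtt - cty * vtm) / det
    return b_t, b_m

def share_mediated(y, t, m):
    total = slope(y, t)                       # total effect of treatment
    _, b_m = two_regressor_ols(y, t, m)       # mediator's effect on outcome
    y_demediated = [yi - b_m * mi for yi, mi in zip(y, m)]
    direct = slope(y_demediated, t)           # effect net of the mediator
    return total, direct, 1 - direct / total

# Simulated data (assumption): direct effect 0.3, indirect 0.5 * 0.4 = 0.2.
random.seed(1)
n = 20000
publications = [random.gauss(0, 1) for _ in range(n)]
engagement = [0.5 * t + random.gauss(0, 1) for t in publications]
salience = [0.3 * t + 0.4 * m + random.gauss(0, 1)
            for t, m in zip(publications, engagement)]

total, direct, share = share_mediated(salience, publications, engagement)
print(f"total: {total:.3f}  direct: {direct:.3f}  share mediated: {share:.2f}")
```

With these simulated coefficients, the total effect is 0.5 by construction, of which 0.2 runs through the mediator, so the share should come out near 40%.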

Resource Types

We expected cash crop agriculture to matter due to local ethnic competition for economic benefits and ethnic elites’ and communities’ strategic boundary-making. This mechanism is unlikely to play out under European-owned plantation or settler agriculture, nor is it likely in mining regions where there were limited benefits for indigenous farmers or where the colonial state or concession companies regulated access. Consistent with this, we show in Supplementary Information III.5 that our results are mainly driven by smallholder crops predominantly cultivated by African farmers. The effects of historical plantation agriculture and mining are weaker or even point in the opposite direction.

Diversity and Religion

One concern about the interethnic marriage results is whether they merely reflect differences in local ethnic diversity. In Supplementary Information III.6, we account for or interact our treatments with local-level ethnic fractionalization scores. The cash crop effects are larger in ethnically diverse locations, strengthening our confidence that ethnic competition rather than local-level ethnic homogeneity explains lower exogamy levels.
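For readers unfamiliar with fractionalization scores, the sketch below computes the standard Herfindahl-based index, one minus the sum of squared group shares, for a single location. It assumes that the local-level scores referred to above follow this standard definition, which the section itself does not spell out; the group labels and counts are invented for illustration.

```python
from collections import Counter

def fractionalization(group_labels):
    """Probability that two randomly drawn respondents (with replacement)
    belong to different groups: 1 minus the sum of squared group shares."""
    counts = Counter(group_labels)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("need at least one respondent")
    return 1.0 - sum((c / total) ** 2 for c in counts.values())

# Hypothetical survey locations (assumption): labels stand in for the
# ethnic groups of respondents sampled at each location.
diverse_location = ["A"] * 50 + ["B"] * 30 + ["C"] * 20
homogeneous_location = ["A"] * 100

print(round(fractionalization(diverse_location), 2))      # 0.62
print(round(fractionalization(homogeneous_location), 2))  # 0.0
```

A score of 0 indicates a fully homogeneous location, and the score approaches 1 as the population splits into many small groups. Interacting a treatment with this score, as the paragraph above describes, asks whether the treatment effect differs between more and less diverse places.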

Another possibility is that the publishing measure is merely picking up the spread of Christianity, which may explain politicization or marital choices. To rule this out, we control for Christian population share in the group-level models, rerun all exogamy models with directed religious couple fixed effects, and use religious denomination dummies in mediation models. Results are nearly identical to those from our baseline analyses (Supplementary Information III.7).


Our analysis shows that Africa’s contemporary ethnic landscape was at least partially shaped by the persistent effects of the cash crop and printing revolutions that spread from the nineteenth century onward. In line with our hypotheses, geographic variation in cash crop agriculture and the uneven diffusion of print technologies differentially increased groups’ mobilizational potential and their capabilities to compete for state power after independence. Our analysis of individual-level identity salience suggests that these two forces affected ethnicity through different channels—with cash crop effects on individual identity salience tied to historic agricultural zones and publishing effects transmitted culturally among language speakers even beyond their ethnic homeland. Beyond self-reported identity salience, we find that these socioeconomic transformations resulted in different patterns of interethnic marriage. Publishing contributed to the construction of more porous boundaries than cash crop agriculture, leading to comparatively higher rates of intermarriage with linguistically related out-group members. This points to important differences in boundary policing among politicized groups based on their historical exposure to commercial agriculture and print technologies.

In shedding light on these endogenous processes, we highlight key underlying factors that may confound analyses of contemporary ethnic politics, such as contestation over land and cross-cutting languages.Footnote 39 These dynamics require greater attention among scholars of ethnic politics and conflict, especially in light of more recent waves of internal migration, climate change, and rising land pressures.Footnote 40 How these changes affect ethnic boundaries, not least between pastoral and agricultural groups, is an important question for future research.

Our findings also have important implications for understanding the effects of colonialism on ethnicity. Much existing scholarship emphasizes the top-down effects of colonial social engineering and indirect rule on ethnic politicization.Footnote 41 In contrast, our analysis demonstrates the importance of broader social and economic forces, which preceded colonialism and were key drivers of it. Further, our findings suggest that colonialism did not uniformly mold or “fix” ethnic boundaries. Instead, identity (re)construction arose as much from the strategic actions of African farmers, landowners, and elites, as well as those of missionaries, culture brokers, and ordinary people, responding to opportunities and constraints brought about by economic and technological change in the nineteenth and twentieth centuries.

Supplementary Materials

To view supplementary material for this article, please visit


Research documentation and data that support the findings of this study are openly available at the American Political Science Review Dataverse:


For excellent research assistance at various stages of this project, we thank, from ETH-Zurich: Paola Galano Toro, Vanessa Kellerhals, Benjamin Füglister, and Lukas Dick; from Witten/Herdecke University: Carlos Mairoce and Julian Seitlinger; from Oxford University: Sidhart Bhushan and Hedda Roberts; and from William & Mary: Layla Abi-Falah, Aaron Spitler, and Henry Young. Earlier versions of our research design were presented at WGAPE, LSE, March 2018; Annual Meeting of the American Political Science Association, Boston, September 2018; Princeton University Comparative Politics Colloquium, October 2018; and Annual Meeting of the AEHN, Bologna, October 2018. First results and early paper versions were presented at the Annual Meeting of the American Political Science Association, Washington DC, September 2019; Political Economy Workshop, Zurich, October 2019; Annual Meeting of the Swiss Political Science Association, February 2020; virtual APSA 2020; Zurich Workshop in Empirical Political Economy, September 2020; ASREC annual conference, November 2020; Leiden Workshop in Political Science, October 2020; and Workshop on the Politics of Favoritism, ZEW Mannheim, February 2021. We are grateful to participants for their suggestions and feedback. Special thanks to three anonymous reviewers, Matthew Gichohi, Corinne Bara, Joan Ricart-Huguet, Leila Demarest, Dan Posner, Tim Phillips, Carl Müller-Crepon, and Lars-Erik Cederman for comments and discussions.


This research was generously supported by the Swiss National Science Foundation (grant #P0EZP1_159076, Pengl) and the US National Science Foundation (award #1628498, Roessler).


The authors declare no ethical issues or conflicts of interest in this research.


The authors affirm that this research did not involve human subjects.


1 Across the ethnic politics literature, many studies model competition for power and resources among a given subset of politically relevant ethnic groups.

2 We preregistered our research design with Evidence in Governance and Politics (EGAP) on April 24, 2019, after some promising preliminary analyses but before merging our publications and cash crop data with Ethnologue language categories and group polygons and, via Ethnologue, to EPR, PREG, Afrobarometer, and DHS. We had already seen geographic correlations between cash crop locations and Afrobarometer/DHS outcomes as well as between proximity to missionary printing presses and Afrobarometer identity salience. However, we were in no position to analyze group-level outcomes, the actual publications treatment, or the ethnic specifications described below, as all of these require ethnic matches. Our preanalysis plan can be found here:

3 In our sample of 35 sub-Saharan African countries, there exist 2,303 Ethnologue languages, whereas the Ethnic Power Relations dataset counts 140 groups relevant in the first year that countries in the region enter the dataset and another 158 groups relevant through its last year (Vogt et al. 2015).

4 See Caselli and Coleman (2013) for a formalization of the link between social closure and ethnicity.

5 For combining Ethnologue groups with information from EPR, PREG, DHS, and Afrobarometer, we use the publicly available ethnic links coded by Müller-Crepon, Pengl, and Bormann (2020).

6 However, see Sasaki (2017), who focuses on the influence of the printing press in Europe on language standardization.

7 Salience and closure capture different but potentially reinforcing identity dimensions. The former reflects the importance of an identity to oneself or others, that is, the likelihood that a given identity and not others will be invoked across different situations (Stryker 1980). In contrast, closure reflects the degree to which a group is accessible to outside members (Wimmer 2013). Following from Stryker (1980), we might expect closed groups, in which entry and exit pose higher costs, to correlate with more salient identities.

9 We analyze this in Supplementary Information I. We find that politically relevant groups do tend to have less porous boundaries as measured by interethnic marriage, though these correlations are not particularly strong.

10 Koter (2016), in contrast, argues that hierarchical institutions enabled postindependence rulers to target groups with patronage-based policies rather than ethnic appeals, potentially dampening ethnic salience. Also, Dunning and Harrison (2010) find that the historical legacy of cousinage from the Mali Empire has helped to weaken the political effects of ethnicity.

11 For an illuminating ethnography on the interactive effects of Christian missionaries and cash crops on ethnic association formation, anticolonial resistance, and political mobilization, see Spear (1997). In the case of the Meru, ethnic mobilization contributed to the development of a broader nationalist movement (Okoth 2006).

12 See also Chimhundu 1992; Posner 2003.

13 Colonial partition itself, however, may have contributed to stronger national identities among groups divided between two sovereign states (Miles and Rochefort 1991; Robinson 2014).

14 On the genealogy of autochthony and its roots in colonialism, see Ceuppens and Geschiere (2005); Marshall-Fratani (2006).

15 Cash crops would prove a much more important source of colonial exports than minerals. By 1957, across the 35 countries in our dataset, cash crops accounted for 59.4% of total exports (by value) compared with only 22% for minerals (Hance, Kotschar, and Peterec 1961).

16 For important previous work on the sociopolitical implications of the transition to commercial agriculture, see Colson (1971), Berry (1993), and Boone (2014; 2017).

17 This process of ethnic boundary hardening was driven from below, as chiefs found themselves under growing pressure from their constituents not to give away too much land to outsiders (Boni 2006), but also supported from above, as colonial governments promoted neocustomary land tenure regimes (Boone 2017; Mamdani 1996, 104–5).

18 See also Bates (1974, 465–7) on how local elites in cash crop areas used ethnic criteria to restrict access to modernization benefits.

19 In French colonies, educational instruction was mandated to be in French. Albaugh (2014) estimates that by 1950 only around 58% of the population in French colonies had their languages transcribed compared with 76–81% in British, Belgian, and Portuguese colonies.

20 For other case studies, see Ranger (1989), Chimhundu (1992), and Strommer (2015).

21 We also preregistered a set of ancillary hypotheses and analyses on homogeneous political preferences, interethnic trust, and ethnic conflict that we report in Supplementary Information IV.

22 Data and replication scripts for all analyses in this article and the Online Appendix are openly available in the APSR Dataverse (Pengl, Roessler, and Rueda 2021). The replication folder also contains extended Supplementary Information with additional data descriptions and results.

23 AMAR and EPR rely on some indication of social or political relevance as a basis for inclusion. Murdock (1959; 1967) has a much smaller number of groups than Ethnologue. See Laitin (2000b, 142) on the advantages of using “language as a proxy for ethnicity.”

24 It excludes data on the Union of South Africa (including present-day Namibia), Madagascar, and other island colonies.

25 Cocoa, coffee, cotton, groundnuts, oil palm, stimulants, other food crops, other industrial crops, other oils.

26 This approach was inspired by Chaney’s (2016) work on the Middle East.

27 In robustness checks, we also use more restrictive versions and only code Ethnologue groups with exclusive one-to-one matches as 1 and all other groups as 0. Supplementary Information I.3 provides an intuitive example of this distinction and Appendix Figure A1 shows results. We also use AMAR (All Minorities at Risk) to measure groups’ social relevance capturing group consciousness and shared norms and cultural features short of national-level political mobilization (Birnir et al. 2015, 112). See Appendix Figure A2 for results.

28 Figure I.9 in our Supplementary Information shows that groups with cash crops or publications systematically differ from those without on a number of baseline covariates.

29 The joint significance of spatially lagged baseline controls in the second first stage (predicting the spatially lagged dependent variable) is high, and the respective F statistics remain well above conventional thresholds.

30 See Supplementary Information (Figure I.8) for a concrete example.

31 These standardized effects are comparable to or larger than those of other controls. For instance, $\hat{\beta}_2$ in column 3 of Table 2 is six times larger than the absolute effect of a 10% increase in precolonial ethnic population and roughly 20% smaller than the effect of formal primary schooling. Appendix Table A1 compares our coefficients with individual-level proxies used by Robinson (2014). The effect of a one-standard-deviation change in our treatments of interest represents, across specifications, 20–56% of the effect of contemporary individual characteristics such as gender or formal employment. Also note that town fixed effects in Table 2 can aggravate attenuation bias (Aydemir and Borjas 2011).

32 See Cervellati, Chiovelli, and Esposito (2018) for a similar approach.

33 Figures I.6 and I.7 in the SI schematically illustrate these examples.

34 See Appendix Tables A2–A5.

35 See Appendix Figure A7 for results when assigning treatments based on wives’ ethnicities.

36 This seems consistent with recent findings that local ethnic minorities face incentives to vote for the local majority candidate rather than one of their own (Ichino and Nathan 2013).

37 Afrobarometer results disappear when using this approach. One explanation is the lower spatial coverage of Afrobarometer, which has fewer than half as many unique survey locations as DHS. In addition, Afrobarometer was geocoded ex post, and location coordinates are probably less accurate.

38 Figures B10 and B11 further control for ethnic polygon area. Supplementary Information III.1.2 more closely investigates the relationship between group size and publications.

39 On these points, see, respectively, Boone (2014) and Laitin (2000a).

40 See Klaus (2020) for a recent example.

41 See, for example, Mamdani (1996) and Posner (2005) and, more recently, Ali et al. (2019), McNamee (2019), and Müller-Crepon (2020).


Figure 1. Publications and Cash Crop Locations

Note: Language homelands are mapped according to Ethnologue. Grayed regions are Ethnologue polygons for which there is no record of publications. Colors indicate the number of publications listed in Rowling and Wilson (1923). Each blue cross locates 289,270 USD (1957) of cash crop export value for either cocoa, coffee, cotton, groundnuts, or palm oil. Solid black country borders describe our sample.

Figure 2. Cash Crops, Print Technologies, and Political Relevance

Note: These figures summarize the results of eight regression models. The two binary outcomes indicate whether an Ethnologue group is matched to a group or coalition listed as politically relevant in PREG (left-hand panel) or EPR (right-hand panel). Lines 1 and 2 report effects using binary treatments, indicating whether Ethnologue groups were exposed to cash crop production and/or print technologies. In lines 3 and 4, cash crops are instrumented with the mean agroclimatic suitability for the five most important export crops by using the spatial 2SLS approach described in the text. In lines 5 and 6, the sample is restricted to Ethnologue polygons that experienced missionary activity. Lines 7 and 8 control for logged historical population per Ethnologue polygon based on HYDE raster data.

Table 1. Geographical Persistence in Ethnic Identity

Table 2. Cultural Persistence in Ethnic Identity

Figure 3. Geographic Persistence: Cash Crops, Publications, and Ethnic Marriages

Note: The figure reports standardized OLS estimates from 13 regressions with country-round fixed effects. Standard errors are clustered at the survey location level. Each triangle represents the coefficient of geographically assigned cash crops and publications treatments, as described in the text. Bars represent 95% confidence intervals.

Figure 4. Cultural Persistence: Cash Crops, Publications, and Ethnic Marriages

Note: Each triangle represents the standardized OLS estimates (beta coefficient) of ethnic-level cash crop and print technology treatments, as described in the text. The left panel is based on analyses of the whole sample, and the right panel reports results from models run on the subsample of ethnic movers only. Bars represent 95% confidence intervals.

