9.1 Introduction
In Part IV of the book we will explore the last two transitions on our list of six: the emergence of cities and states. We will not linger on the question of whether these developments were important. Everyone knows that cities and states are foundational components of the modern world. It is challenging to imagine what our lives would be like without them. But as with the four previous transitions we have studied, cities and states had origins, and we will attempt to explain how they arose.
Early states involved government of, by, and for elites. The elites achieved better standards of living than commoners in many ways, and the resulting levels of inequality varied from one society to another. The mechanisms of elite control included taxation; the ownership of agricultural land or other natural resources; the monopolization of trade, mining, or manufacturing; the plunder of neighboring societies; corvée labor; and slavery. But early states also offered benefits to commoners such as public order, suppression of local warfare, infrastructure, insurance against natural disasters, and the like. Archaeologists and economists share an interest in the question of whether early states made commoners better off or worse off on balance. We will return to this issue later.
Our concern is only with “pristine” or “primary” states. We ignore situations where states developed in response to influences from other pre-existing states or through cycles of collapse and reintegration. Pristine states emerged independently, or nearly so, in several parts of the world. The classic list includes Mesopotamia, Egypt, the Indus River valley, China, Mesoamerica, and South America. However, other probable or definite examples can be found in Africa, Southeast Asia, Europe, and North America.
Chapter 9 surveys the archaeological evidence about the emergence of city-states in southern Mesopotamia and outlines a number of causal hypotheses about this process. Chapter 10 builds a formal economic model of our own hypothesis, which we argue can explain several prominent features of the regional narrative from Chapter 9. In Chapter 11 we compare the Mesopotamian example with those of Egypt, the Indus Valley, China, Mesoamerica, and South America. We will argue that these six cases exhibit recurring patterns, including the universality of stratification in early state societies (Feinman and Marcus, 1998), and frequent connections between urbanization and state formation. We will also argue that a small set of causal mechanisms can account for these patterns. The rest of this section frames the key issues and provides conceptual background.
The first item on the agenda is to define a state. Scholars have proposed many definitions, and we will postpone a detailed discussion until Section 11.3. But in brief, we define a state to be an organized elite with significant powers of taxation in a well-defined geographic area. When further nuance is needed, we add the requirement that taxation is institutionalized in the sense that (a) it is carried out regularly by a permanent set of specialists, and (b) the resulting resources are transferred to a central elite group for allocation. The rationale for our definition is that if a state is to function as a collective actor, the elite must possess a centralized fiscal system so that it can collect and expend resources in a coherent way. We are not alone in seeing taxation as integral to the state. Most economic historians stress the importance of fiscal centralization (see the literature review in Acemoglu and Robinson, 2016).
Although we require a sufficient degree of internal organization within the elite, we do not require a king or some other supreme leader. A council of elite landowners, merchants, or priests might well collect taxes and allocate state resources according to a coherent preference ordering. At the other end of the spectrum, we would not view a set of fiscally autonomous landlords as constituting a state, even if they jointly monopolized the use of force, in the absence of centralized institutions for resource allocation.
Economists often take it for granted that states supply public goods such as law and order. While this is plausible, we do not require it here. In fact one implication of our formal model in Chapter 10 is that a state can arise because it enhances private elite consumption. Thus we view taxation, but not public goods supply, as a defining feature of the state. Of course, in practice most pristine states did make investments in temples, monuments, irrigation canals, and other public goods of interest to the elite.
Chapter 6 introduced a technology of exclusion to explain early inequality, and Chapters 7–8 introduced a technology of combat to explain early warfare over land. Part IV requires a technology of confiscation in order to explain the emergence of early states. This term refers to techniques through which an elite extracts resources from the agents who live within its territory.
Our definition of taxation includes all cases where an elite collectively confiscates resources (food, labor, urban outputs, raw materials, and so on). This includes the corvée labor often used for construction projects in early states. It also includes slavery, which appears to have been widespread in some early states. The labor time of slaves could be appropriated by elite individuals or households rather than the state itself, although state power might be deployed if a slave tried to escape.
Tax collection could involve the threat or use of physical force, but it need not. Depending on the circumstances, recalcitrant taxpayers might face the threat of losing social ties with other elite agents, or the threat that deities will withdraw their protection. As Liverani (2006, 25) observes, elites often prefer ideological forms of coercion where producers are convinced to give up substantial resources for religious reasons. See also Steinkeller (2015a, 15–17) on the probable role of deities in mobilizing the resources used to supply public goods in ancient Mesopotamia.
For concreteness, suppose the technology of confiscation involves the taxation of food output produced from agriculture. Many factors will influence the ease or difficulty of tax collection. These include:
(a) Whether tax collectors must travel long distances;
(b) Whether output is easily transported and stored;
(c) Whether output is easily hidden from tax collectors;
(d) Whether output is easily estimated by tax collectors;
(e) Whether potential taxpayers can run away.
The reader can no doubt think of additional factors that might be relevant.
These considerations are not limited to agriculture. Sedentary foragers are a poor target for taxation in several ways. Such societies often have patchy resources, and good sites may be far apart. Foragers also tend to harvest many different species of plants and animals, with some being difficult to store and others being easily hidden. Labor effort is usually unobservable, unlike the case of collective work on cultivated fields. Finally, the food income of foragers is difficult to estimate due to the diversity of harvesting activities and the many random variables that can influence outputs from hunting and gathering.
Agricultural societies are a mixed bag. Crops harvested in large quantities on a seasonal cycle at a limited number of sites could present an attractive target, but not all crops have these features. Some crops are readily stored while others are not (see Scott, 2017, on the differences between cereals and tubers). In some cases it is easy to estimate the output of farmers and tax them accordingly, but in other cases it is not. For example, it was easy to infer output from the height of the Nile flood in Egypt, but harder to make similar inferences in southern Mesopotamia (see our discussion of Egypt in Section 11.6 and our discussion of Mayshar et al., 2017, in Section 9.8).
We will also consider taxation of the output from urban manufacturing, which we think was a particularly tempting target. Such manufacturing occurred in compact areas; outputs of textiles, metal items, and pottery were easily counted, stored, and transported; and it would have been difficult to conceal workshops or warehouses. These factors tend to strengthen the linkage between early urbanization and early state formation.
We will not include all of the issues discussed here in our formal model (Chapter 10). However, for the reasons given above, we presume that technologies of confiscation were largely ineffective for sedentary foragers, had variable effectiveness for agriculture, and tended to be highly effective for urban manufacturing. Of course, environmental and technological details are important, and it matters what is being taxed. But the key point is that for an early state to have robust fiscal foundations, productive agriculture by itself is not enough. An effective technology of confiscation is also required (on this point, see also Mayshar et al., 2022).
We draw an important distinction between land rent and taxation. Simply put, land rents arise from a technology of exclusion, while taxes arise from a technology of confiscation. These are different things. In Chapter 6 we showed that a local elite can use an exclusion technology to prevent outsiders from accessing land at a particular site. This does not involve taxation but rather the closure of geographic areas that were once available to all. After such property rights have been created, markets for land or labor enable elite agents to collect land rents from commoners, leading to class stratification.
Recall from Chapter 6 that when we use the term “wage” we simply mean a food payment from an elite agent to a commoner in exchange for labor time. In this context, “land rent” means the food output retained by the elite agent after paying the “wage.” This is the labor market way of telling the story. We can also have the commoner pay direct “land rent” (again in the form of food) to the elite agent in exchange for the right to work on a parcel of land, where the commoner keeps the output left over after paying the land rent. This is the land market way of telling the story.
In all of our formal models these two descriptions are equivalent, but we find it more convenient to use the description in terms of wages and labor markets, with land rent treated as a residual for the elite. We also note that in our analysis it does not matter whether wages or rents are fixed payments or shares in output. For urban manufacturing, what we call wages are sometimes called “rations.”
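The equivalence of the two descriptions can be seen in a one-parcel sketch. The symbols used here ($Y$ for the food output of a parcel, $w$ for the wage, $r$ for land rent, and $s$ for an output share) are illustrative assumptions rather than the notation of the formal models in other chapters, and output is taken to be non-random.

```latex
\begin{align*}
\text{Labor market:} \quad & \text{commoner receives } w, \quad \text{elite retains } r = Y - w. \\
\text{Land market:} \quad & \text{commoner pays } r, \quad \text{commoner retains } Y - r = w. \\
\text{Share contract:} \quad & \text{commoner receives } sY \text{ with } s = w/Y, \quad \text{elite retains } (1 - s)Y = r.
\end{align*}
```

In each case the commoner ends up with $w$ and the elite agent with $Y - w$, so only the labeling of the payment differs.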
Archaeologists and anthropologists often apply the label “tributary economy” to a system involving either direct taxes or land rents generated through market mechanisms, without specifying which of the two is being discussed (however, Earle, 1987, 294–295, has a cogent discussion of land rent in the context of chiefdoms). Writers in these fields do distinguish staple finance (e.g., transfers of grain or animal herds from commoners to the elite, usually with central storage facilities) from wealth finance (e.g., the payment of taxes in specialized craft goods or exotic materials such as metals). Both are consistent with centralized fiscal authority in a state. We will assume staple finance for simplicity, but nothing essential hinges on this assumption.
As we showed in Chapter 6, pre-state stratification can be based entirely on land rent without any reliance on taxation. In contrast, many economists concerned with state origins focus on technologies of confiscation where bandits, predators, gangs, or the state appropriate some portion of the output generated by producers. The primary issue in this literature is whether unorganized bandits will be replaced by an organized state that taxes the producers (see Section 11.4). This is an interesting question, but it is distinct from the question of whether an elite collects land rent.
To grasp this distinction, consider the following possibilities. In an agricultural setting it might be easy for a local elite to identify and punish trespassers (technology of exclusion), but difficult to seize output produced by commoners because food is easy to hide (technology of confiscation). This leads to land rent without taxation. On the other hand, it might be easy for roving bandits to seize food stores (confiscation), but hard for them to maintain permanent ownership rights over fixed land parcels (exclusion). This leads to taxation without land rent.
Our formal model in Chapter 10 begins with an agricultural economy where the elite controls land and collects rent but does not engage in taxation (perhaps because it is easy to hide food or long distances make the monitoring of agricultural activities costly). We will assume that urban manufacturing is readily taxed once it exists (perhaps because workshops are highly visible and cities are physically compact). In this setting we show that the elite confronts a tradeoff involving the land rent from agriculture versus the tax revenue from manufacturing. The challenge will be to identify conditions under which the balance tips toward the latter, in which case the elite establishes a city-state.
Most writers believe agriculture was a necessary condition for the existence of a state, and economists have examined the length of the lag between agriculture and state formation. Borcan et al. (2021) estimate the average lag to have been about 3,400 years for six pristine states (the same six we will discuss in Chapter 11). Their method of dating the transition to agriculture is to use the “approximate year in which a substantial population in some part of a country relied mainly on cultivated crops and domesticated animals for their subsistence” (10). They use two definitions of the state. Paramount chiefdoms that include multiple individually substantial chiefdoms count as proto-states, but specialized administration and soldiery are required for a full state. Much of their research deals with non-pristine state formation, which we will not address here.
Two common arguments about the necessity of agriculture for a state appear in the literature: (a) agriculture is needed in order to obtain food surpluses that can support state personnel (or urban populations); and (b) agriculture is needed in order to provide a sufficient tax base to finance state activities. Each argument has difficulties. A problem with (a) is that food surpluses are often defined in a purely technological way, as output relative to biological subsistence requirements. However, we argued in Section 6.2 that commoner food intake is not biologically fixed in stratified societies. Another issue is that agriculture is not the only possible source of food (see our discussion of wetlands in Mesopotamia in later sections). A problem with (b) is that this argument ignores issues with the technology of confiscation. As we suggested above, not all crops are equally useful for taxation purposes, and in some cases it may be considerably easier to tax urban manufacturing than rural agriculture.
Nevertheless, we believe that agriculture did play a crucial indirect role in pristine state formation. Our causal argument runs as follows: rising agricultural productivity led to rising population densities through Malthusian dynamics, and the rising population densities led to the formation of elite and commoner classes through the endogeneity of property rights (Chapter 6). Once stratification existed, several paths could lead to state formation, but this development was not inevitable, and different mechanisms appear to have been relevant in different regions. We postpone the details until Chapter 11 but we think pristine states had three main proximate causes: (i) a long run decline in commoner wages resulting from population growth and endogenous property rights that eventually led to urban manufacturing; (ii) chronic elite warfare that led to territorial expansion and defensive agglomeration; and (iii) climate changes that triggered migration toward refuge areas (often river valleys), again leading to declining wages and urban manufacturing.
Although sedentary foragers could and often did develop stratified societies, they lacked the technological dynamism resulting from domestication and learning by doing in agriculture, and therefore tended not to have persistently rising population densities over time. Moreover, as we discussed earlier, sedentary foraging is generally not an attractive target for tax collectors and is therefore an unpromising fiscal basis for a state.
In our view, the archaeological literature on pristine states supports the following qualified generalizations.
(a) Agriculture was a necessary condition for pristine states, but in some cases food obtained through foraging continued to be a significant component of the diet.
(b) Stratification was a necessary condition for pristine states, but a stratified core was often surrounded by an open-access periphery.
Note that agriculture may have been a necessary condition not purely for caloric reasons but also because of its role in promoting stratification and providing a secure fiscal basis for the state. This chapter and Chapter 11 provide regional examples that illustrate these points.
Borcan et al. (2021) find statistical support for the idea that both agriculture and stratification played important roles in prehistoric political development. They use data from the Atlas of Cultural Evolution, which provides information on prehistoric societies defined by archaeological traditions. These are populations that had similar subsistence practices, technology, and sociopolitical organization; were spatially contiguous over a large area; and endured for a long time. Political integration, the dependent variable, was defined to be 1 for societies with integration above the community level and 0 otherwise (note that this includes simple and complex chiefdoms in addition to states). The authors regressed this variable on agriculture, stratification, population density, and urbanization in the year 4000 BP using observations on 74 societies.
Agriculture and stratification were both highly significant predictors of political integration, with a negative interaction between the two suggesting some substitutability between them. The other variables were less important. Although these cross-sectional findings do not shed any direct light on the causal dynamics central to our approach, and the definition of political integration is much broader than our definition of a state, we regard the results as consistent with our theoretical framework.
We begin our study of pristine states with southern Mesopotamia. Section 9.2 gives preliminary information. Sections 9.3–9.7 provide a chronological narrative for this region, including the pre-’Ubaid period (9.4), the ’Ubaid period (9.5), the Uruk period (9.6), and the post-Uruk period (9.7). In Section 9.8 we discuss several hypotheses about Mesopotamian city-state formation proposed by archaeologists and economists. Section 9.9 describes our own hypothesis, which will be formalized in Chapter 10. Section 9.10 is a brief conclusion. We defer acknowledgments of advice and financial support for this part of the book until the end of Chapter 11.
9.2 Southern Mesopotamia
The city-states that emerged in southern Mesopotamia (also known as Sumer) by at least 5200 BP (calendar years before present) offer a sensible starting point for several reasons. They are often described as the first states; they have been subjected to intense archaeological study, especially the city-state of Uruk; and they had a major historical impact. Even so, debates on their origins continue, and economic reasoning may shed light on the issues at stake. We will start with a brief sketch of this transition. Empirical nuances and controversies are ignored in this section but will be addressed in detail later.
Southern Mesopotamia corresponds to modern southern Iraq. We structure the narrative using four time periods: the pre-‘Ubaid, the ‘Ubaid, the Uruk, and the post-Uruk. Authorities in the field assign different dates to these periods, and the dates sometimes vary widely. The beginning of the ‘Ubaid period is dated in the range 8000–7500 BP, the beginning of the Uruk period is dated in the range 6300–5900 BP, and the beginning of the post-Uruk period is dated in the range 5100–4900 BP. We arbitrarily select the earliest date in each case in order to impose consistency throughout the chapter. Therefore in our scheme the ‘Ubaid period runs from 8000 to 6300 BP, the Uruk period runs from 6300 to 5100 BP, and the post-Uruk period runs from 5100 to 4350 BP. These conventions about time periods have no substantive implications for our arguments.
We report dates in calendar years before present (BP) throughout. Some writers use dates BCE (before current era) and we have added 2,000 years in these cases. Thus if the original source reports that an event occurred in the 4th millennium BCE (4000–3000 BCE), we translate this as the 6th millennium BP (6000–5000 before present).
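The conversion rule just described can be sketched in a few lines of Python. The function and constant names are hypothetical, chosen only for illustration; the 2,000-year offset is the convention stated above, not an exact calendrical conversion.

```python
# A minimal sketch of the chapter's dating convention: BP = BCE + 2,000.
# All names here are illustrative assumptions, not taken from the text.

OFFSET_YEARS = 2000  # stated convention for translating BCE dates into BP


def bce_to_bp(year_bce: int) -> int:
    """Translate a year BCE into calendar years before present (BP)."""
    return year_bce + OFFSET_YEARS


def millennium_bce_to_bp(millennium_bce: int) -> int:
    """Translate an ordinal millennium BCE into the matching millennium BP."""
    return millennium_bce + OFFSET_YEARS // 1000


def millennium_bounds(millennium: int) -> tuple[int, int]:
    """Return (start, end) years of the n-th millennium, counting backward."""
    return (millennium * 1000, (millennium - 1) * 1000)


if __name__ == "__main__":
    start_bce, end_bce = millennium_bounds(4)        # 4th millennium BCE
    print(bce_to_bp(start_bce), bce_to_bp(end_bce))  # the same span in BP
    print(millennium_bce_to_bp(4))                   # ordinal millennium in BP
```

Applied to the example in the text, the 4th millennium BCE (4000–3000 BCE) maps to 6000–5000 BP, that is, the 6th millennium BP.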
While it is generally agreed that city-states existed in southern Mesopotamia by 5200 BP, there is ambiguity over the date at which the first city-states appeared. In the Early Uruk period there was increasing immigration into Sumerian cities, but firm data on the extent of urbanization is unavailable. In roughly the same time period excavations reveal the existence of Tell Brak in northern Mesopotamia. Brak was a substantial city of 55 hectares, much larger than neighboring cities in the north but smaller than the southern city of Uruk (Ur et al., 2007). There is controversy over which location, Uruk or Brak, was the first to witness large-scale urbanization (see Pournelle and Algaze, 2014, 9–13). There is no controversy, however, in terms of relative historical importance. Tell Brak was largely abandoned in the 6th millennium BP. But the Sumerian cities, of which Uruk was by far the largest, continued to expand well into the 5th millennium BP, and as noted above are frequently described as the first states.
In the ‘Ubaid period (8000–6300 BP), southern Mesopotamia developed a few towns of about 1000–2000 people and many smaller villages. This has sometimes been described as a system of “simple chiefdoms,” with the larger settlements having temples and elite residences. Food was obtained from hunting, gathering, fishing, and farming.
During the Uruk period (6300–5100 BP), large-scale urbanization occurred. This culminated in a population of 20,000–50,000 for the city of Uruk and populations in the tens of thousands for nearby cities. Many authorities believe the urbanization process was triggered by adverse climatic changes, especially reduced rainfall. This encouraged migration toward southern Mesopotamia, where wetlands and river irrigation made food production less vulnerable to increasing aridity. These migratory responses could have caused spikes in local population at a few especially attractive sites in the south.
Urbanization was associated with the development of manufacturing, especially textiles, but also pottery, metallurgy, and stonework. Such activities had previously been carried out in smaller villages, but became more specialized and larger-scale in the new centers like Uruk. The evidence suggests that urban manufacturing had scale economies external to the individual workshop but internal to each city. There is an archaeological consensus that Uruk satisfied standard criteria for a city-state by 5200 BP if not earlier, including a large urban population, monumental architecture, specialized bureaucracy, a multitiered settlement hierarchy, colonial expansion, and in the final phase of the Uruk period, specialized commodity production and writing.
Next we provide a brief preview of our hypothesis about the economic sources of this transition (for a detailed verbal description see Section 9.9, and for a formal model see Chapter 10). We believe regional climate change in the form of increasing aridity reduced agricultural productivity in outlying areas including northern Mesopotamia. The result was a drop in the demand for commoner labor and a region-wide decrease in living standards for commoners. Simultaneously commoners migrated toward the south, where food production was less dependent on rainfall. Local elites in the south could pay lower wages in the form of rations to commoners, which made urban manufacturing attractive. We believe that manufacturing activities were more easily taxed than agriculture and that this provided the early fiscal foundations for the city-states of southern Mesopotamia.
Many other authors have proposed alternative hypotheses. We will review a number of them later in the chapter. But any credible hypothesis has to explain three main points. First, it needs to say why the population of southern Mesopotamia grew to a point at which large cities were possible. Second, it needs to say why such cities actually developed and were accompanied by large-scale manufacturing. Third, it needs to say why these were city-states; that is, why cities led to states in a region that previously had neither. Our theory accounts for each of these points.
9.3 Archaeological Background
Our chronology involves a synthesis of facts, opinions, and interpretations from a number of archaeologists. Most statements reflect a wide consensus among the experts, to the extent that an outsider can judge. Where some issue is uncertain or controversial, we flag that issue for the reader. The archaeological literature includes debates not only about facts, but also about the causal mechanisms responsible for the emergence of the earliest city-states. We review these arguments in Section 9.8 along with related causal arguments from economists.
Problems associated with the lack of data or the interpretation of data are discussed by Algaze (2008, 2013), Nissen and Heine (2009), Brisch (2013), Pournelle (2013), Wilkinson (2013), and Pournelle and Algaze (2014), among others. In particular, Brisch (2013, 112) remarks that radiocarbon dating for ancient Mesopotamia is imprecise and difficult to link with historical events. In the absence of carbon dating, chronological analysis is based on ceramics. Ur (2013, 135) gives an example where a site with Uruk-period pottery was dated to the fourth millennium BCE (the sixth millennium BP). Also see Ur (2013, 132–136) for a summary of issues and methods pertaining to Mesopotamian settlement surveys.
Evidence for the period before 5200 BP comes entirely from archaeological findings (including seals, which pre-date writing). After this date written texts and tablets become available, but the earliest of these are difficult to interpret and offer no historical information.
9.4 The Pre-’Ubaid Period: Before 8000 BP
The source for information on this period is Kennett and Kennett (2006) except where indicated, but also see Brooks (2006, 2013) for a further discussion of geography and climate. At around 15,000 BP the global sea level was 100 m below today’s level. Marine transgression into the Persian Gulf, associated with deglaciation, was just starting. The modern delta was absent. The Ur-Schatt river, carrying large volumes of sediments from the Tigris, Euphrates, and Karun rivers, flowed in an incised canyon. Climate was characterized by extreme aridity. Rapid sea level rise in the Persian Gulf occurred from 12,000 to 11,500 BP and 9500 to 8500 BP. Seasonal rainfall rose across the Arabian Peninsula and southern Mesopotamia between 10,000 and 6000 BP.
A transition from mobile hunting and gathering to sedentary agriculture occurred in southwest Asia between 12,000 and 8000 BP (see Chapters 4 and 5). Early villages in northern Mesopotamia had a few hundred people. Rainfall levels were often sufficient to support crop cultivation in northern and eastern Mesopotamia, but not sufficient in the south and west (Nissen and Heine, 2009, 2).
In the south, rivers created an immense alluvial plain that was nearly flat, such that small differences in elevation could result in large differences in the flow of water. The marine transgression reached the present-day northern Gulf area between 9000 and 8000 BP. From 9000 BP onward the rate of sea level increase slowed, and sea level may have reached a Holocene maximum at 6000 BP. At this point the Persian Gulf extended 200–250 km north of the current coastline, reaching the hinterlands of the settlements that would later become Ur and Uruk (Ur, 2013, 132). The area between the southern alluvial plain and the Persian Gulf was filled with marshes and lagoons. Estuaries extended north, following the Tigris–Euphrates–Karun River canyons, and formed a variety of productive wetland habitats throughout this region.
9.5 The ‘Ubaid Period: 8000–6300 BP
The first evidence of permanent human settlement in southern Mesopotamia goes back to 8000 BP, the beginning of the ‘Ubaid period. Early settlements were located on slight rises (turtlebacks) within aquatic habitats resulting from seasonal rains, or in the wetlands at the head of the Persian Gulf (Kennett and Kennett, 2006). These turtlebacks were the remains of river terraces cut during the Pleistocene. Over time the areas around the terraces were filled by river sediment, and other early settlements are probably buried beneath silt deposits (Oates, 2014, 1480).
Algaze (2001, 2008, ch. 4) emphasizes several natural advantages of the region. Geography led to low transport costs through the use of rivers, the seacoast, and canals in flat terrain (the role of canals is controversial, as will be discussed below). Diverse local ecosystems included productive agricultural land due to river floods, nearby land suitable for grazing animals, and riverine, coastal, or marine areas where aquatic resources such as fish were available. Algaze (2008, 42) believes that these factors, along with favorable climate conditions including high winter rainfall and the possibility of some summer rain, supported a high population density at the mid-Holocene climatic optimum that prevailed during the ‘Ubaid.
Throughout the ‘Ubaid, human populations increased and nucleated into small towns and villages. By the middle ‘Ubaid some communities in southern Mesopotamia had grown larger than their neighbors, including Eridu, ’Ouelli, Ur, and Tell al-’Ubaid. However, these communities were still small, averaging about one hectare in area with estimated populations seldom exceeding 1000 people. They were also widely dispersed and lacked the linear distribution typical of settlements dependent on irrigation canals (Adams, 1981, 59).
In southern Mesopotamia and the Susiana Plain the number of known settlements increased, owing both to greater archaeological visibility (larger sites are visible above the alluvium) and to a larger regional population. A significant population expansion appears to have occurred in southern Mesopotamia during the ‘Ubaid to Uruk transition around 6300 BP, but some areas such as the northern alluvium saw population decline as settlements became more concentrated in the south (Adams, 1981, 60–61).
According to Pollock (Reference Pollock1999, 78–84), the primary food sources were wheat, barley, lentil, flax, caprids, cattle, pigs, fish, and hunted game. Pollock believes that farmers paid land rent in goods or labor and made offerings to temples, but that production was not directly managed by elites (Reference Pollock1999, 79–80). The distribution of wealth items within ‘Ubaid sites is suggestive of economic inequality (Stein, Reference Stein, Stein and Rothman1994). There was mild inequality in housing, but there were no dramatic differences in grave goods and depictions of rulers were rare (Stein, Reference Stein, Stein and Rothman1994; Pollock, Reference Pollock1999, 86–92, 176–177, 199–204). Stone (Reference Stone, Kohler and Smith2018) reports a “moderate” Gini coefficient for grave goods at Eridu, with all later Gini coefficients being higher.
A hierarchical distribution of settlements becomes visible in the middle ‘Ubaid (Wright, Reference Wright and Adams1981; Kennett and Kennett, Reference Kennett and Kennett2006). Stein (Reference Stein, Stein and Rothman1994) describes these as simple chiefdoms with a two-tier settlement hierarchy, where larger sites had temples. Stone tools, pottery, and cloth were made locally but metal ores and specialized stones such as obsidian would have been imported. Stein (Reference Stein, Stein and Rothman1994) asserts that good agricultural land with easy irrigation was a scarce commodity to which access could be denied, and that this was the basis for chiefly power (but see the debate about irrigation below).
Temples played a prominent role in the ‘Ubaid period. They were rectangular, oriented to the cardinal directions, and contained altars and offering tables. The temples were located at focal settlements throughout the region, a pattern that remained stable for 1500 years (Stein, Reference Stein, Stein and Rothman1994). The evidence from Eridu indicates that such temples tended to become larger and more elaborate over the course of the ‘Ubaid. Stein (Reference Stein, Stein and Rothman1994) argues that the ‘Ubaid temple complex was used to legitimize differential access to resources including water, land, and labor, as agricultural production was intensified to sustain larger populations in the region. He also argues that temples mobilized food surpluses for storage and insurance purposes.
In Pollock’s (Reference Pollock1999) view, the ‘Ubaid had some trade, probably no slavery, and probably no warfare except perhaps near the very end of the ‘Ubaid. Stein (Reference Stein, Stein and Rothman1994) sees no evidence for trade in exotic materials, and he agrees that warfare and political instability were absent. He also sees little evidence for elite control of long distance exchange or centralized control of high-status craft production. Kennett and Kennett (Reference Kennett and Kennett2006) likewise agree that there is little evidence of warfare in southern Mesopotamia until the end of the ‘Ubaid. Settlements were unfortified and ‘Ubaid seals do not depict warfare. In contrast, there is evidence of warfare in northern Mesopotamia during this period.
During the ‘Ubaid southern Mesopotamia was linked to the surrounding region by social networks, and similarities in artifacts indicate widespread exchange of goods and knowledge. For example, ‘Ubaid period ceramics from the south appear in settlements in northern Mesopotamia and along the eastern coast of the Arabian Peninsula (Kennett and Kennett, Reference Kennett and Kennett2006, 81; Oates, Reference Gibson, McMahon and Crawford2014, 1481). Such observations can be interpreted as evidence for cultural replication, migration, or political integration. Nissen (Reference Nissen and Yoffee2015, 124) employs data from surface surveys to argue that small ‘Ubaid settlements were separated by large enough distances that “they were not part of a regulated or central system.” We follow what we take to be the majority opinion in the literature that there were regional markets for utilitarian goods, but without any political unification beyond the two-tier system of a town and its associated villages or hamlets.
There is a lively debate about the role of irrigation in the ‘Ubaid and subsequent Uruk periods. Many authors take it for granted that southern Mesopotamia already had irrigation canals in the ‘Ubaid. For example, Kennett and Kennett (Reference Kennett and Kennett2006, 81) argue that reduced rainfall and a high water table stimulated early experimentation with irrigated crops, and Stein (Reference Stein, Stein and Rothman1994) argues that the power of ‘Ubaid chiefs was based on control over irrigated land. Oates (Reference Gibson, McMahon and Crawford2014, 1483–1484) distinguishes “the more productive irrigation-based economy of the south versus the rain-fed economy of the north.”
However, Pournelle (Reference Pournelle and Stone2007, Reference Pournelle and Crawford2013) maintains that the wetlands provided abundant food even without irrigation or reliance on cereal crops. Her sources include high quality satellite images, aerial surveys, hydrological history, ancient sediments and watercourses, climate history, and archaeological remains. She concludes that wetlands offered easy access to fish, dates, birds, and turtles; fodder for domesticated goats, sheep, and pigs; reeds for housing and watercraft; edible plants such as club rush, cattails, and bulrush; and small mammals as well as migrating gazelles. From this standpoint early cities can be imagined as “islands imbedded in a marshy plain, situated on the borders and in the heart of vast deltaic marshlands” (Pournelle, Reference Pournelle and Crawford2013, 28).
Wilkinson (Reference Wilkinson and Crawford2013, 35–40) accepts that wetlands were an important source of food in the ‘Ubaid, but uses evidence from charred plant remains to argue that irrigated cereal cultivation also took place. He concludes that with the available data, it is not possible to determine which food source, cereal cultivation or wetland foraging, was more important to the ‘Ubaid economy.
The relationship between irrigation and cereal cultivation follows from the pattern of seasonal flooding (information in this paragraph is from Wilkinson, Reference Wilkinson and Crawford2013, 38–42). In southern Mesopotamia cultivated cereal crops were dependent on the use of river water due to the lack of rainfall. However, the timing of seasonal flooding in the Tigris and Euphrates caused problems. Flooding peaked in April and May as a result of winter rainfall and spring snowmelt in the mountains of Anatolia, Iran, and Iraq. Crops were endangered just as they were about to be harvested. Conversely, river levels were at their lowest in the fall when seedlings were most in need of water. Irrigation canals could be used to store excess water during the spring floods and then discharge it in the fall. Another problem was that hydromorphic soils, common in the flood basins and lower levee slopes of southern Mesopotamia, could become saline when the water table was high, leading to crop losses. This resulted in the extensive use of salt resistant barley as opposed to wheat.
9.6 The Uruk Period: 6300–5100 BP
This section surveys information on urban population, specialized bureaucracy, settlement hierarchy, colonial expansion, and excavations of monumental architecture. All are normally treated in the literature as indicators of state-level organization.
In the Uruk period, the city of Uruk, the largest Sumerian city by far, achieved a population of 20,000–50,000, and several other substantial cities of smaller size came into existence (Yoffee, Reference Yoffee2005, 43; Algaze, Reference Algaze2008, 103). Uruk covered an area of 250 ha with an urban core of 100 ha (Nissen, Reference Nissen1988; Kennett and Kennett, Reference Kennett and Kennett2006; Nissen and Heine, Reference Nissen, Heine and Nissen2009). For the early Uruk period, estimates of the share of the population living in relatively large towns (10 or more hectares) are in the range of 50–80% (Pollock, Reference Pollock and Rothman2001; Algaze, Reference Algaze and Crawford2013). By the end of the Uruk period, the city of Uruk plus its hinterland and offshoots had a total population estimated at 80,000–90,000 people (Algaze, Reference Algaze and Crawford2013, 74).
Recently these population estimates have become controversial. While there is general agreement about the geographical size of Uruk around 5200 BP, the population estimate depends on density, which is more difficult to measure. Steinkeller (Reference Steinkeller2018, 47), in comments on Algaze (Reference Algaze2018), argues that large unoccupied spaces existed within Uruk, reducing the total population estimate considerably. In private correspondence (2020), Algaze agrees with Steinkeller that Uruk was characterized by large open areas, which were recently documented in magnetometry surveys (Fassbinder et al., Reference Fassbinder, Hahn, Scheiblecker and van Ess2018). However, citing the excavation work of Postgate (Reference Postgate1994) and his collaborators at Abu Salabikh and the remote sensing work of Stone (Reference Stone, Heffron, Stone and Worthington2017) at a variety of southern Mesopotamian urban sites, Algaze continues to believe that an estimate of 100 persons per occupied hectare is a reasonable, if somewhat minimal, conversion factor for demographic calculations of ancient southern Mesopotamian settlements. Such an estimate would yield a population around 25,000 for the city of Uruk in the Late Uruk period.
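The arithmetic behind such an estimate can be made explicit. The following is a minimal sketch, not a reconstruction of Algaze's own calculation: it multiplies the 250 ha site area cited above by the conversion factor of 100 persons per occupied hectare. The function name and the `occupied_fraction` parameter are hypothetical, introduced only to show how unoccupied space of the kind Steinkeller describes would lower the total.

```python
def estimate_population(site_area_ha, persons_per_occupied_ha=100.0,
                        occupied_fraction=1.0):
    """Population estimate = site area x share occupied x density.

    occupied_fraction is a hypothetical illustrative parameter; the text
    gives no figure for the share of Uruk that was actually built up.
    """
    if site_area_ha < 0 or persons_per_occupied_ha < 0:
        raise ValueError("area and density must be non-negative")
    if not 0.0 <= occupied_fraction <= 1.0:
        raise ValueError("occupied_fraction must lie between 0 and 1")
    return site_area_ha * occupied_fraction * persons_per_occupied_ha


# 250 ha at 100 persons per occupied hectare, treating the whole area as occupied.
uruk_area_ha = 250.0
baseline = estimate_population(uruk_area_ha)
print(baseline)  # 25000.0

# Purely illustrative: if only half of the area had been occupied.
half_occupied = estimate_population(uruk_area_ha, occupied_fraction=0.5)
print(half_occupied)  # 12500.0
```

The sketch makes clear that disagreement about the population of Uruk reduces to disagreement about two inputs, occupied area and density, since the overall site area is not in dispute.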
The presence of organizational complexity is inferred from lists of specialized administrative positions inscribed on clay tablets (Brisch, Reference Brisch and Crawford2013). However, Algaze (Reference Algaze and Crawford2013, 72) points out that because most of the tablets were found in secondary contexts and date to the end of the Uruk period after city-state formation had already occurred, they “shed no light whatsoever on the beginnings of the urban revolution.”
Another commonly used indicator of administrative specialization in state-level polities is a four-tier settlement pattern with cities, towns, villages, and hamlets (Adams, Reference Adams1981). ‘Ubaid settlement distributions were bi-modal, indicating a chiefdom level of organization (Nissen, Reference Nissen1988; Stein, Reference Stein, Stein and Rothman1994). According to Algaze (Reference Algaze2008), survey evidence suggests that multimodal patterns typical of states were in place throughout the entire Uruk period.
In categorizing Uruk as a state, Algaze (Reference Algaze and Crawford2013, 82–85) gives particular weight to colonial expansion in the Middle and Late Uruk periods. He argues that such “massive, quickly erected and well planned enclaves … could only have been built by state institutions capable of levying, commanding, and deploying substantial resources and labor” (85).
For most of the nineteenth and twentieth centuries, the excavations at Uruk were exclusively concerned with uncovering and analyzing monumental buildings (for details see Nissen, Reference Nissen1988, 96–100; Nissen and Heine, Reference Nissen, Heine and Nissen2009, 22–26; Algaze, Reference Algaze and Crawford2013, 75–79). As a result, the excavations revealed little else. Ur (Reference Balkansky, Renfrew and Bahn2014, 14) notes that “not a single non-monumental Uruk structure has been excavated on the southern Mesopotamian plain.” Nissen and Heine (Reference Nissen, Heine and Nissen2009, 21) refer to the interval from the end of the ‘Ubaid to 5200 BP as being “among the worst documented times of the history of the ancient Near East.”
Algaze (Reference Algaze and Crawford2013, 71–72) summarizes the excavations as follows. First, excavations focused only on elite quarters of the city, not commoner residences or industrial areas. Except for a few deep soundings, most of the materials and buildings excavated date to the very end of the Uruk period after the city-state had already formed. There has been almost no systematic exploration of second-tier regional centers, villages, or hamlets. Nevertheless, estimates of the immense mobilization of labor and materials required to construct the buildings thus far exposed at the core of the site imply state-level control over labor as well as resource procurement and allocation.
There is general agreement that Uruk began as two settlements facing each other on opposite sides of a major channel of the ancient Euphrates. At an uncertain date the river shifted to a new course around the city, leaving the area between these two sites available for further settlement (Oates, Reference Gibson, McMahon and Crawford2014, 1484; Nissen, Reference Nissen and Yoffee2015, 116–117). The fact that Uruk was built with mud bricks confirms that it was on dry ground, but it was in close proximity to lowlands with a marshy environment. River water may have been used for irrigation on dry land located between the city and the surrounding wetlands.
There is likewise broad agreement that high levels of migration into southern Mesopotamia occurred in the Uruk period (Kennett and Kennett, Reference Kennett and Kennett2006; Algaze, Reference Algaze2008, Reference Algaze and Crawford2013, Reference Algaze2018; Nissen and Heine, Reference Nissen, Heine and Nissen2009). Nissen (Reference Nissen and Yoffee2015, 118, 126) describes an “enormous population increase” in the interval 6000–5500 BP, with the number of settlements in the hinterland of Uruk rising from 11 to more than 100, many of which were larger than the earlier settlements. Nissen believes this increase occurred within a short period of time, was more than could be accounted for by natural population growth, was associated with in-migration, and was linked to a drier climate that may have created new habitable land. To support the hypothesis of in-migration, Algaze cites Kouchoukos and Wilkinson (Reference Kouchoukos, Wilkinson and Stone2007) for evidence of declining populations in northern Mesopotamia and southwestern Iran that appear roughly contemporaneous with population increase in the alluvium (see also Nissen and Heine, Reference Nissen, Heine and Nissen2009, 40).
Because of the richness of the wetlands, archaeologists also stress the likelihood of high levels of migration to the wetlands prior to and during the Uruk period. Many of these immigrants would have become marsh dwelling herders and foragers. Pournelle and Algaze (Reference Pournelle, Algaze, McMahon and Crawford2014, 13–20) use data sources from the last century to document a deltaic productive system that offers a picture of what rural life in the wetlands might have been like in prehistoric and ancient times. Potential confirming evidence from those times is hidden under deep layers of silt and in the impermanence of reed-constructed dwellings.
Climate change during the Uruk period is a key variable in several explanations of the origins of Uruk as a city-state, including our explanation. The climate phenomenon most relevant to the Uruk case is increasing aridity starting around 6000 BP. Kennett and Kennett (Reference Kennett and Kennett2006) report evidence from the Red Sea area indicating a marked trend toward greater aridity between 7000 and 5000 BP, which ultimately extended to Arabia and sub-Saharan Africa. After about 5500 BP, weakening monsoons led to more aridity over the Arabian Peninsula, with increasing dust transport in the Arabian Sea after 5300 BP. Aeolian deposits in deltaic sediments were most pronounced in southern Mesopotamia between 5000 and 4000 BP, and indicate severe aridity during this interval. Brooks (Reference Brooks2006, Reference Algaze and Crawford2013) links this overall climate trend to urbanization in southern Mesopotamia. He also views increasing aridity in Mesopotamia as part of a global climate transition related to solar radiation and the angle of the earth to the sun. For more on this global perspective, see the literature review in Roland et al. (Reference Roland2015, 209–210).
Direct evidence of the effect of increased aridity on cereal crops in northern Mesopotamia is provided in Riehl et al. (Reference Riehl, Pustovoytov, Weippert, Klett and Hole2014). The authors use carbon-13 measurements from ancient barley seeds sampled from northern Mesopotamia and the Levant to infer drought stress. Their study covers the period from 7500 BP to 2900 BP. The results show that favorable conditions prevailed from 7500 BP to 6000 BP. Then a downward trend began that reached a level of severe drought stress around 5500 BP and maximum drought stress at 5200 BP.
In the wetlands of southern Mesopotamia, Nissen and Heine (Reference Nissen, Heine and Nissen2009, 39) observe a relatively sudden shift to cooler and drier conditions starting about 5500 BP and associate it with less river flow than in previous millennia, fewer disastrous floods, more exposed land, and a dramatic increase in settlements in the hinterland of Uruk.
Crawford (Reference Crawford2004, chs. 8–9; Reference Crawford and Crawford2013) provides a useful summary of manufacturing and trade. Urban manufacturing included textile production, pottery making, metalworking, and stoneworking. Metals and stone were not available locally and had to be imported, along with precious stones and high quality timber for roofing (but also see Algaze, Reference Algaze2018, 36, on pine trees grown in the marshes a millennium later). Exports likely consisted of manufactured goods, primarily woolen textiles of varying quality. Cereals did not play a major role in cross-cultural long-distance trade. Trade flows increased significantly in the Uruk period. Following McCorriston (Reference McCorriston1997), Algaze (Reference Algaze2008, Reference Algaze and Crawford2013, Reference Algaze2018) emphasizes the transition from linen to wool textiles, and sees textiles as the most important of the urban manufacturing industries in terms of economic growth.
Presuming that the manufacture and export of textiles in the 6th millennium BP did not substantially diverge from later historically well-understood patterns dated to the 5th and 4th millennia BP, Algaze (Reference Algaze2008, ch. 5) describes large urban workshops, which apparently adopted an extensive division of labor. Workshops were probably operated by state institutions, competing temples, and wealthy households. The existence of multiple workshops suggests that scale economies may have been partly internal to the production units and partly external. The degree of specialization among towns is unclear.
Precise estimates of the percentage of urban residents in the Uruk period who were employed in manufacturing are not available. Algaze (Reference Algaze2018, 48) acknowledges that most were engaged in subsistence-related activities. But he also points out that “it is not always possible to sort ancient Mesopotamian productive activities into self-contained agrarian versus industrial categories owing to the substantial overlaps that existed between the two categories in terms of labor (seasonal workers), raw materials (e.g., wool, flax, skins, wood, and sesame oil), and energy sources (wood charcoal).”
There is little direct archaeological evidence for any form of taxation before written documents. However, Steinkeller (Reference Steinkeller, Steinkeller and Hudson2015a, 17) believes corvée labor was used in the late Uruk period for harvest work, major building construction, maintenance of irrigation systems, and military service. Such work was of limited duration and owed to the state by all free citizens. Elites could substitute a payment in place of labor (see Steinkeller, Reference Steinkeller, Steinkeller and Hudson2015b, 138–153, for a detailed discussion of corvée). Under our definition, mandatory labor contributions involve confiscatory technology and constitute a form of taxation, as do mandatory payments by individual elite agents.
We also infer the presence of taxation in city-states from the existence of massive public architecture and multilevel administration. However, apart from corvée labor it is difficult to determine which sectors were taxed (agriculture, manufacturing, local trade, or external trade), or how taxes were levied (on income, wealth, sales, profits, imports, or exports). Liverani (Reference Liverani, Bahrani and Van de Mieroop2006, ch. 3) gives examples of taxes on cereals, manufacturing, trade, and raw materials, but the time period of these taxes is unclear, especially with respect to the question of whether they apply to the period of city-state formation before 5200 BP.
Because data from burial sites are not available for the Uruk period, there is no evidence from this source regarding the degree of stratification. However, stratification with respect to housing does become more pronounced (Pollock, Reference Pollock1999, 204–205), and elites apparently had better access to meat than commoners (Pollock, Reference Pollock1999, 112). Based upon satellite imagery, Stone (Reference Stone, Kohler and Smith2018) computes Gini coefficients for housing during the Uruk and subsequent Jemdet Nasr periods (see Section 9.7). Her estimates range from a low of 0.32 to a high of 0.57. According to Stone, housing data “suggest that the initial phase of social complexity in ancient Mesopotamia was associated with significant inequality” (249). Elite residents living close to temples enjoyed more spacious housing than those living in crowded residential areas elsewhere.
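For readers unfamiliar with the Gini coefficient, the following minimal sketch shows one standard way of computing it from a list of non-negative values such as house floor areas. It is not a description of Stone's procedure, which the text above does not specify, and the house areas in the example are hypothetical numbers chosen only for illustration. A value of 0 indicates perfect equality, and values approaching 1 indicate that a few units account for almost all of the total.

```python
def gini(values):
    """Gini coefficient of a collection of non-negative values.

    Uses the rank-based formula on ascending-sorted data x_1 <= ... <= x_n:
        G = 2 * sum(i * x_i) / (n * sum(x_i)) - (n + 1) / n
    """
    xs = sorted(float(v) for v in values)
    n = len(xs)
    if n == 0:
        raise ValueError("at least one value is required")
    if xs[0] < 0:
        raise ValueError("values must be non-negative")
    total = sum(xs)
    if total == 0:
        return 0.0  # everyone holds the same (zero) amount
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n


# Hypothetical house floor areas in square metres (illustrative only, not data
# from Stone or any other source).
house_areas = [30, 50, 70, 100, 250]
print(round(gini(house_areas), 2))  # 0.39

# Equal-sized houses give a coefficient of zero.
print(gini([80, 80, 80, 80]))  # 0.0
```

With small samples the coefficient cannot reach 1: for n units its maximum is (n - 1)/n, which is worth bearing in mind when comparing sites with different numbers of measurable houses.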
Another aspect of inequality in the Uruk period involves the relative political power of elites and commoners. On this, Graeber and Wengrow (Reference Graeber and Wengrow2021, 297–302) point to three observations suggesting that elite rule was at least somewhat constrained: (1) monarchy was absent until the Early Dynastic period; (2) taxes in the form of corvée were paid by both elites and commoners; (3) there may have been popular councils and citizen assemblies (though these are more typically associated with the Early Dynastic period). In our view these observations do not contradict the evidence for increasing inequality in the Uruk and Jemdet Nasr periods, but they do suggest that oppression of the commoners was not absolute.
Slave labor was clearly used in the Uruk period (Englund, Reference Englund2009). This may have involved prisoners of war (although see our remarks on warfare below) or farmers who became unfree when they were unable to pay off their debts. The proportions of free and slave labor in the urban and rural sectors are unknown.
Another indicator of the quality of life in prehistoric Uruk is life expectancy. Algaze (Reference Algaze2018, 26) argues that mortality rates in Uruk would have been high due to poor sanitation and disease. He lists the following candidates for life threatening illnesses affecting dense urban populations in southern Mesopotamia: several types of malaria; hemorrhagic fevers of unknown origin; smallpox, chickenpox, and/or measles; early variants of bubonic plague; and tuberculosis, pneumonia, and/or pertussis. Chronic diseases included typhoid, cholera/dysentery, and hepatitis. He notes that these chronic diseases would have been particularly virulent among the young.
The standard indicators of warfare in prehistory are skeletal remains showing violence, city walls, settlements situated in locations that afforded defensive advantages, and visual depictions. For the Uruk period these indicators are largely lacking. Human skeletons were not analyzed or saved in the excavations. The construction of city walls is generally acknowledged to have occurred later in the Early Dynastic period. Defensive locations are not obvious. In sum, our reading suggests that the extent of warfare within southern Mesopotamia in the Uruk period is largely unknown.
9.7 The Post-Uruk Period: 5100–4350 BP
This section briefly sketches some developments after 5100 BP. Although we are mainly interested in the city-state formation process during the Uruk period, the next 750 years are of interest because written records are more informative, and one can observe the subsequent trajectory followed by city-states like Uruk, Kish, Nippur, Shuruppak, Isin, Lagash, Umma, and Ur, among others. Archaeological evidence supports the view that all of Sumer had a similar culture with minimal local variation during this time. For example, written scripts were similar across the region (Nissen and Heine, Reference Nissen, Heine and Nissen2009, 42).
The Jemdet Nasr period starts around 5100 BP and is largely a continuation of the later Uruk. Wool supplanted linen in textile production, with linen comprising perhaps only 10% of total Sumerian output after 5000 BP (Wright, Reference Wright and Crawford2013, 397). Writing evolved in ways that allowed greater nuance, complexity, and speed. There were improvements in pottery technology that facilitated mass production, a trend starting in the middle Uruk period (Nissen and Heine, Reference Nissen, Heine and Nissen2009, 42–43). The latter authors cite evidence of a city wall around Uruk and the beginning of large-scale canal and irrigation systems (46–48). Pournelle (Reference Pournelle and Stone2007, Reference Algaze and Crawford2013) raises questions about the existence of a city wall at this time and contends that the observed straight alignment of watercourses indicates ancient transport routes rather than irrigation canals. Wilkinson (Reference Wilkinson and Crawford2013, 36) says, “Because the channels in the lower plain have a tendency towards straightness rather than being meandering, it is difficult to distinguish between natural and artificial channels.” In contrast to the Uruk period, burials now become archaeologically visible, with extensive inequality in grave goods (Pollock, Reference Pollock1999, 206).
The Early Dynastic period runs from about 4900 BP to 4350 BP. There were written lists of kings, palaces gradually became architecturally distinct from temples and other large elite residences (Pollock, Reference Pollock1999, 48–51), and inequality of grave goods became more extreme (Pollock, Reference Pollock1999, 213–217). However, Stone (Reference Stone, Kohler and Smith2018) computes lower figures for Gini coefficients on housing in the Early Dynastic relative to previous periods and she notes that kings described themselves as protectors of the weak from the strong. There is some possibility that popular assemblies existed side by side with kings.
The economy was dominated by large self-sufficient households centered around temples, palaces, and wealthy estates, where elite corporate groups owned both farm land and urban workshops (Pollock, Reference Pollock1999, 117–123, 147–148). The elites managed production directly and workers received rations of food and clothing. Some workforces were kin-based but other large workforces involved non-kin. Women and children were the main workers in large-scale cloth manufacturing. Slave labor was common and included prisoners of war. Uruk was a walled city-state.
During the Early Dynastic, the number of watercourses decreased with continued aridity (the information in this paragraph is from Nissen and Heine, Reference Nissen, Heine and Nissen2009, 46–47). Some patches of land lost their access to naturally flowing water. Agricultural areas evolved into a system of “irrigation oases,” each fed by a main canal that in turn was fed by river water. Populations in smaller settlements that were now without water tended to move into larger settlements, and by 4500 BP most of the population lived in cities. In some marginal areas, labor shifted toward herding, a semi-nomadic lifestyle.
Pournelle (Reference Pournelle and Crawford2013, 24) agrees that during the Early Dynastic period, “aggregate site area increased even as the number of settlements fell.” She attributes this to a migration of population from surrounding drying wetlands. Nevertheless, texts show the continuing importance of marsh resources like reeds, fish, fowl, pigs, and trees, as well as marsh-based products such as bitumen, boats, mats, and standardized fish baskets.
Warfare between cities is documented after 4600 BP and was motivated by conflict over food-producing areas. One example is a war between Lagash and Umma over the boundary between their rural hinterlands (Van de Mieroop, Reference Van de Mieroop1997, 33–34; Pollock, Reference Pollock1999, 181–84; Yoffee, Reference Yoffee2005, 57). Kennett and Kennett (Reference Kennett and Kennett2006) and Algaze (Reference Algaze and Crawford2013) propose a causal link from warfare to the size of the urban population, with rural populations falling as people moved into the cities for protection. The first political unification of multiple city-states occurred through a series of conquests by Sargon around 4350 BP. This brought the Early Dynastic period to an end. The city of Uruk “remained occupied or was resettled at an urban scale for almost five thousand years” (Ur, Reference Algaze and Crawford2013, 151).
9.8 Existing Causal Hypotheses
This section reviews existing hypotheses about the development of city-states in southern Mesopotamia. These ideas provide context for the presentation of our theory in Section 9.9 and Chapter 10. We classify hypotheses into groups concerned mainly with (a) food supply; (b) climate and migration; (c) taxation of cereals; (d) manufacturing and trade; (e) warfare; and (f) religion. In each case we sketch the relevant ideas and identify a few problems or questions.
There appears to be a broad consensus that the climate of southern Mesopotamia was becoming more arid in the Uruk period, reflecting trends across the region. There is also a consensus that reduced rainfall outside the southern alluvium led to migration into the south where food availability was much less dependent on precipitation. Beyond this, individual authors begin to diverge in the stories they tell.
Food Supply:
Pournelle (Reference Pournelle and Stone2007, Reference Pournelle and Crawford2013) and Pournelle and Algaze (Reference Pournelle, Algaze, McMahon and Crawford2014) believe that due to abundant food resources in the wetlands, there is less need to conceptualize irrigation as being essential for the early cities in the alluvial lowlands. In their view this continued to be true after the Uruk period and perhaps into the Early Dynastic. Our sense is that Pournelle sees the productivity of wetlands as a permissive condition, allowing a transition to city-states but not compelling it. Pournelle does not explain how the use of wetlands for food was compatible with stratification in the ‘Ubaid, or why urbanization occurred. She also does not explain why urbanization resulted in state organization. In response to Pournelle and Algaze (Reference Pournelle, Algaze, McMahon and Crawford2014), Gibson (Reference Gibson, McMahon and Crawford2014, 191–192) agrees that marshlands were important, but he stops short of concluding that marshes were the “mother” of cities. He writes, “That cities depended on marshes as much as on irrigation is clear from early cuneiform records, but I would question that they were the primary factor.”
Climate and Migration:
Nissen and Heine (2009) argue that drier conditions led to lower river levels and caused people to congregate at places with better water supplies. At first the availability of more dry land may have attracted population into the former wetlands and encouraged urbanization. Fewer disastrous floods may also have attracted larger populations to such centers. But eventually rising aridity required large irrigation systems to support the urban populations. In their view large-scale irrigation was more a result of urbanization than a cause. These authors identify Uruk as a state based upon its political and economic dominance over the surrounding area as well as its structured city government, but do not specify a causal link between urbanization and state institutions.
Kennett and Kennett (2006) agree that climate change triggered a migration into the cities of the south, probably due to declining labor productivity elsewhere relative to southern Mesopotamia. They believe that wetlands were an important food source but that irrigation was also important. They share the Nissen and Heine view that drying wetlands led to greater land area and at the same time led to congregation, urbanization, and taxation based on cereals. The latter point about taxation puts them into alignment with Mayshar et al. and Scott on state formation, as we will discuss below. With regard to urbanization, Kennett and Kennett (2006, 90–91) suggest that growing aridity between the end of the ‘Ubaid and 5000 BP led to more competition for land and water resources, the concentration of the population in centers close to rivers and estuaries, and greater reliance on irrigation. They also suggest that the threat of war motivated migration from hinterlands to urban centers (90).
We make the following observations about Nissen and Heine (NH) and Kennett and Kennett (KK). In order for their stories to line up correctly with the timing of city-state emergence, the marshlands would have to dry out substantially earlier (i.e., in the Uruk period, not the Early Dynastic) than Pournelle and Algaze appear to envisage. Another problem is that neither NH nor KK provide a detailed explanation for the urban agglomeration of the Uruk period. It is one thing to say that increasing aridity and lower river flow caused population to congregate in well-watered areas, but quite another thing to say that this led to cities of 20,000 or more people. Why did foragers or farmers agglomerate in a city? Why not spread out across the landscape to be close to food sources? If some people in the city were not producing food, what were they doing?
Taxation of Cereals:
Mayshar et al. (2017, 630–631) consider ancient states in southern Mesopotamia as one case in a broader analysis of early states. Their general point is that when farming is transparent, for instance due to observable inputs such as available water for irrigation or homogeneity of the land, it is easier for elites to estimate output and thus to generate significant revenues through taxation. Although the focus in their 2017 article is not specifically on cereals, it is easy to imagine seasonal grains as a strong candidate for taxation because these are more transparent than other crops. The problem for tax collection and state formation in southern Mesopotamia, according to Mayshar et al. (2017), is that cereal outputs in that region were not especially transparent due to details of the irrigation process that would have made it hard for external observers (e.g., tax collectors) to estimate the true output of an individual elite estate and compare outputs across estates. While the elite estate managers knew their own outputs, they had an incentive not to reveal output and instead free ride on the financing of state formation.
In Mayshar et al. (2022) the focus is directly on cereal production. They include other aspects of appropriability in addition to transparency and measurement, especially durability and storage. Cereals, unlike other food sources such as tubers, can be stored for relatively long periods of time. Storage makes it easier for elites to appropriate grain output. They support their argument with extensive statistical analysis showing a strong empirical association between complex hierarchies (with states being a prime example) and cereal dependency. Land productivity has no effect after controlling for cereals.
Scott (2017), like Mayshar et al. (2022), argues that extensive cereal production was an essential source of tax revenue for emerging states. He then combines taxation with the climate change argument in Nissen and Heine (2009) to nail down the timing of state formation at Uruk. More specifically, Scott argues (120–122) that increasing aridity in the Uruk period caused a reduction in river levels, forcing the population to congregate in well-watered places and to become more urban. Another effect of increasing aridity was to diminish the productivity of wetland foraging, which reinforced the concentration of the population. The result was greater investment in irrigation canals in order to feed the increasingly urbanized population. Taxes on the newly expanded cereal output allowed city-states to form (128–136).
Allen et al. (2020) see state formation in southern Mesopotamia as a response to a series of river shifts that increased the demand for irrigation canals to replace river irrigation. Their hypothesis is that this collective action problem was solved by a social contract embodied in state formation. Statistical analysis supports the correlation of river shifts with city-state creation.
The hypotheses in this group, like others reviewed earlier, lack a clear reason for the formation of large cities, apart from a general tendency of population to concentrate in well-watered areas. Cereal taxation can potentially explain state formation, but it is unclear why the resulting states would not have been largely rural rather than urban.
Another difficulty for Mayshar et al., Scott, and Allen et al. involves timing. Small-scale cereal irrigation probably existed in the ‘Ubaid period and probably generated land rent for the local elites, but no one argues that taxation of cereals in the ‘Ubaid could have supported a state. Strong evidence for large-scale irrigation allowing high levels of cereal cultivation does not appear until the Jemdet Nasr or Early Dynastic period (Nissen and Heine, 2009; Wilkinson, 2013). This leaves a long gap during the Uruk period, when city-states with monumental architecture and other markers of state organization actually emerged. It is unclear how cereal taxation could have filled this chronological gap.
One possibility is a major improvement in the technology of food production in the centuries before large-scale irrigation becomes archaeologically visible. Liverani (2006, 15–19) and Mayshar et al. (2017, 630) stress technological innovation in agriculture. We will propose instead that the gap was filled by taxation of urban manufacturing. We note in this regard that the statistical analysis from Mayshar et al. (2022) does not control for urban manufacturing. Moreover, our theory provides a pathway to state formation that circumvents the taxation problem identified by Mayshar et al. (2017), namely the lack of transparency in Mesopotamian cereal cultivation. We return to these issues near the end of Section 9.9.
Manufacturing and Trade:
Algaze (2001, 2008, 2013) adopts a quite different theoretical framework from the authors we have discussed so far. He focuses primarily on manufacturing, trade, and urbanization, with little discussion of climate or food production. The central idea for Algaze (2013) is that Smithian growth involving labor specialization raises productivity, and may temporarily outrun Malthusian population growth, leading to rising income per capita. However, population responds positively to productivity over time. Population growth encourages further specialization, yielding a virtuous circle. This central dynamic supports various related trends: increasing imports of raw materials to the south for processing; more exports of finished textiles to pay for imports; a general expansion in economic scale, along with import substitution; and a flow of captive labor to the south, both skilled and unskilled.
To our knowledge, Algaze is alone in the literature in arguing that manufacturing provided a reason for urbanization. But there are several puzzles in Algaze’s account. One involves timing: there is no clear trigger that explains the development of city-states during 5500–5000 BP rather than some other time period.
A second puzzle is that while Algaze emphasizes the abrupt change in settlement patterns at the ‘Ubaid/Uruk boundary, it is unclear why manufacturing or trade patterns would have shifted abruptly at this time in a manner that can account for the change in settlements. To be fair, Algaze (2018) clearly states that changes in manufacturing were not the initial cause of urbanization. But if not, then what was?
Most importantly, although Algaze maintains that productivity growth led to the expansion of trade and the formation of cities, he does not say why this led to a state. In Section 9.9 we will argue that our analysis offers answers to all three of these puzzles.
Warfare:
We are not aware of any specialists in the field who believe that warfare was the central factor in the formation of southern Mesopotamian city-states. However, we make two points. First, as discussed in Section 9.5, some scholars believe there may have been warfare near the end of the ‘Ubaid period. This would not be surprising given the climate deterioration at this time (see Chapter 7 for a model of climate shocks and early warfare). Second, both Kennett and Kennett (2006) and Algaze (2013) argue that much later in the Uruk period, warfare could have caused rural populations to flee to the cities for protection, which may have contributed to the size of the cities.
Religion:
Algaze (2013, 75), inspired by Jacobsen (1976), suggests that cities arose from “the ideological attractions of living in centers where the gods themselves were thought to reside.” The problem is that such religious ideas either go back to the ‘Ubaid period and thus cannot serve as a trigger for urbanization in the Uruk period, or are an exogenous cultural mutation that coincides with city-state formation.
We do note, however, that our own theory about city-state formation will stress a falling wage for commoners, which we believe triggered the rise of urban manufacturing. In our approach the decrease in the wage is driven by climate change. But if people have non-economic reasons for wanting to be in a city, such as the threat of war or ideological convictions, this could expand the supply of urban labor and reduce wages beyond what we describe in Section 9.9, reinforcing the climate effect. However, the timing of these other effects did not necessarily coincide with urban manufacturing. Arguments of this kind also require a distinct urban labor market with a lower wage than in rural areas. In our model we will ignore such issues and assume a single wage for the entire region.
We conclude with a final point. The hypotheses reviewed in this section do not speak in any direct way to issues of stratification or inequality. They do not explain why ‘Ubaid society had moderate inequality of grave goods (at Eridu) while later periods had uniformly higher Gini coefficients for grave goods (Section 9.5); why levels of housing inequality rose during the Uruk and Jemdet Nasr periods (Section 9.6); or why inequality of grave goods increased in the Early Dynastic (Section 9.7). Our theory in Section 9.9 and Chapter 10 accounts for such observations by showing how elites became better off and commoners became worse off in the early stages of city-state formation.
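For readers unfamiliar with the measure, the Gini coefficient referred to above can be computed as in the following minimal sketch. The function name and the grave-good counts are hypothetical and invented for illustration; they are not data from Eridu or any other site discussed in this chapter.

```python
def gini(values):
    """Gini coefficient of a list of non-negative quantities.

    Returns 0.0 for perfect equality and approaches 1.0 as one unit
    holds everything. Uses the standard formula for sorted data:
        G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n
    where i = 1..n indexes the values in ascending order.
    """
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0  # convention: no goods at all means no measured inequality
    weighted = sum(i * x for i, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n


if __name__ == "__main__":
    # Hypothetical counts of grave goods per burial (illustrative only).
    fairly_equal_cemetery = [3, 4, 4, 5]
    unequal_cemetery = [1, 2, 3, 10]
    print("fairly equal:", round(gini(fairly_equal_cemetery), 3))
    print("unequal:     ", round(gini(unequal_cemetery), 3))
```

A higher value for one cemetery than another indicates that grave goods are concentrated in fewer burials, which is the sense in which inequality is compared across periods.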
9.9 Our Causal Hypothesis
A formal model of our own hypothesis about the Mesopotamian case is presented in Chapter 10. Here we provide a detailed verbal exposition. Our story borrows liberally from hypotheses reviewed in Section 9.8. In particular, we synthesize ideas about aridity and migration with Algaze’s arguments about urban manufacturing. We agree with Scott (2017, 129, 135) and Mayshar et al. (2022, 3) that taxing wetland hunting and gathering would have been difficult due to a lack of transparency and appropriability. But we also argue, again using criteria of transparency and appropriability, that urban manufacturing was probably an easier target for taxation than rural agriculture in southern Mesopotamia (see Mayshar et al., 2017, 630–631), due to geographically compact production, visible workshop employment, and durable and storable outputs.
Our economic model for the rise of Mesopotamian city-states has the following structure. First consider an agricultural/foraging economy prior to urbanization. This can be interpreted as the ‘Ubaid period. Food can be obtained from two areas: an open-access commons and a closed site controlled by a local elite. The results would be similar if we included multiple elite-controlled sites in the model but for simplicity we have only one. However, we do assume that individual elite agents have individual land parcels, which collectively add up to the fixed land area of the closed site. We sometimes use the term “estates” when referring to the land parcels owned by these individual agents.
The commons includes many small sites with diverse natural resources and food production techniques (such as farming, pastoralism, hunting, gathering, and/or fishing). There is a fixed regional population of commoners who are free to move anywhere in the commons. Alternatively, they could choose to work at the elite-controlled site.
Due to open access, in equilibrium all agents in the commons receive the same food per person. This is also the level of food per person that individual elite agents must pay in order to attract commoners to work on their estates at the closed site. Depending on context, we sometimes refer to food income per commoner as the commoner standard of living, the average product of food labor in the commons, the marginal product of food labor on elite estates, or simply “the wage.” We emphasize that equality of food incomes among commoners is just a theoretical simplification. In reality we expect migration to respond to differences in living standards across the region and reduce these differences, without necessarily eliminating them completely.
The region-wide supply curve for labor is vertical in the short run due to a fixed population of commoners. The demand for labor has two sources: a demand curve from the river-irrigated elite site in the south, and a “demand curve” for labor in the commons. The latter is derived by computing the number of commoners who work in the commons at a given level of food income. We could add a third source of labor demand from elite-controlled but rain-dependent farms in the north. This would not alter our conclusions.
Both demand curves slope down because labor has diminishing returns when land and other natural resources are fixed. Summing these demands and equating aggregate demand with supply yields an equilibrium wage for the region as a whole. As mentioned in Section 9.1, while it is helpful for expositional purposes to assume that elite agents hire commoners in a labor market, our conclusions would be unchanged if commoners instead rented land from elite agents in a land market.
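The market-clearing logic of the last two paragraphs can be sketched with explicit functional forms. The symbols and forms below are assumptions made for this illustration and are not taken from Chapter 10: both sectors use Cobb–Douglas technology with a common labor exponent α; θ is a hypothetical rainfall parameter for the commons; Z and Z_U are the land areas of the commons and the elite site; C and L are the numbers of commoners working in each; and N is the fixed commoner population.

```latex
% Illustrative assumptions: common labor exponent \alpha (0 < \alpha < 1),
% rainfall parameter \theta, commons land Z, elite land Z_U, population N.
\begin{align*}
\text{Commons (average product):}\quad
  w &= \theta \left(\frac{Z}{C}\right)^{1-\alpha}
  &&\Longrightarrow\quad C(w) = Z\left(\frac{\theta}{w}\right)^{\frac{1}{1-\alpha}} \\
\text{Elite estates (marginal product):}\quad
  w &= \alpha \left(\frac{Z_U}{L}\right)^{1-\alpha}
  &&\Longrightarrow\quad L(w) = Z_U\left(\frac{\alpha}{w}\right)^{\frac{1}{1-\alpha}} \\
\text{Market clearing:}\quad
  C(w) + L(w) &= N
  &&\Longrightarrow\quad
  w^{*} = \left[\frac{Z\,\theta^{\frac{1}{1-\alpha}} + Z_U\,\alpha^{\frac{1}{1-\alpha}}}{N}\right]^{1-\alpha}
\end{align*}
```

In this sketch both C(w) and L(w) fall as w rises, which is the downward slope described above, and the equilibrium wage w* is increasing in θ and decreasing in N.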
Now suppose some or all sites in the commons are vulnerable to aridity, so when rainfall starts to decline in the early Uruk period, output at such sites falls at a given level of labor input. Thus the “demand” for labor at the rainfall-dependent commons sites will drop. For the commons sites located in the wetlands of the south, the dominant view is that this process does not become severe until after the Uruk period, but eventually the wetlands dry out and become less productive, intensifying the decline in aggregate labor demand. Meanwhile the demand for labor at the elite site in the south is less affected by aridity due to local irrigation opportunities based on river water.
Increasing aridity causes a decline in the regional wage, and this causes the elite in the south to hire more commoners on their agricultural land. Thus commoners migrate from rainfall-dependent locations toward elite estates in the south and (initially) also foraging opportunities in the wetlands. This accounts for the evidence that the climate shift was accompanied by migration from the north to the south. It also accounts for the evidence of greater elite–commoner inequality, because land rents at the elite site were rising while wages were falling (keeping in mind that in our model, the total land area controlled by the elite is fixed).
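The comparative statics in the last two paragraphs can be checked numerically in a toy version of the pre-urban economy. This is a minimal sketch, not the Chapter 10 model: the Cobb–Douglas forms, the common exponent `alpha`, the rainfall parameter `theta`, the function name, and all numerical values are assumptions chosen for illustration.

```python
def equilibrium(theta, N=1000.0, Z=500.0, Z_U=100.0, alpha=0.6):
    """Regional labor-market equilibrium for a toy two-sector food economy.

    Assumed forms (illustrative, not from the source):
      commons output = theta * C**alpha * Z**(1 - alpha)   (open access)
      estate output  = L**alpha * Z_U**(1 - alpha)         (elite-controlled)
    Commoners earn the average product in the commons and the marginal
    product on elite estates; both must equal the regional wage w.
    """
    k = 1.0 / (1.0 - alpha)
    # Average product theta*(Z/C)**(1-alpha) = w   =>  C(w) = Z*(theta/w)**k
    # Marginal product alpha*(Z_U/L)**(1-alpha) = w =>  L(w) = Z_U*(alpha/w)**k
    # Market clearing C(w) + L(w) = N then gives the wage in closed form.
    w = ((Z * theta**k + Z_U * alpha**k) / N) ** (1.0 - alpha)
    C = Z * (theta / w) ** k
    L = Z_U * (alpha / w) ** k
    # Land rent is estate output minus the wage bill.
    rent = L**alpha * Z_U ** (1.0 - alpha) - w * L
    return {"wage": w, "commons_labor": C, "estate_labor": L, "land_rent": rent}


if __name__ == "__main__":
    # Lower theta stands in for increasing aridity in the commons.
    for theta in (1.0, 0.8, 0.6, 0.4):
        eq = equilibrium(theta)
        print(
            f"theta={theta:.1f}  wage={eq['wage']:.3f}  "
            f"estate_labor={eq['estate_labor']:.1f}  land_rent={eq['land_rent']:.1f}"
        )
```

Under these assumptions a lower `theta` reduces the wage, shifts commoners from the commons to elite estates, and raises land rent, which matches the direction of the effects described above.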
In addition to food production, the regional economy had a latent manufacturing sector. When manufacturing became active, it exhibited aggregate increasing returns to scale, as long as the individual workshops were in close spatial proximity. In the model in Chapter 10, we attribute increasing returns to static Smithian forces of specialization and division of labor, although we believe dynamic processes of learning by doing and micro-invention were probably more important in the long run. These forces led to urban agglomeration.
We assume that manufactured goods were sold on a competitive regional market to all agents (both elite and commoner) at all sites. Manufacturing began when the wage became low enough relative to the productivity of manufacturing labor and the consumer demand for manufactured goods. As described above, the wage declined when climate deteriorated. Thus increasing aridity stimulated the formation of a manufacturing sector at the elite-controlled site.
As a benchmark we consider a free-entry equilibrium where any elite agent can establish an urban workshop when it is profitable to do so. The demand for labor now has three sources: the commons, elite agricultural estates, and manufacturing. The wage is determined by labor market clearing, the price of manufactured goods is determined by product market clearing, and the scale of urban manufacturing is determined by a zero-profit condition. We derive a threshold level for our climate parameter such that when commons productivity drops below this level, the manufacturing sector becomes active and urbanization occurs.
We next suppose that the elite at the closed site is organized enough to collect taxes from workshops based on the number of commoners they employ. We show in Chapter 10 that taxation of the workers themselves would give identical results. Though individual elite entrepreneurs are price takers, the elite as a whole has both monopoly and monopsony power. Specifically the elite can use the tax rate to limit the size of the urban sector, both driving up the output price and driving down the wage. The resulting profit is appropriated through the tax system and rebated to individual elite agents. In short, the elite taxes its members in order to enforce a cartel agreement. In our model tax revenue is used for private elite consumption, but in reality it was also used for public goods such as temples and to support state personnel.
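The free-entry benchmark and the tax regime can be summarized schematically. The notation here is assumed for illustration and is not the notation of Chapter 10: M is manufacturing employment, Q(M) is aggregate manufacturing output, p is the price of manufactured goods in units of food, D(p, ·) is regional demand for manufactured goods, C(w) and L(w) are labor demands in the commons and on elite estates, N is the commoner population, and τ is a tax per manufacturing worker.

```latex
% Schematic conditions only; functional forms are left unspecified.
\begin{align*}
\text{Labor market clearing:}\quad & C(w) + L(w) + M = N \\
\text{Product market clearing:}\quad & D(p,\cdot) = Q(M) \\
\text{Free entry (zero profit):}\quad & p\,Q(M) = w\,M \\
\text{With a per-worker tax } \tau:\quad & p\,Q(M) = (w+\tau)\,M
  \quad\Longrightarrow\quad \tau M = p\,Q(M) - w\,M
\end{align*}
```

In this sketch the tax revenue τM equals the gap between manufacturing revenue and the wage bill, which is the profit that free entry would otherwise compete away.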
In this framework the elite faces a tradeoff between land rent and manufacturing profit. For moderate climate deterioration, yielding a moderate decline in the commoner wage, the elite prefers to enjoy higher land rents and suppresses manufacturing. But for more severe climate deterioration and a correspondingly lower wage, the elite sets a tax rate at which manufacturing occurs. This yields a city-state with both urbanization and taxation. We show in Chapter 10 that elites became better off and commoners became worse off early in this process.
We close with some points of comparison between our hypothesis and those from Section 9.8. As with many other authors, we regard increasing aridity as the prime mover leading to city-state formation. We also believe the climate deterioration accounts for the evidence of migration to the south from other areas (northern Mesopotamia, southwestern Iran, and so on). One distinctive feature of our framework is that we link the exogenous climate shift to an endogenous decline in the commoner standard of living and increased use of commoner labor on elite agricultural estates. We also link the decline in the wage to elite incentives to promote urban manufacturing. We connect urbanization with state formation by arguing that urban manufacturing was an easier target for taxation than rural agriculture. And finally, we show that the elite may want to engage in taxation simply to enhance their own private consumption, even if they lack interest in public goods.
Because there has been much debate about the relative importance of food from the wetlands and food from agriculture, we stress that the qualitative implications of our theory do not depend on the initial richness of the wetlands or the rate at which they dried out. If there had been no wetlands at all, we would have defined the commons to be a set of open sites used for farming or pastoralism. If the wetlands were initially rich and dried out at the same rate as other sites in the commons, we would include wetland sites in our definition of the commons without changing our story. If the wetlands were initially rich and dried out more slowly, then migration from commons sites that were more vulnerable to aridity would still drive down the average product of labor at the wetland sites, and the wage paid by elites to commoners would still fall. At most, the role of the wetlands as a buffer would increase the size of the climate shift needed for city-state formation.
Our theory emphasizes wage reduction as the crucial stimulus to manufacturing, and in our model the land area controlled by the elite is fixed. However, some authors argue that increased aridity led to lower river levels and decreased flooding in the Uruk period. The resulting new land area available for settlement could have provided further incentives for manufacturing, which required contiguous land for worker residences and the exploitation of scale economies. We will return to this issue in Section 10.7.
In Section 9.8 we commented that the emphasis by Mayshar et al. (2017, 2022), Scott (2017), and Allen et al. (2020) on cereal taxation does not explain how the city-states were financed in the Uruk period, before large-scale irrigation projects appeared in the Jemdet Nasr and Early Dynastic periods. We propose that taxation of manufacturing filled this gap, and in our formal modeling we make the stark claim that this source of tax revenue was necessary and sufficient for city-states. Our story does not require technical innovation in the food production sector, which would have tended to increase the wage rather than decreasing it, making it harder to explain why urban manufacturing became profitable and why inequality between elites and commoners rose.
We would agree that in practice cereal taxation could have contributed something to the city-states of the Uruk period, but two factors probably limited revenues from this source. As Mayshar et al. (2017) mention, irrigated agriculture in southern Mesopotamia would not have been completely transparent for tax collectors. Moreover, the availability of wetland foods could have restrained demand for irrigated cereals and thus the potential tax revenues from such crops, until the wetlands eventually began to dry out.
We are doubtful that other revenue sources could have financed a state during the Uruk period. In particular, the lack of city walls suggests that little attempt was made to tax goods obtained in the surrounding wetlands or to tax trade directly (rather than taxing manufactured goods, some of which were exported and some of which were sold locally).
Although we agree with Algaze about the centrality of urban manufacturing, his story and ours have some important differences. Algaze asserts that productivity growth due to Smithian specialization, along with internal and external gains from trade, pulled labor into the cities. In contrast, we see climate-led reductions in the productivity of the commons as pushing labor into cities. One advantage of our approach is that it accounts for the timing of urbanization, and another is that it is consistent with evidence for rising elite–commoner inequality. Furthermore, we tend to think that learning by doing was an important source of urban productivity growth and it is hard to see how this process could have taken off without some initial trigger for manufacturing. Our climate story provides that trigger.
Pulling all of these elements together, we offer explanations for the following:
(a) The timing of city-state formation (increasing aridity due to climate change).
(b) Population growth in the south (migration due to declining productivity of food labor elsewhere in the region).
(c) Increasing inequality (falling wages and rising land rents).
(d) Urbanization (aggregate increasing returns to scale in manufacturing).
9.10 Conclusion
The next two chapters will elaborate on the ideas developed here. The first task is to build a formal model that captures the causal logic of Section 9.9, which we undertake in Chapter 10. The second task is to investigate whether our hypothesis about city-state formation in southern Mesopotamia can be generalized to other regions of the world. We address this question in Chapter 11. We will argue that although the climate mechanism has wider applicability, it is not universally relevant, and two other mechanisms are also important: one based on the model of property rights from Chapter 6, and another based on the model of elite warfare from Chapter 8.
10.1 Introduction
The preceding chapter provided a case study of city-state formation in southern Mesopotamia. We used archaeological evidence to construct a detailed chronology of events, and toward the end of the chapter we reviewed several attempts to explain these events. Specifically, we want to understand the transition from a set of relatively small and dispersed settlements to a few large cities with all the trappings of state power.
This chapter constructs a formal economic model embodying our own hypothesis about the transition to city-states in southern Mesopotamia. Like other economic models it is schematic, and is not meant to capture every archaeological fact from Chapter 9. We use economic logic to fill in gaps when the archaeological record does not speak directly to certain issues. As a result our hypothesis involves a degree of speculation, although of course the same is true for hypotheses advanced by other authors.
Having said that, we believe the theory presented in this chapter is consistent with the key facts, captures the key causal factors, and suggests questions that archaeologists could investigate in the future. We also ask the reader to consider the merits of our story not in isolation but in comparison with the alternative stories from Section 9.8. As we will argue later (Section 10.7), our theory often explains observations that other theories do not. We already gave a detailed verbal exposition of our story in Section 9.9 and will not repeat that discussion here. The rest of this introduction outlines the structure of the present chapter.
Section 10.2 introduces basic assumptions and describes an initial equilibrium for an agricultural society. Food can be produced in a commons where sites are open to all, and at a single closed site controlled by a local elite. In this setting we derive the effects of climate change on the wage and the allocation of labor between open and closed sites.
Section 10.3 describes the manufacturing sector and characterizes the conditions under which such a sector would emerge, assuming no taxation and free entry by firms controlled by individual elite agents who are price-takers in the markets for labor, food, and manufactured output. Due to aggregate increasing returns to scale, manufacturing develops at a single site (the one that is under elite control). We interpret the resulting agglomeration of population as an urbanization process.
Section 10.4 extends the model to include taxation of the manufacturing sector by the elite, where we assume the elite as a whole behaves collusively and uses taxation as a cartel enforcement device. By limiting employment in the manufacturing sector, taxation drives down the wage paid to commoners and drives up the price at which manufactured goods are sold. In place of the zero-profit equilibrium arising under free entry, the elite now captures positive profit in the form of tax revenue, which is rebated to the individual elite agents and enhances their private consumption. A large negative climate shock can lead to the emergence of a manufacturing sector. Given our definition of the state as an organized elite that can collect taxes, these theoretical results provide an explanation for the formation of city-states in southern Mesopotamia.
The next two sections develop further implications of the model. Section 10.5 is devoted to the effects of the transition on elite and commoner welfare. We show that in the early stages of city-state formation elites always become better off and commoners always become worse off. However, when the manufacturing sector is large the impact on commoners becomes ambiguous. Section 10.6 explores the question of whether long-run Malthusian population dynamics would undo the conclusions derived from our short-run framework. We show that with certain assumptions about parameter values, our key results continue to hold in a long-run setting with endogenous population.
The remaining three sections are verbal. Section 10.7 discusses how the model accounts for the archaeological facts described in Chapter 9 and compares our framework with the alternative hypotheses in Section 9.8. Section 10.8 offers closing thoughts and Section 10.9 is a very brief postscript (we postpone acknowledgments of assistance until the postscript in Chapter 11). We do try to provide some intuition for non-economists in Sections 10.2–10.6 but these are unavoidably mathematical. Readers not interested in the math can skip Sections 10.2–10.6 and proceed directly to Section 10.7. Proofs of all formal propositions are available at cambridge.org/economicprehistory.
10.2 Stratification
We begin with a model of the ‘Ubaid period prior to urbanization. The only consumption good is food, which is produced using inputs of labor and land. Food production takes place at two locations: an open-access commons dependent on rainfall and a special closed site called U. The commons includes many small sites with varying land areas. Site U is an ‘Ubaid town and the future city of Uruk. Food production at site U makes use of river-based irrigation. Thus production in the commons is vulnerable to drought but production at site U is not. A local elite controls access to the land at site U while the commons is beyond elite control.
Geographically the commons may include areas of northern Mesopotamia, areas of southwestern Iran, or wetlands in the south. These areas were not all equally exposed to drought, and the effects of aridity arrived at different times in different places. For example, wetland foraging was probably not initially vulnerable to reduced rainfall, but eventually the wetlands began to contract. We suppress these details and assume in the model that aridity affected all sites in the commons simultaneously.
In our approach the formation of a city-state does not require interaction with any other incipient city-state, so we simplify by having a single site of this kind. Section 10.8 will discuss how the theory would change if we had multiple sites controlled by distinct local elites, leading to the formation of multiple city-states.
We refer to all food production generically as “agriculture,” although pastoralism and foraging may also have been important in the commons and herding may have been important at the closed site U. We assume site U has a fixed total land area and access to irrigation. It is divided equally into land parcels controlled by individual elite landlords. We refer to these individual land parcels as “estates.”
There is a fixed population of commoners who can use any site in the commons. We ignore the costs of moving from one site to another. Commoners may also work on elite-controlled agricultural land. Due to open access, all agents throughout the commons have the same food income, and this is also the amount of food the individual elite agents must offer in order to attract commoners to work on their estates. We call this the wage. As mentioned in Section 9.1, our conclusions would be the same if commoners instead handed over some of their food output to elite landlords as a rental payment.
The assumption of zero mobility costs across sites implies that we have a regional “labor market” encompassing all open sites, whether located in the north or in the southern wetlands, as well as the closed site controlled by the southern elite. Importantly, we do not have separate labor markets with separate wages in different parts of Mesopotamia. For simplicity, we ignore any potential role for elites located in northern Mesopotamia. The existence of such elites would not affect our qualitative conclusions.
Each agent is endowed with one unit of time and maximizes food consumption. As in Chapters 6–8 we use a Cobb–Douglas production function for food. First consider the commons, which has total population C and total land area Z. The total food output from the commons is
$$Y_c = \theta\, C^{\alpha} Z^{1-\alpha}, \qquad 0 < \alpha < 1. \tag{10.1}$$
The parameter θ ≥ 0 captures the effect of rainfall on food output in the commons. At some points we consider the case θ = 0, which is equivalent to a situation where no commons exists, but θ > 0 will be assumed unless otherwise stated.
Equation (10.1) is an aggregate production function. We think of the commons as including many small sites, each with the same Cobb–Douglas technology but possibly with different land areas. Due to free mobility, the population C is distributed in a way that equalizes the average product of labor across sites. The Cobb–Douglas assumption implies that the marginal product of labor is also equalized, so total food is maximized for the commons population C. This yields (10.1). Food per person in the commons is
$$w = \theta \left( Z / C \right)^{1-\alpha}. \tag{10.2}$$
Site U has the farming population F and one unit of land. Its food output is
$$Y_u = F^{\alpha}. \tag{10.3}$$
Land at site U is irrigated using river water, so rainfall is irrelevant. We impose 0 ≤ θ ≤ 1 so the commons is less productive. Irrigated sites may also have had some problems with aridity due to lower river flows and shifting river channels, but we ignore this and assume irrigation systems could be expanded when needed to cope with these problems. Another wrinkle is that aridity led to some drying of the marshes and a corresponding increase in the land area available for elite-controlled agriculture. We do not include this change in land endowments in the formal model but will discuss the likely effects in Section 10.7.
To study inequality we assume that if an organized group occupies a site, has a density of e > 0 agents per unit of land as in Chapter 8, and uses all of its time to exclude other agents, it enjoys collective property rights over the land at that site. Such a group will be called an elite. When a non-elite agent threatens the property rights of an elite agent, the latter can call upon other nearby landlords for help in repelling the intruder. In principle this could occur not only at site U but also at the sites in the commons, if those sites have enough elite agents per unit of land. The parameter e might vary as a function of geography (it may be easier to defend property rights over land on a flat alluvial plain than in wetlands or mountains) but here we assume it is identical for all sites.
Non-elite agents will be called commoners regardless of whether they work at a site in the commons or on elite land at site U. Commoners are unorganized in the sense that landlords only need to repel or deter them one at a time in order to maintain control over land. This resembles our model of inequality from Chapter 6. By contrast with the model from Chapter 6 but like the model in Chapter 8, we ignore farm labor supplied by the elite agents. We also ignore warfare, which involves conflict between two organized groups (Chapters 7 and 8). The commoners at site U are free to exit to the commons if they wish (they are not slaves). There is good evidence that slavery existed at Uruk but our model still applies as long as there was also a sufficiently large free labor force.
The regional population N is divided into classes of size (C, F, e) such that
$$C + F + e = N. \tag{10.4}$$
We assume N > e throughout so there is a positive number of commoners C + F = N − e. Total population N is exogenous in Sections 10.2–10.5 but endogenous in Section 10.6.
Landlords treat the food per person w from (10.2) as a parametric wage because the standard of living in the commons determines what must be offered to attract and retain farm labor at site U. These wages are paid in the form of food. As in Chapter 8, the rent to the landlord is the output of food net of such wage payments.
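An individual elite agent has an estate with 1/e land units and hires n commoners to maximize $n^{\alpha}(1/e)^{1-\alpha} - wn$. This results in the individual labor demand $n = (1/e)(\alpha/w)^{1/(1-\alpha)}$. Multiplying by the number of elite agents gives the total demand for farm labor at site U:
$$F(w) = (\alpha/w)^{1/(1-\alpha)}. \tag{10.5}$$
Total land rent for the elite is
$$R(w) = F(w)^{\alpha} - wF(w) = (1-\alpha)(\alpha/w)^{\alpha/(1-\alpha)}. \tag{10.6}$$
Food per elite agent is R(w)/e.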
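The estate-level maximization behind (10.5) and (10.6) can be checked numerically. The following is a minimal sketch, assuming each estate has the Cobb–Douglas technology n^α(1/e)^(1−α); the function names and parameter values are hypothetical and chosen only for illustration.

```python
# Sketch under assumptions: estate output is n**alpha * (1/e)**(1 - alpha).
# Function names and parameter values are hypothetical.

def estate_profit(n, w, alpha, e):
    """Food kept by one landlord who hires n commoners at wage w."""
    return n ** alpha * (1.0 / e) ** (1.0 - alpha) - w * n

def estate_labor_demand(w, alpha, e):
    """Closed-form maximizer of estate_profit with respect to n."""
    return (alpha / w) ** (1.0 / (1.0 - alpha)) / e

def total_labor_demand(w, alpha, e):
    """Farm labor at site U: e landlords times the demand per estate."""
    return e * estate_labor_demand(w, alpha, e)

def total_land_rent(w, alpha, e):
    """Sum of estate profits over the e landlords."""
    n = estate_labor_demand(w, alpha, e)
    return e * estate_profit(n, w, alpha, e)

alpha, e, w = 0.5, 2.0, 0.25
n_star = estate_labor_demand(w, alpha, e)
F = total_labor_demand(w, alpha, e)
R = total_land_rent(w, alpha, e)
print("labor per estate:", n_star)
print("total farm labor F:", F)
print("total rent R:", R, "food per elite agent:", R / e)
```

Note that total labor demand and total rent do not depend on e in this sketch: splitting one unit of land into more estates changes labor per estate but not the aggregate.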
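It will often be convenient to aggregate agricultural output from the commons and site U. Let $Y_a = Y_c + Y_u$ be total food output and let A = C + F be total agricultural labor. Using (10.1) and (10.3) and the fact that the marginal product of labor at site U is equal to the average product of labor in the commons, we obtain
$$C = \frac{Z\theta^{1/(1-\alpha)}}{\beta(\theta)}\,A, \qquad F = \frac{\alpha^{1/(1-\alpha)}}{\beta(\theta)}\,A, \qquad \text{where } \beta(\theta) \equiv \alpha^{1/(1-\alpha)} + Z\theta^{1/(1-\alpha)}. \tag{10.7}$$
Total food Ya as a function of total agricultural labor A is
$$Y_a = \left[ \alpha^{\alpha/(1-\alpha)} + Z\theta^{1/(1-\alpha)} \right] \left[ \frac{A}{\beta(\theta)} \right]^{\alpha}. \tag{10.8}$$
Because the marginal products of labor are not equated between the commons and site U, this aggregate output is below the theoretical maximum.
Equations (10.2) and (10.5) give a relationship between the wage and the total demand for agricultural labor at all sites in the region (the commons and site U together):
$$A(w, \theta) = \beta(\theta)\, w^{-1/(1-\alpha)}. \tag{10.9}$$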
In a purely agricultural economy the equilibrium wage equates this total labor demand to the total supply of commoners N - e.
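Definition 10.1 The wage w(θ) is an agricultural equilibrium associated with the commons productivity θ when A[w(θ), θ] = N − e or equivalently $w(\theta) = [\beta(\theta)/(N-e)]^{1-\alpha}$.
In an equilibrium of this kind (10.7) gives $C = Z\theta^{1/(1-\alpha)}(N-e)/\beta(\theta)$ and $F = \alpha^{1/(1-\alpha)}(N-e)/\beta(\theta)$.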
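To illustrate the agricultural equilibrium, the sketch below computes the wage and the split of commoners between the commons and site U. It assumes commons output of the form θ·C^α·Z^(1−α) and site U output F^α; the function names and parameter values are hypothetical.

```python
# Sketch under assumptions: commons output is theta * C**alpha * Z**(1 - alpha)
# and site U output is F**alpha. Parameter values are hypothetical.

def beta(theta, alpha, Z):
    """Shift term in aggregate labor demand, A = beta / w**(1/(1 - alpha))."""
    k = 1.0 / (1.0 - alpha)
    return alpha ** k + Z * theta ** k

def agricultural_equilibrium(theta, N, e, Z, alpha):
    """Wage and labor allocation when all N - e commoners farm."""
    k = 1.0 / (1.0 - alpha)
    w = (beta(theta, alpha, Z) / (N - e)) ** (1.0 - alpha)
    C = Z * (theta / w) ** k      # commons: average product equals w
    F = (alpha / w) ** k          # site U: marginal product equals w
    return w, C, F

N, e, Z, alpha = 12.0, 2.0, 10.0, 0.5
for theta in (0.3, 0.2):
    w, C, F = agricultural_equilibrium(theta, N, e, Z, alpha)
    print(f"theta={theta}: w={w:.4f}, C={C:.4f}, F={F:.4f}, C+F={C + F:.4f}")
```

In both runs C + F equals N − e, and the lower value of θ gives a lower wage with more labor on elite estates.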
Two restrictions ensure open access in the commons and stratification at site U.
Definition 10.2 An agricultural equilibrium satisfies the stratification constraints when
(a) C/Z ≤ e so commons population density is too low to support elites, and
(b) R(w)/e ≥ w so elite agents at site U are at least as well off as commoners.
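Proposition 10.1 (stratification). A necessary condition for an agricultural equilibrium to satisfy both stratification constraints is
$$\theta \le \alpha^{\alpha}(1-\alpha)^{1-\alpha}. \tag{10.10}$$
When (10.10) holds, both stratification constraints are satisfied if and only if
$$\frac{e\,\beta(\theta)}{(1-\alpha)\,\alpha^{\alpha/(1-\alpha)}} \;\le\; N - e \;\le\; \frac{e\,\beta(\theta)}{\theta^{1/(1-\alpha)}} \tag{10.11}$$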
where the set of N values from (10.11) is non-empty. The lower bound in (10.11) is an increasing function of θ and the upper bound in (10.11) is a decreasing function of θ.
Condition (10.10) says that the productivity of the commons must be low enough relative to the productivity of site U (normalized at unity). If the sites in the commons are highly productive, two things happen. First, agents migrate into the commons, which can propel the population density there beyond the threshold e, resulting in the formation of elites. Second, a highly productive commons implies a high wage, which can make it unprofitable to be a member of the elite at U. When (10.10) is violated, at least one of these outcomes must occur. When (10.10) holds, there are commoner population levels N−e that satisfy both stratification constraints. The set of (N−e, θ) points for which this is true is shown in Figure 10.1.
When rainfall declines in the commons, θ falls while productivity at site U is left unchanged. The wage from D10.1 decreases, yielding less food per capita in the commons. Labor is reallocated away from the commons (C falls and F rises) because landlords hire more commoners at the lower wage. As a result total land rent R(w) rises. It is easy to see from Figure 10.1 that for a fixed population N, if both stratification constraints were satisfied initially then the same must be true after θ decreases due to lower rainfall.
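The comparative statics in this paragraph can be illustrated with a short numerical check. This is a sketch under stated assumptions: Cobb–Douglas technologies for the commons and site U, constraint (a) read as C/Z ≤ e, and constraint (b) read as R/e ≥ w. The weak inequalities, the function name, and the parameter values are all assumptions.

```python
# Sketch under assumptions: Cobb-Douglas agriculture, with the stratification
# constraints read as C/Z <= e and R/e >= w. Parameter values are hypothetical.

def stratification_check(theta, N, e, Z, alpha):
    """Return (open_commons, elite_gains, wage, rent) in agricultural equilibrium."""
    k = 1.0 / (1.0 - alpha)
    b = alpha ** k + Z * theta ** k
    w = (b / (N - e)) ** (1.0 - alpha)   # wage clearing the labor market
    C = Z * (theta / w) ** k             # commoners in the commons
    F = (alpha / w) ** k                 # commoners on elite estates
    R = F ** alpha - w * F               # total land rent at site U
    open_commons = C / Z <= e            # density too low to support elites
    elite_gains = R / e >= w             # elite at least as well off
    return open_commons, elite_gains, w, R

N, e, Z, alpha = 12.0, 2.0, 10.0, 0.5
for theta in (0.6, 0.3, 0.2):
    ok_a, ok_b, w, R = stratification_check(theta, N, e, Z, alpha)
    print(f"theta={theta}: (a)={ok_a}, (b)={ok_b}, w={w:.4f}, R={R:.4f}")
```

With these values, constraint (b) fails at the highest θ because the wage is too high, while both constraints hold at the two lower values, where the wage is lower and rent is higher.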
Section 10.3 will show that a wage reduction of this kind can lead to urbanization at site U. One might think that similar results could occur if the agricultural productivity of site U rose for technological or environmental reasons. This would pull labor into site U rather than pushing it out of the commons. However, higher agricultural productivity at U would cause elite landlords to bid up the wage, which would not provide the same stimulus to manufacturing as a wage reduction caused by a drought in the commons.
10.3 Urbanization
Here we build on the model of Section 10.2 to show how urbanization could have occurred in the Uruk period. Rather than having just one consumption good we will have two: food (y) and manufactured goods (m). Prior to urbanization, manufactured goods such as textiles, pottery, and metal objects were produced by local specialists or farmers. We ignore these activities and treat the goods produced in cities as a distinct commodity.
We want to capture two aspects of the urbanization process. First, manufacturing activities were concentrated at Uruk rather than being dispersed throughout the commons. This suggests the presence of increasing returns to scale in manufacturing, likely due to gains from a Smithian division of labor. Uruk and other southern towns probably became manufacturing hubs due to easy river and coastal transportation as well as the presence of elites who could enforce property rights over inputs and outputs. We omit transportation costs from the formal model.
Second, scale economies likely operated at the level of the city as a whole rather than at the level of the individual workshop, and involved the usual suspects: a trained labor pool, industry-specific inputs, and technological spillovers (for a discussion, see Krugman, 1991, ch. 2). We assume that manufacturing requires labor but not land, so individual workshops can be in close physical proximity. We define units so that one workshop produces one unit of the manufactured output, and assume workshop size is constrained by supervisory limits. We do not model these managerial constraints and ignore the discrete nature of individual workshops.
We want to study the transition from a boundary solution with no manufacturing to an interior solution with positive manufacturing. Similar issues arose in Chapters 3–5, where we started with zero levels for the use of certain natural resources, for sedentary technology, or for agricultural technology, and derived the conditions under which these levels would become positive. The Cobb–Douglas functional form is not suitable for this purpose because the marginal product of labor approaches infinity as the quantity of labor approaches zero. Therefore we characterize the manufacturing sector using exponential functions for demand and cost that imply finite vertical intercepts at zero. For simplicity as well as consistency with the technological assumptions of Chapters 6–8 we continue to use the Cobb–Douglas functional form for agriculture, where boundary solutions will play no role in our analysis.
Consider the demand for manufactured goods. Each agent (elite and commoner) has the identical quasi-linear utility function
$$u(m, y) = b(m) + y, \qquad b(m) = 1 - \exp(-qm), \qquad q > 0. \tag{10.12}$$
This functional form has b(0) = 0 with b′(m) > 0 and b′′(m) < 0 for all m ≥ 0. It also has finite marginal utility b′(0) = q at zero consumption, so boundary equilibria without any manufacturing can occur. The price of the manufactured good is p, the price of food is unity, and income is x, so the budget constraint associated with (10.12) is pm + y = x. The non-negativity constraint y ≥ 0 is ignored because this requirement is satisfied for all of the equilibria studied in this section.
There is a regional market for the m good so the price p applies both at site U and in the commons. All agents are price-takers. Let M be the aggregate market demand for the manufacturing sector. Because agents have identical preferences and the distribution of income can be ignored due to quasi-linearity, M satisfies
$$p = b'(M/N). \tag{10.13}$$
Next consider the supply side. Labor is the only input for manufacturing. Let L be the workforce in this sector. The total output of the manufactured good at site U is
$$M(L) = \exp(rL) - 1, \qquad r > 0. \tag{10.14}$$
This functional form has M(0) = 0 with M′(L) > 0 and M′′(L) > 0 for all L ≥ 0. The sign of the second derivative yields aggregate increasing returns. The marginal product M′(0) = r at zero input is positive and finite.
Due to increasing returns, we replace the standard supply curve with a zero-profit condition
$$pM(L) - wL = 0. \tag{10.15}$$
Zero profit is maintained through free entry and exit by manufacturing workshops, where the individual elite entrepreneurs at site U are price-takers for both inputs and outputs.
We now extend the definition of equilibrium from D10.1 to include manufacturing.
Definition 10.3 The array (p0, w0, A0, L0) is a zero-profit equilibrium associated with the commons productivity θ when the following conditions hold.
| (a) | consumer optimization: | p0 = b′[M(L0)/N] | if L0 > 0, or |
| | | p0 ≥ b′(0) | if L0 = 0 |
| (b) | manufacturing equilibrium: | w0 = p0M(L0)/L0 | if L0 > 0, or |
| | | w0 ≥ p0M′(0) | if L0 = 0 |
| (c) | agricultural equilibrium: | A0 = A(w0, θ) | |
| (d) | labor market equilibrium: | A0 + L0 + e = N | |
Condition (a) requires that consumer demand for manufactured goods at the price p0 add up to the amount M(L0) produced by firms. If no manufactured goods are produced, we allow any price p0 that is sufficiently high to choke off demand. Condition (b) requires that if the manufacturing sector is active, firms receive zero profit. When L0 = 0, we set the average product of labor equal to the marginal product, and allow any wage w0 at or above the value of this average product. Thus profit is non-positive and there is no entry into manufacturing. Condition (c) requires that farm labor in the commons and on elite estates add up to total agricultural labor at the wage w0. This always implies that A0 > 0. Finally, condition (d) requires that total labor by commoners add up to the supply N−e. The food market clears automatically due to Walras’s Law.
An equilibrium with positive manufactured goods (L0 > 0) requires
$$w = b'\!\left[\frac{M(L)}{N}\right]\frac{M(L)}{L} \qquad \text{and} \qquad w = \left[\frac{\beta(\theta)}{N-e-L}\right]^{1-\alpha} \tag{10.16}$$
where the first equation comes from (a) and (b) in D10.3 and the second equation comes from (c) and (d) in D10.3. Together these yield
$$\beta(\theta) = (N-e-L)\left\{ b'\!\left[\frac{M(L)}{N}\right]\frac{M(L)}{L} \right\}^{1/(1-\alpha)}. \tag{10.17}$$
To avoid complications involving multiple equilibria for a given value of θ, we want to guarantee that the right side of (10.17) is decreasing in L. The term involving the supply of agricultural labor, N − e − L, is clearly decreasing in L. Thus it is sufficient for the average value product b′[M(L)/N]M(L)/L to be non-increasing in L. Condition A10.1 ensures that this is true.
Assumption 10.1 ![]()
A10.1 does not involve r so this parameter is irrelevant. We only need enough concavity in the utility function relative to the population N. A zero-profit equilibrium as in D10.3 with L0 > 0 is stable in the sense that if the prices p and w adjust rapidly to a given labor allocation (A, L), profit will be positive when L < L0 so new workshops enter, while profit will be negative when L > L0 so some existing workshops exit.
Assuming A10.1 holds, the nature of zero-profit equilibrium is shown in Figure 10.2. The horizontal axis shows manufacturing labor L from left to right and agricultural labor A from right to left. These must sum to N−e to clear the labor market. The vertical axis shows the wage. At a high enough value θ′ for the commons productivity, the demand curve for agricultural labor from (10.9) does not intersect the average value product curve b′[M(L)/N]M(L)/L and we have an agricultural equilibrium where L = 0 as in Section 10.2. In this situation the wage w′ that clears the labor market is too high to make manufacturing attractive. At the productivity level θ0 the wage is qr and again L = 0. But if productivity falls below θ0 the system moves to an interior equilibrium with the manufacturing labor L0 > 0 and the wage w0 < qr. This marks the start of urbanization. Under present assumptions the level of manufacturing labor is a continuous function of θ so the transition does not involve any abrupt jumps.
To formalize these ideas we derive the boundary productivity θ0 by setting L0 = 0 in (10.17). This yields
$$Z\theta_0^{1/(1-\alpha)} = (N-e)(qr)^{1/(1-\alpha)} - \alpha^{1/(1-\alpha)}. \tag{10.18}$$
Note that θ0 > 0 holds if and only if $N - e > (\alpha/qr)^{1/(1-\alpha)}$ so there is a large enough supply of commoner labor. If there are too few commoners then demand for agricultural labor by elite estates always keeps the wage above the level needed to trigger manufacturing, even if the commons has zero productivity (so in effect there is no commons). In such a situation, the causal link running from climate change to urbanization is disabled.
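The boundary productivity can be computed directly once functional forms are fixed. The sketch below assumes Cobb–Douglas agriculture, so that the agricultural wage is [β(θ)/(N − e)]^(1−α) with β(θ) = α^(1/(1−α)) + Zθ^(1/(1−α)), and takes qr as the value of manufacturing labor at zero input. The function name and parameter values are hypothetical.

```python
# Sketch under assumptions: Cobb-Douglas agriculture and a threshold wage q*r
# for manufacturing at zero input. Parameter values are hypothetical.

def boundary_productivity(q, r, N, e, Z, alpha):
    """Commons productivity at which the agricultural wage equals q*r.

    Returns None when there are too few commoners for a positive boundary.
    """
    k = 1.0 / (1.0 - alpha)
    gap = (N - e) * (q * r) ** k - alpha ** k
    if gap <= 0.0:
        return None
    return (gap / Z) ** (1.0 - alpha)

q, r, e, Z, alpha = 8.0, 0.05, 2.0, 10.0, 0.5

theta_0 = boundary_productivity(q, r, 12.0, e, Z, alpha)
print("theta_0 with N = 12:", theta_0)

# With N = 3 only one commoner is available, fewer than (alpha/(q*r))**2.
print("theta_0 with N = 3:", boundary_productivity(q, r, 3.0, e, Z, alpha))
```

In the second call the elite's own demand for farm labor keeps the wage above qr even with an unproductive commons, so no positive boundary exists.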
Proposition 10.2 (zero-profit equilibrium). Assume A10.1 holds. If θ0 ≤ 0 then only part (a) below applies. If θ0 > 0 then both parts (a) and (b) apply.
(a) A zero-profit equilibrium with L0 = 0 exists if and only if θ ≥ θ0. For any such θ the wage w0 is the same as in the agricultural equilibrium from D10.1 in Section 10.2.
(b) A zero-profit equilibrium with L0 > 0 exists if and only if 0 ≤ θ < θ0. For any such θ the associated equilibrium is unique. On this interval
(i) The equilibrium values (p0, w0, A0, L0) are differentiable functions of θ.
(ii) The variables (p0, w0, A0) move in the same direction as θ. The variable L0 moves in the opposite direction from θ.
(iii) As θ → θ0 from below, we have p0 → q, w0 → qr, A0 → N − e, and L0 → 0. The equilibrium values of these variables are continuous at θ0.
These results show that a drop in commons productivity from the interval in (a) to the interval in (b) leads to positive manufacturing output. This can only occur when the boundary value θ0 is strictly positive. Taking the commoner labor supply N−e as given, this requirement places a lower bound on qr (the average value product of manufacturing labor evaluated at zero input) through (10.18). When qr is too small, the elite’s demand for farm labor always keeps the wage too high to make urban workshops profitable.
For an interesting model we need $0 < \theta_0 \le \alpha^{\alpha}(1-\alpha)^{1-\alpha}$ where the upper bound comes from (10.10). Otherwise it would be impossible to have an agricultural equilibrium satisfying the stratification constraints from Section 10.2. It can be shown that if these constraints are satisfied in an agricultural equilibrium with θ ≥ θ0, they remain satisfied when θ < θ0 so manufacturing occurs. Thus open access persists in the commons and elite control persists at site U.
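The zero-profit equilibrium of this section can be approximated numerically. The following is a minimal sketch, assuming the exponential forms b(m) = 1 − exp(−qm) and M(L) = exp(rL) − 1 (consistent with the properties stated in the text, but assumptions here), Cobb–Douglas agriculture, and an average value product that is non-increasing in L. Function names and parameter values are hypothetical.

```python
import math

# Sketch under assumptions: b(m) = 1 - exp(-q*m), M(L) = exp(r*L) - 1,
# Cobb-Douglas agriculture, and an average value product that is
# non-increasing in L. Parameter values are hypothetical.

def avg_value_product(L, q, r, N):
    """p * M(L) / L with p = b'(M/N); tends to q*r as L -> 0."""
    if L == 0.0:
        return q * r
    M = math.expm1(r * L)
    return q * math.exp(-q * M / N) * M / L

def wage(L, theta, N, e, Z, alpha):
    """Wage that leaves N - e - L commoners in agriculture."""
    k = 1.0 / (1.0 - alpha)
    b = alpha ** k + Z * theta ** k
    return (b / (N - e - L)) ** (1.0 - alpha)

def zero_profit_labor(theta, q, r, N, e, Z, alpha, tol=1e-12):
    """Manufacturing labor L0 at which workshops earn zero profit."""
    def gap(L):
        return avg_value_product(L, q, r, N) - wage(L, theta, N, e, Z, alpha)
    if gap(0.0) <= 0.0:
        return 0.0                      # boundary case: no manufacturing
    lo, hi = 0.0, (N - e) * (1.0 - 1e-9)
    while hi - lo > tol:                # bisection on the sign of the gap
        mid = 0.5 * (lo + hi)
        if gap(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

q, r, N, e, Z, alpha = 8.0, 0.05, 12.0, 2.0, 10.0, 0.5
for theta in (0.4, 0.3, 0.1):
    L0 = zero_profit_labor(theta, q, r, N, e, Z, alpha)
    print(f"theta={theta}: L0={L0:.4f}")
```

With these values manufacturing is absent at the highest θ and grows as θ falls, in line with the direction of the comparative statics in Proposition 10.2.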
10.4 Taxation
The model in Section 10.3 showed that climate change could trigger urbanization, but not why the resulting cities would also be states. Here we extend the model to show why the emergence of a city coincided with the emergence of a centralized fiscal system. We assume the elite at Uruk was organized enough to tax its own members and punish individuals who did not comply. The punishments could have involved physical force, social ostracism up to and including expulsion from the elite, or supernatural sanctions. The question is whether the elite as a whole would have gained from such taxation. We suggest that one motivation for a tax system would have been its usefulness as a cartel enforcement device.
Individual elite agents are wage takers in the labor market and price takers in the product market because they are small relative to the size of each market. However, as a group the elite confronts an upward sloping labor supply curve and a downward sloping product demand curve for manufacturing. The elite understands that if it can collectively restrict labor input (and thus output) in this sector, it can drive down the wage and drive up the output price. Starting from a zero-profit equilibrium this yields positive profit.
In our model tax revenue is rebated to individual elite agents and used for private consumption. This revenue is identical to the profit of the manufacturing sector. It is not important whether taxes nominally fell on elites or commoners, or whether workshops were taxed based on their inputs or outputs. We adopt the convenient assumption that a tax was levied on each elite workshop based on the number of commoners it employed.
For readers accustomed to the idea that elites exploit commoners, our assumption that the elite taxes itself may be counterintuitive. Such readers should bear in mind that the elite is using taxation to solve a collective action problem involving collusion against commoners. We show below that this can be done equally well by taxing the individual elite agents who hire commoners or the individual commoners who work for elite agents. As long as the tax revenue flows to the elite in each case, the results are the same.
It does matter that only manufacturing was taxed. If taxation of agriculture had been profitable, a state would have emerged prior to urbanization. This contradicts the idea that the Mesopotamian city-states were in fact pristine states. Most scholars date the earliest Mesopotamian states to the mid- or late-Uruk period, when cities were forming or had recently formed. We infer that agriculture by itself did not provide an adequate basis for centralized taxation, probably because its dispersed activities made rural output and/or workers costlier to monitor than urban output and/or workers.
The crucial question is whether the elite wants to tax manufacturing when doing so has no administrative cost in the form of hired tax collectors, fixed costs from setting up the tax system, and the like. Clearly a necessary condition for taxation to arise is that it must offer net benefits to the elite when it can be done costlessly. If this is true, it may also be profitable to create a tax system having fixed or variable administrative costs, as long as these costs are not excessive relative to the revenue obtained.
Suppose the elite levies a tax t ≥ 0 per manufacturing worker, where tax revenue is collected by a central agency that redistributes it equally among the elite agents. The previous zero profit condition from (10.15) now becomes
$$pM(L) - (w + t)L = 0. \tag{10.19}$$
Total tax revenue is tL = pM(L) − wL, which is also the total profit from manufacturing.
Figure 10.3 illustrates these ideas. The average value product for manufacturing is given by the locus b′[M(L)/N]M(L)/L, which is downward sloping under condition A10.1 from Section 10.3. The supply curve for manufacturing labor is given by w(L, θ) reading from left to right, which is identical to the demand curve for agricultural labor in Figure 10.2 reading from right to left. This supply curve is always upward sloping.
The equilibrium (L0, w0) with zero taxation (t = 0) occurs at point A. When the tax rate on manufacturing labor is positive (t > 0), tax revenue and manufacturing profit are given by the area of the rectangle in Figure 10.3 defined by the average value product at point B and the wage at point C. The gap between B and C is wB − wC = t, the tax rate per worker. A higher tax rate implies a lower manufacturing workforce L. For any tax rate t ≥ 0 there is a unique equilibrium level of L and vice versa, as long as L lies between zero and its equilibrium level L0 with zero tax. In what follows it is simplest to have the elite choose L directly and collect the resulting manufacturing profit using whatever tax rate t is needed to induce the desired L.
We can now show why taxation of elite employers and commoner employees has the same result. Suppose each urban worker pays the tax t from Figure 10.3. Such workers must receive the wage wB from their employers in order to be recruited into the manufacturing sector, because the net wage is wC after the worker pays the tax and wC is available in the commons. The wage wB is the average value product in manufacturing, which leaves zero profit for workshop owners. Again the elite collects its profit through the tax system, not directly in the markets for goods and labor. The level of profit is the same for a given tax rate t whether it is employers or employees who nominally pay.
A positive tax t > 0 has several effects relative to the zero-profit equilibrium with no taxation. First, taxation decreases urban employment L and manufactured output M. Second, the price p for manufactured goods rises. Third, the wage w falls. Finally, the lower wage raises the land rent of the elite. A sophisticated elite will take these effects into account in choosing L.
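These four effects can be seen in a small numerical comparison of two manufacturing workforces. The sketch assumes b(m) = 1 − exp(−qm), M(L) = exp(rL) − 1, and Cobb–Douglas agriculture; the tax per worker is backed out as the gap between the average value product and the wage. The function name and all parameter values are hypothetical.

```python
import math

# Sketch under assumptions: b(m) = 1 - exp(-q*m), M(L) = exp(r*L) - 1, and
# Cobb-Douglas agriculture. Parameter values are hypothetical.

def manufacturing_outcomes(L, theta, q, r, N, e, Z, alpha):
    """Prices, rent, and the implied tax when L > 0 workers are in manufacturing."""
    k = 1.0 / (1.0 - alpha)
    M = math.expm1(r * L)                      # manufactured output
    p = q * math.exp(-q * M / N)               # price from p = b'(M/N)
    b = alpha ** k + Z * theta ** k
    w = (b / (N - e - L)) ** (1.0 - alpha)     # wage with N - e - L farmers
    F = (alpha / w) ** k                       # farm labor hired at site U
    rent = F ** alpha - w * F                  # total land rent
    tax = p * M / L - w                        # average value product minus wage
    return {"M": M, "p": p, "w": w, "rent": rent, "tax": tax, "revenue": tax * L}

q, r, N, e, Z, alpha, theta = 8.0, 0.05, 12.0, 2.0, 10.0, 0.5, 0.1
light_tax = manufacturing_outcomes(7.0, theta, q, r, N, e, Z, alpha)
heavy_tax = manufacturing_outcomes(4.0, theta, q, r, N, e, Z, alpha)
for name, out in (("L = 7", light_tax), ("L = 4", heavy_tax)):
    print(name, {key: round(val, 4) for key, val in out.items()})
```

Moving from the larger to the smaller workforce corresponds to a higher tax per worker: output falls, the price rises, the wage falls, and land rent rises.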
To characterize the elite’s optimal choice of L we need the elite’s indirect utility as a function of prices. The prices w and p are determined from (10.9) and (10.13) by
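$$w(L, \theta) = \left[\frac{\beta(\theta)}{N-e-L}\right]^{1-\alpha} \tag{10.20}$$
$$p(L) = b'\!\left[\frac{M(L)}{N}\right]. \tag{10.21}$$
The indirect utility function for an individual agent is written as v(p, x) = s(p) + x where s(p) is consumer surplus and x is income. Letting m(p) be consumption for an individual agent, we obtain s(p) = b[m(p)] − pm(p), where in market equilibrium m(p) = M(L)/N. Multiplying v(p, x) by e gives the total indirect elite utility VE(p, XE) = es(p) + XE or
$$V_E[p(L), X_E(L,\theta)] = e\,s[p(L)] + p(L)M(L) - w(L,\theta)L + R[w(L,\theta)]. \tag{10.22}$$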
Hence the elite is concerned with three things: its consumer surplus from manufactured goods, the total profit from manufactured goods, and the total land rent.
We now define an equilibrium with optimal elite taxation, where the scale of the manufacturing sector is chosen to maximize the total utility of the elite agents.
Definition 10.4 The array (pE, wE, AE, LE) is an elite taxation equilibrium associated with the commons productivity θ if
| (a) | LE maximizes VE[p(L), XE(L, θ)] | for 0 ≤ L ≤ L0(θ) |
| (b) | pE = p(LE) = b′[M(LE)/N] | from (10.21) |
| (c) | wE = w(LE, θ) = [β(θ)/(N − e − LE)]^(1−α) | from (10.20) |
| (d) | AE + LE + e = N | from (10.4) |
In condition (a) of the definition, L is constrained not to exceed L0(θ) because this is the labor input for manufacturing when the tax rate is zero and we are only considering t ≥ 0. A solution to the optimization problem exists because V is continuous and the feasible set is non-empty and compact. The other conditions are straightforward. The constraint that commoners have non-negative food consumption is discussed in the proofs of the formal propositions.
To characterize an elite taxation equilibrium, we need to differentiate the indirect utility function VE[p(L), XE(L, θ)] from (10.22) with respect to L. The derivative can be broken into two parts, corresponding to two channels through which L affects the other components of the model. The first channel acts through the product market and affects M, p, manufacturing revenue, and consumer surplus, where the key parameters are q and r. The second channel acts through the labor market and affects w, A, manufacturing cost, and land rent, where the key parameters are θ, Z, and α.
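The derivative of elite utility with respect to manufacturing labor is
$$\frac{dV_E}{dL} = \mu(L) + \lambda[L, \beta(\theta)], \tag{10.23}$$
where
$$\mu(L) = M'(L)\, b'\!\left[\frac{M(L)}{N}\right]\left[ 1 - q\left(1 - \frac{e}{N}\right)\frac{M(L)}{N} \right], \qquad \lambda(L, \beta) = -\left[\frac{\beta}{N-e-L}\right]^{1-\alpha}\left[ 1 + \frac{(1-\alpha)L}{N-e-L} + \frac{(1-\alpha)\,\alpha^{1/(1-\alpha)}}{\beta} \right].$$
The function μ(L) captures the product market channel and has an ambiguous sign. The function λ(L, β) captures the labor market channel and is strictly negative for all L < N − e and all β. The elite will always choose L < N − e because L → N − e implies A → 0, so the marginal product of agricultural labor goes to infinity, as does the wage.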
Define Lmax > 0 by μ(Lmax) ≡ 0, or equivalently M(Lmax) = N/[q(1 − e/N)]. No labor input with L ≥ Lmax can be optimal because then μ(L) ≤ 0 and λ(L, β) < 0 imply dVE/dL < 0. In a situation of this kind, the elite would reduce L. Therefore the relevant interval in the optimization problem D10.4(a) is 0 ≤ L < min {Lmax, N − e}.
Because θ is non-negative the minimum value of β is $\alpha^{1/(1-\alpha)}$. For β greater than or equal to this level, λ(L, β) is strictly decreasing in β and thus strictly decreasing in θ. This implies that a higher θ makes λ[L, β(θ)] more negative at any fixed value of L, so the elite is less inclined to expand manufacturing when commons productivity is higher.
It can be shown that ∂λ/∂L < 0 always holds. However, in general ∂μ/∂L has an ambiguous sign. Condition A10.2 below ensures that ∂μ/∂L < 0 holds on the interval 0 ≤ L < min {Lmax, N − e}.
Assumption 10.2 ![]()
A10.2 gives d²VE/dL² < 0 for 0 ≤ L < min {Lmax, N − e}, so the objective VE[p(L), XE(L, θ)] from (10.22) is strictly concave in L on this interval. This is true for any productivity θ ≥ 0. A10.2 guarantees the uniqueness of the solution to the elite’s optimization problem in D10.4(a).
In non-trivial situations where L0 > 0 so some manufacturing is feasible in D10.4(a), there are three possible cases. First, there could be a boundary solution having LE = 0 so the tax rate is high enough to prevent any manufacturing. This requires μ(0) + λ[0, β(θ)] ≤ 0. Second, there could be an interior solution with 0 < LE < L0 so manufacturing occurs but with a positive tax rate. This requires μ(LE) + λ[LE, β(θ)] = 0. Third, there could be a boundary solution with LE = L0 so the level of manufacturing is the same as in a zero-profit equilibrium and the tax rate is zero. This requires μ(L0) + λ[L0, β(θ)] ≥ 0. Later we use a sufficient condition that rules out the last case, so a solution either involves no manufacturing or positive manufacturing at a scale below the zero-profit level.
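The first two cases can be illustrated with a brute-force search over L. This is a sketch only: it assumes b(m) = 1 − exp(−qm), M(L) = exp(rL) − 1, and Cobb–Douglas agriculture, and it uses a grid search rather than the first order conditions, restricting attention to labor inputs with non-negative manufacturing profit (a non-negative tax). Function names and parameter values are hypothetical.

```python
import math

# Sketch under assumptions: b(m) = 1 - exp(-q*m), M(L) = exp(r*L) - 1, and
# Cobb-Douglas agriculture. Grid search stands in for the first order
# conditions. Parameter values are hypothetical.

def elite_utility(L, theta, q, r, N, e, Z, alpha):
    """Return (total elite utility, manufacturing profit) at labor input L."""
    k = 1.0 / (1.0 - alpha)
    M = math.expm1(r * L)
    m = M / N
    p = q * math.exp(-q * m)                     # p = b'(M/N)
    surplus = (1.0 - math.exp(-q * m)) - p * m   # s = b(m) - p*m per agent
    b = alpha ** k + Z * theta ** k
    w = (b / (N - e - L)) ** (1.0 - alpha)
    F = (alpha / w) ** k
    rent = F ** alpha - w * F
    profit = p * M - w * L                       # equals tax revenue t*L
    return e * surplus + profit + rent, profit

def elite_optimum(theta, q, r, N, e, Z, alpha, steps=4000):
    """Grid search for the utility-maximizing L with a non-negative tax."""
    best_L = 0.0
    best_V, _ = elite_utility(0.0, theta, q, r, N, e, Z, alpha)
    for i in range(1, steps):
        L = (N - e) * i / steps
        V, profit = elite_utility(L, theta, q, r, N, e, Z, alpha)
        if profit >= 0.0 and V > best_V:
            best_L, best_V = L, V
    return best_L, best_V

q, r, N, e, Z, alpha = 8.0, 0.05, 12.0, 2.0, 10.0, 0.5
for theta in (0.35, 0.1):
    L_E, V_E = elite_optimum(theta, q, r, N, e, Z, alpha)
    print(f"theta={theta}: L_E={L_E:.4f}, V_E={V_E:.4f}")
```

With these values the elite suppresses manufacturing entirely at the higher θ, even though workshops could break even at a small positive scale, and permits a taxed manufacturing sector at the lower θ.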
First we consider boundary solutions with LE = 0 so manufacturing is absent.
Lemma 10.2 Define θe implicitly by μ(0) + λ[0, β(θe)] = 0. There are two cases.
(a) If $qr(N-e)^{1-\alpha} < \alpha(2-\alpha)$ there is no such θe ≥ 0. In this case, the derivative in (10.23) is negative at L = 0 for all θ ≥ 0. Thus the elite always chooses LE = 0.
(b) If $qr(N-e)^{1-\alpha} \ge \alpha(2-\alpha)$ there is a unique θe ≥ 0. The equality implies θe = 0 and the inequality implies θe > 0. In this case the derivative in (10.23) is positive at L = 0 for θ < θe, zero at L = 0 for θ = θe, and negative at L = 0 for θ > θe. Thus the elite chooses LE > 0 for θ < θe and LE = 0 for θ ≥ θe.
Part (a) shows that if qr and N−e are small enough, the elite never wants manufacturing (LE = 0). Part (b) shows that when qr and N−e are large enough and θ is small enough, the elite wants positive manufacturing (LE > 0).
Next we rule out boundary solutions of the form LE = L0. Condition A10.3 (together with A10.2) is sufficient for this purpose.
Assumption 10.3 ![]()
This condition says that commoners represent at least half of the total population, and it limits the size of consumer surplus effects relative to income effects for the elite.
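The last possibility is an interior solution with 0 < LE < L0. Suppose Lemma 10.2(b) applies and θ < θe so the elite’s optimal labor input is positive. Let L(θ) be the input level defined implicitly by the first order condition
$$\mu[L(\theta)] + \lambda[L(\theta), \beta(\theta)] = 0. \tag{10.24}$$
This yields
$$L'(\theta) = \frac{-(\partial\lambda/\partial\beta)\,\beta'(\theta)}{\partial\mu/\partial L + \partial\lambda/\partial L}. \tag{10.25}$$
Due to ∂λ/∂β < 0 and β′(θ) > 0 in the relevant range for β, the numerator is positive. As mentioned above, ∂λ/∂L < 0 always holds, and ∂μ/∂L < 0 holds in the relevant interval if A10.2 holds. Under these conditions L′(θ) < 0 holds in (10.25), so a lower productivity in the commons leads to a larger manufacturing sector. We use ME(θ) ≡ M[L(θ)] to denote the elite’s optimal output choice, and M0(θ) ≡ M[L0(θ)] to denote output with free entry and zero taxation as in Section 10.3.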
Lemma 10.4 Suppose Lemma 10.2(b) applies, and A10.2 and A10.3 hold. Consider the interval 0 ≤ θ ≤ θe where θe is defined in Lemma 10.2. We have θe < θ0 where the zero-profit boundary θ0 is defined in (10.18). The elite’s optimal output ME(θ) is unique with ME(θe) = 0 and 0 < ME(θ) < M0(θ) for 0 ≤ θ < θe. ME(θ) is decreasing on this interval and continuous at θe.
Lemma 10.4 shows that when 0 ≤ θ < θe three things are true: (i) a zero-profit equilibrium of the kind described in Section 10.3 would lead to positive manufacturing; (ii) the elite will impose a positive tax on manufacturing; and (iii) the tax is not so high that manufacturing is entirely suppressed.
We now state our main results on elite taxation and its relationship to the zero-profit equilibria studied in Section 10.3.
Proposition 10.3
(elite taxation equilibrium).
Let θ0 be the zero-profit boundary in (10.18) and let θe be the elite taxation boundary in Lemma 10.2. Assume A10.2 and A10.3 hold.
(a) If
then
and Lemma 10.2(a) applies so there is no solution for
. For all
, zero-profit equilibrium gives
without taxation.(b) If
then
and either Lemma 10.2(a) applies so there is no solution for
or Lemma 10.2(b) applies with
.
(i) For
, zero-profit equilibrium would give
but the elite enforces
through high taxes.
(ii) For
, zero-profit equilibrium gives
without taxation.
(c) If
then Lemma 10.2(b) applies and
.
(i) For
, the elite imposes positive taxes but allows
.
(ii) For
, zero-profit equilibrium would give
but the elite enforces
through high taxes.
(iii) For
, zero-profit equilibrium gives
without taxation.
Proposition 10.3 shows that when qr, the average value product at
, is small, as in (a) and (b), a manufacturing sector does not emerge for any commons productivity θ. This is true either because entry is unprofitable even without taxation, or because entry would occur but the elite prevents it in order to maintain land rents. The only situation where the elite allows manufacturing is (c), where this sector is profitable enough that the elite taxes it moderately. Even in this case it is also necessary to have a low commons productivity θ as in (c)(i), which pushes the wage down to a point where the gains from manufacturing outweigh the losses in land rent. These results are depicted in Figure 10.4, which shows case (c) where manufacturing is permitted when θ is sufficiently low. The intercept θs and the function MS(θ) associated with a social planner will be discussed in Section 10.5 and can be ignored for now.
A key implication is that a well-organized elite with the capacity to tax its own members may not want to form a state, even if manufacturing would be profitable and even if taxation has no administrative cost. The elite must also consider the opportunity cost in terms of the land rent it collects from farming. If manufacturing would place too much upward pressure on the wage rate, and therefore too much land rent would have to be sacrificed, the elite prefers to suppress manufacturing.
However, a well-organized elite with low administrative costs will allow a city-state if there is high demand for manufactured goods, high productivity in manufacturing, a large supply of commoner labor, and low productivity in outlying areas beyond elite control (or some combination of these factors, allowing for tradeoffs among them). If these requirements are met, urbanization and taxation arise together. The collection of taxes is driven by the elite’s desire to enforce a monopolistic restriction on output from urban workshops, which increases total elite utility.
10.5 Welfare
This section develops some welfare implications of the model. These results are of interest because archaeologists and others often debate whether state formation makes commoners better or worse off. We first summarize our results verbally for readers who want to skip the math, and then derive them formally.
We consider the effects of a gradual decline in the commons productivity θ due to increasing aridity across the region. The effects are clear when manufacturing is absent. A lower θ always reduces the wage, which makes the elite better off and the commoners worse off. It also reduces total utility because total food output declines.
When some manufacturing exists, a reduction in commons productivity again makes the elite better off, but matters become much more complex for commoners. When θ falls, the elite employs more manufacturing labor, which reduces the price of manufactured goods and thus enhances the consumer surplus of commoners. There is also an indirect effect where the increased demand for manufacturing labor pushes up the wage. On the other hand, the direct effect of lower commons productivity is to decrease the wage. In early stages of urbanization when the manufacturing sector is small, the consumer surplus effect is negligible and the direct effect on the wage dominates the indirect effect. Therefore commoners become worse off early in the transition to a city-state.
The impact on total utility early in the transition depends on parameter values. It may seem strange that a reduction in a productivity parameter could increase total utility. However, there are distortions in the pricing of manufactured goods that could yield this result. Due to increasing returns, marginal cost is lower than average cost. A first-best allocation requires marginal cost pricing, while a zero-profit equilibrium with no taxation requires average cost pricing, and an equilibrium with positive taxes requires price above average cost. Because the elite restricts output, lower productivity in the commons can lead to a social gain through greater manufacturing despite lower food production. This is not a Pareto improvement because the commoners become worse off.
As urbanization proceeds, commoner utility may eventually rise for two reasons: consumer surplus effects become more important and rising demand for labor may drive up the wage. However, if the initial productivity in the commons was sufficiently high, commoners never fully regain the welfare level they enjoyed before climate deterioration began, and total utility must be lower in a city-state than in the agricultural economy that preceded it (the loss to the commoners always outweighs the gain to the elite).
The rest of this section derives these results mathematically. We assume M > 0 throughout. Total utility for the elite class has already been computed in (10.22), which we repeat here for easy reference:
Total utility for the commoner class is
where the total commoner income is XC(L, θ) = (N − e)w(L, θ) = (C + F + L)w(L, θ). Summing (10.26) and (10.27) gives total utility for the entire region, which we denote by a superscript S. This is the objective a benevolent social planner would maximize.
(10.28)
The first two terms in the second line of (10.28) give total utility from manufactured goods. F[w(L, θ)] is the elite’s demand for farm labor at the wage w(L, θ) and Cw(L, θ) is food produced in the commons. Together the last two terms in the second line represent total food output (and also consumption because the market for food clears by Walras’s Law).
Lemmas 10.5–10.7 derive the effects of θ on VE, VC, and VS respectively. In all cases L is treated as a function of θ from elite taxation equilibrium, and the assumptions used in Section 10.4 are maintained.
Lemma 10.5
This result follows from the envelope theorem. VE is already maximized with respect to L, so indirect effects through L can be ignored. The same is true for indirect effects on land rent through F. This leaves the direct effect of θ on the wage (∂w/∂θ) through (10.20) times the total labor hired by the elite, which is F + L > 0. The wage is increasing in θ, so the elite always gains from lower commons productivity.
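The envelope argument can be written compactly. The display below is only a sketch of the verbal reasoning above: it assumes that θ enters the elite's objective directly only through the wage paid on the F + L units of hired labor, and the notation $V^E(L, F, \theta)$ for the elite's objective before optimization is introduced here for illustration.

```latex
\frac{dV^E}{d\theta}
  = \underbrace{\frac{\partial V^E}{\partial L}}_{=\,0}\,\frac{dL}{d\theta}
  + \underbrace{\frac{\partial V^E}{\partial F}}_{=\,0}\,\frac{dF}{d\theta}
  + \frac{\partial V^E}{\partial \theta}
  = -(F + L)\,\frac{\partial w}{\partial \theta} \;<\; 0 ,
\qquad \text{since } F + L > 0 \text{ and } \frac{\partial w}{\partial \theta} > 0 .
```

The two zero terms reflect the first order conditions for L and F at an interior optimum; what remains is the change in the wage bill.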
Lemma 10.6

The envelope theorem is of less use in this case, although one can exploit the fact that commoners maximize utility in choosing their consumption level for the m good. As a result the derivative of consumer surplus with respect to price p reduces to −m = −M/N. For both p and w we have indirect effects of θ acting through the elite’s optimal labor input L(θ). There is also a direct effect of θ on the wage captured by ∂w/∂θ.
The derivative in Lemma 10.6 is hard to sign. Given positive manufactured output, a reduction in θ increases L. This lowers p and raises consumer surplus, which is good for commoners. The indirect effect through L also raises the wage from (10.20), which is again good for commoners. But the direct effect of θ on the wage in (10.20) implies that lower productivity yields a lower wage, which is bad for commoners. Even if we limit attention to the early stages of urbanization where M is small and thus consumer surplus effects are negligible, the direct and indirect effects of θ on the wage go in opposite directions. We will return to this issue below.
Lemma 10.7

This is similar to Lemma 10.6 except that here the direct wage effect ∂w/∂θ only multiplies the commons labor C rather than the entire commoner labor supply N-e = C + F + L. The reason is that when we sum to obtain VS = VE + VC, wage transactions between the elite and commoners cancel out. However, the elite does not pay for labor in the commons, and the food produced there is important from a social standpoint. Notice also that elite consumer surplus plays no role in Lemma 10.7 because it vanished in Lemma 10.5 due to the envelope theorem (the elite already internalizes the effect of L on its own surplus).
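The cancellation of wage transactions can be checked by adding up the direct wage effects alone. The following sketch suppresses all indirect terms acting through L(θ) and keeps only the terms that multiply ∂w/∂θ, using the coefficients described for Lemmas 10.5 and 10.6.

```latex
\underbrace{-(F + L)\,\frac{\partial w}{\partial \theta}}_{\text{elite (Lemma 10.5)}}
\;+\;
\underbrace{(C + F + L)\,\frac{\partial w}{\partial \theta}}_{\text{commoners (Lemma 10.6)}}
\;=\;
C\,\frac{\partial w}{\partial \theta} .
```

Only the commons labor C survives, which matches the role of food produced in the commons in the social objective.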
The following proposition exploits these lemmas to characterize welfare effects when θ is near θe so the manufacturing sector is small. These can be regarded as effects arising in the early stages of city-state formation. After presenting the technical results, we provide some economic intuition.
Proposition 10.4
(local welfare effects).
Adopt the assumptions used for Proposition 10.3(c)(i) so θe > 0. Abbreviate β(θe) ≡ βe. Consider the interval 0 ≤ θ < θe where L(θ) > 0 and let θ → θe from below, which gives L(θ) → 0 and ME(θ) → 0. Let D < 0 be the limit of the denominator in (10.25). As θ → θe from below, the derivatives from Lemmas 10.5–10.7 have the following features.
(a)

(b)

(c)

Part (a) is trivial because we already know from Lemma 10.5 that the derivative for the elite is negative. However, we state the limit result in terms of underlying parameters in order to facilitate comparisons with parts (b) and (c).
Part (b) indicates that commoners are always made worse off by a small decrease in the commons productivity θ if M is sufficiently small. More technically we can find a neighborhood (θe - δ, θe) with δ > 0 on which this is true. There are two ideas involved. First, when M is small the effect of θ on consumer surplus can be ignored. Second, a drop in θ has direct and indirect effects on the wage. The direct effect lowers the wage, while the indirect effect raises it by encouraging the elite to hire manufacturing labor. Close to θe the direct effect dominates so the commoners are worse off. The sign in part (b) is obtained by noting that
, the other factors in the first line are positive, and the second line is negative because A10.2 from Section 10.4 implies
.
The outcome in part (c) depends in a complex way on the parameter values. It can be shown that either sign is possible. The proof involves fixing a suitable value for
, setting
and
so A10.2 and A10.3 hold with equality, and adjusting r to keep βe constant while varying the level of N. For a small population N, total utility moves in the same direction as commons productivity, but for a large value of N, total utility moves in the opposite direction to commons productivity.
The results in Proposition 10.4 are local (they apply only in some neighborhood of the transition point θe), and do not provide a global verdict on the welfare effects of city-state formation for all possible values of θ. Some progress on the global effects of θ can be made by imagining that a social planner allocates labor between the agricultural and manufacturing sectors to maximize total utility for all agents, both elite and commoner. Thus the planner maximizes
where
as in (10.8) and b(m) comes from (10.12). The first term in (10.29) is total food and the second is total utility from manufacturing. The planner must respect the constraint that within the agricultural sector, the average product of labor in the commons is equal to the marginal product of labor on elite estates. This constraint is embedded in the productivity coefficient
. The planner also respects the constraint that individual agents consume equal amounts of the manufactured good, as they would in market equilibrium, so welfare differences between elites and commoners are driven by differences in their levels of food consumption.
The term in (10.29) involving manufactured output is strictly concave when A10.2 holds from Section 10.4. Strict concavity guarantees a unique solution for L at each θ ≥ 0. We denote the solution in (10.29) by LS(θ) and the resulting manufactured output by MS(θ) ≡ M[LS(θ)]. In Lemma 10.8 below we will use θs to denote the commons productivity level separating boundary solutions LS(θ) = 0 from interior solutions LS(θ) > 0. There are no boundary solutions of the form L = N − e, so this case will be ignored.
Lemma 10.8
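Assume A10.2 and A10.3 hold. Define $\theta_s \ge 0$ (uniquely) by $\alpha\gamma(\theta_s) \equiv qr(N-e)^{1-\alpha}$ when such a value of $\theta$ exists.
(a) If $qr(N-e)^{1-\alpha} < \alpha$ then no such $\theta_s$ exists and $M^S(\theta) = 0$ for all $\theta \ge 0$.
(b) If $qr(N-e)^{1-\alpha} = \alpha$ then $\theta_s = 0$ and $M^S(\theta) = 0$ for all $\theta \ge 0$.
(c) If $qr(N-e)^{1-\alpha} > \alpha$ then $\theta_s > 0$. In this case:
(i) $M^S(\theta) > 0$ on the interval $0 \le \theta < \theta_s$ and $M^S(\theta) = 0$ for $\theta_s \le \theta$.
(ii) $dM^S(\theta)/d\theta < 0$ on the interval $0 \le \theta < \theta_s$ and $M^S(\theta)$ is continuous at $\theta_s$.
(iii) $M^0(\theta) < M^S(\theta)$ on the interval $0 \le \theta < \theta_s$.
(iv) If Lemma 10.2(b) applies then $0 \le \theta_e < \theta_0 < \theta_s$.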
The implications of Lemma 10.8 are shown in Figure 10.4. If θ ≥ θs so the planner chooses M = 0, we also have M = 0 in a zero-profit equilibrium and an elite taxation equilibrium. When the planner chooses M > 0, this output is larger than the zero-profit output, which may be zero or positive. If the zero-profit output is positive, then the zero-profit output is larger than the elite taxation output, which may be zero or positive. These relationships arise because a planner is not constrained to break even in the manufacturing sector. By comparison with zero-profit equilibrium, the elite wants to restrict output while the social planner wants to expand it.
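The planner's boundary in Lemma 10.8 can be located numerically. The sketch below is illustrative only: it takes the expression for γ(θ) as it is written in Lemma 10.10, assumes Z > 0 so that γ is increasing and unbounded in θ, and ignores the upper bound on θ from (10.10). The function names `gamma` and `planner_boundary` are hypothetical.

```python
import math

def gamma(theta, alpha, Z):
    """Productivity coefficient gamma(theta), using the form stated in Lemma 10.10."""
    zt = Z * theta ** (1.0 / (1.0 - alpha))
    beta = alpha ** (1.0 / (1.0 - alpha)) + zt
    return (alpha ** (alpha / (1.0 - alpha)) + zt) / beta ** alpha

def planner_boundary(qr, N, e, alpha, Z, tol=1e-12):
    """Solve alpha * gamma(theta_s) = qr * (N - e)**(1 - alpha) for theta_s.

    Returns None in case (a) of Lemma 10.8, 0.0 in case (b), and the
    positive root in case (c). Assumes Z > 0 (an assumption of this sketch).
    """
    target = qr * (N - e) ** (1.0 - alpha)
    if math.isclose(target, alpha):
        return 0.0            # case (b): gamma(0) = 1, so theta_s = 0
    if target < alpha:
        return None           # case (a): alpha * gamma(theta) >= alpha > target

    def gap(theta):
        return alpha * gamma(theta, alpha, Z) - target

    lo, hi = 0.0, 1.0
    while gap(hi) < 0.0:      # expand the bracket until the root is enclosed
        hi *= 2.0
    while hi - lo > tol:      # plain bisection on the increasing function gap
        mid = 0.5 * (lo + hi)
        if gap(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Illustrative parameter values (not taken from the chapter).
theta_s = planner_boundary(qr=0.1, N=101.0, e=1.0, alpha=0.5, Z=1.0)
```

Bisection is used because the left-hand side is monotone in θ under these assumptions, so a bracketed root is unique.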
Lemma 10.8 leads to the following global welfare results.
Proposition 10.5
(global welfare effects).
Assume A10.2 and A10.3 hold. Suppose
where
is the upper bound defined in (10.10). Compare any agricultural equilibrium with
against any manufacturing equilibrium with
. Relative to the initial agricultural equilibrium, the commoners are worse off in the manufacturing equilibrium while the elite are better off. Total utility is lower in the manufacturing equilibrium.
A quick sketch of the proof runs as follows. First notice that the planner always uses positive labor for agriculture. This implies that the total utility of the planner is an increasing function of θ because when θ increases, the planner always has the option of maintaining the previous labor allocation and obtaining higher total utility. Whenever
, the social planner and the elite choose identical allocations with
and
. Thus total utility for the social planner is the same as in the initial agricultural equilibrium favored by the elite. For
, the social planner has lower total utility than for
. Furthermore, the elite’s allocation for
cannot yield a higher total utility than the planner would obtain at the same θ. Hence total utility in the manufacturing equilibrium is lower than in the agricultural equilibrium. From Lemma 10.5 the elite are better off in the manufacturing equilibrium. Because total utility decreases, the commoners must be worse off in the manufacturing equilibrium. It can be shown that there are parameter values for which the conditions of Proposition 10.5 hold.
10.6 Population
The model from Sections 10.2–10.5 has an important limitation. Although we considered migration from rural to urban areas, we held the total regional population N constant. In reality the Uruk transition unfolded over centuries, and population would probably not have stayed constant through a large climate shift and an ensuing economic upheaval. This gap can be filled using Malthusian population dynamics. We will show that a combination of Malthusian dynamics and learning by doing in the manufacturing sector yields results consistent with our claim that climate change was the trigger for city-state formation. The next several paragraphs provide a summary for readers who want to skip most of the math. Afterward we will present the technical details.
The population model runs as follows. Time is discrete and a period is the length of a human generation (about twenty years). An individual adult who is alive in period t engages in economic activities and generates utility $u_t$. This adult has $n_{t+1} = \rho u_t$ surviving adult children in period t+1, where the period-t adults die at the start of period t+1. The coefficient ρ > 0 captures the idea that a higher level of parental utility increases fertility and decreases child mortality. This parallels our framework in Chapters 6 and 7 except that we include utility from both manufactured goods and food. For example, children are more likely to survive if they receive warm clothing as well as nutritious bread. The demographic parameter ρ is identical for all agents and stays constant over time. Events occurring within a single period constitute the short run while events spanning multiple periods constitute the long run.
Aggregating utility across agents (both elite and commoner) generates a time path for aggregate population. For a given level of productivity θ in the commons, we define long-run equilibrium (LRE) as a population N and a level of manufacturing labor LE such that (a) LE is optimal for the elite given (N, θ), and (b) the total utility U(LE, N, θ) keeps the aggregate population N constant over time. We compare LREs for alternative values of θ and ignore the path along which Nt approaches a new equilibrium when θ changes. This is a reasonable simplification for processes operating on a time scale of centuries.
In an LRE associated with a given productivity θ, elites and commoners will have different incomes and different utilities. As a result, they will have different numbers of surviving children. To keep the populations of the two classes stationary, there must be some downward mobility from the elite to the commoner class, perhaps based upon birth order (see Section 6.10).
The main results are as follows. Suppose we start with an initial LRE with a high commons productivity θ and (due to Malthus) a correspondingly high regional population N, but no manufacturing. We know that in the short run with N held constant, a negative climate shock can reduce θ and trigger LE > 0 so that manufacturing becomes active. But in the long run lower commons productivity tends to reduce population and the supply of labor, which could return the system to a purely agricultural economy.
However, a shock giving LE > 0 in the short run is likely to raise the productivity of manufacturing in the long run, using our standard arguments about learning by doing for a previously unexploited technology or resource (see Chapters 3–5). If the increase in manufacturing productivity is large enough, this can rule out a new long-run equilibrium based on agriculture alone. Instead the economy shifts permanently to an equilibrium in which some commoner labor is devoted to urban manufacturing.
We now proceed to the mathematics. Aggregating utility across agents (both elite and commoner) generates a time path for aggregate population according to
where L(Nt, θt) is manufacturing labor in period t when the regional population is Nt and commons productivity is θt. Because all agents have the same ρ, aggregate population in period t+1 depends only on aggregate utility in period t. As in (10.29) the utility function is
The first term in (10.31) is the total output of food and the second is the total utility derived from manufactured goods. We use the notation U to indicate the aggregate direct utility function based on quantities of goods, as contrasted with the notation V used in Section 10.5 for the aggregate indirect utility function based on prices.
Definition 10.5 Fix the parameters (α, Z, e, q, r, ρ). Let LE(N, θ) be the labor optimally allocated to manufacturing by the elite from D10.4. The pair (L, N) is a long-run equilibrium (LRE) associated with the commons productivity θ if
(a) L = LE(N, θ)
(b) N = ρU(L, N, θ)
Condition (a) says that L is a short-run equilibrium (SRE) at the prevailing population and productivity, in the sense that this is the labor allocation the elite would choose in such circumstances. Condition (b) says that N remains constant over time, given elite optimization and the prevailing level of commons productivity.
First we study a purely agricultural economy where we impose the constraint L ≡ 0. We will show that there is a unique stable stationary population N for each commons productivity level θ ≥ 0 if and only if the size of the elite (e) is small enough in relation to other parameters. An upper bound on the elite size is required because the elite produces no food but must be replaced demographically in every period. Stability requires that the regional population rise (fall) when N is slightly below (above) the equilibrium level.
Figure 10.5 illustrates these issues. With the constraint $L \equiv 0$ the total utility from (10.31) is $U(0, N, \theta) \equiv \gamma(\theta)(N-e)^{\alpha}$. For N to be stationary, the utility function must intersect the ray N/ρ. As shown in the graph this typically occurs at two points (the possibility of a tangency point is ignored because the associated population would be unstable). Such intersection points can only occur when the elite is small enough.
Lemma 10.9
In an agricultural economy with $L \equiv 0$, there is a (unique) stable stationary population iff $e < (1-\alpha)\,\alpha^{\alpha/(1-\alpha)}(\rho\gamma)^{1/(1-\alpha)}$. This population is increasing in $\gamma$.
When the inequality in Lemma 10.9 holds, the intersection point from Figure 10.5 with the lower population is an unstable steady state and the one with the higher population is a stable steady state (directions of population change are shown by the arrows). We will ignore the former throughout and focus on the latter.
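The two steady states and their stability can be illustrated by iterating the population map for the agricultural economy. The sketch below assumes the one-period map N(t+1) = ργ(N(t) − e)^α, obtained by combining condition (b) of D10.5 with U(0, N, θ) = γ(θ)(N − e)^α. The function names and the parameter values are hypothetical and chosen only so that the arithmetic is easy to follow.

```python
def critical_elite_size(alpha, rho, gamma):
    """Upper bound on the elite size e from Lemma 10.9."""
    return (1 - alpha) * alpha ** (alpha / (1 - alpha)) * (rho * gamma) ** (1 / (1 - alpha))

def next_population(N, e, alpha, rho, gamma):
    """One generation of the agricultural map with L = 0."""
    return rho * gamma * (N - e) ** alpha

def simulate(N0, e, alpha, rho, gamma, periods=200):
    """Iterate the map; return None if the population falls to the elite size or below."""
    N = N0
    for _ in range(periods):
        if N <= e:
            return None  # no commoners are left to produce food
        N = next_population(N, e, alpha, rho, gamma)
    return N

# Illustrative values: alpha = 0.5 and rho * gamma = 10 with e = 9.
# Stationarity N = 10 * sqrt(N - 9) gives N**2 - 100 * N + 900 = 0,
# so the steady states are N = 10 (unstable) and N = 90 (stable).
alpha, rho, gam, e = 0.5, 10.0, 1.0, 9.0
bound = critical_elite_size(alpha, rho, gam)    # elite must be smaller than this
high_path = simulate(50.0, e, alpha, rho, gam)  # starts above the unstable state
low_path = simulate(9.5, e, alpha, rho, gam)    # starts below the unstable state
```

A path that starts above the lower intersection converges to the higher one, while a path that starts below it collapses, which is the pattern indicated by the arrows in Figure 10.5.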
Recall that θ = 0 implies γ = 1. In this case there is no commons and all labor is used on elite agricultural estates. If the condition in Lemma 10.9 holds for this case, as will be true when the elite size e is small enough relative to the demographic parameter ρ, it holds automatically for all θ ≥ 0.
Now consider an arbitrary regional population N0 > e. We want to characterize the parameter values that would support N0 as a stable LRE with agriculture alone (LE = 0). N0 must be a steady state population as in D10.5, which implies
This only determines the product $(\rho\gamma)_0$, not the individual factors ρ and γ. Equation (10.33) does not guarantee that the resulting steady state is stable. For this to occur we need the expression $\gamma(N-e)^{\alpha} - N/\rho$ to be decreasing at $N_0$, which is true for the parameters $(\rho\gamma)_0$ in (10.33) if and only if
When (10.33) and (10.34) both hold, the parameter values (ργ)0 from (10.33) satisfy the condition in Lemma 10.9. The restriction in (10.34) is similar to A10.3 where we assumed 2e ≤ N to ensure that the elite wanted positive taxation. We are not concerned with this question here. But we do observe that if α ≥ 1/2 then (10.34) implies A10.3. If α < 1/2, then A10.3 implies (10.34).
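The algebra behind the steady-state and stability conditions can be sketched as follows. This is a reconstruction from condition (b) of D10.5 with L = 0 and $U(0, N, \theta) = \gamma(N-e)^{\alpha}$; the arrangement of terms is an assumption and may differ from the way (10.33) and (10.34) are displayed.

```latex
\begin{aligned}
\text{Steady state:}\quad
  & N_0 = \rho\gamma\,(N_0 - e)^{\alpha}
  \;\Longrightarrow\;
  (\rho\gamma)_0 = \frac{N_0}{(N_0 - e)^{\alpha}} . \\[4pt]
\text{Stability:}\quad
  & \frac{d}{dN}\Big[\gamma (N - e)^{\alpha} - N/\rho\Big]_{N = N_0} < 0
  \;\Longleftrightarrow\;
  \alpha(\rho\gamma)_0 < (N_0 - e)^{1-\alpha} \\
  & \Longleftrightarrow\;
  \alpha N_0 < N_0 - e
  \;\Longleftrightarrow\;
  e < (1 - \alpha)\,N_0 .
\end{aligned}
```

Under this form the comparison with A10.3 is immediate. If α ≥ 1/2 then (1 − α)N₀ ≤ N₀/2, so the stability condition implies 2e ≤ N₀. If α < 1/2 then N₀/2 < (1 − α)N₀, so 2e ≤ N₀ implies the stability condition.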
Next we want to identify the productivity levels γ for which a purely agricultural steady state subject to the constraint L ≡ 0 is also a full LRE with elite optimization as in D10.5. This requires that LE = 0 be optimal. Conversely, we want to know when this is not true, so any LRE with elite optimization must involve positive manufacturing. Lemma 10.10 addresses this issue using the results on elite optimality from Lemma 10.2.
Lemma 10.10
Consider some steady state population N0 supported by (ργ)0 with L = 0 as in (10.33), where N0 satisfies the stability condition in (10.34). The optimal elite labor allocation LE has the following features.
(a) Suppose $(qr)_0 \le \alpha(2-\alpha)/(N_0 - e)^{1-\alpha}$. This implies $L^E = 0$ for all $\theta \in [0, \theta_{\max})$ or equivalently all $\gamma \in [1, \gamma_{\max})$.
(b) Suppose $(qr)_0 > \alpha(2-\alpha)/(N_0 - e)^{1-\alpha}$. There is a unique $\theta_e > 0$ implicitly defined by

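where $\beta_e \equiv \beta(\theta_e) \equiv \alpha^{1/(1-\alpha)} + Z\theta_e^{1/(1-\alpha)}$
such that
(i) $L^E > 0$ for $\theta \in [0, \theta_e)$ or equivalently $\gamma \in [1, \gamma_e)$
(ii) $L^E = 0$ for $\theta \in [\theta_e, \theta_{\max})$ or equivalently $\gamma \in [\gamma_e, \gamma_{\max})$
where $\gamma_e \equiv \gamma(\theta_e) \equiv [\alpha^{\alpha/(1-\alpha)} + Z\theta_e^{1/(1-\alpha)}]/\beta(\theta_e)^{\alpha}$.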
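The functions β and γ and the threshold separating parts (a) and (b) of Lemma 10.10 are simple to evaluate. The sketch below codes them as they are written in the lemma and checks that γ(0) = 1. The function names and numerical values are hypothetical, and the implicit equation for θe is not implemented here.

```python
def beta(theta, alpha, Z):
    """beta(theta) = alpha**(1/(1-alpha)) + Z * theta**(1/(1-alpha))."""
    return alpha ** (1 / (1 - alpha)) + Z * theta ** (1 / (1 - alpha))

def gamma(theta, alpha, Z):
    """gamma(theta) = [alpha**(alpha/(1-alpha)) + Z * theta**(1/(1-alpha))] / beta(theta)**alpha."""
    numerator = alpha ** (alpha / (1 - alpha)) + Z * theta ** (1 / (1 - alpha))
    return numerator / beta(theta, alpha, Z) ** alpha

def lemma_10_10_case(qr0, N0, e, alpha):
    """Return 'a' if manufacturing is never chosen, 'b' if a boundary theta_e > 0 exists."""
    threshold = alpha * (2 - alpha) / (N0 - e) ** (1 - alpha)
    return "a" if qr0 <= threshold else "b"

# Illustrative values: alpha = 0.5, Z = 1, N0 = 90, e = 9.
# The threshold is 0.5 * 1.5 / sqrt(81) = 0.75 / 9.
alpha, Z, N0, e = 0.5, 1.0, 90.0, 9.0
gamma_at_zero = gamma(0.0, alpha, Z)             # no commons: all labor on elite estates
low_demand = lemma_10_10_case(0.05, N0, e, alpha)
high_demand = lemma_10_10_case(0.20, N0, e, alpha)
```

The threshold falls as N0 − e grows, so a larger inherited commoner population makes part (b) more likely for a given value of (qr)0.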
We will use Lemmas 10.9 and 10.10 to construct a sequence of events consistent with a long-run transition from a purely agricultural economy to one with manufacturing. Our argument has the following steps.
(a) Start with a purely agricultural LRE where productivity is θ0 and population is N0. Let this population be supported by (ργ)0 and let the initial value for (qr)0 satisfy part (b) of Lemma 10.10. Suppose case (b)(ii) applies, so 0 < θe ≤ θ0. This ensures that it is initially optimal for the elite to choose LE = 0.
(b) Consider a decline in commons productivity to θ′ ∈ [0, θe) with N0 held constant. From part (b)(i) of Lemma 10.10, in the short run the productivity θ′ leads to LE > 0.
(c) At the new productivity θ′ consider a purely agricultural steady state population N′ with L ≡ 0. Suppose LE > 0 at step (b) yields r′ > r0 via learning by doing. In addition suppose that (qr)′ leads to case (b)(i) of Lemma 10.10 for θ′ and N′. This implies that there is no LRE associated with θ′ such that LE = 0.
(d) Suppose in the scenario from step (c) that we do have an LRE with θ′ and (qr)′ such that LE > 0. If so, the productivity drop from θ0 to θ′ triggers a transition from an LRE having only agriculture to an LRE with positive manufacturing.
Proposition 10.6 addresses each step in the preceding argument. The proof, which is included with the proofs for the other formal propositions in the chapter, constructs a set of parameter values for which the claims in the proposition are true. The point is to show that there are conditions under which a long-run transition to urban manufacturing would still occur despite the inclusion of Malthusian population dynamics in the model.
One major simplification is to consider the most severe possible climate shock, where productivity in the commons drops to θ′ = 0 or equivalently γ′ = 1. This implies that after the shock hits, agricultural labor is used only on elite estates. A less extreme shock can generate qualitatively similar results.
Proposition 10.6
(long-run transition to manufacturing).
Assume climate change reduces the commons productivity from θ0 > 0 to θ′ = 0, or equivalently from γ0 > 1 to γ′ = 1. Further assume that learning by doing in response to LE > 0 increases manufacturing productivity from r0 to r′. The parameters (α, Z, e, q, ρ) are unaffected by climate change.
Fix (α, Z) arbitrarily. Conditions (a)–(d) below are satisfied for suitable choices of the remaining parameters (e, q, ρ, θ0, r0, r′).
10.7 Explanations
This section summarizes the ways in which the theory from Sections 10.2–10.6 accounts for the facts in Chapter 9. We also identify some areas where the model makes predictions that are open to future empirical testing. It may not be entirely surprising that our model can explain certain facts about southern Mesopotamia. After all, we had these facts in mind when we constructed the model. Nonetheless it is reassuring to know that the theory is consistent with available archaeological evidence. We discuss the potential generality of our framework in relation to other regions of the world in Chapter 11.
Recall from Chapter 9 that the early Uruk period is poorly documented in general, and the city of Uruk emerges into the archaeological daylight around 5200 BP largely in its full form. Therefore, theoretical speculation on the origins of southern Mesopotamian city-states is not tightly constrained by empirical observations. This is a problem for us, but it is likewise a problem for other scholars seeking to explain the transition.
The model in Section 10.2 is meant to capture some central features of the ’Ubaid period. Most importantly we want an equilibrium in which some agents are members of the elite and the rest are commoners. For this purpose we assumed that an organized elite with a high enough density of agents per unit of land can prevent outsiders from entering a site. By controlling access to especially productive areas, such an elite can collect land rent by hiring commoners at a wage (or alternatively by renting land to commoners).
In equilibrium the commoners are indifferent between working for the elite and working on open-access land of lower quality. When the stratification constraints from Section 10.2 are satisfied the elite are better off than the commoners, and the density of the commoner population at open-access sites is too low to allow the formation of elites there. Thus rather than assuming the existence of elite and commoner classes a priori, we derive stratification as an equilibrium outcome even though all agents are identical.
Due to the relatively favorable climate of the ’Ubaid, the region as a whole would have had a relatively high population for the Malthusian reasons in Section 10.6. As long as commoners enjoyed access to areas beyond elite control with good productivity when rainfall was abundant (whether food was obtained by rain-fed farming, pastoralism, or foraging in wetlands), the standard of living for commoners could have been close to that of the elite. This is compatible with archaeological evidence suggesting that during the ’Ubaid period, inequality was relatively mild.
In Section 10.3 we extended the model to allow for manufacturing in urban areas controlled by elites. As long as climate remained favorable at locations outside southern Mesopotamia and the southern wetlands remained productive, the southern elite did not establish urban workshops because the wage required to attract commoner labor was too high. But archaeologists agree that the entire region was becoming more arid as the Uruk period began. This triggered migration from outlying areas into the southern alluvium, and from both outlying areas and the southern wetlands into elite-controlled areas where food production was less vulnerable to aridity due to irrigation opportunities. Our model is consistent with the views of most archaeologists about these migration patterns.
Section 10.3 showed that a sufficiently large climate shift could induce the elite to engage in urban manufacturing. Other things equal, a larger population inherited from the ’Ubaid period would have strengthened the incentive for manufacturing in the Uruk period. Our framework is consistent with the observation that a city-state had multiple workshops (not one big factory), although manufacturing had increasing returns to scale at the level of the city as a whole. The incentive to create workshops would have been strongest for industries where latent productivity (our parameter r) and latent consumer demand (our parameter q) were the highest. Textiles would seem to fit this description although pottery, metalwork, and other trades were also significant. Our assumption of increasing returns at the level of the city is consistent with the size distribution of early Mesopotamian cities, where Uruk was much larger than its rivals.
The model in Section 10.3 did not include taxation and the scale of manufacturing there was determined through a zero-profit condition. We assumed a competitive market for manufactured goods throughout the region, including those sites not under the control of the southern elites. This is consistent with the emphasis by Algaze on regional trade in manufactured goods during the Uruk period, as discussed in Chapter 9.
To explain the rise of city-states, in Section 10.4 we allowed the elite to levy taxes on manufacturing. There is a broad archaeological consensus that state institutions arose by 5200 BP, and coincided with increased urbanization. Because we define a state as an organized elite with the power to tax, we need to explain why taxation and urbanization arose together. Our hypothesis is that southern elites initially benefited from agriculture only through land rent, because food production was hard to tax. But urban workshops were highly visible and easy to tax, so taxation and urbanization went hand in hand.
As discussed in Chapter 9, several lines of evidence suggest taxation in the late Uruk period, including four-tier settlement hierarchies, monumental architecture, and a few vague written records. We regard the growing inequality from 5200 BP onward as further evidence that taxes were being collected. But there is no direct archaeological evidence on the subject so we are free to speculate about the form these taxes may have taken. One possibility is that they were analogous to modern religious tithing. The gods lived in temples in the cities, and protected against bad outcomes in the present as well as meting out rewards and punishments in the future. Perhaps tithing was a social norm within the elite, with both divine punishment and social shunning directed toward the members of the elite who did not pay.
In our theoretical argument we assumed taxation had no administrative cost, and identified conditions under which the elite would want to tax manufacturing. If the elite does not find taxation profitable with zero administrative cost, then clearly this is also true when administrative costs are positive. On the other hand, if taxation is profitable with a zero administrative cost, it will also be profitable at a sufficiently small positive cost.
One key theoretical result is that even when it is costless to collect taxes, the elite may tax manufacturing at a prohibitive level or simply ban it. The reason is that the elite faces a tradeoff between land rent and tax revenue from manufacturing. Shifting some labor into manufacturing may have too high an opportunity cost in foregone land rent.
Another theoretical point is that states can arise purely for reasons of private elite consumption. In our model, elite agents only care about food and manufactured goods. We could have used a more complex utility function where elite agents also care about public goods, and most economists who study early states assume that public goods must somehow play a role. However, we deliberately omitted public goods from our definition of the state in order to show that this is not a necessary condition. Empirically speaking, early states generally did allocate some tax revenue to public goods as well as the support of specialized personnel (including those who collected the taxes). Accordingly, it makes sense to use monumental architecture and administrative bureaucracy as markers for the existence of a state. But this does not alter our conclusion that state formation could be motivated primarily by a desire for greater private consumption on the part of the elite.
In Section 10.5 we addressed the welfare effects of city-state formation. Although data on inequality or commoner standards of living are unavailable for most of the Uruk period, by the end of this period inequality had clearly increased substantially relative to the ’Ubaid. Inequality increased further in the Jemdet Nasr and Early Dynastic periods. Our results in Section 10.5 predict that elites always become better off in the process of city-state formation, and that commoners always become worse off in the early stages of this process. We were unable to show that the latter trend continues indefinitely. But we did show that if the climate is favorable enough before urbanization, the commoners must be worse off after city-states have formed.
The theoretical results from Sections 10.2–10.5 treated the regional population as fixed. This is not an ideal assumption for studying a process that unfolded over several centuries, where population probably evolved in response to changing standards of living through Malthusian dynamics. For this reason we extended the theory in Section 10.6 to consider city-state formation when population adjusts according to a simple Malthusian model. We showed that for certain parameter values our earlier conclusion still stands: increasing aridity can lead to urban manufacturing and taxation.
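The logic of Malthusian adjustment can be illustrated with a minimal numerical sketch. It is not the specification used in Section 10.6. It assumes a single food sector where food per person is A·N^(alpha−1), a subsistence threshold at which population is stationary, and a proportional adjustment speed. All of these names and functional forms are hypothetical and chosen only for illustration. The sketch shows the basic long-run tendency: a permanent decline in productivity lowers the steady-state population.

```python
# Illustrative sketch only: the functional forms and parameters are assumptions,
# not the model of Section 10.6.

def steady_state_population(productivity, subsistence=1.0, alpha=0.5):
    """Population at which food per person equals the subsistence threshold."""
    return (productivity / subsistence) ** (1.0 / (1.0 - alpha))

def simulate(productivity, n0, periods=500, speed=0.1, subsistence=1.0, alpha=0.5):
    """Population grows when food per person exceeds subsistence and shrinks otherwise."""
    n = n0
    path = [n]
    for _ in range(periods):
        food_per_person = productivity * n ** (alpha - 1.0)
        n = n * (1.0 + speed * (food_per_person - subsistence) / subsistence)
        path.append(n)
    return path

# Start at the steady state for productivity 10, then let the climate worsen to 8.
initial = steady_state_population(10.0)
path = simulate(8.0, n0=initial)
print(initial, path[-1], steady_state_population(8.0))
```

In this sketch the population falls monotonically from its old steady state toward the new, lower one. Offsetting forces such as migration between sectors or learning by doing are deliberately left out.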
The role of wetlands in southern Mesopotamia has been a controversial question among archaeologists, and it may be useful to elaborate on how this question is handled in our formal modeling. The model aggregates a heterogeneous array of sites involving rainfall agriculture, pastoralism, and wetlands foraging, and labels them collectively as “the commons” (see Section 10.2). These sites share the feature that they are not under elite control, in contrast to the elite lands in the south where irrigation is feasible.
In our theory labor mobility maintains approximate equality of living standards throughout the commons, despite the heterogeneity of natural resources and production techniques used at the individual sites. If food per person differed substantially among these sites, labor would move from sites where food per capita was low to sites where it was high, which would tend to erase any such differences. We are assuming that such migration was possible on a decadal time scale, within one human generation.
Because we aggregate sites involving rainfall farming with those involving the wetlands, our model does not explicitly address flows of population between such sites. We do rely implicitly on migration within the commons to maintain a uniform standard of living across sites, as described above. But none of our formal propositions address migration from areas of rainfall farming to areas with wetlands because in our modeling these population flows are all internal to the commons. Our model is designed instead to examine how climate deterioration leads to a flow of labor out of the commons, regarded as an aggregate, into elite-controlled agricultural lands and ultimately (assuming various conditions are met) into the urban manufacturing sector.
Archaeologists have differing views about the timing and extent of climate effects on southern wetlands. Take Nissen as an example (see Chapter 9). In his view growing aridity was already causing the wetland areas to dry out during the Uruk period. In fact, he believes this led to a major expansion in the availability of dry land, which could have been associated with urbanization. This scenario is consistent with our model. If rainfall farming, pastoralism, and wetlands foraging all suffered from a more or less simultaneous productivity decline due to greater aridity, it is reasonable to combine these diverse sites into a single sector called the commons and use a single productivity parameter for all of them. In this scenario we would not expect large migrations among the individual sites in the commons, because productivity would decrease in a parallel way at all of these sites. However, labor flows from the commons into elite-controlled agriculture and ultimately manufacturing would still be important.
Now suppose instead we adopt the view of Pournelle, who believes the wetlands remained highly productive as late as the Early Dynastic (see Chapter 9). In this scenario the urbanization in the Uruk period could not be explained by the drying of the wetlands many centuries later. Moreover, it would be reasonable to think that the initial impact of aridity would be to cause migration from areas reliant on rainfall farming to the southern wetlands, which would function as a refuge. If the wetlands had an infinite capacity to absorb climate refugees, it would be hard to use increasing aridity as an explanation for the Uruk urbanization process.
These issues can be easily addressed. Suppose we disaggregate the commons into rainfall farming sites and wetland sites. Further suppose the early stages of aridity reduce productivity at the former sites but not the latter. As long as the resources of the wetlands are finite, the wetlands will have diminishing returns to labor for standard reasons, even if these sites are not directly affected by aridity themselves. The flow of labor from rainfall farming to wetland foraging will therefore reduce food per person at the wetlands sites in a way that maintains an equal standard of living with those areas that are directly affected by the lower rainfall. As before a reduced standard of living at both types of open-access site will imply that the elite now pays less for labor, and will cause an expansion of elite-controlled agriculture. If the decline in the commoner wage is large enough, we will get urban manufacturing. These tendencies are simply reinforced when intensifying aridity finally starts to affect the wetlands directly, perhaps during the Early Dynastic period.
In the Pournelle scenario the wetlands do provide some cushion against aridity. The commoner standard of living falls by less than if the wetlands were directly affected by aridity from the outset. The wetlands also restrain the tendency for elite agriculture to expand in response to aridity and postpone the beginning of urbanization. But as long as the wetlands have a limited physical scale and display diminishing returns as commoners flow in, the central mechanisms of our model still operate. To be sure, if the migration of rainfall farmers to the wetlands is an empirically important phenomenon, one might want to disaggregate the commons to study this process in detail. However, for our purposes a delayed or absent effect of climate deterioration on the productivity of the wetlands only has a quantitative effect on the amount of aridity needed to trigger urbanization and state formation. As long as there are enough climate refugees fleeing from rainfall-dependent farming or pastoralism, our qualitative conclusions remain unchanged.
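A small numerical sketch may help make the cushioning argument concrete. It is not part of our formal model. It assumes two open-access site types with output A·L^alpha (so that food per person is the average product A·L^(alpha−1)), a fixed total commons population, and free mobility that equalizes food per person. The function name, the parameter values, and the Cobb-Douglas form are all hypothetical. Holding the commons population fixed also ignores the outflow of labor to elite lands, which the full argument relies on.

```python
# Illustrative sketch only: functional form and numbers are assumptions.

def commons_allocation(a_rain, a_wet, total_labor, alpha=0.5):
    """Split labor so food per person is equal at rainfall and wetland sites."""
    # Equal average products imply L_rain / L_wet = (a_rain / a_wet) ** (1 / (1 - alpha)).
    k = (a_rain / a_wet) ** (1.0 / (1.0 - alpha))
    labor_wet = total_labor / (1.0 + k)
    labor_rain = total_labor - labor_wet
    food_per_person = a_wet * labor_wet ** (alpha - 1.0)
    return labor_rain, labor_wet, food_per_person

before = commons_allocation(1.0, 1.0, 100.0)      # no aridity
rain_only = commons_allocation(0.5, 1.0, 100.0)   # aridity hits rainfall sites only
both_hit = commons_allocation(0.5, 0.5, 100.0)    # aridity hits both site types

for label, result in [("before", before), ("rain only", rain_only), ("both", both_hit)]:
    print(label, result)
```

Under these assumptions, labor moves toward the wetlands when only the rainfall sites are hit, and food per person falls at both site types, but by less than when both are hit directly. This is the sense in which the wetlands act as a cushion without overturning the qualitative mechanism.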
One issue we did not address involves changing land endowments. As mentioned above, Nissen and others believe that wetlands were vulnerable to increasing aridity over time, that these effects were important during the Uruk period, and that the drying out of the wetlands made more land available for elite agriculture and urbanization. While the irrigated land controlled by the elite was also negatively affected by regional aridity, the elite invested in an expansion of irrigation systems after the Uruk period to deal with this problem. Our model simplifies this process by holding productivity constant for the elite lands, without incorporating either increased land or new irrigation investments.
The conversion of some local land from wet to dry would increase the elite land endowment at site U, assuming the newly dry land could be irrigated at reasonable cost. This would increase the demand for farm labor on elite estates. On the other hand, the retreat of the wetlands would decrease the “demand” for labor in the commons. It is not obvious whether the net result is upward or downward pressure on wages. If the former, the transition to manufacturing would have been delayed, but if the latter, the transition would have been accelerated.
Dry land might also have been an important input to urban activities as Nissen suggests. These could include manufacturing itself or closely related activities such as the provision of commoner residences. For these reasons one might think that more dry land would stimulate manufacturing. A possible counterargument is that agriculture was more land-intensive than manufacturing so effects involving agriculture would have been stronger. In any case this is an interesting line of inquiry for archaeological research.
Our model generates a variety of predictions that could be subjected to future empirical testing. One example involves migration. Among sites outside elite control, those with the greatest reductions in rainfall or the most vulnerability to drought should have seen the largest migratory outflows. On the receiving end, those sites where food production was least vulnerable to drought should have seen the largest inflows.
Another example involves inequality. Assuming that climate conditions changed gradually, our model predicts rising inequality at stratified sites during the ’Ubaid as the climate began to deteriorate, with standards of living gradually improving for the elites in southern Mesopotamia and falling for commoners throughout the region. This tendency should be visible prior to the emergence of cities and it should continue at least through the early stages of urbanization and state formation.
A third example involves regional population. Our long-run model in Section 10.6 predicts a region-wide decline in population due to deteriorating climate in outlying areas dependent on rainfall. This may appear to conflict with evidence for population growth and increasing numbers of settlements during the ’Ubaid and early Uruk periods. But to the extent that such evidence involves elite-controlled sites not vulnerable to drought, it can be explained by migratory effects and need not conflict with a Malthusian decline in overall population when outlying areas are also taken into account.
During the Uruk period, technical innovations associated with learning by doing in the manufacturing sector could well have dominated climate effects, generating region-wide population growth and more urban agglomeration. Such productivity growth almost certainly occurred through experiments with the division of labor, supervisory practices, and record keeping (including written records). These innovations would have had two effects: (a) a scale effect involving Malthusian population growth, and (b) a substitution effect intensifying the agglomeration of the population in cities.
Urban growth would probably have promoted investment in canals, warehouses, and other infrastructure, which would have stimulated additional learning, productivity growth, and population growth. Such developments could have been augmented by the selective migration of individuals with skills that were especially useful in the city, such as metallurgy. For these reasons a trajectory beginning with climate deterioration and a lower population could ultimately have led to better technology and a higher population.
10.8 Conclusion
Our argument in this chapter resembles the argument we used in Chapter 5 for the origins of agriculture. In Chapter 5 a negative climate shock induced short-run migration to a few refuge sites, creating local population spikes and triggering cultivation. We also showed that if the productivity of cultivation had remained constant, long-run Malthusian adjustments resulting from the climate shock would have reduced population, causing the system to return to universal foraging. In reality, learning by doing kept cultivation going and restrained the decline in population that would otherwise have occurred. Along with the favorable Holocene climate, these technical innovations led to substantial population growth and the eventual spread of agriculture across the region.
In this chapter a negative climate shock stimulated short-run migration from the commons to elite lands, which functioned as the equivalent of refuge sites. At the same time the commoner wage fell, triggering urban manufacturing. But if the productivity of manufacturing had stayed constant, long-run Malthusian dynamics could have reduced the commoner population, pushed the wage back up, and restored universal agriculture. This was averted through learning by doing in manufacturing, which raised productivity and kept manufacturing going despite these Malthusian tendencies. The outcome was a trajectory of continued urbanism, permanent manufacturing, regional population growth, and the augmentation of elite land rents with income from urban tax revenue.
The use of taxes for cartel enforcement (in our model, driving down commoner wages and driving up the price of manufactured goods) is the simplest economic rationale for the rise of a state because it does not involve public goods. In principle this rationale might also apply to an agrarian state where food is the only consumption good, although we tend to think that the administrative costs of taxes might be too high in this case. But the point is that the emergence of archaic states can potentially be explained in a model that includes only private consumption. Of course elites may also have valued certain public goods and used tax revenue to finance them, but this factor is not essential.
The cartel enforcement story only works when the elite can collectively exercise significant monopoly power over prices in the labor market, the product market, or both. This is less likely in a setting with many competing city-states. But increasing returns at the level of the city imply that in the early stages of urbanization, a single dominant city is likely to emerge, and in fact this was true for southern Mesopotamia. The distribution of city sizes is not only evidence for the presence of increasing returns, but is also consistent with the exercise of market power by the elite at Uruk.
After a number of city-states had evolved, the market power of Uruk would have diminished, though it could still have been substantial if Uruk had transportation or other advantages. As competing cities developed, over time the tax system was probably used more to accommodate elite demand for public goods such as temples, palaces, walls, and armies, and less to enforce monopolistic restrictions. In the end the rival city-states were incorporated into a unified regional empire, curtailing the ability of local elites to exploit whatever market power they still possessed.
10.9 Postscript
The first draft of the material for this chapter was completed in the summer of 2018. While we were developing the formal model in this chapter, we were generally aware of the archaeological literature from Chapter 9 and we constrained our modeling efforts to maintain consistency with the facts described there. For acknowledgments of advice and financial support, see Section 11.15.
11.1 Introduction
Chapter 9 presented a case study of pristine city-state formation in southern Mesopotamia, and Chapter 10 provided a formal model that captured key features of that case. Here we take a more panoramic view of urbanization and state formation. We want to ask broader questions: What do we mean by “cities” and “states”? Do these institutions usually develop together, or can we have one without the other? What theories of city and state formation have archaeologists and economists proposed? Is there any evidence that speaks to the merits of the competing theories?
We will undertake systematic regional comparisons to shed light on these matters. Of course it is impossible to consider all early cities or all early states. Instead we study six key regions: Mesopotamia, Egypt, the Indus Valley, China, Mesoamerica, and South America. These are the classic cases of pristine state formation.
Section 11.2 discusses an economic puzzle about the emergence of cities, as well as hypotheses archaeologists have advanced to explain urbanization. In Section 11.3 we review a number of definitions of the state from archaeologists, anthropologists, political scientists, and economists. In Section 11.4 we consider ideas about the origins of the state and survey the literature on the subject.
Section 11.5 introduces our agenda for regional comparisons. We then move to a set of examples that supplement our detailed study of Mesopotamia in Chapter 9. These cover Egypt (11.6), the Indus Valley (11.7), China (11.8), Mesoamerica (11.9), and South America (11.10). Section 11.11 distills several general patterns from these cases.
In Section 11.12 we present our theoretical framework for the evolution of cities and states. We suggest three pathways leading to a pristine state. The first is based on ideas about endogenous property rights from Chapter 6, the second is based on the theory of elite warfare in Chapter 8, and the third is based on exogenous environmental shifts of the kind studied in Chapter 10. In all three of these pathways, cities and states are closely linked. Section 11.13 applies our three hypotheses to the regional cases described earlier. Section 11.14 concludes the chapter and Part IV of the book. We add a short postscript in Section 11.15 to thank people and institutions for their support.
11.2 Urbanization
A highly influential discussion of ancient cities among archaeologists was Childe (1950), who constructed a list of ten traits such cities allegedly possessed. They included densely populated settlements; specialized workers in non-subsistence activities; monumental public buildings; stored surplus; stratification; writing; predictive sciences; representational art; trade over long distances; and political organization based on residence rather than kinship. For a review of Childe’s ideas, see Smith (2009). For a history of archaeological research on early cities, see Yoffee with Terrenato (2015). Modern archaeologists tend to define early cities using criteria such as the permanence of a settlement, dense population nucleation, and heterogeneity in social roles (Jennings and Earle, 2016, 475).
We use the term city informally to mean any large permanent agglomeration of people living in a compact area. We do not use specific numerical thresholds and we do not include economic specialization among residents as part of the definition, although as an empirical matter this is almost always observed. We use the term urbanization when referring to a process through which a subset of the settlements in a region becomes more “city-like.” This may occur either because for a fixed regional population the ratio of rural to urban population shifts toward the latter, or because regional population increases and some individual settlement sizes also increase, without any reduction in rural population.
A rough sense of the numbers and sizes of ancient cities can be gleaned from data compiled by Modelski (1999), who uses archaeological site measurements and a constant population density of 200 people per hectare to estimate population levels. At 5700 BP there were two cities in the world with an estimated population of 10,000 or more, which increased to 27 cities at 4300 BP before starting a gradual decline. At 5300 BP there was one city (Uruk) with an estimated population of 20,000 or more. The number in this size category remained in single digits until 3200 BP.
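The arithmetic behind estimates of this kind is simple enough to sketch. The density of 200 people per hectare is the figure stated above. The function names and the example site area are hypothetical and are not taken from Modelski's data.

```python
# Illustrative sketch: only the density of 200 people per hectare comes from the text.
PEOPLE_PER_HECTARE = 200

def estimate_population(area_hectares, density=PEOPLE_PER_HECTARE):
    """Estimated population = measured site area (ha) x assumed constant density."""
    if area_hectares < 0:
        raise ValueError("area must be non-negative")
    return area_hectares * density

def min_area_for_population(threshold, density=PEOPLE_PER_HECTARE):
    """Smallest site area (ha) that reaches a given population threshold."""
    return threshold / density

print(min_area_for_population(10_000))  # 50.0 hectares
print(min_area_for_population(20_000))  # 100.0 hectares
print(estimate_population(75))          # hypothetical 75 ha site -> 15000
```

Under a constant density, the population thresholds of 10,000 and 20,000 correspond to site areas of 50 and 100 hectares, so the counts of cities in each size category depend entirely on measured site areas and the assumed density.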
Many agricultural or pastoral societies lack cities, and it is not obvious why cities would ever emerge in such societies. Because food production is land-intensive, it makes sense for the population to spread out over the landscape in order to minimize travel time between residences and production locations. We would expect population density to be positively correlated with soil fertility, access to surface water, gentleness of terrain, and similar aspects of land quality, but without a large concentration of population in a single location. At most we might expect a few small villages for trading activities or religious ceremonies, and perhaps small elite settlements in places that are especially pleasant or administratively convenient. Aside from travel time in relation to production activities, early cities were likely centers for disease transmission with high mortality rates (Algaze, 2018, 26), providing another incentive for population dispersion.
Our reading of the archaeological literature reveals three main hypotheses about the process of urbanization (see Marcus and Sabloff, 2008, for an overview). First, there is the development of manufacturing for local or long-distance markets as in the example of southern Mesopotamia (see Chapter 9). Because manufacturing is less land-intensive than agriculture or pastoralism, it does not require that the workforce disperse across the landscape. Indeed, the opposite is true: productivity tends to be higher when workers are concentrated in a compact area because this facilitates the shipping of raw materials and finished goods, the pooling of skilled labor, supply chain coordination, and knowledge spillovers. For a review of economic benefits and costs of urban density in the modern world, see Duranton and Puga (2020).
Second, warfare or the threat of warfare can cause people to seek refuge in easily defended and highly fortified locations. There is a pervasive tradeoff between economic convenience, which in an agricultural context motivates the population to disperse, and a security imperative, which motivates the population to agglomerate. This tradeoff goes back to sedentary foraging groups and is also visible in small-scale agricultural societies (see Johnson and Earle, 2000, on the cases of the Yanomamo, the Tsembaga Maring, and the Central Enga). When a society already has relatively high population density for the familiar Malthusian reasons (favorable climate, geography, and technology), defensive motives may lead to agglomerations that most people would be willing to call cities. Gat (2006, 278–280) argues that virtually all early city-states arose from the need to defend against other nearby city-states, citing evidence that (a) most of the population of such cities consisted of peasants who walked to nearby fields, not craft workers engaged in non-agricultural pursuits, and that (b) such city-states have tended to arise in mutually antagonistic clusters, not in isolation.
Third, urbanization could be propelled by culture. For example, Ur (2012, 554) concludes that Bronze-Age Mesopotamian cities most likely grew for ideological reasons. This may have involved religion: the gods lived in temples and people lived in cities near temples in order to be close to the gods. Other cultural attractors to central places include entertainment, social life, and ethnic or linguistic enclaves. We will not have much to say about cultural explanations because they do not generally identify a clear triggering event or condition that sets the urbanization process in motion.
The manufacturing and warfare mechanisms for urbanization lead to differing predictions about the locations of cities. In the manufacturing story we would expect cities to be in places that are convenient for large-scale economic activities, especially those having cheap transportation, even if they are difficult to defend against attack. In the warfare story we would expect cities to be on hills, cliffs, or peninsulas, with large investments in walls, moats, and other defenses, even if such locations make economic activity more expensive. It should therefore be possible to distinguish between the two mechanisms through archaeological evidence.
11.3 Definitions of the State
The definition of the state is not a trivial matter. Different writers often adopt different definitions, either explicitly or implicitly, so when they attempt to explain the origins of the state they are explaining different things. In addition to presenting a clear definition, it is important to explain how the definition fits into a relevant body of theory and to describe the archaeological markers used to infer the existence of a state.
A common starting point in archaeology and anthropology is the classic typology of Service (1962) involving bands, tribes, chiefdoms, and states. This system has been adapted and modified by other authors (e.g., Earle, 1987; Johnson and Earle, 2000). The principal issue here is how to distinguish states from chiefdoms. Both have stratification involving elites and commoners, both treat class positions as hereditary, and both exhibit multi-settlement hierarchies where central authority is exercised from larger settlements.
One popular point of distinction is that the elites in chiefdoms are generalists. A chief might negotiate an alliance on Monday, collect tribute on Tuesday, celebrate the gods on Wednesday, and organize a war party on Thursday. This makes it difficult to arrange a partial delegation of power, while a full delegation of power creates political instability. In a state, by contrast, elite agents are specialists: some are political rulers, some are tax collectors, some are priests, some are warriors, and so on (Wright, 1977; Johnson and Earle, 2000). A state can delegate partial authority over particular tasks like public works or tax collection to local representatives, which makes it easier to control a large population or geographic area while limiting the risk that a subordinate might lead an insurrection (Spencer and Redmond, 2004; Spencer, 2010).
Another way to distinguish chiefdoms from states is that chiefdoms are organized through kinship ties while states use non-kin-based bureaucratic principles (Johnson and Earle, 2000). However, early states often continue to rely on kinship connections within the elite to some degree (Yoffee, 2005, 16–17). For this reason among others, we assign little weight to the kinship criterion.
A third approach is to define states as having more tiers in their settlement hierarchies than chiefdoms. In the Standard Cross Cultural Sample, a data set widely used for anthropological research (Murdock and White, 1969, 2006), states are defined as having four tiers rather than two or three (e.g., hamlets, small villages, large villages, and towns). Economists sometimes use similar definitions. For example, one prominent data set on state history going back to 3500 BP (Borcan et al., 2018, 5) draws a line between simple chiefdoms (non-states) and paramount chiefdoms including “multiple individually substantial chiefdoms” (incipient states). Most writers who adopt such criteria interpret multiple tiers in a distribution of settlement sizes as reflecting multiple tiers of authority within an elite administrative structure.
Some authors distinguish states from chiefdoms based on the greater scale of states measured by population or land area (see Diamond, 1997, ch. 14). There is little agreement on where such lines should be drawn, with some writers arguing that states may involve only a few thousand people, others preferring a boundary between ten and fifty thousand, and still others wanting to reserve the term for very large polities having hundreds of thousands of people. Arbitrary numerical cutoffs of this sort are not very useful, and we prefer a definition linked to an explicit theoretical framework.
Having discussed ideas from archaeology and anthropology, we move on to other social sciences. Political scientists emphasize the idea that a state has a monopoly on the use of force within a territory, while economists emphasize the idea that a state supplies public goods and collects taxes to pay for them. The former definition dates to Weber (1968, 54–6) and is endorsed by some anthropologists (Service, 1975, ch. 1). The latter is widespread in the economic literature (see our review of economic theories below).
Neither of these criteria corresponds to the distinctions between chiefdoms and states drawn by anthropologists. In many of the societies an anthropologist would call chiefdoms, the elite might have an effective monopoly on the use of force. A political scientist would then have to accept these societies as examples of states. Likewise, in societies an anthropologist would call chiefdoms, the elite might tax people to provide defense, insurance, infrastructure, or monuments. An economist would then have to accept these societies as states.
From this standpoint, the origin of the state is likely to remain shrouded in the mists of time because there is no practical way to determine when a chief first used force to collect a tax from a commoner. Clearly, something more is needed in order to draw a meaningful dividing line between societies like those of the Northwest Coast of North America (which no one thinks were states) and Mesopotamian city-states (which everyone thinks were states). There is no perfect solution to this definitional quandary, but we think the best approach is to use the term “state” when taxation is institutionalized, in the sense that (a) it is carried out on a regular basis by a permanent set of specialized functionaries, and (b) the resulting resources are aggregated and placed at the disposal of a central elite group for allocation. Institutionalization is a matter of degree and has fuzzy boundaries, but we see no better way to draw a suitable line. This leads to the definition adopted in Section 9.1: a state is an organized elite that has significant powers of taxation in a well-defined geographic area, where taxation is institutionalized in the above sense.
As in Chapters 9 and 10, we require a centralized fiscal system because a state can only function as a collective actor if the elite is organized enough to appropriate resources through mandatory payments and allocate them according to coherent preferences. We continue to define taxation as including all cases where the elite collectively confiscates resources (food, craft outputs, labor, raw materials, and so on). As explained in Section 9.1, the underlying technology of confiscation influences the conditions under which a state is likely to emerge. The supply of public goods is not part of our definition of the state (elites could use taxation simply to enhance their private consumption), but as an empirical matter most early states do devote significant resources to public goods.
We end this section by emphasizing the distinction between our definition of the state and the empirical indicators that suggest the existence of a state. This is important because taxation plays a central role in our theory, and we therefore give it a central role in our definition. At the same time taxation is difficult to observe directly in the absence of written records. Reasonable archaeological indicators of a state include
(a) Monumental architecture, such as very large temples or palaces, which would likely have required large-scale taxation for construction and maintenance.
(b) A multi-tiered settlement hierarchy that was likely associated with an administrative hierarchy involving functional specialization, where
(i) Substantial tax revenue would likely have been needed to support elite personnel, and
(ii) One of the specialized functions would likely have been tax collection.
(c) Evidence for routine coercive labor practices organized by the elite (recall that we include coerced labor as a form of taxation).
(d) Large central transportation and storage facilities that were likely used for the collection of taxes in kind (such as grain or animal herds).
(e) Other evidence of central control such as extensive city planning or standardized weights and measures.
(f) A level of inequality that would have required systematic taxation because it went beyond the norm for stratified societies based on land rent alone.
(g) A large total population or total land area controlled by a unified elite, which may suggest the existence of a large tax base or a scale of elite activity that could only have been supported through taxation.
Each indicator has its problems. For example, simple chiefdoms can sometimes organize monumental construction projects (Stanish, Reference Stanish2001). Also, it is hard to be certain that an administrative hierarchy is supported by tax revenue rather than land rent alone. But a flexible combination of such indicators is likely to be the best one can do. Few if any archaeological cases will have strong evidence for all seven markers, but we would be willing to infer the presence of a state if three or more are clearly present, especially for criteria appearing higher up on the list.
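The inference rule just stated can be sketched as a simple checklist. The Python fragment below is an illustration only: the indicator labels, the function name infer_state, and the treatment of "clearly present" as a yes/no judgment are assumptions made for the example. The preference for criteria higher up on the list is reported through the ordering of the matched indicators rather than coded as numerical weights, since no weights are specified in the text.

```python
# Illustrative sketch only. The labels below are hypothetical shorthand for
# indicators (a)-(g); the threshold of three follows the rule in the text.
INDICATORS = [
    "monumental_architecture",    # (a)
    "settlement_hierarchy",       # (b)
    "coerced_labor",              # (c)
    "central_storage_transport",  # (d)
    "central_control",            # (e)
    "extreme_inequality",         # (f)
    "large_population_or_area",   # (g)
]


def infer_state(clearly_present, threshold=3):
    """Return (inferred, matched) for a set of clearly present indicators.

    `matched` is ordered as in the list (a)-(g), so indicators that appear
    higher up on the list come first.
    """
    unknown = set(clearly_present) - set(INDICATORS)
    if unknown:
        raise ValueError(f"unknown indicators: {sorted(unknown)}")
    matched = [name for name in INDICATORS if name in clearly_present]
    return len(matched) >= threshold, matched


# Hypothetical archaeological case with three clearly present indicators.
inferred, matched = infer_state(
    {"central_storage_transport", "monumental_architecture", "settlement_hierarchy"}
)
print(inferred, matched)
```

A real assessment would of course rest on judgment about the strength of the evidence for each indicator, which a yes/no checklist cannot capture.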
11.4 Theories of the State
There are at least as many theories of pristine state formation as there are pristine states. In our experience archaeologists are frequently skeptical about a research agenda involving generalizations across regions of the world and instead prefer to emphasize the particularities of each individual case. Among those willing to engage in generalization, there is substantial debate about causal mechanisms. For example, Yoffee (Reference Yoffee2005, ch. 3) maintains that most early states began as city-states. But Carneiro (Reference Carneiro1970, Reference Acemoglu, Golosov, Tsyvinski and Yared2012), Marcus (Reference Feinman and Marcus1998), and Flannery (Reference Flannery1999) maintain that most early states arose through warfare among competing chiefdoms, a mechanism seemingly having little to do with cities. For a useful assortment of archaeological perspectives, see Feinman and Marcus (Reference Feinman and Marcus1998).
Before reviewing individual theories, we first consider how alternative theories of state formation can be assessed. A good theory should explain why early states emerged at particular times and places rather than other times and places. Ideally a theory would specify a set of necessary conditions for state formation that are jointly sufficient. The causal trigger must logically be the last of the necessary conditions to be satisfied.
Accordingly, one way to evaluate a theory of state formation is to ask whether it clearly identifies a causal trigger of this kind and whether the role of this alleged trigger is backed up by convincing archaeological evidence. Of course, the proximate cause of state formation need not be the ultimate cause. Popular candidates for the role of ultimate cause include climate change, technological innovation, and population growth, although we would argue that technology and population could be endogenous, depending on the relevant time frame. But whatever the proximate cause may be, a change in this variable should be linked closely in time to the emergence of state institutions.
Another way to assess theories of state formation is through comparisons across regions. Theory A might claim that geographic feature X is conducive to state formation while theory B claims that geographic feature Y is more important. Two examples to be discussed below involve geographical circumscription and characteristics of food crops. Diamond (Reference Diamond1997) and Litina (Reference Litina2014) provide further examples. Because hypotheses of this sort are generally based on permanent geographic differences across regions, they do not identify temporal triggers and therefore do not by themselves explain the timing of state formation. But in principle one could examine whether particular geographical features are correlated with the frequency or antiquity of state formation.
Cultural explanations for the origins of the state tend to have problems similar to cultural explanations for the origins of cities (see Section 11.2): they do not generally pin down any specific temporal trigger for state formation or any geographic conditions that make state formation more likely. Moreover, we tend to be dissatisfied with theories that treat unobserved cultural mutations or unobserved strategic manipulation of social norms as exogenous variables. For these reasons we do not pursue the cultural approach taken by Flannery and Marcus (Reference Flannery and Marcus2012), despite their rich empirical descriptions of the processes through which particular chiefdoms, kingdoms, and empires arose.
A popular classification system for theories of state origins divides them into two groups: integration theories and conflict theories. Integration theory is based on the idea that states develop in response to managerial problems facing the society as a whole. The writers in this camp tend to argue that the state makes both elites and commoners better off (in the language of economists, the state is a Pareto improvement). Conflict theory is based on the idea that the state serves the interests of the elite and helps the elite profit at the expense of commoners. Such writers tend to argue that early states involved coercion rather than coordination or public goods supply. These rival views go back centuries.
The distinction between integration and conflict theories is slippery for at least two reasons. First, some authors emphasize self-interest and the use of force, but even so, conclude that the state made everyone better off. Such arguments sometimes appear in the economics literature, as we will illustrate in our discussion of Grossman (Reference Grossman2002).
Second, some authors claim that elite authority was originally necessary in order to solve a collective social problem, but that this authority was subsequently abused in a self-interested and coercive way. Perhaps the most spectacular example of this approach is the oft-cited and now discredited “hydraulic” theory of Wittfogel (Reference Wittfogel1957, chs. 1–2), who argued that the benefits of irrigation and flood control systems in arid regions with rivers could only be achieved through mass labor. This required subordination to an authority that could coordinate the construction and maintenance of such systems, which then led to political despotism.
A similar example is provided by Diamond (Reference Diamond1997, ch. 14), who argues that large societies require more contact among non-kin and therefore have greater interpersonal conflict. The solution is to establish an authoritative leader who can restrain conflict. The problem is that such leaders tend to operate in self-interested ways and ultimately become kleptocrats who exploit commoners.
For both Wittfogel and Diamond, authority is initially used as a tool for solving an important social problem (integration theory), but is eventually converted into a tool for oppression (conflict theory). We cite these authors only as examples; arguments of this sort are quite common in archaeological or anthropological discussions of the state. Such theories are often more than a little vague about how authoritative leaders escape from the control of the majority, or why a majority irreversibly grants power to an elite despite the danger that this power could be abused at a later date. These issues feature prominently in the literature on the evolution of modern democracy (see Section 12.5).
Apart from physical infrastructure and the maintenance of law and order, other versions of integration theory emphasize state-run storage or insurance systems (Adams, Reference Adams1981), facilitation of regional trade (Litina, Reference Litina2014), and collective defense against bandits or attackers (Grossman, Reference Grossman2002; Baker, Bulte, and Weisdorf, Reference Baker, Bulte and Weisdorf2010; Konrad and Skaperdas, Reference Konrad and Skaperdas2012). Among economists the most popular of these ideas is the last, perhaps because defense is a quintessential public good and is generally financed through taxation. We start with a review of this strand within the economic literature.
Grossman (Reference Grossman2002) assumes that individual agents can choose whether to produce output or steal output from others. Producers devote resources to guarding output. If the technology of predation is highly effective, banditry will be a serious threat. In this case everyone (including potential bandits) can be made better off by creating a state that taxes producers and deters banditry. Though the state maximizes elite consumption, excessive taxation is restrained by the possibility that producers could choose to become (untaxed) bandits. Although one might think from Grossman’s emphasis on the use of force and the self-interest of the state that this story belongs in the conflict theory camp, the claim that the state yields a Pareto improvement relative to anarchy flags it as an integration theory. Indeed, one could imagine that such states arise through a voluntary social contract. The main empirical prediction from the model is that states tend to emerge when it is easy for bandits to appropriate the output generated by producers.
Baker et al. (Reference Baker and Bulte2010) use a similar theoretical framework. They do not distinguish between chiefdoms and states, and assume there is a government only if food producers benefit from it. Thus the state is a response to the demand for law and order. Unlike in Grossman (Reference Grossman2002), technology and population are endogenous: the former evolves through learning by doing and the latter through Malthusian dynamics. An argument running through the paper is that as technology advances, storage becomes more important, output becomes easier to steal, and the demand for a state intensifies.
By contrast with Grossman (Reference Grossman2002), who assumes that a king has a monopoly on the sale of protection, Konrad and Skaperdas (Reference Konrad and Skaperdas2012) allow free entry of rival lords who all sell protection and fight for control over peasants. Competition among the lords will dissipate the potential gains from the sale of protection, leaving people no better off (and possibly worse off) than under anarchy. They are also pessimistic about the viability of self-government, in the sense of small groups that provide their own security as a local public good, in a world where such groups must compete against predatory states.
Boix (Reference Boix2015) elaborates on the latter point. He argues that there are two pathways to the state: one where the predators organize against the producers, leading to a monarchy, and another where producers defend themselves against the predators, leading to a republic. Boix believes that the pathway to a republic is rare but argues that the outcome depends partly on the nature of military technology (see Section 12.5 for a further discussion).
Conflict theories come in numerous flavors. One highly influential article in this tradition is Carneiro (Reference Carneiro1970), who argues that wars among rival chiefdoms can potentially lead to state formation, but only when commoners are unable to flee from the state due to geographical circumscription involving deserts, mountains, oceans, and the like. A lucid summary of this approach is provided by Flannery (Reference Flannery1999), who cites Carneiro’s recipe for a pristine state: defeat neighboring villages by force, incorporate their territory into your political unit, use prisoners of war as slaves, use close supporters to administer conquered territory if the local leaders are rebellious, require payment of tribute from your subjects, and require them to provide fighters in times of war.
Flannery (Reference Flannery1999) adds that chiefdoms often cycle between simple and complex forms, where complex chiefdoms tend to fragment for a variety of reasons, including factional competition and succession problems. While acknowledging that only a very small fraction of chiefdoms ever give rise to states, Flannery maintains that almost all pristine states were formed through warfare and territorial expansion among chiefdoms.
Gat (Reference Gat2006, 278–293) believes that city-states in many parts of the world emerged through warfare, or at least a perceived need for defense. He explains the apparent lack of fortifications, especially walls, in the early stages of city-state development by arguing that proto-state warfare consisted mainly of raids, and that the sizes of cities alone would have made them difficult targets for raiding. Gat grants that once cities arose they were rarely attacked. One implication is that deterrence through urbanization was frequently effective, which would tend to stabilize a system of small autonomous city-states. Note that this story differs from the emphasis of Flannery and others on territorial expansion.
Allen (Reference Allen1997) builds on the ideas of Carneiro (Reference Carneiro1970) to explain the early Egyptian state, arguing that circumscription of the Nile valley by deserts enabled the elite to extract output from the commoners. According to Allen the temporal trigger for state formation was the arrival of agricultural technology from southwest Asia, leading to storable grain surpluses. Agriculture also made labor more seasonal, leaving commoners available for elite public works such as pyramid construction (for further discussion, see Section 11.6).
According to Mayshar et al. (Reference Mayshar, Moav and Neeman2017, Reference Mayshar, Moav and Pascali2022), early states were most likely to arise in regions where food stores were readily appropriable (e.g., cereals but not tubers; see also Scott, Reference Scott2017), and where production technology was transparent in the sense that the elite could readily measure or estimate the food output produced by commoners. They reject assertions that early states required a technologically determined surplus or arose through population pressure. In each case their objection is based on Malthusian considerations: surpluses resulting from high agricultural productivity will be eroded through population growth in the long run, and population pressure will be self-correcting due to the long-run effects of low food per capita on fertility and mortality.
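The Malthusian logic behind this objection can be illustrated with a small simulation. The sketch below is not taken from Mayshar et al.; the functional forms (food per capita equal to A * N ** (-alpha) on a fixed land base, and population growth proportional to the gap between food per capita and a subsistence level) and all parameter values are assumptions chosen only to show the mechanism.

```python
# Illustrative Malthusian sketch. All functional forms and parameter values
# are assumptions for this example, not the specification of any cited paper.

def food_per_capita(productivity, population, alpha=0.5):
    """Food per person on a fixed land base with diminishing returns to labor."""
    return productivity * population ** (-alpha)


def simulate(productivity, initial_population, periods,
             subsistence=1.0, speed=0.1, alpha=0.5):
    """Return a list of (population, food per capita) pairs over time.

    Population grows when food per capita exceeds subsistence and shrinks
    when it falls short.
    """
    population = initial_population
    path = []
    for _ in range(periods):
        y = food_per_capita(productivity, population, alpha)
        path.append((population, y))
        population *= 1.0 + speed * (y - subsistence) / subsistence
    return path


# Start at the steady state for productivity 1 (population 1, food per
# capita 1), then double productivity and follow the adjustment.
path = simulate(productivity=2.0, initial_population=1.0, periods=500)
print(path[0])   # the initial surplus: food per capita jumps above subsistence
print(path[-1])  # long run: population has grown and the surplus is gone
```

Under these assumptions a doubling of productivity raises food per capita only temporarily. In the long run population expands until food per capita returns to the subsistence level, which is the sense in which a productivity-driven surplus is eroded.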
Economists have begun to develop an empirical literature on state formation. The data sets are often dominated by non-pristine states, but the origins of pristine states have also received attention. Borcan et al. (Reference Borcan, Olsson and Putterman2021) emphasize the central role of agriculture and stratification as preconditions (see Section 9.1). Mayshar et al. (Reference Mayshar, Moav and Pascali2022) stress the roles of transparency, durability, storage, and other features of crops that facilitate appropriation by an elite (see Sections 9.8 and 9.9). They find no effect of land productivity on state formation after controlling for reliance on cereal crops. Schönholzer (Reference Schönholzer2020) finds that states emerge in places that are agriculturally productive but also circumscribed by low productivity land, making it difficult to evade taxation through migration. He finds no effect of land productivity on state formation after controlling for circumscription.
All economic theories of the early state, whether based on integration or conflict theory, need to specify some limits on the ability of the elite to collect taxes. We do this by allowing agents to flee to an open-access commons. Other constraints on elite power include the possibilities that commoners will reduce effort in response to high tax rates, hide food from tax collectors, choose banditry over farming, or engage in rebellion.
Although we will not go into details, we make some brief remarks on the potential role of rebellion. Our combat technology from Chapters 7 and 8 says that the probability of a group winning a war depends on that group’s size relative to the size of the opposing group. Suppose now that commoners want to use such a combat technology to overthrow an oppressive elite. The difficulty is that if the rebel group is initially small, the elite can defeat it with virtual certainty. Even given parity with respect to the quality of weapons and leadership, a commoner rebellion can only succeed if it is concealed from the elite in its early stages and then rapidly grows into an organized group on a scale comparable to that of the elite itself. Moreover, an elite under serious threat can use financial resources to recruit a mercenary army and such an army can be used to crush a commoner rebellion (see Chapter 8). But extreme forms of elite oppression might be deterred by the prospect of a rapidly spreading rebellion that would be costly to extinguish.
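The role of relative group size can be illustrated with a ratio-form contest success function. The functional form used in Chapters 7 and 8 is not reproduced in this section, so the form below, the parameter named decisiveness, and the group sizes are assumptions made purely for illustration.

```python
# Illustrative sketch: a ratio-form contest success function. The functional
# form and the decisiveness parameter are assumptions, not the combat
# technology specified in Chapters 7 and 8.

def win_probability(own_size, rival_size, decisiveness=3.0):
    """Probability that a group of own_size defeats a group of rival_size."""
    if own_size < 0 or rival_size < 0:
        raise ValueError("group sizes must be non-negative")
    if own_size == 0 and rival_size == 0:
        raise ValueError("at least one group must be non-empty")
    own = own_size ** decisiveness
    rival = rival_size ** decisiveness
    return own / (own + rival)


# Hypothetical numbers: an elite force of 100 facing rebel groups of
# increasing size.
for rebels in (10, 50, 100):
    print(rebels, round(win_probability(rebels, 100), 4))
```

With these assumed values a rebel group one tenth the size of the elite force wins with a probability of roughly one in a thousand, while a group of equal size wins half the time. This is the sense in which a rebellion must grow to a scale comparable to that of the elite before it has a realistic chance of success.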
Looking at the debate between integration theorists and conflict theorists from a broad perspective, the picture is probably more mixed than either approach alone would suggest. Early states did provide some public goods that would likely have been valued by commoners such as infrastructure projects, insurance, sacred temples, law and order, suppression of local warfare, and protection from raiders or invaders. At the same time these states exercised coercive power, used forced labor, and diverted substantial output to elite consumption. The net effect on the nutrition, health, and life expectancy of the commoners is an empirical question, and the answer could differ from case to case.
The question is also hard to answer because the counterfactual is poorly defined. For example, if one defines a state by the use of taxation, should we compare the actual welfare of commoners in early city-states with what it would have been with no taxes and free entry into manufacturing? Chapter 10 gives a theoretical framework for comparisons of this kind, but empirical research along these lines would clearly be challenging.
A related problem is to distinguish the welfare effects of the factors that triggered state formation from the welfare consequences of the state itself. In the model of Chapter 10, increasing aridity makes commoners worse off even in an agricultural society without a state. It also causes city-states to form. How much of the total reduction in commoner welfare was due to climate effects that would have occurred anyway, and how much was due to elite taxation of manufacturing activities that arose in response to climate change?
A third problem involves the nature of elite income. No one disputes that elites benefited from state formation but there is genuine debate about how they became better off. A conflict theorist would maintain that rising elite welfare reflects greater success in taking output or labor time from commoners by force, while an integration theorist might argue that elites (also?) obtain returns on human capital. For example, elites may possess specialized knowledge about water management systems. Archaeologists will probably find it difficult to distinguish categories of elite income in ways that map neatly onto the theoretical categories economists would typically use.
11.5 Regional Comparisons
The traditional regional list for pristine states includes Mesopotamia, Egypt, the Indus River valley, China, Mesoamerica, and South America (Service, Reference Service1975; Adams, Reference Adams2001, 346; Spencer and Redmond, Reference Spencer and Redmond2004, 174). The literature is enormous and we cannot undertake a complete review here. In the next five sections we summarize archaeological evidence about state formation for regions on the above list other than Mesopotamia, which was discussed at length in Chapter 9.
For each society, we explain why scholars regard that society as having a state, and provide a chronological narrative of the events leading up to state formation. To the extent possible, each narrative centers on variables relevant for our theoretical approach, such as climate, geography, food technology, population, inequality, migration, warfare, cities, and manufacturing. In cases where our search of the literature uncovered specific causal hypotheses about state formation, we summarize them. However, in some cases we did not find any causal hypotheses, and in no case did we find a set of hypotheses as rich as those we discussed for Mesopotamia in Section 9.8. We describe our own causal hypotheses in Section 11.12 and suggest applications to regional cases in Section 11.13.
Readers who want additional information can consult Renfrew and Bahn (Reference Balkansky, Renfrew and Bahn2014), Yoffee (Reference Clayton and Yoffee2015), and Yoffee (Reference Petrie and Yoffee2019). These edited volumes include chapters on regions of the world beyond those we discuss, such as southern and western Africa, Southeast Asia, Europe, and North America. Renfrew and Bahn address many subjects in prehistory by region, Yoffee (Reference Clayton and Yoffee2015) focuses on early cities, and Yoffee (Reference Petrie and Yoffee2019) focuses on early states.
11.6 Egypt
The Egyptian part of the Nile River valley is traditionally divided into two major population centers, Upper Egypt (in the south) and Lower Egypt (in the north, including the Nile delta). These were separated by Middle Egypt, where the valley is narrow and the prehistoric population density was low. The Dynastic period in which the pharaohs ruled began around 5000 BP and was associated with the start of written records. This was preceded by the pre-Dynastic period covering the millennium from 6000 to 5000 BP, when written records are absent. Two key questions about the formation of the Egyptian state are why a state arose in Upper Egypt (and probably also Lower Egypt), and how the political unification of Upper and Lower Egypt was achieved.
Hunting, gathering, and fishing were long used for food acquisition at the mouths of wadis and the edges of lakes (recall that in the early Holocene, between about 12000 and 7000 BP, the Sahara was much wetter than it is today). A period of aridity around 7000–6000 BP forced populations from the deserts to migrate into the Nile valley. The earliest evidence for Neolithic sites having domesticated species appears around 7000 BP; agricultural technology was not pristine but instead was borrowed from southwest Asia. Hunting, gathering, and fishing remained important and people did not become sedentary immediately (information is from Midant-Reynes, Reference Midant-Reynes2000, except where indicated).
Early in the pre-Dynastic, soon after 6000 BP, metalworking appears in the south. The southern population continued to rely heavily on fishing and hunting but abandoned the edges of wadis, which became inhospitable due to climate change, and concentrated in the limited area of the floodplain. Pastoralism was abandoned in favor of agriculture and the population began to cluster together. Settlements in the south in the early part of the pre-Dynastic period had about 50–200 people (183–184).
During the course of this millennium, the number of bodies buried in small pits increased while a small number of individuals began to be buried in larger graves (170). Grave goods from larger tombs indicate a trend toward hierarchy with a dominant elite directing the work of craft manufacturers. There were probably specialized workshops for flint knappers, potters, stoneworkers, and metalworkers (193–198).
Along with a number of smaller towns, three main urban centers arose in Upper Egypt: Naqada, Hierakonpolis, and Abydos. Naqada existed for at least 500 years, with an average calibrated radiocarbon date of about 5400 BP. Hierakonpolis emerged during 5800–5100 BP due to several factors: degradation of fragile desert ecosystems, increasing aridity leading to migration into the river valley, defensive clustering, the adaptation of agricultural technology to the annual flooding of the river plain, the use of the river for trade, and the role of the city as a religious center. Eventually it became the capital of an early Upper Egyptian kingdom (200–201).
Midant-Reynes estimates that an urban center from this period generally had a few hundred craft workers and officials, and that each non-food-producer was supported by fifty agriculturalists (198). This implies urban cores in combination with agricultural hinterlands on the order of 10,000 people, likely ruled by “kinglets” who controlled trade and manufacturing (207). Local irrigation may have begun around this time and toward the end of the pre-Dynastic period artistic images of irrigation are found. Early rulers in the Dynastic period may have encouraged further irrigation (232–234).
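The arithmetic behind the figure of roughly 10,000 people can be made explicit. In the sketch below, the ratio of fifty agriculturalists per non-food-producer comes from the estimate cited above, while the specific counts of 200 and 300 non-food-producers are assumed readings of "a few hundred" and the function name is hypothetical.

```python
# Illustrative arithmetic. The counts of 200 and 300 non-food-producers are
# assumed readings of "a few hundred"; the ratio of fifty agriculturalists
# per non-food-producer is the estimate cited in the text.

def implied_population(non_food_producers, farmers_per_specialist=50):
    """Total population: non-food-producers plus the farmers supporting them."""
    farmers = non_food_producers * farmers_per_specialist
    return non_food_producers + farmers


for specialists in (200, 300):
    print(specialists, implied_population(specialists))
```

With 200 non-food-producers the implied total is 10,200 people, and with 300 it is 15,300, both consistent with a combined urban core and hinterland on the order of 10,000.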
During the pre-Dynastic millennium, Lower Egypt became a sedentary society of pastoralist-agriculturalists, where cultural influences spread from the south to the north. Several sites are well known including Maadi, Heliopolis, and Buto. Metal objects are common at Maadi, which may have been a commercial center. Heliopolis appears to be mainly a cemetery. Buto, at the northern end of the Delta, became the capital of an early Lower Egyptian kingdom, comparable to Hierakonpolis in the south (218). With regard to the unification of south and north at the start of the Dynastic period, Midant-Reynes mentions fortified cities (including Hierakonpolis, also known as Nekhen) and depictions of battle scenes and victorious rulers, but not any skeletal evidence for warfare. She sees warfare as a minor factor by comparison with politics and culture (237–246).
Wengrow (Reference Wengrow2006) provides a portrait similar to that of Midant-Reynes but with a few differences in emphasis. He points out that the bulk of the evidence on economic and other matters in Egyptian prehistory comes from cemeteries rather than settlements due in part to the accumulation of river silts. He agrees that domesticates arrived from southwest Asia between 7000 and 6000 BP and that a unified territorial state with a central monarch followed with a lag of about 1500 years. Wengrow emphasizes that for some time after domesticates arrived there is little sign of permanent villages, and argues that this reflects the dominance of pastoralism relative to cereal cultivation (26–31, 63–64).
Early in the pre-Dynastic proper (6000–5000 BP), food sources in both Upper and Lower Egypt included cultivated wheat and barley, flax, lentils, and peas, along with wild roots, figs, and berries. Domesticated animals included cattle, sheep, goats and often pigs (84). Around the time of the transition from the Naqada I (6000–5650 BP) to the Naqada II (5650–5300 BP) period, cereal farming began to play a “decisive role” in Upper Egypt, with mud-brick housing indicating greater sedentism (33).
Wengrow is not convinced by arguments for urbanism or distinct city-states in Upper Egypt during Naqada I–II (72–73). Hierakonpolis was the largest habitation center of this period. Wengrow estimates that burials at Hierakonpolis in Naqada I–II number in the thousands, with additional nearby cemeteries having similar scales, including Naqada itself. Smaller centers had 100–200 burials and the majority had 600–1000 (73–75).
Although he downplays claims for early city-states, Wengrow believes that towns in Naqada I–II had craft specialization, trade networks, and some political centralization (76–83). During Naqada II (5650–5300 BP), migration to the floodplain in Upper Egypt was associated with the emergence of specialized craft zones for stonework, pottery, baking, and brewing (95–98). At Hierakonpolis, for example, “the concentration of activity along the floodplain is associated with the earliest evidence for specialised manufacturing areas” (consisting of debitage from stonework), and grave goods provide more evidence for expanding craft output (38). Material culture became broadly uniform for Upper and Lower Egypt in the Naqada II period. Metallurgical knowledge is evident in the later part of this millennium.
Wengrow believes that a process of urbanization was occurring in Upper Egypt prior to the emergence of a politically unified state in the Naqada III period (82–83), and that this process involved ritual centers as foci for craft production and trade (265). He states that at this time the number of sites was increasing in Lower Egypt (89), and with Naqada III there was a shift toward mass production for pottery, baking, and brewing (159–164). The wares of Lower Egypt were “the products of specialised workshops, oriented increasingly towards large-scale production and utility of form” (163).
A unified state spanning Upper and Lower Egypt, with kingship, elite dependents, and writing, arose in the Naqada IIIA–C1 period, corresponding to 5300–5100 BP (137). The elite took control over metal and mineral resources and introduced capital-intensive production techniques from southwest Asia, leading to an increasingly polarized society (142). In the Nile delta new centrally governed estates provided for the everyday needs of their dependents (173–175).
The account of pre-Dynastic Egypt by Hendrickx and Huyge (Reference Hendrickx, Huyge, Renfrew and Bahn2014) is generally consistent with those of Midant-Reynes and Wengrow. Hendrickx and Huyge place greater emphasis on skeletal evidence of violence by early Naqada II (fractures of the braincase, cut marks on neck vertebrae, and the like). However, they link this to capital punishment or human sacrifice rather than warfare. They also emphasize stratification dating back to 5700 BP, with cemeteries indicating that “the elite clearly showed fewer traces of work stress and were far better fed” (252).
Hendrickx (Reference Hendrickx, Renfrew and Bahn2014) traces “royal tombs” back to early Naqada II at Hierakonpolis. However, by early Naqada III (around 5300 BP), Abydos had taken over as the center of power in Upper Egypt. While granting the role of violence in iconography, Hendrickx does not believe there was war between these two cities. He argues that a collaboration or alliance between them was much more likely. He also points to the earliest attestation of writing at Abydos as a sign that this city played a pivotal role in state formation. The eventual unification of Egypt resulted in a transfer of governance from Abydos north to Memphis, largely for economic reasons.
Some authors believe that pre-Dynastic Egypt had fortified city-states (Trigger, Reference Trigger2003, 104; Yoffee, Reference Yoffee2005, 47). However, the arguments for fortifications in the late pre-Dynastic often appear to rest on artistic depictions (Hendrickx, Reference Hendrickx, Renfrew and Bahn2014, 272) and may not warrant much attention without greater tangible evidence. The earliest sites that clearly do have fortifications are in peripheral areas far to the south or far to the north (Moeller, Reference Moeller2016, 76–81). For Hierakonpolis, which has good data, there is no town enclosure wall before 5200 BP and possibly not until 5100 BP, which for Moeller (Reference Moeller2016, 81–84) is the start of the Early Dynastic period. We note that such a town wall could have been used to regulate individual access and does not necessarily imply a threat of war.
Another uncertainty involves the activities of non-food-producers in urban areas. Some authors suggest that the pre-Dynastic Egyptian cities had mainly administrative and ceremonial purposes (Yoffee, Reference Yoffee2005, 48; Kemp, Reference Kemp2006). But Midant-Reynes and Wengrow both emphasize craft manufacturing and trade in the towns and cities of the pre-Dynastic period, and Moeller (Reference Moeller2016, ch. 4) takes a similar view.
Allen’s (Reference Allen1997) argument that the Egyptian state arose rapidly after the arrival of agriculture is clearly incorrect. Evidence for the use of domesticated plants and animals can be found at least 2,000 years before the Dynastic period, and there is no debate about the economic centrality of agriculture in the millennium-long pre-Dynastic period. Thus agricultural technology by itself cannot be regarded as a trigger for state formation.
A more plausible argument is that Egypt was affected by climate changes similar to those in Mesopotamia at around the same time. In both cases a prolonged reduction in rainfall caused the populations of outlying areas to migrate toward river valleys, resulting in the development of urban populations within societies controlled by landowning elites. In our view this climate shift was the trigger for urbanization and state formation in both cases (see Brooks, Reference Brooks2006, Reference Brooks2013, for evidence that these regions were affected by the same global climate events).
11.7 The Indus Valley
The early society of the Indus valley is sometimes called the “Indus Tradition” or the “Indus Civilization.” The term “Indus valley” can be misleading as a geographic label because this culture is defined by a set of archaeological markers that extend well beyond the river valley to include tributaries of the Indus River in the north, coastal settlements in the south, and many other sites quite far from the river itself. When we refer to the Indus valley, we mean the “greater Indus valley” in this broad sense.
Chronologies and terminology differ somewhat from one source to another. For simplicity we define the pre-urban period to be 3200–2600 BCE, the urban period to be 2600–1900 BCE, and the post-urban period to be 1900–1300 BCE. Dates in the literature are almost always given as BC or BCE rather than BP, so we follow this convention. As many as 2,600 sites are associated with one or more of the three periods listed above. An Indus script existed but has not been deciphered, so practically speaking Indus society is in the realm of prehistory. Information here is from Kenoyer (Reference Kenoyer, Renfrew and Bahn2014) except where noted.
The river valley has a rich alluvial plain. A winter cyclonic pattern brings rain from the northwest and a summer monsoon brings rain from the southwest. Agricultural and pastoral communities date to at least 5500 BCE and probably earlier. Winter–spring crops included wheat, barley, peas, and lentils, while summer–autumn crops included beans and millets. Domesticated animals included sheep, goats, and cattle. There was also some reliance on hunting in the pre-urban period, as well as riverine, lacustrine, or marine resources depending on the location of a particular site.
In the pre-urban period there were various distinct cultural traditions, but toward the end of this period there was a notable tendency toward uniformity of pottery styles, technology, architecture, and settlement organization. Copper-melting crucibles indicate metallurgical sophistication and there is evidence for textile production. The wheel may have been invented for carts drawn by oxen about 3700–3300 BCE. By 2800–2600 BCE three- or four-tier settlement patterns had arisen, with walls around larger centers. These walls could have had various functions such as offering protection against flooding or raiders, or controlling access to the site for trade and other purposes.
The transition from the pre-urban to urban period involved a smooth evolution at some sites, but in other places the abandonment of old sites and creation of new settlements. A few sites have ash layers at the transition, but evidence for any conflict at larger sites is absent. Kenoyer believes this transition did not involve warfare or conquest. The major cities of Harappa in the north and Mohenjo-Daro in the south were much larger than any earlier settlements (about 150–250 ha). These two leading cities were 570 km apart. The city of Harappa could accommodate 40,000–60,000 people, but seasonally the population was probably quite a bit less.
Ratnagar (Reference Ratnagar2016, 52) stresses the radical nature of the shift from the pre-urban to the urban period, pointing to the emergence of many new settlements, citadels, planned street layouts, standardized weights, and writing. But according to Sinopoli (Reference Sinopoli and Yoffee2015), only four or five sites from the “urban” period would actually qualify as urban, and these were separated by hundreds of kilometers. Petrie (Reference Petrie and Yoffee2019) believes the larger cities arose quite rapidly (within a century) but suggests that most of the Indus population remained rural rather than urban.
The major cities housed administrators, ritual specialists, craft specialists, and traders. These groups were supported by farmers and herders inside and outside the city, with a hinterland of small towns and villages. Kenoyer distinguishes four categories of manufacturing activity: basic crafts using locally available materials (wood, clay, and animal products) with simple technologies; crafts using materials not available locally such as stone, but again with simple technologies; crafts using local materials but more complex technology (textiles, furniture, and some ornaments); and crafts with both non-local materials and complex technology (seals, artifacts of copper or copper alloys, hard stone beads, precious metals, glazes, and shells). He believes that the last two categories were most closely controlled by elites in order to produce high-status items and for local or long-distance trade. Pottery and copper-working operations were often located at the edges of settlements. Other operations “took place in segregated areas of larger domestic structures or in the streets between structures.” Concentration of activity in certain parts of the city made it “easier for elites to monitor specific crafts” (2014, 421).
By 2450 BCE, new suburbs were added to the cities and massive new walls were constructed around them. Kenoyer remarks that, “the rapid population growth during this period can be explained only through the migration of new communities to the cities” (415). He observes that the nearby rural settlements remained occupied, so migrants may have come from more distant regions.
Differences in housing sizes, construction materials used, and locations (inside or outside city walls) indicate clear stratification within the large cities. Only a few burials have been found, which have distinctive pottery and ornaments showing membership in the elite. Commoners were not buried, providing additional evidence for stratification.
There is no evidence that cities were ever attacked or destroyed through warfare. Ratnagar (Reference Ratnagar2016) places much greater emphasis than Kenoyer on the military significance of walls, citadels, and caches of defensive projectiles. This subject is controversial, with Kenoyer (Reference Kenoyer2019) arguing that the walls were used mainly to control access and trade, and would have been militarily useless (2020, personal communication). We cannot resolve this debate here. What matters for us is that warfare was absent, whether this was due to the absence of military threats or the effectiveness of defense in deterring attacks. In any event Indus society in the urban period remained stable for over 700 years. We will not discuss the post-urban period except to note that it involved localized cultures, an end to writing, and the abandonment of most large cities.
The existence and nature of a state in the urban period is controversial. Based mainly on the absence of palaces and temples, Possehl (Reference Possehl, Feinman and Marcus1998) argues that the region was an example of cities without a state. Possehl also emphasizes the apparent absence of a supreme political authority, a state religion, or a bureaucracy. Kenoyer (Reference Kenoyer, Renfrew and Bahn2014, 416), by contrast, argues that certain elite residences at Mohenjo-Daro might qualify as palaces. Based mainly on the evidence for systematic city planning and the construction of large walls, Kenoyer asserts that the Indus area was characterized by independent city-states, with the major settlements ruled by corporate bodies representing wealthy landowners, merchants, and religious leaders. He regards the use of uniform stone weights, most commonly found near gateways and workshop areas, as evidence for taxation of trade items. Based on the standardization of weights and other aspects of cultural uniformity, Ratnagar (Reference Ratnagar2016, ch. 7) suggests a degree of political centralization across the region as a whole. However, it is hard to reconcile arguments for a regional state with Ratnagar’s own stress on the importance of military defenses at the level of the individual cities.
There is a longstanding tradition of including the Indus Valley on lists of pristine states (Service, Reference Service1975; Adams, Reference Adams2001). We certainly accept the city-state description given evidence for four-tiered settlement systems, massive construction projects, city planning, the likelihood of taxation for trade goods, and the use of writing. We are less persuaded that political unification extended beyond the level of individual urban centers and their nearby rural hinterlands, particularly given the large distances between major centers.
The key question from our standpoint is whether there is any identifiable trigger for the urbanization process. We found relatively little discussion of causal mechanisms for the origin of city-states in this region, but the leading contender appears to be climate change. Madella and Fuller (Reference Madella and Fuller2006) argue that Indus urbanism occurred during a lengthy trend toward declining rainfall. Evidence for this view comes from the sediments in salt lakes, pollen sequences, and correlations with other regional and global climate events. Madella and Fuller suggest that relatively high rainfall during the mid-Holocene led to the spread of sedentism and agriculture in the Indus area, partly through increased flood levels that allowed more extensive cultivation as the floods receded. This would have supported demographic expansion. However, diminishing rainfall by 2900–2450 BCE contributed to a rising population near the Indus and Ghaggar-Hakra rivers in the urban period, as rain-fed cultivation became more difficult.
These ideas are elaborated by Giosan et al. (Reference Giosan2012, E1688), who find that aridity intensified after about 3000 BCE and suggest “a gradual decrease in flood intensity that probably stimulated intensive agriculture initially and encouraged urbanization” around 2500 BCE. Giosan et al. (Reference Giosan2012, E1693) assert that “river floods have always been far more important and reliable for agriculture than rainfall,” that precipitation feeding the rivers decreased after 3000 BCE, and that it reached its lowest level after about 2000 BCE. They conclude, “This drying of the Indus region supports the hypothesis that adaptation to aridity contributed to social complexity and urbanization.”
This argument is reinforced by archaeological evidence that Indus cities made major investments in water management. Ratnagar (Reference Ratnagar2016, ch. 9) argues that cities had deep wells tapping into a region-wide aquifer for drinking water, and that reservoirs for the storage of flood water “required enormous community labour under direction and coordination” (192). Kenoyer agrees that “one of the outstanding features of the Indus cities is the technology of water management through the construction of wells and reservoirs” (2014, 418). However, he is skeptical about the relevance of climate change.
In a detailed survey of the literature Petrie et al. (Reference Petrie2017) describe a wide range of views, from the belief that there was no change in annual rainfall patterns between 4000 BCE and the present to the belief that climate change was the main cause of the collapse of Indus civilization. Petrie et al. accept that monsoonal rainfall decreased during 4400–3760 BCE and 2200–2000 BCE, but these episodes do not coincide with the timing of the initial urbanization. They stress the diversity of microclimates and ecosystems, as well as numerous data problems, and believe it is unlikely that climate change would have led to uniform effects across the region. However, they do not directly criticize the arguments of Madella and Fuller (Reference Madella and Fuller2006) or Giosan et al. (Reference Giosan2012). Although the debate over the role of climate change is unresolved, our assessment is that increasing aridity is a prominent candidate as a trigger for the emergence of Indus city-states.
11.8 China
The process of pristine state formation in China is complex. There is still active debate both about which site was the first to become a state and about the causal forces that led to this outcome. We will describe what we believe is the majority view on the subject, but we include a contrary view at the end of this section so readers will gain a sense of the issues at stake.
Cohen and Murowchick (Reference Cohen, Murowchick, Renfrew and Bahn2014) provide a detailed description of the Middle and Late Neolithic periods in northern China from about 3500 to 1800 BCE. These cultures are known as Yangshao and Longshan, and provide the background for state formation. This period reveals a trajectory starting from simple egalitarian societies and moving over time toward increasing site numbers; increasing population densities; increasing stratification; greater craft specialization (jade, lithics, pottery, and metallurgy); multi-tiered settlement hierarchies with central places; some very large site areas; defensive walls; and warfare.
Elite warfare appears to have been common in the Late Neolithic. In a review of the Chinese archaeological literature on warfare for 5000–2000 BCE involving 85 sites, James Kai-sing Kung (personal communication, 2020) identified 30 sites with weapons including arrowheads, spears, axes, and bullets or balls; 27 sites with fortifications (four had trenches and the rest were walled towns); and 18 sites with unnatural deaths (largely confined to the Yellow River area, and with skeletons usually having scars from arrows). We note that some of this evidence is ambiguous. For example, weapons might be used for hunting and walls might be used to control individual access to a site. However, two sites had strong evidence for mass violence: Taosi, located near the Yellow River, with a mass grave for 40 people along with the destruction of a town and tombs; and Yuanmou-Dadunzi in Yunnan province, where most victims were young men hit by large stones or killed by arrows.
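To put these counts in proportion, the following minimal Python sketch expresses each category as a share of the 85 reviewed sites. The variable names are ours, and we assume the categories can overlap (a single site may show both weapons and fortifications), so the shares need not sum to 100 percent.

```python
# Counts quoted in the text from Kung's review (personal communication, 2020).
TOTAL_SITES = 85
counts = {
    "weapons": 30,
    "fortifications": 27,
    "unnatural deaths": 18,
}


def share(count, total=TOTAL_SITES):
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


shares = {name: share(n) for name, n in counts.items()}

# Of the 27 fortified sites, four had trenches; the rest were walled towns.
walled_towns = counts["fortifications"] - 4

for name, pct in shares.items():
    print(f"{name}: {pct}% of reviewed sites")
print(f"walled towns: {walled_towns}")
```

On these figures roughly a third of the reviewed sites had weapons, just under a third had fortifications, and about a fifth had unnatural deaths. The shares describe only the sites included in the review, not Late Neolithic settlements in general.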
Recent research at Shimao, in modern Shaanxi province well north of the central plain of the Yellow River, has shed new light on the role of warfare in the late Neolithic. This settlement existed from 2300 to 1800 BCE, and with an area of more than 400 ha was the largest walled settlement in China at the time (information is from Jaang et al., Reference Jaang, Sun, Shao and Li2018). The first construction project was a palace center probably meant as a residence for ruling elites, with a design suggesting desires both for defense and restrictions on access. Metal artifacts were manufactured in this center. Construction beyond this complex was “much more humble, both in terms of size and building techniques” (1013). An initial stone wall enclosed an area of 210 ha, and this was followed by an outer wall around 2100 BCE that was clearly designed for defense. The Shimao polity had a four-tier hierarchy with up to 4,000 settlements.
Jaang et al. believe that Shimao destroyed Taosi, a large Neolithic urban center to the south, around 2000–1900 BCE. This involved a breach of the outer wall at Taosi, the death of at least 50 people in the palatial center, destruction of the palace, and dragging of elite corpses out of tombs. Cultural evidence suggests that the region surrounding Taosi became a colony in the Shimao network.
Sun et al. (Reference Sun2017) assert that Shimao flourished during a relatively warm and wet climate, and declined during climate deterioration around 1800 BCE. They believe the period of favorable climate was associated with a very high population and that the rapid nature of the population growth was attributable at least in part to migration into the area by pastoralists. Their discussion suggests causality running from climate to population to warfare. Sun et al. also suggest that people from Shimao may have been responsible for the destruction of Taosi.
There is a reasonably broad consensus that the first Chinese state was centered at Erlitou in the Yiluo basin from 1900 to 1500 BCE (a dissenting opinion will be discussed later in this section). This interval is divided into four phases of about one century each based on a ceramic typology, but precise dates are difficult to determine. Our description follows Liu and Chen (Reference Liu and Chen2012, ch. 8) and is supplemented by Liu and Chen (Reference Liu and Chen2003) and Liu (Reference Liu and Storey2006). Liu and Chen (Reference Liu and Chen2012, 258) define a state as having a ruling class and commoner class, where the elite has centralized decision-making with functional specialization and the regional settlement hierarchy has at least four tiers. They argue that Erlitou satisfied these requirements while Late Neolithic societies did not.
The region is a large fertile alluvial basin surrounded by hills, mountain ranges, and the Yellow River to the north. Good land quality provided high yields for grains and domesticated animals, and permitted a high population density. After previous Neolithic occupation, the area was abandoned for 500–600 years. While most regions exhibited a decline in population density and social complexity around 2000 BCE, the Yiluo basin was an exception, developing a large urban center and a four-tiered settlement hierarchy. The broader Erlitou culture, classified as Bronze Age rather than Neolithic, includes more than 300 sites with similar material characteristics spread across the middle Yellow River valley. Over 200 of these sites were distributed on the alluvial plains and loess tablelands of the core Erlitou region. The site of Erlitou was probably chosen for the transportation advantages of the Yiluo River system and because surrounding mountains offered some protection. The central site was not fortified, although a few fortified sites in peripheral areas outside the Yiluo basin are known to have existed.
In Phase I the site measures more than 100 ha in area and seems to be the largest site in the Yiluo region and beyond. The population of the site is estimated at 3,500–5,800. “Such rapid population nucleation can be explained only by migration from surrounding areas” (2012, 266). It is unclear where the migrants came from, what led them to move to the site, or whether there was an amalgamation of existing villages in the region. But Rawson (Reference Rawson2017) has suggested that migrants from Shimao may have moved to Taosi and then on to Erlitou, bringing bronze technology with them.
Liu (Reference Liu2004, 235) points out that the Yellow River changed its course around 2000 BCE, that this was a time of flooding in many parts of the Yellow River valley, and that the floods may have been caused by climatic fluctuation. Cohen and Murowchick (Reference Cohen, Murowchick, Renfrew and Bahn2014, 783) refer to “a major cold event at 2000 BCE that may have been one driving factor in the end of Longshan-related cultures and the emergence of early state-level Bronze Age societies.” Wu et al. (Reference Wu2016) have argued that an earthquake caused a landslide damming the Yellow River, which then led to a massive outburst flood around 1920 BCE and may have triggered the formation of Erlitou. Thus a number of scholars cite environmental shifts that could have caused the rise of a state, but there is little agreement on details.
The tool assemblages from Phase I at Erlitou included agricultural and hunting-fishing implements. The population engaged in both food production and crafts, with workshops producing both utilitarian and elite goods. Craft specializations from Phase I included pottery, bone carving, and bronze casting. Elite items included white pottery, ivory and turquoise artifacts, and bronze tools.
By Phase II the site area had increased to its maximum extent of 300 ha and the population is estimated to have increased to 8,300–13,900. A complex of rammed-earth buildings (12 ha) emerged, where the buildings were comparable in size and complexity to the palaces of the later Shang period. Two groups of elite burials were unearthed from the courtyards of a palace. More generally, Liu and Chen emphasize the “sharp contrast between the commoners’ small and semisubterranean houses and poor burials on the one hand, and the elite’s large, rammed-earth palatial structures and rich tombs containing artifacts made of bronze, jade, turquoise, cowry shell, ivory, and kaolinic clay (white pottery), on the other” (2012, 263). A bronze-casting foundry was located close to the palace complex. This area was densely populated by craftsmen and their families. The production of prestige items appears to have been controlled by the state. Agricultural tools continued to make up the largest proportion of tool assemblages. The number of arrowheads grew at a rate similar to other tools, indicating that their primary function was hunting rather than warfare. This phase had a four-tier settlement hierarchy for the Yiluo region as a whole.
Phase III was the peak period for Erlitou, with a marked increase in the numbers of pits, houses, burials, and kilns. The urban center probably had around 18,000–30,000 people. The old palaces were abandoned and six new palaces were constructed in a more organized way, likely requiring large-scale earth moving. Water wells and storage pits were dramatically reduced in the palace precinct, suggesting a larger elite role and more functional specialization. Craft workshops became even more numerous and produced both utilitarian and elite goods, where the latter included greater emphasis on turquoise. While high-prestige goods continued to be produced near the palace complex, probably under state control, bone and ceramic workshops were spread more widely, indicating a continuation of independent craft production. Agricultural tools decreased in proportion to craft goods. The city population probably relied heavily on food from the hinterland, and the total population of the Yiluo region may have reached 54,000–82,000 people. It is unclear whether market systems had developed for utilitarian goods or to what degree the supply of agricultural output to the center reflected a tribute system. The number of arrowheads increased rapidly, suggesting military expansion motivated by a desire to control outlying sources of salt and metals.
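The site areas and population ranges quoted for Phases I to III can be combined into implied residential densities. The Python sketch below is only illustrative and rests on two assumptions of ours: Phase I is treated as exactly 100 ha although the text says "more than 100 ha" (so its densities are upper bounds), and Phase III is assigned the 300 ha maximum extent reached in Phase II.

```python
# Site area (ha) and estimated population range for each Erlitou phase.
# Phase I area is a lower bound on the true area (assumption: exactly 100 ha).
# Phase III area is assumed equal to the 300 ha maximum reached in Phase II.
phases = {
    "I": (100, (3_500, 5_800)),
    "II": (300, (8_300, 13_900)),
    "III": (300, (18_000, 30_000)),
}


def density_range(area_ha, pop_range):
    """Return (low, high) persons per hectare, rounded to one decimal place."""
    low, high = pop_range
    return (round(low / area_ha, 1), round(high / area_ha, 1))


densities = {phase: density_range(area, pop) for phase, (area, pop) in phases.items()}

for phase, (low, high) in densities.items():
    print(f"Phase {phase}: {low}-{high} persons per ha")
```

Under these assumptions the implied density is highest in Phase III, which is in line with the description of that phase as the peak period for the urban center. Because the population figures are themselves estimates, the densities should be read as rough orders of magnitude.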
In Phase IV the site area stayed the same as in Phase III and palatial structures remained in use, with Erlitou maintaining its position as the largest urban center in the region. Quantities of tools in all categories increased except for needles and awls used in domestic crafts. However, the population declined, with craft production becoming less important in relation to food output and residential construction. Arrowhead production grew dramatically. In the late part of Phase IV a large fortified city (200 ha) was built 6 km northeast of Erlitou. The material remains are characteristic of the Erligang (or early Shang) period. Although the issue is controversial, this may reflect a conquest of Erlitou by the Shang. Elite goods production at Erlitou stopped after Phase IV, and the site was reduced to the scale of an ordinary village. It was ultimately abandoned as people moved to settlements linked with the new and larger Shang state.
The role of “the earliest Chinese state” is most often assigned to Erlitou, but some dissenters award this honor to one or another Late Neolithic settlement prior to Erlitou, or to the Erligang/Early Shang state around 1600 BCE. Shelach-Lavi (Reference Shelach-Lavi2015, chs. 7–8) falls in the latter category. We review his arguments partly for the light they shed on Erlitou, and partly to illustrate larger problems surrounding the identification of pristine states.
Shelach-Lavi accepts the idea that flooding around 2000 BCE, perhaps due to a shift in the course of the river, could have led to the collapse of societies in the middle and lower Yellow River regions, which could account for the rise of Erlitou. He is also willing to entertain the argument that such flooding was caused by anomalous monsoon patterns, which simultaneously brought exceptional aridity to North China. Although he does not say so directly, these events could have triggered migration to the Yiluo region. However, Shelach-Lavi expresses reservations about the aridity scenario, noting that it is hard to reconcile with population growth in parts of northeastern China that should have been particularly vulnerable to drought.
Shelach-Lavi’s general description of the Erlitou site does not differ much from that given above. However, he rejects it as the center of a state-level society for several reasons (184–190). First, he questions arguments based on the size of the site, saying that Late Neolithic settlements like Taosi and Shijiahe were of similar size. He also remarks that no state-level monumental architecture has been found at Erlitou. In particular, the so-called “palaces” are small in relation to the palaces of pristine states in other parts of the world and are comparable to a Late Neolithic site in China. The labor investments needed to build the “palaces” were not especially impressive. Similarly, grave sizes and grave goods do not reflect anything qualitatively new. Although Erlitou had a substantial bronze industry, it can be regarded as a continuation of earlier ceramic works rather than a new departure, and was unimpressive relative to the bronze works of the Shang period.
Shelach-Lavi doubts the claimed role of Erlitou in a four-tier settlement hierarchy, maintaining that there is little evidence that the alleged second and third tier centers had any administrative functions. He also points to the absence of large centralized granaries that would have supported royal or elite households. Claims of military expansion in Erlitou Phase III are rejected on the grounds that cultural boundaries need not correspond to political boundaries, and that some of the asserted military activities are unrealistic. In conclusion Shelach-Lavi asserts, “This discussion does not preclude the possibility that a state-level society did emerge in the Yiluo basin during Erlitou phase III. It suggests, however, that there is currently no clear evidence in support of this hypothesis” (190). He argues that the Erligang (or Early Shang) society formed around the time of Erlitou Phase IV does pass this test.
For brevity we will not pursue a discussion of the Shang polity here, although we note that it also has some questionable features. For example, Shelach-Lavi suggests that the Shang lacked institutionalized methods for tax collection and may have financed their “state” through land rent (nominally the king owned all the land; 222). He also describes the Shang administrative system as “generalized” rather than “specialized,” despite giving considerable weight to functional specialization as a defining feature of the state in his preceding chapter. No one seriously doubts that the Shang polity counts as a state, but we raise these points to show that defining a state and identifying archaeological criteria for the presence of a state are not straightforward matters. On balance, however, we do accept what we take to be the majority view, which is that Erlitou was a pristine state.
11.9 Mesoamerica
The region of Mesoamerica encompasses central and southern Mexico, Belize, Guatemala, and parts of El Salvador and Honduras. The development of early cities and states was complex, involving several sub-regions. It is often difficult to assess whether events at a specific time and location should be called “pristine,” given the prevalence of interactions across the region involving migration, trade, warfare, and ideology.
Background conditions include regional population growth and rising numbers of tiers in settlement hierarchies during the Middle Formative (1000–400 BCE). The urban status of Early and Middle Formative centers is debated, but processes of nucleation and centralization at nodes within regional networks were underway (Pool, Reference Chase, Chase, Nichols and Pool2012). Spencer and Redmond (Reference Spencer and Redmond2004) use markers for states including a four-tiered settlement hierarchy, royal palaces and specialized temples, and conquest or subjugation of distant territories. In their view the Olmec society of the Middle Formative period involved chiefdoms and hence we omit a discussion of this case. Spencer and Redmond regard Oaxaca, the Basin of Mexico, and the Lowland Maya area as locations for the emergence of primary states. However, we begin with a discussion of events along the Pacific coast.
On the Pacific coastal plain from Chiapas to El Salvador, horticulture and partial sedentism arose by the Late Archaic period (information in this paragraph and the next is from Love, Reference Love, Nichols and Pool2012). Maize and other cultigens existed by 3500 BCE. Pottery appeared in the Early Formative period (1900–1000 BCE). Increasing sedentism and cultivation led to increasing population, as well as wealth inequality and two-tiered settlement systems by 1700 BCE, suggestive of simple chiefdoms. Soon after 1400 BCE new paramount settlements arose, which appear to be successive capitals of a larger regional polity. In the Middle Formative (1000–400 BCE) population growth continued, probably due in part to improvements in the characteristics of maize, leading to greater density and larger settlements. The largest political systems were complex chiefdoms, which had incipient urbanism, rigid social stratification, and structured regional hierarchies.
This trajectory climaxed with fully urban states throughout the coastal plain and in the highlands during the Late Formative (400 BCE–200 CE). The cities were larger than those of the Middle Formative and spread over a larger area. This “Southern City-State Culture” had numerous urban centers of over 4 sq km with hinterlands generally less than 1,000 sq km. Love refers to these polities, which were linked by local and long-distance trade, as “micro-states.” Each had a core with massive public construction, and several were focal points for settlement hierarchies that had four or more tiers. Love does not speculate about the causality behind this process of city-state formation. In particular he does not discuss the possible roles of craft manufacturing or warfare. He does, however, suggest that drought brought an end to this system by 200 CE.
The nearby area of Oaxaca is the center of a zone called the “southern highlands” (information is from Elson, 2012). The timing for the emergence of stratification and hereditary leadership in the southern highlands is debated but may date back to around 700 BCE. During the Middle Formative, chiefdoms had varying population sizes and degrees of social complexity.
The Oaxaca Valley is divided into three areas that together form a Y shape. In the period 700–500 BCE there is evidence for ranked societies that had three-tier settlement hierarchies. The Middle Formative is known for widespread warfare among chiefdoms, with raiding, temple burning, and taking of captives for sacrifice. An uninhabited buffer zone developed at the center of the valley where the three branches met. Between 500 and 100 BCE several sites in the region became urban centers, with state formation occurring by the end of this period.
The settlement of Monte Albán dates to 500 BCE and was located at the center of the Oaxaca Valley, on a mountain 1,400 m tall at the juncture of the valley’s three arms. Ongoing warfare among the chiefdoms in the valley branches, as well as external threats, made this defensive location attractive. Some authors also emphasize the ceremonial and religious significance of the site. The initial population of 5,000 tripled over the next four centuries. Specialized state architecture is apparent around 100 BCE–200 CE but likely dates back to 300 BCE. State formation is agreed to have occurred by 300–100 BCE. Expansion outside the valley began before the rival polities within the valley had been defeated. By 150 BCE the capital of one rival polity had been burned and abandoned. The last holdouts within the Oaxaca Valley were incorporated by 200 CE. This led to Zapotec civilization, which lasted for five more centuries. At its peak Monte Albán had a population of 15,000–30,000.
Balkansky (2014) adds a few elements to this review. Craft specialization was an early development in Oaxaca, dating to 1000 BCE and correlated with rising inequality. Important products included shell jewelry, pottery, chipped stone, and textiles. Around the time of its urbanization process, Oaxaca was an important demographic center for Mesoamerica as a whole. Chiefly competition and warfare date at least to 600 BCE, with evidence including depictions of slain captives, unoccupied buffer zones, burned public buildings, and trophy taking. Toward the end of the Middle Formative there were many sites where chiefdoms gave way to urbanism: “The movement of entire populations from valley floor to terraced hilltops characterized Oaxaca’s urban revolution” (1032). Urbanism at Monte Albán was a defense against external threats. This city-state covered 6.5 sq km of rugged hilltops, and controlled multiple valleys and ethno-linguistic groups.
The Central Mexican Highlands have a rich natural environment with alluvial plains and lakes surrounded by volcanic mountain ranges. The climate is temperate due to an altitude of about 2,000 m above sea level, and rainfall averages 450–900 mm annually (information in this paragraph is from Sugiyama, 2012). A wide variety of plants were domesticated by 5000 BCE. During the Formative and Early Classic periods, irrigation and terracing were applied to maize, beans, squash, and other crops. Raised agricultural fields sustained increasing populations. Dogs and turkeys were domesticated, with deer, peccaries, rabbits, and other species providing additional animal foods. By the Middle Formative improving technology had led to growing population and social complexity, with craft industries, market economies, and several populous villages.
In the Late Formative, by 250–100 BCE, the Basin of Mexico had one location with 20,000–40,000 people (Teotihuacan) and a second with about 20,000 (Cuicuilco); see Spencer and Redmond (2004). The latter was focal for a cluster of sites exhibiting a four-tiered hierarchy and had several monumental buildings. It is less certain whether Teotihuacan was focal for a four-tiered hierarchy during this period or had similar large structures. Some authors argue that the site locations in the Basin of Mexico reflected a concern with defense, but there is no evidence for fortifications, warfare, or extension of control to distant areas around this time. Cuicuilco was destroyed by a volcanic eruption between 1 CE and 400 CE, and the duration of its overlap with Teotihuacan is uncertain (Sugiyama, 2012). Volcanic activity may have resulted in large-scale migration into the Teotihuacan Valley but this is likewise unclear (Clayton, 2015, 282–283).
According to Sugiyama, Teotihuacan began as a regional ritual center. By about 200 CE, it had 60,000–80,000 inhabitants, and some of its neighborhoods were occupied by migrants from distant areas. Construction followed a master plan starting around 200 CE. Offerings of projectile points and other war-related objects at major temples, as well as human sacrifices of prisoners of war, suggest that military institutions were important to city governance. “Ample data” indicate craft specialization and stratification. Its peak population may have been 100,000–150,000. The city was eventually destroyed through warfare by unknown enemies at an unknown date.
Manzanilla (2014) adds the following. Population growth in the Basin of Mexico was sustained and substantial during 900–250 BCE. This was accompanied by the rise of settlement hierarchies with regional centers, large villages, small villages, and hamlets. Clustering of settlements with empty buffer zones and the use of sites on mountaintops suggest some degree of political hostility. However, the subsequent population growth at Teotihuacan in the northeast part of the basin “should be seen not as a forceful act or the effect of conquest … but the natural consequence of a large population shift” (991), probably associated with groups fleeing from volcanic eruptions to the south. By 200–350 CE, Teotihuacan was exceptional for its size, urban planning, settlement pattern (a huge city surrounded by rural sites), corporate elite strategy (including co-rulership), and multiethnic composition. Craft production occurred on multiple scales: everyday needs in apartment compounds, extensive craft sectors in the periphery to produce items for the urban population, specialized identity markers produced in barrio sectors supervised by noble “houses,” and specific crafts from workshops under the control of rulers.
We close this section with a discussion of the southern Maya lowlands, covering parts of Mexico, Belize, and Guatemala (information in this paragraph is from Chase and Chase, 2012). The first sedentary Maya lived in fully formed villages dated to 1200–900 BCE. By sometime after 600 BCE ceramics and architecture across the region became more standardized and recognizably “Maya.” In the Late Preclassic period (300 BCE–250 CE) raised agricultural fields were in use, and in the Classic period (250–800 CE) there was extensive terracing of the landscape.
Freidel (2014) emphasizes the emergence of large urban centers like El Mirador during the later Preclassic. He calls these the capitals of “kingdoms.” One rich tomb at Tikal, dating to 200 CE, may have been for the founder of the royal dynasty of this city. Terms like “royal” and “kingdoms” suggest the existence of states, but Freidel does not offer much further archaeological evidence for the existence of states in the usual sense. He argues that the boundary of the Preclassic and Classic periods, around 200–250 CE, was marked by catastrophe, with the collapse of several political capitals. The possible reasons include environmental degradation, deforestation, climate change, and drought.
By 250 CE, at the start of the Early Classic, developments included elite tombs, political control by elite families, and stone monuments with written texts (information is from Chase and Chase, 2012). The latter indicate founding dates for political dynasties ranging from 100 CE to 426 CE. Burial data reveal ranked societies, and in some places stratification, throughout the lowlands; “rulership was a prerogative of a small elite group at each site” (260). Architectural complexes described as palaces became widespread and state-level societies were “surely achieved” by this time. There were numerous small city-states, with Tikal being the preeminent site of the Early Classic period.
In the Late Classic (550–800 CE), Maya cities reached their maximum size, with Caracol and Tikal each having landscapes of about 200 sq km containing about 100,000 occupants. Chase and Chase describe these as “low-density cities” comparable to other examples of tropical urbanism in Southeast Asia and Africa. Spatially distinct areas of public architecture were linked by causeways and continuous residential settlements. A market economy probably existed.
Chase and Chase do not suggest any causal explanation for state formation in the Maya lowlands. Based upon the existence of palaces, temples, and a four-tier hierarchy, Spencer and Redmond (2004) agree that states had formed by around 250–500 CE. They likewise refrain from causal explanations, although they do not cite evidence for warfare or conquest during this period.
11.10 South America
Burger (2014) offers an account of early Peruvian developments, both along the coast and in the highlands, up to about 50 BCE. The story is one of initially simple food technologies and egalitarian social systems, which over time evolved in the direction of more complex technology, larger settlements, monumental architecture, the formation of elites, and (toward the end) strong evidence for warfare. This trajectory is the backdrop for subsequent processes of urbanization and state formation.
Stanish (2001) surveys the archaeological evidence for pristine state formation in South America. The key areas are the Peruvian coast and the central Andean highlands. Stanish downplays monumental architecture as an indicator of early states, arguing that non-state societies like chiefdoms are capable of mobilizing the required labor. He dates the first fully sedentary and complex societies on the Pacific coast of Peru to 3000–2500 BCE, with the first stratified societies in South America arising soon after. Most scholars describe these as simple chiefdoms. More complex but non-urban chiefdoms emerged in the period leading up to 200 BCE. These societies lacked the socioeconomic hierarchies normally found in connection with states.
According to Stanish (2001), the first states were Moche, Tiwanaku, and Wari, which arose in the Andes during the first millennium CE. These three cases were largely independent of one another. Stanish remarks that Moche was “perhaps the first true city in the Andes” (53) and was the center of a multi-valley polity. The view that Moche had state-level organization is controversial (Quilter and Koons, 2012), but for the purposes of the following discussion we will accept the claim that all three societies had states. Tiwanaku and Wari had similar site sizes and arose more or less simultaneously, not long after Moche. For detailed descriptions of Tiwanaku and Wari, see Quilter (2014, ch. 8).
Again according to Stanish (2001), all three states had elites, palaces, large urban capitals, settlement hierarchies with at least four tiers, economic specialization, and total populations for each polity from 50,000 to 200,000. The capital cities resembled Uruk and Teotihuacan. State formation occurred in unusually productive environmental zones where it was possible to intensify agriculture fairly easily, including through irrigation. Direct elite control of irrigation was probably not a significant factor. There was some gradual population growth but not population pressure within circumscribed areas. All three states have strong evidence of warfare involving iconography, physical remains, and defensive architecture, as well as good evidence for colonization of distant locations.
Moore (2014, ch. 9) believes that Moche society developed from multiple origins along the Peruvian coast in the period 200–850 CE. The extension of agricultural fields through improved and reliable irrigation technologies led to a rich elite culture. Moore argues that a centralized state emerged in the southern valleys of the Moche culture area, although he acknowledges that this view is controversial. The capital city (also known as Moche) was centered around two complexes of temples and plazas, with densely packed workshops and dwellings covering about 100 ha on a flat plain between these complexes. Roads had a grid system and in each block there were workshops that produced ceramics, metal objects, semi-precious stone ornaments, and textiles. “Craft production was specialized and organized” (323). The elite mobilized commoner labor for public works projects such as irrigation canals. The Moche state was followed by secondary states in the same region that need not be discussed here.
According to Moore (2014, ch. 9), the Wari state arose in the Central Andes about 550–600 CE and lasted until about 1000 CE. It is unclear whether Wari was actually a pristine state or coalesced from earlier settlements that had already achieved state-level organization. The capital city of Huari was located in the Ayacucho Valley, which was agriculturally fertile and heavily reliant on maize. Its urban core covered 2.5 sq km and was surrounded by 15 sq km of less monumental buildings and residences. Elite culture was reflected in fine craft production. Infrastructure included compounds, storerooms, and roads. The Wari constructed local administrative centers elsewhere in their empire, which probably extended to sites in Peru more than 900 km from the capital. Outposts were often located on impregnable hilltops.
Moore (2014, ch. 9) dates the Tiwanaku society around the Lake Titicaca basin to about 400–1100 CE and describes it as a rival of Wari. The core of the capital city (also called Tiwanaku) was a large planned area of pyramids and plazas surrounded by 4–6 sq km of middens. Population estimates vary but Moore believes the capital had 15,000–25,000 inhabitants. The elite lived in a distinct residential area near sacred spaces, with elaborate architecture and rich burials. The remainder of the population lived in denser settlements stretching from the city core to Lake Titicaca. Raised fields were a crucial component of the agricultural economy. About 100,000 people in the lake basin could have been fed by these fields. The extent of the fields increased as the Tiwanaku empire coalesced and expanded. The empire displayed explosive growth during 725–1000 CE, when loosely integrated colonies were converted into a system of centrally governed provinces. Climate change led to prolonged droughts during 950–1100 CE that brought a massive reduction in population and an end to the empire.
11.11 General Patterns
In this section we attempt to distill some general patterns from the regional cases sketched above, as well as the case of southern Mesopotamia from Chapter 9. Patterns of this kind impose useful constraints on theory. Some causal mechanisms that can account for these patterns will be discussed in Section 11.12. Six regions do not constitute a large sample, so the ideas developed in these two sections should be regarded as hypotheses or conjectures to be tested through more research, both for the regions we have discussed and for other regions not covered here.
Agricultural Productivity:
Almost every account of pristine state formation in a given region includes a comment to the effect that fertile agricultural land was plentiful. Depending on the case this may involve alluvial plains, volcanic soil, or the like. We are unaware of pristine states based exclusively on hunting, gathering, and fishing, although the latter activities were often important supplements to agriculture. Indeed, it has been argued that in the early stages of urbanization at Uruk, food from wetlands was primary and agriculture was secondary (see our remarks on Algaze and Pournelle in Chapter 9).
Stratification:
Every account of pristine state formation we have seen reports the existence of elite and commoner classes. Furthermore, these class divisions existed prior to state formation and were not simply a product of it. We have seen no account where a state arose in an egalitarian society. As with a highly productive food technology, pre-existing stratification appears to be a necessary condition.
Urbanization:
As a conceptual matter we can imagine cities without states and states without cities. However, we are struck by the recurring role of cities in all of the regional cases we have investigated. From an empirical standpoint pristine cities and pristine states are closely intertwined. We are not quite ready to claim that cities are a necessary or sufficient condition for states, but pristine states assembled entirely by the political unification of rural territories appear to be rare.
We do want to acknowledge the possibility of counterexamples. Kirch (2010) asserts that Hawaii was an example of state formation and one could argue that cities had no role in this process, but the Hawaiian state was not pristine (Kamehameha I had access to European weapons and advisors). Other examples of state formation from historical or ethnographic accounts, such as those used by Flannery (1999) and Flannery and Marcus (2012), have similar difficulties. Hansen (2008, 69–70) suggests some examples of early states without cities and early cities without states that seem less problematic. We grant that such cases can be found, but believe that theories linking states with cities are likely to have greater explanatory power than theories that ignore the catalytic role of cities in the process of state formation.
Migration:
In several (although not all) of the regional cases we studied, experts commented that the agglomeration of population in a compact area was too rapid to have occurred without migration from external sources. At least in these cases this means that urbanization probably had a short-run trigger and did not require Malthusian population growth over several generations. In some cases rural settlements located near emerging cities did not lose significant population, indicating that migrants must have come from farther afield. Of course, the importance of rapid migration in some cases does not rule out the potential relevance of slower migration flows extending over many generations, or long-run Malthusian dynamics, in other cases.
11.12 Three Pathways
Assuming the reader is willing to grant the existence of recurrent patterns across regions of the world, we take up the challenge of explaining these patterns. No theory will ever explain every particularity of every case, but there is no reason to believe that attempts to formulate general explanations are somehow intellectually illegitimate or doomed to failure. In our view there are three principal mechanisms that help to explain the patterns we have described. All build upon models we developed earlier in the book. We call them (a) the property rights hypothesis, (b) the elite warfare hypothesis, and (c) the environmental shift hypothesis. We will describe each of these causal mechanisms and discuss how it applies to regional cases. But first we make a few general points.
All three pathways to the state rest upon the theory of stratification from Chapter 6. This theory treated technological innovation in food production as exogenous, used Malthusian principles to derive an implication of long-run regional population growth, and showed that such population growth can eventually lead to the formation of elite and commoner classes at the best sites. Section 11.11 argued that stratification of this kind is a necessary condition for pristine state formation. Section 11.11 also argued that a good natural environment for food production is a necessary condition. The latter factor tends to support high regional population and makes it more likely that technological advances will eventually generate the population levels required to induce stratification.
We generally expect technological innovation and regional population growth to be observed prior to pristine state formation, because these factors drive stratification and the latter is a precondition for the state. Such innovations could include improvements in the characteristics of domesticated plant or animal species, or adaptation of these species to local conditions; improvements in sowing, plowing, and harvesting; or investments in irrigation and terracing systems. We expect technological developments of this kind to result in population growth that should become archaeologically visible through a higher settlement density and/or larger settlements as measured by geographic size or number of inhabitants. Stratification should become visible in the usual ways: more inequality with respect to housing, burials, nutrition, health, access to exotic items, and so on. Although our model in Chapter 6 did not capture the idea of settlement hierarchies, as an empirical matter such stratification is usually associated with simple or complex chiefdoms having two- or three-tiered hierarchies.
Another preliminary remark involves urbanization and craft manufacturing. Any causal mechanism that leads to urban agglomeration is likely to stimulate manufacturing. The reason is summed up in Adam Smith’s adage that the division of labor is limited by the extent of the market. Because urbanization creates large local markets, it encourages craft specialization as long as food producers are willing to exchange some of their output for manufactured goods. Processes of learning by doing then tend to raise manufacturing productivity, reinforcing the trend toward urbanization. We have also argued that urban manufacturing can sometimes be easier to tax than rural agriculture, so these dynamics can lead to city-states.
The three hypotheses we will discuss below have these features in common, but differ with respect to the causal trigger that initiates the agglomeration process.
The Property Rights Hypothesis:
We showed in Chapter 6 that technical progress and the resulting population growth make commoners worse off (see especially Section 6.8). The reason is that as technology improves and the regional population rises, more sites become closed. Due to the endogeneity of property rights, fewer sites remain in the commons, these sites are the least desirable ones, and the commoner standard of living falls. This drives down the wage offered to commoners by elites at stratified sites.
There are several implications. First, the declining wage means that elites want to hire more commoners, so the populations of the high-quality sites will grow. Second, we showed in Chapter 10 that a falling wage can trigger a shift toward urban manufacturing. In Chapter 10 the falling wage was the short-run result of a negative climate shock while here it is a long-run result of technological advance, population growth, and endogenous property rights. The result, however, is much the same. Elites eventually allocate some labor to craft manufacturing, this stimulates urbanization, and the visibility of inputs and outputs in the urban sector makes it easier to collect taxes there. This yields a city-state.
If there are several good sites of roughly equal quality within a region, we would expect the property rights mechanism to generate several small city-states, one for each local elite in control of a high-quality site. Because market power is less significant in a region with competing city-states, in this situation we expect taxation to be motivated less by exploitation of monopoly or monopsony positions in the markets for goods and labor, and more by the provision of local public goods of interest to the elite. These predictions could, however, be overturned if geography implies the existence of a single good site, or increasing returns to scale at the level of the city strongly outweigh transportation costs. In the latter situations, a single dominant city could arise. But in any event, because the property rights hypothesis relies upon Malthusian population growth we would expect city-state formation to be a gradual process extending over multiple generations, not an abrupt response to migration flows.
The Elite Warfare Hypothesis:
In Chapter 8 we showed that stratified societies are prone to chronic conflict over land rents. This may either take the form of open warfare or credible threats to engage in such warfare. The key parameter in determining which of the two is more likely is the degree of stratification. We showed that if land rents are of moderate size relative to commoner wages (and thus relative to the cost of a hired army), there are equilibria where intimidation tactics can succeed without any need for open war. However, when land rents are large relative to the commoner wage, equilibria must have a positive probability of open warfare. In a region with many production sites the elites controlling the best sites are the most likely to engage in open warfare with one another, because these are the sites with the highest land rents relative to the region-wide wage.
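The regime distinction in the preceding paragraph can be illustrated with a minimal sketch. It is not the Chapter 8 model: the function name `conflict_regime`, the cutoff `threshold`, and the site values are hypothetical placeholders, chosen only to show how the ratio of land rent to the region-wide wage sorts sites into the two cases described above.

```python
def conflict_regime(land_rent, wage, threshold=3.0):
    """Label a site by the ratio of its land rent to the region-wide wage.

    `threshold` is an arbitrary illustrative cutoff (an assumption of this
    sketch). The text says only that "moderate" ratios permit equilibria
    with intimidation alone, while "large" ratios imply a positive
    probability of open warfare.
    """
    if land_rent < 0 or wage <= 0:
        raise ValueError("land_rent must be non-negative and wage positive")
    ratio = land_rent / wage
    if ratio > threshold:
        return "open warfare has positive probability"
    return "intimidation can succeed without open war"


# Hypothetical land rents at three sites facing a common commoner wage.
wage = 1.0
site_rents = {"best site": 6.0, "good site": 3.5, "marginal site": 1.5}

# Rank sites from the highest to the lowest rent-to-wage ratio.
ranked = sorted(site_rents, key=lambda s: site_rents[s] / wage, reverse=True)
for site in ranked:
    print(site, "->", conflict_regime(site_rents[site], wage))
```

Holding the wage fixed, the sites with the highest rents cross the cutoff first, which is the sense in which elites at the best sites are the most prone to open conflict.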
Successful elites engage in territorial expansion. In principle this expansion can yield a geographically extensive agrarian state without cities. However, a region having serious threats of warfare can be expected to have some degree of urbanization due to the benefits of agglomeration at defensible locations, and the fact that the large size of a city in itself tends to deter attack. As with the environmental shift hypothesis to be discussed below (although for different reasons), the threat of warfare can generate an implosion of the regional population from the periphery to one or more centers. We do not expect any similar regional implosion under the property rights hypothesis.
Our analysis of elite warfare in Chapter 8 did not focus on the distinction between short-run and long-run equilibria because warfare was driven by wages and land rents, not directly by regional population. But if technological progress in agriculture and resulting Malthusian population growth lead to high land rents for the elite and low standards of living for commoners, as we expect from Chapter 6, in the long run this will exacerbate elite warfare, promote territorial expansion, and encourage city-state formation.
The primary factor restraining elite warfare (or intimidation) is defensive military technology, which can potentially deter attacks. Accordingly, it is crucial not to infer the prevalence of warfare from evidence of defensive locations or fortifications. When these preventive measures are effective, deterrence works, and elite warfare shuts down. Better indicators of active warfare include skeletal evidence for mass violence; settlements that were sacked, destroyed, or conquered despite defensive investments; and perhaps artistic images of conquest. Of course, these indicators may be absent in cases where territorial expansion and political unification occur through intimidation rather than open warfare. Although effective defensive technology does restrain territorial expansion through overt coercion or covert threats, it does not rule out the formation and stability of autonomous city-states, which may well arise in response to threats of warfare (Gat, 2006, 278–293).
Given the possibility that defensive military technology could stabilize a region with competing elites, we should consider some factors that might destabilize a peaceful equilibrium. One potential trigger for warfare is a climate shock, which could increase the temptation for one elite to attack another. A second potential trigger is a change in production technology rendering agricultural output a more tempting target for predation (perhaps involving more storage or more transparency in the production process), which could strengthen the incentive for elites to engage in violent competition over land rents. A third might be a change in military technology that favors attackers over defenders.
The Environmental Shift Hypothesis:
We already discussed this causal mechanism at length in Chapter 10 and it is not necessary to repeat that discussion here. As with the property rights hypothesis, the key idea is that the commoner wage falls, triggering elite interest in urban manufacturing and the tax revenue obtainable from it.
These two hypotheses differ because the property rights mechanism requires long-run Malthusian population growth, while environmental shifts can operate as triggers for urbanization and state formation through short-run migration (recall that we held regional population constant for most of Chapter 10). A second difference is that in the property rights hypothesis, falling wages result from rising agricultural productivity. This is due to Malthusian population growth, endogenous property rights, and the contraction of the commons. But in the environmental shift hypothesis, falling wages result from falling agricultural productivity due to a deterioration in natural conditions, at least for a subset of vulnerable sites in the region. A third difference is that an environmental shift could lead to city-state formation even if property rights at individual sites do not change (it is unnecessary for additional sites to become closed and for the commons to contract).
We prefer the term “environmental shift” to “environmental shock” because while short-run shocks could be sufficient for city and state formation, a gradual deterioration in the natural environment unfolding over centuries can have similar long-run consequences. The most obvious kind of environmental shift is increasing aridity, which drives a wedge between the productivity of outlying areas in the commons (dependent on rainfall) and refuge areas controlled by elites (with access to rivers or irrigation systems). But similar effects could result from alterations in river courses, volcanic eruptions, changing disease conditions, exogenous ecological disruptions, or endogenous environmental degradation such as soil loss due to deforestation. Any of these factors could potentially change the relative productivities of the sites within a region, motivate migration from the commons to sites controlled by elites, and depress the commoner wage to a level that causes local elites to pursue urban manufacturing. These effects will be larger when the pre-existing regional population is larger, when the productivity wedge between the commons and the refuge sites is larger, and when there are few good refuge locations.
The environmental shift hypothesis makes different predictions from the previous two hypotheses. In contrast to the property rights hypothesis, it predicts agglomerations of population in refuge areas sheltered from the changing natural environment rather than a gradual proliferation of small city-states at sites with permanently high productivity for geographic reasons. In contrast to the warfare hypothesis, it predicts intensive growth in areas that are relatively immune to environmental decline, rather than formation of city-states in places that are relatively immune to attack, or extensive growth of a unified state over a broad geographic area.
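The contrasting predictions of the three hypotheses can be collected in a small lookup table. The sketch below is only a bookkeeping device under stated assumptions: the field names, the string labels, and the matching rule in `candidate_hypotheses` are our own illustrative choices, and the entries paraphrase the qualitative statements made in this section rather than any formal result.

```python
# Qualitative predictions paraphrased from Section 11.12.
# Field names and labels are assumptions made for this sketch.
PREDICTIONS = {
    "property rights": {
        "regional_implosion": False,
        "favored_sites": "permanently high productivity",
    },
    "elite warfare": {
        "regional_implosion": True,
        "favored_sites": "defensible",
    },
    "environmental shift": {
        "regional_implosion": True,
        "favored_sites": "refuge from environmental decline",
    },
}


def candidate_hypotheses(observed):
    """Return the hypotheses whose predictions do not conflict with `observed`.

    `observed` maps a subset of the field names above to values. A
    hypothesis is kept unless one of its predictions differs from an
    observed value; fields missing from `observed` are ignored.
    """
    matches = []
    for name, predicted in PREDICTIONS.items():
        conflict = any(predicted[key] != value for key, value in observed.items())
        if not conflict:
            matches.append(name)
    return matches


# Hypothetical observation: population imploded toward defensible hilltops.
print(candidate_hypotheses({"regional_implosion": True,
                            "favored_sites": "defensible"}))
```

A real case may of course combine mechanisms, so a table of this kind can narrow the candidates but cannot by itself decide among them.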
We also point out that according to the environmental shift hypothesis, people are pushed from the periphery (commons) to the center (refuge area). We are doubtful that a magnetic pull emanating from the center can give similar results because the productivity of urban manufacturing is unlikely to rise without learning by doing. In turn, learning by doing is unlikely to occur when an urban manufacturing sector has not yet developed. An external trigger is therefore needed in order for the urbanization process to get underway. This is similar to the role of climate change in pushing Upper Paleolithic societies out of stagnation traps (see Chapter 3).
11.13 Applications
The preceding section described three pathways leading to cities and states. In this section we discuss how our hypotheses might apply to the six regional cases we have surveyed. In every case there is strong evidence for stratification before state formation, so for brevity we omit any comments on this factor.
Mesopotamia:
As discussed in Chapter 9, there is no evidence of warfare leading up to the formation of the Uruk city-state. The preceding ’Ubaid period had little or no warfare, and the first wall at Uruk was constructed only after the formation stage. In the ensuing centuries there was warfare among city-states, and after almost a millennium the region was unified politically through conquest, but all of this came much later. We do have reasonable evidence for an environmental shift story based on increasing aridity, a migratory response where some people moved to elite-controlled areas in the south, and resulting urbanization based on craft manufacturing and trade.
Egypt:
The evidence for warfare in the period leading to state formation in Upper Egypt is a bit more pronounced than in Mesopotamia, with not only fortified city-states but also vivid artistic depictions of warfare. However, we do not know of any evidence for warfare in the relevant time period based upon skeletal trauma or destroyed cities, and some experts emphasize politics and culture as key factors in the unification of Upper and Lower Egypt. Whatever the role of warfare may have been, Egypt clearly experienced a trend toward greater aridity (the same trend that affected Mesopotamia), migration away from the desert toward the elite-controlled Nile valley, and urbanization based upon craft manufacturing and some trade. Thus the emergence of early city-states was likely driven by environmental factors, even if their later unification required the use or threat of force.
In the early (pre-Dynastic) stages, both Egypt and Mesopotamia had urbanization and substantial craft manufacturing. But in Egypt independent city-states did not last for very long, while in southern Mesopotamia this stage lasted for almost a full millennium prior to political unification. One possible explanation involves differences in geography such as tighter circumscription in Egypt. Another involves the greater reliance on cereals in Egypt, with the Nile flood plain being ideally suited for cereal agriculture, while the floods in southern Mesopotamia were poorly timed and demanded more costly forms of irrigation. A third explanation is that in Egypt it was easy to infer cereal output from the height of the flood (Mayshar et al., 2017, 2022), so a centralized state could be based mainly on tax revenue derived from the agricultural sector. In Mesopotamia, by contrast, it may have been harder for elites to tax agricultural output, so at least in the early stages Mesopotamian elites were more dependent on the tax revenue from urban manufacturing. Such factors may have slowed political unification in southern Mesopotamia by limiting the incentive for a local elite at one city-state to attempt the conquest of a rival city-state.
The Indus Valley:
The principal cities had walls but their purpose is controversial and it might not have been to deter attacks. There is no evidence of open warfare during the pre-urban period, and little evidence around the start of the urban period as city-states were forming, aside from ambiguous ash layers at a subset of sites. There is no evidence that one large city-state attacked another in the ensuing centuries, and little evidence that the region was a territorially unified state. Some researchers argue that increasing aridity contributed to city-state formation, but others disagree or maintain that adequate evidence is lacking. The jury clearly remains out on the environmental shift hypothesis, although this would be consistent with evidence for migration to elite-controlled areas near rivers. If the elite warfare and environmental shift theories are both rejected, we would have to fall back on the property rights hypothesis and argue that improving technology led to a gradual demographic expansion, closure of good sites, stratification, falling wages, and urban manufacturing. We leave it for the experts to evaluate the merits of this scenario.
China:
There is little doubt that stratified societies engaged in considerable warfare during the millennia leading up to state formation at Erlitou. The evidence includes not only fortifications but also weapons and skeletal trauma. On the other hand, there seems to be no direct evidence of warfare from the first two centuries of Erlitou. In its third and fourth centuries Erlitou appears to have engaged in territorial expansion through military means, and may have been conquered by a more powerful rival. Most scholars agree that Erlitou arose largely through migration and that craft manufacturing played an important role. It seems possible that the initial migrants were fleeing from warfare. There are also suggestions that environmental shifts could account for the rise of Erlitou, such as colder conditions, changing monsoons, flooding, or drought, but better evidence is needed.
Mesoamerica:
This region is complex and two or all three of our hypotheses may apply. The “Southern City-State Culture” along the Pacific coast resembles the cluster of microstates we would expect under the property rights hypothesis, and the same may be true for the initial set of small city-states in the Maya lowlands. In the absence of direct evidence for environmental shocks or warfare at the time these city-states were forming, we lean toward that interpretation. This does not contradict strong evidence for warfare and fortifications later in Maya history (Gat 2006, 282–284). On the other hand it is quite clear that the rise of Monte Albán in Oaxaca occurred in a context of warfare and that the city arose for defensive reasons. We also note the prominence of territorial expansion in this case. Finally, the story for the basin of Mexico is unclear, but some authors put little weight on warfare in the early phases of Teotihuacan, instead stressing volcanic eruptions and migration effects, which could be consistent with our environmental shift hypothesis. Military conflict clearly became important later, but more evidence would be needed to make a case that it played a significant role in the initial establishment of this city-state.
South America:
Moche, which in the view of some scholars was the first pristine state along the Peruvian coast, seems to have arisen through warfare. Evidence includes not just fortifications but also extensive artistic depictions, weapons, and practices often associated with warfare such as human sacrifice of prisoners. Similar remarks apply to the later states in the Andean highlands, Wari and Tiwanaku. In all three cases territorial expansion was significant. The rapid growth of large capital cities rules out the property rights hypothesis, which predicts a gradual rise of city-states on a Malthusian time scale. We are not aware of any credible evidence for environmental factors, and conclude that the warfare hypothesis provides the best fit.
In sum, for all six examples of pristine state formation we find a close link with urbanization. The causal factors behind the development of city-states vary by region. Increasing aridity appears to have been the main driver for Mesopotamia and Egypt. It is possible that environmental factors played a role in the Indus Valley and northern China, but the situation is unclear. Mesoamerica is complicated, with certain subregions where the property rights hypothesis may be a good fit, at least one subregion that supports the warfare hypothesis, and substantial uncertainty about state origins in the basin of Mexico. The three cases from South America all appear consistent with the warfare hypothesis. It would be of great interest to expand the data set to study the relative frequency of these causal mechanisms in other regions of the world such as Africa, Southeast Asia, Europe, and North America.
11.14 Conclusion
Our exploration of the institutional trajectories of prehistory has taken us from the origins of inequality (Chapter 6) and the emergence of warfare over land (Chapters 7–8) to the rise of cities and states (Chapters 9–11). With the last two developments we reach what is often called “civilization.” In particular we reach the technological innovation of writing, which brings prehistory to an end.
We will not attempt to summarize everything that has gone before, but we remind the reader of a few general points. First, our theoretical approach throughout Parts III and IV of the book has been based on what we call “technologies of coercion.” These come in three varieties: A technology of exclusion (Chapter 6), a technology of combat (Chapters 7–8), and a technology of confiscation (Chapter 10). We have shown that even relatively simple formalizations of these ideas can lead to rich modeling frameworks. We hope the reader has been persuaded that these exercises can enhance our understanding of the data.
Stepping back from the math, our argument is fundamentally about the manner in which elite groups arise and how they pursue their joint interests. This takes for granted that such elite groups are organized enough to overcome the coordination and free rider problems bedeviling all forms of collective action. Although we do not make universal claims about the organizational coherence of elites, they have several advantages: elites are relatively small numerically, they tend to arise in geographically compact places (at least initially), and their members are quite likely to engage in repeated interaction, with strong information flows within the group and the capacity to impose large penalties on non-cooperators through ostracism (expulsion to the commoner class).
Commoners, in contrast, tend to be numerous and possibly mobile, perhaps with weaker information flows among members and less ability to impose large penalties via ostracism. Commoners may also have more heterogeneous objectives than elite agents, who are likely to share an interest in the maximization of land rent or profit. Moreover, once an elite class emerges, it has a strong incentive to discourage effective organization within the commoner class and to promote ideologies legitimizing its dominance. In this view, the pristine state represents the victory of the organized over the unorganized.
11.15 Postscript
Chapter 11 was largely written in the summer of 2020. The formal models from Chapters 6, 8, and 10 were constructed before we read most of the literature on the five regional cases discussed in this chapter.
The following acknowledgments apply to all of Part IV (Chapters 9–11). In fall 2018 we visited Harvard University at the invitation of two members of the economics department, Nathan Nunn and Melissa Dell. During that semester we presented our work on southern Mesopotamia to the economic history group. We are grateful to the seminar participants, especially Claudia Goldin, for helpful comments. During the same semester we visited the Standing Committee of Archaeology at Harvard at the invitation of Rowan Flad, who provided valuable guidance on China. We attended a course offered by Jeffrey Quilter and Jason Ur called “Urban Revolutions,” and both were generous in responding to our many questions. We thank Gojko Barjamovic for feedback on the Mesopotamian part of our research.
From January until March 2019 we visited the Institute of Advanced Studies at University College London, hosted by the director Tamar Garb. IAS was a friendly and stimulating research environment. We attended weekly meetings at IAS and presented our work on southern Mesopotamia in a seminar there. Stephen Shennan of the Institute of Archaeology organized our overall visit to UCL and invited us to give two seminars at the Institute. We thank the audiences at both IAS and the Institute of Archaeology for the valuable comments and suggestions we received, and especially Stephen Shennan for his persistent attempts to educate us about archaeology.
Guillermo Algaze was highly generous in reading multiple drafts of Chapter 9 and responding with very extensive comments. Archaeological experts who provided helpful suggestions for Chapter 11 include Jonathan Mark Kenoyer, Li Liu, Jeff Quilter, David Wengrow, and Norman Yoffee. Among economists we thank Omer Moav for comments on Chapter 9 and Louis Putterman for comments on Chapters 9 and 11. Due to looming deadlines and page constraints, as well as our own intellectual idiosyncrasies, we did not follow every piece of advice we were offered, but we deeply appreciate all of it.
As mentioned in Section 11.8, James K.-S. Kung shared an unpublished literature review on warfare in Neolithic China. Huiqian Song assisted in our discussion of Kung’s data and created the graphs in Chapter 10. Michael Straw was a research assistant in the early stages of our work on Part IV.
We also thank the Social Sciences and Humanities Research Council of Canada for generously funding our visits to Harvard and UCL. All of the chapters in this book, but especially Chapters 9–11, were substantially improved as a result. However, we are solely responsible for the content. None of the individuals or organizations listed in this postscript should be blamed for our interpretations or mistakes.











