Dynamics of Polarization: Affective Partisanship and Policy Divergence

Abstract We explore the dynamics of affective partisanship and policy divergence in a behavioral voting model. Voters are adaptive and influenced by partisan affect, while political parties are rational and office motivated. We show that the affective partisanship of the electorate and the divergence of party platforms can be mutually reinforcing, thus providing an explanation for the observed co-movement of affective and elite polarization in recent decades. Whether the induced behavioral path exhibits low polarization or high polarization depends on the salience of group identity and the number of moderate voters. Thus, shocks to those factors, perhaps due to such events as economic crises or war, can lead to the polarization or depolarization of the electorate and of the elite.

The political elite in the United States have become more partisan over the last forty years (Barber and McCarty 2015; Layman, Carsey, and Horowitz 2006; McCarty, Poole, and Rosenthal 2016). This development has spurred extensive research and debate. One controversy is whether elite polarization is driven by the polarization of the electorate in terms of values and ideology. While there is some evidence in favor of this hypothesis (Abramowitz 2010; Abramowitz and Saunders 2008), others have argued that the electorate's issue preferences have remained fairly stable over the years (see Fiorina and Abrams 2008; Fiorina, Abrams, and Pope 2011; Hetherington 2009; Hill and Tausanovitch 2015; Levendusky 2009).
Traditionally, research on mass polarization focuses on the public's preferences on issues and policies. Yet, polarization can manifest itself directly in terms of the feelings and attitudes of one group toward another. Research in social psychology suggests that individuals have inherently positive feelings and attitudes toward members of their own social group, the "in-group," and negative feelings and attitudes toward members of the "out-group" (Tajfel 1970; Tajfel and Turner 1979). This is true even when group membership is defined via trivial characteristics or assigned randomly (Chen and Li 2009; Landa and Duell 2015).
In a political context, social groups are naturally defined along party lines, that is, the party that a person affiliates with defines the person's "in-group," while the other party defines the "out-group." In-group members include voters, politicians, public officials, or party staff: anyone who supports one particular party and not the other. According to this perspective, joining or supporting a party is less a rational decision and more a reflection of a person's group identity, which triggers emotions typical of in-group/out-group dynamics.
The role of affect and emotions in voting behavior was recognized as early as the 1950s. Classic studies by the Columbia and Michigan Schools found that voting behavior is often influenced by habit, social influence, and emotions rather than careful comparisons of issue positions (Berelson, Lazarsfeld, and McPhee 1954; Campbell et al. 1960).¹ These insights led to the idea of party identification, which has dominated the empirical study of voting ever since (see Bartels 2000; Campbell et al. 1960; Erikson, MacKuen, and Stimson 2002; Lewis-Beck et al. 2008; Miller, Shanks, and Shapiro 1996).²
Recent research provides ample evidence of rising affect-based partisanship, or affective polarization, over the past several decades in the United States (Iyengar and Krupenkin 2018; Iyengar, Sood, and Lelkes 2012; Iyengar et al. 2019).³ Generally speaking, there has been an increase in animosity, distrust, and stereotyping between Republicans and Democrats. In 1960, for instance, less than 5 per cent of Republican and Democratic voters said they would be "displeased" if their children married across party lines. By 2010, the numbers had risen to 49 per cent for Republicans and 33 per cent for Democrats (Iyengar, Sood, and Lelkes 2012). A recent study by Abramowitz and Webster (2016) shows that partisan affect is a good predictor of partisan voting in US presidential and congressional elections.
Some notable studies have alluded to a potential connection between partisan affect and elite behavior. Using National Election Studies (NES) surveys and House budget votes, Coleman (1996) documents a strong correlation between elite polarization and mass partisanship. Bartels (2000) shows that US voters have demonstrated increasing party loyalty since the 1970s, coinciding with elite polarization.
Importantly, existing evidence indicates that affective polarization is distinct from the polarization of policy attitudes. For example, the correlation between policy preferences and measures of partisan affect is weak. Also, strong partisans with committed policy positions, that is, liberal Democrats and conservative Republicans, do not exhibit similar increases in interparty animus (Iyengar, Sood, and Lelkes 2012).
Given the empirical findings, it is natural to posit affective polarization as a potential driver of elite polarization. There is also the intriguing possibility of a causal effect in the opposite direction. In discussing the evidence of rising mass partisanship, Bartels (2000) proposes elite polarization as a potential cause. For example, candidates for political office may want to make the differences between the parties more salient to increase turnout. More recent work by Rogowski and Sutherland (2016) and Banda and Cluverius (2018) offers direct evidence that elite behavior can intensify voters' partisan affect.
Informal discussions notwithstanding, there is little systematic and rigorous exploration of the various threads suggested by the empirical evidence. One obstacle to such explorations is incompatible research traditions. While much of the empirical research is grounded in social psychology, the formal study of elections, beginning with Downs (1957), conceptualizes voters as rational decision makers who vote based on the evaluation of the "expected party differential." Notably, one of the robust findings of the Downsian model is the convergence of office-motivated candidates to the center.⁴ However, formal theories based on the rational-choice paradigm are of limited use in exploring group identity and affective polarization.
1 Recent work suggests that these features do not simply rest on the lack of information or attention, shortcomings that may be alleviated by the use of heuristics like cue taking (Popkin 1994) or retrospective voting (Healy and Malhotra 2013), but are grounded in the psychological processes of opinion formation and candidate evaluation (Achen and Bartels 2017; Lodge and Taber 2013; Zaller et al. 1992).
2 Various scholars have equated party identification with affect-based partisanship (see, for example, Burden and Klofstad 2005; Green, Palmquist, and Schickler 2004; Greene 2004). Indeed, this view can be traced back to the conceptualization of party identification in The American Voter (Campbell et al. 1960). Accounts of party identification based on group identity and emotional affiliations are not the only approaches. Alternative accounts of party identification include such concepts as "running tallies" (Fiorina 1981) and "macropartisanship" (MacKuen, Erikson, and Stimson 1989).
In a recent approach, Diermeier and Li (2019) incorporated affective partisanship into a spatial-voting model and showed how affective polarization can induce policy divergence. However, their model is static and therefore cannot address questions about the dynamic interaction between mass partisanship and elite behavior.
In this article, we explore these questions by extending Diermeier and Li's (2019) framework to a dynamic context. Our model takes some basic building blocks from standard spatial-voting models. Voters have ideal points on a policy space, and parties are rational and office motivated: they choose policies to maximize their vote share. However, unlike in the traditional approach, voters act according to a behavioral heuristic that maps their partisan attachment and their experiences under implemented policies into voting behavior. Unlike in Downsian-type models, voters act in response to past experiences; they need not be aware of their own ideological positions, the parties' policy locations, or any other aspects of the model. It is worth noting that affective polarization is often presented in the literature as capturing how voters feel toward each other. Given the social identity conception of partisanship, it is reasonable to assume that the feelings toward co- and out-partisans apply equally to party elites, as they are prominent members of the in-/out-group. Several studies have adopted this broader interpretation of affective partisanship.⁵
The heuristic that guides voter behavior is derived from two principles established in the literature on group identity. The first principle, in-group favoritism, is the positive bias in the evaluation of behavior and characteristics of in-group members (see Mullen, Brown, and Smith 1992). The second principle, in-group responsiveness, is the observation that individuals tend to be more sensitive to the behavior and traits of in-group members than to those of out-group members (Frimer and Skitka 2020; Marques and Paez 1994; Marques, Yzerbyt, and Leyens 1988). One explanation for in-group responsiveness is that individuals apply stricter norms to in-group members and, therefore, judge more severely in-group members who violate such norms (Mackie and Cooper 1984; Pinto et al. 2010).
A key feature of the present model that sets it apart from Diermeier and Li's (2019) is that elite behavior can also influence voters' affective partisanship. This creates a bidirectional linkage between elite behavior and affective polarization, whereas Diermeier and Li (2019) explore the causal link in only one direction (that is, from affective polarization to elite behavior). The evolution of affective partisanship is modeled via a Markov process. Affective partisanship is the "state variable" and is updated each period based on the policy implemented by the incumbent. We assume that positive experience enhances a voter's affective association with the incumbent and that negative experience decreases it.⁶ This postulate is congruent with the "law of effect," which has been called "the most important principle in learning theory" (Hilgard and Bower 1966, 481).⁷ The adaptive nature of partisanship allows for potential feedback between elite behavior and affective partisanship: politicians adjust policies in response to mass polarization and, at the same time, policies shape partisanship going forward.
We show how affective partisanship and elite behavior can reinforce each other. Importantly, qualitatively different behavioral paths may arise, depending on the electoral environment. If group identity is not salient or the number of moderate voters is large, then the behavioral path features low polarization, in which the parties choose centrist policies and affective partisanship moderates over time. On the other hand, if group identity is salient and moderate voters are few, a high polarization path emerges, in which parties choose extreme policies and affective partisanship is high. Thus, our model can account for the observed correlation between elite polarization and mass partisanship. More important, an implication of these observations is that shocks to the electoral environment, possibly due to such events as economic crises and war, may lead to a shift from modest political polarization to heightened polarization, or vice versa.
4 For a survey, see Duggan (2012).
5 For an example, see Landa and Duell (2015).
6 Rogowski and Sutherland (2016) show that voters' affective partisanship responds to elite policy positions. More generally, a long line of research in psychology suggests that an individual's affective state is a function of experience and stimuli (see, for example, Lazarus 1991; Lerner and Keltner 2000). This condition is consistent with the evidence that party identification responds to the performance and actions of incumbents (MacKuen, Erikson, and Stimson 1989).
Our article adds to the burgeoning body of formal work that incorporates behavioral agents. Our approach takes inspiration from the literature on adaptive learning (see Börgers and Sarin 1997; Hart 2005), which has seen increasing application in political science. Bendor, Diermeier, and Ting (2003) explore the "paradox of voting" in an adaptive-voting model. Diermeier and Li (2017) study electoral accountability with behavioral voters. Andonie and Diermeier (2019) examine multiparty elections. More closely related to this model, Bendor, Kumar, and Siegel (2010) examine how adaptive voters can develop partisanship toward candidates in the long run. However, the candidates in their model are not strategic; rather, their policy positions are fixed. Bendor et al. (2011) numerically analyze a model where both candidates and voters are adaptive.
While the adaptive-learning framework is useful in capturing a general lack of sophistication, other approaches focus on more specific behavioral biases. For example, several recent studies have explored the implications of biases in statistical reasoning, such as correlation neglect and motivated reasoning (Levy and Razin 2015; Little 2019; Minozzi 2013; Ortoleva and Snowberg 2015). Bisin, Lizzeri, and Yariv (2015) and Lizzeri and Yariv (2017) study government policy in the presence of time-inconsistent voters. Patty and Penn (2020) study the impact of identity and culture on individual behavior in organizations.

The Model
Two parties, A and B, compete in elections at the end of date $t = 1, 2, \ldots$. The winner of the election at date $t - 1$ becomes the incumbent at date $t$ and chooses policy $\theta_t$ from the interval $[-1, 1]$. The objective of the incumbent is to maximize their vote share in the date $t$ election. The parties are ex ante homogeneous; they do not differ in exogenous valence, nor are they policy motivated. It should be noted that this would rule out policy divergence in a typical Downsian setting (see Roemer 1994).
There is a unit continuum of voters, divided into three blocs according to their bliss point $b \in \{l, m, r\}$, where $l = -1$, $m = 0$, and $r = 1$. Let $\kappa_b$ denote the measure of voters with bliss point $b$. We assume that $\kappa_l = \kappa_r$, that is, the distribution of bliss points is symmetric around the median. Voters with bliss point $b$ will be referred to collectively as "$b$ voters."
Voters are adaptive: their voting behavior responds to prior experiences with implemented policies. Such experiences, however, do not presuppose any knowledge or evaluation of policies. Rather, voters vote "how they feel." A voter's propensity to vote for a given candidate also depends on their affective partisanship, which is the emotional attachment, or affinity, to one of the two parties (Dias and Lelkes 2022). Formally, we denote a (generic) voter's affinity for the in-party at date $t$ by $p_t$. We assume for tractability that affinity is binary: a voter either has high affinity for the in-party or for the out-party.⁸ That is to say, $p_t$ takes one of two values $\{h, l\}$, with $h$ indicating high affinity for the in-party and $l$ indicating high affinity for the out-party.⁹ The probability that a voter votes for the incumbent at the date $t$ election is a function of $p_t$ and $\theta_t$. For tractability, we focus on a linear functional form and explore a more general setting in Appendix 1. Specifically, the probability of a voter with bliss point $b$ voting for the incumbent is assumed to be:
$\alpha(p_t, \theta_t) = -\tau(p_t)\,|\theta_t - b| + 2\tau(p_t)$,
where $\tau$ determines both the level of $\alpha$ and its sensitivity to the distance between the policy and the voter's bliss point. Letting $\tau_h \equiv \tau(h)$ and $\tau_l \equiv \tau(l)$, we make the following assumption (see Figure 1): $0 < \tau_l < \tau_h < 1/2$.
The fact that $\tau_h$ and $\tau_l$ are positive means that a voter is more likely to vote for the incumbent the more congruent the adopted policy is with their bliss point. This is reminiscent of standard Downsian preferences, but, importantly, we do not presume that voters conduct any rational evaluation of policy platforms. Rather, voters are more likely to have a positive experience when the policy is closer to their ideal point, and they respond positively to such experience. They may not be conscious of the underlying mechanism that links policy with experience, or even the proximity of policy to their own bliss points (Joesten and Stone 2014). It should be noted that a consequence of linearity is the symmetry in a voter's responses to deviations of policy toward and away from their bliss point.¹⁰ This is not essential to our result. The generalization of the model in Appendix 1 allows for concavity or convexity of $\alpha$ with respect to policy. The restriction that $\tau_h$ and $\tau_l$ are less than 1/2, together with the functional form of $\alpha$, ensures that $\alpha$ remains in the interior of $[0, 1]$. One may adopt a more general piecewise linear functional form for $\alpha$ that relaxes this restriction and still obtain qualitatively similar insights.
The assumption $\tau_h > \tau_l$ plays a key role in the model, and its implications are twofold. First, it implies that $\alpha(h, \theta) > \alpha(l, \theta)$, meaning that a voter is more likely to vote for the candidate they have high affinity for. This is an instance of in-group favoritism, a well-established finding in social psychology (Mullen, Brown, and Smith 1992). Jessee (2010) documents in-group favoritism in the 2008 US presidential election. Specifically, Democratic and Republican partisans "show a strong tendency to vote for their party's candidate even in situations in which they are ideologically closer to the other candidate" (Jessee 2010, 328). Secondly, the assumption implies that for $\theta \neq b$, $\left|\partial \alpha(h, \theta)/\partial \theta\right| > \left|\partial \alpha(l, \theta)/\partial \theta\right|$. In other words, a change in the incumbent's policy stance has a bigger impact on the propensity of its "co-partisans" (voters with high affinity) than on that of "out-partisans" (voters with low affinity). This is an example of in-group responsiveness as identified in various research in social psychology (Biernat, Vescio, and Billings 1999; Marques, Yzerbyt, and Leyens 1988; Mendoza, Lane, and Amodio 2014). In particular, studies have shown that people tend to punish in-group members more severely than out-group members for norm-violating behavior.¹¹ Here, deviations from the "party" line are similar to norm violations. We interpret the difference $\tau_h - \tau_l$, which determines how sensitive voter behavior is to affective partisanship, as measuring how salient group identity is.
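The two implications of the assumption can be checked numerically. The following is a minimal sketch, assuming the linear form of α given in the text; the parameter values (0.4 and 0.2) and the helper name `policy_response` are illustrative choices, not taken from the article.

```python
def alpha(tau: float, theta: float, bliss: float) -> float:
    """Probability of voting for the incumbent under the linear form."""
    return -tau * abs(theta - bliss) + 2.0 * tau


# Illustrative values only; they satisfy 0 < TAU_L < TAU_H < 1/2.
TAU_H, TAU_L = 0.4, 0.2
BLISS_LEFT = -1.0  # bliss point of an l voter


def policy_response(tau: float, old_theta: float, new_theta: float) -> float:
    """Change in an l voter's vote probability when policy moves."""
    return alpha(tau, new_theta, BLISS_LEFT) - alpha(tau, old_theta, BLISS_LEFT)


# In-group favoritism: at the same policy, high affinity means more support.
favoritism_gap = alpha(TAU_H, 0.0, BLISS_LEFT) - alpha(TAU_L, 0.0, BLISS_LEFT)

# In-group responsiveness: the same policy move shifts co-partisans more.
co_partisan_shift = policy_response(TAU_H, 0.0, -0.5)
out_partisan_shift = policy_response(TAU_L, 0.0, -0.5)

print(favoritism_gap, co_partisan_shift, out_partisan_shift)
```

Both gaps scale with the difference between the two sensitivity parameters, which is why the text reads that difference as the salience of group identity.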
Consistent with in-group responsiveness, affectively partisan voters tend to be more zealous, that is, they exhibit stronger emotional responses to their own party's actions, such as its policy positions. The Tea Party movement is a good example. While Tea Party supporters on the whole hold an ideology similar to that of average Republicans (Newport 2010), the former tend to be more impassioned (Kimball, Anthony, and Chance 2018). Accordingly, Tea Party supporters are keener to hold their elected representatives accountable for holding conservative policy positions. Republican congresspeople reportedly felt immense pressure from the Tea Party to remain uncompromising politically, as "any deviation from the conservative line is met with a flood of phone calls and a credible threat of a primary challenge" (Lee 2013).¹²
Such in-group responsiveness is important whenever deviations from a group's core identity are at issue, that is, when elected politicians appear to stray from the pure party line. On other types of behavior, in-group responsiveness is not important. For example, some empirical studies suggest that voters are more lenient toward co-partisan politicians for general misbehavior or lack of performance, such as corruption or poor economic performance (see, for example, Eggers et al. 2014; Kayser and Wlezien 2011). Such behavior is deplorable, and bad outcomes are unwelcome, but they do not threaten group identity. In such contexts, in-group favoritism applies but not in-group responsiveness, that is, members of one's own group are given the benefit of the doubt, while members of the out-group are evaluated harshly. In other words, on general misbehavior not tied to features that distinguish the two parties, members of one's own party are treated with leniency (in-group favoritism). On the other hand, deviations from the party line, for example, taking a different stance on highly polarized topics like immigration or abortion, are viewed as betrayals and punished severely (in-group responsiveness).
To complete the model, we now describe how affective partisanship evolves over time. Here, we follow the tradition of adaptive-voting models (see, for example, Bendor et al. 2011) and assume party affinity evolves according to a Markov process. Specifically, let $I$ be the date $t$ incumbent; then the probability of a generic voter having high affinity for $I$ at date $t + 1$ is $\alpha(p_t, \theta_t)$. In other words, $\alpha(p_t, \theta_t)$ captures both the probability that a voter votes for party $I$ and the probability that the voter will have high affinity for $I$ at date $t + 1$ (see Figure 2). As an illustration of the sequence of events, suppose the Democratic Party is in power at time $t$ and implements its platform. Voters may have positive or negative experiences during the election cycle, which are partially influenced by government policies but also depend on individual circumstances, such as their partisan leanings and other random events. When it comes time for an election at the end of date $t$, a voter considers their experience and votes to reelect the Democratic Party if their experience was positive and votes for the Republican Party otherwise. To put it more succinctly, voters consider the famous question posed by Ronald Reagan, "Are you better off than you were four years ago?", and vote accordingly.
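The Markov update described above can be sketched at the level of a voter bloc. This is an illustrative sketch, not the article's code: it assumes that, with a continuum of voters, the share of a bloc holding high affinity evolves deterministically, and the function name `update_share` and the parameter values are hypothetical.

```python
def alpha(tau: float, delta: float) -> float:
    """Linear form from the text, written in terms of delta = |theta - b|."""
    return tau * (2.0 - delta)


def update_share(g: float, delta: float, incumbent_is_a: bool,
                 tau_h: float, tau_l: float) -> float:
    """Next-period share of a bloc with high affinity for party A.

    g is the current share with high affinity for A; delta is the distance
    between the incumbent's policy and the bloc's bliss point.
    """
    # Share of the bloc that currently has high affinity for the incumbent.
    share_in = g if incumbent_is_a else 1.0 - g
    # Each voter ends up with high affinity for the incumbent with
    # probability alpha evaluated at their current affinity.
    next_in = share_in * alpha(tau_h, delta) + (1.0 - share_in) * alpha(tau_l, delta)
    return next_in if incumbent_is_a else 1.0 - next_in


# Illustrative values: party A in power at the median policy (delta = 1 for
# both extreme blocs), starting from a polarized electorate.
TAU_H, TAU_L = 0.4, 0.2
next_l = update_share(0.9, 1.0, True, TAU_H, TAU_L)
next_r = update_share(0.1, 1.0, True, TAU_H, TAU_L)
print(next_l - next_r)  # gap after one period of centrist policy
```

In this sketch a centrist policy shrinks the gap between the two extreme blocs by the factor given by the difference in sensitivities, whichever party holds office.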
The intertemporal linkage between elite behavior and affective partisanship is established through the Markov process. Affective partisanship, as reflected in the distribution of party affinity, influences the incumbent's policy choice. Specifically, the incumbent seeks to maximize vote share $\int \alpha(p_t, \theta_t)\, dF_t$, where $F_t$ is the distribution of $p_t$.
At the same time, the incumbent's policy choice shapes affective partisanship in the future. The fact that $\alpha$ is decreasing in $\delta_t$ means that a voter is more likely to develop high affinity for the incumbent the more the voter likes the incumbent's policy. This represents the "law of effect."
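The incumbent's problem can be illustrated with a short sketch. It assumes the three-bloc electorate and the linear α from the text; because vote share is then piecewise linear in policy, the sketch compares only the three kink points −1, 0, and 1. The function names and numerical values are illustrative assumptions.

```python
BLISS = {"l": -1.0, "m": 0.0, "r": 1.0}


def vote_share(theta: float, g_in: dict, kappa_m: float,
               tau_h: float, tau_l: float) -> float:
    """Incumbent's vote share at policy theta.

    g_in[b] is the share of bloc b with high affinity for the incumbent.
    """
    kappa = {"l": (1.0 - kappa_m) / 2.0, "m": kappa_m, "r": (1.0 - kappa_m) / 2.0}
    total = 0.0
    for bloc, point in BLISS.items():
        delta = abs(theta - point)
        # Average sensitivity of the bloc, mixing high and low affinity voters.
        avg_tau = g_in[bloc] * tau_h + (1.0 - g_in[bloc]) * tau_l
        total += kappa[bloc] * avg_tau * (2.0 - delta)
    return total


def best_policy(g_in: dict, kappa_m: float, tau_h: float, tau_l: float) -> float:
    """Vote-maximizing policy among the kink points of the vote share."""
    candidates = (-1.0, 0.0, 1.0)
    return max(candidates, key=lambda th: vote_share(th, g_in, kappa_m, tau_h, tau_l))


# Illustrative electorates: no affective polarization with many moderates,
# versus full polarization with few moderates.
even = {"l": 0.5, "m": 0.5, "r": 0.5}
polarized = {"l": 1.0, "m": 0.5, "r": 0.0}
print(best_policy(even, 0.4, 0.4, 0.2), best_policy(polarized, 0.1, 0.4, 0.2))
```

With these illustrative numbers the incumbent stays at the median in the first electorate and moves to its base's bliss point in the second.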

Analysis
To simplify notation, the time subscript is omitted whenever possible. Also, define $\delta_t = |\theta_t - b|$ as the proximity of policy to a voter's bliss point; we will sometimes use $\delta_t$ as an argument in $\alpha$ instead of $\theta_t$, that is, $\alpha(p_t, \delta_t) = -\tau(p_t)\delta_t + 2\tau(p_t)$. Let $g_b$ denote the proportion of $b$ voters that have high affinity for party A.¹³ In a given period, the triple $(g_l, g_m, g_r)$ fully describes the affective partisanship of the electorate. Given this, affective polarization is defined as follows:
Definition 1: The electorate is affectively polarized if $g_l > 1/2 > g_r$ or $g_l < 1/2 < g_r$.
In other words, the electorate is affectively polarized if, on average, $l$ voters and $r$ voters favor different parties. It should be noted that the notion of affective partisanship is independent of voters' ideologies. The former may evolve over time in response to policies, while the ideological composition of the electorate is assumed to be fixed. By treating the two as distinct concepts, we can isolate the implications of affective partisanship and help explain the observation that partisanship has increased while attitudes on policy have largely been constant (Fiorina and Abrams 2008; Fiorina, Abrams, and Pope 2011; Hetherington 2009; Levendusky 2009). Without loss of generality, we assume that affective polarization takes the form of $g_l > 1/2 > g_r$, that is, $l$ voters favor party A, while $r$ voters favor party B. For the following discussion, we interpret the difference $g_l - g_r$ as the intensity of affective polarization at the societal level.
First, we establish a sufficient condition for the existence of a behavioral path in which affective partisanship is moderate and parties take on centrist positions. This "low polarization" path depends on the initial affective partisanship, the salience of group identity, and the share of moderate voters:
Proposition 1 (low polarization): If $g_l$ and $g_r$ in the initial period satisfy Condition 2 (an upper bound on the initial level of affective polarization $g_l - g_r$), then either party, when in power, adopts the median policy. Moreover, affective polarization disappears in the long run, that is, $g_{l,t} - g_{r,t}$ converges monotonically to 0 as $t \to \infty$.
13 The distribution of affinity for party B is implicitly defined given that a voter's affinity for party A is high iff their affinity for B is low.
When affective polarization is low, that is, $g_l - g_r$ is small, there is little electoral advantage for the incumbent to appeal to extreme voters. For illustration, consider the case where $g_l = g_r$. If the incumbent were to deviate from the median, say to the left, then the gains they make with $l$ voters will be canceled out by the loss of $r$ voters. On top of that, they lose support from $m$ voters. Thus, there is a net loss from the deviation, and there is no incentive for the incumbent to deviate. The moderate stance of the parties will, in turn, limit affective partisanship, especially among voters at the extremes of the spectrum, thereby creating a "virtuous cycle" of centrist policies and moderate affective partisanship.
An immediate consequence of Proposition 1 is the following corollary:
Corollary 1: A low polarization path arises if group identity is not salient (that is, $\tau_h - \tau_l$ is small) or the share of moderate voters is large (that is, $\kappa_m$ is large).
For intuition, consider the extreme case where group identity is not operative, that is, $\tau_h = \tau_l$. Thus, regardless of ideological position, voters are equally responsive to the incumbent's policy. The incumbent therefore has nothing to gain from biasing their policy toward either $l$ or $r$ voters, as any gains in support from one group will be canceled out by the loss of support from the other group. Besides the salience of group identity, the distribution of voters on the ideological spectrum is also an important factor in the incumbent's calculus. Specifically, the greater the proportion of moderate voters, the more costly it is for the incumbent to deviate from the center.
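The incumbent's calculus can be made explicit under the linear form of α. The derivation below is our own one-period sketch for party A as incumbent with $g_l \geq g_r$, not a statement of the article's condition; the shorthand $T_b$ is introduced here for illustration.

```latex
% T_b = \tau_l + g_b(\tau_h - \tau_l) is shorthand introduced here (an
% assumption of this sketch): the average sensitivity of bloc b toward A.
\begin{align*}
V(\theta) &= \sum_{b \in \{l, m, r\}} \kappa_b \, T_b \,\bigl(2 - |\theta - b|\bigr),
  \qquad \kappa_l = \kappa_r = \tfrac{1 - \kappa_m}{2}, \\
\frac{dV}{d\theta} &= \kappa_m T_m - \tfrac{1 - \kappa_m}{2}\,(\tau_h - \tau_l)\,(g_l - g_r)
  \quad \text{for } \theta \in (-1, 0), \\
\frac{dV}{d\theta} &= -\kappa_m T_m - \tfrac{1 - \kappa_m}{2}\,(\tau_h - \tau_l)\,(g_l - g_r) \le 0
  \quad \text{for } \theta \in (0, 1), \\
\frac{dV}{d\theta} &\ge 0 \ \text{on } (-1, 0) \ \text{whenever} \quad
  g_l - g_r \le \frac{\tau_l}{\tau_h - \tau_l} \cdot \frac{2\kappa_m}{1 - \kappa_m},
  \quad \text{since } T_m \ge \tau_l .
\end{align*}
```

Under these assumptions the vote share rises toward the median from the left and falls to its right, so the median policy is optimal. The bound loosens as $\tau_h - \tau_l$ shrinks or $\kappa_m$ grows, in line with the intuition above.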
Next, we provide a sufficient condition for a high-polarization path in which the parties choose extreme policies (that is, party A chooses $-1$ when in power and B chooses $1$) and a high level of affective partisanship persists. To state the result, we define $g(\delta)$ to be the solution of the following equation: $g = g\,\alpha(h, \delta) + (1 - g)\,\alpha(l, \delta)$.
Intuitively, $g(\delta)$ is the stationary distribution of affinity among a bloc of voters, assuming one party is always in power and chooses a policy that is $\delta$ away from the voters' bliss point. We then have the following result:
Proposition 2 (high polarization): If $g_l$ and $g_r$ in the initial period satisfy Conditions 3 and 4, then party A chooses $\theta_t = -1$ and B chooses $\theta_t = 1$ whenever in power, and a high level of affective partisanship persists.
If affective partisanship is sufficiently extreme, then it is in the incumbent's interest to appeal to its partisans by deviating from the center. Doing so will engage its base sufficiently to overcome any loss of votes from the other blocs. The extreme policy stance, in turn, reinforces affective partisanship, thereby creating a "vicious cycle" of policy divergence and high levels of affective partisanship. Condition 3 mirrors Condition 2, for it is a lower bound on the initial level of affective polarization. The bound can be interpreted as a "tipping point" for polarization: if the initial level of polarization is above this threshold, then it will remain above this threshold forever because of the vicious cycle logic. How might one identify the tipping point empirically, say, in the case of the United States? Given appropriate time-series data, the level of polarization at time $t$ may be considered as the tipping point if: (1) polarization in subsequent periods does not show moderation, minor volatility notwithstanding; and (2) there is a sharp rise in the ideological division between the Democratic and Republican parties around time $t$ that persists afterward. Condition 4 provides bounds that allow us to establish monotonicity, in the sense that starting within this bound, $g_l$ and $g_r$ will eventually move outside of it.¹⁴
It is straightforward to see that Condition 3 is more easily satisfied if $\tau_h - \tau_l$ is large and $\kappa_m$ is small, giving rise to the following corollary:
Corollary 2: A high polarization path is more likely to arise if group identity is salient (that is, $\tau_h - \tau_l$ is large) and the share of moderate voters is small (that is, $\kappa_m$ is small).
In general, a stronger sense of group identity and fewer moderate voters increase the incumbent's incentive to appeal to its partisan base. Suppose Party A is the incumbent; then strong salience of group identity means that $l$ voters are much more responsive to Party A's policy than $r$ voters. This means that Party A would rather excite its base than "reach across the aisle" and sway $r$ voters. Since moderate voters exert a centripetal force upon Party A, a strategy of "rallying the base" will indeed be optimal if moderate voters are few.
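The stationary share of high-affinity voters described before Proposition 2 can be computed directly. The sketch below assumes that one party stays in power and keeps its policy at a fixed distance from the bloc's bliss point, and that the bloc's share follows the two-state Markov update implied by the text; the closed form, function names, and parameter values are our own illustration.

```python
def alpha(tau: float, delta: float) -> float:
    """Linear form from the text, written in terms of delta = |theta - b|."""
    return tau * (2.0 - delta)


def stationary_share(delta: float, tau_h: float, tau_l: float) -> float:
    """Fixed point of g = g * alpha(h, delta) + (1 - g) * alpha(l, delta)."""
    a_h, a_l = alpha(tau_h, delta), alpha(tau_l, delta)
    return a_l / (1.0 - a_h + a_l)


def iterate_share(g0: float, delta: float, tau_h: float, tau_l: float,
                  periods: int = 200) -> float:
    """Apply the one-period update repeatedly, starting from share g0."""
    g = g0
    for _ in range(periods):
        g = g * alpha(tau_h, delta) + (1.0 - g) * alpha(tau_l, delta)
    return g


# Illustrative values: the incumbent's base (delta = 0) versus a bloc that
# sits one unit away from the policy (delta = 1).
TAU_H, TAU_L = 0.4, 0.2
base_share = stationary_share(0.0, TAU_H, TAU_L)
distant_share = stationary_share(1.0, TAU_H, TAU_L)
print(base_share, distant_share, iterate_share(0.1, 0.0, TAU_H, TAU_L))
```

The iteration converges to the closed-form value from any starting share, and the stationary share is higher for blocs closer to the incumbent's policy.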

Discussion
One broad lesson from our results is that shocks to the electoral environment can have substantial long-term consequences with regard to political polarization. The idea is similar to a standard exercise in macroeconomic research, starting with the Nobel Prize-winning work of Kydland and Prescott (1982), which shows how exogenous shocks to such variables as monetary policy or technology can generate aggregate fluctuations in output and employment. Importantly, even temporary shocks can have a persistent impact on the long-run equilibrium path by changing players' beliefs, a phenomenon known as "hysteresis" (see, for example, Cooper 1994; Morris and Yildiz 2019). Thus, the economy can dive into a persistent recession if people become (briefly) pessimistic about the future due to unforeseen events that have little direct effect on the economy's fundamentals.
In the context of our model, shocks to the salience of group identity or the number of moderate voters can switch the equilibrium path from one of low polarization to one of high polarization, or vice versa. Suppose, for example, the system is initially on the low polarization path, where parties are choosing centrist policies and affective partisanship is low (that is, Condition 2 holds). Suppose at some date $t$, there is a shock to group salience such that $\tau_h - \tau_l$ increases to $\tilde{\tau}_h - \tilde{\tau}_l$. If it is the case that $g_{l,t} - g_{r,t} > \left(\frac{\tilde{\tau}_l}{\tilde{\tau}_h - \tilde{\tau}_l} + 1\right)\frac{2\kappa_m}{1 - \kappa_m}$, then the system will switch to the high polarization path, where the incumbent chooses extreme policies and affective partisanship is high. It should be noted that the shocks need not be permanent to induce a permanent change. If shocks to the salience of group identity described earlier persist through several periods, so that affective polarization increases significantly, then even if the salience of group identity eventually reverts back to the original level, the high polarization path will persist.
The aforementioned shocks can take the form of specific political events, such as an economic crisis or political scandals. Some observers have pointed to the end of the Cold War as an accelerator of political polarization in the United States (Blankenhorn 2018). The idea is that the liberal/conservative identity is less salient when there is a "common enemy," such as the Soviet Union. The disappearance of a common enemy then highlights group differences. Other candidates include the financial crisis and the subsequent outrage at the bailing out of banks and other financial institutions, which gave rise to the Tea Party. A particularly interesting extension of our model would include rhetorical strategies by political entrepreneurs to highlight group identity for their political benefit. The details of such a model are, however, beyond the scope of this article.
14 Condition 4 is not necessary if one is only interested in showing that affective polarization remains above some threshold.
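The effect of a salience shock on the tipping point can be illustrated numerically. The sketch below evaluates the threshold stated in the text for the switch to the high polarization path; the function names and all parameter values are illustrative assumptions, and the sketch ignores the additional bounds of Condition 4.

```python
def high_polarization_threshold(tau_h: float, tau_l: float, kappa_m: float) -> float:
    """Lower bound on g_l - g_r above which the incumbent moves to the extreme."""
    return (tau_l / (tau_h - tau_l) + 1.0) * (2.0 * kappa_m / (1.0 - kappa_m))


def tips_into_high_polarization(gap: float, tau_h: float, tau_l: float,
                                kappa_m: float) -> bool:
    """True if the current gap g_l - g_r exceeds the threshold."""
    return gap > high_polarization_threshold(tau_h, tau_l, kappa_m)


# Illustrative scenario: the affective gap and the share of moderates stay
# fixed, while a shock widens the difference between the two sensitivities.
GAP, KAPPA_M = 0.5, 0.1
before_shock = tips_into_high_polarization(GAP, tau_h=0.30, tau_l=0.25, kappa_m=KAPPA_M)
after_shock = tips_into_high_polarization(GAP, tau_h=0.45, tau_l=0.15, kappa_m=KAPPA_M)
print(before_shock, after_shock)
```

In this example the same level of affective polarization lies below the threshold before the shock and above it afterward, which is the switch described in the text.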
Another interesting observation from the model has to do with how affective polarization relates to ideological polarization. In standard Downsian models, the ideological composition of the electorate is of little consequence to elite behavior. Specifically, a main prediction of Downsian theory is policy convergence with office-motivated candidates, in the form of median- and mean-voter theorems, for any distribution of voter ideologies.¹⁵ In our model, an increase in ideological division, that is, a decrease in $\kappa_m$, can have a material effect on elite behavior. This does, however, depend crucially on affective partisanship. If group identity is not salient, that is, $\tau_h - \tau_l$ is small, then Condition 2 for the low polarization path will be satisfied regardless of the extent of ideological division. In that case, one recovers the classic convergence result. On the other hand, if group identity is salient, then sufficient ideological division, as reflected by a small $\kappa_m$, can lead to high polarization. In other words, issue polarization is neither necessary nor sufficient for elite polarization. It will not matter unless affective polarization is sufficiently high. If, however, affective polarization is indeed sufficiently high, issue polarization will serve as an amplifier of polarization.
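To make the interaction between salience and ideological division concrete, the right-hand side of the switching inequality from the discussion of shocks can be collected into a single threshold. The following is a sketch, assuming the inequality reads as written in the first line below and that $\tau_h > \tau_l$, $\tau_h > 0$, and $0 \le \kappa_m < 1$; the symbol $T$ is introduced here for illustration only.

```latex
\begin{align*}
T(\tau_h,\tau_l,\kappa_m)
  &= \left(\frac{\tau_l}{\tau_h-\tau_l}+1\right)\frac{2\kappa_m}{1-\kappa_m}
   = \frac{\tau_h}{\tau_h-\tau_l}\cdot\frac{2\kappa_m}{1-\kappa_m},\\[4pt]
\frac{\partial T}{\partial \kappa_m}
  &= \frac{\tau_h}{\tau_h-\tau_l}\cdot\frac{2}{(1-\kappa_m)^2} \;>\; 0,\\[4pt]
\lim_{\tau_h-\tau_l\to 0^+} T &= \infty
  \quad (\kappa_m>0,\ \tau_h \text{ bounded away from } 0),
\qquad
\lim_{\kappa_m\to 0^+} T = 0 .
\end{align*}
```

Under these assumptions, the threshold cannot be crossed when group identity is barely salient (if affinities lie in $[0,1]$, their difference is at most 1), and it falls as the share of moderates shrinks. Both observations are in line with the verbal argument.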
The model has various implications that could help guide empirical studies. One obvious pattern is the following:

Implication 1: The trend in affective polarization is correlated with policy divergence: affective partisanship decreases over time when differentiation between parties' platforms is small and increases when the differentiation is large.
It should be noted that we are careful not to assert a causal relationship in any particular direction. In our model, affective partisanship and elite behavior influence each other (that is, both are endogenous variables). From the data, it may appear that changes in the elite's behavior precede changes in the attitudes of the electorate (see Implication 5), and one may therefore be tempted to infer a particular causal relationship. Our model suggests this line of thinking may be misguided and that a more sophisticated empirical approach, for example, simultaneous equations or structural approaches, would be required to understand the connection between affective and elite polarization.
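The simultaneity concern can be illustrated with a deliberately simple linear toy system, which is not the model of this article: two variables feed back on each other, and a naive regression of one on the other does not recover the structural coefficient. The function names, the coefficients `a` and `b`, and the unit-variance Gaussian shocks are all assumptions made for illustration.

```python
import random


def simulate_feedback(n, a, b, seed=0):
    """Draw n observations from a hypothetical two-equation feedback system:

        affect     = a * divergence + e1
        divergence = b * affect     + e2

    with independent standard normal shocks e1, e2 and a * b != 1.
    The draws use the reduced form obtained by solving the two equations.
    """
    rng = random.Random(seed)
    denom = 1.0 - a * b
    affect, divergence = [], []
    for _ in range(n):
        e1 = rng.gauss(0.0, 1.0)
        e2 = rng.gauss(0.0, 1.0)
        affect.append((e1 + a * e2) / denom)
        divergence.append((b * e1 + e2) / denom)
    return affect, divergence


def ols_slope(x, y):
    """Slope from a simple least-squares regression of y on x."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    return s_xy / s_xx


if __name__ == "__main__":
    affect, divergence = simulate_feedback(20000, a=0.5, b=0.5, seed=1)
    print("structural effect of divergence on affect:", 0.5)
    print("naive OLS slope:", round(ols_slope(divergence, affect), 3))
```

In this toy system the population value of the naive slope is (a + b)/(1 + b²), which equals 0.8 when a = b = 0.5. The regression therefore overstates the structural effect of 0.5 even in large samples.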
The next two empirical implications from the model identify factors associated with high or low polarization regimes:

Implication 2: The more salient is group identity, the more likely the high polarization regime will emerge; the less salient, the more likely the low polarization regime.

15 Policy divergence in the Downsian setting requires policy-motivated candidates and uncertainty about the election outcome (see, for example, Calvert 1985; Wittman 1983), or, alternatively, one candidate having a valence advantage (see, for example, Ansolabehere and Snyder 2000; Groseclose 2001). One exception is Kamada and Kojima (2014), where policy divergence occurs with office-motivated candidates. However, their result relies crucially on the assumption that voters' preferences are convex, that is, risk loving.
Implication 3: The smaller the segment of moderate voters, the more likely the high polarization regime will emerge; the larger the segment of moderate voters, the more likely the low polarization regime.
As discussed earlier, ideological differences on issues matter only if group identity is sufficiently strong:

Implication 4: Parties' platforms will become divergent as more voters hold extreme ideological positions but only when there is a heightened sense of group identity based on partisan affiliation.
The next implication follows from the observation that a transition between the low and the high polarization regimes entails a drastic change in the parties' platforms, for example, a switch from centrist to extreme policies. However, within a given regime, parties' policy positions will not change much. On the other hand, the level of affective partisanship among the masses changes gradually, both within and across regimes:

Implication 5: Changes to the parties' policy positions are less frequent but more drastic than changes in the levels of affective partisanship.
Finally, we point out a direct consequence of in-group favoritism versus in-group responsiveness:

Implication 6: Voters of a party will be more forgiving of a general lack of performance or personal misconduct by public officials of their own party, provided such cases are unrelated to the core differences between the parties. On issues related to such core differences, public officials of the same party will be treated more harshly if they deviate from the party line.
Implications 2, 3, and 6 are straightforward and consistent with prior insights in the polarization literature. They demonstrate that our model can account for known empirical regularities. Implications 1, 4, and 5, on the other hand, are novel and somewhat subtle. They point to the value of developing a formal model of affective polarization.

Conclusion
While elite polarization has been well established, there are continuing debates about its cause. Recent research documents a rise in affective partisanship over the same period, suggesting a potential link between the two phenomena. In this article, we study a dynamic behavioral-voting model in which the elite's policy stance and the affective partisanship of the masses can mutually sustain each other, creating virtuous or vicious cycles. This can account for the observed temporal correlation between mass and elite partisanship noted by Coleman (1996) and Bartels (2000). We show that a high polarization path exists when group identity is salient and ideological division is high, while a low polarization path exists when group identity is not salient and ideological division is low. Thus, such events as economic crises that magnify group identity can lead to changes in the trajectory of political polarization. On the other hand, when group identity is less salient, for example, during periods of national security challenges attributed to a common enemy, polarization in general will be lower.
Notably, these implications hold even when voters' attitudes on policies are stable. Indeed, an electorate's polarization on issues is irrelevant for elite polarization unless voters exhibit a sufficient level of dislike for members of the other party. While changes in affective polarization may be the result of external events, they may also result from rhetorical strategies by candidates or changes in the media environment, for example, the emergence of more polarizing news sources.
The interactive, dynamic nature of our model suggests that empirical studies need to account for the mutual influence of mass and elite polarization. This would require estimation techniques from, for example, macroeconomics or industrial organization that can address the simultaneity issues. This is no accident. In both macroeconomics and our model, we find multiple regimes sustained by positive feedback loops and hysteresis, that is, the long-term impact of temporary shocks.
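Hysteresis of this kind can be illustrated with a stylized simulation. This is a toy feedback loop under stated assumptions, not the model of this article: parties are taken to diverge whenever the affective gap exceeds a threshold, the gap adjusts gradually toward a regime-specific level, and a temporary shock lowers the threshold for a fixed number of periods. All names and numbers are hypothetical.

```python
def shock_schedule(periods, start, length, baseline=0.5, shocked=0.05):
    """Threshold per period: `shocked` during the shock window, else `baseline`."""
    return [shocked if start <= t < start + length else baseline
            for t in range(periods)]


def simulate_path(thresholds, gap0=0.1, low_gap=0.1, high_gap=0.8, rate=0.2):
    """Toy dynamics for an affective gap and a binary policy regime.

    Each period, platforms are divergent if the current gap exceeds that
    period's threshold. The gap then moves a fraction `rate` of the way
    toward `high_gap` (divergent regime) or `low_gap` (convergent regime).
    Returns the gap and the regime recorded at the start of each period.
    """
    gap = gap0
    gaps, divergent = [], []
    for threshold in thresholds:
        is_divergent = gap > threshold
        gaps.append(gap)
        divergent.append(is_divergent)
        target = high_gap if is_divergent else low_gap
        gap += rate * (target - gap)
    return gaps, divergent


if __name__ == "__main__":
    for length in (2, 5):
        gaps, divergent = simulate_path(shock_schedule(60, start=10, length=length))
        print(f"shock length {length}: final gap {gaps[-1]:.3f}, "
              f"divergent at end: {divergent[-1]}")
```

With these illustrative numbers, a two-period shock is reversed once the threshold returns to its baseline, whereas a five-period shock lets the gap build up enough to sustain the divergent regime afterward. The regime indicator jumps while the gap moves gradually, mirroring the contrast drawn in Implication 5.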
Although we framed the model and results based on US national politics, the model could be applicable in other contexts as well, for example, state and local politics in the United States. Indeed, the model is mostly agnostic about institutional features. The only limit on the scope is the assumption of two-party systems. Extending the model to multiparty systems is not straightforward given how we define and measure affective polarization. Such an extension will be especially challenging in multiparty systems that involve coalition formation, for example, parliamentary democracies under proportional representation. With those qualifications in mind, we believe that the model, though highly stylized, is a useful first step in studying the dynamics of affective partisanship and elite behavior. We hope for more research on this topic in the future.
Proposition 3: Suppose the electorate is polarized initially. Then, for all future periods, Party A chooses leftist policies and Party B chooses rightist policies whenever in office (that is, $\theta^A_t \le 0 \le \theta^B_t$), and l and r voters are partisan for Party A and Party B, respectively (that is, $\gamma_{l,t} > 1/2 > \gamma_{r,t}$).
Intuitively, affective polarization drives policy divergence because of in-group responsiveness, as in Diermeier and Li (2019). Moreover, the process by which partisan affinity evolves implies that by adopting a biased policy, the incumbent builds rapport with one group of voters at the expense of alienating another. This induces affective polarization. The next two results provide a partial generalization of the conditions for the low and high polarization paths in the baseline model in the main text. First, Proposition 4 identifies conditions under which there exists a non-trivial lower bound on polarization:

Proposition 4: If there exists $\underline{\gamma} > 1/2$ such that Condition 6 holds, where $\underline{\theta}$ is Party A's optimal policy given $\gamma_l = \underline{\gamma}$, $\gamma_m = 1$, $\gamma_r = 1 - \underline{\gamma}$, then, whenever the initial distribution of affinities is such that $\gamma_l \ge \underline{\gamma}$ and $\gamma_r \le 1 - \underline{\gamma}$, it is the case that $\gamma_{l,t} \ge \underline{\gamma}$, $\gamma_{r,t} \le 1 - \underline{\gamma}$, and $\theta^A_t \le \underline{\theta} < 0 < 1 - \underline{\theta} \le \theta^B_t$ for all $t$.

The proof (as well as the one for Proposition 5) can be found in Appendix 2. The proposition states that if $\underline{\gamma}$ satisfies Condition 6, then it is a lower bound on polarization, in the sense that the policies are bounded away from the median, and the degree of affective polarization $\gamma_l - \gamma_r$ is at least $2\underline{\gamma} - 1$.¹⁷ Next, Proposition 5 derives a sufficient condition for an upper bound on polarization:

Proposition 5: If there exists $\bar{\gamma} > 1/2$ such that:

$$\alpha_l(h, \bar{\theta})\,\bar{\gamma} + \alpha_l(l, \bar{\theta})\,(1 - \bar{\gamma}) \le \bar{\gamma},$$
$$\alpha_r(h, \bar{\theta})\,(1 - \bar{\gamma}) + \alpha_r(l, \bar{\theta})\,\bar{\gamma} \ge 1 - \bar{\gamma},$$

where $\bar{\theta}$ is Party A's optimal policy given $\gamma_l = \bar{\gamma}$, $\gamma_m = 0$, $\gamma_r = 1 - \bar{\gamma}$, then, whenever the initial distribution of affinities is such that $\gamma_l \le \bar{\gamma}$ and $\gamma_r \ge 1 - \bar{\gamma}$, it is the case that $\gamma_{l,t} \le \bar{\gamma}$, $\gamma_{r,t} \ge 1 - \bar{\gamma}$, and $\bar{\theta} \le \theta^A_t \le 0 \le \theta^B_t \le 1 - \bar{\theta}$ for all $t$.

Given Propositions 4 and 5, one can show that if $\underline{\gamma} > \bar{\gamma}$, that is, if the bound from Proposition 4 exceeds the bound from Proposition 5, then there are two qualitatively different equilibrium paths: one in which affective polarization and policy divergence are moderate; another in which affective polarization and policy divergence are severe.
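The lower bound on the affective gap implied by Proposition 4 follows in one line from the two affinity bounds. In the sketch below, $\underline{\gamma}$ denotes the threshold from Proposition 4; the underline is notation assumed here for readability.

```latex
\[
\gamma_{l,t} \ge \underline{\gamma}
\quad\text{and}\quad
\gamma_{r,t} \le 1 - \underline{\gamma}
\;\Longrightarrow\;
\gamma_{l,t} - \gamma_{r,t}
\;\ge\; \underline{\gamma} - \bigl(1 - \underline{\gamma}\bigr)
\;=\; 2\underline{\gamma} - 1 \;>\; 0,
\qquad\text{since } \underline{\gamma} > \tfrac{1}{2}.
\]
```

The same steps applied to the bounds in Proposition 5, with the inequalities reversed, cap the gap at twice that proposition's threshold minus one.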