A Group-Based Approach to Measuring Polarization

Despite polarization’s growing importance in social science, its quantitative measurement has lagged behind its conceptual development. Political and social polarization are group-based phenomena characterized by intergroup heterogeneity and intragroup homogeneity, but existing measures capture only one of these features or make it difficult to compare across cases or over time. To bring the concept and measurement of polarization into closer alignment, I introduce the cluster-polarization coefficient (CPC), a measure of multimodality that allows scholars to incorporate multiple variables and compare across contexts with varying numbers of parties or social groups. Three applications to elite and mass polarization demonstrate that the CPC returns substantively sensible results, and an open-source software package implements the measure.

Table S1: Bibliometric Search Procedures

All types of scholarship (empirical, theoretical, and formal theoretical) were included, but the vast majority included at least some quantitative empirical analysis. After obtaining these 322 articles, I read each of them to determine how the authors operationalized polarization. In most cases, this information was contained in the "Data and Measures" section or another similarly titled section. Before presenting the results, a few words are in order with regard to coding rules. When possible, I attempted to group similar operationalizations together. The "difference" category subsumed any operationalization measuring the distance between observations or group means, or extremity as distance from a scale midpoint. Any operationalization measuring the sum of distances from an overall or group mean, or otherwise measuring the cohesion of a group, was coded as "variance." Both of these measures are often adjusted, for example, with party vote shares. Many authors attempt to capture some sense of bimodality with their measurement, with kurtosis and the Reynal-Querol index being popular choices; these are included under "bimodality." The final category garnering an appreciable number of articles is "overlap": several measures purport to capture the degree to which two distributions (generally displayed as kernel density plots) overlap, and these are all subsumed under this category. Other categories contain, at most, a few articles and are generally self-explanatory. Because articles often employ multiple operationalizations, I allow articles to be coded with more than one measure when appropriate. Proportions therefore may not sum to 1.
Table S2 displays the proportion of articles in each journal, and overall, that use each measure of polarization. Three patterns in the results characterize the state of polarization measurement in the political science literature. First, there is wide variability in how scholars attempt to tap into polarization: this simple analysis reveals no fewer than twenty distinct measurement strategies. In one sense, then, there is little consistency in measurement. In contrast, two measures, difference and variance, are substantially more common than all others, appearing in 56.5% and 13.7% of all articles, respectively. In this sense, then, there is great consistency in measurement. However, the reliability of past findings may be called into question if these two measures do not adequately capture polarization. Finally, there is some variation among subfields.

(Table S2 columns: APSR, AJPS, and JOP.)
In contrast to work in American politics, comparative scholarship tends to rely less on measures of difference and more on measures of variance, likely because the former complicates measurement in multiparty systems; avoiding this complication is a benefit of the CPC that I show in the main text.

S2 Properties of the Measure

S2.1 Derivation
To derive the CPC, I begin by decomposing the total variance of clustered data (TSS) in (S1) into components directly corresponding to the two features of polarization: the variance accounted for between the clusters (BSS, corresponding to intergroup heterogeneity) and the variance accounted for within all clusters (WSS, corresponding to intragroup homogeneity):

$$TSS = BSS + WSS. \quad (S1)$$

Dividing by TSS and solving for the BSS term gives an expression for the proportion of the total variance accounted for by the between-cluster variance, what I call the cluster-polarization coefficient (CPC). More formally, I compute three terms in (S2): the total sum of squares TSS, the between-cluster sum of squares BSS, and the total within-cluster sum of squares WSS, where each individual i in cluster k holds a position $x_{ikj}$ on dimension j:

$$TSS = \sum_{k}\sum_{i \in k}\sum_{j} (x_{ikj} - \bar{x}_j)^2, \qquad BSS = \sum_{k}\sum_{j} m_k (\bar{x}_{kj} - \bar{x}_j)^2, \qquad WSS = \sum_{k}\sum_{i \in k}\sum_{j} (x_{ikj} - \bar{x}_{kj})^2, \quad (S2)$$

where $\bar{x}_j$ is the overall mean on dimension j, $\bar{x}_{kj}$ is the mean of cluster k on dimension j, and $m_k$ is the number of observations in cluster k. Mirroring (S1), I arrive at formal expressions of the variance of clustered data and of the CPC in (S3):

$$CPC = \frac{BSS}{TSS} = 1 - \frac{WSS}{TSS}. \quad (S3)$$
Expressed in this way, the CPC appears related, though not identical, to a one-way ANOVA F-statistic and the coefficient of determination ($R^2$).
The CPC thus possesses two desirable properties: it is naturally bounded on the interval [0, 1], and it takes into account both features of polarization. The CPC increases when the distance between groups increases or when groups become more tightly concentrated around their collective ideal point, but the rate of those increases depends on the relative levels of BSS and WSS.
As I show in section S2.3, however, this measure will be biased upward in small samples. To make the CPC more generalizable to contexts with varying numbers of observations, variables, and clusters, I incorporate corrections for lost degrees of freedom into the key variance expressions in (S4) and derive the adjusted CPC in (S5):

$$\frac{BSS}{n_j n_k - n_j}, \qquad \frac{WSS}{n_i - n_j n_k}, \qquad \frac{TSS}{n_i - n_j}, \quad (S4)$$

$$CPC_{adj} = 1 - \frac{WSS / (n_i - n_j n_k)}{TSS / (n_i - n_j)}, \quad (S5)$$

where $n_i$, $n_j$, and $n_k$ denote the number of observations, dimensions, and clusters, respectively.
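For concreteness, the decomposition and the degrees-of-freedom correction can be written out in code. The sketch below is illustrative only, using NumPy rather than the released software package; the function name `cpc` and its interface are my own.

```python
import numpy as np

def cpc(X, labels, adjust=True):
    """Cluster-polarization coefficient for data X (n_i x n_j) with cluster labels."""
    X = np.asarray(X, dtype=float)
    if X.ndim == 1:
        X = X[:, None]                      # treat univariate input as one dimension
    labels = np.asarray(labels)
    n_i, n_j = X.shape                      # observations, dimensions
    clusters = np.unique(labels)
    n_k = clusters.size                     # clusters
    tss = ((X - X.mean(axis=0)) ** 2).sum()             # total sum of squares
    wss = sum(((X[labels == k] - X[labels == k].mean(axis=0)) ** 2).sum()
              for k in clusters)                        # within-cluster sum of squares
    if not adjust:
        return 1 - wss / tss                # unadjusted CPC = BSS / TSS
    # adjusted CPC with degrees-of-freedom corrections, as in (S5)
    return 1 - (wss / (n_i - n_j * n_k)) / (tss / (n_i - n_j))
```

With two well-separated, tightly concentrated clusters, both versions of the measure approach 1; with a single undifferentiated cloud and arbitrary labels, the adjusted version hovers around zero.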

S2.2 Distribution
Properties of the CPC can be further explored by deriving a sampling distribution. Because the CPC is effectively a ratio of two variances, an F-statistic can be calculated:

$$F = \frac{BSS / (n_j n_k - n_j)}{WSS / (n_i - n_j n_k)} \sim F(n_j n_k - n_j, \; n_i - n_j n_k), \quad (S6)$$

where $n_i$, $n_j$, and $n_k$ denote the number of observations, dimensions, and clusters, respectively, and BSS and WSS are independent chi-square random variables under the null hypothesis of no clustering. Solving for the CPC term from (S6) produces

$$CPC \sim \mathrm{Beta}\!\left(\frac{n_j n_k - n_j}{2}, \; \frac{n_i - n_j n_k}{2}\right) \quad (S7)$$

under the null, a sensible result given that the CPC and the Beta distribution both have continuous support on the [0, 1] interval.
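This null distribution lends itself to a simple significance check on an observed CPC. The sketch below assumes SciPy is available; `cpc_null_dist` is an illustrative name of my own, not part of any released package.

```python
from scipy import stats

def cpc_null_dist(n_i, n_j, n_k):
    """Null (no-clustering) sampling distribution of the unadjusted CPC, per (S7)."""
    alpha = (n_j * n_k - n_j) / 2           # half the between-cluster degrees of freedom
    beta = (n_i - n_j * n_k) / 2            # half the within-cluster degrees of freedom
    return stats.beta(alpha, beta)

# e.g., how surprising is an observed CPC of 0.05 with 1,000 observations,
# one dimension, and two clusters, if there is truly no clustering?
p = cpc_null_dist(1000, 1, 2).sf(0.05)      # upper-tail p-value
```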

S2.3 Unbiasedness
From this distribution, it is straightforward to recover the mean:

$$E(CPC) = \frac{(n_j n_k - n_j)/2}{(n_j n_k - n_j)/2 + (n_i - n_j n_k)/2} = \frac{n_j n_k - n_j}{n_i - n_j}. \quad (S8)$$

The unadjusted CPC is therefore asymptotically unbiased under the null hypothesis that $BSS - WSS = 0$, as $\lim_{n_i \to \infty} E(CPC) = 0$. However, it will be biased upward in finite samples, and the degree of that bias will depend on both the number of dimensions and the number of clusters, underscoring the need for the degrees-of-freedom corrections shown in section S2.1. Taking advantage of the expression in (S8), we can show that the adjusted CPC is, in fact, an unbiased estimator under the null:

$$E(CPC_{adj}) = 1 - \bigl(1 - E(CPC)\bigr)\frac{n_i - n_j}{n_i - n_j n_k} = 1 - \frac{n_i - n_j n_k}{n_i - n_j} \cdot \frac{n_i - n_j}{n_i - n_j n_k} = 0. \quad (S9)$$
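The unbiasedness claim can be checked by Monte Carlo simulation: under the null, data are drawn from a single Gaussian and cluster labels are assigned independently of the data, so the adjusted CPC should average approximately zero. A minimal sketch, with illustrative sample sizes:

```python
import numpy as np

rng = np.random.default_rng(42)
n_i, n_j, n_k = 100, 1, 2

def cpc_adj(x, labels):
    """Adjusted CPC for univariate data with two clusters (labels in {0, 1})."""
    tss = ((x - x.mean()) ** 2).sum()
    wss = sum(((x[labels == k] - x[labels == k].mean()) ** 2).sum() for k in (0, 1))
    return 1 - (wss / (n_i - n_j * n_k)) / (tss / (n_i - n_j))

# null model: one Gaussian population, labels assigned independently of the data
labels = rng.integers(0, 2, n_i)
vals = [cpc_adj(rng.normal(size=n_i), labels) for _ in range(5000)]
mean_adj = np.mean(vals)                    # should be approximately zero
```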

S2.4 Consistency
The variance of the CPC can also be recovered from its sampling distribution:

$$\mathrm{Var}(CPC) = \frac{\alpha\beta}{(\alpha + \beta)^2 (\alpha + \beta + 1)}, \qquad \alpha = \frac{n_j n_k - n_j}{2}, \; \beta = \frac{n_i - n_j n_k}{2}. \quad (S10)$$

Taking advantage of (S10), we can also derive the variance of the adjusted CPC:

$$\mathrm{Var}(CPC_{adj}) = \left(\frac{n_i - n_j}{n_i - n_j n_k}\right)^2 \mathrm{Var}(CPC). \quad (S11)$$

The adjusted CPC is therefore consistent, as $\lim_{n_i \to \infty} \mathrm{Var}(CPC_{adj}) = 0$.

S2.5 Additional Properties
Finally, I derive expressions for the median and mode of the sampling distribution. The median is presented in (S12), where $I_x$ gives the regularized incomplete Beta function and $\alpha$ and $\beta$ are the shape parameters from (S10):

$$\mathrm{median}(CPC) = I_{1/2}^{-1}(\alpha, \beta) \approx \frac{\alpha - 1/3}{\alpha + \beta - 2/3}. \quad (S12)$$

The mode of the sampling distribution can be expressed as

$$\mathrm{mode}(CPC) = \frac{\alpha - 1}{\alpha + \beta - 2} = \frac{n_j n_k - n_j - 2}{n_i - n_j - 4}, \quad (S13)$$

implying that the distribution possesses a unique and finite mode when $n_i - n_j > 4$ and $n_j n_k - n_j \geq 2$.
5. The solution given in (S12) is approximate; the median of the Beta distribution has no closed-form expression for arbitrary parameters.
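Both quantities are straightforward to compute numerically. The sketch below, assuming SciPy and illustrative values of $n_i$, $n_j$, and $n_k$, compares the numerically inverted median with the standard closed-form approximation and with the Beta mode $(\alpha - 1)/(\alpha + \beta - 2)$.

```python
from scipy import stats

n_i, n_j, n_k = 1000, 1, 4                  # illustrative values
a = (n_j * n_k - n_j) / 2                   # Beta shape parameters under the null
b = (n_i - n_j * n_k) / 2

median = stats.beta(a, b).median()          # numerical inversion of the incomplete Beta
approx_median = (a - 1/3) / (a + b - 2/3)   # standard approximation, valid for a, b > 1
mode = (n_j * n_k - n_j - 2) / (n_i - n_j - 4)   # closed-form Beta mode (a-1)/(a+b-2)
```

For a right-skewed null distribution like this one, the familiar ordering mode < median < mean holds, which provides a quick sanity check on the closed-form expressions.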

S3.1 Set-Up
In this section, I present evidence for the efficacy of the CPC by simulating both univariate and bivariate data using Gaussian mixture distributions. The purpose of this simulation exercise is to evaluate, in a controlled environment, the extent to which the CPC captures the two features of polarization, and whether it does so better than existing measures. Gaussian mixtures are uniquely suited for this purpose because they provide a straightforward method for mimicking distributional polarization. Each component of a Gaussian mixture is parameterized by a location parameter µ and a scale parameter σ, which neatly correspond to the two features of polarization: intergroup heterogeneity and intragroup homogeneity, respectively. By manipulating these component parameters, therefore, I can generate mixture distributions with varying levels of polarization and estimate those levels using the CPC and other existing measures. I focus here on comparing the CPC to the two most popular strategies for measuring polarization in the political science literature: difference-in-means and variance. To identify the accuracy of each polarization measure, I examine the intergroup heterogeneity and intragroup homogeneity features separately. To simulate polarization as a result of increasing intergroup heterogeneity, I execute a four-step simulation exercise, randomly varying µ:

1. Fix component standard deviations at a range of values σ ∈ {0.5, 1, 1.5, 2}.⁶ For identification, I use the same σ for each component and maintain a global mean of zero.

4. Apply each polarization measure to the resulting distribution.
The result of this procedure is 1,000 distributions, each with N = 1000, with which to evaluate the performance of each polarization measure. To simulate polarization as a result of increasing intragroup homogeneity, I execute a similar four-step simulation exercise, randomly varying σ:

1. Fix component means at a range of values µ ∈ {2, 3, 4, 5}.⁷ For identification, I use the same absolute value of µ for each component and maintain a global mean of zero.
6. As seen in Figure S1, even this relatively short range of values is sufficient to generate distributions ranging from unimodal to distinctly bimodal.
7. Again, as seen in Figure S1, even this relatively short range of values is sufficient to generate distributions ranging from unimodal to distinctly bimodal.

2. For each component mean µ, select 1,000 values of σ as independent draws from U(0.5, 2).
4. Apply each polarization measure to the resulting distribution.
The result of this procedure is 1,000 distributions, each with N = 1000, with which to evaluate the performance of each polarization measure. For identification, I use equal component weights across all simulated distributions.⁸ Using these simulation frameworks, I evaluate the performance of the adjusted CPC relative to difference and variance in both univariate and bivariate contexts.⁹ Pursuant to the two definitional characteristics of polarization, an appropriate measure should indicate higher polarization when the distance between component means increases or when the standard deviation of each component decreases. Because polarization can occur around more than two poles, especially in multiparty systems, I conduct these procedures for distributions with two, three, and four components.¹⁰ For all simulations, I calculate the adjusted CPC using true group memberships, which are known from the data randomization procedure. By using true group memberships instead of estimating them with a clustering algorithm, we can be sure that any advantages or disadvantages uncovered in the simulation results are attributable to the CPC itself and not to a clustering method being well- or ill-suited to this particular data structure.

8. Additional simulations in Supplementary Information section S3.4 investigate how the CPC changes in response to varying component weights.

9. For bivariate data, I calculate difference by taking the average Euclidean distance between all component means, and I calculate variance as the trace of the covariance matrix.
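One draw of the simulation framework described above can be sketched as follows. The helper functions are illustrative simplifications (univariate, two components, equal weights), not the replication code.

```python
import numpy as np

rng = np.random.default_rng(1)

def simulate_mixture(mu, sigma, n=1000):
    """Equal-weight, two-component univariate Gaussian mixture with means -mu and +mu."""
    x = np.concatenate([rng.normal(-mu, sigma, n // 2), rng.normal(mu, sigma, n // 2)])
    labels = np.repeat([0, 1], n // 2)      # true group memberships
    return x, labels

def difference(x, labels):
    return abs(x[labels == 0].mean() - x[labels == 1].mean())

def variance(x):
    return x.var()

def cpc_adj(x, labels, n_j=1, n_k=2):
    n_i = x.size
    tss = ((x - x.mean()) ** 2).sum()
    wss = sum(((x[labels == k] - x[labels == k].mean()) ** 2).sum() for k in (0, 1))
    return 1 - (wss / (n_i - n_j * n_k)) / (tss / (n_i - n_j))

# a weakly polarized draw versus a strongly polarized draw
x_lo, l_lo = simulate_mixture(mu=1, sigma=2)
x_hi, l_hi = simulate_mixture(mu=5, sigma=0.5)
```

All three measures should return a higher estimate for the second draw than for the first; the simulations probe how their sensitivity differs as each parameter varies.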

S3.2 Results
I evaluate each simulated distribution using all three measures and present the results in two ways. First, Figures S2 and S3 present the raw polarization estimates as a function of the randomized parameters for two-component simulations with univariate and bivariate data, respectively.¹¹ All measures are scaled to [0, 1] to enable comparison and plotted using locally estimated scatterplot smoothing (LOESS). A measure performing in line with theoretical expectations would register a positive slope in plot (a) and a negative slope in plot (b). However, the magnitude of those slopes and the absolute level of estimated polarization should differ depending on the fixed parameter. For example, the sets of distributions with fixed σ = 0.5 or fixed µ = (−5, 5) are more polarized on average than the distributions with a greater fixed σ or fixed µ parameters that are closer together. As a result, polarization estimates should generally be higher for those distributions and less sensitive to the value of the randomized parameter.
The results presented in Figures S2 and S3 generally align with expectations. Looking first at plot (a), the slopes for difference, variance, and the adjusted CPC carry the expected sign. The magnitudes of the difference and variance slopes, however, are relatively constant regardless of the fixed parameter, and the absolute level of estimated polarization appears similar. For example, with random component means of (−5, 5) at the far right-hand side of each facet, difference and variance output almost identical polarization estimates regardless of whether component standard deviations are 0.5, 2, or anywhere in between. The CPC, on the other hand, appears more sensitive to those fixed parameters and displays intercepts and slope magnitudes more in line with expectations.¹² The insensitivity of difference and variance to component standard deviations can be seen more clearly in plot (b). While the adjusted CPC again performs as expected, difference and variance appear as nearly flat lines, although they do output higher polarization estimates when the difference between fixed component means grows larger.
Understanding how raw polarization estimates track with distributional characteristics is valuable, but it complicates a formal evaluation of a measure's effectiveness because the estimated level of polarization (the output of each measure) and the parameters controlling the simulated level of polarization (standard deviation or distance between means) are different quantities on different scales. Moreover, we do not have information about the "true" level of distributional polarization; estimating such quantities is the very goal of this measurement approach.¹³

10. For three and four components, I calculate the difference score by taking the average distance between all component means.
11. The online replication materials contain results of three- and four-component simulations.

12. At extremely high levels of polarization (e.g., σ = 0.5), however, the CPC is likewise relatively insensitive to increasing the distance between component means. This may not be a desirable property if the intended use is to measure polarization when groups are extremely concentrated around their ideal points. However, Supplementary Information section S5 analyzes the relative weight each feature has on the CPC across a range of values plausible in real-world data, and results suggest this diminishing impact occurs only at very high levels of polarization.
13. Supplementary Information section S4 pursues another strategy for procuring "ground-truth" polarization estimates: using human coders to evaluate relative polarization levels.
By holding all other distributional characteristics constant and randomly varying only component means and standard deviations, however, we do have information about each distribution's level of polarization relative to every other distribution. For example, the simulation to assess the intergroup heterogeneity feature holds standard deviations constant and randomly varies component means. Randomly generated means that are farther apart will generate a distribution that is, in theory, more polarized. The result of the simulation, then, is 1,000 distributions that randomly vary in their level of polarization, and those relative levels of polarization can be identified by the relative values of the random component means. I therefore follow the approach taken by Lupu, Selios, and Warner (2017) and use the estimated polarization from each measure to rank-order the distributions, comparing those rankings to the true rank-order recovered from the randomized parameters, with a higher rank indicating a greater level of polarization.

3. For each component, draw a set of component weights $\phi_k$ as independent draws from U(0, 1), normalized such that $\sum_{k=1}^{2} \phi_k = 1$.

4. Take 1,000 independent draws from a Gaussian mixture parameterized by $N(\phi_1, -\mu, \sigma; \phi_2, \mu, \sigma)$.

5. Apply each polarization measure to the resulting distribution.
The result of this procedure is 1,000 distributions, each with N = 1000.
For each set of simulations, I then plot estimated polarization as a function of the difference between the two component weights. As in all other simulations, I calculate the CPC using the true cluster memberships, which are known from the data randomization procedure. Figure S6 displays the results. As expected, the adjusted CPC is relatively invariant to the difference in component weights except when that difference is high, at which point the mixture distribution approaches unimodality and the CPC decreases precipitously.
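A minimal sketch of the unequal-weights comparison described above, with illustrative parameter values; for simplicity, component sizes are set deterministically from the weights rather than drawn at random.

```python
import numpy as np

rng = np.random.default_rng(7)

def weighted_mixture(mu, sigma, phi1, n=1000):
    """Two-component mixture with component weights phi1 and 1 - phi1.
    For simplicity, component sizes are fixed at their expected values."""
    n1 = int(round(phi1 * n))
    x = np.concatenate([rng.normal(-mu, sigma, n1), rng.normal(mu, sigma, n - n1)])
    labels = np.array([0] * n1 + [1] * (n - n1))
    return x, labels

def cpc_adj(x, labels, n_j=1, n_k=2):
    n_i = x.size
    tss = ((x - x.mean()) ** 2).sum()
    wss = sum(((x[labels == k] - x[labels == k].mean()) ** 2).sum() for k in (0, 1))
    return 1 - (wss / (n_i - n_j * n_k)) / (tss / (n_i - n_j))

x_eq, l_eq = weighted_mixture(mu=2, sigma=1, phi1=0.5)    # balanced weights
x_un, l_un = weighted_mixture(mu=2, sigma=1, phi1=0.99)   # nearly unimodal
```

The balanced draw should register a substantially higher adjusted CPC than the nearly unimodal one, matching the pattern in Figure S6.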

S4 Benchmarking Against Human Coders
The simulation evidence presented in the previous section is helpful for ensuring that the CPC responds in theoretically appropriate ways to changes in distributional features. However, the lack of ground-truth polarization labels makes it difficult to judge whether the CPC is a more accurate measure of polarization than others. Recovering a ranking of distributions based on randomized distribution parameters comes close to solving that problem, but in those simulations, only one feature can be manipulated at a time. In real-world data, both features vary simultaneously, often independently of each other. I therefore use human coders to gather ground-truth annotations of distributions' levels of polarization, against which I can benchmark each measure's performance. I first generated fifty bimodal distributions using the same parameters as in section S3.1, randomly varying both component means and standard deviations at the same time. To make the task more accessible to non-experts, I colored one component blue and one component red and referred to them as graphs of Democratic and Republican ideology. The full task preamble and sample graphs can be seen in Figure S7.
After the preamble, coders answered three screening questions to ensure they understood the task. These screening questions were identical to those they saw in the main task, but I chose the graphs presented in the screening questions such that one was obviously more polarized than the other. Coders were only allowed to complete the main task if they answered all three screening questions correctly. I explicitly told coders that polarization involved both features and tested them on that knowledge with easy screening questions for two reasons. First, I wanted to make sure coders understood the task. Second, I was not concerned with how the coders themselves thought about polarization, only that they could evaluate polarization according to the two-feature definition while both features varied across distributions, something I could not achieve in the simulations. For those coders who passed the screening questions, I randomly selected twenty pairs of plots, presented them to the coders, and asked: "Which of the two graphs below is more polarized?" The online replication materials contain full item wordings and a more complete illustration of the task.
The result of each coder's task was therefore twenty random pairwise comparisons labeled according to which of the two options was more polarized. In total, 495 workers on Amazon's Mechanical Turk completed the task. [...] The former carries greater weight when the overall level of polarization is high. However, in neither case does this diminishing effect appear to be substantial.
Taking into account all results presented in this section and in the simulations, the preponderance of evidence points toward three conclusions. First, the contributions of each feature to CPC estimates are comparable, at least over the range of values observed in real-world data. Second, BSS may contribute more heavily in cases of low polarization, but this difference is not substantial. Third, WSS may contribute more heavily in cases of high polarization, but this difference is, again, not substantial.

Figure S1 presents a visualization of this intuition. This figure displays kernel density plots for simulated univariate data with two components. The plots are arranged such that the least polarized distributions fall at the top left and the most polarized distributions fall at the bottom right, and component parameters are provided by the plot labels along the top and right axes. Consider what happens to these component parameters as we move from a less polarized to a more polarized distribution. Moving from left to right across the rows of the facet plot, for example, the distributions become more polarized as the difference between component means increases and the components grow farther apart. Likewise, moving from top to bottom along the columns of the facet plot, the distributions become more polarized as component standard deviations decrease and the components grow more compact.

Figure S1: Visualization of Simulation Set-Up. Simulated Gaussian mixture distributions with µ_global = 0; rows represent diverging means with standard deviations held constant, and columns represent decreasing standard deviations with means held constant; thus, the least polarized distributions appear at top left and the most polarized distributions appear at bottom right.

Figure S4: Polarization Estimates with Log-Normal Data. Results from univariate simulations of polarization measures with two components, showing the estimated level of polarization for a randomly varying distribution parameter, holding the other parameter constant. All measures are scaled to [0, 1] to enable comparison and plotted using LOESS with a span of 0.75.

Table S2: Proportion of Articles Using Polarization Measures.

Table S3: Error of Distribution Rankings in Simulation. Root mean squared error and Spearman's ρ calculated for univariate and bivariate simulations with two, three, and four components; bolded values denote the measure with the lowest error in each category.

Table S3 reports the root mean squared error (RMSE) and Spearman's rank correlation coefficient (ρ) for all three measures across all simulations.¹⁴ Bolded values represent the best-performing measure in each category. Examining these results, a clear pattern emerges. Difference and variance register the lowest error rates on the intergroup heterogeneity feature, regardless of the number of variables or components. This makes intuitive sense; holding the standard deviations of a mixture distribution constant and pulling the
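The two error metrics used here can be computed as follows. This is a self-contained illustration with synthetic "true" parameters and noisy estimates standing in for an actual measure's output, not the replication code.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)

# synthetic stand-ins: the randomized parameter controls true relative polarization,
# and a hypothetical measure tracks it with noise
true_param = rng.uniform(0, 5, 1000)
estimates = true_param + rng.normal(0, 0.5, 1000)

true_rank = stats.rankdata(true_param)      # higher rank = more polarized
est_rank = stats.rankdata(estimates)

rmse = np.sqrt(np.mean((true_rank - est_rank) ** 2))   # RMSE of the rankings
rho, pval = stats.spearmanr(true_param, estimates)     # Spearman's rank correlation
```

A well-performing measure yields a low ranking RMSE and a Spearman's ρ near 1.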

Table S5: Effect of Individual Features on CPC Estimates. BSS and WSS unit-normalized. Standard errors in parentheses.