The comparative meaning of political space: a comprehensive modeling approach

Abstract
In latent scaling applications, such as the positioning of political parties, differential item functioning (DIF) may occur because of measurement issues or because of substantive differences in the association between latent and manifest variables. While the first source of DIF has received considerable attention, the second has not, although it is of potential interest to comparative scholars. In this research note, we introduce a novel hierarchical Bayesian item response model that allows us to disentangle different sources of DIF. Drawing on the 2019 Chapel Hill Expert Survey (CHES), we highlight how the same issues are unequally politicized across Western Europe, and how some issues are less ideologically determined than others. Our model can be adapted to alternate settings, allowing researchers to shine a light on variation in, e.g., ideology, issue politicization, or party competition.


Introduction
Latent variable models allow the estimation of latent traits, such as party ideology, based on manifest behavior, such as issue positions (Bafumi et al., 2005). This approach has proven useful in many areas, one of which is placing parties in political space (König et al., 2013). A recurring question in such applications, however, is whether the resulting estimates of party positions are comparable across contexts writ large (Davidov et al., 2014). Comparability is hampered by contextual variation in the associations between latent and manifest variables, which is typically called differential item functioning (DIF).
DIF comes in two types. The first reflects artifacts emanating from measurement, which have received a great deal of attention (Bakker et al., 2014a,b; Hare et al., 2015). The second reflects substantive differences in the ideological space and its meaning and relevance for parties' issue positions. This has received much less attention, although the fluidity of the specific meaning of general ideology is well-documented at the mass and party levels (Bornschier, 2010; Wheatley and Mendez, 2021). This is our central focus.
We introduce a novel hierarchical item response-theoretic (IRT) model that allows researchers to disentangle and quantify different sources of DIF across countries, parties, and measurement instruments. The model particularly yields insight into the substantive aspect of DIF: the differential meaning of ideology across countries and the fact that ideology does not fully account for the positions of some parties on some issues. We apply this model to the placement of Western-European parties in a two-dimensional ideological space using data from the 2019 Chapel Hill Expert Survey (CHES; Jolly et al., 2022). We find that economic left-right maps quite uniformly onto issues across Western Europe. To a slightly lesser degree, this is true of the cultural dimension of ideology. By contrast, issues related to the European Union map rather heterogeneously across Europe. We also find that the two ideological dimensions cannot fully account for the positioning of specific parties on specific issues. The analysis speaks to a variety of ongoing debates about the Western-European political space, including its meaning and dimensionality (Hooghe and Marks, 2018; De Vries and Hobolt, 2020; Wheatley and Mendez, 2021). More generally, the tool we develop (1) distinguishes between a variety of sources of contextual variation; (2) can be used for a wide variety of data sources on party positions, including expert surveys, voter placements, party manifestos, and other textual data; and (3) extends other frameworks such as Aldrich-McKelvey (A-M) scaling (Hare et al., 2015).

Understanding DIF in party ideology
Latent variable models typically assume that latent traits are reflected in observable behavior and that, indeed, those traits induce coherence in what would otherwise be disparate items (Borsboom et al., 2003). When the manifest variables are political issues, the latent variables are typically conceived of as ideological dimensions, following the logic of ideological constraint outlined by Converse (1964). In the Western-European context, there is increasing consensus that two dimensions are required to capture the nature of political conflict across political parties and citizens (Marks et al., 2006; Bornschier, 2010; Hooghe and Marks, 2018). We take this two-dimensional view of the political space as our starting point in this research note, which imposes a common structure between latent traits and observable behavior.[1]
DIF concerns the question of whether latent dimensions connect uniformly to manifest variables (for an overview, see Davidov et al., 2014). We identify four sources of DIF in placing parties in a political space: two related to measurement and two related to substantive differences across contexts and actors. A first measurement-related source of DIF is the raters of party positions, whether they be experts, voters, or coders of party manifestos (e.g., see Steenbergen and Marks, 2007). This problem is typically addressed through response aggregation (e.g., Jolly et al., 2022).
A second measurement-related source of DIF concerns cross-national differences in response behavior, which may result from different understandings of constructs or response styles. This problem can be addressed through the use of vignettes (Bakker et al., 2014a,b) or A-M scaling (Hare et al., 2015). Such approaches can be used to recover cross-national distortions in the linkage between observed indicators and latent traits.
Of course, we pay close attention to these sources of DIF in our analysis. Our primary interest, however, is one that we share with comparativists: substantive differences in the ideological space, as the meaning of political ideology is not the same everywhere (Huber and Inglehart, 1995; Kriesi et al., 2006; Bornschier, 2010). The central question here is how political ideology connects to specific issues, i.e., how issues are politicized (Bakker et al., 2012; Rovny and Whitefield, 2019). If it is true, as much of the literature assumes, that ideology has a static quality, then the question of how existing ideological conflicts translate into issues is highly relevant and a potential third source of cross-national DIF (Marks et al., 2006; Hooghe and Marks, 2018).
A fourth source of DIF concerns cross-party differences. It is possible that parties take issue positions that are unexpected given their placement in ideological space. This is consistent with the idea of political entrepreneurship (parties strategically seeking to politicize issues orthogonal to existing ideological dimensions of conflict), but it may also be a legacy of specific issues giving rise to the party (De Vries and Hobolt, 2020). We conceive of this in terms of idiosyncratic shocks, in the way Lauderdale et al. (2018) have done for public opinion. The result of these shocks is that parties with identical ideological positions can nevertheless display quite different positions on certain issues, reflecting that issue-specific dimensions of conflict need not necessarily align with ideological dimensions of conflict.
[1] The model presented below can be adapted to alternate dimensionalities without further ado.
We thus need a model that incorporates four different sources of variation relative to the common structure of a two-dimensional political space as described above: (1) variation among observers of issue positions; (2) national variation in response styles across those observers; (3) cross-national differences in the linkage between issue positions and political ideology; and (4) idiosyncratic shocks that cause parties with identical ideological positions to take on heterogeneous issue positions. Such a model does not yet exist and we develop it in the next section.

Setup
In our exposition, $y_{ijce}$ is the position of political party $i$ on issue $j$, in country $c$, as indicated by expert $e$ on an ordered scale. While we focus on experts, $e$ can be generally thought of as any coder of issue positioning. The probability $\Pr(y_{ijce} = k)$, where the response option $k = 1, \dots, K$, is determined by an appropriate link function between $y_{ijce}$ and the latent continuum $y^*_{ijce}$. The link function can be adapted to incorporate other kinds of measures such as word counts for manifestos.
Two party-specific factors are systematically associated with $y^*_{ijce}$, namely a party's position on $D$ ideological dimensions, $\theta_{ic}$, and a party's idiosyncratic preferences, $\gamma_{ij}$. The discrimination parameters of an issue in $\mathbb{R}^D$ are given by $\beta_{jc}$ and capture the strength and direction of the association between latent ideological traits and issue positions. Importantly, the subscript $c$ allows for cross-national variation in these associations, so that ideological conflict can play out differently at the issue-level in different locations. In keeping with random-effects IRT models, we postulate $\beta_{jcd} = \beta_{jd0} + \beta_{jcd1}$, where $\beta_{jd0}$ is the mean and $\beta_{jcd1}$ is a country-issue-specific error term (cf. De Jong et al., 2007; Fox and Verhagen, 2018). This way of modeling the discrimination parameters constitutes one of our main contributions.
The idiosyncratic shocks are applied to the item difficulty parameters of the model. Specifically, for each party and each issue, we postulate a difficulty of $\gamma_{ij}$. This is an additive component that makes it more or less likely for political parties to embrace a particular position, regardless of $\theta_{ic}$. Thus, two parties with identical $\theta_{ic}$ can still have different positions on an issue.
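The role of the two party-specific components can be illustrated with a minimal Python sketch. It assumes that the ideological component is the inner product of the party position and the country-specific discrimination vector and that the shock enters additively; the function names and all numerical values are hypothetical and are not taken from the estimated model.

```python
def discrimination(beta_mean, beta_dev):
    """Country-specific discrimination per dimension d:
    beta_jcd = beta_jd0 (mean) + beta_jcd1 (country-issue deviation)."""
    return [m + d for m, d in zip(beta_mean, beta_dev)]


def latent_position(theta, beta, gamma):
    """Ideological component (inner product over the D dimensions) plus the
    party-issue shock gamma_ij. The additive combination is an assumption."""
    return sum(t * b for t, b in zip(theta, beta)) + gamma


# Hypothetical values for one issue j in one country c, with D = 2.
beta_jc = discrimination(beta_mean=[1.25, 0.25], beta_dev=[-0.25, 0.25])
theta = [1.0, -1.0]  # both parties share the same ideological position
party_a = latent_position(theta, beta_jc, gamma=0.0)
party_b = latent_position(theta, beta_jc, gamma=0.75)
print(party_a, party_b)
```

The two parties differ on the latent issue continuum only through their shocks, which is the sense in which identical ideology can coexist with different issue positions.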
Two additional factors influence the relationship between latent and manifest variables, namely heteroskedasticity in the scale of experts' errors, $\sigma_e$ (cf. Harvey, 1976), and country-level variation in scaling, $\zeta_c$. The expert-specific term captures variation across experts, whereas the parameter $\zeta_c$ is akin to the scale parameter in A-M models (Hare et al., 2015) and captures cross-national differences in response behavior.
Combining everything so far, we obtain the following model:
[Equation not recoverable from the source.]
We use an ordered probit specification to link $y^*_{ijce}$ and $y_{ijce}$, using a set of ordered cutpoints $\alpha_{jc1}, \dots, \alpha_{jc,K-1}$ (Samejima, 1969). Similar to the discrimination parameters, the cutpoints vary across issues and countries. We model them as $\alpha_{jck} = \alpha_{jk0} + \alpha_{jc1} + \alpha_c$, where $\alpha_c$ is akin to the shift parameter in the A-M model, which captures country-specific shifts in responses (Hare et al., 2015), and $\alpha_{jc1}$ is a country-issue-specific error term. The response probabilities are now given by:
[Equation not recoverable from the source.]
where $\Phi$ denotes the cumulative standard normal distribution function.
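To make the ordered probit link concrete, the following Python sketch computes category probabilities in the standard way, as differences of $\Phi$ evaluated at adjacent cutpoints. It is a generic illustration: how $\zeta_c$ and $\sigma_e$ enter the latent predictor is not reproduced here, so the single `scale` argument and all numerical values are assumptions.

```python
import math


def std_normal_cdf(x):
    """Cumulative standard normal distribution function, Phi(x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def country_issue_cutpoints(alpha_mean, alpha_jc, alpha_c):
    """alpha_jck = alpha_jk0 + alpha_jc1 + alpha_c for k = 1, ..., K - 1."""
    return [a + alpha_jc + alpha_c for a in alpha_mean]


def ordered_probit_probs(latent, cutpoints, scale=1.0):
    """Return [Pr(y = 1), ..., Pr(y = K)] given K - 1 ordered cutpoints."""
    if any(b <= a for a, b in zip(cutpoints, cutpoints[1:])):
        raise ValueError("cutpoints must be strictly increasing")
    # Phi at each cutpoint, padded with 0 and 1 for the two outer categories.
    cdf = [0.0] + [std_normal_cdf((a - latent) / scale) for a in cutpoints] + [1.0]
    return [hi - lo for lo, hi in zip(cdf, cdf[1:])]


# Hypothetical cutpoints for a four-category item and one latent position.
cuts = country_issue_cutpoints([-1.0, 0.0, 1.0], alpha_jc=0.25, alpha_c=-0.25)
probs = ordered_probit_probs(latent=0.5, cutpoints=cuts)
print(probs)
```

Because the shift terms are added to every cutpoint of an item, they move all thresholds together and leave the spacing between categories unchanged.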
The model's innovation is to allow for variation in the discrimination parameters and thresholds. In addition, it combines research on idiosyncratic shocks (Lauderdale et al., 2018) and A-M scaling (Hare et al., 2015). Indeed, the model generalizes the A-M approach to settings with multiple observations of indicators for units of analysis, whereas that approach is typically used with single observations of indicators for units. This allows us to estimate country-item-specific scale and shift parameters in addition to the country-specific scale and shift parameters estimable via the A-M approach.

Priors
We estimate our IRT model using Bayesian inference, as implemented in Stan (Carpenter et al., 2017), with the following priors. Parties' positions on latent ideological dimensions are drawn from a standard normal distribution: $\theta_{icd} \sim N(0, 1)$. Further, $\gamma_{ij} \sim N(0, \sigma_\gamma)$. Throughout, the hyperparameters $\sigma$ for priors of the idiosyncratic shocks ($\sigma_\gamma$), the random effects on the discrimination parameters ($\sigma_\beta$) and cutpoints ($\sigma_\alpha$), as well as expert-specific errors ($\sigma_\sigma$) and country-specific scale parameters ($\sigma_\zeta$) are drawn from a truncated normal distribution, $\sigma \sim N^{+}(0, 1)$, to ensure that $\sigma > 0$.
The mean item parameters are also drawn from normal distributions: (1) $\beta_{jd0} \sim N(0, 1)$ and (2) $\alpha_{jk0} \sim N(0, 1)$ (plus an ordering constraint). The random effects for both sets of parameters are drawn from multivariate normal distributions, specifically $\beta_{cd1} \sim \mathrm{MVN}(0, \Sigma_{\beta_d})$ and $\alpha_{c1} \sim \mathrm{MVN}(0, \Sigma_\alpha)$. The covariance matrices here are estimated from the data by first multiplying scale parameters $\sigma_\beta$ and $\sigma_\alpha$ with two vectors of length $J$ drawn from uniform Dirichlet distributions ($D(1)$). This results in vectors $\tau_\beta$ and $\tau_\alpha$, as $\sigma_\beta$ and $\sigma_\alpha$ are distributed across each element of the vectors constrained to sum to 1. Then, diagonal matrices with diagonal elements consisting of these vectors are multiplied with correlation matrices, $\Omega$, drawn from LKJ-priors with shape = 4 so that, e.g., $\Sigma_\alpha = \mathrm{diag}(\tau_\alpha)\,\Omega_\alpha\,\mathrm{diag}(\tau_\alpha)$ (Barnard et al., 2000). The country-specific parameters for the ordered cutpoints are drawn from a standard normal distribution: $\alpha_c \sim N(0, 1)$.
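The construction of the covariance matrices can be sketched in a few lines of Python. The sketch assumes a small hypothetical case with three elements rather than $J$ issues, draws the simplex from a uniform Dirichlet by normalizing Gamma(1, 1) variates, and uses a fixed correlation matrix as a stand-in for a draw from the LKJ prior, which is not implemented here.

```python
import random


def dirichlet_uniform(n, rng):
    """Draw from a symmetric Dirichlet(1, ..., 1) by normalizing Gamma(1, 1) variates."""
    draws = [rng.gammavariate(1.0, 1.0) for _ in range(n)]
    total = sum(draws)
    return [d / total for d in draws]


def covariance_from_scales(tau, omega):
    """Sigma = diag(tau) Omega diag(tau), i.e. Sigma[i][j] = tau_i * Omega_ij * tau_j."""
    n = len(tau)
    return [[tau[i] * omega[i][j] * tau[j] for j in range(n)] for i in range(n)]


rng = random.Random(1)
sigma_alpha = 2.0  # hypothetical overall scale
# Distribute the overall scale across elements via the simplex weights.
tau_alpha = [sigma_alpha * w for w in dirichlet_uniform(3, rng)]
# Fixed, hypothetical correlation matrix standing in for an LKJ draw.
omega_alpha = [[1.0, 0.3, 0.0],
               [0.3, 1.0, -0.2],
               [0.0, -0.2, 1.0]]
Sigma_alpha = covariance_from_scales(tau_alpha, omega_alpha)
```

Separating scales from correlations in this way lets each part receive its own prior, which is the decomposition described by Barnard et al. (2000).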
Finally, the scale of expert-specific errors is drawn from a symmetric Dirichlet distribution, where the hyperparameter $1/\sigma_\sigma$ is equal across experts: $\sigma_e \sim D(1/\sigma_\sigma)$. This constitutes an uninformative prior over the resulting vector and ensures that the resulting estimates cannot be negative. This is also the case for the country-specific scale parameters: $\zeta_c \sim D(1/\sigma_\zeta)$.

Identification
Our IRT model is not identified without further constraints, which pertain to location, scale, and rotation of the latent dimensions (see Bafumi et al., 2005). We fix the scale and location by re-scaling estimated ideological positions θ to a standard normal distribution after each iteration of the sampling procedure. We address the rotational invariance problem by constraining the discrimination parameters to 0 for selected issues on selected dimensions (see below) and by setting starting values for party positions based on prior knowledge.
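The re-scaling step that fixes location and scale can be sketched as follows. The sketch assumes that "re-scaling to a standard normal distribution" means centering the positions on one dimension at 0 and dividing by their standard deviation; whether the population or the sample standard deviation is used is not stated, and the population version is chosen here for illustration.

```python
import math


def standardize(values):
    """Rescale values to mean 0 and (population) standard deviation 1."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / sd for v in values]


theta_draw = [2.0, 3.5, 1.0, 4.5]  # hypothetical positions on one dimension
theta_std = standardize(theta_draw)
print(theta_std)
```

The transformation is linear and increasing, so the ordering of parties and their relative distances are preserved.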
The introduction of idiosyncratic preferences and random effects on item parameters introduces the potential for novel identification problems regarding location. We solve this by fixing the mean of those parameters to 0 for each issue and parameter. Introducing $\sigma_e$ and $\zeta_c$ raises potential identification issues for the scale of the estimated latent parameters. Thus, we impose a mean of 1 for those parameters.
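One way to obtain positive scale parameters with a mean of 1 from a Dirichlet-distributed vector is to multiply the simplex by its length. The Python sketch below illustrates this idea only; it is an assumption about how the constraint could be combined with the Dirichlet prior, not a description of the estimation code, and the hyperparameter value is hypothetical.

```python
import random


def symmetric_dirichlet(n, concentration, rng):
    """Draw from Dirichlet(concentration, ..., concentration) via normalized Gamma variates."""
    draws = [rng.gammavariate(concentration, 1.0) for _ in range(n)]
    total = sum(draws)
    return [d / total for d in draws]


def mean_one_scales(simplex):
    """Multiply a simplex by its length so the entries are positive with mean 1."""
    n = len(simplex)
    return [n * w for w in simplex]


rng = random.Random(7)
sigma_sigma = 0.5  # hypothetical hyperparameter; concentration is 1 / sigma_sigma
sigma_e = mean_one_scales(symmetric_dirichlet(5, 1.0 / sigma_sigma, rng))
print(sigma_e)
```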
We assess convergence of our model via the $\hat{R}$ convergence diagnostic, which is below 1.1 for all parameters (Gelman and Rubin, 1992).
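For readers unfamiliar with the diagnostic, the following Python sketch computes the classic potential scale reduction factor of Gelman and Rubin (1992) for a single parameter from several chains. It is an illustration with hypothetical draws; current Stan versions report refined variants (such as split or rank-normalized $\hat{R}$), and which variant underlies the reported values is not stated here.

```python
def gelman_rubin_rhat(chains):
    """Classic potential scale reduction factor for one parameter.

    chains: list of m >= 2 chains, each a list of n >= 2 draws of equal length.
    """
    m, n = len(chains), len(chains[0])
    chain_means = [sum(c) / n for c in chains]
    grand_mean = sum(chain_means) / m
    # Between-chain variance B and mean within-chain variance W.
    b = n / (m - 1) * sum((mu - grand_mean) ** 2 for mu in chain_means)
    w = sum(sum((x - mu) ** 2 for x in c) / (n - 1)
            for c, mu in zip(chains, chain_means)) / m
    # Pooled variance estimate; R-hat compares it with the within-chain variance.
    var_plus = (n - 1) / n * w + b / n
    return (var_plus / w) ** 0.5


# Hypothetical draws: two chains that overlap and two that are far apart.
rhat_mixed = gelman_rubin_rhat([[0.0, 1.0, 2.0, 3.0], [1.0, 2.0, 3.0, 0.0]])
rhat_apart = gelman_rubin_rhat([[0.0, 1.0, 2.0, 3.0], [10.0, 11.0, 12.0, 13.0]])
print(rhat_mixed < 1.1, rhat_apart < 1.1)
```

Values near 1 indicate that the chains agree with one another; values well above the 1.1 threshold indicate that they explore different regions of the posterior.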

Data
In our application, we use expert-level CHES data (2019), which covers 15 Western-European countries, 21 issues, 129 political parties, and 191 experts (Jolly et al., 2022). For each issue, experts position the party on an ordered response scale. The issues are selected to tap into three distinctive areas of political conflict, namely the economy (five items), social/cultural issues (ten items), and the European Union (six items). For more on the data, see Appendix 1 in the SI.
One major advantage of CHES is that it contains multiple observations per party and issue from different experts. On the one hand, this allows us to disentangle ideology from idiosyncrasy. On the other, this allows us to estimate $\sigma_e$. Other data sources that have this feature are voter placements of parties or party manifestos, if there are multiple coders.
We impose a two-dimensional ideological structure to explain variation in party preferences. One dimension captures conflict over the economy, specifically the role of the state, whereas the second dimension captures conflict over culture with poles that have sometimes been described as green-alternative-libertarian (GAL) and traditional-authoritarian-nationalist (TAN) (Bakker et al., 2014a). For identification purposes, we set the discrimination parameter of the issue "economic intervention" to 0 for the cultural dimension. We also set the discrimination parameter for "immigration policy" on the economic dimension equal to 0. Substantively, this specifies that economic intervention is entirely unrelated to the cultural dimension and that immigration policy is entirely unrelated to the economic dimension.

The ideological component
How did parties place on the ideological components? What patterns do the item discrimination parameters reveal? Figure 1 displays the correlation between the standardized mean expert placements on economic left-right and GAL-TAN (horizontal axes) and estimates of $\theta_{ic}$ from our model (vertical axes). Overall, the values strongly correlate: r = 0.93 for the economic dimension and r = 0.90 for the cultural dimension. We interpret this as high face validity of our model. Figure 2 is a box plot of the mean item discrimination parameters, $\beta_{jc}$, with the boxes reflecting cross-country variation.[2] The absolute size of the parameters indicates how strongly issues connect with the ideological dimensions (i.e., how well does an issue distinguish parties on either side of a dimension?), whereas the sign indicates the nature of the ideological politicization of an issue (e.g., does the economic right or the economic left favor a policy?). The figure highlights that the core issues associated with the economic dimension are very similar across Western Europe (visible in the first row of the figure). For deregulation, economic intervention, income redistribution, and spending versus tax cuts, there is hardly any variation in the parameters.
On the social/cultural issues, there is more heterogeneity (visible in the second row). This is particularly true for urban-rural policy emphases, the role of religious principles in politics, and the status of regions. Of the latter issue, it could be said that it is salient in only a few countries (e.g., Belgium, Spain, and the United Kingdom) and can be framed in both economic and cultural terms. Considering crime, the environment, immigration, nationalism, and social lifestyle, there is considerably less cross-national variation in the discrimination parameters. Those issues appear to define the cultural GAL-TAN dimension uniformly in Western Europe.
The greatest variation in item discrimination parameters can be found for the issues related to European integration (visible in the third and final row). Here the inter-quartile distances (the box lengths) and the whiskers are sizable compared to the other issues. In some countries, such as the United Kingdom, the European Union mostly seems to be a cultural issue, showing a much larger discrimination parameter for that than for the economic dimension (see Figures 4 and 5 in the SI). In other countries, such as Greece, Europe appears to be contested mainly on the economic dimension.
The payoff of our modeling strategy is that we unconfound the discrimination parameters from the scaling factor $\zeta_c$. If we do not do this, then at least part of the variation in the association between items and latent variables may be due to differences in cross-national response behavior. That is not the case with our setup.

The idiosyncratic component
Our model allows for party-specific shocks, $\gamma_{ij}$, relative to the ideological component structuring parties' issue preferences. Figure 3 shows the size of those shocks for the various issues. The issues are sorted by the standard deviations of the shocks. The first thing to observe is that the shocks can be sizable, meaning that for certain issues party preferences are driven not by ideology alone, but also by idiosyncratic preferences. If the image so far has been one of relative agreement about the issues that discriminate in the two-dimensional space, the current image is one of distinct heterogeneity.
The largest shocks appear for five issues: (1) the role of religious principles in politics; (2) the urban-rural focus of parties; (3) their regional policy stances; (4) the environment; and (5) social lifestyle. All of those issues discriminate on the cultural dimension. It is clear, however, that some parties take positions that deviate from their cultural ideology. Further analysis shows this to be true of the agrarian party family for urban-rural concerns, the confessional family for religious principles and social lifestyle, the green family for the environment, and the regionalist family for regional issues (see Figure 6 in the SI).

Other results
In our discussion thus far, we have focused on parameters associated with party-specific factors influencing parties' issue positions, namely parties' latent ideological placement ($\theta_{ic}$), discrimination parameters ($\beta_{jc}$), and idiosyncratic shocks ($\gamma_{ij}$). In doing so, we have highlighted the face validity of this model, variation in the politicization of issues across Western Europe, and the extent to which parties' positions are determined by factors other than ideology. These are the quantities most clearly associated with our primary interest, namely substantive sources of DIF.
For brevity, we present the resulting estimates of other parameters regarding measurement-specific sources of DIF from this model in the SI: the country-issue-specific shocks to difficulty parameters ($\alpha_{jc}$; Figure 7), the country-specific scale and shift parameters ($\zeta_c$ and $\alpha_c$; Figures 8 and 9), as well as expert-level errors ($\sigma_e$; Figure 10). In Appendix 3 in the SI, we show via model comparisons that the single component that accounts for the largest increase in model performance relative to a conventional IRT model is the inclusion of idiosyncratic preferences. Overall, the results show that our model disentangles different sources of DIF that represent different sources of both substantive and measurement-specific variation in the association between manifest and latent variables across countries.

Conclusions
In this paper, we have outlined a method that allows scholars to parse DIF into a variety of sources, distinguishing substantive factors (i.e., idiosyncratic preferences and differential meanings of ideology) from measurement-related factors. The approach speaks to the nature of political conflict, issue politicization, and the role of agency in politics, i.e., parties positioning themselves in unique ways. In general, the IRT model uncovers political space in all its local nuance. Since the concept of space plays a crucial role in voting behavior and political representation, it is essential that scholars have the tools to understand the full complexity of its nature.
Our approach is general. While we focused on expert ratings of party positions, one could easily use other measurement instruments such as voter placements or placements by coders of party manifestos or other relevant texts. Key for the comprehensive model outlined here is that there are multiple observations per country, party, rater, and issue; but a more restricted model can also be estimated with more lenient data requirements. Of course, our insights were premised on a specific choice of two dimensions. For country comparisons to work, it is necessary to impose a specific dimensionality of the space. Differences in that dimensionality are another source of DIF. As long as we can identify subsets of at least two countries that abide by a particular dimensionality, our approach can be applied to those subsets. For instance, at the mass-level, Wheatley and Mendez (2021) found that two dimensions were needed in some countries, whereas others required three. Thus, we could estimate separate models for both groups of countries.