
A Nonparametric Bayesian Model for Detecting Differential Item Functioning: An Application to Political Representation in the US

Published online by Cambridge University Press:  21 February 2023

Yuki Shiraito*
Affiliation:
Department of Political Science, University of Michigan, Ann Arbor, MI, USA. E-mail: shiraito@umich.edu
James Lo
Affiliation:
Department of Political Science and International Relations, University of Southern California, Los Angeles, CA, USA. E-mail: lojames@usc.edu
Santiago Olivella
Affiliation:
Department of Political Science, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA. E-mail: olivella@usc.edu
*Corresponding author: Yuki Shiraito

Abstract

A common approach when studying the quality of representation involves comparing the latent preferences of voters and legislators, commonly obtained by fitting an item response theory (IRT) model to a common set of stimuli. Despite being exposed to the same stimuli, voters and legislators may not share a common understanding of how these stimuli map onto their latent preferences, leading to differential item functioning (DIF) and incomparability of estimates. We explore the presence of DIF and incomparability of latent preferences obtained through IRT models by reanalyzing an influential survey dataset, where survey respondents expressed their preferences on roll call votes that U.S. legislators had previously voted on. To do so, we propose defining a Dirichlet process prior over item response functions in standard IRT models. In contrast to typical multistep approaches to detecting DIF, our strategy allows researchers to fit a single model, automatically identifying incomparable subgroups with different mappings from latent traits onto observed responses. We find that although there is a group of voters whose estimated positions can be safely compared to those of legislators, a sizeable share of surveyed voters understand stimuli in fundamentally different ways. Ignoring these issues can lead to incorrect conclusions about the quality of representation.
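As a rough illustration of how a Dirichlet process prior lets the number of respondent subgroups be determined by the data, the partition it induces can be simulated with the Chinese restaurant process. The sketch below is not the authors' sampler: the function name `crp_partition`, the concentration value `alpha`, and the number of respondents are assumptions chosen for illustration. In a full model, each resulting cluster would carry its own set of item parameters.

```python
import random


def crp_partition(n_respondents, alpha, seed=0):
    """Draw cluster labels from a Chinese restaurant process.

    alpha is the DP concentration parameter (hypothetical value chosen by
    the caller); larger alpha tends to produce more clusters.
    """
    rng = random.Random(seed)
    labels = []  # labels[i] = cluster index of respondent i
    counts = []  # counts[k] = respondents already assigned to cluster k
    for _ in range(n_respondents):
        # An existing cluster k is chosen with weight counts[k];
        # a brand-new cluster is opened with weight alpha.
        weights = counts + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(counts):
            counts.append(1)
        else:
            counts[k] += 1
        labels.append(k)
    return labels, counts


# Illustrative draw: 200 hypothetical respondents, alpha = 1.
labels, counts = crp_partition(200, alpha=1.0, seed=1)
print(len(counts), "clusters with sizes", counts)
```

The number of clusters is not fixed in advance; it grows slowly with the number of respondents, which is what allows a single model fit to propose subgroups without a separate model-selection step.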

Information

Type
Article
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© The Author(s), 2023. Published by Cambridge University Press on behalf of the Society for Political Methodology

Table 1 Simulated versus estimated clusters, MPS model. The estimated clusters recover the simulated clusters, but the sub-clustering phenomenon results in multiple estimated versions of the same cluster. For example, estimated clusters 2 and 4 represent two different ways to identify the simulated cluster 2.

Figure 1 Gap statistic over different numbers of substantive clusters, defined as communities in a graph of item parameter correlations. High values of the gap statistic indicate a grouping with high within-cluster similarity relative to a null model (in which edges are drawn uniformly at random) with no heterogeneity. Thus, the k that maximizes the gap statistic is a reasonable estimate for the number of substantive clusters in the data.

Figure 2 Graphs defined on nodes given by DP mixture sub-clusters. The graphs have weighted edges defined using pairwise correlations between discrimination parameters (left panel) and difficulty parameters (right panel). True simulation clusters are denoted with different node shapes, and communities detected by a modularity-maximizing algorithm are denoted with shaded regions. Recovery of simulated clusters is exact in both instances.
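The construction described in this caption can be sketched in a few lines: sub-clusters are nodes, edge weights come from pairwise correlations of their item parameters, and sub-clusters are grouped by maximizing modularity. The following is a minimal sketch under stated assumptions, not the authors' procedure: all function names and the toy parameter values are hypothetical, negative correlations are truncated to zero (an assumption), and an exhaustive search over labelings stands in for whichever modularity-maximizing algorithm was actually used.

```python
from itertools import product


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def correlation_graph(params):
    """params[i] holds the item parameters of sub-cluster i.

    Returns a symmetric weight matrix with zero diagonal.
    """
    n = len(params)
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Assumption: negative correlations are treated as "no edge".
            w[i][j] = w[j][i] = max(pearson(params[i], params[j]), 0.0)
    return w


def modularity(w, labels):
    """Weighted Newman modularity of a labeling of the nodes."""
    n = len(w)
    strength = [sum(row) for row in w]
    total = sum(strength)  # equals twice the total edge weight
    if total == 0:
        return 0.0
    q = 0.0
    for i in range(n):
        for j in range(n):
            if labels[i] == labels[j]:
                q += w[i][j] - strength[i] * strength[j] / total
    return q / total


def best_partition(w):
    """Exhaustive modularity maximization; feasible only for a few nodes."""
    n = len(w)
    return max(product(range(n), repeat=n), key=lambda lab: modularity(w, lab))


# Toy discrimination parameters for four hypothetical sub-clusters:
# the first two are near-copies of each other, as are the last two.
toy_params = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [1.1, 2.2, 2.9, 4.1, 5.3],
    [5.0, 1.0, 4.0, 2.0, 3.0],
    [4.8, 1.3, 4.1, 2.2, 2.7],
]
weights = correlation_graph(toy_params)
print(best_partition(weights))
```

Exhaustive search is used only because it is transparent; with more than a handful of sub-clusters, a heuristic community-detection routine would be needed instead.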

Figure 3 Correlation of item discrimination parameters. Plots on the main diagonal show estimated versus simulated parameters for each cluster, indicating that the item discrimination parameters are correctly recovered up to an affine transformation. Off-diagonal plots show cross-cluster correlation between estimated and true item parameters, which is expected (under the simulation) to be zero.

Figure 4 Correlation of latent trait parameters. Plots show simulated against estimated latent traits for all 10 estimated clusters.

Figure 5 Gap statistic. Statistic defined over different numbers of substantive clusters, when the true data-generating process (DGP) has no heterogeneity. Here, the gap statistic again recommends the correct number of clusters, namely one.

Table 2 Estimated versus starting clusters. Legislators all started in cluster 1, and remained there throughout estimation.

Table 3 Correlations of item discrimination parameters between estimated CCES 2008 clusters. Standard errors in parentheses.

Figure 6 (Left) Gap statistic. (Right) Graph on nodes given by DP mixture sub-clusters. The left panel shows that two substantive clusters appear to fit the data best. The right panel graph has weighted edges defined using pairwise correlations between discrimination parameters in a model estimated on the 2008 CCES data. Shaded regions denote communities detected by a modularity-maximizing algorithm. Again, two substantive clusters appear to summarize the data best, with a “legislator cluster” formed by sub-clusters 1, 2, and 5.

Figure 7 Point estimates and 90% credible intervals for coefficients in a Bayesian probit regression of membership in the estimated legislator cluster. A reference line is added at zero. We find that “political interest,” “race,” and “age” are likely to be characteristic of voters in the legislator cluster.