Integrating Data Across Misaligned Spatial Units

Published online by Cambridge University Press: 23 March 2023

Yuri M. Zhukov*
Affiliation:
Department of Political Science, University of Michigan, Ann Arbor, MI, USA. E-mail: zhukov@umich.edu
Jason S. Byers
Affiliation:
Social Science Research Institute, Duke University, Durham, NC, USA. E-mail: jason.byers@duke.edu
Marty A. Davidson II
Affiliation:
Department of Political Science, University of Michigan, Ann Arbor, MI, USA. E-mail: martydav@umich.edu
Ken Kollman
Affiliation:
Department of Political Science, University of Michigan, Ann Arbor, MI, USA. E-mail: kkollman@umich.edu
*Corresponding author: Yuri M. Zhukov

Abstract

Theoretical units of interest often do not align with the spatial units at which data are available. This problem is pervasive in political science, particularly in subnational empirical research that requires integrating data across incompatible geographic units (e.g., administrative areas, electoral constituencies, and grid cells). Overcoming this challenge requires researchers not only to align the scale of empirical and theoretical units, but also to understand the consequences of this change of support for measurement error and statistical inference. We show how the accuracy of transformed values and the estimation of regression coefficients depend on the degree of nesting (i.e., whether units fall completely and neatly inside each other) and on the relative scale of source and destination units (i.e., aggregation, disaggregation, and hybrid). We introduce simple, nonparametric measures of relative nesting and scale, as ex ante indicators of spatial transformation complexity and error susceptibility. Using election data and Monte Carlo simulations, we show that these measures are strongly predictive of transformation quality across multiple change-of-support methods. We propose several validation procedures and provide open-source software to make transformation options more accessible, customizable, and intuitive.
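
To make the idea of a change of support concrete, here is a minimal sketch of one standard method, area-weighted interpolation, which reallocates a count from source units to destination units in proportion to overlapping area. It rests on loud simplifying assumptions: units are axis-aligned rectangles rather than real polygons, the variable is an extensive count (e.g., votes), and values are spread uniformly within each source unit. All function and variable names are hypothetical, and the toy numbers are made up for illustration; this is not the authors' software and does not implement their nesting or scale measures.

```python
from typing import List, Sequence, Tuple

# Assumption: each unit is an axis-aligned rectangle (xmin, ymin, xmax, ymax).
Rect = Tuple[float, float, float, float]


def area(r: Rect) -> float:
    """Area of a rectangle (zero if degenerate)."""
    return max(0.0, r[2] - r[0]) * max(0.0, r[3] - r[1])


def overlap_area(a: Rect, b: Rect) -> float:
    """Area of the intersection of two rectangles (zero if disjoint)."""
    width = min(a[2], b[2]) - max(a[0], b[0])
    height = min(a[3], b[3]) - max(a[1], b[1])
    return max(0.0, width) * max(0.0, height)


def areal_weighting(
    sources: Sequence[Rect],
    values: Sequence[float],
    destinations: Sequence[Rect],
) -> List[float]:
    """Reallocate an extensive variable from source to destination units.

    Each destination receives the share of every source value equal to the
    fraction of that source unit's area lying inside the destination.
    """
    estimates = []
    for dest in destinations:
        total = 0.0
        for src, value in zip(sources, values):
            total += value * overlap_area(src, dest) / area(src)
        estimates.append(total)
    return estimates


# Toy example: two source units, two destination units covering the same region.
sources = [(0.0, 0.0, 2.0, 2.0), (2.0, 0.0, 4.0, 2.0)]
votes = [100.0, 60.0]
# The first source is fully nested in the first destination;
# the second source is split evenly by the destination boundary.
destinations = [(0.0, 0.0, 3.0, 2.0), (3.0, 0.0, 4.0, 2.0)]

estimates = areal_weighting(sources, votes, destinations)
print(estimates)
```

In this toy case the fully nested source unit is transferred without error, while the split unit is allocated purely by area, which is accurate only if its votes really are spread evenly. Because the destinations cover the sources completely, the total count is preserved.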

Information

Type
Article
Creative Commons
Creative Commons License: CC BY
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (https://creativecommons.org/licenses/by/4.0), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© The Author(s), 2023. Published by Cambridge University Press on behalf of the Society for Political Methodology

Figure 1 Spatial data layers (U.S. state of Georgia).

Table 1 Relative scale and nesting of polygons in Figure 1.

Figure 2 Output from change-of-support operations (Georgia). Markers distinguish outputs whose source features are polygons from outputs whose source features are polygon centroids ($\bigodot$).

Figure 3 Relative nesting, scale and transformations of election data (Georgia).

Figure 4 Examples of spatial data layers used in Monte Carlo study. Dotted lines are source units ($\mathcal{G}_S$), solid lines are destination units ($\mathcal{G}_D$).

Figure 5 Relative nesting and transformations of synthetic data.

Figure 6 Transformation quality at different percentiles of relative nesting.

Supplementary material

Zhukov et al. supplementary material: Online Appendix (PDF, 6.5 MB)

Zhukov et al. Dataset (link)