
ecolRxC: Ecological inference estimation of R × C tables using latent structure approaches

Published online by Cambridge University Press: 14 October 2024

Jose M. Pavía*
Affiliation:
GIPEyOP, Area of Quantitative Methods, Universitat de Valencia, Valencia, Spain
Søren Risbjerg Thomsen
Affiliation:
Department of Political Science, University of Aarhus, Aarhus, Denmark
*Corresponding author: Jose M. Pavía; Email: pavia@uv.es

Abstract

Ecological inference is a statistical technique used to infer individual behavior from aggregate data. A particularly relevant instance of ecological inference involves estimating the inner cells of a set of related R × C contingency tables when only their aggregate margins are known. This problem spans multiple disciplines, including quantitative history, epidemiology, political science, marketing, and sociology. This paper proposes new models for solving the problem using latent structure theory, and presents the ecolRxC package, an R implementation of this methodology. The article exemplifies, explains, and statistically documents the new extensions and, using election data for which the real inner cells are known, shows that the new models in ecolRxC lead to significantly more accurate solutions than ecol and VTR, two Stata routines suggested within this framework. ecolRxC also holds its own against ei.MD.bayes and nslphom, the two algorithms currently identified in the literature as the most accurate for solving this problem, recording accuracies as good as those reported for them. Moreover, from a theoretical perspective, ecolRxC stands out for building its algorithm on a causal theory of political behavior. This distinguishes it from procedures proposed within other frameworks (such as ei.MD.bayes and nslphom), which model expected behaviors rather than how voters make choices based on their underlying preferences, as ecolRxC does.
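The core difficulty described in the abstract, recovering inner cells when only margins are observed, can be illustrated with a minimal Python sketch. The two tables and the helper names `margins` and `independence_baseline` are hypothetical and chosen only for illustration. The proportional fill is a naive baseline, not the latent structure approach implemented in ecolRxC.

```python
def margins(table):
    """Return (row totals, column totals) of a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    return row_totals, col_totals


def independence_baseline(row_totals, col_totals):
    """Fill inner cells proportionally to the margins (naive baseline, not ecolRxC)."""
    total = sum(row_totals)
    return [[r * c / total for c in col_totals] for r in row_totals]


# Two hypothetical 2 x 2 voter-transition tables for a single territorial unit.
table_a = [[50, 10], [0, 40]]
table_b = [[10, 50], [40, 0]]

# Both tables share the same margins, so the margins alone cannot tell them apart.
assert margins(table_a) == margins(table_b) == ([60, 40], [50, 50])

# One of many inner tables compatible with those margins.
baseline = independence_baseline([60, 40], [50, 50])
print(baseline)
```

Because very different inner tables reproduce the same margins, any estimator has to add structure beyond the margins themselves, such as the behavioral assumptions the abstract attributes to ecolRxC.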

Information

Type
Original Article
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© The Author(s), 2024. Published by Cambridge University Press on behalf of EPS Academic Ltd

Figure 1. Graphical summary example of an output of ecolRxC. The global total counts are presented in the margins of the plot table and the estimated row-standardized transition fractions in its inner cells. The size of the number in each inner cell is proportional (on a log scale) to its estimated count, and the color intensity of each cell within a row is proportional to the fraction of voters of that row option who switch to the corresponding column option.


Table 1. Basic ecological inference latent structure procedures available in ecolRxC


Figure 2. Average EI (upper panels), EPW (middle panels), and EQ (lower panels) errors by procedure (specification) using either the logit (left panels) or the probit (right panels) fraction transformation. The correspondence between the acronyms of the procedures and their ecolRxC specifications is detailed in Table 1. In the ecol specification, errors are computed as simple averages of the RC errors corresponding to the RC possible reference solutions. The smaller the error, the better the accuracy.


Table 2. Averages of EI errors by group of elections


Table 3. Averages of EI errors by group of elections for the eight composite solutions


Figure 3. Estimated EI (left panel) and EPW (right panel) errors by election for the ecolRxC default solution (red points) and its linked RC solutions (black points), obtained by choosing each of the RC possible row-column pairs as reference. Elections are ordered from smallest to largest EI.

Supplementary material: File

Pavía and Thomsen supplementary material

File 308.7 KB