Minmaxing of Bayesian Improved Surname Geocoding and Geography Level Ups in Predicting Race

Jesse T. Clark; John A. Curiel; Tyler S. Steelman

doi:10.1017/pan.2021.31

Published online by Cambridge University Press: 29 November 2021

Jesse T. Clark: Postdoctoral Research Associate, Princeton University, Princeton, NJ, USA
John A. Curiel: Assistant Professor, Ohio Northern University, Ada, OH, USA
Tyler S. Steelman*: Department of Political Science, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA. E-mail: tsteelman@unc.edu
* Corresponding author: Tyler S. Steelman

Abstract

Racial identification is a critical factor in understanding a multitude of important outcomes in many fields. However, inferring an individual’s race from ecological data is prone to bias and error. This process was only recently improved via Bayesian improved surname geocoding (BISG). With surname and geographic-based demographic data, it is possible to more accurately estimate individual racial identification than ever before. However, the level of geography used in this process varies widely. Whereas some existing work makes use of geocoding to place individuals in precise census blocks, a substantial portion either skips geocoding altogether or relies on estimation using surname or county-level analyses. Presently, the trade-offs of such variation are unknown. In this letter, we quantify those trade-offs through a validation of BISG on Georgia’s voter file using both geocoded and nongeocoded processes and introduce a new level of geography—ZIP codes—to this method. We find that when estimating the racial identification of White and Black voters, nongeocoded ZIP code-based estimates are acceptable alternatives. However, census blocks provide the most accurate estimations when imputing racial identification for Asian and Hispanic voters. Our results document the most efficient means to sequentially conduct BISG analysis to maximize racial identification estimation while simultaneously minimizing data missingness and bias.
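
For readers unfamiliar with the mechanics, the Bayesian update at the core of BISG can be sketched as follows. This is a minimal illustration, not the authors' implementation. It assumes the standard formulation in which surname and geography are conditionally independent given race, so that P(race | surname, geography) is proportional to P(race | surname) × P(geography | race). The function name `bisg_posterior` and all probabilities are hypothetical toy values, not census figures.

```python
from typing import Dict


def bisg_posterior(
    p_race_given_surname: Dict[str, float],
    p_geo_given_race: Dict[str, float],
) -> Dict[str, float]:
    """Combine a surname-based prior with a geographic likelihood.

    Assumes surname and geography are conditionally independent given race,
    so the posterior is proportional to their product.
    """
    unnormalized = {
        race: p_race_given_surname[race] * p_geo_given_race[race]
        for race in p_race_given_surname
    }
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("No race category has positive probability.")
    # Normalize so the posterior probabilities sum to one.
    return {race: value / total for race, value in unnormalized.items()}


# Hypothetical toy inputs (illustrative only, not real census data).
# Prior: share of people with a given surname who identify with each race.
surname_prior = {"white": 0.60, "black": 0.30, "hispanic": 0.05, "asian": 0.05}
# Likelihood: share of each racial group that lives in a given ZIP code.
zip_likelihood = {"white": 0.001, "black": 0.004, "hispanic": 0.001, "asian": 0.001}

posterior = bisg_posterior(surname_prior, zip_likelihood)
most_likely = max(posterior, key=posterior.get)
print(most_likely, posterior)
```

Replacing the ZIP-level likelihoods with block- or county-level ones leaves the function unchanged; only the likelihood table differs, which is what allows the level of geography to be varied and compared.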

Keywords

Bayesian improved surname geocoding; racial identification; geocoding; geographic information system; ZIP codes

Information

Type: Letter
Information: Political Analysis, Volume 30, Issue 3, July 2022, pp. 456-462

DOI: https://doi.org/10.1017/pan.2021.31
Copyright: © The Author(s) 2021. Published by Cambridge University Press on behalf of the Society for Political Methodology

Footnotes

Edited by Jeff Gill
