Improving Ecological Inference by Predicting Individual Ethnicity from Voter Registration Records

Published online by Cambridge University Press:  04 January 2017

Kosuke Imai*
Affiliation:
Department of Politics and Center for Statistics and Machine Learning, Princeton University, Princeton, NJ 08544
Kabir Khanna
Affiliation:
Department of Politics, Princeton University, Princeton, NJ 08544
* e-mail: kimai@princeton.edu; URL: http://imai.princeton.edu (corresponding author)

Abstract

In both political behavior research and voting rights litigation, turnout and vote choice for different racial groups are often inferred using aggregate election results and racial composition. Over the past several decades, many statistical methods have been proposed to address this ecological inference problem. We propose an alternative method to reduce aggregation bias by predicting individual-level ethnicity from voter registration records. Building on the existing methodological literature, we use Bayes's rule to combine the Census Bureau's Surname List with various information from geocoded voter registration records. We evaluate the performance of the proposed methodology using approximately nine million voter registration records from Florida, where self-reported ethnicity is available. We find that it is possible to reduce the false positive rate among Black and Latino voters to 6% and 3%, respectively, while maintaining the true positive rate above 80%. Moreover, we use our predictions to estimate turnout by race and find that our estimates yield substantially less bias and root mean squared error than standard ecological inference estimates. We provide open-source software to implement the proposed methodology.
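
As a rough illustration of how Bayes's rule can combine surname and location information, the sketch below multiplies a surname-based probability of each racial category by the probability of living in a given area conditional on that category, then normalizes. It assumes that surname and location are conditionally independent given race, which is an assumption of this sketch and is not stated in the abstract. The function name `posterior_race` and every number are hypothetical, chosen only for illustration; this is not the interface of the authors' software.

```python
def posterior_race(p_race_given_surname, p_geo_given_race):
    """Combine surname and location evidence with Bayes's rule.

    Assumes (for this sketch only) that surname and location are
    conditionally independent given race, so that
        P(race | surname, geo)  is proportional to
        P(race | surname) * P(geo | race).
    Both arguments are dicts keyed by the same category labels.
    """
    if set(p_race_given_surname) != set(p_geo_given_race):
        raise ValueError("both inputs must cover the same categories")

    # Unnormalized posterior weight for each category.
    weights = {
        race: p_race_given_surname[race] * p_geo_given_race[race]
        for race in p_race_given_surname
    }
    total = sum(weights.values())
    if total == 0:
        raise ValueError("all posterior weights are zero")

    # Normalize so the probabilities sum to one.
    return {race: w / total for race, w in weights.items()}


# Hypothetical inputs: a surname-list style P(race | surname) and a
# made-up P(geo | race) for one small geographic area.
p_race_given_surname = {"white": 0.60, "black": 0.30, "latino": 0.05, "other": 0.05}
p_geo_given_race = {"white": 0.001, "black": 0.004, "latino": 0.002, "other": 0.001}

posterior = posterior_race(p_race_given_surname, p_geo_given_race)
most_likely = max(posterior, key=posterior.get)
```

With these made-up inputs, the location evidence shifts the most likely category away from the one the surname alone would suggest. That is the kind of update the combination is meant to capture.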

Type
Letters
Copyright
Copyright © The Author 2016. Published by Oxford University Press on behalf of the Society for Political Methodology 

Footnotes

Authors' note: We thank Bruce Willsie, the CEO of L2, for providing the data and answering numerous questions, and the participants of the “Building the Evidence to Win Voting Rights Cases” conference at the American Constitutional Society for Law and Policy for their helpful comments. Two anonymous reviewers provided helpful suggestions. The R package, wru: Who Are You? Bayesian Prediction of Racial Category Using Surname and Geolocation, is freely available for download at https://cran.r-project.org/package=wru. Replication files for this study are available on the Political Analysis Dataverse at http://dx.doi.org/10.7910/DVN/SVY5VF. Supplementary materials for this article are available on the Political Analysis Web site.
