
Election Fraud: A Latent Class Framework for Digit-Based Tests

Published online by Cambridge University Press:  04 January 2017

Juraj Medzihorsky
Department of Political Science, Central European University, Nador u. 9., 1051 Budapest, Hungary


Digit-based election forensics (DBEF) typically relies on null hypothesis significance testing, with undesirable effects on substantive conclusions. This article proposes an alternative free of this problem. It rests on decomposing the observed numeral distribution into the “no fraud” and “fraud” latent classes, by finding the smallest fraction of numerals that needs to be either removed or reallocated to achieve a perfect fit of the “no fraud” model. The size of this fraction can be interpreted as a measure of fraudulence. Both alternatives are special cases of measures of model fit—the π∗ mixture index of fit and the Δ dissimilarity index, respectively. Furthermore, independently of the latent class framework, the distributional assumptions of DBEF can be relaxed in some contexts. Independently or jointly, the latent class framework and the relaxed distributional assumptions allow us to dissect the observed distributions using models more flexible than those of existing DBEF. Reanalysis of Beber and Scacco's (2012) data shows that the approach can lead to new substantive conclusions.
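The two measures named in the abstract can be illustrated with a minimal sketch. It assumes the simplest possible case: a fully specified "no fraud" distribution (here, uniform last digits) and made-up digit counts. Under that assumption, the dissimilarity index is half the sum of absolute differences between observed and model proportions (the smallest fraction to reallocate), and π* is one minus the smallest ratio of observed to model proportion (the smallest fraction to remove). The function names and the counts are hypothetical; this is not the interface of the pistar package, and it ignores sampling uncertainty and the more flexible models the article discusses.

```python
from typing import List, Sequence


def _proportions(counts: Sequence[float]) -> List[float]:
    """Convert raw counts into observed proportions."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must sum to a positive number")
    return [c / total for c in counts]


def dissimilarity_index(counts: Sequence[float], model: Sequence[float]) -> float:
    """Delta: smallest fraction of observations that must be reallocated
    across categories for the observed distribution to match `model`."""
    q = _proportions(counts)
    return 0.5 * sum(abs(qi - pi) for qi, pi in zip(q, model))


def pi_star(counts: Sequence[float], model: Sequence[float]) -> float:
    """pi*: smallest fraction of observations that must be removed so that
    the remainder matches `model` exactly. This closed form holds only for
    a fully specified model with strictly positive probabilities
    (an assumption of this sketch)."""
    if any(pi <= 0 for pi in model):
        raise ValueError("model probabilities must be strictly positive")
    q = _proportions(counts)
    return 1.0 - min(qi / pi for qi, pi in zip(q, model))


# Hypothetical counts of last digits 0..9 (illustrative, not from the article).
counts = [130, 95, 100, 90, 105, 110, 85, 95, 90, 100]
uniform = [0.1] * 10  # assumed "no fraud" model: equiprobable last digits

delta = dissimilarity_index(counts, uniform)
pistar_value = pi_star(counts, uniform)
print(f"Delta = {delta:.3f}, pi* = {pistar_value:.3f}")
```

In this toy example the reallocation measure is smaller than the removal measure, which is expected: moving a numeral from an over-represented digit to an under-represented one repairs two cells at once, whereas removal can only shrink cells.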

Copyright © The Author 2015. Published by Oxford University Press on behalf of the Society for Political Methodology 


Author's note: I am grateful to Tamás Rudas, Gábor Tóka, Levente Littvay, Zoltán Fazekas, Daniela Širinić, Pavol Hardos, two anonymous reviewers, and the editors for helpful comments and suggestions, and the members of the Political Behavior Research Group at CEU for helpful discussion. Replication materials are available online as Medzihorsky, Juraj, 2015, “Replication Data for: Election Fraud: A Latent Class Framework for Digit-Based Tests”, Harvard Dataverse, V2 (Medzihorsky 2015b), and include the version of the R package pistar (Medzihorsky 2015a) used in the analysis. The article uses data from Beber and Scacco (2012), which is also available online as Beber and Scacco (2011). Supplementary materials for this article are available on the Political Analysis Web site.


Agresti, A. 2002. Categorical data analysis, 2nd ed. Hoboken, N.J.: John Wiley & Sons.
Alvarez, R. M., Hall, T. E., and Hyde, S. D. 2009. Election Fraud: Detecting and Deterring Electoral Manipulation. Washington, D.C.: Brookings Institution Press.
Alvarez, R. M., Atkeson, L. R., and Hall, T. E. 2012. Evaluating elections: A handbook of methods and standards. Cambridge [England]; New York: Cambridge University Press.
Beber, B., and Scacco, A. 2011. Replication Data for: What the Numbers Say: A Digit-Based Test for Election Fraud. Harvard Dataverse, V2. (accessed April 26, 2014).
Beber, B., and Scacco, A. 2012. What the numbers say: A digit-based test for election fraud. Political Analysis 20(2): 211–34.
Benford, F. 1938. The law of anomalous numbers. Proceedings of the American Philosophical Society 78:551–72.
Breunig, C., and Goerres, A. 2011. Searching for electoral irregularities in an established democracy: Applying Benford's law tests to Bundestag elections in unified Germany. Electoral Studies 30(3): 534–45.
Buttorf, G. 2008. Detecting fraud in America's Gilded Age. Unpublished manuscript, University of Iowa.
Cantú, F., and Saiegh, S. M. 2011. Fraudulent democracy? An analysis of Argentina's infamous decade using supervised machine learning. Political Analysis 19(4): 409–33.
Clogg, C., Rudas, T., and Xi, L. 1995. A new index of structure for the analysis of models for mobility tables and other cross-classifications. Sociological Methodology 25:197–222.
Clogg, C. C., Rudas, T., and Matthews, S. 1997. Analysis of contingency tables using graphical displays based on the mixture index of fit. In Visualization of categorical data, eds. Blasius, J. and Greenacre, M., 425–39. San Diego: Academic Press.
Dayton, C. M. 2003. Applications and computational strategies for the two-point mixture index of fit. British Journal of Mathematical and Statistical Psychology 56(1): 1–13.
Deckert, J., Myagkov, M., and Ordeshook, P. C. 2011. Benford's law and the detection of election fraud. Political Analysis 19(3): 245–68.
Formann, A. K. 2000. Rater agreement and the generalized Rudas-Clogg-Lindsay index of fit. Statistics in Medicine 19(14): 1881–8.
Formann, A. K. 2003a. Latent class model diagnosis from a frequentist point of view. Biometrics 59(1): 189–96.
Formann, A. K. 2003b. Latent class model diagnostics—A review and some proposals. Computational Statistics & Data Analysis 41(3): 549–59.
Formann, A. K. 2006. Testing the Rasch model by means of the mixture fit index. British Journal of Mathematical and Statistical Psychology 59(1): 89–95.
Giles, D. E. 2007. Benford's law and naturally occurring prices in certain eBay auctions. Applied Economics Letters 14(3): 157–61.
Gini, C. 1914. Di una misura della dissomiglianza tra due gruppi di quantità e delle sue applicazioni allo studio delle relazione statistiche. Atti del Reale Instituto Veneto di Scienze, Lettere ed Arti (Series 8) 74:185–213.
Hernández, J. M., Rubio, V. J., Revuelta, J., and Santacreu, J. 2006. A procedure for estimating intrasubject behavior consistency. Educational and Psychological Measurement 66(3): 417–34.
Hill, T. P. 1995. A statistical derivation of the significant-digit law. Statistical Science 10(4): 354–63.
Ispány, M., and Verdes, E. 2014. On the robustness of mixture index of fit. Journal of Mathematical Sciences 200(4): 432–40.
Jiménez, R., and Hidalgo, M. 2014. Forensic analysis of Venezuelan elections during the Chávez presidency. PLoS One 9(6):e100884.
Judge, G., and Schechter, L. 2009. Detecting problems in survey data using Benford's law. Journal of Human Resources 44(1): 1–24.
Leemann, L., and Bochsler, D. 2014. A systematic approach to study electoral fraud. Electoral Studies 35:33–47.
Leemis, L. M., Schmeiser, B. W., and Evans, D. L. 2000. Survival distributions satisfying Benford's law. American Statistician 54(4): 236–41.
Mebane, W. R. 2006a. Election forensics: The second-digit Benford's law test and recent American presidential elections. Prepared for delivery at the Election Fraud Conference, September 29–30, Salt Lake City, Utah.
Mebane, W. R. 2006b. Election forensics: Vote counts and Benford's law. Summer Meeting of the Political Methodology Society, UC-Davis, July.
Mebane, W. R. 2007. Election forensics: Statistical interventions in election controversies. Prepared for presentation at the 2007 Annual Meeting of the American Political Science Association, Chicago, Aug 30-Sep 2.
Mebane, W. R. 2008. Election forensics: Outlier and digit tests in America and Russia. American Electoral Process Conference, Center for the Study of Democratic Politics, Princeton University.
Mebane, W. R. 2010a. Election fraud or strategic voting? Can second-digit tests tell the difference? Summer Meeting of the Political Methodology Society, University of Iowa.
Mebane, W. R. 2010b. Fraud in the 2009 presidential election in Iran? Chance 23(1): 6–15.
Mebane, W. R. 2011. Comment on “Benford's law and the detection of election fraud.” Political Analysis 19(3): 269–72.
Mebane, W. R., and Kalinin, K. 2009. Comparative election fraud detection. Prepared for presentation at the 2009 Annual Meeting of the American Political Science Association, Toronto, Canada, Sept 3–6.
Medzihorsky, J. 2015a. pistar: Rudas, Clogg and Lindsay mixture index of fit. R package version.
Medzihorsky, J. 2015b. Replication Data for: Election Fraud: A Latent Class Framework for Digit-Based Tests. Harvard Dataverse, V2 [UNF:6:FIWHvsHNzZgPStT0+kgbsQ==].
Newcomb, S. 1881. Note on the frequency of use of the different digits in natural numbers. American Journal of Mathematics 4(1): 39–40.
Nickerson, R. S. 2002. The production and perception of randomness. Psychological Review 109(2): 330.
Norris, P., Frank, R. W., and Coma, F. M. I. 2014. Advancing electoral integrity. Oxford: Oxford University Press.
Pericchi, L., and Torres, D. 2011. Quick anomaly detection by the Newcomb-Benford law, with applications to electoral processes data from the USA, Puerto Rico, and Venezuela. Statistical Science 26(4): 502–16.
Revuelta, J. 2008. Estimating the π* goodness of fit index for finite mixtures of item response models. British Journal of Mathematical and Statistical Psychology 61(1): 93–113.
Rudas, T. 1998. The mixture index of fit. In Advances in methodology, data analysis, and statistics, ed. Ferligoj, A., 15–22. Ljubljana: FDV.
Rudas, T. 1999. The mixture index of fit and minimax regression. Metrika 50(2): 163–72.
Rudas, T. 2002. A latent class approach to measuring the fit of a statistical model. In Applied latent class analysis, eds. Hagenaars, J. A. and McCutcheon, A. L., 345–65. Cambridge: Cambridge University Press.
Rudas, T. 2005. Mixture models of missing data. Quality & Quantity 39(1): 19–36.
Rudas, T., Clogg, C., and Lindsay, B. 1994. A new index of fit based on mixture methods for the analysis of contingency tables. Journal of the Royal Statistical Society. Series B (Methodological) 56(4): 623–39.
Rudas, T., and Verdes, E. 2015. Model-based analysis of incomplete data using the mixture index of fit. In Advances in latent class analysis: A Festschrift in Honor of C. Mitchell Dayton, eds. Hancock, G. R. and Macready, G. B. Charlotte, NC: Information Age Publishing.
Rudas, T., and Zwick, R. 1997. Estimating the importance of differential item functioning. Journal of Educational and Behavioral Statistics 22(1): 31–45.
Tam Cho, W. K., and Gaines, B. J. 2007. Breaking the (Benford) law: Statistical fraud detection in campaign finance. American Statistician 61(3): 218–23.
Verdes, E., and Rudas, T. 2003. The π* index as a new alternative for assessing goodness of fit of logistic regression. In Foundations of Statistical Inference: Proceedings of the Shoresh Conference 2000, eds. Haitovsky, Y. and Ritov, Y., 167–77. Berlin and Heidelberg: Springer.
Ziliak, S. T., and McCloskey, D. N. 2008. The cult of statistical significance: How the standard error costs us jobs, justice, and lives. Ann Arbor: University of Michigan Press.
