
Relating Latent Class Assignments to External Variables: Standard Errors for Correct Inference

Published online by Cambridge University Press:  04 January 2017

Zsuzsa Bakk*
Department of Methodology and Statistics, Tilburg University, Room P1113, PO Box 90153, 5000 LE Tilburg, The Netherlands
Daniel L. Oberski
Department of Methodology and Statistics, Tilburg University, The Netherlands
Jeroen K. Vermunt
Department of Methodology and Statistics, Tilburg University, The Netherlands
e-mail: (corresponding author)


Latent class analysis is used in the political science literature both in substantive applications and as a tool to estimate measurement error. Many studies in the social and political sciences relate estimated class assignments from a latent class model to external variables. Although common, such a “three-step” procedure effectively ignores classification error in the class assignments; Vermunt (2010, “Latent class modeling with covariates: Two improved three-step approaches,” Political Analysis 18:450–69) showed that this leads to inconsistent parameter estimates and proposed a correction. Although this correction for bias is now implemented in standard software, inconsistency is not the only consequence of classification error. We demonstrate that the correction method introduces an additional source of variance in the estimates, so that standard errors and confidence intervals are overly optimistic when this is not taken into account. We derive the asymptotic variance of the third-step estimates of interest, as well as several candidate corrected sample estimators of the standard errors. These corrected standard error estimators are evaluated using a Monte Carlo study, and we provide practical advice to researchers as to which should be used so that valid inferences can be obtained when relating estimated class membership to external variables.
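The bias that motivates the correction can be illustrated with a small numerical sketch. It is not the authors' estimator: it assumes a two-class latent variable X, a binary external variable Z, and a classification error matrix D that is treated as known, and it applies a simple matrix-inverse adjustment in the spirit of Bolck, Croon, and Hagenaars (2004). All probabilities and names below are hypothetical values chosen for illustration.

```python
import numpy as np

# Hypothetical true probabilities P(X = t | Z = z).
# Rows: levels of the external variable Z; columns: latent classes.
p_x_given_z = np.array([[0.8, 0.2],
                        [0.3, 0.7]])

# Hypothetical classification error matrix D[t, s] = P(W = s | X = t),
# where W is the class assigned in step two. Assumed known here.
D = np.array([[0.9, 0.1],
              [0.2, 0.8]])


def log_odds_ratio(table):
    """Log odds ratio of a 2x2 table of conditional probabilities."""
    return np.log(table[0, 0] * table[1, 1] / (table[0, 1] * table[1, 0]))


# A naive step-three analysis sees assigned classes, not true classes:
# P(W | Z) = P(X | Z) D.
p_w_given_z = p_x_given_z @ D

# Undo the misclassification by multiplying with the inverse of D.
corrected_table = p_w_given_z @ np.linalg.inv(D)

truth = log_odds_ratio(p_x_given_z)
naive = log_odds_ratio(p_w_given_z)
corrected = log_odds_ratio(corrected_table)

print("true log odds ratio:     ", truth)
print("naive log odds ratio:    ", naive)
print("corrected log odds ratio:", corrected)
```

In this sketch the naive association is weaker than the true one, and the adjustment recovers it exactly only because D is taken as known. In practice D is computed from the estimated first-step model, and that estimation is a source of uncertainty in the third-step estimates that this example does not capture.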

Research Article
Copyright © The Author 2014. Published by Oxford University Press on behalf of the Society for Political Methodology 



Author's note: Thanks are due to the anonymous reviewers and the editor, whose helpful comments improved the article considerably.


Ahlquist, John S., and Breunig, Christian. 2012. Model-based clustering and typologies in the social sciences. Political Analysis 20: 92–112.
Alwin, Duane F. 2007. Margins of error: A study of reliability in survey measurement. New York: Wiley.
Asparouhov, Tihomir, and Muthén, Bengt. 2013. Auxiliary variables in mixture modeling: A 3-step approach using Mplus. Mplus Web Notes 15: 1–48.
Bakk, Zsuzsa, Tekle, Fetene T., and Vermunt, Jeroen K. 2013. Estimating the association between latent class membership and external variables using bias-adjusted three-step approaches. Sociological Methodology 43: 272–311.
Bandeen-Roche, Karen, Miglioretti, Diana L., Zeger, Scott L., and Rathouz, Paul J. 1997. Latent variable regression for multiple discrete outcomes. Journal of the American Statistical Association 92: 1375–86.
Beissinger, Mark R. 2013. The semblance of democratic revolution: Coalitions in Ukraine's orange revolution. American Political Science Review 107: 574–92.
Blackwell, Matthew, Honaker, James, and King, Gary. 2012. Multiple overimputation: A unified approach to measurement error in missing data. (accessed November 15, 2013).
Blaydes, Lisa, and Linzer, Drew A. 2008. The political economy of women's support for fundamentalist Islam. World Politics 60: 579–609.
Bolck, Annabelle, Croon, Marcel, and Hagenaars, Jacques A. 2004. Estimating latent structure models with categorical variables: One-step versus three-step estimators. Political Analysis 12: 3–27.
Breen, Richard. 2000. Why is support for extreme parties underestimated by surveys? A latent class analysis. British Journal of Political Science 30: 375–82.
Carroll, Raymond J., Ruppert, David, Stefanski, Leonard A., and Crainiceanu, Ciprian. 2006. Measurement error in nonlinear models: A modern perspective. 2nd ed. Boca Raton, FL: Chapman and Hall/CRC.
Chan, Tak Wing, and Goldthorpe, John H. 2007. Social stratification and cultural consumption: Music in England. European Sociological Review 23: 1–19.
Clark, Renee M., and Besterfield-Sacre, Mary E. 2009. A new approach to hazardous materials transportation risk analysis: Decision modeling to identify critical variables. Risk Analysis 29: 344–54.
Collins, Linda M., and Lanza, Stephanie T. 2010. Latent class and latent transition analysis: With applications in the social, behavioral, and health sciences. New York: Wiley.
Dayton, C. Mitchell, and Macready, George B. 1988. Concomitant-variable latent class models. Journal of the American Statistical Association 83: 173–78.
Dias, Jose G., and Vermunt, Jeroen K. 2008. A bootstrap-based aggregate classifier for model-based clustering. Computational Statistics 23: 643–59.
Feick, Lawrence F. 1989. Latent class analysis of survey questions that include don't know responses. Public Opinion Quarterly 53: 525–47.
Feingold, Alan, Tiberio, Stacey S., and Capaldi, Deborah M. 2013. New approaches for examining associations with latent categorical variables: Applications to substance abuse and aggression. Psychology of Addictive Behaviors. (accessed November 15, 2013).
Fornell, Claes, and Larcker, David. 1981. Evaluating structural equation models with unobservable variables and measurement error. Journal of Marketing Research 18: 39–50.
Fuller, Wayne A. 1987. Measurement error models. New York: Wiley.
Glasgow, G., Golder, M., and Golder, S. N. 2012. New empirical strategies for the study of parliamentary government formation. Political Analysis 20: 248–70.
Gong, Gail, and Samaniego, Francisco J. 1981. Pseudo maximum likelihood estimation: Theory and applications. Annals of Statistics 9: 861–69.
Goodman, Leo A. 1974. The analysis of systems of qualitative variables when some of the variables are unobservable. Part I: A modified latent structure approach. American Journal of Sociology 79: 1179–259.
Grimmer, Justin. 2013. Appropriators not position takers: The distorting effects of electoral incentives on congressional representation. American Journal of Political Science 57: 624–42.
Grimmer, Justin, and Stewart, Brandon M. 2013. Text as data: The promise and pitfalls of automatic content analysis methods for political texts. Political Analysis 21: 267–97.
Hagenaars, Jacques A. 1990. Categorical longitudinal data: Loglinear analysis of panel, trend and cohort data. Newbury Park, CA: Sage.
Hagenaars, Jacques A. 1993. Loglinear models with latent variables. Newbury Park, CA: Sage.
Hill, Jennifer L., and Kriesi, Hanspeter. 2001. Classification by opinion-changing behavior: A mixture model approach. Political Analysis 9: 301–24.
Kauermann, Göran, and Carroll, Raymond J. 2001. A note on the efficiency of sandwich covariance matrix estimation. Journal of the American Statistical Association 96: 1387–96.
King, Gary, and Roberts, Margaret. 2012. How robust standard errors expose methodological problems they do not fix. (accessed November 15, 2013).
König, Thomas, Marbach, Moritz, and Osnabrügge, Moritz. 2013. Estimating party positions across countries and time: A dynamic latent variable model for manifesto data. Political Analysis 21: 468–91.
Lanza, Stephanie T., Tan, Xianmin, and Bray, Bethany C. 2013. Latent class analysis with distal outcomes: A flexible model-based approach. Structural Equation Modeling 20: 1–26.
Linzer, Drew A. 2011. Reliable inference in highly stratified contingency tables: Using latent class models as density estimators. Political Analysis 19: 173–87.
Loken, Eric. 2004. Using latent class analysis to model temperament types. Multivariate Behavioral Research 39: 625–52.
Marsh, Herbert W., Lüdtke, Oliver, Trautwein, Ulrich, and Morin, Alexandre J. S. 2009. Classical latent profile analysis of academic self-concept dimensions: Synergy of person- and variable-centered approaches to theoretical models of self-concept. Structural Equation Modeling 16: 191–225.
McCutcheon, Allan L. 1985. A latent class analysis of tolerance for nonconformity in the American public. Public Opinion Quarterly 49: 474–88.
McCutcheon, Allan L. 1987. Latent class analysis. Newbury Park, CA: Sage.
Mislevy, R. J. 1988. Randomization-based inferences about latent variables from complex samples. Technical Report. Educational Testing Service.
Murphy, Kevin M., and Topel, Robert H. 1985. Estimation and inference in two-step econometric models. Journal of Business and Economic Statistics 3: 88–97.
Mustillo, Thomas J. 2009. Modeling new party performance: A conceptual and methodological approach for volatile party systems. Political Analysis 17: 311–32.
Muthén, L. K., and Muthén, B. O. 1998–2012. Mplus User's Guide. 7th ed. Los Angeles: Muthén & Muthén.
Nylund, Karen L., Asparouhov, Tihomir, and Muthén, Bengt. 2007. Deciding on the number of classes in latent class analysis and growth mixture modeling: A Monte Carlo simulation study. Structural Equation Modeling 14: 535–69.
Oberski, Daniel, and Vermunt, Jeroen K. 2013. A model-based approach to goodness-of-fit evaluation in item response theory. Measurement: Interdisciplinary Research & Perspectives 11: 117–22.
Oberski, Daniel L., and Satorra, Albert. 2013. Measurement error models with uncertainty about the error variance. Structural Equation Modeling 20: 409–28.
Oehlert, Gary W. 1992. A note on the delta method. American Statistician 46: 27–29.
Olino, Thomas M., Klein, Daniel N., Lewinsohn, Peter M., Rohde, Paul, and Seeley, John R. 2010. Latent trajectory classes of depressive and anxiety disorders from adolescence to adulthood: Descriptions of classes and associations with risk factors. Comprehensive Psychiatry 51: 224–35.
Parke, William R. 1986. Pseudo maximum likelihood estimation: The asymptotic distribution. Annals of Statistics 14: 355–57.
R Core Team. 2013. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing. ISBN 3-900051-07-0. (accessed December 2, 2013).
Rabe-Hesketh, Sophia, Skrondal, Anders, and Pickles, Andrew. 2003. Maximum likelihood estimation of generalized linear models with covariate measurement error. Stata Journal 3: 386–411.
Ristei Gugiu, M., and Centellas, M. 2013. The Democracy Cluster Classification Index. Political Analysis 21: 334–49.
Roeder, Kathryn, Lynch, Kevin G., and Nagin, Daniel S. 1999. Modeling uncertainty in latent class membership: A case study in criminology. Journal of the American Statistical Association 94: 766–76.
Rubin, Donald B. 1987. Multiple imputation for nonresponse in survey research. New York: Wiley.
Schafer, J. L. 1997. Analysis of incomplete multivariate data. Boca Raton, FL: Chapman & Hall/CRC.
Sclove, Stanley L. 1987. Application of model-selection criteria to some problems in multivariate analysis. Psychometrika 52: 333–43.
Skinner, C. J., Holt, D., and Smith, T. M. F. 1989. Analysis of complex surveys. New York: Wiley.
Skrondal, Anders, and Kuha, Jouni. 2012. Improved regression calibration. Psychometrika 77: 649–69.
Sniderman, Paul M., Tetlock, Philip E., Glaser, James M., Green, Donald Philip, and Hout, Michael. 1989. Principled tolerance and the American mass public. British Journal of Political Science 19: 25.
Stouffer, Samuel Andrew. 1955. Communism, conformity, and civil liberties: A cross-section of the nation speaks its mind. New Jersey: Transaction Books.
Tanner, Martin A., and Wong, Wing Hung. 1987. The calculation of posterior distributions by data augmentation. Journal of the American Statistical Association 82: 528–40.
Treier, Shawn, and Jackman, Simon. 2008. Democracy as a latent variable. American Journal of Political Science 52: 201–17.
Van der Heijden, Peter, 't Hart, Harm, and Dessens, Jos. 1997. A parametric bootstrap procedure to perform statistical tests in a LCA of anti-social behaviour. In Applications of latent trait and latent class models in the social sciences, eds. Rost, J. and Langeheine, R., 196–208. New York: Waxmann.
Vermunt, Jeroen K. 2010. Latent class modeling with covariates: Two improved three-step approaches. Political Analysis 18: 450–69.
Vermunt, Jeroen K., and Magidson, Jay. 2013. Technical guide for Latent GOLD 5.0: Basic and advanced. Belmont, MA: Statistical Innovations Inc.
Wang, Chen-Pin, Brown, Hendriks C., and Bandeen-Roche, Karen. 2005. Residual diagnostics for growth mixture models: Examining the impact of a preventive intervention on multiple trajectories of aggressive behavior. Journal of the American Statistical Association 100: 1054–10.
Wedel, Michel, Hofstede, Frenkel Ter, and Steenkamp, Jan-Benedict E. M. 1998. Mixture model analysis of complex samples. Journal of Classification 15: 225–44.
White, Halbert. 1982. Maximum likelihood estimation of misspecified models. Econometrica: Journal of the Econometric Society 50: 1–25.