Relating Latent Class Assignments to External Variables: Standard Errors for Correct Inference

  • Zsuzsa Bakk, Daniel L. Oberski and Jeroen K. Vermunt

Latent class analysis is used in the political science literature in both substantive applications and as a tool to estimate measurement error. Many studies in the social and political sciences relate estimated class assignments from a latent class model to external variables. Although common, such a “three-step” procedure effectively ignores classification error in the class assignments; Vermunt (2010, “Latent class modeling with covariates: Two improved three-step approaches,” Political Analysis 18:450–69) showed that this leads to inconsistent parameter estimates and proposed a correction. Although this correction for bias is now implemented in standard software, inconsistency is not the only consequence of classification error. We demonstrate that the correction method introduces an additional source of variance in the estimates, so that standard errors and confidence intervals are overly optimistic when not taking this into account. We derive the asymptotic variance of the third-step estimates of interest, as well as several candidate-corrected sample estimators of the standard errors. These corrected standard error estimators are evaluated using a Monte Carlo study, and we provide practical advice to researchers as to which should be used so that valid inferences can be obtained when relating estimated class membership to external variables.
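The bias described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' estimator or their variance derivation. It assumes a two-class model with four binary indicators whose parameters are treated as known, so the first-step estimation uncertainty that the article is concerned with is deliberately absent. It compares the naive class means of an external variable, computed from modal assignments, with a simplified inverse-classification-error correction in the spirit of the three-step corrections cited in the abstract. All names and parameter values (`simulate`, `pi`, `p`, `mu`) are hypothetical.

```python
import numpy as np


def simulate(n=20000, seed=0):
    """Return (true, naive, corrected) class means of an external variable z."""
    rng = np.random.default_rng(seed)
    pi = np.array([0.5, 0.5])              # class proportions (assumed known)
    p = np.array([[0.8, 0.8, 0.8, 0.8],    # P(item = 1 | class 0), assumed
                  [0.3, 0.3, 0.3, 0.3]])   # P(item = 1 | class 1), assumed
    mu = np.array([0.0, 1.0])              # true class means of z

    x = rng.choice(2, size=n, p=pi)                 # true (latent) class
    y = (rng.random((n, 4)) < p[x]).astype(int)     # observed indicators
    z = rng.normal(mu[x], 1.0)                      # external variable

    # Step 2: posterior class membership and modal assignment.
    loglik = y @ np.log(p).T + (1 - y) @ np.log(1 - p).T   # shape (n, 2)
    post = pi * np.exp(loglik)
    post /= post.sum(axis=1, keepdims=True)
    w = post.argmax(axis=1)
    onehot = np.eye(2)[w]

    # Naive step 3: treat the assigned class as if it were the true class.
    naive = np.array([z[w == s].mean() for s in range(2)])

    # Classification error matrix D[t, s] = P(assigned s | true t),
    # estimated from the posteriors.
    D = (post.T @ onehot) / post.sum(axis=0)[:, None]

    # Since E[z * 1(w = s)] = sum_t pi_t * D[t, s] * mu_t, solving the
    # linear system recovers pi_t * mu_t, and dividing by pi_t gives mu_t.
    a = (z[:, None] * onehot).mean(axis=0)
    corrected = np.linalg.solve(D.T, a) / post.mean(axis=0)
    return mu, naive, corrected


if __name__ == "__main__":
    mu, naive, corrected = simulate()
    print("true difference:     ", mu[1] - mu[0])
    print("naive difference:    ", naive[1] - naive[0])
    print("corrected difference:", corrected[1] - corrected[0])
```

Under these assumptions the naive difference between class means is pulled toward zero, while the corrected difference lies close to the true value. Because the measurement model is fixed here, the sketch shows only the bias and its correction; it does not show the additional variance from estimating the first-step parameters, which is the subject of the article.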


Author's note: Thanks are due to the anonymous reviewers and the editor, whose helpful comments improved the article considerably.

Ahlquist, John S., and Breunig, Christian. 2012. Model-based clustering and typologies in the social sciences. Political Analysis 20: 92–112.
Alwin, Duane F. 2007. Margins of error: A study of reliability in survey measurement. New York: Wiley.
Asparouhov, Tihomir, and Muthén, Bengt. 2013. Auxiliary variables in mixture modeling: A 3-step approach using Mplus. Mplus Web Notes 15: 1–48.
Bakk, Zsuzsa, Tekle, Fetene T., and Vermunt, Jeroen K. 2013. Estimating the association between latent class membership and external variables using bias-adjusted three-step approaches. Sociological Methodology 43: 272–311.
Bandeen-Roche, Karen, Miglioretti, Diana L., Zeger, Scott L., and Rathouz, Paul J. 1997. Latent variable regression for multiple discrete outcomes. Journal of the American Statistical Association 92: 1375–86.
Beissinger, Mark R. 2013. The semblance of democratic revolution: Coalitions in Ukraine's orange revolution. American Political Science Review 107: 574–92.
Blackwell, Matthew, Honaker, James, and King, Gary. 2012. Multiple overimputation: A unified approach to measurement error in missing data. (accessed November 15, 2013).
Blaydes, Lisa, and Linzer, Drew A. 2008. The political economy of women's support for fundamentalist Islam. World Politics 60: 579–609.
Bolck, Annabelle, Croon, Marcel, and Hagenaars, Jacques A. 2004. Estimating latent structure models with categorical variables: One-step versus three-step estimators. Political Analysis 12: 3–27.
Breen, Richard. 2000. Why is support for extreme parties underestimated by surveys? A latent class analysis. British Journal of Political Science 30: 375–82.
Carroll, Raymond J., Ruppert, David, Stefanski, Leonard A., and Crainiceanu, Ciprian. 2006. Measurement error in nonlinear models: A modern perspective. 2nd ed. Boca Raton, FL: Chapman and Hall/CRC.
Chan, Tak Wing, and Goldthorpe, John H. 2007. Social stratification and cultural consumption: Music in England. European Sociological Review 23: 1–19.
Clark, Renee M., and Besterfield-Sacre, Mary E. 2009. A new approach to hazardous materials transportation risk analysis: Decision modeling to identify critical variables. Risk Analysis 29: 344–54.
Collins, Linda M., and Lanza, Stephanie T. 2010. Latent class and latent transition analysis: With applications in the social, behavioral, and health sciences. New York: Wiley.
Dayton, C. Mitchell, and Macready, George B. 1988. Concomitant-variable latent class models. Journal of the American Statistical Association 83: 173–78.
Dias, Jose G., and Vermunt, Jeroen K. 2008. A bootstrap-based aggregate classifier for model-based clustering. Computational Statistics 23: 643–59.
Feick, Lawrence F. 1989. Latent class analysis of survey questions that include don't know responses. Public Opinion Quarterly 53: 525–47.
Feingold, Alan, Tiberio, Stacey S., and Capaldi, Deborah M. 2013. New approaches for examining associations with latent categorical variables: Applications to substance abuse and aggression. Psychology of Addictive Behaviors. (accessed November 15, 2013).
Fornell, Claes, and Larcker, David. 1981. Evaluating structural equation models with unobservable variables and measurement error. Journal of Marketing Research 18: 39–50.
Fuller, Wayne A. 1987. Measurement error models. New York: Wiley.
Glasgow, G., Golder, M., and Golder, S. N. 2012. New empirical strategies for the study of parliamentary government formation. Political Analysis 20: 248–70.
Gong, Gail, and Samaniego, Francisco J. 1981. Pseudo maximum likelihood estimation: Theory and applications. Annals of Statistics 9: 861–69.
Goodman, Leo A. 1974. The analysis of systems of qualitative variables when some of the variables are unobservable. Part I: A modified latent structure approach. American Journal of Sociology 79: 1179–259.
Grimmer, Justin. 2013. Appropriators not position takers: The distorting effects of electoral incentives on congressional representation. American Journal of Political Science 57: 624–42.
Grimmer, Justin, and Stewart, Brandon M. 2013. Text as data: The promise and pitfalls of automatic content analysis methods for political texts. Political Analysis 21: 267–97.
Hagenaars, Jacques A. 1990. Categorical longitudinal data—Loglinear analysis of panel, trend and cohort data. Newbury Park, CA: Sage.
Hagenaars, Jacques A. 1993. Loglinear models with latent variables. Newbury Park, CA: Sage.
Hill, Jennifer L., and Kriesi, Hanspeter. 2001. Classification by opinion-changing behavior: A mixture model approach. Political Analysis 9: 301–24.
Kauermann, Göran, and Carroll, Raymond J. 2001. A note on the efficiency of sandwich covariance matrix estimation. Journal of the American Statistical Association 96: 1387–96.
King, Gary, and Roberts, Margaret. 2012. How robust standard errors expose methodological problems they do not fix. (accessed November 15, 2013).
König, Thomas, Marbach, Moritz, and Osnabrügge, Moritz. 2013. Estimating party positions across countries and time: A dynamic latent variable model for manifesto data. Political Analysis 21: 468–91.
Lanza, Stephanie T., Tan, Xianmin, and Bray, Bethany C. 2013. Latent class analysis with distal outcomes: A flexible model-based approach. Structural Equation Modeling 20: 1–26.
Linzer, Drew A. 2011. Reliable inference in highly stratified contingency tables: Using latent class models as density estimators. Political Analysis 19: 173–87.
Loken, Eric. 2004. Using latent class analysis to model temperament types. Multivariate Behavioral Research 39: 625–52.
Marsh, Herbert W., Lüdtke, Oliver, Trautwein, Ulrich, and Morin, Alexandre J. S. 2009. Classical latent profile analysis of academic self-concept dimensions: Synergy of person- and variable-centered approaches to theoretical models of self-concept. Structural Equation Modeling 16: 191–225.
McCutcheon, Allan L. 1985. A latent class analysis of tolerance for nonconformity in the American public. Public Opinion Quarterly 49: 474–88.
McCutcheon, Allan L. 1987. Latent class analysis. Newbury Park, CA: Sage.
Mislevy, R. J. 1988. Randomization-based inferences about latent variables from complex samples. Technical Report. Educational Testing Service.
Murphy, Kevin M., and Topel, Robert H. 1985. Estimation and inference in two-step econometric models. Journal of Business and Economic Statistics 3: 88–97.
Mustillo, Thomas J. 2009. Modeling new party performance: A conceptual and methodological approach for volatile party systems. Political Analysis 17: 311–32.
Muthén, L. K., and Muthén, B. O. 1998–2012. Mplus User's Guide. 7th ed. Los Angeles: Muthén & Muthén.
Nylund, Karen L., Asparouhov, Tihomir, and Muthén, Bengt. 2007. Deciding on the number of classes in latent class analysis and growth mixture modeling: A Monte Carlo simulation study. Structural Equation Modeling 14: 535–69.
Oberski, Daniel, and Vermunt, Jeroen K. 2013. A model-based approach to goodness-of-fit evaluation in item response theory. Measurement: Interdisciplinary Research & Perspectives 11: 117–22.
Oberski, Daniel L., and Satorra, Albert. 2013. Measurement error models with uncertainty about the error variance. Structural Equation Modeling 20: 409–28.
Oehlert, Gary W. 1992. A note on the delta method. American Statistician 46: 27–29.
Olino, Thomas M., Klein, Daniel N., Lewinsohn, Peter M., Rohde, Paul, and Seeley, John R. 2010. Latent trajectory classes of depressive and anxiety disorders from adolescence to adulthood: Descriptions of classes and associations with risk factors. Comprehensive Psychiatry 51: 224–35.
Parke, William R. 1986. Pseudo maximum likelihood estimation: The asymptotic distribution. Annals of Statistics 14: 355–57.
R Core Team. 2013. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing. ISBN 3-900051-07-0. (accessed December 2, 2013).
Rabe-Hesketh, Sophia, Skrondal, Anders, and Pickles, Andrew. 2003. Maximum likelihood estimation of generalized linear models with covariate measurement error. Stata Journal 3: 386–411.
Ristei Gugiu, M., and Centellas, M. 2013. The Democracy Cluster Classification Index. Political Analysis 21: 334–49.
Roeder, Kathryn, Lynch, Kevin G., and Nagin, Daniel S. 1999. Modeling uncertainty in latent class membership: A case study in criminology. Journal of the American Statistical Association 94: 766–76.
Rubin, Donald B. 1987. Multiple imputation for nonresponse in survey research. New York: Wiley.
Schafer, J. L. 1997. Analysis of incomplete multivariate data. Boca Raton, FL: Chapman & Hall/CRC.
Sclove, Stanley L. 1987. Application of model-selection criteria to some problems in multivariate analysis. Psychometrika 52: 333–43.
Skinner, C. J., Holt, D., and Smith, T. M. F. 1989. Analysis of complex surveys. New York: Wiley.
Skrondal, Anders, and Kuha, Jouni. 2012. Improved regression calibration. Psychometrika 77: 649–69.
Sniderman, Paul M., Tetlock, Philip E., Glaser, James M., Green, Donald Philip, and Hout, Michael. 1989. Principled tolerance and the American mass public. British Journal of Political Science 19: 25.
Stouffer, Samuel Andrew. 1955. Communism, conformity, and civil liberties: A cross-section of the nation speaks its mind. New Jersey: Transaction Books.
Tanner, Martin A., and Wong, Wing Hung. 1987. The calculation of posterior distributions by data augmentation. Journal of the American Statistical Association 82: 528–40.
Treier, Shawn, and Jackman, Simon. 2008. Democracy as a latent variable. American Journal of Political Science 52: 201–17.
Van der Heijden, Peter, 't Hart, Harm, and Dessens, Jos. 1997. A parametric bootstrap procedure to perform statistical tests in a LCA of anti-social behaviour. In Applications of latent trait and latent class models in the social sciences, eds. Rost, J. and Langeheine, R., 196–208. New York: Waxmann.
Vermunt, Jeroen K. 2010. Latent class modeling with covariates: Two improved three-step approaches. Political Analysis 18: 450–69.
Vermunt, Jeroen K., and Magidson, Jay. 2013. Technical guide for Latent GOLD 5.0: Basic and advanced. Belmont, MA: Statistical Innovations Inc.
Wang, Chen-Pin, Brown, Hendriks C., and Bandeen-Roche, Karen. 2005. Residual diagnostics for growth mixture models: Examining the impact of a preventive intervention on multiple trajectories of aggressive behavior. Journal of the American Statistical Association 100: 1054–10.
Wedel, Michel, Hofstede, Frenkel Ter, and Steenkamp, Jan-Benedict E. M. 1998. Mixture model analysis of complex samples. Journal of Classification 15: 225–44.
White, Halbert. 1982. Maximum likelihood estimation of misspecified models. Econometrica: Journal of the Econometric Society 50: 1–25.
Political Analysis
  • ISSN: 1047-1987
  • EISSN: 1476-4989
  • URL: /core/journals/political-analysis