Latent Class Modeling with Covariates: Two Improved Three-Step Approaches

  • Jeroen K. Vermunt


Researchers using latent class (LC) analysis often proceed using the following three steps: (1) an LC model is built for a set of response variables, (2) subjects are assigned to LCs based on their posterior class membership probabilities, and (3) the association between the assigned class membership and external variables is investigated using simple cross-tabulations or multinomial logistic regression analysis. Bolck, Croon, and Hagenaars (2004) demonstrated that such a three-step approach underestimates the associations between covariates and class membership. They proposed resolving this problem by means of a specific correction method that involves modifying the third step. In this article, I extend the correction method of Bolck, Croon, and Hagenaars by showing that it involves maximizing a weighted log-likelihood function for clustered data. This conceptualization makes it possible to apply the method not only with categorical but also with continuous explanatory variables, to obtain correct tests using complex sampling variance estimation methods, and to implement it in standard software for logistic regression analysis. In addition, a new maximum likelihood (ML)-based correction method is proposed, which is more direct in the sense that it does not require analyzing weighted data. This new three-step ML method can be easily implemented in software for LC analysis. The reported simulation study shows that both correction methods perform very well in the sense that their parameter estimates and standard errors (SEs) can be trusted, except for situations with very poorly separated classes. The main advantage of the ML method compared with the Bolck, Croon, and Hagenaars approach is that it is much more efficient and almost as efficient as one-step ML estimation.
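The weighting idea behind the Bolck, Croon, and Hagenaars correction can be illustrated with a minimal sketch. The abstract does not spell out the formulas, so the following rests on an assumption: the weights are obtained by inverting the estimated classification-error matrix D, where D[t][s] is the probability of being assigned to class s given true class t. All function names and the posterior probabilities below are hypothetical and chosen purely for illustration; the resulting weights would then be passed to a weighted multinomial logistic regression in step 3, which is not shown.

```python
def modal_assignment(post):
    """One-hot modal class assignment for each row of posterior probabilities."""
    out = []
    for row in post:
        best = row.index(max(row))
        out.append([1.0 if t == best else 0.0 for t in range(len(row))])
    return out


def classification_error_matrix(post, assigned):
    """D[t][s]: estimated P(assigned class = s | true class = t)."""
    n_classes = len(post[0])
    class_totals = [sum(row[t] for row in post) for t in range(n_classes)]
    return [
        [sum(p[t] * w[s] for p, w in zip(post, assigned)) / class_totals[t]
         for s in range(n_classes)]
        for t in range(n_classes)
    ]


def invert(matrix):
    """Gauss-Jordan inverse with partial pivoting (intended for small matrices)."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("classification-error matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * c for v, c in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def bch_weights(post):
    """Assumed BCH-style weights: each subject's one-hot assignment times D^{-1}."""
    assigned = modal_assignment(post)
    d_inv = invert(classification_error_matrix(post, assigned))
    n_classes = len(d_inv)
    return [
        [sum(w[s] * d_inv[s][t] for s in range(n_classes)) for t in range(n_classes)]
        for w in assigned
    ]


# Hypothetical step-1 posteriors for five subjects and two classes.
posteriors = [[0.9, 0.1], [0.8, 0.2], [0.3, 0.7], [0.2, 0.8], [0.6, 0.4]]
weights = bch_weights(posteriors)
for row in weights:
    print([round(v, 3) for v in row])
```

Under this construction each subject contributes one weighted record per class, and some weights are negative. Each subject's weights sum to one, and the weighted class totals reproduce the class sizes implied by the posteriors, which is the sense in which the misclassification from step 2 is undone on average.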



Bandeen-Roche, Karen, Miglioretti, Diana L., Zeger, Scott L., and Rathouz, Paul J. 1997. Latent variable regression for multiple discrete outcomes. Journal of the American Statistical Association 92: 1375–86.
Blaydes, Lisa, and Linzer, Drew A. 2008. The political economy of women's support for fundamentalist Islam. World Politics 60: 579–609.
Bolck, Annabel, Croon, Marcel A., and Hagenaars, Jacques A. 2004. Estimating latent structure models with categorical variables: One-step versus three-step estimators. Political Analysis 12: 3–27.
Breen, Richard. 2000. Why is support for extreme parties underestimated by surveys? A latent class analysis. British Journal of Political Science 30: 375–82.
Chung, Hwan, Flaherty, Brian P., and Schafer, Joseph L. 2006. Latent class logistic regression: Application to marijuana use and attitudes among high school seniors. Journal of the Royal Statistical Society Series A—Statistics in Society 169: 723–43.
Clogg, Clifford C. 1981. New developments in latent structure analysis. In Factor analysis and measurement in sociological research, ed. Jackson, D. J. and Borgatta, E. F., 215–46. Beverly Hills, CA: Sage.
Collins, Linda M., and Wugalter, Stuart E. 1992. Latent class models for stage-sequential dynamic latent variables. Multivariate Behavioral Research 27: 131–57.
Croon, Marcel A. 2002. Using predicted latent scores in general latent structure models. In Latent variable and latent structure models, ed. Marcoulides, George A. and Moustaki, Irini, 195–224. Mahwah, NJ: Lawrence Erlbaum.
Dalton, Russell J. 2006. The two faces of citizenship. Democracy & Society 3: 21–3.
Dalton, Russell J. 2008. Citizenship norms and the expansion of political participation. Political Studies 56: 76–98.
Dayton, C. Mitchell, and Macready, Geoffrey B. 1988. Concomitant-variable latent-class models. Journal of the American Statistical Association 83: 173–8.
Dias, José G., and Vermunt, Jeroen K. 2008. A bootstrap-based aggregate classifier for model-based clustering. Computational Statistics 23: 643–59.
Edlund, Jonas. 2006. Trust in the capability of the welfare state and general welfare state support: Sweden 1997–2002. Acta Sociologica 49: 395–417.
Feick, Lawrence F. 1989. Latent class analysis of survey questions that include don't know responses. Public Opinion Quarterly 53: 525–47.
Galindo-Garre, Francisca, and Vermunt, Jeroen K. 2006. Avoiding boundary estimates in latent class analysis by Bayesian posterior mode estimation. Behaviormetrika 33: 43–59.
Garrett, Elisabeth S., and Zeger, Scott L. 2000. Latent class model diagnosis. Biometrics 56: 1055–67.
Garrett, Elisabeth S., Eaton, William W., and Zeger, Scott L. 2002. Methods for evaluating the performance of diagnostic tests in the absence of a gold standard: A latent class model approach. Statistics in Medicine 21: 1289–307.
Goodman, Leo A. 1974a. The analysis of systems of qualitative variables when some of the variables are unobservable: Part I—A modified latent structure approach. American Journal of Sociology 79: 1179–259.
Goodman, Leo A. 1974b. Exploratory latent structure analysis using both identifiable and unidentifiable models. Biometrika 61: 215–31.
Goodman, Leo A. 2007. On the assignment of individuals to classes. Sociological Methodology 37: 1–22.
Haberman, Shelby J. 1979. Analysis of qualitative data, Vol. 2: New developments. New York: Academic Press.
Hagenaars, Jacques A. 1990. Categorical longitudinal data—Loglinear analysis of panel, trend and cohort data. Newbury Park, CA: Sage.
Hagenaars, Jacques A. 1993. Loglinear models with latent variables. Newbury Park, CA: Sage.
Hill, Jennifer L., and Kriesi, Hanspeter. 2001a. Classification by opinion-changing behavior: A mixture model approach. Political Analysis 9: 301–24.
Hill, Jennifer L., and Kriesi, Hanspeter. 2001b. An extension and test of Converse's ‘black-and-white’ model of response stability. American Political Science Review 95: 397–413.
Howard, Marc M., Gibson, James L., and Stolle, Dietlind. 2005. The U.S. Citizenship, Involvement, Democracy survey. Center for Democracy and Civil Society, Georgetown University.
Kamakura, Wagner A., Wedel, Michel, and Agrawal, Jagdish. 1994. Concomitant variable latent class models for the external analysis of choice data. International Journal of Research in Marketing 11: 451–64.
Katz, Jonathan N., and Katz, Gabriel. 2009. Reassessing the link between voter heterogeneity and political accountability: A latent class regression model of economic voting. Paper presented at the 26th Annual Society for Political Methodology Summer Conference, July 23–25, 2009, Yale University.
Lazarsfeld, Paul F., and Henry, Neil W. 1968. Latent structure analysis. Boston, MA: Houghton Mifflin.
Linzer, Drew A. 2006. A comparative analysis of ideological constraint using latent class models. Paper presented at the annual meeting of the Midwest Political Science Association, Palmer House Hilton, Chicago, IL, April 20, 2006.
Lu, Irene R. R., and Thomas, D. Roland. 2008. Avoiding and correcting bias in score-based latent variable regression with discrete manifest items. Structural Equation Modeling 15: 462–90.
Magidson, Jay. 1981. Qualitative variance, entropy, and correlation ratios for nominal dependent variables. Social Science Research 10: 177–94.
Magidson, Jay, and Vermunt, Jeroen K. 2001. Latent class factor and cluster models, bi-plots and related graphical displays. Sociological Methodology 31: 223–64.
McCutcheon, Allan L. 1985. A latent class analysis of tolerance for nonconformity in the American public. Public Opinion Quarterly 49: 474–88.
McCutcheon, Allan L. 1987. Latent class analysis. Newbury Park, CA: Sage.
McLachlan, Geoffrey J., and Peel, David. 2000. Finite mixture models. New York: Wiley.
Moors, Guy, and Vermunt, Jeroen K. 2007. Heterogeneity in postmaterialist value priorities. Evidence from a latent class discrete choice approach. European Sociological Review 23: 631–48.
Muthén, Linda K., and Muthén, Bengt O. 2004. Mplus 3.0: User's manual. Los Angeles, CA: Muthén and Muthén.
Patterson, Blossom H., Dayton, C. Mitchell, and Graubard, Barry I. 2002. Latent class analysis of complex sample survey data: Application to dietary data. Journal of the American Statistical Association 97: 721–8.
Rubin, Donald B. 1987. Multiple imputation for nonresponse in surveys. New York: Wiley.
Schafer, Joseph L. 1997. Analysis of incomplete multivariate data. London: Chapman & Hall.
Skinner, Chris J., Holt, Tim, and Smith, T. M. Fred. 1989. Analysis of complex surveys. New York: Wiley.
Simmons, Solon. 2008. Ascriptive justice: The prevalence, distribution, and consequences of political correctness in the academy. Forum 6: 8.
Skrondal, Anders, and Laake, Petter. 2001. Regression among factor scores. Psychometrika 66: 563–76.
Van den Hout, Ardo, and Van der Heijden, Peter G. M. 2004. The analysis of multivariate misclassified data, with special attention to randomized response data. Sociological Methods and Research 32: 310–36.
Van de Pol, Frank, and Langeheine, Rolf. 1990. Mixed Markov latent class models. Sociological Methodology 20: 213–47.
Van der Heijden, Peter G. M., Gilula, Zvi, and Van der Ark, L. Andries. 1999. An extended study into the relationship between correspondence analysis and latent class analysis. Sociological Methodology 29: 147–86.
Vermunt, Jeroen K. 1997. Log-linear models for event histories. Advanced quantitative techniques in the social sciences series. Thousand Oaks, CA: Sage.
Vermunt, Jeroen K. 2003. Multilevel latent class models. Sociological Methodology 33: 213–39.
Vermunt, Jeroen K. 2005. Mixed-effects logistic regression models for indirectly observed outcome variables. Multivariate Behavioral Research 40: 281–301.
Vermunt, Jeroen K. 2008. Latent class and finite mixture models for multilevel data sets. Statistical Methods in Medical Research 17: 33–51.
Vermunt, Jeroen K., Langeheine, Rolf, and Böckenholt, Ulf. 1999. Discrete-time discrete-state latent Markov models with time-constant and time-varying covariates. Journal of Educational and Behavioral Statistics 24: 178–205.
Vermunt, Jeroen K., and Magidson, Jay. 2004. Latent class analysis. In The Sage encyclopedia of social science research methods, ed. Lewis-Beck, Michael, Bryman, Alan, and Liao, Tim F., 549–53. Newbury Park, CA: Sage.
Vermunt, Jeroen K., and Magidson, Jay. 2005. Latent GOLD 4.0 user's guide. Belmont, MA: Statistical Innovations.
Vermunt, Jeroen K., and Magidson, Jay. 2008. LG-Syntax user's guide: Manual for Latent GOLD 4.5 syntax module. Belmont, MA: Statistical Innovations.
Yamaguchi, Kazuo. 2000. Multinomial logit latent-class regression models: An analysis of the predictors of gender-role attitudes among Japanese women. American Journal of Sociology 105: 1702–40.
Political Analysis
  • ISSN: 1047-1987
  • EISSN: 1476-4989
  • URL: /core/journals/political-analysis