
A Two-Step Estimator for Multilevel Latent Class Analysis with Covariates

Published online by Cambridge University Press:  01 January 2025

Roberto Di Mari*
Affiliation:
University of Catania
Zsuzsa Bakk
Affiliation:
Leiden University
Jennifer Oser
Affiliation:
Ben-Gurion University
Jouni Kuha
Affiliation:
London School of Economics and Political Science
*
Correspondence should be made to Roberto Di Mari, Department of Economics and Business, University of Catania, Corso Italia 55, 95128 Catania, Italy. Email: roberto.dimari@unict.it

Abstract

We propose a two-step estimator for multilevel latent class analysis (LCA) with covariates. The measurement model for observed items is estimated in its first step, and in the second step covariates are added in the model, keeping the measurement model parameters fixed. We discuss model identification, and derive an Expectation Maximization algorithm for efficient implementation of the estimator. By means of an extensive simulation study we show that (1) this approach performs similarly to existing stepwise estimators for multilevel LCA but with much reduced computing time, and (2) it yields approximately unbiased parameter estimates with a negligible loss of efficiency compared to the one-step estimator. The proposal is illustrated with a cross-national analysis of predictors of citizenship norms.
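
As a reading aid, the model structure summarized in the abstract can be written as a group-level likelihood. This is a sketch under assumed notation, not the article's own equation: $W_j$ is the high-level class of group $j$, $X_{ij}$ the low-level class of unit $i$ in group $j$, $Y_{ijh}$ the items and $\mathbf{Z}_{ij}$ the covariates, while the counts $M$, $T$, $H$ and $n_j$ (high-level classes, low-level classes, items, and units in group $j$) are symbols introduced here for illustration.

```latex
P(\mathbf{Y}_j \mid \mathbf{Z}_j)
  = \sum_{m=1}^{M} P(W_j = m)
    \prod_{i=1}^{n_j} \sum_{t=1}^{T}
      P(X_{ij} = t \mid W_j = m, \mathbf{Z}_{ij})
      \prod_{h=1}^{H} P(Y_{ijh} \mid X_{ij} = t)
```

Read this way, the first step targets the measurement terms $P(Y_{ijh} \mid X_{ij} = t)$, and the second step holds them fixed at their estimates while maximizing over the remaining structural terms.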

Information

Type
Theory and Methods
Creative Commons
This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
Copyright
Copyright © 2023 The Author(s)

Figure 1. Graphical representation of a multilevel latent class model which includes a low-level latent class variable $X_{ij}$ nested in a high-level latent class variable $W_j$, and covariates $Z_{ij}$ for $X_{ij}$. Here the response probabilities for items $Y_{ijh}$ depend directly only on $X_{ij}$.

Figure 2. Step 2 of the two-step estimation: Estimating the structural model for low-level latent classes $X_{ij}$ given covariates $\mathbf{Z}_{ij}$ and high-level latent classes $W_j$, keeping measurement model parameters for items $Y_{ijh}$ fixed at their estimates from Step 1.

Table 1. 24 simulation conditions.

Figure 3. Line graphs of estimated bias for the one-step, two-step, and two-stage estimators, for the 36 simulation conditions, averaged over the 500 replicates. Error bars are based on mean bias ± Monte Carlo standard deviations.

Figure 4. Observed coverage rates of 95% confidence intervals, averaged over covariate effects, for the one-step, two-stage and two-step estimators for the 36 simulation conditions, averaged over the 500 replicates. The bars span the minimum and maximum coverage rates of the confidence intervals across the individual covariate effects.

Figure 5. Relative computation time for the one-step, two-stage and two-step estimators for the 24 simulation conditions, averaged over the 500 replicates. The one-step estimator’s estimation time is taken as the reference. Confidence bands are based on average values ± their Monte Carlo standard deviation.

Table 2. Number of respondents per country of the third wave (2016) of the IEA survey used for the analysis.

Table 3. Summary statistics for the measurement model.

Figure 6. Measurement model at the lower (individual) level: line graph of the class-conditional response probabilities.

Table 4. Estimated proportions of low-level (individual-level) classes conditional on high-level (country-level) class membership.

Table 5. Assignment of countries to the high-level classes, based on the maximum a posteriori (MAP) classification rule. $M=3$.

Table 6. Estimated coefficients of structural models, i.e. multinomial logistic models for membership of the four individual-level latent classes conditional on covariates, separately within each of the three country-level latent classes (HL1, HL2 and HL3).

Table 7. CPU time for estimation in seconds, and number of iterations until convergence, for the one-step and two-step estimators.

Table 8. Average relative efficiency for the two-step and two-stage estimators relative to the one-step estimator (SD over benchmark one-step SD), averaged over covariate effects.
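
The simulation summaries named in the captions above (bias, coverage of 95% confidence intervals, and relative efficiency as an SD ratio against the one-step benchmark) can be illustrated with a short sketch. It is not the authors' code: the function names, the normal-approximation interval with z = 1.96, and the toy replicate values are all assumptions made for illustration.

```python
import statistics


def summarize(estimates, std_errors, true_value, z=1.96):
    """Monte Carlo summaries for one parameter across replicates.

    Assumes Wald-type intervals: estimate +/- z * standard error.
    """
    bias = statistics.fmean(estimates) - true_value
    mc_sd = statistics.stdev(estimates)  # Monte Carlo standard deviation
    covered = [
        est - z * se <= true_value <= est + z * se
        for est, se in zip(estimates, std_errors)
    ]
    coverage = sum(covered) / len(covered)
    return {"bias": bias, "mc_sd": mc_sd, "coverage": coverage}


def relative_efficiency(mc_sd, benchmark_mc_sd):
    """SD of an estimator over the SD of the benchmark (one-step) estimator."""
    return mc_sd / benchmark_mc_sd


# Hypothetical replicate estimates of a single covariate effect (true value 0.5).
true_effect = 0.5
one_step = [0.48, 0.52, 0.50, 0.47, 0.53]
two_step = [0.47, 0.53, 0.50, 0.46, 0.54]
ses = [0.02] * 5  # hypothetical standard errors, identical for simplicity

one_step_summary = summarize(one_step, ses, true_effect)
two_step_summary = summarize(two_step, ses, true_effect)
rel_eff = relative_efficiency(two_step_summary["mc_sd"], one_step_summary["mc_sd"])

print(one_step_summary)
print(two_step_summary)
print(rel_eff)
```

A ratio above 1 means the estimator is more variable than the one-step benchmark; averaging such ratios over covariate effects gives a summary of the kind described in Table 8.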