
The Role of Conditional Likelihoods in Latent Variable Modeling

Published online by Cambridge University Press:  01 January 2025

Anders Skrondal*
Affiliation:
Norwegian Institute of Public Health; University of Oslo; University of California, Berkeley
Sophia Rabe-Hesketh
Affiliation:
University of California, Berkeley
*Correspondence should be made to Anders Skrondal, CEFH, Norwegian Institute of Public Health, P.O. Box 222 Skøyen, N-0213 Oslo, Norway. Email: anders.skrondal@fhi.no

Abstract

In psychometrics, the canonical use of conditional likelihoods is for the Rasch model in measurement. Whilst not disputing the utility of conditional likelihoods in measurement, we examine a broader class of problems in psychometrics that can be addressed via conditional likelihoods. Specifically, we consider cluster-level endogeneity, where the standard assumption that observed explanatory variables are independent of latent variables is violated. Here, “cluster” refers to the entity characterized by latent variables or random effects, such as individuals in measurement models or schools in multilevel models, and “unit” refers to the elementary entity, such as an item in measurement. Cluster-level endogeneity problems can arise in a number of settings, including unobserved confounding of causal effects, measurement error, retrospective sampling, informative cluster sizes, missing data, and heteroskedasticity. Severely inconsistent estimation can result if these challenges are ignored.
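As a minimal sketch of the idea, and not the authors' implementation, the following Python simulation considers a logistic random-intercept model with clusters of size 2 in which the covariate is correlated with the random intercept. For such pairs, conditioning on the cluster sum of the responses removes the random intercept, so the conditional likelihood reduces to a logistic regression (without intercept) of the second response on the within-cluster covariate difference among discordant pairs. The function names `simulate_pairs` and `conditional_ml`, the data-generating design, and all parameter values are assumptions chosen for illustration only.

```python
import numpy as np


def simulate_pairs(n_clusters, beta, rho, rng):
    """Simulate clusters of size 2 from a logistic random-intercept model.

    Hypothetical design: the covariate x_ij is built so that
    Cor(x_ij, zeta_j) = rho, i.e. cluster-level endogeneity when rho != 0.
    """
    zeta = rng.normal(size=n_clusters)                 # random intercept zeta_j
    noise = rng.normal(size=(n_clusters, 2))
    x = rho * zeta[:, None] + np.sqrt(1.0 - rho**2) * noise
    eta = beta * x + zeta[:, None]                     # linear predictor
    prob = 1.0 / (1.0 + np.exp(-eta))
    y = (rng.uniform(size=(n_clusters, 2)) < prob).astype(int)
    return x, y


def conditional_ml(x, y, n_iter=25):
    """Conditional ML estimate of beta for clusters of size 2.

    Given y_1j + y_2j = 1, P(y_2j = 1) = expit(beta * (x_2j - x_1j)),
    which does not involve zeta_j. Concordant pairs carry no information.
    """
    discordant = y.sum(axis=1) == 1
    d = x[discordant, 1] - x[discordant, 0]            # within-cluster difference
    s = y[discordant, 1].astype(float)                 # 1 if the second unit is the success
    beta = 0.0
    for _ in range(n_iter):                            # Newton-Raphson on a concave log-likelihood
        p = 1.0 / (1.0 + np.exp(-beta * d))
        score = np.sum((s - p) * d)
        info = np.sum(p * (1.0 - p) * d**2)
        beta += score / info
    return beta


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x, y = simulate_pairs(n_clusters=20000, beta=1.0, rho=0.4, rng=rng)
    print("conditional ML estimate of beta:", conditional_ml(x, y))
```

Because the random intercept cancels from the conditional probability, the estimate does not depend on how `zeta` relates to `x`; the price is that concordant pairs are discarded.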

Information

Type
Theory and Methods
Creative Commons
This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
Copyright
Copyright © 2022 The Author(s)

Figure 1. Illustration of clustered data for $N=3$ clusters and $n=2$ units per cluster. Exchangeable units (upper panel) and non-exchangeable units (lower panel).


Figure 2. Cluster-level exogeneity (left panel) and cluster-level endogeneity (right panel).


Figure 3. Automatic inconsistency correction of MML estimation for logistic random-intercept model as a function of cluster size $n$. $\mathrm{Cor}(\zeta_j,x_{ij}) = 0.4$ (solid curve) and $\mathrm{Cor}(\zeta_j,x_{ij}) = 0.2$ (dashed curve) for $N=1{,}000{,}000$ clusters.


Figure 4. Protective MML estimate for simulated data with correct auxiliary model for logit link as a function of $N$. $\mathrm{Cor}(x_{ij},\zeta_j)=0.4$ and $n=4$.


Figure 5. Joint modeling using SEM for identity link and normal conditional distribution. Path diagrams ($n=3$) for standard random-intercept model where $\mathrm{Cor}(x_{ij},\zeta_j) = 0$ and $\mathrm{Cor}(v_j,\zeta_j) = 0$ (left panel) and joint SEM specifying $\mathrm{Cor}(x_{ij},\zeta_j) \ne 0$ and $\mathrm{Cor}(v_j,\zeta_j) = 0$ (right panel).


Figure 6. Unobserved cluster-level confounding. Cluster-level unobserved confounder $u_j$ (left panel) and resulting cluster-level endogeneity (right panel).


Figure 7. Retrospective sampling of units. Unselected population (left panel) and selected sample (right panel).


Figure 8. Retrospective sampling of clusters. Unselected population (left panel) and selected sample (right panel).


Figure 9. Informative cluster sizes.


Figure 10. Outcome-dependent missingness. Current-outcome-dependent missingness (left panel) and lag(1)-dependent missingness for $n=4$ (right panel).


Figure 11. Latent-variable and covariate-dependent missingness. Unselected population (left panel) and selected sample (right panel).


Figure 12. Latent-variable-dependent missingness.


Figure 13. Heteroskedastic latent variable.