Maximum Augmented Empirical Likelihood Estimation of Categorical Marginal Models for Large Sparse Contingency Tables

Published online by Cambridge University Press: 01 January 2025

L. Andries van der Ark*
Affiliation:
University of Amsterdam
Wicher P. Bergsma
Affiliation:
The London School of Economics and Political Science
Letty Koopman
Affiliation:
University of Amsterdam
* Correspondence should be made to L. Andries van der Ark, Research Institute of Child Development and Education, University of Amsterdam, P.O. Box 15776, 1001, NG Amsterdam, The Netherlands. Email: L.A.vanderArk@uva.nl

Abstract

Categorical marginal models (CMMs) are flexible tools for modelling dependent or clustered categorical data, when the dependencies themselves are not of interest. A major limitation of maximum likelihood (ML) estimation of CMMs is that the size of the contingency table increases exponentially with the number of variables, so even for a moderate number of variables, say between 10 and 20, ML estimation can become computationally infeasible. An alternative method, which retains the optimal asymptotic efficiency of ML, is maximum empirical likelihood (MEL) estimation. However, we show that MEL tends to break down for large, sparse contingency tables. As a solution, we propose a new method, which we call maximum augmented empirical likelihood (MAEL) estimation and which involves augmentation of the empirical likelihood support with a number of well-chosen cells. Simulation results show good finite sample performance for very large contingency tables.
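The exponential growth mentioned in the abstract can be made concrete with a small sketch. It assumes dichotomous items (two categories per variable) and a hypothetical sample size of 1,000; neither value is taken from the article, and the function names are illustrative only. The item counts 4, 8, 10, 20, and 40 are the values of J that appear in the table captions.

```python
def n_cells(n_items: int, n_categories: int = 2) -> int:
    """Number of cells in the full contingency table of n_items variables.

    Assumes every variable has the same number of categories
    (an assumption made for illustration).
    """
    return n_categories ** n_items


def max_observed_fraction(n_items: int, sample_size: int,
                          n_categories: int = 2) -> float:
    """Upper bound on the fraction of cells a sample can occupy.

    Each respondent falls in exactly one cell, so at most
    min(sample_size, number of cells) cells can be non-empty.
    """
    cells = n_cells(n_items, n_categories)
    return min(sample_size, cells) / cells


if __name__ == "__main__":
    sample_size = 1000  # hypothetical sample size, not from the article
    for j in (4, 8, 10, 20, 40):
        print(j, n_cells(j), max_observed_fraction(j, sample_size))
```

Under these assumptions, a table for 20 dichotomous items already has more than a million cells, so a sample of 1,000 can fill fewer than 0.1% of them. This is the sense in which such tables are both large and sparse.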

Information

Type
Theory and Methods
Creative Commons
This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
Copyright
Copyright © 2023 The Author(s)
Table 1. Convergence rates (percentage) and median computation times in seconds for ML, MEL, and MAEL, for three different CMMs, two numbers of items (J), and three percentages of unobservable response patterns (U) based on 1,000 (J = 4, 8) and 100 (J = 10) replications.

Table 2. Type I error rate for MAEL estimation of three different CMMs, four different numbers of items (J), and three different sample sizes (N), based on 1000 replications.

Figure 1. Type I error rates by the ratio of sample size and degrees of freedom in Study 2. Dashed lines are the limits of the 95% confidence interval of the Type I error rate due to Monte Carlo error.

Table 3. Estimated standard error of CMM-parameter estimate $\hat{\beta}$ for Model “Mean”, for four different numbers of items (J), and three different sample sizes (N), based on 1000 (J = 20 and J = 40) and 10,000 (J = 4 and J = 8) replications.