Stacking Models of Growth: A Methodology for Predicting the Pace of Progress to the Education Sustainable Development Targets Using International Large-Scale Assessments

Published online by Cambridge University Press:  13 February 2025

David Kaplan*
Affiliation:
Educational Psychology, University of Wisconsin-Madison, Madison, WI, USA
Kjorte Harra
Affiliation:
Educational Psychology, University of Wisconsin-Madison, Madison, WI, USA
Jonas Stampka
Affiliation:
Institut für Bildungswissenschaft, Universität Heidelberg, Heidelberg, Baden-Württemberg, Germany
Nina Jude
Affiliation:
Institut für Bildungswissenschaft, Universität Heidelberg, Heidelberg, Baden-Württemberg, Germany
*Corresponding author: David Kaplan; Email: david.kaplan@wisc.edu

Abstract

To assess country-level progress toward educational goals, it is important to monitor trends in educational outcomes over time. The purpose of this article is to demonstrate how optimally predictive growth models can be constructed to monitor the pace at which countries are moving toward (or away from) the education sustainable development goals specified by the United Nations. A number of growth curve models can be specified to estimate the pace of progress; however, choosing one model and using it for predictive purposes assumes that the chosen model is the one that generated the data, and this choice runs the risk of “over-confident inferences and decisions that are more risky than one thinks they are” (Hoeting et al., 1999). To mitigate this problem, we adapt and apply Bayesian stacking to form mixtures of predictive distributions from an ensemble of individual models specified to predict country-level pace of progress. We demonstrate Bayesian stacking using country-level data from the Programme for International Student Assessment (PISA). Our results show that Bayesian stacking yields better predictive accuracy than any single model as measured by the Kullback–Leibler divergence. Issues of Bayesian model identification and estimation for growth models are also discussed.
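
As a rough illustration of what forming a mixture of predictive distributions involves, the sketch below estimates stacking weights by maximizing the summed log score of the mixture over leave-one-out log predictive densities. It is not the authors' code: the function names `stacking_weights` and `log_score`, the toy numbers in `toy_lpd`, and the simple EM-style update used as the optimizer are all assumptions made for illustration.

```python
import math

def log_score(lpd, w):
    """Summed log predictive density of the weighted mixture.

    lpd[i][k] is the (leave-one-out) log predictive density of
    observation i under model k; w[k] is the weight of model k.
    """
    return sum(
        math.log(sum(wk * math.exp(v) for wk, v in zip(w, row)))
        for row in lpd
    )

def stacking_weights(lpd, n_iter=500):
    """Weights on the simplex that maximize log_score(lpd, w).

    Uses an EM-style fixed-point update for mixture weights with fixed
    component densities (an assumed optimizer, chosen for simplicity).
    """
    n, n_models = len(lpd), len(lpd[0])
    # Rescale each row by its maximum for numerical stability; this only
    # shifts the objective by a constant, so the maximizer is unchanged.
    dens = [[math.exp(v - max(row)) for v in row] for row in lpd]
    w = [1.0 / n_models] * n_models
    for _ in range(n_iter):
        totals = [0.0] * n_models
        for row in dens:
            mix = sum(wk * d for wk, d in zip(w, row))
            for k in range(n_models):
                # Share of observation i's mixture density due to model k.
                totals[k] += w[k] * row[k] / mix
        w = [t / n for t in totals]
    return w

# Hypothetical log predictive densities: 4 observations, 2 models.
toy_lpd = [
    [-1.0, -3.0],
    [-1.2, -2.5],
    [-3.0, -0.8],
    [-2.8, -1.0],
]
weights = stacking_weights(toy_lpd)
print(weights, log_score(toy_lpd, weights))
```

In this toy setting each model predicts some observations well and others poorly, so the weighted mixture attains a higher summed log score than either model alone, which is the intuition behind stacking an ensemble rather than selecting a single model.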

Information

Type
Application and Case Studies - Original
Creative Commons
Creative Commons License - CC BY
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (https://creativecommons.org/licenses/by/4.0), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© The Author(s), 2025. Published by Cambridge University Press on behalf of Psychometric Society

Table 1 Fifty-four countries and economies with complete mathematics assessment data from 2009 to 2022

Figure 1 Trend lines for PISA mathematics proficiency from 2009 to 2022. The red line is for girls, the blue line is for boys, and the horizontal black line is the cutoff for PISA level-2 minimum proficiency. Note that Chinese Taipei is shown here but was removed from the stacking analysis due to a large amount of missing data on relevant predictors.

Table 2 Posterior estimates of starting points and rates of progress, 90% credible intervals (in parentheses), and predictive evaluation under linear and two latent basis models$^{a}$

Table 3 Expected log predictive performance based on leave-one-out (LOO) cross-validation for boys (upper panel) and girls (lower panel)

Table 4 Stacking weights and Kullback–Leibler divergence scores for each model separately and for the ensemble

Figure 2 Predictive densities of the pace of progress across different stacking weights and different models for boys and girls.

Figure 3 Within-sample and one-cycle-ahead predictions for each ensemble member and for the stacked prediction based on ELPD$_{loo}$ for boys’ and girls’ performance on the PISA mathematics assessment.

Table A1 Variable names, indicator category, and model number for multi-model ensemble members

Table A2