State-Dependent Missingness in Hidden Markov Models, with an Application to Drop-Out in a Clinical Trial

Published online by Cambridge University Press:  03 January 2025

Maarten Speekenbrink*
Affiliation:
Department of Experimental Psychology, University College London, London, UK
Ingmar Visser
Affiliation:
Department of Psychology, University of Amsterdam, Amsterdam, Netherlands
*Corresponding author: Maarten Speekenbrink; Email: m.speekenbrink@ucl.ac.uk

Abstract

Establishing the effectiveness of treatments for psychopathology requires accurate models of its progression over time and the factors that impact it. Longitudinal data is, however, fraught with missingness, which hinders accurate modeling. We re-analyse data on schizophrenia severity in a clinical trial using hidden Markov models (HMMs). We consider missing data in HMMs with a focus on situations where data is missing not at random (MNAR) and missingness depends on the latent states, allowing symptom severity to indirectly impact the probability of missingness. In simulations, we show that including a submodel for state-dependent missingness reduces bias when data is MNAR and state-dependent, whilst not reducing accuracy when data is missing at random (MAR). When missingness depends on time, a model that allows missingness to be both state- and time-dependent is unbiased. Overall, these results show that modelling missingness as state-dependent and including relevant covariates is a useful strategy in applications of HMMs to time series with missing data. Applying the model to data from a clinical trial, we find that drop-out is more likely for patients with less severe symptoms, which may lead to a biased assessment of treatment effectiveness.
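The idea of state-dependent missingness can be illustrated with a minimal sketch of the scaled forward recursion for an HMM in which each hidden state emits both a missingness indicator and, when observed, a Gaussian response. This is not the authors' implementation: the function names, the Gaussian response distribution, the time-constant missingness probabilities, and all parameter values below are assumptions chosen purely for illustration.

```python
import math


def normal_pdf(y, mean, sd):
    """Density of a normal distribution at y."""
    z = (y - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def forward_loglik(ys, init, trans, means, sds, p_miss):
    """Scaled forward recursion for an HMM with state-dependent missingness.

    ys     : list of responses, with None marking a missing value
    init   : initial state probabilities
    trans  : transition matrix, trans[i][j] = P(state j at t+1 | state i at t)
    means, sds : Gaussian response parameters per state (illustrative choice)
    p_miss : probability that the response is missing, per state

    Returns the log-likelihood and the filtered state probabilities at the
    final time point.
    """
    n_states = len(init)
    loglik = 0.0
    alpha = None
    for y in ys:
        # One-step-ahead prediction of the hidden state.
        if alpha is None:
            pred = list(init)
        else:
            pred = [sum(alpha[i] * trans[i][j] for i in range(n_states))
                    for j in range(n_states)]
        # Joint density of (missingness indicator, response) in each state.
        if y is None:
            dens = [p_miss[j] for j in range(n_states)]
        else:
            dens = [(1.0 - p_miss[j]) * normal_pdf(y, means[j], sds[j])
                    for j in range(n_states)]
        unnorm = [pred[j] * dens[j] for j in range(n_states)]
        scale = sum(unnorm)
        loglik += math.log(scale)
        alpha = [u / scale for u in unnorm]
    return loglik, alpha


# Hypothetical two-state example: state 0 = less severe, state 1 = more severe.
init = [0.5, 0.5]
trans = [[0.9, 0.1], [0.1, 0.9]]
means, sds = [2.0, 5.0], [1.0, 1.0]
ys = [5.1, 4.8, None, None]  # two observed ratings followed by drop-out

_, alpha_mnar = forward_loglik(ys, init, trans, means, sds, p_miss=[0.4, 0.1])
_, alpha_mar = forward_loglik(ys, init, trans, means, sds, p_miss=[0.25, 0.25])
print("P(less severe | data), state-dependent missingness:", alpha_mnar[0])
print("P(less severe | data), state-independent missingness:", alpha_mar[0])
```

When the missingness probability is the same in every state, it cancels in the normalisation and the missing observations carry no information about the hidden state. When it differs between states, a run of missing values shifts the filtered state probabilities towards the state in which missingness is more likely.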

Information

Type
Application and Case Studies - Original
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (https://creativecommons.org/licenses/by/4.0), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© The Author(s), 2025. Published by Cambridge University Press on behalf of Psychometric Society

Figure 1 Average ratings of the severity of illness (IMPS item 79) by week and drug type. Bars depict 95% confidence intervals. Note that for the placebo group, confidence intervals at weeks 4 and 5 extend beyond the plot due to the small number of observations.

Figure 2 Improvement in symptoms at the main measurement weeks by missingness pattern and drug status. Note that the missingness pattern concerns solely the main measurement weeks. Improvements are differences in scores between the main measurement weeks. Dots represent means and ranges represent 95% confidence intervals.

Table 1 Results of a logistic regression analysis modeling missingness as a function of drug, week, and whether the week was a main measurement occasion or not

Figure 3 State-conditional response distributions in the simulation studies. In Simulations 1 and 2, states are reasonably well-separated, although there is still considerable overlap of the distributions. In Simulations 3 and 4, states are less well-separated.

Table 2 Results of Simulation 1 (MNAR, low variance)

Table 3 Results of Simulation 2 (MAR, low variance)

Table 4 Results of Simulation 3 (MNAR, high variance)

Table 5 Results of Simulation 4 (MAR, high variance)

Table 6 Results of Simulation 9 (MNAR, related to true value)

Table 7 Results of Simulation 10 (time-dependent missingness, low variance)

Table 8 Modeling results for the MAR and MNAR HMM with 2-5 latent states

Figure 4 Histograms and QQ plots of the pseudo-residuals for the MAR and MNAR model.

Table 9 Parameter estimates of the state-dependent logistic regression models for missingness, with lower and upper reflecting the lower and upper bounds of the approximate $95\%$ confidence intervals

Figure 5 Predicted probability of missing IMPS Item 79 ratings by week for each state in the three-state MNAR hidden Markov model.

Figure 6 Proportions of MAP state assignments over weeks for the medication and placebo groups, according to the MAR and MNAR model.

Table A1 Results of Simulation 5 (MNAR, low variance)

Table A2 Results of Simulation 6 (MAR, low variance)

Table A3 Results of Simulation 7 (MNAR, low variance)

Table A4 Results of Simulation 8 (MAR, low variance)