Hostname: page-component-7dd5485656-bt4hw Total loading time: 0 Render date: 2025-10-26T11:02:28.284Z Has data issue: false hasContentIssue false

Formulating and evaluating time series algorithms to forecast daily asthma hospital admissions

Published online by Cambridge University Press:  30 July 2025

Stephen P. Colegate*
Affiliation:
Division of Biostatistics and Epidemiology, Cincinnati Children’s Hospital Medical Center, Cincinnati, USA
Michael Seid
Affiliation:
Division of Pulmonary Medicine, Cincinnati Children’s Hospital Medical Center, Cincinnati, USA Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, USA James M. Anderson Center for Health Systems Excellence, Cincinnati Children’s Hospital Medical Center, Cincinnati, USA
David Hartley
Affiliation:
Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, USA James M. Anderson Center for Health Systems Excellence, Cincinnati Children’s Hospital Medical Center, Cincinnati, USA
Aaron Flicker
Affiliation:
James M. Anderson Center for Health Systems Excellence, Cincinnati Children’s Hospital Medical Center, Cincinnati, USA Michael Fisher Child Health Equity Center, Cincinnati Children’s Hospital Medical Center, Cincinnati, USA
Joseph Bruce
Affiliation:
James M. Anderson Center for Health Systems Excellence, Cincinnati Children’s Hospital Medical Center, Cincinnati, USA Michael Fisher Child Health Equity Center, Cincinnati Children’s Hospital Medical Center, Cincinnati, USA
Joseph Michael
Affiliation:
James M. Anderson Center for Health Systems Excellence, Cincinnati Children’s Hospital Medical Center, Cincinnati, USA Michael Fisher Child Health Equity Center, Cincinnati Children’s Hospital Medical Center, Cincinnati, USA
Mfonobong Udoko
Affiliation:
Division of Pulmonary Medicine, Cincinnati Children’s Hospital Medical Center, Cincinnati, USA Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, USA James M. Anderson Center for Health Systems Excellence, Cincinnati Children’s Hospital Medical Center, Cincinnati, USA
Andrew F. Beck
Affiliation:
Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, USA James M. Anderson Center for Health Systems Excellence, Cincinnati Children’s Hospital Medical Center, Cincinnati, USA Michael Fisher Child Health Equity Center, Cincinnati Children’s Hospital Medical Center, Cincinnati, USA Division of General & Community Pediatrics, Cincinnati Children’s Hospital Medical Center, Cincinnati, USA Office of Population Health, Cincinnati Children’s Hospital Medical Center, Cincinnati, USA
Cole Brokamp
Affiliation:
Division of Biostatistics and Epidemiology, Cincinnati Children’s Hospital Medical Center, Cincinnati, USA Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, USA
*
Corresponding author: S.P. Colegate; Email: stephen.colegate@cchmc.org
Rights & Permissions [Opens in a new window]

Abstract

Introduction:

Asthma exacerbations are frequent causes of pediatric hospital admissions. We sought to develop a time series algorithm to forecast next-day daily asthma hospitalizations.

Methods:

Daily hospitalizations for asthma were collected at Cincinnati Children’s from January 1, 2016, to December 31, 2023. We evaluated Autoregressive Integrated Moving Average (ARIMA), Exponential Smoothing (ETS), Prophet, and Ensemble models to forecast next-day asthma hospitalizations validated on 2023 data, considering varying historical training data lengths. Forecasts were calibrated to identify days exceeding a 5% high-risk threshold of historical totals and considered multiple validation years and years before and during the COVID-19 pandemic.

Results:

A total of 5,593 hospital admissions were recorded for asthma. Over 2,922 days, 166 days met the 5% high-risk threshold equating to 6 or more admissions. The Ensemble (Median Absolute Percentage Error (MAPE): 46.7%; Positive Predictive Value (PPV): 0.278; Negative Predictive Value (NPV): 0.942; Area Under the ROC Curve (AUC): 0.740; Sensitivity: 0.800; Specificity: 0.656) model achieved higher accuracy of high-risk days than ARIMA (MAPE: 46.5%; PPV: 0.278; NPV: 0.942; AUC: 0.709; Sensitivity: 0.760; Specificity: 0.571), ETS (MAPE: 47.2%; PPV: 0.222; NPV: 0.939; AUC: 0.711; Sensitivity: 0.800; Specificity: 0.668), and Prophet (MAPE: 48.9%; PPV: 0.444; NPV: 0.951; AUC: 0.732; Sensitivity: 0.680; Specificity: 0.741) models.

Conclusions:

Our Ensemble model of mean predictions from ARIMA, ETS, and Prophet models was the most accurate in forecasting future asthma hospitalizations. Integrating forecasting techniques with clinical operations could enable proactive prevention through enhanced population care management.

Information

Type
Research Article
Creative Commons
Creative Common License - CCCreative Common License - BY
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© The Author(s), 2025. Published by Cambridge University Press on behalf of Association for Clinical and Translational Science

Introduction

Accurately predicting future hospital admissions is valuable to decision makers who are seeking to maximize operational efficiency and to effectively meet the needs of patients [Reference Ryu, Romero-Brufau, Shahraki, Zhang, Qian and Kingsley1]. We sought to develop, implement, calibrate, and optimize an asthma time series forecasting algorithm that could predict future pediatric asthma hospitalization admissions at Cincinnati Children’s Hospital Medical Center (CCHMC). We formulated four forecasting algorithms using a variety of analytic approaches to predict the number of patients who would be admitted for asthma and periods when high numbers of hospital admissions for asthma would occur. Our goal was to produce a forecasting algorithm for asthma hospital admissions that clinicians could use to enhance allocation of staffing and to optimize preparedness and readiness for anticipated surges in asthma admission numbers. Such an algorithm could also be relevant to develop population care management approaches to deter asthma admissions before they could occur.

Acute asthma is one of the most frequent causes of hospital admissions, especially in children [Reference McBride, McCarty, Wong, Baskin, Currier and Chiang2,Reference Haktanir Abul and Phipatanakul3]. Approximately 12 million people in the United States experience an acute asthma exacerbation each year, a quarter of which require hospitalization [Reference Fergeson, Patel and Lockey4]. Children with uncontrolled asthma experience a lower quality of life [Reference Montalbano, Ferrante and Montella5]. In 2013, the total direct costs of pediatric asthma were $5.92 billion [Reference Perry, Braileanu, Palmer and Stevens6]. The frequency of asthma exacerbations serves as a key indicator of disease severity, and mitigation of these episodes is a critical endpoint in assessing the therapeutic efficacy of asthma management strategies [Reference Ramsahai, Hansbro and Wark7]. Identifying individuals (and populations) at risk of an asthma exacerbation may provide greater opportunity for prevention and promotion of better health outcomes [Reference Fleming9].

Barriers to fully understanding the underlying complexity of asthma exacerbation triggers have hindered efforts to mitigate risk by targeting optimal and appropriate preventative treatment [Reference Martin, Beasley and Harrison10]. Cincinnati Children’s Asthma Learning Health System (ALHS) is a learning network that strives to connect clinical and community service providers, as well as families, to spur collective action, accelerate research, and improve health outcomes for children with asthma [Reference Beck, Seid and McDowell11]. ALHS was created to serve youth living in Metropolitan Cincinnati, Ohio, and the surrounding region. Approximately 190,000 children live in Hamilton County, the metropolitan area’s population center, of which it is estimated that 30,000 have asthma. Launched in March 2022, ALHS strives to develop and deploy data-driven tools to enhance medical and social screening of asthma patients and the response to appropriate treatment in a timely and effective manner.

Well-known time series models exist and have been used for meaningful prediction [Reference Perone12]. Nonseasonal and seasonal Autoregressive Integrated Moving Average (ARIMA) models, as well as seasonal linear regression models, were used to develop forecasts for patients being admitted to a neonatal intensive care unit having a maximum capacity of 72 beds, with roughly 1100 neonatal admissions per year [Reference Capan, Hoover, Jackson, Paul and Locke13]. Ensemble methodologies incorporating both time-varying information and patient admission trends over time saw improved prediction accuracy for 3, 5, and 7-day census forecasts and increased prediction accuracy compared to forecasting approaches that ignore patient-specific information [Reference Koestler, Ombao and Bender14]. Traditional forecasting methods like Exponential Smoothing (ETS) are not always appropriate for forecasting large and noisy time series data, spurring interest in machine learning models like long short-term memory (LSTM) networks [Reference Hochreiter and Schmidhuber15] and Prophet [Reference Taylor and Letham16,Reference Toharudin, Pontoh, Caraka, Zahroh, Lee and Chen17], for their implementation and scalability. The implementation of traditional forecasting algorithms versus machine learning algorithms depend upon the approach, the model’s application, and the complexity of the data [Reference Menculini, Marini and Proietti18]. A forecasting model for forecasting asthma hospital admissions could help clinicians, caregivers, and patients more effectively identify solutions to improve care and prevent morbidity before it occurs [Reference Martin, Beasley and Harrison10,Reference Jones, Turner, Perera, Hiscock and Chen19,Reference Li, Liu, Yang, Peng and Zhou20]. Thus, we sought to develop a time series algorithm to forecast next-day daily asthma hospitalizations at CCHMC.

Methods

We compared the performance of classical time series forecasting models ARIMA and ETS to a Prophet model, as well as an Ensemble model that averaged predictions, on daily pediatric asthma hospital admissions at CCHMC. We selected the best time series model from each model class and evaluated the performance of the four best models based on prediction accuracy and timing of the predictions. We calibrated our models in the form of a high-risk asthma admittance threshold, opting for a threshold in line with the 5% of days with the highest numbers of admissions. All our models were formulated and validated using only asthma hospital admissions at CCHMC, identified retrospectively. No additional factors such as environmental variables (temperature, air pollution, weather) or other mechanisms were incorporated into these initial models, as we sought to evaluate predictive capabilities using only historical asthma admission data. Future efforts will focus on the added precision from such contextual variables and the potential prospective use-cases.

Daily asthma hospitalizations

Daily asthma hospitalizations were retrieved from the CCHMC electronic health record between January 1, 2016, to December 31, 2023, from CCHMC’s two main campuses: (1) The Clifton campus in Cincinnati, Ohio, located in Hamilton County, and (2) the Liberty Township campus, located in Butler County adjacently north of Hamilton County. An asthma hospital admission was defined when the primary diagnosis was a J45 ICD-10-CM (International Classification of Diseases) code [21], or if the primary diagnosis was a respiratory infection, coughing, nonchronic respiratory failure, shortness of breath or hypoxemia if there was a secondary J45 ICD-10-CM code classification. In addition to a primary or secondary J45 diagnosis, the patient also must have had a code for acute exacerbation or status asthmaticus in order to be classified as having an asthma hospitalization. Patients with cystic fibrosis, congenital heart disease, chronic respiratory failure, sickle cell disease, those with a tracheostomy tube, or those who were ventilator dependent were excluded from the asthma hospitalization count.

Validation

Daily asthma hospitalizations were split into training (2016 – 2022) and validation sets (2023) according to the daily number of asthma hospital admissions observed. To determine the optimal size of the training history window, we formulated models using various training history window sizes ranging from 1 year (2022), 2 years (2021 – 22), 3 years (2020 – 22), 4 years (2019 – 22), 5 years (2018 – 22), 6 years (2017 – 22), and 7 years (2016 – 22). Predictions were made after continuously updating each model based on the latest observations, always using the most recent 365-day window of daily asthma hospitalizations. This allowed each model to be updated daily with the latest information to account for recent trends and simulated how each model would behave if it were deployed and updated daily in a real-world setting.

Daily predictions from the optimal model of each model class were compared to the observed daily admissions data in the validation set to evaluate model performance. Temporally cross-validated accuracy of each model was assessed using (1) the absolute median percentage error (MAPE) between the predicted and actual number of admissions and (2) the percentage of predicted 95% confidence intervals (CI) that captured the actual number of admissions. We wanted to identify how many asthma hospital admissions were expected each day and to identify which days would have an extremely high number of admissions. Therefore, we created a high-risk threshold based on identifying days with admissions greater than the 95th percentile of historical daily hospitalization totals. Daily model predictions were calibrated using this 5% high-risk threshold to forecast days when the number of asthma hospitalizations was greater than the 95th percentile of historical daily asthma hospitalizations. The 5% high-risk threshold was chosen arbitrarily to control how many days would be forecasted to have an extremely high number of hospitalizations. This was compared with the actual top 5% of days with the highest number of admissions, allowing us to compare each model’s performance in identifying days when a historically extreme number of hospital admissions occurred. Receiver operating characteristic (ROC) curves [Reference Mandrekar22] were constructed to estimate the area under the ROC curve (AUC). Sensitivity, specificity, positive predictive values (PPV), and negative predictive values (NPV) were calculated using Youden’s J statistic. The pROC [Reference Robin, Turck and Hainard23] (ver. 1.18.5) package was used to formulate the ROC curves and to obtain the estimates of sensitivity, specificity, AUC, and the 95% CI for the AUC estimate. All data processing, model formulation, cross-validation, and plot creation were performed in R version 4.4.3.

Models

ARIMA models require the series be stationary [Reference Hyndman and Athanasopoulos24] – whose observations do not depend on time – by ensuring no trends or seasonality are present. The Kwiatkowski-Phillips-Schmidt-Shin (KPSS) [Reference Kwiatkowski, Phillips, Schmidt and Shin25] test indicated the series was not stationary (p-value: 0.023). We then looked at the first-difference series, Δy t = y t y t − 1, and, by the KPSS test, found it to be stationary (p-value >0.1), so the model was fitted to the first-difference series. We computed the sample autocorrelation function (ACF) and the partial autocorrelation function (PACF) for Δy t to find the moving average model of order q, MA(q), and the auto regressive model of order p, AR(p) [Reference Hyndman and Athanasopoulos24]. We included a constant term in the ARIMA models to ensure that the long-range forecasts would go to the mean of the time series [Reference Hyndman and Athanasopoulos24]. The Bayesian Information Criterion (BIC) [Reference Schwarz26] was used to compare ARIMA models, and the residuals were checked with the Ljung-Box test [Reference Box and Pierce27] to ensure they followed a white noise process. ETS models are one of the most well-known forecasting methods that deploy weighted averages of past observations to produce forecasts [Reference Holt28,Reference Winters29]. We fitted various ETS models and selected the ETS model based on the Akaike’s Information Criterion (AIC) [Reference Akaike30]. The R package fable [Reference O’Hara-Wild, Hyndman and Wang31] (ver. 0.4.1) was used to fit both the ARIMA and ETS models, and the feasts [Reference O’Hara-Wild, Hyndman, Wang and Cook32] package was used to extract the model components. The framework of a Prophet model [Reference Taylor and Letham16] takes a decomposable time series y(t) and aggregates three main components: the growth or trend g(t), seasonality s(t), and holidays h(t):

$$ y\left(t\right)=g\left(t\right)+s\left(t\right)+h\left(t\right)+\epsilon _{t} \# \left(1\right)$$

where the error terms ϵ t represent variations in the time series that are not attributed to growth, seasonality, or holiday effects. The trend function g(t) uses a generative model process by assigning S changepoints over a history of T points, each with rate change δ j  ∼ Laplace(0,τ), where τ directly controls the flexibility of the model [Reference Taylor and Letham16]. The seasonal function s(t) uses a Fourier series to allow for periodic effects. There were no obvious holiday effects to consider, as hospitals typically stay open throughout the holidays, so we set h(t) = 0. We used the R package prophet [Reference Taylor and Letham33] (ver. 1.0) to fit the Prophet model.

In addition to fitting ARIMA, ETS, and Prophet models, we were also interested in a model that combined the daily predictions made by each of these three models. We formulated an Ensemble model designed as a combination of predictions from the best ARIMA, ETS, and Prophet models we constructed. Ensemble model predictions were obtained by taking the average of the daily point estimates obtained from the ARIMA, ETS, and Prophet models. Similarly, Ensemble model 95% CI were constructed by calculating the mean lower 95% CI limit and the upper 95% CI limit of the daily 95% CI estimates from the ARIMA, ETS, and Prophet models.

Other considerations

We also compared the performance of ARIMA, ETS, Prophet, and Ensemble models on validation sets with multiple years (2022 – 2023) to determine how the models would perform when tasked with predicting two years as opposed to one. We also examined asthma hospitalization trends before and during the COVID-19 pandemic due to the presence of the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) and its potential interaction with our asthma-related outcomes [Reference Aggarwal, Agarwal, Dhooria, Prasad, Sehgal and Muthu34Reference Chatziparasidis and Kantar37]. Models were formulated (2016 – 2018) and evaluated (2019) on daily asthma hospitalizations before the World Health Organization (WHO) declared COVID-19 to be a global health pandemic on March 11, 2020 [38]. Models were then formulated (2020 – 2022) and evaluated (2023) on daily asthma hospitalizations to account for disruptive patterns brought on by the COVID-19 pandemic.

Results

There were 5,593 asthma hospital admissions at CCHMC between January 1, 2016, and December 31, 2023 (Table 1 and Figure 1), yielding an average of 699.1 cases per year and a median of 727.5 cases per year. There was a noticeable 52% drop in asthma admissions in 2020 when the COVID-19 pandemic began. During the pandemic, there was a 33% increase in asthma admissions in 2022 compared to 2021. We saw 2,088 asthma hospitalizations before the pandemic (2016 – 2019) and 2,036 asthma hospitalizations during the pandemic (2020 – 2023). September was the month with the highest number of admissions (639 cases), followed by October (606 cases), November (604 cases), and August (540 cases) – months encompassing the beginning of the school year. May (526 cases), March (482 cases), December (475 cases), February (441 cases), April (421 cases), and January (330 cases) cover periods during the school year. July (281 cases) and June (248 cases) saw the fewest number of admissions when school is out for the summer. Admissions were also higher at the start than at the end of the work/school week: Sunday (855 cases), Monday (950 cases), and Tuesday (862 cases) had noticeably higher admissions than Wednesday (764 cases), Thursday (753 cases), Friday (687 cases), or Saturday (722 cases).

Figure 1. Daily asthma hospitalizations at Cincinnati Children’s Hospital Medical Center (CCHMC) between 2016 and 2023 on a log1p-scale. Dates are shaded whether the number of admissions was lower (blue), higher (red), or normal (white) relative to the median number of admissions.

Table 1. Asthma hospitalization cases and percentages (%) at Cincinnati Children’s Hospital Medical Center (CCHMC) between 2016 and 2023. Admissions are segmented by period for the training and validation sets. The number of cases and the frequency of cases per day are stated for each period

The 5% high-risk threshold allowed us to identify high-risk days for asthma hospitalizations. Approximately 78.9% of the 2,922 days between 2016 and 2023 saw at least one hospitalization for asthma – 27.3% of days saw exactly one hospitalization for asthma, while 21.1% of days experienced two hospitalizations, and 15.1% of days experienced three hospitalizations. The maximum number of admissions in one day varied across the study period: 9 were admitted on one day in 2016, 7 in 2017, 8 in 2018 and 2019, 6 in 2020, 11 in 2021, 14 in 2022, and 9 in 2023. There were 616 days (21.1%) where there were no hospitalizations, the vast majority of those occurred in 2020 (155 days) when the pandemic began. We found that the top 5% of days experienced at least 6 or more hospital admissions, so the high-risk threshold was met if the observed number of asthma hospitalizations in one day was 6 or more. The 5% high-risk threshold was met for 166 days (2016 – 2023); 104 days from 2016 to 2022, and 15 days in 2023. Model predictions were calibrated similarly by identifying the top 5% of days with the highest predicted number of asthma hospitalizations. All models had similar 5% high-risk thresholds (ARIMA: 4.18, ETS: 4.17, Prophet: 4.10, ensemble: 4.13) used to classify whether a day was forecasted to meet the high-risk threshold.

We considered varying combinations of p and q to find the best ARIMA model. The ARIMA (0,1,2) had a slightly higher BIC (BIC: 7872) than the ARIMA (0,1,1) (BIC: 7870) but with fewer estimated parameters than the ARIMA (0,1,3) (BIC: 7880), the ARIMA (1,1,1) (BIC: 7883) and the ARIMA (2,1,0) (BIC: 8281) models. Residuals from the ARIMA (0,1,2) followed a white noise process (p-value: 0.167) while residuals from the ARIMA (0,1,1) did not (p-value: 0.080) so the ARIMA (0,1,2) model was chosen as our best ARIMA model. We fitted ETS models with various error, trend, and seasonality patterns, including simple exponential smoothing [Reference Hyndman, Koehler, Ord and Snyder39], Holt’s linear trend [Reference Holt28], and Holt-Winters’ damped method [Reference Hyndman and Athanasopoulos24]. The simple exponential smoothing model had additive errors with no trend or seasonality (AIC: 18,496). The Holt’s linear trend model had additive errors and additive trend but no seasonality (AIC: 18,507). The Holt-Winters’ damped model had additive terms and a damped trend but no seasonality (AIC: 18,503). By comparing the various forms of error, trend, and seasonality, our ETS model that yielded the smallest AIC was one with additive errors, no trend, and additive seasonality (AIC: 18,474).

The MAPE and the 95% CI coverage (Table 2 and Figure 2) were calculated from predictions on admissions in 2023 made from the models using various training history windows ranging from one year (2022) to seven years (2016 – 2022). ARIMA achieved the lowest MAPE (46.5%) using only 4 years of training data, followed by the Ensemble model (46.7%) with 4 years of training data. The ETS model reached its lowest MAPE (47.2%) with 7 years of training data. The Prophet model obtained its lowest MAPE (48.9%) with 7 years of training data but was the highest of all four models. By 95% CI coverage, the ETS model tied the Ensemble model with the highest coverage (96.4%), but ETS required only 1 year of training data as opposed to 2 years for the Ensemble model. ARIMA also had similar coverage (96.2%) with 1 year of training data, while the Prophet model also had the worse coverage (95.9%). The Prophet model had the highest PPV (0.444) with 4 years of training data, surpassing the highest PPV achieved for the Ensemble (0.278), ARIMA (0.278), and ETS (0.222) models. The Prophet model also had the highest NPV (0.951) with 4 years of training data, but the highest NPV achieved by the Ensemble (0.942), ARIMA (0.942), and ETS (0.939) models were similar. The Ensemble (0.800), ARIMA (0.800), and ETS (0.800) models all achieved the highest sensitivity, but the Ensemble model only required 2 years of training data – ARIMA and ETS both required 6 years of training data for similar performance. The optimal sensitivity obtained by the Prophet model (0.680) was lower but achieved the highest optimal specificity (0.741) with 6 years of training data. Optimal specificity for the ETS (0.682), Ensemble (0.656), and ARIMA (0.571) models were lower, but the Ensemble model only needed 1 year of training data to obtain this. The Ensemble model also had the highest AUC (0.740) achieved, followed by the Prophet (0.732), ETS (0.711), and ARIMA (0.709) models. The Ensemble model performed the best from among ARIMA, ETS, and Prophet models with 3 years of training data.

Figure 2. Comparison of model class prediction accuracy on 2023 asthma hospitalization admissions by length of training history. Median absolute percentage error is compared for varying sizes of training history, from 1 year (2022) to 7 years (2016 – 2022).

Table 2. Validation of 2023 hospitalizations by training period – median absolute percentage error (MAPE), 95% confidence interval coverage percentage (Cover %), positive predictive value (PPV), negative predictive value (NPV), sensitivity (Sens), specificity (Spec), area under the receiver operating characteristic curve (ROC) with 95% confidence interval (AUC + 95%CI)

We compared the MAPE, coverage percentage, PPV, NPV, sensitivity, specificity, and AUC from these models with two years of validation data (2022 – 2023) as opposed to one (Table 3). The MAPE from the ARIMA (48.6%) and Ensemble (48.7%) models were similar, with the ETS (49.8%) and Prophet (53.0%) models having noticeably higher MAPE, though the 95% CI coverage were all identical. Prophet achieved a lower PPV, NPV, and sensitivity (PPV: 0.139; NPV: 0.941; Sens: 0.600) than either ARIMA (PPV: 0.306; NPV: 0.950; Sens: 0.760), ETS (PPV: 0.306; NPV: 0.950; Sens: 0.640), and Ensemble (PPV: 0.306; NPV: 0.950; Sens: 0.720) models. Prophet had the highest specificity and AUC (Spec: 0.738; AUC: 0.730), followed by the ARIMA (Spec: 0.521; AUC: 0.683), ETS (0.659; AUC: 0.704), and Ensemble (Spec: 0.579; AUC: 0.726) models. We also compared the performance of predictions from these models before and during the COVID-19 pandemic. While we did see some slight differences in prediction accuracy, we did not see any substantial performance shifts on periods before and during the pandemic among the model classes.

Table 3. Validation of hospitalization predictions on two-year validation (Trained: 2016-2021; validated: 2022 – 2023), before COVID-19 (Trained: 2016 – 2018; validated: 2019) and during COVID-19 (Trained: 2020-2022; validated: 2023). – median absolute percentage error (MAPE), 95% confidence interval coverage percentage (Cover %), positive predictive value (PPV), negative predictive value (NPV), sensitivity (Sens), specificity (Spec), area under the receiver operating characteristic curve (ROC) with 95% confidence interval (AUC + 95%CI)

Discussion

We developed an optimal time series model to forecast future daily pediatric hospital admissions for asthma at CCHMC. We produced four candidate models utilizing both classical and modern methods to time series forecasting and then evaluated the models to determine an optimal forecasting procedure. We implemented a 5% high-risk threshold on our best ARIMA, ETS, Prophet, and Ensemble models to determine how they were able to forecast when a high number of hospital admissions for asthma occurred. We found that an Ensemble model combining the predictions of the ARIMA, ETS, and Prophet models with 3 years of training data made the most accurate forecasts. We believe the Ensemble model performs better because the higher prediction accuracy of two of the three models outweighs any suboptimal prediction accuracy of any one model.

We were interested in the performance of a newer forecasting tool (Prophet) on asthma hospitalizations that borrowed ideas from modern machine learning algorithms, in addition to experimenting with traditional time series models (ARIMA and ETS). Prophet recognized seasonal patterns in our asthma hospitalization series, but this required many years of training history, as we saw poor high-risk prediction performance from Prophet using only one or two years of training data. The main advantage of using Prophet was its straightforward setup – utilizing the Generalized Additive Model approach with limited preprocessing. There were no candidate Prophet models to compare, and no postprocessing that must be checked. Aside from setting a few parameters during setup, getting predictions from the Prophet model was straightforward and appeared reliable at first. When we cross-validated the predictions made by Prophet and compared them to traditional time series models ARIMA and ETS, the prediction accuracy was not as good. ARIMA and ETS models can quickly react to sudden shifts in recent daily asthma hospitalizations. Other authors have noted similar findings to what we experienced – Prophet was quick and easy to use, but was considerably less accurate [Reference Menculini, Marini and Proietti18,Reference Chan40,Reference Yenidoğan, Çayir, Kozan, Dağ and Arslan41]. We were able to validate their claims when we applied the models to our daily asthma hospitalization series. Trying various ARIMA and ETS models and identifying the “best” candidate model from their respective class yields better prediction accuracy, and the simplicity and ease-of-use in deploying Prophet may not be worth the diminished prediction accuracy. None of these models were able to predict a sudden spike in asthma hospital admissions beforehand. Prediction of spontaneous surges in asthma hospitalizations could enable improved preparedness for the sudden influx of patients entering hospitals. It could also allow for the initiation of preventative actions for those patients at high risk of experiencing an exacerbation. A forecasting algorithm that could predict spikes in asthma hospitalizations before they happen would be of great use in population care management, aiming to prevent asthma-related symptoms before they occur.

The hospital-governed operational definition for an asthma-related admission was established and maintained by CCHMC’s internal data depot, using the ICD codes as described above. There are several publications on ICD-code inaccuracy and validity with asthma hospitalizations [Reference Sanders, Gregg and Aronsky42Reference Seol, Wi, Ryu, King, Divekar and Juhn44], and we acknowledge that this definition may have missed certain admissions (or counted admissions that may not have been related to asthma). While we have made every effort to screen the proper ICD codes to include only patients admitted to the hospital primarily for asthma-related issues and remove admissions that arise due to another related health issue, there was no way we could verify the accuracy and validity with the ICD codes for asthma. Still, we do not believe biases introduced because of our admission definition would have been differential to our results.

The asthma admissions data we used to construct our models came from CCHMC’s two main campuses in metropolitan Cincinnati, Ohio. Our admissions data, our models, and our conclusions can only be made for our area and will not be generalizable to other regions. However, the idea of curating asthma hospital admissions data, as well as constructing, calibrating, and validating time series models, could possibly be extended to other hospitals in other regions, with the goal of improving clinician staffing and readiness in their hospitals for their respective areas.

The model classes that were chosen were suitable for their ability to be continuously updated as new admissions numbers were included in their training history. It is possible that model performance could change as new data is introduced into the training history, so we wanted to see how updating each model every day would impact prediction performance. As more training history was introduced, these models did not adjust their predictions as much to sudden shifts in admissions numbers. We could incorporate machine learning methods, such as LSTM, that fully exploit all the data and can remember features of the series that happened previously but would require data preprocessing to implement. Recent developments of a fast algorithm called Super Learner for constructing an initial set of candidate learners based on V-fold cross-validation techniques could identify better performing candidate models [Reference van der Laan, Polley and Hubbard45,Reference Polley and van der46]. Our models only used retrospective daily asthma hospital admissions numbers to make their predictions. Future work to refine these models will include additional covariates that could be highly influential in forecasting future asthma hospitalizations. We will be incorporating environmental variables (e.g., PM2.5, NO2, O3), air quality index, crime information, school year, and respiratory illnesses (e.g., RSV cases, COVID-19 cases, flu cases) to determine which variables improve prediction accuracy in our models.

Models that forecast future daily asthma hospital admissions are just one tool clinicians can use to improve care for their patients. Such methods can be used to inform anticipatory guidance for at-risk patients, support population care management of a cohort of children with asthma, and plan for when sudden surges in asthma hospitalizations may occur. Future work employing methods such as user-centered design, inclusive of clinicians, patients, and families, could optimize the use for such technology [Reference Clay, Peerenboom and Connors47]. Future efforts will utilize data recorded at CCHMC to build research tools and dashboards to identify asthma patients who may be at high risk for an exacerbation, to allow the clinician to intervene, and to ultimately prevent increases in hospitalizations. Equipping clinical teams with crucial information they need to target patients at high risk at the appropriate time can increase preventative treatment and improve health care outcomes for asthma patients.

Our optimal forecasting algorithm is the first step of producing tools that will better meet the needs of healthcare professionals and promote mitigation of symptoms in equitable ways. We hope to contribute to the development and implementation of time series forecasting algorithms in asthma preparedness for both patients and clinicians. These technologies can assist with preparation abilities, alert necessary caregivers of anticipated increased asthma hospitalizations, and optimize staff and resources to maximize care to their patients.

Data availability

The authors do not have permission to share the data.

Author contributions

Stephen Colegate: Conceptualization, Formal analysis, Methodology, Software, Writing-original draft, Writing-review and editing; Michael Seid: Conceptualization, Funding acquisition, Methodology, Writing-original draft, Writing-review and editing; David Hartley: Conceptualization, Methodology, Writing-original draft, Writing-review and editing; Aaron Flicker: Data curation, Methodology, Software, Writing-original draft, Writing-review and editing; Joseph Bruce: Conceptualization, Data curation, Methodology, Software; Joseph Michael: Conceptualization, Data curation, Methodology, Software, Writing-review and editing; Mfonobong Udoko: Conceptualization, Writing-review and editing; Andrew Beck: Conceptualization, Funding acquisition, Methodology, Writing-original draft, Writing-review and editing; Cole Brokamp: Conceptualization, Funding acquisition, Methodology, Writing-original draft, Writing-review and editing.

Funding statement

The work was supported by the Cincinnati Children’s Research Foundation’s Academic & Research Committee, the Agency for Healthcare Research and Quality [grant number R01HS027996 to A.B.], the U.S. National Library of Medicine [grant number R01LM013222 to C.B.], the Cincinnati Children’s Hospital Medical Center Academic and Research Council [no grant number to M.S.], and the National Center for Advancing Translational Sciences of the National Institutes of Health [grant number UL1TR001425 to S.C.].

Competing interests

None declared.

Ethics approval

Although this study uses data derived from humans, it is not considered human subjects research. Our study does not involve contacting patients directly. We have used deidentified data not collected specifically for this study. Data was analyzed as aggregated counts per day and individual-level data was not used by any investigators for this study.

References

Ryu, AJ, Romero-Brufau, S, Shahraki, N, Zhang, J, Qian, R, Kingsley, TC. Practical development and operationalization of a 12-hour hospital census prediction algorithm. J Am Med Inform Assoc JAMIA. 2021;28:19771981. doi: 10.1093/jamia/ocab089.CrossRefGoogle ScholarPubMed
McBride, SC, McCarty, K, Wong, J, Baskin, M, Currier, D, Chiang, VW. A pediatric hospital-wide asthma severity score: reliability and effectiveness. Pediatr Pulmonol. 2022;57:12231228. doi: 10.1002/ppul.25861.CrossRefGoogle ScholarPubMed
Haktanir Abul, M, Phipatanakul, W. Severe asthma in children: evaluation and management. Allergol Int Off J Jpn Soc Allergol. 2019;68:150157. doi: 10.1016/j.alit.2018.11.007.CrossRefGoogle ScholarPubMed
Fergeson, JE, Patel, SS, Lockey, RF. Acute asthma, prognosis, and treatment. J Allergy Clin Immunol. 2017;139:438447. doi: 10.1016/j.jaci.2016.06.054.CrossRefGoogle ScholarPubMed
Montalbano, L, Ferrante, G, Montella, S, et al. Relationship between quality of life and behavioural disorders in children with persistent asthma: a multiple indicators multiple causes (MIMIC) model. Sci Rep. 2020;10:6957. doi: 10.1038/s41598-020-62264-9.CrossRefGoogle ScholarPubMed
Perry, R, Braileanu, G, Palmer, T, Stevens, P. The Economic Burden of Pediatric Asthma in the United States: literature review of current evidence. PharmacoEconomics. 2019;37:155167. doi: 10.1007/s40273-018-0726-2.CrossRefGoogle ScholarPubMed
Ramsahai, JM, Hansbro, PM, Wark, PAB. Mechanisms and management of asthma exacerbations. Am J Respir Crit Care Med. 2019;199:423432. doi: 10.1164/rccm.201810-1931CI.CrossRefGoogle ScholarPubMed
Nguyen, HM, Turk, PJ, McWilliams, AD. Forecasting COVID-19 hospital census: a multivariate time-series model based on local infection incidence. JMIR Public Health Surveill. 2021;7:e28195. doi: 10.2196/28195.CrossRefGoogle ScholarPubMed
Fleming, L. Asthma exacerbation prediction: recent insights. Curr Opin Allergy Clin Immunol. 2018;18:117123. doi: 10.1097/ACI.0000000000000428.CrossRefGoogle ScholarPubMed
Martin, MJ, Beasley, R, Harrison, TW. Towards a personalised treatment approach for asthma attacks. Thorax. 2020;75:11191129. doi: 10.1136/thoraxjnl-2020-214692.CrossRefGoogle ScholarPubMed
Beck, AF, Seid, M, McDowell, KM, et al. Building a regional pediatric asthma learning health system in support of optimal, equitable outcomes. Learn Health Syst. 2024;8:e10403. doi: 10.1002/lrh2.10403.CrossRefGoogle ScholarPubMed
Perone, G. Comparison of ARIMA, ETS, NNAR, TBATS and hybrid models to forecast the second wave of COVID-19 hospitalizations in Italy. Eur J Health Econ HEPAC Health Econ Prev Care. 2022;23:917940. doi: 10.1007/s10198-021-01347-4.CrossRefGoogle ScholarPubMed
Capan, M, Hoover, S, Jackson, EV, Paul, D, Locke, R. Time series analysis for forecasting hospital census: application to the neonatal intensive care unit. Appl Clin Inform. 2016;7:275289. doi: 10.4338/ACI-2015-09-RA-0127.Google ScholarPubMed
Koestler, DC, Ombao, H, Bender, J. Ensemble-based methods for forecasting census in hospital units. BMC Med Res Methodol. 2013;13:67. doi: 10.1186/1471-2288-13-67.CrossRefGoogle ScholarPubMed
Hochreiter, S, Schmidhuber, J. Long short-term memory. Neural Comput. 1997;9:17351780. doi: 10.1162/neco.1997.9.8.1735.CrossRefGoogle ScholarPubMed
Taylor, SJ, Letham, B. Forecasting at scale. Am Stat. 2018;72:3745. doi: 10.1080/00031305.2017.1380080.CrossRefGoogle Scholar
Toharudin, T, Pontoh, RS, Caraka, RE, Zahroh, S, Lee, Y, Chen, RC. Employing long short-term memory and facebook prophet model in air temperature forecasting. Commun Stat - Simul Comput. 2023;52:279290. doi: 10.1080/03610918.2020.1854302.CrossRefGoogle Scholar
Menculini, L, Marini, A, Proietti, M, et al. Comparing Prophet and deep learning to ARIMA in forecasting wholesale food prices. Forecasting. 2021;3:644662. doi: 10.3390/forecast3030040.CrossRefGoogle Scholar
Jones, R, Turner, B, Perera, P, Hiscock, H, Chen, K. Understanding caregiver perspectives on challenges and solutions to pediatric asthma care for children with a previous hospital admission: a multi-site qualitative study. J Asthma Off J Assoc Care Asthma. 2022;59:19731980. doi: 10.1080/02770903.2021.1984528.Google ScholarPubMed
Li, Z, Liu, F, Yang, W, Peng, S, Zhou, J. A survey of convolutional neural networks: analysis, applications, and prospects. IEEE Trans Neural Netw Learn Syst. 2022;33:69997019. doi: 10.1109/TNNLS.2021.3084827.CrossRefGoogle ScholarPubMed
Organization WH ICD-10 : international statistical classification of diseases and related health problems : tenth revision. 2004; World Health Organization. (https://iris.who.int/handle/10665/42980) Accessed June 19, 2025.Google Scholar
Mandrekar, JN. Receiver operating characteristic curve in diagnostic test assessment. J Thorac Oncol. 2010;5:13151316. doi: 10.1097/JTO.0b013e3181ec173d.CrossRefGoogle ScholarPubMed
Robin, X, Turck, N, Hainard, A et al. pROC: an open-source package for R and S+ to analyze and compare ROC curves. BMC Bioinformatics. 2011;12::77. doi: 10.1186/1471-2105-12-77.CrossRefGoogle Scholar
Hyndman, R, Athanasopoulos, G. Forecasting: Principles and Practice. 3rd ed. OTexts; 2021. (https://otexts.com/fpp3/) Accessed February 11, 2025.Google Scholar
Kwiatkowski, D, Phillips, PCB, Schmidt, P, Shin, Y. Testing the null hypothesis of stationarity against the alternative of a unit root: how sure are we that economic time series have a unit root? J Econom. 1992;54:159178. doi: 10.1016/0304-4076(92)90104-Y.CrossRefGoogle Scholar
Schwarz, G. Estimating the dimension of a model. Ann Stat. 1978;6:461464. doi: 10.1214/aos/1176344136.CrossRefGoogle Scholar
Box, GEP, Pierce, DA. Distribution of residual autocorrelations in Autoregressive-integrated moving average time series models. J Am Stat Assoc. 1970;65:15091526. doi: 10.1080/01621459.1970.10481180.CrossRefGoogle Scholar
Holt, CC. Forecasting seasonals and trends by exponentially weighted moving averages. Int J Forecast. 2004;20(1):510. doi: 10.1016/j.ijforecast.2003.09.015.CrossRefGoogle Scholar
Winters, PR. Forecasting sales by exponentially weighted moving averages. Manag Sci. 1960;6:324342.10.1287/mnsc.6.3.324CrossRefGoogle Scholar
Akaike, H. A new look at the statistical model identification. IEEE Trans Autom Control. 1974;19:716723. doi: 10.1109/TAC.1974.1100705.CrossRefGoogle Scholar
O’Hara-Wild, M, Hyndman, R, Wang, E. Forecasting models for tidy time series. (https://cran.r-project.org/web/packages/fable/index.html) Accessed February 11, 2025.Google Scholar
O’Hara-Wild, M, Hyndman, R, Wang, E, Cook, D. features) TT (Correlation, method) LC (Guerrero’s. feasts: Feature Extraction and Statistics for Time Series. Published online September 25, 2024. (https://cran.r-project.org/web/packages/feasts/index.html) Accessed February 11, 2025.Google Scholar
Taylor, S, Letham, B. Automatic forecasting procedure. Published online March 30, 2021. (https://cran.r-project.org/web/packages/prophet/index.html) Accessed February 11, 2025.Google Scholar
Aggarwal, AN, Agarwal, R, Dhooria, S, Prasad, KT, Sehgal, IS, Muthu, V. Impact of asthma on severity and outcomes in COVID-19. Respir Care. 2021;66:19121923. doi: 10.4187/respcare.09113.CrossRefGoogle ScholarPubMed
Sunjaya, AP, Allida, SM, Di Tanna, GL, Jenkins, CR. Asthma and COVID-19 risk: a systematic review and meta-analysis. Eur Respir J. 2022;59:2101209. doi: 10.1183/13993003.01209-2021.CrossRefGoogle Scholar
Liu, S, Zhi, Y, Ying, S. COVID-19 and asthma: reflection during the pandemic. Clin Rev Allergy Immunol. 2020;59:7888. doi: 10.1007/s12016-020-08797-3.CrossRefGoogle ScholarPubMed
Chatziparasidis, G, Kantar, A. COVID-19 in children with asthma. Lung. 2021;199:712. doi: 10.1007/s00408-021-00419-9.CrossRefGoogle ScholarPubMed
WHO Director-General’s opening remarks at the media briefing on COVID-19 - (WHO Director-General’s opening remarks at the media briefing on COVID-19 - 11 March 2020. (https://www.who.int/director-general/speeches/detail/who-director-general-s-opening-remarks-at-the-media-briefing-on-covid-19---11-march-2020) Accessed February 11, 2025.Google Scholar
Hyndman, R, Koehler, AB, Ord, JK, Snyder, RD. Forecasting with Exponential Smoothing: The State Space Approach. Berlin/Heidelberg, Germany: Springer Science & Business Media, 2008.10.1007/978-3-540-71918-2CrossRefGoogle Scholar
Chan, WN. Time series data mining: comparative study of ARIMA and Prophet methods for forecasting closing prices of Myanmar stock exchange. J Comput Appl Res. 2020;1:7580.Google Scholar
Yenidoğan, I, Çayir, A, Kozan, O, Dağ, T, Arslan, Ç. Bitcoin forecasting using ARIMA and PROPHET. IEEE. 2018; 621624.Google Scholar
Sanders, DL, Gregg, W, Aronsky, D. Identifying asthma exacerbations in a pediatric emergency department: a feasibility study. Int J Med Inf. 2007;76:557564. doi: 10.1016/j.ijmedinf.2006.03.003.CrossRefGoogle Scholar
Cooke, CR, Joo, MJ, Anderson, SM, et al. The validity of using ICD-9 codes and pharmacy records to identify patients with chronic obstructive pulmonary disease. BMC Health Serv Res. 2011;11:37. doi: 10.1186/1472-6963-11-37.CrossRefGoogle ScholarPubMed
Seol, HY, Wi, CI, Ryu, E, King, KS, Divekar, RD, Juhn, YJ. A diagnostic codes-based algorithm improves accuracy for identification of childhood asthma in archival data sets. J Asthma Off J Assoc Care Asthma. 2021;58:10771086. doi: 10.1080/02770903.2020.1759624.Google ScholarPubMed
van der Laan, MJ, Polley, EC, Hubbard, AE. Super learner. Stat Appl Genet Mol Biol. 2007;6:Article25. doi: 10.2202/1544-6115.1309.CrossRefGoogle ScholarPubMed
Polley, E, van der, L M. Super learner in prediction. UC berkeley div biostat work pap ser. Published online May 3, 2010. (https://biostats.bepress.com/ucbbiostat/paper266) Accessed April 17, 2025.Google Scholar
Clay, I, Peerenboom, N, Connors, DE, et al. Reverse engineering of digital measures: inviting patients to the conversation. Digit Biomark. 2023;7:2844. doi: 10.1159/000530413.CrossRefGoogle ScholarPubMed
Figure 0

Figure 1. Daily asthma hospitalizations at Cincinnati Children’s Hospital Medical Center (CCHMC) between 2016 and 2023 on a log1p-scale. Dates are shaded whether the number of admissions was lower (blue), higher (red), or normal (white) relative to the median number of admissions.

Figure 1

Table 1. Asthma hospitalization cases and percentages (%) at Cincinnati Children’s Hospital Medical Center (CCHMC) between 2016 and 2023. Admissions are segmented by period for the training and validation sets. The number of cases and the frequency of cases per day are stated for each period

Figure 2

Figure 2. Comparison of model class prediction accuracy on 2023 asthma hospitalization admissions by length of training history. Median absolute percentage error is compared for varying sizes of training history, from 1 year (2022) to 7 years (2016 – 2022).

Figure 3

Table 2. Validation of 2023 hospitalizations by training period – median absolute percentage error (MAPE), 95% confidence interval coverage percentage (Cover %), positive predictive value (PPV), negative predictive value (NPV), sensitivity (Sens), specificity (Spec), area under the receiver operating characteristic curve (ROC) with 95% confidence interval (AUC + 95%CI)

Figure 4

Table 3. Validation of hospitalization predictions on two-year validation (Trained: 2016-2021; validated: 2022 – 2023), before COVID-19 (Trained: 2016 – 2018; validated: 2019) and during COVID-19 (Trained: 2020-2022; validated: 2023). – median absolute percentage error (MAPE), 95% confidence interval coverage percentage (Cover %), positive predictive value (PPV), negative predictive value (NPV), sensitivity (Sens), specificity (Spec), area under the receiver operating characteristic curve (ROC) with 95% confidence interval (AUC + 95%CI)