
Application of a long short-term memory neural network: a burgeoning method of deep learning in forecasting HIV incidence in Guangxi, China

Published online by Cambridge University Press: 09 May 2019

G. Wang
Affiliation:
Guangxi Collaborative Innovation Center for Biomedicine, Life Science Institute, Guangxi Medical University, Nanning 530021, Guangxi, China
W. Wei
Affiliation:
Guangxi Key Laboratory of AIDS Prevention and Treatment & Guangxi Universities Key Laboratory of Prevention and Control of Highly Prevalent Disease, School of Public Health, Guangxi Medical University, Nanning 530021, Guangxi, China
J. Jiang
Affiliation:
Guangxi Key Laboratory of AIDS Prevention and Treatment & Guangxi Universities Key Laboratory of Prevention and Control of Highly Prevalent Disease, School of Public Health, Guangxi Medical University, Nanning 530021, Guangxi, China
C. Ning
Affiliation:
Guangxi Collaborative Innovation Center for Biomedicine, Life Science Institute, Guangxi Medical University, Nanning 530021, Guangxi, China
H. Chen
Affiliation:
Geriatrics Digestion Department of Internal Medicine, The First Affiliated Hospital of Guangxi Medical University, Nanning 530021, Guangxi, China
J. Huang
Affiliation:
Guangxi Key Laboratory of AIDS Prevention and Treatment & Guangxi Universities Key Laboratory of Prevention and Control of Highly Prevalent Disease, School of Public Health, Guangxi Medical University, Nanning 530021, Guangxi, China
B. Liang
Affiliation:
Guangxi Key Laboratory of AIDS Prevention and Treatment & Guangxi Universities Key Laboratory of Prevention and Control of Highly Prevalent Disease, School of Public Health, Guangxi Medical University, Nanning 530021, Guangxi, China
N. Zang
Affiliation:
Guangxi Collaborative Innovation Center for Biomedicine, Life Science Institute, Guangxi Medical University, Nanning 530021, Guangxi, China Guangxi Key Laboratory of AIDS Prevention and Treatment & Guangxi Universities Key Laboratory of Prevention and Control of Highly Prevalent Disease, School of Public Health, Guangxi Medical University, Nanning 530021, Guangxi, China
Y. Liao
Affiliation:
Guangxi Collaborative Innovation Center for Biomedicine, Life Science Institute, Guangxi Medical University, Nanning 530021, Guangxi, China
R. Chen
Affiliation:
Guangxi Collaborative Innovation Center for Biomedicine, Life Science Institute, Guangxi Medical University, Nanning 530021, Guangxi, China
J. Lai
Affiliation:
Guangxi Collaborative Innovation Center for Biomedicine, Life Science Institute, Guangxi Medical University, Nanning 530021, Guangxi, China
O. Zhou
Affiliation:
Guangxi Collaborative Innovation Center for Biomedicine, Life Science Institute, Guangxi Medical University, Nanning 530021, Guangxi, China
J. Han
Affiliation:
Guangxi Collaborative Innovation Center for Biomedicine, Life Science Institute, Guangxi Medical University, Nanning 530021, Guangxi, China
H. Liang
Affiliation:
Guangxi Collaborative Innovation Center for Biomedicine, Life Science Institute, Guangxi Medical University, Nanning 530021, Guangxi, China Guangxi Key Laboratory of AIDS Prevention and Treatment & Guangxi Universities Key Laboratory of Prevention and Control of Highly Prevalent Disease, School of Public Health, Guangxi Medical University, Nanning 530021, Guangxi, China
L. Ye*
Affiliation:
Guangxi Key Laboratory of AIDS Prevention and Treatment & Guangxi Universities Key Laboratory of Prevention and Control of Highly Prevalent Disease, School of Public Health, Guangxi Medical University, Nanning 530021, Guangxi, China
*Author for correspondence: L. Ye, E-mail: yeli@gxmu.edu.cn

Abstract

Guangxi, a province in southwestern China, has the second highest reported number of HIV/AIDS cases in China. This study aimed to develop an accurate and effective model to describe the trend of HIV and to predict its incidence in Guangxi. HIV incidence data for Guangxi from 2005 to 2016 were obtained from the database of the Chinese Center for Disease Control and Prevention. Long short-term memory (LSTM) neural network models, autoregressive integrated moving average (ARIMA) models, generalised regression neural network (GRNN) models and exponential smoothing (ES) models were used to fit the incidence data. Data from 2015 and 2016 were used to validate the most suitable models. Model performance was assessed with four metrics: mean square error (MSE), root mean square error (RMSE), mean absolute error (MAE) and mean absolute percentage error (MAPE). The LSTM model had the lowest MSE when the N value (time step) was 12. The most appropriate ARIMA models for incidence in 2015 and 2016 were ARIMA $(1, 1, 2)(0, 1, 2)_{12}$ and ARIMA $(2, 1, 0)(1, 1, 2)_{12}$, respectively. The accuracy of the GRNN and ES models in forecasting HIV incidence in Guangxi was relatively poor. All four performance metrics of the LSTM model were lower than those of the ARIMA, GRNN and ES models. The LSTM model was more effective than the other time-series models and is important for the monitoring and control of local HIV epidemics.
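
The four evaluation metrics named in the abstract (mean square error, root mean square error, mean absolute error and mean absolute percentage error) can be illustrated with a minimal sketch. This is not the authors' code: the function name `forecast_errors` and the example numbers are hypothetical, and the definitions are the conventional textbook ones, which the page itself does not spell out.

```python
import math


def forecast_errors(actual, predicted):
    """Return MSE, RMSE, MAE and MAPE (in percent) for two equal-length sequences.

    Hypothetical helper using the conventional definitions of each metric.
    """
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("sequences must be non-empty and of equal length")

    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]

    mse = sum(e * e for e in errors) / n       # mean of squared errors
    rmse = math.sqrt(mse)                      # same units as the data
    mae = sum(abs(e) for e in errors) / n      # mean of absolute errors
    # MAPE divides by the actual value, so it is undefined if any actual is 0.
    mape = 100.0 * sum(abs(e / a) for e, a in zip(errors, actual)) / n

    return {"MSE": mse, "RMSE": rmse, "MAE": mae, "MAPE": mape}


# Made-up values, for illustration only (not data from the study).
scores = forecast_errors([2.0, 4.0], [1.0, 5.0])
print(scores)
```

Lower values indicate a closer fit for all four metrics, which is the sense in which the abstract ranks the LSTM model above the others.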

Information

Type
Original Paper
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution, and reproduction in any medium, provided the original work is properly cited.
Copyright
Copyright © The Author(s) 2019

Fig. 1. Diagram of the LSTM neural network unit. The input gate ($i_t$) determines which information needs to be updated in the unit state; the forget gate ($f_t$) controls which information is discarded from the unit state. The input gate and a candidate vector ($\widetilde{c}_t$) created by tanh together determine which new information is stored, updating the old unit state into the new unit state ($c_t$). Finally, the cell state information is filtered by the output gate ($o_t$) to update the hidden state ($h_t$), which is the output of the LSTM cell.
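
The gate interactions described in the caption correspond to the commonly used LSTM cell equations, written out here as a sketch. The weight matrices $W$, the biases $b$, the input $x_t$, the logistic sigmoid $\sigma$ and the element-wise product $\odot$ are notational assumptions taken from the standard formulation, not symbols defined on this page.

```latex
\begin{aligned}
f_t &= \sigma\left(W_f\,[h_{t-1}, x_t] + b_f\right) && \text{forget gate} \\
i_t &= \sigma\left(W_i\,[h_{t-1}, x_t] + b_i\right) && \text{input gate} \\
\widetilde{c}_t &= \tanh\left(W_c\,[h_{t-1}, x_t] + b_c\right) && \text{candidate state} \\
c_t &= f_t \odot c_{t-1} + i_t \odot \widetilde{c}_t && \text{new unit state} \\
o_t &= \sigma\left(W_o\,[h_{t-1}, x_t] + b_o\right) && \text{output gate} \\
h_t &= o_t \odot \tanh(c_t) && \text{hidden state (cell output)}
\end{aligned}
```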

Fig. 2. Monthly incidence of HIV in Guangxi, China (from January 2005 to December 2015). The trend component shows that HIV incidence follows a seasonal pattern (s = 12). From 2005 to 2011, HIV incidence in Guangxi increased slowly; from 2011 to 2016, it declined slowly with seasonal fluctuation.

Fig. 3. MSE of LSTM models with different N values for HIV incidence in 2015 and 2016. MSE, mean square error; N, the number of inputs to the LSTM model. The yellow line shows the MSE for each N value in 2015, and the purple line shows the same for 2016. The model had the minimum MSE in both 2015 and 2016 when N was 12.
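
One common way to give a recurrent model N inputs is to slide a window of N consecutive observations along the series and use the next observation as the target. The sketch below illustrates that idea only; the page does not describe how the authors assembled their inputs, so the function name `make_windows` and the windowing scheme are assumptions.

```python
def make_windows(series, n_steps):
    """Split a series into (inputs, targets) using a sliding window.

    Hypothetical helper: each input holds n_steps consecutive values and
    the matching target is the value that immediately follows them.
    """
    if n_steps < 1 or n_steps >= len(series):
        raise ValueError("n_steps must be at least 1 and shorter than the series")

    inputs, targets = [], []
    for start in range(len(series) - n_steps):
        inputs.append(list(series[start:start + n_steps]))
        targets.append(series[start + n_steps])
    return inputs, targets


# Made-up monthly values, for illustration only (not data from the study).
monthly = list(range(1, 15))             # 14 consecutive "months"
X, y = make_windows(monthly, n_steps=12)
print(len(X), y)
```

With N = 12 on monthly data, each window spans one full year, which matches the seasonal period (s = 12) reported for the incidence series.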

Fig. 4. Forecasting curves of the optimal LSTM model and the other models, together with the actual HIV incidence series. LSTM, long short-term memory neural network model; ARIMA, autoregressive integrated moving average model. The black line shows the actual data; the blue, red, yellow and green dashed lines show the values predicted by the LSTM, ARIMA, SES and GRNN models, respectively. Compared with those of ARIMA, SES and GRNN, the predicted values of LSTM were closer to the actual values.

Table 1. Performance of LSTM and the other models in 2015 and 2016

Supplementary material: File

Wang et al. supplementary material
