
Identification of COVID-19 can be quicker through artificial intelligence framework using a mobile phone–based survey when cities and towns are under quarantine

Published online by Cambridge University Press:  03 March 2020

Arni S. R. Srinivasa Rao*
Affiliation:
Division of Health Economics and Modeling, Department of Population Health Sciences, Medical College of Georgia, Augusta University, Augusta, Georgia; Laboratory for Theory and Mathematical Modeling, Division of Infectious Diseases, Department of Medicine, Medical College of Georgia, Augusta, Georgia; Department of Mathematics, Augusta University, Augusta, Georgia
Jose A. Vazquez
Affiliation:
Division of Infectious Diseases, Department of Medicine, Medical College of Georgia, Augusta University, Augusta, Georgia
*Author for correspondence: Arni S. R. Srinivasa Rao, E-mail: arrao@augusta.edu

Abstract

We propose the use of a machine learning algorithm to improve possible COVID-19 case identification more quickly using a mobile phone–based web survey. This method could reduce the spread of the virus in susceptible populations under quarantine.

Type
Commentary
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution, and reproduction in any medium, provided the original work is properly cited.
Copyright
© 2020 by The Society for Healthcare Epidemiology of America. All rights reserved.

Emerging and novel pathogens are a significant problem for global public health. This is especially true for viral diseases that are readily transmissible and have asymptomatic infectivity periods. The novel coronavirus (SARS-CoV-2), described in December 2019 as the cause of COVID-19, has resulted in major quarantines to prevent further spread, including in major cities, villages, and public areas throughout China and across the globe.[1–3] As of February 25, 2020, the World Health Organization's situational data indicate ∼77,780 confirmed cases in 25 countries, including 2,666 deaths due to COVID-19.[4] Most deaths reported so far have been in China.[5] The Centers for Disease Control and Prevention (CDC) and the World Health Organization have issued interim guidelines to protect the population and to attempt to prevent the further spread of the SARS-CoV-2 virus from infected individuals.[6] Cities and villages throughout China are unable to accommodate such large numbers of infected individuals while maintaining the quarantine, and several new hospitals have been built to manage the infected individuals.[7] It is imperative that we evaluate novel models to attempt to control the rapidly spreading SARS-CoV-2.[8] Technology can assist in faster identification of possible cases to yield more timely interventions.

To reduce the time needed to identify and rapidly isolate a person under investigation (PUI) for COVID-19, we propose to collect a basic travel history along with the more common signs and symptoms using a mobile phone–based online survey. Such data can be used in the preliminary screening and early identification of possible COVID-19 cases. Thousands of data points can be processed through an artificial intelligence (AI) framework that can evaluate individuals and stratify them into no-risk, minimal-risk, moderate-risk, and high-risk groups. The high-risk cases identified can then be quarantined earlier, thus decreasing the chance of spreading the virus (Table 1).

Table 1. Steps involved in the collection of data through a mobile phone-based survey

Appendix 1 (online) lists the details of the steps involved in collecting data from all respondents independent of whether or not they think they are infected. The AI algorithm described in Appendix 2 (online) can identify possible cases and send an alert to the nearest health clinic as well as to the respondent for an immediate health visit. We call this an “alert for health check recommendation for COVID-19.” If the respondent is unable to commute to the health center, the health department can send an alert to a mobile health unit to conduct a door-to-door assessment and even test for the virus. If a respondent does not have an immediate risk of symptoms or signs related to the viral infection, then an AI-based health alert can be sent to the respondent to notify them that there is no current risk of COVID-19. Figure 1 summarizes the outcomes of data collection and identification of possible cases.

Fig. 1. Conceptual framework of data collection and possible COVID-19 identification. (a) A geographical region (eg, a city, county, town, or village) with households in it. (b) Respondents and nonrespondents of a mobile phone–based web survey. (c) Possible identified cases of COVID-19 among the survey respondents and possible cases of COVID-19 among nonrespondents of the survey.

Fig. 2. Number of possible cases identified through artificial intelligence (AI) framework versus the number of individuals who responded to a mobile phone–based web survey.

The signs and symptoms data recorded in step 5 of the algorithm are collected prior to Health Check Recommended for Coronavirus (HCRC) alerts or Mobile Health Check Recommended for Coronavirus (MHCRC) alerts (for possible identification and assessment) and No Health Check Recommended for Coronavirus (NCRC) alerts (for nonidentified respondents). These procedures are explained in steps 3 and 4 in Appendix 2. The extended analysis we propose can help determine any association among sociodemographic variables and the signs and symptoms, such as fever and lower respiratory infection including cough and shortness of breath, in individuals with and without possible infection. A 2 × 2 table of the number of COVID-19 cases identified through AI and the number of people who responded to a mobile survey is described in Figure 2.

Applications of AI and deep learning can be useful tools in assisting diagnoses and decision making in treatment.[10,11] Several studies have promoted disease detection through AI models.[12–15] The use of mobile phones[16–19] and web-based portals[20,21] has been tested successfully in health-related data collection. In addition, our proposed algorithm can be easily extended to identify individuals who might have any mild symptoms and signs. However, such techniques must be applied in a timely way for relevant and rapid results. Apart from cost-effectiveness, our proposed modeling method could greatly assist in identifying and controlling COVID-19 in populations under quarantine due to the spread of SARS-CoV-2.

Acknowledgments

We thank Professor N.V. Joshi, Indian Institute of Science, Bengaluru, and Mr P. Sashank, CEO Exaactco Compusoft Global Solutions, Hyderabad, India, for their editorial comments.

Financial support

No financial support was provided relevant to this article.

Conflicts of interest

All authors report no conflicts of interest relevant to this article.

Author contributions

ASRSR designed the study, developed the methods, and wrote the first draft of the paper. JAV contributed clinical wording, inputs, and edits to the draft.

Appendix 1. Steps Involved in Data Collection Through Mobile Phones

We have developed our data collection criteria based on the CDC's Flowchart to Identify and Assess 2019 Novel Coronavirus,[9] and we have added variables for the extended utility of our efforts in identifying infected individuals and controlling the spread (see Table 1 in the text).

Appendix 2. Algorithm

Let $O_1, O_2, O_3, O_4, O_5$ be the outputs recorded during data collection steps 1 through 5 described in Appendix 1. The 3 outputs within $O_2$ are given as

$$O_2 = \left\{ O_{2G}, O_{2A}, O_{2R} \right\},$$

and the 9 pairs of outputs within $O_5$ are given as

$$O_5 = \left\{ \begin{array}{l} (O_{5A}, D_{5A}),\ (O_{5B}, D_{5B}),\ (O_{5C}, D_{5C}),\ (O_{5D}, D_{5D}), \\ (O_{5E}, D_{5E}),\ (O_{5F}, D_{5F}),\ (O_{5G}, D_{5G}),\ (O_{5H}, D_{5H}), \\ (O_{5I}, D_{5I}) \end{array} \right\},$$

where the pair $(O_{5i}, D_{5i})$ for $i = A, B, \ldots, I$ represents the respondent's response regarding the presence or absence of the $i$th sign or symptom ($O_{5i}$) and the duration of the corresponding sign or symptom ($D_{5i}$).
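One possible way to hold the 9 pairs in $O_5$ in code is sketched below. This is only an illustration, not part of the proposed method: the names `SignReport`, `make_o5`, and `duration_days` are hypothetical, and the unit of duration (days) is an assumption because the text does not state one.

```python
from dataclasses import dataclass
from typing import Dict, Optional

# Labels A..I for the 9 signs and symptoms, as indexed in O_5.
SIGN_LABELS = "ABCDEFGHI"


@dataclass(frozen=True)
class SignReport:
    """One pair (O_5i, D_5i): presence flag and reported duration."""
    present: int                            # O_5i: 1 = present, 0 = absent
    duration_days: Optional[float] = None   # D_5i; unit (days) is an assumption


def make_o5(reports: Dict[str, SignReport]) -> Dict[str, SignReport]:
    """Validate that exactly the 9 labelled pairs are supplied and return them in order."""
    if set(reports) != set(SIGN_LABELS):
        raise ValueError("O_5 must contain exactly the labels A through I")
    for label, report in reports.items():
        if report.present not in (0, 1):
            raise ValueError(f"O_5{label} must be 0 or 1")
    return {label: reports[label] for label in SIGN_LABELS}


# Example: only sign A is reported, lasting 3 (assumed) days.
example = {label: SignReport(0) for label in SIGN_LABELS}
example["A"] = SignReport(1, 3.0)
o5 = make_o5(example)
```

Keeping the presence flag and the duration together in one record mirrors the pairing $(O_{5i}, D_{5i})$ in the definition above.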

(1) If the set of identifiers $I_1$, for

$$I_1 = \left\{ O_3, O_{5A}, O_{5B}, O_{5C} \right\},$$

is equal to one of the elements of the set $C_1$, for

$$C_1 = \left\{ (1,1,1,1),\ (1,1,1,0),\ (1,1,0,0),\ (1,0,1,1),\ (1,0,0,1),\ (1,0,1,0),\ (1,1,0,1) \right\},$$

for a respondent, then send HCRC or MHCRC. If $I_1$ is not equal to any of the elements of the set $C_1$, then proceed to test criterion (3).

(2) If the set of identifiers $I_2$, for

$$I_2 = \left\{ O_4, O_{5A}, O_{5B}, O_{5C} \right\},$$

is equal to one of the elements of the set $C_1$, then send HCRC or MHCRC to that respondent; otherwise, proceed to test criterion (4).

(3) If $I_1$ is equal to one of the elements of the set $C_2$, for

$$C_2 = \left\{ (0,1,1,1),\ (0,1,1,0),\ (0,1,0,0),\ (0,0,1,1),\ (0,0,0,1),\ (0,0,1,0),\ (0,1,0,1) \right\},$$

then the respondent will be sent an NCRC alert.

(4) If $I_2$ is equal to one of the elements of the set $C_2$, then the respondent will be sent an NCRC alert.
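The four test criteria above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `classify` and the `can_commute` flag are hypothetical, the choice between HCRC and MHCRC is assumed to depend only on whether the respondent can commute to a health center, an HCRC or MHCRC result is assumed to take precedence when criteria (1) and (2) disagree with (3) and (4), and `None` is returned for response patterns that fall in neither $C_1$ nor $C_2$, which the criteria do not address.

```python
from itertools import product

# C_1: leading identifier is 1 and at least one of O_5A, O_5B, O_5C is 1.
C1 = {(1, a, b, c) for a, b, c in product((0, 1), repeat=3) if a or b or c}
# C_2: leading identifier is 0 and at least one of O_5A, O_5B, O_5C is 1.
C2 = {(0, a, b, c) for a, b, c in product((0, 1), repeat=3) if a or b or c}


def classify(o3, o4, o5a, o5b, o5c, can_commute=True):
    """Return 'HCRC', 'MHCRC', 'NCRC', or None for one respondent.

    can_commute is a hypothetical flag used only to pick HCRC vs MHCRC.
    """
    i1 = (o3, o5a, o5b, o5c)   # identifiers for criteria (1) and (3)
    i2 = (o4, o5a, o5b, o5c)   # identifiers for criteria (2) and (4)

    # Criteria (1) and (2): a match with C_1 triggers a health check alert.
    if i1 in C1 or i2 in C1:
        return "HCRC" if can_commute else "MHCRC"

    # Criteria (3) and (4): a match with C_2 triggers an NCRC alert.
    if i1 in C2 or i2 in C2:
        return "NCRC"

    # Pattern not covered by C_1 or C_2 (for example, no signs reported).
    return None


alert = classify(o3=1, o4=0, o5a=1, o5b=0, o5c=0)
```

Both sets are generated rather than typed out, so each contains exactly the 7 tuples listed in the criteria.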

A comparison of test criteria results of (3) and (4) with their corresponding geographic and sociodemographic details will yield further investigations of signs and symptoms based on whether or not an individual in the survey has traveled to coronavirus-affected areas or has had contact with any person who is known to have COVID-19. Here, we focus only on the identification of cases; further analysis techniques are beyond our scope. However, our approach is flexible enough to capture various other associations within the populations.

Appendix 3. Further Computations on the Data Collected

Suppose $n$ and $m$ are the numbers of individuals in a region who have and have not responded, respectively, to a mobile phone–based online survey. Responses are assumed to occur at random and not to depend on sickness due to the virus. The pair

$$\left( \frac{n}{n + m}, \frac{m}{n + m} \right)$$

yields the proportions of those who have responded and not responded in that region. Notably, we can compute $\frac{m}{n + m}$ because the value $m$ is known to us in that region. Here, $n_1$ of $n$ are possible cases identified through our algorithm, and $m_1$ of $m$ are possible cases of the virus that were not identified by the algorithm because the $m$ individuals never responded to the survey. Because $n$ and $m$ are known to us, one of the following relations will hold:

(A2.1) $$\left\{ n > m,\ n = m,\ n < m \right\}$$

Thus, we will see which of the relations listed in (A2.1) is true. When $n > m$, one of the following relations will hold:

(A2.2) $$\left\{ \frac{n_1}{n} > \frac{m_1}{m},\ \frac{n_1}{n} = \frac{m_1}{m},\ \frac{n_1}{n} < \frac{m_1}{m} \right\}$$

However, we will never know which of the relations in (A2.2) is true because the $m_1$ cases were never identified by the algorithm. For example, suppose 2,000 individuals respond to the survey, 500 individuals do not respond, and 400 of the respondents are identified as possible cases by the algorithm. If there are 100 possible cases of virus (which we do not have a mechanism to count) among the 500 who never responded, then the relation

$$\frac{n_1}{n} = \frac{m_1}{m}$$

is true. Similarly, other relations of (A2.2) could arise when $n > m$. Using a similar argument, we can verify that when other relations of (A2.1) are true, we are still unsure which of the relations in (A2.2) is true. The 2 × 2 contingency options are provided in Figure 2 (in the text) to visualize the data to be generated using the proposed method.
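The arithmetic in this example can be checked with a short Python sketch. The function names `response_proportions` and `compare_case_rates` are hypothetical and introduced only for illustration. The value of $m_1$ is passed in as a supposed figure, since in practice it cannot be observed.

```python
from fractions import Fraction


def response_proportions(n, m):
    """Return (n/(n+m), m/(n+m)) as exact fractions."""
    total = n + m
    return Fraction(n, total), Fraction(m, total)


def compare_case_rates(n1, n, m1, m):
    """Return '>', '=', or '<' for the relation between n1/n and m1/m."""
    responded_rate = Fraction(n1, n)
    nonresponded_rate = Fraction(m1, m)
    if responded_rate > nonresponded_rate:
        return ">"
    if responded_rate < nonresponded_rate:
        return "<"
    return "="


# Figures from the example: 2,000 respondents, 500 nonrespondents,
# 400 identified cases, and a supposed 100 unidentified cases.
proportions = response_proportions(2000, 500)
relation = compare_case_rates(400, 2000, 100, 500)
```

With these figures the response proportions are 4/5 and 1/5, and both case rates equal 1/5, which is the equality case of (A2.2).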

Theorem: Let there be $N$ individuals in a region. The probability that $n_1$ cases are identified through the AI framework, given that $n$ individuals responded to the survey, is $\frac{n_1 N}{n^2}$.

Proof: Let $N = n + m$, and let

$$U = \left\{ u_1, u_2, \ldots, u_n \right\}$$

be the collection of $n$ individuals who responded, and let

$$V = \left\{ v_1, v_2, \ldots, v_m \right\}$$

be the collection of $m$ individuals who did not respond. Suppose

$$U_1 = \left\{ u_{a_1}, u_{a_2}, \ldots, u_{a_{n_1}} \right\} \subset U$$

is the collection of respondents who are identified as possible cases. Here $U \cup V$ can be considered the region shown in (a), $U$ as shown in (b), and $U_1$ as shown in (c) of Figure 1 (in the text).

Suppose we define 2 events $E_1$ and $E$ using the sets $U$, $V$, and $U_1$ as follows:

$E_1$: $n_1$ of the $n$ respondents are identified as possible cases through the algorithm.

$E$: $n$ of $N$ have responded to the survey.

The conditional probability of the event $E_1$ given the event $E$, say, $P(E_1/E)$, is computed as follows:

$$P(E_1/E) = \frac{P(E_1 \cap E)}{P(E)} = \frac{\frac{n_1}{n}}{\frac{n}{N}} = \frac{n_1 N}{n^2}.$$
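As a purely arithmetic illustration, the expression in the theorem can be evaluated with the figures from the worked example in this appendix ($n = 2{,}000$ respondents, $m = 500$ nonrespondents, $n_1 = 400$ identified cases), so that $N = n + m = 2{,}500$. Reusing those figures here is an assumption made only for illustration:

$$P(E_1/E) = \frac{n_1 N}{n^2} = \frac{400 \times 2500}{2000^2} = \frac{1{,}000{,}000}{4{,}000{,}000} = 0.25.$$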

References

1. More Chinese cities shut down as novel coronavirus death toll rises. Channel News Asia website. https://www.channelnewsasia.com/news/asia/wuhan-coronavirus-more-china-cities-shut-hangzhou-zhejiang-hubei-12395706. Published February 5, 2020. Accessed February 10, 2020.
2. Weinland, D, Yu, S. Chinese villages build barricades to keep coronavirus at bay. The Financial Times website. https://www.ft.com/content/68792b9c-476e-11ea-aeb3-955839e06441. Published February 7, 2020. Accessed February 10, 2020.
3. Transit going in and out of Wuhan, China, is being shut down to contain coronavirus. Business Insider website. https://www.businessinsider.com/transit-wuhan-china-shut-down-coronavirus-2020-1. Published January 25, 2020. Accessed February 10, 2020.
4. The WHO COVID-19 situation report 36. World Health Organization website. https://www.who.int/docs/default-source/coronaviruse/situation-reports/20200225-sitrep-36-covid-19.pdf?sfvrsn=2791b4e0_2. Published February 25, 2020. Accessed February 26, 2020.
5. Coronavirus explained: All your questions about COVID-19 answered. CNET website. https://www.cnet.com/how-to/coronavirus-explained-all-your-questions-about-covid-19-answered/. Updated March 24, 2020. Accessed March 27, 2020.
6. Preventing the spread of coronavirus disease 2019 in homes and residential communities. Centers for Disease Control and Prevention website. https://www.cdc.gov/coronavirus/2019-ncov/hcp/guidance-prevent-spread.html. Updated March 6, 2020. Accessed March 27, 2020.
7. Wang, J, Zhu, E, Umlauf, T. How China built two coronavirus hospitals in just over a week. The Wall Street Journal website. https://www.wsj.com/articles/how-china-can-build-a-coronavirus-hospital-in-10-days-11580397751. Published February 6, 2020. Accessed February 10, 2020.
8. Expert: better models, algorithms could help predict and prevent virus spread. The Augusta Chronicle website. https://www.augustachronicle.com/news/20200128/expert-better-models-algorithms-could-help-predict-and-prevent-virus-spread. Published January 28, 2020. Accessed February 11, 2020.
9. Flowchart to identify and assess 2019 novel coronavirus. Centers for Disease Control and Prevention website. https://www.cdc.gov/coronavirus/2019-ncov/hcp/clinical-criteria.html?CDC_AA_refVal=https%3A%2F%2Fwww.cdc.gov%2Fcoronavirus%2F2019-ncov%2Fhcp%2Fidentify-assess-flowchart.html. Updated February 27, 2020. Accessed March 27, 2020.
10. Liang, H, Tsui, BY, Ni, H, et al. Evaluation and accurate diagnoses of pediatric diseases using artificial intelligence. Nat Med 2019;25:433–438.
11. Rao, ASRS, Diamond, MP. Deep learning of Markov model-based machines for determination of better treatment option decisions for infertile women. Reprod Sci 2020;27:763–770.
12. Neill, DB. Using artificial intelligence to improve hospital inpatient care. IEEE Intell Syst 2013;28:92–95. doi:10.1109/MIS.2013.51
13. Rajalakshmi, R, Subashini, R, Anjana, RM, et al. Automated diabetic retinopathy detection in smartphone-based fundus photography using artificial intelligence. Eye 2018;32:1138–1144.
14. Zeinab, A, Roohallah, Al, Mohamad, R, Hossein, M, Ali, AY. Computer-aided decision making for heart disease detection using hybrid neural-network genetic algorithm. Comput Methods Programs Biomed 2017;141:19–26.
15. Kumar, VB, Kumar, SS, Saboo, V. Dermatological disease detection using image processing and machine learning. Third International Conference on Artificial Intelligence and Pattern Recognition (AIPR), Lodz; 2016:1–6.
16. Tomlinson, M, Solomon, W, Singh, Y, et al. The use of mobile phones as a data collection tool: a report from a household survey in South Africa. BMC Med Inform Decision Making 2009;9:51.
17. Ballivian, A, Azevedo, JP, Durbin, W. Using mobile phones for high-frequency data collection. In: Toninelli, D, Pinter, R, de Pedraza, P, eds. Mobile Research Methods: Opportunities and Challenges of Mobile Research Methodologies. London: Ubiquity Press; 2015:21–39.
18. Braun, R, Catalani, C, Wimbush, J, Israelski, D. Community health workers and mobile technology: a systematic review of the literature. PLoS One 2013;8(6):e65772.
19. Bastawrous, A, Armstrong, MJ. Mobile health use in low- and high-income countries: an overview of the peer-reviewed literature. J Roy Soc Med 2013;106:130–142.
20. Paolotti, D, Carnahan, A, Colizza, V, et al. Web-based participatory surveillance of infectious diseases: the Influenzanet participatory surveillance experience. Clin Microbiol Infect 2014;20:17–21.
21. Fabic, MS, Choi, YJ, Bird, S. A systematic review of demographic and health surveys: data availability and utilization for research. Bull World Health Org 2012;90:604–612.