Monitoring telehealth vomiting calls as a potential public health early warning system for seasonal norovirus activity in Ontario, Canada

Norovirus is a predominant cause of infectious gastroenteritis in countries worldwide [1–5]. It accounts for approximately 50% of acute gastroenteritis (AGE) and >90% of viral gastroenteritis outbreaks [6, 7]. The incubation period ranges between 10 and 48 h and illness duration is generally 1–3 days with self-limiting symptoms; however, this duration is often longer (e.g. 4–6 days) in vulnerable populations such as hospital patients or young children [2, 8]. Symptomatic infection of norovirus presents as acute vomiting, diarrhoea, abdominal cramps and nausea, with severe vomiting and diarrhoea (non-bloody) being most common [2, 5, 9].

Norovirus activity occurs with predictable seasonality. Although the specific months vary, activity peaks during cold weather winter months in temperate climates, similar to other common respiratory viruses (e.g. influenza and rhinovirus) [2,10]. Due to its predictable winter seasonality, norovirus is often referred to as 'winter vomiting disease' [11,12]. However, outbreaks can occur throughout the year, especially in years when a new norovirus variant has emerged [13].
Traditional surveillance of norovirus activity plays an essential role in the identification of seasonal trends, disease burden in the population and the extent of geographical spread. This traditional surveillance is reliant upon datasets such as laboratory submissions and outbreak case reporting to carry out these tasks. However, due to the self-limiting nature and short duration of norovirus illness, public health surveillance is limited by under-reporting and subsequent time lags [14][15][16][17]. Hall et al. demonstrated that only 15% of individuals affected with AGE illness seek medical intervention, and of those, diagnostic testing samples were only requested from 13% [14]. Other studies have described the absence of routine surveillance and frequent under-reporting, which affect the overall ability to monitor disease activity [15,18]. The time lags associated with traditional laboratory surveillance methods have long been recognised as a limitation for disease surveillance due to their delayed effects on public health interventions. A study performed by Ashford et al. demonstrated across a number of public health scenarios that traditional reporting techniques had a delay of 0-26 days between the index case and when the problem was identified through surveillance [17]. Such lags include the time taken for general practitioners to submit samples to the laboratory, and for the specific laboratory tests to be performed and reported [17]. These delays reduce the timeliness of disease monitoring and affect the ability to implement effective public health interventions.
This paper presents a proposal for an early warning syndromic surveillance system using telehealth syndromic surveillance call data to detect winter norovirus activity in Ontario, Canada. The use of real-time surveillance for monitoring norovirus activity could greatly improve the ability of public health authorities to monitor illness, identify seasonal increases and thus implement timelier interventions to reduce subsequent impact.

Methods
This study was conducted using data from the province of Ontario, Canada. It is the largest province in the country with a population of roughly 14.1 million residents [19].

Syndromic surveillance telehealth call data
Telehealth Ontario call data were obtained from the Ontario Ministry of Health and Long-Term Care and Sykes Assistance Services Corporation for the period 17 June 2011-29 August 2015. Telehealth Ontario is a nurse-led service available to all residents of Ontario, 24 h a day, 7 days a week. The system provides Ontarians with immediate health information and advice for a variety of medical concerns, and where appropriate provides triage using Schmitt-Thompson guidelines to direct them to the most appropriate health care service, e.g. a general practitioner appointment or an immediate visit to an emergency department [20,21]. It is estimated that Telehealth Ontario receives approximately 75-80 000 calls per month [22,23]. The telehealth data were received as daily call counts and aggregated into weekly counts for sufficient sample sizes in analyses. All AGE calls were requested which included the chief complaints: vomiting, diarrhoea, vomiting and diarrhoea, vomiting without diarrhoea (subsequently coded as 'vomiting'), abdominal pain and vomiting with blood. For the purposes of this study, the 'vomiting', 'diarrhoea' and 'vomiting and diarrhoea' chief complaints were included in analyses (subsequently referred to as the 'relevant chief complaints').

Norovirus laboratory reports
Norovirus laboratory data were obtained from Public Health Ontario for the period 17 June 2011-30 September 2014. Laboratory data were defined as stool samples obtained from residents of Ontario with suspected norovirus, which were submitted to Public Health Ontario by a physician for confirmatory testing as part of routine patient management protocols. Samples were tested for a variety of gastrointestinal (GI) pathogens using polymerase chain reaction, electron microscopy or immunochromatographic testing; the presence or absence of norovirus in the sample was reported, however no genotype information was available. The data were received as daily norovirus test results and aggregated into weekly counts to enable sufficient sample sizes in analyses, reduce the impact of under-reporting during weekends and holidays, and to account for the availability of laboratory data which are often available on a weekly basis. For the purposes of this study, the collection dates (the date of sample collection from infected patients), and only samples which tested positive for the presence of norovirus were utilised in analyses.

Negative binomial regression analyses
Negative binomial regression analyses were performed to compare the telehealth and laboratory datasets. The telehealth data were used as the outcome variable and the laboratory data as a predictor. Negative binomial regressions were selected due to the fact count data were analysed.
Both the telehealth and laboratory datasets were analysed over the period 17 June 2011-30 September 2014 (week 24, 2011-week 4 02 014). A 'public holiday' variable was added to both datasets to indicate the occurrence of an Ontario-wide statutory holiday; this was included because previous studies have shown the effects public holidays have on call volumes and the necessity to treat holidays differently than normal days [24]. A 'date' variable was also added to the datasets to indicate the start of the week analysed; this was included as a trend variable to assess long-term patterns in the data.
The laboratory dataset was used at time 0 and shifted 1, 2 and 3 weeks forward to construct lag variables (t 0 , t 1 , t 2 and t 3 , respectively) to observe whether a statistically significant time lag existed between the datasets. Prior to performing this study, an assumption was made that the telehealth data might precede the laboratory data by a number of weeks [23,25,26].
Sets of four negative binomial regression models were constructed with the original laboratory data (t 0 ), and the t 1 , t 2 and t 3 lag variables as predictor variables. These predictors were regressed against weekly counts of calls to telehealth using the following parameter combinations: all-ages and all relevant chief complaints ('index parameter model'), all-ages and vomiting-only chief complaints ('vomiting' and 'vomiting and diarrhoea' calls only), 0-4 ages and all relevant chief complaints, ⩾65 ages and all relevant chief complaints, 0-4 ages and vomiting-only chief complaints, and ⩾65 ages and vomiting-only chief complaints. These various combinations were tested to observe whether the index parameter model was the most sensitive to early warning, or if a combination of indicators were most sensitive. The specific age groups selected were tested as they are known to be disproportionately affected by norovirus and other gastroenteritis infections [5,9,27], and the vomiting-only chief complaints were tested because vomiting is the primary symptom of norovirus.
Included in each of these models were the public holiday and date variables. The (McFadden's) pseudo-R 2 values of these models were compared with each other to determine the bestfitting model; however, it should be noted that pseudo-R 2 values should not be interpreted the same as R 2 for normal data.

Early warning system threshold calculations
The telehealth data were plotted from 19 June 2011 to 29 August 2015 (week 25, 2011-week 34,2015). A downward trend in call volumes was observed during this time period; therefore, a linear trendline was inserted to calculate the overall slope. This slope (−0.46) was subsequently used to standardise the overall baseline of the data, setting the very first week as the referent.
The periods of increased laboratory reports of norovirus were observed to be regular, within weeks 44-24 (Oct/Nov-June) and were therefore defined as the active season; weeks 25-43 (June-Oct) were defined as out-of-season. Using these weeks as approximate measures for the winter norovirus season, the average weekly call volume for the out-of-season periods was calculated (referred to as the baseline). Using this baseline, the standard deviation (S.D.) was then calculated and modelled incrementally (up to 4S.D.) above the overall average for the out-of-season call volumes for the period of 19 June 2011 to 29 August 2015. This was done using Shewhart (control) chart method [28], a method which displays the mean of the statistic on a graph and models S.D. of the statistic above and/or below the mean to indicate a threshold of data variation.
All threshold calculations were executed using Microsoft Office (Excel) 2016.

Negative binomial regression analyses
The negative binomial regression models demonstrated that the all-ages, vomiting-only chief complaint t 0 , t 1 , t 2 and t 3 models were the best-fitting of all the parameter combinations tested. This parameter combination produced the highest pseudo-R 2 values when compared with all other parameter combinations (Table 1). In this combination, the t 0 and t 1 variables were not significant (P > 0.05); however, the t 2 and t 3 lag variables were significant (P = 0.004, 0.003, respectively) ( Table 2). Although both the t 2 and t 3 lag variables were significant, the fit of the t 2 model (pseudo-R 2 = 0.1861) was higher than that of the t 3 model (pseudo-R 2 = 0.1852) ( Table 1). A visual comparison/overlay of the telehealth and laboratory datasets over the timeframe used in the regression analyses (17 June 2011-30 September 2014) is displayed in Figure 1, and the full model results are described in Table 2.
The public holiday variable was not significant in any of the t 0 , t 1 , t 2 and t 3 all-ages, vomiting-only chief complaint models (P ⩾ 0.05). However, due to its practical significance and importance in syndromic models demonstrated by other research groups worldwide, it was kept in the final models. The date variable was observed to be significant in all of the t 0 , t 1 , t 2 and t 3 all-ages, vomiting-only chief complaint models (P < 0.05), and was therefore included in the final models.

Early warning threshold calculations
Trend-adjusted call volumes were plotted and visualised (Fig. 2). S.D. were calculated on the baseline (average number of weekly calls recorded during the out-of-season time period) of vomiting-only calls. Each S.D. level of vomiting-only calls was set as an early warning threshold across each of the four winter norovirus seasons analysed. Call volumes were observed noting the week during which each S.D. threshold was breached, and the peak week of activity during each season (Table 3). Where weekly vomiting call volumes breached the S.D. level of vomiting-only calls to telehealth for 3 consecutive weeks, this was noted as a winter norovirus season early warning alarm.
The presence of false alarms was assessed during the out-ofseason weeks. Applying the recommended control chart standard of using the mean + 3S.D. to devise an early warning threshold [28], This is in comparison to the eight false alarms that would have been triggered using the 2S.D. level of vomiting-only calls, which has also been recommended by other researchers [29].

Main findings
This study provides an initial validation that there is a statistical association between vomiting calls made to a telehealth system in Ontario and norovirus-positive laboratory test results. Regression analysis provided evidence of a 2-week lag of telehealth vomiting calls ahead of norovirus-positive laboratory reports, thus revealing potential to provide early warning of seasonal norovirus activity by monitoring telehealth vomiting calls. A series of early warning thresholds were developed based upon previously published methods and tested for the triggering of alerts in the study period.
Telehealth calls with vomiting as the chief complaint in all ages demonstrated the timeliest prediction of the winter norovirus season, thus were the focus of this analysis. This timely indication over all other telehealth parameters assessed is likely due to vomiting being the most predominant symptom of the illness in adults [5,9]. These results were comparable to those found by Loveridge et al., who demonstrated that vomiting-only calls in the <5 age group gave the most sensitive prediction of the winter norovirus season [25]. A significant 2-week prediction of telehealth calls ahead of norovirus laboratory reports was also observed, emphasizing the value of using community-based surveillance systems that monitor initial presentation of symptoms at a population level, rather than subsequent patient-level laboratory reports. This also highlights the importance of integrating syndromic surveillance systems with traditional systems for enhanced disease surveillance. Loveridge et al. found an early warning of 6-11 weeks in the <5 ages; however, were able to calculate AGE calls as a proportion of total calls to telehealth and adjust the system to the most sensitive indicators [25]. The highest value in each row is underlined. Bolded text indicates the favoured model deduced from analyses. a Laboratory data used in negative binomial models was regressed at time 0 (t 0 ), and shifted 1 (t 1 ), 2 (t 2 ) and 3 (t 3 ) weeks in the future, separately against telehealth calls. b Vomiting-only calls encompass 'vomiting' and 'vomiting and diarrhoea' telehealth chief complaints. c Vomiting and diarrhoea calls encompass 'vomiting', 'diarrhoea' and 'vomiting and diarrhoea' telehealth chief complaints.
A set of winter norovirus early warning alarm thresholds were devised for the four winter periods analysed, indicating the week during which an early warning alert would be appropriate. Following analyses, the recommended control chart standard of the mean + 3S.D. was found to be ideal for this system. This is because analyses demonstrated that the 3S.D. level would have only triggered five false alarms and would not have significantly delayed public health action to mitigate winter norovirus activity. Selecting a threshold with a shorter time delay would have incorporated more false alarms (2S.D. level), and a threshold with fewer false alarms would have delayed public health action (4S.D. level). Therefore, this recommendation was most appropriate at capturing the winter norovirus season without setting off a high number of false alerts in the out-of-season periods, and without significant time delay. This 3S.D. level agrees with several studies in the fields of syndromic surveillance and/or norovirus, which have followed this metric to accurately identify variation within sample statistics [28,[30][31][32]. However, using the 2S.D. in addition to the 3S.D. threshold could also prove to be useful as a possible additional early alert signal to warn public health authorities that an alarm may be triggered the following week. This would allow earlier public health action to be considered even before an official alert is triggered.
While this telehealth early warning system demonstrated the strongest association with a 2-week prediction of the norovirus laboratory reports, additional time lags must be considered. The average time delay between sample collection and laboratory reporting was 2.6 days (Hughes et al., unpublished data). However, it is important to note that this was based upon analysis of retrospective data; prospectively, these data would not necessarily be tested or reported in a timely manner, with results often batched and/or reported at a later date, extending the delay to reporting. Therefore, prospective reporting of laboratory data always needs to incorporate these additional reporting delays, and further highlights the importance of having real-time data available for use in surveillance systems to detect disease activity as early as possible.
The observed decrease in the telehealth weekly vomiting calls was evident over the entire period analysed in this study. Such decreases may be due to the overall decrease in telehealth utilisation in Ontario and increased usage of the Internet for at-home health care advice. A high percentage of ill patients in countries worldwide are now utilizing the Internet as their first point of medical care [33,34]; a study performed by the Pew Research Center in 2012 found that 31% of all cell phone owners utilised their devices for health-related advice (in comparison to 17% in 2010) by means of Internet searches, texting services and/or health applications [35]. These statistics, in combination with the decreases observed in telehealth call numbers as far back as 2009, may explain the general downward trend observed in the data in this study. To verify whether this decrease was limited just to the gastroenteritis reasons for call, or affected all telehealth calls, additional calculations comparing both the telehealth gastroenteritis call volumes and total call volumes (all reasons for calls) during the same timeframe would be required. However, these data were not available for this study.

Strengths and limitations
This study successfully utilised two existing datasets to provide an understanding of seasonal norovirus activity in the community. The advantage of using telehealth data was the improved  representativeness of community activity. Laboratory data are frequently under-representative of disease burden and are biased, with sampling predominantly occurring in patients from higher risk or vulnerable groups and those from institutionalised outbreaks, particularly from hospital or long-term care settings [23,36]. Telehealth Ontario provides a service to all Ontarians, and better captures those patients who do not present to other healthcare services. In addition, this study was able to highlight that telehealth data were able to provide early warning in comparison with the laboratory data. There were several limitations associated with the datasets used in this study. The underlying reasons why physicians requested stool samples, or what pathogens/tests were requested were unknown; these decisions were made as part of the routine patient care pathway, but not made available as part of the study dataset. This may have had some effect on the validity of the data. Furthermore, it was only possible to undertake a comparative analysis across three norovirus seasons due to a lack of data from 2015 in the laboratory dataset. The lack of sufficient daily data in both datasets prevented analysis at a more granular daily level, as laboratory submissions and telehealth call counts were aggregated to weekly counts due to low daily counts. Additionally, the overall decreasing trend of weekly vomiting calls to telehealth over the study period made the early warning  threshold calculations challenging. Adding a slope adjustment to calculate an adjusted baseline was required; however, if adopted as a prospective system, further decreasing call volumes would need to be continually monitored and thresholds adjusted accordingly. Finally, the telehealth data monitored the presentation of patients' reported symptoms and were not a confirmation of norovirus illness. The laboratory data were used to determine the sensitivity of vomiting calls to norovirus activity; however, it is still important to note that the telehealth data can only be used as a proxy for norovirus activity and will also be affected by seasonal increases in a range of other GI pathogens.

Implications for public health practice
It is recommended that the early warning system proposed in this study be further validated with subsequent implementation and routine use in the province of Ontario, Canada. Adopting syndromic surveillance alongside existing laboratory surveillance would improve monitoring of disease and allow for timelier tracking of norovirus activity. This work also highlights the utility of telehealth call data to detect the onset of the winter norovirus season as early as possible, thus potentially facilitating health care professionals to mitigate the impact by implementing infection control measures, such as increased hand hygiene and patient isolation. Ward closures may also be initiated in high-risk settings (e.g. hospitals, long-term care homes) following the breaching of early warning thresholds [37]. Additionally, the development of an early warning system can also be implemented out-ofseason to warn of potential unusual activity, which in the case of norovirus may be caused by the introduction of new genotype variants [13,38].

Future directions
If adopted into routine public health practice, the norovirus surveillance and early warning system developed in this study should be updated prospectively with daily data, should sufficient daily counts be available in Ontario, to provide the timeliest surveillance information. In addition, as telehealth data become more commonly utilised in public health practice, it will become more feasible to develop similar, well-adapted telehealth syndromic surveillance systems for the detection of a wider range of public health issues, including other communicable diseases (e.g. influenza) and environmental impacts (e.g. heatwaves). These systems could be adopted by other provinces across Canada which employ similar telehealth systems, such as the Tele-Care telehealth service in New Brunswick, and Health Link in Alberta [39,40]. Both of these examples offer 24 h-a-day, 7days-a-week nurse triaging services, similar to Telehealth Ontario, and could therefore be utilised in a provincial telehealth syndromic surveillance system.
Additional sources of syndromic data could be used to complement telehealth data, such as pharmacy purchases, Internet search queries, school absenteeism and general practitioner diagnoses, and integrated into existing laboratory and outbreak surveillance methods [26,41]. In doing so, data gaps and delays would be reduced and more effective public health actions performed to protect Ontarians.