Determination of cut-off cycle threshold values in routine RT–PCR assays to assist differential diagnosis of norovirus in children hospitalized for acute gastroenteritis

SUMMARY Norovirus (NV) is an important cause of acute gastroenteritis in children, but is also frequently detected in asymptomatic children, which complicates the interpretation of NV detection results in both the clinical setting and population prevalence studies. A total of 807 faecal samples from children aged <5 years hospitalized for acute gastroenteritis were collected in Thai Binh, Vietnam, from January 2011 to September 2012. Real-time RT–PCR was used to detect and quantify NV-RNA in clinical samples. A bimodal distribution of cycle threshold (Ct) values was observed in which the lower peak was assumed to represent cases for which NV was the causal agent of diarrhoea, whereas the higher peak was assumed to represent cases involving an alternative pathogen other than NV. Under these assumptions, we applied finite-mixture modelling to estimate a threshold of Ct <21·36 (95% confidence interval 20·29–22·46) to distinguish NV-positive patients for which NV was the likely cause of diarrhoea. We evaluated the validity of the threshold through comparisons with NV antigen ELISA results, and comparisons of Ct values in patients co-infected with rotavirus. We conclude that the use of an appropriate cut-off value in the interpretation of NV real-time RT–PCR results may improve differential diagnosis of enteric infections, and could contribute to improved estimates of the burden of NV disease.


INTRODUCTION
Norovirus (NV) (family Caliciviridae, genus Norovirus) is a major cause of gastrointestinal disease worldwide, and the cause of an estimated 200 000 deaths and 1·1 million hospitalizations in children <5 years of age annually [1,2]. The NV detection rate in diarrhoeal patients varies from as high as 31-48% [3][4][5] to as low as 3-5% [6], suggesting that the burden of NV disease may be highly variable geographically. Similarly, a wide range of NV prevalence (5-48%) has been observed in different regions of Vietnam [7][8][9][10]. While transmission dynamics may indeed vary across regions, differences between studies may also reflect differences in diagnostic methodology. Detection sensitivities depend on many factors such as sample preparation methods, RNA extraction, presence of reverse-transcriptase inhibitors, primer design, amplification chemistries (e.g. enzymes/buffers), virus concentrations in the sample (influenced by the time elapsed from the onset of diarrhoea to sampling, and sample storage conditions), as well as viral factors such as genetic variation in circulating strains. Interpreting NV detection results in both the clinical setting and in population prevalence studies is complicated by the fact that NV is also frequently detected by reverse-transcriptase-polymerase chain reaction (RT-PCR) in asymptomatic individuals [10][11][12][13][14][15][16][17]. Only a few studies have examined and compared the detection of NV between diarrhoeal cases and concurrent non-diarrhoeal controls (reviewed in [18]). In some studies, similar NV detection rates have been observed in both cases and controls, or even higher rates in controls vs. diarrhoea cases [12,19]. NV diarrhoeal patients are known to shed virus for prolonged periods of time following recovery [20].
Shedding of microorganisms without diarrhoeal symptoms is a common phenomenon for many enteric pathogens, such as neonatal rotavirus (RV), bocavirus, NV, Vibrio cholerae O1, enterotoxigenic Escherichia coli, enteropathogenic E. coli, Campylobacter jejuni and Giardia lamblia [21][22][23]. Asymptomatic shedding complicates patient management, studies of disease burden, and monitoring of vaccine trials. Improved methods of inferring a causative role of NV would be a welcome contribution to transmission studies and modelling efforts, with potential applications to clinical diagnostics. Therefore, the objective of this study was to identify a cut-off value for viral load of NV in diarrhoea samples to infer a causative role in diarrhoea.
We started from the observation that distributions of NV Ct values in diarrhoeal cases are typically bimodal [19]. We hypothesized that the lower Ct value peak corresponds to cases for which the cause of the diarrhoea is NV, and the higher Ct value peak corresponds to cases for which another co-infecting agent is the principal cause of diarrhoea. We modelled the NV Ct value bimodal distribution with a finite-mixture model, which allowed us to identify a Ct threshold value associated with disease risk. We then tested our initial hypothesis using information on RV and NV co-infections as determined by antigen enzyme-linked immunosorbent assay (ELISA).

METHODS
Sample collection
The samples and dataset used in this study were generated from a large prospective, hospital-based diarrhoea study in Thai Binh Paediatric Hospital, Thai Binh, Vietnam. Faecal samples from children aged <5 years hospitalized for acute gastroenteritis were collected, after parental consent had been obtained, from January 2011 to September 2012. Inclusion criteria were diarrhoea ≥3 times per 24 h and admission to hospital within 7 days of onset. The study was approved by the Medical Research Ethical Committee of the National Institute of Hygiene and Epidemiology, Vietnam, and the Internal Review Board of Nagasaki University.

Evaluation of cut-off cycle threshold for NV
The cycle threshold (Ct) value from the real-time RT-PCR was used as a proxy measure of faecal viral load; Ct <40 was considered positive. Ct values are inversely related to viral load on a logarithmic scale; hence lower Ct values correspond to higher viral loads. The same positive control was used throughout all experiments, and its Ct value varied within 0·5. The threshold level of the thermocycler (0·03) remained constant throughout the analysis.
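To make the inverse log-scale relationship concrete, the following minimal sketch converts a difference in Ct between two samples into an approximate fold difference in template. It assumes an idealised amplification efficiency (perfect doubling per cycle by default); the function name and the `efficiency` parameter are illustrative and are not taken from the study.

```python
def fold_difference(ct_a: float, ct_b: float, efficiency: float = 1.0) -> float:
    """Approximate ratio of template in sample A relative to sample B.

    Assumes each PCR cycle multiplies the product by (1 + efficiency);
    efficiency = 1.0 means perfect doubling, which is an idealisation.
    """
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must be in (0, 1]")
    # A lower Ct means more starting template, hence the sign of the exponent.
    return (1.0 + efficiency) ** (ct_b - ct_a)


if __name__ == "__main__":
    # Under perfect doubling, a sample 10 cycles earlier holds about 2**10 times more template.
    print(fold_difference(20.0, 30.0))
```

Under these assumptions a gap of about 3·3 cycles corresponds to roughly a tenfold difference in viral load; real assays usually have efficiencies below 1, so the true ratio would be somewhat smaller.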

Modelling of Ct distributions by finite-mixture models
The typical bimodal distribution of NV Ct values (Figs 1 and 2) was modelled by a finite-mixture model using continuous unimodal distributions from the exponential family [28]. In the absence of prior information on the expected distribution for each peak, we evaluated the normal, log-normal, gamma and Weibull distributions in all possible combinations (4 × 4 = 16). The normal distribution is defined on all the real numbers, whereas the three other distributions (log-normal, Weibull, gamma) are defined on the positive reals. These distributions are all characterized by two parameters: a location parameter μ and a scale parameter σ. The first parameter describes where most of the data lie and corresponds to the mean of the normal distribution, the mean on a log-scale for the log-normal distribution, and the shape parameter for the gamma and Weibull distributions. The second parameter accounts for the spread of the data around the location parameter and corresponds to the standard deviation in the case of the normal distribution, the standard deviation on a log-scale for the log-normal distribution, the rate parameter (1/scale) for the gamma distribution and the scale parameter for the Weibull distribution. We refer to μ₁ and σ₁ for the location and scale parameters of the first peak (i.e. lowest Ct values) and μ₂ and σ₂ for the location and scale parameters of the second peak (i.e. highest Ct values). The density of the bimodal distribution of Ct values thus reads

$$f(C_t) = \lambda\, D_1(C_t \mid \mu_1, \sigma_1) + (1 - \lambda)\, D_2(C_t \mid \mu_2, \sigma_2)$$

where D₁ and D₂ are the distributions accounting for the first (i.e. lowest Ct values) and second (i.e. highest Ct values) peaks, respectively, and λ and (1 - λ) are the weights for the D₁ and D₂ distributions, respectively.
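As an illustration of the two-component density described above, the sketch below evaluates a weighted sum of a log-normal and a Weibull density. The parameter values used in the example call are hypothetical placeholders chosen only to produce two peaks; they are not the study's estimates, and the function names are ours.

```python
import math


def lognormal_pdf(x: float, mu: float, sigma: float) -> float:
    """Log-normal density; mu and sigma are the mean and SD on the log scale."""
    if x <= 0.0:
        return 0.0
    z = (math.log(x) - mu) / sigma
    return math.exp(-0.5 * z * z) / (x * sigma * math.sqrt(2.0 * math.pi))


def weibull_pdf(x: float, shape: float, scale: float) -> float:
    """Weibull density with the given shape and scale parameters."""
    if x <= 0.0:
        return 0.0
    r = x / scale
    return (shape / scale) * r ** (shape - 1.0) * math.exp(-(r ** shape))


def mixture_pdf(x: float, lam: float, d1, d2) -> float:
    """Density of the mixture lam * d1 + (1 - lam) * d2 evaluated at x."""
    return lam * d1(x) + (1.0 - lam) * d2(x)


if __name__ == "__main__":
    # Hypothetical parameters (NOT the fitted values from the study).
    low_peak = lambda x: lognormal_pdf(x, math.log(18.0), 0.15)
    high_peak = lambda x: weibull_pdf(x, 12.0, 33.0)
    for ct in (15.0, 18.0, 25.0, 33.0):
        print(ct, mixture_pdf(ct, 0.74, low_peak, high_peak))
```

Passing the component densities as callables keeps the mixture function agnostic to which of the 16 combinations is being evaluated.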
For each of the 16 combinations of distributions, the parameters λ, μ₁, σ₁, μ₂ and σ₂ were estimated by maximum likelihood (ML) using the expectation-maximization (EM) algorithm [29]. Confidence intervals were calculated as proposed by Oakes [30], and all calculations were done in R [31]. Because all 16 models have the same number of parameters (n = 5), the best combination of distributions for the two peaks was chosen as the one minimizing the negative log-likelihood. From the parameterized distributions D₁ and D₂ of the best model, we computed the probability P of belonging to the low Ct value peak as P = D₁/(D₁ + D₂), and this probability was used to derive a cut-off value separating the two peaks.
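The EM procedure can be sketched for the simplest of the 16 combinations, a mixture of two normal distributions, where both steps have closed forms. This is a simplified stand-in written in Python rather than the R code used in the study: the function names, starting values and simulated data are assumptions for illustration. Components such as the Weibull would need a numerical M-step, and the Oakes confidence intervals are not reproduced here.

```python
import math
import random


def normal_pdf(x, mu, sigma):
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def em_two_normals(data, mu1, sigma1, mu2, sigma2, lam=0.5, n_iter=200):
    """EM for the mixture lam * N(mu1, sigma1) + (1 - lam) * N(mu2, sigma2)."""
    n = len(data)
    for _ in range(n_iter):
        # E-step: probability that each observation belongs to component 1.
        resp = []
        for x in data:
            a = lam * normal_pdf(x, mu1, sigma1)
            b = (1.0 - lam) * normal_pdf(x, mu2, sigma2)
            total = a + b
            resp.append(a / total if total > 0.0 else 0.5)
        # M-step: weighted ML updates of the five parameters.
        n1 = sum(resp)
        n2 = n - n1
        mu1 = sum(r * x for r, x in zip(resp, data)) / n1
        mu2 = sum((1.0 - r) * x for r, x in zip(resp, data)) / n2
        sigma1 = math.sqrt(sum(r * (x - mu1) ** 2 for r, x in zip(resp, data)) / n1)
        sigma2 = math.sqrt(sum((1.0 - r) * (x - mu2) ** 2 for r, x in zip(resp, data)) / n2)
        lam = n1 / n
    return lam, mu1, sigma1, mu2, sigma2


def neg_log_likelihood(data, lam, mu1, sigma1, mu2, sigma2):
    """Minus log-likelihood, the criterion used to compare candidate models."""
    return -sum(
        math.log(lam * normal_pdf(x, mu1, sigma1) + (1.0 - lam) * normal_pdf(x, mu2, sigma2))
        for x in data
    )


if __name__ == "__main__":
    # Simulated Ct-like values (hypothetical, not study data).
    random.seed(0)
    sample = [random.gauss(18.0, 2.0) for _ in range(300)]
    sample += [random.gauss(33.0, 2.0) for _ in range(100)]
    fit = em_two_normals(sample, 15.0, 3.0, 35.0, 3.0)
    print(fit, neg_log_likelihood(sample, *fit))
```

Because every candidate model has five parameters, comparing the minus log-likelihood directly (as in `neg_log_likelihood`) is equivalent to comparing a penalised criterion such as AIC.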

Antigen detection of NV and RV by ELISA
RV antigens were tested on all samples whereas NV antigens were tested in a subset of samples (n = 182). Of these 182 samples, 47 were randomly selected from the NV-negative samples (Ct ≥40) and 135 NV-positive samples were randomly selected to represent a uniform distribution of Ct values <40. NV and RV antigen detection was performed with the NV-AD-III kit (Denka Seiken, Japan) and the Rotaclone™ enzyme immunoassay (Meridian, USA), respectively, according to the manufacturers' instructions. The cut-off for NV and RV positivity was OD 450nm = 0·15; samples within 0·01 OD of the cut-off were repeated to confirm results.
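The ELISA read-out rule above can be written as a small decision function. This is a sketch under two assumptions that the text does not state: a reading exactly at the cut-off is treated as positive, and a reading exactly 0·01 OD from the cut-off falls inside the repeat zone. The function and constant names are illustrative.

```python
OD_CUTOFF = 0.15      # OD at 450 nm, as stated in the text
REPEAT_MARGIN = 0.01  # readings this close to the cut-off are re-tested


def classify_od(od: float) -> str:
    """Return 'repeat', 'positive' or 'negative' for one OD450 reading."""
    if abs(od - OD_CUTOFF) <= REPEAT_MARGIN:
        return "repeat"
    # Assumption: the cut-off itself counts as positive (not specified in the source).
    return "positive" if od >= OD_CUTOFF else "negative"


if __name__ == "__main__":
    for reading in (0.05, 0.145, 0.155, 0.50):
        print(reading, classify_od(reading))
```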

RESULTS
Cut-off Ct value for NV-associated diarrhoea
The 16 finite-mixture models were fitted to the 346 Ct values <40 (represented by the grey histogram in Fig. 1). According to the log-likelihood, the best-fit model comprised a log-normal distribution for D₁ and a Weibull distribution for D₂, with λ = 74% [95% confidence interval (CI) 70-79] for the values belonging to the lower peak of Ct values (Supplementary Table S1, Fig. 1). The constitutive distributions D₁ and D₂ are the left-most and right-most curves, respectively, and the probability P of belonging to the lower peak of Ct values, derived from them, is represented by the green curve. From the latter we infer that a probability P of 50% of belonging to the lower peak of Ct values corresponds to a cut-off Ct of 25·45 (95% CI 25·45-25·45); a probability P of 95% corresponds to a cut-off Ct of 21·36 (95% CI 20·29-22·46); and a probability P of 99% corresponds to a cut-off Ct of 19·01 (95% CI 17·59-20·45). Using the 95% P Ct cut-off (21·36), the adjusted NV detection rate decreased significantly from 43% (95% CI 41-51) to 28% (95% CI 29-38) (Table 1). Thus, using these new criteria, RV and NV were causative agents in 40% and 28% of diarrhoea cases, respectively.
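The step from the membership probability P to a cut-off Ct can be sketched as a root-finding problem: find the Ct at which P equals 50%, 95% or 99%. The example below uses bisection on a log-normal plus Weibull mixture with hypothetical parameters, so the resulting cut-offs will not match the values reported above. It also weights each density by its mixture weight (λ and 1 - λ), which is the usual posterior membership probability; whether D₁ and D₂ in the text already include these weights is not stated, so treat that as an assumption.

```python
import math


def lognormal_pdf(x, mu, sigma):
    z = (math.log(x) - mu) / sigma
    return math.exp(-0.5 * z * z) / (x * sigma * math.sqrt(2.0 * math.pi))


def weibull_pdf(x, shape, scale):
    r = x / scale
    return (shape / scale) * r ** (shape - 1.0) * math.exp(-(r ** shape))


# Hypothetical parameters (NOT the study's estimates).
LAM = 0.74
MU1, SIGMA1 = math.log(18.0), 0.15  # lower peak, log-normal
SHAPE2, SCALE2 = 12.0, 33.0         # higher peak, Weibull


def posterior(ct):
    """Probability that a sample with this Ct belongs to the lower peak."""
    a = LAM * lognormal_pdf(ct, MU1, SIGMA1)
    b = (1.0 - LAM) * weibull_pdf(ct, SHAPE2, SCALE2)
    return a / (a + b)


def cutoff_for_probability(target, lo=18.0, hi=33.0, tol=1e-8):
    """Bisection; assumes posterior() decreases from above target to below it on [lo, hi]."""
    if not posterior(lo) > target > posterior(hi):
        raise ValueError("target is not bracketed by [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if posterior(mid) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


cutoffs = {p: cutoff_for_probability(p) for p in (0.50, 0.95, 0.99)}

if __name__ == "__main__":
    for p, ct in cutoffs.items():
        print(f"P = {p:.2f} -> cut-off Ct = {ct:.2f}")
```

A stricter probability gives a lower cut-off Ct, which is the ordering seen in the reported 50%, 95% and 99% thresholds. The bracket is restricted to the region between the two peaks because the membership probability need not be monotonic outside it.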

DISCUSSION
Several previous studies have evaluated NV viral loads in paediatric populations in order to better understand the dynamics of NV shedding in diarrhoeal cases vs. healthy controls [19,32]. Barreira et al. [19] reported tenfold higher NV RNA copy numbers in diarrhoeal cases compared to controls, and Phillips et al. [32] observed the range of Ct values in children aged <5 years with diarrhoea [interquartile range (IQR) 32-37, median 35] and in age-matched healthy/non-diarrhoeal controls (IQR 34-38, median 37), although these groups overlapped substantially. Phillips et al. used the Youden index and ROC curve analysis of diarrhoeal and healthy control groups to propose a Ct cut-off value for NV gastroenteritis; they suggested Ct = 31 for children aged <5 years, and Ct = 33 for older children and adults. Recently, Elfving et al. presented a study to determine threshold cycle cut-offs for multiple pathogens causing acute diarrhoea [33]. The study proposed cut-off values for Cryptosporidium, Shigella, and ETEC-estA (35, 30, 31, respectively); however, no value could be identified for RV and NV [33].
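For comparison, a case-control cut-off based on the Youden index can be sketched as follows. This is a generic illustration of the index (J = sensitivity + specificity - 1), not a reproduction of the analysis by Phillips et al.: the Ct values are invented, and the rule that a Ct at or below the cut-off counts as a case is an assumption.

```python
def youden_cutoff(case_cts, control_cts):
    """Return (cutoff, J) maximising J = sensitivity + specificity - 1.

    A sample is called NV-attributable when its Ct is <= cutoff
    (assumed rule; lower Ct means higher viral load).
    """
    candidates = sorted(set(case_cts) | set(control_cts))
    best_cutoff, best_j = None, float("-inf")
    for c in candidates:
        sensitivity = sum(ct <= c for ct in case_cts) / len(case_cts)
        specificity = sum(ct > c for ct in control_cts) / len(control_cts)
        j = sensitivity + specificity - 1.0
        if j > best_j:  # ties keep the lowest cut-off
            best_cutoff, best_j = c, j
    return best_cutoff, best_j


if __name__ == "__main__":
    # Hypothetical Ct values for illustration only.
    cases = [18, 20, 22, 25, 27, 36]
    controls = [30, 33, 35, 36, 37, 38]
    print(youden_cutoff(cases, controls))
```

Unlike the finite-mixture approach, this calculation needs Ct values from a control group.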
Here we propose an analytical approach to distinguish diarrhoeal cases in which NV is likely to have played a causative role in disease presentation vs. cases in which an alternative pathogen may be involved. Our method was based on the distribution curves of NV Ct values, and did not require comparison to healthy controls. It was founded on the observations that (i) the distribution of Ct values for NV is generally bimodal, (ii) NV RNA in healthy children is detected at similar or even higher rates than in diarrhoea cases [12,19]; however, (iii) diarrhoea cases shed larger amounts of NV than healthy children [19]. We modelled the bimodal distribution of Ct values with a finite-mixture model, and tested the hypothesis that the lower peak of Ct values corresponded to samples for which the cause of diarrhoea was NV and the higher peak corresponded to samples where other agents were involved. Our working hypothesis was tested by applying our method to a subset of samples with RV co-infections, and by evaluating confirmatory NV diagnoses using NV antigen ELISA. As expected, samples with low NV viral loads had significantly more RV co-infections, whereas those with high NV viral loads were more likely to be RV negative. By contrast, the interval between onset of diarrhoea and sampling, patient gender, and virus genotypes were not significant in explaining NV viral load.
After applying the cut-off value obtained in this study to our own dataset, the proportion of NV cases decreased from 43% (Ct <40) to 28% (Ct <21·36). However, there was no change in the age distribution of cases. Similar NV prevalence (20·6%) was found in children with diarrhoea in a study conducted in Ho Chi Minh City during 2009-2010 [10]. Due to the complicated nature of NV infections, interpretations of low viral loads are difficult, as these values may still indicate a causative role in symptomatic infections; indeed, the authors suggest that clinical diagnostic laboratories must evaluate appropriate cut-off values for different patient populations.
The probability of a positive NV ELISA test increases as the Ct value decreases, and this can be explained by the relatively poor sensitivity of antigen detection by ELISA compared to RT-PCR. When comparing our method with ELISA, our proposed Ct <21·36 cut-off agreed well with ELISA, with 98% specificity (96% when considering only the Ct <40). The observation that some NV genotypes and variants were not detected by ELISA may explain the lower sensitivity (85%) of ELISA. Effectively, this means that a positive NV ELISA test can be considered as indicative of a case where NV is the actual cause of the diarrhoea, whereas only 90% (81% if considering only Ct <40) of negative NV ELISA tests can be considered as indicative of a diarrhoea case not caused by NV. It is worth noting that the Ct <21·36 threshold is in the range reported by Due to differences in laboratory practices, it is not feasible to directly compare our suggested cut-off of Ct <21·36 to the value of Ct = 31 proposed by Phillips et al. [32] for distinguishing cases from controls. Of note, our study involved exclusively hospitalized cases of diarrhoea, whereas in the Phillips et al. study cases were recruited from primary care and the community, thus representing a broader spectrum of disease severity. In addition to differences in study design, different sampling techniques, sample quality issues, RNA extraction methods and sample preparation procedures are likely to affect virus genome concentrations, while different RT-PCR assays may vary in sensitivity. We suggest that each laboratory should conduct its own analysis to evaluate distribution curves and determine an appropriate cut-off value for NV-associated disease. The advantage of our approach is that such determinations of a Ct threshold for NV-associated disease may be generated in the absence of data from controls, and thus, could be applied to any laboratory receiving clinical samples on a regular basis.
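The agreement measures quoted above (sensitivity, specificity and the share of positive or negative tests that are correct) all derive from a 2 × 2 table of one test against a reference. The sketch below shows the arithmetic with hypothetical counts; the study's actual table is not reproduced here, and which test is treated as the reference is left to the caller.

```python
def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Standard 2 x 2 table summaries for a test compared with a reference.

    tp/fp/fn/tn are the counts of true positives, false positives,
    false negatives and true negatives.
    """
    return {
        "sensitivity": tp / (tp + fn),  # reference-positives detected
        "specificity": tn / (tn + fp),  # reference-negatives correctly excluded
        "ppv": tp / (tp + fp),          # positive tests that are correct
        "npv": tn / (tn + fn),          # negative tests that are correct
    }


if __name__ == "__main__":
    # Hypothetical counts for illustration only.
    print(diagnostic_metrics(tp=40, fp=10, fn=20, tn=30))
```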
In conclusion, we propose a threshold Ct value for real-time RT-PCR to assess the aetiological role of NV in children with diarrhoea. Our statistical method allowed determination of a cut-off value without reference to any controls, which is essential for the feasibility of extending this analysis to other laboratories conducting routine epidemiological surveillance and clinical diagnostics. We suggest that cut-off values should be determined individually in each laboratory based on its own assay performance, and that accumulation of these types of data across multiple laboratories would contribute to improved understanding of the burden of NV disease.