Dengue viruses (DENV) produce the most common arthropod-borne infections worldwide and dengue remains a major public health problem in tropical countries despite aggressive measures to control the mosquito vector [Reference Gubler and Clark1]. Several dengue vaccine candidates are in the late stages of development and have entered clinical trials [Reference Lang2–Reference Sun4]. Once safe and effective dengue vaccines become available, robust estimates of dengue disease burden will be required in order to make decisions regarding their integration into national immunization programmes [Reference Hombach5–Reference Mahoney7]. However, national surveillance data have often been shown to significantly underestimate true incidence and burden for a number of diseases.
Dengue is endemic in Cambodia, with co-circulation of all four DENV serotypes during most years. Since 2002, 10 000–40 000 hospitalized dengue cases in children aged ⩽15 years have been reported annually to the Cambodian National Dengue Surveillance System (NDSS) [Reference Huy8]. However, the NDSS clinical case definition does not require laboratory confirmation of DENV infection and only includes hospitalized children aged ⩽15 years. To assess the degree of underreporting and under-recognition of dengue in the Cambodian national surveillance system, we performed a capture–recapture analysis [Reference Hook and Regal9] during 2006–2008.
Cambodia, one of the poorest countries, with an annual GDP per capita of US$ 600, is located in tropical Asia. It has a population of 14·4 million, ~50% of whom are children aged ⩽15 years, and an estimated annual population growth rate of 1·8%. Phnom Penh is its capital city with a population of 1·4 million. The country consists of 24 provinces, 185 districts, and 13 408 villages. This project was a partnership between the Cambodia National Dengue Control Programme in the Ministry of Health, Institut Pasteur in Cambodia (IPC) and the Pediatric Dengue Vaccine Initiative (PDVI), a programme of the International Vaccine Institute (IVI), Seoul, Korea. The study protocol was approved by the Cambodia Ethics Committee for Health Research and the Institutional Review Board of the IVI, Seoul, Korea.
A two-sample, capture–recapture method was used to determine the effectiveness of NDSS reporting and the strength of incidence estimates. The capture consisted of identification of laboratory-confirmed dengue cases through active surveillance for febrile illness in the study population over the 3-year period. The recapture consisted of identification of all dengue cases residing in the study area and reported to the NDSS over the same 3 years. Dengue cases identified in the capture and recapture were compared to determine matches. An estimation of the total number of dengue cases (N) in the population under active surveillance was made using the following formula [Reference Gallay11, Reference Hook and Regal12]:
where $N_A$ and $N_B$ are the numbers of dengue cases detected in the capture and the recapture, respectively, and $x_{AB}$ is the number of dengue cases identified in both captures (matches). The 95% confidence interval (CI) for the estimate of N was calculated from the variance (var) as follows:
where $x_{A0}$ is the number of dengue cases captured by active surveillance but not by NDSS and $x_{0B}$ is the number of cases captured by NDSS but not by active surveillance. The numbers $x_{A0}$ and $x_{0B}$ are derived from $N_A - x_{AB}$ and $N_B - x_{AB}$, respectively.
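The display forms of equation (1) and its variance are not reproduced in this text. As a hedged illustration only, the Python sketch below assumes they correspond to Chapman's nearly unbiased two-sample estimator with Seber's variance, a common choice for two-source capture–recapture. This is an assumption, not a confirmed statement of the formula used here, and the function names are hypothetical.

```python
import math


def chapman_estimate(n_a: int, n_b: int, x_ab: int) -> float:
    """Assumed form of equation (1): Chapman's two-sample estimate of N."""
    return (n_a + 1) * (n_b + 1) / (x_ab + 1) - 1


def chapman_variance(n_a: int, n_b: int, x_ab: int) -> float:
    """Assumed variance (Seber), built from the two unmatched counts."""
    x_a0 = n_a - x_ab  # cases seen only in the capture
    x_0b = n_b - x_ab  # cases seen only in the recapture
    return ((n_a + 1) * (n_b + 1) * x_a0 * x_0b) / ((x_ab + 1) ** 2 * (x_ab + 2))


def chapman_ci(n_a: int, n_b: int, x_ab: int, z: float = 1.96) -> tuple[float, float]:
    """Normal-approximation 95% CI: N +/- z * sqrt(var)."""
    n = chapman_estimate(n_a, n_b, x_ab)
    half_width = z * math.sqrt(chapman_variance(n_a, n_b, x_ab))
    return n - half_width, n + half_width


if __name__ == "__main__":
    # 2006 counts as reported in the Results: N_A = 89, corrected N_B = 19, x_AB = 19
    print(chapman_estimate(89, 19, 19))
    print(chapman_ci(89, 19, 19))
```

With the 2006 counts reported in the Results (89, 19 and 19), this sketch returns an estimate of 89, which agrees with the figure given for that year. The variance is zero in that case because every NDSS case in the study population was matched.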
We defined true dengue cases as persons with a febrile illness who were DENV-positive by serology or molecular testing. Dengue cases for the purposes of NDSS reporting were identified on a clinical basis using the 1997 World Health Organization case definition. Incidence rates were expressed in person-seasons because our previously published estimates of incidence stem from data collected during the dengue season and not year-round. These data were not annualized; given the marked seasonality of dengue, they were assumed to represent annual incidence [Reference Vong14].
Capture: study population under active, community-based surveillance
During 2006–2008, active, community-based surveillance for febrile illnesses was conducted in a convenience sample of 32 villages and 10 urban areas in four districts of Kampong Cham province, which is the most populous province of Cambodia with ~1·7 million people and a capital city of ~90 000 people. Not all study villages and urban areas were included for the entire period; five were included for 3 years, 13 for 2 years and the remainder for only 1 year. The study population – defined as the population under active surveillance – represented 34% of the total population of these villages or urban areas (range 6·2–78·6% per village or urban area). A total of 14 354 individuals aged <20 years had 22 498 person-seasons of surveillance follow-up over the 3-year period. Active surveillance was conducted over the 3 years, mainly during the rainy season: 6657 children aged ⩽15 years from 16 villages were followed during 8 May–23 November 2006; 10 086 and 7673 individuals aged <20 years were followed during 1 June–31 December 2007 and 1 April–31 December 2008, respectively. As described previously [Reference Vong14], a census was updated in each village to identify eligible families, and a village team visited families once a week to identify persons with fever or history of fever (axillary temperature ⩾37·5°C) in the previous 7 days. In 2007 and 2008, digital thermometers and temperature logbooks were additionally provided to participating households to record any suspected fever occurring between two visits. Acute- and convalescent-phase serum samples were collected by an investigation team after obtaining signed consent and signed parental consent for children aged <18 years. Additional information relating to the type of care sought was collected during the convalescent phase (14–21 days after fever onset).
Blood specimens were transported to Kampong Cham Hospital at 4°C in insulated boxes, separated into serum aliquots, which were stored in liquid nitrogen and transported to IPC twice weekly for subsequent serological and molecular testing. All acute- and convalescent-phase serum specimens were tested for both anti-DENV IgM and anti-Japanese Encephalitis virus (JEV) IgM using in-house capture enzyme-linked immunosorbent assays (MAC-ELISA) as described by Rossi & Ksiazek [Reference Rossi, Ksiazek, Lee, Calisher and Schmaljohn15] and adapted for DENV and JEV diagnosis [Reference Hunsperger16]. Only acute-phase sera of participants who were positive for anti-DENV IgM in the convalescent sample were tested for DENV using molecular methods. DENV ribonucleic acid amplification, detection and serotyping were performed using reverse transcriptase–polymerase chain reaction (RT–PCR) according to Lanciotti et al. [Reference Lanciotti17] as modified by Reynes et al. [Reference Reynes18]. Overall, the median age of participants was 7 years and 52% were males. The number of refusals to participate in the study averaged 0·6% per year (range 0·4–0·9%). Few participants (average 0·9%, range 0·8–1·1%) moved outside the surveillance area.
Recapture: national dengue surveillance
The NDSS is based on reporting of hospitalized, clinically diagnosed dengue cases aged ⩽15 years [Reference Huy8]. The National Dengue Control Programme (NDCP) gathered data reported passively from referral hospitals (all public-sector hospitals) and collected actively at sentinel hospitals on a weekly basis only. Sentinel sites included one not-for-profit private paediatric hospital in Phnom Penh, two not-for-profit private paediatric hospitals in Siem Reap, the national paediatric hospital in Phnom Penh, and one public-sector hospital each in Takeo and Kampong Cham provinces. Patient data collected on the NDSS reporting form include name, demographics, classification of disease severity [dengue fever (DF), dengue haemorrhagic fever (DHF) and dengue shock syndrome (DSS)], district of residence and disease outcomes. The forms were stored centrally at the NDCP office and data were entered into a computerized database using statistical software. A system was in place to check patients' names so that patients hospitalized at several different sites for the same illness episode were not counted more than once.
A database of the entire study population was established in the Khmer language and cases from the two captures were also entered in a database in Khmer (Microsoft Access 2003, Visual Basic 6.0 interface; Microsoft, USA). Because Khmer is a complex language, we converted family and first names for all lists and databases, including NDSS, into Latin letters using accepted, predefined rules [Reference Finot19]. Conversions were performed by a single person and validated independently by two other staff trained in the method. Duplicate entries were removed, and the active surveillance and NDSS databases were merged and sorted by names, gender, district of residence, date of hospital admission (±1 week) and age (±2 years). A wide range was allowed for ages since Khmer ages are calculated using a semi-lunar calendar. Matches were then visually and phonetically inspected using the original Khmer names.
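The matching criteria described above can be sketched in Python as a candidate-match filter. This is a minimal illustration, not the authors' Access/Visual Basic implementation: the `CaseRecord` fields and function names are hypothetical, exact equality on the romanized name is a simplifying assumption, and the visual and phonetic review against the original Khmer names is left as a manual step.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class CaseRecord:
    # Hypothetical record layout; field names are illustrative only.
    name: str        # romanized family and first name
    gender: str
    district: str
    admission: date  # date of hospital admission
    age: int         # age in years


def is_candidate_match(a: CaseRecord, b: CaseRecord,
                       max_days: int = 7, max_age_gap: int = 2) -> bool:
    """Same name, gender and district; admission within 1 week; age within 2 years."""
    return (
        a.name.strip().lower() == b.name.strip().lower()
        and a.gender == b.gender
        and a.district == b.district
        and abs((a.admission - b.admission).days) <= max_days
        and abs(a.age - b.age) <= max_age_gap
    )


def candidate_matches(active: list[CaseRecord], ndss: list[CaseRecord]):
    """Return all (active, NDSS) pairs that pass the filter, for manual review."""
    return [(a, b) for a in active for b in ndss if is_candidate_match(a, b)]


if __name__ == "__main__":
    # Made-up records for illustration only.
    a = CaseRecord("Sok Dara", "M", "District 1", date(2007, 7, 10), 6)
    b = CaseRecord("sok dara", "M", "District 1", date(2007, 7, 14), 8)
    print(candidate_matches([a], [b]))
```

Treating the output as candidates rather than confirmed matches mirrors the final inspection step described in the text.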
The capture–recapture analysis
Dengue cases identified by active surveillance in the study population ($N_A$) were readily available. However, dengue cases from the study population reported to NDSS ($N_B$) could not be identified directly because the NDSS records included only name and district of residence, and not village of residence. Therefore, the list of dengue cases reported to NDSS during each year of this study was extracted in a two-step process. First, we extracted cases reported for the same periods as the active surveillance study who resided in the four districts that encompassed the active surveillance study ($N_{\text{district}}$). Second, this list ($N_{\text{district}}$) was name-matched to the list of all active surveillance participants during the respective time periods ($P_{\text{cohort}}$). Thus, $N_B = P_{\text{cohort}} \cap N_{\text{district}}$ (Fig. 1).
Matching the laboratory-confirmed dengue cases identified by active surveillance ($N_A$) and the cases identified through NDSS ($N_B$) led to identification of ‘true dengue’ cases matched in both captures [$x_{AB} = N_A \cap N_B$].
We then determined the number of dengue cases captured by active surveillance but not by NDSS ($x_{0A} = N_A - x_{AB}$). We also determined the number of dengue cases captured by NDSS but not by active surveillance ($x_{B0} = N_B - x_{AB}$). In addition, some patients captured by both systems tested negative for DENV infection in the active surveillance study; we therefore corrected $x_{B0}$ to exclude these false-positive cases (FP): $x_{B0} - \mathrm{FP}$. We used a probability estimate (y) to define as DENV-positive those cases in the NDSS not captured and laboratory-tested by active surveillance ($x_{B0} - \mathrm{FP}$) as follows:
Hence, the final number of DENV-positive cases in the non-tested dengue cases reported to NDSS was
A ‘corrected’ $N_B$ was determined
from the clinically diagnosed dengue cases captured by NDSS and in the study population for the respective study years in which false-positive cases were excluded. Finally, we estimated the total number of dengue cases (N) in the study population, using $N_A$, corrected $N_B$ and $x_{AB}$ in equation (1).
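Equations (2) to (4) are not reproduced in this text. The Python sketch below shows one reading that is consistent with the verbal description. It assumes that y is the proportion of laboratory-tested NDSS cases that were DENV-positive, that y is applied to the untested NDSS cases, and that the result is added to the confirmed matches. These forms and the function name are assumptions made for illustration, not the authors' confirmed equations.

```python
def corrected_n_b(n_b: int, x_ab: int, fp: int) -> float:
    """Assumed correction of N_B for false-positive clinical diagnoses.

    n_b  : NDSS cases found in the study population
    x_ab : NDSS cases confirmed DENV-positive by active surveillance
    fp   : NDSS cases tested by active surveillance and found DENV-negative
    """
    x_b0 = n_b - x_ab             # NDSS cases not confirmed by active surveillance
    untested = x_b0 - fp          # of those, the cases with no laboratory result
    tested = x_ab + fp            # NDSS cases that do have a laboratory result
    y = x_ab / tested if tested else 0.0  # assumed form of equation (2)
    positives_in_untested = y * untested  # assumed form of equation (3)
    return x_ab + positives_in_untested   # assumed form of equation (4)


if __name__ == "__main__":
    # 2006 counts as reported in the Results: N_B = 23, x_AB = 19, FP = 4
    print(corrected_n_b(23, 19, 4))
```

With the 2006 counts reported in the Results (23 NDSS cases, 19 confirmed, 4 false positives), no untested cases remain and the sketch returns 19, the corrected value given for that year.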
An expansion factor, defined as the inverse of the proportion of cases reported to NDSS, was obtained by dividing N, the total number of dengue cases in the study population (estimated by capture–recapture), by $N_B$, the number of cases in the study population reported to NDSS.
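As a worked illustration of this ratio, the short Python sketch below divides the estimated total by the number of NDSS-reported cases. The function name is hypothetical; the inputs are the 2006 and 2007 values reported in the Results (N of 89 and 648; uncorrected $N_B$ of 23 and 29).

```python
def expansion_factor(n_total: float, n_reported: float) -> float:
    """Estimated total cases divided by cases reported to surveillance."""
    if n_reported <= 0:
        raise ValueError("n_reported must be positive")
    return n_total / n_reported


if __name__ == "__main__":
    print(round(expansion_factor(89, 23), 1))   # 2006 -> 3.9
    print(round(expansion_factor(648, 29), 1))  # 2007 -> 22.3
```

Both values agree with the point estimates reported for those years.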
We performed all statistical analyses using Stata version 9.0 (StataCorp, USA) and Excel 2003 (Microsoft, USA).
The annual numbers of dengue cases reported by NDSS as residents of the study districts ($N_{\text{district}}$) were 661, 1445 and 529 for the respective three study years, which yielded annual incidences of 3·6, 5·1 and 1·9/1000 for persons aged <20 years in 2006, 2007 and 2008, respectively. During the large 2007 epidemic, most of the hospitalized cases reported to NDSS were reported as ‘cases with complications’ (i.e. DHF or DSS) (55·9%), a complication rate significantly higher than that of 2006 (46·0%, P=0·001) and 2008 (33·3%, P<0·001). No gender differences were observed between the two case-capture systems (Table 1).
A total of 89, 530 and 117 dengue cases were detected by active surveillance during the three study years, respectively, yielding an annual incidence of 13·4/1000 person-seasons in 2006, 57·8/1000 person-seasons in 2007 and 17·6/1000 person-seasons in 2008. Significant differences in the proportion of cases requiring hospitalization were observed during the study: 41·6%, 10·6% and 2·6% in 2006, 2007 and 2008, respectively (P<0·001). The highest age-specific incidence rates were observed in the 0–4 years (annual range 13·2–81·7/1000 person-seasons) and 5–9 years (annual range 15·9–84·2/1000 person-seasons) age groups [Reference Vong14].
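The incidence figures can be checked with simple arithmetic. The Python sketch below is illustrative only: the function name is hypothetical, and it assumes that each of the 6657 children followed in 2006 contributed one person-season, which the text does not state explicitly.

```python
def incidence_per_1000(cases: int, person_seasons: float) -> float:
    """Cases per 1000 person-seasons of follow-up."""
    if person_seasons <= 0:
        raise ValueError("person_seasons must be positive")
    return 1000 * cases / person_seasons


if __name__ == "__main__":
    # 2006: 89 laboratory-confirmed cases, 6657 children under surveillance
    print(round(incidence_per_1000(89, 6657), 1))  # 13.4
```

Under that assumption the result matches the 13·4/1000 person-seasons reported for 2006.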
The dengue cases reported to the NDSS that were also identified by active surveillance in the study population ($N_B$) numbered 23, 29 and 4 for 2006, 2007 and 2008, respectively. Of these, the true cases (i.e. laboratory-confirmed DENV infection; $x_{AB}$) numbered 19, 17 and 2 during the 3 years, respectively. There were four false-positive cases identified in the NDSS in 2006, six in 2007, and none in 2008, which were used to estimate the probability of true dengue cases in those reported to NDSS [see equations (2) and (3)]. This estimate [equation (4)] yielded a corrected $N_B$ of 19, 23 and 2 for the 3 years, respectively. The estimated number of dengue cases (N) in the study population that should have been identified [equation (1) with $N_A$, corrected $N_B$ and $x_{AB}$] was 89, 648 and 148 in 2006, 2007 and 2008, respectively (Tables 1 and 2). When compared with the cases reported to NDSS (Table 2) over the 3-year period, the calculated expansion factors were 3·9 (95% CI 3·5–4·2) in 2006, 22·3 (95% CI 18·1–26·6) in 2007 and 29·0 (95% CI 16·5–42·0) in 2008.
NDSS, National Dengue Surveillance System; DENV, dengue virus; CI, confidence interval.
* Active surveillance study as capture 1 and NDSS as capture 2.
† Applying equation (1) and using $N_A$, corrected $N_B$ and $x_{AB}$ parameters.
Hospitalized dengue cases
The NDSS only captures hospitalized cases. The number of dengue cases identified in the active surveillance study subsequently reported as being hospitalized ($N_B$) was 41, 56 and 3 in 2006, 2007 and 2008, respectively. The estimated number of hospitalized dengue cases in the active surveillance study group was 41, 69 and 4 from 2006 to 2008 (calculated from $x_{AB}$, $x_{0B}$ and corrected $N_B$ for hospitalized cases). Hence, the calculated expansion factors for hospitalized cases reported to NDSS were 1·8 (95% CI 1·7–2·0) in 2006, 2·4 (95% CI 2·0–2·8) in 2007 and 1·1 (95% CI 0·8–1·4) in 2008.
Undercounting of dengue cases is a problem common to most surveillance systems, particularly when the case definition uses the previous 1997 WHO classification and case definition and only includes hospitalized cases; as a result, many severe cases other than those of DHF or DSS were probably missed by the NDSS [Reference Anderson21]. As shown in this study, a substantial proportion of the overall disease incidence is represented by non-hospitalized persons who present initially with a febrile illness, which is subsequently found to be dengue. We conservatively estimated that there was a 4- to 30-fold degree of dengue under-recognition and underreporting to NDSS during 2006–2008 in Kampong Cham, the most populous province in Cambodia. Interestingly, under-detection levels changed significantly from one year to another: 4-fold in 2006 and 29-fold in 2008, the non-epidemic years, and 22-fold during the large-scale 2007 epidemic. As shown, the major reason for underreporting is related to hospitalization rates, which themselves could be considered a surrogate for severe dengue illness. The expression of disease severity and, to a certain extent, hospitalization rates are probably affected by a number of intertwined factors, including introduction of new DENV types [Reference Vaughn22, Reference Gibbons23], viral genetic factors associated with severe disease, and the host's pre-existing immunity from a prior dengue virus infection with another serotype, leading to antibody enhancement or cross-protective heterotypic antibody. Changes in healthcare-seeking behaviours of the population and in clinical practices could also have affected hospitalization rates, but such changes were unlikely in our study area.
In contrast, our results indicated that under-recognition and underreporting of hospitalized dengue cases were much lower and generally more stable from year to year. However, during the large epidemic in 2007, underreporting was twofold higher than in other years. Dengue is highly focal and explosive in nature. Health facilities that cover an affected area can be rapidly saturated by the overflow of patients. Consequently, hospital staff dealing with the increasing workload may treat case reporting as a lower priority [Reference Ngan, Guyant and Hoyer24]. In other words, many cases may not have been notified during the rush period, which may explain the higher degree of underreporting to NDSS during the 2007 epidemic.
Taken together, these findings raise important perspectives. First, relying exclusively on reports of hospitalized cases for surveillance of dengue would significantly underestimate the trends and the magnitude of the burden of disease in Cambodia, and probably in any other country. This is because the number and proportion of hospitalized cases are not directly proportional to the overall disease incidence from year to year. Our findings show that NDSS appeared to accurately and consistently capture hospitalized cases over time. If these results can be generalized to other areas of Cambodia, trends of incidence generated by NDSS by district may reflect spatial-temporal dynamics of dengue in Cambodia. Space–time modelling of dengue incidence could be subsequently developed to predict patterns of transmission and at-risk areas so that appropriate control interventions could begin ahead of the dengue season. The extent to which our estimates of the degree of underreporting can be extrapolated to the whole country needs further investigation.
The capture–recapture analysis also enabled us to estimate the sensitivity of active, community-based surveillance for acute febrile illness. Despite weekly home visits and commitment of mothers to report febrile illnesses, the sensitivity of this surveillance did not reach 100% in detecting all hospitalized dengue cases over the 3-year period, particularly in 2007. This finding is plausible since reporting to community surveillance workers always depended on the mothers' goodwill and recording of the fever events. Thus, it is likely that some mothers would rush to a healthcare facility/hospital once their child became febrile or once complication signs were recognized, particularly during the large epidemic, and would not bother to report the febrile illness to the surveillance team.
Results of our study must be interpreted in light of several possible limitations.
First, because we only conducted active dengue surveillance during the dengue epidemic season, it is unclear whether the degree of underreporting would be different outside the dengue season. Nevertheless, it is unlikely that off-season data would affect our overall finding, as less than 5% of the laboratory-confirmed cases from NDSS occurred between dengue seasons [Reference Huy8].
The second issue is whether the capture–recapture methodology met the six conditions for validity [Reference Hook and Regal9, Reference Gallay11, Reference Hook and Regal12]. These included that: (1) sources and measurement of the capture and recapture groups were independent; (2) all cases must have the same probability of identification within each identification system, although the probability of capture may vary between systems; (3) capture and recapture must be conducted during concurrent time periods and the populations must be geographically inclusive; (4) the population must be closed and have little in- or out-migration; (5) matching of cases between the data sources must be high probability matches; and (6) reported cases must be true cases.
Ensuring the independence between the capture and recapture sources is more difficult to achieve when only two data sources are employed. While more than two data sources would have allowed use of a log-linear model and generated more reliable estimates [Reference Hook and Regal25, Reference Chao26], we assessed dependence using a qualitative approach [Reference Hook and Regal12]: when a case was detected by active community surveillance, there were no systematic reports to NDSS since parents/patients were free to choose their caregivers and the surveillance staff were specifically trained not to interfere with their patients' healthcare-seeking behaviour. However, despite these precautions, the degree of dependency between NDSS and active surveillance remains somewhat uncertain and if there was positive dependency between the two sources, true matches would increase and the results would tend to underestimate N, the total number of dengue cases.
The assumption that all cases have the same probability of capture by a given system was met through regular monitoring of the quality of home visits in the active, community-based surveillance. We are confident that the probability of capturing a person with a febrile illness would be the same for all participants over the 3-year study period. In contrast, inherent to NDSS, severe cases were more likely to be hospitalized and subsequently reported to NDSS. In other words, moderate forms of dengue, even if requiring hospitalization and missed by active surveillance would have little probability of being captured by NDSS. Therefore, estimating accurately the overall number of dengue cases would mainly depend on performance of the active surveillance system.
The remaining assumptions were easier to validate and included: that matches between the data sources were performed for the same time periods and for the same administrative districts; that there was minimal in- and out-migration (<1%); that we ensured the accuracy of the reports and matching through the standardized process described in the Methods section; and that true matches between the two lists were confirmed DENV infection.
Capture–recapture analysis has been frequently used to estimate numbers of accidents and injuries [Reference LaPorte27] and chronic diseases [Reference Orton, Rickard and Miller28–Reference Mahr32]. It has been less frequently used to evaluate infectious disease surveillance systems [Reference van Hest33] and, to our knowledge, only one study, conducted in Puerto Rico, has applied the method to evaluate surveillance for dengue [Reference Dechant and Rigau-Pérez34]. Based on our results, we suggest this method may prove to be a worthwhile tool for assessing the magnitude and the pattern of underreporting in national dengue surveillance systems, and may allow for better estimates of disease burden from these systems. Given the cyclical nature of dengue incidence, any assessment of dengue surveillance, capture–recapture or other, needs to be conducted over several years to encompass epidemic cycles.
We are most grateful to the study participants from the Kampong Cham community for their enthusiasm and support. We also thank the members of the teams for their dedicated work in this study. The active surveillance project was funded by the KOICA (Korea International Cooperation Agency) and The Pediatric Dengue Vaccine Initiative which received funding from the Bill & Melinda Gates Foundation (grant no. 23197).
DECLARATION OF INTEREST