Molecular epidemiology of a familial cluster of SARS-CoV-2 infection during lockdown period in Sant Kabir Nagar, Uttar Pradesh, India

We report a familial cluster of 24 individuals infected with severe acute respiratory syndrome-coronavirus-2 (SARS-CoV-2). The index case had a travel history, was asymptomatic and spent 24 days in the house before being tested. Physical overcrowding in the house provided a favourable environment for intra-cluster transmission. Restriction of movement of family members due to the countrywide lockdown limited the spread in the community. Among the infected, only four individuals developed symptoms. Complete genome sequences of SARS-CoV-2 were retrieved from eight clinical samples using next-generation sequencing; they showed 99.99% similarity to the Wuhan reference strain, and phylogenetic analysis demonstrated a distinct cluster lying in the B.6.6 pangolin lineage.

Severe acute respiratory syndrome-coronavirus-2 (SARS-CoV-2) was first reported from Wuhan, China [1]. Several studies have reported human-to-human transmission of SARS-CoV-2, asymptomatic transmission, and transmission in family and hospital settings [2,3]. The first case in eastern Uttar Pradesh, India was reported from Basti town on 31 March 2020 [4]. The next case in this region was reported from Sant Kabir Nagar (SKN), a district adjacent to Basti [5]. The first case in SKN was reported on 15 April 2020, when a 71-year-old male who had visited New Delhi was found positive. The infection in this case was limited, with no positive contacts. A second case was then detected in SKN; this person, who had returned home due to the countrywide lockdown, was identified as the index case of the present cluster. Although he remained asymptomatic during his stay, the infection spread further, leading to a family cluster. The present study describes the asymptomatic transmission in detail and delineates the transmission dynamics using next-generation sequencing (NGS).

Study setting
A 23-year-old male student suspected to be infected with SARS-CoV-2, along with two co-travellers on the same bus, returned to SKN from Deoband, Uttar Pradesh during the lockdown. He tested positive for SARS-CoV-2 on 17 April 2020, following which he was quarantined. As part of contact tracing, all of his family members (28 individuals) and seven relatives who were residing in the same house were tested. Nasopharyngeal and nasal swab samples were collected from all individuals by the state government health department and sent to our laboratory for testing. All the samples were processed as per the standard protocol and stored at −80°C. Eighteen out of 35 members were found positive. The family members who tested positive and those who tested negative were quarantined separately. On repeat testing of negative cases on 02 and 03 May 2020, five more members tested positive for SARS-CoV-2. Overall, 12 out of 36 people in the house remained infection-free throughout their quarantine period (median quarantine period: 18 days, range 16-26 days). The secondary attack rate in this familial cluster was 65.7% (23 of 35 household contacts), assuming that all infections were acquired in the house before quarantine.
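The reported secondary attack rate can be reproduced from the counts given above. The following is a minimal sketch, assuming that the index case is excluded from both numerator and denominator and that the 18 first-round and 5 repeat-test positives are all secondary cases; the function name is illustrative and not taken from the study.

```python
def secondary_attack_rate(secondary_cases: int, exposed_contacts: int) -> float:
    """Return the secondary attack rate as a percentage of exposed contacts."""
    if exposed_contacts <= 0:
        raise ValueError("exposed_contacts must be positive")
    if not 0 <= secondary_cases <= exposed_contacts:
        raise ValueError("secondary_cases must lie between 0 and exposed_contacts")
    return 100.0 * secondary_cases / exposed_contacts


# Counts taken from the text (index case excluded).
household_contacts = 28 + 7        # family members + relatives tested
positive_first_round = 18          # positive on initial testing
positive_on_repeat = 5             # positive on repeat testing (02 and 03 May 2020)
secondary_cases = positive_first_round + positive_on_repeat

sar = secondary_attack_rate(secondary_cases, household_contacts)
print(f"Secondary attack rate: {sar:.1f}%")  # prints "Secondary attack rate: 65.7%"
```

Counting the index case as well gives 24 infected out of 36 people in the house, which matches the 12 members reported as infection-free.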
The median age in this cluster was 20.5 years (range: 2 months−72 years). The cluster had 10 family members below 18 years of age. The cluster comprised 19 males (52.8%); Fisher's exact test was applied to compare the characteristics of infected and uninfected members (Table 1). The family lived in a pukka (cemented) house with a total of eight living rooms and a separate kitchen. Overcrowding was noted in the family. Among the individuals who tested positive, only three out of 24 developed mild symptoms. Details of the chronological events and family structure are shown in Fig. 1. Three members in this cluster had co-morbid conditions and were also found infected. All 24 infected individuals recovered from infection and no case fatality occurred in this cluster.
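For readers unfamiliar with Fisher's exact test on a 2 × 2 table, the following is a minimal pure-Python sketch of the two-sided version. The cell counts behind Table 1 are not given in the text, so the example uses a generic textbook table rather than study data; the function name is illustrative.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Sum the probabilities of all tables at least as extreme as the observed one.
    # The small tolerance guards against floating-point ties.
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )


# Generic illustration (NOT study data): rows = groups, columns = outcome yes/no.
p_value = fisher_exact_two_sided(3, 1, 1, 3)
print(f"p = {p_value:.4f}")  # prints "p = 0.4857"
```

The exact test is a natural choice for a household of this size, where expected cell counts can be too small for a chi-square approximation.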

Next-generation sequencing
RNA libraries were prepared from RNA extracted from the positive samples and sequenced on the Illumina platform (Qiagen, Germany) to retrieve complete SARS-CoV-2 genome sequences and delineate the transmission dynamics. The detailed protocol is described elsewhere [6]. The pipeline used to obtain the SARS-CoV-2 sequences is depicted in Figure 2a. The retrieved sequences were aligned with representative Indian SARS-CoV-2 sequences from GISAID. A neighbour-joining tree was generated using the best model in MEGA version 7.0 [7]. Bootstrap analysis with 1000 replicates was used to assess statistical robustness. Amino acid variations in the different proteins encoded by SARS-CoV-2 were also examined.
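The idea behind bootstrap support can be illustrated as follows. This is a conceptual sketch only, not MEGA's implementation: alignment columns are resampled with replacement to build pseudo-replicate alignments, a tree is rebuilt from each one, and the fraction of replicates recovering a given cluster is reported as its support. The toy alignment and function name are hypothetical.

```python
import random


def bootstrap_alignment(alignment: list[str], rng: random.Random) -> list[str]:
    """Resample alignment columns with replacement to form one pseudo-replicate."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    # Draw as many column indices as the alignment has columns, with replacement.
    columns = [rng.randrange(length) for _ in range(length)]
    return ["".join(seq[i] for i in columns) for seq in alignment]


# Hypothetical toy alignment (not study data).
toy_alignment = ["ACGTACGT", "ACGTACGA", "ACCTACGA"]
rng = random.Random(2020)  # fixed seed so the sketch is reproducible

replicates = [bootstrap_alignment(toy_alignment, rng) for _ in range(1000)]
print(len(replicates), "pseudo-replicate alignments generated")
```

Each replicate keeps the same sequences and alignment length; only the sampled sites differ, so variation among the resulting trees reflects how strongly the sites support each grouping.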
The percentage of genome recovered from the 24 samples ranged from 1.49 to 99.99%, and the percentage of relevant reads mapped ranged from 0.0 to 92.35%. The percentage of relevant reads mapped and the percentage of the genome retrieved for all 24 samples are tabulated in Table 2. Eight genomic sequences were retrieved with a genome coverage ranging between 99.94 and 99.96%, while the other 16 sequences were below 95.5%. The neighbour-joining tree generated using the Kimura 2-parameter model demonstrated that the retrieved sequences lay in the B.6.6 pangolin lineage (https://pangolin.cog-uk.io/). Two distinct clusters were observed for the generated tree  Table 1). The ORF1ab amino acid substitution L3606F (nsp6) is shared with clades O and A3i, and the substitution A4489V (nsp12) is shared with clade A3i. The nucleocapsid protein of the studied strain showed a single variable site in the amino acid sequence at position P9265L. The percentages of nucleotide and amino acid similarity for the different genes are tabulated in the Technical Appendix Table 2.
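The Kimura 2-parameter model corrects the observed proportion of differences between two aligned sequences by treating transitions (A↔G, C↔T) and transversions separately: d = −½ ln[(1 − 2P − Q) √(1 − 2Q)], where P and Q are the proportions of transitional and transversional differences. The following is a minimal sketch of that distance, assuming that gaps and ambiguous bases are skipped pairwise; MEGA's own handling of such sites may differ, and the example sequences are hypothetical.

```python
from math import log, sqrt

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1: str, seq2: str) -> float:
    """Kimura 2-parameter distance between two aligned nucleotide sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    sites = transitions = transversions = 0
    for x, y in zip(seq1.upper(), seq2.upper()):
        if x not in "ACGT" or y not in "ACGT":
            continue  # skip gaps and ambiguous bases (assumption of this sketch)
        sites += 1
        if x == y:
            continue
        pair = {x, y}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if sites == 0:
        raise ValueError("no comparable sites")
    p = transitions / sites
    q = transversions / sites
    inner = (1 - 2 * p - q) * sqrt(1 - 2 * q) if 1 - 2 * q > 0 else 0.0
    if inner <= 0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * log(inner)


# Hypothetical aligned fragments (not study data): one transition in ten sites.
print(round(k2p_distance("AAAAAAAAAA", "GAAAAAAAAA"), 4))
```

A matrix of such pairwise distances is the input from which a neighbour-joining tree is built.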

Discussion
We describe here a familial cluster in which 24 of 36 household members were found to be infected with SARS-CoV-2. The index case had a travel history, was asymptomatic and spent 24 days in the house before being tested. Physical overcrowding in the house provided a favourable environment for intra-cluster transmission. Restriction of movement of family members due to the countrywide lockdown limited the spread in the community. Co-morbid illnesses  [8]. The viral load in symptomatic and asymptomatic individuals has been shown to be similar [9], and asymptomatic individuals serve as a potential source of infection in community and hospital settings [3,10].
Females were infected more often than males. Other factors such as age, occupation, co-morbidity and marital status were not found to be statistically significant. The present cluster had 10 members aged <18 years, of whom two were <5 years. Infection of children by family members, facilitated by close proximity and the asymptomatic nature of the illness, has been documented [8].
In India, multiple SARS-CoV-2 clades are reported to be circulating. The phylogenetic analysis demonstrated a distinct cluster lying in the B.6.6 pangolin lineage. Further, genetic analysis of the sequences in this study demonstrated conserved amino acid variation in the ORF1ab region. The variations were observed in proteins containing trans-membrane domains (nsp3, nsp4 and nsp6), RdRp (nsp12) and helicase (nsp13). The implications of these changes need to be explored further.
This study provides insight into the transmission of SARS-CoV-2 within a crowded household setting using genome sequencing, which is of notable value when assessing cluster outbreaks and transmission dynamics. With novel variants requiring attention and resources, such cluster analyses should be carried out in a timely manner. However, complete genomes were retrieved from only eight of the 24 samples, which might be due to low viral load; this highlights a limitation of sequencing in such situations.

Conclusion
This study encourages future implementation of genome sequencing when addressing outbreaks in real-time, particularly in crowded household settings, and tangentially highlights the effectiveness of lockdown measures.
Financial support. Financial support was provided from the Indian Council of Medical Research-Regional Medical Research Centre (ICMR-RMRC), Gorakhpur.
Conflict of interest. The authors do not have any conflict of interest.
Ethical standards. Ethical approval for this study was taken from the ICMR-RMRC Gorakhpur Human Institutional Ethical Committee.