
Coronavirus Host Genomics Study: South Africa (COVIGen-SA)

Published online by Cambridge University Press:  01 January 2024

Andrew K. May*
Affiliation:
Sydney Brenner Institute for Molecular Bioscience (SBIMB), Faculty of Health Sciences, University of the Witwatersrand, Johannesburg, South Africa
Heather Seymour
Affiliation:
Sydney Brenner Institute for Molecular Bioscience (SBIMB), Faculty of Health Sciences, University of the Witwatersrand, Johannesburg, South Africa Division of Human Genetics, National Health Laboratory Service and School of Pathology, Faculty of Health Sciences, University of the Witwatersrand, Johannesburg, South Africa
Harriet Etheredge
Affiliation:
Wits Donald Gordon Medical Centre, School of Clinical Medicine, Faculty of Health Sciences, University of the Witwatersrand, Johannesburg, South Africa Steve Biko Centre for Bioethics, School of Clinical Medicine, Faculty of Health Sciences, University of the Witwatersrand, Johannesburg, South Africa
Heather Maher
Affiliation:
Wits Donald Gordon Medical Centre, School of Clinical Medicine, Faculty of Health Sciences, University of the Witwatersrand, Johannesburg, South Africa
Marta C. Nunes
Affiliation:
South African Medical Research Council, Vaccines and Infectious Diseases Analytics (VIDA) Research Unit, School of Pathology, Faculty of Health Sciences, University of the Witwatersrand, Johannesburg, South Africa Department of Science and Technology, National Research Foundation, South African Research Chair Initiative in Vaccine Preventable Diseases, Faculty of Health Sciences, University of the Witwatersrand, Johannesburg, South Africa
Shabir A. Madhi
Affiliation:
South African Medical Research Council, Vaccines and Infectious Diseases Analytics (VIDA) Research Unit, School of Pathology, Faculty of Health Sciences, University of the Witwatersrand, Johannesburg, South Africa Department of Science and Technology, National Research Foundation, South African Research Chair Initiative in Vaccine Preventable Diseases, Faculty of Health Sciences, University of the Witwatersrand, Johannesburg, South Africa
Simiso M. Sokhela
Affiliation:
Ezintsha, Wits Health Consortium, University of the Witwatersrand, Johannesburg, South Africa
W. D. Francois Venter
Affiliation:
Ezintsha, Wits Health Consortium, University of the Witwatersrand, Johannesburg, South Africa
Neil Martinson
Affiliation:
Perinatal HIV Research Unit, University of the Witwatersrand, Johannesburg, South Africa John Hopkins University Center for TB Research, Baltimore, MD, USA Centre for Respiratory Diseases and Meningitis, National Institute for Communicable Diseases and School of Pathology, Faculty of Health Sciences, University of the Witwatersrand, Johannesburg, South Africa
Firdaus Nabeemeeah
Affiliation:
Perinatal HIV Research Unit, University of the Witwatersrand, Johannesburg, South Africa John Hopkins University Center for TB Research, Baltimore, MD, USA
Cheryl Cohen
Affiliation:
Division of Infectious Diseases, Department of Internal Medicine, Chris Hani Baragwanath Academic Hospital, University of the Witwatersrand, Johannesburg, South Africa
Jocelyn Moyes
Affiliation:
Division of Infectious Diseases, Department of Internal Medicine, Chris Hani Baragwanath Academic Hospital, University of the Witwatersrand, Johannesburg, South Africa
Sibongile Walaza
Affiliation:
Division of Infectious Diseases, Department of Internal Medicine, Chris Hani Baragwanath Academic Hospital, University of the Witwatersrand, Johannesburg, South Africa
Stefano Tempia
Affiliation:
Division of Infectious Diseases, Department of Internal Medicine, Chris Hani Baragwanath Academic Hospital, University of the Witwatersrand, Johannesburg, South Africa
Jackie Kleynhans
Affiliation:
Division of Infectious Diseases, Department of Internal Medicine, Chris Hani Baragwanath Academic Hospital, University of the Witwatersrand, Johannesburg, South Africa
Anne von Gottberg
Affiliation:
Division of Infectious Diseases, Department of Internal Medicine, Chris Hani Baragwanath Academic Hospital, University of the Witwatersrand, Johannesburg, South Africa
Jeremy Nel
Affiliation:
Department of Medicine, Greys Hospital, Pietermaritzburg, South Africa
Halima Dawood
Affiliation:
Caprisa University of KwaZulu-Natal, Durban, South Africa
Ebrahim Variava
Affiliation:
Klerksdorp-Tshepong Hospital Complex, Klerksdorp, and Department of Medicine, University of the Witwatersrand, Johannesburg, South Africa
Stephen Tollman
Affiliation:
MRC/Wits Rural Public Health and Health Transitions Unit (Agincourt), School of Public Health, Faculty of Health Sciences, University of the Witwatersrand, Johannesburg, South Africa
Kathleen Kahn
Affiliation:
MRC/Wits Rural Public Health and Health Transitions Unit (Agincourt), School of Public Health, Faculty of Health Sciences, University of the Witwatersrand, Johannesburg, South Africa
Kobus Herbst
Affiliation:
Africa Health Research Institute, Durban, KwaZulu-Natal, South Africa Division of Infectious Diseases, University of Alabama at Birmingham, Birmingham, AL, USA
Emily B. Wong
Affiliation:
Africa Health Research Institute, Durban, KwaZulu-Natal, South Africa Division of Infectious Diseases, University of Alabama at Birmingham, Birmingham, AL, USA
Caroline T. Tiemessen
Affiliation:
Centre for HIV and STIs, National Institute for Communicable Diseases, and Faculty of Health Sciences, University of Witwatersrand, Johannesburg, South Africa
Alex van Blydenstein
Affiliation:
Division of Pulmonology, Department of Internal Medicine, Chris Hani Baragwanath Academic Hospital, University of the Witwatersrand, Johannesburg, South Africa
Lyle Murray
Affiliation:
Department of Medicine, Greys Hospital, Pietermaritzburg, South Africa
Michelle Venter
Affiliation:
Department of Medicine, Greys Hospital, Pietermaritzburg, South Africa
June Fabian
Affiliation:
Wits Donald Gordon Medical Centre, School of Clinical Medicine, Faculty of Health Sciences, University of the Witwatersrand, Johannesburg, South Africa
Michéle Ramsay*
Affiliation:
Sydney Brenner Institute for Molecular Bioscience (SBIMB), Faculty of Health Sciences, University of the Witwatersrand, Johannesburg, South Africa
*
Correspondence should be addressed to Andrew K. May; andrew.may@wits.ac.za
Michéle Ramsay; michele.ramsay@wits.ac.za

Abstract

Host genetic factors are known to modify the susceptibility, severity, and outcomes of COVID-19 and vary across populations. However, continental Africans are yet to be adequately represented in such studies despite the importance of genetic factors in understanding Africa’s response to the pandemic. We describe the development of a research resource for coronavirus host genomics studies in South Africa known as COVIGen-SA—a multicollaborator strategic partnership designed to provide harmonised demographic, clinical, and genetic information specific to Black South Africans with COVID-19. Over 2,000 participants have been recruited to date. Preliminary results on 1,354 SARS-CoV-2 positive participants from four participating studies showed that 64.7% were female, 333 had severe disease, and 329 were people living with HIV. Through this resource, we aim to provide insights into host genetic factors relevant to African-ancestry populations, using both genome-wide association testing and targeted sequencing of important genomic loci. This project will promote and enhance partnerships, build skills, and develop resources needed to address the COVID-19 burden and associated risk factors in South African communities.

Type
Research Article
Creative Commons
This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Copyright
Copyright © 2022 Andrew K. May et al.

1. Introduction

The severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has resulted in over 540 million infections and over six million deaths since the first outbreak was detected in December 2019 [1]. Although rapid improvements in disease prevention and management have occurred with increased uptake of vaccines [Reference Tregoning, Brown and Cheeseman2] and immunomodulation and oxygen therapy strategies for those hospitalised [Reference Gorman, Connolly, Couper, Perkins and McAuley3, Reference Alunno, Najm and Mariette4], the pandemic remains a substantial worldwide problem that is anticipated to persist for the foreseeable future.

At the outset of the COVID-19 pandemic, there were fears that Africa would suffer the worst of the disease’s impact [Reference El-Sadr and Justman5, Reference Ghosh, Jonathan and Mersha6]. Virus transmission was forecast to be unmanageably high due to limited healthcare infrastructure, few human health resources, and poor socioeconomic circumstances [Reference El-Sadr and Justman5, Reference Samadizadeh, Masoudi, Rastegar, Salimi, Shahbaz and Tahamtan7 to Reference Mbow, Lell and Jochems9], leading to understandable concerns about the impact of COVID-19, particularly among people living with tuberculosis (TB), HIV, and/or comorbid non-communicable diseases [Reference Li, Chen and Liu10 to Reference Jassat, Cohen and Tempia12]. Preliminary statistics from Western countries suggested that individuals of Asian, Black, and Hispanic ethnicity were at greater risk of COVID-related death [Reference Pan, Sze and Minhas13], mostly due to poor socioeconomic indicators in minoritised, underserved populations [Reference Samadizadeh, Masoudi, Rastegar, Salimi, Shahbaz and Tahamtan7, Reference Rasmussen, Abul-Husn, Casanova, Daly, Rehm and Murray14]. Initial predictions for Africa were dire, with 70 million infections and 3 million deaths forecast by June 2020 [Reference Maeda and Nkengasong8].

However, the true impact of SARS-CoV-2 in Africa has been difficult to gauge as epidemiological surveillance infrastructure and widespread access to screening and testing have been extremely limited (Figure 1) and vary from region to region [Reference Aborode, Hasan and Jain15]. Routine death reporting occurs in a handful of African countries, and among these, coverage is often incomplete [Reference Okonji, Okonji, Mukumbang and Van Wyk16]. Nevertheless, some have speculated that the relative youth of African populations [Reference Chitungo, Dzobo, Hlongwa and Dzinamarira17], their increased exposure to infectious pathogens including other coronaviruses [Reference Njenga, Dawa and Nanyingi18], and prior vaccination (such as with the Bacille Calmette–Guerin vaccine), might serve as protective factors against SARS-CoV-2 [Reference Samadizadeh, Masoudi, Rastegar, Salimi, Shahbaz and Tahamtan7].

Figure 1 COVID-19 case numbers and tests per thousand across different regions: (a) confirmed case numbers on the African continent remain low in comparison to other continents despite early predictions that African countries would struggle the most to maintain infection control and (b) however, testing per thousand individuals in selected African countries is a fraction of those conducted in, for example, the United States and the United Kingdom. Data sourced from [Reference Dong, Du and Gardner19, Reference Ritchie, Mathieu and Rodes-Guirao20].

Based on the clinical heterogeneity of COVID-19 [Reference Karaderi, Bareke and Kunter26] and reports of fulminant illness and death among young patients in good health [Reference van der Made, Simons and Schuurs-Hoeijmakers24, Reference Carter-Timofte, Jørgensen and Freytag27], host genetic factors have been suspected to moderate disease susceptibility, severity, and outcomes [Reference Murray, Kenny and Ritchie28]. Several global initiatives have fostered collaboration between human geneticists, enabling rapid sample collection and data analysis. For example, the COVID Human Genetic Effort (HGE) [Reference Casanova, Su and Abel29] is concerned with identifying single gene inborn errors of immunity, which are likely to be rare but of large effect size, while the COVID-19 host genetics initiative (HGI) [21] seeks to understand the role of common, small-to-medium effect size variants throughout the human genome. The most recent meta-analysis from the COVID HGI [22], incorporating over 940,000 participants, supports at least 15 independent association signals across the genome, which together implicate close to 50 genes in modifying disease severity and/or susceptibility to infection. Priority candidates among this list of genes include TYK2, PPP1R15A, ABO, FOXP4, IFNAR2, DPP9, CXCR6, LZTFL1, and TMEM65, which either have biological plausibility or contain coding sequence variants in strong linkage disequilibrium with a lead signal. Other strong candidates include ACE2 and TMPRSS2, which both regulate the entry of SARS-CoV-2 into cells [23]. Meanwhile, studies investigating single gene inborn errors have implicated toll-like receptor 3 (TLR3), TLR7 [24], and type I interferon immunity in predisposition to critical COVID-19 illness [14, 25].

The portability of disease-associated genetic signals across different geographies and ethnicities is limited [Reference Sirugo, Williams and Tishkoff30, Reference Rosenberg, Huang, Jewett, Szpiech, Jankovic and Boehnke31], with examples relevant to COVID-19. The lead-associated variant implicating the FOXP4 gene (primarily expressed in the airway epithelium) has a high frequency of the effect-allele in Middle Eastern and East Asian communities (40%) but is of less significance in European populations, where the effect-allele frequency is capped at only 3% [22]. The level of expression of the candidate gene ACE2 is known to decline from Europeans to Asians [Reference Samadizadeh, Masoudi, Rastegar, Salimi, Shahbaz and Tahamtan7], possibly explaining the greater burden of COVID-19 in Italy and Spain. Additionally, ACE2 is a target of ACE inhibitor class drugs, to which Africans generally respond poorly compared to other ethnicities, suggesting the presence of genetic variation that may also impact COVID-19 [Reference Ghosh, Jonathan and Mersha6]. Globally, the frequency of an insertion/deletion variant in the related ACE1 gene varies widely, with the COVID-19 high-risk deletion allele more common outside East Asia [Reference Samadizadeh, Masoudi, Rastegar, Salimi, Shahbaz and Tahamtan7, Reference Gomez, Albaiceta, Garcia-clemente and Lopez-Larrea32]. Currently, the leading genetic risk factor for severe COVID-19 among European-ancestry populations is a 50 kb haplotype, on chromosome 3, reported to be introgressed from Neanderthals [Reference Zeberg and Paabo33]. This haplotype has not been found to confer equivalent risk in Indian communities [Reference Singh, Srivastava and Sultana34] and is near-absent among continental Africans [Reference Zeberg and Paabo33]. 
Meanwhile, a disease protective splice variant in OAS1 occurs more frequently in African-ancestry individuals (58%) compared to Europeans (32%; [Reference Huffman, Butler-Laporte and Khan35]), suggesting that genetic risk profiling for COVID-19 is likely different among Africans.

To date, there has been little representation of continental African populations in COVID-19 host genomic studies. In a recent meta-analysis from the COVID HGI, only 5% of 48,714 participants were of African ancestry [22]. This lack of African-specific data poses serious limitations to efforts aimed at diminishing health disparities between people from different ethnic backgrounds and between high and low-income countries [Reference Adebisi, Ekpenyong and Ntacyabukura36]. Understanding host genetic factors promises to improve COVID-19 disease risk profiling [Reference Naslavsky, Vidigal and Barros23] and could provide attractive targets for therapeutic drug design [Reference Pairo-Castineira, Clohisey and Klaric37].

Motivated by these concerns and responding to calls from others emphasising the need for better inclusion of non-European participants in COVID-19 research [Reference Naslavsky, Vidigal and Barros23], we established the coronavirus host genomics study: South Africa (COVIGen-SA). COVIGen-SA is a strategic collaboration between multiple study partners designed to promote and facilitate COVID-related research that is specific to Black South Africans. The primary aim of COVIGen-SA is to explore host genetics, providing insights into factors that moderate COVID-19 susceptibility, severity, and outcomes in a continental African population. The secondary aim is to build a research resource of demographic, clinical, and genetic variables (and associated DNA samples) from Black South Africans that can be accessed to answer research questions. Ultimately, COVIGen-SA is designed to provide a foundation for new cross-disciplinary partnerships and leverage skills and infrastructure needed to bolster COVID-19 host genetic research in Africa.

2. Methods

2.1. Experimental Design and Scope

2.1.1. Participant Recruitment Model

To be eligible for enrollment into COVIGen-SA, potential participants may be of any sex and can reside in either rural or urban settings. Black African individuals over the age of 18 are prioritised for inclusion, but participants of other ethnicities and ages are increasingly included as the study expands. Note that “Black African” is a South African government-utilised racial category encompassing all individuals of African ancestry. While Black South Africans are of mostly south-eastern Bantu-speaking descent [Reference Choudhury, Ramsay and Hazelhurst38], their genetic composition may be variably admixed with other African and non-African ethnicities [Reference Sengupta, Choudhury and Fortes-Lima39].

To maximise recruitment efficiency and minimise costs, we have multiple partners across the country (Figure 2; Table 1), all of whom are involved in separate research projects with specific inclusion/exclusion criteria in which participants are being actively recruited and health-related data collected. Some of these partner studies are directly concerned with medical aspects of COVID-19, while others perform PCR tests for SARS-CoV-2 infection as part of their study’s inclusion/exclusion and/or follow-up criteria. This approach capitalises on existing infrastructure and resources for participant recruitment and limits research fatigue in participants by integrating consent and sample collection into existing participation sessions. From consenting participants, data and samples are captured and stored.

Initially, we seek to enroll 5,000 Black South African participants across three clinical categories of COVID-19 disease, defined as follows: (i) critical COVID-19 illness: hospitalised cases requiring supplemental oxygen, ventilatory, or other organ support and/or who have died as a result of COVID-19; (ii) moderate to severe illness: hospitalised cases not requiring ventilatory or other organ support; and (iii) mild or asymptomatic disease: PCR-confirmed SARS-CoV-2 infection with no symptoms or only mild symptoms. These categories were selected to align closely with those used by the COVID HGI [22]. Participant recruitment will continue beyond 5,000 individuals should resources permit.
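The three clinical categories can be expressed as a simple decision rule. The following is a minimal sketch only, not the study's actual data pipeline: the `Participant` fields and the `classify_severity` function are hypothetical names introduced for illustration, and the sketch assumes that a hospitalised case receiving supplemental oxygen falls under category (i), as the definition of critical illness states.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Participant:
    """Hypothetical record; field names are illustrative, not the study's variables."""
    pcr_positive: bool
    hospitalised: bool
    supplemental_oxygen: bool = False
    ventilatory_or_organ_support: bool = False
    died_of_covid: bool = False


def classify_severity(p: Participant) -> Optional[str]:
    """Assign one of the three clinical categories, or None if none applies."""
    # (i) Critical: death from COVID-19, or a hospitalised case needing
    # supplemental oxygen, ventilatory, or other organ support.
    needs_support = p.supplemental_oxygen or p.ventilatory_or_organ_support
    if p.died_of_covid or (p.hospitalised and needs_support):
        return "critical"
    # (ii) Moderate to severe: hospitalised without such support.
    if p.hospitalised:
        return "moderate_to_severe"
    # (iii) Mild or asymptomatic: PCR-confirmed infection, not hospitalised.
    if p.pcr_positive:
        return "mild_or_asymptomatic"
    # Assumption: anyone else is left unclassified rather than forced into a category.
    return None
```

Checking the most severe category first means that a participant who meets several criteria is always assigned to the highest applicable category.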

For comparison, we will use a population control sample comprising 5,000 Black South African individuals from the Africa Wits-INDEPTH Partnership for Genomic Studies (AWI-Gen), for whom genome-wide genotyping data are already available [Reference Ramsay, Crowther and Tambo43, Reference Ali, Soo and Agongo44]. AWI-Gen is an NIH-funded Collaborative Centre of the Human Heredity and Health in Africa Consortium [Reference Ramsay, Crowther and Tambo43] and has participants from four African countries. We note that the use of population controls is a limitation of the study design, given the unavoidable presence of a subset of control individuals who may develop severe COVID-19 illness once exposed [22]. However, the use of population controls has yielded robust findings in other studies [22, Reference Pairo-Castineira, Clohisey and Klaric37] and is considered a valid strategy.

2.1.2. Data Capture and Storage

A study-specific instrument was designed to capture variables pertinent to the investigation of COVID-19 (Table 2). The instrument comprises three broad sections: a demographic section (sex, age, self-reported ethnicity, living conditions), a general health section (past/current comorbidities and medications), and a COVID-19 section (diagnosis, symptoms, and disease outcomes). For new recruitments, instrument responses are recorded directly into a REDCap database. For partner studies collecting similar variables, we import the harmonised data into this database. Participant DNA is extracted from 6-10 ml venous blood samples in EDTA, stored at 4°C. Within one week, the EDTA tubes are collected from study sites and centrifuged to separate out the buffy coats, which are frozen at -80°C until DNA extraction is required. DNA is extracted using an automated method performed on the QIAsymphony SP platform. DNA samples are then stored in an ethics-approved biobank (clearance certificate number: BEC20200401), located at the Sydney Brenner Institute for Molecular Bioscience (SBIMB). Genome-wide genotyping of DNA samples will be conducted in batches, as funding permits, using the H3Africa genotyping array [45]. The H3Africa array is custom-designed, incorporating 2.3 million SNP markers enriched for common variants in African genomes, and was previously used to genotype AWI-Gen participants who will form the control sample for COVIGen-SA.

2.1.3. Analysis Strategy

During and after the establishment of the research resource, we plan to conduct several investigations into host genetic factors. These include, but are not limited to, genome-wide association analysis using different phenotypic categories, haplotype analysis, and bioinformatic fine-mapping. Further data collected from targeted sequencing, whole exome, and whole genome sequencing will facilitate additional investigations such as novel variant discovery and the identification of signals for selection. More nuanced investigations, such as sex-specific and burden analyses [Reference Curtis46], will also be explored. An overview of the project design is shown in Figure 3.

2.1.4. Study Coordination and Ethical Considerations

COVIGen-SA is headed by Professor Michéle Ramsay (principal investigator) and Dr. June Fabian (Co-PI) and is jointly based at the Wits Sydney Brenner Institute for Molecular Bioscience and the Wits Donald Gordon Medical Centre (WDGMC). The project leverages the strengths of these institutes, benefitting from the sample and data storage capabilities at the SBIMB Biobank, as well as the institute’s track record for bioinformatic analysis of genomic data [Reference Sengupta, Choudhury and Fortes-Lima39], and the ethical and clinical research expertise and experience of the WDGMC. The organisational structure of COVIGen-SA is summarised in Figure 4. COVIGen-SA has received ethical clearance from the University of the Witwatersrand Human Research Ethics Committee: Medical (HREC [M]; clearance number M200642). For each additional project linked to COVIGen-SA, an amendment is submitted to include or link a substudy or new cohort. Existing study-specific ethics clearance certificate numbers are provided in Figure 4. Participation requirements are minimal, comprising additional consenting processes (including specific consent for genetic studies and sharing of data and specimens) and, in some cases, an additional 6–10 ml EDTA venous blood draw. We administer the COVIGen-SA instrument only in situations where the partner study is not already collecting similar variables. Throughout recruitment, COVID-19-related guidelines are followed to protect both researchers and participants from infection. COVIGen-SA demonstrates a collective commitment to furthering our understanding of COVID-19 through robust scientific collaboration and information sharing.

2.2. Data Quality Control and Availability

Preprocessing of clinical and demographic data will focus on mitigating missing data and determining whether such data are missing completely at random, missing at random, or not missing at random [Reference Jakobsen, Gluud, Wetterslev and Winkel47]. In line with Anderson et al. [Reference Anderson, Pettersson, Clarke, Cardon, Morris and Zondervan48], preprocessing of genomic data will be conducted at both a per-individual and per-marker level prior to removing individuals and/or SNP markers from the data set. Individuals will be earmarked for removal if (a) sex information is discordant between genotype data and self-reported sex and (b) genotyping or heterozygosity rates are outliers. Depending on the nature of the downstream analyses, individuals may be removed if cryptic relatedness is detected (i.e., removing one individual from a pair sharing an identity by descent score >0.1875) and/or participant ancestry is divergent from the majority of cases based on principal component analysis. SNP markers will be removed should they have excessive missing genotype information (a call rate < 95%), a low minor allele frequency (< 0.05, depending on the sample size and analysis being performed), or substantial deviation from Hardy–Weinberg equilibrium (P < 0.001). To maximise coverage across the genome, SNP genotype imputation will be leveraged using an appropriate reference panel. Raw and preprocessed data will be made available upon request and subject to a data-transfer agreement and appropriate ethical clearance. DNA samples will be available, conditional on ethical clearance, a material transfer agreement, and availability of DNA.
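The per-marker filters described above (call rate, minor allele frequency, and Hardy–Weinberg equilibrium) can be sketched as follows. This is an illustrative sketch under stated assumptions rather than the study's pipeline: the function names and the genotype-count inputs are hypothetical, and the Hardy–Weinberg check uses a one-degree-of-freedom chi-square approximation for simplicity, whereas dedicated tools such as PLINK typically apply an exact test. The thresholds are those given in the text.

```python
import math

# Thresholds taken from the text; the MAF cut-off may vary with sample size and analysis.
CALL_RATE_MIN = 0.95
MAF_MIN = 0.05
HWE_P_MIN = 0.001


def hwe_chi2_pvalue(n_aa: int, n_ab: int, n_bb: int) -> float:
    """Chi-square (1 df) p-value for deviation from Hardy-Weinberg equilibrium.

    Simplifying assumption: an approximation, not the exact test used by
    standard genotype QC software.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        return 1.0
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    if p == 0.0 or q == 0.0:
        return 1.0  # monomorphic marker: no deviation can be tested
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square variable with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2.0))


def marker_passes_qc(n_aa: int, n_ab: int, n_bb: int, n_missing: int) -> bool:
    """Return True if a SNP marker passes all three per-marker filters."""
    n_called = n_aa + n_ab + n_bb
    n_total = n_called + n_missing
    if n_called == 0:
        return False
    call_rate = n_called / n_total
    freq_a = (2 * n_aa + n_ab) / (2 * n_called)
    maf = min(freq_a, 1.0 - freq_a)
    return (
        call_rate >= CALL_RATE_MIN
        and maf >= MAF_MIN
        and hwe_chi2_pvalue(n_aa, n_ab, n_bb) >= HWE_P_MIN
    )
```

For example, a marker with genotype counts 360/480/160 and no missing calls sits exactly at Hardy–Weinberg proportions for an allele frequency of 0.6 and passes, whereas the same marker with 100 missing calls out of 1,100 falls below the 95% call rate and is removed.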

3. Results

COVIGen-SA currently includes collaborative efforts across five institutions and seven study partners that together enable potential participant recruitment across 14 different studies and 10 recruitment sites, situated across 5 of the 9 provinces in South Africa (Figure 2). Participant recruitment commenced in October 2020. At the time of writing, data and samples had been collected from over 2,000 participants (from six of 14 studies), of whom 1,354 are reported here (Table 3). In line with demographics for the African continent, the majority of participants are younger than 40 years of age. Both rural- and urban-dwelling individuals are represented, distinguished most notably by the number of residents per household, which was larger amongst rural dwellers (e.g. in the PHIRST-C cohort; Kruskal-Wallis test statistic = 493.29, adjusted p<0.01). Documented comorbidities are diverse, ranging from high cholesterol and hypertension to renal disease and cancer. Given the high disease burden in South Africa and our partnership with HIV-focused research groups, COVIGen-SA is anticipated to include a high proportion of HIV comorbid participants, with 329 (24.30%) such participants enrolled to date. COVID-19 symptom severity also varies substantially across enrolled participants, although severely affected (i.e. hospitalised, and requiring either supplementary oxygen or mechanical ventilation) individuals remain underrepresented at present (n = 333, 24.59%). Efforts continue to prioritise the recruitment of severely affected individuals who are more likely to harbour large effect size genetic variants modifying COVID-19 severity. In the first genotyping batch of 576 participants, 73 were removed during preliminary quality control procedures (four due to sample failure and 69 due to divergent ancestry). 
Remaining participant genotypes were merged with AWI-Gen (control) [Reference Kleynhans, Tempia and Wolter40, Reference Wong, Olivier and Gunda41] and 1000 Genomes Project data [Reference Auton, Abecasis and Altshuler49], and principal component analysis was conducted (Figure 5) using PLINK (version 1.90b6.21) [Reference Chang, Chow, Tellier, Vattikuti, Purcell and Lee50, Reference Purcell and Chang51], R (version 4.1.0) [52], RStudio (version 1.4.1103) [53], and the ggplot2 package (version 3.3.5) [Reference Wickham54]. Case and control participants clustered together (Figure 5(a)), suggesting common ancestry, but Black South African samples were substantially more dispersed compared to other ethnicities (Figure 5(b)), in line with the known magnitude of genetic variation for continental Africans [Reference Sengupta, Choudhury and Fortes-Lima39, Reference Choudhury, Aron and Botigué55].

Table 1 A brief overview of COVIGen-SA partner studies.

* A: Ezintsha, B: Vaccines and Infectious Diseases Analytics Research Unit, C: Centre for Respiratory Diseases and Meningitis and the MRC/Wits Rural Public Health and Health Transitions Research Unit [Reference Kleynhans, Tempia and Wolter40], D: Perinatal HIV Research Unit, E: Africa Health Research Institute (AHRI) [Reference Wong, Olivier and Gunda41, Reference Gunda, Koole and Gareta42], F: HIV Vaccine Translational Research Entity, 3TC: lamivudine, ART: antiretroviral treatment, DCV: daclatasvir, DTG: dolutegravir, EFV: efavirenz, FTC: emtricitabine, HIV: human immunodeficiency virus, NNRTI: non-nucleoside reverse transcriptase inhibitor, NTZ: nitazoxanide, PLHIV: people living with HIV, RSV: respiratory syncytial virus, SOF: sofosbuvir, TAF: tenofovir alafenamide, and TDF: tenofovir disoproxil.

Table 2 Demographic and clinical variables collected for COVIGen-SA.

Table 3 Demographic summary of COVIGen-SA SARS-CoV-2 positive participants recruited from four studies.

See Table 1 for a description of each study.

Figure 2 Study sites for COVIGen-SA. Participants for COVIGen-SA are currently being recruited from 10 different sites, in and around areas including Johannesburg, Polokwane, Bushbuckridge, Klerksdorp, Pietermaritzburg, and Northern KwaZulu Natal (KZN).

Figure 3 Overview of the COVIGen-SA research resource and planned host genetic studies. COVIGen-SA is based on a governance framework that promotes cross-disciplinary collaboration and transparent data and sample sharing that is ethically approved and legally compliant. In addition to the SBIMB and WDGMC, seven partners have joined the study to date, all contributing to a unified research resource that will facilitate host genetic and other COVID-related studies. The project data will be made available to improve the representation of continental Africans in public data sets. SBIMB: Sydney Brenner Institute for Molecular Bioscience, WDGMC: Wits Donald Gordon Medical Centre, AHRI: Africa Health Research Institute, CRDM: Centre for Respiratory Diseases and Meningitis, National Institute for Communicable Diseases, PHRU: Perinatal HIV Research Unit, RPHHT: Rural Public Health and Health Transitions Research Unit (Agincourt), VIDA: Vaccines and Infectious Diseases Analytics Unit, and HVTR: HIV Vaccine Translational Research Entity.

4. Discussion

4.1. Anticipated Impact

Continental Africans have the highest genetic diversity compared to all other ethnicities [Reference Choudhury, Aron and Botigué55, Reference Fan, Kelly and Beltrame56] and thus harbour individual variants and patterns of variation not observed elsewhere [Reference Petersen, Libiger and Tindall57, Reference Yu, Chen and Ota58]. Deeper and more extensive genetic profiling is imperative to understanding (and ultimately improving) health outcomes for African-ancestry individuals [Reference Pereira, Mutesa, Tindana and Ramsay59, Reference Bentley, Callier and Rotimi60]. As evidence gathers to implicate host genetic factors in COVID-19 outcomes, it is reasonable to assume that novel associations and/or variants, possibly private to African genomes, may help in understanding the impact of the pandemic in Africa [Reference Choudhury, Aron and Botigué55]. Our preliminary principal component analyses (Figure 5) reiterate the known genetic diversity of self-identified Black South Africans, for whom no COVID-19 host genomic research has been completed to date. In addition to unparalleled genetic variation, our current participants already represent a heterogeneous array of demographic and health-related backgrounds. All considered, our sample holds substantial potential to reveal novel insights into COVID-19. The COVIGen-SA research resource has thus been designed to facilitate host genetic, and possibly other, explorations into this unique sample. While the primary motivation is to help alleviate the disease burden in this specific population, we also anticipate that the knowledge will have relevance to other African ethnicities and will provide additional opportunity for documenting and understanding medically relevant genetic variation in Africa more broadly.
Furthermore, we expect that COVIGen-SA will have meaningful outcomes regarding scientific capacity development in South Africa, high-impact publications, and collaborations that cross multiple disciplines. Future manuscripts will centre on genetic investigations, guided by our early GWAS findings. These publications will promote awareness and improve literacy regarding the importance of genetic factors in COVID-19 host response and the need for African-specific research.

4.2. Study Governance

The COVID-19 pandemic has fundamentally altered the global status quo, introducing substantial challenges that necessitate collaboration and cooperation on an unprecedented scale [61]. Scientific research in particular has relied on increased collaboration, especially across disciplines, to respond to the pandemic [62] and to stay abreast of the evolving lineages of SARS-CoV-2 [63], most recently illustrated by the emergence of the Omicron variant, which dominated the fourth wave of infection [64].

The COVIGen-SA study has developed several interdisciplinary partnerships that should maximise the impact of the project and assist in overcoming the logistic and financial limitations to scientific research imposed by COVID-19. This is particularly relevant in the developing country context of Africa, where infrastructural challenges may restrict the degree to which research priorities can be addressed [15], further entrenching health disparities laid bare by the pandemic [61]. Cooperative research endeavours are thus doubly important if Africa is to keep pace with the rest of the world. The requisite urgency of responding to COVID-19 has resulted in substantial upward pressure on the ethics and regulatory infrastructure of research internationally, yet the imperatives of ethical research remain, and these need to be carefully considered and addressed. COVIGen-SA has been built upon an ethics and regulatory framework that seeks to maximise participant protection and respond to the unique research challenges presented by the pandemic (Figure 4). In South Africa, this framework has been developed in a vacuum, as no guidelines for ethics in pandemic research were previously available. However, the National Human Research Ethics Committee (NHREC) has recently released a draft Pandemic Research Ethics guideline for public comment. The ethics framework of COVIGen-SA already largely conforms to this guideline but will be revised as further guidance emerges. It is similarly critical that we do not sacrifice academic integrity for the sake of “speed science” (a problem brought into sharper focus by the pandemic) [65], which runs the risk of erroneous claims. To this end, we aim to share the COVIGen-SA research resource as widely as possible so that it may be assessed and validated by other research groups.
We have developed a data protection and sharing framework that will enable us to do this both locally and internationally, according to the relevant data privacy legislation and ethical principles. Ultimately, we hope that the post-COVID emphasis on scientific collaboration becomes a guiding principle for research governance, especially in Africa, and ideally in international research that better represents the African continent.

Figure 4 An organisational chart of the COVIGen-SA project. COVIGen-SA currently incorporates five institutions and seven study partners. Each partner is engaged in one or several independent studies from which eligible participants were recruited for the COVIGen-SA study. Each study has ethical clearance, while each partnership is also covered by an ethically approved agreement.

Figure 5 Preliminary PCA plots of COVIGen-SA data. (a) Initial genotyping results for 503 COVIGen-SA (CVG, case) participants were merged with data from 5,139 Black South African AWI-Gen (AWG, control) participants. Participants from both studies overlapped substantially, suggesting common ancestry, but some individuals map further from the centre of the cluster, reflecting the considerable genetic variation of Black South Africans and possible minor admixture components from other ancestries. (b) Case and control data were then merged with publicly available 1000 Genomes Project data for select populations across the globe (n = 2,504). The first principal component separated continental African ethnicities from others. The second principal component then separated out non-African ethnicities. AWG: AWI-Gen (controls); BEB: Bengali in Bangladesh; CDX: Chinese Dai in Xishuangbanna, China; CHB: Han Chinese in Beijing, China; CVG: COVIGen-SA (cases); FIN: Finnish in Finland; GBR: British in England and Scotland; GIH: Gujarati Indian in Houston, TX; ITU: Indian Telugu in the UK; JPT: Japanese in Tokyo, Japan; LWK: Luhya in Webuye, Kenya; MSL: Mende in Sierra Leone; PJL: Punjabi in Lahore, Pakistan; and TSI: Toscani in Italia.

4.3. Challenges and Limitations

Although supported by a strong foundation of partnerships, several key challenges remain for the COVIGen-SA study. Of these, funding is the most pressing concern. Participant recruitment and sample ascertainment costs have been kept to a minimum by aligning our recruitment with the studies of our partners, but sample storage and genotyping expenses remain a considerable strain on available funds. Secondly, we have struggled to recruit individuals severely affected by COVID-19 for a variety of reasons. In the main, the infrastructural and logistic shortcomings of South Africa’s health system reduce our ability to identify the full scope of severely affected COVID-19 patients. The majority of our partner studies have similarly noted a lack of severely affected patients in their cohorts. Although the limited number of severe patients might speak to the resilience of Black South Africans against SARS-CoV-2 infection [35], the estimates of excess deaths in the country [66] suggest that these individuals may not be receiving timeous intervention and appropriate support at clinics and hospitals. Lastly, based on the preliminary principal component analysis, the genetic diversity among COVIGen-SA participants is substantial. Such diversity may dilute our ability to find clear genetic association signals, but next-generation sequencing efforts are poised to reveal novel variations that may shed further light on the genetic aetiology of COVID-19.
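To illustrate the kind of principal component analysis referred to above, the following minimal Python sketch projects a simulated genotype matrix onto its leading components. It is not the COVIGen-SA pipeline (the reference list points to PLINK and R for the actual analyses). The function names `genotype_pca` and `simulate_two_groups`, the simulated allele frequencies, the group sizes, and the use of NumPy are all assumptions made purely for illustration.

```python
import numpy as np


def genotype_pca(genotypes, n_components=2):
    """Project samples onto the top principal components.

    genotypes: (n_samples, n_variants) array of allele counts (0, 1 or 2).
    Returns the sample scores and the proportion of variance explained.
    """
    g = np.asarray(genotypes, dtype=float)
    freq = g.mean(axis=0) / 2.0
    # Monomorphic variants carry no information and cannot be standardised.
    keep = (freq > 0.0) & (freq < 1.0)
    g, freq = g[:, keep], freq[keep]
    # Centre each variant and scale by its expected binomial standard deviation.
    z = (g - 2.0 * freq) / np.sqrt(2.0 * freq * (1.0 - freq))
    u, s, _ = np.linalg.svd(z, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]
    explained = s**2 / np.sum(s**2)
    return scores, explained[:n_components]


def simulate_two_groups(n_per_group=50, n_variants=500, shift=0.3, seed=0):
    """Hypothetical data: two groups whose allele frequencies differ by `shift`."""
    rng = np.random.default_rng(seed)
    base = rng.uniform(0.2, 0.8, size=n_variants)
    delta = rng.choice([-1.0, 1.0], size=n_variants) * shift / 2.0
    freq_a = np.clip(base + delta, 0.05, 0.95)
    freq_b = np.clip(base - delta, 0.05, 0.95)
    geno_a = rng.binomial(2, freq_a, size=(n_per_group, n_variants))
    geno_b = rng.binomial(2, freq_b, size=(n_per_group, n_variants))
    labels = np.repeat([0, 1], n_per_group)
    return np.vstack([geno_a, geno_b]), labels


genotypes, labels = simulate_two_groups()
scores, explained = genotype_pca(genotypes)
pc1_a = scores[labels == 0, 0].mean()
pc1_b = scores[labels == 1, 0].mean()
print("Mean PC1 score, group A:", round(float(pc1_a), 2))
print("Mean PC1 score, group B:", round(float(pc1_b), 2))
print("Variance explained by PC1 and PC2:", np.round(explained, 3))
```

In this toy setting the two simulated groups fall on opposite sides of the first component. With real data, individuals who lie far from the main cluster, as described for Figure 5, would appear as scattered points whose ancestry may need to be accounted for in association testing.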

Logistical challenges aside, we foresee some limitations to the current design of the study. Despite our numerous partnerships and recruitment sites, our sample size is likely to remain small compared to other host genomic studies (e.g., [35]), reducing our power to detect or replicate associations of smaller effect size. Furthermore, our phenotypic database will comprise data from several independent studies, in which some data points may be missing when collating and harmonising the final data set. These data will have been collected over various waves of COVID-19 infection (driven by different viral variants) and at various stages of vaccine rollout, which could potentially undermine the representativeness of our sample and may introduce confounding effects we could not have anticipated at the start of the pandemic. However, these limitations are balanced against the uniqueness of our sample, both in terms of genetic and clinical diversity, which remains a particular strength of African-centric research.
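The relationship between sample size and power noted above can be made concrete with a rough calculation. The sketch below uses a normal approximation to a two-sided allelic test comparing allele frequencies between cases and controls. The sample sizes (503 cases, 5,139 controls) are taken from Figure 5; the control allele frequency of 0.2, the odds ratios, and the conventional genome-wide threshold of 5e-8 are assumptions chosen for illustration only, and the function names are hypothetical. This is a back-of-envelope approximation, not a substitute for a formal power analysis.

```python
from math import sqrt
from statistics import NormalDist


def allele_freq_in_cases(control_freq, odds_ratio):
    """Risk-allele frequency in cases implied by a per-allele odds ratio."""
    odds = odds_ratio * control_freq / (1.0 - control_freq)
    return odds / (1.0 + odds)


def allelic_test_power(n_cases, n_controls, control_freq, odds_ratio, alpha=5e-8):
    """Approximate power of a two-sided test for a difference in allele frequency.

    Uses a normal approximation with the variance evaluated under the
    alternative; each participant contributes two alleles.
    """
    norm = NormalDist()
    p_case = allele_freq_in_cases(control_freq, odds_ratio)
    p_ctrl = control_freq
    alleles_case, alleles_ctrl = 2 * n_cases, 2 * n_controls
    se = sqrt(p_case * (1.0 - p_case) / alleles_case
              + p_ctrl * (1.0 - p_ctrl) / alleles_ctrl)
    z_crit = norm.inv_cdf(1.0 - alpha / 2.0)
    shift = abs(p_case - p_ctrl) / se
    # Probability that the test statistic falls in either rejection region.
    return norm.cdf(shift - z_crit) + norm.cdf(-shift - z_crit)


# Hypothetical scenarios: one common variant, three effect sizes.
for odds_ratio in (1.2, 1.5, 2.0):
    power = allelic_test_power(503, 5139, control_freq=0.2, odds_ratio=odds_ratio)
    print(f"OR = {odds_ratio}: approximate power = {power:.4f}")
```

Under these assumptions, power rises steeply with the odds ratio, which is consistent with the expectation that a cohort of this size is better suited to detecting larger effects than to replicating small ones.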

5. Conclusion

Accurately determining the burden of COVID-19 in South Africa and other African countries remains challenging and important in the context of health planning. While vaccine rollout begins to alleviate some of the pressure of the pandemic, significant impetus remains to develop improved therapeutic approaches to COVID-19, especially as the SARS-CoV-2 virus mutates and evolves.

Research attention directed at this problem should be inclusive of communities worldwide if meaningful progress in reducing health inequality is to be made. COVIGen-SA represents our attempt not only to contribute to the fight against a global pandemic but also to serve an underrepresented ethnicity in genetic research. We envisage the project as a unifying framework that brings together otherwise disconnected efforts to study the host genomics of COVID-19 in South Africa. We continue to scan the horizon for further collaboration and funding opportunities and look forward to maximising the anticipated outcomes and impact of the project.

Data Availability

The data used to support this study can be obtained from the corresponding author upon request.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

The authors wish to thank Natalie Smyth and the SBIMB Biobank team for the management of the biological samples linked to COVIGen-SA, the study participants who have generously provided their time to participate in research, and Illumina for providing substantial discounts on the H3Africa genotyping arrays. The PHIRST-C study was supported by the NICD of the National Health Laboratory Service and the US CDC (cooperative agreement number 6 U01IP001048-04-02), and the Wellcome Trust (grant number 221003/Z/20/Z) in collaboration with the UK Foreign, Commonwealth and Development Office. VIDA is supported by the Bill and Melinda Gates Foundation (grant number INV-016202) and the South African Medical Research Council (grant number SHIP NCD 96756). The GIVOCA study is in part funded by the COVID Africa Rapid Grant Fund (ARGF; UID 129210) and the South African Research Chairs Initiative of the Departments of Science and Innovation and National Research Foundation (UID 84177). The AWI-Gen study is NIH-funded (grant number U54HG006938), and MR is supported by the South African Research Chairs Initiative of the Department of Science and Innovation and the National Research Foundation (UID 89646). Funding for the genotyping has been generously provided by the International 100K+ Cohorts Consortium (IHCC) and the Wits African Leadership in Vaccinology Expertise (ALIVE) consortium. Participating studies, the Sydney Brenner Institute for Molecular Bioscience, and the Wits Donald Gordon Medical Centre have contributed to recruitment and data and sample collection costs.

References

[1] World Health Organisation, “WHO Coronavirus Disease (COVID-19) Dashboard,” 2021, https://covid19.who.int.
[2] Tregoning, J. S., Brown, E. S., Cheeseman, H. M. et al., “Vaccines for COVID-19,” Clinical and Experimental Immunology, vol. 202, no. 2, pp. 162–192, 2020.
[3] Gorman, E., Connolly, B., Couper, K., Perkins, G. D., and McAuley, D. F., “Non-invasive respiratory support strategies in COVID-19,” The Lancet Respiratory Medicine, vol. 9, no. 6, pp. 553–556, 2021.
[4] Alunno, A., Najm, A., Mariette, X. et al., “Immunomodulatory therapies for SARS-CoV-2 infection: a systematic literature review to inform EULAR points to consider,” Annals of the Rheumatic Diseases, vol. 80, no. 6, pp. 803–815, 2021.
[5] El-Sadr, W. M. and Justman, J., “Africa in the path of Covid-19,” New England Journal of Medicine, vol. 383, no. 3, p. e11, 2020.
[6] Ghosh, D., Jonathan, A., and Mersha, T. B., “COVID-19 pandemic: the African paradox,” Journal of Global Health, vol. 10, no. 2, pp. 1–6, 2020.
[7] Samadizadeh, S., Masoudi, M., Rastegar, M., Salimi, V., Shahbaz, M. B., and Tahamtan, A., “COVID-19: why does disease severity vary among individuals?” Respiratory Medicine, vol. 180, Article ID 106356, 2021.
[8] Maeda, J. M. and Nkengasong, J. N., “The puzzle of the COVID-19 pandemic in Africa,” Science, vol. 371, no. 6524, pp. 27–28, 2021.
[9] Mbow, B. M., Lell, B., Jochems, P. et al., “COVID-19 in Africa: dampening the storm?” Science, vol. 369, no. 6504, pp. 624–627, 2020.
[10] Li, D., Chen, Y., Liu, H. et al., “Immune dysfunction leads to mortality and organ injury in patients with COVID-19 in China: insights from ERS-COVID-19 study,” Signal Transduction and Targeted Therapy, vol. 5, no. 1, pp. 8–10, 2020.
[11] Visca, D., Ong, C. W., Tiberi, S. et al., “Tuberculosis and COVID-19 interaction: a review of biological, clinical and public health effects,” Pulmonology, vol. 27, no. 2, pp. 151–165, 2021.
[12] Jassat, W., Cohen, C., Tempia, S. et al., “Risk factors for COVID-19-related in-hospital mortality in a high HIV and tuberculosis prevalence setting in South Africa: a cohort study,” The Lancet HIV, vol. 8, no. 9, pp. e554–e567, 2021.
[13] Pan, D., Sze, S., Minhas, J. S. et al., “The impact of ethnicity on clinical outcomes in COVID-19: a systematic review,” EClinicalMedicine, vol. 23, Article ID 100404, 2020.
[14] Rasmussen, S. A., Abul-Husn, N. S., Casanova, J. L., Daly, M. J., Rehm, H. L., and Murray, M. F., “The intersection of genetics and COVID-19 in 2021: preview of the 2021 Rodney Howell Symposium,” Genetics in Medicine, vol. 23, 2021.
[15] Aborode, A. T., Hasan, M. M., Jain, S. et al., “Impact of poor disease surveillance system on COVID-19 response in Africa: time to rethink and rebuilt,” Clinical Epidemiology and Global Health, vol. 12, Article ID 100841, 2021.
[16] Okonji, E. F., Okonji, O. C., Mukumbang, F. C., and Van Wyk, B., “Understanding varying COVID-19 mortality rates reported in Africa compared to Europe, Americas and Asia,” Tropical Medicine and International Health, vol. 26, no. 7, pp. 716–719, 2021.
[17] Chitungo, I., Dzobo, M., Hlongwa, M., and Dzinamarira, T., “COVID-19: unpacking the low number of cases in Africa,” Primary Health in Practice, vol. 1, 2020.
[18] Njenga, M. K., Dawa, J., Nanyingi, M. et al., “Why is there low morbidity and mortality of COVID-19 in Africa?” The American Journal of Tropical Medicine and Hygiene, vol. 103, no. 2, pp. 564–569, 2020.
[19] Dong, E., Du, H., and Gardner, L., “An interactive web-based dashboard to track COVID-19 in real time,” The Lancet Infectious Diseases, vol. 20, no. 5, pp. 533–534, 2020.
[20] Ritchie, H., Mathieu, E., Rodes-Guirao, L. et al., “Coronavirus pandemic (COVID-19),” 2020, https://ourworldindata.org/coronavirus.
[21] The COVID-19 Host Genetics Initiative, “The COVID-19 host genetics initiative, a global initiative to elucidate the role of host genetic factors in susceptibility and severity of the SARS-CoV-2 virus pandemic,” European Journal of Human Genetics, vol. 28, no. 6, pp. 715–718, 2020.
[22] The COVID-19 Host Genetics Initiative, “Mapping the human genetic architecture of COVID-19 by worldwide meta-analysis,” Nature, vol. 600, pp. 472–477, 2021.
[23] Naslavsky, M. S., Vidigal, M., Barros, R. et al., “Extreme phenotypes approach to investigate host genetics and COVID-19 outcomes,” Genetics and Molecular Biology, vol. 44, Article ID e20200302, 2021.
[24] van der Made, C., Simons, A., Schuurs-Hoeijmakers, J. et al., “Presence of genetic variants among young men with severe COVID-19,” The Journal of the American Medical Association, vol. 324, no. 7, pp. 663–673, 2020.
[25] Zhang, Q., Liu, Z., Moncada-Velez, M. et al., “Inborn errors of type I IFN immunity in patients with life-threatening COVID-19,” Science, vol. 370, no. 6515, 2020.
[26] Karaderi, T., Bareke, H., Kunter, I. et al., “Host genetics at the intersection of autoimmunity and COVID-19: a potential key for heterogeneous COVID-19 severity,” Frontiers in Immunology, vol. 11, pp. 1–10, 2020.
[27] Carter-Timofte, M. E., Jørgensen, S. E., Freytag, M. R. et al., “Deciphering the role of host genetics in susceptibility to severe COVID-19,” Frontiers in Immunology, vol. 11, pp. 1–14, 2020.
[28] Murray, M. F., Kenny, E. E., Ritchie, M. D. et al., “COVID-19 outcomes and the human genome,” Genetics in Medicine, vol. 22, no. 7, pp. 1175–1177, 2020.
[29] Casanova, J. L., Su, H. C., Abel, L. et al., “A global effort to define the human genetics of protective immunity to SARS-CoV-2 infection,” Cell, vol. 181, no. 6, pp. 1194–1199, 2020.
[30] Sirugo, G., Williams, S. M., and Tishkoff, S. A., “The missing diversity in human genetic studies,” Cell, vol. 177, no. 1, pp. 26–31, 2019.
[31] Rosenberg, N. A., Huang, L., Jewett, E. M., Szpiech, Z. A., Jankovic, I., and Boehnke, M., “Genome-wide association studies in diverse populations,” Nature Reviews Genetics, vol. 11, no. 5, pp. 356–366, 2010.
[32] Gomez, J., Albaiceta, G. M., Garcia-Clemente, M., and Lopez-Larrea, C., “Angiotensin-converting enzymes (ACE, ACE2) gene variants and COVID-19 outcome,” Gene, vol. 762, 2020.
[33] Zeberg, H. and Pääbo, S., “The major genetic risk factor for severe COVID-19 is inherited from Neanderthals,” Nature, vol. 587, no. 7835, pp. 610–612, 2020.
[34] Singh, P. P., Srivastava, A., Sultana, G. N. N. et al., “The major genetic risk factor for severe COVID-19 does not show any association among South Asian populations,” Scientific Reports, vol. 11, no. 1, pp. 19–22, 2021.
[35] Huffman, J. E., Butler-Laporte, G., Khan, A. et al., “Multi-ancestry fine mapping implicates OAS1 splicing in risk of severe COVID-19,” Nature Genetics, vol. 54, 2022.
[36] Adebisi, Y. A., Ekpenyong, A., Ntacyabukura, B. et al., “COVID-19 highlights the need for inclusive responses to public health emergencies in Africa,” The American Journal of Tropical Medicine and Hygiene, vol. 104, no. 2, pp. 449–452, 2020.
[37] Pairo-Castineira, E., Clohisey, S., Klaric, L. et al., “Genetic mechanisms of critical illness in COVID-19,” Nature, vol. 591, no. 7848, pp. 92–98, 2021.
[38] Choudhury, A., Ramsay, M., Hazelhurst, S. et al., “Whole-genome sequencing for an enhanced understanding of genetic variation among South Africans,” Nature Communications, vol. 8, no. 1, pp. 1–12, 2017.
[39] Sengupta, D., Choudhury, A., Fortes-Lima, C. et al., “Genetic substructure and complex demographic history of South African Bantu speakers,” Nature Communications, vol. 12, no. 1, pp. 1–13, 2021.
[40] Kleynhans, J., Tempia, S., Wolter, N. et al., “SARS-CoV-2 seroprevalence in a rural and urban household cohort during first and second waves of infections,” Emerging Infectious Diseases, vol. 27, no. 12, pp. 3020–3029, 2021.
[41] Wong, E. B., Olivier, S., Gunda, R. et al., “Convergence of infectious and non-communicable disease epidemics in rural South Africa: a cross-sectional, population-based multimorbidity study,” Lancet Global Health, vol. 9, pp. e967–e976, 2021.
[42] Gunda, R., Koole, O., Gareta, D. et al., “Cohort profile: the Vukuzazi (‘Wake up and know yourself’ in isiZulu) population science programme,” International Journal of Epidemiology, 1, 2021.
[43] Ramsay, M., Crowther, N., Tambo, E. et al., “H3Africa AWI-Gen collaborative centre: a resource to study the interplay between genomic and environmental risk factors for cardiometabolic diseases in four sub-Saharan African countries,” Global Health, Epidemiology and Genomics, vol. 1, 2016.
[44] Ali, S. A., Soo, C., Agongo, G. et al., “Genomic and environmental risk factors for cardiometabolic diseases in Africa: methods used for phase 1 of the AWI-Gen population cross-sectional study,” Global Health Action, vol. 11, 2018.
[45] H3ABioNet, “H3Africa genotyping chip,” 2020, https://www.h3abionet.org.
[46] Curtis, D., “A weighted burden test using logistic regression for integrated analysis of sequence variants, copy number variants and polygenic risk score,” European Journal of Human Genetics, vol. 27, no. 1, pp. 114–124, 2019.
[47] Jakobsen, J. C., Gluud, C., Wetterslev, J., and Winkel, P., “When and how should multiple imputation be used for handling missing data in randomised clinical trials: a practical guide with flowcharts,” BMC Medical Research Methodology, vol. 17, no. 1, pp. 1–10, 2017.
[48] Anderson, C. A., Pettersson, F. H., Clarke, G. M., Cardon, L. R., Morris, A. P., and Zondervan, K. T., “Data quality control in genetic case-control association studies,” Nature Protocols, vol. 5, no. 9, pp. 1564–1573, 2010.
[49] Auton, A., Abecasis, G. R., Altshuler, D. M. et al., “A global reference for human genetic variation,” Nature, vol. 526, no. 7571, pp. 68–74, 2015.
[50] Chang, C. C., Chow, C. C., Tellier, L. C., Vattikuti, S., Purcell, S. M., and Lee, J. J., “Second-generation PLINK: rising to the challenge of larger and richer datasets,” GigaScience, vol. 4, no. 1, pp. 1–16, 2015, https://doi.org/10.1186/s13742-015-0047-8.
[51] Purcell, S. and Chang, C., “PLINK,” 2020, https://www.cog-genomics.org/plink/.
[52] R Core Team, “R: a language and environment for statistical computing,” 2021, https://www.gbif.org/tool/81287/r-a-language-and-environment-for-statistical-computing.
[53] RStudio Team, “RStudio: integrated development for R,” 2020, https://www.rstudio.com/.
[54] Wickham, H., ggplot2: Elegant Graphics for Data Analysis, Springer-Verlag, Berlin, Germany, 2016.
[55] Choudhury, A., Aron, S., Botigué, L. R. et al., “High-depth African genomes inform human migration and health,” Nature, vol. 586, no. 7831, pp. 741–748, 2020.
[56] Fan, S., Kelly, D. E., Beltrame, M. H. et al., “African evolutionary history inferred from whole genome sequence data of 44 indigenous African populations,” Genome Biology, vol. 20, no. 1, pp. 1–14, 2019.
[57] Petersen, D. C., Libiger, O., Tindall, E. A. et al., “Complex patterns of genomic admixture within southern Africa,” PLoS Genetics, vol. 9, no. 3, pp. 10–13, 2013.
[58] Yu, N., Chen, F. C., Ota, S. et al., “Larger genetic differences within Africans than between Africans and Eurasians,” Genetics, vol. 161, no. 1, pp. 269–274, 2002.
[59] Pereira, L., Mutesa, L., Tindana, P., and Ramsay, M., “African genetic diversity and adaptation inform a precision medicine agenda,” Nature Reviews Genetics, vol. 22, pp. 284–306, 2021.
[60] Bentley, A. R., Callier, S. L., and Rotimi, C. N., “Evaluating the promise of inclusion of African ancestry populations in genomics,” NPJ Genomic Medicine, vol. 5, no. 1, pp. 1–9, 2020.
[61] Reid, M., Abdool-Karim, Q., Geng, E., and Goosby, E., “How will COVID-19 transform global health post-pandemic? Defining research and investment opportunities and priorities,” PLoS Medicine, vol. 18, no. 3, pp. 2–5, 2021.
[62] Moradian, N., Ochs, H. D., Sedikies, C. et al., “The urgent need for integrated science to fight COVID-19 pandemic and beyond,” Journal of Translational Medicine, vol. 18, no. 1, pp. 1–7, 2020.
[63] Cella, E., Benedetti, F., Fabris, S. et al., “SARS-CoV-2 lineages and sub-lineages circulating worldwide: a dynamic overview,” Chemotherapy, vol. 66, no. 1-2, pp. 3–7, 2021.
[64] Viana, R., Moyo, S., Amoako, D. G. et al., “Rapid epidemic expansion of the SARS-CoV-2 Omicron variant in southern Africa,” Nature, vol. 603, 2022.
[65] Dinis-Oliveira, R. J., “COVID-19 research: pandemic versus ‘paperdemic’, integrity, values and risks of the ‘speed science’,” Forensic Sciences Research, vol. 5, no. 2, pp. 174–187, 2020.
[66] Bradshaw, D., Dorrington, R. E., Laubscher, R., Moultrie, T. A., and Groenewald, P., “Tracking mortality in near to real time provides essential information about the impact of the COVID-19 pandemic in South Africa in 2020,” South African Medical Journal, vol. 111, no. 8, pp. 732–740, 2021.

Figure 1 COVID-19 case numbers and tests per thousand across different regions: (a) confirmed case numbers on the African continent remain low in comparison to other continents despite early predictions that African countries would struggle the most to maintain infection control and (b) however, testing per thousand individuals in selected African countries is a fraction of those conducted in, for example, the United States and the United Kingdom. Data sourced from [19, 20].

Table 1 A brief overview of COVIGen-SA partner studies.

Table 2 Demographic and clinical variables collected for COVIGen-SA.

Table 3 Demographic summary of COVIGen-SA SARS-CoV-2 positive participants recruited from four studies.

Figure 2 Study sites for COVIGen-SA. Participants for COVIGen-SA are currently being recruited from 10 different sites, in and around areas including Johannesburg, Polokwane, Bushbuckridge, Klerksdorp, Pietermaritzburg, and Northern KwaZulu Natal (KZN).
