Case–control association study between polygenic risk score and COVID-19 severity in a Russian population using low-pass genome sequencing

Published online by Cambridge University Press:  26 December 2024

Arina Nostaeva*
Affiliation:
City Hospital No. 40 of Kurortny District, St. Petersburg State Budgetary Healthcare Institution, Sestroretsk, Russia; St. Petersburg State University, St. Petersburg, Russia
Valentin Shimansky
Affiliation:
City Hospital No. 40 of Kurortny District, St. Petersburg State Budgetary Healthcare Institution, Sestroretsk, Russia; St. Petersburg State University, St. Petersburg, Russia
Svetlana Apalko
Affiliation:
City Hospital No. 40 of Kurortny District, St. Petersburg State Budgetary Healthcare Institution, Sestroretsk, Russia; St. Petersburg State University, St. Petersburg, Russia
Ivan Kuznetsov
Affiliation:
Center of Life Sciences, Skolkovo Institute of Science and Technology, Moscow, Russia
Natalya Sushentseva
Affiliation:
City Hospital No. 40 of Kurortny District, St. Petersburg State Budgetary Healthcare Institution, Sestroretsk, Russia
Oleg Popov
Affiliation:
City Hospital No. 40 of Kurortny District, St. Petersburg State Budgetary Healthcare Institution, Sestroretsk, Russia; St. Petersburg State University, St. Petersburg, Russia
Anna Asinovskaya
Affiliation:
City Hospital No. 40 of Kurortny District, St. Petersburg State Budgetary Healthcare Institution, Sestroretsk, Russia; St. Petersburg State University, St. Petersburg, Russia
Sergei Mosenko
Affiliation:
City Hospital No. 40 of Kurortny District, St. Petersburg State Budgetary Healthcare Institution, Sestroretsk, Russia; St. Petersburg State University, St. Petersburg, Russia
Lennart Karssen
Affiliation:
PolyKnomics BV, 's-Hertogenbosch, The Netherlands
Andrey Sarana
Affiliation:
St. Petersburg State University, St. Petersburg, Russia
Yurii Aulchenko
Affiliation:
PolyKnomics BV, 's-Hertogenbosch, The Netherlands
Sergey Shcherbak
Affiliation:
City Hospital No. 40 of Kurortny District, St. Petersburg State Budgetary Healthcare Institution, Sestroretsk, Russia; St. Petersburg State University, St. Petersburg, Russia
* Corresponding author: Arina Nostaeva; Email: avnostaeva@gmail.com

Abstract

The course of COVID-19 is highly variable, and genetics plays a significant role. Large-scale genetic association studies have established links between single nucleotide polymorphisms and both disease susceptibility and severity. However, the single nucleotide polymorphisms identified thus far have shown modest effects, which indicates a polygenic nature of this trait, and individually they have limited predictive performance. To address this limitation, we investigated the performance of a polygenic risk score model in the context of COVID-19 severity in a Russian population. A genome-wide polygenic risk score model including information from over a million common single nucleotide polymorphisms was developed using summary statistics from the COVID-19 Host Genetics Initiative consortium. Low-coverage sequencing (5x) was performed for ~1000 participants, and polygenic risk score values were calculated for each individual. A multivariate logistic regression model was used to analyse the association between polygenic risk score and COVID-19 outcomes. We found that individuals in the top 10% of the polygenic risk score distribution had a markedly elevated risk of severe COVID-19, with an adjusted odds ratio of 2.9 (95% confidence interval: 1.8–4.6, p-value = 4e-06), and a more than four times higher risk of mortality from COVID-19 (adjusted odds ratio = 4.3, p-value = 2e-05). This study highlights the potential of the polygenic risk score as a valuable tool for identifying individuals at increased risk of severe COVID-19 based on their genetic profile.
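
As a rough illustration of the top-decile comparison described above, the following Python sketch computes an unadjusted odds ratio with a Wald 95% confidence interval from a 2×2 table. It is not the study's analysis, which used a covariate-adjusted logistic regression; the function name `odds_ratio_wald` and all counts are hypothetical.

```python
import math


def odds_ratio_wald(a, b, c, d, z=1.96):
    """Unadjusted odds ratio and Wald confidence interval from a 2x2 table.

    a: cases in the exposed group (e.g. top PRS decile)
    b: controls in the exposed group
    c: cases in the reference group
    d: controls in the reference group
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the usual large-sample approximation
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Toy counts chosen for easy arithmetic; NOT taken from the paper
odds_ratio, lower, upper = odds_ratio_wald(20, 10, 30, 60)
print(f"OR = {odds_ratio:.2f} (95% CI: {lower:.2f}-{upper:.2f})")
```

An adjusted estimate such as the one reported in the abstract would instead come from the coefficient of a top-decile indicator in a logistic regression that also includes the covariates.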

Information

Type
Original Paper
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© The Author(s), 2024. Published by Cambridge University Press

Table 1. COVID-19-related characteristics of the participants

Figure 1. Study design and workflow. The PRS model for COVID-19 severity was derived by combining summary association statistics from the COVID-19 Host Genetics Initiative consortium with a linkage disequilibrium reference panel of 50,000 individuals of European ancestry from the UK Biobank data set. SBayesR was used as the computational algorithm; it is a Bayesian approach that calculates a posterior mean effect for all variants based on a prior (effect size in the previous GWAS) and subsequent shrinkage based on linkage disequilibrium. The PRS model was restricted to variants from the HapMap3 list and included about one million variants.
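
Once per-variant weights are available, an individual's PRS is conventionally a weighted sum of effect-allele dosages. The sketch below shows only that scoring step, under the assumption that weights and dosages are keyed by variant identifier; it does not reproduce SBayesR, and the variant IDs and numbers are made up.

```python
def polygenic_score(dosages, weights):
    """Weighted sum of effect-allele dosages.

    dosages: dict mapping variant ID -> dosage in [0, 2]
    weights: dict mapping variant ID -> per-variant effect (e.g. a posterior mean)
    Variants without a dosage for this individual are skipped (an assumption
    of this sketch; other pipelines impute or mean-fill instead).
    """
    return sum(w * dosages[v] for v, w in weights.items() if v in dosages)


# Hypothetical variant IDs and weights, for illustration only
toy_weights = {"var_a": 0.10, "var_b": -0.05, "var_c": 0.02}
toy_dosages = {"var_a": 2.0, "var_b": 1.0}  # var_c not called in this sample

score = polygenic_score(toy_dosages, toy_weights)
print(round(score, 4))
```

Fractional dosages are allowed here because low-pass sequencing typically yields imputed, probabilistic genotypes rather than hard calls.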

Figure 2. Prevalence of severe COVID-19 according to PRS decile. All participants (N = 982) were stratified by decile of the PRS distribution. The average prevalence in per cent and 95% CI within each decile are displayed.
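
The stratification in this caption can be sketched as follows: sort participants by score, split them into equal-sized bins, and report the case prevalence per bin with a 95% CI. The Wald interval used here is an assumption, since the caption does not state which interval the authors used, and the toy data are invented.

```python
import math


def prevalence_by_bin(scores, is_case, n_bins=10, z=1.96):
    """Return (bin number, prevalence %, CI lower %, CI upper %) per score bin."""
    n = len(scores)
    if n < n_bins:
        raise ValueError("need at least one participant per bin")
    order = sorted(range(n), key=lambda i: scores[i])
    rows = []
    for b in range(n_bins):
        # Integer arithmetic spreads any remainder across the bins
        members = order[b * n // n_bins:(b + 1) * n // n_bins]
        m = len(members)
        p = sum(is_case[i] for i in members) / m
        half_width = z * math.sqrt(p * (1 - p) / m)  # Wald interval (assumed)
        rows.append((b + 1,
                     100 * p,
                     100 * max(0.0, p - half_width),
                     100 * min(1.0, p + half_width)))
    return rows


# Toy data: 20 participants in 5 bins of 4, prevalence rising with the score
toy_scores = list(range(20))
toy_cases = [0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1]
for row in prevalence_by_bin(toy_scores, toy_cases, n_bins=5):
    print(row)
```

The Wald interval collapses to zero width when a bin has 0% or 100% prevalence, so a Wilson or exact interval is usually preferable for small bins.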

Figure 3. Comparison of distributions of PRS values between the groups with and without severe COVID-19. (a) Distribution of PRS in the groups with (Ncases = 319) and without (Ncontrols = 663) severe COVID-19. The x-axis represents PRS, with values scaled to a mean of 0 and a standard deviation of 1 (in the total sample) to facilitate interpretation. (b) PRS values among cases versus controls. Within each box plot, the horizontal line reflects the median, the top and bottom of each box reflect the interquartile range, and the whiskers reflect the rest of the distribution, except for points that are determined to be ‘outliers’.
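
The scaling mentioned in this caption (mean 0 and standard deviation 1 in the total sample) is ordinary standardization. A minimal sketch follows; whether the authors used the population or the sample standard deviation is not stated, so the population form used here is an assumption.

```python
import statistics


def standardize(values):
    """Rescale values to mean 0 and standard deviation 1 over the whole list."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)  # population SD (assumed convention)
    if sd == 0:
        raise ValueError("cannot standardize constant values")
    return [(v - mean) / sd for v in values]


# Toy raw scores, for illustration only
z_scores = standardize([1.0, 2.0, 3.0, 4.0])
print([round(z, 3) for z in z_scores])
```

Standardizing over cases and controls together, rather than within each group, keeps the two distributions on a common axis.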

Figure 4. Comparison of receiver operating characteristic (ROC) curves for three logistic regression models. The full model included the covariates (sex, age, comorbidities, and the first ten PCs) and the PRS, while the covariates-only model excluded the PRS.
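
ROC curves like these are commonly summarized by the area under the curve (AUC), which equals the probability that a randomly chosen case receives a higher predicted risk than a randomly chosen control. The sketch below computes that quantity directly by pairwise comparison; the predicted values are invented, and the caption does not say which summary statistic the authors reported.

```python
def auc(case_scores, control_scores):
    """Area under the ROC curve via pairwise comparison; ties count as 0.5."""
    if not case_scores or not control_scores:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for c in case_scores:
        for k in control_scores:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical predicted risks from one model, for illustration only
toy_cases = [0.9, 0.8, 0.4]
toy_controls = [0.7, 0.3, 0.4]
print(round(auc(toy_cases, toy_controls), 3))
```

Comparing models then amounts to computing this value on each model's predictions for the same participants; the pairwise loop is quadratic but adequate for a sample of about a thousand.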

Figure 5. Comparison of distributions of PRS values between the groups with and without death outcome. (a) Distribution of PRS in the groups with (Ndeath = 133) and without (Nno death = 849) death outcome of COVID-19. The x-axis represents PRS, with values scaled to a mean of 0 and a standard deviation of 1 (in the total sample) to facilitate interpretation. (b) PRS values among cases versus controls. Within each box plot, the horizontal line reflects the median, the top and bottom of each box reflect the interquartile range, and the whiskers reflect the rest of the distribution, except for points that are determined to be ‘outliers’.

Figure 6. Association of PRS with COVID-19 severity. All participants (N = 982) were stratified into three categories, based on their PRS: bottom decile, deciles 2–9, and top decile. The Kaplan–Meier curve is plotted according to the PRS category.
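
For readers unfamiliar with the Kaplan–Meier estimator named in this caption, a minimal sketch is given below. It assumes each participant has a follow-up time and a flag for whether the event was observed or the record was censored; the caption does not define the event or time scale, so the toy data are purely illustrative.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    times: follow-up time per participant
    events: 1 if the event was observed at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        observed = 0
        leaving = 0
        # Group everyone who shares the same follow-up time
        while i < len(data) and data[i][0] == t:
            observed += data[i][1]
            leaving += 1
            i += 1
        if observed:
            survival *= 1 - observed / at_risk
            curve.append((t, survival))
        # Both events and censored records leave the risk set after time t
        at_risk -= leaving
    return curve


# Toy follow-up data (hypothetical units and values)
for t, s in kaplan_meier([2, 3, 3, 5, 8], [1, 1, 0, 1, 0]):
    print(t, round(s, 3))
```

Running the function separately on each PRS category (bottom decile, deciles 2–9, top decile) would give one curve per group, as in the figure.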

Supplementary material: File

Nostaeva et al. supplementary material 1

File 363.7 KB
Supplementary material: File

Nostaeva et al. supplementary material 2

File 36.3 KB