Sergey G Shcherbak, Anton I Changalidi, Yury A Barbitoff, Anna Yu Anisenkova, Sergei V Mosenko, Zakhar P Asaulenko, Victoria V Tsay, Dmitrii E Polev, Roman S Kalinin, Yuri A Eismont, Andrey S Glotov, Evgeny Y Garbuzov, Alexander N Chernov, Olga A Klitsenko, Mikhail O Ushakov, Anton E Shikov, Stanislav P Urazov, Vladislav S Baranov, Oleg S Glotov.
Abstract
The COVID-19 pandemic has drawn the attention of many researchers to the interaction between pathogen and host genomes. Over the last two years, numerous studies have been conducted to identify the genetic risk factors that predict COVID-19 severity and outcome. However, such an analysis can be complicated in cohorts of limited size or with limited breadth of genome coverage. In this work, we tried to circumvent these challenges by searching for candidate genes and genetic variants associated with a variety of quantitative and binary traits in a cohort of 840 COVID-19 patients from Russia. While we found no gene- or pathway-level associations with disease severity and outcome, we discovered eleven independent candidate loci associated with quantitative traits in COVID-19 patients. Of these, the most significant associations correspond to rs1651553 in MYH14 (p = 1.4 × 10⁻⁷), rs11243705 in SETX (p = 8.2 × 10⁻⁶), and rs16885 in ATXN1 (p = 1.3 × 10⁻⁵). One of the identified variants, rs33985936 in SCN11A, was successfully replicated in an independent study, and three of the variants were found to be associated with blood-related quantitative traits according to the UK Biobank data (rs33985936 in SCN11A, rs16885 in ATXN1, and rs4747194 in CDH23). Moreover, we show that a risk score based on these variants can predict the severity and outcome of hospitalization in our cohort of patients. Given these findings, we believe that our work may serve as a proof-of-concept study demonstrating the utility of quantitative traits and extensive phenotyping for the identification of genetic risk factors of severe COVID-19.
Keywords: COVID-19; GWAS; NGS; deep phenotyping; genetic associations; genetic variants; severity
Year: 2022 PMID: 35328087 PMCID: PMC8949130 DOI: 10.3390/genes13030534
Source DB: PubMed Journal: Genes (Basel) ISSN: 2073-4425 Impact factor: 4.141
Figure 1. Identification of candidate genetic markers of severe COVID-19 using a deeply phenotyped cohort. (a) Distributions of selected quantitative traits for individuals with different disease outcomes (death or recovery) in the cohort of 840 COVID-19 patients from Russia. Shown are the distributions of the serum C-reactive protein (CRP), interleukin-6, and D-dimer levels, CT-based lung involvement score (ranging from 0 to 4), counts of lymphocytes, leukocytes, and neutrophils in the blood samples, as well as the National Early Warning Score (NEWS). All values shown correspond to maximum values recorded during the course of hospitalization. Values exceeding three standard deviations from the population mean are omitted. *** indicates a statistically significant difference in the Wilcoxon-Mann-Whitney test (for quantitative traits) or chi-squared test (for categorical traits). (b) A schematic representation of the data analysis pipeline employed in the present study.
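The two steps named in the caption for quantitative traits (dropping values more than three standard deviations from the mean, then comparing outcome groups with a Wilcoxon-Mann-Whitney test) can be sketched as follows. This is a minimal standard-library illustration, not the authors' pipeline: the function names are hypothetical, and the p-value uses the large-sample normal approximation without tie or continuity correction, which is an assumption made here for brevity.

```python
import math
from statistics import mean, stdev


def drop_outliers(values, n_sd=3.0):
    """Keep values within n_sd sample standard deviations of the mean."""
    mu, sd = mean(values), stdev(values)
    return [v for v in values if abs(v - mu) <= n_sd * sd]


def mann_whitney_u(x, y):
    """Two-sided Wilcoxon-Mann-Whitney test via the normal approximation.

    Tied values receive average ranks; no tie or continuity correction
    is applied to the variance. Returns (U for sample x, approximate p).
    """
    # Pool both samples, remembering which group each value came from.
    pooled = sorted((v, g) for g, grp in enumerate((x, y)) for v in grp)
    rank_sum_x = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2  # mean of the 1-based ranks i+1 .. j
        rank_sum_x += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    n1, n2 = len(x), len(y)
    u = rank_sum_x - n1 * (n1 + 1) / 2
    mu_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mu_u) / sd_u
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return u, p_value
```

For small groups or heavily tied data, an exact or tie-corrected implementation from a statistics library would be the safer choice.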
Figure 2. Genome-wide association results for selected quantitative traits in COVID-19 patients. Shown are Manhattan (left) and quantile-quantile (right) plots of association p-values for (from top to bottom) the serum CRP levels (a), CT-based lung involvement score (b), and blood lymphocyte (c), leukocyte (d), and neutrophil (e) counts. Thresholds on the Manhattan plots correspond to the exome-wide significance cutoff and the looser cutoff used to select candidate associations.
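As a rough illustration of what the quantile-quantile panels display, the sketch below pairs observed association p-values with the values expected under the null hypothesis. The caption does not state how the exome-wide cutoff was derived; the `bonferroni_cutoff` helper assumes a plain Bonferroni correction, and both function names are hypothetical.

```python
import math


def qq_points(p_values):
    """Return (expected, observed) -log10(p) pairs, smallest to largest."""
    n = len(p_values)
    observed = sorted(-math.log10(p) for p in p_values)
    # Under the null, the i-th smallest of n p-values is roughly (i - 0.5) / n.
    expected = sorted(-math.log10((i - 0.5) / n) for i in range(1, n + 1))
    return list(zip(expected, observed))


def bonferroni_cutoff(n_tests, alpha=0.05):
    """Significance cutoff for n_tests independent tests (assumed convention)."""
    return alpha / n_tests
```

Points lying well above the diagonal at the right end of such a plot indicate more small p-values than expected by chance.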
Table 1. Candidate genetic variants associated with COVID-19 related quantitative traits in a cohort of Russian patients.
| Locus | rsID | Substitution | AF * | Trait(s) | Gene | Consequence | β ** | GTEx eQTLs *** |
|---|---|---|---|---|---|---|---|---|
| 2:219280564 | rs2276638 | 6247C>G | | Leukocytes | DNAJB2 | intron variant | | Multiple genes and tissues |
| 3:38894643 | rs33985936 | c.2725G>T (p.Val909Phe) | | CT score | SCN11A | missense variant | | Multiple genes and tissues |
| 3:68997990 | rs4855544 | g.20905C>A | | Lymphocytes | EOGT | intron variant | | Multiple genes and tissues |
| 6:16306520 | rs16885 | c.2257C>T (p.Pro753Ala) | | CT score | ATXN1 | missense variant | | none |
| 6:51830849 | rs1571084 | g.261777T>A | | CT score | PKHD1 | intron variant | | |
| 9:98299383 | rs41273925 | g.414815C>G | | CT score | GABBR2 | intron variant | | |
| 9:132278286 | rs11243705 | g.81700A>G | | CRP | SETX | intron variant | | |
| 10:71799129 | rs4747194 | c.7073G>T (p.Arg2358Gln) | | Lymphocytes | CDH23 | missense variant | | |
| 16:88738516 | rs34600315 | c.*648_*649del | | CT score | PIEZO1 | non coding transcript exon variant | | |
| 19:50259161 | rs1651553 | c.2127A>G | | Leukocytes, neutrophils | MYH14 | synonymous variant | | none |
| 22:20992196 | rs112544 | g.14928T>G | | Neutrophils | LZTR1 | intron variant | | Multiple genes and tissues |
*: allele frequency is given with respect to the non-reference allele; **: the effect sizes are given with respect to the IRNT-transformed values of quantitative traits; ***: data for the GTEx Analysis Release v8 (full list of significant cis-eQTLs is available in Supplementary Table S2). Bold font indicates a p-value passing the formal exome-wide significance threshold. DNAJB2: DnaJ heat shock protein family (Hsp40) member B2; SCN11A: sodium voltage-gated channel alpha subunit 11; EOGT: EGF domain specific O-linked N-acetylglucosamine transferase; ATXN1: ataxin 1; PKHD1: PKHD1 ciliary IPT domain containing fibrocystin/polyductin; GABBR2: gamma-aminobutyric acid type B receptor subunit 2; SETX: senataxin; CDH23: cadherin related 23; PIEZO1: piezo type mechanosensitive ion channel component 1; MYH14: myosin heavy chain 14; LZTR1: leucine zipper like transcription regulator 1.
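The footnote reports effect sizes on IRNT-transformed traits, that is, after a rank-based inverse normal transformation. A minimal sketch of that transformation is given below. The offset `c` and the handling of ties (average ranks) are assumptions, since the record does not state which variant of the transform was used, and the function name `irnt` is hypothetical.

```python
from statistics import NormalDist


def irnt(values, c=0.5):
    """Rank-based inverse normal transform.

    Each value is replaced by the standard normal quantile of
    (rank - c) / (n - 2c + 1); tied values share their average rank.
    c = 0.5 gives (rank - 0.5) / n, while c = 3/8 is Blom's offset.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j < n and values[order[j]] == values[order[i]]:
            j += 1
        avg_rank = (i + 1 + j) / 2  # mean of the 1-based ranks i+1 .. j
        for k in range(i, j):
            ranks[order[k]] = avg_rank
        i = j
    standard_normal = NormalDist()
    return [standard_normal.inv_cdf((r - c) / (n - 2 * c + 1)) for r in ranks]
```

Because the transformed trait has roughly unit variance, a β on this scale reads as a shift in standard deviations per allele, which makes effects comparable across traits measured in different units.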
Figure 3. A risk score based on 11 identified variants predicts disease severity and outcome. (a) Distribution of the risk score computed based on the 11 lead SNPs shown in Table 1. Shaded area indicates the top score decile corresponding to high-risk individuals. (b) Bar plots showing the proportion of patients with different outcome (top), severity (middle), or presence of the cytokine storm (bottom) in the high-risk and low-risk groups. In all cases, the groups were compared using the chi-squared test.
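The analysis in this figure can be approximated in three steps: score each patient, flag the top score decile as high-risk, and compare outcome proportions between groups with a chi-squared test. The sketch below assumes the score is a weighted sum of allele dosages, which the caption does not confirm; the function names, the weights, and the 2×2 layout are hypothetical.

```python
import math


def risk_scores(dosages, weights):
    """Weighted sum of allele dosages (0, 1, or 2) for each patient."""
    return [sum(w * d for w, d in zip(weights, row)) for row in dosages]


def top_decile_flags(scores):
    """True for patients whose score falls in the top 10% of the cohort.

    Ties at the cutoff are all flagged, so slightly more than 10% of
    patients may be marked as high-risk.
    """
    n_high = max(1, round(0.1 * len(scores)))
    cutoff = sorted(scores, reverse=True)[n_high - 1]
    return [s >= cutoff for s in scores]


def chi2_2x2(a, b, c, d):
    """Pearson chi-squared test for the table [[a, b], [c, d]].

    No continuity correction is applied, and every row and column total
    must be non-zero. Returns (statistic, p-value) with 1 degree of freedom.
    """
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p_value = math.erfc(math.sqrt(stat / 2))  # chi-squared survival, 1 df
    return stat, p_value
```

In this layout, `a` and `b` would hold the counts of deceased and recovered patients in the high-risk group, and `c` and `d` the same counts in the low-risk group.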
Replication of the association for the 11 identified variants in independent cohorts.
| Variant | Gene | A2 †,* | B1 †,** | B2 †,*** | C2 †,**** | The Severe COVID-19 GWAS Group †† | UK Biobank PheWAS Traits ††† |
|---|---|---|---|---|---|---|---|
| rs2276638 | DNAJB2 | | | | | | none |
| rs33985936 | SCN11A | | | | | | Platelet count, platelet crit |
| rs4855544 | EOGT | | | | | | none |
| rs16885 | ATXN1 | | | | | | Mean corpuscular hemoglobin |
| rs1571084 | PKHD1 | | | | | | none |
| rs41273925 | GABBR2 | | | | | | none |
| rs11243705 | SETX | | | | | | none |
| rs4747194 | CDH23 | | | | | | Monocyte % |
| rs34600315 | PIEZO1 | n.a. | | | | | none |
| rs1651553 | MYH14 | | | | | | none |
| rs112544 | LZTR1 | | | | | | none |
Bold font corresponds to variants passing replication at adjusted p-value < 0.05. †: COVID-19 HG; ††: COVID-19 cases vs. controls from Italy and Spain (corrected for 10 genomic PCs, sex, and age); †††: phenome-wide associations were selected using the Global Biobank Engine at a p-value threshold of 10⁻⁵; *: very severe respiratory confirmed COVID-19 vs. population; **: hospitalized COVID-19 vs. not hospitalized COVID-19; ***: hospitalized COVID-19 vs. population; ****: COVID-19 vs. population.