Fang Wang1, Shujia Huang2,3, Rongsui Gao1, Yuwen Zhou2,4, Changxiang Lai1, Zhichao Li2,4, Wenjie Xian1, Xiaobo Qian2,4, Zhiyu Li1, Yushan Huang2,4, Qiyuan Tang1, Panhong Liu2,4, Ruikun Chen1, Rong Liu2, Xuan Li1, Xin Tong2, Xuan Zhou1, Yong Bai2, Gang Duan1, Tao Zhang2, Xun Xu2,5, Jian Wang2,6, Huanming Yang2,6, Siyang Liu7, Qing He8, Xin Jin9,10, Lei Liu11.
Abstract
The COVID-19 pandemic has accounted for millions of infections and hundreds of thousands of deaths worldwide in a short period. Patients show great diversity in clinical and laboratory manifestations and in disease severity. Nonetheless, little is known about the host genetic contribution to the observed interindividual phenotypic variability. Here, we report the first host genetic study in the Chinese population by deeply sequencing and analyzing 332 COVID-19 patients, categorized by varying levels of severity, from the Shenzhen Third People's Hospital. Using a total of 22.2 million genetic variants, we conducted both single-variant and gene-based association tests among five severity groups (asymptomatic, mild, moderate, severe, and critically ill patients) after correcting for potential confounding factors. Pedigree analysis suggested a potential monogenic effect of loss-of-function variants in GOLGA3 and DPP7 for critically ill and asymptomatic disease presentation. The genome-wide association study suggests that the most significant gene locus associated with severity is located in TMEM189-UBE2V1, which is involved in the IL-1 signaling pathway. The p.Val197Met missense variant, which affects the stability of the TMPRSS2 protein, displays a decreasing allele frequency among the severe patients compared with the mild patients and the general population. We identified that the HLA-A*11:01, B*51:01, and C*14:02 alleles significantly predispose patients to the worst outcome. This initial genomic study of Chinese patients provides genetic insights into the phenotypic differences among the COVID-19 patient groups and highlights genes and variants that may help guide targeted efforts in containing the outbreak. Limitations and advantages of the study are also reviewed to guide future international efforts on elucidating the genetic architecture of host-pathogen interaction for COVID-19 and other infectious and complex diseases.
Year: 2020 PMID: 33298875 PMCID: PMC7653987 DOI: 10.1038/s41421-020-00231-4
Source DB: PubMed Journal: Cell Discov ISSN: 2056-5968 Impact factor: 10.849
Fig. 1 Clinical and laboratory assessments of the 332 recruited COVID-19 patients.
a Number of samples belonging to the five categories. b Top 20 features that classify the patient categories in the trained machine learning model. c Age distribution for the five categories of patients. d Distribution of disease duration, i.e., the duration between disease onset and the first negative RT-PCR test, among the five groups of patients. e Gender distribution for the five categories of patients by age. f Distribution of the proportion of patients with or without medical comorbidities among the five categories of patients by age.
Fig. 2 Deep whole-genome sequencing and genetic variation among the patients.
a Sequencing depth distribution. b Proportions and numbers of types (SNP, Indel) of genetic variants identified from the patients. c Proportions and numbers of functional consequences of the genetic variants among the patients. d Pedigree gene discovery. GOLGA3 and DPP7 are marked in bold for the presence of rare recurrent loss-of-function variants. e Mutation burden association test for loss-of-function variants between the severe and non-severe patients. f Allele frequency distribution for all the missense and loss-of-function variants present in the ACE2 and TMPRSS2 genes.
Fig. 3 Genetic loci associated with patient severity.
a–c Single-variant association test for three severity traits. a Severe and critically ill groups vs. the non-severe groups. b Severity score assessed by laboratory test measurements. c The duration from disease onset to recovery. d–f Gene-based association test for the same three traits.
Fig. 4 LD, allele frequency and pleiotropic effects of the TMEM189–UBE2V1 signal suggestively associated with COVID-19 patient severity.
a LocusZoom plot showing the p values of the SNPs centered on the lead SNP rs6020298, together with the recombination rate. The color of the dots indicates the linkage disequilibrium r² metric. b Allele frequency of rs6020298 among the 1000 Genomes populations. The allele frequencies of the reference and alternative alleles are visualized with the Geography of Genetic Variants browser developed by the University of Chicago. c p values of the single-variant genome-wide association test for the 64 laboratory assessments at the lead SNP rs6020298. The p values of the three traits (severity, severity score and disease duration) in Fig. 3 are also displayed.
Comparison of the allele frequency and genome-wide association signals for two associated loci in European population.
| Compared information | Statistic | rs11385942 (3p21.31) | rs657152 (9q34.2, ABO) |
|---|---|---|---|
| CHROM | | chr3 | chr9 |
| POS (hg38) | | 45,834,967 | 133,263,862 |
| Risk allele | | GA | C |
| Other allele | | G | A |
| All patients ( | | 0 | 0.453 |
| Asymptomatic ( | | 0 | 0.413 |
| Moderate ( | | 0 | 0.44 |
| Severe ( | | 0 | 0.469 |
| Critical ( | | 0 | 0.618 |
| ChinaMAP | | 0.00396 | 0.424 |
| 1000G_EAS | | 0.005 | 0.628 |
| 1000G_EUR | | 0.0805 | 0.601 |
| 1000G_SAS | | 0.296 | 0.596 |
| 1000G_AFR | | 0.053 | 0.539 |
| gnomAD_EAS | | 0.00061 | 0.633 |
| Italian | OR (95% CI) | 1.53–2.48 | 1.22–1.59 |
| Italian | P | 7.02E–08 | 5.31E–05 |
| Spain | OR (95% CI) | 1.76–4.42 | 1.17–1.60 |
| Spain | P | 1.17E–05 | 2.81E–03 |
| Chinese | OR (95% CI) | NA | 0.38–1.42 |
| Chinese | P | NA | 0.693 |
Nominal association of HLA allele and severity by logistic regression.
| HLA allele | Severe | Non-severe | OR | SE | P |
|---|---|---|---|---|---|
| C*14:02 | 0.086 | 0.047 | 4.75 | 0.52 | 0.003028 |
| B*51:01 | 0.101 | 0.058 | 3.38 | 0.45 | 0.007017 |
| A*11:01 | 0.297 | 0.263 | 2.33 | 0.32 | 0.008512 |
| DRB1*14:04 | 0.029 | 0.005 | 15.1 | 1.06 | 0.01027 |
| DRB1*01:01 | 0.022 | 0.005 | 13.7 | 1.13 | 0.02034 |
| DPB1*03:01 | 0.008 | 0.044 | 0.09 | 1.15 | 0.03669 |
| DQA1*01:01 | 0.029 | 0.009 | 6.05 | 0.87 | 0.03947 |
| DRB1*12:01 | 0.022 | 0.037 | 0.18 | 0.87 | 0.04478 |
| B*13:02 | 0.058 | 0.051 | 0.27 | 0.66 | 0.04935 |
Severe group indicates severe and critical patients.
Non-severe group includes asymptomatic, mild, and moderate patients.
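The OR, SE and P columns of the HLA table above can be related to one another with a standard Wald test. The following is a minimal sketch, not the authors' pipeline: it assumes that SE is the standard error of ln(OR), as is common in logistic regression output, and the function name `wald_summary` is hypothetical. The 95% confidence interval it returns is not reported in the table and is shown only for illustration.

```python
import math

def wald_summary(odds_ratio, se_log_or, z_crit=1.96):
    """Return (z, two-sided p, CI lower, CI upper) for a reported odds ratio.

    Assumption: se_log_or is the standard error of ln(OR).
    """
    beta = math.log(odds_ratio)             # log-odds coefficient
    z = beta / se_log_or                    # Wald z statistic
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p value
    ci_low = math.exp(beta - z_crit * se_log_or)
    ci_high = math.exp(beta + z_crit * se_log_or)
    return z, p, ci_low, ci_high

# (OR, SE, reported P) copied from three rows of the table above
rows = {
    "C*14:02": (4.75, 0.52, 0.003028),
    "B*51:01": (3.38, 0.45, 0.007017),
    "DPB1*03:01": (0.09, 1.15, 0.03669),
}

for allele, (or_, se, p_reported) in rows.items():
    z, p, lo, hi = wald_summary(or_, se)
    print(f"{allele}: z={z:.2f} p={p:.4f} (reported {p_reported}) "
          f"95% CI {lo:.2f}-{hi:.2f}")
```

Under this assumption the recomputed p values fall close to the reported P column; small differences are expected because OR and SE are rounded in the table.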
Fig. 6 Single variant and gene-based association test between COVID-19 patients and the general populations.
a Single variant association test and b gene-based association test between the unrelated COVID-19 patients (n = 284) and the 1KGP Chinese population (n = 301). c Single variant association test and d gene-based association test between the unrelated COVID-19 patients (n = 284) and the CNRP Chinese population (n = 665). Only variants with moderate or high impact according to the Variant Effect Predictor are shown in (a) and (c).