Kohei Fujikura, Kazuma Uesaka.
Abstract
AIMS: The recent emergence of the novel, pathogenic severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) poses a global health emergency. Coronaviral entry requires the spike (S)-protein for attachment to the host cell surface, and employs human angiotensin-converting enzyme 2 (hACE2) for entry and transmembrane protease serine 2 (TMPRSS2) for S-protein priming. Although coronaviruses evolve through mutation, it is also essential to understand the host genetic factors. Here, we describe the single nucleotide variations (SNVs) in human ACE2 and TMPRSS2.
Keywords: genetics; infections; viruses
Year: 2020 PMID: 32690544 PMCID: PMC7385749 DOI: 10.1136/jclinpath-2020-206867
Source DB: PubMed Journal: J Clin Pathol ISSN: 0021-9746 Impact factor: 3.411
Figure 1. Genetic variation in ACE2. (A) The distribution of nucleotide polymorphisms along the full-length ACE2 gene. The vertical bar indicates allele frequency (AF) (%). Single nucleotide variations (SNVs) are grouped by type: missense (blue), stop-gained (orange), start-lost (grey), indel (yellow) and splice site (green). The putative functional domains are depicted by coloured boxes. (B) Pie chart of 349 SNVs in ACE2. Each colour code corresponds to a different SNV type. (C) Relative abundance of SNVs plotted over their AFs. SNVs are grouped by type: missense (blue), synonymous (orange), other non-synonymous (grey) and indel (yellow). The inset demonstrates that the majority of SNVs in coding regions were quite rare (48.4%, AF < 0.001%), while 90.3% were classified as rare (0.001% < AF < 0.01%) and only 9.7% had low frequency or were common. (D) Percentage of fraction deleterious. Twenty-three software tools were employed to predict the pathogenicity of missense variants. (E) Non-linear regression fitting was performed on the scatter plot showing the relationship between AF and variation of SNVs. (F) Relationship between total number of SNVs and population size. The number of ACE2 variations is expected to rise as the population size increases. A higher number of rare SNVs could be detected in females compared with males. An enlarged view of the graph is shown in the upper panel. (G) Chow-Ruskey diagrams showing the number of shared and unique genetic variants for the ACE2 gene across four large-scale population studies. (H) Comparison of AFs across four large-scale population studies. ACE2, angiotensin-converting enzyme 2; HEMGH, metalloprotease zinc-binding site; NHLBI, National Heart, Lung, and Blood Institute; TM, transmembrane domain; ToMMo, Tohoku Medical Megabank Organization.
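The frequency classes named in panel (C) can be sketched as a small binning function. This is only an illustration: the thresholds come from the caption, but how a value lying exactly on a threshold is classified, the function names and the example frequencies are all assumptions, not part of the study.

```python
from collections import Counter

# Thresholds in percent, as given in the caption. How a value lying exactly
# on a threshold is treated is an assumption made for this sketch.
QUITE_RARE_MAX = 0.001
RARE_MAX = 0.01


def af_class(af_percent):
    """Return the frequency class for an allele frequency given in percent."""
    if af_percent < QUITE_RARE_MAX:
        return "quite rare"
    if af_percent < RARE_MAX:
        return "rare"
    return "low frequency or common"


def class_fractions(afs_percent):
    """Return the fraction of variants that falls into each class."""
    counts = Counter(af_class(af) for af in afs_percent)
    total = len(afs_percent)
    return {name: count / total for name, count in counts.items()}


# Hypothetical allele frequencies (percent); not data from the study.
example_afs = [0.0004, 0.0008, 0.002, 0.005, 0.03, 1.2]
fractions = class_fractions(example_afs)
```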
Comparison of allele frequency between putative deleterious variants and putative tolerated variants
| | | | | | | | | | | | |
| Deleterious | 0.0008 | 0.0007 | 0.0011 | 0.0011 | 0.0011 | 0.0007 | 0.001 | 0.0011 | 0.001 | 0.0007 | 0.0011 |
| Tolerated | 0.0012 | 0.0011 | 0.0011 | 0.0011 | 0.0011 | 0.0011 | 0.0015 | 0.0011 | 0.0011 | 0.0011 | 0.0011 |
| p-value | 0.0661 | 0.1724 | 0.7992 | 0.2397 | 0.5946 | 0.1396 | 0.0617 | 0.0877 | | 0.1569 | 0.9218 |
| Deleterious | 0.0012 | 0.001 | 0.0011 | 0.0006 | 0.0011 | 0.001 | 0.0006 | NA | 0.001 | 0.0011 | 0.001 | 0.0011 |
| Tolerated | 0.0011 | 0.0016 | 0.0011 | 0.001 | 0.001 | 0.0011 | 0.0011 | 0.0011 | 0.0011 | 0.0016 | 0.0013 | |
| p-value | 0.2381 | 0.085 | 0.9882 | | 0.4816 | 0.333 | 0.1536 | 0.2478 | 0.86 | | 0.7635 | |
| | | | | | | | | | | | |
| Deleterious | 0.0011 | 0.0008 | 0.0008 | 0.0008 | 0.0008 | 0.0008 | 0.0008 | 0.0008 | 0.0008 | 0.0008 | 0.0004 |
| Tolerated | 0.0008 | 0.0008 | 0.0008 | 0.0008 | 0.0011 | 0.0012 | 0.0008 | 0.001 | 0.0011 | 0.0012 | 0.0032 |
| p-value | 0.9983 | 0.4043 | 0.3368 | 0.2947 | | | 0.3384 | 0.4184 | 0.1499 | | |
| Deleterious | 0.0007 | 0.0008 | 0.0006 | 0.0004 | 0.0008 | 0.0008 | 0.0007 | 0.0008 | 0.0008 | 0.0008 | 0.0008 | NA |
| Tolerated | 0.0012 | 0.0012 | 0.0012 | 0.0008 | 0.0012 | 0.0008 | 0.0011 | 0.0011 | 0.0012 | 0.0008 | 0.0012 | |
| p-value | | 0.082 | | 0.0766 | | 0.5259 | | | | 0.5279 | | |
The indicated values correspond to median allele frequency. Twenty-three deleteriousness-prediction software tools were employed in this study. P values were calculated using the Mann-Whitney U test. A value of p<0.05 was considered statistically significant (values in bold).
ACE2, angiotensin-converting enzyme 2; NA, not available; TMPRSS2, transmembrane protease serine 2.
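The comparison described in the table note (median allele frequency per group, Mann-Whitney U test between groups) can be sketched as follows. This is a minimal standard-library illustration, not the authors' pipeline: it uses the normal approximation with tie correction and no continuity correction, which is rough for small samples, and the function names and example frequencies are hypothetical. In practice an established routine such as `scipy.stats.mannwhitneyu` would normally be preferred.

```python
import math
from collections import Counter
from statistics import median


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # sorted positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """Return (U, two-sided p) using the tie-corrected normal approximation."""
    n1, n2 = len(x), len(y)
    combined = list(x) + list(y)
    ranks = average_ranks(combined)
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    u = min(u1, n1 * n2 - u1)
    n = n1 + n2
    tie_term = sum(t ** 3 - t for t in Counter(combined).values())
    variance = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance == 0:  # every observation is identical
        return u, 1.0
    z = (u - n1 * n2 / 2) / math.sqrt(variance)
    return u, math.erfc(abs(z) / math.sqrt(2))


# Hypothetical allele frequencies; not data from the study.
deleterious = [0.0004, 0.0006, 0.0007, 0.0008, 0.0011]
tolerated = [0.0008, 0.0011, 0.0012, 0.0015, 0.0032]
summary = (median(deleterious), median(tolerated), mann_whitney_u(deleterious, tolerated))
```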
Genetic variations in hACE2 detected in the interface between SARS-CoV/SARS-CoV-2 S-protein and hACE2
| SARS-CoV-2 S-protein | SARS-CoV S-protein | hACE2 | Polymorphism in hACE2 | GERP++_RS | phyloP100way_vertebrate | phastCons100way_vertebrate |
| A475, G476 | P462 | S19 | p.Ser19Pro | −0.495 | −0.801 | 0 |
| F456, Y473, A475, Y489 | L443, Y475 | T27 | p.Thr27Ala | 1.85 | 0.345 | 0 |
| Q493 | – | E35 | p.Glu35Lys | −0.741 | −0.368 | 0 |
| Q493 | – | E35 | p.Glu35Asp | 0.813 | −0.753 | 0.013 |
| Y505 | Y491 | E37 | p.Glu37Lys | 5.81 | 2.967 | 0.995 |
| F486 | L472 | M82 | p.Met82Ile | −10.4 | −1.463 | 0 |
| – | R426 | E329 | p.Glu329Gly | 2.8 | 1.025 | 0.001 |
| T500, G502 | T486, T487, G488 | D355 | p.Asp355Asn | 5.34 | 7.905 | 1 |
hACE2, human angiotensin-converting enzyme 2; SARS-CoV-2, severe acute respiratory syndrome coronavirus 2.
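One way to read the conservation columns of the table above is to rank the interface variants by GERP++_RS and flag those at strongly conserved positions. The sketch below copies the values from the table; the phastCons cutoff of 0.9 and the function names are assumptions chosen purely for illustration and are not criteria stated in the study.

```python
# Rows copied from the table above:
# (hACE2 polymorphism, GERP++_RS, phyloP100way_vertebrate, phastCons100way_vertebrate)
INTERFACE_VARIANTS = [
    ("p.Ser19Pro", -0.495, -0.801, 0.0),
    ("p.Thr27Ala", 1.85, 0.345, 0.0),
    ("p.Glu35Lys", -0.741, -0.368, 0.0),
    ("p.Glu35Asp", 0.813, -0.753, 0.013),
    ("p.Glu37Lys", 5.81, 2.967, 0.995),
    ("p.Met82Ile", -10.4, -1.463, 0.0),
    ("p.Glu329Gly", 2.8, 1.025, 0.001),
    ("p.Asp355Asn", 5.34, 7.905, 1.0),
]

# Assumed cutoff, for illustration only; the source states no threshold.
PHASTCONS_CUTOFF = 0.9


def conserved_variants(rows, cutoff=PHASTCONS_CUTOFF):
    """Return variants whose phastCons score is at or above the cutoff."""
    return [name for name, _gerp, _phylop, phastcons in rows if phastcons >= cutoff]


def rank_by_gerp(rows):
    """Return variant names ordered from highest to lowest GERP++_RS."""
    return [row[0] for row in sorted(rows, key=lambda row: row[1], reverse=True)]


conserved = conserved_variants(INTERFACE_VARIANTS)
ranking = rank_by_gerp(INTERFACE_VARIANTS)
```

Under this assumed cutoff, only p.Glu37Lys and p.Asp355Asn are flagged, which matches the two rows where all three scores are high.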
Figure 2. Genetic variation in TMPRSS2. (A) The distribution of nucleotide polymorphisms along the full-length TMPRSS2 gene. The vertical bar indicates allele frequency (AF) (%). Single nucleotide variations (SNVs) are grouped by type: missense (blue), stop-gained (yellow), stop-lost (orange), splice site (green) and indel (grey). The putative functional domains are depicted by coloured boxes. (B) Pie chart of 551 SNVs in TMPRSS2. Each colour code corresponds to a different SNV type. (C) Relative abundance of SNVs plotted over their AFs. SNVs are grouped by type: missense (blue), synonymous (orange), other non-synonymous (grey) and indel (yellow). The inset demonstrates that the majority of SNVs in coding regions were quite rare (49.9%, AF < 0.001%), while 87.3% were classified as rare (0.001% < AF < 0.01%) and only 12.7% had low frequency or were common. (D) Percentage of fraction deleterious. Twenty-three software tools were employed to predict the pathogenicity of missense variants. (E) Non-linear regression fitting was performed on the scatter plot showing the relationship between AF and variation of SNVs. (F) Relationship between total number of SNVs and population size. The number of TMPRSS2 variations is expected to rise as the population size increases. An enlarged view of the graph is shown in the upper panel. (G) Chow-Ruskey diagrams showing the number of shared and unique genetic variants for the TMPRSS2 gene across four large-scale population studies. (H) Comparison of AFs across four large-scale population studies. ACE2, angiotensin-converting enzyme 2; LDLRA, low-density lipoprotein receptor A domain; NHLBI, National Heart, Lung, and Blood Institute; SRCR, scavenger receptor cysteine-rich domain; TM, transmembrane domain; TMPRSS2, transmembrane protease serine 2; ToMMo, Tohoku Medical Megabank Organization.