Marie Cariou1, Léa Picard1,2, Laurent Guéguen2, Stéphanie Jacquet1,2, Andrea Cimarelli1, Oliver I Fregoso3, Antoine Molaro4, Vincent Navratil5,6,7, Lucie Etienne1.
Abstract
The coronavirus disease 19 (COVID-19) pandemic is caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), a coronavirus that spilled over from the bat reservoir. Despite numerous clinical trials and vaccines, the burden remains immense, and the host determinants of SARS-CoV-2 susceptibility and COVID-19 severity remain largely unknown. Signatures of positive selection detected by comparative functional genetic analyses in primate and bat genomes can uncover important and specific adaptations that occurred at virus-host interfaces. We performed high-throughput evolutionary analyses of 334 SARS-CoV-2-interacting proteins to identify SARS-CoV adaptive loci and uncover functional differences between modern humans, primates, and bats. Using DGINN (Detection of Genetic INNovation), we identified 38 bat and 81 primate proteins with marks of positive selection. Seventeen genes, including the ACE2 receptor, present adaptive marks in both mammalian orders, suggesting common virus-host interfaces and past epidemics of coronaviruses shaping their genomes. Yet, 84 genes presented distinct adaptations in bats and primates. Notably, residues involved in ubiquitination and phosphorylation of the inflammatory RIPK1 have rapidly evolved in bats but not primates, suggesting different inflammation regulation versus humans. Furthermore, we discovered residues with typical virus-host arms race marks in primates, such as in the entry factor TMPRSS2 or the autophagy adaptor FYCO1, pointing to host-specific in vivo interfaces that may be drug targets. Finally, we found that FYCO1 sites under adaptation in primates are those associated with severe COVID-19, supporting their importance in pathogenesis and replication. Overall, we identified adaptations involved in SARS-CoV-2 infection in bats and primates, enlightening modern genetic determinants of virus susceptibility and severity.
Keywords: SARS-CoV-2 and COVID-19; comparative genetics; positive selection; primates and bats; virus–host coevolution
Year: 2022 PMID: 35947637 PMCID: PMC9436378 DOI: 10.1073/pnas.2206610119
Source DB: PubMed Journal: Proc Natl Acad Sci U S A ISSN: 0027-8424 Impact factor: 12.779
Fig. 1. Identification of the SARS-CoV-2 interactome with signatures of positive selection (PS) in bats and primates. (A) Overview of the DGINN pipeline to detect adaptive evolution in SARS-CoV-2 VIPs. CDS, coding DNA sequence; ORF, open reading frame. (B) Natural selection acting on bat and primate VIP genes. Comparison of omega (dN/dS) values of the VIPs during bat (y axis) and primate (x axis) evolution, estimated by Bio++ Model M0. In black, the bisector. In red, the linear regression. The names correspond to genes that we comprehensively analyzed (Table 1). (C) Overview of the number of VIPs under significant PS (i.e., by at least three methods in the DGINN screen) in bats and/or primates. A total of 324 genes could be fully analyzed in the two mammalian orders. Numbers represent the number of genes in each category; within each host, No PS or PS is represented by a pictogram. The numbers correspond to the conservative values after visual inspection of the positively selected VIP alignments, while the italic numbers are from the automated screen. (D) Table showing the genes identified by x,y DGINN methods in bats and primates, respectively. For the genes with low DGINN scores (<3), only the number of genes in each category is shown ( for details). Of note, seven primate genes are false positives, as follows: EMC1 (ER membrane protein complex subunit 1), MOV10 (Mov10 RISC complex RNA helicase), POR (cytochrome p450 oxidoreductase), PITRM1 (pitrilysin metallopeptidase 1), RAB14, RAB2A, and TIMM8B (translocase of inner mitochondrial membrane 8 homolog B).
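The criterion in panel C (a gene counts as under significant PS when at least three methods of the DGINN screen support it) can be illustrated with a minimal sketch. The gene names and method counts below are hypothetical placeholders, not values from the study, and the function names are introduced here for illustration only.

```python
from collections import Counter

# Threshold taken from the legend: "by at least three methods in the DGINN screen".
PS_THRESHOLD = 3

# Hypothetical example data: number of methods supporting PS per gene and per order.
# These names and counts are placeholders, NOT results from the paper.
example_counts = {
    "GENE_A": {"bats": 4, "primates": 5},
    "GENE_B": {"bats": 3, "primates": 1},
    "GENE_C": {"bats": 0, "primates": 3},
    "GENE_D": {"bats": 2, "primates": 2},
}


def ps_category(counts, threshold=PS_THRESHOLD):
    """Assign one gene to a panel-C style category from its per-order method counts."""
    in_bats = counts["bats"] >= threshold
    in_primates = counts["primates"] >= threshold
    if in_bats and in_primates:
        return "PS in both"
    if in_bats:
        return "PS in bats only"
    if in_primates:
        return "PS in primates only"
    return "no PS"


def tally(table, threshold=PS_THRESHOLD):
    """Count how many genes fall into each category."""
    return Counter(ps_category(c, threshold) for c in table.values())


if __name__ == "__main__":
    for category, n in sorted(tally(example_counts).items()):
        print(f"{category}: {n}")
```

Note that this reproduces only the automated threshold; the conservative numbers in the figure additionally rely on visual inspection of the alignments, which a count-based rule cannot capture.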
Results from the comprehensive PS analyses of the genes of interest
| Gene | Order | Size | n sp. | Result cells in source column order (identified under PS by x/7 methods / MEME / FUBAR (PP > 0.9) / Bio++ M2 PS ω / Bio++ M2 PSS / Bio++ M8 PS ω / Bio++ M8 PSS / codeml M2 PSS / codeml M8 PSS / aBSREL / PSS aln / PSS in human ref) |
|---|---|---|---|---|
| FYCO1 | bats | 1481 | 18 | 375, 504, 566, 688, 790, 1059, 1092 / — / — / — / — / — / — / 607 / — |
| | | 1500 | 29 | 355, 416, / 4.42 / 1.73 / rhiBie |
| | | 1484 | 17 | 147, / 109, 196, / 59.91 / 2.05 / phyDis, pteAle |
| | | 1451 | 25 | 590, 707, 718, 1082 / — / — / 14.36 / N |
| | | 421 | 17 | — / 257, 258, 275, 278, / — / — / — / — / N |
| PRIM1 | primates | 420 | 26 | 277, 361 / — / — / — / — / — / — / — / — |
| | | 517 | 16 | 5, / 23.14 / 12.34 / na |
| | | 517 | 14 | 10.31 / 2.51 / n = 69 / myoBra, myotis |
| PRIM2 | primates | 513 | 20 | 5, / — / — / — / — / — / — |
| | | 669 | 18 | 127, 230, / 54.40 / 1.61 / myoDav |
| | | 669 | 15 | 127, 230, / 16, 277, / 70.02 / 10.77 / myoDav |
| RIPK1 | primates | 671 | 29 | 491, 493, 592 / — / — / — / — / — / — / 664 / — |
| TMPRSS2 | bats | 496 | 17 | 19, 136, 249, 364, 412, 416 / — / — / — / 1.29 / 49, 216, 364, 413, 416, 435 / — / — |
| | | 563 | 28 | 168, 232, 299, / 117, / 5.70 / 5.19 / N |
| ZNF318 | bats | 2142 | 16 | 127, 586, 851, 1647, 1665, 2015, 2016 / — / — / — / na / na / — / — |
| ZNF318 | bats | 2395 | 11 | 78, 370, 830, 1107, 1588, 2153, 2267 / — / — / — / — / — / — / — / — |
| | | 2108 | 29 | 93, 843, / — / — / — / — / N |
| | | 2496 | 24 | 38, 75, 104, 174, 1372, / 501.94 / 1.45 / n = 68 / rhiBie, micMur, homSap |
For each gene, the results of the comprehensive phylogenetic and PS analyses are presented, as follows: BUSTED, MEME, FUBAR, aBSREL from HYPHY/Datamonkey.com, M1vsM2, M7vsM8, M8avsM8 from Bio++, and M1vsM2, M7vsM8, M8avsM8 from PAML codeml. The genes identified under PS are highlighted in gray. The sites considered under PS after the analyses are in “PSS aln” and “PSS in human ref”, corresponding to the site number in the codon alignment and the corresponding amino acid site in the human reference sequence. Alignments, trees, and interactive table are available at https://virhostnet.prabi.fr/virhostevol/. Table with extended results including statistical P values for each test is in . Legend details: size, length of the codon alignment; n sp., number of species included in the alignment; PSS, PS sites; the cutoff for each method is given in the table; PS omega, the omega value in the PS class (dN/dS > 1) of the given model M2 or M8. ZNF318 and the proteins from the Primase complex are in and in , respectively. For aBSREL, the branch identified under PS is given by the DGINN nomenclature (three letters from the genus and three from the species). na, not available.
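The branch labels reported for aBSREL (for example rhiBie, micMur, homSap) follow the nomenclature described in the legend: three letters from the genus and three from the species. A minimal sketch of how such a label could be built from a binomial name is given below. The function name is hypothetical, the capitalization rule is inferred from the labels visible in the table rather than from DGINN's documentation, and only "Homo sapiens" for homSap is certain; the other species names are assumed expansions of the labels.

```python
def species_code(binomial: str) -> str:
    """Build a six-letter label: 3 letters of the genus + 3 letters of the species.

    The casing (lowercase genus part, capitalized species part) is an assumption
    inferred from labels such as 'homSap' and 'rhiBie' in the table.
    """
    parts = binomial.split()
    if len(parts) < 2:
        raise ValueError(f"expected 'Genus species', got {binomial!r}")
    genus, species = parts[0], parts[1]
    return genus[:3].lower() + species[:3].capitalize()


if __name__ == "__main__":
    # 'Homo sapiens' is certain; the other two names are assumed for illustration.
    for name in ("Homo sapiens", "Rhinopithecus bieti", "Microcebus murinus"):
        print(name, "->", species_code(name))
```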
*For the Bio++ M8 PSS analyses, dozens of sites were identified under PS because of the low omega value in the class ω > 1.
Fig. 2. SARS-CoV-2 VIPs under PS are interacting proteins of other coronaviruses, as well as other viral families. Virus–host protein–protein interaction network of VIP genes under PS and interconnected with (A) other coronaviruses (from alpha- or beta-coronavirus genus), and (B) viral families other than coronaviruses. VIPs interacting with more than one additional viral family are in the Center and arranged in columns (from Left to Right, interconnected with 2 to 6 different viral families). Node sizes at the virus families are proportional to the number of edges. The VIPs not interconnected are shown in .
Fig. 3. TMPRSS2 has evolved under strong PS in primates but not in bats. (A) Role of TMPRSS2 in SARS-CoV-2 entry. (B) Diagram of TMPRSS2 predicted domains, with sites under PS in primates represented by triangles (Table 1). Codon numbering and amino acid residue based on Homo sapiens TMPRSS2. (C) 3D structure modeling of human TMPRSS2 (amino acids 1 to 492) with the positively selected sites (red), the SARS-CoV-2 predicted interface (light blue), and the catalytic site (dark blue). (D) The positively selected sites identified in primate TMPRSS2 are highly variable in primates (Top) but more conserved in bats (Bottom) where they are not identified as under adaptive evolution. Left, cladograms of primate and bat TMPRSS2 with species abbreviation and accession number of sequences. Amino acid color-coding, RasMol properties (Geneious, Biomatters). Icon legend is embedded in the figure, with multicolored pictograms/triangles showing cases fulfilling multiple conditions. (E) Positively selected sites in primates exhibit different patterns of variability in other mammals, as follows: pangolin, carnivores, artiodactyls, and rodents. Right, numbers in brackets correspond to the number of species within the order with the same TMPRSS2 haplotype at these positions (e.g., the QSSKL motif in Mustela putorius was found in 14 rodent species). The corresponding motif in species/cells susceptible or permissive to coronaviruses is shown in .
Fig. 4. Domains of FYCO1 that are associated with severe COVID-19 in humans have also evolved under significant PS in primates but not in bats. (A) Known cellular role of FYCO1. (B) Diagram of FYCO1 predicted domains, with sites under PS in primates represented by triangles (Table 1). Codon numbering and amino acid residue based on Homo sapiens FYCO1. (C) Amino acid variation at the positively selected sites in primates. Left, cladogram of primate FYCO1 with major clades highlighted. The exact species and accession number of sequences are shown in E. Amino acid color-coding, RasMol properties (Geneious, Biomatters). (D) Sites identified in the coding sequence of FYCO1 as under PS in primates (Top) and as associated with severe COVID-19 in humans from two GWAS (Middle: GWAS1, COVID-19 Host Genetics Initiative, 2021; Bottom: GWAS2, Pairo-Castineira et al., 2020). x axis, nucleotide numbering. (E) Amino acid variations in primate species at the sites associated with severe COVID-19 in GWAS.
Fig. 5. The multifunctional and inflammatory RIPK1 protein exhibits strong evidence of adaptation in bats at key regulatory residues. (A) Schematic diagram of the three main functions associated with human RIPK1 in TNF signaling. As part of the TNFR1-associated complex, RIPK1 induces prosurvival signals that notably lead to NF-κB activation. When dissociating from this complex, as a result of multiple events involving both phosphorylation and ubiquitination, RIPK1 can associate with FADD and lead to apoptosis or necrosis. (B) Diagram of RIPK1 domains with the residues under PS in bats (black triangles) with the corresponding position and amino acid residue in human RIPK1 (Table 1). (C) 3D structure prediction of bat (Rhinolophus ferrumequinum) RIPK1, using RaptorX. The protein domains are color coded as in B. Residues under PS are in red and numbered according to their position in bat RIPK1. (D) The positively selected sites identified in bat RIPK1 are highly variable in bats (Top), but more conserved in primates (Bottom), where they are not identified as under adaptive evolution. Left, bat and primate RIPK1 with species abbreviation and accession number of sequences. Amino acid color coding, polarity properties (Geneious, Biomatters). The correspondence of residues from Rhinolophus ferrumequinum bat RIPK1 (gray) to human numbering (black) is shown at the Top. Detailed representation is shown in .
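The correspondence between a site in an alignment (or in one species' protein) and the numbering of a reference sequence, as used for "PSS aln" versus "PSS in human ref" and for the bat-to-human RIPK1 numbering, comes down to counting non-gap characters. The sketch below is a generic illustration, not the authors' procedure: it assumes an alignment in which each column is one residue (or one codon treated as a unit), with "-" as the gap symbol, and the toy sequence and function name are hypothetical.

```python
from typing import Optional


def alignment_to_reference(aligned_ref: str, aln_pos: int, gap: str = "-") -> Optional[int]:
    """Convert a 1-based alignment column to a 1-based position in the ungapped reference.

    Returns None when the reference has a gap in that column, i.e. the site
    has no counterpart in the reference sequence.
    """
    if not 1 <= aln_pos <= len(aligned_ref):
        raise ValueError(f"alignment position {aln_pos} is out of range")
    if aligned_ref[aln_pos - 1] == gap:
        return None
    # Reference position = column index minus the gaps seen up to that column.
    return aln_pos - aligned_ref[:aln_pos].count(gap)


if __name__ == "__main__":
    toy_reference_row = "MK--TLA-Q"  # hypothetical aligned reference sequence
    for column in (1, 3, 5, 9):
        print(column, "->", alignment_to_reference(toy_reference_row, column))
```

Returning None for gapped columns keeps sites without a reference counterpart explicit instead of silently assigning them to a neighboring residue.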