Rachel Nadeau, Soroush Shahryari Fard, Amit Scheer, Emily Hashimoto-Roth, Dallas Nygard, Iryna Abramchuk, Yun-En Chung, Steffany A L Bennett, Mathieu Lavallée-Adam.
Abstract
While the COVID-19 pandemic is causing significant loss of life, knowledge of the effects of the causative SARS-CoV-2 virus on human cells is currently limited. Investigating protein-protein interactions (PPIs) between viral and host proteins can provide a better understanding of the mechanisms exploited by the virus and enable the identification of potential drug targets. We therefore performed an in-depth computational analysis of the interactome of SARS-CoV-2 and human proteins in infected HEK 293 cells published by Gordon et al. (Nature 2020, 583, 459-468) to reveal processes that are potentially affected by the virus and putative protein binding sites. Specifically, we performed a set of network-based functional and sequence motif enrichment analyses on SARS-CoV-2-interacting human proteins and on PPI networks generated by supplementing viral-host PPIs with known interactions. Using a novel implementation of our GoNet algorithm, we identified 329 Gene Ontology terms for which the SARS-CoV-2-interacting human proteins are significantly clustered in PPI networks. Furthermore, we present a novel protein sequence motif discovery approach, LESMoN-Pro, that identified 9 amino acid motifs for which the associated proteins are clustered in PPI networks. Together, these results provide insights into the processes and sequence motifs that are putatively implicated in SARS-CoV-2 infection and could lead to potential therapeutic targets.
Keywords: COVID-19; SARS-CoV-2; clustering; enrichment analysis; gene ontology; graph theory; motif discovery; protein−protein interaction network; statistics
Year: 2020 PMID: 33103435 PMCID: PMC7640966 DOI: 10.1021/acs.jproteome.0c00422
Source DB: PubMed Journal: J Proteome Res ISSN: 1535-3893 Impact factor: 4.466
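The abstract describes testing whether proteins that share an annotation are significantly clustered in a PPI network. The sketch below illustrates that general idea only; it is not the authors' GoNet or LESMoN-Pro implementation. It assumes a mean pairwise shortest-path distance as the clustering score and a simple random-sampling null model, and the toy graph, protein names, and function names are all hypothetical.

```python
import random
from collections import deque
from itertools import combinations


def bfs_distances(graph, source):
    """Shortest-path lengths (in edges) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def mean_pairwise_distance(all_dist, proteins, unreachable):
    """Average shortest-path distance over all pairs (needs >= 2 proteins)."""
    pairs = list(combinations(proteins, 2))
    total = sum(all_dist[a].get(b, unreachable) for a, b in pairs)
    return total / len(pairs)


def clustering_p_value(graph, annotated, n_permutations=1000, seed=0):
    """Empirical p-value: how often a random protein set of the same size
    is at least as tightly clustered as the annotated set."""
    rng = random.Random(seed)
    nodes = sorted(graph)
    all_dist = {n: bfs_distances(graph, n) for n in nodes}
    unreachable = len(nodes)  # assumed penalty for disconnected pairs
    observed = mean_pairwise_distance(all_dist, annotated, unreachable)
    hits = 0
    for _ in range(n_permutations):
        sample = rng.sample(nodes, len(annotated))
        if mean_pairwise_distance(all_dist, sample, unreachable) <= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return observed, (hits + 1) / (n_permutations + 1)


# Hypothetical toy network: two triangles joined by a short path.
graph = {
    "A": {"B", "C"}, "B": {"A", "C"}, "C": {"A", "B", "D"},
    "D": {"C", "E"}, "E": {"D", "F"},
    "F": {"E", "G", "H"}, "G": {"F", "H"}, "H": {"F", "G"},
}
observed, p_value = clustering_p_value(graph, ["A", "B", "C"])
print(observed, p_value)
```

In this toy network the annotated set forms a triangle, so its mean pairwise distance is the minimum possible and few random sets match it. A real analysis would repeat the test for every annotation and correct the resulting p-values for multiple testing.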
Figure 1. Graphical representations of the PPI networks analyzed and the enrichment analysis approaches applied to them. Color-filled proteins in the PPI networks represent SARS-CoV-2 proteins. Proteins without color filling represent H. sapiens proteins interacting with SARS-CoV-2 proteins. Black-filled proteins represent H. sapiens superinteractors.
Figure 2. Enrichment analyses on H. sapiens interactors of a selected set of SARS-CoV-2 proteins. (A–K) GO enrichment analysis of protein interactors of SARS-CoV-2 proteins. GO cellular components are color-coded. p corresponds to an FDR-adjusted p-value. (L) Motifs enriched among the protein sequences of the H. sapiens interactors of SARS-CoV-2 proteins N, nsp13, and orf6. Proteins containing the motifs are listed below the motif logos. Amino acids are color-coded based on their properties.
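A GO enrichment analysis of this kind is commonly framed as an over-representation test. The following is a minimal sketch assuming a one-sided hypergeometric test, which the source does not confirm as the test actually used; the function name and all counts are hypothetical.

```python
from math import comb


def hypergeom_upper_tail(population, annotated, sample, overlap):
    """P(X >= overlap) when `sample` proteins are drawn without replacement
    from `population` proteins, of which `annotated` carry the GO term."""
    max_overlap = min(annotated, sample)
    total = comb(population, sample)
    favorable = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(overlap, max_overlap + 1)
    )
    return favorable / total


# Hypothetical counts: 20 background proteins, 5 annotated with the term,
# 4 interactors of one viral protein, 2 of which carry the term.
p_enrichment = hypergeom_upper_tail(population=20, annotated=5, sample=4, overlap=2)
print(p_enrichment)
```

Exact integer binomial coefficients avoid the rounding issues of factorial-based floating-point formulas, at the cost of speed on very large backgrounds.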
Figure 3. Clustering of functional annotations in the PPI networks. (A) Pie charts of the GO enrichment analysis of MCL clusters from the STRING-augmented network (FDR-adjusted p-value < 0.01). When a GO biological process was enriched in more than one MCL cluster, the lowest FDR-adjusted p-value was used to generate the pie chart. (B) Cell Map location enrichments in MCL clusters from the STRING-augmented (left) and STRING-extended (right) networks (FDR-adjusted p-value < 0.05). (C) Statistical significance of the clustering of GO biological processes according to GoNet in the STRING-augmented network (FDR < 0.01). (A,C) GO biological processes with the highest statistical significance of enrichment are attributed larger pieces of the pie. The portion occupied by each GO term is also shown as a percentage next to the term name. Significant GO annotations were processed with REVIGO[56] to summarize the main annotations and remove redundancy. The REVIGO output was then fed as input to CirGO[57] for pie chart visualization.
Gene Ontology Terms for Which the Proteins Are Clustered in the STRING-Extended Network According to GoNet and Are Enriched for Differential Expression upon SARS-CoV-2 Infection
| GO identifier | GO name | FDR-adjusted p-value | |
|---|---|---|---|
| GO:0034663 | endoplasmic reticulum chaperone complex | 5.28 × 10⁻⁶ | 0.0041 |
| GO:0071407 | cellular response to organic cyclic compound | 0.00013 | 0.041 |
| GO:0009628 | response to abiotic stimulus | 0.00022 | 0.041 |
| GO:0005924 | cell-substrate adherens junction | 0.00035 | 0.041 |
| GO:0005925 | focal adhesion | 0.00035 | 0.041 |
| GO:0030055 | cell-substrate junction | 0.00035 | 0.041 |
| GO:0030054 | cell junction | 0.00036 | 0.041 |
| GO:0005912 | adherens junction | 0.00050 | 0.048 |
| GO:0070161 | anchoring junction | 0.00056 | 0.048 |
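
The values in the table above are reported as FDR-adjusted. As an illustration of how such an adjustment can be computed, here is a minimal sketch assuming the Benjamini-Hochberg procedure; the source does not state which FDR method was used, and the input p-values are hypothetical rather than taken from the table.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values for four GO terms.
raw = [0.01, 0.04, 0.03, 0.005]
adjusted = benjamini_hochberg(raw)
significant = [i for i, q in enumerate(adjusted) if q < 0.05]
print(adjusted, significant)
```

Terms would then be kept when their adjusted value falls below the chosen threshold, such as the 0.05 and 0.01 cutoffs quoted in the figure captions.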
Figure 4. Protein sequence motifs that were enriched in protein clusters detected by MCL in the STRING-augmented PPI network (A) and STRING-extended PPI network (B) (E-value < 0.05). Proteins containing the motifs are listed below the motif logos. Amino acids are color-coded based on their properties.
Figure 5. Family representative sequence motifs for which the associated proteins are significantly clustered in the STRING-augmented network. (A) Complete STRING-augmented network in which proteins containing significantly clustered motifs are larger and labeled. A selected set of representative motifs is shown on the network by coloring the proteins that contain them (FDR < 0.05). (B) All family representative motifs detected by LESMoN-Pro, shown as sequence logos built from their actual occurrences in the associated protein sequences. Amino acids are color-coded based on their properties.