Sergey Ivanov, Alexey Lagunin, Dmitry Filimonov, Olga Tarasova.
Abstract
The interaction of human immunodeficiency virus (HIV) with human cells is responsible for all stages of the viral life cycle, from the infection of CD4+ cells to reverse transcription, integration, and the assembly of new viral particles. To date, a large amount of OMICs data, as well as information from functional genomics screenings regarding the HIV-host interaction, has been accumulated in the literature and in public databases. We processed databases containing HIV-host interactions and found 2910 HIV-1-human protein-protein interactions, mostly related to viral group M subtype B; 137 interactions between human and HIV-1 coding and non-coding RNAs, essential for the viral life cycle and cell defense mechanisms; and 232 transcriptomics, 27 proteomics, and 34 epigenomics HIV-related experiments. Numerous studies on network-based analysis of the corresponding OMICs data have been published in recent years. We review the various types of molecular networks that can be created using OMICs data, including HIV-human protein-protein interaction networks, co-expression networks, gene regulatory and signaling networks, as well as approaches for the analysis of their topology and dynamics. Network-based analysis can be used to determine the critical pathways and key proteins involved in the HIV life cycle, cellular and immune responses to infection, viral escape from host defense mechanisms, and mechanisms mediating differences in human susceptibility to infection. The proteins and pathways identified in these studies represent a basis for developing new anti-HIV therapeutic strategies, such as new drugs preventing infection of CD4+ cells and viral replication, effective vaccines, and "shock and kill" and "block and lock" approaches to cure latent infection.
Keywords: OMICs; human immunodeficiency virus; network analysis; protein–protein interactions; transcriptomics; virus–host interaction
Year: 2020 PMID: 32625189 PMCID: PMC7311653 DOI: 10.3389/fmicb.2020.01314
Source DB: PubMed Journal: Front Microbiol ISSN: 1664-302X Impact factor: 5.640
Public databases containing data on interactions between HIV and human proteins.
| NCBI database2 | 1037 | 842 | – | – | |
| HPIDB | 1668 | 1390 | 25 | 24 | |
| PHISTO | 1978 | 1460 | 27 | 26 | |
| VirHostNet | 1077 | 985 | 13 | 13 | |
| Viruses.STRING | 929 | 827 | 3 | 3 | |
| VirusMentha | 1206 | 1052 | 14 | 13 | |
| Total3 | 2910 | 2051 | 34 | 30 | |
FIGURE 1. Intersections of HIV-1-human PPIs between six databases. The Y-axis represents the numbers of HIV-1-human PPIs that are either unique to a particular database or shared by two, three, four, five, or six databases. The connections between circles at the bottom of the figure represent intersections of PPIs between databases. The unconnected circles represent PPIs that are unique to a particular database. The horizontal bars represent the total numbers of HIV-1-human PPIs in each database.
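The counting behind such an intersection plot can be sketched as follows. This is a minimal illustration, not the authors' procedure: the database names are taken from Table 1, but the PPI pairs (`"H1"`, `"H2"`, and so on) are hypothetical placeholders, and `exclusive_intersections` is a helper name introduced here.

```python
from collections import Counter

# Hypothetical toy data: each database maps to a set of
# (HIV protein, human protein) pairs. These pairs are placeholders,
# not records from the real databases.
ppis_by_db = {
    "HPIDB": {("Tat", "H1"), ("Tat", "H2"), ("Nef", "H3")},
    "PHISTO": {("Tat", "H1"), ("Nef", "H3"), ("Vpr", "H4")},
    "VirHostNet": {("Tat", "H1"), ("Env", "H5")},
}


def exclusive_intersections(sets_by_name):
    """Count items by the exact combination of sets that contain them."""
    membership = {}
    for name, items in sets_by_name.items():
        for item in items:
            membership.setdefault(item, set()).add(name)
    # Each item is counted once, under the full set of databases holding it.
    return Counter(frozenset(names) for names in membership.values())


counts = exclusive_intersections(ppis_by_db)
total_unique = len(set().union(*ppis_by_db.values()))

# Print the largest database combinations first, as in the figure layout.
for combo, n in sorted(counts.items(), key=lambda kv: (-len(kv[0]), sorted(kv[0]))):
    print(" & ".join(sorted(combo)), n)
print("total unique PPIs:", total_unique)
```

Because every pair is assigned to exactly one combination of databases, the per-combination counts sum to the number of unique PPIs, which is how a non-redundant total such as the one in Table 1 would relate to the bars of the plot.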
Distribution of the numbers of PPIs between different HIV-1 groups and subtypes.
| HIV-1 group M subtype A | 3 | 162–350 | 353 | b, d |
| HIV-1 group M subtype B | 27 | 6–1875 | 1910 | a, b, d, f |
| HIV-1 group M subtype C | 1 | 5 | 5 | b, d, f |
| HIV-1 group M subtype D | 5 | 167–347 | 356 | b, d, f |
| HIV-1 group M subtype F1 | 1 | 1 | 1 | b, d |
| HIV-1 group M subtype G | 1 | 1 | 1 | b, d |
| HIV-1 group M subtype H | 1 | 1 | 1 | b, d |
| HIV-1 group M subtype U | 1 | 139 | 139 | d |
| HIV-1 group N | 1 | 1 | 1 | b, d |
| HIV-1 (group unknown) | – | 995 | 995 | a, b, c, d, e, f |
Various types of interactions between HIV-1 and human RNAs presented in public databases.
| Human miRNA–viral mRNA | 43 | 7 | 21 | 49 |
| Human miRNA–human mRNA | 50 | – | 3 | 51 |
| Viral miRNA–human mRNA | 21 | 13 | 2 | 21 |
| Viral miRNA–viral mRNA | 5 | 2 | – | 5 |
| Viral miRNA–host miRNA | 2 | – | – | 2 |
An overview of HIV-related transcriptomic experiments.
FIGURE 2. The general pipeline of network-based analysis of HIV-related OMICs data. Public databases provide access to OMICs data on human and HIV-human protein-protein interactions (HPIDB, PHISTO, VirHostNet, VirusMentha, and others; see Table 1 and Supplementary Table S1), interactions between human and viral coding/non-coding RNAs (ViRBase, VmiReg, and VIRmiRNA databases; Table 3 and Supplementary Table S2), and transcriptomics, proteomics, and epigenomics data (GEO, ArrayExpress, ProteomeXChange, and HIVed databases; Supplementary Tables S3–S6) (purple nodes in the figure). The HIV-related OMICs data can be used to create context-specific protein-protein interaction networks, co-expression networks, and gene regulatory and signaling networks (green nodes in the figure). The context-specific networks can be constructed by weighting protein–protein interactions using transcriptomics, proteomics, or epigenomics data, or by taking into account only differentially expressed genes/proteins (DEGs/DEPs). Co-expression networks can be created from transcriptomics data using weighted gene correlation network analysis (WGCNA). Gene regulatory networks can be inferred from transcriptomics data using reverse engineering approaches, whereas signaling networks are usually created manually by experts from extensive information on protein interactions, post-translational modifications, and other data types. The created networks can be used for different types of analysis (blue nodes in the figure): (1) identification of dense communities in human protein-protein interaction and co-expression networks (clusters or modules) or in HIV-human interaction networks (biclusters), where pathway enrichment analysis applied to the clusters and biclusters identifies the pathways and cellular processes that are essential for the HIV-human interaction; (2) degree and centrality analysis, gene phenotype prioritization analysis, and dynamic modeling with in silico gene knockout, which identify the proteins that are most essential for the HIV-human interaction (host dependency factors) and can be considered potential targets for new anti-HIV therapeutic approaches.
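One step of this pipeline, building a context-specific network by keeping only interactions between differentially expressed genes and then ranking proteins by degree, might look like the sketch below. The edge list, the DEG set, and the function names are all hypothetical; weighting edges by expression values, which the caption also mentions, is not shown.

```python
from collections import defaultdict

# Hypothetical human PPI edges and a hypothetical set of DEGs
# (single-letter names are placeholders, not real gene symbols).
human_ppis = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("D", "E"), ("E", "F")]
degs = {"A", "B", "C", "D", "F"}


def context_specific_edges(edges, active):
    """Keep only interactions whose two partners are both in the active set."""
    return [(u, v) for u, v in edges if u in active and v in active]


def degrees(edges):
    """Number of interactions per protein in an undirected edge list."""
    deg = defaultdict(int)
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return dict(deg)


ctx_edges = context_specific_edges(human_ppis, degs)
ctx_degree = degrees(ctx_edges)

# Rank proteins by degree (ties broken alphabetically); the top entries
# are candidate hubs of the context-specific network.
ranked = sorted(ctx_degree, key=lambda p: (-ctx_degree[p], p))
print(ctx_edges)
print(ranked)
```

Note that a DEG with no surviving interaction partner (here `"F"`) drops out of the network entirely, which is one reason edge weighting is sometimes preferred over hard filtering.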
FIGURE 3. Main topological characteristics of human and HIV–human protein–protein interaction networks. (A) Example of a network with modular structure. Modules are dense communities of nodes that are highly interconnected but weakly connected to other nodes in the network. Red nodes represent “hubs,” which are proteins with a high number of interactions. The green node is an example of a “bottleneck,” which exclusively connects distinct modules. (B) Principle of centrality calculation. Both closeness and betweenness centralities rely on the shortest paths between pairs of nodes in the network. The closeness centrality is the average length of the shortest paths between the protein and all the other proteins in the network (the shortest paths between the red-colored node and nodes 3 and 4 are marked by red arrows). The betweenness centrality quantifies the number of times a protein acts as a bridge along the shortest path between two other proteins in the network (the shortest path between node 1 and node 2 passing through the red-colored node is marked by blue arrows). If the network includes both interactions between human proteins (blue nodes) and interactions between human and HIV proteins (gray nodes), then a modified betweenness centrality can be calculated. It reflects the number of times a human protein acts as a bridge along the shortest path between two HIV proteins (the shortest paths between node 5 and node 6 or node 7 passing through the green-colored node are marked by green arrows). (C) Illustration of a bipartite graph and biclusters. The bipartite graph contains two types of nodes (corresponding to HIV and human proteins) and edges that connect only nodes of different types. The corresponding data can be used to identify biclusters, which contain human proteins that interact with a common set of HIV proteins.
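The centralities described in panel (B) can be illustrated with a small breadth-first-search sketch. The toy network (`V1`, `V2` as HIV proteins, `h1`–`h4` as human proteins) and all function names are assumptions made for illustration. The closeness value follows the caption's wording (average shortest-path length), whereas many toolkits report its reciprocal; betweenness is left unnormalized and counted over unordered pairs.

```python
from collections import deque
from itertools import combinations


def build_adjacency(edges):
    """Undirected adjacency sets from an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def bfs(adj, source):
    """Distances and numbers of shortest paths from source to reachable nodes."""
    dist, sigma = {source: 0}, {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                sigma[w] = 0
                queue.append(w)
            if dist[w] == dist[u] + 1:
                sigma[w] += sigma[u]
    return dist, sigma


def average_path_length(adj, node):
    """Average shortest-path length from node to all other reachable nodes."""
    dist, _ = bfs(adj, node)
    others = [d for n, d in dist.items() if n != node]
    return sum(others) / len(others)


def betweenness(adj, endpoints=None):
    """Fraction of shortest paths between endpoint pairs that pass through
    each node. With endpoints=None, every pair of nodes is considered."""
    nodes = list(adj)
    endpoints = nodes if endpoints is None else list(endpoints)
    info = {s: bfs(adj, s) for s in endpoints}
    score = {v: 0.0 for v in nodes}
    for s, t in combinations(endpoints, 2):
        (dist_s, sigma_s), (dist_t, sigma_t) = info[s], info[t]
        if t not in dist_s:
            continue  # s and t are disconnected
        for v in nodes:
            if v in (s, t) or v not in dist_s or v not in dist_t:
                continue
            if dist_s[v] + dist_t[v] == dist_s[t]:
                score[v] += sigma_s[v] * sigma_t[v] / sigma_s[t]
    return score


# Hypothetical toy network: V1, V2 are HIV proteins; h1..h4 are human proteins.
edges = [("V1", "h1"), ("h1", "h2"), ("h2", "V2"), ("h2", "h3"), ("h3", "h4")]
hiv, human = {"V1", "V2"}, {"h1", "h2", "h3", "h4"}
adj = build_adjacency(edges)

standard = betweenness(adj)
# Modified betweenness: only HIV-HIV shortest paths, scored for human proteins.
modified = {p: s for p, s in betweenness(adj, endpoints=hiv).items() if p in human}
print(standard, modified, average_path_length(adj, "h2"))
```

In this toy network `h2` has the highest standard betweenness because it joins three branches, while the modified score singles out only `h1` and `h2`, the human proteins lying on the path between the two viral proteins.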
FIGURE 4. Co-expression analysis framework. To determine co-expression, the correlation between expression values must be calculated for all gene pairs. Weighted or unweighted co-expression networks can be created from these values. A cluster analysis based on the co-expression values identifies modules (clusters) of co-expressed genes. A co-expression module contains genes that could be regulated by the same transcription factors and have similar biological functions. The genes from each module can be used for pathway and Gene Ontology enrichment analysis to identify the corresponding biological functions. Weighted gene correlation network analysis (WGCNA) is the most popular framework for the creation of co-expression networks. Along with the identification of modules, it can be used to estimate module preservation between two networks created for different conditions, reveal modules associated with a clinical trait of interest, and find intramodular “hubs,” which could be the essential genes regulating the expression of the other genes in the module.
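The first steps of this framework can be sketched in a few lines. The expression values below are invented, and the parameter choices (`beta=6`, `threshold=0.8`) are arbitrary assumptions for the example. The weighted adjacency uses the absolute correlation raised to a power, in the spirit of WGCNA's soft thresholding, but modules are found here as connected components of the hard-thresholded network, which is a deliberate simplification and not the hierarchical clustering that WGCNA itself performs.

```python
from itertools import combinations
from math import sqrt

# Hypothetical expression values for five genes across five samples.
expression = {
    "g1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "g2": [2.0, 4.1, 6.2, 7.9, 10.0],
    "g3": [5.0, 4.0, 3.1, 2.0, 1.0],
    "g4": [3.0, 1.0, 4.0, 1.0, 5.0],
    "g5": [2.9, 1.2, 4.1, 0.8, 5.2],
}


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def coexpression(expr, beta=6, threshold=0.8):
    """Return a weighted adjacency (|r| ** beta) and an unweighted
    adjacency (|r| >= threshold) over all gene pairs."""
    weighted = {}
    unweighted = {g: set() for g in expr}
    for g, h in combinations(expr, 2):
        r = pearson(expr[g], expr[h])
        weighted[(g, h)] = abs(r) ** beta
        if abs(r) >= threshold:
            unweighted[g].add(h)
            unweighted[h].add(g)
    return weighted, unweighted


def modules(adj):
    """Connected components of the unweighted network (simplified modules)."""
    seen, result = set(), []
    for start in adj:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            g = stack.pop()
            if g in component:
                continue
            component.add(g)
            stack.extend(adj[g] - component)
        seen |= component
        result.append(component)
    return result


weighted, unweighted = coexpression(expression)
found = modules(unweighted)
print(found)
```

Because the absolute correlation is used, the anti-correlated gene `g3` lands in the same module as `g1` and `g2`; a signed variant would keep such genes apart. The genes of each module would then be passed to pathway or Gene Ontology enrichment analysis.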