Mariam R Farman, Ivo L Hofacker, Fabian Amman.
Abstract
High-throughput techniques such as RNA-seq or microarray analysis have proven invaluable for characterizing global changes in transcriptional gene activity caused by external stimuli or disease. Differential gene expression analysis (DGEA) is the first step in data interpretation and typically produces lists of dozens to thousands of differentially expressed genes. To further guide the interpretation of these lists, different pathway analysis approaches have been developed. These tools typically rely on classifying genes into gene sets, such as pathways, based on the interactions between the genes and their function in a common biological process. Regardless of their technical differences, these methods do not properly account for cross talk between different pathways, and most of them rely on a binary separation into differentially expressed and unaffected genes based on an arbitrarily set p-value cut-off. To overcome these limitations, we developed a novel approach to identify concertedly modulated sub-graphs in the global cell signaling network, based on the DGEA results of all genes tested. To this end, the expression patterns of genes are integrated according to the topology of their interactions, which potentially allows the flow of information to be read and the effectors to be identified. The described software, named Modulated Sub-graph Finder (MSF), is freely available at https://github.com/Modulated-Subgraph-Finder/MSF.
Keywords: Differential gene expression analysis; cell signalling network; combining p-value; pathway analysis
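The keyword "combining p-value" refers to aggregating the per-gene DGEA p-values of a candidate sub-graph into one score instead of thresholding each gene separately. The abstract does not state which combination method MSF uses, so the following is only a minimal sketch that uses Fisher's method as a stand-in; the function name and the example p-values are hypothetical, and Fisher's method assumes independent p-values, which neighbouring genes in a network may not satisfy.

```python
import math


def fisher_combine(p_values):
    """Combine independent p-values with Fisher's method.

    Stand-in for illustration only; not necessarily the method used by MSF.
    The statistic X = -2 * sum(ln p_i) follows a chi-square distribution
    with 2k degrees of freedom under the null hypothesis.
    """
    if not p_values:
        raise ValueError("need at least one p-value")
    if any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")

    k = len(p_values)
    half_stat = -sum(math.log(p) for p in p_values)  # this is X / 2

    # Closed-form survival function of a chi-square variable with an even
    # number (2k) of degrees of freedom:
    #   P(X >= x) = exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return math.exp(-half_stat) * total


# Hypothetical p-values of three interacting genes; two of them would be
# discarded by a hard 0.05 cut-off although all three point the same way.
example_p_values = [0.04, 0.06, 0.08]
combined = fisher_combine(example_p_values)
```

In this toy case the combined p-value is smaller than any of the individual ones, which illustrates why integrating moderate signals over connected genes can recover members that a fixed cut-off would drop.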
Year: 2018 PMID: 30984370 PMCID: PMC6446500 DOI: 10.12688/f1000research.16005.3
Source DB: PubMed Journal: F1000Res ISSN: 2046-1402
Figure 1. Graphical representation of the MSF heuristic approach to detect modulated sub-graphs in a global gene regulatory network.
Comparison of connected sub-graphs of modulated genes in the global network identified after the analysis with MSF and applying different p-value cut-offs from edgeR to genes in MSF identified modulated sub-graphs.
| | Total number of genes | Number of connected sub-graphs |
|---|---|---|
| edgeR + MSF | 250 | 3 |
| p-value ≤ 0.1 | 166 | 87 |
| p-value ≤ 0.05 | 152 | 89 |
| p-value ≤ 0.01 | 125 | 76 |
| | | |
| edgeR + MSF | 656 | 7 |
| p-value ≤ 0.1 | 457 | 183 |
| p-value ≤ 0.05 | 418 | 198 |
| p-value ≤ 0.01 | 332 | 216 |
| | | |
| edgeR + MSF | 744 | 6 |
| p-value ≤ 0.1 | 514 | 189 |
| p-value ≤ 0.05 | 468 | 206 |
| p-value ≤ 0.01 | 363 | 241 |
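The table above contrasts a few large sub-graphs from edgeR + MSF with the many small fragments left after a plain p-value cut-off. One way to count such fragments is sketched below, assuming that edges are treated as undirected and that an isolated gene counts as a sub-graph of its own; neither assumption is confirmed by the source, and the toy network, gene names and p-values are hypothetical.

```python
from collections import defaultdict


def connected_subgraphs(edges, selected):
    """Return the connected components induced by `selected` genes.

    Edges are treated as undirected; genes without a selected neighbour
    form single-gene components (an assumption of this sketch).
    """
    selected = set(selected)
    neighbours = defaultdict(set)
    for a, b in edges:
        # Keep only edges whose two endpoints both passed the filter.
        if a in selected and b in selected:
            neighbours[a].add(b)
            neighbours[b].add(a)

    seen = set()
    components = []
    for gene in sorted(selected):
        if gene in seen:
            continue
        # Depth-first traversal starting from an unvisited gene.
        stack = [gene]
        seen.add(gene)
        component = set()
        while stack:
            node = stack.pop()
            component.add(node)
            for nxt in neighbours[node] - seen:
                seen.add(nxt)
                stack.append(nxt)
        components.append(component)
    return components


# Hypothetical linear signalling chain A - B - C - D - E.
toy_edges = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E")]
toy_p_values = {"A": 0.001, "B": 0.2, "C": 0.03, "D": 0.04, "E": 0.3}

passing = [g for g, p in toy_p_values.items() if p <= 0.05]
fragments = connected_subgraphs(toy_edges, passing)
whole_chain = connected_subgraphs(toy_edges, toy_p_values)
```

With the cut-off applied, gene B drops out and the chain falls apart into separate fragments, whereas keeping all five genes yields a single connected sub-graph.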
Figure 2. Visualization of the three modulated directed sub-graphs identified by MSF at 6 hpi.
Node colors indicate KEGG pathways, as given in the legend. The graph edges are taken from Reactome.
Figure 3. Recall rates for genes in the MSF-identified sub-graphs at the three time points of the EBOV infection data, over 100 simulations in which Poisson-distributed noise was added to the experimentally deduced reads per gene.
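The robustness test behind Figure 3 can be pictured as two steps: perturb the read counts, then measure how many of the originally reported genes are recovered. The sketch below is an illustration only. It assumes that "adding Poisson noise" means replacing each count by a Poisson draw whose mean is the observed count, and it leaves out the rerun of edgeR and MSF between the two steps; all function names and example values are hypothetical.

```python
import math
import random


def poisson_sample(lam, rng):
    """Draw from Poisson(lam) with Knuth's multiplication method.

    Large means are split into chunks of at most 500, because a sum of
    independent Poisson draws is again Poisson and exp(-chunk) stays
    representable as a float.
    """
    count = 0
    remaining = float(lam)
    while remaining > 0:
        chunk = min(remaining, 500.0)
        threshold = math.exp(-chunk)
        product = rng.random()
        while product > threshold:
            count += 1
            product *= rng.random()
        remaining -= chunk
    return count


def add_poisson_noise(read_counts, rng):
    """Return a noisy copy of a {gene: read count} mapping."""
    return {gene: poisson_sample(n, rng) for gene, n in read_counts.items()}


def recall(reference_genes, recovered_genes):
    """Fraction of the reference genes that are found again."""
    reference = set(reference_genes)
    if not reference:
        raise ValueError("reference gene set must not be empty")
    return len(reference & set(recovered_genes)) / len(reference)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
noisy = add_poisson_noise({"g1": 0, "g2": 1200}, rng)

# Hypothetical gene sets: original result versus result on noisy counts.
example_recall = recall({"A", "B", "C", "D"}, {"A", "B", "X"})
```

In a full simulation the noisy counts would be passed through the same analysis pipeline, and `recall` would be averaged over the repeated runs for each time point.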
Figure 4. The Venn diagram shows the overlap between the genes identified as modulated in the MSF-identified sub-graphs and in the jActiveModules-identified module.
Figure 5. The UpSet plot shows the number of pathways shared between the MSF-identified sub-graph gene lists and the DEG cut-off lists for the three time points.
All 10 different Toll-like receptor cascades are among the 164 pathways shared exclusively between the MSF results at the different time points.