Alicia Amadoz, Marta R Hidalgo, Cankut Çubuk, José Carbonell-Caballero, Joaquín Dopazo.
Abstract
Understanding the aspects of cell functionality that account for disease mechanisms or drug modes of action is a main challenge for precision medicine. Classical gene-based approaches ignore the modular nature of most human traits, whereas conventional pathway enrichment approaches produce only illustrative results of limited practical utility. Recently, a family of new methods has emerged that shifts the focus from whole pathways to the definition of elementary subpathways within them that have mechanistic significance, and to the study of their activities. Thus, mechanistic pathway activity (MPA) methods constitute a new paradigm that allows recoding poorly informative genomic measurements into quantitative values of cell activity and relating them to phenotypes. Here we review the available MPA methods and explain their contribution to systems medicine approaches for addressing challenges in the diagnosis and treatment of complex diseases.
Keywords: disease mechanism; mathematical models; networks; signaling pathways; systems biology; transcriptomics
Year: 2019 PMID: 29868818 PMCID: PMC6917216 DOI: 10.1093/bib/bby040
Source DB: PubMed Journal: Brief Bioinform ISSN: 1467-5463 Impact factor: 11.622
Figure 1. Schematic representation of the three families of methods: enrichment analysis, PT-based analysis and MPA. Conventional enrichment analysis assumes the existence of a background (A) in which an observed percentage of the genes (25% in the example) is differentially expressed (or mutated, associated to a trait, etc.). If gene sets are sampled based on some property shared by all the genes (e.g. they belong to a given pathway) and a scenario (B) is found in which 60% of them are differentially expressed, a simple test will show that this gene set is significantly enriched in differentially expressed genes, whereas in other scenarios (C) the gene set would not differ from a random sample of genes from the background. A PT-based algorithm takes into consideration the topology of the gene set, so a scenario (C) in which the differentially expressed genes are more connected to each other would get a better score than an alternative scenario (D) in which the genes are less connected. The significance of this data set would depend on the algorithm that estimates the score and on the specific test applied. In MPA, there is a more or less specific definition of circuits (subnetworks) within the pathway that should be related to cell activity in some way, and the connectivity of such circuits determines the potential changes in cell activity. If circuits define subnetworks connecting receptor proteins to effector proteins in a signal transduction pathway, the same number of active genes could either allow signal transduction (E) or be incompatible with the arrival of the signal at the corresponding effector proteins (F), even in scenarios that would be significantly enriched according to a conventional enrichment method.
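The "simple test" mentioned for scenarios (B) and (C) can be illustrated with a one-sided hypergeometric test. The following is only a minimal sketch: the background size (1000 genes) and gene set size (20 genes) are hypothetical numbers chosen to match the 25% and 60% proportions of the example, and the function name `hypergeom_upper_tail` is not taken from any of the reviewed tools.

```python
from math import comb

def hypergeom_upper_tail(k, n, K, N):
    """P(X >= k) when n genes are drawn without replacement from a
    background of N genes, K of which are differentially expressed (DE)."""
    total = comb(N, n)
    upper = min(n, K)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / total

# Hypothetical background: 1000 genes, 25% of them DE.
N, K = 1000, 250
n = 20  # hypothetical gene set (pathway) size

p_scenario_b = hypergeom_upper_tail(12, n, K, N)  # 12/20 = 60% DE
p_scenario_c = hypergeom_upper_tail(5, n, K, N)   # 5/20 = 25% DE, same as background

print(f"scenario B p-value: {p_scenario_b:.2e}")
print(f"scenario C p-value: {p_scenario_c:.2f}")
```

Under these assumed numbers the gene set in scenario (B) gets a small p-value, whereas the one in scenario (C) is indistinguishable from a random sample of the background.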
List of mechanistic pathway activity methods
| Method | Date | Code | Pathway modeled | Circuit definition | Scoring method | Activation / inhibition | Input | Result | Scope |
|---|---|---|---|---|---|---|---|---|---|
| Hipathia | 2017 | Web application; Hipathia R code | KEGG | Receptor-to-effector circuits | Propagation algorithm | Yes | MA, RNA-seq | | Multiple analyses |
| MinePath | 2016 | Web application | KEGG | All possible circuits | Discretized gene expression (GE) values with logical operators | Yes | MA, RNA-seq | | Multiple analyses |
| subSPIA | 2015 | R code | KEGG | Minimal spanning trees (MST) | Differentially expressed (DE) genes used to define the MST | No | MA, RNA-seq | | Comparison |
| PathiVar | 2015 | Web application | KEGG | Receptor-to-effector circuits | Probabilistic model | Yes | MA, VCF | | Multiple analyses |
| Pathome | 2014 | NA | KEGG | Receptor-to-effector linear circuits | Correspondence between pattern activation/inhibition and co-expression in adjacent gene pairs | Yes | MA, RNA-seq | | Comparison |
| Pathiways | 2013 | Web application | KEGG | Receptor-to-effector circuits | Probabilistic model | Yes | MA | | Multiple analyses |
| DEAP | 2013 | Python code | KEGG | Receptor-to-effector linear circuits | Running sum of discretized DE | Yes | MA, RNA-seq | Maximally differentially expressed pathway | Comparison |
| CLIPPER | 2013 | R package; ToPASeq R package; Web application | KEGG; Reactome | All possible circuits | Weighted sum of GE | No | MA, RNA-seq | Most relevant circuit per pathway | Comparison |
| TEAK | 2013 | Code @ Google (Windows and Mac) | KEGG | Receptor-to-effector circuits | Fits a Bayesian network for each circuit and uses the BIC | No | MA | Ranked circuits | Comparison |
| PRS | 2012 | ToPASeq R package | KEGG | Trees of associated DE genes | Topologically weighted sum of DE | No | MA, RNA-seq | Ranked subpathways | Comparison |
| DEGraph | 2012 | R package; ToPASeq R package | KEGG; User defined pathways | All possible circuits | Multivariate two-sample tests of means of DE genes within a subgraph | No | MA, RNA-seq | | Comparison |
| Rivera et al. | 2012 | NA | NetPath | All possible circuits | Weighted | No | MA | | Comparison |
| Chen et al. | 2011 | NA | KEGG | Receptor-to-effector circuits | Euclidean distance | No | MA | | Comparison |
| PWEA | 2010 | ToPASeq R package | User defined pathways | All possible circuits | Mutual influence among gene expression within the circuit | No | MA, RNA-seq | | Comparison |
| TopologyGSA | 2010 | ToPASeq R package | User defined pathways | All possible circuits | Comparison of covariance matrices of genes in the circuit | Yes | MA, RNA-seq | | Comparison |
| DEGAS | 2010 | Java (Windows) | KEGG | All possible circuits | Heuristic to find the largest dysregulated circuit | No | MA | One circuit per pathway | Comparison |
| TAPPA | 2007 | ToPASeq R package | KEGG | All possible circuits | Scores of co-expression that explain the compared conditions | No | MA, RNA-seq | | Multiple analyses |
The first column (Method) contains the name or acronym of the method, if one exists; otherwise, the method is referred to by the first author of the publication. The second column (Date) contains the publication date. The third column (Code) reports the availability of code to run the method. The fourth column (Pathway modeled) indicates the database the method uses for pathway definition. The fifth column (Circuit definition) is the type of circuit used by the method. The sixth column (Scoring method) summarizes how the method infers circuit activity. The seventh column (Activation/inhibition) denotes whether the scoring method uses the activation or inhibition information of the nodes. The eighth column (Input) indicates the data type the method takes as input (MA: expression microarray; RNA-seq: counts of RNA-seq experiments; VCF: mutation files). The ninth column (Result) describes the results provided by the method. The tenth column (Scope) indicates the type of analyses the method permits: either only a conventional comparison of two conditions, or a wide range of analyses if the method first recodes gene expression into circuit activities.
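To make the notion of a receptor-to-effector circuit concrete, the toy sketch below checks whether a chain of active genes connects a receptor to an effector, in the spirit of scenarios (E) and (F) of Figure 1. It is a deliberately simplified illustration and not the algorithm of any method in the table: it treats genes as simply active or inactive, ignores activation/inhibition signs and quantitative expression values, and the network, node names and function name `signal_reaches` are all hypothetical.

```python
from collections import deque

def signal_reaches(edges, active, receptor, effector):
    """Return True if a path made only of active nodes links receptor to effector."""
    if receptor not in active or effector not in active:
        return False
    graph = {}
    for src, dst in edges:
        graph.setdefault(src, []).append(dst)
    seen, queue = {receptor}, deque([receptor])
    while queue:  # breadth-first search restricted to active nodes
        node = queue.popleft()
        if node == effector:
            return True
        for nxt in graph.get(node, []):
            if nxt in active and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False

# Hypothetical pathway: receptor R, effector E, intermediate genes A to D.
edges = [("R", "A"), ("A", "B"), ("B", "E"), ("R", "C"), ("C", "D")]

scenario_e = {"R", "A", "B", "E"}  # four active genes forming a complete chain
scenario_f = {"R", "C", "D", "E"}  # four active genes, but no chain reaches E

print(signal_reaches(edges, scenario_e, "R", "E"))
print(signal_reaches(edges, scenario_f, "R", "E"))
```

Both scenarios contain the same number of active genes, so a count-based enrichment test would treat them alike, while the connectivity check separates them.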
Figure 2. True positive rate (TPR), or sensitivity, was computed per method and cancer as the number of significant cancer pathways found when cancer samples are compared with samples of the reference tissue, divided by the total number of cancer pathways (14 for HiPathia and DEAP and 13 for the remaining methods, because the PPAR signaling pathway [hsa03320] was not implemented in them). Violin plots obtained from 12 cancer types show, for each method, the mean TPR as the central dot, the full distribution of results as the outer shape (its thickness indicating how common each value is) and, as the inner layer, the values that occur 95% of the time. The figure shows the methods ranked by TPR value. A Wilcoxon test with Bonferroni correction was used to compare successive TPR distributions and detect significant differences among them. Black lines denote significant differences between consecutive methods. Brackets define groups of methods with no significant differences in their performances.
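The TPR described in this caption reduces to a simple ratio, sketched below under stated assumptions. Only `hsa03320` comes from the caption; the remaining pathway identifiers (`pathway_01` and so on) are placeholders, the list of significant pathways is invented for illustration, and `true_positive_rate` is a hypothetical helper name.

```python
def true_positive_rate(significant, cancer_pathways):
    """Fraction of the known cancer pathways that a method reports as significant."""
    cancer_pathways = set(cancer_pathways)
    if not cancer_pathways:
        raise ValueError("at least one cancer pathway is required")
    return len(set(significant) & cancer_pathways) / len(cancer_pathways)

# 14 cancer pathways for methods that implement hsa03320, 13 otherwise.
placeholders = [f"pathway_{i:02d}" for i in range(1, 14)]  # placeholder IDs
pathways_14 = ["hsa03320"] + placeholders
pathways_13 = placeholders

# Invented result of one cancer-versus-reference-tissue comparison.
significant = ["pathway_01", "pathway_02", "pathway_03", "hsa03320"]

tpr_14 = true_positive_rate(significant, pathways_14)
tpr_13 = true_positive_rate(significant, pathways_13)
print(f"TPR with 14 pathways: {tpr_14:.3f}")
print(f"TPR with 13 pathways: {tpr_13:.3f}")
```

Using a per-method denominator keeps methods that lack the PPAR signaling pathway from being penalized for a pathway they cannot test.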
Figure 3. False positive rate (FPR), i.e. 1 − specificity, was computed per method and cancer as the mean, over 100 bootstraps, of the number of significant cancer pathways found when cancer samples are compared with cancer samples, divided by the total number of KEGG cancer pathways. Violin plots show average values and distributions of the proportions of false discoveries made by each method. The figure shows the methods ranked by FPR value. A Wilcoxon test with Bonferroni correction was used to compare successive FPR distributions and detect significant differences among them. Black lines denote significant differences between consecutive methods. Brackets define groups of methods with no significant differences in their performances.
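The bootstrap estimate of the FPR can be sketched as follows. This is an illustration under assumptions, not the procedure used for the figure: the caption does not say how cancer samples were split into the two compared groups, so a random split into halves is assumed here, and `count_significant` is a hypothetical stand-in for running a given MPA method on two groups and counting the significant cancer pathways.

```python
import random

def bootstrap_fpr(samples, count_significant, n_pathways, n_boot=100, seed=0):
    """Mean proportion of cancer pathways called significant when
    cancer samples are compared with other cancer samples."""
    rng = random.Random(seed)
    rates = []
    for _ in range(n_boot):
        shuffled = list(samples)
        rng.shuffle(shuffled)
        half = len(shuffled) // 2  # assumed split: two random halves
        group_a, group_b = shuffled[:half], shuffled[half:]
        rates.append(count_significant(group_a, group_b) / n_pathways)
    return sum(rates) / len(rates)

# Hypothetical stand-in for an MPA method: always reports 2 significant pathways.
def toy_method(group_a, group_b):
    return 2

cancer_samples = [f"sample_{i}" for i in range(40)]  # placeholder sample IDs
fpr = bootstrap_fpr(cancer_samples, toy_method, n_pathways=14)
print(f"estimated FPR: {fpr:.3f}")
```

Because both groups come from the same condition, any pathway reported as significant in such a comparison counts as a false positive.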
Figure 4. Schema of the mechanisms behind the deactivation of the immune system (A) and the changes in metabolism (B) caused by changes in the blood transcriptome after death and subsequent postmortem cold ischemia [73].
Figure 5. Simultaneous comparison of the sensitivities and specificities of the different MPA methods. The results obtained in the 12 cancers are used to obtain a mean value and an error for each method. The x-axis represents 1 − FPR. Horizontal bars represent, for each point, 1 SD of the FPR of the corresponding method.
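A point in this kind of plot can be derived from per-cancer values as sketched below. The TPR and FPR numbers are invented for illustration, `summarize_method` is a hypothetical helper, and the sample standard deviation is assumed because the caption does not specify which estimator was used.

```python
from statistics import mean, stdev

def summarize_method(tpr_by_cancer, fpr_by_cancer):
    """Mean sensitivity, mean 1 - FPR, and SD of the FPR across cancers."""
    return {
        "mean_tpr": mean(tpr_by_cancer),
        "mean_one_minus_fpr": mean(1 - f for f in fpr_by_cancer),
        "sd_fpr": stdev(fpr_by_cancer),  # assumed: sample standard deviation
    }

# Invented per-cancer results for a single method (12 cancers).
tpr_values = [0.50, 0.57, 0.64, 0.71, 0.43, 0.57, 0.64, 0.50, 0.79, 0.57, 0.64, 0.71]
fpr_values = [0.02, 0.05, 0.01, 0.03, 0.04, 0.02, 0.06, 0.03, 0.02, 0.01, 0.04, 0.03]

point = summarize_method(tpr_values, fpr_values)
print(point)
```

The mean values give the coordinates of the method in the plot, and `sd_fpr` gives the half-length of its horizontal bar.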