Mario Grassi, Barbara Tarantino.
Abstract
BACKGROUND: Pathway enrichment analysis is extensively used in high-throughput experimental studies to gain insight into the functional roles of pre-defined subsets of genes, proteins and metabolites. Methods that leverage information on the topology of the underlying pathways outperform simpler methods that only consider pathway membership. Among the many proposed software tools, there is still a need for one that combines high statistical power with a user-friendly framework, which makes it difficult to choose the best method for a particular experimental environment.
Keywords: Pathway enrichment analysis; Pathway topology; Power; Prioritization; SEM; SEMgsa; Sensitivity; Type I error
Year: 2022 PMID: 35978279 PMCID: PMC9385099 DOI: 10.1186/s12859-022-04884-8
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.307
Fig. 1 Visualisation of the SEMgsa() procedure starting from the Asthma KEGG pathway. The first graph summarises the Asthma network properties, showing a pathway consisting of 31 nodes, 4 edges and 25 singletons. To maximise pathway information, the SEMgsa procedure adds a binary group node (G = {0, 1}) that directly affects the set of genes in the pathway. In this way, a pathway with numerous singleton genes is augmented with a group node and group-to-gene edges, resulting in a connected graph
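The effect of the group node described in the Fig. 1 caption can be illustrated with a small sketch. It is not the SEMgraph implementation: the gene names and the four toy edges are hypothetical placeholders chosen only to reproduce the counts in the caption (31 nodes, 4 edges, 25 singletons), and it assumes the group node G points to every gene in the pathway.

```python
from collections import deque

def singletons(nodes, edges):
    """Return the nodes that are not touched by any edge."""
    touched = {n for edge in edges for n in edge}
    return [n for n in nodes if n not in touched]

def add_group_node(nodes, edges, group="G"):
    """Add a group node with one directed edge to every gene (assumption)."""
    return nodes + [group], edges + [(group, n) for n in nodes]

def is_weakly_connected(nodes, edges):
    """Breadth-first search over the graph, ignoring edge direction."""
    adj = {n: set() for n in nodes}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    seen = {nodes[0]}
    queue = deque([nodes[0]])
    while queue:
        current = queue.popleft()
        for nxt in adj[current]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return len(seen) == len(nodes)

# Hypothetical toy pathway: 31 genes, 4 edges, hence 25 singletons.
genes = [f"gene{i}" for i in range(31)]
toy_edges = [("gene0", "gene1"), ("gene1", "gene2"),
             ("gene3", "gene4"), ("gene4", "gene5")]

aug_nodes, aug_edges = add_group_node(genes, toy_edges)
print(len(singletons(genes, toy_edges)))          # singletons before adding G
print(is_weakly_connected(genes, toy_edges))      # connectivity before adding G
print(is_weakly_connected(aug_nodes, aug_edges))  # connectivity after adding G
```

Under these assumptions, every singleton gains exactly one incoming edge from G, which is what turns the sparse pathway into a single connected graph.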
Overall pathway perturbation
| Up/down regulation | Node perturbation | Overall perturbation |
|---|---|---|
| + 1 | P− (inh) | Down act |
| − 1 | P− (inh) | Up inh |
| + 1 | P+ (act) | Up act |
| − 1 | P+ (act) | Down inh |
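The table above is a four-way lookup, which can be sketched directly in code. The encoding used here is an assumption made for illustration: regulation is the integer +1 or -1, and node perturbation is the string "act" for P+ or "inh" for P−. The function name is hypothetical and not part of any package.

```python
# Lookup mirroring the table: (up/down regulation, node perturbation) -> overall.
# The keys use a hypothetical encoding: +1/-1 and "act" (P+) / "inh" (P-).
OVERALL_PERTURBATION = {
    (+1, "inh"): "Down act",
    (-1, "inh"): "Up inh",
    (+1, "act"): "Up act",
    (-1, "act"): "Down inh",
}

def overall_perturbation(regulation, node_perturbation):
    """Return the overall pathway perturbation label for one combination."""
    key = (regulation, node_perturbation)
    if key not in OVERALL_PERTURBATION:
        raise ValueError(f"unsupported combination: {key!r}")
    return OVERALL_PERTURBATION[key]

print(overall_perturbation(+1, "inh"))
print(overall_perturbation(-1, "act"))
```

A dictionary keeps the mapping identical to the table row by row, so any other combination raises an error instead of silently returning a label.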
Summary of simulation design () with 100 randomizations per design level
| Topology design | Gene regulation | Mean signal | ||
|---|---|---|---|---|
| Betweenness | Up/down | 100 | 100 | 100 |
| Community | Up/down | 100 | 100 | 100 |
| Neighbourhood | Up/down | 100 | 100 | 100 |
Overview of tested pathway enrichment methods
| Method | Null hypothesis | Gene | Expression data | Pathway | R/Bioconductor [References] |
|---|---|---|---|---|---|
| SEMgsa | Self-contained | No | Yes | Topology | SEMgraph 1.1.1 [ |
| DEGraph | Self-contained | No | Yes | Topology | DEGraph 1.46.0 [ |
| TopologyGSA | Self-contained | No | Yes | Topology | TopologyGSA 1.4.7 [ |
| NetGSA | Self-contained | No | Yes | Topology | netgsa 4.0.3 [ |
| PathwayExpress | Competitive | Optional | No | Topology | ROntoTools 2.23.0 [ |
| ORA | Competitive | Yes | No | Membership | EnrichmentBrowser 2.25.3 [ |
Benchmark results on Coronavirus disease (COVID-19) and frontotemporal dementia (FTD)
| Metric | Method | Coronavirus disease (COVID-19) | Frontotemporal dementia (FTD)* |
|---|---|---|---|
| Sensitivity | SEMgsa() | | |
| Sensitivity | DEGraph | 0.771 | 0.653 |
| Sensitivity | NetGSA | 0.011 | 0.563 |
| Sensitivity | ORA | 0.709 | 0.375 |
| Sensitivity | PathwayExpress | 0.444 | 0.981 |
| Sensitivity | TopologyGSA | 0.740 | |
| Prioritization | SEMgsa() | | |
| Prioritization | DEGraph | 90 | 39 |
| Prioritization | NetGSA | 63 | 44 |
| Prioritization | ORA | 83 | 56 |
| Prioritization | PathwayExpress | 32 | 100 |
| Prioritization | TopologyGSA | 13 | 58 |
*Since the term Frontotemporal lobar degeneration (an alias for FTD; KEGG ID: H00078) is associated with 6 KEGG pathways, the sensitivity and prioritisation metrics have been aggregated by taking the median across those pathways. Results for SEMgsa() have been highlighted in bold
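The median aggregation mentioned in the footnote can be sketched as follows. The pathway labels and scores are made-up placeholders, not values from the benchmark, and the function name is hypothetical.

```python
from statistics import median

def aggregate_by_median(per_pathway_scores):
    """Collapse one metric measured on several pathways into a single value."""
    if not per_pathway_scores:
        raise ValueError("at least one pathway score is required")
    return median(per_pathway_scores.values())

# Hypothetical sensitivity scores for six pathways linked to one disease term.
toy_sensitivity = {
    "pathway_A": 0.2,
    "pathway_B": 0.4,
    "pathway_C": 0.6,
    "pathway_D": 0.8,
    "pathway_E": 0.5,
    "pathway_F": 0.7,
}

disease_level_score = aggregate_by_median(toy_sensitivity)
print(disease_level_score)
```

With an even number of pathways, `statistics.median` returns the mean of the two middle values, so the aggregated score need not coincide with any single pathway's score.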
Fig. 2 Average type I error on the 10 KEGG pathways, grouped by method and mean signal, on simulated data. The average type I error, together with its standard deviation across simulations, is displayed for each method. Lower type I error indicates better performance. At the 0.05 significance level, all methods control the type I error rate across the 10 pathways under different levels of mean signal
Fig. 3 Average statistical power on the 10 KEGG pathways, grouped by method and mean signal, on simulated data. The average power, together with its standard deviation across simulations, is displayed for each method. Higher power indicates better performance. SEMgsa stands out among all methods, with 90–100% power across simulations. NetGSA and topologyGSA approach SEMgsa, with about 75% statistical power, only at a differential mean level of 0.7
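Both quantities plotted in Fig. 2 and Fig. 3 are conventionally estimated as rejection rates over repeated simulations. The sketch below assumes that a test is rejected when its p-value is strictly below 0.05 and uses made-up p-values; it illustrates the general calculation rather than the authors' simulation code.

```python
from statistics import mean, stdev

def rejection_rate(p_values, alpha=0.05):
    """Fraction of simulated tests with p-value below alpha.

    On data simulated without signal this estimates the type I error;
    on data simulated with signal it estimates the statistical power.
    """
    if not p_values:
        raise ValueError("at least one p-value is required")
    return sum(p < alpha for p in p_values) / len(p_values)

def summarise(rates):
    """Mean and standard deviation of rejection rates, e.g. across pathways."""
    return mean(rates), stdev(rates)

# Hypothetical p-values from 100 randomizations of one pathway.
null_p_values = [(i + 0.5) / 100 for i in range(100)]  # no signal, evenly spread
signal_p_values = [0.001] * 90 + [0.5] * 10            # strong signal

type_one_error = rejection_rate(null_p_values)
power = rejection_rate(signal_p_values)
print(type_one_error, power)

# Hypothetical per-pathway power estimates, summarised as in the figures.
avg_power, sd_power = summarise([0.90, 0.95, 1.00])
print(avg_power, sd_power)
```

A method controls the type I error at the 0.05 level when the first rate stays at or below 0.05, while a higher second rate indicates better power.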
Overall pathway perturbation of KEGG pathways related to Coronavirus disease (COVID-19) and frontotemporal dementia (FTD)
| Disease | KEGG pathway | Overall perturbation |
|---|---|---|
| Coronavirus disease (COVID-19) | Coronavirus disease - COVID-19 | Up act |
| Frontotemporal dementia (FTD) | Protein processing in endoplasmic reticulum | Up act |
| Frontotemporal dementia (FTD) | Endocytosis | NA |
| Frontotemporal dementia (FTD) | Neurotrophin signaling pathway | Up act |
| Frontotemporal dementia (FTD) | Wnt signaling pathway | NA |
| Frontotemporal dementia (FTD) | MAPK signaling pathway | Up act |
| Frontotemporal dementia (FTD) | Notch signaling pathway | Down act |