Joanna Zyla1,2, Michal Marczyk1,3, Teresa Domaszewska2, Stefan H E Kaufmann2, Joanna Polanska1, January Weiner2.
Abstract
MOTIVATION: Analysis of gene set (GS) enrichment is an essential part of functional omics studies. Here, we complement the established evaluation metrics of GS enrichment algorithms with a novel approach to assess the practical reproducibility of scientific results obtained from GS enrichment tests when applied to related data from different studies.
Year: 2019 PMID: 31165139 PMCID: PMC6954644 DOI: 10.1093/bioinformatics/btz447
Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937
Fig. 1. Examples of tmod package graphical illustrations of enrichment results. (A) A panel plot, which allows presentation of a large number of comparisons; (B) a tag cloud for enriched GS; (C) an evidence plot for a selected GS, where the AUC corresponds to effect size; (D) principal component analysis combined with enrichment, which allows functional annotation of the components
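The relation between AUC and effect size mentioned for panel (C) can be illustrated with a minimal Python sketch. It assumes a gene list ordered from most to least differentially expressed, without ties, and computes the AUC as the fraction of (GS gene, non-GS gene) pairs in which the GS gene is ranked closer to the top. The function name `gene_set_auc` and the toy gene names are hypothetical; this is a generic rank-based AUC, not necessarily the exact computation used by tmod.

```python
def gene_set_auc(ranked_genes, gene_set):
    """Rank-based AUC of a gene set within an ordered gene list (no ties).

    ranked_genes: genes ordered from most to least differentially expressed.
    gene_set: collection of genes belonging to the gene set of interest.
    """
    members = set(gene_set)
    # 1-based ranks of the gene set members, in increasing order
    ranks = [i + 1 for i, gene in enumerate(ranked_genes) if gene in members]
    n_in = len(ranks)
    n_out = len(ranked_genes) - n_in
    if n_in == 0 or n_out == 0:
        raise ValueError("need at least one gene inside and one outside the set")
    # For the j-th member at rank r, (r - j) non-members are ranked above it,
    # so (n_out - (r - j)) non-members are ranked below it.
    pairs_won = sum(n_out - (r - j) for j, r in enumerate(ranks, start=1))
    return pairs_won / (n_in * n_out)


ranking = ["g1", "g2", "g3", "g4", "g5", "g6"]  # hypothetical ordered gene list
auc_top = gene_set_auc(ranking, {"g1", "g2"})    # all members at the top
auc_mixed = gene_set_auc(ranking, {"g1", "g4"})  # members spread over the list
```

An AUC near 1 means the GS genes are concentrated at the top of the list, while an AUC near 0.5 means they are spread roughly at random.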
Fig. 2. Scheme of the reproducibility analysis performed for all tested algorithms
Fig. 3. Comparison of the impact of different gene ranking metrics (top row) and sample sizes (bottom row) on the surrogate sensitivity (lower is better) and FPR (closer to 5% is better) for the CERNO algorithm. Panels (A) and (B) show the impact of different ranking metrics on surrogate sensitivity and FPR, respectively, at various sample sizes. Panels (C) and (D) show the impact of sample size on surrogate sensitivity and FPR for the MSD metric only. The red line in panels (B) and (D) marks the expected FPR level
Sensitivity, FPR, prioritization, computational time and reproducibility of tested algorithms
| Algorithm | Sensitivity | FPR | Time [s] | Prioritization | Reproducibility |
|---|---|---|---|---|---|
| CERNO | 0.949 | 3.602 | 5.987 | 18.73 |  |
| GeneSetTest | 0.979 | 4.215 | 132.557 | 14.88 | 40.84 |
| GLOBAL TEST | 0.994 | 0.486 |  | 28.38 | 35.34 |
| GSEA | 0.900 | 2.696 | 289.216 | 19.35 | 38.60 |
| GSVA | 0.496 | 3.124 | 6.335 | 40.11 | 37.65 |
| ORA | 0.896 |  | 11.058 | 27.07 | 36.96 |
| PADOG | 0.996 | 0.082 | 71.682 |  | 39.25 |
| PLAGE |  | 3.309 | 4.508 | 23.49 | 33.84 |
| Wilcoxon GST | 0.995 | 4.601 | 132.557 | 17.06 | 36.83 |
Note: Higher values of sensitivity and reproducibility are better; lower values of FPR, prioritization and time are better. For each column, the best value is shown in bold.
Fig. 4. Average percentage of significant pathways for each algorithm at various P-value thresholds across six ccRCC datasets. The black, dashed, vertical line represents the Bonferroni correction for multiple testing
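The quantity plotted in Fig. 4 can be illustrated with a minimal Python sketch. It assumes a plain list of per-GS P-values for one algorithm on one dataset; the function names and the example P-values are hypothetical and not taken from the study. The Bonferroni threshold is taken as the nominal significance level divided by the number of tested gene sets.

```python
def bonferroni_threshold(alpha, n_tests):
    """P-value threshold after Bonferroni correction for n_tests tests."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


def percent_significant(p_values, threshold):
    """Percentage of P-values at or below the given threshold."""
    if not p_values:
        raise ValueError("p_values must not be empty")
    hits = sum(1 for p in p_values if p <= threshold)
    return 100.0 * hits / len(p_values)


# Hypothetical P-values for five gene sets (illustration only)
p_values = [0.0001, 0.004, 0.03, 0.2, 0.6]

nominal = 0.05
corrected = bonferroni_threshold(nominal, len(p_values))

pct_nominal = percent_significant(p_values, nominal)      # uncorrected threshold
pct_corrected = percent_significant(p_values, corrected)  # Bonferroni threshold
```

Averaging such percentages over datasets, for a range of thresholds, would give one curve per algorithm of the kind the caption describes.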
Fig. 5. Cluster heatmap of normalized evaluation statistics on 28 datasets with unpaired design. Blue represents good, gray medium and red poor evaluation. Numbers next to the algorithm names represent the overall rank from the best (1) to the worst (9) performance. The dendrogram corresponds to hierarchical clustering based on Euclidean distance
Summary of algorithm evaluation and their flexibility
| Algorithm | CERNO | GeneSetTest | GLOBALTEST | GSEA | GSVA | ORA | PADOG | PLAGE | Wilcoxon GST |
|---|---|---|---|---|---|---|---|---|---|
| Sensitivity | + | + | + | + | −− | + | ++ | + | + |
| FPR | − | −− | ++ | +− | − | + | + | − | −− |
| Time | + | − | ++ | −− | + | + | +− | + | − |
| Prioritization | + | + | − | + | −− | +− | ++ | + | + |
| Reproducibility | ++ | + | − | + | + | − | + | −− | − |
| Sensitive to GS size | No | No | Yes | No | No | No | No | Yes | No |
| Sensitive to sample size | No | No | Yes! | No | No | No | No | Yes | No |
| Data input | Ordered gene list | Ordered gene list | Expression matrix + class labels | Expression matrix + class labels | Expression matrix | List of DEGs and background | Expression matrix + class labels | Expression matrix | Ordered gene list |
| GS input | Built-in modules or user input | User input | Built-in GSs only | User input | User input | User input | User input | User input | User input |
| Alternative ranking metrics | Yes | Yes | No | Yes | N.A. | Yes | No | N.A. | Yes |
Note: Columns correspond to the algorithms and rows to selection criteria.
Based on available implementation.
N.A., not applicable. The +/− assessment was based on Table 2 and Figure 5. Double symbols (++/−−) are assigned only to the best and the worst algorithm in each category. Otherwise, the symbols were assigned according to the color class from Figure 5.