Antonio Fabregat, Konstantinos Sidiropoulos, Guilherme Viteri, Oscar Forner, Pablo Marin-Garcia, Vicente Arnau, Peter D'Eustachio, Lincoln Stein, Henning Hermjakob.
Abstract
BACKGROUND: Reactome aims to provide bioinformatics tools for visualisation, interpretation and analysis of pathway knowledge to support basic research, genome analysis, modelling, systems biology and education. Pathway analysis methods have a broad range of applications in physiological and biomedical research; from the point of view of analysis performance, one of the main problems is the constantly increasing size of the data samples.
Keywords: Data structures; Over-representation analysis; Pathway analysis
Year: 2017 PMID: 28249561 PMCID: PMC5333408 DOI: 10.1186/s12859-017-1559-2
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Fig. 1 Radix tree representation for the identifiers P60484, P60467, P60468, P29172, P11087, P11086, P10639, P10636, P10635, P10622, P10620, P12939, P12938, P12931, P05480, P05386, PTEN
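The radix tree in Fig. 1 can be illustrated with a minimal sketch. This is not the Reactome implementation; the names `RadixNode`, `insert` and `contains` are hypothetical, and the sketch only shows how identifiers that share a prefix end up sharing edges.

```python
import os


class RadixNode:
    """Node of a radix (compressed prefix) tree. Hypothetical helper class."""

    def __init__(self):
        self.children = {}     # edge label (string) -> RadixNode
        self.terminal = False  # True if a stored identifier ends here


def insert(root, key):
    """Insert an identifier, splitting an edge when only part of it matches."""
    node = root
    while key:
        for label in list(node.children):
            common = os.path.commonprefix([label, key])
            if not common:
                continue
            child = node.children[label]
            if common != label:
                # Split the edge: the shared prefix gets its own node.
                mid = RadixNode()
                mid.children[label[len(common):]] = child
                del node.children[label]
                node.children[common] = mid
                child = mid
            key = key[len(common):]
            node = child
            break
        else:
            # No edge shares a prefix with the rest of the key: add a leaf.
            leaf = RadixNode()
            leaf.terminal = True
            node.children[key] = leaf
            return
    node.terminal = True


def contains(root, key):
    """Return True only if the exact identifier was inserted."""
    node = root
    while key:
        for label, child in node.children.items():
            if key.startswith(label):
                node, key = child, key[len(label):]
                break
        else:
            return False
    return node.terminal


# Identifiers taken from the caption of Fig. 1.
IDENTIFIERS = [
    "P60484", "P60467", "P60468", "P29172", "P11087", "P11086",
    "P10639", "P10636", "P10635", "P10622", "P10620", "P12939",
    "P12938", "P12931", "P05480", "P05386", "PTEN",
]

root = RadixNode()
for identifier in IDENTIFIERS:
    insert(root, identifier)

print(contains(root, "PTEN"))   # a stored identifier
print(contains(root, "P6048"))  # only a prefix of a stored identifier
```

Because every identifier in the caption starts with `P`, the root ends up with a single outgoing edge, and a lookup costs at most one step per character of the query regardless of how many identifiers are stored.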
Fig. 2 Graph representation where P are proteins, C are complexes, S are sets, and primed nodes are their counterparts in other species. a One-species graph. b Relation between two species. c Base node content
Fig. 3 Double-linked tree to represent the event hierarchy in Reactome. The root node defines the species and its children represent the different pathways and sub-pathways in Reactome. Each node contains the pathway identifier, name, the total curated entities and the number of entities found in the user’s sample
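A minimal sketch of the double-linked tree in Fig. 3 follows. It assumes that a hit recorded in a sub-pathway is also counted in every ancestor by following the parent links; that propagation rule, the class name `PathwayNode`, and the pathway identifiers, names and totals used below are illustrative assumptions, not Reactome data or code.

```python
class PathwayNode:
    """Hierarchy node linked to both its parent and its children."""

    def __init__(self, identifier, name, total_entities, parent=None):
        self.identifier = identifier
        self.name = name
        self.total_entities = total_entities  # curated entities in the pathway
        self.found = set()                    # distinct sample entities found
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)

    def add_hit(self, entity):
        """Record a hit here and in every ancestor (assumed propagation rule)."""
        node = self
        while node is not None:
            node.found.add(entity)
            node = node.parent

    def walk(self):
        """Yield this node and all descendants, depth first."""
        yield self
        for child in self.children:
            yield from child.walk()


# Made-up hierarchy: the root stands for the species, as in the caption.
species = PathwayNode("root", "Homo sapiens", 0)
signalling = PathwayNode("PW-1", "Example pathway", 40, parent=species)
sub = PathwayNode("PW-1.1", "Example sub-pathway", 12, parent=signalling)
other = PathwayNode("PW-2", "Another example pathway", 25, parent=species)

sub.add_hit("PTEN")
sub.add_hit("P60484")
sub.add_hit("PTEN")  # a repeated identifier is counted once

for node in species.walk():
    print(node.identifier, len(node.found), "of", node.total_entities)
```

Storing the found entities as a set rather than a counter avoids counting the same entity twice when it reaches a parent through more than one sub-pathway.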
Fig. 4 Representation of two analysis use cases joining the different data structures. In red, an analysis performed using the projection to human. In green, an analysis performed without projection
Comparison of resources providing analysis methods and accessibility
| Resource | Analysis methods | Online tool | Programmatic access | Processing time: Hippocampal atrophy, 79 genes (a) | Processing time: Migraine disorder, 644 genes (b) | Processing time: Parkinson’s disease, 1492 genes (c) | Processing time: Multiple sclerosis, 2570 genes (d) | Processing time: Inflammatory bowel disease, 4110 genes (e) |
|---|---|---|---|---|---|---|---|---|
| PANTHER | ORA | ✔ | – | ~2 s | ~4 s | ~6 s | ~8 s | ~12 s |
| Consensus PathDB | ORA | ✔ | SOAP/WSDL | ~1 min | ~1 min | ~3 min | ~3 min | ~1 min |
| DAVID | ORA | ✔ | SOAP/WSDL | ~4 s | ~4 s for conversion of official gene ids to 7498 DAVID ids. Analysis not performed - sample size limitation | ~5 s for conversion of official gene ids to 17272 DAVID ids. Analysis not performed - sample size limitation | ~8 s for conversion of official gene ids to 29420 DAVID ids. Analysis not performed - sample size limitation | Not performed - sample size limitation |
| GSEA | ORA | – | – | – | – | – | – | |
| REACTOME v1.0 | ORA | ✔ | – | ~2 min | ~7 min | ~12 min | ~19 min | ~25 min |
| REACTOME v2.0 | ORA | ✔ | REST | ~1 s | ~1 s | ~2 s | ~2 s | ~3 s |
Comparison of resources by whether they provide analysis methods that are accessible online (through a user interface or programmatic access) and by the average response time for a predefined sample. For the comparison of processing time, only the first column of the test sets (the gene identifiers) has been used. Datasets are available in
a https://www.targetvalidation.org/disease/EFO_0005039/associations (accessed 13/07/2016)
b https://www.targetvalidation.org/disease/EFO_0003821/associations (accessed 13/07/2016)
c https://www.targetvalidation.org/disease/EFO_0002508/associations (accessed 13/07/2016)
d https://www.targetvalidation.org/disease/EFO_0003885/associations (accessed 13/07/2016)
e https://www.targetvalidation.org/disease/EFO_0003767/associations (accessed 13/07/2016)