Anna Marco-Ramell, Magali Palau-Rodriguez, Ania Alay, Sara Tulipani, Mireia Urpi-Sarda, Alex Sanchez-Pla, Cristina Andres-Lacueva.
Abstract
BACKGROUND: Bioinformatic tools for the enrichment of 'omics' datasets facilitate the interpretation and understanding of data. To date, few are suitable for metabolomics datasets. The main objective of this work is to give, for the first time, a critical overview of the performance of these tools. To that end, datasets from metabolomic repositories were selected and enriched data were created. Both types of data were analysed with these tools and the outputs were thoroughly examined.
Keywords: Bioinformatic tools; Database; Enrichment; HumanCyc; KEGG; Metabolite; Metabolomics; Over-representation analysis; Pathway; Reactome
Year: 2018 PMID: 29291722 PMCID: PMC5749025 DOI: 10.1186/s12859-017-2006-0
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Main characteristics of the datasets used, extracted from the repository MetabolomeXchange
| Dataset | Repository reference | Condition of study | Metabolomic platform | Significant metabolites in publication | Total metabolites analysed by authors | Reference |
|---|---|---|---|---|---|---|
| 1 | ST000091 | Type 1 diabetes | LC(RP)-MS | 8 | 44 | [ |
| 2 | ST000383 | Type 2 diabetes and obesity | GC-MS | 27 | 106 | [ |
| 3 | MTBLS364 | Smokers | NMR, LC(HILIC-/RP)-MS | 81 | – | [ |
| 4 | MTBLS424 | Breast cancer | NMR | 22 | 25 | [ |
| 5 | ST000284 | Colorectal cancer | LC(RP)-MS | 42 | 113 | [ |
Abbreviations: GC gas chromatography, HILIC hydrophilic interaction liquid chromatography, LC liquid chromatography, MS mass spectrometry, NMR nuclear magnetic resonance, RP reverse phase
Summary of the tools used to assess the performance of over-representation (ORA) methods and their main characteristics (July 2017). Tools and databases are sorted alphabetically
| Tool name | Tool version | Database used | Database version | Test used in this work | Platform | Input code | Website |
|---|---|---|---|---|---|---|---|
| ConsensusPathDB | 32 | HumanCyc | 19.1 (06/2015) | Fisher’s exact test | Online | HumanCyc | |
| HumanCyc | 21.0 | HumanCyc | 21.0 (12/2016) | Fisher’s exact test | Online | Name | |
| IMPaLA | 10 | HumanCyc | NA | Fisher’s exact test | Online | HumanCyc | |
| IPA® | NA | IPA® disease | NA | Fisher’s exact test, Z-score | Java-based software | KEGG | |
| KEGGREST | 1.17.0 | KEGG | NA | – | R | KEGG | |
| MBRole | 2.0 | HMDB disease | 3.5 (01/2013) | Hypergeometric test | Online | HumanCyc | |
| MetaboAnalyst | 3.0 | SMPDB disease | NA | Fisher’s exact test, hypergeometric test | Online | KEGG | |
| Metabox | NA | KEGG | NA | Hypergeometric test | R | PubChem | |
| MetaCore™ | NA | MeSH and OMIM disease | NA | – | Online | PubChem | |
| MetExplore | 2.11.2 | HumanCyc | 18.0 (02/2014) | Fisher’s exact test | Online | HumanCyc | |
| MPEA | (2010) | KEGG | (2010) | Hypergeometric test | Online | KEGG | |
| PathVisio | 3.2.4 | Reactome | 54 (10/2015) | Z-score | Java-based software | KEGG | |
| Reactome | 61 | Reactome | 61 (06/2017) | Fisher’s exact test | Online | KEGG | |
Abbreviations: NA not available
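Most of the tools listed above rely on Fisher's exact test or the hypergeometric test. For over-representation analysis, the one-sided Fisher's exact test and the hypergeometric test compute the same upper-tail probability, so a single function can illustrate both. The following is a minimal sketch, not the implementation of any listed tool: the function name `ora_p_value` and the background size of 3000 metabolites are assumptions made purely for illustration.

```python
from math import comb


def ora_p_value(hits, query_size, pathway_size, background_size):
    """Upper-tail hypergeometric probability P(X >= hits).

    hits            -- query metabolites that fall in the pathway
    query_size      -- number of metabolites submitted (e.g. significant ones)
    pathway_size    -- metabolites annotated to the pathway in the database
    background_size -- all metabolites in the reference set (ASSUMED here)
    """
    max_hits = min(query_size, pathway_size)
    if hits > max_hits:
        return 0.0
    total = comb(background_size, query_size)
    tail = sum(
        comb(pathway_size, k) * comb(background_size - pathway_size, query_size - k)
        for k in range(max(hits, 0), max_hits + 1)
    )
    return tail / total


# 42 significant metabolites (dataset ST000284) with 8 hits in one pathway.
# ASSUMPTION: a hypothetical background of 3000 annotated metabolites.
for pathway_size in (24, 28, 32):
    p = ora_p_value(hits=8, query_size=42, pathway_size=pathway_size,
                    background_size=3000)
    print(f"pathway size {pathway_size}: p = {p:.2e}")
```

With the number of hits held fixed, the p-value grows as the annotated pathway size grows, which is one reason why tools using different database versions can report different significance for the same input list.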
Fig. 1 Non-metric multidimensional scaling (NMDS) plot of the most used tools for metabolomics data enrichment based on Jaccard’s distances. Additional file 3: Table S3 shows the main features of each tool
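The Jaccard distance behind this plot can be sketched as follows. This is only an illustration of the distance itself: the feature sets below are hypothetical labels loosely based on the tool summary table, not the features of Additional file 3: Table S3, and the NMDS step that would take the resulting distance matrix as input is not shown.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """Return 1 - |A ∩ B| / |A ∪ B| for two collections of features."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0  # two empty feature sets are treated as identical
    return 1.0 - len(a & b) / len(union)


# ASSUMPTION: illustrative feature sets, not the ones used by the authors.
tool_features = {
    "ConsensusPathDB": {"Fisher's exact test", "online", "HumanCyc input"},
    "MPEA": {"hypergeometric test", "online", "KEGG input"},
    "PathVisio": {"Z-score", "Java-based software", "KEGG input"},
}

# Pairwise distances; a full matrix of these values is what NMDS would embed.
for (name_a, feats_a), (name_b, feats_b) in combinations(tool_features.items(), 2):
    print(f"{name_a} vs {name_b}: {jaccard_distance(feats_a, feats_b):.2f}")
```

Tools that share more features get a smaller distance and would therefore be expected to sit closer together in the ordination.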
Evaluation of over-representation analysis (ORA) outputs of bioinformatic tools employing KEGG pathways. Real (from dataset ST000284) and enriched data were used. The number of total metabolites in the pathway, the number of hits, the ranking of the pathway among all the KEGG pathways (according to their significance), the p-value and the adjusted p-value were calculated by the tools
| Tool | Data | Rank | Total metab. | Hits | p-value | Adjusted p-value |
|---|---|---|---|---|---|---|
| Alanine, aspartate and glutamate metabolism | | | | | | |
| ConsensusPathDB | Real | 2 | 28 | 8 | 3.77E-11 | 9.99E-10 |
| | Enriched | 2 | 28 | 7 | 3.76E-13 | 7.32E-12 |
| IMPaLA | Real | 2 | 28 | 8 | 3.77E-11 | 7.91E-09 |
| | Enriched | 2 | 28 | 7 | 3.76E-13 | 3.00E-10 |
| KEGGREST | Real | NA | 28 | 7 | NA | NA |
| | Enriched | NA | 28 | 7 | NA | NA |
| MBRole (full database) | Real | 2 | 24 | 8 | 3.47E-12 | 2.07E-10 |
| | Enriched | 1 | 24 | 7 | 7.23E-14 | 5.86E-12 |
| MBRole ( | Real | 1 | 24 | 8 | 2.31E-11 | 1.50E-09 |
| | Enriched | 1 | 24 | 7 | 3.85E-13 | 2.00E-11 |
| MetaboAnalyst (Fisher) | Real | 1 | 24 | 7 | 3.91E-06 | 6.74E-05 |
| | Enriched | 1 | 24 | 7 | 6.21E-12 | 4.97E-10 |
| MetaboAnalyst (hyper.) | Real | 1 | 24 | 7 | 3.91E-06 | 6.74E-05 |
| | Enriched | 1 | 24 | 7 | 6.21E-12 | 4.97E-10 |
| Metabox | Real | 2 | 32 | 8 | 3.60E-11 | 5.22E-10 |
| | Enriched | 2 | 32 | 7 | 1.34E-13 | 1.27E-12 |
| MetExplore | Real | 3 | NA | 8 | 1.03E-08 | 4.32E-07 |
| | Enriched | 2 | NA | 7 | 4.42E-10 | 1.33E-08 |
| MPEA (top down analysis) | Real | 1 | 24 | 8 | 4.41E-11 | 0.660 |
| | Enriched | 1 | 24 | 7 | 5.01E-13 | 0.440 |
| MPEA (bottom up analysis) | Real | 1 | 24 | 8 | 1.01E-11 | 0.170 |
| | Enriched | 1 | 24 | 7 | 1.08E-12 | 1.00 |
| Aminoacyl-tRNA biosynthesis | | | | | | |
| ConsensusPathDB | Real | 4 | 52 | 8 | 7.89E-10 | 1.05E-07 |
| | Enriched | 9 | 52 | 5 | 2.58E-07 | 1.12E-06 |
| IMPaLA | Real | 4 | 52 | 8 | 7.89E-09 | 8.74E-07 |
| | Enriched | 9 | 52 | 5 | 2.58E-07 | 1.13E-05 |
| KEGGREST | Real | NA | 52 | 8 | NA | NA |
| | Enriched | NA | 52 | 5 | NA | NA |
| MBRole (full database) | Real | 5 | 75 | 8 | 6.07E-08 | 1.20E-06 |
| | Enriched | 12 | 75 | 5 | 1.23E-06 | 8.30E-06 |
| MBRole ( | Real | 5 | 75 | 8 | 3.75E-07 | 4.87E-06 |
| | Enriched | 6 | 75 | 5 | 3.95E-06 | 3.42E-05 |
| MetaboAnalyst (Fisher) | Real | 3 | 75 | 8 | 1.40E-05 | 3.75E-04 |
| | Enriched | 7 | 75 | 5 | 2.72E-05 | 3.11E-04 |
| MetaboAnalyst (hyper.) | Real | 3 | 75 | 8 | 1.40E-05 | 3.75E-04 |
| | Enriched | 7 | 75 | 5 | 2.72E-05 | 3.11E-04 |
| Metabox | Real | 4 | 56 | 8 | 4.28E-09 | 3.10E-08 |
| | Enriched | 4 | 56 | 5 | 8.69E-08 | 2.15E-07 |
| MetExplore | Real | 5 | NA | 8 | 1.55E-06 | 1.69E-06 |
| | Enriched | 7 | NA | 5 | 1.51E-05 | 4.52E-04 |
| MPEA (top down analysis) | Real | 3 | 53 | 8 | 5.32E-09 | 1.00 |
| | Enriched | 7 | 53 | 5 | 1.42E-06 | 1.00 |
| MPEA (bottom up analysis) | Real | 5 | 53 | 8 | 7.12E-08 | 1.00 |
| | Enriched | 4 | 53 | 5 | 6.57E-08 | 1.00 |
| Arginine and proline metabolism | | | | | | |
| ConsensusPathDB | Real | 9 | 76 | 7 | 2.79E-06 | 1.64E-05 |
| | Enriched | 4 | 76 | 6 | 3.94E-08 | 3.85E-07 |
| IMPaLA | Real | 9 | 76 | 7 | 2.79E-09 | 8.74E-07 |
| | Enriched | 4 | 76 | 6 | 3.94E-08 | 2.18E-06 |
| KEGGREST | Real | NA | 77 | 7 | NA | NA |
| | Enriched | NA | 77 | 6 | NA | NA |
| MBRole (full database) | Real | 3 | 82 | 10 | 2.59E-10 | 8.55E-09 |
| | Enriched | 2 | 82 | 8 | 9.30E-12 | 3.77E-10 |
| MBRole ( | Real | 2 | 82 | 10 | 2.58E-09 | 8.38E-08 |
| | Enriched | 2 | 82 | 8 | 6.21E-11 | 1.61E-09 |
| MetaboAnalyst (Fisher) | Real | 2 | 77 | 9 | 6.69E-06 | 6.74E-05 |
| | Enriched | 2 | 77 | 8 | 8.61E-10 | 3.45E-08 |
| MetaboAnalyst (hyper.) | Real | 2 | 77 | 9 | 6.69E-06 | 6.74E-05 |
| | Enriched | 2 | 77 | 8 | 8.61E-10 | 3.45E-08 |
| Metabox | Real | 9 | 84 | 7 | 1.92E-05 | 6.18E-05 |
| | Enriched | 4 | 84 | 6 | 1.25E-08 | 5.96E-08 |
| MetExplore | Real | 2 | NA | 10 | 4.03E-08 | 1.69E-06 |
| | Enriched | 1 | NA | 8 | 3.34E-10 | 1.00E-08 |
| MPEA (top down analysis) | Real | 4 | 90 | 10 | 1.40E-08 | 1.00 |
| | Enriched | 2 | 90 | 7 | 1.09E-10 | 1.00 |
| MPEA (bottom up analysis) | Real | 2 | 90 | 10 | 2.24E-09 | 1.00 |
| | Enriched | 2 | 90 | 8 | 1.69E-10 | 1.00 |
NA means that information was not provided by the tool. Abbreviations: Fisher Fisher’s exact test, hyper hypergeometric test, NA not available
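In the table above, tools that report the same raw p-value can still report different adjusted p-values (compare ConsensusPathDB and IMPaLA), which is what one would expect if the tools differ in their multiple-testing correction or in the number of pathways tested. The source does not state which correction each tool applies. As a sketch only, the following assumes the Benjamini-Hochberg procedure and ranks pathways by raw p-value; the function names are hypothetical, and because only three pathways are adjusted here, the output is not expected to reproduce the adjusted values in the table, which were computed over all pathways in each database.

```python
def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def rank_by_significance(p_by_pathway):
    """Map each pathway name to its rank (1 = smallest p-value)."""
    ordered = sorted(p_by_pathway, key=p_by_pathway.get)
    return {name: position for position, name in enumerate(ordered, start=1)}


# Raw p-values reported by MetaboAnalyst for the real data (from the table).
raw = {
    "Alanine, aspartate and glutamate metabolism": 3.91e-06,
    "Aminoacyl-tRNA biosynthesis": 1.40e-05,
    "Arginine and proline metabolism": 6.69e-06,
}

ranks = rank_by_significance(raw)
# ASSUMPTION: BH over these three pathways only, for illustration.
adjusted = dict(zip(raw, benjamini_hochberg(list(raw.values()))))
for name in raw:
    print(f"{name}: rank {ranks[name]}, adjusted p = {adjusted[name]:.2e}")
```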
Evaluation of over-representation analysis (ORA) outputs of bioinformatic tools employing Reactome and HumanCyc pathways. Real (from dataset ST000284) and enriched data were used. The number of total metabolites in the pathway, the number of hits, the ranking of the pathway among all the Reactome or HumanCyc pathways (according to their significance), the p-value and the adjusted p-value were calculated by the tools
| Tool | Data | Rank | Total metab. | Hits | p-value | Adjusted p-value |
|---|---|---|---|---|---|---|
| Reactome | | | | | | |
| Metabolism of amino acids and derivatives | | | | | | |
| ConsensusPathDB | Real | 3 | 272 | 18 | 8.46E-14 | 4.55E-12 |
| | Enriched | 1 | 272 | 12 | 7.67E-15 | 5.75E-13 |
| IMPaLA | Real | 3 | 272 | 18 | 8.46E-14 | 4.21E-11 |
| | Enriched | 1 | 272 | 12 | 7.67E-15 | 1.02E-11 |
| PathVisio | Real | NA | NA | NA | NA | NA |
| | Enriched | NA | NA | NA | NA | NA |
| Reactome | Real | 9 | 283 | 18 | 1.03E-04 | 3.81E-03 |
| | Enriched | 1 | 283 | 12 | 8.18E-08 | 1.00E-05 |
| HumanCyc | | | | | | |
| tRNA charging | | | | | | |
| ConsensusPathDB | Real | 2 | 24 | 8 | 9.14E-12 | 1.92E-10 |
| | Enriched | 2 | 24 | 5 | 4.40E-09 | 1.30E-07 |
| HumanCyc | Real | 8 | 24 | 8 | 2.57E-05 | 0.002 |
| | Enriched | 18 | 24 | 5 | 7.90E-05 | 4.25E-03 |
| IMPaLA | Real | 2 | 24 | 8 | 9.14E-12 | 2.28E-09 |
| | Enriched | 2 | 24 | 5 | 4.40E-09 | 3.51E-07 |
| MBRole (full database) | Real | 4 | 64 | 8 | 8.38E-09 | 1.14E-06 |
| | Enriched | 47 | 64 | 4 | 3.58E-07 | 7.97E-06 |
| MBRole ( | Real | 5 | 64 | 8 | 1.26E-04 | 2.44E-03 |
| | Enriched | 11 | 64 | 5 | 1.58E-04 | 1.52E-03 |
| MetExplore | Real | 1 | NA | 8 | 7.11E-07 | 7.75E-05 |
| | Enriched | 6 | NA | 5 | 4.45E-05 | 3.60E-04 |
NA means that information was not provided by the tool. Abbreviations: NA not available
Disease-based enrichment analyses of the five datasets performed with MetaboAnalyst (SMPDB disease database), MBRole (HMDB disease database), IPA® (in-house disease database) and MetaCore™ (based on MeSH and OMIM annotations). When the exact disease/condition of study was not obtained, a similar disease was selected
| Dataset | Disease input | Disease output | Rank | Input number metabolites | Hits output | p-value | Adjusted p-value |
|---|---|---|---|---|---|---|---|
| MetaboAnalyst | | | | | | | |
| ST000091 | Type 1 diabetes mellitus | Diabetes mellitus MODY | 20 | 8 | 2 | 3.40E-02 | 5.84E-01 |
| ST000383 | Type 2 diabetes mellitus | Diabetes mellitus MODY | 4 | 27 | 4 | 8.60E-03 | 6.69E-01 |
| | Obesity | Obesity | 31 | 27 | 1 | 9.07E-02 | 8.83E-01 |
| MTBLS364 | Smokers | – | – | 81 | – | – | – |
| MTBLS424 | Breast cancer | Mammary tumour | 30 | 22 | 2 | 4.08E-03 | 4.68E-02 |
| ST000284 | Colorectal cancer | Cervical/colon/ovarian cancer | 46 | 42 | 1 | 8.47E-02 | 5.30E-01 |
| MBRole | | | | | | | |
| ST000091 | Type 1 diabetes mellitus | – | – | 8 | – | – | – |
| ST000383 | Type 2 diabetes mellitus | Type 2 diabetes mellitus | 8 | 27 | 3 | 1.16E-02 | 5.48E-02 |
| | Obesity | Obesity | 28 | 27 | 1 | 1.08E-01 | 1.48E-01 |
| MTBLS364 | Smokers | Lung Cancer | 16 | 81 | 31 | 3.02E-02 | 9.25E-02 |
| MTBLS424 | Breast cancer | Lung Cancer | 7 | 22 | 6 | 1.27E-04 | 1.09E-03 |
| ST000284 | Colorectal cancer | Colorectal cancer | 44 | 42 | 1 | 5.19E-02 | 1.14E-01 |
| IPA® | | | | | | | |
| ST000091 | Type 1 diabetes mellitus | – | – | 8 | – | – | – |
| ST000383 | Type 2 diabetes mellitus | Insulin resistance | 21 | 27 | 3 | 6.10E-05 | NA |
| | Obesity | Adipogenesis of fat | 264 | 27 | 1 | 1.54E-02 | NA |
| MTBLS364 | Smokers | Cough | 490 | 81 | 11 | 4.33E-02 | NA |
| MTBLS424 | Breast cancer | Gastric cancer | 2 | 22 | 9 | 5.03E-11 | NA |
| ST000284 | Colorectal cancer | Colorectal cancer | 3 | 42 | 11 | 2.31E-08 | NA |
| MetaCore™ | | | | | | | |
| ST000091 | Type 1 diabetes mellitus | Type 1 diabetes mellitus | NA | 8 | 0 | NA | NA |
| ST000383 | Type 2 diabetes mellitus | Type 2 diabetes mellitus | NA | 27 | 7 | NA | NA |
| | Obesity | Obesity | NA | 27 | 1 | NA | NA |
| MTBLS364 | Smokers | Respiratory disorders | NA | 81 | 1 | NA | NA |
| MTBLS424 | Breast cancer | Breast neoplasms | NA | 22 | 0 | NA | NA |
| ST000284 | Colorectal cancer | Colorectal neoplasms | NA | 42 | 13 | NA | NA |
Abbreviations: NA not available