Pravin Dudhagara, Sunil Bhavsar, Chintan Bhagat, Anjana Ghelani, Shreyas Bhatt, Rajesh Patel.
Abstract
The development of next-generation sequencing (NGS) platforms has generated an enormous volume of data. This explosion in data has exposed new scalability challenges for existing bioinformatics tools. The analysis of metagenomic sequences using bioinformatics pipelines is complicated by the substantial complexity of these data. In this article, we review several commonly used online tools for metagenomics data analysis with respect to the quality and detail of their analysis, using simulated metagenomics data. At least a dozen such software tools are presently available in the public domain. Among them, MG-RAST, IMG/M, and METAVIR are the best known, judged by the number of citations in peer-reviewed scientific publications up to mid-2015. Here, we describe 12 online tools with respect to their web link, annotation pipelines, clustering methods, online user support, and availability of data storage. We also rate each tool to identify the most promising ones, and we evaluate the five best tools using a synthetic metagenome. The article comprehensively addresses the contemporary problems and prospects of metagenomics from a bioinformatics viewpoint.
Keywords: Metagenomes; Metagenomics; Software tools; Synthetic metagenome; Web resources
Year: 2015 PMID: 26602607 PMCID: PMC4678780 DOI: 10.1016/j.gpb.2015.10.003
Source DB: PubMed Journal: Genomics Proteomics Bioinformatics ISSN: 1672-0229 Impact factor: 7.691
Figure 1. Relative citation of the metagenomics software tools in articles published in peer-reviewed scientific journals up to October 2015
The year in brackets in the legend box indicates the year of original release of the respective tool. Citations of each tool were tracked using Google Scholar. For each year, the total number of citations of all tools is taken as 100%, and the relative percentage for each tool is calculated as its share of that year's total.
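The normalization described in the caption can be sketched in a few lines of Python. This is a minimal illustration, not the authors' procedure: the function name `relative_citation_share`, the tool labels, and the citation counts are all hypothetical and do not come from the article.

```python
def relative_citation_share(citations_by_year):
    """Convert raw per-tool citation counts into per-year percentages.

    citations_by_year maps year -> {tool name: citation count}.
    Within each year, the total over all tools is treated as 100%.
    """
    shares = {}
    for year, counts in citations_by_year.items():
        total = sum(counts.values())
        if total == 0:
            # No citations recorded that year: report 0% for every tool.
            shares[year] = {tool: 0.0 for tool in counts}
            continue
        shares[year] = {tool: 100.0 * n / total for tool, n in counts.items()}
    return shares


# Hypothetical counts, for illustration only (not data from the article).
example = {
    2014: {"Tool A": 30, "Tool B": 50, "Tool C": 20},
    2015: {"Tool A": 10, "Tool B": 10, "Tool C": 20},
}

shares = relative_citation_share(example)
for year in sorted(shares):
    print(year, shares[year])
```

Because each year is normalized separately, a tool's percentage can fall from one year to the next even when its absolute citation count rises, if the other tools gain citations faster.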
Main online software tools for metagenomics studies
| Tool | Annotation pipeline | Clustering method and output | Online user support | Data storage |
| --- | --- | --- | --- | --- |
| MG-RAST | SEED subsystem, COG, KO, NOG, eggNOG, M5RNA, KEGG, TrEMBL, SEED, PATRIC, SwissProt, GenBank, RefSeq | HM, PT, BC, T | Blog and manual | 215,773 metagenome datasets and 30,589 public metagenomes |
| IMG/M | COG, KOG, KEGG, KO, Pfam, TIGRfam, TIGR, MetaCyc, GO | T, PC, PT, RP, HM | User guide, user forum, standard operating procedure | 32,802 genome and 5234 metagenome datasets |
| METAREP | GO, NCBI Taxonomy | T, HM, HCP | User manual, demo video | No storage |
| CoMet | Pfam, GO | T, DG, BC, DM | Online help | No storage |
| METAGENassist | BacMap, GOLD, NCBI Taxonomy, PubMed | DG, HM, KM, SOM | Tutorials, FAQs, data examples | No storage |
| MetaABC | Database of reference genomes (NCBI) | HM, BC, PC | Online help | 52 datasets |
| MyTaxa | Database of reference genes and genomes (NCBI) | PT, BC | FAQs, examples | No storage |
| metaMicrobesOnline | TIGRfam, COG, Pfam | T, PT | Guide, tutorial, help through email | 155 metagenome and 3527 genome datasets |
| EBI Metagenomics | RDP, Greengenes database, InterPro protein signature database | T, PC, BC, HM, SC, PCA | Training, online support by email and Twitter | 141 projects and 5800 datasets |
| CAMERA | FragGeneScan, MetaGene, COG, Pfam, TIGRfam, GO, KEGG | T, PC, DG | Tutorials, video tutorials, online manual | 128 projects and 2660 samples |
| METAVIR | Pfam, RefSeq virus database | HM, DG, RP, PC | Video tutorial, guide, FAQs, online contact | 170 viral metagenomic datasets and 335 projects |
| VIROME | SEED, ACLAME, COG, GO, KEGG, MGOL, UniRef 100 | T, PC, TDD | Tutorial videos | 466 libraries containing 24,386,816 reads |
Note: Darker grayscale bars in the rating column indicate higher importance and usefulness. The rating is based on five criteria, represented in grayscale from left to right: (i) ease of data upload, (ii) availability of online user support, (iii) spectrum of analysis, (iv) citations, and (v) size of stored data. HM, heatmap; PT, phylogenetic tree; BC, bar chart; T, tabulation; PC, pie chart; RP, recruitment plot; HCP, hierarchical cluster plot; DM, distance matrix; DG, dendrogram; KM, K-means; SOM, self-organizing map; TDD, tab-delimited data; SC, stacked column; PCA, principal component analysis. Data storage figures were obtained from the respective websites in October 2015.
Result of synthetic metagenome analysis using the five selected tools
| MG-RAST | 98 | 05 |
| IMG/M | 95 | 96 |
| EBI Metagenomics | 81 | 04 |
| CoMet | 172 | 05 |
| METAVIR | 100 | 72 |
Note: Tools were selected based on rating criteria for taxonomic assessment using online support provided by the software developers.