Haema Nilakanta, Kimberly L Drews, Suzanne Firrell, Mary A Foulkes, Kathleen A Jablonski.
Abstract
BACKGROUND: Over the past ten years, there has been an explosion of microbiome research. Many software packages for analyzing microbial sequences such as the 16S gene from 454 sequencers and Illumina platforms are available. But for a new researcher, it is difficult to know which package to choose. We present a systematic review of packages for the analysis of molecular sequences used to describe and compare microbial communities. This review gives students and researchers information to help choose the best analytic pipeline for their project. To the best of our knowledge, this is the first review of such software.
Year: 2014 PMID: 25421430 PMCID: PMC4258797 DOI: 10.1186/1756-0500-7-830
Source DB: PubMed Journal: BMC Res Notes ISSN: 1756-0500
Major functions of seven pipelines
| Capabilities | | Mothur | QIIME | WATERS | RDPipeline | VAMPS ‡ | Genboree ‡ | SnoWMAn ‡ |
|---|---|---|---|---|---|---|---|---|
| Documentation | Available guides | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| Installation | Shortcut option (instant download) | ✓ | ✓ | | | | | |
| | Native version: Mac OSX | ✓ | ✓ | ✓ | | | | |
| | Native version: Windows | ✓ | ✓ | ✓ | | | | |
| | Native version: Linux | ✓ | ✓ | ✓ | | | | |
| | Web based | ✓ | ✓ | ✓ | ✓ | | | |
| Updating | Re-download entire program | ✓ | ✓ | ✓ | | | | |
| | Re-download updated sections | ✓ | | | | | | |
| Interface | Command line | ✓ | ✓ | | | | | |
| | Graphical User Interface | ✓ | | | | | | |
| | Web form GUI | ✓ | ✓ | ✓ | ✓ | | | |
| Sequencing platforms | Illumina | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| | 454 Pyroseq | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| Preparing sequences | Accepted file formats | | | | | | | |
| | sff | ✓* | ✓* | ✓ | ✓ | ✓ | | |
| | FASTA | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| | Quality score | ✓ | ✓ | ✓ | ✓ | | | |
| | Flow file data | ✓ | ✓ | | | | | |
| | User defined barcodes or primers | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| | User defined metadata | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| | Alignment | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| | Summary function | ✓ | ✓ | ✓ | | | | |
| | Trims barcodes and primers off of sequences | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| | Removes short reads | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| | Identify and remove chimeras | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| | Remove contaminants | ✓ | ✓ | | | | | |
| Approaches to analyze files | OTU binning/clustering | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| | Phylotype binning | ✓ | ✓ | ✓ | | | | |
| | Phylogenetic tree | ✓ | ✓ | ✓ | ✓ | | | |
| Analysis output | Alpha diversity | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| | Beta diversity | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| | Ecological indexes | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| | UniFrac | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| Visualization | | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
*Not required but strongly suggested; †Engineered for specific capability; ‡Integrates other pipelines
Defaults for each of the reviewed pipelines
| Mothur | QIIME | WATERS | RDPipeline | VAMPS | Genboree | SnoWMAn |
|---|---|---|---|---|---|---|
| 0 – User defined | 200 bp | 500 bp | 150 bp | 200 bp | 0 – User defined | 150 bp |
| Not bounded – User defined | 1000 bp | 500 bp | 1000 bp | Not bounded – User defined | | |
| Not bounded – User defined | 6 bp | | | | | |
| Not bounded – User defined (recommended AB = 0) | AB = 6 | N = 0 | N = 0 | Not bounded – User defined | N = 0 | |
| 0 – User defined | 25 | 20 | 0 – User defined | 20 | | |
| Needleman | PyNAST | Infernal | Infernal | GAST | PyNAST | Varies by pipeline method |
| User defined | User defined | Mallard | UCHIME | ChimeraSlayer | | |
bp = base pairs.
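
As a rough illustration of how defaults like those in the table above are applied when preparing sequences, the Python sketch below filters reads by length, homopolymer run, ambiguous bases, and mean quality. Reading the QIIME column values (200 bp, 1000 bp, 6 bp, AB = 6, 25) as these five thresholds is an assumption, and `QualityFilter`, `longest_homopolymer`, and `passes` are hypothetical names that do not belong to any of the reviewed pipelines.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class QualityFilter:
    """Hypothetical container for read-filtering thresholds (None = not bounded)."""
    min_length: int = 0
    max_length: Optional[int] = None
    max_homopolymer: Optional[int] = None
    max_ambiguous: Optional[int] = None
    min_mean_quality: float = 0.0


def longest_homopolymer(seq: str) -> int:
    """Return the length of the longest run of a single repeated base."""
    longest = run = 0
    previous = ""
    for base in seq:
        run = run + 1 if base == previous else 1
        longest = max(longest, run)
        previous = base
    return longest


def passes(seq: str, quals: Sequence[int], f: QualityFilter) -> bool:
    """Return True if a read satisfies every threshold in `f`."""
    seq = seq.upper()
    if len(seq) < f.min_length:
        return False
    if f.max_length is not None and len(seq) > f.max_length:
        return False
    if f.max_homopolymer is not None and longest_homopolymer(seq) > f.max_homopolymer:
        return False
    # Simplification: only "N" is counted as an ambiguous base.
    if f.max_ambiguous is not None and seq.count("N") > f.max_ambiguous:
        return False
    # Reads without quality scores skip the quality check.
    if quals and sum(quals) / len(quals) < f.min_mean_quality:
        return False
    return True


# Assumed mapping of the QIIME column: min length, max length,
# max homopolymer run, max ambiguous bases, min mean quality score.
QIIME_LIKE = QualityFilter(200, 1000, 6, 6, 25)
```

A pipeline with "Not bounded" or "User defined" entries would correspond to leaving the matching field at its default, for example `QualityFilter()` for no filtering at all.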
Summary of programs to remove chimeras
| Chimera program | Year published | Method | Advantage |
|---|---|---|---|
| CHIMERA_CHECK | 1999 | Initial program | |
| Bellerophon | 2004 | Partial treeing approach | Initial program |
| Pintail | 2005 | Reference database comparing variation differences | More sensitive than earlier methods |
| Ccode | 2005 | Reference of putative chimeras, measuring variability | Bypasses need for manual inspection |
| Mallard | 2006 | Reference database comparing variation differences to all pairs | More sensitive than earlier Pintail program |
| ChimeraChecker | 2010 | Focuses on ITS region using BLAST | Used for fungal sequences |
| ChimeraSlayer | 2011 | Reference database constructs potential alignments with parent strands | Useful for short sequences and where parents of chimeras are closely related; more sensitive than earlier methods |
| Perseus | 2011 | Searches for parts of parent sequences in higher abundance | |
| UCHIME | 2011 | Uses multiple reference databases, aligns to top hits, and computes a score | Faster without sacrificing sensitivity; identifies chimeras with more than two parents |
| DECIPHER | 2012 | Search-based approach, detecting short fragments | Useful for short sequences |
Figure 1. Decision tree showing feature options for pipeline choice (based on documented features within each package).