Michael A Peabody, Thea Van Rossum, Raymond Lo, Fiona S L Brinkman.
Abstract
BACKGROUND: The field of metagenomics (study of genetic material recovered directly from an environment) has grown rapidly, with many bioinformatics analysis methods being developed. To ensure appropriate use of such methods, robust comparative evaluation of their accuracy and features is needed. For taxonomic classification of sequence reads, such evaluation should include use of clade exclusion, which better evaluates a method's accuracy when identical sequences are not present in any reference database, as is common in metagenomic analysis. To date, relatively small evaluations have been performed, with evaluation approaches like clade exclusion limited to assessment of new methods by the authors of the given method. What is needed is a rigorous, independent comparison between multiple major methods, using the same in silico and in vitro test datasets, with and without approaches like clade exclusion, to better characterize accuracy under different conditions.
Year: 2015 PMID: 26537885 PMCID: PMC4634789 DOI: 10.1186/s12859-015-0788-5
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
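The clade exclusion idea described in the abstract can be illustrated with a short sketch. This is only an assumption-laden toy, not the authors' pipeline: the lineage table, the genome names (`ref_A` to `ref_D`) and the helper functions are hypothetical. For a given test genome, every reference genome that shares its clade at the chosen rank is removed before classification, so the best a classifier can do is assign reads one rank higher.

```python
# Toy sketch of clade exclusion. All names and lineages are hypothetical.
RANKS = ["species", "genus", "family", "order", "class", "phylum"]


def lineage(*names):
    """Build a rank -> taxon mapping from species up to class."""
    return dict(zip(RANKS, names))


# Hypothetical reference database: genome name -> lineage.
REFERENCE = {
    "ref_A": lineage("s1", "g1", "f1", "o1", "c1"),
    "ref_B": lineage("s2", "g1", "f1", "o1", "c1"),
    "ref_C": lineage("s3", "g2", "f1", "o1", "c1"),
    "ref_D": lineage("s4", "g3", "f2", "o2", "c2"),
}

# Lineage of the genome the test reads were drawn from (same as ref_A).
QUERY = lineage("s1", "g1", "f1", "o1", "c1")


def exclude_clade(reference, query_lineage, rank):
    """Drop every reference genome in the query's clade at `rank`."""
    clade = query_lineage[rank]
    return {name: lin for name, lin in reference.items() if lin[rank] != clade}


def lowest_correct_rank(rank):
    """Lowest rank at which a correct call is still possible.

    Valid for exclusion ranks from species to class.
    """
    return RANKS[RANKS.index(rank) + 1]


if __name__ == "__main__":
    for rank in ["species", "genus", "family"]:
        kept = sorted(exclude_clade(REFERENCE, QUERY, rank))
        print(rank, kept, "-> best possible rank:", lowest_correct_rank(rank))
```

Under genus exclusion, for example, both `ref_A` and `ref_B` are removed, and the lowest rank at which an assignment can still be correct is family.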
Table 1. Microbes used in the 2 simulated mock communities

| MetaSimHC^a strain | Freshwater^b (FW) strain |
|---|---|
| C58 | FZB42 |
| ATCC 29413 | ATCC 14579 |
| DSM 4304 | J2315 |
| HD100 | K-12 |
| 81–176 | CcI3 |
| ATCC 824 | NCTC 2665 |
| SK11 | PAO1 |
| ATCC 19718 | UCBPP-PA14 |
| PA7 | Pf-5 |
| A3(2) | KT2440 |
| str. 7 | SB 1003 |
| | A3(2) |

^a MetaSimHC is a test dataset of 11 diverse microbial genomes covering several phyla of Bacteria and Archaea proposed in [21]
^b Freshwater (FW) is a set of bacterial genomes found in previous freshwater metagenomics studies (see Methods)
Table 2. List of metagenomics sequence classification methods and their characteristics, sorted by class of method

| Method name | Class of method | Sequence alignment method / Composition method | Standalone^a / Web server | Most recent year published (first time published)^b | Functional classification if applicable | Number of citations^c |
|---|---|---|---|---|---|---|
| MEGAN4 | Similarity | MEGABLAST, BLASTN, BLASTX, RAPSEARCH2 | Yes/No | 2011 (2007) | KEGG, SEED | 1089 |
| MG-RAST | Similarity | BLASTN, BLAT / N/A | No/Yes | 2008 | SEED, NOG, COG, KEGG | 691 |
| CAMERA | Similarity | All 6 BLAST programs / N/A | No/Yes | 2007 (2011) | Pfam, TIGRFAM, COG, KOG, PRK | 324 |
| CARMA3 | Similarity | BLASTX, HMMER3 | Yes/Yes | 2011 (2008) | GO | 201 |
| WebMGA | Similarity | FR-HIT | No/Yes | 2013 | Pfam, TIGRFAM, COG, KOG, PRK, GO | 54 |
| DiScRIBinATE (SOrt-ITEMS)^d | Similarity | BLASTX, RAPSEARCH2 / N/A | Yes/No | 2010 (2009) | N/A | 48 |
| Ray Meta | Similarity | Exact match k-mers / N/A | Yes/No | 2012 | N/A | 34 |
| Kraken | Similarity | Exact match k-mers / N/A | Yes/No | 2014 | N/A | 15 |
| RTM | Similarity | k-mers / N/A | Yes/Yes | 2012 | KEGG | 12 |
| Genometa | Similarity | Bowtie | Yes/No | 2012 | N/A | 7 |
| LMAT | Similarity | Exact match k-mers / N/A | Yes/No | 2013 | N/A | 6 |
| Sequedex | Similarity | Exact match k-mers / N/A | Yes/No | 2012 | N/A | 5 |
| MetaBin | Similarity | BLASTX, BLAT / N/A | Yes/Yes | 2012 | COG | 4 |
| TAMER | Similarity | MEGABLAST / N/A | Yes/No | 2012 | N/A | 4 |
| metaBEETL | Similarity | Direct comparison of compressed text indices / N/A | Yes/No | 2013 | N/A | 2 |
| SPANNER | Similarity | BLASTP / N/A | Yes/No | 2013 | N/A | 2 |
| GOTTCHA | Similarity | BWA / N/A | Yes/No | 2015 | N/A | 0 |
| CLARK | Similarity | k-mers / N/A | Yes/No | 2015 | N/A | 0 |
| MLTreeMap | Marker | BLASTX / N/A | Yes/Yes | 2010 (2007) | 4 Enzyme families | 206 |
| AMPHORA2 | Marker | HMMER3 / N/A | Yes/Yes | 2012 (2008) | N/A | 190 |
| MetaPhlAn | Marker | MEGABLAST, Bowtie2 | Yes/Yes | 2012 | N/A | 114 |
| MetaPhyler | Marker | BLASTN, BLASTX / N/A | Yes/No | 2011 | N/A | 42 |
| mOTU | Marker | HMMER3 / N/A | Yes/Yes | 2013 | N/A | 24 |
| Phylosift | Marker | LAST, HMMER3 / N/A | Yes/No | 2014 | N/A | 18 |
| PhymmBL | Hybrid | MEGABLAST / IMM | Yes/No | 2011 (2009) | N/A | 182 |
| RITA | Hybrid | Pipeline of BLAST variations / NB | Yes/Yes | 2012 (2011) | N/A | 38 |
| SPHINX | Hybrid | BLASTX / k-means | No/Yes | 2010 | N/A | 17 |
| TaxyPro | Hybrid | CoMet web server / Mixture model | Yes/No | 2013 | Pfam | 3 |
| TWARIT | Hybrid | BWA short read alignment | No/Yes | 2012 | N/A | 2 |
| PhyloPythiaS | Composition | N/A / SVM | Yes/Yes | 2011 (2007) | N/A | 269 |
| TACOA | Composition | N/A / k-NN | Yes/No | 2009 | N/A | 65 |
| NBC | Composition | N/A / NB | Yes/Yes | 2011 (2008) | N/A | 35 |
| RAIphy | Composition | N/A / RAI | Yes/No | 2011 | N/A | 18 |
| ClaMS | Composition | N/A / DBC signature | Yes/No | 2011 | N/A | 10 |
| INDUS | Composition | N/A / k-means | No/Yes | 2011 | N/A | 8 |
| TAC-ELM | Composition | N/A / Neural Network | Yes/No | 2012 | N/A | 5 |
| MetaCV | Composition | N/A / CV | Yes/No | 2013 | KEGG | 4 |
| GSTaxClassifier | Composition | N/A / Bayesian | No/No | 2010 | N/A | 2 |

N/A: not applicable; IMM: interpolated Markov model; NB: naive Bayes; SVM: support vector machine; k-NN: k-Nearest Neighbour; RAI: relative abundance index; DBC signature: de Bruijn chain signature; CV: composition vector
^a Standalone refers to whether the program can be run locally
^b Some methods have had several publications, with later publications regarding improvements on functionality. In these cases the most recent publication was listed, with the first time the method was published in brackets
^c Number of citations is based on Web of Science as of April 21, 2015
^d DiScRIBinATE is the successor for SOrt-ITEMS so they were included in the same row
Fig. 1 Performance as clade exclusion level is varied. Sensitivity (a) and precision (b) on the MetaSimHC dataset of simulated 250 bp reads. There is a wide range of variability in the sensitivity and precision of the methods, with sensitivity tending to decrease as the level of clade exclusion moves from species to class. Performance is calculated based on proportion of reads appropriately assigned and averaged per genome (see Methods)
Fig. 2 Distributions of assignments to taxonomic ranks. Proportion of reads assigned at each taxonomic rank on the MetaSimHC dataset of simulated 250 bp reads under genus clade exclusion (includes both correct and incorrect assignments). Although the lowest possible correct rank is family, many methods still classify the majority of reads at the species level. CARMA3 and DiScRIBinATE are slightly more conservative, classifying a large number of reads at the family or order levels, whereas TACOA is extremely conservative, classifying the majority of the reads at the superkingdom level
Fig. 3 Performance as clade exclusion level is varied with overpredictions (see Methods for details) classified as correct. Sensitivity (a) and precision (b) on the MetaSimHC dataset of simulated 250 bp reads. Methods such as MEGAN4, which classify many reads at lower taxonomic levels, see a considerable increase in performance, whereas more conservative methods such as CARMA3 see only a slight improvement. Performance is calculated based on proportion of reads appropriately assigned and averaged per genome (see Methods)
Fig. 4 Performance of FW in silico versus FW in vitro. Sensitivity (a) and precision (b) of methods on the FW dataset comparing the performance on the in silico community versus the in vitro community under species clade exclusion. The results are similar between the in vitro and in silico communities, demonstrating that methods appear to be relatively robust to real Illumina sequencing errors for this simple community. Performance is calculated based on proportion of reads appropriately assigned and averaged per genome (see Methods)
Fig. 5 Performance of MetaSimHC compared to FW in silico. Sensitivity (a) and precision (b) of methods on the MetaSimHC dataset compared to the FW in silico dataset of simulated 250 bp reads. Values are averaged over all levels of clade exclusion from species to class. Although the microbes in the dataset changed, the relative performance of the methods remains very similar. Performance is calculated based on proportion of reads appropriately assigned and averaged per genome (see Methods)
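The figure captions state that performance is based on the proportion of reads appropriately assigned, averaged per genome, and that overpredictions can optionally be counted as correct. The sketch below shows one way such numbers could be computed. The formulas are assumptions, since the Methods are not reproduced here: sensitivity is taken as correct reads over all reads from a genome, precision as correct reads over assigned reads, and an overprediction is treated as incorrect unless the flag is set. The outcome labels and the function name are hypothetical.

```python
from collections import Counter, defaultdict


def per_genome_performance(reads, overprediction_as_correct=False):
    """Return (mean sensitivity, mean precision), averaged per genome.

    `reads` is an iterable of (genome, outcome) pairs, where outcome is one of
    "correct", "incorrect", "overpredicted" or "unassigned" (hypothetical labels).
    Assumed definitions:
      sensitivity = correct / all reads from the genome
      precision   = correct / (correct + incorrect)
    """
    counts = defaultdict(Counter)
    for genome, outcome in reads:
        if outcome == "overpredicted":
            # Assumption: an overprediction is an error unless stated otherwise.
            outcome = "correct" if overprediction_as_correct else "incorrect"
        counts[genome][outcome] += 1
    if not counts:
        raise ValueError("no reads supplied")

    sensitivities, precisions = [], []
    for c in counts.values():
        total = sum(c.values())
        assigned = c["correct"] + c["incorrect"]
        sensitivities.append(c["correct"] / total)
        # Assumption: a genome with no assigned reads contributes precision 0.
        precisions.append(c["correct"] / assigned if assigned else 0.0)

    n = len(counts)
    return sum(sensitivities) / n, sum(precisions) / n


# Hypothetical toy data: two genomes with four and two reads.
TOY_READS = [
    ("g1", "correct"), ("g1", "correct"), ("g1", "overpredicted"), ("g1", "unassigned"),
    ("g2", "correct"), ("g2", "incorrect"),
]

if __name__ == "__main__":
    print(per_genome_performance(TOY_READS))
    print(per_genome_performance(TOY_READS, overprediction_as_correct=True))
```

Averaging per genome rather than per read keeps a genome that contributes many reads from dominating the score, and switching the flag shows how methods that assign reads at low ranks gain when overpredictions count as correct.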
Table 3. Number of correctly and incorrectly predicted species^a for different thresholds^b without clade exclusion. Some methods vastly overpredict the number of species, even when the true number of species is low (in this case the true number of species is 11)

| Method | No cutoff^b: correct | No cutoff^b: incorrect | Cutoff > 0.01 %^b: correct | Cutoff > 0.01 %^b: incorrect | Cutoff > 0.1 %^b: correct | Cutoff > 0.1 %^b: incorrect | Cutoff > 1 %^b: correct | Cutoff > 1 %^b: incorrect |
|---|---|---|---|---|---|---|---|---|
| CARMA3 | 11 | 56 | 11 | 4 | 11 | 0 | 10 | 0 |
| CLARK | 11 | 364 | 11 | 25 | 11 | 5 | 11 | 0 |
| DiScRIBinATE RAPSearch2^c | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A |
| Kraken | 11 | 327 | 11 | 25 | 11 | 5 | 11 | 0 |
| Filtered Kraken | 11 | 14 | 11 | 1 | 11 | 0 | 11 | 0 |
| MEGAN4 BLASTN | 11 | 110 | 11 | 19 | 11 | 3 | 9 | 1 |
| MEGAN4 RAPSearch2 | 11 | 183 | 11 | 41 | 11 | 1 | 9 | 1 |
| MetaBin | 11 | 561 | 10 | 77 | 10 | 6 | 10 | 1 |
| MetaCV | 11 | 1226 | 11 | 232 | 11 | 6 | 10 | 1 |
| MetaPhyler | 11 | 9 | 11 | 9 | 11 | 5 | 7 | 1 |
| PhymmBL^c | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A |
| RITA | 11 | 466 | 10 | 80 | 10 | 10 | 10 | 1 |
| TACOA^c | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A |
| MG-RAST best hit | 11 | 927 | 10 | 180 | 10 | 36 | 10 | 8 |
| MG-RAST LCA | 11 | 476 | 11 | 69 | 11 | 5 | 11 | 1 |

^a Using the FW in vitro dataset of sequenced reads from 11 species
^b A cutoff of > x %, for example 0.01 %, would indicate that only species with a predicted abundance of at least x % of the total set of predictions were considered. Correctly predicted species are any of the 11 species that were used to simulate the reads in the dataset, whereas any other predicted species was incorrect
^c These methods do not predict to the species level at this read length (they require longer read lengths). See additional analyses at other levels of clade exclusion
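The abundance cutoff described in footnote b can be sketched as a simple filter over per-species prediction counts. This is a minimal illustration under stated assumptions, not the authors' code: the function name, the species labels and the counts are hypothetical, and the comparison is taken as strictly greater than the cutoff, following the column headers (the footnote wording "at least" could also be read as greater than or equal).

```python
def count_species(predicted_counts, true_species, cutoff_percent=None):
    """Count correctly and incorrectly predicted species above a cutoff.

    predicted_counts: species -> number of reads assigned to it.
    true_species: set of species actually present in the community.
    cutoff_percent: keep species whose share of all predictions exceeds this
        percentage; None means no cutoff (any species with at least one read).
    Returns (correct, incorrect).
    """
    total = sum(predicted_counts.values())
    if total == 0:
        return 0, 0
    kept = set()
    for species, n in predicted_counts.items():
        if n <= 0:
            continue
        share = 100.0 * n / total
        if cutoff_percent is None or share > cutoff_percent:
            kept.add(species)
    return len(kept & true_species), len(kept - true_species)


# Hypothetical toy predictions: 100000 reads in total.
TOY_COUNTS = {"sp_a": 90000, "sp_b": 9945, "sp_x": 50, "sp_y": 5}
TOY_TRUTH = {"sp_a", "sp_b", "sp_c"}

if __name__ == "__main__":
    for cutoff in (None, 0.01, 0.1, 1.0):
        print(cutoff, count_species(TOY_COUNTS, TOY_TRUTH, cutoff))
```

In this toy case the two spurious low-abundance species disappear as the cutoff rises, mirroring the drop in incorrect predictions across the columns of the table, while a cutoff set too high would start to remove genuinely present but rare species.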
Table 4. Number of incorrectly predicted species^a for different abundance thresholds^b without clade exclusion. Fewer species are incorrectly predicted with the in silico data, which does not contain errors, than with the in vitro data containing sequencing errors (Table 3)

| Method | No cutoff^b: correct | No cutoff^b: incorrect | Cutoff > 0.01 %^b: correct | Cutoff > 0.01 %^b: incorrect | Cutoff > 0.1 %^b: correct | Cutoff > 0.1 %^b: incorrect | Cutoff > 1 %^b: correct | Cutoff > 1 %^b: incorrect |
|---|---|---|---|---|---|---|---|---|
| CARMA3 | 11 | 41 | 11 | 3 | 11 | 1 | 11 | 1 |
| CLARK | 11 | 0 | 11 | 0 | 11 | 0 | 11 | 0 |
| DiScRIBinATE RAPSearch2^c | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A |
| Kraken | 11 | 0 | 11 | 0 | 11 | 0 | 11 | 0 |
| Filtered Kraken | 11 | 0 | 11 | 0 | 11 | 0 | 11 | 0 |
| MEGAN4 BLASTN | 11 | 0 | 11 | 0 | 11 | 0 | 10 | 0 |
| MEGAN4 RAPSearch2 | 11 | 92 | 11 | 29 | 11 | 1 | 10 | 0 |
| MetaBin | 11 | 286 | 11 | 41 | 11 | 3 | 11 | 0 |
| MetaCV | 11 | 0 | 11 | 0 | 11 | 0 | 11 | 0 |
| MetaPhyler | 10 | 12 | 10 | 12 | 10 | 8 | 7 | 3 |
| PhymmBL^c | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A |
| RITA | 11 | 0 | 11 | 0 | 11 | 0 | 11 | 0 |
| TACOA^c | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A |
| MG-RAST best hit | 10 | 646 | 10 | 136 | 10 | 26 | 10 | 6 |
| MG-RAST LCA | 10 | 300 | 10 | 54 | 10 | 8 | 9 | 3 |

^a Using the FW in silico dataset of sequenced reads from 11 species
^b A cutoff of > x %, for example 0.01 %, would indicate that only species with a predicted abundance of at least x % of the total set of predictions were considered
^c These methods do not predict to the species level at this read length (they require longer read lengths). See additional analyses at other levels of clade exclusion