B Fosso1, M Santamaria1, M D'Antonio2, D Lovero1, G Corrado3, E Vizza3, N Passaro4, A R Garbuglia5, M R Capobianchi5, M Crescenzi4, G Valiente6, G Pesole1,7.
Abstract
SUMMARY: Shotgun metagenomics by high-throughput sequencing may allow deep and accurate characterization of host-associated total microbiomes, including bacteria, viruses, protists and fungi. However, the analysis of such sequencing data is still extremely challenging in terms of both overall accuracy and computational efficiency, and current methodologies show substantial variability in misclassification rate and resolution at lower taxonomic ranks or are limited to specific life domains (e.g. only bacteria). We present here MetaShot, a workflow for assessing the total microbiome composition from host-associated shotgun sequence data, and show its overall optimal accuracy performance by analyzing both simulated and real datasets.
Year: 2017; PMID: 28130230; PMCID: PMC5447231; DOI: 10.1093/bioinformatics/btx036
Source DB: PubMed; Journal: Bioinformatics; ISSN: 1367-4803; Impact factor: 6.937
(A) Benchmark assessment of Kraken (KR) and MetaShot (MS) on a simulated dataset (see the Supplementary Material for details) consisting of 19 582 500 human (94.5%), 986 114 bacterial (4.8%) and 146 886 viral (0.7%) reads. (B) Precision (P), Recall (R), F-measure (F) and Unclassified reads (U) of Kraken (KR), MetaShot (MS) and MetaPhlAn2 (MP) on the same simulated dataset, at the Species level.
| (A) | Assigned %<sup>a</sup> | | | Correctly Assigned %<sup>b</sup> | | |
|---|---|---|---|---|---|---|
| Human (host) | 100.00 | 99.18 | 0<sup>c</sup> | 100.00 | 99.99 | 0<sup>c</sup> |
| Family | 57.41 | 97.91 | 5.16 | 96.77 | 98.37 | 97.59 |
| Genus | 55.01 | 98.14 | 4.96 | 95.92 | 98.17 | 98.02 |
| Species | 54.17 | 99.31 | 4.76 | 79.52 | 88.06 | 90.7 |
| Family | 74.78 | 97.74 | 49.32 | 99.16 | 98.53 | 98.48 |
| Genus | 101.88 | 97.39 | 66.85 | 99.37 | 99.75 | 99.30 |
| Species | 73.45 | 97.81 | 43.86 | 98.98 | 96.70 | 95.46 |

<sup>a</sup> The percentage refers to the total number of reads assignable to the specific taxonomic rank.

<sup>b</sup> The percentage refers to the relevant assigned reads.

<sup>c</sup> MetaPhlAn2 assigns just the sequences containing specific taxon markers and does not search for human host sequences.
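
The percentages in the caption and footnotes can be illustrated with a short Python sketch. The read counts come from the caption. The helper names (`percent`, `assigned_pct`, `correctly_assigned_pct`, `precision_recall_f`) are hypothetical and are not part of MetaShot, Kraken or MetaPhlAn2. The precision, recall and F-measure formulas are the standard textbook definitions, assumed here because the record does not state the exact definitions used for the benchmark.

```python
def percent(part, whole):
    """Return part / whole expressed as a percentage."""
    if whole == 0:
        raise ValueError("whole must be non-zero")
    return 100.0 * part / whole


# Read counts of the simulated dataset, as given in the table caption.
reads = {"human": 19_582_500, "bacterial": 986_114, "viral": 146_886}
total = sum(reads.values())

# Share of each read class, rounded to one decimal as in the caption.
composition = {name: round(percent(n, total), 1) for name, n in reads.items()}


def assigned_pct(assigned, assignable):
    """'Assigned %': reads assigned at a rank, relative to the reads
    assignable to that rank (first footnote)."""
    return percent(assigned, assignable)


def correctly_assigned_pct(correct, assigned):
    """'Correctly Assigned %': correct assignments, relative to the
    reads that were assigned (second footnote)."""
    return percent(correct, assigned)


def precision_recall_f(tp, fp, fn):
    """Standard precision, recall and F-measure (harmonic mean).
    Assumed definitions; the benchmark may define them differently."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f_measure


if __name__ == "__main__":
    print(total, composition)
    # Hypothetical counts, for illustration only.
    print(assigned_pct(950, 1000), correctly_assigned_pct(900, 950))
    print(precision_recall_f(tp=8, fp=2, fn=2))
```

Because the first percentage uses the assignable reads as its denominator, a tool that assigns more reads to a rank than are truly assignable to it can report a value above 100, which would be one way to read the 101.88 entry in the table.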