Andreas Bremges, Esther Singer, Tanja Woyke, Alexander Sczyrba.
Abstract
We present a new tool, MeCorS, to correct chimeric reads and sequencing errors in Illumina data generated from single amplified genomes (SAGs). It uses sequence information derived from accompanying metagenome sequencing to accurately correct errors in SAG reads, even from ultra-low coverage regions. In evaluations on real data, we show that MeCorS outperforms BayesHammer, the most widely used state-of-the-art approach. MeCorS performs particularly well in correcting chimeric reads, which greatly improves both accuracy and contiguity of de novo SAG assemblies.
Year: 2016 PMID: 27153586 PMCID: PMC4937190 DOI: 10.1093/bioinformatics/btw144
Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937
Performance of SAG error correction
| Program | % perfect | % chimeric | % better | % worse |
|---|---|---|---|---|
| Raw | 22.52 ± 1.07 | 0.73 ± 0.15 | – | – |
| BayesHammer | 80.35 ± 8.77 | 0.77 ± 0.17 | 71.66 ± 2.12 | 0.33 ± 0.06 |
| MeCorS | 95.52 ± 0.43 | 0.06 ± 0.02 | 75.45 ± 1.11 | 0.26 ± 0.03 |
Mean percentage and standard deviation of perfect reads, chimeric reads (i.e. reads with parts mapped to different places), and corrected reads that became better or worse than the raw reads. Evaluation as described in Li (2015); please refer to Supplementary Table S3 for per-SAG metrics, including runtime and memory usage.
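The "mean ± standard deviation" entries in the table summarize one percentage per SAG. A minimal sketch of that aggregation follows; the function name `summarize` and the input values are hypothetical (they are not the data from Supplementary Table S3), and the use of the sample standard deviation is an assumption, since the source does not say which estimator was used.

```python
from statistics import mean, stdev


def summarize(per_sag_percentages):
    """Format per-SAG percentages as 'mean ± sd' with two decimals."""
    m = mean(per_sag_percentages)
    # Sample standard deviation (n - 1 denominator); this choice is an
    # assumption, the source does not state the estimator.
    s = stdev(per_sag_percentages)
    return f"{m:.2f} ± {s:.2f}"


# Hypothetical percentages of perfect reads for four SAGs (illustrative only).
example = [95.0, 96.0, 95.5, 95.5]
print(summarize(example))
```

Swapping `stdev` for `statistics.pstdev` would give the population standard deviation instead.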
Fig. 1. Effect on SAG assembly. We corrected the raw reads (R) with BayesHammer (B; Nikolenko et al.) or MeCorS (M). We then used IDBA-UD (Peng et al.) and SPAdes (Bankevich et al.) to assemble the SAGs. Brackets indicate all statistically significant changes (P < 0.05; two-tailed Wilcoxon signed-rank test). Quality assessment with QUAST (Gurevich et al.); Supplementary Tables S4 and S5 contain in-depth assembly statistics.
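The significance test named in the caption compares paired measurements, for example one assembly statistic per SAG before and after correction. The sketch below is a small exact implementation for illustration only: the function name `wilcoxon_signed_rank` and the sample values are hypothetical, it assumes no zero differences and no tied absolute differences, and it enumerates all sign patterns, so it only suits small samples. It is not the authors' analysis code; in practice a library routine such as `scipy.stats.wilcoxon` would likely be used.

```python
from itertools import product


def wilcoxon_signed_rank(x, y):
    """Exact two-tailed Wilcoxon signed-rank test for paired samples.

    Returns (statistic, p_value), where the statistic is the smaller of
    the positive-rank and negative-rank sums.
    """
    diffs = [a - b for a, b in zip(x, y)]
    if any(d == 0 for d in diffs):
        raise ValueError("zero differences are not handled in this sketch")
    abs_sorted = sorted(abs(d) for d in diffs)
    if len(set(abs_sorted)) != len(abs_sorted):
        raise ValueError("tied absolute differences are not handled")

    # Rank the absolute differences from 1 (smallest) to n (largest).
    rank = {value: i + 1 for i, value in enumerate(abs_sorted)}
    n = len(diffs)
    total = n * (n + 1) // 2
    w_plus = sum(rank[abs(d)] for d in diffs if d > 0)
    statistic = min(w_plus, total - w_plus)

    # Under the null hypothesis each rank is positive or negative with
    # probability 1/2, so all 2**n sign patterns are equally likely.
    extreme = 0
    for signs in product((0, 1), repeat=n):
        w = sum(r for r, s in zip(range(1, n + 1), signs) if s)
        if min(w, total - w) <= statistic:
            extreme += 1
    return statistic, extreme / 2 ** n


# Hypothetical paired values for six SAGs (illustrative only).
corrected = [10, 12, 15, 20, 26, 33]
raw = [9, 10, 12, 16, 21, 27]
stat, p = wilcoxon_signed_rank(corrected, raw)
print(stat, p, "significant" if p < 0.05 else "not significant")
```

With six pairs that all change in the same direction, the smallest attainable two-tailed P value is 2/64, which is why very small sample sizes can never reach significance at the 0.05 level with this test.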