Gabriel Renaud, Martin Kircher, Udo Stenzel, Janet Kelso.
Abstract
MOTIVATION: The conversion of the raw intensities obtained from next-generation sequencing platforms into nucleotide sequences with well-calibrated quality scores is a critical step in the generation of good sequence data. While recent model-based approaches can yield highly accurate calls, they require a substantial amount of processing time and/or computational resources. We previously introduced Ibis, a fast and accurate basecaller for the Illumina platform. We have continued active development of Ibis to take into account developments in the Illumina technology, as well as to make Ibis fully open source.
Year: 2013 PMID: 23471300 PMCID: PMC3634191 DOI: 10.1093/bioinformatics/btt117
Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937
Accuracy for each basecaller on an Illumina GAIIx dataset (2 × 126 cycles with 366 135 257 clusters)
| Basecaller | Training time | Calling time | Mapped (%) | Edit distance |
|---|---|---|---|---|
| Bustard | | | 583 348 201 (83.93%) | 1.379 |
| naiveBayesCall | 591 h | 658 h | 578 957 145 (83.34%) | 1.496 |
| AYB | | 394 h | 593 183 967 (85.52%) | 1.076 |
| Ibis | 19.4 h | 13.2 h | 592 929 953 (85.31%) | 1.167 |
| freeIbis | 21.3 h | 12.2 h | 594 095 219 (85.48%) | 1.145 |
The human sequences were mapped to the hg19 version of the human genome. For each method, the number of mapped sequences and the average number of mismatches among them were tallied. Time trials were conducted on a machine with 74 GB of RAM, using 8 of the 12 Intel Xeon cores running at 2.27 GHz. Percentages in the Mapped (%) column are relative to the sequences assigned to the read group of interest.
Fig. 1. Plot of the predicted versus the observed base quality score for control reads. Ideally, the base qualities should follow the diagonal line. The root mean square error (RMSE) shows that quality scores predicted using freeIbis have a greater correlation to their observed error rates.
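To make the comparison in Fig. 1 concrete, the following is a minimal sketch of how a predicted-versus-observed calibration RMSE could be computed. It is not taken from the Ibis or freeIbis code base. The function names `observed_quality` and `calibration_rmse` are hypothetical, and the sketch assumes that bases are binned by their integer predicted Phred score, that each bin counts equally in the RMSE, and that bins with no observed errors are skipped because their observed quality is unbounded. The record above does not say how the published RMSE handles these points.

```python
import math
from collections import defaultdict


def observed_quality(errors, total):
    """Phred-scaled observed quality, Q = -10 * log10(errors / total).

    Returns None when no errors were seen, since Q is then unbounded.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    if errors == 0:
        return None
    return -10.0 * math.log10(errors / total)


def calibration_rmse(calls):
    """RMSE between predicted and observed quality, one point per bin.

    `calls` is an iterable of (predicted_q, is_error) pairs, one per
    base call on control reads whose true sequence is known.
    Assumption: bins are unweighted and zero-error bins are skipped.
    """
    totals = defaultdict(int)
    errors = defaultdict(int)
    for predicted_q, is_error in calls:
        totals[predicted_q] += 1
        if is_error:
            errors[predicted_q] += 1

    squared_diffs = []
    for predicted_q, total in totals.items():
        observed = observed_quality(errors[predicted_q], total)
        if observed is None:
            continue  # observed quality unbounded for this bin
        squared_diffs.append((predicted_q - observed) ** 2)

    if not squared_diffs:
        raise ValueError("no bin with at least one observed error")
    return math.sqrt(sum(squared_diffs) / len(squared_diffs))
```

A perfectly calibrated basecaller would place every bin on the diagonal and give an RMSE of zero under this definition, and larger values indicate predicted scores that drift further from the observed error rates.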