Abstract
BACKGROUND: Next-generation sequencing systems are capable of rapid and cost-effective DNA sequencing, thus enabling routine sequencing tasks and taking us one step closer to personalized medicine. The accuracy and length of their reads, however, have yet to surpass those provided by the conventional Sanger sequencing method. This motivates the search for computationally efficient algorithms capable of reliable and accurate detection of the order of nucleotides in short DNA fragments from the acquired data.
Year: 2012 PMID: 22776067 PMCID: PMC3464607 DOI: 10.1186/1471-2105-13-160
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. A hidden Markov model of the generated signal in Illumina sequencing-by-synthesis platforms. An illustration of the graphical HMM of Illumina's sequencing platform. The observations represent signal intensities after the removal of residual effects. Each state combines two quantities, which represent a subsequence of the template centered at position t and the per-cluster density, respectively.
Comparison of ParticleCall with different
| Method | | Error rate | Base-calling time (min) |
|---|---|---|---|
| ParticleCall (via MCEM) | 400 | 0.0126 | 46 |
| | 800 | 0.0124 | 88 |
| | 1200 | 0.0124 | 130 |
| ParticleCall (via PFEM) | 400 | 0.0128 | 46 |
| | 800 | 0.0125 | 91 |
| | 1200 | 0.0125 | 133 |
| Rao-Blackwellized ParticleCall (via MCEM) | 100 | 0.0128 | 103 |
| | 200 | 0.0125 | 190 |
| | 300 | 0.0124 | 287 |
| | 400 | 0.0124 | 386 |
ParticleCall is run using parameters obtained via the MCEM parameter estimation scheme as well as via the PFEM parameter estimation algorithm proposed in this paper. Rao-Blackwellized ParticleCall is run using parameters via the MCEM parameter estimation scheme.
ParticleCall parameter estimation
| Window length | Base-calling error rate | Parameter estimation time (min) |
|---|---|---|
| 4 | 0.0125 | 50 |
| 5 | 0.0125 | 39 |
| 6 | 0.0127 | 29 |
| 7 | 0.0130 | 25 |
ParticleCall base-calling error rate and the parameter estimation time of the proposed PFEM parameter estimation algorithm.
Comparison of error rates and speed
| Method | Error rate | Base-calling time (min) | Parameter estimation time (min) |
|---|---|---|---|
| Bustard | 0.0152 | 2 (total) | |
| Rolexa | 0.0170 | 35 (total) | |
| naiveBayesCall | 0.0132 | 21 | 1139 |
| BayesCall | 0.0124 | 231 | 1139 |
| ParticleCall (via MCEM) | 0.0124 | 88 | 1139 |
| ParticleCall (via PFEM) | 0.0125 | 91 | 39 |
The base-calling error rate and the running times of different algorithms. ParticleCall is run using parameters obtained via the MCEM parameter estimation scheme as well as via the PFEM parameter estimation algorithm proposed in this paper. For Bustard and Rolexa, only the total running times are reported.
Figure 2. Per-cycle error rates of ParticleCall, BayesCall, naiveBayesCall, Rolexa and Bustard. The figure compares the per-cycle error rates of different base-calling algorithms. ParticleCall and BayesCall are the most accurate.
Figure 3. Discrimination ability D(ε) of quality scores vs. error tolerance. The figure shows the percentage of correctly called bases under different error tolerances ε.
de novo assembly results
| Coverage | Bustard N50 | Bustard Max | Rolexa N50 | Rolexa Max | naiveBayesCall N50 | naiveBayesCall Max | BayesCall N50 | BayesCall Max | ParticleCall via MCEM N50 | ParticleCall via MCEM Max | ParticleCall via PFEM N50 | ParticleCall via PFEM Max |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 5X | 271 | 607 | 259 | 565 | 278 | 604 | 292 | 629 | 299 | 637 | 289 | 632 |
| 10X | 1169 | 1750 | 971 | 1557 | 1180 | 1731 | 1269 | 1831 | 1316 | 1900 | 1341 | 1865 |
| 15X | 3624 | 3823 | 2885 | 3170 | 3726 | 3908 | 3466 | 3741 | 3742 | 3935 | 3697 | 3918 |
| 20X | 4694 | 4744 | 4529 | 4614 | 4756 | 4816 | 4827 | 4875 | 5102 | 5116 | 4795 | 5039 |
The maximum contig length and N50 length of de novo assembly using Velvet. The average values over 200 experiments are shown in the table.
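For readers unfamiliar with the two assembly metrics, the following is a minimal sketch of how N50 and maximum contig length are conventionally computed from a list of contig lengths. It uses the standard definition of N50 (the largest length L such that contigs of length at least L account for at least half of the total assembled bases). The function name `n50` and the example lengths are hypothetical and are not taken from the paper or from Velvet's output.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths (standard definition)."""
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("need at least one contig length")
    half_total = sum(lengths) / 2
    running = 0
    # Walk from the longest contig down; the first contig that brings the
    # running sum to at least half of the total length gives the N50.
    for length in lengths:
        running += length
        if running >= half_total:
            return length


# Hypothetical contig lengths, for illustration only.
example_lengths = [100, 200, 300, 400, 500]
example_n50 = n50(example_lengths)   # 500 + 400 = 900 >= 750, so N50 is 400
example_max = max(example_lengths)   # maximum contig length is 500
```

Under this definition, a higher N50 at the same coverage means that more of the assembly sits in long contigs, which is how the N50 and Max columns in the table are usually read.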