Richard J Roberts, Mauricio O Carneiro, Michael C Schatz.
Abstract
Of the current next-generation sequencing technologies, SMRT sequencing is sometimes overlooked. However, attributes such as long reads, modified base detection and high accuracy make SMRT a useful technology and an ideal approach to the complete sequencing of small genomes.
Year: 2013 PMID: 23822731 PMCID: PMC3953343 DOI: 10.1186/gb-2013-14-6-405
Source DB: PubMed Journal: Genome Biol ISSN: 1474-7596 Impact factor: 13.583
Figure 1. Idealized assembly graphs [18] of the 5.2 megabase-pair . The graphs encode the compressed de Bruijn graph derived from infinite-coverage, error-free reads, effectively representing the repeats in the genome and the upper bound of what could be achieved in a real assembly. Increasing the read length decreases the number of contigs because longer reads span more of the repeats. Note that the assembly with 5,000 bp reads has a self-edge because the chromosome is circular.
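The relationship between read length and contig count described in the Figure 1 caption can be illustrated with a toy calculation. The following is a minimal sketch, not the method of [18]: it assumes a circular genome, treats every k-mer as observed exactly (infinite coverage, no errors), ignores reverse complements, and counts the maximal non-branching paths of the resulting de Bruijn graph. The function name `count_contigs` and the 15-base toy sequence are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def count_contigs(genome: str, k: int) -> int:
    """Count contigs (maximal non-branching paths) in the de Bruijn graph
    of a circular genome, assuming every k-mer is observed without error.
    Hypothetical helper for illustration; ignores reverse complements."""
    if not 2 <= k <= len(genome):
        raise ValueError("k must satisfy 2 <= k <= len(genome)")

    # Wrap around the origin so k-mers spanning it are included.
    circular = genome + genome[:k - 1]
    kmers = {circular[i:i + k] for i in range(len(genome))}

    # Nodes are (k-1)-mers; each distinct k-mer is one directed edge.
    succ = defaultdict(set)
    pred = defaultdict(set)
    for kmer in kmers:
        u, v = kmer[:-1], kmer[1:]
        succ[u].add(v)
        pred[v].add(u)
    nodes = set(succ) | set(pred)

    def is_simple(node: str) -> bool:
        # Exactly one way in and one way out: the node sits inside a contig.
        return len(succ.get(node, ())) == 1 and len(pred.get(node, ())) == 1

    contigs = 0
    visited = set()

    # Every edge leaving a branching node starts one contig.
    for node in nodes:
        if is_simple(node):
            continue
        for nxt in succ.get(node, ()):
            contigs += 1
            while is_simple(nxt):
                visited.add(nxt)
                nxt = next(iter(succ[nxt]))

    # Cycles made only of simple nodes (for example, a fully resolved
    # circular chromosome) count as one contig each.
    for node in nodes:
        if is_simple(node) and node not in visited:
            contigs += 1
            cur = node
            while cur not in visited:
                visited.add(cur)
                cur = next(iter(succ[cur]))

    return contigs


# Toy circular genome containing the 5-base repeat "ACGTA" twice.
toy_genome = "ACGTAGGACGTATTC"
for k in (4, 8):
    print(k, count_contigs(toy_genome, k))
```

In this toy case the graph branches at the repeat when k is short, and collapses to a single cycle once k is long enough to span the repeat, which mirrors the self-edge noted in the caption.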
Figure 2. A sequencing-context breakdown of the empirical insertion error rate of the two platforms on NA12878 whole-genome data. The figure shows all contexts of size 8 that start with AAAAA. The empirical insertion quality score (y-axis) is Phred-scaled. Although the PacBio RS instrument has a higher error rate (approximately Q12), its error is independent of the sequencing context. Other platforms are known to have different error rates for different sequencing contexts. Illumina's HiSeq platform, shown here, has a lower error rate (approximately Q45 across eight independent runs), but contexts such as AAAAAAAA and AAAAACAG have extremely different error rates (Q30 versus Q55). This context-specific error rate creates bias that is not easily clarified by greater sequencing depth. Empirical insertion error rates were measured using the Genome Analysis Toolkit (GATK) - Base Quality Score Recalibration tool.
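The Q values quoted in the Figure 2 caption are Phred-scaled, meaning Q = -10 log10(p), where p is the error probability. The short sketch below converts the quoted scores to probabilities using that standard definition; the helper names are hypothetical and the script is independent of GATK.

```python
import math


def phred_to_error_prob(q: float) -> float:
    """Convert a Phred-scaled quality score to an error probability."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p: float) -> float:
    """Convert an error probability (0 < p <= 1) to a Phred-scaled score."""
    return -10 * math.log10(p)


# Quality scores quoted in the Figure 2 caption.
for q in (12, 30, 45, 55):
    print(f"Q{q}: error probability {phred_to_error_prob(q):.2e}")

# Fold difference in insertion error rate between the two HiSeq contexts.
fold = phred_to_error_prob(30) / phred_to_error_prob(55)
print(f"Q30 versus Q55: about {fold:.0f}-fold")
```

On this scale, Q12 corresponds to roughly a 6% error rate, while the gap between Q30 and Q55 is a difference of 2.5 orders of magnitude between sequence contexts on the same platform.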