Hui-Su Kim1, Sungwon Jeon1,2, Changjae Kim1, Yeon Kyung Kim1, Yun Sung Cho3, Jungeun Kim4, Asta Blazyte1, Andrea Manica5, Semin Lee1,2, Jong Bhak1,2,3,4.
Abstract
BACKGROUND: Long DNA reads produced by single-molecule and pore-based sequencers are more suitable for assembly and structural variation discovery than short-read DNA fragments. For de novo assembly, Pacific Biosciences (PacBio) and Oxford Nanopore Technologies (ONT) are the most widely used options. However, PacBio's SMRT sequencing is expensive for a full human genome assembly, costing more than US $40,000 for 30× coverage as of 2019. ONT PromethION sequencing, on the other hand, is 1/12 the price of PacBio for the same coverage. This study aimed to compare the cost-effectiveness of ONT PromethION and PacBio's SMRT sequencing in relation to assembly quality.
Keywords: Hi-C; KOREF; Korean reference genome; PromethION; nanopore sequencing; single-molecule sequencing
Year: 2019 PMID: 31794015 PMCID: PMC6889754 DOI: 10.1093/gigascience/giz125
Source DB: PubMed Journal: Gigascience ISSN: 2047-217X Impact factor: 6.524
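The price comparison in the abstract reduces to simple arithmetic. The sketch below is illustrative only: it takes the stated PacBio figure (more than US $40,000 for 30× coverage) as a lower bound and applies the stated 1/12 ratio, so the ONT result is also only a rough lower bound. The constant and function names are hypothetical.

```python
# Figures taken from the abstract; names are illustrative, not from the paper.
PACBIO_COST_USD = 40_000   # stated as "more than" this amount for 30x coverage
ONT_PRICE_RATIO = 1 / 12   # ONT PromethION price relative to PacBio, same coverage


def ont_cost_estimate(pacbio_cost_usd, ratio=ONT_PRICE_RATIO):
    """Scale a PacBio cost by the ONT/PacBio price ratio."""
    return pacbio_cost_usd * ratio


estimate = ont_cost_estimate(PACBIO_COST_USD)
print(f"Estimated ONT cost for 30x coverage: about US ${estimate:,.0f}")
```

Because the PacBio figure is a floor rather than an exact price, the estimate should be read as an order of magnitude, not a quote.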
Statistics of raw sequenced reads
| Statistic | ONT PromethION R9.4.1 (27×) | ONT PromethION R9.4.1 (64×) | PacBio Sequel (30×) | PacBio Sequel (62×) | Illumina HiSeq 2000 (short reads) |
|---|---|---|---|---|---|
| Number of reads | 15,004,723 | 47,591,997 | 11,195,434 | 20,683,965 | 1,433,779,680 |
| Total length of reads (bp) | 80,770,821,288 | 193,027,803,978 | 92,229,416,062 | 187,914,740,184 | 144,811,747,680 |
| N50 (bp) | 12,736 | 9,190 | 13,426 | 14,568 | 101 |
| Maximum read length (bp) | 774,322 | 1,160,324 | 65,865 | 169,910 | 101 |
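The N50 values reported in these tables follow the conventional definition: the length L such that sequences of length L or longer contain at least half of the total bases. A minimal sketch of that calculation is shown here. The function name and the toy lengths are hypothetical and are not data from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of read or contig lengths."""
    if not lengths:
        raise ValueError("lengths must be non-empty")
    ordered = sorted(lengths, reverse=True)   # longest first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first length at which the running sum reaches half of the
        # total number of bases is the N50.
        if running >= half_total:
            return length


# Toy example (hypothetical lengths): total = 54, half = 27,
# and 10 + 9 + 8 = 27 reaches the halfway point.
toy_lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]
print(n50(toy_lengths))  # prints 8
```

A higher N50 at similar total length indicates that more of the data sits in long sequences, which is why it is used to compare both raw reads and assemblies.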
Statistics of KOREF genome assemblies using ONT PromethION and PacBio Sequel sequencing
| Statistic | ONT PromethION R9.4.1, 27× assembly | ONT PromethION R9.4.1, 64× assembly | PacBio Sequel, 30× assembly | PacBio Sequel, 62× assembly |
|---|---|---|---|---|
| Contigs No. | 3,262 | 3,725 | 2,443 | 2,695 |
| Total length (bp) | 2,757,297,803 | 2,827,624,042 | 2,800,962,512 | 2,815,311,932 |
| N50 (bp) | 7,655,153 | 16,706,773 | 11,137,362 | 17,931,968 |
| Max contig length (bp) | 60,569,695 | 88,903,341 | 50,101,007 | 77,816,513 |
| Gap | 0 | 0 | 0 | 0 |
| GC content | 40.82% | 40.81% | 40.90% | 40.92% |
Figure 1: Comparison of (A) N50 lengths and (B) the longest contig or scaffold lengths for PromethION and PacBio assemblies of KOREF. Contig corresponds to assemblies without Hi-C data and scaffold corresponds to assemblies with Hi-C data.
Statistics of KOREF genome assemblies using ONT PromethION and PacBio Sequel sequencing with Hi-C mapping information
| Statistic | ONT PromethION R9.4.1 with Hi-C, 27× | ONT PromethION R9.4.1 with Hi-C, 64× | PacBio Sequel with Hi-C, 30× | PacBio Sequel with Hi-C, 62× |
|---|---|---|---|---|
| Scaffolds No. | 2,313 | 3,179 | 1,476 | 2,139 |
| Total length (bp) | 2,757,776,303 | 2,827,900,542 | 2,801,450,512 | 2,815,594,432 |
| N50 (bp) | 32,758,624 | 56,457,651 | 38,113,117 | 59,361,327 |
| Maximum scaffold length (bp) | 120,666,262 | 175,227,974 | 126,818,544 | 174,360,016 |
| Gap | 0.02% | 0.01% | 0.02% | 0.01% |
| GC content | 40.82% | 40.81% | 40.90% | 40.90% |
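One way to read the contig and scaffold tables together is to compute how much the N50 grows once Hi-C information is added. The sketch below copies the N50 values from the two assembly tables and divides scaffold N50 by contig N50. The dictionary keys and variable names are hypothetical labels, and the ratio is an illustrative summary rather than a metric reported in the source.

```python
# N50 values (bp) copied from the assembly tables; keys are illustrative labels.
contig_n50 = {
    "ONT 27x": 7_655_153,
    "ONT 64x": 16_706_773,
    "PacBio 30x": 11_137_362,
    "PacBio 62x": 17_931_968,
}
scaffold_n50 = {
    "ONT 27x": 32_758_624,
    "ONT 64x": 56_457_651,
    "PacBio 30x": 38_113_117,
    "PacBio 62x": 59_361_327,
}

# Fold increase in N50 after Hi-C scaffolding, per assembly.
fold_increase = {
    name: scaffold_n50[name] / contig_n50[name] for name in contig_n50
}

for name, fold in fold_increase.items():
    print(f"{name}: N50 increased about {fold:.1f}-fold with Hi-C")
```

On these numbers every assembly gains more than threefold in N50, with the largest relative gain for the lowest-coverage ONT assembly.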
Statistics of KOREF genome assembly assessment using BUSCO and accuracy comparison
| BUSCO assessment (%) | ONT PromethION R9.4.1, 27× assembly | ONT PromethION R9.4.1, 64× assembly | ONT PromethION R9.4.1, 27× assembly with Hi-C | ONT PromethION R9.4.1, 64× assembly with Hi-C | PacBio Sequel, 30× assembly | PacBio Sequel, 62× assembly | PacBio Sequel, 30× assembly with Hi-C | PacBio Sequel, 62× assembly with Hi-C |
|---|---|---|---|---|---|---|---|---|
| Complete | 92.5 | 92.7 | 92.6 | 94.0 | 93.8 | 93.9 | 93.8 | 93.5 |
| Complete and single-copy | 91.8 | 91.6 | 91.9 | 93.2 | 93.0 | 93.1 | 93.0 | 92.7 |
| Complete and duplicated | 0.7 | 1.1 | 0.7 | 0.8 | 0.8 | 0.8 | 0.8 | 0.8 |
| Fragmented | 3.1 | 3.7 | 3.2 | 3.1 | 3.0 | 2.9 | 3.0 | 3.1 |
| Missing | 4.4 | 3.6 | 4.2 | 2.9 | 3.2 | 3.2 | 3.2 | 3.4 |
| Accuracy comparison | 99.78 | 99.73 | 99.78 | 99.73 | 99.83 | 99.79 | 99.86 | 99.80 |
Note: The accuracy comparison values are relative to KOREF_S, the single assembly of KOREF.
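BUSCO categories are related by two identities: complete equals single-copy plus duplicated, and complete plus fragmented plus missing equals 100%. The sketch below checks the BUSCO rows of the table against both identities. The values are copied from the table, while the function name, labels, and the rounding tolerance are assumptions made for illustration.

```python
# BUSCO percentages copied from the table, in the order:
# (complete, single_copy, duplicated, fragmented, missing)
busco = {
    "ONT 27x": (92.5, 91.8, 0.7, 3.1, 4.4),
    "ONT 64x": (92.7, 91.6, 1.1, 3.7, 3.6),
    "ONT 27x + Hi-C": (92.6, 91.9, 0.7, 3.2, 4.2),
    "ONT 64x + Hi-C": (94.0, 93.2, 0.8, 3.1, 2.9),
    "PacBio 30x": (93.8, 93.0, 0.8, 3.0, 3.2),
    "PacBio 62x": (93.9, 93.1, 0.8, 2.9, 3.2),
    "PacBio 30x + Hi-C": (93.8, 93.0, 0.8, 3.0, 3.2),
    "PacBio 62x + Hi-C": (93.5, 92.7, 0.8, 3.1, 3.4),
}

TOLERANCE = 0.05  # assumed allowance for rounding to one decimal place


def is_consistent(complete, single, duplicated, fragmented, missing,
                  tol=TOLERANCE):
    """Check both BUSCO identities within a rounding tolerance."""
    parts_ok = abs(single + duplicated - complete) <= tol
    total_ok = abs(complete + fragmented + missing - 100.0) <= tol
    return parts_ok and total_ok


for name, row in busco.items():
    status = "consistent" if is_consistent(*row) else "inconsistent"
    print(f"{name}: {status}")
```

A check like this is a quick way to catch transcription errors when table values are copied by hand.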