Hong Zhang, Engkong Tan, Yutaka Suzuki, Yusuke Hirose, Shigeharu Kinoshita, Hideyuki Okano, Jun Kudoh, Atsushi Shimizu, Kazuyoshi Saito, Shugo Watabe, Shuichi Asakawa.
Abstract
De novo assembly of large genomes still leaves room for improvement. Here, we improved draft genome sequence quality by employing doubled-haploid individuals. We sequenced wildtype and doubled-haploid Takifugu rubripes genomes under the same conditions using the Illumina platform and assembled contigs with SOAPdenovo2. We observed 5.4-fold and 2.6-fold improvements in contig N50 and scaffold N50, respectively, for the doubled-haploid individuals compared with the wildtype, indicating that the use of a doubled-haploid genome aids accurate genome analysis.
Year: 2014 PMID: 25345569 PMCID: PMC5381364 DOI: 10.1038/srep06780
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Comparison of assembly performance for wildtype and doubled-haploid genomes.
The sizes of (a) contig N50, (b) contig max, (c) scaffold N50, and (d) scaffold max, including sizes of four individuals (WT-1, WT-2, DH-1, and DH-2) with different data coverage (44×, 49×, 54×, and 59×) are shown.
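For readers unfamiliar with the N50 statistic used in the figure, the following is a minimal sketch under the usual definition: the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly length. The function name `n50` and the toy lengths are illustrative assumptions, not taken from the paper, and SOAPdenovo2's own reporting may differ in detail.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths (bp).

    Assumes the common definition: walk the sequences from longest to
    shortest and return the length at which the running total first
    reaches half of the summed length.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("n50() needs at least one sequence length")


# Hypothetical toy assembly, not data from the study.
toy_contigs = [2, 2, 2, 3, 3, 4, 8, 8]
toy_n50 = n50(toy_contigs)
```

A larger N50 means that more of the assembly sits in long sequences, which is why it serves here as the headline measure of contiguity.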
DNA libraries and sequencing conditions
| Individuals | Libraries | Instruments | Num. of seq | Length (bp) | Total residues (bp) |
|---|---|---|---|---|---|
| WT-1 | 230-bp PE | HiSeq 2000 | 251,423,596 | 100 | 25,142,359,600 |
| WT-2 | 230-bp PE | HiSeq 2000 | 247,609,546 | 100 | 24,760,954,600 |
| WT-3 | 400-bp PE | GAIIx | 84,857,156 | 101 | 8,570,572,756 |
| WT-3 | 2-kb MP | HiSeq 2000 | 278,642,344 | 76 | 21,176,818,144 |
| WT-3 | 5-kb MP | HiSeq 2000 | 244,796,700 | 76 | 18,604,549,200 |
| DH-1 | 300-bp PE | HiSeq 2000 | 283,351,680 | 101 | 28,618,519,680 |
| DH-1 | 500-bp PE | HiSeq 2000 | 253,179,572 | 101 | 25,571,136,772 |
| DH-1 | 230-bp PE | HiSeq 2000 | 245,201,650 | 100 | 24,520,165,000 |
| DH-1 | 2-kb MP | HiSeq 2000 | 248,467,078 | 101 | 25,095,174,878 |
| DH-1 | 5-kb MP | HiSeq 2000 | 339,507,094 | 101 | 34,290,216,494 |
| DH-2 | 230-bp PE | HiSeq 2000 | 236,267,482 | 100 | 23,626,748,200 |
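The "Total residues" column in the table above appears to be the number of sequences multiplied by the read length. The sketch below checks that relationship for every row. The values are copied from the table, while the tuple layout and the names `libraries`, `total_residues`, and `mismatches` are illustrative assumptions.

```python
# (individual, library, number of sequences, read length in bp, reported total residues in bp)
libraries = [
    ("WT-1", "230-bp PE", 251_423_596, 100, 25_142_359_600),
    ("WT-2", "230-bp PE", 247_609_546, 100, 24_760_954_600),
    ("WT-3", "400-bp PE", 84_857_156, 101, 8_570_572_756),
    ("WT-3", "2-kb MP", 278_642_344, 76, 21_176_818_144),
    ("WT-3", "5-kb MP", 244_796_700, 76, 18_604_549_200),
    ("DH-1", "300-bp PE", 283_351_680, 101, 28_618_519_680),
    ("DH-1", "500-bp PE", 253_179_572, 101, 25_571_136_772),
    ("DH-1", "230-bp PE", 245_201_650, 100, 24_520_165_000),
    ("DH-1", "2-kb MP", 248_467_078, 101, 25_095_174_878),
    ("DH-1", "5-kb MP", 339_507_094, 101, 34_290_216_494),
    ("DH-2", "230-bp PE", 236_267_482, 100, 23_626_748_200),
]


def total_residues(num_sequences, read_length):
    """Total sequenced bases, assuming every read has the same length."""
    return num_sequences * read_length


# Collect any rows where the reported total differs from the product.
mismatches = [
    (individual, library)
    for individual, library, num_seq, length, reported in libraries
    if total_residues(num_seq, length) != reported
]
```

With the values as listed, `mismatches` comes out empty, so each reported total is consistent with its read count and read length.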
Results of mixed scaffolding using paired reads of DH-1 and WT-3 libraries
| Input libraries (Number of sequences for scaffolding) | Contigs | Scaffold N50 (bp) | Longest scaffold (bp) | Scaffold total residues (bp) |
|---|---|---|---|---|
| DH-1 300-bp PE | WT-1 | 353,902 | 2,553,311 | 450,109,058 |
| DH-1 300-bp PE | WT-2 | 413,499 | 3,882,648 | 454,810,436 |
| DH-1 300-bp PE | DH-1 | 947,327 | 6,228,152 | 379,297,254 |
| DH-1 300-bp PE | DH-2 | 919,481 | 7,216,699 | 377,125,616 |
| WT-3 400-bp PE | WT-1 | 449,077 | 3,817,845 | 448,973,730 |
| WT-3 400-bp PE | WT-2 | 519,253 | 3,919,511 | 454,445,076 |
| WT-3 400-bp PE | DH-1 | 1,008,262 | 6,464,954 | 387,500,787 |
| WT-3 400-bp PE | DH-2 | 1,000,074 | 7,536,928 | 386,941,998 |
a All reads from the paired-end (PE) libraries were trimmed to 100 bp in length.
b All reads from the mate pair (MP) libraries were trimmed to 75 bp in length.