Aboozar Soorni, David Haak, David Zaitlin, Aureliano Bombarely.
Abstract
BACKGROUND: The development of long-read sequencing technologies, such as single-molecule real-time (SMRT) sequencing by PacBio, has revolutionized the sequencing of small genomes. Sequencing organelle genomes using PacBio long-read data is a cost-effective, straightforward approach. Nevertheless, simple-to-use software that performs the assembly from raw reads is currently scarce.
Keywords: Chloroplast; Mitochondria; Organelle Genome Assembly; PacBio
Year: 2017 PMID: 28061749 PMCID: PMC5219736 DOI: 10.1186/s12864-016-3412-9
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Summary of the M. musculus mitochondrial genome assembly
| Input reads | Reference | Mapped reads | % Bases mapped | Estimated depth (x) | Assembly size (bp) | Organelle completed^a |
|---|---|---|---|---|---|---|
| 50,000 | | 39 | 0.22 | 26 | 2,678 | NO |
| 100,000 | | 83 | 0.18 | 42 | 12,377 | NO |
| 163,477 | | 138 | 0.22 | 69 | 16,294 | YES |
| 50,000 | | 35 | 0.20 | 24 | NA | NO |
| 100,000 | | 69 | 0.16 | 36 | 12,332 | NO |
| 163,477 | | 110 | 0.15 | 56 | 16,299 | YES |
| 50,000 | | 26 | 0.16 | 19 | NA | NO |
| 100,000 | | 53 | 0.13 | 30 | 10,247 | NO |
| 163,477 | | 86 | 0.12 | 44 | 16,292 | YES |
| 50,000 | | 10 | 0.07 | 8 | NA | NO |
| 100,000 | | 17 | 0.05 | 11 | 6,580 | NO |
| 163,477 | | 31 | 0.04 | 17 | 7,193 | NO |

^a The mitochondrial genome assembly was considered complete when its size differed from that of the reference genome by <10 nucleotides.
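The completeness criterion in the table footnote can be written as a one-line check. The following is a minimal sketch, not code from Organelle_PBA: the function name `is_complete` is hypothetical, and the reference length of 16,299 bp for NC_005089.1 is an assumption that is not stated in the text above.

```python
def is_complete(assembly_bp, reference_bp, tolerance_bp=10):
    """Return True when the assembly size differs from the reference by < tolerance_bp.

    An assembly size of None (reported as NA in the table) is never complete.
    """
    if assembly_bp is None:
        return False
    return abs(assembly_bp - reference_bp) < tolerance_bp


# Assumed length of the M. musculus mitochondrial reference (NC_005089.1);
# this number does not appear in the text and is supplied for illustration only.
MOUSE_MT_REFERENCE_BP = 16_299

# Assembly sizes taken from the table; None stands for NA.
for size in (2_678, 12_377, 16_294, None):
    print(size, is_complete(size, MOUSE_MT_REFERENCE_BP))
```

The tolerance is a parameter so that a looser threshold can be passed for other organelles.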
Fig. 1 (a) Remapping of Mus musculus PacBio DNA sequencing reads to a mitochondrial genome reference assembly. Each read is represented by a dark gray line marking its position. (b) Coverage for the PacBio read remapping for M. musculus. (c) Alignment between the M. musculus reference mitochondrial genome (NC_005089.1) and the M. musculus assembly performed by Organelle_PBA. SNPs are represented by small horizontal blue lines. (d) Remapping of Arabidopsis thaliana PacBio sequencing reads to a reference chloroplast genome assembly. Each read is represented by a dark gray line marking its position. (e) Coverage for the PacBio read remapping for A. thaliana. The inverted repeats are indicated by ~2X coverage relative to the LSC and SSC regions. (f) Alignment between the A. thaliana reference chloroplast genome (NC_000932.1) and the A. thaliana assembly performed by Organelle_PBA. SNPs and indels are represented by small horizontal blue and purple lines, respectively. Reversed alignments are represented by dark gray lines.
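The observation that inverted repeats show up at roughly twice the coverage of the single-copy regions suggests a simple way to flag them from a per-base depth profile. The sketch below is an illustration under stated assumptions, not the method used by Organelle_PBA: the function name `high_coverage_runs`, the 1.5x threshold, and the toy profile are all hypothetical.

```python
from statistics import median


def high_coverage_runs(coverage, ratio=1.5, min_len=1):
    """Return half-open (start, end) intervals where depth >= ratio * median depth."""
    threshold = ratio * median(coverage)
    runs, start = [], None
    for pos, depth in enumerate(coverage):
        if depth >= threshold:
            if start is None:
                start = pos  # a high-coverage run begins here
        elif start is not None:
            if pos - start >= min_len:
                runs.append((start, pos))
            start = None
    # Close a run that extends to the end of the profile.
    if start is not None and len(coverage) - start >= min_len:
        runs.append((start, len(coverage)))
    return runs


# Synthetic toy profile (not real data): single-copy, repeat, single-copy, repeat.
toy = [100] * 8 + [200] * 3 + [100] * 2 + [200] * 3
print(high_coverage_runs(toy))
```

Using the median as the baseline assumes that single-copy sequence makes up more than half of the profile, so that the median reflects single-copy depth.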
Summary of the A. thaliana chloroplast genome assembly
| Input reads | Reference | Mapped reads | % Bases mapped | Estimated depth (x) | Assembly size (bp) | Organelle completed^a |
|---|---|---|---|---|---|---|
| 5,000 | | 287 | 23.06 | 23 | 42,978 | NO |
| 10,000 | | 611 | 24.40 | 50 | 150,039 | NO |
| 50,000 | | 3,013 | 23.79 | 244 | 154,472 | YES |
| 100,000 | | 5,777 | 23.01 | 470 | 154,471 | YES |
| 163,448 | | 9,409 | 23.15 | 771 | 154,474 | YES |
| 5,000 | | 277 | 21.82 | 22 | 59,513 | NO |
| 10,000 | | 591 | 23.08 | 48 | 153,132 | NO |
| 50,000 | | 2,923 | 23.08 | 239 | 154,474 | YES |
| 100,000 | | 5,565 | 22.13 | 457 | 154,481 | YES |
| 163,448 | | 9,102 | 22.43 | 755 | 154,473 | YES |
| 5,000 | | 233 | 18.88 | 18 | 73,382 | NO |
| 10,000 | | 507 | 20.62 | 41 | 151,984 | NO |
| 50,000 | | 2,516 | 20.67 | 204 | 154,469 | YES |
| 100,000 | | 4,807 | 20.04 | 393 | 154,477 | YES |
| 163,448 | | 7,855 | 20.28 | 649 | 154,472 | YES |

^a The chloroplast genome assembly was considered complete when its size differed from that of the reference genome by <10 nucleotides.
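To read off where an assembly first becomes complete, the rows of one reference block can be scanned for the smallest input that was marked YES. This is a minimal sketch: the helper name `first_complete` is hypothetical, and the rows are copied from the first block of the chloroplast table above, whose reference label is missing in the source.

```python
# (input reads, estimated depth, assembly size in bp, completed),
# copied from the first block of rows in the chloroplast table.
ROWS = [
    (5_000, 23, 42_978, False),
    (10_000, 50, 150_039, False),
    (50_000, 244, 154_472, True),
    (100_000, 470, 154_471, True),
    (163_448, 771, 154_474, True),
]


def first_complete(rows):
    """Return the completed row with the fewest input reads, or None if none completed."""
    done = [row for row in rows if row[3]]
    return min(done, key=lambda row: row[0]) if done else None


row = first_complete(ROWS)
if row is not None:
    reads, depth, size, _ = row
    print(f"first complete assembly: {reads} reads, depth {depth}x, {size} bp")
```

For this block the assembly is incomplete at 50x and complete at 244x; the table does not sample the depths in between, so the exact transition point cannot be read from it.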
Summary of the A. thaliana mitochondrial genome assembly
| Input reads | Reference | Mapped reads | % Bases mapped | Estimated depth (x) | Assembly size (bp) | Organelle completed^a |
|---|---|---|---|---|---|---|
| 5,000 | | 101 | 8.88 | 4 | 27,294 | NO |
| 10,000 | | 215 | 8.80 | 8 | 57,303 | NO |
| 50,000 | | 1,006 | 8.63 | 37 | 156,177 | NO |
| 100,000 | | 1,861 | 7.92 | 68 | 152,405 | NO |
| 163,448 | | 3,046 | 7.92 | 111 | 177,810 | NO |
| 490,143 | | 11,080 | 8.31 | 434 | 136,334 | NO |
| 817,099 | | 21,099 | 8.72 | 829 | 150,873 | NO |

^a The mitochondrial genome assembly was considered complete when its size differed from that of the reference by <100 nucleotides.