Silvia Garaycochea, Pablo Speranza, Fernando Alvarez-Valin.
Abstract
PREMISE OF THE STUDY: We developed a bioinformatic strategy to recover and assemble a chloroplast genome using data derived from low-coverage 454 GS FLX/Roche whole-genome sequencing.
Keywords: bioinformatic methods; chloroplast genome; next-generation sequencing; weedy rice
Year: 2015 PMID: 26504677 PMCID: PMC4610308 DOI: 10.3732/apps.1500022
Source DB: PubMed Journal: Appl Plant Sci ISSN: 2168-0450 Impact factor: 1.936
Fig. 1. Schematic representation of the evolutionary process that follows the insertion of a chloroplast DNA segment into the nuclear genome. The inserted fragment is represented by a blue box in the nuclear genome, whereas the homologous fragment that remains in the chloroplast (referred to as “donor” DNA) is represented by a green box. The main evolutionary events are depicted: accumulation of point mutations in both genomes (represented by yellow vertical lines) and fragmentation of inserts in the nuclear genome. The predicted types of sequencing reads (1A, 1B, 2A, 2B, 3A, and 3B) and how they are expected to match both genomes are also shown.
Fig. 2. Strategy to identify AM356-8-specific chloroplast DNA insertions into the nuclear genome. Read-filtering criteria: partial, non-overlapping alignment with both genomes (O% < 80 with both genomes) and 100% identity with both genomes on the respective segments (ID% ≈ 100%).
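The read-filtering criteria in Fig. 2 can be sketched as a small predicate. This is a minimal illustration, not the authors' pipeline. The `Hit` record, the function names, the 98% identity cutoff standing in for "ID% ≈ 100%", and the tolerance of a few shared bases between the two segments are all assumptions made for the example. Only the O% < 80 threshold comes from the caption, and O% is assumed here to mean the aligned fraction of the read.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One alignment of a read against one genome (hypothetical record)."""
    genome: str        # "plastid" or "nuclear"
    identity: float    # ID%, percentage of nucleotide identity
    read_start: int    # 1-based, inclusive coordinate on the read
    read_end: int      # 1-based, inclusive coordinate on the read
    read_length: int

    @property
    def overlap_pct(self) -> float:
        # Assumed definition of O%: share of the read covered by the alignment.
        return 100.0 * (self.read_end - self.read_start + 1) / self.read_length


def shared_bp(a: Hit, b: Hit) -> int:
    """Number of read positions covered by both alignments."""
    return max(0, min(a.read_end, b.read_end) - max(a.read_start, b.read_start) + 1)


def is_candidate_junction(plastid: Hit, nuclear: Hit,
                          max_overlap_pct: float = 80.0,  # from the caption
                          min_identity: float = 98.0,     # assumed cutoff
                          max_shared_bp: int = 10) -> bool:  # assumed tolerance
    """True if the read aligns partially, and near-perfectly, to both genomes."""
    return (plastid.overlap_pct < max_overlap_pct
            and nuclear.overlap_pct < max_overlap_pct
            and plastid.identity >= min_identity
            and nuclear.identity >= min_identity
            and shared_bp(plastid, nuclear) <= max_shared_bp)
```

A read whose two halves map to different genomes passes the filter, whereas a read that aligns end to end with the chloroplast does not, because its O% reaches 100.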
Fig. 3. Strategy followed to identify chloroplast reads from whole-genome data. R japonica, R indica, R nivara = reads aligned with each respective chloroplast reference genome; RC = set of reads with more than 99% overlap with the chloroplast reference genome.
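The selection step in Fig. 3 might look like the following sketch. It assumes that the per-reference read sets are pooled and that a read is kept when its best overlap with any reference exceeds 99%. The caption does not say how the three sets are combined, so the pooling rule, the function name, and the input layout are hypothetical.

```python
def select_chloroplast_reads(overlap_by_reference, min_overlap_pct=99.0):
    """Return the IDs of reads whose best overlap exceeds the threshold.

    overlap_by_reference maps a reference name to {read_id: overlap_pct}.
    Pooling across references is an assumption of this sketch.
    """
    best = {}
    for hits in overlap_by_reference.values():
        for read_id, pct in hits.items():
            # Keep the highest overlap seen for each read across references.
            best[read_id] = max(pct, best.get(read_id, 0.0))
    return {read_id for read_id, pct in best.items() if pct > min_overlap_pct}
```

For example, a read with 97% overlap against one reference and 99.5% against another would be retained, while a read that never exceeds 99% would be left out of RC.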
Fig. 4. Overview of the alignment between the contigs recovered for the de novo AM356-8 chloroplast assembly and the reference chloroplast genome. The alignment was made with the Artemis Comparison Tool (ACT; Carver et al., 2005). LSC = long single copy; SSC = short single copy.
Indel comparison among the four reference genomes and reads from AM356-8. Only indels in the AM356-8 chloroplast genome are shown.
| Indel position (kb) | Variant genome | AM356-8 (no. of reads) | Length (bp) | Variable sequence | Annotation |
| 8 | | 23 | 69 | GAATCCTATTTTTGTTCTTATACCCATGCAATAGAGAGGAGTGGGAAAAGGGAGGTTACTTTTTTTCA | Nonannotated predicted ORF |
| 12 | | 14 | 4 | AGGG | Intergenic |
| 14 | | 1 | 2 | AC | Intergenic |
| 46 | | 8 | 5 | TATAT | Intergenic |
| 57 | | 10 | 16 | TTTTTTAGAATACTAA | Intergenic |
| 60 | | 32 | 5 | deletion: TATTG | Intergenic |
| 65 | | 23 | 2 | TT | Intergenic |
| 77 | | 43 | 3 | deletion: TGG | Intergenic |
Position of the indel in the alignment of the AM356-8 genome with the four reference genomes.
Reads representing AM356-8-specific nuclear insertion of chloroplast DNA.
| Read ID | Matching genome | ID% | Aln length (bp) | Read length (bp) | Aln start (read) | Aln end (read) | Start DB (genome) | End DB (genome) |
| 2GW6GW | Plastid | 99.59 | 246 | 517 | 271 | 515 | 54,081 | 54,326 |
| | Chr. 2 | 99.63 | 270 | 517 | 1 | 270 | 1,879,451 | 1,879,182 |
| 02F9BJE | Plastid | 100 | 116 | 382 | 267 | 382 | 111,458 | 111,573 |
| | Chr. 2 | 98.13 | 268 | 382 | 1 | 268 | 15,462,658 | 15,462,391 |
| 2IEQIE | Plastid | 100 | 115 | 516 | 1 | 115 | 30,704 | 30,818 |
| | Chr. 4 | 99.76 | 411 | 516 | 106 | 516 | 32,858,510 | 32,858,919 |
| 2JK097 | Plastid | 100 | 199 | 421 | 223 | 421 | 43,044 | 42,846 |
| | Chr. 5 | 99.56 | 228 | 421 | 1 | 228 | 2,460,207 | 2,460,433 |
| 2HC0XG | Plastid | 100 | 131 | 437 | 307 | 437 | 45,371 | 45,241 |
| | Chr. 5 | 98.04 | 306 | 437 | 1 | 306 | 3,748,044 | 3,748,344 |
| 2GJK3F | Plastid | 100 | 134 | 359 | 226 | 359 | 46,482 | 46,615 |
| | Chr. 5 | 99.12 | 228 | 359 | 1 | 228 | 113,635,156 | 113,634,929 |
| 2G6KGA | Plastid | 98.70 | 154 | 351 | 198 | 351 | 67,685 | 67,533 |
| | Chr. 5 | 100 | 205 | 351 | 1 | 205 | 26,830,841 | 26,831,045 |
| 2H8EV7 | Plastid | 89.36 | 141 | 488 | 1 | 134 | 71,978 | 71,844 |
| | Chr. 8 | 96.87 | 319 | 488 | 113 | 430 | 25,340,556 | 25,340,238 |
| 2JD9YU | Plastid | 100 | 185 | 397 | 1 | 185 | 88,217 | 88,401 |
| | Chr. 12 | 100 | 218 | 397 | 180 | 397 | 23,930,784 | 23,930,567 |
| 2GBG4O | Plastid | 100 | 123 | 281 | 159 | 281 | 77,439 | 77,561 |
| | Chr. 4 | 100 | 162 | 281 | 1 | 162 | 16,565,916 | 16,566,077 |
| 2F55JZ | Plastid | 100 | 185 | 397 | 1 | 185 | 88,217 | 88,401 |
| | Chr. 12 | 100 | 218 | 397 | 180 | 397 | 23,930,784 | 23,930,567 |
Note: Read ID = read name; ID% = percentage of nucleotide identity; Aln length = alignment length in base pairs (between read and genome); Aln start = coordinate on the read where the alignment starts; Aln end = coordinate on the read where the alignment ends; Start DB = coordinate on the database (genome) where the alignment starts; End DB = coordinate on the genome where the alignment ends.
All read names start with the sequence GCFF90V.
Plastid stands for Oryza sativa subsp. japonica (cv. Nipponbare) chloroplast genome, accession number: AY522330.1. Chromosomes 1–12 are the chromosomes from the same cultivar.
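The genome coordinates in the table of nuclear insertions can be read with a short helper such as the one below. It is only a sketch: the function names are hypothetical, and treating "Start DB greater than End DB" as a reverse-strand match is an assumption based on the common convention of tabular alignment output, which the notes above do not state explicitly.

```python
def parse_coord(text: str) -> int:
    """Convert a coordinate written with thousands separators, e.g. '1,879,451'."""
    return int(text.replace(",", ""))


def describe_segment(start_db: str, end_db: str) -> tuple[str, int]:
    """Return (strand, span in bp) for one aligned genome segment.

    Assumption: start > end means the read matches the reverse strand.
    """
    start, end = parse_coord(start_db), parse_coord(end_db)
    strand = "+" if start <= end else "-"
    span = abs(end - start) + 1  # coordinates are inclusive
    return strand, span
```

Applied to read 2GW6GW, the plastid segment (54,081 to 54,326) comes out as a forward match spanning 246 bp and the Chr. 2 segment (1,879,451 to 1,879,182) as a reverse match spanning 270 bp, in agreement with the alignment lengths listed for that read.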