R M Redwan, A Saidin, S V Kumar.
Abstract
BACKGROUND: Pineapple (Ananas comosus var. comosus) is known as the king of fruits for its crown and is the third most important tropical fruit after banana and citrus. The plant, which is indigenous to South America, is the most important species in the Bromeliaceae family and is largely traded for fresh fruit consumption. Here, we report the complete chloroplast sequence of the MD-2 pineapple that was sequenced using the PacBio sequencing technology.
Year: 2015 PMID: 26264372 PMCID: PMC4534033 DOI: 10.1186/s12870-015-0587-1
Source DB: PubMed Journal: BMC Plant Biol ISSN: 1471-2229 Impact factor: 4.215
Fig. 1 Syntenic dotplot generated by MUMmer [96] from the nucmer alignment between the contig produced by MIRA (a) before rearrangement and (b) after rearrangement, with the T. latifolia chloroplast genome as the reference. The initial MIRA contig started at a different position from the conserved chloroplast structure seen in T. latifolia, and its ends overlapped in the inverted repeat region. In addition, the SSC was inverted relative to the reference because of uncertainty in placing the first copy of the repeated sequence flanking the SSC
Fig. 2 Coverage profile of the pineapple chloroplast genome after mapping back the error-corrected PacBio reads, the uncorrected PacBio reads and the Illumina short reads. Each of the three rings represents the depth of coverage obtained by mapping back the reads used in the assembly of the pineapple chloroplast genome. From the outermost to the innermost, the rings represent the corrected PacBio reads, the Illumina short reads and the uncorrected PacBio reads mapped to the pineapple chloroplast genome. The height of each ring is proportional to the number of reads mapped across the chloroplast genome. The figure was drawn using BRIG [41]
Fig. 3 The A. comosus chloroplast genome. The figure shows a circular representation of the pineapple chloroplast genome, with the genes in the gene content ring colour-coded by functional category. The innermost circle denotes the GC content across the genome. Genes transcribed counter-clockwise are on the outer ring and genes transcribed clockwise are on the inner ring. The illustration was drawn using OGDRAW [40]
List of genes in the chloroplast genome of pineapple
| Category | Group of genes | Name of genes |
|---|---|---|
| Protein synthesis and DNA-replication | Transfer RNAs | |
| | Ribosomal RNAs | |
| | Ribosomal protein small subunit | |
| | Ribosomal protein large subunit | |
| | Subunits of RNA polymerase | |
| Photosynthesis | Photosystem I | |
| | Photosystem II | |
| | Cytochrome b/f complex | |
| | ATP synthase | |
| | NADH-dehydrogenase | |
| | Large subunit Rubisco | |
| Miscellaneous group | Translation initiation factor IF-1 | |
| | Acetyl-CoA carboxylase | |
| | Cytochrome c biogenesis | |
| | Maturase | |
| | ATP-dependent protease | |
| | Inner membrane protein | |
| Pseudogene unknown function | Conserved hypothetical chloroplast ORF | |
Relative synonymous codon usage (RSCU) for protein-coding genes in Ananas comosus
| Codon | Amino acid (* = stop) | Observed frequency | RSCU |
|---|---|---|---|
| UGA | * | 26 | 0.772 |
| UAG | * | 31 | 0.921 |
| UAA | * | 44 | 1.307 |
| GCU | A | 630 | 1.819 |
| GCG | A | 151 | 0.436 |
| GCC | A | 210 | 0.606 |
| GCA | A | 394 | 1.138 |
| UGU | C | 246 | 1.46 |
| UGC | C | 91 | 0.54 |
| GAU | D | 914 | 1.616 |
| GAC | D | 217 | 0.384 |
| GAG | E | 414 | 0.563 |
| GAA | E | 1058 | 1.438 |
| UUU | F | 921 | 1.205 |
| UUC | F | 607 | 0.795 |
| GGU | G | 611 | 1.336 |
| GGG | G | 298 | 0.652 |
| GGC | G | 157 | 0.343 |
| GGA | G | 763 | 1.669 |
| CAC | H | 157 | 0.463 |
| CAU | H | 521 | 1.537 |
| AUU | I | 1120 | 1.431 |
| AUA | I | 715 | 0.914 |
| AUC | I | 513 | 0.655 |
| AAA | K | 1026 | 1.456 |
| AAG | K | 383 | 0.544 |
| CUA | L | 409 | 1.169 |
| CUC | L | 214 | 0.612 |
| CUG | L | 177 | 0.506 |
| CUU | L | 599 | 1.713 |
| UUA | L | 838 | 1.17 |
| UUG | L | 594 | 0.83 |
| AUG | M | 656 | 1 |
| AAC | N | 298 | 0.461 |
| AAU | N | 996 | 1.539 |
| CCA | P | 340 | 1.195 |
| CCC | P | 231 | 0.812 |
| CCU | P | 425 | 1.494 |
| CCG | P | 142 | 0.499 |
| CAA | Q | 721 | 1.488 |
| CAG | Q | 248 | 0.512 |
| AGA | R | 548 | 1.524 |
| AGG | R | 171 | 0.476 |
| CGA | R | 369 | 1.525 |
| CGC | R | 96 | 0.397 |
| CGG | R | 137 | 0.566 |
| CGU | R | 366 | 1.512 |
| AGC | S | 108 | 0.409 |
| AGU | S | 420 | 1.591 |
| UCA | S | 458 | 1.129 |
| UCC | S | 374 | 0.922 |
| UCG | S | 198 | 0.488 |
| UCU | S | 593 | 1.461 |
| ACC | T | 263 | 0.739 |
| ACA | T | 438 | 1.23 |
| ACG | T | 163 | 0.458 |
| ACU | T | 560 | 1.573 |
| GUU | V | 525 | 1.425 |
| GUG | V | 207 | 0.562 |
| GUC | V | 188 | 0.51 |
| GUA | V | 554 | 1.503 |
| UGG | W | 468 | 1 |
| UAC | Y | 218 | 0.424 |
| UAU | Y | 811 | 1.576 |
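The RSCU column can be reproduced from the observed counts. RSCU is the observed count of a codon divided by the mean count of all codons in its synonymous family. The sketch below is a minimal illustration, not the authors' pipeline. It assumes, based only on the arithmetic of the tabulated values, that the six-fold amino acids (Leu, Arg, Ser) are treated as separate four-codon and two-codon families; the function names and the `SPLIT_SIXFOLD` constant are hypothetical.

```python
from collections import defaultdict

# Assumption inferred from the table's numbers: six-fold amino acids are
# split into families that share the first two nucleotides.
SPLIT_SIXFOLD = {"L", "R", "S"}


def family_key(codon, aa):
    """Return the synonymous-family key for a codon."""
    if aa in SPLIT_SIXFOLD:
        return (aa, codon[:2])
    return (aa, "")


def rscu(counts):
    """Compute RSCU for each codon.

    counts maps codon -> (amino acid, observed count).
    RSCU = observed count / mean count within the synonymous family.
    """
    families = defaultdict(list)
    for codon, (aa, n) in counts.items():
        families[family_key(codon, aa)].append(n)

    result = {}
    for codon, (aa, n) in counts.items():
        family = families[family_key(codon, aa)]
        result[codon] = n * len(family) / sum(family)
    return result


# Observed counts copied from the table (Ala and Leu rows only).
counts = {
    "GCU": ("A", 630), "GCG": ("A", 151), "GCC": ("A", 210), "GCA": ("A", 394),
    "CUA": ("L", 409), "CUC": ("L", 214), "CUG": ("L", 177), "CUU": ("L", 599),
    "UUA": ("L", 838), "UUG": ("L", 594),
}

values = rscu(counts)
for codon in ("GCU", "CUU", "UUA"):
    print(codon, round(values[codon], 3))
```

An RSCU above 1 means the codon is used more often than expected under equal usage within its family, which is why the A- and U-ending codons dominate the high values in the table.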
Repeat sequences for Ananas comosus chloroplast genome
| No. | Type | Location | Region | Repeat unit | Period size (bp) | Copy number |
|---|---|---|---|---|---|---|
| 1 | T | trnS-GCU and trnG-GCC | LSC | TACATTAAACAATATTAAAT | 20 | 2 |
| 2 | D | psbI and trnG-GCC; psbE and petL | LSC | TAAAAATATATATATATATATAAATATATTATAGTA | 36 | 2 |
| 3 | T | accD and psaI | LSC | TAATTAAGATAGACAA | 16 | 2 |
| 4 | T | accD and psaI | LSC | TTTTCATAAGAAAACTCCT | 18 | 2 |
| 5 | T | accD and psaI | LSC | ATTTGAGATTTCCAAATAATA | 20 | 2 |
| 6 | P | accD and psaI | LSC | GTATAATATGAAGTTTGAATAT | 22 | 2 |
| 7 | T | clpP (intron) | LSC | TTAGGACAAAATTGTATCTC | 20 | 2 |
| 8 | T | clpP (intron) | LSC | AGTAATAGTAGGTATAA | 17 | 3 |
| 9 | T | ndhB and trnL-CAA | IRA | GTCATTCAAGCGTAT | 15 | 2 |
| 10 | T | ndhC and trnV-UAC | LSC | ATTCTAAATAATAAAAG | 17 | 2 |
| 11 | T | ndhF and rpl32 | SSC | TATTTATTAGATTTTGC | 16 | 2 |
| 12 | T | ndhF and rpl33 | SSC | TCGGAAATCTTATGATACTCCTT | 23 | 2 |
| 13 | T | petD (intron) | LSC | TTATATGGGTTTATTTCTGTTA | 22 | 2 |
| 14 | P | petN and psbM | LSC | CTAAAGAGTGGTAGAAAGGACTA | 24 | 2 |
| 15 | D | psaB (CDS) and psaA (CDS) | LSC | TGCAATAGCTAAATGATGATGAGCAATATCGGTCA | 34 | 2 |
| 16 | P | psbA | LSC | AAAAAATACCCAATATCTTGT | 21 | 2 |
| 17 | P | psbT and psbH | LSC | ATTGAAGTAATGAGCCTCCCA | 21 | 2 |
| 18 | T | rbcL and accD | LSC | TATATACAAG | 10 | 5 |
| 19 | T | rpoC2 (CDS) | LSC | TGTCTCATGTAAATT | 15 | 2 |
| 20 | T | rps11 (CDS) | LSC | TACGCCCATTCTTACGTGAACCAA | 24 | 2 |
| 21 | P | trnD-GUC and trnE-UUC | LSC | TTTCATGATACTTTACTTA | 19 | 2 |
| 22 | T | trnF-GAA and ndhJ | LSC | TATTCTATTTCGTCA | 15 | 2 |
| 23 | T | trnL-CAA and ndhB | IRB | ACATACGCTTGAATG | 15 | 2 |
| 24 | P | ycf1 (CDS) | SSC | TTTTATTTTGACTTGTATTTTTAT | 22 | 2 |
| 25 | T | ycf15 and trnL-CAA | IRB | GAATAACTAAAGAAAATAGATA | 22 | 2 |
| 26 | T | ycf15 and trnL-CAA | IRA | TCTATCTATTTTCTTTACTTAT | 22 | 2 |
| 27 | T | ycf2 (CDS) | IRB | TTTGTCCAAGTCACTTCTCTT | 21 | 3 |
| 28 | T | ycf2 (CDS) | IRB/IRA | CTTTTTGTCCAAGTCACTTCC | 21 | 3 |
| 29 | T | ycf2 (CDS) | IRB/IRA | GATATCGATATTGATGATAGTGAC | 24 | 2 |
| 30 | T | ycf2 (CDS) | IRA | GAAGTGACTTGGACAAAAAGA | 21 | 3 |
| 31 | D/P | ycf3 (intron); petB (intron); rps12 and trnV-GAC | LSC/IRA/IRB | CCAGAACCGTACATGAGATTTTCATCTCATACGGCTCCTC | 39 | 3 |
*The letters T, D and P in the Type column represent tandem, dispersed and palindromic repeats, respectively. IRA, IRB, LSC and SSC represent inverted repeat A, inverted repeat B, large single copy and small single copy, respectively. All repeat locations are in intergenic spacer regions unless otherwise indicated
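Because IRA and IRB are inverted copies of each other, a repeat unit reported in one should appear as its reverse complement in the other. The sketch below checks this for repeats 9 and 23 (the spacer between ndhB and trnL-CAA), using the unit sequences from the table. It is an illustration only; the helper names are hypothetical, and the comparison allows for the two tandem units being reported in a different phase.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def longest_common_substring(a, b):
    """Return the longest substring of a that also occurs in b."""
    best = ""
    for i in range(len(a)):
        # Only try candidates longer than the current best.
        for j in range(i + len(best) + 1, len(a) + 1):
            if a[i:j] in b:
                best = a[i:j]
            else:
                break  # a longer substring starting at i cannot match either
    return best


unit_ira = "GTCATTCAAGCGTAT"  # repeat 9, IRA
unit_irb = "ACATACGCTTGAATG"  # repeat 23, IRB

rc = reverse_complement(unit_ira)
shared = longest_common_substring(rc, unit_irb)
print(rc, shared, len(shared))
```

The two 15 bp units share a 13 bp stretch once one of them is reverse-complemented, which is what a two-base shift in the reported start of a tandem unit would produce.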
Fig. 4 Genome comparison of nine chloroplast genomes from the subclass Commelinidae to the pineapple chloroplast genome. From the third ring: Typha latifolia (green), Ravenala madagascariensis (purple), Calamus caryotoides (light purple), Dasypogon bromelifolius (turquoise), Anomochloa marantoidea (blue), Pharus lappulaceus (light blue), Puelia olyriformis (yellow), Aristida purpurea (green) and Olyra latifolia (light green). Each chloroplast genome was compared pairwise with the pineapple chloroplast genome using BLASTN, and the resulting alignments were colour-coded by similarity score: dark shade, lighter shade and grey depict similarity scores above 90 %, above 80 % and below 80 %, respectively. The outermost rings show the protein-coding gene features positioned according to the pineapple chloroplast genome
Fig. 5 Ka/Ks ratios of the protein-coding genes from the nine commelinid species compared with A. comosus
Fig. 6 Comparison of the borders of the LSC, SSC and IR regions among the chloroplast genomes of species from the subclass Commelinidae
Fig. 7 Phylogenetic tree of all available complete commelinid chloroplast sequences, including the newly sequenced A. comosus chloroplast, for a total of 100 taxa. The tree was inferred from 56 protein-coding sequences by maximum likelihood (-lnL = 400419.797049); bootstrap values from 1000 replicates are shown at the nodes