Zhi-Zhong Li, Josphat K Saina, Andrew W Gichira, Cornelius M Kyalo, Qing-Feng Wang, Jin-Ming Chen.
Abstract
The family Balsaminaceae, which consists of the economically important genus Impatiens and the monotypic genus Hydrocera, lacks a published complete chloroplast genome sequence. Chloroplast genome sequences of these two sister genera are therefore valuable for resolving the phylogenetic position of Balsaminaceae and for understanding its evolution within the Ericales. In this study, the complete chloroplast (cp) genomes of Impatiens pinfanensis and Hydrocera triflora were assembled and characterized using a high-throughput sequencing method. Both cp genomes have the typical quadripartite structure of land plant chloroplast genomes and are double-stranded molecules of 154,189 bp (Impatiens pinfanensis) and 152,238 bp (Hydrocera triflora) in length. A total of 115 unique genes were identified in each genome: 80 protein-coding genes, 31 distinct transfer RNA (tRNA) genes, and four distinct ribosomal RNA (rRNA) genes. Thirty codons, 29 of which end in A/T, had relative synonymous codon usage (RSCU) values of >1, whereas codons ending in G/C had values of <1. The simple sequence repeats consisted mostly of mononucleotide A/T repeats in all examined cp genomes. Phylogenetic analysis based on 51 common protein-coding genes indicated that the Balsaminaceae family formed a lineage with Ebenaceae together with all the other Ericales.
Keywords: Balsaminaceae; Hydrocera triflora; Impatiens pinfanensis; chloroplast genome; phylogenetic analyses
Year: 2018 PMID: 29360746 PMCID: PMC5796262 DOI: 10.3390/ijms19010319
Source DB: PubMed Journal: Int J Mol Sci ISSN: 1422-0067 Impact factor: 5.923
Comparison of the chloroplast genomes of Impatiens pinfanensis and Hydrocera triflora.
| Feature | Impatiens pinfanensis | Hydrocera triflora |
|---|---|---|
| Total genome length (bp) | 154,189 | 152,238 |
| Overall GC content (%) | 36.8 | 36.9 |
| Large single copy (LSC) region (bp) | 83,117 | 84,865 |
| LSC GC content (%) | 34.5 | 34.7 |
| Small single copy (SSC) region (bp) | 25,755 | 25,622 |
| SSC GC content (%) | 29.3 | 29.9 |
| Inverted repeat (IR) region (bp) | 17,611 | 18,082 |
| IR GC content (%) | 43.1 | 43.1 |
| Protein-coding genes | 80 | 80 |
| tRNAs | 31 | 31 |
| rRNAs | 4 | 4 |
| Genes with introns | 17 | 17 |
| Genes duplicated by IR | 18 | 18 |
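The GC content rows above are percentages of G and C bases within a region. As a minimal sketch of that calculation (the function name `gc_content` and the toy sequence are illustrative assumptions, not taken from the study):

```python
def gc_content(seq: str) -> float:
    """Return the percentage of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    gc = sum(1 for base in seq if base in "GC")
    return 100.0 * gc / len(seq)


# Hypothetical toy fragment, not real chloroplast data: 2 of 10 bases are G or C.
toy_fragment = "ATGCATATTA"
toy_gc = gc_content(toy_fragment)
print(f"{toy_gc:.1f}")  # prints 20.0
```

Applied separately to the LSC, SSC, and IR sequences of an assembled genome, the same function would give per-region values of the kind listed in the table.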
Figure 1. Gene map of the Impatiens pinfanensis chloroplast genome. Genes lying outside of the circle are transcribed clockwise, while genes inside the circle are transcribed counterclockwise. The colored bars indicate different functional groups. The dark gray area in the inner circle corresponds to the guanine-cytosine (GC) content, while the light gray corresponds to the adenine-thymine (AT) content of the genome.
Figure 2. Gene map of the Hydrocera triflora chloroplast genome. Genes lying outside of the circle are transcribed clockwise, while genes inside the circle are transcribed counterclockwise. The colored bars indicate different functional groups. The dark gray area in the inner circle corresponds to the guanine-cytosine (GC) content, while the light gray corresponds to the adenine-thymine (AT) content of the genome.
Genes encoded in the Impatiens pinfanensis and Hydrocera triflora chloroplast genomes.
| Group of Genes | Gene Name |
|---|---|
| rRNA genes | |
| tRNA genes | |
| Ribosomal small subunit | |
| Ribosomal large subunit | |
| DNA-dependent RNA polymerase | |
| Large subunit of rubisco | |
| Photosystem I | |
| Photosystem II | |
| NADH dehydrogenase | |
| Cytochrome b/f complex | |
| ATP synthase | |
| Maturase | |
| Subunit of acetyl-CoA carboxylase | |
| Envelope membrane protein | |
| Protease | |
| Translational initiation factor | |
| c-type cytochrome synthesis | |
| Conserved open reading frames | |
Genes with one or two introns are indicated by one (*) or two asterisks (**), respectively. Genes in the IR regions are followed by the (×2) symbol.
Codon usage in Impatiens pinfanensis and Hydrocera triflora chloroplast genomes.
| Amino Acid | Codon | Number | | RSCU | | Amino Acid | Codon | Number | | RSCU | |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Phe | UUU | 913 | 908 | | | Ser | UCU | 482 | 482 | | |
| | UUC | 387 | 406 | 0.60 | 0.62 | | UCC | 252 | 264 | 0.88 | 0.92 |
| Leu | UUA | 854 | 842 | | | | UCA | 360 | 324 | | |
| | UUG | 468 | 486 | | | | UCG | 142 | 181 | 0.50 | 0.63 |
| | CUU | 517 | 503 | | | Pro | CCU | 376 | 371 | | |
| | CUC | 160 | 162 | 0.40 | 0.40 | | CCC | 175 | 167 | 0.74 | 0.71 |
| | CUA | 310 | 315 | 0.77 | 0.78 | | CCA | 294 | 290 | | |
| | CUG | 121 | 128 | 0.30 | 0.32 | | CCG | 103 | 112 | 0.43 | 0.48 |
| Ile | AUU | 1035 | 1020 | | | Thr | ACU | 493 | 500 | | |
| | AUC | 359 | 376 | 0.53 | 0.56 | | ACC | 198 | 180 | 0.68 | 0.63 |
| | AUA | 624 | 611 | 0.93 | 0.91 | | ACA | 358 | 368 | | |
| Met | AUG | 547 | 548 | 1.00 | 1.00 | | ACG | 108 | 104 | 0.37 | 0.36 |
| Val | GUU | 482 | 469 | | | Ala | GCU | 580 | 593 | | |
| | GUC | 134 | 135 | 0.43 | 0.44 | | GCC | 183 | 191 | 0.59 | 0.60 |
| | GUA | 457 | 457 | | | | GCA | 346 | 353 | | |
| | GUG | 167 | 174 | 0.54 | 0.56 | | GCG | 141 | 143 | 0.45 | 0.45 |
| Tyr | UAU | 704 | 697 | | | Cys | UGU | 191 | 196 | | |
| | UAC | 155 | 146 | 0.36 | 0.35 | | UGC | 58 | 63 | 0.47 | 0.49 |
| TER | UAA | 41 | 44 | | | TER | UGA | 18 | 18 | 0.66 | 0.67 |
| | UAG | 23 | 19 | 0.84 | 0.70 | Trp | UGG | 412 | 412 | 1.00 | 1.00 |
| His | CAU | 405 | 421 | | | Arg | AGA | 406 | 407 | | |
| | CAC | 121 | 114 | 0.46 | 0.43 | | AGG | 134 | 143 | 0.60 | 0.62 |
| Gln | CAA | 627 | 626 | | | Arg | CGU | 302 | 299 | | |
| | CAG | 186 | 192 | 0.46 | 0.47 | | CGC | 88 | 95 | 0.39 | 0.41 |
| Asn | AAU | 885 | 868 | | | | CGA | 317 | 333 | | |
| | AAC | 231 | 238 | 0.41 | 0.43 | | CGG | 98 | 103 | 0.44 | 0.45 |
| Lys | AAA | 976 | 978 | | | Ser | AGU | 363 | 72 | | |
| | AAG | 284 | 289 | 0.45 | 0.46 | | AGC | 110 | 108 | 0.39 | 0.37 |
| Asp | GAU | 720 | 737 | | | Gly | GGU | 525 | 525 | | |
| | GAC | 159 | 160 | 0.36 | 0.36 | | GGC | 160 | 165 | 0.40 | 0.42 |
| Glu | GAA | 914 | 929 | | | | GGA | 639 | 625 | | |
| | GAG | 264 | 272 | 0.45 | 0.45 | | GGG | 258 | 238 | 0.65 | 0.61 |
RSCU: relative synonymous codon usage. RSCU values > 1 are highlighted in bold.
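RSCU is conventionally defined as a codon's observed count divided by the mean count of all synonymous codons for the same amino acid. The sketch below applies that definition to the Phe counts from the first Number column of the codon usage table. The function name `rscu` and the small codon-to-amino-acid mapping are illustrative assumptions, not the authors' software.

```python
from collections import defaultdict


def rscu(counts: dict, codon_to_aa: dict) -> dict:
    """Return RSCU per codon: the observed count divided by the mean
    count of all synonymous codons for the same amino acid."""
    totals = defaultdict(int)       # summed counts per amino acid
    family_size = defaultdict(int)  # number of synonymous codons per amino acid
    for codon, n in counts.items():
        aa = codon_to_aa[codon]
        totals[aa] += n
        family_size[aa] += 1
    return {
        codon: n * family_size[codon_to_aa[codon]] / totals[codon_to_aa[codon]]
        for codon, n in counts.items()
    }


# Phe counts taken from the first Number column of the table.
phe_counts = {"UUU": 913, "UUC": 387}
phe_map = {"UUU": "Phe", "UUC": "Phe"}
phe_rscu = rscu(phe_counts, phe_map)
print({codon: round(value, 2) for codon, value in phe_rscu.items()})
```

For UUC this gives 387 / 650 ≈ 0.60, which agrees with the tabulated value, and the A/T-ending codon UUU comes out above 1, consistent with the pattern described in the abstract.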
SSR types and counts in the Impatiens pinfanensis and Hydrocera triflora chloroplast genomes.
| SSR Type | Repeat Unit | Amount | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|
| Mono | A/T | 176 | 139 | 117 | 153 | 146 | 154 | 161 | 134 |
| | C/G | 4 | 2 | 4 | 4 | 4 | 8 | 1 | 4 |
| Di | AT/AT | 8 | 9 | 8 | 5 | 3 | 13 | 11 | 6 |
| Tri | AAG/CTT | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 1 |
| | AAT/ATT | 3 | 3 | 2 | 1 | 1 | 2 | 4 | 0 |
| | AGC/CTG | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 |
| Tetra | AAAG/CTTT | 1 | 0 | 3 | 2 | 1 | 3 | 1 | 1 |
| | AAAT/ATTT | 2 | 3 | 3 | 3 | 4 | 3 | 6 | 2 |
| | AATG/ATTC | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 |
| | AATT/AATT | 1 | 0 | 1 | 0 | 0 | 0 | 1 | 0 |
| | AGAT/ATCT | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| | AAGT/ACTT | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 |
| | AACT/AGTT | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 |
| | AATC/ATTG | 0 | 0 | 2 | 0 | 1 | 1 | 0 | 0 |
| | AAAC/GTTT | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 |
| | AAGG/CCTT | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 |
| Penta | AATAC/ATTGT | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 |
| | AAAAT/ATTTT | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 |
| | AAATT/AATTT | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 |
| | AATGT/ACATT | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 |
| | AATAT/ATATT | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
| Hexa | AATCCC/ATTGGG | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 |
| | AGATAT/ATATCT | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
| | AAGATG/ATCTTC | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 |
| Total | | 197 | 159 | 143 | 171 | 161 | 187 | 188 | 150 |
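Mononucleotide repeats dominate the table above. As a rough illustration of how such repeats can be located in a sequence, the sketch below scans for runs of a single base with a regular expression. The function name `find_mono_ssrs`, the minimum run length of 10, and the toy sequence are all assumptions made for this example; the thresholds and tool used in the study are not stated in this excerpt.

```python
import re

# Complementary bases are reported under one label, as in the table (A/T, C/G).
UNIT_LABEL = {"A": "A/T", "T": "A/T", "C": "C/G", "G": "C/G"}


def find_mono_ssrs(seq: str, min_repeats: int = 10) -> list:
    """Return (start, label, length) for each run of one base that is
    at least min_repeats long. The default threshold is an assumption."""
    pattern = re.compile(r"([ACGT])\1{%d,}" % (min_repeats - 1))
    return [
        (match.start(), UNIT_LABEL[match.group(1)], len(match.group(0)))
        for match in pattern.finditer(seq.upper())
    ]


# Hypothetical toy sequence: one 12-base A run and one 9-base T run.
toy_seq = "GC" + "A" * 12 + "CG" + "T" * 9 + "G"
mono_hits = find_mono_ssrs(toy_seq)
print(mono_hits)  # prints [(2, 'A/T', 12)]
```

Only the 12-base run is reported because the 9-base run falls below the assumed threshold. Longer repeat units (di- to hexanucleotide) would need a more general pattern.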
Figure 3. Non-synonymous (Ka) and synonymous (Ks) substitution rates and Ka/Ks ratios between I. pinfanensis and H. triflora. One gene, psbK, had a Ka/Ks ratio greater than 1.0, whereas all other genes had ratios below 1.0.
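A Ka/Ks ratio above 1 is commonly read as a sign of positive selection, a ratio below 1 as purifying selection, and a ratio near 1 as neutral evolution. The sketch below shows that classification. The function names and the input rates are hypothetical and are not values from the study.

```python
from typing import Optional


def ka_ks_ratio(ka: float, ks: float) -> Optional[float]:
    """Return Ka/Ks, or None when Ks is zero and the ratio is undefined."""
    if ks == 0:
        return None
    return ka / ks


def selection_regime(ratio: Optional[float]) -> str:
    """Map a Ka/Ks ratio to the conventional interpretation."""
    if ratio is None:
        return "undefined"
    if ratio > 1.0:
        return "positive selection"
    if ratio < 1.0:
        return "purifying selection"
    return "neutral"


# Hypothetical substitution rates, chosen only to exercise each branch.
examples = {"geneA": (0.03, 0.02), "geneB": (0.01, 0.10), "geneC": (0.0, 0.0)}
regimes = {
    name: selection_regime(ka_ks_ratio(ka, ks))
    for name, (ka, ks) in examples.items()
}
print(regimes)
```

Genes with Ks equal to zero are reported as undefined rather than forced into a category, since the ratio cannot be computed for them.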
Figure 4. Comparison of the IR, LSC and SSC border regions among eight Ericales cp genomes. The IRb/SSC junction extended into the ycf1 genes, creating ycf1 pseudogenes of various lengths among the eight cp genomes. The numbers above, below or adjacent to genes show the distance between the ends of genes and the boundary sites. The figure features are not to scale. ᵠ indicates a pseudogene.
Figure 5. Phylogenetic relationships inferred by maximum likelihood from 51 common protein-coding genes of 38 representative species from the order Ericales and four Cornales species as outgroups. The numbers associated with the nodes indicate bootstrap values tested with 1000 replicates.