Yuying Huang, Zerui Yang, Song Huang, Wenli An, Jing Li, Xiasheng Zheng.
Abstract
In the last decade, several studies have relied on a small number of plastid genomes to deduce deep phylogenetic relationships in the species-rich Myrtaceae. Nevertheless, the plastome of Rhodomyrtus tomentosa, an important representative of the genus Rhodomyrtus (DC.), has not yet been reported. Here, we sequenced and analyzed the complete chloroplast (CP) genome of R. tomentosa, which is a 156,129-bp-long circular molecule with 37.1% GC content. This CP genome displays a typical quadripartite structure with two inverted repeats (IRa and IRb) of 25,824 bp each, which are separated by a small single copy region (SSC, 18,183 bp) and a large single copy region (LSC, 86,298 bp). The CP genome encodes 129 genes, including 84 protein-coding genes, 37 tRNA genes, eight rRNA genes and three pseudogenes (ycf1, rps19, ndhF). A considerable number of protein-coding genes have the universal ATG start codon, with the exception of psbL and ndhD. Premature termination codons (PTCs) were found in one protein-coding gene, namely atpE, a feature rarely reported in plant CP genomes. Phylogenetic analysis revealed that R. tomentosa has a sister relationship with Eugenia uniflora and Psidium guajava. In conclusion, this study identified unique characteristics of the R. tomentosa CP genome, providing valuable information for further investigations on species identification and the phylogenetic evolution of R. tomentosa and related species.
Keywords: Rhodomyrtus tomentosa; chloroplast genome; phylogenetic analysis; species identification
Year: 2019 PMID: 30987338 PMCID: PMC6524380 DOI: 10.3390/plants8040089
Source DB: PubMed Journal: Plants (Basel) ISSN: 2223-7747
Figure 1. Genome scheme of the Rhodomyrtus tomentosa chloroplast genome. Genes inside the circle are transcribed clockwise, while those outside are transcribed counterclockwise. Filled colors represent the functional groups that specific genes fall into, according to the legend at the bottom. Gray arrows represent gene direction. The darker gray in the inner circle corresponds to GC (guanine and cytosine) content, whereas the lighter gray corresponds to AT (adenine and thymine) content.
Base composition in the chloroplast genome of R. tomentosa.
| Region | Positions | T (%) | C (%) | A (%) | G (%) | Length (bp) |
|---|---|---|---|---|---|---|
| LSC | | 33.3 | 18.0 | 31.7 | 17.1 | 86,298 |
| IRA | | 28.4 | 20.7 | 28.7 | 22.2 | 25,824 |
| SSC | | 34.1 | 16.2 | 35.1 | 14.6 | 18,183 |
| IRB | | 28.7 | 22.2 | 28.4 | 20.7 | 25,824 |
| Total | | 31.8 | 18.9 | 31.0 | 18.2 | 156,129 |
| CDS | | 31.6 | 17.5 | 30.8 | 20.1 | 76,113 |
| | 1st position | 24.0 | 18.6 | 30.8 | 26.3 | 25,371 |
| | 2nd position | 33.0 | 19.8 | 29.6 | 17.5 | 25,371 |
| | 3rd position | 37.0 | 14.1 | 32.0 | 16.5 | 25,371 |
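The figures reported for the genome can be cross-checked with simple arithmetic. The sketch below is only an illustrative consistency check, not part of the authors' analysis: it takes the region lengths stated in the abstract and the "Total" and "CDS" rows of the base composition table, and the variable names are hypothetical.

```python
# Region lengths (bp) as stated in the abstract; the two IR copies are equal.
region_lengths = {"LSC": 86_298, "SSC": 18_183, "IRa": 25_824, "IRb": 25_824}
total_length = sum(region_lengths.values())

# Whole-genome base composition (%) from the "Total" row of the table.
total_composition = {"T": 31.8, "C": 18.9, "A": 31.0, "G": 18.2}
gc_content = round(total_composition["G"] + total_composition["C"], 1)

# Each codon position covers one third of the total CDS length.
cds_length = 76_113
per_position_length = cds_length // 3

print(f"Genome length: {total_length} bp")
print(f"GC content: {gc_content}%")
print(f"Bases per codon position: {per_position_length}")
```

Under these inputs, the four regions sum to the reported genome length, G plus C gives the reported GC content, and the CDS length divides evenly into the three codon positions.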
Gene classification in the chloroplast genome of R. tomentosa.
| Gene Classification | Gene Names | Number |
|---|---|---|
| Photosystem I | | 5 |
| Photosystem II | | 15 |
| Cytochrome b/f complex | | 6 |
| ATP synthase | | 6 |
| NADH dehydrogenase | | 12(1) |
| RuBisCO large subunit | | 1 |
| RNA polymerase | | 4 |
| Ribosomal proteins (SSU) | | 14(2) |
| Ribosomal proteins (LSU) | | 11 |
| Ribosomal RNAs | | 8(4) |
| Protein of unknown function | | 5(1) |
| Transfer RNAs | 37 tRNAs (8 contain an intron, 7 in the inverted repeats region) | 37(7) |
| Other genes | | 5 |
| Total | | 129 |
* Indicates gene contains one intron; ** indicates two introns; (×2) indicates the number of the repeat unit is 2.
Exon and intron lengths of intron-containing genes in the chloroplast genome of R. tomentosa.
| Gene | Location | Exon I (bp) | Intron I (bp) | Exon II (bp) | Intron II (bp) | Exon III (bp) |
|---|---|---|---|---|---|---|
| | LSC | 144 | 739 | 411 | | |
| | LSC | 69 | 852 | 296 | 637 | 228 |
| | SSC | 552 | 1058 | 540 | | |
| | IR | 777 | 681 | 756 | | |
| | LSC | 6 | 778 | 642 | | |
| | LSC | 7 | 750 | 473 | | |
| | LSC | 9 | 988 | 399 | | |
| | IR | 391 | 664 | 434 | | |
| | LSC | 451 | 733 | 1619 | | |
| | LSC | 114 | 232 | 546 | 26 | |
| | LSC | 40 | 860 | 212 | | |
| | IR | 35 | 803 | 38 | | |
| | LSC | 23 | 734 | 48 | | |
| | IR | 37 | 949 | 35 | | |
| | LSC | 35 | 2469 | 37 | | |
| | LSC | 35 | 505 | 50 | | |
| | LSC | 37 | 587 | 37 | | |
| | LSC | 126 | 754 | 226 | 723 | 155 |
Figure 2. Comparison of the amino acid sequence of atpE in the Rhodomyrtus tomentosa chloroplast genome with those of three closely related species. * Indicates termination codons.
Figure 3. Repeat sequences in four chloroplast genomes. F, P, R, and C indicate the repeat types: F (forward), P (palindrome), R (reverse), and C (complement). Repeats of different lengths are shown in different colors.
Simple sequence repeats (SSRs) in the chloroplast genome of R. tomentosa.
| SSR Type | Repeat Unit | Amount | Ratio (%) |
|---|---|---|---|
| Mono | A/T | 171 | 98.8 |
| | C/G | 2 | 1.2 |
| Di | AC/GT | 16 | 43.2 |
| | AT/AT | 21 | 56.8 |
| Tri | AAC/GTT | 6 | 9.5 |
| | AAG/CTT | 18 | 28.6 |
| | AAT/ATT | 24 | 38.1 |
| | ACC/GGT | 2 | 3.2 |
| | ACT/AGT | 3 | 4.8 |
| | AGC/CTG | 5 | 7.9 |
| | AGG/CCT | 1 | 1.6 |
| | ATC/ATG | 4 | 6.3 |
| Tetra | AAAG/CTTT | 2 | 16.7 |
| | AAAT/ATTT | 3 | 25.0 |
| | AAGC/CTTG | 1 | 8.3 |
| | AAGT/ACTT | 1 | 8.3 |
| | AATT/AATT | 2 | 16.7 |
| | AGAT/ATCT | 3 | 25.0 |
| Penta | ACCGG/CCGGT | 2 | 100.0 |
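The "Ratio (%)" column appears to give each repeat unit's share within its own SSR type rather than across all SSRs. The sketch below reproduces that calculation from the counts in the table, as an illustration under that assumption; the function and variable names are hypothetical and not taken from the original study.

```python
# SSR counts per repeat unit, grouped by SSR type (values from the table).
ssr_counts = {
    "Mono": {"A/T": 171, "C/G": 2},
    "Di": {"AC/GT": 16, "AT/AT": 21},
    "Tri": {
        "AAC/GTT": 6, "AAG/CTT": 18, "AAT/ATT": 24, "ACC/GGT": 2,
        "ACT/AGT": 3, "AGC/CTG": 5, "AGG/CCT": 1, "ATC/ATG": 4,
    },
    "Tetra": {
        "AAAG/CTTT": 2, "AAAT/ATTT": 3, "AAGC/CTTG": 1,
        "AAGT/ACTT": 1, "AATT/AATT": 2, "AGAT/ATCT": 3,
    },
    "Penta": {"ACCGG/CCGGT": 2},
}


def ratios_within_type(counts):
    """Return each repeat unit's percentage share within one SSR type."""
    type_total = sum(counts.values())
    return {unit: round(100 * n / type_total, 1) for unit, n in counts.items()}


ssr_ratios = {ssr_type: ratios_within_type(c) for ssr_type, c in ssr_counts.items()}
type_totals = {ssr_type: sum(c.values()) for ssr_type, c in ssr_counts.items()}

for ssr_type, ratios in ssr_ratios.items():
    print(ssr_type, type_totals[ssr_type], ratios)
```

Computed this way, the percentages match the tabulated ratios, which supports reading the column as a within-type proportion.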
Figure 4. Codon content of the 20 amino acids and stop codons in all protein-coding genes of the Rhodomyrtus tomentosa chloroplast genome.
Figure 5. Comparison of the borders of the LSC, SSC, and IR regions among five chloroplast genomes. Ψ: pseudogenes, /: distance from the edge.
Figure 6. Sequence identity plots comparing the chloroplast genome of Rhodomyrtus tomentosa with three others using mVISTA. Gray arrows and thick black lines above the alignment indicate genes with their orientation and the position of the IRs, respectively. A cut-off of 70% identity was used for the plots, and the Y-scale represents the percentage identity, ranging from 50 to 100%.
Figure 7. Phylogenetic relationship of the 17 species inferred from maximum likelihood analyses based on the complete chloroplast genome excluding the IRA region. Numbers at nodes represent bootstrap support values.