Hong-Ying Jian, Yong-Hong Zhang, Hui-Jun Yan, Xian-Qin Qiu, Qi-Gang Wang, Shu-Bin Li, Shu-Dong Zhang.
Abstract
Rosa chinensis var. spontanea, an endemic and endangered plant of China, is one of the key ancestors of modern roses and a source for famous traditional Chinese medicines against female diseases, such as irregular menses and dysmenorrhea. In this study, the complete chloroplast (cp) genome of R. chinensis var. spontanea was sequenced, analyzed, and compared to congeneric species. The cp genome of R. chinensis var. spontanea is a typical quadripartite circular molecule of 156,590 bp in length, including one large single copy (LSC) region of 85,910 bp and one small single copy (SSC) region of 18,762 bp, separated by two inverted repeat (IR) regions of 25,959 bp each. The GC content of the whole genome is 37.2%, while that of the IR, LSC, and SSC regions is 42.8%, 35.2%, and 31.2%, respectively. The genome encodes 129 genes, including 84 protein-coding genes (PCGs), 37 transfer RNA (tRNA) genes, and eight ribosomal RNA (rRNA) genes. Seventeen genes in the IR regions were found to be duplicated. Thirty-three forward and five inverted repeats were detected in the cp genome of R. chinensis var. spontanea. The genome is rich in simple sequence repeats (SSRs); in total, 85 SSRs were detected. A genome comparison revealed that IR contraction might be the reason for the relatively smaller cp genome size of R. chinensis var. spontanea compared to other congeneric species. Sequence analysis revealed that the LSC and SSC regions were more divergent than the IR regions within the genus Rosa and that a higher divergence occurred in non-coding regions than in coding regions. A phylogenetic analysis showed that the sampled species of the genus Rosa formed a monophyletic clade and that R. chinensis var. spontanea shared a more recent ancestor with R. lichiangensis of the section Synstylae than with R. odorata var. gigantea of the section Chinenses. This information will be useful for the conservation genetics of R. chinensis var. spontanea and for the phylogenetic study of the genus Rosa, and it might also facilitate the genetics and breeding of modern roses.
Keywords: Rosa chinensis var. spontanea; SSRs; chloroplast genome; genome comparison; phylogeny; repeats
Year: 2018 PMID: 29439505 PMCID: PMC6017658 DOI: 10.3390/molecules23020389
Source DB: PubMed Journal: Molecules ISSN: 1420-3049 Impact factor: 4.411
Base composition in the chloroplast genome of Rosa chinensis var. spontanea.
| Region | A (%) | T (U) (%) | G (%) | C (%) | Length (bp) |
|---|---|---|---|---|---|
| LSC | 31.7 | 33.1 | 17.2 | 18.0 | 85,910 |
| SSC | 34.4 | 34.3 | 15.1 | 16.3 | 18,762 |
| IRB | 28.7 | 28.5 | 22.2 | 20.6 | 25,959 |
| IRA | 28.7 | 28.5 | 22.2 | 20.6 | 25,959 |
| Total | 31.0 | 31.8 | 18.6 | 18.7 | 156,590 |
| PCGs | 30.6 | 31.4 | 20.3 | 17.7 | 79,773 |
| 1st position | 30.7 | 24 | 26.9 | 18.7 | 26,591 |
| 2nd position | 29.5 | 33 | 17.7 | 20.2 | 26,591 |
| 3rd position | 31.7 | 38 | 16.4 | 14.1 | 26,591 |
PCGs: protein-coding genes.
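The GC content of each region follows directly from the G and C columns of the table above. The short Python sketch below recomputes it and checks that the region lengths add up to the reported genome size. The dictionary layout and function names are illustrative assumptions, not part of the original analysis; only the numbers come from the table.

```python
# Base composition (%) and lengths (bp) copied from the table above.
# The data structure and helper names are illustrative, not from the paper.
BASE_COMPOSITION = {
    "LSC": {"A": 31.7, "T": 33.1, "G": 17.2, "C": 18.0, "length": 85_910},
    "SSC": {"A": 34.4, "T": 34.3, "G": 15.1, "C": 16.3, "length": 18_762},
    "IRB": {"A": 28.7, "T": 28.5, "G": 22.2, "C": 20.6, "length": 25_959},
    "IRA": {"A": 28.7, "T": 28.5, "G": 22.2, "C": 20.6, "length": 25_959},
}


def gc_percent(row):
    """GC content of a region as the sum of its G and C percentages."""
    return round(row["G"] + row["C"], 1)


# GC content per region.
gc_by_region = {name: gc_percent(row) for name, row in BASE_COMPOSITION.items()}

# The four regions together should cover the whole circular genome.
total_length = sum(row["length"] for row in BASE_COMPOSITION.values())

# Length-weighted average of the regional values approximates the genome-wide GC content.
weighted_gc = sum(
    gc_percent(row) * row["length"] for row in BASE_COMPOSITION.values()
) / total_length

for name, gc in gc_by_region.items():
    print(f"{name}: {gc}% GC")
print(f"Total length: {total_length} bp, weighted GC: {weighted_gc:.2f}%")
```

Summing the columns this way gives 35.2% for the LSC, 31.4% for the SSC, and 42.8% for each IR, so the IR regions are the most GC-rich part of the genome. The small differences from the rounded figures quoted elsewhere are expected, because the table entries are themselves rounded to one decimal place.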
Figure 1. Chloroplast genome map of Rosa chinensis var. spontanea. Genes inside the circle are transcribed clockwise, and those outside are transcribed counterclockwise. Genes of different functions are color-coded. The darker gray in the inner circle shows the GC content, while the lighter gray shows the AT content.
Codon-anticodon recognition patterns and codon usage of the Rosa chinensis var. spontanea chloroplast genome.
| Amino Acid | Codon | Count | RSCU | tRNA | Amino Acid | Codon | Count | RSCU | tRNA |
|---|---|---|---|---|---|---|---|---|---|
| Phe | UUU | 1015 | 1.3 | | Ser | UCU | 580 | 1.62 | |
| Phe | UUC | 545 | 0.7 | | Ser | UCC | 370 | 1.03 | |
| Leu | UUA | 897 | 1.87 | | Ser | UCA | 406 | 1.13 | |
| Leu | UUG | 580 | 1.21 | | Ser | UCG | 222 | 0.62 | |
| Leu | CUU | 595 | 1.24 | | Pro | CCU | 424 | 1.45 | |
| Leu | CUC | 217 | 0.45 | | Pro | CCC | 241 | 0.82 | |
| Leu | CUA | 380 | 0.79 | | Pro | CCA | 320 | 1.09 | |
| Leu | CUG | 202 | 0.42 | | Pro | CCG | 187 | 0.64 | |
| Ile | AUU | 1136 | 1.48 | | Thr | ACU | 542 | 1.55 | |
| Ile | AUC | 451 | 0.59 | | Thr | ACC | 269 | 0.77 | |
| Ile | AUA | 716 | 0.93 | | Thr | ACA | 418 | 1.19 | |
| Met | AUG | 635 | 1 | | Thr | ACG | 171 | 0.49 | |
| Val | GUU | 550 | 1.44 | | Ala | GCU | 645 | 1.76 | |
| Val | GUC | 193 | 0.5 | | Ala | GCC | 244 | 0.67 | |
| Val | GUA | 567 | 1.48 | | Ala | GCA | 391 | 1.07 | |
| Val | GUG | 223 | 0.58 | | Ala | GCG | 183 | 0.5 | |
| Tyr | UAU | 798 | 1.6 | | Cys | UGU | 237 | 1.48 | |
| Tyr | UAC | 198 | 0.4 | | Cys | UGC | 83 | 0.52 | |
| stop | UAA | 59 | 1.38 | | stop | UGA | 21 | 0.49 | |
| stop | UAG | 48 | 1.13 | | Trp | UGG | 484 | 1 | |
| His | CAU | 476 | 1.49 | | Arg | CGU | 362 | 1.28 | |
| His | CAC | 161 | 0.51 | | Arg | CGC | 120 | 0.43 | |
| Gln | CAA | 734 | 1.51 | | Arg | CGA | 385 | 1.36 | |
| Gln | CAG | 236 | 0.49 | | Arg | CGG | 144 | 0.51 | |
| Asn | AAU | 1003 | 1.52 | | Ser | AGU | 420 | 1.17 | |
| Asn | AAC | 317 | 0.48 | | Ser | AGC | 156 | 0.43 | |
| Lys | AAA | 1082 | 1.48 | | Arg | AGA | 488 | 1.73 | |
| Lys | AAG | 385 | 0.52 | | Arg | AGG | 194 | 0.69 | |
| Asp | GAU | 890 | 1.62 | | Gly | GGU | 612 | 1.3 | |
| Asp | GAC | 207 | 0.38 | | Gly | GGC | 209 | 0.44 | |
| Glu | GAA | 1052 | 1.46 | | Gly | GGA | 694 | 1.48 | |
| Glu | GAG | 390 | 0.54 | | Gly | GGG | 365 | 0.78 | |
RSCU: Relative Synonymous Codon Usage.
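As a reading aid for the RSCU column, the sketch below uses the standard definition: the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. The function name and input format are illustrative assumptions; the codon counts are taken from the table above.

```python
def rscu(codon_counts):
    """Relative synonymous codon usage for one amino acid.

    codon_counts maps each synonymous codon to its observed count.
    RSCU = count * (number of synonymous codons) / (total count for the amino acid).
    """
    total = sum(codon_counts.values())
    n_codons = len(codon_counts)
    return {
        codon: round(count * n_codons / total, 2)
        for codon, count in codon_counts.items()
    }


# Counts copied from the table above.
phe = rscu({"UUU": 1015, "UUC": 545})
leu = rscu({"UUA": 897, "UUG": 580, "CUU": 595, "CUC": 217, "CUA": 380, "CUG": 202})

print(phe)
print(leu)
```

Values above 1 mark codons used more often than expected under equal usage. For Phe and Leu the recomputed values agree with the table, and the preferred codons (UUU, UUA) end in U or A, in line with the AT-rich third codon position.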
Repeat sequences in the Rosa chinensis var. spontanea chloroplast genome.
| ID | Repeat Start 1 | Type | Size (bp) | Repeat Start 2 | Mismatch (bp) | E-Value | Gene | Region |
|---|---|---|---|---|---|---|---|---|
| 1 | 4426 | F | 29 | 45,071 | −2 | 8.74 × 10⁻⁵ | IGS; | LSC |
| 2 | 4427 | F | 30 | 4428 | −3 | 6.56 × 10⁻⁴ | IGS | LSC |
| 3 | 4428 | F | 28 | 45,072 | −3 | 8.47 × 10⁻³ | IGS | LSC |
| 4 | 4432 | F | 26 | 45,072 | −2 | 4.48 × 10⁻³ | IGS | LSC |
| 5 | 8329 | F | 29 | 36,077 | −2 | 8.74 × 10⁻⁵ | | LSC |
| 6 | 8873 | F | 20 | 8895 | 0 | 6.27 × 10⁻³ | IGS | LSC |
| 7 | 9804 | F | 27 | 37,135 | −1 | 3.10 × 10⁻⁵ | | LSC |
| 8 | 13,510 | F | 20 | 89,606 | 0 | 6.27 × 10⁻³ | IGS; | LSC; IRa |
| 9 | 14,236 | F | 20 | 29,560 | 0 | 6.27 × 10⁻³ | IGS | LSC |
| 10 | 27,619 | F | 24 | 27,643 | 0 | 2.45 × 10⁻⁵ | IGS | LSC |
| 11 | 29,555 | F | 24 | 29,556 | −1 | 1.76 × 10⁻³ | IGS | LSC |
| 12 | 33,157 | F | 20 | 33,177 | 0 | 6.27 × 10⁻³ | IGS | LSC |
| 13 | 39,390 | F | 30 | 41,614 | −3 | 6.56 × 10⁻⁴ | | LSC |
| 14 | 42,625 | F | 25 | 147,248 | −1 | 4.59 × 10⁻⁴ | IGS | LSC; IRb |
| 15 | 44,406 | F | 39 | 100,262 | 0 | 2.28 × 10⁻¹⁴ | | LSC; IRa |
| 16 | 44,406 | F | 38 | 122,332 | 0 | 9.13 × 10⁻¹⁴ | | LSC; SSC |
| 17 | 45,075 | F | 24 | 142,008 | −1 | 1.76 × 10⁻³ | | LSC; IRb |
| 18 | 47,622 | F | 25 | 47,645 | 0 | 6.13 × 10⁻⁶ | IGS | LSC |
| 19 | 58,656 | F | 34 | 58,687 | 0 | 2.34 × 10⁻¹¹ | IGS | LSC |
| 20 | 66,712 | F | 41 | 66,752 | 0 | 1.43 × 10⁻¹⁵ | IGS | LSC |
| 21 | 66,939 | F | 20 | 66,958 | 0 | 6.27 × 10⁻³ | IGS | LSC |
| 22 | 68,033 | F | 21 | 68,052 | 0 | 1.57 × 10⁻³ | IGS | LSC |
| 23 | 71,232 | F | 20 | 84,928 | 0 | 6.27 × 10⁻³ | IGS | LSC |
| 24 | 80,953 | F | 27 | 80,966 | −2 | 1.21 × 10⁻³ | IGS | LSC |
| 25 | 83,166 | F | 29 | 122,320 | −3 | 2.36 × 10⁻³ | | LSC; SSC |
| 26 | 83,172 | F | 28 | 122,326 | −3 | 8.47 × 10⁻³ | | LSC; SSC |
| 27 | 90,610 | F | 29 | 90,631 | −2 | 8.74 × 10⁻⁵ | | IRa |
| 28 | 97,630 | F | 31 | 144,839 | −3 | 1.81 × 10⁻⁴ | | IRa; IRb |
| 29 | 100,260 | F | 40 | 122,330 | 0 | 5.70 × 10⁻¹⁵ | IGS; | IRa; SSC |
| 30 | 101,012 | F | 23 | 101,033 | 0 | 9.80 × 10⁻⁵ | IGS | IRa |
| 31 | 141,437 | F | 30 | 141,458 | −2 | 2.34 × 10⁻⁵ | IGS | IRb |
| 32 | 141,444 | F | 23 | 141,465 | 0 | 9.80 × 10⁻⁵ | IGS | IRb |
| 33 | 151,840 | F | 29 | 151,861 | −2 | 8.74 × 10⁻⁵ | | IRb |
| 34 | 6406 | I | 20 | 71,231 | 0 | 6.27 × 10⁻³ | IGS | LSC |
| 35 | 6408 | I | 24 | 71,228 | −1 | 1.76 × 10⁻³ | IGS | LSC |
| 36 | 8622 | I | 26 | 45,073 | −2 | 4.48 × 10⁻³ | IGS | LSC |
| 37 | 8625 | I | 23 | 45,077 | −1 | 6.76 × 10⁻³ | IGS | LSC |
| 38 | 71,232 | I | 20 | 84,930 | 0 | 6.27 × 10⁻³ | IGS | LSC |
F: Forward; I: Inverted; IGS: intergenic space.
Simple sequence repeats (SSRs) in the Rosa chinensis var. spontanea chloroplast genome.
| ID | Repeat Motif | Length (bp) | Start | End | Region | Gene | ID | Repeat Motif | Length (bp) | Start | End | Region | Gene |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | (A)11 | 11 | 279 | 289 | LSC | | 44 | (TTTA)4 | 12 | 50,468 | 50,479 | LSC | |
| 2 | (T)11 | 11 | 4108 | 4118 | LSC | | 45 | (TA)5 | 10 | 52,742 | 52,751 | LSC | |
| 3 | (A)19 | 19 | 4428 | 4446 | LSC | | 46 | (T)10 | 10 | 55,811 | 55,820 | LSC | |
| 4 | (A)10 | 10 | 4449 | 4458 | LSC | | 47 | (AAAT)3 | 12 | 55,911 | 55,922 | LSC | |
| 5 | (A)10 | 10 | 4887 | 4896 | LSC | | 48 | (TAAT)3 | 12 | 58,366 | 58,377 | LSC | |
| 6 | (T)10 | 10 | 5023 | 5032 | LSC | | 49 | (T)14 | 14 | 60,810 | 60,823 | LSC | |
| 7 | (TATAT)3 | 15 | 6102 | 6116 | LSC | | 50 | (TC)5 | 10 | 62,280 | 62,289 | LSC | |
| 8 | (T)17 | 17 | 6407 | 6423 | LSC | | 51 | (T)11 | 11 | 64,513 | 64,523 | LSC | |
| 9 | (AATA)3 | 12 | 6525 | 6536 | LSC | | 52 | (T)10 | 10 | 69,689 | 69,698 | LSC | |
| 10 | (AG)5 | 10 | 6755 | 6764 | LSC | | 53 | (A)16 | 16 | 69,739 | 69,754 | LSC | |
| 11 | (A)11 | 11 | 6945 | 6955 | LSC | | 54 | (T)18 | 18 | 71,235 | 71,252 | LSC | |
| 12 | (TAA)4 | 12 | 8257 | 8268 | LSC | | 55 | (T)15 | 15 | 71,933 | 71,947 | LSC | |
| 13 | (A)10 | 10 | 8639 | 8648 | LSC | | 56 | (A)10 | 10 | 72,733 | 72,742 | LSC | |
| 14 | (AT)6 | 12 | 10,093 | 10,104 | LSC | | 57 | (AT)6 | 12 | 73,632 | 73,643 | LSC | |
| 15 | (TAT)4 | 12 | 10,343 | 10,354 | LSC | | 58 | (A)12 | 12 | 79,231 | 79,242 | LSC | |
| 16 | (T)11 | 11 | 12,157 | 12,167 | LSC | | 59 | (A)14 | 14 | 79,393 | 79,406 | LSC | |
| 17 | (T)10 | 10 | 12,915 | 12,924 | LSC | | 60 | (T)10 | 10 | 79,429 | 79,438 | LSC | |
| 18 | (A)10 | 10 | 13,184 | 13,193 | LSC | | 61 | (ATGT)3 | 12 | 79,529 | 79,540 | LSC | |
| 19 | (C)10 | 10 | 14,237 | 14,246 | LSC | | 62 | (T)11 | 11 | 81,586 | 81,596 | LSC | |
| 20 | (T)10 | 10 | 14,247 | 14,256 | LSC | | 63 | (A)10 | 10 | 82,641 | 82,650 | LSC | |
| 21 | (T)11 | 11 | 18,361 | 18,371 | LSC | | 64 | (A)12 | 12 | 83,422 | 83,433 | LSC | |
| 22 | (TA)5 | 10 | 19,730 | 19,739 | LSC | | 65 | (A)11 | 11 | 83,498 | 83,508 | LSC | |
| 23 | (T)10 | 10 | 26,080 | 26,089 | LSC | | 66 | (T)18 | 18 | 84,931 | 84,948 | LSC | |
| 24 | (T)12 | 12 | 28,925 | 28,936 | LSC | | 67 | (TAT)4 | 12 | 86,619 | 86,630 | IRa | |
| 25 | (C)15 | 15 | 29,556 | 29,570 | LSC | | 68 | (TAGAAG)3 | 18 | 93,987 | 94,004 | IRa | |
| 26 | (T)10 | 10 | 29,571 | 29,580 | LSC | | 69 | (T)11 | 11 | 101,618 | 101,628 | IRa | |
| 27 | (AAT)4 | 12 | 30,504 | 30,515 | LSC | | 70 | (AGGT)3 | 12 | 107,843 | 107,854 | IRa | |
| 28 | (T)14 | 14 | 30,519 | 30,532 | LSC | | 71 | (TATT)3 | 12 | 110,028 | 110,039 | IRa | |
| 29 | (A)10 | 10 | 30,666 | 30,675 | LSC | | 72 | (TGT)4 | 12 | 111,869 | 111,880 | SSC | |
| 30 | (TA)5 | 10 | 36,313 | 36,322 | LSC | | 73 | (T)10 | 10 | 115,507 | 115,516 | SSC | |
| 31 | (T)11 | 11 | 36,473 | 36,483 | LSC | | 74 | (TAA)4 | 12 | 115,558 | 115,569 | SSC | |
| 32 | (AT)5 | 10 | 37,070 | 37,079 | LSC | | 75 | (A)13 | 13 | 115,612 | 115,624 | SSC | |
| 33 | (C)13 | 13 | 37,303 | 37,315 | LSC | | 76 | (T)10 | 10 | 120,845 | 120,854 | SSC | |
| 34 | (A)11 | 11 | 37,316 | 37,326 | LSC | | 77 | (AT)7 | 14 | 121,678 | 121,691 | SSC | |
| 35 | (AT)5 | 10 | 43,682 | 43,691 | LSC | | 78 | (A)16 | 16 | 122,551 | 122,566 | SSC | |
| 36 | (A)15 | 15 | 45,073 | 45,087 | LSC | | 79 | (T)15 | 15 | 122,804 | 122,818 | SSC | |
| 37 | (A)10 | 10 | 45,392 | 45,401 | LSC | | 80 | (T)10 | 10 | 129,830 | 129,839 | SSC | |
| 38 | (T)10 | 10 | 45,931 | 45,940 | LSC | | 81 | (ATAA)3 | 12 | 132,463 | 132,474 | IRb | |
| 39 | (A)11 | 11 | 47,296 | 47,306 | LSC | | 82 | (CTAC)3 | 12 | 134,645 | 134,656 | IRb | |
| 40 | (TAAT)3 | 12 | 48,112 | 48,123 | LSC | | 83 | (A)11 | 11 | 140,873 | 140,883 | IRb | |
| 41 | (T)14 | 14 | 48,306 | 48,319 | LSC | | 84 | (CTTCTA)3 | 18 | 148,497 | 148,514 | IRb | |
| 42 | (A)12 | 12 | 48,420 | 48,431 | LSC | | 85 | (ATA)4 | 12 | 155,871 | 155,882 | IRb | |
| 43 | (TA)5 | 10 | 48,500 | 48,509 | LSC | | | | | | | | |
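
The SSR entries above follow the usual motif-and-repeat-count notation, for example (A)11 or (TA)5. The sketch below shows one simple, regex-based way such perfect repeats can be located in a sequence. It is not the tool used in the study, which this page does not name. The minimum repeat counts (10 for mono-, 5 for di-, 4 for tri-, and 3 for tetra-, penta-, and hexanucleotide motifs) are an assumption inferred from the smallest repeat counts that appear in the table, and `find_ssrs` is a hypothetical helper name.

```python
import re

# Assumed minimum repeat counts per motif length (inferred from the table,
# not stated in the source): mono 10, di 5, tri 4, tetra/penta/hexa 3.
MIN_REPEATS = {1: 10, 2: 5, 3: 4, 4: 3, 5: 3, 6: 3}


def find_ssrs(sequence):
    """Return perfect SSRs as (motif, repeat_count, start, end), 1-based inclusive."""
    sequence = sequence.upper()
    hits = []
    for unit_len, min_rep in MIN_REPEATS.items():
        # A motif of unit_len bases followed by at least (min_rep - 1) more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(sequence):
            motif = match.group(1)
            # Skip motifs that are themselves repeats of a shorter unit,
            # e.g. "AA" (a mononucleotide run) or "ATAT" (a dinucleotide run).
            if any(
                motif == motif[:k] * (unit_len // k)
                for k in range(1, unit_len)
                if unit_len % k == 0
            ):
                continue
            repeats = len(match.group(0)) // unit_len
            hits.append((motif, repeats, match.start() + 1, match.end()))
    return sorted(hits, key=lambda hit: hit[2])


# Toy sequence (not from the genome): an (A)11 run and a (TA)5 run.
toy = "GGC" + "A" * 11 + "CGT" + "TA" * 5 + "GC"
ssrs = find_ssrs(toy)
print(ssrs)
```

On the toy sequence this reports an (A)11 repeat at positions 4 to 14 and a (TA)5 repeat at positions 18 to 27. Dedicated SSR tools additionally handle compound and imperfect repeats, which this sketch ignores.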
Figure 2. Complete chloroplast genome comparison of four rose species using the chloroplast genome of R. chinensis var. spontanea as a reference. The grey arrows and thick black lines above the alignment indicate the gene orientation. The y-axis represents the identity from 50% to 100%.
Figure 3. Comparison of the LSC, SSC and IR regions in chloroplast genomes of four species. Ψ: pseudogenes, →: distance from the edge.
Figure 4. Phylogeny of 22 species within Rosaceae based on the ML analysis of the chloroplast genome’s LSC, SSC, and one-IR regions with Berchemiella wilsonii (Rhamnaceae) as the outgroup. The position of R. chinensis var. spontanea is shown in block letters.