Dong Meng, Zhou Xiaomei, Ku Wenzhen, Zhenggang Xu.
Abstract
Artemisia selengensis is not only a health food but also a well-known traditional Chinese medicine. Although chloroplast (cp) genomic data are widely used in genomic evolution studies, molecular marker development, and phylogenetic analysis, only a fraction of the cp genomes of Artemisia have been reported, which makes evolutionary studies, genetic improvement, and phylogenetic identification within the genus very difficult. In this study, the complete chloroplast genome of A. selengensis was compared with those of other Artemisia species, and phylogenetic analyses were conducted with other genera in the Asteraceae family. The results showed that the A. selengensis chloroplast genome is AT-rich and has a typical quadripartite structure that is 151,215 bp in length. Comparative genome analyses demonstrated that the available chloroplast genomes of Artemisia species were well conserved in terms of genomic length, GC content, and gene organization and order. However, some differences that may indicate evolutionary events were found, such as a re-inversion event within the genus Artemisia, unequal duplication of the ycf1 gene caused by expansion and contraction of the IR region, and fast-evolving regions. Repeat sequence analysis showed that Artemisia chloroplast genomes presented highly similar patterns of SSR and LDR distribution. A total of 257 SSRs and 42 LDRs were identified in the A. selengensis chloroplast genome. The phylogenetic analysis showed that A. selengensis was sister to A. gmelinii. The findings of this study will be valuable in further studies to understand the genetic diversity and evolutionary history of Asteraceae.
Year: 2019 PMID: 30716116 PMCID: PMC6361438 DOI: 10.1371/journal.pone.0211340
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Base compositions in the A. selengensis chloroplast genome.
| Location | T/U (%) | C (%) | A (%) | G (%) | Length (bp) |
|---|---|---|---|---|---|
| Genome | 31.28 | 18.67 | 31.26 | 18.79 | 151215 |
| tRNA genes | 22.66 | 26.73 | 24.59 | 26.02 | 2798 |
| rRNA genes | 22.46 | 27.54 | 22.46 | 27.54 | 9048 |
| Introns region | 32.31 | 18.86 | 30.75 | 18.07 | 17240 |
| Protein-coding genes | 31.54 | 17.75 | 30.53 | 20.19 | 77778 |
| Intergenic region | 33.34 | 16.07 | 34.23 | 16.36 | 44274 |
| 1st position | 23.50 | 19.08 | 30.58 | 26.84 | 25926 |
| 2nd position | 32.70 | 20.39 | 29.21 | 17.70 | 25926 |
| 1st+2nd position | 28.10 | 19.73 | 29.90 | 22.27 | 51852 |
| 3rd position | 38.42 | 13.78 | 31.78 | 16.02 | 25926 |
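The GC content implied by each row of the table is simply the sum of its C and G percentages. The following is a minimal sketch of that arithmetic, together with a helper that derives the same percentages from a raw sequence. The function names and the eight-base toy sequence are illustrative assumptions, not part of the original study.

```python
from collections import Counter


def base_composition(seq: str) -> dict[str, float]:
    """Return the percentage of A, C, G and T among the unambiguous bases of seq."""
    counts = Counter(base for base in seq.upper() if base in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    return {base: 100.0 * counts[base] / total for base in "ACGT"}


def gc_content(composition: dict[str, float]) -> float:
    """GC content (%) is the sum of the G and C percentages."""
    return composition["G"] + composition["C"]


# Whole-genome row of the table above (percentages as reported).
genome_row = {"T": 31.28, "C": 18.67, "A": 31.26, "G": 18.79}
genome_gc = round(gc_content(genome_row), 2)

# Hypothetical toy sequence, used only to exercise base_composition().
toy = base_composition("ATGCATTA")
toy_gc = gc_content(toy)
```

Applied to the genome row, the sum is 37.46%, so the remaining 62.54% is A+T, which is what the abstract means by an AT-rich genome.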
Fig 1. Gene map of the complete chloroplast genome of A. selengensis.
Genes lying inside of the circle are transcribed clockwise, and those outside are transcribed counterclockwise. Blocks of different colors represent different functional groups. The darker gray of the inner circle corresponds to the GC content, and the lighter gray corresponds to the AT content.
Genes predicted in the chloroplast genome of A. selengensis.
| Category | Group of genes | Name of genes |
|---|---|---|
| Self-replication | Large subunit of ribosomal proteins | |
| Self-replication | Small subunit of ribosomal proteins | |
| Self-replication | DNA-dependent RNA polymerase | |
| Self-replication | rRNA genes | |
| Self-replication | tRNA genes | |
| Photosynthesis | Photosystem I | |
| Photosynthesis | Photosystem II | |
| Photosynthesis | NADH dehydrogenase | |
| Photosynthesis | Cytochrome b6/f complex | |
| Photosynthesis | ATP synthase | |
| Photosynthesis | Rubisco | |
| Other genes | Translational initiation factor | |
| Other genes | Maturase | |
| Other genes | Protease | |
| Other genes | Envelope membrane protein | |
| Other genes | Subunit of acetyl-CoA carboxylase | |
| Other genes | C-type cytochrome synthesis gene | |
| Genes of unknown function | Conserved open reading frames | |
a Duplicated gene
b Trans-splicing gene.
Length of introns and exons of the split genes in the A. selengensis complete chloroplast genome.
| Gene name | Strand | Start | End | Exon I (bp) | Intron I (bp) | Exon II (bp) | Intron II (bp) | Exon III (bp) |
|---|---|---|---|---|---|---|---|---|
| | - | 5190 | 6275 | 40 | 861 | 185 | | |
| | + | 15912 | 18705 | 432 | 721 | 1641 | | |
| | + | 26621 | 27874 | 145 | 699 | 410 | | |
| | - | 41826 | 43775 | 126 | 703 | 228 | 740 | 153 |
| | - | 68800 | 70794 | 68 | 798 | 292 | 609 | 228 |
| | + | 73721 | 75113 | 6 | 745 | 642 | | |
| | + | 75302 | 76459 | 8 | 675 | 475 | | |
| | - | 79921 | 81347 | 9 | 1019 | 399 | | |
| | - | 83042 | 84530 | 393 | 661 | 435 | | |
| | - | 93079 | 95281 | 777 | 670 | 756 | | |
| | - | 117648 | 119820 | 553 | 1081 | 539 | | |
| | + | 138855 | 141057 | 777 | 670 | 756 | | |
| | + | 149606 | 151094 | 393 | 661 | 435 | | |
| | - | 1722 | 4340 | 37 | 2547 | 35 | | |
| | - | 29908 | 30705 | 23 | 728 | 47 | | |
| | + | 46606 | 47116 | 37 | 424 | 50 | | |
| | - | 51073 | 51719 | 38 | 572 | 37 | | |
| | + | 100805 | 101657 | 43 | 775 | 35 | | |
| | + | 101722 | 102606 | 38 | 812 | 35 | | |
| | - | 131530 | 132414 | 38 | 812 | 35 | | |
| | - | 132479 | 133331 | 43 | 775 | 35 | | |
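The coordinates in this table can be cross-checked against the exon and intron lengths: if Start and End are 1-based and inclusive (an assumption, although the rows are consistent with it), the span `End - Start + 1` should equal the sum of the exon and intron lengths. The sketch below checks three rows copied from the table; the helper name is illustrative.

```python
def span_length(start: int, end: int) -> int:
    """Length of a feature given 1-based, inclusive coordinates (assumed convention)."""
    return end - start + 1


# (start, end, [exon/intron lengths in table order]) for three rows of the table.
rows = [
    (5190, 6275, [40, 861, 185]),
    (41826, 43775, [126, 703, 228, 740, 153]),
    (1722, 4340, [37, 2547, 35]),
]

# True for every row whose coordinate span matches its summed segment lengths.
consistent = [span_length(start, end) == sum(parts) for start, end, parts in rows]
```

All three rows agree, which supports reading the Start and End columns as inclusive genome positions.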
The codon-anticodon recognition pattern and codon usage for the A. selengensis chloroplast genome.
| Amino acid | Codon | No. | RSCU | tRNA | Amino acid | Codon | No. | RSCU | tRNA |
|---|---|---|---|---|---|---|---|---|---|
| Ala | GCU | 365 | 1.565 | trnA-UGC | Pro | CCA | 409 | 1.504 | trnP-UGG |
| Ala | GCG | 132 | 0.566 | | Pro | CCC | 236 | 0.868 | |
| Ala | GCC | 210 | 0.9 | | Pro | CCU | 306 | 1.125 | |
| Ala | GCA | 226 | 0.969 | | Pro | CCG | 137 | 0.504 | |
| Cys | UGU | 305 | 1.063 | trnC-GCA | Gln | CAA | 630 | 1.491 | trnQ-UUG |
| Cys | UGC | 269 | 0.937 | | Gln | CAG | 215 | 0.509 | |
| Asp | GAU | 642 | 1.566 | trnD-GUC | Arg | AGA | 518 | 1.265 | trnR-ACG |
| Asp | GAC | 178 | 0.434 | | Arg | AGG | 301 | 0.735 | trnR-UCU |
| Glu | GAG | 263 | 0.517 | trnE-UUC | Arg | CGA | 240 | 1.299 | |
| Glu | GAA | 755 | 1.483 | | Arg | CGC | 125 | 0.677 | |
| Phe | UUU | 984 | 1.15 | trnF-GAA | Arg | CGG | 140 | 0.758 | |
| Phe | UUC | 728 | 0.85 | | Arg | CGU | 234 | 1.267 | |
| Gly | GGU | 411 | 1.185 | trnG-GCC | Ser | AGC | 365 | 0.892 | trnS-GCU |
| Gly | GGG | 256 | 0.738 | trnG-UCC | Ser | AGU | 453 | 1.108 | trnS-GGA |
| Gly | GGC | 203 | 0.585 | | Ser | UCA | 182 | 0.491 | trnS-UGA |
| Gly | GGA | 517 | 1.491 | | Ser | UCC | 502 | 1.354 | |
| His | CAC | 149 | 0.423 | trnH-GUG | Ser | UCG | 266 | 0.717 | |
| His | CAU | 555 | 1.577 | | Ser | UCU | 533 | 1.438 | |
| Ile | AUU | 1031 | 1.294 | trnI-CAU | Thr | ACC | 413 | 1.151 | trnT-GGU |
| Ile | AUA | 715 | 0.897 | trnI-GAU | Thr | ACA | 301 | 0.839 | trnT-UGU |
| Ile | AUC | 644 | 0.808 | | Thr | ACG | 238 | 0.663 | |
| Lys | AAA | 988 | 1.332 | trnK-UUU | Thr | ACU | 483 | 1.346 | |
| Lys | AAG | 495 | 0.668 | | Val | GUU | 403 | 1.387 | trnV-GAC |
| Leu | CUA | 184 | 0.648 | trnL-CAA | Val | GUG | 186 | 0.64 | trnV-UAC |
| Leu | CUC | 261 | 0.92 | trnL-UAA | Val | GUC | 206 | 0.709 | |
| Leu | CUG | 205 | 0.722 | trnL-UAG | Val | GUA | 367 | 1.263 | |
| Leu | CUU | 485 | 1.709 | | Trp | UGG | 376 | 1 | trnW-CCA |
| Leu | UUA | 433 | 0.785 | | Tyr | UAC | 339 | 0.61 | trnY-GUA |
| Leu | UUG | 670 | 1.215 | | Tyr | UAU | 773 | 1.39 | |
| Met | AUG | 528 | 1 | trnM-CAU | * | UGA | 237 | 0.763 | |
| Asn | AAC | 383 | 0.54 | trnN-GUU | * | UAG | 202 | 0.65 | |
| Asn | AAU | 1035 | 1.46 | | * | UAA | 493 | 1.587 | |
The asterisk (*) denotes a stop codon.
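The RSCU (relative synonymous codon usage) column can be reproduced from the No. column with the usual definition: a codon's observed count divided by the mean count of all codons in its synonymous family. The sketch below applies that definition to the four Ala codons from the table. The function name is illustrative, and the software the authors used is not stated in this record.

```python
def rscu(counts: dict[str, int]) -> dict[str, float]:
    """RSCU for one synonymous family: count / (family total / family size)."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("codon family has no observations")
    family_size = len(counts)
    return {codon: family_size * n / total for codon, n in counts.items()}


# Alanine codon counts copied from the table above.
ala_counts = {"GCU": 365, "GCG": 132, "GCC": 210, "GCA": 226}
ala_rscu = {codon: round(value, 3) for codon, value in rscu(ala_counts).items()}
```

The rounded values match the Ala rows of the table. The tabulated values for the six-codon amino acids (Arg, Leu, Ser) are only reproduced if each is split into a two-codon and a four-codon family, for example AGA and AGG on their own for Arg, so that is how `rscu` would need to be called for those rows.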
Characteristics of nine Asteraceae species.
| Species | | | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|
| Length (bp)/GC content (%) | 151215/37.46 | 151056/37.46 | 151076/37.48 | 151318/37.42 | 151130/37.48 | 151012/37.47 | 150784/37.46 | 152229/37.33 | 152585/37.70 |
| Size (bp)/GC content (%) of LSC | 82920/35.55 | 82821/35.56 | 82740/35.58 | 83061/35.49 | 82873/35.57 | 82817/35.56 | 82958/35.51 | 83954/35.32 | 83622/35.82 |
| Size (bp)/GC content (%) of SSC | 18367/30.81 | 18309/30.72 | 18392/30.83 | 18335/30.83 | 18339/30.87 | 18281/30.85 | 18338/31.12 | 18233/31.11 | 18651/31.51 |
| Size (bp)/GC content (%) of IR | 24964/43.09 | 24963/43.08 | 24972/43.06 | 24961/43.06 | 24959/43.08 | 24957/43.08 | 24744/43.10 | 25021/42.97 | 25156/43.13 |
| Size (bp)/GC content (%) of CDS | 77928/37.84 | 79197/37.71 | 79182/37.77 | 79167/37.76 | 78912/37.77 | 76983/38.02 | 78372/37.75 | 78771/37.88 | 80257/38.03 |
| Size (bp)/GC content (%) of introns | 17240/36.94 | 17244/36.92 | 17259/36.93 | 17303/36.85 | 17308/36.88 | 15524/37.74 | 16197/37.41 | 16479/37.18 | 16200/37.28 |
| Size (bp)/GC content (%) of rRNA | 9048/55.08 | 9048/55.08 | 9048/55.08 | 9048/55.08 | 9048/55.08 | 9048/55.08 | 9047/55.18 | 9047/55.18 | 9046/55.23 |
| Size (bp)/GC content (%) of tRNA | 2798/52.75 | 2798/52.72 | 2806/52.67 | 2798/52.75 | 2806/52.71 | 2723/52.63 | 2692/52.45 | 2694/52.86 | 2726/52.93 |
| Size (bp)/GC content (%) of IGSs | 44274/32.43 | 42872/32.48 | 42854/32.44 | 43075/32.34 | 43129/32.50 | 46807/32.20 | 44549/32.49 | 45311/31.94 | 44446/32.77 |
| No. of different genes | 114 | 114 | 114 | 114 | 114 | 113 | 111 | 111 | 114 |
| No. of different protein-coding genes | 80 | 80 | 80 | 80 | 80 | 80 | 79 | 79 | 81 |
| No. of different rRNA genes | 4 | 4 | 4 | 4 | 4 | 4 | 4 | 4 | 4 |
| No. of different tRNA genes | 30 | 30 | 30 | 30 | 30 | 29 | 28 | 28 | 29 |
| No. of different duplicated genes by IR | 18 | 19 | 20 | 19 | 19 | 18 | 19 | 19 | 21 |
| No. of genes with introns | 21 | 21 | 21 | 21 | 21 | 19 | 20 | 20 | 20 |
* Pseudogenes are present in the complete genomes of A. frigida, A. montana, S. sessilis, D. glutinosum (ycf1, rps19), and C. humilis (ycf1, ycf68, rps19).
** Intron losses: one intron is missing in rpl16 (C. boreale, S. sessilis, and D. glutinosum).
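The quadripartite structure mentioned in the abstract implies that the total genome length equals the LSC plus the SSC plus two copies of the IR. The following is a minimal sketch that checks this for the first two data columns of the table; the function name is an illustrative assumption.

```python
def quadripartite_length(lsc: int, ssc: int, ir: int) -> int:
    """Total length of a quadripartite chloroplast genome: LSC + SSC + 2 x IR."""
    return lsc + ssc + 2 * ir


# Region sizes (bp) copied from the first and second data columns of the table.
first_column = quadripartite_length(lsc=82920, ssc=18367, ir=24964)
second_column = quadripartite_length(lsc=82821, ssc=18309, ir=24963)
```

The first result equals 151,215 bp, the A. selengensis genome length reported in the abstract, and the second equals the 151,056 bp listed for the next genome in the table.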
Fig 2. Genomic rearrangement of nine Asteraceae species relative to C. humilis.
Locally collinear blocks (LCBs) are colored to indicate syntenic regions. Homologous sequences are connected by lines of the same color. The histogram within each LCB corresponds to sequence similarity. Blocks below the center line indicate regions that align in the reverse complement (inverse) orientation. The small boxes below the LCBs of each chloroplast genome represent genes.
Fig 3. The expansion and contraction of the inverted repeats (IRs) of nine Asteraceae species relative to A. selengensis.
The small boxes in each chloroplast genome represent genes. Genes above the larger box are transcribed in the forward direction, and genes below the larger box are transcribed in the reverse direction.
Fig 4. Kimura's two-parameter (K2p) values of introns and intergenic spacers (IGSs) among intra-generic species (within Artemisia: A. selengensis, A. capillaris, A. frigida, A. gmelinii, A. montana) and inter-generic species (other genera of Asteraceae: A. selengensis, C. boreale, S. sessilis, D. glutinosum, C. humilis).
Black circles represent the mean K2p values of intra-generic species, and open triangles indicate the mean K2p values of inter-generic species. Bars are mean values (±SE, n = 5). Symbols indicate levels of statistical significance between intra-generic species and inter-generic species: no symbol P > 0.05; *P = 0.01–0.05; **P < 0.01. The X-axis denotes the homologous regions arranged by position.
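For reference, the Kimura two-parameter distance between two aligned sequences is d = -½ ln[(1 − 2P − Q)·√(1 − 2Q)], where P and Q are the proportions of sites showing transitions and transversions. The following is a minimal sketch of that standard formula. The function name, the gap-handling rule, and the toy sequences are assumptions for illustration, and this is not the software used to produce the figure.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1: str, seq2: str) -> float:
    """Kimura two-parameter distance between two aligned DNA sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # assumption: skip gaps and ambiguous bases
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1  # A<->G or C<->T
        else:
            transversions += 1  # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    term = (1 - 2 * p - q) * math.sqrt(1 - 2 * q) if q < 0.5 else 0.0
    if term <= 0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log(term)


# Hypothetical 10-site alignment with a single C->T transition at the last site.
d_toy = k2p_distance("ACGTACGTAC", "ACGTACGTAT")
```

With one transition in ten sites (P = 0.1, Q = 0), the toy distance is -½ ln 0.8, slightly larger than the raw difference of 0.1 because the model corrects for multiple substitutions at the same site.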
Fig 5. Ka/Ks ratios of protein-coding genes among intra-generic species (within Artemisia: A. selengensis, A. capillaris, A. frigida, A. gmelinii, A. montana) and inter-generic species (other genera of Asteraceae: A. selengensis, C. boreale, S. sessilis, D. glutinosum, C. humilis).
Black circles represent the mean Ka/Ks values of intra-generic species, and open triangles indicate the mean Ka/Ks values of inter-generic species. Bars are mean values (±SE, n = 5). Symbols indicate levels of statistical significance between intra-generic species and inter-generic species: no symbol P > 0.05; *P = 0.01–0.05; **P < 0.01. The X-axis denotes the homologous genes arranged by position.
Fig 6. Phylogenetic relationships based on 72 conserved chloroplast protein-coding sequences shared among 29 Asteraceae species, inferred using the neighbor-joining (NJ) method.
C. cornigera and C. humilis were selected as the outgroup.