Yongbing Zhao1, Jinlong Yin2, Haiyan Guo3, Yuyu Zhang1, Wen Xiao1, Chen Sun1, Jiayan Wu3, Xiaobo Qu2, Jun Yu3, Xumin Wang3, Jingfa Xiao3.
Abstract
Panax ginseng C.A. Meyer (P. ginseng) is an important medicinal plant and is often used in traditional Chinese medicine. With next generation sequencing (NGS) technology, we determined the complete chloroplast genome sequences for four Chinese P. ginseng strains: Damaya (DMY), Ermaya (EMY), Gaolishen (GLS), and Yeshanshen (YSS). The total chloroplast genome sequence length for DMY, EMY, and GLS was 156,354 bp, while that for YSS was 156,355 bp. Comparative genomic analysis of the chloroplast genome sequences indicates that gene content, GC content, and gene order in DMY are quite similar to those of its relative species, and that the nucleotide sequence diversity of the inverted repeat region (IR) is lower than that of the large single copy region (LSC) and the small single copy region (SSC). A comparison among these four P. ginseng strains revealed that the chloroplast genome sequences of DMY, EMY, and GLS were identical and that YSS had a 1-bp insertion at base 5472. To further study the heterogeneity in the chloroplast genome during domestication, high-resolution reads were mapped to the genome sequences to investigate the differences at the minor allele level; 208 minor allele sites with minor allele frequencies (MAF) of ≥0.05 were identified. The numbers of polymorphic sites per kb of chloroplast genome sequence for DMY, EMY, GLS, and YSS were 0.74, 0.59, 0.97, and 1.23, respectively. All the minor allele sites were located in the LSC and IR regions, and the four strains showed the same variation types (base substitution or indel) at all identified polymorphic sites. Comparison of heterogeneity in the chloroplast genome sequences showed that the minor allele sites on the chloroplast genome were undergoing purifying selection to adapt to a changing environment during the domestication process. A study of the P. ginseng chloroplast genome with particular focus on minor allele sites would aid in investigating the dynamics of the chloroplast genomes and in typing different P. ginseng strains.
Keywords: Panax ginseng; SNP; chloroplast genome; comparative genomics; minor allele
Year: 2015 PMID: 25642231 PMCID: PMC4294130 DOI: 10.3389/fpls.2014.00696
Source DB: PubMed Journal: Front Plant Sci ISSN: 1664-462X Impact factor: 5.753
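The minor-allele statistics quoted in the abstract (a MAF cutoff of 0.05 and polymorphic sites per kb) can be illustrated with a minimal sketch. The abstract does not spell out the authors' pipeline, so the definition of MAF used here (read count of the second most common allele divided by total depth), the function names, and the pileup counts are all assumptions made for illustration; only the genome length and the threshold come from the abstract.

```python
GENOME_LENGTH_BP = 156_354  # DMY/EMY/GLS chloroplast genome length (from the abstract)
MAF_THRESHOLD = 0.05        # minor allele frequency cutoff (from the abstract)


def minor_allele_frequency(base_counts):
    """Frequency of the second most common allele at one site.

    base_counts maps an allele (e.g. "A", "C", "-") to its read count.
    Returns 0.0 if the depth is zero or only one allele is observed.
    This definition is an assumption, not the paper's stated method.
    """
    depth = sum(base_counts.values())
    if depth == 0:
        return 0.0
    ranked = sorted(base_counts.values(), reverse=True)
    return ranked[1] / depth if len(ranked) > 1 else 0.0


def sites_per_kb(n_sites, genome_length_bp=GENOME_LENGTH_BP):
    """Number of polymorphic sites per kilobase of genome sequence."""
    return n_sites / (genome_length_bp / 1000)


# Hypothetical read counts at three positions (not data from the paper).
pileup = {
    100: {"A": 940, "G": 60},  # MAF 0.06, passes the cutoff
    200: {"C": 990, "T": 10},  # MAF 0.01, below the cutoff
    300: {"T": 1000},          # single allele, MAF 0.0
}

minor_sites = [
    pos for pos, counts in pileup.items()
    if minor_allele_frequency(counts) >= MAF_THRESHOLD
]
```

With these toy counts only position 100 is retained. As a plausibility check on the scale of the reported densities, `sites_per_kb(116)` evaluates to about 0.74 for a genome of 156,354 bp; the count of 116 is a hypothetical input, not a figure from the paper.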
Gene contents in the P. ginseng chloroplast genome.
| Category | Group of genes | Name of genes |
| Self-replication | rRNA genes | |
| | tRNA genes | 34 trn genes (7 contain an intron, 12 in the IR regions) |
| | Small subunit of ribosome | |
| | Large subunit of ribosome | |
| | DNA dependent RNA polymerase | |
| Genes for photosynthesis | Subunits of NADH-dehydrogenase | |
| | Subunits of photosystem I | |
| | Subunits of photosystem II | |
| | Subunits of cytochrome b/f complex | |
| | Subunits of ATP synthase | |
| | Large subunit of rubisco | |
| Other genes | Translational initiation factor | |
| | Maturase | |
| | Protease | |
| | Envelope membrane protein | |
| | Subunit of Acetyl-CoA-carboxylase | |
| | c-type cytochrome synthesis gene | |
| Genes of unknown function | Open Reading Frames (ORF, ycf) | |
One and two asterisks after gene names indicate genes containing one and two introns, respectively. Genes located in IR regions are indicated by the (×2) symbol after the gene name. The rps12 gene is divided: the 5′-rps12 is located in the LSC region and the 3′-rps12 in the IR region.
Figure 1. Gene map of the P. ginseng chloroplast genome. Genes shown outside the outer circle are transcribed clockwise, and those inside are transcribed counterclockwise. Genes belonging to different functional groups are color coded. The dashed area in the inner circle indicates the GC content of the chloroplast genome.
Simple sequence repeats in the P. ginseng chloroplast genome.
| Repeat unit | Length (bp) | Number of SSRs | Position (start–end) |
| A | 10 | 1 | 17677–17686 |
| | 11 | 1 | 23946–23956 |
| | 13 | 2 | 4823–4835, 14249–14261 |
| C | 10 | 2 | 7503–7512, 38191–38200 |
| | 11 | 1 | 137043–137053 |
| G | 11 | 1 | 105431–105441 |
| T | 10 | 7 | 27594–27603 |
| | 11 | 3 | 19889–19899 |
| TA | 14 | 1 | 85868–85881 |
| AAGA | 12 | 1 | 30782–30793 |
| TCTT | 12 | 1 | 30804–30815 |
| AATT | 12 | 1 | 30948–30959 |
| ATTT | 12 | 1 | 34090–34101 |
| TATT | 12 | 1 | 69890–69901 |
| AAAG | 12 | 1 | 72233–72244 |
| AGGT | 12 | 1 | 107514–107525 |
| CTAC | 12 | 1 | 134957–134968 |
| ATTAG | 15 | 1 | 100769–100783 |
| CTAAT | 15 | 1 | 141701–141715 |
| CATAGT | 18 | 1 | 74295–74312 |
A gene name in brackets indicates that the SSR is located within that gene.
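A table of this kind can be produced by scanning the genome for perfect tandem repeats of short motifs. The following is a minimal sketch, not the tool used by the authors: the minimum repeat counts in `MIN_REPEATS` are assumptions (the values for motif lengths 1, 4, 5, and 6 match the shortest entries in the table, while those for lengths 2 and 3 are guesses), and the function names are hypothetical.

```python
import re

# Assumed minimum number of tandem copies per motif length (hypothetical thresholds).
MIN_REPEATS = {1: 10, 2: 6, 3: 4, 4: 3, 5: 3, 6: 3}


def is_primitive(unit):
    """True if the motif is not itself a repetition of a shorter motif."""
    n = len(unit)
    return not any(
        n % k == 0 and unit == unit[:k] * (n // k) for k in range(1, n)
    )


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return (motif, length_bp, start, end) tuples with 1-based inclusive positions."""
    seq = seq.upper()
    hits = []
    for k, n in sorted(min_repeats.items()):
        # The lookahead lets the scan test every start position, so a run of a
        # non-primitive motif (e.g. "AA") cannot hide a real SSR that begins inside it.
        pattern = re.compile(r"(?=(([ACGT]{%d})\2{%d,}))" % (k, n - 1))
        last_end = 0
        for m in pattern.finditer(seq):
            start, end = m.start(1), m.end(1)
            if start < last_end or not is_primitive(m.group(2)):
                continue
            hits.append((m.group(2), end - start, start + 1, end))
            last_end = end
    return hits


# Toy sequence (not from the paper): A x10, TA x7, and AAGA x3 separated by spacers.
toy = "GC" + "A" * 10 + "GCGT" + "TA" * 7 + "CG" + "AAGA" * 3 + "C"
toy_hits = find_ssrs(toy)
```

Positions are reported as 1-based inclusive ranges so that they read like the start–end coordinates in the table. Only perfect repeats are detected; interrupted or compound SSRs would need a more tolerant search.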
Long repeat sequences in the P. ginseng chloroplast genome.
| Repeat unit (copies) | Length (bp) | Position (start–end) | Location |
| (CTACATC)3 | 21 | 1945–1965 | Intergenic region |
| (CGATATTGATGCTAGTGA)4 | 72 | 92801–92872 | |
| (ATATCGTCACTAGCATCA)4 | 72 | 149606–149677 | |
| (AGAAACCCCAACAACGGAAGAAAGGGGGGAAAGTGAGGAAGAAACAGATGTAGAAAT)4 | 228 | 111304–111531 | Intergenic region |
| (GTTTCTATTTCTACATCTGTTTCTTCCTCACTTTCCCCCCTTTCTTCCGTTGTTGGG)4 | 228 | 130947–131174 | |
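The entries in the long-repeat table can be sanity-checked with a short script. The sketch below confirms that each reported length equals unit length times copy number and matches the coordinate span, and it tests whether the two 72-bp units and the two 228-bp units are reverse complements of each other up to a cyclic shift, as would be expected if each pair represents the two copies of one repeat in the inverted repeat regions. That interpretation is an inference from the coordinates rather than a statement in the source, and the helper names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def is_rotation(a, b):
    """True if b is a cyclic shift of a (tandem repeats have no fixed phase)."""
    return len(a) == len(b) and a in b + b


def length_consistent(unit, copies, length, start, end):
    """Check length == unit length x copies == end - start + 1."""
    return len(unit) * copies == length == end - start + 1


# (unit, copies, length, start, end), copied from the table above.
REPEATS = [
    ("CTACATC", 3, 21, 1945, 1965),
    ("CGATATTGATGCTAGTGA", 4, 72, 92801, 92872),
    ("ATATCGTCACTAGCATCA", 4, 72, 149606, 149677),
    ("AGAAACCCCAACAACGGAAGAAAGGGGGGAAAGTGAGGAAGAAACAGATGTAGAAAT",
     4, 228, 111304, 111531),
    ("GTTTCTATTTCTACATCTGTTTCTTCCTCACTTTCCCCCCTTTCTTCCGTTGTTGGG",
     4, 228, 130947, 131174),
]

all_lengths_ok = all(length_consistent(*row) for row in REPEATS)

# Rows 2/3 and rows 4/5 are compared as candidate inverted-repeat pairs.
pair_72 = is_rotation(reverse_complement(REPEATS[1][0]), REPEATS[2][0])
pair_228 = is_rotation(reverse_complement(REPEATS[3][0]), REPEATS[4][0])
```

All three checks evaluate to true for the values in the table, so the rows are internally consistent and each pair differs only by strand and repeat phase.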
Figure 2. Phylogenetic tree based on 52 protein-coding genes (maximum likelihood). DMY is marked in blue and the two outgroup species are marked in red.
Figure 3. A comparison of LSC, SSC, and IR region borders among four chloroplast genomes.
Comparison of protein-coding regions (CDS), introns, and intergenic spacers (IGS) in the LSC, IR, and SSC regions of chloroplast genomes.
| | | P. ginseng vs. D. carota | | | | | P. ginseng vs. A. cerefolium | | | | | P. ginseng vs. E. senticosus | | | | |
| Type | Region | NP | ND | Ka | Ks | Ka/Ks (p-value) | NP | ND | Ka | Ks | Ka/Ks (p-value) | NP | ND | Ka | Ks | Ka/Ks (p-value) |
| CDS | LSC | 1958 | 0.0443 | 0.0212 | 0.1484 | 0.1427 (0) | 2021 | 0.0458 | 0.0226 | 0.1496 | 0.1511 (0) | 352 | 0.0079 | 0.004 | 0.0229 | 0.1765 (8.5e-58) |
| | IR | 496 | 0.0258 | 0.0224 | 0.0407 | 0.5507 (8.3e-10) | 470 | 0.0243 | 0.0221 | 0.0351 | 0.6276 (4.9e-6) | 62 | 0.0032 | 0.0037 | 0.0015 | 2.528 (0.0142) |
| | SSC | 1296 | 0.0907 | 0.0721 | 0.2321 | 0.3107 (1.9e-99) | 1276 | 0.0895 | 0.0708 | 0.2314 | 0.306 (2e-100) | 281 | 0.0193 | 0.0154 | 0.0391 | 0.3937 (1.7e-13) |
| | TOTAL | 3750 | 0.0482 | 0.0305 | 0.1316 | 0.2319 (0) | 3767 | 0.0485 | 0.031 | 0.1308 | 0.2371 (0) | 695 | 0.0089 | 0.006 | 0.0202 | 0.2988 (7e-54) |
| Intron | LSC | 713 | 0.0962 | – | – | – | 711 | 0.0962 | – | – | – | 106 | 0.0139 | – | – | – |
| | IR | 46 | 0.0174 | – | – | – | 30 | 0.0113 | – | – | – | 4 | 0.0015 | – | – | – |
| | SSC | 115 | 0.1136 | – | – | – | 115 | 0.1130 | – | – | – | 109 | 0.0282 | – | – | – |
| | TOTAL | 874 | 0.0789 | – | – | – | 856 | 0.0774 | – | – | – | 130 | 0.0115 | – | – | – |
| IGS | LSC | 3898 | 0.1295 | – | – | – | 3871 | 0.1260 | – | – | – | 719 | 0.0210 | – | – | – |
| | IR | 768 | 0.0284 | – | – | – | 413 | 0.0151 | – | – | – | 68 | 0.0025 | – | – | – |
| | SSC | 626 | 0.1797 | – | – | – | 597 | 0.1722 | – | – | – | 109 | 0.0282 | – | – | – |
| | TOTAL | 5292 | 0.0873 | – | – | – | 4881 | 0.0793 | – | – | – | 896 | 0.0137 | – | – | – |
| TOTAL | | 9916 | 0.0664 | – | – | – | 9504 | 0.0632 | – | – | – | 1721 | 0.0111 | – | – | – |
This table summarizes the calculations from three comparisons: P. ginseng vs. D. carota, P. ginseng vs. A. cerefolium, and P. ginseng vs. E. senticosus. Abbreviations: NP, number of polymorphic sites; ND, nucleotide differences; Ks, synonymous substitution differences; Ka, nonsynonymous substitution differences. In the Ka/Ks column, each cell gives the Ka/Ks value followed by the p-value in parentheses.
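The Ka/Ks column can be recomputed from the two substitution columns, which also confirms the column order: in every CDS row the third numeric value divided by the fourth reproduces the reported ratio, so the third is Ka and the fourth is Ks. The sketch below uses the CDS rows of the first comparison; the dictionary layout and function names are illustrative assumptions, and the selection labels follow the conventional reading of Ka/Ks without considering the p-values.

```python
# CDS rows of the first comparison, copied from the table.
# Tuple layout (inferred from the arithmetic): (Ka, Ks, reported Ka/Ks).
CDS_FIRST_COMPARISON = {
    "LSC": (0.0212, 0.1484, 0.1427),
    "IR": (0.0224, 0.0407, 0.5507),
    "SSC": (0.0721, 0.2321, 0.3107),
    "TOTAL": (0.0305, 0.1316, 0.2319),
}


def ka_ks_ratio(ka, ks):
    """Ratio of nonsynonymous to synonymous substitution differences."""
    if ks <= 0:
        raise ValueError("Ks must be positive to form a ratio")
    return ka / ks


def selection_label(ratio):
    """Conventional interpretation of Ka/Ks; ignores statistical significance."""
    if ratio < 1:
        return "purifying"
    if ratio > 1:
        return "positive"
    return "neutral"


recomputed = {
    region: ka_ks_ratio(ka, ks)
    for region, (ka, ks, _reported) in CDS_FIRST_COMPARISON.items()
}

# Largest gap between recomputed and reported ratios; small gaps are expected
# because the table lists Ka and Ks rounded to four decimal places.
max_deviation = max(
    abs(recomputed[region] - CDS_FIRST_COMPARISON[region][2])
    for region in recomputed
)

labels = {region: selection_label(value) for region, value in recomputed.items()}
```

For these four rows the recomputed ratios agree with the reported ones to within rounding, and all fall below 1, the range conventionally read as purifying selection.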
Figure 4. Minor allele distribution among chloroplast genomes of four strains. In the panels DMY, EMY, GLS, and YSS, the dashed lines indicate a minor allele frequency (MAF) of 0.05. The panel Sup shows tRNA and rRNA genes, protein-coding, intergenic, and intron regions on the chloroplast genome; the colors are given in the legend for Sup. LSC, SSC, and IR regions are marked using arrows.
Figure 5. Minor allele overlaps among chloroplast genomes of four strains.
Figure 6. The evolutionary relationship among five ginseng chloroplast genomes.