Jingwen Yue1, Yang Ni1, Mei Jiang2, Haimei Chen2, Pinghua Chen1, Chang Liu2.
Abstract
Codonopsis pilosula subsp. tangshen is one of the most important medicinal herbs used in traditional Chinese medicine. Correct identification of materials from C. pilosula subsp. tangshen is critical to ensure the efficacy and safety of the associated medicines. Traditional DNA molecular markers could not distinguish Codonopsis species well, so super-barcodes or species-specific molecular markers need to be developed. In this study, we report the plastome of Codonopsis pilosula subsp. tangshen (Oliv.) D.Y. Hong and conduct phylogenomic and comparative analyses in the genus Codonopsis for the first time. The total length of the C. pilosula subsp. tangshen plastome was 170,672 bp. The plastome contained 108 genes, including 76 protein-coding genes, 28 transfer RNA (tRNA) genes, and four ribosomal RNA (rRNA) genes. Comparative analysis indicated that C. pilosula subsp. tangshen had an unusually large inversion in the large single-copy (LSC) region compared with the other three Codonopsis species. Two dispersed repeat sequences were located at the two ends of the inverted region and might have mediated the generation of this inversion. We found five hypervariable regions among the four Codonopsis species. PCR amplification and Sanger sequencing experiments demonstrated that two of these hypervariable regions could distinguish three medicinal Codonopsis species. The results of this study will support taxonomic classification, discrimination, and molecular evolutionary studies of Codonopsis species.
Year: 2022 PMID: 35913971 PMCID: PMC9342729 DOI: 10.1371/journal.pone.0271813
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.752
Fig 7. The hypervariable regions among the four Codonopsis species.
The horizontal axis lists the intergenic spacer regions that are highly variable among the four Codonopsis species. The vertical axis shows the arbitrary K2P (Kimura two-parameter) distance of these regions. The square in the middle of each line represents the mean distance of each intergenic spacer region.
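The K2P distance plotted in this figure can be illustrated with a minimal sketch. The code below is not taken from the study; it assumes two pre-aligned sequences of equal length, skips any site with a gap or ambiguous base, and applies the standard Kimura two-parameter formula d = -1/2 ln((1 - 2P - Q) sqrt(1 - 2Q)), where P and Q are the proportions of transitions and transversions. The function name `k2p_distance` is hypothetical.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq_a, seq_b):
    """Kimura two-parameter distance between two aligned DNA sequences.

    Illustrative helper only; sites with gaps or ambiguous bases are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    transitions = transversions = compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguous bases
        compared += 1
        if a == b:
            continue
        # A<->G and C<->T are transitions; all other changes are transversions
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1

    if compared == 0:
        raise ValueError("no comparable sites")

    p = transitions / compared
    q = transversions / compared
    # math.log raises ValueError if the sequences are too divergent (saturated)
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))
```

Averaging this value over all species pairs for one spacer would give a per-region mean comparable to the square marker described in the caption, assuming that is how the mean was derived.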
Fig 1. Schematic representation of the plastome of C. pilosula subsp. tangshen created by CPGAVAS2.
The map contains four rings. From the center outward, the first ring shows the forward and reverse repeats, indicated by red and green arcs, respectively. The second ring shows the tandem repeats, marked with short bars. The third ring shows the microsatellite sequences identified using MISA. The fourth ring shows the gene structure of the plastome. The genes are colored according to their functional categories, which are shown in the left corner.
Gene compositions of the Codonopsis pilosula subsp. tangshen plastome.
| Category of genes | Group of genes | Name of genes |
|---|---|---|
| rRNA | | |
| tRNA | | |
| Photosynthesis | Subunits of ATP synthase | |
| | Subunits of photosystem II | |
| | Subunits of cytochrome b/f complex | |
| | Subunits of photosystem I | |
| | Subunit of rubisco | |
| | Subunits of NADH-dehydrogenase | |
| Self-replication | Large subunit of ribosome | |
| | DNA dependent RNA polymerase | |
| | Small subunit of ribosome | |
| Other genes | c-type cytochrome synthesis gene | |
| | Envelope membrane protein | |
| | Protease | |
| | Maturase | |
| Unknown | Conserved open reading frames | |
Fig 2. Comparison of the LSC, IR, and SSC borders among the complete plastomes of four Codonopsis species, two Campanula species, Leptocodon hirsutus, and Platycodon grandiflorus.
JLB, JSB, JSA, and JLA represent the junction sites of LSC/IRb, IRb/SSC, SSC/IRa, and IRa/LSC, respectively. Different colors represent different regions: blue represents the LSC, orange the IRs, and green the SSC. The number on each arrow indicates the distance between the gene and the boundary. The genes shown in the figure are those closest to each boundary.
Fig 3. Phylogenetic tree of species from Codonopsis and other genera, constructed from the nucleotide sequences of 68 conserved plastid protein-coding genes using the maximum likelihood (ML) and Bayesian inference (BI) methods.
The numbers next to each node represent the bootstrap value and the BI posterior probability, respectively. The GenBank accession numbers are shown after the Latin name of each species. The sequence obtained in this study is highlighted in bold. Branch length corresponds to the frequency of base substitutions.
Fig 4. Structural variation in C. pilosula subsp. tangshen.
A comparison of the C. pilosula subsp. tangshen plastome from this study with those of C. lanceolata, C. minima, and C. tsinlingensis from NCBI revealed similarities and differences in syntenic blocks. The purple boxes indicate corresponding regions in the different species. C. pilosula subsp. tangshen was used as the reference. The inversion found in C. pilosula subsp. tangshen is shown in purple.
Fig 5. Alignment of the two repeat sequences at the ends of the inversion in C. pilosula subsp. tangshen.
(a) Schematic representation of the inversion region and the two repeat sequences. The black shaded area represents the inversion region. Grey areas link the repeat sequences at both ends. The yellow areas represent the sequences flanking the repeat sequences. rpoC2 and clpP are the genes closest to the inversion region. Sequences are shown in the 5′ to 3′ direction. The numbers on the arrows give the start and end positions of the two repeat sequences. The dash in the middle of the sequences represents the omitted sequence. (b) Alignment of the repeat sequences in C. pilosula subsp. tangshen and C. lanceolata. The black shading represents the two repeat unit sequences. The two repeat units are palindromic. The red box in the middle represents the omitted sequences between the two repeat units. The numbers indicated by the red arrows give the start and end positions of the two repeat units in C. pilosula subsp. tangshen and C. lanceolata. The repeat sequence in C. pilosula subsp. tangshen is longer than that in C. lanceolata. The species codes are listed below the figure.
Fig 6. Comparison of the four Codonopsis plastomes by mVISTA.
The vertical scale on the right indicates the percentage of identity, ranging from 50% to 100%. The horizontal axis shows the coordinates within the plastome. Gray arrows above the alignments indicate the genes. Different colors represent different regions: dark blue, light blue, and pink represent exons, tRNAs or rRNAs, and conserved non-coding sequences, respectively. The reference is C. pilosula subsp. tangshen, with its inversion region inverted for comparison. The number codes for the species are shown below the picture.
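The percent-identity axis in this figure can be illustrated with a small sliding-window calculation. This is only a sketch of the general idea, not mVISTA's actual algorithm; it assumes two pre-aligned sequences of equal length, and the function name `window_identity` and its `window` and `step` parameters are hypothetical.

```python
def window_identity(ref, other, window=100, step=100):
    """Percent identity of `other` against `ref` in fixed-size windows.

    Illustrative only: both inputs must already be aligned to equal length,
    and a column counts as a match only when both bases agree and are not gaps.
    Returns a list of (window_start, percent_identity) tuples.
    """
    if len(ref) != len(other):
        raise ValueError("sequences must be aligned to the same length")

    results = []
    for start in range(0, len(ref) - window + 1, step):
        ref_chunk = ref[start:start + window]
        other_chunk = other[start:start + window]
        matches = sum(
            1 for a, b in zip(ref_chunk, other_chunk) if a == b and a != "-"
        )
        results.append((start, 100.0 * matches / window))
    return results
```

Windows whose identity falls well below that of their neighbors would correspond to the dips visible in an identity plot of this kind.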
Fig 8. Alignment and sequencing chromatograms of the PCR products amplified using the primers Com1 and Com4.
The ID of each sequence is shown on the left side of each panel. Each ID consists, in order, of the species abbreviation, the individual plant ID, and the primer name. The alignment shows individuals 1–5 of the three species. The sequencing chromatograms are shown for individual 1 of each species as an example. The red squares mark the SNP and indel regions that can distinguish the three species. Nucleotides identical across all sequences are shaded in black, whereas those conserved in 60% of the sequences are shaded in gray. lan: Codonopsis lanceolata; tan: Codonopsis pilosula subsp. tangshen; tsi: Codonopsis tsinlingensis. Arabic numerals denote the individual plant.
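The shading rule in this caption can be sketched as a per-column classification of an alignment. The code below is a hypothetical illustration, not the tool used to draw the figure; it reads "conserved in 60% of the sequences" as "the most common base occurs in at least 60% of the sequences", which is an assumption, and the function name `shade_columns` is invented for the example.

```python
from collections import Counter


def shade_columns(aligned, threshold=0.6):
    """Label each alignment column as 'black', 'gray', or 'none'.

    'black': the same base in every sequence.
    'gray' : the most common base reaches `threshold` (assumed to mean >= 60%).
    'none' : anything else, including columns dominated by gaps.
    """
    if not aligned:
        return []
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("all sequences must have the same aligned length")

    shades = []
    for column in zip(*aligned):
        base, count = Counter(column).most_common(1)[0]
        if base == "-":
            shades.append("none")
        elif count == len(column):
            shades.append("black")
        elif count / len(column) >= threshold:
            shades.append("gray")
        else:
            shades.append("none")
    return shades
```

Columns that are not uniformly black are the candidate SNP or indel positions, such as those marked by the red squares.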