Shujie Dong, Zhiqi Ying, Shuisheng Yu, Qirui Wang, Guanghui Liao, Yuqing Ge, Rubin Cheng.
Abstract
BACKGROUND: Stephania tetrandra S. Moore (S. tetrandra) is a medicinal plant of the family Menispermaceae with high medicinal value that merits further exploration. Wild S. tetrandra is widely distributed across the tropical and subtropical regions of China, which may have generated genetic diversity and distinct population structures. The geographical origin of S. tetrandra is an important factor influencing its quality and market price. In addition, species relationships within the genus Stephania remain uncertain because of high morphological similarity and the low support values obtained from molecular analyses. Complete chloroplast (cp) genome data have become a promising means of determining geographical origin and understanding the evolution of closely related plant species. Herein, we sequenced the complete cp genome of S. tetrandra from Zhejiang Province and conducted a comparative analysis within Stephania to reveal the structural variations, informative markers and phylogenetic relationships of Stephania species.
Keywords: Chloroplast genome; Comparative analysis; Mutational hotspots; Phylogenetic relationship; Stephania tetrandra
Year: 2021 PMID: 34872502 PMCID: PMC8647421 DOI: 10.1186/s12864-021-08193-x
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Fig. 1 Circular chloroplast (CP) genome map of Stephania tetrandra. Genes drawn outside the circle are transcribed anti-clockwise, while those inside the circle are transcribed clockwise. The large single copy (LSC) region, inverted repeat (IRA, IRB) regions and small single copy (SSC) region are shown in the figure. The darker gray in the inner circle corresponds to GC content, whereas the lighter gray corresponds to AT content. Different gene colors represent different functions
List of genes annotated in the chloroplast genome of Stephania tetrandra voucher ZJ
| No. | Group of genes | Gene names | Amount |
|---|---|---|---|
| 1 | Photosystem I | | 5 |
| 2 | Photosystem II | | 14 |
| 3 | Cytochrome b/f complex | | 6 |
| 4 | ATP synthase | | 6 |
| 5 | NADH dehydrogenase | | 12 |
| 6 | RubisCO large subunit | | 1 |
| 7 | RNA polymerase | | 4 |
| 8 | Ribosomal proteins (SSU) | | 13 |
| 9 | Ribosomal proteins (LSU) | | 13 |
| 10 | Other genes | | 6 |
| 11 | Proteins of unknown function | | 7 |
| 12 | Transfer RNAs | | 37 |
| 13 | Ribosomal RNAs | | 8 |
| 14 | Pseudogenes | | 2 |
One or two asterisks after a gene indicate that the gene contains one or two introns, respectively
Statistics on the basic features of the chloroplast genomes of the three Stephania species
| Characteristics | |||
|---|---|---|---|
| Accession number | MT849286 | KU204903.1 | MN654112.1 |
| Total length (bp) | 157,725 | 157,719 | 156,624 |
| LSC length (bp) | 89,468 | 88,693 | 87,759 |
| SSC length (bp) | 19,685 | 20,346 | 20,169 |
| IR length (bp) | 48,572 | 48,680 | 48,696 |
| Total Number of Genes | 134 | 132 | 132 |
| Coding Genes | 87 | 85 | 85 |
| rRNA Genes | 8 | 8 | 8 |
| tRNA Genes | 37 | 37 | 37 |
| Pseudogenes | 2 | 2 | 2 |
| GC content | |||
| Total (%) | 38.18 | 38.23 | 38.44 |
| LSC (%) | 36.31 | 36.42 | 36.66 |
| SSC (%) | 33.00 | 32.94 | 33.40 |
| IR (%) | 43.73 | 43.75 | 43.76 |
LSC Large single copy, SSC Small single copy, IR Inverted repeat, GC Guanine and Cytosine
Codon usage and codon-anticodon recognition patterns of three Stephania plants
| Codon | tRNA | Numbers and RSCU | | |
|---|---|---|---|---|
| UUU(F) | | 870/1.20 | 882/1.23 | 889/1.22 |
| UUC(F) | trnF-GAA | 576/0.80 | 557/0.77 | 568/0.78 |
| UUA(L) | | 749/1.67 | 768/1.72 | 757/1.69 |
| UUG(L) | trnL-CAA | 581/1.30 | 556/1.24 | 567/1.27 |
| CUU(L) | | 594/1.32 | 601/1.35 | 590/1.32 |
| CUC(L) | | 193/0.43 | 185/0.41 | 189/0.42 |
| CUA(L) | | 374/0.84 | 368/0.82 | 371/0.83 |
| CUG(L) | | 199/0.44 | 203/0.45 | 207/0.46 |
| AUU(I) | | 1065/1.45 | 1057/1.45 | 1056/1.46 |
| AUC(I) | trnI-GAU | 464/0.63 | 447/0.61 | 460/0.63 |
| AUA(I) | | 674/0.92 | 677/0.93 | 659/0.91 |
| AUG(M) | trnM-CAU | 625/1.00 | 613/1.00 | 613/1.00 |
| GUU(V) | | 506/1.39 | 510/1.41 | 506/1.39 |
| GUC(V) | | 179/0.49 | 175/0.48 | 181/0.50 |
| GUA(V) | | 537/1.48 | 534/1.47 | 535/1.46 |
| GUG(V) | | 233/0.64 | 231/0.64 | 239/0.65 |
| UCU(S) | | 540/1.56 | 526/1.56 | 524/1.55 |
| UCC(S) | trnS-GGA | 371/1.07 | 362/1.07 | 370/1.09 |
| UCA(S) | trnS-UGA | 451/1.31 | 432/1.28 | 428/1.26 |
| UCG(S) | | 199/0.57 | 191/0.57 | 196/0.58 |
| CCU(P) | | 417/1.48 | 409/1.46 | 397/1.42 |
| CCC(P) | | 236/0.84 | 234/0.84 | 253/0.90 |
| CCA(P) | trnP-UGG | 331/1.17 | 318/1.14 | 321/1.15 |
| CCG(P) | | 142/0.50 | 156/0.56 | 150/0.54 |
| ACU(T) | | 531/1.55 | 518/1.54 | 517/1.54 |
| ACC(T) | trnT-GGU | 264/0.77 | 264/0.78 | 259/0.77 |
| ACA(T) | trnT-UGU | 417/1.22 | 409/1.21 | 405/1.21 |
| ACG(T) | | 155/0.45 | 158/0.47 | 159/0.47 |
| GCU(A) | | 623/1.77 | 606/1.74 | 601/1.72 |
| GCC(A) | | 223/0.63 | 222/0.64 | 223/0.64 |
| GCA(A) | | 392/1.11 | 376/1.08 | 386/1.11 |
| GCG(A) | | 174/0.49 | 191/0.55 | 186/0.53 |
| UAU(Y) | | 747/1.59 | 749/1.57 | 731/1.57 |
| UAC(Y) | trnY-GUA | 192/0.41 | 205/0.43 | 201/0.43 |
| UAA(*) | | 46/1.59 | 41/1.45 | 43/1.52 |
| UAG(*) | | 25/0.86 | 26/0.92 | 26/0.92 |
| CAU(H) | | 490/1.50 | 475/1.50 | 468/1.49 |
| CAC(H) | trnH-GUG | 163/0.50 | 158/0.50 | 162/0.51 |
| CAA(Q) | trnQ-UUG | 731/1.53 | 718/1.53 | 724/1.52 |
| CAG(Q) | | 225/0.47 | 223/0.47 | 226/0.48 |
| AAU(N) | | 961/1.52 | 940/1.51 | 948/1.51 |
| AAC(N) | | 301/0.48 | 307/0.49 | 305/0.49 |
| AAA(K) | | 994/1.47 | 992/1.46 | 976/1.46 |
| AAG(K) | | 362/0.53 | 366/0.54 | 364/0.54 |
| GAU(D) | | 872/1.59 | 856/1.58 | 859/1.58 |
| GAC(D) | trnD-GUC | 227/0.41 | 226/0.42 | 229/0.42 |
| GAA(E) | trnE-UUC | 958/1.45 | 953/1.46 | 953/1.45 |
| GAG(E) | | 361/0.55 | 351/0.54 | 358/0.55 |
| UGU(C) | trnC-GCA | 221/1.45 | 223/1.46 | 224/1.49 |
| UGC(C) | | 84/0.55 | 82/0.54 | 77/0.51 |
| UGA(*) | | 16/0.55 | 18/0.64 | 16/0.56 |
| UGG(W) | trnW-CCA | 470/1.00 | 465/1.00 | 465/1.00 |
| CGU(R) | trnR-ACG | 352/1.32 | 352/1.33 | 358/1.35 |
| CGC(R) | | 115/0.43 | 112/0.42 | 111/0.42 |
| CGA(R) | | 360/1.35 | 343/1.30 | 352/1.32 |
| CGG(R) | | 125/0.47 | 132/0.50 | 124/0.47 |
| AGU(S) | | 390/1.13 | 386/1.14 | 384/1.13 |
| AGC(S) | trnS-GCU | 127/0.37 | 130/0.38 | 132/0.39 |
| AGA(R) | trnR-UCU | 477/1.79 | 468/1.77 | 472/1.77 |
| AGG(R) | | 174/0.64 | 176/0.67 | 179/0.67 |
| GGU(G) | | 588/1.31 | 584/1.30 | 583/1.30 |
| GGC(G) | trnG-GCC | 168/0.38 | 176/0.39 | 171/0.38 |
| GGA(G) | | 723/1.62 | 712/1.59 | 713/1.59 |
| GGG(G) | | 311/0.69 | 324/0.72 | 323/0.72 |
Fig. 2 Codon content and RSCU values of the 20 amino acids and stop codons in all protein-coding genes of the Stephania tetrandra chloroplast genome. The color of the histogram corresponds to the color of the codons
Fig. 3 Number of RNA editing sites in the cp genomes of Stephania tetrandra, Stephania japonica and Stephania kwangsiensis predicted by the PREP-Cp program with a cutoff value of 0.8
Amino acid conversion frequency in the protein-coding genes of three Stephania species
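The RSCU (relative synonymous codon usage) values in the codon table and Fig. 2 follow the usual definition: the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. The sketch below is illustrative only; the function name `rscu` and its arguments are assumptions, not part of the original analysis. It reproduces the phenylalanine values from the first data column of the table (870 and 576 codons).

```python
from collections import defaultdict


def rscu(counts, codon_to_aa):
    """Return {codon: RSCU} for a dict of codon counts.

    RSCU = observed count / (family total / number of synonymous codons).
    `codon_to_aa` maps every codon in `counts` to its amino acid.
    """
    totals = defaultdict(int)       # total count per amino acid
    family_size = defaultdict(int)  # number of synonymous codons seen
    for codon, n in counts.items():
        aa = codon_to_aa[codon]
        totals[aa] += n
        family_size[aa] += 1

    result = {}
    for codon, n in counts.items():
        aa = codon_to_aa[codon]
        expected = totals[aa] / family_size[aa]
        result[codon] = n / expected if expected else 0.0
    return result


# Phenylalanine counts taken from the first data column of the codon table.
phe_counts = {"UUU": 870, "UUC": 576}
phe_rscu = rscu(phe_counts, {"UUU": "F", "UUC": "F"})
print({codon: round(value, 2) for codon, value in phe_rscu.items()})
```

Rounded to two decimals, the result agrees with the 1.20 and 0.80 listed for UUU and UUC. Values above 1 indicate codons used more often than expected under equal usage.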
| Amino acid conversion | Edited position | Number and percentage | ||
|---|---|---|---|---|
| S-L | Second nucleotide | 39/42.4% | 34/37.8% | 34/36.2% |
| P-L | Second nucleotide | 11/12.0% | 13/14.4% | 14/14.9% |
| H-Y | First nucleotide | 11/12.0% | 10/11.1% | 11/11.7% |
| L-F | First nucleotide | 4/4.3% | 5/5.6% | 3/3.2% |
| S-F | Second nucleotide | 5/5.4% | 5/5.6% | 5/5.3% |
| T-M | Second nucleotide | 4/4.3% | 4/4.4% | 4/4.3% |
| A-V | Second nucleotide | 3/3.3% | 4/4.4% | 5/5.3% |
| P-S | First nucleotide | 2/2.2% | 4/4.4% | 4/4.3% |
| R-W | First nucleotide | 5/5.4% | 4/4.4% | 5/5.3% |
| T-I | Second nucleotide | 4/4.3% | 5/5.6% | 5/5.3% |
| R-C | First nucleotide | 2/2.2% | 0 | 2/2.1% |
| P-F | First and second nucleotide | 2/2.2% | 2/2.2% | 2/2.1% |
Ka/Ks values of the 25 protein-coding genes with RNA editing sites in S. tetrandra voucher ZJ
| Gene | Number of RNA editing sites | Non-synonymous substitutions (Ka) | Synonymous substitutions (Ks) | Ka/Ks |
|---|---|---|---|---|
| | 13 | 0.0035 | 0.0107 | 0.3271 |
| | 9 | 0.0106 | 0.0947 | 0.1119 |
| | 8 | 0.0045 | 0.0613 | 0.2887 |
| | 7 | 0.0177 | 0.0615 | 0.0732 |
| | 7 | 0.0202 | 0.0181 | 0.1869 |
| | 6 | 0.0074 | 0.0927 | 0.0798 |
| | 4 | 0.0373 | 0.0948 | 0.3936 |
| | 4 | 0.0061 | 0.0617 | 0.0989 |
| | 4 | 0.0159 | 0.0444 | 0.3581 |
| | 4 | 0.0054 | 0.0385 | 0.1403 |
| | 3 | 0.0071 | 0.0595 | 0.1193 |
| | 3 | 0.0101 | 0.0734 | 0.1376 |
| | 3 | 0.0239 | 0.1146 | 0.2086 |
| | 2 | 0.0207 | 0.0841 | 0.2461 |
| | 2 | 0.0037 | 0.0824 | 0.0449 |
| | 2 | 0.0022 | 0.0554 | 0.0397 |
| | 2 | 0.0131 | 0.0453 | 0.2892 |
| | 2 | 0.0047 | 0.0295 | 0.1593 |
| | 2 | 0.08 | 0 | 0 |
| | 1 | 0.0036 | 0.0643 | 0.0559 |
| | 1 | 0.2122 | 0.1658 | 1.2799 |
| | 1 | 0.0091 | 0.0108 | 0.8426 |
| | 1 | 0.0116 | 0.0486 | 0.2387 |
| | 1 | 0.0078 | 0.0423 | 0.1844 |
| | 1 | 0 | 0 | 0 |
Fig. 4 Number of different types of SSRs in the cp genomes of Stephania tetrandra, Stephania japonica and Stephania kwangsiensis, with minimum repeat thresholds of 10 for mononucleotide SSRs (Mono), 5 for dinucleotide SSRs (Di), 4 for trinucleotide SSRs (Tri), and 3 each for tetranucleotide (Tetra), pentanucleotide (Penta) and hexanucleotide (Hexa) SSRs
The SSR types of the three Stephania plants
| SSR type | Repeat unit | Amount | | |
|---|---|---|---|---|
| Mono | A/T | 52 | 57 | 43 |
| | C/G | 2 | 2 | 2 |
| Di | AC/GT | 1 | 0 | 1 |
| | AG/CT | 2 | 3 | 3 |
| | AT/AT | 7 | 12 | 14 |
| Tri | AAG/CTT | 1 | 1 | 0 |
| | AAT/ATT | 7 | 3 | 6 |
| | ATC/ATG | 0 | 1 | 0 |
| Tetra | AAAC/GTTT | 0 | 2 | 0 |
| | AAAG/CTTT | 1 | 2 | 2 |
| | AACC/GGTT | 1 | 0 | 1 |
| | AATT/AATT | 1 | 0 | 0 |
| | AAAT/ATTT | 0 | 5 | 3 |
| | AGAT/ATCT | 1 | 1 | 0 |
| | ATCC/ATGG | 1 | 0 | 1 |
| Penta | AACAT/ATGTT | 1 | 0 | 1 |
| | AACCC/GGGTT | 0 | 0 | 1 |
| | AATCT/AGATT | 0 | 0 | 1 |
| | AATAG/ATTCT | 0 | 1 | 0 |
| | AAATC/ATTTG | 0 | 0 | 1 |
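The SSR thresholds given in the Fig. 4 caption (10, 5, 4, 3, 3 and 3 repeats for mono- to hexanucleotide units) can be illustrated with a small regular-expression scan. This is a minimal sketch, not the tool used in the study: the names `MIN_REPEATS`, `is_primitive` and `find_ssrs` and the toy sequence are hypothetical, and the scan is a simplification that may miss an SSR overlapping a longer run of a shorter motif.

```python
import re

# Minimum repeat counts per unit length, as stated in the Fig. 4 caption.
MIN_REPEATS = {1: 10, 2: 5, 3: 4, 4: 3, 5: 3, 6: 3}


def is_primitive(unit):
    """True if `unit` is not itself a repetition of a shorter motif."""
    n = len(unit)
    return not any(
        n % d == 0 and unit == unit[:d] * (n // d) for d in range(1, n)
    )


def find_ssrs(seq):
    """Return sorted (start, unit, repeat_count) tuples for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in MIN_REPEATS.items():
        # A unit of `unit_len` bases followed by at least min_rep - 1 copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            unit = match.group(1)
            # Skip e.g. "AA" x 6, which is really a mononucleotide run.
            if is_primitive(unit):
                hits.append((match.start(), unit, len(match.group(0)) // unit_len))
    return sorted(hits)


# Toy sequence (made up for illustration): A x 12, AT x 6 and AAG x 4.
toy = "GGC" + "A" * 12 + "CGC" + "AT" * 6 + "GCC" + "AAG" * 4 + "T"
print(find_ssrs(toy))
```

Counting the hits by unit length and by motif gives tallies of the kind reported in the SSR table.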
Fig. 5 Number and length of the long repeats identified in the Stephania tetrandra, Stephania japonica and Stephania kwangsiensis chloroplast genomes. Long repeat types include forward (F), palindromic (P), reverse (R) and complement (C). A Hamming distance of 3, a minimum repeat size of 30 and a maximum of 50 repeats were applied in the calculation
Fig. 6 Comparison of junctions between the large single-copy (LSC), small single-copy (SSC) and inverted repeat (IR) regions among 10 chloroplast genomes, including four Stephania plants, one closely related species in the family Menispermaceae, and five species from Berberidaceae, Ranunculaceae and Papaveraceae. The numbers above the gene features indicate the distance between the ends of genes and border sites
Comparison of substitutions and InDels in four Stephania species
| A/G | C/T | A/T | A/C | C/G | G/T | ||||
| Large single copy | 158 | 173 | 13 | 61 | 15 | 35 | 455 | 2.6 | |
| 177 | 207 | 20 | 64 | 19 | 40 | 527 | 2.7 | ||
| 242 | 245 | 28 | 69 | 26 | 60 | 670 | 2.5 | ||
| 228 | 239 | 39 | 87 | 25 | 61 | 679 | 2.2 | ||
| Inverted repeat | 8 | 9 | 0 | 3 | 2 | 1 | 23 | 2.1 | |
| 10 | 7 | 1 | 5 | 1 | 3 | 27 | 1.7 | ||
| 10 | 11 | 1 | 4 | 2 | 5 | 33 | 1.9 | ||
| 9 | 13 | 1 | 5 | 1 | 5 | 34 | 1.8 | ||
| Small single copy | 43 | 38 | 5 | 19 | 4 | 18 | 127 | 2.0 | |
| 64 | 61 | 15 | 21 | 9 | 10 | 180 | 2.2 | ||
| 68 | 92 | 21 | 25 | 11 | 21 | 238 | 2.2 | ||
| 73 | 66 | 13 | 27 | 7 | 24 | 210 | 1.9 | ||
| No. of InDels | InDel average length | No. of InDels | InDel average length | No. of InDels | InDel average length | ||||
| 11 | 88.636 | 5 | 104.400 | 5 | 28.200 | ||||
| 13 | 42.581 | 2 | 87.300 | 5 | 28.200 | ||||
| 19 | 37.421 | 6 | 89.500 | 5 | 28.200 | ||||
| 21 | 34.571 | 4 | 134.250 | 5 | 23.400 | ||||
Fig. 7 Nucleotide diversity (Pi) analysis of the chloroplast genomes of the Stephania plants. The sliding window length was 800 bp and the step size was 200 bp
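The sliding-window scheme in the Fig. 7 caption (800 bp windows, 200 bp steps) can be sketched as follows. This is an illustrative implementation under stated assumptions, not the software used in the study: Pi is taken as the mean number of pairwise differences per site, alignment columns containing a gap or ambiguous base are dropped entirely, and the function names and the toy alignment are hypothetical.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Mean pairwise differences per site over gap-free alignment columns."""
    columns = [col for col in zip(*seqs) if all(base in "ACGT" for base in col)]
    if not columns:
        return 0.0
    pairs = list(combinations(range(len(seqs)), 2))
    diffs = sum(sum(col[i] != col[j] for col in columns) for i, j in pairs)
    return diffs / (len(pairs) * len(columns))


def sliding_pi(seqs, window=800, step=200):
    """Return (window midpoint, Pi) pairs along an alignment of equal-length rows."""
    length = len(seqs[0])
    results = []
    for start in range(0, max(length - window, 0) + 1, step):
        chunk = [s[start:start + window] for s in seqs]
        results.append((start + window // 2, nucleotide_diversity(chunk)))
    return results


# Toy alignment of three sequences, scanned with a tiny window for illustration.
toy_alignment = ["ACGTACGT", "ACGTACGA", "ACGTTCGA"]
toy_profile = sliding_pi(toy_alignment, window=4, step=2)
print(toy_profile)
```

Windows whose Pi stands out from the rest of the profile are the candidates for the mutational hotspots examined in the tables that follow.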
Multiple analysis of the mutational hotspots in four Stephania plants
| Mutational hotspots | Species | Length | Number of SNP sites | Total length of Gaps | Ka/Ks |
|---|---|---|---|---|---|
| | | 543 bp | / | / | 1.0892 |
| | | 597 bp | 62 | 30 | 0.9363 |
| | | 594 bp | 83 | 29 | 0.7585 |
| | | 556 bp | 121 | 31 | 0.8556 |
| | | 1146 bp | / | / | 0.5510 |
| | | 1206 bp | 97 | 26 | 0.5689 |
| | | 1124 bp | 83 | 29 | 0.7639 |
| | | 827 bp | 136 | 83 | 0.7030 |
| | | 404 bp | / | / | 0.7224 |
| | | 475 bp | 65 | 14 | 0.8217 |
| | | 481 bp | 70 | 8 | 1.4076 |
| | | 481 bp | 57 | 9 | 1.1759 |
| | | 662 bp | / | / | 1.1528 |
| | | 687 bp | 77 | 28 | 0.9143 |
| | | 673 bp | 91 | 18 | 1.1237 |
| | | 616 bp | 106 | 32 | 1.5410 |
| | | 531 bp | / | / | 1.3464 |
| | | 492 bp | 97 | 8 | 0.7692 |
| | | 507 bp | 84 | 7 | 1.0783 |
| | | 492 bp | 91 | 21 | 0.9638 |
Gaps consist of insertions and deletions of single bases and fragments
PCR primers designed according to the mutational hotspots within four Stephania species
| Mutational hotspots | PCR primers | Expected length |
|---|---|---|
| | F:CGCCGTAGTAAATAGGAGA | 860 bp |
| | R:TCATCAACCGYGCTAACCT | |
| | F:RATACAATAAGCAAGCTC | 957 bp |
| | R:TCCCRAAACAAGAAAACG | |
| | F:GTGCTCTGACCGATTGAACT | 525 bp |
| | R:GGCAATATGTCTACGCTGGT | |
| | F:TAGGTAGGGATGACAGGA | 918 bp |
| | R:GACCCGAACCATAGAGTA | |
| | F:CRAATCCMAAATTAGACCA | 668 bp |
| | R:GACGCTTAGGAACACCAA | |
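Several of the primers above contain IUPAC ambiguity codes (R = A/G, Y = C/T, M = A/C), so each such primer stands for a small pool of concrete sequences. The sketch below expands a degenerate primer into that pool; the helper name `expand_primer` is an assumption introduced for illustration, while the code table is the standard IUPAC nucleotide set.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "M": "AC", "K": "GT", "S": "CG", "W": "AT",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_primer(primer):
    """List every concrete sequence encoded by a degenerate primer."""
    choices = (IUPAC[base] for base in primer.upper())
    return ["".join(combo) for combo in product(*choices)]


# Reverse primer of the first pair in the table (one Y position).
variants = expand_primer("TCATCAACCGYGCTAACCT")
print(variants)

# Forward primer of the last pair (one R and one M position).
print(len(expand_primer("CRAATCCMAAATTAGACCA")))
```

The pool size is the product of the number of bases allowed at each degenerate position, which is worth keeping in mind when ordering or checking primer specificity.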
Fig. 8 Phylogenetic relationships based on the conserved chloroplast protein-coding genes from four Stephania plants and other representative Ranunculales species, inferred using the maximum likelihood (ML) method. The number on each node represents the bootstrap value from 500 replicates. Papaver orientale and Papaver rhoeas were set as the outgroups