Yuxin Zhou, Jing Nie, Ling Xiao, Zhigang Hu, Bo Wang.
Abstract
Rhubarb is an important ingredient in traditional Chinese medicine, where it is known as Rhei Radix et Rhizoma. However, this common name covers three botanical species with different pharmacological effects. To facilitate genetic identification of these three species, and thus their more precise use in Chinese medicine, we aimed to provide chloroplast sequences that contain species-specific identification sites and are easy to amplify. We therefore sequenced the complete chloroplast genomes of all three species and screened them for sequences that distinguish the species. The three chloroplast genomes ranged in length from 161,053 bp to 161,541 bp and encoded a total of 131 genes, including 31 tRNA, eight rRNA and 92 protein-coding genes. Simple sequence repeat analysis revealed differences among the species, and phylogenetic analyses showed that the chloroplast genome can be used as an ultra-barcode to distinguish the three botanical species of rhubarb. Variation was higher in the non-coding regions than in the protein-coding regions, and higher in the single-copy regions than in the inverted repeats. Based on the highly variable regions, twenty-one specific primer pairs were designed, and eight specific identification sites were experimentally confirmed that can serve as DNA barcodes for identifying the three species. This study provides a molecular basis for precise medicinal plant selection and lays the groundwork for further comparison and correct identification of closely related Rheum species.
Keywords: Rheum; complete chloroplast genome; identification; rhubarb; ultra-barcode
Year: 2018 PMID: 30380708 PMCID: PMC6278470 DOI: 10.3390/molecules23112811
Source DB: PubMed Journal: Molecules ISSN: 1420-3049 Impact factor: 4.411
The basic characteristics of chloroplast genomes of the three Rheum species.
| Characteristic | | | |
|---|---|---|---|
| Location | Qinghai | Sichuan | Gansu |
| Accession number in GenBank | MH572012 | KR816224 | MH572013 |
| Total clean read | 820,613 kb | 644,941 kb | 685,879 kb |
| N50 of contigs (bp) | 86,523 | 86,483 | 86,439 |
| Total chloroplast DNA size (bp) | 161,093 | 161,541 | 161,053 |
| LSC size (bp) | 86,609 | 86,518 | 86,604 |
| IR size (bp) | 30,956 | 30,956 | 30,961 |
| SSC size (bp) | 12,750 | 13,111 | 13,147 |
| Total number of genes | 131 | 130 | 129 |
| Number of different protein-coding genes | 92 | 92 | 92 |
| Number of different tRNA genes | 31 | 30 | 29 |
| Number of different rRNA genes | 8 | 8 | 8 |
| GC content (%) | 37.3 | 37.3 | 37.3 |
| GC content of LSC (%) | 35.3 | 35.4 | 35.4 |
| GC content of IR (%) | 41.1 | 41 | 41.1 |
| GC content of SSC (%) | 32.5 | 32.5 | 32.6 |
LSC: large single-copy region; IR: inverted repeats; SSC: small single-copy region.
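The region sizes in the table can be cross-checked against the reported genome sizes. The following is a minimal sketch, assuming the IR size is reported per copy, so that a quadripartite genome has length LSC + 2 × IR + SSC; the function names are illustrative and the values are copied from the table, keyed by GenBank accession.

```python
# Region sizes (bp) copied from the table above, keyed by GenBank accession:
# (LSC, IR, SSC, reported total chloroplast DNA size).
COLUMNS = {
    "MH572012": (86_609, 30_956, 12_750, 161_093),
    "KR816224": (86_518, 30_956, 13_111, 161_541),
    "MH572013": (86_604, 30_961, 13_147, 161_053),
}


def quadripartite_total(lsc: int, ir: int, ssc: int) -> int:
    """Expected genome size, ASSUMING `ir` is the length of one IR copy."""
    return lsc + 2 * ir + ssc


def size_gap(lsc: int, ir: int, ssc: int, reported: int) -> int:
    """Difference between the summed regions and the reported total (bp)."""
    return quadripartite_total(lsc, ir, ssc) - reported


if __name__ == "__main__":
    for accession, (lsc, ir, ssc, reported) in COLUMNS.items():
        gap = size_gap(lsc, ir, ssc, reported)
        print(f"{accession}: regions sum to {reported + gap} bp, "
              f"reported {reported} bp, gap {gap} bp")
```

With the values as printed in the table, only the KR816224 column sums exactly to its reported total; the other two columns differ by 178 bp and 620 bp under this per-copy assumption.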
Genes found in the chloroplast genomes of the three Rheum species, with copy numbers and intron content indicated.
| Group of Genes | Name of Gene |
|---|---|
| Transfer RNAs (31) | |
| Photosystem I (5) | |
| Assembly/stability of photosystem I (2) | |
| Photosystem II (15) | |
| Maturase (1) | |
| Ribosomal protein (25) | |
| Cytochrome b6/f complex (6) | |
| ATP synthase (6) | |
| RNA polymerase (4) | |
| NADH dehydrogenase (13) | |
| Rubisco large subunit (1) | |
| Acetyl-CoA carboxylase (1) | |
| Envelope membrane protein (1) | |
| ATP-dependent protease subunit (1) | |
| Translation initiation factor (1) | |
| Conserved reading frames (ycfs) (8) | |
| Ribosomal RNAs (8) | |
| c-type cytochrome biogenesis (1) | |
* contains one intron; ** contains two introns. Numbers in parentheses after the name of each gene group give the number of genes in that group. trnI-GAT * is present in R. officinale.
Figure 1. Gene map of the Rheum chloroplast genome. The genes lying inside and outside the outer circle are transcribed in a clockwise and counterclockwise direction, respectively (as indicated by arrows). Colors denote the genes belonging to different functional groups. The hatch marks on the inner circle indicate the extent of the inverted repeats (IRa and IRb) that separate the small single-copy (SSC) region from the large single-copy (LSC) region. The dark gray and light gray shading within the inner circle corresponds to the percentage of G + C and A + T content, respectively.
Figure 2. Comparison of SSR types and quantities in the three studied Rheum species. (a) Number of SSRs of each type; (b) Number of SSRs in each of the four regions for the three species; (c) Percentage of SSRs in each of the four regions; (d) Frequency of SSRs by length. SSR: simple sequence repeats; LSC: large single-copy region; SSC: small single-copy region; IRa and IRb: inverted repeats.
R. officinale chloroplast genome SSR distribution.
| SSR no. | SSR Type | SSR | Size (bp) | Start | End | Location |
|---|---|---|---|---|---|---|
| 2 | p1 | (A)10 | 10 | 1883 | 1892 | CNS |
| 3 | p1 | (T)11 | 11 | 2053 | 2063 | CNS |
| 4 | p4 | (TGAT)3 | 12 | 2688 | 2699 | |
| 5 | p1 | (T)12 | 12 | 3040 | 3051 | |
| 6 | p1 | (T)11 | 11 | 3505 | 3515 | |
| 8 | p1 | (A)12 | 12 | 4763 | 4774 | CNS |
| 10 | p1 | (T)10 | 10 | 5538 | 5547 | CNS |
| 16 | p1 | (A)12 | 12 | 8117 | 8128 | CNS |
| 23 | p4 | (GTCT)3 | 12 | 12207 | 12218 | |
| 32 | p1 | (T)11 | 11 | 19306 | 19316 | |
| 35 | p2 | (AT)5 | 10 | 20683 | 20692 | |
| 49 | p1 | (T)10 | 10 | 33669 | 33678 | CNS |
| 50 | p1 | (A)12 | 12 | 34263 | 34274 | CNS |
| 57 | p1 | (A)10 | 10 | 39162 | 39171 | CNS |
| 62 | p1 | (T)10 | 10 | 45369 | 45378 | CNS |
| 63 | p3 | (AAT)4 | 12 | 46315 | 46326 | CNS |
| 65 | p4 | (TTGG)3 | 12 | 46894 | 46905 | CNS |
| 87 | p4 | (TATT)3 | 12 | 61268 | 61279 | CNS |
| 90 | p2 | (TA)5 | 10 | 63994 | 64003 | |
| 97 | p1 | (A)10 | 10 | 67687 | 67696 | |
| 116 | p1 | (T)15 | 15 | 82003 | 82017 | CNS |
| 120 | p1 | (T)12 | 12 | 86283 | 86294 | |
| 125 | p1 | (A)10 | 10 | 89370 | 89379 | |
| 129 | p3 | (CTT)4 | 12 | 92498 | 92509 | |
| 151 | p1 | (A)16 | 16 | 114057 | 114072 | |
| 158 | p1 | (A)10 | 10 | 117661 | 117670 | |
| 165 | p1 | (T)10 | 10 | 120389 | 120398 | CNS |
| 171 | p4 | (AATA)3 | 12 | 122749 | 122760 | |
| 174 | p2 | (AT)5 | 10 | 124108 | 124117 | CNS |
| 175 | p1 | (A)10 | 10 | 125400 | 125409 | |
| 183 | p1 | (T)16 | 16 | 133545 | 133560 | |
| 205 | p3 | (AAG)4 | 12 | 155108 | 155119 | |
| 209 | p1 | (T)10 | 10 | 158238 | 158247 | |
SSR: simple sequence repeats; CNS: conserved non-coding sequences.
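A table of this shape can be produced by scanning a sequence for perfect tandem repeats. The sketch below is not the authors' pipeline: it uses a plain regular-expression scan, and the minimum repeat counts (10, 5, 4 and 3 for mono- to tetranucleotide motifs) are an assumption inferred from the smallest entries in the table. Coordinates are 1-based and inclusive, matching the Start and End columns.

```python
import re

# Minimum repeat counts per motif length (p1..p4). ASSUMED from the smallest
# entries in the table above, not taken from the authors' settings.
MIN_REPEATS = {1: 10, 2: 5, 3: 4, 4: 3}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return (type, SSR, size, start, end) tuples for perfect tandem repeats."""
    seq = seq.upper()
    hits = []
    for k, n in sorted(min_repeats.items()):
        # A k-base motif followed by at least n-1 further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit ("AA", "ATAT").
            if any(motif == motif[:d] * (k // d)
                   for d in range(1, k) if k % d == 0):
                continue
            size = len(m.group(0))
            hits.append((f"p{k}", f"({motif}){size // k}", size,
                         m.start() + 1, m.end()))
    return sorted(hits, key=lambda h: h[3])


if __name__ == "__main__":
    # Hypothetical toy sequence, not taken from any of the three genomes.
    toy = "GC" + "A" * 10 + "CGG" + "AT" * 5 + "CC" + "TGAT" * 3 + "G"
    for hit in find_ssrs(toy):
        print(hit)
```

Because the scan is non-overlapping within each motif length, an SSR that starts inside a skipped longer-unit match could be missed; dedicated SSR tools handle such edge cases more carefully.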
Figure 3. Phylogenetic tree constructed using the neighbor-joining (NJ) method, based on whole chloroplast genomes from different species. Amborella trichopoda was set as the outgroup.
Figure 4. Comparison of three chloroplast genomes using R. palmatum as the reference. The vertical scale indicates the percentage of identity, ranging from 50% to 100%; the horizontal axis indicates the coordinates within the chloroplast genome. Annotated genes are displayed along the top. Genome regions are color-coded as either protein-coding exons, rRNA, tRNA, or conserved non-coding sequences (CNS). UTR: untranslated region.
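The percentage-of-identity track in a comparison like this can be approximated by windowed identity against the reference. The following is a minimal sketch, assuming the two sequences are already aligned to equal length with `-` for gaps; the window and step sizes are hypothetical defaults, not values from the study.

```python
def window_identity(ref, query, window=100, step=100):
    """Percent identity of `query` to `ref` in windows along an alignment.

    Returns (1-based window start, percent identity) pairs. Gap columns
    count as mismatches.
    """
    if len(ref) != len(query):
        raise ValueError("sequences must be aligned to the same length")
    results = []
    for start in range(0, len(ref) - window + 1, step):
        r = ref[start:start + window].upper()
        q = query[start:start + window].upper()
        matches = sum(a == b and a != "-" for a, b in zip(r, q))
        results.append((start + 1, 100.0 * matches / window))
    return results


if __name__ == "__main__":
    # Hypothetical toy alignment with one substitution in the second window.
    print(window_identity("ACGTACGTAC", "ACGTACGAAC", window=5, step=5))
```

Plotting the second element of each pair against the first gives a curve comparable in spirit to the identity track described in the caption.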
Figure 5. Base information at the identification sites in the sequences amplified with the selected primer pairs for the three study species. (a) Primer pair 1; (b) Primer pair 7; (c) Primer pair 9; (d) Primer pair 10; (e) Primer pair 15; (f) Primer pair 17; (g) Primer pair 21; (h) Primer pair 6. For more detailed information on the primer pairs, see Table S3.
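Identification sites of this kind are alignment columns where one species carries a base that none of the others share. The sketch below illustrates the idea under stated assumptions: the amplified sequences are already aligned to equal length, and the species labels and toy sequences are hypothetical placeholders rather than data from the study.

```python
def diagnostic_sites(aligned):
    """Find columns where at least one sequence has a base unique to it.

    `aligned` maps a label to an aligned sequence (all of equal length).
    Returns (1-based position, {label: base}, [labels with a unique base]).
    """
    names = list(aligned)
    length = len(aligned[names[0]])
    if any(len(aligned[n]) != length for n in names):
        raise ValueError("sequences must be aligned to the same length")
    sites = []
    for i in range(length):
        column = {n: aligned[n][i].upper() for n in names}
        bases = list(column.values())
        unique = [n for n in names if bases.count(column[n]) == 1]
        if unique:
            sites.append((i + 1, column, unique))
    return sites


if __name__ == "__main__":
    # Hypothetical toy alignment for three placeholder species.
    toy = {
        "species_1": "ACGTTA",
        "species_2": "ACGTCA",
        "species_3": "ACATTA",
    }
    for position, column, unique in diagnostic_sites(toy):
        print(position, column, unique)
```

A column where all three sequences differ would list all three labels, which is the most useful case for a barcode since a single site then separates every species.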