Zhonglian Zhang, Yue Zhang, Meifang Song, Yanhong Guan, Xiaojun Ma.
Abstract
The taxonomy and nomenclature of Dracaena plants are much disputed, particularly for several Dracaena species in Asia. However, neither morphological features nor common DNA regions are ideal for identification of Dracaena spp. Meanwhile, although multiple Dracaena spp. are sources of the rare traditional medicine dragon's blood, the Pharmacopoeia of the People's Republic of China has defined Dracaena cochinchinensis as the only source plant. The inaccurate identification of Dracaena spp. will inevitably affect the clinical efficacy of dragon's blood. It is therefore important to find a better method to distinguish these species. Here, we report the complete chloroplast (CP) genomes of six Dracaena spp., D. cochinchinensis, D. cambodiana, D. angustifolia, D. terniflora, D. hokouensis, and D. elliptica, obtained through high-throughput Illumina sequencing. These CP genomes exhibited typical circular tetramerous structure, and their sizes ranged from 155,055 (D. elliptica) to 155,449 bp (D. cochinchinensis). The GC content of each CP genome was 37.5%. Furthermore, each CP genome contained 130 genes, including 84 protein-coding genes, 38 tRNA genes, and 8 rRNA genes. There were no potential coding or non-coding regions to distinguish these six species, but the maximum likelihood tree of the six Dracaena spp. and other related species revealed that the whole CP genome can be used as a super-barcode to identify these Dracaena spp. This study provides not only invaluable data for species identification and safe medical application of Dracaena but also an important reference and foundation for species identification and phylogeny of Liliaceae plants.
Keywords: Dracaena Vand. ex L.; Liliaceae; chloroplast genome; identification; super-barcode
Year: 2019 PMID: 31849682 PMCID: PMC6901964 DOI: 10.3389/fphar.2019.01441
Source DB: PubMed Journal: Front Pharmacol ISSN: 1663-9812 Impact factor: 5.810
Summary statistics for assembly of the six complete chloroplast (CP) genomes of Dracaena species.
| Species names | D. cochinchinensis | | | | | D. elliptica |
|---|---|---|---|---|---|---|
| Raw reads | 50,369,222 | 50,223,744 | 47,183,736 | 44,650,560 | 55,383,360 | 49,214,248 |
| Raw data (bp) | 7,555,383,300 | 7,533,561,600 | 7,077,560,400 | 6,697,584,000 | 8,307,504,000 | 7,382,137,200 |
| Mapped CP reads | 2,818,312 | 1,539,342 | 1,561,310 | 3,249,348 | 2,238,882 | 3,657,230 |
| Size (bp) | 155,449 | 155,291 | 155,332 | 155,347 | 155,340 | 155,055 |
| LSC length (bp) | 83,907 | 83,752 | 83,807 | 83,794 | 83,796 | 83,621 |
| SSC length (bp) | 18,492 | 18,489 | 18,465 | 18,493 | 18,494 | 18,456 |
| IR length (bp) | 53,050 | 53,050 | 53,060 | 53,060 | 53,050 | 52,978 |
| Coding (bp) | 77,187 | 77,202 | 78,732 | 78,744 | 78,744 | 77,130 |
| Non-coding (bp) | 78,262 | 78,089 | 76,600 | 76,603 | 76,596 | 77,925 |
CP, complete chloroplast; IR, inverted repeat; LSC, large single-copy; SSC, small single-copy.
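The figures in this table can be cross-checked with simple arithmetic. The sketch below transcribes the rows as plain lists, one entry per species column in table order (the variable names and the helper `check_partition` are illustrative, not taken from the study). It tests whether LSC + SSC + IR and coding + non-coding each add up to the reported genome size, and how many bases per read the raw data row implies. If the first check passes, the IR row must be the combined length of both inverted-repeat copies.

```python
# Values transcribed from the assembly summary table, one entry per species column.
size      = [155449, 155291, 155332, 155347, 155340, 155055]
lsc       = [83907, 83752, 83807, 83794, 83796, 83621]
ssc       = [18492, 18489, 18465, 18493, 18494, 18456]
ir        = [53050, 53050, 53060, 53060, 53050, 52978]
coding    = [77187, 77202, 78732, 78744, 78744, 77130]
noncoding = [78262, 78089, 76600, 76603, 76596, 77925]
raw_reads = [50369222, 50223744, 47183736, 44650560, 55383360, 49214248]
raw_bp    = [7555383300, 7533561600, 7077560400, 6697584000, 8307504000, 7382137200]
mapped    = [2818312, 1539342, 1561310, 3249348, 2238882, 3657230]


def check_partition(total, *parts):
    """Return, per column, whether the parts sum exactly to the total."""
    return [sum(p) == t for t, *p in zip(total, *parts)]


# Do the three structural regions account for the whole genome?
regions_ok = check_partition(size, lsc, ssc, ir)

# Do coding and non-coding lengths account for the whole genome?
coding_ok = check_partition(size, coding, noncoding)

# Bases per read implied by the raw data row (exact only if the remainder is 0).
bases_per_read = [b // r for b, r in zip(raw_bp, raw_reads)]
remainders = [b % r for b, r in zip(raw_bp, raw_reads)]

# Share of raw reads that mapped to the chloroplast genome, per column.
mapped_fraction = [m / r for m, r in zip(mapped, raw_reads)]

print(regions_ok, coding_ok, bases_per_read, remainders)
print([f"{f:.1%}" for f in mapped_fraction])
```

Both partition checks pass for every column, and the raw data row is exactly 150 times the raw read count in each case, which is consistent with 150 bases per read.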
Figure 1. Gene map of the complete chloroplast genomes of Dracaena species. Genes on the inside of the circle are transcribed clockwise, and those on the outside are transcribed counter-clockwise. The darker gray area in the inner circle corresponds to GC content, whereas the lighter gray corresponds to AT content.
List of genes found in the six CP genomes of Dracaena species.
| No. | Group of genes | Gene names | Number of genes |
|---|---|---|---|
| 1 | Photosystem I | psaA, | 5 |
| 2 | Photosystem II | psbA, | 15 |
| 3 | Cytochrome b/f complex | petA, | 6 |
| 4 | ATP synthase | atpA, | 6 |
| 5 | NADH dehydrogenase | ndhA*, | 11 |
| 6 | RubisCO large subunit | rbcL | 1 |
| 7 | RNA polymerase | rpoA, | 4 |
| 8 | Ribosomal proteins (SSU) | rps2, | 12 |
| 9 | Ribosomal proteins (LSU) | rpl2*(×2), | 9 |
| 10 | Other genes | accD, | 5 |
| 11 | Proteins of unknown function | ycf1, | 4 |
| 12 | Transfer RNAs | | 38 |
| 13 | Ribosomal RNAs | rrn4.5(×2), | 8 |
*Gene contains one intron. **Gene contains two introns. (×2) indicates the number of the repeat unit is 2.
Figure 2. Repeat analysis in six Dracaena complete chloroplast (CP) genomes. REPuter was used to identify repeat sequences with length ≥30 bp and sequence identity ≥90% in the CP genomes. F, P, R, and C indicate the repeat types forward, palindrome, reverse, and complement, respectively. Repeats with different lengths are indicated in different colors.
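To make the four repeat types concrete, here is a minimal sketch that classifies a pair of equal-length DNA strings as forward, reverse, complement, or palindromic (reverse-complement). It is an illustration only: the function name `classify_repeat` is hypothetical, it requires exact matches, and it does not model REPuter's search, the ≥30 bp length filter, or the ≥90% identity threshold.

```python
# Translation table mapping each base to its Watson-Crick complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def classify_repeat(first, second):
    """Classify two equal-length DNA strings by repeat type.

    Returns 'F' (forward), 'R' (reverse), 'C' (complement),
    'P' (palindromic, i.e. reverse complement), or None if the
    second string is not an exact repeat of the first.
    """
    first, second = first.upper(), second.upper()
    if len(first) != len(second):
        return None
    if second == first:
        return "F"
    if second == first[::-1]:
        return "R"
    complement = first.translate(COMPLEMENT)
    if second == complement:
        return "C"
    if second == complement[::-1]:
        return "P"
    return None


for partner in ("AAGC", "CGAA", "TTCG", "GCTT", "ACGT"):
    print(partner, classify_repeat("AAGC", partner))
```

The checks run in a fixed order, so a sequence that is, for example, its own reverse is reported as forward.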
The simple sequence repeat (SSR) types of the six CP genomes of Dracaena species.
| SSR type | Repeat unit | Amount | | | | | |
|---|---|---|---|---|---|---|---|
| Mono | A/T | 36 | 37 | 37 | 34 | 41 | 44 |
| Mono | C/G | 2 | 2 | 1 | 1 | 1 | 0 |
| Di | AG/CT | 3 | 3 | 3 | 3 | 3 | 3 |
| Di | AT/AT | 11 | 11 | 12 | 12 | 11 | 11 |
| Tri | AAG/CTT | 1 | 1 | 1 | 1 | 1 | 1 |
| Tri | AAT/ATT | 2 | 2 | 2 | 2 | 2 | 2 |
| Tetra | AAAG/CTTT | 2 | 2 | 2 | 2 | 2 | 2 |
| Tetra | AAAT/ATTT | 4 | 4 | 4 | 4 | 4 | 4 |
| Tetra | AATC/ATTG | 2 | 2 | 2 | 2 | 2 | 2 |
| Tetra | AATG/ATTC | 1 | 1 | 1 | 1 | 1 | 1 |
| Penta | AAACG/CGTTT | 2 | 2 | 0 | 0 | 0 | 0 |
| Penta | ACTAT/AGTAT | 1 | 0 | 0 | 0 | 0 | 0 |
| Hexa | AATTAT/AATTAT | 2 | 2 | 1 | 0 | 1 | 1 |
| Hexa | AAGATT/AATCTT | 0 | 0 | 1 | 1 | 1 | 0 |
| Hexa | AAAAGT/ACTTTT | 0 | 0 | 0 | 1 | 0 | 0 |
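The per-genome SSR totals and the totals per SSR type are not stated in the table but follow directly from it. The sketch below transcribes the counts (columns in table order; the dictionary layout and variable names are illustrative) and sums them both ways.

```python
# SSR counts transcribed from the table: (type, repeat unit) -> one count per genome column.
ssr_counts = {
    ("Mono", "A/T"): [36, 37, 37, 34, 41, 44],
    ("Mono", "C/G"): [2, 2, 1, 1, 1, 0],
    ("Di", "AG/CT"): [3, 3, 3, 3, 3, 3],
    ("Di", "AT/AT"): [11, 11, 12, 12, 11, 11],
    ("Tri", "AAG/CTT"): [1, 1, 1, 1, 1, 1],
    ("Tri", "AAT/ATT"): [2, 2, 2, 2, 2, 2],
    ("Tetra", "AAAG/CTTT"): [2, 2, 2, 2, 2, 2],
    ("Tetra", "AAAT/ATTT"): [4, 4, 4, 4, 4, 4],
    ("Tetra", "AATC/ATTG"): [2, 2, 2, 2, 2, 2],
    ("Tetra", "AATG/ATTC"): [1, 1, 1, 1, 1, 1],
    ("Penta", "AAACG/CGTTT"): [2, 2, 0, 0, 0, 0],
    ("Penta", "ACTAT/AGTAT"): [1, 0, 0, 0, 0, 0],
    ("Hexa", "AATTAT/AATTAT"): [2, 2, 1, 0, 1, 1],
    ("Hexa", "AAGATT/AATCTT"): [0, 0, 1, 1, 1, 0],
    ("Hexa", "AAAAGT/ACTTTT"): [0, 0, 0, 1, 0, 0],
}

# Total number of SSRs in each genome column.
totals = [sum(column) for column in zip(*ssr_counts.values())]

# Totals per SSR type (Mono, Di, ...) in each genome column.
by_type = {}
for (ssr_type, _unit), counts in ssr_counts.items():
    running = by_type.setdefault(ssr_type, [0] * len(counts))
    for i, count in enumerate(counts):
        running[i] += count

print("total SSRs per genome:", totals)
for ssr_type, counts in by_type.items():
    print(ssr_type, counts)
```

Mononucleotide A/T repeats account for more than half of the SSRs in every column, and pentanucleotide repeats appear only in the first two columns.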
Figure 3. Structure comparison of the six Dracaena CP genomes by using the mVISTA program. Gray arrows and thick black lines above the alignment indicate genes with their orientation and the position of the IRs, respectively. A cut-off value of 70% identity was used for the plots, and the Y-scale represents the percent identity between 50% and 100%.
The 10 most-divergent coding regions and intergenic regions in the six Dracaena species.
| Regions | Genes | Length | Variable sites | Indels | Percentage of identical sites (%) |
|---|---|---|---|---|---|
| Coding regions | ycf1 | 5466 | 73 | 35 | 99.15 |
| | clpP | 2052 | 48 | 90 | 97.58 |
| | ccsA | 972 | 45 | 50 | 96.05 |
| | ndhD | 1521 | 25 | 8 | 99.1 |
| | rbcL | 1440 | 21 | 14 | 98.89 |
| | psbB | 1527 | 15 | 26 | 99 |
| | psbF | 120 | 12 | 0 | 94.88 |
| | rps18 | 306 | 8 | 7 | 97.62 |
| | rps16 | 210 | 5 | 0 | 98.98 |
| | psbK | 186 | 4 | 11 | 96.86 |
| Intergenic regions | rps7 → trnV-GAC | 2716 | 59 | 16 | 98.54 |
| | rps12 → trnV-GAC | 1859 | 57 | 16 | 97.9 |
| | trnS-GCU → trnG-UCC | 1026 | 24 | 90 | 95.1 |
| | rpl32 → trnL-UAG | 873 | 21 | 56 | 97.03 |
| | trnT-UGU → trnL-UAA | 932 | 17 | 153 | 95 |
| | trnC-GCA → petN | 923 | 17 | 30 | 97.66 |
| | trnT-GGU → psbD | 743 | 17 | 15 | 98.24 |
| | psbE → petL | 1315 | 15 | 36 | 98.4 |
| | trnF-GAA → ndhJ | 723 | 14 | 12 | 98.54 |
| | trnP-UGG → psaJ | 362 | 13 | 6 | 97.41 |
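Statistics of this kind are usually derived column by column from a multiple sequence alignment. The sketch below shows one plausible way to count variable sites, gap-containing columns, and the percentage of identical columns. The definitions used here are assumptions and may differ from those behind the table (for instance, the table's "Indels" may count gapped bases rather than columns); the function name `column_stats` and the toy alignment are hypothetical.

```python
def column_stats(aligned):
    """Summarise an alignment given as equal-length strings with '-' for gaps.

    Assumed definitions (not confirmed by the source):
      variable site  = column with more than one distinct non-gap base
      gap column     = column containing at least one '-'
      identical site = gap-free column in which all bases agree
    """
    if len({len(seq) for seq in aligned}) != 1:
        raise ValueError("need at least one sequence, all of equal length")

    variable = gapped = identical = 0
    for column in zip(*aligned):
        bases = {base.upper() for base in column if base != "-"}
        has_gap = "-" in column
        if has_gap:
            gapped += 1
        if len(bases) > 1:
            variable += 1
        if not has_gap and len(bases) == 1:
            identical += 1

    length = len(aligned[0])
    return {
        "length": length,
        "variable_sites": variable,
        "gap_columns": gapped,
        "identical_pct": round(100 * identical / length, 2),
    }


# Toy alignment of three sequences: one substitution column and one gap column.
toy = [
    "ATGCA-TTGA",
    "ATGCAATTGA",
    "ATGTA-TTGA",
]
stats = column_stats(toy)
print(stats)
```

Ranking regions by such counts, normalised by alignment length, is one way to shortlist candidate barcode regions before inspecting them by eye.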
Figure 4. Phylogenetic tree constructed using maximum parsimony (MP) based on the complete CP genomes of the six Dracaena species and 31 other species. Numbers above the branches are the bootstrap support values.