Waqar Ahmad1,2, Sajjad Asaf1, Arif Khan3, Ahmed Al-Harrasi4, Abdulraqeb Al-Okaishi5, Abdul Latif Khan6.
Abstract
Dracaena (family Asparagaceae) trees are famous for producing "dragon blood", a bioactive red resin. Despite its long history of use in traditional medicine, little is known about the genomic architecture, phylogenetic position, or evolution of the genus. In this study, we therefore sequenced the complete chloroplast (cp) genomes of D. serrulata and D. cinnabari and performed comparative genomics of nine genomes of the genus Dracaena. The genome sizes range from 155,055 bp (D. elliptica) to 155,449 bp (D. cochinchinensis). The cp genomes of D. serrulata and D. cinnabari each encode 131 genes, including 85 and 84 protein-coding genes, respectively. D. hokouensis had the highest number of genes (133), with 85 protein-coding genes. About 80 and 82 repeats were identified in the cp genomes of D. serrulata and D. cinnabari, respectively, while the highest number of repeats (103) was detected in the cp genome of D. terniflora. The number of simple sequence repeats (SSRs) was 176 in the D. serrulata cp genome and 159 in the D. cinnabari cp genome. The comparative analysis of complete cp genomes revealed high sequence similarity, although some sequence divergence was observed in the accD, matK, rpl16, rpoC2, and ycf1 genes and in some intergenic spacers. The phylogenomic analysis revealed that D. serrulata and D. cinnabari form a monophyletic clade, sister to the remaining Dracaena species sampled in this study, with high bootstrap values. In conclusion, this study provides valuable genetic information for studying the evolutionary relationships and population genetics of Dracaena, a genus of threatened conservation status.
Year: 2022 PMID: 36202844 PMCID: PMC9537188 DOI: 10.1038/s41598-022-20304-6
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.996
Figure 1. The dragon tree plants and their habitat. D. serrulata (A) and D. cinnabari (B).
Figure 2. Genome map of the D. serrulata and D. cinnabari cp genomes. Thick lines represent the inverted repeat regions (IRs), which split the cp genome into large single-copy (LSC) and small single-copy (SSC) regions. Genes transcribed counter-clockwise are drawn outside the circle, and genes transcribed clockwise are drawn inside. Genes belonging to different functional groups are color-coded. The dark and light green shading in the circle represents GC and AT content, respectively.
Summary of chloroplast genome features of D. serrulata, D. cinnabari, and related species of the genus Dracaena.
| Size (bp) | 155,398 | 155,351 | 155,332 | 155,291 | 155,449 | 155,182 | 155,422 | 155,055 | 155,340 | 155,183 | 155,347 |
| Overall GC content (%) | 37.6 | 37.5 | 37.5 | 37.6 | 37.6 | 37.5 | 37.6 | 37.5 | 37.5 | 37.5 | 37.5 |
| LSC size in bp | 83,871 | 83,818 | 83,803 | 83,752 | 83,907 | 83,702 | 83,942 | 83,621 | 83,976 | 83,703 | 83,794 |
| SSC size in bp | 19,247 | 18,579 | 18,465 | 18,489 | 18,492 | 18,466 | 18,472 | 18,456 | 18,494 | 18,466 | 18,493 |
| IR size in bp | 26,140 | 26,477 | 26,530 | 26,525 | 26,525 | 26,507 | 26,504 | 26,489 | 26,525 | 26,507 | 26,530 |
| Protein coding regions size in bp | 78,777 | 77,658 | 78,732 | 77,202 | 77,187 | 78,708 | 78,537 | 77,130 | 78,744 | 78,297 | 78,744 |
| tRNA size in bp | 3061 | 2936 | 2873 | 2874 | 2874 | 2866 | 2867 | 2874 | 2874 | 2867 | 2873 |
| rRNA size in bp | 9050 | 9040 | 9050 | 9050 | 9050 | 9050 | 9050 | 9050 | 9050 | 9050 | 9050 |
| Number of genes | 131 | 131 | 131 | 130 | 130 | 131 | 131 | 130 | 131 | 133 | 131 |
| Number of protein coding genes | 85 | 84 | 85 | 84 | 84 | 85 | 85 | 84 | 85 | 85 | 85 |
| Number of rRNA | 8 | 8 | 8 | 8 | 8 | 8 | 8 | 8 | 8 | 8 | 8 |
| Number of tRNA | 38 | 38 | 38 | 38 | 38 | 38 | 38 | 38 | 38 | 38 | 38 |
| Genes with introns | 22 | 22 | 23 | 23 | 23 | 23 | 23 | 23 | 23 | 23 | 23 |
| GenBank accession number | MT408026 | OK235335 | MN200193 | MN200194 | MF943127 | MN200195 | MN990038 | MN200196 | MW123093 | MN200197 | MN200198 |
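The size rows of the table above can be cross-checked against the quadripartite structure described in the Figure 2 caption: a cp genome consists of one LSC, one SSC, and two IR copies, so the total length should equal LSC + SSC + 2 × IR. The following is a minimal Python sketch of that check, using the values from the first two data columns of the table. The function name and the column labels are illustrative and do not come from the study.

```python
def quadripartite_length(lsc: int, ssc: int, ir: int) -> int:
    """Total plastome length implied by one LSC, one SSC, and two IR copies."""
    return lsc + ssc + 2 * ir


# (LSC, SSC, IR, reported total size), all in bp, copied from the first
# two data columns of the table; the labels are placeholders.
columns = {
    "column 1": (83_871, 19_247, 26_140, 155_398),
    "column 2": (83_818, 18_579, 26_477, 155_351),
}

for label, (lsc, ssc, ir, reported) in columns.items():
    implied = quadripartite_length(lsc, ssc, ir)
    # Print the implied total and whether it matches the reported size.
    print(label, implied, implied == reported)
```

Running the same check on the remaining columns is a quick way to spot transcription errors in the size rows.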
Gene composition in Dracaena species cp genomes.
| Category of genes | Group of genes | Name of genes |
|---|---|---|
| Genes for photosynthesis | Subunits of ATP synthase | atpA, atpB, atpE, atpF, atpH, atpI |
| Genes for photosynthesis | Subunits of photosystem II | psbA, psbB, psbC, psbD, psbE, psbF, psbI, psbJ, psbK, psbL, psbM, psbN, psbT, psbZ |
| Genes for photosynthesis | Subunits of NADH-dehydrogenase | ndhA, ndhB, ndhB, ndhC, ndhD, ndhE, ndhF, ndhG, ndhH, ndhI, ndhJ, ndhK |
| Genes for photosynthesis | Subunits of cytochrome b/f complex | petA, petB, petD, petG, petL, petN |
| Genes for photosynthesis | Subunits of photosystem I | psaA, psaB, psaC, psaI, psaJ |
| Genes for photosynthesis | Subunit of rubisco | rbcL |
| Self-replication | Large subunit of ribosome | rpl14, rpl16, rpl2, rpl2, rpl20, rpl22, rpl23, rpl23, rpl32, rpl33, rpl36 |
| Self-replication | DNA dependent RNA polymerase | rpoA, rpoB, rpoB, rpoB, rpoC1, rpoC2 |
| Self-replication | Small subunit of ribosome | rps11, rps12, rps12, rps14, rps15, rps16, rps18, rps2, rps3, rps4, rps7, rps7, rps8 |
| Other genes | Subunit of Acetyl-CoA-carboxylase | accD |
| Other genes | c-type cytochrome synthesis gene | ccsA |
| Other genes | Envelope membrane protein | cemA |
| Other genes | Maturase | matK |
| Unknown | Conserved open reading frames | ycf1, ycf2, ycf3, ycf4 |
Intron and exon lengths of the split genes in the cp genomes of D. serrulata (DS) and D. cinnabari (DC).
| Gene | Start (DS) | Start (DC) | End (DS) | End (DC) | Exon I, DS (bp) | Exon I, DC (bp) | Intron I, DS (bp) | Intron I, DC (bp) | Exon II, DS (bp) | Exon II, DC (bp) | Intron II, DS (bp) | Intron II, DC (bp) | Exon III, DS (bp) | Exon III, DC (bp) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| trnK-UUU | 1513 | 1513 | 4157 | 4157 | 37 | 37 | 2568 | 2568 | 40 | 40 | ||||
| rps16 | 4789 | 4789 | 5910 | 5910 | 46 | 46 | 867 | 867 | 209 | 209 | ||||
| trnG-GCC | 9131 | 9131 | 9906 | 9906 | 23 | 23 | 716 | 716 | 37 | 37 | ||||
| atpF | 11,854 | 11,854 | 13,230 | 13,230 | 145 | 145 | 828 | 828 | 404 | 404 | ||||
| rpoC1 | 20,640 | 20,640 | 23,415 | 23,415 | 432 | 432 | 718 | 718 | 1626 | 1626 | ||||
| ycf3 | 42,150 | 42,150 | 44,126 | 44,126 | 126 | 126 | 731 | 731 | 220 | 220 | 739 | 739 | 161 | 161 |
| trnL-UAA | 46,962 | 46,962 | 47,593 | 47,593 | 35 | 35 | 547 | 547 | 50 | 50 | ||||
| trnV-UAC | 52,093 | 52,093 | 52,754 | 52,754 | 39 | 39 | 586 | 586 | 37 | 37 | ||||
| clpP | 70,044 | 70,497 | 72,097 | 72,016 | 69 | 69 | 825 | 819 | 291 | 291 | 644 | 621 | 225 | 225 |
| petB | 74,979 | 74,979 | 76,381 | 76,381 | 7 | 7 | 752 | 752 | 644 | 644 | ||||
| petD | 76,586 | 76,586 | 77,830 | 77,830 | 8 | 8 | 732 | 732 | 505 | 505 | ||||
| rpl2 | 84,455 | 84,455 | 85,928 | 85,928 | 391 | 391 | 652 | 652 | 431 | 431 | ||||
| ndhB | 94,954 | 94,954 | 97,185 | 97,185 | 775 | 775 | 699 | 699 | 758 | 758 | ||||
| trnA-UGC | 103,803 | 103,803 | 104,690 | 104,690 | 38 | 38 | 815 | 815 | 35 | 35 | ||||
| ndhA | 120,271 | 120,271 | 122,444 | 122,444 | 559 | 559 | 1076 | 1076 | 539 | 539 | ||||
| trnA-UGC | 134,580 | 134,580 | 135,467 | 135,467 | 38 | 38 | 815 | 815 | 35 | 35 | ||||
| ndhB | 142,085 | 142,085 | 144,316 | 144,316 | 775 | 775 | 699 | 699 | 758 | 758 | ||||
| rps12 | ||||||||||||||
| trnG-GCC | 9131 | 9035 | 9906 | 9811 | 23 | 23 | 716 | 706 | 37 | 48 | ||||
| trnI-GAU | 135,532 | 102,671 | 136,545 | 103,689 | 42 | 32 | 937 | 947 | 35 | 40 | ||||
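For a split gene that occupies one contiguous stretch of the genome, the exon and intron lengths in the table above should add up to the span between the start and end coordinates. The sketch below checks this for three rows whose values are identical in both species. It assumes the coordinates are 1-based and inclusive, which is an assumption (the table does not state it) but is consistent with these rows; the function name is illustrative.

```python
def span_length(start: int, end: int) -> int:
    """Length of a feature, assuming 1-based inclusive coordinates."""
    return end - start + 1


# gene: (start, end, [exon I, intron I, exon II, intron II, exon III]) in bp;
# genes with a single intron list only three segments.
split_genes = {
    "trnK-UUU": (1513, 4157, [37, 2568, 40]),
    "rps16": (4789, 5910, [46, 867, 209]),
    "ycf3": (42150, 44126, [126, 731, 220, 739, 161]),
}

for gene, (start, end, segments) in split_genes.items():
    # Compare the coordinate span with the summed exon and intron lengths.
    print(gene, span_length(start, end), sum(segments))
```

Rows where the two numbers disagree are candidates for a closer look at the underlying annotation.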
Figure 3. Repetitive sequences in the cp genomes of D. serrulata, D. cinnabari, and related Dracaena species. (A) Total number of repetitive sequences in the cp genomes, (B) length-wise frequency of palindromic repeats, (C) length-wise frequency of forward repeats, (D) length-wise frequency of reverse repeats, and (E) length-wise frequency of tandem repeats.
Figure 4. Simple sequence repeats (SSRs) in the cp genomes of D. serrulata, D. cinnabari, and related Dracaena species. (A) Total number of SSRs in the cp genomes, (B) SSR motif frequency in the cp genomes, (C) mononucleotide SSRs, (D) dinucleotide SSRs, (E) trinucleotide SSRs, (F) tetranucleotide SSRs, (G) pentanucleotide SSRs, and (H) hexanucleotide SSRs.
Figure 5. Visual alignment of the cp genomes of D. serrulata, D. cinnabari, and related Dracaena species. VISTA-based identity plot showing sequence identity among eleven Dracaena species, using D. serrulata as the reference. Genome regions are color-coded as protein-coding, rRNA-coding, tRNA-coding, or conserved non-coding sequences (CNS). The x-axis represents the coordinate in the chloroplast genome. Annotated genes are displayed along the top. The sequence similarity of the aligned regions is shown as horizontal bars indicating the average percent identity between 50 and 100%.
Figure 6. Heatmap of the codon distribution of all shared protein-coding genes in 11 Dracaena species. Color key: yellow indicates lower, green indicates moderate, and purple indicates higher relative synonymous codon usage (RSCU) values.
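RSCU is the observed count of a codon divided by the count expected if all synonymous codons for the same amino acid were used equally, so values above 1 indicate a preferred codon. The following minimal Python sketch implements that definition. The function name `rscu`, the toy sequence, and the use of the standard codon-to-amino-acid assignments are assumptions for illustration; this is not the tool used in the study.

```python
from collections import Counter, defaultdict

# Standard codon-to-amino-acid assignments, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = dict(
    zip((a + b + c for a in BASES for b in BASES for c in BASES), AMINO_ACIDS)
)


def rscu(cds: str) -> dict[str, float]:
    """RSCU per codon for one in-frame coding sequence (stop codons skipped)."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    counts = Counter(cds[i:i + 3] for i in range(0, usable, 3))

    # Group synonymous codons by the amino acid they encode.
    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        if aa != "*":
            families[aa].append(codon)

    values = {}
    for codons in families.values():
        total = sum(counts[c] for c in codons)
        if total == 0:
            continue  # amino acid absent from this sequence
        expected = total / len(codons)  # count under equal usage
        for c in codons:
            values[c] = counts[c] / expected
    return values


# Toy example: alanine is encoded three times by GCT and once by GCC.
print(rscu("GCTGCTGCTGCC")["GCT"])
```

For a genome-wide heatmap such as the one in Figure 6, the codon counts would typically be pooled over all shared protein-coding genes of a species before the ratio is taken.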
Figure 7. Sliding window analysis of nucleotide variability among the cp genomes of the Dracaena species (window length: 600 bp; step size: 200 bp).
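A sliding window analysis of this kind moves a fixed-length window along a multiple sequence alignment and computes a variability statistic in each window. The sketch below uses the window length (600 bp) and step size (200 bp) from the caption and computes nucleotide diversity as the mean proportion of differing sites over all sequence pairs. The function names are illustrative, and the gap handling (skipping any position where either sequence of a pair is not A, C, G, or T) is a simplifying assumption, not necessarily the procedure used in the study.

```python
from itertools import combinations


def pairwise_difference(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned sequences."""
    compared = diffs = 0
    for x, y in zip(a, b):
        if x in "ACGT" and y in "ACGT":  # skip gaps and ambiguous bases
            compared += 1
            diffs += x != y
    return diffs / compared if compared else 0.0


def nucleotide_diversity(seqs: list[str]) -> float:
    """Mean pairwise difference per site over all pairs of sequences."""
    pairs = list(combinations(seqs, 2))
    return sum(pairwise_difference(a, b) for a, b in pairs) / len(pairs)


def sliding_window_pi(alignment: list[str], window: int = 600, step: int = 200):
    """Return (window midpoint, nucleotide diversity) for each full window."""
    alignment = [s.upper() for s in alignment]
    length = len(alignment[0])
    results = []
    for start in range(0, length - window + 1, step):
        chunk = [s[start:start + window] for s in alignment]
        results.append((start + window // 2, nucleotide_diversity(chunk)))
    return results


# Toy alignment of three 1,000 bp sequences; the third differs in its first 600 bp.
toy = ["A" * 1000, "A" * 1000, "C" * 600 + "A" * 400]
for midpoint, pi in sliding_window_pi(toy):
    print(midpoint, round(pi, 3))
```

Plotting the diversity values against the window midpoints gives a profile of the kind shown in Figure 7, where peaks mark the most variable regions.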
Figure 8. Distances between adjacent genes and junctions of the small single-copy (SSC), large single-copy (LSC), and two inverted repeat (IR) regions among the cp genomes of D. serrulata, D. cinnabari, and related Dracaena species. Boxes above and below the primary line indicate the adjacent border genes. The figure is not to scale with respect to sequence length and only shows relative changes at or near the IR/SC borders.
Figure 9. Phylogenetic tree based on 46 complete cp genomes from subfamily Nolinoideae, with four complete cp genomes from subfamily Asparagoideae as outgroups, inferred using neighbor-joining (NJ), maximum likelihood (ML), Bayesian inference (BI), and maximum parsimony (MP) methods. Numbers above the branches represent bootstrap values in the NJ, ML, BI, and MP trees, respectively. Different colors represent the subfamilies of the Asparagaceae family.