Dong-Mei Li1, Chao-Yi Zhao2, Xiao-Fei Liu3.
Abstract
Kaempferia galanga and Kaempferia elegans, which belong to the genus Kaempferia in the family Zingiberaceae, are used as a valuable herbal medicine and as an ornamental plant, respectively. Chloroplast genomes have been used as a source of molecular markers and for species identification and phylogenetic studies. In this study, the complete chloroplast genome sequences of K. galanga and K. elegans are reported. The complete chloroplast genome of K. galanga is 163,811 bp long and has a quadripartite structure, with a large single copy (LSC) region of 88,405 bp and a small single copy (SSC) region of 15,812 bp separated by a pair of inverted repeats (IRs) of 29,797 bp each. Similarly, the complete chloroplast genome of K. elegans is 163,555 bp long and has a quadripartite structure in which IRs of 29,773 bp each separate an LSC of 88,020 bp and an SSC of 15,989 bp. The K. galanga genome contains 111 genes and the K. elegans genome 113 genes; each includes 79 protein-coding genes and 4 ribosomal RNA (rRNA) genes, together with 28 transfer RNA (tRNA) genes in K. galanga and 30 in K. elegans. The gene order, GC content and orientation of the two Kaempferia chloroplast genomes exhibited high similarity. The location and distribution of simple sequence repeats (SSRs) and long repeat sequences were determined. Eight highly variable regions between the two Kaempferia species were identified, and 643 mutation events, including 536 single-nucleotide polymorphisms (SNPs) and 107 insertions/deletions (indels), were accurately located. Sequence divergences of the whole chloroplast genomes were calculated among related Zingiberaceae species. The phylogenetic analysis based on SNPs among eleven species strongly supported that K. galanga and K. elegans formed a cluster within Zingiberaceae. This study identified the unique characteristics of the entire K. galanga and K. elegans chloroplast genomes, which contribute to our understanding of chloroplast DNA evolution within Zingiberaceae species. It provides valuable information for phylogenetic analysis and species identification within the genus Kaempferia.
Keywords: Illumina sequencing; Kaempferia elegans; Kaempferia galanga; PacBio sequencing; chloroplast genome; comparative analysis; genomic structure
Year: 2019 PMID: 30699955 PMCID: PMC6385120 DOI: 10.3390/molecules24030474
Source DB: PubMed Journal: Molecules ISSN: 1420-3049 Impact factor: 4.411
Figure 1. Circular gene map of chloroplast genomes of two Kaempferia species. The gray arrowheads indicate the direction of the genes. Genes shown inside the circle are transcribed clockwise and those outside are transcribed counterclockwise. Different genes are color coded. The innermost darker gray corresponds to GC content, whereas the lighter gray corresponds to AT content. IR, inverted repeat; LSC, large single copy region; SSC, small single copy region.
Features of the chloroplast genomes of K. galanga and K. elegans.
| Species | Regions | Positions | Length (bp) | T/U (%) | C (%) | A (%) | G (%) | AT/U (%) |
|---|---|---|---|---|---|---|---|---|
| K. galanga | Genome | | 163,811 | 32.2 | 18.3 | 31.7 | 17.7 | 63.9 |
| | LSC | | 88,405 | 33.7 | 17.3 | 32.4 | 16.4 | 66.1 |
| | IRa | | 29,797 | 28.8 | 19.8 | 30.0 | 21.2 | 58.8 |
| | SSC | | 15,812 | 34.5 | 15.5 | 35.9 | 13.9 | 70.5 |
| | IRb | | 29,797 | 28.8 | 19.8 | 30.0 | 21.2 | 58.8 |
| | Protein coding genes | | 83,172 | 31.5 | 17.2 | 31.4 | 19.7 | 63.0 |
| | | 1st position | 27,724 | 23.9 | 18.2 | 31.4 | 26.3 | 55.4 |
| | | 2nd position | 27,724 | 32.4 | 20.0 | 30.1 | 17.3 | 62.6 |
| | | 3rd position | 27,724 | 38.3 | 13.3 | 32.8 | 15.5 | 71.1 |
| | tRNA | | 2,870 | 24.9 | 23.6 | 22.0 | 29.3 | 47.0 |
| | rRNA | | 9,046 | 18.6 | 23.6 | 26.1 | 31.5 | 44.8 |
| K. elegans | Genome | | 163,555 | 32.2 | 18.3 | 31.7 | 17.7 | 63.9 |
| | LSC | | 88,020 | 33.7 | 17.4 | 32.4 | 16.5 | 66.1 |
| | IRa | | 29,773 | 28.8 | 19.8 | 30.1 | 21.3 | 58.9 |
| | SSC | | 15,989 | 34.6 | 15.5 | 36.1 | 13.8 | 70.6 |
| | IRb | | 29,773 | 28.8 | 19.8 | 30.1 | 21.3 | 58.9 |
| | Protein coding genes | | 79,117 | 31.6 | 17.3 | 31.2 | 19.9 | 62.8 |
| | | 1st position | 26,372 | 35.0 | 14.3 | 31.9 | 18.8 | 66.9 |
| | | 2nd position | 26,372 | 26.6 | 19.4 | 30.1 | 23.9 | 56.7 |
| | | 3rd position | 26,372 | 33.3 | 18.2 | 31.4 | 17.1 | 64.7 |
| | tRNA | | 2,852 | 24.9 | 23.7 | 22.0 | 29.4 | 46.9 |
| | rRNA | | 9,046 | 18.7 | 23.6 | 26.1 | 31.5 | 44.8 |
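The region lengths reported for both genomes can be cross-checked against the quadripartite layout (one LSC, one SSC, two IR copies), and the per-codon-position composition rows can be reproduced from any in-frame coding sequence. The following is a minimal Python sketch under those assumptions; the function names `quadripartite_length` and `codon_position_composition` are hypothetical helpers for illustration, not the authors' pipeline.

```python
from collections import Counter


def quadripartite_length(lsc, ssc, ir):
    """Total genome length for one LSC, one SSC and two IR copies (all in bp)."""
    return lsc + ssc + 2 * ir


def codon_position_composition(cds):
    """Percent T, C, A and G at codon positions 1, 2 and 3 of an in-frame CDS.

    Trailing bases that do not complete a codon are ignored. This helper is
    illustrative only; it is not taken from the study.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # drop an incomplete final codon
    result = {}
    for offset in range(3):
        bases = cds[offset:usable:3]
        counts = Counter(bases)
        total = len(bases)
        result[offset + 1] = (
            {base: 100.0 * counts.get(base, 0) / total for base in "TCAG"}
            if total
            else {}
        )
    return result


# Region lengths as reported in the abstract and the features table.
print(quadripartite_length(88_405, 15_812, 29_797))  # K. galanga
print(quadripartite_length(88_020, 15_989, 29_773))  # K. elegans
```

Both sums match the reported genome lengths (163,811 bp and 163,555 bp), which confirms that the four regions account for the full circular sequence.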
Genes present in the chloroplast genomes of K. galanga and K. elegans.
| Category | Gene Name |
|---|---|
| Photosystem I | |
| Photosystem II | |
| Cytochrome b/f | |
| ATP synthase | |
| NADH dehydrogenase | |
| Rubisco | |
| RNA polymerase | |
| Large subunit ribosomal proteins | |
| Small subunit ribosomal proteins | |
| Other proteins | |
| Proteins of unknown function | |
| Ribosomal RNAs | |
| Transfer RNAs | |
Kg: K. galanga; Ke: K. elegans; ×2: gene with two copies; *: gene containing introns in both K. galanga and K. elegans; **: gene containing introns only in K. galanga.
Genes with introns in the chloroplast genomes of K. galanga and K. elegans, including the exon and intron lengths.
| Species | Gene | Location | Exon I (bp) | Intron I (bp) | Exon II (bp) | Intron II (bp) | Exon III (bp) |
|---|---|---|---|---|---|---|---|
| K. galanga | | IR | 38 | 801 | 35 | | |
| | | LSC | 14 | 711 | 48 | | |
| | | IR | 42 | 935 | 35 | | |
| | | LSC | 35 | 2646 | 37 | | |
| | | LSC | 35 | 536 | 50 | | |
| | | LSC | 37 | 598 | 38 | | |
| | rps12* | LSC/IR | 114 | - | 231 | 540 | 27 |
| | | LSC | 212 | 749 | 40 | | |
| | | IR | 443 | 650 | 315 | | |
| | | LSC | 402 | 1058 | 9 | | |
| | | LSC | 6 | 783 | 642 | | |
| | | LSC | 8 | 740 | 481 | | |
| | | LSC | 425 | 816 | 145 | | |
| | | SSC | 518 | 1083 | 562 | | |
| | | IR | 778 | 673 | 782 | | |
| | | LSC | 1632 | 726 | 432 | | |
| | | LSC | 252 | 636 | 306 | 856 | 60 |
| | | LSC | 153 | 794 | 228 | 714 | 132 |
| K. elegans | | IR | 38 | 801 | 35 | | |
| | | IR | 42 | 935 | 35 | | |
| | | LSC | 35 | 2663 | 37 | | |
| | | LSC | 35 | 535 | 50 | | |
| | | LSC | 37 | 598 | 38 | | |
| | rps12* | LSC/IR | 114 | - | 231 | 540 | 27 |
| | | LSC | 212 | 729 | 40 | | |
| | | IR | 432 | 659 | 387 | | |
| | | LSC | 402 | 1056 | 9 | | |
| | | LSC | 6 | 784 | 642 | | |
| | | LSC | 8 | 741 | 481 | | |
| | | LSC | 411 | 816 | 144 | | |
| | | SSC | 540 | 1079 | 552 | | |
| | | IR | 756 | 700 | 777 | | |
| | | LSC | 1632 | 728 | 432 | | |
| | | LSC | 255 | 636 | 291 | 854 | 69 |
| | | LSC | 153 | 794 | 228 | 723 | 132 |
* The rps12 gene is divided into 5′-rps12 in the LSC region and 3′-rps12 in the IR region.
Figure 2. Amino acid frequencies in K. galanga and K. elegans protein-coding sequences.
Figure 3. Distribution of SSRs in the chloroplast genomes of K. galanga and K. elegans. (A) Number of different SSR types detected in the two Kaempferia species chloroplast genomes; (B) Frequency of identified SSR motifs in different repeat class types; (C) SSR distribution in different genomic regions of two Kaempferia species chloroplast genomes; (D) SSR distribution between coding and non-coding regions of two Kaempferia species chloroplast genomes.
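As an illustration of how SSRs of the kind summarized in Figure 3 can be located, the sketch below scans a sequence for perfect tandem repeats with a regular expression. The repeat thresholds in `MIN_REPEATS` (10, 5, 4, 3, 3 and 3 repeats for mono- to hexanucleotide motifs) are an assumption chosen for illustration; the thresholds and software actually used in the study are not stated in this text, and `find_ssrs` is a hypothetical helper.

```python
import re

# Assumed minimum repeat counts per motif length (NOT taken from the study).
MIN_REPEATS = {1: 10, 2: 5, 3: 4, 4: 3, 5: 3, 6: 3}


def is_primitive(motif):
    """True if the motif is not itself a repetition of a shorter unit (e.g. 'ATAT')."""
    n = len(motif)
    return not any(
        n % k == 0 and motif[:k] * (n // k) == motif for k in range(1, n)
    )


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs; start is 0-based."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in min_repeats.items():
        # One unit captured, then at least (min_rep - 1) further copies of it.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        pos = 0
        while True:
            match = pattern.search(seq, pos)
            if match is None:
                break
            motif = match.group(1)
            if is_primitive(motif):
                hits.append((match.start(), motif, len(match.group(0)) // unit_len))
                pos = match.end()
            else:
                # Skip runs such as 'AA' x 6, which belong to a shorter motif class.
                pos = match.start() + 1
    return sorted(hits)


example = "GC" + "A" * 12 + "GC" + "AT" * 6 + "GGC"
print(find_ssrs(example))
```

Classifying each hit by region (LSC, SSC, IR) or by coding status, as in panels C and D, would additionally require the annotation coordinates, which are not part of this sketch.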
Figure 4. Analysis of long repeat sequences in the chloroplast genomes of K. galanga and K. elegans. (A) Frequency of long repeat types; (B) Frequency of long repeats by length.
Figure 5. Comparison of the borders of the LSC, SSC and IR regions among five Zingiberaceae chloroplast genomes. Ψ, pseudogenes. Boxes above the main line indicate the adjacent border genes. The figure is not to scale with respect to sequence length and only shows relative changes at or near the IR/SC borders.
Figure 6. Comparison of five chloroplast genomes, with K. galanga as a reference, using the mVISTA alignment program. Gray arrows and thick black lines above the alignment indicate gene orientation. Purple bars represent exons, sky-blue bars represent transfer RNA (tRNA) and ribosomal RNA (rRNA), red bars represent conserved non-coding sequences (CNS) and white peaks represent differences between the chloroplast genomes. The y-axis represents the identity percentage, ranging from 50 to 100%.
Figure 7. Sliding window analysis of the whole chloroplast genomes. Window length: 800 bp; step size: 200 bp. X-axis: position of the midpoint of a window. Y-axis: nucleotide diversity of each window. (A) Pi between K. galanga and K. elegans. (B) Pi among two Kaempferia species, Alpinia zerumbet, Curcuma flaviflora and Zingiber spectabile.
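The sliding window scheme described for Figure 7 (800 bp windows advanced in 200 bp steps, with Pi plotted at each window midpoint) can be sketched in Python as follows. This is a simplified illustration that assumes the sequences are already aligned to equal length and that skips, pair by pair, any column containing a gap or ambiguous base; the software and gap handling used in the study may differ, and `sliding_window_pi` is a hypothetical name.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Mean proportion of differing sites over all sequence pairs.

    Columns where either sequence of a pair has a gap or ambiguous base are
    skipped for that pair (a simplifying assumption).
    """
    total = 0.0
    n_pairs = 0
    for a, b in combinations(seqs, 2):
        sites = diffs = 0
        for x, y in zip(a, b):
            if x in "ACGT" and y in "ACGT":
                sites += 1
                diffs += x != y
        if sites:
            total += diffs / sites
            n_pairs += 1
    return total / n_pairs if n_pairs else 0.0


def sliding_window_pi(aligned, window=800, step=200):
    """Return (midpoint, Pi) per window; midpoint is a 0-based alignment offset."""
    length = len(aligned[0])
    if any(len(s) != length for s in aligned):
        raise ValueError("sequences must be aligned to the same length")
    seqs = [s.upper() for s in aligned]
    results = []
    for start in range(0, length - window + 1, step):
        chunk = [s[start:start + window] for s in seqs]
        results.append((start + window // 2, nucleotide_diversity(chunk)))
    return results


# Toy alignment: 8 substitutions inside the first window, none in the second.
ref = "A" * 1000
alt = "A" * 100 + "C" * 8 + "A" * 892
for midpoint, pi in sliding_window_pi([ref, alt]):
    print(midpoint, pi)
```

With two sequences, as in panel A, Pi in each window reduces to the fraction of compared sites that differ; with five sequences, as in panel B, it is the average of that fraction over all ten pairs.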
Figure 8. Phylogenetic trees constructed with SNPs from 11 species using maximum likelihood (ML, left) and maximum parsimony (MP, right) methods. Numbers at nodes on the tree indicate bootstrap values (>50%).