Ming-Li Wu1,2, Qing Li3, Jiang Xu1, Xi-Wen Li1.
Abstract
BACKGROUND: Amomum compactum is one of the source species of the traditional herbal medicine Amomi Fructus Rotundus and has notable pharmacological effects. Its systematic position remains unclear, and its introduction has been hindered by many plant diseases. However, relevant molecular studies are relatively scarce.
Keywords: Amomum compactum; Chloroplast genome; High-throughput sequencing technology; Phylogeny; SSR
Year: 2018 PMID: 29449878 PMCID: PMC5811967 DOI: 10.1186/s13020-018-0164-2
Source DB: PubMed Journal: Chin Med ISSN: 1749-8546 Impact factor: 5.455
Fig. 1 A. compactum cp genome map. Genes drawn outside the circle are transcribed counterclockwise, whereas those drawn inside are transcribed clockwise. Genes are color-coded by functional group. In the inner circle, the darker gray represents GC content and the lighter gray represents AT content
Base composition in the A. compactum cp genome
| Region | T(U)% | C% | A% | G% | Length (bp) |
|---|---|---|---|---|---|
| LSC | 33.8 | 17.2 | 32.5 | 16.5 | 88,535 |
| IR | 28.8 | 19.8 | 30.1 | 21.3 | 29,824 |
| SSC | 34.3 | 15.6 | 35.9 | 14.2 | 15,370 |
| Total | 32.3 | 18.3 | 31.7 | 17.8 | 163,553 |
| CDS | 31.6 | 17.2 | 31.5 | 19.8 | 79,701 |
| 1st position | 24 | 18.2 | 31.3 | 26.7 | 26,567 |
| 2nd position | 32 | 20.2 | 30.0 | 17.4 | 26,567 |
| 3rd position | 39 | 13.1 | 33.1 | 15.3 | 26,567 |
CDS: protein-coding regions
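The composition table can be cross-checked with a little arithmetic. The sketch below copies the C%, G%, and length values from the table, derives GC content per region, and checks that LSC + 2 × IR + SSC matches the reported total. It assumes the IR row describes a single copy of the inverted repeat; the variable and function names are illustrative only.

```python
# Values copied from the base composition table: (C%, G%, length in bp).
REGIONS = {
    "LSC": (17.2, 16.5, 88_535),
    "IR": (19.8, 21.3, 29_824),  # assumed to be one copy of the inverted repeat
    "SSC": (15.6, 14.2, 15_370),
    "Total": (18.3, 17.8, 163_553),
}
CDS_LENGTH = 79_701
CODON_POSITION_LENGTH = 26_567  # length listed for each codon position


def gc_percent(region: str) -> float:
    """GC content of a region as C% + G%, rounded to one decimal place."""
    c, g, _ = REGIONS[region]
    return round(c + g, 1)


def quadripartite_length() -> int:
    """Genome length implied by the quadripartite layout: LSC + 2 * IR + SSC."""
    return REGIONS["LSC"][2] + 2 * REGIONS["IR"][2] + REGIONS["SSC"][2]


if __name__ == "__main__":
    for name in REGIONS:
        print(f"{name}: GC = {gc_percent(name)}%")
    print("LSC + 2*IR + SSC =", quadripartite_length())
    print("3 * codon position length =", 3 * CODON_POSITION_LENGTH)
```

On the table's figures, the IR is the most GC-rich region and the SSC the least, and the three codon-position lengths add up to the CDS length.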
Gene content of the A. compactum cp genome
| Gene category | Gene group | Gene name |
|---|---|---|
| Self-replication | rRNA genes | |
| | tRNA genes | |
| | Small subunit of ribosome | |
| | Large subunit of ribosome | |
| | DNA-dependent RNA polymerase | |
| | Translational initiation factor | |
| Genes for photosynthesis | Subunits of NADH dehydrogenase | |
| | Subunits of photosystem I | |
| | Subunits of photosystem II | |
| | Subunits of cytochrome b/f complex | |
| | Subunits of ATP synthase | |
| | Large subunit of rubisco | |
| Genes of unknown function | Open reading frames (ORF, ycf) | |
| | Pseudogenes | |
a: Gene with one intron
b: Gene with two introns
c: Gene with two copies
Codon-anticodon recognition patterns and codon usage in the A. compactum cp genome
| Amino acid | Codon | Count | RSCU | tRNA | Amino acid | Codon | Count | RSCU | tRNA |
|---|---|---|---|---|---|---|---|---|---|
| Phe | UUU | 971 | 1.31 | | Tyr | UAU | 811 | 1.57 | |
| Phe | UUC | 516 | 0.69 | | Tyr | UAC | 221 | 0.43 | |
| Leu | UUA | 892 | 1.96 | | Stop | UAA | 48 | 1.66 | |
| Leu | UUG | 559 | 1.23 | | Stop | UAG | 22 | 0.76 | |
| Leu | CUU | 567 | 1.25 | | His | CAU | 519 | 1.60 | |
| Leu | CUC | 181 | 0.40 | | His | CAC | 129 | 0.40 | |
| Leu | CUA | 381 | 0.84 | | Gln | CAA | 706 | 1.54 | |
| Leu | CUG | 151 | 0.33 | | Gln | CAG | 210 | 0.46 | |
| Ile | AUU | 1146 | 1.47 | | Asn | AAU | 989 | 1.55 | |
| Ile | AUC | 426 | 0.55 | | Asn | AAC | 289 | 0.45 | |
| Ile | AUA | 763 | 0.98 | | Lys | AAA | 1114 | 1.49 | |
| Met | AUG | 614 | 1.00 | | Lys | AAG | 383 | 0.51 | |
| Val | GUU | 521 | 1.45 | | Asp | GAU | 875 | 1.64 | |
| Val | GUC | 159 | 0.44 | | Asp | GAC | 192 | 0.36 | |
| Val | GUA | 559 | 1.56 | | Glu | GAA | 1125 | 1.53 | |
| Val | GUG | 194 | 0.54 | | Glu | GAG | 350 | 0.47 | |
| Ser | UCU | 598 | 1.74 | | Cys | UGU | 232 | 1.56 | |
| Ser | UCC | 337 | 0.98 | | Cys | UGC | 66 | 0.44 | |
| Ser | UCA | 412 | 1.20 | | Stop | UGA | 17 | 0.59 | |
| Ser | UCG | 182 | 0.53 | | Trp | UGG | 452 | 1.00 | |
| Pro | CCU | 442 | 1.62 | | Arg | CGU | 365 | 1.37 | |
| Pro | CCC | 202 | 0.74 | | Arg | CGC | 86 | 0.32 | |
| Pro | CCA | 325 | 1.19 | | Arg | CGA | 342 | 1.29 | |
| Pro | CCG | 120 | 0.44 | | Arg | CGG | 113 | 0.43 | |
| Thr | ACU | 537 | 1.57 | | Arg | AGA | 519 | 1.95 | |
| Thr | ACC | 237 | 0.70 | | Arg | AGG | 168 | 0.63 | |
| Thr | ACA | 433 | 1.27 | | Ser | AGU | 430 | 1.25 | |
| Thr | ACG | 157 | 0.46 | | Ser | AGC | 102 | 0.30 | |
| Ala | GCU | 626 | 1.82 | | Gly | GGU | 604 | 1.39 | |
| Ala | GCC | 203 | 0.59 | | Gly | GGC | 141 | 0.33 | |
| Ala | GCA | 434 | 1.26 | | Gly | GGA | 714 | 1.65 | |
| Ala | GCG | 112 | 0.33 | | Gly | GGG | 276 | 0.64 | |
RSCU: relative synonymous codon usage
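RSCU for a codon is its observed count divided by the mean count of all codons that encode the same amino acid, so values above 1 indicate a preferred codon. As a minimal sketch, the function below applies that standard definition to three codon families whose counts are copied from the table; the function name `rscu` and the dictionary names are illustrative.

```python
from typing import Dict


def rscu(counts: Dict[str, int]) -> Dict[str, float]:
    """RSCU for one synonymous codon family.

    RSCU = count / (mean count across the family), so a value above 1
    means the codon is used more often than under equal usage.
    """
    total = sum(counts.values())
    if total == 0:
        raise ValueError("codon family has no observations")
    n = len(counts)
    return {codon: round(c * n / total, 2) for codon, c in counts.items()}


# Counts copied from the codon usage table.
PHE = {"UUU": 971, "UUC": 516}
ILE = {"AUU": 1146, "AUC": 426, "AUA": 763}
STOP = {"UAA": 48, "UAG": 22, "UGA": 17}

if __name__ == "__main__":
    for family in (PHE, ILE, STOP):
        print(rscu(family))
```

Recomputing these three families reproduces the tabulated RSCU values, which also shows the bias toward codons ending in A or U in these families.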
Simple sequence repeats in the A. compactum cp genome
| cpSSR ID | Repeat motif | Length (bp) | Start | End | Region | Annotation |
|---|---|---|---|---|---|---|
| 1 | (T)10 | 10 | 3975 | 3984 | LSC | |
| 2 | (A)10 | 10 | 4328 | 4337 | LSC | |
| 3 | (TA)6 | 12 | 4900 | 4911 | LSC | |
| 4 | (A)10 | 10 | 5287 | 5296 | LSC | |
| 5 | (A)11 | 11 | 6253 | 6263 | LSC | |
| 6 | (TA)6 | 12 | 6609 | 6620 | LSC | |
| 7 | (A)10 | 10 | 7204 | 7213 | LSC | |
| 8 | (AT)6 | 12 | 7521 | 7532 | LSC | |
| 9 | (A)10 | 10 | 7700 | 7709 | LSC | |
| 10 | (T)12 | 12 | 8633 | 8644 | LSC | |
| 11 | (A)13 | 13 | 14,885 | 14,897 | LSC | |
| 12 | (T)10 | 10 | 17,474 | 17,483 | LSC | |
| 13 | (A)10 | 10 | 19,831 | 19,840 | LSC | |
| 14 | (T)11 | 11 | 24,121 | 24,131 | LSC | |
| 15 | (A)10 | 10 | 28,802 | 28,811 | LSC | |
| 16 | (A)15 | 15 | 29,013 | 29,027 | LSC | |
| 17 | (A)11 | 11 | 30,868 | 30,878 | LSC | |
| 18 | (T)10 | 10 | 35,129 | 35,138 | LSC | |
| 19 | (TA)7 | 14 | 38,632 | 38,645 | LSC | |
| 20 | (A)12 | 12 | 39,292 | 39,303 | LSC | |
| 21 | (A)12 | 12 | 47,481 | 47,492 | LSC | |
| 22 | (T)10 | 10 | 48,986 | 48,995 | LSC | |
| 23 | (A)10 | 10 | 50,236 | 50,245 | LSC | |
| 24 | (AT)7 | 14 | 50,395 | 50,408 | LSC | |
| 25 | (T)10 | 10 | 51,829 | 51,838 | LSC | |
| 26 | (T)11 | 11 | 52,709 | 52,719 | LSC | |
| 27 | (ATA)5 | 15 | 54,345 | 54,359 | LSC | |
| 28 | (A)11 | 11 | 54,562 | 54,572 | LSC | |
| 29 | (T)10 | 10 | 58,778 | 58,787 | LSC | |
| 30 | (T)11 | 11 | 59,269 | 59,279 | LSC | |
| 31 | (A)12 | 12 | 60,919 | 60,930 | LSC | |
| 32 | (T)10 | 10 | 61,621 | 61,630 | LSC | |
| 33 | (AT)6 | 12 | 63,489 | 63,500 | LSC | |
| 34 | (A)12 | 12 | 68,715 | 68,726 | LSC | |
| 35 | (AT)10 | 20 | 69,266 | 69,285 | LSC | |
| 36 | (T)10 | 10 | 70,716 | 70,725 | LSC | |
| 37 | (A)10 | 10 | 72,600 | 72,609 | LSC | |
| 38 | (TA)7 | 14 | 74,094 | 74,107 | LSC | |
| 39 | (A)10 | 10 | 74,569 | 74,578 | LSC | |
| 40 | (T)11 | 11 | 74,845 | 74,855 | LSC | |
| 41 | (T)10 | 10 | 75,108 | 75,117 | LSC | |
| 42 | (T)10 | 10 | 75,572 | 75,581 | LSC | |
| 43 | (T)10 | 10 | 75,831 | 75,840 | LSC | |
| 44 | (A)10 | 10 | 79,177 | 79,186 | LSC | |
| 45 | (AT)6 | 12 | 79,751 | 79,762 | LSC | |
| 46 | (T)10 | 10 | 86,407 | 86,416 | LSC | |
| 47 | (T)11 | 11 | 88,970 | 88,980 | IRa | |
| 48 | (T)10 | 10 | 116,573 | 116,582 | IRa | |
| 49 | (A)11 | 11 | 120,872 | 120,882 | SSC | |
| 50 | (T)11 | 11 | 121,055 | 121,065 | SSC | |
| 51 | (A)11 | 11 | 128,865 | 128,875 | SSC | |
| 52 | (T)10 | 10 | 129,188 | 129,197 | SSC | |
| 53 | (AT)6 | 12 | 131,778 | 131,789 | SSC | |
| 54 | (T)11 | 11 | 133,103 | 133,113 | SSC | |
| 55 | (T)12 | 12 | 133,236 | 133,247 | SSC | |
| 56 | (T)11 | 11 | 133,374 | 133,384 | SSC | |
| 57 | (A)10 | 10 | 135,507 | 135,516 | IRb | |
| 58 | (A)11 | 11 | 163,109 | 163,119 | IRb | |
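Entries like those in the table can be located with a simple regular-expression scan. The sketch below is not the tool used in the study; it assumes minimum repeat counts of 10, 6, and 5 for mono-, di-, and trinucleotide motifs, inferred only from the shortest entries in the table, and reports 1-based inclusive coordinates in the same style as the Start and End columns. The name `find_ssrs` is hypothetical.

```python
import re
from typing import List, Tuple

# Minimum repeat counts per motif length. These are assumptions inferred
# from the shortest entries in the table, not documented settings.
MIN_REPEATS = {1: 10, 2: 6, 3: 5}


def find_ssrs(seq: str) -> List[Tuple[str, int, int, int]]:
    """Return (motif, repeats, start, end) with 1-based inclusive coordinates."""
    seq = seq.upper()
    hits = []
    for k, min_rep in MIN_REPEATS.items():
        # A k-mer followed by at least (min_rep - 1) further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs such as "AA" or "AAA": they are mononucleotide runs.
            if k > 1 and motif == motif[0] * k:
                continue
            repeats = len(m.group(0)) // k
            hits.append((motif, repeats, m.start() + 1, m.end()))
    return sorted(hits, key=lambda h: h[2])


if __name__ == "__main__":
    # Synthetic demo sequence, not taken from the A. compactum genome.
    demo = "GC" + "A" * 10 + "CG" + "TA" * 6 + "GG" + "ATA" * 5 + "C"
    for motif, repeats, start, end in find_ssrs(demo):
        print(f"({motif}){repeats}\t{end - start + 1} bp\t{start}-{end}")
```

Each hit maps directly onto a table row: the motif and repeat count give the Repeat motif column, and `end - start + 1` gives the length in bp.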
Long repeat sequences in the A. compactum cp genome
| ID | Repeat start 1 | Type | Size (bp) | Repeat start 2 | Mismatch (bp) | E value | Gene | Region |
|---|---|---|---|---|---|---|---|---|
| 1 | 3990 | P | 34 | 3996 | −3 | 4.12E−06 | | LSC |
| 2 | 8768 | P | 31 | 48,057 | −3 | 1.98E−04 | IGS; | LSC |
| 3 | 10,522 | F | 30 | 39,347 | −3 | 7.15E−04 | | LSC |
| 4 | 31,322 | P | 32 | 31,352 | −3 | 5.46E−05 | IGS | LSC |
| 5 | 32,991 | F | 30 | 33,020 | −3 | 7.15E−04 | IGS | LSC |
| 6 | 39,660 | P | 32 | 39,701 | 0 | 4.08E−10 | IGS | LSC |
| 7 | 41,551 | F | 58 | 43,775 | −3 | 7.54E−20 | | LSC |
| 8 | 41,595 | F | 37 | 43,819 | −2 | 2.39E−09 | | LSC |
| 9 | 63,481 | P | 31 | 126,101 | −3 | 1.98E−04 | IGS | LSC; SSC |
| 10 | 63,481 | F | 31 | 126,106 | −3 | 1.98E−04 | IGS | LSC; SSC |
| 11 | 63,487 | F | 32 | 69,264 | −3 | 5.46E−05 | IGS | LSC |
| 12 | 67,809 | P | 31 | 67,864 | −2 | 6.83E−06 | IGS | LSC |
| 13 | 71,632 | F | 30 | 71,659 | 0 | 6.53E−09 | IGS | LSC |
| 14 | 72,281 | F | 42 | 72,302 | −3 | 1.21E−10 | | LSC |
| 15 | 91,249 | F | 46 | 91,299 | −1 | 2.10E−16 | | IRa |
| 16 | 91,249 | P | 46 | 160,743 | −1 | 2.10E−16 | | IRa; IRb |
| 17 | 91,299 | P | 46 | 160,793 | −1 | 2.10E−16 | IGS | IRa; IRb |
| 18 | 93,917 | F | 30 | 93,938 | −3 | 7.15E−04 | | IRa |
| 19 | 93,917 | P | 30 | 158,120 | −3 | 7.15E−04 | | IRa; IRb |
| 20 | 93,938 | P | 30 | 158,141 | −3 | 7.15E−04 | | IRa; IRb |
| 21 | 121,695 | P | 30 | 121,723 | −3 | 7.15E−04 | IGS | SSC |
| 22 | 158,122 | F | 30 | 158,143 | −3 | 7.15E−04 | | IRb |
| 23 | 160,743 | F | 46 | 160,793 | −1 | 2.10E−16 | IGS | IRb |
| 24 | 160,762 | F | 30 | 160,812 | −3 | 7.15E−04 | IGS | IRb |
F: forward; P: palindromic; IGS: intergenic spacer
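The F and P types differ in how the second copy relates to the first: a forward repeat matches on the same strand, while a palindromic repeat matches the reverse complement. The sketch below illustrates that distinction under stated assumptions: both start positions are 1-based left coordinates on the same strand, and the Mismatch column is read as a Hamming distance (so −3 means three mismatched positions). Both readings are assumptions about the table, and `classify_repeat` is a hypothetical helper, not the study's method.

```python
from typing import Optional, Tuple

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def hamming(a: str, b: str) -> int:
    """Number of mismatched positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def classify_repeat(seq: str, start1: int, start2: int, size: int,
                    max_mismatch: int = 3) -> Optional[Tuple[str, int]]:
    """Classify a repeat pair as forward ("F") or palindromic ("P").

    Coordinates are 1-based (an assumption about the table's convention).
    Returns (type, mismatches), or None if neither reading fits.
    """
    seq = seq.upper()
    first = seq[start1 - 1:start1 - 1 + size]
    second = seq[start2 - 1:start2 - 1 + size]
    if len(first) < size or len(second) < size:
        raise ValueError("repeat extends past the end of the sequence")
    candidates = [
        ("F", hamming(first, second)),
        ("P", hamming(first, reverse_complement(second))),
    ]
    best = min(candidates, key=lambda c: c[1])
    return best if best[1] <= max_mismatch else None


if __name__ == "__main__":
    unit = "AGGCTTAC"  # synthetic example, not genome data
    print(classify_repeat(unit + "TT" + unit, 1, 11, 8))
    print(classify_repeat(unit + "TT" + reverse_complement(unit), 1, 11, 8))
```

With real data, the genome sequence and a table row's two start positions and size would replace the synthetic inputs.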
Fig. 2 Comparison of the border positions of the LSC, SSC, and IR regions among four complete Zingiberaceae chloroplast genomes. Gene names are indicated in boxes, and their lengths in the corresponding regions are displayed above the boxes
Fig. 3 Sequence comparison of the A. compactum, C. flaviflora, C. roscoeana, and Z. spectabile cp genomes generated by mVISTA. Black lines mark regions of sequence identity with A. compactum at a 50% identity cutoff. Dashed rectangles indicate regions of A. compactum that are highly divergent from C. flaviflora, C. roscoeana, and Z. spectabile
Fig. 4 MP phylogenetic tree of 15 Zingiberales species based on 67 protein-coding gene sequences in the cp genomes. Bootstrap values are indicated above the branches
Fig. 5 ML phylogenetic tree of 15 Zingiberales species based on 67 protein-coding gene sequences in the cp genomes. Bootstrap values are indicated above the branches