Yang He, Hongtao Xiao, Cao Deng, Liang Xiong, Jian Yang, Cheng Peng.
Abstract
Pogostemon cablin, the natural source of patchouli alcohol, is an important herb in the Lamiaceae family. Here, we present the complete chloroplast genome of P. cablin. The genome is 152,460 bp in length and has a GC content of 38.24%. It has the typical quadripartite structure: two inverted repeats (25,417 bp each) separated by a small single-copy region (17,652 bp) and a large single-copy region (83,974 bp). The chloroplast genome encodes 127 genes, of which 107 are single-copy, including 79 protein-coding genes, four rRNA genes, and 24 tRNA genes. The genome structure, GC content, and codon usage of this chloroplast genome are similar to those of other species in the family, except that it encodes fewer protein-coding genes and tRNA genes. Phylogenetic analysis indicates that P. cablin diverged from the Scutellarioideae clade about 29.45 million years ago (Mya). Furthermore, most of the simple sequence repeats (SSRs) are short polyadenine or polythymine repeats, which contribute to the high AT content of the chloroplast genome. The complete sequence and annotation of the P. cablin chloroplast genome will facilitate phylogenetic, population, and genetic engineering research on this species.
Keywords: Pogostemon cablin; SSR; chloroplast genome; phylogenetic analysis; sequencing
Year: 2016 PMID: 27275817 PMCID: PMC4926354 DOI: 10.3390/ijms17060820
Source DB: PubMed Journal: Int J Mol Sci ISSN: 1422-0067 Impact factor: 5.923
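The figures quoted in the abstract can be cross-checked with simple arithmetic. The following is a minimal sketch, not part of the original analysis; the function and variable names are illustrative assumptions, and the only inputs are the numbers stated in the abstract.

```python
# Region lengths in bp, as reported in the abstract.
LSC = 83_974  # large single-copy region
SSC = 17_652  # small single-copy region
IR = 25_417   # length of each of the two inverted repeats


def quadripartite_length(lsc: int, ssc: int, ir: int) -> int:
    """Total length of a genome made of LSC + SSC + two IR copies."""
    return lsc + ssc + 2 * ir


total_length = quadripartite_length(LSC, SSC, IR)

# Single-copy gene classes listed in the abstract.
single_copy_genes = 79 + 4 + 24  # protein-coding + rRNA + tRNA

print(total_length)       # compare with the reported 152,460 bp
print(single_copy_genes)  # compare with the reported 107 genes
```

Both sums agree with the abstract: the four regions add up to the stated genome length, and the three gene classes add up to the stated number of single-copy genes.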
Figure 1. Genome annotation. (A) GC content of protein-coding genes and tRNA/rRNA genes; (B) the length distribution of protein-coding genes; (C) the number of best hits for P. cablin chloroplast protein-coding genes among species.
Figure 2. Schematic map of the P. cablin chloroplast genome. Genes drawn inside the circle are transcribed clockwise; those drawn outside are transcribed counterclockwise. Fill colors indicate the functional group to which each gene belongs.
Lengths of the introns and exons of intron-containing genes.
| Gene | Strand | Start | End | Exon I (bp) | Intron I (bp) | Exon II (bp) | Intron II (bp) | Exon III (bp) |
|---|---|---|---|---|---|---|---|---|
| | + | 13,161 | 15,372 | 867 | 589 | 756 | | |
| | + | 24,040 | 25,529 | 391 | 631 | 468 | | |
| | − | 37,673 | 38,945 | 150 | 664 | 459 | | |
| | − | 46,797 | 49,641 | 454 | 769 | 1,622 | | |
| | − | 67,542 | 69,472 | 134 | 696 | 235 | 720 | 146 |
| | − | 95,175 | 97,129 | 74 | 726 | 294 | 638 | 223 |
| | − | 109,281 | 110,769 | 394 | 628 | 467 | | |
| | − | 119,440 | 121,648 | 870 | 586 | 753 | | |
| | − | 143,869 | 146,015 | 556 | 1,055 | 536 | | |
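
One way to sanity-check the table above is to confirm that the exon and intron lengths of each row add up to the span between its start and end coordinates. The sketch below assumes the coordinates are 1-based and inclusive, which the source does not state explicitly; the names `ROWS` and `span_length` are illustrative.

```python
# One tuple per table row: (start, end, [exon I, intron I, exon II, ...]).
# Values are transcribed from the table; gene names are not used here.
ROWS = [
    (13_161, 15_372, [867, 589, 756]),
    (24_040, 25_529, [391, 631, 468]),
    (37_673, 38_945, [150, 664, 459]),
    (46_797, 49_641, [454, 769, 1_622]),
    (67_542, 69_472, [134, 696, 235, 720, 146]),
    (95_175, 97_129, [74, 726, 294, 638, 223]),
    (109_281, 110_769, [394, 628, 467]),
    (119_440, 121_648, [870, 586, 753]),
    (143_869, 146_015, [556, 1_055, 536]),
]


def span_length(start: int, end: int) -> int:
    """Length of a closed interval [start, end] (assumed 1-based, inclusive)."""
    return end - start + 1


# Collect every row whose segments do not add up to the coordinate span.
mismatches = [
    (start, end)
    for start, end, segments in ROWS
    if span_length(start, end) != sum(segments)
]
print(mismatches)
```

For the values as transcribed here, the list of mismatches is empty, which is consistent with the inclusive-coordinate assumption.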
Codon usage of the P. cablin chloroplast genome.
| AA | Codon | No. | Total | AA Frequency | AA | Codon | No. | Total | AA Frequency |
|---|---|---|---|---|---|---|---|---|---|
| A | GCA | 399 | 1418 | 5.25% | P | CCA | 318 | 1139 | 4.22% |
| | GCC | 251 | | | | CCC | 246 | | |
| | GCG | 176 | | | | CCG | 178 | | |
| | GCT | 592 | | | | CCT | 397 | | |
| C | TGC | 88 | 319 | 1.18% | Q | CAA | 702 | 925 | 3.42% |
| | TGT | 231 | | | | CAG | 223 | | |
| D | GAC | 179 | 1024 | 3.79% | R | AGA | 485 | 1639 | 6.07% |
| | GAT | 845 | | | | AGG | 197 | | |
| E | GAA | 1002 | 1358 | 5.03% | | CGA | 367 | | |
| | GAG | 356 | | | | CGC | 121 | | |
| F | TTC | 551 | 1539 | 5.70% | | CGG | 143 | | |
| | TTT | 988 | | | | CGT | 326 | | |
| G | GGA | 736 | 1823 | 6.75% | S | AGC | 131 | 2128 | 7.88% |
| | GGC | 197 | | | | AGT | 409 | | |
| | GGG | 327 | | | | TCA | 405 | | |
| | GGT | 563 | | | | TCC | 370 | | |
| H | CAC | 146 | 640 | 2.37% | | TCG | 229 | | |
| | CAT | 494 | | | | TCT | 584 | | |
| I | ATA | 703 | 2319 | 8.58% | T | ACA | 413 | 1388 | 5.14% |
| | ATC | 493 | | | | ACC | 274 | | |
| | ATT | 1123 | | | | ACG | 164 | | |
| K | AAA | 1046 | 1440 | 5.33% | | ACT | 537 | | |
| | AAG | 394 | | | V | GTA | 557 | 1518 | 5.62% |
| L | CTA | 420 | 2916 | 10.79% | | GTC | 178 | | |
| | CTC | 203 | | | | GTG | 223 | | |
| | CTG | 213 | | | | GTT | 560 | | |
| | CTT | 616 | | | W | TGG | 497 | 497 | 1.84% |
| | TTA | 881 | | | Y | TAC | 203 | 978 | 3.62% |
| | TTG | 583 | | | | TAT | 775 | | |
| M | ATG | 619 | 619 | 2.29% | Stop codon | TAA | 60 | 148 | 0.55% |
| N | AAC | 310 | 1245 | 4.61% | | TAG | 42 | | |
| | AAT | 935 | | | | TGA | 46 | | |
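
The "Total" and "AA Frequency" columns above can be recomputed from the per-codon counts. The sketch below assumes that each frequency is the amino acid's codon total divided by the sum over all 64 codons, stop codons included; the source does not spell out the denominator, and the names `CODON_COUNTS` and `aa_frequencies` are illustrative.

```python
# Codon counts transcribed from the table; "*" stands for the stop codons.
CODON_COUNTS = {
    "A": {"GCA": 399, "GCC": 251, "GCG": 176, "GCT": 592},
    "C": {"TGC": 88, "TGT": 231},
    "D": {"GAC": 179, "GAT": 845},
    "E": {"GAA": 1002, "GAG": 356},
    "F": {"TTC": 551, "TTT": 988},
    "G": {"GGA": 736, "GGC": 197, "GGG": 327, "GGT": 563},
    "H": {"CAC": 146, "CAT": 494},
    "I": {"ATA": 703, "ATC": 493, "ATT": 1123},
    "K": {"AAA": 1046, "AAG": 394},
    "L": {"CTA": 420, "CTC": 203, "CTG": 213,
          "CTT": 616, "TTA": 881, "TTG": 583},
    "M": {"ATG": 619},
    "N": {"AAC": 310, "AAT": 935},
    "P": {"CCA": 318, "CCC": 246, "CCG": 178, "CCT": 397},
    "Q": {"CAA": 702, "CAG": 223},
    "R": {"AGA": 485, "AGG": 197, "CGA": 367,
          "CGC": 121, "CGG": 143, "CGT": 326},
    "S": {"AGC": 131, "AGT": 409, "TCA": 405,
          "TCC": 370, "TCG": 229, "TCT": 584},
    "T": {"ACA": 413, "ACC": 274, "ACG": 164, "ACT": 537},
    "V": {"GTA": 557, "GTC": 178, "GTG": 223, "GTT": 560},
    "W": {"TGG": 497},
    "Y": {"TAC": 203, "TAT": 775},
    "*": {"TAA": 60, "TAG": 42, "TGA": 46},
}


def aa_frequencies(codon_counts):
    """Return (percent frequency per amino acid, total codon count)."""
    totals = {aa: sum(codons.values()) for aa, codons in codon_counts.items()}
    grand_total = sum(totals.values())
    percent = {aa: 100 * n / grand_total for aa, n in totals.items()}
    return percent, grand_total


frequencies, grand_total = aa_frequencies(CODON_COUNTS)
for aa in sorted(frequencies, key=frequencies.get, reverse=True):
    print(f"{aa}\t{frequencies[aa]:.2f}%")
```

Under this assumption the recomputed percentages round to the values in the table; leucine, for example, comes out at 10.79%.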
Figure 3. The COG (Clusters of Orthologous Groups) classification and distribution of genes in different species.
Figure 4. Phylogenetic tree reconstructed from chloroplast genome alignments of several species. The purple bars at the nodes indicate 95% posterior probability intervals, and the red dot corresponds to the calibration point.
Statistics of chloroplast SSRs (simple sequence repeats) detected in 12 species in Lamiales.
| Species | Total | c | p2 | p1 (all) | p1 (A)10 | p1 (A)11 | p1 (A)12 | p1 (T)10 | p1 (T)11 | p1 (T)12 |
|---|---|---|---|---|---|---|---|---|---|---|
| | 52 | 6 | 0 | 46 | 11 | 6 | 3 | 10 | 6 | 2 |
| | 26 | 3 | 2 | 21 | 2 | 3 | 0 | 5 | 5 | 1 |
| | 26 | 1 | 2 | 23 | 4 | 1 | 0 | 5 | 2 | 0 |
| | 39 | 4 | 1 | 34 | 8 | 4 | 1 | 8 | 3 | 5 |
| | 33 | 2 | 2 | 29 | 9 | 1 | 0 | 10 | 1 | 1 |
| | 27 | 4 | 3 | 20 | 3 | 2 | 0 | 7 | 3 | 0 |
| | 30 | 2 | 1 | 27 | 3 | 3 | 1 | 5 | 3 | 4 |
| | 20 | 2 | 2 | 16 | 4 | 1 | 0 | 6 | 3 | 1 |
| | 29 | 2 | 1 | 26 | 3 | 1 | 3 | 4 | 6 | 2 |
| | 23 | 0 | 2 | 21 | 6 | 0 | 0 | 7 | 5 | 0 |
| | 28 | 1 | 1 | 26 | 9 | 4 | 0 | 8 | 3 | 0 |
| | 11 | 0 | 2 | 9 | 4 | 0 | 0 | 3 | 0 | 0 |
c, compound SSRs; p2, dinucleotide SSRs; p1, mononucleotide SSRs. A, adenine; T, thymine; G, guanine; C, cytosine.
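
To illustrate what the p1 columns count, mononucleotide SSRs such as (A)10 or (T)12 can be located with a regular expression. This is a minimal sketch, not the tool used in the study: the minimum run length of 10 is an assumption inferred from the column labels, the function name `find_mono_ssrs` is hypothetical, and the demo sequence is invented.

```python
import re
from collections import Counter


def find_mono_ssrs(seq: str, min_len: int = 10):
    """Return (0-based start, base, run length) for each mononucleotide run.

    min_len = 10 is an assumed threshold, not a value taken from the paper.
    """
    # One base followed by at least (min_len - 1) copies of the same base.
    pattern = re.compile(r"([ACGT])\1{%d,}" % (min_len - 1))
    return [
        (m.start(), m.group(1), len(m.group(0)))
        for m in pattern.finditer(seq.upper())
    ]


# Invented demo sequence: one (A)10 run, one (T)12 run, and a short (A)5 run
# that falls below the threshold and is therefore ignored.
demo = "GC" + "A" * 10 + "CG" + "T" * 12 + "ACGT" + "A" * 5
hits = find_mono_ssrs(demo)

# Tally the hits using the same labels as the table columns, e.g. "(A)10".
tally = Counter(f"({base}){length}" for _, base, length in hits)
print(hits)
print(dict(tally))
```

Applied to a full chloroplast sequence, a tally of this kind would give the per-class counts of mononucleotide repeats. Compound and dinucleotide SSRs (the c and p2 columns) need additional patterns that are not covered here.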