Josphat K Saina, Zhi-Zhong Li, Andrew W Gichira, Yi-Ying Liao.
Abstract
Ailanthus altissima (Mill.) Swingle (Simaroubaceae) is a deciduous tree widely distributed throughout temperate regions in China, which makes it suitable for genetic diversity and evolutionary studies. Previous studies of A. altissima have mainly focused on its biological activities, genetic diversity, and genetic structure. However, no genome has yet been published for this species or for any member of the Simaroubaceae family. In this paper, we therefore characterized the complete chloroplast genome sequence of A. altissima for the first time. The tree of heaven chloroplast genome is a circular molecule 160,815 base pairs (bp) in size with a quadripartite structure. It contains 113 unique genes, of which 79 are protein-coding genes, 30 are transfer RNA (tRNA) genes, and 4 are ribosomal RNA (rRNA) genes, with an overall GC content of 37.6%. Microsatellite marker detection identified A/T mononucleotides as the majority of SSRs in all seven analyzed genomes. Repeat analyses of seven Sapindales species revealed a total of 49 repeats each in A. altissima, Rhus chinensis, Dodonaea viscosa, and Leitneria floridana, while Azadirachta indica, Boswellia sacra, and Citrus aurantiifolia had a total of 48 repeats each. The phylogenetic analysis using protein-coding genes revealed that A. altissima is sister to Leitneria floridana and also suggested that Simaroubaceae is sister to the Rutaceae family. The genome information reported here could be further applied in evolutionary and invasion studies, population genetics, and molecular studies of this plant species and family.
Keywords: Ailanthus altissima; Sapindales; Simaroubaceae; chloroplast genome; microsatellites
Year: 2018 PMID: 29561773 PMCID: PMC5979363 DOI: 10.3390/ijms19040929
Source DB: PubMed Journal: Int J Mol Sci ISSN: 1422-0067 Impact factor: 5.923
Figure 1. Circular gene map of the A. altissima complete chloroplast genome. Genes drawn on the outside of the circle are transcribed counterclockwise, whereas those inside are transcribed clockwise. The light gray in the inner circle corresponds to AT content, while the darker gray corresponds to GC content. Large single copy (LSC), inverted repeats (IRa and IRb), and small single copy (SSC) regions are indicated.
List of genes found in the Ailanthus altissima chloroplast genome.
| Functional Category | Group of Genes | Gene Name | Number |
|---|---|---|---|
| Self-replication | rRNA genes | | 4 |
| | tRNA genes | | 30 |
| | Ribosomal small subunit | | 12 |
| | Ribosomal large subunit | | 9 |
| | DNA-dependent RNA polymerase | | 4 |
| Photosynthesis | Large subunit of rubisco | | 1 |
| | Photosystem I | | 6 |
| | Photosystem II | | 15 |
| | NADH dehydrogenase | | 11 |
| | Cytochrome b/f complex | | 6 |
| | ATP synthase | | 6 |
| Other | Maturase | | 1 |
| | Subunit of acetyl-CoA carboxylase | | 1 |
| | Envelope membrane protein | | 1 |
| | Protease | | 1 |
| | c-type cytochrome synthesis | | 1 |
| Functions unknown | Conserved open reading frames ( | | 4 |
| Total | | | 113 |
Note: * Gene with one intron, ** Genes with two introns. (×2) Genes with two copies.
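As a quick consistency check on the table above, the per-group counts can be summed and compared with the totals stated in the abstract (113 unique genes, 79 of them protein coding). The sketch below is illustrative only; the dictionary name `gene_counts` and the split into RNA and protein-coding groups are assumptions made for this example.

```python
# Counts copied from the gene table (group of genes -> number of unique genes).
gene_counts = {
    "rRNA genes": 4,
    "tRNA genes": 30,
    "Ribosomal small subunit": 12,
    "Ribosomal large subunit": 9,
    "DNA-dependent RNA polymerase": 4,
    "Large subunit of rubisco": 1,
    "Photosystem I": 6,
    "Photosystem II": 15,
    "NADH dehydrogenase": 11,
    "Cytochrome b/f complex": 6,
    "ATP synthase": 6,
    "Maturase": 1,
    "Subunit of acetyl-CoA carboxylase": 1,
    "Envelope membrane protein": 1,
    "Protease": 1,
    "c-type cytochrome synthesis": 1,
    "Conserved open reading frames": 4,
}

# Assumption for this example: every group other than rRNA and tRNA
# is counted as protein coding.
RNA_GROUPS = {"rRNA genes", "tRNA genes"}

total = sum(gene_counts.values())
protein_coding = sum(n for group, n in gene_counts.items()
                     if group not in RNA_GROUPS)

print(total)           # 113
print(protein_coding)  # 79
```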
Figure 2. Comparison of IR, LSC, and SSC junction positions among seven chloroplast genomes. The features drawn are not to scale. The symbol ᵠ denotes a pseudogene created by extension of the IRb/SSC border into ycf1 genes. Colored boxes represent gene positions.
Figure 3. Gene arrangement map of seven chloroplast genomes representing families from Sapindales and one reference species (Aquilaria sinensis), aligned using Mauve software. Locally collinear blocks within each alignment are represented as blocks of similar color connected with lines. Annotations of rRNA, protein-coding, and tRNA genes are shown in red, white, and green boxes, respectively.
Figure 4. Amino acid frequencies in A. altissima chloroplast genome protein-coding sequences.
Predicted RNA editing sites in the A. altissima chloroplast genome.
| Gene | Nucleotide Position | Amino Acid Position | Codon Conversion | Amino Acid Conversion | Score |
|---|---|---|---|---|---|
| | 818 | 273 | T | S ≥ L | 0.80 |
| | 92 | 31 | C | P ≥ L | 0.86 |
| | 353 | 118 | T | S ≥ L | 1.00 |
| | 403 | 135 | | | 0.86 |
| | 80 | 27 | T | S ≥ L | 1.00 |
| | 149 | 50 | T | S ≥ L | 1.00 |
| | 145 | 49 | | L ≥ F | 1.00 |
| | 556 | 186 | | H ≥ Y | 1.00 |
| | 319 | 107 | | L ≥ F | 0.86 |
| | 457 | 153 | | H ≥ Y | 1.00 |
| | 643 | 215 | | H ≥ Y | 1.00 |
| | 1246 | 416 | | H ≥ Y | 1.00 |
| | 107 | 36 | C | P ≥ L | 1.00 |
| | 341 | 114 | T | S ≥ L | 1.00 |
| | 566 | 189 | T | S ≥ L | 1.00 |
| | 1073 | 358 | T | S ≥ F | 1.00 |
| | 149 | 50 | T | S ≥ L | 1.00 |
| | 467 | 156 | C | P ≥ L | 1.00 |
| | 586 | 196 | | H ≥ Y | 1.00 |
| | 611 | 204 | T | S ≥ L | 0.80 |
| | 746 | 249 | T | S ≥ F | 1.00 |
| | 830 | 277 | T | S ≥ L | 1.00 |
| | 836 | 279 | T | S ≥ L | 1.00 |
| | 1255 | 419 | | H ≥ Y | 1.00 |
| | 1481 | 494 | C | P ≥ L | 1.00 |
| | 2 | 1 | A | T ≥ M | 1.00 |
| | 313 | 105 | | R ≥ W | 0.80 |
| | 383 | 128 | T | S ≥ L | 1.00 |
| | 674 | 225 | T | S ≥ L | 1.00 |
| | 878 | 293 | T | S ≥ L | 1.00 |
| | 887 | 296 | C | P ≥ L | 1.00 |
| | 1076 | 359 | G | A ≥ V | 1.00 |
| | 1298 | 433 | T | S ≥ L | 0.80 |
| | 1310 | 437 | T | S ≥ L | 0.80 |
| | 290 | 97 | T | S ≥ L | 1.00 |
| | 586 | 196 | | L ≥ F | 0.80 |
| | 1919 | 640 | G | A ≥ V | 0.80 |
| | 166 | 56 | | H ≥ Y | 0.80 |
| | 320 | 107 | A | T ≥ I | 0.80 |
| | 119 | 40 | C | P ≥ L | 0.86 |
| | 77 | 26 | T | S ≥ F | 1.00 |
| | 308 | 103 | T | S ≥ L | 0.86 |
| | 830 | 277 | T | S ≥ L | 1.00 |
| | 338 | 113 | T | S ≥ F | 1.00 |
| | 551 | 184 | T | S ≥ L | 1.00 |
| | 566 | 189 | T | S ≥ L | 1.00 |
| | 2426 | 809 | T | S ≥ L | 0.86 |
| | 41 | 14 | T | S ≥ L | 1.00 |
| | 1681 | 561 | | H ≥ Y | 0.86 |
| | 2030 | 677 | A | T ≥ I | 1.00 |
| | 2314 | 772 | | R ≥ W | 1.00 |
| | 4183 | 1395 | | L ≥ F | 0.80 |
| | 248 | 83 | T | S ≥ L | 1.00 |
| | 209 | 70 | T | S ≥ L | 0.83 |
The cytidines marked are putatively edited to uridines.
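The relation between the nucleotide and amino acid positions in the table, and the effect of a C-to-U edit on a codon, can be sketched as follows. This is a minimal illustration and not the PREP prediction algorithm: it assumes 1-based positions within the coding sequence, the standard genetic code, and edits written as C to T in the DNA alphabet. The helper names `locate` and `edit_codon` are hypothetical, and pairing nucleotide position 92 with the codon CCA is an assumption based on the matching amino acid position 31 in the table of shared editing sites.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa
               for c, aa in zip(product(BASES, repeat=3), AMINO)}

def locate(nt_pos: int) -> tuple:
    """Map a 1-based CDS nucleotide position to
    (1-based amino acid position, 1-based position within the codon)."""
    return (nt_pos - 1) // 3 + 1, (nt_pos - 1) % 3 + 1

def edit_codon(codon: str, pos_in_codon: int) -> tuple:
    """Apply a C-to-U edit (written C -> T) at the given codon position.
    Returns (edited codon, original amino acid, edited amino acid)."""
    i = pos_in_codon - 1
    if codon[i] != "C":
        raise ValueError("only cytidines can be edited in this sketch")
    edited = codon[:i] + "T" + codon[i + 1:]
    return edited, CODON_TABLE[codon], CODON_TABLE[edited]

aa_pos, codon_pos = locate(92)
print(aa_pos, codon_pos)             # 31 2
print(edit_codon("CCA", codon_pos))  # ('CTA', 'P', 'L')
print(locate(2), edit_codon("ACG", 2))  # (1, 2) ('ATG', 'T', 'M')
```

The amino acid position is simply the nucleotide position divided by three and rounded up, which reproduces every position pair listed in the table (for example 818 and 273, or 4183 and 1395).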
List of RNA editing sites shared by the seven plastomes, as predicted by the PREP program.
| Gene | A.A Position | Codon (A.A) Conversion | | | | | | |
|---|---|---|---|---|---|---|---|---|
| | 31 | CCA (P) ≥ CTA (L) | CCA (P) ≥ CTA (L) | CCA (P) ≥ CTA (L) | CCA (P) ≥ CTA (L) | CCA (P) ≥ CTA (L) | CCA (P) ≥ CTA (L) | CCA (P) ≥ CTA (L) |
| | 187 | CAT (H) ≥ TAT (Y) | CAT (H) ≥ TAT (Y) | CAT (H) ≥ TAT (Y) | CAT (H) ≥ TAT (Y) | CAT (H) ≥ TAT (Y) | CAT (H) ≥ TAT (Y) | CAT (H) ≥ TAT (Y) |
| | | CAT (H) ≥ TAT (Y) | CAT (H) ≥ TAT (Y) | CAT (H) ≥ TAT (Y) | CAT (H) ≥ TAT (Y) | CAT (H) ≥ TAT (Y) | CAT (H) ≥ TAT (Y) | CAT (H) ≥ TAT (Y) |
| | 358 | TCA (S) ≥ TTA (L) | TCA (S) ≥ TTA (L) | TCA (S) ≥ TTA (L) | TCA (S) ≥ TTA (L) | TCA (S) ≥ TTA (L) | TCA (S) ≥ TTA (L) | TCA (S) ≥ TTA (L) |
| | 50 | TCA (S) ≥ TTA (L) | TCA (S) ≥ TTA (L) | TCA (S) ≥ TTA (L) | TCA (S) ≥ TTA (L) | TCA (S) ≥ TTA (L) | TCA (S) ≥ TTA (L) | TCA (S) ≥ TTA (L) |
| | 156 | CCA (P) ≥ CTA (L) | CCA (P) ≥ CTA (L) | CCA (P) ≥ CTA (L) | CCA (P) ≥ CTA (L) | CCA (P) ≥ CTA (L) | CCA (P) ≥ CTA (L) | CCA (P) ≥ CTA (L) |
| | 196 | CAT (H) ≥ TAT (Y) | CAT (H) ≥ TAT (Y) | CAT (H) ≥ TAT (Y) | CAT (H) ≥ TAT (Y) | CAT (H) ≥ TAT (Y) | CAT (H) ≥ TAT (Y) | CAT (H) ≥ TAT (Y) |
| | 249 | TCT (S) ≥ TTT (F) | TCT (S) ≥ TTT (F) | TCT (S) ≥ TTT (F) | TCT (S) ≥ TTT (F) | TCT (S) ≥ TTT (F) | TCT (S) ≥ TTT (F) | TCT (S) ≥ TTT (F) |
| | 419 | CAT (H) ≥ TAT (Y) | CAT (H) ≥ TAT (Y) | CAT (H) ≥ TAT (Y) | CAT (H) ≥ TAT (Y) | CAT (H) ≥ TAT (Y) | CAT (H) ≥ TAT (Y) | CAT (H) ≥ TAT (Y) |
| | 1 | ACG (T) ≥ ATG (M) | ACG (T) ≥ ATG (M) | ACG (T) ≥ ATG (M) | ACG (T) ≥ ATG (M) | ACG (T) ≥ ATG (M) | ACG (T) ≥ ATG (M) | ACG (T) ≥ ATG (M) |
| | 128 | TCA (S) ≥ TTA (L) | TCA (S) ≥ TTA (L) | TCA (S) ≥ TTA (L) | TCA (S) ≥ TTA (L) | TCA (S) ≥ TTA (L) | TCA (S) ≥ TTA (L) | TCA (S) ≥ TTA (L) |
| | 107 | ACA (T) ≥ ATA (I) | ACA (T) ≥ ATA (I) | ACA (T) ≥ ATA (I) | ACA (T) ≥ ATA (I) | ACA (T) ≥ ATA (I) | ACA (T) ≥ ATA (I) | ACA (T) ≥ ATA (I) |
| | 278 | TCA (S) ≥ TTA (L) | TCA (S) ≥ TTA (L) | TCA (S) ≥ TTA (L) | TCA (S) ≥ TTA (L) | TCA (S) ≥ TTA (L) | TCA (S) ≥ TTA (L) | TCA (S) ≥ TTA (L) |
| | 113 | TCT (S) ≥ TTT (F) | TCT (S) ≥ TTT (F) | TCT (S) ≥ TTT (F) | TCT (S) ≥ TTT (F) | TCT (S) ≥ TTT (F) | TCT (S) ≥ TTT (F) | TCT (S) ≥ TTT (F) |
| | 184 | TCA (S) ≥ TTA (L) | TCA (S) ≥ TTA (L) | TCA (S) ≥ TTA (L) | TCA (S) ≥ TTA (L) | TCA (S) ≥ TTA (L) | TCA (S) ≥ TTA (L) | TCA (S) ≥ TTA (L) |
| | 809 | TCG (S) ≥ TTG (L) | TCG (S) ≥ TTG (L) | TCG (S) ≥ TTG (L) | TCG (S) ≥ TTG (L) | TCG (S) ≥ TTG (L) | TCG (S) ≥ TTG (L) | TCG (S) ≥ TTG (L) |
| | 14 | TCA (S) ≥ TTA (L) | TCA (S) ≥ TTA (L) | TCA (S) ≥ TTA (L) | TCA (S) ≥ TTA (L) | TCA (S) ≥ TTA (L) | TCA (S) ≥ TTA (L) | TCA (S) ≥ TTA (L) |
| | 563 | CAT (H) ≥ TAT (Y) | CAT (H) ≥ TAT (Y) | CAT (H) ≥ TAT (Y) | CAT (H) ≥ TAT (Y) | CAT (H) ≥ TAT (Y) | CAT (H) ≥ TAT (Y) | CAT (H) ≥ TAT (Y) |
| | 27 | TCA (S) ≥ TTA (L) | TCA (S) ≥ TTA (L) | TCA (S) ≥ TTA (L) | TCA (S) ≥ TTA (L) | TCA (S) ≥ TTA (L) | TCA (S) ≥ TTA (L) | TCA (S) ≥ TTA (L) |
Figure 5. Simple sequence repeat (SSR) types, distribution, and presence in A. altissima and other representative species from Sapindales. (A) Number of detected SSR motifs in different repeat types in the A. altissima chloroplast genome. (B) Number of identified repeat sequences in seven chloroplast genomes. (C) Number of different SSR types in seven representative species. F: forward; P: palindromic; R: reverse; C: complement. P1, P2, P3, P4, and P5 indicate mono-, di-, tri-, tetra-, and pentanucleotides, respectively.
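To make the two summary statistics used here concrete (overall GC content and mononucleotide SSRs), the sketch below computes both for a toy sequence. It is not the pipeline used in the study: the function names are hypothetical, the toy sequence is invented, and the minimum run length of 10 bases is an assumed threshold chosen for illustration.

```python
import re

def gc_content(seq: str) -> float:
    """Percentage of G and C bases in a DNA sequence."""
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)

def mononucleotide_ssrs(seq: str, min_repeats: int = 10) -> list:
    """Return (1-based start, base, run length) for every homopolymer
    run of at least `min_repeats` bases (assumed threshold)."""
    # One base, followed by the same base at least (min_repeats - 1) times.
    pattern = re.compile(r"([ACGT])\1{%d,}" % (min_repeats - 1))
    return [(m.start() + 1, m.group(1), len(m.group(0)))
            for m in pattern.finditer(seq.upper())]

# Toy sequence (not real chloroplast data): a 12-base A run, a 10-base
# T run, and a 5-base C run that falls below the threshold.
toy = "GC" + "A" * 12 + "CG" + "T" * 10 + "GG" + "C" * 5

ssrs = mononucleotide_ssrs(toy)
print(ssrs)                       # [(3, 'A', 12), (17, 'T', 10)]
print(round(gc_content(toy), 1))  # 33.3
```

Counting the reported runs by base gives the kind of A/T versus G/C breakdown summarized in the figure.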
Distribution and localization of repetitive sequences (F, forward; P, palindromic; R, reverse) in the A. altissima chloroplast genome.
| Number | Size (bp) | Position 1 | Type | Position 2 | Location 1 (2) | Region |
|---|---|---|---|---|---|---|
| 1 | 48 | 95,957 | F | 95,975 | | IRa |
| 2 | 48 | 153,174 | F | 153,192 | | IRb |
| 3 | 37 | 103,326 | F | 125,821 | | IRa/SSC |
| 4 | 30 | 95,957 | F | 95,993 | | IRa |
| 5 | 30 | 153,174 | F | 153,210 | | IRb |
| 6 | 29 | 50,944 | F | 50,972 | | LSC |
| 7 | 29 | 58,040 | F | 58,078 | | LSC |
| 8 | 28 | 115,434 | F | 115,460 | | SSC |
| 9 | 26 | 39,399 | F | 39,625 | | LSC |
| 10 | 25 | 71,153 | F | 71,178 | | LSC |
| 11 | 23 | 47,036 | F | 103,323 | | LSC/IRa |
| 12 | 23 | 112,456 | F | 112,488 | | IRa |
| 13 | 23 | 136,686 | F | 136,718 | | IRb |
| 14 | 22 | 11,749 | F | 11,771 | | LSC |
| 15 | 21 | 248 | F | 270 | | LSC |
| 16 | 21 | 9541 | F | 38,293 | | LSC |
| 17 | 21 | 41,956 | F | 44,180 | | LSC |
| 18 | 21 | 49,678 | F | 49,699 | | LSC |
| 19 | 20 | 1945 | F | 1965 | | LSC |
| 20 | 20 | 15,166 | F | 92,503 | | LSC |
| 21 | 20 | 47,039 | F | 125,821 | | LSC/IRa |
| 22 | 20 | 88,907 | F | 160,270 | | IRa/IRb |
| 25 | 48 | 31,790 | P | 31,790 | | LSC |
| 26 | 48 | 95,957 | P | 153,174 | | IRa/IRb |
| 27 | 48 | 95,975 | P | 153,192 | | IRa/IRb |
| 28 | 37 | 125,821 | P | 145,834 | | SSC/IRb |
| 29 | 36 | 30,970 | P | 30,970 | | LSC |
| 30 | 30 | 72,117 | P | 72,117 | | LSC |
| 31 | 30 | 95,957 | P | 153,174 | | IRa/IRb |
| 32 | 30 | 95,993 | P | 153,210 | | IRa/IRb |
| 33 | 27 | 542 | P | 571 | | LSC |
| 34 | 25 | 11,403 | P | 11,430 | | LSC |
| 35 | 24 | 4867 | P | 4897 | | LSC |
| 36 | 24 | 9535 | P | 48,164 | | LSC |
| 37 | 23 | 47,036 | P | 145,851 | | LSC/IRb |
| 38 | 23 | 51,804 | P | 119,066 | | LSC/SSC |
| 39 | 23 | 112,456 | P | 136,686 | | IRa/IRb |
| 40 | 23 | 112,488 | P | 136,718 | | IRa/IRb |
| 41 | 22 | 39,195 | P | 39,195 | | LSC |
| 42 | 20 | 15,166 | P | 156,674 | | LSC/IRb |
| 43 | 20 | 38,361 | P | 48,100 | | LSC |
| 44 | 20 | 88,907 | P | 88,907 | | IRa |
| 45 | 20 | 107,097 | P | 107,130 | | IRa |
| 46 | 23 | 39,184 | R | 39,184 | | LSC |
| 47 | 21 | 9751 | R | 9751 | | LSC |
| 48 | 21 | 51,281 | R | 51,281 | | LSC |
| 49 | 21 | 85,055 | R | 85,055 | | LSC |
| 50 | 20 | 53,712 | R | 53,712 | | LSC |
| 51 | 20 | 9385 | R | 13,356 | | LSC |
F: forward; P: palindromic; R: reverse. * intron; ** introns.
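The repeat types in the table differ in how the second copy relates to the first. The sketch below classifies a pair of equal-length sequences under the usual definitions (forward: identical; reverse: reversed; complement: complemented; palindromic: reverse complemented). It is an illustration with invented sequences and a hypothetical function name, not the repeat-finding software used in the study, and it only handles exact matches.

```python
# Translation table for complementing DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def repeat_type(first: str, second: str):
    """Classify the relation between two equal-length DNA strings.
    Returns 'F', 'R', 'C', 'P', or None if no exact relation holds."""
    first, second = first.upper(), second.upper()
    comp = first.translate(COMPLEMENT)
    if second == first:
        return "F"   # forward: same sequence, same strand
    if second == first[::-1]:
        return "R"   # reverse: same strand, read backwards
    if second == comp:
        return "C"   # complement: opposite bases, same direction
    if second == comp[::-1]:
        return "P"   # palindromic: reverse complement
    return None

print(repeat_type("ATTCG", "ATTCG"))  # F
print(repeat_type("ATTCG", "GCTTA"))  # R
print(repeat_type("ATTCG", "TAAGC"))  # C
print(repeat_type("ATTCG", "CGAAT"))  # P
```

The checks run in a fixed order, so a sequence that satisfies more than one relation (for example one equal to its own reverse complement) is reported under the first matching type.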
Figure 6. Phylogenetic tree of 31 Sapindales species with three outgroup Malvales species, inferred by maximum likelihood (ML) from common protein-coding genes. The position of A. altissima is shown in bold, and bootstrap support values are shown at each node.