Jianguo Zhou, Xinlian Chen, Yingxian Cui, Wei Sun, Yonghua Li, Yu Wang, Jingyuan Song, Hui Yao.
Abstract
The family Aristolochiaceae, comprising about 600 species in eight genera, is a unique plant family containing aristolochic acids (AAs). The complete chloroplast genome sequences of Aristolochia debilis and Aristolochia contorta are reported here. The complete chloroplast genomes of A. debilis and A. contorta are circular molecules of 159,793 bp and 160,576 bp, respectively, and have typical quadripartite structures. The GC content of both genomes is 38.3%. A total of 131 genes were identified in each genome, including 85 protein-coding genes, 37 tRNA genes, eight rRNA genes and one pseudogene (ycf1). The simple-sequence repeats mainly comprise A/T mononucleotide repeats. Phylogenetic analyses using maximum parsimony (MP) revealed that A. debilis and A. contorta had a close phylogenetic relationship with species of the family Piperaceae, as well as Laurales and Magnoliales. The data obtained in this study will be beneficial for further investigations of A. debilis and A. contorta in terms of evolution and chloroplast genetic engineering.
Keywords: Aristolochia contorta; Aristolochia debilis; chloroplast genome; molecular structure; phylogenetic analyses
Year: 2017 PMID: 28837061 PMCID: PMC5618488 DOI: 10.3390/ijms18091839
Source DB: PubMed Journal: Int J Mol Sci ISSN: 1422-0067 Impact factor: 5.923
Figure 1. Gene map of the complete chloroplast genome of A. debilis. Genes on the inside of the circle are transcribed clockwise, while those outside are transcribed counterclockwise. The darker gray in the inner circle corresponds to GC content, whereas the lighter gray corresponds to AT content.
Base composition in the chloroplast genomes of A. debilis and A. contorta.
| Species | Region | Position | T(U) (%) | C (%) | A (%) | G (%) | Length (bp) |
|---|---|---|---|---|---|---|---|
| A. debilis | LSC | - | 32.2 | 18.7 | 31.2 | 17.9 | 89,609 |
| A. debilis | SSC | - | 34.0 | 17.4 | 33.2 | 15.5 | 19,834 |
| A. debilis | IRa | - | 28.4 | 22.4 | 28.3 | 21.0 | 25,175 |
| A. debilis | IRb | - | 28.3 | 21.0 | 28.4 | 22.4 | 25,175 |
| A. debilis | Total | - | 31.2 | 19.5 | 30.5 | 18.8 | 159,793 |
| A. debilis | CDS | - | 30.9 | 18.1 | 30.2 | 20.8 | 78,717 |
| A. debilis | - | 1st position | 23.5 | 18.8 | 30.5 | 27.2 | 26,239 |
| A. debilis | - | 2nd position | 32.2 | 20.5 | 29.2 | 18.1 | 26,239 |
| A. debilis | - | 3rd position | 36.9 | 15.1 | 31.1 | 17.0 | 26,239 |
| A. contorta | LSC | - | 32.2 | 18.7 | 31.2 | 17.8 | 89,781 |
| A. contorta | SSC | - | 33.9 | 17.4 | 33.3 | 15.4 | 19,877 |
| A. contorta | IRa | - | 28.4 | 22.4 | 28.2 | 21.0 | 25,459 |
| A. contorta | IRb | - | 28.2 | 21.0 | 28.4 | 22.4 | 25,459 |
| A. contorta | Total | - | 31.2 | 19.5 | 30.6 | 18.8 | 160,576 |
| A. contorta | CDS | - | 30.9 | 18.1 | 30.3 | 20.7 | 78,765 |
| A. contorta | - | 1st position | 23.5 | 18.8 | 30.5 | 27.2 | 26,255 |
| A. contorta | - | 2nd position | 32.2 | 20.6 | 29.2 | 18.1 | 26,255 |
| A. contorta | - | 3rd position | 37.0 | 15.0 | 31.1 | 16.9 | 26,255 |
* CDS: protein-coding regions.
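The region lengths and base percentages reported for the two genomes can be cross-checked with simple arithmetic. The following is a minimal sketch, not part of the original study: the dictionary layout and function names are illustrative assumptions, and the numbers are copied from the base-composition table, with each genome identified by the total length given in the abstract.

```python
# Region lengths (bp) copied from the base-composition table.
# The data layout and helper names are illustrative, not from the paper.
REGION_LENGTHS = {
    "A. debilis": {"LSC": 89_609, "SSC": 19_834, "IRa": 25_175, "IRb": 25_175},
    "A. contorta": {"LSC": 89_781, "SSC": 19_877, "IRa": 25_459, "IRb": 25_459},
}

# Whole-genome base composition (%), copied from the "Total" rows.
TOTAL_COMPOSITION = {
    "A. debilis": {"T": 31.2, "C": 19.5, "A": 30.5, "G": 18.8},
    "A. contorta": {"T": 31.2, "C": 19.5, "A": 30.6, "G": 18.8},
}

# CDS length and per-codon-position length (bp), copied from the table.
CDS_LENGTHS = {
    "A. debilis": (78_717, 26_239),
    "A. contorta": (78_765, 26_255),
}


def quadripartite_length(regions):
    """Sum LSC + SSC + IRa + IRb to get the full circular genome length."""
    return sum(regions[name] for name in ("LSC", "SSC", "IRa", "IRb"))


def gc_percent(composition):
    """GC content is the sum of the C and G percentages."""
    return round(composition["C"] + composition["G"], 1)


if __name__ == "__main__":
    for species, regions in REGION_LENGTHS.items():
        cds_total, per_position = CDS_LENGTHS[species]
        print(species)
        print("  genome length (bp):", quadripartite_length(regions))
        print("  GC content (%):", gc_percent(TOTAL_COMPOSITION[species]))
        print("  CDS is 3 x codon positions:", cds_total == 3 * per_position)
```

Under these inputs the four regions sum to the genome lengths stated in the abstract, and C plus G gives the reported 38.3% for both species.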
Figure 2. Gene map of the complete chloroplast genome of A. contorta. Genes on the inside of the circle are transcribed clockwise, while those outside are transcribed counterclockwise. The darker gray in the inner circle corresponds to GC content, whereas the lighter gray corresponds to AT content.
Gene contents in the chloroplast genomes of A. debilis and A. contorta.
| No. | Group of Genes | Gene names | Amount |
|---|---|---|---|
| 1 | Photosystem I | | 5 |
| 2 | Photosystem II | | 15 |
| 3 | Cytochrome b/f complex | | 6 |
| 4 | ATP synthase | | 6 |
| 5 | NADH dehydrogenase | | 12(1) |
| 6 | RubisCO large subunit | | 1 |
| 7 | RNA polymerase | | 4 |
| 8 | Ribosomal proteins (SSU) | | 14(2) |
| 9 | Ribosomal proteins (LSU) | | 11(2) |
| 10 | Proteins of unknown function | | 5(1) |
| 11 | Transfer RNAs | 37 | 37(7) |
| 12 | Ribosomal RNAs | | 8(4) |
| 13 | Other genes | | 6 |
* Gene contains one intron; ** gene contains two introns; (×2) indicates the number of the repeat unit is 2.
Genes with introns in the chloroplast genomes of A. debilis and A. contorta as well as the lengths of the exons and introns.
| Species | Gene | Location | Exon I (bp) | Intron I (bp) | Exon II (bp) | Intron II (bp) | Exon III (bp) |
|---|---|---|---|---|---|---|---|
| A. debilis | | LSC | 145 | 805 | 410 | - | - |
| A. debilis | | LSC | 71 | 781 | 292 | 678 | 255 |
| A. debilis | | SSC | 552 | 1090 | 540 | - | - |
| A. debilis | | IR | 777 | 705 | 756 | - | - |
| A. debilis | | LSC | 6 | 214 | 642 | - | - |
| A. debilis | | LSC | 6 | 485 | 476 | - | - |
| A. debilis | | LSC | 8 | 1065 | 403 | - | - |
| A. debilis | | IR | 391 | 657 | 431 | - | - |
| A. debilis | | LSC | 430 | 776 | 1622 | - | - |
| A. debilis | | LSC | 114 | - | 232 | 536 | 26 |
| A. debilis | | LSC | 46 | 853 | 191 | - | - |
| A. debilis | | IR | 38 | 809 | 35 | - | - |
| A. debilis | | LSC | 24 | 761 | 48 | - | - |
| A. debilis | | IR | 37 | 937 | 35 | - | - |
| A. debilis | | LSC | 37 | 2658 | 35 | - | - |
| A. debilis | | LSC | 35 | 521 | 50 | - | - |
| A. debilis | | LSC | 39 | 597 | 37 | - | - |
| A. debilis | | LSC | 126 | 777 | 228 | 753 | 147 |
| A. contorta | | LSC | 145 | 771 | 410 | - | - |
| A. contorta | | LSC | 71 | 821 | 292 | 664 | 255 |
| A. contorta | | SSC | 552 | 1091 | 540 | - | - |
| A. contorta | | IR | 777 | 716 | 756 | - | - |
| A. contorta | | LSC | 6 | 214 | 642 | - | - |
| A. contorta | | LSC | 7 | 485 | 476 | - | - |
| A. contorta | | LSC | 8 | 1088 | 403 | - | - |
| A. contorta | | IR | 391 | 657 | 431 | - | - |
| A. contorta | | LSC | 430 | 776 | 1619 | - | - |
| A. contorta | | LSC | 114 | - | 232 | 536 | 26 |
| A. contorta | | LSC | 46 | 832 | 221 | - | - |
| A. contorta | | IR | 38 | 809 | 35 | - | - |
| A. contorta | | LSC | 24 | 751 | 48 | - | - |
| A. contorta | | IR | 37 | 938 | 35 | - | - |
| A. contorta | | LSC | 37 | 2648 | 35 | - | - |
| A. contorta | | LSC | 35 | 552 | 50 | - | - |
| A. contorta | | LSC | 39 | 605 | 37 | - | - |
| A. contorta | | LSC | 126 | 764 | 228 | 760 | 147 |
Figure 3. Comparison of the borders of the LSC, SSC and IR regions among four chloroplast genomes. Numbers above the gene features indicate the distance between the ends of genes and the border sites. The IRb/SSC border extends into the ycf1 genes, creating ycf1 pseudogenes of various lengths among the four chloroplast genomes. These features are not to scale.
Figure 4. Codon content of the 20 amino acids and stop codons in all protein-coding genes of the chloroplast genomes of two Aristolochia species. The histogram on the left-hand side of each amino acid shows codon usage within the A. debilis chloroplast genome, while the right-hand side shows that of A. contorta.
Figure 5. Repeat sequences in six chloroplast genomes. REPuter was used to identify repeat sequences with length ≥30 bp and sequence identity ≥90% in the chloroplast genomes. F, P, R, and C indicate the repeat types F (forward), P (palindrome), R (reverse), and C (complement), respectively. Repeats with different lengths are indicated in different colours.
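The four repeat types named in the caption differ only in how the second copy relates to the first. The sketch below illustrates those relationships for exact matches only; it is not REPuter and does not reproduce its algorithm or its mismatch tolerance. The function name `repeat_type` is a hypothetical helper introduced for illustration.

```python
from typing import Optional

# Translation table for complementing DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def repeat_type(first: str, second: str) -> Optional[str]:
    """Classify how `second` relates to `first`, for exact matches only.

    F: identical copy (forward)
    P: reverse complement (palindrome)
    R: reversed copy (reverse)
    C: complemented copy (complement)
    Returns None when no exact relationship holds.
    """
    if second == first:
        return "F"
    if second == first[::-1].translate(COMPLEMENT):
        return "P"
    if second == first[::-1]:
        return "R"
    if second == first.translate(COMPLEMENT):
        return "C"
    return None


if __name__ == "__main__":
    unit = "AACG"
    for other in ("AACG", "CGTT", "GCAA", "TTGC", "AAAA"):
        print(unit, other, repeat_type(unit, other))
```

A real analysis would additionally apply the length and identity thresholds stated in the caption, which this toy classifier ignores.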
Types and amounts of SSRs in the A. debilis and A. contorta chloroplast genomes.
| SSR Type | Repeat Unit | Amount (A. debilis) | Amount (A. contorta) | Ratio (%) (A. debilis) | Ratio (%) (A. contorta) |
|---|---|---|---|---|---|
| Mono | A/T | 78 | 91 | 96.3 | 94.8 |
| | C/G | 3 | 5 | 3.7 | 5.2 |
| Di | AC/GT | 0 | 1 | 0 | 3.6 |
| | AG/CT | 0 | 1 | 0 | 3.6 |
| | AT/TA | 19 | 26 | 100 | 92.8 |
| Tri | AAC/GTT | 1 | 1 | 10 | 8.3 |
| | AAG/CTT | 1 | 1 | 10 | 8.3 |
| | ATC/ATG | 1 | 0 | 10 | 0 |
| | AAT/ATT | 7 | 10 | 70 | 83.4 |
| Tetra | AAAC/GTTT | 2 | 2 | 16.7 | 14.3 |
| | AAAT/ATTT | 4 | 5 | 33.3 | 35.7 |
| | AATC/ATTG | 1 | 1 | 8.3 | 7.1 |
| | AGAT/ATCT | 2 | 1 | 16.7 | 7.1 |
| | AATT/AATT | 0 | 1 | 0 | 7.1 |
| | ACAT/ATGT | 0 | 1 | 0 | 7.1 |
| | AACT/AGTT | 1 | 1 | 8.3 | 7.1 |
| | AATG/ATTC | 2 | 2 | 16.7 | 14.3 |
| Penta | AATAT/ATATT | 2 | 2 | 33.3 | 50 |
| | AAATT/AATTT | 1 | 0 | 16.7 | 0 |
| | AAATC/ATTTG | 1 | 0 | 16.7 | 0 |
| | AACAT/ATGTT | 0 | 1 | 0 | 25 |
| | AAAAT/ATTTT | 2 | 1 | 33.3 | 25 |
| Hexa | AAATAG/ATTTCT | 0 | 1 | 0 | 50 |
| | ACATAT/ATATGT | 0 | 1 | 0 | 50 |
| | ACTGAT/AGTATC | 1 | 0 | 100 | 0 |
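
The ratio columns in the SSR table appear to be each repeat unit's share of all SSRs of the same type within one species. The sketch below recomputes that share for the mono- and dinucleotide rows. It assumes the two count columns are ordered A. debilis then A. contorta, following the table title; the dictionary layout and the function name `ratios_within_type` are illustrative.

```python
# Counts copied from the SSR table as (first column, second column).
# Assumption: the columns are ordered (A. debilis, A. contorta), as in the title.
SSR_COUNTS = {
    "Mono": {"A/T": (78, 91), "C/G": (3, 5)},
    "Di": {"AC/GT": (0, 1), "AG/CT": (0, 1), "AT/TA": (19, 26)},
}


def ratios_within_type(counts_by_unit, species_index):
    """Return each repeat unit's percentage of all SSRs of the same type."""
    total = sum(counts[species_index] for counts in counts_by_unit.values())
    return {
        unit: round(100 * counts[species_index] / total, 1)
        for unit, counts in counts_by_unit.items()
    }


if __name__ == "__main__":
    for ssr_type, counts_by_unit in SSR_COUNTS.items():
        for index, species in enumerate(("A. debilis", "A. contorta")):
            print(ssr_type, species, ratios_within_type(counts_by_unit, index))
```

Recomputed values can differ from the published ones in the last digit, depending on whether the original figures were rounded or truncated.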
Figure 6. Sequence identity plot comparing the five chloroplast genomes with A. debilis as a reference by using mVISTA. Grey arrows and thick black lines above the alignment indicate genes with their orientation and the position of the IRs, respectively. A cut-off of 70% identity was used for the plots, and the Y-scale represents the percent identity ranging from 50% to 100%.
Figure 7. Phylogenetic tree constructed using the maximum parsimony (MP) method based on 60 protein-coding genes from different species. Numbers at nodes are bootstrap support values.