Xiaojun Nie, Shuzuo Lv, Yingxin Zhang, Xianghong Du, Le Wang, Siddanagouda S Biradar, Xiufang Tan, Fanghao Wan, Song Weining.
Abstract
BACKGROUND: Crofton weed (Ageratina adenophora) is one of the most hazardous invasive plant species, causing serious economic losses and environmental damage worldwide. However, sequence resources and genome information for A. adenophora are rather limited, making phylogenetic identification and evolutionary studies very difficult. Here, we report the complete sequence of the A. adenophora chloroplast (cp) genome based on Illumina sequencing. METHODOLOGY/PRINCIPAL
Year: 2012 PMID: 22606302 PMCID: PMC3350484 DOI: 10.1371/journal.pone.0036869
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Figure 1. Chloroplast genome map of A. adenophora.
Genes drawn outside the outer circle are transcribed clockwise, whereas those drawn inside are transcribed counterclockwise. Genes belonging to different functional groups are color coded. In the innermost ring, the darker gray corresponds to GC content and the lighter gray to AT content.
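The GC/AT ring in a genome map of this kind is usually a windowed base-composition profile. The following is a minimal sketch of that calculation, assuming a plain DNA string as input; the function names, the window and step sizes, and the toy sequence are illustrative assumptions and are not taken from the paper or from the software used to draw the figure.

```python
def gc_content(seq: str) -> float:
    """Return the fraction of G and C bases in seq (case-insensitive)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def windowed_gc(seq: str, window: int, step: int) -> list[tuple[int, float]]:
    """Return (0-based window start, GC fraction) for each full window."""
    return [
        (start, gc_content(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]


# Toy sequence for illustration only; it is not A. adenophora data.
toy = "ATGCGCGCATATATATGGCC"
profile = windowed_gc(toy, window=10, step=5)
for start, gc in profile:
    print(f"window at {start}: GC = {gc:.2f}, AT = {1 - gc:.2f}")
```

For a real genome the same functions could be applied to the sequence read from a FASTA file, with a window of a few hundred bases.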
Genes present in the A. adenophora cp genome.
| | Gene products | |
| 1 | Photosystem I | psaA, B, C, I, J, ycf3 |
| 2 | Photosystem II | psbA, B, C, D, E, F, H, I, J, K, L, M, N, T, Z/lhbA |
| 3 | Cytochrome b6/f | petA, B |
| 4 | ATP synthase | atpA, B, E, F |
| 5 | Rubisco | rbcL |
| 6 | NADH oxidoreductase | ndhA |
| 7 | Large subunit ribosomal proteins | rpl2 |
| 8 | Small subunit ribosomal proteins | rps2, 3, 4, 7 |
| 9 | RNAP | rpoA, rpoB, C1 |
| 10 | Other proteins | accD, ccsA, cemA, clpP |
| 11 | Proteins of unknown function | ycf1 |
| 12 | Ribosomal RNAs | rrn23 |
| 13 | Transfer RNAs | trnA(UGC) |
Gene containing two introns.
Gene containing a single intron.
Two gene copies in the IRs.
Gene divided into two independent transcription units.
Pseudogene.
Genes containing introns in the A. adenophora cp genome, with the lengths of their exons and introns.
| Gene | Location | Exon I (bp) | Intron I (bp) | Exon II (bp) | Intron II (bp) | Exon III (bp) |
| | LSC | 215 | 1092 | 40 | | |
| | LSC | 1082 | 696 | 646 | 47 | 342 |
| | LSC | 145 | 852 | 410 | | |
| | LSC | 49 | 898 | 227 | 670 | 125 |
| | LSC | 228 | 634 | 291 | 809 | 71 |
| | LSC | 6 | 774 | 642 | | |
| | LSC | 123 | - | 768 | | |
| | LSC | 8 | 715 | 475 | | |
| | IR | 474 | 457 | 351 | | |
| | IR | 756 | 671 | 777 | | |
| | SSC | 552 | 1097 | 540 | | |
| | LSC | 35 | 1559 | 35 | | |
| | IR | 38 | 645 | 31 | | |
| | LSC | 37 | 437 | 50 | | |
| | LSC | 37 | 574 | 38 | | |
| | LSC | 47 | 727 | 23 | | |
| | IR | 32 | 739 | 35 | | |
rps12 is a trans-spliced gene, with the 5′-end exon located in the LSC region and the duplicated 3′-end exon located in the IR regions.
The codon–anticodon recognition pattern and codon usage for A. adenophora cp genome.
| Amino acid | Codon | No. | tRNA | Amino acid | Codon | No. | tRNA | Amino acid | Codon | No. | tRNA |
| Phe | UUU | 889 | trnF-GAA | Ser | UCU | 581 | trnS-GGA | Tyr | UAU | 736 | trnY-GUA |
| Phe | UUC | 510 | | Ser | UCC | 297 | | Tyr | UAC | 174 | |
| Leu | UUA | 803 | trnL-UAA | Ser | UCA | 380 | | Stop | UAA | 54 | |
| Leu | UUG | 560 | trnL-CAA | Ser | UCG | 164 | | Stop | UAG | 21 | |
| Leu | CUU | 567 | trnL-UAG | Pro | CCU | 411 | trnP-UGG | His | CAU | 458 | trnH-GUG |
| Leu | CUC | 185 | | Pro | CCC | 186 | | His | CAC | 149 | |
| Leu | CUA | 370 | | Pro | CCA | 306 | | Gln | CAA | 674 | trnQ-UUG |
| Leu | CUG | 157 | | Pro | CCG | 146 | | Gln | CAG | 204 | |
| Ile | AUU | 1016 | trnI-GAU | Thr | ACU | 524 | trnT-GGU | Asn | AAU | 882 | trnN-GUU |
| Ile | AUC | 429 | | Thr | ACC | 233 | | Asn | AAC | 279 | |
| Ile | AUA | 642 | trnI-CAU | Thr | ACA | 382 | trnT-UGU | Lys | AAA | 888 | trnK-UUU |
| Met | AUG | 615 | trnM-CAU, trnfM-CAU | Thr | ACG | 126 | | Lys | AAG | 314 | |
| Val | GUU | 495 | trnV-GAC | Ala | GCU | 643 | trnA-UGC | Asp | GAU | 806 | trnD-GUC |
| Val | GUC | 184 | | Ala | GCC | 222 | | Asp | GAC | 205 | |
| Val | GUA | 514 | trnV-UAC | Ala | GCA | 415 | | Glu | GAA | 916 | trnE-UUC |
| Val | GUG | 189 | | Ala | GCG | 151 | | Glu | GAG | 337 | |
| Cys | UGU | 209 | trnC-GCA | Arg | CGU | 345 | trnR-ACG | Ser | AGU | 389 | trnS-GCU |
| Cys | UGC | 73 | | Arg | CGC | 101 | | Ser | AGC | 116 | |
| Stop | UGA | 12 | | Arg | CGA | 335 | | Arg | AGA | 451 | trnR-UCU |
| Trp | UGG | 439 | trnW-CCA | Arg | CGG | 113 | | Arg | AGG | 171 | |
| Gly | GGU | 586 | trnG-UCC | Gly | GGC | 196 | | | | | |
| Gly | GGA | 676 | | Gly | GGG | 293 | | | | | |
Numerals indicate the frequency of usage of each codon in 24894 codons in 87 potential protein-coding genes.
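The counts in the codon usage table can be checked against the stated total, and a common summary statistic, relative synonymous codon usage (RSCU), can be derived from them. The sketch below copies the counts from the table; RSCU is offered here as one conventional way to read such a table and is not stated in the source to be the authors' own analysis. The names `CODON_COUNTS` and `rscu` are illustrative.

```python
# Codon counts copied from the codon usage table, grouped by amino acid.
CODON_COUNTS = {
    "Phe": {"UUU": 889, "UUC": 510},
    "Leu": {"UUA": 803, "UUG": 560, "CUU": 567, "CUC": 185, "CUA": 370, "CUG": 157},
    "Ile": {"AUU": 1016, "AUC": 429, "AUA": 642},
    "Met": {"AUG": 615},
    "Val": {"GUU": 495, "GUC": 184, "GUA": 514, "GUG": 189},
    "Ser": {"UCU": 581, "UCC": 297, "UCA": 380, "UCG": 164, "AGU": 389, "AGC": 116},
    "Pro": {"CCU": 411, "CCC": 186, "CCA": 306, "CCG": 146},
    "Thr": {"ACU": 524, "ACC": 233, "ACA": 382, "ACG": 126},
    "Ala": {"GCU": 643, "GCC": 222, "GCA": 415, "GCG": 151},
    "Tyr": {"UAU": 736, "UAC": 174},
    "Stop": {"UAA": 54, "UAG": 21, "UGA": 12},
    "His": {"CAU": 458, "CAC": 149},
    "Gln": {"CAA": 674, "CAG": 204},
    "Asn": {"AAU": 882, "AAC": 279},
    "Lys": {"AAA": 888, "AAG": 314},
    "Asp": {"GAU": 806, "GAC": 205},
    "Glu": {"GAA": 916, "GAG": 337},
    "Cys": {"UGU": 209, "UGC": 73},
    "Trp": {"UGG": 439},
    "Arg": {"CGU": 345, "CGC": 101, "CGA": 335, "CGG": 113, "AGA": 451, "AGG": 171},
    "Gly": {"GGU": 586, "GGC": 196, "GGA": 676, "GGG": 293},
}


def rscu(counts: dict[str, int]) -> dict[str, float]:
    """RSCU = observed count / mean count across the synonymous codons."""
    mean = sum(counts.values()) / len(counts)
    return {codon: n / mean for codon, n in counts.items()}


total_codons = sum(sum(c.values()) for c in CODON_COUNTS.values())
stop_codons = sum(CODON_COUNTS["Stop"].values())
print("total codons:", total_codons)
print("stop codons:", stop_codons)
print("RSCU for Phe:", {k: round(v, 3) for k, v in rscu(CODON_COUNTS["Phe"]).items()})
```

The table entries sum to the 24894 codons given in the note, and the three stop codons together account for 87 occurrences, one per protein-coding gene. An RSCU above 1 marks a codon used more often than expected under equal use of its synonyms.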
Figure 2. Percent identity plot comparing six Asteraceae chloroplast genomes using the mVISTA program.
The top line shows genes in order (transcriptional direction indicated by arrows). Sequence similarity of aligned regions between A. adenophora and the other five species is shown as horizontal bars indicating average percent identity between 50% and 100% (y-axis). The x-axis represents the coordinate in the chloroplast genome. Genome regions are color coded as protein-coding (exon), rRNA, tRNA, and conserved non-coding sequences (CNS).
Promising regions identified for developing phylogenetic markers in the Asteraceae family.
| Region | Length (bp) | Tree length | CI | RI | Pars. Inf. Char. (%) | Topology of gene tree versus species tree |
| | 1593 | 143 | 0.95 | 0.82 | 1.10% | Congruent |
| | 690 | 90 | 0.97 | 0.90 | 1.90% | Congruent |
| | 121 | 31 | 0.95 | 0.67 | 2.20% | Congruent |
| | 690 | 50 | 0.98 | 0.80 | 0.70% | Incongruent |
| | 2032 | 196 | 0.94 | 0.81 | 2.60% | Congruent |
| | 2317 | 129 | 0.93 | 0.65 | 1.30% | Incongruent |
| | 989 | 341 | 0.94 | 0.89 | 2.30% | Congruent |
| | 242 | 154 | 0.93 | 0.65 | 4.50% | Congruent |
| | 547 | 8 | 0.88 | 0.50 | 1.10% | Incongruent |
| | 350 | 82 | 0.99 | 0.94 | 3.20% | Congruent |
| | 678 | 57 | 1.00 | 1.00 | 0.76% | Incongruent |
| | 1421 | 76 | 0.96 | 0.73 | 1.45% | Incongruent |
| | 1197 | 53 | 0.98 | 0.93 | 1.83% | Congruent |
| | 468 | 81 | 0.96 | 0.57 | 1.65% | Congruent |
| | 141 | 26 | 0.92 | 0.67 | 4.04% | Congruent |
| | 634 | 157 | 0.94 | 0.62 | 2.40% | Congruent |
| | 1458 | 27 | 0.89 | 0.70 | 1.60% | Congruent |
| | 197 | 88 | 0.98 | 0.67 | 1.80% | Congruent |
| | 245 | 42 | 0.98 | 0.83 | 1.30% | Incongruent |
| | 415 | 91 | 0.93 | 0.71 | 3.50% | Congruent |
| | 347 | 62 | 0.95 | 0.82 | 3.90% | Congruent |
| | 894 | 217 | 0.93 | 0.62 | 1.75% | Congruent |
| | 1518 | 85 | 0.98 | 0.87 | 1.84% | Congruent |
| | 878 | 132 | 0.95 | 0.74 | 1.50% | Congruent |
| Combined regions | 20062 | 2418 | 0.97 | 0.83 | 1.40% | Congruent |
Commonly used phylogenetic markers included for comparison.
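The "Combined regions" row of the marker table can be cross-checked by simple arithmetic on the 24 individual rows. The sketch below copies the region length, tree length, and topology verdict of each row in table order; the tuple layout and variable names are illustrative assumptions, and the region names are omitted because they are not present in the table as reproduced here.

```python
# (region length in bp, tree length, gene tree congruent with species tree?)
# Values copied row by row from the marker table, in table order.
ROWS = [
    (1593, 143, True), (690, 90, True), (121, 31, True), (690, 50, False),
    (2032, 196, True), (2317, 129, False), (989, 341, True), (242, 154, True),
    (547, 8, False), (350, 82, True), (678, 57, False), (1421, 76, False),
    (1197, 53, True), (468, 81, True), (141, 26, True), (634, 157, True),
    (1458, 27, True), (197, 88, True), (245, 42, False), (415, 91, True),
    (347, 62, True), (894, 217, True), (1518, 85, True), (878, 132, True),
]

total_length = sum(length for length, _, _ in ROWS)
total_tree_length = sum(tree for _, tree, _ in ROWS)
n_congruent = sum(1 for _, _, congruent in ROWS if congruent)

print("regions:", len(ROWS))
print("summed length (bp):", total_length)
print("summed tree length:", total_tree_length)
print("congruent / incongruent:", n_congruent, "/", len(ROWS) - n_congruent)
```

The summed lengths and tree lengths reproduce the 20062 and 2418 reported in the combined row, and 18 of the 24 regions yield a topology congruent with the species tree.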
Figure 3. Maximum parsimony (MP) trees of all the selected 24 chloroplast regions of six Asteraceae species.
The phylogram of “combined regions” was constructed from the MP analysis using all 24 regions together.
Figure 4. Comparison of the border positions of SSC, LSC, and IR regions among six Asteraceae chloroplast genomes.
Selected genes or portions of genes are indicated by the boxes above the genome. The IR regions are extended deep into (576 bp) IRb in the H. annuus and J. vulgaris chloroplast genomes. rps19 pseudogenes (ψrps19) of various lengths are created at the IR/LSC border in all six chloroplast genomes.
Figure 5. Repeat structure analysis in the A. adenophora cp genome.
The cutoff value is 15 bp for tandem repeats and 30 bp for dispersed repeats. A. Frequency of repeats by length; B. Repeat type; C. Location distribution of all the repeats.
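To make the 30 bp cutoff for dispersed repeats concrete, the following is a minimal sketch that reports every exact 30-mer occurring at more than one position in a sequence. It is a simplified illustration only: it considers exact forward matches, ignores reverse-complement and inexact repeats, and is not the repeat-finding software used for the figure. The function name and the toy sequence are assumptions.

```python
from collections import defaultdict


def repeated_kmers(seq: str, k: int) -> dict[str, list[int]]:
    """Map each k-mer that occurs at least twice to its 0-based start positions."""
    positions: dict[str, list[int]] = defaultdict(list)
    for i in range(len(seq) - k + 1):
        positions[seq[i:i + k]].append(i)
    return {kmer: pos for kmer, pos in positions.items() if len(pos) > 1}


# Toy sequence: one 30 bp unit, a 5 bp spacer, and a second copy of the unit.
unit = "ACGTTGCAAGCTTAGGCTAACCGGTTAACG"
toy = unit + "TTTTT" + unit

dispersed = repeated_kmers(toy, k=30)  # 30 bp cutoff from the caption
for kmer, starts in dispersed.items():
    print(f"{len(kmer)} bp repeat at positions {starts}")
```

Lowering `k` to 15 applies the same exact-match idea at the tandem-repeat cutoff, although detecting true tandem repeats additionally requires the copies to be adjacent.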
Figure 6. The MP phylogenetic tree based on 35 protein-coding genes from 33 plant taxa.
The MP tree has a length of 41,661, with a consistency index of 0.4644 and a retention index of 0.6821. Numbers above nodes are bootstrap support values. The ML tree has the same topology but is not shown.
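For readers unfamiliar with the two indices quoted in this caption and in the marker table, their standard definitions are given below. The symbols are notation chosen here, not taken from the source: $S$ is the observed tree length in steps, $M$ is the minimum number of steps the data could require on any tree, and $G$ is the maximum number of steps the data could require on any tree.

```latex
\[
\mathrm{CI} = \frac{M}{S},
\qquad
\mathrm{RI} = \frac{G - S}{G - M}
\]
```

Both indices equal 1 when the data fit the tree without homoplasy ($S = M$) and fall toward 0 as the observed length increases.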