Xia Liu, Boyang Zhou, Hongyuan Yang, Yuan Li, Qian Yang, Yuzhuo Lu, Yu Gao.
Abstract
Chrysanthemum carinatum Schousb and Kalimeris indica are widely distributed edible vegetables of the Asteraceae family that are also used as sources of Chinese medicine. The complete chloroplast (cp) genomes of Asteraceae usually carry inversions in two regions, so the cp genome sequences and structures of Asteraceae species are crucial for studies of cp genome genetic diversity and evolution. In this paper, we sequenced and analyzed, for the first time, the cp genomes of C. carinatum Schousb and K. indica, which are 149,752 bp and 152,885 bp in size, respectively. Each has a pair of inverted repeats (IRs) (24,523 bp and 25,003 bp) separated by a large single copy (LSC) region (82,290 bp and 84,610 bp) and a small single copy (SSC) region (18,416 bp and 18,269 bp). In total, 79 protein-coding genes, 30 distinct transfer RNA (tRNA) genes, four distinct ribosomal RNA (rRNA) genes, and two pseudogenes were found in each of the two cp genomes. Fifty-two (52) and fifty-nine (59) repeats, and seventy (70) and ninety (90) simple sequence repeats (SSRs), were found in the C. carinatum Schousb and K. indica cp genomes, respectively. Codon usage analysis showed that leucine, isoleucine, and serine are the most frequent amino acids and that UAA is the clearly preferred stop codon in both cp genomes. Two inversions, one in the LSC region spanning trnC-GCA to trnG-UCC and one covering the whole SSC region, were found in both genomes. Comparison of the complete cp genomes with those of other Asteraceae species showed that the coding regions are more conserved than the non-coding regions. The phylogenetic analysis revealed that the rbcL gene is a good barcoding marker for identifying different vegetables. These results provide insight into the identification, the barcoding, and the evolutionary model of the Asteraceae cp genome.
Keywords: Asteraceae; C. carinatum Schousb; K. indica; barcoding; chloroplast genome; inversion
Year: 2018 PMID: 29874832 PMCID: PMC6099409 DOI: 10.3390/molecules23061358
Source DB: PubMed Journal: Molecules ISSN: 1420-3049 Impact factor: 4.411
Figure 1. The complete cp genome map of C. carinatum Schousb and K. indica. (A) C. carinatum Schousb; (B) K. indica. The genes marked inside the circle are transcribed clockwise, and those outside are transcribed counterclockwise. Genes are color-coded according to their function.
The base composition in the C. carinatum Schousb and K. indica cp genomes.
| Region | C. carinatum Schousb | | | | | K. indica | | | | |
|---|---|---|---|---|---|---|---|---|---|---|
| | T(U) (%) | C (%) | A (%) | G (%) | Length (bp) | T(U) (%) | C (%) | A (%) | G (%) | Length (bp) |
| LSC | 32.4 | 17.6 | 32.0 | 18.1 | 82,290 | 32.6 | 17.3 | 32.2 | 17.9 | 84,610 |
| SSC | 35.1 | 14.7 | 34.1 | 16.1 | 18,416 | 34.9 | 14.8 | 33.9 | 16.4 | 18,269 |
| IRa | 28.3 | 22.3 | 28.6 | 20.8 | 24,523 | 28.3 | 22.2 | 28.7 | 20.8 | 25,003 |
| IRb | 28.6 | 20.8 | 28.3 | 22.3 | 24,523 | 28.7 | 20.8 | 28.3 | 22.2 | 25,003 |
| Total | 31.8 | 19.0 | 30.7 | 18.5 | 149,752 | 31.8 | 19.0 | 30.7 | 18.5 | 152,885 |
| CDS | 31.5 | 17.7 | 30.7 | 20.1 | 77,289 | 31.5 | 17.7 | 30.6 | 20.3 | 78,372 |
| 1st position | 23.7 | 18.9 | 30.6 | 26.7 | 25,763 | 23.9 | 18.8 | 30.6 | 26.8 | 26,124 |
| 2nd position | 32.6 | 20.4 | 29.3 | 17.7 | 25,763 | 32.6 | 20.3 | 29.3 | 17.8 | 26,124 |
| 3rd position | 38.3 | 13.7 | 32.1 | 15.9 | 25,763 | 38.0 | 13.9 | 31.8 | 16.3 | 26,124 |
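The region lengths in the table can be cross-checked with simple arithmetic: in a quadripartite cp genome the total length should equal LSC + SSC + IRa + IRb, and the number of codons should equal the CDS length divided by three. The following is a minimal Python sketch that uses only the figures reported above; the dictionary layout and function names are illustrative and do not come from the paper.

```python
# Lengths in bp, copied from the base-composition table.
GENOMES = {
    "C. carinatum Schousb": {
        "LSC": 82_290, "SSC": 18_416, "IRa": 24_523, "IRb": 24_523,
        "total": 149_752, "CDS": 77_289,
    },
    "K. indica": {
        "LSC": 84_610, "SSC": 18_269, "IRa": 25_003, "IRb": 25_003,
        "total": 152_885, "CDS": 78_372,
    },
}


def summed_regions(genome):
    """Add up the four regions of the quadripartite structure."""
    return sum(genome[region] for region in ("LSC", "SSC", "IRa", "IRb"))


def codon_count(genome):
    """Number of codons implied by the CDS length (must divide evenly by 3)."""
    codons, remainder = divmod(genome["CDS"], 3)
    if remainder:
        raise ValueError("CDS length is not a multiple of three")
    return codons


for name, genome in GENOMES.items():
    matches = summed_regions(genome) == genome["total"]
    print(f"{name}: regions sum to {summed_regions(genome):,} bp "
          f"(matches reported total: {matches}); "
          f"{codon_count(genome):,} codons")
```

Both genomes pass the check, and the resulting codon counts (25,763 and 26,124) agree with the per-position lengths in the last three rows of the table.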
The genes present in the C. carinatum Schousb and K. indica cp genomes.
| Group of Genes | C. carinatum Schousb | K. indica |
|---|---|---|
| Photosystem I | | |
| Photosystem II | | |
| Cytochrome b/f complex | | |
| ATP synthase | | |
| NADH dehydrogenase | | |
| RuBisCO large subunit | | |
| RNA polymerase | | |
| Ribosomal proteins (SSU) | | |
| Ribosomal proteins (LSU) | | |
| Miscellaneous proteins | | |
| Hypothetical chloroplast reading frames (ycf) | | |
| Transfer RNAs | | |
| Ribosomal RNAs | | |
| Pseudogenes | | |
| Total | 132 | 132 |
(×2) indicates a duplicated gene; * indicates a gene that contains one intron; ** indicates a gene that contains two introns.
The intron-containing genes of the C. carinatum Schousb and K. indica cp genomes and the lengths of the introns and exons.
| C. carinatum Schousb | | | | | | | K. indica | | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Gene | Location | Exon I (bp) | Intron I (bp) | Exon II (bp) | Intron II (bp) | Exon III (bp) | Gene | Location | Exon I (bp) | Intron I (bp) | Exon II (bp) | Intron II (bp) | Exon III (bp) |
| | LSC | 145 | 699 | 410 | | | | LSC | 145 | 709 | 410 | | |
| | LSC | 71 | 796 | 291 | 611 | 229 | | LSC | 71 | 814 | 291 | 615 | 229 |
| | SSC | 539 | 1045 | 553 | | | | SSC | 553 | 1064 | 539 | | |
| | IR | 777 | 670 | 756 | | | | IR | 777 | 674 | 756 | | |
| | LSC | 6 | 751 | 642 | | | | LSC | 6 | 823 | 642 | | |
| | LSC | 8 | 675 | 475 | | | | LSC | 8 | 809 | 475 | | |
| | LSC | 9 | 1029 | 399 | | | | LSC | 9 | 1098 | 399 | | |
| | IR | 391 | 664 | 434 | | | | IR | 391 | 671 | 434 | | |
| | LSC | 432 | 733 | 1641 | | | | LSC | 432 | 742 | 1638 | | |
| | LSC | 114 | - | 232 | 536 | 26 | | LSC | 114 | - | 232 | 535 | 26 |
| | LSC | 41 | 891 | 184 | | | | LSC | 41 | 876 | 226 | | |
| | IR | 38 | 812 | 35 | | | | IR | 35 | 820 | 38 | | |
| | LSC | 23 | 722 | 47 | | | | LSC | 23 | 732 | 48 | | |
| | IR | 43 | 775 | 35 | | | | IR | 38 | 780 | 35 | | |
| | LSC | 37 | 2562 | 30 | | | | LSC | 37 | 2539 | 35 | | |
| | LSC | 37 | 422 | 50 | | | | LSC | 37 | 438 | 50 | | |
| | LSC | 38 | 573 | 37 | | | | LSC | 38 | 573 | 37 | | |
| | LSC | 125 | 698 | 229 | 735 | 153 | | LSC | 125 | 702 | 229 | 739 | 153 |
* The rps12 gene is a trans-spliced gene with the 5′ end located in the LSC region and a duplicated 3′ end in the IR region.
The codon-anticodon recognition pattern and codon usage for the C. carinatum Schousb cp genome.
| Amino Acid | Codon | No. | RSCU | tRNA | Amino Acid | Codon | No. | RSCU | tRNA |
|---|---|---|---|---|---|---|---|---|---|
| Phe | UUU | 956 | | | Stop | UAA | 49 | | |
| Phe | UUC | 496 | 0.68 | | Stop | UAG | 21 | 0.74 | |
| Leu | UUA | 874 | | | Stop | UGA | 15 | 0.53 | |
| Leu | UUG | 553 | | | His | CAU | 452 | | |
| Leu | CUU | 620 | | | His | CAC | 143 | 0.48 | |
| Leu | CUC | 183 | 0.40 | | Gln | CAA | 703 | | |
| Leu | CUA | 358 | 0.77 | | Gln | CAG | 227 | 0.49 | |
| Leu | CUG | 190 | 0.41 | | Asn | AAU | 978 | | |
| Ile | AUU | 1075 | | | Asn | AAC | 277 | 0.44 | |
| Ile | AUC | 428 | 0.58 | | Lys | AAA | 1026 | | |
| Ile | AUA | 692 | 0.95 | | Lys | AAG | 345 | 0.50 | |
| Met | AUG | 613 | 1.00 | | Asp | GAU | 831 | | |
| Val | GUU | 490 | | | Asp | GAC | 214 | 0.41 | |
| Val | GUC | 166 | 0.49 | | Glu | GAA | 986 | | |
| Val | GUA | 523 | | | Glu | GAG | 331 | 0.50 | |
| Val | GUG | 178 | 0.52 | | Cys | UGU | 198 | | |
| Ser | UCU | 580 | | | Cys | UGC | 88 | 0.62 | |
| Ser | UCC | 313 | 0.96 | | Trp | UGG | 448 | 1.00 | |
| Ser | UCA | 394 | | | Arg | CGU | 336 | | |
| Ser | UCG | 154 | 0.47 | | Arg | CGC | 100 | 0.39 | |
| Arg | AGA | 474 | | | Arg | CGA | 339 | | |
| Arg | AGG | 165 | 0.65 | | Arg | CGG | 118 | 0.46 | |
| Tyr | UAU | 799 | | | Ser | AGU | 406 | | |
| Tyr | UAC | 175 | 0.36 | | Ser | AGC | 118 | 0.36 | |
| Pro | CCA | 322 | | | Gly | GGU | 581 | | |
| Pro | CCG | 159 | 0.58 | | Gly | GGC | 188 | 0.43 | |
| Pro | CCU | 434 | | | Gly | GGA | 686 | | |
| Pro | CCC | 184 | 0.67 | | Gly | GGG | 305 | 0.69 | |
| Thr | ACU | 529 | | | Ala | GCA | 416 | | |
| Thr | ACC | 236 | 0.73 | | Ala | GCG | 162 | 0.46 | |
| Thr | ACA | 404 | | | Ala | GCU | 611 | | |
| Thr | ACG | 125 | 0.39 | | Ala | GCC | 223 | 0.63 | |
RSCU: Relative synonymous codon usage. RSCU > 1 are highlighted in bold.
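RSCU for a codon is its observed count divided by the count expected if all synonymous codons for that amino acid were used equally, that is, count × (number of synonymous codons) / (total count for the amino acid). The sketch below is a minimal Python illustration using a few counts from the C. carinatum Schousb table, with codons grouped by the standard genetic code (AGA and AGG encode arginine). The function name and data layout are assumptions made for illustration.

```python
# Codon counts copied from the C. carinatum Schousb codon usage table,
# grouped by amino acid according to the standard genetic code.
CODON_COUNTS = {
    "Phe": {"UUU": 956, "UUC": 496},
    "Stop": {"UAA": 49, "UAG": 21, "UGA": 15},
    "Arg": {"CGU": 336, "CGC": 100, "CGA": 339,
            "CGG": 118, "AGA": 474, "AGG": 165},
}


def rscu(codon_counts):
    """Relative synonymous codon usage for one amino acid.

    RSCU = count * n_synonymous / total, rounded to two decimals.
    A value above 1 means the codon is used more often than expected
    under equal usage of all synonymous codons.
    """
    n_synonymous = len(codon_counts)
    total = sum(codon_counts.values())
    return {codon: round(count * n_synonymous / total, 2)
            for codon, count in codon_counts.items()}


for amino_acid, counts in CODON_COUNTS.items():
    values = rscu(counts)
    preferred = [codon for codon, value in values.items() if value > 1]
    print(amino_acid, values, "preferred:", preferred)
```

Run on the stop codons, this reproduces the tabulated values for UAG (0.74) and UGA (0.53) and gives UAA an RSCU above 1, consistent with UAA being the preferred stop codon.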
The codon-anticodon recognition pattern and codon usage for the K. indica cp genome.
| Amino Acid | Codon | No. | RSCU | tRNA | Amino Acid | Codon | No. | RSCU | tRNA |
|---|---|---|---|---|---|---|---|---|---|
| Phe | UUU | 982 | | | Stop | UAA | 49 | | |
| Phe | UUC | 515 | 0.69 | | Stop | UAG | 21 | 0.74 | |
| Leu | UUA | 870 | | | Stop | UGA | 15 | 0.53 | |
| Leu | UUG | 578 | | | His | CAU | 452 | | |
| Leu | CUU | 607 | | | His | CAC | 148 | 0.49 | |
| Leu | CUC | 186 | 0.40 | | Gln | CAA | 723 | | |
| Leu | CUA | 375 | 0.81 | | Gln | CAG | 220 | 0.47 | |
| Leu | CUG | 179 | 0.38 | | Asn | AAU | 969 | | |
| Ile | AUU | 1072 | | | Asn | AAC | 295 | 0.47 | |
| Ile | AUC | 427 | 0.58 | | Lys | AAA | 1036 | | |
| Ile | AUA | 691 | 0.95 | | Lys | AAG | 360 | 0.52 | |
| Met | AUG | 633 | 1.00 | | Asp | GAU | 847 | | |
| Val | GUU | 503 | | | Asp | GAC | 205 | 0.39 | |
| Val | GUC | 180 | 0.52 | | Glu | GAA | 987 | | |
| Val | GUA | 517 | | | Glu | GAG | 359 | 0.53 | |
| Val | GUG | 190 | 0.55 | | Cys | UGU | 209 | | |
| Ser | UCU | 585 | | | Cys | UGC | 85 | 0.58 | |
| Ser | UCC | 308 | 0.93 | | Trp | UGG | 463 | 1.00 | |
| Ser | UCA | 401 | | | Arg | CGU | 351 | | |
| Arg | AGA | 488 | | | Arg | CGC | 108 | 0.41 | |
| Arg | AGG | 177 | 0.67 | | Arg | CGA | 346 | | |
| Ser | UCG | 172 | 0.52 | | Arg | CGG | 114 | 0.43 | |
| Tyr | UAU | 804 | | | Ser | AGU | 405 | | |
| Tyr | UAC | 175 | 0.36 | | Ser | AGC | 121 | 0.36 | |
| Pro | CCU | 414 | | | Gly | GGU | 571 | | |
| Pro | CCC | 209 | 0.76 | | Gly | GGC | 199 | 0.45 | |
| Pro | CCA | 314 | | | Gly | GGA | 681 | | |
| Pro | CCG | 167 | 0.61 | | Gly | GGG | 328 | 0.74 | |
| Thr | ACU | 531 | | | Ala | GCU | 622 | | |
| Thr | ACC | 239 | 0.73 | | Ala | GCC | 233 | 0.65 | |
| Thr | ACA | 405 | | | Ala | GCA | 410 | | |
| Thr | ACG | 137 | 0.42 | | Ala | GCG | 161 | 0.45 | |
The repeats of the C. carinatum Schousb cp genome and their distribution.
| No. | Size (bp) | Type | Repeat 1 Location | Repeat 2 Location | Region | No. | Size (bp) | Type | Repeat 1 Location | Repeat 2 Location | Region |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 60 | F | IRb | 22 | 31 | R | IGS ( | IGS ( | LSC | ||
| 2 | 60 | P | IRb, IRa | 23 | 30 | P | IGS ( | LSC | |||
| 3 | 60 | P | IRb, IRa | 24 | 30 | F | IRb | ||||
| 4 | 60 | F | IRa, IRa | 25 | 30 | P | IRb, IRa | ||||
| 5 | 51 | F | IGS ( | IGS ( | LSC | 26 | 30 | P | IRb, IRa | ||
| 6 | 48 | P | IGS ( | IGS ( | LSC | 27 | 35 | F | LSC | ||
| 7 | 46 | F | IGS ( | IGS ( | LSC | 28 | 35 | F | LSC, IRb | ||
| 8 | 45 | F | IRb | 29 | 35 | P | LSC, IRa | ||||
| 9 | 45 | P | IRb, IRa | 30 | 32 | P | IGS ( | IGS ( | LSC | ||
| 10 | 45 | P | IRb, IRa | 31 | 31 | F | IGS ( | IGS ( | LSC | ||
| 11 | 46 | P | IGS ( | IGS ( | LSC | 32 | 30 | P | IGS ( | LSC | |
| 12 | 39 | P | IGS ( | IRb, SSC | 33 | 30 | F | IGS ( | LSC | ||
| 13 | 39 | F | IGS ( | SSC, IRa | 34 | 32 | F | IGS ( | IGS ( | LSC | |
| 14 | 41 | F | IGS ( | LSC, IRb | 35 | 31 | R | IGS ( | IGS ( | SSC | |
| 15 | 41 | P | IGS ( | LSC, IRa | 36 | 30 | C | IGS ( | IGS ( | LSC | |
| 16 | 39 | P | LSC, SSC | 37 | 30 | F | LSC | ||||
| 17 | 42 | F | IRb | 38 | 30 | F | IRb, IRa | ||||
| 18 | 42 | P | IRb, IRa | 39 | 30 | P | IRb | ||||
| 19 | 42 | P | IRb, IRa | 40 | 30 | F | IGS ( | IGS ( | SSC | ||
| 20 | 42 | F | IRa | 41 | 30 | R | IGS ( | IGS ( | SSC | ||
| 21 | 41 | P | IGS ( | IGS ( | SSC | 42 | 30 | P | IRa |
F = forward, P = palindrome, R = reverse, C = complement, IGS = intergenic spacer.
The repeats of the K. indica cp genome and their distribution.
| No. | Size (bp) | Type | Repeat 1 Location | Repeat 2 Location | Region | No. | Size (bp) | Type | Repeat 1 Location | Repeat 2 Location | Region |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 48 | P | IGS ( | IGS ( | LSC | 31 | 32 | F | IGS ( | IGS ( | IRa |
| 2 | 39 | P | LSC, SSC | 32 | 34 | R | IGS ( | IGS ( | LSC | ||
| 3 | 41 | F | IGS ( | LSC, IRb | 33 | 31 | R | IGS ( | LSC | ||
| 4 | 41 | P | IGS ( | LSC, IRa | 34 | 33 | C | IGS ( | LSC | ||
| 5 | 39 | P | IGS ( | IRb, SSC | 35 | 33 | C | IGS ( | IGS ( | LSC | |
| 6 | 39 | F | IGS ( | SSC, IRa | 36 | 30 | P | IGS ( | LSC | ||
| 7 | 42 | F | IRb | 37 | 32 | C | IGS ( | IGS ( | LSC | ||
| 8 | 42 | P | IRb, IRa | 38 | 32 | F | IGS ( | IGS ( | LSC | ||
| 9 | 42 | P | IRb, IRa | 39 | 32 | P | IGS ( | IGS ( | LSC | ||
| 10 | 42 | F | IRa | 40 | 32 | C | IGS ( | LSC | |||
| 11 | 41 | R | LSC | 41 | 32 | F | LSC | ||||
| 12 | 43 | P | IGS ( | IGS ( | SSC | 42 | 32 | R | IGS ( | IGS ( | LSC |
| 13 | 39 | R | IGS ( | IGS ( | LSC | 43 | 31 | P | IGS ( | IGS ( | LSC |
| 14 | 31 | F | IGS ( | IGS ( | IRb | 44 | 31 | P | IGS ( | IGS ( | LSC |
| 15 | 31 | P | IGS ( | IGS ( | IRb, IRa | 45 | 31 | P | IGS ( | IGS ( | LSC |
| 16 | 31 | P | IGS ( | IGS ( | IRb, IRa | 46 | 31 | F | IGS ( | IGS ( | LSC |
| 17 | 31 | F | IGS ( | IGS ( | IRa | 47 | 31 | R | IGS ( | IGS ( | LSC |
| 18 | 37 | P | IGS ( | IGS ( | LSC | 48 | 31 | F | IGS ( | LSC | |
| 19 | 31 | R | LSC | 49 | 31 | C | LSC | ||||
| 20 | 30 | P | IGS ( | LSC | 50 | 31 | F | IGS ( | SSC | ||
| 21 | 30 | F | IRb | 51 | 30 | C | IGS ( | IGS ( | LSC | ||
| 22 | 30 | P | IRb, IRa | 52 | 30 | R | IGS ( | IGS ( | LSC | ||
| 23 | 30 | P | IRb, IRa | 53 | 30 | C | IGS ( | LSC | |||
| 24 | 32 | P | IGS ( | IGS ( | LSC | 54 | 30 | C | IGS ( | IGS ( | LSC |
| 25 | 32 | P | IGS ( | IGS ( | LSC | 55 | 30 | F | IGS ( | IGS ( | LSC |
| 26 | 32 | P | IGS ( | LSC | 56 | 30 | F | IGS ( | IGS ( | LSC | |
| 27 | 32 | R | LSC | 57 | 30 | R | IGS ( | IGS ( | LSC | ||
| 28 | 32 | F | IGS ( | IGS ( | IRb | 58 | 30 | R | IGS ( | LSC | |
| 29 | 32 | P | IGS ( | IGS ( | IRb, IRa | 59 | 30 | F | LSC | ||
| 30 | 32 | P | IGS ( | IGS ( | IRb, IRa |
F = forward, P = palindrome, R = reverse, C = complement, IGS = intergenic spacer.
Figure 2. The repeat sequences of eight Asteraceae cp genomes. F (forward), P (palindrome), R (reverse), and C (complement) represent the repeat types. Different colours represent repeats of different lengths.
The simple sequence repeats in the C. carinatum Schousb cp genome.
| Unit | Length | No. | Location | Region | Unit | Length | No. | Location | Region | Unit | Length | No. | Location | Region |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| A | 19 | 1 | IGS ( | SSC | T | 22 | 1 | IGS ( | LSC | TA | 14 | 1 | IGS ( | LSC |
| 13 | 1 | IGS ( | LSC | 16 | 1 | LSC | 12 | 2 | IGS ( | LSC | ||||
| 12 | 1 | IGS ( | SSC | 14 | 3 | IGS ( | LSC | IGS ( | LSC | |||||
| 11 | 6 | IGS ( | LSC | IGS ( | LSC | 10 | 1 | LSC | ||||||
| IGS ( | LSC | IGS ( | LSC | TG | 10 | 1 | IGS ( | LSC | ||||||
| IGS ( | LSC | 13 | 3 | IGS ( | LSC | ATT | 12 | 2 | IGS ( | LSC | ||||
| IGS ( | LSC | IGS ( | LSC | IGS ( | SSC | |||||||||
| IGS ( | LSC | IGS ( | LSC | GAA | 15 | 1 |
| SSC | ||||||
| IGS ( | LSC | 12 | 5 | IGS ( | LSC | TTA | 12 | 1 | SSC | |||||
| 10 | 10 | LSC | IGS ( | LSC | TTC | 12 | 1 |
| LSC | |||||
|
| LSC | LSC | AATA | 12 | 2 | IGS ( | LSC | |||||||
| LSC | LSC | SSC | ||||||||||||
| LSC |
| LSC | ATAC | 12 | 1 | IGS ( | LSC | |||||||
| LSC | 11 | 1 | IGS ( | LSC | ATTG | 12 | 1 |
| IRa | |||||
| IGS ( | LSC | 10 | 9 | LSC | ATTT | 22 | 1 | IGS ( | LSC | |||||
| IGS ( | LSC | IGS ( | LSC | CAAT | 12 | 1 | IGS ( | IRb | ||||||
| IGS ( | LSC | LSC | GATT | 12 | 1 | SSC | ||||||||
|
| LSC | IGS ( | LSC | TAGA | 12 | 1 | IGS ( | LSC | ||||||
| IGS ( | IRb | IGS ( | LSC | TAAA | 12 | 1 | IGS ( | LSC | ||||||
| C | 12 | 1 | LSC | IGS ( | LSC | TAAT | 12 | 1 | IGS ( | LSC | ||||
| AT | 14 | 1 | LSC |
| LSC | TATT | 12 | 2 | IGS ( | LSC | ||||
| 12 | 1 | IGS ( | LSC | IGS ( | LSC |
| SSC | |||||||
| 10 | 1 |
| LSC | IGS ( | IRa | TTTC | 16 | 1 | LSC | |||||
| AATTT | 15 | 1 | IGS ( | SSC |
The simple sequence repeats in the K. indica cp genome.
| Unit | Length | No. | Location | Region | Unit | Length | No. | Location | Region | Unit | Length | No. | Location | Region |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| A | 18 | 1 | IGS ( | LSC | AT | 16 | 1 | IGS ( | LSC | GAA | 15 | 1 |
| SSC |
| 12 | 1 | IGS ( | LSC | 10 | 7 | LSC | 12 | 1 |
| SSC | ||||
| 11 | 3 | IGS ( | LSC | 10 |
| LSC | TAT | 21 | 1 | IGS ( | IRb | |||
| 11 | IGS ( | LSC | 10 | IGS ( | LSC | 12 | 2 | IGS ( | LSC | |||||
| 11 | IGS ( | LSC | 10 | IGS ( | LSC | 12 | IGS ( | LSC | ||||||
| 10 | 7 |
| LSC | 10 | IGS ( | LSC | TTA | 15 | 2 | IGS ( | LSC | |||
| 10 | LSC | 10 | IGS ( | LSC | 15 | IGS ( | LSC | |||||||
| 10 | IGS ( | LSC | 10 | LSC | 12 | 2 | LSC | |||||||
| 10 | IRb | TA | 23 | 1 | LSC | 12 | LSC | |||||||
| 10 | SSC | 20 | 1 | IGS ( | LSC | 12 | 1 |
| LSC | |||||
| 10 | IGS ( | SSC | 18 | 1 | LSC | 12 | 1 | IGS ( | LSC | |||||
| 10 | IGS ( | IRa | 14 | 1 | IGS ( | LSC | 12 | 1 | IGS ( | LSC | ||||
| T | 18 | 1 | IGS ( | LSC | 12 | 3 | IGS ( | LSC | 12 | 1 | IGS ( | LSC | ||
| 17 | 1 |
| LSC | 12 | LSC | 12 | 1 | IGS ( | LSC | |||||
| 14 | 2 | IGS ( | LSC | 12 | IGS ( | LSC | 12 | 1 | IGS ( | LSC | ||||
| 14 | SSC | 10 | 3 | LSC | 12 | 1 | IGS ( | SSC | ||||||
| 13 | 1 | IGS ( | LSC | 10 | IGS ( | LSC | TATC | 12 | 1 | IGS ( | LSC | |||
| 12 | 2 | IGS ( | LSC | 10 | SSC | TATT | 12 | 2 | IGS ( | LSC | ||||
| 12 | LSC | AAT | 12 | 2 | IGS ( | LSC | 12 | IGS ( | SSC | |||||
| 11 | 2 | IGS ( | LSC | 12 | IGS ( | LSC | TCTA | 12 | 1 | SSC | ||||
| 11 | IGS ( | LSC | ATA | 21 | 1 | IGS ( | IRa | TTCT | 12 | 1 | IGS ( | LSC | ||
| 10 | 9 | IGS ( | LSC | 15 | 3 | LSC | TTTA | 12 | 1 | LSC | ||||
| 10 | IGS ( | LSC | 15 | IGS ( | LSC | TTTC | 12 | 1 | LSC | |||||
| 10 | IGS ( | LSC | 15 | IGS ( | SSC | ATTAG | 15 | 1 | IGS ( | LSC | ||||
| 10 | IGS ( | LSC | 12 | 2 | IGS ( | LSC | TATAT | 23 | 1 | LSC | ||||
| 10 | LSC | 12 | LSC | 20 | 1 | IGS ( | LSC | |||||||
| 10 | IGS ( | LSC | ATT | 15 | 1 | IGS ( | LSC | 15 | 1 | IGS ( | LSC | |||
| 10 | IGS ( | IRb | TAA | 15 | 1 | IGS ( | LSC | TATTA | 21 | 2 | IGS ( | IRb | ||
| 10 | IGS ( | SSC | 12 | 2 | IGS ( | LSC | 21 | IGS ( | IRa | |||||
| 10 | IRa | 12 | IGS ( | LSC | TCCTA | 15 | 1 | IGS ( | LSC |
The distribution of SSRs present in the Asteraceae cp genomes.
| Taxon | Genome Size (bp) | GC (%) | SSR Type | | | | | | | CDS | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | | | Mono | Di | Tri | Tetra | Penta | Hexa | Total | % a | No. b | % c |
| C. carinatum Schousb | 149,752 | 37.47 | 43 | 8 | 5 | 13 | 1 | 0 | 70 | 51.6 | 12 | 17.1 |
| K. indica | 152,885 | 37.25 | 30 | 18 | 22 | 13 | 7 | 0 | 90 | 51.3 | 9 | 10 |
| | 152,265 | 37.37 | 34 | 14 | 11 | 14 | 2 | 0 | 75 | 51.4 | 8 | 10.7 |
| | 150,952 | 37.48 | 39 | 10 | 4 | 15 | 3 | 0 | 71 | 52.1 | 10 | 14.1 |
| | 151,837 | 37.59 | 37 | 4 | 4 | 5 | 0 | 1 | 51 | 51.5 | 14 | 27.5 |
| | 151,104 | 37.62 | 40 | 4 | 4 | 4 | 0 | 0 | 52 | 51.2 | 14 | 26.9 |
| | 151,451 | 37.67 | 14 | 5 | 3 | 6 | 1 | 1 | 30 | 51.3 | 11 | 36.7 |
| | 150,972 | 37.48 | 38 | 10 | 4 | 14 | 1 | 0 | 67 | 48.8 | 7 | 10.4 |
CDS: protein-coding regions. a the percentage ratio of the total length of the CDS to the genome size. b the total number of SSRs in CDS. c the percentage ratio of the total number of SSRs in CDS to the total number of SSRs in the whole genome.
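The Total and % c columns follow directly from the other columns: Total is the sum of the six SSR type counts, and % c is the number of SSRs in CDS divided by that total. The sketch below is a minimal Python illustration using the rows of the table in order; the tuple layout and function name are assumptions made for illustration.

```python
# One tuple per table row, in table order:
# (mono, di, tri, tetra, penta, hexa, ssrs_in_cds)
SSR_ROWS = [
    (43, 8, 5, 13, 1, 0, 12),   # genome size 149,752 bp
    (30, 18, 22, 13, 7, 0, 9),  # genome size 152,885 bp
    (34, 14, 11, 14, 2, 0, 8),
    (39, 10, 4, 15, 3, 0, 10),
    (37, 4, 4, 5, 0, 1, 14),
    (40, 4, 4, 4, 0, 0, 14),
    (14, 5, 3, 6, 1, 1, 11),
    (38, 10, 4, 14, 1, 0, 7),
]


def ssr_summary(row):
    """Return (total SSRs, percentage of SSRs located in CDS)."""
    *type_counts, in_cds = row
    total = sum(type_counts)
    return total, round(100 * in_cds / total, 1)


for row in SSR_ROWS:
    total, pct_in_cds = ssr_summary(row)
    print(f"total SSRs = {total}, in CDS = {pct_in_cds}%")
```

For the first two rows this gives 70 SSRs with 17.1% in CDS and 90 SSRs with 10.0% in CDS, matching the tabulated values.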
Figure 3. The comparison of eight Asteraceae cp genomes using mVISTA. The grey arrows above the alignment indicate the direction of gene translation. The y-axis represents the percent identity, between 50% and 100%. Protein-coding exons, rRNA, tRNA, and conserved non-coding sequences (CNS) are shown in different colors.
Figure 4. The comparison of the borders of the LSC, SSC, and IR regions among the eight Asteraceae cp genomes.
Figure 5. The molecular phylogenetic analysis of the cp protein-coding gene rbcL for 24 samples using the Maximum Likelihood method. The tree was constructed using MEGA7. The stability of each tree node was tested by bootstrap analysis with 1000 replicates.