Chao Xu, Wenpan Dong, Wenqing Li, Yizeng Lu, Xiaoman Xie, Xiaobai Jin, Jipu Shi, Kaihong He, Zhili Suo.
Abstract
Crape myrtles are economically important ornamental trees of the genus Lagerstroemia L. (Lythraceae), distributed from tropical to northern temperate zones. Phylogenetically, they belong to a large subclade of rosids (in the eudicots) that contains more than 25% of all angiosperms. They commonly bloom from summer until fall and are of significant value in urban landscaping and environmental protection. Morphological traits are shared among Lagerstroemia species to a certain extent and are also influenced by environmental conditions and developmental stage. Thus, classifying Lagerstroemia plants at the species and cultivar levels remains challenging. Chloroplast (cp) genome sequences have proven to be an informative and valuable source of cp DNA markers for evaluating genetic diversity. In this study, the complete cp genomes of three Lagerstroemia species were newly sequenced, and three other published Lagerstroemia cp genome sequences were retrieved for comparative analyses, in order to better understand the application value of the genetic information in the cp genomes. The six cp genomes ranged in length from 152,049 bp (L. subcostata) to 152,526 bp (L. speciosa). We analyzed nucleotide substitutions, insertions/deletions, and simple sequence repeats in the cp genomes and identified 12 relatively highly variable regions that could provide plastid markers for further taxonomic, phylogenetic, and population genetic studies in Lagerstroemia. The phylogenetic relationships of the Lagerstroemia taxa inferred from the cp genome datasets received high support, indicating that cp genome data may be useful for resolving relationships in this genus.
Keywords: Lagerstroemia; chloroplast genome; comparative genomics; phylogeny; plastid marker; sequence divergence; simple repeat sequence
Year: 2017 PMID: 28154574 PMCID: PMC5243828 DOI: 10.3389/fpls.2017.00015
Source DB: PubMed Journal: Front Plant Sci ISSN: 1664-462X Impact factor: 5.753
Figure 1. Gene map of the Lagerstroemia chloroplast genome. Genes inside and outside the circle are transcribed in the clockwise and counterclockwise directions, respectively. Genes belonging to different functional groups are shown in different colors. The thick lines indicate the extent of the inverted repeats (IRa and IRb) that separate the genome into small single-copy (SSC) and large single-copy (LSC) regions.
Figure 2. Sliding window analysis of the whole chloroplast genomes of six (A) and five (B; not including L. speciosa) Lagerstroemia taxa (window length: 600 bp, step size: 200 bp). X-axis, position of the midpoint of a window; Y-axis, nucleotide diversity of each window.
Figure 3. Identity plot comparing the chloroplast genomes of the six Lagerstroemia taxa. The vertical scale indicates the percentage of identity, ranging from 50 to 100%. The horizontal axis indicates the coordinates within the chloroplast genome. Genome regions are color coded as protein-coding, rRNA, tRNA, intron, and conserved non-coding sequences (CNS).
Figure 4. Phylogenetic relationships of the six Lagerstroemia taxa. The ML topology is shown with MP bootstrap support/ML bootstrap support/Bayesian posterior probability listed at each node.
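The sliding-window scan described for Figure 2 can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes the genomes are already aligned to equal length, it defines nucleotide diversity as the mean proportion of pairwise differences, and it skips gap or ambiguous positions pair by pair. The function names are hypothetical. Only the window length (600 bp) and step size (200 bp) come from the caption.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Mean proportion of differing sites over all sequence pairs.

    Assumption: positions where either sequence has a gap or an
    ambiguous base are skipped for that pair only.
    """
    total = 0.0
    n_pairs = 0
    for a, b in combinations(seqs, 2):
        compared = 0
        diffs = 0
        for x, y in zip(a, b):
            if x in "ACGT" and y in "ACGT":
                compared += 1
                diffs += x != y
        if compared:
            total += diffs / compared
            n_pairs += 1
    return total / n_pairs if n_pairs else 0.0


def sliding_window_pi(aligned, window=600, step=200):
    """Return (window midpoint, nucleotide diversity) for each window."""
    length = len(aligned[0])
    results = []
    for start in range(0, length - window + 1, step):
        chunk = [s[start:start + window] for s in aligned]
        results.append((start + window // 2, nucleotide_diversity(chunk)))
    return results


if __name__ == "__main__":
    # Toy alignment; real input would be the aligned cp genomes.
    toy = ["AAAAAAAA", "AAAAAAAT"]
    for midpoint, pi in sliding_window_pi(toy, window=4, step=2):
        print(midpoint, pi)
```

Windows whose diversity stands out from the genome-wide background are the natural candidates for variable marker regions.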
Summary of complete chloroplast genome features of the six Lagerstroemia taxa.
| Large single copy (LSC, bp) | 84,062 | 84,046 | 83,890 | 84,193 | 83,920 | 83,811 |
| Small single copy (SSC, bp) | 16,919 | 16,915 | 16,909 | 16,833 | 16,934 | 16,909 |
| Inverted repeat (IR, bp) | 25,625 | 25,622 | 25,625 | 25,750 | 25,793 | 25,677 |
| Total length (bp) | 152,231 | 152,205 | 152,049 | 152,526 | 152,440 | 152,074 |
| Protein-coding genes | 78 | 78 | 78 | 78 | 78 | 78 |
| rRNA | 4 | 4 | 4 | 4 | 4 | 4 |
| tRNA | 30 | 30 | 30 | 30 | 30 | 30 |
| Total genes | 112 | 112 | 112 | 112 | 112 | 112 |
| GC% | 37.59 | 37.59 | 37.59 | 37.57 | 37.60 | 37.62 |
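Because the inverted repeat occurs twice in a quadripartite chloroplast genome, each total length in the table should equal LSC + SSC + 2 × IR. The short check below confirms this for all six columns. The values are copied from the table, and the column order is kept as printed because the species labels are not shown.

```python
# (LSC, SSC, IR, reported total) in bp, one tuple per table column.
columns = [
    (84062, 16919, 25625, 152231),
    (84046, 16915, 25622, 152205),
    (83890, 16909, 25625, 152049),
    (84193, 16833, 25750, 152526),
    (83920, 16934, 25793, 152440),
    (83811, 16909, 25677, 152074),
]


def genome_length(lsc, ssc, ir):
    """Total length of a quadripartite genome: the IR is counted twice."""
    return lsc + ssc + 2 * ir


# Collect the index of every column whose reported total disagrees.
mismatches = [
    i for i, (lsc, ssc, ir, total) in enumerate(columns)
    if genome_length(lsc, ssc, ir) != total
]

if __name__ == "__main__":
    print("columns with inconsistent totals:", mismatches)
```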
Distribution of each SSR category in the six Lagerstroemia taxa.
| Mono-nucleotide | 28 | 18 | 4 | 6 | 28 | 6 | 2 | 2 | |
| Di-nucleotide | 4 | 1 | 2 | 1 | 3 | 1 | 0 | 0 | |
| Tri-nucleotide | 6 | 4 | 2 | 1 | 3 | 2 | 1 | 1 | |
| Tetra-nucleotide | 7 | 3 | 3 | 2 | 6 | 2 | 0 | 0 | |
| Penta-nucleotide | 2 | 2 | 0 | 0 | 0 | 0 | 1 | 1 | |
| Subtotal | 47 | 28 | 11 | 10 | 40 | 11 | 4 | 4 | |
| Mono-nucleotide | 24 | 17 | 4 | 3 | 16 | 5 | 2 | 2 | |
| Di-nucleotide | 4 | 2 | 2 | 1 | 4 | 1 | 0 | 0 | |
| Tri-nucleotide | 7 | 4 | 2 | 1 | 3 | 2 | 1 | 1 | |
| Tetra-nucleotide | 7 | 2 | 3 | 2 | 5 | 2 | 0 | 0 | |
| Penta-nucleotide | 2 | 2 | 0 | 0 | 0 | 0 | 1 | 1 | |
| Subtotal | 44 | 27 | 11 | 7 | 28 | 10 | 4 | 4 | |
| Mono-nucleotide | 18 | 10 | 3 | 5 | 11 | 5 | 1 | 1 | |
| Di-nucleotide | 6 | 4 | 1 | 1 | 4 | 1 | 1 | 0 | |
| Tri-nucleotide | 7 | 4 | 2 | 1 | 3 | 2 | 1 | 1 | |
| Tetra-nucleotide | 7 | 2 | 3 | 2 | 5 | 2 | 0 | 0 | |
| Penta-nucleotide | 2 | 2 | 0 | 0 | 0 | 0 | 1 | 1 | |
| Subtotal | 40 | 22 | 9 | 9 | 23 | 10 | 4 | 3 | |
| Mono-nucleotide | 29 | 19 | 4 | 6 | 18 | 7 | 2 | 2 | |
| Di-nucleotide | 5 | 3 | 2 | 0 | 4 | 1 | 0 | 0 | |
| Tri-nucleotide | 7 | 4 | 2 | 1 | 3 | 2 | 1 | 1 | |
| Tetra-nucleotide | 7 | 2 | 3 | 2 | 5 | 2 | 0 | 0 | |
| Penta-nucleotide | 2 | 2 | 0 | 0 | 0 | 0 | 1 | 1 | |
| Subtotal | 50 | 30 | 11 | 9 | 30 | 12 | 4 | 4 | |
| Mono-nucleotide | 24 | 17 | 2 | 5 | 20 | 2 | 1 | 1 | |
| Di-nucleotide | 6 | 3 | 2 | 1 | 4 | 2 | 0 | 0 | |
| Tri-nucleotide | 7 | 4 | 2 | 1 | 4 | 1 | 1 | 1 | |
| Tetra-nucleotide | 9 | 4 | 3 | 2 | 7 | 2 | 0 | 0 | |
| Penta-nucleotide | 2 | 2 | 0 | 0 | 0 | 0 | 1 | 1 | |
| Subtotal | 48 | 30 | 9 | 9 | 35 | 7 | 3 | 3 | |
| Mono-nucleotide | 25 | 15 | 4 | 6 | 15 | 6 | 2 | 2 | |
| Di-nucleotide | 4 | 1 | 2 | 1 | 3 | 1 | 0 | 0 | |
| Tri-nucleotide | 7 | 4 | 2 | 1 | 3 | 2 | 1 | 1 | |
| Tetra-nucleotide | 8 | 3 | 4 | 1 | 6 | 2 | 0 | 0 | |
| Penta-nucleotide | 2 | 1 | 0 | 1 | 0 | 0 | 1 | 1 | |
| Subtotal | 46 | 24 | 12 | 10 | 27 | 11 | 4 | 4 | |
| Total | 275 | 161 | 63 | 54 | 183 | 61 | 23 | 22 |
Numbers and percentages of SSRs in the six Lagerstroemia taxa.
| 28 (59.57%) | 11 (23.40%) | 10 (21.28%) | 40 (85.11%) | 11 (23.40%) | 4 (8.51%) | 4 (8.51%) | 47 | |
| 27 (61.36%) | 11 (25.00%) | 7 (15.91%) | 28 (63.64%) | 10 (22.73%) | 4 (9.09%) | 4 (9.09%) | 44 | |
| 22 (55.00%) | 9 (22.50%) | 9 (22.50%) | 23 (57.50%) | 10 (25.00%) | 4 (10.00%) | 3 (7.50%) | 40 | |
| 30 (60.00%) | 11 (22.00%) | 9 (18.00%) | 30 (60.00%) | 12 (24.00%) | 4 (8.00%) | 4 (8.00%) | 50 | |
| 30 (62.50%) | 9 (18.75%) | 9 (18.75%) | 35 (72.92%) | 7 (14.58%) | 3 (6.25%) | 3 (6.25%) | 48 | |
| 24 (52.17%) | 12 (26.09%) | 10 (21.74%) | 27 (58.70%) | 11 (23.91%) | 4 (8.70%) | 4 (8.70%) | 46 | |
| Average | 26.8 | 10.5 | 9.0 | 30.5 | 10.2 | 3.8 | 3.7 | 45.8 |
| Min.–Max. | 22–30 | 9–12 | 7–10 | 23–40 | 7–12 | 3–4 | 3–4 | 40–50 |
| Total | 161 (58.55%) | 63 (22.91%) | 54 (19.64%) | 183 (66.55%) | 61 (22.18%) | 23 (8.36%) | 22 (8.00%) | 275 |
SSRs, simple sequence repeats; LSC, large single-copy region; SSC, small single-copy region; IRa, inverted repeat region a; IRb, inverted repeat region b.
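The "Average" and "Total" rows of the table above can be recomputed from the per-taxon subtotals. The sketch below does so under one assumption: that the four count columns before the final total correspond to LSC, SSC, IRa, and IRb, in the order the footnote lists them. The dictionary keys are labels chosen for this illustration.

```python
# Per-taxon SSR counts, in the row order of the table.
counts = {
    "LSC": [40, 28, 23, 30, 35, 27],
    "SSC": [11, 10, 10, 12, 7, 11],
    "IRa": [4, 4, 4, 4, 3, 4],
    "IRb": [4, 4, 3, 4, 3, 4],
    "total": [47, 44, 40, 50, 48, 46],
}

grand_total = sum(counts["total"])


def summarize(values, denominator):
    """Return (sum, mean to 1 decimal, share of denominator in % to 2 decimals)."""
    s = sum(values)
    return s, round(s / len(values), 1), round(100 * s / denominator, 2)


if __name__ == "__main__":
    for region in ("LSC", "SSC", "IRa", "IRb"):
        print(region, summarize(counts[region], grand_total))
```

The percentages in the "Total" row use the grand total of 275 SSRs as the denominator, whereas the per-taxon rows use each taxon's own total.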
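Variable site analyses in the six Lagerstroemia taxa.
| Region | Number of sites | Variable sites (%) | Informative sites (%) | Nucleotide diversity |
| Large single copy region | 84,868 | 771 (0.91%) | 150 (19.46%) | 0.00345 |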
| Small single copy region | 17,077 | 281 (1.65%) | 55 (19.57%) | 0.00639 |
| Inverted repeat region | 25,961 | 133 (0.51%) | 15 (11.28%) | 0.00175 |
| Complete cp genome | 153,842 | 1330 (0.86%) | 238 (17.89%) | 0.00322 |
Variable sites (%): percentage of variable sites relative to the total number of sites.
Informative sites (%): percentage of informative sites relative to the number of variable sites.
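The two footnotes define the percentages with different denominators. The sketch below recomputes both from the counts in the table. The region keys are shorthand labels chosen for this illustration.

```python
# region: (number of sites, variable sites, informative sites)
regions = {
    "LSC": (84868, 771, 150),
    "SSC": (17077, 281, 55),
    "IR": (25961, 133, 15),
    "complete": (153842, 1330, 238),
}


def site_percentages(n_sites, n_variable, n_informative):
    """Variable sites as % of all sites; informative sites as % of variable sites."""
    pct_variable = round(100 * n_variable / n_sites, 2)
    pct_informative = round(100 * n_informative / n_variable, 2)
    return pct_variable, pct_informative


if __name__ == "__main__":
    for name, row in regions.items():
        print(name, site_percentages(*row))
```

By this measure the SSC region is the most variable (1.65% of sites) and the IR region the most conserved (0.51%), which matches the diversity values in the last column of the table.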
Number of nucleotide substitutions and insertions/deletions in the six Lagerstroemia taxa.
| - | 66 | 293 | 95 | 29 | 31 |
| 257 | - | 295 | 57 | 79 | 72 |
| 1084 | 1089 | - | 315 | 297 | 301 |
| 309 | 134 | 1105 | - | 103 | 91 |
| 24 | 249 | 1082 | 303 | - | 44 |
| 63 | 254 | 1083 | 291 | 57 | - |
The lower triangle shows the number of nucleotide substitutions; the upper triangle shows the number of insertions/deletions.
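A combined matrix of this kind is easy to misread, so the sketch below shows one way to query it. It assumes the six rows above form a square matrix with an empty diagonal, which is a reconstruction from the flattened rows. Taxa are addressed by 0-based row index because the species labels are not shown, and the function names are hypothetical.

```python
# Lower triangle: substitutions; upper triangle: insertions/deletions.
matrix = [
    [None, 66, 293, 95, 29, 31],
    [257, None, 295, 57, 79, 72],
    [1084, 1089, None, 315, 297, 301],
    [309, 134, 1105, None, 103, 91],
    [24, 249, 1082, 303, None, 44],
    [63, 254, 1083, 291, 57, None],
]


def substitutions(i, j):
    """Substitution count between taxa i and j (read from the lower triangle)."""
    return matrix[max(i, j)][min(i, j)]


def indels(i, j):
    """Indel count between taxa i and j (read from the upper triangle)."""
    return matrix[min(i, j)][max(i, j)]


if __name__ == "__main__":
    # The third taxon (index 2) differs from every other taxon by more
    # than 1,000 substitutions.
    print([substitutions(2, k) for k in range(6) if k != 2])
```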
Pairwise substitution rates (dN/dS) between the six Lagerstroemia taxa.
| 0.3102 (0.0009/0.0029) | |||||
| 0.3762 (0.0036/0.0095) | 0.3755 (0.0037/0.0098) | ||||
| 0.3178 (0.0009/0.0027) | 0.2605 (0.0002/0.0009) | 0.3710 (0.0036/0.0097) | |||
| 0.1688 (0.0001/0.0006) | 0.3374 (0.0010/0.0028) | 0.3767 (0.0036/0.0094) | 0.3326 (0.0009/0.0027) | ||
| 0.3420 (0.0002/0.0005) | 0.3174 (0.0009/0.0028) | 0.3711 (0.0035/0.0094) | 0.2963 (0.0008/0.0026) | 0.6081 (0.0002/0.0003) |
Primers for PCR amplification of the 12 relatively highly variable regions among the six Lagerstroemia taxa.
| 1 | TGGGTTCATAGGACTCTATCCA | TTGCAATTGATGTGCGATCTCGA | 1202 | 56 | |
| 2 | ACCGAGTTATCAACGGAAACGGA | TAAAGTTTCTGCTCGGAATAAGA | 882 | 53 | |
| 3 | TCTAGAGGGATTATCTAGAAAGCA | AAGAGGTCAACGATTACGTGAGT | 975 | 55 | |
| 4 | AGAGGAATGTCCGTTGGG | CGATGACTTACGCCTTACC | 1471 | 53 | |
| 5 | TCTCTTAATTGAATTGCAATTCA | AATAGATGAATAGTCATTCGATGA | 703 | 49.5 | |
| 6 | GTGATCCTTCCGAATGGGATAAG | CAGTGAATTTCCATTTACTGATAT | 672 | 51.8 | |
| 7 | AGTAGAAGGTTTATATATCTAATA | GATTATTTCGTTGCAATCACAAC | 905 | 48.5 | |
| 8 | TTAGTTGCCACCGGTATGAGAGT | GGTCCTCTTCCCCATTACTTAGA | 1845 | 58 | |
| 9 | AGGTATAATCCATGAATATTGAT | TGAATTCATTATAGGACTTATTA | 1755 | 48 | |
| 10 | ATCGGTTGATAAATGAATTCCAA | CAAGGTTCAATTTGATCTAATCT | 790 | 51 | |
| 11 | TAAGTCTTCGTATCTTATTGGTG | GAGTTTGGATATTCTGATGATTCA | 1122 | 53 | |
| 12 | TAACCTCAGCCTTAGCATT | GGACAGAATAGACAAACCCT | 2191 | 50 |
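As a quick sanity check on primers such as these, the length, GC content, and a rough melting temperature can be computed directly from the sequence. The sketch below uses the simple Wallace rule, Tm = 2(A+T) + 4(G+C). That rule is only a coarse estimate for oligonucleotides of this length and is not necessarily how the values in the table were derived. The function names are hypothetical, and treating the two sequence columns of row 1 as a primer pair is an assumption.

```python
def gc_content(primer):
    """GC content of a primer in percent."""
    primer = primer.upper()
    return 100 * sum(base in "GC" for base in primer) / len(primer)


def wallace_tm(primer):
    """Rough melting temperature (degrees C) by the Wallace rule."""
    primer = primer.upper()
    at = sum(base in "AT" for base in primer)
    gc = sum(base in "GC" for base in primer)
    return 2 * at + 4 * gc


if __name__ == "__main__":
    # The two primer sequences from row 1 of the table.
    for seq in ("TGGGTTCATAGGACTCTATCCA", "TTGCAATTGATGTGCGATCTCGA"):
        print(seq, len(seq), round(gc_content(seq), 2), wallace_tm(seq))
```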