Ning Chen1, Li-Na Sha1, Yi-Ling Wang2, Ling-Juan Yin3, Yue Zhang1, Yi Wang1, Dan-Dan Wu1, Hou-Yang Kang1, Hai-Qin Zhang1, Yong-Hong Zhou1, Gen-Lou Sun4, Xing Fan1.
Abstract
To investigate the pattern of chloroplast genome variation in Triticeae, we comprehensively analyzed indels in protein-coding genes and intergenic sequences, gene loss/pseudogenization, intron variation, expansion/contraction of the inverted repeat regions, and the relationship between sequence characteristics and chloroplast genome size in 34 monogenomic Triticeae plants. Ancestral genome reconstruction suggests that major length variations occurred in the four stem branches of monogenomic Triticeae, followed by independent changes in each genus. The chloroplast genome sizes of monogenomic Triticeae were highly variable. The chloroplast genomes of Pseudoroegneria, Dasypyrum, Lophopyrum, Thinopyrum, Eremopyrum, Agropyron, Australopyrum, and Henrardia have evolved toward size reduction, largely because of pseudogene elimination events and deletions in intergenic regions. The Aegilops/Triticum complex, Taeniatherum, Secale, Crithopsis, Heteranthelium, and Hordeum have larger chloroplast genomes. The large size variation in the major lineages and their subclades is most likely a consequence of adaptive processes, since these variations were significantly correlated with divergence time and historical climatic changes. We also found that several intergenic regions, such as petN-trnC and psbE-petL, contain unique genetic information and can be used as important tools to identify maternal relationships among Triticeae species. Our results contribute novel knowledge of plastid genome evolution in Triticeae.
Keywords: IR expansion/contraction; Triticeae; chloroplast genome; genome size; genome variation
Year: 2021 PMID: 34966398 PMCID: PMC8710740 DOI: 10.3389/fpls.2021.741063
Source DB: PubMed Journal: Front Plant Sci ISSN: 1664-462X Impact factor: 5.753
List of taxa used in this study.
| No. | Sequence number | Species | Accession no. | Genome | Ploidy | Origin | GenBank | Chloroplast genome size (bp) |
| 1 | KJ614418 | | | Sb | 2× | SE Mediterranean | KJ614418 | 136,861 |
| 2 | KJ614416 | | | Sl | 2× | E Mediterranean | KJ614416 | 136,875 |
| 3 | KY636033 | | | C | 2× | NE Mediterranean | KY636033 | 136,063 |
| 4 | KJ614413 | | | SS | 2× | E Mediterranean | KJ614413 | 136,870 |
| 5 | KJ614419 | | | Ssh | 2× | Israel, Lebanon | KJ614419 | 136,867 |
| 6 | KJ614406 | | | S | 2× | E Mediterranean | KJ614406 | 135,652 |
| 7 | KJ614405 | | | S | 2× | E Mediterranean | KJ614405 | 135,660 |
| 8 | KJ614412 | | Cultivar | D | 2× | SW–C Asia | KJ614412 | 135,568 |
| 10 | KY636059 | | | U | 2× | SE Europe, SW Asia | KY636059 | 136,028 |
| 9 | KY636056 | | | U | 2× | SE Europe, SW Asia | KY636056 | 136,031 |
| 11 | C22 | | | P | 2× | Kazakhstan | KY126307 | 135,554 |
| 12 | K30 | | | P | 2× | Nei Monggol, China | MH285848 | 135,547 |
| 13 | KY636075 | | | T | 2× | Turkey | KY636075 | 135,787 |
| 14 | E48 | | | W | 2× | New South Wales, Australia | MH331642 | 135,417 |
| 15 | CRK28 | | ND | K | 2× | Greece | MH285849 | 136,436 |
| 16 | DAK25 | | | V | 2× | Greece | MH285850 | 135,249 |
| 17 | ERK39 | | | F | 2× | Afghanistan | MH285851 | 135,589 |
| 18 | ERK31 | | | Xe | 2× | Xinjiang, China | MH285852 | 135,554 |
| 19 | HEK26 | | – | O | 2× | Iran | MH285853 | 135,659 |
| 20 | HEK24 | | | Q | 2× | Iran | MH285854 | 136,768 |
| 21 | E46 | | | H | 2× | Xinjiang, China | MH331641 | 136,968 |
| 22 | KM974741 | | CAN:Saarela | H | 3× | – | KM974741 | 136,826 |
| 23 | EF115541 | | Morex | I | 2× | Minnesota, United States | EF115541 | 136,462 |
| 24 | KC912689 | | | I | 2× | Minnesota, United States | KC912689 | 136,043 |
| 25 | K27 | | | Ee | 2× | Tunisia | MH331643 | 135,020 |
| 26 | C26 | | | Ns | 2× | Former Soviet Union | MH331640 | 136,597 |
| 27 | E47 | | | St | 2× | Iran | KX822019 | 135,026 |
| 28 | PSK29 | | | St | 2× | Wyoming, United States | MH285855 | 135,165 |
| 29 | KC912691 | | Imperial | R | 2× | Turkey | KC912691 | 135,564 |
| 30 | TAK23 | | | Ta | 2× | Afghanistan | MH285856 | 136,861 |
| 31 | THC25 | | | Eb | 2× | Estonia | MH331639 | 135,003 |
| 32 | LC005977 | | – | A | 2× | Turkey | LC005977 | 136,886 |
| 33 | KC912692 | | – | A | 2× | Turkey | KC912692 | 136,870 |
| 34 | KJ614411 | | | A | 2× | E Mediterranean, Caucasus | KJ614411 | 136,865 |
| 35 | EU325680 | | Cultivar | – | – | – | EU325680 | 135,199 |
GenBank accession numbers in bold represent previously published sequences from GenBank (
FIGURE 1. Maximum-likelihood tree inferred from complete chloroplast (cp) genome sequences of the diploid Triticeae using RAxML v.8.2.10. (A) Phylogenetic tree topology with bootstrap support (BS) above and posterior probabilities (PP) below branches (>50% BS; >0.9 PP). (B) Gene loss/pseudogenization, indels in protein-coding genes, intron variation, and intergenic sequence (IGS) indels within the cp genomes, characterized and mapped on the branches of the phylogenetic tree.
FIGURE 2. Comparison of the borders of the large single copy (LSC) (blue), small single copy (SSC) (green), and inverted repeat (IR) (orange) regions among the 34 cp genomes.
Sizes of the complete chloroplast genomes, non-coding sequences, and protein-coding sequences of 34 Triticeae species in four clades.
| GenBank accession no. | Species | Chloroplast genome size (bp) | Non-coding size (bp) | Coding size (bp) | Clade |
| KJ614418 | | 136,861 | 77,543 | 59,318 | Clade I |
| KJ614416 | | 136,875 | 77,557 | 59,318 | Clade I |
| KY636033 | | 136,063 | 76,745 | 59,318 | Clade I |
| KJ614413 | | 136,870 | 77,552 | 59,318 | Clade I |
| KJ614419 | | 136,867 | 77,549 | 59,318 | Clade I |
| KJ614406 | | 135,652 | 76,334 | 59,318 | Clade I |
| KJ614405 | | 135,660 | 76,343 | 59,317 | Clade I |
| KJ614412 | | 135,568 | 76,250 | 59,318 | Clade I |
| KY636059 | | 136,028 | 76,710 | 59,318 | Clade I |
| KY636056 | | 136,031 | 76,713 | 59,318 | Clade I |
| KY636075 | | 135,787 | 77,485 | 58,302 | Clade I |
| KC912691 | | 135,564 | 77,109 | 58,455 | Clade I |
| MH285856 | | 136,861 | 77,537 | 59,324 | Clade I |
| LC005977 | | 136,886 | 77,568 | 59,318 | Clade I |
| KC912692 | | 136,870 | 75,821 | 61,049 | Clade I |
| KJ614411 | | 136,865 | 77,547 | 59,318 | Clade I |
| MH285849 | | 136,436 | 77,109 | 59,327 | Clade I |
| MH285854 | | 136,768 | 77,490 | 59,278 | Clade I |
| Average | | 136,362 | 77,053 | 59,308 | |
| KX822019 | | 135,026 | 75,711 | 59,315 | Clade II |
| MH285855 | | 135,165 | 75,854 | 59,311 | Clade II |
| MH285850 | | 135,249 | 75,877 | 59,372 | Clade II |
| MH331643 | | 135,020 | 75,707 | 59,313 | Clade II |
| MH331639 | | 135,003 | 75,694 | 59,309 | Clade II |
| Average | | 135,093 | 75,769 | 59,324 | |
| KY126307 | | 135,554 | 76,251 | 59,303 | Clade III |
| MH285848 | | 135,547 | 76,244 | 59,303 | Clade III |
| MH331642 | | 135,417 | 76,114 | 59,303 | Clade III |
| MH285851 | | 135,589 | 76,248 | 59,341 | Clade III |
| MH285852 | | 135,554 | 76,283 | 59,271 | Clade III |
| MH285853 | | 135,659 | 76,353 | 59,306 | Clade III |
| Average | | 135,553 | 76,249 | 59,305 | |
| MH331641 | | 136,968 | 77,672 | 59,296 | Clade IV |
| KM974741 | | 136,826 | 77,466 | 59,360 | Clade IV |
| EF115541 | | 136,462 | 77,153 | 59,309 | Clade IV |
| KC912689 | | 136,043 | 76,945 | 59,098 | Clade IV |
| Average | | 136,575 | 77,309 | 59,266 | |
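The size table lends itself to a quick consistency check. The following is a minimal sketch, not the authors' workflow: it transcribes the Clade II to IV rows, confirms that non-coding plus coding size equals the reported genome size, and recomputes the per-clade averages. The names `ROWS`, `round_half_up`, and `clade_means` are hypothetical helpers introduced here, and rounding halves upward is an assumption about how the table averages were rounded.

```python
from collections import defaultdict

# (accession, genome size, non-coding size, coding size, clade), all in bp,
# transcribed from the size table above (Clades II-IV only, to keep this short).
ROWS = [
    ("KX822019", 135026, 75711, 59315, "II"),
    ("MH285855", 135165, 75854, 59311, "II"),
    ("MH285850", 135249, 75877, 59372, "II"),
    ("MH331643", 135020, 75707, 59313, "II"),
    ("MH331639", 135003, 75694, 59309, "II"),
    ("KY126307", 135554, 76251, 59303, "III"),
    ("MH285848", 135547, 76244, 59303, "III"),
    ("MH331642", 135417, 76114, 59303, "III"),
    ("MH285851", 135589, 76248, 59341, "III"),
    ("MH285852", 135554, 76283, 59271, "III"),
    ("MH285853", 135659, 76353, 59306, "III"),
    ("MH331641", 136968, 77672, 59296, "IV"),
    ("KM974741", 136826, 77466, 59360, "IV"),
    ("EF115541", 136462, 77153, 59309, "IV"),
    ("KC912689", 136043, 76945, 59098, "IV"),
]


def round_half_up(x):
    """Round a non-negative number to the nearest integer, halves upward (assumed convention)."""
    return int(x + 0.5)


def clade_means(rows):
    """Return {clade: (mean genome size, mean non-coding size, mean coding size)}."""
    groups = defaultdict(list)
    for _accession, total, noncoding, coding, clade in rows:
        groups[clade].append((total, noncoding, coding))
    return {
        clade: tuple(round_half_up(sum(column) / len(values)) for column in zip(*values))
        for clade, values in groups.items()
    }


# Accessions whose non-coding + coding sizes do not add up to the genome size.
INCONSISTENT = [row[0] for row in ROWS if row[2] + row[3] != row[1]]
MEANS = clade_means(ROWS)

print("inconsistent rows:", INCONSISTENT)
for clade, means in sorted(MEANS.items()):
    print("Clade", clade, means)
```

The same check extends to Clade I by appending its rows to `ROWS`.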
FIGURE 3. Ancestral state reconstructions traced on the ML tree inferred from two selected datasets (cp genome sequences and non-protein-coding sequences) using weighted squared-change parsimony. Colors indicate the geographic origin of the monogenomic genera. Capital letters in brackets indicate the genome type of each species.
FIGURE 4. Time-calibrated maternal phylogeny estimated from complete cp genomes of Triticeae using BEAST, with 95% confidence intervals.
Divergence time, number of indels, and cp genome size at each node of the phylogenomic tree of 34 Triticeae species.
| Node | Divergence time (MYA) | Indels | cp genome size (bp) |
| 1 | 0.66 | 28 | 136,030 |
| 2 | 0.65 | 77 | 135,925 |
| 3 | 0.84 | 115 | 135,977 |
| 4 | 1.67 | 182 | 136,423 |
| 5 | 0.03 | – | 136,873 |
| 6 | 0.33 | 14 | 136,871 |
| 7 | 1.12 | 20 | 136,868 |
| 8 | 2.1 | 258 | 136,328 |
| 9 | 7.85 | 413 | 136,464 |
| 10 | 0.13 | 122 | 136,878 |
| 11 | 0.41 | 156 | 136,874 |
| 12 | 8.12 | 513 | 136,383 |
| 13 | 0.12 | – | 135,656 |
| 14 | 2.22 | 143 | 136,058 |
| 15 | 8.98 | 625 | 136,332 |
| 16 | 12.4 | 656 | 136,338 |
| 17 | 12.71 | 780 | 136,362 |
| 18 | 2.27 | 71 | 135,012 |
| 19 | 3.19 | 178 | 135,091 |
| 20 | 3.89 | 190 | 135,075 |
| 21 | 12.62 | 226 | 135,093 |
| 22 | 20.49 | 977 | 136,086 |
| 23 | 0.85 | 39 | 135,551 |
| 24 | 13 | 190 | 135,561 |
| 25 | 4.7 | 101 | 135,572 |
| 26 | 13.29 | 269 | 135,532 |
| 27 | 17.16 | 360 | 135,553 |
| 28 | 21.18 | 1,239 | 135,976 |
| 29 | 8.7 | 51 | 136,253 |
| 30 | 6.37 | 139 | 136,804 |
| 31 | 15.09 | 343 | 136,528 |
| 32 | 27.03 | 1,568 | 136,043 |
| 33 | 27.63 | 1,702 | 136,059 |
FIGURE 5. Correlation tests of the number of indels in the 34 chloroplast genomes of Triticeae species against divergence time and chloroplast genome size. (A) Number of indels against divergence time (R = 0.87, p = 1.6e-10). (B) Number of indels against chloroplast genome size (R = 0.37, p = 0.04).
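A correlation of this kind can be recomputed from the node table above. The sketch below assumes a Pearson coefficient, which the caption does not state, and skips nodes 5 and 13 because their indel counts are listed as "–". `NODES` and `pearson_r` are hypothetical names used only for illustration, and no p-value is derived here.

```python
import math

# (divergence time in MYA, number of indels, cp genome size in bp) per node,
# transcribed from the node table above; nodes 5 and 13 have no indel count.
NODES = [
    (0.66, 28, 136030), (0.65, 77, 135925), (0.84, 115, 135977),
    (1.67, 182, 136423), (0.33, 14, 136871), (1.12, 20, 136868),
    (2.1, 258, 136328), (7.85, 413, 136464), (0.13, 122, 136878),
    (0.41, 156, 136874), (8.12, 513, 136383), (2.22, 143, 136058),
    (8.98, 625, 136332), (12.4, 656, 136338), (12.71, 780, 136362),
    (2.27, 71, 135012), (3.19, 178, 135091), (3.89, 190, 135075),
    (12.62, 226, 135093), (20.49, 977, 136086), (0.85, 39, 135551),
    (13, 190, 135561), (4.7, 101, 135572), (13.29, 269, 135532),
    (17.16, 360, 135553), (21.18, 1239, 135976), (8.7, 51, 136253),
    (6.37, 139, 136804), (15.09, 343, 136528), (27.03, 1568, 136043),
    (27.63, 1702, 136059),
]


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


times = [node[0] for node in NODES]
indels = [node[1] for node in NODES]
sizes = [node[2] for node in NODES]

R_TIME = pearson_r(indels, times)  # indels against divergence time (panel A)
R_SIZE = pearson_r(indels, sizes)  # indels against cp genome size (panel B)

print(f"indels vs divergence time: r = {R_TIME:.2f}")
print(f"indels vs cp genome size:  r = {R_SIZE:.2f}")
```

If the authors used a rank-based coefficient instead, `pearson_r` could be applied to the ranks of each variable rather than to the raw values.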
FIGURE 6. Comparisons of the complete cp genome, protein-coding regions, and non-protein-coding regions among the four clades; significant differences were assessed with the Kruskal–Wallis test (p < 0.05). (A) Complete cp genome sequences; (B) protein-coding gene sequences; (C) non-protein-coding sequences. *p < 0.05; **p < 0.01.
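To illustrate the kind of test named in the Figure 6 caption, the following sketch computes a tie-corrected Kruskal–Wallis H statistic for the complete cp genome sizes of the four clades, using the values in the size table of this section. It is an assumption-laden illustration rather than the authors' analysis: `CLADES`, `average_ranks`, and `kruskal_wallis_h` are hypothetical names, and instead of an exact p-value the statistic is compared with 7.815, the standard tabulated chi-square critical value for 3 degrees of freedom at the 0.05 level.

```python
from collections import Counter

# Complete cp genome sizes (bp) per clade, transcribed from the size table.
CLADES = {
    "I": [136861, 136875, 136063, 136870, 136867, 135652, 135660, 135568, 136028,
          136031, 135787, 135564, 136861, 136886, 136870, 136865, 136436, 136768],
    "II": [135026, 135165, 135249, 135020, 135003],
    "III": [135554, 135547, 135417, 135589, 135554, 135659],
    "IV": [136968, 136826, 136462, 136043],
}

CHI2_CRIT_DF3 = 7.815  # chi-square critical value, df = 3, alpha = 0.05


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda idx: values[idx])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Tie-corrected Kruskal-Wallis H statistic for a list of samples."""
    pooled = [value for group in groups for value in group]
    n = len(pooled)
    ranks = average_ranks(pooled)
    weighted = 0.0
    start = 0
    for group in groups:
        rank_sum = sum(ranks[start:start + len(group)])
        weighted += rank_sum ** 2 / len(group)
        start += len(group)
    h = 12.0 / (n * (n + 1)) * weighted - 3 * (n + 1)
    # Tie correction: divide by 1 - sum(t^3 - t) / (n^3 - n) over tied groups.
    ties = sum(count ** 3 - count for count in Counter(pooled).values())
    return h / (1 - ties / (n ** 3 - n))


H_ALL = kruskal_wallis_h(list(CLADES.values()))
print(f"H = {H_ALL:.2f}; exceeds critical value: {H_ALL > CHI2_CRIT_DF3}")
```

The same function can be applied to the coding and non-coding size columns to mirror panels (B) and (C).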