Ying Wu1,2, Fang Liu1, Dai-Gang Yang1, Wei Li1, Xiao-Jian Zhou1, Xiao-Yu Pei1, Yan-Gai Liu1, Kun-Lun He1, Wen-Sheng Zhang1, Zhong-Ying Ren1, Ke-Hai Zhou1, Xiong-Feng Ma1, Zhong-Hu Li2.
Abstract
Cotton is one of the most economically important fiber crop plants worldwide. The genus Gossypium contains a single allotetraploid group (AD) and eight diploid genome groups (A-G and K). However, the evolution of repeat sequences in the chloroplast genomes and the phylogenetic relationships of Gossypium species are unclear. Thus, we determined the variations in the repeat sequences and the evolutionary relationships of 40 cotton chloroplast genomes, which represented the greatest diversity in the genus, including five newly sequenced diploid species, i.e., G. nandewarense (C1-n), G. armourianum (D2-1), G. lobatum (D7), G. trilobum (D8), and G. schwendimanii (D11), and an important semi-wild race of upland cotton, G. hirsutum race latifolium (AD1). The genome structure, gene order, and GC content of cotton species were similar to those of other higher plant plastid genomes. In total, 2860 long sequence repeats (>10 bp in length) were identified, where the F-genome species had the largest number of repeats (G. longicalyx F1: 108) and the E-genome species had the fewest (G. stocksii E1: 53). Large-scale repeat sequences possibly enrich the genetic information and maintain genome stability in cotton species. We also identified 10 divergence hotspot regions, i.e., rpl33-rps18, psbZ-trnG (GCC), rps4-trnT (UGU), trnL (UAG)-rpl32, trnE (UUC)-trnT (GGU), atpE, ndhI, rps2, ycf1, and ndhF, which could be useful molecular genetic markers for future population genetics and phylogenetic studies. Site-specific selection analysis showed that some of the coding sites of 10 chloroplast genes (atpB, atpE, rps2, rps3, petB, petD, ccsA, cemA, ycf1, and rbcL) were under protein sequence evolution. Phylogenetic analysis based on the whole plastomes suggested that the Gossypium species grouped into six previously identified genetic clades. Interestingly, all 13 D-genome species clustered into a strong monophyletic clade.
Unexpectedly, the cotton species with C, G, and K-genomes were admixed and nested in a large clade, which could have been due to their recent radiation, incomplete lineage sorting, and introgressive hybridization among different cotton lineages. In conclusion, the results of this study provide new insights into the evolution of repeat sequences in chloroplast genomes and interspecific relationships in the genus Gossypium.
Keywords: Gossypium; chloroplast genome; divergent hotspot; phylogeny; repeat sequence
Year: 2018 PMID: 29619041 PMCID: PMC5871733 DOI: 10.3389/fpls.2018.00376
Source DB: PubMed Journal: Front Plant Sci ISSN: 1664-462X Impact factor: 5.753
Characteristics of chloroplast genomes in six Gossypium species.
| Genome features | ||||||
|---|---|---|---|---|---|---|
| Size (bp) | 160080 | 160347 | 159677 | 160142 | 160205 | 160199 |
| LSC length (bp) | 88657 | 88848 | 88284 | 88735 | 88811 | 88779 |
| SSC length (bp) | 20241 | 20287 | 20241 | 20233 | 20294 | 20318 |
| IR length (bp) | 25591 | 25606 | 25576 | 25587 | 25550 | 25551 |
| Coding (bp) | 78612 | 78528 | 78531 | 78552 | 78696 | 78681 |
| Non-coding (bp) | 81468 | 81819 | 81146 | 81590 | 81509 | 81518 |
| Number of genes | 130 | 130 | 130 | 130 | 130 | 130 |
| Protein-coding genes | 85 | 85 | 85 | 85 | 85 | 85 |
| tRNA genes | 37 | 37 | 37 | 37 | 37 | 37 |
| rRNA genes | 8 | 8 | 8 | 8 | 8 | 8 |
| Overall GC content (%) | 37.3 | 37.2 | 37.1 | 37.3 | 37.3 | 37.3 |
| GC content of LSC (%) | 35.3 | 35.2 | 35.1 | 35.3 | 35.3 | 35.2 |
| GC content of SSC (%) | 31.7 | 31.6 | 31.4 | 31.7 | 31.7 | 31.6 |
| GC content of IR (%) | 43.0 | 43.0 | 43.0 | 43.0 | 43.0 | 43.0 |
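
The figures in this table can be cross-checked against the usual quadripartite layout of a plastid genome, in which the total size equals LSC + SSC + 2 × IR. The sketch below is only an illustrative consistency check, not part of the authors' analysis. It assumes the IR row gives the length of a single inverted repeat copy, and it uses placeholder column positions (0 to 5) because the species names are missing from the header row. The function names and the 0.1 percentage-point GC tolerance are likewise assumptions made for this example.

```python
# Values transcribed from the table above. Columns are indexed 0..5 because
# the species labels are not present in the table header (placeholder indexing).
TABLE = {
    "size":      [160080, 160347, 159677, 160142, 160205, 160199],
    "lsc":       [88657, 88848, 88284, 88735, 88811, 88779],
    "ssc":       [20241, 20287, 20241, 20233, 20294, 20318],
    "ir":        [25591, 25606, 25576, 25587, 25550, 25551],
    "coding":    [78612, 78528, 78531, 78552, 78696, 78681],
    "noncoding": [81468, 81819, 81146, 81590, 81509, 81518],
    "gc_total":  [37.3, 37.2, 37.1, 37.3, 37.3, 37.3],
    "gc_lsc":    [35.3, 35.2, 35.1, 35.3, 35.3, 35.2],
    "gc_ssc":    [31.7, 31.6, 31.4, 31.7, 31.7, 31.6],
    "gc_ir":     [43.0, 43.0, 43.0, 43.0, 43.0, 43.0],
}


def check_column(i, gc_tol=0.1):
    """Check one table column for internal consistency.

    gc_tol is an assumed tolerance (percentage points) that allows for the
    one-decimal rounding of the reported GC values.
    """
    size = TABLE["size"][i]
    lsc, ssc, ir = TABLE["lsc"][i], TABLE["ssc"][i], TABLE["ir"][i]

    # Length-weighted mean GC content over LSC, SSC and the two IR copies.
    weighted_gc = (
        lsc * TABLE["gc_lsc"][i]
        + ssc * TABLE["gc_ssc"][i]
        + 2 * ir * TABLE["gc_ir"][i]
    ) / size

    return {
        # Assumes the IR row is the length of ONE inverted repeat copy.
        "quadripartite_ok": lsc + ssc + 2 * ir == size,
        "coding_ok": TABLE["coding"][i] + TABLE["noncoding"][i] == size,
        "gc_ok": abs(weighted_gc - TABLE["gc_total"][i]) <= gc_tol,
    }


def check_all():
    """Run the consistency checks on every column of the table."""
    return [check_column(i) for i in range(len(TABLE["size"]))]


if __name__ == "__main__":
    for idx, result in enumerate(check_all()):
        print(idx, result)
```

A check of this kind is mainly useful for catching transcription errors when table values are copied out of a paper by hand. It says nothing about the biological correctness of the annotations themselves.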