Zheng Xiao-Ming1, Wang Junrui1, Feng Li1, Liu Sha1, Pang Hongbo2, Qi Lan1, Li Jing1, Sun Yan1, Qiao Weihua1, Zhang Lifang1, Cheng Yunlian1, Yang Qingwen3.
Abstract
The chloroplast genome originated from photosynthetic organisms and has retained the core genes that mainly encode components of photosynthesis. However, the causes of variations in chloroplast genome size in seed plants have only been thoroughly analyzed within small subsets of spermatophytes. In this study, we conducted the first large-scale comparative analysis of the relationship between sequence characteristics and genome size in 272 seed plants, based on cross-species and phylogenetic signal analyses. Our results showed that variations in chloroplast genome size among closely related species can be attributed to inverted repeat regions, large or small single copies, intergenic regions, and gene number. However, chloroplast gene length underwent evolution affecting chloroplast genome size in seed plants irrespective of whether phylogenetic information was incorporated. Among chloroplast genes, atpA, accD and ycf1 account for 13% of the variation in genome size, and the average Ka/Ks values of homologous pairs of the three genes are larger than 1. The relationship between chloroplast genome size and gene length might be affected by selection during the evolution of spermatophytes. The variation in chloroplast genome size may influence energy generation and ecological strategy in seed plants.
Year: 2017 PMID: 28484234 PMCID: PMC5431534 DOI: 10.1038/s41598-017-01518-5
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Variations in chloroplast genome size and the sequence characteristics of chloroplast genomes within seed plants. The box plots of chloroplast genome size and sequence characteristics of the chloroplast genome are shown for each order. The complete order names are in Supplementary Table S1. The box plots represent the median (central line), first and third quartiles (black box), and outliers (black circles). TL, IRL, LSCL, SSCL, GRL, IGRL, and GN indicate the total length of the chloroplast genome, length of the inverted repeat region, large single copy, small single copy, gene region, intergenic region, and gene number, respectively. Red triangles in TL indicate species that have no inverted repeat regions. The numbers below the order names are the numbers of species collected in each order. The number in brackets indicates the number of species without an inverted repeat region. The three lines from top to bottom represent the first quartiles, the median, and the third quartiles of the sequence characteristics of all seed plants examined in this study, respectively.
Figure 2. Variations in chloroplast genome size and the sequence characteristics of the chloroplast genome in gymnosperms (pink), basal angiosperms (blue), monocots (yellow), and eudicots (green). The box plots represent the median (central line), first and third quartiles (black box), and outliers (black circles).
Standardized major axis (SMA) slope estimates describing the relationships between chloroplast genome size and TL, IRL, LSCL, SSCL, GRL, IGRL, GCC and GN for both the cross-species and the phylogenetic signal analyses.
| Chloroplast genome size vs. | Cross-species R² | Cross-species slope | Cross-species 95% CI | Phylogenetic R² | Phylogenetic slope | Phylogenetic 95% CI | K |
|---|---|---|---|---|---|---|---|
| IRL | 0.23 | 0.05 | (0.04, 0.05) | 0.11 | 0.03 | (0.03, 0.04) | 0.92 |
| LSCL | 0.24 | 0.04 | (0.04, 0.05) | 0.1 | 0.03 | (0.03, 0.05) | 0.93 |
| SSCL | 0.25 | 0.05 | (0.04, 0.05) | 0.11 | 0.04 | (0.04, 0.05) | 0.92 |
| GRL | 0.81 | 1.11 | (1.07, 1.13) | | | | 0.59 |
| IGRL | 0.69 | 0.83 | (0.78, 0.89) | 0.12 | 0.03 | (0.03, 0.04) | 0.95 |
| GCC | 0.09 | 0.29 | (0.25, 0.33) | 0.08 | 0.19 | (0.15, 0.22) | 0.97 |
| GN | 0.76 | 0.74 | (0.69, 0.76) | 0.13 | 0.07 | (0.06, 0.07) | 0.96 |
Statistically significant values are indicated in italics.
TL, IRL, LSCL, SSCL, GRL, IGRL and GN indicate the total length of the chloroplast genome, the length of the inverted repeat region, large single copy, small single copy, gene region, intergenic region, and the gene number, respectively. K describes the degree of the difference between the F-statistic of simulated data and observed F-statistic distributions.
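As a rough illustration of what a standardized major axis slope is, the sketch below fits an SMA line to paired measurements using the textbook definition: the slope is the ratio of the standard deviations of y and x, carrying the sign of their correlation. This is not the authors' pipeline. The function name `sma_fit` and the toy numbers are hypothetical, and the sketch covers neither the confidence intervals nor the phylogenetic correction reported in the table.

```python
import math


def sma_fit(x, y):
    """Fit a standardized major axis (SMA) line y = intercept + slope * x.

    Returns (slope, intercept, r_squared). Hypothetical helper for
    illustration only; not taken from the study.
    """
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("x and y must have the same length (at least 3)")

    mean_x = sum(x) / n
    mean_y = sum(y) / n

    # Sums of squares and cross-products around the means.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each vary")

    r = sxy / math.sqrt(sxx * syy)

    # SMA slope: sd(y) / sd(x), with the sign of the correlation.
    slope = math.copysign(math.sqrt(syy / sxx), r)
    intercept = mean_y - slope * mean_x
    return slope, intercept, r * r


if __name__ == "__main__":
    # Toy values (assumed, not data from the paper).
    feature_length = [10.0, 12.0, 15.0, 19.0, 25.0]
    genome_size = [120.0, 131.0, 140.0, 152.0, 160.0]
    slope, intercept, r2 = sma_fit(feature_length, genome_size)
    print(f"slope={slope:.3f} intercept={intercept:.3f} R2={r2:.3f}")
```

Unlike ordinary least squares, the SMA slope treats both variables symmetrically, which is why it is often preferred when neither variable is a clear predictor of the other.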
Figure 3. Relationship between chloroplast genome size and chloroplast gene length. (a) Heat maps of standardized contrast (SC) values for each gene. SCs were obtained by dividing the raw contrasts between gene length and the average gene length by the standard deviation. Green indicates that the gene was lost in the species. White indicates that SC is zero, which indicates that the sequence characteristics of this species were equal to the average of the sequence characteristics of all seed plants collected for this study. Red stands for an SC larger than zero, blue stands for an SC smaller than zero, and larger absolute values of SC are indicated by darker colors. I indicates genes for the photosynthetic apparatus, II comprises RNA genes and genes for the genetic apparatus, and III represents potential genes. (b) The box plots of Ka/Ks values of atpA, accD, ycf1, atpI, ndhE, rbcL, rps8, and matK in the homologous genes of plants.
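The standardized contrast described in panel (a) can be read as a per-gene z-score: the gene length in one species minus the average length across species, divided by the standard deviation. The sketch below follows that reading under stated assumptions: the caption does not say whether the sample or population standard deviation was used (the sample version is assumed here), lost genes are represented as `None`, and the function name and species labels are hypothetical.

```python
import statistics


def standardized_contrasts(lengths_by_species):
    """Return SC = (length - mean) / sd for one gene across species.

    `lengths_by_species` maps a species label to the gene length, or to
    None when the gene is lost in that species. Lost genes stay None,
    mirroring the separate color used for them in the heat map.
    """
    present = [v for v in lengths_by_species.values() if v is not None]
    if len(present) < 2:
        raise ValueError("need at least two species carrying the gene")

    mean = statistics.fmean(present)
    sd = statistics.stdev(present)  # sample SD (assumption)
    if sd == 0:
        raise ValueError("gene length does not vary across species")

    return {
        species: None if length is None else (length - mean) / sd
        for species, length in lengths_by_species.items()
    }


if __name__ == "__main__":
    # Toy lengths in base pairs (assumed, not data from the paper).
    toy_gene = {"sp_A": 1500, "sp_B": 1530, "sp_C": 1620, "sp_D": None}
    for species, sc in standardized_contrasts(toy_gene).items():
        print(species, "lost" if sc is None else f"{sc:+.2f}")
```

A positive value corresponds to a longer-than-average gene (red in the heat map), a negative value to a shorter-than-average gene (blue), and zero to the average (white).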