Saminathan Subburaj1, Shuanghe Cao1, Xianchun Xia1, Zhonghu He1,2.
Abstract
The rice gene seed dormancy 4 (OsSdr4) functions in seed dormancy and is a major factor associated with pre-harvest sprouting (PHS). Although this protein family has been studied in rice and other species, knowledge of the evolution of genes homologous to OsSdr4 in plants remains inadequate. Fifty-four Sdr4-like (hereafter designated Sdr4L) genes were identified in nine plant lineages comprising 36 species. Phylogenetic analysis placed these genes in eight subfamilies (I-VIII). Genes from the same lineage clustered together, a grouping supported by analysis of conserved motifs and exon-intron patterns. Segmental duplications were present in both dicot and monocot clusters, whereas tandemly duplicated genes occurred only in monocot clusters, indicating that both tandem and segmental duplications contributed to expansion of the grass I and II subfamilies. Estimation of the approximate ages of the duplication events indicated that ancestral Sdr4 genes evolved from a common angiosperm ancestor about 160 million years ago (MYA). Moreover, diversification of Sdr4L genes in monocot and dicot plants was mainly associated with genome-wide duplication and speciation events. Functional divergence was observed in all subfamily pairs except IV/VIIIa. Further analysis indicated that differences in functional constraints between subfamily pairs I/II, I/VIIIb, II/VI, II/VIIIb, II/IV, and VI/VIIIb were statistically significant. Site and branch-site model analyses of positive selection suggested that these genes were under strong adaptive selection pressure. Critical amino acids detected for both functional divergence and positive selection were mostly located in loops, pointing to the functional importance of these regions in this protein family. In addition, differential expression analysis of 11 Sdr4L genes using a transcriptome atlas showed that the duplicated genes may have undergone divergence in expression between plant species.
Our findings showed that Sdr4L genes are functionally divergent and positively selected. These findings may contribute to further functional analysis and to studies of the molecular evolution of Sdr4L gene families in land plants.
Year: 2016 PMID: 27300553 PMCID: PMC4907471 DOI: 10.1371/journal.pone.0153717
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Fig 1. Phylogenetic tree showing relationships between Sdr4L protein sequences.
(A) The unrooted tree was built with MEGA5 from a sequence alignment of 54 Sdr4L members. Bootstrap support values for the branches were obtained from 1000 replicates using the neighbor-joining (NJ) method. (B) Conserved motifs identified by MEME analysis are indicated by numbered colored boxes.
Fig 2. Divergence time estimations between Sdr4L genes in plants based on aligned nucleotide sequences using Bayesian MCMC analysis.
A relaxed molecular clock approach with 20 million MCMC steps was used. The tree was drawn using the divergence of O. sativa and B. distachyon and of M. domestica and C. sinensis as anchor points for the monocot and dicot lineages, respectively. The number at each node represents the estimated time in million years ago (MYA). The light pink and green circles show the divergence between the monocot and dicot lineages and the duplication events in grass genomes, respectively. The major clusters of orthologous genes are shown in different colors and assigned to different subgroups (I to VIII).
Functional divergence between Sdr4L subfamilies.
| Group 1 | Group 2 | Type-I θ±SE | Type-I LRT | Type-I Qk > 0.9 | Type-II θ±SE | Type-II Qk > 0.9 |
|---|---|---|---|---|---|---|
| I | II | 0.759±0.148 | 26.346 | 27 | 0.158±0.146 | 32 |
| I | VI | 0.334±0.193 | 0.003 | 0 | 0.195±0.105 | 32 |
| I | VIIIb | 0.459±0.185 | 6.147 | 1 | 0.281±0.094 | 38 |
| I | IV | 0.064±0.164 | 0.153 | 0 | 0.188±0.108 | 30 |
| I | VIIIa | 0.214±0.248 | 0.746 | 0 | 0.148±0.110 | 25 |
| II | VI | 0.787±0.254 | 9.590 | 13 | 0.272±0.129 | 50 |
| II | VIIIb | 0.839±0.207 | 16.420 | 33 | 0.289±0.115 | 43 |
| II | IV | 0.944±0.238 | 15.718 | 36 | 0.163±0.138 | 47 |
| II | VIIIa | 0.615±0.336 | 3.356 | 5 | 0.151±0.137 | 45 |
| VI | VIIIb | 0.642±0.211 | 9.257 | 1 | 0.188±0.072 | 21 |
| VI | IV | 0.001±0.022 | 0.000 | 0 | 0.014±0.083 | 4 |
| VI | VIIIa | 0.178±0.313 | 0.325 | 0 | 0.009±0.081 | 5 |
| VIIIb | IV | 0.174±0.146 | 1.416 | 0 | 0.006±0.071 | 2 |
| VIIIb | VIIIa | 0.383±0.205 | 3.499 | 0 | 0.033±0.067 | 10 |
| IV | VIIIa | 0.001±0.022 | 0.000 | 0 | -0.076±0.078 | 0 |
Note: θI and θII, the coefficients of Type I and Type II functional divergence between two gene clusters; LRT, likelihood ratio test statistic
*, P < 0.05
**, P < 0.01
Qk, posterior probability. Large Qk values indicate a high probability that the functional constraint (or evolutionary rate) or the physicochemical properties of a given amino acid site differ between two clusters.
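The significance markers defined in the note can be illustrated with a short calculation. The following is a minimal sketch, assuming the Type-I LRT statistic is compared against a chi-square distribution with 1 degree of freedom (an assumption made here, not a detail stated in this record). The LRT values are copied from the table; the function names `chi2_sf_df1` and `stars` are hypothetical helpers introduced for the example.

```python
import math

# (group 1, group 2, Type-I LRT), copied from the table above
TYPE_I_LRT = [
    ("I", "II", 26.346), ("I", "VI", 0.003), ("I", "VIIIb", 6.147),
    ("I", "IV", 0.153), ("I", "VIIIa", 0.746), ("II", "VI", 9.590),
    ("II", "VIIIb", 16.420), ("II", "IV", 15.718), ("II", "VIIIa", 3.356),
    ("VI", "VIIIb", 9.257), ("VI", "IV", 0.000), ("VI", "VIIIa", 0.325),
    ("VIIIb", "IV", 1.416), ("VIIIb", "VIIIa", 3.499), ("IV", "VIIIa", 0.000),
]


def chi2_sf_df1(x: float) -> float:
    """Upper-tail probability of a chi-square variate with 1 degree of freedom.

    Assumption: df = 1 is the reference distribution for the Type-I LRT.
    """
    return math.erfc(math.sqrt(max(x, 0.0) / 2.0))


def stars(p: float) -> str:
    """Map a p-value to the markers defined in the table note."""
    if p < 0.01:
        return "**"
    if p < 0.05:
        return "*"
    return ""


significant = []
for g1, g2, lrt in TYPE_I_LRT:
    p = chi2_sf_df1(lrt)
    if p < 0.05:
        significant.append(f"{g1}/{g2}")
    print(f"{g1:>5}/{g2:<5} LRT = {lrt:7.3f}  p = {p:.4f} {stars(p)}")

print("Significant at P < 0.05:", ", ".join(significant))
```

Under this assumption, the pairs flagged at P < 0.05 are the same six pairs that the abstract reports as statistically significant.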
Tests for positive selection among Sdr4L codons using site models.
| Model | lnL | 2ΔL | Estimates of parameters | Positively selected sites |
|---|---|---|---|---|
| M0 (one ratio) | -8461.241 | | ω = 0.16456 | None |
| M3 (discrete) | -8244.734 | 433.014 (M0 vs M3) | p0 = 0.55168, p1 = 0.33358, p2 = 0.11474, ω0 = 0.05766, ω1 = 0.23719, ω2 = 0.66151 | None |
| M7 (beta) | -8240.96 | | p = 0.73100, q = 2.96832 | Not allowed |
| M8 (beta & ω) | -9418.204 | 1177.244 (M7 vs M8) | p0 = 0.99999, p = 0.36186, q = 1.15287, p1 = 0.00001, ω = 2.06669 | |
Note
a Number of parameters in the ω distribution (ratios of non-synonymous (dN or Ka) versus synonymous (dS or Ks) mutations).
b Positively selected sites are inferred from posterior probabilities > 95% (*) or 99% (**).
Sites implicated in Type I and Type II divergence are shown in bold. Codon (amino acid) positions are based on rice OsSDR4 protein.
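The 2ΔL column is a likelihood ratio statistic for a pair of nested site models. As a minimal sketch, the M0 versus M3 entry can be recomputed from the two log-likelihoods in the table. The degrees of freedom (4, the difference in free ω-distribution parameters between M0 and a three-class M3) are an assumption based on the standard parameterization of these models, and `lrt_statistic` and `chi2_sf_even_df` are hypothetical helper names.

```python
import math


def lrt_statistic(lnl_null: float, lnl_alt: float) -> float:
    """Likelihood ratio statistic 2 * (lnL_alternative - lnL_null)."""
    return 2.0 * (lnl_alt - lnl_null)


def chi2_sf_even_df(x: float, df: int) -> float:
    """Upper-tail chi-square probability, closed form valid for even df only.

    For df = 2k: P(X > x) = exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive even df")
    if x <= 0:
        return 1.0
    half = x / 2.0
    term = 1.0   # (x/2)^0 / 0!
    total = 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


# Log-likelihoods copied from the table: M0 (one ratio) and M3 (discrete)
stat_m0_m3 = lrt_statistic(-8461.241, -8244.734)
# Assumption: df = 4 (M3 with three classes has four more parameters than M0)
p_m0_m3 = chi2_sf_even_df(stat_m0_m3, df=4)

print(f"M0 vs M3: 2*delta(lnL) = {stat_m0_m3:.3f}, p = {p_m0_m3:.3g}")
```

The recomputed statistic agrees with the 433.014 listed for M0 vs M3. The closed form is used only to keep the sketch free of third-party dependencies; a statistics library would serve equally well.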
Fig 3. Sequence comparison by Clustal multiple alignment of typical Sdr4L protein sequences from monocot and eudicot plant species.
Black arrows indicate predicted protein domains (PD319905 and PDB0A0W9). Critical amino acid sites (CAAS) responsible for functional divergence (Qk > 0.9, Type I and Type II) and phosphorylation sites are marked by pink triangles and blue stars, respectively. Adaptive selection sites identified by the site and branch-site models are indicated by red and blue boxes, respectively. Putative nuclear localization signal (NLS) motifs are boxed.
Fig 4. Three-dimensional structures of rice OsSdr4L (a) and tomato SlSdr4L (b) proteins. The corresponding structures were obtained using I-TASSER [22]. The α-helices and β-strands are shown in green and yellow, respectively. Critical amino acid sites responsible for both positive selection and functional divergence are labeled in red. The ends of the amino (N) and carboxyl (COOH) termini, and the predicted domains I and II (PD319905 and PDB0A0W9), are indicated in light blue.