Alberto Cenci, Mathieu Rouard.
Abstract
GRAS transcription factors (TFs) play critical roles in plant growth and development, including gibberellin and mycorrhizal signaling. Proteins of this gene family contain a typical GRAS domain in the C-terminal region, whereas the N-terminal region is highly variable. Although GRAS genes have been characterized in a number of plant species, their classification is still not completely resolved. Based on a panel of eight representative angiosperm species, we identified 29 orthologous groups, or orthogroups (OGs), for the GRAS gene family, suggesting that at least 29 ancestral genes were present in the angiosperm lineage before the "Amborella" evolutionary split. Interestingly, some taxonomic groups lacked members of one or more OGs. The gene number expansion usually observed in transcription factor families was not observed in GRAS, while the genome triplication ancestral to the eudicots (γ hexaploidization event) was detectable in only a limited number of GRAS orthogroups. We also found conserved OG-specific motifs in the variable N-terminal region. Finally, we grouped the OGs into 17 subfamilies, for which names were homogenized based on a literature review, and described 5 new subfamilies (DLT, RAD1, RAM1, SCLA, and SCLB). This study establishes a consistent framework for classifying GRAS members in angiosperm species, and thereby a tool for correctly establishing the orthologous relationships of GRAS genes in most food crops, which should facilitate subsequent functional analyses of the GRAS gene family. The multi-FASTA file containing all the sequences used in our study can be used as a database for diagnostic BLASTp searches to classify GRAS genes from other, non-model species.
Keywords: GRAS gene family; plant evolution; transcription factors
Year: 2017 PMID: 28303145 PMCID: PMC5332381 DOI: 10.3389/fpls.2017.00273
Source DB: PubMed Journal: Front Plant Sci ISSN: 1664-462X Impact factor: 5.753
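The abstract proposes using the multi-FASTA file as a BLASTp database to classify GRAS genes from other species. A minimal sketch of the classification step is given here: it reads BLAST tabular output (`-outfmt 6`) and assigns each query to the OG of its highest-scoring hit. Several things are assumptions made for illustration and are not taken from the study: the file names in the comments, the convention that each reference sequence ID starts with its OG label followed by `|`, the E-value cutoff, and the example hit rows, whose values are invented.

```python
# Hypothetical commands that would produce the input (file names are placeholders):
#   makeblastdb -in gras_reference.fasta -dbtype prot
#   blastp -query candidates.fasta -db gras_reference.fasta -outfmt 6 -out hits.tsv

def classify_by_best_hit(blast_tabular_text, max_evalue=1e-10):
    """Map each query ID to the OG label of its highest-bitscore hit.

    Assumes the 12 default outfmt 6 columns and subject IDs of the
    (hypothetical) form 'OG-LABEL|sequence_id'.
    """
    best = {}  # query ID -> (bitscore, OG label)
    for line in blast_tabular_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        query, subject = fields[0], fields[1]
        evalue, bitscore = float(fields[10]), float(fields[11])
        if evalue > max_evalue:  # arbitrary cutoff, chosen for illustration
            continue
        og = subject.split("|", 1)[0]
        if query not in best or bitscore > best[query][0]:
            best[query] = (bitscore, og)
    return {query: og for query, (_, og) in best.items()}

# Invented example rows: qseqid, sseqid, pident, length, mismatch, gapopen,
# qstart, qend, sstart, send, evalue, bitscore.
example_rows = [
    ["query1", "OG-SCR-1|refA", "61.0", "410", "160", "3", "1", "410", "5", "414", "1e-150", "520"],
    ["query1", "OG-SCR-2|refB", "48.0", "400", "208", "6", "3", "402", "9", "408", "1e-90", "310"],
    ["query2", "OG-DLT|refC", "35.0", "120", "78", "2", "10", "129", "40", "159", "0.5", "32.0"],
]
example_text = "\n".join("\t".join(row) for row in example_rows)

assignments = classify_by_best_hit(example_text)
print(assignments)
```

In this sketch a query with no hit below the cutoff (here `query2`) is simply left unassigned. A best-hit rule is only a first pass; ambiguous cases would still call for a phylogenetic check against the reference tree.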
Numbers of GRAS genes of the studied species assigned to each OG.
| OG | Amb | Ma | Pd | Os | Vv | Tc | At | Cc | Total |
|---|---|---|---|---|---|---|---|---|---|
| OG-SCR-1 | 1 | 3 | 2 | 2 | 1 | 1 | 1 | 1 | 12 |
| OG-SCR-2 | 2 | 1 | 2 | 1 | 1 | 1 | 1 | 1 | 10 |
| OG-SCR-3 | 1 | − | 1 | 1 | 1 | 1 | − | 1 | 6 |
| OG-SHR-1 | 1 | 3 | 2 | 2 | 1 | 1 | 1 | 1 | 12 |
| OG-SHR-2 | 1 | 3 | 2 | − | 1 | 1 | − | 1 | 9 |
| OG-SCL32-1 | 1 | 3 | 2 | 1 | 1 | 1 | 1 | 1 | 11 |
| OG-SCL32-2 | 1 | 1 | 1 | 1 | 1 | 1 | − | 2 | 8 |
| OG-NSP1 | 1 | 1 | 2 | 1 | 1 | 1 | 1 | 1 | 9 |
| OG-LS | 1 | 4 | 4 | 2 | 1 | 1 | 1 | 1 | 15 |
| OG-SCL4/7 | 1 | 3 | 2 | 1 | 1 | 1 | 2 | 1 | 12 |
| OG-NSP2-1 | 1 | 1 | 2 | 1 | 1 | 1 | 1 | 1 | 9 |
| OG-NSP2-2 | 1 | − | 1 | 2 | 1 | − | − | 1 | 6 |
| OG-NSP2-3 | 1 | − | − | − | 3 | 2 | − | 2 | 8 |
| NSP2-Amb | 2 | − | − | − | − | − | − | − | 2 |
| OG-HAM-II | 1 | 7 | 4 | 5 | 2 | 2 | 3 | 2 | 26 |
| OG-HAM-I | 1 | 3 | 2 | − | 1 | 1 | 1 | 1 | 10 |
| OG-DELLA-1 | 1 | 4 | 2 | 1 | 2 | 2 | 5 | 2 | 19 |
| OG-DELLA-2 | 1 | 1 | 2 | 2 | 1 | 1 | − | 1 | 9 |
| OG-PAT-1 | 1 | 3 | 2 | 3 | 3 | 3 | 4 | 2 | 21 |
| OG-PAT-2 | 1 | 4 | 2 | 1 | 1 | 1 | − | 1 | 11 |
| OG-PAT-3 | 1 | 3 | 1 | 1 | 2 | 1 | 1 | 1 | 11 |
| OG-PAT-4 | − | 7 | 2 | 2 | 1 | 1 | 1 | 1 | 15 |
| OG-SCL3 | 1 | 3 | 3 | 9 | 3 | 3 | 1 | 2 | 25 |
| OG-RAD1-1 | 1 | 1 | 1 | 1 | 2 | 2 | − | 2 | 10 |
| OG-RAD1-2 | 1 | − | 1 | − | 1 | 1 | − | − | 4 |
| OG-RAM1 | 1 | 1 | 1 | 1 | 1 | 1 | − | 1 | 7 |
| OG-DLT | 1 | 3 | 2 | 1 | 1 | 1 | 1 | 1 | 11 |
| OG-SCLA | 1 | 2 | 2 | 1 | 1 | 1 | − | 2 | 10 |
| OG-SCLB | 4 | − | 5 | − | 1 | 3 | − | 5 | 18 |
| OG-LISCL | 1 | 7 | 4 | 13 | 11 | 7 | 7 | 11 | 61 |
| Total | 34 | 72 | 59 | 56 | 49 | 44 | 33 | 50 | 397 |
Indicates presence of remnant(s) in the genome.
Missing in M. acuminata.
Missing in Brassicales.
Missing in Poales.
Missing in Malvales.
Missing in monocots.
Missing in Brassicaceae but present in other Brassicales.
Missing in C. canephora.
Os, Oryza sativa; At, Arabidopsis thaliana; Amb, Amborella trichopoda; Ma, Musa acuminata; Pd, Phoenix dactylifera; Vv, Vitis vinifera; Tc, Theobroma cacao; Cc, Coffea canephora.
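The table can be read programmatically to check the printed totals and to list, for each OG, the species with no assigned gene. The sketch below does this for three rows copied from the table. The column order used (Amb, Ma, Pd, Os, Vv, Tc, At, Cc) is an assumption inferred from the NSP2-Amb row and the "Missing in ..." footnotes; any cell that is not a number is treated as "no gene found".

```python
# Assumed column order, using the abbreviations from the legend.
SPECIES = ["Amb", "Ma", "Pd", "Os", "Vv", "Tc", "At", "Cc"]

# Three rows copied from the table; the last cell is the printed row total.
ROWS = """
| OG-SCR-3 | 1 | − | 1 | 1 | 1 | 1 | − | 1 | 6 |
| OG-NSP2-3 | 1 | − | − | − | 3 | 2 | − | 2 | 8 |
| OG-LISCL | 1 | 7 | 4 | 13 | 11 | 7 | 7 | 11 | 61 |
"""

def parse_row(line):
    """Return (OG name, per-species counts, printed total) for one table row."""
    cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
    # Non-numeric cells (the minus sign used in the table) count as zero genes.
    counts = [int(cell) if cell.isdigit() else 0 for cell in cells[1:-1]]
    return cells[0], counts, int(cells[-1])

missing = {}
for line in ROWS.strip().splitlines():
    og, counts, total = parse_row(line)
    if sum(counts) != total:
        raise ValueError(f"{og}: counts sum to {sum(counts)}, table says {total}")
    missing[og] = [sp for sp, n in zip(SPECIES, counts) if n == 0]

print(missing)
```

Under the assumed column order, OG-NSP2-3 comes out as absent from all three monocots (Ma, Pd, Os) and from At, which agrees with the "Missing in monocots" footnote.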
Figure 1. GRAS gene distribution by orthogroup in 8 plant species. Numbers in the matrix represent the number of genes per OG specified in the header (with the exception of NSP2-Amb). The color gradient from yellow to red illustrates the abundance of genes. The WGD involving the ancestor of all the studied dicot species (γ event, hexaploidization) is indicated by an orange square.
Figure 2. Unrooted ML phylogenetic tree based on GRAS sequences from . Branch support is based on aLRT score. The background colors differentiate subfamilies including more than one OG. OGs not grouped in subfamilies are differentiated by gray tones. Branches that are inconsistent with the defined orthogroups are colored in red.
Figure 3. Unrooted ML phylogenetic tree based on GRAS sequences from . Branch support is based on aLRT score. Clades containing genes assigned to the same OG are collapsed. The colors differentiate subfamilies including more than one OG. OGs not grouped in subfamilies are in black. Individual sequences not associated with any OG are not collapsed.
Conserved motifs identified by MEME software among defined gene clusters.
| OG | E-value | Motif | Motif not detected in |
|---|---|---|---|
| OG-SCR-1 | 1.4e-093 | MDDTAATAWIDGIIRDIIHSS | None |
| | 4.2e-217 | VSIPQLIHNVREIIHPCNPNLAAILEYRLRSLM | None |
| OG-SHR-1 | 1.1e-097 | CHHFYMDEDFFSSSSSKHYHP | None |
| OG-SCL32-1 | 8.0e-053 | HQIGPCLDLTMNKNQIHRTRPWPGFPTSK | |
| OG-SCL4/7 | 2.6e-311 | MAYMCTDSGNLMAIAQQVIKQKQQQQQQQQQQQQQQQ | None |
| | 3.8e-103 | EFDSDEWMESLMGGGDAEESDNM | None |
| OG-NSP2-1 | 1.8e-031 | DDFHDLIESMMCD | None |
| OG-HAM-I | 6.3e-049 | NNNNHFNCNLCYEPTSVLDPHLSPSPVTE | |
| OG-DELLA-1 | 1.2e-429 | DCGM | None |
| | 5.3e-299 | SHL | |
| OG-DELLA-2 | 1.0e-079 | EI | |
| | 6.0e-057 | | |
| OG-DELLA-1,2 | 6.8e-531 | QDCGM | |
| | 1.6e-405 | L | |
| OG-PAT-1 | 3.6e-159 | FQQNSQSYPSDQHHSPDNTYGSPISGSC | |
| | 3.7e-170 | TDDPNDLKHKLR | None |
| OG-PAT-2 | 1.1e-108 | Q | None |
| OG-PAT-3 | 1.5e-101 | IDYDEDEMRLKLQ | None |
| OG-PAT-4 | 9.1e-105 | GGILKRSLTEMERQQQQQQQQ | None |
| | 7.9e-094 | LQ | None |
| OG-PAT-1234 | 1.4e-421 | ITYDENDMKHKLQ | |
| OG-SCL3 | 4.4e-127 | QQDDGSSSVTSSPLQFFSLMSLSPGTGSP | |
| OG-DLT | 4.9e-074 | MGTQRLDLPCSFSRK | None |
| OG-RAM1 | 4.3e-033 | KGKGQSPLHKVFNSPNNQYMQ | |
| OG-RAD1-1 | 4.6e-079 | NRNGSTNSTNSLPRLHFRDHIWTYKQRYLAAEAMEEAAAAM | None |
| OG-RAD1-2 | 1.5e-011 | SFNHDTAIRRFCPARIEQEQ | None |
| | 4.7e-008 | PPSLAASEEDEFVDSFINMDWCDDYDND | None |
| OG-SCLA | 2.1e-035 | DVCEGKFFGLLQARERMLKVDPKRKGMED | |
| OG-SCLB | 5.4e-171 | NYRSSHGRLCGEKENEPTDGVTYPTGGGDELSTEEVIRIAGAHYVYMGTH |
Bold characters highlight conserved amino acids in motifs identified in OGs classified in the same subfamily. In the DELLA subfamily, where two conserved motifs were detected, the amino acids of the second motif are underlined. The last column reports the species (and the number of corresponding sequences) in which the motif was not detected.
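To illustrate how an OG-specific motif from the table might be looked for in the N-terminal region of a candidate protein, the sketch below slides the motif's consensus string along a sequence and reports windows within a small number of mismatches. This is a simplification and not the authors' method: MEME motifs are position-specific profiles, and a profile-based scanner would be the appropriate tool for real analyses. The test sequence and the mismatch threshold are invented for the example; only the motif string (OG-NSP2-1) comes from the table.

```python
def find_motif(sequence, motif, max_mismatches=2):
    """Return (start, window, mismatches) for each window close to the motif.

    Plain Hamming-distance scan over all windows of len(motif); positions
    are 0-based.
    """
    hits = []
    width = len(motif)
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        mismatches = sum(a != b for a, b in zip(window, motif))
        if mismatches <= max_mismatches:
            hits.append((start, window, mismatches))
    return hits

NSP2_1_MOTIF = "DDFHDLIESMMCD"  # consensus listed for OG-NSP2-1 in the table

# Invented toy sequence: the motif with one substitution (I -> L), flanked
# by arbitrary residues.
toy_sequence = "MSTA" + "DDFHDLLESMMCD" + "GGSS"

hits = find_motif(toy_sequence, NSP2_1_MOTIF)
print(hits)
```

A hit found this way is only a hint of OG membership. Several motifs in the table appear truncated in this rendering (for example those listed for the DELLA OGs), so they are too short to be searched meaningfully with a string scan.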