Keiichi Mochida, Takuhiro Yoshida, Tetsuya Sakurai, Kazuko Yamaguchi-Shinozaki, Kazuo Shinozaki, Lam-Son Phan Tran.
Abstract
The interactions between transcription factors (TFs) and cis-regulatory DNA sequences control gene expression, constituting the essential functional linkages of gene regulatory networks. The aim of this study is to identify all putative TFs from six grass species (Brachypodium distachyon, maize, rice, sorghum, barley, and wheat) and to integrate them, together with significant information, into an integrative database (GramineaeTFDB) for comparative and functional genomics. For each TF, sequence features, promoter regions, domain alignments, GO assignment, FL-cDNA information (if available), and cross-references to various public databases and genetic resources are provided. Additionally, GramineaeTFDB provides a tool that helps users search for putative cis-elements located in the promoter regions of TFs and predict the functions of the TFs using a cis-element-based functional prediction approach. We also supply hyperlinks to expression profiles of those TF genes of maize, rice, and barley for which data are available. Furthermore, information about the availability of FOX and Ds mutant lines for rice and maize TFs, respectively, is also accessible through hyperlinks. Our study provides an important, user-friendly public resource for functional analyses and comparative genomics of grass TFs, and for understanding the architecture of transcriptional regulatory networks and the evolution of TFs in agriculturally important cereal crops.
Year: 2011 PMID: 21729923 PMCID: PMC3190953 DOI: 10.1093/dnares/dsr019
Source DB: PubMed Journal: DNA Res ISSN: 1340-2838 Impact factor: 4.458
Predicted TF models in six grasses
| No. | TF gene family | | | | | | |
|---|---|---|---|---|---|---|---|
| 1 | (R1)R2R3_MYB | 86 | 214 | 116 | 89 | 15 | 64 |
| 2 | ABI3VP1 | 51 | 69 | 63 | 37 | 4 | 18 |
| 3 | Alfin-like | 16 | 33 | 15 | 10 | 2 | 7 |
| 4 | AP2_EREBP | 146 | 265 | 167 | 117 | 142 | 131 |
| 5 | ARF | 42 | 66 | 29 | 22 | 1 | 8 |
| 6 | ARID | 8 | 15 | 8 | 4 | 1 | 2 |
| 7 | atypical_MYB | 40 | 56 | 32 | 28 | 4 | 9 |
| 8 | Aux_IAA | 41 | 83 | 32 | 27 | 5 | 13 |
| 9 | BBR-BPC | 5 | 9 | 5 | 4 | 1 | 1 |
| 10 | BES1 | 7 | 14 | 8 | 6 | 1 | 3 |
| 11 | bHLH | 158 | 274 | 171 | 107 | 14 | 29 |
| 12 | bZIP | 103 | 185 | 108 | 78 | 13 | 53 |
| 13 | C2C2_Zn-CO-like | 38 | 53 | 37 | 25 | 15 | 18 |
| 14 | C2C2_Zn-Dof | 27 | 44 | 29 | 25 | 21 | 9 |
| 15 | C2C2_Zn-GATA | 36 | 31 | 33 | 24 | 3 | 12 |
| 16 | C2C2_Zn-YABBY | 15 | 26 | 8 | 7 | 1 | 4 |
| 17 | C2H2_Zn | 106 | 185 | 113 | 94 | 11 | 26 |
| 18 | C3H-TypeI | 85 | 160 | 73 | 60 | 10 | 32 |
| 19 | CAMTA | 10 | 9 | 7 | 5 | 0 | 1 |
| 20 | CCAAT_Dr1 | 1 | 5 | 1 | 2 | 0 | 1 |
| 21 | CCAAT_HAP2 | 18 | 27 | 11 | 10 | 2 | 14 |
| 22 | CCAAT_HAP3 | 18 | 22 | 13 | 10 | 1 | 7 |
| 23 | CCAAT_HAP5 | 13 | 21 | 15 | 5 | 3 | 3 |
| 24 | CPP | 11 | 17 | 8 | 9 | 0 | 6 |
| 25 | E2F_DP | 8 | 19 | 10 | 9 | 0 | 3 |
| 26 | EIL | 6 | 9 | 7 | 4 | 2 | 7 |
| 27 | GARP_ARRB | 11 | 10 | 8 | 6 | 0 | 2 |
| 28 | GARP_G2-like | 59 | 71 | 48 | 38 | 3 | 12 |
| 29 | GeBP | 17 | 29 | 18 | 12 | 2 | 4 |
| 30 | GRAS | 48 | 84 | 74 | 34 | 5 | 13 |
| 31 | GRF | 28 | 17 | 38 | 14 | 1 | 1 |
| 32 | HB | 112 | 194 | 88 | 73 | 15 | 40 |
| 33 | HMG-box | 16 | 27 | 13 | 11 | 5 | 17 |
| 34 | HRT | 1 | 3 | 1 | 1 | 2 | 0 |
| 35 | HSF | 30 | 51 | 25 | 26 | 1 | 7 |
| 36 | JUMONJI | 24 | 33 | 22 | 14 | 3 | 4 |
| 37 | LFY | 1 | 4 | 1 | 1 | 0 | 4 |
| 38 | LIM | 20 | 30 | 9 | 11 | 2 | 8 |
| 39 | LUG | 5 | 3 | 5 | 6 | 1 | 3 |
| 40 | MADS | 83 | 96 | 83 | 46 | 58 | 124 |
| 41 | MBF1 | 3 | 7 | 2 | 3 | 3 | 3 |
| 42 | MYB_related | 47 | 70 | 64 | 36 | 8 | 22 |
| 43 | NAC | 84 | 168 | 124 | 96 | 18 | 22 |
| 44 | Nin-like | 17 | 24 | 13 | 7 | 1 | 5 |
| 45 | PcG | 58 | 77 | 46 | 30 | 5 | 15 |
| 46 | PHD | 185 | 270 | 169 | 114 | 14 | 48 |
| 47 | PLATZ | 14 | 17 | 18 | 11 | 2 | 2 |
| 48 | S1Fa-like | 3 | 2 | 2 | 2 | 2 | 2 |
| 49 | SAP | 0 | 0 | 0 | 0 | 0 | 0 |
| 50 | SBP | 18 | 50 | 19 | 17 | 1 | 5 |
| 51 | SRS | 4 | 11 | 5 | 5 | 0 | 2 |
| 52 | TCP | 21 | 49 | 27 | 18 | 1 | 5 |
| 53 | Trihelix | 8 | 21 | 10 | 7 | 1 | 3 |
| 54 | TUB | 15 | 32 | 20 | 12 | 2 | 9 |
| 55 | ULT | 1 | 5 | 1 | 1 | 0 | 0 |
| 56 | VOZ | 2 | 8 | 2 | 2 | 0 | 2 |
| 57 | Whirly | 2 | 6 | 2 | 2 | 1 | 2 |
| 58 | WRKY_Zn | 80 | 151 | 96 | 79 | 1 | 36 |
| 59 | zf-HD | 16 | 26 | 15 | 15 | 0 | 2 |
| 60 | zf-TAZ | 5 | 10 | 5 | 5 | 0 | 1 |
| 61 | ZIM | 30 | 45 | 20 | 18 | 4 | 13 |
| | Total | 2152 | 3623 | 2205 | 1597 | 444 | 916 |
a: Complete TF repertoires predicted using proteomes annotated from genomic sequences.
b: Partial TF repertoires predicted using FL-cDNA resources available on TriFLDB.
Figure 1. Distribution and number of TFs of T. aestivum and H. vulgare found by HMM search or by homology search against TFs of B. distachyon. The HMM search was performed against the full-length cDNA/CDS of both species. The homology search used blastx between the NCBI UniGene data sets of both species and the predicted protein data set of B. distachyon (Bdi1.0), with an E-value threshold of 1e−10 to find significant homologues.
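The E-value filtering step mentioned in the Figure 1 caption can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes the blastx results were saved in the standard 12-column BLAST tabular layout (E-value in the 11th column), treats the 1e−10 threshold as a strict upper bound, and keeps only the best hit per query. The function name `significant_hits` is hypothetical.

```python
import csv
import io

EVALUE_CUTOFF = 1e-10  # threshold quoted in the Figure 1 caption


def significant_hits(tabular_text, cutoff=EVALUE_CUTOFF):
    """Return {query_id: (subject_id, evalue)} with the best hit per query.

    Assumes standard 12-column BLAST tabular output:
    qseqid sseqid pident length mismatch gapopen
    qstart qend sstart send evalue bitscore
    Hits with an E-value at or above the cutoff are discarded
    (strict comparison is an assumption, not stated in the caption).
    """
    best = {}
    reader = csv.reader(io.StringIO(tabular_text), delimiter="\t")
    for row in reader:
        # Skip blank lines and comment lines.
        if not row or row[0].startswith("#"):
            continue
        query, subject, evalue = row[0], row[1], float(row[10])
        if evalue >= cutoff:
            continue
        # Keep the hit with the smallest E-value for each query.
        if query not in best or evalue < best[query][1]:
            best[query] = (subject, evalue)
    return best
```

Each UniGene query that survives the filter would then be counted as a putative homologue of the corresponding B. distachyon TF.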
Figure 2. Representative distributions of the GO terms for biological processes associated with TFs from B. distachyon (B.d.), Z. mays (Z.m.), S. bicolor (S.b.), and O. sativa (O.s.) in comparison with A. thaliana (A.t.). The 20 most abundant GO terms are shown; GO terms were assigned based on homology searches against annotated Arabidopsis genes (blastp homology search with E-value < 1e−10). TF numbers are shown for each GO term.
Figure 3. The web-based user interface of GramineaeTFDB and a typical example of the annotations provided for a putative TF-encoding gene. The homepage of GramineaeTFDB displays the TF families and the number of TFs in each family identified in six grass species: B. distachyon, O. sativa, S. bicolor, Z. mays, H. vulgare, and T. aestivum. By clicking on ‘Go to TF search’, users are directed to the search page, which provides search queries for the names of TF families, keywords, sequence identifiers, identifiers of domains supported by InterProScan, GO terms, and available cis-motifs for each grass species (A). The search results are listed for a TF family of a grass species with a description of the corresponding genes based on similarity searches. For those TF-encoding genes of barley, maize, and rice whose expression data are available through hyperlinks, [Genevestigator] and/or [RiceXPro] strings are displayed. A [RiceFOX], [RiceGE], or [Closet DS] string is also displayed to indicate the availability, in the detailed page, of hyperlinks linking rice TFs to the RiceFOX and RiceGE databases and maize TFs to the PlantGDB database (Ac/Ds lines) (B). Users can navigate to the detailed annotation pages to browse the related annotations. The detailed annotation pages provide summarized basic information on each of the gene models annotated with gene structure. The figure for a gene structure is accessible via a hyperlink to a genome browser, where it is shown together with other sequences mapped onto the grass genome (C). The HMM search result for the TF is displayed (D). The sequences of cDNA and protein are provided, and all clickable buttons navigate users directly to the blast search interface (E). The similarity search results for each of the entries against NCBI nr, UniProt, and gene models of Arabidopsis and other grass species are provided with detailed search results and hyperlinks to the original data (F). The resultant hierarchical clustering of homologous TFs can be browsed with a multiple alignment of each cluster (G). Other sequence identifiers from representative transcript sequence databases, including UniGene, TIGR Gene Index, and PlantGDB, as well as the probe IDs of target sequences on the Affymetrix GeneChip, if available, are also accessible. Furthermore, information about available FL-cDNAs is provided through hyperlinks (H). The GO terms assigned to each of the entries based on InterProScan and a sequence similarity search against the annotated Arabidopsis genes of TAIR10 are listed (I). The domain structure predicted by InterProScan is provided (J). The result of a cis-motif sequence pattern search of promoter regions for each gene is shown together with the genomic gene structure (K). Hyperlinks to Genevestigator and/or RiceXPro are provided for those TFs for which expression data are available (L). Hyperlinks to RiceFOX and/or RiceGE for rice TFs, or to PlantGDB (Ac/Ds lines) for maize TFs, are also provided (M).
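A cis-motif sequence pattern search of a promoter region, of the kind shown in panel K, can be sketched as below. This is an illustrative assumption about how such a search might work, not the GramineaeTFDB implementation: motifs are written in IUPAC nucleotide codes, translated to a regular expression, and matched on the forward strand only. The function names and the example motifs in the usage note are chosen for illustration.

```python
import re

# IUPAC nucleotide codes mapped to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def motif_to_regex(motif):
    """Compile an IUPAC motif into a regex that also finds overlapping sites."""
    body = "".join(IUPAC[base] for base in motif.upper())
    # A lookahead matches without consuming, so overlapping sites are reported.
    return re.compile("(?=(" + body + "))")


def find_motif(promoter, motif):
    """Return 0-based start positions of the motif on the forward strand."""
    pattern = motif_to_regex(motif)
    return [match.start() for match in pattern.finditer(promoter.upper())]
```

For example, `find_motif(promoter, "ACGTGKC")` would scan for an ABRE-like core and `find_motif(promoter, "RCCGAC")` for a DRE-like core. A fuller search would also scan the reverse complement and record which strand each site lies on.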
The availability of resources for functional analyses of the TFs from six grass species
| Species | FLcDNA/CDS | Microarray probe | Clustered EST | Expression | Genetic resource |
|---|---|---|---|---|---|
| Rice | KOME | Affymetrix | | Genevestigator, RiceXPro | RiceFOX, RiceGE |
| Maize | Maize full-length cDNA project | Affymetrix | | Genevestigator | PlantGDB |
| | NA | NA | NCBI UniGene, PlantGDB | NA | NA |
| | NA | NA | TIGR Gene Index | NA | NA |
| Wheat | TriFLDB | Affymetrix | | Genevestigator | NA |
| Barley | TriFLDB | Affymetrix | | Genevestigator | NA |
NA, not available.