In silico analysis of transcription factor repertoire and prediction of stress responsive transcription factors in soybean.

Keiichi Mochida, Takuhiro Yoshida, Tetsuya Sakurai, Kazuko Yamaguchi-Shinozaki, Kazuo Shinozaki, Lam-Son Phan Tran.

Abstract

Sequence-specific DNA-binding transcription factors (TFs) are often termed as 'master regulators' which bind to DNA and either activate or repress gene transcription. We have computationally analysed the soybean genome sequence data and constructed a proper set of TFs based on the Hidden Markov Model profiles of DNA-binding domain families. Within the soybean genome, we identified 4342 loci encoding 5035 TF models which grouped into 61 families. We constructed a database named SoybeanTFDB (http://soybeantfdb.psc.riken.jp) containing the full compilation of soybean TFs and significant information such as: functional motifs, full-length cDNAs, domain alignments, promoter regions, genomic organization and putative regulatory functions based on annotations of gene ontology (GO) inferred by comparative analysis with Arabidopsis. With particular interest in abiotic stress signalling, we analysed the promoter regions for all of the TF encoding genes as a means to identify abiotic stress responsive cis-elements as well as all types of cis-motifs provided by the PLACE database. SoybeanTFDB enables scientists to easily access cis-element and GO annotations to aid in the prediction of TF function and selection of TFs with functions of interest. This study provides a basic framework and an important user-friendly public information resource which enables analyses of transcriptional regulation in soybean.

Year:  2009        PMID: 19884168      PMCID: PMC2780956          DOI: 10.1093/dnares/dsp023

Source DB:  PubMed          Journal:  DNA Res        ISSN: 1340-2838            Impact factor:   4.458


Introduction

Sequence-specific DNA-binding transcription factors (TFs) are the key molecular switches that control or influence many of the biological processes such as development, growth, cell division and responses to environmental stimuli in a cell or organism. By being capable of activating or repressing transcription of multiple target genes, they affect the metabolism, physiological balance and progression in cells and the responses of cells to the environment.[1-3] TFs form complex regulatory networks at the transcriptional level and through protein–protein interactions among themselves or with proteins of other classes. Protein–protein interactions may also form with other transcriptional regulators such as chromatin remodelling/modifying proteins to recruit or block access of RNA polymerases to the DNA template. The specific interactions between TFs and a family of cis-regulatory sequences described by a consensus motif play a central part in how genetic regulatory proteins affect spatial and temporal gene expression.[4] Additionally, alterations in the activity and regulatory specificity of TFs are emerging as a major source of diversity and evolutionary adaptation.[5,6] In the past decade, the availability of complete genome sequences and the development of high-throughput experimental techniques have enabled scientists to compile complementary information describing the function and organization of TF regulatory systems in a number of organisms. The identification, characterization and classification of TFs at the genome-wide level will provide an important resource for researchers who are interested in studying the regulation of gene expression. Similar to other proteins, TFs are composed of evolutionarily conserved units called ‘domains’, which belong to families that can occur in many different proteins. 
The majority of TFs can be grouped into a number of different families according to the specific type of DNA-binding domain (DBD) that is present within their sequence.[7-10] Using bioinformatics approaches, computational studies have documented valuable TF repertoires by searching for genes containing DBDs within individual organisms ranging from prokaryotes to eukaryotes or by searching across all completely sequenced genomes.[7,8,10-18] In plants, ∼7% of the genome encodes putative TFs.[19] Despite their importance as a fundamental component of biological systems, the TF repertoires for many plant genomes remain largely unknown and understudied. Analyses of expressed sequence tag (EST) and genome sequence databases have indicated that legumes encode more than 2000 TFs per genome. At the present time, less than 1% of these putative TFs have been genetically and functionally characterized.[19] Our basic knowledge of TFs and their role in transcriptional regulation is derived from molecular biological and genetic investigations. Proper characterization of particular TFs often requires a detailed study in the biological context of a whole TF family, since functional redundancy is a common occurrence within TF families.[20-24] Furthermore, since TFs control the expression of the genome, it is not possible to completely understand their function without performing detailed functional studies at a genome-wide level.[7,25-27] Soybean (Glycine max L.) is a nutritionally important crop which provides an abundant source of oil and protein for worldwide human consumption.[28-31] In addition, soybean is also viewed as an attractive crop for the production of renewable fuels such as biodiesel. Due to its symbiosis with nitrogen fixing bacteria, soybean can fix atmospheric nitrogen and therefore requires minimal input of nitrogen fertilizer. 
Agricultural dependence on nitrogen fertilizer often accounts for the single largest energy input in agronomic practices.[32] With the recent completion of the soybean genomic sequence (http://www.phytozome.net/soybean#C Soybean Genome Project, DOE Joint Genome Institute), the identification, isolation and functional analysis of important genes will be accelerated. From a biotechnology perspective, this resource will be especially important for studying regulatory genes involved in plant productivity, seed quality, nitrogen fixation and the sensing/response and adaptation to the environment. Within the soy genome model, ∼975 Mb has been captured in 20 chromosomes and 66 153 protein-coding loci have been predicted (http://www.phytozome.net/soybean#C). With the completion of the soybean genome sequence, the full complement of TF-encoding genes from this important crop can be characterized and functionally analysed. In this report, we searched for sequence-specific DNA-binding TFs using a prediction method which uses 51 Hidden Markov Models (HMMs) from the Pfam database. We also used 11 models, which were originally created by HMMbuild of HMMER2 package, to identify the domains within the putative TF proteins. The computational results predict that the soybean genome contains 5035 TF protein models coded from 4342 loci in 61 families. We created a database named ‘SoybeanTFDB’. This database provides open access for researchers to all relevant and basic information on functional motifs, full-length cDNAs, promoter regions, genomic distribution, gene duplication and multiple sequence alignment of the DBDs for each TF family. Since most of these TFs have not been experimentally characterized for regulatory function as indicated by assessment in PubMed, we searched for their putative regulatory function by assessing annotations of the gene ontology (GO) using comparative analysis with their Arabidopsis counterparts. 
As a complement to this functional prediction using GO annotations, we also mapped all putative cis-regulatory elements that were documented within the PLACE database on all TF encoding genes. In this analysis, we placed a particular emphasis on abiotic stress responsive cis-elements. Knowledge gained from identifying the presence of stress responsive cis-elements, in addition to GO annotation, enables effective prediction of stress responsive TFs. Taken together, in this study, we demonstrate a comprehensive and high-quality census of TFs encoded within the soybean genome. These results provide a solid foundation for further systematic characterization of soybean TFs using traditional molecular approaches and/or genomic techniques at either the single-gene level or family-wide scale.

Materials and methods

Identification of TF repertoire in soybean

To identify TF encoding genes from the annotations of Glyma1 in the soybean genome, we applied 51 HMMs from Pfam[33] and 11 HMMs that we built using HMMbuild of the HMMER2 package (http://hmmer.janelia.org/), which corresponded to a total of 61 TF families (Supplementary Table S1). The modelled proteome data of annotated genes in Glyma1, which were downloaded from Phytozome (http://www.phytozome.net/), were subjected to a profile search against the HMM dataset using Pfam-HMM with an E-value threshold of E < 1e−5 (Supplementary Table S1). The search results for each of the TF families were then used to retrieve the matched regions as conserved DBDs, together with related annotations. To further classify genes with a conserved MYB domain into three subgroups, (R1)R2R3_MYB, MYB_related and atypical_MYB, the soybean MYB protein sequences were searched against previously classified Arabidopsis MYB genes[34] using blastp (E < 1e−5), and the top hit for each sequence was used for the classification. To avoid contamination of the GARP_ARRB family with pseudo response regulator or histidine kinase sequences, genes containing the Pfam domains CCT, CHASE, HATPase_c and HisKA together with Response_reg were searched by InterProScan. Genes that were hit in this search were subsequently removed from the GARP_ARRB family. The putative TF encoding genes discovered in the soybean genome were classified into the following four categories based on their potential functionality as TFs. The first group of TFs (Category A) consists of TF encoding genes showing sequence identity ≥95% and a blastn E ≤ 1e−100 with GenBank soybean sequences having a functional description as TFs. Category A genes were classified with the highest confidence level after assessment with the PubMed database. The second group of TFs (Category B) comprises TFs which have an equivalent protein domain arrangement (blastp E ≤ 1e−30) for regulatory function in well-annotated plants, such as Arabidopsis and/or rice. 
The third group of TFs (Category C) combines possible TFs which show a significant hit with each of the HMMs used for DBD prediction (Pfam-HMM E ≤ 1e−20). The last group (Category D) contains TFs which have promiscuous HMM matches at the set E-value threshold.
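
The four confidence categories can be read as a priority cascade over the best evidence available for each gene model. The following is a minimal sketch of that reading, not the authors' pipeline. The `Evidence` fields and the `categorize` function are hypothetical names. Two points are assumptions: that categories are tested in the order A to D, and that Category D means a DBD match passing the search threshold (E < 1e−5) without reaching the Category C threshold. The "equivalent protein domain arrangement" criterion for Category B is reduced here to the blastp E-value alone.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Evidence:
    """Best-hit evidence for one gene model (all field names are hypothetical)."""
    genbank_identity: Optional[float] = None  # % identity, best blastn hit to a GenBank soybean TF
    genbank_evalue: Optional[float] = None    # E-value of that blastn hit
    ortholog_evalue: Optional[float] = None   # best blastp E-value vs. an Arabidopsis/rice TF
    hmm_evalue: Optional[float] = None        # best Pfam-HMM E-value for a DBD model


def categorize(ev: Evidence) -> Optional[str]:
    """Assign category A-D, or None if no threshold is met (assumed priority order)."""
    # Category A: identity >= 95% and blastn E <= 1e-100 to an annotated soybean TF.
    if (ev.genbank_identity is not None and ev.genbank_evalue is not None
            and ev.genbank_identity >= 95.0 and ev.genbank_evalue <= 1e-100):
        return "A"
    # Category B: blastp E <= 1e-30 to a TF in a well-annotated plant.
    if ev.ortholog_evalue is not None and ev.ortholog_evalue <= 1e-30:
        return "B"
    if ev.hmm_evalue is not None:
        # Category C: strong DBD match, Pfam-HMM E <= 1e-20.
        if ev.hmm_evalue <= 1e-20:
            return "C"
        # Category D (assumed): passes only the general search threshold E < 1e-5.
        if ev.hmm_evalue < 1e-5:
            return "D"
    return None


if __name__ == "__main__":
    print(categorize(Evidence(genbank_identity=98.0, genbank_evalue=1e-120)))
    print(categorize(Evidence(hmm_evalue=1e-8)))
```

Testing the strictest criterion first means a gene model that satisfies several criteria is reported at its highest confidence level.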

Structural and functional annotations for putative soybean TFs

For annotating TF encoding genes in soybean,[35] we used protein and cDNA sequences of soybean TFs as queries against the following protein and nucleotide datasets using the BLAST algorithm:[36] the nr protein DB of NCBI (ftp://ftp.ncbi.nih.gov/blast/db); the protein data presented in TAIR release 8 (ftp://ftp.arabidopsis.org/home/tair/Sequences/blast_datasets); the protein data from UniProt (http://www.uniprot.org/); the TIGR/MSU Rice Genome Annotation Project (http://rice.plantbiology.msu.edu/) release 6; the soybean representative cDNA sequences in UniGene (ftp://ftp.ncbi.nih.gov/repository/UniGene/); the TIGR Transcript Assemblies (http://plantta.jcvi.org/); the Plant GDB (http://www.plantgdb.org/); the sequence sets of ESTs and high throughput cDNAs (HTCs) of RIKEN soybean full-length cDNA clones (http://rsoy.psc.riken.jp/);[37] the cDNAs of the previous version of the soybean genome annotation (Glyma0, Phytozome) and the target sequences of the Affymetrix soybean GeneChip (GPL4592 of NCBI GEO platform accession). All of the similarity searches using blastn were performed with a threshold of E < 1e−100, and the top scoring hit for each query was retained. All similarity searches with blastp against protein datasets were performed with a threshold of E < 1e−5 to find possible functional descriptions for TF encoding genes, and again the top scoring hit for each query was retained. Conserved domains in the protein sequence of putative TF encoding genes were identified with InterProScan and the InterPro DB (http://www.ebi.ac.uk/interpro/) to predict structures of DBD of TFs together with other functional domains and associated GO terms. All domains and their positions predicted by the search were retrieved and incorporated into our database. 
To determine the global characteristic features of functional categories of TF encoding genes of soybean, the TFs were assigned to possible GO terms based on a blastp similarity search to find Arabidopsis counterparts together with their annotated GO terms in TAIR8. Particular emphasis was placed on sequences serving under the ‘biological process’ functional category. TFs have been widely reported in plant TF databases such as DATF, AtTFDB, RARTF and PlnTFDB for Arabidopsis and DRTF, GRASSIUS and PlnTFDB for rice. To annotate all putative soybean TFs in relation to Arabidopsis and rice counterparts, soybean TF sequences were assigned to annotation data related to TF families provided from each of the aforementioned databases based on sequence similarity searches between soybean proteome data and those of Arabidopsis and rice.[8,14,38-42] The interrelated dataset of soybean genes, in combination with related Arabidopsis and rice TF annotations, was implemented into the SoybeanTFDB to provide cross references with other plant TFDBs.
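
The "threshold, then top scoring hit per query" rule used for these similarity searches can be sketched as follows. This is an illustration only, under the assumption that the BLAST results are available in the standard 12-column tab-separated layout (query, subject, percent identity, alignment length, mismatches, gap opens, query start/end, subject start/end, E-value, bit score). The function name `top_hits` and the identifiers in the toy rows are hypothetical.

```python
def top_hits(lines, max_evalue):
    """Keep, for each query, the highest-bit-score hit with E-value < max_evalue.

    `lines` is an iterable of tab-separated BLAST rows in the 12-column
    tabular layout; comment lines starting with '#' are skipped.
    """
    best = {}
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        query, subject = fields[0], fields[1]
        evalue, bitscore = float(fields[10]), float(fields[11])
        if evalue >= max_evalue:
            continue  # fails the E-value threshold
        if query not in best or bitscore > best[query][2]:
            best[query] = (subject, evalue, bitscore)
    return best


if __name__ == "__main__":
    # Toy rows with made-up identifiers, for illustration only.
    rows = [
        "soyTF_1\tath_A\t80.0\t200\t40\t0\t1\t200\t1\t200\t1e-50\t180.0",
        "soyTF_1\tath_B\t90.0\t200\t20\t0\t1\t200\t1\t200\t1e-80\t250.0",
        "soyTF_2\tath_C\t30.0\t50\t35\t0\t1\t50\t1\t50\t0.01\t30.0",
    ]
    print(top_hits(rows, max_evalue=1e-5))
```

With the protein threshold (E < 1e−5), the toy query `soyTF_2` receives no annotation because its only hit fails the cutoff.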

Gene duplications and gene clusters in soybean TF families

Gene duplications and gene clustering in soybean TF families were estimated by analysing the amino acid sequences of TF genes found on soybean chromosomes. Specifically, we used the cd-hit program of the CD-HIT package to identify gene pairs or gene clusters of closely homologous genes, based on global sequence similarity with a threshold of more than 60% amino acid sequence identity.[43] Gene clusters are defined as genetic loci containing three or more closely homologous genes. Once identified, the pairs or gene clusters were used to assess the chromosomal allocation of highly homologous genes. Genes in tandem duplication are arbitrarily defined as those occurring within a sequence distance of 50 kb. On the other hand, genes that are duplicated in the same chromosome but reside >50 kb from each other are referred to as ‘Duplications in same chromosome’. ‘Duplications in different chromosomes’ indicate pairs of highly homologous genes which reside on different chromosomes.
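
The three duplication classes defined above reduce to a comparison of chromosome and distance for each homologous pair. A minimal sketch follows, assuming each gene is given as a `(chromosome, start, end)` tuple and that "sequence distance" means the gap between the two gene bodies. That distance definition, the tuple layout and the name `classify_pair` are assumptions made for illustration.

```python
TANDEM_MAX_BP = 50_000  # 50 kb cutoff stated in the text


def classify_pair(gene_a, gene_b):
    """Classify a pair of closely homologous genes by chromosomal placement.

    Each gene is a (chromosome, start, end) tuple with start <= end.
    The distance is taken as the gap between the gene bodies (assumption);
    overlapping genes have distance 0.
    """
    chrom_a, start_a, end_a = gene_a
    chrom_b, start_b, end_b = gene_b
    if chrom_a != chrom_b:
        return "Duplications in different chromosomes"
    gap = max(0, max(start_a, start_b) - min(end_a, end_b))
    if gap <= TANDEM_MAX_BP:
        return "Tandem duplication"
    return "Duplications in same chromosome"


def is_gene_cluster(members):
    """A gene cluster is a group of three or more closely homologous genes."""
    return len(members) >= 3


if __name__ == "__main__":
    # Coordinates are invented for illustration.
    print(classify_pair(("Gm01", 1_000, 3_000), ("Gm01", 20_000, 22_000)))
    print(classify_pair(("Gm01", 1_000, 3_000), ("Gm05", 1_000, 3_000)))
```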

Discovery of cis-regulatory motifs in promoter regions of TF genes

To discover cis-regulatory motifs located in the promoter regions of each putative soybean TF gene and to investigate the enriched representation of cis-motifs in each TF family, cis-motif sequences from the PLACE database (version 30, 469 entries) (http://www.dna.affrc.go.jp/PLACE/)[44] and the stress responsive cis-motifs previously reported[45] were used as queries to search against the Glyma1 genome scaffold sequence using the fuzznuc program of EMBOSS package (http://emboss.sourceforge.net/). The results of pattern matches were subsequently assessed to identify matched sequences located on the −500, −1000 and −3000 bp upstream sequences from the putative transcription start site for each TF encoding gene defined in the Glyma1 annotation. The cis-element search results were implemented into the SoybeanTFDB as a searchable property. In addition, these search results were also incorporated as an annotation track of the genome browser (Gbrowse). To assess the enrichment for the representative allocation of each cis-element identified on upstream sequences of each TF family, we analysed cis-element representations in the −1000 bp promoter region of TF members for TF families containing more than 50 gene loci to compare cis-element representations of randomly sampled gene loci of Glyma1. The computation of the overrepresentation test and its significance were performed by a Z-test as previously described.[46]
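
The promoter scan and the enrichment comparison can be illustrated with a short sketch. It is a stand-in for the pattern matching step, not a replica of fuzznuc, and it scans the forward strand only. It assumes the upstream sequence is supplied as a string ending immediately before the transcription start site. The exact Z-test of ref. 46 is not described in this section, so `z_score` below uses one common formulation (observed count against the mean and standard deviation of counts from random gene samples) and should be read as an assumption. All function names and the example sequence are hypothetical.

```python
import re
from statistics import mean, stdev

# IUPAC nucleotide codes expanded to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def motif_to_regex(motif):
    """Compile an IUPAC motif; the lookahead lets overlapping hits be found."""
    body = "".join(IUPAC[base] for base in motif.upper())
    return re.compile("(?=(" + body + "))")


def scan_upstream(upstream, motif):
    """Return hit start positions relative to the TSS (negative numbers).

    Assumes `upstream` ends immediately before the transcription start
    site, so its last base is position -1.
    """
    n = len(upstream)
    pattern = motif_to_regex(motif)
    return [m.start() - n for m in pattern.finditer(upstream.upper())]


def count_by_window(positions, windows=(500, 1000, 3000)):
    """Count hits that start within each upstream window (in bp)."""
    return {w: sum(1 for p in positions if p >= -w) for w in windows}


def z_score(observed, random_counts):
    """Z-score of an observed motif count against random-sample counts."""
    sd = stdev(random_counts)
    if sd == 0:
        raise ValueError("random samples have zero variance")
    return (observed - mean(random_counts)) / sd


if __name__ == "__main__":
    upstream = "TTACGTGTT" + "A" * 10  # invented sequence
    hits = scan_upstream(upstream, "ACGTG")  # illustrative motif string
    print(hits, count_by_window(hits))
```

Reporting positions relative to the TSS lets one scan of the longest region serve the −500, −1000 and −3000 bp windows.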

Construction of a web accessible database

The database is implemented in MySQL, and the web interface, written in Perl CGI and JavaScript, runs on the Apache Web server. The definition strings used for sequence similarity searches for each database, the domain searches by InterProScan, cis-motif names from the PLACE database and the assigned GO terms have been assembled as a keyword database enabling users to specify queries on any keyword and to retrieve relevant information for genes from the SoybeanTFDB. A BLAST server was implemented to provide a similarity search interface for queried sequences using NCBI BLAST together with soybean Glyma1-related sequences, as well as those from Arabidopsis and rice. Generic Genome Browser (Gbrowse)[47] was also implemented in SoybeanTFDB with Glyma1 genome annotations released by Phytozome to visualize the gene annotations of the putative TF encoding genes together with cis-motifs found on the upstream sequence of the TF genes. All of the data in the SoybeanTFDB are accessible not only through a web interface but also as downloadable files from the website. The cross references of corresponding data for each of the entries were also implemented into the SoybeanTFDB together with the URLs for each of the original referenced data to provide hyperlinks on the web interface for seamless navigation.

Results and discussion

Identification of the soybean TF repertoire

For the purpose of identifying the repertoire of TFs within the soybean genome, we first define a class of proteins which bind DNA in a sequence-specific fashion. A protein is classified as a TF if it has a significant match, under the significance thresholds for HMM matches, to a model that we annotated as being a DBD. Supplementary Table S1 summarizes the HMMs used in TF predictions. For each HMM, we examined the description and associated literature to assess their sequence-specific DNA-binding capabilities. The pipeline that we used to predict soybean TFs began with retrieving the complete set of predicted proteins from the completely sequenced soybean genome. This was followed by a HMMER search with all HMMs taken from the Pfam database (Fig. 1). In total, 4342 putative TF encoding loci which showed a significant match with these selected DBDs were extracted from the soybean genome sequence Glyma1 model (http://www.phytozome.net/soybean#C). These putative TFs represent 6.56% of the total number of predicted genes in soybean (Table 1 and Supplementary Table S2). In soy, this percentage of TFs to total gene number was similar to what has been observed for Arabidopsis. In the Arabidopsis genome, there are at least 1968 TFs which account for 7.23% of the total number of genes. Although the number of TFs generally increases with the number of genes in a genome, interestingly the percentage of TF genes described in rice (3.68%) is less than expected (Table 1). The identified soybean TFs were classified into 61 families based on the presence of domains that were specific for the family (Table 2). Among the identified TFs, a significant proportion of the soybean TF repertoire has not been annotated with full-length open reading frames in the Glyma1 model. 
As a means to address this deficiency, we took advantage of our recently released soybean FL-cDNA collection of 4712 complete sequences and 68 661 ESTs to assess the Glyma1 annotation of the identified TFs (http://www.legumebase.agr.miyazaki-u.ac.jp).[37] Table 2 summarizes the full-length information of the soybean TF encoding genes annotated by Glyma1 and the FL-cDNA collection. Detailed information for each gene is available in Supplementary Table S2. Next, we grouped the TFs into four categories according to our confidence in their structure and functionality by assessing PubMed and relevant databases as described in Materials and methods (Fig. 2, and Supplementary Table S3). Relevant information on the soybean TF repertoire can be easily accessed at our website SoybeanTFDB (http://soybeantfdb.psc.riken.jp). Information that is readily available for the TF repertoire includes nucleotide and amino acid sequences, promoter regions and domain alignments within the family as well as multiple alignments with putative Arabidopsis and homologous rice genes.
Figure 1

Schematic workflow of the computational pipeline used to discover and annotate genes encoding putative transcription factors in soybean.

Table 1

Numbers of TFs in Arabidopsis, rice and soybean

| Species | No. of non-redundant TF gene loci (a) | No. of non-redundant gene loci (b) | Referenced annotation | Percentage of TFs (c) | Database | URL |
|---|---|---|---|---|---|---|
| A. thaliana | 1922 | 27 235 | TAIR8 | 7.06 | DATF | http://datf.cbi.pku.edu.cn/ |
| | 1968 | | | 7.23 | RARTF | http://rarge.g.sc.riken.jp/rartf/ |
| | 1961 | | | 7.20 | PlnTFDB | http://plntfdb.bio.uni-potsdam.de/v2.0/index.php?sp_id=ATH |
| | 1737 | | | 6.38 | AtTFDB | http://arabidopsis.med.ohio-state.edu/AtTFDB/ |
| | 1358 | | | 4.99 | DBD | http://dbd.mrc-lmb.cam.ac.uk/DBD/index.cgi?About |
| O. sativa | 1928 | 56 797 | Rice Pseudomolecule Release 6 | 3.39 | DRTF | http://drtf.cbi.pku.edu.cn/ |
| | 2095 | | | 3.69 | PlnTFDB | http://plntfdb.bio.uni-potsdam.de/v2.0/index.php?sp_id=OSAJ |
| | 2141 | | | 3.77 | GRASSIUS | http://grassius.org/summary.html |
| | 1629 | | | 2.87 | DBD | http://dbd.mrc-lmb.cam.ac.uk/DBD/index.cgi?About |
| G. max | 4342 | 66 210 | Glyma1.0 | 6.56 | SoybeanTFDB | http://soybeantfdb.psc.riken.jp |

(a) Number of predicted non-redundant TF gene loci in each genome.

(b) Number of predicted non-redundant gene loci in each genome.

(c) Percentage of TFs per genome.
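
The percentage column of Table 1 follows directly from the two locus counts. As a small worked check (the function name `tf_percentage` is hypothetical), the values for the three first-listed databases can be reproduced like this:

```python
def tf_percentage(tf_loci, total_loci):
    """Percentage of gene loci that encode TFs, rounded to two decimals."""
    return round(100.0 * tf_loci / total_loci, 2)


if __name__ == "__main__":
    # Counts taken from Table 1: (TF gene loci, total gene loci).
    table1 = {
        "A. thaliana (DATF)": (1922, 27235),
        "O. sativa (DRTF)": (1928, 56797),
        "G. max (SoybeanTFDB)": (4342, 66210),
    }
    for label, (tf_loci, total_loci) in table1.items():
        print(label, tf_percentage(tf_loci, total_loci))
```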

Table 2

Characteristics of soybean transcription factors

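| No. | TF gene family | No. of models (a) | No. of gene loci (b) | No. of models (FL) (c) | No. of models (not FL) (d) | No. of models assigned with RIKEN FL EST (e) | No. of models assigned with RIKEN FL-HTC (f) |
|---|---|---|---|---|---|---|---|
| 1 | (R1)R2R3_MYB | 333 | 319 | 246 | 87 | 37 | 6 |
| 2 | ABI3VP1 | 163 | 139 | 121 | 42 | 35 | 7 |
| 3 | Alfin-like | 27 | 18 | 26 | 1 | 7 | 2 |
| 4 | AP2_EREBP | 405 | 382 | 306 | 99 | 78 | 31 |
| 5 | ARF | 75 | 58 | 69 | 6 | 29 | 6 |
| 6 | ARID | 22 | 20 | 17 | 5 | 7 | 0 |
| 7 | atypical_MYB | 89 | 78 | 67 | 22 | 22 | 2 |
| 8 | Aux_IAA | 126 | 85 | 120 | 6 | 41 | 14 |
| 9 | BBR-BPC | 21 | 10 | 21 | 0 | 1 | 1 |
| 10 | BES1 | 21 | 18 | 17 | 4 | 8 | 2 |
| 11 | bHLH | 390 | 325 | 317 | 73 | 72 | 18 |
| 12 | bZIP | 205 | 148 | 194 | 11 | 49 | 8 |
| 13 | C2C2_Zn-CO-like | 101 | 84 | 87 | 14 | 31 | 13 |
| 14 | C2C2_Zn-Dof | 87 | 81 | 78 | 9 | 13 | 5 |
| 15 | C2C2_Zn-GATA | 65 | 63 | 53 | 12 | 14 | 1 |
| 16 | C2C2_Zn-YABBY | 28 | 18 | 27 | 1 | 8 | 1 |
| 17 | C2H2_Zn | 270 | 258 | 211 | 59 | 42 | 5 |
| 18 | C3H-TypeI | 178 | 151 | 154 | 24 | 50 | 14 |
| 19 | CAMTA | 14 | 14 | 12 | 2 | 4 | 0 |
| 20 | CCAAT_Dr1 | 23 | 16 | 20 | 3 | 4 | 2 |
| 21 | CCAAT_HAP2 | 42 | 23 | 40 | 2 | 9 | 2 |
| 22 | CCAAT_HAP3 | 47 | 39 | 34 | 13 | 5 | 2 |
| 23 | CCAAT_HAP5 | 26 | 23 | 20 | 6 | 6 | 0 |
| 24 | CPP | 22 | 17 | 17 | 5 | 1 | 0 |
| 25 | E2F_DP | 23 | 14 | 21 | 2 | 3 | 0 |
| 26 | EIL | 14 | 13 | 11 | 3 | 6 | 2 |
| 27 | GARP_ARRB | 21 | 21 | 16 | 5 | 4 | 2 |
| 28 | GARP_G2-like | 104 | 82 | 95 | 9 | 20 | 6 |
| 29 | GeBP | 17 | 17 | 6 | 11 | 4 | 1 |
| 30 | GRAS | 127 | 117 | 101 | 26 | 29 | 7 |
| 31 | GRF | 10 | 8 | 9 | 1 | 3 | 0 |
| 32 | HB | 283 | 242 | 240 | 43 | 67 | 20 |
| 33 | HMG-box | 50 | 26 | 43 | 7 | 18 | 3 |
| 34 | HRT | 1 | 1 | 1 | 0 | 1 | 1 |
| 35 | HSF | 65 | 59 | 54 | 11 | 11 | 6 |
| 36 | JUMONJI | 54 | 51 | 37 | 17 | 8 | 0 |
| 37 | LFY | 3 | 3 | 2 | 1 | 0 | 0 |
| 38 | LIM | 41 | 32 | 38 | 3 | 5 | 0 |
| 39 | LUG | 10 | 9 | 10 | 0 | 2 | 0 |
| 40 | MADS | 220 | 186 | 141 | 79 | 7 | 0 |
| 41 | MBF1 | 3 | 3 | 3 | 0 | 2 | 1 |
| 42 | MYB_related | 168 | 135 | 138 | 30 | 35 | 8 |
| 43 | NAC | 205 | 187 | 159 | 46 | 37 | 11 |
| 44 | Nin-like | 23 | 23 | 17 | 6 | 2 | 0 |
| 45 | PcG | 94 | 86 | 76 | 18 | 11 | 1 |
| 46 | PHD | 333 | 285 | 287 | 46 | 84 | 16 |
| 47 | PLATZ | 40 | 33 | 34 | 6 | 8 | 2 |
| 48 | S1Fa-like | 4 | 4 | 4 | 0 | 2 | 0 |
| 49 | SAP | 2 | 2 | 1 | 1 | 0 | 0 |
| 50 | SBP | 58 | 48 | 46 | 12 | 15 | 2 |
| 51 | SRS | 24 | 22 | 18 | 6 | 0 | 0 |
| 52 | TCP | 61 | 61 | 39 | 22 | 13 | 3 |
| 53 | Trihelix | 34 | 33 | 31 | 3 | 9 | 4 |
| 54 | TUB | 37 | 24 | 33 | 4 | 18 | 2 |
| 55 | ULT | 32 | 32 | 5 | 27 | 0 | 0 |
| 56 | VOZ | 8 | 7 | 7 | 1 | 2 | 1 |
| 57 | Whirly | 11 | 7 | 11 | 0 | 2 | 0 |
| 58 | WRKY_Zn | 219 | 198 | 167 | 52 | 47 | 12 |
| 59 | zf-HD | 57 | 56 | 37 | 20 | 4 | 1 |
| 60 | zf-TAZ | 8 | 8 | 6 | 2 | 1 | 0 |
| 61 | ZIM | 57 | 34 | 55 | 2 | 28 | 13 |
| | Total | 5035 | 4342 | 4032 | 1003 | 1003 | 249 |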

(a) Number of predicted TF models in Glyma1 model.

(b) Number of predicted TF loci in Glyma1 model.

(c) Number of predicted full-length TF models in Glyma1 model.

(d) Number of predicted not full-length TF models in Glyma1 model.

(e) Number of predicted full-length TF models in soybean assigned with RIKEN full-length ESTs.

(f) Number of predicted full-length TF models in soybean assigned with RIKEN full-length high throughput cDNAs.

Figure 2

The distributions of soybean TF encoding genes are classified into four categories of annotation levels. Category A includes soybean gene models showing sequence identity ≥95% and a blastn E ≤ 1e−100 with GenBank soybean sequences having a functional description as TFs. Category B includes gene models which have an equivalent protein domain arrangement (blastp E ≤ 1e−30) for regulatory function in well-annotated plants, such as Arabidopsis and/or rice. Category C includes gene models which show a significant hit with each of the HMMs used for DBD prediction (Pfam-HMM E ≤ 1e−20). Category D includes TF genes which have promiscuous HMMs with a threshold of settled E-values.

Our prediction method depends heavily on the content of the Pfam database and the ability of the search algorithms to detect the DBDs in protein sequences, thus there are a few possible sources of inaccuracies in this prediction method. In addition, although the Glyma1 model contains more than 98% of known soybean protein-coding genes in its assembly, part of the TF repertoire may be clarified in the future by fine-tuning of the annotation. 
Finally, our literature analysis depends on the existing available published information pertaining to each gene, which will need to be updated as new findings are reported. The availability of updated HMM libraries or refinements of existing ones, better fine-tuned annotation and continuous searches for newly reported literature will enable us to improve the TF prediction coverage. We will continue to update the website with new information when it becomes available. Literature analysis, which is achieved by assessing corresponding genes deposited as soybean TF genes in the GenBank core nucleotide division together with associated identifiers of PubMed, has revealed that the majority of soybean TFs remain experimentally uncharacterized. Thus, we attempted to further extend our current knowledge base regarding their regulatory function by assessing the putative functions of soybean TFs via comparative analyses with relevant GO annotations of Arabidopsis in TAIR. First, we analysed the profile of GO terms at the biological process level which could be assigned to soybean TFs based on sequence similarity searches against Arabidopsis counterparts having GO terms in TAIR. In order to grasp the overall representation of GO terms in applied entries of soybean TFs, all of the assigned terms were counted after the similarity searches were completed. With the exception of ‘regulation of transcription’, ‘DNA binding’ and ‘biological process’, the top 21 most abundant terms were subsequently used to classify the TFs. The results for these 21 GO terms for each soybean TF, which were based on similarity with Arabidopsis TFs, are provided in Supplementary Table S4. Figure 3A illustrates the distribution of soybean TFs in the 21 most abundant GO terms. A significant proportion of soybean TFs are related to stress and hormone responses (Fig. 3B), indicating the important role of TFs in controlling these biological processes. 
Of these assigned regulatory functions, responses to auxin, chitin and salt stress are the most highly represented. It is acknowledged that these annotations are the first steps in functional prediction, and researchers must use original publications as a source for a higher level of detailed information. In addition, it is ideal if functional analyses can be performed in order to gain a detailed understanding of gene function. Overall, these analyses emphasize the limited amount of functional information that we know regarding the biological processes that most of the TFs mediate, even for model plants such as Arabidopsis. Directing research efforts into uncharacterized TFs, for example by using high-throughput genomic surveys to describe the key features combined with a detailed examination using traditional molecular approaches, will undoubtedly accelerate our functional understanding of these important regulatory genes. The NAC TF family, which is widely distributed in plants but so far has not been found in other eukaryotes, is an excellent example of how research interest can suddenly arise following key findings. The acceleration in functional studies has revealed their diverse functions in different biological aspects, and future follow-up studies will rapidly improve our understanding of the regulatory function of NAC members. A greater understanding of how TFs operate will subsequently be translated into potential applications to enhance plant productivity.[4,48,49]
Figure 3

The representative distributions of the GO terms for biological processes associated with soybean TF encoding genes. The top 21 abundantly found GO terms were assigned based on homology searches against annotated Arabidopsis genes (A). Abundant distribution of TFs in GO terms related to the response to various types of abiotic stresses in the soybean TF dataset (B). Gene numbers are displayed next to the terms.
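
The GO term tally behind Figure 3A amounts to counting, across all TFs, the terms transferred from Arabidopsis counterparts, dropping the three generic terms, and keeping the 21 most frequent. A minimal sketch, assuming the assignments are held as a mapping from TF identifier to a list of term names. The data layout, the function name `top_go_terms` and the toy identifiers are hypothetical, and the excluded strings are those quoted in the text rather than exact GO labels.

```python
from collections import Counter

# Generic terms excluded from the ranking, as quoted in the text.
EXCLUDED = frozenset({"regulation of transcription", "DNA binding", "biological process"})


def top_go_terms(tf_to_terms, n=21, excluded=EXCLUDED):
    """Return the n most abundant GO terms as (term, number of TFs) pairs.

    Each TF contributes at most once per term; excluded terms are ignored.
    """
    counts = Counter()
    for terms in tf_to_terms.values():
        counts.update(set(terms) - excluded)
    return counts.most_common(n)


if __name__ == "__main__":
    # Toy assignments with invented TF identifiers.
    assignments = {
        "tf1": ["response to auxin stimulus", "DNA binding"],
        "tf2": ["response to auxin stimulus", "response to salt stress"],
        "tf3": ["response to auxin stimulus", "response to salt stress",
                "response to chitin"],
    }
    print(top_go_terms(assignments, n=2))
```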

Structural feature of TFs

As mentioned above, the most common classification of TFs is based on the structure of their DBD.[7,14] Grouping TFs by their structural domains has been extremely useful in gaining insights into how they recognize and bind specific DNA sequences. This strategy has also proven successful for characterizing their evolutionary histories. Moreover, the DBD may provide clues to their biological function. For example, ABI3/VP1 TFs are often associated with the regulation of abscisic acid (ABA)-responsive genes during seed development.[50] Since structural features of TF families have been extensively characterized in other reports, we do not cover this in detail within this report.[8] However, it is worth noting that soybean contains a number of large families which consist of more than 100 members (Supplementary Table S5). For example, the large AP2_EREBP family alone contains 405 TF models and accounts for a total of 8.04% of the TF repertoire. The bHLH and (R1)R2R3_MYB TFs also represent major families with 390 and 333 members, respectively, which together occupy 14.36% of the TF repertoire. These observations agree with the previous studies in Arabidopsis and rice, which confirmed that the same three families contain the highest numbers of TFs in these model systems (Supplementary Table S5). In addition, the plant-specific NAC family, which comprises 205 models in soybean, represents a similar ratio in Arabidopsis and rice (Supplementary Table S5). Taken together, these results suggest a similar tendency in the evolution of major TF families in plants. Furthermore, given that the size of TF families is influenced in part by the number of different DNA sequences that they are able to recognize, the DBDs of AP2_EREBP, bHLH and (R1)R2R3_MYB TF families may be able to diversify their collection of target sequences. As a result, they occur in greater numbers in a genome.[4,51,52]

Chromosomal distribution and gene duplications of TFs

Our analysis indicated that the soybean TF families are scattered throughout the genome. The larger families, such as AP2_EREBP and (R1)R2R3_MYB, have members that are distributed on every chromosome in soybean (Supplementary Table S2). The local distribution of TF genes relative to each other is also of interest. Previous studies have described duplications and clusters of highly homologous genes. In Arabidopsis, tandem gene duplications and large-scale duplications on different chromosomes may account for >60% of the genome.[7] In soybean, we were able to distinguish between two types of duplications and clusters based on the evolutionary history of the TF-coding genes that they contain. The first type consists of a series of paralogous genes, suggesting that they arose through repeated tandem duplications originating from a founding locus. In contrast, the second type does not consist of paralogous genes. We anticipate that the TF-coding genes in these duplications and clusters arose independently of each other at diverse locations within the genome and, over time, relocated to form the duplications and clusters. Table 3 summarizes gene duplications and gene clusters in soybean TF families. Closely related genes, defined by >60% amino acid sequence identity, account for ∼77.75% of the total number in the TF families (Table 3). Pairs of duplicated genes on different chromosomes are the most common, and gene clusters of three or more highly related genes are also widely found (Table 3). On the basis of the distance between them, a few of the duplicated genes could be classified, somewhat arbitrarily, as either duplicated on the same chromosome or tandemly duplicated. Evolutionary studies and haploid genome analysis have suggested that the soybean genome experienced a tetraploidization event an estimated 10–15 million years ago. Since then, the soybean genome has gone through extensive gene rearrangements and deletions to become diploidized.[53] Therefore, we can observe in soybean that multigene families, including TF families, contain highly related genes.[24,54]
Table 3

Classification of homologous soybean TF genes

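| TF family | No. of gene loci (a) | No. of genes with close homolog (b) | Percentage of genes with close homolog | No. of individual genes | Duplications in different chromosomes (c) | Duplications in same chromosome (d) | Tandem duplications (e) | No. of gene clusters/no. of genes in cluster (no. of chromosomes) (f) |
|---|---|---|---|---|---|---|---|---|
| (R1)R2R3_MYB | 318 | 261 | 82.08 | 57 | 50 | 2 | 1 | 43/155 (20) |
| ABI3VP1 | 139 | 90 | 64.75 | 49 | 20 | 1 | 2 | 12/44 (17) |
| Alfin-like | 18 | 18 | 100.00 | 0 | 1 | 0 | 0 | 1/16 (11) |
| AP2_EREBP | 381 | 309 | 81.10 | 72 | 74 | 1 | 0 | 42/159 (20) |
| ARF | 58 | 45 | 77.59 | 13 | 8 | 1 | 0 | 7/27 (16) |
| ARID | 20 | 9 | 45.00 | 11 | 1 | 0 | 0 | 2/7 (6) |
| atypical_MYB | 78 | 42 | 53.85 | 36 | 14 | 1 | 1 | 3/10 (6) |
| Aux_IAA | 85 | 72 | 84.71 | 13 | 9 | 0 | 0 | 13/54 (15) |
| BBR-BPC | 10 | 10 | 100.00 | 0 | 2 | 0 | 0 | 1/6 (4) |
| BES1 | 18 | 17 | 94.44 | 1 | 3 | 0 | 0 | 2/11 (9) |
| bHLH | 323 | 269 | 83.28 | 54 | 63 | 1 | 2 | 38/137 (20) |
| bZIP | 148 | 124 | 83.78 | 24 | 26 | 1 | 0 | 18/70 (20) |
| C2C2_Zn-CO-like | 84 | 67 | 79.76 | 17 | 19 | 0 | 0 | 7/29 (13) |
| C2C2_Zn-Dof | 81 | 60 | 74.07 | 21 | 17 | 1 | 0 | 7/24 (13) |
| C2C2_Zn-GATA | 63 | 53 | 84.13 | 10 | 11 | 0 | 0 | 9/31 (15) |
| C2C2_Zn-YABBY | 18 | 15 | 83.33 | 3 | 2 | 0 | 0 | 3/11 (8) |
| C2H2_Zn | 257 | 187 | 72.76 | 70 | 61 | 2 | 2 | 15/57 (18) |
| C3H-TypeI | 151 | 123 | 81.46 | 28 | 25 | 0 | 1 | 15/71 (17) |
| CAMTA | 14 | 12 | 85.71 | 2 | 6 | 0 | 0 | 0/0 (0) |
| CCAAT_Dr1 | 16 | 14 | 87.50 | 2 | 2 | 0 | 0 | 3/10 (9) |
| CCAAT_HAP2 | 23 | 16 | 69.57 | 7 | 2 | 0 | 0 | 3/12 (10) |
| CCAAT_HAP3 | 39 | 32 | 82.05 | 7 | 4 | 0 | 0 | 4/24 (14) |
| CCAAT_HAP5 | 23 | 19 | 82.61 | 4 | 3 | 0 | 1 | 3/11 (9) |
| CPP | 17 | 10 | 58.82 | 7 | 3 | 0 | 0 | 1/4 (4) |
| E2F_DP | 14 | 11 | 78.57 | 3 | 2 | 0 | 0 | 2/7 (5) |
| EIL | 13 | 12 | 92.31 | 1 | 2 | 0 | 0 | 2/8 (6) |
| GARP_ARRB | 20 | 13 | 65.00 | 7 | 3 | 0 | 0 | 2/7 (6) |
| GARP_G2-like | 82 | 59 | 71.95 | 23 | 18 | 0 | 0 | 7/23 (13) |
| GeBP | 17 | 13 | 76.47 | 4 | 2 | 0 | 0 | 2/9 (8) |
| GRAS | 117 | 102 | 87.18 | 15 | 23 | 0 | 0 | 14/56 (20) |
| GRF | 8 | 2 | 25.00 | 6 | 1 | 0 | 0 | 0/0 (0) |
| HB | 240 | 197 | 82.08 | 43 | 31 | 0 | 0 | 33/135 (20) |
| HMG-box | 26 | 22 | 84.62 | 4 | 5 | 0 | 0 | 2/12 (10) |
| HRT | 1 | 0 | 0.00 | 1 | 0 | 0 | 0 | 0/0 (0) |
| HSF | 59 | 48 | 81.36 | 11 | 12 | 0 | 0 | 7/24 (17) |
| JUMONJI | 51 | 28 | 54.90 | 23 | 8 | 0 | 1 | 3/10 (9) |
| LFY | 3 | 2 | 66.67 | 1 | 1 | 0 | 0 | 0/0 (0) |
| LIM | 32 | 28 | 87.50 | 4 | 1 | 0 | 0 | 5/26 (15) |
| LUG | 9 | 8 | 88.89 | 1 | 2 | 0 | 0 | 1/4 (4) |
| MADS | 186 | 154 | 82.80 | 32 | 25 | 2 | 1 | 18/98 (19) |
| MBF1 | 3 | 3 | 100.00 | 0 | 0 | 0 | 0 | 1/3 (2) |
| MYB_related | 135 | 112 | 82.96 | 23 | 17 | 1 | 1 | 16/74 (19) |
| NAC | 187 | 173 | 92.51 | 14 | 26 | 0 | 1 | 30/119 (20) |
| Nin-like | 23 | 15 | 65.22 | 8 | 4 | 0 | 0 | 2/7 (4) |
| PcG | 86 | 56 | 65.12 | 30 | 21 | 0 | 0 | 4/14 (9) |
| PHD | 285 | 188 | 65.96 | 97 | 51 | 1 | 0 | 18/84 (20) |
| PLATZ | 33 | 27 | 81.82 | 6 | 2 | 0 | 1 | 4/21 (13) |
| S1Fa-like | 4 | 4 | 100.00 | 0 | 0 | 0 | 0 | 1/4 (4) |
| SAP | 2 | 2 | 100.00 | 0 | 1 | 0 | 0 | 0/0 (0) |
| SBP | 48 | 39 | 81.25 | 9 | 9 | 0 | 0 | 6/21 (13) |
| SRS | 22 | 18 | 81.82 | 4 | 4 | 0 | 0 | 3/10 (8) |
| TCP | 61 | 44 | 72.13 | 17 | 19 | 0 | 0 | 2/6 (3) |
| Trihelix | 33 | 25 | 75.76 | 8 | 11 | 0 | 0 | 1/3 (3) |
| TUB | 24 | 20 | 83.33 | 4 | 2 | 0 | 0 | 3/16 (12) |
| ULT | 24 | 20 | 83.33 | 4 | 1 | 1 | 0 | 2/16 (1) |
| VOZ | 7 | 7 | 100.00 | 0 | 1 | 0 | 0 | 1/5 (4) |
| Whirly | 7 | 6 | 85.71 | 1 | 3 | 0 | 0 | 0/0 (0) |
| WRKY_Zn | 198 | 166 | 83.84 | 32 | 49 | 0 | 2 | 17/64 (20) |
| zf-HD | 56 | 46 | 82.14 | 10 | 7 | 0 | 1 | 6/30 (15) |
| zf-TAZ | 8 | 5 | 62.50 | 3 | 1 | 0 | 0 | 1/3 (3) |
| ZIM | 34 | 27 | 79.41 | 7 | 8 | 0 | 0 | 3/11 (9) |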

(a) Number of predicted TF loci found in soybean chromosomes (Glyma1 model).

(b) Genes were considered close homologs if they showed >60% amino acid sequence identity.

(c) Pairs of closely homologous genes that are duplicated on different chromosomes.

(d) Pairs of closely homologous genes that are duplicated on the same chromosome but reside >50 kb apart from each other.

(e) Pairs of closely homologous genes that are duplicated on the same chromosome but reside <50 kb apart from each other.

(f) Clusters of three or more closely homologous genes.
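The pair categories defined in the Table 3 footnotes can be expressed as a small decision rule. The sketch below is illustrative only: the `Locus` record, the gene identifiers, the coordinates and the way the distance between two loci is measured (gap between the nearest ends) are assumptions, since the text gives only the 50 kb threshold. The helper `genes_with_close_homolog` encodes a bookkeeping that the rows of Table 3 appear consistent with: two genes per duplicated pair plus all genes placed in clusters.

```python
from dataclasses import dataclass

TANDEM_MAX_GAP_BP = 50_000  # 50 kb cut-off from the Table 3 footnotes


@dataclass(frozen=True)
class Locus:
    gene_id: str      # hypothetical identifier, for illustration only
    chromosome: str
    start: int        # bp, start <= end
    end: int


def gap_bp(a, b):
    """Gap between two loci on one chromosome (0 if they overlap); assumed distance measure."""
    return max(0, max(a.start, b.start) - min(a.end, b.end))


def classify_pair(a, b):
    """Assign a pair of close homologs to one of the three pair categories of Table 3."""
    if a.chromosome != b.chromosome:
        return "different_chromosomes"
    if gap_bp(a, b) < TANDEM_MAX_GAP_BP:
        return "tandem"
    # A gap of exactly 50 kb is not covered by the footnotes; it falls here by assumption.
    return "same_chromosome"


def genes_with_close_homolog(diff_pairs, same_pairs, tandem_pairs, genes_in_clusters):
    """Two genes per duplicated pair plus all genes placed in clusters."""
    return 2 * (diff_pairs + same_pairs + tandem_pairs) + genes_in_clusters


g1 = Locus("geneA", "Gm01", 1_000, 2_000)
g2 = Locus("geneB", "Gm01", 30_000, 31_000)
g3 = Locus("geneC", "Gm01", 500_000, 501_000)
g4 = Locus("geneD", "Gm02", 1_000, 2_000)
print(classify_pair(g1, g2), classify_pair(g1, g3), classify_pair(g1, g4))
```

For the (R1)R2R3_MYB row, 2 × (50 + 2 + 1) + 155 = 261, which matches the number of genes with a close homolog listed for that family.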


Promoter regions of the TFs and the discovery of cis-elements in the TF promoter regions

Cis-regulatory elements, the binding sites for TFs located in the promoter regions of genes, are the functional elements that determine the timing and location of transcriptional activity. Over the years, extensive promoter analyses have identified a large number of cis-elements, which are important molecular switches involved in the transcriptional regulation of a dynamic network of gene activities controlling various biological processes such as abiotic stress responses, hormone responses and developmental processes.[45,55] The PLACE database (http://www.dna.affrc.go.jp/PLACE/) has consolidated all of the published cis-motifs identified to date. In addition, a number of stress responsive cis-motifs have been reported, which are of great interest to our area of research.[45] To facilitate the functional characterization of soybean TFs, we retrieved the promoter regions of all of the TF genes from the soybean genomic sequence database. Specifically, we retrieved 500, 1000 and 3000 bp of sequence upstream of the putative transcription start sites. We provide these data on our website, together with other relevant information on the TFs, for convenient downloading. The −500, −1000 and −3000 bp promoter regions were subjected to extensive in silico analysis to search for all known putative cis-regulatory motifs. In addition, we analysed the enrichment of all of the cis-motifs in each TF family using the −1000 bp promoter regions as described in Materials and methods. Information on the cis-elements located in the promoter region of each TF is accessible on the detailed page of each TF gene under the ‘cis-motif prediction’ function (Fig. 4C). Our website also provides the ‘cis-motif (PLACE)’ search function, which enables a search for all types of cis-motifs provided by the PLACE database in the promoter region of any TF and/or a search for those TFs which contain the cis-motif(s) of interest (Fig. 4A). In combination with GO annotations (Fig. 4H), these data can facilitate the systematic functional prediction of soybean TFs.
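The two steps described above, retrieving a fixed-length upstream region and searching it for motif patterns, can be sketched as follows. This is a minimal illustration rather than the pipeline used for SoybeanTFDB: the 0-based coordinate convention, the function names and the decision to scan both strands are assumptions. The two example patterns are the ABRE patterns listed in Table 4, which happen to be valid regular expressions; motifs written with IUPAC ambiguity codes would first need translating into character classes.

```python
import re

COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def upstream_region(chrom_seq, tss, strand, length):
    """Return up to `length` bp upstream of a TSS, read 5'->3' on the gene's strand.

    `tss` is the 0-based index of the first transcribed base (assumed convention).
    """
    if strand == "+":
        return chrom_seq[max(0, tss - length):tss]
    # On the minus strand the upstream region lies at higher coordinates.
    return reverse_complement(chrom_seq[tss + 1:tss + 1 + length])


def find_motif(promoter, pattern):
    """Return sorted (position, strand) hits on both strands, overlapping matches included.

    Positions are 0-based start coordinates on the given promoter sequence.
    Palindromic motifs are reported once per strand.
    """
    regex = re.compile(f"(?=({pattern}))", re.IGNORECASE)
    hits = [(m.start(), "+") for m in regex.finditer(promoter)]
    n = len(promoter)
    for m in regex.finditer(reverse_complement(promoter)):
        hits.append((n - m.start() - len(m.group(1)), "-"))
    return sorted(hits)


promoter = "AAACACGTGGCTTT"  # invented sequence
print(find_motif(promoter, "[TC]ACGTGGC"))  # ABRE1 pattern
print(find_motif(promoter, "ACGTG[GT]C"))   # ABRE2 pattern
```

Calling `upstream_region` with lengths of 500, 1000 and 3000 would give the three promoter sets described in the text.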
Figure 4

The web-based user interface of SoybeanTFDB and a typical example of the annotations provided for a putative soybean TF encoding gene. The interface of SoybeanTFDB provides search queries for the names of TF families, keywords, sequence identifiers, identifiers of domains supported by InterProScan, GO terms and available cis-motifs (A). The search results are listed for each of the TF families with a description of the corresponding genes based on similarity searches (B). Users are able to navigate to the detailed annotation pages to browse the related annotations. The detailed annotation pages provide summarized basic information on each of the gene models annotated in Glyma1 together with the gene structure. The figure for a gene structure links to a genome browser, where it can be viewed together with other sequences mapped onto the soybean genome (C). The sequences of cDNAs and proteins are provided, and all clickable buttons take users directly to the BLAST search interface (D). Similarity search results for each entry against NCBI nr and the gene models of Arabidopsis and rice are shown with detailed search results and hyperlinks to the original data (E). The resulting hierarchical clustering of homologous soybean TF genes can be browsed with the multiple alignment of each cluster (F). The assigned identifiers provide hyperlinks to the annotation web pages of Arabidopsis and rice TF databases. Other sequence identifiers for representative transcript sequence databases, including PlantGDB, UniGene and TIGR GI, as well as the probe IDs of target sequences on the soybean Affymetrix GeneChip, are also accessible. Furthermore, the interface provides corresponding hyperlinks to the FL-cDNA provided by RIKEN (http://rsoy.psc.riken.jp/) (G). GO terms are assigned to each entry based on InterProScan and sequence similarity searches against the annotated Arabidopsis genes of TAIR8 (H). The domain structure predicted by InterProScan is provided (I). The result of a cis-motif sequence pattern search of the promoter regions of each gene is shown together with the genomic gene structure (J).

Numerous cis-elements have been reported for their essential roles in determining the tissue-specific or stress-induced expression patterns of genes.[45,55] Recently, a systematic combinatorial in silico analysis of cis-motifs and expression patterns in Arabidopsis indicated a positive correlation between multi-stimuli response genes and cis-element density in upstream regions.[56] Inspecting the relationship between the presence of cis-regulatory elements and the expression patterns of the TF genes can therefore help predict the function of the respective TFs during development, in different organs and cell types, or in response to various endogenous or exogenous stimuli. Additionally, quantitative models that describe how combinations of cis-elements dictate changes in expression will play an important role in enriching our understanding of the transcriptional response of individual genes to environmental perturbations.[26]

Cis-element- and comparative sequence analysis-based prediction of abiotic stress responsive TFs in soybean

Plants respond to environmental changes with large-scale changes in transcription. The exquisite sensitivity and specificity of these responses are controlled in large part by cis-regulatory elements. The molecular mechanisms regulating gene expression in response to abiotic stresses have been studied by analysing the cis-acting elements and the trans-acting factors, i.e. the sequence-specific binding TFs.[4,45] Genes induced by stresses are classified into two groups: functional genes and regulatory genes. The regulatory group includes genes encoding various TFs which can regulate various stress-inducible genes cooperatively or separately and may constitute gene networks. Identification and functional analysis of these stress-inducible TFs should provide more information on the complex regulatory gene networks that are involved in stress responses. At present, the functions of most of the stress responsive TF encoding genes are not fully understood. Some of the stress-inducible TFs have been overexpressed in transgenic plants, resulting in stress-tolerant phenotypes.[4,45,49] Recent studies have substantiated that sequence similarity-based clustering of the members of several TF families correlates with their function. Phylogenetic analysis of the AP2_EREBP and NAC families of soybean and the rice NAC family with orthologs from other plant species whose stress responsive expression pattern and/or function are known resulted in a nearly perfect match between sequence conservation and function or expression patterns. These similarities demonstrate that sequence similarity-based clustering can serve as a reliable approach to rationalize systematic functional predictions for different TF families.[21,24,54] Moreover, increasing evidence indicates that cis-motifs are highly conserved among orthologous or paralogous genes and coregulated genes, and that defined cis-elements can effectively aid in the genome-wide screening of ABA and abiotic stress responsive genes.[57-59] Together, these observations prompted us to investigate comprehensively the relationship between TFs and abiotic stress by integrating cis-element annotation with comparative sequence analysis using stress responsive GO terms, with the aim of identifying soybean TFs which may function in the abiotic stress response. We therefore carried out comparative sequence analysis with stress-responsive Arabidopsis TFs to predict the soybean TFs with stress responsive GO terms (Fig. 3B). We also compiled information on the distribution of stress-responsive cis-elements in the promoter region of each soybean TF gene on our webpage, so that putative stress responsive TFs in each family can be queried using the ‘cis-motif (stress responsive)’ search function. With the help of our soybean TF database, we can use, for example, the ‘cis-motif (stress responsive)’ search function to identify TF genes which harbour major known stress responsive cis-motif(s) in their promoter regions (Fig. 4A). Next, we screen the identified TFs using the GO annotation provided for each TF on the detailed annotation page (Fig. 4H). Thus, we are able to identify putative stress responsive TFs based on both the existence of stress responsive cis-motif(s) and the associated stress responsive GO terms. The predicted stress responsive function of the identified TFs should then be confirmed by experiments. The major stress responsive cis-motifs enriched in the −1000 bp promoter regions of a number of TF families are summarized in Table 4.
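The two-step screen described above (stress responsive cis-motif in the promoter, then a stress responsive GO term) amounts to keeping the genes that pass both filters. The sketch below illustrates this with plain dictionaries; the gene identifiers, the annotations and the choice of GO term names are invented for illustration and do not come from SoybeanTFDB, which exposes these filters through its web interface.

```python
# Motif names as listed in Table 4.
STRESS_MOTIFS = {"ABRE1", "ABRE2", "CE1", "CRT", "DRE", "ICEr2", "MYBR", "MYCR", "NACR"}

# Example GO term names treated here as stress responsive (illustrative selection).
STRESS_GO_TERMS = {
    "response to water deprivation",
    "response to salt stress",
    "response to cold",
}


def screen_candidates(promoter_motifs, go_annotations,
                      stress_motifs=STRESS_MOTIFS, stress_go_terms=STRESS_GO_TERMS):
    """Keep genes that have at least one stress motif AND at least one stress GO term."""
    candidates = {}
    for gene, motifs in promoter_motifs.items():
        motif_hits = set(motifs) & set(stress_motifs)
        go_hits = set(go_annotations.get(gene, ())) & set(stress_go_terms)
        if motif_hits and go_hits:
            candidates[gene] = (sorted(motif_hits), sorted(go_hits))
    return candidates


# Hypothetical input data.
promoter_motifs = {
    "tf_gene_1": ["ABRE2", "MYCR"],
    "tf_gene_2": ["DRE"],
    "tf_gene_3": [],
}
go_annotations = {
    "tf_gene_1": ["response to salt stress", "regulation of transcription"],
    "tf_gene_2": ["regulation of transcription"],
    "tf_gene_3": ["response to cold"],
}
candidates = screen_candidates(promoter_motifs, go_annotations)
print(candidates)
```

Only `tf_gene_1` passes both filters here; as the text notes, such predictions remain candidates until confirmed experimentally.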
Table 4

Enriched cis-regulatory motifs found in promoter region (1000 bp upstream from transcription start site of each gene) of genes encoding each of the TF families

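| Cis-motif name (a) | Cis-motif pattern (a) | TF family | No. of gene loci hit | No. of gene loci | Mean observed (b) | Mean expected (c) | Z-score | P-value (<0.001) |
|---|---|---|---|---|---|---|---|---|
| ABRE1 | [TC]ACGTGGC | C2C2_Zn-CO-like | 4 | 84 | 47.385 | 14.074 | 8.81 | 0 |
|  |  | C2C2_Zn-GATA | 3 | 63 | 47.784 | 14.074 | 8.92 | 0 |
|  |  | C3H-TypeI | 6 | 151 | 39.851 | 14.074 | 6.82 | 4.577E−12 |
|  |  | JUMONJI | 2 | 51 | 39.435 | 14.074 | 6.71 | 9.7873E−12 |
|  |  | NAC | 5 | 187 | 26.593 | 14.074 | 3.31 | 0.00046339 |
|  |  | TCP | 5 | 61 | 82.433 | 14.074 | 18.08 | 0 |
|  |  | WRKY_Zn | 6 | 198 | 29.952 | 14.074 | 4.20 | 0.000013318 |
| ABRE2 | ACGTG[GT]C | (R1)R2R3_Myb | 26 | 319 | 81.996 | 55.79 | 3.59 | 0.00016627 |
|  |  | AP2_EREBP | 31 | 382 | 81.157 | 55.79 | 3.47 | 0.00025671 |
|  |  | Aux_IAA | 7 | 85 | 82.312 | 55.79 | 3.63 | 0.00014072 |
|  |  | C2C2_Zn-CO-like | 13 | 84 | 154.522 | 55.79 | 13.52 | 0 |
|  |  | C2C2_Zn-Dof | 10 | 81 | 123.231 | 55.79 | 9.24 | 0 |
|  |  | C2C2_Zn-GATA | 5 | 63 | 79.277 | 55.79 | 3.22 | 0.00064947 |
|  |  | C3H-TypeI | 12 | 151 | 79.117 | 55.79 | 3.19 | 0.00070084 |
|  |  | JUMONJI | 7 | 51 | 137.741 | 55.79 | 11.22 | 0 |
|  |  | MADS | 18 | 186 | 96.524 | 55.79 | 5.58 | 1.2169E−08 |
|  |  | NAC | 27 | 187 | 144.379 | 55.79 | 12.13 | 0 |
|  |  | TCP | 8 | 61 | 131.2 | 55.79 | 10.33 | 0 |
|  |  | Atypical_MYB | 10 | 78 | 128.071 | 55.79 | 9.90 | 0 |
| CE1 | TGCCACCGG | C2H2_Zn | 1 | 258 | 3.952 | 0.571 | 4.39 | 5.7069E−06 |
|  |  | JUMONJI | 1 | 51 | 19.698 | 0.571 | 24.83 | 0 |
|  |  | WRKY_Zn | 4 | 198 | 20.174 | 0.571 | 25.44 | 0 |
| CRT | GGCCGACAT | AP2_EREBP | 1 | 382 | 2.538 | 0.324 | 3.88 | 0.000051901 |
|  |  | C2H2_Zn | 1 | 258 | 3.86 | 0.324 | 6.20 | 2.8371E−10 |
| DRE | TACCGACAT | ABI3VP1 | 1 | 139 | 7.255 | 0.77 | 7.36 | 9.0372E−14 |
|  |  | AP2_EREBP | 3 | 382 | 7.931 | 0.77 | 8.13 | 2.2204E−16 |
|  |  | NAC | 1 | 187 | 5.351 | 0.77 | 5.20 | 9.9255E−08 |
| ICEr2 | ACTCCG | AP2_EREBP | 22 | 382 | 57.453 | 37.698 | 3.25 | 0.00057027 |
|  |  | C2C2_Zn-GATA | 4 | 63 | 63.223 | 37.698 | 4.20 | 0.000013136 |
|  |  | JUMONJI | 4 | 51 | 78.537 | 37.698 | 6.73 | 8.7457E−12 |
|  |  | PHD | 20 | 285 | 70.006 | 37.698 | 5.32 | 5.1702E−08 |
|  |  | TCP | 4 | 61 | 65.403 | 37.698 | 4.56 | 2.5263E−06 |
| MYBR | TGGTTAG | C2H2_Zn | 19 | 258 | 73.778 | 48.032 | 3.90 | 0.000047443 |
|  |  | MADS | 13 | 186 | 69.508 | 48.032 | 3.26 | 0.00056509 |
|  |  | TCP | 5 | 61 | 82.123 | 48.032 | 5.17 | 0.000000118 |
| MYCR | CACATG | (R1)R2R3_Myb | 98 | 319 | 307.51 | 230.595 | 5.93 | 1.5169E−09 |
|  |  | ABI3VP1 | 41 | 139 | 295.074 | 230.595 | 4.97 | 3.3303E−07 |
|  |  | AP2_EREBP | 112 | 382 | 293.307 | 230.595 | 4.83 | 6.6647E−07 |
|  |  | ARF | 19 | 58 | 327.992 | 230.595 | 7.51 | 2.9865E−14 |
|  |  | Aux_IAA | 23 | 85 | 270.693 | 230.595 | 3.09 | 0.00099623 |
|  |  | C2C2_Zn-CO-like | 25 | 84 | 298.315 | 230.595 | 5.22 | 8.9042E−08 |
|  |  | C2C2_Zn-Dof | 29 | 81 | 358.556 | 230.595 | 9.87 | 0 |
|  |  | C2H2_Zn | 70 | 258 | 271.297 | 230.595 | 3.14 | 0.00085076 |
|  |  | Myb_related | 38 | 135 | 281.165 | 230.595 | 3.90 | 0.000048357 |
|  |  | bZIP | 42 | 148 | 283.077 | 230.595 | 4.05 | 0.000026039 |
| NACR | ACACGCATGT | C2H2_Zn | 1 | 258 | 3.823 | 0.494 | 4.93 | 4.1633E−07 |
|  |  | WRKY_Zn | 1 | 198 | 4.944 | 0.494 | 6.59 | 2.2463E−11 |
|  |  | bZIP | 1 | 148 | 6.804 | 0.494 | 9.34 | 0 |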

(a) According to Yamaguchi-Shinozaki et al.[45]

(b) The mean observed values were calculated by counting motif pattern hits in 1000 random samplings in each of 1000 trials for the promoter pool of each TF family.

(c) The mean expected values were calculated by counting motif pattern hits in 1000 random samplings in each of 1000 trials for the promoter pool of all genes annotated in the soybean genome.
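The resampling described in the Table 4 footnotes can be sketched as below. The exact procedure is given in Materials and methods and is not reproduced here, so several points are assumptions: sampling is done with replacement, the Z-score is taken as the difference between the two means divided by the standard deviation of the genome-wide counts, and the P-value is the upper-tail probability of a standard normal distribution. The tabulated values appear consistent with these choices, but the code is an illustration, not the authors' implementation. The genome-wide pool in the example is invented.

```python
import math
import random
import statistics


def upper_tail_p(z):
    """Upper-tail probability of a standard normal deviate."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def resampled_counts(flags, rng, sample_size=1000, trials=1000):
    """Per trial, draw `sample_size` promoters with replacement and count motif carriers."""
    return [sum(rng.choices(flags, k=sample_size)) for _ in range(trials)]


def enrichment(family_flags, genome_flags, seed=0, sample_size=1000, trials=1000):
    """Return (mean observed, mean expected, Z-score, P-value) for one motif and family."""
    rng = random.Random(seed)
    observed = resampled_counts(family_flags, rng, sample_size, trials)
    expected = resampled_counts(genome_flags, rng, sample_size, trials)
    mean_observed = statistics.fmean(observed)
    mean_expected = statistics.fmean(expected)
    z = (mean_observed - mean_expected) / statistics.stdev(expected)
    return mean_observed, mean_expected, z, upper_tail_p(z)


# 4 of 84 family promoters carry the motif, as in the ABRE1 row for C2C2_Zn-CO-like.
# The genome-wide pool (14 carriers among 1000 promoters) is a made-up stand-in.
family = [1] * 4 + [0] * 80
genome = [1] * 14 + [0] * 986
mean_obs, mean_exp, z, p = enrichment(family, genome)
print(f"observed={mean_obs:.3f} expected={mean_exp:.3f} Z={z:.2f} P={p:.3g}")
```

Under this reading, the mean observed value is close to 1000 times the fraction of family loci carrying the motif (for example, 4/84 gives about 47.6, against 47.385 in the table).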


RIKEN soybean TF database

We constructed a TF database named SoybeanTFDB which is based on the identified soybean TF repertoire. Access to our database is available via the following link: http://soybeantfdb.psc.riken.jp, and all of the data described above are available for viewing and immediate downloading. The scientific community can browse predictions for a total of 5035 TF models and receive classifications for submitted nucleotide and protein sequences. Multiple alignments of amino acid sequences within TF families are also available for downloading and can be used for the construction of phylogenetic trees. We also provided clustered results showing amino acid similarity with different levels of amino acid identity (30, 60 and 90%), search functions for functional motif information of InterProScan, cis-motifs in promoter regions of TFs and GO annotations. Furthermore, cross-references and links to other databases such as Arabidopsis TAIR8, TIGR rice, UniProt, SoyBase, soybean FL-cDNA and other TF databases such as AtTFDB, DATF, RARTF, DRTF, Grassius, PlnTFDB are available. On the first page of SoybeanTFDB, we provide four types of search keywords to find an entry: ‘TF search’, ‘Similarity search’, ‘Genome browser’ and ‘Quick search’. Similarity search allows search using either nucleotide or amino acid sequence of any TF gene. Genome browser enables search using gene IDs and Quick search allows search function via any essential keywords such as ‘NAC’. Within the TF search keyword, we provide seven search functions (Fig. 4A). Figure 4 illustrates the web-based user interface of SoybeanTFDB with a detailed description. One can easily carry out functionality predictions for any TF of interest based on GO annotations and cis-motif search results. For instance, putative abiotic stress responsive TFs can be searched based on the existence of stress responsive cis-motifs and GO annotations. 
Thus, the database that we have developed consolidates comprehensive information for all of the members of the soybean TF repertoire. It offers a user-friendly interface that aims to meet the broad demands of researchers working on soybean TFs with the goal of gaining a greater understanding of their putative roles in plant development, differentiation and environmental responses. Taken together, SoybeanTFDB will serve as a basic in silico analysis platform for the elucidation of regulatory mechanisms underlying different developmental and physiological processes and stress responses. We expect that this database will accelerate progress in the ‘transcription factoromics’ of soybean and in comparative genomics of TF repertoires both within legume species and between legumes and other species, and that it will facilitate genetic engineering programs to improve the productivity of soybean grown in adverse conditions.

Supplementary data

Supplementary Data are available at www.dnaresearch.oxfordjournals.org.

Funding

Funding support from Grants-in-Aid (Start-up) for Scientific Research, Ministry of Education, Culture, Sports, Science, and Technology of Japan (No. 21870046) is gratefully appreciated.
  58 in total

Review 1.  A genomic perspective on plant transcription factors.

Authors:  J L Riechmann; O J Ratcliffe
Journal:  Curr Opin Plant Biol       Date:  2000-10       Impact factor: 7.834

2.  Protein-DNA interactions: amino acid conservation and the effects of mutations on binding specificity.

Authors:  Nicholas M Luscombe; Janet M Thornton
Journal:  J Mol Biol       Date:  2002-07-26       Impact factor: 5.469

3.  Real-time RT-PCR profiling of over 1400 Arabidopsis transcription factors: unprecedented sensitivity reveals novel root- and shoot-specific genes.

Authors:  Tomasz Czechowski; Rajendra P Bari; Mark Stitt; Wolf-Rüdiger Scheible; Michael K Udvardi
Journal:  Plant J       Date:  2004-04       Impact factor: 6.417

Review 4.  Legumes: importance and constraints to greater use.

Authors:  Peter H Graham; Carroll P Vance
Journal:  Plant Physiol       Date:  2003-03       Impact factor: 8.340

5.  The repertoire of DNA-binding transcriptional regulators in Escherichia coli K-12.

Authors:  E Pérez-Rueda; J Collado-Vides
Journal:  Nucleic Acids Res       Date:  2000-04-15       Impact factor: 16.971

6.  Arabidopsis transcription factors: genome-wide comparative analysis among eukaryotes.

Authors:  J L Riechmann; J Heard; G Martin; L Reuber; C Jiang; J Keddie; L Adam; O Pineda; O J Ratcliffe; R R Samaha; R Creelman; M Pilgrim; P Broun; J Z Zhang; D Ghandehari; B K Sherman; G Yu
Journal:  Science       Date:  2000-12-15       Impact factor: 47.728

7.  Functional conservation of a root hair cell-specific cis-element in angiosperms with different root hair distribution patterns.

Authors:  Dong Wook Kim; Sang Ho Lee; Sang-Bong Choi; Su-Kyung Won; Yoon-Kyung Heo; Misuk Cho; Youn-Il Park; Hyung-Taeg Cho
Journal:  Plant Cell       Date:  2006-11-10       Impact factor: 11.277

8.  Cloning and expression of an ABSCISIC ACID-INSENSITIVE 3 (ABI3) gene homologue of yellow-cedar (Chamaecyparis nootkatensis).

Authors:  Galina Lazarova; Ying Zeng; Allison R Kermode
Journal:  J Exp Bot       Date:  2002-05       Impact factor: 6.992

Review 9.  Phytoestrogens: a review of the present state of research.

Authors:  Andreana L Ososki; Edward J Kennelly
Journal:  Phytother Res       Date:  2003-09       Impact factor: 5.878

10.  AGRIS: Arabidopsis gene regulatory information server, an information resource of Arabidopsis cis-regulatory elements and transcription factors.

Authors:  Ramana V Davuluri; Hao Sun; Saranyan K Palaniswamy; Nicole Matthews; Carlos Molina; Mike Kurtz; Erich Grotewold
Journal:  BMC Bioinformatics       Date:  2003-06-23       Impact factor: 3.169
