Derek Wilson, Varodom Charoensawan, Sarah K Kummerfeld, Sarah A Teichmann.
Abstract
The DNA-binding domain (DBD) database contains predicted sequence-specific DNA-binding transcription factors (TFs) for all publicly available proteomes. The number of proteomes has increased from 150 in the initial version of DBD to over 700 in the current version. All predicted TFs must contain a significant match to a hidden Markov model representing a sequence-specific DNA-binding domain family. Access to TF predictions is provided through http://transcriptionfactor.org, where new search options include searching by gene name in model organisms and searching for all proteins in a particular DBD family and specific organism. We illustrate the application of this type of search facility by contrasting trends of DBD family occurrence throughout the tree of life, highlighting the clear partition between eukaryotic and prokaryotic DBD expansions. The website content has been expanded to include dedicated pages for each TF containing domain assignment details, gene names, links to external databases and links to TFs with similar domain arrangements. We compare the increase in the number of predicted TFs with proteome size in eukaryotes and prokaryotes. Eukaryotes follow a slower rate of increase in TFs than prokaryotes, which could be due to the presence of splice variants or an increase in combinatorial control.
Year: 2007 PMID: 18073188 PMCID: PMC2238844 DOI: 10.1093/nar/gkm964
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. Examples of new search capabilities and content. (a) Search for TFs from a particular organism containing a specified DBD. The example used here is TFs from Homo sapiens containing the homoeobox domain. (b) The search in (a) results in TF predictions from Homo sapiens containing the homoeobox DNA-binding domain. (c) Selection of HOXA9 from (b) results in a web page with detailed information on this particular TF. (d) Clicking on the Pfam domain combination link in (c) retrieves the subset of TF predictions that have the same two-domain arrangement as the HOXA9 transcription factor.
Figure 2. (a) Expansion and contraction patterns of DBD occurrence across the tree of life. Each column corresponds to a Pfam DBD. Each row of the heatmap represents a genome, ordered using the NCBI taxonomy. The vertical coloured bars indicate the superkingdoms, kingdoms or phyla to which genomes belong. Eukaryotes are indicated using a red bar, archaea using a green bar and bacteria using a blue bar. Other kingdoms are represented using white bars. DNA-binding domain families are clustered using the average linkage method with Pearson correlation distance. Red squares represent an expansion of a DBD family in a genome relative to other genomes; green squares represent a contraction. (b) A zoom on DBD expansions in the Viridiplantae lineage. (c) Illustration of the three-dimensional structure of one of the DBDs specifically expanded in the Viridiplantae kingdom, the AP2 domain in complex with DNA. The AP2 family transcription factors are known to be involved in plant pathogen defence response processes.
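The clustering named in the Figure 2 caption (average linkage with Pearson correlation distance) can be sketched as follows. This is a minimal pure-Python illustration under stated assumptions, not the authors' code: the family names and occurrence counts are invented toy values, the distance is taken to be 1 minus the Pearson correlation coefficient, and the function names are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson_distance(x, y):
    """Return 1 - Pearson correlation of two equal-length occurrence profiles.

    Assumes neither profile is constant (a zero variance would divide by zero).
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return 1.0 - cov / sqrt(var_x * var_y)


def average_linkage(profiles):
    """Agglomerative clustering with average linkage.

    Returns the merges in order as (cluster_a, cluster_b, distance),
    where each cluster is a tuple of family names.
    """
    names = list(profiles)
    # Pairwise distances between individual families, computed once.
    dist = {
        frozenset(pair): pearson_distance(profiles[pair[0]], profiles[pair[1]])
        for pair in combinations(names, 2)
    }

    def linkage(pair):
        # Average linkage: mean of all member-to-member distances.
        a, b = pair
        total = sum(dist[frozenset((i, j))] for i in a for j in b)
        return total / (len(a) * len(b))

    clusters = [(name,) for name in names]
    merges = []
    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2), key=linkage)
        merges.append((a, b, linkage((a, b))))
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return merges


# Toy data (invented): occurrence counts of four families in four genomes.
profiles = {
    "family_A": [1, 2, 8, 9],
    "family_B": [2, 3, 9, 11],
    "family_C": [9, 8, 1, 2],
    "family_D": [10, 9, 2, 1],
}

merges = average_linkage(profiles)
for cluster_a, cluster_b, distance in merges:
    print(cluster_a, cluster_b, round(distance, 4))
```

In this toy example, families with similar occurrence profiles across genomes are merged first, and the two anticorrelated groups are joined last at a distance above 1. For real heatmaps, an established library routine would be the more practical choice than this hand-rolled loop.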