Varodom Charoensawan, Derek Wilson, Sarah A Teichmann.
Abstract
DNA-binding domains (DBDs) are essential components of sequence-specific transcription factors (TFs). We have investigated the distribution of all known DBDs in more than 500 completely sequenced genomes from the three major superkingdoms (Bacteria, Archaea and Eukaryota) and documented conserved and specific DBD occurrence in diverse taxonomic lineages. By combining DBD occurrence in different species with taxonomic information, we have developed an automatic method for inferring the origins of DBD families and their specific combinations with other protein families in TFs. We found only three out of 131 (2%) DBD families shared by the three superkingdoms. Copyright 2010 Elsevier Ltd. All rights reserved.
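The abstract describes combining DBD occurrence with taxonomic information to infer where a family originated. One way to sketch that idea, assuming the "taxonomic limit" is taken as the deepest taxon shared by every species in which a family occurs (the record itself does not spell out the procedure), is a longest-common-prefix over root-to-species lineages. The function name, the lineage strings and the placeholder species below are hypothetical illustrations, not data from the paper.

```python
from typing import Dict, List, Sequence


def taxonomic_limit(lineages: Sequence[Sequence[str]]) -> List[str]:
    """Return the longest shared prefix of root-to-species lineages.

    Assumption: the last element of the result is read as the taxon in
    which the family is inferred to have originated.
    """
    if not lineages:
        return []
    shared: List[str] = []
    # zip stops at the shortest lineage, so the prefix never overruns one.
    for ranks in zip(*lineages):
        if all(rank == ranks[0] for rank in ranks):
            shared.append(ranks[0])
        else:
            break
    return shared


# Hypothetical toy occurrences; species names are placeholders.
occurrences: Dict[str, List[List[str]]] = {
    "Tub": [
        ["cellular organisms", "Eukaryota", "Metazoa", "species_1"],
        ["cellular organisms", "Eukaryota", "Viridiplantae", "species_2"],
    ],
    "HTH_3": [
        ["cellular organisms", "Bacteria", "Proteobacteria", "species_3"],
        ["cellular organisms", "Archaea", "Euryarchaeota", "species_4"],
        ["cellular organisms", "Eukaryota", "Fungi", "species_5"],
    ],
}

limits = {family: taxonomic_limit(lins) for family, lins in occurrences.items()}
for family, limit in limits.items():
    print(family, "->", limit[-1] if limit else "unknown")
```

In this toy input, the family seen in both animals and plants is assigned to Eukaryota, while the family seen in all three superkingdoms falls back to the root.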
Year: 2010 PMID: 20675012 PMCID: PMC2937223 DOI: 10.1016/j.tig.2010.06.004
Source DB: PubMed Journal: Trends Genet ISSN: 0168-9525 Impact factor: 11.639
Figure 1. Lineage-specific expansion patterns of DBD families. (a) The heatmap demonstrates the specific expansion patterns of DBD families between eukaryotic and prokaryotic genomes. Columns correspond to DBD families hierarchically clustered by their occurrence patterns in different species. Rows represent species ordered using the NCBI taxonomy. Orange indicates relative DBD expansion and blue represents contraction. The vertical coloured bars to the left of the heatmap indicate superkingdoms, kingdoms, or phyla to which species belong. Eukaryota (red) is divided into three kingdoms: Metazoa (pink), Fungi (orange) and Viridiplantae (yellow). Euryarchaea and Crenarchaea are labelled in pale and dark green, respectively. Bacteria are labelled using shades of blue: Actinobacteria (purple), Firmicutes (navy) and Proteobacteria (pale blue). The white areas in the right-hand coloured bar are species that do not belong to the main kingdoms/phyla mentioned above, e.g. protists and choanoflagellates. Specific patterns of occurrence were observed within the eukaryotic species. At the right-hand side is shown the detailed expansion patterns of selected eukaryotic lineages: protists including Mycetozoa (Dictyostelid) and Stramenopiles, animals including V for vertebrates and I for invertebrates, MB for M. brevicollis (choanoflagellate), fungi, and plants including S for streptophyta (land plants) and C for chlorophyta (green algae). (b) A Venn diagram representing the number of Pfam DBD families that have taxonomic limits belonging to the three main superkingdoms. Only 19 out of 131 (15%) DBDs were found in more than one superkingdom, and most of these DBDs are shared by Bacteria and Archaea but not by Eukaryota. Only three DBD families (CSD, HTH_psq, and HTH_3) are shared by all of the superkingdoms.
Figure 2. Network representation of DBD families and partner domains. (a) Examples of network representation of bacterial TF architectures. DBDs are shown as oblongs in protein chains and as circular nodes in our network representation. Partner domains are shown as rectangles in protein chains and as squares in the network representation. DBDs, e.g. HTH_1 and Fe_dep_repress, and their adjacent partner domains, e.g. LysR_substrate and Fe_dep_repr_C, are linked by unbroken arrows, pointing in the N- to C-terminal orientation. Broken arrows connect DBDs and partner domains that occur in the same TF chain but are not adjacent to DBDs, e.g. FeoA. Numbers on the top of each domain indicate its order from N- to C-terminus. Node sizes and arrow thickness are proportional to the abundance of domains and their combination, respectively. Coloured nodes and arrows indicate phylum-specific domain occurrence and domain combination, obtained from the taxonomic limit method described in Box 1 (e.g. the blue arrow linking HTH_1 to LysR_substrate indicates the combination is common to all Bacteria). Colour codes are as described for Figure 1. A white node means that the DBD is shared with other superkingdoms, e.g. HTH_1 and Fe_dep_repress are shared by Archaea. DBDs that occur alone as single-domain TFs in more than 25% of all their architectural patterns have orange borders, e.g. FlhC (see the supplementary material online for a complete bacterial TF architectural network and statistics used to generate the network). (b) Lineage-specific TF architectures in eukaryotes. A eukaryote-specific Tub DBD (represented by red oblongs in protein chains and by a circular node in our network) has distinct domain combinations in animals and plants.
Although the Tub DBD occurs in single-domain TFs without a partner throughout eukaryotic species (green border), in animals it also occurs C-terminal to the animal-specific SOCS_box (shown as a pink square node; a pink arrow indicates an animal-specific architecture). In contrast, the Tub DBD co-occurs with the F-box domain in plants, a eukaryote-specific partner domain (shown as a red square). This combination is observed exclusively in land plants (linked by a yellow arrow). (c) A network representing eukaryotic TF architectures. All architectures that occur in more than 5% of TFs in each DBD family are shown. DBDs that occur alone as single-domain TFs in more than 25% of all their architectural patterns have green borders. We observed the repetition of the same DBD within a TF (self-looping arrow) in 29% of eukaryotic DBDs, whereas DBD repeats in prokaryotes were observed in only one bacterial DBD, HTH_AraC.
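The edge rules in the Figure 2 caption can be illustrated with a minimal sketch, assuming each TF is given as an N- to C-terminal list of domain names. It emits a directed edge for every domain pair that involves a DBD, marks it "adjacent" (unbroken arrow) or "non_adjacent" (broken arrow), and counts repeated edges as a stand-in for arrow thickness. The function name, the edge labels and the toy architecture list are hypothetical; the domain names are taken from the caption, but the abundances are invented for illustration.

```python
from collections import Counter
from typing import List, Set, Tuple

# (N-terminal domain, C-terminal domain, kind of arrow)
Edge = Tuple[str, str, str]


def architecture_edges(domains: List[str], dbds: Set[str]) -> List[Edge]:
    """List directed edges for one TF, ordered N- to C-terminus.

    Only pairs that include at least one DBD are linked. Neighbouring
    domains give "adjacent" edges; all other pairs give "non_adjacent".
    """
    edges: List[Edge] = []
    for i in range(len(domains)):
        for j in range(i + 1, len(domains)):
            if domains[i] in dbds or domains[j] in dbds:
                kind = "adjacent" if j - i == 1 else "non_adjacent"
                edges.append((domains[i], domains[j], kind))
    return edges


# Hypothetical toy input; counts do not reflect real abundances.
dbds = {"HTH_1", "Fe_dep_repress", "HTH_AraC"}
architectures = [
    ["HTH_1", "LysR_substrate"],
    ["HTH_1", "LysR_substrate"],
    ["Fe_dep_repress", "Fe_dep_repr_C", "FeoA"],
    ["HTH_AraC", "HTH_AraC"],  # DBD repeat, drawn as a self-loop
]

# Edge multiplicity plays the role of arrow thickness in the figure.
weights = Counter(
    edge for arch in architectures for edge in architecture_edges(arch, dbds)
)
for (src, dst, kind), count in weights.items():
    print(f"{src} -> {dst} [{kind}] x{count}")
```

Iterating over ordered pairs (i < j) keeps every arrow pointing from the N-terminal to the C-terminal domain and records a DBD repeat once, as a single self-loop.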