Shahram Bahrami, Rezvan Ehsani, Finn Drabløs.
Abstract
BACKGROUND: Transcription factors are essential proteins for regulating gene expression. This regulation depends upon specific features of the transcription factors, including how they interact with DNA, how they interact with each other, and how they are post-translationally modified. Reliable information about key properties associated with transcription factors will therefore be useful for data analysis, in particular of data from high-throughput experiments.
Year: 2015 PMID: 25890365 PMCID: PMC4373352 DOI: 10.1186/s13104-015-1039-6
Source DB: PubMed Journal: BMC Res Notes ISSN: 1756-0500
Figure 1. Prediction quality at the domain level. Domains are classified as TP, FN and FP as shown, relative to the curated Pfam domains. TNs are not included in this comparison, as negative domains are not well defined.
Figure 2. Prediction quality at the nucleotide level. Regions are classified as TP, FN and FP as shown, relative to overlap with the curated Pfam domains. TNs are not included in this comparison, as they represent a very large fraction of the comparison, which may bias the analysis.
Prediction results for DNA-binding domains on positive data
| | | | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|
| Protein | proteins | 907 | 776 | 718 | - | - | 189 | 79.16 | - |
| Domain | domains | 70 | 46 | 40 | - | - | 30 | 57.14 | - |
| Domain | occurrences | 1159 | 872 | 863 | 519 | - | 296 | 74.46 | 62.45 |
| Nucleotide total | nucleotides | 69320 | 43326 | 42783 | 16899 | - | 26537 | 61.72 | 71.68 |
| Nucleotide average | nucleotides | 59 | 49 | 49 | 32 | - | 89 | 35.51 | 60.49 |
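The column headers of this table did not survive, but the figure captions define TP, FN and FP, and the last two columns are arithmetically consistent with sensitivity, TP/(TP+FN), and positive predictive value, TP/(TP+FP), if the fifth, sixth and eighth columns are read as TP, FP and FN. That column assignment is inferred from the numbers and is not stated in the source. A minimal Python sketch of the check, with hypothetical helper names:

```python
# Assumed column reading (inferred, not stated in the source):
# column 5 = TP, column 6 = FP, column 8 = FN. None marks a "-" cell.
rows = {
    "Protein / proteins": (718, None, 189),
    "Domain / domains": (40, None, 30),
    "Domain / occurrences": (863, 519, 296),
    "Nucleotide total": (42783, 16899, 26537),
    "Nucleotide average": (49, 32, 89),
}


def sensitivity(tp, fn):
    """Percentage of reference items that were recovered."""
    return 100.0 * tp / (tp + fn)


def ppv(tp, fp):
    """Percentage of predictions that match the reference."""
    return 100.0 * tp / (tp + fp)


def summarize(tp, fp, fn):
    """Return (sensitivity, PPV) rounded to two decimals; PPV is None without FP."""
    sens = round(sensitivity(tp, fn), 2)
    prec = round(ppv(tp, fp), 2) if fp is not None else None
    return sens, prec


for label, (tp, fp, fn) in rows.items():
    print(label, summarize(tp, fp, fn))
```

Under this reading, every row reproduces the two percentage columns of the table to two decimals.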
Figure 3. Example of a challenging DBD prediction. The predicted region overlaps with two independent Pfam domains.
New DNA-binding and non-DNA-binding domain types
| | | | |
|---|---|---|---|
| Homeobox_KN | zf-C2H2_6 | Maf1 | PBC |
| MCM2_N | zf-C2H2_4 | zf-H2C2_5 | zf-C2H2_2 |
| CBFD_NFYB_HMF | TFIID-18 kDa | Exo_endo_phos | TFIIA |
| SKIP_SNW | TFIIB | DUF3432 | SCAN |
| Ku | DNA_methylase | Toprim | Prox1 |
| Pax2_C | TFIID_20kDa | SSXRD | |
| TAFII28 | ResIII | HJURP_C | |
| DUF2028 | FAD_binding_7 | Ku_N | |
| Histone | RNA_pol_Rpb1_1 | DNA_photolyase | |
| zf-H2C2_2 | SOXp | SNF2_N | |
| zf-met | DNA_topoisoIV | TIG | |
*After filtering predicted DBDs for false positives.
Figure 4. A Venn diagram for distribution of PTMs across TFs. The diagram shows that PTMs tend to co-occur, possibly due to experimental bias.
Overview of TF annotation data
| | | | | |
|---|---|---|---|---|
| Uniprot ID | protein ID | 1978 | 1978 | 1 |
| Pfam non-DBD | domain IDs | 1978 | 753 | 2.16 |
| Pfam DBD | domain IDs | 1978 | 1225 | 1.33 |
| PPI | protein IDs | 1222 | 482 | 1.58 |
| PTM - acetylation | positions | 1978 | 884 | 3.55 |
| PTM - methylation | positions | 1978 | 376 | 3.22 |
| PTM - O-GlcNAc | positions | 1978 | 41 | 2.90 |
| PTM - phosphorylation | positions | 1978 | 1797 | 13.12 |
| PTM - sumoylation | positions | 1978 | 190 | 1.77 |
| PTM - ubiquitination | positions | 1978 | 896 | 4.38 |
*Number of TFs that actually have the property. **Average number of occurrences in the positive TFs.
Selected enriched terms according to GOrilla
| | | | | |
|---|---|---|---|---|
| DNA_Binding | DNA binding | 2.11E-185 | 1.72E-182 | 1.28 (1939,1475,1206,1174) |
| | core promoter sequence-specific DNA binding | 7.87E-5 | 1.79E-3 | 1.37 (1939,60,1206,51) |
| | protein dimerization activity | 4.00E-8 | 1.13E-6 | 1.24 (1939,254,1206,196) |
| Non_DNA_Binding | catalytic activity | 1.07E-49 | 8.75E-47 | 2.01 (1939,305,735,232) |
| | RNA binding | 3.95E-34 | 1.62E-31 | 2.00 (1939,222,735,168) |
| | transcription cofactor activity | 9.56E-12 | 4.61E-10 | 1.42 (1939,359,735,193) |
| | histone binding | 1.03E-10 | 3.39E-9 | 2.07 (1939,60,735,47) |
| | ubiquitin-protein transferase activity | 2.29E-10 | 7.21E-9 | 2.40 (1939,33,735,30) |
| | methylated histone binding | 3.80E-10 | 1.11E-8 | 2.54 (1939,26,735,25) |
| Acetylation | transcription factor binding | 2.12E-6 | 2.17E-4 | 1.28 (1939,292,879,169) |
| | structure-specific DNA binding | 2.27E-5 | 7.76E-4 | 1.38 (1939,136,879,85) |
| Non_Acetylation | sequence-specific DNA binding | 1.36E-6 | 1.11E-3 | 1.11 (1939,887,1061,537) |
| Methylation | protein binding | 2.67E-8 | 3.12E-6 | 1.21 (1939,1135,372,264) |
| | chromatin binding | 3.93E-7 | 2.48E-5 | 1.62 (1939,264,372,82) |
| O-GlcNAc | protein binding | 6.83E-6 | 2.80E-3 | 1.54 (1939,1133,41,37) |
| | histone deacetylase binding | 2.71E-4 | 7.41E-2 | 6.31 (1939,45,41,6) |
| Phosphorylation | protein binding | 4.93E-5 | 2.02E-2 | 1.02 (1939,1133,1782,1065) |
| PTM | protein binding | 3.12E-6 | 2.55E-3 | 1.02 (1939,1135,1827,1093) |
| Sumoylation | sequence-specific DNA binding | 3.00E-12 | 4.1E-10 | 1.73 (1939,617,189,104) |
| | core promoter binding | 1.86E-7 | 8.03E-6 | 2.90 (1939,92,189,26) |
| | chromatin binding | 1.92E-7 | 7.86E-6 | 1.98 (1939,264,189,51) |
| Ubiquitination | protein binding | 3.71E-30 | 3.04E-27 | 1.24 (1939,1133,888,641) |
| | transcription cofactor activity | 3.27E-8 | 8.12E-7 | 1.28 (1939,359,888,211) |
| Non_Ubiquitination | DNA binding | 6.99E-14 | 5.73E-11 | 1.09 (1939,1473,1052,869) |
| PPI | transcription factor binding | 1.38E-4 | 4.83E-2 | 1.31 (1203,185,475,96) |
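Each row of this table ends with an enrichment value followed by four counts. In GOrilla output these counts are conventionally reported as (N, B, n, b): N genes in total, B of them annotated with the term, n genes in the target set, and b genes in the intersection, with enrichment defined as (b/n)/(B/N). Assuming that convention applies here, the Python sketch below recomputes a few of the tabulated values from their counts; the function name is illustrative only.

```python
def enrichment(N, B, n, b):
    """Fold enrichment of a term in the target set relative to the background.

    N: total genes, B: genes annotated with the term,
    n: genes in the target set, b: annotated genes in the target set.
    """
    return (b / n) / (B / N)


# (subgroup, GO term) -> counts copied from the table
examples = {
    ("DNA_Binding", "DNA binding"): (1939, 1475, 1206, 1174),
    ("Non_DNA_Binding", "methylated histone binding"): (1939, 26, 735, 25),
    ("O-GlcNAc", "histone deacetylase binding"): (1939, 45, 41, 6),
}

for key, counts in examples.items():
    print(key, round(enrichment(*counts), 2))
```

A value near 1, as in the Phosphorylation row, means the term is about as common in the subgroup as in the background, even when the p-value is small.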
Associations between property-based subgroups
| | | | | |
|---|---|---|---|---|
| Phosphorylation | Acetylation | 1.84E-10 | 5.15E-09 | 0.190 |
| Phosphorylation | Ubiquitination | 1.94E-10 | 2.72E-09 | 0.190 |
| DNA_Binding | Methylation | 2.08E-10 | 1.94E-09 | −0.156 |
| Phosphorylation | Methylation | 2.42E-10 | 1.70E-09 | 0.127 |
| Methylation | Acetylation | 2.78E-10 | 1.56E-09 | 0.202 |
| Ubiquitination | Methylation | 2.85E-10 | 1.33E-09 | 0.204 |
| DNA_Binding | Ubiquitination | 3.16E-10 | 1.26E-09 | −0.280 |
| Ubiquitination | Acetylation | 3.39E-10 | 1.19E-09 | 0.289 |
| Acetylation | Sumoylation | 5.99E-09 | 1.86E-08 | 0.131 |
| Ubiquitination | Sumoylation | 4.03E-08 | 1.13E-07 | 0.124 |
| DNA_Binding | Acetylation | 6.30E-08 | 1.60E-07 | −0.122 |
| Methylation | O-GlcNAc | 1.24E-05 | 2.90E-05 | 0.110 |
| Phosphorylation | Sumoylation | 1.51E-05 | 3.24E-05 | 0.086 |
| Acetylation | O-GlcNAc | 3.45E-04 | 6.91E-04 | 0.083 |
| Ubiquitination | O-GlcNAc | 3.82E-03 | 7.13E-03 | 0.067 |
| PPI | Sumoylation | 1.23E-02 | 2.16E-02 | 0.072 |
| Methylation | Sumoylation | 1.49E-02 | 2.46E-02 | 0.056 |
| Phosphorylation | O-GlcNAc | 2.85E-02 | 4.43E-02 | 0.046 |
| DNA_Binding | PPI | 3.43E-01 | 4.37E-01 | −0.027 |
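The headers of this table are missing; the third and fourth columns look like a p-value and a value adjusted for multiple testing. For the first 18 rows, the fourth column is consistent with the Benjamini-Hochberg scaling p·m/rank with m = 28 (the number of unordered pairs among eight properties), apparently without the cumulative-minimum step of the full procedure. Both the method and m are inferred from the numbers and are not stated in the source. A minimal Python check on a selection of rows, using a tolerance to absorb rounding of the printed p-values:

```python
import math

M_TESTS = 28  # assumption: C(8, 2) pairwise tests among eight properties

# (row position in the table, p-value, adjusted value as printed)
rows = [
    (1, 1.84e-10, 5.15e-09),
    (2, 1.94e-10, 2.72e-09),
    (9, 5.99e-09, 1.86e-08),
    (15, 3.82e-03, 7.13e-03),
    (18, 2.85e-02, 4.43e-02),
]


def bh_scaled(p, rank, m=M_TESTS):
    """Benjamini-Hochberg scaling p * m / rank, capped at 1 (no monotonicity step)."""
    return min(1.0, p * m / rank)


def matches(p, rank, printed, rel_tol=0.01):
    """True if the scaled p-value agrees with the printed value within rel_tol."""
    return math.isclose(bh_scaled(p, rank), printed, rel_tol=rel_tol)


for rank, p, printed in rows:
    print(rank, f"{bh_scaled(p, rank):.2e}", f"{printed:.2e}", matches(p, rank, printed))
```

The final row (DNA_Binding vs PPI) does not match when treated as rank 19, so this reading should be taken as a consistency check on the listed rows rather than a description of the full analysis.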
Occurrence of DBDs in 762 PPI pairs
| | | | | |
|---|---|---|---|---|
| both TFs | 255 | 229 | 0.046 | 4.58E-02 |
| only one TF | 371 | 343 | 0.042 | 4.58E-02 |
| neither TF | 135 | 190 | 7.50E-07 | 2.25E-06 |
Figure 5. A matrix representation of enriched domain pairs in PPI data. Homodimers are indicated in orange. RNA polymerases and Pfam-B domains are not included; please see Figure S2 [in Additional file 2] for the full data set.
Results for TF expression changes during cell differentiation
| | | | | | | |
|---|---|---|---|---|---|---|
| 1 | Ubiquitination | 4 | 1 | 4.20E-02 | 3.36E-01 | 0.049 |
| | Sumoylation | 2 | 0 | 4.84E-02 | 1.93E-01 | 0.062 |
| 6 | O-GlcNAc | 3 | 0 | 9.69E-03 | 7.75E-02 | 0.086 |
| | Ubiquitination | 16 | 9 | 1.58E-02 | 6.31E-02 | 0.058 |
| | Methylation | 9 | 4 | 2.35E-02 | 6.27E-02 | 0.059 |
| 8 | Ubiquitination | 17 | 9 | 1.36E-03 | 1.09E-02 | 0.074 |
| | PPI | 10 | 5 | 2.39E-02 | 9.58E-02 | 0.070 |
| 1,2,3 | PPI | 9 | 4 | 1.57E-02 | 1.26E-01 | 0.072 |
| 4,5,6 | Ubiquitination | 43 | 29 | 1.48E-03 | 1.18E-02 | 0.074 |
| | Methylation | 21 | 12 | 1.03E-02 | 4.11E-02 | 0.061 |
| | Sumoylation | 12 | 6 | 2.98E-02 | 7.93E-02 | 0.054 |
| | O-GlcNAc | 4 | 1 | 4.53E-02 | 9.07E-02 | 0.052 |
| 7,8,9,10 | DNA_Binding | 28 | 38 | 7.49E-03 | 5.99E-02 | −0.062 |
| | Ubiquitination | 38 | 28 | 1.32E-02 | 5.29E-02 | 0.058 |
*Indicates TFs with similar expression profiles: 1, 2, 3 - Up-regulated; 4, 5, 6 - Down-regulated; 7, 8, 9, 10 - No clear change.
Selected results for TFs that are frequently mutated in cancer
| | | | | | | |
|---|---|---|---|---|---|---|
| II + IIIAB | Acetylation | 48 | 26 | 1.823E-08 | 1.46E-07 | 0.125 |
| | Ubiquitination | 47 | 27 | 2.08E-07 | 8.31E-07 | 0.118 |
| | Methylation | 26 | 11 | 1.21E-05 | 3.23E-05 | 0.110 |
| | PF00856(SET) | 6 | 0 | 1.21E-05 | 8.80E-03 | 0.164 |
| | PF13771(zf-HC5HC2H) | 4 | 0 | 4.90E-05 | 1.78E-02 | 0.175 |
| | PF00628(PHD) | 8 | 1 | 1.78E-04 | 2.58E-02 | 0.114 |
| | Sumoylation | 14 | 5 | 1.14E-03 | 2.28E-0 | 0.082 |
| | O-GlcNAc | 5 | 1 | 7.03E-03 | 1.12E-02 | 0.078 |
| II | Acetylation | 21 | 12 | 1.67E-03 | 1.33E-02 | 0.073 |
| | Ubiquitination | 20 | 12 | 6.61E-03 | 2.64E-02 | 0.063 |
| | Methylation | 11 | 5 | 1.23E-02 | 3.28E-02 | 0.062 |
| | O-GlcNAc | 3 | 0 | 1.89E-02 | 3.78E-02 | 0.073 |
| IIIB | Sumoylation | 5 | 1 | 3.51E-03 | 2.80E-02 | 0.085 |
| I + IIIAB | Ubiquitination | 32 | 16 | 2.69E-07 | 2.15E-06 | 0.114 |
| | Acetylation | 30 | 16 | 6.44E-06 | 2.12E-05 | 0.101 |
| | Methylation | 19 | 7 | 7.94E-06 | 2.12E-05 | 0.114 |
| | PF00856(SET) | 5 | 0 | 1.67E-05 | 5.78E-03 | 0.178 |
| | PF00628(PHD) | 7 | 1 | 4.70E-05 | 8.56E-03 | 0.136 |
| | PF13771(zf-HC5HC2H) | 3 | 0 | 3.17E-04 | 2.25E-02 | 0.168 |
| | Sumoylation | 10 | 3 | 1.82E-03 | 3.64E-03 | 0.082 |
| | DNA_Binding | 15 | 22 | 9.56E-03 | 1.53E-02 | −0.061 |
| IIIAB | Acetylation | 27 | 14 | 5.47E-06 | 2.56E-05 | 0.102 |
| | Ubiquitination | 27 | 14 | 6.39E-06 | 2.56E-05 | 0.101 |
| | PF00628(PHD) | 6 | 0 | 1.82E-04 | 2.64E-02 | 0.125 |
| | PF00439(Bromodomain) | 4 | 0 | 4.78E-04 | 4.35E-02 | 0.132 |
| | Methylation | 15 | 6 | 2.79E-04 | 7.44E-04 | 0.091 |
| | Sumoylation | 9 | 3 | 2.28E-03 | 4.56E-03 | 0.081 |
| | DNA_Binding | 12 | 19 | 5.45E-03 | 8.71E-03 | −0.065 |
| I | Methylation | 4 | 0 | 5.47E-03 | 4.38E-02 | 0.078 |
*Indicates TFs with similar mutation profiles: I - Mainly mutated across many cancers; II - Highly mutated in a few cancers; IIIA - Highly mutated across many cancers; IIIB - Even more highly mutated across many cancers.
Selected results for TFs with differences in tissue specificity
| | | | | | | |
|---|---|---|---|---|---|---|
| General | DNA_Binding | 126 | 85 | 1.36E-10 | 1.09E-09 | 0.166 |
| Specific | DNA_Binding | 306 | 205 | 1.92E-10 | 1.53E-09 | 0.280 |
| | Sumoylation | 57 | 31 | 1.85E-06 | 7.39E-06 | 0.115 |
| | PF00104(Hormone_recep) | 28 | 7 | 2.32E-11 | 1.69E-08 | 0.179 |
| | PF01352(KRAB) | 16 | 40 | 9.20E-07 | 1.67E-04 | −0.103 |
| Unknown | DNA_Binding | 702 | 486 | 2.82E-10 | 2.26E-09 | 0.459 |
| | Ubiquitination | 229 | 355 | 3.19E-10 | 1.28E-09 | −0.263 |
| | Methylation | 105 | 149 | 1.68E-07 | 4.47E-07 | −0.116 |
| | PPI | 146 | 172 | 1.48E-03 | 2.37E-03 | −0.091 |
| | PF01352(KRAB) | 200 | 96 | 2.11E-10 | 7.66E-08 | 0.324 |
*Indicates TFs found in many tissues (general), a few tissues (specific), or unknown (due to very low or no expression).