R Gabdoulline, D Eckweiler, A Kel, P Stegmaier.
Abstract
We present the webserver 3D transcription factor (3DTF) to compute position-specific weight matrices (PWMs) of transcription factors using a knowledge-based statistical potential derived from crystallographic data on protein-DNA complexes. Analysis of available structures that can be used to construct PWMs shows that there are hundreds of 3D structures from which PWMs could be derived, as well as thousands of proteins homologous to these. Therefore, we created 3DTF, which delivers binding matrices given the experimental or modeled protein-DNA complex. The webserver can be used by biologists to derive novel PWMs for transcription factors lacking known binding sites and is freely accessible at http://www.gene-regulation.com/pub/programs/3dtf/.
Year: 2012 PMID: 22693215 PMCID: PMC3394331 DOI: 10.1093/nar/gks551
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. Outputs of Task modes 2 and 3 are plain-text pages that show the sequences for which binding energies have been computed, a tabular description of the derived PWM, and a PWM logo. The PWM can also be downloaded in a TRANSFAC-like format. In 3DTF, the consensus column is complemented with the calculated binding energy contribution of each position (see below). The Task mode 2 output additionally includes the results of evaluating the input PDB file, as described for Task mode 1.
Figure 2. (A) LSBF versus information content of the position (from data in Application example 1). (B) PFM pairwise similarity of matrices derived from models versus pairwise sequence identity of modeled proteins.
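The information content of a matrix position, as plotted in Figure 2A, is commonly computed as the reduction in Shannon entropy relative to a uniform four-letter background. The following is a minimal sketch under that assumption; the function name `information_content` is hypothetical, no small-sample correction is applied, and the exact definition used by the authors may differ.

```python
import math


def information_content(counts):
    """Information content in bits of one matrix position.

    `counts` holds the A, C, G, T counts (or frequencies) of the position.
    Assumes a uniform background, so the maximum is log2(4) = 2 bits.
    """
    total = sum(counts)
    if total <= 0:
        raise ValueError("position has no observations")
    ic = 2.0  # log2(4) for a four-letter alphabet
    for c in counts:
        if c > 0:  # 0 * log2(0) is taken as 0
            p = c / total
            ic += p * math.log2(p)
    return ic


# Illustrative positions (made-up counts, not data from the paper).
conserved = information_content([10, 0, 0, 0])   # fully conserved position
uniform = information_content([5, 5, 5, 5])      # uninformative position
print(conserved, uniform)
```

A fully conserved position reaches the 2-bit maximum, while a position with equal counts for all four bases carries no information.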
Table 1. Ranking of binding sequence energies for a set of TFs with assigned PDB entry and correlation between 3DTF and TRANSFAC PWMs
| PDB ID | Transcription factor | Average rank of site energies versus random (%) | A. 3DTF PWM versus homolog TRANSFAC PWMs | B. 3DTF PWM versus other TRANSFAC PWMs | C. Homolog TRANSFAC PWMs |
|---|---|---|---|---|---|
| 1XBR | Brachyury | 0 | | 0.38 ± 0.09 | |
| 1CF7 | E2F4 | 3.5 | 0.44 | 0.30 ± 0.10 | |
| 1SRS | SRF | 1.5 | 0.58 ± 0.03 | 0.29 ± 0.11 | 0.80 ± 0.06 |
| 1IF1 | IRF-1 | 9.5 | 0.30 ± 0.04 | 0.31 ± 0.07 | 0.76 ± 0.08 |
| 1IGN | RAP1p | 4.3 | | 0.25 ± 0.07 | 0.69 ± 0.10 |
| 1HDD | En | 3.6 | 0.61 | 0.32 ± 0.12 | |
| 1HCP | ER | 0.1 | 0.66 ± 0.08 | 0.35 ± 0.11 | 0.70 ± 0.08 |
| 2DGC | GCN4 | 8.4 | 0.36 ± 0.10 | 0.32 ± 0.12 | 0.73 ± 0.07 |
| 1MDY | E2A | 1.3 | 0.53 ± 0.06 | 0.28 ± 0.08 | 0.82 ± 0.05 |
| 1FOS | AP-1 | 4.3 | 0.54 ± 0.07 | 0.36 ± 0.13 | 0.89 ± 0.06 |
| 1TUP | P53 | 0 | 0.49 ± 0.05 | 0.37 ± 0.10 | 0.69 ± 0.12 |
| 2EZD | HMGIY | 22.6 | 0.46 ± 0.06 | 0.24 ± 0.16 | 0.45 ± 0.12 |
| 1UBD | YY1 | 1.6 | 0.67 ± 0.12 | 0.34 ± 0.10 | 0.74 ± 0.09 |
| 1YTB | TBP | 14.0 | 0.55 ± 0.07 | 0.28 ± 0.11 | 0.70 |
| 1APL | MATalpha2 | 9.7 | 0.58 ± 0.16 | 0.35 ± 0.08 | 0.43 |
| 2BOP | E2 | 1.8 | 0.57 ± 0.04 | 0.23 ± 0.08 | 0.91 ± 0.04 |
| 1BY4 | RXRalpha | 2.9 | 0.47 ± 0.10 | 0.37 ± 0.13 | 0.48 ± 0.06 |
| 1GTW | C/EBPbeta | 1.3 | | 0.30 ± 0.10 | 0.81 ± 0.06 |
In column B, higher PCCs than those achieved by 3DPWM (Table 2) are highlighted bold.
a. The average rank of known sites is calculated from the ranks of their binding energies within the ordered list of binding energies for 1000 random DNA sequences.
b. Correlation coefficients were calculated as described in the main text.
c. Correlation values and/or standard errors may be missing due to lack of data.
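The rank statistic in the third column of Table 1 can be illustrated with a short sketch. It assumes that lower energy means stronger predicted binding and that a site's rank is the percentage of random sequences with a lower energy than the site; both conventions, and the name `average_rank_percent`, are assumptions rather than details taken from the paper.

```python
from bisect import bisect_left


def average_rank_percent(site_energies, random_energies):
    """Mean rank (%) of known-site energies among random-sequence energies.

    Assumption: lower energy = better binding, so 0% means every known
    site scores better than all random sequences.
    """
    ordered = sorted(random_energies)
    n = len(ordered)
    ranks = []
    for energy in site_energies:
        # Number of random sequences with strictly lower energy.
        better = bisect_left(ordered, energy)
        ranks.append(100.0 * better / n)
    return sum(ranks) / len(ranks)


# Made-up energies for illustration: 1000 "random" energies 0.0 .. 999.0
# and two known sites that outrank all but 10 and 30 of them.
random_energies = [float(i) for i in range(1000)]
site_energies = [10.0, 30.0]
print(average_rank_percent(site_energies, random_energies))  # 2.0
```

Under these conventions, a value near 0% (as for 1XBR or 1TUP) would mean that the known sites are predicted to bind better than nearly all random sequences, and a value near 50% would mean no discrimination.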
Table 2. Correlation between 3DPWM and TRANSFAC PWMs for a set of TFs with assigned PDB entry
| PDB ID | A. 3DPWM versus homolog TRANSFAC PWMs | B. 3DPWM versus other TRANSFAC PWMs | C. Homolog TRANSFAC PWMs | A/B 3DPWM | A/B 3DTF |
|---|---|---|---|---|---|
| 1XBR | 0.55 | 0.42 ± 0.11 | | 1.31 | 1.61 |
| 1CF7 | | 0.21 ± 0.08 | | 3.52 | 1.47 |
| 1SRS | | 0.28 ± 0.08 | 0.80 ± 0.06 | 2.31 | 2.00 |
| 1IF1 | | 0.41 ± 0.12 | 0.76 ± 0.07 | 1.40 | 0.97 |
| 1IGN | 0.57 ± 0.04 | 0.29 ± 0.09 | 0.68 ± 0.10 | 1.94 | 2.68 |
| 1HDD | | 0.38 ± 0.15 | | 1.82 | 1.91 |
| 1HCP | 0.70 ± 0.10 | 0.40 ± 0.12 | 0.71 ± 0.06 | 1.76 | 1.89 |
| 2DGC | 0.35 ± 0.09 | 0.30 ± 0.10 | 0.73 ± 0.06 | 1.16 | 1.13 |
| 1MDY | | 0.29 ± 0.08 | 0.82 ± 0.05 | 2.33 | 1.89 |
| 1FOS | 0.50 ± 0.02 | 0.35 ± 0.10 | 0.89 ± 0.06 | 1.40 | 1.50 |
| 1TUP | 0.47 ± 0.07 | 0.33 ± 0.12 | 0.69 ± 0.11 | 1.43 | 1.32 |
| 2EZD | | 0.32 ± 0.11 | 0.44 ± 0.11 | 1.83 | 1.92 |
| 1UBD | | 0.41 ± 0.10 | 0.74 ± 0.09 | 1.82 | 1.97 |
| 1YTB | | 0.31 ± 0.12 | 0.69 | 2.11 | 1.96 |
| 1APL | 0.62 ± 0.04 | 0.38 ± 0.09 | 0.43 | 1.66 | 1.66 |
| 2BOP | | 0.29 ± 0.10 | 0.90 ± 0.03 | 2.19 | 2.48 |
| 1BY4 | | 0.42 ± 0.15 | 0.48 ± 0.05 | 1.22 | 1.27 |
| 1GTW | 0.57 ± 0.03 | 0.32 ± 0.09 | 0.81 ± 0.06 | 1.78 | 2.33 |
Columns A–C correspond to the same columns in Table 1, with PCC values for 3DPWM. The two rightmost columns contain the PCC of column A divided by the PCC of column B in the respective table row (Table 1 for the 3DTF values). In column A, higher PCCs than those achieved by 3DTF (Table 1) are highlighted bold.
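As a quick check of the A/B columns, the ratio can be recomputed from the tabulated mean PCCs. The sketch below uses the 3DPWM values of the 1XBR and 1GTW rows of Table 2; the helper name `ab_ratio` is hypothetical, and standard errors are ignored.

```python
def ab_ratio(pcc_a, pcc_b):
    """Ratio of the column A PCC to the column B PCC for one table row."""
    return pcc_a / pcc_b


# Mean PCCs (column A, column B) copied from Table 2.
rows = {
    "1XBR": (0.55, 0.42),
    "1GTW": (0.57, 0.32),
}

for pdb_id, (pcc_a, pcc_b) in rows.items():
    print(pdb_id, round(ab_ratio(pcc_a, pcc_b), 2))
# 1XBR 1.31
# 1GTW 1.78
```

Both results match the tabulated A/B 3DPWM values. For some other rows the recomputed ratio differs in the last digit (for example 0.70 / 0.40 = 1.75 versus the tabulated 1.76 for 1HCP), presumably because the published ratios were derived from unrounded PCCs.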