Andreu Alibés, Alejandro D Nadra, Federico De Masi, Martha L Bulyk, Luis Serrano, François Stricher.
Abstract
Quite often, a single protein mutation or a combination of mutations is linked to a specific disease. However, distinguishing from sequence information which mutations have real effects on the protein's function is not trivial. Protein design tools are commonly used to explain mutations that affect protein stability or protein-protein interactions, but not mutations that could affect protein-DNA binding. Here, we used the protein design algorithm FoldX to model all known missense mutations in the paired box domain of Pax6, a highly conserved transcription factor involved in eye development and in several diseases such as aniridia. The validity of FoldX for protein-DNA interactions was demonstrated by showing that high levels of accuracy can be achieved for mutations affecting these interactions. We also showed that protein-design algorithms can accurately reproduce experimental DNA-binding logos. We conclude that 88% of the Pax6 mutations can be linked to changes in intrinsic stability (77%) and/or in its capacity to bind DNA (30%). Our study emphasizes the importance of structure-based analysis for understanding the molecular basis of diseases and shows that protein-DNA interactions can be analyzed to the same level of accuracy as protein stability or protein-protein interactions.
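The three percentages in the abstract imply how many mutations fall into both categories. The short sketch below works this out by inclusion-exclusion. It assumes the reported figures are rounded percentages of the same set of mutations, so the derived numbers are approximate; the variable names are illustrative only.

```python
# Percentages as reported in the abstract (rounded values).
explained_total = 88  # linked to stability and/or DNA binding
stability = 77        # linked to changes in intrinsic stability
dna_binding = 30      # linked to changes in DNA binding

# Inclusion-exclusion: |A or B| = |A| + |B| - |A and B|
both = stability + dna_binding - explained_total
stability_only = stability - both
dna_binding_only = dna_binding - both
unexplained = 100 - explained_total

print(f"both: {both}%, stability only: {stability_only}%, "
      f"DNA binding only: {dna_binding_only}%, unexplained: {unexplained}%")
```

Under these assumptions, roughly a fifth of the mutations would affect both stability and DNA binding.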
Year: 2010 PMID: 20685816 PMCID: PMC2995082 DOI: 10.1093/nar/gkq683
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Correlation factors for changes in interaction energy upon change in DNA base, as predicted by FoldX and by Morozov et al. (24). Cases where FoldX performs better than the dynamic model are shown in green; all others are shown in red.
1AAY and 1JK1 are considered together in Morozov et al. (24).
(a) 0.63 without two outliers.
(b) 0.58 without three outliers.
Figure 1. Qualitative evaluation of FoldX-derived DNA-binding profiles. Experimental logos for a subset of proteins displaying different D values (see Table 2) are presented above each FoldX prediction. Correctly predicted positions according to our criterion (individual coefficient D < 0.58) are shown in blue, while those mispredicted (individual coefficient D > 0.58) are shown in black. For the PBM-derived logos (E and F), Enrichment Scores for the top seeds resulting in the PBM PWMs are 0.485 for Gcn4 (ATF) and 0.497 for Jun/Fos.
Quantitative evaluation of FoldX-derived DNA-binding profiles
Source: J, JASPAR (38); T, TRANSFAC (39); U, UniPROBE (40) and newly reported here.
Divergence coefficient values (see ‘Materials and Methods’ section) are given for each structure analyzed, and each structure is assigned to a group according to its divergence coefficient (D < 0.38: white, 0.38 ≤ D < 0.58: light gray, D > 0.58: dark gray). The average information content for each position, IC/N, for the experimental and predicted logos is also shown. Logos presented in Figure 1 are in bold and are examples of different divergence coefficients.
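The grouping rule in this caption can be written as a small function. This is a minimal sketch, not the authors' code: the function name `divergence_group` is hypothetical, and because the caption leaves D = 0.58 unassigned, the sketch places that boundary value in the dark gray group as an explicit assumption.

```python
def divergence_group(d: float) -> str:
    """Map a divergence coefficient D to the shading group used in the table.

    Thresholds (0.38 and 0.58) are taken from the caption. The caption does
    not say where D == 0.58 belongs; this sketch ASSUMES dark gray.
    """
    if d < 0:
        raise ValueError("divergence coefficient is expected to be non-negative")
    if d < 0.38:
        return "white"
    if d < 0.58:
        return "light gray"
    return "dark gray"  # D > 0.58, plus the unspecified D == 0.58 case


# Example values chosen for illustration only (not taken from the table).
for d in (0.20, 0.38, 0.45, 0.70):
    print(d, divergence_group(d))
```

The same 0.58 cutoff is the criterion used in Figure 1 to mark individual positions as correctly predicted (D < 0.58) or mispredicted (D > 0.58).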
Figure 2. Structure of the Pax6 paired domain (PDB id 6PAX) (43). Cartoon representation showing both N- and C-terminal domains. The figure was prepared with the molecular visualization software PyMOL (57).
Figure 3. Comparison of the experimental logo (44) (top panel) and the predicted logo for the wild-type Pax6 (bottom panel).
Figure 4. Schematic view of the distribution of Pax6 mutations and the energy results. The residues are color coded according to their change in stability. Residues with two colors represent the results for different mutations at the same position. Residue numbering throughout the article is based on the UniProt numbering (Isoform 1), which is shifted by three positions relative to the PDB numbering.
Figure 5. Energy changes upon mutation. Red for changes >1.6 kcal/mol, orange for changes between 1.6 and 0.8 kcal/mol and blue for those whose effect is <0.8 kcal/mol. The changes in stability are displayed as solid colors and the changes in interaction energy as hash-bars. Values >10 kcal/mol are shown with an arrow. Mutations are sorted according to their stability changes. Values can be found in Supplementary Table S3.
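The color scheme described in this caption amounts to a simple threshold classification. The following is a minimal sketch under stated assumptions: the function name `classify_energy_change` is hypothetical, and values exactly equal to 0.8 or 1.6 kcal/mol are assigned to the orange band because the caption does not specify the boundaries.

```python
def classify_energy_change(ddg: float) -> tuple[str, bool]:
    """Return (color, off_scale) for an energy change in kcal/mol.

    Thresholds follow the caption: red > 1.6, orange between 1.6 and 0.8,
    blue < 0.8. Boundary values (exactly 0.8 or 1.6) are ASSUMED orange.
    off_scale is True for values > 10 kcal/mol, drawn with an arrow.
    """
    if ddg > 1.6:
        color = "red"
    elif ddg >= 0.8:
        color = "orange"
    else:
        color = "blue"
    return color, ddg > 10.0


# Illustrative values only; real values are in Supplementary Table S3.
for ddg in (0.3, 0.8, 1.2, 2.5, 12.0):
    print(ddg, classify_energy_change(ddg))
```

The same function would apply to both the stability changes (solid bars) and the interaction-energy changes (hashed bars), since the caption uses one color scale for both.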