Tao Huang, Shen Niu, Zhongping Xu, Yun Huang, Xiangyin Kong, Yu-Dong Cai, Kuo-Chen Chou.
Abstract
Mutated forms of the important tumor suppressor protein p53 are found in many kinds of human cancer, and restoring active p53 can lead to tumor regression. In this work, we developed a new computational method to predict the transcriptional activity of one-, two-, three- and four-site p53 mutants. Following the general form of pseudo amino acid composition, we used eight types of features to represent each mutation and then selected the optimal prediction features with the maximum relevance, minimum redundancy (mRMR) and incremental feature selection (IFS) methods. The Matthews correlation coefficients (MCC) obtained with the nearest neighbor algorithm and jackknife cross-validation for one-, two-, three- and four-site p53 mutants were 0.678, 0.314, 0.705, and 0.907, respectively. Further analysis of the optimal feature sets revealed that the 2D (two-dimensional) structure features made up the largest part of each optimal feature set and may have played the most important role in predicting the activity status of all four types of p53 mutant. The optimal feature sets, especially their top-ranked features, also showed that the 3D structure features, conservation, and the physicochemical and biochemical properties of amino acids near the mutation site played important roles in predicting p53 mutant activity status. Our study provides a new and promising approach for finding functionally important sites and the relevant features for in-depth study of the p53 protein and its mechanism of action.
Year: 2011 PMID: 21857971 PMCID: PMC3152557 DOI: 10.1371/journal.pone.0022940
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
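For reference, the Matthews correlation coefficient quoted in the abstract is conventionally defined from the counts of a binary confusion matrix. The symbols TP, TN, FP and FN (true positives, true negatives, false positives, false negatives) are the standard ones and do not appear in this record; which class the authors treat as positive is not stated here.

```latex
\mathrm{MCC} =
  \frac{\mathrm{TP}\cdot\mathrm{TN} - \mathrm{FP}\cdot\mathrm{FN}}
       {\sqrt{(\mathrm{TP}+\mathrm{FP})\,(\mathrm{TP}+\mathrm{FN})\,
              (\mathrm{TN}+\mathrm{FP})\,(\mathrm{TN}+\mathrm{FN})}}
```

The value ranges from -1 (every prediction wrong) through 0 (no better than chance) to +1 (every prediction correct).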
Number of features for one-site, two-site, three-site and four-site mutants.
| Features | One-site mutant | Two-site mutant | Three-site mutant | Four-site mutant |
| SNP features | 1×9+1 = 10 | (1×9+1)×2 = 20 | (1×9+1)×3 = 30 | (1×9+1)×4 = 40 |
| Pro-pro features | 1×9+1 = 10 | (1×9+1)×2 = 20 | (1×9+1)×3 = 30 | (1×9+1)×4 = 40 |
| Amino acid factor | 5×9+5 = 50 | (5×9+5)×2 = 100 | (5×9+5)×3 = 150 | (5×9+5)×4 = 200 |
| PSSM features | 20×9 = 180 | 20×9×2 = 360 | 20×9×3 = 540 | 20×9×4 = 720 |
| Disorder feature | 1×9 = 9 | 1×9×2 = 18 | 1×9×3 = 27 | 1×9×4 = 36 |
| GRANTHAM | 1 | 2 | 3 | 4 |
| Distance features | 0 | 1 | 2 | 3 |
| 2D structure features | 4826 | 4826 | 4826 | 4826 |
| 3D structure features | 582 | 582 | 582 | 582 |
| Total | 5668 | 5929 | 6190 | 6451 |
SNP features: gain/loss of amino acids during evolution.
Pro-pro features: conservation of amino acid at protein-protein interface.
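The arithmetic in the feature-count table can be reproduced with a short Python sketch. The function name `feature_counts` and the constant `POSITIONS_PER_SITE` are hypothetical names introduced for illustration; the factor 9 is copied from the table, and its meaning (for example, a window of residues around the mutation site) is an assumption not stated in this record.

```python
# Factor copied from the "x9" terms in the table; its interpretation
# (e.g. a 9-residue window per mutation site) is an assumption.
POSITIONS_PER_SITE = 9


def feature_counts(n_sites):
    """Return the per-type and total feature counts for an n-site mutant."""
    # Feature types whose count is repeated once per mutation site.
    per_site = {
        "SNP features": 1 * POSITIONS_PER_SITE + 1,
        "Pro-pro features": 1 * POSITIONS_PER_SITE + 1,
        "Amino acid factor": 5 * POSITIONS_PER_SITE + 5,
        "PSSM features": 20 * POSITIONS_PER_SITE,
        "Disorder feature": 1 * POSITIONS_PER_SITE,
        "GRANTHAM": 1,
    }
    counts = {name: n * n_sites for name, n in per_site.items()}
    # The table lists one distance feature fewer than the number of sites.
    counts["Distance features"] = n_sites - 1
    # Structure features have the same count for every mutant type.
    counts["2D structure features"] = 4826
    counts["3D structure features"] = 582
    counts["Total"] = sum(counts.values())
    return counts


for n in (1, 2, 3, 4):
    print(n, feature_counts(n)["Total"])
```

Running the loop gives the four totals in the last row of the table, which confirms that the listed rows sum correctly.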
Figure 1. The IFS curves for one-site, two-site, three-site and four-site p53 mutants.
In each IFS curve, the x-axis is the number of features used for classification, and the y-axis is the Matthews correlation coefficient (MCC) obtained from the jackknife test. (A) The IFS curve for one-site p53 mutants. The MCC peaks at 0.678 with 8 features. The top 8 features ranked by the mRMR approach form the optimal feature set for one-site p53 mutants. (B) The IFS curve for two-site p53 mutants. The MCC peaks at 0.314 with 50 features. The top 50 features ranked by the mRMR approach form the optimal feature set for two-site p53 mutants. (C) The IFS curve for three-site p53 mutants. The MCC peaks at 0.705 with 282 features. The top 282 features ranked by the mRMR approach form the optimal feature set for three-site p53 mutants. (D) The IFS curve for four-site p53 mutants. The MCC peaks at 0.907 with 25 features. The top 25 features ranked by the mRMR approach form the optimal feature set for four-site p53 mutants.
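The procedure behind such a curve can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: it assumes a feature ranking is already available (the mRMR step is not shown), uses squared Euclidean distance for the nearest neighbor rule (the distance measure is not given in this record), and all function names and the toy data are hypothetical.

```python
import math


def mcc(y_true, y_pred):
    """Matthews correlation coefficient for 0/1 labels (0.0 if undefined)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def jackknife_1nn(X, y, features):
    """Leave-one-out predictions from a 1-nearest-neighbor rule.

    Only the columns listed in `features` are used. Squared Euclidean
    distance is an assumption made for this sketch.
    """
    preds = []
    for i, xi in enumerate(X):
        nearest = min(
            (j for j in range(len(X)) if j != i),
            key=lambda j: sum((xi[f] - X[j][f]) ** 2 for f in features),
        )
        preds.append(y[nearest])
    return preds


def incremental_feature_selection(X, y, ranked_features):
    """Score the top-k ranked features for k = 1..N and return the peak.

    Returns (best_k, curve), where curve[k - 1] is the jackknife MCC
    obtained with the first k features of the ranking.
    """
    curve = []
    for k in range(1, len(ranked_features) + 1):
        subset = ranked_features[:k]
        curve.append(mcc(y, jackknife_1nn(X, y, subset)))
    # On ties, max() keeps the first hit, i.e. the smallest feature set.
    best_k = max(range(len(curve)), key=curve.__getitem__) + 1
    return best_k, curve


# Toy data: column 0 separates the classes, column 1 is misleading noise.
X = [[0.0, 5.0], [0.1, 1.0], [0.2, 9.0], [1.0, 5.1], [1.1, 1.1], [1.2, 9.1]]
y = [0, 0, 0, 1, 1, 1]
best_k, curve = incremental_feature_selection(X, y, ranked_features=[0, 1])
print(best_k, curve)
```

On the toy data the curve peaks with the first feature alone and drops once the noisy second feature is added, which mirrors how the peak of each panel defines the optimal feature set.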
Optimal feature set for one-site p53 mutants.
| Order | Name | Score |
| 1 | AA3_PSSM-8-G | 0.144 |
| 2 | AA8_PSSM-19-Y | 0.105 |
| 3 | V241 | 0.067 |
| 4 | AA6_AAFactor-3 | 0.052 |
| 5 | V78 | 0.05 |
| 6 | AA5_AAFactor-1 | 0.04 |
| 7 | AA2_PSSM-18-W | 0.039 |
| 8 | AA4_disorder | 0.04 |
Top 10 features for two-site p53 mutants.
| Order | Name | Score |
| 1 | AP2.AA8_PSSM-3-N | 0.004 |
| 2 | V1152 | 0.002 |
| 3 | V55 | 0.002 |
| 4 | V1854 | 0.001 |
| 5 | V4001 | 0 |
| 6 | V2846 | 0 |
| 7 | V4168 | 0 |
| 8 | V1059 | 0 |
| 9 | V2633 | 0 |
| 10 | V3105 | 0 |
Top 10 features for three-site p53 mutants.
| Order | Name | Score |
| 1 | V2261 | 0.159 |
| 2 | V3291 | 0.074 |
| 3 | V4391 | 0.069 |
| 4 | V3106 | 0.067 |
| 5 | AP1.AA2_AAFactor-1 | 0.056 |
| 6 | V5068 | 0.061 |
| 7 | V4075 | 0.049 |
| 8 | V5278 | 0.046 |
| 9 | V3568 | 0.05 |
| 10 | V3978 | 0.052 |
Optimal feature set for four-site p53 mutants.
| Order | Name | Score |
| 1 | V431 | 0.461 |
| 2 | V4965 | 0.109 |
| 3 | V1675 | 0.147 |
| 4 | V414 | 0.132 |
| 5 | V3945 | 0.097 |
| 6 | V1116 | 0.102 |
| 7 | V2789 | 0.1 |
| 8 | V407 | 0.097 |
| 9 | AP1.AA9_PSSM-7-E | 0.09 |
| 10 | V432 | 0.084 |
| 11 | V3562 | 0.08 |
| 12 | V4524 | 0.079 |
| 13 | V2253 | 0.077 |
| 14 | AP1.AA2_PSSM-11-L | 0.077 |
| 15 | V1099 | 0.071 |
| 16 | V2718 | 0.067 |
| 17 | V438 | 0.07 |
| 18 | V4946 | 0.07 |
| 19 | V2817 | 0.069 |
| 20 | V1159 | 0.072 |
| 21 | V3477 | 0.072 |
| 22 | V2357 | 0.07 |
| 23 | V415 | 0.07 |
| 24 | AP1.AA2_AAFactor-4 | 0.072 |
| 25 | AP2.AA1_PSSM-14-F | 0.072 |