Leandra M Brettner, Joanna Masel.
Abstract
BACKGROUND: A hub protein is one that interacts with many functional partners. The annotation of hub proteins, or more generally the protein-protein interaction "degree" of each gene, requires quality genome-wide data. Data obtained using yeast two-hybrid methods contain many false positive interactions between proteins that rarely encounter each other in living cells, and such data have fallen out of favor.
Year: 2012 PMID: 23017156 PMCID: PMC3527306 DOI: 10.1186/1752-0509-6-128
Source DB: PubMed Journal: BMC Syst Biol ISSN: 1752-0509
Figure 1. A protein’s number of PPIs correlates poorly across two high-throughput data types. The Model 1 regression line is shown for illustrative purposes only, to show the weakness of the correlation.
Table 1. Multiple regression results predicting noise
| Predictor | Statistic | Without plasticity: best model (R2: reduction when removed) | Without plasticity: in isolation | With plasticity: best model (R2: reduction when removed) | With plasticity: in isolation |
|---|---|---|---|---|---|
| | R2 | 0.0093 | 0.0100 | 0.0064 | 0.0121 |
| | p | *** | *** | ** | *** |
| | R2 | ns | 0.0081 | ns | 0.0114 |
| | p | - | ** | - | *** |
| | R2 | 0.07555¹ | 0.0779 | 0.0602² | 0.0752 |
| | p | *** | *** | *** | *** |
| | R2 | 0.0067 | 0.0033 | 0.0038 | 0.0045 |
| | p | ** | * | * | * |
| | R2 | 0.01744¹ | 0.0169 | 0.0161¹ | 0.0211 |
| | p | *** | *** | *** | *** |
| | R2 | - | - | 0.0495³ | 0.0781 |
| | p | - | - | *** | *** |
| | R2 | 0.0098 | - | 0.0060 | - |
| | p | *** | - | ** | - |
| | R2 | - | - | 0.0279 | 0.0815 |
| | slope | - | - | 0.0010 | 0.0015 |
| | p | - | - | *** | *** |
| | R2 | - | - | 0.0219 | 0.0281 |
| | slope | - | - | 0.0004 | 0.0005 |
| | p | - | - | *** | *** |
| | R2 | - | - | 0.0075 | - |
| | p | - | - | ** | - |
1 also removed TATA × Essentiality.
2 also removed TATA × Essentiality, Plasticity if TATA(+), Plasticity if TATA(−), and restored Plasticity.
3 removed Plasticity if TATA(+) and Plasticity if TATA(−).
Models without (1st two numeric columns) and with (last two columns) plasticity as a predictor are shown. After extensive model building, we found that high noise is predicted by low stickiness (low Y2H degree), presence of a TATA box, ability to bind itself, non-essentiality, and high plasticity. A statistically significant interaction term between TATA presence and non-essentiality shows that these two factors have synergistic effects. The TATA × plasticity interaction term is also statistically significant (last row). To provide greater insight, we transformed 3 terms (TATA, plasticity and their interaction) into more intuitive forms (TATA, plasticity if TATA(+), plasticity if TATA(−)). The slope coefficient for plasticity if TATA(+) is 2.5 times larger than that for plasticity if TATA(−), but they make similar contributions to R2 due to the much larger number of TATA(−) genes. R2 values are shown for each predicting factor in isolation (2nd and 4th numeric columns), as well as, more importantly, for the reduction in the total coefficient of determination R2 when the factor is removed from the best model (1st and 3rd numeric columns). Sometimes, as indicated in the footnotes, this involved removing multiple terms and reversing the interaction factor transformation to get a biologically interpretable result. “ns” indicates p > 0.05, * p < 0.05, ** p < 0.01, *** p < 0.001.
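The two ideas in this legend, re-expressing an interaction term as two group-specific slopes and measuring the drop in R2 when a term is removed, can be illustrated with a small sketch. The data below are made up for illustration and are not taken from the paper; the helper names `solve` and `ols` are hypothetical, and the fit is a plain least-squares regression rather than the authors' actual model-building procedure.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def ols(columns, y):
    """Least-squares fit with an intercept; returns (coefficients, R2)."""
    rows = [[1.0] + list(vals) for vals in zip(*columns)]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return beta, 1.0 - ss_res / ss_tot

# Made-up illustrative data (assumption): 6 TATA(-) genes and 4 TATA(+) genes.
tata = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
plasticity = [1.0, 3.0, 5.0, 7.0, 9.0, 11.0, 2.0, 4.0, 6.0, 8.0]
wiggle = [0.02, -0.03, 0.01, 0.02, -0.01, -0.01, 0.03, -0.02, -0.02, 0.01]
noise = [0.5 + 0.3 * t + (0.10 if t else 0.04) * p + w
         for t, p, w in zip(tata, plasticity, wiggle)]

# Form A: TATA, plasticity, and their interaction.
tata_x_plast = [t * p for t, p in zip(tata, plasticity)]
beta_int, r2_int = ols([tata, plasticity, tata_x_plast], noise)

# Form B: TATA, plasticity if TATA(+), plasticity if TATA(-).
plast_if_plus = [t * p for t, p in zip(tata, plasticity)]
plast_if_minus = [(1 - t) * p for t, p in zip(tata, plasticity)]
beta_sep, r2_sep = ols([tata, plast_if_plus, plast_if_minus], noise)

# Reduction in R2 when "plasticity if TATA(+)" is removed from form B.
_, r2_without_plus = ols([tata, plast_if_minus], noise)
delta_r2_plus = r2_sep - r2_without_plus

print("slope if TATA(+):", beta_sep[2], " slope if TATA(-):", beta_sep[3])
print("R2 form A:", r2_int, " R2 form B:", r2_sep, " delta R2:", delta_r2_plus)
```

Both forms span the same set of predictors, so they give the same fit; form B simply reports the slope for each TATA class directly, where form A reports the TATA(−) slope and the difference between the two slopes.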
Figure 2. Illustration of binary predictors of noise and plasticity, taken in isolation. The presence of a TATA box strongly predicts noise and plasticity. Homo-oligomerization does not, in isolation, predict plasticity, and its effect on noise is only marginally statistically significant (p = 0.0496). However, these effects become significant when confounding factors are accounted for (Tables 1 and 2). Essentiality predicts noise but not plasticity. To better assess effect sizes using more intuitive noise and plasticity measures, back transformations were performed to restore original units. The mean plasticity residual was added to the mean Box-Cox transformed plasticity score, and then the Box-Cox transform was reversed, so that plasticity corresponds simply to the estimated number of experiments for which expression varies. The noise axis corresponds to the DM metric of Newman et al. [13]. Error bars correspond to 95% confidence intervals.
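The back-transformation described in this legend can be sketched as follows. The Box-Cox parameter `lam`, the scores, and the group residuals are all hypothetical values chosen for illustration; the legend does not state the parameter used.

```python
import math

def boxcox(x, lam):
    """Box-Cox transform of a positive value x."""
    return math.log(x) if lam == 0 else (x ** lam - 1.0) / lam

def inv_boxcox(z, lam):
    """Inverse of the Box-Cox transform."""
    return math.exp(z) if lam == 0 else (lam * z + 1.0) ** (1.0 / lam)

lam = 0.5  # hypothetical Box-Cox parameter (assumption)

# Made-up plasticity scores for all genes, in original units.
scores = [4.0, 9.0, 16.0, 25.0, 36.0]
transformed = [boxcox(s, lam) for s in scores]
grand_mean = sum(transformed) / len(transformed)

# Made-up residuals (on the transformed scale) for one group of genes.
group_residuals = [0.8, 1.1, 0.5]
mean_residual = sum(group_residuals) / len(group_residuals)

# Add the group's mean residual to the overall transformed mean,
# then reverse the transform to return to original units.
group_value = inv_boxcox(grand_mean + mean_residual, lam)
print(group_value)
```

Because the transform is nonlinear, the back-transformed value is not the arithmetic mean of the group in original units; it is the original-scale value that corresponds to the group's mean on the transformed scale.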
Table 2. Multiple regression results predicting plasticity
| Predictor | Statistic | Best model (R2: reduction when removed) | In isolation |
|---|---|---|---|
| | R2 | 0.0191¹ | 0.0194 |
| | p | *** | *** |
| | R2 | ns | 0.0040 |
| | p | - | * |
| | R2 | 0.0242² | 0.0445 |
| | p | *** | *** |
| | R2 | 0.0087³ | 0.0015 |
| | p | ** | ns |
| | R2 | ns | 0.0005 |
| | p | - | ns |
| | R2 | 0.05344⁴ | 0.0781 |
| | p | *** | *** |
| | R2 | 0.0315 | 0.0449 |
| | slope | 167.24 | 0.9949 |
| | p | *** | *** |
| | R2 | 0.0224 | 0.0438 |
| | slope | 70.444 | −0.9858 |
| | p | *** | *** |
| | R2 | 0.0085 | - |
| | p | ** | - |
| | R2 | 0.0123 | 0.0004 |
| | slope | −0.3145 | −0.0221 |
| | p | *** | ns |
| | R2 | 0.0070 | 0.0154 |
| | slope | −0.0839 | −0.1113 |
| | p | ** | *** |
| | R2 | 0.0059 | - |
| | p | ** | - |
1 removed Y2H if Self and Y2H if Non-self.
2 also removed Noise if TATA(+), Noise if TATA(−) and restored Noise.
3 also removed Y2H if Self and Y2H if Non-self and restored Y2H PPI.
4 removed Noise if TATA(+) and Noise if TATA(−).
After extensive model building, we found that high plasticity is predicted by low stickiness (low Y2H degree), presence of a TATA box, ability to bind itself, and high noise. The TATA × noise and self-interaction × Y2H interaction terms are also statistically significant. To provide greater insight, we transformed the interaction terms as described in the Table 1 legend. R2 values are shown for each predicting factor in isolation (last column), as well as, more importantly, for the reduction in the total coefficient of determination R2 when the factor is removed from the best model. Sometimes, as indicated in the footnotes, this involved removing multiple terms and reversing the interaction factor transformation to get a biologically interpretable result. “ns” indicates p > 0.05, * p < 0.05, ** p < 0.01, *** p < 0.001.
Figure 3. A gene’s noise and plasticity are correlated. Multiple regression analyses in Tables 1 and 2 use Model 1 regression, but with reversed dependent and independent variables. For such a weak correlation, plasticity as a function of noise is quite different from the inverse function of noise as a function of plasticity: both lines are shown here. In the absence of a correlation, the functions describing these two lines would be horizontal and vertical, respectively. For comparison, the Model 2 Standard Major Axis regression line is also shown. The correlation between noise and plasticity is tighter in the top right corner, where values of both are high [11].
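The relationship between the three lines in this figure can be made concrete with a short sketch. It uses the textbook definitions of ordinary least-squares (Model 1) slopes and the Standard Major Axis slope; the data and the function name `regression_slopes` are made up for illustration and are not from the paper.

```python
import math

def regression_slopes(x, y):
    """Return (r, slope of y on x, slope of the x-on-y line drawn in the
    same x-y axes, Standard Major Axis slope)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    r = sxy / math.sqrt(sxx * syy)
    y_on_x = sxy / sxx            # equals r * sd_y / sd_x
    x_on_y_inverted = syy / sxy   # equals (sd_y / sd_x) / r
    sma = math.copysign(math.sqrt(syy / sxx), r)  # sign(r) * sd_y / sd_x
    return r, y_on_x, x_on_y_inverted, sma

# Made-up data: x plays the role of noise, y the role of plasticity.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
r, y_on_x, x_on_y_inverted, sma = regression_slopes(x, y)
print(r, y_on_x, x_on_y_inverted, sma)
```

The Standard Major Axis slope is the geometric mean of the other two, so it always lies between them. As r approaches zero, the y-on-x slope approaches zero (a horizontal line) and the inverted x-on-y slope grows without bound (a vertical line), which is the behaviour the legend describes.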
Figure 4. Proteins that homo-oligomerize are stickier, but do not have more functional PPIs. Analyses were performed on log(PPI) and back-transformed to yield more intuitive PPI metrics. 95% confidence intervals are shown.
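A minimal sketch of analysing log(PPI) and back-transforming the mean and its confidence interval is given below. It assumes strictly positive PPI counts, natural logarithms, and a normal approximation for the interval; the legend does not specify any of these, and the counts are made up.

```python
import math
from statistics import NormalDist, mean, stdev

def back_transformed_mean_ci(counts, level=0.95):
    """Mean and confidence interval computed on log(counts),
    then exponentiated back to the original PPI scale."""
    logs = [math.log(c) for c in counts]  # assumes every count is > 0
    m = mean(logs)
    se = stdev(logs) / math.sqrt(len(logs))
    z = NormalDist().inv_cdf(0.5 + level / 2)  # normal approximation (assumption)
    return math.exp(m), math.exp(m - z * se), math.exp(m + z * se)

ppi_counts = [1, 2, 4, 8, 16]  # made-up PPI degrees
center, lower, upper = back_transformed_mean_ci(ppi_counts)
print(center, lower, upper)
```

The back-transformed mean is the geometric mean of the counts, and the resulting interval is asymmetric around it on the original scale.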
Figure 5. Sigmoidal dose–response curves of cooperative proteins. A) In the shaded area, cooperativity suppresses the effects of gene expression noise, preventing inappropriate pathways from being switched on. B) Dose–response curves shown for Hill coefficients of 1, 2, 3, and 4.
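The effect of the Hill coefficient on the shape of these curves can be sketched with the standard Hill equation. That functional form, the half-saturation constant `k`, and the example doses are assumptions made for illustration; the legend gives only the Hill coefficients.

```python
def hill(dose, k, n):
    """Fractional response under the standard Hill equation.

    dose: concentration of the regulator
    k:    dose giving a half-maximal response
    n:    Hill coefficient (degree of cooperativity)
    """
    return dose ** n / (k ** n + dose ** n)

k = 1.0  # hypothetical half-saturation constant
for n in (1, 2, 3, 4):
    below = hill(0.5 * k, k, n)   # dose below the threshold
    above = hill(2.0 * k, k, n)   # dose above the threshold
    print(f"n={n}: response at 0.5k = {below:.3f}, at 2k = {above:.3f}")
```

With a higher Hill coefficient, the response below the half-saturation dose is pushed toward zero and the response above it toward one, so modest fluctuations in dose on either side of the threshold change the output less.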
Figure 6. Loess regression correcting plasticity for protein abundance. Statistical analyses were performed on transformed plasticity numbers (left vertical axis); untransformed plasticity is shown on the right axis for illustration. Further analysis was performed on the deviation of each data point from the red loess regression line. The R loess function was used rather than the lowess function because loess returns residuals and better handles larger datasets.
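The idea of correcting one variable for another by taking residuals from a local regression can be sketched as below. This is a simplified stand-in, not R's `loess`: it fits a local straight line with tricube weights and no robustness iterations, and the function name, span, and data are all assumptions made for illustration.

```python
import math

def local_linear_fit(x, y, span=0.75):
    """Fitted values from a tricube-weighted local linear regression.

    For each point, the bandwidth is the distance to its k-th nearest
    neighbour, where k is the fraction `span` of all points.
    """
    n = len(x)
    k = min(n, max(2, math.ceil(span * n)))
    fitted = []
    for x0 in x:
        h = max(sorted(abs(xi - x0) for xi in x)[k - 1], 1e-12)
        w = [(1.0 - min(abs(xi - x0) / h, 1.0) ** 3) ** 3 for xi in x]
        sw = sum(w)
        mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
        my = sum(wi * yi for wi, yi in zip(w, y)) / sw
        sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
        sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
        slope = sxy / sxx if sxx > 0 else 0.0
        fitted.append(my + slope * (x0 - mx))
    return fitted

# Made-up data: plasticity rises with abundance, and one gene (index 4)
# is more plastic than its abundance alone would predict.
log_abundance = [float(i) for i in range(10)]
plasticity = [1.0 + 2.0 * a for a in log_abundance]
plasticity[4] += 3.0

fitted = local_linear_fit(log_abundance, plasticity, span=0.5)
residuals = [p - f for p, f in zip(plasticity, fitted)]
print(residuals)
```

The residuals, rather than the raw values, then serve as the abundance-corrected plasticity measure in downstream analyses.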
Figure 7. Methods flowchart. Simple illustrative flowchart showing the progression of research methods, including datasets analysed, data transforms, statistical tests, and regression models.