John H Graham, Daniel T Robb, Amy R Poe.
Abstract
BACKGROUND: Distributed robustness is thought to influence the buffering of random phenotypic variation through the scale-free topology of gene regulatory, metabolic, and protein-protein interaction networks. If this hypothesis is true, then the phenotypic response to the perturbation of particular nodes in such a network should be proportional to the number of links those nodes make with neighboring nodes. This suggests that random phenotypic variation follows a probability distribution approximating an inverse power law. Zero phenotypic variation, however, is impossible, because random molecular and cellular processes are essential to normal development. Consequently, a more realistic distribution should have a y-intercept close to zero in the lower tail, a mode greater than zero, and a long (fat) upper tail. The double Pareto-lognormal (DPLN) distribution is an ideal candidate distribution. It consists of a mixture of a lognormal body and upper and lower power-law tails.

OBJECTIVE AND METHODS: If our assumptions are true, the DPLN distribution should provide a better fit to random phenotypic variation in a large series of single-gene knockout lines than other skewed or symmetrical distributions. We fit a large published data set of single-gene knockout lines in Saccharomyces cerevisiae to seven different probability distributions: DPLN, right Pareto-lognormal (RPLN), left Pareto-lognormal (LPLN), normal, lognormal, exponential, and Pareto. The best model was judged by the Akaike Information Criterion (AIC).
Year: 2012 PMID: 23139826 PMCID: PMC3490920 DOI: 10.1371/journal.pone.0048964
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Figure 1. Probability distributions fit to a histogram of random phenotypic variation (phenotypic potential) in Saccharomyces cerevisiae gene knockouts.
Histogram data are from Table S1 in [29]. DPLN is the double Pareto-lognormal distribution. RPLN is the right Pareto-lognormal distribution. LPLN is the left Pareto-lognormal distribution. Simple Pareto and exponential distributions omitted.
The AICc values for the fit of seven distributions to the phenotypic potential data from Saccharomyces cerevisiae.
| Distribution | log(L) | d | AICc | Δi | wi |
| DPLN | −713.77 | 4 | 1435.55 | 0.0 | 1 |
| RPLN | −862.73 | 3 | 1731.47 | 295.92 | 5.5×10^−65 |
| LPLN | −888.88 | 3 | 1783.76 | 348.21 | 2.4×10^−76 |
| Lognormal | −902.96 | 2 | 1809.93 | 374.39 | 5.0×10^−82 |
| Normal | −2525.41 | 2 | 5054.82 | 3619.27 | 0 |
| Exponential | −2912.06 | 1 | 5826.13 | 4390.58 | 0 |
| Pareto | −7478.32 | 2 | 14960.64 | 13525.09 | 0 |
Log(L) is the log-likelihood function. d is the number of parameters. AICc is the corrected Akaike Information Criterion (AIC). Δi is the rescaled AICc (the difference from the lowest AICc), and wi are the Akaike weights. DPLN is the double Pareto-lognormal distribution. RPLN is the right Pareto-lognormal distribution. LPLN is the left Pareto-lognormal distribution. The sample size n was 4,680. Data are from Table S1 in [29].
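The AICc, Δi, and wi columns can be recomputed from the reported log-likelihoods, parameter counts, and sample size. The following is a minimal sketch, assuming the standard small-sample correction AICc = −2 log(L) + 2d + 2d(d + 1)/(n − d − 1) and Akaike weights proportional to exp(−Δi/2); the source does not print these formulas, and the names `aicc` and `fits` are illustrative. Because the inputs are the rounded table values, results can differ from the published columns in the last digit.

```python
import math

def aicc(log_lik, d, n):
    """Corrected AIC from a log-likelihood, parameter count d, and sample size n."""
    aic = -2.0 * log_lik + 2.0 * d
    return aic + 2.0 * d * (d + 1) / (n - d - 1)

# (log-likelihood, number of parameters) as reported in the table above
fits = {
    "DPLN": (-713.77, 4),
    "RPLN": (-862.73, 3),
    "LPLN": (-888.88, 3),
    "Lognormal": (-902.96, 2),
    "Normal": (-2525.41, 2),
    "Exponential": (-2912.06, 1),
    "Pareto": (-7478.32, 2),
}
n = 4680  # sample size stated in the table footnote

scores = {name: aicc(ll, d, n) for name, (ll, d) in fits.items()}
best = min(scores.values())

# Rescaled AICc: difference from the best (lowest) score
deltas = {name: s - best for name, s in scores.items()}

# Akaike weights: relative likelihoods normalized to sum to 1.
# Very large deltas underflow to 0.0, matching the zeros in the table.
rel = {name: math.exp(-0.5 * dl) for name, dl in deltas.items()}
total = sum(rel.values())
weights = {name: r / total for name, r in rel.items()}

for name in fits:
    print(f"{name:12s} AICc={scores[name]:10.2f} "
          f"delta={deltas[name]:10.2f} w={weights[name]:.2g}")
```

A weight close to 1 for DPLN means that, among the seven candidates, essentially all of the relative support falls on that model.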
Parameter estimates for the fit of seven distributions to the phenotypic potential data from Saccharomyces cerevisiae.
| Distribution | Parameters |
| DPLN | α = 3.141, β = 3.242, τ = 0.1909, ν = −0.5121 |
| RPLN | α = 4.124, τ = 0.4165, ν = −0.7446 |
| LPLN | β = 5.198, τ = 0.4432, ν = −0.3098 |
| Lognormal | τ = 0.4849, ν = −0.5021 |
| Normal | σ = 0.4151, μ = 0.6854 |
| Exponential | α = 1.459 |
| Pareto | α = 0.3328, xm = 0.03 |
DPLN is the double Pareto-lognormal distribution. RPLN is the right Pareto-lognormal distribution. LPLN is the left Pareto-lognormal distribution. Data are from Table S1 in [29].
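To make the DPLN parameter estimates concrete, the density can be evaluated directly. The source does not restate the density, so the sketch below assumes the usual four-parameter form, in which the logarithm of the variable is the sum of a normal variate (mean ν, standard deviation τ) and an asymmetric Laplace variate with upper-tail index α and lower-tail index β. The function names `norm_cdf` and `dpln_pdf` are illustrative, and only the Python standard library is used.

```python
import math

def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def dpln_pdf(x, alpha, beta, tau, nu):
    """Double Pareto-lognormal density under the assumed parameterization.

    alpha: upper-tail power-law index, beta: lower-tail power-law index,
    tau, nu: standard deviation and mean of the lognormal body on the log scale.
    """
    if x <= 0.0:
        return 0.0
    lx = math.log(x)
    # Upper (right) power-law component, behaves like x**(-alpha - 1) for large x
    upper = math.exp(alpha * nu + 0.5 * alpha**2 * tau**2 - (alpha + 1.0) * lx) \
        * norm_cdf((lx - nu - alpha * tau**2) / tau)
    # Lower (left) power-law component, behaves like x**(beta - 1) near zero
    lower = math.exp(-beta * nu + 0.5 * beta**2 * tau**2 + (beta - 1.0) * lx) \
        * norm_cdf(-(lx - nu + beta * tau**2) / tau)
    return alpha * beta / (alpha + beta) * (upper + lower)

# DPLN estimates from the parameter table above
params = dict(alpha=3.141, beta=3.242, tau=0.1909, nu=-0.5121)

for x in (0.1, 0.3, 0.6, 1.0, 2.0):
    print(f"f({x}) = {dpln_pdf(x, **params):.4g}")
```

With β greater than 1, the density tends to zero as x approaches zero, which is consistent with the requirement in the abstract that the lower tail have a y-intercept close to zero while the mode stays above zero.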
Figure 2. Lower tail of the cumulative distribution function (cdf) of random phenotypic variation (phenotypic potential) in Saccharomyces cerevisiae gene knockouts, and the DPLN, RPLN, LPLN, and lognormal fits.
Data are from Table S1 in [29]. DPLN is the double Pareto-lognormal distribution. RPLN is the right Pareto-lognormal distribution. LPLN is the left Pareto-lognormal distribution. Simple Pareto and exponential distributions omitted.