Steven H Wu, Michael A Black, Robyn A North, Kelly R Atkinson, Allen G Rodrigo.
Abstract
Two-dimensional polyacrylamide gel electrophoresis (2D PAGE) is used to identify differentially expressed proteins and may be applied to biomarker discovery. A limitation of this approach is the inability to detect a protein when its concentration falls below the limit of detection. Consequently, differential expression may be missed when the level of a protein in the cases or controls is below the limit of detection for 2D PAGE. Standard statistical techniques have difficulty dealing with undetected proteins. To address this issue, we propose a mixture model that takes into account both detected and non-detected proteins. Non-detected proteins are classified either as (a) proteins that are not expressed in at least one replicate, or (b) proteins that are expressed but are below the limit of detection. We obtain maximum likelihood estimates of the parameters of the mixture model, including the group-specific probability of expression and mean expression intensities. Differentially expressed proteins can be detected using a Likelihood Ratio Test (LRT). Our simulation results, using data generated from biological experiments, show that the likelihood model has higher statistical power than standard statistical approaches to detect differentially expressed proteins. An R package, Slider (Statistical Likelihood model for Identifying Differential Expression in R), is freely available at http://www.cebl.auckland.ac.nz/slider.php.
Year: 2009 PMID: 19763172 PMCID: PMC2734266 DOI: 10.1371/journal.pcbi.1000509
Source DB: PubMed Journal: PLoS Comput Biol ISSN: 1553-734X Impact factor: 4.475
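The abstract describes a mixture in which a non-detected spot is either not expressed or expressed below the limit of detection. A minimal Python sketch of one group's log-likelihood under that description is given here. The normal distribution for log intensities, the single standard deviation `sigma`, and every function and parameter name are assumptions made for illustration; this is not the code or API of the Slider package.

```python
import math


def normal_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and SD sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def normal_cdf(x, mu, sigma):
    """Cumulative probability of a normal distribution at x."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def group_log_likelihood(detected, n_missing, p, mu, sigma, lod):
    """Log-likelihood of one group (cases or controls) for one spot.

    detected  : observed log intensities (hypothetical input format)
    n_missing : number of replicates in which the spot was not detected
    p         : probability that the protein is expressed in a replicate
    lod       : limit of detection on the same scale as the intensities
    """
    if detected and p <= 0.0:
        return float("-inf")  # a detected value is impossible if never expressed

    # Detected value: the protein is expressed AND has intensity x.
    ll = sum(math.log(p * normal_pdf(x, mu, sigma)) for x in detected)

    if n_missing:
        # Missing value: not expressed, OR expressed but below the limit.
        prob_missing = (1.0 - p) + p * normal_cdf(lod, mu, sigma)
        if prob_missing <= 0.0:
            return float("-inf")
        ll += n_missing * math.log(prob_missing)
    return ll


# Illustrative values only, not data from the study.
example = group_log_likelihood([-3.2, -3.5, -3.4], n_missing=1,
                               p=0.8, mu=-3.4, sigma=0.65, lod=-4.5)
print(example)
```

Maximising this function over `p` and `mu` for each group, and again with parameters shared between groups, would give the two log-likelihoods that a likelihood ratio test compares.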
Results for different mean expression intensities between groups with all spots expressed.
| Difference between means (SD) | Case Mean | Control Mean | Student's t-test | LRT |
| 0 | −3.58 | −3.58 | 3.8% | 3.9% |
| 0.25 | −3.66 | −3.50 | 10.4% | 10.5% |
| 0.5 | −3.74 | −3.42 | 20.1% | 20.6% |
| 0.75 | −3.82 | −3.34 | 40.5% | 41.0% |
| 1 | −3.91 | −3.26 | 64.4% | 64.7% |
| 1.25 | −3.99 | −3.17 | 82.9% | 83.0% |
| 1.5 | −4.07 | −3.09 | 93.7% | 93.7% |
| 1.75 | −4.15 | −3.01 | 98.2% | 98.2% |
| 2 | −4.23 | −2.93 | 99.7% | 99.7% |
| 2.25 | −4.31 | −2.85 | 100% | 100% |
| 2.5 | −4.39 | −2.77 | 100% | 100% |
Proportion of proteins classified as differentially expressed by each model.
*: Difference in mean expression intensities between cases and controls, expressed as proportions of the standard deviation, σ.
Results for equal mean expression intensities but the probability of expression differs between groups.
| | Case: Probability of Expression | | | | | | | | | |
| Control: Probability of Expression | 0.1 | 0.2 | 0.3 | 0.4 | 0.5 | 0.6 | 0.7 | 0.8 | 0.9 | 1 |
| 0.1 | 0.4% | | | | | | | | | |
| 0.2 | 1.3% | 2.4% | | | | | | | | |
| 0.3 | 1.5% | 2.9% | 5.5% | | | | | | | |
| 0.4 | 2.5% | 2.6% | 4.4% | 4.9% | | | | | | |
| 0.5 | 1.6% | 3.2% | 3.7% | 5.5% | 5.5% | | | | | |
| 0.6 | 2.1% | 2.8% | 5.7% | 4.7% | 4.2% | 4.1% | | | | |
| 0.7 | 1.8% | 4.1% | 4.3% | 6.2% | 4.6% | 5.0% | 5.2% | | | |
| 0.8 | 2.4% | 3.8% | 5.9% | 4.4% | 4.2% | 4.7% | 5.3% | 4.5% | | |
| 0.9 | 1.1% | 3.5% | 3.8% | 4.7% | 3.7% | 4.4% | 5.7% | 3.0% | 4.8% | |
| 1 | 1.5% | 4.0% | 5.2% | 5.4% | 4.9% | 5.2% | 6.4% | 5.2% | 3.8% | 5.7% |
| | Case: Probability of Expression | | | | | | | | | |
| Control: Probability of Expression | 0.1 | 0.2 | 0.3 | 0.4 | 0.5 | 0.6 | 0.7 | 0.8 | 0.9 | 1 |
| 0.1 | 1.6% | | | | | | | | | |
| 0.2 | 6.8% | 4.5% | | | | | | | | |
| 0.3 | 20.4% | 7.6% | 5.4% | | | | | | | |
| 0.4 | 39.0% | 16.8% | 7.9% | 5.3% | | | | | | |
| 0.5 | 58.1% | 31.4% | 16.9% | 7.1% | 4.1% | | | | | |
| 0.6 | 78.1% | 53.6% | 33.8% | 17.3% | 7.5% | 5.1% | | | | |
| 0.7 | 89.3% | 71.2% | 49.7% | 32.1% | 14.0% | 5.7% | 4.3% | | | |
| 0.8 | 96.6% | 86.6% | 74.0% | 51.4% | 29.8% | 16.7% | 7.4% | 4.7% | | |
| 0.9 | 99.8% | 97.4% | 88.4% | 73.2% | 55.3% | 38.1% | 22.2% | 6.4% | 3.2% | |
| 1 | 100.0% | 99.9% | 99.4% | 95.4% | 89.2% | 70.4% | 42.7% | 20.8% | 6.1% | 6.2% |
| | Case: Probability of Expression | | | | | | | | | |
| Control: Probability of Expression | 0.1 | 0.2 | 0.3 | 0.4 | 0.5 | 0.6 | 0.7 | 0.8 | 0.9 | 1 |
| 0.1 | 4.7% | | | | | | | | | |
| 0.2 | 4.1% | 4.5% | | | | | | | | |
| 0.3 | 3.9% | 4.9% | 6.4% | | | | | | | |
| 0.4 | 5.3% | 4.3% | 4.6% | 5.3% | | | | | | |
| 0.5 | 17.7% | 6.6% | 5.5% | 5.2% | 4.7% | | | | | |
| 0.6 | 25.3% | 11.6% | 7.2% | 6.1% | 4.4% | 4.3% | | | | |
| 0.7 | 37.7% | 20.6% | 9.7% | 6.9% | 4.9% | 4.9% | 5.4% | | | |
| 0.8 | 61.7% | 30.6% | 18.2% | 10.2% | 6.1% | 6.7% | 6.4% | 4.2% | | |
| 0.9 | 77.2% | 47.4% | 27.4% | 15.3% | 9.5% | 6.9% | 6.5% | 2.8% | 5.1% | |
| 1 | 97.8% | 85.9% | 52.4% | 28.3% | 14.3% | 9.7% | 8.3% | 6.9% | 4.7% | 6.4% |
Proportion of proteins classified as differentially expressed by each model.
Results for fixed difference in mean expression intensities and varying limits of detection.
| Quantile on the normal distribution | Limits of detection | Student's t-test exclude missing data | Student's t-test global minimum for missing data | Likelihood Ratio Test |
| 0% | -Infinity | 86.3% | 84.2% | 84.4% |
| 5% | −5.06 | 82.3% | 83.9% | 81.1% |
| 10% | −4.82 | 74.1% | 82.3% | 74.6% |
| 15% | −4.66 | 68.5% | 82.2% | 71.3% |
| 20% | −4.53 | 61.6% | 83.6% | 69.7% |
| 25% | −4.43 | 52.5% | 82.4% | 64.7% |
| 30% | −4.33 | 45.6% | 80.1% | 59.2% |
| 35% | −4.24 | 35.5% | 82.1% | 57.2% |
| 40% | −4.15 | 31.3% | 80.8% | 56.2% |
| 45% | −4.07 | 21.6% | 78.7% | 46.3% |
| 50% | −3.99 | 15.4% | 79.9% | 43.6% |
Proportion of proteins classified as differentially expressed by each model.
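The limits of detection in this table are quantiles of a normal distribution, which can be checked with a short calculation. The sketch below takes the mean from the 50% row (−3.99) and assumes a standard deviation of 0.65, a value inferred from the 0.65 gap between case and control means at a 1 SD difference in the first results table; neither the value of `SIGMA` nor the function name comes from the original text. Under these assumptions the computed limits agree with the tabulated ones to within about 0.01.

```python
from statistics import NormalDist

MU = -3.99    # mean, read from the 50% quantile row of the table
SIGMA = 0.65  # ASSUMPTION: inferred from the first results table, not stated in the text

# Limits of detection as reported in the table, keyed by quantile.
REPORTED = {
    0.05: -5.06, 0.10: -4.82, 0.15: -4.66, 0.20: -4.53, 0.25: -4.43,
    0.30: -4.33, 0.35: -4.24, 0.40: -4.15, 0.45: -4.07, 0.50: -3.99,
}


def limit_of_detection(quantile, mu=MU, sigma=SIGMA):
    """Intensity below which a fraction `quantile` of expressed values falls."""
    if quantile <= 0.0:
        return float("-inf")  # a 0% quantile means nothing is censored
    return NormalDist(mu, sigma).inv_cdf(quantile)


for q, reported in REPORTED.items():
    computed = limit_of_detection(q)
    print(f"{q:.0%}: computed {computed:.2f}, reported {reported:.2f}")
```

Read this way, the 50% row corresponds to a limit of detection at the mean itself, so roughly half of the expressed values in that group would go undetected.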
Result for different mean expression intensities and same probability of expression between two groups.
| Probability of Expression | M:0 | M:0.25 | M:0.5 | M:0.75 | M:1 | M:1.25 | M:1.5 | M:1.75 | M:2 |
| 0.1 | 0.4% | 0.9% | 1.0% | 1.4% | 2.5% | 4.1% | 4.1% | 5.4% | 6.7% |
| 0.2 | 2.5% | 4.0% | 5.7% | 7.5% | 12.5% | 15.5% | 24.2% | 29.5% | 35.0% |
| 0.3 | 4.5% | 4.4% | 8.1% | 15.4% | 24.4% | 30.0% | 40.9% | 52.9% | 61.1% |
| 0.4 | 4.1% | 6.0% | 10.1% | 20.6% | 31.0% | 44.3% | 57.0% | 69.3% | 80.0% |
| 0.5 | 4.6% | 5.9% | 11.2% | 23.4% | 40.5% | 53.1% | 72.5% | 82.0% | 88.9% |
| 0.6 | 5.7% | 8.1% | 13.8% | 29.6% | 46.8% | 63.4% | 79.4% | 90.2% | 94.5% |
| 0.7 | 5.5% | 8.7% | 17.5% | 33.4% | 55.3% | 73.5% | 83.1% | 94.5% | 97.6% |
| 0.8 | 5.5% | 8.8% | 20.9% | 37.2% | 60.1% | 76.2% | 91.8% | 96.0% | 99.1% |
| 0.9 | 5.5% | 9.4% | 23.0% | 41.0% | 64.6% | 83.0% | 94.5% | 98.1% | 99.5% |
| 1 | 5.9% | 9.1% | 20.8% | 42.6% | 70.5% | 84.2% | 95.9% | 99.0% | 99.9% |
| Probability of Expression | M:0 | M:0.25 | M:0.5 | M:0.75 | M:1 | M:1.25 | M:1.5 | M:1.75 | M:2 |
| 0.1 | 1.3% | 1.7% | 1.4% | 1.6% | 1.7% | 1.6% | 1.2% | 2.0% | 1.9% |
| 0.2 | 4.6% | 3.3% | 4.9% | 2.1% | 4.0% | 4.0% | 3.4% | 4.4% | 4.6% |
| 0.3 | 5.9% | 4.5% | 4.7% | 3.5% | 5.7% | 6.1% | 6.4% | 5.6% | 4.6% |
| 0.4 | 5.2% | 5.7% | 4.7% | 4.2% | 5.0% | 5.7% | 5.8% | 6.4% | 7.8% |
| 0.5 | 5.0% | 4.9% | 4.4% | 6.6% | 8.0% | 7.4% | 7.8% | 8.7% | 9.6% |
| 0.6 | 5.5% | 6.4% | 5.3% | 6.7% | 6.2% | 7.3% | 7.8% | 8.7% | 9.3% |
| 0.7 | 6.5% | 5.4% | 6.1% | 7.1% | 7.1% | 10.0% | 9.6% | 11.2% | 17.8% |
| 0.8 | 4.8% | 5.7% | 6.3% | 8.6% | 11.7% | 13.2% | 17.3% | 19.1% | 24.7% |
| 0.9 | 2.7% | 4.7% | 8.1% | 12.3% | 20.4% | 25.4% | 30.4% | 35.8% | 38.9% |
| 1 | 5.4% | 8.1% | 19.1% | 38.8% | 65.4% | 82.0% | 94.5% | 98.5% | 99.8% |
| Probability of Expression | M:0 | M:0.25 | M:0.5 | M:0.75 | M:1 | M:1.25 | M:1.5 | M:1.75 | M:2 |
| 0.1 | 5.2% | 5.0% | 5.9% | 7.5% | 7.2% | 9.5% | 10.4% | 11.7% | 15.0% |
| 0.2 | 4.0% | 5.7% | 5.6% | 9.0% | 11.9% | 14.2% | 15.9% | 23.9% | 29.3% |
| 0.3 | 5.2% | 5.5% | 8.9% | 12.2% | 18.9% | 24.8% | 32.6% | 41.9% | 47.6% |
| 0.4 | 4.4% | 6.0% | 8.2% | 16.3% | 25.3% | 35.7% | 46.9% | 56.8% | 69.3% |
| 0.5 | 4.1% | 5.9% | 12.1% | 20.2% | 34.1% | 44.9% | 61.8% | 71.8% | 81.1% |
| 0.6 | 5.1% | 7.8% | 12.4% | 25.7% | 38.2% | 55.2% | 71.0% | 83.0% | 89.6% |
| 0.7 | 4.9% | 8.0% | 15.8% | 30.2% | 47.1% | 66.9% | 76.7% | 89.2% | 94.0% |
| 0.8 | 5.0% | 9.4% | 19.8% | 33.2% | 56.7% | 71.1% | 88.1% | 92.0% | 97.8% |
| 0.9 | 5.8% | 7.9% | 22.4% | 36.8% | 60.4% | 79.2% | 90.7% | 95.9% | 98.8% |
| 1 | 5.5% | 8.4% | 19.3% | 39.3% | 65.5% | 82.3% | 94.6% | 98.6% | 99.8% |
Proportion of proteins classified as differentially expressed by each model.
Figure 1. Five differentially expressed spots.
(A) Five differentially expressed spots identified by the LRT on 2D PAGE. (B) Scatter plot of the five spots. PE = preeclampsia cases. C = healthy controls.
Five differentially expressed spots identified by the Likelihood Ratio Test.
| Spot | Group | Estimated Mean | Estimated Probability of Expression | Log Maximum Likelihood, Null Model | Log Maximum Likelihood, Alternative Model | Likelihood Ratio Statistic | 95% Quantile |
| Spot 93 | Case | −3.29 | 0.33 | −26.81 | −10.46 | 16.41 | 11.58 |
| | Control | −2.47 | 0.33 | | −8.15 | | |
| Spot 289 | Case | −1.55 | 0.33 | −34.79 | −13.21 | 15.11 | 12.49 |
| | Control | −0.54 | 0.67 | | −14.02 | | |
| Spot 390 | Case | −2.44 | 0.75 | −21.47 | −12.34 | 18.26 | 13.66 |
| | Control | −2.48 | 0.00 | | −0.001 | | |
| Spot 435 | Case | −1.09 | 1.00 | −63.87 | −5.90 | 41.37 | 27.86 |
| | Control | −1.49 | 1.00 | | −37.28 | | |
| Spot 686 | Case | −4.90 | 1.00 | −26.64 | −0.69 | 21.34 | 17.20 |
| | Control | −4.48 | 0.42 | | −15.28 | | |
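The likelihood ratio statistics in this table can be recovered from the tabulated log-likelihoods. The sketch below assumes the usual form of the statistic, twice the difference between the alternative and null log maximum likelihoods, with the alternative taken as the sum of the case and control values; the function and variable names are illustrative. Small discrepancies of about 0.01 are expected because the tabulated inputs are rounded.

```python
# Per spot: (null log-likelihood, alternative log-likelihood for cases,
# alternative log-likelihood for controls, reported statistic, 95% quantile).
SPOTS = {
    "Spot 93":  (-26.81, -10.46,  -8.15,  16.41, 11.58),
    "Spot 289": (-34.79, -13.21, -14.02,  15.11, 12.49),
    "Spot 390": (-21.47, -12.34,  -0.001, 18.26, 13.66),
    "Spot 435": (-63.87,  -5.90, -37.28,  41.37, 27.86),
    "Spot 686": (-26.64,  -0.69, -15.28,  21.34, 17.20),
}


def lr_statistic(ll_null, ll_case, ll_control):
    """Twice the gain in log-likelihood from fitting the groups separately."""
    return 2.0 * ((ll_case + ll_control) - ll_null)


for name, (ll_null, ll_case, ll_control, reported, threshold) in SPOTS.items():
    stat = lr_statistic(ll_null, ll_case, ll_control)
    flagged = stat > threshold  # exceeds the spot-specific 95% quantile
    print(f"{name}: computed {stat:.2f}, reported {reported:.2f}, flagged={flagged}")
```

Each spot is compared against its own 95% quantile, as listed in the table, and all five statistics exceed their thresholds.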
Comparison of performance across the four simulation analyses.
| Method | Simulation 1 | Simulation 2 | Simulation 3 | Simulation 4 |
| Student's t-test, missing values excluded | Good | Low power | Low power | Good |
| Student's t-test, missing values replaced with global minimum | Not applicable | Good | Good | Low power |
| Likelihood Ratio Test | Good | Reasonable | Reasonable | Good |