Abstract
BACKGROUND: Many studies have provided algorithms or methods for assessing statistical significance in quantitative proteomics when multiple replicates of a protein sample and of the LC/MS analysis are available. However, confidence is still lacking when datasets without protein sample replicates are used for biological interpretation. Although a fold-change is a conventional threshold that can be used when there are no sample replicates, it does not provide an assessment of statistical significance such as a false discovery rate (FDR), which is an important indicator of how reliably differentially expressed proteins are identified. In this work, we investigate whether differentially expressed proteins can be detected with statistical significance from a pair of unlabeled protein samples without replicates and with only duplicate LC/MS injections per sample. An FDR is used to gauge the statistical significance of the differentially expressed proteins.
Year: 2009 PMID: 19187558 PMCID: PMC2645366 DOI: 10.1186/1471-2105-10-43
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. Operation flowchart of protein sample preparation and LC/MS analysis. See Methods.
Figure 2. Base-peak intensity chromatogram alignment. Two base-peak intensity chromatogram alignments are shown: the raw trace RP,1 aligned with the paired trace RP,2 (A) and with the paired trace SP,1 (B). The alignment was performed with the procedure described in reference [21].
Calculation of FDRs for the sample pair SP/RP
| | | Statistical test option | | | | |
| Proteins | Source of significant proteins | t-test | ranksum | either | both | neither |
| UL | RA1 | 32 | 42 | 44 | 30 | 92 |
| UL | RA2 | 31 | 38 | 41 | 28 | 99 |
| UL | RA3 | 29 | 38 | 41 | 26 | 107 |
| UL | RA4 | 28 | 40 | 41 | 27 | 108 |
| UL | MPSP | 22 | 29 | 29 | 22 | 61 |
| IS | RA1 | 9 | 14 | 15 | 8 | 72 |
| IS | RA2 | 4 | 10 | 10 | 4 | 57 |
| IS | RA3 | 5 | 14 | 14 | 5 | 84 |
| IS | RA4 | 5 | 14 | 14 | 5 | 68 |
| IS | MPSP | 1 | 5 | 5 | 1 | 28 |
| FDR | | 0.045 | 0.172 | 0.172 | 0.045 | 0.459 |
FDRs were calculated for the sample pair SP/RP based on separate quantitation of the UL protein relative abundance and the IS protein relative abundance. The fold-change was fixed at 2 and the MPSP at 4. The t-test and the Wilcoxon ranksum test were performed at a 5% significance level. RA1, RA2, RA3, and RA4 represent the protein relative abundance for the four permuted sample-pairings. Each 'RAx' row (x = 1, 2, 3, or 4) gives the number of proteins found significant in that permuted sample-pairing under each of the five statistical test options: the t-test alone ('t-test'), the ranksum test alone ('ranksum'), either the t-test or the ranksum test ('either'), both the t-test and the ranksum test ('both'), and no statistical test ('neither'). The two 'MPSP' rows give the numbers of UL and IS proteins, respectively, found significant in all four permuted sample-pairings. The FDR is the ratio of the number of significant IS proteins to the number of significant UL proteins, taken from the 'MPSP' rows.
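The FDR row can be reproduced directly from the two 'MPSP' rows. The following is a minimal sketch of that arithmetic; the names `UL_MPSP`, `IS_MPSP`, and `fdr` are illustrative and do not come from the paper.

```python
# Counts of proteins significant in all four permuted sample-pairings
# (the two 'MPSP' rows of the table), keyed by statistical test option.
UL_MPSP = {"t-test": 22, "ranksum": 29, "either": 29, "both": 22, "neither": 61}
IS_MPSP = {"t-test": 1, "ranksum": 5, "either": 5, "both": 1, "neither": 28}


def fdr(n_is, n_ul):
    """FDR as defined in the caption: significant IS / significant UL."""
    if n_ul == 0:
        raise ValueError("FDR is undefined when no UL proteins are significant")
    return n_is / n_ul


# Round to three decimals, as in the table.
fdr_by_option = {
    option: round(fdr(IS_MPSP[option], UL_MPSP[option]), 3) for option in UL_MPSP
}

for option, value in fdr_by_option.items():
    print(f"{option:8s} {value:.3f}")
```

With these counts, requiring the t-test (alone or together with the ranksum test) gives the lowest FDR, while applying no statistical test gives an FDR roughly ten times higher.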
Figure 3. Effect of fold-change and MPSP on FDR. The t-test was performed at a significance level of 5%. The fold-change varied from 1 to 3 in increments of 0.25. The MPSP varied from 1 to 4.
Figure 4. Effect of statistical test and fold-change on FDR. MPSP was fixed at 4. The fold-change varied from 1 to 5 in increments of 0.25. With the t-test and the Wilcoxon ranksum test performed at a significance level of 5%, five combinations of the two tests were applied as the statistical test criterion: a) without statistical test (neither); b) significant by either the t-test or the Wilcoxon ranksum test (either); c) significant by the Wilcoxon ranksum test alone (ranksum); d) significant by the t-test alone (t-test); e) significant by both the t-test and the Wilcoxon ranksum test (both).
Figure 5. Effect of minPCS and fold-change on FDR. MPSP was fixed at 1. The t-test was applied at a significance level of 5%. The fold-change varied from 1 to 5. The minPCS varied from 1 to 8.
Figure 6. ROC analysis. The ROC curves were generated by varying the fold-change threshold from 1 to 5 in increments of 0.25. Positives and false positives were identified by a combination of three criteria: a fold-change, a statistical test, and an MPSP. The MPSP was fixed at 1, 2, 3, and 4 in panels A, B, C, and D, respectively. The t-test (blue curves) and the Wilcoxon ranksum test (pink curves) were compared in each panel. Both tests were performed at a significance level of 5%. For the blue curves, the arrows indicate an abrupt change in FDR, and the text labels give the fold-change at which that abrupt change occurs.
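The five statistical test options combine with the fold-change and MPSP thresholds into a single decision rule per protein. The sketch below shows one way to express it, under stated assumptions: MPSP is read as the minimum number of permuted sample-pairings in which a protein must pass (consistent with how the table caption uses it), the fold-change is applied symmetrically to up- and down-regulation, and the p-values are taken as precomputed inputs. All function names and the example numbers are hypothetical, not taken from the paper.

```python
def passes_test(option, p_ttest, p_ranksum, alpha=0.05):
    """Apply one of the five statistical test options to a pair of p-values."""
    t_sig = p_ttest < alpha
    r_sig = p_ranksum < alpha
    if option == "neither":
        return True  # no statistical test is required
    if option == "t-test":
        return t_sig
    if option == "ranksum":
        return r_sig
    if option == "either":
        return t_sig or r_sig
    if option == "both":
        return t_sig and r_sig
    raise ValueError(f"unknown option: {option}")


def exceeds_fold_change(ratio, fold_change):
    """Assumption: the threshold is symmetric (up- or down-regulated)."""
    return ratio >= fold_change or ratio <= 1.0 / fold_change


def is_significant(pairings, option, fold_change=2.0, mpsp=4, alpha=0.05):
    """pairings: one (ratio, p_ttest, p_ranksum) tuple per permuted pairing."""
    hits = sum(
        1
        for ratio, p_t, p_r in pairings
        if exceeds_fold_change(ratio, fold_change)
        and passes_test(option, p_t, p_r, alpha)
    )
    return hits >= mpsp


# Hypothetical protein: made-up ratios and p-values for four pairings.
example = [(2.5, 0.01, 0.08), (3.0, 0.02, 0.03), (2.2, 0.04, 0.10), (2.1, 0.03, 0.06)]
for opt in ("neither", "either", "ranksum", "t-test", "both"):
    print(opt, is_significant(example, opt))
```

Lowering `mpsp` relaxes the rule: the same hypothetical protein fails the 'both' option at an MPSP of 4 but passes it at an MPSP of 1, because only one of its pairings is significant by both tests.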
Validation using different combinations among samples
| Validation sample set | Sample | Control | P | FP | FDR |
| I | SP/RP (UL) | SP/RP (IS) | 22 | 1 | 0.045 |
| II | SA/SB (UL) | SA/SB (IS) | 8 | 2 | - |
| II | SB/SC (UL) | SB/SC (IS) | 0 | 1 | - |
| II | SC/SA (UL) | SC/SA (IS) | 1 | 0 | - |
| II | Average | | 3.0 | 1.0 | 0.33 |
| III | SP/RP (UL) | SA/SB (UL) | 22 | 8 | - |
| III | SP/RP (UL) | SB/SC (UL) | 22 | 0 | - |
| III | SP/RP (UL) | SC/SA (UL) | 22 | 1 | - |
| III | Average | | 22 | 3.0 | 0.14 |
| IV | SP/RP (IS) | SA/SB (IS) | 1 | 2 | - |
| IV | SP/RP (IS) | SB/SC (IS) | 1 | 1 | - |
| IV | SP/RP (IS) | SC/SA (IS) | 1 | 0 | - |
| IV | Average | | 1.0 | 1.0 | 1.0 |
| V | SP/RP (UL) | Inj.1 vs Inj.2 (UL) of SP, SA, SB, and SC | 22 | 0 | 0.00 |
Sample set I represents the target of validation. The parameters used for FDR calculation were: 2-fold change, 4 MPSP, and t-test (p < 0.05). P, positives. FP, false positives. '-', not calculated. See text for more details.
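The 'Average' rows for validation sets II, III, and IV are numerically consistent with averaging P and FP over the three comparisons and then dividing mean FP by mean P. This is inferred from the table values rather than stated in the caption, so the sketch below should be read as a consistency check, with `VALIDATION` and `averaged_fdr` as illustrative names.

```python
# (P, FP) pairs for the three comparisons in each validation sample set,
# copied from the table.
VALIDATION = {
    "II": [(8, 2), (0, 1), (1, 0)],
    "III": [(22, 8), (22, 0), (22, 1)],
    "IV": [(1, 2), (1, 1), (1, 0)],
}


def averaged_fdr(pairs):
    """Return (mean P, mean FP, mean FP / mean P) for a list of (P, FP) pairs.

    Assumption: the averaged FDR is the ratio of the means, not the mean of
    per-comparison ratios (which would be undefined when P is 0).
    """
    mean_p = sum(p for p, _ in pairs) / len(pairs)
    mean_fp = sum(fp for _, fp in pairs) / len(pairs)
    return mean_p, mean_fp, mean_fp / mean_p


for name, pairs in VALIDATION.items():
    mean_p, mean_fp, value = averaged_fdr(pairs)
    print(f"{name}: P={mean_p:.1f} FP={mean_fp:.1f} FDR={value:.2f}")
```

Taking the ratio of the means sidesteps the SB/SC comparison in set II, where P is 0 and a per-comparison FDR could not be computed, which may be why those cells are marked '-'.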