Cong Zhou, Kathryn L Simpson, Lee J Lancashire, Michael J Walker, Martin J Dawson, Richard D Unwin, Agata Rembielak, Patricia Price, Catharine West, Caroline Dive, Anthony D Whetton.
Abstract
A mass spectrometry-based plasma biomarker discovery workflow was developed to facilitate biomarker discovery. Plasma from either healthy volunteers or patients with pancreatic cancer was 8-plex iTRAQ labeled, fractionated by 2-dimensional reversed phase chromatography and subjected to MALDI ToF/ToF mass spectrometry. Data were processed using a q-value based statistical approach to maximize protein quantification and identification. Technical (between duplicate samples) and biological variance (between and within individuals) were calculated and power analysis was thereby enabled. An a priori power analysis was carried out using samples from healthy volunteers to define sample sizes required for robust biomarker identification. The result was subsequently validated with a post hoc power analysis using a real clinical setting involving pancreatic cancer patients. This demonstrated that six samples per group (e.g., pre- vs post-treatment) may provide sufficient statistical power for most proteins with changes greater than 2-fold. A reference standard allowed direct comparison of protein expression changes between multiple experiments. Analysis of patient plasma prior to treatment identified 29 proteins with significant changes within individual patients. Changes in Peroxiredoxin II levels were confirmed by Western blot. This q-value based statistical approach in combination with reference standard samples can be applied with confidence in the design and execution of clinical studies for predictive, prognostic, and/or pharmacodynamic biomarker discovery. The power analysis provides information required prior to study initiation.
Year: 2012 PMID: 22338609 PMCID: PMC3320746 DOI: 10.1021/pr200636x
Source DB: PubMed Journal: J Proteome Res ISSN: 1535-3893 Impact factor: 4.466
Figure 1. Design of the 8 channel isobaric tagging experiments for relative quantification of proteins from plasma. (A) Methodological workflow of sample analysis. Plasma depletion was achieved using an antibody based removal of the 20 major proteins found in human plasma. This was followed by tryptic digestion of the analyte and peptide tagging in the 8 different samples with 8 distinct isobaric tags that enable relative quantification of peptides from the 8 samples by tandem mass spectrometry. Peptides from the 8 samples were pooled and then fractionated using high pH reverse phase liquid chromatography (LC). Fractions were then spotted onto MALDI target plates by low pH RP LC and plates were analyzed by MSMS in a 5800 MALDI ToF/ToF instrument (Applied Biosystems). A number of data manipulations were then performed to assess the value of the workflow. (B) Experimental setup and labeling. In Experiment 1, plasma from healthy individuals was taken 16 h apart and processed and analyzed in duplicate to assess technical and biological (within and between person) variation. In experiments 2 and 3, plasma was isolated from patients enrolled in the PACER pancreatic cancer trial at the Christie Hospital, Manchester, U.K. and 2 pretreatment samples were taken one week apart. To compare across experiments, a pooled reference control was used (pool) containing an aliquot of all plasma samples from experiments 2 and 3 points prior to sample depletion and in a 1:1 ratio for all samples.
Use of a Target-Decoy Database Search of the Experiment 1 Data Set Using Different q-Value Thresholds
| q-value threshold | PSMs: Target | PSMs: Decoy | PSMs: FDR | Quantified Peptides: Target | Quantified Peptides: Decoy | Quantified Peptides: FDR | Quantified Proteins: Target | Quantified Proteins: Decoy | Quantified Proteins: FDR |
|---|---|---|---|---|---|---|---|---|---|
| None | 69040 | 16266 | 0.24 | 7947 | 56 | 0.007 | 462 | 31 | 0.067 |
| 0.05 | 30672 | 1614 | 0.05 | 7763 | 53 | 0.007 | 459 | 29 | 0.063 |
| 0.001 | 18828 | 77 | 0.002 | 6208 | 2 | 0.0003 | 391 | 2 | 0.005 |
The number of target and decoy Peptide Spectral Matches (PSMs), quantified peptides and quantified proteins are shown for four choices of q-value threshold together with the calculated FDR (no threshold indicates the ProteinPilot default output).
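As a rough illustration of how the FDR columns relate to the counts, the sketch below assumes the FDR is simply the number of decoy hits divided by the number of target hits. That assumption reproduces most of the tabulated values but not all of them (for the PSMs at a threshold of 0.001 it gives about 0.004 rather than 0.002), so the published figures may have been derived somewhat differently. The function name `decoy_fdr` is hypothetical.

```python
def decoy_fdr(target: int, decoy: int) -> float:
    """Estimate FDR as decoy hits / target hits (an assumed definition)."""
    if target <= 0:
        raise ValueError("target count must be positive")
    return decoy / target


# (target, decoy) counts copied from the table, keyed by q-value threshold.
counts = {
    "None": {"PSMs": (69040, 16266), "peptides": (7947, 56), "proteins": (462, 31)},
    "0.05": {"PSMs": (30672, 1614), "peptides": (7763, 53), "proteins": (459, 29)},
    "0.001": {"PSMs": (18828, 77), "peptides": (6208, 2), "proteins": (391, 2)},
}

for threshold, levels in counts.items():
    for level, (target, decoy) in levels.items():
        print(f"q={threshold:>5}  {level:<8}  FDR ~ {decoy_fdr(target, decoy):.4f}")
```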
Number of Proteins Identified (% of Total) with Different Variation Cut-offs for Technical and Biological Replicates
| variation of log2 protein ratio | technical replicate (% of total) | within person (% of total) | between person (% of total) |
|---|---|---|---|
| ±10% | 37 | 63 | 53 |
| ±20% | 61 | 69 | 61 |
| ±30% | 74 | 75 | 69 |
| ±40% | 83 | 81 | 74 |
| ±50% | 88 | 84 | 79 |
| ±60% | 92 | 87 | 82 |
| ±70% | 94 | 88 | 85 |
| ±80% | 96 | 90 | 87 |
| ±90% | 97 | 92 | 89 |
| ±100% | 98 | 94 | 90 |
Using data obtained in Experiment 1 the accuracy and amount of data that fell within various error ranges was calculated. Between person and within person variation listed here were derived from the observed data by removing the variance component from technical variation.
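The removal of the technical component described above can be sketched as a simple subtraction of variances. This is a minimal illustration that assumes the observed variance is the sum of independent biological and technical components; the function name and the numbers are hypothetical and are not taken from the study.

```python
def biological_variance(observed_var: float, technical_var: float) -> float:
    """Subtract the technical variance component from the observed variance.

    Assumes the two components are independent and additive; the result is
    floored at zero because a variance cannot be negative.
    """
    return max(observed_var - technical_var, 0.0)


# Hypothetical variances of log2 protein ratios for a single protein.
observed = 0.30   # variance seen between samples (biological + technical)
technical = 0.10  # variance seen between duplicate runs of the same sample
print(biological_variance(observed, technical))
```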
Estimated Sample Sizes Required Per Group
| effect size | 1-β | number of replicates | variance: 70th percentile | variance: 75th percentile | variance: 80th percentile | variance: 85th percentile | variance: maximum |
|---|---|---|---|---|---|---|---|
| log2(1.7) | 0.7 | 1 | 5 | 8 | 12 | 17 | 453 |
| log2(1.7) | 0.7 | 2 | 4 | 7 | 11 | 16 | 452 |
| log2(1.7) | 0.7 | 4 | 4 | 6 | 10 | 15 | 452 |
| log2(1.7) | 0.8 | 1 | 7 | 10 | 15 | 21 | 576 |
| log2(1.7) | 0.8 | 2 | 5 | 9 | 14 | 20 | 575 |
| log2(1.7) | 0.8 | 4 | 5 | 8 | 13 | 19 | 574 |
| log2(2.0) | 0.7 | 1 | 3 | 5 | 7 | 10 | 266 |
| log2(2.0) | 0.7 | 2 | 3 | 4 | 6 | 10 | 265 |
| log2(2.0) | 0.7 | 4 | 2 | 4 | 6 | 9 | 264 |
| log2(2.0) | 0.8 | 1 | 4 | 6 | 9 | 13 | 338 |
| log2(2.0) | 0.8 | 2 | 3 | 5 | 8 | 12 | 337 |
| log2(2.0) | 0.8 | 4 | 3 | 5 | 8 | 12 | 337 |
Sample sizes were estimated from the variance data obtained in Experiment 1 (healthy volunteers) and are reported per group for several choices of effect size and variance level. For example, to detect a change greater than 2-fold with a power of 0.8, 3 samples per group (e.g., 3 pre-treatment vs. 3 post-treatment) with 2 technical replicates would be sufficient for proteins whose variance lies at the 70th percentile.
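To show how effect size, power, variance and technical replication interact in a calculation of this kind, here is a minimal sketch based on the standard normal-approximation formula for comparing two group means. It is not necessarily the model used in the study: the two-sided significance level of 0.05, the way technical replicates reduce the technical variance, and all variance values are assumptions made for illustration, so the output is not expected to reproduce the table.

```python
from math import ceil, log2
from statistics import NormalDist


def samples_per_group(effect_log2: float, bio_var: float, tech_var: float,
                      replicates: int, power: float = 0.8,
                      alpha: float = 0.05) -> int:
    """Normal-approximation sample size for a two-group comparison of means.

    Averaging over technical replicates is assumed to divide the technical
    variance by the number of replicates; biological variance is unaffected.
    """
    z = NormalDist()
    z_sum = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)
    total_var = bio_var + tech_var / replicates
    return ceil(2 * z_sum ** 2 * total_var / effect_log2 ** 2)


# Hypothetical variances of log2 ratios; effect size is a 2-fold change.
effect = log2(2.0)
for r in (1, 2, 4):
    n = samples_per_group(effect, bio_var=0.2, tech_var=0.1, replicates=r)
    print(f"{r} technical replicate(s): {n} samples per group")
```

Under these assumptions, extra technical replicates help only while technical variance is a sizeable share of the total, which is consistent with the small gains from 2 to 4 replicates seen in the table.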
Figure 2. Bland-Altman plot for pooled reference reproducibility across iTRAQ experiments 2 and 3 (PACER day 0 and PACER day 7 clinical samples). A total of 3 patients, each with duplicate samples at day 0 and day 7, contributed equally to a pooled reference of 12 samples. Proteins that were identified and quantified in experiments 2 and 3 and originated from the pooled reference sample, as defined by the incorporation of iTRAQ labels 113 and 114, were analyzed for agreement in log2 iTRAQ ratios (ideally equivalent between both experiments). Agreement was assessed by plotting the mean of the two measurements against their difference; the smaller the difference, the greater the reproducibility. The limits of agreement are shown as the average difference ±1.96 standard deviations.
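The quantities behind a Bland-Altman plot of this kind can be computed as in the sketch below. The function name and the log2 ratios are hypothetical; only the definition of the limits of agreement (average difference ±1.96 standard deviations) comes from the caption.

```python
from statistics import mean, stdev


def bland_altman(ratios_a, ratios_b):
    """Return per-protein means and differences, the bias and the limits of agreement."""
    if len(ratios_a) != len(ratios_b):
        raise ValueError("both experiments must cover the same proteins")
    means = [(a + b) / 2 for a, b in zip(ratios_a, ratios_b)]
    diffs = [a - b for a, b in zip(ratios_a, ratios_b)]
    bias = mean(diffs)   # average difference between the experiments
    sd = stdev(diffs)    # sample standard deviation of the differences
    return means, diffs, bias, (bias - 1.96 * sd, bias + 1.96 * sd)


# Hypothetical log2 ratios of the pooled reference for four proteins.
experiment_2 = [0.1, 0.2, 0.3, 0.4]
experiment_3 = [0.0, 0.3, 0.2, 0.5]
means, diffs, bias, (lower, upper) = bland_altman(experiment_2, experiment_3)
print(f"bias = {bias:.3f}, limits of agreement = ({lower:.3f}, {upper:.3f})")
```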
Proteins with Differential Expression in the PACER Study between Pretreatment Day 0 (Experiment 2) and Day 7 (Experiment 3)
| protein names | patient C, day 7:0 | patient D, day 7:0 | patient E, day 7:0 |
|---|---|---|---|
| Anti-(ED-B) scFV (Fragment) | 1.395 | 0.663 | |
| ALDH1A1 Retinal dehydrogenase 1 | 1.099 | 1.469 | |
| APOL1 Isoform 2 of Apolipoprotein L1 | 1.059 | 0.901 | |
| CA1 Carbonic anhydrase 1 | 1.542 | 1.758 | |
| CETP Isoform 1 of Cholesteryl ester transfer protein | 1.561 | 1.081 | |
| CFP Properdin | 1.206 | 1.257 | |
| CRP Isoform 1 of C-reactive protein | 1.658 | 1.312 | |
| FETUB Fetuin-B | 0.953 | 1.034 | |
| GAPDH Glyceraldehyde 3-phosphate dehydrogenase | 0.750 | 1.207 | |
| GOT1 Aspartate aminotransferase, cytoplasmic | 1.227 | 1.017 | |
| HGFAC Hepatocyte growth factor activator | 0.802 | 1.595 | |
| HSP90B1 Endoplasmin | 0.592 | 0.634 | |
| KRT5 Keratin, type II cytoskeletal 5 | 0.515 | 0.696 | |
| PARK7 Protein DJ-1 | 1.572 | 1.444 | |
| PDLIM1 PDZ and LIM domain protein 1 | 1.044 | 0.520 | |
| PFN1 Profilin-1 | 0.947 | 1.785 | |
| PRDX6 Peroxiredoxin-6 | 1.412 | 1.492 | |
| PTPRG Isoform 1 of Receptor-type tyrosine-protein phosphatase gamma | 1.829 | 1.351 | |
| SAA2 Serum amyloid A protein | 1.151 | 0.764 | |
| TALDO1 Transaldolase | 1.082 | 1.154 | |
| TPI1 Isoform 1 of Triosephosphate isomerase | 0.906 | 1.738 | |
| TMSB4X TMSB4X protein (Fragment) | 0.797 | ||
| PRDX2 Peroxiredoxin-2 | 1.643 | ||
| CAT Catalase | 0.947 | ||
| HBA1 Hemoglobin subunit alpha | 1.740 | ||
| HBB Hemoglobin subunit beta | 1.580 | ||
| HBD Hemoglobin subunit delta | 1.579 | ||
| IGHA1 cDNA FLJ90170 fis | 1.678 | 1.937 | |
| PDLIM7 Isoform 1 of PDZ and LIM domain protein 7 | 1.516 | 0.954 | |
Proteins identified to be differentially expressed in at least one patient are listed in the table and the significant changes are indicated in bold italic.
Figure 3. Uncropped Western blots for levels of Peroxiredoxin II and Coagulation Factor XIII B Chain Precursor in undepleted patient plasma. Protein levels are shown in relation to the pooled reference, and SH-SY5Y lysates were used as a positive control.