Wilson Wen Bin Goh, Yie Hou Lee, Zubaidah M Ramdzan, Marek J Sergot, Maxey Chung, Limsoon Wong.
Abstract
Traditional proteomics analysis is plagued by the use of arbitrary thresholds resulting in large loss of information. We propose here a novel method in proteomics that utilizes all detected proteins. We demonstrate its efficacy in a proteomics screen of 5 and 7 liver cancer patients in the moderate and late stage, respectively. Utilizing biological complexes as a cluster vector, and augmenting it with submodules obtained from partitioning an integrated and cleaned protein-protein interaction network, we calculate a Proteomics Signature Profile (PSP) for each patient based on the hit rates of their reported proteins, in the absence of fold change thresholds, against the cluster vector. Using this, we demonstrated that moderate- and late-stage patients segregate with high confidence. We also discovered a moderate-stage patient who displayed a proteomics profile similar to other poor-stage patients. We identified significant clusters using a modified version of the SNet approach. Comparing our results against the Proteomics Expansion Pipeline (PEP) on which the same patient data was analyzed, we found good correlation. Building on this finding, we report significantly more clusters (176 clusters here compared to 70 in PEP), demonstrating the sensitivity of this approach. Gene Ontology (GO) terms analysis also reveals that the significant clusters are functionally congruent with the liver cancer phenotype. PSP is a powerful and sensitive method for analyzing proteomics profiles even when sample sizes are small. It does not rely on the ratio scores but, rather, whether a protein is detected or not. Although consistency of individual proteins between patients is low, we found the reported proteins tend to hit clusters in a meaningful and informative manner. 
By extracting this information in the form of a Proteomics Signature Profile, we confirm that this information is conserved and can be used for (1) clustering of patient samples, (2) identification of significant clusters based on real biological complexes, and (3) overcoming consistency and coverage issues prevalent in proteomics data sets.
Year: 2012 PMID: 22243476 PMCID: PMC3472506 DOI: 10.1021/pr200698c
Source DB: PubMed Journal: J Proteome Res ISSN: 1535-3893 Impact factor: 4.466
Figure 1. Proteomics signature profiling (PSP) pipeline. The pipeline incorporates data from complexes, PPI networks, and GO. Protein lists from individual patients are converted into a proteomics signature profile (PSP) based on a vector of complexes generated from CORUM and graphlet-derived clusters. The PSP can then be used to cluster the patient samples for assessment and to determine significant clusters. GO terms are used to evaluate functional significance and coherence. (Abbreviations: GDV, graphlet degree vector; GDS, graphlet degree similarity scores.) For detailed explanations, refer to Results.
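
The conversion of a patient's protein list into a PSP can be illustrated with a minimal sketch. It assumes that the hit rate of a cluster is the fraction of its members found in the patient's list of detected proteins, with no fold-change filter. The function names, cluster IDs, and toy protein sets below are hypothetical and are not taken from the study data.

```python
from typing import Dict, List, Set


def hit_rate(detected: Set[str], cluster: Set[str]) -> float:
    """Fraction of a cluster's members present in the detected protein set."""
    if not cluster:
        return 0.0
    return len(detected & cluster) / len(cluster)


def signature_profile(
    detected: Set[str], clusters: Dict[str, Set[str]], order: List[str]
) -> List[float]:
    """One hit rate per cluster, in a fixed cluster order (the cluster vector)."""
    return [hit_rate(detected, clusters[cid]) for cid in order]


# Hypothetical cluster vector (toy membership, not real CORUM content).
clusters = {
    "complex_A": {"XRCC5", "XRCC6", "PARP1", "PRKDC"},
    "complex_B": {"ACTR2", "ACTR3", "ARPC4", "ARPC5"},
}
order = sorted(clusters)

# Hypothetical per-patient lists of detected proteins.
patients = {
    "mod_1": {"ACTR2", "ACTR3", "PARP1"},
    "poor_1": {"XRCC5", "XRCC6", "PARP1", "ACTR2"},
}

profiles = {
    pid: signature_profile(proteins, clusters, order)
    for pid, proteins in patients.items()
}
```

Because every patient is scored against the same ordered cluster vector, the resulting profiles have equal length and can be compared directly, even when the patients share few individual proteins.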
Figure 2. Comparison of bootstrapped HCL trees generated via pvclust. Values on the edges of the clustering are p-values (%). Red values are AU p-values, and green values are BP values, as explained earlier under Methods. Clusters with AU larger than 95% are highlighted by red boxes and are very strongly supported by the data. The 73 graphlet-derived clusters alone did not provide sufficient dimensions to clearly resolve the mod and poor patients (left column), although Paragon fared much better because of better hit rates. The right column shows that with a much larger set of dimensions (clusters), in this case derived from CORUM, the trees are virtually identical even though Paragon reports a considerably larger number of proteins. It is also noteworthy that, in all cases, mod patient #203 is clustered with the poor patients.
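
As a rough illustration of how hierarchical clustering groups patients by their PSP vectors, the sketch below implements plain average-linkage agglomerative clustering on Euclidean distances. It is not pvclust and performs no bootstrap resampling, so it yields no AU or BP values. The distance measure, the linkage, and the toy profiles (including the hypothetical `mod_x` sample) are assumptions made for the example.

```python
import math
from itertools import combinations


def euclidean(a, b):
    """Euclidean distance between two equal-length profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def average_linkage(profiles):
    """Agglomerative clustering; returns the merges in the order they occur."""
    clusters = [frozenset([name]) for name in sorted(profiles)]
    merges = []

    def dist(c1, c2):
        # Average pairwise distance between the members of two clusters.
        total = sum(euclidean(profiles[a], profiles[b]) for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        c1, c2 = min(combinations(clusters, 2), key=lambda pair: dist(*pair))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 | c2]
        merges.append((c1, c2))
    return merges


# Toy PSP vectors; "mod_x" is a hypothetical moderate-stage sample
# whose profile resembles the poor-stage samples.
toy_profiles = {
    "mod_a": [0.10, 0.90],
    "mod_b": [0.20, 0.80],
    "mod_x": [0.88, 0.12],
    "poor_a": [0.90, 0.10],
    "poor_b": [0.80, 0.20],
}

merges = average_linkage(toy_profiles)
for left, right in merges:
    print(sorted(left), "+", sorted(right))
```

In this toy input, `mod_x` joins the poor-stage samples before the two remaining moderate-stage samples merge, which mirrors the kind of pattern the caption describes for one moderate-stage patient.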
Top Ranked Clusters
| cluster ID | p-value | mod score | poor score | cluster name |
|---|---|---|---|---|
| 5179 | 0.000300541 | 0.513951977 | 3.159758312 | NCOA6-DNA-PK-Ku-PARP1 complex |
| 5235 | 0.000300541 | 0.513951977 | 3.159758312 | WRN-Ku70-Ku80-PARP1 complex |
| 1193 | 0.000300541 | 0.513951977 | 3.159758312 | Rap1 complex |
| 159 | 0 | 0 | 2.810927655 | Condensin I-PARP-1-XRCC1 complex |
| 2657 | 0.008815869 | 0 | 2.55616281 | ESR1-CDK7-CCNH-MNAT1-MTA1-HDAC2 complex |
| 3067 | 0.00911641 | 0 | 2.55616281 | RNA polymerase II complex, incomplete (CDK8 complex), chromatin structure modifying |
| 1226 | 0.013323983 | 0.715352108 | 2.420592827 | H2AX complex I |
| 5176 | 0 | 0.513951977 | 2.339059313 | MGC1-DNA-PKcs-Ku complex |
| 1189 | 0 | 0.513951977 | 2.339059313 | DNA double-strand break end-joining complex |
| 5251 | 0 | 0.513951977 | 2.339059313 | Ku-ORC complex |
| 2766 | 0 | 0.513951977 | 2.339059313 | TERF2-RAP1 complex |
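
The text here does not spell out the modified SNet procedure, so the following is only a generic stand-in for testing whether a cluster's per-patient scores differ between the two groups. It runs an exact two-sided permutation test on the difference in group means, which is feasible for 5 versus 7 samples (792 possible splits). The function name and the toy scores are hypothetical and are not values from the study.

```python
from itertools import combinations
from statistics import mean


def exact_permutation_p(group_a, group_b):
    """Exact two-sided permutation p-value for a difference in means."""
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    observed = abs(mean(group_a) - mean(group_b))
    extreme = 0
    total = 0
    # Enumerate every way of assigning n_a of the pooled values to group A.
    for idx in combinations(range(len(pooled)), n_a):
        chosen = set(idx)
        a = [pooled[i] for i in chosen]
        b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        # Small tolerance so the observed split always counts as extreme.
        if abs(mean(a) - mean(b)) >= observed - 1e-12:
            extreme += 1
        total += 1
    return extreme / total


# Hypothetical per-patient scores for one cluster (5 mod, 7 poor).
mod_scores = [0.0, 0.0, 0.25, 0.0, 0.0]
poor_scores = [0.75, 0.5, 0.75, 1.0, 0.5, 0.75, 0.5]

p_value = exact_permutation_p(mod_scores, poor_scores)
print(f"p = {p_value:.5f}")
```

With groups this small, the smallest attainable p-value is bounded by the number of possible splits, which is worth keeping in mind when ranking clusters by significance.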
Top Ranked GO BP Terms Found in Significant Clusters
| GO ID | description | no. of clusters |
|---|---|---|
| GO:0016032 | viral reproduction | 36 |
| GO:0000398 | nuclear mRNA splicing, via spliceosome | 34 |
| GO:0000278 | mitotic cell cycle | 28 |
| GO:0000084 | S phase of mitotic cell cycle | 28 |
| GO:0006366 | transcription from RNA polymerase II promoter | 26 |
| GO:0006283 | transcription-coupled nucleotide-excision repair | 22 |
| GO:0006369 | termination of RNA polymerase II transcription | 22 |
| GO:0006284 | base-excision repair | 21 |
| GO:0000086 | G2/M transition of mitotic cell cycle | 21 |
| GO:0000079 | regulation of cyclin-dependent protein kinase activity | 20 |
| GO:0010833 | telomere maintenance via telomere lengthening | 20 |
| GO:0033044 | regulation of chromosome organization | 19 |
| GO:0006200 | ATP catabolic process | 18 |
| GO:0042475 | odontogenesis of dentine-containing tooth | 18 |
| GO:0034138 | toll-like receptor 3 signaling pathway | 17 |
| GO:0006915 | apoptosis | 17 |
| GO:0006271 | DNA strand elongation involved in DNA replication | 17 |
| GO:0031145 | anaphase-promoting complex-dependent proteasomal ubiquitin-dependent protein catabolic process | 17 |
| GO:0006261 | DNA-dependent DNA replication | 17 |
| GO:0048015 | phosphatidylinositol-mediated signaling | 16 |
| GO:0006986 | response to unfolded protein | 16 |
| GO:0000077 | DNA damage checkpoint | 16 |
| GO:0008063 | Toll signaling pathway | 16 |
| GO:0043488 | regulation of mRNA stability | 16 |
| GO:0006338 | chromatin remodeling | 16 |
| GO:0002756 | MyD88-independent toll-like receptor signaling pathway | 16 |
| GO:0000216 | M/G1 transition of mitotic cell cycle | 16 |
| GO:0071103 | DNA conformation change | 16 |
| GO:0000724 | double-strand break repair via homologous recombination | 16 |
| GO:0034142 | toll-like receptor 4 signaling pathway | 16 |
| GO:0010212 | response to ionizing radiation | 16 |
| GO:0051301 | cell division | 15 |
| GO:0006333 | chromatin assembly or disassembly | 15 |
| GO:0071445 | cellular response to protein stimulus | 15 |
| GO:0002755 | MyD88-dependent toll-like receptor signaling pathway | 14 |
| GO:0043487 | regulation of RNA stability | 14 |
Figure 3. Rank correlation between PEP and PSP. Although the PEP and PSP clusters were derived by very different methods, their results correlate well. To reduce noise, we required a Jaccard score of at least 0.1 (10% similarity).
Best Matching PEP Clusters
| PEP rank | PSP rank | Jaccard score | members (as in PEP) |
|---|---|---|---|
| 41 | 104 | 0.4 | DHX9 SMN1 DDX20 GEMIN4 SMN2 SNRPB SIP1 |
| 23 | 17 | 0.333333333 | COL1A2 CD36 ITGB3 ITGA2B |
| 9 | 1 | 0.25 | XRCC6 PCNA PRKDC WRN XRCC5 PARP1 |
| 20 | 134 | 0.25 | PRKDC XPA RPA1 RPA2 |
| 11 | 20 | 0.222222222 | ACTR2 ACTR3 ARPC4 ARPC5 |
| 40 | 147 | 0.222222222 | PRKCD RAF1 MAPK1 PRKCZ PEBP1 MAP2K1 |
| 16 | 1 | 0.2 | XRCC6 PCNA PRKDC TP53 WRN NCOA6 XRCC5 PARP1 |
| 34 | 74 | 0.1875 | MAP3K14 CHUK MAP3K7 PEBP1 IKBKB |
| 4 | 33 | 0.142857143 | FUS PTBP1 SFPQ ZMYM2 |
| 5 | 88 | 0.142857143 | YWHAB HSP90AB1 IKBKB MAP3K3 |
| 22 | 30 | 0.142857143 | CANX ITGA6 ITGB1 CD82 |
| 43 | 137 | 0.133333333 | GSN AR CASP3 PXN BCAR1 FYN |
| 1 | 77 | 0.125 | YWHAQ HSPA1A HSPA8 YWHAG |
| 2 | 0 | 0.125 | TP53 NPM1 NCL PARP1 |
| 12 | 77 | 0.125 | SET APEX1 GZMA HMGB2 |
| 13 | 23 | 0.125 | RAN RCC1 XPO1 RANBP3 |
| 52 | 162 | 0.125 | PRKCD EP300 CREBBP KLF5 |
| 57 | 147 | 0.125 | PRKACA RAF1 BAD BCL2 |
| 61 | 74 | 0.117647059 | AKT1 IRS1 PRKCQ CHUK IKBKB |
| 27 | 147 | 0.111111111 | YWHAZ RAF1 CDC25A MAP3K5 YWHAE |
| 31 | 77 | 0.111111111 | HSPA1A BAG1 STUB1 HSPA8 HSF1 |
| 65 | 147 | 0.111111111 | RAF1 AR HSP90AA1 MAPK1 NR3C1 |
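
The Jaccard filter used to pair PEP clusters with PSP clusters can be sketched as follows. The Jaccard score is the size of the intersection of two member sets divided by the size of their union, and pairs scoring below 0.1 are dropped. The function names and the toy member sets are hypothetical, and picking the single best PSP match per PEP cluster is an assumption made for the example.

```python
def jaccard(a, b):
    """Jaccard similarity of two member collections (0.0 if both are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def best_matches(pep_clusters, psp_clusters, threshold=0.1):
    """For each PEP cluster, keep its best PSP match if it meets the threshold."""
    matches = []
    for pep_id, pep_members in pep_clusters.items():
        best_id, best_score = max(
            ((psp_id, jaccard(pep_members, members))
             for psp_id, members in psp_clusters.items()),
            key=lambda item: item[1],
            default=(None, 0.0),
        )
        if best_id is not None and best_score >= threshold:
            matches.append((pep_id, best_id, best_score))
    # Highest similarity first, as in the table above.
    return sorted(matches, key=lambda m: m[2], reverse=True)


# Toy member sets for illustration only (not real cluster membership).
pep_clusters = {
    "pep_1": {"XRCC6", "PCNA", "PRKDC", "WRN", "XRCC5", "PARP1"},
    "pep_2": {"RAN", "RCC1", "XPO1", "RANBP3"},
}
psp_clusters = {
    "psp_1": {"XRCC6", "XRCC5", "PARP1", "WRN"},
    "psp_2": {"ACTR2", "ACTR3"},
}

matches = best_matches(pep_clusters, psp_clusters)
```

Here `pep_1` pairs with `psp_1` (4 shared members out of 6 in the union), while `pep_2` has no overlap with any PSP cluster and is filtered out.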