Celine S Hong, Juan Cui, Zhaohui Ni, Yingying Su, David Puett, Fan Li, Ying Xu.
Abstract
A novel computational method for predicting proteins excreted into urine is presented. The method is based on identifying a list of features that distinguish proteins found in the urine of healthy people from proteins deemed not to be urine excretory. These features are used to train a classifier to distinguish the two classes of proteins. When used in conjunction with information on which proteins are differentially expressed in diseased tissues of a specific type versus control tissues, this method can be used to predict potential urine markers for the disease. Here we report the detailed algorithm of this method and an application to the identification of urine markers for gastric cancer. The performance of the trained classifier on 163 proteins was experimentally validated using antibody arrays, achieving a true positive rate of >80%. By applying the classifier to differentially expressed genes in gastric cancer vs normal gastric tissues, it was found that endothelial lipase (EL) was substantially suppressed in the urine samples of 21 gastric cancer patients versus 21 healthy individuals. Overall, we have demonstrated that our predictor for urine excretory proteins is highly effective and could potentially serve as a powerful tool in searches for disease biomarkers in urine in general.
Year: 2011 PMID: 21365014 PMCID: PMC3041827 DOI: 10.1371/journal.pone.0016875
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Figure 1. The flow of the study.
Classification performance of the trained classifier on the training set and an independent test set.
| Set | TP | TN | FP | FN | SEN | SP | ACC | MCC | AUC |
|---|---|---|---|---|---|---|---|---|---|
| Train | 972 | 2,493 | 134 | 341 | 0.74 | 0.95 | 0.88 | 0.52 | 0.94 |
| Independent | 360 | 1,983 | 165 | 100 | 0.78 | 0.92 | 0.90 | 0.45 | 0.93 |
TP = true positive; TN = true negative; FP = false positive; FN = false negative; N = total number of proteins in the dataset; SEN = TP/(TP+FN); SP = TN/(TN+FP); ACC = (TP+TN)/N; MCC = (TP×TN − FP×FN)/√((TP+FN)(TP+FP)(TN+FP)(TN+FN)); AUC is described in (37).
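The definitions in the legend can be checked against the tabulated counts with a short script. The following is a minimal Python sketch, not code from the study: the function name `confusion_metrics` is hypothetical, and N is assumed to equal TP+TN+FP+FN. The rounded SEN, SP and ACC it prints can be compared with the table; the sketch makes no claim that the MCC computed directly from the counts reproduces the tabulated MCC column.

```python
import math


def confusion_metrics(tp, tn, fp, fn):
    """Compute SEN, SP, ACC and MCC from confusion-matrix counts.

    Assumes N = TP + TN + FP + FN (hypothetical helper, not from the paper).
    """
    n = tp + tn + fp + fn
    sen = tp / (tp + fn)          # sensitivity (true positive rate)
    sp = tn / (tn + fp)           # specificity (true negative rate)
    acc = (tp + tn) / n           # overall accuracy
    denom = math.sqrt((tp + fn) * (tp + fp) * (tn + fp) * (tn + fn))
    # Return 0.0 when a row or column of the confusion matrix is empty
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"SEN": sen, "SP": sp, "ACC": acc, "MCC": mcc}


# Counts taken from the two rows of the table
train = confusion_metrics(tp=972, tn=2493, fp=134, fn=341)
independent = confusion_metrics(tp=360, tn=1983, fp=165, fn=100)

for name, metrics in (("Train", train), ("Independent", independent)):
    print(name, {key: round(value, 2) for key, value in metrics.items()})
```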
Figure 2. Western blot results.
A: Western blots for EL on control and gastric cancer samples. Control samples (denoted by the red lined box): Lanes 1–7, 11–17, 21–27. Cancer samples: Lanes 8–14, 18–24, 28–34. B: Corresponding box-and-whisker plot for the signal intensities. C: ROC curve of the EL Western blot. Red line: no discrimination; blue line: ROC by EL.
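As an illustration of how an area under the ROC curve such as the one in panel C could be obtained from two groups of signal intensities, here is a minimal Python sketch. It is not the authors' procedure: the function name `roc_auc` is hypothetical, the intensities are made-up toy values rather than data from the study, and the AUC is computed with the standard pairwise (Mann-Whitney) formulation. Because EL was reported as suppressed in cancer samples, a lower intensity is treated as evidence for cancer.

```python
def roc_auc(control, cancer):
    """AUC as the fraction of (cancer, control) pairs in which the cancer
    sample has the lower intensity; ties count as half a pair."""
    wins = 0.0
    for c in cancer:
        for h in control:
            if c < h:
                wins += 1.0
            elif c == h:
                wins += 0.5
    return wins / (len(cancer) * len(control))


# Toy intensities for illustration only, NOT values from the study
control = [0.9, 1.1, 1.3, 0.8]
cancer = [0.4, 0.85, 0.6, 0.3]

auc = roc_auc(control, cancer)
print(auc)  # 15 of 16 pairs are correctly ordered -> 0.9375
```

An AUC of 0.5 corresponds to the "no discrimination" diagonal, and values closer to 1 indicate better separation of the two groups.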