Chi-Hung Shao, Chien-Lun Chen, Jia-You Lin, Chao-Jung Chen, Shu-Hsuan Fu, Yi-Ting Chen, Yu-Sun Chang, Jau-Song Yu, Ke-Hung Tsui, Chiun-Gung Juo, Kun-Pin Wu.
Abstract
Bladder cancer is one of the most common urinary tract carcinomas in the world. Urine metabolomics is a promising approach for bladder cancer detection and marker discovery: because urine is in direct contact with bladder epithelial cells, metabolites released from bladder cancer cells may be enriched in urine samples. In this study, we applied ultra-performance liquid chromatography time-of-flight mass spectrometry to obtain the metabolite profiles of 87 samples from bladder cancer patients and 65 samples from hernia patients. An OPLS-DA classification revealed that bladder cancer samples can be discriminated from hernia samples based on these profiles. A marker discovery pipeline selected six putative markers from the metabolomic profiles, and an LLE clustering demonstrated the discriminative power of the chosen marker candidates. Two of the six markers were identified as imidazoleacetic acid, whose relation to bladder cancer has a certain degree of supporting evidence. A machine learning model, a decision tree, was built based on the metabolomic profiles and the six marker candidates. In an independent test, the decision tree obtained an accuracy of 76.60%, a sensitivity of 71.88%, and a specificity of 86.67%.
Keywords: bladder cancer; decision tree; machine learning; metabolite marker selection; metabolomics
Year: 2017 PMID: 28415579 PMCID: PMC5503573 DOI: 10.18632/oncotarget.16393
Source DB: PubMed Journal: Oncotarget ISSN: 1949-2553
Patient characteristics
| | BCa | Hernia | p value |
|---|---|---|---|
| Number of samples | 87 | 65 | |
| | 68.2±14.5 | 64.6±13.2 | 0.117 |
| Male | 54 (62%) | 62 (95%) | |
| Female | 33 (38%) | 3 (5%) | |
| | 1.40 | 1.11 | 0.203 |
| | 12.33 | 13.67 | < 0.001* |
| Early¹ | 55 | | |
| Advanced² | 32 | | |
* Statistically significant; hematuria is a common finding in BCa
1 Early stage: superficial tumor without muscle involvement
2 Advanced stage: tumor invasion into the muscle layer
Figure 1. OPLS-DA score plot of the BCa and hernia metabolomic profiles
Each box represents the metabolomic profile of 944219 spectral ions of an individual subject. There are 87 blue boxes representing BCa patients and 65 red boxes representing hernia patients.
Candidate ions for BCa detection
| Candidate ions | # BCa | # Hernia | Ratio | p value | AUC |
|---|---|---|---|---|---|
| 2.56 min: 314.085 m/z | 46 | 25 | 1.27 | 2.62E-05 | 0.73 |
| 3.65 min: 165.007 m/z | 38 | 21 | 2.24 | 5.14E-05 | 0.72 |
| 3.65 min: 183.018 m/z | 39 | 23 | 2.85 | 7.32E-05 | 0.72 |
| 12.53 min: 194.117 m/z | 42 | 22 | 11.20 | 8.15E-05 | 0.72 |
| 19.42 min: 213.146 m/z | 55 | 50 | 1.47 | 2.52E-04 | 0.71 |
| 2.04 min: 106.950 m/z | 50 | 36 | 1.41 | 4.06E-04 | 0.70 |
# BCa: Number of BCa samples containing the ion
# Hernia: Number of hernia samples containing the ion
Ratio: fold change in ion abundance, BCa samples relative to hernia samples
p value: obtained from the Wilcoxon rank-sum test
AUC: area under the receiver operating characteristic curve
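The p value and AUC columns are closely related: for a single ion, the area under the ROC curve equals the Mann-Whitney U statistic (the statistic behind the Wilcoxon rank-sum test) divided by the product of the two group sizes. The sketch below illustrates this on made-up intensity values. The function names `auc_from_pairs` and `fold_change`, the toy intensities, and the choice of a ratio of group means as the fold change are assumptions for illustration; they are not the study's data or its exact pipeline.

```python
def auc_from_pairs(pos, neg):
    """AUC as U / (n_pos * n_neg): the fraction of (pos, neg) pairs
    in which the positive sample has the higher value (ties count 0.5)."""
    u = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                u += 1.0
            elif p == n:
                u += 0.5
    return u / (len(pos) * len(neg))


def fold_change(pos, neg):
    """Ratio of group means (one possible definition of fold change)."""
    return (sum(pos) / len(pos)) / (sum(neg) / len(neg))


# Hypothetical intensities of one ion (NOT values from the study).
bca = [5.0, 7.0, 8.0, 9.0]
hernia = [4.0, 6.0, 6.5]

print(f"AUC = {auc_from_pairs(bca, hernia):.2f}")
print(f"fold change = {fold_change(bca, hernia):.2f}")
```

An AUC of 0.5 would mean the ion does not separate the two groups at all, which is why the candidates listed above, with AUC values between 0.70 and 0.73, are individually only moderate discriminators.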
Figure 2. LLE plot of the BCa and hernia profiles of marker candidates
Each dot represents the profile of six marker candidates of an individual subject. There are 87 blue dots representing BCa patients and 65 red dots representing hernia patients.
Figure 3. The decision tree construction and evaluation workflow
First, the training set of 55 BCa and 50 hernia samples was subjected to a procedure of decision tree construction with 5-fold cross validation to evaluate the stability and generalization of the decision tree model. Second, the whole training set was used to build a final decision tree. Finally, an independent test was performed on the testing set of 32 BCa and 15 hernia samples to validate the final decision tree.
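The data splitting in this workflow can be sketched as follows. This is a minimal illustration that assumes a stratified 5-fold split (the caption does not say whether the folds were stratified) and uses placeholder sample indices; `stratified_folds` is a hypothetical helper, and the decision tree fitting itself is left out.

```python
import random


def stratified_folds(labels, k=5, seed=0):
    """Assign each sample index to one of k folds, spreading every
    class as evenly as possible across the folds."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for j, i in enumerate(idx):
            folds[j % k].append(i)
    return folds


# Training set sizes from the workflow: 55 BCa and 50 hernia samples.
labels = ["BCa"] * 55 + ["hernia"] * 50
folds = stratified_folds(labels, k=5)

for iteration, held_out in enumerate(folds, start=1):
    held = set(held_out)
    train = [i for i in range(len(labels)) if i not in held]
    n_bca = sum(labels[i] == "BCa" for i in train)
    # A tree would be fitted on `train` and scored on `held_out` here (omitted).
    print(f"iteration {iteration}: train={len(train)} "
          f"(BCa={n_bca}, hernia={len(train) - n_bca}), held out={len(held_out)}")

# After cross-validation, the final tree is built on all 105 training samples
# and validated once on the separate 32 BCa + 15 hernia testing set.
```

Keeping the testing set entirely outside this loop is what makes the final evaluation independent: none of the 47 test samples influences either the cross-validation estimates or the final tree.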
Performance of decision trees reported by the 5-fold cross validation
| Iteration | Accuracy | Sensitivity | Specificity |
|---|---|---|---|
| 1 | 85.71% | 81.82% | 90.00% |
| 2 | 82.14% | 79.55% | 85.00% |
| 3 | 86.90% | 84.09% | 90.00% |
| 4 | 83.33% | 81.82% | 85.00% |
| 5 | 85.71% | 81.82% | 90.00% |
| Mean ± SD | 84.76% ± 1.75% | 81.82% ± 1.61% | 88.00% ± 2.74% |
Accuracy: the probability that a sample is correctly classified. Sensitivity: the probability that a BCa sample is correctly classified as BCa. Specificity: the probability that a hernia sample is correctly classified as hernia.
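These three definitions reduce to simple ratios of confusion-matrix counts, with BCa as the positive class. The sketch below applies them to the independent test. Note that the counts used (23 of 32 BCa and 13 of 15 hernia samples correctly classified) are not stated in the source; they are inferred here from the reported percentages and the testing set sizes, and `classification_metrics` is a hypothetical helper name.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return (accuracy, sensitivity, specificity), BCa = positive class."""
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    sensitivity = tp / (tp + fn)   # BCa samples classified as BCa
    specificity = tn / (tn + fp)   # hernia samples classified as hernia
    return accuracy, sensitivity, specificity


# Inferred counts for the independent test (assumption, see lead-in):
# 32 BCa samples -> 23 correct, 9 missed; 15 hernia samples -> 13 correct, 2 wrong.
acc, sens, spec = classification_metrics(tp=23, fn=9, tn=13, fp=2)
print(f"accuracy={acc:.2%}  sensitivity={sens:.2%}  specificity={spec:.2%}")
```

Under these inferred counts, the ratios agree with the reported 76.60% accuracy, 71.88% sensitivity, and 86.67% specificity to two decimal places.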