Li He, Yuelong Wang, Yongning Yang, Liqiu Huang, Zhining Wen.
Abstract
To improve the prediction of cancer prognosis in clinical research, various algorithms have been developed to construct predictive models from gene signatures detected by DNA microarrays. Because clinical samples are heterogeneous, the lists of differentially expressed genes (DEGs) generated by statistical methods or machine learning algorithms often include false-positive genes that are not associated with the phenotypic differences between the compared clinical conditions, which reduces the reliability of the predictive models. In this study, we propose a strategy that combines a statistical algorithm with gene-pathway bipartite networks to generate reliable lists of cancer-related DEGs, and we construct support vector machine models to predict the prognosis of three types of cancer: breast cancer, acute myeloid leukemia, and glioblastoma. Our results demonstrate that, combined with the gene-pathway bipartite networks, the proposed strategy can efficiently generate reliable cancer-related DEG lists for constructing predictive models. In addition, model performance in the swap analysis was similar to that in the original analysis, indicating that the models are robust in predicting cancer outcomes.
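The original-versus-swap comparison mentioned in the abstract can be sketched as follows. This is a minimal illustration, assuming that "swap analysis" means exchanging the roles of the training and validation sets. The nearest-centroid classifier is only a dependency-free stand-in for the support vector machine, and the expression values, labels, and function names are hypothetical.

```python
from statistics import mean

def fit_centroids(X, y):
    """Toy nearest-centroid classifier (a stand-in for the SVM, not the authors' model)."""
    centroids = {}
    for label in set(y):
        rows = [x for x, lab in zip(X, y) if lab == label]
        centroids[label] = [mean(col) for col in zip(*rows)]
    return centroids

def predict(centroids, X):
    """Assign each sample to the class whose centroid is closest."""
    def dist2(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))
    return [min(centroids, key=lambda lab: dist2(centroids[lab], x)) for x in X]

def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def original_and_swap(set_a, set_b):
    """Original: train on A, validate on B. Swap: train on B, validate on A."""
    results = {}
    for name, (train, test) in {"original": (set_a, set_b),
                                "swap": (set_b, set_a)}.items():
        model = fit_centroids(*train)
        results[name] = accuracy(test[1], predict(model, test[0]))
    return results

# Hypothetical two-gene expression profiles; 1 = positive class, 0 = negative class.
set_a = ([[0.1, 0.2], [0.0, 0.4], [1.0, 0.9], [0.8, 1.1]], [0, 0, 1, 1])
set_b = ([[0.2, 0.1], [0.9, 1.0], [1.1, 0.7], [0.3, 0.3]], [0, 1, 1, 0])
scores = original_and_swap(set_a, set_b)
```

Similar scores in the two directions are what the abstract interprets as robustness; a large gap would suggest that the model depends on which data set it was trained on.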
Year: 2014 PMID: 25126556 PMCID: PMC4122106 DOI: 10.1155/2014/424509
Source DB: PubMed Journal: Biomed Res Int Impact factor: 3.411
Table 1. The 20 cancer-related signaling pathways collected from the KEGG database for constructing the gene-pathway bipartite network.
| Pathway entry | KEGG pathway name | Number of genes |
|---|---|---|
| hsa05200 | Pathways in cancer | 327 |
| hsa05202 | Transcriptional misregulation in cancer | 179 |
| hsa05203 | Viral carcinogenesis | 206 |
| hsa05204 | Chemical carcinogenesis | 80 |
| hsa05205 | Proteoglycans in cancer | 225 |
| hsa05206 | MicroRNAs in cancer | 296 |
| hsa05210 | Colorectal cancer | 62 |
| hsa05211 | Renal cell carcinoma | 66 |
| hsa05212 | Pancreatic cancer | 66 |
| hsa05213 | Endometrial cancer | 52 |
| hsa05214 | Glioma | 65 |
| hsa05215 | Prostate cancer | 89 |
| hsa05216 | Thyroid cancer | 29 |
| hsa05217 | Basal cell carcinoma | 55 |
| hsa05218 | Melanoma | 71 |
| hsa05219 | Bladder cancer | 38 |
| hsa05220 | Chronic myeloid leukemia | 73 |
| hsa05221 | Acute myeloid leukemia | 57 |
| hsa05222 | Small cell lung cancer | 86 |
| hsa05223 | Non-small-cell lung cancer | 56 |
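A gene-pathway bipartite network of the kind built from these pathways can be sketched as a mapping from genes to the pathways that contain them. The example below is a minimal illustration only: the KEGG pathway identifiers come from the table, but the gene names and pathway memberships are placeholders, and the rule of keeping only DEGs linked to at least one pathway is an assumption about how the network might be used to filter a candidate list, not a description of the authors' exact procedure.

```python
from collections import defaultdict

# Toy pathway membership: real KEGG IDs, placeholder gene names (hypothetical).
pathway_genes = {
    "hsa05200": {"GENE_A", "GENE_B", "GENE_C"},
    "hsa05214": {"GENE_B", "GENE_D"},
    "hsa05221": {"GENE_C"},
}

# Hypothetical candidate DEGs produced by some statistical test.
candidate_degs = ["GENE_A", "GENE_B", "GENE_X", "GENE_C", "GENE_Y"]

def build_bipartite(degs, pathway_genes):
    """Return gene -> set of pathways, i.e. the edges of the bipartite network."""
    edges = defaultdict(set)
    for pathway, members in pathway_genes.items():
        for gene in degs:
            if gene in members:
                edges[gene].add(pathway)
    return dict(edges)

network = build_bipartite(candidate_degs, pathway_genes)

# Assumed filter: keep only DEGs connected to at least one cancer-related pathway.
retained = [gene for gene in candidate_degs if gene in network]
```

In this toy input, `GENE_X` and `GENE_Y` have no pathway edge and are dropped, which mirrors the idea of discarding candidate genes that are not linked to cancer-related pathways.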
Figure 1. The gene-pathway bipartite network constructed with the 29 gene signatures used for predicting the preoperative treatment response of breast cancer.
Table 2. Results of predicting the preoperative treatment response of breast cancer in the original and swap analyses.
| Analysis | Data set | Our model: SP | Our model: SE | Our model: ACC | Our model: MCC | MAQC-II candidate model: SP | MAQC-II candidate model: SE | MAQC-II candidate model: ACC | MAQC-II candidate model: MCC |
|---|---|---|---|---|---|---|---|---|---|
| Original analysis | Training | 0.928 | 0.455 | 0.808 | 0.444 | 0.847 | 0.569 | 0.775 | 0.433 |
| Original analysis | Validation | 0.882 | 0.467 | 0.820 | 0.332 | 0.729 | 0.667 | 0.720 | 0.301 |
| Swap analysis | Training | 0.988 | 0.200 | 0.870 | 0.343 | 0.899 | 0.522 | 0.837 | 0.454 |
| Swap analysis | Validation | 1.000 | 0.152 | 0.785 | 0.343 | 0.959 | 0.212 | 0.769 | 0.267 |
In the prediction, pathologic complete response (pCR) was defined as the positive class.
SP, SE, ACC, and MCC denote specificity, sensitivity, accuracy, and the Matthews correlation coefficient, respectively.
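For reference, the four metrics can be computed from the counts of a binary confusion matrix using their standard definitions. The sketch below is a minimal illustration; the counts are hypothetical and are not taken from the study, and the function name is an assumption.

```python
from math import sqrt

def binary_metrics(tp, fn, tn, fp):
    """Standard binary metrics from confusion-matrix counts.

    tp/fn: positive samples predicted correctly/incorrectly.
    tn/fp: negative samples predicted correctly/incorrectly.
    Assumes at least one positive and one negative sample.
    """
    se = tp / (tp + fn)                       # sensitivity (true positive rate)
    sp = tn / (tn + fp)                       # specificity (true negative rate)
    acc = (tp + tn) / (tp + fn + tn + fp)     # overall accuracy
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, MCC is set to 0 when the denominator vanishes.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"SP": sp, "SE": se, "ACC": acc, "MCC": mcc}

# Hypothetical counts: 10 positive and 10 negative samples, one error in each class.
m = binary_metrics(tp=9, fn=1, tn=9, fp=1)
```

Because MCC uses all four cells of the confusion matrix, it stays low when one class is predicted poorly even if accuracy looks high, which is why a high SP with a low SE can coexist with a modest MCC.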
Figure 2. The gene-pathway bipartite network constructed with the 50 gene signatures used for predicting the overall survival milestone outcome of acute myeloid leukemia.
Table 3. Results of predicting the overall survival milestone outcome of acute myeloid leukemia in the original and swap analyses.
| Analysis | Data set | Our model: SP | Our model: SE | Our model: ACC | Our model: MCC | Our model: AUC | 86-probe-set model: SP | 86-probe-set model: SE | 86-probe-set model: ACC | 86-probe-set model: MCC | 86-probe-set model: AUC |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Original analysis | Training | 0.697 | 0.837 | 0.776 | 0.542 | 0.776 | 0.758 | 0.733 | 0.743 | 0.486 | 0.746 |
| Original analysis | Validation | 0.574 | 0.600 | 0.584 | 0.170 | 0.587 | 0.362 | 0.800 | 0.532 | 0.172 | 0.581 |
| Swap analysis | Training | 0.830 | 0.700 | 0.779 | 0.533 | 0.765 | 1.000 | 0.433 | 0.779 | 0.564 | 0.717 |
| Swap analysis | Validation | 0.545 | 0.756 | 0.664 | 0.308 | 0.655 | 0.712 | 0.523 | 0.605 | 0.236 | 0.618 |
In the prediction, high-risk patients were defined as the positive class.
AUC denotes the area under the ROC curve.
See notes under Table 2 for more information.
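As a hedged illustration of the AUC column, the sketch below computes AUC with the rank-based (Mann-Whitney) formulation: the probability that a randomly chosen positive sample scores higher than a randomly chosen negative one, with ties counted as one half. The source does not state how AUC was computed, so this is only one standard way to obtain it; the labels and function name are hypothetical. When the scores are hard 0/1 predictions rather than continuous decision values, this quantity reduces to (SE + SP) / 2.

```python
def auc_from_scores(y_true, scores):
    """AUC as P(score of a positive > score of a negative), ties counted as 1/2."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical labels and hard predictions: 4 positives, 4 negatives.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 1]

se = 3 / 4   # 3 of 4 positives predicted as positive
sp = 2 / 4   # 2 of 4 negatives predicted as negative
auc_hard = auc_from_scores(y_true, y_pred)   # equals (se + sp) / 2 for hard labels
```

Several rows in the tables are numerically consistent with AUC = (SE + SP) / 2, although that alone does not establish how the reported values were obtained.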
Figure 3. The gene-pathway bipartite network constructed with the 62 gene signatures used for predicting the molecular subclasses of high-grade glioblastoma.
Table 4. Results of predicting the molecular subclasses of high-grade glioblastoma in the original and swap analyses.
| Analysis | Data set | Our model: SP | Our model: SE | Our model: ACC | Our model: MCC | Our model: AUC |
|---|---|---|---|---|---|---|
| Original analysis | Training | 0.900 | 0.900 | 0.900 | 0.800 | 0.900 |
| Original analysis | Validation | 0.750 | 1.000 | 0.875 | 0.775 | 0.875 |
| Swap analysis | Training | 0.950 | 0.900 | 0.925 | 0.851 | 0.925 |
| Swap analysis | Validation | 0.800 | 0.567 | 0.683 | 0.377 | 0.684 |
In the prediction, the gene expression profile termed ProNeural (PN) was defined as the positive class.
AUC denotes the area under the ROC curve.
See notes under Table 2 for more information.