Xia Jiang, Diyang Xue, Adam Brufsky, Seema Khan, Richard Neapolitan.
Abstract
The purpose of this investigation is to develop and evaluate a new Bayesian network (BN)-based method for predicting patient survivorship. The central hypothesis is that the method predicts patient survivorship well while being able to handle high-dimensional data and to be incorporated into a clinical decision support system (CDSS). We have developed EBMC_Survivorship (EBMC_S), which predicts survivorship for each year individually. EBMC_S is based on the EBMC BN algorithm, which has been shown to handle high-dimensional data, and BNs provide an architecture well suited to decision support systems. In this study, we evaluate EBMC_S using the Molecular Taxonomy of Breast Cancer International Consortium (METABRIC) dataset, which concerns breast tumors. A 5-fold cross-validation study indicates that EBMC_S performs better than the Cox proportional hazards model and is comparable to the random survival forest (RSF) method. We show that EBMC_S provides additional information, such as sensitivity analyses, the covariates that predict each year, and yearly areas under the ROC curve (AUROCs). We conclude that our investigation supports the central hypothesis.
Keywords: Bayesian network; Cox proportional hazard model; breast cancer; random survival forest; survivorship prediction
Year: 2014 PMID: 24558297 PMCID: PMC3928477 DOI: 10.4137/CIN.S13053
Source DB: PubMed Journal: Cancer Inform ISSN: 1176-9351
Figure 1. A BN modeling the relationships among a small subset of variables related to respiratory diseases.
Figure 2. An ID modeling the decision of whether to be treated with a thoracotomy for a non-small-cell carcinoma of the lung.
Figure 3. An example illustrating the EBMC search.
The variables used to predict survival.
| DESCRIPTION | VALUES |
|---|---|
| Age at diagnosis of the disease | 0–39 |
| Size of tumor in cm | 0–20 |
| Number of positive lymph nodes | 0 |
| Grade of disease | 1 |
| Tumor histology | IDC |
| ER status | + |
| Estrogen receptor expression | + |
| Progesterone receptor expression | + |
| HER2 status | 1 |
| HER2 copy number gain or loss | Neut |
| HER2 expression | + |
| Treatment | None |
| Inferred menopausal status | Pre |
| Characterizes patients by lymph node status and chemo- and hormonal therapy | 1 |
| Composite of size and number of positive lymph nodes | Numeric |
| Number of lymph nodes removed | Numeric |
| Nottingham Prognostic Index, a composite of tumor size, number of positive lymph nodes, and grade | Numeric |
| Cells seen on histopathology | High |
| Whether P53 is mutated | MUT |
| Type of P53 mutation | Frameshift |
| Subtype inferred from expression data | Basal |
| Cluster membership according to METABRIC | 1 |
| Collection site information specific to METABRIC | 1 |
| A composite of other variables used by METABRIC | ER+/HER2− |
A table developed from the METABRIC dataset.
| PATIENT X1 X2 … X24 | YEAR1 | YEAR2 | YEAR3 … | YEAR14 | YEAR15 |
|---|---|---|---|---|---|
| 1 | Alive | Alive | Alive | Alive | Alive |
| 2 | Alive | Dead | Dead | Dead | Dead |
| 3 | Alive | Alive | – | – | – |
| … |
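The yearly layout of the table above can be illustrated with a minimal sketch. It assumes each patient record carries a follow-up time in years and a flag indicating whether death was observed; the function name `yearly_status`, its parameters, and the boundary conventions are hypothetical and are not taken from the paper. A patient whose follow-up ends before a given year without an observed death is marked as unknown ("–") for that year, as for patient 3.

```python
from typing import List

UNKNOWN = "–"  # status cannot be determined (follow-up ended earlier)

def yearly_status(time_years: float, died: bool, n_years: int = 15) -> List[str]:
    """Return one status per year (1..n_years) for a single patient.

    Assumed conventions (not from the source):
    - "Dead" at year k if death was observed at or before year k;
    - "Alive" at year k if the patient was still followed at year k;
    - UNKNOWN otherwise, i.e. censored before year k.
    """
    statuses = []
    for year in range(1, n_years + 1):
        if died and time_years <= year:
            statuses.append("Dead")
        elif time_years >= year:
            statuses.append("Alive")
        else:
            statuses.append(UNKNOWN)
    return statuses

# Hypothetical patients mirroring the three rows of the table.
patient_1 = yearly_status(15.2, died=False)  # followed for the full 15 years
patient_2 = yearly_status(1.5, died=True)    # died during the second year
patient_3 = yearly_status(2.4, died=False)   # lost to follow-up after year 2
```

Under this layout, each YEAR column can serve as a separate binary prediction target, with the unknown entries left out of the data for that year.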
Concordance indices with 95% confidence intervals for EBMC_S, the Cox proportional hazards model, and the RSF method.
| METHOD | 5 YEAR | 10 YEAR | 15 YEAR |
|---|---|---|---|
| EBMC_S | 0.666 | 0.688 | 0.688 |
| Cox | 0.620 | 0.647 | 0.671 |
| RSF | 0.687 | 0.686 | 0.663 |
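For readers unfamiliar with the metric, the following is a minimal sketch of a concordance index in the style of Harrell's C. The record does not state which variant the authors used, so this definition, the function name `concordance_index`, and the handling of ties are assumptions. A pair of patients is treated as comparable when the patient with the shorter time had an observed death; the pair is concordant when that patient also received the higher predicted risk.

```python
from typing import Sequence

def concordance_index(times: Sequence[float],
                      events: Sequence[bool],
                      risks: Sequence[float]) -> float:
    """Fraction of comparable pairs ordered correctly by predicted risk.

    times  : follow-up time per patient
    events : True if death was observed, False if censored
    risks  : predicted risk score (higher means earlier expected death)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # Comparable only if patient i died strictly before time j.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5  # tied risk scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable

# Toy data, invented for illustration only.
c = concordance_index(
    times=[2, 4, 6, 8],
    events=[True, False, True, True],
    risks=[0.8, 0.3, 0.6, 0.7],
)
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which is the scale on which the table entries should be read.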
Significance testing results for EBMC_S versus the Cox proportional hazards model and the RSF method.
| METHOD | 5 YEAR | 10 YEAR | 15 YEAR |
|---|---|---|---|
| Cox | EBMC_S > Cox; | EBMC_S > Cox; | EBMC_S > Cox; |
| RSF | EBMC_S < RSF; | EBMC_S > RSF; | EBMC_S > RSF; |
Figure 4. ROC curves for 1, 5, 10, and 15 year predictions.
Figure 5. AUROCs plotted as a function of year.
Figure 6. Models learned by EBMC_S for 1, 5, 10, and 15 year predictions.
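The yearly AUROCs summarized in Figure 5 can be sketched as follows. This is an illustrative computation, not the authors' code: it assumes that, for each year, every patient has a status of 1 (dead), 0 (alive), or `None` (unknown because follow-up ended earlier), and a predicted probability of death for that year. The names `auroc` and `yearly_aurocs` are hypothetical. The AUROC is computed as the proportion of (dead, alive) patient pairs in which the dead patient received the higher score, with ties counted as one half.

```python
from typing import List, Optional, Sequence

def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC via pairwise comparison of positive and negative scores."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUROC needs at least one patient in each class")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

def yearly_aurocs(status_by_year: Sequence[Sequence[Optional[int]]],
                  score_by_year: Sequence[Sequence[float]]) -> List[float]:
    """One AUROC per year, skipping patients whose status is unknown."""
    results = []
    for statuses, scores in zip(status_by_year, score_by_year):
        known = [(y, s) for y, s in zip(statuses, scores) if y is not None]
        results.append(auroc([y for y, _ in known], [s for _, s in known]))
    return results

# Toy data for two years and four patients, invented for illustration.
aucs = yearly_aurocs(
    status_by_year=[[0, 0, 1, None], [0, 1, 1, None]],
    score_by_year=[[0.1, 0.2, 0.9, 0.5], [0.3, 0.2, 0.9, 0.5]],
)
```

Plotting the returned list against the year index would give a curve of the kind described in the Figure 5 caption.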