Sunil Gupta, Truyen Tran, Wei Luo, Dinh Phung, Richard Lee Kennedy, Adam Broad, David Campbell, David Kipp, Madhu Singh, Mustafa Khasraw, Leigh Matheson, David M Ashley, Svetha Venkatesh.
Abstract
OBJECTIVES: Using the prediction of cancer outcome as a model, we tested the hypothesis that analysing routinely collected digital data in an electronic administrative record (EAR) with machine-learning techniques could enhance conventional methods of predicting clinical outcomes.
Keywords: Cancer; Electronic Medical Record; Machine Learning; Prediction; Survival
Year: 2014 PMID: 24643167 PMCID: PMC3963101 DOI: 10.1136/bmjopen-2013-004007
Source DB: PubMed Journal: BMJ Open ISSN: 2044-6055 Impact factor: 2.692
Characteristics of derivation and validation cohorts
| Characteristic | Cohort 1: ECO, derivation (n=869) | Cohort 1: ECO, validation (n=94) | Cohort 2: ECO and EAR (n=664) |
|---|---|---|---|
| Age (SD) | 67.6 (14.6) | 68.4 (13.6) | 66.3 (14.9) |
| Gender: male | 487* | 48 | 381 |
| Tumour stream | | | |
| Genitourinary | 172 | 21 | 135 |
| Colorectal | 140 | 14 | 115 |
| Lung | 121 | 18 | 96 |
| Breast | 122 | 15 | 74 |
| Haematological | 99 | 7 | 85 |
| Upper gastrointestinal | 83 | 9 | 57 |
| Skin | 36 | 1 | 28 |
| Head and neck | 35 | 0 | 30 |
| Gynaecological | 19 | 4 | 17 |
| CNS | 15 | 1 | 9 |
| Unknown primary | 38 | 9 | 26 |
*Two unspecified.
CNS, central nervous system; EAR, electronic administrative records; ECO, Evaluation of Cancer Outcomes.
Performance of survival prediction: comparison between machine-learning method and clinicians
| Survival period | Clinician panel, AUC (95% CI) | Machine-learning model, AUC (95% CI) |
|---|---|---|
| 6 months | 0.79 (0.76 to 0.81) | 0.87 (0.85 to 0.89) |
| 1 year | 0.79 (0.76 to 0.81) | 0.80 (0.77 to 0.82) |
| 2 years | 0.75 (0.73 to 0.78) | 0.76 (0.74 to 0.79) |
AUC, area under the receiver operating characteristic curve.
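For readers who want to see how an AUC with a 95% CI of the kind tabulated above can be obtained, the following is a minimal sketch. It assumes a percentile bootstrap, which the source does not confirm as the authors' method; the function names `auc` and `bootstrap_ci` and the toy data are hypothetical.

```python
import random

def auc(scores, labels):
    """AUC as the probability that a randomly chosen positive case
    scores higher than a randomly chosen negative case (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

def bootstrap_ci(scores, labels, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the AUC (an assumed method, not
    necessarily the one used in the paper)."""
    auc(scores, labels)  # validates that both classes are present
    rng = random.Random(seed)
    n = len(scores)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain only one class
            stats.append(auc([scores[i] for i in idx], ys))
    stats.sort()
    lo = stats[int(alpha / 2 * n_boot)]
    hi = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi

# Toy data: 1 = died within the period, 0 = survived (hypothetical values).
toy_scores = [0.9, 0.4, 0.6, 0.2]
toy_labels = [1, 1, 0, 0]
print(auc(toy_scores, toy_labels), bootstrap_ci(toy_scores, toy_labels, n_boot=200))
```

The pairwise loop is quadratic in the number of cases, which is acceptable for cohorts of the size reported here; a rank-based formula would be the usual choice for larger data.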
Prediction performance of machine-learning algorithms: 6-month survival
| Cancer type | EAR only, area under ROC curve (95% CI) | ECO only, area under ROC curve (95% CI) | EAR+ECO, area under ROC curve (95% CI) |
|---|---|---|---|
| Genitourinary | 0.81 (0.77 to 0.85) | 0.82 (0.78 to 0.86) | 0.88 (0.85 to 0.91)*,† |
| Colorectal | 0.84 (0.80 to 0.88) | 0.85 (0.81 to 0.89) | 0.88 (0.84 to 0.91)*,† |
| Lung | 0.71 (0.67 to 0.76) | 0.73 (0.69 to 0.77)* | 0.77 (0.73 to 0.82)*,† |
| Breast | No deaths in the period | | |
| Haematological | 0.73 (0.68 to 0.79) | 0.74 (0.69 to 0.79) | 0.76 (0.71 to 0.81) |
| Upper gastrointestinal | 0.74 (0.69 to 0.78) | 0.64 (0.60 to 0.69) | 0.84 (0.80 to 0.87)† |
| Skin | 0.84 (0.77 to 0.90) | 0.85 (0.79 to 0.91) | 0.91 (0.86 to 0.96)*,† |
| Head and neck | 0.66 (0.61 to 0.71) | 0.70 (0.64 to 0.75) | 0.77 (0.72 to 0.82)*,† |
| Gynaecological | 0.97 (0.94 to 0.99) | 0.99 (0.98 to 1)* | 1 (0.99 to 1)* |
| CNS | 0.89 (0.85 to 0.94) | 0.84 (0.78 to 0.90) | 0.82 (0.77 to 0.88) |
| Unknown primary | 0.92 (0.89 to 0.95) | 0.79 (0.75 to 0.84) | 0.90 (0.87 to 0.93)*,† |
*Significantly greater than EAR only.
†Significantly greater than ECO only.
CNS, central nervous system; EAR, electronic administrative records; ECO, Evaluation of Cancer Outcomes; ROC, receiver operating characteristic.
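The footnotes above flag cases where one feature set gives a significantly greater area under the ROC curve than another. The source does not state which test was used, so the following is only a sketch of one common approach: a paired bootstrap on the difference between two AUCs computed on the same patients. The names `auc` and `paired_auc_difference` and the toy data are hypothetical.

```python
import random

def auc(scores, labels):
    """AUC as the fraction of positive/negative pairs ranked correctly
    (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def paired_auc_difference(scores_a, scores_b, labels,
                          n_boot=2000, alpha=0.05, seed=0):
    """Return (observed AUC_a - AUC_b, lower bound, upper bound) using a
    percentile bootstrap that resamples the same patients for both models.
    This is an assumed procedure, not the paper's stated method."""
    observed = auc(scores_a, labels) - auc(scores_b, labels)
    rng = random.Random(seed)
    n = len(labels)
    diffs = []
    while len(diffs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # both outcome classes must be present
            a = auc([scores_a[i] for i in idx], ys)
            b = auc([scores_b[i] for i in idx], ys)
            diffs.append(a - b)
    diffs.sort()
    lo = diffs[int(alpha / 2 * n_boot)]
    hi = diffs[int((1 - alpha / 2) * n_boot) - 1]
    return observed, lo, hi

# Toy data (hypothetical): model A separates the classes, model B inverts them.
labels = [1, 1, 1, 0, 0, 0]
model_a = [0.9, 0.8, 0.7, 0.3, 0.2, 0.1]
model_b = [0.1, 0.2, 0.3, 0.7, 0.8, 0.9]
diff, lo, hi = paired_auc_difference(model_a, model_b, labels, n_boot=200)
print(diff, lo, hi)
```

Under this sketch, model A would be called significantly better than model B when the lower bound of the interval lies above zero. Resampling the same indices for both models preserves the pairing, which matters because both feature sets are evaluated on the same cases.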
Prediction performance of machine-learning algorithms: 12-month survival
| Cancer type | EAR only, area under ROC curve (95% CI) | ECO only, area under ROC curve (95% CI) | EAR+ECO, area under ROC curve (95% CI) |
|---|---|---|---|
| Genitourinary | 0.79 (0.75 to 0.83) | 0.79 (0.75 to 0.83) | 0.84 (0.80 to 0.87)*,† |
| Colorectal | 0.82 (0.78 to 0.86) | 0.83 (0.79 to 0.86) | 0.87 (0.83 to 0.90)*,† |
| Lung | 0.73 (0.69 to 0.77) | 0.78 (0.73 to 0.82)* | 0.82 (0.78 to 0.86)*,† |
| Breast | 0.71 (0.65 to 0.78) | 0.90 (0.86 to 0.94) | 0.92 (0.89 to 0.96)* |
| Haematological | 0.63 (0.59 to 0.68) | 0.70 (0.66 to 0.75)* | 0.69 (0.64 to 0.74)* |
| Upper gastrointestinal | 0.62 (0.57 to 0.66) | 0.70 (0.65 to 0.74)* | 0.72 (0.68 to 0.76)* |
| Skin | 0.76 (0.71 to 0.88) | 0.89 (0.85 to 0.93)* | 0.93 (0.90 to 0.96)* |
| Head and neck | 0.77 (0.73 to 0.88) | 0.68 (0.63 to 0.73) | 0.79 (0.75 to 0.84)† |
| Gynaecological | 0.95 (0.92 to 0.97) | 1 (1 to 1)* | 0.99 (0.98 to 1)* |
| CNS | 0.66 (0.58 to 0.73) | 0.68 (0.61 to 0.76) | 0.69 (0.63 to 0.76) |
| Unknown primary | 0.87 (0.84 to 0.91) | 0.81 (0.77 to 0.85) | 0.88 (0.84 to 0.91) |
*Significantly greater than EAR only.
†Significantly greater than ECO only.
CNS, central nervous system; EAR, electronic administrative records; ECO, Evaluation of Cancer Outcomes; ROC, receiver operating characteristic.
Prediction performance of machine-learning algorithms: 24-month survival
| Cancer type | EAR only, area under ROC curve (95% CI) | ECO only, area under ROC curve (95% CI) | EAR+ECO, area under ROC curve (95% CI) |
|---|---|---|---|
| Genitourinary | 0.73 (0.69 to 0.78) | 0.84 (0.81 to 0.88)* | 0.86 (0.82 to 0.89)*,† |
| Colorectal | 0.76 (0.72 to 0.80) | 0.76 (0.72 to 0.80) | 0.76 (0.72 to 0.80) |
| Lung | 0.74 (0.69 to 0.78) | 0.78 (0.73 to 0.82)* | 0.82 (0.79 to 0.86)*,† |
| Breast | 0.67 (0.61 to 0.73) | 0.86 (0.82 to 0.90)* | 0.88 (0.84 to 0.92)* |
| Haematological | 0.73 (0.68 to 0.77) | 0.70 (0.66 to 0.75) | 0.80 (0.76 to 0.84)*,† |
| Upper gastrointestinal | 0.81 (0.77 to 0.85) | 0.77 (0.72 to 0.81) | 0.87 (0.83 to 0.90)*,† |
| Skin | 0.71 (0.65 to 0.76) | 0.85 (0.80 to 0.89)* | 0.94 (0.92 to 0.97)*,† |
| Head and neck | 0.74 (0.70 to 0.78) | 0.66 (0.51 to 0.61) | 0.71 (0.67 to 0.76)† |
| Gynaecological | 0.96 (0.94 to 0.99) | 0.99 (0.98 to 1)* | 0.97 (0.95 to 0.99) |
| CNS | 0.83 (0.78 to 0.89) | 0.87 (0.82 to 0.93) | 0.96 (0.93 to 0.99)*,† |
| Unknown primary | 0.74 (0.70 to 0.79) | 0.78 (0.74 to 0.82)* | 0.80 (0.76 to 0.84)* |
*Significantly greater than EAR only.
†Significantly greater than ECO only.
CNS, central nervous system; EAR, electronic administrative records; ECO, Evaluation of Cancer Outcomes; ROC, receiver operating characteristic.