Thomas M A Wilkinson1,2,3, Michael J Boniface4, Francis P Chmiel4, Dan K Burns4, John Brian Pickering4, Alison Blythin1.
Abstract
BACKGROUND: Self-reporting digital apps provide a way of remotely monitoring and managing patients with chronic conditions in the community. Leveraging the data collected by these apps in prognostic models could provide increased personalization of care and reduce the burden of care for people who live with chronic conditions. This study evaluated the predictive ability of prognostic models for the prediction of acute exacerbation events in people with chronic obstructive pulmonary disease by using data self-reported to a digital health app.
Keywords: COPD; chronic disease; digital applications; digital health; exacerbation events; health care applications; mHealth; machine learning; mobile health; myCOPD; remote monitoring
Year: 2022 PMID: 35311685 PMCID: PMC8981014 DOI: 10.2196/26499
Source DB: PubMed Journal: JMIR Med Inform
Figure 1Self-reported symptom scores and chronic obstructive pulmonary disease assessment test (CAT) scores. (A) Symptom score rankings and classification of whether this score corresponds to an exacerbation event, as defined in the context of this work. (B) Example user (with high reporting frequency) self-reporting timeline where the top panel displays self-reported symptom scores and the bottom panel self-reported CAT results. CAT: chronic obstructive pulmonary disease assessment test.
Figure 2Selection of self-reports in our study cohort containing 2374 patients. Isolated reports (n=24,801) were those without a subsequent report in the following 3 days. Anomalous users (n=1942) were those who only reported exacerbation events or self-reported to the myCOPD app before their registration date. Exacerbation events (n=5906) were all self-reported to the app by 742 patients.
Patient demographics and smoking status in our cohort (N=2374). All information was self-reported to the myCOPD app.
| Group, subgroups | Patients, n (%) |
| Age group (years) | |
| Missing | 10 (0.4) |
| 19-29 | 7 (0.3) |
| 30-39 | 39 (1.6) |
| 40-49 | 89 (3.7) |
| 50-59 | 325 (13.7) |
| 60-69 | 791 (33.3) |
| 70-79 | 881 (37.1) |
| 80-89 | 212 (8.9) |
| 90-99 | 15 (0.6) |
| 100-110 | 5 (0.2) |
| Sex | |
| Missing | 1724 (72.6) |
| Male | 419 (17.6) |
| Female | 231 (9.7) |
| Smoking status | |
| Missing | 1217 (51.3) |
| Ex-smoker | 843 (35.3) |
| Nonsmoker | 156 (6.6) |
| Smoker | 158 (6.7) |
Figure 3. Self-reported symptom scores and results of the chronic obstructive pulmonary disease assessment test (CAT) for reports in our 2374-patient cohort. (A-D) Self-reported CAT results stratified by the self-reported symptom score (one row per score) on the day of test completion. (E) Mean self-reported symptom scores in the days preceding (and following) the day on which a patient self-reports their first exacerbation event. (F) Mean self-reported CAT results in the days preceding (and following) the day on which a patient self-reports their first exacerbation event. Grey dashed lines in all panels mark the day of the first reported exacerbation event (time=0 days). Panels E and F indicate that exacerbation events can be associated with a worsening of symptom scores and CAT results several days in advance of the event. The width of the observed peaks (see panel E, right of the dashed line) following the start of the exacerbation event demonstrates that exacerbation events can span multiple days. CAT: chronic obstructive pulmonary disease assessment test.
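The event-aligned averaging behind panels E and F can be illustrated with a short sketch. This is a hypothetical reconstruction on synthetic data, not the study's code: each patient's self-reports are re-indexed relative to the day of their first exacerbation, then symptom scores are averaged per relative day. All variable names are illustrative.

```python
import numpy as np
import pandas as pd

# Synthetic self-report data: 50 patients, 30 days of daily symptom scores,
# with scores drifting upward around each patient's first exacerbation day.
rng = np.random.default_rng(0)
rows = []
for patient in range(50):
    event_day = int(rng.integers(10, 20))   # day of first reported exacerbation
    for day in range(30):
        score = 1.0 + rng.normal(0, 0.3)
        if abs(day - event_day) <= 3:       # symptoms worsen near the event
            score += 1.5 - 0.4 * abs(day - event_day)
        rows.append({"patient": patient, "day": day,
                     "event_day": event_day, "symptom_score": score})

reports = pd.DataFrame(rows)

# Align every report to the patient's first exacerbation (time = 0 days),
# then average scores by relative day, as in Figure 3E.
reports["rel_day"] = reports["day"] - reports["event_day"]
mean_by_rel_day = reports.groupby("rel_day")["symptom_score"].mean()
# The averaged curve peaks at rel_day == 0 and rises in the days just before it.
```

In the study's real data, the rise to the left of the dashed line in panel E is what makes day-ahead prediction feasible at all: the signal precedes the self-reported event.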
Figure 4. Model performance evaluated on the patient holdout test set. (A) Receiver operating characteristic curves of our models. The baseline model (dashed line) has only 1 nontrivial threshold for dichotomizing the prediction (diamond marker), whereas the machine-learned models have a range of possible thresholds, which need to be tuned to suit the use case (the so-called sensitivity-specificity trade-off). (B) Feature importance (Gini importance) for the random forest model. CAT: chronic obstructive pulmonary disease assessment test.
Model performances evaluated on the holdout test set.a
| Name | Model | Area under the receiver operating characteristic curve (95% CI) | Threshold | Sensitivity (95% CI) | Specificity (95% CI) |
| A | Baseline model | 0.655 (0.632-0.676) | N/Ab | 0.551 (0.508-0.596) | 0.759 (0.752-0.767) |
| B | Logistic regression | 0.697 (0.689-0.711) | Youden’s J statistic | 0.708 (0.625-0.768) | 0.644 (0.574-0.706) |
| C | Random forest | 0.727 (0.720-0.735) | Youden’s J statistic | 0.755 (0.676-0.813) | 0.629 (0.564-0.700) |
| D | Random forest | 0.727 (0.720-0.735) | Specificity=0.25 | 0.921 (0.907-0.935) | 0.250 (0.246-0.254) |
| E | Random forest | 0.727 (0.720-0.735) | Specificity=0.75 | 0.576 (0.553-0.594) | 0.750 (0.749-0.751) |
aThe area under the receiver operating characteristic curve column denotes the area under the receiver operating characteristic curve (Figure 4) for each model. The 3 rightmost columns display the sensitivity and specificity of the models at predicting exacerbations with different thresholds used to dichotomize the predictions. The baseline model is already binary and only has 1 nontrivial configuration, but the threshold used to dichotomize the machine learning models (B-E) can be tuned to suit the intended context of the model. The maximum of Youden's J statistic is used as a baseline criterion for dichotomizing the prediction (models B and C), and other cutoffs yielding fixed specificities are investigated for the random forest model. The area under the receiver operating characteristic curve is the same for models C, D, and E since they correspond to the same underlying model; only the dichotomization threshold differs.
bN/A: not applicable.
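The threshold selection described in the footnote can be sketched as follows. This is a minimal illustration on synthetic data (not the study's models or data): it computes a receiver operating characteristic curve, then picks the probability cutoff that maximizes Youden's J statistic, J = sensitivity + specificity - 1, the criterion used for models B and C.

```python
import numpy as np
from sklearn.metrics import roc_auc_score, roc_curve

# Synthetic binary labels and mildly informative risk scores (illustrative only).
rng = np.random.default_rng(42)
y_true = rng.integers(0, 2, size=2000)
y_score = np.clip(0.3 * y_true + rng.normal(0.4, 0.2, size=2000), 0.0, 1.0)

auc = roc_auc_score(y_true, y_score)

# roc_curve returns one (fpr, tpr) pair per candidate threshold.
fpr, tpr, thresholds = roc_curve(y_true, y_score)
j = tpr - fpr                        # Youden's J at each candidate threshold
best = int(np.argmax(j))
best_threshold = thresholds[best]
sensitivity = tpr[best]              # true-positive rate at the chosen cutoff
specificity = 1.0 - fpr[best]        # true-negative rate at the chosen cutoff
```

Fixing a target specificity instead (as for models D and E) amounts to scanning `fpr` for the point closest to 1 minus the target and reading off the corresponding threshold; the underlying model and its area under the curve are unchanged.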