Bernard Aguilaniu, David Hess, Eric Kelkel, Amandine Briault, Marie Destors, Jacques Boutros, Pei Zhi Li, Anestis Antoniadis.
Abstract
Facilitating the identification of extreme inactivity (EI) has the potential to improve morbidity and mortality in COPD patients. Apart from patients with obvious EI, identifying such behavior during a real-life consultation is unreliable. Because actimetry measurements are difficult to implement, we describe a machine learning algorithm to screen for EI. Complete datasets for 1409 COPD patients were obtained from COLIBRI-COPD, a database of clinicopathological data submitted by French pulmonologists. Patient- and pulmonologist-reported estimates of physical activity (PA) quantity (daily walking time) and intensity (domestic, recreational, or fitness-directed) were first used to assign patients to one of four PA groups (extremely inactive [EI], overtly active [OA], intermediate [INT], inconclusive [INC]). The algorithm was developed by (i) using data from 80% of patients in the EI and OA groups to identify 'phenotype signatures' of non-PA-related clinical variables most closely associated with EI or OA; (ii) testing its predictive validity using data from the remaining 20% of EI and OA patients; and (iii) applying the algorithm to identify EI patients in the INT and INC groups. The algorithm's overall error for predicting EI status among EI and OA patients was 13.7%, with an area under the receiver operating characteristic curve of 0.84 (95% confidence interval: 0.75–0.92). Of the 577 patients in the INT/INC groups, 306 (53%) were reclassified as EI by the algorithm. Patient- and physician-reported estimates may underestimate EI in a large proportion of COPD patients. This algorithm may assist physicians in identifying patients in urgent need of interventions to promote PA.
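The 80%/20% division of EI and OA patients described in steps (i) and (ii) can be illustrated with a minimal sketch. It assumes the split was stratified by PA group, which this record does not state; the function name `stratified_split`, the seed, and the use of rounding are illustrative choices, not the authors' procedure. Only the group sizes (172 EI, 660 OA) come from the source.

```python
import random


def stratified_split(labels, train_fraction=0.8, seed=0):
    """Split indices into train/test sets, handling each class separately.

    Hypothetical helper for illustration only (assumes a stratified split).
    """
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)

    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_train = round(train_fraction * len(indices))
        train_idx.extend(indices[:n_train])
        test_idx.extend(indices[n_train:])
    return sorted(train_idx), sorted(test_idx)


# Group sizes reported in the source: 172 EI and 660 OA patients.
labels = ["EI"] * 172 + ["OA"] * 660
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))
```

Under these assumptions the held-out set contains 166 patients, which matches the test-set size reported with the performance table.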
Year: 2021 PMID: 34411121 PMCID: PMC8376055 DOI: 10.1371/journal.pone.0255977
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Fig 1. Study design.
See Table 1 for definitions of activity categories.
Clinical and functional characteristics of the stratified COPD patients (n = 1409).
| Variable | EI (n = 172) | INT (n = 410) | INC (n = 167) | OA (n = 660) | p-value |
|---|---|---|---|---|---|
| Age (years) | 67.5 ± 10.1 | 65.4 ± 9.5 | 65.9 ± 8.6 | 65.5 ± 8.3 | 0.063 |
| Male gender | 60.5% | 63.9% | 61.7% | 73.5% | **** |
| BMI (kg/m²) | 26.5 ± 6.8 | 26.2 ± 5.9 | 25.0 ± 5.4 | 25.8 ± 5.1 | 0.058 |
| Smokers (current or ex) | 96.5% | 96.3% | 96.4% | 96.4% | 0.07 |
| mMRC score | 2.7 ± 1.1 | 1.9 ± 1.1 | 1.8 ± 1 | 1.2 ± 0.9 | **** |
| DIRECT score | 17.4 ± 8.6 | 13.0 ± 8 | 12.0 ± 6.9 | 8.6 ± 6.4 | **** |
| CAT score | 21.3 ± 8.1 | 18.0 ± 7.6 | 17.4 ± 7.5 | 14.1 ± 7.2 | **** |
| HADS Anxiety subscore | 7.4 ± 4.6 | 6.3 ± 4.5 | 6.1 ± 4 | 5.4 ± 3.7 | **** |
| HADS Depression subscore | 8.2 ± 4.8 | 6.2 ± 4.2 | 5.6 ± 3.7 | 4.7 ± 3.5 | **** |
| FEV1 (L) | 1.28 ± 0.6 | 1.57 ± 0.6 | 1.58 ± 0.7 | 1.82 ± 0.7 | **** |
| FEV1 (% predicted) | 50.9 ± 22.6 | 59.2 ± 22 | 59.2 ± 22.9 | 65.5 ± 20.5 | **** |
| FVC (L) | 2.5 ± 0.9 | 2.86 ± 0.9 | 3.0 ± 1.1 | 3.24 ± 1 | **** |
| FVC (% predicted) | 77.9 ± 24.3 | 85.4 ± 22.5 | 89.6 ± 25.2 | 92.8 ± 21.2 | **** |
| FEV1/FVC (%) | 50.6 ± 14.5 | 54.4 ± 13.3 | 52 ± 14 | 55.3 ± 11.8 | **** |
| GOLD 1 | 13.4% | 18.3% | 21.6% | 24.7% | **** |
| GOLD 2 | 32.0% | 45.9% | 36.5% | 49.4% | **** |
| GOLD 3 | 25.6% | 24.4% | 29.3% | 21.7% | **** |
| GOLD 4 | 29.1% | 11.5% | 12.6% | 4.2% | **** |
| Cardiovascular disease and/or diabetes | 83.1% | 71.0% | 69.5% | 63.9% | **** |
| Treated for anxiety or depression | 72.7% | 59.8% | 59.9% | 51.1% | **** |
| Exacerbation within the previous year (≥1 severe or ≥2 mild/moderate) | 48.3% | 32.7% | 39.5% | 25.0% | **** |
| GOLD A | 2.9% | 6.8% | 10.2% | 22.0% | **** |
| GOLD B | 48.8% | 60.5% | 50.3% | 53.0% | **** |
| GOLD C | 0.0% | 2.4% | 2.4% | 3.9% | **** |
| GOLD D | 48.3% | 30.2% | 37.1% | 21.1% | **** |
Data are presented as the percentage or mean ± standard deviation. Comparisons between PA categories were performed by Kruskal-Wallis tests and ANOVA with ordinal factors test (ordAOV). Significant differences are noted: p< 0,…****; p< 0.001 ***; p< 0.01 **; p< 0.05 *.
Abbreviations: BMI, body mass index; CAT, COPD Assessment Test; COPD, chronic obstructive pulmonary disease; DIRECT, Disability related to COPD Tool; FEV1, forced expiratory volume in 1 s; FVC, forced vital capacity; GOLD, Global Initiative for Chronic Obstructive Lung Disease classification; HADS, Hospital Anxiety and Depression Scale; mMRC, modified Medical Research Council dyspnea scale. For EI, INT, INC, and OA definitions, see Table 1.
Categorization of physical activity levels in COPD patients according to combined patient- and physician-derived estimates.
| Patient’s estimate (daily walking time; n = 1409) | Physician’s estimate (activity intensity; n = 1409): (D)omestic (n = 504) | (R)ecreational (n = 530) | (A)ctive (n = 375) |
|---|---|---|---|
| (1) ≤ 10 min (n = 203) | EI (n = 172) | INT-b (n = 23); EI predicted = 9 | INC-c (n = 8); EI predicted = 3 |
| (2) 10–30 min (n = 440) | INT-a (n = 226); EI predicted = 140 | INT-c (n = 161); EI predicted = 74 | INC-d (n = 53); EI predicted = 22 |
| (3) 30–60 min (n = 399) | INC-a (n = 69); EI predicted = 41 | OA (n = 194) | OA (n = 136) |
| (4) >60 min (n = 367) | INC-b (n = 37); EI predicted = 17 | OA (n = 152) | OA (n = 178) |
Abbreviations: EI, extremely inactive category; OA, overtly active category; INT (a,b,c), physical activity levels intermediate between EI and OA; INC (a,b,c,d), incompatible physician and patient estimates of activity. (D)omestic, activities mainly confined to the home; (R)ecreational, predominantly outside the home; (A)ctive, predominantly devoted to maintaining fitness.
*EI predicted indicates the number of patients within each INT and INC subcategory reassigned to the EI category by the predictive algorithm.
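The categorization grid can be expressed as a simple lookup, which also makes it easy to check that the cell counts agree with the group totals given elsewhere in this record. The following is a minimal sketch: the cell counts and "EI predicted" values are taken from the table, while the names `CELLS` and `pa_group` and the row/column encoding (rows 1 to 4, columns "D", "R", "A") are hypothetical choices made for illustration.

```python
from collections import Counter

# (walking-time row, intensity column) -> (subcategory, n, EI predicted or None)
CELLS = {
    (1, "D"): ("EI", 172, None),
    (1, "R"): ("INT-b", 23, 9),
    (1, "A"): ("INC-c", 8, 3),
    (2, "D"): ("INT-a", 226, 140),
    (2, "R"): ("INT-c", 161, 74),
    (2, "A"): ("INC-d", 53, 22),
    (3, "D"): ("INC-a", 69, 41),
    (3, "R"): ("OA", 194, None),
    (3, "A"): ("OA", 136, None),
    (4, "D"): ("INC-b", 37, 17),
    (4, "R"): ("OA", 152, None),
    (4, "A"): ("OA", 178, None),
}


def pa_group(walking_row, intensity):
    """Return the PA group (EI, INT, INC, or OA) for one grid cell."""
    subcategory = CELLS[(walking_row, intensity)][0]
    return subcategory.split("-")[0]


# Aggregate cell counts into group totals and sum the reclassified patients.
totals = Counter()
reclassified = 0
for subcategory, n, ei_predicted in CELLS.values():
    totals[subcategory.split("-")[0]] += n
    if ei_predicted is not None:
        reclassified += ei_predicted

int_inc = totals["INT"] + totals["INC"]
print(dict(totals))
print(f"{reclassified} of {int_inc} reclassified ({reclassified / int_inc:.0%})")
```

The group totals recovered this way (172 EI, 410 INT, 167 INC, 660 OA) and the 306 of 577 reclassified patients agree with the figures quoted in the abstract.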
Fig 2. Univariate boxplots comparing the distribution of selected continuous variables according to the physical activity category described here (top row) and GOLD 2017 category (bottom row). Plots show the median, minimum, maximum, and interquartile values. See Table 1 for definitions of activity categories.
Fig 3. Box plots (categorical/ordinal variables) and line plots (continuous variables) of the marginal effect of a predictor (x-axis) on the probability of a patient being assigned to the EI category according to the weighted random forest method (y-axis). See also S3 Fig for the inverse analysis of probability of assignment to the OA category. Box plots show the median, minimum, maximum, and interquartile values. See Table 1 for definitions of activity categories.
Evaluation of the performance of the predictive algorithm.
| Method | Overall error | Accuracy | PPV | NPV | Sensitivity (EI) | Sensitivity (OA) | AUPRC | AUROC |
|---|---|---|---|---|---|---|---|---|
| HyperSMURF | 13.7% | 0.76 (0.69–0.82) | 0.45 | 0.93 | 79.4% | 75.0% | 0.64 | 0.84 (0.75–0.92) |
| Weighted Random Forest | 14.1% | 0.84 (0.87–0.90) | 0.63 | 0.90 | 59.0% | 90.9% | 0.49 | 0.75 (0.66–0.84) |
| BiMM Random Forest | 14.2% | 0.84 (0.78–0.90) | 0.87 | 0.67 | 47.0% | 94.0% | 0.47 | 0.70 (0.62–0.80) |
*Accuracy and AUROC are presented with 95% confidence intervals.
Data are based on analysis of 20% (n = 166) of the patients in the OA and EI groups.
Abbreviations: AUROC, area under the receiver operating characteristic curve; AUPRC, area under the precision-recall curve; BiMM, binary mixed model forest algorithm (23); HyperSMURF, hyper-ensemble of SMOTE undersampled random forests (24); NPV, negative predictive value; PPV, positive predictive value. See Table 1 for definitions of EI and OA.
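To show how the reported accuracy, PPV, NPV, and per-class sensitivities relate to one another, the sketch below computes them from a 2 × 2 confusion matrix with EI as the positive class. The confusion-matrix counts are not reported in this record: they are illustrative values, back-calculated on the assumption that the 166 test patients comprised 34 EI and 132 OA patients, and chosen because they reproduce the HyperSMURF row to the reported precision. The function name `binary_metrics` is hypothetical.

```python
def binary_metrics(tp, fn, fp, tn):
    """Standard metrics from a 2x2 confusion matrix (EI = positive class)."""
    total = tp + fn + fp + tn
    return {
        "sensitivity_EI": tp / (tp + fn),  # EI patients correctly identified
        "sensitivity_OA": tn / (tn + fp),  # OA patients correctly identified
        "PPV": tp / (tp + fp),             # predicted EI that are truly EI
        "NPV": tn / (tn + fn),             # predicted OA that are truly OA
        "accuracy": (tp + tn) / total,
    }


# Illustrative counts only (assumed, not taken from the source):
# 34 EI patients (27 detected, 7 missed), 132 OA patients (99 correct, 33 not).
metrics = binary_metrics(tp=27, fn=7, fp=33, tn=99)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these assumed counts the sensitivity for EI is about 79.4%, the sensitivity for OA 75.0%, PPV 0.45, NPV about 0.93, and accuracy about 0.76. The reported overall error (13.7%) is not reproduced by these counts; how it was calculated is not stated in this record.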
Fig 4. Receiver operating characteristic curves for the prediction of EI using weighted random forest (WRF, left) and hyper-ensemble of SMOTE undersampled random forests (HyperSMURF, right) methods. Areas under the curves are shown as the median and 95% confidence intervals.