Xiaoyue Wang1,2, Hong He3,4, Liang Xu5, Cuicui Chen1,2, Jieqing Zhang2,6, Na Li5, Xianxian Chen5, Weipeng Jiang1,2, Li Li1,2, Linlin Wang1,2, Yuanlin Song1,2, Jing Xiao5, Jun Zhang3,4, Dongni Hou1,2.
Abstract
BACKGROUND: Active targeted case-finding is a cost-effective way to identify high-risk individuals for early diagnosis and intervention in chronic obstructive pulmonary disease (COPD). A precise and practical COPD screening instrument is needed in health care settings.
Keywords: chronic obstructive pulmonary disease; generalized additive model; machine learning; screening; smoking
Year: 2022 PMID: 35943965 PMCID: PMC9373185 DOI: 10.1177/14799731221116585
Source DB: PubMed Journal: Chron Respir Dis ISSN: 1479-9723 Impact factor: 3.115
Clinical characteristics of participants in cross-sectional data set and prospective validation data set.
| Characteristics | Cross-sectional data set: Non-COPD | Cross-sectional data set: COPD | p-value | Prospective data set: Non-COPD | Prospective data set: COPD | p-value |
|---|---|---|---|---|---|---|
| Age (year) | 53.2 ± 12.3 | 63.9 ± 10.1 | <.001 | 61.1 ± 9.7 | 69.1 ± 9.3 | <.001 |
| Male (%) | 1719 (41.1) | 344 (62.4) | <.001 | 403 (52.6) | 163 (84.9) | <.001 |
| Height (cm) | 161.4 ± 8.0 | 162.2 ± 8.2 | .009 | 163.6 ± 8.0 | 165.9 ± 7.0 | <.001 |
| Weight (kg) | 63.3 ± 10.8 | 64.1 ± 11.1 | .11 | 63.4 ± 10.4 | 65.4 ± 10.5 | .04 |
| BMI (kg/m2) | 24.2 ± 3.3 | 24.3 ± 3.5 | .58 | 23.7 ± 3.2 | 23.7 ± 3.2 | .88 |
| Cigarette smoke exposure | ||||||
| Current smoker | 869 (20.8) | 171 (31.0) | <.001 | 153 (20.0) | 41 (21.4) | .74 |
| Former smoker with passive smoking | 178 (4.2) | 65 (11.7) | <.001 | 85 (11.1) | 60 (31.2) | <.001 |
| Former smoker without passive smoking | 6 (0.1) | 2 (0.4) | .53 | 59 (7.7) | 40 (20.8) | <.001 |
| Never-smoker with passive smoking | 2817 (67.3) | 287 (52.0) | <.001 | 264 (34.4) | 17 (8.9) | <.001 |
| Never-smoker without passive smoking | 315 (7.6) | 26 (4.9) | .02 | 205 (26.8) | 34 (17.7) | .01 |
| Pack-years in current smokers | 6.7 ± 15.1 | 15.0 ± 21.7 | <.001 | 12.1 ± 21.8 | 29.7 ± 28.4 | <.001 |
| Respiratory symptoms | ||||||
| Dyspnea | 199 (4.8) | 64 (11.6) | <.001 | 31 (4.0) | 74 (38.5) | <.001 |
| Wheeze | 145 (3.5) | 97 (17.6) | <.001 | 81 (10.6) | 123 (64.1) | <.001 |
| mMRC grade ≥3 | 353 (8.4) | 179 (32.4) | <.001 | — | — | — |
| Chronic cough | 286 (6.8) | 84 (15.2) | <.001 | 122 (15.9) | 68 (35.4) | <.001 |
| Chronic phlegm | 269 (6.4) | 94 (17.0) | <.001 | 130 (17.0) | 104 (54.2) | <.001 |
| Any of the above respiratory symptoms | 894 (21.4) | 269 (48.7) | <.001 | 262 (34.2) | 159 (82.8) | <.001 |
| COPD grade | ||||||
| I | — | 334 (60.6) | — | — | 32 (16.7) | — |
| II | — | 173 (31.4) | — | — | 79 (41.1) | — |
| III | — | 38 (6.9) | — | — | 65 (33.9) | — |
| IV | — | 6 (1.1) | — | — | 16 (8.3) | — |
| FEV1 (mL) | 2693.3 ± 658.3 | 2085.0 ± 669.6 | <.001 | 2771.1 ± 711.0 | 1530.3 ± 625.1 | <.001 |
| FVC (mL) | 3276.5 ± 798.4 | 3382.3 ± 1057.3 | .03 | 3333.4 ± 920.9 | 2685.1 ± 759.4 | <.001 |
| FEV1/FVC (%) | 82.4 ± 6.3 | 61.5 ± 9.2 | <.001 | 83.7 ± 5.2 | 55.8 ± 11.4 | <.001 |
| Previous diagnosis of respiratory conditions | ||||||
| Asthma | 34 (0.8) | 34 (6.2) | <.001 | 12 (1.6) | 22 (11.5) | <.001 |
| COPD | 11 (0.3) | 17 (3.1) | <.001 | 0 (0) | 136 (70.8) | <.001 |
| Tuberculosis | 13 (0.3) | 5 (0.9) | .08 | 21 (2.7) | 16 (8.3) | .007 |
| Chronic bronchitis | 144 (3.4) | 66 (12.0) | <.001 | 42 (5.5) | 58 (30.2) | <.001 |
a Data are presented as n (%) or mean ± SD; p-values are calculated using the chi-square test or the Mann–Whitney U test.
Statistics of selected predictors of COPD in cross-sectional and prospective data sets.
| Predictors | Cross-sectional data set: Non-COPD | Cross-sectional data set: COPD | p-value | Prospective data set: Non-COPD | Prospective data set: COPD | p-value |
|---|---|---|---|---|---|---|
| Age | 53.2 ± 12.3 | 63.9 ± 10.1 | <.001 | 61.1 ± 9.7 | 69.1 ± 9.3 | <.001 |
| Gender | | | | | | |
| Female | 2466 (58.9) | 207 (37.6) | <.001 | 363 (47.4) | 29 (15.1) | <.001 |
| Male | 1719 (41.1) | 344 (62.4) | | 406 (52.6) | 163 (84.9) | |
| Pack-years | 0.0 (0.0–0.0) | 0.0 (0.0–30.25) | <.001 | 0.0 (0.0–20.75) | 25.5 (0.0–45.0) | <.001 |
| Job | | | | | | |
| Unemployed | 565 (13.5) | 118 (21.4) | <.001 | 435 (56.8) | 144 (75.0) | <.001 |
| Worker | 1086 (25.9) | 73 (13.2) | | 38 (5.0) | 9 (4.7) | |
| Farmer | 842 (20.1) | 146 (26.5) | | 62 (8.1) | 21 (10.9) | |
| Technical staff | 197 (4.7) | 16 (2.9) | | 41 (5.4) | 4 (2.1) | |
| Housekeeper | 89 (2.1) | 9 (1.6) | | 9 (1.2) | 2 (1.0) | |
| Official | 76 (1.8) | 4 (0.7) | | 37 (4.8) | 2 (1.0) | |
| Driver | 69 (1.6) | 7 (1.3) | | 5 (0.7) | 1 (0.5) | |
| Cook | 38 (0.9) | 1 (0.2) | | 1 (0.1) | 0 (0) | |
| Student | 19 (0.5) | 1 (0.2) | | 0 (0) | 0 (0) | |
| Others | 1190 (28.4) | 174 (31.6) | | 138 (18.0) | 9 (4.7) | |
| Years of smoking cessation | 0.0 (0.0–0.0) | 0.0 (0.0–0.0) | <.001 | 0.0 (0.0–0.0) | 1.00 (0.00–5.25) | .34 |
| Morning productive cough | 571 (13.6) | 102 (18.5) | <.001 | 80 (10.4) | 68 (35.4) | <.001 |
| Wheeze | 127 (3.0) | 94 (17.1) | <.001 | 81 (10.6) | 123 (64.1) | <.001 |
a Data are presented as n (%), median (interquartile range), or mean ± SD; p-values are calculated using the chi-square test or the Mann–Whitney U test.
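As an illustration of the chi-square comparison named in the footnote, the following minimal sketch recomputes the test for gender in the cross-sectional data set from the counts in the table (female: 2466 non-COPD, 207 COPD; male: 1719 non-COPD, 344 COPD). The function name `chi_square_2x2` is hypothetical, and the sketch assumes a Pearson chi-square test without continuity correction; the source does not state which variant or software the authors used.

```python
import math


def chi_square_2x2(a: int, b: int, c: int, d: int) -> tuple[float, float]:
    """Pearson chi-square test for a 2x2 table [[a, b], [c, d]].

    Assumption: no continuity correction is applied.
    Returns the test statistic and the p-value (1 degree of freedom).
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom, the chi-square survival function
    # equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Rows: female, male. Columns: non-COPD, COPD (cross-sectional data set).
stat, p_value = chi_square_2x2(2466, 207, 1719, 344)
print(f"chi-square = {stat:.1f}, p < .001: {p_value < 0.001}")
```

The resulting p-value falls below .001, in line with the value reported for gender in the table.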
Predictive ability of different models on cross-sectional validation dataset and prospective cohort dataset.
| Models | AUC (95% CI) | Sensitivity | Specificity | PPV | NPV | Accuracy | p-value |
|---|---|---|---|---|---|---|---|
| Cross-sectional validation dataset | |||||||
| GAM | 0.813 (0.753–0.867) | 0.91 | 0.55 | 0.21 | 0.98 | 0.59 | <.001 |
| LR | 0.811 (0.747–0.867) | 0.89 | 0.51 | 0.21 | 0.98 | 0.61 | <.001 |
| RF | 0.702 (0.628–0.777) | 0.4 | 0.88 | 0.39 | 0.92 | 0.86 | |
| XGBoost | 0.810 (0.750–0.864) | 0.98 | 0.19 | 0.14 | 0.99 | 0.3 | |
| Prospective cohort dataset | |||||||
| GAM | 0.880 (0.848–0.910) | 0.98 | 0.23 | 0.24 | 0.98 | 0.38 | <.001 |
| LR | 0.869 (0.836–0.901) | 0.97 | 0.26 | 0.24 | 0.97 | 0.4 | <.001 |
| RF | 0.875 (0.844–0.906) | 0.84 | 0.71 | 0.41 | 0.95 | 0.73 | |
| XGBoost | 0.869 (0.832–0.901) | 1 | 0.06 | 0.21 | 1 | 0.25 | |
AUC: area under the receiver operating characteristic curve; PPV: positive predictive value; NPV: negative predictive value; NA: not available for the model; LR: logistic regression; GAM: generalized additive model; RF: random forest; XGBoost: extreme gradient boosting; COPD: chronic obstructive pulmonary disease.
a For each model, a predicted probability higher than 0.075 was classified as at risk for COPD and any other value as not at risk when calculating the metrics. Spirometry-defined COPD was used as the gold standard.
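The metrics in the table follow from applying the 0.075 probability cut-off and comparing the result with spirometry-defined COPD. A minimal sketch of that calculation is given below; the function name `screening_metrics` and the toy probabilities and labels are made up for illustration and are not data from the study.

```python
from typing import Sequence


def screening_metrics(
    probs: Sequence[float], labels: Sequence[int], cutoff: float = 0.075
) -> dict[str, float]:
    """Compute screening metrics at a probability cut-off.

    probs:  predicted COPD probabilities from any model.
    labels: 1 = spirometry-defined COPD, 0 = non-COPD.
    A probability strictly higher than the cut-off counts as "at risk".
    """
    tp = fp = tn = fn = 0
    for prob, label in zip(probs, labels):
        at_risk = prob > cutoff
        if at_risk and label == 1:
            tp += 1
        elif at_risk and label == 0:
            fp += 1
        elif not at_risk and label == 0:
            tn += 1
        else:
            fn += 1

    def ratio(num: int, den: int) -> float:
        # Avoid division by zero when a class or prediction is absent.
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
    }


# Toy example (hypothetical values, not study data).
toy_probs = [0.02, 0.05, 0.10, 0.30, 0.08, 0.50, 0.01, 0.20]
toy_labels = [0, 0, 0, 1, 1, 1, 0, 0]
metrics = screening_metrics(toy_probs, toy_labels)
print(metrics)
```

A low cut-off such as 0.075 favors sensitivity and NPV at the cost of specificity and PPV, which matches the pattern seen for GAM, LR, and XGBoost in the table.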
COPD quick screening questionnaire (COPD-QSQ).
| Characteristics | Items in questionnaire | Answers |
|---|---|---|
| 1. Sex | What is your sex? | Male/Female |
| 2. Wheeze | Have you ever wheezed? | Yes/No |
| 3. Morning productive cough | Do you often cough up sputum when you wake up in the morning? | Yes/No |
| 4. Job | What is your current job? | Official/Unemployed/Farmer/Technical staff/Worker, cook, or housekeeper/Others |
| 5. Years of smoking cessation | How many years have you quit smoking? | Number |
| 6. Age | How old are you (years)? | Number |
| 7. Pack-years | How many packs of cigarettes do you smoke a year? | Number |
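For reference, pack-years are conventionally calculated as packs smoked per day multiplied by years of smoking, with one pack taken as 20 cigarettes. The sketch below shows that conventional calculation; the function name `pack_years` and its parameters are hypothetical, and the source does not specify how the questionnaire answer is converted into the pack-years predictor.

```python
def pack_years(
    cigarettes_per_day: float, years_smoked: float, cigarettes_per_pack: int = 20
) -> float:
    """Conventional pack-years: packs per day multiplied by years smoked.

    Assumption: one pack contains 20 cigarettes unless stated otherwise.
    """
    packs_per_day = cigarettes_per_day / cigarettes_per_pack
    return packs_per_day * years_smoked


# One pack a day for 30 years gives 30 pack-years.
print(pack_years(20, 30))
# Half a pack a day for 20 years gives 10 pack-years.
print(pack_years(10, 20))
```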
Figure 1. Comparison of the area under the curve (AUC) between the generalized additive model and three previous approaches on the cross-sectional validation data. The models included seven predictors: age, morning productive cough, wheeze, years of smoking cessation, gender, job, and pack-years of smoking.
Figure 2. Positive predictive value (PPV) and negative predictive value (NPV) of the generalized additive model (GAM) predictions. The red vertical line marks the high-risk cut-off value of 0.265, at which the GAM reaches an optimal PPV of 0.5. The blue vertical line marks the low-risk cut-off value of 0.075, at which the same model reaches an optimal NPV of 0.98. The model included seven predictors: age, morning productive cough, wheeze, years of smoking cessation, gender, job, and pack-years of smoking.
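The two cut-offs in Figure 2 suggest a three-level reading of the predicted probability. The sketch below illustrates one way to apply them; the function name `risk_tier`, the label "intermediate" for the band between the cut-offs, and the handling of values exactly at a cut-off are assumptions, since the source only names the low-risk (0.075) and high-risk (0.265) thresholds.

```python
def risk_tier(
    prob: float, low_cutoff: float = 0.075, high_cutoff: float = 0.265
) -> str:
    """Map a predicted COPD probability to a risk tier.

    Assumptions: probabilities strictly above a cut-off move to the next
    tier, and the band between the two cut-offs is called "intermediate".
    """
    if not 0.0 <= prob <= 1.0:
        raise ValueError("prob must be a probability between 0 and 1")
    if prob > high_cutoff:
        return "high"
    if prob > low_cutoff:
        return "intermediate"
    return "low"


for example_prob in (0.05, 0.10, 0.30):
    print(example_prob, risk_tier(example_prob))
```

Under this reading, a "low" result leans on the high NPV at 0.075 to rule out COPD, while a "high" result leans on the PPV at 0.265 to prioritize confirmatory spirometry.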