Xiaomeng Li1, Yuhao Guo2, Wenyang Li1, Wei Wang1, Fang Zhang1, Shanqun Li3. 1. Department of Respiratory and Critical Care Medicine, The First Hospital of China Medical University, Shenyang 110000, People's Republic of China. 2. Department of Mathematics and Statistics, Xi'an JiaoTong University, Xi'an 710049, People's Republic of China. 3. Department of Pulmonary and Critical Care Medicine, Zhongshan Hospital, Fudan University, Shanghai 200020, People's Republic of China.
Abstract
Objective: Diagnosing chronic obstructive pulmonary disease (COPD) is challenging, especially in primary care institutions that lack spirometers. To reduce the rate of missed COPD diagnoses in Northeast China, where the prevalence of COPD is relatively high, this study aimed to establish efficient primary screening and discriminant models of COPD for this region.
Patients and Methods: Subjects from Northeast China were enrolled at The First Hospital of China Medical University from December 2017 to April 2019. All participants underwent pulmonary function tests and completed a questionnaire. With disease status (COPD or no COPD) as the target for the screening models and disease severity as the target for the discriminant models, multivariate linear regression, logistic regression, linear discriminant analysis, K-nearest neighbor, decision tree, and support vector machine models were built in R and Python. Their effectiveness was compared, and the optimal primary screening and discriminant models were selected.
Results: A total of 232 COPD patients (124 GOLD I-II and 108 GOLD III-IV) and 218 normal controls were enrolled. Eight primary screening models were established. The optimal model was Y = -1.2562 - 0.3891X4 (education level) + 1.7996X5 (dyspnea) + 0.5102X6 (cooking fuel grade) + 1.498X7 (smoking index) + 0.8077X9 (family history) - 0.5552X11 (BMI) + 0.538X13 (cough with sputum) + 2.0328X14 (wheezing) + 1.3378X16 (farmer) + 0.8187X17 (mother's smoking exposure history during pregnancy) - 0.389X18 (kitchen ventilation) + 0.6888X19 (childhood heating). Six discriminant models were established. The optimal model was a decision tree using the following variables: dyspnea (X5), cooking fuel grade (X6), second-hand smoking index (X8), BMI (X11), cough (X12), cough with sputum (X13), wheezing (X14), farmer (X16), kitchen ventilation (X18), and childhood heating (X19). Code was written to implement the discriminant model in software.
Conclusion: Many factors were related to COPD in Northeast China. Stepwise logistic regression and decision tree were the optimal screening and discriminant models for COPD in this region.
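To make the screening equation concrete, the following is a minimal Python sketch of how it could be evaluated for one respondent. The intercept and coefficients are copied from the abstract. Everything else is an assumption: the abstract does not state how each questionnaire item is coded (binary, graded, or scaled), so the dictionary keys and the example inputs are hypothetical, and treating Y as a log-odds that is mapped to a probability is inferred only from the conclusion naming stepwise logistic regression as the optimal screening model. No decision cutoff is given in the abstract, so none is applied here.

```python
import math

# Intercept and coefficients as reported in the abstract.
INTERCEPT = -1.2562
COEFFICIENTS = {
    # Key names are hypothetical labels; the X-numbers follow the abstract.
    "x4_education_level": -0.3891,
    "x5_dyspnea": 1.7996,
    "x6_cooking_fuel_grade": 0.5102,
    "x7_smoking_index": 1.498,
    "x9_family_history": 0.8077,
    "x11_bmi": -0.5552,
    "x13_cough_with_sputum": 0.538,
    "x14_wheezing": 2.0328,
    "x16_farmer": 1.3378,
    "x17_maternal_smoking_exposure": 0.8187,
    "x18_kitchen_ventilation": -0.389,
    "x19_childhood_heating": 0.6888,
}


def linear_predictor(answers):
    """Return Y = intercept + sum(coefficient * coded answer)."""
    missing = sorted(set(COEFFICIENTS) - set(answers))
    if missing:
        raise ValueError(f"missing questionnaire items: {missing}")
    return INTERCEPT + sum(
        coef * answers[name] for name, coef in COEFFICIENTS.items()
    )


def screening_probability(answers):
    """Map Y to a probability with the logistic function.

    Assumption: Y is the log-odds of COPD, as in a standard logistic
    regression; the abstract does not spell this out.
    """
    return 1.0 / (1.0 + math.exp(-linear_predictor(answers)))


if __name__ == "__main__":
    # Hypothetical respondent: every item coded 0 except dyspnea and
    # wheezing, which are assumed here to be coded 1 when present.
    respondent = {name: 0 for name in COEFFICIENTS}
    respondent["x5_dyspnea"] = 1
    respondent["x14_wheezing"] = 1
    print(f"Y = {linear_predictor(respondent):.4f}")
    print(f"P = {screening_probability(respondent):.3f}")
```

Keeping the coefficients in a single dictionary means a missing questionnaire item raises an error instead of silently counting as zero, which matters for a screening tool where an unanswered item is not the same as a negative answer.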