Hongcheng Wei 1, Jie Sun 2, Wenqi Shan 1, Wenwen Xiao 1, Bingqian Wang 1, Xuan Ma 1, Weiyue Hu 3, Xinru Wang 1, Yankai Xia 4.
1. State Key Laboratory of Reproductive Medicine, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing 211166, China; Key Laboratory of Modern Toxicology of Ministry of Education, School of Public Health, Nanjing Medical University, Nanjing 211166, China.
2. Department of Endocrinology, Drum Tower Hospital affiliated to Nanjing University Medical School, No 321 Zhongshan Road, Nanjing 210008, China.
3. State Key Laboratory of Reproductive Medicine, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing 211166, China; Key Laboratory of Modern Toxicology of Ministry of Education, School of Public Health, Nanjing Medical University, Nanjing 211166, China. Electronic address: weiyuehu@njmu.edu.cn.
4. State Key Laboratory of Reproductive Medicine, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing 211166, China; Key Laboratory of Modern Toxicology of Ministry of Education, School of Public Health, Nanjing Medical University, Nanjing 211166, China. Electronic address: yankaixia@njmu.edu.cn.
Abstract
BACKGROUND: With its prevalence rising dramatically, diabetes mellitus imposes a heavy toll on individual well-being. Humans are exposed to a wide range of environmental chemicals, which have been postulated to be underappreciated but potentially modifiable risk factors for diabetes.
OBJECTIVES: To determine the utility of environmental chemical exposure in predicting diabetes mellitus.
METHODS: A total of 8501 eligible participants from NHANES 2005-2016 were randomly assigned to a discovery set (N = 5953) and a validation set (N = 2548). We applied random forest (RF) and least absolute shrinkage and selection operator (LASSO) regression with 10-fold cross-validation in the discovery set to select features, and built an optimal model to predict diabetes mellitus, blood insulin, fasting plasma glucose (FPG) and 2-h plasma glucose after an oral glucose tolerance test (2-h PG after OGTT).
RESULTS: The machine learning model using LASSO regression predicted diabetes with an area under the receiver operating characteristic curve (AUROC) of 0.80 in the discovery set and 0.78 in the validation set. The linear model predicted blood insulin level with an R² of 0.42 in the discovery set and 0.40 in the validation set. For FPG, the R² was 0.16 in the discovery set and 0.15 in the validation set. For 2-h PG after OGTT, the R² was 0.18 in the discovery set and 0.17 in the validation set.
CONCLUSION: Machine learning models built on environmental chemical exposure achieved relatively accurate prediction of diabetes, emphasizing the predictive value of widespread environmental chemicals for complex diseases.
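The evaluation design described in the abstract (a random discovery/validation split, AUROC for the binary diabetes outcome, R² for the continuous outcomes) can be illustrated with a minimal, standard-library-only sketch. This is not the authors' code: the function names, the seed, and the toy scores are assumptions made for illustration, and the feature selection step (RF and LASSO with 10-fold cross-validation) is deliberately left out. Only the set sizes, 5953 and 2548, come from the abstract.

```python
import random


def split_discovery_validation(ids, n_discovery, seed=0):
    """Randomly assign IDs to a discovery set of size n_discovery and a validation set."""
    shuffled = list(ids)
    random.Random(seed).shuffle(shuffled)  # seed is an arbitrary illustrative choice
    return shuffled[:n_discovery], shuffled[n_discovery:]


def auroc(labels, scores):
    """AUROC via the rank-sum (Mann-Whitney U) identity, using average ranks for ties.

    labels are 0/1 integers; scores are predicted risks (higher means more likely positive).
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of tied scores starting at position i.
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1  # 1-based average rank of the tied run
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    rank_sum_pos = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_observed = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_observed) ** 2 for o in observed)
    return 1 - ss_res / ss_tot


# Hypothetical participant IDs 0..8500; set sizes taken from the abstract.
discovery_ids, validation_ids = split_discovery_validation(range(8501), n_discovery=5953)

# Toy labels and scores, not study data.
toy_auroc = auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
toy_r2 = r_squared([1.0, 2.0, 3.0], [2.0, 2.0, 2.0])
```

Reporting each metric on both sets, as the abstract does, lets a reader compare discovery and validation performance; a large gap between the two would suggest overfitting to the discovery set.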