Jiamin Shi (1,2), Rui Fu (2,3), Hayley Hamilton (1,2), Michael Chaiton (1,2)
1. Dalla Lana School of Public Health, University of Toronto, Toronto, Ontario, Canada.
2. Institute for Mental Health Policy Research, Centre for Addiction and Mental Health, Toronto, Ontario, Canada.
3. Institute of Health Policy, Management and Evaluation, Dalla Lana School of Public Health, University of Toronto, Toronto, Ontario, Canada.
Abstract
INTRODUCTION: We developed separate random forest algorithms to predict e-cigarette (vaping) ever use and daily use among Ontario youth, and then examined predictor importance and statistical interactions.
METHODS: This cross-sectional study used a representative sample of Ontario elementary and high school students in 2019 (N = 6471). Vaping frequency over the last 12 months was used to define ever-vaping and daily vaping. We considered a large set of individual characteristics as potential correlates of ever-vaping (176 variables) and daily vaping (179 variables). Using cross-validation, we developed a random forest algorithm for each outcome and evaluated model performance with the C-index, a measure of a model's discriminatory ability. We then identified the top 10 correlates of each outcome by relative importance score and examined their interactions with sociodemographic characteristics.
RESULTS: Of the respondents, 2064 (31.9%) were ever-vapers and 490 (7.6%) were daily users. The random forest algorithms for both outcomes achieved high performance, with a C-index above 0.90. The top 10 correlates of daily vaping included use of caffeine, cannabis and tobacco; source and type of e-cigarette; and absence in the last 20 school days. Those of ever-vaping included school size and use of alcohol, cannabis and tobacco; 9 of the top 10 ever-vaping correlates demonstrated interactions with ethnicity.
CONCLUSION: Machine learning is a promising methodology for identifying the risks of ever-vaping and daily vaping. It also enables the identification of important correlates and the assessment of complex intersections, which may inform future longitudinal studies aimed at customizing public health policies for targeted population subgroups.
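The C-index used above to evaluate the models can be illustrated with a small worked example. For a binary outcome such as daily vaping, the C-index is the proportion of (case, non-case) pairs in which the case receives the higher predicted risk, with ties counted as one half; this equals the area under the ROC curve. The sketch below is not the authors' code: the function name `c_index` and the labels and scores are hypothetical values chosen only for illustration.

```python
from itertools import product


def c_index(y_true, y_score):
    """Concordance index for a binary outcome (equal to the ROC AUC).

    y_true  : iterable of 0/1 labels (1 = case, e.g. a daily vaper)
    y_score : iterable of predicted risks, in the same order
    """
    cases = [s for y, s in zip(y_true, y_score) if y == 1]
    non_cases = [s for y, s in zip(y_true, y_score) if y == 0]
    if not cases or not non_cases:
        raise ValueError("need at least one case and one non-case")

    concordant = 0.0
    for case_score, non_case_score in product(cases, non_cases):
        if case_score > non_case_score:
            concordant += 1.0   # the case is ranked above the non-case
        elif case_score == non_case_score:
            concordant += 0.5   # a tie counts as half a concordant pair
    return concordant / (len(cases) * len(non_cases))


# Hypothetical labels and predicted risks, for illustration only.
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.5, 0.3, 0.2, 0.1]
print(round(c_index(labels, scores), 3))  # 11 of 12 pairs concordant -> 0.917
```

A value of 0.5 corresponds to ranking no better than chance and 1.0 to perfect separation, so a C-index above 0.90 means that in more than 90% of such pairs the model assigned the higher risk to the actual vaper.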