Bo Wang (1), Feifan Liu (1), Lynette Deveaux (2), Arlene Ash (1), Samiran Gosh (3), Xiaoming Li (4), Elke Rundensteiner (5), Lesley Cottrell (6), Richard Adderley (2), Bonita Stanton (7).
1. Department of Population and Quantitative Health Sciences, University of Massachusetts Medical School, 368 Plantation Street, Worcester, Massachusetts, USA.
2. Office of HIV/AIDS, Ministry of Health, Shirley Street, Nassau, The Bahamas.
3. Department of Family Medicine and Public Health Sciences, Wayne State University School of Medicine, Detroit, Michigan.
4. Department of Health Promotion, Education, and Behavior, University of South Carolina Arnold School of Public Health, Columbia, South Carolina.
5. Data Science, Worcester Polytechnic Institute, Worcester, Massachusetts.
6. Center for Excellence in Disabilities, West Virginia University, Morgantown, West Virginia.
7. Hackensack Meridian School of Medicine, Nutley, New Jersey, USA.
Abstract
BACKGROUND: Precision prevention is increasingly important in HIV prevention research as the field moves beyond universal interventions toward those tailored for high-risk individuals. This study was designed to develop machine learning algorithms for predicting adolescent HIV risk behaviours.

METHODS: Comprehensive longitudinal data on adolescent risk behaviours, perceptions, peer and family influence, and neighbourhood risk factors were collected from 2564 grade-10 students at baseline, who were then followed for 24 months between 2008 and 2012. Machine learning techniques [support vector machine (SVM) and random forests] were applied to leverage the longitudinal data for robust prediction of HIV risk behaviours. We focused on two adolescent risk behaviours: having ever had sex and having multiple sex partners. Twenty percent of the data were withheld for model testing.

RESULTS: For predicting multiple sex partners, the SVM model with cost-sensitive learning achieved the highest sensitivity: 79.1%, with specificity of 75.4% and AUC of 0.86 on the training data (10-fold cross-validation), and sensitivity of 79.7%, specificity of 76.5%, and AUC of 0.86 on the testing data. For predicting having ever had sex, the random forest model performed best, with sensitivity of 78.5%, specificity of 73.1%, and AUC of 0.84 on the training data, and sensitivity of 82.7%, specificity of 75.3%, and AUC of 0.87 on the testing data.

CONCLUSION: Machine learning methods can be used to build effective prediction models that identify adolescents who are likely to engage in HIV risk behaviours. This study lays a foundation for targeted intervention strategies and informs precision prevention efforts in school settings.
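The three metrics reported above (sensitivity, specificity, and AUC) can be illustrated with a minimal sketch. The labels and scores below are invented toy values, not study data, and the helper names `sensitivity_specificity` and `auc` are hypothetical; the abstract does not say how the authors computed these quantities, so this shows only the standard definitions.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    y_true: Sequence[int], y_pred: Sequence[int]
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for labels coded 1 = positive, 0 = negative."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    return tp / (tp + fn), tn / (tn + fp)


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random negative.

    Ties count as half a win (the Mann-Whitney formulation of AUC).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example (assumed values): 4 adolescents with the behaviour, 6 without.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.35, 0.2, 0.1, 0.05]

# Classify at a 0.5 threshold on the predicted risk score.
y_pred = [1 if s >= 0.5 else 0 for s in scores]
sens, spec = sensitivity_specificity(y_true, y_pred)
area = auc(y_true, scores)

# A lower threshold flags more adolescents: sensitivity rises, specificity falls.
y_pred_low = [1 if s >= 0.25 else 0 for s in scores]
sens_low, spec_low = sensitivity_specificity(y_true, y_pred_low)

print(f"threshold 0.50: sensitivity={sens:.3f}, specificity={spec:.3f}, AUC={area:.3f}")
print(f"threshold 0.25: sensitivity={sens_low:.3f}, specificity={spec_low:.3f}")
```

Lowering the threshold in the toy data shows the same kind of trade-off that cost-sensitive learning makes when missing a high-risk adolescent is treated as more costly than a false alarm: sensitivity improves at the expense of specificity, while AUC, which is threshold-free, stays the same.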
Authors: Douglas S Krakower; Susan Gruber; Katherine Hsu; John T Menchaca; Judith C Maro; Benjamin A Kruskal; Ira B Wilson; Kenneth H Mayer; Michael Klompas Journal: Lancet HIV Date: 2019-07-05 Impact factor: 12.767
Authors: Laura B Balzer; Diane V Havlir; Moses R Kamya; Gabriel Chamie; Edwin D Charlebois; Tamara D Clark; Catherine A Koss; Dalsone Kwarisiima; James Ayieko; Norton Sang; Jane Kabami; Mucunguzi Atukunda; Vivek Jain; Carol S Camlin; Craig R Cohen; Elizabeth A Bukusi; Mark Van Der Laan; Maya L Petersen Journal: Clin Infect Dis Date: 2020-12-03 Impact factor: 20.999
Authors: Noam Barda; Dan Riesel; Amichay Akriv; Joseph Levy; Uriah Finkel; Gal Yona; Daniel Greenfeld; Shimon Sheiba; Jonathan Somer; Eitan Bachmat; Guy N Rothblum; Uri Shalit; Doron Netzer; Ran Balicer; Noa Dagan Journal: Nat Commun Date: 2020-09-07 Impact factor: 14.919