Yang Xiang1, Kayo Fujimoto2, John Schneider3,4, Yuxi Jia1,5, Degui Zhi1, Cui Tao1.
1. School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, Texas, USA.
2. Department of Health Promotion & Behavioral Sciences, School of Public Health, University of Texas Health Science Center at Houston, Houston, Texas, USA.
3. Departments of Medicine and Public Health Sciences, University of Chicago, Chicago, Illinois, USA.
4. Chicago Center for HIV Elimination, University of Chicago, Chicago, Illinois, USA.
5. Department of Medical Informatics, School of Public Health, Jilin University, Jilin, China.
Abstract
OBJECTIVE: HIV infection risk can be estimated based on not only individual features but also social network information. However, there have been insufficient studies using machine learning methods that can maximize the utility of such information. Leveraging a state-of-the-art network topology modeling method, graph convolutional networks (GCN), our main objective was to include network information for the task of detecting previously unknown HIV infections.
MATERIALS AND METHODS: We used data from multiple social networks (peer referral, social, sex partners, and affiliation with social and health venues) comprising 378 young men who had sex with men in Houston, TX, collected between 2014 and 2016. Because of the limited sample size, we used an ensemble approach that integrates GCN, for modeling information flow, with statistical machine learning methods, including random forest and logistic regression, for efficiently modeling sparse features in individual nodes.
RESULTS: Modeling network information using GCN effectively improved the prediction of HIV status in the social network. The ensemble approach achieved 96.6% accuracy and a 94.6% F1 measure, outperforming the baseline methods (GCN, logistic regression, and random forest: 79.0%, 90.5%, and 94.4% accuracy, respectively; and 57.7%, 80.2%, and 90.4% F1). In networks with missing HIV status, the ensemble also produced promising results.
CONCLUSION: Network context is a necessary component in modeling infectious disease transmissions such as HIV. GCN, when combined with traditional machine learning approaches, achieved promising performance in detecting previously unknown HIV infections, which may provide a useful tool for combatting the HIV epidemic.
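The general idea of pairing a graph convolution with per-node classifiers can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the model outputs were combined, so the probability averaging in `soft_vote` is an assumption, the propagation rule is the standard symmetric-normalized GCN layer, and the toy graph, weights, and probabilities are made up for illustration.

```python
import numpy as np

def normalized_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2}, the propagation matrix of a standard GCN layer."""
    a_hat = adj + np.eye(adj.shape[0])   # add self-loops so each node keeps its own features
    deg = a_hat.sum(axis=1)              # degrees are >= 1 because of the self-loops
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gcn_layer(adj_norm, features, weights):
    """One graph convolution: aggregate neighbor features, project them, apply ReLU."""
    return np.maximum(adj_norm @ features @ weights, 0.0)

def soft_vote(*prob_vectors):
    """Average per-node probabilities from several models (assumed combination rule)."""
    return np.mean(np.stack(prob_vectors), axis=0)

# Toy network: four hypothetical participants connected in a path 0-1-2-3.
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
features = np.eye(4)                      # placeholder one-hot node features
rng = np.random.default_rng(0)
weights = rng.normal(size=(4, 2))         # untrained weights, for shape illustration only

adj_norm = normalized_adjacency(adj)
hidden = gcn_layer(adj_norm, features, weights)

# Made-up per-node probabilities standing in for the three models' outputs.
p_gcn = np.array([0.2, 0.7, 0.6, 0.1])
p_rf = np.array([0.1, 0.9, 0.7, 0.2])
p_lr = np.array([0.3, 0.8, 0.5, 0.0])

ensemble = soft_vote(p_gcn, p_rf, p_lr)
predicted = (ensemble >= 0.5).astype(int)  # threshold of 0.5 is also an assumption
```

In this sketch the graph layer mixes each node's features with those of its neighbors, while the random forest and logistic regression stand-ins see only individual-level features; averaging is one simple way such outputs could be combined.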