Chen Liang (1), Shan Qiao (2), Bankole Olatosi (3), Tianchu Lyu (3), Xiaoming Li (2)
1. Department of Health Services Policy and Management, Arnold School of Public Health, University of South Carolina, Columbia, SC, USA. Electronic address: cliang@mailbox.sc.edu.
2. Department of Health Promotion, Education, and Behavior, Arnold School of Public Health, University of South Carolina, Columbia, SC, USA.
3. Department of Health Services Policy and Management, Arnold School of Public Health, University of South Carolina, Columbia, SC, USA.
Abstract
BACKGROUND: The rapid growth of inherently complex and heterogeneous data in HIV/AIDS research underscores the importance of Big Data Science. Big Data techniques have recently seen increasing uptake in basic, clinical, and public health HIV/AIDS research. However, no study has systematically described how the applications of Big Data in HIV/AIDS research have evolved. We sought to explore the emergence and evolution of Big Data Science in HIV/AIDS-related publications funded by US federal agencies.
METHODS: We identified HIV/AIDS and Big Data related publications funded by seven federal agencies from 2000 to 2019 by integrating data from National Institutes of Health (NIH) ExPORTER, MEDLINE, and MeSH. Building on bibliometric and Natural Language Processing (NLP) methods, we constructed co-occurrence networks from the bibliographic metadata (e.g., countries, institutes, MeSH terms, and keywords) of the retrieved publications. We then detected clusters within the networks and the temporal dynamics of those clusters, followed by expert evaluation and consideration of clinical implications.
RESULTS: We retrieved nearly 600,000 publications related to HIV/AIDS, of which 19,528 relating to Big Data were included in the bibliometric analysis. Results showed that (1) the number of Big Data publications has been increasing since 2000; (2) US institutes have collaborated closely with China, Canada, and Germany; (3) some institutes (e.g., University of California system, MD Anderson Cancer Center, and Harvard Medical School) are among the most productive and started using Big Data in HIV/AIDS research early; (4) Big Data research was not active in public health disciplines until 2015; and (5) research topics such as genomics, HIV comorbidities, population-based studies, Electronic Health Records (EHR), social media, and precision medicine, and methodologies such as machine learning, deep learning, radiomics, and data mining, have emerged quickly in recent years.
CONCLUSIONS: We identified rapid growth in cross-disciplinary research on HIV/AIDS and Big Data over the past two decades. Our findings show patterns and trends in prevailing research topics and Big Data applications in HIV/AIDS research and point to a number of fast-evolving areas of Big Data Science in this field, including secondary analysis of EHR, machine learning, deep learning, predictive analysis, and NLP.
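To make the co-occurrence network step in METHODS more concrete, the following is a minimal sketch in Python using only the standard library. It counts how often pairs of keywords appear in the same publication, then groups keywords by connected components over the stronger links. The function names, the toy records, and the `min_weight` threshold are all hypothetical, and connected components are only a crude stand-in for cluster detection; the abstract does not state which clustering algorithm or software the authors used.

```python
from collections import Counter, defaultdict
from itertools import combinations


def cooccurrence_edges(records):
    """Count how often each unordered pair of terms appears in the same record."""
    edges = Counter()
    for terms in records:
        # Deduplicate and sort so (a, b) and (b, a) map to the same key.
        for a, b in combinations(sorted(set(terms)), 2):
            edges[(a, b)] += 1
    return edges


def components(edges, min_weight=1):
    """Group terms into connected components over edges with weight >= min_weight.

    Assumption: connected components stand in for cluster detection here;
    this is not the method reported in the study.
    """
    graph = defaultdict(set)
    for (a, b), weight in edges.items():
        if weight >= min_weight:
            graph[a].add(b)
            graph[b].add(a)
    seen, groups = set(), []
    for start in sorted(graph):
        if start in seen:
            continue
        # Depth-first traversal collecting every term reachable from `start`.
        stack, group = [start], set()
        while stack:
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(graph[node] - group)
        seen |= group
        groups.append(sorted(group))
    return groups


# Hypothetical toy records; each list stands for the keywords of one publication.
records = [
    ["HIV", "machine learning", "EHR"],
    ["HIV", "EHR"],
    ["genomics", "precision medicine"],
    ["HIV", "machine learning"],
]
edges = cooccurrence_edges(records)
clusters = components(edges, min_weight=2)
```

With these toy records, raising `min_weight` drops pairs that co-occur only once, so weakly linked keywords fall out of the clusters. The same counting could in principle be applied to countries, institutes, or MeSH terms instead of keywords.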