Bankole Olatosi, Jiajia Zhang, Sharon Weissman, Jianjun Hu, Mohammad Rifat Haider, Xiaoming Li.
Abstract
INTRODUCTION: Linkage to and retention in HIV medical care remain problematic in the USA. Extensive collection of health utilisation data through electronic health records (EHR) and claims data represents a new opportunity for scientific discovery. Big data science (BDS) is a powerful tool for investigating HIV care utilisation patterns. The South Carolina (SC) Office of Revenue and Fiscal Affairs (RFA) data warehouse captures individual-level longitudinal health utilisation data for persons living with HIV (PLWH). The data warehouse includes EHR, claims and data from private institutions, housing, prisons, mental health, Medicare, Medicaid, State Health Plan and the department of health and human services. The purpose of this study is to describe the process for creating a comprehensive database of all SC PLWH, and plans for using BDS to explore, identify, characterise and explain new predictors of missed opportunities for HIV medical care utilisation.
METHODS AND ANALYSIS: This project will create person-level profiles guided by the Gelberg-Andersen Behavioral Model and describe new patterns of HIV care utilisation. The population for the comprehensive database comes from statewide HIV surveillance data (2005-2016) for all SC PLWH (N≈18000). Surveillance data are available from the state health department's enhanced HIV/AIDS Reporting System (e-HARS). Additional data pulls for the e-HARS population will include Ryan White HIV/AIDS Program Service Reports, Health Sciences SC data and Area Health Resource Files. These data will be linked to the RFA data and serve as sources for traditional and vulnerable domain Gelberg-Andersen Behavioral Model variables. The project will use BDS techniques such as machine learning to identify new predictors of HIV care utilisation behaviour among PLWH, and 'missed opportunities' for re-engaging them in care.
ETHICS AND DISSEMINATION: The study team applied for data from different sources and submitted individual Institutional Review Board (IRB) applications to the University of South Carolina (USC) IRB and other local authorities/agencies/state departments. This study was approved by the USC IRB (#Pro00068124) in 2017. To protect the identity of PLWH, researchers will only receive linked deidentified data from the RFA. Study findings will be disseminated at local community forums, community advisory group meetings, meetings with our state agencies, local partners and other key stakeholders (including PLWH, policy-makers and healthcare providers), presentations at academic conferences and through publication in peer-reviewed articles. Data security and patient confidentiality are the bedrock of this study. Extensive data agreements ensuring data security and patient confidentiality for the deidentified linked data have been established and are stringently adhered to. The RFA is authorised to collect and merge data from these different sources and to ensure the privacy of all PLWH. The legislatively mandated SC data oversight council reviewed the proposed process stringently before approving it. Researchers will get only the encrypted deidentified dataset to prevent any breach of privacy in the data transfer, management and analysis processes. In addition, established secure data governance rules, data encryption and encrypted predictive techniques will be deployed. Beyond data anonymisation as a part of privacy-preserving analytics, encryption schemes that allow prediction algorithms to run on encrypted data will also be deployed. Best practices and lessons learnt about the complex processes involved in negotiating and navigating multiple data sharing agreements between different entities are being documented for dissemination.
© Author(s) (or their employer(s)) 2019. Re-use permitted under CC BY-NC. No commercial re-use. See rights and permissions. Published by BMJ.
Keywords: HIV/AIDS; big data science; health care utilisation; machine learning; predictive modeling
Year: 2019 PMID: 31326931 PMCID: PMC6661700 DOI: 10.1136/bmjopen-2018-027688
Source DB: PubMed Journal: BMJ Open ISSN: 2044-6055 Impact factor: 2.692
Figure 1 South Carolina Office of Revenue and Fiscal Affairs Integrated Data System.
HIV treatment cascade and corresponding variables data sources*†
| HIV treatment cascade | Variables based on Gelberg-Andersen Model | Data sources |
| Diagnosis | Level at diagnosis CD4 Viral load | Department of Health and Environmental Control Enhanced HIV/AIDS Reporting System (DHEC e-HARS) |
| HIV linkage to care | Demographics | Revenue and Fiscal Affairs (RFA) |
| | Health beliefs | |
| | Vulnerable domains | RFA |
| | Criminal behaviour, violent status | Department of Corrections |
| | Mental Illness | Department of Mental Health |
| | Childhood characteristics | Department of Social Services (DSS) |
| | Regular source of care | Medicaid |
| | Social support, public benefits | DSS |
| | Health services resources | ACS Census Tract |
| | Case management | DHEC e-HARS |
| | Community resources | ACS Census Tract |
| | Location variables (poverty, education, median income, employment) | ACS Census Tract |
| | Evaluated health—diagnosis, comorbidities | RFA |
| | Perceived health | RSR |
| | Health behaviours | |
| | Personal health practices | RFA |
| ART monitoring | Use of health services | RFA |
| Viral suppression | | DHEC e-HARS |
*All data linkage is conducted at the individual unit level using name, date of birth and social security number.
†All records from other datasets linked to the e-HARS cohort are available through the RFA and other data sources listed above.
ART, antiretroviral treatment; CDC, Centers for Disease Control and Prevention; GPS, Global Positioning System.
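The footnotes above state that linkage is done at the individual level using name, date of birth and social security number, and the abstract states that researchers receive only deidentified linked data. The abstract does not describe the RFA's actual procedure, so the following is only a minimal sketch of one way exact-match linkage followed by pseudonymisation could look. The function names, field names (`name`, `dob`, `ssn`, `study_id`) and the keyed-hash approach are all assumptions made for illustration.

```python
import hashlib
import hmac


def link_key(name: str, dob: str, ssn: str) -> tuple:
    """Normalise the three identifiers into a comparable key (hypothetical rules)."""
    clean_name = " ".join(name.lower().split())            # case and spacing
    clean_ssn = "".join(ch for ch in ssn if ch.isdigit())  # drop dashes
    return (clean_name, dob.strip(), clean_ssn)


def pseudonym(key: tuple, secret: bytes) -> str:
    """Keyed hash, so the study ID cannot be recomputed without the secret."""
    message = "|".join(key).encode("utf-8")
    return hmac.new(secret, message, hashlib.sha256).hexdigest()[:16]


def link_and_deidentify(cohort, other, secret):
    """Attach records from `other` to cohort members with an identical key,
    then return rows that carry only a pseudonymous study ID."""
    by_key = {}
    for rec in other:
        key = link_key(rec["name"], rec["dob"], rec["ssn"])
        by_key.setdefault(key, []).append(rec)

    linked = []
    for person in cohort:
        key = link_key(person["name"], person["dob"], person["ssn"])
        for rec in by_key.get(key, []):
            # Copy every field except the direct identifiers.
            row = {k: v for k, v in rec.items() if k not in ("name", "dob", "ssn")}
            row["study_id"] = pseudonym(key, secret)
            linked.append(row)
    return linked


# Entirely made-up records, for illustration only.
cohort = [{"name": "Jane  Doe", "dob": "1980-01-02", "ssn": "123-45-6789"}]
claims = [
    {"name": "jane doe", "dob": "1980-01-02", "ssn": "123456789", "source": "Medicaid", "visits": 3},
    {"name": "John Roe", "dob": "1975-05-05", "ssn": "987-65-4321", "source": "Medicaid", "visits": 1},
]
rows = link_and_deidentify(cohort, claims, secret=b"held-by-the-data-steward")
```

In this sketch only the party holding `secret` can map a `study_id` back to a person, which mirrors the arrangement described in the abstract where the data warehouse performs the linkage and the research team sees only the deidentified result.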
Figure 2 HIV treatment cascade including Gelberg-Andersen Model variables and data sources.
Selected variables for HIV care pattern determination*
| Measure | Type of output | Observation time needed to calculate |
| Missed visit | Dichotomous; were there any missed visits in the interval? | At least 6 months |
| Appointment adherence | Continuous; attended appointments divided by (attended appointments plus missed appointments) | Patient: at least 1 year; clinic: as short as 1 day |
| Constancy, 3-month or 4-month intervals | Categorical; number of 3-month or 4-month intervals with at least one attended visit | At least 6–8 months |
| Constancy, 6-month intervals | Categorical; number of 6-month intervals with at least one attended visit | At least 1 year |
| Constancy, 6-month intervals, longer term | Dichotomous; At least one attended visit in each 6-month interval with at least 60 days between visits | At least 2 years |
| Constancy, HIV/AIDS Bureau | Dichotomous; At least two attended visits in 12 months, separated by at least 90 days | At least 1 year |
| Gaps | Dichotomous; did the time between two contiguous attended visits exceed a threshold (eg, 6 months)? | At least 1 year |
| | Continuous; what is the longest duration of time between two contiguous attended visits? | |
*Definitions for visits have been described elsewhere [25].
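Several of the measures in the table above reduce to simple arithmetic on a patient's visit dates. The following is a minimal sketch, not the study's implementation: the function names, the use of calendar-month intervals counted from a chosen start date, and the 183-day stand-in for the "6 months" gap threshold are all assumptions, and the visit dates are invented.

```python
import calendar
from datetime import date


def add_months(d: date, months: int) -> date:
    """Shift a date by whole calendar months, clamping to the month's last day."""
    years, month_index = divmod(d.month - 1 + months, 12)
    year, month = d.year + years, month_index + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def appointment_adherence(attended: int, missed: int) -> float:
    """Attended appointments divided by (attended plus missed)."""
    return attended / (attended + missed)


def constancy(visits, start: date, interval_months: int, n_intervals: int) -> int:
    """Number of consecutive intervals containing at least one attended visit."""
    count = 0
    for i in range(n_intervals):
        lo = add_months(start, i * interval_months)
        hi = add_months(start, (i + 1) * interval_months)
        if any(lo <= v < hi for v in visits):
            count += 1
    return count


def hab_retained(visits, start: date) -> bool:
    """At least two attended visits in 12 months, separated by at least 90 days."""
    end = add_months(start, 12)
    in_window = sorted(v for v in visits if start <= v < end)
    # If the first and last visits are >= 90 days apart, such a pair exists.
    return bool(in_window) and (in_window[-1] - in_window[0]).days >= 90


def longest_gap_days(visits) -> int:
    """Longest time, in days, between two contiguous attended visits."""
    ordered = sorted(visits)
    return max(((b - a).days for a, b in zip(ordered, ordered[1:])), default=0)


def has_gap(visits, threshold_days: int = 183) -> bool:
    """Did any gap exceed the threshold? 183 days is an assumed '6 months'."""
    return longest_gap_days(visits) > threshold_days


# Invented example: three attended visits and one missed visit in 2015.
attended = [date(2015, 1, 15), date(2015, 4, 20), date(2015, 11, 3)]
start = date(2015, 1, 1)

adherence = appointment_adherence(attended=len(attended), missed=1)  # 0.75
quarters_with_visit = constancy(attended, start, 3, 4)               # 3
halves_with_visit = constancy(attended, start, 6, 2)                 # 2
retained = hab_retained(attended, start)                             # True
gap = longest_gap_days(attended)                                     # 197
```

In this example the patient counts as retained under the HIV/AIDS Bureau definition yet still shows a gap longer than the assumed 6-month threshold, which illustrates why the table lists several measures rather than one.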
Figure 3 Analytic plan.