Sarah B May, Thomas P Giordano, Assaf Gottlieb.
Abstract
BACKGROUND: Identification of people with HIV from electronic health record (EHR) data is an essential first step in the study of important HIV outcomes, such as risk assessment. This task has been historically performed via manual chart review, but the increased availability of large clinical data sets has led to the emergence of phenotyping algorithms to automate this process. Existing algorithms for identifying people with HIV rely on a combination of International Classification of Disease codes and laboratory tests or closely mimic clinical testing guidelines for HIV diagnosis. However, we found that existing algorithms in the literature missed a significant proportion of people with HIV in our data.
Keywords: algorithms; cohort identification; electronic health records; people with HIV; phenotyping
Year: 2021 PMID: 34842532 PMCID: PMC8727048 DOI: 10.2196/28620
Source DB: PubMed Journal: JMIR Form Res ISSN: 2561-326X
Figure 1. Diagram of our HIV phenotyping algorithm and evaluation framework.
Figure 2. Flow diagram of patients through our algorithm for both national and local data sets. *Any International Classification of Disease code for HIV, HIV-related laboratory test performed regardless of result, or medication used to treat HIV documented in the data.
Figure 3. Venn diagram showing the number of patients meeting each of the criteria of our HIV phenotyping algorithm for (A) national data set, and (B) local data set.
Figure 4. Comparison of distributions of gender and race in cohorts identified by our algorithm and national (Centers for Disease Control and Prevention) and local (Houston Health Dept) HIV surveillance data.
Evaluation resultsᵃ.
| Algorithm | Source | Sensitivity | Specificity | Positive predictive value | Negative predictive value | Accuracy | Results: C+, A+ | Results: C+, A− | Results: C−, A+ | Results: C−, A− |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ICDᵇ-only baseline | Fultz et al | 0.86 | 0.99 | 0.99 | 0.89 | 0.93 | 143 | 23 | 1 | 193 |
| Laboratory-based baseline | Paul et al | 0.56 | *1.00*ᶜ | *1.00* | 0.73 | 0.80 | 93 | 73 | 0 | 194 |
| ICD-based baseline | Paul et al | 0.90 | *1.00* | *1.00* | 0.92 | 0.95 | 149 | 17 | 0 | 194 |
| Criteria-based baseline | Kramer et al | 0.92 | 0.99 | 0.99 | 0.93 | *0.96* | 152 | 14 | 1 | 193 |
| HIV-Phen | N/Aᵈ | *0.98* | 0.95 | 0.95 | *0.98* | *0.96* | 162 | 4 | 9 | 185 |
ᵃ Results of the evaluation of the baseline algorithms and our new clinical criteria-based algorithm on a subsample of 360 patients from the local database. Classification of the patients is shown on the right side of the table (Results), broken down by the results of chart review (C+ or C−) and algorithm classification (A+ or A−).
ᵇ ICD: International Classification of Disease.
ᶜ Results are italicized for the algorithm with the highest value for each metric.
ᵈ N/A: not applicable.
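Each metric in the evaluation table follows from the four Results counts, with chart review (C) treated as the reference standard and the algorithm (A) as the test. The following is a minimal Python sketch of that arithmetic, using the standard definitions of sensitivity, specificity, predictive values, and accuracy. The names `ConfusionCounts`, `evaluate`, and `TABLE_COUNTS` are hypothetical and chosen for illustration; only the counts themselves come from the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Patient counts from comparing an algorithm (A) with chart review (C)."""
    tp: int  # C+, A+ : true positives
    fn: int  # C+, A- : false negatives
    fp: int  # C-, A+ : false positives
    tn: int  # C-, A- : true negatives


def evaluate(c: ConfusionCounts) -> dict[str, float]:
    """Return the five metrics reported in the table (standard definitions)."""
    total = c.tp + c.fn + c.fp + c.tn
    return {
        "sensitivity": c.tp / (c.tp + c.fn),
        "specificity": c.tn / (c.tn + c.fp),
        "ppv": c.tp / (c.tp + c.fp),
        "npv": c.tn / (c.tn + c.fn),
        "accuracy": (c.tp + c.tn) / total,
    }


# Counts copied from the Results columns of the table (order: tp, fn, fp, tn).
TABLE_COUNTS = {
    "ICD-only baseline": ConfusionCounts(143, 23, 1, 193),
    "Laboratory-based baseline": ConfusionCounts(93, 73, 0, 194),
    "ICD-based baseline": ConfusionCounts(149, 17, 0, 194),
    "Criteria-based baseline": ConfusionCounts(152, 14, 1, 193),
    "HIV-Phen": ConfusionCounts(162, 4, 9, 185),
}

if __name__ == "__main__":
    for name, counts in TABLE_COUNTS.items():
        rounded = {k: round(v, 2) for k, v in evaluate(counts).items()}
        print(name, rounded)
```

This sketch does not guard against empty denominators (for example, an algorithm that flags no patients), which a general-purpose version would need to handle.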