OBJECTIVE: Universal HIV screening programs are costly, labor-intensive, and often fail to identify high-risk individuals. Automated risk assessment methods that leverage longitudinal electronic health records (EHRs) could catalyze targeted screening programs. Although social and behavioral determinants of health are typically captured in narrative documentation, previous analyses have considered only structured EHR fields. We examined whether natural language processing (NLP) would improve predictive models of HIV diagnosis. METHODS: The study cohort comprised 181 HIV-positive individuals who received care at New York Presbyterian Hospital before a confirmatory HIV diagnosis and 543 HIV-negative controls selected using propensity score matching. EHR data recorded before HIV diagnosis, including demographics, laboratory tests, diagnosis codes, and unstructured notes, were extracted for modeling. Three predictive models were developed using machine-learning algorithms: (1) a baseline model with only structured EHR data, (2) baseline plus NLP topics, and (3) baseline plus NLP clinical keywords. RESULTS: Model performance varied, with F measures of 0.59 for the baseline model, 0.63 for the baseline + NLP topic model, and 0.74 for the baseline + NLP keyword model. The baseline + NLP keyword model yielded the highest precision by combining keywords such as "msm," "unprotected," "hiv," and "methamphetamine" with structured EHR data indicative of additional HIV risk factors. CONCLUSIONS: NLP improved the predictive performance of automated HIV risk assessment by extracting terms in clinical text indicative of high-risk behavior. Future studies should explore more advanced techniques for extracting social and behavioral determinants from clinical text.
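As a rough illustration of the baseline + NLP keyword idea, the sketch below turns a clinical note into 0/1 keyword indicators, merges them with structured fields, and computes an F measure from confusion counts. It is a minimal sketch under stated assumptions, not the study's pipeline: the function names, the structured field names (`age`, `sti_dx_count`), and the sample note are hypothetical; only the four keywords come from the abstract; and the F measure is assumed here to be the balanced F1 score.

```python
import re
from typing import Dict, List

# The four keywords quoted in the abstract. The complete keyword list used
# in the study is not given there, so this list is illustrative only.
KEYWORDS = ["msm", "unprotected", "hiv", "methamphetamine"]


def keyword_features(note: str, keywords: List[str] = KEYWORDS) -> Dict[str, int]:
    """Return a 0/1 indicator per keyword (whole-word, case-insensitive)."""
    tokens = set(re.findall(r"[a-z0-9]+", note.lower()))
    return {f"kw_{k}": int(k in tokens) for k in keywords}


def combine(structured: Dict[str, float], note: str) -> Dict[str, float]:
    """Merge structured EHR fields with keyword indicators into one feature row."""
    row = dict(structured)
    row.update(keyword_features(note))
    return row


def f_measure(tp: int, fp: int, fn: int) -> float:
    """Balanced F1 (assumed): harmonic mean of precision and recall."""
    if tp == 0:
        return 0.0  # precision and recall are both zero or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical patient: field names and note text are made up for illustration.
row = combine(
    {"age": 34, "sti_dx_count": 2},
    "Pt reports unprotected sex; MSM. Denies methamphetamine use.",
)
```

Note that the sample note says the patient denies methamphetamine use, yet the indicator for "methamphetamine" is still set. Plain keyword matching ignores negation and context, which is one reason more advanced extraction techniques may be worth exploring.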