Joanne Michelle F Ocampo1,2, Auntré Hamp1,3, Anne Rhodes4, J C Smart5, Raghu Pemmaraju6, Karalee Poschman7,8, Kristen L Hess8, Reshma Bhattacharjee9, Colin Flynn9, Bridget J Anderson10, James E Dowling11, Fred Maccormack11, Rupali Doshi3,12, Garret Lum3, Lorene Maddox7, Brenda Moncur10, John E Barnhart13, Jason Maxwell13, Sahithi Boggavarapu Aurand4, Vicki Hogan14, David Wills14, Stacy Prowell15, Seble G Kassaye2, Helen E Karn1, Benjamin T Laffoon8, Jeff Collmann1. 1. Georgetown University, Office of the Senior Vice President for Research, Washington, DC. 2. Division of Infectious Diseases, Department of Medicine, Georgetown University, Washington, DC. 3. District of Columbia Department of Health, Washington, DC. 4. Virginia Department of Health, Richmond, VA. 5. Department of Computer Science, Georgetown University, Washington, DC. 6. Georgetown University, University Information Systems, Washington, DC. 7. Florida Department of Health, Tallahassee, FL. 8. Division of HIV/AIDS Prevention, Centers for Disease Control and Prevention, Atlanta, GA. 9. Maryland Department of Health, Baltimore, MD. 10. New York State Department of Health, Albany, NY. 11. Delaware Division of Public Health, Newark, DE. 12. Department of Epidemiology and Biostatistics, The George Washington University, Washington, DC. 13. North Carolina Department of Health, Raleigh, NC. 14. West Virginia Department of Health and Human Resources, Bureau for Public Health, Charleston, WV. 15. National Security Sciences Directorate, Cyber Physical Systems Research Group, Oak Ridge National Laboratory, Oak Ridge, TN.
Abstract
BACKGROUND: Focused attention on Data to Care underlines the importance of high-quality HIV surveillance data. This study identified the number of total duplicate and exact duplicate HIV case records in 9 separate Enhanced HIV/AIDS Reporting System (eHARS) databases reported by 8 jurisdictions and compared this approach to traditional Routine Interstate Duplicate Review resolution.
METHODS: This study used the ATra Black Box System and 6 eHARS variables for matching case records across jurisdictions: last name, first name, date of birth, sex assigned at birth (birth sex), social security number, and race/ethnicity, plus 4 system-calculated values (first name Soundex, last name Soundex, partial date of birth, and partial social security number).
RESULTS: In approximately 11 hours, this study matched 290,482 cases from 799,326 uploaded records, including 55,460 exact case pairs. Top case pair overlaps were between NYC and NYS (51%), DC and MD (10%), and FL and NYC (6%), followed closely by FL and NYS (4%), FL and NC (3%), DC and VA (3%), and MD and VA (3%). Jurisdictions estimated that they realized a combined 135 labor hours in time efficiency by using this approach compared with manual methods previously used for interstate duplication resolution.
DISCUSSION: This approach discovered exact matches that were not previously identified. It also decreased time spent resolving duplicated case records across jurisdictions while improving accuracy and completeness of HIV surveillance data in support of public health program policies. Future uses of this approach should consider standardized protocols for postprocessing eHARS data.
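The 4 system-calculated values named in the methods can be illustrated with a minimal Python sketch. It is not the ATra Black Box System's implementation, which the abstract does not describe. The Soundex function follows the standard American Soundex rules. The function name `derived_values`, the ISO date format, and the definitions of "partial date of birth" (year and month) and "partial social security number" (last 4 digits) are assumptions made only for illustration.

```python
# Illustrative sketch only: the abstract does not specify how the system
# computes its derived values. Partial DOB and partial SSN definitions
# below are assumptions.

# Standard American Soundex digit groups.
_CODES = {}
for _digit, _letters in {
    "1": "BFPV", "2": "CGJKQSXZ", "3": "DT",
    "4": "L", "5": "MN", "6": "R",
}.items():
    for _ch in _letters:
        _CODES[_ch] = _digit


def soundex(name: str) -> str:
    """Return the 4-character American Soundex code, or "" if no letters."""
    letters = [c for c in name.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    out = [letters[0]]
    prev = _CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate letters with the same code
        code = _CODES.get(ch, "")
        if code and code != prev:
            out.append(code)
        prev = code  # a vowel resets prev, so a repeated code is kept
    return ("".join(out) + "000")[:4]


def derived_values(first_name: str, last_name: str, dob: str, ssn: str) -> dict:
    """Compute the four derived values for one record (hypothetical helper).

    Assumes dob is an ISO string 'YYYY-MM-DD' and ssn may contain dashes.
    """
    digits = "".join(c for c in ssn if c.isdigit())
    return {
        "first_name_soundex": soundex(first_name),
        "last_name_soundex": soundex(last_name),
        "partial_dob": dob[:7],       # assumption: year and month only
        "partial_ssn": digits[-4:],   # assumption: last 4 digits
    }
```

Phonetic codes let spelling variants of a name fall into the same comparison group: under these rules, `soundex("Robert")` and `soundex("Rupert")` both return `"R163"`.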
Authors: Tigran Avoundjian; Julia C Dombrowski; Matthew R Golden; James P Hughes; Brandon L Guthrie; Janet Baseman; Mauricio Sadinle Journal: JMIR Public Health Surveill Date: 2020-04-30
Authors: Seble G Kassaye; Amanda Blair Spence; Edwin Lau; David M Bridgeland; John Cederholm; Spiros Dimolitsas; J C Smart Journal: JMIR Public Health Surveill Date: 2020-08-13