Ryan C Shean, Alexander L Greninger.
Abstract
Infectious pathogens are known for their rapid evolutionary rates, with new mutations arising over days to weeks. The ability to rapidly recover whole genome sequences and analyze the spread and evolution of pathogens using genetic information and pathogen collection dates has led to interest in real-time tracking of infectious transmission and outbreaks. However, the level of temporal resolution afforded by these analyses may conflict with definitions of what constitutes protected health information (PHI) and privacy requirements for de-identification for publication and public sharing of research data and metadata. In the United States, dates and locations associated with patient care that provide greater resolution than year or the first three digits of the zip code are generally considered patient identifiers. Admission and discharge dates are specifically named as identifiers in Department of Health and Human Services guidance. To understand the degree to which one can impute admission dates from specimen collection dates, we examined sample collection dates and patient admission dates associated with more than 270,000 unique microbiological results from the University of Washington Laboratory Medicine Department between 2010 and 2017. Across all positive microbiological tests, the sample collection date exactly matched the patient admission date in 68.8% of tests. Collection dates and admission dates were identical from emergency department and outpatient testing 86.7% and 96.5% of the time, respectively, with >99% of tests collected within 1 day of the patient admission date. Samples from female patients were significantly more likely to be collected closer to the admission date than those from male patients. We show that PHI-associated dates such as admission date can confidently be imputed from the deposited collection date.
We suggest that publicly depositing microbiological collection dates at greater resolution than the year may not meet routine Safe Harbor-based requirements for patient de-identification. We recommend the use of Expert Determination to determine PHI for a given study and/or direct patient consent if clinical laboratories or phylodynamic practitioners desire to make these data available.
Keywords: HIPAA; Safe Harbor; admission date; collection date; privacy; protected health information
Year: 2018 PMID: 29511571 PMCID: PMC5829646 DOI: 10.1093/ve/vey005
Source DB: PubMed Journal: Virus Evol ISSN: 2057-1577
List of 18 Protected Health Identifiers that must be removed to de-identify data under the Safe Harbor method (reprinted from Office for Civil Rights Guidance on De-identification of PHI, 26 November 2012).
| Protected health identifiers |
|---|
| Names |
| All geographical subdivisions smaller than a State, including street address, city, county, precinct, zip code, and their equivalent geocodes, except for the initial three digits of a zip code, if according to the current publicly available data from the Bureau of the Census: 1, The geographic unit formed by combining all zip codes with the same three initial digits contains more than 20,000 people; and 2, The initial three digits of a zip code for all such geographic units containing 20,000 or fewer people is changed to 000. |
| All elements of dates (except year) for dates directly related to an individual, including birth date, admission date, discharge date, date of death; and all ages over 89 and all elements of dates (including year) indicative of such age, except that such ages and elements may be aggregated into a single category of age 90 or older. |
| Phone numbers |
| Fax numbers |
| Electronic mail addresses |
| Social Security numbers |
| Medical record numbers |
| Health plan beneficiary numbers |
| Account numbers |
| Certificate/license numbers |
| Vehicle identifiers and serial numbers, including license plate numbers |
| Device identifiers and serial numbers |
| Web Universal Resource Locators (URLs) |
| Internet Protocol (IP) address numbers |
| Biometric identifiers, including finger and voice prints |
| Full face photographic images and any comparable images |
| Any other unique identifying number, characteristic, or code (note this does not mean the unique code assigned by the investigator to code the data). |
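
The date, age, and zip code rules in the list above can be sketched in code. The following is a minimal illustration only, not an official or complete Safe Harbor implementation: the function names and the `zip3_population` lookup (three-digit prefix to Census population) are hypothetical, and the population figures in the usage note are invented. It also does not handle the requirement to withhold years that would reveal an age over 89.

```python
from datetime import date
from typing import Mapping


def generalize_date(d: date) -> int:
    """Keep only the year of a date directly related to an individual."""
    return d.year


def generalize_age(age: int) -> str:
    """Aggregate all ages over 89 into a single '90+' category."""
    return "90+" if age >= 90 else str(age)


def generalize_zip(zip_code: str, zip3_population: Mapping[str, int]) -> str:
    """Keep the first three digits only if that area has more than 20,000 people.

    zip3_population is a hypothetical lookup built from Census data.
    Prefixes missing from the lookup are treated as small areas.
    """
    prefix = zip_code[:3]
    if zip3_population.get(prefix, 0) > 20_000:
        return prefix
    return "000"
```

For example, with an invented lookup `{"981": 500_000, "036": 15_000}`, `generalize_zip("98195", ...)` keeps `"981"`, `generalize_zip("03601", ...)` returns `"000"`, and a collection date of 14 March 2016 would be released only as `2016`.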
Demographic characteristics of samples tested.
| Setting | Bacterial–fungal | Viral |
|---|---|---|
| Emergency Room | 2,197 | 29,964 |
| Inpatient | 16,157 | 79,031 |
| Outpatient | 56,503 | 87,917 |
| **Bacterial–fungal** | **Female** | **Male** |
| Emergency Room | 967 | 1,230 |
| Inpatient | 6,976 | 9,181 |
| Outpatient | 24,883 | 31,620 |
| **Viral** | **Female** | **Male** |
| Emergency Room | 15,516 | 14,448 |
| Inpatient | 33,391 | 45,640 |
| Outpatient | 57,237 | 30,680 |
Figure 1. High concordance of collection date and admission date for positive microbiological testing in multiple care settings. Cumulative probability curves of the collection date being within X days of the admission date are depicted for all settings (A), inpatient (B), outpatient (C), and emergency department (D). All positive microbiological tests (bacteria/fungus) and all virology tests (virus) that had collection dates between 2010 and 2017 from the University of Washington Laboratory Medicine Department are shown. Overall, positive microbiological tests had a collection date within 1 day of the patient’s admission date 78.8% of the time. Viruses were significantly more likely to be collected closer to the admission date across all settings (P = 2.2e-16). Cumulative percentages for days 0, 1, 7, and 30 are depicted to highlight the ability to impute admission dates exactly, within 1 day, within 1 week, and within 1 month based on the collection date. Cumulative percentages are plotted on a logarithmic scale because of the very high likelihood of the collection date being near the admission date for outpatient and emergency department testing (>98.5% of collection dates were within 1 day of admission date for each location). Two-sample Kolmogorov-Smirnov (KS) testing was performed on all cumulative distributions. The bacterial and viral distributions differed significantly across all locations (P < 2.2e-16). Comparing viruses across all locations and all bacteria across all locations also revealed that differences between the distributions were significant (P < 2.2e-16).
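
The cumulative percentages at days 0, 1, 7, and 30 can be illustrated with a short sketch. This is not the authors' analysis code: the function name and the input layout (a list of admission date and collection date pairs) are assumptions made for illustration.

```python
from datetime import date


def cumulative_concordance(pairs, thresholds=(0, 1, 7, 30)):
    """Fraction of tests whose collection date is within X days of admission.

    pairs: iterable of (admission_date, collection_date) tuples.
    Returns a dict mapping each threshold X to a fraction between 0 and 1.
    """
    # Absolute gap in whole days between admission and collection.
    gaps = [abs((collected - admitted).days) for admitted, collected in pairs]
    if not gaps:
        raise ValueError("at least one (admission, collection) pair is required")
    return {x: sum(g <= x for g in gaps) / len(gaps) for x in thresholds}
```

Applied to four invented tests with gaps of 0, 0, 1, and 10 days, the function returns 0.5 at day 0, 0.75 at days 1 and 7, and 1.0 at day 30. The day 0 value corresponds to imputing the admission date exactly from the collection date.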
Figure 2. High concordance of collection date and admission date in OB/Gyn testing. Cumulative probability curves of the collection date being within X days of the admission date are shown for inpatients grouped into the following four categories: ICU (A), Medicine (B), OB/Gyn (C), and Surgery (D). Bacterial/fungal culture and viral tests are combined for this figure. ICU had the lowest percentage of admission dates within 1 day of the collection date, at 36.31%. OB/Gyn had the highest percentage of admission dates within 1 day of the collection date, at 65.13%. Cumulative percentages for days 0, 1, 7, and 30 are shown to highlight the ability to impute admission dates from collection dates to within that many days. The cumulative distribution of days separating collection date and admission date differed significantly between OB/Gyn and each of the other services: ICU (P = 0.0022), Medicine (P = 3.62e-05), and Surgery (P = 0.0027). ICU, Medicine, and Surgery were not found to have significantly different cumulative distributions from one another (P > 0.2).
Figure 3. Higher concordance of collection date and admission date for females than for males. Cumulative probability curves of the collection date being within X days of the admission date are shown for all locations and sexes, grouped by test type: all tests combined (A), bacterial/fungal culture positives (B), and all viral testing (C). Samples from women were significantly more likely to be collected on the exact admission date (73.7% versus 63.9%, P = 2.2e-16) and closer to the admission date (D = 0.089916, P = 0.0137) than those from men.
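
The D value reported above is the two-sample Kolmogorov-Smirnov statistic: the largest vertical distance between two empirical cumulative distributions. The sketch below shows how such a statistic can be computed from two lists of day gaps. It is an illustration only, not the authors' code; the function name is hypothetical, and the P-value calculation is omitted.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic D: largest gap between two empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    d = 0.0
    # The empirical CDFs only change at observed values, so checking
    # every distinct observed value is enough to find the maximum gap.
    for x in sorted(set(a) | set(b)):
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d
```

With invented gaps of `[0, 0, 0, 1]` days for one group and `[0, 1, 1, 7]` for another, the empirical distributions differ most at day 0 (0.75 versus 0.25), giving D = 0.5. Identical samples give D = 0.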