Comparing reliability of ICD-10-based COVID-19 comorbidity data to manual chart review, a retrospective cross-sectional study.

Joseph W Schaefer¹, Joshua M Riley¹, Michael Li², Dianna R Cheney-Peters³, Chantel M Venkataraman⁴, Chris J Li¹, Christa M Smaltz⁴, Conor G Bradley¹, Crystal Y Lee¹, Danielle M Fitzpatrick⁴, David B Ney¹, Dina S Zaret¹, Divya M Chalikonda⁴, Joshua D Mairose¹, Kashyap Chauhan⁴, Margaret V Szot⁴, Robert B Jones⁴, Rukaiya Bashir-Hamidu⁴, Shuji Mitsuhashi⁴, Alan A Kubey³,⁵.

Abstract

International Statistical Classification of Diseases and Related Health Problems, 10th Revision (ICD-10) codes are used to characterize cohort comorbidities. Recent literature does not demonstrate standardized extraction methods.
OBJECTIVE: Compare COVID-19 cohort manual-chart-review and ICD-10-based comorbidity data; characterize the accuracy of different methods of extracting ICD-10-code-based comorbidity, including the temporal accuracy with respect to critical time points such as day of admission.
DESIGN: Retrospective cross-sectional study. MEASUREMENTS: ICD-10-based-data performance characteristics relative to manual-chart-review.
RESULTS: Discharge billing diagnoses had a sensitivity of 0.82 (95% confidence interval [CI]: 0.79-0.85; comorbidity range: 0.35-0.96). The past medical history table had a sensitivity of 0.72 (95% CI: 0.69-0.76; range: 0.44-0.87). The active problem list had a sensitivity of 0.67 (95% CI: 0.63-0.71; range: 0.47-0.77). On day of admission, the active problem list had a sensitivity of 0.58 (95% CI: 0.54-0.63; range: 0.30-0.68) and the past medical history table had a sensitivity of 0.48 (95% CI: 0.43-0.53; range: 0.30-0.56).
CONCLUSIONS AND RELEVANCE: ICD-10-based comorbidity data performance varies depending on comorbidity, data source, and time of retrieval; there are notable opportunities for improvement. Future researchers should clearly outline comorbidity data source and validate against manual-chart-review.

Keywords: COVID-19; International Classification of Diseases; comorbidity; electronic health records; research design

Year: 2021 PMID: 34850420 PMCID: PMC9015484 DOI: 10.1002/jmv.27492

Source DB: PubMed Journal: J Med Virol ISSN: 0146-6615 Impact factor: 20.693

INTRODUCTION

Manual chart review is considered the gold standard for clinical research but requires extensive time and personnel. Using International Statistical Classification of Diseases and Related Health Problems, 10th Revision (ICD‐10) codes from a hospital database and/or electronic health record (EHR) is an efficient way to characterize comorbidities. Many COVID‐19 studies have used ICD‐10 to identify comorbidities. Prior COVID‐19 cohort studies, however, do not demonstrate standardized practice in the extraction of ICD‐10‐code‐based comorbidity data. The purpose of this study is to compare the comorbidities identified by manual‐chart‐review versus ICD‐10 coding in a cohort of patients hospitalized with COVID‐19. The secondary aim is to characterize the accuracy of different methods of extracting ICD‐10‐code‐based comorbidity, including the temporal accuracy with respect to critical time points such as day of admission.

METHODS

Study design, setting, and population

This retrospective cross‐sectional study included all adults (age ≥ 18 years) admitted with COVID‐19 to Thomas Jefferson University Hospital (TJUH) in Philadelphia from March 1st to June 6th, 2020. COVID‐19 was defined as a positive SARS‐CoV‐2 qualitative polymerase chain reaction. We excluded patients who were transferred from another institution, pregnant, and/or incarcerated. ICD‐10 codes were obtained from the Jefferson Health datamart. ICD‐10 codes were sub‐grouped by their source of origin in the EHR: past medical history, active problem list, and billed discharge diagnoses (i.e., summary for insurance claims). For the Jefferson Health system, the discharge billing diagnoses are entered after an encounter by nonclinical staff; the past medical history table and active problem list are populated by clinicians. The data used for analysis was transferred to an enterprise server on September 22, 2020 (108–205 days after admission). TJUH institutional review board approved this study (IRB#: 21E.265). This study follows the reporting guidelines outlined in Strengthening the Reporting of Observational Studies in Epidemiology.
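The stated lag of 108–205 days between admission and data transfer follows from the dates given above. A minimal sketch of that arithmetic, assuming the admission window runs from March 1 to June 6, 2020, as stated in this section (the function and variable names are illustrative, not from the study):

```python
from datetime import date

# Dates taken from the study design text above.
transfer_date = date(2020, 9, 22)    # data moved to the enterprise server
first_admission = date(2020, 3, 1)   # start of the inclusion window
last_admission = date(2020, 6, 6)    # end of the inclusion window


def days_until_transfer(admission: date) -> int:
    """Days elapsed between an admission date and the data transfer."""
    return (transfer_date - admission).days


shortest_lag = days_until_transfer(last_admission)
longest_lag = days_until_transfer(first_admission)
print(shortest_lag, longest_lag)  # 108 205
```

Every ICD‐10 entry therefore had at least 108 days after admission to reach the datamart, which matters when interpreting the "no time restriction" results.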

Chart review data collection

Two independent reviewers extracted comorbidity information via manual EHR chart review. Comorbidities were included if they were listed in the emergency department notes, the admission history and physical note, the immediately preceding discharge summary (if one existed), and/or the problem list. Comorbidities of interest were chosen through an April 2020 literature review of independent risk factors for COVID‐19 mortality. For data validation, when the two reviewers disagreed, a third independent reviewer adjudicated the discrepancy.

ICD‐10 classification process

The “icd” package in R (Version 3.3, Author: Jack O. Wasey, MD, Children's Hospital of Philadelphia) was used to assign comorbidities using literature‐supported ICD‐10 mappings established by Quan et al. The comorbidities analyzed for ICD‐10 comparison were congestive heart failure (CHF), cerebral vascular disease (CEVD), diabetes, chronic kidney disease (CKD), cancer, human immunodeficiency virus (HIV), and hypertension (HTN). Composites of subtypes (e.g., complicated and uncomplicated diabetes) were used for diabetes, cancer, and hypertension to allow for comparison to manual review. These comorbidities were selected due to availability in both the manual review and the Quan et al. ICD‐10 mappings. Sensitivity and specificity analysis was performed on each ICD‐10 source against the chart review results. The discharge diagnoses table was linked by hospital encounter identification numbers, whereas the active problem list and past medical history were linked by medical record numbers. The active problem list and past medical history table, both linked by medical record number, included a timestamp of when each code was recorded. This timestamp was used to filter the ICD‐10 codes at three cutoffs: recorded on or before day of admission, recorded on or before day of discharge, and no time restriction.
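The classification and timestamp filtering described above can be sketched as follows. This is not the study's code (the study used the R "icd" package); it is a small Python illustration under stated assumptions: the prefix table is a deliberately incomplete stand-in for the Quan et al. mappings, and the record layout, field names, and function name are hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, Optional, Set

# Illustrative prefixes only: an incomplete stand-in for the Quan et al.
# mappings, which list many more codes per comorbidity.
EXAMPLE_PREFIXES = {
    "CHF": ("I50",),
    "Diabetes": ("E10", "E11"),
    "HTN": ("I10",),
    "HIV": ("B20",),
}


@dataclass(frozen=True)
class CodedEntry:
    mrn: str            # medical record number (hypothetical field name)
    icd10: str          # e.g. "I50.9"
    recorded_on: date   # when the code was recorded in the EHR table


def comorbidities_for(entries: Iterable[CodedEntry], mrn: str,
                      cutoff: Optional[date] = None) -> Set[str]:
    """Comorbidities coded for one patient on or before `cutoff`.

    cutoff=None corresponds to "no time restriction".
    """
    found = set()
    for entry in entries:
        if entry.mrn != mrn:
            continue
        if cutoff is not None and entry.recorded_on > cutoff:
            continue
        code = entry.icd10.replace(".", "").upper()
        for name, prefixes in EXAMPLE_PREFIXES.items():
            if code.startswith(prefixes):
                found.add(name)
    return found


# Toy records for one hypothetical patient.
problem_list = [
    CodedEntry("A1", "I50.9", date(2020, 3, 10)),
    CodedEntry("A1", "E11.9", date(2020, 4, 20)),
]
past_medical_history = [CodedEntry("A1", "I10", date(2019, 1, 5))]
admission_day = date(2020, 4, 1)

on_admission = comorbidities_for(problem_list, "A1", admission_day)
# Pooling two tables is a set union of what each table yields.
pooled_on_admission = on_admission | comorbidities_for(
    past_medical_history, "A1", admission_day)
unrestricted = comorbidities_for(problem_list, "A1")
```

In this toy case the diabetes code was recorded after admission, so it appears only in the unrestricted result, which is the mechanism behind the lower day-of-admission sensitivities.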

Statistical analysis

Statistical analysis was performed using the methodology detailed by Crabb et al. Sensitivity, specificity, positive predictive value, and negative predictive value were calculated by comparing ICD‐10‐derived comorbidities with the corresponding measures from manual chart review; 95% confidence intervals (CI) were calculated by bootstrapping the point estimates using random resampling with replacement to create 1000 samples of the same size as the original group. The 95% CIs for each performance characteristic were calculated from the empirical bootstrap distribution.
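A minimal sketch of the four performance characteristics and a bootstrap confidence interval is given below. It is an illustration, not the authors' implementation: the percentile method for reading the interval off the bootstrap distribution, the function names, the fixed seed, and the toy data are all assumptions.

```python
import random


def performance(pairs):
    """pairs: list of (chart_review_positive, icd10_positive) booleans."""
    tp = sum(1 for truth, coded in pairs if truth and coded)
    fn = sum(1 for truth, coded in pairs if truth and not coded)
    fp = sum(1 for truth, coded in pairs if not truth and coded)
    tn = sum(1 for truth, coded in pairs if not truth and not coded)

    def ratio(num, den):
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


def bootstrap_ci(pairs, metric, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI: resample patients with replacement,
    keeping each resample the same size as the original group."""
    rng = random.Random(seed)
    n = len(pairs)
    estimates = []
    for _ in range(n_boot):
        sample = [pairs[rng.randrange(n)] for _ in range(n)]
        value = performance(sample)[metric]
        if value == value:  # drop NaN from degenerate resamples
            estimates.append(value)
    estimates.sort()
    lower = estimates[int((alpha / 2) * len(estimates))]
    upper = estimates[min(len(estimates) - 1,
                          int((1 - alpha / 2) * len(estimates)))]
    return lower, upper


# Synthetic toy cohort: 40 true positives, 10 false negatives,
# 5 false positives, 45 true negatives.
toy = ([(True, True)] * 40 + [(True, False)] * 10
       + [(False, True)] * 5 + [(False, False)] * 45)
point = performance(toy)
ci_low, ci_high = bootstrap_ci(toy, "sensitivity")
```

Here the point estimates are sensitivity 40/50 = 0.80 and specificity 45/50 = 0.90; the interval bounds depend on the random seed.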

RESULTS

A total of 426 patients were admitted to TJUH for COVID‐19 from March 1st to June 6th, 2020. The average patient age was 64.4 years; 43.4% were female. The percentages of Black, white, Hispanic, Asian, and other patients were 54.4%, 25.6%, 10.6%, 6.8%, and 2.6%, respectively. The mortality rate was 16.7%. Frequencies of comorbidities as determined by manual review were as follows: CHF (80), CEVD (81), diabetes (162), CKD (93), cancer (72), HIV (10), and HTN (301). Table 1 displays the performance characteristics for each ICD‐10 diagnosis table from the EHR. The discharge diagnoses table (108–205 days after admission) was the most sensitive individual table across all comorbidities combined, with a sensitivity of 0.82 (95% CI: 0.79–0.85) and individual comorbidity sensitivity ranging from 0.35 to 0.96. It was followed by the past medical history table with a sensitivity of 0.72 (95% CI: 0.69–0.76), with individual comorbidity sensitivity ranging from 0.44 to 0.87. The active problem list had a sensitivity of 0.67 (95% CI: 0.63–0.71), with individual comorbidity sensitivity ranging from 0.47 to 0.77. A pooled active problem list and past medical history table had a sensitivity of 0.86 (95% CI: 0.83–0.89), with individual comorbidity sensitivity ranging from 0.73 to 0.93.

Table 1

ICD‐10‐based data performance

	Manual chart review count	Positive predictive value	Negative predictive value	Sensitivity	Specificity
Discharge diagnoses
All	799	–	–	0.82 (0.79, 0.85)	0.94 (0.92, 0.96)
CHF	80	0.81 (0.73, 0.89)	0.98 (0.97, 0.99)	0.92 (0.86, 0.97)	0.95 (0.93, 0.97)
CEVD	81	0.83 (0.7, 0.94)	0.87 (0.83, 0.9)	0.36 (0.26, 0.47)	0.98 (0.97, 0.99)
Diabetes	162	0.98 (0.96, 1)	0.97 (0.95, 0.99)	0.96 (0.92, 0.99)	0.99 (0.97, 1)
CKD	93	0.72 (0.64, 0.81)	0.99 (0.97, 1)	0.96 (0.91, 0.99)	0.9 (0.86, 0.93)
Cancer	72	0.96 (0.87, 1)	0.88 (0.85, 0.91)	0.35 (0.23, 0.46)	1 (0.99, 1)
HIV	10	1 (1, 1)	0.99 (0.97, 1)	0.41 (0.1, 0.73)	1 (1, 1)
HTN	301	0.96 (0.93, 0.98)	0.84 (0.77, 0.89)	0.93 (0.9, 0.95)	0.9 (0.84, 0.95)
Past medical history
All	799	–	–	0.72 (0.69, 0.76)	0.95 (0.93, 0.97)
CHF	80	0.92 (0.84, 1)	0.91 (0.88, 0.94)	0.58 (0.48, 0.69)	0.99 (0.98, 1)
CEVD	81	0.8 (0.69, 0.89)	0.91 (0.88, 0.94)	0.58 (0.48, 0.68)	0.97 (0.94, 0.98)
Diabetes	162	0.97 (0.93, 0.99)	0.92 (0.89, 0.95)	0.87 (0.82, 0.92)	0.98 (0.96, 1)
CKD	93	0.87 (0.76, 0.96)	0.86 (0.83, 0.89)	0.44 (0.35, 0.54)	0.98 (0.96, 0.99)
Cancer	72	0.84 (0.74, 0.94)	0.92 (0.89, 0.95)	0.6 (0.49, 0.7)	0.98 (0.96, 0.99)
HIV	10	1 (1, 1)	1 (0.99, 1)	0.8 (0.46, 1)	1 (1, 1)
HTN	301	0.96 (0.93, 0.98)	0.7 (0.62, 0.77)	0.84 (0.79, 0.88)	0.91 (0.86, 0.96)
Active problem list
All	799	–	–	0.67 (0.63, 0.71)	0.98 (0.97, 0.99)
CHF	80	0.86 (0.77, 0.93)	0.94 (0.91, 0.96)	0.71 (0.62, 0.8)	0.97 (0.96, 0.99)
CEVD	81	0.93 (0.83, 1)	0.89 (0.86, 0.92)	0.47 (0.36, 0.58)	0.99 (0.98, 1)
Diabetes	162	0.98 (0.95, 1)	0.86 (0.82, 0.9)	0.73 (0.67, 0.8)	0.99 (0.98, 1)
CKD	93	0.91 (0.85, 0.97)	0.94 (0.91, 0.96)	0.77 (0.69, 0.85)	0.98 (0.96, 0.99)
Cancer	72	0.91 (0.82, 0.98)	0.92 (0.89, 0.95)	0.58 (0.47, 0.69)	0.99 (0.98, 1)
HIV	10	1 (1, 1)	0.99 (0.98, 1)	0.59 (0.29, 0.9)	1 (1, 1)
HTN	301	0.99 (0.97, 1)	0.55 (0.49, 0.62)	0.67 (0.62, 0.73)	0.98 (0.94, 1)
Pooled active problem list and past medical history
All	799	–	–	0.86 (0.83, 0.89)	0.95 (0.92, 0.96)
CHF	80	0.86 (0.79, 0.93)	0.97 (0.95, 0.99)	0.87 (0.8, 0.94)	0.97 (0.95, 0.98)
CEVD	81	0.81 (0.71, 0.9)	0.94 (0.91, 0.96)	0.73 (0.63, 0.82)	0.96 (0.94, 0.98)
Diabetes	162	0.96 (0.93, 0.99)	0.96 (0.93, 0.98)	0.93 (0.89, 0.97)	0.98 (0.96, 0.99)
CKD	93	0.87 (0.8, 0.94)	0.94 (0.91, 0.97)	0.78 (0.7, 0.87)	0.97 (0.95, 0.98)
Cancer	72	0.83 (0.73, 0.92)	0.95 (0.92, 0.97)	0.74 (0.63, 0.83)	0.97 (0.95, 0.99)
HIV	10	1 (1, 1)	1 (0.99, 1)	0.9 (0.67, 1)	1 (1, 1)
HTN	301	0.96 (0.93, 0.98)	0.8 (0.73, 0.86)	0.9 (0.87, 0.94)	0.9 (0.85, 0.95)

Abbreviations: CEVD, cerebral vascular disease; CHF, congestive heart failure; CKD, chronic kidney disease; HIV, human immunodeficiency virus; HTN, hypertension.
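The "All" rows report a manual chart review count of 799, which equals the sum of the seven per-comorbidity counts. The source does not state how the "All" sensitivity was computed; as a rough consistency check, and assuming it pools true positives across comorbidities, the count-weighted average of the rounded per-comorbidity sensitivities in the discharge diagnoses block can be compared with the reported 0.82:

```python
# (manual chart review count, sensitivity) for the discharge diagnoses
# block of Table 1. Sensitivities are the rounded values as printed.
discharge = {
    "CHF": (80, 0.92),
    "CEVD": (81, 0.36),
    "Diabetes": (162, 0.96),
    "CKD": (93, 0.96),
    "Cancer": (72, 0.35),
    "HIV": (10, 0.41),
    "HTN": (301, 0.93),
}

total_positive = sum(count for count, _ in discharge.values())
weighted_sensitivity = sum(
    count * sens for count, sens in discharge.values()) / total_positive

print(total_positive, round(weighted_sensitivity, 2))  # 799 0.82
```

The agreement suggests, but does not prove, that the overall figure is dominated by the common comorbidities (HTN and diabetes), so it can mask poor performance for rarer ones such as cancer or CEVD.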

The discharge diagnoses table was most sensitive for CHF (0.92), diabetes (0.96), CKD (0.96), and HTN (0.93); it was least sensitive for CEVD (0.36), cancer (0.35), and HIV (0.41). The past medical history was most sensitive for diabetes (0.87), HIV (0.80), and HTN (0.84); it was least sensitive for CHF (0.58), CEVD (0.58), CKD (0.44), and cancer (0.60). The active problem list was most sensitive for CHF (0.71), diabetes (0.73), CKD (0.77), and HTN (0.67); it was least sensitive for CEVD (0.47), cancer (0.58), and HIV (0.59).

Table 2 displays the performance characteristics for the active problem list, past medical history, and a pooled table at different time points: recorded on or before day of admission, on or before day of discharge, and no time restriction. When filtering for entries recorded on or before the day of admission, the active problem list had a sensitivity of 0.58 (95% CI: 0.54–0.63); past medical history table had a sensitivity of 0.48 (95% CI: 0.43–0.53); and the pooled table had a sensitivity of 0.74 (95% CI: 0.70–0.78). When filtering for entries recorded on or before the day of discharge, the active problem list had a sensitivity of 0.66 (95% CI: 0.61–0.70); past medical history table had a sensitivity of 0.52 (95% CI: 0.47–0.56); and the pooled table had a sensitivity of 0.81 (95% CI: 0.78–0.84).

Table 2

Performance stratified by time

	Manual chart review count	Positive predictive value	Negative predictive value	Sensitivity	Specificity
Active problem list—day of admission
All	799	–	–	0.58 (0.54, 0.63)	0.99 (0.98, 0.99)
CHF	80	0.87 (0.78, 0.94)	0.93 (0.9, 0.95)	0.66 (0.56, 0.76)	0.98 (0.96, 0.99)
CEVD	81	0.92 (0.81, 1)	0.88 (0.85, 0.91)	0.42 (0.31, 0.53)	0.99 (0.98, 1)
Diabetes	162	0.98 (0.95, 1)	0.82 (0.78, 0.86)	0.64 (0.57, 0.72)	0.99 (0.98, 1)
CKD	93	0.91 (0.84, 0.97)	0.92 (0.88, 0.94)	0.68 (0.58, 0.77)	0.98 (0.96, 0.99)
Cancer	72	0.9 (0.8, 0.98)	0.91 (0.88, 0.94)	0.53 (0.41, 0.63)	0.99 (0.98, 1)
HIV	10	1 (1, 1)	0.98 (0.97, 1)	0.3 (0, 0.64)	1 (1, 1)
HTN	301	0.99 (0.97, 1)	0.49 (0.43, 0.55)	0.57 (0.52, 0.63)	0.98 (0.96, 1)
Active problem list—day of discharge
All	799	–	–	0.66 (0.61, 0.70)	0.99 (0.98, 0.99)
CHF	80	0.87 (0.79, 0.95)	0.93 (0.91, 0.96)	0.7 (0.6, 0.79)	0.98 (0.96, 0.99)
CEVD	81	0.93 (0.83, 1)	0.89 (0.86, 0.92)	0.47 (0.36, 0.58)	0.99 (0.98, 1)
Diabetes	162	0.98 (0.95, 1)	0.85 (0.81, 0.89)	0.71 (0.64, 0.78)	0.99 (0.98, 1)
CKD	93	0.91 (0.84, 0.97)	0.94 (0.91, 0.96)	0.76 (0.67, 0.85)	0.98 (0.96, 0.99)
Cancer	72	0.91 (0.81, 0.98)	0.92 (0.89, 0.94)	0.56 (0.44, 0.67)	0.99 (0.98, 1)
HIV	10	1 (1, 1)	0.99 (0.97, 1)	0.4 (0.09, 0.73)	1 (1, 1)
HTN	301	0.99 (0.98, 1)	0.55 (0.49, 0.62)	0.67 (0.62, 0.72)	0.98 (0.96, 1)
Active problem list—no time constraint
All	799	–	–	0.67 (0.63, 0.71)	0.98 (0.97, 0.99)
CHF	80	0.86 (0.77, 0.93)	0.94 (0.91, 0.96)	0.71 (0.62, 0.8)	0.97 (0.96, 0.99)
CEVD	81	0.93 (0.83, 1)	0.89 (0.86, 0.92)	0.47 (0.36, 0.58)	0.99 (0.98, 1)
Diabetes	162	0.98 (0.95, 1)	0.86 (0.82, 0.9)	0.73 (0.67, 0.8)	0.99 (0.98, 1)
CKD	93	0.91 (0.85, 0.97)	0.94 (0.91, 0.96)	0.77 (0.69, 0.85)	0.98 (0.96, 0.99)
Cancer	72	0.91 (0.82, 0.98)	0.92 (0.89, 0.95)	0.58 (0.47, 0.69)	0.99 (0.98, 1)
HIV	10	1 (1, 1)	0.99 (0.98, 1)	0.59 (0.29, 0.9)	1 (1, 1)
HTN	301	0.99 (0.97, 1)	0.55 (0.49, 0.62)	0.67 (0.62, 0.73)	0.98 (0.94, 1)
Past medical history—day of admission
All	799	–	–	0.48 (0.43, 0.53)	0.98 (0.97, 0.99)
CHF	80	0.92 (0.82, 1)	0.88 (0.84, 0.91)	0.41 (0.3, 0.53)	0.99 (0.98, 1)
CEVD	81	0.76 (0.62, 0.88)	0.87 (0.83, 0.9)	0.38 (0.28, 0.49)	0.97 (0.95, 0.99)
Diabetes	162	0.97 (0.93, 1)	0.79 (0.74, 0.83)	0.56 (0.49, 0.64)	0.99 (0.98, 1)
CKD	93	0.91 (0.78, 1)	0.84 (0.8, 0.87)	0.3 (0.21, 0.4)	0.99 (0.98, 1)
Cancer	72	0.9 (0.78, 1)	0.89 (0.86, 0.92)	0.4 (0.28, 0.52)	0.99 (0.98, 1)
HIV	10	1 (1, 1)	0.99 (0.98, 1)	0.5 (0.17, 0.86)	1 (1, 1)
HTN	301	0.98 (0.95, 0.99)	0.47 (0.41, 0.53)	0.55 (0.49, 0.6)	0.97 (0.94, 0.99)
Past medical history—day of discharge
All	799	–	–	0.52 (0.47, 0.56)	0.97 (0.95, 0.98)
CHF	80	0.92 (0.82, 1)	0.88 (0.85, 0.92)	0.44 (0.32, 0.54)	0.99 (0.98, 1)
CEVD	81	0.78 (0.65, 0.89)	0.88 (0.84, 0.91)	0.43 (0.32, 0.54)	0.97 (0.95, 0.99)
Diabetes	162	0.97 (0.93, 1)	0.8 (0.76, 0.85)	0.61 (0.53, 0.68)	0.99 (0.98, 1)
CKD	93	0.88 (0.75, 0.97)	0.84 (0.8, 0.87)	0.31 (0.23, 0.41)	0.99 (0.98, 1)
Cancer	72	0.91 (0.81, 1)	0.9 (0.87, 0.93)	0.46 (0.34, 0.57)	0.99 (0.98, 1)
HIV	10	1 (1, 1)	0.99 (0.98, 1)	0.5 (0.17, 0.86)	1 (1, 1)
HTN	301	0.96 (0.93, 0.98)	0.49 (0.43, 0.56)	0.59 (0.54, 0.65)	0.94 (0.9, 0.98)
Past medical history—no time constraint
All	799	–	–	0.72 (0.69, 0.76)	0.95 (0.93, 0.97)
CHF	80	0.92 (0.84, 1)	0.91 (0.88, 0.94)	0.58 (0.48, 0.69)	0.99 (0.98, 1)
CEVD	81	0.8 (0.69, 0.89)	0.91 (0.88, 0.94)	0.58 (0.48, 0.68)	0.97 (0.94, 0.98)
Diabetes	162	0.97 (0.93, 0.99)	0.92 (0.89, 0.95)	0.87 (0.82, 0.92)	0.98 (0.96, 1)
CKD	93	0.87 (0.76, 0.96)	0.86 (0.83, 0.89)	0.44 (0.35, 0.54)	0.98 (0.96, 0.99)
Cancer	72	0.84 (0.74, 0.94)	0.92 (0.89, 0.95)	0.6 (0.49, 0.7)	0.98 (0.96, 0.99)
HIV	10	1 (1, 1)	1 (0.99, 1)	0.8 (0.46, 1)	1 (1, 1)
HTN	301	0.96 (0.93, 0.98)	0.7 (0.62, 0.77)	0.84 (0.79, 0.88)	0.91 (0.86, 0.96)
Pooled active problem list and past medical history—day of admission
All	799	–	–	0.74 (0.70, 0.78)	0.97 (0.95, 0.98)
CHF	80	0.86 (0.78, 0.93)	0.95 (0.93, 0.97)	0.79 (0.7, 0.87)	0.97 (0.95, 0.99)
CEVD	81	0.81 (0.7, 0.9)	0.91 (0.88, 0.94)	0.62 (0.5, 0.72)	0.97 (0.94, 0.98)
Diabetes	162	0.97 (0.94, 0.99)	0.89 (0.85, 0.92)	0.79 (0.73, 0.86)	0.98 (0.97, 1)
CKD	93	0.88 (0.8, 0.95)	0.92 (0.89, 0.95)	0.71 (0.62, 0.8)	0.97 (0.95, 0.99)
Cancer	72	0.87 (0.77, 0.95)	0.93 (0.9, 0.96)	0.64 (0.53, 0.75)	0.98 (0.96, 0.99)
HIV	10	1 (1, 1)	0.99 (0.98, 1)	0.5 (0.17, 0.86)	1 (1, 1)
HTN	301	0.97 (0.95, 0.99)	0.64 (0.57, 0.71)	0.78 (0.73, 0.83)	0.95 (0.91, 0.98)
Pooled active problem list and past medical history—day of discharge
All	799	–	–	0.81 (0.78, 0.84)	0.96 (0.94, 0.98)
CHF	80	0.87 (0.79, 0.94)	0.97 (0.94, 0.98)	0.85 (0.77, 0.93)	0.97 (0.95, 0.99)
CEVD	81	0.82 (0.73, 0.91)	0.93 (0.9, 0.95)	0.68 (0.57, 0.78)	0.97 (0.94, 0.98)
Diabetes	162	0.97 (0.94, 0.99)	0.92 (0.89, 0.95)	0.86 (0.8, 0.91)	0.98 (0.97, 1)
CKD	93	0.87 (0.79, 0.93)	0.94 (0.91, 0.96)	0.77 (0.69, 0.86)	0.97 (0.95, 0.98)
Cancer	72	0.87 (0.78, 0.96)	0.94 (0.91, 0.96)	0.68 (0.58, 0.78)	0.98 (0.96, 0.99)
HIV	10	1 (1, 1)	0.99 (0.98, 1)	0.6 (0.25, 0.92)	1 (1, 1)
HTN	301	0.97 (0.95, 0.99)	0.74 (0.68, 0.81)	0.87 (0.83, 0.9)	0.94 (0.89, 0.98)
Pooled active problem list and past medical history—no time constraint
All	799	–	–	0.86 (0.83, 0.89)	0.95 (0.92, 0.96)
CHF	80	0.86 (0.79, 0.93)	0.97 (0.95, 0.99)	0.87 (0.8, 0.94)	0.97 (0.95, 0.98)
CEVD	81	0.81 (0.71, 0.9)	0.94 (0.91, 0.96)	0.73 (0.63, 0.82)	0.96 (0.94, 0.98)
Diabetes	162	0.96 (0.93, 0.99)	0.96 (0.93, 0.98)	0.93 (0.89, 0.97)	0.98 (0.96, 0.99)
CKD	93	0.87 (0.8, 0.94)	0.94 (0.91, 0.97)	0.78 (0.7, 0.87)	0.97 (0.95, 0.98)
Cancer	72	0.83 (0.73, 0.92)	0.95 (0.92, 0.97)	0.74 (0.63, 0.83)	0.97 (0.95, 0.99)
HIV	10	1 (1, 1)	1 (0.99, 1)	0.9 (0.67, 1)	1 (1, 1)
HTN	301	0.96 (0.93, 0.98)	0.8 (0.73, 0.86)	0.9 (0.87, 0.94)	0.9 (0.85, 0.95)

Abbreviations: CEVD, cerebral vascular disease; CHF, congestive heart failure; CKD, chronic kidney disease; HIV, human immunodeficiency virus; HTN, hypertension.

DISCUSSION

Our results demonstrate significant and concerning variability in ICD‐10 performance depending on comorbidity, source of ICD‐10‐based data, and time of data retrieval. For retrospective research, the discharge billing diagnoses table was the most sensitive individual table. However, when the clinical tables (active problem list and past medical history table) were pooled, they performed similarly to the discharge diagnoses table. Several of the individual comorbidity performances were insufficient. For example, for the discharge diagnoses table, a commonly used source of ICD‐10 codes, stroke (CEVD), cancer, and HIV had sensitivities ranging from 0.35 to 0.41.

For clinical use, accuracy during the hospital stay is important. The sensitivity of the past medical history and active problem list decreased substantially when filtering for entries recorded on or before day of admission. The sensitivity of the active problem list on admission was 0.58 (95% CI: 0.54–0.63). If the past medical history table is included, the sensitivity on admission improves to 0.74 (95% CI: 0.70–0.78). This emphasizes the importance of a thorough medical history and referencing patient notes.

In an outpatient setting, Martin et al. similarly found limitations in ICD‐10‐derived comorbidities when comparing EHR ICD‐10 data sources, with an even lower average sensitivity and specificity for the problem list, past medical history, and encounter level (discharge diagnoses): (0.41, 0.82), (0.55, 0.76), and (0.54, 0.75), respectively. Compared to our study, this lower sensitivity and specificity may be due to differences between the inpatient and outpatient setting, differing institutional practices, comorbidities analyzed, and/or COVID‐19 versus non‐COVID‐19 populations. In a comprehensive review of 51 conditions using ICD‐9‐based data, Wei et al. found the median sensitivity was 83% with a range of 3%–100%. That median is comparable to the sensitivity of our discharge diagnoses table and of our pooled active problem list and past medical history table. The range demonstrates the dramatic variability in sensitivity across conditions, and we encourage institutions to verify their ICD‐10‐based comorbidity data.

Of note, most risk calculators are used at initial presentation, yet our data shows that the accuracy of the problem list on day of admission is lacking. The temporal accuracy of the EHR, albeit not with respect to clinically relevant time points such as day of admission or discharge, has been assessed in other studies, including a study by Schulz et al. that calculated the lag from diagnostic criteria being met (e.g., hemoglobin A1c ≥6.5%) to a structured diagnosis populating in the EHR; the mean delay for HTN, hyperlipidemia, and diabetes was 389, 198, and 166 days, respectively. Our analysis with respect to clinically relevant time points bolsters Schulz et al.'s data, which suggest that accuracy analyses should include time as a variable. For risk and/or allocation schemes, institutions should compare and validate those schemes against the timing of live data they are prospectively using at bedside, especially if they use comorbidity in native EHR risk calculators. For retrospective research, researchers should include certain structured EHR diagnoses added for a defined period after the event.

A review of literature demonstrated variability in ICD‐10 retrieval practices across institutions. Given the significant variability across sources of ICD‐10‐based data in our data, this is a cause for concern. The Cleveland Clinic organizes the raw EHR data, including ICD‐10 codes, into concepts from the Unified Medical Language System to inform comorbidity. Another study used ICD‐10 codes from the problem list at the beginning of the encounter, while another used ICD‐10 codes from the current and prior encounters in addition to the problem list. We commend these teams for clearly detailing their source, as this is not always clear. To bring clarity to the literature, we encourage future publications to detail the source of the ICD‐10‐based data and validate its accuracy against manual chart review of a subset of patients.

Our study is limited. The data is from a single center using a specific EHR. It is unclear if our results can be generalized. We suspect accuracy will vary based on institutional practices, including the EHR used, expectations, and/or support for documentation and billing/coding practices. This limitation, however, underscores our conclusion that other institutions/studies should manually validate their ICD‐10‐based data. Further, this study is limited to a single, disease‐specific cohort. Although we anticipate similar trends would emerge across populations, it is unclear if these are COVID‐19‐specific findings. Additionally, the comorbidities analyzed were selected based on availability in a previously constructed manual‐chart‐review‐based database. We further note that TJUH has been using its current EHR for less than 5 years, which may affect discrete ICD‐10‐based data table reliability (i.e., based on Schulz et al., we anticipate data table accuracy will improve as a patient's chart ages).

The varied performance characteristics have immediate implications for COVID‐19 research and resource allocation. Multivariable analyses that identify independent risk factors for COVID‐19 mortality and morbidity are being used to allocate scarce resources, such as vaccines, at a public health level. Given the variability we note in ICD‐10‐based comorbidity identification, these studies may over‐ and/or under‐emphasize certain comorbidities, which may affect resource allocation. Furthermore, some health systems are using EHR‐based real‐time COVID‐19 risk scores to inform triage, treatment allocation, and more, potentially without knowing the performance characteristics of their own EHR comorbidity data.

CONCLUSION

ICD‐10 codes are often used to identify comorbidities for retrospective studies and clinical risk calculators. We note significant variation in performance depending on comorbidity, source of ICD‐10‐based data, and time of data retrieval. We note concerning features such as a CKD sensitivity of 0.3 when using past medical history on day of admission. The data suggests that COVID‐19 ICD‐10‐based data should be reviewed and applied carefully in the clinical and public health realms, that future researchers should clearly outline source and time of comorbidity data and validate against manual chart review, and that the research community should determine consensus in standardized extraction, analysis, and reporting of ICD‐10‐based data.

FUNDING INFORMATION

The research received no specific grant from any funding agency in the public, commercial, or not‐for‐profit sectors.

CONFLICT OF INTERESTS

The authors declare that there are no conflicts of interest.

AUTHOR CONTRIBUTIONS

Dianna R. Cheney‐Peters, Robert B. Jones, and Alan A. Kubey: conceived the dataset and group creation. Joseph W. Schaefer and Alan A. Kubey: conceived study design. Joseph W. Schaefer, Joshua M. Riley, Michael Li, Dianna R. Cheney‐Peters, Chantel M. Venkataraman, Chris J. Li, Christa M. Smaltz, Conor G. Bradley, Crystal Y. Lee, Danielle M. Fitzpatrick, David B. Ney, Dina S. Zaret, Divya M. Chalikonda, Joshua D. Mairose, Kashyap Chauhan, Margaret V. Szot, Robert B. Jones, Rukaiya Bashir‐Hamidu, Shuji Mitsuhashi, and Alan A. Kubey: performed data collection and analyses, reviewed, edited, and finalized the manuscript, providing critical feedback and changes before submission.

17 in total

1. Comparison of EHR-based diagnosis documentation locations to a gold standard for risk stratification in patients with multiple chronic conditions.

Authors: Shelby Martin; Jesse Wagner; Nicoleta Lupulescu-Mann; Katrina Ramsey; Aaron Cohen; Peter Graven; Nicole G Weiskopf; David A Dorr
Journal: Appl Clin Inform Date: 2017-08-02 Impact factor: 2.342

2. Coding algorithms for defining comorbidities in ICD-9-CM and ICD-10 administrative data.

Authors: Hude Quan; Vijaya Sundararajan; Patricia Halfon; Andrew Fong; Bernard Burnand; Jean-Christophe Luthi; L Duncan Saunders; Cynthia A Beck; Thomas E Feasby; William A Ghali
Journal: Med Care Date: 2005-11 Impact factor: 2.983

3. Body Mass Index and Risk for Intubation or Death in SARS-CoV-2 Infection : A Retrospective Cohort Study.

Authors: Michaela R Anderson; Joshua Geleris; David R Anderson; Jason Zucker; Yael R Nobel; Daniel Freedberg; Jennifer Small-Saunders; Kartik N Rajagopalan; Richard Greendyk; Sae-Rom Chae; Karthik Natarajan; David Roh; Ethan Edwin; Dympna Gallagher; Anna Podolanczuk; R Graham Barr; Anthony W Ferrante; Matthew R Baldwin
Journal: Ann Intern Med Date: 2020-07-29 Impact factor: 25.391

4. Hospitalization and Mortality among Black Patients and White Patients with Covid-19.

Authors: Eboni G Price-Haywood; Jeffrey Burton; Daniel Fort; Leonardo Seoane
Journal: N Engl J Med Date: 2020-05-27 Impact factor: 91.245

5. Temporal relationship of computed and structured diagnoses in electronic health record data.

Authors: Wade L Schulz; H Patrick Young; Andreas Coppi; Bobak J Mortazavi; Zhenqiu Lin; Raymond A Jean; Harlan M Krumholz
Journal: BMC Med Inform Decis Mak Date: 2021-02-17 Impact factor: 2.796

6. Comprehensive review of ICD-9 code accuracies to measure multimorbidity in administrative data.

Authors: Melissa Y Wei; Jamie E Luster; Chiao-Li Chan; Lillian Min
Journal: BMC Health Serv Res Date: 2020-06-01 Impact factor: 2.655

7. Machine Learning to Predict Mortality and Critical Events in a Cohort of Patients With COVID-19 in New York City: Model Development and Validation.

Authors: Sulaiman Somani; Adam J Russak; Akhil Vaid; Jessica K De Freitas; Fayzan F Chaudhry; Ishan Paranjpe; Kipp W Johnson; Samuel J Lee; Riccardo Miotto; Felix Richter; Shan Zhao; Noam D Beckmann; Nidhi Naik; Arash Kia; Prem Timsina; Anuradha Lala; Manish Paranjpe; Eddye Golden; Matteo Danieletto; Manbir Singh; Dara Meyer; Paul F O'Reilly; Laura Huckins; Patricia Kovatch; Joseph Finkelstein; Robert M Freeman; Edgar Argulian; Andrew Kasarskis; Bethany Percha; Judith A Aberg; Emilia Bagiella; Carol R Horowitz; Barbara Murphy; Eric J Nestler; Eric E Schadt; Judy H Cho; Carlos Cordon-Cardo; Valentin Fuster; Dennis S Charney; David L Reich; Erwin P Bottinger; Matthew A Levin; Jagat Narula; Zahi A Fayad; Allan C Just; Alexander W Charney; Girish N Nadkarni; Benjamin S Glicksberg
Journal: J Med Internet Res Date: 2020-11-06 Impact factor: 5.428

8. Obesity and Mortality Among Patients Diagnosed With COVID-19: Results From an Integrated Health Care Organization.

Authors: Sara Y Tartof; Lei Qian; Vennis Hong; Rong Wei; Ron F Nadjafi; Heidi Fischer; Zhuoxin Li; Sally F Shaw; Susan L Caparosa; Claudia L Nau; Tanmai Saxena; Gunter K Rieg; Bradley K Ackerson; Adam L Sharp; Jacek Skarbinski; Tej K Naik; Sameer B Murali
Journal: Ann Intern Med Date: 2020-08-12 Impact factor: 25.391

9. Association of Race With Mortality Among Patients Hospitalized With Coronavirus Disease 2019 (COVID-19) at 92 US Hospitals.

Authors: Baligh R Yehia; Angela Winegar; Richard Fogel; Mohamad Fakih; Allison Ottenbacher; Christine Jesser; Angelo Bufalino; Ren-Huai Huang; Joseph Cacchione
Journal: JAMA Netw Open Date: 2020-08-03

10. Characteristics Associated With Racial/Ethnic Disparities in COVID-19 Outcomes in an Academic Health Care System.

Authors: Tian Gu; Jasmine A Mack; Maxwell Salvatore; Swaraaj Prabhu Sankar; Thomas S Valley; Karandeep Singh; Brahmajee K Nallamothu; Sachin Kheterpal; Lynda Lisabeth; Lars G Fritsche; Bhramar Mukherjee
Journal: JAMA Netw Open Date: 2020-10-01
