Patrick Wu1,2, Aliya Gifford1, Xiangrui Meng3, Xue Li3, Harry Campbell3, Tim Varley4, Juan Zhao1, Robert Carroll1, Lisa Bastarache1, Joshua C Denny1,5, Evropi Theodoratou3,6, Wei-Qi Wei1.
Abstract
BACKGROUND: The phecode system was built upon the International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM) for phenome-wide association studies (PheWAS) using the electronic health record (EHR).
Entities:
Keywords: data science; electronic health record; genome-wide association study; medical informatics applications; phenome-wide association study; phenotyping
Year: 2019 PMID: 31553307 PMCID: PMC6911227 DOI: 10.2196/14325
Source DB: PubMed Journal: JMIR Med Inform
Figure 1. Mapping strategy for ICD-10 (non-CM) and ICD-10-CM diagnosis codes to phecodes. We mapped ICD-10-CM codes directly by matching code descriptions (path A) or indirectly to phecodes, using a number of manually validated mapping resources (paths B, C, D, E, and F). In path D, we used NLM’s SNOMED CT to create ICD-9-CM one-to-one and many-to-one maps [23]. To map ICD-9-CM codes to phecodes, we applied Phecode Map 1.2 with ICD-9 Codes (ICD-9-CM phecode map) [14]. Boxes with solid lines indicate clinical terminologies, and those with dashed lines describe the resources and mapping methods used. ICD-10-CM: International Classification of Diseases, Tenth Revision, Clinical Modification; CUI: Concept Unique Identifier; SNOMED CT: Systematized Nomenclature of Medicine Clinical Terms; GEMS: General Equivalence Mappings; NLM: National Library of Medicine; ICD-9-CM: International Classification of Diseases, Ninth Revision, Clinical Modification; OHDSI: Observational Health Data Sciences and Informatics.
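The two-stage idea in the Figure 1 caption (a direct description match, then an indirect route through ICD-9-CM) can be sketched roughly as below. This is only an illustration under stated assumptions: every identifier in the lookup tables is a placeholder rather than a real code or a real map entry, the single intermediate table stands in for the several resources named in paths B to F, and trying the direct match first is an assumption of this sketch, not something the caption states.

```python
from typing import Set

# Placeholder identifiers only; none of these are real codes or published map entries.
ICD10CM_DESCRIPTION = {
    "ICD10_A": "Condition alpha",
    "ICD10_B": "Condition beta, unspecified",
}
PHECODE_DESCRIPTION = {"PHE_1": "condition alpha"}

# Stand-in for an intermediate ICD-10-CM to ICD-9-CM resource (hypothetical content).
ICD10CM_TO_ICD9CM = {"ICD10_B": ["ICD9_X"]}
# Stand-in for the ICD-9-CM phecode map (hypothetical content).
ICD9CM_TO_PHECODE = {"ICD9_X": "PHE_2"}


def normalize(description: str) -> str:
    """Lowercase and trim a description so trivial differences do not block a match."""
    return description.strip().lower()


def map_to_phecodes(icd10cm_code: str) -> Set[str]:
    """Return phecodes for one ICD-10-CM code: direct match first, then indirect."""
    description = normalize(ICD10CM_DESCRIPTION.get(icd10cm_code, ""))
    if description:
        direct = {
            phecode
            for phecode, text in PHECODE_DESCRIPTION.items()
            if normalize(text) == description
        }
        if direct:
            return direct
    # Indirect route: ICD-10-CM -> ICD-9-CM -> phecode.
    return {
        ICD9CM_TO_PHECODE[icd9]
        for icd9 in ICD10CM_TO_ICD9CM.get(icd10cm_code, [])
        if icd9 in ICD9CM_TO_PHECODE
    }
```

With these toy tables, `map_to_phecodes("ICD10_A")` resolves through the description match and `map_to_phecodes("ICD10_B")` through the ICD-9-CM route; a code absent from both tables yields an empty set, which corresponds to an unmapped code.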
Figure 2. Counts of distinct ICD-10-CM source codes at VUMC and ICD-10 (non-CM) source codes in UKBB. (A) Number of unique ICD-10-CM codes in each category. For example, there were 34,793 unique codes (grey section) that were in the official ICD-10-CM system, observed in the VUMC dataset, and mapped to phecodes. (B) Number of unique ICD-10 codes in each category. For example, there were 5823 unique codes (off-white section) that were in the official ICD-10 system, observed in the UKBB dataset, and mapped to phecodes. ICD-10-CM: International Classification of Diseases, 10th Revision, Clinical Modification; VUMC: Vanderbilt University Medical Center; ICD-10: International Classification of Diseases, 10th Revision; UKBB: UK Biobank.
Figure 3. Timeline of the two 18-month periods from which ICD-9-CM and ICD-10-CM codes from VUMC were analyzed. The cohort of 357,728 patients had at least one ICD-9-CM and one ICD-10-CM code in the respective 18-month windows. ICD-9-CM: International Classification of Diseases, Ninth Revision, Clinical Modification; ICD-10-CM: International Classification of Diseases, 10th Revision, Clinical Modification.
ICD-10-CM and ICD-10 codes data summary.
| | ICD-10-CM^a (VUMC^b) | ICD-10^c (UKBB^d) |
| --- | --- | --- |
| Codes in the official system | | |
| Unique codes, n | 94,201 | 12,027 |
| Unique codes mapped, n (%) | 82,303 (87.37) | 9,060 (75.33) |
| Codes observed in the dataset | | |
| Unique codes, n | 36,858 | 6,245 |
| Unique codes mapped, n (%) | 34,793 (94.40) | 5,823 (93.24) |
| Total patients (with ICD-10-CM or ICD-10 codes), n | 651,649 | 391,181 |
| Total instances of all ICD^e codes, n | 19,682,697 | 5,114,363 |
| Instances mapped to phecodes, n (%) | 17,658,470 (89.72) | 4,279,544 (83.68) |
^a ICD-10-CM: International Classification of Diseases, 10th Revision, Clinical Modification
^b VUMC: Vanderbilt University Medical Center
^c ICD-10: International Classification of Diseases, 10th Revision
^d UKBB: UK Biobank
^e ICD: International Classification of Diseases
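As a quick arithmetic check, the percentages in the data summary can be recomputed from the reported counts. The sketch below copies the counts from the table; the dictionary layout, labels, and function name are illustrative choices, not part of the source.

```python
# (mapped, total) pairs copied from the data summary table above.
# The labels are descriptive only and were chosen for this sketch.
COUNTS = {
    "ICD-10-CM (VUMC), 94,201 unique codes": (82_303, 94_201),
    "ICD-10 (UKBB), 12,027 unique codes": (9_060, 12_027),
    "ICD-10-CM (VUMC), 36,858 unique codes": (34_793, 36_858),
    "ICD-10 (UKBB), 6,245 unique codes": (5_823, 6_245),
    "ICD-10-CM (VUMC), code instances": (17_658_470, 19_682_697),
    "ICD-10 (UKBB), code instances": (4_279_544, 5_114_363),
}


def percent_mapped(mapped: int, total: int) -> float:
    """Return mapped/total as a percentage rounded to two decimal places."""
    return round(100 * mapped / total, 2)


if __name__ == "__main__":
    for label, (mapped, total) in COUNTS.items():
        print(f"{label}: {percent_mapped(mapped, total):.2f}% mapped")
```

Each recomputed value agrees with the percentage printed in the corresponding table row.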
ICD-10-CM phecode map reproducibility analysis.
| Phenotype | Phecodes^a | ICD-9-CM^b cases, n | ICD-10-CM^c case \| ICD-9-CM case^d, n (%) |
| --- | --- | --- | --- |
| Hypertension | 401.* | 65,216 | 49,468 (75.85) |
| Hyperlipidemia | 272.* | 51,187 | 36,187 (70.70) |
| Type 1 diabetes | 250.1* | 5782 | 4412 (76.31) |
| Type 2 diabetes | 250.2* | 25,077 | 19,066 (76.03) |
| Intestinal infection | 008.* | 3410 | 273 (8.01) |
^a In the phecode column, * means ≥1 digits or a period (eg, phecode 401.*=phecodes 401, 401.1, 401.3, 401.22, 401.21, or 401.2)
^b ICD-9-CM: International Classification of Diseases, Ninth Revision, Clinical Modification
^c ICD-10-CM: International Classification of Diseases, 10th Revision, Clinical Modification
^d In the last column, “ICD-10-CM case|ICD-9-CM case” indicates patients who were cases for the phenotype of interest during the ICD-9-CM period who were also ICD-10-CM cases
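The wildcard notation in footnote a and the conditional proportion in footnote d can be illustrated with a small sketch. The matching rule below is one reading of the footnote (the wildcard matches the root phecode itself or any more specific phecode beneath it), and the function names and toy patient identifiers are assumptions made for this example.

```python
import re
from typing import Set


def phecode_matches(pattern: str, phecode: str) -> bool:
    """Return True if `phecode` falls under a pattern such as '401.*' or '250.1*'."""
    if not pattern.endswith("*"):
        return phecode == pattern
    root = pattern[:-1].rstrip(".")  # '401.*' -> '401'; '250.1*' -> '250.1'
    if "." in root:
        # Root already has a decimal part: allow further digits only.
        regex = re.escape(root) + r"\d*"
    else:
        # Integer root: allow an optional decimal part.
        regex = re.escape(root) + r"(\.\d+)?"
    return re.fullmatch(regex, phecode) is not None


def retained_percent(icd9_cases: Set[int], icd10_cases: Set[int]) -> float:
    """Percentage of ICD-9-CM-period cases who were also ICD-10-CM-period cases."""
    if not icd9_cases:
        raise ValueError("no ICD-9-CM cases to condition on")
    return 100 * len(icd9_cases & icd10_cases) / len(icd9_cases)


if __name__ == "__main__":
    for code in ["401", "401.1", "401.22", "411.4"]:
        print(code, phecode_matches("401.*", code))
    # Toy patient identifiers, not data from the study.
    print(retained_percent({1, 2, 3, 4}, {2, 3, 9}))
```

Applied to the hypertension row, the same ratio is 49,468 of 65,216 ICD-9-CM cases, which is the 75.85% shown in the table.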
Figure 4. Comparative PheWAS of lipoprotein(a) genetic variant, rs10455872. “Coronary atherosclerosis” (phecode 411.4) and “Other chronic ischemic heart disease” (phecode 411.8) were top hits associated with rs10455872 in a PheWAS analysis conducted using ICD-9-CM (top) and ICD-10-CM (bottom) phecode maps. Analyses were adjusted for age, sex, and race. PheWAS: phenome-wide association studies; ICD-9-CM: International Classification of Diseases, Ninth Revision, Clinical Modification; ICD-10-CM: International Classification of Diseases, 10th Revision, Clinical Modification.