George Hripcsak, Matthew E Levine, Ning Shang, Patrick B Ryan.
Abstract
Objective: To study the effect on patient cohorts of mapping condition (diagnosis) codes from source billing vocabularies to a clinical vocabulary.
Year: 2018 PMID: 30395248 PMCID: PMC6289550 DOI: 10.1093/jamia/ocy124
Source DB: PubMed Journal: J Am Med Inform Assoc ISSN: 1067-5027 Impact factor: 4.497
Figure 1. Design of the vocabulary study. The OHDSI (OMOP) database comprises the source data in ICD9-CM and ICD10-CM (bottom left) and the mapped data in SNOMED CT (bottom right). The gold standard concept sets include the original ICD9-CM concept set, run only on the ICD9-CM codes in the source data, and the extension of that concept set to ICD10-CM (and SNOMED CT but not used here) based on the current authors’ interpretation of the original authors’ intent [11-17]. New SNOMED CT concept sets are generated from the original concept set both using knowledge engineering and via automatic translation. The generated concept sets and the gold standard concept sets are run against their respective data sets, and the resulting patient cohorts are compared for false positives (FP) or false negatives (FN), with the original concept set serving as the gold standard for Table 3 and the extended concept set that is based on the original authors’ intent serving as the gold standard in Table 4.
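
The cohort comparison described in the caption can be sketched as plain set arithmetic. This is a minimal illustration, not the study's implementation: the function names, the patient identifiers, and the placeholder mapped concept labels (`S_HF`, `S_HF_RISK`, `S_OTHER`) are all hypothetical, and a patient is assumed to enter a cohort when any of their codes is in the concept set.

```python
def build_cohort(patient_codes, concept_set):
    """Return the ids of patients having at least one code in the concept set.

    patient_codes: dict mapping patient id -> iterable of codes (assumed layout).
    """
    concept_set = set(concept_set)
    return {pid for pid, codes in patient_codes.items() if concept_set & set(codes)}


def compare_cohorts(gold, candidate):
    """Count false positives and false negatives of candidate against gold."""
    gold, candidate = set(gold), set(candidate)
    false_positives = len(candidate - gold)  # selected, but not in the gold cohort
    false_negatives = len(gold - candidate)  # in the gold cohort, but missed
    return false_positives, false_negatives


# Hypothetical toy data: the same three patients in source and mapped form.
source_data = {"p1": {"428.0"}, "p2": {"250.01"}, "p3": {"428.9"}}
mapped_data = {"p1": {"S_HF"}, "p2": {"S_HF_RISK"}, "p3": {"S_OTHER"}}

gold_cohort = build_cohort(source_data, {"428.0", "428.9"})
candidate_cohort = build_cohort(mapped_data, {"S_HF", "S_HF_RISK"})
fp, fn = compare_cohorts(gold_cohort, candidate_cohort)
```

In this toy case the generated concept set picks up one patient outside the gold cohort and misses one inside it, so both counts are 1.
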
Table 1. Original ICD9-CM definition of concept sets used in phenotypes
| Algorithm | Original ICD9-CM concept set‡ |
|---|---|
| Heart failure (HF) | 428.* |
| Heart failure as exclusion diagnosis (HF2) | 428.* |
| Type-1 diabetes mellitus (T1DM) | 250.x1, 250.x3 |
| Type-2 diabetes mellitus (T2DM) | 250.x0, 250.x2 |
| Appendicitis (Appy) | 540.* |
| Attention deficit hyperactivity disorder (ADHD) | 314, 314.0, 314.01, 314.1, 314.2, 314.8, 314.9 |
| Cataract (Catar) | 366.10, 366.12, 366.13, 366.14, 366.15, 366.16, 366.17, 366.18, 366.19, 366.21, 366.30, 366.41, 366.45, 366.8, 366.9 |
| Crohn’s disease (Crohn) | 555, 555.0, 555.1, 555.2, 555.9 |
| Rheumatoid arthritis (RA) | 714, 714.0, 714.1, 714.2 |
Within a code list, “*” means one or more digits or a period; “x” means one digit.
Only rheumatoid arthritis also had ICD10-CM codes in its original definition, namely, M05* and M06*, and these were used in the second gold standard.
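
The wildcard notation in the code lists can be made concrete by translating each pattern into a regular expression. The sketch below reads the footnote literally ("*" is one or more digits or periods, "x" is exactly one digit); the function names are hypothetical, and it assumes the patterns contain only digits, periods, "*", and "x", as in the table above.

```python
import re


def icd9_pattern_to_regex(pattern):
    """Translate a code-list pattern into a compiled regular expression.

    Assumed reading of the notation: "*" = one or more digits or periods,
    "x" = exactly one digit; every other character matches itself.
    """
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(r"[0-9.]+")
        elif ch == "x":
            parts.append(r"[0-9]")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts))


def matches_concept_set(code, patterns):
    """True if the whole code matches at least one pattern in the list."""
    return any(icd9_pattern_to_regex(p).fullmatch(code) for p in patterns)


heart_failure = ["428.*"]
type1_diabetes = ["250.x1", "250.x3"]

hf_hit = matches_concept_set("428.21", heart_failure)
t1dm_hit = matches_concept_set("250.01", type1_diabetes)
t1dm_miss = matches_concept_set("250.00", type1_diabetes)
```

Under this literal reading, the bare three-digit code `428` does not match `428.*`, because the pattern requires the period and at least one further character.
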
Table 2. Methods to generate concept sets from the ICD9-CM concept set
| Method | Description |
|---|---|
| Original (no mapping) | Original concept set. |
| ICD9 set | Original ICD9-CM concept set generated by the phenotype author. This set is always run against the patients’ original ICD9-CM terms to show what would have happened before either data or concept sets were mapped. |
| Knowledge engineered (automatically map data; manually translate concept sets) | These SNOMED CT concept sets were created by hand. They are run against data in the form of SNOMED CT terms that were generated by mapping data from ICD9-CM and ICD10-CM to SNOMED CT using the OHDSI vocabulary mappings. |
| SNOMED mimic | SNOMED CT concept set designed to mimic the original ICD9-CM concept set as much as possible, ignoring data from other vocabularies. |
| SNOMED optimize | SNOMED CT concept set designed to extend the phenotype author’s intent to ICD9-CM, ICD10-CM, and SNOMED CT. |
| Automatically generated (automatically map data and concept sets) | These SNOMED CT concept sets were generated automatically from the original ICD9-CM set using OHDSI vocabulary mappings. Like knowledge engineered, they are run against data in the form of SNOMED CT terms that were generated by mapping data from ICD9-CM and ICD10-CM to SNOMED CT using the OHDSI vocabulary mappings. |
| SNOMED no desc | SNOMED CT concept set generated by using OHDSI vocabulary mappings to map from ICD9-CM terms to SNOMED CT, not using the SNOMED hierarchy. |
| SNOMED all desc | Like “SNOMED no desc,” but includes all terms in the SNOMED CT hierarchy that are descendants of the mapped terms. |
| SNOMED desc x child | Like “SNOMED no desc,” but includes descendants for mapped terms only if none of the term’s children is also in the concept set. Can be seen as limited descendants. |
| SNOMED desc x desc | Like “SNOMED no desc,” but includes descendants for mapped terms only if none of the term’s descendants is also in the concept set. Can be seen as more limited descendants. |
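
The four automatic expansion rules can be compared on a toy hierarchy. The sketch below is an illustration under stated assumptions, not the OHDSI tooling: the hierarchy is a hypothetical parent-to-children dictionary, the term names are invented, and "in the concept set" is taken to mean "among the directly mapped terms".

```python
def descendants(term, children):
    """All terms below `term` in a parent -> children dictionary."""
    seen = set()
    stack = list(children.get(term, ()))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(children.get(node, ()))
    return seen


def expand(mapped, children, mode):
    """Expand the directly mapped terms according to one of the four rules."""
    mapped = set(mapped)
    result = set(mapped)
    if mode == "no_desc":
        return result
    for term in mapped:
        desc = descendants(term, children)
        if mode == "all_desc":
            blocked = False
        elif mode == "desc_x_child":
            # Skip expansion when any child of the term is itself mapped.
            blocked = any(c in mapped for c in children.get(term, ()))
        elif mode == "desc_x_desc":
            # Skip expansion when any descendant of the term is itself mapped.
            blocked = bool(desc & mapped)
        else:
            raise ValueError(f"unknown mode: {mode}")
        if not blocked:
            result |= desc
    return result


# Hypothetical hierarchy and mapped terms.
children = {
    "root1": ["a1", "a2"],
    "a1": ["a1x"],
    "root2": ["b1"],
    "b1": ["b2"],
    "b2": ["b3"],
}
mapped = {"root1", "a1", "root2", "b2"}

sizes = {
    mode: len(expand(mapped, children, mode))
    for mode in ("no_desc", "desc_x_desc", "desc_x_child", "all_desc")
}
```

On this toy input the sets are nested: "no desc" is the smallest, "desc x desc" adds fewer terms than "desc x child", and "all desc" is the largest, which matches the "limited" and "more limited" descriptions in the table.
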
Table 3. Performance on ICD9-CM source data mapped to SNOMED CT (FP, false positive; FN, false negative)
Column groups: Original (ICD9 set); Knowledge engineered (SNOMED mimic, SNOMED optimize); Automated concept set creation (SNOMED no desc, SNOMED all desc, SNOMED desc x child, SNOMED desc x desc).

| Pheno | #Cases | ICD9 set FP | ICD9 set FN | SNOMED mimic FP | SNOMED mimic FN | SNOMED optimize FP | SNOMED optimize FN | SNOMED no desc FP | SNOMED no desc FN | SNOMED all desc FP | SNOMED all desc FN | SNOMED desc x child FP | SNOMED desc x child FN | SNOMED desc x desc FP | SNOMED desc x desc FN |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| HF | 75 312 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1262 | 0 | 1054 | 0 | 1054 | 0 |
| HF2 | 75 312 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1262 | 0 | 1054 | 0 | 1054 | 0 |
| T1DM | 27 861 | 0 | 0 | 0 | 23 | 0 | 23 | 108 | 0 | 943 | 0 | 943 | 0 | 108 | 0 |
| T2DM | 125 342 | 0 | 0 | 3 | 30 | 3 | 30 | 34 | 0 | 1318 | 0 | 104 | 0 | 34 | 0 |
| Appy | 9887 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| ADHD | 14 399 | 0 | 0 | 0 | 19 | 0 | 19 | 1362 | 0 | 1362 | 0 | 1362 | 0 | 1362 | 0 |
| Catar | 50 879 | 0 | 0 | 50 | 0 | 74 | 0 | 50 | 0 | 2491 | 0 | 80 | 0 | 80 | 0 |
| Crohn | 4679 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| RA | 9655 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 25 103 | 0 | 0 | 0 | 0 | 0 |
The ICD9 set column serves as the gold standard here: it is run on unmapped source data and therefore must have perfect performance.
Table 4. Performance on ICD9-CM and ICD10-CM source data mapped to SNOMED CT (FP, false positive; FN, false negative)
Column groups: Original (ICD9 set); Knowledge engineered (SNOMED mimic, SNOMED optimize); Automated concept set creation (SNOMED no desc, SNOMED all desc, SNOMED desc x child, SNOMED desc x desc).

| Pheno | #Cases | ICD9 set FP | ICD9 set FN | SNOMED mimic FP | SNOMED mimic FN | SNOMED optimize FP | SNOMED optimize FN | SNOMED no desc FP | SNOMED no desc FN | SNOMED all desc FP | SNOMED all desc FN | SNOMED desc x child FP | SNOMED desc x child FN | SNOMED desc x desc FP | SNOMED desc x desc FN |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| HF | 75 626 | 0 | 314 | 0 | 0 | 0 | 0 | 0 | 0 | 1332 | 0 | 1116 | 0 | 1116 | 0 |
| HF2 | 75 958 | 0 | 1646 | 0 | 1332 | 0 | 0 | 0 | 1332 | 0 | 0 | 0 | 216 | 0 | 216 |
| T1DM | 27 935 | 0 | 74 | 0 | 23 | 0 | 23 | 108 | 67 | 943 | 0 | 943 | 67 | 108 | 67 |
| T2DM | 126 828 | 0 | 1486 | 3 | 1412 | 3 | 30 | 34 | 1486 | 1317 | 0 | 104 | 1382 | 34 | 1382 |
| Appy | 9920 | 0 | 33 | 0 | 0 | 0 | 0 | 0 | 8 | 0 | 0 | 0 | 8 | 0 | 8 |
| ADHD | 14 547 | 0 | 148 | 0 | 39 | 0 | 19 | 1359 | 19 | 1359 | 0 | 1359 | 19 | 1359 | 19 |
| Catar | 50 953 | 0 | 194 | 39 | 26 | 39 | 2 | 39 | 26 | 2451 | 0 | 51 | 8 | 51 | 8 |
| Crohn | 4679 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| RA | 9793 | 0 | 138 | 0 | 25 | 0 | 25 | 0 | 25 | 25 151 | 0 | 0 | 25 | 0 | 25 |