Vianney Jouhet, Fleur Mougin, Bérénice Bréchat, Frantz Thiessard.
Abstract
BACKGROUND: Identifying incident cancer cases within a population remains essential for scientific research in oncology. Data produced within electronic health records can be useful for this purpose. Because of the multiplicity of providers, heterogeneous terminologies such as ICD-10 and ICD-O-3 are used to record oncology diagnoses. Enabling disease identification based on these diagnoses requires integrating the disease classifications used in oncology. Our aim was to build a model integrating the concepts of two disease classifications, namely ICD-10 (diagnosis) and ICD-O-3 (topography and morphology), despite their structural heterogeneity. Based on the NCI Thesaurus (NCIt), a "derivative" model linking diagnoses to topography-morphology combinations was defined and built. ICD-O-3 and ICD-10 codes were then used to instantiate classes of the "derivative" model. The links between terminologies obtained through the model were then compared with the mappings provided by the Surveillance, Epidemiology, and End Results (SEER) program.
Keywords: ICD-10; ICD-O-3; NCI thesaurus; Oncology; Semantic integration; Terminology
Year: 2017 PMID: 28173841 PMCID: PMC5294908 DOI: 10.1186/s13326-017-0114-4
Source DB: PubMed Journal: J Biomed Semantics
Fig. 1 Graffoo [25] representation of the proposed model. The model is a hybrid of formal and semi-formal components. The terminologies (ICD-10 and ICD-O-3) are represented in SKOS. Above them, a formal model is represented in OWL. Every OWL class of the formal model is a subclass of skos:Concept, so that it can be instantiated by terminological artifacts
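The layering described in this caption can be illustrated with a minimal sketch in plain Python, using tuples as triples rather than an RDF library. The `ex:` prefix and the class names `ex:Diagnosis` and `ex:BreastCarcinoma` are hypothetical placeholders, not identifiers from the authors' model, and the choice of which code instantiates which class is assumed purely for illustration.

```python
# Minimal sketch of the hybrid model: OWL classes declared as subclasses of
# skos:Concept and instantiated by terminology codes.
# All ex: names are hypothetical and do not come from the authors' model.
SUBCLASS_OF = "rdfs:subClassOf"
TYPE = "rdf:type"

triples = {
    # Formal layer (OWL): a diagnosis class that is also a kind of skos:Concept.
    ("ex:Diagnosis", SUBCLASS_OF, "skos:Concept"),
    ("ex:BreastCarcinoma", SUBCLASS_OF, "ex:Diagnosis"),
    # Terminological layer (SKOS): an ICD-10 code instantiating the class.
    ("icd10:C50.9", TYPE, "ex:BreastCarcinoma"),
}


def superclasses(cls, triples):
    """Return cls together with all its ancestors through rdfs:subClassOf."""
    seen, stack = set(), [cls]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        stack.extend(o for s, p, o in triples if s == current and p == SUBCLASS_OF)
    return seen


def inferred_types(instance, triples):
    """Return every class the instance belongs to, including inherited ones."""
    direct = {o for s, p, o in triples if s == instance and p == TYPE}
    result = set()
    for cls in direct:
        result |= superclasses(cls, triples)
    return result


types = inferred_types("icd10:C50.9", triples)
is_skos_concept = "skos:Concept" in types
```

Because every class in the formal layer descends from skos:Concept, any code that instantiates such a class is also a SKOS concept, which is what lets the terminologies and the formal model coexist in one graph.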
Part of ICD-O-3 and ICD-10 terminologies integrated within the final model
| | Total number | Instantiating the final model |
|---|---|---|
| ICD-O-3 Topographies | 409 | 278 (68.0%) |
| ICD-O-3 Morphologies | 873 | 860 (98.5%) |
| ICD-10 | 727 | 302 (41.5%) |
| ICD-10 Benign | 180 | 73 (40.5%) |
| ICD-10 In situ | 66 | 22 (33.3%) |
| ICD-10 Malignant | 481 | 207 (43.0%) |
Number of codes instantiating multiple classes in the model
| | N | Instances of multiple classes |
|---|---|---|
| ICD-O-3 Topographies | 278 | 79 (28.4%) |
| ICD-O-3 Morphologies | 860 | 0 (- %) |
| ICD-10 | 302 | 153 (50.7%) |
| ICD-10 Benign | 73 | 26 (35.6%) |
| ICD-10 In situ | 22 | 22 (100%) |
| ICD-10 Malignant | 207 | 105 (50.7%) |
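As a rough illustration of how such counts could be derived, the sketch below tallies the codes that instantiate more than one class. The code and class identifiers are hypothetical placeholders, and the function is an assumed helper rather than the authors' procedure; only the final line reuses figures reported in the table.

```python
# Hypothetical code-to-classes assignments; not taken from the authors' data.
instantiations = {
    "icd10:A": {"ex:Class1", "ex:Class2"},
    "icd10:B": {"ex:Class1"},
    "icd10:C": {"ex:Class2", "ex:Class3"},
    "icd10:D": {"ex:Class3"},
}


def multi_class_share(instantiations):
    """Return (count, percentage) of codes that instantiate more than one class."""
    total = len(instantiations)
    multiple = sum(1 for classes in instantiations.values() if len(classes) > 1)
    percentage = round(100 * multiple / total, 1) if total else 0.0
    return multiple, percentage


count, pct = multi_class_share(instantiations)

# The same arithmetic applied to the reported ICD-10 row (153 of 302 codes).
reported_icd10_pct = round(100 * 153 / 302, 1)
```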
Comparison with the SEER conversion program according to the tumor type (hematopoietic and solid tumors) and the number of branches of the diagnosis lattice that are identified for an ICD-10 code / ICD-O-3 combination
| | All, n (%) | Hematopoietic tumors, n (%) | Solid tumors, n (%) |
|---|---|---|---|
| Related in the model* | 42260 (100.0) | 14213 (100.0) | 28047 (100.0) |
| More than 1 branch | 15234 (36.1) | 9910 (69.7) | 5324 (18.9) |
| Mappings rebuilt from the model** | 17766 (42.0) | 739 (5.2) | 17027 (60.7) |
| Non unique mappings | 4886 (27.5) | 333 (45.1) | 4553 (26.7) |
*Related in the model means that at least one diagnosis in the model is instantiated by both the ICD-10 code and the ICD-O-3 combination
**Mappings rebuilt from the model are the mappings that could be rebuilt automatically from the model
More than 1 branch means that more than one branch of the diagnosis lattice was instantiated by both the ICD-10 code and the ICD-O-3 combination
Non unique mappings means that the topography-morphology combination was also mapped to another ICD-10 code (inconsistent with the SEER file)
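
The definitions of "related in the model" and "non unique mappings" can be sketched as set operations. This is a minimal illustration under stated assumptions: the `ex:` diagnosis classes and the assignment of codes to them are hypothetical, and the sketch treats every related pair as a candidate mapping, which is a simplification, since the text here does not describe how mappings were actually rebuilt.

```python
from collections import defaultdict

# Hypothetical instantiations; the ex: class names are illustrative only.
icd10_diagnoses = {
    "C50.9": {"ex:BreastCarcinoma"},
    "C50.1": {"ex:BreastCarcinoma"},
    "C34.9": {"ex:LungCarcinoma"},
}
# ICD-O-3 combinations are (topography, morphology) pairs.
combo_diagnoses = {
    ("C50.9", "8500/3"): {"ex:BreastCarcinoma"},
    ("C34.9", "8070/3"): {"ex:LungCarcinoma"},
}


def is_related(code, combo):
    """True if the code and the combination share at least one diagnosis class."""
    return bool(icd10_diagnoses[code] & combo_diagnoses[combo])


def related_codes_by_combo():
    """Group, for each combination, the ICD-10 codes it is related to."""
    grouped = defaultdict(set)
    for combo in combo_diagnoses:
        for code in icd10_diagnoses:
            if is_related(code, combo):
                grouped[combo].add(code)
    return dict(grouped)


def non_unique(mappings):
    """Return the combinations associated with more than one ICD-10 code."""
    return {combo for combo, codes in mappings.items() if len(codes) > 1}


mappings = related_codes_by_combo()
ambiguous = non_unique(mappings)
```

In this toy data the breast combination is related to two ICD-10 codes, so it would count as non unique, while the lung combination has a single candidate code.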