Shiqiang Tao1, Ningzhou Zeng2, Isaac Hands3, Joseph Hurt-Mueller3, Eric B Durbin3,4, Licong Cui1, Guo-Qiang Zhang5.
Abstract
BACKGROUND: The Kentucky Cancer Registry (KCR) is a central cancer registry for the state of Kentucky that receives data about incident cancer cases from all healthcare facilities in the state within 6 months of diagnosis. Similar to all other U.S. and Canadian cancer registries, KCR uses a data dictionary provided by the North American Association of Central Cancer Registries (NAACCR) for standardized data entry. The NAACCR data dictionary is not an ontological system. Mapping between the NAACCR data dictionary and the National Cancer Institute (NCI) Thesaurus (NCIt) will facilitate the enrichment, dissemination and utilization of cancer registry data. We introduce a web-based system, called Interactive Mapping Interface (IMI), for creating mappings from data dictionaries to ontologies, in particular from NAACCR to NCIt.
Keywords: Concept mapping; Data dictionary; Ontology
Year: 2020 PMID: 33319710 PMCID: PMC7737251 DOI: 10.1186/s12911-020-01288-7
Source DB: PubMed Journal: BMC Med Inform Decis Mak ISSN: 1472-6947 Impact factor: 2.796
Fig. 1 Five steps of IMI mapping workflow
NAACCR variables that are mapped to NCIt concepts
| Variable from NAACCR | Mapped concept from NCIt |
|---|---|
| Race 1 | Race |
| Race 2 | Race |
| Race 3 | Race |
| Race 4 | Race |
| Race 5 | Race |
| Spanish/Hispanic Origin | Hispanic or Latino |
| Computed Ethnicity | Computed Ethnicity Code |
| Computed Ethnicity Source | Computed Ethnicity Source Code |
| Sex | Sex |
| Date Of Birth | Birth Date |
| Birthplace-State | Birth State Code |
| Birthplace-Country | Birth Country Code |
| Date Of Last Contact | Date of Last Contact |
| Vital Status | Vital Status |
| Addr Current-City | City |
| Addr Current-State | US State |
| Addr Current-Postal Code | Postal Code |
| Cause Of Death | Cause of Death |
| Autopsy | Autopsy Indicator |
| Patient System Id-Hosp | Patient Identifier |
| Marital Status At Dx | Marital Status Code at Diagnosis |
| Age At Diagnosis | Age at Diagnosis |
| Ruralurban Continuum 2003 | Rural-Urban Continuum Code 2003 |
| Census Tract 2010 | Census Tract |
| Ruralurban Continuum 2013 | Rural-Urban Continuum Codes 2013 |
| Date Of Diagnosis | Initial Cancer Diagnosis Date |
| Primary Site | Primary Site of Disease |
| Laterality | Laterality |
| Histologic Type Icd-O-3 | Histology Type Code ICD-O-3 |
| Diagnostic Confirmation | Diagnostic Confirmation Code |
| Type Of Reporting Source | Reporting Source Type Code |
| Class Of Case | Class of Case |
| Primary Payer At Dx | Primary Healthcare Payer |
| Regional Nodes Positive | Number of Regional Lymph Nodes Positive |
| Regional Nodes Examined | Number of Regional Lymph Nodes Examined |
| Rx Summ-Surgical Margins | Surgical Margin |
| Vendor Name | Vendor Name |
| Follow-Up Source | Last Follow-up Source Type Code |
| Place Of Death | Location of Death |
| Text-Usual Occupation | Occupation |
| Tnm Clin T | AJCC v7-Primary Tumor (T) |
| Tumor Size Summary | Tumor Size Measurement |
| Derived Ajcc-6 Stage Grp | AJCC v6 Stage |
| Multiplicity Counter | Number of Primary Tumors in this Location |
| Lymph-Vascular Invasion | Is Lymphatic Invasion Present |
| Seer Summary Stage 2000 | SEER Summary Stage 2000 |
| Registry Id | Cancer Registry Identifier |
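
The mappings in the table above are many-to-one: several NAACCR variables (for example Race 1 through Race 5) point at the same NCIt concept. A minimal sketch of how such a mapping could be held and inverted in Python follows. The dictionary holds only a few rows copied from the table, and the names `NAACCR_TO_NCIT` and `group_by_concept` are hypothetical, chosen for illustration; this is not IMI's internal representation.

```python
from collections import defaultdict

# A few rows copied from the table above (NAACCR variable -> NCIt concept).
# The dictionary name and the selection of rows are illustrative assumptions.
NAACCR_TO_NCIT = {
    "Race 1": "Race",
    "Race 2": "Race",
    "Sex": "Sex",
    "Date Of Birth": "Birth Date",
    "Spanish/Hispanic Origin": "Hispanic or Latino",
}


def group_by_concept(mapping):
    """Invert a variable -> concept mapping into concept -> list of variables."""
    grouped = defaultdict(list)
    for variable, concept in mapping.items():
        grouped[concept].append(variable)
    return dict(grouped)


by_concept = group_by_concept(NAACCR_TO_NCIT)

# Print each NCIt concept with the NAACCR variables mapped to it.
for concept, variables in by_concept.items():
    print(f"{concept}: {', '.join(variables)}")
```

Grouping by target concept makes it easy to see which NCIt concepts absorb more than one registry variable.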
Fig. 2 Mapping dashboard
Fig. 3 Access control module with users on the left and privileges on the right
Fig. 4 First branch of hierarchical structure
Summary statistics of five branches
| | B1 | B2 | B3 | B4 | B5 |
|---|---|---|---|---|---|
| Root concept | Conceptual entity | Property or attribute | Disease, disorder or finding | Diagnostic or prognostic factor | Activity |
| No. of nodes | 60 | 27 | 4 | 2 | 13 |
| Maximum levels | 7 | 5 | 3 | 1 | 8 |
Fig. 5 Mapping result review and export
Average mapping time for ten selected variables in the NAACCR data dictionary
| NAACCR data dictionary variable | IMI-based approach (s) | File-based approach (s) | Mapped NCIt concept |
|---|---|---|---|
| Date of Birth | 17.6 | 28.1 | Birth Date |
| Race 1 | 12.3 | 30.6 | Race |
| Sex | 15.1 | 36.2 | Sex |
| Race Coding Sys-Current | 30.1 | 55.3 | No mapping found |
| Race Coding Sys-Original | 33.2 | 64.1 | No mapping found |
| Spanish/Hispanic Origin | 15.4 | 37.5 | Hispanic or Latino |
| Birthplace-State | 20.6 | 43.2 | Birth State Code |
| Computed Ethnicity | 17.1 | 29.7 | Computed Ethnicity Code |
| Computed Ethnicity Source | 18.1 | 40.3 | Computed Ethnicity Source Code |
| Nhia Derived Hisp Origin | 32.5 | 55.1 | Hispanic or Latino |
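
To compare the two approaches over all ten variables, the per-variable times in the table above can be averaged. The sketch below does this arithmetic in Python; the timing values are copied from the table, while the names `TIMINGS`, `imi_mean`, `file_mean`, and `ratio` are hypothetical and used only for this illustration.

```python
from statistics import mean

# (variable, IMI-based time in s, file-based time in s), copied from the table above.
TIMINGS = [
    ("Date of Birth", 17.6, 28.1),
    ("Race 1", 12.3, 30.6),
    ("Sex", 15.1, 36.2),
    ("Race Coding Sys-Current", 30.1, 55.3),
    ("Race Coding Sys-Original", 33.2, 64.1),
    ("Spanish/Hispanic Origin", 15.4, 37.5),
    ("Birthplace-State", 20.6, 43.2),
    ("Computed Ethnicity", 17.1, 29.7),
    ("Computed Ethnicity Source", 18.1, 40.3),
    ("Nhia Derived Hisp Origin", 32.5, 55.1),
]

# Mean mapping time per approach across the ten variables.
imi_mean = mean(imi for _, imi, _ in TIMINGS)
file_mean = mean(file for _, _, file in TIMINGS)

# How many times longer the file-based approach took on average.
ratio = file_mean / imi_mean

print(f"IMI-based mean:  {imi_mean:.1f} s")
print(f"File-based mean: {file_mean:.1f} s")
print(f"File-based / IMI-based: {ratio:.2f}")
```

For the ten rows as listed, this works out to roughly 21.2 s per variable with IMI against roughly 42.0 s with the file-based approach, so the file-based approach took about twice as long on average.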