Transformation of Pathology Reports Into the Common Data Model With Oncology Module: Use Case for Colon Cancer.

Borim Ryu, Eunsil Yoon, Seok Kim, Sejoon Lee, Hyunyoung Baek, Soyoung Yi, Hee Young Na, Ji-Won Kim, Rong-Min Baek, Hee Hwang, Sooyoung Yoo.

Abstract

BACKGROUND: Common data models (CDMs) help standardize electronic health record data and facilitate outcome analysis for observational and longitudinal research. An analysis of pathology reports is required to establish fundamental information infrastructure for data-driven colon cancer research. The Observational Medical Outcomes Partnership (OMOP) CDM is used in distributed research networks for clinical data; however, it requires conversion of free-text pathology reports into the CDM's format. There are few use cases of representing cancer data in the CDM.
OBJECTIVE: In this study, we aimed to construct a CDM database of colon cancer-related pathology with natural language processing (NLP) for a research platform that can utilize both clinical and omics data. The essential text entities from the pathology reports are extracted, standardized, and converted to the OMOP CDM format in order to utilize the pathology data in cancer research.
METHODS: We extracted clinical text entities, mapped them to the standard concepts in the Observational Health Data Sciences and Informatics vocabularies, and built databases and defined relations for the CDM tables. Major clinical entities were extracted through NLP on pathology reports of surgical specimens, immunohistochemical studies, and molecular studies of colon cancer patients at a tertiary general hospital in South Korea. Items were extracted from each report using regular expressions in Python. Unstructured data, such as text that does not have a pattern, were handled with expert advice by adding regular expression rules. Our own dictionary was used for normalization and standardization to deal with biomarker and gene names and other ungrammatical expressions. The extracted clinical and genetic information was mapped to the Logical Observation Identifiers Names and Codes databases and the Systematized Nomenclature of Medicine (SNOMED) standard terminologies recommended by the OMOP CDM. The database-table relationships were newly defined through SNOMED standard terminology concepts. The standardized data were inserted into the CDM tables. For evaluation, 100 reports were randomly selected and independently annotated by a medical informatics expert and a nurse.
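The extraction and normalization steps described in METHODS can be illustrated with a minimal Python sketch. It is not the study's code: the report snippet, the field names, the regular expressions, and the synonym dictionary are all hypothetical, chosen only to show how rule-based extraction and dictionary-based normalization of gene names might fit together.

```python
import re

# Hypothetical report snippet; the actual report layout is not given in the abstract.
REPORT = """\
Diagnosis: Adenocarcinoma, moderately differentiated
Tumor size: 4.5 x 3.2 cm
K-ras mutation: Not detected
"""

# Illustrative rules only: field name -> pattern with one capture group.
# New rules would be added here when a report does not match existing ones.
RULES = {
    "diagnosis": re.compile(r"^Diagnosis:\s*(.+)$", re.MULTILINE),
    "tumor_size": re.compile(r"^Tumor size:\s*(.+)$", re.MULTILINE),
    "kras": re.compile(r"^K-?ras mutation:\s*(.+)$", re.MULTILINE | re.IGNORECASE),
}

# Illustrative normalization dictionary for gene-name spelling variants.
GENE_SYNONYMS = {"k-ras": "KRAS", "kras": "KRAS", "n-ras": "NRAS", "nras": "NRAS"}


def extract_entities(text):
    """Return the first match of every rule that fires on the report text."""
    found = {}
    for field, pattern in RULES.items():
        match = pattern.search(text)
        if match:
            found[field] = match.group(1).strip()
    return found


def normalize_gene(name):
    """Map a spelling variant to a canonical symbol; unknown names pass through."""
    cleaned = name.strip()
    return GENE_SYNONYMS.get(cleaned.lower(), cleaned)


if __name__ == "__main__":
    print(extract_entities(REPORT))
    print(normalize_gene("K-ras"))
```

Keeping the rules in a dictionary means a pattern added after expert review is one more entry rather than a change to the extraction logic.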
RESULTS: We examined and standardized 1848 immunohistochemical study reports, 3890 molecular study reports, and 12,352 pathology reports of surgical specimens (from 2017 to 2018). The constructed and updated database contained the following extracted colorectal entities: (1) NOTE_NLP, (2) MEASUREMENT, (3) CONDITION_OCCURRENCE, (4) SPECIMEN, and (5) FACT_RELATIONSHIP of specimen with condition and measurement.
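As a rough illustration of how a specimen might be linked to a measurement through FACT_RELATIONSHIP, the sketch below builds a pair of rows using the five columns that the OMOP CDM defines for that table. The concept IDs are placeholder integers, not verified vocabulary concepts, and storing the link in both directions is an assumed convention, not something the abstract states.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FactRelationship:
    """One row of the OMOP CDM FACT_RELATIONSHIP table."""
    domain_concept_id_1: int
    fact_id_1: int
    domain_concept_id_2: int
    fact_id_2: int
    relationship_concept_id: int


# Placeholder values for illustration only; real concept IDs must be
# looked up in the OHDSI vocabularies.
SPECIMEN_DOMAIN = 1001
MEASUREMENT_DOMAIN = 1002
MEASUREMENT_HAS_SPECIMEN = 2001
SPECIMEN_OF_MEASUREMENT = 2002


def link_specimen_measurement(specimen_id, measurement_id):
    """Return two rows describing the same link, one per direction (assumed convention)."""
    return [
        FactRelationship(MEASUREMENT_DOMAIN, measurement_id,
                         SPECIMEN_DOMAIN, specimen_id,
                         MEASUREMENT_HAS_SPECIMEN),
        FactRelationship(SPECIMEN_DOMAIN, specimen_id,
                         MEASUREMENT_DOMAIN, measurement_id,
                         SPECIMEN_OF_MEASUREMENT),
    ]


if __name__ == "__main__":
    for row in link_specimen_measurement(specimen_id=10, measurement_id=20):
        print(row)
```

The same helper shape would apply to specimen-condition links, with the condition domain and relationship concepts substituted.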
CONCLUSIONS: This study aimed to prepare CDM data for a research platform that can take advantage of all omics, clinical, and patient data at Seoul National University Bundang Hospital for colon cancer pathology. A more sophisticated preparation of the pathology data is needed for further research on cancer genomics, and various types of text narratives are the next target for additional research on the use of data in the CDM.

©Borim Ryu, Eunsil Yoon, Seok Kim, Sejoon Lee, Hyunyoung Baek, Soyoung Yi, Hee Young Na, Ji-Won Kim, Rong-Min Baek, Hee Hwang, Sooyoung Yoo. Originally published in the Journal of Medical Internet Research (http://www.jmir.org), 09.12.2020.

Keywords:  clinical data; colon cancer; common data model; electronic health record; natural language processing; oncology; oncology module; pathology

Year:  2020        PMID: 33295294      PMCID: PMC7758167          DOI: 10.2196/18526

Source DB:  PubMed          Journal:  J Med Internet Res        ISSN: 1438-8871            Impact factor:   5.428

