
Pathology report data extraction from relational database using R, with extraction from reports on melanoma of skin as an example.

Jay J Ye

Abstract

BACKGROUND: Different methods have been described for data extraction from pathology reports with varying degrees of success. Here a technique for directly extracting data from a relational database is described.
METHODS: Our department uses synoptic reports modified from the College of American Pathologists (CAP) Cancer Protocol Templates to report most of our cancer diagnoses. Choosing the melanoma of skin synoptic report as an example, the R scripting language extended with the RODBC package was used to query the pathology information system database. Reports from the past four and a half years containing the melanoma of skin synoptic report were retrieved, and individual data elements were extracted. Using the retrieved list of cases, the database was queried a second time to retrieve and extract the lymph node staging information from subsequent reports on the same patients.
RESULTS: A total of 426 synoptic reports corresponding to unique lesions of melanoma of skin were retrieved, and data elements of interest were extracted into an R data frame. The distribution of Breslow depth of melanomas grouped by year is used as an example of intra-report data extraction and analysis. When new pN staging information was present in the subsequent reports, 82% (77/94) was precisely retrieved (pN0, pN1, pN2, and pN3). An additional 15% (14/94) was retrieved with some ambiguity (retrieved as positive, or only as the existence of an update). The specificity was 100% for both. The relationship between Breslow depth and lymph node status was graphed as an example of lesion-specific multi-report data extraction and analysis.
CONCLUSIONS: R extended with the RODBC package is a simple and versatile approach well suited to the above tasks. The success or failure of the retrieval and extraction depended largely on whether the reports were formatted and whether the contents of the elements were consistently phrased. This approach can be easily modified and adopted for other pathology information systems that use a relational database for data management.
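
As an illustration of the kind of element extraction the abstract describes, the following minimal sketch parses a synoptic-style report into named fields and classifies pN staging text as precise, ambiguous, or absent. It is written in Python for illustration only; the work itself used R with the RODBC package against the pathology information system database. The report layout, field labels, function names, and regular expressions are hypothetical assumptions, not the author's code.

```python
import re

# Hypothetical synoptic report text; real CAP-derived templates differ in layout.
SAMPLE_REPORT = """\
SYNOPTIC REPORT: MELANOMA OF SKIN
Procedure: Excision
Breslow Depth: 1.2 mm
Ulceration: Not identified
Regional Lymph Nodes: pN1a
"""

# One "Label: value" pair per line (an assumption about the report format).
FIELD_PATTERN = re.compile(
    r"^[ \t]*([^:\n]+?)[ \t]*:[ \t]*(.+?)[ \t]*$", re.MULTILINE
)


def extract_elements(report_text):
    """Return a dict mapping each field label to its value."""
    return {label: value for label, value in FIELD_PATTERN.findall(report_text)}


def breslow_depth_mm(elements):
    """Parse the Breslow depth in millimetres, or return None if absent."""
    match = re.search(
        r"(\d+(?:\.\d+)?)\s*mm", elements.get("Breslow Depth", ""), re.IGNORECASE
    )
    return float(match.group(1)) if match else None


def classify_pn(text):
    """Return 'pN0'..'pN3' when stated, an ambiguous label, or None."""
    match = re.search(r"\bpN([0-3])", text)
    if match:
        return "pN" + match.group(1)  # precise category, substage dropped
    if re.search(r"\bpositive\b", text, re.IGNORECASE):
        return "positive (ambiguous)"  # nodal involvement without a category
    return None


if __name__ == "__main__":
    elements = extract_elements(SAMPLE_REPORT)
    print(breslow_depth_mm(elements))
    print(classify_pn(elements["Regional Lymph Nodes"]))
```

A sketch like this only works when labels and values are phrased consistently, which matches the abstract's conclusion that success depended largely on report formatting and consistent phrasing.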


Keywords:  Pathology report data extraction; R; SQL database

Year:  2016        PMID: 28066684      PMCID: PMC5100200          DOI: 10.4103/2153-3539.192822

Source DB:  PubMed          Journal:  J Pathol Inform


  7 in total

1.  The 2009 version of the cancer protocols of the College of American Pathologists.

Authors:  Mahul B Amin
Journal:  Arch Pathol Lab Med       Date:  2010-03       Impact factor: 5.534

2.  Standardized synoptic cancer pathology reports - so what and who cares? A population-based satisfaction survey of 970 pathologists, surgeons, and oncologists.

Authors:  Sara Lankshear; John Srigley; Thomas McGowan; Marta Yurcan; Carol Sawka
Journal:  Arch Pathol Lab Med       Date:  2013-02-21       Impact factor: 5.534

3.  What impact has the introduction of a synoptic report for rectal cancer had on reporting outcomes for specialist gastrointestinal and nongastrointestinal pathologists?

Authors:  David E Messenger; Robin S McLeod; Richard Kirsch
Journal:  Arch Pathol Lab Med       Date:  2011-11       Impact factor: 5.534

4.  The feasibility of using natural language processing to extract clinical information from breast pathology reports.

Authors:  Julliette M Buckley; Suzanne B Coopey; John Sharko; Fernanda Polubriaginof; Brian Drohan; Ahmet K Belli; Elizabeth M H Kim; Judy E Garber; Barbara L Smith; Michele A Gadd; Michelle C Specht; Constance A Roche; Thomas M Gudewicz; Kevin S Hughes
Journal:  J Pathol Inform       Date:  2012-06-30

5.  Web-based synoptic reporting for cancer checklists.

Authors:  Brett W Baskovich; Robert W Allan
Journal:  J Pathol Inform       Date:  2011-03-15

6.  Validation of natural language processing to extract breast cancer pathology procedures and results.

Authors:  Arika E Wieneke; Erin J A Bowles; David Cronkite; Karen J Wernli; Hongyuan Gao; David Carrell; Diana S M Buist
Journal:  J Pathol Inform       Date:  2015-06-23

7.  Extraction and analysis of discrete synoptic pathology report data using R.

Authors:  Alexander Boag
Journal:  J Pathol Inform       Date:  2015-11-27
  4 in total

1.  Population-Based Analysis of Histologically Confirmed Melanocytic Proliferations Using Natural Language Processing.

Authors:  Jason P Lott; Denise M Boudreau; Ray L Barnhill; Martin A Weinstock; Eleanor Knopp; Michael W Piepkorn; David E Elder; Steven R Knezevich; Andrew Baer; Anna N A Tosteson; Joann G Elmore
Journal:  JAMA Dermatol       Date:  2018-01-01       Impact factor: 10.282

2.  Computational Algorithms that Effectively Reduce Report Defects in Surgical Pathology.

Authors:  Jay J Ye; Michael R Tan
Journal:  J Pathol Inform       Date:  2019-07-01

3.  Performance of a Web-based Method for Generating Synoptic Reports.

Authors:  Megan A Renshaw; Scott A Renshaw; Mercy Mena-Allauca; Patricia P Carrion; Xiaorong Mei; Arniris Narciandi; Edwin W Gould; Andrew A Renshaw
Journal:  J Pathol Inform       Date:  2017-03-10

4.  Construction and Utilization of a Neural Network Model to Predict Current Procedural Terminology Codes from Pathology Report Texts.

Authors:  Jay J Ye
Journal:  J Pathol Inform       Date:  2019-04-03
