An Ontology-Enabled Natural Language Processing Pipeline for Provenance Metadata Extraction from Biomedical Text (Short Paper).

Joshua Valdez, Michael Rueschman, Matthew Kim, Susan Redline, Satya S Sahoo.

Abstract

Extracting structured information from biomedical literature is a challenging problem due to the complexity of the biomedical domain and the lack of appropriate natural language processing (NLP) techniques. High-quality domain ontologies model both data and metadata at a fine level of granularity, and they can be used to accurately extract structured information from biomedical text. Extraction of provenance metadata, which describes the history or source of information, from published articles is an important task in supporting scientific reproducibility. Reproducibility of results reported by previous research studies is a foundational component of scientific advancement, as highlighted by the recent US National Institutes of Health initiative called "Principles of Rigor and Reproducibility". In this paper, we describe an effective approach to extracting provenance metadata from published biomedical research literature using an ontology-enabled NLP platform developed as part of the Provenance for Clinical and Healthcare Research (ProvCaRe) project. The ProvCaRe-NLP tool extends the clinical Text Analysis and Knowledge Extraction System (cTAKES) platform using both provenance and biomedical domain ontologies. We demonstrate the effectiveness of the ProvCaRe-NLP tool using a corpus of 20 peer-reviewed publications. The results of our evaluation show that the ProvCaRe-NLP tool has significantly higher recall in extracting provenance metadata than existing NLP pipelines such as MetaMap.

Keywords:  Named Entity Recognition; Ontology-based Natural Language Processing; Provenance Metadata; Scientific Reproducibility

Year:  2016        PMID: 28664200      PMCID: PMC5486409          DOI: 10.1007/978-3-319-48472-3_43

Source DB:  PubMed          Journal:  On Move Meaningful Internet Syst


  14 in total

1.  A broad-coverage natural language processing system.

Authors:  C Friedman
Journal:  Proc AMIA Symp       Date:  2000

2.  Effective mapping of biomedical text to the UMLS Metathesaurus: the MetaMap program.

Authors:  A R Aronson
Journal:  Proc AMIA Symp       Date:  2001

3.  The Unified Medical Language System (UMLS): integrating biomedical terminology.

Authors:  Olivier Bodenreider
Journal:  Nucleic Acids Res       Date:  2004-01-01       Impact factor: 16.971

4.  caTIES: a grid based system for coding and retrieval of surgical pathology reports and tissue specimens in support of translational research.

Authors:  Rebecca S Crowley; Melissa Castine; Kevin Mitchell; Girish Chavan; Tara McSherry; Michael Feldman
Journal:  J Am Med Inform Assoc       Date:  2010 May-Jun       Impact factor: 4.497

5.  An overview of MetaMap: historical perspective and recent advances.

Authors:  Alan R Aronson; François-Michel Lang
Journal:  J Am Med Inform Assoc       Date:  2010 May-Jun       Impact factor: 4.497

6.  Scientific Reproducibility in Biomedical Research: Provenance Metadata Ontology for Semantic Annotation of Study Description.

Authors:  Satya S Sahoo; Joshua Valdez; Michael Rueschman
Journal:  AMIA Annu Symp Proc       Date:  2017-02-10

7.  Scaling Up Scientific Discovery in Sleep Medicine: The National Sleep Research Resource. (Review)

Authors:  Dennis A Dean; Ary L Goldberger; Remo Mueller; Matthew Kim; Michael Rueschman; Daniel Mobley; Satya S Sahoo; Catherine P Jayapandian; Licong Cui; Michael G Morrical; Susan Surovec; Guo-Qiang Zhang; Susan Redline
Journal:  Sleep       Date:  2016-05-01       Impact factor: 5.849

8.  Identification of suspected tuberculosis patients based on natural language processing of chest radiograph reports.

Authors:  N L Jain; C A Knirsch; C Friedman; G Hripcsak
Journal:  Proc AMIA Annu Fall Symp       Date:  1996

9.  CPAP versus oxygen in obstructive sleep apnea.

Authors:  Daniel J Gottlieb; Naresh M Punjabi; Reena Mehra; Sanjay R Patel; Stuart F Quan; Denise C Babineau; Russell P Tracy; Michael Rueschman; Roger S Blumenthal; Eldrin F Lewis; Deepak L Bhatt; Susan Redline
Journal:  N Engl J Med       Date:  2014-06-12       Impact factor: 91.245

10.  Policy: NIH plans to enhance reproducibility.

Authors:  Francis S Collins; Lawrence A Tabak
Journal:  Nature       Date:  2014-01-30       Impact factor: 49.962

  2 in total

1.  ProvCaRe: Characterizing scientific reproducibility of biomedical research studies using semantic provenance metadata.

Authors:  Satya S Sahoo; Joshua Valdez; Matthew Kim; Michael Rueschman; Susan Redline
Journal:  Int J Med Inform       Date:  2018-11-03       Impact factor: 4.046

2.  Data extraction methods for systematic review (semi)automation: A living systematic review.

Authors:  Lena Schmidt; Babatunde K Olorisade; Luke A McGuinness; James Thomas; Julian P T Higgins
Journal:  F1000Res       Date:  2021-05-19