Extractive text summarization system to aid data extraction from full text in systematic review development.

Duy Duc An Bui, Guilherme Del Fiol, John F Hurdle, Siddhartha Jonnalagadda.

Abstract

OBJECTIVES: Extracting data from publication reports is a standard process in systematic review (SR) development. However, data extraction still relies heavily on manual effort, which is slow, costly, and subject to human error. In this study, we developed a text summarization system aimed at enhancing productivity and reducing errors in the traditional data extraction process.
METHODS: We developed a computer system that used machine learning and natural language processing approaches to automatically generate summaries of full-text scientific publications. The summaries, at the sentence and fragment levels, were evaluated on how well they surfaced common clinical SR data elements such as sample size, group size, and PICO values. We compared the computer-generated summaries with human-written summaries (title and abstract) in terms of the presence of the information needed for data extraction, as presented in the study characteristics tables of Cochrane reviews.
RESULTS: At the sentence level, the computer-generated summaries covered more of the information needed for systematic reviews than the human-written summaries did (recall 91.2% vs. 83.8%, p<0.001). They also had a higher density of relevant sentences (precision 59% vs. 39%, p<0.001). At the fragment level, the ensemble approach combining rule-based, concept mapping, and dictionary-based methods performed better than any individual method alone, achieving an 84.7% F-measure.
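The recall, precision, and F-measure figures above can be illustrated with a minimal Python sketch. It assumes sentences are identified by integer ids and that the F-measure is the balanced F1 form; the abstract does not state which variant was used, and the function name and toy data are hypothetical.

```python
def precision_recall_f1(selected, relevant):
    """Return (precision, recall, F1) for selected vs. relevant sentence ids."""
    selected, relevant = set(selected), set(relevant)
    true_positives = len(selected & relevant)
    # Precision: share of summary sentences that are actually relevant.
    precision = true_positives / len(selected) if selected else 0.0
    # Recall: share of relevant sentences that the summary captured.
    recall = true_positives / len(relevant) if relevant else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # Balanced F1 (assumption: the abstract only says "F-measure").
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical toy data: ids of sentences in a summary vs. a gold standard.
summary_sentences = {1, 4, 7, 9, 12}
gold_sentences = {1, 4, 9, 15}
p, r, f = precision_recall_f1(summary_sentences, gold_sentences)
```

In this toy case three of the five summary sentences are relevant (precision 0.6) and three of the four gold sentences are covered (recall 0.75).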
CONCLUSION: Computer-generated summaries are potential alternative information sources for data extraction in systematic review development. Machine learning and natural language processing are promising approaches to the development of such an extractive summarization system. Copyright © 2016 Elsevier Inc. All rights reserved.
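As a rough illustration of the fragment-level ensemble mentioned in the results, the sketch below combines a rule-based extractor and a dictionary-based extractor by taking the union of their outputs. The regular expression, the term list, the function names, and the union strategy are all assumptions made for illustration; the abstract does not describe how the methods were combined, and the concept-mapping component is omitted here because it would need an external terminology resource.

```python
import re

# Hypothetical rule: a number followed by a participant noun.
SAMPLE_SIZE_RULE = re.compile(
    r"\b(\d+)\s+(?:patients|participants|subjects)\b", re.IGNORECASE
)

# Hypothetical dictionary of intervention terms.
INTERVENTION_TERMS = {"placebo", "aspirin", "metformin"}


def rule_based(sentence):
    """Extract sample-size fragments with a regular-expression rule."""
    return {("sample_size", m.group(1)) for m in SAMPLE_SIZE_RULE.finditer(sentence)}


def dictionary_based(sentence):
    """Extract intervention fragments by dictionary lookup on lowercase tokens."""
    tokens = re.findall(r"[a-z]+", sentence.lower())
    return {("intervention", t) for t in tokens if t in INTERVENTION_TERMS}


def ensemble(sentence, extractors):
    """Union the fragments found by every extractor (assumed combination rule)."""
    found = set()
    for extract in extractors:
        found |= extract(sentence)
    return found


sentence = "A total of 120 patients were randomized to metformin or placebo."
fragments = ensemble(sentence, [rule_based, dictionary_based])
```

A union favors recall, since a fragment is kept if any single method finds it; a voting or priority scheme would trade some recall for precision.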

Keywords:  Data collection; Machine learning; Review literature as topic; Text classification; Text summarization

Year:  2016        PMID: 27989816      PMCID: PMC5362293          DOI: 10.1016/j.jbi.2016.10.014

Source DB:  PubMed          Journal:  J Biomed Inform        ISSN: 1532-0464            Impact factor:   6.317


  25 in total

1.  Aggregating UMLS semantic types for reducing conceptual complexity.

Authors:  A T McCray; A Burgun; O Bodenreider
Journal:  Stud Health Technol Inform       Date:  2001

2.  A simple algorithm for identifying abbreviation definitions in biomedical text.

Authors:  Ariel S Schwartz; Marti A Hearst
Journal:  Pac Symp Biocomput       Date:  2003

3.  A language independent acronym extraction from biomedical texts with hidden Markov models.

Authors:  Bruno Adam Osiek; Geraldo Xexéo; Luis Alfredo Vidal de Carvalho
Journal:  IEEE Trans Biomed Eng       Date:  2010-07-12       Impact factor: 4.538

4.  Data extraction errors in meta-analyses that use standardized mean differences.

Authors:  Peter C Gøtzsche; Asbjørn Hróbjartsson; Katja Maric; Britta Tendal
Journal:  JAMA       Date:  2007-07-25       Impact factor: 56.272

5.  How quickly do systematic reviews go out of date? A survival analysis.

Authors:  Kaveh G Shojania; Margaret Sampson; Mohammed T Ansari; Jun Ji; Steve Doucette; David Moher
Journal:  Ann Intern Med       Date:  2007-07-16       Impact factor: 25.391

6.  Summarising complex ICU data in natural language.

Authors:  Jim Hunter; Yvonne Freer; Albert Gatt; Robert Logie; Neil McIntosh; Marian van der Meulen; Francois Portet; Ehud Reiter; Somayajulu Sripada; Cindy Sykes
Journal:  AMIA Annu Symp Proc       Date:  2008-11-06

7.  PICO element detection in medical text without metadata: are first sentences enough?

Authors:  Ke-Chun Huang; I-Jen Chiang; Furen Xiao; Chun-Chih Liao; Charles Chih-Ho Liu; Jau-Min Wong
Journal:  J Biomed Inform       Date:  2013-07-27       Impact factor: 6.317

8.  Domain adaptation for semantic role labeling in the biomedical domain.

Authors:  Daniel Dahlmeier; Hwee Tou Ng
Journal:  Bioinformatics       Date:  2010-02-23       Impact factor: 6.937

9.  Learning regular expressions for clinical text classification.

Authors:  Duy Duc An Bui; Qing Zeng-Treitler
Journal:  J Am Med Inform Assoc       Date:  2014-02-27       Impact factor: 4.497

10.  Automatically finding relevant citations for clinical guideline development.

Authors:  Duy Duc An Bui; Siddhartha Jonnalagadda; Guilherme Del Fiol
Journal:  J Biomed Inform       Date:  2015-09-10       Impact factor: 6.317

  6 in total

1.  Data extraction methods for systematic review (semi)automation: A living systematic review.

Authors:  Lena Schmidt; Babatunde K Olorisade; Luke A McGuinness; James Thomas; Julian P T Higgins
Journal:  F1000Res       Date:  2021-05-19

2.  A systematic review of automatic text summarization for biomedical literature and EHRs.

Authors:  Mengqian Wang; Manhua Wang; Fei Yu; Yue Yang; Jennifer Walker; Javed Mostafa
Journal:  J Am Med Inform Assoc       Date:  2021-09-18       Impact factor: 7.942

3.  Improving reference prioritisation with PICO recognition.

Authors:  Austin J Brockmeier; Meizhi Ju; Piotr Przybyła; Sophia Ananiadou
Journal:  BMC Med Inform Decis Mak       Date:  2019-12-05       Impact factor: 2.796

Review 4.  Applications of natural language processing in ophthalmology: present and future.

Authors:  Jimmy S Chen; Sally L Baxter
Journal:  Front Med (Lausanne)       Date:  2022-08-08

5.  Clinical Context-Aware Biomedical Text Summarization Using Deep Neural Network: Model Development and Validation.

Authors:  Muhammad Afzal; Fakhare Alam; Khalid Mahmood Malik; Ghaus M Malik
Journal:  J Med Internet Res       Date:  2020-10-23       Impact factor: 5.428

Review 6.  A tutorial on methodological studies: the what, when, how and why.

Authors:  Lawrence Mbuagbaw; Daeria O Lawson; Livia Puljak; David B Allison; Lehana Thabane
Journal:  BMC Med Res Methodol       Date:  2020-09-07       Impact factor: 4.615
