
Automatically classifying sentences in full-text biomedical articles into Introduction, Methods, Results and Discussion.

Shashank Agarwal, Hong Yu.

Abstract

Biomedical texts can typically be represented by four rhetorical categories: Introduction, Methods, Results and Discussion (IMRAD). Classifying sentences into these categories can benefit many other text-mining tasks. Although many studies have applied different approaches for automatically classifying sentences in MEDLINE abstracts into the IMRAD categories, few have explored the classification of sentences that appear in full-text biomedical articles. We first evaluated whether sentences in full-text biomedical articles could be reliably annotated into the IMRAD format and then explored different approaches for automatically classifying these sentences into the IMRAD categories. Our results show an overall annotation agreement of 82.14% with a Kappa score of 0.756. The best classification system is a multinomial naïve Bayes classifier trained on manually annotated data that achieved 91.95% accuracy and an average F-score of 91.55%, which is significantly higher than baseline systems. A web version of this system is available online at http://wood.ims.uwm.edu/full_text_classifier/.
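
As a rough illustration of the kind of classifier the abstract names, the following is a minimal sketch of a multinomial naïve Bayes sentence classifier with add-one smoothing. It is not the authors' system: the abstract does not describe their features or training data, so the whitespace tokenizer, the `MultinomialNB` class, the `alpha` parameter and the four toy training sentences are all assumptions made here for illustration.

```python
import math
from collections import Counter, defaultdict


def tokenize(sentence):
    # Assumption: lowercase whitespace tokens stand in for the real features,
    # which the abstract does not specify.
    return sentence.lower().split()


class MultinomialNB:
    """Hypothetical minimal multinomial naive Bayes over bag-of-words counts."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # additive (Laplace) smoothing constant

    def fit(self, sentences, labels):
        self.class_counts = Counter(labels)      # sentences per category
        self.word_counts = defaultdict(Counter)  # token counts per category
        self.vocab = set()
        for sentence, label in zip(sentences, labels):
            tokens = tokenize(sentence)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, sentence):
        total = sum(self.class_counts.values())
        vocab_size = len(self.vocab)
        best_label, best_score = None, -math.inf
        for label, count in self.class_counts.items():
            # log prior + sum of smoothed log likelihoods for known tokens
            score = math.log(count / total)
            denom = sum(self.word_counts[label].values()) + self.alpha * vocab_size
            for tok in tokenize(sentence):
                if tok in self.vocab:  # tokens unseen in training are ignored
                    score += math.log((self.word_counts[label][tok] + self.alpha) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy training data invented for this sketch, one sentence per IMRAD category.
train = [
    ("previous studies have shown that this protein is important", "Introduction"),
    ("cells were incubated at 37 degrees for two hours", "Methods"),
    ("expression levels increased significantly after treatment", "Results"),
    ("these findings suggest a possible regulatory role", "Discussion"),
]
model = MultinomialNB().fit([s for s, _ in train], [c for _, c in train])
print(model.predict("cells were incubated overnight"))
```

A real system would be trained on manually annotated sentences, as the abstract describes, and evaluated with held-out accuracy and per-category F-score rather than on a handful of examples.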


Year:  2009        PMID: 19783830      PMCID: PMC2913661          DOI: 10.1093/bioinformatics/btp548

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


