
Structured Correspondence Topic Models for Mining Captioned Figures in Biological Literature.

Amr Ahmed, Eric P Xing, William W Cohen, Robert F Murphy.

Abstract

Figures are a major source of information, and often the most crucial and informative part, of scholarly articles in scientific journals, proceedings and books: they directly present images and other graphical illustrations of key experimental results and other scientific content. In biological articles, a typical figure comprises multiple panels accompanied by either scoped or global caption text. Moreover, the caption text contains important semantic entities, such as protein names, gene ontology terms and tissue labels, that are relevant to the images in the figure. With the avalanche of biological literature in recent years and the increasing popularity of bio-imaging techniques, automatic retrieval and summarization of biological information from literature figures has emerged as a major unsolved challenge in computational knowledge extraction and management in the life sciences. We present a new structured probabilistic topic model, built on a realistic figure generation scheme, to model structurally annotated biological figures, and we derive an efficient inference algorithm based on collapsed Gibbs sampling for information retrieval and visualization. The resulting program is one of the key IR engines in our SLIF system, which recently entered the final round (4 out of 70 competing systems) of the Elsevier Grand Challenge on Knowledge Enhancement in the Life Sciences. Here we present evaluations on a number of data mining tasks to illustrate our method.
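
As a rough illustration of the inference technique named in the abstract, the sketch below runs collapsed Gibbs sampling for plain latent Dirichlet allocation, in the style of Griffiths and Steyvers. It is not the structured correspondence model of the paper, which also handles panels, scoped and global captions, and image features. The function name, the hyperparameter values and the toy corpus are assumptions made for the example.

```python
import random

def lda_collapsed_gibbs(docs, num_topics, vocab_size, alpha=0.1, beta=0.01,
                        iterations=200, seed=0):
    """Collapsed Gibbs sampling for plain LDA (NOT the paper's structured model).

    docs: list of documents, each a list of integer word ids in [0, vocab_size).
    Returns (doc_topic_counts, topic_word_counts, assignments).
    """
    rng = random.Random(seed)
    doc_topic = [[0] * num_topics for _ in docs]                # n_{d,k}
    topic_word = [[0] * vocab_size for _ in range(num_topics)]  # n_{k,w}
    topic_total = [0] * num_topics                              # n_k
    assignments = []

    # Random initial topic assignment for every token.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_doc.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assignments.append(z_doc)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k_old = assignments[d][i]
                # Remove the current token from all counts.
                doc_topic[d][k_old] -= 1
                topic_word[k_old][w] -= 1
                topic_total[k_old] -= 1
                # p(z = k | all other assignments), up to a constant factor.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                k_new = rng.choices(range(num_topics), weights=weights)[0]
                # Add the token back under its newly sampled topic.
                assignments[d][i] = k_new
                doc_topic[d][k_new] += 1
                topic_word[k_new][w] += 1
                topic_total[k_new] += 1
    return doc_topic, topic_word, assignments

# Toy corpus (hypothetical): word ids 0-2 and 3-5 tend to co-occur.
docs = [[0, 1, 0, 1, 2], [3, 4, 3, 4, 5], [0, 1, 3, 4]]
doc_topic, topic_word, z = lda_collapsed_gibbs(
    docs, num_topics=2, vocab_size=6, iterations=50)
print(doc_topic)
```

The sampler is called collapsed because the topic and document distributions are integrated out, so only the count tables need to be maintained. Each token is removed from the counts, its topic is resampled from the conditional distribution, and the token is added back.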


Keywords:  Algorithms; Experimentation

Year: 2009   PMID: 25485170   PMCID: PMC4256960   DOI: 10.1145/1557019.1557031

Source DB: PubMed   Journal: KDD   ISSN: 2154-817X


  4 in total

1.  Finding scientific topics.

Authors:  Thomas L Griffiths; Mark Steyvers
Journal:  Proc Natl Acad Sci U S A       Date:  2004-02-10       Impact factor: 11.205

2.  High-recall protein entity recognition using a dictionary.

Authors:  Zhenzhen Kou; William W Cohen; Robert F Murphy
Journal:  Bioinformatics       Date:  2005-06       Impact factor: 6.937

3.  A text-mining perspective on the requirements for electronically annotated abstracts. (Review)

Authors:  Florian Leitner; Alfonso Valencia
Journal:  FEBS Lett       Date:  2008-03-06       Impact factor: 4.124

4.  Structured Correspondence Topic Models for Mining Captioned Figures in Biological Literature.

Authors:  Amr Ahmed; Eric P Xing; William W Cohen; Robert F Murphy
Journal:  KDD       Date:  2009
  6 in total

1.  Automatic figure classification in bioscience literature.

Authors:  Daehyun Kim; Balaji Polepalli Ramesh; Hong Yu
Journal:  J Biomed Inform       Date:  2011-05-27       Impact factor: 6.317

2.  Structured Literature Image Finder: Parsing Text and Figures in Biomedical Literature.

Authors:  Amr Ahmed; Andrew Arnold; Luis Pedro Coelho; Joshua Kangas; Abdul-Saboor Sheikh; Eric Xing; William Cohen; Robert F Murphy
Journal:  Web Semant       Date:  2010-07-01       Impact factor: 1.897

3.  Structured Correspondence Topic Models for Mining Captioned Figures in Biological Literature.

Authors:  Amr Ahmed; Eric P Xing; William W Cohen; Robert F Murphy
Journal:  KDD       Date:  2009

4.  Structured digital tables on the Semantic Web: toward a structured digital literature.

Authors:  Kei-Hoi Cheung; Matthias Samwald; Raymond K Auerbach; Mark B Gerstein
Journal:  Mol Syst Biol       Date:  2010-08-24       Impact factor: 11.429

5.  Figure text extraction in biomedical literature.

Authors:  Daehyun Kim; Hong Yu
Journal:  PLoS One       Date:  2011-01-13       Impact factor: 3.240

6.  DeTEXT: A Database for Evaluating Text Extraction from Biomedical Literature Figures.

Authors:  Xu-Cheng Yin; Chun Yang; Wei-Yi Pei; Haixia Man; Jun Zhang; Erik Learned-Miller; Hong Yu
Journal:  PLoS One       Date:  2015-05-07       Impact factor: 3.240

