Mapping annotations with textual evidence using an scLDA model.

Bo Jin, Vicky Chen, Lujia Chen, Xinghua Lu.

Abstract

Most of the knowledge regarding genes and proteins is stored in biomedical literature as free text. Extracting information from complex biomedical texts demands techniques capable of inferring biological concepts from local text regions and mapping them to controlled vocabularies. To this end, we present a sentence-based correspondence latent Dirichlet allocation (scLDA) model which, when trained on a corpus of PubMed documents with known Gene Ontology (GO) annotations, performs the following tasks: 1) learning major biological concepts from the corpus, 2) inferring the biological concepts present within text regions (sentences), and 3) identifying the text regions in a document that provide evidence for the observed annotations. When applied to new gene-related documents, a trained scLDA model is capable of predicting GO annotations and identifying text regions as textual evidence supporting the predicted annotations. This study uses GO annotation data as a testbed; the approach can be generalized to other annotated data, such as MeSH and MEDLINE documents.
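Task 3 (finding the sentence that supports an annotation) can be illustrated with a toy calculation. The sketch below is not the authors' implementation. It assumes, in the spirit of correspondence LDA, that an annotation is generated from the topic of one sentence chosen uniformly at random, so that the posterior over sentences is proportional to the sum over topics of p(topic | sentence) times p(annotation | topic). The function name `evidence_posterior` and all numbers are hypothetical.

```python
def evidence_posterior(sentence_topics, annotation_given_topic):
    """Return p(sentence | annotation), assuming a uniform prior over sentences.

    sentence_topics: one topic distribution per sentence (each sums to 1).
    annotation_given_topic: p(annotation | topic) for a single annotation.
    """
    # Score each sentence by marginalizing over its topics.
    scores = [
        sum(p_topic * p_ann for p_topic, p_ann in zip(theta, annotation_given_topic))
        for theta in sentence_topics
    ]
    total = sum(scores)
    if total == 0:
        # No sentence explains the annotation; fall back to the uniform prior.
        return [1.0 / len(scores)] * len(scores)
    return [score / total for score in scores]


# Hypothetical toy document: 3 sentences, 2 topics.
sentence_topics = [
    [0.9, 0.1],  # sentence 0 is mostly about topic 0
    [0.2, 0.8],  # sentence 1 is mostly about topic 1
    [0.5, 0.5],  # sentence 2 is mixed
]
# Hypothetical annotation that is far more likely under topic 1.
annotation_given_topic = [0.05, 0.60]

posterior = evidence_posterior(sentence_topics, annotation_given_topic)
best_sentence = max(range(len(posterior)), key=posterior.__getitem__)
```

Here `best_sentence` is the index of the sentence most likely to serve as textual evidence for the annotation. In a trained model, the topic distributions and annotation probabilities would come from inference rather than being set by hand.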

Year: 2011 PMID: 22195141 PMCID: PMC3243146

Source DB: PubMed Journal: AMIA Annu Symp Proc ISSN: 1559-4076
