Evaluation of the vector space representation in text-based gene clustering.

P Glenisson, P Antal, J Mathys, Y Moreau, B De Moor.

Abstract

Thanks to its increasing availability, electronic literature can now be a major source of information when developing complex statistical models where data are scarce or noisy. This raises the question of how to deeply integrate information from domain literature with experimental data. Which statistical text representations can integrate literature knowledge into clustering remains an insufficiently explored topic. In this work we discuss how the bag-of-words representation can be used successfully to represent genetic annotation and free-text information coming from different databases. We demonstrate the effect of various weighting schemes and information sources in a functional clustering setup. As a quantitative evaluation, we compare, for different parameter settings, the functional groupings obtained from text with those obtained from expert assessments, and link each result to a biological discussion.
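As a rough illustration of the bag-of-words idea mentioned in the abstract, the sketch below builds weighted term vectors for a few genes and compares them. It rests on assumptions: the abstract does not say which weighting schemes were evaluated, so TF-IDF is used here only as one common choice, cosine similarity is assumed as the comparison measure, and the gene names and annotation snippets are invented for the example.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Map each document name to a sparse {term: tf * idf} vector."""
    # Bag of words: raw term counts per document, word order discarded.
    counts = {name: Counter(text.lower().split()) for name, text in docs.items()}
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for c in counts.values() for term in c)
    # Terms occurring in every document get idf = log(1) = 0.
    idf = {term: math.log(n_docs / df[term]) for term in df}
    return {
        name: {term: tf * idf[term] for term, tf in c.items()}
        for name, c in counts.items()
    }


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical annotation snippets, not taken from the paper.
annotations = {
    "geneA": "dna repair nuclear protein",
    "geneB": "dna repair damage response protein",
    "geneC": "membrane transport channel protein",
}

vectors = tfidf_vectors(annotations)
sim_ab = cosine(vectors["geneA"], vectors["geneB"])
sim_ac = cosine(vectors["geneA"], vectors["geneC"])
print(f"geneA vs geneB: {sim_ab:.3f}")
print(f"geneA vs geneC: {sim_ac:.3f}")
```

In this toy setup the term "protein" occurs in every snippet, so its weight drops to zero and it no longer makes unrelated genes look similar. A pairwise similarity matrix built this way could then be passed to a clustering method of choice.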

Year:  2003        PMID: 12603044     DOI: 10.1142/9789812776303_0037

Source DB:  PubMed          Journal:  Pac Symp Biocomput        ISSN: 2335-6928


  11 in total

1.  INCLUSive: A web portal and service registry for microarray and regulatory sequence analysis.

Authors:  Bert Coessens; Gert Thijs; Stein Aerts; Kathleen Marchal; Frank De Smet; Kristof Engelen; Patrick Glenisson; Yves Moreau; Janick Mathys; Bart De Moor
Journal:  Nucleic Acids Res       Date:  2003-07-01       Impact factor: 16.971

2.  Evaluation of semantic-based information retrieval methods in the autism phenotype domain.

Authors:  Saeed Hassanpour; Martin J O'Connor; Amar K Das
Journal:  AMIA Annu Symp Proc       Date:  2011-10-22

3.  Empirical distributional semantics: methods and biomedical applications. (Review)

Authors:  Trevor Cohen; Dominic Widdows
Journal:  J Biomed Inform       Date:  2009-02-14       Impact factor: 6.317

4.  Predicting novel human gene ontology annotations using semantic analysis.

Authors:  Bogdan Done; Purvesh Khatri; Arina Done; Sorin Drăghici
Journal:  IEEE/ACM Trans Comput Biol Bioinform       Date:  2010 Jan-Mar       Impact factor: 3.710

5.  IntelliGO: a new vector-based semantic similarity measure including annotation origin.

Authors:  Sidahmed Benabderrahmane; Malika Smail-Tabbone; Olivier Poch; Amedeo Napoli; Marie-Dominique Devignes
Journal:  BMC Bioinformatics       Date:  2010-12-01       Impact factor: 3.169

6.  Discovering semantic features in the literature: a foundation for building functional associations.

Authors:  Monica Chagoyen; Pedro Carmona-Saez; Hagit Shatkay; Jose M Carazo; Alberto Pascual-Montano
Journal:  BMC Bioinformatics       Date:  2006-01-26       Impact factor: 3.169

7.  A literature-based similarity metric for biological processes.

Authors:  Monica Chagoyen; Pedro Carmona-Saez; Concha Gil; Jose M Carazo; Alberto Pascual-Montano
Journal:  BMC Bioinformatics       Date:  2006-07-26       Impact factor: 3.169

8.  Evaluation of a gene information summarization system by users during the analysis process of microarray datasets.

Authors:  Jianji Yang; Aaron Cohen; William Hersh
Journal:  BMC Bioinformatics       Date:  2009-02-05       Impact factor: 3.169

9.  TXTGate: profiling gene groups with text-based information.

Authors:  Patrick Glenisson; Bert Coessens; Steven Van Vooren; Janick Mathys; Yves Moreau; Bart De Moor
Journal:  Genome Biol       Date:  2004-05-28       Impact factor: 13.583

10.  Content-rich biological network constructed by mining PubMed abstracts.

Authors:  Hao Chen; Burt M Sharp
Journal:  BMC Bioinformatics       Date:  2004-10-08       Impact factor: 3.169
