Text mining biomedical literature for discovering gene-to-gene relationships: a comparative study of algorithms.

Ying Liu, Shamkant B Navathe, Jorge Civera, Venu Dasigi, Ashwin Ram, Brian J Ciliax, Ray Dingledine.

Abstract

Partitioning closely related genes into clusters has become an important element of practically all statistical analyses of microarray data. A number of computer algorithms have been developed for this task. Although these algorithms have demonstrated their usefulness for gene clustering, some basic problems remain. This paper describes our work on extracting functional keywords from MEDLINE for a set of genes that are isolated for further study from microarray experiments based on their differential expression patterns. The sharing of functional keywords among genes is used as the basis for clustering in a new approach called BEA-PARTITION. Functional keywords associated with genes were extracted from MEDLINE abstracts. We modified the Bond Energy Algorithm (BEA), which is widely accepted in psychology and database design but is virtually unknown in bioinformatics, to cluster genes by functional keyword associations. The results showed that BEA-PARTITION and the hierarchical clustering algorithm outperformed k-means clustering and the self-organizing map by correctly assigning 25 of 26 genes in a test set of four known gene groups. To evaluate the effectiveness of BEA-PARTITION for clustering genes identified by microarray profiles, 44 yeast genes that are differentially expressed during the cell cycle and have been widely studied in the literature were used as a second test set. Using established measures of cluster quality, the results produced by BEA-PARTITION had higher purity, lower entropy, and higher mutual information than those produced by k-means and the self-organizing map. Whereas BEA-PARTITION and hierarchical clustering produced clusters of similar quality, BEA-PARTITION provides clearer cluster boundaries than hierarchical clustering. BEA-PARTITION is simple to implement and provides a powerful approach to clustering genes or to any clustering problem where starting matrices are available from experimental observations.
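The abstract compares clusterings by purity and entropy. As a rough illustration only, the sketch below computes both using the common textbook definitions (majority-class fraction for purity, size-weighted per-cluster class entropy in bits); the paper's exact formulas may differ. The gene names, group labels, and function names are hypothetical and are not taken from the study.

```python
from collections import Counter
from math import log2


def purity(clusters, labels):
    """Fraction of items that fall in the majority class of their cluster.

    clusters: list of lists of item ids (hypothetical gene names here).
    labels:   dict mapping each item id to its known group.
    """
    total = sum(len(c) for c in clusters)
    majority = sum(
        Counter(labels[g] for g in c).most_common(1)[0][1]
        for c in clusters
        if c  # skip empty clusters
    )
    return majority / total


def entropy(clusters, labels):
    """Size-weighted average of the class entropy (bits) within each cluster."""
    total = sum(len(c) for c in clusters)
    result = 0.0
    for c in clusters:
        if not c:
            continue
        counts = Counter(labels[g] for g in c)
        h = -sum((n / len(c)) * log2(n / len(c)) for n in counts.values())
        result += (len(c) / total) * h
    return result


# Toy data (assumed for illustration): four genes in two known groups.
known = {"g1": "A", "g2": "A", "g3": "B", "g4": "B"}
mixed = [["g1", "g2", "g3"], ["g4"]]    # one gene misplaced
perfect = [["g1", "g2"], ["g3", "g4"]]  # matches the known groups

print(purity(mixed, known), entropy(mixed, known))
print(purity(perfect, known), entropy(perfect, known))
```

Under these definitions a clustering that matches the known groups has purity 1 and entropy 0, so higher purity and lower entropy both indicate better agreement with the reference grouping.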

Year:  2005        PMID: 17044165     DOI: 10.1109/TCBB.2005.14

Source DB:  PubMed          Journal:  IEEE/ACM Trans Comput Biol Bioinform        ISSN: 1545-5963            Impact factor:   3.710


  12 in total

1.  A document clustering and ranking system for exploring MEDLINE citations.

Authors:  Yongjing Lin; Wenyuan Li; Keke Chen; Ying Liu
Journal:  J Am Med Inform Assoc       Date:  2007-06-28       Impact factor: 4.497

2.  Semantic relations for interpreting DNA microarray data.

Authors:  Dimitar Hristovski; Andrej Kastrin; Borut Peterlin; Thomas C Rindflesch
Journal:  AMIA Annu Symp Proc       Date:  2009-11-14

Review 3.  Clinical decision support systems in myocardial perfusion imaging.

Authors:  Ernest V Garcia; J Larry Klein; Andrew T Taylor
Journal:  J Nucl Cardiol       Date:  2014-01-31       Impact factor: 5.952

4.  A combined approach to data mining of textual and structured data to identify cancer-related targets.

Authors:  Pavel Pospisil; Lakshmanan K Iyer; S James Adelstein; Amin I Kassis
Journal:  BMC Bioinformatics       Date:  2006-07-20       Impact factor: 3.169

Review 5.  Defrosting the digital library: bibliographic tools for the next generation web.

Authors:  Duncan Hull; Steve R Pettifer; Douglas B Kell
Journal:  PLoS Comput Biol       Date:  2008-10-31       Impact factor: 4.475

6.  Inference of gene pathways using mixture Bayesian networks.

Authors:  Younhee Ko; Chengxiang Zhai; Sandra Rodriguez-Zas
Journal:  BMC Syst Biol       Date:  2009-05-19

7.  A hybrid approach for biomarker discovery from microarray gene expression data for cancer classification.

Authors:  Yanxiong Peng; Wenyuan Li; Ying Liu
Journal:  Cancer Inform       Date:  2007-02-22

8.  Functional gene clustering via gene annotation sentences, MeSH and GO keywords from biomedical literature.

Authors:  Jeyakumar Natarajan; Jawahar Ganapathy
Journal:  Bioinformation       Date:  2007-12-30

9.  Evaluation of a gene information summarization system by users during the analysis process of microarray datasets.

Authors:  Jianji Yang; Aaron Cohen; William Hersh
Journal:  BMC Bioinformatics       Date:  2009-02-05       Impact factor: 3.169

10.  Inferring modules of functionally interacting proteins using the Bond Energy Algorithm.

Authors:  Ryosuke L A Watanabe; Enrique Morett; Edgar E Vallejo
Journal:  BMC Bioinformatics       Date:  2008-06-17       Impact factor: 3.169
