Clustering high throughput biological data with B-MST, a minimum spanning tree based heuristic.

Harun Pirim, Burak Ekşioğlu, Andy D Perkins.

Abstract

To address important challenges in bioinformatics, high-throughput data technologies are needed to interpret biological data efficiently and reliably. Clustering is widely used as a first step in interpreting high-dimensional biological data, such as the gene expression data measured by microarrays. A good clustering algorithm should be efficient, reliable, and effective, as demonstrated by its ability to determine biologically relevant clusters. This paper proposes a new minimum spanning tree based heuristic, B-MST, which is guided by an innovative objective function: the tightness and separation index (TSI). The TSI presented here obtains biologically meaningful clusters by making use of co-expression network topology, and this paper develops a local search procedure to minimize the TSI value. The proposed B-MST is evaluated using (1) the adjusted Rand index (ARI) for microarray data sets with known object classes, and (2) Gene Ontology (GO) annotations for data sets without documented object classes.
Copyright © 2015 Elsevier Ltd. All rights reserved.
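
The abstract does not define the TSI or the local search, so B-MST itself cannot be reproduced from this record. As a rough illustration of the two general ideas it names, the sketch below builds a minimum spanning tree, splits it by removing the heaviest edges (a classic MST clustering scheme, not the authors' heuristic), and scores the result against known classes with the adjusted Rand index. All function names, the Euclidean distance, and the toy data are assumptions made for this example.

```python
from collections import Counter
from itertools import combinations
from math import comb, dist


def _find(parent, x):
    """Union-find root lookup with path halving."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def mst_edges(points):
    """Kruskal's algorithm on the complete Euclidean graph.

    Returns the MST as a list of (weight, i, j) tuples.
    """
    parent = list(range(len(points)))
    edges = sorted(
        (dist(points[i], points[j]), i, j)
        for i, j in combinations(range(len(points)), 2)
    )
    tree = []
    for w, i, j in edges:
        ri, rj = _find(parent, i), _find(parent, j)
        if ri != rj:  # edge joins two components, so it belongs to the MST
            parent[ri] = rj
            tree.append((w, i, j))
    return tree


def mst_clusters(points, k):
    """Drop the k-1 heaviest MST edges and label the remaining components.

    This is a generic stand-in for an MST-based heuristic; it does NOT
    use the TSI objective described in the abstract.
    """
    tree = sorted(mst_edges(points))
    kept = tree[: len(tree) - (k - 1)]
    parent = list(range(len(points)))
    for _, i, j in kept:
        parent[_find(parent, i)] = _find(parent, j)
    labels = {}
    return [labels.setdefault(_find(parent, i), len(labels))
            for i in range(len(points))]


def adjusted_rand_index(truth, pred):
    """Adjusted Rand index between two labelings of the same objects.

    Assumes at least two objects.
    """
    n = len(truth)
    sum_ij = sum(comb(c, 2) for c in Counter(zip(truth, pred)).values())
    sum_a = sum(comb(c, 2) for c in Counter(truth).values())
    sum_b = sum(comb(c, 2) for c in Counter(pred).values())
    expected = sum_a * sum_b / comb(n, 2)
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:  # degenerate case, e.g. a single cluster on both sides
        return 1.0
    return (sum_ij - expected) / (max_index - expected)


# Toy data: two well-separated groups standing in for expression profiles.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
known_classes = ["a", "a", "a", "b", "b", "b"]
predicted = mst_clusters(points, k=2)
score = adjusted_rand_index(known_classes, predicted)
```

The ARI depends only on how objects are grouped, not on the label names, which is why string classes can be compared directly with integer cluster ids.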

Keywords:  Biological networks; Clustering; Gene expression data; Graph mining; Heuristics

Year:  2015        PMID: 25912991     DOI: 10.1016/j.compbiomed.2015.03.031

Source DB:  PubMed          Journal:  Comput Biol Med        ISSN: 0010-4825            Impact factor:   4.589


  1 in total

1.  Does Determination of Initial Cluster Centroids Improve the Performance of K-Means Clustering Algorithm? Comparison of Three Hybrid Methods by Genetic Algorithm, Minimum Spanning Tree, and Hierarchical Clustering in an Applied Study.

Authors:  Saeedeh Pourahmad; Atefeh Basirat; Amir Rahimi; Marziyeh Doostfatemeh
Journal:  Comput Math Methods Med       Date:  2020-08-01       Impact factor: 2.238
