
Clustering of Biological Datasets in the Era of Big Data.

Richard Röttger.   

Abstract

Clustering is a long-standing problem in computer science and is applied in virtually every scientific field to explore the inherent structure of datasets. In biomedical research, clustering tools have been used in many areas, including expression analysis, disease subtyping, and protein research. A plethora of approaches have been developed, but there is little guidance on which approach is optimal in which situation. Furthermore, a typical cluster analysis is an entire process with several highly interconnected steps: preprocessing, proximity calculation, the actual clustering, evaluation, and optimization. An optimal result can be achieved only when all steps work together seamlessly. This makes a cluster analysis tiresome and error-prone, especially for non-experts. A mere trial-and-error approach becomes increasingly infeasible given the tremendous growth of available datasets; thus, a strategic and thoughtful course of action is crucial for a cluster analysis. This manuscript provides an overview of the crucial steps and the most common techniques involved in conducting a state-of-the-art cluster analysis of biomedical datasets.
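The steps named in the abstract (preprocessing, proximity calculation, clustering, evaluation, optimization) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the manuscript: z-score standardization, Euclidean distance, a naive average-linkage agglomerative clustering, the mean silhouette width as quality measure, and all function names and toy data are hypothetical choices.

```python
import math
from statistics import mean, pstdev


def standardize(rows):
    """Preprocessing (assumed choice): z-score every feature column."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # constant columns keep scale 1
    return [[(v - m) / s for v, m, s in zip(r, mus, sds)] for r in rows]


def distance_matrix(rows):
    """Proximity calculation (assumed choice): pairwise Euclidean distance."""
    return [[math.dist(a, b) for b in rows] for a in rows]


def average_linkage(dist, k):
    """Clustering: naive agglomerative merging until k clusters remain."""
    clusters = [[i] for i in range(len(dist))]

    def link(a, b):
        # Average distance between all cross-cluster pairs.
        return mean(dist[i][j] for i in a for j in b)

    while len(clusters) > k:
        pairs = ((p, q) for p in range(len(clusters))
                 for q in range(p + 1, len(clusters)))
        p, q = min(pairs, key=lambda pq: link(clusters[pq[0]], clusters[pq[1]]))
        clusters[p] += clusters.pop(q)  # q > p, so index p stays valid

    labels = [0] * len(dist)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels


def silhouette(dist, labels):
    """Evaluation: mean silhouette width; needs at least two clusters."""
    groups = {}
    for i, label in enumerate(labels):
        groups.setdefault(label, []).append(i)
    scores = []
    for i, label in enumerate(labels):
        own = [j for j in groups[label] if j != i]
        if not own:
            scores.append(0.0)  # common convention for singleton clusters
            continue
        a = mean(dist[i][j] for j in own)
        b = min(mean(dist[i][j] for j in g)
                for other, g in groups.items() if other != label)
        scores.append((b - a) / (max(a, b) or 1.0))
    return mean(scores)


def best_clustering(rows, k_values):
    """Optimization: keep the k with the highest mean silhouette width."""
    dist = distance_matrix(standardize(rows))
    candidates = [(silhouette(dist, average_linkage(dist, k)), k)
                  for k in k_values]
    score, k = max(candidates)
    return k, average_linkage(dist, k), score


# Hypothetical toy data: two well-separated groups of three samples each.
rows = [[1.0, 2.0], [1.2, 1.9], [0.9, 2.1],
        [8.0, 9.0], [8.2, 9.1], [7.9, 8.8]]
k, labels, score = best_clustering(rows, [2, 3])
```

Because each step feeds the next, changing the normalization or the distance measure can change which k scores best, which is the interdependence the abstract warns about. The naive linkage search is far too slow for large datasets and is meant only to make the individual steps visible.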


Year:  2016        PMID: 28187426     DOI: 10.2390/biecoll-jib-2016-300

Source DB:  PubMed          Journal:  J Integr Bioinform        ISSN: 1613-4516


  2 in total

1.  Guiding biomedical clustering with ClustEval.

Authors:  Christian Wiwie; Jan Baumbach; Richard Röttger
Journal:  Nat Protoc       Date:  2018-05-24       Impact factor: 13.491

2.  Development of a novel clustering tool for linear peptide sequences.

Authors:  Sandeep K Dhanda; Kerrie Vaughan; Veronique Schulten; Alba Grifoni; Daniela Weiskopf; John Sidney; Bjoern Peters; Alessandro Sette
Journal:  Immunology       Date:  2018-08-06       Impact factor: 7.397

