Accounting for noise when clustering biological data.

Roman Sloutsky, Nicolas Jimenez, S Joshua Swamidass, Kristen M Naegle.

Abstract

Clustering is a powerful and commonly used technique that organizes and elucidates the structure of biological data. Clustering data from gene expression, metabolomics and proteomics experiments has proven to be useful at deriving a variety of insights, such as the shared regulation or function of biochemical components within networks. However, experimental measurements of biological processes are subject to substantial noise, stemming from both technical and biological variability, and most clustering algorithms are sensitive to this noise. In this article, we explore several methods of accounting for noise when analyzing biological data sets through clustering. Using a toy data set and two different case studies (gene expression and protein phosphorylation), we demonstrate the sensitivity of clustering algorithms to noise. Several methods of accounting for this noise can be used to establish when clustering results can be trusted. These methods span a range of assumptions about the statistical properties of the noise and can therefore be applied to virtually any biological data source.
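The idea of checking when a clustering result can be trusted can be illustrated with a small perturbation experiment. The sketch below is not the authors' implementation: it assumes additive Gaussian measurement noise, uses a plain k-means with a deterministic farthest-point start, and the names `kmeans`, `coclustering_frequency`, `noise_sd` and the toy coordinates are all hypothetical. It re-clusters noisy copies of the data and records how often each pair of points lands in the same cluster, which is one simple form of cluster ensemble.

```python
import random


def kmeans(points, k, iters=50):
    """Plain Lloyd's k-means with deterministic farthest-point initialisation."""
    def d2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    # Start from the first point, then repeatedly add the point farthest
    # from all centres chosen so far.
    centres = [points[0]]
    while len(centres) < k:
        centres.append(max(points, key=lambda p: min(d2(p, c) for c in centres)))

    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centre for every point.
        labels = [min(range(k), key=lambda j: d2(p, centres[j])) for p in points]
        # Update step: each centre becomes the mean of its members.
        new_centres = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centres.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centres.append(centres[j])  # keep an empty cluster's centre
        if new_centres == centres:
            break
        centres = new_centres
    return labels


def coclustering_frequency(data, k, noise_sd, n_runs=100, seed=0):
    """Fraction of noisy re-clusterings in which each pair shares a cluster.

    Assumption: noise is independent Gaussian with standard deviation
    `noise_sd` on every coordinate. Real data may need a different model.
    """
    rng = random.Random(seed)
    n = len(data)
    counts = [[0] * n for _ in range(n)]
    for _ in range(n_runs):
        noisy = [tuple(x + rng.gauss(0.0, noise_sd) for x in p) for p in data]
        labels = kmeans(noisy, k)
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    return [[c / n_runs for c in row] for row in counts]


# Hypothetical toy data: two tight groups plus one point roughly midway.
data = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3),
        (5.0, 5.0), (5.2, 4.9), (4.9, 5.1),
        (2.5, 2.6)]
freq = coclustering_frequency(data, k=2, noise_sd=0.3)
print([round(v, 2) for v in freq[6]])  # co-clustering row of the midway point
```

Pairs with a frequency near 1 or 0 keep their relationship under the assumed noise, while intermediate values flag assignments that should not be over-interpreted. The choice of noise model and of `noise_sd` drives the result, so in practice these would need to come from replicate measurements or another estimate of variability.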

Keywords: cluster ensemble; clustering; measurement variability; noise; random effects; unsupervised learning

Substances: Proteins

Year: 2012 PMID: 23063929 DOI: 10.1093/bib/bbs057

Source DB: PubMed Journal: Brief Bioinform ISSN: 1467-5463 Impact factor: 11.622

Cited by

7 in total

1. Assessing Dissimilarity Measures for Sample-Based Hierarchical Clustering of RNA Sequencing Data Using Plasmode Datasets.

Authors: Pablo D Reeb; Sergio J Bramardi; Juan P Steibel
Journal: PLoS One Date: 2015-07-10 Impact factor: 3.240

2. High-throughput neuroimaging-genetics computational infrastructure.

Authors: Ivo D Dinov; Petros Petrosyan; Zhizhong Liu; Paul Eggert; Sam Hobel; Paul Vespa; Seok Woo Moon; John D Van Horn; Joseph Franco; Arthur W Toga
Journal: Front Neuroinform Date: 2014-04-23 Impact factor: 4.081

3. Intricate Genetic Programs Controlling Dormancy in Mycobacterium tuberculosis.

Authors: Eliza J R Peterson; Abrar A Abidi; Mario L Arrieta-Ortiz; Boris Aguilar; James T Yurkovich; Amardeep Kaur; Min Pan; Vivek Srinivas; Ilya Shmulevich; Nitin S Baliga
Journal: Cell Rep Date: 2020-04-28 Impact factor: 9.423

4. Constrained Fourier estimation of short-term time-series gene expression data reduces noise and improves clustering and gene regulatory network predictions.

Authors: Nadav Bar; Bahareh Nikparvar; Naresh Doni Jayavelu; Fabienne Krystin Roessler
Journal: BMC Bioinformatics Date: 2022-08-09 Impact factor: 3.307

5. Overview of methods for characterization and visualization of a protein-protein interaction network in a multi-omics integration context. (Review)

Authors: Vivian Robin; Antoine Bodein; Marie-Pier Scott-Boyer; Mickaël Leclercq; Olivier Périn; Arnaud Droit
Journal: Front Mol Biosci Date: 2022-09-08

6. iPcc: a novel feature extraction method for accurate disease class discovery and prediction.

Authors: Xianwen Ren; Yong Wang; Xiang-Sun Zhang; Qi Jin
Journal: Nucleic Acids Res Date: 2013-06-12 Impact factor: 16.971

7. A review and outlook on visual analytics for uncertainties in functional magnetic resonance imaging. (Review)

Authors: Michael de Ridder; Karsten Klein; Jinman Kim
Journal: Brain Inform Date: 2018-07-03
