Unsupervised feature selection under perturbations: meeting the challenges of biological data.

Roy Varshavsky, Assaf Gottlieb, David Horn, Michal Linial.

Abstract

MOTIVATION: Feature selection methods aim to reduce the complexity of data and to uncover the most relevant biological variables. In reality, information in biological datasets is often incomplete as a result of untrustworthy samples and missing values. The reliability of selection methods may therefore be questioned.
METHOD: Information loss is incorporated into a perturbation scheme that tests which features remain stable under it. The scheme is applied to data analysis by unsupervised feature filtering (UFF), a method previously shown to be successful in the analysis of gene-expression data.
RESULTS: We find that the quality of UFF degrades smoothly with information loss and remains successful even under substantial damage. Our method allows selection of the best imputation method for a dataset treated by UFF. More importantly, scoring features according to their stability under information loss is shown to correlate with biological importance in cancer studies. This scoring may lead to novel biological insights.
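
The idea of scoring features by their stability under information loss can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that UFF ranks each feature by its leave-one-out contribution to an SVD-based entropy of the data matrix, and it makes two further illustrative assumptions: information loss is simulated only by randomly discarding samples (missing values and imputation are not modelled), and the "selected" set is simply the top k features by score. The function names `svd_entropy`, `ce_scores` and `stability_scores`, and the parameters `k`, `drop_fraction` and `n_runs`, are hypothetical.

```python
import numpy as np

def svd_entropy(X):
    """Normalized entropy of the squared singular values of X.

    Close to 0 for a rank-one matrix, 1 for a perfectly flat spectrum.
    """
    s = np.linalg.svd(X, compute_uv=False)
    if s.size < 2 or not np.any(s > 0):
        return 0.0
    p = s**2 / np.sum(s**2)
    p = p[p > 0]  # drop zero terms so that log is defined
    return float(-np.sum(p * np.log(p)) / np.log(s.size))

def ce_scores(X):
    """Leave-one-out contribution of each feature (row of X) to the SVD entropy."""
    base = svd_entropy(X)
    return np.array([base - svd_entropy(np.delete(X, i, axis=0))
                     for i in range(X.shape[0])])

def stability_scores(X, k, drop_fraction=0.2, n_runs=20, seed=0):
    """Fraction of perturbed runs in which each feature is among the top k.

    Assumption: each perturbation keeps a random subset of the samples
    (columns of X); this stands in for the paper's information-loss scheme.
    """
    rng = np.random.default_rng(seed)
    n_features, n_samples = X.shape
    n_keep = min(n_samples, max(2, int(round(n_samples * (1 - drop_fraction)))))
    counts = np.zeros(n_features)
    for _ in range(n_runs):
        cols = rng.choice(n_samples, size=n_keep, replace=False)
        top = np.argsort(ce_scores(X[:, cols]))[-k:]  # indices of the k highest scores
        counts[top] += 1
    return counts / n_runs

if __name__ == "__main__":
    # Synthetic matrix: 8 features (rows) by 12 samples (columns).
    X = np.random.default_rng(1).normal(size=(8, 12))
    print(stability_scores(X, k=3))
```

A feature with a stability score near 1 is selected in almost every perturbed run, while a score near 0 marks a feature whose selection depends on which samples happen to survive. Repeating the computation over a range of `drop_fraction` values would give a rough picture of how gracefully the selection degrades.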

Year: 2007    PMID: 17989091    DOI: 10.1093/bioinformatics/btm528

Source DB: PubMed    Journal: Bioinformatics    ISSN: 1367-4803    Impact factor: 6.937


  5 in total

1.  UFFizi: a generic platform for ranking informative features.

Authors:  Assaf Gottlieb; Roy Varshavsky; Michal Linial; David Horn
Journal:  BMC Bioinformatics       Date:  2010-06-03       Impact factor: 3.169

2.  Index cohesive force analysis reveals that the US market became prone to systemic collapses since 2002.

Authors:  Dror Y Kenett; Yoash Shapira; Asaf Madi; Sharron Bransburg-Zabary; Gitit Gur-Gershgoren; Eshel Ben-Jacob
Journal:  PLoS One       Date:  2011-04-27       Impact factor: 3.240

3.  Network theory analysis of antibody-antigen reactivity data: the immune trees at birth and adulthood.

Authors:  Asaf Madi; Dror Y Kenett; Sharron Bransburg-Zabary; Yifat Merbl; Francisco J Quintana; Alfred I Tauber; Irun R Cohen; Eshel Ben-Jacob
Journal:  PLoS One       Date:  2011-03-08       Impact factor: 3.240

4.  Principal component analysis based feature extraction approach to identify circulating microRNA biomarkers.

Authors:  Y-h Taguchi; Yoshiki Murakami
Journal:  PLoS One       Date:  2013-06-24       Impact factor: 3.240

5.  Classifying Incomplete Gene-Expression Data: Ensemble Learning with Non-Pre-Imputation Feature Filtering and Best-First Search Technique.

Authors:  Yuanting Yan; Tao Dai; Meili Yang; Xiuquan Du; Yiwen Zhang; Yanping Zhang
Journal:  Int J Mol Sci       Date:  2018-10-30       Impact factor: 5.923
