
A Sparse-Modeling Based Approach for Class Specific Feature Selection.

Davide Nardone, Angelo Ciaramella, Antonino Staiano.

Abstract

In this work, we propose a novel feature selection framework, the Sparse-Modeling Based Approach for Class Specific Feature Selection (SMBA-CSFS), which combines sparse modeling with class-specific feature selection. Feature selection plays a key role in several fields (e.g., computational biology): it yields models with fewer variables, which are easier to explain, give insight into the role of each variable, and are likely to speed up experimental validation. However, as the no free lunch theorems also suggest, no approach in the literature is the most apt to detect the optimal feature subset for building a final model, so feature selection remains a challenge. The proposed procedure has two steps: (a) a sparse modeling-based learning technique is first used to find the best subset of features for each class of a training set; (b) the discovered feature subsets are then fed to a class-specific feature selection scheme to assess the effectiveness of the selected features in classification tasks. To this end, an ensemble of classifiers is built in which each classifier is trained on its own feature subset from the previous step, and a suitable decision rule is adopted to compute the ensemble response. To evaluate the proposed method, extensive experiments were performed on publicly available datasets, in particular from computational biology, where feature selection is indispensable: acute lymphoblastic leukemia and acute myeloid leukemia, human carcinomas, human lung carcinomas, diffuse large B-cell lymphoma, and malignant glioma. SMBA-CSFS is able to identify the most representative features, those that maximize classification accuracy. With the top 20 and 80 features, SMBA-CSFS shows promising performance compared with its competitors from the literature on all considered datasets, especially those with a higher number of features. The experiments show that the proposed approach may outperform state-of-the-art methods when the number of features is high. The approach is therefore a candidate for selection and classification of data with a large number of features and classes. ©2019 Nardone et al.
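The two-step structure described in the abstract can be illustrated with a small sketch. This is not the authors' implementation. The abstract does not give the sparse-modeling formulation or the decision rule, so the sketch substitutes a simple standardized class-vs-rest mean difference for the feature scoring, nearest-centroid one-vs-rest classifiers for the ensemble members, and an argmax over margins for the decision rule. All function names and the toy data are hypothetical.

```python
from statistics import mean, pstdev


def select_features_per_class(X, y, k):
    """Step (a), stand-in: keep the k features that best separate each
    class from the rest, scored by a standardized mean difference.
    (The paper uses sparse modeling here; this score is an assumption.)"""
    n_features = len(X[0])
    subsets = {}
    for c in sorted(set(y)):
        scores = []
        for j in range(n_features):
            inside = [row[j] for row, label in zip(X, y) if label == c]
            outside = [row[j] for row, label in zip(X, y) if label != c]
            spread = pstdev([row[j] for row in X]) or 1.0  # guard constant features
            scores.append(abs(mean(inside) - mean(outside)) / spread)
        ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
        subsets[c] = sorted(ranked[:k])
    return subsets


def centroid(rows, features):
    """Mean vector of the given rows, restricted to the chosen features."""
    return [mean(r[j] for r in rows) for j in features]


def sq_dist(row, center, features):
    """Squared Euclidean distance on the chosen features only."""
    return sum((row[j] - m) ** 2 for j, m in zip(features, center))


def fit_ensemble(X, y, subsets):
    """Step (b): one class-vs-rest classifier per class, each trained
    only on that class's own feature subset."""
    models = {}
    for c, feats in subsets.items():
        pos = [r for r, label in zip(X, y) if label == c]
        neg = [r for r, label in zip(X, y) if label != c]
        models[c] = (feats, centroid(pos, feats), centroid(neg, feats))
    return models


def predict(models, row):
    """Decision rule (assumed): pick the class whose classifier reports
    the largest margin between 'rest' and 'class' centroid distances."""
    def margin(c):
        feats, pos_center, neg_center = models[c]
        return sq_dist(row, neg_center, feats) - sq_dist(row, pos_center, feats)
    return max(models, key=margin)


# Toy data: features 0, 1, 2 each mark one class; feature 3 is noise.
X = [
    [5.0, 0.1, 0.0, 0.3],
    [4.8, 0.0, 0.2, 0.9],
    [0.1, 5.1, 0.1, 0.5],
    [0.0, 4.9, 0.0, 0.1],
    [0.2, 0.1, 5.0, 0.7],
    [0.1, 0.0, 5.2, 0.2],
]
y = ["a", "a", "b", "b", "c", "c"]

subsets = select_features_per_class(X, y, k=1)
models = fit_ensemble(X, y, subsets)
print(subsets)
print(predict(models, [4.5, 0.2, 0.1, 0.4]))
```

The point of the design is that each ensemble member sees a different feature subset, so a feature that is only informative for one class is not diluted by a single global ranking. The margins of different members are compared directly here, which is a simplification.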


Keywords:  Bioinformatics; Dictionary learning; Ensemble learning; Feature selection; Sparse coding

Year:  2019        PMID: 33816890      PMCID: PMC7924712          DOI: 10.7717/peerj-cs.237

Source DB:  PubMed          Journal:  PeerJ Comput Sci        ISSN: 2376-5992


