Filter versus wrapper gene selection approaches in DNA microarray domains.

Iñaki Inza, Pedro Larrañaga, Rosa Blanco, Antonio J Cerrolaza.

Abstract

DNA microarray experiments, which generate thousands of gene expression measurements, are used to collect information from tissue and cell samples about gene expression differences that could be useful for disease diagnosis, distinguishing specific tumor types, etc. One important application of gene expression microarray data is the classification of samples into known categories. Because DNA microarray technology measures gene expression en masse, the resulting data have far more features (genes) than samples. Because the predictive accuracy of supervised classifiers that try to discriminate between the classes of the problem decays in the presence of irrelevant and redundant features, a dimensionality reduction process is essential. We propose applying a gene selection process, which also enables the biology researcher to focus on promising gene candidates that actively contribute to classification in these large-scale microarrays. Two basic approaches to feature selection appear in the machine learning and pattern recognition literature: filter and wrapper techniques. Filter procedures are used in most work on DNA microarrays. In this work, a group of different filter metrics is compared with a wrapper sequential search procedure. The comparison is performed on two well-known DNA microarray datasets using four classic supervised classifiers. The study is carried out on both the original continuous gene expression data and the same data discretized into three intervals. Two well-known filter metrics are used for continuous data, and four classic filter measures are used for discretized data. The same wrapper approach is used for both continuous and discretized data.
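The difference between the two families can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' procedure: the abstract does not name its filter metrics, classifiers, or stopping rule, so the signal-to-noise style score, the nearest-centroid classifier, the leave-one-out estimate, the "stop when accuracy no longer improves" rule, and the toy matrix `X` are all hypothetical choices made for the example.

```python
from statistics import mean, pstdev

# Toy data (hypothetical): 6 samples x 4 genes, two classes.
X = [
    [1.0, 5.0, 2.0, 3.0],
    [1.2, 2.0, 2.5, 3.0],
    [0.9, 4.0, 1.8, 3.0],
    [3.0, 4.5, 2.9, 3.0],
    [3.2, 2.5, 3.4, 3.0],
    [2.9, 5.5, 3.1, 3.0],
]
y = [0, 0, 0, 1, 1, 1]


def filter_rank(X, y):
    """Filter: score every gene on its own, without any classifier.

    Assumed score: |difference of class means| / (sum of class std devs).
    Returns gene indices ordered from most to least discriminative.
    """
    scores = []
    for g in range(len(X[0])):
        a = [row[g] for row, label in zip(X, y) if label == 0]
        b = [row[g] for row, label in zip(X, y) if label == 1]
        spread = (pstdev(a) + pstdev(b)) or 1e-12  # guard constant genes
        scores.append(abs(mean(a) - mean(b)) / spread)
    return sorted(range(len(scores)), key=lambda g: -scores[g])


def loo_accuracy(X, y, genes):
    """Leave-one-out accuracy of a nearest-centroid classifier on `genes`."""
    hits = 0
    for i in range(len(X)):
        centroids = {}
        for label in set(y):
            rows = [X[j] for j in range(len(X)) if j != i and y[j] == label]
            centroids[label] = [mean(r[g] for r in rows) for g in genes]

        def dist(label):
            return sum((X[i][g] - c) ** 2
                       for g, c in zip(genes, centroids[label]))

        hits += min(centroids, key=dist) == y[i]
    return hits / len(X)


def wrapper_forward(X, y):
    """Wrapper: sequential forward search guided by classifier accuracy.

    Each step adds the gene whose inclusion gives the highest estimated
    accuracy; the search stops when no candidate improves on the best so far.
    """
    selected, best = [], 0.0
    while True:
        candidates = [g for g in range(len(X[0])) if g not in selected]
        if not candidates:
            break
        scored = [(loo_accuracy(X, y, selected + [g]), g) for g in candidates]
        acc, gene = max(scored, key=lambda t: t[0])
        if acc <= best:
            break
        selected.append(gene)
        best = acc
    return selected


ranking = filter_rank(X, y)
subset = wrapper_forward(X, y)
```

The filter evaluates each gene once and independently, while the wrapper retrains and re-evaluates the classifier for every candidate subset it visits, which is where its extra cost comes from.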
The application of filter and wrapper gene selection procedures leads to considerably better accuracy than the no-gene-selection approach, coupled with notable dimensionality reductions. Although the wrapper approach is mostly more accurate than the filter metrics, this improvement comes at a considerable computational cost. We note that most of the genes selected by the proposed filter and wrapper procedures in discrete and continuous microarray data appear in the lists of relevant, informative genes detected by previous studies of these datasets. The aim of this work is to contribute to the field of gene selection in DNA microarray datasets. Through an extensive comparison with the more popular filter techniques, we aim to contribute to the expansion and study of the wrapper approach in this type of domain.
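A rough way to see why the wrapper's accuracy gain is expensive is to count classifier trainings. The sketch below assumes a plain forward search that adds one gene per step and scores each candidate with k-fold cross-validation; the function name and the figures passed to it (2000 genes, 5 selected, 10 folds) are illustrative assumptions, not values reported in the paper.

```python
def wrapper_trainings(n_genes, n_selected, n_folds):
    """Count classifier trainings for a forward search (assumed setup).

    At step s (0-based) there are n_genes - s candidate genes left, and
    each candidate subset is scored with n_folds trainings.
    """
    evaluations = sum(n_genes - step for step in range(n_selected))
    return evaluations * n_folds


# Hypothetical sizes, chosen only to show the order of magnitude.
cost = wrapper_trainings(n_genes=2000, n_selected=5, n_folds=10)
print(cost)  # 99900
```

A filter, by contrast, computes one score per gene and trains no classifier during selection, so its cost grows only linearly with the number of genes.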

Year:  2004        PMID: 15219288     DOI: 10.1016/j.artmed.2004.01.007

Source DB:  PubMed          Journal:  Artif Intell Med        ISSN: 0933-3657            Impact factor:   5.326


