Multi-view feature selection for identifying gene markers: a diversified biological data driven approach.

Sudipta Acharya¹, Laizhong Cui², Yi Pan³.

Abstract

BACKGROUND: In recent years, drawing on multiple genomic and proteomic sources has become a popular way to tackle challenging bioinformatics problems. One such problem is feature (gene) selection: identifying relevant and non-redundant marker genes from high-dimensional gene expression data sets. In that context, an efficient feature selection algorithm that exploits knowledge from multiple biological resources may be an effective way to understand the spectrum of cancer or other diseases, with applications in specific epidemiology for a particular population.
RESULTS: In this article, we formulate feature selection and marker gene detection as a multi-view multi-objective clustering problem. To that end, we propose an Unsupervised Multi-View Multi-Objective clustering-based gene selection approach called UMVMO-select. Three important sources of biological data (gene ontology, protein interaction data, and protein sequence), along with gene expression values, are used together to design two different views. UMVMO-select aims to reduce the gene space with little or no loss of sample classification efficiency, and it determines relevant and non-redundant gene markers from three cancer gene expression benchmark data sets.
CONCLUSION: A thorough comparative analysis has been performed against five clustering methods and nine existing feature selection methods with respect to several internal and external validity metrics. The results indicate that the proposed method outperforms the compared methods. The reported results are also validated through a biological significance test and heatmap plotting.
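
To make the idea of clustering-based gene selection over two views more concrete, here is a minimal Python sketch. It is not UMVMO-select: the abstract does not specify how the two views are built or how the objectives are optimized, so this sketch assumes one expression-based distance and one precomputed knowledge-based similarity matrix, and it swaps the multi-objective search for a simple weighted sum followed by plain k-medoids. The function names (`correlation_distance`, `k_medoids`, `select_genes`) and the `weight` parameter are hypothetical and chosen only for illustration.

```python
import numpy as np

def correlation_distance(expr):
    """Expression view (assumed): 1 - |Pearson r| between gene rows.

    expr is a genes x samples matrix with no constant rows.
    """
    return 1.0 - np.abs(np.corrcoef(expr))

def k_medoids(dist, k, n_iter=50):
    """Plain k-medoids on a precomputed distance matrix.

    Returns the medoid indices and a cluster label for every item.
    """
    # Deterministic start: most central item, then farthest-point picks.
    medoids = [int(np.argmin(dist.sum(axis=1)))]
    while len(medoids) < k:
        medoids.append(int(np.argmax(dist[:, medoids].min(axis=1))))
    medoids = np.array(medoids)

    for _ in range(n_iter):
        # Assign every gene to its nearest medoid.
        labels = np.argmin(dist[:, medoids], axis=1)
        updated = medoids.copy()
        for c in range(k):
            members = np.where(labels == c)[0]
            if members.size:
                # New medoid: member with the smallest total distance
                # to the other members of its cluster.
                within = dist[np.ix_(members, members)].sum(axis=1)
                updated[c] = members[np.argmin(within)]
        if np.array_equal(updated, medoids):
            break
        medoids = updated

    labels = np.argmin(dist[:, medoids], axis=1)
    return medoids, labels

def select_genes(expr, knowledge_sim, k, weight=0.5):
    """Pick k representative genes, one medoid per gene cluster.

    knowledge_sim is an assumed genes x genes similarity in [0, 1],
    for example derived from ontology or interaction data.
    The weighted sum stands in for a true multi-objective search.
    """
    d_expr = correlation_distance(expr)
    d_know = 1.0 - knowledge_sim
    combined = weight * d_expr + (1.0 - weight) * d_know
    medoids, labels = k_medoids(combined, k)
    return np.sort(medoids), labels
```

Calling `select_genes(expr, knowledge_sim, k=3)` returns the indices of three representative genes and the cluster label of every gene. Taking one medoid per cluster is one simple way to favor non-redundant genes, since genes that are close in both views end up represented by a single marker.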


Keywords: Gene ontology (GO); Gene selection; Gene similarity measures; Multi-objective clustering; Multi-view learning; Protein–protein interaction network (PPIN); Sample classification

Year: 2020 PMID: 33375940 PMCID: PMC7772934 DOI: 10.1186/s12859-020-03810-0

Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169

23 in total

1. Diffuse large B-cell lymphoma outcome prediction by gene-expression profiling and supervised machine learning.

Authors: Margaret A Shipp; Ken N Ross; Pablo Tamayo; Andrew P Weng; Jeffery L Kutok; Ricardo C T Aguiar; Michelle Gaasenbeek; Michael Angelo; Michael Reich; Geraldine S Pinkus; Tane S Ray; Margaret A Koval; Kim W Last; Andrew Norton; T Andrew Lister; Jill Mesirov; Donna S Neuberg; Eric S Lander; Jon C Aster; Todd R Golub
Journal: Nat Med Date: 2002-01 Impact factor: 53.440

2. Nonparametric methods for identifying differentially expressed genes in microarray data.

Authors: Olga G Troyanskaya; Mitchell E Garber; Patrick O Brown; David Botstein; Russ B Altman
Journal: Bioinformatics Date: 2002-11 Impact factor: 6.937

3. Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images.

Authors: S Geman; D Geman
Journal: IEEE Trans Pattern Anal Mach Intell Date: 1984-06 Impact factor: 6.226

4. Graph-based unsupervised feature selection and multiview clustering for microarray data.

Authors: Tripti Swarnkar; Pabitra Mitra
Journal: J Biosci Date: 2015-10 Impact factor: 1.826

5. Some new indexes of cluster validity.

Authors: J C Bezdek; N R Pal
Journal: IEEE Trans Syst Man Cybern B Cybern Date: 1998

6. A cluster separation measure.

Authors: D L Davies; D W Bouldin
Journal: IEEE Trans Pattern Anal Mach Intell Date: 1979-02 Impact factor: 6.226

7. Integration of Multi-omics Data for Gene Regulatory Network Inference and Application to Breast Cancer.

Authors: Lin Yuan; Le-Hang Guo; Chang-An Yuan; You-Hua Zhang; Kyungsook Han; Asoke Nandi; Barry Honig; De-Shuang Huang
Journal: IEEE/ACM Trans Comput Biol Bioinform Date: 2018-08-23 Impact factor: 3.710

8. Multiobjective Simulated Annealing-Based Clustering of Tissue Samples for Cancer Diagnosis.

Authors: Sudipta Acharya; Sriparna Saha; Yamini Thadisina
Journal: IEEE J Biomed Health Inform Date: 2015-02-20 Impact factor: 5.772

9. A Refined 3-in-1 Fused Protein Similarity Measure: Application in Threshold-Free Hub Detection.

Authors: Sudipta Acharya; Laizhong Cui; Yi Pan
Journal: IEEE/ACM Trans Comput Biol Bioinform Date: 2022-02-03 Impact factor: 3.710

10. HitPredict version 4: comprehensive reliability scoring of physical protein-protein interactions from more than 100 species.

Authors: Yosvany López; Kenta Nakai; Ashwini Patil
Journal: Database (Oxford) Date: 2015-12-26 Impact factor: 3.451

1 in total

Review 1. Ontologies and Knowledge Graphs in Oncology Research.

Authors: Marta Contreiras Silva; Patrícia Eugénio; Daniel Faria; Catia Pesquita
Journal: Cancers (Basel) Date: 2022-04-10 Impact factor: 6.575
