
CogNet: classification of gene expression data based on ranked active-subnetwork-oriented KEGG pathway enrichment analysis.

Malik Yousef, Ege Ülgen, Osman Uğur Sezerman.

Abstract

Most traditional gene selection approaches are borrowed from other fields such as statistics and computer science. However, they do not prioritize biologically relevant genes, since their ultimate goal is to determine features that optimize model performance metrics, not to build a biologically meaningful model. Therefore, there is a pressing need for new computational tools that integrate biological knowledge about the data into the process of gene selection and machine learning. Integrative gene selection enables the incorporation of biological domain knowledge from external biological resources. In this study, we propose CogNet, an integrative gene selection tool that exploits biological knowledge to group genes for the computational modeling tasks of ranking and classification. In CogNet, pathfindR serves as the biological grouping tool, allowing the main algorithm to rank active-subnetwork-oriented KEGG pathway enrichment analysis results and build a biologically relevant model. CogNet provides a list of significant KEGG pathways that can classify the data with very high accuracy. The list also provides the differentially expressed genes belonging to these pathways, which are used as features in the classification problem. The list facilitates deeper analysis and better interpretability of the role of KEGG pathways in classifying the data, thus better establishing the biological relevance of these differentially expressed genes. Although the main aim of our study is not to improve on the accuracy of any existing tool, CogNet outperforms a similar approach called maTE while performing comparably to other similar tools, including SVM-RCE. CogNet was tested on 13 gene expression datasets concerning a variety of diseases.
© 2021 Yousef et al.
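
The group-then-rank idea described in the abstract can be illustrated with a small sketch. This is not the CogNet implementation: the abstract does not specify the classifier or the scoring procedure, so the nearest-centroid classifier, the repeated stratified hold-out, and every name below (`score_group`, `rank_groups`, the toy genes and pathways) are assumptions made for illustration only. In CogNet the gene groups come from pathfindR's pathway enrichment results; here they are a hand-written dictionary.

```python
import random
from statistics import mean

def stratified_split(labels, test_fraction, rng):
    """Split sample indices so every class appears in both train and test."""
    train, test = [], []
    for label in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == label]
        rng.shuffle(idx)
        n_test = max(1, int(len(idx) * test_fraction))
        test += idx[:n_test]
        train += idx[n_test:]
    return train, test

def nearest_centroid_accuracy(X_train, y_train, X_test, y_test):
    """Accuracy of a nearest-centroid classifier (a stand-in for any classifier)."""
    centroids = {}
    for label in sorted(set(y_train)):
        rows = [x for x, y in zip(X_train, y_train) if y == label]
        centroids[label] = [mean(col) for col in zip(*rows)]
    correct = 0
    for x, y in zip(X_test, y_test):
        predicted = min(
            centroids,
            key=lambda lab: sum((a - b) ** 2 for a, b in zip(x, centroids[lab])),
        )
        correct += predicted == y
    return correct / len(y_test)

def score_group(expr, labels, genes, n_splits=10, test_fraction=0.3, seed=0):
    """Mean hold-out accuracy when only the genes of one group are used."""
    rng = random.Random(seed)
    rows = [[expr[g][i] for g in genes] for i in range(len(labels))]
    scores = []
    for _ in range(n_splits):
        train, test = stratified_split(labels, test_fraction, rng)
        scores.append(nearest_centroid_accuracy(
            [rows[i] for i in train], [labels[i] for i in train],
            [rows[i] for i in test], [labels[i] for i in test],
        ))
    return mean(scores)

def rank_groups(expr, labels, groups):
    """Return (group name, score) pairs, best-scoring group first."""
    scored = [(name, score_group(expr, labels, genes)) for name, genes in groups.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Hypothetical toy data: 8 samples, 4 per class; G1/G2 separate the classes, G3/G4 do not.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
expr = {
    "G1": [1.0, 1.1, 0.9, 1.2, 5.0, 5.2, 4.8, 5.1],
    "G2": [0.5, 0.4, 0.6, 0.5, 3.0, 3.1, 2.9, 3.2],
    "G3": [2.0, 3.0, 2.0, 3.0, 2.0, 3.0, 2.0, 3.0],
    "G4": [3.0, 2.0, 3.0, 2.0, 3.0, 2.0, 3.0, 2.0],
}
groups = {"pathway_A": ["G1", "G2"], "pathway_B": ["G3", "G4"]}  # made-up groupings

ranking = rank_groups(expr, labels, groups)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

On this toy input the group built from the two informative genes ranks first. In a real analysis, the genes of the top-ranked groups would then serve as the features of the final classifier, which is the role the abstract assigns to the ranked KEGG pathways.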

Keywords:  Bioinformatics; Classification; Data mining; Data science; Enrichment analysis; Gene expression; Genomics; KEGG pathway; Machine learning; Rank

Year:  2021        PMID: 33816987      PMCID: PMC7959595          DOI: 10.7717/peerj-cs.336

Source DB:  PubMed          Journal:  PeerJ Comput Sci        ISSN: 2376-5992


  23 in total

1.  Gene ontology: tool for the unification of biology. The Gene Ontology Consortium.

Authors:  M Ashburner; C A Ball; J A Blake; D Botstein; H Butler; J M Cherry; A P Davis; K Dolinski; S S Dwight; J T Eppig; M A Harris; D P Hill; L Issel-Tarver; A Kasarskis; S Lewis; J C Matese; J E Richardson; M Ringwald; G M Rubin; G Sherlock
Journal:  Nat Genet       Date:  2000-05       Impact factor: 38.330

Review 2.  Filter versus wrapper gene selection approaches in DNA microarray domains.

Authors:  Iñaki Inza; Pedro Larrañaga; Rosa Blanco; Antonio J Cerrolaza
Journal:  Artif Intell Med       Date:  2004-06       Impact factor: 5.326

3.  Integration of pathway knowledge into a reweighted recursive feature elimination approach for risk stratification of cancer patients.

Authors:  Marc Johannes; Jan C Brase; Holger Fröhlich; Stephan Gade; Mathias Gehrmann; Maria Fälth; Holger Sültmann; Tim Beissbarth
Journal:  Bioinformatics       Date:  2010-06-30       Impact factor: 6.937

Review 4.  Towards knowledge-based gene expression data mining.

Authors:  Riccardo Bellazzi; Blaz Zupan
Journal:  J Biomed Inform       Date:  2007-06-21       Impact factor: 6.317

5.  SoFoCles: feature filtering for microarray classification based on gene ontology.

Authors:  Georgios Papachristoudis; Sotiris Diplaris; Pericles A Mitkas
Journal:  J Biomed Inform       Date:  2009-07-01       Impact factor: 6.317

6.  Gene expression profiling predicts clinical outcome of breast cancer.

Authors:  Laura J van 't Veer; Hongyue Dai; Marc J van de Vijver; Yudong D He; Augustinus A M Hart; Mao Mao; Hans L Peterse; Karin van der Kooy; Matthew J Marton; Anke T Witteveen; George J Schreiber; Ron M Kerkhoven; Chris Roberts; Peter S Linsley; René Bernards; Stephen H Friend
Journal:  Nature       Date:  2002-01-31       Impact factor: 49.962

7.  maTE: discovering expressed interactions between microRNAs and their targets.

Authors:  Malik Yousef; Loai Abdallah; Jens Allmer
Journal:  Bioinformatics       Date:  2019-10-15       Impact factor: 6.937

8.  Integrated Theory- and Data-driven Feature Selection in Gene Expression Data Analysis.

Authors:  Vineet K Raghu; Xiaoyu Ge; Panos K Chrysanthis; Panayiotis V Benos
Journal:  Proc Int Conf Data Eng       Date:  2017-05-18

9.  RGIFE: a ranked guided iterative feature elimination heuristic for the identification of biomarkers.

Authors:  Nicola Lazzarini; Jaume Bacardit
Journal:  BMC Bioinformatics       Date:  2017-06-30       Impact factor: 3.169

10.  Knowledge Driven Variable Selection (KDVS) - a new approach to enrichment analysis of gene signatures obtained from high-throughput data.

Authors:  Grzegorz Zycinski; Annalisa Barla; Margherita Squillario; Tiziana Sanavia; Barbara Di Camillo; Alessandro Verri
Journal:  Source Code Biol Med       Date:  2013-01-09
  4 in total

1.  miRcorrNet: machine learning-based integration of miRNA and mRNA expression profiles, combined with feature grouping and ranking.

Authors:  Malik Yousef; Gokhan Goy; Ramkrishna Mitra; Christine M Eischen; Amhar Jabeer; Burcu Bakir-Gungor
Journal:  PeerJ       Date:  2021-05-19       Impact factor: 2.984

2.  Differentially Expressed Genes Reveal the Biomarkers and Molecular Mechanism of Osteonecrosis.

Authors:  Huanzhi Ma; Wei Zhang; Jun Shi
Journal:  J Healthc Eng       Date:  2022-01-07       Impact factor: 2.682

3.  miRModuleNet: Detecting miRNA-mRNA Regulatory Modules.

Authors:  Malik Yousef; Gokhan Goy; Burcu Bakir-Gungor
Journal:  Front Genet       Date:  2022-04-12       Impact factor: 4.772

4.  TextNetTopics: Text Classification Based Word Grouping as Topics and Topics' Scoring.

Authors:  Malik Yousef; Daniel Voskergian
Journal:  Front Genet       Date:  2022-06-20       Impact factor: 4.772

