
Integrated Theory- and Data-driven Feature Selection in Gene Expression Data Analysis.

Vineet K Raghu, Xiaoyu Ge, Panos K Chrysanthis, Panayiotis V Benos.

Abstract

The exponential growth of high-dimensional biological data has led to a rapid increase in demand for automated approaches to knowledge production. Existing methods follow one of two general approaches: 1) the theory-driven approach, which utilizes prior accumulated knowledge, and 2) the data-driven approach, which utilizes only the data to deduce scientific knowledge. Used alone, each approach is biased toward either past or present knowledge, because neither incorporates all of the knowledge currently available for making new discoveries. In this paper, we show how an integrated method can effectively address the high dimensionality of big biological data, which is a major problem for purely data-driven analysis. We realize our approach in a novel two-step analytical workflow: the first step applies a new feature selection paradigm to high-throughput gene expression data, and the second step uses graphical causal modeling to automatically extract causal relationships. Our results on real-world clinical datasets from The Cancer Genome Atlas (TCGA) demonstrate that our method is capable of intelligently selecting genes for learning effective causal networks.
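The abstract does not specify how prior knowledge and data are combined, so the following is only an illustrative sketch of what an integrated first step might look like. It assumes a hypothetical per-gene prior score in [0, 1] (for example, derived from a curated knowledge base) and blends it linearly with the absolute Pearson correlation between each gene and a target variable. The function names, the weight `alpha`, and the toy data are all assumptions made for illustration and are not taken from the paper.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def select_genes(expression, target, prior, alpha=0.5, k=2):
    """Rank genes by a blend of prior (theory) and correlation (data) scores.

    alpha = 1.0 uses prior knowledge only; alpha = 0.0 uses the data only.
    The linear blend is an assumption of this sketch, not the paper's method.
    """
    scores = {}
    for gene, values in expression.items():
        data_score = abs(pearson(values, target))
        theory_score = prior.get(gene, 0.0)  # genes without prior evidence score 0
        scores[gene] = alpha * theory_score + (1 - alpha) * data_score
    # Sort by descending score; break ties by gene name for determinism.
    ranked = sorted(scores, key=lambda g: (-scores[g], g))
    return ranked[:k]


# Toy data: four hypothetical genes measured in four samples.
expression = {
    "GENE_A": [1.0, 2.0, 3.0, 4.0],
    "GENE_B": [2.0, 2.0, 2.0, 2.0],
    "GENE_C": [4.0, 3.0, 2.0, 1.0],
    "GENE_D": [1.0, 3.0, 2.0, 4.0],
}
target = [0.0, 1.0, 2.0, 3.0]
prior = {"GENE_A": 0.8, "GENE_B": 0.9, "GENE_D": 0.6}

selected = select_genes(expression, target, prior, alpha=0.5, k=2)
print(selected)
```

In a workflow like the one described, the selected genes would then be passed to a graphical causal modeling step, which this sketch does not attempt to implement.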


Year: 2017. PMID: 29422764. PMCID: PMC5799807. DOI: 10.1109/ICDE.2017.223

Source DB: PubMed. Journal: Proc Int Conf Data Eng. ISSN: 1084-4627


