
Deep Mining from Omics Data.

Abeer Alzubaidi, Jonathan Tepper.

Abstract

Since the advent of high-throughput omics technologies, molecular data on genes, transcripts, proteins, and metabolites have become widely available to researchers. This has given clinicians, bioinformaticians, statisticians, and data scientists the opportunity to apply their innovations in feature mining and predictive modeling to a rich data resource and to develop a wide range of generalizable prediction models. Over the last 10 years, researchers have adopted deep neural networks (or "deep nets") as their preferred paradigm for complex data modeling, owing to their superior performance over more traditional statistical machine learning approaches such as support vector machines. A key stumbling block, however, is that deep nets inherently lack transparency and are considered a "black box" approach. This makes it very difficult for clinicians and other stakeholders to trust deep learning models, even when the model predictions appear to be highly accurate. In this chapter, we therefore provide a detailed summary of the deep net architectures typically used in omics research, together with a comprehensive summary of the notable "deep feature mining" techniques that researchers have applied to open up this black box and gain insight into the salient input features and why these models behave as they do. We group these techniques into three categories: (a) hidden layer visualization and interpretation; (b) input feature importance and impact evaluation; and (c) output layer gradient analysis. While we find that omics researchers have made considerable gains in opening up the black box by interpreting hidden layer weights and node activations to identify salient input features, we highlight other approaches, such as deconvolutional network-based methods and bespoke attribute impact measures, that would help researchers better understand the relationships between the input data and the hidden layer representations formed, and thus the output behavior of their deep nets.
© 2022. The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature.

Keywords:  Deep learning; Deep mining; Explainable AI; Interpretation; Knowledge discovery; Omics data

Year: 2022; PMID: 35507271; DOI: 10.1007/978-1-0716-2095-3_15

Source DB: PubMed; Journal: Methods Mol Biol; ISSN: 1064-3745


