
fastISM: Performant in-silico saturation mutagenesis for convolutional neural networks.

Surag Nair, Avanti Shrikumar, Jacob Schreiber, Anshul Kundaje.

Abstract

MOTIVATION: Deep learning models such as convolutional neural networks are able to accurately map biological sequences to associated functional readouts and properties by learning predictive de novo representations. In-silico saturation mutagenesis (ISM) is a popular feature attribution technique for inferring contributions of all characters in an input sequence to the model's predicted output. The main drawback of ISM is its runtime, as it involves multiple forward propagations of all possible mutations of each character in the input sequence through the trained model to predict the effects on the output.
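As an illustration of why the runtime grows so quickly, the following is a minimal sketch of naive ISM in NumPy. It is not the authors' code: `one_hot`, `naive_ism` and `toy_model` are hypothetical names, and the linear `toy_model` is only a stand-in for a trained network. Each candidate substitution costs one full forward pass, so a sequence of length S over a 4-character alphabet needs 3S + 1 passes.

```python
import numpy as np

def one_hot(seq, alphabet="ACGT"):
    """Encode a string as a (length, alphabet size) one-hot array."""
    idx = np.array([alphabet.index(ch) for ch in seq])
    return np.eye(len(alphabet))[idx]

def naive_ism(model_fn, x):
    """Score every single-character substitution with one forward pass each.

    Returns an array of shape (length, alphabet size) holding
    model_fn(mutant) - model_fn(reference); reference characters score 0.
    """
    ref_out = model_fn(x)
    length, n_chars = x.shape
    scores = np.zeros((length, n_chars))
    for pos in range(length):
        for ch in range(n_chars):
            if x[pos, ch] == 1:
                continue  # this is the reference character, nothing to mutate
            mutant = x.copy()
            mutant[pos] = 0
            mutant[pos, ch] = 1
            scores[pos, ch] = model_fn(mutant) - ref_out
    return scores

# Hypothetical stand-in for a trained model: a fixed linear scorer.
rng = np.random.default_rng(0)
weights = rng.normal(size=(8, 4))
toy_model = lambda x: float((x * weights).sum())

x = one_hot("ACGTTGCA")
ism_scores = naive_ism(toy_model, x)  # shape (8, 4)
```

For a real convolutional model, `model_fn` would be replaced by a batched prediction call, but the number of forward passes stays the same.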
RESULTS: We present fastISM, an algorithm that speeds up ISM by over 10x for commonly used convolutional neural network architectures. fastISM is based on two observations: the majority of computation in ISM is spent in convolutional layers, and a single mutation only disrupts a limited region of intermediate layers, rendering most computation redundant. fastISM reduces the runtime gap between backpropagation-based feature attribution methods and ISM. On multi-output architectures it is far faster than backpropagation-based methods, making it feasible to run ISM on a large number of sequences.
AVAILABILITY: An easy-to-use Keras/TensorFlow 2 implementation of fastISM is available at https://github.com/kundajelab/fastISM. fastISM can be installed using pip install fastism. A hands-on tutorial can be found at https://colab.research.google.com/github/kundajelab/fastISM/blob/master/notebooks/colab/DeepSEA.ipynb.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
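The observation that a single mutation only disrupts a limited region of intermediate layers can be checked with a small NumPy experiment. This is a sketch under stated assumptions, not the fastISM implementation: `conv1d_same`, `forward`, the random kernels and the two-layer architecture are all hypothetical. With kernels of width 5 and "same" padding, a mutation at position 50 can only change first-layer outputs at positions 48 to 52 and second-layer outputs at positions 46 to 54; everything else is identical to the reference and need not be recomputed.

```python
import numpy as np

def conv1d_same(x, kernels):
    """1D cross-correlation with zero 'same' padding (odd kernel width).

    x: (length, in_channels); kernels: (width, in_channels, out_channels).
    """
    width = kernels.shape[0]
    pad = width // 2
    padded = np.pad(x, ((pad, pad), (0, 0)))
    return np.stack([
        np.tensordot(padded[i:i + width], kernels, axes=([0, 1], [0, 1]))
        for i in range(x.shape[0])
    ])

rng = np.random.default_rng(0)
length, width, pos = 100, 5, 50
x = np.eye(4)[rng.integers(0, 4, size=length)]  # random one-hot sequence
k1 = rng.normal(size=(width, 4, 16))            # hypothetical layer-1 kernels
k2 = rng.normal(size=(width, 16, 16))           # hypothetical layer-2 kernels

def forward(x):
    """Two stacked convolution + ReLU layers; returns both activations."""
    h1 = np.maximum(conv1d_same(x, k1), 0)
    h2 = np.maximum(conv1d_same(h1, k2), 0)
    return h1, h2

mutant = x.copy()
mutant[pos] = np.roll(x[pos], 1)  # substitute a different character at pos

ref1, ref2 = forward(x)
mut1, mut2 = forward(mutant)

# Positions whose activations differ between reference and mutant.
changed1 = np.flatnonzero(np.any(ref1 != mut1, axis=1))
changed2 = np.flatnonzero(np.any(ref2 != mut2, axis=1))
print("layer 1 changed positions:", changed1)
print("layer 2 changed positions:", changed2)
```

The affected window widens by (kernel width - 1) per convolutional layer, so for typical architectures it stays much smaller than the sequence length in the early, most expensive layers.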
© The Author(s) (2022). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.

Year:  2022        PMID: 35238376      PMCID: PMC9048647          DOI: 10.1093/bioinformatics/btac135

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937

