Predicting mRNA Abundance Directly from Genomic Sequence Using Deep Convolutional Neural Networks.

Vikram Agarwal, Jay Shendure.

Abstract

Algorithms that accurately predict gene structure from primary sequence alone were transformative for annotating the human genome. Can we also predict the expression levels of genes based solely on genome sequence? Here, we sought to apply deep convolutional neural networks toward that goal. Surprisingly, a model that includes only promoter sequences and features associated with mRNA stability explains 59% and 71% of variation in steady-state mRNA levels in human and mouse, respectively. This model, termed Xpresso, more than doubles the accuracy of alternative sequence-based models and isolates rules as predictive as models relying on chromatin immunoprecipitation sequencing (ChIP-seq) data. Xpresso recapitulates genome-wide patterns of transcriptional activity, and its residuals can be used to quantify the influence of enhancers, heterochromatic domains, and microRNAs. Model interpretation reveals that promoter-proximal CpG dinucleotides strongly predict transcriptional activity. Looking forward, we propose cell-type-specific gene-expression predictions based solely on primary sequences as a grand challenge for the field.
Copyright © 2020 The Authors. Published by Elsevier Inc. All rights reserved.

Keywords:  deep learning; gene regulation; predicting gene expression

Year:  2020        PMID: 32433972     DOI: 10.1016/j.celrep.2020.107663

Source DB:  PubMed          Journal:  Cell Rep            Impact factor:   9.423


  31 in total

1.  Promoter CpG Density Predicts Downstream Gene Loss-of-Function Intolerance.

Authors:  Leandros Boukas; Hans T Bjornsson; Kasper D Hansen
Journal:  Am J Hum Genet       Date:  2020-08-14       Impact factor: 11.025

2.  Predicting target genes of non-coding regulatory variants with IRT.

Authors:  Zhenqin Wu; Nilah M Ioannidis; James Zou
Journal:  Bioinformatics       Date:  2020-08-15       Impact factor: 6.937

3.  Deep learning of cross-species single-cell landscapes identifies conserved regulatory programs underlying cell types.

Authors:  Jiaqi Li; Jingjing Wang; Peijing Zhang; Renying Wang; Yuqing Mei; Zhongyi Sun; Lijiang Fei; Mengmeng Jiang; Lifeng Ma; Weigao E; Haide Chen; Xinru Wang; Yuting Fu; Hanyu Wu; Daiyuan Liu; Xueyi Wang; Jingyu Li; Qile Guo; Yuan Liao; Chengxuan Yu; Danmei Jia; Jian Wu; Shibo He; Huanju Liu; Jun Ma; Kai Lei; Jiming Chen; Xiaoping Han; Guoji Guo
Journal:  Nat Genet       Date:  2022-10-13       Impact factor: 41.307

Review 4.  Decoding disease: from genomes to networks to phenotypes.

Authors:  Aaron K Wong; Rachel S G Sealfon; Chandra L Theesfeld; Olga G Troyanskaya
Journal:  Nat Rev Genet       Date:  2021-08-02       Impact factor: 53.242

Review 5.  Machine Learning in Epigenomics: Insights into Cancer Biology and Medicine.

Authors:  Emre Arslan; Jonathan Schulz; Kunal Rai
Journal:  Biochim Biophys Acta Rev Cancer       Date:  2021-07-07       Impact factor: 10.680

6.  Deciphering enhancer sequence using thermodynamics-based models and convolutional neural networks.

Authors:  Payam Dibaeinia; Saurabh Sinha
Journal:  Nucleic Acids Res       Date:  2021-10-11       Impact factor: 16.971

7.  Accurate and highly interpretable prediction of gene expression from histone modifications.

Authors:  Fabrizio Frasca; Matteo Matteucci; Michele Leone; Marco J Morelli; Marco Masseroli
Journal:  BMC Bioinformatics       Date:  2022-04-26       Impact factor: 3.307

8.  Discovery of widespread transcription initiation at microsatellites predictable by sequence-based deep neural network.

Authors:  Mathys Grapotte; Manu Saraswat; Chloé Bessière; Christophe Menichelli; Jordan A Ramilowski; Jessica Severin; Yoshihide Hayashizaki; Masayoshi Itoh; Michihira Tagami; Mitsuyoshi Murata; Miki Kojima-Ishiyama; Shohei Noma; Shuhei Noguchi; Takeya Kasukawa; Akira Hasegawa; Harukazu Suzuki; Hiromi Nishiyori-Sueki; Martin C Frith; Clément Chatelain; Piero Carninci; Michiel J L de Hoon; Wyeth W Wasserman; Laurent Bréhélin; Charles-Henri Lecellier
Journal:  Nat Commun       Date:  2021-06-02       Impact factor: 14.919

9.  Leveraging supervised learning for functionally informed fine-mapping of cis-eQTLs identifies an additional 20,913 putative causal eQTLs.

Authors:  Qingbo S Wang; David R Kelley; Jacob Ulirsch; Masahiro Kanai; Shuvom Sadhuka; Ran Cui; Carlos Albors; Nathan Cheng; Yukinori Okada; Francois Aguet; Kristin G Ardlie; Daniel G MacArthur; Hilary K Finucane
Journal:  Nat Commun       Date:  2021-06-07       Impact factor: 14.919

Review 10.  Learning the Regulatory Code of Gene Expression.

Authors:  Jan Zrimec; Filip Buric; Mariia Kokina; Victor Garcia; Aleksej Zelezniak
Journal:  Front Mol Biosci       Date:  2021-06-10