A general framework for predicting the transcriptomic consequences of non-coding variation and small molecules.

Moustafa Abdalla, Mohamed Abdalla.

Abstract

Genome-wide association studies (GWASs) for complex traits have implicated thousands of genetic loci. Most GWAS-nominated variants lie in non-coding regions, which complicates the systematic translation of these findings into functional understanding. Here, we leverage convolutional neural networks to assist in this challenge. Our computational framework, peaBrain, models the transcriptional machinery of a tissue as a two-stage process: first, predicting the mean tissue-specific abundance of all genes and second, incorporating the transcriptomic consequences of genotype variation to predict individual abundance on a subject-by-subject basis. We demonstrate that peaBrain accounts for the majority (>50%) of variance observed in mean transcript abundance across most tissues and outperforms regularized linear models in predicting the consequences of individual genotype variation. We highlight the validity of the peaBrain model by calculating non-coding impact scores that correlate with nucleotide evolutionary constraint and are also predictive of disease-associated variation and allele-specific transcription factor binding. We further show how these tissue-specific peaBrain scores can be leveraged to pinpoint functional tissues underlying complex traits, outperforming methods that depend on colocalization of eQTL and GWAS signals. We subsequently: (a) derive continuous dense embeddings of genes for downstream applications; (b) highlight the utility of the model in predicting the transcriptomic impact of small molecules and shRNA (on par with in vitro experimental replication of external test sets); (c) explore how peaBrain can be used to model difficult-to-study processes (such as neural induction); and (d) identify putatively functional eQTLs that are missed by high-throughput experimental approaches.
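
The two-stage decomposition described in the abstract can be illustrated with a toy sketch. This is not the peaBrain implementation: the convolutional networks are replaced by deliberately simple stand-ins (an empirical mean for stage one, and closed-form ridge regression, i.e. the kind of regularized linear baseline the abstract compares against, for stage two). All names (`fit_ridge`, `two_stage_predict`), the simulated genotypes, and the effect sizes are assumptions made for illustration only.

```python
import numpy as np

def fit_ridge(X, y, alpha=1.0):
    """Closed-form ridge regression: w = (X^T X + alpha * I)^-1 X^T y."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n_features), X.T @ y)

def two_stage_predict(stage1_mean, genotypes, weights, dosage_means):
    """Stage 1 supplies the tissue mean; stage 2 adds a genotype-driven offset."""
    return stage1_mean + (genotypes - dosage_means) @ weights

# Simulated data for one gene in one tissue (hypothetical values).
rng = np.random.default_rng(0)
n_subjects, n_variants = 200, 5
genotypes = rng.integers(0, 3, size=(n_subjects, n_variants)).astype(float)  # dosages 0/1/2
true_effects = np.array([0.5, 0.0, -0.3, 0.0, 0.2])
expression = 10.0 + genotypes @ true_effects + rng.normal(0.0, 0.1, n_subjects)

# Stage 1 (stand-in): mean abundance of the gene in this tissue.
# In the paper this quantity is predicted by a neural network, not measured.
stage1_mean = expression.mean()

# Stage 2 (stand-in): ridge regression of per-subject deviations on centered dosages.
dosage_means = genotypes.mean(axis=0)
weights = fit_ridge(genotypes - dosage_means, expression - stage1_mean, alpha=1.0)

# Per-subject prediction = tissue mean + genotype-dependent deviation.
predicted = two_stage_predict(stage1_mean, genotypes, weights, dosage_means)

# In-sample fraction of between-subject variance explained by stage 2.
residual_ss = np.sum((expression - predicted) ** 2)
total_ss = np.sum((expression - stage1_mean) ** 2)
r2 = 1.0 - residual_ss / total_ss
print(f"in-sample R^2 of the genotype stage: {r2:.3f}")
```

Centering the dosages keeps the two stages separable: the first stage alone determines the average prediction, and the second stage only redistributes subjects around that average.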

Year: 2022; PMID: 35421087; PMCID: PMC9041867; DOI: 10.1371/journal.pcbi.1010028

Source DB: PubMed; Journal: PLoS Comput Biol; ISSN: 1553-734X; Impact factor: 4.779


