From genotype to phenotype in Arabidopsis thaliana: in-silico genome interpretation predicts 288 phenotypes from sequencing data.

Daniele Raimondi, Massimiliano Corso, Piero Fariselli, Yves Moreau.

Abstract

In many cases, the unprecedented availability of data provided by high-throughput sequencing has shifted the bottleneck from data availability to data interpretation. This shift has delayed the promised breakthroughs in genetics and precision medicine in human genetics and, in plant sciences, in phenotype prediction aimed at improving plant adaptation to climate change and resistance to bioaggressors. In this paper, we propose a novel Genome Interpretation paradigm that directly models the genotype-to-phenotype relationship, and we focus on Arabidopsis thaliana because it is the best-studied model organism in plant genetics. Our model, called Galiana, is the first end-to-end Neural Network (NN) approach following the genomes in/phenotypes out paradigm; it is trained to predict 288 real-valued A. thaliana phenotypes from whole-genome sequencing data. We show that 75 of these phenotypes are predicted with a Pearson correlation ≥0.4 and that they are mostly related to flowering traits. We also show that our end-to-end NN approach achieves better performance and larger phenotype coverage than models predicting single phenotypes from known associated genes derived from GWAS. Galiana is also fully interpretable, thanks to gradient-based Saliency Maps approaches. We followed this interpretation approach to identify 36 novel genes that are likely to be associated with flowering traits, finding evidence for 6 of them in the existing literature.
© The Author(s) 2021. Published by Oxford University Press on behalf of Nucleic Acids Research.
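
The evaluation criterion quoted in the abstract (the number of phenotypes predicted with a Pearson correlation ≥0.4) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the use of `None` for unmeasured phenotype values, and the toy numbers are all assumptions made for illustration only.

```python
import math

def pearson(xs, ys):
    """Pearson correlation of two equal-length lists; NaN if undefined."""
    n = len(xs)
    if n < 2:
        return math.nan
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:  # constant column: correlation is undefined
        return math.nan
    return sxy / math.sqrt(sxx * syy)

def per_phenotype_pearson(y_true, y_pred):
    """One correlation per phenotype column.

    Rows are accessions, columns are phenotypes. A measured value of None
    (hypothetical convention) means the phenotype was not recorded for
    that accession, so the pair is skipped.
    """
    n_pheno = len(y_true[0])
    scores = []
    for j in range(n_pheno):
        pairs = [(t[j], p[j]) for t, p in zip(y_true, y_pred) if t[j] is not None]
        scores.append(pearson([a for a, _ in pairs], [b for _, b in pairs]))
    return scores

def count_well_predicted(scores, threshold=0.4):
    """Number of phenotypes whose correlation reaches the threshold."""
    return sum(1 for r in scores if not math.isnan(r) and r >= threshold)

# Toy data (invented): 4 accessions x 3 phenotypes.
y_true = [[1.0, 2.0, 5.0], [2.0, 1.0, None], [3.0, 4.0, 5.0], [4.0, 3.0, 5.0]]
y_pred = [[1.1, 3.0, 0.2], [1.9, 4.0, 0.1], [3.2, 1.0, 0.3], [3.9, 2.0, 0.4]]

scores = per_phenotype_pearson(y_true, y_pred)
n_good = count_well_predicted(scores)
print(scores, n_good)
```

Treating undefined correlations as "not well predicted" is a design choice of this sketch; the abstract does not say how such cases were handled.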

Year:  2022        PMID: 34792168      PMCID: PMC8860592          DOI: 10.1093/nar/gkab1099

Source DB:  PubMed          Journal:  Nucleic Acids Res        ISSN: 0305-1048            Impact factor:   16.971


