Evolutionarily informed deep learning methods for predicting relative transcript abundance from DNA sequence.

Jacob D Washburn, Maria Katherine Mejia-Guerra, Guillaume Ramstein, Karl A Kremling, Ravi Valluru, Edward S Buckler, Hai Wang.

Abstract

Deep learning methodologies have revolutionized prediction in many fields and show potential to do the same in molecular biology and genetics. However, applying these methods in their current forms ignores evolutionary dependencies within biological systems and can result in false positives and spurious conclusions. We developed two approaches that account for evolutionary relatedness in machine learning models: (i) gene-family-guided splitting and (ii) ortholog contrasts. The first approach accounts for evolution by constraining model training and testing sets to include different gene families. The second approach uses evolutionarily informed comparisons between orthologous genes to both control for and leverage evolutionary divergence during the training process. The two approaches were explored and validated within the context of mRNA expression level prediction, where they achieved area under the ROC curve (auROC) values ranging from 0.75 to 0.94. Model weight inspections showed biologically interpretable patterns, resulting in the hypothesis that the 3' UTR is more important for fine-tuning mRNA abundance levels while the 5' UTR is more important for large-scale changes.
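
The two approaches named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy gene and family identifiers, the greedy family assignment, and the binary "which ortholog is more highly expressed" label are all assumptions made for illustration. The first function keeps every gene family entirely on one side of the train/test split; the second turns ortholog pairs into pairwise comparison labels.

```python
import random
from collections import defaultdict


def family_guided_split(gene_to_family, test_fraction=0.2, seed=0):
    """Split genes so that no gene family appears in both train and test.

    gene_to_family: dict mapping gene ID -> family ID (hypothetical IDs).
    Whole families are assigned to the test set until it holds roughly
    test_fraction of all genes; the remaining families go to training.
    """
    families = defaultdict(list)
    for gene, family in gene_to_family.items():
        families[family].append(gene)

    family_ids = sorted(families)          # sort first for reproducibility
    random.Random(seed).shuffle(family_ids)

    target = test_fraction * len(gene_to_family)
    train, test = [], []
    for family in family_ids:
        if len(test) < target:
            test.extend(families[family])
        else:
            train.extend(families[family])
    return sorted(train), sorted(test)


def ortholog_contrast_labels(pairs, expression):
    """Label each ortholog pair (a, b): 1 if a is more highly expressed, else 0.

    Pairs with equal expression carry no contrast and are skipped.
    This labelling scheme is an assumption, not taken from the abstract.
    """
    return [
        (a, b, 1 if expression[a] > expression[b] else 0)
        for a, b in pairs
        if expression[a] != expression[b]
    ]


# Toy data, invented for this example.
gene_to_family = {
    "g1": "famA", "g2": "famA", "g3": "famB",
    "g4": "famC", "g5": "famC", "g6": "famD",
}
train_genes, test_genes = family_guided_split(gene_to_family, test_fraction=0.3, seed=1)

expression = {"zm1": 5.0, "sb1": 2.0, "zm2": 1.0, "sb2": 1.0, "zm3": 0.5, "sb3": 3.0}
pairs = [("zm1", "sb1"), ("zm2", "sb2"), ("zm3", "sb3")]
contrast_labels = ortholog_contrast_labels(pairs, expression)
```

Splitting by family rather than by individual gene is meant to keep closely related sequences from appearing on both sides of the split, which could otherwise inflate test performance.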

Keywords:  RNA; convolutional neural networks; machine learning; regulation

Year:  2019        PMID: 30842277      PMCID: PMC6431157          DOI: 10.1073/pnas.1814551116

Source DB:  PubMed          Journal:  Proc Natl Acad Sci U S A        ISSN: 0027-8424            Impact factor:   11.205


