Navigating the pitfalls of applying machine learning in genomics.

Sean Whalen¹, Jacob Schreiber², William S Noble³, Katherine S Pollard⁴,⁵,⁶.

Abstract

The scale of genetic, epigenomic, transcriptomic, cheminformatic and proteomic data available today, coupled with easy-to-use machine learning (ML) toolkits, has propelled the application of supervised learning in genomics research. However, the assumptions behind the statistical models and performance evaluations in ML software frequently are not met in biological systems. In this Review, we illustrate the impact of several common pitfalls encountered when applying supervised ML in genomics. We explore how the structure of genomics data can bias performance evaluations and predictions. To address the challenges associated with applying cutting-edge ML methods to genomics, we describe solutions and appropriate use cases where ML modelling shows great potential.
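The abstract does not list the individual pitfalls, so the following is only an illustrative sketch of one way data structure can bias a performance evaluation: if related examples (here assumed to be genomic regions from the same chromosome) land on both sides of a train/test split, the test score may look better than it should. The function name `group_k_fold`, the chromosome labels and the greedy fold balancing are all hypothetical choices made for this example, not the authors' method.

```python
from collections import defaultdict


def group_k_fold(groups, n_folds):
    """Yield (train_idx, test_idx) pairs in which no group spans both sides."""
    # Collect the example indices that belong to each group.
    members = defaultdict(list)
    for idx, g in enumerate(groups):
        members[g].append(idx)
    if n_folds < 2 or n_folds > len(members):
        raise ValueError("n_folds must be between 2 and the number of groups")

    # Greedy balancing: place the largest groups first, each into the
    # fold that currently holds the fewest examples.
    folds = [[] for _ in range(n_folds)]
    for g in sorted(members, key=lambda g: (-len(members[g]), str(g))):
        smallest = min(range(n_folds), key=lambda f: len(folds[f]))
        folds[smallest].extend(members[g])

    # Each fold serves once as the held-out test set.
    for f in range(n_folds):
        test = sorted(folds[f])
        held_out = set(test)
        train = [i for i in range(len(groups)) if i not in held_out]
        yield train, test


# Hypothetical data: one chromosome label per genomic region.
chromosomes = ["chr1", "chr1", "chr1", "chr2", "chr2", "chr3", "chr3", "chr4"]
splits = list(group_k_fold(chromosomes, n_folds=2))
for train, test in splits:
    test_chroms = sorted({chromosomes[i] for i in test})
    print(test_chroms, "held out;", len(train), "training examples")
```

Because whole chromosomes are held out together, a model evaluated on these splits never sees a test chromosome during training. Libraries such as scikit-learn offer a comparable group-aware splitter (`GroupKFold`), which would usually be preferable to a hand-rolled version in practice.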

Year: 2021 PMID: 34837041 DOI: 10.1038/s41576-021-00434-9

Source DB: PubMed Journal: Nat Rev Genet ISSN: 1471-0056 Impact factor: 53.242

78 in total

1. Avoiding common pitfalls in machine learning omic data science.

Authors: Andrew E Teschendorff
Journal: Nat Mater Date: 2019-05 Impact factor: 43.841

2. The nature of confounding in genome-wide association studies.

Authors: Bjarni J Vilhjálmsson; Magnus Nordborg
Journal: Nat Rev Genet Date: 2012-11-20 Impact factor: 53.242

3. Confounding and heterogeneity in genetic association studies with admixed populations.

Authors: Jinghua Liu; Juan Pablo Lewinger; Frank D Gilliland; W James Gauderman; David V Conti
Journal: Am J Epidemiol Date: 2013-01-18 Impact factor: 4.897

Review 4. Tackling the widespread and critical impact of batch effects in high-throughput data.

Authors: Jeffrey T Leek; Robert B Scharpf; Héctor Corrada Bravo; David Simcha; Benjamin Langmead; W Evan Johnson; Donald Geman; Keith Baggerly; Rafael A Irizarry
Journal: Nat Rev Genet Date: 2010-09-14 Impact factor: 53.242

5. Evaluation of methods for modeling transcription factor sequence specificity.

Authors: Matthew T Weirauch; Atina Cote; Raquel Norel; Matti Annala; Yue Zhao; Todd R Riley; Julio Saez-Rodriguez; Thomas Cokelaer; Anastasia Vedenko; Shaheynoor Talukder; Harmen J Bussemaker; Quaid D Morris; Martha L Bulyk; Gustavo Stolovitzky; Timothy R Hughes
Journal: Nat Biotechnol Date: 2013-01-27 Impact factor: 54.908

Review 6. Deep learning: new computational modelling techniques for genomics.

Authors: Gökcen Eraslan; Žiga Avsec; Julien Gagneur; Fabian J Theis
Journal: Nat Rev Genet Date: 2019-07 Impact factor: 53.242

Review 7. A primer on deep learning in genomics.

Authors: James Zou; Mikael Huss; Abubakar Abid; Pejman Mohammadi; Ali Torkamani; Amalio Telenti
Journal: Nat Genet Date: 2018-11-26 Impact factor: 38.330

Review 8. Population structure in genetic studies: Confounding factors and mixed models.

Authors: Jae Hoon Sul; Lana S Martin; Eleazar Eskin
Journal: PLoS Genet Date: 2018-12-27 Impact factor: 5.917

9. The Unreasonable Effectiveness of Convolutional Neural Networks in Population Genetic Inference.

Authors: Lex Flagel; Yaniv Brandvain; Daniel R Schrider
Journal: Mol Biol Evol Date: 2019-02-01 Impact factor: 16.240

Review 10. Opportunities and obstacles for deep learning in biology and medicine.

Authors: Travers Ching; Daniel S Himmelstein; Brett K Beaulieu-Jones; Alexandr A Kalinin; Brian T Do; Gregory P Way; Enrico Ferrero; Paul-Michael Agapow; Michael Zietz; Michael M Hoffman; Wei Xie; Gail L Rosen; Benjamin J Lengerich; Johnny Israeli; Jack Lanchantin; Stephen Woloszynek; Anne E Carpenter; Avanti Shrikumar; Jinbo Xu; Evan M Cofer; Christopher A Lavender; Srinivas C Turaga; Amr M Alexandari; Zhiyong Lu; David J Harris; Dave DeCaprio; Yanjun Qi; Anshul Kundaje; Yifan Peng; Laura K Wiley; Marwin H S Segler; Simina M Boca; S Joshua Swamidass; Austin Huang; Anthony Gitter; Casey S Greene
Journal: J R Soc Interface Date: 2018-04 Impact factor: 4.293

11 in total

Review 1. Obtaining genetics insights from deep learning via explainable artificial intelligence.

Authors: Gherman Novakovsky; Nick Dexter; Maxwell W Libbrecht; Wyeth W Wasserman; Sara Mostafavi
Journal: Nat Rev Genet Date: 2022-10-03 Impact factor: 59.581

2. An approachable, flexible and practical machine learning workshop for biologists.

Authors: Chris S Magnano; Fangzhou Mu; Rosemary S Russ; Milica Cvetkovic; Debora Treu; Anthony Gitter
Journal: Bioinformatics Date: 2022-06-24 Impact factor: 6.931

Review 3. Current progress and open challenges for applying deep learning across the biosciences.

Authors: Nicolae Sapoval; Amirali Aghazadeh; Michael G Nute; Dinler A Antunes; Advait Balaji; Richard Baraniuk; C J Barberan; Ruth Dannenfelser; Chen Dun; Mohammadamin Edrisi; R A Leo Elworth; Bryce Kille; Anastasios Kyrillidis; Luay Nakhleh; Cameron R Wolfe; Zhi Yan; Vicky Yao; Todd J Treangen
Journal: Nat Commun Date: 2022-04-01 Impact factor: 14.919

Review 4. Precision medicine for the treatment of glomerulonephritis: a bold goal but not yet a transformative achievement.

Authors: Richard J Glassock
Journal: Clin Kidney J Date: 2021-12-11