Shanwen Sun, Chunyu Wang, Hui Ding, Quan Zou.
Abstract
The advent of high-throughput genomic technologies has resulted in the accumulation of massive amounts of genomic information. However, biologists are challenged with how to effectively analyze these data. Machine learning can provide tools for better and more efficient data analysis. Unfortunately, because many plant biologists are unfamiliar with machine learning, its application in plant molecular studies has been restricted to a few species and a limited set of algorithms. Thus, in this study, we provide the basic steps for developing machine learning frameworks and present a comprehensive overview of machine learning algorithms and various evaluation metrics. Furthermore, we introduce sources of important curated plant genomic data and R packages to enable plant biologists to easily and quickly apply appropriate machine learning algorithms in their research. Finally, we discuss current applications of machine learning algorithms for identifying various genes related to resistance to biotic and abiotic stress. Broad application of machine learning and the accumulation of plant sequencing data will advance plant molecular studies.
Keywords: evaluation metrics; genomics; plants; supervised machine learning; unsupervised machine learning
Year: 2020 PMID: 31867668 DOI: 10.1093/bfgp/elz036
Source DB: PubMed Journal: Brief Funct Genomics ISSN: 2041-2649 Impact factor: 4.241