Huiqing Wang, Chun Li, Jianhui Zhang, Jingjing Wang, Yue Ma, Yuanyuan Lian.
Abstract
Molecular biology combined with in silico machine learning and deep learning has facilitated the broad application of gene expression profiles for gene function prediction, optimal crop breeding, disease-related gene discovery, and drug screening. Although the acquisition cost of genome-wide expression profiles has been steadily declining, the cost of generating a compendium of expression profiles across thousands of samples remains high. The Library of Integrated Network-Based Cellular Signatures (LINCS) program used approximately 1000 landmark genes to predict the expression of the remaining target genes by linear regression; however, this approach ignored the nonlinear features influencing gene expression relationships, limiting the accuracy of the experimental results. We herein propose a gene expression prediction model, L-GEPM, based on long short-term memory (LSTM) neural networks, which captures the nonlinear features affecting gene expression and uses the learned features to predict the target genes. A comparison of the experimental errors and fitting effects of different prediction models shows that the LSTM neural network-based model, L-GEPM, achieves low error and a superior fitting effect.
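The linear-regression baseline mentioned in the abstract can be illustrated with a minimal sketch. This is not the LINCS pipeline or L-GEPM: the data are synthetic, the matrix sizes are far smaller than the roughly 1000 landmark genes used in practice, and the helper names (`fit_linear_baseline`, `predict`, `mean_absolute_error`) are hypothetical. The sketch only shows the general idea of fitting target-gene expression from landmark-gene expression by least squares, and why a purely linear map leaves residual error when the true relationship has a nonlinear component.

```python
import numpy as np


def fit_linear_baseline(landmark, target):
    """Least-squares fit of target genes from landmark genes (with intercept).

    landmark: (n_samples, n_landmark) expression matrix
    target:   (n_samples, n_target) expression matrix
    Returns a weight matrix of shape (n_landmark + 1, n_target).
    """
    X = np.hstack([landmark, np.ones((landmark.shape[0], 1))])
    W, *_ = np.linalg.lstsq(X, target, rcond=None)
    return W


def predict(W, landmark):
    """Apply the fitted linear map to new landmark-gene profiles."""
    X = np.hstack([landmark, np.ones((landmark.shape[0], 1))])
    return X @ W


def mean_absolute_error(pred, truth):
    """Average absolute difference over all samples and target genes."""
    return float(np.mean(np.abs(pred - truth)))


# Synthetic stand-in for expression data (assumption: no real profiles used).
rng = np.random.default_rng(0)
n_samples, n_landmark, n_target = 200, 20, 5
landmark = rng.normal(size=(n_samples, n_landmark))
true_W = rng.normal(size=(n_landmark, n_target))

# Target genes = linear part + a small nonlinear term the baseline cannot fit.
target = landmark @ true_W + 0.5 * np.tanh(landmark[:, :n_target]) ** 2

# Fit on the first 150 samples, evaluate on the remaining 50.
W = fit_linear_baseline(landmark[:150], target[:150])
mae = mean_absolute_error(predict(W, landmark[150:]), target[150:])
print(f"held-out MAE of the linear baseline: {mae:.4f}")
```

The held-out error here comes mostly from the nonlinear term, which is the kind of structure a nonlinear model such as an LSTM-based predictor is meant to capture.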
Keywords: Gene expression; LSTM; landmark genes; linear regression; target genes
Year: 2019; PMID: 31617459; DOI: 10.1142/S0219720019500227
Source DB: PubMed; Journal: J Bioinform Comput Biol; ISSN: 0219-7200; Impact factor: 1.122