Ameen Eetemadi (1,2), Ilias Tagkopoulos (1,2). (1) Department of Computer Science, University of California, Davis, CA, USA. (2) Genome Center, University of California, Davis, CA, USA.
Abstract
MOTIVATION: Gene expression prediction is one of the grand challenges in computational biology. The availability of transcriptomics data, combined with recent advances in artificial neural networks, provides an unprecedented opportunity to create predictive models of gene expression with far-reaching applications. RESULTS: We present the Genetic Neural Network (GNN), an artificial neural network for predicting genome-wide gene expression given gene knockouts and master regulator perturbations. At its core, the GNN maps existing gene regulatory information onto its architecture and uses cell nodes that have been specifically designed to capture the dependencies and non-linear dynamics that exist in gene networks. These two key features make the GNN architecture capable of capturing complex relationships without the need for large training datasets. As a result, GNNs were 40% more accurate on average than competing architectures (MLP, RNN, BiRNN) when compared on hundreds of curated and inferred transcription modules. Our results argue that GNNs can become the architecture of choice when building predictors of gene expression from the exponentially growing corpus of genome-wide transcriptomics data. AVAILABILITY AND IMPLEMENTATION: https://github.com/IBPA/GNN. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
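The idea of mapping regulatory information onto the network architecture can be illustrated with a minimal NumPy sketch. This is not the authors' implementation (see the repository linked above for that), and it does not reproduce their cell-node design. It assumes a hypothetical binary `mask` matrix in which `mask[i, j] = 1` means gene `j` is a known regulator of gene `i`, a plain `tanh` non-linearity, and knockouts modeled by clamping a gene's expression to zero. The function names `masked_step` and `predict` are illustrative only.

```python
import numpy as np


def masked_step(expr, weights, mask, bias, knockouts=()):
    """One propagation step through a regulation-constrained layer.

    expr:    (n_genes,) current expression levels
    weights: (n_genes, n_genes), weights[i, j] = effect of regulator j on target i
    mask:    (n_genes, n_genes), 1 where j is a known regulator of i, else 0
    bias:    (n_genes,) per-gene bias
    """
    # Only connections present in the regulatory network contribute.
    pre = (weights * mask) @ expr + bias
    # Genes without any regulator (master regulators) keep their input value.
    has_regulator = mask.any(axis=1)
    out = np.where(has_regulator, np.tanh(pre), expr)
    # Assumption: a knocked-out gene is clamped to zero expression.
    idx = np.asarray(list(knockouts), dtype=int)
    out[idx] = 0.0
    return out


def predict(expr0, weights, mask, bias, knockouts=(), steps=3):
    """Propagate perturbations through the masked network for a few steps."""
    expr = np.array(expr0, dtype=float)
    idx = np.asarray(list(knockouts), dtype=int)
    expr[idx] = 0.0
    for _ in range(steps):
        expr = masked_step(expr, weights, mask, bias, knockouts)
    return expr


# Toy cascade: gene 0 regulates gene 1, gene 1 regulates gene 2.
toy_mask = np.array([[0, 0, 0],
                     [1, 0, 0],
                     [0, 1, 0]])
toy_weights = np.ones((3, 3))
toy_bias = np.zeros(3)

wild_type = predict([1.0, 0.0, 0.0], toy_weights, toy_mask, toy_bias, steps=2)
knockout_0 = predict([1.0, 0.0, 0.0], toy_weights, toy_mask, toy_bias,
                     knockouts=[0], steps=2)
```

In this toy cascade, knocking out gene 0 silences both downstream genes, while weights outside the mask (for example `toy_weights[2, 0]`) never influence the prediction. Constraining connectivity in this way reduces the number of effective parameters, which is one plausible reading of why such an architecture could need less training data.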
Authors: Sun Kyung Kim; Peter C Goughnour; Eui Jin Lee; Myeong Hyun Kim; Hee Jin Chae; Gwang Yeul Yun; Yi Rang Kim; Jin Woo Choi Journal: PLoS One Date: 2021-01-28 Impact factor: 3.240
Authors: Robert Ietswaart; Benjamin M Gyori; John A Bachman; Peter K Sorger; L Stirling Churchman Journal: Genome Biol Date: 2021-02-02 Impact factor: 13.583