Yanglan Gan, Xin Hu, Guobing Zou, Cairong Yan, Guangwei Xu.
Abstract
Accurate inference of gene regulatory rules is critical to understanding cellular processes. Existing computational methods usually decompose the inference of gene regulatory networks (GRNs) into multiple subproblems rather than detecting potential causal relationships simultaneously, which restricts their application to data with a small number of genes. Here, we propose BiRGRN, a novel computational algorithm for inferring GRNs from time-series single-cell RNA-seq (scRNA-seq) data. BiRGRN utilizes a bidirectional recurrent neural network to infer GRNs. A recurrent neural network is a deep neural network that can capture complex, non-linear, and dynamic relationships among variables. It maps neurons to genes, and maps the connections between neural network layers to the regulatory relationships between genes, providing an intuitive way to model GRNs with biological closeness and mathematical flexibility. Based on the deep network, we transform the inference of GRNs into a regression problem, using the gene expression data at previous time points to predict the gene expression data at the later time point. Furthermore, we adopt two strategies to improve the accuracy and stability of the algorithm. Specifically, we utilize a bidirectional structure to integrate the forward and reverse inference results, and we exploit an incomplete set of prior knowledge to filter out candidate inferences of low confidence. BiRGRN is applied to four simulated datasets and three real scRNA-seq datasets to verify the proposed method. We perform comprehensive comparisons between our proposed method and other state-of-the-art techniques. The experimental results indicate that BiRGRN is capable of inferring GRNs simultaneously from time-series scRNA-seq data. BiRGRN is implemented in Python using the TensorFlow machine-learning library, and it is freely available at https://gitee.com/DHUDBLab/bi-rgrn.
Keywords: bidirectional structure; gene expression; gene regulatory network; recurrent neural network; single-cell transcriptomic data
Year: 2022 PMID: 35692809 PMCID: PMC9178250 DOI: 10.3389/fonc.2022.899825
Source DB: PubMed Journal: Front Oncol ISSN: 2234-943X Impact factor: 5.738
Figure 1. BiRGRN reconstructs GRNs from time-series single-cell transcriptome data using a bidirectional RNN. (A) Inferring the initial gene regulatory network with an RNN. (B) Incorporating incomplete prior knowledge to adjust candidate regulatory edges. (C) Adopting a voting strategy to integrate multiple candidate regulatory networks, and further utilizing the bidirectional model to optimize the inferred GRN.
Figure 2. The schematic structure of an RNN unfolded in time. Each node corresponds to a gene, and a connection between two nodes defines their interaction.
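The regression framing described in the abstract (expression at an earlier time point predicts expression at the later one, with network connections read as regulatory edges) can be illustrated with a deliberately small stand-in. The sketch below is not the BiRGRN implementation: it replaces the recurrent network with a single tanh layer, trains it with plain gradient descent, and uses pure Python instead of TensorFlow. The function names, the toy data generator, and the choice of `|W[i][j]|` as the edge score are all assumptions made for illustration.

```python
import math
import random

def toy_series(n_steps, seed=1):
    """Hypothetical two-gene series: gene 0 is noise, gene 1 follows gene 0 one step later."""
    rng = random.Random(seed)
    series = [[rng.uniform(-1, 1), 0.0]]
    for _ in range(n_steps - 1):
        prev = series[-1]
        series.append([rng.uniform(-1, 1), math.tanh(1.5 * prev[0])])
    return series

def make_pairs(series, reverse=False):
    """Pair each time point with the next one (or with the previous one if reverse=True)."""
    ordered = series[::-1] if reverse else series
    return [(ordered[t], ordered[t + 1]) for t in range(len(ordered) - 1)]

def fit_step_model(pairs, n_genes, lr=0.05, epochs=200, seed=0):
    """Fit x_next ~ tanh(W x_prev) by stochastic gradient descent on squared error."""
    rng = random.Random(seed)
    W = [[rng.uniform(-0.1, 0.1) for _ in range(n_genes)] for _ in range(n_genes)]
    for _ in range(epochs):
        for x_prev, x_next in pairs:
            for i in range(n_genes):
                pred = math.tanh(sum(W[i][j] * x_prev[j] for j in range(n_genes)))
                # Gradient of 0.5 * (pred - target)^2 with respect to the pre-activation.
                grad = (pred - x_next[i]) * (1.0 - pred * pred)
                for j in range(n_genes):
                    W[i][j] -= lr * grad * x_prev[j]
    return W

def edge_scores(W):
    """Assumed scoring rule: candidate edge (regulator j -> target i) gets |W[i][j]|."""
    n = len(W)
    return {(j, i): abs(W[i][j]) for i in range(n) for j in range(n)}

series = toy_series(50)
forward_W = fit_step_model(make_pairs(series), n_genes=2)
scores = edge_scores(forward_W)
```

On this toy series the edge from gene 0 to gene 1 should receive a clearly larger score than the edge from gene 1 to gene 0. Calling `make_pairs(series, reverse=True)` yields the training pairs for a model that runs in the reverse time direction; how its weights are mapped back to edges is a design choice the abstract does not spell out.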
Details of the time-series gene expression datasets used in the experiments.
| Dataset | Genes | Time points | Cells |
|---|---|---|---|
| GSD | 19 | 734 | 2000 |
| HSC | 11 | 731 | 2000 |
| VSC | 8 | 492 | 2000 |
| mCAD | 5 | 492 | 2000 |
| Real Dataset1 | 100 | 456 | 456 |
| Real Dataset2 | 100 | 405 | 405 |
| Real Dataset3 | 100 | 758 | 758 |
Figure 3. AUROC scores of the compared GRN inference algorithms on the four simulated datasets.
Figure 4. AUPRC scores of the compared GRN inference algorithms on the four simulated datasets.
Figure 5. Overall scores of the algorithms on the four simulated datasets.
AUROC values of the algorithms on the three real scRNA-seq datasets.
| Algorithm | Dataset 1 | Dataset 2 | Dataset 3 |
|---|---|---|---|
| BiRGRN | 0.573 | | |
| GENIE3 | 0.503 | 0.498 | 0.507 |
| LEAP | 0.487 | 0.5 | 0.494 |
| SCODE | 0.536 | 0.523 | |
| BiXGBoost | 0.509 | 0.479 | 0.510 |
The value in bold represents the highest value in the column.
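For readers less familiar with the two metrics reported here, the following sketch shows one common way to compute AUROC and an AUPRC estimate (average precision) for a ranked list of candidate edges. It is a generic illustration in pure Python, not the evaluation code used in the study; the labels and scores are made-up toy values.

```python
def auroc(labels, scores):
    """Probability that a random true edge outranks a random false edge (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one true and one false edge")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def average_precision(labels, scores):
    """Mean precision at the rank of each true edge, a common AUPRC estimate."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    if hits == 0:
        raise ValueError("need at least one true edge")
    return total / hits

# Toy example: 1 marks an edge present in the reference network, 0 an absent one.
labels = [1, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.1]
print(auroc(labels, scores))              # 0.75
print(average_precision(labels, scores))  # approximately 0.833
```

An AUROC of 0.5 corresponds to a ranking that is no better than random, which is a useful baseline when reading AUROC tables for GRN inference.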
Runtime of each method on the three real datasets.
| Runtime¹ | BiRGRN | SCODE | GENIE3 | LEAP | BiXGBoost |
|---|---|---|---|---|---|
| Dataset 1 | 1min58s | 7min3s | 58s | 6s | min49s |
| Dataset 2 | 1min58s | 6min39s | 52s | 4s | 3min21s |
| Dataset 3 | 2min22s | 8min49s | 1min6s | 11s | 3min58s |
¹ All algorithms except BiXGBoost were tested on Beeline (a benchmarking software for GRN inference algorithms). The computations were performed on a Lenovo Legion R7000 2020 equipped with a 3.0 GHz AMD Ryzen 5 4600H processor, a 4 GB NVIDIA GeForce GTX 1650 Ti GPU, and 16 GB of 3200 MHz DDR4 RAM.
AUROC values of BiRGRN and its three variants on the simulated datasets.
| Dataset | BiRGRN | Prior network | Forward | Reverse |
|---|---|---|---|---|
| GSD | | 0.544 | 0.583 | 0.587 |
| HSC | | 0.586 | 0.656 | 0.660 |
| VSC | | 0.624 | 0.761 | 0.763 |
| mCAD | 0.796 | 0.678 | 0.796 | 0.792 |
The value in bold represents the highest value in the row.
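The abstract and Figure 1 mention three integration steps: combining forward and reverse inference results, using incomplete prior knowledge to filter low-confidence candidates, and voting across multiple candidate networks. The exact rules are not given in this record, so the sketch below only illustrates one plausible reading. Averaging the two directions, keeping prior-supported edges below the threshold, and majority voting are all assumptions, and every function name and gene label is hypothetical.

```python
from collections import Counter

def integrate(forward, reverse):
    """Assumed rule: average forward and reverse scores; a missing score counts as 0."""
    edges = set(forward) | set(reverse)
    return {e: 0.5 * (forward.get(e, 0.0) + reverse.get(e, 0.0)) for e in edges}

def filter_with_prior(scores, prior_edges, threshold):
    """Assumed rule: keep an edge if its score clears the threshold or the prior supports it."""
    return {e: s for e, s in scores.items() if s >= threshold or e in prior_edges}

def vote(candidates, min_votes):
    """Keep edges that appear in at least `min_votes` candidate networks."""
    counts = Counter(e for cand in candidates for e in cand)
    return {e for e, c in counts.items() if c >= min_votes}

# Toy scores for edges written as (regulator, target).
forward = {("g1", "g2"): 0.9, ("g2", "g3"): 0.2, ("g3", "g1"): 0.6}
reverse = {("g1", "g2"): 0.7, ("g2", "g3"): 0.4}
prior = {("g2", "g3")}  # incomplete prior knowledge: one known edge

combined = integrate(forward, reverse)
kept = filter_with_prior(combined, prior, threshold=0.5)

# Three candidate networks, for example from repeated training runs.
candidates = [{("g1", "g2"), ("g2", "g3")}, {("g1", "g2")}, {("g1", "g2"), ("g3", "g1")}]
consensus = vote(candidates, min_votes=2)
```

In this toy run the edge `("g3", "g1")` is dropped because only the forward direction supports it and the prior does not list it, while `("g2", "g3")` survives on the strength of the prior alone.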