Zhaorui Dong, Xiaoqiang Sun.
Abstract
Due to a lack of explicit temporal information, it can be challenging to infer gene regulatory networks from clinical transcriptomic data. Here, we describe the protocol of PROB_R for inferring latent temporal disease progression and reconstructing gene regulatory networks from cross-sectional clinical transcriptomic data. We illustrate the protocol by applying it to a breast cancer dataset to demonstrate its use in recovering pseudo-temporal dynamics of gene expression alongside disease progression, reconstructing gene regulatory networks, and identifying key regulatory genes. For complete details on the use and execution of this protocol, please refer to Sun et al. (2021).
Keywords: Bioinformatics; Cancer; Gene Expression; Health Sciences; Systems biology
Year: 2022 PMID: 35733604 PMCID: PMC9207570 DOI: 10.1016/j.xpro.2022.101467
Source DB: PubMed Journal: STAR Protoc ISSN: 2666-1667
Example format for the clinical transcriptomic data
| Genes/Grades | GSM177885 | GSM177886 | GSM177887 | GSM177888 | … |
|---|---|---|---|---|---|
| A1CF | 7.422311 | 7.149458 | 6.974534 | 8.001110 | … |
| A2M | 11.029381 | 11.564107 | 13.150140 | 12.194598 | … |
| … | … | … | … | … | … |
| ZZZ3 | 10.420448 | 8.733127 | 9.8648944 | 8.7696577 | … |
| Grade information | 3 | 3 | 3 | 3 | … |
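The layout above (genes in rows, samples in columns, grade information in the final row) can be separated into an expression matrix and a grade vector before any analysis. The following is a minimal sketch, not part of PROB_R: it assumes a tab-separated file, and the function name `split_expression_and_grades` and the two-gene excerpt are hypothetical.

```python
import csv
import io

# Hypothetical tab-separated excerpt in the layout shown above; the values are
# copied from the example table, truncated to two genes and two samples.
RAW = (
    "Genes/Grades\tGSM177885\tGSM177886\n"
    "A1CF\t7.422311\t7.149458\n"
    "A2M\t11.029381\t11.564107\n"
    "Grade information\t3\t3\n"
)


def split_expression_and_grades(text, grade_label="Grade information"):
    """Split a genes-by-samples table into expression values and grades.

    Returns (samples, expression, grades), where expression maps each gene
    to a list of floats (one per sample) and grades maps each sample ID to
    the integer grade found in the final row.
    """
    rows = list(csv.reader(io.StringIO(text), delimiter="\t"))
    header, body = rows[0], rows[1:]
    samples = header[1:]
    if not body or body[-1][0] != grade_label:
        raise ValueError("the last row must hold the grade information")
    grades = {s: int(g) for s, g in zip(samples, body[-1][1:])}
    expression = {row[0]: [float(v) for v in row[1:]] for row in body[:-1]}
    return samples, expression, grades


samples, expression, grades = split_expression_and_grades(RAW)
print(samples)
print(expression["A2M"])
print(grades)
```

Keeping the grade row out of the expression dictionary avoids treating it as a gene in downstream steps.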
Figure 1. Pseudo-temporal dynamics of gene expression along latent disease progression
Shown is an example for the gene MCM10. The x-axis and y-axis values are standardized.
Figure 2. Clustering heatmap of TCGs
The color of each cell represents the standardized expression level of the corresponding gene. The left side of the figure shows the clustering structure of the TCGs.
Figure 3. Bubble plot of the adjacency matrix of the inferred GRN
The node color represents the posterior mean of the regulatory coefficient for each edge, with red for positive and green for negative. The node size represents the standardized absolute value of the edge coefficient, calculated as the absolute posterior mean divided by the standard deviation.
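The two visual encodings described in this caption reduce to a sign check and a ratio. A minimal sketch follows; the function name `bubble_attributes` and the input values are hypothetical, and the standard deviation is assumed to be the posterior standard deviation of the same coefficient.

```python
def bubble_attributes(posterior_mean, posterior_sd):
    """Return (color, size) for one edge of the bubble plot.

    color: "red" for a positive posterior mean, "green" for a negative one,
           None when the mean is exactly zero (no sign to display).
    size:  absolute posterior mean divided by the standard deviation.
    """
    if posterior_sd <= 0:
        raise ValueError("standard deviation must be positive")
    if posterior_mean > 0:
        color = "red"
    elif posterior_mean < 0:
        color = "green"
    else:
        color = None
    size = abs(posterior_mean) / posterior_sd
    return color, size


# Hypothetical coefficients, chosen only to illustrate the calculation.
print(bubble_attributes(0.5, 0.25))
print(bubble_attributes(-0.75, 0.25))
```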
Figure 4. Gene regulatory network with an edge credible level threshold of 0.95
Edge color indicates the sign of regulation: red for positive regulation and green for negative regulation.
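The caption does not define how the credible level is applied. One common reading, stated here purely as an assumption, is that an edge is kept when the central 95% credible interval of its regulatory coefficient excludes zero. The sketch below illustrates that reading on hypothetical posterior draws; the function names and the simple order-statistic interval are illustrative and are not taken from PROB_R.

```python
import math


def credible_interval(draws, level=0.95):
    """Central credible interval from posterior draws (order statistics).

    Indices are rounded outward, so the interval is slightly conservative.
    """
    if not draws:
        raise ValueError("at least one posterior draw is required")
    ordered = sorted(draws)
    last = len(ordered) - 1
    tail = (1.0 - level) / 2.0
    lower = ordered[math.floor(tail * last)]
    upper = ordered[math.ceil((1.0 - tail) * last)]
    return lower, upper


def edge_sign(draws, level=0.95):
    """Return 1 (positive edge), -1 (negative edge), or 0 (edge dropped).

    Assumption: an edge is retained only if the interval excludes zero.
    """
    lower, upper = credible_interval(draws, level)
    if lower > 0:
        return 1
    if upper < 0:
        return -1
    return 0


# Hypothetical, deterministic "draws" used only to exercise the functions.
positive = [0.1 + 0.01 * i for i in range(101)]
straddling = [-0.5 + 0.01 * i for i in range(101)]
print(edge_sign(positive), edge_sign(straddling))
```

Under this reading, a retained edge with sign 1 would be drawn in red and one with sign -1 in green.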
Figure 5. Top 5 genes ranked according to the hub scores
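The caption does not say how hub scores are computed. A plausible reading, given that igraph is listed among the resources, is Kleinberg's hub score, which can be obtained by power iteration on the directed adjacency matrix. The sketch below makes that assumption explicit; the gene labels G1 to G4 and the toy edges are hypothetical, and edges are treated as unweighted.

```python
def hub_scores(genes, edges, iterations=100):
    """Kleinberg hub scores by power iteration, scaled so the maximum is 1.

    genes: list of node names.
    edges: list of (regulator, target) pairs, treated as unweighted.
    """
    index = {g: i for i, g in enumerate(genes)}
    n = len(genes)
    adj = [[0.0] * n for _ in range(n)]
    for src, dst in edges:
        adj[index[src]][index[dst]] = 1.0

    hubs = [1.0] * n
    for _ in range(iterations):
        # Authority of j: sum of hub scores of the regulators pointing at j.
        auth = [sum(adj[i][j] * hubs[i] for i in range(n)) for j in range(n)]
        # Hub of i: sum of authority scores of the targets of i.
        hubs = [sum(adj[i][j] * auth[j] for j in range(n)) for i in range(n)]
        top = max(hubs)
        if top == 0:  # no edges at all: every score stays zero
            break
        hubs = [h / top for h in hubs]
    return dict(zip(genes, hubs))


# Hypothetical toy network: G1 regulates three genes, G2 regulates one.
genes = ["G1", "G2", "G3", "G4"]
edges = [("G1", "G2"), ("G1", "G3"), ("G1", "G4"), ("G2", "G3")]
scores = hub_scores(genes, edges)
ranking = sorted(genes, key=lambda g: scores[g], reverse=True)
print(ranking[:2], scores)
```

Genes with many outgoing edges to well-targeted genes rank highest, which is why such a score is a natural way to nominate key regulators.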
Figure 6. Time course curves of the top 5 genes
Figure 7. Clinical relevance of FOXM1 for breast cancer patients
The red curve corresponds to low gene expression and the blue curve to high gene expression. The log-rank test p value assesses the statistical significance of the difference between the two Kaplan-Meier (K-M) survival curves.
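For reference, the two-group log-rank test mentioned in this caption can be written out directly. The sketch below is a plain-Python illustration of the standard test, not the code used in the protocol (which lists the R survival package); the function name and the follow-up data are hypothetical.

```python
import math


def logrank_test(group_a, group_b):
    """Two-sample log-rank test.

    Each group is a list of (time, event) pairs, with event = 1 for an
    observed event and 0 for a censored observation.
    Returns (chi_square, p_value) on 1 degree of freedom.
    """
    combined = group_a + group_b
    event_times = sorted({t for t, e in combined if e == 1})
    observed_a = expected_a = variance = 0.0
    for t in event_times:
        # Subjects still at risk just before time t.
        n_a = sum(1 for time, _ in group_a if time >= t)
        n_b = sum(1 for time, _ in group_b if time >= t)
        # Events occurring exactly at time t.
        d_a = sum(1 for time, e in group_a if time == t and e == 1)
        d_b = sum(1 for time, e in group_b if time == t and e == 1)
        n, d = n_a + n_b, d_a + d_b
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0, 1.0
    chi_square = (observed_a - expected_a) ** 2 / variance
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi_square / 2.0))
    return chi_square, p_value


# Hypothetical follow-up data: (time, event) with one censored subject each.
low_expression = [(12, 1), (20, 1), (25, 0), (31, 1), (40, 1)]
high_expression = [(3, 1), (5, 1), (8, 1), (9, 0), (14, 1)]
chi_square, p_value = logrank_test(low_expression, high_expression)
print(round(chi_square, 3), round(p_value, 4))
```

A small p value indicates that the two survival curves differ by more than would be expected by chance under the null hypothesis of identical hazards.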
| REAGENT or RESOURCE | SOURCE | IDENTIFIER |
|---|---|---|
| Dataset of microarray experiments from primary breast tumors | NCBI Gene Expression Omnibus | GEO: |
| PROB_R | This paper | |
| R software (version 4.0.5) | ( | |
| RStudio (version 1.4.1106) | ( | |
| Cytoscape | Cytoscape Team | |
| Biobase package | ( | |
| ggplot2 package | ( | |
| GEOquery package | ( | |
| trend package | ( | |
| OmnipathR package | ( | |
| monomvn | ( | |
| pheatmap package | ( | |
| minerva package | ( | |
| tidyr package | ( | |
| gprofiler2 package | ( | |
| reshape2 package | ( | |
| igraph | ( | |
| survival package | ( | |
| Brq package | ( | |
| Equipment | A laptop with a 10th-generation Intel Core i7 CPU (2.60 GHz), 16 GB RAM, and 64-bit Windows 10 | N/A |