Sree K Chanumolu, Hasan H Otu.
Abstract
Introduction: Reconstruction of gene interaction networks from experimental data provides a deep understanding of the underlying biological mechanisms. The noisy nature of the data and the large size of the network make this a very challenging task. Complex approaches handle the stochastic nature of the data but can only do this for small networks; simpler, linear models generate large networks but with less reliability.
Keywords: Bayesian networks; Interactome; atlas; external knowledge; gene interaction network
Year: 2022 PMID: 35321220 PMCID: PMC8922291 DOI: 10.1017/cts.2022.18
Source DB: PubMed Journal: J Clin Transl Sci ISSN: 2059-8661
Fig. 1. Workflow for atlas generation.
Fig. 2. V-measure, biological homogeneity index (BHI), and adjusted Rand index (ARI) values for the eight clustering algorithms over a range of average numbers of genes per cluster. The input data are simulated gene expression data generated from the human atlas with 11,454 genes representing 337 Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways.
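Of the three scores named in the Fig. 2 caption, the adjusted Rand index is the simplest to express in code. The following is a minimal, dependency-free sketch and not the authors' implementation; the function name `adjusted_rand_index` and the toy labels are assumptions made for illustration.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index between two flat clusterings of the same items."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("label lists must have the same length")
    n = len(labels_true)
    total_pairs = comb(n, 2)
    if total_pairs == 0:
        return 1.0  # fewer than two items: the partitions trivially agree

    # Contingency counts: items sharing a (true cluster, predicted cluster) pair.
    joint = Counter(zip(labels_true, labels_pred))
    true_sizes = Counter(labels_true)
    pred_sizes = Counter(labels_pred)

    # Number of item pairs placed together in both / in each partition.
    together_both = sum(comb(c, 2) for c in joint.values())
    together_true = sum(comb(c, 2) for c in true_sizes.values())
    together_pred = sum(comb(c, 2) for c in pred_sizes.values())

    expected = together_true * together_pred / total_pairs
    maximum = (together_true + together_pred) / 2
    if maximum == expected:
        return 1.0  # degenerate case, e.g. a single cluster in both partitions
    return (together_both - expected) / (maximum - expected)


# Toy example: cluster labels only matter up to renaming.
print(adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0]))
```

A score of 1 means the two partitions are identical up to relabeling, and values near 0 indicate agreement no better than chance.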
Area under the curve of precision-recall curve (AUC of PRC) (×10⁻⁴) values for the atlases generated using the proposed approach based on k-means, hierarchical, or EMMIXgene clustering algorithms
| Clustering method | k-means | Hierarchical | EMMIXgene |
|---|---|---|---|
| First cluster | 3.8 | 5.0 | 3.0 |
| Minimum | 29.3 | 39.0 | 13.0 |
| Maximum | 24.6 | 34.6 | 11.9 |
| Mean | 29.9 | | 13.1 |
| Median | 26.2 | 36.3 | 12.7 |
| Tukey | 26.0 | 36.0 | 12.5 |
Strength values for the genes in a cluster were calculated in one of two ways: from the network generated using only the genes in that cluster (First cluster), or as the minimum, maximum, mean, median, or Tukey’s bi-weight average of the strength values obtained during the cluster merge process. A pair of genes within a cluster has as many strength values as the number of times the cluster was merged with another cluster. The best-performing combination is highlighted with boldface and a shaded background.
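The aggregation options described in this footnote can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the function names and example values are hypothetical, and the one-step Tukey bi-weight shown here (tuning constant `c = 5`, small `eps` to avoid division by zero) is one common variant; the source does not say which variant was used.

```python
from statistics import mean, median


def tukey_biweight(values, c=5.0, eps=1e-4):
    """One-step Tukey bi-weight average (assumed variant; c and eps are illustrative)."""
    values = list(values)
    center = median(values)
    mad = median([abs(v - center) for v in values])  # median absolute deviation
    weights = []
    for v in values:
        u = (v - center) / (c * mad + eps)
        # Points far from the median get zero weight; nearby points get close to 1.
        weights.append((1 - u * u) ** 2 if abs(u) < 1 else 0.0)
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)


def aggregate_strengths(merge_strengths):
    """Summarize the strength values one gene pair collected across merge steps."""
    return {
        "minimum": min(merge_strengths),
        "maximum": max(merge_strengths),
        "mean": mean(merge_strengths),
        "median": median(merge_strengths),
        "tukey": tukey_biweight(merge_strengths),
    }


# Hypothetical strengths for one gene pair over three merge steps.
print(aggregate_strengths([0.2, 0.4, 0.6]))
```

The bi-weight average behaves like the mean for well-behaved values but discounts outliers, which is why it is often listed alongside the median as a robust alternative.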
Area under the curve of precision-recall curve (AUC of PRC) values with 95% confidence interval for the atlases generated using the correlation and average mutual information (AMI) metrics compared with the proposed approach based on hierarchical clustering and perfect clustering of expression data
| | Correlation | Hierarchical (proposed) | AMI | Perfect clustering |
|---|---|---|---|---|
| AUC of PRC (×10⁻⁴) | 7.9 [7.6–8.2] | 39.7 [37.8–41.6] | 2.4 [2.2–2.6] | 5116 [4895–5337] |
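For readers unfamiliar with the metric, the sketch below shows one way an AUC of PRC can be computed for a scored edge list against a reference network. It is a minimal illustration, not the authors' evaluation code: the function name `auc_prc`, the edge labels, and the conventions used (trapezoidal integration over recall, a curve starting at recall 0 and precision 1, no special handling of tied scores) are all assumptions.

```python
def auc_prc(scored_edges, true_edges):
    """Area under the precision-recall curve for ranked candidate edges.

    scored_edges: dict mapping an edge to its confidence score.
    true_edges:   set of edges present in the reference network.
    """
    if not true_edges:
        raise ValueError("reference network has no edges")

    # Rank candidate edges from most to least confident.
    ranked = sorted(scored_edges, key=scored_edges.get, reverse=True)

    true_positives = 0
    area = 0.0
    prev_recall, prev_precision = 0.0, 1.0  # assumed starting point of the curve
    for k, edge in enumerate(ranked, start=1):
        if edge in true_edges:
            true_positives += 1
        recall = true_positives / len(true_edges)
        precision = true_positives / k
        # Trapezoid between the previous and the current curve point.
        area += (recall - prev_recall) * (precision + prev_precision) / 2
        prev_recall, prev_precision = recall, precision
    return area


# Hypothetical example: both true edges are ranked ahead of the false ones.
scores = {"a-b": 0.9, "b-c": 0.8, "a-d": 0.2, "c-d": 0.1}
print(auc_prc(scores, {"a-b", "b-c"}))
```

Other conventions, such as average precision or interpolated curves, give somewhat different numbers, so values are only comparable when computed the same way.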
Subnetwork statistics and the area under the curve of precision-recall curve (AUC of PRC) values for the learned networks using the Bayesian network prior (BNP) and atlas approaches
| Subnetwork | No. of nodes | No. of edges | No. of pathways involved | AUC of PRC (×10⁻³): BNP | AUC of PRC (×10⁻³): Atlas | AUC of PRC (×10⁻³): Baseline |
|---|---|---|---|---|---|---|
| 1 | 528 | 879 | 11 | 129.6 | 112.8 | 6.3 |
| 2 | 542 | 840 | 13 | 109.8 | 95.5 | 5.7 |
| 3 | 496 | 913 | 13 | 113.9 | 102.5 | 7.4 |
| 4 | 514 | 911 | 8 | 112.2 | 96.5 | 6.9 |
| 5 | 537 | 850 | 11 | 74 | 59.9 | 5.9 |
| 6 | 526 | 897 | 13 | 124.8 | 109.8 | 6.5 |
| 7 | 494 | 837 | 11 | 124.2 | 105.6 | 6.9 |
| 8 | 459 | 845 | 14 | 83.2 | 69.9 | 8.0 |
| 9 | 508 | 912 | 8 | 108.8 | 87.0 | 7.1 |
| 10 | 462 | 816 | 8 | 79.3 | 63.4 | 7.7 |
| Average | 507 | 870 | 11.0 | 106.0 | 90.3 | 6.8 |
| St. dev. | 29.05 | 36.57 | 2.31 | 20.06 | 19.45 | 0.90 |
Each subnetwork was chosen to have ∼500 nodes.
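The Average and St. dev. rows can be reproduced from the ten subnetwork rows. The short check below does this for three of the columns; the reported figures match the sample standard deviation (n − 1 denominator). The variable and function names are chosen for this example only.

```python
from statistics import mean, stdev

# Values copied from the subnetwork table (subnetworks 1 to 10).
columns = {
    "nodes": [528, 542, 496, 514, 537, 526, 494, 459, 508, 462],
    "edges": [879, 840, 913, 911, 850, 897, 837, 845, 912, 816],
    "bnp_auc": [129.6, 109.8, 113.9, 112.2, 74, 124.8, 124.2, 83.2, 108.8, 79.3],
}


def summarize(values):
    """Return (average, sample standard deviation), rounded as in the table."""
    return round(mean(values), 1), round(stdev(values), 2)


for name, values in columns.items():
    average, st_dev = summarize(values)
    print(f"{name}: average={average}, st. dev.={st_dev}")
```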
List of pathways used to generate the “mini-atlas” for testing the proposed workflow
| KEGG pathway ID | Pathway name | No. of nodes | No. of edges |
|---|---|---|---|
| hsa00010 | Glycolysis/Gluconeogenesis | 28 | 78 |
| hsa00020 | Citrate cycle (TCA cycle) | 16 | 40 |
| hsa00030 | Pentose phosphate pathway | 20 | 61 |
| hsa00061 | Fatty acid biosynthesis | 12 | 35 |
| hsa00230 | Purine metabolism | 48 | 277 |
| hsa00280 | Valine, leucine, and isoleucine degradation | 33 | 116 |
| hsa00330 | Arginine and proline metabolism | 31 | 64 |
| hsa00562 | Inositol phosphate metabolism | 27 | 75 |
| hsa00620 | Pyruvate metabolism | 21 | 51 |
| hsa00640 | Propanoate metabolism | 20 | 53 |
KEGG, Kyoto Encyclopedia of Genes and Genomes.
Fig. 3. A subnetwork of the reconstructed test atlas that involves genes from the hsa00010 and hsa00330 Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways. FN, false negative; FP, false positive; TP, true positive.