Wajdi Dhifli, Julia Puig, Aurélien Dispot, Mohamed Elati.
Abstract
BACKGROUND: With recent advances in high-throughput experimental procedures, biologists are gathering huge quantities of data. A main priority in bioinformatics and computational biology is to provide system-level analytical tools that can keep pace with the ever-growing production of high-throughput biological data while taking its biological context into account. In gene expression data analysis, genes have widely been treated as independent components. However, a systemic view shows that they act synergistically in living cells, forming functional complexes and, more generally, a biological system.
Keywords: Gene expression; Gene perturbation; Latent signals; Network-based transformations; Regulator activity
Year: 2019 PMID: 30717663 PMCID: PMC7394327 DOI: 10.1186/s12859-018-2481-y
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Fig. 1 General framework. (a) An overview of our framework LATNET for latent network-based representations. (b) A regulatory program (network) of a target gene with its co-activators and co-inhibitors. (c) An example of a regulator and its set of activated and repressed genes. (d) An example of a pipeline for comparative analyses based on LATNET-transformed signals
Number of genes, samples and classification of the bladder cancer datasets
| | Genes | Tumor samples | Invasive | Superficial |
|---|---|---|---|---|
| | 20,326 | 193 | 89 | 104 |
| | 8174 | 79 | 43 | 36 |
| | 15,092 | 306 | 93 | 213 |
Number of features in the EXPRESSION data and in the two LATNET representations
| | Number of features | | |
|---|---|---|---|
| | 7089 | 667 | 6359 |
| | 3238 | 394 | 2773 |
| | 5858 | 606 | 5190 |
Classification results in terms of AUC obtained on the E-MTAB-1803, E-TABM-147 and GSE32894 datasets with Random Forest (RF) and Support Vector Machine (SVM) classifiers, using the two LATNET signals and EXPRESSION as input
|
|
|
| ||||
|---|---|---|---|---|---|---|
|
|
|
|
|
|
| |
|
| 0.93 | 0.91 |
| 0.85 |
| 0.93 |
|
| 0.88 | 0.87 |
|
| 0.90 | 0.83 |
|
| 0.83 | 0.82 | 0.83 | 0.83 | 0.84 |
|
The best AUC achieved on each of the 3 datasets is in bold
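The AUC reported in this table can be read as the probability that a classifier scores a randomly chosen positive sample above a randomly chosen negative one. A minimal pure-Python sketch of that pairwise definition follows. The function name `auc`, the label coding (1 = invasive, 0 = superficial) and the toy scores are assumptions made for illustration; this is not the authors' evaluation code.

```python
def auc(labels, scores):
    """Area under the ROC curve via the pairwise (Mann-Whitney) definition.

    labels: iterable of 0/1 class labels (assumed coding: 1 = invasive,
            0 = superficial; hypothetical, chosen for this example).
    scores: iterable of classifier scores, higher meaning "more likely 1".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    # Each (positive, negative) pair counts 1 if ranked correctly,
    # 0.5 if tied, and 0 otherwise.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy example: 3 of the 4 positive/negative pairs are ranked correctly.
toy_labels = [1, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.6, 0.1]
toy_auc = auc(toy_labels, toy_scores)
```

The quadratic pair loop is kept for readability; a rank-based computation gives the same value and scales better to datasets of several hundred samples.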
Classification results in terms of AUC obtained on the E-MTAB-1803, E-TABM-147 and GSE32894 datasets with Random Forest (RF) and Support Vector Machine (SVM) classifiers, using the two LATNET signals, PCA, SVD and NMF as input
|
|
|
|
|
| ||||||
|---|---|---|---|---|---|---|---|---|---|---|
|
|
|
|
|
|
|
|
|
|
| |
|
| 0.93 | 0.91 |
| 0.85 | 0.93 | 0.87 | 0.93 | 0.90 | 0.92 | 0.92 |
|
| 0.88 | 0.87 |
|
| 0.87 | 0.87 | 0.86 | 0.90 | 0.84 | 0.89 |
|
|
| 0.82 |
|
| 0.75 | 0.75 | 0.77 | 0.81 | 0.82 | 0.82 |
The best AUC achieved on each dataset is in bold
Fig. 2 Stability and reproducibility of EXPRESSION and the two LATNET signals for the bladder cancer datasets. (a) Stability of signatures as a function of the number of selected features, estimated by the Kuncheva index in datasets E-TABM-147, GSE32894 and E-MTAB-1803, respectively. (b) Reproducibility between signatures from the GSE32894 and E-MTAB-1803 datasets. The overlap was computed as the Kuncheva index between signatures from the two datasets with the same number of selected features
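The Kuncheva index used in this figure measures the overlap between two feature subsets of equal size, corrected for the overlap expected by chance. For two signatures of size k that share r features out of n candidates, it is (r·n − k²) / (k·(n − k)). The sketch below is a minimal pure-Python version; the function names, the gene identifiers and the averaging over all pairs of signatures are assumptions made for illustration, not the authors' implementation.

```python
from itertools import combinations


def kuncheva_index(sig_a, sig_b, n_features):
    """Kuncheva consistency index between two equal-size feature subsets.

    Returns 1.0 for identical subsets, values near 0 for chance-level
    overlap, and negative values for less overlap than expected by chance.
    """
    a, b = set(sig_a), set(sig_b)
    k = len(a)
    if len(b) != k:
        raise ValueError("signatures must have the same size")
    if k == 0 or k >= n_features:
        raise ValueError("need 0 < signature size < n_features")
    r = len(a & b)  # number of shared features
    return (r * n_features - k * k) / (k * (n_features - k))


def mean_pairwise_stability(signatures, n_features):
    """Average Kuncheva index over all pairs of signatures (assumed protocol)."""
    pairs = list(combinations(signatures, 2))
    if not pairs:
        raise ValueError("need at least two signatures")
    return sum(kuncheva_index(a, b, n_features) for a, b in pairs) / len(pairs)


# Hypothetical signatures of 4 features each, drawn from 20 candidates.
sig_1 = ["g1", "g2", "g3", "g4"]
sig_2 = ["g3", "g4", "g5", "g6"]
overlap = kuncheva_index(sig_1, sig_2, n_features=20)
```

Because the index depends on k, signatures are only comparable at the same number of selected features, which is why the caption fixes that number when comparing the two datasets.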
Fig. 3 Network visualization of EXPRESSION and the two LATNET signals (in green, red and blue, respectively) on each breast cancer subtype. Colour intensity represents the strength of the signal at each node