André Voigt1, Katja Nowick2,3,4, Eivind Almaas1,5.
Abstract
Differential co-expression network analyses have recently become an important step in the investigation of cellular differentiation and dysfunctional gene-regulation in cell and tissue disease-states. The resulting networks have been analyzed to identify and understand pathways associated with disorders, or to infer molecular interactions. However, existing methods for differential co-expression network analysis are unable to distinguish between various forms of differential co-expression. To close this gap, here we define the three different kinds (conserved, specific, and differentiated) of differential co-expression and present a systematic framework, CSD, for differential co-expression network analysis that incorporates these interactions on an equal footing. In addition, our method includes a subsampling strategy to estimate the variance of co-expressions. Our framework is applicable to a wide variety of cases, such as the study of differential co-expression networks between healthy and disease states, before and after treatments, or between species. Applying the CSD approach to a published gene-expression data set of cerebral cortex and basal ganglia samples from healthy individuals, we find that the resulting CSD network is enriched in genes associated with cognitive function, signaling pathways involving compounds with well-known roles in the central nervous system, as well as certain neurological diseases. From the CSD analysis, we identify a set of prominent hubs of differential co-expression, whose neighborhood contains a substantial number of genes associated with glioblastoma. The resulting gene-sets identified by our CSD analysis also contain many genes that so far have not been recognized as having a role in glioblastoma, but are good candidates for further studies. CSD may thus aid in hypothesis-generation for functional disease-associations.
Year: 2017 PMID: 28957313 PMCID: PMC5634634 DOI: 10.1371/journal.pcbi.1005739
Source DB: PubMed Journal: PLoS Comput Biol ISSN: 1553-734X Impact factor: 4.475
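The abstract mentions a subsampling strategy for estimating the variance of co-expression. The following is a minimal sketch of that general idea, not the authors' exact procedure: it computes Spearman's rank correlation on random subsamples drawn without replacement and reports the variance of the resulting values. The function names, the `subsample_size` and `n_subsamples` parameters, and the use of the population variance are all assumptions made for illustration.

```python
import random
from statistics import mean, pvariance


def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of tied values starting at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for pos in range(i, j + 1):
            ranks[order[pos]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


def subsampled_variance(x, y, subsample_size, n_subsamples, seed=0):
    """Variance of Spearman's rho across random subsamples (assumed scheme)."""
    rng = random.Random(seed)
    indices = range(len(x))
    rhos = []
    for _ in range(n_subsamples):
        chosen = rng.sample(indices, subsample_size)
        rhos.append(spearman([x[i] for i in chosen], [y[i] for i in chosen]))
    return pvariance(rhos)
```

The sketch assumes that no subsample is constant in either variable, since the correlation is undefined in that case.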
Fig 1. Gene co-expression score surfaces.
General representation of the regions of interest for differential co-expression relationship scores C, S, and D. Here, ρ1 and ρ2 denote the Spearman rank-correlation of the expression of a given gene pair under condition 1 and condition 2, respectively. Colored regions correspond to the three kinds of co-expression: blue is conserved C (strong co-expression in both conditions, no sign change), green is specific S (strong co-expression under only one condition), red is differentiated D (strong co-expression in both conditions, but with opposite signs). The colored letters indicate the scores associated with each colored region.
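The score definitions themselves are not reproduced in this excerpt. As an illustration of how three such scores could separate the regions described in the caption, here is a sketch under the assumption that each score is a simple combination of ρ1 and ρ2 divided by the combined standard deviation of the two correlations. The function name `csd_scores` and the exact functional forms are assumptions, not a confirmed transcription of the paper's equations.

```python
from math import sqrt


def csd_scores(rho1, rho2, var1, var2):
    """Return assumed (C, S, D) scores for one gene pair.

    rho1, rho2: Spearman correlations under condition 1 and condition 2.
    var1, var2: estimated variances of those correlations.
    """
    denom = sqrt(var1 + var2)
    # Conserved: large when both correlations are strong with the same sign.
    c = abs(rho1 + rho2) / denom
    # Specific: large when the correlation is strong in only one condition.
    s = abs(abs(rho1) - abs(rho2)) / denom
    # Differentiated: large when both are strong but have opposite signs.
    d = (abs(rho1) + abs(rho2) - abs(rho1 + rho2)) / denom
    return c, s, d
```

With these assumed forms, a pair with ρ1 = ρ2 = 0.9 gets a nonzero C and zero S and D, while ρ1 = 0.9, ρ2 = −0.9 gets a nonzero D and zero C and S. For ρ1 = 0.9, ρ2 = 0, C and S come out equal, so the scores would need to be thresholded separately rather than compared directly.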
Summary overview and characterization of differential co-expression methods.
We characterize the presented methods by 5 tests: 1. Detects loss of co-expression. 2. Detects sign change. 3. Differentiates loss of co-expression from sign change. 4. Differentiates sign change and conservation. 5. Integrates conserved co-expression.
| Method | Operating principle | Focus | 1. | 2. | 3. | 4. | 5. | Main output |
|---|---|---|---|---|---|---|---|---|
| CSD | Direct score | Link | Yes | Yes | Yes | Yes | Yes | Full network |
| DCGL (DCe) [ | Direct score | Link | Yes | Yes | Yes | Yes | No | Gene rankings, full differential network |
| DCGL (DCp) [ | Direct score | Gene | Yes | Yes | No | Yes | No | Gene rankings |
| DiffCoEx [ | Direct score | Link | Yes | Yes | No | Yes | No | Network modules |
| BMHT [ | Direct score | Link | Yes | Yes | No | Yes | No | Network cliques, gene rankings |
| Choi (2005) [ | Network comparison | Link | Yes | No | Yes | No | No | Full network, network clusters |
| Reverter (2006) [ | Network comparison | Gene | Yes | No | Yes | No | No | Gene rankings |
| DICER [ | Direct score | Link | Yes | Yes | No | Yes | No | Network modules |
| Gao (2013) [ | Direct score | Link | Yes | Yes | No | Yes | No | Full differential network |
| DiffCorr [ | Direct score, network comparison | Link, Module | Yes | Yes | No | Yes | No | Full differential network, differential clusters |
Fig 2. Receiver operating characteristic (ROC) curves for differential gene co-expression scores.
ROC curves for the C, S and D-scores, and equivalent scores in the DCe method averaged over 20 independent simulations. The dashed black diagonal corresponds to Sensitivity = 1 − Specificity. Note that the DCe curves do not extend across the whole range, since DCe classifies genes detected as differentially co-expressed into either S-equivalent or D-equivalent, depending on sign change in the underlying correlation. Since a gene pair may only belong to one category in DCe, it is not possible to relax test requirements in such a way that one category contains all gene pairs. Notably, even under the most inclusive test requirements, the D-equivalent category can only contain on average ≈ 20% of gene pairs that show differently signed correlations between the two conditions.
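For readers unfamiliar with ROC curves, the sketch below shows how the (1 − Specificity, Sensitivity) points can be computed from a list of scores and known true/false labels by sweeping a threshold from the highest score downwards. It is a generic illustration, not the evaluation code used for the figure; the function names `roc_points` and `auc` are chosen here for convenience.

```python
def roc_points(scores, labels):
    """Return (1 - specificity, sensitivity) points, one per distinct threshold.

    labels are 1 for true positives in the ground truth and 0 otherwise.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    for i, (score, label) in enumerate(ranked):
        if label:
            tp += 1
        else:
            fp += 1
        # Emit a point only after the last item of a run of tied scores.
        if i + 1 == len(ranked) or ranked[i + 1][0] != score:
            points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the curve by the trapezoidal rule."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )
```

A score that cannot separate the classes at all produces the diagonal Sensitivity = 1 − Specificity, with an area of 0.5.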
Fig 3. Overview of the CSD network.
Visualization of the aggregate CSD-type network generated using a sample size of L = 105. Triangular nodes indicate transcription factors. Prominent hubs (nodes with more than 40 neighbors) are colored black, enlarged and labeled for emphasis. Edges are colored by type: blue is C-type, green is S-type, red is D-type.
Fig 4. Node homogeneity and mixing of interactions.
a) Box plot of gene homogeneity scores H according to node degree. Red bars denote the median H for nodes of the specified degree, and red squares denote the mean. Bottom and top ends of the boxes represent the first and third quartiles, respectively. The ends of the whiskers correspond to the min/max values of H at that degree. b) Ternary heatmap detailing the fractions of specified interactions $k_j/k$ with j ∈ {C, S, D} per gene. Corners correspond to homogeneous nodes, i.e. nodes with only one type of interaction. The sides correspond to nodes with two types of interactions (scale is fraction × 10), e.g. $k_D = 0$ along the side marked D. The blue cross is an aid, with coordinates (C ∼ 60%, S ∼ 30%, D ∼ 10%). c) Venn diagram showing the relative quantities of genes involved in each type of interaction.
Network hubs for each type of interaction.
k denotes node degree (total number of connections), while $k_C$, $k_S$ and $k_D$ denote the number of connections of each type ($k_C + k_S + k_D = k$). H denotes node homogeneity, as defined in Eq 5.
| Gene | k | k_C | k_S | k_D | H |
|---|---|---|---|---|---|
| Top 5 C | | | | | |
| UBQLN1 | 23 | 22 | 1 | 0 | 0.92 |
| Top 5 S | | | | | |
| GPR101 | 45 | 0 | 45 | 0 | 1.0 |
| Top 5 D | | | | | |
| FOXO1 | 240 | 0 | 3 | 237 | 0.98 |
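Eq 5 is not reproduced in this excerpt. The sketch below assumes that the homogeneity H of a node is the sum of the squared fractions of its connections of each type. This is an assumption, chosen because it reproduces the tabulated H values for UBQLN1, GPR101 and FOXO1 when rounded to two decimals. The function name `homogeneity` is illustrative.

```python
def homogeneity(k_c, k_s, k_d):
    """Assumed node homogeneity: sum over types of (k_j / k) squared.

    Equals 1.0 when all connections are of one type and approaches 1/3
    when the three types are evenly mixed.
    """
    k = k_c + k_s + k_d
    return sum((k_j / k) ** 2 for k_j in (k_c, k_s, k_d))


# Rows from the hub table: (gene, k_C, k_S, k_D)
for gene, k_c, k_s, k_d in [
    ("UBQLN1", 22, 1, 0),
    ("GPR101", 0, 45, 0),
    ("FOXO1", 0, 3, 237),
]:
    print(gene, round(homogeneity(k_c, k_s, k_d), 2))
```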
Fig 5. Robustness of network topology.
a) Degree assortativity for the consolidated comparative gene co-expression network, generated for different importance levels. Lines denote the difference between maximum k-core in the empirical network at the selected threshold (arrow tip) and mean degree assortativity across 100 random networks with degree distributions similar to the empirical network (arrow tail). b) Similar to a), but for networks consisting of interactions of each individual type (C, S, and D). Empirical C-networks show positive assortativity as well as higher than random k-core, indicating an affinity between tightly connected nodes and a “rich club”-structure, while the empirical S- and D-networks show negative assortativity and lower than random k-core, indicating a hub-and-spoke type network structure.
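The two topology measures named in this caption can be computed directly from an edge list. The sketch below uses the standard definitions: degree assortativity as the Pearson correlation between the degrees at the two ends of each edge, and the maximum k-core found by repeatedly peeling low-degree nodes. It assumes a simple undirected graph (no self-loops or duplicate edges) and does not include the comparison against degree-preserving random networks. Function names are illustrative.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Map each node to the set of its neighbors."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def degree_assortativity(edges):
    """Pearson correlation between the degrees at the two ends of each edge."""
    adj = build_adjacency(edges)
    # Count each undirected edge in both directions so the measure is symmetric.
    pairs = [(len(adj[u]), len(adj[v])) for u, v in edges]
    pairs += [(b, a) for a, b in pairs]
    mean_deg = sum(a for a, _ in pairs) / len(pairs)
    cov = sum((a - mean_deg) * (b - mean_deg) for a, b in pairs)
    var = sum((a - mean_deg) ** 2 for a, _ in pairs)
    return cov / var  # raises ZeroDivisionError if all degrees are equal


def max_k_core(edges):
    """Largest k for which a non-empty subgraph with all degrees >= k exists."""
    adj = {node: set(nbrs) for node, nbrs in build_adjacency(edges).items()}
    k = 0
    while adj:
        k += 1
        # Repeatedly peel nodes whose remaining degree is below k.
        low = [n for n, nbrs in adj.items() if len(nbrs) < k]
        while low:
            for n in low:
                for m in adj.pop(n):
                    if m in adj:
                        adj[m].discard(n)
            low = [n for n, nbrs in adj.items() if len(nbrs) < k]
    return k - 1
```

A star graph illustrates the hub-and-spoke extreme: its assortativity is −1 and its maximum k-core is 1.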
Gene pairs directly connected in both the PPI and CSD networks.
| Gene A | Gene B | Type of CSD interaction |
|---|---|---|
| C1QA | C1QB | C |
| CARHSP1 | PNMA1 | D |
| CD74 | HLA-DRA | C |
| HCK | WAS | C |
| HERC3 | UBQLN2 | C |
| RPS11 | RPS3 | C |
| S100A8 | S100A9 | C |
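A table of this kind can be produced by intersecting the two edge sets while ignoring edge direction. The sketch below is a generic illustration using placeholder gene names, not the authors' pipeline; `shared_edges` and the input formats are assumptions.

```python
def shared_edges(ppi_edges, csd_edges):
    """Return sorted (gene_a, gene_b, csd_type) for pairs present in both networks.

    ppi_edges: iterable of (gene, gene) pairs.
    csd_edges: iterable of (gene, gene, type) triples, type in {"C", "S", "D"}.
    """
    # frozenset makes ("g1", "g2") and ("g2", "g1") the same undirected edge.
    csd_types = {frozenset((a, b)): kind for a, b, kind in csd_edges}
    ppi = {frozenset(pair) for pair in ppi_edges}
    return sorted(
        tuple(sorted(pair)) + (csd_types[pair],)
        for pair in ppi
        if pair in csd_types
    )


# Toy input with placeholder names (not real data).
ppi = [("g2", "g1"), ("g3", "g4")]
csd = [("g1", "g2", "C"), ("g4", "g5", "D")]
print(shared_edges(ppi, csd))
```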
Fig 6. Network hubs and glioma associations.
Neighborhood of the glioma-associated hubs FOXO1, CARHSP1 and PBX3. Every represented gene connects to at least one of the hubs. Non-hub genes are grouped according to the hubs they connect to (and by interaction type), as well as regulation in glioma. Transcription factors are denoted by triangular labels, other genes by circles. Purple nodes represent genes whose activity is positively linked to harmful outcomes in glioma, while the activity of yellow nodes is linked to more benign outcomes. White nodes represent genes without established links between activity and glioma. Red links are D-type connections, while green ones are S-type. There are no D-type connections linking two non-hub genes to each other.
Glioma-associated nodes in the neighborhood of FOXO1, CARHSP1 and PBX3.
Associations between gene activity and glioma (as found in literature) are divided into two general groups: Positive associations denote any gene where increased expression is generally linked to harmful outcomes for the patient. This includes genes which are overexpressed in glioma as opposed to healthy tissue, where increased expression in glioma is correlated with higher mortality, or where the gene is more highly expressed in higher grade gliomas. Negative associations denote genes where higher expression is generally linked to beneficial outcomes for the patient.
| Neighboring hub(s) | Positive | Negative |
|---|---|---|
| FOXO1 | MBTPS1 [ | DCTN2 [ |
| CARHSP1 | MADD [ | ST6GALNAC5 [ |
| PBX3 | BEX1 [ | |
| FOXO1 + CARHSP1 | PAK1 [ | RNF41 [ |
| FOXO1 + PBX3 | VIPR1 [ | SNRPN [ |
| CARHSP1 + PBX3 | SULT4A1 [ | |
| FOXO1 + CARHSP1 + PBX3 | FBXO16 [ | NDRG4 [ |