Ekaterina Kotelnikova, Maria A Shkrob, Mikhail A Pyatnitskiy, Alessandra Ferlini, Nikolai Daraselia.
Abstract
Elucidation of new biomarkers and potential drug targets from high-throughput profiling data is a challenging task due to the limited number of available biological samples and the questionable reproducibility of differential changes in cross-dataset comparisons. In this paper we propose a novel computational approach for drug target and biomarker discovery using comprehensive analysis of multiple expression profiling datasets. The new method relies on aggregation of individual profiling experiments combined with a leave-one-dataset-out validation approach. Aggregated datasets were studied using the Sub-Network Enrichment Analysis (SNEA) algorithm to find consistent, statistically significant key regulators within the global literature-extracted expression regulation network. These regulators were linked to the consistent differentially expressed genes. We have applied our approach to several publicly available human muscle gene expression profiling datasets related to Duchenne muscular dystrophy (DMD). In order to detect both enhanced and repressed processes, we considered up- and down-regulated genes separately. Applying the proposed approach to the search for regulators, we discovered disturbances in the activity of several muscle-related transcription factors (e.g. MYOG and MYOD1) and of regulators of inflammation, regeneration, and fibrosis. Almost all SNEA-derived regulators of down-regulated genes (e.g. AMPK, TORC2, PPARGC1A) correspond to a single common pathway important for the fast-to-slow twitch fiber type transition. We hypothesize that this process can affect the severity of DMD symptoms, making the corresponding regulators and downstream genes valuable candidates for potential drug targets and exploratory biomarkers.
Year: 2012 PMID: 22319435 PMCID: PMC3271016 DOI: 10.1371/journal.pcbi.1002365
Source DB: PubMed Journal: PLoS Comput Biol ISSN: 1553-734X Impact factor: 4.475
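The abstract describes using SNEA to find regulators whose downstream expression targets change consistently. A minimal sketch of that idea follows, assuming the commonly described formulation in which the scores of a regulator's targets are compared with the scores of all other genes by a one-sided Mann-Whitney U test. The function names, the `min_targets` cutoff, the regulator and gene identifiers, and the scores are all hypothetical, and the normal approximation used here omits the tie correction; it is not the authors' implementation.

```python
import math

def mann_whitney_p(sample, rest):
    """One-sided p-value (normal approximation, no tie correction)
    for the hypothesis that values in `sample` tend to exceed `rest`."""
    pooled = sorted([(v, 0) for v in sample] + [(v, 1) for v in rest])
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        # Find the run of tied values and give each the average rank.
        j = i
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1
    n1, n2 = len(sample), len(rest)
    rank_sum = sum(r for r, (_, grp) in zip(ranks, pooled) if grp == 0)
    u1 = rank_sum - n1 * (n1 + 1) / 2
    mean = n1 * n2 / 2
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u1 - mean) / sd
    return 0.5 * math.erfc(z / math.sqrt(2))

def snea_like(network, scores, min_targets=3):
    """Rank regulators by how strongly their targets' scores are shifted upward.

    network: dict mapping regulator -> set of target gene ids (hypothetical)
    scores:  dict mapping gene id -> differential expression score
    """
    results = {}
    for regulator, targets in network.items():
        inside = [scores[g] for g in targets if g in scores]
        outside = [s for g, s in scores.items() if g not in targets]
        if len(inside) < min_targets or not outside:
            continue  # too few measured targets to test
        results[regulator] = mann_whitney_p(inside, outside)
    return sorted(results.items(), key=lambda kv: kv[1])

# Toy, made-up data: REG_A's targets carry the highest scores.
network = {"REG_A": {"g1", "g2", "g3"}, "REG_B": {"g4", "g5", "g6"}}
scores = {"g1": 3.0, "g2": 2.5, "g3": 2.8, "g4": 0.1,
          "g5": -0.2, "g6": 0.0, "g7": 0.3, "g8": -0.1}
ranked = snea_like(network, scores)
```

To look for regulators of down-regulated genes, as the abstract does separately, the same sketch could be run on negated scores.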
Figure 1. Overall workflow of the analysis.
See corresponding section for detailed description.
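The leave-one-dataset-out validation named in the abstract can be illustrated with a small sketch. It assumes, purely for illustration, that a gene counts as "consistent" when its log-ratio has the same sign in every retained dataset, and that validation means checking that sign in the held-out dataset. The function names, dataset labels, gene ids, and values are hypothetical; the paper's actual aggregation procedure may differ.

```python
def consistent_genes(datasets, direction):
    """Genes whose log-ratio has the given sign (+1 up, -1 down) in every dataset."""
    common = set.intersection(*(set(d) for d in datasets))
    return {g for g in common if all(d[g] * direction > 0 for d in datasets)}

def leave_one_dataset_out(datasets, direction=1):
    """For each held-out dataset, report the fraction of genes that are
    consistent in the remaining datasets and confirmed in the held-out one."""
    report = {}
    for held_out in datasets:
        training = [d for name, d in datasets.items() if name != held_out]
        candidates = consistent_genes(training, direction)
        test = datasets[held_out]
        confirmed = {g for g in candidates if test.get(g, 0) * direction > 0}
        report[held_out] = len(confirmed) / len(candidates) if candidates else 0.0
    return report

# Toy, made-up log-ratios (disease vs. healthy) for three datasets.
datasets = {
    "A": {"g1": 1.2, "g2": -0.5, "g3": 0.8},
    "B": {"g1": 0.9, "g2": -1.1, "g3": -0.2},
    "C": {"g1": 1.5, "g2": -0.7, "g3": 0.4},
}
up_report = leave_one_dataset_out(datasets, direction=1)
# up_report == {"A": 1.0, "B": 0.5, "C": 1.0}
```

Holding out dataset B lowers the confirmed fraction because g3 is up-regulated in A and C but not in B, which is the kind of cross-dataset inconsistency the validation is meant to expose.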
Consistent regulators of differentially expressed genes, plus regulators from SNEA of the reference dataset.
| Function | Regulation | Regulators |
| Transcription factors | positive, negative | AML1-ETO, CCAAT factors, CIITA, CTCF, ESRRA, FOXI1, ING4, MEF2C, MYOD1, MYOG, NF-kB, NR1H2, NR1H4, NR4A2, PPARD, PPARGC1A, RUNX1, RUNX2, SCXA, SMAD, SMAD7, SREBF1, SREBF2, STAT1, TWIST1, ZEB1, ZFHX3 |
| Cytokines and cytokine receptors | positive, negative | ADIPOQ, BMP2, CSF1, CSH1, CTGF, GCG, GH1, IFNG, IL13, IL4, IL6, IL6R, INS, LEP, PTH, TGF family, TGFB1, TGFB2, TNFRSF11B |
| Growth factors | positive | AGT, BMP2, CSF1, CTGF, FGF2, GF, IL4, IL6, TGF family, TGFB1, TGFB2 |
| Hormones | positive, negative | ADIPOQ, AGT, CSH1, GCG, GH1, INS, LEP, PTH |
| MAPK | positive | MAPK, MAPK3 |
| Extracellular matrix | positive | collagen type I, vitronectin |
| Inflammation and immune response | positive | allergen, CAMP, CCL2, CCR7, CIITA, CMA1, CXCL2, IFNG, IL13, IL4, IL6, IL6R, IRF1, NF-kB, STAT1, TGF family, TGFB1, TGFB2, TNFRSF11B, ZEB1 |
| Regulation of metabolic processes | negative | ADIPOQ, ADRB3, AMPK, GCG, INS, LEP, NR1H2, PPARD, PPARGC1A, PRKAA2, SREBF1, SREBF2, TORC2, UCP2 |
| TGFB-SMAD pathway | positive | BMP2, SMAD, SMAD7, TGF family, TGFB1, TGFB2 |
| Muscle-specific factors | negative | MEF2C, muscle fiber, MYOD1, MYOG |
| Cell cycle | positive | CDKN1B, CTCF, ING4, SCXA |
| IFNG signaling | positive | IFNG, IRF1, STAT1 |
| Renin-angiotensin system | positive | AGT, angiotensin II receptor |
| Chromatin modification | positive, negative | HDAC1, histone deacetylase inhibitor |
| Other | positive | alkaline phosphohydrolase, GJA1, LPL, MIRN29C, RHOA |
Figure 2. Regulators of down-regulated genes.
Most SNEA-derived regulators of down-regulated genes regulate processes related to myotube formation, the fast-to-slow fiber type switch (including changes in myofiber composition, mitochondrial content and insulin sensitivity) and metabolic changes in DMD-affected muscles. Relations are described in the text. The catalytic subunit of AMPK, PRKAA2, is shown next to AMPK. Functional class - a class of proteins, such as an enzyme family. Complex - a group of two or more proteins linked by non-covalent protein-protein interactions. Expression - protein members of one class regulate expression of proteins in another class. DirectRegulation - protein members of one class bind and regulate proteins in another class. Regulation - protein members of one class indirectly regulate proteins in another class. ProteinModification - protein members of the regulator class phosphorylate or otherwise modify proteins in the target class. PromoterBinding - protein members of one class bind promoters of genes encoding proteins in another class.
Figure 3. Fraction of common genes in top-k rankings for different types of gene expression.
For each of the six datasets and for each type of regulation, a gene ranking procedure was performed and the overlap between the six top-k lists was calculated. The fraction of common genes in the top-k lists reaches saturation at k of roughly 500; adding more genes does not increase the overlap between the six rankings.
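The top-k overlap described in this caption can be sketched as follows. The sketch assumes the fraction is defined as the size of the intersection of all top-k lists divided by k, which the caption does not state explicitly; the function names and the toy rankings are hypothetical.

```python
def common_fraction(rankings, k):
    """Fraction of genes shared by the top-k entries of every ranking."""
    top_sets = [set(ranking[:k]) for ranking in rankings]
    return len(set.intersection(*top_sets)) / k

def overlap_curve(rankings, ks):
    """Overlap fraction for each cutoff k, e.g. to look for a saturation point."""
    return [common_fraction(rankings, k) for k in ks]

# Toy, made-up rankings from three datasets (best gene first).
rankings = [
    ["a", "b", "c", "d"],
    ["b", "a", "d", "c"],
    ["a", "c", "b", "d"],
]
curve = overlap_curve(rankings, ks=[1, 2, 3, 4])
# curve == [0.0, 0.5, 2/3, 1.0]
```

On real data one would plot such a curve against k and pick the cutoff where it flattens, which the caption reports to be near k = 500.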
Gene Ontology groups enriched by consistent differentially expressed genes.
| GO Process | Number of genes | p-value |
| | | |
| cell adhesion | 23 | 1.92E-09 |
| immune response | 20 | 6.8E-08 |
| proteolysis | 13 | 0.00193 |
| apoptosis | 12 | 0.001376 |
| negative regulation of cell proliferation | 10 | 0.00025 |
| inflammatory response | 10 | 0.000111 |
| cell motion | 10 | 3.76E-08 |
| heart development | 9 | 1.08E-05 |
| skeletal system development | 9 | 2.48E-06 |
| wound healing | 9 | 3.45E-09 |
| | | |
| carbohydrate metabolic process | 16 | 2.94E-11 |
| metabolic process | 14 | 0.000619 |
| oxidation reduction | 12 | 0.00102 |
| modification-dependent protein catabolic process | 11 | 0.000211 |
| glycogen metabolic process | 9 | 2.16E-12 |
| muscle contraction | 8 | 1.01E-07 |
| response to hypoxia | 7 | 0.000119 |
| electron transport chain | 7 | 6.87E-06 |
| nervous system development | 6 | 0.040331 |
| response to drug | 6 | 0.00847 |
Biological processes from Gene Ontology associated with the consistently differentially expressed genes were found by applying the “Find groups enriched with selected entities” tool embedded in Ariadne Pathway Studio to the list of 431 genes. The resulting significant (p-value < 0.05) biological processes were sorted by the number of genes involved in each process. The top 10 processes are shown.
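The filtering and sorting step described here can be illustrated with a small sketch. It assumes the enrichment p-value is a one-sided hypergeometric tail probability, a common choice for this kind of test; the exact statistic used by the Pathway Studio tool is not stated in this text. The function names, group labels, universe size, and group sizes in the example are hypothetical.

```python
from math import comb

def enrichment_p(n_universe, n_in_group, n_selected, n_overlap):
    """P(X >= n_overlap) when n_selected genes are drawn without replacement
    from n_universe genes, of which n_in_group belong to the GO group."""
    upper = min(n_in_group, n_selected)
    total = comb(n_universe, n_selected)
    tail = sum(
        comb(n_in_group, i) * comb(n_universe - n_in_group, n_selected - i)
        for i in range(n_overlap, upper + 1)
    )
    return tail / total

def top_processes(groups, n_universe, n_selected, alpha=0.05, top=10):
    """Keep groups with p < alpha, sort by overlap size (largest first)."""
    rows = []
    for name, (n_in_group, n_overlap) in groups.items():
        p = enrichment_p(n_universe, n_in_group, n_selected, n_overlap)
        if p < alpha:
            rows.append((name, n_overlap, p))
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows[:top]

# Small exact check: 10 genes, 4 in the group, 3 selected, 2 of them in the group.
p_small = enrichment_p(10, 4, 3, 2)  # (36 + 4) / 120 = 1/3

# Made-up example: group sizes and universe size are placeholders,
# only the selected-list size (431) comes from the text.
groups = {"process X": (200, 23), "process Y": (300, 7)}
table = top_processes(groups, n_universe=20000, n_selected=431)
```

Sorting by gene count rather than by p-value, as the text describes, explains why the p-values in the table above are not monotonic within each block.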
GEO datasets used for the meta-analysis.
| GEO ID | Platform | Description | Source | Reference |
| GDS 214 | custom Affymetrix | 4 healthy, 26 DMD | Muscle | |
| GDS 563 | Affymetrix U95A | 11 healthy, 12 DMD | Quadriceps Muscle | |
| GDS 1956 | Affymetrix U133A | 18 healthy, 10 DMD | Muscle | |
| GDS 2855 | Affymetrix U133B | 20 healthy, 10 DMD | Muscle | |
| GDS 3027 | Affymetrix U133A | 14 healthy, 23 DMD | Quadriceps Muscle | |