Danning He, Zhi-Ping Liu, Luonan Chen.
Abstract
BACKGROUND: The incidence of congenital heart disease (CHD) among live-born infants is continuously increasing, making it one of the leading causes of infant morbidity worldwide. Various studies suggest that both genetic and environmental factors lead to CHD, and therefore identifying its candidate genes and disease markers has been one of the central topics in CHD research. Using recently available high-throughput genomic data on CHD, network-based methods provide powerful alternatives for the systematic analysis of complex diseases and the identification of dysfunctional modules and candidate disease genes.
Year: 2011 PMID: 22136190 PMCID: PMC3256240 DOI: 10.1186/1471-2164-12-592
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Figure 1. Method Overview. Triangles represent disease genes, rectangles represent target genes and grey circles represent other genes on the PPI network. (A) We overlaid expression profiles onto the PPI network, and mapped putative disease genes and differentially expressed target genes onto this weighted PPI network. (B) For each disease gene, we found its shortest paths to all target genes. (C) We computed the information flow for each node in a given subnetwork; genes participating in several subnetworks can have several information scores. (D) We identified modules by assigning each gene to the module in which its information score is highest.
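Steps (B) and (D) of the overview can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes edge weights act as costs (smaller means a stronger link), uses plain Dijkstra for the shortest paths, and takes the per-module information scores as given, since the caption does not say how information flow is computed. The gene names `D1`, `T1`, `A`, `B` and the function names are hypothetical.

```python
import heapq

def shortest_paths(graph, source):
    """Dijkstra over a dict-of-dicts graph; returns (dist, prev)."""
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale queue entry, a shorter route was already found
        for v, w in graph.get(u, {}).items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    return dist, prev

def path_to(prev, source, target):
    """Rebuild the source-to-target path, or None if target is unreachable."""
    if target != source and target not in prev:
        return None
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1]

def assign_modules(scores):
    """scores: {gene: {module: information score}} -> {gene: best module}."""
    return {gene: max(by_module, key=by_module.get)
            for gene, by_module in scores.items()}

# Toy weighted network (assumed costs): D1 is a disease gene, T1 a target gene.
toy_graph = {
    "D1": {"A": 1.0, "B": 4.0},
    "A": {"D1": 1.0, "B": 1.0, "T1": 5.0},
    "B": {"D1": 4.0, "A": 1.0, "T1": 1.0},
    "T1": {"A": 5.0, "B": 1.0},
}
dist, prev = shortest_paths(toy_graph, "D1")
print(path_to(prev, "D1", "T1"), dist["T1"])  # ['D1', 'A', 'B', 'T1'] 3.0
```

In a full pipeline, the union of such paths for one disease gene would form its subnetwork, and `assign_modules` would resolve genes that appear in several subnetworks.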
Figure 2. CHD subnetwork and representative modules. (A) CHD subnetwork extracted from the weighted PPI network. Different colors represent different modules. All modules with fewer than 10 genes are shown in the same green color. (B) Pie chart of module size, where modules with fewer than 10 genes are grouped into the category "others". (C-E) Three representative modules. Dark red diamonds represent putative disease genes, and red rectangles represent final candidate genes identified by our method.
Module information.
| Module | Size | Interaction score | Mutual Information (empirical p-value) | Correlation (p-value) | CHD genes in this module |
|---|---|---|---|---|---|
| M1 | 25 | 2.94E-01 | 2.80E-02 | 1.80E-03 | GATA6 |
| M2 | 66 | 4.33E-01 | 8.00E-03 | 3.62E-03 | ELN |
| M3 | 61 | 2.83E-01 | 2.00E-03 | 9.82E-01 | ACVR2B |
| M4 | 14 | 8.50E-02 | 2.70E-02 | 7.02E-02 | MYH11 |
| M5 | 46 | 1.47E-01 | 2.00E-03 | 1.67E-02 | MYH7 |
| M6 | 27 | 4.23E-01 | 1.80E-02 | 3.36E-03 | CITED2 |
| M7 | 28 | 3.44E-01 | 8.39E-01 | 5.76E-01 | FLNA |
| M8 | 40 | 2.25E-01 | <1.00E-06 | 3.66E-02 | MYBPC3 |
| M9 | 18 | 9.94E-02 | 7.40E-02 | 1.02E-02 | GATA4, TBX5 |
| M10 | 64 | 5.75E-01 | <1.00E-06 | 8.95E-03 | ACTC1 |
| M11 | 41 | 4.32E-01 | 2.30E-02 | 4.80E-01 | NKX2-5 |
| M12 | 46 | 4.51E-01 | 2.00E-03 | 1.81E-05 | NOTCH1, JAG1 |
Size is the number of genes in each module. Interaction score evaluates the topological relations between modules: hub modules have higher scores, while peripheral modules have lower scores. Mutual Information evaluates the significance of synergistic differential expression within a module compared with 1000 random gene sets of the same size. Correlation measures the significance of the coexpression correlation between interacting partners within a module compared with 1000 random edge lists of the same size.
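The comparison against 1000 random gene sets can be illustrated as an empirical p-value. The sketch below is an assumption-laden stand-in: the statistic is a simple mean over per-gene scores rather than the mutual information used in the table, and the p-value is taken as the plain fraction of random sets that score at least as high as the observed module. The function name and the toy pool are hypothetical.

```python
import random
from statistics import mean

def empirical_p_value(observed, pool, set_size, statistic=mean,
                      n_random=1000, seed=0):
    """Fraction of random same-size sets whose statistic is >= observed."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    hits = 0
    for _ in range(n_random):
        sample = rng.sample(pool, set_size)  # draw without replacement
        if statistic(sample) >= observed:
            hits += 1
    return hits / n_random

# Toy pool of per-gene scores (made-up values, for illustration only).
toy_pool = [float(i) for i in range(50)]
p = empirical_p_value(observed=45.0, pool=toy_pool, set_size=5)
print(p)
```

With 1000 random sets the smallest non-zero value this estimator can return is 0.001, so smaller p-values would need either more random sets or a fitted null distribution.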
Figure 3. Synergistic differential expression of modules compared with random gene sets of the same size.
Figure 4. Within- and cross-validation performance comparison of modules and pathways. (A) Within-dataset classification evaluation. The X-axis corresponds to the number of features added to the logistic classifier. Assuming that there are K features (modules, pathways or random gene sets), for each k ≤ K the first k features are selected to train the classifier. The final classification performance is reported as the AUC on the testing set, using the classifier optimized on the validation set in a five-fold cross-validation. Modules were ordered by decreasing significance of MI, pathways were ordered by decreasing significance of enrichment, and random gene sets were placed in the same order as the modules they were compared with. (B) In cross-dataset classification evaluation, all 12 features were trained on GSE26125 and validated on 12 disease samples from GSE14970 combined with 5 controls from GSE26125.
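Two pieces of this evaluation are easy to make concrete: the nested feature sets (the first k of K ordered features) and the AUC itself. The sketch below leaves out the logistic classifier and the five-fold split, and computes AUC as the fraction of disease/control pairs ranked correctly, with ties counted as half. The function names are hypothetical and the labels and scores are made up.

```python
def feature_prefixes(ordered_features):
    """Yield the first k features for k = 1..K, in the given order."""
    for k in range(1, len(ordered_features) + 1):
        yield ordered_features[:k]

def auc(labels, scores):
    """Area under the ROC curve from binary labels (1 = disease) and scores."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    # Count correctly ordered pairs; a tie contributes half a pair.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

print(list(feature_prefixes(["M8", "M10", "M3"])))
print(auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

Each prefix would be passed to the classifier in turn, and the resulting test-set AUC plotted against k.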
Figure 5. Inter-module coexpression and intra-module coordination. (A) To analyze coexpression within modules, we computed the average Pearson's correlation coefficient (PCC) over all edges in each module and compared it with that of 1000 random control edge sets of the same size. (B) Module interaction identification was based on the weighted protein-protein interactions between modules. Edge width corresponds to the absolute value of the PCC of the two end nodes. Colored edges represent cross-module interactions, which are used to compute module interaction, while grey edges are interactions within modules. (C) Intra-module coordination. Node colors, ranging from green to red, correspond to the significance, -log(p-value), of inter-module coexpression; node size corresponds to module size; and edge width corresponds to the strength of interactions.
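The edge-level quantity in panel (A) is the mean PCC between the expression profiles of the two genes on each edge. The following sketch computes it with the standard library only. It uses the signed PCC for the average, which is an assumption (the caption mentions absolute values only for edge width), and the gene names and expression values are invented for illustration.

```python
from math import sqrt

def pearson(x, y):
    """Pearson's correlation coefficient of two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a constant profile has no defined correlation; treat as 0
    return cov / sqrt(vx * vy)

def mean_edge_pcc(edges, expr):
    """Average PCC over a list of (gene_u, gene_v) edges."""
    return sum(pearson(expr[u], expr[v]) for u, v in edges) / len(edges)

# Toy expression profiles across four samples (made-up values).
toy_expr = {
    "g1": [1.0, 2.0, 3.0, 4.0],
    "g2": [2.0, 4.0, 6.0, 8.0],
    "g3": [4.0, 3.0, 2.0, 1.0],
}
print(mean_edge_pcc([("g1", "g2")], toy_expr))
```

Comparing this value with the same average over randomly drawn edge sets of equal size would give the significance shown in the figure.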
Enriched GO terms in each module.
| Module | GO ID | p-value | GO description | CHD genes in this module and with this GO term | FDR |
|---|---|---|---|---|---|
| M1 | GO:0019219 | 8.00E-10 | regulation of nucleobase, nucleoside, nucleotide and nucleic acid metabolic process | GATA6 | 1.90E-07 |
| M1 | GO:0051171 | 1.10E-09 | regulation of nitrogen compound metabolic process | GATA6 | 2.70E-07 |
| M1 | GO:0051716 | 4.00E-08 | cellular response to stimulus | GATA6 | 9.70E-06 |
| M1 | GO:0045941 | 3.20E-07 | positive regulation of transcription | GATA6 | 7.80E-05 |
| M3 | GO:0007178 | 7.30E-34 | transmembrane receptor protein serine/threonine kinase signaling pathway | ACVR2B | 2.70E-31 |
| M3 | GO:0007179 | 3.70E-23 | transforming growth factor beta receptor signaling pathway | | 1.40E-20 |
| M3 | GO:0032925 | 1.70E-12 | regulation of activin receptor signaling pathway | ACVR2B | 6.60E-10 |
| M3 | GO:0045597 | 1.10E-09 | positive regulation of cell differentiation | ACVR2B | 4.40E-07 |
| M3 | GO:0051239 | 5.60E-09 | regulation of multicellular organismal process | ACVR2B | 2.10E-06 |
| M3 | GO:0010646 | 1.60E-08 | regulation of cell communication | ACVR2B | 6.40E-06 |
| M4 | GO:0006468 | 1.20E-04 | protein amino acid phosphorylation | | 5.80E-03 |
| M5 | GO:0006259 | 7.60E-05 | DNA metabolic process | | 1.78E-02 |
| M6 | GO:0045941 | 4.50E-06 | positive regulation of transcription | CITED2 | 5.90E-04 |
| M6 | GO:0051254 | 9.80E-06 | positive regulation of RNA metabolic process | CITED2 | 1.20E-03 |
| M6 | GO:0045935 | 1.00E-05 | positive regulation of nucleobase, nucleoside, nucleotide and nucleic acid metabolic process | CITED2 | 1.30E-03 |
| M7 | GO:0035556 | 2.20E-04 | intracellular signal transduction | FLNA | 2.52E-02 |
| M8 | GO:0006936 | 1.90E-08 | muscle contraction | MYBPC3 | 3.00E-06 |
| M8 | GO:0003008 | 1.80E-06 | system process | MYBPC3 | 2.90E-04 |
| M8 | GO:0007010 | 8.20E-06 | cytoskeleton organization | | 1.30E-03 |
| M8 | GO:0030036 | 8.10E-05 | actin cytoskeleton organization | | 1.31E-02 |
| M9 | GO:0007154 | 9.40E-06 | cell communication | GATA4, TBX5 | 1.40E-03 |
| M9 | GO:0048545 | 7.50E-05 | response to steroid hormone stimulus | GATA4 | 1.15E-02 |
| M9 | GO:0006629 | 2.30E-04 | lipid metabolic process | | 3.57E-02 |
| M10 | GO:0022607 | 5.10E-06 | cellular component assembly | ACTC1 | 7.30E-04 |
| M10 | GO:0009653 | 9.30E-05 | anatomical structure morphogenesis | ACTC1 | 1.32E-02 |
| M11 | GO:0031334 | 2.20E-04 | positive regulation of protein complex assembly | NKX2-5, SRF | 4.42E-02 |
| M12 | GO:0043066 | 2.20E-12 | negative regulation of apoptosis | NOTCH1 | 6.20E-10 |
| M12 | GO:0045595 | 1.90E-10 | regulation of cell differentiation | JAG1, NOTCH1 | 5.50E-08 |
| M12 | GO:0051093 | 1.90E-09 | negative regulation of developmental process | JAG1, NOTCH1 | 5.40E-07 |
| M12 | GO:0045596 | 9.70E-09 | negative regulation of cell differentiation | JAG1, NOTCH1 | 2.70E-06 |
| M12 | GO:0007219 | 7.30E-08 | Notch signaling pathway | JAG1, NOTCH1 | 2.00E-05 |
CHD genes participate in most of the top enriched GO processes in each module, and the function of a module is very similar to that of the putative disease genes it contains. Only GO terms with FDR < 0.05 are shown.
Figure 6. Module-pathway crosstalk. In both subfigures, only pathways that contain at least one gene on the CHD subnetwork are shown. (A) Network view of module-pathway crosstalk. Blue circles represent modules and green rectangles represent pathways. Edge width corresponds to the strength of interaction. (B) Heatmap of module-pathway crosstalk. For clarity, the module-pathway activity matrix is x-scaled; that is, for each pathway, colors ranging from blue to red represent the influence of one particular module relative to all 12 modules.
Top 10 candidate disease genes and supporting evidence.
| Gene | Score | Description (GeneCards Version 3) | Supporting evidence | PubMed ID |
|---|---|---|---|---|
| HAND2 | 6.87 | heart- and neural crest derivatives-expressed protein 2, essential for cardiac morphogenesis, particularly for the formation of the right ventricle and of the aortic arch arteries | population study: various mutations found in TOF patients | 20819618 |
| FOS | 4.09 | Nuclear phosphoprotein which interacts with the JUN/AP-1 transcription factor. Has a critical function in regulating the development of cells destined to form and maintain the skeleton. | literature text-mining: cardiac hypertrophy | 16696897, 10328763, 12713689, 16259952 |
| NOTCH2 | 2.5 | Functions as a receptor for membrane-bound ligands Jagged1, Jagged2 and Delta1 to regulate cell-fate determination. | population study: various mutations found in CHD patients | 16773578 |
| MLLT4 | 1.72 | Belongs to an adhesion system which plays a role in the organization of cell-cell adherens junctions (AJs). Nectin- and actin-filament-binding protein that connects nectin to the actin cytoskeleton | similar functions to FLNA | see reference of supporting evidence |
| THBS1 | 1.7 | Adhesive glycoprotein that mediates cell-to-cell and cell-to-matrix interactions | genetic association database: myocardial infarct | 12482844, 16684956 |
| MAPK14 | 1.69 | Acts as an integration point for multiple biochemical signals, and is involved in a wide variety of cellular processes such as proliferation, differentiation, transcription regulation and development. | literature text-mining: pulmonary disease chronic obstructive | 19880675, 17959643, 20093202, 19004925 |
| ELK1 | 1.56 | Can form a ternary complex with the serum response factor and the ETS and SRF motifs of the fos serum response element | SRF | see reference of supporting evidence |
| MAPKAPK5 | 1.54 | similar to MAPK14 | ||
| DCN | 1.52 | This protein is a component of connective tissue, binds to type I collagen fibrils, and plays a role in matrix assembly. | literature text-mining: myocardial infarction, heart failure, congenital malformation, vascular diseases | 17558846, 9162605, 18514055, 9493904 |
| NUMB | 1.49 | Plays a role in the determination of cell fates during development. Associates with disease gene NOTCH1 | MGI database: targeted knock-out in mice affects the cardiovascular system | 11412999 |
Score is a candidate's GO semantic similarity with disease genes in the same module, and IEA GO terms are excluded. Description briefly introduces the candidate's CHD-related functions. Supporting evidence lists the types of literature support, and PubMed IDs of related articles are provided in the last column for reference.