Arda Halu, Manlio De Domenico, Alex Arenas, Amitabh Sharma.
Abstract
Untangling the complex interplay between phenotype and genotype is crucial to the effective characterization and subtyping of diseases. Here we build and analyze the multiplex network of 779 human diseases, which consists of a genotype-based layer and a phenotype-based layer. We show that diseases with common genetic constituents tend to share symptoms, and uncover how phenotype information helps boost genotype information. Moreover, we offer a flexible classification of diseases that considers their molecular underpinnings alongside their clinical manifestations. We detect cohesive groups of diseases that have high intra-group similarity at both the molecular and the phenotypic level. Inspecting these disease communities, we demonstrate the underlying pathways that connect diseases mechanistically. We observe monogenic disorders grouped together with complex diseases for which they increase the risk factor. We propose potentially new disease associations that arise as a unique feature of the information flow within and across the two layers.
Keywords: Computational biology and bioinformatics; Systems biology
Year: 2019 PMID: 31044086 PMCID: PMC6478736 DOI: 10.1038/s41540-019-0092-5
Source DB: PubMed Journal: NPJ Syst Biol Appl ISSN: 2056-7189
Fig. 1. The multiplex disease network. a Tripartite network of symptoms (green nodes on the left), diseases (pink nodes in the middle) and genes (blue nodes on the right). Symptoms and genes that are shared between diseases are shown in darker text. b Phenotype- and genotype-based disease-disease networks, where diseases are connected in the genotype layer (blue) if they share at least one gene and connected in the phenotype layer (green) if they share at least one symptom. The thickness of each edge is proportional to the number of common genes or symptoms. c The two networks are considered as layers of a multiplex system, where nodes are the diseases and colored links encode their interactions. Disease-disease interactions that are present in both layers are denoted “overlapping links.”
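The construction described in panels b and c can be illustrated with a minimal sketch. It is not the authors' pipeline: the disease, gene, and symptom names are hypothetical toy data, and `build_layer` is a helper name introduced here for illustration only. Two diseases are linked in a layer if they share at least one annotation, the edge weight is the number of shared annotations, and overlapping links are the disease pairs connected in both layers.

```python
from itertools import combinations


def build_layer(annotations):
    """Map each disease pair sharing >= 1 annotation to the shared count.

    `annotations` maps a disease name to a set of genes (or symptoms).
    """
    edges = {}
    for d1, d2 in combinations(sorted(annotations), 2):
        shared = annotations[d1] & annotations[d2]
        if shared:  # connect only if at least one item is shared
            edges[frozenset((d1, d2))] = len(shared)
    return edges


# Hypothetical toy annotations (not taken from the paper's data)
disease_genes = {
    "disease_A": {"G1", "G2"},
    "disease_B": {"G2", "G3"},
    "disease_C": {"G4"},
}
disease_symptoms = {
    "disease_A": {"S1", "S2"},
    "disease_B": {"S1", "S2"},
    "disease_C": {"S2"},
}

genotype_layer = build_layer(disease_genes)
phenotype_layer = build_layer(disease_symptoms)

# Overlapping links: disease pairs connected in both layers
overlapping_links = set(genotype_layer) & set(phenotype_layer)
```

In this toy example, only the pair of `disease_A` and `disease_B` is linked in both layers, so it is the single overlapping link.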
Fig. 2. Multiplex communities. An emblematic example of a multiplex community that bridges genotypic and phenotypic information to reveal new disease-disease interactions that would otherwise not be identified by standard analysis. In this case (Multiplex Community 15), there are no edges in common across the two layers (i.e., there is no phenotype interaction with a genetic explanation), and only two diseases are shared by the communities in the two layers, namely age-related macular degeneration and acute lymphocytic leukemia, because acute leukemia is associated with ocular comorbidity.
Fig. 3. Similarity assessment of multiplex disease communities. Radar plots of the 29 disease communities with size 10 or more, showing the −log P-values as the concentric circles. The molecular similarity represented by the North-South axis is for Gene Ontology: Biological Process (GO:BP) and Gene Overlap, whereas the phenotypic similarity represented by the East-West axis is for relative risk (RR) comorbidity and MimMiner (MM) phenotype semantic similarity. The larger the overall shaded area, the more significant the intra-community similarity. Points confined to the innermost circle (−log P-value < 2, or P-value > 0.01) represent non-significant intra-community similarities for the respective similarity measure.
Fig. 4. Anatomy of multiplex disease community 8. Disease community 8 is characterized by rare congenital heart defects and skeletal anomalies. a GO: Biological Process similarity heatmap, where similarity scores range between 0 and 1. b Relative risk (RR) comorbidity heatmap, where the colors represent the logarithm of the RR values. c Gene overlap, quantified by the Jaccard index. d MimMiner phenotype semantic similarity heatmap, where similarity scores range between 0 and 1. e PubMed literature co-occurrence heatmap and representative network, where the number in each cell denotes the number of publications in which the queried keywords co-occur. The color of each cell represents the Jaccard index; a redder cell means higher literature co-occurrence weighted by the total number of publications.
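For reference, the Jaccard index used in panels c and e is the size of the intersection of two sets divided by the size of their union. The sketch below shows the standard definition applied to gene sets; the function name and the gene labels are hypothetical, and returning 0.0 for two empty sets is a convention assumed here, not something stated in the source.

```python
def jaccard_index(a, b):
    """Return |a ∩ b| / |a ∪ b| for two collections of items."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        # Assumed convention: two empty sets have zero overlap
        return 0.0
    return len(a & b) / len(union)


# Hypothetical gene sets for two diseases: one shared gene out of three in total
overlap = jaccard_index({"G1", "G2"}, {"G2", "G3"})
```

The same function applies to panel e if each keyword is represented by the set of publications that mention it, which is one plausible reading of the caption.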
Fig. 5. Biological process similarity of diseases. Portion of the human protein-protein interaction (PPI) network depicting the localization of disease genes and genes related to relevant biological processes. Nodes represent protein-encoding genes and edges represent literature-documented physical interactions between them. Disease genes (denoted with the dashed ellipse areas) associated with ventricular septal defect (green nodes) and DiGeorge syndrome (cyan nodes) are connected by biological processes related to cardiac development (nodes related to outflow tract morphogenesis in blue; heart morphogenesis in red; heart development in yellow) and endocrine development (nodes related to thyroid gland development in magenta). Concentric circles around disease genes indicate overlap of disease genes with the biological process of that color.
Example multiplex disease communities where monogenic disorders were grouped together with the complex diseases for which they increase risk (Communities 12, 16, 21, and 22), and Mendelian diseases with severe phenotypes were found in smaller communities (Communities 31, 40, 73, and 82)
| Multiplex disease community # | Diseases |
|---|---|
| 12 | asphyxiating thoracic dystrophy, |
| 16 | |
| 21 | alcohol dependence, cerebrovascular disease, |
| 22 | |
| 31 | achondrogenesis type IB, |
| 40 | cardiofaciocutaneous syndrome, Coffin-Lowry syndrome, cutaneous porphyria, fragile X syndrome, non-syndromic X-linked intellectual disability, Rett syndrome, |
| 73 | bronchiectasis, Camurati-Engelmann disease, |
| 82 | glycine encephalopathy, |
The diseases mentioned in the Discussion are highlighted in bold