
Data and programs in support of network analysis of genes and their association with diseases.

Panagiota I Kontou¹, Athanasia Pavlopoulou¹, Niki L Dimou¹, Georgios A Pavlopoulos², Pantelis G Bagos¹.

Abstract

The network-based approaches employed to depict the relationships between human genetic diseases and their associated genes are described. To this end, monopartite disease-disease and gene-gene networks were constructed from bipartite gene-disease association networks. The latter were created by collecting and integrating data from three diverse resources, each with different content, ranging from rare monogenic disorders to common complex diseases. Moreover, topological and clustering graph analyses were performed. The methodology and the programs presented in this article are related to the research article entitled "Network analysis of genes and their association with diseases" [1].

Entities: Disease Species

Keywords: Disease-disease networks; Gene-disease associations; Gene-gene networks

Year: 2016 PMID: 27508260 PMCID: PMC4969244 DOI: 10.1016/j.dib.2016.07.022

Source DB: PubMed Journal: Data Brief ISSN: 2352-3409

Specifications Table

Value of the data

The need for integrating complementary data from different sources to biological networks is further highlighted in this study.

Important, previously unknown, associations between genes and diseases were revealed.

Based on the constructed disease-disease networks, diseases with apparently distinct phenotypic manifestations were found to share a common genetic background. This finding could be utilized in network pharmacology.

Data

The overall data analysis procedure is illustrated in Fig. 1, which also indicates the Perl (Supplementary Files 1-5) and R (Supplementary File 6) programs used. A complete description of the data and methodology is presented in [1].

Fig. 1. Flow diagram of the data analysis.

Experimental design, materials and methods

Data collection

Disease-gene association data were collected and integrated from three diverse, comprehensive, publicly available resources (NCBI's OMIM [2], NIH's GAD [3] and the NHGRI GWAS Catalog [4]). As a given disease can be associated with more than one gene, a script was written in Perl to separate the multiple entries (Supplementary File 1; separate.pl).
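As a rough illustration of the splitting step, the sketch below expands a record that lists several genes for one disease into one row per gene-disease pair. It is written in Python rather than Perl and is not the authors' separate.pl. The tab-separated input layout, the comma or semicolon gene delimiters, the function name `split_entries` and the two toy records are all assumptions made for the example.

```python
import re


def split_entries(lines, gene_sep=r"[,;]"):
    """Expand 'disease<TAB>gene1, gene2' records into (disease, gene) pairs.

    The input layout and the delimiter pattern are assumptions for this
    sketch, not the format used by the original databases.
    """
    pairs = []
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        disease, _, genes = line.partition("\t")
        for gene in re.split(gene_sep, genes):
            gene = gene.strip()
            if gene:  # ignore empty fields such as a trailing delimiter
                pairs.append((disease.strip(), gene))
    return pairs


# Toy input: one multi-gene record and one single-gene record.
records = ["Breast cancer\tBRCA1, BRCA2", "Cystic fibrosis\tCFTR"]
pairs = split_entries(records)
for disease, gene in pairs:
    print(f"{disease}\t{gene}")
```

Keeping one association per row makes the later steps (renaming, deduplication, network construction) simple edge-list operations.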

Disease and gene nomenclature

To maintain a consistent nomenclature and classification for diseases in our analysis, the naming conventions described in the International Classification of Diseases (ICD) were used. The disease terms from the three databases were converted to ICD terms using a Perl script (Supplementary File 2; ICD.pl). Moreover, to maintain a uniform nomenclature across all datasets, all genes from the three databases, along with those from UniProtKB [5], were converted to the official HGNC (HUGO Gene Nomenclature Committee) [6] gene symbols using a Perl script (Supplementary File 3; Hugo.pl).
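The renaming step can be pictured as two dictionary lookups per association, as in the minimal Python sketch below. It is not the authors' ICD.pl or Hugo.pl: the function name `normalize`, the case-insensitive matching and the tiny hand-written lookup tables are assumptions for illustration. In practice the tables would be built from the ICD and HGNC resources themselves.

```python
def normalize(pairs, disease_to_icd, alias_to_hgnc):
    """Map (disease, gene) pairs to (ICD term, HGNC symbol) pairs.

    Pairs with an unknown disease term or gene name are returned
    separately so they can be curated by hand instead of being dropped.
    """
    mapped, unmapped = set(), []
    for disease, gene in pairs:
        icd = disease_to_icd.get(disease.strip().lower())
        symbol = alias_to_hgnc.get(gene.strip().upper())
        if icd is None or symbol is None:
            unmapped.append((disease, gene))
        else:
            mapped.add((icd, symbol))  # the set collapses duplicates
    return sorted(mapped), unmapped


# Hypothetical lookup tables (toy-sized, for illustration only).
disease_to_icd = {
    "breast cancer": "Malignant neoplasm of breast",
    "breast carcinoma": "Malignant neoplasm of breast",
}
alias_to_hgnc = {"HER2": "ERBB2", "ERBB2": "ERBB2", "TP53": "TP53"}

raw = [
    ("Breast cancer", "HER2"),
    ("breast carcinoma", "ERBB2"),
    ("Unknown syndrome", "TP53"),
]
mapped, unmapped = normalize(raw, disease_to_icd, alias_to_hgnc)
print(mapped)
print(unmapped)
```

In this toy input the first two records become the same association once both names are standardized, which is the reason for normalizing before the databases are merged.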

Network processing and analysis

The bipartite networks of gene-disease associations were converted to monopartite networks of gene-gene and disease-disease interactions using a Perl script (Supplementary File 4; Bipartite.pl). This functionality is not available in other network analysis packages, and we incorporated it into a publicly available web server, PowerClust, available at http://www.compgen.org/tools/powerclust. PowerClust is an easy-to-use web application for clustering analysis, network processing and visualization. Moreover, randomization procedures were performed to determine whether the highly connected nodes in the original networks have a degree that cannot occur simply by chance given the other properties of the networks (Supplementary File 5; Random.pl). Finally, the robustness of the topological features of the projected gene-gene and disease-disease networks was assessed by employing a bipartite-specific rewiring algorithm [7] to test whether the degree distributions of the projected monopartite networks remain stable in the randomized gene-gene/disease-disease networks compared to the initial ones (Supplementary File 6; Rewire.R). The JOINT gene-disease network (generated by combining data from the individual databases) is provided as a Cytoscape network file.
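To make the projection and randomization ideas concrete, here is a minimal Python sketch under stated assumptions. It is not the authors' Bipartite.pl, Random.pl or Rewire.R, and the rewiring shown is a basic degree-preserving edge swap, not the specific algorithm of [7]. The function names, the shared-neighbour edge weights and the toy node labels (D1, G1, ...) are hypothetical.

```python
import random
from collections import defaultdict
from itertools import combinations


def project(pairs):
    """Project (disease, gene) pairs onto two one-mode networks.

    Returns (disease_net, gene_net), each a dict mapping a sorted node
    pair to the number of neighbours the two nodes share.
    """
    genes_of = defaultdict(set)
    diseases_of = defaultdict(set)
    for disease, gene in pairs:
        genes_of[disease].add(gene)
        diseases_of[gene].add(disease)

    def one_mode(neighbours):
        # Nodes that appear together in one neighbour set get an edge.
        weights = defaultdict(int)
        for members in neighbours.values():
            for a, b in combinations(sorted(members), 2):
                weights[(a, b)] += 1
        return dict(weights)

    # Diseases sharing a gene are linked; genes sharing a disease are linked.
    return one_mode(diseases_of), one_mode(genes_of)


def rewire(pairs, n_swaps, seed=0):
    """Randomize a bipartite edge list while keeping every node's degree."""
    rng = random.Random(seed)
    edges = sorted(set(pairs))
    present = set(edges)
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(edges)), 2)
        (d1, g1), (d2, g2) = edges[i], edges[j]
        new1, new2 = (d1, g2), (d2, g1)
        # Reject swaps that would duplicate an edge or change nothing.
        if new1 in present or new2 in present:
            continue
        present -= {edges[i], edges[j]}
        present |= {new1, new2}
        edges[i], edges[j] = new1, new2
    return edges


# Toy bipartite network with hypothetical labels.
toy = [("D1", "G1"), ("D1", "G2"), ("D2", "G2"), ("D3", "G2"), ("D3", "G3")]
disease_net, gene_net = project(toy)
print(disease_net)
print(gene_net)

randomized = rewire(toy, n_swaps=100, seed=1)
random_disease_net, random_gene_net = project(randomized)
```

Because each swap exchanges the gene ends of two associations, every disease and every gene keeps its number of associations in the bipartite network. Projecting many such randomized networks and comparing their degree distributions with those of the original projections is one way to judge whether a hub could arise by chance.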

Subject area	Systems biology
More specific subject area	Gene-disease networks
Type of data	Figure, text files, Cytoscape Network file
How data were acquired	Data were acquired from the publicly available databases: OMIM, GAD, GWAS Catalog, UniProtKB, ICD, HGNC
Data format	Processed, analyzed
Experimental factors	Gene-disease association data were analyzed using Perl and R scripts and Cytoscape.
Experimental features	Gene-gene and disease-disease networks were constructed.
Data source location	Department of Computer Science and Biomedical Informatics, University of Thessaly, Lamia, Greece
Data accessibility	Data are provided with this article.


References

1. Network analysis of genes and their association with diseases.

2. OMIM.org: Online Mendelian Inheritance in Man (OMIM®), an online catalog of human genes and genetic disorders.

3. Genetic association studies.

4. The NHGRI GWAS Catalog, a curated resource of SNP-trait associations.

5. Expert curation in UniProtKB: a case study on dealing with conflicting and erroneous data.

6. Genenames.org: the HGNC resources in 2015.

7. Fast randomization of large genomic datasets while preserving alteration counts.