Juan Fernández-Tajes, Kyle J Gaulton, Martijn van de Bunt, Jason Torres, Matthias Thurner, Anubha Mahajan, Anna L Gloyn, Kasper Lage, Mark I McCarthy.
Abstract
BACKGROUND: Genome-wide association studies (GWAS) have identified several hundred susceptibility loci for type 2 diabetes (T2D). One critical, but unresolved, issue concerns the extent to which the mechanisms through which these diverse signals influence T2D predisposition converge on a limited set of biological processes. However, the causal variants identified by GWAS mostly fall in non-coding sequence, complicating the task of defining the effector transcripts through which they operate.
Year: 2019 PMID: 30914061 PMCID: PMC6436236 DOI: 10.1186/s13073-019-0628-8
Source DB: PubMed Journal: Genome Med ISSN: 1756-994X Impact factor: 15.266
Fig. 1 Overview of the data integration pipeline. We collected variants in the 1-Mb interval surrounding the index variant at each of the 101 T2D GWAS loci, along with relevant annotations for all protein-coding genes in GENCODE: coding exon location, promoter location, distal regulatory elements correlated with gene activity from DNase I hypersensitivity (DHS) data, and summary-statistic expression QTL (eQTL) data from T2D-relevant tissues. This, combined with gene-level information from a semantic similarity metric, allowed us to define positional candidacy scores (PCS) for each gene in the GWAS intervals. PCS assigns "weights" to the genes in the 1-Mb window based on biological candidacy, in contrast to the "nearest" approach, where the gene closest to the GWAS signal gets a weight of 1 and all others 0, or the "equal" approach, where all genes in the window have the same weight. Genes with cumulative PCS > 0.7 were projected into the InWeb3 dataset using a Steiner tree algorithm to define a PPI network that maximises candidate gene connectivity. This network was further analysed to find processes, pathways and genes implicated in T2D pathogenesis.
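The three weighting schemes contrasted in this caption can be illustrated with a minimal sketch. The gene names, distances and raw scores below are hypothetical, and the assumption that scores are rescaled to sum to 1 within a window is ours, not a statement of how PCS is actually computed.

```python
def nearest_weights(distances):
    """'Nearest' approach: weight 1 for the gene closest to the signal, 0 otherwise."""
    closest = min(distances, key=distances.get)
    return {gene: 1.0 if gene == closest else 0.0 for gene in distances}


def equal_weights(distances):
    """'Equal' approach: every gene in the window gets the same weight."""
    n = len(distances)
    return {gene: 1.0 / n for gene in distances}


def score_weights(raw_scores):
    """Score-based approach: weights proportional to a per-gene candidacy score.

    Assumption: scores are rescaled to sum to 1 within the window.
    """
    total = sum(raw_scores.values())
    return {gene: score / total for gene, score in raw_scores.items()}


# Hypothetical 1-Mb window with three genes (distance to index variant in bp).
distances = {"GENE_A": 12_000, "GENE_B": 85_000, "GENE_C": 240_000}
# Hypothetical raw candidacy scores from annotation overlap.
raw_scores = {"GENE_A": 1.0, "GENE_B": 3.0, "GENE_C": 0.0}

print(nearest_weights(distances))
print(equal_weights(distances))
print(score_weights(raw_scores))
```

In this toy window the nearest gene is not the one with the highest score, which is the situation a score-based weighting is meant to handle.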
Fig. 2 APCST final network. The final PPI network generated from the T2D GWAS interval genes includes 431 seed nodes and 274 linking nodes connected by 2678 interactions. We divided this network into 18 sub-networks (communities) using a community clustering algorithm that maximises network modularity [33], and highlighted the enrichment of specific biological processes within each community based on Gene Ontology terms and REACTOME pathways. Seed nodes are coloured according to their PCS; grey nodes represent linking nodes. Node size is proportional to the network parameter neighbourhood connectivity. These networks are available in Cytoscape and edge list format at https://github.com/jfertaj/T2D_data_integration.
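To make "maximises network modularity" concrete, the sketch below computes the standard Newman modularity Q of a given partition on a toy undirected graph. It only evaluates the quantity being maximised; it is not the clustering algorithm of reference [33], and the graph and node names are hypothetical.

```python
def modularity(edges, communities):
    """Newman modularity Q for an undirected, unweighted graph.

    Q = sum over communities c of  L_c / m - (d_c / (2 m))^2
    where m is the number of edges, L_c the number of edges inside c,
    and d_c the summed degree of the nodes in c.
    """
    m = len(edges)
    membership = {node: i for i, comm in enumerate(communities) for node in comm}
    internal = [0] * len(communities)
    degree_sum = [0] * len(communities)
    for u, v in edges:
        degree_sum[membership[u]] += 1
        degree_sum[membership[v]] += 1
        if membership[u] == membership[v]:
            internal[membership[u]] += 1
    return sum(
        internal[i] / m - (degree_sum[i] / (2 * m)) ** 2
        for i in range(len(communities))
    )


# Toy graph: two triangles joined by a single bridging edge.
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"),
         ("c", "d")]

split = [{"a", "b", "c"}, {"d", "e", "f"}]
merged = [{"a", "b", "c", "d", "e", "f"}]

print(modularity(edges, split))   # higher: the two triangles are natural communities
print(modularity(edges, merged))  # a single community always scores 0
```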
Fig. 3 GWAS signal enrichment in tissue-specific interactomes. RNA-Seq data were used to filter the overall InWeb3 network and generate in silico tissue-specific networks that maximise connectivity between GWAS interval genes. Linking nodes within these networks were then tested for enrichment for GWAS signals using a permutation scheme. Each dot in the figure depicts the −log10 p value for enrichment for signals in a given GWAS dataset, for each of the 46 tissues. Dot colours reflect the GWAS phenotypes, with T2D shown as larger red dots. The dotted red line represents the nominal significance threshold (p = 0.05), calculated as the empirical estimate of the null distribution from which the p value has been drawn. Islet showed the second strongest enrichment signal for T2D.
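The caption does not spell out the permutation scheme, so the following is only a generic sketch of how an empirical enrichment p value can be obtained by permutation. It assumes each gene carries a single association score and that the null is built by drawing random gene sets of the same size from a background; both the function name `empirical_p` and the toy data are hypothetical.

```python
import random


def empirical_p(observed_set, background_scores, n_perm=1000, seed=0):
    """Empirical p value for the mean score of a gene set.

    The null distribution is built from the mean score of random gene sets
    of the same size, drawn without replacement from the background.
    """
    rng = random.Random(seed)
    genes = list(background_scores)
    k = len(observed_set)
    observed = sum(background_scores[g] for g in observed_set) / k
    hits = 0
    for _ in range(n_perm):
        sample = rng.sample(genes, k)
        null_mean = sum(background_scores[g] for g in sample) / k
        if null_mean >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)


# Toy background: 100 genes whose score equals their index.
background = {f"g{i}": float(i) for i in range(100)}

p_top = empirical_p(["g97", "g98", "g99"], background)
p_bottom = empirical_p(["g0", "g1", "g2"], background)
print(p_top, p_bottom)
```

The −log10 of such a p value is what each dot in a plot of this kind would represent.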
Fig. 4 GWAS signal enrichment in an islet-specific network derived from T2D GWAS subsets. We built APCST networks filtered for islet RNA expression for each of the subsets of T2D GWAS loci defined by shared mechanistic mediation (refs [48, 49]). The number of seed genes in each subset is displayed below the name of the phenotype. Enrichment for GWAS signals was tested for linking nodes only, using a permutation scheme. Each dot in the figure depicts the −log10 p value of enrichment for association signals in a particular GWAS analysis. The results for T2D GWAS enrichment for the APCST networks built around the different T2D GWAS subsets are also represented (large red dots). The dotted red line represents nominal significance (p = 0.05), calculated as the empirical estimate of the null distribution from which the p value has been drawn. The strongest enrichment for T2D GWAS data in islet-filtered PPI data is observed for subsets of loci acting through reduced insulin secretion. The central network depicts the intersection of the seven networks created from the seven T2D GWAS locus subset categories. Seed nodes are coloured according to their PCS; grey nodes represent linking nodes. In the other networks, to ease interpretation, we have named only the nodes that are exclusive to each of the seven T2D GWAS locus subset category networks. These networks are available in Cytoscape and edge list format at https://github.com/jfertaj/T2D_data_integration.