Johanne Brooks-Warburton, Dezso Modos, Padhmanand Sudhakar, Matthew Madgwick, John P Thomas, Balazs Bohar, David Fazekas, Azedine Zoufir, Orsolya Kapuy, Mate Szalay-Beko, Bram Verstockt, Lindsay J Hall, Alastair Watson, Mark Tremelling, Miles Parkes, Severine Vermeire, Andreas Bender, Simon R Carding, Tamas Korcsmaros.
Abstract
We describe a precision medicine workflow, the integrated single nucleotide polymorphism network platform (iSNP), designed to determine the mechanisms by which SNPs affect cellular regulatory networks, and how SNP co-occurrences contribute to disease pathogenesis in ulcerative colitis (UC). Using SNP profiles of 378 UC patients, we map the regulatory effects of the SNPs to a human signalling network containing protein-protein, miRNA-mRNA and transcription factor binding interactions. With unsupervised clustering algorithms, we group these patient-specific networks into four distinct clusters driven by PRKCB, HLA, SNAI1/CEBPB/PTPN1 and VEGFA/XPO5/POLH hubs. The pathway analysis identifies calcium homeostasis, wound healing and cell motility as key processes in UC pathogenesis. Using transcriptomic data from an independent patient cohort, with three complementary validation approaches focusing on the SNP-affected genes, the patient-specific modules and the affected functions, we confirm the regulatory impact of non-coding SNPs. iSNP identified regulatory effects for disease-associated non-coding SNPs, and by predicting the patient-specific pathogenic processes, we propose a systems-level way to stratify patients.
Year: 2022 PMID: 35484353 PMCID: PMC9051123 DOI: 10.1038/s41467-022-29998-8
Source DB: PubMed Journal: Nat Commun ISSN: 2041-1723 Impact factor: 17.694
Fig. 1 The iSNP workflow and its application to reconstruct an ulcerative colitis-associated signalling network for non-coding single nucleotide polymorphisms.
Single nucleotide polymorphisms (SNPs) identified in patients were annotated according to whether they occur within transcription factor binding sites (TFBS) located in enhancer or promoter regions of genes, or within microRNA-target sites (miRNA-TS) located in first intronic regions or untranslated regions. After identifying the proteins whose transcription or translation could be affected by these non-coding SNPs, their protein interactors (first neighbours) were determined to construct an ulcerative colitis-associated signalling network. (UK: United Kingdom, IBD: Inflammatory Bowel Disease).
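The first-neighbour expansion described in this caption can be sketched as follows. This is a minimal illustration under stated assumptions, not the iSNP implementation: the helper `first_neighbour_network`, the toy interaction list and the placeholder protein names (`P1` to `P5`) are all hypothetical, and the sketch ignores interaction direction for simplicity.

```python
from collections import defaultdict


def first_neighbour_network(interactions, seeds):
    """Keep only interactions that touch at least one seed (SNP-affected) protein.

    Hypothetical helper: returns an undirected adjacency map of the seeds
    and their direct interactors (first neighbours).
    """
    seeds = set(seeds)
    adjacency = defaultdict(set)
    for source, target in interactions:
        if source in seeds or target in seeds:
            adjacency[source].add(target)
            adjacency[target].add(source)
    return dict(adjacency)


# Toy interaction list; the names are placeholders, not real data.
interactions = [("P1", "P2"), ("P2", "P3"), ("P3", "P4"), ("P1", "P5")]

# Assume P1 is the only protein whose regulation is affected by a SNP.
network = first_neighbour_network(interactions, seeds=["P1"])
nodes = sorted(network)
```

In this toy case only the interactions involving `P1` are retained, so proteins two or more steps away from a SNP-affected protein are left out of the network.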
Affected SNPs in the UC-associated signalling network, their target genes and interactions [a].
| SNP | Target gene name | Regulatory annotation of the SNP |
|---|---|---|
| rs11041476 | | TFBS in an enhancer, miRNA-TS in the first intron |
| | | TFBS in an enhancer |
| rs11168249 | | TFBS in an enhancer |
| | | TFBS in an enhancer, miRNA-TS in the first intron |
| rs11676348 | | TFBS in an enhancer |
| rs12254167 | | TFBS in an enhancer |
| rs1598859 | | TFBS in an enhancer |
| rs17085007 | | TFBS in an enhancer |
| rs1801274 | | miRNA-TS in an exon |
| rs3774937 | | miRNA-TS in an intron |
| rs543104 | | TFBS in an enhancer |
| rs559928 | | TFBS in an enhancer |
| rs6087990 | | TFBS in a promoter |
| rs907611 | | TFBS in a promoter |
[a] Details of each interaction are provided in Supplementary Table 1. Cluster-driving SNPs affecting the regulation of a high number of proteins directly or through their first neighbours are shown in bold.
Fig. 2 Visualisation and modularisation of the ulcerative colitis-associated signalling network.
a The ulcerative colitis (UC)-associated signalling network contains proteins affected by UC-associated single nucleotide polymorphisms (SNPs), their interactor partners, and the transcription factors (TFs) and microRNAs (miRNAs) whose binding or target sites are affected by a SNP. Circles represent proteins and squares represent regulators (red = TFs, blue = miRNAs). Nodes are coloured according to network modules. The modules are named after their representative function. At the top right of the network are TFs involved in potential regulatory feedback loops in UC pathogenesis. b Visualisation of the two regulatory modules. The module on the left represents the transcription factor binding site-based effects on the downstream network, which affect almost the entire signalling network. The module on the right represents the microRNA-target site-based effects, which act mainly through the regulation of PRKCB.
Fig. 3 Unsupervised clustering of ulcerative colitis patients based on their network footprint.
a Heatmap of directly or indirectly affected proteins in each patient. Each column represents a patient, and each row is a protein. Yellow indicates proteins affected in an individual patient, while blue indicates unaffected proteins. The hierarchical clustering of the patients is shown above the heatmap and was generated using Hamming distance with the average clustering method; colours represent the patient clusters. The left side of the heatmap identifies the proteins in the various patient-specific modules, while cluster-driving proteins are shown on the right side of the heatmap. b Representative networks from the four patient clusters. Yellow indicates directly or indirectly affected proteins, while blue indicates unaffected proteins. c Histograms depict the number of patients in which a given protein is affected. The horizontal red line marks proteins affected in more than 300 patients. The green line marks the cut-off for proteins affected in 170 patients or fewer. Both cut-offs were defined based on the distribution. The colours of the proteins are taken from the representative network modules in Fig. 2 (HLA: human leucocyte antigen).
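The clustering step named in panel a (Hamming distance with average linkage) can be illustrated with a small pure-Python sketch. It is an assumption-laden toy, not the authors' pipeline: the functions `hamming` and `average_linkage`, the 4 x 4 binary matrix and the choice of two clusters are all hypothetical, and a real analysis would typically use a library implementation.

```python
from itertools import combinations


def hamming(a, b):
    """Fraction of positions at which two equal-length binary profiles differ."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def average_linkage(profiles, n_clusters):
    """Naive agglomerative clustering with average linkage.

    Repeatedly merges the two clusters with the smallest mean pairwise
    Hamming distance until n_clusters remain.
    """
    clusters = [[i] for i in range(len(profiles))]
    while len(clusters) > n_clusters:
        best = None
        for i, j in combinations(range(len(clusters)), 2):
            # Mean distance over all member pairs of the two clusters.
            dists = [hamming(profiles[p], profiles[q])
                     for p in clusters[i] for q in clusters[j]]
            mean = sum(dists) / len(dists)
            if best is None or mean < best[0]:
                best = (mean, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i stays valid
    return [sorted(c) for c in clusters]


# Hypothetical patients x proteins matrix: 1 = protein affected, 0 = unaffected.
patients = [
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 0, 1, 1],
    [0, 1, 1, 1],
]
clusters = average_linkage(patients, n_clusters=2)
```

Hamming distance suits this kind of data because each patient footprint is a binary vector, so the distance is simply the share of proteins whose affected status differs between two patients.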
Fig. 4 Validation of the iSNP method with transcriptomic data from an independent cohort of ulcerative colitis patients.
a Flow chart depicting the validation approaches. b Single nucleotide polymorphism (SNP)-affected genes that are differentially expressed in biopsy samples from ulcerative colitis (UC) patients in the GSE109142 dataset. An absolute log2 fold change > 1 was used as the cut-off. c The percentage of differentially expressed genes among the first neighbours of the cluster-driving proteins, using the same dataset. Analysis of the patients in the independent cohort produced two clusters similar to those generated by iSNP (red and purple clusters in Fig. 3b). The most differentially expressed genes from the UC-associated signalling network in the validation cohort were the first neighbours of multiple SNP-affected proteins. d Similarities between the over-represented Gene Ontology Biological Processes of the first neighbours of cluster-driving proteins and those of the differentially expressed genes. Gene Ontology terms were considered enriched at p < 0.05 in a Benjamini-Hochberg corrected hypergeometric test. A gene was considered differentially expressed at |FC| > 1 and q < 0.05 in a Benjamini-Hochberg corrected moderated t-test. There are two main groups of Gene Ontology Biological Processes: common and specific. The common processes include regulation of signalling or metabolic processes, while the specific processes represent the cluster-driving protein and its cluster function or the transcriptomic effect of inflammation. Differentially expressed genes, first neighbours, or functions are represented in yellow, and the cluster-driving genes are represented by their respective colours: pink: SNAI1, CEBPB, PTPN1; blue: PRKCB; orange: VEGFA, XPO5, POLH; purple: NFKB1; turquoise: HLA proteins.
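The enrichment criterion in panel d (a hypergeometric test followed by Benjamini-Hochberg correction) can be sketched in pure Python. This is only an illustration of the general technique: the functions `hypergeom_sf` and `benjamini_hochberg`, the term names and all counts are hypothetical and do not come from the study.

```python
from math import comb


def hypergeom_sf(k, M, n, N):
    """P(X >= k) when N genes are drawn from M, of which n carry the annotation."""
    upper = min(n, N)
    total = sum(comb(n, i) * comb(M - n, N - i) for i in range(k, upper + 1))
    return total / comb(M, N)


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical counts per term: (overlap with the gene set, genes annotated).
terms = {"term_a": (8, 10), "term_b": (2, 50)}
background, gene_set = 1000, 20  # assumed background and gene-set sizes

raw = [hypergeom_sf(k, background, n, gene_set) for k, n in terms.values()]
adjusted = benjamini_hochberg(raw)

# Keep terms whose adjusted p-value passes the 0.05 threshold from the caption.
enriched = [term for term, q in zip(terms, adjusted) if q < 0.05]
```

The correction matters because many Gene Ontology terms are tested at once; adjusting the p-values controls the expected share of false positives among the terms reported as enriched.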