Helle Krogh Pedersen1, Valborg Gudmundsdottir1, Mette Krogh Pedersen1,2, Caroline Brorsson1, Søren Brunak1,2, Ramneek Gupta1.
Abstract
Weight-loss surgery is an effective treatment for glycaemic control of type 2 diabetes in obese patients, yet not all patients benefit, so it is valuable to find factors that predict diabetes remission. Such factors may help elucidate the underlying mechanisms and form a basis for prioritising obese patients with dysregulated diabetes for surgery where diabetes remission is of interest. In this study, we combine clinical and genomic factors using heuristic methods informed by prior biological knowledge, in order to rank factors that may have a role in predicting diabetes remission and in identifying patients who have a low likelihood of responding to bariatric surgery with improved glycaemic control. Genetic variants from the Illumina CardioMetaboChip were prioritised through single-association tests and then used to seed a larger selection from protein–protein interaction networks. Artificial neural networks allowing nonlinear correlations were trained to discriminate patients with and without surgery-induced diabetes remission, and the importance of each clinical and genetic parameter was evaluated. The approach highlighted insulin treatment, baseline HbA1c levels, use of insulin-sensitising agents and baseline serum insulin levels as the most informative variables, with an internal validation performance of 74% accuracy and an area under the curve (AUC) of 0.81. Adding information for the eight top-ranked single-nucleotide polymorphisms (SNPs) significantly boosted classification performance to 84% accuracy (AUC 0.92). The eight SNPs mapped to eight genes (ABCA1, ARHGEF12, CTNNBL1, GLI3, PROK2, RYBP, SMUG1 and STXBP5), three of which are known to have a role in insulin secretion, insulin sensitivity or obesity, but have not previously been implicated in diabetes remission after bariatric surgery.
Year: 2016 PMID: 29263820 PMCID: PMC5685313 DOI: 10.1038/npjgenmed.2016.35
Source DB: PubMed Journal: NPJ Genom Med ISSN: 2056-7944 Impact factor: 8.617
Figure 1. (a) General framework for integrating heterogeneous data types for patient stratification. In this study we focus on three data types: clinical traits, genetic information and protein–protein interactions. Panels (b–e) illustrate the approach for compiling an enriched subset of candidate SNPs by utilising prior biological knowledge. Essentially, the top 200 GWAS SNPs were expanded using protein–protein interaction data; see text for details.
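The expansion step summarised in the caption can be pictured with a minimal sketch. It assumes the simplest possible rule, adding every direct interaction partner of a seed gene; the caption does not state the actual rule used in the study, and the function name `expand_seeds` and the gene labels are hypothetical placeholders rather than real interaction data.

```python
from typing import Iterable, Set, Tuple


def expand_seeds(seeds: Iterable[str],
                 interactions: Iterable[Tuple[str, str]]) -> Set[str]:
    """Return the seed genes plus their direct interaction partners.

    Assumption: first-neighbour expansion on an undirected network.
    """
    # Build an undirected adjacency map from the interaction pairs.
    neighbours = {}
    for a, b in interactions:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)

    expanded = set(seeds)
    for gene in list(expanded):
        expanded |= neighbours.get(gene, set())
    return expanded


# Hypothetical toy network; labels are placeholders, not real genes.
toy_network = [("GENE_A", "GENE_B"), ("GENE_B", "GENE_C"), ("GENE_D", "GENE_A")]
candidates = expand_seeds({"GENE_A"}, toy_network)
```

In this toy case `GENE_C` is left out because it only touches the seed through `GENE_B`; SNPs mapping to the expanded gene set would then form the candidate pool.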
Baseline patient characteristics associated with diabetes resolution for the nonredundant subset of 268 patients from the eMERGE cohort
| | | | P |
|---|---|---|---|
| No. of patients | 114 | 154 | |
| Male sex | 39 (34.2%) | 44 (28.6%) | 0.393 |
| Weight before bariatric surgery (pounds) | 308 [268;337] | 308 [260;374] | 0.423 |
| BMI before bariatric surgery (kg/m2) | 48.0 [43.7;55.0] | 50.5 [43.4;57.8] | 0.321 |
| Alcohol use before bariatric surgery ( | 25 (25.0%) | 52 (38.0%) | 0.05 |
| Tobacco use before bariatric surgery ( | 27 (33.3%) | 37 (33.3%) | 1 |
| Systolic blood pressure before bariatric surgery (mm Hg) | 136 [122;153] | 134 [122;152] | 0.649 |
| Diastolic blood pressure before bariatric surgery (mm Hg) | 74.0 [67.0;85.8] | 77.0 [68.0;86.0] | 0.452 |
| Pulse pressure before bariatric surgery (mm Hg) | 60.0 [49.2;73.8] | 59.0 [48.0;68.0] | 0.21 |
| Use of sulfonylureas before bariatric surgery | 39 (34.2%) | 58 (37.7%) | 0.651 |
Abbreviation: BMI, body mass index.
Values show the median [1st; 3rd quartiles] or number of patients and percentages (%). P values are shown for χ2-test (categorical variables) and Kruskal–Wallis test (continuous variables). Rows with P values <0.05 are shown in bold. If not otherwise stated, n=268.
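As an illustration of the categorical test named in the footnote, the sketch below computes a χ2-test P value for a 2×2 table using only the standard library. Applying Yates' continuity correction is an assumption; the footnote does not say whether it was used. The function name `chi2_2x2_pvalue` is hypothetical.

```python
import math


def chi2_2x2_pvalue(a: int, b: int, c: int, d: int, yates: bool = True) -> float:
    """P value of a chi-squared test on the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    observed = [a, b, c, d]
    expected = [row1 * col1 / n, row1 * col2 / n,
                row2 * col1 / n, row2 * col2 / n]

    stat = 0.0
    for o, e in zip(observed, expected):
        diff = abs(o - e)
        if yates:
            # Assumption: continuity correction, as is common for 2x2 tables.
            diff = max(diff - 0.5, 0.0)
        stat += diff * diff / e

    # Survival function of the chi-squared distribution with 1 degree of freedom.
    return math.erfc(math.sqrt(stat / 2.0))


# "Male sex" row of the table: 39 of 114 versus 44 of 154 patients.
p_male = chi2_2x2_pvalue(39, 114 - 39, 44, 154 - 44)
```

With the correction enabled, the male-sex row gives a value close to the 0.393 reported in the table, which suggests (but does not prove) that a corrected test was used.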
Figure 2. Ranking of features. (a–c) The number of times (out of 125) a given clinical feature (a) or SNP (b and c) was selected in the forward feature selection approach. The more often a feature was selected, the more important it is for predicting diabetes remission. (d) Relative importance of input variables for diabetes remission, highlighting the directionality of the different features (positive values indicate that high values/minor alleles are associated with diabetes remission, whereas negative values indicate that high values/minor alleles/taking the medication are associated with failure of diabetes remission). The plot shows the average relative importance for the five outer cross-validation folds.
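The counting shown in panels a–c can be sketched as a greedy forward selection repeated over several runs, followed by a tally of how often each feature was picked. This is only an illustration: the scoring function, feature names and weights below are hypothetical stand-ins, whereas in the study the score would come from the trained networks in cross-validation.

```python
from collections import Counter
from typing import Callable, Iterable, List, Sequence


def forward_select(candidates: Sequence[str],
                   score: Callable[[List[str]], float],
                   max_features: int) -> List[str]:
    """Greedily add the feature that most improves the score; stop when none does."""
    selected: List[str] = []
    best = score(selected)
    while len(selected) < max_features:
        trials = [(score(selected + [f]), f) for f in candidates if f not in selected]
        if not trials:
            break
        top_score, top_feature = max(trials)
        if top_score <= best:
            break  # no remaining feature improves the score
        selected.append(top_feature)
        best = top_score
    return selected


def selection_counts(runs: Iterable[Sequence[str]]) -> Counter:
    """Count in how many runs each feature was selected."""
    counts: Counter = Counter()
    for selected in runs:
        counts.update(set(selected))
    return counts


# Hypothetical additive score, standing in for a cross-validated model score.
toy_weights = {"insulin_treatment": 0.4, "hba1c": 0.3, "noise": -0.1}


def toy_score(subset: List[str]) -> float:
    return sum(toy_weights[f] for f in subset)


picked = forward_select(list(toy_weights), toy_score, max_features=3)
counts = selection_counts([picked, ["hba1c"], ["hba1c", "noise"]])
```

Here `picked` contains the two features with positive weight, and `counts` plays the role of the bar heights in the figure.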
Description of the eight highest ranked SNPs by the present study, ordered according to Figure 2c
| SNP ID | Chr | Position | Location | Gene | RegulomeDB score | | | MAF | | Waist–hip ratio adjusted for BMI P value | |
|---|---|---|---|---|---|---|---|---|---|---|---|
| rs1688686 | 3 | 122144630 | Intron | Syntaxin-binding protein 5-like (Tomosyn-2) | 6 | A | G | 0.280 | 1.75×10−4 | 8.40×10−1 | |
| rs9310221 | 3 | 71932386 | Intergenic | Prokineticin-2 / RING1 and YY1-binding protein | 7 | A | G | 0.426 | 4.79×10−4 | 3.20×10−1 | |
| rs2279400 | 12 | 52867581 | Intron | Single-strand-selective monofunctional uracil-DNA glycosylase 1 | 7 | G | A | 0.450 | 3.54×10−6 | 4.60×10−1 | |
| rs11600200 | 11 | 119839890 | Intron | Rho guanine nucleotide exchange factor (GEF) 12 | 6 | C | A | 0.232 | 9.61×10−4 | 2.90×10−4 | |
| rs2190513 | 7 | 42175859 | Intron | GLI family zinc finger 3 | 5 | G | A | 0.418 | 4.38×10−3 | 2.00×10−1 | |
| rs10230715 | 7 | 42155381 | Intron | GLI family zinc finger 3 | 5 | G | A | 0.471 | 2.21×10−3 | 1.50×10−1 | |
| rs10820743 | 9 | 106711480 | Intron | ATP-binding cassette, sub-family A (ABC1), member 1 | 6 | G | A | 0.292 | 7.54×10−2 | 7.00×10−1 | |
| rs6512714 | 20 | 35874753 | Intron | Catenin, beta like 1 | 6 | A | C | 0.356 | 9.35×10−4 | 9.30×10−1 |
Abbreviations: AUC, area under the curve; BMI, body mass index; Chr, chromosome; MAF, minor allele frequencies; OGTT, oral glucose tolerance test; SNP, single-nucleotide polymorphism; TF, transcription factor.
MetaboChip SNP IDs, chromosome locations and gene locations are from the MetaboChip annotation file in build 36 coordinates. The RegulomeDB score was used to identify DNA features and regulatory elements overlapping the SNP coordinates (7=none; 6=other; 5=TF binding or DNase peak). MAFs are for all 457 participants from the cohort. P values are listed for the 1 out of 18 tested GIANT, DIAGRAM and MAGIC consortium studies with any nominal P value<0.01 (BMI;[19] waist–hip ratio adjusted for BMI;[20] type 2 diabetes;[21] corrected insulin response, corrected insulin response adjusted for insulin-sensitivity index, ratio of the AUC for AUC insulin/AUC glucose, insulin-sensitivity index, disposition index, insulin at 30 min, incremental insulin at 30 min, insulin response to glucose during the first 30 min adjusted for BMI, and AUC of insulin levels during OGTT;[22] 2 h glucose, fasting glucose, fasting insulin, and fasting insulin adjusted for BMI profiled with the Metabochip;[23] fasting glucose and fasting insulin[24]).
SNPs included from the protein–protein interaction network expansion.
Internal validation performance for the two models: the four clinical traits alone or in combination with the eight SNPs
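| Model | AUC | Accuracy | | | 1,000 cross-validation splits | AUC, 189 held-out individuals |
|---|---|---|---|---|---|---|
| Clinical traits alone | 0.810 | 0.735 | 0.629 | 0.811 | 0.807 (0.0054) | 0.992 (0.00008927) |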
| Clinical traits+eight SNPs | 0.921 | 0.838 | 0.790 | 0.872 | 0.919 (0.0046) | 0.917 (0.001072) |
Abbreviations: AUC, area under the curve; SNP, single-nucleotide polymorphism.
The first four columns show internal validation and performance measures for the cross-validation splits used for feature selection and the 268 individuals remaining after excluding similar patients (as reported throughout the paper). The next column shows internal validation again based on the 268 individuals, but for 1,000 different cross-validation splits. The last column shows the AUC for the 189 individuals initially held out because of their redundant properties in terms of clinical traits. In this last column, the models are trained on the 268 included individuals but evaluated on the 189 held out individuals.
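For readers unfamiliar with the two headline measures, the sketch below shows one common way to compute accuracy and AUC from classifier outputs, with nonremitters encoded as 0 and remitters as 1. It is not the study's evaluation code; the 0.5 decision threshold and the function names are assumptions, and the AUC is computed as the fraction of remitter/nonremitter pairs that are ranked correctly.

```python
from typing import Sequence


def accuracy(labels: Sequence[int], scores: Sequence[float],
             threshold: float = 0.5) -> float:
    """Fraction of patients whose thresholded score matches the 0/1 label."""
    hits = sum(int(s >= threshold) == y for y, s in zip(labels, scores))
    return hits / len(labels)


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison; ties count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one remitter and one nonremitter")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy outputs for four patients.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_accuracy = accuracy(toy_labels, toy_scores)
toy_auc = auc(toy_labels, toy_scores)
```

The pairwise form is quadratic in the number of patients, which is acceptable for cohorts of a few hundred individuals such as the 268 and 189 mentioned above.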
Figure 3. Patient breakdown. (a) The number of patients correctly or incorrectly classified in the internal validation step with an artificial neural network (ANN) predictor trained on the top four clinical traits alone, or on the clinical traits plus the eight top-ranked SNPs. (b) Distributions of variables for the eight different patient subgroups for the four top-ranked features. The violin plots in b indicate frequency distributions of the features (a kernel density plot), with the black bars indicating the interquartile range and white circles the median value. (c) Receiver operating characteristic (ROC) curves for the two models: the four clinical traits alone or in combination with the eight SNPs. (d) Adding the SNPs pulls patients to the poles. The start of each arrow marks the output score from the ANN trained on the four clinical traits, whereas the end (arrowhead) marks the output from the ANN trained on both the four clinical traits and the eight SNPs. During ANN training and evaluation, nonremitters are encoded as 0 and remitters as 1.
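The arrows in panel d can be read as a per-patient shift between two output scores. A minimal sketch of that reading follows, assuming scores lie between 0 and 1; the function name and the example values are hypothetical.

```python
def shift_toward_truth(label: int, score_before: float, score_after: float) -> float:
    """Positive when the second score lies closer to the true label (0 or 1).

    `score_before` stands for the clinical-traits model output and
    `score_after` for the clinical-traits-plus-SNPs model output.
    """
    return abs(label - score_before) - abs(label - score_after)


# Hypothetical patients: (true label, score before, score after).
toy_patients = [(1, 0.6, 0.9), (0, 0.4, 0.1), (1, 0.6, 0.4)]
shifts = [shift_toward_truth(y, before, after) for y, before, after in toy_patients]
improved = sum(s > 0 for s in shifts)
```

A positive shift corresponds to an arrow pointing toward the patient's own pole, and a negative shift to an arrow pointing away from it.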