Sara Althari, Laeya A Najmi, Amanda J Bennett, Ingvild Aukrust, Jana K Rundle, Kevin Colclough, Janne Molnes, Alba Kaci, Sameena Nawaz, Timme van der Lugt, Neelam Hassanali, Anubha Mahajan, Anders Molven, Sian Ellard, Mark I McCarthy, Lise Bjørkhaug, Pål Rasmus Njølstad, Anna L Gloyn.
Abstract
Exome sequencing in diabetes presents a diagnostic challenge because, depending on frequency, functional impact, and genomic and environmental contexts, HNF1A variants can cause maturity-onset diabetes of the young (MODY), increase type 2 diabetes risk, or be benign. A correct diagnosis matters because it informs treatment, progression, and family risk. We describe a multi-dimensional functional dataset of 73 HNF1A missense variants identified in exomes of 12,940 individuals. Our aim was to develop an analytical framework for stratifying variants along the HNF1A phenotypic continuum to facilitate diagnostic interpretation. HNF1A variant function was determined by four different molecular assays. Structure of the multi-dimensional dataset was explored using principal component analysis, k-means, and hierarchical clustering. Weights for tissue-specific isoform expression and functional domain were integrated. Functionally annotated variant subgroups were used to re-evaluate genetic diagnoses in national MODY diagnostic registries. HNF1A variants demonstrated a range of behaviors across the assays. The structure of the multi-parametric data was shaped primarily by transactivation. Using unsupervised learning methods, we obtained high-resolution functional clusters of the variants that separated known causal MODY variants from benign and type 2 diabetes risk variants and led to reclassification of 4% and 9% of HNF1A variants identified in the UK and Norway MODY diagnostic registries, respectively. Our proof-of-principle analyses facilitated informative stratification of HNF1A variants along the continuum, allowing improved evaluation of clinical significance, management, and precision medicine in diabetes clinics. Transcriptional activity appears to be a superior readout, supporting pursuit of transactivation-centric experimental designs for high-throughput functional screens.
Keywords: HNF1A; bioinformatics; cluster analysis; diabetes; genetics; monogenic diabetes; protein function; rare variants; type 2 diabetes
Year: 2020 PMID: 32910913 PMCID: PMC7536579 DOI: 10.1016/j.ajhg.2020.08.016
Source DB: PubMed Journal: Am J Hum Genet ISSN: 0002-9297 Impact factor: 11.025
Figure 1. Eigendecomposition of Principal Components Explaining > 85% of Variance
Shown are (A) Oxford and (B) Bergen datasets. TA INS1_P2 and TA HeLa_ALB are transcriptional activity data from INS-1 cells using HNF4A P2 promoter and from HeLa cells using rat albumin promoter, respectively. PE, protein expression; nuc loc, nuclear localization data.
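The ">85% of variance" criterion in the Figure 1 title can be illustrated with a minimal sketch. It assumes the eigenvalues of the principal components are already available; the function name `n_components_for_variance` and the example eigenvalues are hypothetical and are not taken from the study.

```python
def n_components_for_variance(eigenvalues, threshold=0.85):
    """Return the smallest number of leading components whose cumulative
    share of the total variance exceeds `threshold`."""
    total = sum(eigenvalues)
    ordered = sorted(eigenvalues, reverse=True)  # largest eigenvalue first
    cumulative = 0.0
    for k, value in enumerate(ordered, start=1):
        cumulative += value
        if cumulative / total > threshold:
            return k
    return len(ordered)


# Hypothetical eigenvalues, for illustration only (not the study's data).
example_eigenvalues = [4.0, 2.5, 1.5, 1.0, 0.5, 0.5]
n_kept = n_components_for_variance(example_eigenvalues)
print(n_kept)
```

With these made-up values the cumulative shares are 0.40, 0.65, 0.80, and 0.90, so four components are retained.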
Figure 2. K-Means Clustering
HNF1A missense alleles characterized at Oxford (A) and Bergen (B) in principal component (PC) space. Blue and green k clusters represent alleles with benign and benign-to-intermediate effects on function, respectively; purple k clusters represent alleles with intermediate functional impact; red k clusters indicate intermediate-to-damaging or functionally damaging alleles.
Figure 3. Hierarchical Clustering Analysis
HNF1A missense alleles characterized at Oxford (A) and Bergen (B). Ward's minimum variance method was used, and the analysis was performed using orthogonally transformed functional data from PC1-PC4 (>85% explained variance) of the Oxford dataset and PC1-PC5 (>85% explained variance) of the Bergen dataset. To optimize visualization of the function phenotype gradient, some branches were rotated. The numbers on the y axes of (A) and (B) refer to clustering height calculated by Ward's criterion (total within-cluster variance).
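Ward's criterion, mentioned in the Figure 3 caption, can be illustrated by the cost of a single merge: the increase in total within-cluster sum of squares when two clusters are joined. The sketch below is a generic illustration with hypothetical function names and toy coordinates. It is not the authors' code, and the heights plotted by a particular clustering package may be scaled differently.

```python
def centroid(points):
    """Coordinate-wise mean of a list of equal-length points."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]


def ward_merge_cost(cluster_a, cluster_b):
    """Increase in total within-cluster sum of squares if the two clusters
    are merged: (n_a * n_b / (n_a + n_b)) * ||c_a - c_b||^2."""
    ca, cb = centroid(cluster_a), centroid(cluster_b)
    sq_dist = sum((x - y) ** 2 for x, y in zip(ca, cb))
    na, nb = len(cluster_a), len(cluster_b)
    return na * nb / (na + nb) * sq_dist


# Toy points in a two-dimensional PC space (hypothetical values).
cluster_a = [[0.0, 0.0], [2.0, 0.0]]
cluster_b = [[5.0, 0.0]]
cost = ward_merge_cost(cluster_a, cluster_b)
print(cost)
```

At each step, Ward's method merges the pair of clusters with the smallest such cost, so variants with similar functional profiles are joined low in the dendrogram.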
Figure 4. Distribution of Functionally Annotated HNF1A Missense Alleles
(A and B) As a function of frequency in the (A) UK MODY diagnostic registry and (B) Norway MODY diagnostic registry on the x axis and reported frequency in the genome aggregation database (gnomAD) on the y axis. Alleles are colored on the basis of the (re)classification scheme on the top right.
(C) Frequency of functionally characterized exome-detected HNF1A missense alleles in gnomAD. The red and orange dashed lines mark the frequency thresholds for known ultra-rare MODY-pathogenic alleles (allele count ≤ 2, AF < 0.0008%) and low-frequency type 2 diabetes-predisposing alleles (allele count ≤ 121, AF < 0.04%), respectively.
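The two dashed-line thresholds in panel C can be expressed as a simple lookup. This is a minimal sketch using only the cutoffs quoted in the caption; the function name and band labels are hypothetical, and frequency alone does not establish the clinical significance of a variant.

```python
def frequency_band(allele_count, allele_frequency_percent):
    """Place an allele relative to the two gnomAD thresholds in panel C.

    `allele_frequency_percent` is the allele frequency expressed in percent.
    """
    # Red line: known ultra-rare, MODY-pathogenic range.
    if allele_count <= 2 and allele_frequency_percent < 0.0008:
        return "ultra-rare (MODY-pathogenic range)"
    # Orange line: low-frequency, type 2 diabetes-predisposing range.
    if allele_count <= 121 and allele_frequency_percent < 0.04:
        return "low-frequency (type 2 diabetes-predisposing range)"
    return "above both thresholds"


# Hypothetical alleles, for illustration only.
print(frequency_band(1, 0.0004))
print(frequency_band(50, 0.02))
print(frequency_band(500, 0.2))
```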