Brett M Kroncke, Derek K Smith, Yi Zuo, Andrew M Glazer, Dan M Roden, Jeffrey D Blume.
Abstract
A major challenge emerging in genomic medicine is how best to assess disease risk from rare or novel variants found in disease-related genes. The expanding volume of data generated by very large phenotyping efforts coupled with DNA sequence data presents an opportunity to reinterpret genetic liability of disease risk. Here we propose a framework to estimate the probability of disease given the presence of a genetic variant, conditioned on features of that variant. We refer to this probability as the penetrance: the fraction of all variant heterozygotes that will present with disease. We demonstrate this methodology using a well-established disease-gene pair, the cardiac sodium channel gene SCN5A and the heart arrhythmia Brugada syndrome. From a review of 756 publications, we developed a pattern mixture algorithm, based on a Bayesian Beta-Binomial model, to generate SCN5A penetrance probabilities for Brugada syndrome conditioned on variant-specific attributes. These probabilities are determined from variant-specific features (e.g. function, structural context, and sequence conservation) and from observations of affected and unaffected heterozygotes. Variant functional perturbation and structural context prove most predictive of Brugada syndrome penetrance.
Year: 2020 PMID: 32569262 PMCID: PMC7347235 DOI: 10.1371/journal.pgen.1008862
Source DB: PubMed Journal: PLoS Genet ISSN: 1553-7390 Impact factor: 5.917
Fig 1. Penetrance priors are informed by variant-specific features.
Probability density (y-axis) versus penetrance (x-axis) for three selected SCN5A variants where peak current, penetrance density, and in silico classification are known. Numbers of affected and unaffected individuals reported are presented for each variant. Penetrance priors are low for c.3922C>T (p.Leu1308Phe; Benign according to ClinVar), moderate for c.4978A>G (p.Ile1660Val; VUS), and higher for c.2632C>T (p.Arg878Cys; Pathogenic). When variant-specific data are known, the penetrance estimate is adjusted to reflect the penetrance probability consistent with variants with similar features.
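The way a penetrance prior combines with counts of affected and unaffected heterozygotes can be illustrated with the standard conjugate Beta-Binomial update. This is a minimal sketch, not the authors' EM procedure: the `BetaDist` class, the `update_penetrance` function, the prior parameters, and the counts are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BetaDist:
    """Beta(alpha, beta) distribution over penetrance in [0, 1]."""
    alpha: float
    beta: float

    @property
    def mean(self) -> float:
        return self.alpha / (self.alpha + self.beta)


def update_penetrance(prior: BetaDist, affected: int, unaffected: int) -> BetaDist:
    """Conjugate update: add observed heterozygote counts to the prior pseudo-counts."""
    if affected < 0 or unaffected < 0:
        raise ValueError("counts must be non-negative")
    return BetaDist(prior.alpha + affected, prior.beta + unaffected)


# Hypothetical prior (mean 0.20) and hypothetical counts, for illustration only.
prior = BetaDist(alpha=1.0, beta=4.0)
posterior = update_penetrance(prior, affected=3, unaffected=2)
print(f"prior mean = {prior.mean:.2f}, posterior mean = {posterior.mean:.2f}")
# prints: prior mean = 0.20, posterior mean = 0.40
```

With few observed heterozygotes the posterior stays close to the feature-informed prior; as counts accumulate, the observed fraction of affected carriers dominates.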
Fig 2. Bland-Altman plot between EM prior and EM posterior mean penetrances for all SCN5A variants.
To assess the performance of the EM prior, we used a Bland-Altman plot to compare the mean BrS1 penetrance estimated from the EM prior with that estimated from the EM posterior. The y-axis is the difference between the two estimates and the x-axis is their average. For each plotted point, both color and radius indicate the log10 of the total number of heterozygotes present in the dataset. The relatively consistent scatter about y = 0 suggests no systematic bias in the EM prior mean BrS1 estimates.
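The quantities plotted in a Bland-Altman comparison can be computed as in the sketch below. The function name `bland_altman`, the sign convention (prior minus posterior), and the 95% limits of agreement are assumptions; the caption specifies only that the axes are the difference and the average of the two estimates.

```python
from statistics import mean, stdev


def bland_altman(prior_means, posterior_means):
    """Return per-variant averages and differences, the bias, and 95% limits of agreement."""
    if len(prior_means) != len(posterior_means):
        raise ValueError("inputs must have the same length")
    averages = [(p + q) / 2 for p, q in zip(prior_means, posterior_means)]
    # Assumed sign convention: prior minus posterior.
    differences = [p - q for p, q in zip(prior_means, posterior_means)]
    bias = mean(differences)
    spread = 1.96 * stdev(differences)  # requires at least two variants
    return averages, differences, bias, (bias - spread, bias + spread)


# Hypothetical penetrance estimates for four variants.
avg, diff, bias, limits = bland_altman([0.10, 0.30, 0.55, 0.05], [0.12, 0.25, 0.60, 0.04])
print(f"bias = {bias:.3f}")
```

A bias near zero with differences scattered evenly around it is what "no systematic bias" would look like numerically.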
Weighted R² from EM prior means to Empirical/EM posterior means.
Models trained with displayed subsets of features using the same subset of variants, where covariates listed in S1 Table are known.
| Features | Empirical | EM |
|---|---|---|
| Peak Current | 0.22 [0.12–0.34; 155] | 0.35 [0.24–0.45; 20] |
| Penetrance Density | 0.35 [0.20–0.49; 113] | 0.66 [0.53–0.76; -124] |
| Peak Current and Penetrance Density | 0.43 [0.27–0.57; 88] | 0.76 [0.66–0.83; -201] |
| All Features | 0.44 [0.28–0.59; 90] | 0.78 [0.69–0.85; -218] |
| Sequence-based Features | 0.12 [0.06–0.19; 189] | 0.20 [0.12–0.28; 74] |
†Weighted R² [95% confidence interval; Akaike information criterion], weighted by inverse beta-binomial variance capped at the 9th decile, as described in the Methods section.
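One plausible reading of the weighting in the footnote is sketched below. The exact formula is given in the paper's methods, which are not reproduced here, so the definition of weighted R² as one minus the ratio of weighted residual to weighted total sum of squares is an assumption, as are the names `cap_at_ninth_decile` and `weighted_r2`. The weights are taken as given inputs (standing in for inverse beta-binomial variances).

```python
from statistics import quantiles


def cap_at_ninth_decile(weights):
    """Clip weights at the 9th decile so a few heavily observed variants cannot dominate."""
    cap = quantiles(weights, n=10)[8]  # last of the nine decile cut points
    return [min(w, cap) for w in weights]


def weighted_r2(observed, predicted, weights):
    """Assumed definition: 1 - weighted residual SS / weighted total SS."""
    total_weight = sum(weights)
    weighted_mean = sum(w * y for w, y in zip(weights, observed)) / total_weight
    ss_res = sum(w * (y - f) ** 2 for w, y, f in zip(weights, observed, predicted))
    ss_tot = sum(w * (y - weighted_mean) ** 2 for w, y in zip(weights, observed))
    return 1.0 - ss_res / ss_tot


# Hypothetical weights: one variant with a very large inverse variance.
raw_weights = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
capped = cap_at_ninth_decile(raw_weights)
print(max(capped) < 100)  # the outlying weight has been clipped
```

Capping keeps precisely estimated variants influential while preventing any single variant from determining the fit statistic.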
Fig 3. Prior mean BrS1 penetrance reflects the protein topology of NaV1.5.
Shown is the predicted mean BrS1 penetrance from the converged expectation maximization (EM) algorithm. The line across the plot is the predicted mean BrS1 penetrance averaged over 30 neighboring variants. The topology diagram is shown above, with transmembrane helices indicated by yellow lines and the membrane indicated as a grey rectangle. Note that the four largest, distinct peaks correspond to the four structured transmembrane domains of the channel, with an especially steep peak at the selectivity filter and pore. Though estimated distances in three-dimensional space between residues are used to construct the BrS1 penetrance density, structural data are not explicitly used in the BrS1 penetrance prior, so the recapitulation of the structure is not assured.
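The smoothing line described in the caption can be approximated with a moving average over variants ordered by residue position. This is a sketch under assumptions: the caption does not say whether the 30-variant window is centered or how the ends of the sequence are handled, so the centered, edge-shrinking window and the name `neighbor_average` are illustrative choices.

```python
def neighbor_average(values, window=30):
    """Smooth values ordered by residue position with a centered moving average."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half)  # window shrinks near the two ends
        neighbors = values[lo:hi]
        smoothed.append(sum(neighbors) / len(neighbors))
    return smoothed


# Hypothetical per-variant prior means: a low region followed by a high region.
penetrance = [0.05] * 40 + [0.40] * 40
line = neighbor_average(penetrance)
print(len(line) == len(penetrance))
```

Averaging over neighbors suppresses variant-to-variant noise, which is what lets domain-scale peaks stand out.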
Fig 4. Sample of BrS1 penetrance prior 95% credible intervals.
Left: SCN5A variants with more than one heterozygote in our dataset are plotted with prior 95% credible intervals (colored bars) and mean posteriors (black rectangles) with posterior 95% credible intervals (black lines). Right: a model of the SCN5A protein product, NaV1.5, is shown with regions highlighted in blue, green, gold, and red, corresponding to the colors of the variant prior 95% credible intervals shown to the left, which are analogous to the penetrance probability distributions shown on the y-axes in Fig 1. Variants near the D-III pore selectivity filter have a much higher prior and posterior BrS1 penetrance than residues near the D-III/D-IV linker. This is expected, since the selectivity filter pore helices contain the most compact region of the protein and are also responsible for ion conduction, and are therefore most sensitive to substitution. In fact, the highest density of variants with non-zero BrS1 penetrance lies at this depth in the membrane (S6 Fig). Variants listed are c.4057G>A (p.Val1353Met), c.4070C>T (p.Ala1357Val), c.4109A>G (p.Asp1370Gly), c.4132G>A (p.Val1378Met), c.4140C>G (p.Asn1380Lys), c.4137_4139CAA (p.Asn1380del), c.4145G>T (p.Ser1382Ile), c.4171G>A (p.Gly1391Arg), c.4192G>A (p.Val1398Met), c.4213G>C (p.Val1405Leu), c.4213G>A (p.Val1405Met), c.4217G>A (p.Gly1406Glu), c.4216G>A (p.Gly1406Arg), c.4222G>A (p.Gly1408Arg), c.4258G>C (p.Gly1420Arg), c.4259G>T (p.Gly1420Val), c.4282G>T (p.Ala1428Ser), c.4283C>T (p.Ala1428Val), c.4288G>A (p.Asp1430Asn), c.4296G>T (p.Arg1432Ser), c.4297G>T (p.Gly1433Trp), c.4328A>G (p.Asn1443Ser), c.4333T>C (p.Tyr1445His), c.4342A>C (p.Ile1448Leu), c.4346A>G (p.Tyr1449Cys), c.4381A>T (p.Thr1461Ser), c.4414_4417AAC (p.Asn1472del), c.4418T>G (p.Phe1473Cys), c.4427A>G (p.Gln1476Arg), c.4459A>C (p.Met1487Leu), c.4467G>T (p.Glu1489Asp).
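A 95% credible interval for a Beta-distributed penetrance can be approximated by sampling, as in the sketch below. It assumes the prior or posterior is a single Beta distribution with known parameters and uses an equal-tailed interval; the function name `beta_credible_interval` and the parameter values are hypothetical, and the paper may compute its intervals differently.

```python
import random


def beta_credible_interval(alpha, beta, level=0.95, draws=20000, seed=0):
    """Approximate an equal-tailed credible interval of Beta(alpha, beta) by sampling."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    samples = sorted(rng.betavariate(alpha, beta) for _ in range(draws))
    tail = (1.0 - level) / 2.0
    lower = samples[int(tail * draws)]
    upper = samples[int((1.0 - tail) * draws) - 1]
    return lower, upper


# Hypothetical Beta(4, 6) penetrance distribution (mean 0.40).
low, high = beta_credible_interval(4.0, 6.0)
print(low < 0.40 < high)
```

Intervals narrow as the Beta parameters grow, which mirrors how posterior intervals tighten for variants with many observed heterozygotes.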