Julia K Goodrich1, Moriel Singer-Berk1, Rachel Son1, Abigail Sveden1, Jordan Wood1, Eleina England1, Joanne B Cole1, Ben Weisburd1, Nick Watts1, Lizz Caulkins1, Peter Dornbos1, Ryan Koesterer1, Zachary Zappala1, Haichen Zhang2, Kristin A Maloney2, Andy Dahl3, Carlos A Aguilar-Salinas4, Gil Atzmon5,6,7, Francisco Barajas-Olmos8, Nir Barzilai5,7, John Blangero9, Eric Boerwinkle10,11, Lori L Bonnycastle12, Erwin Bottinger13, Donald W Bowden14,15,16, Federico Centeno-Cruz8, John C Chambers17,18, Nathalie Chami13,19, Edmund Chan20, Juliana Chan21,22,23,24, Ching-Yu Cheng25,26,27, Yoon Shin Cho28, Cecilia Contreras-Cubas8, Emilio Córdova8, Adolfo Correa29, Ralph A DeFronzo30, Ravindranath Duggirala9, Josée Dupuis31, Ma Eugenia Garay-Sevilla32, Humberto García-Ortiz8, Christian Gieger33,34,35, Benjamin Glaser36, Clicerio González-Villalpando37, Ma Elena Gonzalez38, Niels Grarup39, Leif Groop40,41, Myron Gross42, Christopher Haiman43, Sohee Han44, Craig L Hanis10, Torben Hansen39, Nancy L Heard-Costa45,46, Brian E Henderson43, Juan Manuel Malacara Hernandez32, Mi Yeong Hwang44, Sergio Islas-Andrade8, Marit E Jørgensen47,48,49, Hyun Min Kang50, Bong-Jo Kim44, Young Jin Kim44, Heikki A Koistinen51,52,53, Jaspal Singh Kooner54,55,56,57, Johanna Kuusisto58, Soo-Heon Kwak59, Markku Laakso58, Leslie Lange60, Jong-Young Lee61, Juyoung Lee44, Donna M Lehman30, Allan Linneberg62,63,64, Jianjun Liu20,65,66, Ruth J F Loos13,19, Valeriya Lyssenko38,67, Ronald C W Ma21,22,23,24, Angélica Martínez-Hernández8, James B Meigs1,68,69, Thomas Meitinger70,71, Elvia Mendoza-Caamal8, Karen L Mohlke72, Andrew D Morris73,74, Alanna C Morrison10, Maggie C Y Ng14,15,16, Peter M Nilsson75, Christopher J O'Donnell76,77,78,79, Lorena Orozco8, Colin N A Palmer80, Kyong Soo Park59,81,82, Wendy S Post83, Oluf Pedersen39, Michael Preuss13, Bruce M Psaty84,85, Alexander P Reiner86, Cristina Revilla-Monsalve8, Stephen S Rich87, Jerome I Rotter88, Danish Saleheen89,90,91, Claudia Schurmann13,92,93, Xueling 
Sim65, Rob Sladek94,95,96, Kerrin S Small97, Wing Yee So21,22,23, Timothy D Spector97, Konstantin Strauch98,99, Tim M Strom70,100, E Shyong Tai20,27,65, Claudia H T Tam21,22,23, Yik Ying Teo65,101,102, Farook Thameem103, Brian Tomlinson104, Russell P Tracy105,106, Tiinamaija Tuomi40,41,107,108,109, Jaakko Tuomilehto110,111,112,113, Teresa Tusié-Luna114,115, Rob M van Dam20,65,116, Ramachandran S Vasan45,117, James G Wilson118, Daniel R Witte119,120, Tien-Yin Wong25,26,27, Noël P Burtt1, Noah Zaitlen3, Mark I McCarthy121,73,122, Michael Boehnke50, Toni I Pollin2, Jason Flannick1,123,76, Josep M Mercader1,124,68, Anne O'Donnell-Luria1,123,76, Samantha Baxter1, Jose C Florez1,124,68, Daniel G MacArthur1,125,126, Miriam S Udler127,128,129.
Abstract
Hundreds of thousands of genetic variants have been reported to cause severe monogenic diseases, but the probability that a variant carrier develops the disease (termed penetrance) is unknown for virtually all of them. Additionally, the clinical utility of common polygenic variation remains uncertain. Using exome sequencing from 77,184 adult individuals (38,618 multi-ancestral individuals from a type 2 diabetes case-control study and 38,566 participants from the UK Biobank, for whom genotype array data were also available), we apply clinical standard-of-care gene variant curation for eight monogenic metabolic conditions. Rare variants causing monogenic diabetes and dyslipidemias display effect sizes significantly larger than the top 1% of the corresponding polygenic scores. Nevertheless, penetrance estimates for monogenic variant carriers average 60% or lower for most conditions. We assess epidemiologic and genetic factors contributing to risk prediction in monogenic variant carriers, demonstrating that inclusion of polygenic variation significantly improves biomarker estimation for two monogenic dyslipidemias.
Year: 2021 PMID: 34108472 PMCID: PMC8190084 DOI: 10.1038/s41467-021-23556-4
Source DB: PubMed Journal: Nat Commun ISSN: 2041-1723 Impact factor: 17.694
Fig. 1 Curation of ClinVar and pLoF variants across the monogenic conditions.
Total number of curated ClinVar/Review (blue) and pLoF (red) variants with carriers in AMP-T2D-GENES (left panel) and UKB (right panel). Darker color shades indicate variants determined to be clinically significant (pathogenic, likely pathogenic, or pLoF) and lighter shades indicate variants excluded during curation from further analysis.
Impact of clinically significant variants on traits.
| Condition (proxy measure) | Gene | AMP-T2D-GENES carriers (N) | AMP-T2D-GENES Beta (se) | AMP-T2D-GENES P* | UK Biobank carriers (N) | UK Biobank Beta (se) | UK Biobank P* |
|---|---|---|---|---|---|---|---|
| High LDL (LDL mg/dL) | composite | 55 | 56.0 (5.2) | 3.9 × 10^−24 | 83 | 54.2 (3.9) | 1.6 × 10^−44 |
| | | 11 | 31.5 (12.0) | 8.9 × 10^−3 | 26 | 52.2 (6.8) | 2.2 × 10^−14 |
| | | 44 | 65.3 (6.3) | 9.0 × 10^−25 | 57 | 55.1 (4.7) | 1.1 × 10^−31 |
| Low LDL (LDL mg/dL) | composite | 35 | −56.1 (7.1) | 4.4 × 10^−15 | 90 | −56.4 (3.7) | 6.9 × 10^−52 |
| | | 8 | −79.8 (14.7) | 5.9 × 10^−8 | 48 | −74.5 (5.1) | 6.7 × 10^−48 |
| | | 27 | −48.7 (8.2) | 2.6 × 10^−9 | 42 | −36.1 (5.4) | 2.7 × 10^−11 |
| High HDL (HDL mg/dL) | CETP | 21 | 16.5 (3.0) | 3.6 × 10^−8 | 20 | 16.8 (2.4) | 2.3 × 10^−12 |
| High triglycerides (TG mg/dL) | composite | 20 | 130.0 (27.3) | 2.8 × 10^−6 | 54 | 126.0 (12.2) | 2.4 × 10^−16 |
| | | 15 | 122.4 (29.7) | 2.6 × 10^−5 | 38 | 145.5 (13.6) | 2.4 × 10^−14 |
| | | 5 | 152.8 (54.6) | 2.5 × 10^−2 | 16 | 79.3 (22.4) | 9.4 × 10^−4 |
| Monogenic obesity (BMI kg/m2) | MC4R | 28 | 1.5 (1.0) | 6.3 × 10^−2 | 31 | 2.2 (0.8) | 6.3 × 10^−3 |
Composite = individuals carrying variants in any of the genes analyzed for that condition. Note that MODY composite gene set included GCK, HNF1A, HNF1B, HNF4A, and PDX1.
*Comparison of variant carriers to non-carriers using EPACTS two-sided burden testing, adjusted for age, sex, and 10 PCs. No adjustment has been made for multiple comparisons.
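As a rough illustration of what a covariate-adjusted carrier versus non-carrier comparison estimates, the sketch below collapses variants into a single 0/1 carrier indicator and fits ordinary least squares with that indicator plus covariates. It is not EPACTS and does not reproduce its test statistic or P values; the function names (`solve_linear`, `burden_beta`) and the toy data are hypothetical, and principal components would simply be further covariate columns.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x


def burden_beta(phenotype, carrier, covariates):
    """OLS effect of carrier status (0/1) on a phenotype, adjusted for covariates.

    Design matrix columns: intercept, carrier indicator, then the covariates.
    Returns only the carrier coefficient (the "Beta" in a burden-style test).
    """
    X = [[1.0, float(c)] + [float(v) for v in cov]
         for c, cov in zip(carrier, covariates)]
    p = len(X[0])
    # Normal equations: (X^T X) beta = X^T y
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
    Xty = [sum(row[i] * y for row, y in zip(X, phenotype)) for i in range(p)]
    return solve_linear(XtX, Xty)[1]


# Toy data (hypothetical): covariates are (age, sex).
carrier = [0, 0, 1, 0, 1, 0, 0, 1]
covs = [(40, 0), (55, 1), (62, 0), (47, 1), (58, 1), (70, 0), (33, 1), (49, 0)]
ldl = [100 + 50 * c + 0.5 * age - 3 * sex for c, (age, sex) in zip(carrier, covs)]
print(burden_beta(ldl, carrier, covs))
```

Because the toy phenotype is generated without noise, the printed coefficient should recover the simulated carrier effect up to floating-point error; real data would also require a standard error and a test statistic.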
Fig. 2 Carriers of rare clinically significant monogenic variants for lipid conditions and monogenic diabetes have more extreme effect size estimates than individuals with the top 1% of global extended polygenic scores (gePS).
In all plots, data are from UK Biobank participants. The left panels show the distribution of the phenotype in each percentile of the gePS for the relevant condition (black; mean N = 364 individuals per centile), and the right panels show the phenotype distribution in carriers of rare clinically significant monogenic variants for the corresponding condition (red): low LDL cholesterol (APOB, PCSK9; N = 90), high LDL cholesterol (LDLR, APOB; N = 83), high HDL cholesterol (CETP; N = 20), high triglycerides (APOA5, LPL; N = 54), monogenic obesity (MC4R; N = 31), and MODY (GCK, HNF1A, PDX1; N = 16). A–E Mean and 95% CI of each phenotype are indicated by the point and error bars, respectively. The same gePS calculated for risk of increasing LDL levels was used for (A and B); however, the inverse of this gePS was used for (B) to illustrate that higher gePS indicates risk of lower LDL cholesterol. F The proportion of individuals with diabetes and 95% CI computed with the Clopper–Pearson method are shown as points and error bars, respectively. Individuals in the gePS analysis were restricted to those aged ≥ 60 years. LDL cholesterol and triglyceride values were adjusted for lipid-lowering medication use (see “Methods”).
Fig. 3 Phenotype distributions and penetrance estimates of clinically significant variant carriers.
In all plots, clinically significant variant carriers are shown in red and non-carriers are shown in grey. The left panel of each plot shows AMP-T2D-GENES participants (T2D case/control study) and the right panel shows UK Biobank participants (population-based study). See Supplementary Data 3 for individual counts. A Mean and 95% CI are represented by the black circle and black lines, respectively. Relevant lipid levels (mg/dL) or body mass index (kg/m2) are shown for carriers (C) and non-carriers (NC) of clinically significant variants for the five monogenic conditions. The blue boxes indicate the phenotype values that meet a clinical threshold for diagnosis of each of the conditions, and P values were obtained by two-tailed burden analysis in EPACTS (see “Methods”). No adjustment has been made for multiple testing. B Dots are the proportion of individuals who have the condition based on the clinical diagnosis threshold for each condition; for MODY, we show the proportion of individuals meeting T2D as well as T2D and prediabetes criteria (see “Methods”). Error bars reflect 95% CI computed with the Clopper–Pearson method.
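For readers unfamiliar with the Clopper–Pearson method, the following is a minimal sketch of the exact binomial interval it refers to, applied to a penetrance-style proportion (affected carriers out of all carriers). It uses only the Python standard library and finds the bounds by bisection on the binomial tail probabilities; the function names and the example counts are hypothetical and are not taken from the study.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided (1 - alpha) confidence interval for a proportion k/n."""

    def solve(f, target):
        # f is decreasing in p on [0, 1]; bisect for the p where f(p) = target.
        lo, hi = 0.0, 1.0
        for _ in range(100):
            mid = (lo + hi) / 2
            if f(mid) > target:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    # Lower bound: P(X >= k | p) = alpha/2, i.e. P(X <= k-1 | p) = 1 - alpha/2.
    lower = 0.0 if k == 0 else solve(lambda p: binom_cdf(k - 1, n, p), 1 - alpha / 2)
    # Upper bound: P(X <= k | p) = alpha/2.
    upper = 1.0 if k == n else solve(lambda p: binom_cdf(k, n, p), alpha / 2)
    return lower, upper


# Hypothetical example: 12 of 20 carriers meet the clinical threshold.
low, high = clopper_pearson(12, 20)
print(f"penetrance estimate {12 / 20:.2f}, 95% CI ({low:.3f}, {high:.3f})")
```

The interval is conservative by construction, which is why it is a common choice for small carrier counts such as those shown in the figure.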
Fig. 4 Ascertainment bias significantly impacts expressivity of clinically significant variants for LDL cholesterol conditions.
LDL cholesterol levels are shown for carriers and non-carriers of LDL cholesterol raising (top panels) or lowering (bottom panels) clinically significant variants in AMP-T2D-GENES. The variant carriers are stratified by whether they were identified in individuals phenotypically ascertained for extreme serum LDL cholesterol levels (Yes, Red) or in a separate unascertained population (No, Blue) (see “Methods”). The left panels show all clinically significant variant carriers. The right panels show carriers of the single variants that were present in both ascertained and unascertained individuals. Top left, LDL-raising variant Non-carriers N = 19,131, Carriers not ascertained on LDL cholesterol level N = 55, Carriers ascertained on LDL cholesterol level N = 18. Bottom left, LDL-lowering variant Non-carriers N = 19,151, Carriers not ascertained N = 35, Carriers ascertained N = 15. Mean and 95% CI are represented by the black circle and black lines, respectively. LDL cholesterol values are adjusted for lipid-lowering medication use (see “Methods”). See also Supplementary Table 4.
Fig. 5 The combination of clinically significant monogenic variants and corresponding polygenic scores significantly improves prediction for high HDL cholesterol and high triglyceride conditions.
In all plots, an empirical cumulative distribution function (CDF) of each phenotype is shown for clinically significant variant carriers and non-carriers in the UKB for each monogenic condition, stratified by bottom/top quartiles of the corresponding gePS. The monogenic conditions are (A) low LDL cholesterol (APOB, PCSK9), (B) high LDL cholesterol (LDLR, APOB), (C) high HDL cholesterol (CETP), (D) high triglycerides (APOA5, LPL), and (E) monogenic obesity (MC4R). The same gePS calculated for risk of increasing LDL cholesterol levels was used for (A and B); however, the inverse of the gePS was used for (A) to illustrate that higher gePS indicates risk of lower LDL cholesterol. The impact of higher gePS was tested in carrier-only linear regression analysis; asterisks indicate two-sided P < 0.05 unadjusted for multiple testing (High HDL P = 0.012, High Triglycerides P = 0.014). See also Supplementary Table 5.
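An empirical CDF of the kind plotted here gives, for any value x, the fraction of individuals in a group whose phenotype is at most x. The sketch below shows one simple way to build it with the standard library; the function name `ecdf` and the sample values are hypothetical, and in practice one such curve would be computed per stratum (for example, carriers in the top gePS quartile).

```python
from bisect import bisect_right


def ecdf(values):
    """Return F, where F(x) is the fraction of observations <= x."""
    xs = sorted(values)
    n = len(xs)

    def F(x):
        # bisect_right counts how many sorted observations are <= x.
        return bisect_right(xs, x) / n

    return F


# Hypothetical HDL values (mg/dL) for one stratum of carriers.
F_carriers = ecdf([62, 71, 71, 85, 90])
print(F_carriers(71))  # fraction of this stratum with HDL <= 71 mg/dL
```

Comparing two such curves shows whether one stratum is shifted toward higher or lower phenotype values across the whole distribution, not only at the mean.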