Jason Flannick1,2, Christian Fuchsberger3, Anubha Mahajan4, Tanya M Teslovich3, Vineeta Agarwala2,5, Kyle J Gaulton4, Lizz Caulkins2, Ryan Koesterer2, Clement Ma3, Loukas Moutsianas4, Davis J McCarthy4,6, Manuel A Rivas4, John R B Perry4,7,8,9, Xueling Sim3, Thomas W Blackwell3, Neil R Robertson4,10, N William Rayner4,10,11, Pablo Cingolani12,13, Adam E Locke3, Juan Fernandez Tajes4, Heather M Highland14, Josee Dupuis15,16, Peter S Chines17, Cecilia M Lindgren2,4, Christopher Hartl2, Anne U Jackson3, Han Chen15,18, Jeroen R Huyghe3, Martijn van de Bunt4,10, Richard D Pearson4, Ashish Kumar4,19, Martina Müller-Nurasyid20,21,22,23, Niels Grarup24, Heather M Stringham3, Eric R Gamazon25, Jaehoon Lee26, Yuhui Chen4, Robert A Scott8, Jennifer E Below27, Peng Chen28, Jinyan Huang29, Min Jin Go30, Michael L Stitzel31, Dorota Pasko7, Stephen C J Parker32, Tibor V Varga33, Todd Green2, Nicola L Beer10, Aaron G Day-Williams11, Teresa Ferreira4, Tasha Fingerlin34, Momoko Horikoshi4,10, Cheng Hu35, Iksoo Huh26, Mohammad Kamran Ikram36,37,38, Bong-Jo Kim30, Yongkang Kim26, Young Jin Kim30, Min-Seok Kwon39, Juyoung Lee30, Selyeong Lee26, Keng-Han Lin3, Taylor J Maxwell27, Yoshihiko Nagai13,40,41, Xu Wang28, Ryan P Welch3, Joon Yoon39, Weihua Zhang42,43, Nir Barzilai44, Benjamin F Voight45,46, Bok-Ghee Han30, Christopher P Jenkinson47,48, Teemu Kuulasmaa49, Johanna Kuusisto49,50, Alisa Manning2, Maggie C Y Ng51,52, Nicholette D Palmer51,52,53, Beverley Balkau54, Alena Stančáková49, Hanna E Abboud47, Heiner Boeing55, Vilmantas Giedraitis56, Dorairaj Prabhakaran57, Omri Gottesman58, James Scott59, Jason Carey2, Phoenix Kwan3, George Grant2, Joshua D Smith60, Benjamin M Neale2,61, Shaun Purcell2,62,63, Adam S Butterworth64, Joanna M M Howson64, Heung Man Lee65, Yingchang Lu58, Soo-Heon Kwak66, Wei Zhao67, John Danesh11,64,68, Vincent K L Lam65, Kyong Soo Park69, Danish Saleheen70,71, Wing Yee So65, Claudia H T Tam65, Uzma Afzal42, David Aguilar72, Rector Arya73, Tin Aung36,37,38, 
Edmund Chan74, Carmen Navarro75,76,77, Ching-Yu Cheng28,36,37,38, Domenico Palli78, Adolfo Correa79, Joanne E Curran80, Dennis Rybin15, Vidya S Farook81, Sharon P Fowler47, Barry I Freedman82, Michael Griswold83, Daniel Esten Hale73, Pamela J Hicks51,52,53, Chiea-Chuen Khor28,36,37,84,85, Satish Kumar80, Benjamin Lehne42, Dorothée Thuillier86, Wei Yen Lim28, Jianjun Liu28,85, Marie Loh42,87,88, Solomon K Musani89, Sobha Puppala81, William R Scott42, Loïc Yengo86, Sian-Tsung Tan43,59, Herman A Taylor79, Farook Thameem47, Gregory Wilson90, Tien Yin Wong36,37,38, Pål Rasmus Njølstad91,92, Jonathan C Levy10, Massimo Mangino9,93, Lori L Bonnycastle17, Thomas Schwarzmayr94, João Fadista95, Gabriela L Surdulescu9, Christian Herder96,97, Christopher J Groves10, Thomas Wieland94, Jette Bork-Jensen24, Ivan Brandslund98,99, Cramer Christensen100, Heikki A Koistinen101,102,103,104, Alex S F Doney105, Leena Kinnunen101, Tõnu Esko2,106,107,108, Andrew J Farmer109, Liisa Hakaste102,110,111, Dylan Hodgkiss9, Jasmina Kravic95, Valeri Lyssenko95, Mette Hollensted24, Marit E Jørgensen112, Torben Jørgensen113,114,115, Claes Ladenvall95, Johanne Marie Justesen24, Annemari Käräjämäki116,117, Jennifer Kriebel97,118,119, Wolfgang Rathmann97,120, Lars Lannfelt56, Torsten Lauritzen121, Narisu Narisu17, Allan Linneberg113,122,123, Olle Melander124, Lili Milani106, Matt Neville10,125, Marju Orho-Melander126, Lu Qi127,128, Qibin Qi127,129, Michael Roden96,97,130, Olov Rolandsson131, Amy Swift17, Anders H Rosengren95, Kathleen Stirrups11, Andrew R Wood7, Evelin Mihailov106, Christine Blancher132, Mauricio O Carneiro2, Jared Maguire2, Ryan Poplin2, Khalid Shakir2, Timothy Fennell2, Mark DePristo2, Martin Hrabé de Angelis97,133,134, Panos Deloukas11,135,136, Anette P Gjesing24, Goo Jun3,27, Peter Nilsson137, Jacquelyn Murphy2, Robert Onofrio2, Barbara Thorand97,118, Torben Hansen24,138, Christa Meisinger97,118, Frank B Hu29,127, Bo Isomaa110,139, Fredrik Karpe10,125, Liming Liang18,29, Annette 
Peters23,97,118, Cornelia Huth97,118, Stephen P O'Rahilly140, Colin N A Palmer141, Oluf Pedersen24, Rainer Rauramaa142, Jaakko Tuomilehto143,144,145,146, Veikko Salomaa146, Richard M Watanabe147,148,149, Ann-Christine Syvänen150, Richard N Bergman151, Dwaipayan Bharadwaj152, Erwin P Bottinger58, Yoon Shin Cho153, Giriraj R Chandak154, Juliana Cn Chan65,155,156, Kee Seng Chia28, Mark J Daly61, Shah B Ebrahim57, Claudia Langenberg8, Paul Elliott42,157, Kathleen A Jablonski158, Donna M Lehman47, Weiping Jia35, Ronald C W Ma65,155,156, Toni I Pollin159, Manjinder Sandhu11,64, Nikhil Tandon160, Philippe Froguel86,161, Inês Barroso11,140, Yik Ying Teo28,162,163, Eleftheria Zeggini11, Ruth J F Loos58, Kerrin S Small9, Janina S Ried20, Ralph A DeFronzo47, Harald Grallert97,118,119, Benjamin Glaser164, Andres Metspalu106, Nicholas J Wareham8, Mark Walker165, Eric Banks2, Christian Gieger20,118,119, Erik Ingelsson4,166, Hae Kyung Im25, Thomas Illig119,167,168, Paul W Franks33,127,131, Gemma Buck132, Joseph Trakalo132, David Buck132, Inga Prokopenko4,10,161, Reedik Mägi106, Lars Lind169, Yossi Farjoun170, Katharine R Owen10,125, Anna L Gloyn4,10,125, Konstantin Strauch20,22, Tiinamaija Tuomi102,110,111,171, Jaspal Singh Kooner43,59,172, Jong-Young Lee30, Taesung Park26,39, Peter Donnelly4,6, Andrew D Morris173,174, Andrew T Hattersley175, Donald W Bowden51,52,53, Francis S Collins17, Gil Atzmon44,176, John C Chambers42,43,172, Timothy D Spector9, Markku Laakso49,50, Tim M Strom94,177, Graeme I Bell178, John Blangero80, Ravindranath Duggirala81, E Shyong Tai28,74,179, Gilean McVean4,180, Craig L Hanis27, James G Wilson181, Mark Seielstad182,183, Timothy M Frayling7, James B Meigs184, Nancy J Cox25, Rob Sladek13,40,185, Eric S Lander186, Stacey Gabriel2, Karen L Mohlke187, Thomas Meitinger94,177, Leif Groop95,171, Goncalo Abecasis3, Laura J Scott3, Andrew P Morris4,106,188, Hyun Min Kang1, David Altshuler1,2,107,189,190,191, Noël P Burtt2, Jose C Florez2,62,189,190, Michael 
Boehnke3, Mark I McCarthy4,10,125.
Abstract
To investigate the genetic basis of type 2 diabetes (T2D) to high resolution, the GoT2D and T2D-GENES consortia catalogued variation from whole-genome sequencing of 2,657 European individuals and exome sequencing of 12,940 individuals of multiple ancestries. Over 27M SNPs, indels, and structural variants were identified, including 99% of low-frequency (minor allele frequency [MAF] 0.1-5%) non-coding variants in the whole-genome sequenced individuals and 99.7% of low-frequency coding variants in the whole-exome sequenced individuals. Each variant was tested for association with T2D in the sequenced individuals, and, to increase power, most were tested in larger numbers of individuals (>80% of low-frequency coding variants in ~82K Europeans via the exome chip, and ~90% of low-frequency non-coding variants in ~44K Europeans via genotype imputation). The variants, genotypes, and association statistics from these analyses provide the largest reference to date of human genetic information relevant to T2D, for use in activities such as T2D-focused genotype imputation, functional characterization of variants or genes, and other novel analyses to detect associations between sequence variation and T2D.
Year: 2017 PMID: 29257133 PMCID: PMC5735917 DOI: 10.1038/sdata.2017.179
Source DB: PubMed Journal: Sci Data ISSN: 2052-4463 Impact factor: 8.501
Figure 1. Overview of data and analysis generation.
Shown is a flowchart for variant calling, quality control, and propagation of variants in both the WGS panel and WES panel. (a) Variant calling and quality control in the WGS and WES panels. Individuals were characterized with one or more sequencing and genotyping technologies, and then individuals and variants were excluded based on quality control metrics. The final WGS panel consists of data from 2,657 individuals and 26.7M variants, while the final WES panel consists of data from 12,940 individuals and 3.04M variants. (b) Assessing variants in larger sample sizes. Non-coding variants from the WGS panel were studied via statistical imputation in 44,414 additional individuals from cohorts within the DIAGRAM consortium. Coding variants from the WES panel were genotyped on the exome array in 79,854 additional individuals. Modified from Extended Data Fig. 1 of Fuchsberger et al.[8].
Summary of studies included in WGS panel.
Shown are the numbers of individuals included in association analysis for the GoT2D whole-genome sequencing study, stratified by their study of origin. Columns from left show the ancestry of individuals in each study, the name of the study (or studies), the country of origin for the individuals, the number of cases and controls, and the total number of individuals. Reproduced from Extended Data Table 1 of Fuchsberger et al.[8].

| Ancestry | Study | Country | Cases | Controls | Total |
|---|---|---|---|---|---|
| European | Finland-United States Investigation of NIDDM Genetics (FUSION) Study | Finland | 493 (41.5) | 486 (45.2) | 979 |
| European | Kooperative Gesundheitsforschung in der Region Augsburg (KORA) | Germany | 101 (44.5) | 104 (66.3) | 205 |
| European | Malmö-Botnia Study | Finland, Sweden | 410 (51.5) | 419 (44.1) | 829 |
| European | UK Type 2 Diabetes Genetics Consortium (UKT2D) | UK | 322 (46.2) | 322 (82.2) | 644 |
Summary of studies included in WES panel.
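Shown are the numbers of individuals included in the GoT2D and T2D-GENES exome sequencing studies, stratified by their study of origin. Columns are as described for the summary of studies included in the WGS panel.

| Ancestry | Study | Country | Cases | Controls | Total |
|---|---|---|---|---|---|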
| African American | Jackson Heart Study (JHS) | US | 500 (66.6) | 526 (63.3) | 1,026 |
| African American | Wake Forest School of Medicine Study (WF) | US | 518 (59.5) | 530 (56.0) | 1,048 |
| East Asian | Korea Association Research Project (KARE) | Korea | 526 (45.6) | 561 (58.5) | 1,087 |
| East Asian | Singapore Diabetes Cohort Study; Singapore Prospective Study Program | Singapore (Chinese) | 486 (52.1) | 592 (61.3) | 1,078 |
| European | Ashkenazi | US, Israel | 506 (47.0) | 355 (56.9) | 861 |
| European | Metabolic Syndrome in Men Study (METSIM) | Finland | 484 (0) | 498 (0) | 982 |
| European | Finland-United States Investigation of NIDDM Genetics (FUSION) | Finland | 472 (42.6) | 476 (45.0) | 948 |
| European | Kooperative Gesundheitsforschung in der Region Augsburg (KORA) | Germany | 97 (44.3) | 90 (63.3) | 187 |
| European | UK Type 2 Diabetes Genetics Consortium (UKT2D) | UK | 322 (45.7) | 320 (82.8) | 642 |
| European | Malmö-Botnia Study | Finland, Sweden | 478 (54.8) | 443 (43.8) | 921 |
| Hispanic | San Antonio Family Heart Study (SAFHS), San Antonio Family Diabetes/Gallbladder Study (SAFDGS), Veterans Administration Genetic Epidemiology Study (VAGES), and the Investigation of Nephropathy and Diabetes Study Family Component (SAMAFS) | US | 272 (58.8) | 218 (58.7) | 490 |
| Hispanic | Starr County, Texas | US | 749 (59.7) | 704 (71.9) | 1,453 |
| South Asian | London Life Sciences Population Study (LOLIPOP) | UK (Indian Asian) | 531 (14.1) | 538 (15.8) | 1,069 |
| South Asian | Singapore Indian Eye Study | Singapore (Indian Asian) | 563 (44.4) | 585 (49.2) | 1,148 |
Summary of variants in the WGS panel.
Shown are aggregate statistics on the variants in the WGS panel, stratified by type (SNV, indel, or SV), function, frequency, and presence in dbSNP b137. Adapted from Extended Data Table 2 in Fuchsberger et al.[8].

| Stratification | N (% total) | N (% total) | N (% total) |
|---|---|---|---|
| Type (SNV, indel, SV) | 25.2M (94%) | 1.50M (5.6%) | 8,876 (0.3%) |
| Function | 888K (3.3%) | 25.8M (96.7%) | |
| Frequency (first column: Common, MAF>5%) | 6.26M (23%) | 4.16M (16%) | 16.3M (61%) |
| Presence in dbSNP b137 | 14.6M (55%) | 12.1M (45%) | |
Summary of variant annotations in the WES panel.
Shown are aggregate statistics on variants in the WES panel, stratified by predicted molecular function. Variant annotations are produced from the Variant Effect Predictor.

| Variant annotation | | | | | | |
|---|---|---|---|---|---|---|
| Synonymous SNV | 627,630 | 237,430 | 178,232 | 192,282 | 156,231 | 211,218 |
| Missense SNV | 1,110,897 | 354,797 | 296,707 | 327,049 | 231,351 | 344,191 |
| Start SNV | 2,055 | 593 | 523 | 639 | 384 | 583 |
| Nonsense SNV | 26,321 | 7,188 | 6,668 | 8,030 | 4,660 | 7,339 |
| Frameshift INDEL | 26,901 | 6,605 | 6,159 | 7,515 | 4,155 | 6,609 |
| Inframe INDEL | 11,090 | 3,471 | 2,963 | 3,145 | 2,068 | 3,165 |
| 3′UTR SNV, INDEL | 65,013 | 24,583 | 19,149 | 21,102 | 16,959 | 22,177 |
| 5′UTR SNV, INDEL | 43,965 | 16,920 | 13,520 | 15,562 | 11,634 | 15,595 |
| Intron SNV, INDEL | 931,449 | 352,398 | 270,564 | 296,970 | 243,139 | 314,810 |
| Essential splicing SNV, INDEL | 14,286 | 3,648 | 3,454 | 4,108 | 2,301 | 3,744 |
| Other splicing SNV, INDEL | 128,644 | 45,876 | 35,413 | 38,263 | 30,301 | 41,122 |
| Non-coding RNA SNV, INDEL | 18,113 | 7,247 | 5,996 | 6,715 | 5,084 | 6,706 |
| Intergenic SNV, INDEL | 37,345 | 14,335 | 11,498 | 13,614 | 10,700 | 12,937 |
Summary of coding variant frequencies in the WES panel.
Shown are aggregate frequency statistics on coding variants in the WES panel, stratified by frequency. Counts and frequencies are shown for variants specific to each ancestry, as well as overall. Private: unique to one ancestry group; Cosmopolitan: observed across all ancestry groups. Adapted from Extended Data Table 2 in Fuchsberger et al.[8].

| Frequency class | | | | | | |
|---|---|---|---|---|---|---|
| Rare (MAF<0.5%) | 95.79% | 83.30% | 90.06% | 89.19% | 84.56% | 89.89% |
| Low frequency (0.5%<MAF<5%) | 2.57% | 10.36% | 4.61% | 5.52% | 8.21% | 5.10% |
| Common (MAF>5%) | 1.65% | 6.35% | 5.33% | 5.29% | 7.23% | 5.00% |
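
The frequency classes used in the table above can be expressed as a small helper. The following is a minimal sketch, not the consortia's code: the function names are hypothetical, the thresholds are the 0.5% and 5% cut-offs from the table, and assigning the exact boundary values to the low-frequency class is an assumption, since the table uses strict inequalities on both sides.

```python
def minor_allele_frequency(alt_allele_count: int, n_individuals: int) -> float:
    """MAF of a biallelic autosomal variant observed in diploid individuals."""
    if n_individuals <= 0:
        raise ValueError("n_individuals must be positive")
    n_chromosomes = 2 * n_individuals
    if not 0 <= alt_allele_count <= n_chromosomes:
        raise ValueError("allele count outside the possible range")
    freq = alt_allele_count / n_chromosomes
    # The minor allele is whichever of the two alleles is less frequent.
    return min(freq, 1.0 - freq)


def frequency_class(maf: float, rare_max: float = 0.005, common_min: float = 0.05) -> str:
    """Classify a MAF as rare (<0.5%), low frequency (0.5-5%), or common (>5%).

    Assumption: values exactly at 0.5% or 5% are treated as low frequency.
    """
    if not 0.0 <= maf <= 0.5:
        raise ValueError("MAF must lie in [0, 0.5]")
    if maf < rare_max:
        return "rare"
    if maf <= common_min:
        return "low frequency"
    return "common"
```

For example, a variant with allele count 3 among 2,657 sequenced individuals (5,314 chromosomes) has a MAF of roughly 0.06% and falls in the rare class.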
Additional cardiometabolic phenotypes measured in individuals included in the WGS and WES panels.
For each phenotype, shown are the number of samples with the phenotype measured, the mean value of the phenotype, and its standard deviation in cases within the WGS panel, controls within the WGS panel, cases within the WES panel, and controls within the WES panel. Some values should be used with caution, such as glycemic measurements in diabetes cases, and others should likely be adjusted prior to use, such as lipid values in individuals on lipid medications. Only the phenotypes directly available are listed in the table; some unmeasured phenotypes (such as Waist-Hip Ratio for samples in the WES panel) can be inferred from other phenotypes.

| Phenotype | WGS cases: N | WGS cases: mean (SD) | WGS controls: N | WGS controls: mean (SD) | WES cases: N | WES cases: mean (SD) | WES controls: N | WES controls: mean (SD) |
|---|---|---|---|---|---|---|---|---|
| Age (yr) | 1326 | 54.9 (9.3) | 1331 | 64.3 (8.6) | 6506 | 57.9 (10.1) | 6434 | 57.9 (13.0) |
| Age at diagnosis (yr) | 0 | — | 0 | — | 3745 | 48.3 (10.4) | 0 | — |
| BMI (kg/m²) | 1326 | 27.6 (4.9) | 1326 | 30.6 (5.0) | 6431 | 28.6 (5.6) | 6381 | 27.8 (5.8) |
| Weight (kg) | 0 | — | 0 | — | 5067 | 79.0 (18.3) | 5063 | 73.9 (18.5) |
| Height (cm) | 1326 | 168.9 (9.5) | 1326 | 166.6 (9.1) | 6433 | 165.9 (10) | 6385 | 165.2 (10.4) |
| Waist-Hip Ratio | 1114 | 0.94 (0.08) | 1224 | 0.91 (0.1) | 0 | — | 0 | — |
| Hip circumference (cm) | 1114 | 105.1 (9.7) | 1224 | 109 (10.5) | 4454 | 103.1 (11.0) | 4301 | 102.8 (11.8) |
| Waist circumference (cm) | 1114 | 98.6 (13) | 1224 | 99.1 (13.1) | 4995 | 99.8 (14.3) | 5158 | 94.1 (13.9) |
| Fasting blood glucose (mmol/l) | 22 | 9.9 (2.8) | 1330 | 5.2 (0.53) | 2837 | 8.6 (3.4) | 5247 | 5.0 (0.56) |
| 2-hour glucose (mmol/l) | 0 | — | 0 | — | 637 | 13.9 (4.4) | 1942 | 6.3 (1.8) |
| HbA1C (%) | 0 | — | 0 | — | 4403 | 8.4 (15.1) | 3098 | 5.6 (0.43) |
| Fasting blood insulin (μIU/ml) | 7 | 1.26 (1.1) | 1070 | 43.9 (41.2) | 1993 | 19.6 (26.9) | 4677 | 17.3 (25.6) |
| 2-hour insulin (μIU/ml) | 0 | — | 0 | — | 613 | 51.7 (60.4) | 1222 | 37.1 (50.4) |
| 2-hour C-peptide (ng/ml) | 0 | — | 0 | — | 52 | 1.7 (1.4) | 34 | 2.1 (1.9) |
| GAD antibodies (nmol/l) | 0 | — | 0 | — | 484 | 3.3 (4.5) | 0 | — |
| Total cholesterol (mmol/l) | 964 | 5.4 (1.2) | 1283 | 5.7 (1.0) | 5530 | 5.1 (1.2) | 5813 | 5.3 (1.0) |
| LDL (mmol/l) | 809 | 3.3 (1.0) | 1275 | 3.7 (0.97) | 4410 | 3.1 (0.98) | 4583 | 3.4 (0.93) |
| HDL (mmol/l) | 847 | 1.23 (0.35) | 1282 | 1.4 (0.41) | 5395 | 1.2 (0.35) | 5811 | 1.4 (0.39) |
| TG (mmol/l) | 963 | 2.0 (1.8) | 1282 | 1.4 (0.7) | 5524 | 2.0 (1.6) | 5812 | 1.5 (0.89) |
| Systolic blood pressure (mmHg) | 622 | 142 (21.2) | 904 | 134.4 (18.1) | 5143 | 135.8 (20.4) | 5411 | 130.2 (19.9) |
| Diastolic blood pressure (mmHg) | 622 | 83.4 (11.2) | 904 | 80.6 (10.3) | 5143 | 79.1 (11.4) | 5411 | 78.6 (11.0) |
| Creatinine (μmol/l) | 0 | — | 0 | — | 2819 | 85.8 (42.9) | 3189 | 84.4 (33.3) |
| Leptin (ng/ml) | 0 | — | 0 | — | 559 | 29.4 (23.4) | 658 | 27.3 (23.9) |
| Adiponectin (μg/ml) | 0 | — | 0 | — | 957 | 6.6 (5.4) | 1733 | 6.6 (5.1) |
| Diabetes medication (%) | 0 | — | 0 | — | 3770 | 70.7% | 3622 | 0% |
| Lipids medication (%) | 1187 | 19.7% | 1213 | 11.1% | 5688 | 38.4% | 5569 | 14.6% |
| Blood pressure medication (%) | 755 | 50.6% | 726 | 34.6% | 4589 | 58.8% | 4451 | 30.8% |
Summary of datasets.
Datasets from the T2D-GENES and GoT2D studies consist of individual genotypes and phenotypes as well as statistics from genome- or exome-wide association analysis. Quality control has been performed to exclude problematic variants or individuals with problematic genotypes. Datasets are available at dbGaP and the EGA.

| Samples | Data generation | Processing and quality control | Association analysis | Data available | Accession |
|---|---|---|---|---|---|
| 2,874 Europeans from the GoT2D consortium analysis | 5x whole-genome sequencing, 82x exome sequencing, SNP array genotyping | Integration, phasing, individual and variant exclusions | Single variant (allele count above 3) | Sequence reads | phs000840.v1.p1 |
| | | | | WGS panel, individual phenotypes | phs000840.v1.p1, EGAS00001001459 |
| | | | | Lists of individuals and variants in association analysis, variant association statistics | EGAS00001001459 |
| 13,008 individuals from the T2D-GENES consortium analysis | 82x exome sequencing | Individual and variant exclusions | Single variant, gene-level (four masks) | WES panel, individual phenotypes for all samples | EGAS00001001460 |
| | | | | QC+ variant list, list of individuals and variants in association analysis, variant association statistics, gene-level variant masks, gene-level association statistics | EGAS00001001460 |
| | | | | Sequence reads, genotypes and phenotypes (Starr County individuals) | phs001099.v1.p1 |
| | | | | Sequence reads, genotypes and phenotypes (JHS individuals) | phs001098.v1.p1 |
| | | | | Sequence reads, genotypes and phenotypes (SAMAFS individuals) | phs000849.v1.p1 |
| | | | | Sequence reads, genotypes and phenotypes (Singapore Chinese and Singapore Indian individuals) | phs001097.v1.p1 |
| | | | | Sequence reads, genotypes and phenotypes (KARE individuals) | phs001096.v1.p1 |
| | | | | Sequence reads, genotypes and phenotypes (Ashkenazi individuals) | phs001095.v1.p1 |
| | | | | Sequence reads, genotypes and phenotypes (LOLIPOP individuals) | phs001093.v1.p1 |
| | | | | Sequence reads, genotypes and phenotypes (METSIM individuals) | phs001100.v1.p1 |
| | | | | Sequence reads, genotypes and phenotypes (WFS individuals) | phs001102.v1.p1 |
| 44,414 Europeans | Imputation from WGS panel | Imputation quality | Single variant | Imputation quality scores, variant association statistics | EGAS00001001459 |
| 79,854 Europeans | Illumina exome array genotyping | Individual and variant exclusions | Single variant | Variant association statistics | EGAS00001001460 |
Figure 2. Summary of key quality control metrics for WGS and WES panels.
We computed several metrics to verify the sequencing accuracy of the study individuals. (a) Estimates of sensitivity of WGS panel. Shown is the fraction of variants, as a function of minor allele count in the WGS sequenced individuals, estimated as included in the WGS panel. Green circles show the total fraction of variants; green crosses show the fraction of variants for hypothetical variants with a T2D odds ratio of 5 (because T2D cases are overrepresented in our sample, the sensitivity to detect risk variants is increased). For comparison, shown are the fraction of variants that are included in the 1000G Phase 1 dataset (blue circles) or HapMap panel (red circles). (b) Distribution of minor alleles carried by individuals in the WES panel. For different populations within the WES panel, the distribution of minor alleles carried is plotted across all individuals. A normal distribution indicates a lack of systematic sequencing artefacts for any one individual, at least according to this metric. Afr-Am: African American. (c) Comparison of principal components computed from SNPs and indels versus indels alone. We calculated principal components for the European individuals in the WES panel using all variants in the panel and then again using only indels. Adapted from Supplementary Tables 5 and 6, and Fig. 1a, in Fuchsberger et al.[8].
Figure 3. Completeness of additional variant genotyping.
We calculated the fraction of variants in the WGS and WES panel that were captured via either imputation or exome array genotyping, respectively. (a) The mean imputation quality of variants in the WGS panel, as a function of their allele count in the WGS panel. Green circles show imputation quality in Finnish individuals, while green crosses show imputation quality in British individuals. For comparison, blue circles and crosses show imputation quality using the 1000G Phase 1 dataset as a reference panel (instead of the WGS panel). (b) The number of coding variants in the WES panel present on the exome array. Variants are stratified by annotation and frequency, and sensitivity calculations are shown for variants in each ancestry group as well as overall. Panel (b) is reproduced from Supplementary Fig. 17 in Fuchsberger et al.[8].
Figure 4. Power of single variant analysis in the WGS panel, WES panel, imputation, and exome array analyses.
Shown is the power to detect an association with variants of varying population frequencies and T2D odds ratios, at a relatively lenient significance level of α=10⁻⁴. Such a significance level would be insufficient to establish an association due to the burden of multiple testing, but lack of association at this significance level can place bounds on the maximum effect a variant has in the population. (a) Power for a variant of constant frequency and effect across all populations in the WGS panel. (b) Power for a variant of constant frequency and effect across all populations in the WES panel. (c) Power for a variant imputed from the WGS panel with imputation accuracy r²=0.8. (d) Power for a variant in both the WES panel and on the exome array. NA: number of affecteds (cases); NU: number of unaffecteds (controls); K: presumed prevalence of T2D in the population.
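
A power calculation of the kind summarized in Figure 4 can be approximated as follows. This is an illustrative sketch under stated assumptions, not the method used by the authors: it assumes Hardy-Weinberg equilibrium, a per-allele (log-additive) odds ratio, a two-sided normal-approximation test comparing allele frequencies between cases and controls, and it models imputation accuracy by scaling the effective sample size by r². The function names and the prevalence value in the example are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def case_control_allele_freqs(raf, odds_ratio, prevalence):
    """Risk-allele frequency in cases and controls (HWE, per-allele odds ratio)."""
    q = 1.0 - raf
    genotype_freqs = (q * q, 2.0 * raf * q, raf * raf)  # 0, 1, 2 risk alleles

    def penetrances(baseline_odds):
        odds = [baseline_odds * odds_ratio ** k for k in range(3)]
        return [o / (1.0 + o) for o in odds]

    # Bisection (on a log scale) for the baseline odds that reproduce prevalence K.
    lo, hi = 1e-12, 1e12
    for _ in range(200):
        mid = sqrt(lo * hi)
        implied = sum(g * f for g, f in zip(genotype_freqs, penetrances(mid)))
        if implied < prevalence:
            lo = mid
        else:
            hi = mid
    pen = penetrances(sqrt(lo * hi))
    k = sum(g * f for g, f in zip(genotype_freqs, pen))
    dosage = (0.0, 0.5, 1.0)  # fraction of risk alleles carried by each genotype
    p_case = sum(d * g * f for d, g, f in zip(dosage, genotype_freqs, pen)) / k
    p_ctrl = sum(d * g * (1.0 - f) for d, g, f in zip(dosage, genotype_freqs, pen)) / (1.0 - k)
    return p_case, p_ctrl


def single_variant_power(raf, odds_ratio, n_cases, n_controls, prevalence,
                         alpha=1e-4, r2=1.0):
    """Approximate power of a two-sided allelic test at significance level alpha."""
    p_case, p_ctrl = case_control_allele_freqs(raf, odds_ratio, prevalence)
    # Assumption: imperfect imputation shrinks the effective sample size by r2.
    chrom_cases = 2.0 * n_cases * r2
    chrom_ctrls = 2.0 * n_controls * r2
    se = sqrt(p_case * (1.0 - p_case) / chrom_cases
              + p_ctrl * (1.0 - p_ctrl) / chrom_ctrls)
    ncp = abs(p_case - p_ctrl) / se
    norm = NormalDist()
    crit = norm.inv_cdf(1.0 - alpha / 2.0)
    return norm.cdf(ncp - crit) + norm.cdf(-ncp - crit)
```

Calling `single_variant_power(0.05, 1.5, 1326, 1331, prevalence=0.08)` gives an approximate power for a 5% frequency variant with odds ratio 1.5 in a sample the size of the WGS panel; the prevalence of 0.08 is a placeholder, not the value of K used in the figure. With an odds ratio of 1 the function returns α, as expected for a test under the null.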