Serena Sanna, Bingshan Li, Antonella Mulas, Carlo Sidore, Hyun M Kang, Anne U Jackson, Maria Grazia Piras, Gianluca Usala, Giuseppe Maninchedda, Alessandro Sassu, Fabrizio Serra, Maria Antonietta Palmas, William H Wood, Inger Njølstad, Markku Laakso, Kristian Hveem, Jaakko Tuomilehto, Timo A Lakka, Rainer Rauramaa, Michael Boehnke, Francesco Cucca, Manuela Uda, David Schlessinger, Ramaiah Nagaraja, Gonçalo R Abecasis.
Abstract
Complex trait genome-wide association studies (GWAS) provide an efficient strategy for evaluating large numbers of common variants in large numbers of individuals and for identifying trait-associated variants. Nevertheless, GWAS often leave much of the trait heritability unexplained. We hypothesized that some of this unexplained heritability might be due to common and rare variants that reside in GWAS identified loci but lack appropriate proxies in modern genotyping arrays. To assess this hypothesis, we re-examined 7 genes (APOE, APOC1, APOC2, SORT1, LDLR, APOB, and PCSK9) in 5 loci associated with low-density lipoprotein cholesterol (LDL-C) in multiple GWAS. For each gene, we first catalogued genetic variation by re-sequencing 256 Sardinian individuals with extreme LDL-C values. Next, we genotyped variants identified by us and by the 1000 Genomes Project (totaling 3,277 SNPs) in 5,524 volunteers. We found that in one locus (PCSK9) the GWAS signal could be explained by a previously described low-frequency variant and that in three loci (PCSK9, APOE, and LDLR) there were additional variants independently associated with LDL-C, including a novel and rare LDLR variant that seems specific to Sardinians. Overall, this more detailed assessment of SNP variation in these loci increased estimates of the heritability of LDL-C accounted for by these genes from 3.1% to 6.5%. All association signals and the heritability estimates were successfully confirmed in a sample of ∼10,000 Finnish and Norwegian individuals. Our results thus suggest that focusing on variants accessible via GWAS can lead to clear underestimates of the trait heritability explained by a set of loci. Further, our results suggest that, as prelude to large-scale sequencing efforts, targeted re-sequencing efforts paired with large-scale genotyping will increase estimates of complex trait heritability explained by known loci.
Year: 2011 PMID: 21829380 PMCID: PMC3145627 DOI: 10.1371/journal.pgen.1002198
Source DB: PubMed Journal: PLoS Genet ISSN: 1553-7390 Impact factor: 5.917
5]. SNPs in these loci did not reach genome-wide significance in two subsequent meta-analyses [1], [6] and were not significantly associated with LDL-C in the data generated here (Table 1 and Figure S1). Because we have no evidence that these two genes are associated with LDL-C, they are not discussed further. Variants identified in the two genes were also deposited in dbSNP.
Association Analysis results.
| Locus | SNP name | Type | Effect Allele/Other | Freq Effect Allele | Effect (SE) | P-value | Genomic Annotation | Variance explained by the locus | Top GWAS SNP | Effect Allele/Other (GWAS SNP) | Freq Effect Allele (GWAS SNP) | Effect (SE) (GWAS SNP) | P-value (GWAS SNP) | r² | Adjusted P-value | Variance explained by the locus (GWAS SNP) |
|  | rs11591147 | Metabochip | T/G | 0.037 | −0.380 (0.048) | 2.90×10^−15 |  | 1.19% | rs11206510 | C/T | 0.243 | −0.106 (0.023) | 5.71×10^−07 | 0.101 | 0.013 | 0.23% |
|  | rs2479415 | 1000G | C/T | 0.413 | 0.076 (0.019) | 7.50×10^−05 | 8 Kb from |  |  |  |  |  |  |  |  |  |
|  | rs583104 | Metabochip | T/G | 0.177 | 0.149 (0.024) | 1.28×10^−09 | 31 Kb from | 0.63% | rs599839 | G/A | 0.276 | −0.148 (0.025) | 1.43×10^−09 | 0.991 | 0.90 | 0.61% |
|  | rs28361085 | 1000G | C/T | 0.073 | 0.114 (0.036) | 0.00169 | 146 Kb from | 0.22% | rs2254287 | G/C | 0.492 | 0.005 (0.018) | 0.771 | 0.413 | 0.84 | 0.02% |
|  | rs34507110 | 1000G | G/A | 0.154 | 0.122 (0.030) | 4.99×10^−05 | 83 Kb from | 0.48% | rs12695382 | A/G | 0.075 | −0.074 (0.035) | 0.035 | 0.795 | 0.48 | 0.03% |
|  | rs547235 | 1000G | A/G | 0.187 | −0.144 (0.024) | 1.69×10^−09 | 140 Kb from | 0.51% | rs562338 | A/G | 0.173 | −0.139 (0.025) | 1.43×10^−8 | 0.878 | 0.98 | 0.43% |
|  | rs73015013 | Metabochip | T/C | 0.138 | −0.155 (0.027) | 1.12×10^−08 | 9 Kb from | 1.17% | rs6511720 | T/G | 0.132 | −0.160 (0.027) | 1.71×10^−08 | 0.934 | 0.97 | 0.59% |
|  | rs72658864 | Metabochip | C/T | 0.005 | 0.626 (0.136) | 3.90×10^−06 |  |  |  |  |  |  |  |  |  |  |
|  | rs7412 | Metabochip | T/C | 0.037 | −0.563 (0.048) | 1.80×10^−31 |  | 3.33% | rs4420638 | G/A | 0.097 | 0.218 (0.031) | 4.67×10^−12 | 0.0003 | 6.41×10^−10 | 1.07% |
|  | rs429358 | Affy+Sanger | C/T | 0.071 | 0.260 (0.036) | 5.82×10^−11 |  |  |  |  |  |  |  |  |  |  |
The left panel shows the association results at 7 loci. For each gene, the strongest variant is listed first, and any second detected independent signal is listed with results from the conditional analysis (Materials and Methods). The column Type indicates whether the SNP was directly genotyped (Metabochip) or imputed using 1000G reference haplotype (1000G) or the Sardinian reference panel (Affy+Sanger). The right panel shows the association results for the GWAS SNPs previously described [5], the correlation with the top SNP listed in the left panel, and its p-value in the conditional analysis (Adjusted P-value).
Effect sizes are standardized (see Materials and Methods), and represent the change in trait LDL-C values associated with each copy of the reference allele, measured in standard deviation units.
SNP rs583104 is also 1 Kb from the PSRC1 transcript.
r² = 0.967 with the Metabochip second independent SNP, rs429358. After adjusting for the two independent SNPs, rs7412 and rs429358, the p-value for rs4420638 was 0.5.
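Because the effect sizes in the table are in standard deviation units, the share of trait variance attributable to a single SNP can be approximated as 2p(1−p)β², where p is the effect-allele frequency and β the standardized effect. The sketch below applies this textbook approximation, which assumes an additive model and Hardy-Weinberg equilibrium, to a few rows of the table. The function name is hypothetical, and this is not necessarily how the per-locus percentages in the table were obtained (see Materials and Methods), so modest differences from the tabulated values are to be expected.

```python
def variance_explained(freq: float, beta: float) -> float:
    """Approximate fraction of trait variance explained by one biallelic SNP.

    Assumes an additive model, Hardy-Weinberg equilibrium, and a trait
    standardized to unit variance, so Var(genotype) = 2 * p * (1 - p).
    """
    return 2.0 * freq * (1.0 - freq) * beta ** 2


# (effect-allele frequency, standardized effect) copied from the table above.
snps = {
    "rs11591147": (0.037, -0.380),
    "rs583104": (0.177, 0.149),
    "rs547235": (0.187, -0.144),
}

for name, (freq, beta) in snps.items():
    print(f"{name}: {100 * variance_explained(freq, beta):.2f}% of variance")
```

The approximation makes the role of allele frequency explicit: a low-frequency variant needs a large effect to explain as much variance as a common variant with a modest effect.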
Figure 1. Regional Association plots.
Association results around LDLR, PCSK9 cluster and APOE. In each panel, the box at left (A, C and E) shows the association results in the main analysis, and the box at right (B, D and F) shows the results after conditioning on the most strongly associated variant, which is highlighted with a purple dot in both plots and named at the top. Arrows highlight independent signals and the most associated SNP detected in the previous GWAS [5]. Each SNP is also colored according to its LD (r²) in Sardinians with the top variant, with symbols that reflect genomic annotation as indicated in the legend. The rugs above indicate the position of the SNPs that were analyzed by direct typing (MetaboChip), or imputed by using haplotypes from sequenced samples (Affy+Sanger) or 1000 Genomes haplotypes (1000G). Plots were drawn using the LocusZoom standalone version [37]. Genomic coordinates are given according to build 36 (hg18).
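The conditioning step behind the right-hand panels can be illustrated by adding the lead variant as a covariate when a second SNP is tested. The following is a minimal sketch on simulated data, assuming a plain least-squares regression with NumPy. The allele frequencies, effect size, LD structure, and the helper name `fit_effects` are illustrative assumptions; the study's actual model is described in its Materials and Methods and is not reproduced here.

```python
import numpy as np


def fit_effects(trait, *genotypes):
    """Least-squares effect estimates for the given SNPs (intercept dropped)."""
    design = np.column_stack([np.ones(len(trait)), *genotypes])
    coefs, *_ = np.linalg.lstsq(design, trait, rcond=None)
    return coefs[1:]


rng = np.random.default_rng(0)
n = 20_000

# Simulated allele counts (0/1/2) for a hypothetical lead SNP.
lead = rng.binomial(2, 0.3, size=n)
# A second SNP in strong LD: it copies the lead genotype 90% of the time.
other = np.where(rng.random(n) < 0.9, lead, rng.binomial(2, 0.3, size=n))
# Only the lead SNP affects the simulated trait.
trait = 0.3 * lead + rng.normal(size=n)

marginal = fit_effects(trait, other)[0]           # second SNP tested alone
conditional = fit_effects(trait, lead, other)[1]  # adjusted for the lead SNP

print(f"marginal effect:    {marginal:.3f}")
print(f"conditional effect: {conditional:.3f}")
```

In this simulation the second SNP shows a clear effect when tested alone, and that effect shrinks toward zero once the lead SNP is in the model. A signal that persists after conditioning, by contrast, points to an independent association.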
Heritability estimates in all study samples.
| Study | N samples | Variance explained by 5 GWAS SNPs | Variance explained by 8 SNPs |
| SardiNIA | 5,382 | 3.1% | 6.5% |
| Norwegian T2D | 1,171 | 5.8% | 9.3% |
| Norwegian controls | 1,436 | 3.1% | 8.5% |
| Finnish T2D | 1,742 | 2.1% | 5.0% |
| Finnish controls | 5,678 | 3.4% | 7.0% |
The table shows the LDL-C variance accounted for by the 5 GWAS SNPs and by the 8 SNPs described here, in all studies. A sample size weighted average estimate is given for the Finnish and Norwegian samples.
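A sample-size weighted average for the Finnish and Norwegian samples can be recomputed from the per-study rows of the table. This is a minimal sketch that assumes weights proportional to N and uses the rounded percentages as printed, so the result may differ slightly from a figure computed on unrounded estimates. The variable and function names are illustrative.

```python
# (study, N samples, % variance by 5 GWAS SNPs, % variance by 8 SNPs),
# copied from the Finnish and Norwegian rows of the table above.
replication = [
    ("Norwegian T2D", 1171, 5.8, 9.3),
    ("Norwegian controls", 1436, 3.1, 8.5),
    ("Finnish T2D", 1742, 2.1, 5.0),
    ("Finnish controls", 5678, 3.4, 7.0),
]


def weighted_average(rows, column):
    """Average of one percentage column, weighted by sample size."""
    total_n = sum(row[1] for row in rows)
    return sum(row[1] * row[column] for row in rows) / total_n


avg_gwas = weighted_average(replication, 2)
avg_all = weighted_average(replication, 3)
print(f"5 GWAS SNPs: {avg_gwas:.2f}%  |  8 SNPs: {avg_all:.2f}%")
```

Weighting by N keeps the largest cohort (Finnish controls) from being treated the same as the much smaller Norwegian samples.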