Osorio D Meirelles, Jun Ding, Toshiko Tanaka, Serena Sanna, Hsih-Te Yang, Dawood B Dudekula, Francesco Cucca, Luigi Ferrucci, Goncalo Abecasis, David Schlessinger.
Abstract
Measurement error and biological variability generate distortions in quantitative phenotypic data. In longitudinal studies with repeated measurements, the multiple measurements provide a route to reduce noise and correspondingly increase the strength of signals in genome-wide association studies (GWAS). To optimize noise correction, we have developed Shrunken Average (SHAVE), an approach using a Bayesian shrinkage estimator. This estimator uses regression toward the mean for every individual as a function of (1) their average across visits; (2) their number of visits; and (3) the correlation between visits. Computer simulations support an increase in power, with results very similar to those expected under the assumptions of the model. The method was applied to a real data set for 14 anthropomorphic traits in ∼6000 individuals enrolled in the SardiNIA project, with up to three visits (measurements) for each participant. Results show that additional measurements have a large impact on the strength of GWAS signals, especially when participants have different numbers of visits, with SHAVE showing a clear increase in power relative to single visits. In addition, we have derived a relation to assess the improvement in power as a function of number of visits and correlation between visits. It can also be applied in the optimization of experimental designs or usage of measuring devices. SHAVE is fast and easy to run, written in R and freely available online.
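The three inputs named in the abstract (an individual's average across visits, their number of visits, and the correlation between visits) can be combined in a textbook reliability-weighted shrinkage. The sketch below is only an illustration of that general idea in Python, not the SHAVE R code: the function name `shrunken_average`, the use of a single population mean, and the weight nρ / (1 + (n − 1)ρ) are assumptions made here, and the paper's exact estimator may differ.

```python
from statistics import fmean


def shrunken_average(visits, population_mean, rho):
    """Shrink one individual's visit average toward the population mean.

    Assumption (not taken from the paper): the shrinkage weight is the
    classical reliability of a mean of n equally correlated measurements,
    n * rho / (1 + (n - 1) * rho).
    """
    n = len(visits)
    if n == 0:
        # No measurements: the best guess is the population mean itself.
        return population_mean
    weight = n * rho / (1.0 + (n - 1) * rho)
    return population_mean + weight * (fmean(visits) - population_mean)


if __name__ == "__main__":
    # Hypothetical standardized values: same raw average, different visit counts.
    print(shrunken_average([1.2], population_mean=0.0, rho=0.5))
    print(shrunken_average([1.2, 1.2, 1.2], population_mean=0.0, rho=0.5))
```

With the same raw average, an individual with more visits is pulled less strongly toward the population mean, which is the behavior the abstract describes for participants with different numbers of visits.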
Year: 2012 PMID: 23092954 PMCID: PMC3658185 DOI: 10.1038/ejhg.2012.215
Source DB: PubMed Journal: Eur J Hum Genet ISSN: 1018-4813 Impact factor: 4.246
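Simulated and expected power for alpha equal to 5 × 10⁻⁸ and different levels of allele frequency P, slope β and correlation between visits ρ
| P | β | ρ | Simulated: single visit | Simulated: Average | Simulated: SHAVE | Expected: single visit | Expected: Average | Expected: SHAVE |
|---|---|---|---|---|---|---|---|---|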
| 0.10 | 0.20 | 0.20 | 0.0028 | 0.0103 | 0.0185 | 0.0028 | 0.0103 | 0.0185 |
| 0.10 | 0.20 | 0.50 | 0.1137 | 0.2114 | 0.2406 | 0.1147 | 0.2129 | 0.2419 |
| 0.10 | 0.25 | 0.20 | 0.0179 | 0.0628 | 0.1066 | 0.0181 | 0.0629 | 0.1070 |
| 0.10 | 0.25 | 0.50 | 0.4437 | 0.6429 | 0.6872 | 0.4467 | 0.6456 | 0.6892 |
| 0.10 | 0.30 | 0.20 | 0.0772 | 0.2279 | 0.3447 | 0.0777 | 0.2283 | 0.3451 |
| 0.10 | 0.30 | 0.50 | 0.8226 | 0.9371 | 0.9531 | 0.8257 | 0.9391 | 0.9546 |
| 0.50 | 0.20 | 0.20 | 0.1640 | 0.4118 | 0.5648 | 0.1657 | 0.4131 | 0.5655 |
| 0.50 | 0.20 | 0.50 | 0.9495 | 0.9899 | 0.9935 | 0.9509 | 0.9902 | 0.9937 |
| 0.50 | 0.25 | 0.20 | 0.5581 | 0.8630 | 0.9430 | 0.5617 | 0.8634 | 0.9426 |
| 0.50 | 0.25 | 0.50 | 0.9997 | 1.0000 | 1.0000 | 0.9997 | 1.0000 | 1.0000 |
| 0.50 | 0.30 | 0.20 | 0.8987 | 0.9923 | 0.9987 | 0.9008 | 0.9922 | 0.9986 |
| 0.50 | 0.30 | 0.50 | 1.0000 | 1.0000 | 1.0000 | 1.0000 | 1.0000 | 1.0000 |
There were 1 million simulations for each combination of P, β and ρ. Expected power is estimated based on equations 5 and 6.
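Equations 5 and 6 are not reproduced in this record, so the expected-power columns cannot be recomputed from it. As a rough illustration of how power at a genome-wide threshold responds to a stronger signal, the sketch below uses a generic normal approximation for a two-sided 1-df additive test. The function name `gwas_power`, the sample size, and the multiplicative `gain` factor are hypothetical, and the non-centrality formula assumes a standardized phenotype; none of this is taken from the paper.

```python
from math import sqrt
from statistics import NormalDist


def gwas_power(n, maf, beta, alpha=5e-8, gain=1.0):
    """Approximate two-sided power of a 1-df additive association test.

    Assumptions (illustrative only): standardized phenotype, genotype
    variance 2 * maf * (1 - maf), and a hypothetical factor `gain` that
    scales the non-centrality parameter when noise is reduced.
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1.0 - alpha / 2.0)
    ncp = n * 2.0 * maf * (1.0 - maf) * beta ** 2 * gain
    shift = sqrt(ncp)
    # Probability that the z-statistic falls in either rejection tail.
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)


if __name__ == "__main__":
    # Hypothetical inputs; not the settings behind the table above.
    for gain in (1.0, 1.25, 1.5):
        print(gain, round(gwas_power(n=5000, maf=0.5, beta=0.1, gain=gain), 4))
```

Because the significance threshold is so strict, a modest increase in the non-centrality parameter can translate into a large change in power, which is consistent with the pattern across the single visit, Average and SHAVE columns.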
Figure 1. Simulated power by different levels of correlation between visits ρ (top) with effect size β fixed at 0.20, and simulated power by different levels of effect size β (bottom), with ρ fixed at 0.20. Power was simulated for single visit, Average and SHAVE. In both plots, alpha level was set to 5 × 10⁻⁸ and minor allele frequency at 0.5.
Association results between 14 traits and their corresponding top SNPs, where top SNPs were selected based on visit 1 results of SardiNIA GWAS, and where z-statistics for Average and SHAVE are based on three visits
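| Trait | SNP | z1 | z2 | zAVG | zSHAVE | Rank z1 | Rank z2 | Rank zAVG | Rank zSHAVE |
|---|---|---|---|---|---|---|---|---|---|
| Bilirubin | rs887829 | 27.33 | 27.37 | 31.44 | 31.59 | 4 | 3 | 2 | 1 |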
| Cholesterol | rs4910742 | −6.25 | −4.96 | −5.88 | −6.04 | 1 | 4 | 3 | 2 |
|  | rs7310409 | −6.29 | −6.52 | −6.53 | −6.69 | 4 | 3 | 2 | 1 |
| Glycemia | rs853787 | −7.17 | −7.50 | −8.06 | −8.24 | 4 | 3 | 2 | 1 |
| HDL | rs247617 | 8.45 | 8.93 | 10.10 | 10.19 | 4 | 3 | 2 | 1 |
| Height | rs3132468 | 5.91 | 5.82 | 5.91 | 5.92 | 3 | 4 | 2 | 1 |
| LDL | rs445925 | −5.89 | −6.13 | −6.74 | −6.77 | 4 | 3 | 2 | 1 |
| PR-interval | rs6800541 | 6.56 | 5.92 | 7.10 | 7.11 | 3 | 4 | 2 | 1 |
| QT-interval | rs12036340 | 6.27 | 5.31 | 7.21 | 7.27 | 3 | 4 | 2 | 1 |
| RBC | rs4910742 | 23.39 | 22.19 | 24.16 | 24.26 | 3 | 4 | 2 | 1 |
| Serum iron | rs4820268 | 8.77 | 7.52 | 10.43 | 10.66 | 3 | 4 | 2 | 1 |
| Transferrin | rs4854761 | 9.58 | 13.04 | 14.96 | 15.40 | 4 | 3 | 2 | 1 |
| Triglycerides | rs10401969 | −6.15 | −4.72 | −6.67 | −6.76 | 3 | 4 | 2 | 1 |
| Uric acid | rs13145758 | −11.84 | −12.52 | −13.80 | −14.05 | 4 | 3 | 2 | 1 |
| Average rank |  |  |  |  |  | 3.36 | 3.50 | 2.07 | 1.07 |
The z-statistics are shown in order for visit 1 (z1), visit 2 (z2), Average (zAVG) and SHAVE (zSHAVE). In the next four columns their corresponding ranks within each trait are shown, where 1 is assigned to the most significant and 4 to the least. On the last row, we have the averages of the ranks for each metric.
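The ranking described in this note can be sketched in a few lines: within a trait, the four z-statistics are ordered by absolute value, with rank 1 for the most significant. The helper name `rank_by_significance` is hypothetical, and the tie handling (earlier column wins) is an assumption; where two values look tied at the displayed precision, the published ranks presumably rest on unrounded statistics.

```python
from statistics import fmean


def rank_by_significance(z_stats):
    """Return ranks (1 = largest |z|) in the same order as the input."""
    order = sorted(range(len(z_stats)), key=lambda i: abs(z_stats[i]), reverse=True)
    ranks = [0] * len(z_stats)
    for rank, index in enumerate(order, start=1):
        ranks[index] = rank
    return ranks


if __name__ == "__main__":
    # Two rows from the visit 1 table, in the order z1, z2, zAVG, zSHAVE.
    rows = {
        "Bilirubin": [27.33, 27.37, 31.44, 31.59],
        "Cholesterol": [-6.25, -4.96, -5.88, -6.04],
    }
    ranked = {trait: rank_by_significance(z) for trait, z in rows.items()}
    for trait, ranks in ranked.items():
        print(trait, ranks)
    # Column-wise mean of the ranks, as in the "Average rank" row
    # (here over two traits only, so the values differ from the table).
    print([fmean(column) for column in zip(*ranked.values())])
```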
Association results between 14 traits and their corresponding top SNPs, where top SNPs were selected based on visit 2 results of SardiNIA GWAS, and where z-statistics for Average and SHAVE are based on three visits
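| Trait | SNP | z1 | z2 | zAVG | zSHAVE | Rank z1 | Rank z2 | Rank zAVG | Rank zSHAVE |
|---|---|---|---|---|---|---|---|---|---|
| Bilirubin | rs887829 | 27.33 | 27.37 | 31.44 | 31.59 | 4 | 3 | 2 | 1 |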
| Cholesterol | rs6511720 | −4.60 | −5.96 | −5.49 | −5.62 | 4 | 1 | 3 | 2 |
|  | rs7310409 | −6.29 | −6.52 | −6.53 | −6.69 | 4 | 3 | 2 | 1 |
| Glycemia | rs853787 | −7.17 | −7.50 | −8.06 | −8.24 | 4 | 3 | 2 | 1 |
| HDL | rs247617 | 8.45 | 8.93 | 10.10 | 10.19 | 4 | 3 | 2 | 1 |
| Height | rs3132468 | 5.91 | 5.82 | 5.91 | 5.92 | 3 | 4 | 2 | 1 |
| LDL | rs6511720 | −5.86 | −6.93 | −6.81 | −6.97 | 4 | 2 | 3 | 1 |
| PR-interval | rs6795970 | 6.18 | 5.94 | 6.80 | 6.86 | 3 | 4 | 2 | 1 |
| QT-interval | rs12143842 | 6.27 | 5.47 | 7.30 | 7.33 | 3 | 4 | 2 | 1 |
| RBC | rs4910742 | 23.39 | 22.19 | 24.16 | 24.26 | 3 | 4 | 2 | 1 |
| Serum iron | rs855791 | −8.03 | −7.88 | −10.24 | −10.55 | 3 | 4 | 2 | 1 |
| Transferrin | rs4854761 | 9.58 | 13.04 | 14.96 | 15.40 | 4 | 3 | 2 | 1 |
| Triglycerides | rs6999813 | −5.67 | −7.63 | −6.87 | −6.97 | 4 | 1 | 3 | 2 |
| Uric acid | rs13145758 | −11.84 | −12.52 | −13.80 | −14.05 | 4 | 3 | 2 | 1 |
| Average rank |  |  |  |  |  | 3.64 | 3.00 | 2.21 | 1.14 |
The z-statistics are shown in order for visit 1 (z1), visit 2 (z2), Average (zAVG) and SHAVE (zSHAVE). In the next four columns their corresponding ranks within each trait are shown, where 1 is assigned to the most significant and 4 to the least. On the last row, we have the averages of the ranks for each metric.
Association results between 14 traits and their corresponding top SNPs, where top SNPs were selected based on multi-study meta-analyses, and z-statistics for Average and SHAVE are based on three visits and results of SardiNIA GWAS
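| Trait | SNP | z1 | z2 | zAVG | zSHAVE | Rank z1 | Rank z2 | Rank zAVG | Rank zSHAVE |
|---|---|---|---|---|---|---|---|---|---|
| Bilirubin | rs887829 | 27.33 | 27.37 | 31.44 | 31.59 | 4 | 3 | 2 | 1 |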
| Cholesterol | rs646776* | −4.76 | −4.20 | −4.68 | −4.76 | 1 | 4 | 3 | 2 |
|  | rs7310409 | −6.29 | −6.52 | −6.53 | −6.69 | 4 | 3 | 2 | 1 |
| Glycemia | rs10830963 | 6.32 | 6.34 | 7.38 | 7.50 | 4 | 3 | 2 | 1 |
| HDL | rs247617 | 8.45 | 8.93 | 10.10 | 10.19 | 4 | 3 | 2 | 1 |
| Height | rs724016 | 5.02 | 4.25 | 5.26 | 5.24 | 3 | 4 | 1 | 2 |
| LDL | rs646776* | −5.39 | −5.14 | −5.69 | −5.76 | 3 | 4 | 2 | 1 |
| PR-interval | rs6800541 | 6.56 | 5.92 | 7.10 | 7.11 | 3 | 4 | 2 | 1 |
| QT-interval | rs7550692* | 5.89 | 4.36 | 6.66 | 6.55 | 3 | 4 | 1 | 2 |
| RBC | rs4910742 | 23.39 | 22.19 | 24.16 | 24.26 | 3 | 4 | 2 | 1 |
| Serum iron | rs4820268 | 8.77 | 7.52 | 10.43 | 10.66 | 3 | 4 | 2 | 1 |
| Transferrin | rs4854761* | 9.58 | 13.04 | 14.96 | 15.40 | 4 | 3 | 2 | 1 |
| Triglycerides | rs1260326 | 5.12 | 4.75 | 5.92 | 5.95 | 3 | 4 | 2 | 1 |
| Uric acid | rs9998811* | −11.58 | −12.44 | −13.61 | −13.87 | 4 | 3 | 2 | 1 |
| Average rank |  |  |  |  |  | 3.29 | 3.57 | 1.93 | 1.21 |
The z-statistics are shown in order for visit 1 (z1), visit 2 (z2), Average (zAVG) and SHAVE (zSHAVE). In the next four columns their corresponding ranks within each trait are shown, where 1 is assigned to the most significant and 4 to the least. On the last row, we have the averages of the ranks for each metric. *The original SNPs (which were replaced by proxy SNPs from the Metabochip) are: cholesterol (total) rs629301 (R² = 1.00), LDL rs629301 (R² = 1.00), QT-interval rs2880058 (R² = 0.92), transferrin rs3811647 (R² = 0.96) and uric acid rs734553 (R² = 0.80).
Figure 2. Observed and expected LOD ratio for Average and single visit for top SNPs from meta-analysis, for a subset of individuals that had both visits 1 and 2. Observed LOD score ratio is calculated based on the square of the z-statistics of the Average and the square of (z1+z2)/2 (from z-statistics from visits 1 and 2). Traits on the x axis are sorted by correlation between visits (in parentheses).
Figure 3. Observed and expected LOD ratio for SHAVE and Average for top SNPs from meta-analysis, for a subset in which all individuals had visit 1 and a randomly chosen 50% of the visit 2 measurements from those same individuals were retained.
Figure 4. Expected LOD ratio between Average and single visit for hypothetical datasets in which all individuals had k visits, with k ranging from 2 to 10.
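The quantities plotted in Figures 2 and 4 can be illustrated with a short sketch. The observed ratio follows the caption of Figure 2 directly. The expected ratio uses k / (1 + (k − 1)ρ), which is the standard variance-reduction factor for the mean of k equally correlated measurements; treating it as the expected LOD ratio is an assumption made here, since the paper's own relation is not reproduced in this record. Both function names are hypothetical.

```python
def observed_lod_ratio(z_avg, z1, z2):
    """Ratio of squared z for the Average to squared mean single-visit z."""
    single = (z1 + z2) / 2.0
    return z_avg ** 2 / single ** 2


def expected_lod_ratio(k, rho):
    """Assumed gain from averaging k visits with pairwise correlation rho.

    The variance of the mean of k standardized, equally correlated
    measurements is (1 + (k - 1) * rho) / k, so the signal-to-noise gain
    relative to one visit is the inverse of that quantity.
    """
    return k / (1.0 + (k - 1) * rho)


if __name__ == "__main__":
    # Hypothetical correlations; k spans the range used in Figure 4.
    for rho in (0.2, 0.5, 0.8):
        print(rho, [round(expected_lod_ratio(k, rho), 2) for k in range(2, 11)])
```

Under this assumed form the gain is bounded by 1/ρ as k grows, so additional visits help most when the correlation between visits is low.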