Joel Becker, Casper A P Burik, Grant Goldman, Nancy Wang, Hariharan Jayashankar, Michael Bennett, Daniel W Belsky, Richard Karlsson Linnér, Rafael Ahlskog, Aaron Kleinman, David A Hinds, Avshalom Caspi, David L Corcoran, Terrie E Moffitt, Richie Poulton, Karen Sugden, Benjamin S Williams, Kathleen Mullan Harris, Andrew Steptoe, Olesya Ajnakina, Lili Milani, Tõnu Esko, William G Iacono, Matt McGue, Patrik K E Magnusson, Travis T Mallard, K Paige Harden, Elliot M Tucker-Drob, Pamela Herd, Jeremy Freese, Alexander Young, Jonathan P Beauchamp, Philipp D Koellinger, Sven Oskarsson, Magnus Johannesson, Peter M Visscher, Michelle N Meyer, David Laibson, David Cesarini, Daniel J Benjamin, Patrick Turley, Aysu Okbay.
Abstract
Polygenic indexes (PGIs) are DNA-based predictors. Their value for research in many scientific disciplines is growing rapidly. As a resource for researchers, we used a consistent methodology to construct PGIs for 47 phenotypes in 11 datasets. To maximize the PGIs' prediction accuracies, we constructed them using genome-wide association studies (some not previously published) from multiple data sources, including 23andMe and UK Biobank. We present a theoretical framework to help interpret analyses involving PGIs. A key insight is that a PGI can be understood as an unbiased but noisy measure of a latent variable we call the 'additive SNP factor'. Regressions in which the true regressor is this factor but the PGI is used as its proxy therefore suffer from errors-in-variables bias. We derive an estimator that corrects for the bias, illustrate the correction, and make a Python tool for implementing it publicly available.
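The errors-in-variables bias described in the abstract can be illustrated with a small simulation. The sketch below is not the estimator derived in the paper and not the authors' Python tool. It is the textbook classical measurement-error setup, in which a noisy proxy attenuates a regression slope by its reliability ratio and dividing by that ratio undoes the attenuation. The function name `simulate_attenuation`, its parameters, and the assumption that the reliability is known are all hypothetical choices made for illustration; in practice the reliability would have to be estimated.

```python
import numpy as np


def simulate_attenuation(n=200_000, beta=0.5, reliability=0.4, seed=0):
    """Simulate a noisy proxy regression and a classical disattenuation.

    Illustrative assumptions (not taken from the paper):
    - the latent factor has variance 1,
    - the proxy equals the factor plus independent noise,
    - reliability = var(factor) / var(proxy) is known exactly.
    """
    rng = np.random.default_rng(seed)

    # Latent regressor (standing in for the 'additive SNP factor').
    factor = rng.standard_normal(n)

    # Noise variance chosen so that var(factor) / var(proxy) = reliability.
    noise_sd = np.sqrt(1.0 / reliability - 1.0)
    proxy = factor + rng.normal(0.0, noise_sd, n)

    # Outcome depends on the latent factor, not on the proxy.
    outcome = beta * factor + rng.standard_normal(n)

    # Naive OLS slope of the outcome on the noisy proxy.
    beta_naive = np.cov(proxy, outcome)[0, 1] / np.var(proxy, ddof=1)

    # Classical correction: divide the naive slope by the reliability.
    beta_corrected = beta_naive / reliability
    return beta_naive, beta_corrected


if __name__ == "__main__":
    naive, corrected = simulate_attenuation()
    print(f"naive slope:     {naive:.3f}")
    print(f"corrected slope: {corrected:.3f}")
```

Under these assumptions the naive slope is pulled toward zero by roughly the reliability factor, while the corrected slope lands close to the true coefficient used in the simulation.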
Year: 2021 PMID: 34140656 PMCID: PMC8678380 DOI: 10.1038/s41562-021-01119-3
Source DB: PubMed Journal: Nat Hum Behav ISSN: 2397-3374