Stability of polygenic scores across discovery genome-wide association studies.

Laura M Schultz^1,2, Alison K Merikangas^1,2,3, Kosha Ruparel^2,4, Sébastien Jacquemont^5,6, David C Glahn^7,8, Raquel E Gur^2,4,9, Ran Barzilay^2,4,9, Laura Almasy^1,2,3.

Abstract

Polygenic scores (PGS) are commonly evaluated in terms of their predictive accuracy at the population level by the proportion of phenotypic variance they explain. To be useful for precision medicine applications, they also need to be evaluated at the individual level when phenotypes are not necessarily already known. We investigated the stability of PGS in European American (EUR) and African American (AFR)-ancestry individuals from the Philadelphia Neurodevelopmental Cohort and the Adolescent Brain Cognitive Development study using different discovery genome-wide association study (GWAS) results for post-traumatic stress disorder (PTSD), type 2 diabetes (T2D), and height. We found that pairs of EUR-ancestry GWAS for the same trait had genetic correlations >0.92. However, PGS calculated from pairs of same-ancestry and different-ancestry GWAS had correlations that ranged from <0.01 to 0.74. PGS stability was greater for height than for PTSD or T2D. A series of height GWAS in the UK Biobank suggested that correlation between PGS is strongly dependent on the extent of sample overlap between the discovery GWAS. Focusing on the upper end of the PGS distribution, different discovery GWAS do not consistently identify the same individuals in the upper quantiles, with the best case being 60% of individuals above the 80th percentile of PGS overlapping from one height GWAS to another. The degree of overlap decreases sharply as higher quantiles, less heritable traits, and different-ancestry GWAS are considered. PGS computed from different discovery GWAS have only modest correlation at the individual level, underscoring the need to proceed cautiously with integrating PGS into precision medicine applications.

Keywords: Adolescent Brain Cognitive Development study; African American; PRS-CS; PTSD; Philadelphia Neurodevelopmental Cohort; UK Biobank; ancestry; height; methods development; type 2 diabetes

Year: 2022 PMID: 35199043 PMCID: PMC8841810 DOI: 10.1016/j.xhgg.2022.100091

Source DB: PubMed Journal: HGG Adv ISSN: 2666-2477

Introduction

Polygenic scores (PGS) are increasingly being used to draw inferences regarding genetic contributions to a variety of complex anthropometric and disease-related traits. Numerous methods have been developed for computing PGS for a target population using summary statistics from a discovery genome-wide association study (GWAS) run for an independent population, with newer Bayesian-based techniques such as LDpred, SBayesR, and PRS-CS generally yielding more predictive PGS than those produced using older methodologies that rely on a combination of linkage disequilibrium (LD) clumping and p-value thresholding. One goal is to utilize PGS in clinical settings to facilitate the diagnosis and treatment of a wide range of heritable diseases, such as inflammatory bowel disease, diabetes, cardiovascular disease, cancer, Alzheimer disease, attention-deficit/hyperactivity disorder, major depressive disorder, bipolar disorder, and schizophrenia. While progress has been made toward reaching this goal,¹⁷⁻²⁰ numerous challenges remain to be solved.²¹⁻²⁴ Given that the GWAS required for computing PGS have been disproportionately run for European-ancestry populations,²⁵⁻²⁹ a fundamental challenge will be ensuring that diverse populations have equitable access to medically beneficial PGS, as it has been demonstrated that PGS are less predictive when the target and discovery populations have differing genetic ancestry or varying degrees of admixture.³¹⁻³⁵ To leverage the power of larger sample sizes, consortia such as the Psychiatric Genomics Consortium (PGC), the Diabetes Genetics Replication and Meta-Analysis (DIAGRAM) consortium, and the Genetic Investigation of Anthropometric Traits (GIANT) consortium routinely produce updated meta-GWAS incorporating new cohorts and samples. 
Hence, there is a growing pool of discovery GWAS that could be used for computing PGS, and this year's largest, best-powered meta-GWAS may soon be eclipsed by next year's newer, larger meta-GWAS. In general, these larger, more powerful GWAS explain greater proportions of the trait variance and improve the predictive power of the PGS on an aggregate level. However, there has been little examination of the performance of successive generations of PGS at the individual level. Given the potential usefulness of PGS for stratifying individuals based on their genetic risk for developing a given disorder, the question arises as to whether the same individuals would be classified as having high genetic risk by PGS produced from subsequent generations of meta-GWAS. From a clinical perspective, identifying substantially different sets of individuals as “high risk” from one generation of meta-GWAS to the next would be problematic. Previous studies have evaluated PGS performance in terms of how well they predict phenotypes at the population level. However, it is also necessary to examine how well PGS perform at predicting the risk for individuals. To this end, we examined the stability of PGS computed for individuals across discovery GWAS. Specifically, we evaluated the correlations between the PGS computed for European American (EUR) and African American (AFR) individuals from pairs of same- and different-ancestry discovery GWAS for post-traumatic stress disorder (PTSD), type 2 diabetes (T2D),⁴⁰⁻⁴² and height. These specific traits were chosen because they had sufficiently powered, publicly available AFR-ancestry GWAS. We also addressed the question of whether the same individuals were consistently identified as belonging to the top PGS quantiles. 
For this work, we targeted EUR- and AFR-ancestry youth from the Philadelphia Neurodevelopmental Cohort (PNC) and the Adolescent Brain Cognitive Development (ABCD) study to compare PGS on an individual level across discovery GWAS.

Subjects and methods

This study, which uses publicly available de-identified data, was approved by the Institutional Review Board of Boston Children's Hospital.

PNC

Genotype data for the PNC, a population-based sample of youth who were aged 8–21 years at the time of study enrollment, were obtained from dbGaP (phs000607.v2.p2). Biological samples from PNC subjects were genotyped in 15 batches (Table S1) using 10 different types of Affymetrix and Illumina arrays by the Center for Applied Genomics at the Children's Hospital of Philadelphia. Analysis was limited to the 5,239 EUR- and 3,260 AFR-ancestry individuals for whom genotype data were available after the quality control (QC) process described below.

ABCD study

Results were replicated using post-QC genotype data for 5,815 EUR and 1,741 AFR individuals in the independent ABCD cohort (NDA no. 2573, fix release 2.0.1). This cohort is comprised of adolescents who were aged 9–10 years at the time that their saliva samples were collected for genotyping. The Rutgers University Cell and DNA Repository stored and genotyped all samples using the Affymetrix NIDA SmokeScreen array.

QC and imputation

The PNC dataset was processed by array batch and merged after imputation, whereas the ABCD dataset was processed as a single batch. For each batch, PLINK 1.9 was used to remove single-nucleotide polymorphisms (SNPs) with >5% missingness, samples with >10% missingness, and samples with a genotyped sex that did not match the reported sex phenotype. As a final step, each batch was checked with a pre-imputation Perl script that compared SNP frequencies against the 1000 Genomes ALL reference panel. This script fixed strand reversals and improper Ref/Alt assignments and also removed palindromic A/T and C/G SNPs with minor allele frequency (MAF) > 0.4, SNPs with alleles that did not match the reference panel, SNPs with allele frequencies differing by more than 0.2 from the reference, and SNPs not present in the reference panel. Genotypes were phased (Eagle v.2.4) and imputed by chromosome to the 1000 Genomes Other/Mixed GRCh37/hg19 reference panel (Phase 3 v.5) using Minimac 4 via the Michigan Imputation Server. All post-imputation QC was run using bcftools. The 15 imputation batches for the PNC dataset were merged by chromosome, and then post-imputation QC was run using the average imputation quality and average MAF for the merged chromosome files. Only polymorphic sites with (average) imputation quality R² ≥ 0.7 and (average) MAF ≥ 0.01 were included in the final PLINK 1.9 hard-call PNC and ABCD post-imputation datasets.

Ancestry and kinship analysis

Multi-dimensional scaling (MDS) was conducted using KING (v.2.2.4) to identify the top 10 ancestry components for each sample. (Note that while these components are technically axes in MDS space, we refer to them as principal components [PCs] for the sake of simplicity.) The ancestry PCs were projected onto the 1000 Genomes PC space, and genetic ancestry was inferred using the e1071 support vector machines package in R (Figure 1). Based on these inferences, AFR- and EUR-ancestry cohorts were created for the PNC and ABCD datasets; all other ancestry groups were excluded from further analysis. A second round of unprojected MDS was then performed within the EUR- and AFR-ancestry groups to produce ten PCs that were regressed out of the standardized PGS to adjust for array batch effects and genetic ancestry (Figures S1–S5).

Figure 1

First and second principal components of cohort genotypes

Principal components (PCs) were computed and projected to a 1000 Genomes reference using KING (Manichaikul et al.). Colors indicate inferred genetic ancestry for the (A) 9,206 Philadelphia Neurodevelopmental Cohort (PNC) and (B) 10,318 Adolescent Brain Cognitive Development (ABCD) genotyped samples.

KING was also used to identify all pairwise relationships out to third-degree relatives based on estimated kinship coefficients and inferred IBD segments. Although the PNC was not recruited as a family study, it does include some related individuals (i.e., siblings and cousins). We ran a sensitivity analysis using a reduced PNC dataset that included only one individual from each family (chosen as the lowest individual ID number for a given family ID number), which reduced the size of the PNC EUR cohort from 5,239 to 4,928 and the AFR cohort from 3,260 to 2,954. After establishing that the PNC PTSD PGS correlation results obtained using only unrelated individuals did not differ meaningfully from those obtained using the full dataset (Tables S4 and S5), we performed all subsequent analyses using the complete EUR and AFR cohorts.

Polygenic score computation with PRS-CS

PRS-CS was used to infer posterior mean effects by chromosome for the SNPs in a given dataset that overlapped with both the discovery GWAS summary statistics and an external 1000 Genomes LD panel that was matched to the ancestry group used for the discovery GWAS. Posterior mean effects were only inferred for SNPs located on the 22 autosomal chromosomes. PGS for the EUR and AFR subsets of PNC and ABCD were computed using both EUR and AFR discovery GWAS for PTSD, T2D,⁴⁰⁻⁴² and height (Table 1). To ensure convergence of the underlying Gibbs sampler algorithm, we ran 25,000 Markov chain Monte Carlo (MCMC) iterations and designated the first 10,000 MCMC iterations as burn-in. The PRS-CS global shrinkage parameter was set to 0.01 when the discovery GWAS had an SNP sample size that was less than 200,000; otherwise, it was learned from the data using a fully Bayesian approach. Default settings were used for all other PRS-CS parameters. Given the stochastic nature of the Bayesian algorithm used by PRS-CS, PGS replicability was confirmed by completing multiple PRS-CS runs using the same discovery GWAS. The PLINK 1.9 score function was used to produce raw PGS from the posterior means of the estimated SNP effects returned by PRS-CS for each chromosome, and then R was used to standardize the PGS for a given cohort to mean = 0 and SD = 1. Standardized PGS were then adjusted by regressing out the first ten within-ancestry PCs.
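
The last two steps of this pipeline (standardizing the raw scores, then regressing out the within-ancestry PCs) can be illustrated with a minimal Python sketch. The authors report using R for these steps, so this is only an illustration and not their implementation. The function names, the toy data, and the use of the sample SD (n − 1 denominator) are assumptions.

```python
from statistics import fmean, stdev


def standardize(values):
    """Scale raw scores to mean 0 and sample SD 1 (n - 1 denominator, assumed)."""
    mu, sd = fmean(values), stdev(values)
    return [(v - mu) / sd for v in values]


def residualize(y, covariates):
    """Residuals of y after least-squares regression on an intercept plus
    the covariate columns, computed with modified Gram-Schmidt."""
    n = len(y)
    basis = []  # orthonormal basis spanning the intercept and covariates
    for col in [[1.0] * n] + [list(c) for c in covariates]:
        for q in basis:
            coef = sum(a * b for a, b in zip(col, q))
            col = [a - coef * b for a, b in zip(col, q)]
        norm = sum(a * a for a in col) ** 0.5
        if norm > 1e-12:  # skip columns that are linearly dependent
            basis.append([a / norm for a in col])
    resid = list(y)
    for q in basis:
        coef = sum(a * b for a, b in zip(resid, q))
        resid = [a - coef * b for a, b in zip(resid, q)]
    return resid


# Hypothetical toy data: six individuals, two PCs (the study used ten).
raw_pgs = [0.12, -0.40, 0.33, 0.05, -0.21, 0.48]
pcs = [
    [0.10, -0.20, 0.30, 0.00, -0.10, -0.10],
    [0.05, 0.05, -0.10, 0.20, -0.15, -0.05],
]
adjusted_pgs = residualize(standardize(raw_pgs), pcs)
```

By construction, the adjusted scores are uncorrelated with every PC that was regressed out, which is the property the adjustment relies on.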

Table 1

Discovery GWAS used to compute polygenic scores with PRS-CS

Trait	Discovery GWAS	GWAS ancestry	GWAS sample size^a for PRS-CS	SNP count^b for PNC PGS calculations	SNP count^b for ABCD PGS calculations
PTSD	Nievergelt et al.³⁹ (Freeze 2 PGC)	AFR	11,321	1,162,502	1,064,574
	Nievergelt et al.³⁹ (Freeze 2 PGC)	EUR	70,237	1,087,435	1,016,161
	Duncan et al.³⁸ (Freeze 1 PGC)	AFR	9,691	1,157,302	1,059,197
	Duncan et al.³⁸ (Freeze 1 PGC)	EUR	9,954	1,086,644	1,015,369
T2D	Chen et al.⁴²	AFR	4,146	1,114,936	1,020,579
	Scott et al.⁴⁰ (DIAGRAM)	EUR	152,599	1,087,724	1,016,440
	Mahajan et al.⁴¹ (DIAGRAM)	EUR	231,420	1,089,613	1,018,372
Height	Marouli et al.⁴⁴ (GIANT)	AFR	27,494	18,580	15,720
	Marouli et al.⁴⁴ (GIANT)	EUR	381,625	18,035	15,767
	Wood et al.⁴³ (GIANT)	EUR	252,048	987,760	920,889

PTSD, post-traumatic stress disorder; T2D, type 2 diabetes; PNC, Philadelphia Neurodevelopmental Cohort; PGS, polygenic score; ABCD, Adolescent Brain Cognitive Development study; PGC, Psychiatric Genomics Consortium; DIAGRAM, Diabetes Genetics Replication and Meta-Analysis Consortium; GIANT, Genetic Investigation of Anthropometric Traits Consortium.

^a PRS-CS requires a single GWAS sample size; see supplemental methods for how we derived this measure when the sample size varied by SNP.

^b The "SNP count" is the number of SNPs in common between the discovery GWAS, the PRS-CS LD panel, and the genomic dataset.

LD score regression

LD score regression (LDSC) was used to calculate the mean χ² for each EUR-ancestry GWAS as a proxy for GWAS power (Table S14). We also used LDSC to compute the genetic correlation for each pair of same-trait GWAS (Table S15). Standard error was estimated by jackknifing over blocks of adjacent SNPs. Our LDSC calculations only included SNPs with MAF > 0.01. Given that LDSC may yield biased estimates for admixed populations, we did not perform LD score regression for the AFR-ancestry discovery GWAS.
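
As a rough illustration of the power proxy described above, the mean χ² of a GWAS can be thought of as the average squared z-score (effect estimate divided by its standard error) over the SNPs retained for analysis. The sketch below is not LDSC itself, which applies its own SNP filters and reference panel. The record field names (`beta`, `se`, `maf`) and the toy values are hypothetical.

```python
def mean_chi2(records, maf_min=0.01):
    """Average of (beta / se)^2 over SNPs whose MAF exceeds maf_min."""
    chi2 = [(r["beta"] / r["se"]) ** 2 for r in records if r["maf"] > maf_min]
    if not chi2:
        raise ValueError("no SNPs pass the MAF filter")
    return sum(chi2) / len(chi2)


# Hypothetical summary statistics for three SNPs; the third fails MAF > 0.01.
toy_sumstats = [
    {"beta": 2.0, "se": 1.0, "maf": 0.20},   # z = 2, chi2 = 4
    {"beta": -1.0, "se": 1.0, "maf": 0.30},  # z = -1, chi2 = 1
    {"beta": 3.0, "se": 1.0, "maf": 0.005},  # excluded by the MAF filter
]
toy_mean_chi2 = mean_chi2(toy_sumstats)
```

Under the null of no association, each χ² statistic has expectation 1, so values further above 1 indicate more polygenic signal (or confounding) captured by the GWAS.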

Quantile-based comparisons

We counted the number of samples in common at or above the 80th percentile, the 90th percentile, and the 95th percentile of the PC-adjusted standardized PGS distributions. Specifically, we counted how many individuals were jointly identified as being at or above a given percentile of the PGS computed from a pair of different discovery GWAS. As an example, consider the n = 3,260 individuals in the PNC AFR cohort. There are n = 652 individuals with PGS at or above the 80th percentile, n = 326 with PGS at or above the 90th percentile, and n = 163 with PGS at or above the 95th percentile of PGS. The proportional overlap for PGS at or above the 80th percentile was calculated by identifying which 652 samples were located within that region of each of the two PGS distributions being compared (e.g., those computed from an AFR GWAS for trait X and those computed from a EUR GWAS for trait X), counting how many of those samples were present at or above that quantile for both distributions, and then dividing that count by 652. A proportional overlap of 1 would indicate that the same 652 individuals had PGS that were among the top 20% of PGS for both distributions.
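
The overlap calculation described above can be sketched in a few lines of Python. This is an illustrative reading of the procedure rather than the authors' code: the function names and toy scores are hypothetical, and ties at the cutoff are broken arbitrarily by sort order.

```python
def top_ids(scores, percentile):
    """IDs of the individuals ranked in the top (100 - percentile)% of scores.

    `scores` maps individual ID -> PC-adjusted standardized PGS.
    """
    k = round(len(scores) * (100 - percentile) / 100)
    ranked = sorted(scores, key=scores.get, reverse=True)
    return set(ranked[:k])


def proportional_overlap(scores_a, scores_b, percentile):
    """Share of the top group under GWAS A that is also in the top group under GWAS B."""
    top_a = top_ids(scores_a, percentile)
    top_b = top_ids(scores_b, percentile)
    return len(top_a & top_b) / len(top_a)


# Toy example with ten individuals: the top 20% is two people per distribution.
pgs_gwas_a = {i: float(i) for i in range(10)}  # top two: IDs 9 and 8
pgs_gwas_b = dict(pgs_gwas_a)
pgs_gwas_b[8], pgs_gwas_b[0] = pgs_gwas_b[0], pgs_gwas_b[8]  # top two: IDs 9 and 0
toy_overlap = proportional_overlap(pgs_gwas_a, pgs_gwas_b, 80)  # one of two shared
```

Because both distributions cover the same individuals, the two top groups have the same size, so the result does not depend on which distribution supplies the denominator.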

UK Biobank experiment

We obtained imputed genotypes and standing height phenotypes for 276,107 unrelated white British individuals in the UK Biobank. The supplemental methods describe an experiment we designed using these data to explore the degree to which our primary findings could be attributed to differing GWAS sample sizes. In brief, we used PRS-CS and PLINK 1.9 as described above to compute height PGS for an independent test group using seven discovery GWAS with controlled differences in their sample sizes and degree of sample overlap (Figure S7; Table S16). GWAS A and GWAS B were run using non-overlapping samples (each n = 134,000), whereas GWAS C and GWAS D were run using sub-samples (each n = 75,000) that were randomly drawn from the individuals included in GWAS A and GWAS B, respectively. GWAS E and GWAS F were run using sub-samples (each n = 10,000) that were randomly drawn from the individuals included in GWAS C and GWAS D, respectively. Finally, GWAS AB was run using a superset comprised of the individuals included in either GWAS A or GWAS B (n = 268,000). We performed LD score regression (Table S17) and genetic correlation (Table S19) analyses for these GWAS as described above. We also analyzed the correlation between the height PGS computed from the different GWAS and assessed how well the PGS predicted height for a test group of individuals who were not included in any of the GWAS (n = 8,107).

Statistical analysis

All statistics and graphical displays were generated using R. We quantified the association between PC-adjusted standardized PGS computed for a given trait from different discovery GWAS using Pearson's linear correlation coefficient (r), and we ran two-tailed t tests for linear association to determine whether the observed correlations were statistically significant. To evaluate the predictive accuracy of the PGS produced from our height GWAS experiment, we used each set of standardized PGS to predict the height of the test subjects via an additive multiple linear regression model that also included sex, age at height measurement, and the first 20 ancestry PCs supplied by the UK Biobank as covariates. We calculated the coefficient of determination (R²) for each model as a measure of how well the PGS from a given GWAS predicted height in conjunction with these covariates, and we also ran a partial F test for each predictive model to assess the effect of adding the standardized PGS to a base model that included sex, age, and the first 20 ancestry PCs as predictors of height.
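
For reference, the t test for linear association uses the standard statistic t = r·sqrt((n − 2) / (1 − r²)) with n − 2 degrees of freedom. The Python sketch below is an illustration only (the analyses themselves were run in R); the function names are hypothetical, and the p-value step is omitted because it needs a t-distribution routine outside the standard library.

```python
import math


def pearson_r(x, y):
    """Pearson's linear correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_for_correlation(r, n):
    """t statistic (n - 2 degrees of freedom) for H0: no linear association."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


# Toy check of pearson_r: y is an exact increasing linear function of x.
r_toy = pearson_r([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0])

# Rounded r and cohort size reported in the Results for the PNC AFR cohort
# (Freeze 1 vs. Freeze 2 AFR PTSD GWAS), where t(3,258) = 55.26 is given.
t_pnc_afr = t_for_correlation(0.696, 3260)
```

Plugging in the rounded r = 0.696 gives a value close to, but not exactly, the reported 55.26; the small gap is consistent with r having been rounded to three decimals.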

Results

Reproducibility across PRS-CS runs

Given that PRS-CS relies on Bayesian methodology to infer posterior effects for the SNPs on each chromosome, it was necessary to confirm that we had used enough MCMC iterations and burn-in trials to ensure convergence of the underlying Gibbs sampler algorithm. We checked for convergence indirectly by assessing the correlation between the posterior effects calculated across multiple runs for a given chromosome (Figure 2). The PRS-CS default setting of 1,000 MCMC iterations with the first 500 iterations serving as burn-in produced relatively inconsistent posterior effects (r ≈ 0.8), suggesting incomplete convergence. The correlation between the posterior effects computed during multiple runs of PRS-CS improved to r ≈ 0.98 when we increased the number of MCMC iterations to 10,000 (5,000 burn-in) and further improved to r > 0.99 for both large and small chromosomes when we used 25,000 MCMC iterations (10,000 burn-in). Given that the computational time increases substantially as more MCMC iterations are run, we opted to use 25,000 MCMC iterations with the first 10,000 as burn-in rather than pursuing even stronger correlations.

Figure 2

Reproducibility of Bayesian posterior effects computed by PRS-CS

As illustrated for chromosome 3 (76,064 SNPs) and chromosome 21 (15,447 SNPs) using the Nievergelt et al. EUR PTSD discovery GWAS with the PNC EUR dataset, posterior effects were more strongly correlated between PRS-CS runs as the number of MCMC iterations (and burn-in iterations) increased.

The next concern was whether the PGS calculated by PLINK 1.9 from the Bayesian posterior effects would also be reproducible across PRS-CS runs. To address this question, we ran PRS-CS twice using the PGC Freeze 2 PTSD discovery GWAS, and calculated PGS from both sets of posterior effects. For both the EUR and AFR PNC cohorts, the correlation between the adjusted PGS was greater than 0.999 (Figure 3). Hence, we are confident that PRS-CS yields reproducible PGS for a given discovery GWAS provided that enough MCMC iterations are used.

Figure 3

Reproducibility of PGS across multiple runs of PRS-CS

PC-adjusted standardized PGS computed from posterior effects generated by two runs of PRS-CS using the same PTSD discovery GWAS from Nievergelt et al. had correlations greater than r = 0.999 for both the EUR (n = 5,239) and AFR (n = 3,260) cohorts of PNC.

Stability of PGS computed from different same-ancestry discovery GWAS

Of the three traits that we analyzed, only PTSD had two publicly available AFR-ancestry GWAS. We computed PGS using both GWAS for each AFR-ancestry individual and then assessed the correlation between the two sets of PGS (Figure 4). We found a moderately strong positive correlation between the PGS computed from the PGC Freeze 1 and Freeze 2 AFR-ancestry PTSD GWAS for the AFR-ancestry cohorts of both PNC (r = 0.696, t(3,258) = 55.26, p < 2 × 10⁻¹⁶) and ABCD (r = 0.657, t(1,739) = 36.34, p < 2 × 10⁻¹⁶).

Figure 4

Correlation between PGS computed from two different AFR-ancestry PTSD discovery GWAS for AFR-ancestry individuals

Significant positive correlations were observed between the AFR PGS computed from the PGC Freeze 1 and Freeze 2 AFR PTSD GWAS for both the PNC (r = 0.696, t(3,258) = 55.26, p < 2 × 10⁻¹⁶) and ABCD (r = 0.657, t(1,739) = 36.34, p < 2 × 10⁻¹⁶) AFR cohorts.

The wider availability of EUR-ancestry GWAS allowed us to compute PGS for EUR-ancestry individuals using pairs of EUR-ancestry discovery GWAS for PTSD, T2D, and height (Figure 5). Statistically significant positive correlations between the pairs of PGS were observed for all three traits for both the PNC (Table S8) and ABCD (Table S9) EUR-ancestry cohorts, with the strongest association observed between the height PGS (PNC: r = 0.736; ABCD: r = 0.734) and the weakest observed for the PTSD PGS (PNC: r = 0.392; ABCD: r = 0.378).

Figure 5

Correlation between PGS computed from two different EUR-ancestry discovery GWAS for EUR-ancestry individuals

Pairs of PGS computed for the EUR samples of PNC (n = 5,239) and ABCD (n = 5,815) using two different EUR discovery GWAS for PTSD, T2D, and height all showed significant positive correlations.

Stability of PGS computed from different-ancestry discovery GWAS

Given the scarcity of AFR-ancestry GWAS, it is often tempting to compute PGS for AFR-ancestry individuals using EUR-ancestry discovery GWAS. To assess the feasibility of this approach, we computed PGS for AFR-ancestry individuals in PNC and ABCD using both AFR-ancestry discovery GWAS and EUR-ancestry GWAS and then assessed the correlation between the two sets of PGS (Figure 6).

Figure 6

Correlation between PGS computed from AFR-ancestry and EUR-ancestry discovery GWAS for AFR-ancestry individuals

Pairs of PGS computed for the AFR samples of PNC and ABCD from the newer EUR and AFR discovery GWAS were not significantly correlated for either PTSD or T2D, but there was a significant positive correlation for height.

For PTSD, there was no significant correlation between the PGS computed from the newer Freeze 2 PGC AFR and EUR discovery GWAS for AFR-ancestry individuals in either PNC (r = 0.00356, t(3,258) = 0.203, p = 0.839) or ABCD (r = 0.00283, t(1,739) = 0.118, p = 0.906). The AFR PGS computed using the Freeze 1 PGC PTSD AFR and EUR discovery GWAS were uncorrelated for ABCD (r = −0.00320, t(1,739) = −0.133, p = 0.894), but we observed a weak positive correlation for PNC (r = 0.0417, t(3,258) = 2.379, p = 0.0174). We made the same different-ancestry GWAS comparisons for the EUR-ancestry individuals in the PNC and ABCD study populations (Figure 7). As was the case for AFR-ancestry individuals, we found no significant correlation between PGS computed from the PGC Freeze 2 EUR- and AFR-ancestry PTSD discovery GWAS. While we observed no significant correlation between the PGS computed using the PGC Freeze 1 EUR- and AFR-ancestry PTSD discovery GWAS for EUR-ancestry individuals in ABCD (r = −0.00109, t(5,813) = −0.083, p = 0.934), we did observe a weak positive correlation for the EUR cohort of PNC (r = 0.0379, t(5,237) = 2.746, p = 0.0065).

Figure 7

Correlation between PGS computed from EUR-ancestry and AFR-ancestry discovery GWAS for EUR-ancestry individuals

Pairs of PGS computed for the EUR samples of PNC and ABCD from the newer EUR and AFR discovery GWAS were not significantly correlated for either PTSD or T2D, but there was a significant positive correlation for height.

We compared T2D PGS computed from an AFR-ancestry discovery GWAS to those computed using two EUR discovery GWAS published by the DIAGRAM consortium. The newer EUR-ancestry T2D discovery GWAS yielded PGS that were uncorrelated with those computed from the AFR-ancestry discovery GWAS for the AFR-ancestry individuals in both PNC (r = 0.0185, t(3,258) = 1.055, p = 0.292) and ABCD (r = 0.0219, t(1,739) = 0.912, p = 0.362). Similarly, there was no significant correlation between the different-ancestry T2D PGS that we computed for the EUR-ancestry individuals in PNC (r = 0.0240, t(5,237) = 1.739, p = 0.082) and ABCD (r = 0.0224, t(5,813) = 1.71, p = 0.0872). We observed a weak positive correlation between the PGS computed from the older EUR-ancestry T2D discovery GWAS and the PGS computed from the AFR-ancestry T2D discovery GWAS for the PNC AFR cohort (r = 0.0432, t(3,258) = 2.469, p = 0.0136), but there were no significant correlations between the two sets of PGS computed for the ABCD AFR cohort (r = −0.0458, t(1,739) = −1.911, p = 0.0562), the PNC EUR cohort (r = 0.00528, t(5,237) = 0.382, p = 0.703), or the ABCD EUR cohort (r = 0.0188, t(5,813) = 1.431, p = 0.152). 
We also computed different-ancestry PGS using EUR- and AFR-ancestry height discovery GWAS that we obtained from the GIANT consortium. We observed significant positive correlations between the PGS computed from the newer EUR- and AFR-ancestry height discovery GWAS for the PNC AFR (r = 0.287, t(3,258) = 17.09, p < 2 × 10⁻¹⁶), ABCD AFR (r = 0.306, t(1,739) = 13.42, p < 2 × 10⁻¹⁶), PNC EUR (r = 0.403, t(5,237) = 31.82, p < 2 × 10⁻¹⁶), and ABCD EUR (r = 0.404, t(5,813) = 33.69, p < 2 × 10⁻¹⁶) cohorts. Likewise, we found significant positive correlations between the PGS computed from the older EUR-ancestry height discovery GWAS and the AFR-ancestry height discovery GWAS for the PNC AFR (r = 0.258, t(3,258) = 15.22, p < 2 × 10⁻¹⁶), ABCD AFR (r = 0.312, t(1,739) = 13.68, p < 2 × 10⁻¹⁶), PNC EUR (r = 0.335, t(5,237) = 25.25, p < 2 × 10⁻¹⁶), and ABCD EUR (r = 0.327, t(5,813) = 26.39, p < 2 × 10⁻¹⁶) cohorts. As was the case for T2D, there was only one AFR-ancestry height discovery GWAS available to use for computing PGS. The supplemental methods include complete statistical results for the comparisons between PGS computed from different discovery GWAS for the PNC AFR (Table S6), ABCD AFR (Table S7), PNC EUR (Table S8), and ABCD EUR (Table S9) cohorts.

GWAS power

We hypothesized that PGS would be more stable for traits with more powerful discovery GWAS. As such, we used LDSC to compute the mean χ² as a proxy for power for each of the EUR-ancestry discovery GWAS that we used to compute PGS (Table S14). We found that the two height GWAS had higher mean χ² than the two T2D GWAS, which had higher mean χ² than the two PTSD GWAS. Also, the newer, larger GWAS had higher mean χ² than the older GWAS for each trait. The genetic correlation calculated by LDSC for each pair of GWAS was essentially perfect (Table S15), with the lowest r_g = 0.9225 ± 0.1807 for PTSD.

Sample-size effects

In an effort to disentangle the effect of GWAS sample size from other factors differing between the height, T2D, and PTSD GWAS, such as trait heritability and sample overlap, we computed height PGS from seven GWAS that we ran using unrelated white British samples from the UK Biobank. The heights, male-female ratios, and ages at height measurement were comparable across all seven GWAS groups and the test set (Table S16). The LDSC mean χ² for our seven height GWAS, which ranged from 1.0982 for GWAS E to 3.7259 for GWAS AB (Table S17), spanned the range of the mean χ² we found for the EUR-ancestry meta-GWAS we used for our primary analyses (Table S14), suggesting that our height GWAS had a similar range of power. The genetic correlations between the GWAS computed by LDSC were essentially perfect for all comparisons (Table S19). All seven of our height GWAS identified genome-wide significant SNPs (p < 5 × 10⁻⁸), with the larger GWAS identifying more such SNPs than the smaller GWAS (Table S18). We used our seven discovery GWAS to generate height PGS for an independent test group of 8,107 unrelated white British individuals who had standing height measurements. We found that the correlation between PGS is driven by both the discovery GWAS sample size and the degree of sample overlap between the discovery GWAS (Figures 8 and 9; Table S20). The PGS computed from GWAS AB (n = 268,000), which overlaps with all of the other GWAS, were about equally correlated with the PGS from both members of each same-sized pair: GWAS A and B (each n = 134,000; both r ≈ 0.91), GWAS C and D (each n = 75,000; both r ≈ 0.79), and GWAS E and F (each n = 10,000; both r ≈ 0.35) (Figure 8). However, the PGS computed from GWAS A, which overlapped only with GWAS C and GWAS E, showed a stronger correlation with the PGS computed from the overlapping, smaller GWAS C (r = 0.88) than they did with the PGS computed from the non-overlapping, larger GWAS B (r = 0.65). 
In general, the percentage overlap between discovery GWAS was relatively more important than the number of subjects in common (Figure 9). When two discovery GWAS had 10,000 subjects in common, the PGS correlation was stronger when the percentage overlap between the discovery GWAS was 13% (C × E; D × F) than it was when the percentage overlap was 7.5% (A × E; B × F) or 3.7% (AB × E; AB × F). Likewise, when two discovery GWAS had 75,000 subjects in common, the correlation was stronger when that number represented a 56% overlap between the GWAS (A × C; B × D) than when it represented a 28% overlap (AB × C; AB × D). Moreover, PGS computed from GWAS A were more strongly correlated with those from GWAS C (r = 0.88) than they were with those from GWAS D (r = 0.56), which was the same size as GWAS C (n = 75,000) but had no overlap with GWAS A. When considering only non-overlapping discovery GWAS, the correlation was stronger for PGS computed from larger GWAS; for example, the PGS computed from GWAS A were more strongly correlated with those from GWAS B (r = 0.65) than with those from either GWAS D (r = 0.56) or GWAS F (r = 0.25). The additive models including sex, age, 20 ancestry PCs, and the PGS computed from our height GWAS explained between 54.82% (GWAS E) and 62.86% (GWAS AB) of the variability in the measured heights for the test group (Table S21). Moreover, the PGS computed from the height GWAS all explained a significant amount of variability in the height phenotypes beyond what was explained by sex, age at height measurement, and 20 ancestry PCs alone (Table S21). We obtained these experimental results using a highly heritable trait (height), and the discovery GWAS and test samples were drawn from the same ancestrally homogeneous population (white British).
The fact that we observed variable degrees of correlation between PGS even under these controlled conditions implies that the differing degrees of correlation that we report for pairs of PTSD, T2D, and height PGS cannot be attributed solely to differing discovery GWAS sample sizes. The proportional overlap between the discovery GWAS is also important, as are the individual subjects who are included in a discovery GWAS.
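
The percentage overlaps quoted above can be reproduced from the nested design of the seven height GWAS. The sketch below assumes that percentage overlap is the number of shared samples divided by the sample size of the larger GWAS in the pair; that definition is inferred from the reported values rather than stated in the text, and the helper names are hypothetical.

```python
# Sample sizes reported in the text for the seven UK Biobank height GWAS.
GWAS_N = {"AB": 268_000, "A": 134_000, "B": 134_000,
          "C": 75_000, "D": 75_000, "E": 10_000, "F": 10_000}

# Nesting described in the text: E within C within A within AB,
# and F within D within B within AB.
PARENT = {"A": "AB", "B": "AB", "C": "A", "D": "B", "E": "C", "F": "D"}


def lineage(name):
    """The GWAS itself followed by every GWAS that contains it."""
    chain = [name]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def shared_samples(x, y):
    """Samples in common: the smaller GWAS if one is nested in the other, else 0."""
    if y in lineage(x):
        return GWAS_N[x]
    if x in lineage(y):
        return GWAS_N[y]
    return 0


def percent_overlap(x, y):
    """Assumed definition: shared samples as a percentage of the larger GWAS."""
    return 100.0 * shared_samples(x, y) / max(GWAS_N[x], GWAS_N[y])


for pair in [("C", "E"), ("A", "E"), ("AB", "E"), ("A", "C"), ("AB", "C"), ("A", "D")]:
    print(pair, shared_samples(*pair), round(percent_overlap(*pair), 1))
```

Under this assumption the pairs C × E, A × E, and AB × E all share 10,000 samples but differ in percentage overlap, which is the contrast the paragraph above draws.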

Figure 8

Correlation between PGS computed from seven white British height GWAS for an independent test set of 8,107 unrelated white British individuals from the UK Biobank

GWAS A and GWAS B were each run for n = 134,000 non-overlapping, unrelated white British individuals using sex, age at height measurement, and the first 20 ancestry PCs as covariates. The GWAS A and GWAS B samples were combined to run GWAS AB (n = 268,000). GWAS C was run using a random subsample (n = 75,000) of the individuals included in GWAS A, and GWAS E was run using a random subsample (n = 10,000) of the individuals included in GWAS C. The same relationship exists between GWAS B, GWAS D (n = 75,000), and GWAS F (n = 10,000). The strength of the correlation between PGS is driven by both GWAS sample size and the degree of sample overlap between the GWAS. ∗∗∗p < 0.001.

Figure 9

Contributions of GWAS sample size and proportional sample overlap to the correlation between height PGS

Height GWAS A and GWAS B were each run for n = 134,000 non-overlapping, unrelated white British individuals using sex, age at height measurement, and the first 20 ancestry PCs as covariates. The GWAS A and GWAS B samples were combined to run GWAS AB (n = 268,000). GWAS C was run using a random subsample (n = 75,000) of the individuals included in GWAS A, and GWAS E was run using a random subsample (n = 10,000) of the individuals included in GWAS C. The same relationship exists between GWAS B, GWAS D (n = 75,000), and GWAS F (n = 10,000). Black dots correspond to the Pearson correlation coefficients for height PGS computed from pairs of discovery GWAS with no sample overlap. When the PGS were computed from overlapping discovery GWAS, the correlation coefficients are depicted using colored dots; the legend lists the number of samples in common as well as the proportion of samples in common for each color. Error bars denote 95% confidence intervals. PGS from pairs of discovery GWAS are more strongly correlated when there is a higher proportion of sample overlap between the GWAS.

Given that much of the interest in PGS is in identifying individuals at high genetic risk for a disorder, we evaluated whether there would be more stability if we focused on the individuals who had PGS located in the upper tail of the distribution. As a baseline comparison, we determined the degree of overlap between the individuals in the top quantiles of PGS computed from two PRS-CS runs using the Freeze 2 AFR- and EUR-ancestry PTSD discovery GWAS for the AFR (Figure 10A) and EUR (Figure 11A) PNC cohorts, respectively. Of the n = 3,260 individuals in the PNC AFR cohort, there are n = 652 individuals with PGS at or above the 80th percentile, n = 326 with PGS at or above the 90th percentile, and n = 163 with PGS at or above the 95th percentile of PGS. We found an overlap of 644 of the 652 AFR-ancestry individuals who had PGS at or above the 80th percentile from the two runs using the same AFR-ancestry PTSD discovery GWAS, which is a 98.7% overlap. Comparable degrees of overlap were observed between the PNC AFR-ancestry individuals with PTSD PGS at or above the 90th (318/326 = 0.975) and 95th (161/163 = 0.988) percentiles. Similarly, the proportional overlap between the PTSD PGS computed from two PRS-CS runs using the Freeze 2 EUR-ancestry PTSD discovery GWAS for the EUR-ancestry cohort (n = 5,239) was 1,026/1,048 = 0.979 at or above the 80th percentile, 513/524 = 0.979 at or above the 90th percentile, and 255/262 = 0.973 at or above the 95th percentile.
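
The overlap proportions reported above (for example, 644/652 = 0.987) amount to intersecting the sets of individuals in the upper tail of two PGS distributions. The following is a minimal sketch, assuming each set of scores is held in a dictionary keyed by participant ID. The function names, the rounding rule for the tail size, and the simulated scores are illustrative assumptions, not the authors' code.

```python
import random


def top_ids(scores, top_fraction):
    """IDs of the highest-scoring individuals (top_fraction of the cohort)."""
    k = max(1, round(len(scores) * top_fraction))  # assumed rounding rule
    ranked = sorted(scores, key=scores.get, reverse=True)
    return set(ranked[:k])


def tail_overlap(scores_a, scores_b, top_fraction):
    """Proportion of upper-tail individuals shared by two sets of PGS."""
    top_a = top_ids(scores_a, top_fraction)
    top_b = top_ids(scores_b, top_fraction)
    return len(top_a & top_b) / len(top_a)


# Simulated stand-ins for two PGS on the same 3,260 people (hypothetical data).
rng = random.Random(1)
ids = [f"id{i}" for i in range(3260)]
pgs_1 = {i: rng.gauss(0.0, 1.0) for i in ids}
# Second score built to have a population correlation of about 0.7 with the first.
pgs_2 = {i: 0.7 * pgs_1[i] + (1 - 0.7 ** 2) ** 0.5 * rng.gauss(0.0, 1.0) for i in ids}

for frac in (0.20, 0.10, 0.05):
    print(frac, len(top_ids(pgs_1, frac)), round(tail_overlap(pgs_1, pgs_2, frac), 3))
```

With 3,260 individuals the tail sizes come out as 652, 326, and 163, matching the counts given for the PNC AFR cohort; the simulated overlap values themselves carry no empirical meaning.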

Figure 10

Comparison of the samples comprising the top PGS quantiles for the PNC AFR cohort

(A) The samples located at the top 20%, 10%, and 5% of the PTSD PGS distribution were virtually the same when PGS were computed twice using the same discovery GWAS. For example, 644 out of the 652 samples (98.7%) at or above the 80th percentile were the same between the two batches of PGS. (B) The overlap between samples at all three quantiles dropped substantially when the PGS computed from the AFR PGC Freeze 1 PTSD discovery GWAS were compared with those computed from the AFR Freeze 2 PTSD discovery GWAS (Nievergelt et al.), with the degree of overlap being reduced at higher quantiles. (C) The degree of overlap was further reduced when comparing PGS computed from an AFR-ancestry discovery GWAS to those computed from a EUR-ancestry GWAS for PTSD (Nievergelt et al.), T2D, and height (Marouli et al.). For context, the green bars depict the number of samples included at or above the 80th percentile (n = 652), 90th percentile (n = 326), and 95th percentile (n = 163). Additional results can be found in Tables S10 and S11.

Figure 11

Comparison of the samples comprising the top PGS quantiles for the PNC EUR cohort

(A) The EUR samples located within the top 20%, 10%, and 5% of the PTSD PGS distribution were nearly the same when PGS were computed twice using the same EUR discovery GWAS (Nievergelt et al.). For example, 1,026 out of the 1,048 samples (97.9%) at or above the 80th percentile were the same between the two runs of PRS-CS. (B) The overlap between samples at all three quantiles dropped substantially when the PGS computed from two different EUR discovery GWAS were compared for PTSD, T2D, and height. (C) The degree of overlap was dramatically reduced when comparing PGS computed from an AFR-ancestry discovery GWAS with those computed from a EUR-ancestry GWAS for PTSD (Nievergelt et al.), T2D, and height. Green bars depict the number of samples included at or above the 80th percentile (n = 1,048), 90th percentile (n = 524), and 95th percentile (n = 262). Additional results can be found in Tables S12 and S13.

The proportional overlap decreases if we consider PGS computed from two different same-ancestry discovery GWAS. For the PNC AFR-ancestry cohort (Figure 10B), PC-adjusted standardized PGS computed from the Freeze 1 and Freeze 2 PTSD AFR-ancestry discovery GWAS had 53.6% of individuals in common at or above the 80th percentile, 47.5% at or above the 90th percentile, and 36.3% at or above the 95th percentile. The decrease in proportional overlap was even more pronounced for PGS computed from two different EUR-ancestry GWAS for the PNC EUR-ancestry cohort (Figure 11B). The proportion of overlap became progressively smaller as we considered progressively higher percentiles for PTSD, T2D, and height. Moreover, the amount of overlap was greatest for height and smallest for PTSD at each of the percentiles that we considered. Proportional overlap was even more dramatically decreased when we compared the top quantiles of the PGS that had been computed from an AFR-ancestry discovery GWAS with those that had been computed from a EUR-ancestry discovery GWAS (Figures 10C and 11C).
For the AFR cohort of PNC, the proportional overlap ranged from a low of 4.91% for different-ancestry PTSD PGS at the 95th percentile to a high of 32.8% for different-ancestry height PGS at the 80th percentile, whereas the proportional overlap for the PNC EUR cohort ranged from 3.82% for different-ancestry T2D PGS at the 95th percentile to 38.1% for height PGS at the 80th percentile. For both the EUR and AFR cohorts, the general pattern is that proportional overlap is largest for different-ancestry PGS at the 80th percentile and smallest at the 95th percentile. Within a given percentile, the proportional overlap is largest for height and smallest for either PTSD or T2D. Note that Figures 10 and 11 only include comparisons for PNC between the PGS computed using the newer discovery GWAS if there was more than one comparison possible. We observed similar results for the ABCD cohort and also for additional different-ancestry comparisons. See the supplemental methods for complete results of our quantile-based analyses for the PNC AFR (Table S10), ABCD AFR (Table S11), PNC EUR (Table S12), and ABCD EUR (Table S13) cohorts.

Discussion

Our work focused on comparing the PGS computed from different discovery GWAS at the individual level. The correlation in PGS across discovery GWAS was higher for a strongly heritable anthropometric trait (e.g., height) as compared with medical and psychiatric disorders, such as T2D and PTSD; higher between GWAS with overlapping samples than between non-overlapping GWAS; and higher for same-ancestry versus different-ancestry GWAS. These patterns of stability extended to comparisons between the upper quantiles of PGS, underscoring the need to proceed cautiously with integrating PGS into precision medicine applications.

This relatively modest correlation in PGS is especially noteworthy given that it was observed for PGS computed using successive generations of meta-GWAS that were produced by the PGC, DIAGRAM, and GIANT consortia. The fact that even same-ancestry meta-GWAS computed by the same consortia using overlapping samples and SNPs (Tables 1 and 2) could yield PGS with correlations <0.7 at the individual level raises serious concerns. If PGS are going to be used clinically, then they need to be reproducible.

In many ways, our UK Biobank experiment provided the best-case scenario for PGS stability. We considered height, a highly heritable, easily measured quantitative trait, and the phenotyping, genotyping, statistical analyses, and study population were constant across all discovery GWAS and the test set. PGS were most correlated between the largest GWAS, but the degree of sample overlap appeared to be a stronger predictor of correlation strength than sample size.

Table 2

Estimated sample overlap between same-ancestry GWAS

Trait	Discovery GWAS ancestry	No. of overlapping samplesᵃ	No. of new samplesᵇ	Percentage increase in sample size (%)ᶜ	Correlation between PGS (PNC)	Correlation between PGS (ABCD)
PTSD³⁸,³⁹	AFR	9,691	1,630	16.82	0.696	0.657
PTSD³⁸,³⁹	EUR	9,954	60,283	605.62	0.392	0.378
T2D⁴⁰,⁴¹	EUR	152,599	78,821	51.65	0.602	0.597
Height⁴³,⁴⁴	EUR	252,048	129,577	51.41	0.736	0.734

PTSD, post-traumatic stress disorder; T2D, type 2 diabetes; PNC, Philadelphia Neurodevelopmental Cohort; PGS, polygenic score; ABCD, Adolescent Brain Cognitive Development study.

Discovery GWAS references: Duncan et al., Nievergelt et al., Scott et al., Mahajan et al., Wood et al., Marouli et al.

ᵃ PRS-CS sample size for the older GWAS in the pair; calculated as described in the supplemental methods.

ᵇ Calculated as the difference between the PRS-CS sample size for the newer GWAS and that for the older GWAS run by the consortium. This is an estimate, as we do not know the exact degree of overlap between the two GWAS.

ᶜ Calculated as the number of new samples divided by the number of overlapping samples.
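
The percentage-increase column of Table 2 follows directly from the two sample-count columns. The short check below recomputes it from the values in the table; the dictionary layout and function name are hypothetical.

```python
# (trait, discovery GWAS ancestry): (overlapping samples, new samples), from Table 2.
TABLE_2 = {
    ("PTSD", "AFR"): (9_691, 1_630),
    ("PTSD", "EUR"): (9_954, 60_283),
    ("T2D", "EUR"): (152_599, 78_821),
    ("Height", "EUR"): (252_048, 129_577),
}


def percent_increase(n_overlapping, n_new):
    """New samples as a percentage of the overlapping (older GWAS) samples."""
    return 100.0 * n_new / n_overlapping


for (trait, ancestry), (n_overlapping, n_new) in TABLE_2.items():
    print(trait, ancestry, round(percent_increase(n_overlapping, n_new), 2))
```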

Even if stand-alone PGS are not yet useful clinically, they could still be used to help identify those individuals at highest disease risk. For instance, PGS for psychiatric traits could be used in conjunction with environmental factors to identify adolescents most at risk for developing psychosis and other mental health disorders. We are actively pursuing such applications with the PNC and ABCD cohorts and have found that ancestry-specific PTSD PGS do indeed add predictive value to models that include other non-genetic factors. Nonetheless, we caution that it is dangerous to rely solely on PGS quantiles to identify at-risk individuals. Successive generations of discovery GWAS yielded PGS that did not identify the same individuals at the top quantiles of the distribution, and the amount of overlap decreased as higher quantiles were considered (Figures 10 and 11; Tables S10–S13). Hence, the instinctive decision to focus only on the upper tail of the PGS distribution will not mitigate the lack of PGS stability across different discovery GWAS.

We chose to use the Bayesian PRS-CS Python package to compute PGS for this study.
It has been demonstrated that Bayesian methods generally yield more predictive PGS than those produced via traditional p-value thresholding approaches. The advantage of PRS-CS over other Bayesian methods is that it employs a very robust Strawderman-Berger continuous shrinkage prior rather than a discrete mixture prior, which allows for more accurate multivariate modeling of local LD in the polygenic prediction. When enough MCMC iterations are used to ensure convergence of the underlying Gibbs sampler algorithm, PRS-CS yields very consistent posterior means of the estimated SNP effects (Figure 2). PGS computed using the same discovery GWAS are highly correlated when computed using multiple PRS-CS runs (Figure 3), and others have previously shown that PGS computed from the same discovery GWAS are strongly correlated when computed using PRS-CS and other Bayesian and non-Bayesian approaches. Hence, the limited PGS stability across discovery GWAS that we report here cannot be attributed to the stochastic nature of Bayesian methods; there must be differences between the discovery GWAS. By choosing to use multiple generations of GWAS produced by the same consortia, we hoped to minimize potential methodological differences between the same-trait meta-GWAS. As expected, the genetic correlation between each pair of same-trait GWAS was nearly perfect, no doubt due to the large overlap between SNPs and samples within each pair (Table 1). Initially, we had assumed that the newer GWAS in each pair would be the "better" GWAS since we thought that the larger sample size would yield more explanatory power. We cannot rule out this possibility, but the results of our UK Biobank experiment suggest that factors beyond sample size also contribute to PGS stability. It is not surprising that the two height GWAS had higher mean χ2, a proxy for GWAS power, as compared with the PTSD and T2D GWAS (Table S14). 
Height is an easily measured quantitative trait that is less susceptible to ascertainment bias than qualitative disease traits. Furthermore, environmental factors make substantial contributions to the development of both PTSD and T2D. Even so, LDSC gave an unusually high estimate of mean χ2 (6.4544) for the newer height GWAS. While it is possible that the LDSC calculations could have been biased due to being based only on a small number of low-frequency SNPs, we believe that a plausible explanation could lie in the design of this GWAS. Specifically, the newer height GWAS included a small number of targeted rare and low-frequency SNPs (MAF between 0.1% and 4.8%) on a specially designed exome array rather than casting the same wide net as the earlier GWAS, although our LDSC and PRS-CS calculations only included the low-frequency SNPs (i.e., those with MAF > 1%). This modification coupled with a substantially increased sample size and an easily ascertained quantitative trait could have yielded this improvement in explanatory power.

Our results add to the growing body of evidence that PGS should be computed from an ancestrally matched discovery GWAS. It is well established that EUR-ancestry GWAS typically yield PGS that are less predictive for AFR and other non-EUR-ancestry groups [63–67]. We have further demonstrated that PGS computed from same-ancestry GWAS for PTSD and T2D are uncorrelated with those computed from different-ancestry GWAS for both AFR- and EUR-ancestry study participants (Figures 6 and 7), and we also found that there is very little overlap between the individuals in the upper tails of the PGS distributions computed using EUR-ancestry GWAS as compared with those computed using AFR-ancestry GWAS (Figures 10C and 11C; Tables S10–S13). Given the dearth of AFR-ancestry and other non-EUR-ancestry discovery GWAS, our results underscore the urgent need for more high-powered GWAS analyses to be run for non-EUR-ancestry populations.
We chose to study PTSD, T2D, and height because all three traits had publicly available GWAS for both EUR- and AFR-ancestry populations. Of these three, PTSD was the only trait that had two AFR-ancestry GWAS available for comparison purposes. While we focused our current work on the EUR- and AFR-ancestry individuals in the PNC and ABCD cohorts, we hope that methodology and GWAS data will soon exist to make it possible to expand our analyses to the admixed American and other ancestral groups that are also included in these cohorts (Figure 1). The recent release of PRS-CSx will make it possible to use discovery GWAS that include a combination of East Asian-, AFR-, and EUR-ancestry samples. Although it offers an improvement over the current requirement that the discovery GWAS be limited to only one of these three ancestry groups, PRS-CSx still does not enable analyses of admixed samples from other genetic backgrounds. Ultimately, we envision a future where genetic ancestry will not be a necessary consideration before computing PGS. Given that genetic ancestry is continuous, it is rather artificial to assign samples to discrete ancestry groups. Within the AFR-ancestry group alone, there is an enormous degree of genetic diversity. We controlled for such diversity by calculating PGS separately for each ancestry group and then regressing out within-ancestry principal components from the standardized PGS. We are optimistic that new methods that incorporate local ancestry will eventually allow us to embrace this diversity and compute stable, accurate PGS for admixed populations. Increasingly economical whole-genome sequencing, coupled with expanded (i.e., less Eurocentric) genotyping arrays and improved imputation to diverse reference panels from TOPMed, should also facilitate the further development of inclusive approaches, such as BOLT-LMM, trans-ethnic GWAS, and multi-ethnic PGS.
While it certainly would be easier to continue to focus PGS development on EUR-ancestry populations, we do so at the grave risk of further exacerbating the inequities in medical care between EUR-ancestry populations and the rest of the world.
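
The PC adjustment mentioned in this section, regressing within-ancestry principal components out of the standardized PGS, can be sketched as an ordinary least-squares residualization. This is an illustration under stated assumptions rather than the authors' pipeline: the helper names are hypothetical, only two simulated PCs are used, and the data are random.

```python
import random
import statistics


def standardize(values):
    """Center to mean 0 and scale to unit (population) standard deviation."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return [(v - mean) / sd for v in values]


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def residualize(y, covariates):
    """Residuals of y after OLS regression on an intercept plus the covariate columns."""
    rows = [[1.0] + list(cov) for cov in zip(*covariates)]
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    beta = solve(xtx, xty)
    return [yi - sum(b * v for b, v in zip(beta, r)) for r, yi in zip(rows, y)]


# Simulated cohort (hypothetical): a raw PGS partly driven by two ancestry PCs.
rng = random.Random(2)
n = 500
pc1 = [rng.gauss(0.0, 1.0) for _ in range(n)]
pc2 = [rng.gauss(0.0, 1.0) for _ in range(n)]
raw_pgs = [0.8 * a - 0.5 * b + rng.gauss(0.0, 1.0) for a, b in zip(pc1, pc2)]

adjusted_pgs = residualize(standardize(raw_pgs), [pc1, pc2])
```

By construction the adjusted scores are uncorrelated with each PC in the sample, so any remaining differences between individuals are not attributable to the ancestry axes that were regressed out.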

Data and code availability

The PNC and ABCD genomic datasets used in this study are available by application from dbGaP (phs00060) and NDAR (NDA no. 2573), respectively. UK Biobank data are also available by application. All discovery GWAS summary statistics and software used in this study are publicly available; see Web resources for access information.
