Aroon T Chande, Lavanya Rishishwar, Dongjo Ban, Shashwat D Nagar, Andrew B Conley, Jessica Rowell, Augusto E Valderrama-Aguirre, Miguel A Medina-Rivas, I King Jordan.
Abstract
Genome-wide association studies have uncovered thousands of genetic variants that are associated with a wide variety of human traits. Knowledge of how trait-associated variants are distributed within and between populations can provide insight into the genetic basis of group-specific phenotypic differences, particularly for health-related traits. We analyzed the genetic divergence levels for 1) individual trait-associated variants and 2) collections of variants that function together to encode polygenic traits, between two neighboring populations in Colombia that have distinct demographic profiles: Antioquia (Mestizo) and Chocó (Afro-Colombian). Genetic ancestry analysis showed 62% European, 32% Native American, and 6% African ancestry for Antioquia compared with 76% African, 10% European, and 14% Native American ancestry for Chocó, consistent with demography and previous results. Ancestry differences can confound cross-population comparison of polygenic risk scores (PRS); however, we did not find any systematic bias in PRS distributions for the two populations studied here, and population-specific differences in PRS were, for the most part, small and symmetrically distributed around zero. Both genetic differentiation at individual trait-associated single nucleotide polymorphisms and population-specific PRS differences between Antioquia and Chocó largely reflected anthropometric phenotypic differences that can be readily observed between the populations along with reported disease prevalence differences. Cases where population-specific differences in genetic risk did not align with observed trait (disease) prevalence point to the importance of environmental contributions to phenotypic variance, for both infectious and complex, common disease. The results reported here are distributed via a web-based platform for searching trait-associated variants and PRS divergence levels at http://map.chocogen.com (last accessed August 12, 2020).
Keywords: disease; genetic ancestry; health; polygenic; population genomics; traits
Year: 2020 PMID: 32681795 PMCID: PMC7513793 DOI: 10.1093/gbe/evaa154
Source DB: PubMed Journal: Genome Biol Evol ISSN: 1759-6653 Impact factor: 3.416
Fig. 1. Genetic ancestry in Antioquia and Chocó. (A) The locations of the Colombian administrative departments of Chocó (purple) and Antioquia (green) are shown along with pie charts indicating the average continental ancestry fractions: African (blue), European (orange), and Native American (red). (B) Ternary plots showing the relative contributions of African, European, and Native American ancestry to individuals from Antioquia (green) and Chocó (purple). (C) ADMIXTURE plot showing the continental ancestry fractions for African (blue), European (orange), and Native American (red) reference populations together with Antioquia and Chocó.
Fig. 2. Single nucleotide variant phenotype associations. (A) Polarized fixation index (FST) values for divergent trait-associated SNP effect alleles: higher effect allele frequency in Antioquia (left, green) and higher effect allele frequency in Chocó (right, purple). The corresponding SNP associations are shown in panel B (see supplementary table 3, Supplementary Material online, for details). (B) Heatmap of effect allele frequencies in Antioquia and Chocó (see key) and their SNP associations. (C) Word clouds showing the enrichment of SNP-associated traits for each population. Word clouds were generated by counting the occurrences of SNP trait annotations for SNPs with an FST value >0.2, 98 for Chocó and 61 for Antioquia (all SNPs significantly divergent at P ≪ 0.001; supplementary table 3, Supplementary Material online), and words are scaled by the number of times they appear in the trait association list.
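The polarized FST values in panel A can be illustrated with a small calculation. The sketch below is not the authors' pipeline, which this record does not describe. It assumes a simple two-population Wright-style FST for a biallelic SNP with equal population weights, and it assumes a sign convention (negative when the effect allele is more frequent in Antioquia, positive when it is more frequent in Chocó). The function names are hypothetical.

```python
def wright_fst(p1: float, p2: float) -> float:
    """Wright-style FST for one biallelic SNP in two equally weighted populations.

    p1 and p2 are effect allele frequencies. This is an illustrative
    estimator and an assumption, not necessarily the one used in the paper.
    """
    p_bar = (p1 + p2) / 2
    h_total = 2 * p_bar * (1 - p_bar)            # expected heterozygosity, pooled
    if h_total == 0:
        return 0.0                               # allele fixed or absent in both
    h_within = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2  # mean within-population
    return (h_total - h_within) / h_total


def polarized_fst(p_antioquia: float, p_choco: float) -> float:
    """Sign FST by the population with the higher effect allele frequency.

    Assumed convention: positive for Chocó, negative for Antioquia.
    """
    fst = wright_fst(p_antioquia, p_choco)
    return fst if p_choco >= p_antioquia else -fst


if __name__ == "__main__":
    # Hypothetical frequencies, chosen only to show the sign behaviour.
    print(polarized_fst(0.1, 0.9))
    print(polarized_fst(0.9, 0.1))
```

Under this convention, a SNP would pass the FST > 0.2 threshold mentioned in panel C when the absolute value of the polarized statistic exceeds 0.2.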
Fig. 3. Polygenic risk divergence. (A) The distribution of the differences in population-average PRS is shown for significantly divergent traits: higher in Antioquia (above, green) and higher in Chocó (below, purple). (B) Population-specific PRS distributions for examples of anthropometric and disease traits are shown for Antioquia (green) and Chocó (purple) along with the significance levels for the distribution differences. Traits with increased prevalence/risk in Antioquia are shown on the left; traits with increased prevalence/risk in Chocó are shown on the right.
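A difference in population-average PRS, as plotted in panel A, can be sketched as follows. This is a minimal illustration under stated assumptions: each individual's score is taken to be a weighted sum of effect allele dosages (0, 1, or 2 per SNP), and the weights, genotypes, and function names are hypothetical. The record does not state how the authors scored or normalized PRS.

```python
from statistics import mean
from typing import Sequence


def polygenic_score(dosages: Sequence[int], weights: Sequence[float]) -> float:
    """Weighted sum of effect allele dosages for one individual (assumed scoring)."""
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(d * w for d, w in zip(dosages, weights))


def mean_prs_difference(
    pop_a: Sequence[Sequence[int]],
    pop_b: Sequence[Sequence[int]],
    weights: Sequence[float],
) -> float:
    """Population-average PRS of pop_a minus that of pop_b."""
    mean_a = mean(polygenic_score(g, weights) for g in pop_a)
    mean_b = mean(polygenic_score(g, weights) for g in pop_b)
    return mean_a - mean_b


if __name__ == "__main__":
    # Toy data: two SNPs, two individuals per population (all values invented).
    weights = [0.5, 1.0]
    antioquia = [[2, 0], [1, 1]]
    choco = [[0, 2], [0, 1]]
    print(mean_prs_difference(antioquia, choco, weights))
```

A positive result would correspond to a trait plotted as higher in Antioquia, a negative result to one plotted as higher in Chocó.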
Fig. 4. Population-specific differences in trait endophenotypes: pathways and biochemical functions. Gene set enrichment was used to uncover pathways and functional gene sets that are enriched for divergent associated SNPs in each population. For each pathway or function, circles are scaled to the relative number of implicated genes for each population and colored according to the population-specific levels of enrichment.
Fig. 5. Genetic ancestry and polygenic trait divergence. (A) Distributions of the correlations (r²) between individuals' genetic ancestry fractions (African, blue; European, orange; Native American, red) and their PRS for all traits analyzed here. Vertical lines show the median for each distribution. (B) Ancestry × PRS correlations (r²) polarized by the direction of the correlation (positive or negative) are shown for all traits where r² > 0.4 for at least one ancestry component: African (A), European (E), and Native American (N). (C) Examples of polygenic traits with high correlations between ancestry and PRS are shown. Ancestry components are color coded as in panel A, and for each scatter plot, ancestry fractions (y axis) are regressed against PRS (x axis). Linear trend lines with 95% confidence intervals are shown for each regression.
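The polarized correlation in panel B can be read as r² carrying the sign of the underlying Pearson correlation. The sketch below assumes that interpretation. The function names and example values are hypothetical, and the code is not taken from the paper.

```python
from math import copysign, sqrt
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two samples of equal length >= 2")
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / sqrt(sxx * syy)


def polarized_r2(ancestry: Sequence[float], prs: Sequence[float]) -> float:
    """r squared, signed by the direction of the correlation (assumed reading)."""
    r = pearson_r(ancestry, prs)
    return copysign(r * r, r)


if __name__ == "__main__":
    # Invented ancestry fractions and scores for four individuals.
    african_fraction = [0.1, 0.2, 0.3, 0.4]
    prs = [4.0, 3.0, 2.0, 1.0]
    print(polarized_r2(african_fraction, prs))
```

With this helper, a trait would meet the panel B criterion when the absolute value of the polarized r² exceeds 0.4 for at least one ancestry component.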
Fig. 6. Predicted versus observed disease risk. Left: For each disease, the predicted genetic risk difference for Antioquia compared with Chocó (red circles) is compared with the observed prevalence of the disease (blue circles). Right: The difference, predicted disease risk minus observed prevalence. Diseases are grouped into bands as complex common diseases (yellow), cancer (blue), and infectious disease (red). The x axis values are log odds ratios for population-specific disease risk allele frequencies and observed disease prevalence values, as described in Materials and Methods.
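The x axis of this figure uses log odds ratios. The exact procedure is given in the paper's Materials and Methods, which this record does not reproduce, so the sketch below shows only the generic log odds ratio of two proportions. The function name and the example values are hypothetical.

```python
from math import log


def log_odds_ratio(p_a: float, p_b: float) -> float:
    """Natural log of the odds ratio between two proportions.

    p_a and p_b could be risk allele frequencies or disease prevalences
    in two populations; both must lie strictly between 0 and 1.
    """
    for p in (p_a, p_b):
        if not 0.0 < p < 1.0:
            raise ValueError("proportions must be strictly between 0 and 1")
    odds_a = p_a / (1 - p_a)
    odds_b = p_b / (1 - p_b)
    return log(odds_a / odds_b)


if __name__ == "__main__":
    # Invented prevalences: 20% in one population, 10% in the other.
    print(log_odds_ratio(0.2, 0.1))
```

A value of zero indicates equal odds in the two populations. Positive values indicate higher odds in the first population and negative values indicate higher odds in the second.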