Lily M Blair, Marcus W Feldman.
Abstract
BACKGROUND: Demography and environmental adaptation can affect the global distribution of genetic variants and possibly the distribution of disease. Population heterozygosity of single nucleotide polymorphisms has been shown to decrease strongly with distance from Africa and this has been attributed to the effect of serial founding events during the migration of humans out of Africa. Additionally, population allele frequencies have been shown to change due to environmental adaptation. Here, we investigate the relationship of Out-of-Africa migration and climatic variables to the distribution of risk alleles for 21 diseases.
Year: 2015 PMID: 26170196 PMCID: PMC4501093 DOI: 10.1186/s12863-015-0239-3
Source DB: PubMed Journal: BMC Genet ISSN: 1471-2156 Impact factor: 2.797
Fig. 1. Regression of average heterozygosity of type 2 diabetes risk alleles on distance from Africa. Each point represents the average heterozygosity for one of the 61 populations studied in this paper.
Correlation coefficient, r, of average heterozygosity and of average risk allele frequency with distance from Africa, each with its p-value
| Disease | SNPs | Heterozygosity r with distance | Heterozygosity p-value | Risk allele frequency r with distance | Risk allele frequency p-value |
|---|---|---|---|---|---|
| Biliary liver cirrhosis | 41 | -0.04 | 0.03* | 0.63 | 0.3 |
| Alopecia areata | 41 | -0.42 | 0.78 | 0.69 | 0.04* |
| Prostate cancer | 39 | -0.22 | 0.33 | -0.46 | 0.31 |
| Systemic lupus erythematosus | 33 | -0.37 | 0.75 | 0.60 | 0.42 |
| Ulcerative colitis | 32 | -0.51 | 0.89 | -0.26 | 0.18 |
| Type 1 diabetes | 27 | -0.23 | 0.74 | 0.23 | 0.89 |
| Celiac disease | 26 | -0.45 | 0.65 | -0.08 | 0.62 |
| Parkinson’s disease | 25 | -0.76 | 0.33 | 0.16 | 0.99 |
| Crohn’s disease | 24 | -0.81 | 0.10 | -0.29 | 0.31 |
| Membranous nephropathy | 20 | -0.51 | 0.72 | -0.31 | 0.44 |
| Systemic sclerosis | 19 | -0.69 | 0.41 | -0.74 | 0.05* |
| Primary biliary cirrhosis | 15 | -0.35 | 0.57 | -0.33 | 0.72 |
| Colorectal cancer | 15 | -0.64 | 0.34 | -0.66 | 0.05 |
| Type 2 diabetes | 15 | -0.83 | 0.03* | -0.76 | 0.02* |
| Breast cancer | 14 | -0.65 | 0.35 | -0.02 | 0.77 |
| Melanoma | 14 | -0.28 | 0.51 | -0.42 | 0.22 |
| Rheumatoid arthritis | 14 | -0.24 | 0.64 | 0.43 | 0.56 |
| Asthma | 13 | -0.10 | 0.46 | -0.20 | 0.56 |
| Neuroblastoma | 10 | -0.04 | 0.52 | -0.32 | 0.58 |
| Polycystic ovary syndrome | 10 | -0.32 | 0.94 | 0.76 | 0.03* |
| Pancreatic cancer | 7 | -0.32 | 0.95 | -0.71 | 0.06 |
P-value was calculated by comparing the R2 values of the risk alleles to the null distributions created from 10,000 resampled SNP sets (see “Null Distributions” section in Methods). Correlation coefficient is reported instead of R2 to show directionality
*These p-values are not significant when Bonferroni corrected for the ten variables for each allelic statistic or when adjusted for an FDR of 0.2
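The two corrections named in the footnote can be illustrated with a minimal sketch. It assumes the FDR adjustment is the Benjamini-Hochberg step-up procedure, which the text here does not confirm, and the function names are hypothetical. In the usage example only the first p-value (0.03, type 2 diabetes heterozygosity on distance from Africa) comes from the table; the other nine are placeholders standing in for the remaining variables.

```python
def bonferroni_significant(p_values, alpha=0.05):
    """Flag p-values that pass a Bonferroni threshold of alpha / number of tests."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def benjamini_hochberg_significant(p_values, fdr=0.2):
    """Flag p-values that pass the Benjamini-Hochberg step-up rule.

    Assumption: the source only says "adjusted for an FDR of 0.2";
    Benjamini-Hochberg is used here as a common choice.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k such that p_(k) <= (k / m) * fdr.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            cutoff_rank = rank
    significant = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            significant[idx] = True
    return significant


# 0.03 is taken from the table; the nine 0.5 values are hypothetical placeholders.
p_for_ten_variables = [0.03] + [0.5] * 9
print(bonferroni_significant(p_for_ten_variables)[0])
print(benjamini_hochberg_significant(p_for_ten_variables)[0])
```

With ten tests the Bonferroni threshold is 0.05 / 10 = 0.005, so a nominal p-value of 0.03 does not pass it, which matches the footnote.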
Fig. 2. Null distributions. Blue histograms represent the binned R2 values for each of 10,000 sets of resampled SNPs regressed on an environmental variable. Each resampled set contains random SNPs that match the number of risk alleles and global allele frequency of the risk alleles for that disease. Red lines indicate values of R2, adjusted as in Methods, with 0.45 for type 2 diabetes on distance from Africa (a) and 0.03 for celiac disease on longitude (b). Before adjustment, the R2 values were 0.69 for type 2 diabetes on distance from Africa and 0.13 for celiac disease on longitude. The null distributions for these two diseases are different because each null distribution is created using resampled sets that are matched for number and global allele frequency of the risk alleles. Our analysis included 15 risk alleles for type 2 diabetes and 26 for celiac disease.
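The resampling idea in this caption can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the candidate pool is assumed to have been filtered beforehand to SNPs that match the risk alleles in global allele frequency, the R2 adjustment described in Methods is not reproduced, and the choice of the upper tail for the p-value is an assumption.

```python
import random


def r_squared(x, y):
    """Squared Pearson correlation, i.e. R2 of a simple linear regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy * sxy / (sxx * syy)


def null_r2_distribution(predictor, snp_pool, set_size, n_resamples, rng):
    """Build a null distribution of R2 values from resampled SNP sets.

    predictor: one value per population (e.g. distance from Africa).
    snp_pool:  hypothetical list of candidate SNPs, each a list with one
               value per population (e.g. heterozygosity). Assumed to be
               pre-matched on global allele frequency.
    """
    null = []
    for _ in range(n_resamples):
        sample = rng.sample(snp_pool, set_size)
        # Average the statistic across the sampled SNPs within each population.
        averages = [sum(column) / set_size for column in zip(*sample)]
        null.append(r_squared(predictor, averages))
    return null


def empirical_p_value(observed_r2, null_r2):
    """Fraction of null R2 values at least as large as the observed one (upper tail, assumed)."""
    return sum(1 for value in null_r2 if value >= observed_r2) / len(null_r2)
```

A call such as `empirical_p_value(0.45, null)` would then place an observed R2 (the red line) against the histogram of resampled values.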
Fig. 3. P-values for average heterozygosity regressed on environment. P-values are calculated by comparing the R2 of the disease risk allele heterozygosities to the null distribution created using 10,000 resampled sets of SNPs matched for number of and global allele frequency of disease risk SNPs. * indicates significance after adjustment for an FDR of 0.2. + indicates significance after Bonferroni correction (see “Null Distributions” section in Methods).
Fig. 4. P-values for average risk allele frequency regressed on environment. P-values are calculated by comparing the R2 of the disease risk allele frequencies to a null distribution created using 10,000 resampled sets of SNPs matched for number of and global allele frequency of disease risk SNPs. * indicates significance after adjustment for an FDR of 0.2. + indicates significance after Bonferroni correction (see “Null Distributions” section in Methods).
Fig. 5. Enrichment of disease risk SNPs in the 0.05 empirical tail in Bayenv. Enrichment is indicated by color. Permutations were carried out to determine whether the number of SNPs with low p-values was more than expected given the total number of risk SNPs for each disease (see “Enrichment of SNPs with low Bayenv p-values” section in Methods). A star indicates significance at p < 0.05 after Bonferroni correction.
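One way to read the permutation test in this caption is sketched below. The exact procedure is given in the paper's Methods and is not reproduced here, so everything in this block is an assumption: the function name, the use of a background list of empirical p-values to draw from, and the one-sided comparison are all hypothetical choices made for illustration.

```python
import random


def enrichment_p_value(risk_pvals, background_pvals, tail=0.05, n_perm=10000, seed=0):
    """Permutation test for an excess of low p-values among risk SNPs.

    risk_pvals:       empirical p-values of the disease risk SNPs.
    background_pvals: hypothetical pool of empirical p-values to draw
                      random SNP sets from (an assumption of this sketch).
    Returns the observed count in the tail and the fraction of random
    draws with at least that many SNPs in the tail.
    """
    rng = random.Random(seed)
    set_size = len(risk_pvals)
    observed = sum(1 for p in risk_pvals if p <= tail)
    at_least_as_many = 0
    for _ in range(n_perm):
        draw = rng.sample(background_pvals, set_size)
        if sum(1 for p in draw if p <= tail) >= observed:
            at_least_as_many += 1
    return observed, at_least_as_many / n_perm
```

Because the random sets have the same size as the risk SNP set, the comparison accounts for the total number of risk SNPs per disease, which is the condition the caption describes.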