Efe Sezgin, Elif Kaplan.
Abstract
Behçet disease (BD) is a polygenic, multifactorial, multisystem inflammatory condition of unknown etiology. The global distribution of BD is geographically structured, with the highest prevalence observed among East Asian, Middle Eastern, and Mediterranean populations. Although adaptive selection on a few BD susceptibility loci has been speculated, a thorough evolutionary analysis of the genetic architecture of BD is lacking. We aimed to understand whether increased BD risk in human populations with high prevalence is due to past selection on BD associated genes. We performed population genetic analyses with East Asian (high BD prevalence), European (low/very low BD prevalence), and African (very low/no BD prevalence) populations. Comparison of ancestral and derived allele frequencies against their reported susceptible or protective effect on BD showed that both derived and ancestral alleles are associated with increased BD risk. Variants conferring higher risk of BD and showing more significant association with BD had smaller allele frequency differences and showed less population differentiation than variants with smaller risk and less significant association. These results suggest that BD alleles are not unique to East Asians but are also found in other world populations at appreciable frequencies, and they argue against selection favoring these variants only in populations with high BD prevalence. Analyses of BD associated genes showed similar evolutionary histories in all three populations studied, driven by neutral processes for many genes or by balancing selection for HLA (Human Leukocyte Antigen) genes. However, nucleotide diversity in several HLA region genes was much higher in East Asians, suggesting selection for high nucleotide and haplotype diversity in East Asians. A recent selective sweep for genes involved in antigen recognition, peptide processing, and regulation of immune and cellular differentiation was observed only in East Asians.
We conclude that the evolutionary processes shaping the genetic diversity in BD risk genes are diverse, and that elucidating the specific underlying selection mechanisms is complex. Several of the genes examined in this study (such as ERAP1, IL23R, HLA-G) are risk factors for other inflammatory diseases. Thus, our conclusions are not limited to BD but may have broader implications for other inflammatory diseases.
Keywords: ancestral allele; behcet disease; complex disease evolution; derived allele; population differentiation; population genetics; population genomics; selection
Year: 2022 PMID: 36246630 PMCID: PMC9561091 DOI: 10.3389/fgene.2022.983646
Source DB: PubMed Journal: Front Genet ISSN: 1664-8021 Impact factor: 4.772
TABLE 1 Comparison of ancestral and derived allele status versus population distribution of BD associated SNPs with respect to their effect on BD.
| Population | Susceptible, ancestral N (%) | Susceptible, derived N (%) | Protective, ancestral N (%) | Protective, derived N (%) | P |
|---|---|---|---|---|---|
| Chinese | 18 (31) | 14 (24) | 13 (22) | 14 (22) | 0.64 |
| Japanese | 35 (24) | 47 (33) | 20 (14) | 41 (29) | 0.23 |
| Korean | 5 (62) | 3 (38) | - | - | - |
| Turkish | 14 (35) | 24 (60) | 0 (0) | 2 (100) | 0.29 |
| Total | 72 (29) | 88 (35) | 33 (13) | 57 (23) | 0.32 |
| SNPs | 32 (26) | 53 (44) | 15 (12) | 21 (17) | 0.68 |
P: chi-square test result.
Total: pooling alleles and their effects from all reported studies.
SNPs: focusing only on reported alleles from studies with larger sample sizes and more significant BD association (reported p values less than 10⁻⁵). Populations represent study populations where the BD genetic association study was conducted and variants were discovered.
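As a rough illustration of the kind of chi-square comparison summarized in the P column, the sketch below tests whether allele status (ancestral or derived) is independent of reported effect (susceptible or protective), using the pooled "Total" counts from the table. The 2×2 layout and the absence of a continuity correction are assumptions made here for illustration; the exact test setup used by the authors is not described in this record, so the result need not match the reported P values. The function name `chi_square_2x2` is hypothetical.

```python
from math import erfc, sqrt

def chi_square_2x2(table):
    """Pearson chi-square test (no continuity correction) on a 2x2 table.

    `table` is ((a, b), (c, d)). Returns (chi2, p_value) with 1 degree
    of freedom. Assumed layout: rows = effect, columns = allele status.
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_sums = (a + b, c + d)
    col_sums = (a + c, b + d)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_sums[i] * col_sums[j] / n
            chi2 += (observed - expected) ** 2 / expected
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value

# Pooled counts from the "Total" row:
# susceptible (ancestral, derived), protective (ancestral, derived)
total_counts = ((72, 88), (33, 57))
chi2, p_value = chi_square_2x2(total_counts)
print(f"chi2 = {chi2:.3f}, p = {p_value:.3f}")
```

Under this assumed layout, a large p value would be read the same way as in the table: no evidence that susceptible alleles are preferentially ancestral or derived.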
FIGURE 1 Distribution of BD associated variants’ ancestral and derived allele status, and their effect on BD among the populations with highest BD prevalence. Populations represent study populations where the BD genetic association study was conducted and variants were discovered. Allele count details can be seen in Table 1.
FIGURE 2 (A) Separation of all 26 1000 Genomes populations along the top two principal components (Dim1 and Dim2), based on principal component analysis of a BD associated allele frequency matrix. Ellipses around populations indicate clusters formed by clustering analysis. (B) Population differentiation (Fst) estimates of BD associated SNPs between East Asian (EAS) and African (AFR) populations. Solid and dashed vertical lines show the genome-wide SNP Fst and the mean Fst of BD associated SNPs, respectively. (C) Fst estimates of BD associated SNPs between East Asian (EAS) and European (EUR) populations. Solid and dashed vertical lines show the genome-wide SNP Fst and the mean Fst of BD associated SNPs, respectively. (D) Distribution of BD associated SNPs’ EAS-AFR Fst values along human chromosomes. (E) Distribution of BD associated SNPs’ EAS-EUR Fst values along human chromosomes.
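For readers unfamiliar with Fst, the following minimal sketch shows how population differentiation at a single biallelic SNP can be computed from allele frequencies in two populations, using Wright's definition Fst = (H_T − H_S) / H_T with equal population weights. This is an assumption for illustration only: the estimator actually used for the figure is not stated in this record, and the function name `fst_biallelic` and the example frequencies are hypothetical.

```python
def fst_biallelic(p1, p2):
    """Wright's Fst for one biallelic SNP.

    p1, p2: frequency of the same allele in two populations
    (equal weights assumed). Returns 0.0 for a monomorphic site.
    """
    p_bar = (p1 + p2) / 2
    h_total = 2 * p_bar * (1 - p_bar)  # expected heterozygosity, pooled
    if h_total == 0:
        return 0.0
    # Mean expected heterozygosity within the two populations
    h_within = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    return (h_total - h_within) / h_total

# Hypothetical frequencies of one allele in two populations
example_fst = fst_biallelic(0.1, 0.5)
print(round(example_fst, 3))
```

Values near 0 indicate similar allele frequencies in the two populations, and values near 1 indicate nearly fixed differences.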
FIGURE 3 (A) Plot of BD associated variants’ allele frequency difference between African and East Asian populations versus the rank of the reported BD association test p-values. (B) Plot of BD associated variants’ allele frequency difference between European and East Asian populations versus the rank of the reported BD association test p-values. (C) Plot of population differentiation (Fst) between East Asian and European populations versus the rank of the reported BD association test p-values. (D) Plot of population differentiation (Fst) between East Asian and African populations versus the rank of the reported BD association test p-values.
Comparison of population genetic parameter estimates of 114 BD associated genes among 1000 Genomes African (AFR), East Asian (EAS), and European (EUR) populations.
| Parameter | AFR Median (25%, 75%) | EAS Median (25%, 75%) | EUR Median (25%, 75%) | P |
|---|---|---|---|---|
| Total sites | 903 (246, 3,695) | 903 (246, 3,695) | 903 (246, 3,695) | 0.99 |
| S | 143 (15, 549) | 113 (8, 400) | 105 (11, 375) | 0.05 |
| Eta | 553 (144, 1,789) | 400 (114, 1,074) | 375 (106, 1,006) | 0.05 |
| Hap | 482 (160, 1,056)a | 313 (96, 877)b | 278 (78, 819)b | 0.004 |
| Hd | 0.99 (0.95, 0.99) | 0.97 (0.86, 0.99) | 0.97 (0.88, 0.92) | 0.08 |
| π | 1.1 (0.8, 1.5)a | 0.8 (0.6, 1.2)b | 0.8 (0.6, 1.3)b | 0.0002 |
| θK | 43.7 (14.7, 131.1) | 35.3 (9.4, 107.2) | 35.8 (11.1, 100.5) | 0.37 |
| θW | 71.3 (18.5, 230.4) | 53.4 (15.3, 143.4) | 50.1 (14.1, 134.3) | 0.08 |
| | | | | |
| Tajima’s | −1.2 (−1.5, −0.9)a | −0.8 (−1.3, −0.3)b | −0.7 (−1.2, −0.1)b | <0.001 |
| Fu and Li’s | −6.5 (−7.9, −4.3)a | −8.5 (−10.4, −4.0)b | −7.4 (−9.2, −4.0)a | 0.0008 |
| Fu and Li’s | −3.7 (−4.5, −2.6)a | −4.5 (−5.6, −2.6)b | −3.9 (−4.8, −2.8)a | 0.001 |
| Fu’s | −31.4 (−34.4, −30.2) | −30.9 (−32.0, −9.6) | −31.0 (−32.2, −7.9) | 0.06 |
| Achaz’s | −0.7 (−1.2, −0.3)a | 0.1 (−0.5, 0.7)b | 0.2 (−0.4, 0.9)b | <0.001 |
| | 0.04 (0.03, 0.05)a | 0.05 (0.04, 0.06)b | 0.05 (0.04, 0.06)b | 0.001 |
| | 0.02 (0.01, 0.05)a | 0.04 (0.02, 0.07)b | 0.04 (0.02, 0.07)b | 0.0002 |
| | | | | |
| AFR - Fst | - | 0.12 (0.07, 0.16) | 0.09 (0.06, 0.14) | 0.03 |
| AFR - Dxy | - | 0.001 (0.0008, 0.002) | 0.001 (0.0008, 0.002) | 0.90 |
| AFR - Hst | - | 0.009 (0.0003, 0.03) | 0.008 (0.0004, 0.02) | 0.31 |
| | | | | |
| | 0.006 (0.004, 0.01) | 0.007 (0.005, 0.02) | 0.07 | |
| | 0.76 (0.64, 0.88) | 0.82 (0.71, 0.90) | 0.02 | |
Medians and distributions of 114 BD associated genes are compared by the non-parametric Kruskal–Wallis one-way ANOVA, followed by non-parametric Wilcoxon pairwise tests. Lowercase letters ‘a’ and ‘b’ mark significantly different pairwise comparisons between the three populations. S: segregating sites, Eta: total number of mutations, Hap: total number of haplotypes, Hd: haplotype diversity, π (pi): nucleotide diversity, θK (ThetaK): average number of nucleotide differences, θW (ThetaW): Watterson’s theta.
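To make several of the abbreviations above concrete, the sketch below computes the number of segregating sites (S), the average number of pairwise nucleotide differences, Watterson's theta, and Tajima's D from a small set of aligned haplotypes, using the standard formulas from Tajima (1989). The toy sequences and the function name `diversity_stats` are hypothetical, the values are per sequence rather than per site, and nothing here is claimed to reproduce the software or scaling used for the table.

```python
from itertools import combinations
from math import sqrt

def diversity_stats(haplotypes):
    """Return (S, k, theta_w, tajima_d) for equal-length aligned haplotypes.

    S: segregating sites; k: mean pairwise differences (per sequence);
    theta_w: Watterson's theta (per sequence); tajima_d: Tajima's D.
    """
    n = len(haplotypes)
    length = len(haplotypes[0])
    # A site is segregating if more than one allele is observed there.
    s = sum(1 for i in range(length) if len({h[i] for h in haplotypes}) > 1)
    pairs = list(combinations(haplotypes, 2))
    k = sum(sum(x != y for x, y in zip(a, b)) for a, b in pairs) / len(pairs)
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    theta_w = s / a1
    if s == 0:
        return s, k, theta_w, float("nan")
    # Variance constants from Tajima (1989)
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    tajima_d = (k - theta_w) / sqrt(e1 * s + e2 * s * (s - 1))
    return s, k, theta_w, tajima_d

# Toy alignment of four haplotypes (hypothetical data)
toy = ["ACGT", "ACGA", "ACTA", "GCTA"]
s, k, theta_w, tajima_d = diversity_stats(toy)
print(s, round(k, 3), round(theta_w, 3), round(tajima_d, 3))
```

Negative Tajima's D values, as reported in the table, arise when the mean pairwise difference is smaller than Watterson's theta, that is, when there is an excess of low-frequency variants.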
FIGURE 4 Population genetic parameter and selection estimates superimposed onto the protein-protein interaction network of BD associated genes with population genetic parameter and selection estimates unique to East Asians. Only primary interactions based on functional and physical protein associations from curated databases and experimentally determined sources are presented. Line thickness of the edges indicates the strength of data support. High and low Fst represent population differentiation of East Asians with respect to Africans.