Dov Tiosano, Laura Audi, Sharlee Climer, Weixiong Zhang, Alan R Templeton, Monica Fernández-Cancio, Ruth Gershoni-Baruch, José Miguel Sánchez-Muro, Mohamed El Kholy, Zèev Hochberg.
Abstract
The well-documented latitudinal clines of genes affecting human skin color presumably arise from the need for protection from intense ultraviolet radiation (UVR) vs. the need to use UVR for vitamin D synthesis. Sampling 751 subjects from a broad range of latitudes and skin colors, we investigated possible multilocus correlated adaptation of skin color genes with the vitamin D receptor gene (VDR), using a vector correlation metric and network method called BlocBuster. We discovered two multilocus networks involving VDR promoter and skin color genes that display strong latitudinal clines as multilocus networks, even though many of their single gene components do not. Considered one by one, the VDR components of these networks show diverse patterns: no cline, a weak declining latitudinal cline outside of Africa, and a strong in- vs. out-of-Africa frequency pattern. We confirmed these results with independent data from HapMap. Standard linkage disequilibrium analyses did not detect these networks. We applied BlocBuster across the entire genome, showing that our networks are significant outliers for interchromosomal disequilibrium that overlap with environmental variation relevant to the genes' functions. These results suggest that these multilocus correlations most likely arose from a combination of parallel selective responses to a common environmental variable and coadaptation, given the known Mendelian epistasis among VDR and the skin color genes.
Keywords: adaptation; epistasis; linkage disequilibrium; network analysis; skin color; vitamin D
Year: 2016 PMID: 26921301 PMCID: PMC4856077 DOI: 10.1534/g3.115.026773
Source DB: PubMed Journal: G3 (Bethesda) ISSN: 2160-1836 Impact factor: 3.154
Ethnic distribution of the study subjects
| Site of Sample Origin | Ethnic Origin | Number | Approximate Latitude of Ethnic Origin (°) |
|---|---|---|---|
| Haifa, Israel | Western European ancestry Jews | 163 | 50 |
| Haifa, Israel | Ethiopian Jews | 9 | 8 |
| Haifa, Israel | Indian Jews - out of Cochin | 20 | 25 |
| Haifa, Israel | Yemenite Jews | 79 | 16 |
| Haifa, Israel | Arab Christians | 70 | 32 |
| Haifa, Israel | Arab Muslims | 60 | 28 |
| Salt, Spain | Spanish Caucasians | 83 | 40 |
| Salt, Spain | North Africans from Maghreb | 85 | 34 |
| Salt, Spain | Sub-Saharan from different countries | 98 | 8 |
| Salt, Spain | South Americans (Amerindian) | 20 | 15 |
| Salt, Spain | India | 14 | 25 |
| Cairo, Egypt | Egyptians | 50 | 30 |
The HapMap populations used in this study
| Population | Description | Number of Individuals |
|---|---|---|
| CEU | Utah residents with Northern and Western European ancestry from the CEPH collection | 116 |
| YRI | Yoruba in Ibadan, Nigeria | 119 |
| JPT | Japanese in Tokyo, Japan | 116 |
| CHB | Han Chinese in Beijing, China | 139 |
| GIH | Gujarati Indians in Houston, Texas | 101 |
| TSI | Toscani in Italia | 102 |
| LWK | Luhya in Webuye, Kenya | 110 |
| MKK | Maasai in Kinyawa, Kenya | 143 |
Figure 1. Allelic networks identified by BlocBuster. Each dot represents an allelic node, identified by its SNP and nucleotide state. Out of the total of 128 nodes in the data set, only the nodes that had an edge connecting them to another node are depicted. The 52 edges connecting nodes represent Custom Correlation Coefficient (CCC) values ≥ 0.65. These 52 edges defined seven discrete networks of alleles, labeled 65_1 through 65_7.
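The grouping step described in the Figure 1 caption (keep only edges at or above a threshold, then read off the discrete networks) can be sketched as connected components of a thresholded graph. This is a minimal illustration, not BlocBuster itself: the function name `networks_from_scores`, the node labels, and the scores are all hypothetical, and computing the CCC values is outside the scope of the sketch.

```python
def networks_from_scores(scores, threshold=0.65):
    """Group nodes into networks: connected components of the thresholded graph.

    scores maps a pair of node labels to a pairwise score (hypothetical input).
    Nodes that never take part in an edge are left out, as in the figure.
    """
    parent = {}

    def find(node):
        # Union-find lookup with path halving.
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    for (a, b), score in scores.items():
        if score >= threshold:  # an edge exists only at or above the threshold
            parent[find(a)] = find(b)

    groups = {}
    for node in list(parent):
        groups.setdefault(find(node), set()).add(node)
    return sorted((sorted(g) for g in groups.values()), key=lambda g: g[0])


# Made-up scores between allelic nodes, for illustration only.
example_scores = {
    ("snpA_C", "snpB_T"): 0.81,
    ("snpB_T", "snpC_G"): 0.70,
    ("snpC_G", "snpD_A"): 0.40,  # below threshold: no edge
    ("snpD_A", "snpE_T"): 0.66,
    ("snpF_C", "snpG_G"): 0.10,  # neither node gets an edge, so both are omitted
}
networks = networks_from_scores(example_scores)
print(networks)
```

Raising the threshold in such a scheme can only remove edges, so stricter thresholds split networks into smaller components.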
Networks identified by BlocBuster using a threshold of CCC ≥ 0.65 on the samples described in Table 1 and the SNPs described in File S1
| Network | Gene | Network ID | SNP | Allele | Chr | Gene Location |
|---|---|---|---|---|---|---|
| 65_1 | | 70_1 | rs2733832 | C | 9 | Intron 5 |
| 65_1 | | 70_1 | rs1408799 | T | 9 | 5′UTR |
| 65_1 | | 70_5 | rs4760658 | T | 12 | Promoter (Intron 1a) |
| 65_1 | | 70_5 | rs11168293 | C | 12 | Promoter (Intron 1a) |
| 65_1 | | 70_5 | rs2853564 | T | 12 | Promoter (Intron 1a) |
| 65_1 | | 70_5 | rs1989969 | C | 12 | Promoter (Intron 1a) |
| 65_1 | | n/a | rs1042602 | C | 11 | Ser192Tyr |
| 65_2 | | 70_4 | rs11568820 | G | 12 | Promoter (Cdx2) |
| 65_2 | | 70_4 | rs10875695 | G | 12 | Promoter (Intron 1a) |
| 65_2 | | 70_4 | rs7302235 | A | 12 | Promoter (Intron 1a) |
| 65_2 | | n/a | rs4073729 | C | 12 | Promoter |
| 65_2 | | n/a | rs2228478 | A | 16 | Thr314Thr |
| 65_2 | | n/a | rs3212369 | A | 16 | 3′UTR |
| 65_2 | | n/a | rs1426654 | A | 15 | Thr111Ala |
| 65_2 | | n/a | rs16891982 | G | 5 | Phe374Leu |
| 65_3 | | 70_3 | rs3212357 | T | 16 | 5′UTR |
| 65_3 | | 70_3 | rs3212359 | C | 16 | 5′UTR |
| 65_3 | | n/a | rs3212363 | A | 16 | 5′UTR |
| 65_4 | | n/a | rs4237856 | T | 12 | Promoter |
| 65_4 | | n/a | rs10783219 | A | 12 | Promoter (Intron 1a) |
| 65_4 | | n/a | rs2238136 | G | 12 | Promoter (Intron 1a) |
| 65_5 | | 70_6 | rs2248098 | T | 12 | Intron 3 |
| 65_5 | | 70_6 | rs1544410 | G | 12 | Intron 8 |
| 65_5 | | 70_6 | rs731236 | T | 12 | Exon 9 |
| 65_5 | | 70_6 | rs3782905 | C | 12 | Intron 2 |
| 65_5 | | 70_6 | rs2238138 | C | 12 | Intron 2 |
| 65_5 | | 70_8 | rs739837 | C | 12 | 3′UTR |
| 65_5 | | 70_8 | rs7975232 | G | 12 | Intron 8 |
| 65_6 | | 70_7 | rs7305032 | T | 12 | Intron 5 |
| 65_6 | | 70_7 | rs7975232 | T | 12 | Intron 8 |
| 65_6 | | 70_7 | rs2525044 | C | 12 | Intron 6 |
| 65_6 | | 70_7 | rs739837 | A | 12 | 3′UTR |
| 65_6 | | n/a | rs2248098 | C | 12 | Intron 3 |
| 65_7 | | 70_2 | rs3212357 | C | 16 | 5′UTR |
| 65_7 | | 70_2 | rs3212359 | T | 16 | 5′UTR |
Components at a threshold of CCC ≥ 0.70 are indicated by 70_x in the Network ID column. SNP, single nucleotide polymorphism; Chr, chromosome; UTR, untranslated region; Ser, serine; Tyr, tyrosine; Thr, threonine; Ala, alanine; Phe, phenylalanine; Leu, leucine.
Figure 2. Significant latitudinal clines for haplotype 65_3 identified by BlocBuster in the skin color gene MC1R. Haplotype 65_3 shows a significant nonlinear regression with latitude due to the low frequency of this haplotype in sub-Saharan Africa, but exclusion of the sub-Saharan population still results in a significant linear regression with latitude, as shown by the thick straight line.
The least-squares regressions of frequencies of individuals bearing a specified network or SNP vs. latitude for the populations indicated in Table 1 from the frequency and sample size data given in Table 4
| Network/SNP | Genes | Intercept | Slope | Quadratic Term | R² |
|---|---|---|---|---|---|
| 65_1 | | 0.752 | −0.0074** | n.s. | 0.70 |
| 70_1 | | 0.977 | −0.0082*** | n.s. | 0.76 |
| rs1042602 | | 0.890 | 0.0065 | n.s. | 0.56 |
| 70_5 | | 0.848 | n.s. | −0.0002 | 0.53 |
| 70_5–Af | | 0.920 | −0.0027** | n.s. | 0.68 |
| 65_2 | | −0.066 | 0.0200*** | n.s. | 0.84 |
| 70_4 | | 0.608 | 0.0104 | −0.0007 | 0.73 |
| 70_4–Af | | 0.858 | n.s. | n.s. | 0.15 |
| rs4073729 | | 0.858 | n.s. | −0.0002 | 0.50 |
| rs4073729–Af | | 0.942 | −0.0023 | n.s. | 0.61 |
| rs2228478 | | 0.741 | 0.0066** | −0.0005** | 0.85 |
| rs2228478–Af | | 0.877 | n.s. | n.s. | 0.21 |
| rs3212369 | | 0.575 | 0.0063** | n.s. | 0.59 |
| rs1426654 | | 0.579 | 0.0130** | −0.0007 | 0.80 |
| rs1426654–Af | | 0.810 | n.s. | n.s. | 0.24 |
| rs16891982 | | −0.136 | 0.0209*** | n.s. | 0.86 |
| 65_3 | | 0.363 | 0.0125*** | −0.0005** | 0.91 |
| 65_3–Af | | 0.511 | 0.0064** | n.s. | 0.69 |
| 65_4 | | 0.983 | n.s. | n.s. | 0.29 |
| 65_5 | | 0.462 | n.s. | n.s. | 0.36 |
| 65_6 | | 0.671 | n.s. | n.s. | 0.00 |
| 65_7 | | 0.728 | n.s. | n.s. | 0.06 |
Significance: * P ≤ 0.05, ** P ≤ 0.01, *** P ≤ 0.001. SNP, single nucleotide polymorphism; n.s., not significant; –Af, the sub-Saharan African populations were excluded from the regression analysis.
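To make the regression coefficients concrete, the sketch below evaluates the two purely linear network fits (65_1 and 65_2) at the lowest and highest latitudes listed in the table of study subjects (8° and 50°). The dictionary `FITS` and the function `predicted_frequency` are illustrative names, not part of the study; only the intercepts and slopes are taken from the table above. Fits with a quadratic term are left out because the sketch only handles the linear form.

```python
# Intercepts and slopes copied from the regression table (linear fits only).
FITS = {
    "65_1": {"intercept": 0.752, "slope": -0.0074},
    "65_2": {"intercept": -0.066, "slope": 0.0200},
}


def predicted_frequency(network, latitude_deg):
    """Fitted frequency of individuals bearing a network at a given latitude."""
    fit = FITS[network]
    return fit["intercept"] + fit["slope"] * latitude_deg


for network in FITS:
    for latitude in (8, 50):
        value = predicted_frequency(network, latitude)
        print(f"{network} at {latitude} degrees: {value:.3f}")
```

For comparison, the frequency table reports 0.667 (Sub-Saharan) and 0.413 (W. Eur. ancestry) for 65_1, and 0.021 and 0.886 for 65_2. The fitted values follow the same opposing trends: 65_1 declines with latitude while 65_2 rises.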
The frequency of individuals possessing the SNP allele networks identified with the 0.65 threshold in the 10 sampled populations
| Population | 65_1 | 65_2 | 65_3 | 65_4 | 65_5 | 65_6 | 65_7 |
|---|---|---|---|---|---|---|---|
| Sub-Saharan | 0.667 (99) | 0.021 (95) | 0.142 (106) | 1.000 (78) | 0.420 (69) | 0.620 (71) | 0.877 (106) |
| Yemenite | 0.662 (71) | 0.278 (72) | 0.595 (74) | 0.897 (68) | 0.629 (35) | 0.721 (68) | 0.892 (74) |
| Egypt | 0.571 (49) | 0.273 (44) | 0.723 (47) | 0.949 (39) | 0.571 (35) | 0.703 (37) | 0.729 (48) |
| Arab Muslim | 0.510 (51) | 0.552 (58) | 0.704 (54) | 0.895 (57) | 0.500 (42) | 0.800 (50) | 0.727 (55) |
| Arab Christian | 0.453 (64) | 0.817 (60) | 0.786 (70) | 0.966 (59) | 0.667 (48) | 0.709 (55) | 0.600 (70) |
| Maghreb | 0.474 (78) | 0.590 (78) | 0.662 (80) | 0.966 (58) | 0.527 (55) | 0.607 (56) | 0.612 (80) |
| Spain | 0.375 (72) | 0.805 (77) | 0.836 (73) | 0.870 (54) | 0.706 (51) | 0.722 (54) | 0.554 (74) |
| W. Eur. ancestry | 0.413 (143) | 0.886 (149) | 0.800 (145) | 0.873 (150) | 0.687 (131) | 0.655 (142) | 0.574 (148) |
| India | 0.828 (29) | 0.344 (32) | 0.667 (33) | 0.793 (29) | 0.792 (24) | 0.538 (26) | 0.758 (33) |
| Amerindian | 0.667 (18) | 0.529 (17) | 0.526 (19) | 1.000 (13) | 0.818 (11) | 0.600 (10) | 0.750 (20) |
Numbers in parentheses are the sample sizes, which vary across networks due to missing SNP (single nucleotide polymorphism) genotype data.
Figure 3. A plot of the frequencies of network 65_1 and its single-gene components vs. latitude. The components are the extended haplotype 70_1 in TYRP1, the extended promoter haplotype 70_5 in VDR, and a single nucleotide polymorphism in TYR. The line shows only the regression for the multilocus network 65_1.
Figure 4. A plot of the frequencies of network 65_2 and its single-gene components vs. latitude. The line shows only the regression for the multilocus network 65_2. SNP, single nucleotide polymorphism.
Figure 5. Frequencies of 65_2R4, 65_2, and 84_2 vs. latitude. (A) Frequency of individuals with 65_2R4 vs. latitude in 10 populations surveyed here and in eight HapMap populations. The line indicates the linear regression when the Han Chinese, Japanese, and sub-Saharan populations are excluded. (B) Frequency of individuals with 65_2 vs. latitude in 10 populations surveyed here and with 84_2 in eight HapMap populations. The line indicates the linear regression when the Han Chinese and Japanese populations are excluded.
The least-squares regressions of frequencies of individuals bearing the reduced networks 70_4R1 and 65_2R4 in our data set and in the HapMap data set
| Network/SNP | Dataset | Genes | Intercept | Slope | Quadratic Term | R² |
|---|---|---|---|---|---|---|
| 70_4R1 | Ours | | 0.600 | 0.0106 | −0.0007 | 0.74 |
| 70_4R1 – Af | Ours | | 0.854 | n.s. | n.s. | 0.18 |
| 70_4R1 | HapMap | | 0.215 | 0.0148** | n.s. | 0.76 |
| 70_4R1 – Af | HapMap | | 0.633 | n.s. | n.s. | 0.51 |
| 65_2R4 | Ours | | 0.484 | 0.0129** | −0.0007** | 0.85 |
| 65_2R4 – Af | Ours | | 0.698 | 0.0042 | n.s. | 0.54 |
| 65_2R4 | HapMap | | 0.021 | n.s. | n.s. | 0.48 |
| 65_2R4 – As | HapMap | | 0.046 | 0.0183** | n.s. | 0.90 |
| 65_2R4 – As | All | | 0.271 | 0.0189*** | −0.0005** | 0.89 |
| 65_2R4 – Af – As | All | | 0.693 | 0.0044** | n.s. | 0.65 |
Significance: * P ≤ 0.05, ** P ≤ 0.01, *** P ≤ 0.001. SNP, single nucleotide polymorphism; n.s., not significant; – Af, the sub-Saharan populations have been excluded; – As, the Chinese and Japanese populations have been excluded.
Pairs of SNPs (SNP1 and SNP2) with significant linkage disequilibrium (r² or D′, with “*” indicating significant and “n.s.” not significant) at the 5% level after correction for multiple testing
| SNP1 | Gene 1 | Network ID 1 | SNP2 | Gene 2 | Network ID 2 | r² | D′ |
|---|---|---|---|---|---|---|---|
| rs1408799 | | 65_1 | rs2733832 | | 65_1 | * | * |
| rs3212357 | | 65_3 & 65_7 | rs3212359 | | 65_3 & 65_7 | * | n.s. |
| rs3212357 | | 65_3 & 65_7 | rs3212363 | | 65_3 | * | n.s. |
| rs3212359 | | 65_3 & 65_7 | rs3212363 | | 65_3 | * | * |
| rs2228478 | | 65_2 | rs3212369 | | 65_2 | * | * |
| rs2228478 | | 65_2 | rs3212371 | | None | * | * |
| rs11568820 | | 65_2 | rs10875695 | | 65_2 | * | * |
| rs11568820 | | 65_2 | rs7302235 | | 65_2 | * | * |
| rs10875695 | | 65_2 | rs7302235 | | 65_2 | * | * |
| rs4760658 | | 65_1 | rs11168293 | | 65_1 | * | * |
| rs4760658 | | 65_1 | rs2853564 | | 65_1 | * | * |
| rs11168293 | | 65_1 | rs2853564 | | 65_1 | * | * |
| rs2853564 | | 65_1 | rs1989969 | | 65_1 | * | * |
| rs3782905 | | 65_5 | rs2238138 | | 65_5 | * | * |
| rs2248098 | | 65_5 & 65_6 | rs7305032 | | 65_6 | * | n.s. |
| rs2248098 | | 65_5 & 65_6 | rs2525044 | | 65_6 | * | n.s. |
| rs2248098 | | 65_5 & 65_6 | rs1544410 | | 65_5 | * | * |
| rs2248098 | | 65_5 & 65_6 | rs7975232 | | 65_5 & 65_6 | * | n.s. |
| rs2248098 | | 65_5 & 65_6 | rs731236 | | 65_5 | * | * |
| rs2248098 | | 65_5 & 65_6 | rs739837 | | 65_5 & 65_6 | * | n.s. |
| rs7305032 | | 65_6 | rs2525044 | | 65_6 | * | * |
| rs7305032 | | 65_6 | rs7975232 | | 65_5 & 65_6 | * | * |
| rs7305032 | | 65_6 | rs739837 | | 65_5 & 65_6 | * | * |
| rs2525044 | | 65_6 | rs7975232 | | 65_5 & 65_6 | * | * |
| rs2525044 | | 65_6 | rs739837 | | 65_5 & 65_6 | * | * |
| rs1544410 | | 65_5 | rs7975232 | | 65_5 & 65_6 | * | n.s. |
| rs1544410 | | 65_5 | rs731236 | | 65_5 | * | * |
| rs1544410 | | 65_5 | rs739837 | | 65_5 & 65_6 | * | n.s. |
| rs7975232 | | 65_5 & 65_6 | rs731236 | | 65_5 | * | n.s. |
| rs7975232 | | 65_5 & 65_6 | rs739837 | | 65_5 & 65_6 | * | * |
| rs731236 | | 65_5 | rs739837 | | 65_5 & 65_6 | * | n.s. |
| rs11574114 | | None | rs2853563 | | None | * | * |
If a SNP involved with significant r² or D′ values is also associated with a significant CCC value, the BlocBuster network ID is given that contains an allele from that SNP. SNP, single nucleotide polymorphism; n.s., not significant.
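For readers unfamiliar with the two pairwise statistics in this table, the sketch below computes them from the standard textbook definitions for two biallelic loci: D = p_AB − p_A·p_B, D′ = |D| / D_max, and r² = D² / (p_A(1 − p_A)p_B(1 − p_B)). The function name `ld_stats` and the input frequencies are illustrative assumptions; the study's own software, significance testing, and multiple-testing correction are not reproduced here.

```python
def ld_stats(p_ab, p_a, p_b):
    """Return (D, |D'|, r^2) for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A at locus 1 and B at locus 2
    p_a, p_b: frequencies of allele A and allele B
    """
    d = p_ab - p_a * p_b
    # D_max depends on the sign of D (standard normalization for D').
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r2


# Made-up frequencies: allele B only ever occurs on haplotypes carrying A.
d, d_prime, r2 = ld_stats(p_ab=0.2, p_a=0.5, p_b=0.2)
print(f"D = {d:.3f}, |D'| = {d_prime:.3f}, r^2 = {r2:.3f}")
```

In this made-up case |D′| reaches its maximum while r² stays modest, which illustrates that the two measures capture different aspects of association and need not agree, as the mixed "*" and "n.s." entries in the table also show.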
Networks identified by BlocBuster using a threshold of CCC ≥ 0.84 on the HapMap samples using 571 SNPs
| Network | Gene | SNP ID | Allele | Chr | Position | Gene Location |
|---|---|---|---|---|---|---|
| 84_1 | | rs183671* | T | 5 | 33964210 | Intron 2 |
| 84_1 | | rs28777 | C | 5 | 33958959 | Intron 3 |
| 84_1 | | rs35389 | G | 5 | 33954880 | Intron 3 |
| 84_1 | | rs16891982* | C | 5 | 33951693 | Exon 5 (Phe374Leu) |
| 84_1 | | rs35397* | G | 5 | 33951116 | Intron 5 |
| 84_1 | | rs35395* | T | 5 | 33948589 | Intron 5 |
| 84_1 | | rs35407 | A | 5 | 33946571 | 3′UTR |
| 84_2 | | rs183671* | G | 5 | 33964210 | Intron 2 |
| 84_2 | | rs28777 | A | 5 | 33958959 | Intron 3 |
| 84_2 | | rs35389 | A | 5 | 33954880 | Intron 3 |
| 84_2 | | rs16891982* | G | 5 | 33951693 | Exon 5 (Phe374Leu) |
| 84_2 | | rs35397* | T | 5 | 33951116 | Intron 5 |
| 84_2 | | rs35395* | C | 5 | 33948589 | Intron 5 |
| 84_2 | | rs35407 | G | 5 | 33946571 | 3′UTR |
| 84_2 | | rs12440301 | G | 15 | 48389924 | 5′UTR |
| 84_2 | | rs12441154* | C | 15 | 48390956 | 5′UTR |
| 84_2 | | rs1834640 | A | 15 | 48392165 | 5′UTR |
| 84_2 | | rs2675345 | A | 15 | 48400199 | 5′UTR |
| 84_2 | | rs1426654 | A | 15 | 48426484 | Exon 3 (Thr111Ala) |
| 84_3 | | rs35391 | C | 5 | 33955673 | Intron 3 |
| 84_3 | | rs35390* | A | 5 | 33955326 | Intron 3 |
| 84_3 | | rs250417* | C | 5 | 33952378 | Intron 4 |
| 84_4 | | rs987849* | A | 12 | 48254676 | Intron 3 |
| 84_4 | | rs11168268 | A | 12 | 48251812 | Intron 3 |
| 84_5 | | rs4760655* | A | 12 | 48294131 | Promoter (intron 1a) |
| 84_5 | | rs10783219* | A | 12 | 46581755 | Promoter (intron 1a) |
| 84_5 | | rs7132324 | C | 12 | 46593576 | Promoter |
| 84_6 | | rs1559857 | G | 15 | 48396808 | 5′UTR |
| 84_6 | | rs2675346 | C | 15 | 48411821 | 5′UTR |
| 84_6 | | rs2433354 | C | 15 | 48414969 | Intron 2 |
| 84_6 | | rs2675347 | A | 15 | 48418645 | Intron 2 |
| 84_6 | | rs2555364* | G | 15 | 48419386 | Intron 2 |
| 84_6 | | rs2675348 | A | 15 | 48420744 | Intron 2 |
| 84_7 | | rs2675345 | G | 15 | 48400199 | 5′UTR |
| 84_7 | | rs2470102 | G | 15 | 48433494 | Intron 8 |
SNP positions are indicated according to Genome Build 37.1. SNPs with an asterisk were excluded from the reduced networks used to test for latitudinal associations. SNP, single nucleotide polymorphism; Chr, chromosome; Phe, phenylalanine; Leu, leucine; UTR, untranslated region; Thr, threonine; Ala, alanine.
The least-squares regressions of frequencies of individuals bearing the 84_2R network in the HapMap data set, and 84_2R plus 65_2 in the combined data set
| Network | Genes | Intercept | Slope | Quadratic Term | R² |
|---|---|---|---|---|---|
| 84_2R | | 0.083 | n.s. | n.s. | 0.20 |
| 84_2R – As | | 0.092 | 0.0037** | n.s. | 0.87 |
| 84_2R – As + 65_2 | | 0.042 | 0.0180*** | n.s. | 0.84 |
Significance: ** P ≤ 0.01, *** P ≤ 0.001. n.s., not significant; – As, the Han Chinese and Japanese populations have been excluded.