
Genome-wide association meta-analysis of individuals of European ancestry identifies new loci explaining a substantial fraction of hair color variation and heritability.

Pirro G Hysi, Ana M Valdes, Fan Liu, Nicholas A Furlotte, David M Evans, Veronique Bataille, Alessia Visconti, Gibran Hemani, George McMahon, Susan M Ring, George Davey Smith, David L Duffy, Gu Zhu, Scott D Gordon, Sarah E Medland, Bochao D Lin, Gonneke Willemsen, Jouke Jan Hottenga, Dragana Vuckovic, Giorgia Girotto, Ilaria Gandin, Cinzia Sala, Maria Pina Concas, Marco Brumat, Paolo Gasparini, Daniela Toniolo, Massimiliano Cocca, Antonietta Robino, Seyhan Yazar, Alex W Hewitt, Yan Chen, Changqing Zeng, Andre G Uitterlinden, M Arfan Ikram, Merel A Hamer, Cornelia M van Duijn, Tamar Nijsten, David A Mackey, Mario Falchi, Dorret I Boomsma, Nicholas G Martin, David A Hinds, Manfred Kayser, Timothy D Spector.

Abstract

Hair color is one of the most recognizable visual traits in European populations and is under strong genetic control. Here we report the results of a genome-wide association study meta-analysis of almost 300,000 participants of European descent. We identified 123 autosomal and one X-chromosome loci significantly associated with hair color; all but 13 are novel. Collectively, single-nucleotide polymorphisms associated with hair color within these loci explain 34.6% of red hair, 24.8% of blond hair, and 26.1% of black hair heritability in the study populations. These results confirm the polygenic nature of complex phenotypes and improve our understanding of melanin pigment metabolism in humans.

Year:  2018        PMID: 29662168      PMCID: PMC5935237          DOI: 10.1038/s41588-018-0100-5

Source DB:  PubMed          Journal:  Nat Genet        ISSN: 1061-4036            Impact factor:   38.330


Human pigmentation refers to coloration of external tissues due to variations in quantity, ratio and distribution of the two main types of the pigment melanin: eumelanin and pheomelanin1. Most melanin is produced by melanosomes2,3, large organelles specialized in melanin synthesis and transportation located mainly in the epidermis, hair and iris as well as the central nervous system. Early humans had darkly pigmented skin4,5, which protected against high ultraviolet radiation (UVR) and its consequences such as skin cancer6 and folate depletion7. European and Asian populations evolved lighter skin pigmentation8,9 as they migrated towards northern latitudes with lower UVR4. The lighter pigmentation maximizes UVR absorption needed to maintain adequate vitamin D levels. In Europeans, pigmentation of skin, hair and/or eyes has characteristic geographic distributions because of natural selection10 and perhaps genetic drift; a role for sexual selection has been debated 11–13.
Hair color is one of the most prominent traits in humans. Twin studies suggest that up to 97% of variation in hair color may be explained by heritable factors14, and genome-wide association studies (GWAS) 15–20 have identified several chromosomal regions associated with hair color and related pigmentation traits21. Except for red hair, known variants have a relatively low predictive value22 and the heritability gap remains relatively large14, which suggests that many hair color genes remain undiscovered.
Here we report the results of a meta-analysis of two GWAS carried out in two large discovery cohort studies: 157,653 research participants from the 23andMe, Inc. customer base18 and 133,238 individuals from the UK Biobank (UKBB). Participants in both studies self-reported the natural color of their hair in adulthood (Supplementary Figure 1 and Supplementary Methods).
For the purpose of this work, each hair color category collected (black, dark brown, light brown, red and blond) was assigned a numerical value ranging from lowest (blond) to highest (black). These codes were used as the outcome variable in linear regression-based GWAS analyses. To minimize population admixture and stratification, the analyses were restricted to individuals of European ancestry (Supplementary Figures 2 and 3) and adjusted for the first ten principal components (PCs) of the genotype matrix, as well as age and sex. The analyses confirmed a strong association between hair color and PCs, especially in the less ethnically homogeneous 23andMe dataset, which includes participants of more varied European origin, in line with the known North-South cline in hair color variation and other regional differences in hair color across Europe12 (Supplementary Table 1). The strongest associations in both groups were with sex (Table 1). Women were more likely than men to report blond (OR=1.20 and OR=1.29 in the 23andMe and UKBB participants, respectively) or red hair (OR=1.72 and OR=1.40, respectively) than any other color, and three to five times less likely to report black hair (OR=0.35 and OR=0.20, respectively).
Table 1

Effect of sex on the hair color phenotypes in the 23andMe (N=157,653 independent participants) and UK Biobank (N=133,238 independent participants) cohorts

23andMe        Odds Ratio   Standard Error   95% CI lower   95% CI upper
Blond (all)    1.202        0.024            1.174          1.230
Red            1.721        0.014            1.675          1.768
Light Brown    1.116        0.013            1.088          1.145
Dark Brown     0.663        0.011            0.650          0.677
Black          0.348        0.030            0.329          0.369
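As a reading aid for Table 1, the sketch below shows how a 95% confidence interval is conventionally derived from an odds ratio. It assumes that the reported standard error is on the log-odds scale and that a normal approximation (z=1.96) was used; the source states neither, and the function name is illustrative. Under those assumptions the Red row is reproduced to within rounding.

```python
import math


def odds_ratio_ci(odds_ratio, se_log_or, z=1.96):
    """Return (lower, upper) bounds of a Wald confidence interval.

    Assumption (not stated in the source): se_log_or is the standard
    error of log(odds_ratio), and the interval is exp(log OR +/- z * SE).
    """
    log_or = math.log(odds_ratio)
    lower = math.exp(log_or - z * se_log_or)
    upper = math.exp(log_or + z * se_log_or)
    return lower, upper


# Red hair row of Table 1 (23andMe): OR=1.721, SE=0.014
lower, upper = odds_ratio_ci(1.721, 0.014)
print(f"95% CI: {lower:.3f} to {upper:.3f}")
```

Because the published odds ratios and standard errors are rounded to three decimals, small differences in the last digit are expected.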
Genomic inflation factors23 (λGC) from the 23andMe and the UKBB GWAS were 1.147 and 1.146, respectively, in line with expectations of high power to detect large polygenic effects in these large samples24 (Supplementary Figure 4). Meta-analyzed GWAS results reached conventional genome-wide significance (p<5x10^-08) in many regions, primarily clustering around 123 distinct autosomal loci and one X-chromosomal locus (Figure 1, Supplementary Table 2), mostly new (Table 2). In line with power expectations (Supplementary Figure 5), 75 of these regions were genome-wide significant in at least one of the two cohorts and always at least nominally significant (p<0.01) in the other.
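The genomic inflation factor quoted above is conventionally computed as the median of the observed one-degree-of-freedom chi-square association statistics divided by the median expected under the null (about 0.455). The following is a minimal sketch of that convention, assuming two-sided p-values strictly between 0 and 1 as input; the function name and the toy p-values are illustrative and are not taken from the study.

```python
from statistics import NormalDist, median

# Median of a chi-square distribution with 1 degree of freedom (about 0.4549),
# obtained as the squared 75th percentile of the standard normal.
CHI2_1DF_MEDIAN = NormalDist().inv_cdf(0.75) ** 2


def genomic_inflation(p_values):
    """Return lambda_GC for an iterable of two-sided p-values in (0, 1)."""
    std_normal = NormalDist()
    # A two-sided p-value corresponds to z = inv_cdf(p / 2); z**2 is the
    # equivalent 1-df chi-square statistic.
    chi2 = [std_normal.inv_cdf(p / 2.0) ** 2 for p in p_values]
    return median(chi2) / CHI2_1DF_MEDIAN


# Hypothetical p-values spread evenly over (0, 1): no inflation expected.
uniform_p = [(i + 0.5) / 1000 for i in range(1000)]
print(round(genomic_inflation(uniform_p), 2))
```

Values above 1 indicate test statistics larger than expected under the null, which in very large samples can reflect polygenic signal as well as confounding.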
Figure 1

Manhattan plot of the inverse variance meta-analysis for association with hair color of the 23andMe and UKBB cohorts (meta-analysis N=290,891). The unadjusted significance of association (y-axis) for each SNP on different chromosomes is shown in alternating navy and green along the x-axis with polymorphisms reaching significance at GWAS level (p<5x10^-08) depicted in red. The values on the y-axis were truncated at p=10^-500.

Table 2

A selection of genes newly associated with hair color.

The selection was based on the strength of their effect, which is defined as the standardized linear regression coefficient. Results are given for the UK Biobank, 23andMe, their meta-analysis as well as the meta-analysis results from the VisiGen Consortium. These results were generated using linear models and effect sizes (Beta) are given in SD units. The A, C, T and G under the “Reference Allele” field denote the nucleotide of the allele for which the effect size and allele frequencies are reported. Frequencies are given for the reference allele and are the average of observed frequencies in the 23andMe and UK Biobank. Associations with p-values of less than 10^-100 are reported as “<E-100”.

Chr  Pos (Build37)  SNP ID       Ref. Allele  Freq.  Nearest Gene  UKBB N  UKBB Beta  UKBB SE  UKBB p-value  23andMe N  23andMe Beta  23andMe SE  23andMe p-value  Meta Beta  Meta SE  Meta p-value
1    8207579        rs80293268   G            0.047  SLC45A1       132221  0.194      0.009    1.54E-97      157651     0.157         0.009       1.29E-67         0.175      0.007    <E-100
1    205181062      rs2369633    T            0.089  DSTYK         132887  -0.071     0.007    9.20E-26      157651     -0.077        0.006       3.15E-38         -0.075     0.005    3.44E-62
2    28613302       rs71443018   G            0.039  FOSL2         126428  0.133      0.01     2.14E-39      157651     0.148         0.012       4.18E-33         0.139      0.008    1.36E-70
9    126808006      rs58979150   T            0.108  LHX2          132883  0.089      0.006    1.03E-44      157651     0.083         0.005       9.93E-53         0.086      0.004    1.40E-95
13   78391757       rs1279403    T            0.406  EDNRB         133238  -0.086     0.004    <E-100        157651     -0.074        0.004       4.57E-95         -0.08      0.003    <E-100
15   48426484       rs1426654    G            0.021  SHC4          133238  0.188      0.069    0.006         157651     0.289         0.03        2.12E-21         0.273      0.028    1.24E-22
17   39551099       rs117612447  T            0.029  KRT31         133238  0.063      0.011    2.95E-08      157651     0.064         0.011       2.09E-09         0.063      0.008    3.29E-16
20   52661068       rs73132911   T            0.046  BCAS1         132836  0.089      0.009    6.78E-22      157651     0.046         0.008       2.54E-09         0.064      0.006    5.85E-27
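The meta-analysis columns of Table 2 can be related to the per-cohort columns through inverse-variance weighting, the method named in the Figure 1 caption. The sketch below assumes the fixed-effect form of that method and a normal approximation for the p-value; neither detail is confirmed by the text shown here, and the function name is illustrative. With the rounded EDNRB values from Table 2 it gives a combined effect of about -0.080 with a standard error of about 0.003, in agreement with the table.

```python
import math


def inverse_variance_meta(estimates):
    """Combine (beta, se) pairs using fixed-effect inverse-variance weights.

    Returns the pooled beta, its standard error and a two-sided p-value
    from the normal approximation.
    """
    weights = [1.0 / se ** 2 for _, se in estimates]
    total_weight = sum(weights)
    beta = sum(w * b for w, (b, _) in zip(weights, estimates)) / total_weight
    se = math.sqrt(1.0 / total_weight)
    z = beta / se
    # Two-sided tail probability of a standard normal: erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return beta, se, p_value


# EDNRB (rs1279403): UK Biobank and 23andMe estimates as printed in Table 2.
beta, se, p_value = inverse_variance_meta([(-0.086, 0.004), (-0.074, 0.004)])
print(f"beta={beta:.3f} se={se:.3f} p={p_value:.2e}")
```

Because the table reports effects and standard errors rounded to three decimals, other rows are reproduced only approximately by this calculation.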
Previously known pigmentation loci were all strongly associated in the meta-analysis results: HERC2 (rs12913832), IRF4 (rs12203592), MC1R (rs1805007), as well as others, showed some of the strongest statistical evidence for association ever published for human complex traits. Strong associations were found for genes whose mutations reportedly cause impairment of pigmentation, such as in the Waardenburg (EDNRB, rs1279403, p<10^-100; MITF, rs9823839, p<10^-100), Hermansky-Pudlak (HPS5, rs201668750, p=4.68x10^-11), Trichomegaly (FGF5, rs7681907, p=5.684x10^-25) or Ablepharon-Macrostomia (TWIST2, rs11684254, p=1.233x10^-20) syndromes. Many polymorphisms significantly (p<5x10^-08) associated with hair color in our meta-analysis had existing entries in the GWAS Catalog21. In previous publications, they were associated with several phenotypes, including most known pigmentation loci (Supplementary Table 3). Among the associated loci, some of the strongest effects were observed for two solute carrier 45A family members (SLC45A1, rs80293268, p<10^-100 and the SLC45A2 gene, rs16891982, p<10^-100); polymorphisms near a third solute carrier gene were also significantly associated with the trait (rs60086398 upstream of SLC7A1, p=4.93x10^-08). In addition, forkhead box family genes (FOXO6, rs3856254, p=4.0x10^-09 and FOXE1, rs3021523, p=4.23x10^-23) and sex determining region Y (SRY)-box genes (SOX5, rs9971729, p=8.8x10^-17 and SOX6, rs1531903, p=9.1x10^-16) were among those highlighted in our results. An additional locus, located on chromosome X in the second intron of the collagen type IV alpha 6 gene, was also significantly associated (COL4A6, rs1266744, p=5.03x10^-12). Chromosome Y information was not analyzed.
Interestingly, given the observed strong association of hair color with sex, there was no particular difference in effect sizes observed for these loci among men and women in either cohort (Supplementary Table 4, Supplementary Figure 6); only one SNP significantly associated with hair color in the meta-analysis showed significant (p=1.6x10^-08) interaction with sex in the 23andMe cohort (Supplementary Table 5), but much weaker interaction in the UK Biobank cohort (p=0.04).
As reported before10, some hair color genes are subject to significant natural selection (Supplementary Table 6); SNPs associated with hair color in our meta-analysis tended to have lower selection score centiles and higher than average evidence for natural selection within European populations (p=0.04) and compared to Africans (Supplementary Figure 7).
To further validate the results and to introduce a testing dataset, we collected GWAS summary statistics from 10 additional cohorts with 27,865 European participants from the International Visible Trait Genetics (VisiGen) Consortium25 and meta-analyzed them. For 114 of the 123 autosomal loci highlighted by the discovery GWAS meta-analysis, the direction of the association was the same as observed in the discovery meta-analysis; despite the lower statistical power of the replication due to smaller sample sizes, most leading SNP loci from the discovery meta-analysis (75 of the 123 autosomal regions) replicated at least at a nominal level (p<0.05) with the same direction of association; for 35 of these loci the association was stronger even after correction for multiple testing (Supplementary Table 2).
Next, we assessed the potential relationship of the most associated polymorphisms and expression of the genes nearest to them. In line with most previous GWAS26, the majority of these polymorphic loci had eQTL effects in several tissues. The strongest associations were observed with transcripts of the CBWD1 (rs478882, p=1.30x10^-30), PPM1A (rs7154748, p=3.30x10^-14) and RALY genes (rs6059655 being associated with ASIP gene expression, p=6.0x10^-09) in sun-exposed skin tissues (Supplementary Table 7). As expected, genes showing the strongest association in the meta-analysis were significantly enriched for several Gene Ontology entries, especially pigmentation and melanin biosynthetic and metabolic processes (Figure 2, Supplementary Table 8).
Figure 2

Gene Ontology Biological Processes annotations for genes adjacent to the SNPs showing the strongest associations with hair color via GWAS meta-analysis in the 23andMe and UKBB cohorts.

A conditional analysis of the discovery cohorts identified 258 SNPs independently associated with hair color (Supplementary Table 9). These SNPs explain overall 20.68% of the hair color heritability (using ordinal categories) and 34.58% (SE=3.64%) of the population liability scale27 heritability for red hair (vs. any other color, assuming population prevalence is as in the UKBB at 0.047), 24.80% for blond hair (SE=2.49%, assuming a prevalence of 0.11) and 26.12% (SE=3.11%) of the black hair heritability (prevalence 0.046, Table 3).
Table 3

Phenotypic variance explained by the identified autosomal loci significantly associated with hair color. The current estimates are given as the ratio of the genetic variance, V(G), over the phenotypic variance (Vp) and scaled over the population prevalence, V(G)/Vp_L, (estimated in the UKBB cohort, N=133,238), on the right. The estimates of genetic variance explained by known SNPs prior to this study were taken from previous publications. The phenotypes in this table were compared with all other hair colors. Since 80% of the participants reported some shade of brown hair color (dark or light), the heritabilities for these two phenotypes were considered baseline and were not calculated.

           Current heritability estimates                     Previous estimates
Phenotype  V(G)/Vp  SE     V(G)/Vp_L  SE     Prevalence       V(G)/Vp  SE
Blond      0.094    0.009  0.248      0.025  0.113            0.058    0.022
Red        0.074    0.008  0.346      0.036  0.046            0.069    0.069
Black      0.056    0.007  0.261      0.031  0.047            0.005    0.005
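The relation between the observed-scale and liability-scale columns of Table 3 can be illustrated with the commonly used threshold-model transformation, h2_L = h2_O x K(1-K)/z^2, where K is the population prevalence and z is the height of the standard normal density at the liability threshold. This is a sketch under assumptions: the text does not spell out the formula, and the version below assumes the proportion of cases in the sample equals the population prevalence (no ascertainment correction). The function name is illustrative.

```python
from statistics import NormalDist


def liability_scale_h2(h2_observed, prevalence):
    """Convert observed-scale variance explained to the liability scale.

    Assumes a population sample, i.e. the case proportion equals the
    prevalence K, so h2_L = h2_O * K * (1 - K) / z**2.
    """
    std_normal = NormalDist()
    # Liability threshold above which a fraction K of the population falls.
    threshold = std_normal.inv_cdf(1.0 - prevalence)
    # Height of the standard normal density at that threshold.
    z = std_normal.pdf(threshold)
    return h2_observed * prevalence * (1.0 - prevalence) / z ** 2


# Black hair row of Table 3: V(G)/Vp = 0.056, prevalence = 0.047.
print(round(liability_scale_h2(0.056, 0.047), 2))
```

With the rounded inputs printed in Table 3, this approximately recovers the liability-scale values for the black and red rows; exact agreement is not expected because the published figures are rounded and the authors' exact procedure is not given here.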
Finally, we modelled hair color prediction in two cohorts (QIMR N=7,283 and RS N=7,724) using the 258 independently associated SNPs from the discovery GWAS meta-analysis (Supplementary Table 9) together with previously reported SNP predictors for hair color from the HIrisPlex System28. We split the data into model building (80%) and validation (20%) sets to ensure that marker discovery, model building and model validation were independently executed. In both cohorts, prediction accuracies were high for black (QIMR AUC=0.91, RS AUC=0.81) and red (0.87 and 0.84) hair colors, but lower for blond (0.79 and 0.74) and brown (0.76 and 0.64; Supplementary Table 10, Supplementary Figure 8). Using the same datasets, these new models outperformed the previous HIrisPlex model22 (QIMR/RS black 0.82 vs. 0.77, red 0.87 vs. 0.83, blond 0.67 vs. 0.65, brown 0.66 vs. 0.57, Supplementary Table 10).
Our work identified over a hundred new genetic loci involved in hair pigmentation in Europeans and raises interesting questions. First, the observation of higher prevalence of lighter hair colors among women (Supplementary Figure 9) follows previous findings based on objective quantitative measurement of hair color29,30, suggesting that sex is truly associated with hair color, independent of socially driven self-reporting bias. Second, although hair pigmentation spans a spectrum from very light (blond) to very dark (black), the genetic mechanisms do not always follow this linear scale, as red hair color often has unique predisposing genetic factors 16,17. However, our results explain even higher portions of heritability than before14 for all hair colors and not just for the extremes of the light-dark hair color spectrum. Third, hair color is a trait that follows special distribution patterns and is therefore prone to issues of population structure bias that may be controlled in several ways. A comparison of different methodologies (Supplementary Figure 10) shows that our approach is roughly equivalent to others. Fourth, annotation of genetic regions based on physical distances and association probability most likely underestimates the number of regions involved in hair pigmentation. For example, the involvement of OCA2 and HERC2 genes in human pigmentation is not simply due to linkage disequilibrium31, yet because of their proximity, both loci in our study were assigned to the same association region. This would, however, not affect the conditional analysis at a marker level, which discriminates separate effects.
In conclusion, this large GWAS meta-analysis has improved our knowledge of the genetic controls of human hair pigmentation by bringing the number of known genes into the hundreds. The newly identified genetic loci explain substantial portions of the hair color phenotypic variability and will guide future research into better understanding the functional mechanisms linking these genes to pigmentation variation. Our findings will also be useful for a better molecular understanding of human pigmentation, including its DNA-based prediction as relevant in forensic and anthropological applications, and of the diseases that result from biological impairment of pigmentation, including the development of treatment strategies.

Online Methods

Data Availability

This work used data from two primary sources. The original datasets can be accessed as follows: For UK Biobank data, through the UK Biobank Access management, as specified here: http://www.ukbiobank.ac.uk/register-apply/. The hair color data accession codes are 1747.0.0, 1747.1.0 and 1747.2.0. The UK Biobank accession code for participant age is 21022, that for sex is 31.0.0, and those for the pre-computed principal components used here are 22009.0.1 through 22009.0.10. For the 23andMe participants, requests for summary statistics access can be made at https://researchers.23andme.org/collaborations. There are no accession codes available. For the TwinsUK datasets, access can be requested through http://www.twinsuk.ac.uk/data-access/, and access to the secondary source of data can be requested through the corresponding authors.