Fan Liu, Yan Chen, Gu Zhu, Pirro G Hysi, Sijie Wu, Kaustubh Adhikari, Krystal Breslin, Ewelina Pospiech, Merel A Hamer, Fuduan Peng, Charanya Muralidharan, Victor Acuna-Alonzo, Samuel Canizales-Quinteros, Gabriel Bedoya, Carla Gallo, Giovanni Poletti, Francisco Rothhammer, Maria Catira Bortolini, Rolando Gonzalez-Jose, Changqing Zeng, Shuhua Xu, Li Jin, André G Uitterlinden, M Arfan Ikram, Cornelia M van Duijn, Tamar Nijsten, Susan Walsh, Wojciech Branicki, Sijia Wang, Andrés Ruiz-Linares, Timothy D Spector, Nicholas G Martin, Sarah E Medland, Manfred Kayser.
Abstract
Human head hair shape varies strikingly within and between populations, yet its genetic basis remains far from understood. We performed a series of genome-wide association studies (GWASs) and replication studies in a total of 28,964 subjects from 9 cohorts of multiple geographic origins. A meta-analysis of three European GWASs identified 8 novel loci (1p36.23 ERRFI1/SLC45A1, 1p36.22 PEX14, 1p36.13 PADI3, 2p13.3 TGFA, 11p14.1 LGR4, 12q13.13 HOXC13, 17q21.2 KRTAP, and 20q13.33 PTK6) and confirmed 4 previously known ones (1q21.3 TCHH/TCHHL1/LCE3E, 2q35 WNT10A, 4q21.21 FRAS1, and 10p14 LINC00708/GATA3), all showing genome-wide significant association with hair shape (P < 5e-8). All except one (1p36.22 PEX14) were replicated with nominal significance in at least one of the 6 additional cohorts of European, Native American and East Asian origins. Three additional previously known genes (EDAR, OFCC1, and PRSS53) were confirmed at the nominal significance level. A multivariable regression model revealed that 14 SNPs from different genes contribute significantly and independently to hair shape variation, reaching a cross-validated AUC value of 0.66 (95% CI: 0.62-0.70) and an AUC value of 0.64 in an independent validation cohort, an improvement in accuracy over a previous model. Prediction outcomes for 2504 individuals from a multiethnic sample were largely consistent with general knowledge of the global distribution of hair shape variation. Our study thus delivers target genes and DNA variants for future functional studies to further evaluate the molecular basis of hair shape in humans.
Year: 2018 PMID: 29220522 PMCID: PMC5886212 DOI: 10.1093/hmg/ddx416
Source DB: PubMed Journal: Hum Mol Genet ISSN: 0964-6906 Impact factor: 6.150
Figure 1. Manhattan plot of the meta-analysis of three GWASs for human hair shape in Europeans from QIMR, TwinsUK and RS (totaling 16,763 subjects). The −log10 P values for association are plotted for each SNP by chromosomal position (assembly GRCh37.p13). Previously known genes with genome-wide significant association in the present study are noted in black text above the figure, and genes in the newly identified loci are noted in red. The genome-wide significance threshold (P = 5e-8) is indicated by a red line, and the suggestive significance threshold (P = 1e-5) by a blue line.
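The two thresholds quoted in the Figure 1 caption can be placed on the −log10 scale with a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, and the example P-values passed to `classify` are taken from the tables on this page, not from any analysis code of the study.

```python
import math

# Thresholds as quoted in the Figure 1 caption.
GENOME_WIDE = 5e-8   # red line
SUGGESTIVE = 1e-5    # blue line


def neg_log10(p: float) -> float:
    """Return -log10(p), the y-axis value of a Manhattan plot."""
    return -math.log10(p)


def classify(p: float) -> str:
    """Label a P-value relative to the two thresholds (hypothetical helper)."""
    if p < GENOME_WIDE:
        return "genome-wide significant"
    if p < SUGGESTIVE:
        return "suggestive"
    return "not significant"


if __name__ == "__main__":
    print(round(neg_log10(GENOME_WIDE), 2))  # height of the red line
    print(round(neg_log10(SUGGESTIVE), 2))   # height of the blue line
    print(classify(6.65e-8))                 # just misses the genome-wide threshold
```

On this scale the genome-wide line sits at about 7.3 and the suggestive line at 5, so a P-value such as 6.65E-08 falls between the two.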
Table 1. SNPs associated with hair shape variation in the discovery meta-analysis of three European cohorts and in the meta-analysis of all 9 cohorts of European and non-European origins
| SNP | Gene | CHR | Mbp | EA | fEA (Discovery) | Beta (Discovery) | P (Discovery) | fEA (Non-Asian) | Beta (Non-Asian) | P (Non-Asian) | fEA (ALL) | Beta (ALL) | P (ALL) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| **rs80293268** | ERRFI1/SLC45A1 | 1p36.23 | 8.21 | C | 0.04 | −0.17 | | 0.03 | −0.18 | | 0.03 | −0.18 | |
| **rs6658216** | PEX14 | 1p36.22 | 10.56 | C | 0.34 | −0.06 | | 0.38 | −0.02 | 1.58E-03 | 0.38 | −0.01 | 8.31E-03 |
| **rs11203346** | PADI3 | 1p36.13 | 17.60 | G | 0.17 | −0.07 | | 0.15 | −0.07 | | 0.15 | −0.07 | |
| rs17646946 | TCHH/TCHHL1/LCE3E | 1q21.3 | 152.06 | A | 0.19 | −0.22 | | 0.17 | −0.21 | | 0.17 | −0.21 | |
| **rs12997742** | TGFA | 2p13.3 | 70.79 | C | 0.36 | −0.05 | | 0.35 | −0.04 | | 0.36 | −0.03 | 7.35E-07 |
| rs74333950 | WNT10A | 2q35 | 219.75 | G | 0.15 | 0.10 | | 0.15 | 0.09 | | 0.16 | 0.06 | |
| rs506863 | FRAS1 | 4q21.21 | 79.26 | C | 0.34 | 0.08 | | 0.35 | 0.05 | | 0.41 | 0.04 | |
| rs1999874 | LINC00708/GATA3 | 10p14 | 8.35 | A | 0.38 | 0.06 | | 0.36 | 0.06 | | 0.34 | 0.04 | |
| **rs2219783** | LGR4 | 11p14.1 | 27.41 | G | 0.10 | 0.09 | | 0.09 | 0.06 | 2.32E-06 | 0.09 | 0.02 | 4.13E-03 |
| **rs11170678** | HOXC13 | 12q13.13 | 54.15 | G | 0.26 | −0.07 | | 0.22 | −0.06 | | 0.22 | −0.06 | |
| **rs11078976** | KRTAP | 17q21.2 | 39.19 | T | 0.83 | 0.12 | | 0.81 | 0.05 | 6.65E-08 | 0.80 | 0.03 | 1.12E-04 |
| **rs310642** | PTK6 | 20q13.33 | 62.16 | C | 0.05 | 0.13 | | 0.05 | 0.10 | | 0.05 | 0.06 | |
META: Discovery, meta-analysis of 3 European cohorts (QIMR, TwinsUK and RS);
META: Non-Asian, meta-analysis of 5 European cohorts (QIMR, TwinsUK, RS, ERF, POL) and 2 European admixed cohorts from America (CANDELA and US);
META: ALL, meta-analysis of all 9 cohorts used in this study including QIMR, TwinsUK, RS, ERF, POL, CANDELA, US, UYG, and TZL;
SNP, the top-associated SNP per locus in META: Discovery; bold indicates new loci for hair curliness;
EA, the effect allele; fEA, the frequency of the effect allele; P-values < 5e-8 are in bold.
Figure 2. Regional Manhattan plots for eight novel loci associated with hair shape variation in the meta-analysis of three European GWASs (QIMR, TwinsUK, RS). (A) 1p36.23 - ERRFI1/SLC45A1; (B) 1p36.22 - PEX14; (C) 1p36.13 - PADI3; (D) 2p13.3 - TGFA; (E) 11p14.1 - LGR4; (F) 12q13.13 - HOXC13; (G) 17q21.2 - KRTAP; (H) 20q13.33 - PTK6. The index SNP in each region (Table 1) is shown as a purple diamond. The top of each panel shows the association P-values on a −log10 scale (y-axis) for all genotyped and imputed SNPs according to their physical positions (x-axis) in human genome sequence build 37. Genes in the region and LD heatmap (r²) patterns according to the 1000 Genomes EUR data set are aligned below. Plots for identified regions not shown here are presented in Supplementary Material, Figure S3.
Figure 3. Effect sizes for the derived allele at index SNPs (Table 1) in eight novel genomic regions associated with hair shape variation in 9 cohorts and a meta-analysis of all 9 cohorts (META). (A) 1p36.23 - rs80293268; (B) 1p36.22 - rs6658216; (C) 1p36.13 - rs11203346; (D) 2p13.3 - rs12997742; (E) 11p14.1 - rs2219783; (F) 12q13.13 - rs11170678; (G) 17q21.2 - rs11078976; (H) 20q13.33 - rs310642. Orange indicates European cohorts, green indicates admixed European cohorts, and purple indicates East Asian cohorts. META: Discovery is the meta-analysis of 3 European cohorts (QIMR, TwinsUK and RS); META: Non-Asian is the meta-analysis of European cohorts (QIMR, TwinsUK, RS, ERF, and POL) and admixed European cohorts (CANDELA and US); META: ALL is the meta-analysis of all 9 cohorts (QIMR, TwinsUK, RS, ERF, POL, CANDELA, US, UYG, and TZL). Blue boxes represent linear regression coefficients (x-axis) estimated in each cohort. Red boxes represent effect sizes estimated in the combined meta-analysis. Box sizes are proportional to sample size. Horizontal bars indicate 95% confidence intervals (estimate ± 1.96 standard errors). The right y-axis indicates P-values in each cohort on a −log10 scale. Similar plots for regions previously associated with hair shape are shown in Supplementary Material, Figure S6.
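The confidence intervals and combined (META) effect sizes described in the Figure 3 caption can be illustrated with a short sketch. The record does not state which meta-analysis method was used, so the inverse-variance fixed-effect estimator below is an assumption chosen because it is a common default. The function names and the per-cohort numbers in the usage example are hypothetical and are not values from the study.

```python
import math


def ci95(beta: float, se: float) -> tuple[float, float]:
    """95% confidence interval as estimate +/- 1.96 standard errors."""
    half_width = 1.96 * se
    return beta - half_width, beta + half_width


def fixed_effect_meta(betas: list[float], ses: list[float]) -> tuple[float, float]:
    """Inverse-variance weighted fixed-effect meta-analysis (assumed method).

    Each cohort is weighted by 1 / SE^2; the pooled SE is sqrt(1 / sum of weights).
    """
    weights = [1.0 / (se ** 2) for se in ses]
    total = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total
    pooled_se = math.sqrt(1.0 / total)
    return pooled_beta, pooled_se


if __name__ == "__main__":
    # Hypothetical per-cohort effect sizes and standard errors.
    cohort_betas = [-0.08, -0.06, -0.07]
    cohort_ses = [0.02, 0.03, 0.025]
    beta, se = fixed_effect_meta(cohort_betas, cohort_ses)
    print(beta, se, ci95(beta, se))
```

Under this weighting, larger cohorts (smaller standard errors) pull the pooled estimate toward their own effect size, which matches the caption's note that box sizes scale with sample size.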
Multiple logistic regression for SNPs associated with hair shape variation in 6068 Europeans from QIMR
| Marker | Gene | CHR | EA | OR | Lower95 | Upper95 | P | R2 | AUC |
|---|---|---|---|---|---|---|---|---|---|
| rs17646946 | TCHH/TCHHL1/LCE3E | 1q21.3 | A | 0.50 | 0.45 | 0.56 | 2.35E-38 | 0.04 | 0.58 |
| rs80293268 | ERRFI1/SLC45A1 | 1p36.23 | C | 0.40 | 0.30 | 0.53 | 3.33E-10 | 0.05 | 0.60 |
| Sex (female) | | | | 1.43 | 1.27 | 1.62 | 8.71E-09 | 0.05 | 0.61 |
| rs506863 | FRAS1 | 4q21.21 | C | 1.27 | 1.17 | 1.38 | 6.06E-09 | 0.06 | 0.62 |
| rs1556547 | OFCC1 | 6p24.3 | G | 0.81 | 0.75 | 0.88 | 1.85E-07 | 0.07 | 0.63 |
| rs74333950 | WNT10A | 2q35 | G | 1.35 | 1.20 | 1.51 | 2.68E-07 | 0.07 | 0.63 |
| rs310642 | PTK6 | 20q13.33 | C | 1.56 | 1.32 | 1.86 | 2.76E-07 | 0.08 | 0.64 |
| rs11170678 | HOXC13 | 12q13.13 | G | 0.78 | 0.70 | 0.86 | 6.99E-07 | 0.08 | 0.64 |
| rs11203346 | PADI3 | 1p36.13 | G | 0.79 | 0.71 | 0.88 | 1.53E-05 | 0.09 | 0.65 |
| rs11078976 | KRTAP | 17q21.2 | C | 0.75 | 0.66 | 0.86 | 2.73E-05 | 0.09 | 0.65 |
| rs2847344* | PEX14 | 1p36.22 | G | 0.86 | 0.79 | 0.93 | 3.14E-04 | 0.09 | 0.65 |
| rs1999874 | LINC00708/GATA3 | 10p14 | A | 1.15 | 1.07 | 1.25 | 3.91E-04 | 0.10 | 0.65 |
| rs2219783 | LGR4 | 11p14.1 | G | 1.26 | 1.11 | 1.44 | 5.05E-04 | 0.10 | 0.65 |
| rs12997742 | TGFA | 2p13.3 | C | 0.88 | 0.81 | 0.95 | 1.13E-03 | 0.10 | 0.66 |
| rs499697 | LCE3E | 1q21.3 | G | 1.13 | 1.04 | 1.22 | 5.54E-03 | 0.10 | 0.66 |
The markers are ordered by P-value in the multiple logistic regression, and hair curliness is dichotomized as non-straight (1) vs. straight (0);
Marker, the initial analysis includes sex, age, and 16 SNPs associated with hair curliness, 12 from the current study (see Table 1) and 4 from previous studies (EDAR rs3827760, OFCC1 rs1556547, PRSS53 rs11150606, LCE3E rs499697; see also Supplementary Material, Table S6); non-significant SNPs in the final model are not presented;
*rs2847344 is used as a replacement for rs6658216; these 2 SNPs are in nearly complete LD (r² > 0.9);
R2, cumulative Nagelkerke pseudo R² once the current marker is included;
AUC, cumulative area under the ROC curve once the current marker is included.
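To make the OR and AUC columns more concrete, the sketch below shows how odds ratios from a logistic model combine on the log-odds scale and how an AUC can be computed from predicted probabilities. Several things here are assumptions, not details from the record: the additive allele coding (0, 1 or 2 copies of the effect allele), the intercept value, and all function names. Only the two odds ratios are copied from the table.

```python
import math

# Odds ratios copied from the first two SNP rows of the table above.
ODDS_RATIOS = {"rs17646946": 0.50, "rs80293268": 0.40}


def log_odds_shift(allele_counts: dict[str, int]) -> float:
    """Sum of count * ln(OR); assumes additive coding of effect-allele copies."""
    return sum(n * math.log(ODDS_RATIOS[snp]) for snp, n in allele_counts.items())


def probability(intercept: float, allele_counts: dict[str, int]) -> float:
    """Predicted probability of non-straight hair; the intercept is hypothetical."""
    z = intercept + log_odds_shift(allele_counts)
    return 1.0 / (1.0 + math.exp(-z))


def auc(scores: list[float], labels: list[int]) -> float:
    """AUC as the fraction of (case, control) pairs ranked correctly; ties count 0.5."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if c > k else 0.5 if c == k else 0.0
        for c in cases
        for k in controls
    )
    return wins / (len(cases) * len(controls))


if __name__ == "__main__":
    # Two copies of the rs17646946 effect allele multiply the odds by 0.50 ** 2.
    print(math.exp(log_odds_shift({"rs17646946": 2})))
    print(probability(0.0, {"rs17646946": 1, "rs80293268": 1}))
    print(auc([0.9, 0.7, 0.4, 0.2], [1, 0, 1, 0]))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which gives a scale for reading the reported values of 0.58 to 0.66.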
Figure 4. Hair shape prediction accuracy in unrelated QIMR subjects (N = 6068) and ERF (N = 977). (A) Accuracy of straight vs. non-straight hair prediction using DNA variants, sex and age in unrelated Europeans from QIMR based on 10-fold cross-validation, and the accuracy of applying the model to the testing set of QIMR and to all ERF subjects. Prediction performance measured by AUC for the model based on binomial logistic regression (y-axis) is plotted against the number of markers included in the model (x-axis). Sex and age were always included and fixed at positions 1 and 2 in all analyses. (B) Frequency (left y-axis) of predicted probability (x-axis) for non-straight hair and percentage (right y-axis) of non-straight hair in each probability bin in Europeans from QIMR by self-cross-validation. (C) Frequency (left y-axis) of predicted probability (x-axis) for non-straight hair and percentage (right y-axis) of non-straight hair in each probability bin in Europeans from ERF, obtained by applying the QIMR model to ERF. (D) The distribution of predicted non-straight hair probabilities for 2504 subjects from the 1000 Genomes Project panel; samples are grouped according to the 5 continents they originate from: AFR - Sub-Saharan Africans, AMR - Native Americans, EAS - East Asians, EUR - Europeans, and SAS - South Asians. (E) The distribution of predicted non-straight hair probabilities for 2504 subjects from 19 worldwide countries from the 1000 Genomes Project panel; with the exception of 3 populations, the 26 populations were grouped into 19 countries, comprising 2248 subjects, according to the official population description from the 1000 Genomes Project.