Taru Tukiainen, Matti Pirinen, Antti-Pekka Sarin, Claes Ladenvall, Johannes Kettunen, Terho Lehtimäki, Marja-Liisa Lokki, Markus Perola, Juha Sinisalo, Efthymia Vlachopoulou, Johan G Eriksson, Leif Groop, Antti Jula, Marjo-Riitta Järvelin, Olli T Raitakari, Veikko Salomaa, Samuli Ripatti.
Abstract
The X chromosome (chrX) represents one potential source for the "missing heritability" for complex phenotypes, which thus far has remained underanalyzed in genome-wide association studies (GWAS). Here we demonstrate the benefits of including chrX in GWAS by assessing the contribution of 404,862 chrX SNPs to levels of twelve commonly studied cardiometabolic and anthropometric traits in 19,697 Finnish and Swedish individuals, with replication data on 5,032 additional Finns. Using a linear mixed model, we estimate that on average 2.6% of the additive genetic variance in these twelve traits is attributable to chrX, which is in proportion to the number of SNPs on the chromosome. In a chrX-wide association analysis, we identify three novel loci: two for height (rs182838724 near FGF16/ATRX/MAGT1, joint P-value = 2.71×10⁻⁹, and rs1751138 near ITM2A, P-value = 3.03×10⁻¹⁰) and one for fasting insulin (rs139163435 in Xq23, P-value = 5.18×10⁻⁹). Further, we find that effect sizes for variants near ITM2A, a gene implicated in cartilage development, show evidence for a lack of dosage compensation. This observation is further supported by a sex difference in ITM2A expression in whole blood (P-value = 0.00251), and is also in agreement with a previous report showing that ITM2A escapes from X chromosome inactivation (XCI) in the majority of women. Hence, our results show one of the first links between phenotypic variation in a population sample and an XCI-escaping locus, and pinpoint ITM2A as a potential contributor to the sexual dimorphism in height. In conclusion, our study provides a clear motivation for including chrX in large-scale genetic studies of complex diseases and traits.
Year: 2014 PMID: 24516404 PMCID: PMC3916240 DOI: 10.1371/journal.pgen.1004127
Source DB: PubMed Journal: PLoS Genet ISSN: 1553-7390 Impact factor: 5.917
A summary of the characteristics of the discovery and replication cohorts.
| Cohort | Full name | Sex | N | Age (years) | Height (cm) | BMI (kg/m²) |
| Discovery cohorts | | | | | | |
| NFBC | Northern Finland Birth Cohort 1966 | Males | 2388 | 31.0±0.0 | 178.2±6.4 | 25.2±3.6 |
| | | Females | 2644 | 31.0±0.0 | 164.8±6.2 | 24.2±4.7 |
| COROGENE | The COROGENE Study | Males | 2441 | 60.4±12.8 | 176.0±6.7 | 27.4±4.2 |
| | | Females | 1502 | 62.8±13.4 | 161.6±6.6 | 26.9±5.2 |
| DGI | Diabetes Genetics Initiative | Males | 1534 | 61.0±10.6 | 174.9±6.3 | 27.5±3.7 |
| | | Females | 1608 | 62.5±10.7 | 161.7±6.1 | 27.8±4.7 |
| GENMETS | Health2000 GenMets Study | Males | 1000 | 49.2±10.4 | 176.4±6.6 | 27.3±3.9 |
| | | Females | 1040 | 52.1±11.4 | 162.7±6.5 | 27.2±5.0 |
| YFS | The Cardiovascular Risk in Young Finns Study | Males | 917 | 37.6±5.1 | 179.7±6.7 | 26.8±4.3 |
| | | Females | 1111 | 37.6±5.0 | 166.0±6.0 | 25.3±5.0 |
| PredictCVD | Case control sample from the FINRISK surveys | Males | 1180 | 51.5±13.0 | 174.5±6.8 | 27.6±4.3 |
| | | Females | 686 | 52.2±13.4 | 161.5±6.5 | 27.3±5.3 |
| HBCS | Helsinki Birth Cohort Study | Males | 696 | 61.4±2.8 | 176.9±5.8 | 27.5±4.3 |
| | | Females | 950 | 61.5±3.0 | 163.2±5.8 | 27.7±5.1 |
| Replication cohort | | | | | | |
| FINRISK | Subset from the FINRISK 1997 and 2002 surveys | Males | 2287 | 46.8±13.4 | 175.8±7.0 | 26.7±4.1 |
| | | Females | 2745 | 45.1±12.4 | 162.6±6.2 | 26.2±5.1 |
N: maximum number of individuals with phenotype and genotype data available; BMI: Body-mass-index; Age, height and BMI are given as mean ± standard deviation.
Estimates of the variance explained in the twelve quantitative phenotypes by chromosome X SNPs and by autosomal SNPs, estimated separately using the equal variance (EV) model.
| Phenotype | N | hX (%) | seX (%) | P-value | haut (%) | seaut (%) |
| Height | 14408 | 1.41 | 0.41 | 2.00E-06 | 52.35 | 2.41 |
| SBP | 9990 | 1.07 | 0.52 | 0.005 | 16.63 | 2.98 |
| Fasting glucose | 9151 | 0.84 | 0.57 | 0.06 | 11.15 | 3.09 |
| HDL-C | 11139 | 0.73 | 0.42 | 0.01 | 30.21 | 2.85 |
| Fasting insulin | 9616 | 0.68 | 0.47 | 0.04 | 12.66 | 2.98 |
| TG | 11140 | 0.43 | 0.4 | 0.1 | 19.88 | 2.72 |
| CRP | 9697 | 0.42 | 0.47 | 0.2 | 11.16 | 2.89 |
| WHR | 12334 | 0.22 | 0.34 | 0.2 | 12.75 | 2.36 |
| BMI | 14214 | 0.11 | 0.31 | 0.4 | 25.86 | 2.25 |
| DBP | 9984 | 0 | 0.42 | 0.5 | 12.5 | 2.89 |
| LDL-C | 11040 | 0 | 0.41 | 0.5 | 26.43 | 2.82 |
| TC | 11141 | 0 | 0.43 | 0.5 | 27.21 | 2.81 |
The estimates are based on an analysis of the individuals from six Finnish cohorts using the program GCTA with 217,112 common and low-frequency chrX SNPs (MAF>1%) that were either directly genotyped or imputed with high quality (info >0.8), and 319,445 directly genotyped autosomal SNPs (MAF>1%).
hX: estimate for the proportion of explained variance accountable by the SNPs in chromosome X in per cent; seX: standard error in per cent for the X chromosome variance estimate; P-value: P-value for the test of hX = 0; haut: estimate for the proportion of explained variance accountable by the SNPs in autosomes in per cent; seaut: standard error in per cent for the autosomal variance estimate; SBP: systolic blood pressure; HDL-C: high-density lipoprotein cholesterol; TG: total triglycerides; CRP: C-reactive protein; WHR: waist-hip-ratio; BMI: body-mass-index; DBP: diastolic blood pressure; LDL-C: low-density lipoprotein cholesterol; TC: total cholesterol.
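The abstract's figure of roughly 2.6% of additive genetic variance attributable to chrX can be related to this table with a short calculation. The sketch below assumes that the per-trait share is hX / (hX + haut) and that the average is an unweighted mean over the twelve traits; both are assumptions made here for illustration, not a description of the authors' procedure. The names `estimates` and `chrx_share` are hypothetical.

```python
# (hX %, haut %) pairs copied from the table above.
estimates = {
    "Height": (1.41, 52.35),
    "SBP": (1.07, 16.63),
    "Fasting glucose": (0.84, 11.15),
    "HDL-C": (0.73, 30.21),
    "Fasting insulin": (0.68, 12.66),
    "TG": (0.43, 19.88),
    "CRP": (0.42, 11.16),
    "WHR": (0.22, 12.75),
    "BMI": (0.11, 25.86),
    "DBP": (0.0, 12.5),
    "LDL-C": (0.0, 26.43),
    "TC": (0.0, 27.21),
}


def chrx_share(h_x, h_aut):
    """Percentage of the total SNP-explained variance that comes from chrX.

    Assumption: total explained variance is the sum of the chrX and
    autosomal estimates.
    """
    return 100.0 * h_x / (h_x + h_aut)


shares = {trait: chrx_share(h_x, h_aut) for trait, (h_x, h_aut) in estimates.items()}
mean_share = sum(shares.values()) / len(shares)

for trait, share in shares.items():
    print(f"{trait:16s} {share:5.2f}%")
print(f"Unweighted mean chrX share: {mean_share:.1f}%")
```

Under these assumptions the unweighted mean lands close to the 2.6% quoted in the abstract, with the per-trait shares ranging from zero (DBP, LDL-C, TC) to about 7% (fasting glucose).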
The lead associations in the three significantly associated loci in the chromosome X-wide association analysis.
| Locus/Candidates | SNP | Pos (chrX) | EA/OA | Data set | Sex | EAF | Beta | SE | P-value | N | P-value (sex) | P-value (het) | Variance explained (%) |
| Xq21.1/ITM2A | rs1751138 | 78657806 | G/A | Discovery | Males+Females | 0.643 | 0.057 | 0.009 | 5.54E-10 | 19566 | 1.99E-10 | 1.33E-02 | - |
| | | | | | Males | 0.641 | 0.045 | 0.011 | 3.32E-05 | 10093 | - | - | 0.19 |
| | | | | | Females | 0.644 | 0.084 | 0.016 | 1.67E-07 | 9473 | - | - | 0.32 |
| | | | | Replication | Males+Females | 0.647 | 0.038 | 0.019 | 5.53E-02 | 4996 | 2.53E-04 | 3.32E-04 | - |
| | | | | | Males | 0.647 | −0.013 | 0.024 | 5.94E-01 | 2259 | - | - | 0.01 |
| | | | | | Females | 0.647 | 0.128 | 0.032 | 5.51E-05 | 2737 | - | - | 0.75 |
| | | | | Joint | Males+Females | 0.643 | 0.054 | 0.008 | 3.03E-10 | 24562 | 3.26E-12 | 2.85E-04 | - |
| | | | | | Males | 0.642 | 0.036 | 0.010 | 3.39E-04 | 12352 | - | - | 0.12 |
| | | | | | Females | 0.645 | 0.093 | 0.014 | 2.56E-10 | 12210 | - | - | 0.39 |
| Xq21.1/FGF16, ATRX, MAGT1 | rs182838724 | 76797439 | T/A | Discovery | Males+Females | 0.299 | 0.053 | 0.009 | 4.31E-08 | 19562 | 1.31E-07 | 1.99E-01 | - |
| | | | | | Males | 0.294 | 0.053 | 0.011 | 4.07E-06 | 10092 | - | - | 0.23 |
| | | | | | Females | 0.304 | 0.054 | 0.016 | 1.24E-03 | 9470 | - | - | 0.12 |
| | | | | Replication | Males+Females | 0.290 | 0.056 | 0.020 | 7.74E-03 | 4996 | 1.69E-02 | 3.03E-01 | - |
| | | | | | Males | 0.287 | 0.046 | 0.025 | 6.60E-02 | 2259 | - | - | 0.18 |
| | | | | | Females | 0.293 | 0.071 | 0.033 | 2.88E-02 | 2737 | - | - | 0.21 |
| | | | | Joint | Males+Females | 0.297 | 0.054 | 0.008 | 2.71E-09 | 24558 | 4.44E-09 | 8.19E-02 | - |
| | | | | | Males | 0.293 | 0.052 | 0.010 | 8.95E-07 | 12351 | - | - | 0.22 |
| | | | | | Females | 0.301 | 0.058 | 0.015 | 1.58E-04 | 12207 | - | - | 0.14 |
| Xq23 | rs139163435 | 116110239 | G/T | Discovery | Males+Females | 0.071 | −0.128 | 0.023 | 2.87E-08 | 11681 | 1.72E-07 | 5.74E-01 | - |
| | | | | | Males | 0.071 | −0.137 | 0.028 | 1.45E-06 | 5542 | - | - | 0.50 |
| | | | | | Females | 0.071 | −0.110 | 0.039 | 4.95E-03 | 6139 | - | - | 0.16 |
| | | | | Replication | Males+Females | 0.088 | −0.221 | 0.110 | 4.51E-02 | 370 | 9.95E-02 | 4.39E-01 | - |
| | | | | | Males | 0.091 | −0.170 | 0.129 | 1.87E-01 | 200 | - | - | 0.96 |
| | | | | | Females | 0.084 | −0.363 | 0.214 | 9.00E-02 | 170 | - | - | 2.04 |
| | | | | Joint | Males+Females | 0.072 | −0.132 | 0.023 | 5.18E-09 | 12051 | 3.46E-08 | 6.65E-01 | - |
| | | | | | Males | 0.072 | −0.139 | 0.028 | 6.05E-07 | 5742 | - | - | 0.51 |
| | | | | | Females | 0.071 | −0.118 | 0.039 | 2.15E-03 | 6309 | - | - | 0.19 |
EA: effect allele; OA: other allele; Data set: the data sets included in the meta-analysis: discovery cohorts (discovery), the replication cohort (replication), both discovery and replication cohorts (joint); Sex: the sex in which the analysis was conducted; EAF: effect allele frequency; Beta: effect size for the effect allele; SE: standard error for Beta; P-value: P-value for the association from fixed-effects meta-analysis; N: sample size in the analysis; P-value (sex): P-value for the association from sex-differentiated meta-analysis; P-value (het): P-value from the sex heterogeneity test; Variance explained: the proportion of phenotype variance the lead SNP explains in per cent, calculated from meta-analysis summary statistics (Materials and Methods).
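Two of the quantities in this table can be approximated from the tabulated summary statistics alone. The sketch below assumes a trait standardized to unit variance, female genotypes coded {0,1,2} and male genotypes coded {0,2}, so that the variance explained is 2p(1-p)β² in females and 4p(1-p)β² in males; it also combines the sex-specific estimates with a standard inverse-variance fixed-effects formula. These formulas are assumptions chosen because they reproduce the tabulated values to rounding, not a restatement of the Materials and Methods, and the function names are hypothetical.

```python
import math


def variance_explained(eaf, beta, sex):
    """Fraction of trait variance explained by one SNP.

    Assumptions: unit-variance trait; female genotypes coded {0,1,2}
    (genotype variance 2p(1-p)), male genotypes coded {0,2}
    (genotype variance 4p(1-p)).
    """
    genotype_var = (4.0 if sex == "male" else 2.0) * eaf * (1.0 - eaf)
    return genotype_var * beta ** 2


def fixed_effects_meta(estimates):
    """Inverse-variance weighted fixed-effects combination.

    `estimates` is a list of (beta, se) pairs; returns (beta, se, z).
    """
    weights = [1.0 / se ** 2 for _, se in estimates]
    beta = sum(w * b for w, (b, _) in zip(weights, estimates)) / sum(weights)
    se = math.sqrt(1.0 / sum(weights))
    return beta, se, beta / se


# rs1751138 (ITM2A), discovery rows of the table above.
male_pct = 100 * variance_explained(0.641, 0.045, "male")
female_pct = 100 * variance_explained(0.644, 0.084, "female")
print(f"males {male_pct:.2f}%, females {female_pct:.2f}%")

# Combining the rounded sex-specific estimates gives values near the
# Males+Females row; exact agreement is not expected after rounding.
beta, se, z = fixed_effects_meta([(0.045, 0.011), (0.084, 0.016)])
print(f"combined beta {beta:.3f}, SE {se:.3f}")
```

Because the inputs are rounded to three decimals, the combined z-score (and therefore the P-value) derived this way will differ somewhat from the tabulated one.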
Figure 1. A Manhattan plot across all twelve phenotypes and regional association plots for the three associated loci.
A. A Manhattan plot showing the associations of the X chromosome SNPs with the twelve phenotypes in the discovery analysis. The associated loci (P-value<5.0×10⁻⁸) are highlighted with red dots and solid lines. B. A plot of the height associations in the Xq21.1 region showing two separate association signals. C–E. The association plots for height near ITM2A (C), height near ATRX (D) and fasting insulin in Xq23 (E), showing the association P-values in the discovery analysis. Yellow diamonds indicate the SNPs that showed the strongest evidence of association in each locus; purple diamonds and the P-values given in the plots indicate the associations of these lead SNPs in the joint analysis of the discovery and replication data sets. Each circle in the plots indicates a SNP, with the color of the circle (in C–E) showing the linkage disequilibrium between the SNP and the highlighted lead SNP: dark blue (r²<0.2), light blue (r²>0.2), green (r²>0.4), orange (r²>0.6) and red (r²>0.8). The r² values were calculated using the genotype data from the COROGENE cohort, and the recombination rate, indicated by the blue lines in the background and the right-hand y-axis, was estimated from the CEU HapMap data. The bottom panels show the genes (RefSeq Genes) and their positions in each locus. In all plots the dashed red line marks the threshold for genome-wide significance (P-value = 5.0×10⁻⁸).
Figure 2. Comparison of the dosage compensation models in the three associated loci using a Bayesian framework.
A. Separately estimated effect sizes of the three lead SNPs (dots labeled with rs-numbers) in females (x-axis) and males (y-axis) when female genotypes are coded {0,1,2} and male genotypes {0,2}. Ellipses show the 95% confidence regions for the estimates. The lines show the expected values of the effects under either the full dosage compensation (FDC) or the no dosage compensation (NDC) model. The associated traits are fasting insulin (INS) and height (HGT). B. Posterior probability of the NDC model at the three lead SNPs when the only alternative is the FDC model and the two models are equally probable a priori. Labels under the bars give the rs-number of the SNP, the associated trait (INS = fasting insulin or HGT = height) and the height of the bar.
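The model comparison in panel B can be illustrated with a simplified calculation on the joint sex-specific estimates from the lead-association table. The sketch below is not the authors' Bayesian framework; it assumes that, with males coded {0,2}, FDC implies equal male and female effects while NDC implies a male effect half the female effect, that the two estimates are independent and normally distributed, and that the shared effect has a zero-mean normal prior. The prior standard deviation `prior_sd=0.2` and all function names are hypothetical choices made for illustration.

```python
import math


def log_marginal(b_f, se_f, b_m, se_m, male_ratio, prior_sd):
    """Log marginal density of the (female, male) effect estimates.

    Model (an assumption of this sketch): b_f ~ N(beta, se_f^2),
    b_m ~ N(male_ratio * beta, se_m^2), beta ~ N(0, prior_sd^2).
    Integrating out beta gives a bivariate normal with mean zero.
    """
    t2 = prior_sd ** 2
    v11 = se_f ** 2 + t2
    v12 = t2 * male_ratio
    v22 = se_m ** 2 + t2 * male_ratio ** 2
    det = v11 * v22 - v12 ** 2
    quad = (v22 * b_f ** 2 - 2 * v12 * b_f * b_m + v11 * b_m ** 2) / det
    return -0.5 * (2 * math.log(2 * math.pi) + math.log(det) + quad)


def posterior_ndc(b_f, se_f, b_m, se_m, prior_sd=0.2):
    """Posterior probability of NDC versus FDC with equal prior odds."""
    log_fdc = log_marginal(b_f, se_f, b_m, se_m, male_ratio=1.0, prior_sd=prior_sd)
    log_ndc = log_marginal(b_f, se_f, b_m, se_m, male_ratio=0.5, prior_sd=prior_sd)
    return 1.0 / (1.0 + math.exp(log_fdc - log_ndc))


# Joint (female beta, female SE, male beta, male SE) for each lead SNP.
lead_snps = {
    "rs1751138 (HGT)": (0.093, 0.014, 0.036, 0.010),
    "rs182838724 (HGT)": (0.058, 0.015, 0.052, 0.010),
    "rs139163435 (INS)": (-0.118, 0.039, -0.139, 0.028),
}

for snp, stats in lead_snps.items():
    print(f"{snp}: P(NDC) = {posterior_ndc(*stats):.2f}")
```

Under these assumptions the ITM2A lead SNP favors NDC while the other two loci lean toward FDC, which is the qualitative pattern the abstract describes; the exact probabilities depend on the prior and will not match the bar heights in the figure.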