Yao-Zhong Liu, Yan-Fang Guo, Liang Wang, Li-Jun Tan, Xiao-Gang Liu, Yu-Fang Pei, Han Yan, Dong-Hai Xiong, Fei-Yan Deng, Na Yu, Yin-Ping Zhang, Lei Zhang, Shu-Feng Lei, Xiang-Ding Chen, Hong-Bin Liu, Xue-Zhen Zhu, Shawn Levy, Christopher J Papasian, Betty M Drees, James J Hamilton, Robert R Recker, Hong-Wen Deng.
Abstract
For females, menarche is a highly significant physiological event. Age at menarche (AAM) is a trait with high genetic determination and is associated with major complex diseases in women. However, the specific genes underlying AAM variation are largely unknown. To identify genetic factors underlying AAM variation, a genome-wide association study (GWAS) examining about 380,000 SNPs was conducted in 477 Caucasian women. A follow-up replication study was performed to validate our major GWAS findings using two independent Caucasian cohorts with 854 siblings and 762 unrelated subjects, respectively, and one Chinese cohort of 1,387 unrelated subjects, all females. Our GWAS identified a novel gene, SPOCK (Sparc/Osteonectin, CWCV, and Kazal-like domains proteoglycan), which had seven SNPs associated with AAM at a genome-wide false discovery rate (FDR) of q<0.05. The six most significant SNPs of the gene were selected for validation in three independent replication cohorts. All six SNPs were replicated in at least one cohort. In particular, SNPs rs13357391 and rs1859345 were replicated both within and across ethnic groups in all three cohorts, with p values of 5.09×10−3 and 4.37×10−3, respectively, in the Chinese cohort and combined p values (obtained by Fisher's method) of 5.19×10−5 and 1.02×10−4, respectively, in all three replication cohorts. Interestingly, SPOCK can inhibit activation of MMP-2 (matrix metalloproteinase-2), a key factor promoting endometrial menstrual breakdown and onset of menstrual bleeding. Our findings, together with this functional relevance, strongly support the conclusion that the SPOCK gene underlies variation in AAM.
Year: 2009 PMID: 19282985 PMCID: PMC2652107 DOI: 10.1371/journal.pgen.1000420
Source DB: PubMed Journal: PLoS Genet ISSN: 1553-7390 Impact factor: 5.917
Characteristics of the subjects for GWAS and the replication study.
| Age Range | GWAS Cohort (Caucasian) | | Replication Cohort I (Caucasian) | | Replication Cohort II (Caucasian) | | Replication Cohort III (Chinese) | |
| | N | AAM (yr) | N | AAM (yr) | N | AAM (yr) | N | AAM (yr) |
| <40 | 175 | 13.0 (1.5) | 426 | 12.8 (1.4) | 6 | 12.8 (2.2) | 943 | 13.4 (1.4) |
| 40–50 | 58 | 12.7 (1.4) | 329 | 13.0 (1.6) | 98 | 12.6 (1.4) | 97 | 13.7 (1.7) |
| 50–60 | 37 | 13.5 (4.6) | 77 | 13.4 (1.6) | 247 | 12.8 (1.5) | 174 | 14.5 (1.8) |
| ≥60 | 207 | 12.9 (1.4) | 22 | 12.8 (1.5) | 411 | 12.8 (1.5) | 173 | 15.1 (1.9) |
| Total | 477 | | 854 | | 762 | | 1,387 | |
Note: Presented are means (SD).
SNPs identified in GWAS with genome-wide significant FDR q values.
| SNP Name | Position | Role | Allele | MAF (our cohort) | MAF (HapMap CEU) | p value | FDR q value | SNP effect size | | | |
| | | | | | | | | β | SE | R² | Reference allele |
| | 136451658 | Intron 5 | T/C | 0.464 | 0.492 | 4.92×10−7 | 0.035 | −0.821 | 0.158 | 0.059 | C |
| | 136463382 | Intron 5 | G/T | 0.367 | 0.308 | 8.03×10−6 | 0.038 | −0.644 | 0.144 | 0.045 | T |
| | 136468981 | Intron 5 | T/C | 0.344 | 0.308 | 5.77×10−6 | 0.038 | −0.644 | 0.144 | 0.044 | C |
| | 136475319 | Intron 5 | T/C | 0.343 | 0.308 | 1.58×10−5 | 0.044 | −0.624 | 0.144 | 0.042 | C |
| | 136587711 | Intron 3 | A/G | 0.235 | 0.233 | 1.20×10−5 | 0.042 | −0.605 | 0.143 | 0.040 | G |
| | 136593147 | Intron 3 | A/G | 0.237 | 0.233 | 1.61×10−5 | 0.044 | −0.605 | 0.143 | 0.040 | G |
| | 136600692 | Intron 3 | A/G | 0.233 | 0.208 | 4.81×10−6 | 0.038 | −0.604 | 0.143 | 0.040 | G |
The second allele represents the minor allele of each marker.
Minor allele frequency calculated in our cohort.
Minor allele frequency reported for Caucasians in the HapMap CEU.
The association analysis and the estimation of SNP effect sizes were performed under a dominant genetic model.
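The dominant model named in the note above, and the recessive model used later for the Chinese cohort, differ only in how genotypes are turned into a 0/1 predictor. The following is a minimal sketch, assuming (as is conventional, though not spelled out here) that both models are defined with respect to the minor allele. The function names and the example genotypes are hypothetical and are not taken from the study.

```python
def count_minor(genotype: str, minor: str) -> int:
    """Count copies of the minor allele in a two-letter genotype such as 'TC'."""
    return sum(1 for allele in genotype if allele == minor)


def code_dominant(genotype: str, minor: str) -> int:
    """Dominant coding (assumed): 1 if at least one minor allele is carried, else 0."""
    return 1 if count_minor(genotype, minor) >= 1 else 0


def code_recessive(genotype: str, minor: str) -> int:
    """Recessive coding (assumed): 1 only for homozygotes of the minor allele."""
    return 1 if count_minor(genotype, minor) == 2 else 0


# Hypothetical genotypes for a T/C marker whose minor allele is C.
genotypes = ["TT", "TC", "CC"]
dominant = [code_dominant(g, "C") for g in genotypes]
recessive = [code_recessive(g, "C") for g in genotypes]
print(dominant, recessive)
```

With this coding, and in the absence of covariates, a regression coefficient β can be read as the difference in mean AAM (in years) between the group coded 1 and the group coded 0.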
Figure 1. Association signals of the SPOCK gene SNPs.
This figure depicts association signals achieved in Caucasians for all the genotyped SNPs of the SPOCK gene in our GWAS. The X axis shows the physical position of each SNP. The haplotype block map for the whole span of the SPOCK gene, showing pairwise LD in D′, was constructed for both Caucasians (CEU) and Chinese Han (CHB) using the Haploview program [54] (http://www.broad.mit.edu/mpg/haploview/) and the most recent SNP genotype data (HapMap Data Rel 26/phaseIII Nov 08, on NCBI B36 assembly, dbSNP b126) from HapMap (www.hapmap.org). The seven SNPs circled within the rounded rectangle are those achieving significant genome-wide FDR values (q<0.05), which are further illustrated in Figure 2.
Figure 2. Association signals of the seven most significant SNPs of the SPOCK gene.
This figure illustrates the association signals in our GWAS for the seven SPOCK gene SNPs that achieved significant genome-wide FDR values (q<0.05). The figure also shows the signals for two haplotype blocks (Blocks 1 & 2) formed by the SNPs rs7701979, rs13357391 and rs1859345 (for Block 1) and the SNPs rs10054991, rs12653349 and rs17779700 (for Block 2). The haplotype block map for the seven SPOCK gene SNPs, showing pairwise LD in D′, was constructed for both Caucasians (CEU) and Chinese Han (CHB) using the Haploview program [54] (http://www.broad.mit.edu/mpg/haploview/) and the most recent SNP genotype data (HapMap Data Rel 26/phaseIII Nov 08, on NCBI B36 assembly, dbSNP b126) from HapMap (www.hapmap.org).
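The pairwise LD measure D′ shown in the haplotype block maps can be sketched as follows. This is only an illustration of the standard definition, not of Haploview's implementation, which first estimates haplotype frequencies from genotype data. The sketch assumes the haplotype and allele frequencies are already known; the function name and the example frequencies are hypothetical.

```python
def d_prime(p_ab: float, p_a: float, p_b: float) -> float:
    """Normalized LD coefficient D' for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A at locus 1 and B at locus 2
    p_a, p_b: frequencies of allele A and allele B
    """
    d = p_ab - p_a * p_b  # raw disequilibrium coefficient D
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    return 0.0 if d_max == 0 else d / d_max


# Hypothetical frequencies: allele A always travels with allele B (complete LD).
complete_ld = d_prime(p_ab=0.3, p_a=0.3, p_b=0.3)
# Hypothetical frequencies: haplotype frequency equals the product (no LD).
no_ld = d_prime(p_ab=0.09, p_a=0.3, p_b=0.3)
print(complete_ld, no_ld)
```

Block maps usually display the absolute value of D′, so a negative result from this function would be shown as its magnitude.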
Association signals of SNPs under replication.
| SNP name | Allele | Replication Cohort I (Caucasian) | | Replication Cohort II (Caucasian) | | Replication Cohort III (Chinese) | | Combined (Cohorts I and II) | | Combined (all three cohorts) |
| | | MAF | p value | MAF | p value | MAF | p value | Fisher's | UNPHASED | Fisher's |
| | T/C | 0.481 | 0.15 | 0.479 | | – | | | | |
| | G/T | 0.297 | | 0.297 | | – | | | 0.24 | |
| | T/C | 0.303 | | 0.305 | | 0.160 | | | | |
| | T/C | 0.307 | | 0.305 | | 0.163 | | | | |
| | A/G | 0.202 | 0.27 | 0.200 | 0.057 | 0.123 | | 0.080 | | |
| | A/G | 0.203 | 0.33 | 0.198 | | 0.138 | | 0.077 | 0.053 | |
| | | | | | | – | | | | |
Block 1 is formed by the three neighboring SNPs in intron 5 marked with asterisks. The second allele represents the minor allele of each marker. MAF, minor allele frequency calculated based on the genotypes of our study subjects. Marked in bold are the replication p values less than 0.05. Combined p values in 2 Caucasian replication cohorts were calculated using Fisher's method [29] or using the UNPHASED software (through association analysis on pooled sample containing the two cohorts) [58]. Combined p values in all 3 replication cohorts were calculated using Fisher's method only. The two SNPs, rs2348186 and rs7701979, deviate from HWE (p<0.001) in the Chinese cohort and hence were excluded from association analyses. The association analysis for Replication Cohorts I and II was performed under the dominant model. The analysis for Replication Cohort III was performed under the recessive model.
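Fisher's method for combining independent p values, referred to in the note above, can be sketched in a few lines. The statistic X = −2 Σ ln(p_i) follows a chi-square distribution with 2k degrees of freedom under the null hypothesis, and for even degrees of freedom the tail probability has a closed form, so no statistics library is needed. The function name is hypothetical. As an illustration, the sketch takes 0.27 and 0.057 to be the Cohort I and Cohort II p values of one of the A/G SNPs in the table; the result, about 0.080, agrees with the combined value that appears in the same row.

```python
import math


def fisher_combined_p(p_values):
    """Combine independent p values with Fisher's method.

    Uses the closed-form chi-square tail for 2k degrees of freedom:
    P(X >= x) = exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    """
    k = len(p_values)
    half_stat = -sum(math.log(p) for p in p_values)  # x/2, with x = -2 * sum(ln p)
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half_stat / i  # builds (x/2)^i / i! incrementally
        total += term
    return math.exp(-half_stat) * total


combined = fisher_combined_p([0.27, 0.057])
print(round(combined, 3))
```

Because the inputs are rounded table entries, agreement to three decimal places is the most that can be expected from such a check.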
Magnitude and direction of SNP effects in Replication Cohorts II and III.
| SNP name | Replication Cohort II (Caucasian) | | | Replication Cohort III (Chinese) | | | Reference allele |
| | β | SE | R² | β | SE | R² | |
| | −0.215 | 0.140 | 0.005 | – | – | – | C |
| | −0.159 | 0.125 | 0.003 | – | – | – | T |
| | −0.267 | 0.123 | 0.009 | −0.709 | 0.271 | 0.005 | C |
| | −0.245 | 0.124 | 0.008 | −0.791 | 0.262 | 0.007 | C |
| | −0.237 | 0.129 | 0.006 | −0.784 | 0.393 | 0.003 | G |
| | −0.232 | 0.128 | 0.006 | −0.694 | 0.333 | 0.003 | G |
The two SNPs, rs2348186 and rs7701979, deviate from HWE (p<0.001) in the Chinese cohort and hence were excluded from association analyses. The estimation for SNP effect size for Replication Cohort II was performed under the dominant model. The analysis for Replication Cohort III was performed under the recessive model.
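The Hardy-Weinberg equilibrium (HWE) check behind the exclusion described above can be illustrated with a one-degree-of-freedom chi-square goodness-of-fit test. This is a minimal sketch only: the source does not say which HWE test was used (an exact test is also common), and the function name and genotype counts below are hypothetical rather than taken from the study.

```python
import math


def hwe_chi_square(n_major_hom: int, n_het: int, n_minor_hom: int):
    """Chi-square test of HWE from genotype counts; returns (chi2, p_value)."""
    n = n_major_hom + n_het + n_minor_hom
    p = (2 * n_major_hom + n_het) / (2 * n)  # major allele frequency
    q = 1 - p                                # minor allele frequency
    observed = [n_major_hom, n_het, n_minor_hom]
    expected = [n * p * p, 2 * n * p * q, n * q * q]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Tail probability of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts in exact HWE proportions.
chi2_ok, p_ok = hwe_chi_square(25, 50, 25)
# Hypothetical counts with no heterozygotes at all, a gross deviation.
chi2_bad, p_bad = hwe_chi_square(50, 0, 50)
print(p_ok, p_bad < 0.001)
```

Under the threshold quoted above, a marker like the second example would be excluded from association analysis, while the first would be retained.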
Association analysis of haplotypes formed by SNPs of the SPOCK gene.
| Cohort | Component SNPs of a Haplotype | Haplotype | Haplotype Frequency | p value |
| GWAS Cohort | SNPs 1-2-3-4-5-6 | G-A-T-T-T-A | 0.605 | 7.90×10−4 |
| | | T-G-C-C-C-G | 0.197 | 8.04×10−4 |
| Replication Cohort I | SNPs 1-2-3-4-6 | G-C-T-A-G | 0.677 | 0.038 |
| | | T-C-C-A-A | 0.123 | 0.015 |
| Replication Cohort II | SNPs 1-2-3-4-6 | G-A-T-A-T | 0.659 | 0.069 |
| | | T-G-C-G-C | 0.166 | 0.093 |
| Replication Cohort III | SNPs 2-3-4-6 | A-A-C-C | 0.066 | 0.051 |
| | | A-A-T-T | 0.786 | 0.054 |
SNP1: rs7701979, SNP2: rs13357391; SNP3: rs1859345; SNP4: rs10054991; SNP5: rs12653349; SNP6: rs17779700. The haplotype covers the two haplotype blocks, Block 1 and Block 2, as shown in Figure 1. SNP 5 was not genotyped in Replication Cohorts I and II, and SNPs 1 and 5 were not genotyped in Replication Cohort III. Therefore, these SNPs were not included as the component SNPs for the haplotype analysis for the Replication Cohorts.
Figure 3. Comparison of AAM among subjects of different genotypes at rs13357391.
Presented in the figure are the means and standard errors of AAM for specific genotype groups in the GWAS and Chinese replication cohorts.
Comparison between previous study and this GWAS for CCR3 gene's association with AAM.
| SNP name | Position | Role | Allele | MAF (previous study) | MAF (HapMap CEU) | p value (previous study) | MAF (imputed, GWAS sample) | p value (imputed, GWAS sample) |
| | 46248770 | promoter | G/A | 0.074 | 0.067 | 0.142 | 0.076 | 0.066 |
| | 46251882 | promoter | T/A | 0.365 | 0.325 | 0.748 | 0.345 | 0.921 |
| | 46270781 | intron 1 | G/A | 0.443 | 0.479 | 0.009 | 0.446 | 0.200 |
| | 46278188 | intron 1 | G/A | 0.191 | 0.208 | | 0.200 | |
| | 46279610 | intron 2 | T/C | 0.445 | 0.460 | 0.03 | 0.448 | 0.277 |
| | 46280445 | intron 2 | C/A | 0.073 | 0.067 | 0.151 | 0.076 | 0.050 |
| | 46281704 | exon 3 | T/C | 0.075 | 0.067 | 0.663 | 0.077 | 0.031 |
| | 46283476 | downstream | T/A | 0.249 | 0.250 | 0.503 | 0.256 | 0.858 |
| | 46287543 | downstream | G/A | 0.446 | 0.400 | 0.06 | 0.450 | 0.301 |
The second allele represents the minor allele of each marker.
Minor allele frequency calculated in the sample of the previous study [16].
Minor allele frequency reported for Caucasians in the HapMap CEU.
Minor allele frequency calculated based on imputed genotypes.
The direction of the effects for the SNP, rs3091309, is the same in the previous study as in the GWAS sample according to the imputation results, where carriers of the minor allele “A” tend to have a lower AAM as compared with the non-carriers. Using Fisher's method [29] to combine the two p values achieved in the previous and the current GWAS samples, a p value of 4.78×10−3 is obtained for the overall association of this SNP with AAM in the two datasets.