
A genome-wide association study for blood lipid phenotypes in the Framingham Heart Study.

Sekar Kathiresan, Alisa K Manning, Serkalem Demissie, Ralph B D'Agostino, Aarti Surti, Candace Guiducci, Lauren Gianniny, Nöel P Burtt, Olle Melander, Marju Orho-Melander, Donna K Arnett, Gina M Peloso, Jose M Ordovas, L Adrienne Cupples.

Abstract

BACKGROUND: Blood lipid levels including low-density lipoprotein cholesterol (LDL-C), high-density lipoprotein cholesterol (HDL-C), and triglycerides (TG) are highly heritable. Genome-wide association is a promising approach to map genetic loci related to these heritable phenotypes.
METHODS: In 1087 Framingham Heart Study Offspring cohort participants (mean age 47 years, 52% women), we conducted genome-wide analyses (Affymetrix 100K GeneChip) for fasting blood lipid traits. Total cholesterol, HDL-C, and TG were measured by standard enzymatic methods and LDL-C was calculated using the Friedewald formula. The long-term averages of up to seven measurements of LDL-C, HDL-C, and TG over an approximately 30-year span were the primary phenotypes. We used generalized estimating equations (GEE), family-based association tests (FBAT) and variance components linkage to investigate the relationships between SNPs (on autosomes, with minor allele frequency ≥10%, genotypic call rate ≥80%, and Hardy-Weinberg equilibrium p ≥ 0.001) and multivariable-adjusted residuals. We pursued a three-stage replication strategy of the GEE association results with 287 SNPs (P < 0.001 in Stage I) tested in Stage II (n ≈ 1450 individuals) and 40 SNPs (P < 0.001 in joint analysis of Stages I and II) tested in Stage III (n ≈ 6650 individuals).
RESULTS: Long-term averages of LDL-C, HDL-C, and TG were highly heritable (h² = 0.66, 0.69, 0.58, respectively; each P < 0.0001). Of 70,987 tests for each of the phenotypes, two SNPs had p < 10⁻⁵ in GEE results for LDL-C, four for HDL-C, and one for TG. For each multivariable-adjusted phenotype, the number of SNPs with association p < 10⁻⁴ ranged from 13 to 18 and with p < 10⁻³, from 94 to 149. Some results confirmed previously reported associations with candidate genes including variation in the lipoprotein lipase gene (LPL) and HDL-C and TG (rs7007797; P = 0.0005 for HDL-C and 0.002 for TG). The full set of GEE, FBAT and linkage results is posted at the database of Genotype and Phenotype (dbGaP). After three stages of replication, there was no convincing statistical evidence for association (i.e., combined P < 10⁻⁵ across all three stages) between any of the tested SNPs and lipid phenotypes.
CONCLUSION: Using a 100K genome-wide scan, we have generated a set of putative associations for common sequence variants and lipid phenotypes. Validation of selected hypotheses in additional samples did not identify any new loci underlying variability in blood lipids. Lack of replication may be due to inadequate statistical power to detect modest quantitative trait locus effects (i.e., <1% of trait variance explained) or reduced genomic coverage of the 100K array. GWAS in FHS using a denser genome-wide genotyping platform and a better-powered replication strategy may identify novel loci underlying blood lipids.

Year: 2007    PMID: 17903299    PMCID: PMC1995614    DOI: 10.1186/1471-2350-8-S1-S17

Source DB: PubMed    Journal: BMC Med Genet    ISSN: 1471-2350    Impact factor: 2.103


Introduction

Blood lipid levels are a major contributor to atherosclerotic cardiovascular disease [1]. Current evidence suggests that blood lipids are complex genetic phenotypes, influenced by both environmental and genetic factors. Heritability estimates for blood lipids are high, including ~40–60% for high-density lipoprotein cholesterol (HDL-C), ~40–50% for low-density lipoprotein cholesterol (LDL-C), and ~35–48% for triglycerides (TG) [2]. These estimates indicate that DNA sequence variation plays an important role in explaining inter-individual variation in blood lipid levels. Indeed, sequence variants in individual genes have been consistently related to blood lipid phenotypes, including APOE/PCSK9 with LDL-C [3-5], CETP/LIPC/LPL with HDL-C [6-9], and APOA5/LPL with TG [10,11], among others. However, the extent to which common genetic variants across the genome account for total variation in blood lipid levels is unknown. Recent advances in genomics enable a genome-wide association study (GWAS), an approach in which a substantial fraction of common genetic variation is tested for a role in determining phenotypic variation [12]. These advances include a map of the correlation structure for approximately 4 million common genetic variants (minor allele frequency >5%) and whole-genome genotyping technologies capable of assaying 100,000–500,000 single nucleotide polymorphisms (SNPs) in an individual [13]. Utilizing a fixed genotyping marker set such as the Affymetrix 100K GeneChip in an association study tests a substantial fraction of the genome in whites, ~30–45% in some estimates [14]. GWAS has been successfully applied to identify novel genetic loci related to several medical phenotypes including age-related macular degeneration [15], inflammatory bowel disease [16], and electrocardiographic QT interval [17]. 
Identifying novel genetic variants related to blood lipid phenotypes may provide new drug targets to alter blood lipid levels and may aid in the prediction of cardiovascular disease. We hypothesized that common genetic variants explain a proportion of the inter-individual variability in LDL-C, HDL-C, and TG. Accordingly, we conducted genome-wide linkage and association studies for these three phenotypes in Framingham Heart Study (FHS) participants.

Materials and methods

GWAS sample

Of the 1345 FHS participants who are part of the family plate set (see Executive Summary), we focused our analyses on the 1087 participants from the Offspring cohort who had Affymetrix 100K genotypes. Lipid phenotypes were measured at various examinations as described in Table 1. Each study participant provided written informed consent for genetic analyses and the study was approved by Boston University's Institutional Review Board.
Table 1

Lipid Phenotypes Examined Using Affymetrix 100K GeneChip Scan

Phenotype | Acronym | No. of phenotypes | N | h²* | Offspring exam | Adjustment†: multivariable model
Total cholesterol | TC | 7 | 1069 | 0.57 | 1,2,3,4,5,6,7 | Age, age², smoking, body mass index, alcohol consumption, menopausal status, hormone replacement therapy; covariate-adjusted residuals created separately by gender
Low-density lipoprotein cholesterol | LDL-C | 7 | 1056 | 0.59 | 1,2,3,4,5,6,7 |
High-density lipoprotein cholesterol | HDL-C | 7 | 1062 | 0.52 | 1,2,3,4,5,6,7 |
Triglycerides | TG | 7 | 1068 | 0.48 | 1,2,3,4,5,6,7 |
Mean total cholesterol | MeanTC | 1 | 1087 | 0.61 | Avg of Exams 1,2,3,4,5,6,7 |
Mean low-density lipoprotein cholesterol | MeanLDL-C | 1 | 1086 | 0.66 | Avg of Exams 1,2,3,4,5,6,7 |
Mean high-density lipoprotein cholesterol | MeanHDL-C | 1 | 1087 | 0.69 | Avg of Exams 1,2,3,4,5,6,7 |
Mean triglycerides | MeanTG | 1 | 1087 | 0.58 | Avg of Exams 1,2,3,4,5,6,7 |
Apolipoprotein A-I | ApoA-I | 1 | 997 | 0.42 | 4 |
Apolipoprotein B | ApoB | 1 | 997 | 0.47 | 4 |
Apolipoprotein C3 | ApoC3 | 1 | 767 | 0.38 | 4 |
Apolipoprotein E | ApoE | 1 | 744 | 0.54 | 5 |
High-density lipoprotein 2 cholesterol | HDL2-C | 2 | 955 (exam 4), 958 (exam 5) | 0.50, 0.53 | 4,5 |
High-density lipoprotein 3 cholesterol | HDL3-C | 2 | 984 (exam 4), 1020 (exam 5) | 0.48, 0.54 | 4,5 |
Large high-density lipoprotein by NMR | HDLNMRlg | 1 | 851 | 0.53 | 4 |
Intermediate high-density lipoprotein by NMR | HDLNMRint | 1 | 851 | 0.22 | 4 |
Small high-density lipoprotein by NMR | HDLNMRsm | 1 | 851 | 0.24 | 4 |
High-density lipoprotein particle size by NMR | HDLNMRsz | 1 | 851 | 0.50 | 4 |
Intermediate-density lipoprotein particle by NMR | IDLNMR | 1 | 851 | 0.20 | 4 |
Large low-density lipoprotein by NMR | LDLNMRlg | 1 | 851 | 0.39 | 4 |
Small low-density lipoprotein by NMR | LDLNMRsm | 1 | 851 | 0.40 | 4 |
Low-density lipoprotein particle size by NMR | LDLNMRsz | 1 | 851 | 0.40 | 4 |
Large very low-density lipoprotein particle by NMR | VLDLNMRlg | 1 | 851 | 0.34 | 4 |
Intermediate very low-density lipoprotein particle by NMR | VLDLNMRint | 1 | 878 | 0.40 | 4 |
Small very low-density lipoprotein particle by NMR | VLDLNMRsm | 1 | 851 | 0.24 | 4 |
Very low-density lipoprotein particle size by NMR | VLDLNMRsz | 1 | 851 | 0.42 | 4 |
Triglyceride/high-density lipoprotein cholesterol ratio | tghdl | 7 | 1060 | 0.56 | 1,2,3,4,5,6,7 |
Total cholesterol/high-density lipoprotein cholesterol ratio | cholhdl | 7 | 1060 | 0.58 | 1,2,3,4,5,6,7 |
Lipoprotein(a) | Lp(a) | 1 | 763 | 0.90 | 3 |
Remnant lipoprotein cholesterol | RLP-C | 1 | 746 | 0.34 | 4 |
Remnant lipoprotein triglycerides | RLP-TG | 1 | 715 | 0.33 | 4 |

Note: TG, MeanTG, cholhdl, and tghdl were natural log transformed due to skewed distribution;

*Heritability (h²) estimates presented are those after multivariable-adjustment; P < 0.0001 for all heritability estimates.

†Each phenotype had 2 adjustment schemes: age- and sex-adjusted and multivariable-adjusted. Both age- and sex-adjusted and multivariable-adjusted model results are web posted.

‡Heritability estimates were derived from lipid phenotypes at a single time-point, that of FHS Examination 1.


Phenotype definition and methods

Blood lipids were measured from fasting venous blood collected at each of seven clinical examination time points extending from 1971 to 2001. Total cholesterol, HDL-C, and TG were measured by standard enzymatic methods. LDL-C was calculated using the Friedewald formula, with a missing value assigned for participants with a measured TG > 400 mg/dL. Clinical covariates utilized in phenotypic regression modeling included age at the time of blood lipid measurement, age², body mass index (weight in kg divided by the square of height in m), alcohol intake (drinks per week), current cigarette smoking (yes, no), menopausal status (postmenopausal yes, no), and hormone replacement therapy (yes, no). Commonly used lipid-lowering therapies affect total cholesterol and TG. To account for treatment effect, we imputed total cholesterol and TG values for those treated with lipid-lowering therapy. The imputation procedure was modeled after prior work on imputing blood pressure values for those on antihypertensive medication [18]. For each treated individual, a correction factor was added to the observed (treated) lipid value (total cholesterol or TG). This correction factor consisted of the difference between an "expected" residual and the "calculated" residual. The "calculated" residual for each individual was generated in a sex-specific manner after adjustment for age, age², age³, and examination year (by decade). The "expected" residual was generated within each sex and 10-year age group as the average of "calculated" residuals equal to or greater than the treated individual's "calculated" residual. Lipoprotein subclass profiles were measured by a commercially available proton NMR spectroscopic assay (LipoScience, Raleigh, NC) on plasma samples stored at -70°C as described previously [19].
The particle concentrations of the following 9 lipoprotein species were determined: 3 VLDL subclasses [large, >60 nm (including chylomicrons); intermediate, 35–60 nm; small, 27–35 nm]; 3 LDL subclasses (IDL, 23–27 nm; large LDL, 21.3–23 nm; small LDL, 18.3–21.2 nm); and 3 HDL subclasses (large, 8.8–13 nm; intermediate, 8.2–8.8 nm; small, 7.3–8.2 nm). The small LDL subclass comprises the sum of subclasses formerly labeled "intermediate" (19.8–21.2 nm) and "small" (18.3–19.7 nm) [19], since concentrations of both have very similar relations to lipid levels.
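
The LDL-C calculation and the treatment correction described in this section can be sketched in a few lines. This is a minimal illustration, not the study's code: the function names are hypothetical, the Friedewald formula is written in its standard mg/dL form (LDL-C = total cholesterol − HDL-C − TG/5), and the stratum passed to `treatment_corrected` is assumed to hold the "calculated" residuals of everyone in the treated participant's sex and 10-year age group.

```python
from statistics import mean
from typing import Optional, Sequence


def friedewald_ldl(total_chol: float, hdl: float, tg: float) -> Optional[float]:
    """LDL-C in mg/dL by the Friedewald formula; missing (None) when TG > 400 mg/dL."""
    if tg > 400:
        return None
    return total_chol - hdl - tg / 5.0


def treatment_corrected(observed: float,
                        own_residual: float,
                        stratum_residuals: Sequence[float]) -> float:
    """Add (expected - calculated) residual to a treated participant's observed value.

    The "expected" residual is the mean of the stratum's calculated residuals
    that are equal to or greater than the participant's own calculated residual.
    """
    at_or_above = [r for r in stratum_residuals if r >= own_residual]
    if not at_or_above:  # assumption: no correction when nothing qualifies
        return observed
    expected = mean(at_or_above)
    return observed + (expected - own_residual)


ldl = friedewald_ldl(total_chol=200.0, hdl=50.0, tg=150.0)
corrected_tc = treatment_corrected(180.0, 10.0, [-5.0, 10.0, 20.0, 30.0])
```

Because the expected residual is an average over residuals at or above the participant's own, the correction is never negative: treated values are only ever shifted upward.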

Genotyping methods

All analyses were based on the Affymetrix 100K GeneChip genotyping data generated in Framingham Heart Study participants as described previously [20]. In order to minimize false positive associations due to genotyping artifact, we limited our analyses to SNPs with a genotyping call rate ≥80% and a Hardy-Weinberg Equilibrium P ≥ 0.001. Given lower statistical power to detect associations with rarer SNPs, we limited our results to SNPs with a minor allele frequency ≥10%.
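
The three SNP filters above can be expressed as a single predicate. The sketch below is an illustration under stated assumptions: the `GenotypeCounts` container and function names are hypothetical, and the Hardy-Weinberg test is taken to be a 1-degree-of-freedom chi-square goodness-of-fit test, which the text does not specify (an exact test may have been used instead).

```python
import math
from dataclasses import dataclass


@dataclass
class GenotypeCounts:
    n_aa: int       # homozygotes for allele A
    n_ab: int       # heterozygotes
    n_bb: int       # homozygotes for allele B
    n_missing: int  # samples with no genotype call


def call_rate(g: GenotypeCounts) -> float:
    called = g.n_aa + g.n_ab + g.n_bb
    return called / (called + g.n_missing)


def minor_allele_frequency(g: GenotypeCounts) -> float:
    n = g.n_aa + g.n_ab + g.n_bb
    p = (2 * g.n_aa + g.n_ab) / (2 * n)
    return min(p, 1 - p)


def hwe_p_value(g: GenotypeCounts) -> float:
    """Chi-square (1 df) test of Hardy-Weinberg proportions."""
    n = g.n_aa + g.n_ab + g.n_bb
    p = (2 * g.n_aa + g.n_ab) / (2 * n)
    q = 1 - p
    if p == 0 or q == 0:  # monomorphic SNP: nothing to test
        return 1.0
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (g.n_aa, g.n_ab, g.n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square variable with 1 df
    return math.erfc(math.sqrt(chi2 / 2))


def passes_qc(g: GenotypeCounts) -> bool:
    """Thresholds from the text: call rate >= 80%, HWE P >= 0.001, MAF >= 10%."""
    return (call_rate(g) >= 0.80
            and hwe_p_value(g) >= 0.001
            and minor_allele_frequency(g) >= 0.10)
```

For example, `passes_qc(GenotypeCounts(360, 480, 160, 0))` is true (genotypes in exact Hardy-Weinberg proportions, minor allele frequency 0.4), whereas a SNP with no heterozygotes at all fails the Hardy-Weinberg filter.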

Statistical analysis methods

TG levels were log-transformed to approximate a normal distribution. For each blood lipid phenotype, the long-term average of 4 to 7 serial measurements was used as the primary phenotype. Participants contributing fewer than 4 of 7 measures of a given phenotype were excluded from that analysis. MeanLDL-C, MeanHDL-C, and MeanTG were adjusted for covariates in sex-specific linear regression models. Two sets of phenotypic models were created: Model 1 (age, age²) and Model 2 (age, age², body mass index, alcohol intake, cigarette smoking, menopausal status, and hormone replacement therapy). For quantitative covariates (age, body mass index, and alcohol intake), the mean value across examinations was used as a covariate. For categorical covariates, the proportion of exams scored as 'yes' was used. The residual MeanLDL-C, MeanHDL-C, and MeanlogTG values from Model 1 and Model 2 served as the primary phenotypes. For genotype-phenotype association analyses, we assumed an additive model of inheritance. We conducted multivariable linear regression using GEE, family-based association testing using FBAT, and linkage using Merlin for computation of IBDs and SOLAR for variance component models as described in the Executive Summary.
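
The construction of the long-term average phenotype and of the summarized covariates can be sketched as follows. The function names are hypothetical, missing exams are represented as `None`, and for TG the sketch assumes the natural log is taken at each exam before averaging (the "MeanlogTG" label suggests this, but the text does not state the order explicitly).

```python
import math
from statistics import mean
from typing import Optional, Sequence


def long_term_mean(values: Sequence[Optional[float]],
                   min_exams: int = 4,
                   log_transform: bool = False) -> Optional[float]:
    """Average of the available serial measurements; None if fewer than min_exams."""
    observed = [v for v in values if v is not None]
    if len(observed) < min_exams:
        return None  # participant excluded from this phenotype
    if log_transform:
        observed = [math.log(v) for v in observed]
    return mean(observed)


def proportion_yes(flags: Sequence[bool]) -> float:
    """Categorical covariate summarized as the proportion of exams scored 'yes'."""
    return sum(flags) / len(flags)


# Seven exams, three of them missing: four measurements remain, so the mean is kept.
mean_hdl = long_term_mean([100, None, 120, 110, None, 130, None])
# Only three measurements: the participant is dropped for this phenotype.
dropped = long_term_mean([100, None, None, None, 120, 110, None])
smoking = proportion_yes([True, False, True, True])
```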

Heritability analyses

Heritability estimates for the lipid phenotypes were obtained from extended families with at least two members by variance-components methods using the Sequential Oligogenic Linkage Analysis Routines (SOLAR) package [21]. Using this approach, maximum-likelihood estimation was applied to a mixed-effects model that incorporated fixed covariate effects, additive genetic effects, and residual error. The additive genetic effects and residual errors were assumed to be normally distributed and to be mutually independent. The analyses were performed using residuals from the multivariable models (Model 1 and Model 2) mentioned above. For phenotypes with kurtosis > 1, heritability estimates were computed on ranked normalized deviates.
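
The last step, replacing a heavy-tailed phenotype with ranked normalized deviates, can be illustrated as below. This is a sketch under assumptions: the kurtosis threshold is taken to refer to excess kurtosis, ties receive average ranks, and ranks are mapped to normal quantiles with the offset (rank − 0.5)/n; the exact convention used by SOLAR may differ.

```python
from statistics import NormalDist, mean
from typing import List, Sequence


def excess_kurtosis(values: Sequence[float]) -> float:
    """Fourth standardized moment minus 3 (0 for a normal distribution)."""
    m = mean(values)
    m2 = mean((v - m) ** 2 for v in values)
    m4 = mean((v - m) ** 4 for v in values)
    return m4 / m2 ** 2 - 3.0


def rank_normal_deviates(values: Sequence[float]) -> List[float]:
    """Map each value to the normal quantile of its (tie-averaged) rank."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j over a run of tied values
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    normal = NormalDist()
    return [normal.inv_cdf((r - 0.5) / n) for r in ranks]


def normalize_if_heavy_tailed(values: Sequence[float], threshold: float = 1.0) -> List[float]:
    if excess_kurtosis(values) > threshold:
        return rank_normal_deviates(values)
    return list(values)
```

The transform keeps only the ordering of the residuals, so a single extreme value no longer dominates the variance-components likelihood.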

Replication samples

Replication genotyping was attempted in three independent sample sets: a) the FHS unrelated plate set; b) Genetics of Lipid Lowering Drugs and Diet Network (GOLDN); and c) Malmö Diet and Cancer Study – Cardiovascular Cohort (MDC-CC). The second stage consisted of ~1450 biologically unrelated individuals from the FHS unrelated plate set. The third stage consisted of ~1450 participants from GOLDN and ~5200 participants from MDC-CC. GOLDN is a family-based sample recruited from two National Heart, Lung, and Blood Institute Family Heart Study field centers (Minneapolis, MN and Salt Lake City, UT). The Family Heart Study is a multi-center, population-based cohort designed to study the genetic and environmental determinants of cardiovascular disease. The MDC study is a community-based prospective epidemiologic cohort of 28,098 persons recruited for a baseline examination between 1991 and 1996. From this cohort, 6103 persons were randomly selected to participate in the MDC-CC, which sought to investigate risk factors for cardiovascular disease. Of the MDC-CC participants, 5466 had DNA and lipid phenotypes available. Individuals on lipid-lowering therapy and those with outlier values of LDL-C, HDL-C, or TG (top 0.5% of the distribution) were excluded, leaving 5212 individuals available for the SNP-lipid association analyses.

Staged replication strategy

For follow-up into Stage II (the FHS unrelated plate set), we selected all SNPs in the GWAS with an association P < 0.001 for the MeanLDL-C, MeanHDL-C, or MeanTG phenotypes from the minimally-adjusted phenotypic model (Model 1, adjustment for age and age² only). We next conducted a joint analysis of Stage I (GWAS 100K data) and Stage II (FHS unrelated plate set). The joint analysis consisted of a weighted average of the beta estimates and standard errors from Stages I and II and used the inverse of the variance in each stage as weights. For follow-up into Stage III (GOLDN and MDC-CC), we selected for genotyping all SNPs with P < 0.001 in the joint analysis of Stages I and II. For genotype-phenotype association analyses in MDC-CC and GOLDN, we assumed an additive model of inheritance. In MDC-CC, we conducted multivariable linear regression analyses to test the null hypothesis that LDL-C, HDL-C, or TG residuals (sex-specific residuals adjusted for age and age²) did not differ by increasing minor allele copy number. In GOLDN, to account for correlated observations due to family relationships, we used linear mixed-effects methods in SOLAR. To summarize the statistical evidence for association for each SNP across all three stages, we repeated the inverse-variance-weighted averaging of beta estimates and standard errors described above.
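
The joint analysis described above is a fixed-effects, inverse-variance-weighted average. A minimal sketch follows; the function name is hypothetical, and the two-sided P value is assumed to come from a normal approximation to beta/SE, which the text does not state.

```python
import math
from statistics import NormalDist
from typing import Sequence, Tuple


def inverse_variance_joint(betas: Sequence[float],
                           standard_errors: Sequence[float]) -> Tuple[float, float, float]:
    """Combine per-stage estimates; returns (joint beta, joint SE, two-sided P)."""
    weights = [1.0 / se ** 2 for se in standard_errors]  # weight = 1 / variance
    total_weight = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    se = math.sqrt(1.0 / total_weight)
    z = beta / se
    p = 2.0 * NormalDist().cdf(-abs(z))
    return beta, se, p


# Stage I and Stage II estimates for rs7231460 (MeanHDL-C) as printed in Table 5
beta, se, p = inverse_variance_joint([-0.145, -0.131], [0.041, 0.036])
```

With these rounded inputs the joint beta is about -0.137 with a standard error of about 0.027, and the P value is on the order of 4 × 10⁻⁷, close to the 3.7 × 10⁻⁷ reported for Stages I and II in Table 5. The small difference is expected because the table values are rounded to three decimals.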

Results

Clinical characteristics of the FHS sample of 1345 subjects are presented in the Executive Summary. Table 1 displays the variables that were studied in our analyses of lipid phenotypes. Further information on these phenotypes can be found at . Since Original cohort members were non-fasting at examination, our analyses considered only the 1087 Offspring Study participants with fasting lipid measurements and Affymetrix 100K SNP genotypes. For this paper we focus on longitudinal mean levels of serially measured values (minimum of 4, maximum of 7) of LDL-C, HDL-C, and TG (labeled MeanLDL-C, MeanHDL-C, and MeanTG). Heritability estimates for long-term average lipid phenotypes (MeanLDL-C, MeanHDL-C, and MeanTG) were greater than those from single time-point measurements (Table 1). For example, the heritabilities of MeanLDL-C, MeanHDL-C, and MeanTG were 0.66, 0.69, and 0.58, respectively, whereas heritabilities for LDL-C, HDL-C, and TG measured at FHS Examination 1 (a single time-point) were 0.59, 0.52, and 0.48, respectively. The highest heritability estimate for any available lipid phenotype was that for lipoprotein(a) at 0.90. From the GEE analyses, the strongest associations for MeanLDL-C, MeanHDL-C, and MeanTG were for SNPs rs287474 (p = 6.3 × 10⁻⁹), rs524802 (p = 7.6 × 10⁻⁷), and rs7007075 (p = 7.7 × 10⁻⁶), respectively (Table 2a). From the FBAT analyses, the strongest associations for MeanLDL-C, MeanHDL-C, and MeanTG were for SNPs rs287474 (p = 1.4 × 10⁻⁸), rs10495594 (p = 5.1 × 10⁻⁵), and rs1449866 (p = 1.8 × 10⁻⁵), respectively (Table 2b). For each multivariable-adjusted phenotype, the number of SNPs with a GEE association p < 10⁻⁴ ranged from 13 to 18 and with p < 10⁻³, from 94 to 149. The number of SNPs with FBAT association p < 10⁻⁴ ranged from 2 to 5 and with p < 10⁻³, from 74 to 79.
Table 2

Overview of Top Association and Linkage Results for MeanLDL-C, MeanHDL-C, and MeanTG

2a. Top 25 SNPs for association with MeanLDL-C, MeanHDL-C, or MeanTG based on the lowest p values of the GEE tests
Phenotype | SNP rs ID* | Chr | Physical location (bp) | GEE P-value | FBAT P-value | Gene (in or near)
MeanLDL-C | rs287474 | 13 | 68173961 | 6.3×10⁻⁹ | 1.4×10⁻⁸ |
MeanLDL-C | rs287354 | 13 | 68137953 | 3.4×10⁻⁸ | 6.8×10⁻⁸ |
MeanHDL-C | rs524802 | 19 | 42138787 | 7.6×10⁻⁷ | 0.04 |
MeanHDL-C | rs505717 | 19 | 42122156 | 1.2×10⁻⁶ | 0.07 | ZNF345
MeanHDL-C | rs544543 | 19 | 42122663 | 2.5×10⁻⁶ | 0.07 |
MeanTG | rs7007075 | 8 | 138168448 | 7.7×10⁻⁶ | 0.004 |
MeanHDL-C | rs3734678 | 6 | 107639853 | 8.2×10⁻⁶ | 0.009 | C6orf210
MeanLDL-C | rs10488824 | 11 | 81678287 | 1.4×10⁻⁵ | 0.03 |
MeanTG | rs2142136 | 10 | 89753050 | 1.5×10⁻⁵ | 0.06 | BC005821, PTEN
MeanLDL-C | rs1555173 | 6 | 85121177 | 1.7×10⁻⁵ | 0.09 |
MeanHDL-C | rs688933 | 9 | 70902713 | 1.8×10⁻⁵ | 0.38 | TRPM3, AJ505026, AL136545
MeanTG | rs314474 | 11 | 11482634 | 1.8×10⁻⁵ | 0.09 | AB094146, IGSF4, BC047021
MeanHDL-C | rs966376 | 18 | 27086836 | 2.1×10⁻⁵ | 0.07 |
MeanHDL-C | rs4459845 | 3 | 127400326 | 2.3×10⁻⁵ | 0.005 | ALDH1L1, CR749807
MeanTG | rs2208401 | 11 | 114826173 | 2.5×10⁻⁵ | 0.09 | AB094146, IGSF4, BC047021
MeanHDL-C | rs3850301 | 14 | 58150885 | 2.8×10⁻⁵ | 8.6×10⁻⁴ | BX161433, DACT1
MeanLDL-C | rs1317538 | 6 | 73706132 | 3.0×10⁻⁵ | 0.02 | BC050689, KCNQ5
MeanHDL-C | rs7021834 | 9 | 71123639 | 3.0×10⁻⁵ | 0.003 | AL136545
MeanLDL-C | rs504527 | 1 | 34142701 | 3.2×10⁻⁵ | 0.007 | CSMD2
MeanHDL-C | rs10488779 | 11 | 79494759 | 3.6×10⁻⁵ | 0.07 |
MeanLDL-C | rs2415621 | 14 | 40114277 | 3.8×10⁻⁵ | 0.005 |
MeanHDL-C | rs10508518 | 10 | 17081479 | 3.9×10⁻⁵ | 0.09 | CUBN
MeanLDL-C | rs1245058 | 1 | 69760085 | 4.1×10⁻⁵ | 0.004 | LRRC7
MeanHDL-C | rs10488780 | 11 | 79490000 | 4.4×10⁻⁵ | 0.06 |
MeanHDL-C | rs541326 | 9 | 70936505 | 4.5×10⁻⁵ | 0.39 | TRPM3, AJ505026, AL136545

2b. Top 25 SNPs for association with MeanLDL-C, MeanHDL-C, or MeanTG based on the lowest p values of the FBAT tests

Phenotype | SNP rs ID* | Chr | Physical location (bp) | FBAT P-value | GEE P-value | Gene (in or near)
MeanLDL-C | rs287474 | 13 | 68173961 | 1.4×10⁻⁸ | 6.3×10⁻⁹ |
MeanLDL-C | rs287354 | 13 | 68137953 | 6.8×10⁻⁸ | 3.4×10⁻⁸ |
MeanTG | rs1449866 | 3 | 144359331 | 1.8×10⁻⁵ | 0.002 | CHST2, AK131346
MeanTG | rs10506354 | 12 | 57759034 | 2.2×10⁻⁵ | 0.007 |
MeanTG | rs10494989 | 1 | 211576634 | 2.4×10⁻⁵ | 4.7×10⁻⁴ | AY552981, AK123409, KCNK2
MeanTG | rs2371978 | 12 | 57759239 | 2.4×10⁻⁵ | 0.01 |
MeanHDL-C | rs10495594 | 2 | 12280997 | 5.1×10⁻⁵ | 1.8×10⁻⁴ |
MeanHDL-C | rs10495593 | 2 | 12279945 | 5.1×10⁻⁵ | 1.8×10⁻⁴ |
MeanLDL-C | rs1411931 | 1 | 37451967 | 5.8×10⁻⁵ | 0.03 |
MeanLDL-C | rs10516433 | 4 | 99800704 | 7.7×10⁻⁵ | 0.03 | TSPAN5
MeanHDL-C | rs2182114 | 1 | 111496269 | 8.2×10⁻⁵ | 0.07 | CHI3L2
MeanTG | rs1835353 | 2 | 124880057 | 8.3×10⁻⁵ | 0.02 | CNTNAP5, AK056528
MeanLDL-C | rs10516434 | 4 | 99802493 | 8.6×10⁻⁵ | 0.03 | TSPAN5
MeanLDL-C | rs10507755 | 13 | 68355991 | 1.1×10⁻⁴ | 1.6×10⁻⁴ |
MeanLDL-C | rs6853079 | 4 | 99800789 | 1.3×10⁻⁴ | 0.06 | TSPAN5
MeanHDL-C | rs1508116 | 2 | 41437773 | 1.3×10⁻⁴ | 0.07 |
MeanLDL-C | rs10518072 | 4 | 71271418 | 1.4×10⁻⁴ | 0.05 | APIN, C4orf7, CSN3
MeanTG | rs715260 | 4 | 9804415 | 1.5×10⁻⁴ | 0.002 | WDR1
MeanTG | rs7784056 | 7 | 77762038 | 1.6×10⁻⁴ | 0.12 | AB014605, MAGI2
MeanHDL-C | rs6505623 | 18 | 1115057 | 1.6×10⁻⁴ | 0.006 |
MeanHDL-C | rs1541296 | 18 | 59173095 | 1.6×10⁻⁴ | 0.02 | FVT1
MeanHDL-C | rs6821328 | 4 | 54058427 | 1.7×10⁻⁴ | 0.02 | SCFD2
MeanTG | rs1501572 | 9 | 2894794 | 1.7×10⁻⁴ | 1.1×10⁻⁴ |
MeanTG | rs4684343 | 3 | 2366098 | 1.8×10⁻⁴ | 0.24 | CNTN4
MeanTG | rs225634 | 6 | 142518312 | 1.9×10⁻⁴ | 2.4×10⁻⁴ | C6orf55

2c. Magnitude and Location of Peak LOD scores > 2.0 for MeanLDL-C, MeanHDL-C, and MeanTG

Phenotype | Exam | Chr | Physical location (bp) | Maximum LOD | 1.5-LOD interval start (bp) | 1.5-LOD interval end (bp)
MeanHDL-C | 1–7 | 7 | 33485983 | 3.30 | 27810820 | 36074331
MeanLDL-C | 1–7 | 9 | 94130181 | 2.83 | 88057135 | 98516033
MeanTG | 1–7 | 1 | 153444389 | 2.73 | 151582274 | 155505440
MeanLDL-C | 1–7 | 3 | 196384998 | 2.12 | 187706181 | 199138789

Chr = chromosome

Linkage LOD scores > 2.0 are presented in Table 2c. The best evidence for linkage was a peak LOD score of 3.3 on chromosome 7 for the MeanHDL-C phenotype. Because the prior probability of any SNP relating to a phenotype is low and given the number of tests, the P value distribution in a GWAS should approach a null distribution. Any strong departure from this expectation might suggest artifacts in genotyping or analysis. For the 70,987 SNPs that passed quality-control filters, the distribution of association P values (generated by the GEE methodology) approached a null distribution but with a slight excess of low P values. For example, for MeanLDL-C, whereas one would expect 1% of SNPs to have P < 0.01 by chance, we found that 1.34% of SNPs had P < 0.01. Similar results were seen for MeanHDL-C and MeanTG (data not shown). We evaluated each SNP's association results across a set of four correlated phenotypes: ApoA-I, LDLNMRsm, MeanHDL-C, and MeanTG (Table 3). Several SNPs were associated with P < 0.01 for 3 of the 4 phenotypes.
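
The check on the P value distribution described in this paragraph amounts to comparing the observed count of small P values with the count expected under the null. A minimal sketch (the function name is hypothetical, and the input is simply a list of per-SNP P values):

```python
from typing import Sequence, Tuple


def tail_excess(p_values: Sequence[float], threshold: float = 0.01) -> Tuple[int, float, float]:
    """Return (observed count, expected count under the null, observed fraction)."""
    n = len(p_values)
    observed = sum(1 for p in p_values if p < threshold)
    expected = n * threshold  # uniform P values under the null
    return observed, expected, observed / n


# Evenly spaced P values behave exactly like the null: 1% fall below 0.01.
observed, expected, fraction = tail_excess([i / 1000 for i in range(1000)])
```

Applied to the figures in the text, 70,987 SNPs at a threshold of 0.01 give an expected count of about 710, whereas an observed fraction of 1.34% corresponds to roughly 950 SNPs.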
Table 3

GEE results for 4 correlated phenotypes (ApoA-I, LDLNMRsm, MeanHDL-C, and MeanTG), ranked by proportion of GEE P < 0.01*

rs id | Chr | Physical position | GEE P ApoA-I | GEE P LDLNMRsm | GEE P MeanHDL-C | GEE P MeanTG | Gene symbol
rs505717 | 19 | 42122156 | 7.8×10⁻⁴ | 0.003 | 1.3×10⁻⁶ | 0.04 | ZNF568
rs3734678 | 6 | 107639853 | 4.7×10⁻⁴ | 0.18 | 8.2×10⁻⁶ | 0.006 | PDSS2
rs2256038 | 21 | 15443047 | 0.11 | 0.002 | 8.8×10⁻⁴ | 6.4×10⁻⁵ |
rs10486788 | 7 | 15788581 | 4.7×10⁻⁴ | 0.005 | 8.5×10⁻⁴ | 0.01 |
rs232885 | 1 | 58824078 | 0.86 | 0.002 | 9.6×10⁻⁴ | 5.7×10⁻⁴ | AB067502
rs4982795 | 14 | 23294875 | 0.001 | 1.9×10⁻⁷ | 0.001 | 0.11 |
rs524802 | 19 | 42138787 | 0.001 | 0.003 | 7.6×10⁻⁷ | 0.02 | ZNF568
rs544543 | 19 | 42122663 | 0.001 | 0.004 | 2.5×10⁻⁴ | 0.03 | ZNF568
rs2823097 | 21 | 15445015 | 0.10 | 0.004 | 0.002 | 8.0×10⁻⁵ |
rs1394041 | 3 | 148579545 | 0.02 | 0.003 | 0.002 | 5.6×10⁻⁵ |
rs10499320 | 7 | 2906624 | 0.001 | 0.001 | 8.8×10⁻⁴ | 0.03 |
rs2889195 | 1 | 65868751 | 6.6×10⁻⁴ | 0.003 | 0.003 | 0.02 |
rs7007797 | 8 | 19921250 | 0.01 | 0.02 | 5.0×10⁻⁴ | 0.002 |
rs7597861 | 2 | 100311705 | 0.08 | 0.004 | 1.0×10⁻⁴ | 0.006 |
rs6063858 | 20 | 50715959 | 0.04 | 3.0×10⁻⁴ | 0.002 | 0.003 |
rs2425524 | 20 | 40750358 | 0.01 | 0.05 | 3.6×10⁻⁴ | 0.008 |
rs1509384 | 2 | 22556488 | 8.5×10⁻⁴ | 0.01 | 0.007 | 0.03 |
rs9317760 | 13 | 68480686 | 0.003 | 0.05 | 6.6×10⁻⁴ | 0.002 |
rs6804032 | 3 | 116031161 | 0.002 | 0.02 | 0.009 | 8.7×10⁻⁴ |
rs8069913 | 17 | 13812167 | 0.03 | 0.009 | 0.001 | 6.7×10⁻⁴ |

*All SNPs in this table had GEE association P values < 0.01 for at least three of the four traits (ApoA-I, LDLNMRsm, MeanHDL-C, and MeanTG). No SNPs in our dataset had GEE association P values < 0.01 for all four traits.

Among the GEE association results, a SNP (rs7007797) in the lipoprotein lipase gene (LPL) was associated with MeanHDL-C (p = 0.0005) and MeanTG (p = 0.002) (Table 4). This SNP is a perfect proxy (r² = 1) for the previously studied rs328 (also known as S447X) [22]. The minor allele of rs328 has been consistently related to higher HDL-C and lower TG. The direction of effect for SNP rs7007797 in our dataset was consistent with previous observations. Because the Affymetrix 100K GeneChip lacks SNPs correlated with previously reported variants (at an r² > 0.5 threshold) in the APOE, PCSK9, CETP, LIPC, and APOA5 genes, we were unable to confirm these other previously reported associations (Table 4).
Table 4

Comparison with the prior literature

Gene | Phenotype | Selected SNPs previously associated with phenotype | # SNPs in Affymetrix 100K within 60 kb of gene | # SNPs in Affymetrix 100K within 60 kb of gene locus and with r² > 0.5 to previously reported SNP | SNP in Affymetrix 100K (r² to previously reported SNP) | p for association in FHS 100K
APOE | LDL-C | rs429358, rs7412 | 1 | 0 | - | -
PCSK9 | LDL-C | rs11591147 | 7 | 0 | - | -
CETP | HDL-C | rs1800775 | 2 | 0 | - | -
LIPC | HDL-C | rs1800588 | 18 | 0 | - | -
LPL | HDL-C | rs328 | 8 | 2 | rs10503669 (r² = 1.0), rs7007797 (r² = 1.0) | rs10503669: 9.1×10⁻⁴; rs7007797: 5.0×10⁻⁴
LPL | TG | rs328 | 8 | 2 | rs10503669 (r² = 1.0), rs7007797 (r² = 1.0) | rs10503669: 0.02; rs7007797: 0.002
APOA5 | TG | rs662799, rs3135506 | 4 | 0 | - | -
Replication is critical to distinguish true positives from false ones in a GWAS. We pursued a three-stage replication strategy with 287 SNPs (P < 0.001 in Stage I) tested in Stage II (n ≈ 1450 individuals) and 40 SNPs (P < 0.001 in joint analysis of Stages I and II) tested in Stage III (n ≈ 6650 individuals). Results are displayed in Table 5. After three stages of replication, there was no convincing statistical evidence for association (i.e., P < 10⁻⁵ in the joint analysis of Stages I, II, and III) between any of the tested SNPs and lipid phenotypes.
Table 5

Association Results for 40 SNPs Attempted for Replication in Three Stages

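SNP | Chr | Gene | Allele* | Trait | Stage I (FHS 100K): P | Beta† | SE | Stage II (FHS Unrelated): P | Beta | SE | Joint P, Stages I & II | Stage III (MDC): P | Beta | SE | Stage III (GOLDN): P | Beta | SE | Joint, Stages I, II & III: P | Beta | SE
rs7231460 | 18 | | C | MeanHDL-C | 4.3×10⁻⁴ | -0.145 | 0.041 | 2.4×10⁻⁴ | -0.131 | 0.036 | 3.7×10⁻⁷ | 0.50 | 0.013 | 0.020 | 0.60 | 0.023 | 0.044 | 0.03 | -0.033 | 0.015
rs966376 | 18 | | C | MeanHDL-C | 2.5×10⁻⁴ | -0.148 | 0.040 | 4.8×10⁻⁴ | -0.125 | 0.036 | 4.5×10⁻⁷ | 0.43 | 0.016 | 0.020 | 0.56 | 0.026 | 0.044 | 0.04 | -0.031 | 0.015
rs744134 | 13 | L10374 | C | MeanLDL-C | 5.4×10⁻⁴ | 0.168 | 0.048 | 4.7×10⁻⁴ | 0.124 | 0.036 | 1.1×10⁻⁶ | 0.67 | 0.009 | 0.020 | 0.20 | 0.062 | 0.048 | 0.0007 | 0.053 | 0.016
rs7233386 | 18 | | T | MeanHDL-C | 2.4×10⁻⁴ | 0.149 | 0.041 | 0.005 | 0.100 | 0.036 | 6.3×10⁻⁶ | 0.32 | -0.020 | 0.020 | 0.61 | -0.023 | 0.046 | 0.11 | 0.024 | 0.015
rs1428445 | 5 | | A | MeanHDL-C | 3.0×10⁻⁴ | -0.187 | 0.052 | 0.01 | -0.113 | 0.044 | 1.8×10⁻⁵ | 0.24 | -0.028 | 0.024 | 0.77 | 0.016 | 0.054 | 0.002 | -0.058 | 0.018
rs2304589 | 2 | LOC130502 | A | MeanLDL-C | 6.4×10⁻⁴ | 0.187 | 0.055 | 0.006 | 0.124 | 0.045 | 1.8×10⁻⁵ | 0.47 | -0.020 | 0.027 | 0.09 | -0.101 | 0.060 | 0.18 | 0.027 | 0.020
rs2278528 | 2 | LOC130502 | G | MeanLDL-C | 6.7×10⁻⁴ | 0.180 | 0.053 | 0.007 | 0.122 | 0.045 | 2.0×10⁻⁵ | 0.51 | -0.018 | 0.027 | 0.12 | -0.092 | 0.060 | 0.14 | 0.029 | 0.020
rs2142136 | 10 | PTEN | G | MeanTG | 8.1×10⁻⁶ | 0.228 | 0.051 | 0.14 | 0.077 | 0.052 | 2.4×10⁻⁵ | 0.38 | -0.026 | 0.029 | 0.85 | 0.011 | 0.061 | 0.06 | 0.040 | 0.021
rs1555173 | 6 | | C | MeanLDL-C | 1.0×10⁻⁵ | -0.243 | 0.055 | 0.10 | -0.083 | 0.050 | 2.7×10⁻⁵ | 0.31 | -0.027 | 0.027 | 0.03 | 0.143 | 0.066 | 0.01 | -0.051 | 0.021
rs6053754 | 20 | FLJ25067 | C | MeanLDL-C | 3.1×10⁻⁴ | 0.166 | 0.046 | 0.02 | 0.090 | 0.037 | 3.4×10⁻⁵ | 0.46 | 0.017 | 0.022 | 0.99 | -0.001 | 0.050 | 0.003 | 0.049 | 0.017
rs8032553 | 15 | KIF7 | A | MeanLDL-C | 3.0×10⁻⁵ | 0.171 | 0.041 | 0.08 | 0.061 | 0.034 | 5.5×10⁻⁵ | 0.09 | 0.033 | 0.020 | Failed | Failed | Failed | 0.0002 | 0.060 | 0.016
rs1741662 | 6 | RFXDC1 | T | MeanTG | 7.2×10⁻⁵ | 0.198 | 0.050 | 0.07 | 0.083 | 0.046 | 5.9×10⁻⁵ | 0.15 | -0.037 | 0.026 | 0.03 | -0.131 | 0.061 | 0.58 | 0.011 | 0.020
rs1741660 | 6 | RFXDC1 | T | MeanTG | 7.8×10⁻⁵ | 0.197 | 0.050 | 0.07 | 0.084 | 0.046 | 6.1×10⁻⁵ | Failed | Failed | Failed | Failed | Failed | Failed | 6×10⁻⁵ | 0.136 | 0.034
rs890945 | 5 | | A | MeanHDL-C | 5.4×10⁻⁴ | -0.181 | 0.052 | 0.03 | -0.099 | 0.044 | 8.0×10⁻⁵ | 0.18 | -0.033 | 0.024 | 0.79 | 0.014 | 0.053 | 0.002 | -0.057 | 0.018
rs10512297 | 9 | | T | MeanLDL-C | 2.1×10⁻⁴ | -0.235 | 0.064 | 0.05 | -0.108 | 0.054 | 8.6×10⁻⁵ | 0.71 | -0.011 | 0.030 | 0.44 | 0.056 | 0.072 | 0.03 | -0.051 | 0.023
rs541326 | 9 | TRPM3 | G | MeanHDL-C | 3.6×10⁻⁵ | -0.201 | 0.049 | 0.28 | -0.061 | 0.056 | 1.3×10⁻⁴ | 0.10 | 0.048 | 0.029 | 0.02 | -0.159 | 0.069 | 0.08 | -0.038 | 0.022
rs719829 | 5 | | G | MeanTG | 1.3×10⁻⁴ | -0.156 | 0.041 | 0.11 | -0.062 | 0.039 | 1.4×10⁻⁴ | 0.66 | 0.009 | 0.022 | 0.58 | 0.028 | 0.050 | 0.08 | -0.028 | 0.016
rs10509384 | 10 | KCNMA1 | G | MeanLDL-C | 7.7×10⁻⁴ | -0.148 | 0.044 | 0.04 | -0.076 | 0.038 | 1.9×10⁻⁴ | 0.02 | 0.050 | 0.021 | Failed | Failed | Failed | 0.79 | -0.005 | 0.017
rs688933 | 9 | TRPM3 | T | MeanHDL-C | 3.1×10⁻⁶ | -0.232 | 0.050 | 0.80 | -0.015 | 0.059 | 2.0×10⁻⁴ | 0.17 | 0.041 | 0.030 | 0.01 | -0.172 | 0.069 | 0.05 | -0.044 | 0.022
rs1122080 | 5 | | A | MeanHDL-C | 3.4×10⁻⁴ | -0.202 | 0.056 | 0.07 | -0.084 | 0.047 | 2.5×10⁻⁴ | 0.44 | -0.020 | 0.025 | 0.47 | -0.041 | 0.058 | 0.005 | -0.055 | 0.019
rs6743779 | 2 | KLF7 | C | MeanLDL-C | 2.4×10⁻⁵ | -0.181 | 0.043 | 0.24 | -0.043 | 0.036 | 2.8×10⁻⁴ | 0.69 | 0.009 | 0.022 | Failed | Failed | Failed | 0.05 | -0.033 | 0.017
rs950828 | 4 | | A | MeanHDL-C | 9.1×10⁻⁴ | -0.151 | 0.046 | 0.06 | -0.080 | 0.043 | 2.9×10⁻⁴ | 0.58 | 0.012 | 0.022 | Failed | Failed | Failed | 0.10 | -0.030 | 0.018
rs1365032 | 8 | | C | MeanLDL-C | 1.4×10⁻⁴ | -0.161 | 0.042 | 0.12 | -0.053 | 0.034 | 3.2×10⁻⁴ | 0.01 | -0.056 | 0.020 | 0.11 | 0.077 | 0.048 | 0.0002 | -0.055 | 0.015
rs287474 | 13 | | T | MeanLDL-C | 8.1×10⁻⁶ | 0.209 | 0.047 | 0.16 | 0.056 | 0.040 | 3.4×10⁻⁴ | 0.97 | 0.001 | 0.020 | 0.62 | -0.023 | 0.046 | 0.05 | 0.030 | 0.016
rs10507756 | 13 | | C | MeanLDL-C | 6.0×10⁻⁵ | 0.202 | 0.050 | 0.22 | 0.052 | 0.043 | 4.2×10⁻⁴ | 0.24 | 0.032 | 0.028 | 0.89 | -0.008 | 0.060 | 0.003 | 0.059 | 0.020
rs1507203 | 13 | | C | MeanLDL-C | 1.9×10⁻⁴ | 0.172 | 0.046 | 0.15 | 0.057 | 0.039 | 4.3×10⁻⁴ | 0.23 | 0.027 | 0.023 | 0.32 | -0.052 | 0.052 | 0.01 | 0.044 | 0.017
rs4941236 | 18 | SERPINB8 | C | MeanLDL-C | 4.7×10⁻⁴ | 0.151 | 0.043 | 0.09 | 0.059 | 0.034 | 4.3×10⁻⁴ | 0.83 | -0.004 | 0.020 | 0.18 | -0.065 | 0.048 | 0.02 | 0.022 | 0.015
rs544543 | 19 | | G | MeanHDL-C | 4.8×10⁻⁶ | -0.185 | 0.040 | 0.61 | -0.019 | 0.038 | 4.7×10⁻⁴ | 0.95 | -0.001 | 0.021 | 0.78 | -0.013 | 0.047 | 0.03 | -0.033 | 0.016
rs3763188 | 6 | REPS1 | C | MeanTG | 5.2×10⁻⁴ | -0.191 | 0.055 | 0.12 | -0.076 | 0.048 | 5.1×10⁻⁴ | 0.13 | 0.042 | 0.028 | 0.02 | -0.153 | 0.063 | 0.09 | -0.035 | 0.021
rs717987 | 6 | RUNX2 | G | MeanHDL-C | 8.0×10⁻⁵ | 0.250 | 0.063 | 0.26 | 0.063 | 0.057 | 5.4×10⁻⁴ | 0.06 | 0.058 | 0.031 | 0.87 | 0.011 | 0.067 | 0.0007 | 0.079 | 0.023
rs10508997 | 10 | | G | MeanTG | 4.1×10⁻⁴ | -0.165 | 0.047 | 0.15 | -0.062 | 0.043 | 5.6×10⁻⁴ | 0.10 | 0.039 | 0.023 | Failed | Failed | Failed | 0.51 | -0.012 | 0.019
rs2395833 | 7 | PRKAR2B | T | MeanHDL-C | 6.2×10⁻⁴ | -0.145 | 0.042 | 0.12 | -0.060 | 0.038 | 5.6×10⁻⁴ | 0.99 | 0.000 | 0.021 | 0.06 | -0.090 | 0.049 | 0.01 | -0.041 | 0.016
rs6677589 | 1 | JUN | G | MeanTG | 1.9×10⁻⁴ | -0.145 | 0.039 | 0.27 | -0.042 | 0.037 | 7.1×10⁻⁴ | 0.18 | 0.028 | 0.021 | 0.84 | -0.010 | 0.048 | 0.30 | -0.016 | 0.016
rs10513320 | 3 | | A | MeanTG | 1.3×10⁻⁴ | -0.190 | 0.050 | 0.31 | -0.046 | 0.045 | 9.0×10⁻⁴ | 0.24 | 0.028 | 0.024 | 0.21 | 0.068 | 0.054 | 0.61 | -0.010 | 0.018
rs508969 | 18 | CETN1 | A | MeanLDL-C | 6.8×10⁻⁴ | -0.144 | 0.042 | 0.18 | -0.053 | 0.040 | 9.5×10⁻⁴ | 0.31 | 0.023 | 0.023 | 0.64 | 0.025 | 0.054 | 0.29 | -0.018 | 0.017
rs7858099 | 9 | | G | MeanLDL-C | 6.9×10⁻⁴ | -0.140 | 0.041 | 0.15 | -0.051 | 0.035 | 9.6×10⁻⁴ | 0.53 | -0.013 | 0.021 | 0.31 | -0.049 | 0.048 | 0.006 | -0.043 | 0.016
rs4134425 | 14 | | T | MeanLDL-C | 6.6×10⁻⁴ | -0.156 | 0.046 | 0.13 | -0.055 | 0.036 | 9.8×10⁻⁴ | 0.14 | 0.032 | 0.021 | 0.16 | -0.068 | 0.048 | 0.24 | -0.019 | 0.016
rs7021834 | 9 | AL136545 | T | MeanHDL-C | 4.3×10⁻⁵ | -0.174 | 0.043 | 0.65 | -0.021 | 0.045 | 9.9×10⁻⁴ | 0.05 | 0.045 | 0.023 | 1.00 | 0.000 | 0.057 | 0.72 | -0.006 | 0.018
rs10500770 | 11 | | T | MeanLDL-C | 3.9×10⁻⁴ | -0.157 | 0.044 | 0.17 | -0.049 | 0.036 | 0.001 | 0.41 | 0.017 | 0.021 | 0.90 | 0.006 | 0.050 | 0.22 | -0.019 | 0.016
rs10516430 | 4 | RAP1GDS1 | T | MeanHDL-C | 7.2×10⁻⁴ | -0.139 | 0.041 | 0.20 | -0.052 | 0.040 | 0.001 | 0.64 | -0.010 | 0.022 | Failed | Failed | Failed | 0.02 | -0.041 | 0.017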

*The allele on the positive strand of the reference genome was modeled in all analyses.

†Beta refers to the proportion of 1 standard deviation unit change in phenotype (phenotype is sex-specific residual adjusted for age and age²) per copy of the allele modeled.

‡"Failed" refers to SNP genotype failure in the sample.


Discussion

We examined associations between Affymetrix 100K SNPs and lipid traits in FHS and identified putative associations with lipid phenotypes. The primary phenotypes were the long-term averages of up to 7 measurements each of LDL-C, HDL-C, and TG; for one phenotype, MeanLDL-C, we observed a nominal P that exceeded genome-wide significance [13]. However, validation of selected hypotheses in additional samples did not identify any new loci underlying variability in blood lipids.

GWAS offers the potential to identify novel genetic variants/loci that are associated with blood lipid variation, unlimited by our current knowledge of lipoprotein biology. However, a central limitation of GWAS is that the true signals are mixed amidst a large number of false positive results. Validation in additional samples is required to distinguish the true positives from the false ones. Replication of initial GWAS findings using a staged design has been suggested to minimize genotyping cost and maximize statistical power [23,24]. An important consideration in such a design is the proportion of markers taken forward to a second stage.

We estimated the statistical power for our three-stage GWAS strategy under the following assumptions: a modest number of markers (all SNPs with P < 0.001 for each phenotype, ~0.1% of markers) are taken forward to Stage II; the Stage II sample size is 1450; SNPs with P < 0.001 are taken forward from Stage II to Stage III; the Stage III sample size is 6650; and the final alpha (after Stages I, II, and III) is set at a conservative 5×10⁻⁸. Under these assumptions, we estimated that we had 89% power to detect a quantitative trait locus explaining 2% of phenotypic variance, 48% power to detect a locus explaining 1% of the variance, and 13% power to detect a locus explaining 0.5% of the variance.

With our replication effort, we failed to identify any novel loci related to blood lipids. At least two potential explanations are possible. First, our study design had limited statistical power to detect common SNPs that explain ≤1% of trait variance. In the Diabetes Genetics Initiative genome-wide association study for blood lipid traits, we recently showed that there are few common variants that explain >2% of the variance and that most SNPs explain <1% of trait variance [25]. To have adequate statistical power to detect these effects given an initial GWAS sample size of ~1000, many more markers (i.e., hundreds of SNPs) will need to be taken to the second and third stages. Second, the limited genomic coverage of the Affymetrix 100K array may have limited our ability to replicate previously reported loci and discover novel loci. For example, using the Affymetrix 500K array, we recently identified glucokinase regulatory protein (GCKR) as a novel locus associated with TG [25]. Of all SNPs on the 500K array, an intronic GCKR SNP (rs780094) explained the greatest proportion of blood TG variance in the Diabetes Genetics Initiative study. However, on the Affymetrix 100K array, there are no SNPs within the 60 kb spanning GCKR.
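To make the power figures above more concrete, the following is a minimal sketch of how the power of a single stage can be approximated. It is not the authors' calculation. It assumes unrelated individuals (FHS participants are in fact family members), a two-sided 1-degree-of-freedom additive test, and a non-centrality parameter of N·q²/(1−q²) for a locus explaining a fraction q² of trait variance. The function name `stage_power` is hypothetical. Under these simplifications the Stage I filter alone (n = 1087, P < 0.001) yields power of about one half for a locus explaining 1% of variance, which suggests that the first-stage threshold, rather than the later stages, may be what mainly limits the overall power reported above.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def stage_power(n, var_explained, alpha):
    """Approximate power of a two-sided 1-df association test.

    Assumptions (illustrative, not taken from the paper): n unrelated
    individuals, additive model, and non-centrality parameter
    ncp = n * q2 / (1 - q2), where q2 is the fraction of phenotypic
    variance explained by the locus.
    """
    ncp = n * var_explained / (1.0 - var_explained)
    z_crit = _STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)
    shift = sqrt(ncp)
    # A 1-df non-central chi-square is the square of a shifted standard
    # normal, so both rejection tails can be evaluated directly.
    return _STD_NORMAL.cdf(shift - z_crit) + _STD_NORMAL.cdf(-shift - z_crit)


if __name__ == "__main__":
    # Stage I sample size and threshold as stated in the text.
    for q2 in (0.02, 0.01, 0.005):
        power = stage_power(n=1087, var_explained=q2, alpha=0.001)
        print(f"variance explained {q2:.3f}: Stage I power {power:.2f}")
```

A full three-stage calculation would additionally condition on passing each threshold and on the joint test at the final alpha, so this sketch should be read only as an upper bound from the first filter.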

Strengths and limitations

This study is distinguished by the availability of serial lipid phenotypes over a 30-year time span, the community-based nature of the collection, and the routine ascertainment of covariates in a standardized clinical examination. We acknowledge several limitations. These include the lack of validation for the imputation methodology used to address lipid-lowering therapy, limited statistical power due to sample size, and confinement to a single ancestral group: whites of European ancestry.

Conclusions & future directions

Using a 100K genome-wide scan, we present association and linkage results for a rich set of lipid phenotypes in FHS. This resource may be useful for comparisons with other GWAS currently in progress. GWAS in FHS using a denser genome-wide genotyping platform and a better-powered replication strategy may identify novel loci underlying blood lipids.

Abbreviations

FBAT = family-based association test; GEE = generalized estimating equations.

Competing interests

The authors declare that they have no competing interests.

Authors' contributions

SK, AM, SD, GP, JMO, and LAC participated in the design of the study and the interpretation of the data. AM, GP, and SD conducted the statistical analyses. SK drafted the manuscript. AS, CG, LG, and NPB generated replication genotype data and analyses. OM and MOM provided replication samples and conducted association analyses in the Malmö Diet and Cancer Study. DKA and JMO provided replication samples and led the generation of lipid phenotypes in the GOLDN study. SD, SK, RD, JMO, and LAC revised the manuscript critically for important intellectual content. All authors read and approved the above manuscript.
  25 in total

1.  Complement factor H polymorphism in age-related macular degeneration.

Authors:  Robert J Klein; Caroline Zeiss; Emily Y Chew; Jen-Yue Tsai; Richard S Sackler; Chad Haynes; Alice K Henning; John Paul SanGiovanni; Shrikant M Mane; Susan T Mayne; Michael B Bracken; Frederick L Ferris; Jurg Ott; Colin Barnstable; Josephine Hoh
Journal:  Science       Date:  2005-03-10       Impact factor: 47.728

2.  A haplotype map of the human genome.

Authors: 
Journal:  Nature       Date:  2005-10-27       Impact factor: 49.962

3.  A common genetic variant in the NOS1 regulator NOS1AP modulates cardiac repolarization.

Authors:  Dan E Arking; Arne Pfeufer; Wendy Post; W H Linda Kao; Christopher Newton-Cheh; Morna Ikeda; Kristen West; Carl Kashuk; Mahmut Akyol; Siegfried Perz; Shapour Jalilzadeh; Thomas Illig; Christian Gieger; Chao-Yu Guo; Martin G Larson; H Erich Wichmann; Eduardo Marbán; Christopher J O'Donnell; Joel N Hirschhorn; Stefan Kääb; Peter M Spooner; Thomas Meitinger; Aravinda Chakravarti
Journal:  Nat Genet       Date:  2006-04-30       Impact factor: 38.330

4.  Joint analysis is more efficient than replication-based analysis for two-stage genome-wide association studies.

Authors:  Andrew D Skol; Laura J Scott; Gonçalo R Abecasis; Michael Boehnke
Journal:  Nat Genet       Date:  2006-01-15       Impact factor: 38.330

5.  Evaluating and improving power in whole-genome association studies using fixed marker sets.

Authors:  Itsik Pe'er; Paul I W de Bakker; Julian Maller; Roman Yelensky; David Altshuler; Mark J Daly
Journal:  Nat Genet       Date:  2006-05-21       Impact factor: 38.330

6.  Low LDL cholesterol in individuals of African descent resulting from frequent nonsense mutations in PCSK9.

Authors:  Jonathan Cohen; Alexander Pertsemlidis; Ingrid K Kotowski; Randall Graham; Christine Kim Garcia; Helen H Hobbs
Journal:  Nat Genet       Date:  2005-01-16       Impact factor: 38.330

7.  Cholesteryl ester transfer protein TaqIB variant, high-density lipoprotein cholesterol levels, cardiovascular risk, and efficacy of pravastatin treatment: individual patient meta-analysis of 13,677 subjects.

Authors:  S M Boekholdt; F M Sacks; J W Jukema; J Shepherd; D J Freeman; A D McMahon; F Cambien; V Nicaud; G J de Grooth; P J Talmud; S E Humphries; G J Miller; G Eiriksdottir; V Gudnason; H Kauma; S Kakko; M J Savolainen; M Arca; A Montali; S Liu; H J Lanz; A H Zwinderman; J A Kuivenhoven; J J P Kastelein
Journal:  Circulation       Date:  2005-01-17       Impact factor: 29.690

Review 8.  Lipoprotein lipase S447X: a naturally occurring gain-of-function mutation.

Authors:  Jaap Rip; Melchior C Nierman; Colin J Ross; Jan Wouter Jukema; Michael R Hayden; John J P Kastelein; Erik S G Stroes; Jan Albert Kuivenhoven
Journal:  Arterioscler Thromb Vasc Biol       Date:  2006-03-30       Impact factor: 8.311

9.  The sex-specific genetic architecture of quantitative traits in humans.

Authors:  Lauren A Weiss; Lin Pan; Mark Abney; Carole Ober
Journal:  Nat Genet       Date:  2006-01-22       Impact factor: 38.330

10.  A common genetic variant is associated with adult and childhood obesity.

Authors:  Alan Herbert; Norman P Gerry; Matthew B McQueen; Iris M Heid; Arne Pfeufer; Thomas Illig; H-Erich Wichmann; Thomas Meitinger; David Hunter; Frank B Hu; Graham Colditz; Anke Hinney; Johannes Hebebrand; Kerstin Koberwitz; Xiaofeng Zhu; Richard Cooper; Kristin Ardlie; Helen Lyon; Joel N Hirschhorn; Nan M Laird; Marc E Lenburg; Christoph Lange; Michael F Christman
Journal:  Science       Date:  2006-04-14       Impact factor: 47.728

