Literature DB >> 22303337

Pathway-Wide Association Study Implicates Multiple Sterol Transport and Metabolism Genes in HDL Cholesterol Regulation.

Kai Wang1, Andrew C Edmondson, Mingyao Li, Fan Gao, Atif N Qasim, Joseph M Devaney, Mary Susan Burnett, Dawn M Waterworth, Vincent Mooser, Struan F A Grant, Stephen E Epstein, Muredach P Reilly, Hakon Hakonarson, Daniel J Rader.   

Abstract

Pathway-based association methods have been proposed to be an effective approach in identifying disease genes, when single-marker association tests do not have sufficient power. The analysis of quantitative traits may be benefited from these approaches, by sampling from two extreme tails of the distribution. Here we tested a pathway association approach on a small genome-wide association study (GWAS) on 653 subjects with extremely high high-density lipoprotein cholesterol (HDL-C) levels and 784 subjects with low HDL-C levels. We identified 102 genes in the sterol transport and metabolism pathways that collectively associate with HDL-C levels, and replicated these association signals in an independent GWAS. Interestingly, the pathways include 18 genes implicated in previous GWAS on lipid traits, suggesting that genuine HDL-C genes are highly enriched in these pathways. Additionally, multiple biologically relevant loci in the pathways were not detected by previous GWAS, including genes implicated in previous candidate gene association studies (such as LEPR, APOA2, HDLBP, SOAT2), genes that cause Mendelian forms of lipid disorders (such as DHCR24), and genes expressing dyslipidemia phenotypes in knockout mice (such as SOAT1, PON1). Our study suggests that sampling from two extreme tails of a quantitative trait and examining genetic pathways may yield biological insights from smaller samples than are generally required using single-marker analysis in large-scale GWAS. Our results also implicate that functionally related genes work together to regulate complex quantitative traits, and that future large-scale studies may benefit from pathway-association approaches to identify novel pathways regulating HDL-C levels.

Entities:  

Keywords:  GWAS; HDL-C; cholesterol; genetic association; lipid; pathway analysis; sterol metabolism; sterol transport

Year:  2011        PMID: 22303337      PMCID: PMC3268595          DOI: 10.3389/fgene.2011.00041

Source DB:  PubMed          Journal:  Front Genet        ISSN: 1664-8021            Impact factor:   4.599


Introduction

Genome-wide association studies (GWAS) have been very successful in identifying genes or loci that are associated with complex diseases or quantitative traits. However, for most GWAS, the collection of known loci explains only a minor fraction of disease heritability (Maher, 2008; Manolio et al., 2009). Since most GWAS are not powered to detect the majority of genuinely associated loci, researchers could expand sample size indefinitely in an attempt to find more associated loci; however, this approach has its limitations as the GWAS are likely to have diminishing returns. That is, after the first few loci (presumably more readily detectable with larger effect sizes) have been identified and replicated, the remaining loci will be more and more difficult to find as they will have much smaller effect sizes (Goldstein, 2009). An alternative approach to detect trait-associated loci in GWAS, which can provide complementary information to single-marker analysis, is a pathway-based approach, which examines whether test statistics for related genes in the same pathway are consistently associated with a disease trait under study (Wang et al., 2010). There have been a few published examples demonstrating the power and effectiveness of pathway-wide association approaches for several diseases. We have previously developed a pathway association approach (Wang et al., 2007), by adopting the Gene Set Enrichment Analysis (Subramanian et al., 2005) strategies, but with appropriate modifications to account for the linkage disequilibrium between SNP markers, the different sizes of genes and pathways. This approach has been successfully applied to uncover association between the IL12/IL23 pathway and Crohn’s disease (Wang et al., 2009a), the involvement of cell adhesion molecules in autism spectrum disorders (Wang et al., 2009b), as well as the WNT-signaling pathway in Type 2 Diabetes (T2D) (Perry et al., 2009). Additionally, Lesnick et al. (2007) have demonstrated that SNPs in the candidate pathway (axon guidance) collectively predisposed to a Parkinson’s disease, even though no individual SNP reaches genome-wide significance level. Dinu et al. (2007) applied a pathway approach and discovered that multiple complement factor pathway genes, in additional to complement factor H (a well known susceptibility locus), are associated with risk for developing age-related macular degeneration (AMD). Torkamani et al. (2008) applied a pathway-based approach to analyze the Wellcome Trust Case Control Consortium (WTCCC) data sets, and discovered numerous pathways implicated in disease predisposition, many of which have long been assumed to contain polymorphic genes that lead to disease predisposition. Askland et al. (2009) analyzed GWAS data sets on bipolar disorder using a pathway association approach, and implicated genes mediating ion channel activity and synaptic neurotransmission. Holmans et al. (2009) also described a pathway approach and applied it on meta-analysis of bipolar disorder, and implicated a collection of genes involved in the modulation of transcription and cellular activity. Furthermore, several recent studies have focused on incorporating biological network structure into the analysis of GWAS (Pan, 2008; Baranzini et al., 2009). Collectively, all these studies show that pathway-based approaches can be applied to GWAS data sets to gain insights into disease susceptibility mechanisms, and to provide enhanced or complementary information to the conventional single-marker analysis. Unlike previous disease studies, our current study focused on a quantitative trait, plasma levels of high-density lipoprotein cholesterol (HDL-C), to examine whether pathway-based approaches can be effective in enriching a group of truly associated genes. HDL-C, together with low-density lipoprotein cholesterol (LDL-C) and triglyceride (TG), are three major lipid phenotypes in the human serum. Lipid levels are heritable traits that are strongly associated with cardiovascular risk, therefore they are widely measured in clinical practice. Lipid measures represent excellent secondary phenotypes for genetic investigation by many GWAS, and relatively large sample sizes can be obtained to study these traits. Three large-scale GWAS have been reported to identify genetic loci associated with lipid traits: Kathiresan et al. (2009) conducted GWAS screens in 19,840 individuals and replication in up to 20,623 individuals, and identified 30 loci (36 trait-loci associations) controlling HDL-C, LDL-C, or TG levels. Sabatti et al. (2009) have assayed a birth cohort from a founder population with ∼4,500 subjects, and identified 17 trait-loci associations for lipid levels. Aulchenko et al. (2009) analyzed 16 European population cohorts with ∼20,000 subjects and identified 22 trait-loci associations on lipid traits. More recently, Teslovich et al. (2010) have published an even larger meta-analysis on lipid traits involving over 100,000 subjects. Despite these early successes, a more complete ensemble of lipids genes is needed for better understanding of the gene networks involved, and the current successes probably merely represent the start of a long road (Edmondson and Rader, 2008; Holleboom et al., 2008) for the following reasons: First, among genes that cause Mendelian disorders of lipid traits, only a subset of them have been identified in GWAS (Edmondson and Rader, 2008). Second, multiple candidate genes believed to affect HDL-C levels based on preclinical models have not yet been identified in GWAS for HDL-C. Finally, the collection of HDL-C genes identified in GWAS only explains a minor fraction of inter-individual variability in lipoprotein levels (Kathiresan et al., 2009). Therefore, it is likely that many more genes regulating HDL-C levels remain to be discovered. In this study, we applied a pathway-based association approach on a GWAS for the purpose of identifying genes regulating HDL-C levels. The study is distinct from previous GWAS on lipids in that it was originally designed and ascertained as a case/control study, so the study participants have either extremely high or low HDL-C levels but without syndromic dyslipidemia features. Although the initial single-marker association analysis identified only one well known locus that significantly associates with HDL-C, the pathway-based association approach resulted in ∼100 loci that potentially associate with HDL-C, including multiple previously known lipid loci. We further validated some of the novel findings through literature survey of previously known Mendelian forms of lipid disorders, previously published candidate gene association studies, as well as the examination of metabolic phenotypes of gene knockout mice maintained by the Jackson laboratory. Our study has significant implications for the study of quantitative traits and the analysis of GWAS data: First, it demonstrated that a modest case/control data set sampling from two extreme tails of a quantitative trait can be very useful in identifying genuinely associated genes. Second, it illustrated that pathway-wide association approaches could provide biological insights that are not readily available from single-marker analysis on orders of magnitude larger sample sizes.

Materials and Methods

Study participants

Discovery cohort

Cases and control subjects were ascertained for the purpose of identifying genetic variation regulating HDL-C levels. All study participants were self-declared to be of European ancestry, and were genotyped on the Affymetrix Genome-wide Human SNP 6.0 arrays. The “cases” were defined as persons with HDL-C levels >94th percentile for age and gender. The control subjects were defined as persons with HDL-C <25th percentile for age and gender. Based on the whole-genome genotyping data, we performed multi-dimensional scaling analysis on a set of independent SNP markers (Figure S1 in Supplementary Material), to ensure that only subjects of genetically inferred European ancestry were used in analysis. We also removed subjects with call rates of SNPs less than 95%. Additionally, we have removed pairs of subjects with cryptic relatedness, as defined by identity-by-descent score higher than 0.2. The final data set for association analysis included a total of 653 cases and 784 control subjects with 681,050 SNP markers. The genomic control inflation factor was 1.057, indicating that cases and control subjects were well matched and that no systematic batch effects were present.

Replication cohort

The replication cohort was compiled from PennCATH and MedStar, two ongoing GWAS on coronary artery disease, which were previously described in detail (Myocardial Infarction Genetics Consortium et al., 2009). The PennCATH study is a University of Pennsylvania Medical Center based angiographic study of over 3,800 subjects that has been used previously for replication of novel genes and risk factors for atherosclerotic cardiovascular disease and T2D (Helgadottir et al., 2006; Steinthorsdottir et al., 2007). The MedStar study is a Washington Hospital Center based angiographic study of 1,500 subjects specifically designed for biomarker and genetic association studies of acute and chronic coronary atherosclerosis. Since the PennCATH and MedStar samples were genotyped together at the same site and were analyzed by identical procedures, to improve power to detect association, we combined the two data sets together. To define HDL-C case–control status, the subjects were dichotomized by sex-specific cutoff points: those subjects with HDL-C higher than 80% percentile were regarded as “cases” and those with HDL-C levels lower than 20% percentile were regarded as control subjects. The thresholds differ from the discovery cohort, because the discovery cohort is specifically designed and ascertained for studying high HDL levels, whereas the PennCATH and MedStar were not ascertained in this manner. Subjects who participated in the discovery cohort were not included in the replication cohort. Through these criteria, a total of 496 cases and 373 control subjects were included in the analysis.

Ethics statement

All subjects have been enrolled in a University of Pennsylvania or Washington Hospital Institutional Review Board approved protocol, and all subjects gave written informed consent.

Pathway association analysis

The method for pathway-based association test was previously described in detail elsewhere (Wang et al., 2007, 2009a). Briefly, we assign each SNP to the overlapping gene/genes, or its closest gene (subject to a distance constraint) in the genome if it does not overlap with a gene. For each gene, we assigned the highest statistic value among all SNPs mapped to the gene as the statistic value of the gene. For all the N genes that are represented by SNPs in the GWAS, we sorted their statistic values from the largest to the smallest, denoted by r(1), …, r(. For any given gene set S composed of NH genes, we then calculated a weighted Kolmogorov–Smirnov-like running sum statistic(Hollander and Wolfe, 1999) that reflects the overrepresentation of genes within the set S at the top of the entire ranked list of genes in the genome, where and p (default value = 1) is a parameter that gives higher weight to genes with extreme statistic values. To adjust for differences in gene size (hence the different number of SNPs located within/nearby each gene), as well as the linkage disequilibrium between SNPs within the same gene, we conducted a two-step correction procedure. In the first step, we permuted the disease labels of all samples ensuring the same number of individuals in each phenotype group for case–control studies. During each permutation (denoted by π), we repeated the calculation of enrichment score as described above as ES(S, π). In the second-step, we calculated normalized enrichment score (NES) as a Z-score, defined as [ES(S) − mean(ES(S, π))]/SD[ES(S, π)], so that different gene sets are directly comparable to each other. The P-value can be calculated from the π permutations, and a false discovery rate (FDR) procedure can be used to control the fraction of expected false positive findings below a certain threshold (Reiner et al., 2003). For a gene set, let NES* denote the NES in the observed data. The FDR is calculated as Alternatively, we also calculated the FWER P-value as the fraction of all permutations whose highest NES score in all gene sets are higher than the NES* for a given gene set. For the discovery cohort, a total of 10,000 phenotype permutation cycles were used so that a more accurate P-value can be calculated for each pathway; for replication cohort, 1,000 phenotype permutation cycles were used. The software implementing this method was made freely available at http://www.openbioinformatics.org/gengen. The current software uses chi-square test statistics for association, so it cannot incorporate covariates in the analysis. The software package included pathway definitions for the three pathway collections (KEGG, BioCarta, GO) used in the study, as well as library files mapping SNP identifiers to gene names for the Affymetrix Genome-wide Human SNP 6.0 array. The pathway collections were accessed and assembled on March 03, 2008. These default library files, as well as all default parameters in the software, were used in our pathway association analysis.

Results

Single-marker and pathway association analysis on HDL-C levels

We examined a previously unpublished cohort on HDL-C levels (discovery cohort, Table 1), genotyped on the Affymetrix Genome-wide Human SNP 6.0 arrays. Unlike other disease studies that investigated HDL-C levels as an additional/secondary phenotype for investigation, this cohort consists of “cases” with extremely high HDL-C levels and “control” subjects with low HDL-C levels (see Material and Methods). One locus on 16q13 reached genome-wide significance levels of P < 5 × 10−8 (Figure 1), with the most significant SNP being rs1532625 (P = 6.8 × 10−12, OR = 1.68) within CETP (cholesteryl ester transfer protein). This locus was a well-established HDL-C locus before the GWAS era (Koizumi et al., 1985; Kurasawa et al., 1985; Inazu et al., 1990). Therefore, through single-marker association analysis, no novel lipid genes were identified that met genome-wide significance criterion, though we were able to replicate several loci from other published studies on lipid levels (Table S1 in Supplementary Material). Altogether, these single-marker association results testified the high quality of the phenotype data, and demonstrated the unique power of case/control data set that samples from extreme tails of normal distributions on a quantitative trait.
Table 1

The most significantly associated pathways in the discovery GWAS data set.

Pathway IDDescriptionNumber of genesZ-scoreP-valueFDRReplication P-value
GO0016125Sterol metabolic process893.665<0.00010.0830.029
GO0008202Steroid metabolic process1403.6080.00040.0520.17
GO0015918Sterol transport253.475<0.00010.0560.083
GO0006096Glycolysis453.0770.00070.1610.60
Insulin pathwayInsulin Signaling Pathway222.6420.00390.4840.83
hsa00960Alkaloid biosynthesis II202.5180.00080.5690.50
hsa04360Axon guidance1282.410.00850.6530.81
Figure 1

Manhattan plot of the single-marker association analysis on the discovery cohort. Except CETP on 16p13.3, single-marker analysis failed to identify any locus reaching genome-wide significance (P < 5 × 10−]).

The most significantly associated pathways in the discovery GWAS data set. Manhattan plot of the single-marker association analysis on the discovery cohort. Except CETP on 16p13.3, single-marker analysis failed to identify any locus reaching genome-wide significance (P < 5 × 10−]). To screen for novel biological pathways or collections of functionally related genes that associate with HDL-C levels, we next applied a pathway-based association approach (Wang et al., 2007) on this GWAS data set, with the same set of subjects and SNPs used in single-marker association analysis. This pathway association method permutes the case/control labels a large number of times, and for each permuted data set, it re-calculates the association test statistics for all SNP markers, and then assesses the significance of the pre-defined pathways or sets of genes by comparing the observed test statistics with the null distribution generated by the permutations. The collection of pathways or gene sets analyzed in the study was retrieved from the BioCarta (BioCarta, BioCarta-Charting pathways of life), KEGG (Kanehisa et al., 2010) and level 4 of the Gene Ontology (Ashburner et al., 2000) databases. We performed 10,000 permutations and assessed a total of 568 pathways passing size threshold (20–200 genes). We found that seven pathways were associated with HDL-C levels at a nominal P-value less than 0.01 (Table 1), slightly more than expected by chance (5.68). Among these pathways, four have false discovery rate (FDR) less than 25%. However, we note that all genes in the “sterol metabolic process” pathway belong to the “steroid metabolic process” pathway as both were from the Gene Ontology collection (Figure S2 in Supplementary Material), therefore we excluded the “steroid metabolic process” pathway from further discussion as it has lower Z-score and P-value than its subset. The full list of genes and representative SNPs for the sterol metabolism and transport pathways were given in Figure S2 in Supplementary Material. To further evaluate the sensitivity of the pathway analysis to the inclusion of CETP – a well known HDL-C gene that showed genome-wide significance in our single-marker association analysis – we removed CETP from the pathway analysis. This practice is similar to removing APOE from pathway analysis on Alzheimer’s disease (Jones et al., 2010) or removing NOD2 from pathway analysis on Crohn’s disease (Wang et al., 2009a). We still observed association signals for the “sterol metabolic process” and “sterol transport” pathways but with reduced significance levels (P = 0.0011 and P = 0.007, respectively), suggesting that the observed pathway associations above were not due to the effect of CETP alone. Similar to single-marker association tests, pathway-based association tests may be susceptible to spurious associations, so we next sought to replicate the pathway findings using independent data sets. Since the pathway association method requires phenotype permutation of case/control status, we compiled a case–control dataset (496 cases, 373 controls) from ongoing GWAS at University of Pennsylvania which have fasting lipid measures on some study participants (see Materials and Methods). The replication P-values for sterol metabolic process pathway and sterol transport pathway are 0.029 and 0.083, respectively. These values do not reach stringent levels of statistical significance considering the need to test two hypotheses, and this may be due to the lack of power in the replication study.

Additional evidence supporting pathway association signals

Although our study demonstrated that a group of genes were collectively associated with HDL-C levels, only one locus (CETP) truly showed a genome-wide significance signal in single-marker association analysis. The Gene Ontology pathways were curated from heterogeneous data sources such as electronic annotation from protein homology searches, so they may contain genes that are not genuinely associated with HDL-C levels. Therefore, we applied three complementary approaches, to search for additional evidence supporting the role of a subset of genes in the pathways in regulating HDL-C levels, and summarized the results in Table 2.
Table 2

List of genes in the sterol transport/metabolism pathway with additional evidence of association with lipid traits.

GeneLocusPrevious GWAS (HDL)Previous GWAS (LDL/TG)Teslovich GWAS (HDL)Teslovich GWAS (LDL/TG)Knockout mouse phenotypeCandidate gene associationMedelian lipid disorders
ELA3A1p36.12
ELA3B
LDLRAP11p36.11YesElevated LDL-C (Jones et al., 2003)Familial hypercholesterolemia (Garcia et al., 2001; Soutar and Naoumova, 2007)
NR0B21p36.11
STX121p35.3
DHCR241p32.3No plasma or tissue cholesterol (Wechsler et al., 2003)Desmosterolosis (Waterham et al., 2001)
PCSK91p32.3YesYesDecreased plasma cholesterol (Rashid et al., 2005)Decreased LDL-C and protection from CAD (Cohen et al., 2006)Autosomal dominant hypercholesterolemia (Abifadel et al., 2003)
PRKAA21p32.2
ANGPTL31p31.3YesYesDecreased plasma lipids (Koishi et al., 2002)
LEPROT1p31.3
LEPRIncreased circulating cholesterol level (JAX1)Sun et al. (2009)
HMGCS21p12
PMVK1q22
FDPS1q22
APOA21q23.3Decreased HDL-C level (JAX1)Chen et al. (2009)Familial hypercholesterolemia modifier (Takada et al., 2002)
SOAT11q25.2Increased circulating cholesterol level (JAX1)
CYB5R11q32.1
APOB2p24.1YesYesYesEmbryonic lethal, heterozygotes have decreased plasma cholesterol (Huang et al., 1995), hyopbetalipoproteinemia (Homanics et al., 1993)Hypobetalipoproteinemia (Young et al., 1987)
SNX172p23.3
ABCG82p21YesYesSitosterolemia (Klett et al., 2004)Sitosterolemia (Berge et al., 2000)
INSIG22q14.2
HDLBP2q37.3Cánovasa et al. (2009)
SCAP3p21.31
CNBP3q21.3
SC4MOL4q32.3
PRKAA15p13.1
HMGCS15p12
HMGCR5q13.3YesYes
IL45q23.3
APOM6p21.33YesAccumulation of large HDL particles (Wolfrum et al., 2005)
PPARD6p21.31
TRERF16p21.1
CYP39A16p12.3
NPC1L17p13YesIncreased cholesterol level (JAX1)
PON17q21.3Increased LDL-C level (JAX1)
CAV17q31.2
LEP7q32.1
AKR1D17q34
PRKAG27q36.1
INSIG17q36.2
CLN88p23.3
FDFT18p23.1
CYP7A18q12.1YesNearly absent cholesterol absorption (JAX1)Couture et al. (1999), Wang et al. (1998)Hypercholesterolemia (Pullinger et al., 2002)
CYP7B18q12.3Elevated levels of 2 different oxysterols in serum and tissue (Li-Hawkins et al., 2000)
SQLE8q24.13
VLDLR9p24.2
OPRS19p13.3
ABCA19q31.1YesYesYesNearly absent HDL-C (Orso et al., 2000)Tangier disease (Bodzioch et al., 1999; Brooks-Wilson et al., 1999; Rust et al., 1999)
CEL9q34.2
RXRA9q34.2
ABCA29q34.3
IDI210p15.3
IDI110p15.3
CUBN10p13
CH25H10q23.31
OSBPL511p15.4
CYB5R211p15.4
TM7SF211q13.1
DHCR711q13.4Reduction of serum and tissue cholesterol levels (Wassif et al., 2001)Smith-Lemli-Opitz Syndrome (Shefer et al., 1995)
APOA111q23.3YesYesYesYes
APOA4
APOC3
SC5DL11q23.3Stillborn and with decreased cholesterol levels (Krakowiak et al., 2003)Lathosterolosis (Krakowiak et al., 2003)
SORL111q23.3
SOAT212q13.13Reduced cholesterol ester synthesis (Willner et al., 2003)Chen et al. (2009)
MVK12q24.11YesYes
SCARB112q24.31YesAbnormal HDLs (Miettinen et al., 2001)Roberts et al. (2007)
EBPL13q14.2
NPC214q24.3
CYP46A114q32.2
LIPC15q21.3YesYesYesHepatic lipase deficiency (Breckenridge et al., 1982)
CLN615q23
CYP11A115q24.1
APOB48R16p11.2
CETP16q13YesYesYesCETP deficiency (Brown et al., 1989)
MBTPS116q23.3
MVD16q24.3
SCARF117p13.3
SREBF117p11.2
STARD317q12YesMild defects in cholesterol metabolism (JAX1)
PCTP17q23.2
FDXR17q25.1
NPC118q11.2
OSBPL1A18q11.2
ACAA218q21.1
VPS4B18q21.33
LDLR19p13.2YesYesElevated LDL-C (Ishibashi et al., 1993)Homozygous familial hypercholesterolemia (Hobbs et al., 1986)
LIPE19q13.2Increased HDL concentration (Haemmerle et al., 2002)
APOC119q13.31YesYesYes
APOEHypercholesterolemia (Zhang et al., 1992)
APOC2ApoC2 deficiency (Breckenridge et al., 1978)
SOD121q22.11
ABCG121q22.3Decreased HDL on high fat diet (Wiersma et al., 2009)
SEC14L222q12.2
APOL222q12.3
APOL122q12.3
SREBF222q13.2
CYB5R322q13.2
TSPO22q13.2
MBTPS2Xp22.11
EBPXp11.23
NSDHLXq28

1http://jaxmice.jax.org/query

List of genes in the sterol transport/metabolism pathway with additional evidence of association with lipid traits. 1http://jaxmice.jax.org/query First, we examined the sterol transport and metabolism genes in three independent GWAS [the Kathiresan et al. (2009) meta-analysis, the Aulchenko et al. (2009) meta-analysis, the Sabatti et al. (2009) study on founder populations), as well as the large-scale Teslovich et al. (2010) meta-analysis. Among the 102 unique genes (96 unique loci), genes in the sterol transport and metabolism pathways, 9 and 15 loci were reported as genuine HDL-C loci and/or LDL-C/TG loci in at least one of these publications, respectively (Table 2). The abundance of previously known lipid genes in the identified pathway therefore served as de novo confirmation supporting the biological relevance of the pathways. Additionally, when examining previously published candidate gene association studies, we found that multiple genes in the sterol transport/metabolism pathway, such as LEPR, APOA2, SOAT2, HDLBP, were previously associated with lipid levels (Table 2), albeit with relatively small effect sizes and in some cases not replicated. Second, we examined published literature on monogenic form of lipid disorders, and determined whether any of the sterol transport/metabolism genes have been previously associated with such diseases. Interestingly, multiple additional genes, such as DHCR24 and APOA2, have never been implicated by GWAS on lipid traits, but were identified as genuine lipid genes through the study of rare forms of lipid disorders (Table 2). It is possible that no common causal variants in these genes exist in European populations, or that the effect sizes of causal variants are too small to be detected by the sample sizes used in previous studies. Nevertheless, this fact implicates the complementary nature of pathway-based association approach to single-marker association approach, and supports the utility of the pathway-based approach in identifying biologically relevant genes. Third, we examined previous functional studies conducted on model organisms, including mouse knockout studies. Unlike many complex human diseases where genetic models are difficult to replicate or study in animal models, the lipid traits are easily measurable and quantifiable, so they were almost always included in knockout experiments, facilitating the functional study of lipid genes. This analysis revealed multiple additional previously unreported genes, including SOAT1, LEPR, and PON1, that had a metabolic phenotype (increased or decreased cholesterol levels) in knockout experiments (Table 2). Altogether, the three complementary approaches presented above resulted in the confirmation of a prioritized set of genes that have strong a prior evidence of playing a role in HDL-C regulation. Although the sterol transport/metabolism pathways contains 102 genes, this particular subset of genes may represent a prioritized subset that are promising candidates to be investigated in functional studies of lipid traits.

Discussion

In this study, we performed pathway-based genome-wide association analysis on two GWAS data sets, and demonstrated that a group of sterol transport and metabolism genes are collectively associated with HDL-C levels. We further validated a subset of these genes through examination of previously published candidate gene association studies, known Mendelian forms of lipid disorders, and known dyslipidemia phenotypes in knockout mice. Compared to previously published large-scale GWAS, our results supplied a richer list of additional candidate genes regulating HDL-C levels, and demonstrated the enhanced power of pathway-based association analysis on finding biologically relevant genes. Given the relatively small sample size, we were not able to identify previously unknown pathways as being involved in regulation of HDL-C levels; however, the fact that known pathways and genes are identified suggest that the approach may be applicable to larger sample sets to identify novel pathways. The physiology of HDL metabolism results from the complex interplay of multiple gene products including apolipoproteins, enzymes, lipid transfer proteins, cellular transporters, and receptors. Although the pathways identified in the study are intuitive in a biological sense, caution should be exercised as some of the genes within the pathways probably play more important and direct roles than other genes in regulating HDL-C homeostasis. This explains the rationale of our three-tiered prioritization approach to identify subsets of HDL-C genes for future functional studies. We also wish to discuss a few points related to sample ascertainment scheme. One particular point is that all individuals in the discovery cohort are ascertained from the general population with extreme lipids measures, yet the individuals in the replication cohort have CAD. HDL-C levels might be more disturbed by lipid medications or other clinical factors in CAD patients. Therefore, although it would be valid to use individuals with CAD for replication purposes, it would be most appropriate to make discoveries from general population, to ensure that the data ascertainment scheme does not bias the results (for example, pathways that associate with CAD status rather than lipids levels). Another point is that our study did not utilize a larger cohort and took a QTL approach for the association analysis. It is expected that under fixed genotyping budget, extreme phenotypes have greatly improved power to detect associations due to increased odds ratio measures. However, lipids measures are easily obtainable, so they represent excellent secondary phenotypes available from many GWAS on diverse phenotypes with large sample sizes. Therefore, we expect that a QTL approach for a very large cohort may also have sufficient power to identify the implicated pathways. Although the name “pathway” was used in our analysis, we caution that the term merely refers to a collection of genes, rather than an inter-connected biological process. The pathway-based association analysis is borrowed from microarray data analysis, where sets of related genes are combined into a single group to examine the up or down-regulation of gene expression levels for a group of genes that are consistent. Therefore, “pathway” does not necessarily correspond to biologists’ understanding of biological process, but rather a functional classification of genes based on a set of pre-defined criteria. Our study used pathways compiled from three resources, but we note that other pathway collections could also be explored. Furthermore, we note that pathways analyses have the potential to move beyond the so-called “common-disease/common-variant” paradigm, and consider SNP and genes together rather than in isolation. In conclusion, using two distinct GWAS data sets based on modestly-sized samples ascertained by extremes of a quantitative trait (HDL-C), we have applied pathway analysis to suggest that an ensemble of ∼100 genes is collectively associated with HDL-C levels. A fraction of these genes has already been suggested by previously published literature and by animal models, lending strong support to the validity of the pathway-based approach. Our results therefore confirm the validity and potential utility of the pathway association approach as applied to the extreme tails of a quantitative trait.

Conflict of Interest Statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Supplementary Material

The Supplementary Material for this article can be found online at http://www.frontiersin.org/Applied_Genetic_Epidemiology/10.3389/fgene.2011.00041/abstract
  72 in total

1.  Accumulation of dietary cholesterol in sitosterolemia caused by mutations in adjacent ABC transporters.

Authors:  K E Berge; H Tian; G A Graf; L Yu; N V Grishin; J Schultz; P Kwiterovich; B Shan; R Barnes; H H Hobbs
Journal:  Science       Date:  2000-12-01       Impact factor: 47.728

2.  Generation of viable cholesterol-free mice.

Authors:  A Wechsler; A Brafman; M Shafir; M Heverin; H Gottlieb; G Damari; S Gozlan-Kelner; I Spivak; O Moshkin; E Fridman; Y Becker; R Skaliter; P Einat; A Faerman; I Björkhem; E Feinstein
Journal:  Science       Date:  2003-12-19       Impact factor: 47.728

3.  Personal genomes: The case of the missing heritability.

Authors:  Brendan Maher
Journal:  Nature       Date:  2008-11-06       Impact factor: 49.962

4.  Increased high-density lipoprotein levels caused by a common cholesteryl-ester transfer protein gene mutation.

Authors:  A Inazu; M L Brown; C B Hesler; L B Agellon; J Koizumi; K Takata; Y Maruhama; H Mabuchi; A R Tall
Journal:  N Engl J Med       Date:  1990-11-01       Impact factor: 91.245

5.  Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles.

Authors:  Aravind Subramanian; Pablo Tamayo; Vamsi K Mootha; Sayan Mukherjee; Benjamin L Ebert; Michael A Gillette; Amanda Paulovich; Scott L Pomeroy; Todd R Golub; Eric S Lander; Jill P Mesirov
Journal:  Proc Natl Acad Sci U S A       Date:  2005-09-30       Impact factor: 11.205

6.  apo B gene knockout in mice results in embryonic lethality in homozygotes and neural tube defects, male infertility, and reduced HDL cholesterol ester and apo A-I transport rates in heterozygotes.

Authors:  L S Huang; E Voyiaziakis; D F Markenson; K A Sokol; T Hayek; J L Breslow
Journal:  J Clin Invest       Date:  1995-11       Impact factor: 14.808

7.  Deficiency of acyl CoA:cholesterol acyltransferase 2 prevents atherosclerosis in apolipoprotein E-deficient mice.

Authors:  Emily L Willner; Bryan Tow; Kimberly K Buhman; Martha Wilson; David A Sanan; Lawrence L Rudel; Robert V Farese
Journal:  Proc Natl Acad Sci U S A       Date:  2003-01-21       Impact factor: 11.205

8.  Genetic evidence implicates the immune system and cholesterol metabolism in the aetiology of Alzheimer's disease.

Authors:  Lesley Jones; Peter A Holmans; Marian L Hamshere; Denise Harold; Valentina Moskvina; Dobril Ivanov; Andrew Pocklington; Richard Abraham; Paul Hollingworth; Rebecca Sims; Amy Gerrish; Jaspreet Singh Pahwa; Nicola Jones; Alexandra Stretton; Angharad R Morgan; Simon Lovestone; John Powell; Petroula Proitsi; Michelle K Lupton; Carol Brayne; David C Rubinsztein; Michael Gill; Brian Lawlor; Aoibhinn Lynch; Kevin Morgan; Kristelle S Brown; Peter A Passmore; David Craig; Bernadette McGuinness; Stephen Todd; Clive Holmes; David Mann; A David Smith; Seth Love; Patrick G Kehoe; Simon Mead; Nick Fox; Martin Rossor; John Collinge; Wolfgang Maier; Frank Jessen; Britta Schürmann; Reinhard Heun; Heike Kölsch; Hendrik van den Bussche; Isabella Heuser; Oliver Peters; Johannes Kornhuber; Jens Wiltfang; Martin Dichgans; Lutz Frölich; Harald Hampel; Michael Hüll; Dan Rujescu; Alison M Goate; John S K Kauwe; Carlos Cruchaga; Petra Nowotny; John C Morris; Kevin Mayo; Gill Livingston; Nicholas J Bass; Hugh Gurling; Andrew McQuillin; Rhian Gwilliam; Panos Deloukas; Ammar Al-Chalabi; Christopher E Shaw; Andrew B Singleton; Rita Guerreiro; Thomas W Mühleisen; Markus M Nöthen; Susanne Moebus; Karl-Heinz Jöckel; Norman Klopp; H-Erich Wichmann; Eckhard Rüther; Minerva M Carrasquillo; V Shane Pankratz; Steven G Younkin; John Hardy; Michael C O'Donovan; Michael J Owen; Julie Williams
Journal:  PLoS One       Date:  2010-11-15       Impact factor: 3.240

9.  Rate of cholesteryl ester transfer between high and low density lipoproteins in human serum and a case with decreased transfer rate in association with hyperalphalipoproteinemia.

Authors:  T Kurasawa; S Yokoyama; Y Miyake; T Yamamura; A Yamamoto
Journal:  J Biochem       Date:  1985-12       Impact factor: 3.387

10.  A mouse model of sitosterolemia: absence of Abcg8/sterolin-2 results in failure to secrete biliary cholesterol.

Authors:  Eric L Klett; Kangmo Lu; Astrid Kosters; Edwin Vink; Mi-Hye Lee; Michael Altenburg; Sarah Shefer; Ashok K Batta; Hongwei Yu; Jianliang Chen; Richard Klein; Norbert Looije; Ronald Oude-Elferink; Albert K Groen; Nobuyo Maeda; Gerald Salen; Shailendra B Patel
Journal:  BMC Med       Date:  2004-03-24       Impact factor: 8.775

View more
  4 in total

1.  Polymorphism of rs3737597 in DISC1 Gene on Chromosome 1q42.2 in sALS Patients: a Chinese Han Population Case-Control Study.

Authors:  Libin Deng; Liwei Huo; Jie Zhang; Xiaoli Tang; Zhujun Cheng; Gang Li; Xin Fang; Jinsong Xu; Xiong Zhang; Renshi Xu
Journal:  Mol Neurobiol       Date:  2016-04-07       Impact factor: 5.590

2.  Axon guidance pathway genes are associated with schizophrenia risk.

Authors:  Zhichao Wang; Ping Li; Tong Wu; Shuangyue Zhu; Libin Deng; Guangcheng Cui
Journal:  Exp Ther Med       Date:  2018-09-19       Impact factor: 2.447

3.  A family with hypothyroidism caused by fatty acid synthase and apolipoprotein B receptor mutations.

Authors:  Jianhua Sun; Lizhi Sun; Weijie Chen; Xiao Yin; Yong Lu; Qiang Jiang
Journal:  Mol Med Rep       Date:  2018-09-20       Impact factor: 2.952

4.  Genome-wide association studies suggest sex-specific loci associated with abdominal and visceral fat.

Authors:  Y J Sung; L Pérusse; M A Sarzynski; M Fornage; S Sidney; B Sternfeld; T Rice; J G Terry; D R Jacobs; P Katzmarzyk; J E Curran; J Jeffrey Carr; J Blangero; S Ghosh; J-P Després; T Rankinen; D C Rao; C Bouchard
Journal:  Int J Obes (Lond)       Date:  2015-10-20       Impact factor: 5.551

  4 in total

北京卡尤迪生物科技股份有限公司 © 2022-2023.