
Genome-wide association study of African and European Americans implicates multiple shared and ethnic specific loci in sarcoidosis susceptibility.

Indra Adrianto, Chee Paul Lin, Jessica J Hale, Albert M Levin, Indrani Datta, Ryan Parker, Adam Adler, Jennifer A Kelly, Kenneth M Kaufman, Christopher J Lessard, Kathy L Moser, Robert P Kimberly, John B Harley, Michael C Iannuzzi, Benjamin A Rybicki, Courtney G Montgomery.

Abstract

Sarcoidosis is a systemic inflammatory disease characterized by the formation of granulomas in affected organs. Genome-wide association studies (GWASs) of this disease have been conducted only in European populations. We present the first sarcoidosis GWAS in African Americans (AAs, 818 cases and 1,088 related controls) followed by replication in independent sets of AAs (455 cases and 557 controls) and European Americans (EAs, 442 cases and 2,284 controls). We evaluated >6 million SNPs either genotyped using the Illumina Omni1-Quad array or imputed from the 1000 Genomes Project data. We identified a novel sarcoidosis-associated locus, NOTCH4, that reached genome-wide significance in the combined AA samples (rs715299, P AA-meta = 6.51×10−10) and demonstrated the independence of this locus from others in the MHC region in the same sample. We replicated previous European GWAS associations within HLA-DRA, HLA-DRB5, HLA-DRB1, BTNL2, and ANXA11 in both our AA and EA datasets. We also confirmed significant associations with the previously reported HLA-C and HLA-B regions in the EA but not the AA samples. We further identified suggestive associations with several other genes previously reported in lung or inflammatory diseases.

Year:  2012        PMID: 22952805      PMCID: PMC3428296          DOI: 10.1371/journal.pone.0043907

Source DB:  PubMed          Journal:  PLoS One        ISSN: 1932-6203            Impact factor:   3.240


Introduction

Sarcoidosis is a systemic disease characterized by granulomatous inflammation that primarily affects the lungs, but can affect any organ [1], [2], [3]. While the etiology of this disease remains elusive, the pathophysiology likely involves a dysregulated immune response to environmental agents in a genetically susceptible host. Several environmental exposures have been associated with sarcoidosis including mold, inorganic particles, and insecticides [4], [5], [6]. A significant genetic component to sarcoidosis susceptibility is supported by a 2.5 fold elevated disease risk in siblings and parents of cases [7] as well as potential disease susceptibility loci identified from both linkage and association studies [8], [9], [10], [11], [12]. Sarcoidosis impacts individuals of all races, ages and genders [13], but in the U.S. is most frequent in AAs [14], [15], with disease onset peaking between the ages of 20 and 39 years [16]. The AA population is more commonly affected than EAs [16], [17], [18], [19], with a three-fold higher lifetime risk (2.4%) and age-adjusted annual incidence (35.5 per 100,000) compared to EAs (0.85% and 10.9 per 100,000, respectively). AA patients have higher disease severity and more extra-thoracic involvement than EA patients and are less likely to have disease that resolves [20]. Ethnicity specific prevalence and severity support the involvement of genes and further suggest ethnicity-specific genetic risk profiles. Genetic associations with specific HLA alleles and sarcoidosis have repeatedly been reported [21], [22], [23], [24]. Heterogeneity of these HLA effects in sarcoidosis across ancestries was observed in the ACCESS study [23] suggesting that while the HLA-DRB1*1101 allele was associated with sarcoidosis in AAs and EAs, the HLA-DRB1*1501 allele was associated with sarcoidosis only in EAs [23]. 
Recent studies have reported additional susceptibility loci including BTNL2 [9], [25], [26] in both EAs and AAs, and ANXA11 [11] and RAB23 [27] in Germans. The first genome-wide linkage study of AA sarcoidosis families performed by our group found prominent linkage signals on chromosome 5, at 5q11.2, 5p13, and 5q31 [10]. Our admixture study confirmed the latter two of these effects and found regions on chromosomes 6p22.3 and 17p13.3–17p13.1 associated with increased African ancestry [28]. Based on clear evidence of the involvement of genes in the onset and manifestation of sarcoidosis, we sought to confirm sarcoidosis genetic risk loci reported in association scans of European populations and to identify novel risk loci by conducting the first genome-wide association study (GWAS) of sarcoidosis in an American population. We present results from a family-based discovery cohort of AAs as well as two independent replication sets of AA cases and controls and EA cases and controls.

Results

Genome-wide Association Scan of AA Discovery Set

A total of 864,829 single-nucleotide polymorphisms (SNPs) in our AA discovery set passed quality control assessment (Materials and Methods, Figure 1, Table 1). To increase the density of SNPs to be tested for association, we performed genotype imputation across the genome with the 1000 Genomes Project Phase I haplotypes as reference (Materials and Methods). The GWAS of the AA discovery set demonstrated no evidence for inflation of the test statistics (genomic control inflation factor [λGC] = 0.980) after comparing the observed and expected distributions of the SNP-sarcoidosis association P-values calculated using EMMAX (Figure S1, Materials and Methods). This suggests our regression model was able to account for population stratification in this dataset. The quantile-quantile plot revealed the presence of significant genetic effects associated with sarcoidosis (Figure S1). This dataset had good statistical power (at α = 5×10−8) to detect associations from common alleles with odds ratios ≥1.5 (Figure S2). Only variants within previously reported MHC Class II genes [11], [22] exceeded genome-wide significance in this dataset (Figure 2A, Figure 3A, Table S2): HLA-DRA, with peak signals at multiple SNPs in perfect linkage disequilibrium (LD) with each other (r² = 1), including a missense SNP, rs7192 (P AA-Disc = 8.73×10−9); HLA-DQA1 (peak signal at rs17843604, P AA-Disc = 4.77×10−10); and HLA-DQB1 (peak signal at rs149288329, P AA-Disc = 1.27×10−9) (Table S2). These SNPs were not in LD with each other (r² ≤ 0.054).
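The genomic control inflation factor quoted above can be illustrated with a minimal sketch. It assumes the usual definition (the median of the 1-d.f. chi-square statistics divided by the expected median, about 0.455) and that every P-value comes from a 1-d.f. test. The function names are hypothetical, and this is not the EMMAX implementation.

```python
from statistics import NormalDist, median

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_DF1_MEDIAN = 0.4549364231195724


def p_to_chi2_df1(p):
    """Convert a two-sided P-value into the equivalent 1-d.f. chi-square statistic."""
    # For 1 d.f., chi-square = z^2, where z is the normal quantile at p/2.
    z = NormalDist().inv_cdf(p / 2.0)
    return z * z


def genomic_inflation(p_values):
    """Return lambda_GC: observed median statistic over the expected median."""
    stats = [p_to_chi2_df1(p) for p in p_values]
    return median(stats) / CHI2_DF1_MEDIAN
```

Under this definition, values near 1 (such as the reported 0.980) indicate that the bulk of the test statistics follows the null distribution.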
Figure 1

A graphical overview of the GWAS datasets.

(A–B) Summary of the AA (A) and EA (B) datasets.

Table 1

Sample summary before and after quality control (QC).

| Characteristic | AA: all samples before QC | AA: discovery set after QC | AA: replication set after QC | AA: all samples after QC | EA: replication set before QC | EA: replication set after QC |
|---|---|---|---|---|---|---|
| Cases | 1487 | 818 | 455 | 1273 | 518 | 442 |
| Controls | 1504 | 908 | 557 | 1465 | 379 | 339 |
| External Controls | 180 (a) | 180 | 0 | 180 | 3208 (b) | 1945 |
| Unknown Affection Status | 2 | 0 | 0 | 0 | 0 | 0 |
| Male | 889 | 575 | 244 | 819 | 1847 | 1173 |
| Female | 2264 | 1331 | 768 | 2099 | 2247 | 1553 |
| Unknown Gender | 20 | 0 | 0 | 0 | 11 | 0 |
| Total | 3173 | 1906 | 1012 | 2918 | 4105 | 2726 |

AA, African American; EA, European American.

(a) Taken from the Illumina YRI-ASW iControlDB.

(b) 175 Caucasian healthy controls from the Illumina iControlDB, 1047 controls from the dbGaP GENEVA Melanoma study, and 1986 controls from the dbGaP CIDR: NGRC Parkinson’s Disease Study.

Figure 2

Manhattan plots of SNP-sarcoidosis association test results.

(A–D) Association results in the AA discovery set (A), a meta-analysis between the AA discovery and AA replication sets (B), the EA dataset (C), and a meta-analysis of the AA discovery, AA replication and EA datasets (D). The black horizontal line represents the threshold for genome-wide significance (P<5×10−8) and the gray line is the suggestive evidence of association threshold (P<1×10−4).

Figure 3

Regional association plots of SNP-sarcoidosis association test results within the MHC Class II region.

(A–E) Association results in the AA discovery set (A), AA replication set (B), a meta-analysis between the AA discovery and AA replication sets (C), the EA dataset (D), and a meta-analysis of the AA discovery, AA replication and EA datasets (E). Each SNP is colored according to its LD (r²) with the top SNP, except for (E) since the meta-analysis was performed on two different populations. The recombination rate is denoted by the blue solid line. Plots were drawn using LocusZoom [100].


Genome-wide Meta-Analysis of the AA Discovery and Replication Sets

After assessing association between SNPs and sarcoidosis using logistic regression in the AA replication set (Materials and Methods, Figure 1, Table 1), we found little evidence for inflation of the test statistics in this dataset (λGC = 1.030, Figure S1). A meta-analysis of the AA discovery and replication sets yielded additional MHC SNPs that surpassed genome-wide significance but had not done so in either set alone. These included a genotyped SNP in the previously unreported neurogenic locus notch homolog protein 4 (NOTCH4) gene (rs715299, P AA-meta = 6.51×10−10) and other SNPs within the MHC Class II genes (Figure 2B, Figure 3C, Table 2, Table S2).
Table 2

Regions of association meeting genome-wide significance and their most significant SNPs grouped by sample.

| CHR | BP (hg19) | SNP | Gene | Alleles (1) | MAF AA-Disc (2) | OR AA-Disc (3) | P AA-Disc | MAF AA-Rep (2) | OR AA-Rep (3) | P AA-Rep | P AA-Meta | MAF EA (2) | OR EA (3) | P EA | P All-Meta | Heterogeneity test: Q | Heterogeneity test: I² (%) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 6 | 32,411,646 | rs7192 | HLA-DRA (a) | G/T | 0.424 | 1.66 | 8.73E-09 | 0.445 | 1.40 | 3.44E-04 | 1.40E-11 | 0.395 | 1.35 | 1.26E-04 | 5.28E-14 | 0.304 | 16 |
| 6 | 32,620,283 | rs17843604 | HLA-DQA1 (a) | C/T | 0.402 | 0.63 | 4.77E-10 | 0.378 | 0.80 | 1.70E-02 | 1.21E-10 | 0.56 | 0.91 | 1.81E-01 | 2.73E-08 | 5.33E-05 | 89.8 |
| 6 | 32,642,794 | rs149288329 | HLA-DQB1 (a) | T/C | 0.025 | 1.92 | 1.27E-09 | 0.038 | 1.87 | 1.15E-02 | 1.55E-10 | NA | NA | NA | NA | NA | NA |
| 6 | 32,189,841 | rs715299 | NOTCH4 (b) | T/G | 0.454 | 1.30 | 1.12E-05 | 0.480 | 1.52 | 8.14E-06 | 6.51E-10 | 0.324 | 1.14 | 9.58E-02 | 2.15E-08 | 0.064 | 63.6 |
| 6 | 31,272,612 | rs6457375 | HLA-C (c) | A/G | 0.423 | 0.88 | 4.24E-01 | 0.403 | 1.17 | 9.06E-02 | 7.26E-01 | 0.49 | 1.58 | 1.98E-09 | 9.80E-06 | 1.84E-05 | 90.8 |
| 6 | 31,326,324 | rs2596475 | HLA-B (c) | T/C | 0.287 | 0.90 | 5.27E-01 | 0.263 | 1.00 | 9.84E-01 | 6.01E-01 | 0.386 | 1.52 | 3.82E-08 | 2.72E-05 | 7.45E-05 | 89.5 |
| 6 | 32,446,853 | rs17203612 | HLA-DRB5 (c) | T/C | 0.270 | 0.64 | 2.66E-05 | 0.243 | 0.79 | 2.42E-02 | 2.33E-06 | 0.438 | 0.63 | 1.82E-08 | 2.80E-13 | 0.209 | 36.1 |

The AA-Disc, AA-Rep, and AA-Meta columns refer to African Americans; the EA columns refer to European Americans.

(1) Major/minor allele of AAs as the reference.

(2) Minor allele frequency.

(3) The odds ratio (OR) was calculated with respect to the minor allele of AAs.

(a) Previously reported sarcoidosis loci meeting genome-wide significance in the AA discovery set.

(b) Potentially novel region meeting genome-wide significance after the meta-analysis of AA datasets.

(c) Previously reported sarcoidosis loci meeting genome-wide significance in the EA dataset.

Note that stepwise conditional analysis results to identify independent signals within the MHC region can be found in Tables S3 and S4.


Stepwise Conditional Association of the MHC Region in Combined AA Dataset

Since the MHC region is known for its extensive regions of high LD [29], we sought to assess whether the novel AA association signal within NOTCH4 was independent of the signals within the MHC Class II genes. We performed stepwise conditional association analyses (Materials and Methods) among variants with P AA-meta <5×10−8 in the MHC region in the combined AA set and at step one used the most significant SNP (rs2227139, HLA-DRA) as the covariate. After adjusting for this HLA-DRA SNP, we observed significant residual associations in several other regions, the most significant of which was at rs146146117 (HLA-DQA1, P conditional = 6.81×10−8, Table S3). Significant residual associations remained after the next step of adjusting for the HLA-DRA and HLA-DQA1 SNPs; the most significant residual association was within HLA-DRB1 (rs9461776, P conditional = 1.45×10−7, Table S3). We continued to step three by adding this HLA-DRB1 SNP into the regression and found the most significant residual signals at NOTCH4 (rs715299, P conditional = 1.74×10−6) and HLA-DQA1 (rs9272320, P conditional = 7.04×10−6) (Table S3). The subsequent (and final) step, adding this HLA-DQA1 SNP (rs9272320) as a covariate, resulted in diminished association signals for the remaining significant SNPs within the MHC Class II genes (P conditional ≥0.014), whereas NOTCH4 remained significant (rs715299, P conditional = 8.85×10−5) (Table S3). Although the P-value for NOTCH4 no longer met the GWAS threshold of 5×10−8 after this rigorous conditioning, it was the only remaining effect that exceeded the suggestive level of association. This suggests that the observed signal within NOTCH4 is independent of the evaluated SNPs within the MHC Class II genes. These analyses also showed the existence of multiple independent signals within this MHC region (Table 2).
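The stepwise procedure described above can be sketched as a simple loop. This is an illustration only: `assoc_p` is a hypothetical callback standing in for an association test (for example an EMMAX run) adjusted for the SNPs already selected, and `stop_threshold` is an assumed parameter, since the text does not state a formal stopping rule.

```python
from typing import Callable, List, Sequence, Tuple


def stepwise_conditional(
    candidates: Sequence[str],
    assoc_p: Callable[[str, Tuple[str, ...]], float],
    stop_threshold: float,
) -> List[Tuple[str, float]]:
    """Forward stepwise conditioning over candidate SNPs.

    At each step the most significant remaining SNP, tested conditional on
    all SNPs chosen so far, is added as a covariate. The loop stops when no
    remaining SNP has a conditional P-value below stop_threshold (assumed rule).
    """
    selected: List[Tuple[str, float]] = []
    remaining = list(candidates)
    while remaining:
        covariates = tuple(snp for snp, _ in selected)
        p_values = {snp: assoc_p(snp, covariates) for snp in remaining}
        best = min(p_values, key=p_values.get)
        if p_values[best] >= stop_threshold:
            break
        selected.append((best, p_values[best]))
        remaining.remove(best)
    return selected
```

The returned list records each conditioning SNP with its P-value at the step where it entered the model, which mirrors the layout of a stepwise results table.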

Confirmation of Previously Reported SNPs Associated with Sarcoidosis in the Combined AA Datasets

Three significant SNPs reported in the previous German GWAS in the MHC region (P<1×10−6) [11] were also replicated in our combined AA datasets (rs7194 [in perfect LD with rs7192], HLA-DRA, P AA-meta = 1.40×10−11; rs9268853, HLA-DRB5, P AA-meta = 7.40×10−4; and rs615672, HLA-DRB1, P AA-meta = 2.60×10−9, Table 3). The previously reported peak SNP within BTNL2 (rs2076530) [9], [11], [25] was not strongly associated with sarcoidosis in our AA datasets (P AA-meta = 0.024, Table 3). However, a SNP 4 kb upstream of rs2076530, rs9268482, was suggestive of association (P AA-meta = 6.32×10−6, Table 3). Interestingly, we also identified a suggestive association at a BTNL2 coding-synonymous SNP, rs9268480 (P AA-meta = 1.03×10−5), only 28 bp upstream of rs2076530 and in high LD with rs9268482 (r² = 0.996). Since BTNL2 is only 170 kb away from NOTCH4, we sought to assess whether the signal within NOTCH4 is independent of the signal within BTNL2 using conditional association analyses. When adjusting for one of those associated BTNL2 SNPs (rs9268482), we found that NOTCH4 remained significant (rs715299, P conditional = 2.86×10−8). Conversely, after adjusting for the NOTCH4 SNP, we still observed a significant residual signal at the BTNL2 SNP (rs9268482, P conditional = 1.26×10−4). These results indicate that the signal within NOTCH4 is also independent of the BTNL2 signal.
Table 3

Replication of previously reported SNPs associated with sarcoidosis [9], [11], [25], [27].

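| CHR | BP (hg19) | SNP | Gene | Alleles (1) | MAF AA-Disc (2) | OR AA-Disc (3) | P AA-Disc | MAF AA-Rep (2) | OR AA-Rep (3) | P AA-Rep | P AA-Meta | MAF EA (2) | OR EA (3) | P EA | P All-Meta | Heterogeneity test: Q | Heterogeneity test: I² (%) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 6 | 32,363,816 | rs2076530 | BTNL2 | T/C | 0.309 | 0.84 | 2.50E-01 | 0.312 | 0.80 | 2.46E-02 | 2.42E-02 | 0.434 | 0.70 | 4.19E-06 | 1.44E-06 | 0.324 | 11.3 |
| 6 | 32,412,480 | rs7194 | HLA-DRA | A/G | 0.424 | 1.66 | 8.73E-09 | 0.445 | 1.40 | 3.44E-04 | 1.40E-11 | 0.395 | 1.35 | 1.26E-04 | 5.28E-14 | 0.304 | 16 |
| 6 | 32,429,643 | rs9268853 | HLA-DRB5 | T/C | 0.214 | 0.72 | 1.16E-03 | 0.197 | 0.86 | 2.03E-01 | 7.40E-04 | 0.331 | 0.76 | 9.79E-04 | 2.39E-06 | 0.544 | 0 |
| 6 | 32,574,171 | rs615672 | HLA-DRB1 | C/G | 0.449 | 0.64 | 1.23E-06 | 0.438 | 0.72 | 5.50E-04 | 2.60E-09 | 0.643 | 0.81 | 8.00E-03 | 9.97E-10 | 2.00E-07 | 93.5 |
| 6 | 57,055,354 | rs1040461 | RAB23 | C/T | 0.158 | 1.13 | 3.24E-02 | 0.177 | 1.21 | 1.18E-01 | 8.04E-03 | 0.079 | 0.89 | 4.18E-01 | 1.80E-01 | 0.257 | 26.4 |
| 10 | 81,926,702 | rs1049550 | ANXA11 | G/A | 0.185 | 0.68 | 7.91E-04 | 0.187 | 0.88 | 2.89E-01 | 8.46E-04 | 0.409 | 0.81 | 8.33E-03 | 2.30E-05 | 0.356 | 3.2 |

The AA-Disc, AA-Rep, and AA-Meta columns refer to African Americans; the EA columns refer to European Americans.

(1) Major/minor allele of AAs as the reference.

(2) Minor allele frequency.

(3) The odds ratio (OR) was calculated with respect to the minor allele of AAs.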

We saw modest association with two other previously reported susceptibility genes: ANXA11 [11] and RAB23 [27]. A non-synonymous SNP within ANXA11, rs1049550, was associated with sarcoidosis in our combined AA datasets at P AA-meta = 8.46×10−4 (Table 3). A similar modest association was seen with a non-synonymous SNP within RAB23 (rs1040461, P AA-meta = 8.04×10−3, Table 3). We did find suggestive evidence of association on 5q11.2 (peak signal at rs116137605 within a region between SNX18 and ESM1, P AA-meta = 3.09×10−5), a region identified in our previous linkage and fine-mapping studies [10], [28], [30].

Genome-wide Association Scan of EA Dataset

A total of 682,921 genotyped SNPs passed quality control measures in our EA dataset (Materials and Methods, Figure 1, Table 1). After imputation with the 1000 Genomes Project haplotypes, the SNP-sarcoidosis associations calculated using logistic regression in the EA dataset showed little evidence for inflation of the test statistics (λGC = 1.027, Figure S1). This dataset also had good statistical power (at α = 5×10−8) to detect associations from common alleles with odds ratios ≥1.5 (Figure S2). We observed genome-wide significant SNPs within previously reported MHC genes [9], [11], [24] including HLA-C (peak signal at rs6457375, P EA = 1.98×10−9), HLA-B (peak signal at rs2596475, P EA = 3.82×10−8), and HLA-DRB5 (peak signal at rs17203612, P EA = 1.82×10−8) (Figure 2C, Figure 3D, Table 2, Table S2). However, we did not find any variant within NOTCH4 that passed genome-wide significance in this dataset (Figure S3). Stepwise conditional association analyses further demonstrated that two independent signals exist within this region, tagged by rs6457375 (HLA-C) and rs17203612 (HLA-DRB5) (Table S4).

Confirmation of Previously Identified Loci in EA Dataset

We replicated significant SNPs from the German GWAS [11] in the EA dataset including rs7194 (HLA-DRA, P EA = 1.26×10−4), rs9268853 (HLA-DRB5, P EA = 9.79×10−4), rs615672 (HLA-DRB1, P EA = 8.00×10−3), and rs1049550 (ANXA11, P EA = 8.33×10−3) (Table 3). We also replicated the BTNL2 SNP, rs2076530 [9], [11], [25], in our EA dataset (P EA = 4.19×10−6, Table 3). We did not, however, confirm the RAB23 association [27] in this dataset (rs1040461, P EA = 0.418, Table 3).

Meta-analysis Results of All Datasets

Among regions that met genome-wide significance in the AA meta-analysis, we also found significant associations within HLA-DRA, HLA-DRB1, and HLA-DQA1 in the EA dataset (8.25×10−5≤P EA ≤3.97×10−2, 3.77×10−14≤P All-meta ≤7.23×10−8) (Figure 3E, Table S2). We found only a weak association with the NOTCH4 SNP (rs715299) in the EA dataset (P EA = 0.096), which may suggest an ethnicity-specific effect (Cochran’s Q test of heterogeneity P = 0.064 and inconsistency index I² = 63.60%; see Materials and Methods). Conversely, when evaluating regions reaching genome-wide significance in the EA dataset, variants within HLA-DRB5, HLA-DRB1, and HLA-DQA1 were also significant in the AA datasets (1.81×10−7≤P AA-meta ≤1.28×10−5, 1.16×10−14≤P All-meta ≤2.65×10−12, Table S2), whereas HLA-C and HLA-B were not (P AA-meta ≥0.575, Table S2).
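The heterogeneity figures quoted above can be related to each other by a short calculation. The sketch below assumes the standard inverse-variance form of Cochran's Q and the standard definition I² = max(0, (Q − df)/Q). It also assumes that three datasets enter the test (AA discovery, AA replication, EA), so df = 2, in which case the chi-square tail probability has the closed form P = exp(−Q/2). All function names are hypothetical.

```python
import math


def cochran_q(effects, std_errors):
    """Weighted sum of squared deviations of study effects from the pooled effect."""
    weights = [1.0 / (se * se) for se in std_errors]
    pooled = sum(w * b for w, b in zip(weights, effects)) / sum(weights)
    return sum(w * (b - pooled) ** 2 for w, b in zip(weights, effects))


def q_from_p_two_df(p):
    """Invert P = exp(-Q / 2), which holds only for 2 degrees of freedom."""
    return -2.0 * math.log(p)


def i_squared(q, df):
    """Percentage of variation across studies attributed to heterogeneity."""
    if q <= 0.0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0
```

Under these assumptions, the reported heterogeneity P = 0.064 for rs715299 corresponds to Q of about 5.5 and I² of about 63.6%, which agrees with the value given in the text.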

Suggestive Association Regions

We observed multiple regions that reached suggestive association (P All-meta <1×10−4) in the meta-analysis of all AA and EA datasets. These included variants within TRAK1, SLC44A4, GLI3-C7orf25, ATP8A2, and TGM3 (Table S5). We observed additional suggestive association regions (P<1×10−4) that were unique to one ethnic group. For example, we identified variants with suggestive association within FHIT, PRDM1, FRMD3, DMBT1 and a region between ZSCAN2 and ALPK3 in the combined AA datasets only (Table S5). We also observed suggestive association only in the EA dataset within CASP10, RARB, and NCR3, among others (Table S5). Several of these suggestive effects fall within genes implicated in other lung or inflammatory diseases (Table S6).

Discussion

Previously reported GWASs of sarcoidosis have been limited to European (specifically German) samples. Ours is the first GWAS of sarcoidosis in Americans and, even more importantly, of AAs, the population most commonly and severely affected. Our results, while demonstrating some shared effects across ethnicities, strongly support the presence of ethnic specific genetic effects. We identified significant association between sarcoidosis and a previously unreported locus (NOTCH4) in our AA datasets. This association was determined to be independent of other neighboring MHC genes and is an attractive biological candidate. NOTCH4 encodes a member of the Notch family that is involved in controlling cell fate decisions during developmental processes and regulating the activity of T cell immune responses [31], [32]. The Notch signaling pathway also plays a role in endothelial cell differentiation, apoptosis and proliferation [33], [34], [35], [36]. Further, NOTCH4 is highly expressed in the lung and may play a key role in the lung development and diseases such as asthma and lung arteriovenous shunts [37], [38], [39], [40], [41]. NOTCH4 has also been associated with neonatal lupus [42], multiple sclerosis [43], systemic sclerosis [44], and other immune-related disorders [45], [46], [47], [48]. We also saw evidence of suggestive association of NOTCH4 in our EA dataset. While further studies are needed to define the role of NOTCH4 in the specific pathogenesis of sarcoidosis, a novel association to this gene is supported by previous expression and disease studies. We replicated associations for several previously reported sarcoidosis susceptibility risk loci in our AA collection including MHC Class II region genes (HLA-DRA, HLA-DRB5, HLA-DRB1, and HLA-DQA1), BTNL2, RAB23, and ANXA11 [9], [11], [25], [27], [49]. These regions were also replicated in our EA dataset except for RAB23. 
It is known that the MHC Class II region plays a major role in immune-mediated disorders, including associations to celiac disease, insulin-dependent diabetes mellitus, rheumatoid arthritis, multiple sclerosis, and systemic lupus erythematosus (SLE) [50], [51]. Similarly, BTNL2, RAB23, and ANXA11 have been suggested to play a role in T-cell activation [9], antibacterial defense processes [27], and apoptosis [11]. It is worth noting that we did not replicate the association with C10orf67 [12] as identified in a joint GWAS of German patients with either sarcoidosis or Crohn’s disease. Additional regions with suggestive evidence of association in both AAs and EAs include TRAK1, SLC44A4, GLI3-C7orf25, ATP8A2, and TGM3. While the biological relevance of most of these genes to sarcoidosis is still unknown, GLI3-C7orf25 and TGM3 may warrant further investigation. Although C7orf25 is a hypothetical gene with unknown function, GLI3 encodes zinc finger protein Gli3 that has a bipotential function as a transcriptional activator or repressor of the sonic hedgehog pathway [52], [53]. This pathway contains RAB23 (discussed above) and has been suggested to play a role in the sarcoidosis pathophysiology [27]. TGM3 (Transglutaminase 3) encodes protein involved in the later stages of cell envelope formation in the epidermis and hair follicle [54] and has been associated with celiac disease [55], [56] and psoriasis [57], [58]. Despite the overlap of compelling signals across populations, we did find evidence of genetic heterogeneity between ethnic groups in this disease (see Tables 2 and 3). The previously reported MHC Class I region [24] including HLA-C and HLA-B (associated with psoriasis [59] and ankylosing spondylitis [60], respectively) was associated only in the EA dataset. Other noteworthy genes with suggestive association specific to EAs included CASP10, RARB, and NCR3. 
CASP10 (caspase 10) plays a role in apoptosis and has been associated with autoimmune lymphoproliferative syndrome [61] and non-Hodgkin lymphoma [62]. In addition, RARB (retinoic acid receptor beta) and NCR3 (natural cytotoxicity triggering receptor 3) have been associated with pulmonary function based on a recent GWAS of European Caucasians [63]. Suggestive associations specific to AAs include FHIT, FRMD3, DMBT1, and PRDM1. FHIT (fragile histidine triad) is involved in various intracellular functions and a putative tumor suppressor for various cancers including lung cancer [64], [65]. FRMD3 (FERM domain containing 3) is over-expressed in normal human lung tissue compared with tissue from lung tumors of lung carcinoma patients suggesting its important role in the origin and progression of lung cancer [66]. DMBT1 (deleted in malignant brain tumors 1) is overexpressed in epithelial cells [67] and has been found associated with ulcerative colitis [68] and Crohn’s disease [67], [69]. PRDM1 (PR domain containing protein 1) plays a role as a repressor of beta-interferon gene expression [70] and had been associated with rheumatoid arthritis [71], inflammatory bowel disease (IBD) [72], [73], and SLE [74], [75]. We also observed variants with suggestive associations specific to AAs in a region containing ZSCAN2, SCAND2, WDR73, NMB, SEC11A, ZNF592, and ALPK3 as well as a region identified in our linkage studies [10], [28], [30] on 5q11.2 (a region between SNX18 and ESM1). However, the actual biological functions of these genes are largely unknown. In summary, this is the first report of GWAS in an American sample and the first report of a significant association between sarcoidosis and NOTCH4. We have replicated several previously reported sarcoidosis susceptibility loci in both our EA and AA samples as well as report several biologically plausible effects at loci with suggestive statistical evidence. 
We report sarcoidosis associations both shared between ethnicities as well as those unique to either our AA or EA dataset, supporting genetic heterogeneity of this disease. The presence of genetic heterogeneity may well serve as a useful tool in the isolation of the causal variants associated with this disease as it has in other complex disorders [76], [77]. Finally, this study demonstrates both the usefulness of and need for genetic studies of sarcoidosis in diverse populations and further elucidates potential pathogenic mechanisms of this disease. Future replication, sequencing and functional studies are required to further elucidate the causal variants that may underlie these associations as well as to discover rare variants that may have yet to be identified.

Materials and Methods

Ethics Statement

The study and sample collection were approved by the Institutional Review Boards (IRBs) at all participating institutions, including the A Case Control Etiologic Study of Sarcoidosis (ACCESS) Group, the Sarcoidosis Genetic Analysis (SAGA) study, the Henry Ford Health System in Detroit, Michigan, and the Oklahoma Medical Research Foundation (OMRF), Oklahoma City, Oklahoma. Only individuals who signed informed consent forms were included in this study. No minors or children were involved in our study.

Subjects

Our AA sample collection, which comprises 1487 cases and 1504 controls (Figure 1, Table 1), was taken from an extensive cohort of AA sarcoidosis patients, family members and controls assembled from 1) case-control pairs collected as a part of a 10 center collaborative study (ACCESS Group) [78], 2) the SAGA sample ascertained through affected sib pairs [79], 3) a nuclear family-based sample ascertained through single sarcoidosis-affected offspring from the Henry Ford Health System in Detroit, Michigan [80], and 4) healthy controls from the OMRF Lupus Family Registry and Repository (LFRR) [81]. The AA cases and their family members were grouped into a discovery set of 818 cases and 908 related and unrelated controls, and the other 455 independent cases and 557 independent controls were selected for a replication set after applying quality control measures as described below (Figure 1, Table 1). In addition, genotype data from 180 HapMap controls from Yoruba in Ibadan, Nigeria (YRI) and of African ancestry in Southwest USA (ASW) were obtained from the Illumina HumanOmni1-Quad iControlDB (http://www.illumina.com/science/icontroldb.ilmn) and included in the control group of the AA discovery set, as is common practice in order to increase statistical power [82], [83], [84]. The EA dataset consisted of 518 independent cases and 379 independent controls from the ACCESS and the Henry Ford Health System studies mentioned above. We also assembled external genotype data on 3208 healthy Caucasian controls from the Illumina iControlDB (175), the dbGaP (Accession: phs000187.v1.p1) GENEVA Melanoma study (1047), and the dbGaP (Accession: phs000196.v2.p1) CIDR: NGRC Parkinson’s Disease Study (1986) (Figure 1, Table 1). Each sample collection site received IRB approval to recruit samples. All samples were processed and genotyped at the OMRF under the auspices of the OMRF IRB.

Genotyping and Quality Control

Genotyping was performed at the OMRF using the Illumina HumanOmni1-Quad array for ∼1.1M variants across the genome. SNPs had to meet the following quality control criteria for inclusion for each population: well-defined cluster plots by visual inspection, call rate >95%, minor allele frequency >0.01, Hardy-Weinberg proportion tests P>0.0001 in cases and P>0.001 in controls, and case-control differences in missingness P>0.001. Copy number variations and X, Y, XY, and mitochondrial chromosomes were not included in the analysis. A total of 864,829 and 682,921 SNPs passed our quality controls in the AA discovery and replication sets and the EA dataset, respectively. We found 657,350 successfully genotyped SNPs that overlap between the panels. Samples were removed from analysis if they were determined to be a duplicate of another sample, showed cryptic relatedness in the independent datasets (proportion of alleles shared identical by descent >0.25), displayed low call rates (<90%), exhibited extreme heterozygosity (>5 standard deviations from the mean), demonstrated outlying values of either the principal components of population membership calculated by EIGENSOFT 3.0 [85] or the global ancestry estimates calculated by ADMIXMAP [86], [87], or revealed discrepancies between reported gender and genetic data (Table S1). For the EA dataset, we assigned to each sarcoidosis case the five best-matched controls as determined by identity-by-state (IBS) allele sharing using PLINK v1.07 [88], resulting in a large drop-out of external controls in the EA dataset.
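The numeric SNP inclusion criteria listed above can be expressed as a single predicate. This is a minimal sketch: the `SnpQc` record and its field names are hypothetical, the per-SNP summary statistics are assumed to have been computed already, and the visual inspection of cluster plots is not captured.

```python
from dataclasses import dataclass


@dataclass
class SnpQc:
    """Per-SNP summary statistics (hypothetical field names)."""
    call_rate: float            # fraction of samples with a genotype call
    maf: float                  # minor allele frequency
    hwe_p_cases: float          # Hardy-Weinberg test P-value in cases
    hwe_p_controls: float       # Hardy-Weinberg test P-value in controls
    missingness_diff_p: float   # case-control differential missingness P-value


def passes_snp_qc(snp: SnpQc) -> bool:
    """Apply the numeric thresholds stated in the text."""
    return (
        snp.call_rate > 0.95
        and snp.maf > 0.01
        and snp.hwe_p_cases > 0.0001
        and snp.hwe_p_controls > 0.001
        and snp.missingness_diff_p > 0.001
    )
```

Keeping the thresholds in one function makes it straightforward to apply the same rules to each population separately, as the text describes.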

Imputation Method

Imputation was performed in each population at 5 Mb bins across the genome using the IMPUTE2 program [89], [90]. The 1000 Genomes Project Phase I data release (June 2011), which contains haplotypes derived from 1,094 individuals from Africa, Asia, Europe, and the Americas, was used as the reference [89], [90]. IMPUTE2 estimated the posterior probabilities for the three possible genotypes (i.e. AA, AB, and BB). The posterior probabilities were then converted to the most likely genotypes with a threshold of 0.9. Imputed SNPs that either had low imputation accuracy (information measure <0.5 and average maximum posterior genotype call probability <0.9) or failed the SNP quality control standards described above were removed in order to minimize false positives. After imputation, 10,948,298 SNPs in the AA discovery set, 11,160,451 SNPs in the AA replication set, and 6,620,482 SNPs in the EA replication set passed quality control measures for analysis.
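The conversion from posterior probabilities to most likely genotypes can be sketched as follows. The function name is hypothetical, and treating a probability exactly equal to the threshold as callable is an assumption, since the text only states a threshold of 0.9. This is not IMPUTE2 code.

```python
from typing import Optional, Sequence


def call_genotype(probs: Sequence[float], threshold: float = 0.9) -> Optional[int]:
    """Return the most likely genotype as a B-allele count (0=AA, 1=AB, 2=BB).

    probs holds the posterior probabilities (P(AA), P(AB), P(BB)).
    None is returned when no genotype reaches the threshold, meaning the
    call is treated as missing.
    """
    if len(probs) != 3:
        raise ValueError("expected three genotype probabilities")
    best = max(range(3), key=lambda i: probs[i])
    return best if probs[best] >= threshold else None
```

For example, `call_genotype((0.95, 0.04, 0.01))` yields the AA genotype, while `call_genotype((0.5, 0.45, 0.05))` is left uncalled.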

Association Analyses

Because our discovery set contained related individuals, single-marker association analysis in this set was performed using the Efficient Mixed-Model Association eXpedited (EMMAX) software [91], [92]. EMMAX was chosen because it implements a variance component approach in a linear mixed model that simultaneously adjusts for pairwise genetic relatedness between individuals and corrects for population stratification, using an empirical kinship matrix based on the proportion of alleles shared identical-by-state at all genome-wide SNPs between all pairs of individuals in the study [91]. We assumed an additive model [91], [92] and adjusted the statistics for gender. Since EMMAX does not calculate odds ratios (ORs), we estimated these using logistic regression as implemented in PLINK using independent samples (480 cases and 367 controls) ascertained from the AA discovery set. The association analyses of the independent sets of AAs and EAs were performed using logistic regression in PLINK. We assumed the additive genetic model and adjusted the statistics for gender and the first five principal components of each population (calculated using EIGENSOFT 3.0). Meta-analyses were performed using the weighted Z-score method, which accounts for the direction of effects and sample size, as implemented in METAL [93]. Both Cochran’s Q test statistic and the I² index were used to test for heterogeneity in the meta-analysis of all samples. Cochran’s Q test calculates the weighted sum of the squared deviations between each study’s effect and the overall effect across studies [94], whereas the I² index quantifies the percentage of inconsistency across studies that is due to heterogeneity rather than chance [95]. A Q test with P<0.05 or I²>50% indicates the presence of heterogeneity. Stepwise conditional association analysis in AAs was conducted for SNPs with P<5×10−8 using EMMAX, adjusting for gender and the SNPs of interest, adding one SNP at a time.
We required a SNP threshold of P<5×10−8 to be considered significantly associated and P<1×10−4 to be considered suggestively associated with sarcoidosis [96], [97], [98]. The power calculations for different minor allele frequencies and odds ratios for each dataset were performed using the Genetic Power Calculator program [99] and are summarized in Figure S2. The assumptions are a disease prevalence of 0.05%, complete linkage disequilibrium between the SNP and predisposing loci, an additive genetic model, and a type I error rate α = 5×10−8. To present power curves that are comparable across sets, we used a power calculator that assumes independence, but adjusted the analysis of the AA discovery set (the family-based set) assuming a familial correlation of 0.25, since most pairs are siblings (giving a smaller equivalent count of 75% of the total cases and controls in this set).

Supporting Information

The quantile-quantile (Q–Q) plots of the observed and expected distributions of P-values. (A–C) The Q–Q plots for (A) the AA discovery set (genomic control inflation factor [λGC] = 0.980), (B) the AA replication set (λGC = 1.030), and (C) the EA dataset (λGC = 1.027). (DOC)

Power calculation plots of the GWAS datasets. (A–C) Power calculation plots for the AA discovery set (A), the AA replication set (B), and the EA dataset (C). (DOC)

Regional association plots of SNP-sarcoidosis association test results within NOTCH4. (A–D) Association results in the AA discovery set (A), the AA replication set (B), a meta-analysis of the AA discovery and AA replication sets including the LD (D’) plot (C), and the EA dataset including the LD (D’) plot (D). Each SNP is colored according to its LD (r²) with the top SNP. The blue solid line denotes the recombination rate. (DOC)

Summary of dropped samples after QC. (DOC)

Association results with P<5×10−8 in either dataset. (XLS)

Stepwise conditional analysis in AA samples for SNPs in the MHC region with P<5×10−8. (XLS)

Stepwise conditional analysis in EA samples for SNPs in the MHC region with P<5×10−8. (XLS)

Association results with P<1×10−4 in either dataset. (XLS)

Shared or Ethnic Specific Suggestive Association Regions supported by the heterogeneity test results and list of inflammatory or lung diseases associated with these regions. (DOC)
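
The genomic control inflation factor (λGC) quoted for the Q–Q plots is commonly computed as the median observed 1-degree-of-freedom chi-square statistic divided by its expected median under the null. The sketch below assumes that definition and two-sided P-values as input. The function name and the evenly spaced example P-values are hypothetical, and the source does not state which software produced its λGC values.

```python
from statistics import NormalDist, median

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def genomic_inflation(pvalues):
    """Estimate lambda_GC from a collection of two-sided association P-values."""
    normal = NormalDist()
    # A two-sided P-value maps to a 1-df chi-square statistic via the squared Z-score.
    chi2 = [normal.inv_cdf(p / 2.0) ** 2 for p in pvalues]
    return median(chi2) / CHI2_1DF_MEDIAN


# Evenly spaced P-values mimic a well-calibrated null distribution.
n = 999
uniform_p = [(i - 0.5) / n for i in range(1, n + 1)]
lam = genomic_inflation(uniform_p)
print(round(lam, 3))
```

Values close to 1, like those reported for the three datasets, indicate little residual inflation of the test statistics.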