Exome Sequencing Identifies Susceptibility Loci for Sarcoidosis Prognosis.

Elisa Lahtela, Matti Kankainen, Juha Sinisalo, Olof Selroos, Marja-Liisa Lokki.

Abstract

Many immunological genes associated with sarcoidosis have been shown to be shared with other immune-mediated diseases. In Finnish sarcoidosis patients, subjects with a good prognosis more commonly carry HLA-DRB1*03:01 and/or the HLA-DRB1*04:01-DPB1*04:01 haplotype, but no marker for persistent disease has been found. The objective was to further pinpoint genetic differences between prognosis subgroups in relation to the HLA markers. Whole-exome sequencing (WES) was conducted for 72 patients selected based on disease activity (resolved disease, n = 36; persistent disease, n = 36). Both groups were further divided by the HLA markers (one/both markers, n = 18; neither of the markers, n = 18). The Finnish exome data from the Genome Aggregation Database were used as a control population for the WES sample. Statistical analyses included single-variant analysis for common variants and gene-level analysis for rare variants. We attempted to replicate associated variants in 181 Finnish sarcoidosis patients and 150 controls. An association was found in chromosome 1p36.21 (AADACL3 and C1orf158), which has recently been associated with sarcoidosis in another WES study. In our study, variations in these genes were associated with resolved disease (AADACL3, p = 0.0001 and p = 0.0003; C1orf158, p = 7.03E-05). Another interesting chromosomal region also peaked, the Leucocyte Receptor Complex in 19q13.42, but the association diminished in the replication sample. In conclusion, this WES study supports the previously found association in the region 1p36.21. Furthermore, a region novel to sarcoidosis was found, but additional studies are warranted to verify this association.
Copyright © 2019 Lahtela, Kankainen, Sinisalo, Selroos and Lokki.

Keywords:  1p36.21; MHC; leucocyte receptor complex; prognosis; sarcoidosis; whole exome sequencing

Year:  2019        PMID: 31921204      PMCID: PMC6937869          DOI: 10.3389/fimmu.2019.02964

Source DB:  PubMed          Journal:  Front Immunol        ISSN: 1664-3224            Impact factor:   7.561


Introduction

Sarcoidosis is a granulomatous disease characterized by the presence of non-caseating granulomas in affected organs. Sarcoidosis can affect any organ in the body, and the disease course can be self-limited or chronic (1). These different outcomes have led to the classification of sarcoidosis based on prognosis: resolved disease, i.e., disease duration <2 years, and persistent disease with a longer duration (≥2 years) (2). The exact etiology of sarcoidosis is still unclear. The heterogeneity of sarcoidosis in clinical course and organ involvement has led to the hypothesis that sarcoidosis might not be one disease but consists of several disease entities, each with distinct genetic associations (3). This may explain why no single sarcoidosis-associated variant has been found, but rather a wide range of genes with relatively small effects. Based on current knowledge, genetic susceptibility to sarcoidosis seems to be mainly affected by variation in the genes and pathways related to granuloma formation and immune response. The MHC region in chromosome 6p21.3 contains multiple genes essential for the immune system. Associations with sarcoidosis have been found in all MHC classes, I through III (4, 5). In our previous studies in Finnish patients, the class II allele HLA-DRB1*03:01 and the haplotype HLA-DRB1*04:01-DPB1*04:01 were found to be associated with good prognosis compared to poor prognosis (44.9 vs. 22.7%; p = 0.001; OR = 2.78; 95% CI = 1.45–5.24) (6, 7). Besides the MHC, other chromosomal regions and genes conferring risk of or protection from sarcoidosis have been found throughout the genome. In addition to classical candidate-gene approaches, genome-wide association studies (GWAS) have become the method of choice. However, as sarcoidosis is a rare disease, with a prevalence of 28 per 100,000 in Finland (8), collecting large case-control materials for GWAS is demanding.
A possible new method for finding causal genetic factors behind sarcoidosis is whole-exome sequencing (WES). The exome encompasses only about 2% of the human genome but harbors about 85% of all described disease-causing variants (9), making smaller sample sizes sufficient for the identification of novel genes. The aim of this study was to further characterize genetic differences between Finnish sarcoidosis prognosis groups using WES in a subset of 72 patients and to replicate the findings in a larger data set of 188 Finnish patients. In Finnish patients, subjects with good prognosis are more likely to have the above-mentioned class II HLA markers, but no genetic markers have been found for persistent disease. The objective was to further pinpoint genetic variation in sarcoidosis prognosis in relation to the HLA markers.

Materials and Methods

Study Subjects and Selection Criteria

Study subjects and their characteristics have been previously described (7). In summary, a total of 188 Finnish patients with verified pulmonary sarcoidosis had been followed up for 5–15 years (Supplementary Table 1). After follow-up, the patients were clinically divided into those with disease resolved within 2 years (n = 90) and those with persisting activity after 2 years (n = 98). Disease activity was evaluated using the generally accepted WASOG (World Association of Sarcoidosis and Other Granulomatous diseases) criteria (10). The clinical examinations included a chest radiograph, a lung function test (spirometry, diffusion capacity), electrocardiography (ECG), liver enzymes, serum calcium, creatinine, serum lysozyme, and serum ACE. For the WES study, a subset of 72 patients was chosen (Figure 1). Patients were selected based on disease activity (resolved disease, n = 36; persistent disease, n = 36). These subsets were further divided by the HLA markers previously known to influence disease prognosis in Finnish patients (HLA-DRB1*03:01 and HLA-DRB1*04:01-DPB1*04:01), yielding 18 patients with one or both markers (HLA+) and 18 patients with neither of the markers (HLA–). The same selection was used in both the resolved and persistent patient groups. The Genome Aggregation Database (gnomAD) (11) was used as the control population in the WES study. The gnomAD database contains exome data from 11,150 Finnish subjects with no known diseases.
Figure 1

Finnish sarcoidosis patients. Selection criteria based on sarcoidosis prognosis and the presence or absence of the HLA markers (HLA-DRB1*03:01 and/or HLA-DRB1*04:01-DPB1*04:01 haplotype). From each of the HLA positive/negative groups, 18 samples were randomly selected for this study.

For the replication of the SNPs found in the WES study, the rest of the above-mentioned 188 Finnish sarcoidosis patients were genotyped. A control population of 150 healthy subjects representing the Finnish population (12) was included in the replication phase. The study protocol was approved by the Ethics Committee of the Department of Internal Medicine, Hospital District of Helsinki and Uusimaa, Helsinki, Finland. All subjects provided written informed consent for their participation in genetic association studies.

Whole Exome Sequencing

The DNA extraction has been previously described (7). The WES was performed at the Institute for Molecular Medicine Finland (FIMM). 150 ng of gDNA was fragmented with a Covaris E220 evolution instrument (Covaris, Woburn, MA, USA). Sample libraries and MedExome Enrichment were processed according to the SeqCapEZ HyperCap Workflow User's Guide (Roche Nimblegen, Madison, WI, USA). Enrichment was performed in multiplexed DNA sample library pools of 4 samples using 1 μg of each library. The amplified library was purified with 1.8x Agencourt AMPure XP beads and eluted in 200 μl of EB buffer. The library was quantified for sequencing using the 2100 Bioanalyzer High Sensitivity kit. Sequencing was performed with the Illumina HiSeq2500 system in HiSeq High Output mode using v4 kits (Illumina, San Diego, CA, USA). Read length for the paired-end run was 2 × 101 bp. The target coverage of 20x was reached in 91% of all samples.

Variant Calling

Sequencing data were pre-processed using Trimmomatic (leading:3, trailing:3, sliding window:4:15, illuminaclip:2:30:10, minlen:36). Paired-end reads passing the filtering were mapped to the reference genome (GRCh38) using the BWA-MEM algorithm. Reads were sorted by coordinate and duplicates were marked with the Picard tools (http://broadinstitute.github.io/picard/). Base qualities were recalibrated and local realignment was performed around indels at the BAM level using the Genome Analysis Toolkit (GATK), followed by merging of data by sample and re-marking of duplicates with the Picard tools. Joint genotyping was performed on the whole cohort of 72 samples using GATK HaplotypeCaller, and variants were recalibrated using GATK VariantRecalibrator. GATK tools were applied as recommended by GATK guidelines, using GATK resource files that had been converted from GRCh37 to GRCh38 with CrossMap and chain files from EnsEMBL. Quality control analysis of sequencing data was performed using FastQC (http://www.bioinformatics.babraham.ac.uk/projects/fastqc) with default settings. Additionally, frameshift indels were excluded due to recognized difficulties in calling them from WES data (13). Annotation and filtering of variants were performed using the Annovar tool against the RefGene database. The quality control filters were as follows: coverage >10, variant quality value >40, frequency <1%, and assumed to impair protein function. The mean coverage of exomic regions for each variant was 43X, with individual variant reads varying between 17 and 153X. Reads were manually checked using the Integrative Genomics Viewer (IGV). Variants with unknown frequency in the Finnish population in the gnomAD database were excluded. Version information and references of the tools used in variant calling are given in Supplementary Table 2.
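
The filtering criteria listed above can be sketched as a simple predicate. This is only a minimal illustration and not the Annovar-based pipeline used in the study: the `Variant` record and its field names are hypothetical, and the thresholds are exposed as parameters whose defaults are taken from the text (coverage >10, quality >40, frequency <1%).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Variant:
    """Hypothetical, simplified variant record (field names are assumptions)."""
    coverage: int                 # read depth at the variant site
    quality: float                # variant quality value
    finnish_af: Optional[float]   # gnomAD Finnish allele frequency, None if unknown
    impairs_protein: bool         # annotated as likely to impair protein function
    is_frameshift_indel: bool = False


def passes_filters(v: Variant, min_cov: int = 10, min_qual: float = 40.0,
                   max_af: float = 0.01) -> bool:
    """Return True if the variant passes the quality-control criteria in the text."""
    if v.is_frameshift_indel:     # frameshift indels were excluded
        return False
    if v.finnish_af is None:      # unknown Finnish gnomAD frequency: excluded
        return False
    return (v.coverage > min_cov
            and v.quality > min_qual
            and v.finnish_af < max_af
            and v.impairs_protein)


# Made-up example records, for illustration only
variants = [
    Variant(coverage=43, quality=99.0, finnish_af=0.004, impairs_protein=True),
    Variant(coverage=8, quality=99.0, finnish_af=0.004, impairs_protein=True),
    Variant(coverage=43, quality=99.0, finnish_af=None, impairs_protein=True),
]
kept = [v for v in variants if passes_filters(v)]
```

Only the first made-up record survives: the second fails the coverage threshold and the third has no Finnish frequency.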

Testing for Genetic Association

We analyzed associations for single variants and for variant groups within genes. The associations were evaluated between all the dichotomous groups (persistent vs. resolved disease, HLA+ persistent vs. resolved disease, and HLA– persistent vs. resolved disease). For single-variant analysis, the common variants [minor allele frequency (MAF) > 0.01] were included. The hypergeometric distribution method in the EPACTS software pipeline (https://genome.sph.umich.edu/wiki/EPACTS) was used. The gene-based tests were used to assess whether variants in each gene are jointly associated with the phenotype. Gene-based association improves the power to detect rare variants (MAF < 0.05) in a limited sample size. Two methods were used to test the associations: the hypergeometric distribution and SKAT-O (14) methods implemented in the EPACTS software. The SKAT-O test was used because it allows each variant to have either a positive or a negative effect on the phenotype. The p-values were corrected for multiple comparisons (number of studied variants) using the False Discovery Rate (FDR) method and Bonferroni correction. A value of p < 0.05 was considered statistically significant. Age, gender, and extrapulmonary manifestations were used as covariates in all statistical tests. To assess the functionality of the found variations at the protein level, we used the SIFT (15) and PolyPhen (16) databases, which both predict the possible impact of an amino acid substitution on the structure and function of a human protein using straightforward physical and comparative considerations. To further investigate the possible functional effects of the significant SNPs, we used the Genotype-Tissue Expression (GTEx) Portal (17) to study expression quantitative trait loci (eQTL). For the replication, the SNPs that reached the significant association level in single-variant and gene-based analysis were included (Supplementary Table 3).
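
The two multiple-testing corrections named above can be sketched as follows. This is a generic illustration rather than the EPACTS implementation: the FDR function assumes the standard Benjamini-Hochberg step-up procedure (the text does not state which FDR variant was used), and the p-values are made-up toy numbers.

```python
from typing import List


def bonferroni(pvalues: List[float]) -> List[float]:
    """Multiply each p-value by the number of tests, capped at 1."""
    m = len(pvalues)
    return [min(1.0, p * m) for p in pvalues]


def benjamini_hochberg(pvalues: List[float]) -> List[float]:
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest, enforcing monotonicity
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


toy_p = [0.001, 0.008, 0.039, 0.041, 0.6]  # made-up values, not study data
bonf = bonferroni(toy_p)
bh = benjamini_hochberg(toy_p)
significant_after_bonferroni = [p < 0.05 for p in bonf]
```

In this toy example two of the five tests remain below 0.05 after Bonferroni correction, which shows how quickly nominal associations can disappear once the number of tested variants is taken into account.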
The SNP genotyping was performed with the Agena Bioscience (Sequenom) MassARRAY System (Agena Bioscience, San Diego, California) at the Institute for Molecular Medicine Finland (FIMM), Helsinki, Finland, with standard protocols. Genotypes were called using Sequenom's MassARRAY Typer software. The allele frequencies between different groups were compared by a case-control association analysis (chi-square test, PLINK software) (18). We applied the following quality control filters: minimum call rate per sample of 90%, SNP minor allele frequency (MAF) > 0.01, and Hardy-Weinberg equilibrium (HWE) p > 0.001. The total success rate for accepted SNP arrays was 95% in the replication samples.
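
An allelic chi-square comparison of this kind can be sketched in a few lines. The example below is an illustration under stated assumptions, not a reproduction of the PLINK run: it assumes a Pearson chi-square with 1 degree of freedom and no continuity correction, and it reconstructs allele counts for rs3010877 from the combined minor allele frequencies and group sizes reported in Table 3 (0.1505 in 93 persistent and 0.3239 in 88 resolved patients), assuming every subject was successfully genotyped.

```python
import math


def allelic_chi_square(minor_a: int, total_a: int, minor_b: int, total_b: int):
    """Pearson chi-square (1 df, no continuity correction) on a 2x2 allele-count table."""
    a, b = minor_a, total_a - minor_a
    c, d = minor_b, total_b - minor_b
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p = math.erfc(math.sqrt(chi2 / 2.0))  # upper tail of chi-square with 1 df
    return chi2, p


# Allele counts reconstructed from Table 3 (assumes complete genotyping)
n_persistent, n_resolved = 93, 88
alleles_persistent = 2 * n_persistent
alleles_resolved = 2 * n_resolved
minor_persistent = round(0.1505 * alleles_persistent)
minor_resolved = round(0.3239 * alleles_resolved)

chi2, p = allelic_chi_square(minor_persistent, alleles_persistent,
                             minor_resolved, alleles_resolved)
```

Under these assumptions the sketch gives a p-value of about 1e-4, close to the combined p-value reported for this SNP in Table 3; small differences would be expected from rounding and missing genotypes.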

Results

Single-Variant Analysis

Figure 2 shows the Manhattan plot from the single-variant association tests between resolved and persistent patients; the strongest association peak is in chromosome 19, although it does not reach the exome-wide significance level. Supplementary Figure 1 shows the quantile-quantile (Q-Q) plot for the single-variant tests, which demonstrates the similarity between observed and expected significance values. Table 1 shows the single variants that yielded the strongest associations in analyses between resolved and persistent, and between HLA+ and HLA– prognosis groups. In the analysis between resolved and persistent patients, altogether seven variants were associated with the prognosis. Interestingly, five of these variants were located in two chromosomal regions: three in chromosome 1 (1p36.21) and two in chromosome 19 (19q13.42). In the HLA+ group, the associations in these two chromosomal locations remained. In HLA– patients, the chromosomal locations 1p36.21 and 19q13.42 did not associate with the prognosis. None of these associations remained significant after correcting for multiple testing with FDR or Bonferroni correction.
Figure 2

Manhattan plot from the single-variant association tests between persistent and resolved patients. The X-axis shows the chromosome position and the Y-axis shows the negative log of the p-values, so that higher values represent stronger significance levels. The horizontal red line shows the threshold value for significance.

Table 1

Associated variations in single-variant analysis in different Finnish sarcoidosis subgroups based on disease prognosis (persistent, resolved) and HLA markers (HLA+ = HLA-DRB1*03:01 and/or HLA-DRB1*04:01-DPB1*04:01; HLA– = without either of the markers).

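| Chromosome | Position | Chromosomal location | MAF Persistent | MAF Resolved | MAF gnomAD Finnish | SNP | AA Change | Predicted Function | P | rsID | Gene |
|---|---|---|---|---|---|---|---|---|---|---|---|
| PERSISTENT VS. RESOLVED | | | | | | | | | | | |
| 1 | 12719616 | 1p36.21 | 0.125 | 0.375 | 0.2196 | C/T | Ser47Pro | B/T | 0.000915 | rs3010877 | AADACL3 |
| 1 | 12725527 | 1p36.21 | 0.181 | 0.389 | 0.3072 | T/G | Cys125Phe | P/T | 0.016591 | rs7513079 | AADACL3 |
| 1 | 12760937 | 1p36.21 | 0.139 | 0.361 | 0.2346 | C/T | Phe153Leu | B/T | 0.00833 | rs1132185 | C1orf158 |
| 1 | 156267524 | 1q22 | 0.000 | 0.069 | 0.01 | T/C | Ile355Val | B/T | 0.0269443 | rs145100575 | SMG5 |
| 19 | 54664811 | 19q13.42 | 0.208 | 0.375 | 0.3021 | G/A | Asp223Gly | B/D | 0.01645 | rs731170 | LILRB4 |
| 19 | 54818581 | 19q13.42 | 0.264 | 0.153 | 0 | G/A | Met113Val | B/D | 0.046766 | rs643861 | KIR3DL1, KIR3DS1 |
| 20 | 35434589 | 20q11.22 | 0.444 | 0.347 | 0.3917 | C/A | Glu144Ala | B/D | 0.0397467 | rs224331 | GDF5 |
| HLA+ | | | | | | | | | | | |
| 1 | 12719616 | 1p36.21 | 0.083 | 0.361 | 0.2196 | C/T | Ser47Pro | B/T | 0.001023 | rs3010877 | AADACL3 |
| 1 | 12725527 | 1p36.21 | 0.111 | 0.333 | 0.3072 | T/G | Cys125Phe | P/T | 0.008796 | rs7513079 | AADACL3 |
| 1 | 12760937 | 1p36.21 | 0.111 | 0.306 | 0.2346 | C/T | Phe153Leu | B/T | 0.020456 | rs1132185 | C1orf158 |
| 1 | 156267524 | 1q22 | 0.000 | 0.139 | 0.01 | T/C | Ile355Val | B/T | 0.0227273 | rs145100575 | SMG5 |
| 11 | 5986042 | 11p15.4 | 0.083 | 0.250 | 0.1899 | G/A | Trp297Arg | B/D | 0.0375179 | rs4237768 | OR52L1 |
| 17 | 37624364 | 17q12 | 0.278 | 0.111 | 0.2469 | C/T | Met403Val | B/T | 0.0429011 | rs7216445 | DDX52 |
| 19 | 44477666 | 19q13.31 | 0.333 | 0.472 | 0.4727 | G/C | Cys272Ser | B/T | 0.0438416 | rs1897820 | ZNF180 |
| 19 | 44497294 | 19q13.31 | 0.361 | 0.500 | 0.4932 | A/G | Ala41Val | B/T | 0.0227273 | rs2571108 | ZNF180 |
| 19 | 54664811 | 19q13.42 | 0.139 | 0.333 | 0.3021 | G/A | Met113Val | B/D | 0.021855 | rs731170 | LILRB4 |
| 19 | 54818581 | 19q13.42 | 0.278 | 0.111 | 0 | G/A | Glu144Ala | B/D | 0.042901 | rs643861 | KIR3DL1, KIR3DS1 |
| HLA– | | | | | | | | | | | |
| 8 | 144379425 | 8q24.3 | 0.333 | 0.472 | 0.4882 | C/A | Arg17Ser | B/T | 0.0438416 | rs6599528 | ADCK5 |
| 22 | 22514169 | 22q11.22 | 0.333 | 0.167 | 0.2745 | T/G | Arg488Ser | B/D | 0.0471713 | rs361666 | ZNF280A |
| 22 | 22514173 | 22q11.22 | 0.333 | 0.167 | 0.2745 | C/G | Phe486Leu | B/T | 0.0471713 | rs361762 | ZNF280A |
| 22 | 22514885 | 22q11.22 | 0.333 | 0.167 | 0.2745 | G/C | Gly249Ala | B/T | 0.0471713 | rs362124 | ZNF280A |
| 22 | 22514894 | 22q11.22 | 0.333 | 0.167 | 0.2745 | C/T | Asn246Ser | B/T | 0.0471713 | rs362132 | ZNF280A |
| 22 | 22515221 | 22q11.22 | 0.333 | 0.167 | 0.275 | G/T | Tyr137Ser | B/T | 0.0471713 | rs361580 | ZNF280A |
| 22 | 22515224 | 22q11.22 | 0.333 | 0.167 | 0.2753 | C/T | Asn136Ser | B/T | 0.0471713 | rs362011 | ZNF280A |
| 22 | 22515418 | 22q11.22 | 0.333 | 0.167 | 0.2742 | A/C | Lys71Asn | B/T | 0.0471713 | rs361959 | ZNF280A |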

Bolded are the chromosomal locations where multiple variants were associated.

Resolved patients, n = 36; persistent patients, n = 36; HLA+ resolved, n = 18; HLA+ persistent, n = 18; HLA– resolved, n = 18; HLA– persistent, n = 18.

Predicted function according to PolyPhen and SIFT, respectively: B, benign; P, probably damaging; T, tolerated; D, deleterious. MAF, minor allele frequency; gnomAD, The Genome Aggregation Database; AA, amino acid. P values are uncorrected; none of the values remained significant after correcting for multiple tests.

Gene-Based Analysis

In gene-based tests, all variants found in the same gene are grouped to assess the association with the phenotype. In the gene-based analysis, the locations 1p36.21 and 19q13.42 also showed associations, although these did not remain significant after FDR or Bonferroni correction (Table 2). In location 19q13.42, variants in the gene LAIR1 were also associated with the prognosis, in addition to the previously seen LILRB4 and KIR3DL1/KIR3DS1. The Manhattan plot from the gene-based analysis between HLA+ persistent and resolved patients is shown in Supplementary Figure 2. No association in locations 1p36.21 and 19q13.42 was seen in HLA– patients. Another chromosomal location was also seen: in 2p22.3, two genes associated with disease prognosis, RASGRP3 with persistent disease and LTBP1 with resolved disease. However, these genes did not show significant association in any of the other subgroup analyses.
Table 2

Associated genes in gene-based analysis (hypergeometric distribution and SKAT-O methods) in different Finnish sarcoidosis subgroups based on disease prognosis (persistent, resolved) and HLA markers (HLA+ = HLA-DRB1*03:01 and/or HLA-DRB1*04:01-DPB1*04:01; HLA– = without either of the markers).

| Chromosome | Region | Chromosomal location | P (HGD, SKAT-O) | Gene | Associated trait |
|---|---|---|---|---|---|
| PERSISTENT VS. RESOLVED | | | | | |
| 1 | 109252166–109273493 | 1p13.3 | 0.012466754, 0.021725009 | CELSR2 | Persistent |
| 1 | 12716215–12725527 | 1p36.21 | 0.04489705 | AADACL3 | Resolved |
| 1 | 12755700–12760937 | 1p36.21 | 0.00429823 | C1orf158 | Resolved |
| 1 | 156267524–156267524 | 1q22 | 0.026944274, 0.021343706 | SMG5 | Resolved |
| 1 | 23519074–23521037 | 1p36.12 | 0.012466754, 0.024629574 | E2F2 | Resolved |
| 2 | 178532055–178781235 | 2q31.2 | 0.028053232 | TTN | Resolved |
| 2 | 32947681–33342904 | 2p22.3 | 0.02694427 | LTBP1 | Resolved |
| 2 | 33524504–33527173 | 2p22.3 | 0.02694427, 0.03932876 | RASGRP3 | Persistent |
| 8 | 144467535–144469561 | 8q24.3 | 0.026944274, 0.039865155 | KIFC2 | Resolved |
| 9 | 128990198–129005527 | 9q34.11 | 0.026944274, 0.039865155 | NUP188 | Persistent |
| 11 | 20160309–20160309 | 11p15.1 | 0.040988976 | DBX1 | Persistent |
| 12 | 52692557–52692557 | 12q13.13 | 0.040988976 | KRT77 | Persistent |
| 19 | 17102004–17212090 | 19p13.11 | 0.031601364, 0.034412198 | MYO9B | Persistent |
| 19 | 42079904–42080806 | 19q13.2 | 0.026944274, 0.039865155 | ZNF574 | Persistent |
| 19 | 54361091–54361091 | 19q13.42 | 0.04825453 | LAIR1 | Resolved |
| 19 | 54664811–54664811 | 19q13.42 | 0.01644974 | LILRB4 | Resolved |
| 19 | 54818479–54819917 | 19q13.42 | 0.04676604 | KIR3DL1, KIR3DS1 | Persistent |
| 22 | 27798761–27800302 | 22q12.1 | 0.012466754, 0.0183 | MN1 | Persistent |
| 22 | 32491131–32498513 | 22q12.3 | 0.026944274 | FBXO7 | Persistent |
| HLA+ | | | | | |
| 1 | 12716215–12725527 | 1p36.21 | 0.01770751 | AADACL3 | Resolved |
| 1 | 12755700–12760937 | 1p36.21 | 0.00879604 | C1orf158 | Resolved |
| 1 | 156267524–156267524 | 1q22 | 0.022727273 | SMG5 | Resolved |
| 3 | 12004712–12004864 | 3p25.2 | 0.04290113 | SYN2 | Resolved |
| 11 | 5986042–5986042 | 11p15.4 | 0.037517877 | OR52L1 | Resolved |
| 17 | 37624364–37628587 | 17q12 | 0.04290113 | DDX52 | Persistent |
| 19 | 44477645–44497294 | 19q13.31 | 0.022727273 | ZNF180 | Resolved |
| 19 | 54664811–54664811 | 19q13.42 | 0.02185508 | LILRB4 | Resolved |
| 19 | 54818479–54819917 | 19q13.42 | 0.04290113 | KIR3DL1, KIR3DS1 | Persistent |
| HLA– | | | | | |
| 2 | 178532055–178781235 | 2q31.2 | 0.020455901 | TTN | Resolved |
| 8 | 10609233–10623110 | 8p23.1 | 0.022727273, 0.03576884 | RP1L1 | Resolved |
| 8 | 144379425–144392522 | 8q24.3 | 0.043841642 | ADCK5 | Resolved |
| 19 | 17102004–17212090 | 19p13.11 | 0.022727273, 0.03576884 | MYO9B | Persistent |
| 22 | 22514169–22515418 | 22q11.22 | 0.047171298 | ZNF280A | Persistent |

Bolded are the chromosomal locations where multiple genes are associated.

Resolved patients, n = 36; persistent patients, n = 36; HLA+ resolved, n = 18; HLA+ persistent, n = 18; HLA– resolved, n = 18; HLA– persistent, n = 18.

HGD, hypergeometric distribution method.

P values are uncorrected; none of the values remained significant after correcting for multiple tests.

Replication Analysis

For the replication analysis, the rest of the Finnish sample set of 188 patients and 150 Finnish controls were included. After quality control, a total of 150 controls and 181 cases (72 from the WES study and 109 from the replication study) were included in the analysis. In the replication analysis between persistent and resolved sarcoidosis patients, the SNPs in the chromosomal location 1p36.21 were statistically significant, as in the WES study (Table 3). The association of these SNPs was also found in the analysis between HLA+ persistent and resolved patients, and in contrast to the WES study, these SNPs were also significantly associated in the HLA– disease subgroup. When the data from the WES and replication studies were combined, the associations became even stronger in all subgroup analyses. In chromosomal location 19q13.42, the KIR3DL1/KIR3DS1 SNPs did not replicate, and the LILRB4 SNPs associated significantly in the HLA+ resolved patients in the combined data set (Table 3). None of the associations remained statistically significant after correction for the SNP count (data not shown). The MAFs in the control population corresponded to the ones in the gnomAD Finnish control population, except for the KIR3DL1/KIR3DS1 SNP rs643861, which lacked allele frequency data in the gnomAD database (Table 3). However, in the NCBI dbSNP database (https://www.ncbi.nlm.nih.gov/snp/), the MAF from the TOPMed database is A = 0.185, which resembles the MAF found in our study. TOPMed uses hg38 as the reference genome, whereas gnomAD uses hg19, which might explain the difference.
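
The relation between the WES, replication, and combined allele frequencies can be checked with a short calculation. The sketch below is an illustration only: it assumes the combined MAF is the allele-count-weighted average of the two sets and that every subject contributes two called alleles, and it uses the rs3010877 values and group sizes reported in Table 3.

```python
def pooled_maf(maf_a: float, n_a: int, maf_b: float, n_b: int) -> float:
    """Pool two minor allele frequencies, weighting by allele counts (2 per subject)."""
    alleles_a, alleles_b = 2 * n_a, 2 * n_b
    return (maf_a * alleles_a + maf_b * alleles_b) / (alleles_a + alleles_b)


# rs3010877 (AADACL3): WES and replication MAFs with group sizes from Table 3
persistent = pooled_maf(0.125, 36, 0.1667, 57)  # WES n = 36, replication n = 57
resolved = pooled_maf(0.375, 36, 0.2885, 52)    # WES n = 36, replication n = 52
```

Under these assumptions the pooled values agree with the combined MAFs listed for this SNP in Table 3 (0.1505 and 0.3239) to within rounding. SNPs with missing genotypes would not be expected to match as closely.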
Table 3

Replication of SNPs found in the preliminary whole-exome sequencing analysis in different Finnish sarcoidosis subgroups based on disease prognosis (persistent, resolved) and HLA markers (HLA+ = HLA-DRB1*03:01 and/or HLA-DRB1*04:01-DPB1*04:01; HLA– = without either of the markers) and Finnish control population.

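| Chromosome | Position | Chromosomal location | MAF* Persistent | MAF* Resolved | MAF gnomAD Finnish | MAF Control | MAF replication** Persistent | MAF replication** Resolved | MAF combined*** Persistent | MAF combined*** Resolved | SNP | P | P Replication | P Combined | rsID | Gene |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| PERSISTENT VS. RESOLVED | | | | | | | | | | | | | | | | |
| 1 | 12719616 | 1p36.21 | 0.125 | 0.375 | 0.2196 | 0.2166 | 0.1667 | 0.2885 | 0.1505 | 0.3239 | C/T | 0.000915 | 0.03142 | 0.0001009 | rs3010877 | AADACL3 |
| 1 | 12725527 | 1p36.21 | 0.181 | 0.389 | 0.3072 | 0.2866 | 0.2281 | 0.375 | 0.2097 | 0.3807 | T/G | 0.0165913 | 0.01785 | 0.0003522 | rs7513079 | AADACL3 |
| 1 | 12760937 | 1p36.21 | 0.139 | 0.361 | 0.2346 | 0.2261 | 0.1667 | 0.3173 | 0.1559 | 0.3352 | C/T | 0.0083299 | 0.009143 | 7.031E-05 | rs1132185 | C1orf158 |
| 19 | 54664811 | 19q13.42 | 0.208 | 0.375 | 0.3021 | 0.2834 | 0.3421 | 0.3558 | 0.2903 | 0.3636 | G/A | 0.0164497 | 0.8325 | 0.1369 | rs731170 | LILRB4 |
| 19 | 54818581 | 19q13.42 | 0.264 | 0.153 | 0 | 0.2677 | 0.2544 | 0.2692 | 0.2796 | 0.233 | G/A | 0.046766 | 0.8033 | 0.3103 | rs643861 | KIR3DL1, KIR3DS1 |
| HLA+ | | | | | | | | | | | | | | | | |
| 1 | 12719616 | 1p36.21 | 0.083 | 0.361 | 0.2196 | 0.2166 | 0.1071 | 0.22 | 0.09375 | 0.314 | C/T | 0.0010228 | 0.2128 | 5.27E-06 | rs3010877 | AADACL3 |
| 1 | 12725527 | 1p36.21 | 0.111 | 0.333 | 0.3072 | 0.2866 | 0.1071 | 0.32 | 0.1094 | 0.3721 | T/G | 0.008796 | 0.03565 | 2.88E-07 | rs7513079 | AADACL3 |
| 1 | 12760937 | 1p36.21 | 0.111 | 0.306 | 0.2346 | 0.2261 | 0.07143 | 0.26 | 0.09375 | 0.3256 | C/T | 0.0204559 | 0.04265 | 2.10E-06 | rs1132185 | C1orf158 |
| 19 | 54664811 | 19q13.42 | 0.139 | 0.333 | 0.3021 | 0.2834 | 0.3929 | 0.42 | 0.25 | 0.407 | G/A | 0.0218551 | 0.8152 | 0.004529 | rs731170 | LILRB4 |
| 19 | 54818581 | 19q13.42 | 0.278 | 0.111 | 0 | 0.2677 | 0.25 | 0.3 | 0.2812 | 0.2674 | G/A | 0.0429011 | 0.6378 | 0.8511 | rs643861 | KIR3DL1, KIR3DS1 |
| HLA– | | | | | | | | | | | | | | | | |
| 1 | 12719616 | 1p36.21 | 0.1667 | 0.3056 | 0.2196 | 0.2166 | 0.186 | 0.3519 | 0.1803 | 0.3333 | C/T | 0.1653 | 0.02743 | 0.01049 | rs3010877 | AADACL3 |
| 1 | 12725527 | 1p36.21 | 0.25 | 0.3333 | 0.3072 | 0.2866 | 0.2674 | 0.4259 | 0.2623 | 0.3889 | T/G | 0.4367 | 0.05197 | 0.05005 | rs7513079 | AADACL3 |
| 1 | 12760937 | 1p36.21 | 0.1667 | 0.3056 | 0.2346 | 0.2261 | 0.1977 | 0.3704 | 0.1885 | 0.3444 | C/T | 0.1653 | 0.02409 | 0.01001 | rs1132185 | C1orf158 |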

*Resolved patients, n = 36; persistent patients, n = 36; HLA+ resolved, n = 18; HLA+ persistent, n = 18; HLA– resolved, n = 18; HLA– persistent, n = 18.

**Resolved patients, n = 52; persistent patients, n = 57; HLA+ resolved, n = 25; HLA+ persistent, n = 14; HLA– resolved, n = 27; HLA– persistent, n = 43.

***Resolved patients, n = 88; persistent patients, n = 93; HLA+ resolved, n = 43; HLA+ persistent, n = 32; HLA– resolved, n = 45; HLA– persistent, n = 61.

Predicted function according to PolyPhen and SIFT, respectively: B, benign; P, probably damaging; T, tolerated; D, deleterious.

MAF, minor allele frequency; AA, amino acid.

In the WES study, associations were found at the single-variant and gene levels in the chromosomal locations 1p36.21 and 19q13.42. In the replication sample, the associations in the location 1p36.21 were found only for the SNPs that associated in the single-variant analysis (Supplementary Table 3). In chromosomal location 19q13.42, none of the other SNPs in KIR3DL1/KIR3DS1 associated with the disease (data not shown). LAIR1 and KIR3DL3 SNPs did not meet the Sequenom study design criteria and could not be replicated (Supplementary Table 3). In chromosomal location 2p22.3, two genes associated with disease prognosis in the WES study: RASGRP3 with persistent disease and LTBP1 with resolved disease. In the replication study, only one SNP in RASGRP3 and LTBP1 passed the quality control (Supplementary Table 3), and none of these SNPs associated with the trait (data not shown).

Discussion

The heterogeneity of sarcoidosis in clinical course and development of organ involvement suggests that manifestation of the disease phenotype is a result of multiple genetic variations. There is special interest in finding genetic differences between sarcoidosis patients with different prognoses, because at the moment the disease course is difficult to predict at the time of diagnosis. There are also significant ethnic differences in disease development and prognosis between populations (19). These differences create a compelling need for ethnicity-specific biomarkers to assess disease progression in diverse populations. Here we present results from a WES study in Finnish sarcoidosis patients. The strategy was to sequence groups of sarcoidosis patients with known differences in prognosis to identify variants that could alter functional properties of proteins. In Finnish sarcoidosis patients, certain HLA markers (HLA-DRB1*03:01, HLA-DRB1*04:01-DPB1*04:01) have been shown to be more common in patients whose sarcoidosis resolves within 2 years, i.e., good prognosis disease. The aim of this study was to further pinpoint genetic variation in different sarcoidosis prognosis groups in relation to these HLA markers. The most strongly associated and replicated chromosomal location was found in 1p36.21, containing two genes (AADACL3 and C1orf158). This region has also been found to associate with familial sarcoidosis in a recent WES study in German patients (20). However, in a French WES study of familial sarcoidosis, no association with this region was found (21). In our study, variations in these genes were associated with disease resolution independent of the HLA markers DRB1*03:01 and DRB1*04:01. The found variations in AADACL3 are predicted to be benign, but the variation in C1orf158 is thought to alter the function of the protein. However, the functionality of these variants in sarcoidosis predisposition is unclear at the moment.
The variants rs3010877 in AADACL3 and rs1132185 in C1orf158 are known to act as eQTLs in non-sun-exposed skin, but no clear role in sarcoidosis pathogenesis is evident. Due to linkage disequilibrium (LD), it is also possible that the association derives from elsewhere in the chromosome. However, no functionally plausible genes reside in close proximity to these genes, although a more distant region in 1p36 has been associated with resolution of chest findings in sarcoidosis patients (22). Therefore, further studies are needed to evaluate the possible causal relationship between these genes and sarcoidosis susceptibility. In the WES study, another interesting chromosomal region was seen in 19q13.42 (LILRB4, KIR3DL1/KIR3DS1, and LAIR1). In the replication part of the study, only the association with LILRB4 remained statistically significant when WES and replication data were combined; the other associations were weakened or could not be replicated. It should be noted that the size of the replication material is still relatively small, and a possible true causal effect cannot be ruled out. The 19q13.42 region, called the Leukocyte Receptor Complex (LRC), comprises a large set of genes encoding immunoglobulin superfamily receptors (23). The leukocyte immunoglobulin-like receptors (LILR) can interact with HLA class I (24) and are expressed by a range of immunologically active cells, including natural killer (NK) cells and antigen-presenting cells (APCs) (25). LILRs have both activating and inhibitory effects (26). Previously, no associations have been found between LILRB4 and sarcoidosis, but variations in this gene have been shown to increase the risk of another immune-mediated disease, systemic lupus erythematosus (SLE) (27). In our study, variations in this gene were associated with good prognosis in patients having the good prognosis HLA class II markers. LILRB4 belongs to the inhibitory receptors expressed on dendritic cells and has been shown to interact with HLA-G (28).
The LILRB4-HLA-G interaction has been shown to limit the activation of dendritic cells (29), which are important APCs in sarcoidosis inflammation. In sarcoidosis, the association with HLA class II is thought to be more important, but limited information exists on HLA-G in sarcoidosis. In one study, HLA-G variation was observed in sarcoidosis patients (30). In addition, there is strong LD in the MHC region, and it is possible that patients with the HLA class II markers have haplotypes that include HLA class I alleles as well. The combination of these MHC variations with the variants in LILRB4 could alter the immune reaction in sarcoidosis and lead to self-resolving disease. In addition to LILRB4, the WES study found associations with variants in LAIR1 and KIR3DL1/KIR3DS1. Leukocyte-associated immunoglobulin-like receptor-1 (LAIR1) has been shown to inhibit T and NK cell activation (31), but there are no known HLA ligands for this receptor. In our study, the association with variants in LAIR1 was seen when persistent and resolved patients were compared. No association with the HLA markers was evident, which is consistent with the absence of known HLA ligands for this protein. No association has previously been found between this gene and sarcoidosis, but this immune-modulating gene, too, has been shown to have an effect in SLE (32). However, this association was seen only in the gene-based analysis and with a single method. Because of technical issues, the variation could not be replicated, so the effect should be interpreted cautiously. The other association seen only in the WES study involves genes encoding killer-cell immunoglobulin-like receptors (KIRs), which are expressed mainly on the surface of NK cells. The most diverse of the KIR loci is KIR3DL1/KIR3DS1, which was associated with persistent sarcoidosis in our study. 
The gene encodes both inhibitory (3DL1) and activating (3DS1) receptors (33), so either or both of these functions could be affected, shifting the disease toward a chronic course. In a recent WES study of three families with pediatric sarcoidosis, KIR3DL1/KIR3DS1 was found at the gene level to be shared among the families (34). To our knowledge, no case-control study has found an association between KIR genes and sarcoidosis. This study has some limitations. Because sarcoidosis is a relatively rare disease, collecting large data sets is demanding in a small country such as Finland. Owing to the relatively small sample size, the power might not have been sufficient to detect all associations; however, the gene-based analysis in the WES study and the replication sample increase the power. Because patients were selected on the basis of the HLA markers, the analysis between all persistent and resolved patients has some statistical limitations. However, the replication study included the whole Finnish sarcoidosis sample. The other confounding factors (gender, age, and extrapulmonary manifestations) were conditioned on in all analyses. In summary, we used WES to further characterize genetic differences between persistent and resolved sarcoidosis patients and to test whether different variations are associated in relation to the HLA class II markers (HLA-DRB1*03:01/HLA-DRB1*04:01-DPB1*04:01). An association with resolved disease arose in the chromosomal region 1p36.21, a region that has recently been associated with sarcoidosis in another WES study. Another chromosomal region of interest, 19q13.42, also peaked, but the association weakened in the replication study. Because of the relatively small sample size, the true causality of these variants should be evaluated in a larger data set.

Data Availability Statement

The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation, to any qualified researcher.

Ethics Statement

The studies involving human participants were reviewed and approved by the Ethics Committee of the Department of Internal Medicine, in the Hospital District of Helsinki and Uusimaa, Helsinki, Finland (Approval Dnro 362/E5/05). The patients/participants provided their written informed consent to participate in this study.

Author Contributions

EL, MK, JS, OS, and M-LL: study design, data analysis, and manuscript writing. The manuscript has been read and approved by all named authors.

Conflict of Interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References (34 in total)

1.  Genetic linkage analysis of sarcoidosis phenotypes: the sarcoidosis genetic analysis (SAGA) study.

Authors:  B A Rybicki; R Sinha; S Iyengar; C Gray-McGuire; R C Elston; M C Iannuzzi
Journal:  Genes Immun       Date:  2007-05-03       Impact factor: 2.676

Review 2.  Racial differences in sarcoidosis incidence: a 5-year study in a health maintenance organization.

Authors:  B A Rybicki; M Major; J Popovich; M J Maliarik; M C Iannuzzi
Journal:  Am J Epidemiol       Date:  1997-02-01       Impact factor: 4.897

3.  Exome sequencing and pathogenicity-network analysis of five French families implicate mTOR signalling and autophagy in familial sarcoidosis.

Authors:  Alain Calender; Clarice X Lim; Thomas Weichhart; Adrien Buisson; Valérie Besnard; Pierre Antoine Rollat-Farnier; Claire Bardel; Pascal Roy; Vincent Cottin; Gilles Devouassoux; Amélie Finat; Stéphane Pinson; Serge Lebecque; Hilario Nunes; Dominique Israel-Biet; Abderazzaq Bentaher; Dominique Valeyre; Yves Pacheco
Journal:  Eur Respir J       Date:  2019-08-01       Impact factor: 16.671

4.  Optimal unified approach for rare-variant association testing with application to small-sample case-control whole-exome sequencing studies.

Authors:  Seunggeun Lee; Mary J Emond; Michael J Bamshad; Kathleen C Barnes; Mark J Rieder; Deborah A Nickerson; David C Christiani; Mark M Wurfel; Xihong Lin
Journal:  Am J Hum Genet       Date:  2012-08-02       Impact factor: 11.025

5.  The mode of presentation of sarcoidosis in Finland and Hokkaido, Japan. A comparative analysis of 571 Finnish and 686 Japanese patients.

Authors:  A Pietinalho; M Ohmichi; Y Hiraga; A B Löfroos; O Selroos
Journal:  Sarcoidosis Vasc Diffuse Lung Dis       Date:  1996-09       Impact factor: 0.670

6.  Whole-exome sequencing identifies rare genetic variations in German families with pulmonary sarcoidosis.

Authors:  Amit Kishore; Britt-Sabina Petersen; Marcel Nutsua; Joachim Müller-Quernheim; Andre Franke; Annegret Fischer; Stefan Schreiber; Martin Petrek
Journal:  Hum Genet       Date:  2018-07-27       Impact factor: 4.132

Review 7.  Sarcoidosis.

Authors:  Dominique Valeyre; Antje Prasse; Hilario Nunes; Yurdagul Uzunhan; Pierre-Yves Brillet; Joachim Müller-Quernheim
Journal:  Lancet       Date:  2013-10-01       Impact factor: 79.321

8.  Human leukocyte antigen class I alleles and the disease course in sarcoidosis patients.

Authors:  Johan Grunewald; Anders Eklund; Olle Olerup
Journal:  Am J Respir Crit Care Med       Date:  2003-12-04       Impact factor: 21.405

9.  The Genotype-Tissue Expression (GTEx) project.

Authors: 
Journal:  Nat Genet       Date:  2013-06       Impact factor: 38.330

Review 10.  Deciphering the killer-cell immunoglobulin-like receptor system at super-resolution for natural killer and T-cell biology.

Authors:  Vivien Béziat; Hugo G Hilton; Paul J Norman; James A Traherne
Journal:  Immunology       Date:  2016-12-14       Impact factor: 7.397
