Yu Chen, Jie Zhou, Zhongshan Cheng, Shigui Yang, Hin Chu, Yanhui Fan, Cun Li, Bosco Ho-Yin Wong, Shufa Zheng, Yixin Zhu, Fei Yu, Yiyin Wang, Xiaoli Liu, Hainv Gao, Liang Yu, Linglin Tang, Dawei Cui, Ke Hao, Yohan Bossé, Ma'en Obeidat, Corry-Anke Brandsma, You-Qiang Song, Kelvin Kai-Wang To, Pak Chung Sham, Kwok-Yung Yuen, Lanjuan Li.
Abstract
The case fatality rate of avian influenza A(H7N9) infection in humans exceeded 30%. To identify human genetic susceptibility to A(H7N9) infection, we performed a genome-wide association study (GWAS) involving 102 A(H7N9) patients and 106 heavily exposed healthy poultry workers, a sample size critically restricted by the small number of human A(H7N9) cases. To tackle the stringent significance cutoff of GWAS, we used an artificial imputation program, SnipSnip, to improve the association signals. In single-SNP analysis, one of the top SNPs was rs13057866 of LGALS1. The artificial imputation (AI) identified three non-genotyped causal variants, which can be represented by three anchor/partner SNP pairs: rs13057866/rs9622682 (AI P = 1.81 × 10⁻⁷), rs4820294/rs2899292 (AI P = 2.13 × 10⁻⁷) and rs62236673/rs2899292 (AI P = 4.25 × 10⁻⁷). Haplotype analysis of rs4820294 and rs2899292 could simulate the signal of a causal variant. The rs4820294/rs2899292 haplotype GG, which was associated with protection from A(H7N9) infection (OR = 0.26, P = 5.92 × 10⁻⁷), correlated with significantly higher levels of LGALS1 mRNA (P = 0.050) and protein expression (P = 0.025) in lymphoblastoid cell lines. Additionally, rs4820294 was mapped as an eQTL in human primary monocytes and lung tissues. In conclusion, functional variants of LGALS1 that cause expression variation contribute to the differential susceptibility to influenza A(H7N9).
Year: 2015 PMID: 25687228 PMCID: PMC4649671 DOI: 10.1038/srep08517
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Demographic and clinical characteristics of A(H7N9) patients
| Characteristics | Death (n = 27) | Survival (n = 75) | P value |
|---|---|---|---|
| Age (years) | 66 (61–72) | 58 (46–66) | 0.002 |
| Female sex | 8 (29.6) | 27 (36.0) | 0.640 |
| Age ≥ 65 | 15 (57.7) | 24 (32.0) | 0.034 |
| Pregnant women | 0 (0) | 1 (1.3) | 1.000 |
| Chronic pulmonary diseases | 4 (14.8) | 2 (2.7) | 0.041 |
| Chronic cardiac diseases | 3 (11.1) | 4 (5.3) | 0.378 |
| Metabolic disorders | 4 (14.8) | 5 (6.7) | 0.240 |
| Chronic renal diseases | 1 (3.7) | 3 (4.0) | 1.000 |
| Chronic hepatic diseases | 0 (0) | 1 (1.3) | 1.000 |
| Neurological conditions | 2 (7.4) | 4 (5.3) | 0.654 |
| Immunosuppression | 3 (11.1) | 2 (2.7) | 0.114 |
| Hemoglobin (g/L) | 108 (90–125) | 124 (110–135) | 0.014 |
| Total white blood cell (×10⁹ cells/L) | 5.4 (1.8–11.4) | 4.1 (3.0–6.1) | 0.428 |
| Neutrophil (×10⁹ cells/L) | 3.4 (1.4–10.2) | 3.3 (2.1–4.9) | 0.776 |
| Lymphocyte (×10⁹ cells/L) | 0.60 (0.30–0.80) | 0.50 (0.40–0.70) | 0.272 |
| Platelet (×10⁹ cells/L) | 98 (61–184) | 128 (93–162) | 0.262 |
| Prothrombin time (s) | 13.0 (12.1–14.3) | 12.5 (12.0–13.6) | 0.154 |
| Activated partial thromboplastin time (s) | 38.3 (33.7–44.4) | 36.0 (30.5–42.2) | 0.607 |
| D-dimer (μg/L) | 7594 (5780–13900) | 2190 (1260–5810) | <0.001 |
| Urea (mmol/L) | 8.9 (5.9–16.2) | 4.6 (3.3–7.3) | 0.001 |
| Creatinine (μmol/L) | 88 (70–170) | 61 (47–82) | <0.001 |
| Bilirubin (μmol/L) | 11.0 (8.8–16.6) | 8.0 (6.0–12.0) | <0.001 |
| Alanine transaminase (U/L) | 39 (27–65) | 33 (21–55) | 0.174 |
| Aspartate transaminase (U/L) | 87 (39–147) | 49 (35–74) | 0.002 |
| Lactate dehydrogenase (U/L) | 670 (569–873) | 394 (309–549) | <0.001 |
| Creatine kinase (U/L) | 263 (140–630) | 152 (74–301) | 0.032 |
| C-reactive protein (mg/L) | 113 (78–153) | 57 (28–102) | 0.001 |
| ICU admission | 26 (96.3) | 68 (90.7) | 0.678 |
| APACHE II score | 27 (25–30) | 18 (16–22) | <0.001 |
| ARDS | 25 (92.6) | 49 (65.3) | 0.006 |
| MODS | 23 (85.2) | 10 (13.3) | <0.001 |
| ECMO | 6 (22.2) | 8 (10.7) | 0.190 |
APACHE II, Acute Physiology and Chronic Health Evaluation II; ARDS, acute respiratory distress syndrome; ECMO, extracorporeal membrane oxygenation; ICU, intensive care unit; MODS, multi-organ dysfunction syndrome.
a. All continuous variables are expressed as median (interquartile range).
b. Neither obesity nor hemoglobinopathy was present in any individual.
c. Data include only patients who were admitted to the intensive care unit.
d. The Fisher exact test and the Mann-Whitney U test were used for categorical and continuous variables, respectively.
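As an illustration of the categorical comparison named in the footnote, the following minimal sketch computes a two-sided Fisher exact P value for a 2 × 2 table using only the Python standard library. The function name `fisher_exact_two_sided` is hypothetical, and the sketch assumes the common convention of summing the probabilities of all tables no more likely than the observed one; the authors' software and exact settings are not stated in this record. The counts are taken from the MODS and pregnancy rows of the table.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact P value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x events in row 1, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum over every table that is no more probable than the observed one
    # (small relative tolerance guards against floating-point ties).
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-7))


# MODS: 23 of 27 deaths versus 10 of 75 survivors.
p_mods = fisher_exact_two_sided(23, 27 - 23, 10, 75 - 10)
# Pregnant women: 0 of 27 deaths versus 1 of 75 survivors.
p_pregnant = fisher_exact_two_sided(0, 27, 1, 74)
print(f"MODS P = {p_mods:.2e}, pregnancy P = {p_pregnant:.3f}")
```

The pregnancy row is a useful sanity check: with a single event in the whole cohort, every possible table is at least as extreme as the observed one, so the two-sided P value is 1.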
Figure 1. Linkage disequilibrium pattern of three anchor SNPs rs4820294, rs13057866, rs62236673 and the related variants.
The LD pattern of the three anchor SNPs rs13057866, rs4820294 and rs62236673 (in green) and the partner SNP rs2899292 is plotted for the 208 participants in this study (H7N9). The LD patterns of the three anchor SNPs and the variants in high LD with them are plotted for 94 Han Chinese individuals from Beijing (CHB), 264 Asian individuals (ASN) and 172 American individuals (AMR), whose genotypes were retrieved from the 1000 Genomes Project. The boxes are colored according to the D′ measure on a white-to-red scale, where red indicates complete LD (D′ = 1). The numbers inside the boxes are r² values. A red box without a number indicates the highest r² of 1.0.
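The two LD measures in the legend, D′ and r², can be computed from two-locus haplotype frequencies. The sketch below uses the standard textbook definitions; the function name `ld_measures` is hypothetical, and the input is the control-group haplotype frequencies for rs4820294/rs2899292 from the haplotype table in this record, used purely as an illustration. The result is not expected to reproduce the values plotted in Figure 1, which were derived from the genotypes of all 208 participants.

```python
def ld_measures(hap_freq, allele1, allele2):
    """Return (D, D', r^2) for the chosen allele at each of two biallelic loci.

    hap_freq maps two-character haplotypes (locus 1 allele followed by
    locus 2 allele) to frequencies that sum to 1.
    """
    p1 = sum(f for h, f in hap_freq.items() if h[0] == allele1)
    p2 = sum(f for h, f in hap_freq.items() if h[1] == allele2)
    # D is the excess of the observed haplotype frequency over independence.
    d = hap_freq.get(allele1 + allele2, 0.0) - p1 * p2
    # D' rescales |D| by its maximum attainable value given the allele frequencies.
    if d >= 0:
        d_max = min(p1 * (1 - p2), (1 - p1) * p2)
    else:
        d_max = min(p1 * p2, (1 - p1) * (1 - p2))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    r2 = d * d / (p1 * (1 - p1) * p2 * (1 - p2))
    return d, d_prime, r2


# Control haplotype frequencies (rs4820294 allele, then rs2899292 allele).
controls = {"AA": 0.0115, "GA": 0.4225, "AG": 0.2102, "GG": 0.3558}
d, d_prime, r2 = ld_measures(controls, "G", "G")
print(f"D = {d:.4f}, D' = {d_prime:.3f}, r2 = {r2:.3f}")
```

A high D′ together with a low r² is possible when the two SNPs have very different allele frequencies, which is why LD plots typically report both.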
Figure 2. The artificial imputation implemented in SnipSnip increased the association signal compared with single-SNP analysis.
Manhattan plots show the P values of SNPs (y-axis, -log10 scale) on a genomic scale (x-axis) for the A(H7N9) GWAS dataset. The single-SNP allelic association P values from logistic regression implemented in PLINK and the artificial imputation P values from SnipSnip are shown in the upper and lower panels, respectively.
The top association anchor-partner SNPs identified with SnipSnip implementation
| Anchor/partner SNP | Anchor/partner position (hg19) | Annotation | Artificial imputation P value |
|---|---|---|---|
| rs646606/rs706481 | 57416633/57415200 | C8B; chromosome 1 (intronic/intronic) | 6.72 × 10⁻⁸ |
| rs13057866/rs9622682 | 38069622/38074434 | LGALS1; chromosome 22 (upstream/intronic) | 1.81 × 10⁻⁷ |
| rs4820294/rs2899292 | 38071043/38077718 | LGALS1; chromosome 22 (upstream/intergenic) | 2.13 × 10⁻⁷ |
| rs62236673/rs2899292 | 38076063/38077718 | LGALS1; chromosome 22 (downstream/intergenic) | 4.25 × 10⁻⁷ |
The haplotype analysis of rs4820294/rs2899292 for disease association in GWAS and LGALS1 expression association in lymphoblastoid cell lines (LCLs)
| Haplotype | Disease association: Freq (case) | Disease association: Freq (control) | Disease association: OR (95% CI) | Disease association: P | Expression association in LCLs: Freq | Expression association in LCLs: P |
|---|---|---|---|---|---|---|
| AA | 0.0207 | 0.0115 | 2.30 (0.34–15.50) | 0.452 | 0.0277 | 6.86 × 10⁻³ |
| GA | 0.4793 | 0.4225 | 1.28 (0.86–1.91) | 0.245 | 0.5871 | 0.126 |
| AG | 0.3567 | 0.2102 | 2.10 (1.34–3.31) | 9.03 × 10⁻⁴ | 0.1817 | 0.465 |
| GG | 0.1433 | 0.3558 | 0.26 (0.15–0.45) | 5.92 × 10⁻⁷ | 0.2034 | 0.050 |
Freq, frequency; OR (95% CI), odds ratio (95% confidence interval).
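To show how the haplotype frequencies relate to the direction of the odds ratios, the sketch below computes a crude (unadjusted) odds ratio for each haplotype against all other haplotypes pooled. The function name `crude_odds_ratio` is hypothetical. The method behind the published estimates is not described in this record; it may be model-based, so the crude values are only expected to agree with the table in direction and approximate size, not exactly.

```python
def crude_odds_ratio(freq_case, freq_control):
    """Odds of carrying the haplotype in cases divided by the odds in controls."""
    odds_case = freq_case / (1 - freq_case)
    odds_control = freq_control / (1 - freq_control)
    return odds_case / odds_control


# (case frequency, control frequency) for rs4820294/rs2899292 haplotypes,
# copied from the table above.
haplotypes = {
    "AA": (0.0207, 0.0115),
    "GA": (0.4793, 0.4225),
    "AG": (0.3567, 0.2102),
    "GG": (0.1433, 0.3558),
}

crude = {hap: crude_odds_ratio(case, ctrl) for hap, (case, ctrl) in haplotypes.items()}
for hap, value in crude.items():
    print(f"{hap}: crude OR = {value:.2f}")
```

An odds ratio below 1 for GG and above 1 for AG is consistent with the protective and risk directions reported in the table.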
Figure 3. The genetic architecture of the LGALS1 gene.
The upper panel denotes the chromosomal region that accommodates the LGALS1 gene. The region flanking the transcriptional start site is a conserved regulatory region containing high-LD SNPs in Chinese and other populations. The variants in high LD with the anchor SNPs rs4820294 and rs13057866 (denoted with ★) are shown in the next panel. The underlined variants are those in high LD with rs4820294, while the unmarked ones are in high LD with rs13057866. The lung eQTL panel shows the locations of eQTLs and their P values on a -log10 scale. The horizontal bar represents a -log10 P value of 4. DNase I hypersensitivity clusters and transcription factor binding site signals are annotated according to experimental data from the ENCODE Consortium. The gray box indicates the extent of the hypersensitive region or cluster of transcription factor occupancy. The darkness is proportional to the maximum signal strength observed in any cell line. The green line indicates the highest-scoring site of an identified canonical motif for the corresponding factor.
Figure 4. The associated variants of LGALS1 are correlated with differential expression levels in lymphoblastoid cell lines (LCLs), human monocytes and lung tissues.
(4a) Boxplot of LGALS1 mRNA expression according to rs4820294/rs2899292 haplotype GG in LCLs. Carriage of haplotype GG significantly correlated with LGALS1 mRNA expression (P = 0.050) in LCLs generated from 74 Han Chinese individuals from Beijing (CHB). On the x-axis, -/-, -/+ and +/+ denote non-carriers (N = 49), heterozygotes (N = 21) and homozygotes (N = 4) of rs4820294/rs2899292 haplotype GG, respectively. The box denotes the interquartile range. The line and diamond within the box represent the median and mean, respectively. Linear regression analysis was used to analyze the data. (4b) Boxplot of LGALS1 protein expression corresponding to rs4820294/rs2899292 haplotype GG in 21 LCLs (N = 8 for -/-, 9 for -/+ and 4 for +/+) by flow cytometry analysis. Carriage of rs4820294/rs2899292 haplotype GG significantly correlated with LGALS1 protein expression in these LCLs (P = 0.025). MFI, mean fluorescence intensity. (4c) The anchor SNP rs4820294, a variant in the proximal promoter of LGALS1, regulated LGALS1 mRNA expression. A total of 19 mRNA samples (N = 2 for genotype A/A, 8 for A/G and 9 for G/G) from peripheral blood monocytes expressed differential levels of LGALS1 transcript in a genotype-specific manner (P = 0.031) by RT-qPCR assay. Levels of LGALS1 transcript are normalized to those of GAPDH. (4d) Gene expression levels of LGALS1 in human lung correlated with genotype groups of rs4820294 (meta-analysis P = 3.63 × 10⁻⁵). The lung cis-eQTL dataset was generated from lung specimens collected at three centers from a total of 1111 individuals. Boxplot notation is the same as in 4a, except that open dots represent outliers.
Enriched biological pathways in A(H7N9) patients versus healthy poultry workers with gene set analysis
| Pathway name | ECM-receptor interaction | MAPK signaling pathway |
|---|---|---|
| Genes | laminin, alpha 4 (LAMA4) | fibroblast growth factor 1 (FGF1) |
| | collagen, type IV, alpha 1 (COL4A1) | calcium channel, voltage-dependent, R type, alpha 1E subunit (CACNA1E) |
| | synaptic vesicle glycoprotein 2B (SV2B) | Rap guanine nucleotide exchange factor 2 (RAPGEF2) |
| | thrombospondin 3 (THBS3) | fibroblast growth factor 20 (FGF20) |
| | integrin, beta 4 (ITGB4) | phospholipase A2, group V (PLA2G5) |
| | CD36 (thrombospondin receptor) | fibroblast growth factor 12 (FGF12) |
| | integrin, alpha 4 (ITGA4) | mitogen-activated protein kinase 8 interacting protein 3 (MAPK8IP3) |
| | CD47 | transforming growth factor, beta 2 (TGFB2) |
| | | mitogen-activated protein kinase kinase kinase 7 (MAP3K7) |
| | | neurotrophic tyrosine kinase, receptor, type 2 (NTRK2) |
| | | calcium channel, voltage-dependent, alpha 2/delta subunit 1 (CACNA2D1) |
| | | RAS p21 protein activator 1 (RASA1) |
| Ratio of enrichment | 7.76 | 3.69 |
| Adjusted P | 0.0008 | 0.0045 |
Ratio of enrichment represents the ratio of the number of genes in the gene set and also in the category versus the expected number of genes by chance. The adjusted P is the P value after correction for multiple tests.
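The definition above implies that the expected gene count can be recovered by dividing the observed count by the ratio of enrichment. The sketch below does this arithmetic under one labeled assumption: that the genes listed in the table (8 for ECM-receptor interaction, 12 for MAPK signaling) are the complete set of observed overlapping genes for each pathway. The helper name `expected_count` is hypothetical.

```python
def expected_count(observed, ratio_of_enrichment):
    """Expected number of genes by chance, given ratio = observed / expected."""
    return observed / ratio_of_enrichment


# (observed gene count, reported ratio of enrichment); the observed counts are
# ASSUMED to equal the number of genes listed per pathway in the table.
pathways = {
    "ECM-receptor interaction": (8, 7.76),
    "MAPK signaling pathway": (12, 3.69),
}

expected = {name: expected_count(obs, ratio) for name, (obs, ratio) in pathways.items()}
for name, value in expected.items():
    print(f"{name}: expected genes by chance = {value:.2f}")
```

Under that assumption, roughly one ECM-receptor gene and about three MAPK genes would be expected by chance, against 8 and 12 observed.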