Eun Pyo Hong, Min Jin Go, Hyung-Lae Kim, Ji Wan Park.
Abstract
A complex interplay among host, pathogen, and environmental factors is believed to contribute to the risk of developing pulmonary tuberculosis (PTB). The lack of replication of published genome-wide association study (GWAS) findings limits the clinical utility of reported single nucleotide polymorphisms (SNPs). We conducted a GWAS using 467 PTB cases and 1,313 healthy controls obtained from two community-based cohorts in Korea. We evaluated the performance of PTB risk models based on different combinations of genetic and nongenetic factors and validated the results in an independent Korean population comprising 179 PTB cases and 500 healthy controls. We demonstrated the polygenic nature of PTB and found that nongenetic factors such as age, sex, and body mass index (BMI) were strongly associated with PTB risk. None of the SNPs achieved genome-wide significance; instead, we were able to replicate the associations between PTB and ten SNPs in or near the genes CDCA7, GBE1, GADL1, SPATA16, C6orf118, KIAA1432, DMRT2, CTR9, CCDC67, and CDH13, which may play roles in the immune and inflammatory pathways. Among the replicated SNPs, an intergenic SNP, rs9365798, located downstream of the C6orf118 gene showed the most significant association under the dominant model (OR = 1.59, 95% CI 1.32-1.92, P = 2.1×10−6). The performance of a risk model combining the effects of the ten replicated SNPs and six nongenetic factors (i.e., age, sex, BMI, cigarette smoking, systolic blood pressure, and hemoglobin) was validated in the replication set (AUC = 0.80, 95% CI 0.76-0.84). The strategy of combining genetic and nongenetic risk factors ultimately resulted in better risk prediction for PTB in the adult Korean population.
Year: 2017 PMID: 28355295 PMCID: PMC5371343 DOI: 10.1371/journal.pone.0174642
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Univariate logistic regression analysis for associations between baseline characteristics and pulmonary tuberculosis in the KARE and HEXA Studies.
| Characteristics | KARE: Case (N = 467) | KARE: Control (N = 1,313) | KARE: OR (95% CI) | KARE: P | HEXA: Case (N = 179) | HEXA: Control (N = 500) | HEXA: OR (95% CI) | HEXA: P |
|---|---|---|---|---|---|---|---|---|
| Male, N (%) | 286 (61.2) | 574 (43.7) | 2.03 (1.64–2.52) | 1.1×10−10 | 104 (58.1) | 124 (24.8) | 4.20 (2.93–6.02) | 5.0×10−15 |
| Age, N (%) | | | | | | | | |
| < 50 | 216 (46.2) | 799 (60.8) | 1.00 | | 67 (37.4) | 302 (60.4) | 1.00 | |
| ≥ 50 | 251 (53.8) | 514 (39.2) | 1.81 (1.46–2.24) | 5.3×10−8 | 112 (62.6) | 198 (39.6) | 2.55 (1.79–3.62) | 1.8×10−7 |
| Household income (million won), N (%) | | | | | | | | |
| < 100 | 138 (29.9) | 281 (21.7) | 1.00 | | 21 (11.7) | 39 (7.8) | 1.00 | |
| ≥ 100 | 324 (70.1) | 1,016 (78.3) | 0.65 (0.51–0.82) | 4.0×10−4 | 133 (74.3) | 371 (74.2) | 0.67 (0.38–1.17) | 0.159 |
| Cigarette smoking, N (%) | | | | | | | | |
| Nonsmoker | 240 (51.8) | 835 (63.8) | 1.00 | | 110 (61.5) | 406 (81.2) | 1.00 | |
| Ex-smoker | 94 (20.3) | 177 (13.5) | 1.85 (1.38–2.47) | 3.0×10−5 | 34 (19.0) | 46 (9.2) | 2.72 (1.67–4.46) | 6.1×10−5 |
| Current smoker | 129 (27.9) | 296 (22.6) | 1.52 (1.18–1.95) | 0.001 | 34 (19.0) | 46 (9.2) | 2.72 (1.67–4.46) | 6.1×10−5 |
| Alcohol consumption, N (%) | | | | | | | | |
| Nondrinker | 185 (39.7) | 647 (49.4) | 1.00 | | 78 (43.6) | 265 (53.0) | 1.00 | |
| Ex-drinker | 39 (8.4) | 69 (5.3) | 1.98 (1.29–3.02) | 0.002 | 9 (5.0) | 16 (3.2) | 1.91 (0.81–4.49) | 0.138 |
| Current drinker | 242 (51.9) | 594 (45.3) | 1.42 (1.14–1.79) | 0.002 | 90 (50.3) | 219 (43.8) | 1.40 (0.98–1.99) | 0.063 |
| BMI, kg/m2 | 23.45±0.13 | 24.70±0.08 | 0.86 (0.82–0.89) | 1.3×10−14 | 22.94±0.20 | 22.96±0.12 | 0.99 (0.93–1.06) | 0.905 |
| < 25 | 338 (72.4) | 734 (55.9) | 1.00 | | 137 (76.5) | 393 (78.6) | 1.00 | |
| 25–30 | 117 (25.0) | 515 (39.2) | 0.49 (0.39–0.63) | 6.4×10−9 | 41 (22.9) | 101 (20.2) | 1.16 (0.77–1.76) | 0.469 |
| ≥ 30 | 12 (2.6) | 64 (4.9) | 0.41 (0.22–0.76) | 0.005 | 1 (0.6) | 6 (1.2) | 0.48 (0.06–4.01) | 0.496 |
| SBP, mmHg | 119.06±0.87 | 116.15±0.44 | 1.01 (1.01–1.02) | 0.001 | 122.80±1.09 | 107.47±0.37 | 1.17 (1.14–1.21) | 4.1×10−27 |
| < 120 | 296 (63.4) | 952 (72.5) | 1.00 | | 70 (39.1) | 500 (100) | | |
| 120–140 | 117 (25.0) | 282 (21.5) | 1.33 (1.04–1.72) | 0.025 | 81 (45.3) | 0 | | |
| ≥ 140 | 54 (11.6) | 79 (6.0) | 2.20 (1.52–3.18) | 3.0×10−5 | 28 (15.6) | 0 | | |
| DBP, mmHg | 79.09±0.56 | 77.20±0.28 | 1.02 (1.01–1.03) | 0.001 | 77.24±0.74 | 67.37±0.29 | 1.20 (1.16–1.25) | 3.4×10−26 |
| < 80 | 329 (70.4) | 1,004 (76.5) | 1.00 | | 86 (48.0) | 500 (100) | | |
| 80–90 | 84 (18.0) | 224 (17.0) | 1.14 (0.86–1.51) | 0.345 | 64 (35.8) | 0 | - | |
| ≥ 90 | 54 (11.6) | 85 (6.5) | 1.94 (1.35–2.79) | 3.5×10−4 | 29 (16.2) | 0 | | |
| Hb, g/dL | 13.80±0.06 | 13.52±0.04 | 1.13 (1.05–1.21) | 8.4×10−4 | 14.36±0.14 | 13.47±0.07 | 1.37 (1.23–1.53) | 1.5×10−8 |
| 13.8–17.2 (12.1–15.1) | 353 (75.6) | 1,022 (77.8) | 1.00 | | 148 (82.7) | 410 (82.0) | 1.00 | |
| < 13.8 (< 12.1) | 110 (23.5) | 282 (21.5) | 1.13 (0.88–1.45) | 0.343 | 20 (11.2) | 73 (14.6) | 0.76 (0.45–1.29) | 0.307 |
| > 17.2 (> 15.1) | 4 (0.9) | 9 (0.7) | 1.29 (0.39–4.20) | 0.676 | 11 (6.1) | 17 (3.4) | 1.79 (0.82–3.92) | 0.143 |
| BUN, mg/dL | 14.47±0.17 | 13.97±0.09 | 1.04 (1.01–1.07) | 0.008 | 13.92±0.28 | 13.32±0.16 | 1.05 (0.99–1.10) | 0.060 |
| ≤ 20 | 428 (91.6) | 1,235 (94.1) | 1.00 | | 168 (93.8) | 474 (94.8) | 1.00 | |
| > 20 | 39 (8.4) | 78 (5.9) | 1.44 (0.97–2.15) | 0.072 | 11 (6.2) | 26 (5.2) | 1.19 (0.58–2.47) | 0.633 |
Abbreviations: BMI, body mass index; BUN, blood urea nitrogen; CI, confidence interval; DBP, diastolic blood pressure; Hb, hemoglobin; OR, odds ratio; SBP, systolic blood pressure.
a Data are shown as the numbers of subjects (percentage) for discrete and categorical variables and mean ± standard error for continuous variables.
b ORs and P values were estimated from univariate logistic regression analysis.
c The variables remained significant at P value less than 0.05 after backward elimination in a multivariate logistic regression model.
d The ORs, 95% CIs, and P values for the continuous variables were estimated using univariate logistic regression analyses.
e Hemoglobin levels in males (females).
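For a single binary characteristic, the odds ratio from a univariate logistic regression equals the cross-product ratio of the corresponding 2×2 table, and the Wald interval reduces to the familiar log-OR formula. The sketch below recomputes the KARE odds ratio for male sex from the counts in the table; the function name `odds_ratio_wald` is an illustrative assumption, not something taken from the study.

```python
import math
from statistics import NormalDist


def odds_ratio_wald(exposed_cases, unexposed_cases,
                    exposed_controls, unexposed_controls, level=0.95):
    """Crude odds ratio with a Wald confidence interval from 2x2 counts."""
    odds_ratio = (exposed_cases * unexposed_controls) / (
        unexposed_cases * exposed_controls)
    # Standard error of log(OR): square root of the summed reciprocal counts.
    se = math.sqrt(1 / exposed_cases + 1 / unexposed_cases
                   + 1 / exposed_controls + 1 / unexposed_controls)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


# KARE, sex: 286 of 467 cases and 574 of 1,313 controls were male.
or_male, lo_male, hi_male = odds_ratio_wald(286, 467 - 286, 574, 1313 - 574)
print(f"OR = {or_male:.2f} (95% CI {lo_male:.2f}-{hi_male:.2f})")
# prints: OR = 2.03 (95% CI 1.64-2.52)
```

The result agrees with the tabulated value of 2.03 (1.64–2.52). The same shortcut does not apply to the continuous variables (BMI, SBP, DBP, Hb, BUN), which require fitting the regression on individual-level data.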
Results of the genome-wide association study for pulmonary tuberculosis in the KARE Study (P < 1×10−5).
| Gene | Chr. | SNP | Function | Model | N/R | NN/NR/RR, cases | NN/NR/RR, controls | RAF, cases/controls | MLR OR (95% CI) | MLR P |
|---|---|---|---|---|---|---|---|---|---|---|
| | 13q32.2 | rs3825435 | intron | A | | 303/144/20 | 996/298/18 | 0.20/0.13 | 1.69 (1.38–2.07) | 5.3×10−7 |
| | 8q23.1 | rs3110431 | intron | D | | 180/223/54 | 668/519/126 | 0.37/0.29 | 1.70 (1.37–2.12) | 2.1×10−6 |
| | 6p12.3 | rs9381416 | intron | A | | 76/216/174 | 320/640/353 | 0.61/0.51 | 1.44 (1.23–1.67) | 4.0×10−6 |
| | 16q24.1 | rs2326344 | intergenic | A | | 203/214/50 | 726/506/81 | 0.34/0.25 | 1.48 (1.25–1.75) | 4.9×10−6 |
| | 3q13.3 | rs571110 | intergenic | R | | 258/150/47 | 766/463/59 | 0.27/0.23 | 2.56 (1.70–3.85) | 6.9×10−6 |
| | 2p21 | rs17394081 | intron | D | | 423/43/1 | 1257/55/1 | 0.05/0.02 | 2.61 (1.72–3.98) | 7.9×10−6 |
Abbreviations: Chr., chromosome; N/R, non-risk/risk allele; OR, odds ratio; MLR, multiple logistic regression; RAF, risk allele frequency; SNP, single nucleotide polymorphism
a The genetic model that showed the most significant evidence for association with PTB: A, additive; D, dominant; R, recessive.
b NN/NR/RR, the numbers of cases and controls with non-risk homozygote/heterozygote/risk homozygote genotypes, respectively.
c ORs and P values were estimated from the MLR model adjusted for age, sex, and BMI.
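The risk allele frequencies in the table follow directly from the genotype counts: each heterozygote carries one risk allele and each risk homozygote carries two, out of two alleles per genotyped subject. A minimal sketch using the rs3825435 counts is given here; the function name `risk_allele_frequency` is an illustrative assumption.

```python
def risk_allele_frequency(nn, nr, rr):
    """Risk allele frequency from NN/NR/RR genotype counts."""
    subjects = nn + nr + rr
    risk_alleles = nr + 2 * rr      # one per heterozygote, two per risk homozygote
    return risk_alleles / (2 * subjects)


# rs3825435 (13q32.2): NN/NR/RR = 303/144/20 in cases, 996/298/18 in controls.
raf_cases = risk_allele_frequency(303, 144, 20)
raf_controls = risk_allele_frequency(996, 298, 18)
print(f"{raf_cases:.2f}/{raf_controls:.2f}")  # prints: 0.20/0.13
```

This reproduces the tabulated RAF of 0.20/0.13 for cases and controls.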
Ten replicated SNPs associated with PTB in the KARE and HEXA Studies.
| Gene | Chr. | SNP | Function | Model | N/R | KARE OR (95% CI) | KARE P | HEXA OR (95% CI) | HEXA P | Joint OR (95% CI) | Joint P |
|---|---|---|---|---|---|---|---|---|---|---|---|
| CDCA7 | 2q31.1 | rs7594926 | intergenic | D | | 1.59 (1.22–2.08) | 7.6×10−4 | 1.64 (1.05–2.56) | 0.030 | 1.54 (1.23–1.89) | 1.4×10−4 |
| GBE1 | 3p12.2 | rs2307058 | intron | R | | 1.75 (1.31–2.33) | 1.4×10−4 | 1.60 (1.02–2.50) | 0.040 | 1.63 (1.29–2.05) | 3.4×10−5 |
| GADL1 | 3p23 | rs9682385 | intergenic | A | | 1.34 (1.14–1.58) | 5.4×10−4 | 1.45 (1.11–1.91) | 0.007 | 1.31 (1.15–1.50) | 7.9×10−5 |
| SPATA16 | 3q26.31 | rs9840514 | intergenic | R | | 1.79 (1.27–2.50) | 9.5×10−4 | 1.96 (1.16–3.23) | 0.011 | 1.96 (1.16–3.23) | 5.4×10−5 |
| C6orf118 | 6q27 | rs9365798 | intergenic | D | | 1.61 (1.27–2.04) | 7.1×10−5 | 1.49 (1.02–2.17) | 0.038 | 1.59 (1.32–1.92) | 2.1×10−6 |
| KIAA1432 | 9p24.1 | rs4348560 | intron | R | | 1.79 (1.28–2.50) | 6.3×10−4 | 1.75 (1.04–2.94) | 0.034 | 1.69 (1.30–2.22) | 1.3×10−4 |
| DMRT2 | 9p24.3 | rs10738171 | intergenic | A | | 1.33 (1.12–1.57) | 8.9×10−4 | 1.34 (1.03–1.75) | 0.029 | 1.30 (1.14–1.49) | 1.3×10−4 |
| CTR9 | 11p15.3 | rs9787961 | Near 5’-UTR | R | | 1.67 (1.26–2.21) | 3.9×10−4 | 1.83 (1.17–2.85) | 0.008 | 1.64 (1.30–2.06) | 2.3×10−5 |
| CCDC67 | 11q21 | rs3019221 | intron | A | | 1.61 (1.25–2.07) | 2.3×10−4 | 1.50 (1.00–2.23) | 0.048 | 1.56 (1.27–1.91) | 2.1×10−5 |
| CDH13 | 16q23.3 | rs12716963 | intron | D | | 1.54 (1.19–1.96) | 7.7×10−4 | 1.72 (1.12–2.83) | 0.012 | 1.56 (1.27–1.92) | 2.6×10−5 |
Abbreviations: Chr., chromosome; CI, confidence interval; HEXA, Health Examinees Study; KARE, Korea Association Resource Study; OR, odds ratio.
a The genetic models that showed the most significant evidence for association with PTB: A, additive; D, dominant; R, recessive.
b N/R, non-risk/risk allele
c ORs and P values were estimated from the multiple logistic regression model adjusted for age, sex, and BMI.
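To illustrate how SNP-level results like these can feed a weighted genetic risk score (wGRS), the sketch below codes each genotype under its best-fitting model (additive, dominant, or recessive) and sums the coded values weighted by the natural log of the joint-analysis OR. This weighting is a common convention and is assumed here; the record does not state the exact formula the authors used. Only three of the ten SNPs are included, and the example individual is hypothetical.

```python
import math

# (best-fit model, joint-analysis OR) copied from the table above; subset only.
SNP_EFFECTS = {
    "rs9365798": ("D", 1.59),
    "rs9682385": ("A", 1.31),
    "rs2307058": ("R", 1.63),
}


def code_genotype(risk_allele_count, model):
    """Map a risk-allele count (0, 1, 2) to its value under a genetic model."""
    if model == "A":                      # additive: per-allele effect
        return risk_allele_count
    if model == "D":                      # dominant: one copy is enough
        return int(risk_allele_count >= 1)
    if model == "R":                      # recessive: two copies required
        return int(risk_allele_count == 2)
    raise ValueError(f"unknown model: {model}")


def weighted_grs(genotypes):
    """Sum of ln(OR) times coded genotype (assumed weighting scheme)."""
    return sum(math.log(odds_ratio) * code_genotype(genotypes[snp], model)
               for snp, (model, odds_ratio) in SNP_EFFECTS.items())


# Hypothetical individual: risk-allele counts per SNP.
person = {"rs9365798": 1, "rs9682385": 2, "rs2307058": 1}
score = weighted_grs(person)
print(round(score, 3))
```

In this example the heterozygous genotype at rs2307058 contributes nothing because that SNP is coded under the recessive model, whereas the single risk allele at rs9365798 counts fully under the dominant model.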
Fig 1. Comparison of the predictive performance of risk models for tuberculosis in the KARE (463 cases and 1,308 controls) (A) and HEXA (142 cases and 490 controls) (B) Studies. Blue dotted lines: wGRS comprising the ten SNPs replicated in both studies. Red dashed lines: wnGRS comprising six nongenetic factors (age, sex, body mass index (< 20 kg/m2), systolic blood pressure (> 120 mmHg), hemoglobin, and cigarette smoking). Solid black lines: the combined model of wGRS and wnGRS.
Replication of risk models for pulmonary tuberculosis.
| Study | Risk model | Statistic | wGRS | wnGRS1 | wnGRS2 | wnGRS3 | wnGRS4 |
|---|---|---|---|---|---|---|---|
| KARE | Genetic model | AUC (95% CI) | 0.636 (0.607–0.665) | | | | |
| KARE | Nongenetic model | Cases/Controls, N | | 467/1313 | 463/1308 | 463/1308 | 463/1305 |
| KARE | Nongenetic model | AUC (95% CI) | | 0.630 (0.602–0.659) | 0.627 (0.598–0.657) | 0.634 (0.604–0.663) | 0.633 (0.603–0.662) |
| KARE | Combined model | Cases/Controls, N | | 467/1313 | 463/1305 | 463/1308 | 463/1305 |
| KARE | Combined model | AUC (95% CI) | | 0.690 (0.663–0.718) | 0.684 (0.656–0.712) | 0.692 (0.665–0.720) | 0.687 (0.659–0.715) |
| HEXA | Genetic model | AUC (95% CI) | 0.639 (0.587–0.692) | | | | |
| HEXA | Nongenetic model | Cases/Controls, N | | 179/500 | 177/498 | 151/498 | 141/498 |
| HEXA | Nongenetic model | AUC (95% CI) | | 0.693 (0.650–0.736) | 0.689 (0.643–0.735) | 0.770 (0.726–0.814) | 0.780 (0.734–0.825) |
| HEXA | Combined model | Cases/Controls, N | | 170/490 | 168/490 | 142/490 | 132/490 |
| HEXA | Combined model | AUC (95% CI) | | 0.739 (0.697–0.781) | 0.736 (0.692–0.780) | 0.799 (0.758–0.841) | 0.808 (0.765–0.851) |
Abbreviations: AUC, area under the receiver operating characteristic curve; CI, confidence interval; wGRS, weighted genetic risk score; wnGRS, weighted non-genetic risk score.
a AUCs and P values were estimated from ROC analyses.
b P < 1×10−4, AUCs of the three risk score models, wGRS, wnGRS, and wGRS+wnGRS, were significantly different from each other at P < 0.0001.
c wGRS, the weighted genetic risk score model composed of the ten replicated SNPs.
d wGRS+wnGRS1, the combined model of wGRS (composed of the ten validated SNPs) plus wnGRS1 (comprised of age, sex, and BMI).
e wnGRS2 was comprised of wnGRS1 plus cigarette smoking and alcohol consumption.
f wnGRS3 was comprised of wnGRS1 plus cigarette smoking, SBP and Hb.
g wnGRS4 was comprised of wnGRS3 plus alcohol consumption, DBP, and BUN.
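The AUC values above can be read as the probability that a randomly chosen case has a higher risk score than a randomly chosen control, with ties counted as one half. The sketch below computes that quantity by direct pairwise comparison; the scores are made-up toy values, not study data, and the record does not describe how the authors derived the confidence intervals, so none are computed here.

```python
def auc_from_scores(case_scores, control_scores):
    """AUC as the share of case-control pairs ranked correctly (ties = 0.5)."""
    concordant = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                concordant += 1.0
            elif case == control:
                concordant += 0.5
    return concordant / (len(case_scores) * len(control_scores))


# Hypothetical risk scores for three cases and four controls.
toy_cases = [0.9, 0.8, 0.4]
toy_controls = [0.7, 0.3, 0.2, 0.4]
toy_auc = auc_from_scores(toy_cases, toy_controls)
print(toy_auc)  # prints: 0.875
```

An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation, which puts the reported range of roughly 0.63 to 0.81 in context.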
Fig 2. Comparison of the genetic (A), nongenetic (B), and genetic plus nongenetic (C) models in 142 cases and 490 controls after removing individuals with missing values from the HEXA data. White and dark-gray bars (left Y-axis) denote the proportions of controls and cases, respectively, in each risk quartile. Black dots and solid lines (right Y-axis) denote the odds ratios and their standard errors for each risk quartile group (X-axis) compared to the lowest risk quartile group.