Min Deng, Xiao-Dong Lv, Zhi-Xian Fang, Xin-Sheng Xie, Wen-Yu Chen.
Abstract
BACKGROUND: Although the incidence of tuberculosis (TB) has dropped substantially, it remains a serious threat to human health. In recent years, the emergence of resistant bacilli and inadequate disease control and prevention have led to a significant rise in the global TB epidemic. TB is caused by Mycobacterium tuberculosis infection, but it is not clear why the disease is active in some infected patients and latent in others.
Keywords: blood gene expression; incremental feature selection; minimal redundancy maximal relevance; support vector machine; tuberculosis
Year: 2019 PMID: 30787624 PMCID: PMC6363485 DOI: 10.2147/IDR.S184640
Source DB: PubMed Journal: Infect Drug Resist ISSN: 1178-6973 Impact factor: 4.003
Figure 1 The prediction performance for TB activation using different numbers of signature genes.
Notes: The x-axis is the number of genes in the gene set and the y-axis is the prediction accuracy of the SVM classifier evaluated with LOOCV. The IFS curve peaked at an accuracy of 0.919 when 51 genes were used, but the accuracy had already stabilized when 24 genes were used. Therefore, we chose these 24 genes as the signature genes of TB activation. The sensitivity, specificity, and accuracy of the 24 signature genes for predicting TB activeness were 0.907, 0.913, and 0.911, respectively.
Abbreviations: LOOCV, leave-one-out cross-validation; IFS, incremental feature selection; SVM, support vector machine; TB, tuberculosis.
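The IFS curve described in the Figure 1 notes can be sketched as follows. This is a minimal illustration, not the authors' pipeline: a nearest-centroid classifier is swapped in for the SVM so the example needs no external libraries, and the data matrix `X`, the labels `y`, and the ranking `ranked` are made-up toy values. The function names are hypothetical.

```python
def nearest_centroid_predict(train_X, train_y, x):
    """Return the label whose class centroid is closest to x (squared Euclidean)."""
    best_label, best_dist = None, float("inf")
    for label in sorted(set(train_y)):
        rows = [r for r, lab in zip(train_X, train_y) if lab == label]
        centroid = [sum(col) / len(rows) for col in zip(*rows)]
        dist = sum((a - b) ** 2 for a, b in zip(x, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def loocv_accuracy(X, y, features):
    """Leave-one-out accuracy using only the given feature columns."""
    sub = [[row[j] for j in features] for row in X]
    correct = 0
    for i in range(len(sub)):
        # Hold out sample i, train on the rest, then predict sample i.
        train_X = sub[:i] + sub[i + 1:]
        train_y = y[:i] + y[i + 1:]
        if nearest_centroid_predict(train_X, train_y, sub[i]) == y[i]:
            correct += 1
    return correct / len(sub)


def ifs_curve(X, y, ranked_features):
    """LOOCV accuracy for the top-1, top-2, ..., top-k ranked features."""
    return [
        loocv_accuracy(X, y, ranked_features[:k])
        for k in range(1, len(ranked_features) + 1)
    ]


# Toy data (assumption): 6 samples x 3 features; feature 0 separates the classes.
X = [
    [0.1, 5.0, 1.0],
    [0.2, 1.0, 2.0],
    [0.0, 3.0, 0.5],
    [0.9, 4.0, 1.5],
    [1.0, 2.0, 0.7],
    [0.8, 6.0, 1.2],
]
y = [0, 0, 0, 1, 1, 1]   # 0 = latent, 1 = active (toy labels)
ranked = [0, 2, 1]       # hypothetical feature ranking, best first

curve = ifs_curve(X, y, ranked)
print(curve)
```

Plotting `curve` against the number of features gives a curve of the kind shown in Figure 1; the gene-set size would then be chosen where the accuracy levels off.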
The 24 signature genes for TB activation
| Rank | Name | Function | mRMR score |
|---|---|---|---|
| 1 | HNRNPD | Heterogeneous nuclear ribonucleoprotein D | 0.399 |
| 2 | CYBB | Cytochrome b-245 beta chain | 0.149 |
| 3 | TSPO | Translocator protein | 0.149 |
| 4 | SLC9A3R1 | SLC9A3 regulator 1 | 0.144 |
| 5 | LOXL3 | Lysyl oxidase like 3 | 0.15 |
| 6 | CA5B | Carbonic anhydrase 5B | 0.134 |
| 7 | GPR63 | G protein-coupled receptor 63 | 0.128 |
| 8 | C15orf15 | Ribosomal L24 domain containing 1 | 0.136 |
| 9 | FNBP4 | Formin binding protein 4 | 0.130 |
| 10 | EPHA4 | EPH receptor A4 | 0.119 |
| 11 | ANAPC1 | Anaphase promoting complex subunit 1 | 0.117 |
| 12 | QSOX2 | Quiescin sulfhydryl oxidase 2 | 0.113 |
| 13 | NELL2 | Neural EGFL like 2 | 0.109 |
| 14 | LYRM1 | LYR motif containing 1 | 0.106 |
| 15 | KIAA1370 | Family with sequence similarity 214 member A | 0.108 |
| 16 | ZNF91 | Zinc finger protein 91 | 0.108 |
| 17 | TMEM51 | Transmembrane protein 51 | 0.107 |
| 18 | TRIB2 | Tribbles pseudokinase 2 | 0.111 |
| 19 | BANK1 | B-cell scaffold protein with ankyrin repeats 1 | 0.107 |
| 20 | TUSC4 | NPR2 like, GATOR1 complex subunit | 0.108 |
| 21 | CYB561 | Cytochrome b561 | 0.106 |
| 22 | PRKCQ | Protein kinase C theta | 0.104 |
| 23 | CD36 | CD36 molecule | 0.103 |
| 24 | STAT1 | Signal transducer and activator of transcription 1 | 0.106 |
Abbreviations: mRMR, minimal redundancy maximal relevance; TB, tuberculosis.
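For readers unfamiliar with mRMR, the greedy ranking idea behind the table can be sketched as below. This is only an illustration under stated assumptions: absolute Pearson correlation stands in for both relevance and redundancy (the record does not state which measure the authors used), the score is relevance minus mean redundancy, and the feature vectors `f0`, `f1`, `f2` and the labels `y` are toy values. The function names are hypothetical.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (v - mb) for x, v in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((v - mb) ** 2 for v in b)
    return cov / sqrt(va * vb)


def mrmr_rank(features, target, k):
    """Greedy mRMR: repeatedly pick the feature with the largest
    relevance minus mean redundancy against already selected features."""
    selected = []
    remaining = list(range(len(features)))
    while remaining and len(selected) < k:
        def score(j):
            relevance = abs(pearson(features[j], target))
            if not selected:
                return relevance
            redundancy = sum(
                abs(pearson(features[j], features[s])) for s in selected
            ) / len(selected)
            return relevance - redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data (assumption): f1 is a rescaled copy of f0, so it is fully redundant.
y = [0, 0, 0, 1, 1, 1]
f0 = [0.1, 0.2, 0.0, 0.9, 1.0, 0.8]
f1 = [0.2, 0.4, 0.0, 1.8, 2.0, 1.6]
f2 = [0.0, 1.0, 0.5, 1.0, 0.4, 1.2]

order = mrmr_rank([f0, f1, f2], y, k=3)
print(order)
```

In this toy case the weaker but non-redundant feature `f2` is ranked ahead of the duplicate of the top feature, which is the behavior that distinguishes mRMR from ranking by relevance alone.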
The confusion matrix of the predicted and actual TB activeness based on the 24 signature genes
| | Actual active TB | Actual latent TB |
|---|---|---|
| Predicted active TB | 49 | 6 |
| Predicted latent TB | 5 | 63 |

Notes: Sensitivity, 0.907; specificity, 0.913; accuracy, 0.911.
Abbreviation: TB, tuberculosis.
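The reported sensitivity, specificity, and accuracy follow directly from the four counts in the confusion matrix. The short check below treats active TB as the positive class; the function name `confusion_metrics` is a hypothetical helper, not part of the original study.

```python
def confusion_metrics(tp, fp, fn, tn):
    """Sensitivity, specificity, and accuracy from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)            # active TB correctly predicted
    specificity = tn / (tn + fp)            # latent TB correctly predicted
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return sensitivity, specificity, accuracy


# Counts from the confusion matrix above (positive class = active TB).
sens, spec, acc = confusion_metrics(tp=49, fp=6, fn=5, tn=63)
print(f"{sens:.3f} {spec:.3f} {acc:.3f}")  # prints "0.907 0.913 0.911"
```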
Figure 2 The heatmap of the 24 signature genes in latent and active TB patients.
Notes: The rows represent genes while the columns represent patients. The green and red columns represent latent and active TB patients, respectively. It can be seen that the latent and active TB patients were clustered into different groups.
Abbreviation: TB, tuberculosis.
The enriched GO biological processes for the 24 signature genes
| GO biological process | FDR | Signature genes with this GO annotation |
|---|---|---|
| GO:0001817 regulation of cytokine production | 0.0373 | TSPO, CD36, CYBB, PRKCQ, STAT1, TRIB2, BANK1 |
| GO:0001816 cytokine production | 0.0373 | TSPO, CD36, CYBB, PRKCQ, STAT1, TRIB2, BANK1 |
| GO:0051172 negative regulation of nitrogen compound metabolic process | 0.0796 | TSPO, CD36, EPHA4, HNRNPD, STAT1, ZNF91, SLC9A3R1, TRIB2, BANK1, ANAPC1, LOXL3 |
| GO:0042592 homeostatic process | 0.0796 | TSPO, CD36, CYBB, HNRNPD, NELL2, PRKCQ, STAT1, SLC9A3R1, QSOX2 |
| GO:0031324 negative regulation of cellular metabolic process | 0.0796 | TSPO, CD36, EPHA4, HNRNPD, STAT1, ZNF91, SLC9A3R1, TRIB2, BANK1, ANAPC1, LOXL3 |
| GO:0010243 response to organonitrogen compound | 0.0796 | TSPO, CD36, CYBB, EPHA4, HNRNPD, PRKCQ, STAT1 |
| GO:0010605 negative regulation of macromolecule metabolic process | 0.0796 | TSPO, CD36, EPHA4, HNRNPD, STAT1, ZNF91, SLC9A3R1, TRIB2, BANK1, ANAPC1, LOXL3 |
Abbreviations: FDR, false discovery rate; GO, Gene Ontology.
The log2 label-free quantification intensity of signature genes in caseous granuloma caseum, cavitary granuloma cells, cavitary granuloma caseum, and solid granuloma cells
| Protein | Caseous granuloma caseum | Cavitary granuloma cells | Cavitary granuloma caseum | Solid granuloma cells |
|---|---|---|---|---|
| HNRNPD | 28.41 | 29.89 | 27.26 | 29.87 |
| CYBB | 28.66 | 26.68 | 29.95 | 25.76 |
| TSPO | 27.96 | 26.38 | 27.68 | 26.2 |
| SLC9A3R1 | 25.35 | 26.53 | 27.01 | 26.51 |
| FNBP4 | 20.53 | 20.32 | 21.88 | 20.83 |
| STAT1 | 28.98 | 29.9 | 30.07 | 29.69 |
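Because the intensities are on a log2 scale, a difference of d between two columns corresponds to a 2^d-fold difference in raw intensity. The sketch below applies that arithmetic to two rows of the table; the dictionary layout and the `fold_change` helper are illustrative assumptions, and no statistical significance is implied.

```python
# log2 LFQ intensities copied from the table above (two proteins, two regions).
log2_lfq = {
    "CYBB": {"cavitary_caseum": 29.95, "solid_cells": 25.76},
    "HNRNPD": {"cavitary_caseum": 27.26, "solid_cells": 29.87},
}


def fold_change(log2_a, log2_b):
    """Raw-intensity ratio a/b given two log2-scale intensities."""
    return 2 ** (log2_a - log2_b)


for protein, values in log2_lfq.items():
    fc = fold_change(values["cavitary_caseum"], values["solid_cells"])
    print(f"{protein}: cavitary caseum vs solid cells = {fc:.2f}-fold")

cybb_fc = fold_change(29.95, 25.76)
hnrnpd_fc = fold_change(27.26, 29.87)
```

A ratio above 1 means higher intensity in cavitary granuloma caseum than in solid granuloma cells, and a ratio below 1 means the reverse.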