
The blood transcriptional signature for active and latent tuberculosis.

Min Deng1, Xiao-Dong Lv2, Zhi-Xian Fang2, Xin-Sheng Xie1, Wen-Yu Chen2.   

Abstract

BACKGROUND: Although the incidence of tuberculosis (TB) has dropped substantially, it is still a serious threat to human health. In recent years, the emergence of resistant bacilli and inadequate disease control and prevention have led to a significant rise in the global TB epidemic. TB is known to be caused by Mycobacterium tuberculosis infection, but it is not clear why the disease becomes active in some infected patients while remaining latent in others.
METHODS: We analyzed the blood gene expression profiles of 69 latent TB patients and 54 active pulmonary TB patients from the GEO (Gene Expression Omnibus) database.
RESULTS: By applying minimal redundancy maximal relevance and incremental feature selection, we identified 24 signature genes that can predict TB activation. The support vector machine predictor based on these 24 genes had a sensitivity of 0.907, specificity of 0.913, and accuracy of 0.911. Although they need to be validated in a large independent dataset, the biological analysis of these 24 genes showed great promise.
CONCLUSION: We found that cytokine production was a key process during TB activation and that genes such as CYBB, TSPO, CD36, and STAT1 are worth further investigation.

Keywords:  blood gene expression; incremental feature selection; minimal redundancy maximal relevance; support vector machine; tuberculosis

Year:  2019        PMID: 30787624      PMCID: PMC6363485          DOI: 10.2147/IDR.S184640

Source DB:  PubMed          Journal:  Infect Drug Resist        ISSN: 1178-6973            Impact factor:   4.003


Introduction

Tuberculosis (TB), a pulmonary infectious disease caused by Mycobacterium tuberculosis, is a serious threat to human health. In the early 20th century, TB had spread around the world, killing millions of people.1 Later, with the progress of medical technology and improvement of sanitary conditions, the incidence of TB dropped substantially. However, in recent years the emergence of resistant bacilli and inadequate disease control and prevention have led to a significant rise in the global TB epidemic, and TB has now become the leading cause of death from infectious diseases.2

Currently, about 90%–95% of TB patients have latent tuberculosis infection (LTBI).3 LTBI is defined by a persistent immune response to M. tuberculosis in the setting of a long-term asymptomatic infection.4 During latent infection, the body's immune system can inhibit the growth of the bacteria by blocking their replication,5 but cannot eliminate them completely. M. tuberculosis can stay dormant for decades, or even for the host's lifetime.6 However, the risk of developing active TB increases greatly when immunity declines.7 The main reason the bacteria can survive in the body is their high lipid content, which protects them from degradation and destruction within macrophages.5 In addition, M. tuberculosis can affect the normal function of CD8+ T-cells, natural killer cells, and the complement membrane attack complex.6 The immune response to M. tuberculosis is primarily cell-mediated, regulated by the interaction between T-cells and infected macrophages and by the cytokines secreted by these cells.8 In most cases, M. tuberculosis infection transitions to the dormant state, accompanied by the formation of infective granulomas that are well separated from the surrounding tissue.9 Saunders and Cooper indicated that the formation of granulomas is essential for limiting M. tuberculosis growth and tissue damage in TB infections, two major components of active TB.10

Risk factors known for TB reactivation include malnutrition, HIV infection, anti-tumor necrosis factor treatment, insulin-dependent diabetes, alcoholism, and smoking. However, the mechanisms of reactivation of LTBI in most cases are still unknown.11 Early diagnosis and preventive treatment of LTBI are an effective means to eliminate the bacteria and reduce the risk of TB in immunocompromised patients. Currently, the preferred diagnostic tools for LTBI remain tuberculin skin testing (TST) and the interferon-γ release assay (IGRA).12 IGRA, which is based on cellular immunity, determines whether a person is infected by detecting the release of IFN-γ in response to specific antigens of M. tuberculosis. Compared with traditional TST, IGRA has higher specificity and sensitivity, and has been widely used in clinical detection of TB.13 However, both tests can only detect infection with TB bacilli and cannot reflect the risk of developing active TB. Recently, research has demonstrated the importance of genetic factors in the development of active TB,14 but the mechanism of genetic susceptibility to TB remains largely unknown. Since humans are the natural hosts for M. tuberculosis, the precise etiology of infection cannot be studied in any other animal models.15

To understand what activates TB, we analyzed the blood gene expression profiles of 69 latent TB patients and 54 active pulmonary TB (PTB) patients. With advanced feature selection methods, including minimal redundancy maximal relevance (mRMR) and incremental feature selection (IFS), 24 genes were identified as discriminative between latent and active TB patients. In addition, a support vector machine (SVM) classifier for active TB prediction was built based on these 24 genes, and its sensitivity, specificity, and accuracy evaluated with leave-one-out cross-validation (LOOCV) were 0.907, 0.913, and 0.911, respectively.

Methods

The blood gene expression profiles of latent and active TB patients

To identify the key genes that activate TB, we downloaded the blood gene expression profiles of 69 latent TB patients and 54 active PTB patients from the publicly available GEO (Gene Expression Omnibus) database under accession number GSE19491.16 The 24 normal samples in the original GSE19491 dataset were excluded; only the 69 latent TB patients and 54 active PTB patients were analyzed to identify the key genes that activate TB. These samples were collected in London, UK and Cape Town, South Africa.16 We combined the samples from the two cities to obtain a larger dataset. The age and gender information of these samples is listed in Table S1. Among the 123 samples, there were 64 male and 59 female patients. The average age was 32.5 years with an SD of 13.6. Gene expression levels were measured using the Illumina HumanHT-12 V3.0 expression BeadChip microarray. There were 48,803 probes corresponding to 25,153 genes. Probes corresponding to the same gene were averaged, and the gene expression matrix was then quantile normalized.
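The two preprocessing steps named above (averaging probes that map to the same gene, then quantile normalization) can be illustrated with a minimal sketch. The function names, the toy probe IDs, and the tie-handling rule are assumptions made for illustration; the source does not say which software or tie-breaking convention was used.

```python
from collections import defaultdict
from statistics import mean


def average_probes(probe_rows, probe_to_gene):
    """Collapse probe-level rows into one row per gene by averaging."""
    grouped = defaultdict(list)
    for probe_id, values in probe_rows.items():
        grouped[probe_to_gene[probe_id]].append(values)
    # zip(*rows) walks the samples column by column
    return {gene: [mean(col) for col in zip(*rows)] for gene, rows in grouped.items()}


def quantile_normalize(matrix):
    """Quantile-normalize a gene -> [value per sample] mapping.

    Each sample's k-th smallest value is replaced by the mean of the k-th
    smallest values across all samples. Ties are broken by gene order,
    which is a simplification (assumption of this sketch).
    """
    genes = list(matrix)
    n_samples = len(next(iter(matrix.values())))
    columns = [[matrix[g][s] for g in genes] for s in range(n_samples)]
    sorted_cols = [sorted(col) for col in columns]
    rank_means = [mean(vals) for vals in zip(*sorted_cols)]
    normalized = {g: [0.0] * n_samples for g in genes}
    for s, col in enumerate(columns):
        order = sorted(range(len(genes)), key=lambda i: col[i])
        for rank, gene_idx in enumerate(order):
            normalized[genes[gene_idx]][s] = rank_means[rank]
    return normalized


# Toy input: three probes, two samples (values are made up for illustration).
probes = {"p1": [2.0, 4.0], "p2": [4.0, 8.0], "p3": [1.0, 3.0]}
mapping = {"p1": "GENE_A", "p2": "GENE_A", "p3": "GENE_B"}
gene_matrix = average_probes(probes, mapping)
normalized = quantile_normalize(gene_matrix)
```

After normalization every sample shares the same set of values, so differences between samples reflect only the ordering of genes within each sample.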

Identify the genes that are related to the activation of TB

Identifying phenotype-related genes or proteins is a basic problem in bioinformatics, and many methods have been proposed for it. One of these methods, mRMR,17 has been widely used and proved to be effective.18–20 Unlike a univariate statistical test, mRMR considers the redundancy among genes as well as their relevance to the phenotype. To explain its principle, let $\Omega$ denote all the 25,153 genes, $\Omega_s$ denote the $m$ genes already selected, and $\Omega_t$ denote the $n$ genes still to be selected. First, the relevance of gene $g$ from $\Omega_t$ with the activeness of TB $l$ was evaluated with the mutual information ($I$) equation:

$$I(g, l) = \iint p(g, l) \log \frac{p(g, l)}{p(g)\,p(l)} \, dg \, dl$$

Then, the average redundancy of gene $g$ with the already selected genes in $\Omega_s$ was

$$R(g) = \frac{1}{m} \sum_{g_i \in \Omega_s} I(g, g_i)$$

Finally, to consider both maximum relevance and minimum redundancy, the goal of the algorithm was to find the best gene $g_j$ from $\Omega_t$ to maximize the function below, in which the first part represents relevance and the second represents redundancy:

$$\max_{g_j \in \Omega_t} \left[ I(g_j, l) - \frac{1}{m} \sum_{g_i \in \Omega_s} I(g_j, g_i) \right] \quad (j = 1, 2, \ldots, n)$$

After 25,153 rounds of optimization, all the 25,153 genes can be ranked as

$$S = \{ g'_1, g'_2, \ldots, g'_h, \ldots, g'_N \}, \quad N = 25{,}153$$

In this gene list, the top-ranked genes were the most discriminative. We considered the top 500 mRMR genes as relevant to the activeness of TB. The top 500 mRMR genes are listed in Table S2.
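The greedy selection described above (pick the gene with the highest relevance minus average redundancy, then repeat) can be sketched in a few lines. This is a minimal sketch, assuming expression values have already been discretized so that mutual information can be computed from counts; the source does not state a binning scheme, and the function names and toy data are hypothetical.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Mutual information (in bits) between two equal-length discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * log2((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in pxy.items()
    )


def mrmr_rank(features, labels):
    """Greedily rank features by relevance minus mean redundancy.

    features: dict mapping gene name -> list of discretized expression values
    labels:   list of class labels (e.g. 1 = active, 0 = latent)
    """
    relevance = {g: mutual_information(v, labels) for g, v in features.items()}
    remaining = list(features)
    selected = []
    while remaining:
        def score(g):
            if not selected:
                return relevance[g]  # first round: relevance only
            redundancy = sum(
                mutual_information(features[g], features[s]) for s in selected
            ) / len(selected)
            return relevance[g] - redundancy
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data (made up): g2 duplicates g1, g3 is weaker but less redundant.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
features = {
    "g1": [1, 1, 1, 0, 0, 0, 0, 0],
    "g2": [1, 1, 1, 0, 0, 0, 0, 0],
    "g3": [1, 1, 0, 1, 1, 0, 0, 0],
}
ranking = mrmr_rank(features, labels)
print(ranking)  # ['g1', 'g3', 'g2']
```

In the toy data, g2 is as relevant as g1 but fully redundant with it, so the weaker but less redundant g3 is ranked ahead of g2. A univariate test would rank g2 second instead.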

Find the signature genes for TB activation

The top 500 mRMR genes were related to TB activation, but that number of genes was too large to measure in practice. Therefore, we adopted IFS21–27 to obtain the signature genes for TB activation, with the expectation that a small number of genes would be enough to build an actionable predictor for TB activation. The widely used SVM was adopted to construct the classifiers. During the IFS procedure, the genes were added sequentially based on the mRMR rank, starting with the first mRMR gene and ending with the 500th mRMR gene. In other words, 500 gene sets were selected and 500 SVM classifiers were constructed. Each classifier was evaluated with LOOCV, and its sensitivity ($S_n$), specificity ($S_p$), and accuracy (ACC) were calculated as

$$S_n = \frac{TP}{TP + FN}, \quad S_p = \frac{TN}{TN + FP}, \quad ACC = \frac{TP + TN}{TP + TN + FP + FN}$$

where TP, TN, FP, and FN are the numbers of true active TB, true latent TB, false active TB, and false latent TB patients, respectively. Based on the IFS results, we can choose a suitable gene set as the signature and obtain acceptable prediction accuracy.
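The IFS loop with LOOCV evaluation can be sketched as follows. To keep the example free of third-party dependencies, a nearest-centroid classifier stands in for the SVM used in the study; this is a deliberate substitution, not the authors' classifier. All function names and the toy data are hypothetical.

```python
def nearest_centroid_predict(train_x, train_y, x):
    """Stand-in classifier (NOT an SVM): assign x to the class with the closest mean."""
    best_label, best_dist = None, float("inf")
    for label in sorted(set(train_y)):
        rows = [r for r, y in zip(train_x, train_y) if y == label]
        centroid = [sum(col) / len(rows) for col in zip(*rows)]
        dist = sum((a - b) ** 2 for a, b in zip(x, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def loocv_metrics(samples, labels, predict=nearest_centroid_predict):
    """Leave-one-out cross-validation; label 1 = active TB, 0 = latent TB."""
    tp = tn = fp = fn = 0
    for i, (x, y) in enumerate(zip(samples, labels)):
        # Train on every sample except the i-th, then predict the i-th.
        train_x = samples[:i] + samples[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        pred = predict(train_x, train_y, x)
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 0 and y == 0:
            tn += 1
        elif pred == 1 and y == 0:
            fp += 1
        else:
            fn += 1
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / len(labels),
    }


def incremental_feature_selection(expr, labels, ranked_genes):
    """Evaluate the top-1, top-2, ..., top-k ranked genes, one classifier per prefix."""
    results = []
    for k in range(1, len(ranked_genes) + 1):
        genes = ranked_genes[:k]
        samples = [[expr[g][i] for g in genes] for i in range(len(labels))]
        results.append((k, loocv_metrics(samples, labels)))
    return results


# Toy data (made up): "g1" separates the classes, "noise" does not.
labels = [1, 1, 1, 0, 0, 0]
expr = {"g1": [5.0, 6.0, 5.5, 1.0, 1.5, 0.5], "noise": [0.2, 0.1, 0.3, 0.1, 0.3, 0.2]}
ifs_curve = incremental_feature_selection(expr, labels, ["g1", "noise"])
```

Plotting the accuracy in `ifs_curve` against `k` gives an IFS curve of the kind used to choose the signature size. Swapping `predict` for an SVM from a library of choice would bring the sketch closer to the procedure in the text.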

Results and discussion

The 500 genes that are related to the activation of TB

Using the mRMR method which considered both the relevance with TB activeness and the redundancy with other genes, we identified 500 genes that were related to the activation of TB. These 500 genes were ranked based on their discriminative ability of latent TB patients and active PTB patients.

The 24 signature genes for TB activation

To further narrow down the 500 genes that were related to the activation of TB and obtain a signature for TB activation, we applied the IFS analysis and plotted the IFS curve based on the prediction performance obtained with different numbers of signature genes (Table S3). As shown in Figure 1, the LOOCV accuracy was highest, 0.919, when the top 51 mRMR genes were used. However, the accuracy had already become stable when only 24 genes were used. Therefore, we chose these 24 genes as the signature genes for TB activation. The sensitivity, specificity, and accuracy of the 24 signature genes for TB activeness prediction were 0.907, 0.913, and 0.911, respectively. The 24 signature genes are given in Table 1 and the confusion matrix of their prediction performance is shown in Table 2.
Figure 1

The prediction performances for TB activation by using different numbers of signature genes.

Notes: The x-axis is the number of genes in the gene set, while the y-axis is the prediction accuracy of the SVM classifier evaluated with LOOCV. The peak of the IFS curve had an accuracy of 0.919 when 51 genes were used. However, the accuracy had already become stable when 24 genes were used. Therefore, we chose these 24 genes as the signature genes of TB activation. The sensitivity, specificity, and accuracy of the 24 signature genes for TB activeness prediction were 0.907, 0.913, and 0.911, respectively.

Abbreviations: LOOCV, leave-one-out cross-validation; IFS, incremental feature selection; SVM, support vector machine; TB, tuberculosis.

Table 1

The 24 signature genes for TB activation

Rank | Name | Function | mRMR score
1 | HNRNPD | Heterogeneous nuclear ribonucleoprotein D | 0.399
2 | CYBB | Cytochrome b-245 beta chain | 0.149
3 | TSPO | Translocator protein | 0.149
4 | SLC9A3R1 | SLC9A3 regulator 1 | 0.144
5 | LOXL3 | Lysyl oxidase like 3 | 0.15
6 | CA5B | Carbonic anhydrase 5B | 0.134
7 | GPR63 | G protein-coupled receptor 63 | 0.128
8 | C15orf15 | Ribosomal L24 domain containing 1 | 0.136
9 | FNBP4 | Formin binding protein 4 | 0.130
10 | EPHA4 | EPH receptor A4 | 0.119
11 | ANAPC1 | Anaphase promoting complex subunit 1 | 0.117
12 | QSOX2 | Quiescin sulfhydryl oxidase 2 | 0.113
13 | NELL2 | Neural EGFL like 2 | 0.109
14 | LYRM1 | LYR motif containing 1 | 0.106
15 | KIAA1370 | Family with sequence similarity 214 member A | 0.108
16 | ZNF91 | Zinc finger protein 91 | 0.108
17 | TMEM51 | Transmembrane protein 51 | 0.107
18 | TRIB2 | Tribbles pseudokinase 2 | 0.111
19 | BANK1 | B-cell scaffold protein with ankyrin repeats 1 | 0.107
20 | TUSC4 | NPR2 like, GATOR1 complex subunit | 0.108
21 | CYB561 | Cytochrome b561 | 0.106
22 | PRKCQ | Protein kinase C theta | 0.104
23 | CD36 | CD36 molecule | 0.103
24 | STAT1 | Signal transducer and activator of transcription 1 | 0.106

Abbreviations: mRMR, minimal redundancy maximal relevance; TB, tuberculosis.

Table 2

The confusion matrix of the predicted and actual TB activeness based on the 24 signature genes

 | Actual active TB | Actual latent TB
Predicted active TB | 49 | 6
Predicted latent TB | 5 | 63

Sensitivity: 0.907; Specificity: 0.913; Accuracy: 0.911

Abbreviation: TB, tuberculosis.
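As a quick arithmetic check, the three reported metrics can be recomputed from the Table 2 counts, read here as 49 and 6 in the predicted-active row and 5 and 63 in the predicted-latent row. The variable names are chosen for illustration only.

```python
# Counts read from Table 2 (rows = predicted, columns = actual).
tp, fp = 49, 6   # predicted active: actually active, actually latent
fn, tn = 5, 63   # predicted latent: actually active, actually latent

sensitivity = tp / (tp + fn)                 # 49 / 54
specificity = tn / (tn + fp)                 # 63 / 69
accuracy = (tp + tn) / (tp + tn + fp + fn)   # 112 / 123

print(f"{sensitivity:.3f} {specificity:.3f} {accuracy:.3f}")  # 0.907 0.913 0.911
```

The column totals, 54 active and 69 latent patients, match the cohort sizes given in the Methods.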

To explore the expression levels of these 24 signature genes in latent TB patients and active PTB patients, we plotted their heatmap in Figure 2. It can be seen that most of the latent and active TB patients were clustered into the right groups. TMEM51, TSPO, CD36, CYBB, LOXL3, CYB561, GPR63, LYRM1, and STAT1 were highly expressed in active TB patients while SLC9A3R1, TUSC4, ZNF91, HNRNPD, PRKCQ, QSOX2, BANK1, EPHA4, C15orf15, NELL2, ANAPC1, FNBP4, TRIB2, CA5B, and KIAA1370 were highly expressed in latent TB patients.
Figure 2

The heatmap of the 24 signature genes in latent and active TB patients.

Notes: The rows represent genes while the columns represent patients. The green and red columns represent latent and active TB patients, respectively. It can be seen that the latent and active TB patients were clustered into different groups.

Abbreviation: TB, tuberculosis.

The biological analysis of the signature genes for TB activation

We performed Gene Ontology (GO) enrichment analysis of these 24 signature genes, and the results with a false discovery rate <0.10 are given in Table 3. The enriched GO biological processes were GO:0001817 regulation of cytokine production, GO:0001816 cytokine production, GO:0051172 negative regulation of nitrogen compound metabolic process, GO:0042592 homeostatic process, GO:0031324 negative regulation of cellular metabolic process, GO:0010243 response to organonitrogen compound, and GO:0010605 negative regulation of macromolecule metabolic process. There have been many reports of the important roles of cytokines in responding to mycobacteria during the activation of TB.28,29
Table 3

The enriched GO biological processes for the 24 signature genes

GO biological process | FDR | Signature genes with this GO annotation
GO:0001817 regulation of cytokine production | 0.0373 | TSPO, CD36, CYBB, PRKCQ, STAT1, TRIB2, BANK1
GO:0001816 cytokine production | 0.0373 | TSPO, CD36, CYBB, PRKCQ, STAT1, TRIB2, BANK1
GO:0051172 negative regulation of nitrogen compound metabolic process | 0.0796 | TSPO, CD36, EPHA4, HNRNPD, STAT1, ZNF91, SLC9A3R1, TRIB2, BANK1, ANAPC1, LOXL3
GO:0042592 homeostatic process | 0.0796 | TSPO, CD36, CYBB, HNRNPD, NELL2, PRKCQ, STAT1, SLC9A3R1, QSOX2
GO:0031324 negative regulation of cellular metabolic process | 0.0796 | TSPO, CD36, EPHA4, HNRNPD, STAT1, ZNF91, SLC9A3R1, TRIB2, BANK1, ANAPC1, LOXL3
GO:0010243 response to organonitrogen compound | 0.0796 | TSPO, CD36, CYBB, EPHA4, HNRNPD, PRKCQ, STAT1
GO:0010605 negative regulation of macromolecule metabolic process | 0.0796 | TSPO, CD36, EPHA4, HNRNPD, STAT1, ZNF91, SLC9A3R1, TRIB2, BANK1, ANAPC1, LOXL3

Abbreviations: FDR, false discovery rate; GO, Gene Ontology.
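The source does not name the enrichment test or tool behind Table 3. A common approach, assumed here purely for illustration, is a one-sided hypergeometric test per GO term followed by Benjamini-Hochberg adjustment to obtain the FDR. The function names and all numbers in the usage lines are hypothetical and are not taken from the study.

```python
from math import comb


def hypergeom_enrichment_p(n_background, n_annotated, n_selected, n_overlap):
    """P(X >= n_overlap) when drawing n_selected genes without replacement
    from n_background genes, of which n_annotated carry the GO term."""
    upper = min(n_annotated, n_selected)
    total = comb(n_background, n_selected)
    tail = sum(
        comb(n_annotated, k) * comb(n_background - n_annotated, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


def benjamini_hochberg(p_values):
    """Return BH-adjusted p-values (FDR) in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value to the smallest
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical toy numbers: 10 background genes, 4 annotated, 3 selected, 2 overlapping.
p_toy = hypergeom_enrichment_p(10, 4, 3, 2)
fdr_toy = benjamini_hochberg([0.01, 0.04, 0.03])
```

The running minimum in the adjustment step means that several terms can end up with the same adjusted value.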

The following genes from Table 1 showed great promise and are discussed below.

Gp91phox, encoded by CYBB (ranked second in Table 1), is an essential subunit of the NADPH oxidase complex. Alterations in macrophage function, such as defects in NADPH oxidase30 and the vitamin D receptor,31 are known risk factors for mycobacterial infection. Several studies have shown that mutations in CYBB can result in X-linked chronic granulomatous disease, which carries a much higher risk of TB;32–34 this disease is an immunodeficiency caused by defective activity of NADPH oxidase in phagocytes.33,35 Liu et al have also demonstrated a significant correlation between CYBB polymorphisms and decreased risk of TB, particularly among male smokers.36

TSPO (ranked third in Table 1) serves as a trans-mitochondrial membrane channel that transports cholesterol and other endogenous ligands.37 It has been reported that the expression of TSPO is highest in steroidogenic tissues, lung, and immune cells such as macrophages.38 Immunofluorescence studies by Foss et al indicated that TSPO was highly expressed in phagocytic cells and CD68(+) macrophages within TB lesions.39 Increased expression of TSPO leads to macrophage activation, which is a pivotal component of TB-associated inflammation.39 In PTB, a synthetic ligand for TSPO, radioiodinated DPA-713, is found to be upregulated in activated macrophages.40

CD36 (ranked 23rd in Table 1) is a membrane glycoprotein that exists in various cells, including macrophages, monocytes, adipocytes, and platelets.41 It has been implicated in multiple cellular processes and defined as a multiligand scavenger receptor that mediates fatty acid transport, phagocytosis, and inflammation in response to a variety of pathogens, including mycobacteria.42 CD36 facilitates surfactant lipid uptake, which can be exploited by M. tuberculosis for growth.43 Lao et al suggested that rs1194182 and rs10499859, two SNPs of CD36, may reduce the risk of PTB, indicating CD36 as an important biomarker for PTB.44 Hawkes et al observed that deficiency of CD36 reduces the susceptibility of mice to mycobacterial infection. In addition, CD36 deficiency in macrophages inhibits the growth of many mycobacterial species in vitro, demonstrating that deficiency of CD36 plays a role in resistance to mycobacterial infection.45

STAT1 (ranked 24th in Table 1) is a member of the STAT protein family and is thought to be an important mediator in the response to IFN-γ and in host defense against M. tuberculosis.46,47 IFN-γ activates macrophages to kill multiple pathogens but cannot activate macrophages to kill M. tuberculosis.48 Much recent evidence shows that M. tuberculosis infection inhibits IFN-γ signaling by blocking several responses to IFN-γ, such as induction of FcγRI,49 and that the dysfunction of macrophages in response to IFN-γ depends on an altered regulatory mechanism in the STAT1 signaling pathway.50 Sugawara et al discovered that STAT1 knockout mice have higher susceptibility to pulmonary mycobacterial infection, indicating that STAT1 appears to be a key transcription factor in resisting mycobacterial infection.51

The proteomics pattern of the signature genes

Marakalala et al measured the granuloma proteomes of TB patients using mass spectrometry and confocal microscopy.52 They identified 4,406 proteins that were expressed in caseous granuloma caseum, cavitary granuloma cells, cavitary granuloma caseum, and solid granuloma cells. We mapped our signature genes onto their proteomics results. Of the 24 signature genes, 6 were detected by Marakalala et al. Their log2 label-free quantification intensities in caseous granuloma caseum, cavitary granuloma cells, cavitary granuloma caseum, and solid granuloma cells are shown in Table 4. These six genes (HNRNPD, CYBB, TSPO, SLC9A3R1, FNBP4, and STAT1) therefore appear relevant at both the mRNA and protein levels.
Table 4

The log2 label-free quantification intensity of signature genes in caseous granuloma caseum, cavitary granuloma cells, cavitary granuloma caseum, and solid granuloma cells

Protein | Caseous granuloma caseum | Cavitary granuloma cells | Cavitary granuloma caseum | Solid granuloma cells
HNRNPD | 28.41 | 29.89 | 27.26 | 29.87
CYBB | 28.66 | 26.68 | 29.95 | 25.76
TSPO | 27.96 | 26.38 | 27.68 | 26.2
SLC9A3R1 | 25.35 | 26.53 | 27.01 | 26.51
FNBP4 | 20.53 | 20.32 | 21.88 | 20.83
STAT1 | 28.98 | 29.9 | 30.07 | 29.69

Conclusion

As a pulmonary infectious disease caused by M. tuberculosis, TB is a serious threat and can spread through coughing, sneezing, or other routes. However, not all infected patients exhibit symptoms: in some patients the disease is active, while in others it is latent. Identifying the factors or genes that trigger TB activation is key to understanding TB. By analyzing the blood gene expression profiles of 69 latent TB patients and 54 active PTB patients, we found 24 signature genes that can predict TB activeness. In-depth analysis of these 24 genes suggested that cytokine production was a key process during TB activation. Signature genes including CYBB, TSPO, CD36, and STAT1 are worth further investigation.
  52 in total

Review 1.  CD36: a class B scavenger receptor involved in angiogenesis, atherosclerosis, inflammation, and lipid metabolism.

Authors:  M Febbraio; D P Hajjar; R L Silverstein
Journal:  J Clin Invest       Date:  2001-09       Impact factor: 14.808

2.  Feature selection based on mutual information: criteria of max-dependency, max-relevance, and min-redundancy.

Authors:  Hanchuan Peng; Fuhui Long; Chris Ding
Journal:  IEEE Trans Pattern Anal Mach Intell       Date:  2005-08       Impact factor: 6.226

3.  Improved superoxide-generating ability by interferon gamma due to splicing pattern change of transcripts in neutrophils from patients with a splice site mutation in CYBB gene.

Authors:  F Ishibashi; T Mizukami; S Kanegasaki; L Motoda; R Kakinuma; F Endo; H Nunoi
Journal:  Blood       Date:  2001-07-15       Impact factor: 22.113

4.  Molecular evidence of endogenous reactivation of Mycobacterium tuberculosis after 33 years of latent infection.

Authors:  Troels Lillebaek; Asger Dirksen; Inga Baess; Benedicte Strunge; Vibeke Ø Thomsen; Ase B Andersen
Journal:  J Infect Dis       Date:  2002-01-17       Impact factor: 5.226

Review 5.  New insights into the function of granulomas in human tuberculosis.

Authors:  Timo Ulrichs; Stefan H E Kaufmann
Journal:  J Pathol       Date:  2006-01       Impact factor: 7.996

6.  Mycobacterium tuberculosis inhibits IFN-gamma transcriptional responses without inhibiting activation of STAT1.

Authors:  L M Ting; A C Kim; A Cattamanchi; J D Ernst
Journal:  J Immunol       Date:  1999-10-01       Impact factor: 5.422

Review 7.  Genetics of susceptibility to human infectious disease.

Authors:  G S Cooke; A V Hill
Journal:  Nat Rev Genet       Date:  2001-12       Impact factor: 53.242

8.  Restraining mycobacteria: role of granulomas in mycobacterial infections.

Authors:  B M Saunders; A M Cooper
Journal:  Immunol Cell Biol       Date:  2000-08       Impact factor: 5.126

9.  STAT1 knockout mice are highly susceptible to pulmonary mycobacterial infection.

Authors:  Isamu Sugawara; Hiroyuki Yamada; Satoru Mizuno
Journal:  Tohoku J Exp Med       Date:  2004-01       Impact factor: 1.848

10.  X-linked chronic granulomatous disease: first report of mutations in patients of Argentina.

Authors:  Cecilia Barese; Silvia Copelli; Rubén Zandomeni; Matías Oleastro; Marta Zelazko; Eva María Rivas
Journal:  J Pediatr Hematol Oncol       Date:  2004-10       Impact factor: 1.289

