Xuejiao Hu, Shun Liao, Hao Bai, Shubham Gupta, Yi Zhou, Juan Zhou, Lin Jiao, Lijuan Wu, Minjin Wang, Xuerong Chen, Yanhong Zhou, Xiaojun Lu, Tony Y Hu, Zhaolei Zhang, Binwu Ying.
Abstract
Clinically diagnosed pulmonary tuberculosis (PTB) patients lack microbiological evidence of Mycobacterium tuberculosis, and misdiagnosis or delayed diagnosis often occurs as a consequence. We investigated the potential of long noncoding RNAs (lncRNAs) and corresponding predictive models to diagnose these patients. We enrolled 1,764 subjects, including clinically diagnosed PTB patients, microbiologically confirmed PTB cases, non-TB disease controls, and healthy controls, in three cohorts (screening, selection, and validation). Candidate lncRNAs differentially expressed in blood samples of the PTB and healthy control groups were identified by microarray and reverse transcription-quantitative PCR (qRT-PCR) in the screening cohort. Logistic regression models were developed using lncRNAs and/or electronic health records (EHRs) from clinically diagnosed PTB patients and non-TB disease controls in the selection cohort. These models were evaluated by area under the receiver operating characteristic curve (AUC) and decision curve analyses, and the optimal model was presented as a Web-based nomogram, which was evaluated in the validation cohort. Three differentially expressed lncRNAs (ENST00000497872, n333737, and n335265) were identified. The optimal model (i.e., nomogram) incorporated these three lncRNAs and six EHRs (age, hemoglobin, weight loss, low-grade fever, calcification detected by computed tomography [CT calcification], and interferon gamma release assay for tuberculosis [TB-IGRA]). The nomogram showed an AUC of 0.89, a sensitivity of 0.86, and a specificity of 0.82 in differentiating clinically diagnosed PTB cases from non-TB disease controls of the validation cohort, which demonstrated better discrimination and clinical net benefit than the EHR model. The nomogram also had a discriminative power (AUC, 0.90; sensitivity, 0.85; specificity, 0.81) in identifying microbiologically confirmed PTB patients.
lncRNAs and the user-friendly nomogram could facilitate the early identification of PTB cases among suspected patients with negative M. tuberculosis microbiological evidence.
Keywords: clinically diagnosed pulmonary tuberculosis; electronic health record; lncRNA; nomogram
Year: 2020 PMID: 32295893 PMCID: PMC7315016 DOI: 10.1128/JCM.01973-19
Source DB: PubMed Journal: J Clin Microbiol ISSN: 0095-1137 Impact factor: 5.948
FIG 1 Overview of the strategy for investigating lncRNAs and prediction models for clinically diagnosed PTB patients. Abbreviations: PTB, pulmonary tuberculosis; PBMC, peripheral blood mononuclear cell; non-TB DC, nontuberculosis disease control; DE, differentially expressed; EHR, electronic health record; DCA, decision curve analysis.
Demographic and clinical features of suspected clinically diagnosed PTB patients and healthy controls
| Clinical feature | Selection cohort: clinically diagnosed PTB cases | Selection cohort: non-TB DCs | P1 | Selection cohort: HSs | P2 | Validation cohort: clinically diagnosed PTB cases | Validation cohort: non-TB DCs | P3 | Validation cohort: HSs | P4 |
|---|---|---|---|---|---|---|---|---|---|---|
| No. (%) of male subjects | 84 (59.57) | 95 (59.75) | 0.976 | 284 (49.13) | 0.026 | 58 (59.79) | 87 (62.14) | 0.715 | 126 (51.43) | 0.162 |
| Mean age (yr) ± SD | 37.81 ± 17.93 | 56.68 ± 14.52 | <0.0001 | 40.59 ± 13.11 | 0.084 | 38.29 ± 17.57 | 57.96 ± 16.66 | <0.0001 | 36.82 ± 9.28 | 0.436 |
| Mean BMI (kg/m²) ± SD | 20.81 ± 2.99 | 20.43 ± 4.03 | 0.359 | 20.65 ± 3.19 | 0.57 | 21.59 ± 3.43 | 21.29 ± 3.62 | 0.52 | 21.51 ± 3.52 | 0.843 |
| No. (%) of smoking subjects | 61 (43.26) | 72 (45.28) | 0.725 | 161 (27.85) | <0.0001 | 41 (42.27) | 66 (47.14) | 0.458 | 84 (34.29) | 0.167 |
| No. (%) of subjects with radiological pathology | 116 (82.26) | 140 (88.05) | 0.158 | | | 86 (88.66) | 130 (92.86) | 0.264 | | |
| Laboratory tests | | | | | | | | | | |
| No. (%) of subjects with positive TB-IGRA | 97 (68.88) | 56 (35.22) | <0.0001 | | | 64 (66.00) | 42 (30.00) | <0.0001 | | |
| Median C-reactive protein concn (mg/liter) (IQR) | 16.30 (5.32–54.05) | 17.80 (6.47–60.20) | 0.427 | | | 13.60 (3.34–43.55) | 18.60 (6.56–70.63) | 0.037 | | |
| Mean hematocrit ± SD | 0.37 ± 0.06 | 0.36 ± 0.07 | 0.162 | 0.44 ± 0.04 | <0.0001 | 0.38 ± 0.07 | 0.35 ± 0.07 | 0.002 | 0.43 ± 0.04 | <0.0001 |
| Mean no. of erythrocytes (10¹²/liter) ± SD | 4.33 ± 0.72 | 4.03 ± 0.81 | 0.001 | 4.78 ± 0.46 | <0.0001 | 4.46 ± 0.80 | 3.89 ± 0.91 | <0.0001 | 4.80 ± 0.46 | <0.0001 |
| Mean hemoglobin concn (g/liter) ± SD | 122.57 ± 23.22 | 115.82 ± 25.20 | 0.017 | 144.46 ± 13.88 | <0.0001 | 125.08 ± 24.25 | 113.11 ± 25.52 | <0.0001 | 145.82 ± 13.73 | <0.0001 |
| Median no. of platelets (10⁹/liter) (IQR) | 238.00 (177.00–305.00) | 233.00 (149.00–299.00) | 0.171 | 193.00 (158.00–223.00) | <0.0001 | 220.00 (160.50–313.00) | 199.50 (137.00–290.75) | 0.059 | 190.00 (165.00–230.00) | 0.001 |
| Median no. of leukocytes (10⁹/liter) (IQR) | 6.03 (4.76–8.25) | 6.36 (4.71–9.07) | 0.488 | 5.92 (5.18–6.67) | 0.184 | 6.96 (5.06–9.14) | 5.93 (4.34–8.33) | 0.009 | 5.70 (4.91–6.55) | 0.217 |
| Median no. of lymphocytes (10⁹/liter) (IQR) | 1.15 (0.80–1.52) | 1.28 (0.87–1.87) | 0.056 | 1.86 (1.55–2.19) | <0.0001 | 1.29 (0.91–1.80) | 1.22 (0.86–1.62) | 0.343 | 1.85 (1.57–2.55) | <0.0001 |
| Median no. of neutrophils (10⁹/liter) (IQR) | 4.02 (3.23–5.93) | 4.08 (2.71–6.32) | 0.956 | 3.47 (2.87–4.09) | <0.0001 | 4.03 (2.51–5.69) | 4.85 (3.21–6.70) | 0.023 | 3.36 (2.75–3.92) | 0.006 |
| Median no. of monocytes (10⁹/liter) (IQR) | 0.47 (0.35–0.65) | 0.42 (0.26–0.61) | 0.015 | 0.36 (0.29–0.44) | <0.0001 | 0.43 (0.30–0.64) | 0.46 (0.34–0.71) | 0.211 | 0.31 (0.25–0.39) | <0.0001 |
| Mean Alb concn (g/liter) ± SD | 36.66 ± 6.78 | 36.60 ± 6.66 | 0.973 | 48.24 ± 2.67 | <0.0001 | 37.33 ± 7.47 | 36.69 ± 7.22 | 0.509 | 47.06 ± 2.25 | <0.0001 |
| Mean globulin concn (g/liter) ± SD | 31.69 ± 7.66 | 30.68 ± 8.10 | 0.269 | 28.92 ± 3.29 | 0.041 | 30.41 ± 7.89 | 29.73 ± 7.91 | 0.514 | 27.41 ± 3.19 | <0.0001 |
Radiological pathology refers to abnormal chest imaging results, including at least one of the following signs: polymorphic abnormality, calcification, cavity, bronchus sign, and pleural effusion. Abbreviations: Alb, albumin; IQR, interquartile range; P1, P value for the comparison of clinically diagnosed PTB cases and non-TB DCs (nontuberculosis disease control patients) in the selection cohort; P2, P value for the comparison of clinically diagnosed PTB patients and healthy subjects (HSs) in the selection cohort; P3, P value for the comparison of clinically diagnosed PTB cases and non-TB DCs in the validation cohort; P4, P value for the comparison of clinically diagnosed PTB patients and healthy subjects in the validation cohort.
FIG 2 Receiver operating characteristic (ROC) curves of different models in predicting clinically diagnosed PTB from suspected patients. (A) ROC curves of the selection cohort between clinically diagnosed PTB cases and non-TB disease controls. The 10-fold cross-validation ROC curve of the EHR+lncRNA model is provided in Fig. S4 in the supplemental material. P values for model AUC comparisons in the selection cohort were 0.00012 (EHR+lncRNA versus EHR only), 1.402 × 10⁻⁷ (EHR+lncRNA versus lncRNA only), and 0.103 (EHR only versus lncRNA only). P values of <0.016 (0.05/3, i.e., alpha divided by the comparison number) were considered statistically significant. (B) ROC curves of the validation cohort between clinically diagnosed PTB cases and non-TB disease controls. P values for model AUC comparisons in the validation cohort were 0.004 (EHR+lncRNA versus EHR only), 0.0003 (EHR+lncRNA versus lncRNA only), and 0.361 (EHR only versus lncRNA only).
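The significance threshold in the FIG 2 caption is a Bonferroni-style adjustment: the nominal alpha of 0.05 is divided by the three pairwise model comparisons. A minimal Python sketch of that arithmetic, using the P values reported in the caption, might look as follows. The function and variable names are illustrative and do not come from the paper.

```python
# Bonferroni-style threshold as described in the caption:
# alpha divided by the number of comparisons.
ALPHA = 0.05
N_COMPARISONS = 3
threshold = ALPHA / N_COMPARISONS  # about 0.0167; the caption reports it as 0.016

# P values copied from the FIG 2 caption.
p_values = {
    "selection": {
        "EHR+lncRNA vs EHR only": 0.00012,
        "EHR+lncRNA vs lncRNA only": 1.402e-7,
        "EHR only vs lncRNA only": 0.103,
    },
    "validation": {
        "EHR+lncRNA vs EHR only": 0.004,
        "EHR+lncRNA vs lncRNA only": 0.0003,
        "EHR only vs lncRNA only": 0.361,
    },
}


def significant_comparisons(p_by_comparison, cutoff):
    """Return the comparisons whose P value falls below the cutoff."""
    return [name for name, p in p_by_comparison.items() if p < cutoff]


for cohort, p_by_comparison in p_values.items():
    print(cohort, significant_comparisons(p_by_comparison, threshold))
```

Under this threshold, the combined EHR+lncRNA model differs significantly from each single-source model in both cohorts, while the EHR-only and lncRNA-only models do not differ significantly from each other.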
Performances of the comparative diagnostic models in the selection and validation cohorts
| Model | Sensitivity (95% CI) | Specificity (95% CI) | Accuracy (95% CI) | Positive predictive value (95% CI) | Negative predictive value (95% CI) |
|---|---|---|---|---|---|
| Selection cohort | | | | | |
| Clinically diagnosed PTB cases vs non-TB DCs | | | | | |
| EHR+lncRNA (nomogram) | 0.89 (0.82–0.93) | 0.80 (0.73–0.85) | 0.84 (0.80–0.88) | 0.80 (0.73–0.85) | 0.89 (0.83–0.93) |
| EHR only | 0.89 (0.83–0.93) | 0.62 (0.54–0.68) | 0.75 (0.69–0.79) | 0.67 (0.60–0.74) | 0.87 (0.79–0.91) |
| lncRNA only | 0.85 (0.76–0.88) | 0.55 (0.46–0.61) | 0.69 (0.63–0.74) | 0.62 (0.55–0.69) | 0.80 (0.72–0.86) |
| Validation cohort | | | | | |
| Clinically diagnosed PTB cases vs non-TB DCs | | | | | |
| EHR+lncRNA (nomogram) | 0.86 (0.77–0.90) | 0.82 (0.75–0.87) | 0.84 (0.78–0.88) | 0.77 (0.68–0.83) | 0.89 (0.83–0.93) |
| EHR only | 0.89 (0.82–0.94) | 0.65 (0.56–0.72) | 0.75 (0.69–0.81) | 0.64 (0.56–0.72) | 0.90 (0.83–0.94) |
| lncRNA only | 0.85 (0.76–0.90) | 0.54 (0.47–0.62) | 0.67 (0.60–0.73) | 0.56 (0.48–0.63) | 0.83 (0.75–0.89) |
| Microbiologically confirmed PTB cases vs non-TB DCs | | | | | |
| EHR+lncRNA (nomogram) | 0.85 (0.81–0.88) | 0.81 (0.76–0.85) | 0.83 (0.80–0.86) | 0.85 (0.81–0.89) | 0.80 (0.75–0.84) |
| EHR only | 0.86 (0.82–0.89) | 0.63 (0.58–0.68) | 0.76 (0.73–0.79) | 0.75 (0.71–0.79) | 0.77 (0.72–0.82) |
| lncRNA only | 0.86 (0.82–0.89) | 0.55 (0.50–0.61) | 0.73 (0.69–0.76) | 0.71 (0.67–0.75) | 0.75 (0.69–0.81) |
Note that the cutoff probabilities in the selection cohort were 0.37 for the EHR+lncRNA model, 0.26 for the EHR-only model, and 0.32 for the lncRNA-only model. Features in each model are provided in Appendix S4 in the supplemental material. The EHR+lncRNA formula that was developed to classify patients as PTB cases or non-TB disease controls was −3.32 − 0.053 × (age) − 0.94 × log(ENST00000497872) − 0.39 × log(n333737) + 1.51 × (CT calcification) + 1.16 × (TB-IGRA) + 1.09 × (low-grade fever) + 0.014 × (hemoglobin) + 0.23 × log(n335265) + 0.43 × (weight loss).
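The formula in this note can be read as the linear predictor of the logistic regression model, which is converted to a probability and compared with the 0.37 cutoff. The Python sketch below illustrates that reading under several assumptions that the source does not state explicitly: the formula gives the log-odds, binary features are coded 0/1, age is in years, hemoglobin is in g/liter, and a probability at or above the cutoff is classified as PTB. Because the base of the logarithm is not specified, the sketch expects lncRNA values that are already log-transformed. All function names and the example patient are hypothetical.

```python
import math

# Coefficients copied from the table note (EHR+lncRNA model).
INTERCEPT = -3.32
COEFFICIENTS = {
    "age": -0.053,                  # assumed to be in years
    "log_ENST00000497872": -0.94,   # already log-transformed expression
    "log_n333737": -0.39,           # already log-transformed expression
    "ct_calcification": 1.51,       # assumed coding: 1 = present, 0 = absent
    "tb_igra": 1.16,                # assumed coding: 1 = positive, 0 = negative
    "low_grade_fever": 1.09,        # assumed coding: 1 = present, 0 = absent
    "hemoglobin": 0.014,            # assumed to be in g/liter
    "log_n335265": 0.23,            # already log-transformed expression
    "weight_loss": 0.43,            # assumed coding: 1 = present, 0 = absent
}
CUTOFF = 0.37  # cutoff probability reported for the EHR+lncRNA model


def linear_predictor(features):
    """Intercept plus the weighted sum of features (assumed to be the log-odds)."""
    return INTERCEPT + sum(
        COEFFICIENTS[name] * features[name] for name in COEFFICIENTS
    )


def ptb_probability(features):
    """Logistic transform of the linear predictor."""
    return 1.0 / (1.0 + math.exp(-linear_predictor(features)))


def classify_as_ptb(features, cutoff=CUTOFF):
    """True when the predicted probability reaches the cutoff (assumed rule)."""
    return ptb_probability(features) >= cutoff


# Hypothetical patient for illustration only; not taken from the study.
example_patient = {
    "age": 30,
    "log_ENST00000497872": 0.0,
    "log_n333737": 0.0,
    "ct_calcification": 1,
    "tb_igra": 1,
    "low_grade_fever": 1,
    "hemoglobin": 120,
    "log_n335265": 0.0,
    "weight_loss": 0,
}
print(ptb_probability(example_patient), classify_as_ptb(example_patient))
```

Keeping the log transform outside the function avoids committing to a base that the note does not specify. A real application would need the preprocessing described in the paper and its supplemental material.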
FIG 3 Nomogram for the prediction of clinically diagnosed PTB patients. (A) Nomogram to predict the risk of clinically diagnosed PTB patients, in which points were assigned based on the feature rank order of the effect estimates. A vertical line is drawn between the “Points” axis and the corresponding point for each feature to generate a total point score and PTB probability. (B) Calibration plot in the selection cohort (left) and validation cohort (right), with lines indicating the ideal, apparent, and bias-corrected predictions of the nomogram. (C) Decision curve analysis for the nomogram and EHR-only model, with lines indicating the nomogram, the EHR-only model, and assumptions that no patients or all patients have PTB.
FIG 4 Alteration of lncRNAs before and after 2 months of intensive therapy. Shown are lncRNA expression levels before (blue) and after (red) a 2-month intensive anti-TB treatment regimen. Altered lncRNA expression levels were calculated using log2 lncRNA (posttreatment expression/pretreatment expression) values, and the Wilcoxon matched-pairs signed-rank test was used for comparisons among 22 paired samples. The median (interquartile range) log2 lncRNA values are as follows: −1.91 (−2.74, −1.11) before and −1.55 (−2.61, −0.79) after treatment for ENST00000497872, −3.88 (−4.81, −3.33) before and −2.30 (−2.99, −0.50) after treatment for n333737, and 2.12 (1.05, 2.34) before and 1.29 (0.85, 1.69) after treatment for n335265.
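The log2 ratio described in the FIG 4 caption can be sketched in a few lines of Python. The expression values below are made up for illustration and are not study data, the function names are hypothetical, and the quartile method is an assumption because the source does not say how the interquartile range was computed.

```python
import math
import statistics


def log2_ratios(pre, post):
    """log2(posttreatment / pretreatment) for each paired sample."""
    if len(pre) != len(post):
        raise ValueError("pre and post must hold the same number of paired samples")
    return [math.log2(after / before) for before, after in zip(pre, post)]


def median_and_iqr(values):
    """Median and (Q1, Q3), using inclusive quartiles (an assumed convention)."""
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q2, (q1, q3)


# Made-up relative expression values for four paired samples (NOT study data).
pre_treatment = [1.0, 2.0, 0.5, 4.0]
post_treatment = [2.0, 2.0, 2.0, 2.0]

ratios = log2_ratios(pre_treatment, post_treatment)
print(ratios)
print(median_and_iqr(ratios))
```

A paired test would then be applied to the pre- and posttreatment values. `scipy.stats.wilcoxon` is one commonly used implementation of the Wilcoxon signed-rank test, although the source does not say which software the authors used.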