Juli Chen1, Lijuan Wu2, Yanghua Lv1, Tangyuheng Liu2, Weihua Guo1, Jiajia Song2, Xuejiao Hu3, Jing Li1.
Abstract
Background: Pathogenic testing for tuberculosis (TB) is not yet sufficient for early and differential clinical diagnosis; thus, we investigated the potential of screening long non-coding RNAs (lncRNAs) from human hosts and using machine learning (ML) algorithms combined with electronic health record (EHR) metrics to construct a diagnostic model.
Keywords: diagnostic models; long non-coding RNA; machine learning algorithms; molecular markers; tuberculosis
Year: 2022 PMID: 35308365 PMCID: PMC8928272 DOI: 10.3389/fmicb.2022.774663
Source DB: PubMed Journal: Front Microbiol ISSN: 1664-302X Impact factor: 5.640
FIGURE 1. Overview of the strategy for investigating lncRNAs and prediction models for clinically diagnosed PTB patients.
Cohort distribution of PTB population study.
| Cohort | TB patients (n) | Healthy controls (n) | Non-TB lung disease (n) |
| Primary screening cohort | 7 | 5 | / |
| Selection cohort | 798 | 1650 | 299 |
| Training set | 399 | / | 150 |
| Validation set | 399 | / | 149 |
In the modeling and verification stage, the TB patients and non-TB lung disease patients in the selection cohort were randomly divided into a training set and a validation set.
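The random division described above can be sketched as follows. This is a minimal illustration that assumes a simple shuffle-and-halve within each group; the source does not state the randomization procedure, and the patient identifiers, the helper `split_half`, and the seeds are hypothetical. Only the group sizes (798 TB patients, 299 non-TB lung disease patients) come from the cohort table.

```python
import random

def split_half(ids, seed):
    """Shuffle a copy of ids and cut it into two near-equal halves.

    Assumption: an odd-sized group gives its extra member to the first half.
    """
    rng = random.Random(seed)
    shuffled = list(ids)
    rng.shuffle(shuffled)
    cut = (len(shuffled) + 1) // 2
    return shuffled[:cut], shuffled[cut:]

# Hypothetical identifiers; only the counts are taken from the cohort table.
tb_ids = [f"TB{i:03d}" for i in range(798)]
dc_ids = [f"DC{i:03d}" for i in range(299)]

tb_train, tb_valid = split_half(tb_ids, seed=0)
dc_train, dc_valid = split_half(dc_ids, seed=1)

print(len(tb_train), len(tb_valid), len(dc_train), len(dc_valid))
# 399 399 150 149
```

Splitting each group separately keeps the ratio of TB to non-TB patients nearly identical in the two sets, which matches the 399/150 and 399/149 counts reported in the table.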
Demographic and clinical features in the selection cohort.
| Clinical feature | TB patients | Healthy controls | Adjusted P1 | Non-TB lung disease | Adjusted P2 |
| Mean age (years) ± SD | 40.81 ± 18.37 | 40.39 ± 12.73 | 0.659 | 57.02 ± 15.41 | 0.000 |
| Gender (male/female, n) | 444/354 | 874/776 | 0.214 | 182/117 | 0.132 |
| Mean BMI (kg/m²) ± SD | 21.59 ± 3.43 | 21.51 ± 3.52 | 0.843 | 21.11 ± 3.62 | 0.359 |
| Mean erythrocytes (×10¹²/L) ± SD | 4.25 ± 0.77 | 4.91 ± 0.50 | 0.000 | 3.97 ± 0.85 | 0.000 |
| Mean hemoglobin (g/L) ± SD | 120.47 ± 23.23 | 145.59 ± 15.16 | 0.000 | 114.55 ± 25.34 | 0.000 |
| Mean hematocrit (%) ± SD | 36.72 ± 6.61 | 44.93 ± 3.22 | 0.000 | 35.69 ± 7.16 | 0.031 |
| Median platelets (×10⁹/L) (IQR) | 233 (167, 311) | 212 (178, 247) | 0.000 | 208 (145, 294) | 0.022 |
| Median ALT (U/L) (IQR) | 17 (11, 31) | 20 (14, 32) | 0.007 | 21 (15, 37) | 0.183 |
| Median AST (U/L) (IQR) | 22 (17, 36) | 20 (16, 26) | 0.000 | 26 (20, 36) | 0.097 |
| Median CRP (mg/L) (IQR) | 21.80 (6.21, 53.37) | 2.41 (1.24, 3.35) | 0.000 | 12.00 (5.15, 38.50) | 0.722 |
| Median ESR (mm/h) (IQR) | 34 (12, 67) | 13 (2, 16) | 0.000 | 42 (21, 61) | 0.314 |
| Mean albumin (g/L) ± SD | 36.08 ± 6.67 | 48.69 ± 2.71 | 0.000 | 36.54 ± 6.49 | 0.306 |
| Mean globulin (g/L) ± SD | 31.27 ± 6.98 | 27.55 ± 3.78 | 0.000 | 30.27 ± 7.71 | 0.040 |
| Median leukocytes (×10⁹/L) (IQR) | 6.06 (4.65, 8.30) | 5.77 (5.02, 6.64) | 0.000 | 6.67 (4.86, 9.07) | 0.011 |
| Median lymphocytes (×10⁹/L) (IQR) | 1.08 (0.75, 1.51) | 1.85 (1.54, 2.17) | 0.000 | 1.29 (0.90, 1.81) | 0.005 |
| Median neutrophils (×10⁹/L) (IQR) | 4.24 (2.99, 5.98) | 3.38 (2.80, 4.07) | 0.000 | 4.68 (3.02, 6.67) | 0.113 |
| Median monocytes (×10⁹/L) (IQR) | 0.46 (0.32, 0.64) | 0.33 (0.26, 0.41) | 0.000 | 0.43 (0.30, 0.65) | 0.213 |
| TB-IGRA positive (n, %) | 357 (44.74) | / | / | 85 (28.43) | 0.000 |
| Cough (n, %) | 434 (54.38) | / | / | 154 (51.51) | 0.433 |
| Low fever (n, %) | 413 (51.75) | / | / | 109 (36.45) | 0.000 |
| Weight loss (n, %) | 248 (31.08) | / | / | 40 (13.38) | 0.000 |
| Night sweats (n, %) | 313 (39.22) | / | / | 51 (17.06) | 0.000 |
| Poor appetite (n, %) | 336 (42.10) | / | / | 103 (34.45) | 0.021 |
| Fatigue (n, %) | 289 (36.22) | / | / | 124 (41.48) | 0.110 |
| Polymorphic changes (n, %) | 446 (55.89) | / | / | 103 (34.45) | 0.000 |
| Calcified foci (n, %) | 136 (17.04) | / | / | 21 (7.02) | 0.000 |
P-value was adjusted for age and gender between two groups; IQR, interquartile range; P1, P value for the comparison of TB cases and healthy controls (HCs) in the selection cohort; P2, P value for the comparison of TB cases and non-TB DCs (non-tuberculosis lung disease control patients) in the selection cohort.
FIGURE 2. Bioinformatics analysis of candidate lncRNAs.
FIGURE 3. Expression of candidate lncRNAs in the selection cohort.
lncRNA expression in the selection cohort (2^−ΔΔCq).
| LncRNA | Healthy controls | TB patients | Adjusted P1 | Non-TB lung disease | Adjusted P2 |
| | 0.94 (0.46–2.39) | 0.62 (0.36–1.25) | <0.001 | 0.68 (0.32–1.44) | 0.367 |
| | 1.12 (0.62–1.65) | 0.33 (0.14–0.65) | <0.001 | 0.54 (0.32–0.88) | <0.001 |
| | 0.99 (0.49–2.07) | 0.70 (0.46–1.17) | <0.001 | 0.91 (0.55–1.44) | <0.001 |
P-value was adjusted for age and gender between two groups; IQR, interquartile range; P1, P value for the comparison of TB cases and healthy controls (HCs) in the selection cohort; P2, P value for the comparison of TB cases and non-TB DCs (non-tuberculosis lung disease control patients) in the selection cohort.
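The expression values in the table above are reported on the 2^−ΔΔCq scale. The sketch below shows how such a relative expression value is conventionally computed from quantification cycles; the function name, the choice of reference gene and calibrator sample, and all Cq numbers are hypothetical illustrations, not values from the study.

```python
def relative_expression(cq_target_sample, cq_ref_sample,
                        cq_target_calibrator, cq_ref_calibrator):
    """Relative expression by the 2^-ddCq method.

    dCq  = Cq(target) - Cq(reference gene), computed per sample
    ddCq = dCq(sample) - dCq(calibrator)
    """
    d_cq_sample = cq_target_sample - cq_ref_sample
    d_cq_calibrator = cq_target_calibrator - cq_ref_calibrator
    dd_cq = d_cq_sample - d_cq_calibrator
    return 2.0 ** (-dd_cq)

# Hypothetical Cq values: the target amplifies one cycle later in the sample
# than in the calibrator, relative to the reference gene.
value = relative_expression(
    cq_target_sample=26.0, cq_ref_sample=18.0,
    cq_target_calibrator=25.0, cq_ref_calibrator=18.0,
)
print(value)  # 0.5
```

On this scale a value below 1 means lower expression than the calibrator, which is how the lower values in TB patients relative to healthy controls should be read.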
FIGURE 4. ROC curves of the three models in the selection cohort.
AUC of the three diagnostic models in the training set.
| Model | AUC (95% CI) | Z value | P value |
| “LncRNA+EHR” model | 0.89 (0.86–0.92) | | |
| “EHR” model | 0.86 (0.83–0.89) | 3.224 | 0.001 |
| “LncRNA” model | 0.68 (0.63–0.72) | 9.081 | <0.001 |
AUCs were compared using DeLong’s test. The Z and P values in the “EHR” row refer to the comparison between the “EHR” model and the “LncRNA+EHR” model; those in the “LncRNA” row refer to the comparison between the “LncRNA” model and the “LncRNA+EHR” model.
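As a reading aid for the Z values above, the sketch below converts a Z statistic to a P value. It assumes the P values are two-sided and that the statistic is referred to a standard normal distribution, which is the usual convention for DeLong’s test but is not stated in the source; the function name is hypothetical. It does not implement DeLong’s test itself.

```python
import math

def two_sided_p_from_z(z: float) -> float:
    """Two-sided P value for a standard normal Z statistic.

    Uses the identity P = erfc(|z| / sqrt(2)).
    """
    return math.erfc(abs(z) / math.sqrt(2.0))

p_ehr = two_sided_p_from_z(3.224)  # "EHR" vs "LncRNA+EHR"
p_lnc = two_sided_p_from_z(9.081)  # "LncRNA" vs "LncRNA+EHR"

print(f"{p_ehr:.3f}")  # 0.001
print(p_lnc < 0.001)   # True
```

Under these assumptions the tabulated P values of 0.001 and <0.001 are consistent with the reported Z values.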
FIGURE 5. Nomogram of the optimal model (“lncRNA + EHR” model).
FIGURE 6. DCA of the “EHR” model and nomogram.
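Decision curve analysis (DCA) compares models by their net benefit across threshold probabilities. The sketch below uses the standard net benefit definition, TP/n − FP/n × pt/(1 − pt); the labels, predicted probabilities, and function names are hypothetical toy values, not data from the study.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of treating everyone whose predicted risk >= threshold."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    weight = threshold / (1.0 - threshold)  # harm of one false positive
    return tp / n - (fp / n) * weight

def net_benefit_treat_all(y_true, threshold):
    """Reference strategy: classify every patient as positive."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)

# Hypothetical toy data: 1 = TB, 0 = non-TB lung disease.
y_true = [1, 1, 0, 0, 1, 0]
y_prob = [0.9, 0.6, 0.4, 0.2, 0.3, 0.7]

nb_model = net_benefit(y_true, y_prob, threshold=0.5)
nb_all = net_benefit_treat_all(y_true, threshold=0.5)
print(round(nb_model, 4), round(nb_all, 4))  # 0.1667 0.0
```

A decision curve plots these quantities over a range of thresholds; a model is useful at thresholds where its curve lies above both the treat-all line and the treat-none line (net benefit 0).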