
Systematic review of prediction models for pulmonary tuberculosis treatment outcomes in adults.

Lauren S Peetluk1, Felipe M Ridolfi2, Peter F Rebeiro3,4, Dandan Liu5, Valeria C Rolla2, Timothy R Sterling4.   

Abstract

OBJECTIVE: To systematically review and critically evaluate prediction models developed to predict tuberculosis (TB) treatment outcomes among adults with pulmonary TB.
DESIGN: Systematic review. DATA SOURCES: PubMed, Embase, Web of Science and Google Scholar were searched for studies published from 1 January 1995 to 9 January 2020. STUDY SELECTION AND DATA EXTRACTION: Studies that developed a model to predict pulmonary TB treatment outcomes were included. Study screening, data extraction and quality assessment were conducted independently by two reviewers. Study quality was evaluated using the Prediction model Risk Of Bias Assessment Tool. Data were synthesised with narrative review and in tables and figures.
RESULTS: 14 739 articles were identified, 536 underwent full-text review and 33 studies presenting 37 prediction models were included. Model outcomes included death (n=16, 43%), treatment failure (n=6, 16%), default (n=6, 16%) or a composite outcome (n=9, 25%). Most models (n=30, 81%) measured discrimination (median c-statistic=0.75; IQR: 0.68-0.84), and 17 (46%) reported calibration, often the Hosmer-Lemeshow test (n=13). Nineteen (51%) models were internally validated, and six (16%) were externally validated. Eighteen (54%) studies mentioned missing data, and of those, half (n=9) used complete case analysis. The most common predictors included age, sex, extrapulmonary TB, body mass index, chest X-ray results, previous TB and HIV. Risk of bias varied across studies, but all studies had high risk of bias in their analysis.
CONCLUSIONS: TB outcome prediction models are heterogeneous with disparate outcome definitions, predictors and methodology. We do not recommend applying any in clinical settings without external validation, and encourage future researchers to adhere to guidelines for the development and reporting of prediction models. TRIAL REGISTRATION: The study was registered on the international prospective register of systematic reviews PROSPERO (CRD42020155782). © Author(s) (or their employer(s)) 2021. Re-use permitted under CC BY-NC. No commercial re-use. See rights and permissions. Published by BMJ.

Keywords:  epidemiology; statistics & research methods; tuberculosis

Year:  2021        PMID: 33653759      PMCID: PMC7929865          DOI: 10.1136/bmjopen-2020-044687

Source DB:  PubMed          Journal:  BMJ Open        ISSN: 2044-6055            Impact factor:   2.692


Prediction models for tuberculosis treatment outcomes have the potential to inform interventions or treatment management protocols to promote cure among patients with tuberculosis at the greatest risk of unsuccessful treatment outcomes, but the methods and clinical utility of existing models had not been formally evaluated. This was the first systematic review of prediction models for tuberculosis treatment outcomes. The review used a comprehensive search strategy, conducted a thorough bias assessment with the Prediction Model Risk of Bias Assessment Tool (PROBAST), and offers recommendations for future model development and validation studies for predicting tuberculosis treatment outcomes. Evidence synthesis and quality assessment were limited by incomplete reporting in primary studies, as well as heterogeneities in study populations, such as multidrug resistance and age. External validation studies and studies written in languages other than English, Spanish, Portuguese or French were excluded.

Background

Tuberculosis (TB) is one of the top 10 causes of death worldwide and a leading cause of death from an infectious disease. In 2018, 10 million people developed TB and 1.45 million people died from it globally, despite widespread availability of curative treatment.1 Global treatment success was 85% for all new and relapse patients with TB in 2018. For HIV-associated TB, it was 75%. These proportions are lower than the End TB Strategy target of ≥90% treatment success.2 Heeding early recognition that Mycobacterium tuberculosis develops resistance rapidly in response to single-drug therapy, TB has been treated with combination regimens for more than 50 years.3 Aside from weight-based dosing, the WHO and other TB guidelines authorities recommend a standardised approach for treatment of almost all patients with TB.4–6 The current recommendation for drug-susceptible TB includes 2 months of isoniazid, rifampin, pyrazinamide, and ethambutol, followed by 4 months of isoniazid and rifampin. Due to the long duration of TB treatment, it would be beneficial to understand early predictors of unsuccessful TB treatment outcomes to identify patients needing tailored treatment approaches, such as directly observed therapy (DOT) or extended treatment course. Research suggests that individual characteristics, such as HIV, age, undernutrition, diabetes, TB disease severity, extrapulmonary TB, history of TB, adherence, alcohol use and adverse drug reactions, are associated with unsuccessful TB treatment outcomes, but results vary by setting and patient population.7–10 Prediction models, defined as any combination or equation of two or more predictors to estimate an individualised probability of a specific endpoint within a defined period of time, are increasingly common in TB research.11 The large number of recent prediction models for TB outcomes highlights the common desire to identify patients with TB at greatest risk of an unsuccessful treatment outcome. 
However, to date, there has not been a formal synthesis or quality assessment of existing prediction models for TB treatment outcomes, which is essential to determine whether they should be used to inform care and may help guide development of future models. Thus, we conducted a systematic review to identify, describe, compare and synthesise clinical prediction models designed to predict TB treatment outcomes among persons with pulmonary TB.

Methods and analysis

All steps of the systematic review were carried out according to guidelines set by Cochrane Prognosis Methods Group (PMG) and PROGnosis RESearch Strategy (PROGRESS).12–14 Reporting adhered to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA, online supplemental file 1). This study was preregistered on Open Science Framework (OSF, https://osf.io/rz3wp) and the international prospective register of systematic reviews (PROSPERO: CRD42020155782).

Study eligibility criteria

The review question was defined according to the PICOTS (Population, Intervention, Comparator, Outcomes, Timing, Setting) framework (online supplemental file 2). In brief, the goal was to identify prognostic models developed to predict TB treatment outcomes among pulmonary TB cases. The main endpoint was unsuccessful TB treatment outcome, defined by the WHO as the combination of death, treatment failure, loss to follow-up and/or not evaluated, as compared with successful TB treatment outcome, defined as the combination of cure or treatment completion (table 1).15 Loss to follow-up was sometimes referred to as default or treatment abandonment.
Table 1

WHO definition of treatment outcomes for patients with TB

Outcome | Definition
Treatment completion | Completion of treatment without evidence of failure, but without documentation of a negative sputum smear or culture in the last month of treatment and/or on at least one previous occasion, either because tests were not done or because results are unavailable
Cure | Bacteriologic confirmation of a negative smear or culture at the end of TB treatment and on at least one previous occasion
Treatment success | Composite of cured and treatment completed
Treatment failure | Sputum smear or culture is positive at month 5 or later during treatment
Death | Patient with TB who dies for any reason before starting or during the course of treatment
Loss to follow-up | Patient with TB who did not start treatment or whose treatment was interrupted for 2 consecutive months or more
Not evaluated (transfer out) | Patient with TB for whom no treatment outcome was assigned, which includes cases who ‘transferred out’ to another treatment unit as well as cases for whom the treatment outcome is unknown to the reporting unit

TB, tuberculosis.
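
The binary endpoint defined above (unsuccessful vs successful treatment outcome) can be illustrated with a short sketch. This is a minimal example, assuming outcomes are recorded using the category names in table 1; the constant and function names are hypothetical and are not part of the review's methods.

```python
# Illustrative only: collapse WHO treatment outcome categories (table 1) into
# the binary endpoint used in the review. All names here are hypothetical.

SUCCESSFUL = {"cure", "treatment completion"}
UNSUCCESSFUL = {"death", "treatment failure", "loss to follow-up", "not evaluated"}

# Assumed synonyms: the text notes that loss to follow-up was sometimes
# referred to as default or treatment abandonment.
SYNONYMS = {
    "default": "loss to follow-up",
    "treatment abandonment": "loss to follow-up",
}


def classify_outcome(label: str) -> str:
    """Return 'successful' or 'unsuccessful' for a WHO outcome label."""
    key = label.strip().lower()
    key = SYNONYMS.get(key, key)
    if key in SUCCESSFUL:
        return "successful"
    if key in UNSUCCESSFUL:
        return "unsuccessful"
    raise ValueError(f"Unrecognised treatment outcome: {label!r}")


if __name__ == "__main__":
    for label in ["Cure", "Default", "Not evaluated"]:
        print(label, "->", classify_outcome(label))
```

Raising an error for unrecognised labels, rather than silently assigning them to one group, keeps any study-specific outcome wording visible for manual review.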

Inclusion criteria were: (1) prognostic model studies with or without external validation16; (2) study population included adult, drug-susceptible, pulmonary, TB cases; (3) written in English, Spanish, Portuguese or French; (4) published between 1 January 1995 and 9 January 2020; (5) treatment outcome was one of the following: cure, treatment completion, death, treatment failure, loss to follow-up or not evaluated. Exclusion criteria were: (1) predictive value of more than one variable was evaluated but not combined in a prediction model; (2) study population was only multidrug-resistant (MDR) TB cases, only extrapulmonary TB cases or only children (<18 years old); (3) outcome was evaluated during treatment such as: 2-month smear/culture conversion, acquired resistance, adverse events, quality of life; (4) long-term outcomes, such as relapse, recurrence or post-treatment mortality. The decision to include only articles in English, Spanish, Portuguese or French was based on study team capabilities. The dates reflect modern TB treatment practice; first-line TB treatment regimens were not available until the early 1990s.17 18 Articles that included a combination of drug-susceptible and drug-resistant cases, or a combination of children and adults were included.

Search strategy and selection criteria

The following electronic databases were searched on 9 January 2020: PubMed, Embase, Web of Science and the first 200 references from Google Scholar. This combination of databases achieved best overall recall for systematic reviews in a recent study.19 Clinicaltrials.gov and retractiondatabase.org were also searched for unpublished research. Reference lists of retrieved articles were checked to identify eligible studies. Search terms relating to the ‘prediction model’ component of the search were adapted from a PubMed search strategy that captured prediction model studies with sensitivity of 98%.20 That component was combined with terms relating to TB treatment outcomes. The search strategy, developed in PubMed, was adapted for all other databases with assistance from a reference librarian (online supplemental file 3). Article selection was conducted in three stages. The first stage was automatic deduplication and title screening, carried out using revtools in RStudio (V.1.2).21 Remaining articles were imported into Covidence, a web-based software platform that streamlines systematic reviews, where abstracts (Stage 2) and full text (Stage 3) were manually screened.22 Stages 2 and 3 were carried out by two independent reviewers (LP and FR). Discordance was discussed between reviewers, and if consensus was not reached, a third party arbitrated (one of TS, VCR, PR, DL). In stage 3, reasons for exclusion were documented according to PRISMA.
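
The first screening stage combines records from several databases and removes duplicates. The sketch below shows one simple way such deduplication could work, matching on DOI or on a normalised title. It is an illustration only and does not reproduce the matching used by revtools; the record fields (`title`, `doi`, `source`) and the example records are hypothetical.

```python
import re
from typing import Dict, List


def normalise_title(title: str) -> str:
    """Lower-case a title and collapse punctuation/whitespace for matching."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def deduplicate(records: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Keep the first record seen for each DOI or normalised title."""
    seen_dois, seen_titles = set(), set()
    unique = []
    for record in records:
        doi = record.get("doi", "").strip().lower()
        title = normalise_title(record.get("title", ""))
        if (doi and doi in seen_dois) or (title and title in seen_titles):
            continue  # same article already retrieved from another database
        if doi:
            seen_dois.add(doi)
        if title:
            seen_titles.add(title)
        unique.append(record)
    return unique


if __name__ == "__main__":
    # Hypothetical records, not taken from the review's search results.
    records = [
        {"title": "A prediction model for TB outcomes", "doi": "10.0000/example.1", "source": "PubMed"},
        {"title": "A Prediction Model for TB Outcomes.", "doi": "", "source": "Embase"},
        {"title": "Another hypothetical study", "doi": "10.0000/example.2", "source": "Web of Science"},
    ]
    print(len(deduplicate(records)), "unique records")
```

Exact matching of this kind misses near-duplicates with small title differences, which is one reason dedicated tools use fuzzy matching followed by manual checks.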

Data analysis

Data from selected studies were recorded using a database designed in REDCap (Vanderbilt University).23 24 Data extraction was informed by the CHecklist for critical Appraisal and data extraction for systematic Reviews of prediction Modelling Studies (CHARMS) and the Prediction Model Risk of Bias Assessment Tool (PROBAST).16 25 26 CHARMS checklist and PROBAST are shown in online supplemental files 4 and 5, respectively. Risk of bias and applicability of included studies were assessed using PROBAST by dual independent review.16 26 PROBAST was specifically designed to assess risk of bias of prediction model studies, which included identifying deficiencies in study design, conduct or analysis that led to inaccurate estimates of predictive performance. PROBAST has four domains: participants, predictors, outcome and analysis with 20 total signalling questions. Each question was answered on the scale: yes, probably yes, no, probably no, no information. Domains were scored as low, high and unclear risk of bias. PROBAST also guides assessment of applicability of participants, predictors, and outcomes from each included study to the review question. Results were summarised narratively and in tables and figures. Meta-analysis was not possible due to lack of external validation and use of disparate predictors, outcome definitions and modelling methods. For studies that presented multiple models with the same set of predictors and outcomes, but different methods, the best-performing method was included in data synthesis. For studies presenting multiple models with different sets of predictors (ie, baseline data vs longitudinal data), the model developed using only baseline data was included. If studies developed multiple models for different outcomes or with different populations, all models were included. 
To further evaluate the impact of study population heterogeneities on prediction model performance, we additionally examined results after stratifying studies by inclusion/exclusion of MDR and younger age groups.
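
The domain-level ratings described above can be combined into a single study-level judgement. The text does not state how this was done in the review; the sketch below assumes the rule commonly described in the PROBAST guidance (high if any domain is high, low only if all domains are low, otherwise unclear). Function and variable names are hypothetical.

```python
from typing import Dict

DOMAINS = ("participants", "predictors", "outcome", "analysis")
RATINGS = {"low", "high", "unclear"}


def overall_risk_of_bias(domain_ratings: Dict[str, str]) -> str:
    """Combine four PROBAST domain ratings into one overall rating.

    Assumed rule (not stated in the text): any 'high' gives 'high';
    all 'low' gives 'low'; anything else gives 'unclear'.
    """
    ratings = []
    for domain in DOMAINS:
        rating = domain_ratings[domain].strip().lower()
        if rating not in RATINGS:
            raise ValueError(f"Invalid rating for {domain}: {rating!r}")
        ratings.append(rating)
    if "high" in ratings:
        return "high"
    if all(rating == "low" for rating in ratings):
        return "low"
    return "unclear"


if __name__ == "__main__":
    # Pattern seen in several table 2 rows: low, low, low, high.
    example = {"participants": "Low", "predictors": "Low",
               "outcome": "Low", "analysis": "High"}
    print(overall_risk_of_bias(example))
```

Under this assumed rule, a study rated high in the analysis domain is rated high overall regardless of its other domains.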

Patient and public involvement

Neither patients nor the public were involved in the design, conduct, or reporting of the research, as it was not feasible or appropriate for this systematic review. The study protocol is publicly available at https://osf.io/rz3wp.

Role of the funding source

The funder of the study had no role in study design, data collection, data analysis, data interpretation or writing of the report. The corresponding author had full access to all the data in the study and had final responsibility for the decision to submit for publication.

Results

Study selection

The search identified 14 739 unique studies. After excluding irrelevant titles, 6426 abstracts were screened, 536 articles underwent full-text review, and 33 model development studies presenting 37 prediction models were included (figure 1).
Figure 1

PRISMA flow chart of inclusion process. PRISMA, Preferred Reporting Items for Systematic Reviews and Meta-Analyses
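
The counts reported for each screening stage imply the number of records excluded at each step. The following sketch derives those numbers from the four totals given in the text; the stage labels and the helper name are illustrative, and the per-reason breakdown shown in the flow chart is not reproduced.

```python
from typing import List, Tuple

# Stage totals as reported in the Study selection paragraph.
STAGES: List[Tuple[str, int]] = [
    ("unique records identified", 14739),
    ("abstracts screened", 6426),
    ("full texts reviewed", 536),
    ("studies included", 33),
]


def exclusions(stages: List[Tuple[str, int]]) -> List[Tuple[str, int]]:
    """Return the number of records dropped between consecutive stages."""
    dropped = []
    for (prev_name, prev_n), (name, n) in zip(stages, stages[1:]):
        dropped.append((f"{prev_name} -> {name}", prev_n - n))
    return dropped


if __name__ == "__main__":
    for step, n_excluded in exclusions(STAGES):
        print(f"{step}: {n_excluded} excluded")
```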


Study characteristics

Of the 33 studies, most were retrospective cohorts (n=25, 76%), three (9%) were prospective cohort studies, two (6%) were case–control studies and three (9%) were nested case–control studies. Data from nearly half of studies (n=16, 48%) were collected from surveillance systems; 11 (33%) studies used a data collection form developed specifically for their study and 6 (18%) studies extracted data from medical records. Median sample size was 803 (IQR: 291–4167). Full details on included studies are in table 2.
Table 2

Study characteristics

First author, year | Population | Study years | Study design | Location | Validation | No. with outcome/sample size (%) | Predictors in final model | Performance measures | Model presentation | Risk of bias (population, predictor, outcome, analysis)
Death
Abdelbary et al9/2017 | TB cases | 2006–2013 | Retrospective cohort | Mexico | Internal (split-sample) | Development: 261/4216 (6%); Validation: 260/4215 (6%) | Age (<41, 41–65, ≥65), sex, MDR, HIV, malnutrition, alcoholism, diabetes, pulmonary TB | c-statistic=0.70; Sensitivity=60%; Specificity=71% | Risk score | Low, high, low, high
Abdelbary et al9/2017 (TB/DM) | TB/DM cases | 2006–2013 | Retrospective cohort | Mexico | None | 88/2121 (4%) | Sex, malnutrition, BCG vaccinated, AFB smear (positive vs negative) | c-statistic=0.68 | Risk score | Unclear, high, low, high
Aljohaney69/2018 | Hospitalised patients with TB | December 2011–December 2016 | Retrospective cohort | Saudi Arabia | None | 41/291 (14%) | Clinical model: age, congestive heart failure; Clinical + lab model:* age >65, congestive heart failure, bilateral disease on chest X-ray | Clinical model: Accuracy=86%; Clinical and lab model:* Accuracy=90% | ORs | Unclear, unclear, unclear, high
Bastos et al70/2016 | Inpatient and outpatient TB cases on DOT | 2007–2013 | Retrospective cohort | Portugal | External (setting) | Development: 121/681 (18%); Validation: 24/103 (23%) | Hypoxemic respiratory failure, age (≥50 vs <50), bilateral involvement, comorbidities (at least one of HIV, diabetes, liver failure/cirrhosis, congestive heart failure, chronic respiratory disease), haemoglobin (<12 vs ≥12) | AUROC=0.84 (95% CI: 0.76 to 0.93); Sensitivity=41.8%; Specificity=92.1% | Risk score | Low, unclear, low, high
Gupta-Wright et al44/2019 | Hospitalised patients with TB/HIV | October 2015–September 2017 | Retrospective cohort | Malawi and South Africa | External (setting) | Development: 94/315 (30%); Validation: 147/644 (23%) | Sex, age 55+, currently taking ART, ability to walk unaided, severe anaemia, positive TB-LAM | c-statistic=0.68 (95% CI: 0.61 to 0.74); HL test: p=0.13; Calibration plot | Risk score | Low, low, low, high
Horita et al71/2013 | Hospitalised patients with TB | January 2008–July 2011 | Retrospective cohort | Japan | External (setting) | Development: 36/179 (20%); Validation: 48/244 (20%) | Age, oxygen requirement, albumin, activities of daily living | AUROC=0.893; Sensitivity=0.92; Specificity=0.73 | Risk score | Low, low, low, high
Koegelenberg et al40/2015 | Hospitalised patients with TB | January 2012–May 201366 | Retrospective cohort | South Africa | None | 38/83 (46%) | Septic shock, HIV with CD4 <200, creatinine >140 (male) or >120 (female), P:F O2 ratio <200, chest radiograph showing miliary pattern/parenchymal infiltrates, absence of TB treatment at admission | Mean score in survivors: 2.27 (SD=1.47); Mean score in non-survivors: 3.58 (SD=1.08) | Risk score | Low, low, low, high
Nguyen and Graviss53 (general pop)/2018 | TB cases | January 2010–December 2016 | Retrospective cohort | Texas | Internal (split-sample) | Development: 253/3378 (7%); Validation: 270/3377 (8%) | Age group (15–44, 44–64, >64), US born, homeless, resident of long-term care facility, chronic kidney failure, meningeal TB, miliary TB, HIV positive, HIV unknown | AUROC=0.80 (95% CI: 0.77 to 0.82); HL test: χ²=6.3, p=0.613 | Risk score | Low, unclear, unclear, high
Nguyen and Graviss37 (TB/DM)/2019 | Patients with TB/DM | January 2010–December 2016 | Retrospective cohort | Texas | Internal (bootstrap) | 112/1227 (9%) | Age ≥65, US-born, homeless, injection drug use, chronic kidney failure, TB meningitis, miliary TB, AFB positive smear, HIV positive | AUROC=0.82 (95% CI: 0.78 to 0.87); HL test: χ²=4.54, p=0.81; Brier score=0.07 | Risk score | Unclear, unclear, unclear, high
Nguyen et al52 (TB/HIV)/2018 | Patients with TB/HIV | January 2010–December 2016 | Retrospective cohort | Texas | Internal (bootstrap) | 57/450 (13%) | Age ≥45, resident of long-term care facility, meningeal TB, abnormal chest x-ray, diagnosis confirmed by positive culture of nucleic acid amplification, culture not converted or unknown | AUROC=0.79 (95% CI: 0.70 to 0.87); HL test: χ²=4.25, p=0.51; Brier score=0.09 | Risk score | Low, high, unclear, high
Pefura-Yone et al54/2017 | Patients with TB | January 2012–December 201366 | Retrospective cohort | Cameroon | Internal (bootstrap) | 213/2250 (9%) | Age, adjusted BMI, clinical form (smear-positive pulmonary TB, smear-negative pulmonary TB, extrapulmonary TB), HIV | c-statistic=0.808; HL test: χ²=6.44, p=0.60; Sensitivity=80.7%; Specificity=68.2%; Calibration plot | Model coefficients | Low, low, low, high
Podlekareva et al46/2013 | Patients with TB/HIV | January 2004–December 2006 | Retrospective cohort | 52 cities in Europe and Argentina | None | 995† | Drug susceptibility testing performed, treatment with rifamycin+isoniazid+pyrazinamide, and combination ART at/near TB diagnosis | Crude hazard ratio=0.62 (95% CI: 0.64 to 0.84) | Risk score | Low, unclear, low, high
Valade et al42/2012 | Hospitalised patients with TB | March 2000–July 2009 | Retrospective cohort | France | Internal (bootstrap) | 20/53 (38%) | Miliary TB, catecholamine infusion, mechanical ventilation on admission | AUROC=0.92 (95% CI: 0.85 to 0.98); Brier score=0.13; Optimism=0.03; Accuracy=85%; Sensitivity=75%; Specificity=91% | Risk score | Unclear, low, low, high
Wang et al43/2019 | HIV-negative, culture-confirmed, pulmonary TB cases | January 2014–December 2016 | Prospective cohort | China | External (setting) | Development: 36/287 (13%); Validation: 15/104 (14%) | Age, cavitary lesion, pleural effusion, drug resistance, disseminated, albumin, c-reactive protein, white blood cell count, IL-6, migration inhibitory factor | AUROC=0.85 ± 0.028 | ORs | Low, low, low, high
Wejse et al36/2008 | Patients with pulmonary TB on DOT | 1996–2001 | Retrospective cohort | Guinea Bissau | None | 100/698 (14%) | Cough, haemoptysis, dyspnoea, chest pain, night sweating, anaemia conjunctivae, tachycardia, positive finding at lung auscultation, temperature >37, BMI <18, BMI <16, mid-upper arm circumference (MUAC) <220, MUAC <200 | AUROC=0.65 (95% CI: 0.6 to 0.7); Sensitivity=0.45; Specificity=0.75 | Risk score | Low, high, low, high
Zhang et al45/2019 | Patients with TB/HIV at end stage of AIDS | August 2009–January 2018 | Retrospective cohort | China | Internal (split-sample) | Development: 157/807 (19%); Validation: 40/200 (20%) | Anaemia, TB meningitis, severe pneumonia, hypoalbuminaemia, unexplained infection or space-occupying lesions, malignancy | AUROC=0.867 (95% CI: 0.832 to 0.902); Sensitivity=79.6%; Specificity=82.9% | Risk score | Low, low, low, high
Treatment failure
Abdelbary et al9/2017 | TB cases | 2006–2013 | Retrospective cohort | Mexico | Internal (split-sample) | Development: 2109†; Validation: 6322† | Education (no or low vs higher than primary school), MDR, AFB smear (>+2, +1, negative) | c-statistic=0.65; Sensitivity=52%; Specificity=66% | Risk score | Low, high, low, high
Kalhori et al49 (logistic)/2010 | TB cases at DOTS registration | 2005 | Retrospective cohort | Iran | Internal (split-sample) | Development: 828/4836 (17%); Validation: 2418† | Gender, age, weight, nationality, prison, case type | AUROC=0.70; Accuracy=81.64%; HL test: χ²=11.935, df=8, p=0.154 | Model coefficients | Unclear, unclear, unclear, high
Keane et al30/1997 | Patients with smear-positive TB on standard first-line regimen with DOT | 1990–1995 | Non-nested case–control | Vietnam | None | 130/803 (16%) | 3 month model: extensive lesions, mediastinal shift, average smear score third month, weight, progressive X-ray, any previous treatment; Baseline model: mediastinal shift, average smear score, extensive lesions, any previous treatment, cavities, weight | 3 month: Sensitivity=80%, Specificity=80%; Baseline: Sensitivity=70%, Specificity=80% | Model coefficients | High, unclear, unclear, high
Luies et al33/2017 | Smear-positive pulmonary TB cases on DOT | May 1999–July 2002 | Nested case–control | South Africa | Internal (cross-validation) | 10/31 (32%) | 3,5,-Dihydroxybenzoic acid, (3-(4-Hydroxy-3-methoxyphenyl) propionic acid | AUROC=0.89 (95% CI: 0.7 to 1.00) | Model coefficients | High, unclear, unclear, high
Mburu et al72/2018 | Patients with smear-positive TB | February 2014–August 2015 | Prospective cohort | Kenya | Internal (cross-validation) | 13/321 (4%) | HbA1c, regimen (retreatment), age, weight, random blood glucose, BMI, blood urea nitrogen, HIV-positive result, ever smoker, creatinine | AUROC=0.56 ± 0.07 | Relative score | Low, low, low, high
Default
Thompson et al73/2017 | Adults who were HIV-uninfected with newly diagnosed pulmonary TB | April 2010–April 2013 | Retrospective cohort | South Africa | Internal (cross-validation) and external (setting) | 6/99 (6%) | 18 splice junctions and 13 genes | AUROC (internal)=0.87; AUROC (external)=0.63 | Heatmap of differentially expressed genes | Low, low, low, high
Abdelbary et al9/2017 (TB/DM) | TB cases | 2006–2013 | Retrospective cohort | Mexico | None | 93/2121 (4%) | Age (<40 vs ≥40), sex, HIV | c-statistic=0.62 | Risk score | Unclear, high, unclear, high
Belilovsky et al35/2010 | Hospitalised patients with TB | 1993–2002 | Retrospective cohort | Russia | External (geographical) | Development: 1326/3904 (34%); Validation: 4662/12803 (36%) | Sex, unemployment, retreatment case, alcohol abuse (yes, no, no data), severe TB form, residence (urban vs rural), age (25–50 vs other), pulmonary TB (vs extrapulmonary), prison history | Belgorod: AUROC=0.75; Orel: AUROC=0.75; Pskov: AUROC=0.78; Yaroslavl: AUROC=0.75; Calibration table | Model coefficients | Unclear, high, high, high
Chang et al31/2004 | All patients with TB | January 1999–March 1999 | Nested case–control | China | None | 102/408 (25%) | Baseline:* ever smoker (current, former, never), retreatment (history of default, no history of default, not); Longitudinal: smoking status (current, former, never), retreatment (with history of default, without history of default, never), unsatisfactory adherence in first 2 months (good, poor, fair, unknown), subsequent hospitalisation, treatment side effects in last month of treatment | Baseline:* AUROC=0.70 (95% CI: 0.63 to 0.76), HL test: χ²=1.448, df=5, p=0.919; Longitudinal: AUROC=0.85 (95% CI: 0.80 to 0.90), HL test: χ²=5.887, df=6, p=0.436 | ORs | High, high, low, high
Chee et al32/2000 | TB cases | 1996 | Nested case–control | Singapore | None | 38/71 (54%) | Chinese race, extent of family support, treatment duration | Accuracy=74.6% | Model coefficients | High, unclear, high, high
Cherkaoui et al29/2014 | Patients with TB with definite or probable pulmonary or extrapulmonary TB | June 2010–October 2011 | Non-nested case–control | Morocco | None | 91/277 (33%) | Age <50, work interfering with ability to take TB treatment, retreatment regimen, daily DOT, moderate or severe side effects, told friends about TB, current smoker, never smoker, symptom resolution in <2 months, knowledge of TB treatment duration | AUROC=0.85 (95% CI: 0.80 to 0.90); Sensitivity=82.4%; Specificity=87.6%; HL test: χ²=0.77, p value=1.00 | Survey tool | High, high, high, high
Rodrigo et al55/2012 | New TB cases | January 2006–December 2009 | Prospective cohort | Spain | Internal (split-sample) | Development: 92/1490 (6%); Validation: 103/1589 (6%) | Immigrant, living alone, living in an institution, previous TB treatment, linguistic barriers (poor understanding), intravenous drug use, unknown intravenous drug use | AUROC=0.67 (95% CI: 0.65 to 0.70); Sensitivity=65.05%; Specificity=67.36% | Risk score | Low, low, low, high
Unfavourable outcome
Kalhori and Zeng50 (predicting)/2009† | Patients with TB at DOT registration | 2005 | Retrospective cohort | Iran | Internal (split-sample) | Development: 6920†; Validation: 2966† | Age, gender, nationality, prison, area, weight | Classification rate=89.8%; R2=0.45 | Model coefficients | Unclear, unclear, unclear, high
Sauer et al57/2018† | TB cases | Data available through March 2018 | Retrospective cohort | Azerbaijan, Belarus, Georgia, Moldova, Romania | Internal (split-sample) | Development: 103/411 (25%); Validation: 44/176 (25%) | FS:* Drug sensitivity, employment status, smear microscopy, dissemination; Backwards elimination (BE): Drug sensitivity, employment status, smear microscopy, dissemination; Stepwise selection (SS): Drug sensitivity, employment status, smear microscopy, dissemination; Lasso: Country, employment, extrapulmonary, cavity size, decrease in lung capacity, smear microscopy, drug sensitivity, chest imaging; Random forest (RF): Top five by mean decrease accuracy: lung cavity size, type of resistance, employment status, country, total cavities; Top five by mean decrease Gini index: Age of onset, drug regimen, lung cavity size, number of daily contacts, culture | FS:* AUROC=0.74 (95% CI: 0.66 to 0.82), Sensitivity=0.36, Specificity=0.89, Misclassification=0.24; BE: AUROC=0.73 (95% CI: 0.65 to 0.81), Sensitivity=0.3, Specificity=0.88, Misclassification=0.27; SS: AUROC=0.73 (95% CI: 0.65 to 0.81), Sensitivity=0.30, Specificity=0.88, Misclassification=0.27; Lasso: AUROC=0.72 (95% CI: 0.64 to 0.80), Sensitivity=0.21, Specificity=0.96, Misclassification=0.23; RF: AUROC=0.73 (95% CI: 0.65 to 0.81), Sensitivity=0.30, Specificity=0.88, Misclassification=0.27; SVM linear: AUROC=0.69 (95% CI: 0.60 to 0.77), Sensitivity=0.21, Specificity=0.94, Misclassification=0.24; SVM polynomial: AUROC=0.69 (95% CI: 0.60 to 0.77), Sensitivity=0, Specificity=1, Misclassification=0.25 | List | Unclear, unclear, unclear, high
Baussano et al47/2008‡ | Pulmonary TB cases | 2001–2005 | Retrospective cohort | Italy | Internal (bootstrap) | 576/1242 (46%) | Residency (residential vs homeless), sex, geographic origin (non-EU vs EU), case definition (other than definite vs definite), treatment setting (inpatient and unknown vs outpatient), age (continuous) | AUROC=0.75; Calibration slope=0.98; R2=0.24 | Nomogram | Low, unclear, low, high
Costa-Veiga et al48/2017‡ | Pulmonary TB cases | 2000–2012 | Retrospective cohort | Portugal | External (temporal) | Development: 1152/10766 (11%); Validation: 4714† | HIV, previous treatment, age class (25–44, 15–24, 45–64, >64), intravenous drug use, pathologies (other disease comorbidity) | AUROC=75.9% (95% CI: 74.1 to 77.7); Sensitivity=71%; Specificity=73% | Nomogram | Low, low, low, high
Killian et al34/2019‡ | Patients with TB (99 DOTS programme) | February 2017–September 2018 | Retrospective cohort | India | None | 433/4167 (10%) | LEAP:* LEAP with two input layers, (1) LSTM with 64 hidden units and a dense layer with 48 units for the dense layer and four units for the penultimate layer; lw-misses: missed doses in last week; t-misses: total missed doses in 35 days; Random forest: 150 trees and no max depth based on digital adherence technology from first 35 days | LEAP:* AUROC=0.743; lw-misses: AUROC=0.607; t-misses: AUROC=0.630; Random forest: AUROC=0.722 | None | High, high, unclear, high
Madan et al51/2018‡ | Patients with TB/HIV/HIV on DOT with first-line TB treatment | 2015 | Retrospective cohort | India | None | 78/448 (17%) | Sputum smear grade, previous TB, disease classification, HIV status, ART status, CD4 cell count, sex and age group (with interaction terms between age group and sex; sputum smear status and type of TB; HIV status at TB diagnosis and CD4 cell category) | AUROC=0.783; HL test p value=0.149 | Model coefficients | Low, low, low, high
Mburu et al72/2018‡ | Patients with smear-positive TB | February 2014–August 2015 | Prospective cohort | Kenya | Internal (cross-validation) | 32/340 (9%) | HbA1c, treatment regimen (retreatment), creatinine, BMI, blood urea nitrogen, weight, age, random blood glucose, HIV positive result, male gender | AUROC=0.65 ± 0.06 | Relative score | Low, low, low, high
Other outcome
Kalhori and Zeng74 (fuzzy)/2009§ | Patients with TB at DOTS registration | 2005 | Retrospective cohort | Iran | Internal (split-sample) | Development: 7254†; Validation: 2418† | Case type, treatment category, risky sex, prison, sex, recent TB infection, diabetes, low body weight, TB type, length, previous imprisonment, age, area, HIV | Mean absolute percentage error=1.24 | Learnt parameters | Unclear, unclear, high, high
Hussain and Junejo56/2019¶ | Patients with pulmonary and extrapulmonary TB (TB Reach) | 2011–2014 | Retrospective cohort | Unknown | Internal (split-sample) | Development: 3371†; Validation: 842† | Random forest*, artificial neural networks and support vector machine | Random forest:* Accuracy=76.32% | None | Unclear, unclear, unclear, high

*Indicates best-performing/most relevant model, which is included throughout the manuscript (see Methods section for details). Performance measures are reported for highest level of validation performed (ranked from strongest to weakest: external validation, internal validation, no validation). If internal and external validation were performed, both are reported.

†Outcome number unknown.

‡Outcome is composite of death, treatment failure, loss to follow-up and not evaluated.

§Outcome is a value from 1 to 5 (1=patient completed the treatment course in frame of DOTS, 2=cured, 3=quit treatment, 4=failed treatment and 5=death).

¶Outcome is treatment completion.

**Outcome is composite of death and treatment failure (losses to follow-up and not evaluated (unknown) outcomes were excluded).

AFB, acid fast bacilli; AUROC, area under receiver operating characteristic; BCG, Bacillus Calmette–Guérin; BMI, body mass index; c-statistic, concordance statistic; DM, diabetes mellitus; DOTS, directly observed therapy; FS, forward selection; HbA1c, haemoglobin A1c; HL, Hosmer-Lemeshow; LEAP, Lstm rEal-time Adherence Predictor; MDR, multidrug resistant; TB, tuberculosis.
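
Two of the performance measures reported in table 2, the c-statistic (equivalent to the AUROC for a binary outcome) and the Brier score, can be computed directly from predicted probabilities and observed outcomes. The sketch below uses plain Python and a small made-up data set; it is meant to show what the measures quantify, not how any included study calculated them.

```python
from typing import Sequence


def c_statistic(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Proportion of event/non-event pairs in which the event has the higher
    predicted probability; ties count as half."""
    events = [p for y, p in zip(y_true, y_prob) if y == 1]
    non_events = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not events or not non_events:
        raise ValueError("Need at least one event and one non-event")
    concordant = 0.0
    for p_event in events:
        for p_non_event in non_events:
            if p_event > p_non_event:
                concordant += 1.0
            elif p_event == p_non_event:
                concordant += 0.5
    return concordant / (len(events) * len(non_events))


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted probability and outcome."""
    if len(y_true) != len(y_prob) or not y_true:
        raise ValueError("Inputs must be non-empty and of equal length")
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


if __name__ == "__main__":
    # Hypothetical outcomes (1 = unsuccessful) and predicted risks.
    outcomes = [1, 0, 1, 0, 0]
    risks = [0.8, 0.3, 0.4, 0.5, 0.1]
    print(f"c-statistic = {c_statistic(outcomes, risks):.3f}")
    print(f"Brier score = {brier_score(outcomes, risks):.3f}")
```

A c-statistic of 0.5 corresponds to chance-level discrimination and 1.0 to perfect discrimination, whereas a lower Brier score indicates predictions closer to the observed outcomes.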

Thirteen (41%) studies took place in Asia, eight (25%) in Africa, six (19%) in Europe, four (12%) in North America and one (3%) included sites in Europe and Argentina. Fewer than half (n=14, 45%) took place in high-burden TB settings.1 One study did not report study location (tables 2 and 3).

Table 3

Characteristics of patient populations in the 33 included studies with prediction models for TB treatment outcomes

*Determined based on study location and WHO list of 30 countries with high-burden TB in the 2019 Global Tuberculosis Report (1).

TB, tuberculosis.

Reporting of population characteristics varied by study (table 4). Among 18 studies that reported a measure of central tendency (mean or median) for age, the median of those measures was 41 years (IQR: 37–49). 
Of 17 studies that reported the minimum age of participants, seven (41%) had a minimum age of 15, one (6%) had a minimum age of 16, one (6%) had a minimum age of 17 and the remainder had minimum age of 18. Eighteen studies reported including persons living with HIV (PLWH); 5 of these included only patients with TB/HIV. Thirteen studies reported including persons with diabetes; one of which included only TB/DM. Eight studies reported including some participants with MDR, though prevalence of MDR was low in all studies. Ten studies included only hospitalised patients, and in 14 studies, all participants were on directly observed therapy (DOT).
Table 4

Study population characteristics of 33 included studies

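| Characteristics | Included: yes | Included: no | Included: unknown | Median (IQR)*, n |
|---|---|---|---|---|
| Age† | | | 15 | 41 (37–49), n=18 |
| HIV | 18 | 7 | 8 | 23% (10–100), n=17 |
| Diabetes | 13 | 1 | 19 | 12% (5–21), n=11 |
| MDR | 8 | 7 | 18 | 1% (1–3), n=8 |
| Other drug resistance | 12 | 1 | 20 | 6% (4–12), n=10 |
| Extrapulmonary TB‡ | 22 | 4 | 7 | 11% (4–17), n=16 |
| Previous TB | 20 | 1 | 12 | 19% (9–30), n=17 |
| DOT | 14 | 0 | 19 | 100% (100–100), n=14 |
| Hospitalised patients | 13 | 1 | 19 | 100% (100–100), n=10 |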

*Other than age (which is reported in years), this is the percentage of the population that has the characteristic among studies that include patients with the characteristic. For example, among the 18 studies that include persons with HIV, 17 report how many people had HIV and among those, the median percentage of the population with HIV is 23%.

†Based on the measure of central tendency reported in the study (mean: n=11; median: n=7).

‡Forms of extrapulmonary TB differ by study but included some of the following: miliary, meningeal, pleural, peritoneal, disseminated, blood/bone, abdominal.

DOT, directly observed therapy; MDR, multidrug resistance; TB, tuberculosis.


Model characteristics

Model outcomes included death (n=16, 43%), treatment failure (n=6, 16%), default (n=6, 16%) or a composite outcome (n=9, 25%; tables 2 and 5). The complete outcome definition for all included studies is in online supplemental file 6. Most models were developed using clinical/epidemiologic predictors (n=34, 92%), two (5%) used multiple biomarkers and one (3%) used adherence data. 
The most common candidate predictors were age, sex, extrapulmonary TB, smear result, body mass index (BMI), X-ray findings and previous TB. The most common predictors retained in the final models were age, sex, extrapulmonary TB, BMI, chest X-ray results, previous TB and HIV (figure 2).
Figure 2

Most common predictors considered and included. Considered: the predictor was evaluated as a candidate predictor prior to multivariable modelling. Included: the predictor was considered and subsequently included in the final multivariable model. BMI, body mass index; MDR, multidrug resistant; TB, tuberculosis.

Only three models (8%) used survival analysis; most models used logistic regression (n=29, 78%) and five (14%) used a machine-learning approach. More than half of studies (n=19, 51%) considered variables for inclusion in the multivariable model based on unadjusted associations with the outcome. Model building methods varied widely between models (table 5).
Table 5

Methods reported for the 37 models of the 33 included studies with prediction models for TB treatment outcomes

| Characteristics | Studies reporting characteristic, n (%) | Categories | N (%) or median (IQR) |
|---|---|---|---|
| Type of outcome | 37 (100) | Single | 29 (78) |
| | | Composite | 8 (22) |
| Outcome | 37 (100) | Death | 16 (43) |
| | | Treatment failure | 6 (16) |
| | | Default, loss to follow-up or treatment interruption | 6 (16) |
| | | Unfavourable outcome | 6 (16) |
| | | Treatment success | 2 (6) |
| | | Other* | 1 (3) |
| Number and prevalence of outcome† | 32 (87) | | 94 (38–171); 15% (9–26) |
| Events per candidate variable‡ | 30 (81) | | 6 (3–11) |
| Events per variable (in final model) | 29 (78) | | 14 (9–26) |
| Predictor types | 37 (100) | Clinical/epidemiologic | 34 (92) |
| | | Adherence | 1 (3) |
| | | Biomarker | 2 (5) |
| Analysis | 37 (100) | Logistic regression | 29 (78) |
| | | Survival analysis | 3 (8) |
| | | Machine learning | 5 (14) |
| Method for considering predictors in multivariable models | 36 (97) | All candidate predictors | 12 (32) |
| | | Based on unadjusted association with outcome | 19 (51) |
| | | Based on clinical relevance | 1 (3) |
| | | Other§ | 4 (14) |
| Selection of predictors during modelling | 31 (84) | Full model approach | 2 (6) |
| | | Forward selection | 7 (23) |
| | | Backwards elimination | 5 (16) |
| | | Stepwise selection | 8 (26) |
| | | Random Forest | 1 (3) |
| | | Hosmer-Lemeshow model building criteria | 4 (13) |
| | | Bayesian model averaging | 3 (10) |
| | | Pairwise selection | 1 (3) |
| P value for consideration in model | 17 (46) | 0.01 | 2 (12) |
| | | 0.05 | 3 (18) |
| | | 0.11 | 1 (6) |
| | | 0.2 | 6 (35) |
| | | 0.25 | 5 (29) |
| P value for retention in MV model | 20 (54) | 0.05 | 9 (45) |
| | | 0.1 | 9 (45) |
| | | 0.15 | 1 (5) |
| | | 0.2 | 1 (5) |
| Internal validation | 19 (51) | Split-sample | 10 (53) |
| | | Bootstrap | 5 (26) |
| | | Cross-validation | 4 (21) |
| External validation | 6 (16) | Temporal | 1 (17) |
| | | Geographic | 1 (17) |
| | | Setting | 4 (67) |
| Calibration | 17 (46) | Calibration plot¶ | 2 (12) |
| | | Calibration slope¶ | 1 (6) |
| | | Hosmer-Lemeshow goodness of fit p value¶ | 13 (77); 0.51 (0.20–0.79) |
| | | Calibration table¶ | 2 (12) |
| | | Mean absolute error¶ | 1 (6) |
| Discrimination | 30 (81) | C-statistic (AUROC)¶ | 30 (100); 0.75 (0.68–0.84) |
| | | Log rank test¶ | 2 (5) |
| Classification | 18 (49) | Sensitivity** | 14 (78); 70 (54–78) |
| | | Specificity** | 13 (72); 75 (71–88) |
| | | Accuracy | 2 (11) |
| | | Other†† | 2 (11) |
| Model presentation | 34 (92) | Risk score | 16 (43) |
| | | Model coefficient | 8 (22) |
| | | Nomogram | 2 (6) |
| | | ORs/relative scores | 4 (12) |
| | | Survey tool | 1 (3) |

*Outcome is a value from 1 to 5 (1=patient completed the treatment course in frame of DOTS, 2=cured, 3=quit treatment, 4=failed treatment and 5=death).

†Prevalence of outcome in the population used to develop the prediction model (ie, derivation/development subset if split-sample technique was used or full sample if the model was not validated or if bootstrap/cross-validation was used).

‡Only five studies report the exact number of predictors considered. Otherwise, the number of candidate predictors was estimated from the provided tables or lists of candidate predictors in the source paper.

§Other methods of determining which variables to consider for the prediction model include: principal components analysis (n=1), screening for multicollinearity via correlation coefficient (n=1), a combination of a priori selection and selection via univariable association (n=1) and machine-learning preprocessing (n=1).

¶Sums to more than 100%, because some studies report multiple measures of calibration or discrimination.

**Based on the following cut-off methods: Youden (n=4), concordance probability (n=1), estimated at nearest 0.1 for studies that present a range of sensitivity and specificity in a table or figure (n=4), or unknown (n=5).

††Other includes one study that reports false positive rate and one study that includes a graph of sensitivity versus specificity.

AUROC, area under receiver operating characteristic; c-statistic, concordance statistic; TB, tuberculosis.
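
As an illustration of the Youden cut-off method named in the classification footnote above, the following minimal Python sketch selects the score threshold that maximises sensitivity + specificity − 1. The function name, risk scores and outcomes are hypothetical and are not taken from any included study; ties are resolved in favour of the lowest cut-off, which is an arbitrary choice made here.

```python
def youden_cutoff(scores, outcomes):
    """Return (cutoff, sensitivity, specificity) maximising J = sens + spec - 1."""
    positives = sum(outcomes)
    negatives = len(outcomes) - positives
    best = None
    for cutoff in sorted(set(scores)):
        # A patient is classified as high risk when the score is at or above the cutoff.
        tp = sum(1 for s, y in zip(scores, outcomes) if s >= cutoff and y == 1)
        tn = sum(1 for s, y in zip(scores, outcomes) if s < cutoff and y == 0)
        sens = tp / positives
        spec = tn / negatives
        j = sens + spec - 1
        # Strict comparison keeps the lowest cutoff when two cutoffs tie.
        if best is None or j > best[0]:
            best = (j, cutoff, sens, spec)
    return best[1], best[2], best[3]


# Hypothetical integer risk scores and outcomes (1 = unfavourable outcome).
scores = [1, 2, 2, 3, 4, 5, 6, 7]
outcomes = [0, 0, 0, 1, 0, 1, 1, 1]
cutoff, sens, spec = youden_cutoff(scores, outcomes)
print(cutoff, sens, spec)  # 3 1.0 0.75
```

Because sensitivity and specificity depend on the chosen threshold, values reported by studies using different cut-off methods are not directly comparable.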

Only 19 (51%) models were internally validated, including 10 (53%) with split-sample validation, 5 (26%) with bootstrap resampling and 4 (21%) with cross-validation. Six (16%) models were externally validated. Many models (n=30, 81%) reported discrimination with c-statistic (concordance statistic) or area under the receiver operating characteristic (AUROC), which are equivalent and quantify the ability of the model to distinguish between patients who do and do not develop an outcome. Only 17 (46%) reported calibration, the agreement between observed and predicted outcomes. Most studies assessed calibration with Hosmer-Lemeshow tests (n=13, 77%); only two studies provided a calibration plot, the preferred reporting method for prediction model studies,16 27 28 and one reported the calibration slope (table 2). Models were presented in a variety of ways, the most common of which was a weighted risk score (n=16, 43%); details on model presentation are in online supplemental file 7.
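
To make the two performance concepts above concrete, the following is a minimal Python sketch, not a method used by any included study. It computes the c-statistic as the proportion of event/non-event pairs in which the patient with the event has the higher predicted risk, and builds a simple calibration table comparing mean predicted risk with the observed event rate in risk-ordered groups. The function names and the predicted risks and outcomes are hypothetical.

```python
def c_statistic(predicted, outcomes):
    """Share of (event, non-event) pairs ranked correctly; ties count as half.

    For a binary outcome this equals the AUROC.
    """
    events = [p for p, y in zip(predicted, outcomes) if y == 1]
    non_events = [p for p, y in zip(predicted, outcomes) if y == 0]
    concordant = 0.0
    for e in events:
        for n in non_events:
            if e > n:
                concordant += 1.0
            elif e == n:
                concordant += 0.5
    return concordant / (len(events) * len(non_events))


def calibration_table(predicted, outcomes, n_groups=2):
    """Return (mean predicted risk, observed event rate) per risk-ordered group."""
    pairs = sorted(zip(predicted, outcomes))
    size = len(pairs) // n_groups
    rows = []
    for g in range(n_groups):
        start = g * size
        # The last group absorbs any remainder.
        end = (g + 1) * size if g < n_groups - 1 else len(pairs)
        group = pairs[start:end]
        mean_pred = sum(p for p, _ in group) / len(group)
        observed = sum(y for _, y in group) / len(group)
        rows.append((mean_pred, observed))
    return rows


# Hypothetical predicted risks and observed outcomes (1 = event).
predicted = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
outcomes = [0, 0, 1, 0, 0, 1, 1, 1]

c_stat = c_statistic(predicted, outcomes)
table = calibration_table(predicted, outcomes)
print(c_stat)  # 0.875
```

In this toy example the mean predicted risk matches the observed event rate in both groups, so the model would appear well calibrated; a model can have a high c-statistic and still be poorly calibrated, which is why both measures are needed.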

Quality assessment

Grading of PROBAST signalling questions is summarised in figure 3, and the summary risk of bias for the participants, predictors, outcome and analysis domains and assessment of applicability are shown in figure 4. More than half of the studies were at low risk of bias for the population and outcomes domains, but all studies were at high risk of bias in the analysis domain.
Figure 3

Heatmap of signalling questions from risk of bias assessment with PROBAST. PROBAST questions (additional details in online supplemental file 5) Participants 1: what study design was used and was it appropriate? Participants 2: were all inclusion and exclusion criteria appropriate? Predictors 1: were predictors defined and assessed the same way for all participants? Predictors 2: were predictor assessments made without knowledge of data outcome? Predictors 3: are all predictors available at the time the model was intended to be used? Outcome 1: was the outcome determined appropriately? Outcome 2: was the outcome pre-specified or standard? Outcome 3: were predictors excluded from outcome definition? Outcome 4: was the outcome defined and determined in a similar way for all participants? Outcome 5: was the outcome determined without predictor information? Outcome 6: was the time interval between predictor assessment and outcome determination appropriate? Analysis 1: were there a reasonable number of participants with the outcome? Analysis 2: were continuous and categorical variables handled appropriately? Analysis 3: were all enrolled participants included in the analysis? Analysis 4: were participants with missing data handled appropriately? Analysis 5: was selection of predictors based on univariable analysis avoided? Analysis 6: were complexities in data (censoring, competing risks, sampling of control participants) accounted for appropriately? Analysis 7: were relevant model performance measures evaluated appropriately? Analysis 8: were model overfitting, underfitting, and optimism in the model performance accounted for? Analysis 9: do predictors and their assigned weights in the final model correspond to the results from the reported multivariable analysis?

Figure 4

Summary of risk of bias and applicability assessment with PROBAST. PROBAST, Prediction Model Risk of Bias Assessment Tool.

Common sources of population bias included use of non-nested case–control design,29 30 nested case–control design without proper estimation of baseline risk,31 32 or inappropriate inclusion/exclusion criteria.33 34 Sources of predictor bias included lack of standardised assessment of key predictors (ie, HIV, diabetes, chest X-ray scoring)9 29 31 34–36 or timing of data collection/availability that would limit the intended use of the model.9 29 37 Within the outcomes domain, sources of bias included subjective35 or non-standard32 38 outcome measures and inconsistent outcome ascertainment.29 Bias in the analysis domain was widespread. More than half of the models included were likely overfit due to low events per variable ratios (table 5). Only six studies handled continuous and categorical variables appropriately (ie, did not dichotomise continuous variables, considered non-linearity of continuous variables).31 39–43 Most studies used complete case analysis or did not mention missing data; no study used multiple imputation in their main analysis. 
One study with low amounts of missing data (<5%) conducted sensitivity analysis with multiple imputation.44 A different study excluded only two people with missing data out of a total sample size of 1007, which would have little impact on model performance.45 Fewer than half (n=14) of studies avoided univariable predictor selection, and only three studies used survival analysis, appropriately accounting for censoring.36 45 46 Performance measures were appropriately reported (ie, calibration assessed with plot and discrimination assessed with c-statistic/AUROC) in three studies.41 44 47 Only two studies estimated optimism (degree to which data are overfit) or accounted for potential overfitting with penalisation of model parameters.35 41 Ten studies appropriately presented their model with model coefficients or nomograms, which prevents bias from rounding or transforming model coefficients to generate a risk score.30 33 35 37 38 45 47–55 About half of the models (n=19, 51%) were applicable to the review question in all domains. However, unclear reporting of target population or predictor and outcome definitions limited assessment of applicability for several studies.38 49 50 56 57 Additionally, some studies included only hospitalised patients with specific laboratory parameters, which may not be routinely available in the clinical setting.39 40 42 Results from analyses stratified by inclusion of patients with MDR and minimum age <18 are presented in online supplemental file 8.
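
The events per variable (EPV) ratio referred to above can be sketched as follows. This is an illustrative Python example only: 94 is the median number of outcome events from table 5, whereas the number of non-events, the number of candidate parameters and the function name are hypothetical. The threshold of 10 is a commonly cited rule of thumb assumed here for illustration, not a criterion stated in this review.

```python
def events_per_variable(n_events, n_non_events, n_parameters):
    """EPV: size of the smaller outcome group divided by the number of
    candidate predictor parameters considered during model development."""
    limiting_group = min(n_events, n_non_events)
    return limiting_group / n_parameters


# 94 events is the median reported in table 5; the other inputs are hypothetical.
epv = events_per_variable(n_events=94, n_non_events=533, n_parameters=16)

# Assumed rule of thumb: EPV below 10 suggests a risk of overfitting.
RULE_OF_THUMB_EPV = 10
at_risk_of_overfitting = epv < RULE_OF_THUMB_EPV
print(epv, at_risk_of_overfitting)  # 5.875 True
```

Counting candidate parameters rather than only the predictors retained in the final model matters, because predictors that were screened out still contributed to the risk of overfitting.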

Discussion

In this comprehensive, systematic review of prediction models for pulmonary TB treatment outcomes, we identified 33 model development studies presenting 37 prediction models. Although diagnostic prediction models for prevalent TB were previously systematically reviewed, this is the first review of TB treatment outcomes.58 The included prediction models were developed for predicting death, treatment failure, default or a composite unfavourable outcome during TB treatment. Most models reported good performance (c-statistic/AUROC >0.7), but all were evaluated to have high risk of bias due to poor reporting, exclusion of missing data, weak methodologic approaches, lack of calibration assessment and limited validation. Population heterogeneities, such as differences in inclusion/exclusion of individuals with MDR and younger ages, and varying predictor and outcome definitions limited comparisons between models. More than half of the models included in the review were developed in low-burden TB settings, and none were developed specifically in South America. Prediction of TB treatment outcome is especially important in high-burden TB settings, where resources may be limited, and risk assessment can guide resource allocation toward patients who need the most involved care. Common risk factors included in the models were consistent with well-established risk factors for poor TB treatment outcomes, including age, sex, HIV, extrapulmonary TB, baseline smear results and previous TB treatment. Among studies that included PLWH, only three considered factors related to management/severity of HIV, such as receipt of antiretroviral therapy, CD4 cell count or viral load, which likely impacted TB treatment outcomes.40 46 51 Laboratory values or metabolic biomarkers, such as haemoglobin, haemoglobin A1c or random blood glucose, may also be associated with treatment outcome and worth considering as candidate predictors. 
There is increasing evidence that diabetes impacts TB treatment outcomes, but caution is warranted about how to best define diabetes in the context of a prediction model to ensure consistency and reproducibility across studies.59 Behavioural characteristics, such as tobacco use, alcohol use and drug use were rarely included in final prediction models and are difficult to collect objectively, suggesting their role in prediction models for TB treatment outcomes may be limited. Additionally, several studies excluded participants with HIV, diabetes, extrapulmonary TB or MDR TB, because these factors negatively influence treatment outcomes. However, careful consideration should be given to inclusion/exclusion criteria in prediction model studies, given that information should be available at the time of intended model use, which may not always hold for these aforementioned factors.60 This is especially questionable for MDR, given that conventional drug-susceptibility testing results are not available for several weeks after TB diagnosis; though more recent advances in rapid molecular methods such as GeneXpert or line-probe assays offer rapid screening.61 TB researchers should thoughtfully consider how to appropriately handle complexities of censoring and competing risks in TB outcomes research. Only three studies in this review used survival analysis, despite the long duration of TB treatment outcome assessment and relatively high rates of losses to follow-up across studies, and no studies considered competing risks, such as death due to other causes.62 Losses to follow-up were frequently excluded, which can lead to selection bias. 
Though all included studies were at high risk of bias in the analysis domain, we want to highlight two studies with some exemplary characteristics.41 44 Pefura-Yone et al41 provide clear explanations of study design, inclusion/exclusion criteria and data collection procedures; TB diagnosis and treatment outcome definitions were standard.63 Non-linearity of continuous variables was considered with restricted cubic splines, and no continuous variables were categorised or dichotomised; the final model includes four predictors that are easy to collect and routinely assessed in most TB control programmes, especially those in high-burden settings. The performance of the model was internally validated with bootstrap validation, and the discrimination (c-statistic=0.808) was corrected for optimism. Model calibration was presented graphically with calibration plots. The final model was presented as a nomogram with instructions for use, which facilitates use in external validation studies. Gupta-Wright and colleagues developed and externally validated a clinical risk score to predict mortality in high-burden, low-resource settings.43 They used clinical trial data with very low amounts of missing data for model development, and externally validated the clinical risk score with data collected independently from two other studies (a clinical trial and a prospective cohort). Given high amounts (42%) of missing data in the validation cohort, they conducted sensitivity analysis using multiple imputation for missing data; the c-statistic differed slightly between complete case and multiple-imputation analyses in the validation cohort (0.68 vs 0.64). 
Candidate predictors were based on a priori clinical knowledge and previous literature, and the required variables were objective, reproducible and available in low-resource settings, consistent with recommended approaches.26 60 64 Additionally, they reported model performance with c-statistics and calibration plots for development and validation cohorts, and reported results according to TRIPOD (transparent reporting of a multivariable prediction model for individual prognosis or diagnosis) guidance.27 28 Regardless, each of these models requires external validation prior to use in clinical practice. There are several limitations of this study. Data extraction was subject to reporting in the primary studies, which varied widely and was often incomplete, leading to challenges evaluating differences in model performance due to heterogeneities in study populations. Additionally, though most studies reported discrimination, few presented a calibration curve, arguably the most important measure of model performance, further inhibiting assessment and comparison of model performance.28 65 We did not include external validation studies, although external validation is an essential step for translation to clinical practice. However, several studies in the review did not include the full model equation, which impedes their ability to be externally validated. On searching for studies that externally validated prediction models in this review, we found three studies66–68 that evaluated the same model (TBscore).36 Briefly, these studies evaluated the ability of TBscore to monitor treatment response in a new setting,66 refined the instrument (TBscoreII) using exploratory factor analysis,67 and then evaluated TBscoreII for use in patients with TB/HIV.68 To our knowledge, no other studies included in the review were externally validated by other sources. 
Finally, we excluded 10 studies that were not available in English, Spanish, Portuguese or French; all abstracts were available in English, and none reported model performance metrics, so they likely would have been excluded for different reasons regardless. The findings of this review not only serve as a comprehensive overview of existing TB outcome prediction models but can act as a resource for future model development and validation of prediction models for TB treatment outcomes. We encourage researchers to focus future TB outcome prediction models on easily collected and readily available predictors that are widely generalisable. We highlight age, sex, extrapulmonary TB, BMI, chest X-ray results, previous TB and HIV as common predictors of TB treatment outcomes. Additionally, when building a new prediction model, it is recommended to first prune the set of considered predictors based on expert opinion and previous literature, rather than univariable analysis or variable selection processes.26 60 64 Future model development or validation studies should adhere to the TRIPOD guidelines, which provide a 22-item checklist and aim to improve the reporting of prediction model development studies.27 28 We also encourage researchers to consider PROBAST criteria to limit bias in design and conduct of prognostic studies. Prediction models are an important tool in TB management. They can lay the foundation for future impact studies by providing risk estimation to target novel treatment approaches, resource allocation or intensive case management towards patients who are least likely to achieve cure and most likely to benefit from intervention, especially in high-burden and low-resource areas. Use of prediction models can potentially help guide TB treatment practices to achieve the End TB Strategy goal of >90% treatment success, but methodologic rigour and detailed reporting must be improved. 
Though our findings suggest that none of the existing models are ready for clinical application without extensive external validation, we hope they direct future researchers to make use of guidelines for development and reporting of prediction models.
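
The bootstrap optimism correction mentioned in the discussion of internal validation can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure of any included study: the data are simulated, the logistic model is fitted with plain gradient ascent for self-containment, and all function names, the number of bootstrap replicates and the seeds are hypothetical choices.

```python
import math
import random


def fit_logistic(X, y, steps=200, lr=0.1):
    """Fit logistic regression by gradient ascent (illustration only)."""
    w = [0.0] * (len(X[0]) + 1)  # w[0] is the intercept
    for _ in range(steps):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))
            p = 1.0 / (1.0 + math.exp(-z))
            grad[0] += yi - p
            for j, xj in enumerate(xi):
                grad[j + 1] += (yi - p) * xj
        w = [wj + lr * gj / len(X) for wj, gj in zip(w, grad)]
    return w


def predict(w, X):
    """Predicted probabilities from fitted weights."""
    return [1.0 / (1.0 + math.exp(-(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi)))))
            for xi in X]


def c_statistic(predicted, outcomes):
    """Pairwise concordance between events and non-events; ties count as half."""
    events = [p for p, y in zip(predicted, outcomes) if y == 1]
    non_events = [p for p, y in zip(predicted, outcomes) if y == 0]
    total = sum(1.0 if e > n else 0.5 if e == n else 0.0
                for e in events for n in non_events)
    return total / (len(events) * len(non_events))


def optimism_corrected_c(X, y, n_boot=30, seed=1):
    """Return (apparent c, mean optimism, optimism-corrected c)."""
    rng = random.Random(seed)
    n = len(y)
    apparent = c_statistic(predict(fit_logistic(X, y), X), y)
    optimism_values = []
    while len(optimism_values) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        Xb, yb = [X[i] for i in idx], [y[i] for i in idx]
        if sum(yb) in (0, n):
            continue  # redraw if the bootstrap sample has a single outcome class
        wb = fit_logistic(Xb, yb)
        c_boot = c_statistic(predict(wb, Xb), yb)  # performance in the bootstrap sample
        c_orig = c_statistic(predict(wb, X), y)    # same model applied to the original data
        optimism_values.append(c_boot - c_orig)
    optimism = sum(optimism_values) / n_boot
    return apparent, optimism, apparent - optimism


# Simulated data: only the first of three predictors is related to the outcome.
data_rng = random.Random(0)
X = [[data_rng.gauss(0, 1) for _ in range(3)] for _ in range(60)]
y = [1 if xi[0] + data_rng.gauss(0, 1) > 0.5 else 0 for xi in X]

apparent, optimism, corrected = optimism_corrected_c(X, y)
```

The corrected value estimates how the model would discriminate in new patients from the same population; it does not replace external validation in a different setting.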
Table 3

Characteristics of patient populations in the 33 included studies with prediction models for TB treatment outcomes

| Characteristics | Studies reporting characteristic, n (%) | Categories | N (%) or median (IQR) |
|---|---|---|---|
| Sample size | 33 (100) | | 803 (291–4167) |
| Study duration, years | 32 (97) | | 4 (2–7) |
| Study design | 33 (100) | Prospective cohort | 3 (9) |
| | | Retrospective cohort | 25 (76) |
| | | Nested case–control | 3 (9) |
| | | Non-nested case–control | 2 (6) |
| Data source | 33 (100) | Medical record | 6 (18) |
| | | National registry or surveillance system | 13 (39) |
| | | Local registry or surveillance system | 1 (3) |
| | | Regional registry or surveillance system | 2 (6) |
| | | Data collection form for study purposes | 11 (33) |
| Study region | 32 (97) | Africa | 8 (25) |
| | | Asia | 13 (41) |
| | | Europe | 6 (19) |
| | | North America | 4 (12) |
| | | South America | 0 (0) |
| | | Global | 1 (3) |
| High burden TB setting* | 31 (94) | All | 13 (42) |
| | | Some | 1 (3) |
| | | None | 17 (55) |
| Missing data | 18 (54) | Complete case analysis | 9 (50) |
| | | Missing indicator method | 4 (22) |
| | | Heckman’s method | 1 (6) |
| | | Simple imputation | 2 (12) |
| | | Sensitivity analysis with imputation | 1 (6) |
| | | Other | 1 (5) |
| Number of models developed | 33 (100) | 1 | 25 (76) |
| | | 2 | 4 (12) |
| | | 3 | 1 (3) |
| | | 4 | 2 (6) |
| | | 7 | 1 (3) |
| Reasons for multiple models developed | 8 (24) | Different outcomes | 1 (12) |
| | | Different predictors considered | 4 (50) |
| | | Different methods | 2 (25) |
| | | Different outcomes | 1 (12) |
| | | Different populations and outcomes | 1 (12) |

*Determined based on study location and WHO list of 30 countries with high-burden TB in the 2019 Global Tuberculosis Report (1).

TB, tuberculosis.
