Sandra Alba, Ente Rood, Fulvia Mecatti, Jennifer M Ross, Peter J Dodd, Stewart Chang, Matthys Potgieter, Gaia Bertarelli, Nathaniel J Henry, Kate E LeGrand, William Trouleau, Debebe Shaweno, Peter MacPherson, Zhi Zhen Qin, Christina Mergenthaler, Federica Giardina, Ellen-Wien Augustijn, Aurangzaib Quadir Baloch, Abdullah Latif.
Abstract
Pakistan's national tuberculosis control programme (NTP) is among the many programmes worldwide that value subnational tuberculosis (TB) burden estimates for supporting disease control efforts but lack reliable ones. A hackathon was therefore organised to solicit the development and comparison of several models for small area estimation of TB. The TB hackathon was launched in April 2019. Participating teams were asked to produce district-level estimates of bacteriologically positive TB prevalence among adults (over 15 years of age) for 2018. The NTP provided case-based data from its 2010–2011 TB prevalence survey, along with data on TB screening, testing and treatment for the period from 2010–2011 to 2018. Five teams submitted district-level TB prevalence estimates, methodological details and programming code. Although the geographical distribution of TB prevalence varied considerably across models, we identified several districts with consistently low notification-to-prevalence ratios. The hackathon highlighted the challenges of generating granular spatiotemporal TB prevalence forecasts from cross-sectional prevalence survey data and other data sources. Nevertheless, it provided a range of approaches to subnational disease modelling. The NTP's use of, and plans for, these outputs show that, limitations notwithstanding, they can be valuable for programme planning.
Keywords: forecasting; predictive modelling; small area estimation; spatial epidemiology; subnational prevalence; tuberculosis burden
Year: 2022 PMID: 35051129 PMCID: PMC8780063 DOI: 10.3390/tropicalmed7010013
Source DB: PubMed Journal: Trop Med Infect Dis ISSN: 2414-6366
Datasets made available to TB hackathon modelers by Pakistan NTP.
| Dataset | Disaggregation | Time Period |
|---|---|---|
| 1. Prevalence survey data 1 | Individual | 2010–2011 |
| 2. TB notifications | District | Quarterly |
| 3. Laboratory External Quality Assessment data | District | Quarterly |
| 4. Drug-sensitive TB treatment outcomes data | District | Quarterly |
| 5. Drug-resistant TB notifications | District | Quarterly |
| 6. Master list of TB facilities | Health facility | 2019 |
| 7. Sputum smear testing data | District | Quarterly |
| 8. Private sector notifications | District | Yearly |
| 9. HIV registrations | Province | 2001–2018 |
| 10. HIV testing rates among TB cases | District | Quarterly |
| 11. Census Population estimates | District | 2017 |
| 12. Shape files | District | 2019 |
1 Including the village names corresponding to the survey clusters.
Figure 1. Proportion of bacteriologically positive TB cases (out of all people tested), by cluster, in Pakistan's 2010–2011 prevalence survey [21].
Model specifications.
| | Model 1 | Model 2 | Model 3 | Model 4 | Model 5 |
|---|---|---|---|---|---|
| Model type | Binomial-logistic regression | Binomial-logistic regression | Binomial-logistic regression | Small Area Estimation (SAE) and Latent Markov (LM) modelling as linking model for SAE | Self-Organising Maps (SOM) on binomial |
| Inference | Bayesian inference with Markov Chain Monte Carlo with No-U-Turn Sampler (NUTS) | Approximate Bayesian inference with integrated nested Laplace approximation | Approximate Bayesian inference with Broyden–Fletcher–Goldfarb–Shanno algorithm | Bayesian inference with Data Augmentation Markov Chain Monte Carlo and Gibbs sampler | Bayesian Artificial Neural Network |
| Model structure | Spatially explicit hierarchical model with fixed and random effects | Spatially explicit hierarchical model with fixed and random effects | Spatially explicit hierarchical model with fixed and random effects | Hierarchical discrete latent state model depending on a Gaussian linking model | N/A |
| Outcome data | Bacteriologically confirmed TB cases from TB prevalence survey at cluster level by age and sex | Bacteriologically confirmed TB cases from TB prevalence survey at cluster level | Bacteriologically confirmed TB cases from TB prevalence survey at cluster level | Bacteriologically confirmed TB cases from TB prevalence survey at district level | Bacteriologically confirmed TB cases from TB prevalence survey at district level |
| Covariates | SES, HH size, | Age 15–24 | Population density | Urban households | All-forms TB notifications |
1 x1:x2 denotes factor multiplication (the interaction term), while x1 * x2 denotes factor crossing and is equivalent to x1 + x2 + x1:x2.
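The formula notation in the footnote above can be illustrated with a minimal Python sketch. The helper names `expand_crossing` and `design_row` are hypothetical and are not taken from any team's submitted code; the sketch only assumes two numeric covariates, for which the interaction column is the product of the two values.

```python
def expand_crossing(term: str) -> list[str]:
    """Expand a crossing 'x1 * x2' into main effects plus the interaction x1:x2."""
    factors = [f.strip() for f in term.split("*")]
    if len(factors) != 2:
        raise ValueError("this sketch handles exactly two factors")
    a, b = factors
    return [a, b, f"{a}:{b}"]


def design_row(x1: float, x2: float) -> dict[str, float]:
    """Design-matrix columns for one observation under x1 * x2 (numeric covariates)."""
    # The interaction column x1:x2 is the product of the two covariate values.
    return {"x1": x1, "x2": x2, "x1:x2": x1 * x2}


print(" + ".join(expand_crossing("x1 * x2")))  # prints: x1 + x2 + x1:x2
```

For categorical factors the interaction is built from products of indicator columns instead, which this sketch does not cover.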
Predictions: data quality appraisal.
| | Model 1 | Model 2 | Model 3 | Model 4 | Model 5 |
|---|---|---|---|---|---|
| Predicted prevalence 1 | Min = 104 | Min = 276 | Min = 51 | Min = 0 | Min = 44 |
| Districts with predictions 2 | 131 | 143 | 142 | 94 | 139 |
| Cross-validation (LOOCV) 3 | R2 = 0.404 | R2 = 0.320 | R2 = 0.733 4 | R2 = 0.115 | |
| Pairwise correlation 5 | Model 2: r = −0.0882 | Model 1: r = −0.0882 | Model 1: r = 0.2305 | Model 1: r = −0.0041 | Model 1: r = 0.0001 |
| Uncertainty ratio 6 | Ratio = 2.69 | Ratio = 0.78 | Ratio = 5.30 | Ratio = 0.63 | Ratio = 2.06 |
| Expert credibility grade 7 | Rater 1: 3 | Rater 1: 5 | Rater 1: 7 | Rater 1: 4 | Rater 1: 7 |
1 Prevalence per 100,000 inhabitants.
2 Out of 143 districts. The difference between 136 and 143 is accounted for by districts in contested areas of Pakistan: 1 district in India-administered Kashmir, 1 district in Pakistan-administered Kashmir, 3 districts in the Federally Administered Tribal Area (FATA) and 2 districts in Balochistan.
3 LOOCV comparing final model estimates for 2010–2011 with actual prevalence survey cluster-level estimates for 2010–2011. This could not be calculated for Model 4 as LOOCV metrics are not practical for SAE-LM models (computationally too intensive) and were not produced by Model 5.
4 When performing cross-validation, Model 3 excluded each cluster from the original survey; this meant that for cluster observations that were originally geo-matched to admin3 units and then resampled to multiple admin4 centroids, all down-sampled points corresponding to a single survey observation were excluded from a single out-of-sample run.
5 Pairwise correlations of district-level central estimates of TB prevalence predictions (Pearson’s correlation coefficient).
6 Ratio = [(upper limit of 95% credible interval) − (lower limit of 95% credible interval)]/(prevalence estimate).
7 Four TB experts from the Pakistan National TB Control Programme were asked to grade models from 1 to 10 based on how credible they deemed the model estimates.
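The two summary measures defined in footnotes 5 and 6 can be sketched in plain Python as follows. The function names and all input values are hypothetical illustrations, not hackathon data, and this is not the code used by the authors.

```python
from math import sqrt


def relative_uncertainty(lower: float, upper: float, estimate: float) -> float:
    """Footnote 6: width of the 95% credible interval divided by the point estimate."""
    if estimate <= 0:
        raise ValueError("prevalence estimate must be positive")
    return (upper - lower) / estimate


def pearson_r(xs: list[float], ys: list[float]) -> float:
    """Footnote 5: Pearson correlation between two models' district-level estimates."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long series with at least two values")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / sqrt(sxx * syy)


# Hypothetical district: estimate 300 per 100,000 with a 95% interval of 150 to 600.
print(relative_uncertainty(150.0, 600.0, 300.0))  # prints: 1.5
```

A ratio above 1 means the credible interval is wider than the estimate itself, so lower values indicate tighter predictions relative to their size.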
Figure 2. Predicted district-level TB prevalence in 2018, by model.
Figure 3. Ratio of the 2018 new and relapse bacteriologically positive TB notification rate to the predicted prevalence in each district, by model.
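As a rough illustration of how the notification-to-prevalence ratio in Figure 3 could be used to flag districts that are low under every model, consider the sketch below. The district names, model names, rates and the 0.5 threshold are all assumptions made for the example; the source does not state a cut-off or the procedure the authors used.

```python
def notification_to_prevalence(notification_rate: float, prevalence: float) -> float:
    """Ratio of notification rate to predicted prevalence (both per 100,000)."""
    if prevalence <= 0:
        raise ValueError("predicted prevalence must be positive")
    return notification_rate / prevalence


def consistently_low(ratios_by_model: dict[str, dict[str, float]],
                     threshold: float) -> list[str]:
    """Districts whose ratio falls below the threshold under every model."""
    models = list(ratios_by_model.values())
    if not models:
        return []
    # Only districts predicted by all models can be compared across models.
    shared = set(models[0]).intersection(*models[1:])
    return sorted(d for d in shared if all(m[d] < threshold for m in models))


# Hypothetical inputs: notification rates and predicted prevalence per 100,000.
notifications = {"District A": 60.0, "District B": 150.0}
predicted = {
    "Model X": {"District A": 300.0, "District B": 200.0},
    "Model Y": {"District A": 240.0, "District B": 500.0},
}
ratios = {
    model: {d: notification_to_prevalence(notifications[d], p) for d, p in preds.items()}
    for model, preds in predicted.items()
}
print(consistently_low(ratios, threshold=0.5))  # prints: ['District A']
```

Restricting the comparison to districts covered by every model matters here because the models in the appraisal table produced predictions for different numbers of districts.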