Denis Valle, Joanna M Tucker Lima, Justin Millar, Punam Amratia, Ubydul Haque.
Abstract
BACKGROUND: Logistic regression is a statistical model widely used in cross-sectional and cohort studies to identify and quantify the effects of potential disease risk factors. However, the impact of imperfect tests on adjusted odds ratios (and thus on the identification of risk factors) is under-appreciated. The purpose of this article is to draw attention to the problem associated with modelling imperfect diagnostic tests, and to propose simple Bayesian models that adequately address this issue.
Year: 2015 PMID: 26537373 PMCID: PMC4634725 DOI: 10.1186/s12936-015-0966-y
Source DB: PubMed Journal: Malar J ISSN: 1475-2875 Impact factor: 2.979
Summary of the proposed statistical models, their assumptions regarding the diagnostic method, and the additional data required to fit these models
| Model | Additional data requirement | Assumptions related to detection |
|---|---|---|
| Standard logistic regression | None | Perfect detection (i.e., sensitivity and specificity equal to 100 %) |
| Bayesian model 1 | Estimates of sensitivity and specificity | Sensitivity and specificity are perfectly known constants, equal to the estimates from an external study |
| Bayesian model 2 | Data on sensitivity and specificity (i.e., | Sensitivity and specificity are constants and external study provides reasonable prior information on sensitivity and specificity for the target study |
| Bayesian model 3 | Subset of individuals diagnosed with the regular and the gold standard method | Sensitivity and specificity can vary as a function of covariates. This model does not rely on data from external study (i.e., does not rely on transportability assumption) |
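The detection assumptions in the table can be made concrete with a small sketch. It assumes the standard misclassification relation for a test with constant sensitivity (SN) and specificity (SP), namely P(test positive) = SN · p + (1 − SP) · (1 − p), where p is the true infection probability from a logistic model. This relation is not quoted from the article, and the function names and the coefficients `beta0 = -1.0`, `beta1 = 1.0` are hypothetical, chosen only for illustration. The SN and SP pairs are the scenarios listed in the figure captions.

```python
import math


def logistic(z):
    """Inverse logit: maps a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def observed_positive_prob(p_infected, sn, sp):
    """Probability of a positive test result given the true infection probability.

    True positives occur with probability sn * p, false positives with
    probability (1 - sp) * (1 - p).
    """
    return sn * p_infected + (1.0 - sp) * (1.0 - p_infected)


def odds(p):
    return p / (1.0 - p)


def apparent_odds_ratio(beta0, beta1, sn, sp):
    """Odds ratio for a binary risk factor as seen through an imperfect test.

    beta0 and beta1 are hypothetical true logistic regression coefficients.
    """
    p_unexposed = logistic(beta0)
    p_exposed = logistic(beta0 + beta1)
    q_unexposed = observed_positive_prob(p_unexposed, sn, sp)
    q_exposed = observed_positive_prob(p_exposed, sn, sp)
    return odds(q_exposed) / odds(q_unexposed)


if __name__ == "__main__":
    beta0, beta1 = -1.0, 1.0  # illustrative values, not from the article
    print("true odds ratio:", round(math.exp(beta1), 3))
    for sn, sp in [(1.0, 1.0), (0.6, 0.9), (0.9, 0.9), (0.6, 0.98), (0.9, 0.98)]:
        print(sn, sp, round(apparent_odds_ratio(beta0, beta1, sn, sp), 3))
```

With perfect detection the apparent odds ratio equals the true one. With imperfect detection it is pulled toward 1, which is the bias that a standard logistic regression ignores when it treats test results as the true infection status.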
Fig. 1. The proposed Bayesian models have a much better 95 % CI coverage than the standard logistic regression model. The 95 % confidence/credible interval (CI) coverage of four different methods is shown for different scenarios of sensitivity (SN) and specificity (SP): SN = 0.6 and SP = 0.9 (upper left panel); SN = 0.9 and SP = 0.9 (upper right panel); SN = 0.6 and SP = 0.98 (lower left panel); SN = 0.9 and SP = 0.98 (lower right panel). These results are based on 100 simulated datasets, with 2000 individuals in each dataset. Results closer to 0.95 (blue horizontal dashed lines) indicate better performance.
Fig. 2. The Bayesian models outperformed the standard logistic regression model based on the MSE criterion. The mean squared error (MSE) of four different methods is shown for different scenarios of sensitivity (SN) and specificity (SP): SN = 0.6 and SP = 0.9 (upper left panel); SN = 0.9 and SP = 0.9 (upper right panel); SN = 0.6 and SP = 0.98 (lower left panel); SN = 0.9 and SP = 0.98 (lower right panel). These results are based on 100 simulated datasets, with 2000 individuals in each dataset. Smaller values indicate better performance.
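The two performance criteria named in these captions can be sketched as follows. This is a minimal illustration, assuming both are computed per regression coefficient against its known simulated value. The function names and the toy numbers are hypothetical and are not results from the article.

```python
def ci_coverage(intervals, true_value):
    """Fraction of (lower, upper) intervals that contain the true value."""
    hits = sum(1 for lower, upper in intervals if lower <= true_value <= upper)
    return hits / len(intervals)


def mean_squared_error(estimates, true_value):
    """Average squared distance between point estimates and the true value."""
    return sum((est - true_value) ** 2 for est in estimates) / len(estimates)


if __name__ == "__main__":
    true_beta = 1.0  # hypothetical simulated coefficient
    # One point estimate and one 95 % interval per simulated dataset (toy values).
    estimates = [0.9, 1.1, 0.6, 1.3]
    intervals = [(0.7, 1.1), (0.8, 1.4), (0.3, 0.9), (0.9, 1.6)]
    print("coverage:", ci_coverage(intervals, true_beta))
    print("MSE:", mean_squared_error(estimates, true_beta))
```

A well-calibrated 95 % interval should contain the true value in roughly 95 % of simulated datasets, which is why coverage near 0.95 is the target, while MSE penalises both bias and variance of the point estimates.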
Fig. 3. Results from the standard logistic regression (black) and Bayesian model 3 (blue). The left panel shows that inference regarding disease risk factors can differ substantially between the standard logistic regression and Bayesian model 3. Stars indicate 95 % confidence/credible intervals that did not include zero. The ‘Time’ covariate refers to time of residence in the region. The right panels show that both microscopy sensitivity and infection probability decrease as a function of time living in this region.