A discussion of calibration techniques for evaluating binary and categorical predictive models.

Caroline Fenlon, Luke O'Grady, Michael L Doherty, John Dunnion.

Abstract

Modelling of binary and categorical events is a commonly used tool to simulate epidemiological processes in veterinary research. Logistic and multinomial regression, naïve Bayes, decision trees and support vector machines are popular data mining techniques used to predict the probabilities of events with two or more outcomes. Thorough evaluation of a predictive model is important to validate its ability for use in decision-support or broader simulation modelling. Measures of discrimination, such as sensitivity, specificity and receiver operating characteristics, are commonly used to evaluate how well the model can distinguish between the possible outcomes. However, these discrimination tests cannot confirm that the predicted probabilities are accurate and without bias. This paper describes a range of calibration tests, which typically measure the accuracy of predicted probabilities by comparing them to mean event occurrence rates within groups of similar test records. These include overall goodness-of-fit statistics in the form of the Hosmer-Lemeshow and Brier tests. Visual assessment of prediction accuracy is carried out using plots of calibration and deviance (the difference between the outcome and its predicted probability). The slope and intercept of the calibration plot are compared to the perfect diagonal using the unreliability test. Mean absolute calibration error provides an estimate of the level of predictive error. This paper uses sample predictions from a binary logistic regression model to illustrate the use of calibration techniques. Code is provided to perform the tests in the R statistical programming language. The benefits and disadvantages of each test are described. Discrimination tests are useful for establishing a model's diagnostic abilities, but may not suitably assess the model's usefulness for other predictive applications, such as stochastic simulation. 
Calibration tests may be more informative than discrimination tests for evaluating models with a narrow range of predicted probabilities or overall prevalence close to 50%, which are common in epidemiological applications. Using a suite of calibration tests alongside discrimination tests allows model builders to thoroughly measure their model's predictive capabilities.
Copyright © 2017 Elsevier B.V. All rights reserved.
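
The grouped comparison described in the abstract (predicted probabilities set against observed event rates within groups of similar records) can be sketched as follows. This is a minimal illustration in Python, not the R code that accompanies the paper. The function names, the equal-sized grouping by sorted prediction, and the unweighted form of mean absolute calibration error are all assumptions made for the example; the paper's own definitions may differ.

```python
from typing import List, Sequence, Tuple


def brier_score(y: Sequence[int], p: Sequence[float]) -> float:
    """Mean squared difference between outcomes (0/1) and predicted probabilities."""
    return sum((pi - yi) ** 2 for yi, pi in zip(y, p)) / len(y)


def risk_groups(y: Sequence[int], p: Sequence[float],
                n_groups: int = 10) -> List[Tuple[int, float, float]]:
    """Sort records by predicted probability and split them into near-equal groups.

    Returns one (size, mean predicted probability, observed event rate)
    tuple per non-empty group. Equal-sized grouping is an assumption here.
    """
    pairs = sorted(zip(p, y))
    n = len(pairs)
    groups = []
    for g in range(n_groups):
        chunk = pairs[g * n // n_groups:(g + 1) * n // n_groups]
        if not chunk:
            continue
        size = len(chunk)
        mean_p = sum(pi for pi, _ in chunk) / size
        rate = sum(yi for _, yi in chunk) / size
        groups.append((size, mean_p, rate))
    return groups


def hosmer_lemeshow_statistic(y: Sequence[int], p: Sequence[float],
                              n_groups: int = 10) -> float:
    """Hosmer-Lemeshow chi-square statistic (statistic only, no p-value).

    Sums (observed - expected)^2 / (n * pbar * (1 - pbar)) over the groups.
    It is usually compared to a chi-square distribution with
    (number of groups - 2) degrees of freedom.
    """
    stat = 0.0
    for size, mean_p, rate in risk_groups(y, p, n_groups):
        expected = size * mean_p
        observed = size * rate
        denom = expected * (1 - mean_p)
        if denom > 0:  # skip groups whose mean prediction is exactly 0 or 1
            stat += (observed - expected) ** 2 / denom
    return stat


def mean_absolute_calibration_error(y: Sequence[int], p: Sequence[float],
                                    n_groups: int = 10) -> float:
    """Unweighted mean of |observed rate - mean prediction| across groups.

    Assumed definition for illustration only.
    """
    groups = risk_groups(y, p, n_groups)
    return sum(abs(rate - mean_p) for _, mean_p, rate in groups) / len(groups)


if __name__ == "__main__":
    # Toy data, invented for the example: four records, two risk groups.
    outcomes = [0, 0, 1, 1]
    predictions = [0.1, 0.2, 0.8, 0.9]
    print(brier_score(outcomes, predictions))
    print(hosmer_lemeshow_statistic(outcomes, predictions, n_groups=2))
    print(mean_absolute_calibration_error(outcomes, predictions, n_groups=2))
```

A model can discriminate well and still score poorly on these measures. For example, predictions that rank records correctly but are all shifted towards 0.5 would leave ranking-based discrimination measures unchanged while increasing the grouped calibration error.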

Keywords:  Calibration; Data mining; Deviance; Discrimination; Evaluation; Predictive modelling

Year:  2017        PMID: 29290291     DOI: 10.1016/j.prevetmed.2017.11.018

Source DB:  PubMed          Journal:  Prev Vet Med        ISSN: 0167-5877            Impact factor:   2.670


  19 in total

1.  Derivation and validation of actionable quality indicators targeting reductions in complications for injury admissions.

Authors:  Abakar Idriss-Hassan; Mélanie Bérubé; Amina Belcaïd; Julien Clément; Gilles Bourgeois; Christine Rizzo; Xavier Neveu; Kahina Soltana; Jaimini Thakore; Lynne Moore
Journal:  Eur J Trauma Emerg Surg       Date:  2021-05-07       Impact factor: 3.693

2.  Predicting the risk of depression among adolescents in Nepal using a model developed in Brazil: the IDEA Project.

Authors:  Brandon Kohrt; Helen L Fisher; Rachel Brathwaite; Thiago Botter-Maio Rocha; Christian Kieling; Kamal Gautam; Suraj Koirala; Valeria Mondelli
Journal:  Eur Child Adolesc Psychiatry       Date:  2020-03-12       Impact factor: 5.349

3.  Population-based dementia prediction model using Korean public health examination data: A cohort study.

Authors:  Kyung Mee Park; Ji Min Sung; Woo Jung Kim; Suk Kyoon An; Kee Namkoong; Eun Lee; Hyuk-Jae Chang
Journal:  PLoS One       Date:  2019-02-12       Impact factor: 3.240

4.  Artificial neural networks improve and simplify intensive care mortality prognostication: a national cohort study of 217,289 first-time intensive care unit admissions.

Authors:  Gustav Holmgren; Peder Andersson; Andreas Jakobsson; Attila Frigyesi
Journal:  J Intensive Care       Date:  2019-08-16

5.  A tutorial on calibration measurements and calibration models for clinical prediction models. (Review)

Authors:  Yingxiang Huang; Wentao Li; Fima Macheret; Rodney A Gabriel; Lucila Ohno-Machado
Journal:  J Am Med Inform Assoc       Date:  2020-04-01       Impact factor: 4.497

6.  Predicting the individualized risk of poor adherence to ART medication among adolescents living with HIV in Uganda: the Suubi+Adherence study.

Authors:  Rachel Brathwaite; Fred M Ssewamala; Torsten B Neilands; Moses Okumu; Massy Mutumba; Christopher Damulira; Proscovia Nabunya; Samuel Kizito; Ozge Sensoy Bahar; Claude A Mellins; Mary M McKay
Journal:  J Int AIDS Soc       Date:  2021-06       Impact factor: 6.707

7.  Development of EndoScreen Chip, a Microfluidic Pre-Endoscopy Triage Test for Esophageal Adenocarcinoma.

Authors:  Julie A Webster; Alain Wuethrich; Karthik B Shanmugasundaram; Renee S Richards; Wioleta M Zelek; Alok K Shah; Louisa G Gordon; Bradley J Kendall; Gunter Hartel; B Paul Morgan; Matt Trau; Michelle M Hill
Journal:  Cancers (Basel)       Date:  2021-06-08       Impact factor: 6.575

8.  Development and external validation of a risk calculator to predict internalising symptoms among Ugandan youths affected by HIV.

Authors:  Rachel Brathwaite; Fred M Ssewamala; Torsten B Neilands; Proscovia Nabunya; William Byansi; Christopher Damulira
Journal:  Psychiatry Res       Date:  2021-05-28       Impact factor: 11.225

9.  Predicting the risk of future depression among school-attending adolescents in Nigeria using a model developed in Brazil.

Authors:  Rachel Brathwaite; Thiago Botter-Maio Rocha; Christian Kieling; Brandon A Kohrt; Valeria Mondelli; Abiodun O Adewuya; Helen L Fisher
Journal:  Psychiatry Res       Date:  2020-10-16       Impact factor: 11.225

10.  Developing an individualized risk calculator for psychopathology among young people victimized during childhood: A population-representative cohort study.

Authors:  Alan J Meehan; Rachel M Latham; Louise Arseneault; Daniel Stahl; Helen L Fisher; Andrea Danese
Journal:  J Affect Disord       Date:  2019-11-05       Impact factor: 4.839
