Post-Analysis of Predictive Modeling with an Epidemiological Example.

Christina Brester, Ari Voutilainen, Tomi-Pekka Tuomainen, Jussi Kauhanen, Mikko Kolehmainen.

Abstract

Post-analysis of predictive models fosters their application in practice, because domain experts want to understand the logic behind them. In epidemiology, methods that explain sophisticated models facilitate the use of up-to-date tools, especially in a high-dimensional predictor space. An important part of post-analysis is investigating how model performance varies across subjects with different conditions. This paper presents a model-independent approach to post-analysis that aims to reveal the subject conditions leading to low or high model performance relative to the average level on the whole sample. Conditions of interest are expressed as rules generated by a multi-objective evolutionary algorithm (MOGA). In this study, Lasso logistic regression (LLR) was trained to predict cardiovascular death by 2016 using data from the 1984-1989 examination of the Kuopio Ischemic Heart Disease Risk Factor Study (KIHD), which comprised 2682 subjects and 950 preselected predictors. After 50 independent runs of five-fold cross-validation, the model performance collected for each subject was used to generate rules describing "easy" and "difficult" cases. LLR with 61 selected predictors, on average, achieved 72.53% accuracy on the whole sample. Post-analysis, however, revealed three categories of subjects: "easy" cases with an LLR accuracy of 95.84%, "difficult" cases with an LLR accuracy of 48.11%, and the remaining cases with an LLR accuracy of 71.00%. Moreover, the rule analysis showed that medication was one of the main confusing factors that led to lower model performance. The proposed approach provides insight into the subject conditions that complicate predictive modeling.
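
The idea of scoring a rule against per-subject model performance can be illustrated with a minimal sketch. It is not the authors' implementation: the toy subjects, the predictor names (`age`, `on_medication`), the interval-based rule format, and the `covers` and `evaluate_rule` helpers are all hypothetical, and the abstract does not state which objectives the evolutionary algorithm optimizes. Here a rule is simply described by its coverage and by how far the mean accuracy of the covered subjects deviates from the whole-sample average.

```python
from statistics import mean

# Hypothetical toy data (NOT from KIHD). "accuracy" stands for the share of
# repeated cross-validation runs in which the model classified the subject
# correctly.
subjects = [
    {"age": 54, "on_medication": 1, "accuracy": 0.40},
    {"age": 48, "on_medication": 0, "accuracy": 0.96},
    {"age": 60, "on_medication": 1, "accuracy": 0.52},
    {"age": 42, "on_medication": 0, "accuracy": 0.98},
    {"age": 51, "on_medication": 0, "accuracy": 0.70},
]


def covers(rule, subject):
    """Return True if the subject satisfies every condition of the rule.

    A rule (assumed format) maps a predictor name to inclusive (low, high)
    bounds; all conditions are combined with a logical AND.
    """
    return all(low <= subject[name] <= high for name, (low, high) in rule.items())


def evaluate_rule(rule, subjects):
    """Score a rule by coverage and by deviation from the overall accuracy."""
    overall = mean(s["accuracy"] for s in subjects)
    covered = [s for s in subjects if covers(rule, s)]
    if not covered:
        # A rule that matches nobody carries no information.
        return {"coverage": 0.0, "accuracy": None, "delta": None}
    accuracy = mean(s["accuracy"] for s in covered)
    return {
        "coverage": len(covered) / len(subjects),
        "accuracy": accuracy,
        "delta": accuracy - overall,  # negative: "difficult", positive: "easy"
    }


# Two illustrative candidate rules.
difficult_rule = {"on_medication": (1, 1)}
easy_rule = {"on_medication": (0, 0), "age": (40, 50)}

for name, rule in [("difficult", difficult_rule), ("easy", easy_rule)]:
    print(name, evaluate_rule(rule, subjects))
```

In a multi-objective search, candidate rules of this kind would be compared on several such criteria at once (for example, wide coverage and a large deviation), which is why a single rule is scored with more than one number here.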

Keywords:  model performance; multi-objective optimization; post-analysis of data-driven models; prediction of cardiovascular death; rule design

Year:  2021        PMID: 34202622     DOI: 10.3390/healthcare9070792

Source DB:  PubMed          Journal:  Healthcare (Basel)        ISSN: 2227-9032

