The effect of imputing missing clinical attribute values on training lung cancer survival prediction model performance.

Mohamed S Barakat, Matthew Field, Aditya Ghose, David Stirling, Lois Holloway, Shalini Vinod, Andre Dekker, David Thwaites.

Abstract

According to estimates by the World Health Organization and the International Agency for Research on Cancer, lung cancer is the most common cause of death from cancer worldwide. Recent years have seen growing attention to the use of clinical decision support systems in medicine generally and in cancer care in particular. Such systems can predict a patient's likelihood of survival by learning from the records of previously treated patients. The datasets mined to develop clinical decision support functionality are often incomplete, which adversely affects the quality of the resulting models and of the decision support they offer. Imputing missing data using a statistical analysis approach is a common way of addressing the missing data problem. This work investigates the effect of imputation methods for missing data when preparing a training dataset for a Non-Small Cell Lung Cancer survival prediction model using several machine learning algorithms. The investigation includes an assessment of the effect of imputation algorithm error on prediction performance, and a comparison between training on a smaller, complete real dataset and training on a larger dataset that contains imputed data. Our results show that even when the proportion of records with some missing data is very high (> 80%), imputation can lead to prediction models with an AUC (0.68-0.72) comparable to that of models trained on complete data records.
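The two ways of preparing a training set that the abstract compares (keeping only complete records versus keeping every record and imputing the gaps), together with the AUC metric used to score the resulting models, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the abstract does not name the imputation algorithms or the clinical attributes, so column-mean imputation stands in for them, and the toy records and the helper names `complete_cases`, `mean_impute` and `auc` are hypothetical.

```python
from statistics import mean


def complete_cases(rows):
    """Keep only records with no missing (None) attribute values."""
    return [r for r in rows if all(v is not None for v in r)]


def mean_impute(rows):
    """Replace each None with the mean of the observed values in its column.

    Stand-in for the study's imputation methods, which the abstract does
    not name. Raises statistics.StatisticsError if a column has no
    observed values at all.
    """
    n_cols = len(rows[0])
    col_means = [
        mean(r[j] for r in rows if r[j] is not None) for j in range(n_cols)
    ]
    return [
        [col_means[j] if r[j] is None else r[j] for j in range(n_cols)]
        for r in rows
    ]


def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy records: two numeric attributes, None marks a missing value.
records = [[60, 1.0], [None, 2.0], [70, None], [80, 3.0]]

small_complete = complete_cases(records)  # fewer rows, all values real
large_imputed = mean_impute(records)      # every row kept, gaps filled

# Hypothetical predicted survival scores and true outcomes for four patients.
example_auc = auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0])
```

In a comparison of the kind the abstract describes, a model would be trained once on `small_complete` and once on `large_imputed`, and the two would be scored with `auc` on the same held-out records.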

Keywords:  Decision Support; Imputation; Missing data; Modeling; Lung Cancer

Year:  2017        PMID: 29255599      PMCID: PMC5718991          DOI: 10.1007/s13755-017-0039-4

Source DB:  PubMed          Journal:  Health Inf Sci Syst        ISSN: 2047-2501


