The effect of imputing missing clinical attribute values on training lung cancer survival prediction model performance.

Mohamed S Barakat^1,2, Matthew Field^1,2, Aditya Ghose^3, David Stirling^4, Lois Holloway^1,2,5, Shalini Vinod^1,5, Andre Dekker^6, David Thwaites^7.

Abstract

According to estimates from the World Health Organization and the International Agency for Research on Cancer, lung cancer is the most common cause of cancer death worldwide. Recent years have seen growing attention to clinical decision support systems in medicine generally and in cancer care in particular. These systems can predict a patient's likelihood of survival by learning from data on previously treated patients. The datasets mined to develop clinical decision support functionality are often incomplete, which adversely affects the quality of the resulting models and of the decision support offered. Imputing missing values with statistical methods is a common approach to the missing data problem. This work investigates the effect of imputation methods for missing data when preparing a training dataset for a Non-Small Cell Lung Cancer survival prediction model, using several machine learning algorithms. The investigation assesses the effect of imputation algorithm error on prediction performance and compares training on a smaller, complete real dataset with training on a larger dataset containing imputed data. Our results show that even when the proportion of records with some missing data is very high (> 80%), imputation can lead to prediction models with an AUC (0.68-0.72) comparable to those trained with complete data records.
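The two training-set options compared in the abstract, and the AUC metric it reports, can be illustrated with a minimal sketch. This is not the study's method: the abstract does not name its imputation algorithms, so simple column-mean imputation is assumed here, and the records, attribute meanings, and function names are hypothetical.

```python
from statistics import mean

def complete_cases(rows):
    """Option 1: keep only records with no missing (None) attribute values."""
    return [r for r in rows if None not in r]

def mean_impute(rows):
    """Option 2 (assumed technique): fill each None with its column's observed mean."""
    n_cols = len(rows[0])
    col_means = [mean(r[j] for r in rows if r[j] is not None) for j in range(n_cols)]
    return [tuple(col_means[j] if v is None else v for j, v in enumerate(r))
            for r in rows]

def auc(scores, labels):
    """AUC: probability that a random positive scores above a random negative.
    Ties count as half a win."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical records: (age in years, tumour volume); None marks a missing value.
records = [(60.0, 10.0), (70.0, None), (None, 30.0), (80.0, 20.0)]

small_complete = complete_cases(records)   # fewer records, all values real
large_imputed = mean_impute(records)       # all records, some values estimated

# Hypothetical model scores and true survival labels for four test patients.
example_auc = auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0])
```

The complete-case set discards half of these toy records, while the imputed set retains all of them at the cost of estimated values; this is the trade-off the study evaluates by comparing the AUC of models trained on each.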

Keywords: Decision Support; Imputation; Missing data; Modeling and Lung Cancer

Year: 2017 PMID: 29255599 PMCID: PMC5718991 DOI: 10.1007/s13755-017-0039-4

Source DB: PubMed Journal: Health Inf Sci Syst ISSN: 2047-2501

11 in total

1. Missing data imputation on the 5-year survival prediction of breast cancer patients with unknown discrete values.

Authors: Pedro J García-Laencina; Pedro Henriques Abreu; Miguel Henriques Abreu; Noémia Afonoso
Journal: Comput Biol Med Date: 2015-02-16 Impact factor: 4.589

2. Imputation is beneficial for handling missing data in predictive models.

Authors: Ewout W Steyerberg; Mirjam van Veen
Journal: J Clin Epidemiol Date: 2007-06-28 Impact factor: 6.437

3. Sample size planning for classification models.

Authors: Claudia Beleites; Ute Neugebauer; Thomas Bocklitz; Christoph Krafft; Jürgen Popp
Journal: Anal Chim Acta Date: 2012-11-17 Impact factor: 6.558

4. Comparison of Bayesian network and support vector machine models for two-year survival prediction in lung cancer patients treated with radiotherapy.

Authors: K Jayasurya; G Fung; S Yu; C Dehing-Oberije; D De Ruysscher; A Hope; W De Neve; Y Lievens; P Lambin; A L A J Dekker
Journal: Med Phys Date: 2010-04 Impact factor: 4.071

5. Rapid-learning system for cancer care.

Authors: Amy P Abernethy; Lynn M Etheredge; Patricia A Ganz; Paul Wallace; Robert R German; Chalapathy Neti; Peter B Bach; Sharon B Murphy
Journal: J Clin Oncol Date: 2010-06-28 Impact factor: 44.544

6. Prognostic factors in stage III non-small cell lung cancer: a review of conventional, metabolic and new biological variables.

Authors: Thierry Berghmans; Marianne Paesmans; Jean-Paul Sculier
Journal: Ther Adv Med Oncol Date: 2011-05 Impact factor: 8.168

7. Distributed learning: Developing a predictive model based on data from multiple hospitals without data leaving the hospital - A real life proof of concept.

Authors: Arthur Jochems; Timo M Deist; Johan van Soest; Michael Eble; Paul Bulens; Philippe Coucke; Wim Dries; Philippe Lambin; Andre Dekker
Journal: Radiother Oncol Date: 2016-10-28 Impact factor: 6.280

8. Multiple imputation for missing data in epidemiological and clinical research: potential and pitfalls.

Authors: Jonathan A C Sterne; Ian R White; John B Carlin; Michael Spratt; Patrick Royston; Michael G Kenward; Angela M Wood; James R Carpenter
Journal: BMJ Date: 2009-06-29

9. A Validated Prediction Model for Overall Survival From Stage III Non-Small Cell Lung Cancer: Toward Survival Prediction for Individual Patients.

Authors: Cary Oberije; Dirk De Ruysscher; Ruud Houben; Michel van de Heuvel; Wilma Uyterlinde; Joseph O Deasy; Jose Belderbos; Anne-Marie C Dingemans; Andreas Rimner; Shaun Din; Philippe Lambin
Journal: Int J Radiat Oncol Biol Phys Date: 2015-04-30 Impact factor: 7.038

10. Rapid learning in practice: a lung cancer survival decision support system in routine patient care data.

1. Guest editorial: special issue on "Artificial Intelligence in Health and Medicine".

2. On Predicting Recurrence in Early Stage Non-small Cell Lung Cancer.

3. Imputation techniques on missing values in breast cancer treatment and fertility data.

4. An ontology-based documentation of data discovery and integration process in cancer outcomes research.