Predictive modeling of colorectal cancer using a dedicated pre-processing pipeline on routine electronic medical records.

Reinier Kop, Mark Hoogendoorn, Annette Ten Teije, Frederike L Büchner, Pauline Slottje, Leon M G Moons, Mattijs E Numans.

Abstract

Over the past years, research using routine care data extracted from Electronic Medical Records (EMRs) has increased tremendously. Yet there are no straightforward, standardized strategies for pre-processing these data. We propose a dedicated medical pre-processing pipeline that addresses many of the problems and opportunities in EMR data, such as their temporal, inaccurate and incomplete nature. The pipeline is demonstrated on a dataset of routinely recorded data in general practice EMRs of over 260,000 patients, in which the occurrence of colorectal cancer (CRC) is predicted using various machine learning techniques (i.e., CART, LR, RF) and subsets of the data. CRC is a common type of cancer for which early detection has proven to be important yet challenging. The results are threefold. First, the predictive models generated using our pipeline reconfirmed known predictors and identified new, medically plausible predictors derived from the cardiovascular and metabolic disease domain, validating the pipeline's effectiveness. Second, the difference between the best model generated by the data-driven subset (AUC 0.891) and the best model generated by the current state-of-the-art hypothesis-driven subset (AUC 0.864) is statistically significant at the 95% confidence level. Third, the pipeline itself is highly generic and independent of the specific disease targeted and the EMR used. In conclusion, the application of established machine learning techniques in combination with the proposed pipeline on EMRs has great potential to enhance disease prediction, and hence early detection and intervention in medical practice.
Copyright © 2016 Elsevier Ltd. All rights reserved.
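
The abstract compares two models by AUC and reports that the difference is significant. As a rough illustration of what such a comparison involves, the sketch below computes AUC from paired predictions and a percentile bootstrap interval for the difference between two models. The function names, the bootstrap procedure, and the toy data are assumptions made for illustration; the abstract does not state which significance test the authors used, and this is not their pipeline.

```python
import random


def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    # A pair counts 1 if the positive case scores higher, 0.5 on a tie.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_difference(labels, scores_a, scores_b, n_boot=1000, seed=0):
    """Percentile 95% interval for AUC(a) - AUC(b) on the same patients.

    Hypothetical helper: resamples patients with replacement so that both
    models are always evaluated on the same bootstrap sample.
    """
    rng = random.Random(seed)
    n = len(labels)
    diffs = []
    while len(diffs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # sample lacks one class, so AUC is undefined
        a = auc(ys, [scores_a[i] for i in idx])
        b = auc(ys, [scores_b[i] for i in idx])
        diffs.append(a - b)
    diffs.sort()
    return diffs[int(0.025 * n_boot)], diffs[int(0.975 * n_boot) - 1]


# Toy data only; these are not values from the study.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = auc(toy_labels, toy_scores)
```

Under this scheme, an interval for the difference that excludes zero would suggest that the two models' AUCs differ at roughly the 95% level.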

Keywords:  Colorectal cancer; Data mining; Data processing; Electronic medical records; Machine learning

Year:  2016        PMID: 27392227     DOI: 10.1016/j.compbiomed.2016.06.019

Source DB:  PubMed          Journal:  Comput Biol Med        ISSN: 0010-4825            Impact factor:   4.589


  11 in total

1.  Cancer diagnostic tools to aid decision-making in primary care: mixed-methods systematic reviews and cost-effectiveness analysis.

Authors:  Antonieta Medina-Lara; Bogdan Grigore; Ruth Lewis; Jaime Peters; Sarah Price; Paolo Landa; Sophie Robinson; Richard Neal; William Hamilton; Anne E Spencer
Journal:  Health Technol Assess       Date:  2020-11       Impact factor: 4.014

2.  Predicting Outcome of Endovascular Treatment for Acute Ischemic Stroke: Potential Value of Machine Learning Algorithms.

Authors:  Hendrikus J A van Os; Lucas A Ramos; Adam Hilbert; Matthijs van Leeuwen; Marianne A A van Walderveen; Nyika D Kruyt; Diederik W J Dippel; Ewout W Steyerberg; Irene C van der Schaaf; Hester F Lingsma; Wouter J Schonewille; Charles B L M Majoie; Silvia D Olabarriaga; Koos H Zwinderman; Esmee Venema; Henk A Marquering; Marieke J H Wermer
Journal:  Front Neurol       Date:  2018-09-25       Impact factor: 4.003

3.  A novel surgical predictive model for Chinese Crohn's disease patients.

Authors:  Yuan Dong; Li Xu; Yihong Fan; Ping Xiang; Xuning Gao; Yong Chen; Wenyu Zhang; Qiongxiang Ge
Journal:  Medicine (Baltimore)       Date:  2019-11       Impact factor: 1.817

4.  Discovery of predictors of sudden cardiac arrest in diabetes: rationale and outline of the RESCUED (REcognition of Sudden Cardiac arrest vUlnErability in Diabetes) project.

Authors:  Laura H van Dongen; Peter P Harms; Mark Hoogendoorn; Dominic S Zimmerman; Elisabeth M Lodder; Leen M 't Hart; Ron Herings; Henk C P M van Weert; Giel Nijpels; Karin M A Swart; Amber A van der Heijden; Marieke T Blom; Petra J Elders; Hanno L Tan
Journal:  Open Heart       Date:  2021-02

5.  The prediction of asymptomatic carotid atherosclerosis with electronic health records: a comparative study of six machine learning models.

Authors:  Jiaxin Fan; Mengying Chen; Jian Luo; Shusen Yang; Jinming Shi; Qingling Yao; Xiaodong Zhang; Shuang Du; Huiyang Qu; Yuxuan Cheng; Shuyin Ma; Meijuan Zhang; Xi Xu; Qian Wang; Shuqin Zhan
Journal:  BMC Med Inform Decis Mak       Date:  2021-04-05       Impact factor: 2.796

Review 6.  Research and Application of Artificial Intelligence Based on Electronic Health Records of Patients With Cancer: Systematic Review.

Authors:  Xinyu Yang; Dongmei Mu; Hao Peng; Hua Li; Ying Wang; Ping Wang; Yue Wang; Siqi Han
Journal:  JMIR Med Inform       Date:  2022-04-20

7.  Data mining-based model and risk prediction of colorectal cancer by using secondary health data: A systematic review.

Authors:  Hailun Liang; Lei Yang; Lei Tao; Leiyu Shi; Wuyang Yang; Jiawei Bai; Da Zheng; Ning Wang; Jiafu Ji
Journal:  Chin J Cancer Res       Date:  2020-04       Impact factor: 5.087

8.  Development, validation and effectiveness of diagnostic prediction tools for colorectal cancer in primary care: a systematic review.

Authors:  Bogdan Grigore; Ruth Lewis; Jaime Peters; Sophie Robinson; Christopher J Hyde
Journal:  BMC Cancer       Date:  2020-11-10       Impact factor: 4.430

9.  Machine learning is a valid method for predicting prehospital delay after acute ischemic stroke.

Authors:  Li Yang; Qinqin Liu; Qiuli Zhao; Xuemei Zhu; Ling Wang
Journal:  Brain Behav       Date:  2020-08-18       Impact factor: 2.708

Review 10.  Development of artificial intelligence technology in diagnosis, treatment, and prognosis of colorectal cancer.

Authors:  Feng Liang; Shu Wang; Kai Zhang; Tong-Jun Liu; Jian-Nan Li
Journal:  World J Gastrointest Oncol       Date:  2022-01-15