A new cluster-based oversampling method for improving survival prediction of hepatocellular carcinoma patients.

Miriam Seoane Santos, Pedro Henriques Abreu, Pedro J García-Laencina, Adélia Simão, Armando Carvalho.

Abstract

Liver cancer is the sixth most frequently diagnosed cancer, and Hepatocellular Carcinoma (HCC) accounts for more than 90% of primary liver cancers. Clinicians assess each patient's treatment on the basis of evidence-based medicine, which may not always apply to a specific patient, given the biological variability among individuals. Over the years, several studies on Hepatocellular Carcinoma have developed strategies to assist clinicians in decision making, using computational methods (e.g. machine learning techniques) to extract knowledge from clinical data. However, these studies have limitations that have not yet been addressed: some do not focus entirely on Hepatocellular Carcinoma patients, others have strict application boundaries, and none considers either the heterogeneity between patients or the presence of missing data, a common drawback in healthcare contexts. In this work, a real, complex Hepatocellular Carcinoma database composed of heterogeneous clinical features is studied. We propose a new cluster-based oversampling approach, robust to small and imbalanced datasets, which accounts for the heterogeneity of patients with Hepatocellular Carcinoma. The preprocessing procedures are based on data imputation using a distance metric suited to both heterogeneous and missing data (HEOM) and on clustering studies (K-means) to assess the underlying patient groups in the dataset. The final approach is applied to diminish the impact of small underlying patient profiles on survival prediction. It uses K-means clustering and the SMOTE algorithm to build a representative dataset, which serves as the training set for different machine learning procedures (logistic regression and neural networks).
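The combination of clustering and SMOTE described here can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that cluster labels have already been obtained (for example from K-means), that all features are numeric, that Euclidean distance stands in for HEOM, and that the minority class inside each cluster is oversampled up to the size of that cluster's majority class. The abstract does not state the balancing rule, so that last choice is an assumption, and the function names `smote_samples` and `cluster_oversample` are hypothetical.

```python
import math
import random
from collections import defaultdict


def smote_samples(points, n_new, k=3, rng=None):
    """Create n_new synthetic points, SMOTE-style: pick a base point, pick one
    of its k nearest neighbours, and interpolate at a random position between them."""
    rng = rng or random.Random(0)
    if len(points) < 2 or n_new <= 0:
        return []  # a single point has no neighbour to interpolate with
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(points)
        # Euclidean distance is an assumption; the paper uses HEOM for mixed data.
        others = [p for p in points if p is not base]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbour)))
    return synthetic


def cluster_oversample(X, y, clusters, k=3, seed=0):
    """Within each cluster, oversample smaller classes up to the size of the
    largest class in that cluster (assumed balancing rule)."""
    rng = random.Random(seed)
    groups = defaultdict(lambda: defaultdict(list))
    for x, label, c in zip(X, y, clusters):
        groups[c][label].append(tuple(x))
    X_out, y_out = [], []
    for c in sorted(groups):
        by_label = groups[c]
        target = max(len(pts) for pts in by_label.values())
        for label in sorted(by_label):
            pts = by_label[label]
            new_pts = smote_samples(pts, target - len(pts), k, rng)
            X_out.extend(pts + new_pts)
            y_out.extend([label] * (len(pts) + len(new_pts)))
    return X_out, y_out


# Toy usage: two clusters, each with its own class imbalance.
X_demo = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (0.2, 0.2), (9.0, 9.0), (9.5, 9.5), (10.0, 9.0)]
y_demo = [0, 0, 0, 1, 1, 1, 0]
clusters_demo = [0, 0, 0, 0, 1, 1, 1]
X_bal, y_bal = cluster_oversample(X_demo, y_demo, clusters_demo)
print(len(X_bal), y_bal.count(0), y_bal.count(1))
```

Oversampling inside each cluster, rather than over the whole dataset, keeps synthetic patients close to an existing patient profile. Note that a class with a single member in a cluster cannot be interpolated and is left unchanged in this sketch.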
The results are evaluated in terms of survival prediction and compared, using the Friedman rank test, against baseline approaches that do not consider clustering and/or oversampling. Our proposed methodology coupled with neural networks outperformed all others, suggesting an improvement over the classical approaches currently used in Hepatocellular Carcinoma prediction models.
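For readers unfamiliar with the Friedman rank test, the following minimal sketch shows how its statistic can be computed from a table of scores. It is an illustration under assumptions, not the evaluation code of the paper: each row is assumed to be one evaluation block (for instance a cross-validation fold or a dataset variant), each column one method, higher scores are assumed to be better, and no tie correction is applied. The function names are hypothetical.

```python
def average_ranks(scores, higher_is_better=True):
    """Rank one row of scores: the best score gets rank 1, ties share the average rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=higher_is_better)
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of scores tied with position i.
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for pos in range(i, j + 1):
            ranks[order[pos]] = avg
        i = j + 1
    return ranks


def friedman_statistic(results):
    """results[i][j] is the score of method j on block i.
    Returns (mean rank per method, Friedman chi-square statistic)."""
    n = len(results)      # number of blocks
    k = len(results[0])   # number of methods
    rank_rows = [average_ranks(row) for row in results]
    mean_ranks = [sum(row[j] for row in rank_rows) / n for j in range(k)]
    chi2 = 12 * n / (k * (k + 1)) * (sum(r * r for r in mean_ranks) - k * (k + 1) ** 2 / 4)
    return mean_ranks, chi2


# Toy usage with made-up scores for three methods over four blocks.
demo = [
    [0.61, 0.70, 0.78],
    [0.58, 0.66, 0.74],
    [0.63, 0.69, 0.80],
    [0.60, 0.71, 0.77],
]
print(friedman_statistic(demo))
```

The statistic is compared with a chi-square distribution with k - 1 degrees of freedom, where k is the number of methods. A lower mean rank indicates a method that tends to perform better across blocks.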
Copyright © 2015 Elsevier Inc. All rights reserved.

Keywords:  Clustering; Hepatocellular Carcinoma (HCC); K-means; Oversampling; SMOTE; Survival prediction

Year: 2015    PMID: 26423562    DOI: 10.1016/j.jbi.2015.09.012

Source DB: PubMed    Journal: J Biomed Inform    ISSN: 1532-0464    Impact factor: 6.317


  8 in total

1.  Machine-learning algorithms based on personalized pathways for a novel predictive model for the diagnosis of hepatocellular carcinoma.

Authors:  Binglin Cheng; Peitao Zhou; Yuhan Chen
Journal:  BMC Bioinformatics       Date:  2022-06-23       Impact factor: 3.307

2.  Feature Selection and Classification of Clinical Datasets Using Bioinspired Algorithms and Super Learner.

Authors:  S Murugesan; R S Bhuvaneswaran; H Khanna Nehemiah; S Keerthana Sankari; Y Nancy Jane
Journal:  Comput Math Methods Med       Date:  2021-05-17       Impact factor: 2.238

3.  Application of unsupervised analysis techniques to lung cancer patient data.

Authors:  Chip M Lynch; Victor H van Berkel; Hermann B Frieboes
Journal:  PLoS One       Date:  2017-09-14       Impact factor: 3.240

4.  Ensemble Feature Learning to Identify Risk Factors for Predicting Secondary Cancer.

Authors:  Xiucai Ye; Hongmin Li; Tetsuya Sakurai; Pei-Wei Shueng
Journal:  Int J Med Sci       Date:  2019-06-07       Impact factor: 3.738

5.  Risk factor analysis of device-related infections: value of re-sampling method on the real-world imbalanced dataset.

Authors:  Xiang-Fei Feng; Ling-Chao Yang; Li-Zhuang Tan; Yi-Gang Li
Journal:  BMC Med Inform Decis Mak       Date:  2019-09-11       Impact factor: 2.796

6.  Data-Driven Cervical Cancer Prediction Model with Outlier Detection and Over-Sampling Methods.

Authors:  Muhammad Fazal Ijaz; Muhammad Attique; Youngdoo Son
Journal:  Sensors (Basel)       Date:  2020-05-15       Impact factor: 3.576

7.  Current updates in machine learning in the prediction of therapeutic outcome of hepatocellular carcinoma: what should we know? (Review)

Authors:  Zhi-Min Zou; De-Hua Chang; Hui Liu; Yu-Dong Xiao
Journal:  Insights Imaging       Date:  2021-03-06

8.  Supervised deep learning embeddings for the prediction of cervical cancer diagnosis.

Authors:  Kelwin Fernandes; Davide Chicco; Jaime S Cardoso; Jessica Fernandes
Journal:  PeerJ Comput Sci       Date:  2018-05-14
