
Cancer survival classification using integrated data sets and intermediate information.

Shinuk Kim, Taesung Park, Mark Kon.

Abstract

OBJECTIVE: Although numerous studies of cancer survival have been published, improving the prediction accuracy of survival classes remains a challenge. Integrating different data sets, such as microRNA (miRNA) and mRNA expression, might increase the accuracy of survival class prediction. We therefore proposed a machine learning (ML) approach to integrate different data sets, and developed a novel method based on feature selection with the Cox proportional hazards regression model (FSCOX) to improve the prediction of cancer survival time.
METHODS: FSCOX provides intermediate survival information, which is usually discarded when survival is separated into 2 groups (short- and long-term), and allows us to perform survival analysis. We used an ML-based protocol for feature selection, integrating information from miRNA and mRNA expression profiles at the feature level. To predict survival phenotypes, we used the following classifiers: first, the existing ML methods support vector machine (SVM) and random forest (RF); second, a new median-based classifier using FSCOX (FSCOX_median); and third, an SVM classifier using FSCOX (FSCOX_SVM). We compared these methods using 3 types of cancer tissue data sets: (i) miRNA expression, (ii) mRNA expression, and (iii) combined miRNA and mRNA expression. The last data set included features selected either from the combined miRNA/mRNA profile or independently from the miRNA and mRNA profiles (IFS).
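The two selection routes for the combined data set, and the way a 2-group split discards intermediate samples, can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the data are synthetic, the function names and cut-offs are hypothetical, and the absolute-correlation score is only a stand-in for the Cox-based score that FSCOX uses, which the abstract does not specify.

```python
import random
from statistics import mean

def abs_correlation(xs, ys):
    """Absolute Pearson correlation. Stand-in score only, NOT the FSCOX Cox score."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return abs(cov) / (vx * vy) ** 0.5 if vx > 0 and vy > 0 else 0.0

def top_k(features, survival, k):
    """Names of the k features that score highest against survival time."""
    scores = {name: abs_correlation(vals, survival) for name, vals in features.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]

def select_combined(mirna, mrna, survival, k):
    """Pool both profiles first, then select k features from the pool."""
    return top_k({**mirna, **mrna}, survival, k)

def select_independent(mirna, mrna, survival, k_mirna, k_mrna):
    """IFS-style: select within each profile separately, then concatenate."""
    return top_k(mirna, survival, k_mirna) + top_k(mrna, survival, k_mrna)

def dichotomize(survival, short_cut, long_cut):
    """Label short-term as 0 and long-term as 1; intermediate samples become None."""
    return [0 if t <= short_cut else 1 if t >= long_cut else None for t in survival]

# Synthetic toy data (hypothetical): 12 samples, 5 miRNAs, 20 mRNAs.
rng = random.Random(0)
n = 12
survival = [rng.uniform(1, 60) for _ in range(n)]  # survival time, arbitrary units
mirna = {f"miR_{i}": [rng.gauss(0, 1) for _ in range(n)] for i in range(5)}
mrna = {f"gene_{i}": [rng.gauss(0, 1) for _ in range(n)] for i in range(20)}
mrna["gene_0"] = [t + rng.gauss(0, 1) for t in survival]  # planted signal

ifs = select_independent(mirna, mrna, survival, k_mirna=2, k_mrna=3)
combined = select_combined(mirna, mrna, survival, k=5)
labels = dichotomize(survival, short_cut=12, long_cut=36)  # hypothetical cut-offs
```

With independent selection, each data type is guaranteed a fixed share of the selected features, whereas pooled selection lets the larger or noisier profile crowd out the other. Samples labeled `None` by `dichotomize` are the intermediate cases that a plain 2-class setup would drop.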
RESULTS: In the ovarian data set, with a balanced set of 22 short-term and 22 long-term survivors, the accuracy of survival classification using the combined miRNA/mRNA profiles with IFS was 75% using RF, 86.36% using SVM, 84.09% using FSCOX_median, and 88.64% using FSCOX_SVM. These accuracies are higher than those obtained using miRNA alone (70.45%, RF; 75%, SVM; 75%, FSCOX_median; and 75%, FSCOX_SVM) or mRNA alone (65.91%, RF; 63.64%, SVM; 72.73%, FSCOX_median; and 70.45%, FSCOX_SVM). Similarly, in the glioblastoma multiforme data, the accuracy of miRNA/mRNA using IFS was 75.51% (RF), 87.76% (SVM), 85.71% (FSCOX_median), and 85.71% (FSCOX_SVM). These results are higher than those obtained using miRNA expression or mRNA expression alone. In addition, we predicted 16 hsa-miR-23b and hsa-miR-27b target genes in the ovarian cancer data sets, obtained by SVM-based feature selection through integration of sequence information and gene expression profiles.
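Because the ovarian data set has only 44 samples, each reported percentage corresponds to a whole number of correctly classified patients. The sketch below recovers those counts from the accuracies quoted above; the helper name `implied_correct` and the rounding tolerance are assumptions for illustration, and the counts are inferred from the percentages rather than reported by the authors.

```python
def implied_correct(acc_percent, n_samples, tol=0.005):
    """Number of correct predictions implied by a 2-decimal accuracy, or None."""
    for k in range(n_samples + 1):
        if abs(100.0 * k / n_samples - acc_percent) < tol:
            return k
    return None

N_OVARIAN = 44  # 22 short-term + 22 long-term survivors, as stated in the abstract

# Accuracies (%) copied from the RESULTS paragraph for the ovarian data set.
ovarian = {
    "miRNA+mRNA (IFS)": {"RF": 75.0, "SVM": 86.36, "FSCOX_median": 84.09, "FSCOX_SVM": 88.64},
    "miRNA only": {"RF": 70.45, "SVM": 75.0, "FSCOX_median": 75.0, "FSCOX_SVM": 75.0},
    "mRNA only": {"RF": 65.91, "SVM": 63.64, "FSCOX_median": 72.73, "FSCOX_SVM": 70.45},
}

counts = {
    data_set: {method: implied_correct(acc, N_OVARIAN) for method, acc in accs.items()}
    for data_set, accs in ovarian.items()
}

for data_set, by_method in counts.items():
    print(data_set, by_method)
```

One sample is worth about 2.27 percentage points at this sample size, so small differences between methods amount to one or two patients and should be read with that granularity in mind.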
CONCLUSION: Among the approaches used, the integrated miRNA and mRNA data set yielded better results than the individual data sets. The best performance was achieved using the FSCOX_SVM method with independent feature selection, which uses intermediate survival information between short-term and long-term survival time and the combination of the 2 different data sets. The results obtained using the combined data set suggest that there are some strong interactions between miRNA and mRNA features that are not detectable in the individual analyses.
Copyright © 2014 Elsevier B.V. All rights reserved.

Keywords:  Integration of data sets; Intermediate information; Machine learning algorithm; Survival time classification

Year:  2014        PMID: 24997860     DOI: 10.1016/j.artmed.2014.06.003

Source DB:  PubMed          Journal:  Artif Intell Med        ISSN: 0933-3657            Impact factor:   5.326


