Mixture classification model based on clinical markers for breast cancer prognosis.

Tao Zeng, Juan Liu.

Abstract

OBJECTIVE: Accurate prognosis prediction is critical to cancer treatment. Many prognosis models based on clinical markers have been proposed, but few of them perform satisfactorily in clinical applications. With the development of microarray technologies, cancer researchers have identified many genes as new markers from gene expression data and have developed powerful prognosis models based on these so-called genetic biomarkers. However, the application of such biomarkers still suffers from several problems. First, gene expression data contain a great number of genes but only a few samples, which makes it difficult to select a unified gene set and establish a stable classifier for prognosis. Second, for experimental and technical reasons, gene expression data contain noise and redundancy, which may lead to a prognosis predictor with poor performance. Last but not least, microarray experiments are currently so expensive that it is hard to obtain abundant samples. Therefore, for real cancer treatment applications it is practical to develop prognosis methods based mainly on conventional clinical markers. This paper aims to establish an accurate classification model for cancer prognosis that makes full use of the invaluable information in clinical data, especially the information that most existing methods ignore when they aim for high prediction accuracy.
METHODS: First, this paper gives a formal description of the general classification problem and presents a novel mixture classification model that makes full use of the invaluable information in clinical data. The model is similar to traditional ensemble classification models, except that it places strict constraints on the construction of the mapping functions so that no voting process is needed. Then, a two-layer instance of the proposed model, named MRS (Mixture of Rough set and Support vector machine), is constructed by integrating rough set and support vector machine (SVM) classification methods: the rough set classifier acts as the first layer and identifies some singular samples in the data, and the SVM classifier acts as the second layer and classifies the remaining samples. Finally, MRS is used to make prognosis predictions on two open breast cancer datasets. One dataset, denoted BRC-1 hereafter, is a high-quality, publicly available dataset of 97 breast cancer tumors from node-negative patients. The other, denoted BRC-2 hereafter, uses baseline human primary breast tumor data from the LBL breast cancer cell collection and contains 174 samples.
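The two-layer routing described above can be illustrated with a minimal sketch. It is not the authors' MRS implementation: the first layer here is a simple rough-set-style rule table (attribute patterns whose training samples all share one label), the second layer is a hypothetical stand-in callable where a trained SVM would normally go, and the marker values, labels, and function names are invented for illustration. The point it shows is that each sample is decided by exactly one layer, so no voting is involved.

```python
from collections import defaultdict


def fit_certain_rules(rows, labels):
    """Keep only attribute patterns whose training samples all share one label.

    This mimics a rough-set lower approximation on already-discretized
    markers; it is an assumption, not the rule induction used in the paper.
    """
    seen = defaultdict(set)
    for row, label in zip(rows, labels):
        seen[tuple(row)].add(label)
    return {pattern: next(iter(found))
            for pattern, found in seen.items() if len(found) == 1}


def mixture_predict(row, rules, second_layer):
    """Route a sample to exactly one layer; the layers never vote."""
    pattern = tuple(row)
    if pattern in rules:
        return rules[pattern], "layer1"
    return second_layer(row), "layer2"


# Hypothetical discretized clinical markers (e.g. grade, receptor status).
rows = [(3, 0), (3, 0), (1, 1), (2, 1), (2, 1)]
labels = ["poor", "poor", "good", "good", "poor"]

rules = fit_certain_rules(rows, labels)

# Samples not covered by a certain rule would be used to train the second layer.
remaining = [(r, l) for r, l in zip(rows, labels) if tuple(r) not in rules]


def second_layer(row):
    """Stand-in for a trained SVM; always answers 'good' in this toy example."""
    return "good"


print(mixture_predict((3, 0), rules, second_layer))
print(mixture_predict((2, 1), rules, second_layer))
```

In a fuller version, `second_layer` would be a classifier fitted on `remaining`, for example scikit-learn's `sklearn.svm.SVC`, and the markers would be discretized before rule extraction.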
RESULTS: We conducted two experiments, one on BRC-1 and one on BRC-2. In the first experiment, the BRC-1 dataset is divided into a training set of 78 patients (34 in the poor prognosis group and 44 in the good prognosis group) and a test set of 19 patients (12 in the poor prognosis group and 7 in the good prognosis group). After training on the training set, MRS correctly classifies all 12 patients with poor prognosis and 6 of the 7 patients with good prognosis in the test set. These results are better than those of previous studies, including those based on the 70-gene biomarkers. In the second experiment, we construct the classifiers using the BRC-2 dataset and compare MRS with other representative methods in the Weka software by 5-fold cross-validation; the comparison shows that MRS has higher prediction accuracy than those methods.
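The BRC-1 test-set counts reported above can be turned into standard summary metrics. The short sketch below does that arithmetic; treating poor prognosis as the positive class is an assumption made here, and the function name is illustrative.

```python
def binary_metrics(tp, fn, tn, fp):
    """Accuracy, sensitivity, and specificity from confusion-matrix counts."""
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Reported counts: 12 of 12 poor-prognosis and 6 of 7 good-prognosis
# patients classified correctly (poor prognosis taken as positive).
metrics = binary_metrics(tp=12, fn=0, tn=6, fp=1)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Under that assumption the counts correspond to 18 of 19 test patients classified correctly (about 94.7% accuracy), with sensitivity 12/12 and specificity 6/7.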
CONCLUSIONS: The proposed mixture classification model can easily integrate methods with different characteristics. It overcomes the shortcomings of traditional voting-based ensemble models and can therefore make full use of the information in clinical data. The experimental results indicate that our MRS classifier predicts breast cancer prognosis more accurately than previous prognostic methods. 2009 Elsevier B.V. All rights reserved.

Year:  2009        PMID: 20005686     DOI: 10.1016/j.artmed.2009.07.008

Source DB:  PubMed          Journal:  Artif Intell Med        ISSN: 0933-3657            Impact factor:   5.326


  5 in total

1.  EARN: an ensemble machine learning algorithm to predict driver genes in metastatic breast cancer.

Authors:  Leila Mirsadeghi; Reza Haji Hosseini; Ali Mohammad Banaei-Moghaddam; Kaveh Kavousi
Journal:  BMC Med Genomics       Date:  2021-05-07       Impact factor: 3.063

2.  LDA-SVM-based EGFR mutation model for NSCLC brain metastases: an observational study.

Authors:  Nan Hu; Ge Wang; Yu-Hao Wu; Shi-Feng Chen; Guo-Dong Liu; Chuan Chen; Dong Wang; Zhong-Shi He; Xue-Qin Yang; Yong He; Hua-Liang Xiao; Ding-De Huang; Kun-Lin Xiong; Yan Wu; Ming Huang; Zhen-Zhou Yang
Journal:  Medicine (Baltimore)       Date:  2015-02       Impact factor: 1.889

3.  A Hybrid Computer-aided-diagnosis System for Prediction of Breast Cancer Recurrence (HPBCR) Using Optimized Ensemble Learning.

Authors:  Mohammad R Mohebian; Hamid R Marateb; Marjan Mansourian; Miguel Angel Mañanas; Fariborz Mokarian
Journal:  Comput Struct Biotechnol J       Date:  2016-12-06       Impact factor: 7.271

4.  Gene expression profiles for predicting metastasis in breast cancer: a cross-study comparison of classification methods.

Authors:  Mark Burton; Mads Thomassen; Qihua Tan; Torben A Kruse
Journal:  ScientificWorldJournal       Date:  2012-11-28

5.  A novel gene expression test method of minimizing breast cancer risk in reduced cost and time by improving SVM-RFE gene selection method combined with LASSO.

Authors:  Madhuri Gupta; Bharat Gupta
Journal:  J Integr Bioinform       Date:  2020-12-29
