Yan Zhang1, Yuan Wu1, Zi-Ying Gong2,3, Hai-Dan Ye2,3, Xiao-Kai Zhao2,3, Jie-Yi Li2,3, Xiao-Mei Zhang1, Sheng Li1, Wei Zhu4, Mei Wang4, Ge-Yu Liang5, Yun Liu1, Xin Guan1, Dao-Yun Zhang2,3, Bo Shen6.
Abstract
Colorectal cancer (CRC) is the third most commonly diagnosed cancer worldwide. Several studies have indicated that rectal cancer differs significantly from colon cancer in terms of treatment, prognosis, and metastasis. Recently, the differential mRNA expression of colon cancer and rectal cancer has received a great deal of attention. The current study aimed to identify significant differences between colon cancer and rectal cancer based on RNA sequencing (RNA-seq) data using a support vector machine (SVM). Here, 393 CRC samples from The Cancer Genome Atlas (TCGA) database were investigated, including 298 patients with colon cancer and 95 with rectal cancer. Following random forest (RF) analysis of the mRNA expression data, 96 genes, including HOXB13, PRAC, and BCLAF1, were identified and used to build the SVM classification model with leave-one-out cross-validation (LOOCV). In the training (n=196) and validation (n=197) cohorts, the accuracy (82.1% and 82.2%, respectively) and the AUC (0.87 and 0.91, respectively) indicated that the established optimal SVM classification model distinguished colon cancer from rectal cancer reasonably well. However, additional experiments are required to validate the predicted gene expression levels and functions.
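The workflow summarized in the abstract (RF-based gene ranking, then an SVM assessed by LOOCV and on a held-out cohort) can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes scikit-learn, runs on synthetic data rather than TCGA expression values, and the function name `run_sketch`, the linear kernel, the feature scaling, the 50/50 split, and the number of retained genes are all illustrative assumptions that the abstract does not specify.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import LeaveOneOut, cross_val_score, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def run_sketch(n_top_genes=20, seed=0):
    """Hypothetical helper: RF gene ranking followed by an SVM with LOOCV."""
    # Synthetic stand-in for a samples x genes expression matrix (NOT TCGA data).
    # Class weights loosely mimic an imbalanced colon/rectal split.
    X, y = make_classification(
        n_samples=120, n_features=200, n_informative=10,
        weights=[0.75, 0.25], random_state=seed,
    )

    # Split into training and validation cohorts, preserving class proportions.
    X_train, X_valid, y_train, y_valid = train_test_split(
        X, y, test_size=0.5, stratify=y, random_state=seed
    )

    # Rank genes by random forest importance using the training cohort only,
    # so the validation cohort does not influence gene selection.
    rf = RandomForestClassifier(n_estimators=200, random_state=seed)
    rf.fit(X_train, y_train)
    top = np.argsort(rf.feature_importances_)[::-1][:n_top_genes]

    # Linear SVM on standardized features (kernel and scaling are assumptions).
    svm = make_pipeline(StandardScaler(), SVC(kernel="linear"))

    # Leave-one-out cross-validation accuracy on the training cohort.
    loocv_acc = cross_val_score(
        svm, X_train[:, top], y_train, cv=LeaveOneOut()
    ).mean()

    # Refit on the full training cohort and evaluate on the validation cohort.
    svm.fit(X_train[:, top], y_train)
    valid_acc = accuracy_score(y_valid, svm.predict(X_valid[:, top]))
    valid_auc = roc_auc_score(y_valid, svm.decision_function(X_valid[:, top]))
    return top, loocv_acc, valid_acc, valid_auc


if __name__ == "__main__":
    top, loocv_acc, valid_acc, valid_auc = run_sketch()
    print(f"selected genes: {len(top)}")
    print(f"LOOCV accuracy (training): {loocv_acc:.3f}")
    print(f"validation accuracy: {valid_acc:.3f}, AUC: {valid_auc:.3f}")
```

Ranking genes on the training cohort alone is a deliberate choice in this sketch: selecting features on all samples before splitting would leak validation information into the model and inflate the reported accuracy and AUC.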
Keywords: classification; colon cancer; gene selection; rectal cancer; support vector machine
Year: 2021 PMID: 33877555 DOI: 10.1007/s11596-021-2356-8
Source DB: PubMed Journal: Curr Med Sci ISSN: 2523-899X