Transcriptome profiling by combined machine learning and statistical R analysis identifies TMEM236 as a potential novel diagnostic biomarker for colorectal cancer.

Neha Shree Maurya, Sandeep Kushwaha, Aakash Chawade, Ashutosh Mani.

Abstract

Colorectal cancer (CRC) is a common cause of cancer-related deaths worldwide. A CRC mRNA gene expression dataset from The Cancer Genome Atlas (TCGA), containing 644 CRC tumor and 51 normal samples and 23,186 features, was pre-processed to identify significant differentially expressed genes (DEGs). Two feature selection techniques, least absolute shrinkage and selection operator (LASSO) and Relief, were used together with class balancing to obtain features (genes) of high importance. The dataset was then classified with three machine learning (ML) algorithms: random forest (RF), K-nearest neighbour (KNN), and artificial neural networks (ANN). There were 2933 significant DEGs, of which 1832 were upregulated and 1101 downregulated. LASSO performed better than Relief for classifying tumor and normal samples: with LASSO, RF, KNN, and ANN reached an accuracy of 100%, whereas with Relief they reached 79.5%, 85.05%, and 100%, respectively. LASSO-selected features and DEGs had 38 genes in common, of which only 5 (VSTM2A, NR5A2, TMEM236, GDLN, and ETFDH) were statistically significant in survival analysis. Functional review and analysis of these genes narrowed the 5 candidates to 2, VSTM2A and TMEM236. Differential expression of TMEM236 was statistically significant, and its expression was markedly reduced in the dataset, which supports assessing it as a novel biomarker for CRC diagnosis.
© 2021. The Author(s).
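
The abstract names Relief as one of the two feature selection techniques but does not describe its mechanics. The following is a minimal sketch of the basic Relief idea, assuming a binary class label, numeric features, and a range-scaled Manhattan distance. It is not the authors' implementation (the study reports an R-based analysis); the function name `relief_scores` and the toy expression values are hypothetical and chosen only for illustration.

```python
def relief_scores(X, y):
    """Score features with basic Relief (binary classes, numeric features).

    Hypothetical helper for illustration. Every sample is visited once;
    each class must contain at least two samples.
    """
    n, p = len(X), len(X[0])

    # Scale per-feature differences by the feature's range so that
    # features measured on different scales are comparable.
    spans = []
    for j in range(p):
        col = [row[j] for row in X]
        spans.append((max(col) - min(col)) or 1.0)

    def diff(j, a, b):
        return abs(a[j] - b[j]) / spans[j]

    def dist(a, b):
        return sum(diff(j, a, b) for j in range(p))

    weights = [0.0] * p
    for i in range(n):
        hits = [k for k in range(n) if k != i and y[k] == y[i]]
        misses = [k for k in range(n) if y[k] != y[i]]
        near_hit = min(hits, key=lambda k: dist(X[i], X[k]))
        near_miss = min(misses, key=lambda k: dist(X[i], X[k]))
        for j in range(p):
            # Reward features that differ from the nearest other-class
            # sample and penalise those that differ from the nearest
            # same-class sample.
            weights[j] += (diff(j, X[i], X[near_miss])
                           - diff(j, X[i], X[near_hit])) / n
    return weights


# Toy data (made up): gene_a separates the classes, gene_b does not.
X = [
    [1.0, 5.0], [1.2, 1.0], [0.9, 3.0],   # label 0 ("normal")
    [3.0, 4.8], [3.1, 1.2], [2.9, 3.1],   # label 1 ("tumor")
]
y = [0, 0, 0, 1, 1, 1]

scores = relief_scores(X, y)
ranking = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
print(scores, ranking)
```

On this toy input the class-separating feature receives a positive weight and the uninformative one a negative weight, so a ranking by weight would keep the former. A real analysis of 23,186 features would use an optimised library rather than this quadratic-time loop.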

Year:  2021        PMID: 34253750     DOI: 10.1038/s41598-021-92692-0

Source DB:  PubMed          Journal:  Sci Rep        ISSN: 2045-2322            Impact factor:   4.379


  6 in total

1.  Diagnostic genes and immune infiltration analysis of colorectal cancer determined by LASSO and SVM machine learning methods: a bioinformatics analysis.

Authors:  Yan-Rong Li; Ke Meng; Guang Yang; Bao-Hai Liu; Chu-Qiao Li; Jia-Yuan Zhang; Xiao-Mei Zhang
Journal:  J Gastrointest Oncol       Date:  2022-06

2.  Construction of a predictive model for immunotherapy efficacy in lung squamous cell carcinoma based on the degree of tumor-infiltrating immune cells and molecular typing.

Authors:  Lingge Yang; Shuli Wei; Jingnan Zhang; Qiongjie Hu; Wansong Hu; Mengqing Cao; Long Zhang; Yongfang Wang; Pingli Wang; Kai Wang
Journal:  J Transl Med       Date:  2022-08-12       Impact factor: 8.440

3.  A p53 transcriptional signature in primary and metastatic cancers derived using machine learning.

Authors:  Faeze Keshavarz-Rahaghi; Erin Pleasance; Tyler Kolisnik; Steven J M Jones
Journal:  Front Genet       Date:  2022-08-29       Impact factor: 4.772

4.  Novel feature selection methods for construction of accurate epigenetic clocks.

Authors:  Adam Li; Amber Mueller; Brad English; Anthony Arena; Daniel Vera; Alice E Kane; David A Sinclair
Journal:  PLoS Comput Biol       Date:  2022-08-19       Impact factor: 4.779

5.  A novel biomarker selection method combining graph neural network and gene relationships applied to microarray data.

Authors:  Weidong Xie; Wei Li; Shoujia Zhang; Linjie Wang; Jinzhu Yang; Dazhe Zhao
Journal:  BMC Bioinformatics       Date:  2022-07-26       Impact factor: 3.307

6.  Machine Learning Data Analysis Highlights the Role of Parasutterella and Alloprevotella in Autism Spectrum Disorders.

Authors:  Daniele Pietrucci; Adelaide Teofani; Marco Milanesi; Bruno Fosso; Lorenza Putignani; Francesco Messina; Graziano Pesole; Alessandro Desideri; Giovanni Chillemi
Journal:  Biomedicines       Date:  2022-08-19