Taxonomy dimension reduction for colorectal cancer prediction.

Kaiyang Qu, Feng Gao, Fei Guo, Quan Zou.

Abstract

Colorectal cancer is one of the most common cancers, and the number of people affected is growing, so early diagnosis and treatment are essential. Because the disease may alter the microbial communities in the gut, gut microorganisms could offer an efficient way to predict colorectal cancer. In this study, we used operational taxonomic units (OTUs), which cover several kinds of microorganisms, as features for predicting colorectal cancer. To identify the most important microorganisms and obtain the best prediction performance, we explored effective feature selection methods in three main steps. First, we applied a single dimension reduction method to reduce the features. Next, to reduce the number of features further, we combined the dimension reduction methods correlation-based feature selection and maximum relevance-maximum distance (MRMD 1.0 and MRMD 2.0). Then, we selected the important features according to the taxonomy files. We created separate training and test sets to obtain a more objective evaluation, and evaluated random forest, naïve Bayes, and decision tree classifiers. The results show that the proposed methods outperform hierarchical feature engineering. The method that combines correlation-based feature selection with MRMD 2.0 performed best on the CRC2 dataset. The dataset and methods are available at http://lab.malab.cn/data/microdata/data.html.
Copyright © 2019 Elsevier Ltd. All rights reserved.
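
The feature-ranking idea behind the abstract can be illustrated with a minimal sketch. This is not the authors' MRMD implementation. It assumes a simplified score in the spirit of maximum relevance-maximum distance: relevance is taken as the absolute Pearson correlation between a feature and the class label, and distance as the mean Euclidean distance from a feature to all other features, scaled to [0, 1]. The function names and the toy abundance values are hypothetical and serve only as illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0
    return cov / (sd_x * sd_y)


def euclidean(x, y):
    """Euclidean distance between two feature columns."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def rank_features(columns, labels):
    """Return feature indices sorted by relevance + scaled distance, best first.

    Assumption: both terms get equal weight; this weighting is a choice made
    for the sketch, not a setting taken from the paper.
    """
    k = len(columns)
    relevance = [abs(pearson(col, labels)) for col in columns]
    distance = []
    for i in range(k):
        others = [euclidean(columns[i], columns[j]) for j in range(k) if j != i]
        distance.append(sum(others) / len(others) if others else 0.0)
    max_distance = max(distance) or 1.0  # avoid division by zero
    scores = [r + d / max_distance for r, d in zip(relevance, distance)]
    return sorted(range(k), key=lambda i: scores[i], reverse=True)


# Hypothetical toy data: three OTU abundance columns over six samples.
labels = [0, 0, 0, 1, 1, 1]           # 0 = control, 1 = case
otu_columns = [
    [0.1, 0.2, 0.1, 0.9, 0.8, 0.9],   # tracks the label closely
    [0.5, 0.5, 0.5, 0.5, 0.5, 0.5],   # constant, carries no label signal
    [0.1, 0.2, 0.1, 0.9, 0.8, 0.8],   # near-duplicate of the first column
]
ranking = rank_features(otu_columns, labels)
print("feature ranking (best first):", ranking)
```

In a fuller pipeline, the top-ranked columns would then be passed to a classifier such as a random forest and scored on a held-out test set, as the abstract describes.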

Keywords:  Colorectal cancer; Correlation-based feature selection; Machine learning; Maximum relevant Maximum distance; Microbial

Year:  2019        PMID: 31743831     DOI: 10.1016/j.compbiolchem.2019.107160

Source DB:  PubMed          Journal:  Comput Biol Chem        ISSN: 1476-9271            Impact factor:   2.877


  4 in total

1.  Silybin Prevents Prostate Cancer by Inhibited the ALDH1A1 Expression in the Retinol Metabolism Pathway.

Authors:  Ying Jiang; Hanbing Song; Ling Jiang; Yu Qiao; Dan Yang; Donghua Wang; Ji Li
Journal:  Front Cell Dev Biol       Date:  2020-08-31

2.  Comparison of Methods for Picking the Operational Taxonomic Units From Amplicon Sequences.

Authors:  Ze-Gang Wei; Xiao-Dan Zhang; Ming Cao; Fei Liu; Yu Qian; Shao-Wu Zhang
Journal:  Front Microbiol       Date:  2021-03-24       Impact factor: 5.640

3.  Early Diagnosis of Hepatocellular Carcinoma Using Machine Learning Method.

Authors:  Zi-Mei Zhang; Jiu-Xin Tan; Fang Wang; Fu-Ying Dao; Zhao-Yue Zhang; Hao Lin
Journal:  Front Bioeng Biotechnol       Date:  2020-03-27

4.  Interpretable prediction of necrotizing enterocolitis from machine learning analysis of premature infant stool microbiota.

Authors:  Yun Chao Lin; Ansaf Salleb-Aouissi; Thomas A Hooven
Journal:  BMC Bioinformatics       Date:  2022-03-25       Impact factor: 3.169
