
Multi-test decision tree and its application to microarray data classification.

Marcin Czajkowski, Marek Grześ, Marek Kretowski.

Abstract

OBJECTIVE: Tools used to investigate biological data should produce models and predictive decisions that are easy to understand. Decision trees are particularly promising in this regard because their comprehensible structure resembles the hierarchical process of human decision making. However, existing algorithms for learning decision trees tend to underfit gene expression data. The main aim of this work is to improve the performance and stability of decision trees with only a small increase in their complexity.
METHODS: We propose a multi-test decision tree (MTDT); our main contribution is the application of several univariate tests in each non-terminal node of the decision tree. We also search for alternative, lower-ranked features in order to obtain more stable and reliable predictions.
RESULTS: Experimental validation was performed on several real-life gene expression datasets. Comparisons with eight classifiers show that MTDT achieves statistically significantly higher accuracy than popular decision tree classifiers and is highly competitive with ensemble learning algorithms. The proposed solution outperformed its baseline algorithm on 14 datasets by an average of 6%. A study performed on one of the datasets showed that the genes used in the MTDT classification model are supported by biological evidence in the literature.
CONCLUSION: This paper introduces a new type of decision tree that is more suitable for solving biological problems. MTDTs are relatively easy to analyze and much more powerful in modeling high-dimensional microarray data than their popular counterparts.
Copyright © 2014 Elsevier B.V. All rights reserved.
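
The METHODS paragraph describes placing several univariate tests in each non-terminal node. The abstract does not say how those tests are combined, so the following is only a minimal sketch, assuming a simple majority vote among threshold tests, with ties sent to the right branch. The names `UnivariateTest` and `MultiTest`, the thresholds, and the sample values are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class UnivariateTest:
    """A single threshold test on one feature (e.g. one gene's expression)."""
    feature: int      # index of the feature being tested
    threshold: float  # the sample goes left when its value <= threshold

    def goes_left(self, sample: Sequence[float]) -> bool:
        return sample[self.feature] <= self.threshold


@dataclass(frozen=True)
class MultiTest:
    """Several univariate tests combined in one node (assumed: majority vote)."""
    tests: List[UnivariateTest]

    def goes_left(self, sample: Sequence[float]) -> bool:
        votes_left = sum(t.goes_left(sample) for t in self.tests)
        # Strict majority is required; a tie sends the sample right (assumption).
        return 2 * votes_left > len(self.tests)


# Hypothetical node with three tests on three different features.
node = MultiTest([
    UnivariateTest(feature=0, threshold=0.5),
    UnivariateTest(feature=1, threshold=3.0),
    UnivariateTest(feature=2, threshold=2.0),
])

clean_sample = [0.2, 1.0, 1.5]  # all three tests vote left
noisy_sample = [0.9, 1.0, 1.5]  # feature 0 is perturbed; the other two still vote left

routed_clean = node.goes_left(clean_sample)
routed_noisy = node.goes_left(noisy_sample)
single_test_noisy = node.tests[0].goes_left(noisy_sample)
```

Under these assumptions, a node that relied on feature 0 alone would send `noisy_sample` to the right branch, while the multi-test node still routes it left. This is one way the extra tests could make predictions more stable at the cost of a slightly larger node.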

Keywords:  Decision trees; Gene expression data; Underfitting; Univariate tests

Year: 2014    PMID: 24630712    DOI: 10.1016/j.artmed.2014.01.005

Source DB: PubMed    Journal: Artif Intell Med    ISSN: 0933-3657    Impact factor: 5.326


  3 in total

1.  Deep learning-based microarray cancer classification and ensemble gene selection approach.

Authors:  Khosro Rezaee; Gwanggil Jeon; Mohammad R Khosravi; Hani H Attar; Alireza Sabzevari
Journal:  IET Syst Biol       Date:  2022-07-04       Impact factor: 1.468

2.  Hybrid learning method based on feature clustering and scoring for enhanced COVID-19 breath analysis by an electronic nose.

Authors:  Shidiq Nur Hidayat; Trisna Julian; Agus Budi Dharmawan; Mayumi Puspita; Lily Chandra; Abdul Rohman; Madarina Julia; Aditya Rianjanu; Dian Kesumapramudya Nurputra; Kuwat Triyana; Hutomo Suryo Wasisto
Journal:  Artif Intell Med       Date:  2022-05-17       Impact factor: 7.011

3.  Automated Detection of Cancer Associated Genes Using a Combined Fuzzy-Rough-Set-Based F-Information and Water Swirl Algorithm of Human Gene Expression Data.

Authors:  Pugalendhi Ganesh Kumar; Muthu Subash Kavitha; Byeong-Cheol Ahn
Journal:  PLoS One       Date:  2016-12-09       Impact factor: 3.240
