Worrawat Engchuan, Asawin Meechai, Sissades Tongsima, Narumol Doungpan, Jonathan H Chan.
Abstract
Cancer is a complex disease that cannot be diagnosed reliably from single-gene expression analysis alone. Gene-set analysis of high-throughput gene expression profiles controlled by various environmental factors is a technique commonly adopted by the cancer research community. This work develops a comprehensive gene expression analysis tool, the gene-set activity toolbox (GAT), which integrates a data retriever, traditional data pre-processing, several gene-set analysis methods, network visualization, and data mining tools. The gene-set analysis methods identify subsets of phenotype-relevant genes, which are then used to build a classification model. To evaluate GAT's performance, we performed a cross-dataset validation study on three common cancers: colorectal, breast, and lung cancer. The results show that GAT can be used to build a reasonable disease diagnostic model and that the predicted markers have biological relevance. GAT can be accessed at http://gat.sit.kmutt.ac.th, where GAT's Java library for gene-set analysis, simple classification and a database with three cancer benchmark datasets can be downloaded.
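The workflow summarized in the abstract (score gene sets, keep the phenotype-relevant ones, train a classifier, validate on a separate dataset) can be illustrated with a minimal sketch. This is not GAT's code or its Java library API. The scoring rule (mean z-score per gene set), the Welch t-statistic ranking, the nearest-centroid classifier, all function names, and the toy expression values are assumptions chosen only for illustration.

```python
from math import sqrt
from statistics import mean, pstdev, variance

def set_activity(expr, gene_sets):
    """Per-sample activity of each gene set: mean z-score of its member genes.
    expr maps gene -> list of values (one per sample). Assumed scoring rule."""
    z = {}
    for gene, vals in expr.items():
        mu, sd = mean(vals), pstdev(vals)
        z[gene] = [(v - mu) / sd if sd > 0 else 0.0 for v in vals]
    n = len(next(iter(expr.values())))
    act = {}
    for name, genes in gene_sets.items():
        members = [z[g] for g in genes if g in z]
        if members:  # skip sets with no measured genes
            act[name] = [mean(m[i] for m in members) for i in range(n)]
    return act

def welch_t(a, b):
    """Welch's t-statistic between two groups of activity scores."""
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se if se > 0 else 0.0

def select_sets(act, labels, k):
    """Keep the k gene sets whose activity best separates class 0 from class 1."""
    def score(name):
        a = [v for v, y in zip(act[name], labels) if y == 0]
        b = [v for v, y in zip(act[name], labels) if y == 1]
        return abs(welch_t(a, b))
    return sorted(act, key=score, reverse=True)[:k]

def fit_centroids(act, labels, names):
    """Class centroids in the space of the selected gene-set activities."""
    return {c: [mean(v for v, y in zip(act[s], labels) if y == c) for s in names]
            for c in set(labels)}

def predict(act, names, centroids):
    """Assign each sample to the nearest class centroid (squared Euclidean)."""
    preds = []
    for i in range(len(act[names[0]])):
        x = [act[s][i] for s in names]
        preds.append(min(centroids, key=lambda c: sum(
            (p - q) ** 2 for p, q in zip(x, centroids[c]))))
    return preds

# Hypothetical gene sets and toy data; labels: 0 = normal, 1 = tumor.
gene_sets = {"setA": ["g1", "g2"], "setB": ["g3", "g4"]}
train_expr = {
    "g1": [1.0, 1.2, 0.9, 3.0, 3.1, 2.8],
    "g2": [2.0, 2.1, 1.9, 4.0, 4.2, 3.9],
    "g3": [5.0, 4.0, 6.0, 5.0, 6.0, 4.0],
    "g4": [1.0, 2.0, 1.0, 2.0, 1.0, 2.0],
}
train_labels = [0, 0, 0, 1, 1, 1]
# A second dataset on a different measurement scale (cross-dataset setting).
test_expr = {
    "g1": [11.0, 13.1, 10.8, 12.9],
    "g2": [20.0, 24.0, 19.5, 23.8],
    "g3": [7.0, 7.5, 7.2, 7.1],
    "g4": [3.0, 3.1, 2.9, 3.0],
}
test_labels = [0, 1, 0, 1]

train_act = set_activity(train_expr, gene_sets)
selected = select_sets(train_act, train_labels, k=1)
centroids = fit_centroids(train_act, train_labels, selected)
preds = predict(set_activity(test_expr, gene_sets), selected, centroids)
accuracy = sum(p == y for p, y in zip(preds, test_labels)) / len(test_labels)
print(selected, preds, accuracy)
```

Standardizing each gene within its own dataset is what lets the model trained on one dataset be applied to another measured on a different scale. Real cross-dataset studies would also need probe mapping and batch-effect handling, which this sketch omits.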
Keywords: Microarray; breast cancer; classification; colorectal cancer; feature selection; gene expression analysis; gene-set; lung cancer
Year: 2016 PMID: 27102089 DOI: 10.1142/S0219720016500153
Source DB: PubMed Journal: J Bioinform Comput Biol ISSN: 0219-7200 Impact factor: 1.122