Johannes Smolander, Matthias Dehmer, Frank Emmert-Streib.
Abstract
Genomics data provide great opportunities for translational research and clinical practice, for example, for predicting disease stages. However, classifying such data is challenging because of their high dimensionality, noise, and heterogeneity. In recent years, deep learning classifiers have generated much interest, but because of their complexity, little is known so far about the utility of these methods for genomics. In this paper, we address this problem by studying a computational diagnostics task: the classification of breast cancer and inflammatory bowel disease patients based on high-dimensional gene expression data. We provide a comprehensive analysis of the classification performance of deep belief networks (DBNs) as a function of their multiple model parameters and in comparison with support vector machines (SVMs). Furthermore, we investigate combined classifiers that integrate DBNs with SVMs. Such a classifier uses a DBN as a representation learner whose output forms the input for an SVM. Overall, our results provide guidelines for the complex usage of DBNs for classifying gene expression data from complex diseases.
Keywords: artificial intelligence; deep belief network; deep learning; genomics; neural networks; support vector machine
Year: 2019 PMID: 31074948 PMCID: PMC6609581 DOI: 10.1002/2211-5463.12652
Source DB: PubMed Journal: FEBS Open Bio ISSN: 2211-5463 Impact factor: 2.693
Figure 1. The two stages of DBN learning. The two edges in fine-tuning denote the two stages of the backpropagation algorithm: feeding the input forward and backpropagating the error.
Figure 2. Combining DL representations and SVM. Three ways of combining three types of deep neural network representations with an SVM.
Figure 3. Clinical parameters of the GSE2034 data set. Overview of the clinical parameters of the breast cancer data (GSE2034) [46].
Figure 4. Weight-decay regularization for Rprop. (A) Shallow architecture. (B) Deep architecture. (C) Weight normalization for Bprop with deep architecture.
Summary of the results for breast cancer with undersampled training sets.
Results for breast cancer data. I: Results for the DBN and for the DBN combined with an SVM. Here A = 22 283, and the training sets were undersampled. II: Results for SVM.
Overall summary of the results for IBD. The results are for unbalanced training sets.
Results for IBD. I: Results for DBN. Here A = 22 283, and the training sets were undersampled. II: Results for DBN with Rprop and SVM.
Results for IBD. Results for SVM. Here A = 22 283, and the training sets were undersampled.
I: Combining DL (DBN with Bprop) and SVM. II: Combining RBM-learned representations and SVM. III: Combining autoencoder-learned representations and SVM (ER status). All results are for breast cancer.
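Several of the captions above refer to undersampled training sets. The page does not say how the undersampling was done, so the following is only a minimal sketch that assumes random undersampling of every class down to the size of the smallest class. The function name `undersample` and the toy labels are hypothetical and are not taken from the paper.

```python
import random
from collections import defaultdict


def undersample(samples, labels, seed=0):
    """Randomly drop samples so that every class keeps as many
    samples as the smallest class (assumed strategy, see lead-in)."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    n_min = min(len(idxs) for idxs in by_class.values())
    rng = random.Random(seed)  # fixed seed for reproducibility

    keep = []
    for idxs in by_class.values():
        keep.extend(rng.sample(idxs, n_min))
    keep.sort()  # preserve the original sample order

    return [samples[i] for i in keep], [labels[i] for i in keep]


# Toy, made-up data: 8 samples of one class and 3 of the other.
labels = ["pos"] * 8 + ["neg"] * 3
samples = [[float(i)] for i in range(len(labels))]
X_bal, y_bal = undersample(samples, labels)
```

Only the training set would be balanced this way; the test set would keep its natural class distribution so that the reported performance reflects the real class ratio.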
1: Train DBN model with
2: Perform feature extraction with
3: Train SVM model with
4: Map each sample from
5: For each sample make a prediction for
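The five steps above are truncated in this copy, so their exact inputs are not recoverable here. As a rough illustration of the general idea (learn a representation, then train an SVM on it), the sketch below stacks two of scikit-learn's `BernoulliRBM` layers in front of an `SVC`. This is a stand-in and not the authors' implementation: it has no supervised fine-tuning stage, so it is closest to the "RBM-learned representations and SVM" variant, and the layer sizes, learning rate, kernel, and synthetic data are all assumptions.

```python
import numpy as np
from sklearn.neural_network import BernoulliRBM
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC


def build_rbm_svm(n_hidden=(32, 16), seed=0):
    """Stacked RBMs as an unsupervised feature learner, followed by an SVM.
    Layer sizes and hyperparameters are illustrative assumptions."""
    # BernoulliRBM expects inputs in [0, 1], so scale (and clip) first.
    steps = [("scale", MinMaxScaler(clip=True))]
    for i, n in enumerate(n_hidden):
        rbm = BernoulliRBM(n_components=n, learning_rate=0.05,
                           n_iter=20, random_state=seed)
        steps.append((f"rbm{i}", rbm))
    steps.append(("svm", SVC(kernel="linear", C=1.0)))
    return Pipeline(steps)


# Synthetic stand-in for an expression matrix (samples x genes); not real data.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(60, 200))
y_train = np.array([0, 1] * 30)
X_test = rng.normal(size=(20, 200))

model = build_rbm_svm()
model.fit(X_train, y_train)     # roughly steps 1-3: learn features, train SVM
y_pred = model.predict(X_test)  # roughly steps 4-5: map test samples, predict
```

Wrapping everything in a `Pipeline` means the scaler and the RBM layers are fitted on the training data only and then reused unchanged on the test samples, which avoids leaking test information into the learned representation.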