Mingyang Jiang, Yanchun Liang, Zhili Pei, Xiye Wang, Fengfeng Zhou, Chengxi Wei, Xiaoyue Feng.
Abstract
Breast cancer is estimated to be the leading cancer type among new cases in American women. Core biopsy data have shown a close association between breast hyperplasia and breast cancer, so the early diagnosis and treatment of breast hyperplasia are extremely important for preventing breast cancer. The Mongolian medicine RuXian-I is a traditional drug that has achieved high efficacy and a low incidence of side effects in clinical use. However, a rapid and accurate method for evaluating the efficacy of RuXian-I based on metabolomic data is still lacking. Therefore, we proposed a framework, named the metabolomics deep belief network (MDBN), to analyze breast hyperplasia metabolomic data. We obtained 168 samples of metabolomic data from an animal model experiment of RuXian-I, which were averaged from control groups, treatment groups, and model groups. During training, unlabelled data were used to pretrain the deep belief network (DBN) models, and labelled data were then used for fine-tuning based on a limited-memory Broyden-Fletcher-Goldfarb-Shanno (L-BFGS) algorithm. To prevent overfitting, a dropout method was added to the pretraining and fine-tuning procedures. The experimental results on positive and negative spectrum data showed that the proposed model is superior to other classical classification methods. Further, the proposed model can extend the set of classification methods available for metabolomic data. The high classification accuracy for the three groups indicates clear differences and boundaries between them. It can be inferred that the animal model of RuXian-I is well established, which lays a foundation for subsequent related experiments. This also shows that metabolomic data can be used to verify the effectiveness of RuXian-I in the treatment of breast hyperplasia.
Keywords: Mongolian medicine; breast cancer; deep belief networks; metabolomic data
Year: 2019 PMID: 31141969 PMCID: PMC6600413 DOI: 10.3390/ijms20112620
Source DB: PubMed Journal: Int J Mol Sci ISSN: 1422-0067 Impact factor: 5.923
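The pretraining step mentioned in the abstract can be illustrated with a minimal sketch of one restricted Boltzmann machine (RBM) layer trained by single-step contrastive divergence (CD-1) on unlabelled data, with dropout applied to the hidden units. This is not the authors' implementation: the layer sizes, learning rate, dropout probability, synthetic data, and all function names are assumptions made for illustration, and the L-BFGS fine-tuning stage is not shown.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def sample(probs, rng):
    # Draw binary states from Bernoulli probabilities.
    return [1.0 if rng.random() < p else 0.0 for p in probs]

def hidden_probs(v, W, b_hid, mask):
    # W[i][j] connects visible unit i to hidden unit j.
    # A dropped hidden unit (mask 0) has activation probability 0.
    return [mask[j] * sigmoid(b_hid[j] + sum(v[i] * W[i][j] for i in range(len(v))))
            for j in range(len(b_hid))]

def visible_probs(h, W, b_vis):
    return [sigmoid(b_vis[i] + sum(h[j] * W[i][j] for j in range(len(h))))
            for i in range(len(b_vis))]

def cd1_update(v0, W, b_vis, b_hid, rng, lr=0.1, drop_p=0.5):
    """One CD-1 step on one sample; updates parameters in place
    and returns the mean squared reconstruction error."""
    n_hid = len(b_hid)
    # Hypothetical dropout variant: drop each hidden unit with probability drop_p.
    mask = [0.0 if rng.random() < drop_p else 1.0 for _ in range(n_hid)]
    ph0 = hidden_probs(v0, W, b_hid, mask)   # positive phase
    h0 = sample(ph0, rng)
    v1 = visible_probs(h0, W, b_vis)         # reconstruction
    ph1 = hidden_probs(v1, W, b_hid, mask)   # negative phase
    for i in range(len(v0)):
        for j in range(n_hid):
            W[i][j] += lr * (v0[i] * ph0[j] - v1[i] * ph1[j])
        b_vis[i] += lr * (v0[i] - v1[i])
    for j in range(n_hid):
        b_hid[j] += lr * (ph0[j] - ph1[j])
    return sum((a - b) ** 2 for a, b in zip(v0, v1)) / len(v0)

rng = random.Random(0)
n_vis, n_hid = 6, 4  # assumed sizes, not taken from the paper
W = [[rng.gauss(0.0, 0.01) for _ in range(n_hid)] for _ in range(n_vis)]
b_vis = [0.0] * n_vis
b_hid = [0.0] * n_hid
# Synthetic unlabelled samples scaled to [0, 1], standing in for preprocessed spectra.
data = [[rng.random() for _ in range(n_vis)] for _ in range(20)]
for epoch in range(5):
    err = sum(cd1_update(v, W, b_vis, b_hid, rng) for v in data) / len(data)
```

In a full DBN, the hidden activations of one trained RBM would become the input of the next, and the stacked weights would then initialize the network that is fine-tuned with labelled data.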
Figure 1. The framework of the metabolomics deep belief network (MDBN).
Accuracies of classification in the positive spectrum data in the five-fold cross-validation experiment (%).
| Group | BPNN | KNN | SVM | DBN+GD+Softmax | DBN+L-BFGS+Softmax |
|---|---|---|---|---|---|
| 1 | 79.41 | 58.82 | 64.71 | 88.24 | |
| 2 | 73.53 | 85.29 | 61.76 | 94.12 | |
| 3 | 76.47 | 82.35 | 85.29 | 91.18 | |
| 4 | 76.47 | 79.41 | 67.65 | 88.24 | |
| 5 | 79.41 | 82.35 | 61.76 | 91.18 | |
| Mean | 77.06 | 77.64 | 68.23 | 90.59 | |
Bold values indicate the best results.
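As a quick arithmetic check, the Mean row of the positive-spectrum table appears to be the plain average of the five fold accuracies. The sketch below recomputes it from the per-fold values listed above; the DBN+L-BFGS+Softmax column is omitted because its values are not present in this record.

```python
# Per-fold accuracies (%) copied from the positive-spectrum table.
folds = {
    "BPNN": [79.41, 73.53, 76.47, 76.47, 79.41],
    "KNN": [58.82, 85.29, 82.35, 79.41, 82.35],
    "SVM": [64.71, 61.76, 85.29, 67.65, 61.76],
    "DBN+GD+Softmax": [88.24, 94.12, 91.18, 88.24, 91.18],
}

# Arithmetic mean over the five folds, rounded to two decimals as in the table.
means = {name: round(sum(values) / len(values), 2) for name, values in folds.items()}

for name, mean in means.items():
    print(f"{name}: {mean:.2f}")
```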
Figure 2. Fine-tuning experimental results on the five-fold data sets. In each subgraph of (A–E), (i) is the fine-tuning mean square error (FMSE) of DBN+GD+Softmax, (ii) is the fine-tuning misclassification rate (FMR) of DBN+GD+Softmax, (iii) is the test misclassification rate (TMR) of DBN+GD+Softmax, (iv) is the FMSE of DBN+L-BFGS+Softmax, (v) is the FMR of DBN+L-BFGS+Softmax, and (vi) is the TMR of DBN+L-BFGS+Softmax.
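The two kinds of quantity plotted in Figure 2 can be sketched as follows. This is an illustration only: it assumes the mean square error is taken between softmax outputs and one-hot labels and averaged over all output elements, which the record does not specify, and the function names and example values are hypothetical.

```python
def one_hot(label, n_classes):
    return [1.0 if k == label else 0.0 for k in range(n_classes)]

def mean_squared_error(outputs, labels):
    # Average squared difference between class probabilities and one-hot targets.
    total, count = 0.0, 0
    for out, label in zip(outputs, labels):
        target = one_hot(label, len(out))
        for o, t in zip(out, target):
            total += (o - t) ** 2
            count += 1
    return total / count

def misclassification_rate(outputs, labels):
    # Fraction of samples whose highest-probability class differs from the label.
    wrong = 0
    for out, label in zip(outputs, labels):
        predicted = max(range(len(out)), key=lambda k: out[k])
        if predicted != label:
            wrong += 1
    return wrong / len(labels)

# Made-up softmax outputs for four samples and three classes.
outputs = [[0.8, 0.1, 0.1], [0.2, 0.5, 0.3], [0.3, 0.4, 0.3], [0.1, 0.2, 0.7]]
labels = [0, 1, 2, 2]
mse = mean_squared_error(outputs, labels)
rate = misclassification_rate(outputs, labels)
```

Evaluated on the fine-tuning data, these would correspond to FMSE and FMR; the same rate evaluated on held-out data would correspond to TMR.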
Accuracies of the classification of negative spectrum data in the five-fold cross-validation experiment (%).
| Group | BPNN | KNN | SVM | DBN+GD+Softmax | DBN+L-BFGS+Softmax |
|---|---|---|---|---|---|
| 1 | 94.12 | 97.06 | | | |
| 2 | 88.24 | 79.41 | 94.12 | | 94.12 |
| 3 | 91.18 | 91.18 | 82.35 | | |
| 4 | 91.18 | 94.12 | 61.76 | | 82.35 |
| 5 | 73.53 | 79.41 | 52.94 | 73.53 | 76.47 |
| Mean | 87.65 | 88.24 | 78.23 | | 89.41 |
Bold values indicate the best results.
Classification accuracies of different training and test sets (%).
| Training Set | Test Set | BPNN | KNN | SVM | DBN+GD+Softmax | DBN+L-BFGS+Softmax |
|---|---|---|---|---|---|---|
| 50 | 118 | 59.32 | 33.05 | 39.83 | | 66.95 |
| 60 | 108 | 68.52 | 70.37 | 40.74 | 77.78 | |
| 70 | 98 | 74.49 | 83.67 | 40.82 | | |
| 80 | 88 | 84.09 | 84.09 | 46.59 | 92.05 | |
| 90 | 78 | 78.21 | 82.05 | 48.72 | 88.46 | |
| 100 | 68 | 77.94 | 79.41 | 44.12 | 89.71 | |
| 110 | 58 | 77.59 | 75.86 | 53.45 | 91.38 | |
| 120 | 48 | 81.25 | 70.83 | 54.17 | 89.58 | |
Bold values indicate the best results.
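Because each row of this table gives the test set size, the reported percentages can be converted back into counts of correctly classified samples. The sketch below does this for the first and last rows, using only the BPNN, KNN, and SVM values; the helper name `correct_count` is an assumption for illustration.

```python
TOTAL_SAMPLES = 168  # number of samples stated in the abstract

def correct_count(accuracy_pct, n_test):
    # Number of correctly classified test samples implied by an accuracy in percent.
    return round(accuracy_pct * n_test / 100)

# (training size, test size, accuracies in %) copied from the table.
rows = [
    (50, 118, {"BPNN": 59.32, "KNN": 33.05, "SVM": 39.83}),
    (120, 48, {"BPNN": 81.25, "KNN": 70.83, "SVM": 54.17}),
]

counts = {}
for n_train, n_test, accuracies in rows:
    # Each split uses all 168 samples.
    assert n_train + n_test == TOTAL_SAMPLES
    counts[n_train] = {name: correct_count(acc, n_test) for name, acc in accuracies.items()}
    for name, acc in accuracies.items():
        print(f"train={n_train} {name}: {counts[n_train][name]}/{n_test} correct, "
              f"misclassification rate {100 - acc:.2f}%")
```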
Figure 3. Misclassification rate curves for training set sizes from 50 to 120. (A–H) are the training misclassification rate (TMR) curves of DBN+GD+Softmax for training set sizes from 50 to 120, and (I–P) are the TMR curves of DBN+L-BFGS+Softmax for training set sizes from 50 to 120.
Figure 4. MDBN model. The yellow rectangles represent the metabolomics data. After preprocessing, the data are fed to the DBN. The purple circles represent the input layer. The green circles represent the output layer of the first RBM. The red circles represent the output of the DBN. The blue circles represent the output layer of the Softmax classifier.
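The data flow described in the Figure 4 caption can be sketched as a forward pass: preprocessed input, stacked sigmoid layers standing in for the RBMs, then a softmax output over the three groups (control, treatment, model). The layer sizes, random weights, and function names below are assumptions for illustration, not the architecture reported by the authors.

```python
import math
import random

def sigmoid_layer(x, W, b):
    # W[i][j] connects input unit i to output unit j.
    return [1.0 / (1.0 + math.exp(-(b[j] + sum(x[i] * W[i][j] for i in range(len(x))))))
            for j in range(len(b))]

def softmax(z):
    m = max(z)  # subtract the maximum for numerical stability
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def mdbn_forward(x, rbm_layers, W_out, b_out):
    h = x
    for W, b in rbm_layers:  # propagate through the stacked RBM layers
        h = sigmoid_layer(h, W, b)
    logits = [b_out[k] + sum(h[j] * W_out[j][k] for j in range(len(h)))
              for k in range(len(b_out))]
    return softmax(logits)  # class probabilities

def random_matrix(rows, cols, rng):
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]

rng = random.Random(0)
sizes = [8, 5, 4]  # hypothetical: input features, first RBM output, DBN output
rbm_layers = [(random_matrix(sizes[i], sizes[i + 1], rng), [0.0] * sizes[i + 1])
              for i in range(len(sizes) - 1)]
W_out = random_matrix(sizes[-1], 3, rng)  # three output classes
b_out = [0.0] * 3

x = [rng.random() for _ in range(sizes[0])]  # one synthetic preprocessed sample
probs = mdbn_forward(x, rbm_layers, W_out, b_out)
predicted_group = max(range(3), key=lambda k: probs[k])
```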