| Literature DB >> 30679648 |
Xueming Zheng, Shungao Xu, Ying Zhang, Xinxiang Huang.
Abstract
Based on differences in their biogenesis, miRNAs can be divided into canonical microRNAs and mirtrons. Compared with canonical microRNAs, mirtrons are less conserved and harder to identify. Besides stringent experiment-based annotations, many in silico computational methods have been developed to classify miRNAs. Although several machine learning classifiers delivered high classification performance, all of these predictors depended heavily on the selection of hand-calculated features. Here, we introduce nucleotide-level convolutional neural networks (CNNs) for pre-miRNA classification. By using "one-hot" encoding and padding, pre-miRNAs were converted into matrices of the same shape. The convolution and max-pooling operations can automatically extract features from pre-miRNA sequences. Evaluation on the test dataset showed that our models achieved satisfactory performance. Our investigation demonstrates that it is feasible to apply CNNs to extract features from biological sequences. Since CNNs have many tunable hyperparameters, we believe the performance of nucleotide-level convolutional neural networks can be greatly improved in the future.
Year: 2019 PMID: 30679648 PMCID: PMC6346112 DOI: 10.1038/s41598-018-36946-4
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
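The abstract's "one-hot" encoding and padding step can be sketched as follows. This is a minimal illustration, not the authors' code; the base order (A, C, G, U) and the padded length of 164 (taken from the input_1 shape in the layer table) are assumptions.

```python
import numpy as np

# Hypothetical sketch of the encoding described in the abstract:
# each nucleotide becomes a 4-dim one-hot row, and sequences are
# zero-padded to a fixed length so all matrices share one shape.
BASES = "ACGU"   # assumed base order
MAX_LEN = 164    # assumed from the input_1 layer shape

def one_hot_encode(seq, max_len=MAX_LEN):
    mat = np.zeros((max_len, len(BASES)), dtype=np.float32)
    for i, base in enumerate(seq.upper()[:max_len]):
        mat[i, BASES.index(base)] = 1.0
    return mat  # shape (max_len, 4); trailing all-zero rows are padding

encoded = one_hot_encode("GUAAGUACGU")
```

Because every matrix ends up with the same 164 × 4 shape, a whole batch of pre-miRNAs can be stacked into one tensor for the convolution layers.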
Figure 1. Illustration of the CNN-concat-filters model architecture for base-level pre-miRNA classification. In the CNN-concat-filters model, four kinds of filters (32 filters each) with the same width and different lengths (3, 4, 5 and 6) are employed (Conv3-32, Conv4-32, Conv5-32 and Conv6-32). In the convolution layer, each filter performs convolution on the sequence matrix and generates a feature map. The max-pooling operation then takes the maximum value of each feature map. All the features are concatenated to form a 128-long feature vector for the penultimate fully-connected layer. The final layer is the softmax output, which gives the probability of each class. The shapes of the tensors, as indicated in parentheses, are given by height × width × channels.
Parameter counts and output sizes of each layer in the mixed model.
| Layer (type) | Output Shape | Param # | Connected to |
|---|---|---|---|
| input_1 (InputLayer) | (None, 164, 4, 1) | 0 | |
| Conv3_32 (Conv2D) | (None, 162, 1, 32) | 416 | input_1[0][0] |
| Conv4_32 (Conv2D) | (None, 161, 1, 32) | 544 | input_1[0][0] |
| Conv5_32 (Conv2D) | (None, 160, 1, 32) | 672 | input_1[0][0] |
| Conv6_32 (Conv2D) | (None, 159, 1, 32) | 800 | input_1[0][0] |
| max_pooling2d_1 (MaxPooling2D) | (None, 1, 1, 32) | 0 | Conv3_32[0][0] |
| max_pooling2d_2 (MaxPooling2D) | (None, 1, 1, 32) | 0 | Conv4_32[0][0] |
| max_pooling2d_3 (MaxPooling2D) | (None, 1, 1, 32) | 0 | Conv5_32[0][0] |
| max_pooling2d_4 (MaxPooling2D) | (None, 1, 1, 32) | 0 | Conv6_32[0][0] |
| concatenate_1 (Concatenate) | (None, 1, 1, 128) | 0 | max_pooling2d_1[0][0], max_pooling2d_2[0][0], max_pooling2d_3[0][0], max_pooling2d_4[0][0] |
| flatten_1 (Flatten) | (None, 128) | 0 | concatenate_1[0][0] |
| dense_1 (Dense) | (None, 128) | 16512 | flatten_1[0][0] |
| dropout_1 (Dropout) | (None, 128) | 0 | dense_1[0][0] |
| dense_2 (Dense) | (None, 2) | 258 | dropout_1[0][0] |
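The output heights and parameter counts in the table follow from standard "valid" convolution arithmetic (stride 1, no padding, one bias per filter). A small sketch that reproduces them:

```python
# Reproduce the layer table's output lengths and parameter counts
# from standard Conv2D arithmetic (stride 1, 'valid' padding).
def conv_out_len(in_len, kernel_len):
    # 'valid' convolution: output length = input length - kernel length + 1
    return in_len - kernel_len + 1

def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    # weight count plus one bias term per filter
    return kernel_h * kernel_w * in_channels * filters + filters

for k in (3, 4, 5, 6):
    out_h = conv_out_len(164, k)         # 162, 161, 160, 159
    params = conv2d_params(k, 4, 1, 32)  # 416, 544, 672, 800
    print(f"Conv{k}_32: output height {out_h}, params {params}")

# dense_1: 128 inputs (the concatenated feature vector) -> 128 units
dense_1_params = 128 * 128 + 128  # 16512, matching the table
```

Note that dense_1's parameter count (16512 = 128 × 128 + 128) is only consistent with a 128-long flattened feature vector, i.e. the 4 × 32 concatenated max-pooled features.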
Figure 2. The loss curve during training. The loss is defined as the cross entropy between the predicted and actual values. As training iterates, the loss decreases dramatically and finally tends toward zero. The loss curve is from the CNN-concat-filters model. Horizontal axis: number of iterations. Vertical axis: loss value.
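For a two-class softmax output like the one in this model, the cross-entropy loss between a predicted distribution and a one-hot label can be computed as follows (a minimal sketch, not the authors' training code; the example probabilities are made up):

```python
import math

# Cross entropy between a predicted class distribution and a
# one-hot label, as used for a two-class softmax output.
def cross_entropy(predicted, actual, eps=1e-12):
    # eps guards against log(0) for fully confident wrong predictions
    return -sum(a * math.log(p + eps) for p, a in zip(predicted, actual))

confident = cross_entropy([0.9, 0.1], [1.0, 0.0])  # small loss
uncertain = cross_entropy([0.5, 0.5], [1.0, 0.0])  # larger loss
```

As the predicted probability of the true class approaches 1, the loss approaches 0, which is why the training curve in Figure 2 tends toward zero.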
Performance comparison of our models with traditional machine learning methods. Our models were trained on the training dataset and evaluated on the test dataset, and then compared with traditional machine learning methods. The performance data for the traditional machine learning methods are from Rorbach, G., et al. [23]. "—" means the value was not provided in the original paper.
| Model | Sensitivity | Specificity | F1 | MCC | Accuracy |
|---|---|---|---|---|---|
| CNN-filter3-128 | 0.846 | 0.945 | 0.890 | 0.795 | 0.895 |
| CNN-filter4-128 | 0.786 | 0.980 | 0.871 | 0.781 | 0.883 |
| CNN-filter5-128 | 0.861 | 0.955 | 0.903 | 0.819 | 0.908 |
| CNN-filter6-128 | 0.871 | | | 0.845 | 0.920 |
| CNN-concat-filters | 0.846 | 0.975 | 0.904 | 0.827 | 0.910 |
| Support Vector Machines | 0.926 | 0.945 | 0.901 | 0.859 | — |
| Random Forest | 0.870 | 0.957 | 0.883 | 0.836 | — |
| Linear Discriminant Analysis | 0.935 | 0.919 | 0.881 | 0.830 | — |
| Logistic Regression | 0.875 | 0.941 | 0.867 | 0.816 | — |
| Decision Tree | 0.861 | 0.943 | 0.863 | 0.808 | — |
| Naive Bayes | 0.875 | 0.894 | 0.824 | 0.746 | — |
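All of the metrics in the comparison table are derived from the confusion matrix. A sketch of their standard definitions (the counts in the example are illustrative, not from the paper):

```python
import math

# Standard definitions of the metrics in the comparison table,
# computed from confusion-matrix counts (TP, TN, FP, FN).
def classification_metrics(tp, tn, fp, fn):
    sensitivity = tp / (tp + fn)              # a.k.a. recall
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    return {"sensitivity": sensitivity, "specificity": specificity,
            "f1": f1, "mcc": mcc, "accuracy": accuracy}

m = classification_metrics(tp=85, tn=95, fp=5, fn=15)  # made-up counts
```

MCC is the most stringent of these summaries, since it only approaches 1 when all four confusion-matrix cells are favorable, which is why it tends to be lower than accuracy or F1 in the table.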