Qingwen Li, Lei Xu, Qingyuan Li, Lichao Zhang.
Abstract
Enhancers are noncoding fragments of DNA sequences that play an important role in gene transcription and translation. However, because they are freely scattered and highly variable in position, identifying and classifying enhancers is more complex than identifying and classifying coding genes. Many computational studies have addressed this problem, but the existing prediction models still have deficiencies. In this paper, we combine several feature extraction strategies, dimensionality reduction, and machine learning and recurrent neural network models to predict enhancer identification and classification, reaching accuracies of 76.7% and 84.9%, respectively. The proposed model is superior to previous methods in performance metrics or in feature dimension, and it may inform future computational prediction of enhancers.
Year: 2020 PMID: 33133227 PMCID: PMC7591959 DOI: 10.1155/2020/8852258
Source DB: PubMed Journal: Comput Math Methods Med ISSN: 1748-670X Impact factor: 2.238
Figure 1. Research process of enhancer identification.
Figure 2. Research process of enhancer classification.
One-Hot encoding.
| Nucleotides | Code |
|---|---|
| A | [1,0,0,0] |
| T | [0,0,0,1] |
| C | [0,1,0,0] |
| G | [0,0,1,0] |
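The mapping in the table above can be applied to a whole sequence one nucleotide at a time. The following is a minimal sketch, not the authors' implementation: the function name `one_hot_encode` is hypothetical, and raising an error on characters outside A, C, G, T is an assumption, since the table does not say how ambiguous bases are handled.

```python
# Mapping copied from the one-hot encoding table.
ONE_HOT = {
    "A": [1, 0, 0, 0],
    "C": [0, 1, 0, 0],
    "G": [0, 0, 1, 0],
    "T": [0, 0, 0, 1],
}


def one_hot_encode(sequence):
    """Return one 4-element vector per nucleotide in `sequence`.

    Hypothetical helper: characters other than A, C, G, T raise
    ValueError (an assumption, not taken from the source).
    """
    encoded = []
    for base in sequence.upper():
        if base not in ONE_HOT:
            raise ValueError(f"unexpected nucleotide: {base!r}")
        # Copy the vector so callers cannot mutate the shared table.
        encoded.append(list(ONE_HOT[base]))
    return encoded


# Example: a 4-nucleotide sequence becomes a 4 x 4 matrix.
matrix = one_hot_encode("ATCG")
```

A sequence of length n therefore yields an n x 4 matrix, which is the usual input shape for a sequence model.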
Figure 3. (a) Accuracy of the different feature extraction methods after verification. The method shown in dark blue had higher accuracy, while the method shown in purple had lower accuracy. (b) Change in accuracy of the different extraction methods before and after dimensionality reduction. Accuracy improved after dimensionality reduction.
Figure 4. Relationship between the change in accuracy and the change in dimension. The trend indicates that dimension and accuracy are negatively correlated. With MRMD2.0, accuracy reached 76.68% at a dimension of 37; reducing the dimension further did not improve accuracy.
The comparison between this paper and the previous work on enhancer identification.
| Method | Acc | AUC | SN | SP | MCC | Dimension |
|---|---|---|---|---|---|---|
| iEnhancer-2L | 0.730 | 0.806 | 0.710 | 0.750 | 0.460 | |
| EnhancerPred | 0.740 | 0.801 | 0.735 | 0.745 | 0.480 | |
| iEnhancer-EL | 0.748 | 0.817 | 0.710 | 0.785 | 0.496 | |
| iEnhancer-ECNN | 0.769 | 0.832 | 0.785 | 0.752 | 0.537 | 2400 |
| Our method | 0.767 | 0.837 | 0.733 | 0.801 | 0.535 | 37 |
The comparison between this paper and the previous work on enhancer classification.
| Method | Acc | SN | SP | MCC |
|---|---|---|---|---|
| iEnhancer-2L | 0.605 | 0.470 | 0.740 | 0.218 |
| EnhancerPred | 0.550 | 0.45 | 0.65 | 0.102 |
| iEnhancer-EL | 0.61 | 0.540 | 0.68 | 0.222 |
| iEnhancer-ECNN | 0.678 | 0.791 | 0.564 | 0.368 |
| Our method | 0.849 | 0.858 | 0.84 | 0.699 |
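The comparison tables report accuracy (Acc), sensitivity (SN), specificity (SP), and the Matthews correlation coefficient (MCC). This section does not give their formulas, so the sketch below uses the standard confusion-matrix definitions as an assumption. The function name `binary_metrics` is hypothetical, and returning an MCC of 0.0 when the denominator is zero is a common convention rather than something stated in the source.

```python
import math


def binary_metrics(tp, tn, fp, fn):
    """Compute Acc, SN, SP, and MCC from confusion-matrix counts.

    Standard definitions (assumed, not quoted from the source):
      Acc = (TP + TN) / (TP + TN + FP + FN)
      SN  = TP / (TP + FN)
      SP  = TN / (TN + FP)
      MCC = (TP*TN - FP*FN)
            / sqrt((TP+FP) * (TP+FN) * (TN+FP) * (TN+FN))
    """
    total = tp + tn + fp + fn
    acc = (tp + tn) / total
    sn = tp / (tp + fn)
    sp = tn / (tn + fp)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Convention: report 0.0 when MCC is undefined (zero denominator).
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"Acc": acc, "SN": sn, "SP": sp, "MCC": mcc}


# Hypothetical counts, chosen only to illustrate the call.
example = binary_metrics(tp=80, tn=90, fp=10, fn=20)
```

As a consistency check under one further assumption, namely equal numbers of positive and negative test samples (not stated in this section), the identification row for "Our method" (SN 0.733, SP 0.801) corresponds to `binary_metrics(733, 801, 199, 267)`, which gives an Acc of 0.767 and an MCC of about 0.535, in line with the table.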