Muhammad Tahir, Hilal Tayara, Kil To Chong.
Abstract
Pseudouridine is the most prevalent RNA modification and has been found in both eukaryotes and prokaryotes. It has been identified in several kinds of RNA, including small nuclear RNA, rRNA, tRNA, mRNA, and small nucleolar RNA, which makes it significant for both academic research and drug development. Biochemical experiments can identify pseudouridine sites with good results, but these laboratory methods are expensive and time consuming, so efficient computational methods for identifying pseudouridine sites are needed. In this study, a deep-learning method for identifying pseudouridine sites was developed. The proposed prediction model is called iPseU-CNN (identifying pseudouridine by convolutional neural networks). Existing methods rely on handcrafted features and machine-learning approaches to identify pseudouridine sites, whereas the proposed predictor extracts the features of pseudouridine sites automatically using a convolutional neural network model. The iPseU-CNN model yields better outcomes than the current state-of-the-art models in all evaluation metrics. It is therefore anticipated that the iPseU-CNN predictor will become a helpful tool for academic research on pseudouridine site prediction in RNA, as well as in drug discovery.
Keywords: RNA; convolution neural network; deep learning; iPseU-CNN; pseudouridine sites
Year: 2019 PMID: 31048185 PMCID: PMC6488737 DOI: 10.1016/j.omtn.2019.03.010
Source DB: PubMed Journal: Mol Ther Nucleic Acids
Figure 1. Illustration of the Pseudouridine Modification
The Success Rates of iPseU-CNN and the Baseline Methods with the Training Datasets
| Training Dataset | Methods | Accuracy (%) | Sensitivity (%) | Specificity (%) | MCC |
|---|---|---|---|---|---|
| H_990 | n-gram | 60.00 | 51.51 | 68.48 | 0.20 |
| H_990 | MMI | 58.78 | 47.47 | 70.10 | 0.18 |
| H_990 | CNN | 66.68 | 65.00 | 68.78 | 0.34 |
| S_628 | n-gram | 62.73 | 64.64 | 60.82 | 0.25 |
| S_628 | MMI | 60.19 | 67.51 | 52.86 | 0.20 |
| S_628 | CNN | 68.15 | 66.36 | 70.45 | 0.37 |
| M_944 | n-gram | 62.71 | 65.04 | 60.38 | 0.25 |
| M_944 | MMI | 58.26 | 63.13 | 53.38 | 0.16 |
| M_944 | CNN | 71.81 | 74.79 | 69.11 | 0.44 |
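The four columns reported in these tables can be derived from a binary confusion matrix. The sketch below uses the standard definitions of accuracy, sensitivity, specificity, and the Matthews correlation coefficient (MCC); the function name `binary_metrics` and the example counts are hypothetical and are not taken from the paper's datasets.

```python
import math


def binary_metrics(tp, fn, tn, fp):
    """Compute accuracy, sensitivity, specificity (in %) and MCC.

    tp/fn: pseudouridine sites predicted correctly / missed.
    tn/fp: non-pseudouridine sites predicted correctly / wrongly flagged.
    """
    accuracy = 100.0 * (tp + tn) / (tp + fn + tn + fp)
    sensitivity = 100.0 * tp / (tp + fn)
    specificity = 100.0 * tn / (tn + fp)
    # MCC ranges from -1 to 1; it is defined as 0 when any marginal is empty.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "mcc": mcc,
    }


# Hypothetical counts for a balanced test set of 100 positives and 100 negatives.
result = binary_metrics(tp=70, fn=30, tn=80, fp=20)
print({name: round(value, 2) for name, value in result.items()})
```

On a balanced dataset, accuracy equals the mean of sensitivity and specificity, which gives a quick consistency check for rows in tables like these.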
The Success Rates of iPseU-CNN and the Baseline Methods with Two Independent Testing Datasets
| Testing Dataset | Methods | Accuracy (%) | Sensitivity (%) | Specificity (%) | MCC |
|---|---|---|---|---|---|
| H_200 | n-gram | 67.00 | 57.00 | 78.00 | 0.35 |
| H_200 | MMI | 63.50 | 58.00 | 69.00 | 0.27 |
| H_200 | CNN | 69.00 | 77.72 | 60.81 | 0.40 |
| S_200 | n-gram | 70.50 | 70.00 | 71.00 | 0.41 |
| S_200 | MMI | 69.50 | 72.00 | 67.00 | 0.39 |
| S_200 | CNN | 73.50 | 68.76 | 77.82 | 0.47 |
Figure 2. The Success Rates of the iPseU-CNN and Baseline Methods
The Success Rates of iPseU-CNN and State-of-the-Art Methods with the Training Datasets
| Training Dataset | Models | Accuracy (%) | Sensitivity (%) | Specificity (%) | MCC |
|---|---|---|---|---|---|
| H_990 | iPseU-CNN | 66.68 | 65.00 | 68.78 | 0.34 |
| H_990 | PseUI | 64.24 | 64.85 | 63.64 | 0.28 |
| H_990 | iRNA-PseU | 60.40 | 61.01 | 59.80 | 0.21 |
| S_628 | iPseU-CNN | 68.15 | 66.36 | 70.45 | 0.37 |
| S_628 | PseUI | 65.13 | 62.74 | 67.52 | 0.30 |
| S_628 | iRNA-PseU | 64.49 | 64.65 | 64.33 | 0.29 |
| M_944 | iPseU-CNN | 71.81 | 74.79 | 69.11 | 0.44 |
| M_944 | PseUI | 70.44 | 79.87 | 70.34 | 0.41 |
| M_944 | iRNA-PseU | 69.07 | 73.31 | 64.83 | 0.38 |
The Success Rates of the iPseU-CNN and State-of-the-Art Methods with Two Independent Testing Datasets
| Testing Dataset | Models | Accuracy (%) | Sensitivity (%) | Specificity (%) | MCC |
|---|---|---|---|---|---|
| H_200 | iPseU-CNN | 69.00 | 77.72 | 60.81 | 0.40 |
| H_200 | PseUI | 65.50 | 63.00 | 68.00 | 0.31 |
| H_200 | iRNA-PseU | 61.50 | 58.00 | 65.00 | 0.23 |
| S_200 | iPseU-CNN | 73.50 | 68.76 | 77.82 | 0.47 |
| S_200 | PseUI | 68.50 | 65.00 | 72.00 | 0.37 |
| S_200 | iRNA-PseU | 60.00 | 63.00 | 57.00 | 0.20 |
Figure 3. The Success Rates of the iPseU-CNN and State-of-the-Art Methods
Figure 4. Illustration of the Architecture of the iPseU-CNN Model
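The record does not describe how RNA sequences are fed to the network. As a rough illustration of how a convolutional layer can learn features directly from sequence, the sketch below assumes one-hot encoding over the alphabet A, C, G, U (a common choice, not confirmed here) and applies a single hand-written 1D filter. The helper names `one_hot` and `conv1d_valid`, the example sequence, and the filter weights are all hypothetical.

```python
ALPHABET = "ACGU"  # assumed channel order for the one-hot encoding


def one_hot(seq):
    """Encode an RNA string as a list of 4-element one-hot rows."""
    index = {base: i for i, base in enumerate(ALPHABET)}
    rows = []
    for base in seq.upper():
        row = [0.0] * len(ALPHABET)
        row[index[base]] = 1.0  # raises KeyError for characters outside ACGU
        rows.append(row)
    return rows


def conv1d_valid(x, kernel, stride=1):
    """Slide one filter over x without padding.

    x is length-L, kernel is length-k, both with the same channel count.
    The output has floor((L - k) / stride) + 1 values.
    """
    k = len(kernel)
    out = []
    for start in range(0, len(x) - k + 1, stride):
        total = 0.0
        for i in range(k):
            for c in range(len(kernel[i])):
                total += x[start + i][c] * kernel[i][c]
        out.append(total)
    return out


# Hypothetical 11-nt window and a size-3 filter that fires when U is centered.
encoded = one_hot("ACGUACGUACG")
u_center = [[0, 0, 0, 0], [0, 0, 0, 1], [0, 0, 0, 0]]
print(conv1d_valid(encoded, u_center, stride=1))
print(len(conv1d_valid(encoded, u_center, stride=2)))
```

In a trained network the filter weights are learned rather than written by hand, which is what replaces the handcrafted features used by earlier predictors.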
The Ranges of the Tuned Hyper-Parameters
| Hyper-Parameter | Range |
|---|---|
| Convolution layers | [1,2] |
| Filters | [5,7,9] |
| Filter size | [3,5,7] |
| Stride | [1,2] |
| Dropout | [0.25, 0.50] |
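To give a sense of the size of this search space, the sketch below enumerates every combination of the tabulated values. It assumes each range is a discrete list of candidate values (the dropout entry could also be read as an interval), and the dictionary keys and the `iter_configs` helper are hypothetical names rather than anything taken from the authors' code.

```python
from itertools import product

# Candidate values copied from the table; key names are illustrative only.
search_space = {
    "conv_layers": [1, 2],
    "filters": [5, 7, 9],
    "filter_size": [3, 5, 7],
    "stride": [1, 2],
    "dropout": [0.25, 0.50],
}


def iter_configs(space):
    """Yield one dict per combination of hyper-parameter values."""
    keys = list(space)
    for values in product(*(space[key] for key in keys)):
        yield dict(zip(keys, values))


configs = list(iter_configs(search_space))
print(len(configs))  # 2 * 3 * 3 * 2 * 2 = 72 combinations
print(configs[0])
```

Each configuration would then be trained and scored on validation data; how the authors selected the final setting is not stated in this record.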