Ibrahim Abdelbaky, Hilal Tayara, Kil To Chong.
Abstract
MicroRNAs (miRNAs) are short non-coding RNAs that play important roles in the body and affect various diseases, including cancers. Controlling miRNAs with small molecules is studied here to provide new drug repurposing perspectives for miRNA-related diseases. Experimental methods are time- and effort-consuming, so computational techniques have been applied, relying mostly on biological feature similarities and a network-based scheme to infer new miRNA-small molecule associations. Collecting such features is time-consuming and may be impractical. Here we suggest an alternative method of similarity calculation that represents miRNAs and small molecules through continuous feature representations learned by the proposed deep learning auto-encoder architecture. The suggested representation was compared to previous work and achieved comparable results under 5-fold cross validation (92% of known associations identified within the top 25% of predictions) and better predictions for most of the case studies (an average of 31% vs. 25% identified within the top 10% of predictions). The results support the proposed method as a replacement for previous time- and effort-consuming methods.
Keywords: deep learning auto-encoders; drug repurposing; miRNA-small molecule associations; sequence encoding
Year: 2021 PMID: 35056899 PMCID: PMC8780428 DOI: 10.3390/pharmaceutics14010003
Source DB: PubMed Journal: Pharmaceutics ISSN: 1999-4923 Impact factor: 6.321
Figure 1. The encoding of miRNAs and small molecules: the input (miRNA sequence or SM SMILES) is represented as a one-hot encoding and compressed by the encoder into a low-dimensional representation (64 or 128 features). The decoder reconstructs the inputs from the encodings to verify the encoding quality.
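The one-hot input step described in the Figure 1 caption can be sketched as follows. This is a minimal illustration, not the authors' implementation: the `ACGU` alphabet, the padded length of 30, and the function name `one_hot_encode` are all assumptions made for the example, and the encoder/decoder network itself is not shown.

```python
from typing import List

# Assumed alphabet for miRNA sequences; a SMILES input would need its own symbol set.
RNA_ALPHABET = "ACGU"


def one_hot_encode(sequence: str, alphabet: str = RNA_ALPHABET,
                   max_len: int = 30) -> List[List[int]]:
    """Return a max_len x len(alphabet) one-hot matrix.

    Rows beyond the end of the sequence stay all-zero (padding);
    max_len is a hypothetical choice, not a value from the paper.
    """
    index = {symbol: i for i, symbol in enumerate(alphabet)}
    matrix = [[0] * len(alphabet) for _ in range(max_len)]
    for position, symbol in enumerate(sequence.upper()[:max_len]):
        if symbol not in index:
            raise ValueError(f"unexpected symbol {symbol!r} at position {position}")
        matrix[position][index[symbol]] = 1
    return matrix


# An illustrative 22-nucleotide sequence.
encoded = one_hot_encode("UAGCUUAUCAGACUGAUGUUGA")
```

Each row of `encoded` has at most one non-zero entry. A matrix of this kind is what an auto-encoder would take as input and try to reconstruct.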
Figure 2. Graphlet Interaction model: similarity matrices and known associations (solid arrows) are inputs; predicted associations (dashed arrows) are outputs.
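The similarity matrices that feed the model in Figure 2 are derived from the learned feature vectors. The source does not state which similarity measure is applied, so the sketch below assumes cosine similarity purely for illustration. The function names and the toy 3-dimensional vectors are hypothetical stand-ins for the 64-d or 128-d encodings.

```python
import math
from typing import List, Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def similarity_matrix(embeddings: Sequence[Sequence[float]]) -> List[List[float]]:
    """Pairwise similarity between all rows of an embedding table."""
    n = len(embeddings)
    return [[cosine_similarity(embeddings[i], embeddings[j]) for j in range(n)]
            for i in range(n)]


# Toy vectors standing in for learned encodings (assumed values).
toy_embeddings = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [1.0, 1.0, 0.0]]
sim = similarity_matrix(toy_embeddings)
```

One such matrix would be built for miRNAs and another for small molecules, each from its own set of encodings.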
Average fraction of known associations recovered within the top-ranked fraction of predictions during 5-fold cross validation, for the Guan et al., 64-d feature, and 128-d feature similarities.
| Top fraction of predictions | Guan et al. | 64-d Features | 128-d Features |
|---|---|---|---|
| 0.01 | 0.26 | 0.26 | 0.27 |
| 0.02 | 0.40 | 0.39 | 0.37 |
| 0.05 | 0.66 | 0.62 | 0.62 |
| 0.10 | 0.83 | 0.83 | 0.80 |
| 0.15 | 0.88 | 0.88 | 0.87 |
| 0.20 | 0.90 | 0.89 | 0.89 |
| 0.25 | 0.92 | 0.92 | 0.92 |
| 0.30 | 0.94 | 0.94 | 0.94 |
| 0.40 | 0.96 | 0.96 | 0.96 |
| 0.50 | 1.00 | 1.00 | 1.00 |
Figure 3. Percentages of correct predictions for 5-fold cross validation for Guan et al., 64-d features, and 128-d features.
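The quantity tabulated above, the share of held-out known associations that land in the top-ranked fraction of predictions, can be computed along these lines. This is a sketch under assumptions: the function name `recall_at_fraction`, the use of `math.ceil` for the cutoff, and the toy scores are illustrative choices rather than details from the paper.

```python
import math
from typing import Dict, Hashable, Set


def recall_at_fraction(scores: Dict[Hashable, float],
                       held_out: Set[Hashable],
                       fraction: float) -> float:
    """Fraction of held-out associations ranked within the top `fraction` of candidates."""
    if not held_out:
        return 0.0
    # Rank candidate pairs from highest to lowest predicted score.
    ranked = sorted(scores, key=scores.get, reverse=True)
    cutoff = math.ceil(fraction * len(ranked))  # assumed rounding rule
    top = set(ranked[:cutoff])
    return len(top & held_out) / len(held_out)


# Toy example: ten candidate pairs, two of which are held-out known associations.
toy_scores = {f"pair{i}": float(10 - i) for i in range(10)}  # pair0 scores highest
toy_held_out = {"pair1", "pair6"}

recall_top_20 = recall_at_fraction(toy_scores, toy_held_out, 0.2)
recall_top_50 = recall_at_fraction(toy_scores, toy_held_out, 0.5)
recall_all = recall_at_fraction(toy_scores, toy_held_out, 1.0)
```

In the toy data only `pair1` falls in the top 20% and top 50%, so both cutoffs recover half of the held-out pairs. Averaging this value over the folds would give one row of the table.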
Average fraction of known associations recovered within the top-ranked fraction of predictions across the 6 SM case studies, for the Guan et al., 64-d feature, and 128-d feature similarities.
| Top fraction of predictions | Guan et al. | 64-d Features | 128-d Features |
|---|---|---|---|
| 0.01 | 0.04 | 0.05 | 0.05 |
| 0.02 | 0.07 | 0.10 | 0.08 |
| 0.05 | 0.16 | 0.20 | 0.17 |
| 0.10 | 0.25 | 0.31 | 0.32 |
| 0.15 | 0.35 | 0.45 | 0.42 |
| 0.20 | 0.49 | 0.53 | 0.52 |
| 0.25 | 0.56 | 0.57 | 0.57 |
| 0.30 | 0.59 | 0.62 | 0.63 |
| 0.40 | 0.70 | 0.71 | 0.69 |
| 0.50 | 0.74 | 0.73 | 0.72 |
Figure 4. Avg. prediction curves for 6 SM case studies at different points (Guan et al., 64-d feature, and 128-d feature methods).
Figure 5. Avg. prediction curves for the miR-21 case study at different points (Guan et al., 64-d, and 128-d feature methods).