Atefeh Seyeddokht, Ali Asghar Aslaminejad, Ali Masoudi-Nejad, Mohammadreza Nassiri, Javad Zahiri, Balal Sadeghi.
Abstract
BACKGROUND: Piwi-interacting RNAs (piRNAs) are recently discovered small non-coding RNAs (ncRNAs) about 24-32 nucleotides long. These ncRNAs play an important role in germline development, transposon silencing, epigenetic regulation, protection of the genome from invasive transposable elements, and the pathophysiology of diseases such as cancer. Identifying piRNAs is challenging because they lack conserved sequences and structural elements.
Keywords: Piwi-interacting RNAs (piRNAs); RNA; Support Vector Machines (SVM)
Year: 2016 PMID: 26855734 PMCID: PMC4717465
Source DB: PubMed Journal: Avicenna J Med Biotechnol ISSN: 2008-2835
The final 48 features used for building our model
| Feature | Description |
| --- | --- |
| ANAA | Motif |
| CNTG | Motif |
| CNTA | Motif |
| CTNT | Motif |
| CNTC | Motif |
| CAC | Motif |
| CNTNT | Motif |
| GNCA | Motif |
| ATA | Motif |
| ANTT | Motif |
| TGNNT | Motif |
| GNAC | Motif |
| CTT | Motif |
| ANNCT | Motif |
| AGNG | Motif |
| %G+C | GC content |
| %XY | Frequency of dinucleotide XY (A,T,C,G) |
| MFEI1 | Index 1 based on the minimum free energy |
| MFEI2 | Index 2 based on the minimum free energy |
| MFEI3 | Index 3 based on the minimum free energy |
| MFEI4 | Index 4 based on the minimum free energy |
| dG | Normalized minimum free energy |
| dp | Normalized base-pairing propensity |
| NEFE | Normalized ensemble free energy |
| Freq | Frequency of the MFE structure |
| Diff | Structural diversity |
| \|A-U\|/L | Normalized base pair counts |
| \|G-C\|/L | Normalized base pair counts |
| \|G-U\|/L | Normalized base pair counts |
| ABS | Average base pairs per stem |
| %(A-U)/s | Based on the average base pairs per stem |
| %(G-U)/s | Based on the average base pairs per stem |
| %(G-C)/s | Based on the average base pairs per stem |
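To make a few of the listed features concrete, the following is a minimal sketch of how the motif, %G+C, %XY, and normalized base-pair-count features might be computed. It is not the authors' implementation. The function names are hypothetical, and several choices are assumptions not stated in the table: `N` in a motif is treated as a wildcard matching any single base, motif occurrences are counted with overlaps, frequencies are expressed as percentages, and the secondary structure is supplied in dot-bracket notation by an external folding tool.

```python
import re
from collections import Counter
from itertools import product

BASES = "ATCG"


def gc_content(seq):
    """%G+C: percentage of G and C bases in the sequence."""
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def dinucleotide_freqs(seq):
    """%XY: percentage of each overlapping dinucleotide XY (16 values)."""
    seq = seq.upper()
    total = len(seq) - 1
    counts = Counter(seq[i:i + 2] for i in range(total))
    return {x + y: 100.0 * counts[x + y] / total
            for x, y in product(BASES, repeat=2)}


def motif_count(seq, motif):
    """Count overlapping motif occurrences; N matches any base (assumption)."""
    # A zero-width lookahead lets overlapping matches be counted.
    pattern = "(?=" + motif.upper().replace("N", "[ATCG]") + ")"
    return len(re.findall(pattern, seq.upper()))


def normalized_pair_counts(seq, structure):
    """|A-U|/L, |G-C|/L, |G-U|/L from a dot-bracket structure string."""
    seq = seq.upper().replace("T", "U")
    stack, counts = [], Counter()
    for i, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(i)
        elif symbol == ")":
            j = stack.pop()
            # Sort the two bases so that G-C and C-G share one key.
            counts["".join(sorted(seq[i] + seq[j]))] += 1
    length = len(seq)
    return {"A-U": counts["AU"] / length,
            "G-C": counts["CG"] / length,
            "G-U": counts["GU"] / length}
```

For example, `motif_count(seq, "ANAA")` would give the value of the ANAA feature for a candidate sequence `seq`, and `normalized_pair_counts(seq, structure)` would give the three |X-Y|/L features once a predicted structure is available.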
Figure 1. Flowchart describing the pipeline for piRNA identification.
Figure 2. Different performance measures when different subsets of features were used.
Figure 3. Accuracy of the SVM model with different kernel functions.
Performance of the SVM using different subsets of features
| S1 | 0.85 | 0.73 | 0.81 |
| S2 | 0.90 | 0.78 | 0.91 |
| S3 | 0.91 | 0.81 | 0.88 |
| S4 | 0.98 | 0.98 | 0.99 |
| S5 | 0.98 | 0.98 | 0.99 |
| S6 | 0.78 | 0.50 | 0.92 |
| S7 | 0.97 | 0.95 | 0.97 |
| S8 | 0.71 | 0.25 | 0.96 |
Figure 4. Different values of the parameter γ in the SVM model.
Comparison with other methods
| 98 | 52 | 75 |
| 89 | 91 | 90 |
| 82 | 30 | 58 |
| 99 | 99 | 98 |