Chun-Chi Chen, Xiaoning Qian, Byung-Jun Yoon.
Abstract
BACKGROUND: Piwi-interacting RNAs (piRNAs) are a new class of small non-coding RNAs that are known to be associated with RNA silencing. They play an important role in protecting the genome from invasive transposons in the germline. Recent studies have shown that piRNAs are linked to genome stability and to a variety of human cancers. Because of their clinical importance, there is a pressing need for effective computational methods for identifying piRNAs. However, piRNAs lack conserved structural motifs and show relatively low sequence similarity across species, which makes accurate computational prediction challenging.
Keywords: Support vector machine (SVM); n-gram model (NGM); piRNA prediction; piwi-interacting RNA (piRNA)
Year: 2017 PMID: 29297285 PMCID: PMC5751586 DOI: 10.1186/s12859-017-1896-1
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Fig. 1. The piRNA detection accuracy and the average number of classified families for Z = 1.5. (a) The prediction accuracy is shown on the y-axis and the dataset size is shown on the x-axis. Lines in different colors correspond to different values of N. (b) The average number of classified families for different N and dataset size.
Fig. 2. The piRNA detection accuracy and the average number of classified families for N = 200. (a) The prediction accuracy is shown on the y-axis and the dataset size is shown on the x-axis. Lines in different colors correspond to different values of Z. (b) The average number of classified families for different Z and dataset size.
Dataset size for each species
| Species | Size |
|---|---|
| | 32,826 |
| | 63,182 |
| | 51,664,769 |
Prediction accuracy of piRNAdetect compared against the K-mer scheme and piRPred
| Method | TPR | FPR | ACC (%) | TPR | FPR | ACC (%) | TPR | FPR | ACC (%) |
|---|---|---|---|---|---|---|---|---|---|
| piRNAdetect | 0.848 | 0.160 | 84.40 | 0.837 | 0.195 | 82.11 | 0.806 | 0.213 | 79.65 |
| K-mer scheme | 0.821 | 0.226 | 79.76 | 0.781 | 0.222 | 77.95 | 0.698 | 0.259 | 71.95 |
| piRPred | 0.375 | 0.098 | 63.85 | 0.290 | 0.201 | 54.42 | 0.208 | 0.020 | 59.39 |
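The accuracies in the table above appear to follow from the reported TPR and FPR if each test set contains equal numbers of positive and negative sequences. That balance is an assumption made here for illustration and is not stated in this record. Under it, ACC = (TPR + (1 − FPR)) / 2. A minimal sketch, with the hypothetical helper name `balanced_accuracy`, that recomputes the piRNAdetect row:

```python
def balanced_accuracy(tpr: float, fpr: float) -> float:
    """Accuracy in percent implied by TPR and FPR.

    Assumes equally many positive and negative test sequences
    (an assumption, not something this record states).
    """
    tnr = 1.0 - fpr  # true negative rate
    return 100.0 * (tpr + tnr) / 2.0


# (TPR, FPR, reported ACC %) for piRNAdetect, copied from the table above
pirnadetect_rows = [
    (0.848, 0.160, 84.40),
    (0.837, 0.195, 82.11),
    (0.806, 0.213, 79.65),
]

for tpr, fpr, reported in pirnadetect_rows:
    recomputed = balanced_accuracy(tpr, fpr)
    print(f"recomputed {recomputed:.2f} vs reported {reported:.2f}")
```

Small differences between recomputed and reported values are expected, since TPR and FPR are given to only three decimal places.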
Fig. 3. ROC curves showing the prediction performance of piRNAdetect and the performance of the K-mer scheme. (a) The performance for predicting piRNAs in H. sapiens. The false positive rate (FPR) is shown on the x-axis and the true positive rate (TPR) is shown on the y-axis. (b) The prediction performance for piRNAs in R. norvegicus. (c) The prediction performance for piRNAs in M. musculus.
Prediction performance based on average AUC
| Average AUC | | | |
|---|---|---|---|
| species | | | |
| piRNAdetect | 90.28 | 88.15 | 85.97 |
| K-mer scheme | 87.84 | 86.06 | 79.36 |
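The AUC summarizes a ROC curve as the area under the TPR-versus-FPR plot. The sketch below shows one common way to compute it, the trapezoidal rule. It is not necessarily the procedure the authors used. The function name `auc_trapezoid` and the ROC points are hypothetical and chosen only for illustration.

```python
def auc_trapezoid(fpr, tpr):
    """Area under a ROC curve given matching FPR and TPR points.

    Uses the trapezoidal rule; points are sorted by FPR first.
    """
    points = sorted(zip(fpr, tpr))
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical ROC points (not taken from the paper), from (0, 0) to (1, 1)
example_fpr = [0.0, 0.1, 0.3, 1.0]
example_tpr = [0.0, 0.6, 0.9, 1.0]

# Multiply by 100 to express the area on the percentage scale used in the table
print(f"AUC = {100.0 * auc_trapezoid(example_fpr, example_tpr):.2f}")
```

A classifier that guesses at random traces the diagonal and scores 0.5 (50 on the percentage scale), while a perfect classifier scores 1.0 (100).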