A K M A Baten, S K Halgamuge, B C H Chang.
Abstract
BACKGROUND: Accurate identification of splice sites in DNA sequences plays a key role in the prediction of gene structure in eukaryotes. Many computational methods have already been proposed for detecting splice sites, and some of them achieve high prediction accuracy. However, most of these methods require long computation times when applied to whole-genome sequence data.
Year: 2008 PMID: 19091031 PMCID: PMC2638148 DOI: 10.1186/1471-2105-9-S12-S8
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. ROC curve showing the classification performance of different models for NN269 acceptor splice site data.
AUC and training time for different models for NN269 acceptor splice sites.
| Model | Kernel | AUC | Training time (hh:mm:ss) |
|---|---|---|---|
| Reduced MM1 SVM (best in terms of accuracy) | GRBF | | |
| Reduced MM1 SVM | Polynomial | 0.9695822 | 00:10:48 |
| MM1 SVM | Polynomial | 0.9674048 | 00:11:02 |
| IC Shapiro SVM (best in terms of time) | Polynomial | | |
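As a reminder of what an AUC value like those in the table above measures, here is a minimal sketch based on the rank interpretation of AUC: the probability that a randomly chosen true splice site receives a higher classifier score than a randomly chosen decoy, with ties counted as one half. The function name and the toy scores are hypothetical and are not taken from the paper, which does not state how its AUC values were computed.

```python
def auc_from_scores(pos_scores, neg_scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical classifier scores, for illustration only.
true_site_scores = [0.9, 0.8, 0.4]
decoy_scores = [0.7, 0.3, 0.2]
toy_auc = auc_from_scores(true_site_scores, decoy_scores)
print(round(toy_auc, 4))
```

The pairwise loop is quadratic in the number of examples, so it is only meant to make the definition concrete; a sort-based computation would be preferable on full data sets.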
Figure 2. ROC curve showing the classification performance of the best two models in terms of accuracy and training time for NN269 acceptor splice site data.
AUC and training time improvement for different models compared to MM1-SVM method for NN269 acceptor splice sites.
| Model | Kernel | AUC | Training time (mm:ss) | AUC improvement | Training time improvement |
|---|---|---|---|---|---|
| MM1 SVM | Polynomial | 0.9674 | 11:02 | - | - |
| Reduced MM1 SVM (best in terms of accuracy) | GRBF | 0.9741 | 22:17 | 0.69% | -101.96% |
| Reduced MM1 SVM | Polynomial | 0.9695 | 10:48 | 0.2171% | 2.11% |
| IC Shapiro SVM (best in terms of time) | Polynomial | 0.9628 | 01:18 | -0.4755% | 88.21% |
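The improvement columns appear to be relative changes against the MM1 SVM baseline. The sketch below reproduces the GRBF row under two assumptions that the source does not state explicitly: training times are minutes and seconds, and a positive time improvement means the model trains faster than the baseline. The helper names are hypothetical.

```python
def to_seconds(mmss: str) -> int:
    """Convert a 'mm:ss' training time to seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


def auc_improvement(auc: float, baseline_auc: float) -> float:
    """Relative AUC gain over the baseline, in percent."""
    return (auc - baseline_auc) / baseline_auc * 100.0


def time_improvement(time: str, baseline_time: str) -> float:
    """Relative training-time saving over the baseline, in percent.

    Positive values mean the model trains faster than the baseline.
    """
    base = to_seconds(baseline_time)
    return (base - to_seconds(time)) / base * 100.0


# NN269 acceptor values from the table; MM1 SVM is the baseline.
baseline_auc, baseline_time = 0.9674, "11:02"
grbf_auc_gain = auc_improvement(0.9741, baseline_auc)
grbf_time_gain = time_improvement("22:17", baseline_time)
print(f"{grbf_auc_gain:.2f}% {grbf_time_gain:.2f}%")
```

With these definitions the GRBF model comes out at roughly 0.69% AUC gain and about -102% time improvement, matching the table: it is slightly more accurate but takes about twice as long to train.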
Figure 3. ROC curve showing the classification performance of different models for NN269 donor splice site data.
AUC and training time for different models for NN269 donor splice sites.
| Model | Kernel | AUC | Training time (hh:mm:ss) |
|---|---|---|---|
| Reduced MM1 SVM (best in terms of accuracy) | GRBF | | |
| Reduced MM1 SVM | Polynomial | 0.9764903 | 00:09:30 |
| MM1 SVM | Polynomial | 0.9761952 | 00:10:02 |
| IC Shapiro SVM (best in terms of time) | Polynomial | | |
Figure 4. ROC curve showing the classification performance of the best two models in terms of accuracy and training time for NN269 donor splice site data.
AUC and training time improvement for different models compared to MM1-SVM method for NN269 donor splice sites.
| Model | Kernel | AUC | Training time (mm:ss) | AUC improvement | Training time improvement |
|---|---|---|---|---|---|
| MM1 SVM | Polynomial | 0.9761 | 10:02 | - | - |
| Reduced MM1 SVM (best in terms of accuracy) | GRBF | 0.9790 | 20:04 | 0.297% | -100% |
| Reduced MM1 SVM | Polynomial | 0.9764 | 09:30 | 0.0102% | 5.31% |
| IC Shapiro SVM (best in terms of time) | Polynomial | 0.9665 | 02:59 | -0.9835% | 70.26% |
Proposed models and their description.
| Model | Description |
|---|---|
| Reduced MM1 SVM Polynomial | Only reduced MM1 parameters and SVM with polynomial kernel |
| Reduced MM1 SVM GRBF | Only reduced MM1 parameters and SVM with GRBF kernel |
| IC Shapiro SVM Polynomial | Information content, Shapiro's score and SVM with polynomial kernel |
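The two kernels named in the model descriptions can be written out as follows. This is a sketch using the common textbook forms, with GRBF taken to mean the Gaussian radial basis function. The parameters `degree`, `gamma`, and `coef0` and their default values are assumptions, since the source does not state which kernel settings were used.

```python
import math


def polynomial_kernel(x, y, degree=3, gamma=1.0, coef0=1.0):
    """Polynomial kernel: (gamma * <x, y> + coef0) ** degree."""
    dot = sum(a * b for a, b in zip(x, y))
    return (gamma * dot + coef0) ** degree


def grbf_kernel(x, y, gamma=1.0):
    """Gaussian RBF kernel: exp(-gamma * ||x - y||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


# Hypothetical two-dimensional feature vectors, for illustration only.
x = [1.0, 2.0]
y = [3.0, 4.0]
poly_value = polynomial_kernel(x, y, degree=2)  # (1*3 + 2*4 + 1) ** 2
grbf_self = grbf_kernel(x, x)                   # identical inputs give 1.0
print(poly_value, grbf_self)
```

The Gaussian kernel needs an exponential per pair of vectors where the polynomial kernel needs only a dot product and a power. That difference alone should not be read as the cause of the longer GRBF training times reported in the tables.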
Figure 5. F-Score analysis of NN269 acceptor splice site.
Figure 6. F-Score analysis of NN269 donor splice site.
Definition of TP, TN, FP and FN.
| | Predicted positive | Predicted negative |
|---|---|---|
| Real positive | true positives, TP | false negatives, FN |
| Real negative | false positives, FP | true negatives, TN |
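These four counts are what the axes of a ROC curve are built from: the true positive rate (sensitivity) is plotted against the false positive rate at each decision threshold. A minimal sketch follows; the class name and the example counts are hypothetical and are not results from the paper.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    tp: int  # real positives predicted positive
    fn: int  # real positives predicted negative
    fp: int  # real negatives predicted positive
    tn: int  # real negatives predicted negative

    def sensitivity(self) -> float:
        """True positive rate: TP / (TP + FN)."""
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        """True negative rate: TN / (TN + FP)."""
        return self.tn / (self.tn + self.fp)

    def false_positive_rate(self) -> float:
        """FP / (FP + TN), which equals 1 - specificity."""
        return self.fp / (self.fp + self.tn)


# Hypothetical counts at one decision threshold, for illustration only.
counts = ConfusionCounts(tp=90, fn=10, fp=5, tn=95)
roc_point = (counts.false_positive_rate(), counts.sensitivity())
print(roc_point)
```

Sweeping the classifier's decision threshold and collecting one such point per threshold traces out the ROC curve.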