Liqi Li, Sanjiu Yu, Weidong Xiao, Yongsheng Li, Lan Huang, Xiaoqi Zheng, Shiwen Zhou, Hua Yang.
Abstract
BACKGROUND: Identifying recombination hot and cold spots is critical for understanding the mechanism of recombination and the process of genome evolution. However, experimental identification of recombination spots is both time-consuming and costly. An accurate, automated method for identifying recombination spots quickly and reliably is therefore urgently needed.
Year: 2014 PMID: 25409550 PMCID: PMC4289199 DOI: 10.1186/1471-2105-15-340
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. Comparison of prediction results of different top features.
Figure 2. Top 106 features in the benchmark dataset.
Figure 3. The overlapped features.
A comparison of the proposed method with the existing methods
| Predictor | Test method | Sn (%) | Sp (%) | Acc (%) | MCC |
|---|---|---|---|---|---|
| The proposed method | Jackknife | 76.12 | 90.69 | 84.09 | 0.680 |
| F-score | Jackknife | 70.41 | 88.66 | 80.39 | 0.605 |
| iRSpot-TNCPseAAC | Jackknife | 87.14 | 79.59 | 83.72 | 0.671 |
| iRSpot-PseDNC | Jackknife | 73.06 | 89.49 | 82.04 | 0.638 |
| IDQD | 5-fold cross-validation | 79.40 | 81.00 | 80.30 | 0.603 |
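
The table above reports MCC together with three percentage columns. Assuming these follow the usual convention for this kind of comparison (sensitivity, specificity and overall accuracy), the four scores could be computed from confusion-matrix counts roughly as follows. This is a minimal sketch: the function name `binary_metrics` and the example counts are illustrative and are not taken from the paper.

```python
import math


def binary_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy, MCC) from confusion counts.

    tp/fn count the positive class (here assumed to be hot spots),
    tn/fp count the negative class (assumed to be cold spots).
    """
    sn = tp / (tp + fn)                    # fraction of positives recovered
    sp = tn / (tn + fp)                    # fraction of negatives recovered
    acc = (tp + tn) / (tp + fn + tn + fp)  # overall fraction correct
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is undefined when any marginal is zero; 0.0 is a common fallback.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return sn, sp, acc, mcc


# Hypothetical counts, for illustration only.
sn, sp, acc, mcc = binary_metrics(tp=8, fn=2, tn=9, fp=1)
print(f"Sn={sn:.2%} Sp={sp:.2%} Acc={acc:.2%} MCC={mcc:.3f}")
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, which is why it is often reported alongside Sn and Sp when the two classes differ in size.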
Figure 4. The ROC curve of the benchmark dataset.
The normalized values for the six DNA dinucleotide physical structures
| Dinucleotide | V_1(R_i R_{i+1}) | V_2(R_i R_{i+1}) | V_3(R_i R_{i+1}) | V_4(R_i R_{i+1}) | V_5(R_i R_{i+1}) | V_6(R_i R_{i+1}) |
|---|---|---|---|---|---|---|
| AA | 0.06 | 0.50 | 0.27 | 1.59 | 0.11 | −0.11 |
| AC | 1.50 | 0.50 | 0.80 | 0.13 | 1.29 | 1.04 |
| AG | 0.78 | 0.36 | 0.09 | 0.68 | −0.24 | −0.62 |
| AT | 1.07 | 0.22 | 0.62 | −1.02 | 2.51 | 1.17 |
| CA | −1.38 | −1.36 | −0.27 | −0.86 | −0.62 | −1.25 |
| CC | 0.06 | 1.08 | 0.09 | 0.56 | −0.82 | 0.24 |
| CG | −1.66 | −1.22 | −0.44 | −0.82 | −0.29 | −1.39 |
| CT | 0.78 | 0.36 | 0.09 | 0.68 | −0.24 | −0.62 |
| GA | −0.08 | 0.50 | 0.27 | 0.13 | −0.39 | 0.71 |
| GC | −0.08 | 0.22 | 1.33 | −0.35 | 0.65 | 1.59 |
| GG | 0.06 | 1.08 | 0.09 | 0.56 | −0.82 | 0.24 |
| GT | 1.50 | 0.50 | 0.80 | 0.13 | 1.29 | 1.04 |
| TA | −1.23 | −2.37 | −0.44 | −2.24 | −1.51 | −1.39 |
| TC | −0.08 | 0.50 | 0.27 | 0.13 | −0.39 | 0.71 |
| TG | −1.38 | −1.36 | −0.27 | −0.86 | −0.62 | −1.25 |
| TT | 0.06 | 0.50 | 0.27 | 1.59 | 0.11 | −0.11 |
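
As a rough illustration of how a lookup table like this can be applied, the sketch below slides a window of two bases along a DNA sequence and maps each dinucleotide to its six normalized values. The values are transcribed from the table above; the names `DINUC_PROPS`, `dinucleotide_profile` and `mean_profile` are hypothetical, and the averaging step is only one simple way to summarize a sequence, not necessarily the feature construction used in the paper.

```python
# Normalized values transcribed from the table: columns V1..V6 per dinucleotide.
DINUC_PROPS = {
    "AA": (0.06, 0.50, 0.27, 1.59, 0.11, -0.11),
    "AC": (1.50, 0.50, 0.80, 0.13, 1.29, 1.04),
    "AG": (0.78, 0.36, 0.09, 0.68, -0.24, -0.62),
    "AT": (1.07, 0.22, 0.62, -1.02, 2.51, 1.17),
    "CA": (-1.38, -1.36, -0.27, -0.86, -0.62, -1.25),
    "CC": (0.06, 1.08, 0.09, 0.56, -0.82, 0.24),
    "CG": (-1.66, -1.22, -0.44, -0.82, -0.29, -1.39),
    "CT": (0.78, 0.36, 0.09, 0.68, -0.24, -0.62),
    "GA": (-0.08, 0.50, 0.27, 0.13, -0.39, 0.71),
    "GC": (-0.08, 0.22, 1.33, -0.35, 0.65, 1.59),
    "GG": (0.06, 1.08, 0.09, 0.56, -0.82, 0.24),
    "GT": (1.50, 0.50, 0.80, 0.13, 1.29, 1.04),
    "TA": (-1.23, -2.37, -0.44, -2.24, -1.51, -1.39),
    "TC": (-0.08, 0.50, 0.27, 0.13, -0.39, 0.71),
    "TG": (-1.38, -1.36, -0.27, -0.86, -0.62, -1.25),
    "TT": (0.06, 0.50, 0.27, 1.59, 0.11, -0.11),
}


def dinucleotide_profile(seq):
    """Map each overlapping dinucleotide R_i R_{i+1} to its (V1..V6) tuple."""
    seq = seq.upper()
    # A base outside A/C/G/T (e.g. N) raises KeyError in this sketch.
    return [DINUC_PROPS[seq[i:i + 2]] for i in range(len(seq) - 1)]


def mean_profile(seq):
    """Average each of the six properties over all dinucleotides in seq."""
    rows = dinucleotide_profile(seq)
    if not rows:
        raise ValueError("sequence must contain at least two bases")
    return tuple(sum(col) / len(rows) for col in zip(*rows))


# Illustrative sequence, not from the benchmark dataset.
print(mean_profile("ACGTAC"))
```

Note that each dinucleotide shares its row with its reverse complement (for example AA and TT, or AC and GT), so the profile of a sequence and that of its reverse-complement strand contain the same values in reverse order.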
Figure 5. The pipeline from the query sequence to the final output, including all intermediate steps.