Ke Li, Sijia Zhang, Di Yan, Yannan Bin, Junfeng Xia.
Abstract
BACKGROUND: Identification of hot spots in protein-DNA interfaces provides crucial information for research on protein-DNA interactions and drug design. Because experimental methods for determining hot spots are time-consuming, labor-intensive, and expensive, reliable computational methods are needed to predict hot spots on a large scale.
Keywords: Extreme gradient boosting; Hot spot; Protein–DNA complexes; Supervised isometric feature mapping
Year: 2020 PMID: 32938395 PMCID: PMC7495874 DOI: 10.1186/s12859-020-03683-3
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Fig. 1 The workflow of sxPDH
Performance of different manifold learning methods on the test set
| Method | SEN | SPE | PRE | F1 | ACC | MCC | AUC |
|---|---|---|---|---|---|---|---|
| LLE (10) | 0.653 | 0.711 | 0.607 | 0.629 | 0.687 | 0.361 | 0.693 |
| ISOMAP (10) | 0.687 | 0.766 | 0.692 | 0.695 | 0.709 | 0.476 | 0.738 |
| SLLE (3) | 0.671 | 0.732 | 0.648 | 0.656 | 0.691 | 0.381 | 0.703 |
| S-ISOMAP (3) | | | | | | | |
The highest value in each column is shown in bold. The numbers in parentheses represent the feature dimensions after dimensionality reduction
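The column abbreviations in these tables are the usual binary-classification metrics. As a minimal sketch, assuming the standard confusion-matrix definitions (the source record does not state its formulas), the threshold-based columns can be computed as follows. The function name `binary_metrics` and the example counts are hypothetical and are not taken from the paper.

```python
import math


def binary_metrics(tp, fn, tn, fp):
    """Compute SEN, SPE, PRE, F1, ACC and MCC from confusion-matrix counts.

    Assumes the standard textbook definitions; hot spots are the positive class.
    """
    def ratio(num, den):
        # Return 0.0 instead of raising when a denominator is empty.
        return num / den if den else 0.0

    sen = ratio(tp, tp + fn)                      # sensitivity (recall)
    spe = ratio(tn, tn + fp)                      # specificity
    pre = ratio(tp, tp + fp)                      # precision
    f1 = ratio(2 * pre * sen, pre + sen)          # harmonic mean of PRE and SEN
    acc = ratio(tp + tn, tp + fn + tn + fp)       # accuracy
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = ratio(tp * tn - fp * fn, mcc_den)       # Matthews correlation coefficient
    return {"SEN": sen, "SPE": spe, "PRE": pre, "F1": f1, "ACC": acc, "MCC": mcc}


# Hypothetical counts for illustration only (not data from the paper).
example = binary_metrics(tp=40, fn=10, tn=45, fp=5)
for name, value in example.items():
    print(f"{name}: {value:.3f}")
```

MCC uses all four cells of the confusion matrix, which is why it can rank methods differently from ACC when the two classes are unbalanced.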
Fig. 2 Running time of different manifold learning methods
Performance of S-ISOMAP compared with other feature selection methods on the test set
| Method | SEN | SPE | PRE | F1 | ACC | MCC | AUC |
|---|---|---|---|---|---|---|---|
| SVM-RFE (19) | 0.423 | 0.763 | 0.555 | 0.478 | 0.625 | 0.197 | 0.635 |
| mRMR (30) | 0.538 | 0.711 | 0.569 | 0.549 | 0.642 | 0.251 | 0.696 |
| RF-SFS (17) | 0.654 | 0.737 | 0.629 | 0.642 | 0.703 | 0.388 | 0.709 |
| VSURF (10) | 0.678 | 0.776 | 0.672 | 0.669 | 0.736 | 0.431 | 0.704 |
| S-ISOMAP (3) | | | | | | | |
The highest value in each column is shown in bold. The numbers in parentheses represent the feature dimensions after dimensionality reduction
Fig. 3 Running time of S-ISOMAP compared with other feature selection methods
Performance of different methods on the test set
| Method | SEN | SPE | PRE | F1 | ACC | MCC | AUC |
|---|---|---|---|---|---|---|---|
| SAMPDI | 0.654 | 0.658 | 0.567 | 0.607 | 0.656 | 0.307 | 0.690 |
| PremPDI | 0.577 | 0.737 | 0.600 | 0.588 | 0.672 | 0.316 | 0.708 |
| mCSM-NA | 0.538 | 0.737 | 0.583 | 0.560 | 0.656 | 0.279 | 0.661 |
| PrPDH | 0.692 | 0.816 | 0.720 | 0.706 | 0.766 | | 0.764 |
| sxPDH | | | | | | 0.508 | |
The highest value in each column is shown in bold
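Unlike the other columns, AUC does not depend on a decision threshold. A minimal sketch, assuming AUC here means the area under the ROC curve, computes it as the fraction of (hot spot, non-hot spot) pairs in which the hot spot receives the higher score, with ties counted as half. The function name `roc_auc` and the labels and scores below are hypothetical, not data from the paper.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives.

    labels: iterable of 1 (positive, e.g. hot spot) or 0 (negative).
    scores: predicted scores, higher meaning more likely positive.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0      # positive ranked above negative
            elif p == n:
                wins += 0.5      # tie counts as half
    return wins / (len(pos) * len(neg))


# Hypothetical labels and scores for illustration only.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]
auc = roc_auc(labels, scores)
print(f"AUC: {auc:.3f}")
```

The pairwise form is quadratic in the number of examples, which is acceptable for small test sets; a rank-based formulation gives the same value for larger ones.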
Fig. 4 Running time of sxPDH compared with PrPDH