Qing Ye, Xiaolong Zhang, Xiaoli Lin.
Abstract
BACKGROUND: Computational prediction of interactions between drugs and protein targets is important for new drug discovery, because experimental determination of drug-target interactions (DTIs) is expensive and time-consuming. However, protein targets differ greatly in their numbers of known interactions: most interactions are concentrated on only a few targets. As a result, targets with larger numbers of interactions may have enough positive samples for predicting their interactions, whereas targets with smaller numbers of interactions may not. A single classification strategy may be unable to handle both cases at the same time. To overcome this problem, this paper proposes a drug-target interaction prediction method based on multiple classification strategies (MCSDTI). In MCSDTI, targets are first divided into two parts according to their numbers of interactions: one part contains targets with smaller numbers of interactions (TWSNI) and the other contains targets with larger numbers of interactions (TWLNI). Different classification strategies are then designed for TWSNI and TWLNI to predict interactions. Furthermore, TWSNI and TWLNI are evaluated independently, which avoids the problem that the result is mainly determined by targets with large numbers of interactions when all targets are evaluated together.
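The first step described in the abstract, dividing targets by their number of interactions, might look like the sketch below. The abstract does not say how the cut-off is chosen, so the `threshold` parameter, the function name `split_targets` and the toy pairs are all assumptions made for illustration.

```python
from collections import Counter

def split_targets(interactions, threshold):
    """Split targets into TWSNI and TWLNI by interaction count.

    interactions: iterable of (drug_id, target_id) positive pairs.
    threshold: hypothetical cut-off; targets with at most this many
               interactions go to TWSNI, the rest go to TWLNI.
    """
    counts = Counter(target for _, target in interactions)
    twsni = sorted(t for t, c in counts.items() if c <= threshold)
    twlni = sorted(t for t, c in counts.items() if c > threshold)
    return twsni, twlni

# Toy data: target "T1" has many interactions, "T2" and "T3" have few.
pairs = [("D1", "T1"), ("D2", "T1"), ("D3", "T1"), ("D4", "T1"),
         ("D1", "T2"), ("D2", "T3")]
twsni, twlni = split_targets(pairs, threshold=2)
print(twsni, twlni)
```

Each group can then be passed to its own classifier, which is the point of using multiple classification strategies.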
Keywords: Drug–target interaction; Multiple classification strategies; Within-class imbalance
Year: 2022 PMID: 35057737 PMCID: PMC8772044 DOI: 10.1186/s12859-021-04366-3
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Simple statistics for datasets
| Data sets | NR | IC | GPCR | E | DB |
|---|---|---|---|---|---|
| Drugs | 54 | 210 | 223 | 445 | 5877 |
| Targets | 26 | 204 | 95 | 664 | 3348 |
| Interactions | 90 | 1476 | 635 | 2926 | 12,674 |
| Proportion | 6.4% | 3.4% | 3.0% | 0.99% | 0.064% |
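The Proportion row is consistent with the number of interactions divided by the number of possible drug-target pairs (drugs × targets). That reading is inferred from the figures in the table rather than stated in the text, so the sketch below is only a check of that assumption.

```python
# (drugs, targets, interactions) copied from the statistics table above.
datasets = {
    "NR": (54, 26, 90),
    "IC": (210, 204, 1476),
    "GPCR": (223, 95, 635),
    "E": (445, 664, 2926),
    "DB": (5877, 3348, 12674),
}

def interaction_proportion(n_drugs, n_targets, n_interactions):
    """Share of all drug-target pairs that are known interactions, in percent."""
    return 100.0 * n_interactions / (n_drugs * n_targets)

proportions = {name: interaction_proportion(*stats)
               for name, stats in datasets.items()}
for name, pct in proportions.items():
    print(f"{name}: {pct:.3g}%")
```

The proportion falls as the datasets grow, which shows how sparse the positive class is on the larger datasets.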
Fig. 1 The distribution of interactions on five datasets
Feature vector extraction
Simple information of the extracted features
| Data sets | NR | IC | GPCR | E | DB |
|---|---|---|---|---|---|
| Drug feature dimension | 1024 | 1024 | 1024 | 1024 | 1024 |
| Target feature dimension | 1437 | 1437 | 1437 | 1437 | 1437 |
| Total feature dimension | 2461 | 2461 | 2461 | 2461 | 2461 |
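The total dimension equals the sum of the drug and target dimensions (1024 + 1437 = 2461), which suggests that each drug-target sample is formed by concatenating the two vectors. The sketch below assumes exactly that; the function name `pair_features` and the placeholder vectors are hypothetical.

```python
DRUG_DIM = 1024     # drug feature dimension from the table
TARGET_DIM = 1437   # target feature dimension from the table

def pair_features(drug_vec, target_vec):
    """Build one drug-target sample by concatenating the two feature vectors."""
    if len(drug_vec) != DRUG_DIM or len(target_vec) != TARGET_DIM:
        raise ValueError("unexpected feature dimension")
    return list(drug_vec) + list(target_vec)

# Placeholder vectors; real features would come from the paper's extractors.
sample = pair_features([0.0] * DRUG_DIM, [1.0] * TARGET_DIM)
print(len(sample))
```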
Fig. 2 The flowchart of MCSDTI
Fig. 3 An example used to show the negative impact of samples of the neighbors
AUCs for TWLNI, where the parameter used in Algorithm 1 is set to 1, 3 and 5, respectively
| Dataset | Method | 1 | 3 | 5 | Mean |
|---|---|---|---|---|---|
| NR | DT | 51.90 ± 0.19 | 47.30 ± 0.23 | 47.30 ± 0.23 | 48.83 |
| NR | RF | 76.10 ± 0.40 | 77.30 ± 0.30 | 67.80 ± 0.36 | 73.73 |
| NR | NP | 69.50 ± 2.00 | 74.90 ± 0.37 | 73.90 ± 0.70 | 72.77 |
| NR | WP | 72.80 ± 0.22 | 80.60 ± 0.80 | 75.40 ± 0.10 | 76.27 |
| NR | NBI | 72.40 ± 0.30 | 79.70 ± 0.80 | 73.80 ± 0.20 | 75.30 |
| NR | RLS | 77.00 ± 0.32 | 87.10 ± 0.40 | 88.20 ± 0.49 | 84.10 |
| NR | RK | 78.60 ± 0.30 | 88.50 ± 0.40 | | 84.97 |
| NR | EDT | 86.10 ± 0.10 | 81.20 ± 0.35 | | 83.00 |
| NR | EKRR | 76.40 ± 0.35 | 87.40 ± 0.29 | | 84.17 |
| NR | Ours | | | | |
| IC | DT | 51.30 ± 0.18 | 53.30 ± 0.33 | 54.30 ± 0.34 | 52.97 |
| IC | RF | 67.70 ± 0.27 | 72.10 ± 0.12 | 72.90 ± 0.11 | 70.90 |
| IC | NP | 62.60 ± 0.10 | 64.30 ± 0.40 | 66.10 ± 0.60 | 64.33 |
| IC | WP | | | | 75.50 |
| IC | NBI | 72.30 ± 0.54 | 76.40 ± 0.40 | 77.20 ± 0.13 | 75.30 |
| IC | RLS | 68.30 ± 0.67 | 70.60 ± 0.45 | 71.40 ± 0.42 | 70.10 |
| IC | RK | 68.40 ± 0.57 | 70.50 ± 0.36 | 71.00 ± 0.31 | 69.97 |
| IC | EDT | 71.50 ± 0.24 | 76.00 ± 0.18 | 77.20 ± 0.35 | 74.90 |
| IC | EKRR | 69.50 ± 0.64 | 71.60 ± 0.38 | 72.50 ± 0.33 | 71.20 |
| IC | Ours | | | | |
| GPCR | DT | 59.90 ± 0.20 | 62.60 ± 0.80 | 62.70 ± 0.13 | 61.73 |
| GPCR | RF | 78.00 ± 0.18 | 77.60 ± 1.90 | 77.80 ± 0.19 | 77.80 |
| GPCR | NP | 67.80 ± 0.10 | 69.30 ± 0.40 | 70.30 ± 0.20 | 69.13 |
| GPCR | WP | 81.90 ± 0.11 | 80.40 ± 0.22 | 80.70 ± 0.20 | 81.00 |
| GPCR | NBI | 81.80 ± 0.11 | 80.40 ± 0.22 | 80.60 ± 0.19 | 80.93 |
| GPCR | RLS | 81.50 ± 0.11 | 81.20 ± 0.14 | 81.00 ± 0.15 | 81.23 |
| GPCR | RK | 82.00 ± 0.40 | 81.00 ± 0.11 | 80.90 ± 0.13 | 82.30 |
| GPCR | EDT | 82.20 ± 0.21 | | | 82.63 |
| GPCR | EKRR | 82.50 ± 0.09 | 82.30 ± 0.14 | | 82.37 |
| GPCR | Ours | | | | |
| E | DT | 61.60 ± 0.25 | 63.00 ± 0.45 | 63.40 ± 0.56 | 62.67 |
| E | RF | 76.90 ± 0.80 | 80.70 ± 0.10 | 81.10 ± 0.06 | 79.57 |
| E | NP | 74.80 ± 0.30 | 71.90 ± 0.80 | 72.00 ± 0.80 | 72.90 |
| E | WP | | | | 85.27 |
| E | NBI | 84.90 ± 0.30 | 85.20 ± 0.70 | 85.40 ± 0.12 | 85.17 |
| E | RLS | 76.70 ± 0.40 | 74.30 ± 0.10 | 74.40 ± 0.21 | 75.13 |
| E | RK | 77.40 ± 0.13 | 75.90 ± 0.40 | 76.20 ± 0.17 | 76.50 |
| E | EDT | 82.20 ± 0.20 | 84.50 ± 0.30 | 84.70 ± 0.80 | 83.80 |
| E | EKRR | 77.30 ± 0.50 | 75.30 ± 0.24 | 75.80 ± 0.38 | 76.13 |
| E | Ours | | | | |
| DB | DT | 65.70 ± 0.48 | 69.30 ± 0.64 | 70.30 ± 0.58 | 68.43 |
| DB | RF | 82.50 ± 0.60 | 85.30 ± 0.56 | 86.50 ± 0.76 | 84.77 |
| DB | NP | 69.70 ± 0.78 | 72.50 ± 0.46 | 72.90 ± 0.80 | 71.70 |
| DB | WP | 84.20 ± 0.64 | 78.20 ± 0.62 | 73.30 ± 0.48 | 78.57 |
| DB | NBI | 64.40 ± 0.93 | 58.90 ± 0.79 | 55.90 ± 0.53 | 59.73 |
| DB | RLS | 88.10 ± 0.75 | 87.70 ± 0.39 | 87.30 ± 0.35 | 87.70 |
| DB | RK | 88.30 ± 0.66 | 89.00 ± 0.60 | 88.70 ± 0.65 | 88.67 |
| DB | EDT | 87.50 ± 0.58 | 88.32 ± 0.62 | 88.89 ± 0.49 | 88.24 |
| DB | EKRR | | | | 91.83 |
| DB | Ours | | | | |

The maximum and second maximum AUC are shown in bold and italics in the source article. Empty cells mark values missing from this copy; in a partially filled row, the surviving values are listed left to right in their extracted order, so their column assignment is uncertain.
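For rows with all three values present, the Mean column matches the arithmetic mean of the AUCs at the three parameter settings. A minimal sketch of that calculation, using two complete NR rows from the TWLNI table above (the helper name `mean_auc` is an assumption):

```python
def mean_auc(aucs):
    """Average the AUCs over the parameter settings, rounded like the table."""
    return round(sum(aucs) / len(aucs), 2)

# AUCs at settings 1, 3 and 5 for two complete NR rows of the TWLNI table.
nr_dt = [51.90, 47.30, 47.30]
nr_rf = [76.10, 77.30, 67.80]
print(mean_auc(nr_dt), mean_auc(nr_rf))
```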
Fig. 4 Histogram of AUCs for TWLNI, where the parameter used in Algorithm 1 is set to 1, 3 and 5, respectively
AUCs for TWSNI, where the parameter used in Algorithm 1 is set to 1, 3 and 5, respectively
| Dataset | Method | 1 | 3 | 5 | Mean |
|---|---|---|---|---|---|
| NR | DT | 49.90 ± 0.54 | 54.30 ± 0.32 | 53.20 ± 0.31 | 52.47 |
| NR | RF | 51.10 ± 0.61 | 63.90 ± 0.51 | 66.40 ± 0.44 | 60.47 |
| NR | WP | 32.30 ± 0.64 | 53.90 ± 0.58 | 58.30 ± 0.44 | 48.17 |
| NR | NBI | 40.20 ± 0.54 | 54.80 ± 0.60 | 58.60 ± 0.44 | 51.20 |
| NR | RLS | 52.90 ± 0.65 | | | |
| NR | RK | 49.10 ± 0.63 | 66.00 ± 0.51 | 69.40 ± 0.48 | 61.50 |
| NR | EDT | 44.70 ± 0.73 | 63.00 ± 0.70 | 66.70 ± 0.51 | 58.13 |
| NR | EKRR | 51.00 ± 0.59 | 66.20 ± 0.52 | 69.60 ± 0.45 | 62.67 |
| NR | Ours | 65.30 ± 0.52 | 67.90 ± 0.33 | | 62.37 |
| IC | DT | 46.70 ± 0.70 | 48.70 ± 0.41 | 50.30 ± 0.43 | 48.57 |
| IC | RF | 41.90 ± 0.48 | 54.60 ± 0.34 | 61.50 ± 0.33 | 52.67 |
| IC | WP | 37.40 ± 0.59 | 50.80 ± 0.55 | 60.70 ± 0.45 | 49.63 |
| IC | NBI | 32.40 ± 0.69 | 50.40 ± 0.57 | 60.20 ± 0.44 | 47.67 |
| IC | RLS | 51.30 ± 0.73 | 58.80 ± 0.41 | 63.00 ± 0.32 | 57.70 |
| IC | RK | 46.40 ± 0.91 | 65.30 ± 0.32 | | 57.40 |
| IC | EDT | 34.10 ± 0.58 | 54.60 ± 0.49 | 62.90 ± 0.39 | 50.53 |
| IC | EKRR | 50.90 ± 0.76 | 60.10 ± 0.53 | | |
| IC | Ours | 54.40 ± 0.34 | 57.20 ± 0.36 | | 55.37 |
| GPCR | DT | 50.50 ± 0.58 | 53.10 ± 0.37 | 57.30 ± 0.28 | 53.63 |
| GPCR | RF | 56.80 ± 0.68 | 69.40 ± 0.32 | 73.50 ± 0.26 | 66.57 |
| GPCR | WP | 40.80 ± 0.68 | 58.40 ± 0.30 | 64.10 ± 0.21 | 54.43 |
| GPCR | NBI | 40.80 ± 0.65 | 58.00 ± 0.29 | 64.10 ± 0.20 | 54.30 |
| GPCR | RLS | 71.90 ± 0.53 | 80.40 ± 0.30 | 81.50 ± 0.23 | 77.93 |
| GPCR | RK | 71.80 ± 0.56 | 79.20 ± 0.32 | 80.40 ± 0.25 | 77.13 |
| GPCR | EDT | 55.80 ± 0.52 | 70.00 ± 0.30 | 74.80 ± 0.23 | 66.87 |
| GPCR | EKRR | | | | 78.43 |
| GPCR | Ours | 51.70 ± 0.28 | 59.60 ± 0.21 | 63.00 ± 0.13 | 58.10 |
| E | DT | 58.90 ± 0.18 | 61.40 ± 0.14 | 61.00 ± 0.21 | 60.43 |
| E | RF | 58.90 ± 0.27 | 64.10 ± 0.17 | 67.90 ± 0.18 | 63.63 |
| E | WP | 44.20 ± 0.21 | 47.40 ± 0.22 | 56.90 ± 0.29 | 49.50 |
| E | NBI | 44.70 ± 0.28 | 47.60 ± 0.16 | 57.10 ± 0.32 | 49.80 |
| E | RLS | 67.50 ± 0.22 | 68.30 ± 0.22 | 72.60 ± 0.24 | 69.47 |
| E | RK | 67.00 ± 0.27 | 70.30 ± 0.27 | | 70.07 |
| E | EDT | 58.00 ± 0.28 | 63.10 ± 0.19 | 68.30 ± 0.23 | 63.13 |
| E | EKRR | 64.20 ± 0.22 | 65.70 ± 0.22 | 71.20 ± 0.25 | 67.03 |
| E | Ours | 69.70 ± 0.40 | | | 70.53 |
| DB | DT | 51.20 ± 0.52 | 57.30 ± 0.37 | 58.80 ± 0.39 | 55.77 |
| DB | RF | 54.40 ± 0.65 | 58.60 ± 0.68 | 62.20 ± 0.75 | 58.40 |
| DB | WP | 56.70 ± 0.37 | 45.10 ± 0.46 | 53.00 ± 0.38 | 51.60 |
| DB | NBI | 60.80 ± 0.45 | 60.20 ± 0.57 | 59.80 ± 0.57 | 60.27 |
| DB | RLS | 43.50 ± 0.76 | 51.60 ± 0.62 | 65.80 ± 0.63 | 53.63 |
| DB | RK | | | | 69.87 |
| DB | EDT | 52.39 ± 0.62 | 55.60 ± 0.49 | 63.30 ± 0.54 | 57.10 |
| DB | EKRR | 39.90 ± 0.38 | 53.30 ± 0.38 | 60.50 ± 0.38 | 51.23 |
| DB | Ours | 62.50 ± 0.62 | 63.70 ± 0.67 | 64.50 ± 0.57 | 63.57 |

The maximum and second maximum AUC are shown in bold and italics in the source article. Empty cells mark values missing from this copy; in a partially filled row, the surviving values are listed left to right in their extracted order, so their column assignment is uncertain.
Fig. 5 Histogram of AUCs for TWSNI, where the parameter used in Algorithm 1 is set to 1, 3 and 5, respectively
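The abstract states that TWSNI and TWLNI are evaluated independently, which is why their AUCs are reported in separate tables. A minimal sketch of per-group evaluation follows. It assumes the standard pairwise definition of AUC (the fraction of positive-negative pairs ranked correctly, ties counted as half); the function names and the toy scores are hypothetical and are not results from the paper.

```python
def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half. Quadratic in the sample count, which is fine for a sketch.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def groupwise_auc(samples):
    """Compute one AUC per target group.

    samples: iterable of (group, label, score), group being "TWSNI" or "TWLNI".
    """
    by_group = {}
    for group, label, score in samples:
        labels, scores = by_group.setdefault(group, ([], []))
        labels.append(label)
        scores.append(score)
    return {g: auc(ls, ss) for g, (ls, ss) in by_group.items()}

# Toy predictions (hypothetical scores, not results from the paper).
toy = [
    ("TWLNI", 1, 0.9), ("TWLNI", 1, 0.8), ("TWLNI", 0, 0.3), ("TWLNI", 0, 0.2),
    ("TWSNI", 1, 0.4), ("TWSNI", 0, 0.6), ("TWSNI", 0, 0.1),
]
result = groupwise_auc(toy)
print(result)
```

Reporting the two numbers separately keeps a weak TWSNI score from being hidden behind a strong TWLNI score, which a single pooled AUC could do.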