Isabella Mendolia1, Salvatore Contino2, Ugo Perricone3, Edoardo Ardizzone2, Roberto Pirrone2.
Abstract
BACKGROUND: A Virtual Screening algorithm has to adapt to the different stages of the screening process. Early screening needs to ensure that all bioactive compounds are ranked in the first positions regardless of the number of false positives, while a second screening round is aimed at increasing the prediction accuracy.
Keywords: Bioactivity prediction; Deep learning; Drug design; Molecular fingerprints; Virtual screening
Year: 2020 PMID: 32938359 PMCID: PMC7493874 DOI: 10.1186/s12859-020-03645-9
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Fig. 1 Fingerprint generation. Simplified fingerprint generation: the hashing function sets just 1 bit per pattern
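The idea in the caption, one bit set per hashed pattern, can be illustrated with a minimal sketch. This is not the fingerprinting code used in the paper: the pattern strings, the 64-bit length, the CRC32 hash, and the function name `hashed_fingerprint` are all assumptions chosen for illustration.

```python
import zlib


def hashed_fingerprint(patterns, n_bits=64):
    """Fold substructure patterns into a fixed-length bit vector.

    Each pattern is hashed to a single position, so every pattern
    sets exactly one bit (distinct patterns may collide on the same bit).
    """
    bits = [0] * n_bits
    for pattern in patterns:
        # CRC32 is used only because it is deterministic across runs;
        # it is an illustrative choice, not the hash of any real toolkit.
        position = zlib.crc32(pattern.encode("utf-8")) % n_bits
        bits[position] = 1
    return bits


# Hypothetical patterns enumerated from a small molecule
patterns = ["C", "O", "CC", "CO", "CCO"]
fingerprint = hashed_fingerprint(patterns)
print(sum(fingerprint), "bits set out of", len(fingerprint))
```

Because several patterns can hash to the same position, the number of set bits is at most the number of patterns, which is why longer bit vectors reduce collisions.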
Results for the active/inactive discrimination task, and Training scheme 1
| Architecture | Bal. accuracy | Sensitivity | Loss | AUC | F1-score | MCC |
|---|---|---|---|---|---|---|
| Tuned-MLP-Out | 0.9880 | 0.9855 | 0.0405 | 0.9979 | 0.9510 | 0.9462 |
| Voting | 0.9768 | 0.9710 | 0.2093 | 0.9920 | 0.8965 | 0.9033 |
| CNN 1D (F) | 0.9687 | 0.9710 | 0.0688 | 0.9904 | 0.8979 | 0.8813 |
| CNN 2D (R-M-F) | 0.9679 | 0.9565 | 0.0770 | 0.9912 | 0.8918 | 0.8817 |
| Random Forest (F) | 0.9510 | 0.8985 | 0.6405 | 0.9837 | 0.6065 | 0.8962 |
| SVM (F) | 0.9421 | 0.8985 | 0.7883 | 0.9868 | 0.8857 | 0.8731 |
Fingerprint types: (R)DKit, (M)organ, (F)eatMorgan, (L)ayered
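For readers unfamiliar with the columns above, the following sketch computes sensitivity, balanced accuracy, F1-score, and MCC from a binary confusion matrix using their standard definitions. The confusion counts are hypothetical and are not taken from the paper, and the function name `classification_metrics` is an assumption.

```python
import math


def classification_metrics(tp, fp, tn, fn):
    """Standard binary metrics from confusion-matrix counts.

    Assumes every denominator is non-zero (at least one true active,
    one true inactive, and one predicted active).
    """
    sensitivity = tp / (tp + fn)   # recall on the active class
    specificity = tn / (tn + fp)   # recall on the inactive class
    precision = tp / (tp + fp)
    balanced_accuracy = (sensitivity + specificity) / 2
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return {
        "balanced_accuracy": balanced_accuracy,
        "sensitivity": sensitivity,
        "f1": f1,
        "mcc": mcc,
    }


# Hypothetical counts: 10 actives and 90 inactives in the test set
metrics = classification_metrics(tp=8, fp=2, tn=88, fn=2)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Balanced accuracy averages the per-class recalls, so it is less affected by class imbalance than plain accuracy, while F1 and MCC also depend on the number of false positives.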
Results for the active/inactive discrimination task, and Training scheme 2
| Architecture | Bal. Accuracy | Sensitivity | Loss | AUC | F1-score | MCC |
|---|---|---|---|---|---|---|
| Tuned-MLP-Out | 0.9644 | 0.9625 | 0.0983 | 0.9875 | 0.5519 | 0.5989 |
| Voting | 0.9639 | 0.9500 | 0.1523 | 0.9889 | 0.6379 | 0.6694 |
| CNN 1D (F) | 0.9579 | 0.9625 | 0.1398 | 0.9854 | 0.4709 | 0.5336 |
| CNN 2D (T-L-E) | 0.9525 | 0.9375 | 0.1054 | 0.9841 | 0.5192 | 0.5920 |
| Random Forest (F) | 0.8789 | 0.7750 | 0.6221 | 0.9541 | 0.6528 | 0.6540 |
| SVM (F) | 0.9208 | 0.8625 | 0.6221 | 0.9682 | 0.6699 | 0.6524 |
Fingerprint types: (F)eatMorgan, (T)orsion, (L)ayered, (E)CFP4
Enrichment factor computed on the test set 1 (70 active molecules out of 701 compounds)
| Architecture | EF 1% | EF 2% | EF 5% | EF 10% |
|---|---|---|---|---|
| Tuned-MLP-Out | 7(100%) | 14(100%) | 34(97.14%) | 65(94.20%) |
| Voting | 7(100%) | 14(100%) | 34(97.14%) | 61(88.40%) |
| CNN 1D (M) | 7(100%) | 13(92.86%) | 33(94.29%) | 62(85.89%) |
| CNN 2D (R-M-F) | 7(100%) | 12(85.71%) | 32(91.43%) | 61(88.40%) |
| RF(F) | 7(100%) | 14(100%) | 35(100%) | 63(91.30%) |
| SVM(F) | 7(100%) | 14(100%) | 35(100%) | 61(88.40%) |
Fingerprint types: (R)DKit, (M)organ, (F)eatMorgan
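The entries in the table above are counts of active molecules found in the top-ranked fraction of the test set, followed by a percentage. A minimal sketch of such a computation is given below. The normalisation (hits divided by the maximum number of actives that could appear in the selection) is an assumption inferred from entries such as 34 out of 35 at the 5% cut-off (97.14%); it may not match the authors' exact procedure at every cut-off. The scores, labels, and function name are hypothetical.

```python
def actives_in_top_fraction(scores, labels, fraction):
    """Count actives among the top-scoring fraction of compounds.

    scores: predicted activity scores (higher means more likely active)
    labels: 1 for active, 0 for inactive, aligned with scores
    Returns (hits, percent), where percent is relative to the largest
    number of actives the selection could contain (assumed normalisation).
    """
    n_selected = int(len(scores) * fraction)
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = sum(label for _, label in ranked[:n_selected])
    max_hits = min(n_selected, sum(labels))
    percent = 100.0 * hits / max_hits if max_hits else 0.0
    return hits, percent


# Hypothetical ranking of 10 compounds, 3 of which are active
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
labels = [1, 0, 1, 0, 0, 1, 0, 0, 0, 0]

top20 = actives_in_top_fraction(scores, labels, 0.2)
top50 = actives_in_top_fraction(scores, labels, 0.5)
print(top20, top50)
```

Note that this reports counts in the same style as the table rather than the conventional enrichment-factor ratio (hit rate in the selection divided by the hit rate in the whole set).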
Enrichment factor computed on the test set 2 (80 active molecules out of 3720 compounds)
| Architecture (Training 2) | EF 1% | EF 2% |
|---|---|---|
| Tuned-MLP-Out | 37(100%) | 65(87.84%) |
| Voting | 32(86.5%) | 57(66.9%) |
| CNN 1D (F) | 31(83.8%) | 52(70.3%) |
| CNN 2D (T-L-E) | 31(83.8%) | 52(70.3%) |
| RF(F) | 37(100%) | 62(83.8%) |
| SVM(F) | 32(86.5%) | 55(74.3%) |
Fingerprint types: (F)eatMorgan, (L)ayered, (T)orsion, (E)CFP4
Performance of the Tuned-MLP-Out network on three data sets with 1%, 2%, and 5% active/inactive proportion respectively
| Active/inactive proportion | Bal. accuracy | Sensitivity | Loss | AUC | F1-score | MCC |
|---|---|---|---|---|---|---|
| 1% | 0.7475 | 0.5000 | 0.5116 | 0.9700 | 0.5333 | 0.5289 |
| 2% | 0.9671 | 0.9375 | 0.5114 | 0.9415 | 0.9009 | 0.8226 |
| 5% | 0.9382 | 0.8780 | 0.0565 | 0.9991 | 0.9230 | 0.9196 |
Enrichment factor computed on the test set with 1%, 2%, and 5% active/inactive proportion respectively
| Active/inactive proportion | EF 1% | EF 2% | EF 5% |
|---|---|---|---|
| 1% (8 active molecules) | 4(50%) | – | – |
| 2% (16 active molecules) | 7(87.5%) | 7(43.5%) | – |
| 5% (41 active molecules) | 4(50%) | 8(50%) | 20(48.7%) |
Fig. 2 1D CNN. One-dimensional convolution architecture
Fig. 3 2D CNN. Bi-dimensional convolution architecture
Fig. 4 Tuned-MLP-Out. The complex architecture with MLP classifier