XLPFE: A Simple and Effective Machine Learning Scoring Function for Protein-Ligand Scoring and Ranking.

Lina Dong, Xiaoyang Qu, Binju Wang.

Abstract

Prediction of protein-ligand binding affinities is a central issue in structure-based computer-aided drug design. In recent years, much effort has been devoted to predicting the binding affinity of protein-ligand complexes using machine learning (ML). Owing to the remarkable ability of ML methods in nonlinear fitting, ML-based scoring functions (SFs) can deliver much improved performance on a selected test set, such as the comparative assessment of scoring functions (CASF), when compared to classical SFs. However, the performance of ML-based SFs relies heavily on the overall similarity between the training set and the test set. To improve the performance and transferability of an SF, we combined various features, including energy terms from X-score and AutoDock Vina, the properties of ligands, and the statistical sequence-related information from either the binding site or the full protein. In conjunction with extra trees (ET), an ML model, we have developed XLPFE, a new SF. Compared with other tested methods such as X-score, AutoDock Vina, ΔvinaXGB, PSH-ML, or CNN-score, XLPFE achieves consistently better scoring and ranking power for various types of protein-ligand complex structures beyond the CASF, suggesting that XLPFE has superior transferability. In particular, XLPFE performs better with metalloenzymes. With its faster speed, improved accuracy, and better transferability, XLPFE could be usefully applied to a diverse range of protein-ligand complexes.
© 2022 The Authors. Published by American Chemical Society.

Year:  2022        PMID: 35785279      PMCID: PMC9245135          DOI: 10.1021/acsomega.2c01723

Source DB:  PubMed          Journal:  ACS Omega        ISSN: 2470-1343


Introduction

Computer-aided drug design (CADD) can accelerate drug development, saving much time and cost compared to experimental procedures.[1] Drug screening, which is important both for discovering new drugs and for finding new applications of old drugs, is among the most important tasks of CADD. To maximize the effects of drugs while minimizing their side effects, the interactions between the drug and the target should be fully understood. Accordingly, accurate prediction of the binding affinity between drugs and their target proteins is the key to drug screening.[2] To date, many theoretical methods, such as quantum mechanics/molecular mechanics,[3,4] free energy perturbation,[5,6] and thermodynamic integration,[7,8] have been developed to accurately predict the binding affinity of ligands for proteins. Unfortunately, routine application of these methods in high-throughput compound screening is hampered by their high computational cost. Scoring functions (SFs), however, have much lower computational costs and have found wide application in the prediction of binding affinity in protein–ligand complexes.[9] Classical SFs can be force field-based,[10,11] empirical,[12−15] or knowledge-based.[16,17] Force field-based SFs are usually based on calculated energies, while empirical SFs are based on a hypothetical equation whose parameters are fitted by linear regression (LR). In knowledge-based SFs, the energy terms are derived from the statistics of protein–ligand interactions. In all these SFs, a predetermined functional form is assumed to characterize the relationship between binding affinities and the relevant parameters. With the recent rapid development of artificial intelligence, much effort has been devoted to developing machine learning (ML)-based SFs[18−21] for predicting the binding affinity of protein–ligand complexes.
Compared to the classical SFs, ML can automatically learn generalized nonlinear functional forms and feature information from training data, which can improve the accuracy of binding affinity prediction.[22] ML can be classified into two categories: traditional ML-based methods and deep learning (DL)-based methods.[23] In traditional ML-based methods, the relationship between the binding affinity and the selected features is fitted via nonlinear regression using classical ML algorithms. For example, Zilian and Sotriffer proposed a method named SFCscoreRF, in which a random forest (RF) method was used to perform the nonlinear fitting of the SFCscore descriptors.[24] Ballester and co-workers developed a series of SFs using an RF algorithm and energy terms taken mainly from RF-Score.[25−27] Wang and Zhang introduced a new method named ΔvinaRF20, which was trained by RF and is based on 20 features, including five AutoDock Vina interaction terms, five ligand-dependent terms, and 10 buried solvent-accessible surface area-related features.[28] Very recently, Xia and co-workers developed a persistent spectral hypergraph (PSH) model-based ML SF (PSH-ML), which achieved a high Pearson’s correlation coefficient (Rp) of 0.855 for the scoring power on the CASF-2016 benchmark test set.[29] Unlike traditional ML, DL-based methods usually do not require feature engineering; they can directly feed the original structural data into a high-dimensional neural network for subsequent regression.
In 2017, Koes and co-workers developed a convolutional neural network (CNN) SF, in which the input is based on a comprehensive three-dimensional representation of protein–ligand interactions.[30] Other successful applications of DL are TopologyNet,[31] KDEEP,[32] Pafnucy,[33] InteractionGraphNet,[34] and others.[35−40] Compared with the DL methods that rely on complex models, traditional ML methods have the advantages of simplicity, fast training speed, and diminished dependence on data and computing equipment. Therefore, many current efforts still rely on traditional ML to improve the scoring power of SFs.[27,41−48] Although ML-based SFs can achieve much improved performance on a selected test set such as the comparative assessment of scoring functions (CASF),[49] the performance can be significantly diminished as the overall similarity between the training set and the test set decreases. In some tests, the performances of ML-based SFs are even inferior to those of the conventional SFs.[50] In addition, binding affinity prediction for metalloenzymes is challenging due to the complex interactions between the ligand, metals, and the protein environment.[15,51] As such, extensive efforts have been devoted to improving the prediction accuracy for metallocomplexes.[51,52] Given these key issues, we asked whether the selection of more suitable features and ML methods could significantly improve the performance and transferability of SFs when compared to the previous results. To pursue this, we combined diverse features, including energy terms from X-score[12] and AutoDock Vina,[13] properties of the ligand, and the sequence information from either the active site or the full protein. Through this selection and training, we have developed a new SF, XLPFE.
Compared with other tested methods such as X-score,[12] AutoDock Vina,[13] ΔvinaXGB,[53] PSH-ML,[29] and CNN-Score,[30] XLPFE consistently achieves better scoring power and ranking power for various types of protein–ligand complex structures beyond the CASF,[49] suggesting that XLPFE has much-improved transferability. In particular, XLPFE can achieve robust performance in the scoring and ranking of binding affinity for metalloproteins.

Methods

Data Sets

The PDBbind database (http://www.pdbbind-cn.org/),[54−56] developed and maintained by Wang and co-workers, provides a comprehensive collection of experimentally measured binding affinity data for complexes together with their PDB structures. It has been widely used for the development and validation of SFs. The refined set provided by the PDBbind database prior to 2018 was used as the training set, and the refined post-2018 data set was selected as test set 1.[57] CASF-2016[49] was used as test set 2. The number of complexes contained in each set is given in Table 1. In 2020, Wang and co-workers compiled data sets based on the PDBbind refined set by removing redundant samples using various similarity thresholds.[50] We built the standard training sets (the second column of Table 2) similarly to evaluate and select different models with the CASF-2016 benchmark. We also reduced the size of the test set while keeping the size of the training set fixed, again to reduce the similarity between the two. The sizes of these standard test sets are listed in the third column of Table 2.
Table 1

Summary of the Data Sets

| set | source | number of complexes |
|---|---|---|
| training set | PDBbind refined set (before 2018) | 4190 |
| test set 1 | PDBbind refined set (after 2018) | 394 |
| test set 2 | CASF-2016 | 285 |
Table 2

Sample Size of the Nonredundant Set under Different Similarity Thresholds

| sequence similarity (%) | training set size | test set size |
|---|---|---|
| 100 | 4190 | 285 |
| 95 | 3949 | 158 |
| 90 | 3390 | 57 |
| 85 | 2824 | 23 |
| 80 | 2318 | |

Feature Sets

Our feature set includes five subsets: the energy terms of AutoDock Vina (V), the energy terms of X-score (X), the statistical features of the ligand (L), the statistical sequence-based features of the pocket (P), and the statistical sequence-based features of the full protein (F) (Table 3).
Table 3

Summary of the Feature Sets

| feature set | terms | dimension |
|---|---|---|
| AutoDock Vina | 58 terms from the Vina source code | 58 |
| X-score | VDW, HB, RT, HS, HM, and HP | 6 |
| ligand | charge; C, N, O, H, F, P, S, Cl, Br, and I element numbers; and 1, 2, 3, am, and ar bond numbers | 16 |
| pocket | 20 amino acid numbers and crystal H2O number | 21 |
| full protein | 20 amino acid numbers | 20 |
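The subset dimensions in Table 3 can be added up to check the size of any feature-set combination. The short sketch below does only that arithmetic; the names `SUBSET_DIMS` and `combined_dimension` are illustrative and are not taken from the XLPFE code.

```python
# Dimensions of the five feature subsets, as listed in Table 3.
SUBSET_DIMS = {"V": 58, "X": 6, "L": 16, "P": 21, "F": 20}


def combined_dimension(name: str) -> int:
    """Total dimension of a feature-set combination such as 'XLPF'."""
    return sum(SUBSET_DIMS[letter] for letter in name)


# Ligand subset: 1 total charge + 10 element counts + 5 bond-type counts.
assert 1 + 10 + 5 == SUBSET_DIMS["L"]

for combo in ("XLPF", "VLP"):
    print(combo, combined_dimension(combo))
```

Under these dimensions, XLPF has 63 features and VLP has 95.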
For the V subset, 58 features from the source code of Vina were selected,[13] including protein–ligand interaction terms and a set of ligand properties. Besides the Gaussian, repulsion, hydrogen bonding (HB), and hydrophobic terms included in AutoDock Vina, other terms, such as simple property counts, electrostatic interactions, AutoDock4 desolvation effects, nonhydrophobic contacts, and Lennard-Jones 4–8 van der Waals interactions (Table S1), were also included. X-score,[12] developed by Wang and co-workers, is an empirical SF composed of four major energy terms: van der Waals interactions (VDW), HB, deformation effects (RT), and hydrophobic effects. Depending on how it is calculated, the hydrophobic effect is further divided into hydrophobic pairs (HP), hydrophobic matching (HM), and hydrophobic surface (HS). HP calculates the hydrophobic energy by counting the hydrophobic contact pairs between the protein and ligand, whereas HM and HS compute this energy using an HM algorithm and an HS algorithm, respectively (see Part S1 for more details).[58−60] These six terms from X-score constitute the X subset. For the ligand information (L), a total of 16 features were considered: the total charge of the ligand, the number of atoms of each element (C, N, O, F, P, S, Cl, Br, I, and H), and the number of bonds of each type (single, double, triple, amide, and aromatic). For the binding site information (P), a total of 21 features were selected: the counts of the 20 amino acids and the number of water molecules in the site. For the full protein information (F), the counts of the 20 amino acids were used. To eliminate the influence of unit and scale differences between features, each feature was subsequently standardized using the sklearn.preprocessing.StandardScaler class.
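As a rough illustration of the pocket (P) features and of the standardization step, the sketch below counts residue types in a pocket, appends the water count, and applies a column-wise z-score. It is written in plain Python so it stands alone; the z-score uses the population standard deviation and maps constant columns to zero, which is how StandardScaler behaves with its default settings. The function names and the toy residue lists are assumptions for illustration, not the authors' code or data.

```python
import math
from collections import Counter

# Three-letter codes of the 20 standard amino acids.
AMINO_ACIDS = [
    "ALA", "ARG", "ASN", "ASP", "CYS", "GLN", "GLU", "GLY", "HIS", "ILE",
    "LEU", "LYS", "MET", "PHE", "PRO", "SER", "THR", "TRP", "TYR", "VAL",
]


def pocket_features(residues, n_waters):
    """20 amino acid counts followed by the water count (21 values)."""
    counts = Counter(residues)
    return [counts.get(aa, 0) for aa in AMINO_ACIDS] + [n_waters]


def standardize(rows):
    """Column-wise z-score using the population standard deviation."""
    n, n_cols = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(n_cols)]
    stds = [
        math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / n)
        for j in range(n_cols)
    ]
    # A constant column has zero spread; divide by 1 so it maps to 0.
    stds = [s if s > 0 else 1.0 for s in stds]
    return [[(r[j] - means[j]) / stds[j] for j in range(n_cols)] for r in rows]


# Hypothetical pockets, for illustration only.
pocket_a = pocket_features(["HIS", "GLY", "GLY", "TYR"], n_waters=3)
pocket_b = pocket_features(["LEU", "PHE", "GLY"], n_waters=1)
scaled = standardize([pocket_a, pocket_b])
```

In practice the scaling statistics would be computed on the training set and reused for the test sets, so that no test information leaks into training.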

ML Methods

Scikit-Learn[61] 0.24.1 was used to generate the machine-learning models. LR and five ML models were selected: the tree-related models extra trees (ET),[62] RF,[63] and extreme gradient boosting (XGBoost),[64] as well as support vector regression (SVR)[65] and neural networks (NN).[66] The ML models predict the binding affinity according to

$$y_i^{\mathrm{pre}} = F(x_1, x_2, \ldots, x_n) \qquad (1)$$

where $(x_1, x_2, \ldots, x_n)$ is the vector of input features and $n$ is the number of features. $F$ is the machine-learning model that adopts a nonlinear function. The output is the predicted binding affinity for protein–ligand complex $i$. Different models obtain the minimum mean absolute error (MAE) in different ways according to

$$\mathrm{MAE} = \frac{1}{N} \sum_{i=1}^{N} \left| y_i^{\mathrm{pre}} - y_i^{\mathrm{exp}} \right| \qquad (2)$$

where $y_i^{\mathrm{pre}}$ and $y_i^{\mathrm{exp}}$ are the predicted and experimental binding affinities of protein–ligand complex $i$, respectively, and $N$ is the number of samples in the training set. Fivefold cross-validation was used to efficiently search the hyperparameter space for each model. After each call (five calls in total), the cost function (the MAE of predictions on the held-out subset) across folds is returned to the estimator, which in turn chooses a new hyperparameter configuration for the next call using its acquisition function to further decrease the model’s cost function (eq 2). A brief description and the tuned hyperparameters of each ML method are shown in Table S2. The results from all models are the average of 10 repeated experiments.
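The cost that the hyperparameter search tries to reduce, the MAE averaged over five cross-validation folds, can be sketched as follows. This is a minimal plain-Python illustration, not the authors' pipeline: the helper names are hypothetical, the model is a toy stand-in that predicts the training mean, and the acquisition-function step that proposes new hyperparameters is left out.

```python
import random


def mean_absolute_error(predicted, experimental):
    """Average absolute difference between predictions and measurements."""
    return sum(abs(p - e) for p, e in zip(predicted, experimental)) / len(experimental)


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_idx, valid_idx) pairs for k roughly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, folds[i]


def cross_validated_mae(fit, xs, ys, k=5):
    """Mean validation MAE over k folds; `fit` returns a predict function."""
    scores = []
    for train, valid in k_fold_indices(len(xs), k):
        predict = fit([xs[i] for i in train], [ys[i] for i in train])
        preds = [predict(xs[i]) for i in valid]
        scores.append(mean_absolute_error(preds, [ys[i] for i in valid]))
    return sum(scores) / k


def fit_mean(train_x, train_y):
    """Toy stand-in model: always predicts the mean training affinity."""
    mean = sum(train_y) / len(train_y)
    return lambda x: mean
```

A search procedure would call `cross_validated_mae` once per candidate hyperparameter configuration and keep the configuration with the lowest value.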

Performance Evaluation

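The Rp is a measure of the linear dependence of the predicted binding affinity values on the experimental values (eq 3). The Spearman correlation coefficient (Rs) measures the strength of a monotonic association between the predicted and experimental binding affinity values (eq 4). RMSE is the root mean square error between the predicted binding affinity and the experimental value (eq 5). These are calculated as follows

$$R_p = \frac{\sum_{i=1}^{N} \left( y_i^{\mathrm{pre}} - y_{\mathrm{ave}}^{\mathrm{pre}} \right)\left( y_i^{\mathrm{exp}} - y_{\mathrm{ave}}^{\mathrm{exp}} \right)}{\sqrt{\sum_{i=1}^{N} \left( y_i^{\mathrm{pre}} - y_{\mathrm{ave}}^{\mathrm{pre}} \right)^2 \sum_{i=1}^{N} \left( y_i^{\mathrm{exp}} - y_{\mathrm{ave}}^{\mathrm{exp}} \right)^2}} \qquad (3)$$

$$R_s = 1 - \frac{6 \sum_{i=1}^{N} \left( r_i^{\mathrm{pre}} - r_i^{\mathrm{exp}} \right)^2}{N \left( N^2 - 1 \right)} \qquad (4)$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left( y_i^{\mathrm{pre}} - y_i^{\mathrm{exp}} \right)^2} \qquad (5)$$

where $y_i^{\mathrm{pre}}$ is the binding affinity from the given SF on the $i$th complex in the test set; $y_i^{\mathrm{exp}}$ is the experimental binding constant (in logarithm units, logK) of this complex; $y_{\mathrm{ave}}^{\mathrm{pre}}$ and $y_{\mathrm{ave}}^{\mathrm{exp}}$ are the corresponding averages; $r_i^{\mathrm{pre}}$ is the rank of the predicted binding affinity of the $i$th complex; $r_i^{\mathrm{exp}}$ is the rank of the experimental binding affinity of this complex; and $N$ is the total number of samples. All predicted values were based on the crystal structures of the protein–ligand complexes.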
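The three evaluation metrics can be computed with a few lines of plain Python, sketched below. Rs is computed here as the Pearson correlation of the ranks, with tied values sharing their average rank; when there are no ties this is equivalent to the familiar closed form 1 − 6Σd²/(N(N² − 1)). The function names are illustrative and the code is not taken from the XLPFE repository.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient (Rp) of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        average_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            result[order[k]] = average_rank
        pos = end + 1
    return result


def spearman(x, y):
    """Spearman correlation coefficient (Rs): Pearson correlation of ranks."""
    return pearson(ranks(x), ranks(y))


def rmse(predicted, experimental):
    """Root mean square error between predictions and measurements."""
    n = len(experimental)
    return math.sqrt(sum((p - e) ** 2 for p, e in zip(predicted, experimental)) / n)
```

Rp rewards a linear relationship between predicted and experimental affinities, Rs only their ordering, and RMSE the absolute agreement, which is why a scoring function can rank well and still show a large RMSE.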

Results

Selection of Feature Set Combinations

Nine feature-set combinations (XLF, XLP, XLPF, VXL, VXP, VLF, VLP, VXLP, and VLPF) were applied to multiple LR and to five ML models (ET, RF, SVR, NN, and XGBoost). Figure 1 compares the performance of the different feature sets and ML models on the test sets. Although the feature dimension increases from top to bottom, Rp does not improve steadily with it. However, the tree-related ML methods (ET, RF, and XGBoost) generally perform better than the other two (SVR and NN). On test set 1, VLP achieves the highest Rp value of 0.703, while XLPF has the highest Rp value of 0.816 on test set 2 (CASF-2016). Consistent with the previous observation, the highest Rp also corresponds to the lowest RMSE on the same feature set (Table S4). In particular, the VLP and XLPF feature sets achieve low RMSE values of 1.30 and 1.36, respectively. Thus, both the VLP and XLPF feature sets were selected for further testing. In most combinations, performance on test set 2 is better than on test set 1. This difference may be related to data diversity and to the similarity between the training set and the test set.
Figure 1

Pearson correlation coefficients between the experimental data and the predicted binding affinities on (A) test set 1 and (B) test set 2 (CASF-2016) for combinations of different feature sets and ML methods. The dimension of features increases from the top to the bottom. The darker the color (blue), the higher the correlation, and the lighter the color (yellow), the lower the correlation.


Comparison of Different ML Methods

The stability of LR and the ML methods was assessed on the VLP and XLPF feature sets under different similarity thresholds. Figure 2A,B shows the performance of the different models on both feature sets with different similarity thresholds of the training set. To reduce the similarity between the training set and the test set, we also reduced the size of the test set while keeping the size of the training set fixed (Table 2). Figure 2C,D shows the performance of the different models on both feature sets with different similarity thresholds of the test set. Consistent with the results in the previous section, the tree-related methods (ET, RF, and XGBoost) perform better than SVR and NN, although performance decreases as the similarity decreases.[50] Overall, ET is the best model in view of its stable and near-best performance. As shown in Figure 2, the Rp obtained from ET using the XLPF feature set is generally higher than that obtained using VLP, and XLPF (63 features) has a lower feature dimension than VLP (95 features). Thus, the combination of the XLPF feature set and the ET model (labeled as the XLPFE model) was selected for our subsequent studies.
Figure 2

Pearson correlation coefficients between the experimental data and the predicted binding affinities for different sequence similarity thresholds of the training set using (A) the VLP feature set and (B) the XLPF feature set, and for different sequence similarity thresholds of the test set using (C) the VLP feature set and (D) the XLPF feature set. Different ML methods are shown in different colors: LR in blue, ET in orange, RF in green, NN in red, SVR in purple, and XGBoost in brown.

As summarized in Figure 1 and Tables S3 and S4, XLPFE achieves Rp = 0.68 and RMSE = 1.34 on test set 1, while the corresponding values are 0.816 and 1.41 on test set 2 (CASF-2016). On CASF-2016, we further compared XLPFE with the more than 30 common SFs included in the benchmark. As summarized in part 1, model XLPFE achieved a Pearson correlation coefficient of 0.816 and a Spearman correlation coefficient of 0.66, which identifies it as one of the best SFs in the CASF-2016 data set. Additional tests of the performance of XLPFE on CASF-2016 and its subsets can be found in Table S6.

Evaluation of Feature Importance

Figure 3A shows the calculated feature correlations of XLPFE. Most features are not highly correlated with one another. The highest correlations are among the X-score terms HP, HS, and HM. These three terms are different algorithms for calculating the same hydrophobic energy, so they are closely related, but their correlations do not exceed 0.9.[67] The correlation between a feature and the experimental values is roughly proportional to the importance of that feature (Figure 3B): the higher the correlation, the higher the importance tends to be. The highly correlated features HP, HS, and HM also have similar importance.
Figure 3

(A) Correlation matrix of features and experimental values. (B) Feature importance. Feature importance values are calculated based on the number of times a feature is used to split the data across all trees. Here, the eight most significant features are shown.

The feature importance was based on the number of times a feature is used to split the data across all trees. Figure 3B shows the feature importance analysis of model XLPFE. The VDW term from X-score ranks first in feature importance. Among the top eight features, five are from the X-score subset, two are from the ligand subset, and one is from the protein active site. Among the top 30 features in Figure S3, only one feature, the occurrence of Ser, is from the full protein. This is consistent with previous studies, which showed that the use of active site sequences can improve the prediction of the affinity.[68,69] Among the amino acid counts in the protein active site, His, Gly, Tyr, and Trp rank highest (Figure S3), probably because they can form hydrogen bonds with ligands and thus play key roles in ligand binding. Leu and Phe also rank quite high, possibly due to their contribution to the formation of hydrophobic pockets.

Performance of XLPFE on Diverse Biological Targets

In this section, the performance of XLPFE on different biological targets is evaluated. For this purpose, we collected an expanded test set covering 15 different biological targets, including various kinases, secretory enzymes, and hydrolases. For each target, we collected from the PDBbind general set all the relevant crystal structures of complexes that contain small-molecule ligands with relatively reliable bioactivity data (for the PDB ID list of all the collected crystal structures of the 15 selected targets, see Table S7). To maintain a reasonable data size, targets with fewer than 10 complexes were excluded. For each target, the performance of XLPFE on the collected complexes was evaluated in terms of Rp, Rs, and RMSE. Table 4 shows the calculated Rp, Rs, and RMSE values for each target. The Rp values for the 15 targets are in the range 0.384–0.891, with an average of 0.692. The Rs values are between 0.315 and 0.845, with an average of 0.657. The RMSE values are between 0.90 and 2.44, with an average of 1.58. For comparison, we also tested other popular SFs, including the traditional SFs X-score, AutoDock Vina, and Lin_F9, the ML-based SFs ΔvinaXGB[53] and PSH-ML,[29] and the DL-based SF CNN-Score.[30,70] In ΔvinaXGB, a feature set of Vina’s 58 energy terms and other interaction-related terms was applied to the XGBoost algorithm, which achieved a scoring power of 0.796 on CASF-2016.[53] PSH-ML uses graph and spectral theory to generate additional features and has an outstanding scoring power on the benchmark set CASF-2016, with an Rp value of 0.855.[29] Note that the CNN-Score results used herein were calculated from the CNN affinity of GNINA.[70]
Table 4

Performance of XLPFE, X-Score, AutoDock Vina, ΔvinaXGB, CNN-Score, PSH-ML, and Lin_F9 Evaluated against a Set Consisting of 15 Selected Diverse Biological Targets^a

^a BACE-1, β-secretase 1; CHK1, serine/threonine-protein kinase Chk1; DPP4, dipeptidyl peptidase 4; ER, estrogen receptor; GluR2, glutamate receptor 2; HIV PR, HIV-1 protease; HSP90, heat shock protein 90; LTA-4H, leukotriene A-4 hydrolase; P38a, mitogen-activated protein kinase 14; PDE4B, cAMP-specific 3′,5′-cyclic phosphodiesterase 4B; PDK1, 3-phosphoinositide-dependent protein kinase 1; PTP1B, protein tyrosine phosphatase 1B; and SRC, proto-oncogene tyrosine protein kinase Src. The more intensely red a table cell, the smaller the value, and the more intensely green a table cell, the larger the value.

As shown in Table 4, XLPFE outperforms AutoDock Vina, ΔvinaXGB, and Lin_F9 in terms of average Rp, Rs, and RMSE. Compared to those in X-score, CNN-score, and PSH-ML, the predicted average Rp and Rs values are improved in XLPFE, although the average RMSE is lower than that in X-score, CNN-score, and PSH-ML. Overall, XLPFE achieved the highest average values of both Rp and Rs among these seven SFs, suggesting that it has robust scoring and ranking power. These results indicate that XLPFE achieves good and consistent performance on all the test targets, implying that it can be used for a broad range of biological targets. The sensitivity of our method in predicting the binding affinity of similar molecules for the same target in drug design was also confirmed.[71] Detailed discussions are given in Part S2 of the Supporting Information. They show that XLPFE has a considerable ability to correctly rank structurally similar ligands that have the same target.

Performance with Complexes Containing Different Types of Metal Atoms

To further demonstrate the performance of XLPFE, we performed binding affinity predictions for various metalloproteins. For this purpose, we collected the metalloproteins from PDBbind and classified them according to metal type (Figure 4). For Zn-containing metalloproteins, the training set and test set used in the previous study[15] were used for pretraining and testing (see Part S3 in the Supporting Information for detailed results), and our method performed well. A complex was selected as a metalloprotein if the distance between the metal and any atom of the ligand is less than 3.0 Å. Accordingly, we selected over 50 metalloproteins containing Zn, Fe, Mg, Mn, or Ca. For each metal type, the corresponding metalloproteins were randomly divided into a training set (80%) and a test set (20%).
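The distance-based selection criterion and the random 80/20 split can be sketched as below. The same distance test also gives a count of metal atoms close to the ligand, which is the kind of quantity the paper adds as one extra feature for metalloproteins. This is a hedged illustration in plain Python: the function names, the atom representation as `(element, (x, y, z))` tuples, and the toy coordinates are assumptions, not the authors' implementation.

```python
import math
import random

METALS = {"ZN", "FE", "MG", "MN", "CA"}
CUTOFF = 3.0  # Å, the distance threshold stated in the text


def metals_near_ligand(protein_atoms, ligand_coords, cutoff=CUTOFF):
    """Count metal atoms closer than `cutoff` Å to any ligand atom."""
    return sum(
        1
        for element, xyz in protein_atoms
        if element in METALS
        and any(math.dist(xyz, atom) < cutoff for atom in ligand_coords)
    )


def is_metal_complex(protein_atoms, ligand_coords):
    """Selection criterion: at least one metal within the cutoff."""
    return metals_near_ligand(protein_atoms, ligand_coords) > 0


def split_80_20(items, seed=0):
    """Random split into a training part (80%) and a test part (20%)."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train = round(0.8 * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]


# Toy coordinates, for illustration only.
protein = [("ZN", (0.0, 0.0, 0.0)), ("MG", (10.0, 0.0, 0.0)), ("C", (1.0, 0.0, 0.0))]
ligand = [(2.0, 0.0, 0.0), (5.0, 5.0, 5.0)]
print(metals_near_ligand(protein, ligand))
```

Splitting separately for each metal type, as described in the text, keeps every metal represented in both the training and the test parts.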
Figure 4

Predicted affinities in complexes containing different types of metal atoms. The performances of XLPFE, PSH-ML, Lin_F9, and X-Score evaluated against a set of targets containing five kinds of metals are listed in the table at the bottom of the figure.

In the feature set, one additional feature corresponding to the number of metals within 3 Å of any atom of the ligand was added (Table S10). For comparison, we also tested the ML-based SF PSH-ML[29] and the classical SF Lin_F9.[14] PSH-ML uses graph and spectral theory to generate additional features and has outstanding scoring power on the benchmark set CASF-2016, with an Rp value of 0.855.[29] Lin_F9 is based on a linear combination of nine empirical terms, including one energy term to describe the metal–ligand bonding interactions.[14] It was found that Lin_F9 achieves better performance than Vina and X-score with metalloprotein complexes. For a consistent comparison, the same training and test sets as those used for XLPFE were applied to PSH-ML, and all results are summarized in Figure 4. Among the four SFs, XLPFE achieves the best overall performance across all metrics, with the highest average Rp and Rs values as well as the lowest average RMSE. In particular, the performance of XLPFE is much better than that of Lin_F9 or X-score. For Zn-metalloproteins, XLPFE performs similarly to PSH-ML, but for Mn, Ca, and Fe, XLPFE performs better than PSH-ML. However, PSH-ML outperforms XLPFE for Mg. XLPFE is also much faster than PSH-ML: taking PDB ID 1SZM as an example, PSH-ML needs 1370.62 s to generate features using a single-core CPU, while XLPFE needs only 0.91 s. For Mg metalloproteins, our count shows that a considerable proportion contain more than one Mg cofactor; if multinuclear effects are accounted for in the future, the predictive performance of XLPFE could be further improved.
Overall, XLPFE achieves robust performance in the scoring and ranking of binding affinity for metalloproteins and is among the best SFs for metalloproteins.

Discussion and Conclusions

In this work, we have combined various feature sets and ML methods to improve the scoring performance of SFs. The five feature sets include energy terms from X-score (X) and AutoDock Vina (V), the properties of the ligand (L), and the statistical sequence-related information from either the binding site (P) or the full protein (F), while the ML methods consist of ET, RF, XGBoost, SVR, and NN. Among the various combinations of feature sets and ML methods, we found that the combination of the XLPF feature set and the ET model (labeled as the XLPFE model) achieves the best and most stable performance. On the benchmark set CASF-2016, XLPFE shows outstanding scoring power (0.816) and ranking power (0.66), making it one of the best SFs in the CASF-2016 data set. XLPFE also shows consistently better scoring power and ranking power than other SFs, including the traditional SFs X-score and AutoDock Vina, the ML-based SF ΔvinaXGB, and the DL-based SF CNN-score, on various expanded test sets beyond the CASF. In particular, XLPFE achieved the best overall performance for metalloproteins. All these findings suggest that an appropriate selection of feature sets can render much improved transferability in ML-based SFs. Compared with some state-of-the-art SFs based on complex topological or structural features, the combination of simple energy terms and other properties related to protein–ligand binding may achieve better scoring performance and transferability.[42,47] Thus, this study provides information that can guide the further improvement of ML-based SFs.

Data and Software Availability

All protein–ligand structural data and related experimental binding affinity are from PDBbind (http://www.pdbbind-cn.org/). Our training and test sets and the code of XLPFE are open access (https://github.com/LinaDongXMU/XLPFE). Other data and results are all listed in the Supporting Information.
  63 in total

1.  General and targeted statistical potentials for protein-ligand interactions.

Authors:  Wijnand T M Mooij; Marcel L Verdonk
Journal:  Proteins       Date:  2005-11-01

Review 2.  QM/MM methods for biomolecular systems.

Authors:  Hans Martin Senn; Walter Thiel
Journal:  Angew Chem Int Ed Engl       Date:  2009       Impact factor: 15.336

3.  Hybrid Alchemical Free Energy/Machine-Learning Methodology for the Computation of Hydration Free Energies.

Authors:  Jenke Scheen; Wilson Wu; Antonia S J S Mey; Paolo Tosco; Mark Mackey; Julien Michel
Journal:  J Chem Inf Model       Date:  2020-08-04       Impact factor: 4.956

4.  Persistent spectral hypergraph based machine learning (PSH-ML) for protein-ligand binding affinity prediction.

Authors:  Xiang Liu; Huitao Feng; Jie Wu; Kelin Xia
Journal:  Brief Bioinform       Date:  2021-04-09       Impact factor: 11.622

5.  Time-split cross-validation as a method for estimating the goodness of prospective prediction.

Authors:  Robert P Sheridan
Journal:  J Chem Inf Model       Date:  2013-04-05       Impact factor: 4.956

6.  Development of a New Scoring Function for Virtual Screening: APBScore.

Authors:  Jingxiao Bao; Xiao He; John Z H Zhang
Journal:  J Chem Inf Model       Date:  2020-10-14       Impact factor: 4.956

7.  Tapping on the Black Box: How Is the Scoring Power of a Machine-Learning Scoring Function Dependent on the Training Set?

Authors:  Minyi Su; Guoqin Feng; Zhihai Liu; Yan Li; Renxiao Wang
Journal:  J Chem Inf Model       Date:  2020-03-03       Impact factor: 4.956

8.  Incorporating Explicit Water Molecules and Ligand Conformation Stability in Machine-Learning Scoring Functions.

Authors:  Jianing Lu; Xuben Hou; Cheng Wang; Yingkai Zhang
Journal:  J Chem Inf Model       Date:  2019-10-31       Impact factor: 4.956

9.  A machine learning approach to predicting protein-ligand binding affinity with applications to molecular docking.

Authors:  Pedro J Ballester; John B O Mitchell
Journal:  Bioinformatics       Date:  2010-03-17       Impact factor: 6.937

10.  An open-source drug discovery platform enables ultra-large virtual screens.

Authors:  Andras Boeszoermenyi; Zi-Fu Wang; Christoph Gorgulla; Patrick D Fischer; Paul W Coote; Krishna M Padmanabha Das; Yehor S Malets; Dmytro S Radchenko; Yurii S Moroz; David A Scott; Konstantin Fackeldey; Moritz Hoffmann; Iryna Iavniuk; Gerhard Wagner; Haribabu Arthanari
Journal:  Nature       Date:  2020-03-09       Impact factor: 49.962
