Lingling Zhao1, Yan Zhu1, Junjie Wang2, Naifeng Wen3, Chunyu Wang1, Liang Cheng4,5.
Abstract
The task of identifying protein-ligand interactions (PLIs) plays a prominent role in the field of drug discovery. However, it is infeasible to identify potential PLIs via costly and laborious in vitro experiments. There is a need to develop PLI computational prediction approaches to speed up the drug discovery process. In this review, we summarize a brief introduction to various computation-based PLIs. We discuss these approaches, in particular, machine learning-based methods, with illustrations of different emphases based on mainstream trends. Moreover, we analyzed three research dynamics that can be further explored in future studies.Entities:
Keywords: Drug discovery; Drug-target binding affinity; Machine learning; Protein–ligand interactions
Year: 2022 PMID: 35765652 PMCID: PMC9189993 DOI: 10.1016/j.csbj.2022.06.004
Source DB: PubMed Journal: Comput Struct Biotechnol J ISSN: 2001-0370 Impact factor: 6.155
Fig. 1 Workflow of ML methods used in PLI prediction, including (a) benchmark data collection and preprocessing; (b) framework building and model training; and (c) model evaluation.
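The three-stage workflow in Fig. 1 can be sketched end to end. The toy dataset, the simple composition-based featurization, and the logistic-regression model below are all illustrative assumptions introduced here, not methods from the review; real studies use benchmarks such as DrugBank or BindingDB and far richer encoders:

```python
import math

# (a) Benchmark data collection: a toy dataset of (protein sequence, SMILES, label).
# These entries are invented for illustration only.
DATA = [
    ("MKTAYIAKQR", "CCO",       1),
    ("GAVLIMFWPS", "c1ccccc1O", 0),
    ("MKKLLPTAAA", "CC(=O)O",   1),
    ("GGSSGGSSGG", "CCN(CC)CC", 0),
    ("MKTAYIAKQA", "CCOC",      1),
    ("GAVLIPFWPS", "c1ccccc1N", 0),
]

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
SMILES_CHARS = "CNOc1()=#"

def featurize(seq, smiles):
    """(a) Preprocessing: amino-acid composition + SMILES character frequencies."""
    p = [seq.count(a) / len(seq) for a in AMINO_ACIDS]
    d = [smiles.count(ch) / len(smiles) for ch in SMILES_CHARS]
    return p + d

def train_logreg(X, y, lr=0.5, epochs=200):
    """(b) Model training: logistic regression fitted by stochastic gradient descent."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            z = sum(wj * xj for wj, xj in zip(w, xi)) + b
            err = 1.0 / (1.0 + math.exp(-z)) - yi
            w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
            b -= lr * err
    return w, b

def predict(w, b, x):
    z = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))

X = [featurize(s, m) for s, m, _ in DATA]
y = [lbl for _, _, lbl in DATA]
w, b = train_logreg(X, y)

# (c) Model evaluation: training-set accuracy; a real study would use held-out splits
# (e.g. cold-protein / cold-drug splits) and metrics such as AUROC and AUPR.
acc = sum((predict(w, b, xi) > 0.5) == bool(yi) for xi, yi in zip(X, y)) / len(y)
print(f"training accuracy: {acc:.2f}")
```

The point of the sketch is the pipeline shape, not the model: each method in the tables below swaps in its own featurization (stages a), encoder (stage b), and evaluation protocol (stage c).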
PLI prediction methods as classification tasks based on the ML framework in recent years.

| Method | Published | Protein feature | Compound feature | Protein encoding module | Compound encoding module | Prediction module |
| --- | --- | --- | --- | --- | --- | --- |
| DeepDTIs | 03/2017 | Protein sequence composition descriptors | Extended | – | – | DBN |
| DDR | 01/2018 | Similarity measures | Similarity measures | – | – | RF |
| CPI-GNN | 07/2018 | N-gram amino acids | Molecular graphs | CNN | GNN | Softmax classifier |
| DeepConv-DTI | 06/2019 | Local residue patterns | PubChem fingerprints | Convolution and global max-pooling layers | Fully connected layer | Fully connected layer |
| DTI-CDF | 12/2019 | Similarity-based features | Similarity-based features | – | – | Cascade deep forest |
| DEEPScreen | 01/2020 | – | 2-D compound images | – | Convolutional and pooling layers | Fully connected layers |
| TransformerCPI | 05/2020 | Amino acid sequence | Graph structure | CNN | GCNs | Transformer with self-attention mechanism |
| DTI-CNN | 08/2020 | Similarity matrix | Similarity matrix | Random walk with restart | Random walk with restart | Fully connected layer |
| MolTrans | 10/2020 | Substructure | Substructure | Transformer encoder | Transformer encoder | Linear layer |
| BridgeDPI | 02/2021 | K-mer/sequence features | Fingerprint/sequence features | Perceptron layers | Perceptron layers | GNN and a fully connected layer |
| CSConv2d | 04/2021 | – | 2-D structural representations | – | A channel and spatial attention mechanism | Fully connected layer |
| GADTI | 04/2021 | Similarity data | Similarity data | Heterogeneous network | Heterogeneous network | Graph autoencoder |
| LGDTI | 04/2021 | K-mer | Molecular fingerprint | Graph convolutional network and DeepWalk | Graph convolutional network and DeepWalk | RF |
| PretrainDPI | 05/2021 | Pretrained models | Molecular graph | CNN | GraphNet | Fully connected layers |
| X-DPI | 06/2021 | Structure and sequence features | Atomic features | TAPE embedding | Mol2vec embedding | Transformer decoder |
| MultiDTI | 07/2021 | N-gram embedding | N-gram embedding | Deep downsampling residual module | Deep downsampling residual module | Multilayer perceptron |
| HyperAttentionDTI | 10/2021 | Amino acid sequences | SMILES strings | CNN and attention mechanism | CNN and attention mechanism | Fully connected layer |
| DTIHNC | 02/2022 | Protein-protein interactions, protein-disease associations | Drug-drug interactions, drug-disease associations, drug-side-effects associations | Denoising autoencoder | Denoising autoencoder | CNN module |
| HIDTI | 03/2022 | Protein sequences, protein–protein similarities, protein–protein interactions, protein-disease interactions | SMILES strings, drug-drug interactions, drug-side effect associations, drug- | A residual block | A residual block | Fully connected layers |
| HGDTI | 04/2022 | Node features encoding (interactions, similarities, associations) | Node features encoding (interactions, similarities, associations) | BiLSTM | BiLSTM | Fully connected layers |
Note: “-” in the table indicates that there is no such information in the corresponding article.
Abbreviations: DBN – deep belief network; RF – random forest; CNN – convolutional neural network; GNN – graph neural network; GCNs – graph convolutional networks; TAPE – tasks assessing protein embeddings; SMILES – simplified molecular-input line-entry system; BiLSTM – bidirectional long short-term memory.
URL addresses for the listed tools: DeepDTIs – ; DDR – ; CPI-GNN – ; DeepConv-DTI – ; DTI-CDF – ; DEEPScreen – ; TransformerCPI – ; DTI-CNN – ; MolTrans – ; BridgeDPI – ; CSConv2d – https://doi.org/10.4121/uuid:547e8014-d662-4852-9840-c1ef065d03ef; GADTI – ; PretrainDPI – ; MultiDTI – ; HyperAttentionDTI – https://github.com/zhaoqichang/HpyerAttentionDTI; DTIHNC – https://github.com/ningq669/DTIHNC; HIDTI – https://github.com/DMCB-GIST/HIDTI; HGDTI – https://bioinfo.jcu.edu.cn/hgdti.
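Several methods in the table above (CPI-GNN, BridgeDPI, LGDTI) start from n-gram/k-mer features of the protein sequence. A minimal sketch of that featurization is below; the choice of k=2 and the count normalization are illustrative assumptions, and the cited methods differ in k and in how the counts are subsequently embedded:

```python
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def kmer_features(sequence, k=2):
    """Map a protein sequence to normalized k-mer frequencies over the 20
    standard amino acids, giving a fixed-length vector of 20**k dimensions."""
    vocab = ["".join(p) for p in product(AMINO_ACIDS, repeat=k)]
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    total = max(1, len(sequence) - k + 1)
    return [counts[km] / total for km in vocab]

vec = kmer_features("MKTAYIAKQRQISFVK", k=2)
print(len(vec), round(sum(vec), 6))  # 400 dimensions; frequencies sum to 1.0
```

The fixed dimensionality is what makes such features convenient: proteins of any length map to the same vector size, which can feed a random forest (LGDTI) or perceptron layers (BridgeDPI) directly.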
PLI prediction methods as regression tasks based on the ML framework in recent years.

| Method | Published | Protein feature | Compound feature | Protein encoding module | Compound encoding module | Prediction module |
| --- | --- | --- | --- | --- | --- | --- |
| SimBoost | 04/2017 | Target similarity | Drug similarity | – | – | Gradient boosting tree model |
| ACNN | 2017 | Atomic coordinates | Atomic coordinates | Atomic convolution layer | Atomic convolution layer | Atomic fully connected layer |
| DeepDTA | 09/2018 | Label encoding | Label encoding | CNN blocks | CNN blocks | Fully connected layer |
| DeepAffinity | 02/2019 | Structural property sequence representation | Structural property sequence representation | Seq2seq autoencoders | Seq2seq autoencoders | Unified RNN-CNN |
| WideDTA | 02/2019 | Textual information | Textual information | CNN blocks | CNN blocks | Fully connected layers |
| GraphDTA | 06/2019 | One-hot encoding | Molecular graph | Convolutional layers | 4 graph neural network variants | Fully connected layers |
| RFScore | 08/2019 | 36 intermolecular features | 36 intermolecular features | – | – | Random forest |
| AttentionDTA | 11/2019 | Label encoding | Label encoding | CNN block | CNN block | Attention block-fully connected layers |
| Taba | 01/2020 | The average distance between pairs of atoms | The average distance between pairs of atoms | – | – | Machine-learning model |
| GAT_GCN | 04/2020 | Peptide frequency | Graph structure | CNN | GCN | Fully connected layers |
| SAnDReS | 05/2020 | Docking scores | Docking scores | – | – | Machine-learning model |
| DeepCDA | 05/2020 | N-gram embedding | SMILES sequence | CNN-LSTM-Two-sided attention mechanism | CNN-LSTM-Two-sided attention mechanism | Fully connected layers |
| DGraphDTA | 06/2020 | Protein graph | Molecular graph | GNN | GNN | Fully connected layers |
| JoVA | 08/2020 | Multiple unimodal representations | Multiple unimodal representations | Joint view attention module | Joint view attention module | Prediction model |
| Fusion | 11/2020 | Atomic representation | Atomic representation | CNNs | SG-CNNs | Fully connected layers |
| DeepGS | 2020 | Symbolic sequences | Molecular structure | Prot2Vec-CNN-BiGRU blocks | Smi2Vec-CNN-BiGRU blocks | Fully connected layer |
| DeepDTAF | 01/2021 | Sequence, structural property information | SMILES string | Dilated/traditional convolution layers | Dilated convolution layers | Fully connected layers |
| GanDTI | 03/2021 | Protein sequences | Molecule fingerprints-adjacency matrix | Attention module | Residual graph neural network | MLP |
| Multi-PLI | 04/2021 | One-hot vectors | One-hot vectors | CNN blocks | CNN blocks | Fully connected layers |
| ML-DTI | 04/2021 | Protein sequences | SMILES string | CNN block (mutual learning) | CNN block (mutual learning) | Linear transformation layers |
| DEELIG | 06/2021 | Atomic level-structural information-sequences | Physical properties-fingerprints | CNN | Fully connected layers | Fully connected layers |
| GEFA | 07/2021 | Sequence embedding features | Graph representation | GCN | GCN | Linear layers |
| SAG-DTA | 08/2021 | Label encoding | Molecular graph | CNN | Graph convolutional layer-SAGPooling layer | Fully connected layers |
| Tanoori et al. | 08/2021 | SW sequence similarity | CS similarity | – | – | GBM |
| EmbedDTI | 11/2021 | Amino acids | Structural information | CNN | Attention-GCNs | Fully connected layers |
| DeepPLA | 12/2021 | Protein sequences (ProSE) | SMILES strings (Mol2Vec) | Head CNN modules-ResNet-based CNN module | Head CNN modules-ResNet-based CNN module | BiLSTM module-MLP module |
| DeepGLSTM | 01/2022 | Amino acids | Adjacency representation | BiLSTM | GCN | Fully connected layers |
| MGraphDTA | 01/2022 | Integers | Graph structure | Multiscale convolutional neural network | GNN | MLP |
| FusionDTA | 01/2022 | Word embeddings | SMILES strings | BiLSTM | BiLSTM | Multi- |
| HoTS | 02/2022 | Protein sequences | Morgan/circular fingerprints | Transformer blocks | Transformer blocks | Fully connected layers |
| ELECTRA-DTA | 03/2022 | Protein sequences | SMILES string | Squeeze-and-excitation convolutional neural network blocks | Squeeze-and-excitation convolutional neural network blocks | Fully connected layers |
Note: “-” in the table indicates that there is no such information in the corresponding article.
Abbreviations: CNN – convolutional neural network; GNN – graph neural network; GCNs – graph convolutional networks; LSTM – long short-term memory; SG-CNNs – spatial graph convolutional neural networks; BiGRU – bidirectional gate recurrent unit; MLP – multilayer perceptron; GCN – graph convolutional network; SW – Smith-Waterman; CS – chemical structure; GBM – gradient boosting machine; BiLSTM – bidirectional long short-term memory.
URL addresses for the listed tools: SimBoost – ; ACNN – ; DeepDTA – ; DeepAffinity – ; GraphDTA – ; Taba – https://github.com/azevedolab/taba; SAnDReS – https://github.com/azevedolab/sandres; DeepCDA – ; Fusion – ; DeepGS – ; DeepDTAF – ; GanDTI – ; Multi-PLI – ; ML-DTI – ; DEELIG – ; GEFA – ; EmbedDTI – ; DeepPLA – ; DeepGLSTM – https://github.com/MLlab4CS/DeepGLSTM.git; MGraphDTA – https://github.com/guaguabujianle/MGraphDTA; FusionDTA – https://github.com/yuanweining/FusionDTA; HoTS – https://github.com/GIST-CSBL/HoTS; ELECTRA-DTA – https://github.com/IILab-Resource/ELECTRA-DTA.
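The regression methods above predict continuous binding affinities, and are commonly evaluated with mean squared error and the concordance index (CI) on affinity benchmarks. A minimal sketch of both metrics is below; the toy affinity values are invented for illustration:

```python
def mse(y_true, y_pred):
    """Mean squared error between measured and predicted binding affinities."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def concordance_index(y_true, y_pred):
    """Concordance index: among all pairs with different true affinities, the
    fraction whose predictions are ranked in the same order (ties count 0.5)."""
    num, den = 0.0, 0
    n = len(y_true)
    for i in range(n):
        for j in range(i + 1, n):
            if y_true[i] == y_true[j]:
                continue  # tied true affinities are not comparable
            den += 1
            d_true = y_true[i] - y_true[j]
            d_pred = y_pred[i] - y_pred[j]
            if d_true * d_pred > 0:
                num += 1.0       # concordant pair
            elif d_pred == 0:
                num += 0.5       # tied prediction
    return num / den if den else 0.0

# Toy example: four measured affinities (e.g. pKd values) and predictions.
true = [5.0, 6.2, 7.1, 8.3]
pred = [5.1, 6.0, 7.5, 8.0]
print(round(mse(true, pred), 4), concordance_index(true, pred))  # 0.075 1.0
```

MSE penalizes the magnitude of prediction error, while CI only rewards correct ranking of compound pairs, which is often what matters when prioritizing candidates for screening.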