Alan Talevi, Juan Francisco Morales, Gregory Hather, Jagdeep T Podichetty, Sarah Kim, Peter C Bloomingdale, Samuel Kim, Jackson Burton, Joshua D Brown, Almut G Winterstein, Stephan Schmidt, Jensen Kael White, Daniela J Conrado.
Abstract
Artificial intelligence, in particular machine learning (ML), has emerged as a promising means of overcoming the high failure rate in drug development. Here, we present a primer on the ML algorithms most commonly used in drug discovery and development. We also list possible data sources, describe good practices for ML model development and validation, and share a reproducible example. A companion article will summarize applications of ML in drug discovery, drug development, and the postapproval phase.
Year: 2020 PMID: 31905263 PMCID: PMC7080529 DOI: 10.1002/psp4.12491
Source DB: PubMed Journal: CPT Pharmacometrics Syst Pharmacol ISSN: 2163-8306
Figure 1. Overview of the types of machine learning and algorithms. Only the most commonly used algorithms are described in this tutorial. AdaBoost, adaptive boosting; DBSCAN, density‐based spatial clustering of applications with noise; DCNN, deep convolutional neural networks; Eclat, equivalence class transformation; FP‐Growth, frequent pattern growth; GRU, gated recurrent unit; K‐NN, K‐nearest neighbors; LDA, linear discriminant analysis; LightGBM, light gradient boosting machine; LSA, latent semantic analysis; LSM, liquid state machine; LSTM, long short‐term memory; MLP, multilayer perceptron; PCA, principal component analysis; seq2seq, sequence‐to‐sequence; SVD, singular value decomposition; SVM, support vector machine; t‐SNE, t‐distributed stochastic neighbor embedding; XGBoost, extreme gradient boosting.
Open source tools and learning resources on ML
| Resource name | Description | What is this good for? | Reference |
|---|---|---|---|
| Machine Learning in R for beginners | A short tutorial that introduces implementing ML in R | Getting started with ML using R | |
| Stanford ML Tips and Tricks | A quick reference guide to ML concepts and algorithms | A cheat sheet for understanding ML concepts and algorithms | |
| Kaggle Data sets | A repository of data sets for building ML models | Finding data sets on which to train and test ML models | |
| WEKA | An open source ML tool with a graphical user interface | Getting started with ML | |
| Google Colaboratory | An online Jupyter notebook environment for Python that requires no setup | An easy‐to‐use online tool with no installation necessary on your local machine; also well suited to ML education | |
| Caret R package | An R package for ML analysis | For researchers comfortable with R who want to perform ML analysis | |
ML, machine learning.
| | True classification: negative | True classification: positive |
|---|---|---|
| Predicted classification: negative | True negative (TN) | False negative (FN) |
| Predicted classification: positive | False positive (FP) | True positive (TP) |
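
The four cells of the confusion matrix above can be tallied directly from paired true and predicted labels. The following is a minimal sketch in plain Python, not the authors' reproducible example: the function names `confusion_counts` and `summary_metrics` and the label vectors are hypothetical, and the derived metrics (accuracy, sensitivity, specificity) are standard definitions added here for illustration rather than taken from the table.

```python
from collections import Counter


def confusion_counts(y_true, y_pred, positive=1):
    """Tally TN, FN, FP, and TP for a binary classifier."""
    counts = Counter()
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            # Predicted positive: correct if the true class is also positive.
            counts["TP" if truth == positive else "FP"] += 1
        else:
            # Predicted negative: a miss if the true class is positive.
            counts["FN" if truth == positive else "TN"] += 1
    return {cell: counts[cell] for cell in ("TN", "FN", "FP", "TP")}


def _ratio(numerator, denominator):
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def summary_metrics(c):
    """Derive common summary metrics from the four confusion-matrix cells."""
    return {
        "accuracy": _ratio(c["TP"] + c["TN"], sum(c.values())),
        "sensitivity": _ratio(c["TP"], c["TP"] + c["FN"]),  # true positive rate
        "specificity": _ratio(c["TN"], c["TN"] + c["FP"]),  # true negative rate
    }


# Hypothetical labels, for illustration only (1 = positive, 0 = negative).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

counts = confusion_counts(y_true, y_pred)
metrics = summary_metrics(counts)
print(counts)
print(metrics)
```

With these made-up labels the tally is three true positives, three true negatives, one false positive, and one false negative, so all three metrics come out to 0.75. On imbalanced data, accuracy alone can be misleading, which is why sensitivity and specificity are reported alongside it in this sketch.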