Nalini Schaduangrat, Samuel Lampa, Saw Simeon, Matthew Paul Gleeson, Ola Spjuth, Chanin Nantasenamat.
Abstract
The reproducibility of experiments has been a long-standing impediment to further scientific progress. Computational methods have been instrumental in drug discovery efforts owing to their multifaceted use in data collection, pre-processing, analysis and inference. This article provides in-depth coverage of the reproducibility of computational drug discovery. This review explores the following topics: (1) the current state-of-the-art in reproducible research, (2) research documentation (e.g. electronic laboratory notebook, Jupyter notebook, etc.), (3) the science of reproducible research (i.e. comparison and contrast with related concepts such as replicability, reusability and reliability), (4) model development in computational drug discovery, (5) computational issues in model development and deployment, and (6) use case scenarios for streamlining the computational drug discovery protocol. In computational disciplines, it has become common practice to share the data and programming code used for numerical calculations, not only to facilitate reproducibility but also to foster collaboration (i.e. to drive the project further by introducing new ideas, growing the data, augmenting the code, etc.). It is therefore inevitable that the field of computational drug design would adopt an open approach towards the collection, curation and sharing of data and code.
Keywords: Bioinformatics; Cheminformatics; Data science; Data sharing; Drug design; Drug discovery; Open data; Open science; Reproducibility; Reproducible research
Year: 2020 PMID: 33430992 PMCID: PMC6988305 DOI: 10.1186/s13321-020-0408-x
Source DB: PubMed Journal: J Cheminform ISSN: 1758-2946 Impact factor: 5.514
Fig. 1 Schematic summary of the drug discovery process overlaid with the corresponding computational approaches
Fig. 2 Conceptual map of the experimental and computational methodologies as applied to the drug discovery process [283]. The terms on each colored track are listed in no particular order
Fig. 3 Number of articles on PubMed mentioning “Pipeline Pilot” or “KNIME” in their title or abstract from 2003 to 2017
List of the largest public cloud infrastructure service providers
| Service provider | URL |
|---|---|
| Amazon Web Services | |
| Microsoft’s Azure | |
| Google Cloud Platform | |
| IBM’s SoftLayer | |
| Alibaba Cloud | |
Service providers are ordered according to market share [284]
Fig. 4 Schematic comparison of virtual machines and containers. Virtual machines run on a hypervisor and each contain their own guest operating system. In contrast, containers provide a layer of isolation while sharing the host operating system kernel, and are hence smaller and faster to instantiate than virtual machines
Fig. 5 A comparison between monolithic services and microservices. In traditional services (left), each service consists of a monolithic implementation that encapsulates all necessary components under a single interface. In contrast, in a microservice-based implementation (right) the individual components that make up an exposed service run independently, which makes it easier to scale parts of the service if needed and allows sub-components to be reused in other settings
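The contrast drawn in Fig. 5 can be illustrated with a small in-process sketch. It is only an analogy: real microservices run as separate processes that communicate over a network, whereas here plain functions stand in for the independent components. All names (`MonolithService`, `standardize`, `describe`, `score`, `compose`) and the toy "descriptor" and "model" are hypothetical and do not come from the article.

```python
from typing import Callable


class MonolithService:
    """Monolith: every component is hidden behind a single interface."""

    def predict(self, structure: str) -> float:
        cleaned = structure.strip()          # component 1: standardization
        descriptor = float(len(cleaned))     # component 2: toy descriptor
        return 0.5 * descriptor              # component 3: toy model


# Microservice style: each component is an independent unit that could be
# deployed, scaled, or replaced on its own (hypothetical stand-ins).
def standardize(structure: str) -> str:
    return structure.strip()


def describe(structure: str) -> float:
    return float(len(structure))


def score(descriptor: float) -> float:
    return 0.5 * descriptor


def compose(*steps: Callable) -> Callable:
    """Chain independent components into one exposed service."""
    def pipeline(value):
        for step in steps:
            value = step(value)
        return value
    return pipeline


# The exposed service is assembled from the independent components ...
predict = compose(standardize, describe, score)
# ... and a subset of them can be reused in another setting.
descriptor_only = compose(standardize, describe)
```

Both `MonolithService().predict(" CCO ")` and `predict(" CCO ")` give the same result, but only the composed version allows `describe` to be reused or swapped without touching the other steps.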
List of software and packages that implement an automated QSAR modeling workflow
| Software/tool | Description | URL | Refs. |
|---|---|---|---|
| Standalone and online applications | | | |
| AZOrange | Graphical programming environment based on the Python package “Orange” for performing the QSAR modeling workflow | | |
| AutoQSAR | Automated machine learning tool for QSAR modeling using best practice guidelines | | |
| AutoWeka | Automated data mining software for QSAR modeling based on the machine learning software Weka | | |
| ChemSAR | Online platform for QSAR modeling that is capable of handling chemical structures, computing molecular descriptors, building models and producing result plots | | |
| Tools implemented in R language | | | |
| camb | R package that is capable of handling chemical structures, computing descriptors and building QSAR models | | |
| Ezqsar | R package for building QSAR models | | |
| RRegrs | R package for building multiple regression models using pre-configured and customizable workflows | | |
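As a rough illustration of the kind of workflow the tools listed above automate, the following minimal sketch chains a seeded data split, a model fit and an evaluation step. It uses only the Python standard library and does not reproduce the API of any listed package; the toy data, the single-descriptor linear model, and the names `split`, `fit_line`, `rmse` and `run_workflow` are assumptions made for illustration.

```python
import random
import statistics

# Hypothetical toy data: (descriptor value, measured activity) pairs.
# A real workflow would compute descriptors from chemical structures.
DATA = [(float(x), 2.0 * x + 1.0) for x in range(10)]


def split(data, test_fraction, seed):
    """Shuffle with a fixed seed so the train/test split is reproducible."""
    rng = random.Random(seed)
    shuffled = data[:]
    rng.shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def fit_line(train):
    """Ordinary least squares for activity = slope * descriptor + intercept."""
    xs = [x for x, _ in train]
    ys = [y for _, y in train]
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in train)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def rmse(model, test):
    """Root-mean-square error of the fitted line on held-out data."""
    slope, intercept = model
    errors = [(slope * x + intercept - y) ** 2 for x, y in test]
    return statistics.fmean(errors) ** 0.5


def run_workflow(data, seed=42):
    """Split, fit and evaluate in one call; the seed is the only free input."""
    train, test = split(data, 0.3, seed)
    model = fit_line(train)
    return model, rmse(model, test)
```

Because the seed is an explicit argument, calling `run_workflow(DATA, seed=7)` twice yields identical results, which is the property that makes such a pipeline straightforward to share and re-run.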
List of selected GitHub URLs of researchers working in the domain of computational drug discovery
| Researcher’s name | GitHub URL | Ligand-based | Structure-based | Systems-based |
|---|---|---|---|---|
| Andrea Volkamer | ✔ | ✔ | ||
| Chanin Nantasenamat | ✔ | ✔ | ||
| ✔ | ||||
| Egon Willighagen | ✔ | |||
| George Papadatos | ✔ | |||
| Greg Landrum | ✔ | |||
| Jan H. Jansen | ✔ | ✔ | ||
| John Chodera | ✔ | ✔ | ||
| Ola Spjuth | ✔ | |||
| Rajarshi Guha | ✔ | |||
| Samo Turk | ✔ |
List of selected web applications for handling various bioinformatic and cheminformatic tasks belonging to either ligand-based or structure-based drug design approaches
| Web servers | Description | URL | Refs. |
|---|---|---|---|
| Ligand-based drug design | | | |
| BioTriangle | Computes descriptors for compounds, proteins, DNA and their interaction cross-terms | | |
| ChemDes | Computes 3679 molecular descriptors and 59 fingerprint types for compounds | | |
| ChemBench | Enables QSAR model building via pre-defined workflow | | |
| OCHEM | Online platform providing storage for QSAR data and workflow for model building | | |
| PUMA | Performs analysis and visualization of chemical diversity | | |
| Structure-based drug design | | | |
| HADDOCK | Performs information-driven docking of biomolecular complexes (e.g. DNA, proteins, peptides, etc.) | | |
| FlexServ | Performs coarse-grained determination of protein dynamics | | |
| MDWeb | Provides standard protocols for preparing structures, running standard molecular dynamics simulations and analyzing trajectories | | |
| PoseView | Displays simple molecular interaction diagrams of protein-ligand complexes | | |
| SwissModel | Predicts protein structures via template-based homology modeling | | |