Floriane Montanari, Barbara Zdrazil.
Abstract
Chemical compound bioactivity data and related information are now readily available from open data sources and the open medicinal chemistry literature for many transmembrane proteins. Computational ligand-based modeling of transporters has therefore shifted from local (quantitative) models to more global, qualitative, predictive models. As the size and heterogeneity of the data sets rise, careful data curation becomes even more important. This includes not only a tailored cutoff for generating binary classes, but also a proper assessment of the applicability domain. Powerful machine learning approaches (such as multi-label classification) now allow the simultaneous prediction of multiple related targets. However, the more complex these models become, the less interpretable they are. We emphasize that transmembrane transporters are peculiar proteins, some of which act as off-targets rather than as genuine drug targets. Thus, careful selection of the modeling technique is important, as is cautious interpretation of the results. We hope that, as more data become available, we will be able to improve and refine our models, moving closer to function elucidation and the development of safer medicines.
Keywords: applicability domain; computational modeling; data curation; machine learning; multi-label classification; open data; transport proteins
Year: 2017 PMID: 28272367 PMCID: PMC5553104 DOI: 10.3390/molecules22030422
Source DB: PubMed Journal: Molecules ISSN: 1420-3049 Impact factor: 4.411
Figure 1. Schematic depiction of data compilation and modeling workflow.
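Two of the curation steps named in the abstract, cutoff-based generation of binary classes and assessment of the applicability domain, can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration and not taken from the paper: the function names, the 10 µM and 50 µM cutoffs, the 0.3 similarity threshold, and the use of nearest-neighbor Tanimoto similarity on fingerprint bit sets as the domain criterion.

```python
from typing import Iterable, Optional, Set


def binarize_activity(ic50_um: float,
                      active_below: float = 10.0,
                      inactive_above: Optional[float] = None) -> Optional[int]:
    """Map an IC50 (in µM) to 1 (inhibitor), 0 (non-inhibitor) or None.

    Values between the two cutoffs are treated as ambiguous and get None,
    so they can be dropped from the training set. Cutoffs are illustrative.
    """
    if inactive_above is None:
        inactive_above = active_below  # single cutoff, no gray zone
    if inactive_above < active_below:
        raise ValueError("inactive_above must be >= active_below")
    if ic50_um <= active_below:
        return 1
    if ic50_um > inactive_above:
        return 0
    return None


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto similarity between two fingerprints given as sets of on-bits."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def in_applicability_domain(query: Set[int],
                            training: Iterable[Set[int]],
                            min_similarity: float = 0.3) -> bool:
    """True if at least one training compound is similar enough to the query."""
    return any(tanimoto(query, fp) >= min_similarity for fp in training)


# Hypothetical measurements (compound name -> IC50 in µM)
measurements = {"cpd_a": 0.5, "cpd_b": 25.0, "cpd_c": 120.0}
labels = {name: binarize_activity(value, 10.0, 50.0)
          for name, value in measurements.items()}
print(labels)  # {'cpd_a': 1, 'cpd_b': None, 'cpd_c': 0}

# Hypothetical fingerprints as sets of on-bit indices
training_fps = [{1, 2, 3, 4}, {2, 3, 5}]
print(in_applicability_domain({2, 3, 4}, training_fps))  # True
print(in_applicability_domain({7, 8, 9}, training_fps))  # False
```

In practice the cutoffs would be chosen per transporter and per assay type, and predictions for compounds outside the domain would be flagged as unreliable rather than silently reported.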
Summary of the methods used to predict ligand–transporter interactions.
| Reference | Endpoint | Training Set | Method Type | Algorithm |
|---|---|---|---|---|
| Ecker [ | P-gp inhibition | 20 benzofurylethanolamine analogs of propafenone | regression | multiple linear regression |
| Li [ | P-gp transport and inhibition | 20 steroids | regression | 3D-QSAR a |
| Zhang [ | BCRP inhibition | 25 flavonoids | regression | feature selection and multiple linear regression |
| Müller [ | P-gp inhibition | 28 tariquidar analogs | regression | 3D-QSAR a |
| Pajeva [ | P-gp inhibition | 40 phenothiazines, thioxanthenes, and structurally related drugs | regression | 3D-QSAR a |
| Broccatelli [ | P-gp inhibition | 772 diverse compounds | classification | PLS-DA b and LDA c |
| Montanari [ | BCRP inhibition | 978 diverse compounds | classification | logistic regression |
| Kotsampasakou [ | OATP1B1 and OATP1B3 inhibition | 1700 diverse compounds | classification | ensemble of SVMs d and random forests |
| Sedykh [ | Inhibition and transport for a range of intestinal transporters | up to 1571 diverse compounds | classification | random forest, k-nearest neighbors or SVM d |
| Montanari [ | P-gp and BCRP inhibition | 2191 diverse compounds | multi-label classification | classifiers chain |
| Aniceto [ | P-gp, BCRP, MRP1, and MRP2 transport | 1493 diverse compounds | multi-label classification | classifiers chain |
a three-dimensional quantitative structure-activity relationship; b partial least-squares discriminant analysis; c linear discriminant analysis; d support vector machines.
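The last two rows of the table use a classifier chain for multi-label classification. The idea can be sketched as follows: one binary classifier is trained per label, and each classifier after the first also receives the preceding labels as extra input features (true labels during training, predicted labels at prediction time). This is a toy sketch, not the implementation used in the cited studies: the 1-nearest-neighbor base learner, the class names, and the two-feature toy data are all assumptions chosen to keep the example self-contained.

```python
from typing import List, Sequence


class NearestNeighbor:
    """Stand-in base learner: 1-nearest neighbor with squared Euclidean distance."""

    def fit(self, X: Sequence[Sequence[float]], y: Sequence[int]) -> "NearestNeighbor":
        self.X = [list(row) for row in X]
        self.y = list(y)
        return self

    def predict_one(self, x: Sequence[float]) -> int:
        def distance(row: List[float]) -> float:
            return sum((a - b) ** 2 for a, b in zip(row, x))

        best = min(range(len(self.X)), key=lambda i: distance(self.X[i]))
        return self.y[best]


class ClassifierChain:
    """One binary model per label; model j also sees labels 0..j-1 as features."""

    def __init__(self, n_labels: int) -> None:
        self.models = [NearestNeighbor() for _ in range(n_labels)]

    def fit(self, X: Sequence[Sequence[float]],
            Y: Sequence[Sequence[int]]) -> "ClassifierChain":
        for j, model in enumerate(self.models):
            # Training: append the TRUE values of the earlier labels
            X_aug = [list(x) + list(y[:j]) for x, y in zip(X, Y)]
            model.fit(X_aug, [y[j] for y in Y])
        return self

    def predict_one(self, x: Sequence[float]) -> List[int]:
        predicted: List[int] = []
        for model in self.models:
            # Prediction: append the PREDICTED values of the earlier labels
            predicted.append(model.predict_one(list(x) + predicted))
        return predicted


# Toy data: two descriptors per compound, two labels per compound
# (read them as, e.g., "P-gp inhibitor" and "BCRP inhibitor"; values are made up)
X = [[0, 0], [0, 1], [1, 0], [1, 1]]
Y = [[0, 0], [0, 1], [1, 0], [1, 1]]

chain = ClassifierChain(n_labels=2).fit(X, Y)
print(chain.predict_one([1, 0]))  # [1, 0]
print(chain.predict_one([1, 1]))  # [1, 1]
```

Because each model depends on the output of the previous ones, the label order matters, and errors made early in the chain can propagate to later labels.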