Taravat Ghafourian¹, Zeshan Amin. ¹ Medway School of Pharmacy, Universities of Kent and Greenwich, Central Avenue, Chatham Maritime, Kent ME4 4TB, UK.
Abstract
INTRODUCTION: Predicting plasma protein binding (PPB) is of paramount importance in the pharmacokinetic characterization of drugs, because protein binding causes significant changes in volume of distribution, clearance and drug half-life. This study used Quantitative Structure-Activity Relationships (QSAR) to predict plasma protein binding.

METHODS: Protein binding values for 794 compounds were collated from the literature. The data were partitioned into a training set of 662 compounds and an external validation set of 132 compounds. Physicochemical and molecular descriptors were calculated for each compound using the ACD labs/logD, MOE (Chemical Computing Group) and Symyx QSAR software packages. Several data mining tools were employed to construct the models: stepwise regression analysis, Classification and Regression Trees (CART), boosted trees and Random Forest.

RESULTS: Several predictive models were identified; however, one model produced significantly superior prediction accuracy for the external validation set, as measured by mean absolute error and correlation coefficient. The selected model was a boosted regression tree model with a mean absolute error of 13.25 for the training set and 14.96 for the validation set.

CONCLUSION: Plasma protein binding can be modeled with reasonable accuracy using simple regression trees or multiple linear regression. These interpretable models identified the molecular factors governing high PPB, which included hydrophobicity, van der Waals surface area parameters, and aromaticity. The more complicated ensemble method of boosted regression trees, however, produced the most accurate PPB estimates for the external validation set.
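The workflow described in the abstract (split the compounds into a training set and an external validation set, fit a boosted regression tree model, then report mean absolute error and correlation coefficient) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the data are synthetic, the two descriptors (`log_p`, `aromatic_rings`) are hypothetical stand-ins for the study's descriptors, and the booster uses one-split trees (stumps) with squared-error residual fitting rather than the authors' software or settings. Only the roughly 5:1 training-to-validation ratio follows the abstract.

```python
import random
import statistics

def fit_stump(X, residuals):
    """Return (feature, threshold, left_mean, right_mean) minimizing squared error."""
    best = None
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2  # midpoint between neighbouring distinct values
            left = [r for row, r in zip(X, residuals) if row[j] <= t]
            right = [r for row, r in zip(X, residuals) if row[j] > t]
            lm, rm = statistics.fmean(left), statistics.fmean(right)
            sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
            if best is None or sse < best[0]:
                best = (sse, j, t, lm, rm)
    return best[1:]

def predict_stump(stump, row):
    j, t, lm, rm = stump
    return lm if row[j] <= t else rm

def fit_boosted(X, y, n_rounds=40, learning_rate=0.2):
    """Boosting for squared error: each stump is fitted to the current residuals."""
    base = statistics.fmean(y)
    preds = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        residuals = [yi - pi for yi, pi in zip(y, preds)]
        stump = fit_stump(X, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * predict_stump(stump, row)
                 for p, row in zip(preds, X)]
    return base, learning_rate, stumps

def predict(model, row):
    base, lr, stumps = model
    return base + lr * sum(predict_stump(s, row) for s in stumps)

def mean_absolute_error(y_true, y_pred):
    return statistics.fmean([abs(a - b) for a, b in zip(y_true, y_pred)])

def pearson_r(a, b):
    ma, mb = statistics.fmean(a), statistics.fmean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5

# Synthetic data only: NOT the 794 compounds from the study.
rng = random.Random(0)

def make_compound():
    log_p = rng.uniform(-2, 6)          # hypothetical hydrophobicity descriptor
    aromatic_rings = rng.randint(0, 4)  # hypothetical aromaticity descriptor
    ppb = 30 + 10 * log_p + 5 * aromatic_rings + rng.gauss(0, 8)
    return [log_p, float(aromatic_rings)], min(100.0, max(0.0, ppb))

compounds = [make_compound() for _ in range(120)]
rng.shuffle(compounds)
train, val = compounds[:100], compounds[100:]  # about 5:1, as in the abstract
X_train, y_train = [c[0] for c in train], [c[1] for c in train]
X_val, y_val = [c[0] for c in val], [c[1] for c in val]

model = fit_boosted(X_train, y_train)
train_pred = [predict(model, row) for row in X_train]
val_pred = [predict(model, row) for row in X_val]

train_mae = mean_absolute_error(y_train, train_pred)
val_mae = mean_absolute_error(y_val, val_pred)
val_r = pearson_r(y_val, val_pred)
print(f"train MAE={train_mae:.2f}  validation MAE={val_mae:.2f}  validation r={val_r:.2f}")
```

Reporting the error on held-out compounds next to the training error, as the abstract does, shows how much of the fit carries over to compounds the model has not seen. The numbers printed by this sketch describe the synthetic data only and are not comparable to the reported values of 13.25 and 14.96.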
Keywords:
ADME; Albumin Binding; Distribution; Protein Binding; QSAR; Serum Proteins