Intense bitterness of molecules: Machine learning for expediting drug discovery.

Eitan Margulis¹, Ayana Dagan-Wiener¹, Robert S Ives², Sara Jaffari³, Karsten Siems⁴, Masha Y Niv¹.

Abstract

Drug development is a long, expensive and multistage process geared to achieving safe drugs with high efficacy. A crucial prerequisite for completing the medication regimen for oral drugs, particularly for pediatric and geriatric populations, is achieving taste that does not hinder compliance. Currently, the aversive taste of drugs is tested in late stages of clinical trials. This can result in the need to reformulate, potentially resulting in the use of more animals for additional toxicity trials, increased financial costs and a delay in release to the market. Here we present BitterIntense, a machine learning tool that classifies molecules into "very bitter" or "not very bitter", based on their chemical structure. The model, trained on chemically diverse compounds, has above 80% accuracy on several test sets. Our results suggest that about 25% of drugs are predicted to be very bitter, with even higher prevalence (~40%) in COVID19 drug candidates and in microbial natural products. Only ~10% of toxic molecules are predicted to be intensely bitter, and it is also suggested that intense bitterness does not correlate with hepatotoxicity of drugs. However, very bitter compounds may be more cardiotoxic than not very bitter compounds, possessing significantly lower QPlogHERG values. BitterIntense allows quick and easy prediction of strong bitterness of compounds of interest for food, pharma and biotechnology industries. We estimate that implementation of BitterIntense or similar tools early in drug discovery process may lead to reduction in delays, in animal use and in overall financial burden.

Keywords: Bitter; Drug discovery; Drugs; Machine learning; Taste; Toxicity

Year: 2020 PMID: 33510862 PMCID: PMC7807207 DOI: 10.1016/j.csbj.2020.12.030

Source DB: PubMed Journal: Comput Struct Biotechnol J ISSN: 2001-0370 Impact factor: 7.271

Introduction

The use of sophisticated and highly automated processes in drug discovery has resulted in expedited pipelines and the approval of more than 4,000 medicines as of April 2020 [1]. Yet, a key aspect related to regimen compliance has not been properly addressed. The oral route remains the main way for drug administration [2], with aversive taste of drugs causing swallowing difficulties and compliance problems. This is especially relevant with pediatric medicine, for which encapsulation does not always provide a solution [3], [4], [5]. Even though some bitter-masking agents exist, they are often insufficient for masking or preventing a drug’s intensely bitter taste [4], [5], [6]. Indeed, more than 90% of pediatricians report that a drug’s aversive taste and low palatability are the biggest barriers to completing the medication regimen, leaving children with limited access to “child-friendly” drugs and exposing them to possible harm and insufficient treatment [4], [7]. Because of the potential risks caused by aversive taste of drugs, the Food and Drug Administration (FDA) has added taste events to their Adverse Event Reporting System [8] and expects that all medicines with the potential to be given to children should be assessed for palatability [9], [10]. The problem is also acute for the older population, and the European Medicines Agency (EMA) reflection paper on the development of medicines for geriatrics lists taste as a key consideration for medicine development [11]. Sour or metallic taste can also elicit aversion, but intense bitterness is particularly abundant among drugs, and many of them were shown to activate bitter taste receptors [12], [13]. The challenge imposed by aversive taste in drug development has led to the establishment of several assays for bitterness measurement, including the rat brief access taste aversion (BATA) [14], electronic tongues and human sensory panels [15]. 
The current drug development pipeline includes four main stages: drug design, the preclinical phase (animal testing), clinical phases in humans, and final review and approval by medicines regulators, including the FDA and the EMA [16], [17]. During the design and preclinical phases, the taste of the drug is usually disregarded. It is evaluated, if at all, only during the clinical phases when the drug is introduced to humans. As a result, in clinical trials with nauseous drugs (usually due to intense bitterness), reduced compliance with the treatment and increased dropout rates have been documented in several cases [18], [19]. In addition, knowledge of the aversive taste of drug candidates enables the selection of a similarly aversive placebo in order to avoid unblinding of the trials [20], [21]. Often, it is only once intense bitter taste is suspected to affect the clinical trial and cause compliance problems that the palatability of the drug is tested, usually in human taste panels [15]. This may lead to reformulation of the drug and to repetition of the preclinical and clinical phases. Such detours may delay a potential medicine from getting to the patient by an additional 6 to 24 months, potentially using additional animals and increasing financial costs. For a moderately successful medicine, this could mean a reduction in income estimated at hundreds of millions of dollars.
Bitter taste can be elicited by structurally diverse compounds. Over 1000 bitter-tasting compounds are currently documented in BitterDB. Salts, peptides, fatty acids, polyphenols, alkaloids, terpenoids and compounds from additional chemical families contribute to the wide chemical space of bitter tastants [22], [23]. Bitter compounds also vary in their perceived intensity: quinine and amarogentin were reported as extremely bitter to humans, recognizable at micromolar concentrations. 
Other molecules, such as caffeine and naringin elicit slightly bitter taste and are typically recognizable at millimolar concentrations [23]. Notably, we showed that bitter compounds are not more toxic than non-bitter compounds [24], questioning the common paradigm that posited the evolutionary role of bitter taste as a marker for toxicity [25], [26]. Bitter compounds are recognized by G-protein coupled receptors subfamily of bitter taste receptors, called T2Rs [27], [28], that harbor 25 functional T2R subtypes in human [29]. While some receptors are broadly tuned with hundreds of diverse ligands, others are very selective with 0–3 known agonists [30], [31]. T2Rs are not only expressed in the oral cavity but also in many extraoral tissues, possessing different physiological roles besides chemosensation of bitter tastants [32], [33]. For example, activation of T2Rs expressed in human airway smooth muscle with inhaled bitter tastants was shown to mediate relaxation of the muscles and decreased airway obstruction in mouse models of asthma [34]. T2Rs expressed in thyrocytes were shown to regulate the production of thyroid hormones and influence the function of the thyroid gland [35]. Surprisingly, a human cohort suggested that polymorphism in T2R42 gene, an orphan bitter taste receptor, is associated with lower thyroid hormone levels [35]. The fact that T2Rs can be expressed in extraoral tissues and that orphan T2Rs can be associated with physiological phenomena may suggest the involvement of T2Rs in health and disease [32] as well as a potential off-target for drugs [36]. We have previously developed BitterPredict [37], which classifies molecules into bitter or non-bitter with over 80% accuracy. Several other machine learning predictors followed suit [38], [39]. In addition, structure-based methods were developed for identification of new agonists for specific bitter taste receptors [12], [40], [41]. 
Here we are interested specifically in finding intensely or extremely bitter molecules, since these are the ones that are likely to cause compliance problems. This will allow project teams to address very bitter compounds in the early stages of development and either focus on bitterness masking for the flagged molecules or deprioritize them for oral administration. To establish a machine learning algorithm for intense bitterness, data had to be gathered: we curated data from BitterDB and from a public repository of natural compounds by AnalytiCon Discovery [42], and measured the bitterness intensity of several new compounds using the BATA assay. Next, we trained a new machine learning classifier, “BitterIntense”, which assigns compounds as “very bitter” (VB) or “not very bitter” (NVB) based on descriptors calculated from their chemical structure. BitterIntense was then used to assess the prevalence of VB compounds in datasets of interest, in order to elucidate additional attributes of VB compounds in relation to possible toxic and therapeutic effects.

Materials and Methods

BATA assay. The rat brief access taste aversion (BATA) model has been demonstrated as a highly translatable tool for screening bitter compounds [14], [43]. Comparison of BATA and human gustatory trials at GlaxoSmithKline (GSK) suggests an average offset of 0.5 log concentration (3-fold), with rats typically slightly more tolerant to bitter taste than humans [43]. We chose an IC50 of 3 mM in the rat model as the threshold for classifying a compound as “very bitter” or “not very bitter”, and a recognition threshold of 0.1 mM for human data. This accommodates both the difference between species and the difference between measures (IC50 is the concentration that inhibits 50% of consumption, while the recognition threshold is the lowest concentration at which recognition occurs). Studies at GSK were performed using Davis Rig MS-160 lickometers (DiLog Instruments, Tallahassee, USA) as described by Soto et al. [14], with the following exceptions: 12 male Sprague Dawley Crl:CD (SD) rats (number determined following power analysis of historic GSK data), 6 to 7 weeks of age on arrival, supplied by Charles River UK (Marston, UK), were used per study. Rats were housed in groups of either two or four and kept on a 12 h light:dark cycle at 19–21 °C and 45–55% humidity. 5LF2 rodent diet (LabDiet, Missouri, USA) was fed ad libitum. Animal grade drinking water (AGW) was reverse osmosis filtered, UV treated and provided ad libitum between water restriction periods. All testing occurred during the light period. Rats were water restricted for 21 h prior to each test session to ensure sufficient thirst for the rats to attempt to lick all solutions presented. Each test session was limited to a maximum of 30 min, following which rats were returned to their home cages and given free access to AGW for a minimum of 2.5 h before commencement of the next water restriction period. Following completion of each study, rats were health assessed by a named veterinary surgeon and returned to non-naïve stock. 
A minimum one-week washout period was provided prior to use of rats in subsequent BATA studies. All compounds presented during BATA studies were fully solubilised across a range of concentrations with 0.5 log dose separation to generate concentration-response curves. Lick counts of zero and one were excluded from data sets because they were deemed an insufficient attempt to lick a solution and therefore not related to palatability. Lick responses were modelled using a three-parameter logistic function with the minimum constrained to zero (R v.3.5.1, Foundation for Statistical Computing, Vienna, Austria). Lick response was expressed as a percentage of the median AGW response within each study, with its 95% confidence interval. The concentration of active pharmaceutical ingredient (API) that elicited lick rates equivalent to 50% of the median AGW response was then calculated and deemed the IC50. All animal work conforms to the UK Animals (Scientific Procedures) Act, European Directive 2010/63/EU and the GSK Policy on the Care, Welfare and Treatment of Animals. All protocols were approved as part of the GSK Scientific and Ethical Review Forum and carried out in accordance with the appropriate project licence issued by the UK Home Office. For all studies performed at GSK, commercially available compounds were sourced from Sigma Aldrich (Gillingham, UK). GSK compounds were supplied by the internal dispensary.
BitterDB and Analyticon data. Data on bitterness intensity and chemical structures for the training and testing of the model were also obtained from BitterDB [23] and Analyticon’s repository on the Kaggle website [42]. There are about 180 compounds with known sensory thresholds (out of 217 that have any kind of sensory data, provided in Supplementary Material). Compounds that were verbally described as intensely bitter in most cases possessed bitter recognition thresholds below 0.1 mM, whereas compounds that were described as slightly bitter possessed thresholds higher than 0.1 mM. 
Based on this, all compounds with a bitter recognition threshold below 0.1 mM were classified as “very bitter”, and compounds with a bitter recognition threshold above 0.1 mM were classified as “not very bitter”. For compounds with unknown bitter recognition thresholds, we considered the taste descriptions. For example, we classified compounds as very bitter if they were described as “intensely/extremely bitter”, “more bitter than quinine”, etc. On the other hand, compounds that were described as “slightly/weak bitter” were classified as not very bitter. The addition of the non-bitter subset was shown to be beneficial in preliminary testing of the classifier (not shown), increasing the ability to distinguish between very bitter and not very bitter compounds.
Chemical families analysis. The chemical families of the compounds in the training set were extracted using the ClassyFire webserver [44].
Datasets preparation. After obtaining the SMILES strings of the compounds, we uploaded the compounds to Maestro (Schrödinger Release 2017–2: MS Jaguar, Schrödinger, LLC, New York, NY, 2017). We generated 3D structures using LigPrep and Epik (Schrödinger Release 2017–2: LigPrep, Epik, LLC, New York, NY, 2018) at pH 7.0 ± 0.5. All compounds were desalted where applicable, retaining the bigger ion and excluding the smaller counter ion. In general, if structures had additional molecules, these were removed. We retained the original chirality of compounds when specified; otherwise we generated an additional stereoisomer per ligand. For each compound, the conformer with the lowest energy was extracted and used. When 2 stereoisomers were generated for one compound, we kept both structures. Compounds that could not be neutralized were excluded from the sets due to the limitations of calculating QikProp descriptors. All the datasets in the current study were prepared using the protocol described above.
Descriptors calculation. 
Three sets of descriptors were calculated for the prepared 3D structures using Canvas (Schrödinger Release 2017–4: Canvas, Schrödinger, LLC, New York, NY, 2017): physicochemical descriptors, Ligfilter descriptors (moieties, atoms and functional groups) and QikProp descriptors (ADME descriptors). For the QikProp descriptors, additional PM3 properties were calculated as well (Schrödinger Release 2017–4: QikProp, Schrödinger, LLC, New York, NY, 2017). Compounds for which the calculation of any descriptor failed were excluded from the analysis.
Dataset split for training, test and hold-out sets. All the collected compounds were divided randomly into training, test and hold-out sets using functions from Python’s scikit-learn package. The training set consisted of 493 compounds: 169 very bitter compounds and 324 not very bitter compounds. The test set consisted of 123 compounds (43 very bitter and 80 not very bitter) and was used for testing the performance of the model during training. The hold-out set consisted of 105 compounds: 31 very bitter compounds and 74 not very bitter compounds. We established this set for the final evaluation of the model, as an external set of compounds that the model had never encountered before. Distributions of selected descriptors (MW, AlogP, ring count and QPlogHERG) in the training, test and hold-out sets are shown in Supplementary Fig. S3.
Model construction and fitting. The XGBoost model was constructed and fitted using Python 3.7.5 in the Spyder 3.7 environment. Relevant packages: scikit-learn (version 0.21.3), XGBoost (version 0.9), NumPy (version 1.17.4), pandas (version 0.25.3), matplotlib (version 3.1.1) and seaborn (version 0.9.0). An early stopping approach was used, in which the performance of the model is monitored during training and training is stopped once the performance ceases to improve (see Figs. S1 and S2). 
The evaluation metrics for the early stopping were logarithmic loss [45] and binary classification error rate (defined as #(wrong cases)/#(all cases)).
Feature selection. The most contributing features were selected according to their feature importance gain score, calculated using the XGBoost library in Python [46]. We constructed a loop that tested the changes in the accuracy of the model when setting different thresholds on the gain scores, in order to select the best features. With the threshold set to 0.004, we retained the 55 features (out of 235 original features) that contributed most to the model, obtaining 83% accuracy.
Parameter tuning. The parameters of the XGBoost algorithm were tuned using GridSearchCV from scikit-learn’s sklearn.model_selection module [47]. The number of cross-validation folds was set to 10 and the scoring method was set to the F1 score in order to improve both precision and recall. Parameter tuning was performed on the training set using the initial fitted model, suggesting that the optimal parameters are: colsample_bytree = 0.6, gamma = 0.5, max_depth = 5. In brief, colsample_bytree is the fraction of features that will be randomly sampled to construct each decision tree, gamma is the minimum loss reduction required to make a further partition on a leaf node of the decision tree, and max_depth is the maximum depth of a tree. In addition, the parameter that helps with unbalanced data (scale_pos_weight) was set to 1.9, the ratio between the not very bitter (negative) and very bitter (positive) observations in the training set (324/169).
Evaluation of the performance of the model. After calculating the number of true positives (TP), true negatives (TN), false positives (FP) and false negatives (FN), we evaluated the model using four metrics: accuracy, (TP + TN)/(TP + TN + FP + FN); precision, TP/(TP + FP); recall, TP/(TP + FN); and F1 score, the harmonic mean of precision and recall.
Toxicity data. The FocTox dataset consists of the FAO/WHO food contaminants list and a list of extremely hazardous substances defined in section 302 of the U.S. Emergency Planning and Community Right-to-Know Act. 
The CombiTox dataset is a combination of two datasets: the Toxin and Toxin‐Target Database version 2.0 (T3DB) and DSSTox, the Distributed Structure‐Searchable Toxicity Database. The datasets were taken from Nissim et al. [24].
Hepatotoxicity data. All hepatotoxicity descriptors were extracted from the FDA’s DILIrank dataset [48], which is an updated version of the LTKB (Liver Toxicity Knowledge Base) Benchmark dataset [49]. DILIrank consists of 1,036 FDA-approved drugs with known hepatotoxicity descriptors and liver toxicity risk assessments. The compounds in the dataset were prepared as explained in the “Datasets preparation” section.
External datasets. DrugBank (version 5.1.5), which consists of experimental and approved drugs [1], and the Natural Products Atlas (NPatlas, version 2019_08) [50] were downloaded from their official websites. Compounds were prepared according to the protocol in the “Datasets preparation” section. COVID19 drugs and their targets were retrieved from the “IUPHAR/BPS Guide to Pharmacology” [51]. After excluding the antibodies and compounds without chemical structure, we prepared the remaining 34 compounds according to the “Datasets preparation” section.
Data analysis and visualization. All the data in the current study were analyzed using the pandas library [52] and visualized with the Matplotlib library [53] in Python.
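
The labelling criteria and the class-imbalance weight described in this section can be illustrated with a short Python sketch. It is not the authors' code: the function names, the order in which the three types of evidence are consulted, and the treatment of a threshold exactly equal to 0.1 mM are assumptions made for illustration. Only the cut-offs (0.1 mM and 3 mM), the example phrases and the training-set counts (169 VB, 324 NVB) come from the text.

```python
from typing import Optional, Sequence

HUMAN_THRESHOLD_MM = 0.1  # human bitter recognition threshold cut-off (mM)
BATA_IC50_MM = 3.0        # rat BATA IC50 cut-off (mM)

# Example phrases quoted in the text; a real curation would need a longer list.
VB_PHRASES = ("intensely bitter", "extremely bitter", "more bitter than quinine")
NVB_PHRASES = ("slightly bitter", "weak bitter")


def label_compound(
    recognition_threshold_mm: Optional[float] = None,
    bata_ic50_mm: Optional[float] = None,
    description: Optional[str] = None,
) -> Optional[str]:
    """Return "VB", "NVB", or None when no usable evidence is given.

    Assumed priority: human threshold, then BATA IC50, then verbal description.
    """
    if recognition_threshold_mm is not None:
        # Assumption: a threshold of exactly 0.1 mM is treated as NVB.
        return "VB" if recognition_threshold_mm < HUMAN_THRESHOLD_MM else "NVB"
    if bata_ic50_mm is not None:
        # The text classifies IC50 <= 3 mM as VB and IC50 > 3 mM as NVB.
        return "VB" if bata_ic50_mm <= BATA_IC50_MM else "NVB"
    if description is not None:
        text = description.lower()
        if any(phrase in text for phrase in VB_PHRASES):
            return "VB"
        if any(phrase in text for phrase in NVB_PHRASES):
            return "NVB"
    return None


def negative_to_positive_ratio(labels: Sequence[str]) -> float:
    """Ratio of NVB (negative) to VB (positive) labels."""
    positives = sum(1 for lab in labels if lab == "VB")
    negatives = sum(1 for lab in labels if lab == "NVB")
    return negatives / positives


training_labels = ["VB"] * 169 + ["NVB"] * 324
print(round(negative_to_positive_ratio(training_labels), 1))  # 1.9
```

With the training-set counts, the ratio is 324/169 ≈ 1.9, which matches the value reported for scale_pos_weight.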

Results

Establishment of positive and negative sets. 34 compounds were obtained from behavioral studies using the rat brief access taste aversion assay (BATA; see Materials and Methods). Additional compounds were pulled from BitterDB [23] and the Analyticon repository of natural compounds on Kaggle [42]. The compounds were classified into 2 classes, “very bitter” (VB) and “not very bitter” (NVB), using the following criteria: compounds with a sensory bitter recognition threshold below 0.1 mM, or molecules with a taste description that states “extremely bitter”, “intensely bitter”, etc., were included in the VB class (246 compounds). Compounds with a bitter recognition threshold above 0.1 mM, or molecules with a taste description that includes “slightly bitter”, “weak bitter taste”, etc., were included in the NVB class (323 compounds). The BATA test measures aversion, which is assumed to be driven mainly by bitterness. The IC50 obtained for each compound in the BATA test was used to classify compounds using the following criteria: molecules with an IC50 below or equal to 3 mM were classified as VB, and molecules with an IC50 above 3 mM were classified as NVB. In addition, 152 non-bitter compounds from the negative set used previously in BitterPredict [37] were added to the NVB class, so that non-bitter compounds are also predicted correctly (namely, classified as NVB). For the practical purposes pursued here, we group non-bitter and not very bitter compounds into the same category. The non-bitter compounds that were included in the negative set were randomly selected and had a MW greater than 250 g/mol to match the MW of very bitter compounds.
Chemical families analysis. Bitter compounds are structurally diverse, and many chemical families can elicit bitter taste [22]. Drug-like molecules and phytonutrients, such as flavonoids, isoflavones, terpenes and glucosinolates, can elicit bitter taste with various intensities [54]. 
Here we analyze which chemical families are enriched with intensely bitter compounds. The chemical families for VB and NVB compounds in the training set are represented in Fig. 1. The VB class is enriched with triterpene saponins and triterpenes in general, while the NVB class is broadly represented by different chemical families. The results suggest that triterpene saponins are the most abundant VB compounds in our training set. Interestingly, this family is also found among the top chemical families of NVB molecules, illustrating that taste intensity cannot be simply deduced from the chemical family of the compound.

Fig. 1

Representation of chemical families in the training set. (A) Top chemical families represented in the dataset of very bitter compounds. (B) Top chemical families represented in the dataset of not very bitter compounds. The compound on the lower left side (Asperosaponin VI) is a representative of very bitter triterpene saponins; the compound on the lower right (Nitrosaccharin) is a representative of not very bitter benzothiazoles. Deriv. = derivatives.
Training the classifier. Physicochemical, Ligfilter and QikProp descriptors were calculated for the compounds in the dataset (see Materials and Methods). 15% of the compounds (105 compounds) were chosen randomly and left out of the training set as a hold-out test set for final evaluation of the model. The other 616 compounds were randomly divided into an 80% training set (493 compounds) and a 20% internal test set (123 compounds). The Extreme Gradient Boosting (XGBoost) algorithm was chosen for this classification task. XGBoost is a popular and powerful decision-tree-based ensemble method that uses optimized gradient boosting techniques and is known to perform well with small to medium size datasets [46]. Since the training set is rather small, we extracted the highest contributing features (see Materials and Methods) in order to avoid overfitting. 55 features were selected out of 235 for the model training. Further details are described in the Methods section.
BitterIntense performance. Evaluation of the model’s performance (Table 1) was carried out on three sets: training set (with cross-validation, k-fold = 10), test set, and hold-out set. BitterIntense achieved accuracy of 80% or higher across the different datasets. The precision, which indicates the ratio between the true positives and all the positive assignments (true positives and false positives), was on average 80% on the training set, 71% on the test set and 63% on the hold-out set. 
Recall, which indicates how well the model identified the true positives (the ratio between true positives and the sum of true positives and false negatives), was on average 85% on the training set, 86% on the test set and 77% on the hold-out set. In addition, we calculated the F1-score, another metric for the accuracy of the model that takes into account both the precision and the recall values (defined as the harmonic mean of precision and recall, see Methods section): the F1-score was 82 ± 5% for the training set, 78% for the test set and 70% for the hold-out set. In general, we observed higher recall values than precision values, suggesting that our model produces fewer false negatives than false positives. This result is in line with our goal to maximize identification of very bitter compounds.
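
The metrics discussed above follow directly from the four confusion-matrix counts, as the minimal Python sketch below shows. The function name is hypothetical, and the counts used in the example are not reported in the text: they were back-calculated as integers consistent with the hold-out class sizes (31 VB, 74 NVB) and the rounded hold-out percentages in Table 1, so they should be read as an assumption.

```python
def classification_metrics(tp: int, tn: int, fp: int, fn: int) -> dict:
    """Compute standard binary classification metrics from confusion counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)      # share of VB predictions that are correct
    recall = tp / (tp + fn)         # share of true VB compounds that are found
    specificity = tn / (tn + fp)    # share of true NVB compounds that are found
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": f1,
    }


# Back-calculated (assumed) hold-out counts: 31 VB = 24 TP + 7 FN,
# 74 NVB = 60 TN + 14 FP.
holdout = classification_metrics(tp=24, tn=60, fp=14, fn=7)
print({name: round(value * 100) for name, value in holdout.items()})
# {'accuracy': 80, 'precision': 63, 'recall': 77, 'specificity': 81, 'f1': 70}
```

Because false positives (14) outnumber false negatives (7) in this reconstruction, recall exceeds precision, which is the pattern described in the text.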

Table 1

	Training set	Test set	Hold-out set
	(493 compounds)	(123 compounds)	(105 compounds)
Accuracy (%)	87 ± 5	83	80
Precision (%)	80 ± 8	71	63
Recall (%)	85 ± 4	86	77
F1 score (%)	82 ± 5	78	70
Specificity (%)		81	81

BitterIntense performance on the training, test and hold-out sets. Training set evaluation was done using k-fold cross validation with k = 10. The results in the training set column represent the mean metric with its standard deviation across the 10 iterations of cross validation.
Important features. Feature importance was measured in XGBoost by the “gain” method, which is the average gain of the splits that use the feature in the prediction process. A higher gain value implies greater importance of the feature. The top 15% of features are represented in Fig. 2 and suggest that the molecule’s size and polarizability are the most influential factors for bitterness level. Fitting the model using the heavy atom count feature only results in 70% accuracy on the training set, with recall and precision of 70% and 58%, respectively. This result emphasizes the importance of molecular size and suggests a correlation between the size of the molecule and the bitterness intensity.
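
The gain-based selection loop described in Materials and Methods could look roughly like the Python sketch below. This is an illustration under stated assumptions, not the authors' implementation: the gain values are invented, the function names are hypothetical, features are kept when their gain is greater than or equal to the threshold (the text does not say whether the comparison is strict), and `toy_accuracy` is a stand-in for retraining and scoring the model on each feature subset.

```python
from typing import Callable, Dict, List, Optional, Sequence, Tuple


def select_by_gain(gains: Dict[str, float], threshold: float) -> List[str]:
    """Return feature names with gain >= threshold, highest gain first."""
    ranked = sorted(gains.items(), key=lambda item: item[1], reverse=True)
    return [name for name, gain in ranked if gain >= threshold]


def sweep_thresholds(
    gains: Dict[str, float],
    thresholds: Sequence[float],
    evaluate: Callable[[List[str]], float],
) -> Optional[Tuple[float, float, List[str]]]:
    """Try each threshold and keep the feature subset with the best score."""
    best = None
    for threshold in thresholds:
        features = select_by_gain(gains, threshold)
        if not features:
            continue  # nothing left to train on at this threshold
        score = evaluate(features)
        if best is None or score > best[1]:
            best = (threshold, score, features)
    return best


# Invented gain scores for illustration only.
example_gains = {
    "heavy_atom_count": 0.120,
    "molar_refractivity": 0.080,
    "metab": 0.030,
    "PISA": 0.010,
    "rare_group_a": 0.003,
    "rare_group_b": 0.001,
}


def toy_accuracy(features: List[str]) -> float:
    """Stand-in for model accuracy: rewards up to 4 features, penalises extras."""
    return 0.70 + 0.02 * min(len(features), 4) - 0.01 * max(len(features) - 4, 0)


best_threshold, best_score, best_features = sweep_thresholds(
    example_gains, [0.001, 0.004, 0.02, 0.05], toy_accuracy
)
print(best_threshold, best_features)
```

In a real workflow, `evaluate` would refit the classifier on the selected columns and return a validation accuracy, so the chosen threshold reflects model performance rather than a fixed rule.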

Fig. 2

Top 15% important features in the model. The importance is calculated as the average gain of the splits that use the feature in the prediction process in each tree in the XGBoost model. The features shown are the heavy atom count, molar refractivity, number of likely metabolic reactions (metab), number of tertiary amines and amide groups, π (carbon and attached hydrogen) component of the solvent accessible surface area (PISA), hydrophobicity (AlogP), hydrophobic component of the solvent accessible surface area (FOSA) and predicted IC50 value for blockage of hERG channels (QPlogHERG).
Relationship between bitterness intensity and toxicity. Though bitter taste is often regarded as a marker for toxicity that guards against consuming poisons [55], our previous analysis showed that bitter compounds are not necessarily toxic and vice versa [24]. We explored whether similar conclusions hold for VB compounds by comparing the toxicities of VB and NVB compounds. BitterIntense was applied to toxic compounds from the FocTox and CombiTox datasets, collated in previous work [24]. The FocTox dataset consists of FAO/WHO food contaminants and extremely hazardous substances. Out of 289 compounds, only 25 were predicted to be very bitter (pVB, 8.6%). CombiTox, a manually curated dataset of toxic compounds [24], holds ~134,000 compounds, out of which 12% were pVB, suggesting that toxic substances are not necessarily very bitter and that, in fact, most of the toxicants are predicted to be NVB. Thus it appears that only ~10% of toxic compounds are predicted to be intensely bitter.
We further checked the possible connection between the level of bitterness and liver toxicity (hepatotoxicity) and cardiac toxicity. Hepatotoxicity is the most common cause for the discontinuation of clinical trials on a drug, and the most common reason for an approved drug’s withdrawal from the marketplace [48], [56]. 
Drug-induced liver injury (DILI) has been listed as the leading cause of acute liver failure in the USA in 2002 [57], and DILI has become an important concern in the drug discovery process. A possible connection between the level of bitterness and hepatotoxicity could suggest that the level of bitterness is a potential marker for such toxicity. We predicted the level of bitterness of drugs with known hepatotoxicity descriptors, taken from DILIrank dataset [48]. Most of the drugs in the dataset (729 compounds) were predicted as NVB (pNVB), and only 258 were predicted as VB. The pVB drugs do not appear to be more hepatotoxic than the pNVB drugs: the most hepatotoxic class (class number 8, Fig. 3A) as well as the “Most DILI concern” category (Fig. 3B) are actually enriched with pNVB drugs.

Fig. 3

Bitterness levels of drugs and their hepatotoxicity descriptors. (A) Distribution of pVB (silver) and pNVB (black) drugs across severity classes of hepatotoxicity. (B) Distribution of pVB and pNVB drugs across DILI concern categories. The severity of hepatotoxicity increases from left to right in both panels.
Cardiac toxicity is a possible side effect of many anti-cancer drugs [58] and other drugs, causing heart problems as well as sudden cardiac death [59]. For this reason, the FDA has required almost all new drugs to be assessed for cardiac toxicity, which has led to discontinuation of clinical trials and even withdrawal of drugs from the market [60]. One of the mechanisms of cardiotoxicity is blockage of voltage-gated potassium channels called hERG channels [61]. QPlogHERG is a common descriptor for the predicted IC50 value for blockage of hERG K+ channels, which indicates cardiac toxicity [62]; values below −5 are of concern [63]. As QPlogHERG appeared in the top features list of our model (Fig. 2), we checked whether very bitter compounds tend to be more toxic for the heart. QPlogHERG values had already been calculated for our training and test sets as part of the descriptors calculation process (see Methods section), and we used these values to compare VB and NVB compounds. Our results (Fig. 4) suggest that VB compounds are more cardiotoxic than NVB compounds: the median value for VB compounds is −5.2 (considered cardiotoxic), while the median for NVB compounds is −4.

Fig. 4

QPlogHERG values of VB (grey) and NVB (white) compounds. A statistically significant difference was observed using the Mann-Whitney test, n = 721, P-value < 0.0001.
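
For readers unfamiliar with the Mann-Whitney test named in the caption, the following standard-library Python sketch shows how the U statistic and an approximate two-sided P-value can be obtained. The function name and the six QPlogHERG values are invented for illustration, and the P-value uses a plain normal approximation without tie or continuity correction, which is an assumption; the software and settings actually used for Fig. 4 are not stated in the text.

```python
import math
from typing import Sequence, Tuple


def mann_whitney_u(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Return (U statistic of x, two-sided p-value by normal approximation)."""
    # Pool both samples, remembering the group of each value (0 = x, 1 = y).
    pooled = sorted((value, group) for group, sample in enumerate((x, y)) for value in sample)
    rank_sum_x = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        # Tied values share the mean of ranks i+1 .. j.
        average_rank = (i + 1 + j) / 2
        rank_sum_x += average_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    n1, n2 = len(x), len(y)
    u_x = rank_sum_x - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u_x - mean_u) / sd_u
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return u_x, p_value


# Invented QPlogHERG values; more negative means stronger predicted hERG blockage.
vb_values = [-5.6, -5.2, -5.0]
nvb_values = [-4.4, -4.0, -3.1]
u_stat, p_val = mann_whitney_u(vb_values, nvb_values)
print(u_stat)  # 0.0, because every VB value is lower than every NVB value
```

The normal approximation is only reasonable for larger samples, such as the 721 compounds compared in the figure; for very small samples an exact test would be preferable.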

Very bitter drugs and their potential therapeutic effects. Since VB drugs are more likely to cause compliance problems, we applied BitterIntense to all compounds (approved and experimental drugs) from DrugBank (version 5.1.5) to evaluate the abundance of VB drugs (Fig. 5). Out of 10,170 compounds that could be processed by our predictor, 23.6% were pVB. Specifically, 18% of experimental drugs and 26% of approved drugs are pVB. For comparison, in microbial natural products (NPatlas, version 2019_08, n = 24,805) [50], 47.7% were pVB. Thus only about a quarter of drugs and drug candidates, but about half of the microbial natural products, are predicted to be VB.

Fig. 5

Prevalence of pVB compounds (silver) and pNVB compounds (white) across 3 datasets: NPatlas, DrugBank and COVID19 drugs. A statistically significant difference in the proportions of pVB compounds was observed using the two-proportion Z-test.
Are there specific therapeutic indications or targets that are enriched with pVB drugs? We focused here on a highly relevant disease, COVID19. The COVID19 pandemic is spreading throughout the world, with millions of confirmed cases and hundreds of thousands of deaths, according to reports by the World Health Organization published in June 2020 [64]. Some drugs have been suggested for treating COVID19 patients, but no drug has yet been fully and officially approved by the FDA. Several drugs are currently under study and in clinical trials. For example, remdesivir, an adenosine analog that was previously tested as a potential drug for Ebola and as an anti-viral drug [65], showed promising results in COVID19 patients and was thus granted an FDA Emergency Use Authorization on 1 May 2020 [66]. Interestingly, taste and smell loss [67], including impairment of bitter taste, are reported by many COVID19 patients [68]. We applied BitterIntense to possible COVID19 drug candidates and compared the result to the general abundance of pVB drugs in DrugBank. A list of ligands related to COVID19 was retrieved from the “Coronavirus Information – IUPHAR/BPS Guide to Pharmacology” [51]. After excluding antibodies and compounds without chemical structure, 34 drug candidates remained, of which 41.2% were pVB (Fig. 5). The proportion of pVB drugs among potential COVID19 drugs is thus significantly higher than in DrugBank (23.6%; P-value = 0.016), suggesting that VB drugs may be more abundant in the COVID19-related list than among drugs in general. No evident difference was found between the main targets of pVB and pNVB COVID19-related drug candidates.
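
The comparison between the COVID19-related list and DrugBank can be checked with a pooled two-proportion Z-test, sketched below in standard-library Python. The function name is hypothetical, and the integer counts (14 of 34 and 2,400 of 10,170) are back-calculated from the reported percentages rather than taken from the text; whether the authors used the pooled form of the test is also an assumption.

```python
import math
from typing import Tuple


def two_proportion_z_test(x1: int, n1: int, x2: int, n2: int) -> Tuple[float, float]:
    """Pooled two-proportion z-test; returns (z, two-sided p-value)."""
    p1 = x1 / n1
    p2 = x2 / n2
    pooled = (x1 + x2) / (n1 + n2)  # common proportion under the null hypothesis
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / standard_error
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return z, p_value


# Assumed counts: 14/34 = 41.2% pVB (COVID19 list), 2400/10170 = 23.6% pVB (DrugBank).
z_score, p_value = two_proportion_z_test(14, 34, 2400, 10170)
print(f"z = {z_score:.2f}, p = {p_value:.3f}")  # z = 2.41, p = 0.016
```

Under these assumptions the sketch reproduces the reported P-value of 0.016, although the small size of the COVID19 list (34 compounds) means the normal approximation should be interpreted with some caution.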

Discussion and conclusions

BitterIntense was developed to easily classify compounds as Very bitter (VB) or Not very bitter (NVB) and has achieved above 80% accuracy on several test sets. The model was trained with chemically diverse compounds. The most important features in the classification of VB vs NVB compounds are the heavy atom count and the molar refractivity (a measure of polarizability). Training the model with heavy atom count as the single feature resulted in a model with 70% accuracy, with recall and precision of 70% and 58%, respectively. This means that VB molecules are often larger than NVB molecules, but additional features greatly improve the predictions, reaching an average accuracy of 87%, recall of 85% and precision of 80% on the training set. One possible explanation for the correlation between the size of the molecule and the bitterness intensity is that bigger molecules can interact with more residues in the binding site of the receptor, resulting in strong binding and slow dissociation. A correlation between ligand size and long residence time in the binding pockets of receptors, including GPCRs, has been shown [69], [70]. In addition, a larger number of heavy atoms was shown to correlate with selectivity of bitter molecules towards T2Rs (namely, activating a small number of T2R subtypes) [30], emphasising the importance of this descriptor for bitterness perception.

In addition, BitterIntense was used to analyse the connection between toxicity and the level of bitterness of molecules. Our results suggest that compounds that are considered acutely toxic or hepatotoxic to humans are not necessarily predicted to be very bitter. There are several possible explanations for this trend: some bitter compounds that activate T2Rs were also shown to interact with Cytochrome P450 enzymes (CYP) [71], promiscuous monooxygenases involved in the metabolism of drugs and xenobiotics in the human body [72].
Perhaps VB and pVB drugs interact differently, or even with stronger affinity, with CYP enzymes, possibly explaining the differences in the hepatotoxic effect. This hypothesis could be tested in future work. Furthermore, activation of extraoral T2Rs might contribute to the decreased risk of hepatotoxic effects of the very bitter drugs. Activation of gut T2Rs was shown to lead to detoxification by upregulating the transcription of xenobiotic efflux pumps [73]. However, our results reveal that VB compounds have significantly lower QPlogHERG values than NVB compounds, suggesting that VB compounds could be more cardiotoxic than NVB compounds by blocking hERG channels, in line with previous work that found hERG channels among the off-targets of bitter compounds [12]. Interestingly, T2Rs are also expressed in the heart, which raises the possibility that bitter compounds may both activate cardiac T2Rs [74] and block hERG channels [12]. In summary, our analysis shows that VB and pVB drugs tend to be less harmful with respect to acute toxicity and hepatotoxicity, but have more potential to be cardiotoxic than NVB compounds. The possible underlying mechanisms require further study, while VB and pVB drug candidates should be kept in the drug discovery process and flagged for potential aversiveness (but not necessarily toxicity).

Our results suggest that (experimental and approved) drugs tend to be less enriched with pVB compounds than a random dataset of microbial natural products. Furthermore, drugs suggested for potential repurposing for COVID-19 targets tend to include more pVB compounds than the general population of drugs. This raises the question of a potential beneficial effect of bitterness, or of involvement of bitter taste receptors in the disease, which is also interesting in view of taste loss being one of the symptoms of the disease [68].
However, as it appears that many COVID-19 patients temporarily lose taste sensitivity [68], the aversive taste of orally administered drugs may actually be less problematic for administration in this case. While the potential involvement or mediation of bitter taste receptors in COVID19 is unclear, the results further highlight the importance of flagging pVB drug candidates, but not excluding them from the pipeline.

BitterIntense is a quick and easy method that enables flagging of intensely bitter compounds during early stages of drug development, and can thus potentially reduce financial costs, animal testing and the lead time to patients. The ability to detect pVB drugs in early stages will not only accelerate the drug development process but will also promote the development of more palatable drugs suitable for children and geriatric patients. In addition, BitterIntense could also be of interest to biotech and foodtech companies that are working on bioactive molecules, artificial sweeteners or natural plant products intended for integration into food products. Thus, our proposal is to integrate computational taste prediction alongside other computational tools in the discovery and development of new bioactive compounds.
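For readers less familiar with the metrics quoted in this discussion, the sketch below shows how accuracy, recall and precision are derived from a confusion matrix, assuming VB is treated as the positive class. The counts are hypothetical, chosen only so that the output matches the reported single-feature figures (70% accuracy, 70% recall, 58% precision); they are not the confusion matrix of the study.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return accuracy, recall and precision for a binary classifier."""
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,  # share of all compounds classified correctly
        "recall": tp / (tp + fn),       # share of true VB compounds that were found
        "precision": tp / (tp + fp),    # share of predicted VB compounds that are VB
    }


# Hypothetical counts for illustration only (not taken from the paper)
metrics = classification_metrics(tp=70, fp=51, fn=30, tn=119)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

A precision well below recall, as in the single-feature model, means that a size-only rule flags many NVB compounds as VB, which is consistent with the statement that additional features greatly improve the predictions.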

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

59 in total

Review 1. Orally fast disintegrating tablets: developments, technologies, taste-masking and clinical studies.

Authors: Yourong Fu; Shicheng Yang; Seong Hoon Jeong; Susumu Kimura; Kinam Park
Journal: Crit Rev Ther Drug Carrier Syst Date: 2004 Impact factor: 4.889

2. Extraoral bitter taste receptors as mediators of off-target drug effects.

Authors: Adam A Clark; Stephen B Liggett; Steven D Munger
Journal: FASEB J Date: 2012-09-10 Impact factor: 5.191

3. Assessment of bitter taste of pharmaceuticals with multisensor system employing 3 way PLS regression.

Authors: Alisa Rudnitskaya; Dmitry Kirsanov; Yulia Blinova; Evgeny Legin; Boris Seleznev; David Clapham; Robert S Ives; Kenneth A Saunders; Andrey Legin
Journal: Anal Chim Acta Date: 2013-02-15 Impact factor: 6.558

Review 4. Bitter and sweet tasting molecules: It's complicated.

Authors: Antonella Di Pizio; Yaron Ben Shoshan-Galeczki; John E Hayes; Masha Y Niv
Journal: Neurosci Lett Date: 2018-04-19 Impact factor: 3.046

5. Bitter or not? BitterPredict, a tool for predicting taste from chemical structure.

Authors: Ayana Dagan-Wiener; Ido Nissim; Natalie Ben Abu; Gigliola Borgonovo; Angela Bassoli; Masha Y Niv
Journal: Sci Rep Date: 2017-09-21 Impact factor: 4.379

6. ClassyFire: automated chemical classification with a comprehensive, computable taxonomy.

Authors: Yannick Djoumbou Feunang; Roman Eisner; Craig Knox; Leonid Chepelev; Janna Hastings; Gareth Owen; Eoin Fahy; Christoph Steinbeck; Shankar Subramanian; Evan Bolton; Russell Greiner; David S Wishart
Journal: J Cheminform Date: 2016-11-04 Impact factor: 5.514

Review 7. The matching quality of experimental and control interventions in blinded pharmacological randomised clinical trials: a methodological systematic review.

Authors: Segun Bello; Maoling Wei; Jørgen Hilden; Asbjørn Hróbjartsson
Journal: BMC Med Res Methodol Date: 2016-02-13 Impact factor: 4.615

8. More Than Smell-COVID-19 Is Associated With Severe Impairment of Smell, Taste, and Chemesthesis.

Authors: Valentina Parma; Kathrin Ohla; Maria G Veldhuizen; Masha Y Niv; Christine E Kelly; Alyssa J Bakke; Keiland W Cooper; Cédric Bouysset; Nicola Pirastu; Michele Dibattista; Rishemjit Kaur; Marco Tullio Liuzza; Marta Y Pepino; Veronika Schöpf; Veronica Pereda-Loth; Shannon B Olsson; Richard C Gerkin; Paloma Rohlfs Domínguez; Javier Albayay; Michael C Farruggia; Surabhi Bhutani; Alexander W Fjaeldstad; Ritesh Kumar; Anna Menini; Moustafa Bensafi; Mari Sandell; Iordanis Konstantinidis; Antonella Di Pizio; Federica Genovese; Lina Öztürk; Thierry Thomas-Danguin; Johannes Frasnelli; Sanne Boesveldt; Özlem Saatci; Luis R Saraiva; Cailu Lin; Jérôme Golebiowski; Liang-Dar Hwang; Mehmet Hakan Ozdener; Maria Dolors Guàrdia; Christophe Laudamiel; Marina Ritchie; Jan Havlícek; Denis Pierron; Eugeni Roura; Marta Navarro; Alissa A Nolden; Juyun Lim; Katherine L Whitcroft; Lauren R Colquitt; Camille Ferdenzi; Evelyn V Brindha; Aytug Altundag; Alberto Macchi; Alexia Nunez-Parra; Zara M Patel; Sébastien Fiorucci; Carl M Philpott; Barry C Smith; Johan N Lundström; Carla Mucignat; Jane K Parker; Mirjam van den Brink; Michael Schmuker; Florian Ph S Fischmeister; Thomas Heinbockel; Vonnie D C Shields; Farhoud Faraji; Enrique Santamaría; William E A Fredborg; Gabriella Morini; Jonas K Olofsson; Maryam Jalessi; Noam Karni; Anna D'Errico; Rafieh Alizadeh; Robert Pellegrino; Pablo Meyer; Caroline Huart; Ben Chen; Graciela M Soler; Mohammed K Alwashahi; Antje Welge-Lüssen; Jessica Freiherr; Jasper H B de Groot; Hadar Klein; Masako Okamoto; Preet Bano Singh; Julien W Hsieh; Danielle R Reed; Thomas Hummel; Steven D Munger; John E Hayes
Journal: Chem Senses Date: 2020-10-09 Impact factor: 3.160

9. Dual binding mode of "bitter sugars" to their human bitter taste receptor target.

Authors: Fabrizio Fierro; Alejandro Giorgetti; Paolo Carloni; Wolfgang Meyerhof; Mercedes Alfonso-Prieto
Journal: Sci Rep Date: 2019-06-11 Impact factor: 4.379

10. The Natural Products Atlas: An Open Access Knowledge Base for Microbial Natural Products Discovery.

Authors: Jeffrey A van Santen; Grégoire Jacob; Amrit Leen Singh; Victor Aniebok; Marcy J Balunas; Derek Bunsko; Fausto Carnevale Neto; Laia Castaño-Espriu; Chen Chang; Trevor N Clark; Jessica L Cleary Little; David A Delgadillo; Pieter C Dorrestein; Katherine R Duncan; Joseph M Egan; Melissa M Galey; F P Jake Haeckl; Alex Hua; Alison H Hughes; Dasha Iskakova; Aswad Khadilkar; Jung-Ho Lee; Sanghoon Lee; Nicole LeGrow; Dennis Y Liu; Jocelyn M Macho; Catherine S McCaughey; Marnix H Medema; Ram P Neupane; Timothy J O'Donnell; Jasmine S Paula; Laura M Sanchez; Anam F Shaikh; Sylvia Soldatou; Barbara R Terlouw; Tuan Anh Tran; Mercia Valentine; Justin J J van der Hooft; Duy A Vo; Mingxun Wang; Darryl Wilson; Katherine E Zink; Roger G Linington
Journal: ACS Cent Sci Date: 2019-11-14 Impact factor: 14.553
