Multi-lab intrinsic solubility measurement reproducibility in CheqSol and shake-flask methods.

Alex Avdeef.

Abstract

This commentary compares 233 CheqSol intrinsic solubility values (log S0) reported in the Wiki-pS0 database for 145 different druglike molecules to the 838 log S0 values determined mostly by the saturation shake-flask (SSF) method for 124 of the molecules from the CheqSol set. The log S0 values range from -1.0 to -10.6 (log molar units), with an average of -3.8. The correlation plot between the two methods indicates r2 = 0.96, RMSE = 0.34 log unit, and a slight bias of -0.07 log unit. The average interlaboratory standard deviation (SDi) is slightly lower for the CheqSol set than for the SSF set: SDiCS = 0.15 and SDiSSF = 0.24. The intralaboratory errors reported in the CheqSol method (0.05 log) need to be multiplied by a factor of 3 to match the expected interlaboratory errors for the method. The scale factor, in part, relates to the hidden systematic errors in the single-lab values. It is expected that improved standardization of the 'gold standard' SSF method, as suggested in the recent 'white paper' on solubility measurement methodology, should bring the SDi of both methods to about 0.15 log unit. The multi-lab averaged log S0 (and the corresponding SDi) values could be helpful additions to existing training-set molecules used to predict the intrinsic solubility of drugs and druglike molecules.
Copyright © 2019 by the authors.

Keywords:  Henderson-Hasselbalch equation; Interlaboratory solubility measurement errors; aqueous intrinsic solubility; potentiometric solubility; shake-flask solubility; thermodynamic solubility

Year:  2019        PMID: 35350660      PMCID: PMC8957238          DOI: 10.5599/admet.698

Source DB:  PubMed          Journal:  ADMET DMPK        ISSN: 1848-7718


Introduction

This commentary considers the interlaboratory reproducibility of published aqueous intrinsic solubility data (log S0) for 124 drugs, determined both by the potentiometric CheqSol (CS) method (233 reported log S0CS values) and mostly by the ‘gold standard’ saturation shake-flask (SSF) method (838 log S0SSF values). For each drug, the method-dependent interlaboratory standard deviations (SDiCS and SDiSSF) are estimated by comparing solubility values determined for that drug in different laboratories. The multi-lab averaged log S0 (and the corresponding SDi values) could be helpful additions to existing training-set molecules used to predict the intrinsic solubility of drugs and druglike molecules. The present contribution is the third in a series of papers aiming to address contemporary issues of solubility measurement, interpretation and prediction [1,2]. These are intended to serve as prologue/accompaniment to an upcoming session on solubility at the IAPC-8 meeting in Split, Croatia, 9-11 September 2019. Since 2009, the International Association of Physical Chemists (IAPC) series of symposia has maintained extensive coverage of the topic of solubility measurement, from both solid state and solution perspectives. At the 2015 IAPC-4 meeting, the special session on solubility measurement resulted in a widely circulated ‘white paper’ drawing on the consensus of expert scientists from six countries (Hungary, Russia, Serbia, Spain, Sweden, United States) [3]. It is expected that future sessions will continue to cover solubility methods and strategies, to critically address the different needs in pharmaceutical research, spanning from drug discovery to drug development.

Background

Most of the small-molecule research compounds in today’s drug discovery projects are ionizable and poorly soluble in water, and are thus prone to show low and/or erratic in vivo intestinal absorption [4]. In discovery, high-throughput microtitre plate methods are used to estimate solubility, where small volumes of 10 mM DMSO solutions of library compounds are added to buffer solutions to induce formation of solid suspensions in the wells. Such estimates of solubility are needed, in part, to anticipate whether compounds would precipitate in bioassays (and thus indicate false positives). In parallel, methods to predict solubility of molecules play an important role in drug discovery, since virtual screening of compound libraries could prioritize molecules for testing in early in vitro screens [2,5,6].

Saturation shake-flask (SSF) and potentiometric (CheqSol) methods

In more advanced stages of drug research, solubility measurement necessarily becomes more rigorous. Here ‘solubility’ refers to the concentration of a solute in a saturated solution in which the dissolved molecule is in thermodynamic equilibrium with its crystalline form suspended in the solution (of a known pH, composition, and temperature). Accurate solubility measurement of druglike molecules can be very difficult to do well, although to an untrained eye it ought to be as easy as measuring the concentration of a molecule in water. In development projects, thermodynamic solubility measurements are usually done using some variant of the SSF method [3]. As an alternative, a potentiometric procedure called the dissolution template titration (DTT) method was introduced in 1998 [7] and validated two years later [8]. The methodology is capable of producing highly precise (intralaboratory SD ~0.07 log unit) measurements of aqueous intrinsic solubility, log S0, i.e., the solubility of the uncharged form of an ionizable molecule. A much faster variant of the pH-titration method, called CheqSol, was described in 2005 [9]. Instruments implementing the potentiometric methods have been used in several universities and pharmaceutical companies. All of the potentiometric methods require that the molecule be ionizable and that an accurately measured pKa be provided. The molecule cannot be too soluble, since the method depends on the pH difference between a saturated and an unsaturated solution at the point in the titration where the molecule is half ionized. The method is therefore ideally suited for low-soluble molecules, since these display large pH differences. Furthermore, to calculate the CheqSol log S0, it is assumed that solubility as a function of pH follows the curve predicted by the Henderson-Hasselbalch equation. The molecule needs to be stable to hydrolysis when repeatedly exposed to pH conditions far from neutral.
Means to recognize hydrolytic decomposition are important to incorporate into the measurement. Sometimes multiple polymorphs may form in the CheqSol method, which requires solid state characterization to identify. In the traditional SSF method, most of the time, the thermodynamically most-stable form of the solid is the one associated with the measured solubility [3]. Equilibration times are selected to be long (24-168 h) to ensure this expectation.
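For reference, the Henderson-Hasselbalch assumption mentioned above can be written out for the simplest cases. This is the standard textbook form for a monoprotic acid or base, not a formula quoted from the cited papers; it is only expected to hold where the uncharged solid is the sole precipitate and no aggregates or complexes form.

```latex
% Monoprotic weak acid (uncharged form HA), solubility S at a given pH:
\log S = \log S_0 + \log\left(1 + 10^{\,\mathrm{pH} - \mathrm{p}K_a}\right)

% Monoprotic weak base (uncharged form B), solubility S at a given pH:
\log S = \log S_0 + \log\left(1 + 10^{\,\mathrm{p}K_a - \mathrm{pH}}\right)
```

In both cases log S approaches log S0 where the molecule is essentially uncharged, and rises by about one log unit per pH unit once the molecule is mostly ionized.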

Challenging measurement

What can make solubility measurement of an ionizable low-soluble molecule so difficult? There are two sides to consider for the reaction at equilibrium: (a) the solid state and (b) the solution. Temperature needs to be regulated and specified. To keep the following examples simple, let’s assume that the solvent is distilled water or an aqueous buffer, and that the crystalline form of the test compound is a free-acid/base or a salt. Multiple circumstances may arise, some making the interpretation of the measurement challenging: (i) The simplest suspension is the one where no drug ionization takes place on equilibration. One needs to measure the concentration of the compound in the saturated solution, and to confirm that the solid state form is unchanged; that is the simple case. (ii) However, if the solid introduced is not the thermodynamically most-stable polymorph (or is amorphous), then it is possible that the measured solubility would correspond to a different solid form. That’s important to know. (iii) Complications can arise if a low-soluble weak base is added to water (usually saturated with CO2 from the air). The pH will change, depending on the pKa of the molecule. The ambient CO2 may act as a buffer, so the final pH of the saturated solution needs to be carefully measured. Otherwise, the calculated log S0 could be quite erroneous. (iv) If a solid salt of the compound is added to water, a supersaturated solution may form. On equilibration, there could be two precipitates in the suspension: the original compound salt and the neutral free-acid/base form of the solid (at the pH called ‘pHmax’). Solid state characterization of the solid(s) and the measurement of pH would be highly beneficial. (v) When a buffer medium is used, the analysis of the solution and solid states can be complicated, as water-soluble drug-buffer complexes or aggregates may form [10].
In a supersaturated solution, drug molecules may self-associate as sub-micellar aggregates, particularly if they are surface-active [11,12]. For bases introduced as drug salts into a high-pH solution, the charged drug in the supersaturated solution can disproportionate into oil or undergo precipitation into an amorphous solid, along with which charged water-soluble aggregates may co-exist [10-13]. Given enough time, the multiple phases are expected to undergo transformation into a thermodynamically most-stable crystalline solid. Good understanding of solution chemistry and solid state characterization is essential for correctly interpreting the results of solubility measurements, so that high-quality intrinsic solubility data can be reported [3].
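Case (iii) above, where a poorly measured final pH leads to an erroneous log S0, can be illustrated numerically. The following is a minimal sketch, assuming a monoprotic molecule that obeys the Henderson-Hasselbalch relation with no aggregation or salt precipitation; the function name `intrinsic_log_s` and the example compound (a weak base with pKa 9.0 and measured log S = -2.0) are hypothetical and not taken from the source.

```python
import math


def intrinsic_log_s(log_s_ph: float, ph: float, pka: float, kind: str) -> float:
    """Correct a solubility measured at a given pH to log S0.

    Assumes a monoprotic acid or base following the Henderson-Hasselbalch
    relation; this is an illustrative simplification.
    """
    if kind == "acid":
        exponent = ph - pka
    elif kind == "base":
        exponent = pka - ph
    else:
        raise ValueError("kind must be 'acid' or 'base'")
    return log_s_ph - math.log10(1.0 + 10.0 ** exponent)


# Hypothetical weak base: the true final pH is 7.0, but 7.5 is recorded.
log_s0_true = intrinsic_log_s(-2.0, 7.0, 9.0, "base")
log_s0_misread = intrinsic_log_s(-2.0, 7.5, 9.0, "base")
error = log_s0_misread - log_s0_true

print(f"log S0 (true pH):    {log_s0_true:.2f}")
print(f"log S0 (misread pH): {log_s0_misread:.2f}")
print(f"error in log S0:     {error:.2f}")
```

In this made-up example a 0.5 unit error in pH carries over almost fully into log S0, because the base is mostly ionized at both pH values. That is several times the interlaboratory SD values discussed in this commentary.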

The need for high-quality data in accurate solubility prediction

Accurate prediction of the intrinsic solubility of druglike molecules [2] requires that (a) the log S0 values used to train the prediction method are of high quality (with water solubility values, log Sw, or values measured at a particular pH, log SpH, properly corrected for ionization [14] and with all solubility values referring to the same temperature [15]), and (b) the compounds in the training set cover the druglike chemical space of the test set of compounds. These two important notions became the focus of a number of studies since 2008, spurred by the publications of Llinàs et al. [16] and Hopfinger et al. [17]. These authors introduced the ‘Solubility Challenge,’ a competition to probe the limits of prediction methods. The CheqSol method was used to measure the log S0 of 132 structurally diverse drugs. The log S0 values of 100 molecules were offered as the training set for the prediction of an external test set of 32 molecules (not found in the training set), whose values were not revealed before the completion of the competition. In a number of earlier studies, it was suggested that the typical error in measured aqueous solubility is ~0.6 log unit or higher, when the solubility values were collected from many published sources [18]. This suggested that the quality of prediction methods was approaching the experimental limit. However, in the Solubility Challenge competition all of the values came from one laboratory, and the intralaboratory precision (repetitive measurements of the same sample by the same chemist, using the same instrument) of the CheqSol data was reported to be SD = 0.05 log unit. It was not known what the expected interlaboratory precision would be, given the unknown systematic errors that might affect the accuracy of results. 
For example, when 125 published CheqSol values were compared in 2015 to those obtained by the SSF method, it was reported that r = 0.90, prediction root-mean-square error (RMSE) = 0.52 log unit, and there was a slight bias of -0.13 log unit [19]. The values in the comparison came from the Wiki-pS0 database (in-ADME Research), which at that time contained 4557 log S0 entries. The database now has 6355 entries, with many newly added CheqSol and SSF values. It was thus of interest to update and better characterize the comparison of data quality between the CheqSol and the ‘gold standard’ saturation shake-flask methods. In parallel, armed with new curated data, the second Solubility Challenge has just been announced [2], with the prediction submission deadline set to 8 September 2019, the day before the IAPC-8 conference starts. The Excel submission form in the Supporting Info is freely downloadable for those interested in participating. The data described below could be a useful addition to other druglike training sets currently in circulation.

Method

Data source: Wiki-pS0 database

The ongoing Wiki-pS0 database project [2,3,15,19], which started in 2011, now has 6355 log S0 entries for 3014 different drug-relevant molecules (solids at room temperature), drawn from 1325 publications. The overall interlaboratory standard deviation, SDiALL = 0.17 log unit, has been estimated from the 870 molecules for which solubility was reported from two or more different sources (comprising 4209 individual S0 values), by taking the average of the 870 individual SD values. The SDiALL, being lower than the older estimate of solubility measurement error (~0.6 log unit [18]), indicates that (i) when legacy data are subjected to critical analysis, as recommended in [3,15,19], improvements in the quality of the extracted log S0 data can be achieved, and (ii) there is room for further improvement to the current prediction methods. Alongside the database, the pDISOL-X program (in-ADME Research) was designed to interpret solubility data and make temperature corrections, to produce a reliable estimate of the underpinning log S0 [10,11,20].
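The averaging procedure described above (one SD per molecule with two or more sources, then the mean of those SDs) can be sketched as follows. This is an illustration under assumptions: the source does not state whether the sample (n-1) or population SD is used, so the sample SD is assumed here, and the molecule names and log S0 values are invented placeholders, not database entries.

```python
from statistics import mean, stdev


def interlab_sd(values_by_molecule):
    """Return (average SD, per-molecule SDs) over molecules with >= 2 sources.

    Uses the sample standard deviation (n - 1 denominator); that choice is
    an assumption, since the source does not specify it.
    """
    per_molecule = {
        name: stdev(values)
        for name, values in values_by_molecule.items()
        if len(values) >= 2  # single-source molecules carry no interlab SD
    }
    return mean(list(per_molecule.values())), per_molecule


# Hypothetical log S0 values, one per independent source.
example = {
    "molecule_A": [-3.0, -3.2],
    "molecule_B": [-5.0, -5.4, -5.2],
    "molecule_C": [-2.0],  # only one source: excluded from the average
}

sdi_all, sd_by_molecule = interlab_sd(example)
print(f"molecules used: {len(sd_by_molecule)}, average SDi = {sdi_all:.2f}")
```

Averaging the per-molecule SDs weights every molecule equally, whatever its number of sources; pooling variances instead would give a somewhat different number.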

Results

There are 233 reported CheqSol log S0 values in the Wiki-pS0 database for 145 different druglike molecules. For 124 of the molecules, there are 838 reported log S0 values determined mostly by the SSF method. Of the 838 entries, 298 (36 %) were log S0 values calculated from log S vs. pH data (using pDISOL-X), based on a total of 2925 individual log SpH measurements. (For 21 of the 145 molecules, SSF data have not been located in the literature.) Table 1 lists the solubility values for the 124 overlapping molecules measured by the CheqSol and SSF methods. The log S0 values range from -1.0 to -10.6 (log molar units), with an average of -3.8.
Table 1.

Averaged Saturation Shake-Flask (SSF) and CheqSol (CS) Intrinsic Solubility (log molar) [a]

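Compound | avg. 25 °C log S0SSF | SDSSF | N (SSF) | avg. 25 °C log S0CS | SDCS | N (CS) | Ref. (CheqSol)
Acebutolol | -2.50 | 0.42 | 2 | -2.68 | 0.31 | 1 | [17]
Acetaminophen | -0.97 | 0.10 | 18 | -1.03 | 0.05 | 2 | [16,22]
Acetazolamide | -2.38 | 0.18 | 10 | -2.44 | 0.04 | 1 | [16]
Alprenolol | -2.83 | 0.16 | 1 | -2.66 | 0.04 | 2 | [22,28]
Amantadine | -2.76 | 0.50 | 1 | -1.90 | 0.07 | 2 | [22,29]
Amiodarone | -10.58 | 0.35 | 4 | -9.68 | 0.15 | 1 | [22]
Amitriptyline | -4.66 | 0.40 | 5 | -4.55 | 0.15 | 1 | [22]
Amodiaquine | -4.74 | 0.35 | 1 | -5.87 | 0.10 | 2 | [16,22]
Amoxicillin | -2.13 | 0.06 | 10 | -1.97 | 0.08 | 1 | [17]
Aripiprazole | -6.74 | 0.16 | 2 | -6.43 | 0.30 | 1 | [28]
Atenolol | -1.20 | 0.06 | 7 | -1.25 | 0.06 | 4 | [22,23,27,28]
Atropine | -2.01 | 0.29 | 4 | -2.00 | 0.07 | 1 | [16]
Barbital,Hexo- | -2.79 | 0.15 | 5 | -2.67 | 0.07 | 1 | [16]
Barbital,Pheno- | -2.30 | 0.08 | 25 | -2.29 | 0.07 | 1 | [22]
Bendroflumethiazide | -4.35 | 0.34 | 4 | -4.19 | 0.15 | 2 | [24,29]
Benzocaine | -2.18 | 0.12 | 13 | -2.33 | 0.31 | 1 | [17]
Benzoic_Acid | -1.59 | 0.05 | 12 | -1.61 | 0.15 | 1 | [22]
Benzoic_Acid,4-Hydroxy- | -1.38 | 0.10 | 5 | -1.46 | 0.04 | 1 | [22]
Benzthiazide | -4.84 | 0.28 | 4 | -4.86 | 0.04 | 2 | [22,24]
Bifonazole | -6.27 | 0.25 | 4 | -6.01 | 0.15 | 1 | [25]
Bisoprolol | -2.40 | 0.34 | 2 | -1.46 | 0.26 | 1 | [28]
Bupivacaine | -3.47 | 0.18 | 9 | -3.08 | 0.21 | 2 | [16,24]
Carprofen | -4.63 | 0.05 | 1 | -4.70 | 0.22 | 1 | [22]
Carvedilol | -5.46 | 0.36 | 12 | -4.46 | 0.27 | 1 | [28]
Cephalothin | -3.40 | 0.32 | 1 | -2.94 | 0.04 | 1 | [16]
Chlorpheniramine | -2.65 | 0.10 | 1 | -2.60 | 0.12 | 3 | [21,22]
Chlorpromazine | -5.45 | 0.31 | 11 | -5.55 | 0.04 | 1 | [22]
Chlorpropamide | -3.15 | 0.17 | 5 | -3.21 | 0.05 | 2 | [16,28]
Chlorprothixene | -5.66 | 0.38 | 3 | -6.31 | 0.44 | 3 | [16,22]
Chlorzoxazone | -2.78 | 0.11 | 3 | -2.64 | 0.04 | 2 | [16,29]
Cimetidine | -1.50 | 0.22 | 7 | -1.69 | 0.04 | 1 | [16]
Ciprofloxacin | -3.57 | 0.19 | 19 | -3.60 | 0.18 | 1 | [22]
Cyproheptadine | -5.02 | 0.46 | 4 | -5.00 | 0.15 | 1 | [24]
Diazoxide | -3.46 | 0.26 | 3 | -3.36 | 0.07 | 1 | [16]
Dibucaine | -3.70 | 0.35 | 1 | -4.20 | 0.26 | 2 | [17,24]
Diclofenac | -5.33 | 0.19 | 27 | -5.40 | 0.14 | 6 | [9,21,22,24,26]
Diethylstilbestrol | -4.38 | 0.38 | 6 | -4.43 | 0.48 | 1 | [17]
Difloxacin | -3.94 | 0.10 | 2 | -3.60 | 0.04 | 1 | [16]
Diflunisal | -5.08 | 0.40 | 5 | -4.92 | 0.70 | 6 | [17,21]
Diltiazem | -2.95 | 0.20 | 1 | -3.05 | 0.16 | 2 | [22]
Diphenhydramine | -3.29 | 0.64 | 3 | -2.95 | 0.04 | 1 | [22]
Dipyridamole | -5.13 | 0.13 | 10 | -5.16 | 0.02 | 1 | [17]
Enrofloxacin | -3.16 | 0.13 | 7 | -3.18 | 0.18 | 1 | [16]
Famotidine | -2.67 | 0.27 | 5 | -2.65 | 0.04 | 1 | [22]
Fenoprofen | -3.92 | 0.20 | 1 | -3.70 | 0.26 | 1 | [16]
Flufenamic_Acid | -5.27 | 0.28 | 10 | -5.35 | 0.04 | 1 | [22]
Flumequine | -4.10 | 0.08 | 1 | -3.80 | 0.11 | 2 | [16,22]
Flurbiprofen | -4.36 | 0.19 | 21 | -4.05 | 0.15 | 2 | [16,22]
Fluvastatin | -3.78 | 0.06 | 1 | -3.87 | 0.16 | 1 | [28]
Furosemide | -4.51 | 0.20 | 19 | -4.18 | 0.12 | 3 | [21,22]
Glibenclamide | -6.63 | 0.45 | 10 | -6.41 | 0.07 | 2 | [27,28]
Gliclazide | -4.27 | 0.40 | 14 | -4.20 | 0.15 | 3 | [23,27,28]
Glimepiride | -7.22 | 0.42 | 5 | -6.44 | 0.15 | 1 | [27]
Glipizide | -5.66 | 0.24 | 6 | -5.51 | 0.10 | 3 | [27-29]
Guanine | -4.09 | 0.07 | 2 | -4.43 | 0.37 | 1 | [16]
Haloperidol | -5.76 | 0.14 | 8 | -5.52 | 0.19 | 2 | [22,24]
Hydroflumethiazide | -2.72 | 0.11 | 15 | -2.70 | 0.03 | 3 | [21,29]
Hydrochlorothiazide | -3.08 | 0.08 | 5 | -2.97 | 0.44 | 1 | [16]
Ibuprofen | -3.80 | 0.22 | 19 | -3.97 | 0.23 | 4 | [21-23]
Imipramine | -4.39 | 0.28 | 7 | -4.15 | 0.13 | 4 | [17,21,22]
Ketoprofen | -3.45 | 0.23 | 20 | -3.23 | 0.02 | 4 | [17,21,24]
Lidocaine | -1.82 | 0.08 | 19 | -1.87 | 0.05 | 1 | [22]
Loperamide | -6.80 | 0.32 | 3 | -7.10 | 0.04 | 2 | [16,22]
Maprotiline | -4.54 | 0.26 | 3 | -4.75 | 0.08 | 2 | [22,24]
Meclizine | -5.55 | 0.66 | 1 | -6.48 | 0.15 | 1 | [29]
Meclofenamic_Acid | -6.88 | 0.08 | 2 | -6.56 | 0.42 | 2 | [17,22]
Mefenamic_Acid | -6.36 | 0.42 | 6 | -6.54 | 0.28 | 2 | [16,22]
Metoclopramide | -3.18 | 0.28 | 1 | -3.58 | 0.07 | 1 | [29]
Metoprolol | -1.31 | 0.16 | 2 | -1.22 | 0.01 | 2 | [22,28]
Metronidazole | -1.25 | 0.06 | 17 | -1.22 | 0.04 | 1 | [16]
Miconazole | -6.04 | 0.41 | 4 | -5.38 | 0.45 | 2 | [16,22]
Nadolol | -1.29 | 0.40 | 2 | -1.63 | 0.08 | 2 | [22,28]
Nalidixic_Acid | -3.76 | 0.26 | 8 | -3.61 | 0.04 | 1 | [16]
Naphthoic_Acid,2- | -3.82 | 0.28 | 5 | -3.77 | 0.15 | 1 | [22]
Naphthol,1- | -2.09 | 0.10 | 6 | -1.98 | 0.07 | 1 | [22]
Naproxen | -4.21 | 0.14 | 15 | -4.41 | 0.25 | 2 | [16,22]
Niflumic_Acid | -4.18 | 0.13 | 6 | -4.11 | 0.08 | 2 | [16,22]
Nitrofurantoin | -3.34 | 0.11 | 11 | -3.29 | 0.06 | 2 | [16,22]
Norfloxacin | -2.88 | 0.16 | 18 | -2.75 | 0.15 | 1 | [22]
Nortriptyline | -3.94 | 0.23 | 3 | -3.93 | 0.02 | 2 | [16,22]
Ofloxacin | -1.37 | 0.14 | 6 | -1.27 | 0.04 | 1 | [16]
Olanzapine | -4.47 | 0.17 | 1 | -4.23 | 0.18 | 1 | [24]
Orphenadrine | -3.71 | 0.35 | 1 | -3.17 | 0.15 | 1 | [22]
Paliperidone | -4.56 | 0.18 | 1 | -4.31 | 0.22 | 1 | [28]
Papaverine | -4.40 | 0.14 | 8 | -4.21 | 0.24 | 4 | [16,21,22,24]
Phenazopyridine | -3.99 | 0.16 | 6 | -4.19 | 0.07 | 1 | [16]
Phenol,4-Iodo- | -1.83 | 0.07 | 1 | -1.72 | 0.04 | 1 | [22]
Phenylbutazone | -4.51 | 0.36 | 10 | -4.39 | 0.04 | 1 | [22]
Phenytoin | -4.08 | 0.13 | 29 | -3.86 | 0.18 | 1 | [16]
Phthalic_Acid,2- | -1.46 | 0.02 | 10 | -1.55 | 0.08 | 2 | [16,22]
Pindolol | -3.81 | 0.14 | 6 | -3.64 | 0.13 | 3 | [22,24,28]
Pioglitazone | -6.21 | 0.81 | 3 | -6.16 | 0.46 | 1 | [28]
Piroxicam | -4.46 | 0.26 | 16 | -4.73 | 0.06 | 4 | [16,21,22]
Pramoxine | -3.46 | 0.39 | 1 | -3.02 | 0.15 | 1 | [22]
Probenecid | -4.82 | 0.24 | 3 | -4.86 | 0.04 | 1 | [17]
Procaine | -2.59 | 0.46 | 2 | -1.72 | 0.07 | 1 | [22]
Prochlorperazine | -4.38 | 0.26 | 1 | -4.75 | 0.15 | 1 | [22]
Promethazine | -4.39 | 0.19 | 10 | -4.26 | 0.15 | 1 | [22]
Propranolol | -3.82 | 0.30 | 4 | -3.49 | 0.06 | 6 | [21-24,27,28]
Pyrimethamine | -3.96 | 0.57 | 3 | -4.11 | 0.22 | 1 | [22]
Quinine | -3.16 | 0.67 | 5 | -2.80 | 0.01 | 2 | [16,22]
Ranitidine | -1.70 | 0.57 | 1 | -2.50 | 0.26 | 1 | [16]
Rosiglitazone | -5.28 | 0.50 | 3 | -5.22 | 0.61 | 1 | [28]
Salicylic_Acid | -1.88 | 0.08 | 20 | -1.93 | 0.04 | 1 | [17]
Sertraline | -5.41 | 0.28 | 2 | -4.83 | 0.15 | 1 | [22]
Sulfacetamide | -1.50 | 0.05 | 5 | -1.52 | 0.04 | 1 | [16]
Sulfamerazine | -3.10 | 0.07 | 6 | -3.12 | 0.05 | 1 | [22]
Sulfamethazine | -2.63 | 0.27 | 6 | -2.73 | 0.04 | 1 | [16]
Sulfamethizole | -2.77 | 0.14 | 5 | -2.78 | 0.14 | 1 | [17]
Sulfasalazine | -6.47 | 0.08 | 7 | -6.21 | 0.10 | 2 | [16,22]
Sulfathiazole | -2.61 | 0.23 | 8 | -2.69 | 0.15 | 1 | [22]
Terfenadine | -7.68 | 0.71 | 10 | -8.40 | 0.15 | 1 | [29]
Tetracaine | -3.22 | 0.11 | 1 | -3.05 | 0.06 | 2 | [16,24]
Tetracycline | -3.29 | 0.05 | 6 | -3.00 | 0.09 | 2 | [16,29]
Tetracycline,Oxy- | -3.27 | 0.07 | 4 | -3.09 | 0.33 | 1 | [16]
Thiabendazole | -4.13 | 0.46 | 3 | -3.48 | 0.10 | 1 | [17]
Thymol | -2.18 | 0.05 | 3 | -2.19 | 0.04 | 1 | [22]
Tolbutamide | -3.55 | 0.11 | 5 | -3.50 | 0.05 | 2 | [17,28]
Tolmetin | -4.05 | 0.15 | 1 | -4.11 | 0.03 | 2 | [16,22]
Trazodone | -3.31 | 0.09 | 4 | -3.19 | 0.40 | 2 | [17]
Trichlormethiazide | -3.64 | 0.12 | 3 | -3.38 | 0.18 | 3 | [16,22]
Trimethoprim | -2.64 | 0.39 | 9 | -2.95 | 0.15 | 1 | [16]
Verapamil | -4.36 | 0.28 | 9 | -4.07 | 0.22 | 3 | [21,22]
Warfarin | -4.77 | 0.22 | 9 | -4.80 | 0.14 | 2 | [22,24]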

a SD refers to the standard deviation of reported values from N different sources. The average interlaboratory SD values for the two sets of data are: SDiSSF = 0.24 and SDiCS = 0.15 log unit.

Note that indomethacin is not listed in the table. The CheqSol value was not accepted in the Wiki-pS0 database due to the hydrolytic decomposition encountered during the CheqSol assay [30]. On the other hand, the pH-metric DTT method indicated log S0DTT = -5.33 [31], in good agreement with the average of 20 interlaboratory SSF measurements: log S0SSF = -5.49, SDi = 0.23. Figure 1 shows the correlation plot between the two types of measurements, with each point carrying the interlaboratory error bars of both methods. The statistics have improved slightly over the 2015 comparison [19], with current values being r2 = 0.96, RMSE = 0.34 log unit, and a smaller bias of -0.07 log unit. The average interlaboratory standard deviation is slightly lower for the CheqSol set than for the SSF set: SDiCS = 0.15 and SDiSSF = 0.24, which probably highlights the benefit of using a highly standardized method (CheqSol) over an ‘open’ method (SSF). It should be kept in mind that the above comparison sets are small. For 870 of these sorts of comparisons, SDiALL = 0.17. The intralaboratory comparison between the methods by one group of researchers performing both the SSF and CheqSol measurements (both highly standardized) [32] produced the statistics r2 = 0.96, RMSE = 0.20 for 15 compounds, comparable to SDiALL. The latter is the target for computational methods to aim at, provided that the training sets are of high and consistent quality.
Figure 1.

The correlation plot between published CheqSol and saturation shake-flask (SSF) intrinsic solubility values (log molar) at 25 °C. The solid diagonal line is the identity line. The dashed lines are displacements by ±0.5 log.
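The three statistics quoted for the method comparison (r2, RMSE, bias) can be computed from paired averages as in the minimal sketch below. Several choices here are assumptions, since the source does not define them: RMSE and bias are taken about the identity line, bias is defined as mean(CS - SSF), and r2 is the squared Pearson correlation. The five pairs are rows copied from Table 1 purely for illustration, so the output is not expected to match the 124-molecule statistics.

```python
import math


def compare_methods(ssf, cs):
    """Return (r2, rmse, bias) for paired log S0 values from two methods."""
    n = len(ssf)
    diffs = [c - s for s, c in zip(ssf, cs)]  # assumed convention: CS - SSF
    bias = sum(diffs) / n
    rmse = math.sqrt(sum(d * d for d in diffs) / n)

    # Squared Pearson correlation coefficient.
    mean_s, mean_c = sum(ssf) / n, sum(cs) / n
    sxy = sum((s - mean_s) * (c - mean_c) for s, c in zip(ssf, cs))
    sxx = sum((s - mean_s) ** 2 for s in ssf)
    syy = sum((c - mean_c) ** 2 for c in cs)
    r2 = sxy * sxy / (sxx * syy)
    return r2, rmse, bias


# Rows from Table 1: acetaminophen, atenolol, diclofenac, ibuprofen, amiodarone.
log_s0_ssf = [-0.97, -1.20, -5.33, -3.80, -10.58]
log_s0_cs = [-1.03, -1.25, -5.40, -3.97, -9.68]

r2, rmse, bias = compare_methods(log_s0_ssf, log_s0_cs)
print(f"r2 = {r2:.3f}, RMSE = {rmse:.2f}, bias = {bias:+.2f}")
```

In this small subset a single large discrepancy (amiodarone, 0.90 log unit) dominates both RMSE and bias, which shows how sensitive these summary numbers are to a few outlying molecules.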

Conclusions

This brief commentary reasserts that the quality of the standardized CheqSol measurements is comparable to that of the ‘gold standard’ saturation shake-flask measurements. Measurement errors are much lower than commonly acknowledged in the computational prediction community. The intralaboratory (single instrument) errors reported in the CheqSol method (0.05 log) need to be multiplied by a factor of 3 to match the expected interlaboratory errors for the method (0.15 log). The scale factor, in part, relates to the hidden systematic errors in the single-lab values. It is expected that better standardization of the ‘open’ SSF methods, as recommended in the ‘white paper’ [3], may equalize the SDi of both methods at about 0.15 log unit. When solubility prediction methods indicate RMSE below 0.15, ‘overfitting’ is probably taking place, with the model fitting noise as if it were information.