
A novel application of t-statistics to objectively assess the quality of IC50 fits for P-glycoprotein and other transporters.

Michael O'Connor¹, Caroline Lee², Harma Ellens³, Joe Bentz⁴.

Abstract

Current USFDA and EMA guidance for drug transporter interactions is dependent on IC50 measurements as these are utilized in determining whether a clinical interaction study is warranted. It is therefore important not only to standardize transport inhibition assay systems but also to develop uniform statistical criteria with associated probability statements for generation of robust IC50 values, which can be easily adopted across the industry. The current work provides a quantitative examination of critical factors affecting the quality of IC50 fits for P-gp inhibition through simulations of perfect data with randomly added error as commonly observed in the large data set collected by the P-gp IC50 initiative. The types of errors simulated were (1) variability in replicate measures of transport activity; (2) transformations of error-contaminated transport activity data prior to IC50 fitting (such as performed when determining an IC50 for inhibition of P-gp based on efflux ratio); and (3) the lack of well defined "no inhibition" and "complete inhibition" plateaus. The effect of the algorithm used in fitting the inhibition curve (e.g., two or three parameter fits) was also investigated. These simulations provide strong quantitative support for the recommendations provided in Bentz et al. (2013) for the determination of IC50 values for P-gp and demonstrate the adverse effect of data transformation prior to fitting. Furthermore, the simulations validate uniform statistical criteria for robust IC50 fits in general, which can be easily implemented across the industry. A calibration of the t-statistic is provided through calculation of confidence intervals associated with the t-statistic.


Keywords: Analysis of error; IC50 statistical precision; P-glycoprotein

Year: 2014 PMID: 25692007 PMCID: PMC4317220 DOI: 10.1002/prp2.78

Source DB: PubMed Journal: Pharmacol Res Perspect ISSN: 2052-1707

Introduction

Membrane transporters, such as P-glycoprotein (P-gp), play critical roles in the absorption and excretion of drugs, and their distribution to various physiological spaces. For multidrug transporters like P-gp, the broad range of transported substrates leads to the possibility of competitive inhibition that may result in clinically significant drug–drug interactions (DDIs) (Schwarz et al. 2000; Juan et al. 2007; Fenner et al. 2009; Shirasaka et al. 2010). In the drug transporter area, the potential for inhibition is commonly assessed via the determination of an in vitro IC50 value. Regulatory guidance documents on the investigation of DDIs contain decision trees and recommendations, based on the IC50 value in combination with clinical drug concentrations, on whether a clinical DDI study is warranted. To assess the risk of a P-gp mediated DDI, both measured plasma concentrations and a theoretical maximal concentration in the intestinal lumen are considered. Several different decision criteria have been proposed to assess the DDI risk for the P-gp substrate digoxin based on different experimental systems and statistical approaches (Cook et al. 2010; Sugimoto et al. 2011; Agarwal et al. 2013). There is a clear need for standardization of the IC50 determination for inhibition of P-gp (Zhang et al. 2008; Agarwal et al. 2013; Bentz et al. 2013). Both P-gp-expressing cell lines and P-gp-containing plasma membrane vesicles have been used to estimate IC50 values for inhibition of P-gp-mediated transport. In addition, transport inhibition data obtained using polarized cell lines are typically transformed before IC50 estimation. Several different data transformations are in use, which are based on efflux ratio, net secretory flux, or unidirectional flux (Tang et al. 2002; U.S. FDA/CDER 2006; Kalvass and Pollack 2007; Balimane et al. 2008; Lumen et al. 2010).
A consortium of 22 pharmaceutical and contract research laboratories and an academic institution established a collaboration to assess interlaboratory differences in P-gp IC50 values resulting from these methodological differences (Bentz et al. 2013; Ellens et al. 2013). Among the members of the consortium, P-gp-expressing polarized cell lines were the most frequently used experimental system. The cell lines included human colon adenocarcinoma cells (Caco-2), Madin–Darby canine kidney cells transfected with MDR1 cDNA (MDCKII-MDR1), and Lilly Laboratories Cells – Porcine Kidney Nr. 1 cells transfected with MDR1 cDNA (LLC-PK1-MDR1). The substantial lab-to-lab variability in IC50 values observed by consortium members most likely resulted from differences in P-gp expression levels and possibly in expression levels of a digoxin uptake transporter in the cell systems used. It was also found that IC50 values based on efflux ratios were typically lower than those based on unidirectional flux, and it was noted that data transformation results in propagation of error. Therefore, a recommendation was put forward to determine P-gp IC50 values for digoxin transport based on unidirectional B>A flux only, without data transformation prior to IC50 fitting (in this case subtraction of transport in the presence of a positive control inhibitor) to minimize propagation of error (Bentz et al. 2013). One of the problems faced by the consortium was determining the extent to which differences in the quality of the transport inhibition data accounted for differences in IC50 values, rather than differences in experimental systems and data transformations. Members of the IC50 consortium initially used the standard error of the IC50 (SEIC50) estimate to assess the quality of fits. This was problematic for two reasons. First, estimates of the IC50 for a single data set varied when estimated with different software packages.
Some software estimated the standard error of the IC50, others the standard error of the log(IC50) (recall that one cannot be converted into the other simply by taking logs). Even among packages that estimated standard errors for the log(IC50), the estimate of the IC50 could vary several fold, presumably because different packages used different methods to calculate the variance of the log(IC50) estimate, which is a derived quantity (see Data S1 eqs. A1–A4). Second, given this variability, there was no clear criterion that could be used to distinguish good from poor data. In Bentz et al. (2013), a t-statistic was developed to objectively assess when data quality (variability in measurements, insufficient range of inhibitor concentrations, or other factors that confound a sigmoidal profile) limits the interpretation of estimated IC50 values. This statistic was calibrated through visual inspection of logistic IC50 fits of untransformed data by members of the P-gp IC50 consortium, and all fits with a t-statistic value below a fixed threshold were judged of poor quality and excluded from further analysis. In this manuscript, that t-statistic calibration is further supported by quantitative simulations of commonly observed error (as in the data in Bentz et al. 2013) added to error-free data and evaluation of the effect of that error on the confidence in the IC50 estimate. The authors simulated the sensitivity of IC50 estimates to (1) measurement errors (variability in replicate measures of transport activity), (2) transformation of error-contaminated transport activity data prior to IC50 fitting, (3) lack of clearly described “no inhibition” and/or “complete inhibition” plateaus, and (4) algorithms used in fitting the data (e.g., two and three parameter fits). These simulations provide strong quantitative support for the recommendations provided in Bentz et al. (2013) for the determination of IC50 values for P-gp and demonstrate quantitatively the adverse effect of data transformation prior to IC50 fitting (either calculation of efflux ratios or subtraction of transport in the presence of a positive control inhibitor) on the robustness of the IC50 value. Furthermore, the simulations validate uniform statistical criteria for robust IC50 fits in general (not just for P-gp), which can be easily implemented across the pharmaceutical industry.

Materials and Methods

Computations

Unless specifically noted, all calculations, including statistics, were performed using a 64-bit installation of MATLAB Version 7.11 (Release 2010b).

Statistics

Logistic regressions (logistic fits), together with their parameter and standard error estimates, were obtained by nonlinear least squares regression using MATLAB's statistics toolbox. Standard errors of log(IC50) estimates were calculated as recommended by Lyles et al. (2008). Linear least squares regressions were performed using MATLAB (Quinn and Keough 2002; Press et al. 2007). Analysis of variance (ANOVA) and analysis of covariance (ANCOVA) were calculated via general linear models (Rao 1998; Quinn and Keough 2002).

Monte Carlo simulations of error sensitivity

Ideal error-free IC50 curves were simulated for inhibition of digoxin transport in basolateral-to-apical (B>A) and apical-to-basolateral (A>B) transport directions by increasing concentrations of verapamil. The simulations used the elementary rate constants (on-, off-, and efflux rate constants) determined for digoxin and verapamil in MDCKII-MDR1 cells obtained from the Netherlands Cancer Institute (Lumen et al. 2013). In all simulations, it is assumed that P-gp is the only transporter involved in transport of digoxin across the monolayer, that is, digoxin uptake transporters are not included as part of the simulation. This assumption does not affect the conclusions of this work. Figure 1A shows the simulated error-free IC50 curve for transport inhibition of 10 μmol/L digoxin in the B>A direction after 2 h for 18 verapamil concentrations using the kinetic parameters and model given in Lumen et al. (2013). Error-free transport activity was then interpolated from the ideal curve at 7, 9, 11, or 15 inhibitor concentrations spaced evenly on a logarithmic scale (constant ratio between adjacent concentrations). Seven inhibitor concentrations was the most common number used in the data set analyzed in Bentz et al. (2013). Random errors were added to this ideal, error-free data set by Monte Carlo simulations. Each data set with added error consisted of triplicate repeats at each inhibitor concentration, as was the norm for data sets analyzed in Bentz et al. (2013).

Figure 1

“Ideal” noise-free B>A digoxin transport activity fitted using estimated transport parameters (Lumen et al. 2013) for MDR1-MDCK-NKI cells and 18 concentrations of verapamil as a transport inhibitor (A). Parameters used here assumed that no digoxin transporters other than P-gp were active. Similar curve for A>B transport was generated but is not shown. “Homogeneous” random errors are added to each of three replicate “measurements” at each of seven inhibitor concentrations (B). Three levels of error (standard error/range) are illustrated with the noise standard error set to 2%, 10%, and 20% of the transport activity difference between the two plateaus.

Measurement errors (variability in replicate measures of transport activity)

To assess the effects of the two most common types of error observed in Bentz et al. (2013), two types of random error were added to the ideal transport data. First, to simulate measurement errors, normally distributed random errors were added to each transport activity measurement, that is, to transport activity measured at each inhibitor concentration. This is referred to as “homogeneous error.” Standard deviations for this error ranged from 0% to 30% of the full scale deviation (difference between transport activity without inhibitor and with P-gp fully suppressed). IC50 curves were simulated with three levels of error (2%, 10%, and 20%) added to each of the inhibitor concentrations. The magnitude of this “homogeneous error” varied widely in the data analyzed by Bentz et al. (2013), but averaged around 10% of the full range of transport activities. A second type of error was used to simulate situations where all replicates at a particular inhibitor concentration departed significantly from the fitted sigmoid curve, so that the mean activity for that concentration of the inhibitor fell noticeably above or below the sigmoid curve. This is referred to as “heterogeneous error.” In this case, in addition to the homogeneous errors described earlier, all activities at particular inhibitor concentrations were perturbed by an additional error with standard deviation equal to 15% of the full scale deviation (as found in the data collected in Bentz et al. 2013). This was done at randomly chosen inhibitor concentrations, with the probability that any particular inhibitor concentration would be chosen equal to 25% (approximately the incidence of such problems in the empirical data). Ten thousand simulated data sets were created for each level of error and number of inhibitor concentrations (7, 9, 11, or 15).
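The two error types described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' MATLAB code: the ideal curve is replaced by a normalized two-parameter logistic with a hypothetical slope (the paper interpolates from a kinetic model), the concentration range is invented, and all function names are hypothetical.

```python
import math
import random

def ideal_activity(conc, ic50=1.63e-6, slope=1.0):
    """Normalized error-free activity (1 = no inhibition, 0 = full inhibition).
    Assumed stand-in for the kinetic-model curve; the slope is hypothetical."""
    return 1.0 / (1.0 + (conc / ic50) ** slope)

def log_spaced(lo, hi, n):
    """n concentrations with a constant ratio between adjacent values."""
    step = (math.log10(hi) - math.log10(lo)) / (n - 1)
    return [10 ** (math.log10(lo) + i * step) for i in range(n)]

def simulate_data_set(concs, rng, homog_sd=0.10, heterog_sd=0.15,
                      heterog_prob=0.25, replicates=3):
    """Return {conc: [replicate activities]} with both error types added.
    Standard deviations are fractions of the full range (here the range is 1)."""
    data = {}
    for c in concs:
        # Heterogeneous error: one shared shift for all replicates at this
        # concentration, applied with probability heterog_prob.
        shift = rng.gauss(0.0, heterog_sd) if rng.random() < heterog_prob else 0.0
        # Homogeneous error: independent noise on every replicate.
        data[c] = [ideal_activity(c) + shift + rng.gauss(0.0, homog_sd)
                   for _ in range(replicates)]
    return data

rng = random.Random(0)                 # fixed seed for reproducibility
concs = log_spaced(1e-8, 1e-4, 7)      # hypothetical range, in mol/L
data_set = simulate_data_set(concs, rng)
```

Repeating `simulate_data_set` 10,000 times per error level would mirror the scale of the simulations described in the text; setting `heterog_prob=0.0` gives the homogeneous-error-only case.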

Transformation of transport activity prior to IC50 fitting

IC50 is usually estimated via the classic Hill equation

$$\mathrm{Act}([\mathrm{inhib}]) = \mathrm{Act}_{\infty} + \frac{\mathrm{Act}_{0} - \mathrm{Act}_{\infty}}{1 + \left([\mathrm{inhib}]/\mathrm{IC}_{50}\right)^{\beta}} \tag{1}$$

where Act([inhib]), activity measured at a particular inhibitor concentration; Act∞, activity when P-gp is completely inhibited (positive control); Act0, activity measured in absence of inhibitor (negative control); β, slope factor or Hill coefficient; [inhib], the concentration of the inhibitor. Lyles et al. (2008) argue that the Hill equation is most easily estimated by fitting a maximum-likelihood, nonlinear logistic to [inhib] and Act([inhib]). The following logistic equation, a transform of equation (1), was used in Bentz et al. (2013) and in the current work:

$$\mathrm{Act}([\mathrm{inhib}]) = \mathrm{Act}_{\infty} + \frac{\mathrm{Range}}{1 + \exp\left(\alpha + \beta \ln[\mathrm{inhib}]\right)} \tag{2}$$

where Range, Act0 − Act∞; ln{IC50} = −α/β. This approach requires four parameters to be fit (α, β, Act∞, Range), but if Act∞ and/or Range can be treated as known, the number of fitted parameters can be reduced to three or two. Ln is the natural logarithm. Each of the 10,000 simulated error-contaminated transport activity versus inhibitor concentration curves was fit to a logistic model (eq. 2) using two parameter (α and β) and three parameter (α, β, and Range) fits. As in Bentz et al. (2013), Actmin (=Act∞) was normalized to zero at the lowest mean transport activity at any inhibitor concentration (nearly always with the positive control inhibitor of P-gp). For two parameter fits, transport activities were normalized to a 0–1 scale, that is, with Act∞ = 0 and Range = 1. Thus, for a two parameter fit, equation 2 becomes

$$\mathrm{Act}([\mathrm{inhib}]) = \frac{1}{1 + \exp\left(\alpha + \beta \ln[\mathrm{inhib}]\right)} \tag{3}$$

yielding

$$\ln\{\mathrm{IC}_{50}\} = -\alpha/\beta \tag{4}$$

Standard errors of fitted parameters were calculated as maximum-likelihood estimates, the most commonly used method for nonlinear regressions. Because ln{IC50} was a function of two fitted parameters (eq. 4), we estimated the standard error of ln{IC50} using the multivariate delta method (Faraway 2006; Bolker 2008; see also Data S1). All natural logarithms were converted to log base 10 for presentation here and log will mean log base 10. For data in Bentz et al. (2013), most laboratories used two or three parameter fits, thus we concentrate here on those models.

To assess the effect of data transformations on the quality of the IC50 fit, three transformations were selected which represented the major classes of transformations (Choo et al. 2000; Tang et al. 2002; U.S. FDA/CDER 2006; Balimane et al. 2008; Lumen et al. 2010; Bentz et al. 2013). In equations 5–7, BA indicates transport activity in the B>A direction and AB indicates transport activity in the A>B direction. Subscript “i” indicates the activity measured at a particular concentration of the inhibitor, “negcon” indicates activity measured without inhibitor (maximum P-gp transport activity, Act0 in eq. 2), and “poscon” indicates activity measured with P-gp fully suppressed (minimum P-gp transport activity, Act∞ in eq. 2). All transformations were applied to each simulated data set generated for the error sensitivity analysis above and their results compared to those for the logistic fit of the untransformed (native) transport activities.
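Because ln{IC50} = −α/β is a ratio of two fitted parameters, its standard error depends on both parameter variances and on their covariance. The sketch below shows a first-order (delta-method) calculation in Python. It is a minimal illustration and not the authors' implementation: the function name is hypothetical, the estimates and standard errors are borrowed from Table 1, and the correlation between α and β is an invented value because the source does not report it.

```python
import math

def ln_ic50_and_se(alpha, beta, var_alpha, var_beta, cov_alpha_beta):
    """First-order (delta-method) standard error of ln(IC50) = -alpha/beta."""
    ln_ic50 = -alpha / beta
    # Gradient of -alpha/beta with respect to (alpha, beta).
    d_alpha = -1.0 / beta
    d_beta = alpha / beta ** 2
    # Quadratic form g' * Sigma * g written out for the 2x2 case.
    variance = (d_alpha ** 2 * var_alpha
                + d_beta ** 2 * var_beta
                + 2.0 * d_alpha * d_beta * cov_alpha_beta)
    return ln_ic50, math.sqrt(variance)

# Estimates and standard errors as listed in Table 1; corr is HYPOTHETICAL.
alpha, beta = 11.3, 0.9
se_alpha, se_beta, corr = 0.756, 0.0599, 0.99
cov = corr * se_alpha * se_beta

ln_ic50, se_ln = ln_ic50_and_se(alpha, beta, se_alpha ** 2, se_beta ** 2, cov)

# Convert from natural log to log base 10 for presentation.
log10_ic50 = ln_ic50 / math.log(10)
se_log10 = se_ln / math.log(10)
```

The covariance term matters here: ignoring it (setting `cov_alpha_beta=0.0`) would give a much larger standard error when α and β are strongly positively correlated, which may be one reason different software packages report different values.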

Simulation of missing data segments

One of the major problems in empirical data encountered by Bentz et al. (2013) was failure to include (or inability to measure activity within) important ranges of inhibitor concentrations. Reasons for this included failure of the preplanned range of inhibitor concentrations to include positive or negative control activity plateaus or the steep linear portion of the curve, and in some cases, evidence of inhibitor toxicity (Bentz et al. 2013). To simulate this problem, we divided the 18 inhibitor concentrations in the “ideal” data set into six segments: a negative control plateau at low inhibitor concentrations, two negative control “shoulders” where the inhibitor first clearly affects activity, the nearly linear portion of the curve surrounding the IC50, a positive control shoulder as the curve approaches full suppression, and a positive control plateau where activity is nearly completely suppressed (Fig.1A). For these simulations, each of the shoulders included only one inhibitor concentration. Simulated data sets were constructed with errors as above but with inhibitor concentrations within designated segments of the curve deleted from the data set. Analyses proceeded as with the error sensitivity analyses above except that we used only the inhibitor concentrations from the ideal data within the included segments.

Figure of merit statistics, tα and tβ

In Bentz et al. (2013), t-statistics like those reported for regression effects by standard linear model routines (e.g., SAS, R, and SPSS [IBM SPSS Statistics for Windows, Version 22.0; IBM Corp., Armonk, NY]) and used in Wald tests of linear hypotheses (Quinn and Keough 2002; Fox and Weisberg 2011) were used as a “figure of merit” to evaluate data sets for precision, conformity to the sigmoid model, and overall quality of the estimated IC50. In particular, two statistics were developed to assess the sigmoidicity of the IC50 fits:

$$t_{\alpha} = \frac{\hat{\alpha}}{\mathrm{SE}_{\alpha}}, \qquad t_{\beta} = \frac{\hat{\beta}}{\mathrm{SE}_{\beta}} \tag{8}$$

where α̂, the fitted estimate of α in equation 3; β̂, the fitted estimate of β in equation 3; SEα, standard error for the estimate of α; SEβ, standard error for the estimate of β. To present a single statistic summarizing both values, tαβ was estimated:

$$t_{\alpha\beta} = \sqrt{t_{\alpha}\, t_{\beta}} \tag{9}$$

Because the software used by many pharmaceutical professionals to perform nonlinear logistic fits does not allow ready access to either α or SEα, while β and SEβ are readily available, the use of a statistic based on tβ alone was also investigated. To evaluate whether these statistics correlated well with important characteristics of the fits, tα, tβ, and tαβ values for all of the simulated data sets were determined. The relationship of the t-statistics to errors in the parameter estimates (α or β), the standard error of the IC50, and the root-mean-square error (RMSE) for the logistic fit was evaluated. The use of these t-statistics as a figure of merit for a logistic fit assumes that the true value of the parameter (α or β), and hence the expected value of the t-statistic, is something other than zero (eqs. 8, 9). When the concentration of the inhibitor is expressed in μmol/L, and the IC50 is close to 1 μmol/L, then the log(IC50), α, and tα can all be close to zero. Thus, all fits were done with the concentration of the inhibitor expressed as mol/L, and the value of the IC50 transformed for presentation. If one uses tα or tαβ, this procedure is necessary. The values of β and tβ do not depend on how the inhibitor concentration is expressed. Thus, if one uses tβ as a figure of merit, the re-expression of the inhibitor concentrations is unnecessary.

Example calculations of t-statistics

A two parameter logistic fit to a randomly chosen IC50 data set returned the values of α and β and their standard errors shown in Table 1. Also shown are the calculated values for the ln{IC50} (in the units of inhibitor concentration used in the fit), tα, tβ, and tαβ.

Table 1

Example of calculation of t-statistics.

Parameter	α	SE_α	β	SE_β	ln{IC₅₀} (mol/L)	t_α	t_β	t_αβ
Definition	IC₅₀ locator (equation 2)	Standard error of α estimate (equation 8)	Slope factor (equation 1)	Standard error of β estimate (equation 8)	−α/β (equation 2)	α/SE_α (equation 8)	β/SE_β (equation 8)	sqrt{t_α t_β} (equation 9)
Example	11.3	0.756	0.9	0.0599	−12.54	14.95	15.05	15.00
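The arithmetic behind Table 1 can be reproduced with a short Python sketch. The function name is hypothetical, and the inputs are the rounded values printed in the table, so the results differ slightly in the last digits from the tabulated t-statistics, which were presumably computed from unrounded estimates.

```python
import math

def t_statistics(alpha, se_alpha, beta, se_beta):
    """Return (t_alpha, t_beta, t_alpha_beta) for a two parameter logistic fit.
    Assumes alpha and beta have the same sign, so the square root is real."""
    t_alpha = alpha / se_alpha
    t_beta = beta / se_beta
    t_alpha_beta = math.sqrt(t_alpha * t_beta)   # geometric mean of the two
    return t_alpha, t_beta, t_alpha_beta

# Rounded values from the "Example" row of Table 1 (concentrations in mol/L).
t_a, t_b, t_ab = t_statistics(11.3, 0.756, 0.9, 0.0599)
ln_ic50 = -11.3 / 0.9

# Re-expressing the inhibitor concentration in umol/L shifts alpha by
# -beta * ln(1e6), which brings it close to zero for this example.
alpha_umol = 11.3 - 0.9 * math.log(1e6)

print(round(t_a, 2), round(t_b, 2), round(t_ab, 2), round(alpha_umol, 2))
```

The last line illustrates why the fits were performed in mol/L: after conversion to μmol/L the same data give an α of roughly −1.1, so a t-statistic built on α would be small even for a well-determined curve, whereas β is unchanged.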


Results

A fraction of the transport inhibition curves collected by the P-gp IC50 initiative contained problematic data and the associated IC50 values were consequently excluded from further data analysis. Several types of experimental error affected the quality of the IC50 fit: experimental variability in the transport measurement at each inhibitor concentration, deviation from the sigmoidal curve of all replicates at a specific inhibitor concentration and an insufficient inhibitor concentration range, leading to poorly defined “no inhibition” or “complete inhibition” plateaus. In this work, simulations were performed to quantitatively assess the effect of the magnitude of the error on the confidence in the fitted IC50 value. The magnitude and frequency of the simulated error was based on that found in the P-gp IC50 initiative data set.

Sensitivity of fits to random homogeneous error in the data at each inhibitor concentration

Figure 1A shows the simulated error-free IC50 curve for inhibition of digoxin transport in the B>A direction. Error-free transport activity was interpolated from this ideal curve at seven inhibitor concentrations spaced evenly on a logarithmic scale (constant ratio between adjacent concentrations). These ideal data were fitted to a logistic IC50 equation (IC50 = 1.63 μmol/L). Figure 1B shows the effect of adding random homogeneous error to the transport measured at each of the seven inhibitor concentrations by Monte Carlo simulation as described in the Materials and Methods section. A total of 10,000 distinct inhibition curves with added error were generated for each level of added error, with a few examples illustrated in Figure 1B. Each of these 10,000 curves was fitted to a two and a three parameter model, and IC50 values and slope factors were determined. For each of the fits the conventional statistical parameters were calculated (standard error of the IC50, RMSE, and correlation coefficient, r2, of the fit), as well as the novel t-statistic parameters tα, tβ, and tαβ. Most of the simulations described in this work were performed with seven inhibitor concentrations, as was the case in Bentz et al. (2013). In some cases a larger number of inhibitor concentrations was used for comparison (see Data S1). All three potential t-statistic parameters (tα, tβ, and tαβ) for the IC50 fits of the simulated data were highly correlated with one another; minimum r2 > 0.99995 for two parameter fits. For three parameter fits the r2 value for the relation between two of the statistics fell as low as 0.98, because of a small number of cases where the algorithm overestimated the range (eq. 2) and the two statistics diverged. Thus, especially for two parameter fits, aside from computational convenience (as indicated in the Materials and Methods section), there is little reason to prefer one measure over the others. In this work we have presented tαβ as the t-statistic, but tα and tβ gave equivalent results.
Figure 2 shows that adding random homogeneous error to each transport measurement increased the deviation in estimates of both the log{IC50} (Fig. 2A) and the slope factor of the IC50 relationship (Fig. 2B) from the ideal value estimated from error-free data (represented by the value “0” on the respective y-axes). For most values of tαβ (tαβ > 2) that deviation was centered on the ideal IC50 and slope values estimated from the error-free data. Larger introduced homogeneous errors were also associated with larger standard errors in the fitted log{IC50} and larger RMSE (standard error of the residuals) and thus smaller r2 (correlation coefficient) of the fit (Data S1).

Figure 2

Relation of tαβ to deviation of log(IC50) of error-contaminated data from the ideal value of the log(IC50) (A), and to deviation of the slope (β) parameter from equation (2) (B). For error-contaminated data 10,000 simulated data sets were created for each value of added error (standard error/range). All data sets shown here used seven equally spaced values of log{[inhibitor]}.

In the error sensitivity simulations, tαβ was smoothly and monotonically related to the magnitude of the introduced random homogeneous error at each inhibitor concentration (SE/range, or % SE). Figure 2 shows that for both the log{IC50} and the slope factor, the greater the introduced error, the greater the deviation from the ideal log{IC50} or slope value and the lower the value of tαβ. tαβ was also smoothly and monotonically related to the standard error for each estimated IC50 and to the RMSE and r2 for the fit (Data S1).

Relationship between tαβ and the confidence in the IC50 value: effect of the magnitude of the error on the confidence interval

Figure 3 shows the simulated probability distributions for the error in the IC50 for several values of tαβ, determined utilizing a kernel-based technique (Martinez and Martinez 2007). For a tαβ value of 7, there is a slightly greater than 95% probability that the fitted IC50 is within twofold of the true IC50. For a tαβ value of 5, there is a slightly greater than 95% probability that the fitted IC50 is within threefold of the true IC50. For a tαβ of 3, the probability that the fitted IC50 is within threefold of the true IC50 is 90%.
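A probability statement of this kind can be approximated directly from simulated fits by counting how many fitted IC50 values fall within a given fold of the true value. The Python sketch below uses a plain empirical fraction, which is a simplification of the kernel-based density estimate used in the paper; the function name and the example values are hypothetical.

```python
import math

def fraction_within_fold(fitted_ic50s, true_ic50, fold):
    """Fraction of fitted IC50 values within `fold`-fold of the true IC50,
    judged symmetrically on the log scale (e.g. fold=2 covers 0.5x to 2x)."""
    limit = math.log(fold)
    hits = sum(1 for x in fitted_ic50s
               if abs(math.log(x / true_ic50)) <= limit)
    return hits / len(fitted_ic50s)

# Hypothetical fitted values (umol/L) from simulated data sets that share
# a similar t-statistic; the true value is taken from the error-free fit.
fitted = [1.2, 1.9, 2.6, 0.9, 4.1, 1.5]
within_twofold = fraction_within_fold(fitted, 1.63, 2.0)
within_threefold = fraction_within_fold(fitted, 1.63, 3.0)
```

Applied to bins of simulated fits grouped by their t-statistic, this gives an estimate of the confidence level attached to each fold-error bound.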

Figure 3

Relationship between the maximum fold error of the fitted IC50 value, the confidence that the maximal fold error is less than a certain value, and the tαβ for the IC50 fit.

Sensitivity of the IC50 fits to transformations of transport activity data prior to IC50 fitting

Several mathematical transformations of transport activity data have been used (reviewed in Balimane et al. 2008 and in Lumen et al. 2010). Analyses in Bentz et al. (2013) suggested that these activity transformations fell into three empirically defined groups, with different members of the same group yielding highly correlated results. Thus, one member of each of the three groups was examined, equations 5–7. In equation 5, unidirectional B>A transport activity is calculated by subtracting B>A transport in the presence of a prototypical P-gp inhibitor from B>A transport in the absence of inhibitor. In equation 6, net secretory transport is calculated by subtracting transport in the A>B direction from that in the B>A direction. In equation 7, transport activity is expressed as the efflux ratio by dividing B>A transport by A>B transport. In each case, the transport activity at the various inhibitor concentrations is then expressed as a fraction of transport activity in the absence of inhibitor. The IC50 fit is then performed on this transformed transport activity versus inhibitor concentration curve. To investigate the effect of data transformation prior to IC50 fitting on the quality of the IC50 fit, transport activity was calculated for 10,000 simulated data sets according to each of the three transport inhibition equations. For the net secretory transport and the efflux ratio equations, simulations in both the B>A and A>B directions were required. The tαβ was calculated for each of the fitted IC50 values and compared with the tαβ obtained for logistic fits of untransformed data (eq. 3). The line of unity in Figure 4 indicates identical tαβ values for fits performed on untransformed and transformed data. The dots falling below this line indicate a lower tαβ for the fits performed on transformed data. The different colors represent different levels of simulated homogeneous error. The unidirectional B>A transformation (Tang et al. 2002) yielded tαβ values similar to, but slightly smaller than, the native fit (Fig. 4A). Errors in estimated log{IC50} were very slightly, but significantly (via ANCOVA and paired t-test), larger than those using the same data sets via native fit (Fig. 5). The net secretory flux (Choo et al. 2000) and efflux ratio (Balimane et al. 2008) transformations yielded noticeably lower tαβ values (Fig. 4B and C) and significantly larger (and more variable; ANCOVA, paired t-test) errors in estimated log{IC50} (Fig. 5) than native logistic fits using the same data sets. Interestingly, the net secretory flux transformation yielded a larger tαβ than the native logistic fits in a small minority of cases. However, since the vast majority of fits had smaller tαβ values, we did not investigate this further. The efflux ratio (eq. 7) transformation yielded almost uniformly, and significantly, smaller tαβ values.
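The three classes of transformation described above can be written as small Python functions. This sketch follows the verbal description only: the exact algebraic forms of equations 5–7 are not reproduced in the text, so the normalizations used here (in particular, dividing the efflux ratio by its no-inhibitor value without subtracting 1) are assumptions, and all function names and numbers are hypothetical.

```python
def unidirectional(ba_i, ba_negcon, ba_poscon):
    """B>A flux minus the positive-control B>A flux, as a fraction of the
    same difference in the absence of inhibitor (assumed form of eq. 5)."""
    return (ba_i - ba_poscon) / (ba_negcon - ba_poscon)

def net_secretory_flux(ba_i, ab_i, ba_negcon, ab_negcon):
    """B>A minus A>B flux, as a fraction of the no-inhibitor value
    (assumed form of eq. 6)."""
    return (ba_i - ab_i) / (ba_negcon - ab_negcon)

def efflux_ratio(ba_i, ab_i, ba_negcon, ab_negcon):
    """B>A divided by A>B flux, as a fraction of the no-inhibitor value
    (assumed form of eq. 7)."""
    return (ba_i / ab_i) / (ba_negcon / ab_negcon)

# Hypothetical flux values in arbitrary units.
ba_negcon, ab_negcon, ba_poscon = 10.0, 1.0, 2.0
ba_i, ab_i = 6.0, 1.5

print(unidirectional(ba_i, ba_negcon, ba_poscon))          # 0.5
print(net_secretory_flux(ba_i, ab_i, ba_negcon, ab_negcon))  # 0.5
print(efflux_ratio(ba_i, ab_i, ba_negcon, ab_negcon))      # 0.4
```

Each transformed value combines two to four separately measured fluxes, each carrying its own error, which is consistent with the error propagation and lower t-statistics reported for the net secretory flux and efflux ratio transformations.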

Figure 4

Comparison of tαβ obtained using data transformation equations (eq. 5–7) to tαβ obtained using native two parameter logistic fits using the same simulated data sets. Colors signify different levels of error added to ideal data (standard error/range). Dashed lines are lines of equal tαβ values, that is, lines of unity. For each value of added error (SE/range), 10,000 simulated data sets were created. All data sets shown here used seven equally spaced values of log{[inhibitor]}.

Figure 5

Comparison of absolute value of estimation error (mean ± 95% confidence interval) in log{IC50} estimates using data transformation equations (eq. 5–7) and native two parameter logistic model fit on the same data sets with varying levels of added random error for each measurement (SE/range). Estimation error = estimate − log{IC50} from ideal, noise-free data.

Sensitivity of heterogeneous error to deviation from sigmoidicity

As described earlier, adding random homogeneous error to ideal data at all inhibitor concentrations led to a broader range of log{IC50} and slope estimates and smaller t-statistics (Fig. 2), as well as larger SE for the log{IC50} and larger RMSE values for the fits. Another relatively common observation in the empirical data collected in Bentz et al. (2013) was deviation from sigmoidicity of all replicates at a particular inhibitor concentration (heterogeneous error, exemplified by replacement of red symbols by blue symbols at two inhibitor concentrations in Figure 6).

Figure 6

Example of simulated data set created by adding noise to idealized data. Homogeneous error, with SE = 10% of full range, is added to each of three replicate “measurements” at each inhibitor concentration (red data points). Heterogeneous error is added to 25% of the points (blue data points, here at inhibitor concentrations 0.8 and 20 μmol/L), by shifting all three data points up or down by 15% of the full range, simulating problems with measurement at that concentration.

Figure 7 compares the effect of homogeneous error at each inhibitor concentration alone to the presence of both homogeneous and heterogeneous error on the IC50 estimate and tαβ. For each error type, 10,000 simulated data sets were generated for each value of added error (SE/range) as shown in the box plots (Fig. 7). Box plots in red give the distribution of values from simulated data sets when homogeneous error alone is added to the transport measurement at each inhibitor concentration. The blue box plots represent addition of homogeneous error as described earlier plus heterogeneous error at 25% of inhibitor concentrations chosen at random. Both types of error had similar effects, that is, increased variance in IC50 (Fig. 7A) and decreased tαβ as the magnitude of the error increased (Fig. 7B). The effect of heterogeneous error (deviation from sigmoidicity) was most obvious when homogeneous error was small (e.g., SE/range of 0.02). As homogeneous error became large, it obscured the effects of the heterogeneous error (e.g., SE/range of 0.2).

Figure 7

Effects of added error on estimated IC50 (A, ideal value = 1.637 μmol/L) and t (B). Box plots in red give the distribution of values from 10,000 simulated data sets for each magnitude of added error (standard error/range) when homogeneous error is added to each measurement. For plots in blue, homogeneous error is added to all measurements and heterogeneous error is added at 25% of inhibitor concentrations chosen at random. All data sets shown here used seven equally spaced values of log{[inhibitor]}. Box plots are slightly displaced horizontally for readability. Horizontal lines at the bottom, middle, and top of each rectangle are at the first, second (median), and third quartiles of the variable. The cross is at the mean of the variable. Whiskers extend to data points up to a distance of 1.5 times the interquartile range from the first and third quartiles. Values beyond the range of the whiskers are “outliers” shown as individual crosses.

Sensitivity of error contaminated fits to missing segments of the IC50 curve

Another observation from the data collected in Bentz et al. (2013) was sparse data to describe the “no inhibition” or “complete inhibition” plateaus. This scenario was simulated by removing different data segments (inhibitor concentrations) from the error-contaminated data. As shown in Figure 1A, the sigmoidal curve is divided into six sections from left to right: negative control plateau, negative control shoulders 1 and 2, linear segment, positive control shoulder, and positive control plateau. A six digit binary code identifies which segments are included in the IC50 curve (1 included, 0 excluded). The simulations were performed for ideal data with homogeneous error added to each measurement (SE/range of 0.05) and added deviation from sigmoidicity (heterogeneous error) at 25% of inhibitor concentrations (this case is also one of the simulated scenarios in Figure 7: blue box plot for SE/range of 0.05). Eliminating points in the nearly linear segment spanning the IC50 (the scenario on the extreme right in Figure 8) led to a large increase in the variance of the estimated IC50 values among replicate simulations, but no shift in the average log{IC50}. In contrast, eliminating lower inhibitor concentrations led to systematic errors in log{IC50}. When segment 1, segments 1 and 2, or segments 1, 2, and 3 were missing, the IC50 was overestimated by approximately 1.5-, two-, and threefold on average, respectively (Fig. 8A). For each of these scenarios, these IC50 values were significantly different from each other (P < 0.0001 by Tukey post hoc comparisons after ANOVA). Eliminating the negative control plateau and shoulders also led to an underestimation of the negative control transport activity (Fig. 8B). The estimated log{IC50} was strongly correlated (r² = 0.77, P < 0.0001) with the estimated transport activity at the negative control plateau, indicating that the underestimation of the negative control plateau value results in misestimation of the IC50 value.
Removing the positive control plateau and/or shoulder had little effect on the estimated IC50 in our simulations (Fig. 8). The different sensitivity of the log{IC50} estimate in this scenario was associated with different effects on estimated activity at the positive versus negative controls. Omitting the positive control plateau and shoulders did not increase the estimated minimum transport activity (data not shown). Because all laboratories reporting data for Bentz et al. (2013) used and reported a separate positive control that completely inhibited P-gp transport, minimum transport at high inhibitor concentration could still be estimated even when those high inhibitor concentrations were not included in the fit. Our fitting routines in Bentz et al. (2013) took advantage of this information to set the minimum activity. The simulations were performed in the same way. Hence, omitting the positive control plateau (and shoulder) did not affect the minimum estimated transport activity or the IC50 estimate. Consequently, the estimated log{IC50} was poorly correlated (r² = 0.01) with the estimated transport activity at the positive control plateau. All data in Figure 8 used a small homogeneous measurement error added to each simulated activity (SE/range = 0.05) as well as heterogeneous error added to 25% of the inhibitor concentrations. Larger values of random homogeneous error added to the data produced more variability in the IC50 estimate (Data S1).
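The six digit inclusion code used above can be sketched as a simple mask. This is an illustrative Python sketch only: the helper names are hypothetical, and the assignment of each data point to a segment (defined in Figure 1A) is assumed to be supplied by the caller as an index from 0 to 5.

```python
SEGMENTS = (
    "negative control plateau",
    "negative control shoulder 1",
    "negative control shoulder 2",
    "linear segment",
    "positive control shoulder",
    "positive control plateau",
)

def _check(code):
    """Reject anything that is not a string of six 0/1 characters."""
    if len(code) != len(SEGMENTS) or set(code) - {"0", "1"}:
        raise ValueError("code must be six characters, each 0 or 1")

def included_segments(code):
    """Names of the segments kept by a code such as '011111' (left to right)."""
    _check(code)
    return [name for name, bit in zip(SEGMENTS, code) if bit == "1"]

def filter_points(points, code):
    """Keep (segment_index, concentration, activity) tuples whose segment is included."""
    _check(code)
    keep = {i for i, bit in enumerate(code) if bit == "1"}
    return [p for p in points if p[0] in keep]
```

Under this convention, `included_segments("000111")` keeps only the linear segment and the two positive control segments, which corresponds to the scenario in which segments 1, 2, and 3 are missing.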

Figure 8

Effect of omitting segments of data from the fit on estimated IC50 (A) and estimated transport activity without inhibitor (B). Each box plot shows the distribution of values with different segments included and excluded. Segments are those identified in Figure 1A. A six digit binary code identifies which segments are included in the plot (1 included, 0 excluded). Segments left-to-right were negative control plateau, negative control shoulders 1 and 2, linear segment, positive control shoulder, and positive control plateau. All data shown here have homogeneous error added to each measurement (standard error/range = 0.05) and heterogeneous error added at 25% of inhibitor concentrations.

The effect of missing data segments was one of the few ways in which two parameter (α, β) and three parameter (α, β, range) logistic estimates of IC50 differed strongly. For two parameter fits, omitting data at the lowest inhibitor concentrations (negative control plateau and shoulders) lowered the estimated maximal transport activity and hence led to overestimation of IC50 (Fig. 9). In contrast, for three parameter fits, omitting lower inhibitor concentrations increased the maximum transport activity estimated by the logistic model and led to underestimation of IC50 (Fig. 9). Both effects were large compared to those seen in two parameter fits.

Figure 9

Differential effects of missing data segments on IC50 estimates and estimated activity at the negative control plateau using two and three parameter logistic models to estimate IC50. Included segment code as in Figure 1A.

Discussion

A challenge faced by the P-gp IC50 consortium was the variable quality of data generated by participating laboratories. The t-statistic was developed in Bentz et al. (2013) as an objective statistical tool to eliminate data sets of poor quality. That statistic was calibrated based on visual inspection of IC50 curves. The simulations conducted herein explore factors that play a significant role in estimating robust IC50 values, such as (1) measurement error, (2) transformations of transport activity data prior to IC50 fitting, (3) segments of the inhibition curve missing from the data, and (4) the use of two or three parameter logistic models. Furthermore, in this work the t-statistic is validated by comparing it with traditional estimators of the quality of fit. Finally, calibration of the t-statistic (beyond visual inspection of IC50 curves) is provided here through calculation of associated confidence intervals.

The data quality issues encountered in the P-gp IC50 initiative highlighted the need for a simple measure of the quality of fit that can be applied uniformly across the pharmaceutical industry. With this criterion in mind, traditional estimators of the quality of fit such as the sum of squared errors and RMSE were rejected because of their dependence on the units of the x- or y-axes of the inhibition curve (inhibitor concentration or transport activity, respectively), dependence on scale (ln{IC50}, log{IC50}, or IC50), dependence on the number of fitted parameters (two, three, or four parameter fits will return different r² values for the same data), or uncertainty about how the estimate is calculated in different packages, for example, SE(IC50). The t-statistic does not depend on how the data are expressed (units of inhibitor concentration, use of the full range of probe substrate transport versus setting that range to between 0 and 1) and can therefore be easily implemented as a uniform quality criterion for IC50 fits across the pharmaceutical industry.
The t-statistics presented herein were constructed because they are related to the standard Wald test statistics supplied in R, S-PLUS, and SAS to test the statistical significance of fitted parameters, and because they focus on the two parameters (α, β) that determine log{IC50}. Because some of the software packages used to fit logistic models provide neither α nor its standard error, a statistic based exclusively on β was also evaluated. The simulations showed that the statistic based on both α and β was very highly correlated with the statistic based on β alone, except in a few cases of three parameter fits with t < 5 (data not shown). Thus, for two parameter fits, there is no statistical reason to prefer one statistic over the other, and the choice can be made based on convenience. Both statistics were monotonically, smoothly, and tightly correlated with other measures of fit quality, including the potential range of errors in log{IC50} and slope factor among simulations (Fig. 2A and B), RMSE, r², and the estimated standard error of log{IC50} for any given fit (Data S1), and the likelihood of errors in IC50 estimates (Fig. 3). Note that a small value of t was associated with increased variance of log{IC50} estimates (decreased precision of the estimate, Figure 2A and Data S1) and therefore with a higher probability of an error of a specified size (Fig. 3). However, the average or expected (systematic) error stayed small (Data S1). Random error added to the ideal data degraded the precision of estimates of both log{IC50} and the slope (β) (Fig. 2A and B), and caused deterioration in all measures of the quality of the fit, including the root mean square error (RMSE), the predicted standard error of the log{IC50} estimate, and t (Data S1). When the added error exceeded 20% of the full range of transport values (SE/range of 0.2), or when t fell below 3–5, variance in estimates increased rapidly and the quality-of-fit statistics deteriorated (Fig. 2 and Data S1).
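To make the link to Wald statistics concrete, the following Python sketch fits a two parameter logistic by Gauss-Newton least squares and reports the generic Wald ratio for the slope, |β|/SE(β). It is a sketch under stated assumptions, not the definition used in Bentz et al. (2013), which is not reproduced in this section: the parameterization activity = range/(1 + exp(α + βx)) with x = log10 of inhibitor concentration, a minimum activity of 0, and all function names are assumptions.

```python
import math

def logistic(x, alpha, beta, full_range=1.0):
    """Assumed two parameter logistic in x = log10[inhibitor]; minimum activity 0."""
    u = max(-700.0, min(700.0, alpha + beta * x))  # keep exp() from overflowing
    return full_range / (1.0 + math.exp(u))

def _normal_equations(xs, ys, alpha, beta, full_range):
    """Accumulate J'J and J'r, where J is the Jacobian and r the residuals."""
    a11 = a12 = a22 = g1 = g2 = 0.0
    for x, y in zip(xs, ys):
        f = logistic(x, alpha, beta, full_range)
        d = -f * (1.0 - f / full_range)  # df/d(alpha); df/d(beta) is d * x
        r = y - f
        a11 += d * d
        a12 += d * d * x
        a22 += d * d * x * x
        g1 += d * r
        g2 += d * x * r
    return a11, a12, a22, g1, g2

def fit_two_parameter(xs, ys, full_range=1.0, max_iter=100):
    """Least-squares fit of (alpha, beta); returns estimates, SEs, and Wald ratio."""
    def sse(a, b):
        return sum((y - logistic(x, a, b, full_range)) ** 2 for x, y in zip(xs, ys))

    # Start from a slope factor of 1, centred on the point nearest half activity.
    x_mid = min(zip(xs, ys), key=lambda p: abs(p[1] - 0.5 * full_range))[0]
    beta = math.log(10.0)
    alpha = -beta * x_mid
    current = sse(alpha, beta)
    for _ in range(max_iter):
        a11, a12, a22, g1, g2 = _normal_equations(xs, ys, alpha, beta, full_range)
        det = a11 * a22 - a12 * a12
        step_a = (a22 * g1 - a12 * g2) / det
        step_b = (a11 * g2 - a12 * g1) / det
        scale = 1.0  # halve the step while it makes the fit worse
        while scale > 1e-8 and sse(alpha + scale * step_a, beta + scale * step_b) > current:
            scale *= 0.5
        alpha += scale * step_a
        beta += scale * step_b
        previous, current = current, sse(alpha, beta)
        if abs(previous - current) <= 1e-14 * (1.0 + previous):
            break
    a11, a12, a22, _, _ = _normal_equations(xs, ys, alpha, beta, full_range)
    det = a11 * a22 - a12 * a12
    s2 = current / (len(xs) - 2)          # residual variance
    se_alpha = math.sqrt(s2 * a22 / det)  # diagonal of s2 * inverse(J'J)
    se_beta = math.sqrt(s2 * a11 / det)
    return {"alpha": alpha, "beta": beta, "se_alpha": se_alpha, "se_beta": se_beta,
            "t_beta": abs(beta) / se_beta, "log10_ic50": -alpha / beta}
```

In this parameterization log{IC50} = −α/β. Changing the concentration units (adding a constant to every x) or rescaling the activity together with `full_range` changes α or the residual scale but leaves β and its Wald ratio unchanged, which is one way to see why a statistic of this kind does not depend on how the data are expressed.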
For a t value of 7, there is a 95% probability that the estimated IC50 value is within twofold of the true value (Fig. 3).

For the determination of IC50 values for inhibition of P-gp-mediated transport, transport activity is typically transformed prior to performing the IC50 fit. Several different data transformations are in use. The unidirectional B>A transport transformation, the simplest of the three evaluated in this work (eq. 5), divides each activity level (minus the minimum activity) by the entire dynamic range of transport activity. This transformation performed nearly as well as the native (untransformed) logistic fit. The t was only slightly smaller (Fig. 4A) and the errors in log{IC50} were slightly larger and more variable (Fig. 5) than the values obtained with the untransformed data. The other two transformations (net secretory flux and efflux ratio, eqs. 6 and 7) are more complicated and incorporate both B>A and A>B transport. In the net secretory flux transformation (eq. 6), A>B transport is subtracted from B>A transport, while the efflux ratio transformation (eq. 7) uses the ratio of the two. In both transformations, the activity at each inhibitor concentration is then divided by the dynamic range. In the simulations performed in this study, both transformations were associated with larger and more variable errors in log{IC50} estimation (Fig. 5) and smaller t values than the native logistic fit (Fig. 4B and C). In short, there is no reason to prefer any of the three activity transformations to native logistic regression in the setting of any of the errors simulated herein. For laboratories participating in Bentz et al. (2013), A>B transport of digoxin was substantially lower than B>A transport, but without a proportionate decrease in measurement error. Thus, it appears likely that inclusion of the noisier A>B transport was responsible for the relatively poor performance of the fits using data prepared with those transformations.
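The three transformations described above can be sketched as follows. Equations 5 to 7 are not reproduced in this section, so this Python sketch follows only the verbal description; the function names are hypothetical, and it is assumed that the first value in each list is the no-inhibitor control, the last is the fully inhibited control, and the minimum is subtracted before dividing by the dynamic range in all three cases.

```python
def scale_to_range(values, uninhibited, fully_inhibited):
    """Subtract the minimum activity and divide by the dynamic range."""
    span = uninhibited - fully_inhibited
    return [(v - fully_inhibited) / span for v in values]

def unidirectional(b_to_a):
    """B>A transport only, scaled to the range 0 to 1."""
    return scale_to_range(b_to_a, b_to_a[0], b_to_a[-1])

def net_secretory_flux(b_to_a, a_to_b):
    """B>A minus A>B at each inhibitor concentration, then scaled."""
    flux = [b - a for b, a in zip(b_to_a, a_to_b)]
    return scale_to_range(flux, flux[0], flux[-1])

def efflux_ratio(b_to_a, a_to_b):
    """B>A divided by A>B at each inhibitor concentration, then scaled."""
    ratio = [b / a for b, a in zip(b_to_a, a_to_b)]
    return scale_to_range(ratio, ratio[0], ratio[-1])
```

Because the efflux ratio divides by A>B transport, any error in a small A>B value is carried into every scaled point that depends on it, whereas the unidirectional transformation never touches the A>B measurements.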
Errors for efflux ratios in particular can be large due to small A>B values in the denominator.

One of the common problems identified for data sets in Bentz et al. (2013) was an insufficient number of data points at either of the plateaus of the inhibition curve. The consortium elected to use six inhibitor concentrations in addition to a no-inhibitor control and a positive inhibitor control to manage the resources required to generate this data set. Since IC50 values for the selected inhibitors had been published, these values could serve as a reference point for choosing an appropriate inhibitor concentration range. Due to the unanticipated wide range in IC50 values, the concentration ranges chosen based on published IC50 values were not always optimal. Simulations revealed that (1) omission of inhibitor concentrations at the negative control plateau (low inhibitor concentration) could lead to average two- to threefold overestimates of IC50, even with random homogeneous error smaller than that typically seen in data analyzed by Bentz et al. (2013), and (2) those misestimations corresponded to underestimation of the transport activity at the missing plateau (Fig. 8). When the maximal transport activity at the plateau was not represented in the data set, the algorithm for two parameter logistic fits underestimated activity at the plateau, lowering estimated activity at the IC50 and thus overestimating the IC50. The converse did not occur at the positive control (high inhibitor concentration) plateau because labs participating in Bentz et al. (2013) used separate positive controls that fully suppressed P-gp transport activity. In the data used for Bentz et al. (2013), most laboratories used six inhibitor concentrations, a no-inhibitor control, and the separate positive control mentioned earlier.
As mentioned earlier, the six intermediate inhibitor concentrations often failed to approach the plateaus, and sometimes missed the central, linear portion of the curve, leading Bentz et al. (2013) to recommend increasing the number of inhibitor concentrations used to construct an inhibition curve to 8–12. One alternative strategy for highly soluble compounds is to do a “dose ranging” trial with inhibitor concentrations running from 0.001 to 1000 μmol/L in 10× steps to find a preliminary IC50. One could then use 4–6 intermediate inhibitor concentrations around the putative IC50 to pin down the linear part of the curve.

The simulations showed that two and three parameter models performed similarly for fitting IC50 values as long as the magnitude of the added error was relatively small (SE/range < 0.02) and all important segments of the inhibition curve were represented in the data. With more measurement noise or missing segments of the curve, however, three parameter fits yielded log{IC50} estimates that sometimes deviated strongly from the two parameter fits (Fig. 9 and Data S1), and t values decreased (with the two statistics occasionally diverging from one another). Thus, the two parameter fits seem more tolerant and enable generation of IC50 values meeting the t-statistic criterion.

In Bentz et al. (2013), the cutoff value for the t-statistic was determined by group consensus after visual inspection of all fits. Fits with t < 3 were excluded. For t > 3, there is a 95% probability that the IC50 value generated by an IC50 initiative participant differs less than fourfold from the true IC50 value for the particular system used. The actual fold difference observed between participants in the P-gp IC50 initiative was much greater than this (with 13 of the 15 inhibitors investigated showing a difference between highest and lowest IC50 values of at least 20-fold).
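The dose-ranging strategy mentioned above (0.001 to 1000 μmol/L in 10× steps, followed by 4–6 concentrations around the putative IC50) can be sketched in Python as follows. The function names are hypothetical, and the width of the follow-up window (one decade on either side of the preliminary IC50) is an assumption, since the text does not specify it.

```python
import math

def dose_ranging_concentrations(low_exp=-3, high_exp=3):
    """Concentrations (umol/L) from 10**low_exp to 10**high_exp in 10x steps."""
    return [10.0 ** e for e in range(low_exp, high_exp + 1)]

def follow_up_concentrations(prelim_ic50, n_points=6, half_width_log10=1.0):
    """Log-spaced concentrations centred on a preliminary IC50.

    half_width_log10 is a hypothetical design choice (one decade either side).
    """
    if n_points < 2:
        raise ValueError("need at least two concentrations")
    centre = math.log10(prelim_ic50)
    step = 2.0 * half_width_log10 / (n_points - 1)
    return [10.0 ** (centre - half_width_log10 + i * step) for i in range(n_points)]
```

The first function yields seven concentrations for the range-finding run; the second places the follow-up points symmetrically in log space so that the linear part of the curve is sampled on both sides of the preliminary IC50.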
While this work analyzes the variation in the IC50, it would be desirable to know the variation in the inhibitor's P-gp dissociation constant KI, which depends on the intracellular concentration of drug and inhibitor (Lumen et al. 2010, 2013; Chu et al. 2013; Bentz and Ellens 2014). Direct measurement of intracellular concentration, defined as the cytosolic concentration of free unbound compound, is difficult (Chu et al. 2013). Fitting the KI using transport kinetics requires that the intracellular concentration be an explicit variable of the kinetic model (Tran et al. 2005; Sun and Pang 2008; Tachibana et al. 2010; Agnani et al. 2011; Korzekwa et al. 2012; Lumen et al. 2013; Bentz and Ellens 2014). Fitting the data published in Bentz et al. (2013) using the kinetic model in Lumen et al. (2013) to obtain KI estimates is currently underway.

In conclusion, the analysis herein provides a quantitative assessment of critical data quality factors that contribute to the reliability of the fitted IC50 value (error in replicates of transport activity, segments of the inhibition curve missing from the fit, data transformation prior to fitting, and two or three parameter fits). Furthermore, a t-statistic was calibrated (both the form based on α and β and the form based on β alone) to provide a measure of confidence in the fitted IC50 value. Since IC50 values are used in conjunction with drug concentrations to assess the risk of a DDI, it is important to provide a statistical measure of the confidence in the value of the IC50. The statistic, which is based on the standard errors of the fitted parameters, is easily available in common software packages and is estimated similarly in different software packages, unlike the standard error of the IC50 estimate. Therefore, t can be easily implemented as a uniform statistical criterion for IC50 fits across the pharmaceutical industry. A value of 7 for either statistic is required for 95% probability that the fitted IC50 value will be within twofold of the true value.
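Probability statements of this kind (for example, 95% of fitted IC50 values within twofold of the true value) can be checked empirically against a set of simulated fits. The short Python sketch below shows one way to compute such a fraction; the function name is hypothetical and the calculation is a generic illustration, not the authors' calibration procedure.

```python
import math

def fraction_within_fold(estimates, true_value, fold=2.0):
    """Fraction of IC50 estimates within `fold` of the true value, above or below."""
    limit = math.log(fold)
    hits = sum(1 for e in estimates if abs(math.log(e / true_value)) <= limit)
    return hits / len(estimates)
```

Working on the log scale treats a twofold overestimate and a twofold underestimate symmetrically, which matches the way fold errors are reported in the text.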

24 in total

1. The elementary mass action rate constants of P-gp transport for a confluent monolayer of MDCKII-hMDR1 cells.

Authors: Thuy Thanh Tran; Aditya Mittal; Tanya Aldinger; Joseph W Polli; Andrew Ayrton; Harma Ellens; Joe Bentz
Journal: Biophys J Date: 2004-10-22 Impact factor: 4.033

2. Kinetic considerations for the quantitative assessment of efflux activity and inhibition: implications for understanding and predicting the effects of efflux inhibition.

Authors: J Cory Kalvass; Gary M Pollack
Journal: Pharm Res Date: 2006-12-27 Impact factor: 4.200

3. Permeability, transport, and metabolism of solutes in Caco-2 cell monolayers: a theoretical study.

Authors: Huadong Sun; K Sandy Pang
Journal: Drug Metab Dispos Date: 2007-10-11 Impact factor: 3.922

4. Refining the in vitro and in vivo critical parameters for P-glycoprotein, [I]/IC50 and [I2]/IC50, that allow for the exclusion of drug candidates from clinical digoxin interaction studies.

Authors: Jack A Cook; Bo Feng; Katherine S Fenner; Sarah Kempshall; Ray Liu; Charles Rotter; Dennis A Smith; Matthew D Troutman; Mohammed Ullah; Caroline A Lee
Journal: Mol Pharm Date: 2010-04-05 Impact factor: 4.939

5. Review of P-gp inhibition data in recently approved new drug applications: utility of the proposed [I(1) ]/IC(50) and [I(2) ]/IC(50) criteria in the P-gp decision tree.

Authors: Sheetal Agarwal; Vikram Arya; Lei Zhang
Journal: J Clin Pharmacol Date: 2013-02 Impact factor: 3.126

6. A structural model for the mass action kinetic analysis of P-gp mediated transport through confluent cell monolayers.

Authors: Joe Bentz; Harma Ellens
Journal: Methods Mol Biol Date: 2014

7. Model analysis of the concentration-dependent permeability of P-gp substrates.

Authors: Tatsuhiko Tachibana; Satoshi Kitamura; Motohiro Kato; Tetsuya Mitsui; Yoshiyuki Shirasaka; Shinji Yamashita; Yuichi Sugiyama
Journal: Pharm Res Date: 2010-02-05 Impact factor: 4.200

8. A regulatory viewpoint on transporter-based drug interactions.

Authors: L Zhang; Y D Zhang; J M Strong; K S Reynolds; S-M Huang
Journal: Xenobiotica Date: 2008-07 Impact factor: 1.908

9. Unexpected effect of concomitantly administered curcumin on the pharmacokinetics of talinolol in healthy Chinese volunteers.

Authors: He Juan; Bernd Terhaag; Zang Cong; Zhang Bi-Kui; Zhu Rong-Hua; Wang Feng; Su Fen-Li; Song Juan; Tang Jing; Peng Wen-Xing
Journal: Eur J Clin Pharmacol Date: 2007-04-28 Impact factor: 3.064

10. Transport inhibition of digoxin using several common P-gp expressing cell lines is not necessarily reporting only on inhibitor binding to P-gp.

Authors: Annie Albin Lumen; Libin Li; Jiben Li; Zeba Ahmed; Zhou Meng; Albert Owen; Harma Ellens; Ismael J Hidalgo; Joe Bentz
Journal: PLoS One Date: 2013-08-16 Impact factor: 3.240