SDRS--an algorithm for analyzing large-scale dose-response data.

Rui-Ru Ji, Nathan O Siemers, Ming Lei, Liang Schweizer, Robert E Bruccoleri.

Abstract

SUMMARY: Dose-response information is critical to understanding drug effects, yet analytical methods for dose-response assays cannot cope with the dimensionality of large-scale screening data such as microarray profiling data. To overcome this limitation, we developed and implemented the Sigmoidal Dose Response Search (SDRS) algorithm, a grid search-based method designed to handle large-scale dose-response data. This method not only calculates the pharmacological parameters for every assay, but also provides built-in statistics that enable downstream systematic analyses, such as characterizing dose response at the transcriptome level. AVAILABILITY: Bio::SDRS is freely available from CPAN (www.cpan.org). CONTACTS: ruiruji@gmail.com; bruc@acm.org SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Year:  2011        PMID: 21865301      PMCID: PMC3187656          DOI: 10.1093/bioinformatics/btr489

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


1 INTRODUCTION

Dose–response assays are routinely used in today's pharmaceutical development. Mechanistically, compound:target binding occurs at a single site and follows the law of mass action, which is reflected in the sigmoidal dose–response pattern seen in many assays (Balakrishnan, 1991). In statistics, sigmoidal dose responses can be identified by non-linear regression, a form of regression analysis where the model function is a non-linear combination of the model parameters (Seber and Wild, 1989). Non-linear regression methods such as the well-known Levenberg–Marquardt algorithm involve successive approximations that aim at minimizing an error function (Marquardt, 1963). Despite the general applicability of the iterative non-linear regression methods, they have two limitations when applied to large-scale dose–response screening data. First, the iterative methods do not impose a boundary on the model parameter values, and thus the output model may contain unrealistic or uninterpretable values such as a negative EC50. Second, these methods only calculate the parameter values and fitting statistic for the best model, but do not provide a measure that can be integrated in downstream analyses such as the characterization of transcriptome response (Ji et al., 2009). Ji et al. (2009) recently described a grid search-based algorithm, Sigmoidal Dose Response Search (SDRS), for identifying transcripts that exhibited a sigmoidal dose response to treatment with kinase inhibitors. Since the SDRS algorithm is generic and can be expanded to identify other dose–response patterns in different sources of quantitative data, we have implemented the method as a Perl module with inline C code (Bio::SDRS). We demonstrated the general utility of the method using a dataset from high content screening (HCS).

2 METHOD AND IMPLEMENTATION

Our implementation of the SDRS algorithm includes a typical sigmoidal dose–response model for one-site compound:target interaction, where Y is the assay readout value, X is the dose and the four unknown parameters correspond to the minimal response (A), the maximal response (B), the EC50 (C) and the Hill slope (D). In essence, the SDRS algorithm tests a series of candidate EC50 values (i.e. search doses) across the experimental dose range. Because C is fixed at each search dose, the search at that dose reduces to a grid search over the three remaining parameters (A, B and D) of the one-site model. For transcription profiling data, every probeset on the array is treated as an independent assay for the response of its corresponding transcript, and its expression values at the experimental doses constitute the assay data. We assume that every assay generates a positive readout.

For every assay, the range for the parameter A is determined from the six (by default, or a user-defined number of) lowest readout values, and is set to their mean plus or minus two standard deviations (SD). If the lower boundary is less than zero, it is reset to the minimum of the readouts. The search step for A is one-fourth of the SD. Similarly, the range for the parameter B is determined using the six (by default, or a user-defined number of) highest values, and the step is also one-fourth of the SD. The parameter D can vary between −6.3 and 6.3, with a step of 0.3. (In reality, D can vary from −∞ to +∞. However, when the absolute value of D is >6, additional increments have only a marginal impact on the estimates of the other three parameters.) Placing data-driven limits on parameter values allows SDRS to exclude unusable parameters such as negative EC50 values. At every search dose, the SDRS algorithm evaluates all possible combinations of parameter values and calculates the deviation of the values expected under the dose–response model from the observed data.
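The parameter grids and the per-dose search described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the Bio::SDRS code: the model equation is not reproduced in this text, so the sketch assumes a common four-parameter form, Y = A + (B − A)/(1 + (X/C)^D), and all function and variable names (`sigmoid`, `plateau_grid`, `best_fit_at_dose`, `HILL_GRID`) are hypothetical. It also assumes strictly positive doses and applies the negative-lower-bound reset to both A and B.

```python
import itertools
import statistics

# Hill slope grid from the text: -6.3 to 6.3 in steps of 0.3 (43 values).
HILL_GRID = [round(-6.3 + 0.3 * i, 1) for i in range(43)]


def sigmoid(x, a, b, c, d):
    """ASSUMED four-parameter form; the source does not show its equation."""
    return a + (b - a) / (1.0 + (x / c) ** d)


def plateau_grid(readouts, k=6, highest=False):
    """Candidate values for A (k lowest readouts) or B (k highest readouts)."""
    chosen = sorted(readouts, reverse=highest)[:k]
    mean = statistics.mean(chosen)
    sd = statistics.stdev(chosen)
    if sd == 0:
        return [mean]
    low, high = mean - 2 * sd, mean + 2 * sd
    if low < 0:
        low = min(readouts)  # reset a negative lower boundary
    step = sd / 4  # search step is one-fourth of the SD
    count = int((high - low) / step + 1e-9) + 1
    return [low + i * step for i in range(count)]


def best_fit_at_dose(doses, readouts, search_dose, k=6):
    """Grid search over A, B and D with C fixed at search_dose.

    Returns (sse, a, b, d) for the combination with the smallest
    sum of squared deviations from the observed readouts.
    """
    a_grid = plateau_grid(readouts, k)
    b_grid = plateau_grid(readouts, k, highest=True)
    best = None
    for a, b, d in itertools.product(a_grid, b_grid, HILL_GRID):
        sse = sum(
            (y - sigmoid(x, a, b, search_dose, d)) ** 2
            for x, y in zip(doses, readouts)
        )
        if best is None or sse < best[0]:
            best = (sse, a, b, d)
    return best
```

Calling `best_fit_at_dose` once per search dose gives the series of local best fits per assay. Minimizing the squared deviation here is a simplification; for a fixed number of parameters and dose points it selects the same combination as maximizing an F-statistic built from the same sums of squares.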
The goodness of fit is measured by an F-statistic, F = MSR/MSE, where MSR is the mean square of the variance explained by the model and MSE is the mean square of error (Supplementary Table S1). Assuming that the residuals are normally distributed, the F-statistic follows an F-distribution, F(p − 1, n − p), where n is the number of experimental dose points and p is the number of parameters in the model. For every assay, at every search dose tested, the (local) maximal F-statistic and the corresponding parameter values are recorded.

At the end of the grid search, every assay is associated with a series of F-statistics. An assay is designated as fitted to a dose–response model if its global maximal F-statistic (i.e. best fit) is larger than a predefined critical F value, for example, at P < 0.05. For each assay, the parameter values that gave rise to the global maximal F-statistic define the optimal model. The 95% confidence interval for C (i.e. EC50) is defined as running from the lowest search dose where the local maximal F-statistic is larger than the critical value to the highest search dose that meets the same criterion. Confidence intervals for other model parameters can be found similarly.

One output of SDRS is qualitatively similar to that of an iterative algorithm: each assay is associated with a predicted EC50, P-value and fold-change (i.e. the ratio of B to A). However, SDRS also generates an F-statistic for every assay at each search dose. This output, which is unique to the grid search method, allows for a global characterization and comparison of dose responses (Ji et al., 2009). For example, the F scores at a search dose can be fed to a multiple test correction procedure (such as FDR) to calculate the number of ‘true responses’ at the dose. Repeating this procedure for every search dose across the dose range can uncover peak(s) of response. The F score output also allows for pathway impact mapping and dose–response comparison at the transcriptome level across the dose range.
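The three calculations described above (the F-statistic, the search-dose confidence interval for EC50 and a per-dose FDR count) can be sketched as follows. This is a minimal illustration under stated assumptions, not the published implementation. Supplementary Table S1 is not reproduced here, so the sketch assumes MSR = (SST − SSE)/(p − 1) and MSE = SSE/(n − p). The Benjamini–Hochberg step-up procedure is used as one possible FDR method, and it takes P-values as input, so the conversion from F scores through the F(p − 1, n − p) distribution is not shown. All function names are hypothetical.

```python
import math


def f_statistic(observed, predicted, n_params=4):
    """F = MSR / MSE, under the ASSUMED definitions given in the lead-in."""
    n = len(observed)
    mean_y = sum(observed) / n
    sst = sum((y - mean_y) ** 2 for y in observed)
    sse = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    if sse == 0:
        return math.inf  # perfect fit
    msr = (sst - sse) / (n_params - 1)
    mse = sse / (n - n_params)
    return msr / mse


def ec50_interval(search_doses, local_max_f, critical_f):
    """Lowest and highest search dose whose local maximal F exceeds the cutoff."""
    passing = [d for d, f in zip(search_doses, local_max_f) if f > critical_f]
    if not passing:
        return None
    return (min(passing), max(passing))


def benjamini_hochberg_count(p_values, fdr=0.05):
    """Number of assays called responsive at one search dose (BH step-up)."""
    ranked = sorted(p_values)
    m = len(ranked)
    count = 0
    for rank, p in enumerate(ranked, start=1):
        if p <= fdr * rank / m:
            count = rank  # keep the largest rank that passes
    return count
```

Running `benjamini_hochberg_count` on the P-values of all assays at each search dose in turn, and plotting the counts against dose, is one way to look for the peak(s) of response mentioned in the text.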

3 RESULTS AND DISCUSSION

Herein, we present the SDRS algorithm, which is implemented as a Perl module with inline C code (Bio::SDRS). We applied the algorithm to a dataset from HCS assays that measured programmed cell death using caspase 3, caspase 8 and cytochrome C as readouts in the ovarian cancer cell line OVCAR-4. The SDRS outputs were compared with those generated by XLfit, a software package that implements the Levenberg–Marquardt algorithm (Table 1). XLfit identified 19 dose responses in these assays. SDRS identified the same 19 plus three additional dose responses, which appear to be real (Supplementary Figure S1). There is a gradual increase in cytochrome C readouts as the Compound1 concentration increases. In the case of Compound4, it is likely that the compound also has a dose response, since both the caspase 8 and cytochrome C assays produced high readouts at the highest dose. When both response plateaus are present, the parameter values generated by SDRS are almost identical to those generated by the iterative method (Table 1 and Supplementary Figure S2). However, when one of the curve plateaus is not present, i.e. where A or B is not well defined, the output depends on the behavior of the algorithm used. For example, when the high plateau is missing, the iterative method generates extreme values for B and C, with C often larger than the maximal experimental dose (Table 1 and Supplementary Figure S3). Similarly, when the low plateau is missing, iterative methods may generate negative estimates for A and C. Although there is no ‘right’ solution in these cases, as the data are not sufficient for parameter estimation, SDRS generates more ‘realistic’ estimates because it imposes constraints on the parameter values based on the assay data and the experimental dose range (Table 1).
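
Table 1. Summary of SDRS and XLfit outputs

| Compound | Assay | SDRS P-value | SDRS A | SDRS B | SDRS C (EC50, nM) | SDRS D | XLfit fitted? | XLfit A | XLfit B | XLfit C (EC50, nM) | XLfit D |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Compound1 | Caspase 3 | 7.6E-08 | 4.4 | 92.6 | 9068.3 | −6 | Ok | 3.7 | 113.6 | 11991.5 | −2.2 |
| Compound1 (a) | Caspase 8 | 5.3E-11 | 2.6 | 100.9 | 1528.3 | −1.8 | Ok | 2.5 | 102.5 | 1570.5 | −1.7 |
| Compound1 (b) | Cytochrome C | 1.7E-04 | 11.5 | 88.4 | 9688.3 | −6 | NoFit | | | | |
| Compound2 | Caspase 3 | 5.7E-03 | 1.8 | 3.2 | 1608.3 | −1.5 | Ok | 1.8 | 3.2 | 1775.7 | −1.6 |
| Compound2 | Caspase 8 | 8.1E-03 | 2.1 | 4.2 | 1588.3 | −6 | Ok | 2.1 | 4.2 | 1515.9 | −52.0 |
| Compound2 | Cytochrome C | 5.2E-01 | | | | | NoFit | | | | |
| Compound3 | Caspase 3 | 9.2E-02 | | | | | NoFit | | | | |
| Compound3 | Caspase 8 | 5.7E-06 | 1.8 | 5.3 | 1568.3 | −1.8 | Ok | 1.8 | 5.4 | 1628.1 | −1.5 |
| Compound3 | Cytochrome C | 2.9E-05 | 6.7 | 31.0 | 5108.3 | −1.2 | Ok | 6.5 | 40.1 | 10833.1 | −0.9 |
| Compound4 (c) | Caspase 3 | 9.9E-07 | 1.7 | 47.4 | 13268.3 | −6 | Ok | 1.9 | 756.0 | 44125.9 | −4.7 |
| Compound4 (b) | Caspase 8 | 1.1E-06 | 1.7 | 89.4 | 13308.3 | −6 | NoFit | | | | |
| Compound4 (b) | Cytochrome C | 2.7E-06 | 5.1 | 79.0 | 13608.3 | −6 | NoFit | | | | |
| Compound5 | Caspase 3 | 5.3E-04 | 1.6 | 6.9 | 548.3 | −6 | Ok | 1.6 | 6.5 | 371.4 | −20.5 |
| Compound5 | Caspase 8 | 1.3E-03 | 1.7 | 6.9 | 748.3 | −3.3 | Ok | 1.6 | 7.2 | 847.0 | −1.9 |
| Compound5 (a) | Cytochrome C | 1.1E-07 | 3.8 | 67.6 | 1288.3 | −1.5 | Ok | 4.1 | 66.4 | 1241.8 | −1.7 |
| Compound6 | Caspase 3 | 6.6E-01 | | | | | NoFit | | | | |
| Compound6 | Caspase 8 | 3.7E-01 | | | | | NoFit | | | | |
| Compound6 | Cytochrome C | 9.8E-02 | | | | | NoFit | | | | |
| Compound7 | Caspase 3 | 1.2E-02 | 2.2 | 5.2 | 5768.3 | −1.5 | Ok | 2.2 | 7.3 | 17804.1 | −0.8 |
| Compound7 | Caspase 8 | 1.4E-02 | 1.8 | 4.7 | 3468.3 | −1.2 | Ok | 1.8 | 5.8 | 8804.0 | −0.9 |
| Compound7 | Cytochrome C | 7.4E-03 | 6.7 | 16.6 | 1028.3 | −6 | Ok | 6.7 | 16.7 | 1001.0 | −8.4 |
| Compound8 | Caspase 3 | 2.8E-02 | 1.6 | 2.8 | 1108.3 | −6 | Ok | 1.6 | 6.2 | 54354.5 | −0.7 |
| Compound8 | Caspase 8 | 8.7E-01 | | | | | NoFit | | | | |
| Compound8 (c) | Cytochrome C | 1.0E-05 | 7.3 | 56.8 | 10828.3 | −6 | Ok | 6.8 | 4099.0 | 475017.8 | −1.5 |
| Compound9 (a) | Caspase 3 | 2.3E-10 | 2.7 | 100.8 | 388.8 | −2.4 | Ok | 3.1 | 99.3 | 390.5 | −2.6 |
| Compound9 (a) | Caspase 8 | 3.8E-12 | 1.6 | 99.0 | 77.7 | −3.6 | Ok | 1.7 | 99.6 | 76.6 | −3.3 |
| Compound9 (a) | Cytochrome C | 4.4E-08 | 8.2 | 94.9 | 1848.3 | −2.1 | Ok | 8.2 | 97.9 | 1946.7 | −1.9 |
| Compound10 | Caspase 3 | 6.2E-01 | | | | | NoFit | | | | |
| Compound10 | Caspase 8 | 6.9E-01 | | | | | NoFit | | | | |
| Compound10 | Cytochrome C | 4.2E-04 | 8.7 | 27.0 | 9208.3 | −6 | Ok | 8.6 | 39.2 | 18268.6 | −1.6 |

(a) Representative dose responses identified by both SDRS and XLfit, shown in Supplementary Figure S2.

(b) Dose responses identified by SDRS but not XLfit, shown in Supplementary Figure S1.

(c) Dose responses where XLfit generated EC50 values larger than the highest experimental dose, shown in Supplementary Figure S3.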
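
As a quick numerical check of the comparison in Table 1, the ratio of the XLfit EC50 to the SDRS EC50 can be computed for the footnoted rows. The values below are copied from Table 1; the variable and function names are hypothetical, and the ratio is only an informal summary, not a statistic used in the paper.

```python
# (compound / assay, footnote, SDRS EC50 in nM, XLfit EC50 in nM), from Table 1.
TABLE1_EC50 = [
    ("Compound1 / Caspase 8", "a", 1528.3, 1570.5),
    ("Compound5 / Cytochrome C", "a", 1288.3, 1241.8),
    ("Compound9 / Caspase 3", "a", 388.8, 390.5),
    ("Compound9 / Caspase 8", "a", 77.7, 76.6),
    ("Compound9 / Cytochrome C", "a", 1848.3, 1946.7),
    ("Compound4 / Caspase 3", "c", 13268.3, 44125.9),
    ("Compound8 / Cytochrome C", "c", 10828.3, 475017.8),
]


def ec50_ratio(sdrs_ec50, xlfit_ec50):
    """Ratio of the XLfit estimate to the SDRS estimate."""
    return xlfit_ec50 / sdrs_ec50


# Ratios grouped by footnote: "a" rows have both plateaus,
# "c" rows are those where XLfit exceeded the highest experimental dose.
ratios_by_footnote = {}
for label, footnote, sdrs, xlfit in TABLE1_EC50:
    ratios_by_footnote.setdefault(footnote, []).append(ec50_ratio(sdrs, xlfit))

for footnote, ratios in sorted(ratios_by_footnote.items()):
    print(footnote, [round(r, 2) for r in ratios])
```

For the rows marked (a) the two methods agree to within roughly 5%, whereas for the rows marked (c) the XLfit estimate is more than three times the SDRS estimate.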

Although SDRS was initially developed to handle genomic-scale transcriptional dose–response data, it can also be used to analyze other types of dose–response data, such as those from qPCR and lead evaluation, where it performs as efficiently as iterative non-linear regression methods (Ji et al., 2009, and Rui-Ru Ji). For large datasets, SDRS can be run in parallel very efficiently across a multicore system (Supplementary Table S2). SDRS is robust to the naturally occurring variability in large-scale screening data, where the assays are not necessarily ‘optimized’. Importantly, only SDRS provides a full set of F-statistics across the dose range that can be utilized in downstream system-level analyses and comparisons.

REFERENCES

Ji, Rui-Ru; de Silva, Heshani; Jin, Yisheng; Bruccoleri, Robert E; Cao, Jian; He, Aiqing; Huang, Wenjun; Kayne, Paul S; Neuhaus, Isaac M; Ott, Karl-Heinz; Penhallow, Becky; Cockett, Mark I; Neubauer, Michael G; Siemers, Nathan O; Ross-Macdonald, Petra (2009) Transcriptional profiling of the dose response: a more powerful approach for characterizing drug activities. PLoS Comput Biol, 2009-09-18.
