
BlotIt-Optimal alignment of Western blot and qPCR experiments.

Svenja Kemmer1,2, Severin Bang1,2, Marcus Rosenblatt1,2, Jens Timmer1,2,3, Daniel Kaschek1,2,4.   

Abstract

Biological systems are frequently analyzed by means of mechanistic mathematical models. In order to infer model parameters and provide a useful model that can be employed for systems understanding and hypothesis testing, the model is often calibrated on quantitative, time-resolved data. To do so, it is typically important to compare experimental measurements over broad time ranges and various experimental conditions, e.g. perturbations of the biological system. However, most established experimental techniques, such as Western blotting or quantitative real-time polymerase chain reaction, only provide measurements on a relative scale, since different sample volumes, experimental adjustments or varying development times of a gel lead to systematic shifts in the data. In turn, the number of measurements on the same scale, and thus directly comparable, is limited. Here, we present a new flexible method to align measurement data that obey different scaling factors, and compare it to existing normalization approaches. We propose an alignment model to estimate these scaling factors and provide the possibility to adapt this model depending on the measurement technique of interest. In addition, an error model can be specified to adequately weight the different data points and to obtain scaling-model-based confidence intervals for the scaled data points. Our approach is applicable to all sorts of relative measurements and does not require a particular experimental condition to have been measured on all available scales. An implementation of the method is provided with the R package blotIt, including refined ways of visualization.


Year:  2022        PMID: 35947551      PMCID: PMC9365137          DOI: 10.1371/journal.pone.0264295

Source DB:  PubMed          Journal:  PLoS One        ISSN: 1932-6203            Impact factor:   3.752


Introduction

The approach of mathematical modeling to analyse and understand dynamic processes of biological systems requires the collection and quantification of time-resolved experimental data for many different experimental conditions [1-4]. Frequently, the generation of this type of data is achieved by techniques like Western blotting [5, 6], quantitative real-time polymerase chain reaction [7], reverse phase protein arrays [8] or flow and mass cytometry [9], which only generate measurements on a relative scale. Therefore, the number of experiments that are comparable to each other, i.e. provided on the same measurement scale, is typically limited by the experimental setup, which constitutes a bottleneck for mathematical models of high complexity. In the following we focus on Western blotting as a well-established and commonly used technique. Here, protein abundances are measured on a relative scale by chemiluminescent antibodies binding to the respective proteins embedded in a gel. Let us consider such a time-course experiment that has been performed twice and quantified by Western blot. The experimental setting is assumed to be the same between the two experiments, i.e. the same biological or experimental conditions were measured, but on two different gels. Since the Western blot technique only provides relative measurements, the obtained data points presumably show a similar dynamical behavior. However, they do not coincide with each other in absolute numbers due to experimental errors and different measurement scales. The corresponding unknown scaling factor, i.e. the ratio between the measurement scales, can be inferred relatively simply by aligning the two measurement profiles to each other. Now, let us consider another experiment where time courses of two different experimental conditions, for example stimulation doses, have been measured separately on two different Western blots.
Here, it cannot be distinguished whether the difference in the results is occurring due to the experimental condition or due to the different measurement scales. In particular, the scaling factor between the two blots cannot be estimated in this case. One way to circumvent the missing comparability between measurements is to add recombinant proteins to the Western blot samples, allowing for an absolute-scale quantification [10]. However, this approach is very expensive and time-consuming and scales with the number of measured proteins [11]. In systems biology, where the data is employed to estimate parameters of a mathematical model, it is a common approach to determine the scaling parameters of the different blots together with the remaining parameters of the model [12]. Besides the disadvantage of enlarging the parameter space when using standard ODE modeling and optimization methods, the estimates of the scaling parameters might be biased by the model equations, hampering hypothesis testing and therefore interpretation of the results [13]. As a generally applicable alternative, the Western blot experiments can be designed in a way that a certain experimental overlap exists between different blots, meaning that the same experimental condition is measured multiple times. Degasperi et al. [14] present a method to analytically determine the corresponding scaling factors based on such data. However, to be able to apply this method, there needs to be at least one experimental condition that has been measured on all blots, which implies additional planning effort, might be limited by the availability of the overlap sample, and complicates the use of experiments performed at a later time point. Here, we present a new data-based approach for the estimation of scaling parameters which is also applicable in the absence of a unique condition overlapping across all available scales.
It is sufficient that the independent experiments are connected by pairwise overlapping conditions. In addition, the implemented method provides not only the possibility to obtain scaling parameters and thereby align data points of different Western blots, but also to compute confidence intervals for the results by applying a user-defined error model [15]. We implemented this method in the R package blotIt.

Methods

When analyzing the measured values of a hypothetical experiment, we define three classes of effects: (A) Biological effects describe biological conditions such as different targets (proteins, mRNA, etc.), stimulation doses, inhibition treatments or measurement time points of a dynamical process. The set {y_i} contains all N_y unique combinations of biological effects. For each element of this set, i ∈ (1, …, N_y), there exists one true value y_i. (B) Scaling effects describe the systematic influence of the measurement techniques and evaluation routines on the particular numerical value that is obtained. In the example of Western blotting, these scaling effects include for example development time, sample loading, gel thickness or antibody efficiency. All N_s scaling effects make up the set {s_j}, and each scaling factor s_j with j ∈ (1, …, N_s) equally affects the measurements of all y_i within the respective experiment. Only measurements that underlie the same scaling factor can a priori be considered as comparable. In other words, repeated experiments measuring the same effect y_i result in a set of values Y_ij, where the indices imply that y_i is affected by the experiment-specific scaling factors s_j. Some properties, e.g. gel imperfections, do not affect the whole experiment uniformly, but neighbouring lanes can be influenced by a systematic error. To resolve this, randomized sample loading is advised [16], ensuring that the resulting errors are independent. (C) Residual noise: In addition to the systematic error sources (A-B), each measurement Y_ij is affected by stochastic noise ϵ_ij. Based on these three error sources, we present in the following an approach to align the numerical values of different experiments with the aim of retrieving one comparable data set.

Definition of the alignment model

In mathematical terms, the influence of scaling factors on true values is described by

    Y_ij = f(y_i, s_j) + ϵ_ij,    (1)

where f(y_i, s_j) is the scaling model and ϵ_ij reflects the noise of the measurement, which is assumed to be normally distributed with ϵ_ij ~ N(0, σ_ij²), where σ_ij is the standard deviation of the normal distribution, and the indices imply that each measurement Y_ij can in principle have its own error distribution. To assess the individual error distribution, an error model h is introduced which is based on the error model parameters θ:

    σ_ij = h(y_i, s_j, θ).    (2)

The data quantification usually happens by relating the luminescence of a sample to signal strength. An example for such a measurement procedure is Western blotting. Because of an always present background, it is in the nature of such measurements to have a low signal-to-noise ratio for data points with low signal. Kreutz et al. elaborate why the error of such measurements is most completely described by a mixed error model h_mixed = e_abs + e_rel ⋅ f(y_i, s_j), composed of an absolute error addressing the constant background and a signal-dependent relative error [15]. In cases where the signal is significantly higher than the background, or a constant background is subtracted during data quantification, it can be sufficient to describe the error by a purely relative error model h_rel = e_rel ⋅ f(y_i, s_j), although this simplified model still bears the danger of underestimating the errors of low-intensity measurements, e.g. of unstimulated controls. In the following, we consider this relative error model. Kreutz et al. suggested statistical tests to check whether this simplification is justified for a given data set [15]. Calculating the errors by means of an error model has an additional advantage over calculating them from the replicate spread: the error model takes the variance information of all experimental data into account, which allows for a reliable error estimate even for conditions with small numbers of replicates.
When considering Western blot data as a typical use case for the method presented here, a simple model with gel-dependent scaling effects is usually assumed. The equation then reads

    Y_ij = y_i ⋅ s_j + ϵ_ij,    (3)

where the measurement Y_ij corresponds to the true value y_i affected by the scaling factor s_j and the noise ϵ_ij. The relative error model for the standard deviation of the measurement Y_ij then reads

    σ_ij = e_rel ⋅ y_i ⋅ s_j.    (4)

One error parameter e_rel is determined for all measurements, from which the measurement errors σ_ij are inferred by multiplying e_rel with the corresponding model evaluation. The accuracy of the estimated errors crucially depends on the validity of the chosen error model. Therefore, the error model needs to be adjusted for other applications. Depending on the measurement technique, the data could be given on the logarithmic scale, and the error model h has to be adjusted accordingly. This is the case e.g. for qPCR data, which is typically provided on log2 scale:

    Y_ij = log2(y_i) + log2(s_j) + ϵ_ij.    (5)

Here, the relative error model becomes an absolute one:

    σ_ij = e_abs.    (6)

Together, Eqs (1) and (2) describe a combined scaling and error model formulation based on the assumption that true values of measurements are influenced by scaling factors and experimental noise. All parameters y_i, s_j and θ are a priori unknown and have to be determined based on the data.
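The scaling and error models described above are language-neutral; the following Python sketch evaluates a multiplicative scaling model with a relative error model, and the additive log2 variant, for hypothetical values (all numbers are illustrative, not from the paper):

```python
import math

def scaling_model(y, s):
    """Multiplicative scaling model f(y, s) = y * s for Western blot data."""
    return y * s

def relative_error_model(y, s, e_rel):
    """Relative error model: sigma = e_rel * f(y, s)."""
    return e_rel * scaling_model(y, s)

def scaling_model_log2(y, s):
    """On log2 scale (e.g. qPCR), the scaling becomes additive."""
    return math.log2(y) + math.log2(s)

# hypothetical true value, gel-specific scaling factor and error parameter
y_true, s_gel, e_rel = 2.0, 3.0, 0.1
measurement_mean = scaling_model(y_true, s_gel)     # 6.0
sigma = relative_error_model(y_true, s_gel, e_rel)  # e_rel * 6.0
```

On the log2 scale the same gel-specific factor appears as a constant additive offset, which is why the relative error model turns into an absolute one there.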

From the original to a common data scale

Evaluating the data with the presented method results in three representations of the data, each of them with its own meaning and application in different contexts: (i) The scaled data representation contains the original replicate data transferred to the common scale, (ii) the aligned data representation reflects the underlying estimated true values, and (iii) the predicted value representation corresponds to model evaluations back on the original scale. The alignment process as described in the following is schematically visualized in Fig 1.
Fig 1

Overview of the blotIt alignment procedure.

Top: Three exemplary experiments are represented by cartoon Western blots along with simulated raw data on the original scale (original). Experiments are indicated by color. Middle: Raw data is fitted by the alignment model to estimate scaling parameters s and the underlying true values y. Error parameters e are simultaneously estimated by means of an error model. Bottom: The procedure outputs three different ways to visualize the result: Single replicates aligned to the common scale (scaled), the time course of estimated true values (aligned), and a prediction for the replicates on the original scale (predicted). Uncertainties are shown as shaded areas.

Initially, the numerical values of each experiment are on their own original scale, shown by three simulated example time courses at the top of Fig 1. In particular, these measurements are not comparable to each other. Now, let us for the moment assume that an optimal set of parameters (ŷ_i, ŝ_j, θ̂) has been found. With these parameters, we define a common scale as the scale on which all measurements shall be directly comparable. The model assumes that all true values y_i are on this common scale, and describes how the scaling has to be applied to a true value y_i to match the respective measurement Y_ij. To retrieve the scaled values Y^s_ij from the respective measurements Y_ij, the inverse of this model has to be evaluated with estimated scaling parameters and measured data on the original scale:

    Y^s_ij = f⁻¹(Y_ij, ŝ_j).    (7)

The resulting data set with replicates aligned to the common scale is thus called scaled, as shown on the lower left in Fig 1. Errors are not shown in Eq (7) because the model describes the scaling of the measured value itself. Error estimates for the measured data are derived from the error model and are propagated to the common scale by the use of Gaussian error propagation (see section Error determination for more details).
The scaled data set still contains the information about each independent experiment, but all measurement values are directly comparable. This is useful for comparisons between experiments, e.g. to detect potential outliers and to identify experiments with conspicuously large measurement errors. As input for a dynamical modeling approach, the scaled data set is preferable to the original data as it is already on a common scale. Working with this data set, experiment-specific scaling parameters are no longer necessary. To be able to compare this data on the common scale to model simulations on a different scale, e.g. absolute concentrations, it is recommended to include one scaling parameter for the whole data set in the model formulation. This enables a proper relation of model and data. If only few replicates are available, the scaled data set might not be the best input for dynamic modeling and the aligned data set should be favored. This data set consists of the estimated true values ŷ_i. The corresponding estimated errors quantify the uncertainty of the parameter fit, thereby taking all data into account. These estimated errors might be a more reliable description of the data spread than the information provided by a small number of replicates. To identify discrepancies between the true values determined based on all measurements, and the measurements of a single experiment, the true values can be scaled back to the original scale. This new data set is termed predicted and consists of the direct alignment model evaluations using the estimated true values and scaling parameters:

    y^p_ij = f(ŷ_i, ŝ_j).    (8)

Note that this data set is again on the original scale, and thus does not provide comparability between the experiments. The calculation of errors for the original, predicted and scaled data is described in the section Error determination.
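The three representations can be sketched in a few lines of Python, assuming the multiplicative model; the estimates ŷ and ŝ below are hypothetical values standing in for a completed fit:

```python
# Hypothetical estimates standing in for a completed fit (illustrative only).
y_hat = [1.0, 0.5]                 # estimated true values (common scale)
s_hat = [1.0, 2.0]                 # estimated scaling factor per experiment
Y = [[1.1, 0.45],                  # raw measurements Y[j][i]: experiment j,
     [2.1, 0.90]]                  # condition i, each on its original scale

# scaled: raw replicates moved to the common scale via the inverse model
scaled = [[Y[j][i] / s_hat[j] for i in range(2)] for j in range(2)]

# aligned: the estimated true values themselves (one value per condition)
aligned = y_hat

# predicted: model evaluations back on the original scale of each experiment
predicted = [[y_hat[i] * s_hat[j] for i in range(2)] for j in range(2)]
```

The scaled values are comparable across experiments, while the predicted values can be held against the raw data of each individual gel.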

Parameter estimation

The presented method brings measurements from different experiments to a common scale by applying an alignment and an error model. Assuming the residuals obey Gaussian statistics, the best maximum-likelihood estimate is the set of parameters which minimizes the negative log-likelihood [17]:

    −2 log L(y, s, θ) = Σ_ij (Y_ij − f(y_i, s_j))² / σ_ij²    (10a)
                        + Σ_ij log(σ_ij²)    (10b)
                        + c(y).    (10c)

The log-likelihood function consists of three terms. The special form presented in Eqs (10a) and (10b) is based on the assumption that observations are affected by normally distributed residual noise. The first term (10a) contains the weighted least squares, namely the differences between model predictions f(y_i, s_j) and the corresponding measurements Y_ij; residuals are weighted by the variance σ_ij². The second term (10b) accounts for the simultaneous optimization of the error model h(y_i, s_j, θ) and thereby the estimation of the error parameters. While the first two terms ensure minimization of the spread between experiments, the third term (10c), here abbreviated c(y), forces the mean of the estimated true values to be one during the optimization process and thereby introduces the common scale. For computational reasons, the parameters are by default transferred to logarithmic scale prior to the estimation. This drastically improves numerical stability, especially when the input data varies over multiple orders of magnitude. Estimated parameter values are subsequently transformed back and reported on the linear scale.
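The structure of this objective can be sketched in Python; the quadratic penalty used for the mean constraint, its weight, and all numbers are illustrative assumptions, not the blotIt implementation:

```python
import math

def neg_log_likelihood(y, s, e_rel, data):
    """Sketch of the objective: data is a list of (i, j, Y_ij) triples,
    the scaling model is y[i] * s[j], and the error model is relative."""
    nll = 0.0
    for i, j, Y in data:
        pred = y[i] * s[j]                    # f(y_i, s_j)
        sigma = e_rel * pred                  # relative error model
        nll += (Y - pred) ** 2 / sigma ** 2   # weighted residuals
        nll += math.log(sigma ** 2)           # error-model term
    # soft constraint fixing the mean of the true values to one
    # (the weight 1e6 is an arbitrary illustrative choice)
    mean_y = sum(y) / len(y)
    nll += 1e6 * (mean_y - 1.0) ** 2
    return nll

# toy data: two conditions measured on two gels (second gel scaled by ~2)
data = [(0, 0, 1.60), (1, 0, 0.42), (0, 1, 3.10), (1, 1, 0.78)]
good = neg_log_likelihood([1.6, 0.4], [1.0, 2.0], 0.1, data)
worse = neg_log_likelihood([1.6, 0.4], [1.0, 3.0], 0.1, data)  # wrong s_2
```

Handing such a function to any standard numerical optimizer, by default on log-transformed parameters, yields the estimates ŷ, ŝ and θ̂.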

Error determination

In Fig 1, we introduced different output data representations as result of the alignment model. As a major advantage of this formulation, a measure of uncertainty, i.e. a statistical error, can be determined for each of these data sets, as summarized in Table 1 and explained in the following.
Table 1

Overview table of the different output data sets of blotIt.

Data set  | Data                     | Error                                          | Scale
Original  | Y_ij                     | σ_ij(γ) = γ ⋅ h(θ̂)                            | Original
Predicted | y^p_ij = f(ŷ_i, ŝ_j)     | σ_ij(γ) = γ ⋅ h(θ̂)                            | Original
Scaled    | Y^s_ij = f⁻¹(Y_ij, ŝ_j)  | σ^s_ij(γ) = |∂f⁻¹(Y_ij, ŝ_j)/∂Y_ij| ⋅ σ_ij(γ)  | Common
Aligned   | ŷ_i                      | σ_fit(γ)(ŷ_i) = γ ⋅ √C_ii(θ̂)                  | Common
First of all, the number of estimated parameters n_P in the alignment model is typically quite high compared to the number of data points n_d: n_d/n_P ∼ 2–3. Under such conditions, the maximum-likelihood estimation tends to underestimate the standard deviation in a sample. The effect is more apparent for small samples or, equivalently, when many parameters are estimated from few data points. We account for this bias by applying Bessel's correction [18], scaling the estimated standard deviation of the measured data within blotIt by the factor

    γ = √(n_d / (n_d − n_P)).

The error model is evaluated on the scale of the original observations. To retrieve the error of the scaled data, Gaussian error propagation is employed [19]:

    σ^s_ij(γ) = |∂f⁻¹(Y_ij, ŝ_j)/∂Y_ij| ⋅ σ_ij(γ).

In Eq (4) we introduced a relative error model for Western blotting. As described above, this error serves as an estimate for the error of both the original and the predicted data sets. To retrieve the error of the scaled data Y^s_ij, we have to consider the scaling model defined in Eq (3). While all errors considered so far quantify the uncertainty of the measurement, the errors of the estimated true values ŷ_i are calculated in a qualitatively different way. Since the ŷ_i are model parameters, their errors have to be estimated from the model uncertainty itself. In maximum likelihood estimation, the uncertainty is reflected in the local curvature of the likelihood landscape around the determined parameter value [20, 21]. The uncertainty of the l-th fitted parameter is given by

    σ_fit(γ)(ŷ_l) = γ ⋅ √C_ll(θ̂),

where the local curvature is approximated by the square root of the ll-diagonal element of the covariance matrix C, given by the inverse of the Fisher information matrix I, which itself is represented by the Hessian H calculated during the optimization process. Note that the Bessel correction γ is applied here, too. To sum up, the blotIt approach provides four measures of uncertainty (Table 1). The uncertainty provided with the original data set is estimated based on the error model.
This error reflects the between-replicate variability and, as such, is comparable to the standard deviation of a single measurement. The uncertainty provided with the predicted data set is the same as for the original data set and thus also reflects the standard deviation of a single measurement. The uncertainty provided with the scaled data set is the error of the original data set translated to a different scale, i.e. the common scale; this error, too, reflects the standard deviation of a single measurement. Finally, the uncertainty provided with the aligned data set is the estimation uncertainty of the estimated true values, meaning that this error is comparable to the standard error of the mean. Therefore, with more and more replicate measurements, the error of the aligned data set becomes smaller, whereas the errors provided with the original, predicted, and scaled data sets converge to a constant value.
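The correction and propagation steps can be sketched numerically; the explicit form of γ shown here is an assumption consistent with Bessel's correction as described in the text, and the numbers are illustrative:

```python
import math

def bessel_gamma(n_data, n_params):
    """Correction factor counteracting the maximum-likelihood bias of the
    estimated standard deviation when many parameters are fitted."""
    return math.sqrt(n_data / (n_data - n_params))

def propagate_to_common_scale(sigma, s_hat):
    """Gaussian error propagation for the scaled data: with f(y, s) = y * s,
    the inverse model is Y / s, so |d f^-1 / dY| = 1 / |s|."""
    return sigma / abs(s_hat)

gamma = bessel_gamma(30, 10)                         # n_d / n_P = 3
sigma_scaled = propagate_to_common_scale(0.6, 2.0)   # 0.3
```

For n_d/n_P ∼ 2–3, as stated above, γ is noticeably larger than one, so the uncorrected errors would be clearly too small.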

Simulation study

To compare the performance of different scaling approaches, we generated a simulated data set following a function with quadratic rise and exponential decay, which represents the typical behavior of e.g. protein phosphorylation or expression dynamics [15]. The parameters c_cond and c_target were chosen from a uniform distribution for each condition and target to simulate different stimuli and target-specific dynamics; the remaining fixed values merely determine the time scale of the dynamics. Artificial noise consisting of an absolute contribution resembling background noise, as well as a signal-dependent relative part, was added to the simulated data. Because this noise is known to be log-normally distributed, the error was implemented on the logarithmic scale by drawing from a Gaussian distribution with mean 0 and standard deviation σ. To evaluate the goodness of the scaling and compare blotIt with other available methods, the same generated noisy data was scaled with different normalization approaches. This procedure was repeated 200 times for each normalization approach to be able to statistically evaluate the results. Inspired by Degasperi et al. [14], the goodness of each normalization was assessed based on the spread of the scaled data, the standard deviation σ_calc, calculated for each biological condition i, determined by target, condition, and time point. Since the data was generated with log-normally distributed noise, the data points had to be log-transformed before the standard deviation was calculated. The resulting σ_calc represents the normalized spread of the scaled replicates for each biological condition and is comparable between scaling methods.
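The spread measure can be sketched as follows; the function name and the data are illustrative, not taken from the paper:

```python
import math
from statistics import stdev

def sigma_calc(scaled_replicates):
    """Spread of the scaled replicates of one biological condition,
    computed on log-transformed values (the simulated noise is log-normal)."""
    return stdev([math.log(v) for v in scaled_replicates])

# perfectly scaled replicates have zero spread
perfect = sigma_calc([2.0, 2.0, 2.0])
# imperfect scaling leaves a residual spread
residual = sigma_calc([1.0, 1.1, 0.95])
```

Because the measure is computed on the log scale, it is independent of the common scale each method happens to produce, which is what makes it comparable across scaling methods.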

Applications

In the following, the application of blotIt is illustrated and compared to alternative approaches by means of simulated data and a published data set comprising Western blot as well as qPCR measurements. The scaling and thereby alignment of the different data sets was performed with the R package blotIt.

Application to simulated data

To assess the performance of blotIt in comparison to alternative normalization approaches, we conducted a method comparison. Three alternative methods were tested on data realizations with different overlap, i.e. the number of samples measured in all experiments, and different noise levels, and their performance was compared to that of blotIt. The following methods were analyzed: (1) Optimal alignment, which is based on the analytical minimization of differences between all overlap samples. This approach was discussed in detail by Degasperi et al. [14] and applied e.g. by Wang et al. [22]. (2) Normalization by fixed point, which uses one biological condition (one y_i as defined above) to normalize all experiments by the respective measurement value of this condition [23, 24]. (3) Normalization by sum (setSums), or equivalently average, which is analogous to the fixed point method, but here experiments are divided by the sum or average of all overlapping biological conditions [25, 26]. We compared the performance of the different scaling techniques for five data realizations chosen to mimic a variety of real world situations: Full overlap—Each experiment describes the exact same biological conditions, meaning the exact same experiment was repeated N times; 50% overlap—In this scenario all experiments share one reference treatment condition, for which all time points are measured. The second half of each experiment covers an individual condition; Dose response—Here, a whole time course is measured in replicates analogously to the previous scenarios, along with an additional replicate set that covers just one time point in multiple conditions. This could be e.g. a dose response measurement; Signal to noise variations—Two more data sets with 50% overlap were simulated with mixed signal to noise ratios. All data sets are visualized in Fig 2a. Experiments describing the exact same biological conditions are referred to as replicate sets, indicated by color.
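The two simpler reference methods can be sketched in a few lines; the data and function names are illustrative and not the implementations used in the comparison:

```python
def normalize_fixed_point(experiment, ref):
    """Divide all values of one experiment by its value at the shared
    reference condition (method 2 in the text)."""
    factor = experiment[ref]
    return {cond: val / factor for cond, val in experiment.items()}

def normalize_by_sum(experiment, overlap_conds):
    """Divide all values by the sum over the overlapping conditions
    (method 3, setSums)."""
    factor = sum(experiment[c] for c in overlap_conds)
    return {cond: val / factor for cond, val in experiment.items()}

# two hypothetical experiments sharing the reference condition "t0"
exp_a = {"t0": 2.0, "t1": 4.0}
exp_b = {"t0": 1.0, "t1": 2.1}
fp_a = normalize_fixed_point(exp_a, "t0")   # {"t0": 1.0, "t1": 2.0}
fp_b = normalize_fixed_point(exp_b, "t0")   # {"t0": 1.0, "t1": 2.1}
```

With a single overlapping condition the sum over the overlap reduces to the fixed-point value, which is why the two methods coincide in that scenario.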
Fig 2

Method comparison.

The performance of four different scaling methods was analyzed for five simulated data sets with different overlap and signal to noise ratios. (a) Illustration of the tested data sets and their experimental overlap. Rows of the tile plots correspond to the different experiments (scaling effects), columns correspond to different experimental conditions (biological effects). Tiles indicate whether the respective condition was measured in the respective experiment (colored) or not (white). Those experiments describing the exact same biological conditions are summarized and colored as replicate sets. Data with low signal to noise ratio is indicated by shaded area. (b) The performance of the different scaling methods was assessed based on the standard deviation of the respective scaled data and displayed as density plot. Data sets were analyzed with three replicates, i.e. replicate sets consisting of three experiments and with ten replicates, respectively. Note that the methods setSums and fixedPoint often yield very similar results and thus lead to overlaying density plots.

The performance of the individual methods was evaluated for each of the data realizations based on the spread of the scaled data, the standard deviation σ_calc, as described in the methods section. As the performance might vary with the number of replicates, i.e. the number of experiments belonging to one replicate set, one scenario with three and one scenario with ten replicates were analyzed. As displayed in Fig 2b, all methods performed equally well in the scenario with full overlap, except for the fixed point approach, which had a slightly worse outcome. However, with a decreasing number of overlap samples, blotIt gained an advantage over the other methods, tending towards overall smaller standard deviations. The scaling evaluation of the 50% overlap data set was especially interesting as it visualized how the distinct approaches work.
The three methods checked against blotIt displayed a small sharp peak at low standard deviations, comparable to results from the full overlap scenario, followed by a larger peak at higher values. This characteristic was less marked for the optimal alignment approach. Except for blotIt, all methods use only the overlap present between all experiments to determine the scaling factors; the small peaks at low standard deviations originate exactly from these overlap samples. Conditions not measured in all experiments, however, were scaled considerably worse and led to a larger spread between replicates. In contrast, blotIt uses all samples to normalize the data. This improves the scaling of samples not measured in all experiments and of data with a low signal-to-noise ratio, as present in the data sets NoisyRef and NoisyAddOn. A small overlap between experiments is frequently encountered in the typical scenario of combining time course and dose response measurements. At the extreme, when the overlap between experiments was reduced to one condition, blotIt outperformed the other methods by far. With only one overlapping condition, normalization by fixed point and by sums are equivalent in this scenario. Optimal alignment relied on the single common condition to calculate the scaling factors, which gave the worst outcome, indicating that this method works best when a large overlap is provided. It has to be noted, however, that the overall scaling performance of all methods deteriorated with less overlap. Comparing the outcomes with regard to replicate numbers, similar means of the standard deviations were observed for three and ten replicates; the overall ranking of the methods thus did not change. However, with an increasing number of replicates the peaks became sharper.
The equal performance might be due to the design of the data sets, where additional replicates were evenly distributed between reference and individual conditions and thus did not change the proportion of data used for scaling.

Alignment of Western blot data

In the following, blotIt is applied to a published data set that provides time-resolved measurements for the phosphorylation of cytoplasmic Signal Transducer and Activator of Transcription 1 (STAT1) and mRNA levels of the Suppressor of Cytokine Signaling 1 (SOCS1) [4]. Both targets are involved in the Interferon alpha (IFNα) signaling pathway, where STAT1 acts as a transcription factor regulating, among others, SOCS1 expression. Three different IFNα concentrations were used to induce signal transduction and thereby phosphorylation of STAT1 as well as expression of SOCS1. Phosphorylation dynamics of cytoplasmic STAT1 were quantified by Western blot experiments, while SOCS1-mRNA levels were measured by qPCR and are analyzed in the next section. The three IFNα doses used for stimulation correspond to three conditions that are distinguished in the following. Cytoplasmic STAT1 protein was quantified in three experiments. Measurements are thus only available on different scales as they originate from distinct gels. Therefore, one cannot directly judge the dynamics of the final time course from investigating the raw data (Fig 3a). Moreover, a direct comparison, for example between experiments two and three, is not possible, since these have not been measured together on the same gel. Instead, as shown in Fig 3b, an overlap exists between experiments 1 & 2, and 1 & 3, by two replicates respectively. This allows the alignment of the three experiments and enables a comparison between all replicates. If no overlap existed between gels, it would not be possible to determine or define a common scale. In this case, blotIt would determine scaling factors separately, and the resulting scaled values would not be comparable between experiments.
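The requirement that experiments be linked by overlapping conditions amounts to a connectivity check on the overlap graph of the gels. A hypothetical helper illustrating this idea (not part of blotIt):

```python
from itertools import combinations

def gels_connected(measured):
    """measured: dict gel_id -> set of measured conditions.
    Returns True if all gels are linked, possibly indirectly,
    through shared conditions, i.e. a common scale can be defined."""
    gels = list(measured)
    # union of connected components: merge whenever two gels share a condition
    components = {g: {g} for g in gels}
    for a, b in combinations(gels, 2):
        if measured[a] & measured[b]:
            merged = components[a] | components[b]
            for g in merged:
                components[g] = merged
    return len(components[gels[0]]) == len(gels)

overlap = {
    "gel1": {"c1", "c2", "c3"},
    "gel2": {"c2", "c4"},
    "gel3": {"c3", "c5"},
}
# gel2 and gel3 never share a condition, but both overlap with gel1,
# so all three can still be aligned to one common scale.
```

This mirrors the situation in Fig 3b: experiments 2 and 3 share no gel, yet both overlap with experiment 1, which makes the joint alignment possible.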
Fig 3

Application example for the alignment of Western blot and qPCR data.

Raw data of cytoplasmic pSTAT1, measured by Western blot, and SOCS1 mRNA, quantified by qPCR, was taken as a subset from [4]. (a, e) Raw data is shown on the original scale (dots) compared to the predictions (dashed interpolating lines) as output by the model. Color indicates the different experiments (gels). (b, f) Illustration of the experimental overlap. Rows correspond to the different gels (scaling effects), columns correspond to different experimental conditions (biological effects). Tiles indicate whether the respective condition was measured on the respective gel (colored) or not (white). (c, g) Data points after the alignment are shown. Scaled replicates (dots) are colored according to their original gel. On the same common scale, estimated true values are shown as gray interpolating lines. (d, h) Aligned data (dots) and trajectories (linearly interpolating lines) are depicted on the common scale. Color indicates the experimental condition.

Within blotIt, scaling of these Western blot data is performed via the alignment function alignReplicates(), called as

    outputWB <- alignReplicates(
      data = mydata,
      model = "yi/sj",
      errorModel = "e_rel*value",
      biological = yi ~ name + time + condition,
      scaling = sj ~ name + gelID,
      error = e_rel ~ name
    )

allowing an individualized structure of the model and error model. The structure of the input data set follows a recently developed data-sharing standard for dynamic modeling, in particular the measurement file of PEtab [27]. It has to be provided as a data.frame with the obligatory columns name, time and value, specifying the observed target, the measurement time and the measurement value. Further columns characterize additional biological effects that have to be distinguished, as well as scaling effects, e.g. the experimental condition or the gelID of the Western blot. The alignment function allows these effects to be defined as a function of y or s in an additive manner.
name and time are obligatory in the biological argument, as different targets at different time points are independent and have to be distinguished. The scaling argument requires the parameter name, as every observed target may scale differently. The error, here e_rel, can also be specified; if the same value of e_rel is assumed for all conditions, the error is specified by the target alone, i.e. by name. Alignment of the pSTAT1 data brings the measured data points from the different experiments to a common scale. The output of alignReplicates() is a list with the entries scaled, displaying the scaled replicates Y*s (Fig 3c), and aligned, containing the estimated true values y (Fig 3d), both with uncertainties. Further list elements describe the original data, the data predictions based on the alignment model, and the respective estimated scaling parameters.
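As a language-neutral illustration of the long-format input and of how biological and scaling effects combine, the following Python sketch (column names from the text; all data values and effect estimates hypothetical) evaluates the model prediction Y = y/s and the relative error sigma = e_rel * y/s for one row of such a table:

```python
# Long-format measurement table with the obligatory columns name, time, value,
# plus one column for the biological effect (condition) and one for the
# scaling effect (gelID).
rows = [
    {"name": "pSTAT1", "time": 10, "value": 1.8, "condition": "high", "gelID": "gel1"},
    {"name": "pSTAT1", "time": 10, "value": 3.7, "condition": "high", "gelID": "gel2"},
]

# Hypothetical estimated effects: one true value per (name, time, condition),
# one scaling factor per (name, gelID), and one relative error per name.
y = {("pSTAT1", 10, "high"): 1.8}
s = {("pSTAT1", "gel1"): 1.0, ("pSTAT1", "gel2"): 0.5}
e_rel = {"pSTAT1": 0.1}

def predict(row):
    """Prediction Y = y/s and error sigma = e_rel * y/s for one table row."""
    yi = y[(row["name"], row["time"], row["condition"])]
    sj = s[(row["name"], row["gelID"])]
    return yi / sj, e_rel[row["name"]] * yi / sj

pred, sigma = predict(rows[1])   # gel 2: prediction 1.8 / 0.5 = 3.6
```

The two rows here are replicates of the same biological condition measured on two gels; after dividing by the gel-specific scaling factor, both are expected around the same true value, which is what the optimization exploits.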

Alignment of qPCR data

The flexibility of the model-based alignment approach makes it possible to process qualitatively different data with only minor adjustments. Thus, in addition to linear Western blot data, quantitative real-time polymerase chain reaction (qPCR) data can also be analyzed with blotIt. During mRNA quantification by qPCR, a small region of the mRNA of interest is amplified in a sequence of replication cycles. The mRNA concentration is therefore measured in Cycles to Threshold (C) of the PCR, a relative value that represents the cycle number at which the amount of amplified DNA reaches a defined threshold level. This threshold is in general chosen individually for each experiment. Since the amount of mRNA approximately doubles in each PCR cycle, the C value is on the log2 scale. The derived quantity ΔC describes the difference in cycles between the target and a reference gene, where the reference gene can be a housekeeper known to remain relatively stable in response to any treatment. To assess the dynamic development of mRNA expression, ΔΔC can be used, representing the difference in ΔC between the target and a reference condition [28]. Here, an intuitive reference condition is the zero time point. Because higher C values correspond to lower mRNA abundance in the sample, the quantity −ΔΔC is used to describe the expression dynamics. The freedom of choice for the detection threshold results in an experiment-specific shift in the number of cycles until detection. Together with the offsets introduced in the ΔΔC calculation by potential measurement variances of the zero time point and the housekeeper genes, this can be summarized into one offset parameter. Since the data is of logarithmic nature, this offset reflects a multiplicative scaling on the linear scale. The alignment model and the corresponding error model thus have to be log-transformed, which results in an additive model with an absolute error description.
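The ΔΔC arithmetic described above can be made concrete with a small sketch (all cycle numbers are hypothetical):

```python
def neg_ddc(c_target, c_housekeeper, c_target_t0, c_housekeeper_t0):
    """-ΔΔC: ΔC of the sample minus ΔC of the reference condition (t = 0),
    negated so that larger values correspond to higher mRNA abundance."""
    delta_c = c_target - c_housekeeper            # target vs. housekeeper
    delta_c_ref = c_target_t0 - c_housekeeper_t0  # same difference at t = 0
    return -(delta_c - delta_c_ref)

# Target mRNA detected at cycle 22 after stimulation vs. cycle 25 at t = 0;
# the housekeeper stays at cycle 15 throughout.
x = neg_ddc(22.0, 15.0, 25.0, 15.0)   # 3.0, i.e. detection 3 cycles earlier
fold_change = 2 ** x                  # 2^3 = 8-fold induction on the linear scale
```

Because each cycle doubles the amplicon, the −ΔΔC value lives on the log2 scale, which is why the alignment model above subtracts a log2 offset rather than dividing by a scaling factor.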
The alignReplicates() function call is adjusted accordingly:

    outputQPCR <- alignReplicates(
      data = mylog2data,
      model = "log2(yi)-log2(sj)",
      errorModel = "e_abs",
      biological = yi ~ name + time + condition,
      scaling = sj ~ name + gelID,
      error = e_abs ~ name
    )

The results of the alignment process outputQPCR are analogous to those described for the Western blot data. They are shown for the example data set of SOCS1-mRNA on the right-hand side of Fig 3 (Fig 3e–3h). The data was quantified in three experiments, where experiments four and six are not comparable on the original scale and differences between conditions only become apparent on the common scale. Furthermore, the dynamics of the full time course are not visible at the level of the single experiments (Fig 3e) but can be analyzed after scaling (Fig 3h). Since the original scale is logarithmic, this is also true for the common scale.

Discussion

In many cases, biological data is generated in a way that does not allow a direct comparison between different measurements. Reasons can be differences in sample loading, antibody binding or discrepancies between gels in the case of Western blotting. These artifacts lead to different measurement scales for the experimental data and mask the effects of biologically different conditions such as treatments and measurement time points in dynamical processes. Analyzing longitudinal or dose-response data without proper preprocessing is not possible when the measurements are affected by different scaling factors. Here we present a method to scale data from independent experiments to one common scale on which the data is directly comparable. In addition to the original and the scaled data, the algorithm provides two further outputs: (1) aligned data, i.e. the true values obtained when the impact of different scalings and residual noise is removed; and (2) predicted data, i.e. the values on the original scale of the experiments obtained when only the impact of residual noise is removed. Previously established strategies to correct for scaling differences include the use of recombinant proteins to transform each measurement to an absolute scale, or theoretical approaches such as normalization by fixed point, normalization by sum and optimal alignment. In the latter, scaling factors are determined by analytically minimizing differences in scaling between the experiments. The method presented here follows a similar idea but determines the scaling factors via a more flexible numerical optimization. Our approach has the benefit that no single condition needs to be present in all experiments; instead, a pairwise overlap of measured conditions between experiments is sufficient. Even if a certain overlap is given between all experiments, it is common that some conditions are measured only in a subset of experiments.
In contrast to blotIt, the above-mentioned approaches cannot use the data points outside of the overlap to determine the scaling factors, and such points are therefore scaled with poor quality. Here the strength of blotIt comes in, as it takes all data into account for the estimation of the scaling factors. This is especially relevant when performing experiments for many different conditions, e.g. times or doses of a stimulus or inhibitor. Further, in contrast to the other methods, we not only determine the scaling factors but also estimate the underlying true values, i.e. maximum-likelihood estimates of the true values disregarding the experimental scaling artifacts. Asymptotic confidence intervals based on the Fisher information matrix are provided for the scaling factors as well as for the estimated true values. One typical field of application for biological time-course data is dynamical modeling, where it is often beneficial to reduce the number of estimated parameters to a minimum, e.g. via data preprocessing. By determining a common scale with the presented method, it is not necessary to include experiment-specific scaling factors in the model, which decreases the parameter space and therefore the model complexity. A special optimization approach termed hierarchical optimization also enables the calculation of scaling parameters without effectively increasing the parameter space [29, 30]. However, this analytical evaluation of scaling parameters is always combined with the parameterization of an ODE model, as is the case for Weber et al. [13]. BlotIt, like the other analyzed methods, is a model-free, purely data-based approach, i.e. a method that does not depend on a specific ODE model implementation or modeling framework. Nevertheless, it might be of interest to compare these integrative approaches with blotIt in a future study.
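The principle behind the Fisher-information-based confidence intervals can be illustrated in the simplest one-parameter case (numbers hypothetical; blotIt works with the full multivariate Hessian of the log-likelihood): for n normally distributed replicates with known error sigma, the Fisher information of the mean is n/sigma^2, and the asymptotic 95% interval is the maximum-likelihood estimate plus or minus 1.96 times the inverse square root of the information:

```python
import math

replicates = [2.0, 2.4, 1.8, 2.2]   # hypothetical aligned replicate values
sigma = 0.2                          # assumed known measurement error

n = len(replicates)
mle = sum(replicates) / n            # maximum-likelihood estimate of the mean
fisher_info = n / sigma**2           # Fisher information of the mean
se = 1 / math.sqrt(fisher_info)      # asymptotic standard error = sigma / sqrt(n)
ci = (mle - 1.96 * se, mle + 1.96 * se)
# mle = 2.1, se = 0.1, ci ≈ (1.904, 2.296)
```

The same recipe generalizes to the multi-parameter case, where the inverse of the Fisher information matrix supplies the asymptotic covariance of the estimated scaling factors and true values.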
By utilizing numerical optimization, the alignment model and error model can be flexibly adapted to the appropriate scaling mechanism for the data at hand. One can therefore account, e.g., for data on the logarithmic scale or apply customized scaling approaches. The same freedom applies to the error model, with the benefit of individually including relative and absolute errors. With this flexibility, blotIt can be applied not only to data generated by Western blotting but to all use cases where relative data is generated, such as quantitative real-time PCR, reverse phase protein arrays as well as flow and mass cytometry.

Peer review history

Decision letter (2 Mar 2022)

PONE-D-22-03759
BlotIt - Optimal alignment of western blot and qPCR experiments
PLOS ONE

Dear Dr. Kemmer,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.
The three reviewers agree that the manuscript would be a valuable addition to the literature, provided that a number of aspects are clarified, and I concur with their assessment. In the review reports you will find a list of suggestions to clarify a number of technical aspects of the proposed method. Notably, two of them agree that it would be interesting to compare BlotIt with alternative approaches. As pointed out by Reviewer 2, one possibility would be to perform a comparison using simulated datasets. If possible, such a study would greatly enhance the contributions of the paper. 
Please submit your revised manuscript by Apr 16 2022 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file. Please include the following items when submitting your revised manuscript:
• A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.
• A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.
• An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols.

We look forward to receiving your revised manuscript.

Kind regards,
Alejandro Fernández Villaverde, Ph.D.
Academic Editor
PLOS ONE

Journal Requirements: When submitting your revision, we need you to address these additional requirements.

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf

2. We note that the grant information you provided in the 'Funding Information' and 'Financial Disclosure' sections do not match. When you resubmit, please ensure that you provide the correct grant numbers for the awards you received for your study in the 'Funding Information' section.

3. PLOS ONE now requires that authors provide the original uncropped and unadjusted images underlying all blot or gel results reported in a submission's figures or Supporting Information files. This policy and the journal's other requirements for blot/gel reporting and figure preparation are described in detail at https://journals.plos.org/plosone/s/figures#loc-blot-and-gel-reporting-requirements and https://journals.plos.org/plosone/s/figures#loc-preparing-figures-from-image-files. When you submit your revised manuscript, please ensure that your figures adhere fully to these guidelines and provide the original underlying images for all blot or gel data reported in your submission. See the following link for instructions on providing the original image data: https://journals.plos.org/plosone/s/figures#loc-original-images-for-blots-and-gels. In your cover letter, please note whether your blot/gel image data are in Supporting Information or posted at a public data repository, provide the repository URL if relevant, and provide specific details as to which raw blot/gel images, if any, are not available. Email us at plosone@plos.org if you have any questions.

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.
Reviewer #1: Yes
Reviewer #2: Yes
Reviewer #3: Yes

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes
Reviewer #2: Yes
Reviewer #3: Yes

3. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data, e.g. participant privacy or use of data from a third party, those must be specified.

Reviewer #1: Yes
Reviewer #2: Yes
Reviewer #3: Yes

4. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes
Reviewer #2: Yes
Reviewer #3: Yes

5. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: BlotIt - Optimal alignment of western blot and qPCR experiments

Summary

The authors present an automated procedure for the normalisation of relative data, such as data obtained from Western blot or rt-qPCR.
Relative data are usually not directly comparable across replicates, because of the different arbitrary units obtained, and so require normalisation. Using likelihood-based optimisation, the proposed method is able to estimate how to scale each replicate, and at the same time provides an estimate of mean and variance for the underlying data. The approach is flexible in various ways, as it allows different error models to be used, and also it is able to combine data from multiple blots as long as there is one experiment shared across pairs of blots. The work presented is certainly worthy of publication, although some clarifications and minor corrections are necessary before I can give my final approval.

Major

1. Missing discussion of the fact that Yij at extremes of the dynamic range of detection can have relatively low signal to noise ratio or high residual error epsilon. While the error epsilon is initially introduced as potentially different for each Yij, it is then simplified to have a variance proportional to the value of the measurement (sigma_ij = e_rel*yi/sj). While this can be considered an appropriate generic error model, its drawbacks should also be highlighted, such as its underestimation of variance for relatively low intensity blot values (which notoriously have a low signal to noise ratio). It would also be more appropriate to adjust lines 80-83 page 3, to reflect the fact that this error model's (Eq 2) efficacy depends on the assumptions chosen to implement h and whether the data follows these assumptions, rather than just describing it as a superior approach. The advantages and disadvantages of the chosen simplified model should be highlighted. I guess the advantage here is the reduction of parameters to estimate (just e_rel), while the disadvantage is the loss of model flexibility. For example, it will ignore that low intensity values might have a much higher relative variability (because of the low signal to noise ratio).

2. Please define a set or range of values for the indexes i and j. For example there could be I conditions and J blots with i in the range 1,…,I and j in the range 1,…,J. In principle, each equation should have a definition for i and j range, like 'for all i in (…), j in (…)'. The simplest case would be when the set of experiments i is the same for all J. However, the authors imply that different j can have different sets of experiments, so different i? In this case perhaps it makes more sense to talk about indexes i that belong to sets Ij (that is indexed by j) and that the intersections of sets Ij need to be at least pairwise not empty (i.e. share at least one condition i). This might also affect the notation in equation 10.

3. How is the mean of the true values y bar (equation 10c) defined? This is particularly important if there are different conditions i on different blots j (like in Figure 2); I guess this will be a mean across all i regardless of what blot j they belong to?

4. In equation 7, page 4, the measurement error term is not shown. How do you expect the value of Ysij to be determined if the error is unknown? For example, arranging Eq 1, Ysij = (Yij+eij)*(s hat j). Because the error is potentially different for each data point, how will the mean and sd of the aligned data be affected? More explanation is given later, but here a clarification and showing the particular example (f=Y/s) next to the generic equation would help the flow. Perhaps, it would help to mention the ideas of error propagation here, with examples and simplifying assumptions.

5. At lines 129 to 131 page 4, the authors assert that for dynamic model parameter estimation mean and standard deviation is sometimes preferred, citing the case of a low number of replicates. One could argue the opposite, that if there are few replicates, then all data should be used for fitting, because the average and sd alone may not properly represent a given datapoint. I suggest to remove, or add more literature or explanation to support, the authors' claim.

6. The claim at lines 133 and 134 that the estimated errors are more reliable than the data spread obtained from replicates should be further explained. For example, this could be true only if the data agree with the error models and there could be exceptions (see above for low signal to noise datapoints), and also it depends on how the data spread of replicates is calculated; for example if the replicates are done all on the same blot, then it might be more accurate, but if they are done on different blots, then the spread is completely dependent on the normalisation applied and possibly the value of other datapoints.

7. Why is there no error term in equation 8? Also, in this case it would be useful to give a generic idea of how the error can be inferred and follow Eq 8 with the concrete example f=Y/s, and simplifying assumptions.

8. On page 5 line 146, please reconsider and rephrase the claim that the optimal theta is obtained by minimising the spread of the replicates, because the likelihood model is not just about trying to reduce the spread (low sigma).

9. In the error determination section (page 6), equations 13 and 14 require a reference, please add. Also, it would be nice to accompany these equations with concrete examples for the specific cases and simplifying assumptions described (like f=Y/s). For example, sigma_s, if I am not mistaken, simplifies to sigma_s_ij = s_j*sigma_ij.

10. Perhaps it would be of interest adding to the discussion some speculative yet useful cases. For example, what happens when two blots share only one experiment, but in one blot j the measurement is likely to have a high signal to noise ratio, while in the other blot j' the same experiment has a very poor signal to noise, perhaps because it is a low intensity value. Would the proposed model be able to propagate the variance of the normalised data? Would there be enough data to constrain the model optimisation?

11. What other error models are available in blotIt besides e_rel*value and e_abs?

Minor:

12. Shouldn't Equations 5 and 6 mirror the definitions in Equations 3 and 4? Equations 5 and 6 seem to be a mix of the generic vector format of Equations 1 and 2 and equations with specific indexes i and j such as Eq 3 and 4. Please, choose one format for clarity; probably the format with the indexes i and j would be more suitable for Eq 5 and 6 following the flow of the paper.

13. Line 114, page 4, what is the dimensionality of y hat, s hat and e hat? Probably this will be clearer once the indexes are clarified (see above).

14. 'Therefore' might be more appropriate than 'therefor' at line 8 page 1, and also line 50 page 2, and also in other places across the manuscript.

15. Line 118 page 4, form -> from

16. In equation 7 page 4, the variable Y_s is undefined. I was initially confused by it because the text that precedes it talks about the true values y. It might help to write a sentence giving a proper definition for Y_s, and accompany this equation with another equation exemplifying what Y_s_ij looks like for the simplified model f=Y/s and error e_rel*Yij/sj.

Reviewer #2: This manuscript describes a normalization strategy for western blotting as well as many other assay types that can put relative data from different experiments or replicates on the same quantitative scale. This is needed often because experiment-specific factors cause the scaling to not be comparable between replicates or different experiments. A main claimed novelty is that the same condition need not be contained in every experiment for normalization. Rather, each experiment needs to share at least one point with one other experiment. I would have liked to have seen more application and discussion to data sets that have such a feature, and showing where current methods fail. And perhaps some discussion of how often this scenario is found in the literature and would be needed. There are other points listed below that may be important to address:

1. What justification do the authors have for assuming normality in errors for western blots? How much does that affect the conclusions of the paper? Could alternative error models be used and BlotIt still function well? How does that impact application to other data types?

2. It is appreciated that the authors' technique is claimed to work when data points are not shared between all replicates and/or conditions. How common or rare this is for bench scientists performing replicates or experiments was not discussed. What novelties or advantages does blotIt have when data points are shared among all replicates or experiments? Getting more clarity on that would help make the impact and uptake of the paper clearer.

3. How does the proposed approach differ from the scaling factor approach described here: 10.1038/msb.2009.4? There is a general lack of comparison to other analysis methods which have been established and used for quite a long time, as cited by the authors.

4. The discussion of how to use the R code and format it seems more appropriate for a detailed methods section, not the results section.

5. In Figure 2, how do alternative methods for normalizing data compare to BlotIt?

6. Perhaps a simulated data study where data points are actually shared between all experiments, but are hidden to see how BlotIt does, could be an effective analysis to demonstrate usefulness and also compare to other normalization methods.

7. How would one compare the common scale data as the output of BlotIt to model simulations that would have a different scale (e.g. absolute concentrations)? Often comparison to and use with dynamical models is cited in the paper as a main motivator, but discussion with respect to this is lacking.

8. Therefor --> therefore

Reviewer #3: Review for the manuscript "BlotIt - Optimal alignment of western blot and qPCR experiments": The manuscript proposes a novel alignment method for relative data. Via optimization, it finds a version of the data on a common scale. An implementation in R is provided. The manuscript is overall well written and easy to read, while in some places it could be more specific. In my opinion, the new method is interesting and will find usage, while it is maybe not a major conceptual breakthrough. I have a few comments, which I think should be addressed in a revision:

Content
-------
- It could be made clearer that, unlike e.g. the approach by Weber et al., the approach is essentially model-free, i.e. only dependent on the data and a noise model, but not e.g. a post-hoc employed ODE model.
- A comparison to e.g. the method by Weber, as well as the method by Degasperi, in a situation where it is applicable, would be of interest, e.g. regarding predictions, efficiency and uncertainties. It is however understandable if this is beyond the scope of this work.
- l. 35f: "disadvantage of enlarging the parameter space drastically": The approach by Weber (or also later papers by Loos et al. "Hierarchical optimization for the efficient parametrization of ODE models" and Schmiester et al. "Efficient parameterization of large-scale dynamic models based on relative measurements") appear to argue explicitly that the parameter space is effectively not enlarged by a hierarchical formulation.
- l. 36f: I did not understand "estimates of the scaling parameters might be biased by the model equations[,] hampering hypothesis testing and therefor[e] interpretation".
- l. 82: "the error model considers the variance information from all experimental data" as opposed to "calculation of errors based on the spread of measurement values" does not get clear to me.
- l. 146: There appears to be a constant $\pi$ missing in (10b), when deriving (10a) and (10b) from a normal density, which affects the relative impact of both terms.
- l. 146: I think it could be clarified that $\bar y$ in (10c) denotes the mean (over all data points?)?
- l. 146: Where does the $10^{-3}$ come from? This appears to be rather arbitrary and may affect how much emphasis the method puts on normalization. How sensitive is the method with regard to it? Or can it be chosen in a problem-specific manner? Or would there be an alternative formulation as an optimization problem with explicit constraint $\bar y = 1$?
- l. 146: Can (10c) be interpreted stochastically (as (9) claims to describe a probability density)?
- l. 158: "This drastically improves numerical stability": This is surely an accurate fact, yet a reference may be good. Is the method applicable to negative data?
- l. 167ff: References on the parameter/data ratio, the variance underestimation, and the Bessel correction would be good.
- l. 180: The confidence interval appears based on a local Taylor approximation given asymptotic normality of the maximum likelihood estimate (with covariance matrix given by the inverse Fisher information matrix). Conceptually, there should be alternative methods, e.g. based on Wilk's theorem or sampling. Maybe a contextualization would be good?
- l. 184: What do the authors mean by the FIM is "represented by the Hessian"?
- Implementation of the method: How is the optimization problem solved? Are gradients available? Does the problem have multiple local optima?
- Implementation of the method: How computationally expensive is the method? Does it scale to e.g. aligning single-cell data, where normalization is often done simply by cell size?
- As mentioned before, a comparison with alternative methods, and a discussion on how to use the scaled data in downstream analysis would be of interest, but it is understandable if this is beyond the scope of this work.
A particular question that may come up is: E.g. an ODE model will output values on a certain scale, which may be different from the normalized scale by the presented method. Would this necessitate the use of scaling factors when fitting the ODE model still? Grammar ------- - e.g. l. 8, 37: While this word also exists, you probably mean "Therefore" in multiple places. - l. 37: "[,] hampering" - l. 195: "concentrations[,] meaning" - Table 1: comma in $Y_s = f^{-1}(Y, \hat s)$ ********** 6. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files. If you choose “no”, your identity will remain anonymous but your review may still be made public. Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy. Reviewer #1: No Reviewer #2: No Reviewer #3: Yes: Yannik Schälte [NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.] While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step. 7 Jun 2022 Thank you for submitting your manuscript to PLOS ONE. 
After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process. The three reviewers agree that the manuscript would be a valuable addition to the literature, provided that a number of aspects are clarified, and I concur with their assessment. In the review reports you will find a list of suggestions to clarify a number of technical aspects of the proposed method. Notably, two of them agree that it would be interesting to compare BlotIt with alternative approaches. As pointed out by Reviewer 2, one possibility would be to perform a comparison using simulated datasets. If possible, such a study would greatly enhance the contributions of the paper. Answer: We thank the editorial board for the opportunity to submit a revised manuscript. A new chapter including a detailed method comparison with simulated data was added. We will elaborate the specifics below. Please include the following items when submitting your revised manuscript: • A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled ’Response to Reviewers’. • A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled ’Revised Manuscript with Track Changes’. • An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled ’Manuscript’. Journal Requirements: When submitting your revision, we need you to address these additional requirements. 1. Please ensure that your manuscript meets PLOS ONE’s style requirements, including those for file naming. Answer: We checked and conformed to PLOS ONE’s style requirements and those for file naming. Please let us know if a requirement is not met. 2. 
We note that the grant information you provided in the ‘Funding Information’ and ‘Financial Disclosure’ sections do not match. When you resubmit, please ensure that you provide the correct grant numbers for the awards you received for your study in the ‘Funding Information’ section. Answer: Thanks for checking this in detail! We indeed remarked that the grant number 031L004 was incomplete (should be 031L0048). We corrected this in the ‘Funding Information’ section. We hope that this is what you were referring to. Otherwise, please let us know. 3. PLOS ONE now requires that authors provide the original uncropped and unadjusted images underlying all blot or gel results reported in a submission’s figures or Supporting Information files. This policy and the journal’s other requirements for blot/gel reporting and figure preparation are described in detail at https://journals.plos.org/plosone/s/figures#loc-blot-and-gel-reporting-requirements and https://journals.plos.org/plosone/s/figures#loc-preparing-figures-from-image-files. When you submit your revised manuscript, please ensure that your figures adhere fully to these guidelines and provide the original underlying images for all blot or gel data reported in your submission. See the following link for instructions on providing the original image data: https://journals.plos.org/plosone/s/figures#loc-original-images-for-blots-and-gels. In your cover letter, please note whether your blot/gel image data are in Supporting Information or posted at a public data repository, provide the repository URL if relevant, and provide specific details as to which raw blot/gel images, if any, are not available. Email us at plosone@plos.org if you have any questions. Answer: In this manuscript only simulated data and already published measurements from Kok et al. were used. There might have been a misunderstanding in Figure 1. 
We apologize for that and adjusted the figure description to clarify that simulated data was used. Reviewers’ comments: Reviewer #1: BlotIt - Optimal alignment of western blot and qPCR experiments Summary The authors present an automated procedure for the normalisation of relative data, such as data obtained from Western blot or rt-qPCR. Relative data are usually not directly comparable across replicates, because of different arbitrary units obtained, and so require normalisation. Using likelihood-based optimisation, the proposed method is able to estimate how to scale each replicate, and at the same time provides an estimate of mean and variance for the underlying data. The approach is flexible in various ways, as it allows different error models to be used, and also it is able to combine data from multiple blots as long as there is one experiment shared across pairs of blots. The work presented is certainly worthy of publication, although some clarifications and minor corrections are necessary before I can give my final approval. Major 1. Missing discussion of the fact that Y_ij at extremes of the dynamic range of detection can have relatively low signal to noise ratio or high residual error epsilon. While the error epsilon is initially introduced as potentially different for each Y_ij, it is then simplified to have a variance proportional to the value of the measurement (σ_ij = e_rel · y_i/s_j). While this can be considered an appropriate generic error model, its drawbacks should also be highlighted, such as its underestimation of variance for relatively low intensity blot values (which notoriously have a low signal to noise ratio). It would also be more appropriate to adjust lines 80-83 page 3, to reflect the fact that the efficacy of this error model (Eq 2) depends on the assumptions chosen to implement h and on whether the data follow these assumptions, rather than just describing it as a superior approach. 
The advantages and disadvantages of the chosen simplified model should be highlighted. I guess the advantage here is the reduction of parameters to estimate (just e_rel), while the disadvantage is the loss of model flexibility. For example, it will ignore that low intensity values might have a much higher relative variability (because of the low signal to noise ratio). Answer: Thanks for commenting on this point in detail! It is indeed important to adjust the error model according to the error distribution of the data. We therefore added emphasis on the importance of the choice of error model in the revised manuscript. For the specific case mentioned above the purely relative error model σ_ij = e_rel · y_i/s_j still results in individual errors of Y_ij. We added a more detailed explanation why the reduced error model is an appropriate choice for the discussed application. [revised manuscript lines 90-96] 2. Please define a set or range of values for the indexes i and j. For example, there could be I conditions and J blots with i in the range 1,...,I and j in the range 1,...,J. In principle, each equation should have a definition for the i and j range, like ‘for all i in (...), j in (...)’. The simplest case would be when the set of experiments i is the same for all J. However, the authors imply that different j can have different sets of experiments, so different i? In this case perhaps it makes more sense to talk about indexes i that belong to sets I_j (that are indexed by j) and that the intersections of the sets I_j need to be at least pairwise not empty (i.e. share at least one condition i). This might also affect the notation in equation 10. Answer: A more formal introduction of the sets I and J was added. There is only one true value for each y_i, i ∈ I, but the reviewer is correct: in each experiment, e.g. on each gel (for western blotting), a different subset of I can be measured. 
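The index structure and the purely relative error model discussed in this answer can be sketched in a small simulation; this is a language-agnostic illustration in Python (blotIt itself is an R package), and all condition values, subsets and scaling factors below are invented for the example:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical setup: 6 conditions with true values y_i, measured on 3 gels j;
# each gel covers only a subset I_j of the conditions, with pairwise overlap
# between gels (all numbers are illustrative, not taken from the paper).
y_true = np.array([1.0, 2.0, 4.0, 3.0, 1.5, 2.5])
subsets = {0: [0, 1, 2, 3], 1: [2, 3, 4], 2: [4, 5, 0]}  # index sets I_j
scales = {0: 1.0, 1: 0.4, 2: 2.5}                        # scaling factors s_j
e_rel = 0.05                                             # relative error parameter

data = {}
for j, idx in subsets.items():
    y_scaled = y_true[idx] / scales[j]        # Y_ij = y_i / s_j
    sigma = e_rel * y_scaled                  # sigma_ij = e_rel * y_i / s_j
    data[j] = y_scaled + rng.normal(0.0, sigma)

# Although the error model has the single parameter e_rel, each measurement
# Y_ij still receives its own individual error sigma_ij.
```

Each gel j measures only a subset of the conditions, and comparability across gels is established solely through the pairwise overlaps of the subsets I_j.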
This is indeed the case for the simulation study where different scenarios are elaborated. [revised manuscript lines 58-65] 3. How is the mean of the true values ȳ (equation 10c) defined? This is particularly important if there are different conditions i on different blots j (like in Figure 2); I guess this will be a mean across all i regardless of what blot j they belong to? Answer: The scaling of replicates can only be performed relative to each other. Thus, one additional constraint has to be introduced to define the scaling parameters. The constraint fixes the mean over all measurements e.g. to one. This is indeed arbitrary. The remaining degree of freedom requires the choice of a unit to display the results, here a multiple of the mean over all data. 4. In equation 7, page 4, the measurement error term is not shown. How do you expect the value of Y^s_ij to be determined if the error is unknown? For example, rearranging Eq 1, Y^s_ij = (Y_ij + e_ij) · ŝ_j. Because the error is potentially different for each data point, how will the mean and sd of the aligned data be affected? More explanation is given later, but here a clarification and showing the particular example (f = Y/s) next to the generic equation would help the flow. Perhaps it would help to mention the ideas of error propagation here, with examples and simplifying assumptions. Answer: The error is not shown in equation (7) because we want to visualize the scaling of the measured values Y from their own to a common scale via the model. We added a small look-ahead to the more detailed explanation later. The aligned data is always on the common scale, which is why the propagation of the corresponding errors is not discussed. [revised manuscript lines 130-134] 5. At lines 129 to 131 page 4, the authors assert that for dynamic model parameter estimation, mean and standard deviation are sometimes preferred, citing the case of a low number of replicates. 
One could argue the opposite, that if there are few replicates, then all data should be used for fitting, because the average and sd alone may not properly represent a given data point. I suggest to remove, or add more literature or explanation to support the authors' claim. Answer: There seems to be a misunderstanding based on poor phrasing on our side. We concur with the reviewer: In cases of a small number of replicates, the use of scaled data (i.e. data scaled to a common scale) is not the best input. In those cases the aligned data set (what we meant by "means") and the Fisher information based confidence intervals (see Table 1) are better suited. We rewrote the mentioned passage to clarify this. [revised manuscript lines 144-149] 6. The claim at lines 133 and 134 that the estimated errors are more reliable than the data spread obtained from replicates should be further explained. For example, this could be true only if the data agree with the error models and there could be exceptions (see above for low signal to noise data points), and also it depends on how the data spread of replicates is calculated; for example, if the replicates are done all on the same blot, then it might be more accurate, but if they are done on different blots, then the spread is completely dependent on the normalisation applied and possibly the value of other data points. Answer: This is a misunderstanding: the aligned data set consists of the estimated true values ŷ. The ê does not describe the errors of the error model but the uncertainties of the model fit itself, and thus quantifies the error of the fitted parameters ŷ. This is elaborated in the error calculation section; we clarified this in the text. [revised manuscript lines 146-147] 7. Why is there no error term in equation 8? Also, in this case it would be useful to give a generic idea of how the error can be inferred and follow Eq 8 with the concrete example f = Y/s, and simplifying assumptions. 
Answer: Equation (8) only shows the definition of the predicted data set. The error calculation (of all data sets) is the topic of the Error determination section. We added a remark pointing at this section and included a more detailed example there. [revised manuscript lines 156-157] 8. On page 5 line 146, please reconsider and rephrase the claim that the optimal theta is obtained by minimising the spread of the replicates, because the likelihood model is not just about trying to reduce the spread (low sigma). Answer: Yes, this was an oversimplification, thank you for the remark. [revised manuscript lines 160-162] 9. In the error determination section (page 6), equations 13 and 14 require a reference, please add. Also, it would be nice to accompany these equations with concrete examples for the specific cases and simplifying assumptions described (like f = Y/s). For example, σ^s, if I am not mistaken, simplifies to σ^s_ij = s_j · σ_ij. Answer: References for equations 13 and 14 and the example for the present model have been added. • Reference Bessel correction (revised manuscript line 187) • Reference Gaussian error propagation (revised manuscript line 192) • Reference Fisher Information (revised manuscript line 197) 10. Perhaps it would be of interest adding to the discussion some speculative yet useful cases. For example, what happens when two blots share only one experiment, but in one blot j the measurement is likely to have a high signal to noise ratio, while in the other blot j’ the same experiment has a very poor signal to noise ratio, perhaps because it is a low intensity value. Would the proposed model be able to propagate the variance of the normalised data? Would there be enough data to constrain the model optimisation? Answer: Thanks for bringing up the effect of different signal to noise ratios again! 
To elaborate this effect and the influence of different numbers of overlap samples we included several scenarios in the performance analysis described in the new section [Application to simulated data]. 11. What other error models are available in blotIt besides e_rel · value and e_abs? Answer: BlotIt does not have a repository of pre-implemented (error) models. The user can freely define them. Minor 12. Shouldn’t Equations 5 and 6 mirror the definitions in Equations 3 and 4? Equations 5 and 6 seem to be a mix of the generic vector format of Equations 1 and 2 and equations with specific indexes i and j such as Eq 3 and 4. Please choose one format for clarity; probably the format with the indexes i and j would be more suitable for Eq 5 and 6 following the flow of the paper. 13. Line 114, page 4, what is the dimensionality of ŷ, ŝ and ê? Probably this will be clearer once the indexes are clarified (see above). Answer: This is a good point, we use the index definition for equations (3-6) now. The meanings of i and j are defined in the beginning of the methods section, their dimensionality should be clearer now. [equations 3,4 changed vectors to indexes] 14. ‘Therefore’ might be more appropriate than ‘therefor’ at line 8 page 1, and also line 50 page 2, and also in other places across the manuscript. 15. Line 118 page 4, form -> from 16. In equation 7 page 4, the variable Y^s is undefined. I was initially confused by it because the text that precedes it talks about the true values y. It might help to write a sentence giving a proper definition for Y^s, and accompany this equation with another equation exemplifying what Y^s_ij looks like for the simplified model f = Y/s and error e_rel · Y_ij/s_j. Answer: The grammar mistakes have been fixed and the text preceding equation (7) was clarified. The error is not mentioned here, because we wanted to highlight the scaling of the measured data. The error propagation is then covered later in the respective section. 
Also the explicit example for the western blot scaling model can be found there. Reviewer #2: This manuscript describes a normalization strategy for western blotting as well as many other assay types that can put relative data from different experiments or replicates on the same quantitative scale. This is needed often because experiment specific factors cause the scaling to not be comparable between replicates or different experiments. A main claimed novelty is that the same condition need not be contained in every experiment for normalization. Rather, each experiment needs to share at least one point with one other experiment. I would have liked to have seen more application and discussion of data sets that have such a feature, and showing where current methods fail. And perhaps some discussion of how often this scenario is found in the literature and would be needed. Answer: In principle, all measurements can be measured as perfect replicates (so that in N experiments, the exact same biological setup is measured N times). In practice, biological conditions (treatments, targets, time points etc.) vastly outnumber the capacity of one experiment. This gives rise to the need to extend that limit of comparable biological samples by measuring only some samples as overlap to previous experiments. We added a simulation study to discuss the performance of blotIt compared to different other scaling approaches. In that context we also introduced some usual scenarios, in which partial overlap between experiments is necessary. There are other points listed below that may be important to address: 1. What justification do the authors have for assuming normality in errors for western blots? How much does that affect the conclusions of the paper? Could alternative error models be used and BlotIt still function well? How does that impact application to other data types? 
Answer: We made some changes to better motivate the choice of the explicit error model used in the manuscript. The validity of the error model must of course be addressed for each application. Due to the flexible nature of the utilized optimization approach, every error model can be used. 2. It is appreciated that the authors' technique is claimed to work when data points are not shared between all replicates and/or conditions. How common or rare this is for bench scientists performing replicates or experiments was not discussed. What novelties or advantages does blotIt have when data points are shared among all replicates or experiments? Getting more clarity on that would help make the impact and uptake of the paper clearer. Answer: Thanks for pointing out that the need for having only partial overlap did not become clear yet! Indeed, it is a very common scenario that bench scientists perform replicate measurements that are not shared between all experiments. We added an explanation to the discussion and evaluated this point also in the new section [Application to simulated data]. [revised manuscript lines 404-410] 3. How does the proposed approach differ from the scaling factor approach described here: 10.1038/msb.2009.4? There is a general lack of comparison to other analysis methods which have been established and used for quite a long time, as cited by the authors. Answer: Wang et al. (10.1038/msb.2009.4) used the strategy of optimal alignment. In our new section [Application to simulated data] we analyse the performance of this and other methods in relation to blotIt and work out similarities and differences. 4. The discussion of how to use the R code and format it seems more appropriate for a detailed methods section, not the results section. Answer: This is a valid point which we also considered when deciding on the structure of the paper. 
However, as we would like to emphasize the application of our method to various data types, we decided to describe it in the results section. 5. In Figure 2, how do alternative methods for normalizing data compare to BlotIt? 6. Perhaps a simulated data study where data points are actually shared between all experiments, but are hidden, to see how BlotIt does, could be an effective analysis to demonstrate usefulness and also compare to other normalization methods. Answer: Thanks for bringing up this point! It motivated us to perform a method comparison that is now included as the new section [Application to simulated data]. However, the data sets described for the real world scenarios in former Figure 2, now Figure 3, cannot be scaled with the other approaches as no biological condition (dose and time point) is measured in all three experiments E1-E3. Therefore, we analyzed the performance of different scaling methods based on simulated data sets where all methods are applicable. 7. How would one compare the common scale data as the output of BlotIt to model simulations that would have a different scale (e.g. absolute concentrations)? Often comparison to and use with dynamical models is cited in the paper as a main motivator, but discussion with respect to this is lacking. Answer: Thanks for pointing out this unclarity! Indeed, it is recommended to include in the model formulation one scaling parameter for the whole data set, instead of the N experiment-specific scaling parameters that would be necessary without normalization. This enables the proper comparison of the common scale data and the model simulations. We clarified this important point in the manuscript. [revised manuscript lines 140-143] 8. Therefor -> therefore Answer: Thanks, we corrected it. Reviewer #3: Review for the manuscript "BlotIt - Optimal alignment of western blot and qPCR experiments": The manuscript proposes a novel alignment method for relative data. 
Via optimization, it finds a version of the data on a common scale. An implementation in R is provided. The manuscript is overall well written and easy to read, while in some places it could be more specific. In my opinion, the new method is interesting and will find usage, while it is maybe not a major conceptual breakthrough. I have a few comments, which I think should be addressed in a revision: Content —— - it could be made clearer that, unlike e.g. the approach by Weber et al., the approach is essentially model-free, i.e. only dependent on the data and a noise model, but not e.g. a post-hoc employed ODE model. Answer: This is a valid point! We included a paragraph in the discussion addressing this difference. [revised manuscript lines 424-427] - A comparison to e.g. the method by Weber, as well as the method by Degasperi, in a situation where it is applicable, would be of interest, e.g. regarding predictions, efficiency and uncertainties. It is however understandable if this is beyond the scope of this work. Answer: Since the paper was originally addressed to a more application-based audience we initially refrained from including a dedicated performance comparison of different scaling methods, especially since our focus lay on cases where the other methods are structurally not applicable. However, we see the added value in such an analysis. So we performed a simulation study for cases where all methods are applicable and evaluated the different performances. The setup of the simulation study can be found in the methods section, while the outcome is analyzed in the results section. We however only included data-based normalization approaches that are independent of any ODE model and added a short outlook concerning the Weber method in the discussion. [revised manuscript lines 422-428 494] - l. 35f: "disadvantage of enlarging the parameter space drastically": The approach by Weber (or also later papers by Loos et al. 
"Hierarchical optimization for the efficient parametrization of ODE models" and Schmiester et al. "Efficient parameterization of large-scale dynamic models based on relative measurements") appears to argue explicitly that the parameter space is effectively not enlarged by a hierarchical formulation. Answer: This is a good point we hadn't considered so far! As we would not regard hierarchical optimization as the standard approach – depending on the used tools and setup – the parameter space enlargement can still be a problem. We addressed this in the discussion now. [revised manuscript lines 420-422] - l. 36f: I did not understand "estimates of the scaling parameters might be biased by the model equations[,] hampering hypothesis testing and therefor[e] interpretation". Answer: When scaling parameters are optimized along with the model parameters of an ODE model (here: classically by estimating experiment-specific scaling factors), the model could resolve "problems" such as wrong model assumptions by changing the scales. Equally, a wrong scaling model might influence other model parameters. This is difficult to disentangle. - l. 82: "the error model considers the variance information from all experimental data" as opposed to "calculation of errors based on the spread of measurement values" is not clear to me. Answer: We rewrote this paragraph to be more specific, also concerning the choice of the error model. [revised manuscript lines 83-100] - l. 146: There appears to be a constant π missing in (10b), when deriving (10a) and (10b) from a normal density, which affects the relative impact of both terms. Answer: Thanks for pointing that out! We corrected it in the revised manuscript. However, the objective function is only changed by an additive constant that does not influence the optimum or the curvature of the optimization landscape: l(p) = Σ wres(p)² + log(π σ(p)²) = Σ wres(p)² + log(σ(p)²) + log(π) - l. 
146: I think it could be clarified that ȳ in (10c) denotes the mean (over all data points?)? Answer: ȳ represents the mean of the estimated true values as described in line 155 of the original draft and line 171 of the revised manuscript. - l. 146: Where does the 10^{-3} come from? This appears to be rather arbitrary and may affect how much emphasis the method puts on normalization. How sensitive is the method with regard to it? Or can it be chosen in a problem-specific manner? Or would there be an alternative formulation as an optimization problem with explicit constraint ȳ = 1? Answer: Based on our experience, optimization under the side constraint mean = 1 is often not very stable numerically. That's why we introduced the constraint via 1 − ȳ in the objective function. As ȳ = 1 can always be fulfilled exactly, the result is not sensitive to the penalization factor, here 10^{-3}. Only the number of iterations to the optimum can be changed by varying this factor. 10^{-3} has proved stable in practice. - l. 146: Can (10c) be interpreted stochastically (as (9) claims to describe a probability density)? Answer: No, it cannot. As described above, 10c is an auxiliary construct to incorporate the constrained optimization. - l. 158: "This drastically improves numerical stability": This is surely an accurate fact, yet a reference may be good. Is the method applicable to negative data? Answer: The method is also applicable to negative data. An example is given by the qPCR data set from our application example. As these data are on a log scale, values can be negative as well. The alignment model has to be adjusted accordingly as described. - l. 167ff: References on the parameter/data ratio, the variance underestimation, and the Bessel correction would be good. Answer: For the data/parameter ratio, this is a conservative estimate based on our experience. A reference for the Bessel correction was added. - l. 
180: The confidence interval appears based on a local Taylor approximation given asymptotic normality of the maximum likelihood estimate (with covariance matrix given by the inverse Fisher information matrix). Conceptually, there should be alternative methods, e.g. based on Wilks' theorem or sampling. Maybe a contextualization would be good? Answer: We also calculated the confidence intervals of the parameters using profile likelihood. This led to very well defined parameter intervals, indicating that the calculation based on the Fisher information is in fact a conservative approximation. We discussed the implementation of the profile likelihood method in blotIt but eventually decided to keep the conservative approximations as the additional computation time is not justified for the standard user. - l. 184: What do the authors mean by the FIM is "represented by the Hessian"? Answer: Since the FIM I(θ) is defined as the second derivative of the log-likelihood, I(θ) := d²l(θ)/dθ², we implement it by "recycling" the Hessian H which is evaluated in the fitting process anyway. Since the parameters of the last fitting step are the MLE, the evaluation of the Hessian at said step directly gives the variances of the MLE: Var(θ̂) = I(θ̂)^{-1} = H(θ̂)^{-1} - implementation of the method: How is the optimization problem solved? Are gradients available? Does the problem have multiple local optima? Answer: The implementation uses a trust-region optimizer, and the system is extremely well behaved. It usually takes no more than 50 steps to converge, and a second optimum was never observed. This might be because – contrary to dynamical modeling – the used scaling model is extremely close to reality. - implementation of the method: How computationally expensive is the method? Does it scale to e.g. aligning single-cell data, where normalization is often done simply by cell size? Answer: It is quite expensive compared to the analytical approaches. 
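The Hessian-based variance computation described in the answer above can be made concrete with a small numerical sketch (illustrative Python rather than blotIt's actual R implementation; the one-parameter Gaussian model is invented so that the exact answer is known):

```python
import numpy as np

rng = np.random.default_rng(0)
sigma, n = 2.0, 400
x = rng.normal(5.0, sigma, size=n)  # simulated data, true mean 5.0

def nll(mu):
    # negative log-likelihood of a Gaussian with known sigma,
    # up to an additive constant
    return 0.5 * np.sum((x - mu) ** 2) / sigma**2

mu_hat = x.mean()  # analytic maximum-likelihood estimate for this toy model

# Numerical Hessian (second derivative) of the objective at the optimum;
# for this model it equals the Fisher information I = n / sigma^2.
h = 1e-3
H = (nll(mu_hat + h) - 2 * nll(mu_hat) + nll(mu_hat - h)) / h**2

var_fim = 1.0 / H  # FIM-based variance of the estimate, Var = 1/H
```

For this toy model the Fisher information is n/σ², so the FIM-based variance 1/H reproduces the textbook value σ²/n.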
We did not do an explicit run time analysis, because we applied this method only to data sets of roughly equal size. We expect the run time to suffer heavily for extremely complicated sets (very large data sets with little pairwise overlap, for example). At least this process is very well parallelizable. - As mentioned before, a comparison with alternative methods, and a discussion on how to use the scaled data in downstream analysis would be of interest, but it is understandable if this is beyond the scope of this work. A particular question that may come up is: E.g. an ODE model will output values on a certain scale, which may be different from the normalized scale by the presented method. Would this necessitate the use of scaling factors when fitting the ODE model still? Answer: The primary focus of this work was the scaling itself. This is why ODE based approaches are a little less prominent in the discussion. To your specific question: Yes, scaling factors are still necessary. However, just one scaling factor is needed for the whole scaled data set, in contrast to individual scaling factors for all data subsets as would be necessary without normalizing the data in advance. [revised manuscript lines 140-143] Grammar —— - e.g. l. 8, 37: While this word also exists, you probably mean "Therefore" in multiple places. - l. 37: "[,] hampering" - l. 195: "concentrations[,] meaning" - Table 1: comma in Y^s = f^{-1}(Y, ŝ) Answer: Thanks for pointing out these comma and spelling mistakes! They are corrected in the revised manuscript. Submitted filename: Response to Reviewers.pdf 4 Jul 2022 BlotIt - Optimal alignment of Western blot and qPCR experiments PONE-D-22-03759R1 Dear Dr. Kemmer, We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements. 
Within one week, you'll receive an e-mail detailing the required amendments. When these have been addressed, you'll receive a formal acceptance letter and your manuscript will be scheduled for publication. An invoice for payment will follow shortly after the formal acceptance.

To ensure an efficient process, please log into Editorial Manager at http://www.editorialmanager.com/pone/, click the 'Update My Information' link at the top of the page, and double check that your user information is up-to-date. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they'll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,
Alejandro Fernández Villaverde, Ph.D.
Academic Editor
PLOS ONE

Additional Editor Comments (optional):

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the "Comments to the Author" section, enter your conflict of interest statement in the "Confidential to Editor" section, and submit your "Accept" recommendation.

Reviewer #1: (No Response)
Reviewer #2: All comments have been addressed
Reviewer #3: All comments have been addressed

**********

2. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions.
Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes
Reviewer #2: Yes
Reviewer #3: Yes

**********

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes
Reviewer #2: Yes
Reviewer #3: Yes

**********

4. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #1: Yes
Reviewer #2: Yes
Reviewer #3: Yes

**********

5. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes
Reviewer #2: Yes
Reviewer #3: Yes

**********

6. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics.
(Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: I would like to thank the authors for the additional work and for replying to my previous questions, which I consider answered. I just have a few additional points; only the first is of major concern, and hopefully it can be addressed easily.

1. I appreciate the additional simulation study; however, I am concerned about the criteria chosen to evaluate and compare the methods. The authors write: "The performance of the individual methods was evaluated for each of the data realizations based on the spread of the scaled data". Do the authors mean that the methods producing the normalised data with the narrowest standard deviation are preferable? I think that one of the points of the mentioned Degasperi et al. was that the data themselves have a spread, and that underestimating such spread could also be problematic, for example making us believe that there is a difference between two conditions just because our assumptions have reduced the uncertainty of their mean value. So, I wonder whether the goal should be to prefer a method that produces a spread of the scaled data that is as close as possible to that of the simulated data.

2. The Methods section begins with the definition of the sets I and J, as well as measurements Y_ij. If I understand correctly, the key message here is that measurements Y_ij are comparable across the index i but not j. If so, this should be stated clearly, and some examples perhaps modified to avoid confusion. For example, the examples of biological effects include things that are comparable, like different conditions and time points, but also things that are not comparable, like different protein targets in a Western blot.

3. Line 99 of the updated text: "all experimental data, what allows for a reliable error" - change 'what' to 'which'?

Reviewer #2: The authors have done a reasonable job of addressing the concerns raised in the review.
I think the paper should be published and will be of use in biological data analysis.

Reviewer #3: My comments on the first version have all been sufficiently addressed, with some minor issues:
- l. 425 and l. 446: "bloIt", and in a few other places. I guess the method goes only by "blotIt".
- e.g. missing spaces and commas and short forms like "didn't" in a few places in the newly added text.
- 10^{-3} in (10c): I would recommend including the answer to the question as part of the manuscript or supplement.
- Same for the answer to how the optimization problem was solved.

**********

7. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files. If you choose "no", your identity will remain anonymous but your review may still be made public. Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No
Reviewer #2: No
Reviewer #3: Yes: Yannik Schälte

**********

13 Jul 2022

PONE-D-22-03759R1
BlotIt - Optimal alignment of Western blot and qPCR experiments

Dear Dr. Kemmer:

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department.

If your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact onepress@plos.org.

If we can help with anything else, please email us at plosone@plos.org.

Thank you for submitting your work to PLOS ONE and supporting open access.

Kind regards,

PLOS ONE Editorial Office Staff
on behalf of Dr.
Alejandro Fernández Villaverde
Academic Editor
PLOS ONE
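The variance-from-Hessian relation Var(θ̂) = H(θ̂)^{-1} discussed in the author responses above can be illustrated numerically. The following is a minimal sketch, not blotIt's actual R implementation: it fits a hypothetical toy Gaussian model with a trust-region optimizer, recycles a finite-difference Hessian of the negative log-likelihood at the MLE as the observed Fisher information, and inverts it to obtain standard errors and 95% confidence intervals.

```python
# Hypothetical sketch (not blotIt's implementation): confidence intervals
# from the inverse Hessian of the negative log-likelihood at the MLE.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
data = rng.normal(loc=5.0, scale=2.0, size=200)  # toy data, true mu=5, sigma=2

def nll(theta):
    """Negative log-likelihood of N(mu, sigma), parametrized by log(sigma)."""
    mu, log_sigma = theta
    sigma = np.exp(log_sigma)
    return 0.5 * np.sum(((data - mu) / sigma) ** 2) + data.size * log_sigma

# Trust-region optimization, analogous to the optimizer described above.
fit = minimize(nll, x0=np.array([0.0, 0.0]), method="trust-constr")
theta_hat = fit.x

def numeric_hessian(f, x, h=1e-4):
    """Central finite-difference Hessian of a scalar function f at x."""
    n = len(x)
    H = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            ei = np.zeros(n); ei[i] = h
            ej = np.zeros(n); ej[j] = h
            H[i, j] = (f(x + ei + ej) - f(x + ei - ej)
                       - f(x - ei + ej) + f(x - ei - ej)) / (4.0 * h ** 2)
    return H

# Hessian of the negative log-likelihood at the MLE = observed Fisher
# information; its inverse approximates the covariance of the MLE.
cov = np.linalg.inv(numeric_hessian(nll, theta_hat))
se = np.sqrt(np.diag(cov))
ci = np.column_stack([theta_hat - 1.96 * se, theta_hat + 1.96 * se])  # 95% CIs
```

The profile-likelihood intervals mentioned in the responses would instead re-optimize the remaining parameters along a grid of each fixed parameter, which explains the extra computation time the authors decided against.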
