Nirag Kadakia1,2,3. 1. Department of Molecular, Cellular, and Developmental Biology, Yale University, New Haven, CT, United States of America. 2. Quantitative Biology Institute, Yale University, New Haven, CT, United States of America. 3. Swartz Foundation for Theoretical Neuroscience, Yale University, New Haven, CT, United States of America.
Abstract
Functional forms of biophysically-realistic neuron models are constrained by neurobiological and anatomical considerations, such as cell morphologies and the presence of known ion channels. Despite these constraints, neuron models still contain unknown static parameters which must be inferred from experiment. This inference task is most readily cast into the framework of state-space models, which systematically takes into account partial observability and measurement noise. Inferring only dynamical state variables such as membrane voltages is a well-studied problem, and has been approached with a wide range of techniques beginning with the well-known Kalman filter. Inferring both states and fixed parameters, on the other hand, is less straightforward. Here, we develop a method for joint parameter and state inference that combines traditional state space modeling with chaotic synchronization and optimal control. Our methods are tailored particularly to situations with considerable measurement noise, sparse observability, very nonlinear or chaotic dynamics, and highly uninformed priors. We illustrate our approach both in a canonical chaotic model and in a phenomenological neuron model, showing that many unknown parameters can be uncovered reliably and accurately from short and noisy observed time traces. Our method holds promise for estimation in larger-scale systems, given ongoing improvements in calcium reporters and genetically-encoded voltage indicators.
Functional forms of biophysically-realistic neuron models are constrained by neurobiological and anatomical considerations, such as cell morphologies and the presence of known ion channels. Despite these constraints, neuron models still contain unknown static parameters which must be inferred from experiment. This inference task is most readily cast into the framework of state-space models, which systematically takes into account partial observability and measurement noise. Inferring only dynamical state variables such as membrane voltages is a well-studied problem, and has been approached with a wide range of techniques beginning with the well-known Kalman filter. Inferring both states and fixed parameters, on the other hand, is less straightforward. Here, we develop a method for joint parameter and state inference that combines traditional state space modeling with chaotic synchronization and optimal control. Our methods are tailored particularly to situations with considerable measurement noise, sparse observability, very nonlinear or chaotic dynamics, and highly uninformed priors. We illustrate our approach both in a canonical chaotic model and in a phenomenological neuron model, showing that many unknown parameters can be uncovered reliably and accurately from short and noisy observed time traces. Our method holds promise for estimation in larger-scale systems, given ongoing improvements in calcium reporters and genetically-encoded voltage indicators.
This is a PLOS Computational Biology Methods paper.
Introduction
Computations in biological neural networks are shaped both by connection topologies and the response dynamics of individual neurons [1-3]. For single neurons, a versatile and biologically realistic class of computational models is built on the Hodgkin-Huxley framework—so-called conductance-based neuron models [4]. Fitting such biophysically realistic neural models to data is often cast in the framework of statistical inference [5-9], which systematically takes into account i) noise in the model dynamics and in the observations, and ii) unobservability in model states. Moreover, inference procedures produce not only an optimal estimate of model unknowns, but also the distributions of these estimates around their optima, providing a measure of estimate uncertainty. Various manifestations of statistical inference have been applied in neurobiological, behavioral, and neuromorphic settings [6, 7, 9–21].In many settings, the emphasis of statistical inference has been on tracking dynamical variables, such as the membrane voltage, over time. Algorithms like the Kalman filter [5, 22] or its many variants [5, 5, 23, 24] solve this state estimation problem recursively, by updating the optimal estimate sequentially with each subsequent observation. This is computationally cheap and memory-efficient, requiring only the estimate at the most recent timestep. But extracting time-varying quantities is only one aspect of the inference procedure, and is not usually the direct quantity of interest. Rather, it is knowledge of the fixed parameters, such as time constants, channel conductances, baseline ionic concentrations, and synaptic strengths, that is required to generate predictions to novel stimuli, informing our understanding of brain function. Recursive filtering methods are not directly applicable when neuron model parameters are unknown, since the model dynamics, which together with the data determine subsequent estimates, are not fully specified.One approach for parameter estimation in filtering is to “promote” parameters to dynamical states, but with trivial dynamics—this keeps parameters approximately constant, while allowing enough stochasticity to improve estimates [5, 25]. More sophisticated methods combine filtering and parameter inference into an expectation-maximization (EM) algorithm that updates state and parameter estimates in alternating fashion [5, 26]. However, such techniques have been limited to the estimation of linear parameters such as ion conductances, rather than nonlinear parameters such as those governing ion channel kinetics. This is due to the fact that EM algorithms converge to local minima; with nonlinearities and noise, cost manifolds are highly nonconvex, precluding reliable estimation.An alternative to filtering approaches is the optimization of a posterior distribution over, jointly, the states at all timepoints and the unknown parameters. Optimization-based methods have been used to effectively estimate linear parameters in neural models [6, 26–30], as well as to uncover nonlinear parameters in nonlinear and even chaotic models [7, 17–20, 31–37]. Here, we build on some of these approaches to present a new method for joint state and parameter estimation in biophysical neural models. Our emphasis is on strongly nonlinear (including chaotic) models, sparse observations, unknown nonlinear parameters, and extreme nonconvexity in the state and parameter manifold. We combine two ideas which have recently been investigated in nonlinear systems estimation: chaotic synchronization [38-40], and homotopy continuation [7, 33, 41, 42]. Our approach is predicated on a previous observation that, by coupling measured data directly into the system dynamics, cost manifolds over the unknown states and parameters become highly regularized [37, 40]. We extend this notion into an optimal control framework, treating the coupling strength as a control parameter which we optimize using the Pontryagin minimum principle [43]. What results is a new dynamical system that represents the evolution of the optimal state estimate in time. We present an algorithm to simultaneously find the integral curves of these “estimation dynamics” and determine the unknown parameters from noisy data. We illustrate our algorithm using synthetic data from both a sparsely observed canonical choatic attractor and a biophysical HH-type model, comparing to the prior techniques [33, 40].We find that parameter estimates in nonlinear and chaotic systems are significantly more accurate than in previous estimation methods, particularly for very sparse observations. Our technique is able to accurately estimate nonlinear model parameters from short time traces, improving on prior efforts to uncover linear parameters [26]. Moreover, estimates are considerably robust to both measurement noise and model noise. The latter manifests as noisy current inputs, affecting the timing of action potentials, and has not been investigated extensively in other optimization-based approaches. In sum, our method finds significant improvements in prediction accuracy over other optimization-based estimation methods, while exhibiting some tradeoffs in higher computational cost due to additional constraints.
Results
Nudging synchronization for estimating dynamical states
We first consider the problem of estimating the hidden dynamical states of a system with known dynamics from noisy observations, focusing on a simple technique which will form the basis for our proposed approach. The system evolves either deterministically via the ODE dynamics
or stochastically under the Langevin dynamics
where η(t) is uncorrelated Gaussian noise, and for now we assume that all model parameters are known. The system is sparsely observed, so the dimension L of the observation vector is generally less than that of the D-dimensional state space , e.g. L < D. For notational simplicity, throughout we represent as a D-dimensional vector with nonzero elements only in its L observed elements. An unsophisticated but straightforward and computationally attractive way to infer the evolution of both the observed and unobserved variables is by dynamically coupling the system to the measurements, effectively controlling the system dynamics with data [38, 40, 45–47]. To do this, we redefine the dynamics as:
where U is a constant, diagonal matrix with nonzero elements u for observed states l, and H projects the state space x onto the observed subspace y—i.e. both U and H are DxD but have rank L. The initial states x(t0), can be initialized randomly, and the dynamics Eq 3 are simply integrated forward [45]. Information passes from the data to the observed L states through the linear control term U(y − Hx), then from observed to the D—L unobserved states through the coupled dynamical equations (Fig 1A). Information therefore passes from data to all model states, observed or not. The strength of the nonzero control terms u should be set to a value large enough to synchronize the system to the data, but not so large that the noise in the observations is magnified [46]. In the geophysics literature, this technique is known as “nudging” [45, 46]. Those familiar with linear filtering or control theory will recognize that the control term, a linear scaling of the error between data and estimate, is analogous to the innovation term in the Kalman filter [5, 22]. However, the Kalman gain evolves in time and is chosen to minimize the residual estimation error at each time step, while U is constant and prescribed upfront. Though the Kalman prescription is more principled than the choice of U, there is a tradeoff in the computational costs of storing and manipulating large convariance matrices, not required here. Still, the more pressing issue is that sequential estimators such as nudging and Kalman filters are not directly suited for the estimation of static parameters. At each timepoint, sequential estimators incorporate new observations with both the running estimate from the previous timestep and the model predictions; without known parameters, the latter cannot be obtained.
Fig 1
Nudging synchronizes estimate to observations with lax parameter tuning.
A: Hidden states x evolve temporally under the controlled dynamics via , where is the sum of model dynamics and an external data-driven control term. The control feeds back the difference between and the observations y, for measured states; it is absent in unmeasured states. B: Ground-truth trajectories and observations Lorenz96 system in 5 dimensions; only x1 and x4 are observed. Measurements obtained by adding uncorrelated noise with σ = 1 at each timestep. C and D: True (black) and estimated (blue) trajectories of the Lorenz96 system, for no nudging control (C) and with x1 and x4 controlled with strength U11 = U44 ≡ u = 5 (D). The controlled system synchronizes closely to ground truth for both the observed unobserved variables. E: Mean-squared error of estimates as a function of control strength; near-minimal errors are achieved over a relatively wide range of u.
Nudging synchronizes estimate to observations with lax parameter tuning.
A: Hidden states x evolve temporally under the controlled dynamics via , where is the sum of model dynamics and an external data-driven control term. The control feeds back the difference between and the observations y, for measured states; it is absent in unmeasured states. B: Ground-truth trajectories and observations Lorenz96 system in 5 dimensions; only x1 and x4 are observed. Measurements obtained by adding uncorrelated noise with σ = 1 at each timestep. C and D: True (black) and estimated (blue) trajectories of the Lorenz96 system, for no nudging control (C) and with x1 and x4 controlled with strength U11 = U44 ≡ u = 5 (D). The controlled system synchronizes closely to ground truth for both the observed unobserved variables. E: Mean-squared error of estimates as a function of control strength; near-minimal errors are achieved over a relatively wide range of u.We illustrate nudging synchronization in the chaotic Lorenz96 [44] system in 5 dimensions:
The Lorenz96 system contains a single parameter F, which for values ∼8 render the dynamics chaotic. We assume that only x1 and x4 are observed, with Gaussian measurement noise, so
and the remaining H are zero and σ = 1. True states x are generated by integrating Eq 4 numerically with a timestep of dt = 0.01 over t ∈ [0, 5], from which observations are generated using Eq 5 (Fig 1B). Using these observations y, we obtain an estimate of the true states by integrating forward the nudged dynamics, Eq 3, with U11 = U44 ≡ u set to a fixed value, chosen between 0 and 100. We initialize our estimate at time t = 0 to the measured data y(0) for the observed variables x1 and x4, and uniformly between ±5 for the hidden variables. This chaotic system is hypersensitive to errors in the initial conditions, so without control, u = 0, the estimated trajectory evolves in a manner quite distinct from the true model (Fig 1C). For sufficient u, e.g. u = 5.0, the data synchronizes the estimates of both observed and unobserved states to the model, despite large initialization errors in (Fig 1D). A distinct advantage of nudging is that fine-tuning u is not necessary: within a substantial window of u, between ∼5−20, the system closely synchronizes to the true states; however, for u sufficiently large, the control term now overamplifies the observation noise, degrading the estimates (Fig 1E). The simplicity of nudging synchronization is apparent: it necessitates only the choice of the gain parameter u, which can be chosen rather loosely—and requires only a simple forward integration of the model dynamics.
Parameter inference in nudging synchronization
In the context of neuroscience, we are less interested in the time course of dynamical state variables than model parameters such as ion channel conductances and kinetic time constants [18, 19, 48]. One could envision a direct search over parameter space with some appropriately chosen cost function. To illustrate why a naive optimization may not be entirely straightforward, we plot the quadratic cost between the measurements and model, , where is the solution of the Lorenz96 model with forcing parameter F (Fig 2A) and the initial condition is fixed to its true value. The global minimum of this cost surface for F corresponds to the true parameter , but it resides in a narrow basin of attraction and is surrounded by multiple false minima (Fig 2A). This irregularity is a consequence of the highly nonlinear nature of the dynamics [37, 40], and it would be exceedingly difficult to pinpoint the global minimum with conventional optimization routines, even in this optimistic scenario in which x(t0) is assumed known and the search space is effectively one-dimensional.
Fig 2
Constraining least-square estimates with controlled dynamics enables robust parameter estimation in chaotic models.
A: Squared-difference between 10D Lorenz96 model trajectory x and measured data y, for various values of forcing parameter F, assuming that model was integrated forward from the true initial point. Only states x1, x4, x7, x10 were measured. The global minimum of C(F) (dotted line; F=8), resides in an extremely narrow basin of attraction within a highly nonconvex cost surface. B: Same as A, but for controlled dynamics (u = 4 and u = 15 for middle and right plots, respectively). Higher controls provide a far smoother cost function around the global minimum. C: 2D projection of the high-dimensional DSPE cost surface, along F in one dimension and the line U(t)≡u, ∀l, n in the other. Here, x(t) were fixed at the values generated by integrating the model forward from its true initial state for the given F. D: Estimated trajectories of x8, using uncontrolled least squares (top; green) and DSPE (bottom; red). Black line: true trajectory. Shaded regions: SD of estimate across 100 random initializations of the optimization. E: Histogram of parameter estimates across 100 initializations of the optimization, for least squares (green) and DSPE (red).
Constraining least-square estimates with controlled dynamics enables robust parameter estimation in chaotic models.
A: Squared-difference between 10D Lorenz96 model trajectory x and measured data y, for various values of forcing parameter F, assuming that model was integrated forward from the true initial point. Only states x1, x4, x7, x10 were measured. The global minimum of C(F) (dotted line; F=8), resides in an extremely narrow basin of attraction within a highly nonconvex cost surface. B: Same as A, but for controlled dynamics (u = 4 and u = 15 for middle and right plots, respectively). Higher controls provide a far smoother cost function around the global minimum. C: 2D projection of the high-dimensional DSPE cost surface, along F in one dimension and the line U(t)≡u, ∀l, n in the other. Here, x(t) were fixed at the values generated by integrating the model forward from its true initial state for the given F. D: Estimated trajectories of x8, using uncontrolled least squares (top; green) and DSPE (bottom; red). Black line: true trajectory. Shaded regions: SD of estimate across 100 random initializations of the optimization. E: Histogram of parameter estimates across 100 initializations of the optimization, for least squares (green) and DSPE (red).On the other hand, the cost between the measurements and the controlled dynamics is far more regular. If we instead consider , where is generated from the controlled dynamics Eq 3, we find that the cost surface smooths considerably with increasing nudging strength (Fig 2B). For sufficient gain (u = 15), the cost surface is fully convex, exhibiting a broad minimum around F = 7.81. Thus, the linear control term proportional to u acts to smooth the nonconvexities of the cost surface induced by the chaotic dynamics. This observation led Abarbanel et al. to propose the following constrained optimization problem [40]:
where, in contrast to direct nudging, the control terms are now time-dependent, U → U(t). In this dynamical state and parameter estimation (DSPE) framework, the dynamical states x(t), control terms U(t), and model parameters Θ are optimized simultaneously. With no control (U(t) = 0), this optimization reduces to a naive constrained least squares matching of the model to the observations. The insight of DSPE is that the cost function becomes smoother for larger U(t), localizing the estimate to the vicinity of the global minimum, while the quadratic penalty on U(t) reduces the controls, refining the estimate to the true minimum of the uncontrolled dynamics. Thus, one can think of U(t) as opening up new degrees of freedom, allowing escape from local minima during the optimization procedure. When the optimization has terminated, the optimal U(t)—which are driven to small values due to the ∼ U2 cost penalty—are discarded. In this sense, they are essentially an algorithmic device.We illustrate the regularizing effects of the control parameters by plotting a 2D projection of the DSPE cost surface, along F and along a 1D projection in the space of U(t) (Fig 2C): the surface has a broad minimum for larger control strength, and a complex, rugged surface as the control strength becomes weaker. To illustrate the performance of the DSPE algorithm, we compare it against a naive constrained least squares optimization without controlled dynamics, again using the 10D Lorenz96 system. Across 100 random initializations, both least squares and DSPE produced comparably accurate estimates of the state variables, with DSPE being somewhat more accurate near the boundaries of the time window (Fig 2D). The differences in the estimate of the forcing parameter F, however, were more striking. For nearly all initializations, DSPE estimated F within 1% of its true value, while the distribution of parameters estimated by least squares peaked at the true value but were far more dispersed (Fig 2E).
We now present our main results. We have seen that DSPE is built naturally on nudging synchronization, enforcing controlled dynamics as constraints in a global optimization over states, parameters, and time-dependent controls. Here, we suggest that the control variables, rather than acting as an algorithmic device to regularize the optimization procedure, could be instead specified in a more principled manner, by viewing DSPE as the foundation for an optimal control problem. In this framework, the optimal estimate can be derived using the Pontryagin minimum principle. Using the DSPE cost function Eq 6, expressed in continuous time,
and the controlled dynamics Eq 3, we apply the minimum principle in the usual way (Methods for derivation) to obtain an expression for the optimal control (assuming U is diagonal for readability; this can be relaxed trivially):
This control can be used to derive the dynamics obeyed by the optimal estimate and the conjugate momenta
p in time:
where the asterisk refers to the fact that the integral curves of these equations represent locally optimal trajectories. Since this dynamical system describes the evolution of the optimal estimate, we call it the estimation dynamics. Note that the estimation dynamics form a boundary value problem rather than an initial value problem [49]: x in the null-space of H are unobserved, so their values at t = 0 are unspecified. A wealth of studies have been devoted to solving control systems of this class, using polynomial approximations—so-called collocation methods—or instead using shooting methods, which solves the initial value problem perturbatively until the boundary conditions are matched [50]. Our problem has the added complication that with unknown parameters, the dynamics could not be integrated forward in either case. Instead, we propose an algorithm for simultaneously estimating , and Θ that leverages some desirable aspects of the original cost function in a practical implementation. We first use the optimal control condition Eq 9 to express , Eq 8, in space:
This is the original cost function to be optimized, now written in x, p. Since the estimation dynamics Eq 10
fully define the optimal trajectory, contains no new information [49]. Nevertheless, this function must be stationary along locally optimal trajectories. Further, it is convex in the space of observable x and associated p, so its minimum is global in those directions, and highly degenerate in the unmeasured directions.We could therefore use the solutions of as a starting point for the more complicated task of satisfying the highly nonlinear estimation dynamics. Specifically, we could begin by optimizing , subject to a loose enforcement of Eq 10. The resultant path and parameters would then be used as the initialization for a subsequent minimization of subject to Eq 10, now enforced to a tighter degree. We call this method “optimally-controlled dynamical state and parameter estimation” (OC-DSPE), and schematize it in Algorithm 1. The idea of OC-DSPE is that the enforcement of the nonlinear dynamics breaks the convexity in a gradual manner, allowing the global minimum to be systematically tracked—in much the same spirit as the control variable directions in the DSPE cost manifold (vertical direction in Fig 2C). This iterative procedure lies in a class of nonlinear techniques called homotopy continuation, where the solutions of one system are tracked by iterating from a simpler system with known solutions [41]. Incorporating this flavor of homotopy continuation into nonlinear estimation was the basis of a number of studies, in particular quasi-static variational assimilation [51]and variational annealing [7, 32–34]. The latter has been shown to be quite effective in tracking highly chaotic systems, as well as in highly under-observed neuron models with many unspecified nonlinear parameters. A more recent study illustrated benefits in nonlinear system estimation by extending variational annealing to Hamiltonian manifolds [49], in much the same spirit as Algorithm 1. OC-DSPE differs from these other homotopy continuation techniques essentially in its integral equations Eq 10, which are derived from a set of controlled state space dynamics. In previous formulations, the dynamics are uncontrolled.Algorithm 1 Optimally-controlled dynamical state and parameter estimation (OC-DSPE)Given measured data y and annealing parameters βmax, α, λ0for
q = 1, …, Q, in parallel doSample x uniformly from presumed dynamical rangeSample p uniformly from presumed rangeSample Θ from prior distributionend forfor
β = 1, …, βmax
doλ ← λ0
αfor
q = 1, …, Q in parallel doreturnend forend for
Numerical experiments of OC-DSPE with 10-dimensional Lorenz96 model
We first apply OC-DSPE to the Lorenz96 system in 10 dimensions with unknown forcing parameter F, performing 100 optimizations with different initial guesses (Methods for implementation details). We first show illustrative plots of the state estimates, assuming M = 4 states are observed. For the observed , the state estimate averaged over all 100 estimations is well-localized around the true trajectory (Fig 3A; top plot). The unobserved state variable is markedly less accurate (Fig 3B; top plot), but does approximate certain regions well (i.e. t ∼ 4−6), suggesting that the chaotic instabilities are less pronounced in those regions around the attractor. Of course, since the trajectories depend on the parameters, the accuracy of the states is often correlated with that of the parameters. If we plot the state estimates for which the forcing parameter is estimated close to its true value, , we find that the states are nearly indistinguishable from the true trajectories, beyond an initial synchronization region, t > ∼1 (Fig 3A and 3B; bottom). This initial synchronization region is not surprising. The optimal control quickly nudges the estimate toward the observed data, even if the state at the beginning of the estimation window t = 0 is highly inaccurate. Once synchronized, the optimal control maintains the estimate near the true trajectory at minimal control cost .
Fig 3
OC-DSPE exhibits enhanced parameter estimation accuracy in sparsely-observed chaotic models.
Illustration of parameter estimation in chaotic Lorenz96 system using optimally-controlled dynamical state and parameter estimation (OC-DSPE). A: Illustrative estimate of observed variable x1(t) when 4 states are observed, for 100 estimations randomly initialized (top), and for only those estimations in which the parameter was estimated to high accuracy (bottom; ). Grey: standard deviation of all estimates; Dotted: true state. B: Same as A, for the unobserved variable x3(t). C: Histogram of parameter estimates over 100 initializations, for 100 datasets initialized throughout the 10D Lorenz96 chaotic attractor, for 5 observed variables and using 3 different estimation routines (least squares, the original DSPE algorithm, and OC-DSPE). Parameter estimates are tightly clustered around the true value of 8.17. D: Same as C, for only 2 observed variables. Estimated is more widely dispersed, but centered on the true value only for OC-DSPE. E: Estimated parameter versus error in observed states x averaged over space and time. For low state error, estimated parameters are close to the true value for all 3 estimation routines. Tor larger state errors, proximity to the true value occurs only in OC-DSPE, indicating a larger degree of robustness in estimation accuracy. F: Dependence of |p(t)|, averaged over dimension i and time n, on error in parameter estimate, , for 5 (leftmost plot), 4, 3, and 2 (rightmost plot) observed states. In all cases, the minimum average |p| occurs at very close to the true F, indicating that the values of the conjugate momenta can act as a proxy for identifying the optimal parameter estimate.
OC-DSPE exhibits enhanced parameter estimation accuracy in sparsely-observed chaotic models.
Illustration of parameter estimation in chaotic Lorenz96 system using optimally-controlled dynamical state and parameter estimation (OC-DSPE). A: Illustrative estimate of observed variable x1(t) when 4 states are observed, for 100 estimations randomly initialized (top), and for only those estimations in which the parameter was estimated to high accuracy (bottom; ). Grey: standard deviation of all estimates; Dotted: true state. B: Same as A, for the unobserved variable x3(t). C: Histogram of parameter estimates over 100 initializations, for 100 datasets initialized throughout the 10D Lorenz96 chaotic attractor, for 5 observed variables and using 3 different estimation routines (least squares, the original DSPE algorithm, and OC-DSPE). Parameter estimates are tightly clustered around the true value of 8.17. D: Same as C, for only 2 observed variables. Estimated is more widely dispersed, but centered on the true value only for OC-DSPE. E: Estimated parameter versus error in observed states x averaged over space and time. For low state error, estimated parameters are close to the true value for all 3 estimation routines. Tor larger state errors, proximity to the true value occurs only in OC-DSPE, indicating a larger degree of robustness in estimation accuracy. F: Dependence of |p(t)|, averaged over dimension i and time n, on error in parameter estimate, , for 5 (leftmost plot), 4, 3, and 2 (rightmost plot) observed states. In all cases, the minimum average |p| occurs at very close to the true F, indicating that the values of the conjugate momenta can act as a proxy for identifying the optimal parameter estimate.Next, we focus on the accuracy of parameter estimates and compare i) OC-DSPE, ii) DSPE, and iii) constrained least squares (see Methods for details of the implementation of each method). We repeated the 100 estimations for Z = 100 distinct estimation windows around the Lorenz attractor to get a better representation of the estimation in regions where the chaotic instabilities are both strong or weak. We first plot the distribution of estimated parameters when M = 5 states (x; d odd) are observed. For M = 5, the estimated parameters are highly localized around the true value for all techniques (Fig 3C). The advantage of OC-DSPE becomes more apparent with fewer measured states. In particular, with M = 2 observed states, the estimates for all parameters are substantially more dispersed (Fig 3D). Only OC-DSPE, however, is centered near the true value; DSPE in particular is highly dispersed giving erroneous parameter estimates as low as 0. This illustrates that OC-DSPE, while computationally more demanding (the state space is doubled and the equations require more derivatives), can produce substantially more accurate parameter estimates in very sparsely observed chaotic systems.To further quantify the robustness of the estimation procedure, we calculated the estimated parameter as a function of the error between observations and estimates, (Fig 3E). Here, we chose, for each of the Z estimation windows, the optimal parameter estimate among the 100 initializations. This gives a “best-case” estimate for each dataset, and quantifies how this optimal parameter estimate depends on the errors in the dynamical states. Though all 3 algorithms produce accurate when the estimation error is minimal, OC-DSPE is far more tightly centered on the true parameter value even as the state error increases. This indicates that for OC-DSPE i) the optimal parameter estimate could be found more reliably with less initializations and ii) the optimal estimate can be identified with more reliably in practice, where the only error metric available is E.Finally, we consider the role of the auxiliary momenta variables p in OC-DSPE, a feature absent in direct optimization schemes. In the control-theoretic sense, p represent the marginal cost of violating the constraints given by the system dynamics (Eq 7). Regions in which p are appreciable correspond to regions in which the state estimate could easily deviate from the controlled dynamics, suggesting a lack of estimation accuracy (Fig 3E). Do deviations in p also have some bearing on the reliability in the parameter estimate? In other words, is there a correlation between some statistic of p and the estimation errors in F? Indeed, plotting the magnitude of p averaged over time and dimension as a function of shows a prominent dip at zero parameter estimate error (Fig 3F). Though this dip is more prominent for larger observability (leftmost plot in Fig 3F)), it is still present even for very sparse observability (rightmost plot in Fig 3F)). This indicates that the statistics of such artificial “conjugate” variables can provide a further consistency check on the accuracy of the estimation.
Numerical experiments with the Morris-Lecar model
Our primary system of interest is a biophysically realistic neuron modeled on the Hodgkin-Huxley framework. We use the Morris Lecar (ML) system [1], a reduced 2-state variable spiking neuron model in which the gating variable w(t) drives changes in the neuron membrane voltage V(t), and which has 12 static parameters (Methods). We first use OC-DSPE to estimate only the linear conductances g, which dictate the regimes of neuron excitability; we hold the other parameters fixed at their correct values. Observations are generated by adding uncorrelated Gaussian noise to the ground truth voltage , where we investigate σ = 2 mV and σ = 10 mV. The stimulating current is held at a constant value (Fig 4A). To illustrate the robustness of the algorithm to parameter uncertainty, we let the optimization bounds on the unknown conductance parameters span 2 orders of magnitude, from 0 to 200. To cross-validate our predictions, we forward integrate the dynamics from the final time estimate state using the estimated parameters but a pseudo-noisy current spanning a range of input currents (Fig 4A). We use Q = 25 initializations in Alg. 1.
Fig 4
Estimation of states and linear conductances in Morris-Lecar model.
A: Stimulating current used to estimate parameters and compare forward predictions. B: Most accurate (top) and least accurate (bottom) of 25 estimations using OC-DSPE, σ = 2 mV. Black: true voltage; blue: noisy observations; red: forward prediction using estimated parameters. C: Heatmap of voltage over time for all 25 runs, with the run giving lowest (highest) prediction error on top (bottom). D: Estimated conductances corresponding to state estimations in C. E: Analogue of B-D, for higher measurement noise σ = 10 mV. F: Estimations of the linear conductances among the Q = 25 runs using OC-DSPE (black), DSPE (red) and constrained least squares (blue), for 2 different levels of measurement noise (σ = 2, 10 mV, respectively). G: Top trace: Stimulus used for prediction (note timescale is from 100 to 200 ms) for the 3 methods in F. Bottom left 3 traces: predictions using optimally estimated parameters among the Q = 25 runs in F, using the 3 different estimation methods, for σ = 2 mV measurement noise. Black, red, blue: OC-DSPE, DSPE, and least squares predictions, respectively. Grey: Predicted trajectory using the true parameters (note this is visually indistinguishable from the accurate predictions). Bottom right 3 traces: Same, for σ = 10 mV measurement noise.
Estimation of states and linear conductances in Morris-Lecar model.
A: Stimulating current used to estimate parameters and compare forward predictions. B: Most accurate (top) and least accurate (bottom) of 25 estimations using OC-DSPE, σ = 2 mV. Black: true voltage; blue: noisy observations; red: forward prediction using estimated parameters. C: Heatmap of voltage over time for all 25 runs, with the run giving lowest (highest) prediction error on top (bottom). D: Estimated conductances corresponding to state estimations in C. E: Analogue of B-D, for higher measurement noise σ = 10 mV. F: Estimations of the linear conductances among the Q = 25 runs using OC-DSPE (black), DSPE (red) and constrained least squares (blue), for 2 different levels of measurement noise (σ = 2, 10 mV, respectively). G: Top trace: Stimulus used for prediction (note timescale is from 100 to 200 ms) for the 3 methods in F. Bottom left 3 traces: predictions using optimally estimated parameters among the Q = 25 runs in F, using the 3 different estimation methods, for σ = 2 mV measurement noise. Black, red, blue: OC-DSPE, DSPE, and least squares predictions, respectively. Grey: Predicted trajectory using the true parameters (note this is visually indistinguishable from the accurate predictions). Bottom right 3 traces: Same, for σ = 10 mV measurement noise.For small measurement noise, σ = 2, most of the Q runs give highly accurate estimates of both the states V(t) (Fig 4B and 4C) and parameters g (Fig 4D). For the poorest fit, g is sufficiently inaccurate such that spike events are occasionally missed. For higher measurement noise, σ = 10 mV, about 10% of the dynamic range, many predictions have degraded as expected, producing shifted or missed spike times (Fig 4E). Still, the optimal prediction and estimated parameters are highly accurate among these 25 initializations (Fig 4E).To get a sense of how OC-DSPE compares to related estimation methods, we repeated the estimations using the original DSPE method and constrained least squares (as in Fig 3). For small measurement noise (σ = 2 mV), the optimal predictions among Q = 25 runs were equally as accurate as OC-DSPE (Fig 4G), although the likelihood of finding the optimal fit was far less than the near 100% likelihood of OC-DSPE (Fig 4F). The accuracy degraded precipitously with measurement noise, and even the best parameter estimates for these two methods were poor (Fig 4F), producing considerably misshappen spike waveforms (using DSPE) or wildly inaccurate estimations altogether (using least squares) (Fig 4G). This indicated that even in this relatively tractable problem of estimating linear parameters from noisy data, OC-DSPE exhibits a robustness and accuracy not evident in related estimation methods.As a more realistic numerical experiment, we next estimated all 11 model parameters governing ion channel kinetics (we omit the capacitance C, which is typically ascertained from neuron size). All parameters are again bounded liberally, over 2 orders of magnitude. For low measurement noise, the model parameters, including those entering the model equations nonlinearly, are estimated to high precision—this is borne out in the accuracy of the forward predictions (Fig 5A and 5B). Still, as before, spikes can be missed and/or slightly shifted (Fig 5C)).
Fig 5
Estimation of linear and nonlinear parameters in Morris-Lecar model.
A: (top) Stimulating current used for estimation and prediction. (bottom) Black: true voltage trace; blue: noisy observations; red: predicted voltage trace with highest accuracy. B: Estimated parameters for all 25 runs. Black: individual 25 estimates of each parameter; pink: true value of parameter; red: parameter estimate corresponding to lowest prediction error. C: Predicted voltage traces from parameters, from highest accuracy (top) to lowest accuracy (bottom), for all 25 runs. D-F; same as A-C but now using a complex stimulating current. Prediction accuracy is markedly more reliable. G-H: Same as A-F, for higher noise (σ = 10 mV), for step stimulating current (G) and complex stimulating current (H). In the presence of higher noise, optimal state and parameter estimates are nonetheless very accurate for complex stimulating currents (H).
Estimation of linear and nonlinear parameters in Morris-Lecar model.
A: (top) Stimulating current used for estimation and prediction. (bottom) Black: true voltage trace; blue: noisy observations; red: predicted voltage trace with highest accuracy. B: Estimated parameters for all 25 runs. Black: individual 25 estimates of each parameter; pink: true value of parameter; red: parameter estimate corresponding to lowest prediction error. C: Predicted voltage traces from parameters, from highest accuracy (top) to lowest accuracy (bottom), for all 25 runs. D-F; same as A-C but now using a complex stimulating current. Prediction accuracy is markedly more reliable. G-H: Same as A-F, for higher noise (σ = 10 mV), for step stimulating current (G) and complex stimulating current (H). In the presence of higher noise, optimal state and parameter estimates are nonetheless very accurate for complex stimulating currents (H).As noted in a number of prior studies [18, 19, 31], the accuracy of the estimation is inherently limited by the richness of the driving current. If, for example, the stimulating current were 0 pA—below the spiking threshold—the estimated conductances would be degenerate: the fixed point of would be unchanged by appropriate rescalings of g. Difficulties in estimating the kinetic parameters such as β, γ would also arise since there are no observations when the neuron is spiking. In the case of a single step, as in Figs 4A and 5A, the volume of phase space in V(t), w(t) occupied by the system is vanishingly small, covering only the closed curve along the spiking limit cycle. Instead, valuable information toward the parameter estimates could be gained by pushing the system to less-visited regions of phase space. One option is to use a staircase signal consisting of many step currents [31]. Alternatively, by using a dynamic current, we could persistently modulate the system between different limit cycle manifolds. Since the shape of these manifolds differs by input current, and since steady state is never achieved, the system would cover a larger swath of the phase space through transient motion [18]. This richer sampling of the phase space would increasingly enhance our estimate of g.We therefore repeated the experiments using a stimulating current proportional to the x component of the 10D Lorenz96 system (we took the absolute value so all currents were positive, and linearly scaled the time axis) (Fig 5D). In the optimally predicted , the shifts in spike times are now absent, and the dropped spike near t ∼ 200 is recovered (Fig 5D). The estimated parameters are also more precise, producing markedly more robust predictions (Fig 5E and 5F). Finally, we repeat the experiments for the higher measurement noise σ = 10 mV. The optimal prediction for the step current are of moderate accuracy: many spikes are reconstructed but many are dropped (Fig 5G). For the chaotic driving current, spike times in the optimal prediction match very well against the data (Fig 5H).Up to this point, we have considered uncorrelated Gaussian noise added post hoc to the true membrane voltage. This noise would reflect fluctuations in the recording electrode, but would not reflect other sources of noise inherent to the neuron itself. Alternatively, we can add noise directly into the model equations when generating the data. Such Langevin, or process, noise would reflect noisy current inputs into the neuron, and could produce qualitative changes in neuron response, such as mistimed or even missing action potentials. What is the effect of these major changes on the estimation quality of OC-DSPE? To investigate this, we added a Gaussian-distributed random current input into the Morris-Lecar neuron, and generated observations by integrating this stochastic system forward, and adding measurement noise as before. We chose the statistics of to have mean zero and standard deviation SD = 10, 50, or 100 pA—note that this is up to the magnitude of the injected current (Fig 6A). Indeed, for appreciable noise, SD > 50 pA, the neuron membrane voltage is modulated considerably—spikes are mistimed or even missed (Fig 6A). We then estimated the parameters in the deterministic Morris-Lecar model using OC-DSPE. Optimal predictions over Q = 25 runs show that for SD up to 50 pA, the forward predictions (Fig 6B) and parameters (Fig 6C) are highly accurate. For SD = 100 pA, spike timing is mostly preserved, but resting voltages and have larger errors, and some action waveforms are affected. Together, this indicates that even with substantial corrupting model noise, OC-DSPE can accurately infer latent states and model parameters, though there is a breakdown when the fluctuations approach the size of the injected current.
Fig 6
OC-DSPE estimation in the presence of noisy current inputs.
A: (top trace) Stimulating current used for estimation of parameters. (bottom 3 plots) Resulting voltage traces for the stimulating current, for injected current noise levels of σ = 10, 50, 100 pA (light red traces), overlaid with the trace for zero injected current (light black traces). Here, injected current noise acts as a type of model/process noise, since it affects the evolution of the dynamics directly. This contrasts with the “measurement noise” used above, which is uncorrelated, additive noise added to the hidden state space output. B: (top trace) Injected current used for estimation (0 to 100 ms) and prediction (100 to 200 ms). (bottom 3 traces) True trajectory (black), observed data (blue), and optimal estimate (red), for 3 noise levels, respectively, 10, 50, 100 pA. Observed data is generated by integrating the model dynamics with process noise at the level indicated (10, 50, 100 pA), and then adding measurement noise σ = 2 mV to the output. C: Predicted parameters for 100 estimations, for 3 model noise levels. Black dots: all estimates; red: parameter corresponding to the optimal estimate; pink: the true parameter estimate.
OC-DSPE estimation in the presence of noisy current inputs.
A: (top trace) Stimulating current used for estimation of parameters. (bottom 3 plots) Resulting voltage traces for the stimulating current, for injected current noise levels of σ = 10, 50, 100 pA (light red traces), overlaid with the trace for zero injected current (light black traces). Here, injected current noise acts as a type of model/process noise, since it affects the evolution of the dynamics directly. This contrasts with the “measurement noise” used above, which is uncorrelated, additive noise added to the hidden state space output. B: (top trace) Injected current used for estimation (0 to 100 ms) and prediction (100 to 200 ms). (bottom 3 traces) True trajectory (black), observed data (blue), and optimal estimate (red), for 3 noise levels, respectively, 10, 50, 100 pA. Observed data is generated by integrating the model dynamics with process noise at the level indicated (10, 50, 100 pA), and then adding measurement noise σ = 2 mV to the output. C: Predicted parameters for 100 estimations, for 3 model noise levels. Black dots: all estimates; red: parameter corresponding to the optimal estimate; pink: the true parameter estimate.Finally, we investigated the effect of an underdetermined model description, which would reflect, say, incomplete knowledge of which ion channels are present in the neuron. As a simple test, we generated data using the Morris-Lecar model from above, but with an extra ion channel—the persistent Na channel
[52]—in addition to the already present K and Na channels. The presence of this channel had measurable effects on neuron response, including missed spikes (Fig 7A). We then estimated model parameters of the original Morris-Lecar model, which had only the Na and K channels (K+Na model), using observations generated from the model with the extra NaP channel (K+Na+NaP model). Despite the presence of an ion channel not accounted for, optimal predictions were highly accurate (Fig 7B). Interestingly, the optimally predicted parameters using K+Na+NaP observations were only slightly perturbed from those fit to observations of the K+Na model (i.e. fits from Fig 5D–5F) (Fig 7C). This indicated that small perturbations in parameter space can appropriately adjust the fit to observations from a different description. The degree to which this is possible in general, is of course, highly dependent on how misspecified the model is, but it is comforting that OC-DSPE is robust to a degree of incomplete knowledge in the model description.
Fig 7
OC-DSPE performance with an under-specified neuron model.
A: Injected current (top) and resulting membrane voltage (bottom), using the original Morris-Lecar model with Na and K channels (grey), or with an extra NaP channel (red). The addition of the extra channel has appreciable effects on the response, including missing spikes. B: Injected current (top) and optimal OC-DSPE estimated voltage trace (bottom; black) by fitting observations (bottom; blue) from a K+Na+NaP model to an Na+K model. Red: forward predictions of the K+Na model using estimated parameters, overlaid with forward predictions of the true K+Na+NaP model (black). C: The optimal parameters used in B are plotted against the optimal parameters produced by fitting observations without model misspecification (i.e. parameters from Fig 5D–5F). The estimated parameters are nearly in agreement, but small deviations in the optimal fits account for the modified traces that fit the K+Na+NaP observations.
OC-DSPE performance with an under-specified neuron model.
A: Injected current (top) and resulting membrane voltage (bottom), using the original Morris-Lecar model with Na and K channels (grey), or with an extra NaP channel (red). The addition of the extra channel has appreciable effects on the response, including missing spikes. B: Injected current (top) and optimal OC-DSPE estimated voltage trace (bottom; black) by fitting observations (bottom; blue) from a K+Na+NaP model to an Na+K model. Red: forward predictions of the K+Na model using estimated parameters, overlaid with forward predictions of the true K+Na+NaP model (black). C: The optimal parameters used in B are plotted against the optimal parameters produced by fitting observations without model misspecification (i.e. parameters from Fig 5D–5F). The estimated parameters are nearly in agreement, but small deviations in the optimal fits account for the modified traces that fit the K+Na+NaP observations.
Discussion
Mathematical models for biological neuron and circuit models are informed intimately by neurobiology. Voltage dynamics in the Hodgkin-Huxley model, for example, obey classical electrodynamic equations for variable capacitors, reflecting the assumption of cell membranes as insulating barriers separating charged species. But while biological considerations may strongly constrain the model description, the many parameters in these models are unspecified. In fact, it is precisely the variability of these parameters that accounts for the vast diversity in system response, and ultimately dictates how these neural systems process external stimuli.Our approach is to cast the inference problem into an optimal control framework, resulting in a system of coupled ODEs for the dynamics describing the evolution of the optimal estimate. These ODEs, Eq 10, pose two difficulties: they are persistently unstable since they comprise a Hamiltonian system, and they are not directly integrable anyways, since the parameters are not specified. We address these two issues simultaneously by imposing these dynamics as constraints on a related, tractable cost function, and enforcing them iteratively as a penalty term. This is similar to a recent method identifying state space inference with Hamiltonian systems [49], and shares overlaps with a related homotopy approach for parameter estimation in nonlinear systems [42]. These approaches and ours adopt the viewpoint that, rather than i) recursively filtering with linearized dynamics or ii) attempting to represent high-dimensional distributions with particle estimates, the burden of nonlinearity can be shifted to the cost function, provided that nonconvexity can be introduced in a systematic way. Globally optimizing over these variables is, of course, only possible for data that is analyzed offline, as in calcium imaging or electrophysiological measurements.Several studies in recent years have developed direct optimization-based approaches for state and parameter estimation in neural systems [6, 32–34, 40, 49, 53, 54]. Directly optimizing high-dimensional distributions may seem computationally prohibitive, since the search space is large—equal to the number of dynamical states times the number of measurement times, plus the number of constant parameters. But optimization in state-space models benefits from the highly sparse structure of the Hessian matrix [6], making these methods amenable to fast linear algebraic routines. While our approach does not directly optimize the usual posterior distribution appearing in the Bayesian setting [5, 6], the Hessian matrices required by Newton-type minimizers have a similarly sparse structure, so the computational benefits remain.The “constrained least squares” estimation we have used for comparison (Figs 3 and 4) is functionally equivalent to the variational annealing method used in many recent efforts in state and parameter estimation in chaotic and neural systems [7, 18, 20, 32–34]. Variational annealing performs successive iterations of a cost function that sums model error and measurement error; at each iteration, the weight of the model error term is increased. This method is equivalent to a constrained least squares, since the measurement term is a least squares error, and the model error term is a sum of many constraint equations, each enforcing the model dynamics at each step. By increasing the model error weight iteratively, the constraints are enforced gradually, as in a typical penalty-based constrained optimization. This work illustrates that while variational annealing can be effective for nonlinear parameter inference, it still has significant limitations in the presence of appreciable measurement or model noise—a limitation that is handled far more effectively by OC-DSPE.An intriguing recent work [31] has explored how suboptimal parameter estimates can be “nudged” out of local minima by adding and then removing artificial measurement noise. That technique produced highly accurate estimations of 41 parameters in a conductance-based neuron model, though compared to our results here, the measurement noise was 10 times smaller. Still, one could envision incorporating such “noise-regulation” in our method here to suppress the impact of local minima.Our emphasis is on estimation accuracy, rather than computational cost. Indeed, both doubling the state space to a Hamiltonian system and imposing constraints increases computation time compared to, for example, the variational annealing method [32, 33], or directly optimizing the least squares error subject to hard model constraints. Extending this framework to larger systems would require a careful consideration of the tradeoffs between accuracy and speed. A comparison of some recent approaches for estimation in dynamical models, with emphasis on neural systems, is shown in Table 1. In general, parameter estimation is best done in the context of an optimization framework, such as DSPE or variational annealing. Among these, OC-DSPE fares well in terms of accuracy especially with high levels of noise, but with higher computational cost. Still, the accurate estimation of a large number of nonlinear parameters suggests that our method, combined with improved optimization routines, may be relevant for larger neural systems.
Table 1
Comparison of various filtering (sequential) and optimization-based approaches for state and parameter estimation in dynamical systems.
Method
Type
States
Params
Notes
Kalman filter [5]
sequential
linear
N/A
standard method for linear-Gaussian models
Extended/Unscented Kalman filter [5]
sequential
nonlinear
N/A
Allows non-Gaussian noise or nonlinear models
Sequential Monte Carlo (SMC) [24]
sequential
nonlinear
N/A
Scales poorly with model dimension
Nudging [45]
sequential
nonlinear
N/A
Computationally trivial—only integrate forward model dynamics
SMC + Expectation maximization [26]
sequential + linear optimization
nonlinear
linear
Effective for linear parameters and high noise
Quasi-static variational assimilation [51]
nonlinear optimization
nonlinear
N/A
Reduces effect of local minima for long time traces
DSPE [40]
nonlinear optimization
nonlinear
nonlinear
Estimate many nonlinear parameters, given low measurement noise [19, 31]
Variational annealing [33]
nonlinear optimization
nonlinear
nonlinear
Estimate many nonlinear parameters, given appreciable measurement noise [18, 20]
OC-DSPE
nonlinear optimization
nonlinear
nonlinear
Effective for many nonlinear parameters with high model or measurement noise; computationally expensive
Comparison of various methods for state and parameter estimation in dynamical systems, including neural models. The 3rd and 4th column refer to whether the method is applicable to estimated nonlinear states and parameters, respectively. Sequential methods refer to those that iteratively update an estimate, while optimization methods rely on minimizing a cost function. Sequential methods in general tend to be computationally faster, since they require just one or a few iterative passes through the data, scaling linearly with the length of the data. However, they are not directly applicable to estimating parameters. Optimization-based techniques are directly applicable to parameter estimation, and can be aided computationally by sparse Hessians and automatic differentiation. DSPE, variational annealing (akin to constrained least squares), and OC-DSPE can simultaneously estimate nonlinear parameters and dynamical states. OC-DSPE is more accurate, but also more computationally expensive due to the presence of multiple nonlinear constraints.
Comparison of various methods for state and parameter estimation in dynamical systems, including neural models. The 3rd and 4th column refer to whether the method is applicable to estimated nonlinear states and parameters, respectively. Sequential methods refer to those that iteratively update an estimate, while optimization methods rely on minimizing a cost function. Sequential methods in general tend to be computationally faster, since they require just one or a few iterative passes through the data, scaling linearly with the length of the data. However, they are not directly applicable to estimating parameters. Optimization-based techniques are directly applicable to parameter estimation, and can be aided computationally by sparse Hessians and automatic differentiation. DSPE, variational annealing (akin to constrained least squares), and OC-DSPE can simultaneously estimate nonlinear parameters and dynamical states. OC-DSPE is more accurate, but also more computationally expensive due to the presence of multiple nonlinear constraints.For completeness, we note that many methods similar to OC-DSPE, including most listed in Table 1, are often referred to as data assimilation methods [55, 56]. Data assimilation is a broad, and unfortunately somewhat nebulous, term generally referring to any technique combining dynamical models with observations. Within this definition, the Kalman Filter or any of its variants is the most well-known. If viewed in this context, OC-DSPE and variational annealing fall into the class of variational data assimilation methods, which cast the estimation problem as an optimization problem. Neural data inference are amenable to variational approaches since the observation windows are relatively short and estimation is performed offline—in contrast to weather prediction, where observations come in continuously, the observation windows are long, and predictions must be made in real-time.There are a number of limitations of, and potential extensions to, OC-DSPE. Throughout, we have treated the measurement matrix as diagonal, where each observation corresponds to a unique latent state, and noise variances are treated as a fixed and known quantity. This is in line with experimental conditions of intracellular recordings in vitro, where the measurement uncertainty can be quantified by fluctuations in the voltage signal when the injected current is zero. Still, this may not be the case for in vivo data, where measurements reflect superpositions of multiple neurons in each recording channel, and H is unknown and not diagonal. In principle, OC-DSPE could be extended to include the observation matrix H as a set of additional inferred parameters, a potential direction for future study.We have focused here on estimation of parameters in single neuron models, the obvious extension being to the inference of biophysical neuron parameters and connection weights in neural networks. Inference in neural networks will demand a different modeling approach, since highly-resolved voltage waveforms from multiple neurons in vivo are inaccessible, given current technologies. It is unlikely that detailed conductance-based models would be resolvable, and one might opt instead for reduced phenomenological models [1, 26] sharing the desired dynamical features, but containing richer dynamics than simple perceptrons. Our methodology is most relevant to data taken in vitro, where currents can be externally controlled. Fruitful directions for future research would carefully consider these limitations, deciding how best to adapt this and other model-based procedures for brain-wide calcium imaging data, as has been explored in recent studies [28].
Materials and methods
DSPE cost function generation
To generate the data used in Fig 2A, we numerically integrated the Lorenz96 system in 10 dimensions using the scipy.odeint routine in Python, which runs the FORTRAN solver LSODE. The integration was performed over 501 timepoints from t0 = 0 to t = 8 at a step of dt = 0.016. Observations y were generated by adding uncorrelated Gaussian noise to the 4 observed states x1, x4, x7, x10. The states used in Fig 2B were generated similarly, but using the controlled dynamics Eq 3 with U11 = U44 = U77 = U10,10 = u.For the cost surface representation in Fig 2C, we used measurement noise ; the observed states were as above. The full cost surface is technically 5-dimensional—the 4 U plus the forcing F. To plot a visualizable cross section of this surface, we set all U to be the same—this is equivalent to projecting the surface along that line.
Neuron model
The Morris-Lecar ODE dynamics (K+Na model) are given by:
whereThe ground truth parameters were set at as C = 2.5, , , , , , , , , , , γ = 10. Data used for state and parameter estimation was generated by integrating these dynamics with a prescribed stimulating current , over a window of [0, 100] ms at a timestep of 0.05 ms. For the step stimulus (Figs 4 and 5A–5C and 5G), the current was set to 100 pA. For the chaotic stimulating current (Fig 5D–F and 5H)), we used the absolute value of the output of 1 variable of the 10D Lorenz96 system, scaled in time by a factor of 15 and in value by a factor of 20.For the Morris Lecar system with persistent Na channel (NaP channel), the dynamics are defined as:
where and .Finally, for the neuron with input current noise, we define the model as:
where Inoise at each timestep is chosen from a normal distribution with zero mean and standard deviation SD, where we investigate SD = 10, 50, or 100.
Constrained optimization
The estimated states and parameters were all found with constrained optimization using the limited-memory Broyden-Fletcher-Goldfarb-Shanno algorithm with constraints (L-BFGS-B), implemented in Python with the package scipy.optimize.minimize. These optimizations are high-dimensional. The search space in DSPE is ND + NL + P-dimensional, where N is the number of timepoints, D is the dimension of the dynamical system, L is the number of measurements, and P is the number of parameters. The NL term accounts for the control variables U(t), one for each observed variable at each timepoint. For least squares, the search space is ND+ P dimensional, and for OC-DSPE, the search space is 2ND + P-dimensional to account for the momenta variables. Bounds must be supplied on all state and parameter variables; for the Lorenz96 system, the states were bounded in [-15, 15], F was bounded in [1, 20], and U(t) were bounded in [0, 100]. For the Morris Lecar system, V was bounded in [-100, 100], the gating variable w in [0, 1], and the in [0.01, 200]. For OC-DSPE, the momenta p must also be bounded; these were set at [-100, 100].Equality constraints were required for all 3 methods—least squares, DSPE, and OC-DSPE. These constraints enforced either the raw, uncontrolled system dynamics (least squares; Eq 1), the controlled system dynamics (DSPE; Eq 7), or the estimation dynamics (OC-DSPE; Eq 10). The constraints were enforced iteratively with a penalty method. Specifically, for all 3 routines, the following function was minimized.
where C(⋅) is the cost function and g(⋅) is a discretization of the ith constraint equation at timepoint t, and β was iteratively stepped up from 0 to 24. Specifically, C(β = 0) was minimized with a randomly chosen initialization of all states and parameters within their bounds; the result of this was used as the initial guess for the minimization of C(β = 1), etc. The estimate at β = 24 was used taken to be the optimal estimate. C(⋅) for DSPE was Eq 6, for least squares was Eq 6 absent the U terms, and for OC-DSPE was Eq 11. The constraints were discretized using Hermite-Simpson quadrature, which results in N constraint equations for each i. R is a constant factor that allows the individual constraints to scale respectively with the dynamic range of that particular variable; this normalizes the contributions of different variables to the penalty term. For the Lorenz system ; for the neuron model, . Note that the least squares method with iteratively enforced constraints is equivalent to the variational annealing algorithm proposed previously [18, 32–34, 49, 54]. These estimations were done many times in parallel (100 for Lorenz96 system; 25 for the neuron model) with different initial guesses for the parameters and states at β = 0.For the Morris-Lecar neuron model, once the optimal state and parameter estimates are determined, they were used to generate predictions in the interval [100, 200] by integrating forward the dynamics with the final state estimate x(t) using the estimated parameter values.All optimization routines required first derivatives of the cost function and the constraints; we obtained these in a few lines of code by using the automatic differentiation package PyAdolc.
OC-DSPE estimation dynamics
The optimal control conditions were derived using the Pontryagin Minimum Principle [43]. In this formulation, one must prescribe the cost function along with the system dynamics; each of these may depend on the states, parameters, and controls. Our cost function is the DPE cost function Eq 6 (in continuous time)—a least squares matching of observations and states, plus a quadratic control penalty (restatement of Eq 8 from the main text):
where is the system “Lagrangian.” The control optima are derived from the corresponding “Hamiltonian” function which is defined from the Lagrangian via:
where p is conjugate momentum variable. Note that x, U, and p are all time-dependent, with the time-dependence suppressed for readability. The Pontryagin Minimum Principle states that then the optimal control Uopt along the integral curves of this system must satisfy:
If the optimal control exists in the interior of the space of admissable controls, then the minimum of Eq 20 can be found from the stationary points of with respect to U:
which, for the nonzero diagonal elements of U satisfy
In addition to the condition on the stationarity of , the system dynamics must satisfy Hamilton’s equations, one of which reproduces the state space dynamics , while the second gives the dynamics for the conjugate momenta (writing out the indices explicitly):
Writing out the derivatives of , and expanding out , this gives the following:
The momenta equations provide dynamical constraints on the control variables U, absent from nudging and DSPE. For observable states, the optimality condition Eq 22 can be inverted for the momenta p (Eq 9). Applying this in Eq 24 gives the complete set of optimally-controlled dynamics, in Hamiltonian coordinates , Eq 10.30 Mar 2022Dear Dr. Kadakia,Thank you very much for submitting your manuscript "Optimal control methods for nonlinear parameter estimation in biophysical neuron models" for consideration at PLOS Computational Biology.As with all papers reviewed by the journal, your manuscript was reviewed by members of the editorial board and by several independent reviewers. In light of the reviews (below this email), we would like to invite the resubmission of a significantly-revised version that takes into account the reviewers' comments.All three reviewers have recommended substantial revisions to be made to the manuscript. The manuscript is generally well-written and timely, however the methodology needs clarifications and the work needs to be better situated in its background.We cannot make any decision about publication until we have seen the revised manuscript and your response to the reviewers' comments. Your revised manuscript is also likely to be sent to reviewers for further evaluation.When you are ready to resubmit, please upload the following:[1] A letter containing a detailed list of your responses to the review comments and a description of the changes you have made in the manuscript. Please note while forming your response, if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out.[2] Two versions of the revised manuscript: one with either highlights or tracked changes denoting where the text has been changed; the other a clean version (uploaded as the manuscript file).Important additional instructions are given below your reviewer comments.Please prepare and submit your revised manuscript within 60 days. If you anticipate any delay, please let us know the expected resubmission date by replying to this email. Please note that revised manuscripts received after the 60-day due date may require evaluation and peer review similar to newly submitted manuscripts.Thank you again for your submission. We hope that our editorial process has been constructive so far, and we welcome your feedback at any time. Please don't hesitate to contact us if you have any questions or comments.Sincerely,Alain Nogaret, PhDGuest EditorPLOS Computational BiologyLyle GrahamDeputy EditorPLOS Computational Biology***********************All three reviewers have recommended substantial revisions to be made to the manuscript. The manuscript is generally well-written and timely, however the methodology needs clarifications and the work needs to be better situated in its background.Reviewer's Responses to QuestionsComments to the Authors:Please note here if the review is uploaded as an attachment.Reviewer #1: N/AReviewer #2: This paper introduces a new method for inferring the state and parameters of nonlinear stochastic dynamical systems where there are fewer measurements than the number of states. The method, termed OC-DSPE, combines a dynamic nudging approach with optimal control. By applying OC-DSPE to the chaotic Lorenz96 system and the Morris-Lecar neuron model, the author shows that the new method is superior to ordinary least squares and DSPE without the optimal-control component.The paper is well written. The approach and result description, including figures, is clear. The method seems to offer a useful new approach to an important general problem in neuroscience. There are however several major issues regarding the method’s applicability and advantages in the context of neural data analysis that are not adequately addressed. Additional analyses based on the points raised below are therefore crucial for appreciating the usefulness of the new approach in answering neuroscientific questions:1. Noise: the paper’s objective as stated in the abstract is joint state and parameter estimation in neural State Space Models (SSM). These are usually understood to refer to doubly stochastic dynamical systems. However, only measurement noise is considered, while neural recordings also often include process noise as described by the Langevin equation (Eq. 2). How does OC-DSPE fair in such scenario? Besides, only isotropic Gaussian measurement noise is considered, and its variance is assumed to be known. Can noise variance be also inferred as a model parameter? How about nondiagonal covariance matrices?2. Measurement: Each measurement here is assumed to equal one state plus noise. However, in in vivo recordings from, e.g., multielectrode arrays, measurements usually come from multi-unit activity. They are therefore (an approximately linear) mixture of several states. Can OC-DSPE handle such a situation and infer a mixing matrix as an unknown parameter, i.e., the output matrix in SSMs?3. Dynamics: The method assumes a full knowledge of the state dynamics. However, in neural recordings not all conductance types present in each neuron type are known a priori. What is OC-DSPE’s solution to underdefined state equations?4. Predictions: One major advantage of SSMs is their ability to conduct in silico experiments and make predictions of future dynamics (for instance, how a neuron would behave under stimulation conditions that were not used during the inference stage). Are there advantages for using OC-DSPE in such analyses? That the method is more accurate suggests that it may make better predictions, but it would be useful to see that in practice.5. Comparisons: OC-DSPE was compared to simpler methods and was shown to perform better. However, the paper mentions other approaches that are similar in methodology (quasi-static variational assimilation and variational annealing). What are the advantages of OC-DSPE over this class of methods?Reviewer #3: This paper proposes a new method, based on chaotic synchronization and optimal control theory, for inferring the parameters of conductance-based neuronal models directly from voltage traces. Results are presented demonstrating that the proposed method, OC-DSPE, yielded more accurate parameter estimates from synthetic data generated by the Lorenz96 system (when only 2 of the 10 state variables are observed) compared to 2 prior techniques. The OC-DSPE method is then used to estimate linear and nonlinear parameters from synthetic voltage traces generated by the Morris-Lecar neuronal model in response to a current pulse and a chaotic stimulating current.Comments:This paper is well-written and addresses an important challenge in computational neuroscience. While conductance-based models can be fit to data from voltage-clamp recordings on individual ionic currents, it is often the case that the only data available for model fitting are voltage traces from current-clamp recordings. This is a very difficult parameter estimation problem since only one of the model’s many state variables is directly observed. Thus, an improved method for addressing this challenge would be a significant contribution to the field. However, some major changes to the manuscript are needed to demonstrate that the proposed method does indeed address this challenge.1) The manuscript includes a direct comparison of the performance of the proposed OC-DPSE method to the prior methods of DSPE and least-squares minimization on data from the Lorenz96 system, but no such comparison is provided for data from the Morris-Lecar model. Since the Lorenz96 system is not a neuronal model, and the focus of the manuscript is on parameter estimation in biophysical neuron models, it would significantly strengthen the manuscript if a direct comparison of the performance for the proposed and prior techniques were provided for the Morris-Lecar model as well.2) The last paragraph in the “Optimally-controlled dynamical parameter inference” section of the Results (lines 202-220) does a nice job of relating the OC-DSPE method to previous approaches utilizing homotopy continuation and extensions of variational annealing to Hamiltonian manifolds. It ends by saying that Algorithm 1 is in much the same spirit as [49]. It would be helpful to add a concluding sentence here indicating what makes the proposed OC-DSPE method distinct from these previous approaches.3) The manuscript emphasizes that the proposed method is tailored to situations with considerable measurement noise, however the type of noise used in the “Numerical experiments with the Morris-Lecar model”, where uncorrelated Gaussian noise is added to the ground truth voltage, does not seem very biological to me as it doesn’t appear to affect the timing of action potentials. Is this type of noise present in in vitro current-clamp electrophysiology recordings? A more relevant type of noise, it seems to me, would be some form of dynamical noise (representing unmodeled synaptic inputs, or ion channel noise perhaps) that causes irregularities in the timing of action potentials. Would this type of noise still be considered measurement noise or instead some kind of model noise, and would OC-DSPE still perform well in the face of it? Even if results are not shown with this type of noise, a discussion of the characteristics of the noise encountered in actual experimental voltage traces and how they relate to the numerical experiments shown here and the way that noise is incorporated into the OC-DSPE framework is warranted.4) Does the proposed OC-DSPE method fall under the general umbrella of data assimilation methods? I’m asking because many of the references cited (11 of them by my count) contain the term data assimilation (or a variant of it) in their title, yet the word assimilation only appears once in the manuscript itself (on line 216). It might help orient readers with regards to the existing literature if it is explicitly stated whether OC-DSPE should be considered as a new variational data assimilation algorithm, or as something else.5) I have two questions about lines 97-99 which reads “As will be discussed below, the more pressing issue is that sequential estimators such as both nudging and the Kalman filter are not directly suited for the estimation of static parameters.”First, is it appropriate to refer to nudging as a sequential estimator? I ask because in the Introduction, it seems that chaotic nudging synchronization is included in the paragraph that starts on line 38 as “An alternative to all of these sequential filtering approaches is the direct optimization of a posterior distribution over, jointly, the states at all timepoints and the unknown parameters.”This also leads to my second question, which is where is it discussed below that nudging and the Kalman filter are not directly suited for the estimation of static parameters? I found this confusing because line 170 says that DSPE is “built naturally on nudging synchronization” and earlier in the Introduction references are cited [5, 25, 26] that use a sequential filtering approach for parameter estimation. Perhaps, rather than alluding to a discussion that will come later, it would be good to include a sentence or two about why nudging and the Kalman filter are not directly suited for the estimation of static parameters immediately following lines 98-99.Minor comments:1) Abstract, second sentence: replace “neurons models” with “neuron models”2) Line 76: replace “an D-dimensional” with “a D-dimensional”3) Line 170: replace “DSPE built” with “DSPE is built”4) Equation 9: replace “observed),” with “observed).”5) Line 185: replace “the the” with “the”6) Line 250: replace “near on” with “near” or “on”7) Line 294: extra space after “1” before the period8) Line 315: replace “as” with “is”9) Lines 376 and 379: the opening quotation mark is reversed in “nudged” and “noise-regulation”10) Line 489: missing space between the comma and “and”**********Have the authors made all data and (if applicable) computational code underlying the findings in their manuscript fully available?The PLOS Data policy requires authors to make all data and code underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data and code should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data or code —e.g. participant privacy or use of data from a third party—those must be specified.Reviewer #1: No: Computational method need further clarification as specified in the review.Reviewer #2: No: The paper mentions a GitHub repository but no link is provided.Reviewer #3: No: In the Data and Code Availability section of the coversheet it says "All codes are available on GitHub" but I didn't see any mention of the specific GitHub repository in the manuscript itself**********PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.If you choose “no”, your identity will remain anonymous but your review may still be made public.Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.Reviewer #1: NoReviewer #2: Yes: Hazem ToutounjiReviewer #3: NoFigure Files:While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, . PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email us at .Data Requirements:Please note that, as a condition of publication, PLOS' data policy requires that you make available all data used to draw the conclusions outlined in your manuscript. Data must be deposited in an appropriate repository, included within the body of the manuscript, or uploaded as supporting information. This includes all numerical values that were used to generate graphs, histograms etc.. For an example in PLOS Biology see here: http://www.plosbiology.org/article/info%3Adoi%2F10.1371%2Fjournal.pbio.1001908#s5.Reproducibility:To enhance the reproducibility of your results, we recommend that you deposit your laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future. Additionally, PLOS ONE offers an option to publish peer-reviewed clinical study protocols. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocolsSubmitted filename: Kadakia_PloSCompBiol_review.pdfClick here for additional data file.6 Jul 2022Submitted filename: response_to_reviewers_1.pdfClick here for additional data file.10 Aug 2022Dear Dr Kadakia,We are pleased to inform you that your manuscript 'Optimal control methods for nonlinear parameter estimation in biophysical neuron models' has been provisionally accepted for publication in PLOS Computational Biology.Before your manuscript can be formally accepted you will need to complete some formatting changes, which you will receive in a follow up email. A member of our team will be in touch with a set of requests.We also would like you to address the minor comments from the two reviewers.Please note that your manuscript will not be scheduled for publication until you have made the required changes, so a swift response is appreciated.IMPORTANT: The editorial review process is now complete. PLOS will only permit corrections to spelling, formatting or significant scientific errors from this point onwards. Requests for major changes, or any which affect the scientific understanding of your work, will cause delays to the publication date of your manuscript.Should you, your institution's press office or the journal office choose to press release your paper, you will automatically be opted out of early publication. We ask that you notify us now if you or your institution is planning to press release the article. All press must be co-ordinated with PLOS.Thank you again for supporting Open Access publishing; we are looking forward to publishing your work in PLOS Computational Biology.Best regards,Alain Nogaret, PhDGuest EditorPLOS Computational BiologyLyle GrahamDeputy EditorPLOS Computational Biology***********************************************************Thank you for revising the manuscript and addressing the reviewer comments. I am pleased to recommend publication. The manuscript still contains a small number of typos some of which were listed by the reviewers. These may be corrected the galley proof stage.Reviewer's Responses to QuestionsComments to the Authors:Please note here if the review is uploaded as an attachment.Reviewer #2: The author address all my comments and made substantial improvement to the article. There remains 3 very minor comments:Author summary says the approach compares well to other methods but doesn’t mention the advantage and disadvantage of the new method (ie handling noise better but higher computational cost). hugely-dimensional --> high dimensionalLine 369: deterministic Morris-Lecar model – isn’t it stochastic?Line 448: compute time --> computation or computing timeReviewer #3: Author summary, line 2: replace “behavioral response” with “behavioral responses” or “a behavioral response”Pg 2, line 29: replace “combines” with "combine”Pg 3, line 59: replace “parameters” with “parameter”Pg 12, line 322: replace “estimated of all” with “estimated all”Pg 18, line 441: replace “and the removing” with “and then removing”Pg 18, line 470: replace “extensions, to OC-DSPE” with “extensions to, OC-DSPE”**********Have the authors made all data and (if applicable) computational code underlying the findings in their manuscript fully available?The PLOS Data policy requires authors to make all data and code underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data and code should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data or code —e.g. participant privacy or use of data from a third party—those must be specified.Reviewer #2: YesReviewer #3: Yes**********PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.If you choose “no”, your identity will remain anonymous but your review may still be made public.Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.Reviewer #2: Yes: Hazem ToutounjiReviewer #3: Yes: Casey O. Diekman5 Sep 2022PCOMPBIOL-D-22-00050R1Optimal control methods for nonlinear parameter estimation in biophysical neuron modelsDear Dr Kadakia,I am pleased to inform you that your manuscript has been formally accepted for publication in PLOS Computational Biology. Your manuscript is now with our production department and you will be notified of the publication date in due course.The corresponding author will soon be receiving a typeset proof for review, to ensure errors have not been introduced during production. Please review the PDF proof of your manuscript carefully, as this is the last chance to correct any errors. Please note that major changes, or those which affect the scientific understanding of the work, will likely cause delays to the publication date of your manuscript.Soon after your final files are uploaded, unless you have opted out, the early version of your manuscript will be published online. The date of the early version will be your article's publication date. The final article will be published to the same URL, and all versions of the paper will be accessible to readers.Thank you again for supporting PLOS Computational Biology and open-access publishing. We are looking forward to publishing your work!With kind regards,Zsofia FreundPLOS Computational Biology | Carlyle House, Carlyle Road, Cambridge CB4 3DN | United Kingdom ploscompbiol@plos.org | Phone +44 (0) 1223-442824 | ploscompbiol.org | @PLOSCompBiol
Authors: Jingxin Ye; Daniel Rey; Nirag Kadakia; Michael Eldridge; Uriel I Morone; Paul Rozdeba; Henry D I Abarbanel; John C Quinn Journal: Phys Rev E Stat Nonlin Soft Matter Phys Date: 2015-11-02
Authors: Shaul Druckmann; Thomas K Berger; Sean Hill; Felix Schürmann; Henry Markram; Idan Segev Journal: Biol Cybern Date: 2008-11-15 Impact factor: 2.086
Authors: Henry D I Abarbanel; Sasha Shirman; Daniel Breen; Nirag Kadakia; Daniel Rey; Eve Armstrong; Daniel Margoliash Journal: Chaos Date: 2017-12 Impact factor: 3.642