Edward D Lee1, Xiaowen Chen2, Bryan C Daniels3. 1. Complexity Science Hub Vienna, Vienna, Austria. 2. Laboratoire de physique de l'École normale supérieure, CNRS, PSL Université, Sorbonne Université, Université de Paris, Paris, France. 3. School of Complex Adaptive Systems, Arizona State University, Tempe, Arizona, United States of America.
Abstract
Biological circuits such as neural or gene regulation networks use internal states to map sensory input to an adaptive repertoire of behavior. Characterizing this mapping is a major challenge for systems biology. Though experiments that probe internal states are developing rapidly, organismal complexity presents a fundamental obstacle given the many possible ways internal states could map to behavior. Using C. elegans as an example, we propose a protocol for systematic perturbation of neural states that limits experimental complexity and could eventually help characterize collective aspects of the neural-behavioral map. We consider experimentally motivated small perturbations to neural states (ones that are most likely to preserve natural dynamics and are closer to internal control mechanisms) and their impact on collective neural activity. Then, we connect such perturbations to the local information geometry of collective statistics, which can be fully characterized using pairwise perturbations. Applying the protocol to a minimal model of C. elegans neural activity, we find that collective neural statistics are most sensitive to a few principal perturbative modes. Dominant eigenvalues decay initially as a power law, unveiling a hierarchy that arises from variation in individual neural activity and pairwise interactions. Highest-ranking modes tend to be dominated by a few "pivotal" neurons that account for most of the system's sensitivity, suggesting a sparse mechanism of collective control.
Control of complex dynamical networks is a problem of major interest in biology for finding drivers of diseased states [1–3] and determinants of behavior [4–7]. In neural systems, the question of which and how many neurons confer controllability is largely open. While there is evidence that single-neuron manipulation is sufficient to induce behavioral change in some cases [8, 9], other research shows behavioral information to be encoded amongst many neurons [4, 10, 11]. Formidable work has gone into highlighting specific behavioral circuits [12–14], but the general problem of identifying which neurons to probe experimentally and choosing how to perturb them is difficult: in principle, a thorough experiment would require a combinatorially large number of procedures to test all the possible ways that neural targets could be modified. Theoretical tools provide a way of winnowing down the number by using control-theoretic analysis of dynamical systems models [15, 16], structural analysis [5, 17–19], and network properties [20], amongst other approaches reviewed in reference [21]. Here, we develop a theoretical framework that explicitly considers perturbation experiments to discover candidate control neurons. Our framework suggests a way of leveraging recent developments in optogenetics [19, 22–27] that allow for fast, precise control of neurons and even the regulation of brain states with closed-loop optogenetic [28–32] or optoelectronic systems [33].

Despite the generalization away from single neurons in the study of sensory encoding [10, 34], an echo of it still rings in studies of neural control of behavior [12]. Some experiments rely on the notion of individual (or sets of similar) "driver" neurons that—when ablated, silenced, or otherwise modified—substantially change behavioral outcomes [35–37].
While some individual cells, like interneurons connecting distal parts of the body, are essential for normal behavior, some higher-order behaviors are robust to the function of individual cells [14, 38–40]. Indeed, the finding that some neural circuits encode sensory input in a sparse, distributed way raises the question of whether or not neural control follows similar principles [4, 41–43].

We explore two potential sources of sparsity in neural control of collective activity. First, a desired output could be produced equally well by many possible changes, or modes, at the neural level or by only one particular change. If we think of neural states crudely as parameter settings and a mode consists of changing all the parameters in a preset way, this is asking whether many such presets matter. We designate cases where there is a clear separation in the importance of modes as "sloppy" and where there is no such separation as "non-sloppy" [44]. Second, each mode can itself be sparse or dense in the number of neurons involved [45–47], corresponding to the complexity of coordinating multiple neurons simultaneously. These possibilities lead to the four hypotheses we draw in Fig 1, where control is sloppy or non-sloppy with either sparse or dense collective modes of sensitivity (see Appendix A in S1 Text for theoretical overview).
Fig 1
Four possibilities for sparse sensitivity.
When collective activity shows sloppy structure, the local information geometry is elongated along sloppy, insensitive directions. With dense collective encoding, multiple components each matter equally. The starred combination, which is sloppy with sparse combinations of neurons, aligns with centralized control.
In the model organism Caenorhabditis elegans, the hypothesis that neural control is sparse is supported by observations that both neural and behavioral dynamics live on sparse manifolds. Neural activity displays collective modes originating from circuit components or global encoding of information that effectively reduces its dimensionality [5, 17, 48, 49]. Complementing this, worm posture is low-dimensional: four eigenmodes capture nearly all the variance in shape space [42, 50]. More recent work combines the two types of measurement to show that dynamical embedding of worm locomotion along dimensions of neural activity extracts a few principal neural dimensions that capture transitions between behaviors [51, 52]. In some cases, neural modes are even predictive of longer-term behavioral regularities [53]. Though neural signals may not simply reflect behavioral control, these findings support the hypothesis that common mechanisms of neural control may underlie neural-behavioral similarities between individuals. Inspired by signatures of sparseness in the neural-behavioral mapping, we examine sparseness in the response of collective neural activity to individual neurons as a step towards considering behavior.

Testing these hypotheses is difficult because an exhaustive protocol for probing subsets of neurons, even for a minuscule and precisely mapped organism such as C. elegans, is impossible [54]. A combinatorially large number of possible tests could involve simultaneous perturbations to subsets of neurons in ways that are sympathetic, antagonistic, and of varying magnitude.
Could we systematically analyze all such combinations of perturbations in a feasible way?

Our first result is to show one way in which such complexity can be dramatically reduced. By focusing on the limit of small perturbations, and given the remarkable effectiveness of pairwise models of neural activity [46, 48, 55], we demonstrate how experiments that perturb only pairwise correlations could be enough to fully characterize the local information geometry.

Our second result comes from demonstrating how this simplification could be useful in the context of an in silico experiment, making use of data on C. elegans, a particularly interesting example with which to explore neural control. Current experiments measure whole-brain activity in the freely moving worm and will soon allow matching neurons to the structural connectome [56–58]. Combined with tools for manipulating neural states [32], our proposed experimental protocol could be used to search for neural centers important to collective activity.

In the following text, we start by developing the basis for our approach in "Methods and theoretical formulation." We cover minimal statistical models for neural activity and relate theoretical model perturbations to realistic experimental ones, focusing on a measure of neural collective behavior. Afterwards, we apply the described approach to data sets that display signatures of sparse control in "Results in C. elegans." Finally, we discuss our findings and questions of experimental feasibility.
Methods and theoretical formulation
Maximum entropy (maxent) model
Our theoretical framework begins with a minimal statistical model of neural activity. In the case of spiking neurons, this could be a pairwise maximum entropy model specifying statistics of activity and inactivity. For C. elegans, which we focus on here, membrane voltage changes more gradually, in contrast with the rapidly spiking neurons of the mammalian brain. Though we might then think to use Ca2+ levels as a measure of neural state, fluorescence measurements of absolute calcium levels are dominated by noise; it is, however, possible to extract the derivative reliably such that it reflects real correlations in neural activity [48]. Given instrumental limitations that prevent us from measuring the precise continuum dynamics, we focus on a minimal discretization of the time series that still captures the dominant tendencies in the data, coarse-graining each neuron m's derivative as up (s_m = 1), down (s_m = −1), or flat (s_m = 0). Whereas a binary discretization would make it impossible to code inactive neurons, a finer discretization leads to exponentially increasing computational costs with the number of states. The optimal procedure should be sensitive to the experimental protocol, but the practical principle of minimal and sufficient description of neural activity leads us to a ternary description of C. elegans neural states.

With this discretization procedure, we denote the time-averaged probabilities of being in state k,

r_m(k) = (1/T) ∑_τ δ(s_m(τ), k), (Eq 1)

where the Kronecker delta function indicates when the state of neuron m is k at time τ over the duration of the experiment T. Similarly, the pairwise probabilities of agreement between any two neurons m and t,

p_mt = (1/T) ∑_τ δ(s_m(τ), s_t(τ)), (Eq 2)
and higher-order correlations describe the probabilities of agreement between multiple neurons. Higher-order correlations are not necessarily given by the lower-order statistics, but accounting only for pairwise correlations is sufficient to capture higher-order correlations [48] (Appendix B in S1 Text).

The maxent approach captures statistics of neural activity in a way that can reflect latent physical interactions while also obeying a quantitative formulation of Occam's Razor [60, 61]. This is encoded in the maxent principle that a model of the probability distribution p(s) only match specified constraints but otherwise remain as structureless as possible. This can be done by maximizing the model's information entropy, S = −∑_s p(s) log p(s), here a sum over all 3^N possible configurations, with the standard method of Lagrange multipliers [46, 62–64]. When the set of average individual neural activities {r_m(k)} is constrained, the maxent procedure gives the "independent model," but it fails to capture correlations in neural activity as we show in Fig 2 (also see Fig B in S1 Text).
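As a concrete illustration, the ternary coarse-graining and the constrained statistics can be sketched in a few lines of Python; the derivative traces, the dead-band threshold, and the array sizes below are hypothetical stand-ins for real fluorescence data:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical calcium-derivative traces: T time points x N neurons.
deriv = rng.normal(size=(1000, 5))

# Ternary coarse-graining: up (+1), down (-1), flat (0), using a
# dead-band threshold so small fluctuations map to the flat state.
theta = 0.5
s = np.where(deriv > theta, 1, np.where(deriv < -theta, -1, 0))

# Time-averaged probability r_m(k) that neuron m is in state k (Eq 1).
states = (-1, 0, 1)
r = np.array([[np.mean(s[:, m] == k) for k in states]
              for m in range(s.shape[1])])

# Pairwise probability of agreement p_mt = <delta(s_m, s_t)> (Eq 2).
p_agree = np.array([[np.mean(s[:, m] == s[:, t])
                     for t in range(s.shape[1])]
                    for m in range(s.shape[1])])
```

These empirical averages are exactly the constraints fed to the maxent inference.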
Fig 2
Pairwise maxent model of anterior neural activity from reference [59] (see Fig B in S1 Text for another experiment).
(A) Pairwise correlations between subset of N = 50 neurons. Average individual neuron r(s) shown along diagonal. (B) Inferred biases h along diagonal and matrix of couplings J off the diagonal. (C) Coarse collective synchrony, probability that a plurality of n neurons coincide ϕcoarse(n), for data, pairwise maxent model, independent model, and shuffled couplings. Inset compares pairwise correlations for data with maxent model. Error bars show one standard deviation over bootstrapped samples. (D) Fine-grained synchrony ϕfine in the independent vs. pairwise maxent model. This is the probability of set sizes, the number of neurons in each of the three states when ordered n1 ≥ n2 ≥ n3 such that n1 corresponds to the size of the plurality and n3 the smaller minority. Error bars show one standard error.
A simple variation that does not assume independence involves constraining the pairwise correlations from Eq 2. Given the finite size of the data set, we constrain the pairwise correlations to obtain the pairwise maxent model with distribution

p(s) = e^(−E(s)) / Z,  E(s) = −∑_m ∑_k h_m(k) δ(s_m, k) − ∑_(m<t) J_mt δ(s_m, s_t), (Eq 3)

where Z normalizes the distribution. The function E(s) is known as the "energy." The log-probability of configuration s varies with its energy such that lower energy means larger probability. As with the independent model, increasing neural bias h_m(k) pushes neuron m to state k, while increasing the magnitude of coupling J_mt magnifies the tendency of neurons m and t to agree when the coupling is positive (or to disagree when negative). Though the couplings are in principle exactly specified by the pairwise correlations in the data, sampling and experimental noise mean that there are many possible sets of couplings that are consistent within the statistical variation of the data sample.

We focus on a numerical solution that recovers topological structure in the coupling network and has been shown to faithfully capture collective neural patterns [48]. We show the coupling network in Fig 2B.
The two examples we consider are N = 50 neuron subsets drawn from neurons located in the anterior of the immobilized worm from reference [59]. The solution then defines a minimal interaction network, distinct from the pattern of pairwise correlations, that characterizes dependencies in neural activity across the timescales represented in the available data.
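For readers who prefer code to notation, a minimal sketch of the pairwise (Potts-like) maxent distribution of Eq 3 via exact enumeration might look as follows; the biases and couplings here are random placeholders rather than inferred values, and enumeration is only feasible at toy sizes:

```python
import itertools
import numpy as np

N = 4                      # toy size; exact enumeration is 3**N states
states = (-1, 0, 1)
rng = np.random.default_rng(1)

# Hypothetical parameters: biases h[m][k] and couplings J[m][t].
h = rng.normal(scale=0.3, size=(N, 3))
J = np.triu(rng.normal(scale=0.3, size=(N, N)), 1)  # each pair once, m < t

def energy(s):
    """Potts-like energy of Eq 3: lower energy means higher probability."""
    e = -sum(h[m, states.index(s[m])] for m in range(N))
    e -= sum(J[m, t] * (s[m] == s[t])
             for m in range(N) for t in range(m + 1, N))
    return e

configs = list(itertools.product(states, repeat=N))
weights = np.exp([-energy(s) for s in configs])
p = weights / weights.sum()     # Boltzmann distribution p(s) = exp(-E(s))/Z
```

For the N = 50 subsets considered in the text, the normalization must instead be handled by Monte Carlo sampling.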
Mapping perturbations from experiment to simulation
In a hypothetical experiment, we might effect perturbations by inserting electrodes into immobilized worms or, in a less invasive and more natural protocol, by using optogenetic tools, clamping membrane potential and effectively coupling neurons to one another via an external circuit to control the strength and timing of the perturbation, as pictured in Fig 3A. Here, we use a perturbation that with small probability copies the state of one neuron to another, chosen to resemble an increase or decrease in synaptic strength. Specifically, we select pairs, identifying a "target" neuron t in state s_t and a "matcher" neuron m in state s_m, and clamp the matcher to the target's state, s_m = s_t, with small probability ϵ ≪ 1 as in Fig 3B. Compared to more common clamping and ablation techniques [31, 65], this proposed closed-loop perturbation is less drastic: it mostly preserves natural dynamics and is closer to internal control mechanisms, such as synaptic modulation, that do not require constant input from the outside [66–68].
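A minimal simulation of this matcher-target clamping rule, with hypothetical state samples standing in for model-generated dynamics, could be sketched as:

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical unperturbed states: T samples x N neurons in {-1, 0, 1}.
T, N = 20000, 6
s = rng.choice([-1, 0, 1], size=(T, N))

def perturb(s, matcher, target, eps, rng):
    """With probability eps, clamp the matcher neuron to the target's state."""
    s_new = s.copy()
    clamp = rng.random(s.shape[0]) < eps
    s_new[clamp, matcher] = s_new[clamp, target]
    return s_new

s_pert = perturb(s, matcher=0, target=1, eps=0.2, rng=rng)

# The matcher's agreement with the target increases by roughly eps
# times the baseline disagreement probability.
agree_before = np.mean(s[:, 0] == s[:, 1])
agree_after = np.mean(s_pert[:, 0] == s_pert[:, 1])
```

The large ϵ used here is only for visibility; the theory below works in the limit ϵ ≪ 1.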
Fig 3
(A) Perturbation thought experiment consists of clamping matcher neuron m to the state of target neuron t with some small probability ϵ when indicated by a random number generator (RNG). We draw electrodes controlling membrane voltage, but optogenetic protocols are more elegant. (B) Perturbation corresponds to modifying fields and couplings in a pairwise maxent model. (C) Principal eigenmatrix for fine-grained synchrony mapped to change in (D) biases {h}, shown only for the “down” state, and (E) couplings {J} for ϵ = 10−4. (F) Diffuse observable perturbation using replacement rule from Eq 4 corresponds to (G) localized “natural” perturbation to one coupling (note nonzero values in the top left corner).
Then, the modified probability that the matcher neuron is in state k is the mixture

r̃_m(k) = (1 − ϵ) r_m(k) + ϵ r_t(k). (Eq 4)
As a result of the perturbation in Eq 4, the statistics describing neuron m become more like those of neuron t. This procedure likewise modifies the matcher neuron's coincidence with all other neurons indexed j, as if modifying synaptic weights [47]. If we were to take sufficiently large ϵ in experiment, we would expect such an intervention not to remain localized to neuron m but to alter its neighbors, and eventually the neighbors of neighbors, and so on. If ϵ is small, then we expect a localized perturbation to approximate well the desired change in statistics given in Eq 4. By taking the limit where the perturbation is sufficiently small and assuming that it then remains localized, we specify a perturbation that corresponds to a unique change of parameters.

Such uniqueness implicitly depends on constraining ourselves to perturbations that move us along the family of pairwise maxent models. This constraint is not reflected in the replacement rule in Eq 4, which is compatible with an infinite variety of changes to higher-order correlations. In other words, we do not necessarily need to restrict ourselves to the energy function defined in Eq 3; we could allow for higher-order interactions to appear under perturbation. This introduces an ambiguity that can only be resolved by choosing the form of the perturbations. Once we have chosen to fix the structure of the energy function from the maxent principle, however, we do not allow a perturbation to arbitrarily alter that form. This assumption is consistent with the widespread observation that the pairwise model generally captures well the collective features of biological neural networks of modest size [46, 69–73].
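The mixture in Eq 4 is easy to verify numerically; the marginals below are hypothetical:

```python
import numpy as np

eps = 1e-2
r_m = np.array([0.2, 0.5, 0.3])   # hypothetical matcher state probabilities
r_t = np.array([0.6, 0.1, 0.3])   # hypothetical target state probabilities

# Replacement rule (Eq 4): with probability eps the matcher copies the
# target, so the matcher's marginal becomes a mixture of the two.
r_m_tilde = (1 - eps) * r_m + eps * r_t
```

Because the mixture of two normalized distributions is itself normalized, the perturbed marginal remains a valid probability distribution for any ϵ ∈ [0, 1].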
Thus, we use the pairwise maxent model not only to specify a minimal, compressed representation of neural statistics, but also to specify how the probability distribution evolves under the replacement rule.

The replacement rule maps system sensitivity to perturbations of observed statistics rather than of model parameters, which is often the theorist's first instinct (see Appendix F in S1 Text). In the latter formulation, the parameters of the maxent model, such as fields and couplings, form the basis for "natural" or "canonical" perturbations [74]. This perspective adheres to the physical intuition that the parameters in the energy function reflect latent physical interactions. When the pairwise maxent model instead serves as a phenomenological approximation of the underlying physical interactions, a natural perturbation may be mediated by unknown parameters instead of fields and couplings [55, 75]. This obscures the intuition that changing fields should alter individual neuron biases while changing couplings should alter the tendencies of pairs to agree.

Partly for this reason, the straightforward model perturbation to couplings in Fig 3B becomes much more complex when translated into its reciprocal experiment in Fig 3A. We show an example of a set of neural perturbations in panel C, indicating roughly uniform perturbations to three dominant neurons, that is more complex in the reciprocal space of fields and couplings in panels D and E, respectively. Again reflecting this basic principle, a trivial coupling perturbation in panel G is complex in the space of matcher-target perturbations in panel F. Additionally, observable perturbations do not depend on the model used but are preserved regardless of how a model is parameterized, ensuring consistency when our framework is extended to other model classes. Thus, defining perturbations in observable terms as in Eq 4 allows for easier experimental interpretation and for consistent comparison of inferred models [47].
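The change of basis between parameter-space and observable-space perturbations is the standard reparameterization rule for Fisher information, F_obs = Jᵀ F_θ J; a toy sketch with random placeholder matrices:

```python
import numpy as np

rng = np.random.default_rng(4)

# Toy Fisher information in model-parameter coordinates (fields and
# couplings), built to be symmetric positive definite.
A = rng.normal(size=(5, 5))
F_theta = A @ A.T + 0.1 * np.eye(5)

# Hypothetical Jacobian d(theta)/d(epsilon): each column gives the
# parameter change induced by one observable (matcher-target) perturbation.
Jac = rng.normal(size=(5, 3))

# Reparameterization rule for Fisher information: F_obs = J^T F_theta J.
F_obs = Jac.T @ F_theta @ Jac
```

The rule guarantees that F_obs inherits symmetry and positive semidefiniteness from F_θ, so eigenvalue analysis remains well defined in observable coordinates.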
Neural synchrony for collective activity
If it is collective neural activity that encodes behaviorally relevant information [45, 46], then we can use collective properties as a proxy for behavior. As one measure of collective activity, we consider the probability that there are n1 ≥ n2 ≥ n3 neurons in each of the three states, ϕfine(n1, n2, n3), which we call a fine-grained measure of neural synchrony. We define n1 to correspond to the size of the plurality and n3 the smaller minority (i.e. there is no fixed correspondence to "rise," "fall," and "flat"). While synchrony does not differentiate between the orientation of the states, it presents a simple and computationally tractable statistic [76, 77], can contain information about worm posture and velocity (see Appendix C in S1 Text), and is reproduced by the pairwise maxent model as we show in Fig 2D. For comparison, we also consider ϕcoarse(n1), which only counts the number of neurons in the plurality, in Fig 2C. We show additional results using coarse synchrony in the Appendices. Importantly, synchrony treats neurons on equal footing with one another, and this symmetry ensures that neurons are distinguished by bias and by variation in their local interaction networks and not by our measure of collective activity.

Under perturbation of a pair of matcher and target neurons, the synchrony distribution ϕ is mapped to an altered distribution ϕ̃ that depends on the strength of perturbation ϵ. To measure how quickly the distribution changes with the perturbation, we use a unique characterization of the distance between distributions, the Kullback-Leibler divergence D_KL. In the limit of infinitesimal perturbation, ϵ → 0, this becomes the Fisher information. Recognizing that only the first nonzero term is the second derivative, as detailed further in Appendix A in S1 Text, we end up with the matrix of second derivatives, the Fisher information matrix (FIM), with respect to matcher-target neuron pairs mt and m′t′,

F_mt,m′t′ = ∑_(i,j) (∂θ_i/∂ϵ_mt) (∂²D_KL/∂θ_i ∂θ_j) (∂θ_j/∂ϵ_m′t′), (Eq 5)
where the Jacobian transforms the maxent parameters such as fields and couplings {θ}, here ordered along a single index i, into the vector of changes that correspond to the observable perturbations specified in Eq 4. Eq 5 measures the impact of perturbation on a coarse-grained probability distribution, the collective synchrony, and accounts for the multiplicity of microscopic configurations belonging to a single coarse-grained state (Appendix A in S1 Text).

The FIM, whose entries are defined in Eq 5, is spanned by eigenvectors that describe orthogonal perturbations of the parameters {θ}, where modes with large eigenvalues describe perturbations to which collective synchrony is highly sensitive and small eigenvalues represent ones to which the system is insensitive [47, 78]. Importantly, the basis vectors can involve a mix of antagonistic and sympathetic perturbations of neurons of varying magnitude, one that would be nontrivial to extract a priori. Thus, the FIM encodes how quickly coarse-grained configurations ϕ change as we modify the system on a microscopic level by perturbing pairs of neurons at a time, with perturbations that mimic changes to synaptic connectivity.
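As a sketch, the fine-grained synchrony distribution can be estimated from samples by counting neurons in each state and sorting the counts; the states below are random placeholders rather than maxent samples:

```python
from collections import Counter
import numpy as np

rng = np.random.default_rng(3)
T, N = 5000, 10
s = rng.choice([-1, 0, 1], size=(T, N))   # hypothetical state samples

def fine_synchrony(s):
    """Empirical phi_fine(n1, n2, n3): probability of the ordered counts
    of neurons in the three states, with n1 >= n2 >= n3."""
    counts = Counter()
    for row in s:
        n = sorted((int(np.sum(row == k)) for k in (-1, 0, 1)), reverse=True)
        counts[tuple(n)] += 1
    return {key: c / s.shape[0] for key, c in counts.items()}

phi = fine_synchrony(s)
```

Differentiating such a coarse-grained distribution with respect to the perturbation strength ϵ is what enters the FIM of Eq 5.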
Results in C. elegans
Leading FIM eigenvalues show Zipfian decay
In Fig 4A, we show the rank-ordered eigenvalue spectrum of the FIM for collective synchrony ϕfine calculated with the pairwise model in comparison with several null models: independent neurons, the pairwise maxent model with couplings randomly shuffled between all pairs of neurons (which preserves the distribution of couplings but not the topology of the interaction network), and canonical perturbations directly modifying one coupling at a time by a fixed amount. Across all cases, we expect a hard cutoff at rank Zmax = 234, the dimensionality of the synchrony distribution ϕfine. In contrast with the others, the independent neuron model has a maximum cutoff well below the theoretical maximum, reflecting the essential role of interactions in mediating pairwise perturbations. The cutoff for the pairwise maxent model reveals the replacement rule in Eq 4 to be sufficient to nearly span the full dimensionality of synchrony space. This observation confirms that the pairwise model with the pairwise replacement rule generates an appropriate basis with which to explore collective sensitivity.
Fig 4
(A) FIM eigenvalue spectrum for pairwise maxent, independent (indpt.) and shuffled couplings null models. Results are averaged over M Monte Carlo samples (M = 10 for maxent model and random shuffles, M = 4 for indpt. and canonical). For comparison, we show response to canonical perturbation to couplings. Insets on left show example eigenmatrices of rank 1 and 50, but only the first displays strong vertical striations. Inset on right shows full eigenvalue spectrum. Error bars show standard error of the mean. (B) Eigenmatrix column and row uniformity. Error bars represent a standard deviation across Monte Carlo samples. (C) Rescaled sensitivity. Principal FIM eigenvalues but for neuron subsamples as a function of subsample size. Normalized by the average maximum eigenvalue for the maxent model. Error bars represent standard deviation around means of Monte Carlo samples. Points have been offset for visibility along the x-axis. Compare with Fig P in S1 Text.
The rank-ordered spectrum of eigenvalues λ of rank z initially decays such that each successive level of perturbation returns a multiplicatively smaller response, following over a limited range Zipf's law, λ ∝ z^−1. In contrast, a simple exponential decay would indicate a sharp cutoff for sensitive modes beyond some rank. We find that exponential decay alone does not describe the eigenvalue spectrum and instead find a reasonable fit using a power law with an exponential tail,

λ(z) = A z^−α e^(−z/z_c), (Eq 6)

which includes the exponential form as the special case when α = 0. In Eq 6, we have parameters for vertical scaling A, exponent α, tail location z_c, and numerical precision cutoff Zmax (see Section E of S1 Text for more details about the fitting procedure). We find that the truncated power law usually presents a compelling fit, and the scale of the cutoff indicates a substantial power-law regime as graphed in Fig 5.
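Because log λ(z) in Eq 6 is linear in (1, log z, z), such a fit can be sketched as an ordinary least-squares problem; the spectrum below is synthetic, generated with hypothetical parameters rather than taken from the data:

```python
import numpy as np

# Synthetic spectrum: power law with exponential tail as in Eq 6,
# lambda(z) = A * z**-alpha * exp(-z / z_c), hypothetical parameters.
A_true, alpha_true, zc_true = 10.0, 1.0, 80.0
z = np.arange(1, 235, dtype=float)
lam = A_true * z**-alpha_true * np.exp(-z / zc_true)

# log(lambda) is linear in (1, log z, z), so ordinary least squares
# recovers (log A, -alpha, -1/z_c) directly.
X = np.column_stack([np.ones_like(z), np.log(z), z])
coef, *_ = np.linalg.lstsq(X, np.log(lam), rcond=None)
alpha_hat = -coef[1]
zc_hat = -1.0 / coef[2]
```

On real, noisy spectra a log-space least-squares fit is only a first pass; the fitting procedure referenced in Section E of S1 Text is the authoritative one.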
Fig 5
(A) Power law exponent from fitting FIM eigenvalue spectra comparing observable and canonical perturbations. (B) Exponential tail cutoff z_c. See Fig F in S1 Text for another experiment.
Scaling seems to be a general feature of system statistics because it occurs both in the maxent model and the nulls. Notably, the spectrum for canonical perturbations decays faster than that for observable perturbations. The shuffled couplings model, however, is much closer than the others to the scaling for the pairwise maxent model of the data, suggesting that sensitivity is not primarily determined by the topology of the interaction structure but rather depends on statistical properties of the coupling distribution. This is evocative of theoretical results analyzing the stability of neural networks [79]. Thus, the statistical hierarchy reflects the role of component disorder common across the models, whose overall collective sensitivity is magnified by interactions.

On the other hand, the vertical scales and the scaling exponents—which we find to be close to α = 1 for the maxent model as in Fig 5—vary amongst the nulls. Unsurprisingly, the spectra are scaled lower overall for the independent model since any particular perturbation is isolated to that neuron's contribution to synchrony. The spectra for canonical perturbations are also generally lower, but we expect perturbations in the space of observables and in the space of model parameters to be scaled differently owing to the transformation of variables (Appendix G in S1 Text). In short, the null models we consider (aside from the shuffled couplings) are distinguishable from the observable perturbation in the eigenvalue spectrum.
FIM eigenvectors reveal pivotal neurons
Inspecting the corresponding eigenvectors, we find some modes are dominated by perturbations focused on only a few neurons. To better represent the connection between pairwise perturbations and eigenvectors, we reshape eigenvectors into eigenmatrices V such that the elements in a column indicate how matcher neuron m should imitate its target neighbors t in turn. In this representation, the eigenmatrices display vertical striations as in the inset of Fig 4A. These striations are visible because they are almost exclusively of the same sign, indicating that the mode describes perturbations localized to a single neuron that tend to increase or decrease its correlation with all neighbors simultaneously. In contrast, horizontal striations, which represent uniform perturbations across all the neighbors of a particular neuron, a kind of global perturbation, tend to be sparser and weaker on average. This emergent pattern contained in the block structure of the FIM suggests that localized, uniform enhancement or suppression of synaptic connections leading to a small set of pivotal neurons may serve as an effective mechanism for modulating collective activity.

As a more direct analysis of pivotal neurons, we limit our analysis to perturbations focused on a single matcher neuron at a time. When the FIM is ordered first by matchers and then by targets for each matcher, these perturbations correspond to blocks along the diagonal of dimension (N − 1) × (N − 1) for fixed m and variable t [47]. From the principal eigenvalues of the diagonal blocks, we find that pivotal neurons with the largest eigenvalues tend to coincide with the ones that manifest in vertical striations of the full FIM (Appendix E in S1 Text). These striations correspond to columns with high uniformity. As a measure of this, we define row and column uniformity, respectively,
for each eigenmatrix v with unit norm. When we consider the subspace of leading pivotal neurons and compare it with the subspace of randomly chosen neurons, the principal eigenvalues of the former set saturate much more quickly, as shown in Fig 4C, indicating that a subset of neurons dominates collective sensitivity.

The patterns that we note in both the eigenvalue spectrum and the eigenvector basis hold generally for random neuron subsets of N = 50 sampled from each experiment, the maximum number that can be modeled without overfitting [48]. In contrast with the typical notion that the identities of particular neurons are essential, many pivotal neurons fluctuate between subsets and random samples, as shown in Fig 6. This means that the appearance of pivotal neurons is a feature of the ensemble statistics, a point that we confirm with the shuffled-couplings null (Fig I in S1 Text). Since collective synchrony is a lower-dimensional statistic and in principle permits exchange symmetries between neurons, this is not necessarily surprising. On the other hand, we note that these collective properties are robust to heterogeneity in network connectivity, bias, and interactions that would tend to break such symmetries.
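The bookkeeping above can be sketched in a few lines: reshaping an eigenvector over (matcher, target) pairs into an eigenmatrix, measuring same-sign dominance of its columns, and extracting the per-matcher diagonal blocks of the FIM. The index convention and the particular uniformity formula (signed sum over magnitude sum) are plausible assumptions for illustration; the paper's own equation defines uniformity precisely, and a random symmetric matrix stands in for the real FIM.

```python
# Sketch of the eigenmatrix and diagonal-block analysis on a toy FIM.
import numpy as np

N = 6                         # toy number of neurons
P = N * (N - 1)               # number of (matcher, target) perturbations

rng = np.random.default_rng(0)
A = rng.normal(size=(P, P))
fim = A @ A.T                 # symmetric PSD stand-in for the FIM

eigvals, eigvecs = np.linalg.eigh(fim)
v = eigvecs[:, -1]            # principal eigenvector, unit norm

# Column m of the eigenmatrix: matcher m imitating its N-1 targets in turn
# (assumes matcher-major ordering of the perturbation index).
eigmat = v.reshape(N, N - 1).T

def column_uniformity(v):
    """Same-sign dominance of each column, in [0, 1]."""
    return np.abs(v.sum(axis=0)) / (np.abs(v).sum(axis=0) + 1e-12)

def row_uniformity(v):
    """Same-sign dominance of each row, in [0, 1]."""
    return np.abs(v.sum(axis=1)) / (np.abs(v).sum(axis=1) + 1e-12)

def matcher_block(fim, m, N):
    """(N-1) x (N-1) diagonal block of the FIM for matcher neuron m."""
    i = m * (N - 1)
    return fim[i:i + N - 1, i:i + N - 1]

# Rank candidate pivotal neurons by each block's principal eigenvalue.
block_tops = [np.linalg.eigvalsh(matcher_block(fim, m, N))[-1]
              for m in range(N)]
```

A highly uniform column (value near 1) corresponds to a vertical striation: a single matcher neuron whose perturbations all push its correlations in the same direction.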
Fig 6
Fraction of times out of 10 Monte Carlo samples that a neuron's column uniformity exceeds the 99th and 99.9th percentile cutoffs, computed over the column uniformities of all neurons and eigenmatrices.
See also Fig S in S1 Text.
Discussion
Examples of how sensory and behavioral information is encoded in neural activity suggest that while information is sometimes localized to a few neurons, it is at other times distributed amongst many. This may result from information flow between collective and individual components, and because the precise scale at which information is processed may vary with time, function, and organism [80]. We develop a perturbative approach that is sensitive to this potential variety of neural scales. Using an information-geometric perspective, we identify statistical aspects of the distribution of neural activity that are essential for preserving collective activity. Importantly, we do not have to assume that collective properties will be sensitive to individual neurons or to certain combinations of them, but can discover the appropriate ones in a principled way. Mapping out the combinations comprehensively becomes feasible because the response of the system depends only on pairs of small perturbations, a consequence of the analyticity of the mapping from activity to collective state.

As an in silico realization of the protocol, we use perturbations that mimic internal neural coordination and calculate their impact on collective synchrony. In a minimal model, we find that the dominant perturbative modes tend not to be distributed amongst many neurons but are localized in pivotal ones. The concentration of sensitivity in a few neurons is analogous to the presence of driver neurons, modulation of which drives the system from one configuration to another [81-83]. Interestingly, we find that pivotal neuron identities fluctuate across the subsets and Monte Carlo samples used to estimate the entries of the FIM. Instead of specific pivotal neurons, it is the collective properties of localized eigenvectors and scaling in sensitivity that are preserved.

The collective properties require neural heterogeneity.
Since identical neurons would imply uniform eigenvectors, variation in each neuron's local interaction network and bias is responsible for the emergence of pivotal neurons. By shuffling the inferred interaction network, we verify that these features appear to result from the distribution and magnitude of interactions, not from the specific topology. In this sense, the neurons do not need to be labeled differently; rather, the collection of local interactions and biases is responsible for the statistical properties of the FIM, echoing findings about the statistical properties that make neural networks stable [79].

At first glance, this seems to be at odds with the consensus that neurons involved in motor circuits play fixed roles, such as AVAs in reversals [4]. It is not. We focus on collective synchrony as a simple proxy for behavior, and the interchangeability of pivotal neurons may stem from the way that synchrony treats each neuron equally. By definition, it cannot distinguish between neurons with respect to a particular behavior. If we instead imagine searching for controllers of specific behaviors, this symmetry is likely lost, such that control may be localized in fixed neurons. It remains for future work to reconcile our result about a statistically robust property of collective neural activity with its functional implications for sparse behavioral control.

One major question that we approach in our framework is how experimental perturbations should be represented in the model. While the focus is often on model parameters as representations of physical dials that are experimentally accessible, a direct correspondence is not obvious for a statistical model inferred from data. As an example, when couplings inferred by a pairwise maxent model were directly compared with physical contact between protein residues, the model recovered only a subset of real interactions even while recovering the statistical ensemble [60, 84].
Taking the pairwise maxent model, it is clear exactly what increasing a coupling does to the energy function (it enhances the tendency of a pair of neurons to coincide), but the actual result is a nontrivial modification of the entire distribution. For the experimentalist, it seems more natural to consider the problem from the perspective of the observable statistics, which can be perturbed in a controlled and precise manner. In the case of Boltzmann-type models, the relationship between observables and model parameters can be made exact for small perturbations. On the other hand, other neural models may more faithfully capture temporal dynamics or consider additional collective statistics, which could be incorporated in future work. Thus, our formulation is a theoretically extensible and experimentally tractable framework for predicting the effects of perturbations.

An important caveat regarding our predictions is that the model is based on neural activity that may reflect other types of activity [40]. Calcium levels may indicate not only behavioral actuation and modulation but also proprioception, efference copies, or sensory information. They may also depend strongly on environmental constraints, such as the bead immobilization used here. The interactions that we recover between neurons would then confound physical and apparent interactions. As a result, a statistical connection from an in silico individual neuron to collective synchrony may not indicate a causal pathway. Though this is a problem common to studies that rely solely on measurement data, previous work shows that maximum entropy models can in some cases identify real physical interactions from correlations [84]. A definitive answer, in any case, requires a perturbative experiment, and our formulation provides a set of predictions that can be directly checked.

To arrive at such verification, a perturbation experiment requires innovations on top of previous experiments.
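The earlier point that nudging a single coupling reshapes the entire distribution can be seen in a toy pairwise maxent (Ising) model small enough to enumerate exactly. The couplings, system size, and perturbation strength below are arbitrary choices for this sketch, not the inferred C. elegans model.

```python
# Toy demonstration: in a 4-spin pairwise maxent (Ising) model, nudging
# the single coupling J_01 shifts correlations throughout the
# distribution, not only <s0 s1>.
import itertools
import numpy as np

def correlations(J):
    """Exact pairwise correlations <s_i s_j> by state enumeration."""
    states = np.array(list(itertools.product([-1, 1], repeat=4)))
    # Energy E = -sum_{i<j} J_ij s_i s_j for each of the 16 states
    E = -np.einsum('ki,ij,kj->k', states, np.triu(J, 1), states)
    p = np.exp(-E)
    p /= p.sum()
    return (states.T * p) @ states   # entry [i, j] = <s_i s_j>

J = np.array([[0.0, 0.3, 0.2, 0.0],
              [0.0, 0.0, 0.4, 0.1],
              [0.0, 0.0, 0.0, 0.3],
              [0.0, 0.0, 0.0, 0.0]])

c0 = correlations(J)
J2 = J.copy()
J2[0, 1] += 0.05                     # perturb one coupling only
c1 = correlations(J2)

# <s2 s3>, a pair whose coupling is untouched, still moves:
delta_23 = c1[2, 3] - c0[2, 3]
```

By linear response, the perturbed pair's own correlation must increase, while the shift in other pairs is governed by connected four-point correlations, which are generically nonzero in an interacting model.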
For example, small perturbations require a sufficient number of samples for the perturbed parameters to be measurable (Appendix I in S1 Text). Using the fact that the Fisher information is proportional to the number of independent samples, together with the Cramér-Rao bound, a lower bound on the number of experimental observations T required to distinguish a perturbation of strength ϵ along a mode with Fisher information λ is on the order of T ∼ 1/(λϵ²). Taking a relatively large perturbation of ϵ = 10⁻², we have T ∼ 10 independent samples for an eigenvalue λ ∼ 10³, the order of magnitude of the largest mode (Appendix D in S1 Text). Linear growth in the number of required samples, however, prevents us from measuring insensitive modes in a reasonable amount of time (the experiments analyzed here last 8 minutes and collect roughly 80 to 120 independent samples [48]). Though mapping the full local information geometry of N = 50 neurons would be difficult because the number of pairwise perturbations exceeds ∼ 10⁶ (accounting for the two distinct orderings of matcher and target neurons), a similar experiment with N ∼ 10 neurons seems reasonable when coupled with techniques for incomplete matrix estimation [85]. Importantly, our formulation with pairwise perturbations and the maxent model strongly constrains both the high experimental and computational costs (further discussion in Appendix I in S1 Text).

Another limitation on implementation is that the duration of the experimental trials and the number of samples needed to estimate a maxent model limit the types of behaviors that can be studied. The experiments we consider last on the order of 10² s, so pairwise correlations can reflect neural activity on the timescale of several correlated behaviors, including much faster head casts, bending, and even multiple locomotory reversals and the transitions between them [59, 86].
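The sampling bound above is easily checked numerically; `samples_needed` is a hypothetical helper written for this sketch, with λ denoting the Fisher information along the perturbed mode.

```python
# Back-of-the-envelope Cramér-Rao estimate: T >~ 1 / (lambda * eps^2)
# samples are needed to resolve a perturbation of size eps along a
# mode with Fisher information lambda.
def samples_needed(lam, eps):
    return 1.0 / (lam * eps ** 2)

# The worked example from the text: eps = 1e-2 and lambda ~ 1e3
T = samples_needed(1e3, 1e-2)   # ~ 10 independent samples
```

Because T grows as 1/λ, modes a few orders of magnitude below the leading eigenvalue quickly become unmeasurable within an 8-minute recording.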
In this sense, perturbations indicate modifications to neural synchrony representative of this mixture and likely not of any single stereotyped behavior. In order to run our proposed protocol, it is important to specify the time-averaged neural statistics of interest and to seek out recurrences of the same statistics at later points of observation. In other words, our modeling approach is flexible with respect to the particular statistics, but that choice must be made consistently throughout the experiment (Appendix I in S1 Text).

The feasibility of such an experiment stems from our argument that perturbative experiments harness analyticity to vastly simplify the range of possible perturbations. We exploit this property to extract a basis for the local information geometry, including multi-component perturbations. One goal of such experimental intervention is then not to be more precise but to be sufficiently varied to span the local basis, an idea that can be generalized to experimental systems beyond the C. elegans model we consider.

Our procedure is general enough to be applied to other neural systems and to collective behavior in systems biology. Experimentally, our method requires measuring and making small perturbations to neural activity at the level of individual components, along with simultaneous measurement of a collective state or output behavior. Theoretically, it relies on a statistical modeling approach that assumes a discrete, time-independent description. One could imagine adapting the method to hormone concentrations, gene expression, or other experimentally accessible quantities [87-92] and to output behavior including morphological descriptors [42, 93, 94], a stereotyped behavioral sequence [50, 95, 96], or developmental outcomes. Of these possibilities, the most straightforward adaptation would be to other model neural systems such as cortical cell cultures or the muscle neurons involved in Drosophila flight [97].
Although neurons in these systems show discrete action potentials instead of graded activity, the standard mapping to binary states could lead to comparable results. Similar modeling approaches to the aforementioned systems, such as discretizing levels of gene expression, could lay the basis for further extensions of the approach. Furthermore, we focused on a collective statistic measured at the same short timescales as the neural perturbations, but this could be expanded to measurements over longer timescales coinciding with behaviors of interest. In this sense, our work provides a generalizable theoretical framework that also incorporates specific perturbative predictions testable across multiple biological systems.

The information geometry of scientific theories more generally suggests that they show sparse structure, in which a few parameter dimensions strongly change the qualitative characteristics of phenomena while most dimensions are unimportant. This is a result of the logarithmically spaced eigenvalues of the FIM, or "sloppiness" [44, 78, 98, 99]. Whether by nature or by design, such quantitative reduction vastly simplifies the level of detail required for approximate theories, allowing for accurate prediction even with large uncertainty in most parameters [100, 101]. We find here a variation on this idea in the statistics of neural activity in C. elegans. While sensitivity is concentrated in a few modes, neural activity is not conventionally sloppy, because the largest eigenvalues decay more slowly, following Zipf's law. What explains the slower decay, or self-similarity? Might this be a feature of biological neural networks that reduces the dimensionality of control parameters or nests varying levels of control? The state of the art in single-neuron measurement and optogenetics in C. elegans is approaching a point where experiments to test this hypothesis may become a reality.
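The kind of discretization discussed above can be sketched simply: map a continuous trace to three states (decreasing, flat, increasing) by thresholding its time derivative. The threshold value and the use of a plain finite difference are illustrative assumptions, not the paper's exact procedure.

```python
# Sketch: 3-state discretization of a continuous activity trace by
# thresholding the finite-difference derivative. Threshold is an
# arbitrary assumption for illustration.
import numpy as np

def discretize(trace, thresh=0.05):
    """Return states in {-1, 0, +1} for each step of a 1D trace."""
    d = np.diff(trace)
    states = np.zeros(len(d), dtype=int)
    states[d > thresh] = 1      # increasing
    states[d < -thresh] = -1    # decreasing
    return states               # near-zero derivative stays 0 (flat)

trace = np.array([0.0, 0.2, 0.4, 0.41, 0.42, 0.2, 0.0])
s = discretize(trace)   # -> [1, 1, 0, 0, -1, -1]
```

A binary variant for spiking systems would simply threshold the signal itself rather than its derivative.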
Discovering sparse control strategies in neural activity.
Appendices A-J. (PDF)

17 Nov 2021

Dear Dr. Lee,

Thank you very much for submitting your manuscript "Discovering sparse control strategies in C. elegans" for consideration at PLOS Computational Biology.

As with all papers reviewed by the journal, your manuscript was reviewed by members of the editorial board and by several independent reviewers. In light of the reviews (below this email), we would like to invite the resubmission of a significantly revised version that takes into account the reviewers' comments.

We cannot make any decision about publication until we have seen the revised manuscript and your response to the reviewers' comments. Your revised manuscript is also likely to be sent to reviewers for further evaluation.

When you are ready to resubmit, please upload the following:

[1] A letter containing a detailed list of your responses to the review comments and a description of the changes you have made in the manuscript. Please note while forming your response, if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out.

[2] Two versions of the revised manuscript: one with either highlights or tracked changes denoting where the text has been changed; the other a clean version (uploaded as the manuscript file).

Important additional instructions are given below your reviewer comments.

Please prepare and submit your revised manuscript within 60 days. If you anticipate any delay, please let us know the expected resubmission date by replying to this email. Please note that revised manuscripts received after the 60-day due date may require evaluation and peer review similar to newly submitted manuscripts.

Thank you again for your submission.
We hope that our editorial process has been constructive so far, and we welcome your feedback at any time. Please don't hesitate to contact us if you have any questions or comments.

Sincerely,

Matthieu Louis
Associate Editor
PLOS Computational Biology

Lyle Graham
Deputy Editor
PLOS Computational Biology

***********************

Reviewer's Responses to Questions

Comments to the Authors:
Please note here if the review is uploaded as an attachment.

Reviewer #1: My review is also attached in PDF format for readability.

Review PCOMPBIOL-D-21-01485
Discovering sparse control strategies in C. elegans

The authors leverage a maximum entropy (max-ent) model of the neural activity of the nematode C. elegans to probe the sensitivity of some collective neural statistics to experimentally inspired in-silico perturbations. Calcium recordings of neural dynamics are first discretized into 3 states, and a topologically conserving maximum entropy model is built to match pairwise correlations and mean activity. Given this class of models, the authors choose a fine-grained statistical measure, termed "synchrony" (the number of neurons in each of the 3 states), and probe the sensitivity of this measure to small perturbations to pairs of neurons. The perturbation consists of clamping a matcher neuron to a target neuron with a small probability. Sensitivity of "synchrony" to such perturbations is measured in information-geometric terms through the Fisher Information Matrix (FIM) and its eigenvalues and eigenvectors. Through this methodology, the authors show that the "synchrony" is sensitive to a few principal perturbation modes, with the eigenvalue spectrum exhibiting power-law behavior for a significant number of modes, revealing a wide hierarchy of "stiff" directions. In addition, the authors find that the large-eigenvalue modes tend to be dominated by a small set of "pivotal" neurons, which tend to consistently increase or decrease the synaptic connections with matcher neurons.
Given these findings, the authors argue that control of collective neural statistics is sparse in the number of neurons, while also being non-sloppy in the sense that the eigenvalues of the FIM decay initially as a power law, so the system does not exhibit a clear separation between sloppy and stiff perturbative eigenmodes.

In summary, I am sympathetic with the general framework of the manuscript. Combining maximum entropy modelling with information-geometric analysis is somewhat innovative, and the authors show that it provides novel insight into the control mechanisms of biological neural networks in a system that has already been extensively studied. However, besides some concerns regarding the legibility and the structure of the manuscript, I have some deep concerns regarding the validity and the generalizability of the approach that I think need to be addressed before I can recommend publication in this journal.

1) The analysis is based on a discretization of the neural dynamics that, while motivated by previous work, might miss important finer-scale information. Indeed, unlike the spiking neurons of the mammalian brain, most C. elegans neurons exhibit continuous dynamics, and therefore the 3-state discretization might erase important information. The authors should better motivate this choice, also to help with the generalizability of the approach. Given time series data, how should one discretize before building a max-ent model?

a) I understand that building a maximum entropy model on a system with more states is challenging from a computational perspective, but I believe that this point should be explicitly discussed, especially considering the interpretation of the results.

2) It should also be clearly stated that the analysis is focused solely on time-invariant statistics of a 3-state discretization of the neural code.
On this point I have a few detailed concerns:

(a) Maximum entropy models are typically used to infer the interaction rules that give rise to the observed steady-state distribution (given some underlying dynamics). Therefore, an important assumption underlying the inference of the model is that the system has been observed for a timescale longer than its relaxation time, such that it has clearly reached a steady state. However, in the analyzed data it is unclear whether the observed statistics truly reflect the steady state of the system, given that the measurement time, T, is relatively short compared to the transition timescales reported in Scholz et al.

(b) Neurons are inherently dynamical, and thus control mechanisms are also fundamentally dynamical, acting instantaneously on the continuous neural signal. In that sense, the max-ent model used in this manuscript can only probe control at the level of steady-state statistics, and not fine-scale control mechanisms.

(c) This is particularly relevant when justifying the use of the maximum-entropy model in an experimental protocol. Real-time perturbations are instantaneous, and measuring their impact through "synchrony" is only possible after a timescale comparable to the timescale used for inferring the maximum entropy model. Relating back to concern (a), it's not obvious that splitting the data into two T/2 segments, for example, would not result in significant changes in the inferred max-ent model, even though there are no apparent external perturbations to the system.

3) While it has been shown (Chen et al. PRE 2019;99(5):052418) that max-ent models can accurately approximate higher-order correlations between neural states in C. elegans, there is no figure in the current manuscript that shows that that is true in this dataset. The reference given in Chen et al. for the data is a bioRxiv preprint (Scholz et al. 2018, https://www.biorxiv.org/content/10.1101/445643v1), which has received contestation by other experts in the field in the comments section of bioRxiv. Indeed, to my knowledge the peer-reviewed version of the referenced manuscript is (Hallinen et al. eLife 2021;10:e66135), and not the reference given by the authors. Thus, I'd recommend that the authors clarify both the source of the data and the capability of the max-ent model to predict higher-order statistics in this dataset.

4) The reasoning behind using subgroups of N=50 neurons should be more clearly explained in the current manuscript. One must resort to reading Chen et al. to know that it is related to overfitting.

5) The use of collective 3-state neural statistics as a proxy for behavior is imprecise and unjustified. Behavior is inherently dynamical in nature, exhibiting a continuity of timescales. In C. elegans specifically, behavior ranges from muscle contractions to single body waves, to sequences of body waves that give rise to forward, backward and turning bouts, to sequences of bouts that result in different navigation strategies, and so on. On what level is the synchrony measure presented by the authors related to behavior? Specifically, the calcium activity of certain C. elegans neurons has explicitly been associated with the initiation of different bouts, like forward or backward locomotion, but longer-timescale sequences of behaviors have yet to be associated with neural dynamics explicitly. In addition, given the inability of the max-ent model to capture neural dynamics, longer-timescale sequences of behavior are inherently unattainable in the current approach.
If the relationship between the chosen statistic and behavior is not clearly justified, I'd recommend that terms like "neural-behavioral map", as used in the abstract, be removed.

6) Related to both points 3) and 5), the authors provide no justification for the use of a dataset with only two recordings on immobilized worms, even though the data provided by Hallinen et al. also includes other experimental protocols, including recordings in freely moving worms. What is the reason for excluding this data? This is especially puzzling given the authors' claim that the presented approach can be generalizable to other "measures of internal states" including behavioral output. Could these statements be made precise? After all, simultaneous measurements of neural and behavioral dynamics are available, and I'd be curious to see how to translate this approach to behavioral dynamics.

7) The use of the synchrony statistic should be better justified. In what sense is this a complete enough measure of collective neural states and, more importantly, behavior? This is especially important given that all the results of the manuscript depend on this statistic.

a) How much more computationally expensive is it to include state identities in the synchrony statistic?

b) How justified is the connection to the concept of sloppiness? Sloppiness is studied with respect to the likelihood function of a set of parameters given the data (Transtrum, J Chem Phys 2015;143(1):010901), which is not the same as choosing a test statistic and examining how it varies with respect to perturbations. Why is the FIM not computed w.r.t. the likelihood function?

8) The authors make no attempt to connect to other modelling paradigms of collective neural activity in C. elegans, even though there are several examples that even connect to the underlying neurobiological control. Here are a few examples, although there are others:

i) Brennan et al. eLife 2019;8:e46814
ii) Morrison et al., Front Comput Neurosci. 2021 Jan 22;14:616639
iii) Fieseler et al., J. R. Soc. Interface. 17:20200459

9) The Fisher Information Matrix computed in the manuscript measures the sensitivity of the chosen test statistic to perturbations that are reflected in the changing of model parameters. In that sense, it seems to be specific both to the model class and the chosen statistics. How generalizable is this? Can this FIM analysis be reproduced using other modelling paradigms?

Besides these more general concerns, I also have an extensive list of more technical comments:

1) There is a typo on line 16: neruons should read neurons.

2) In the paragraph starting in line 18, environmental conditions should be included as an important barrier to the connection between global neural state and behavior. Indeed, this study uses experiments in which worms are immobilized and drugged, which severely impairs the mapping between neurons and behavior.

3) I really cannot see the relevance of the analogy with sensory processing (line 30) in the context of the neural-behavioral map; behavior is very different from perception. Can you clarify this point?

4) The statement in line 43 starting with "In the formulation" is unclear and grammatically incomplete.

5) Reference 48 in line 104 is not very precise in the context of behavioral sequences, especially for C. elegans. While discrete behavioral states are defined in ref. 48, no explicit temporal sequences of behavior are described. In the context of C. elegans other references would be more suitable, e.g. (Gomez-Marin et al., J. R. Soc. Interface 13: 20160466) or more recently (Costa et al., arXiv:2105.12811v2).

6) How exactly are the states "flat", "increasing", "decreasing" defined from the calcium signals? Information about the discretization of the data should be explicitly included in the methods. The reader shouldn't have to resort to other references for such important information.

7) In line 146, the reference to figures A6 and A7 is unjustified.
Where is there a figure to support the fact that accounting for pairwise correlations is sufficient to capture higher-order correlations? Also, at this stage the statistic phi_coarse is not defined and perhaps should be defined in the figure caption.

8) There is no reference in the text of the manuscript to Fig. 3. Also, it is not specified whether phi_fine or phi_coarse is used in Figs. 3C,D,E,F,G. In addition, Fig. 3D is very hard to read.

9) The section about the different types of perturbations starting in line 188 should be made clearer, and perhaps there should be a reference to Appendix D.

10) The reference to Fig. C.10 in line 234 is challenging to understand, as the exponents have not been defined yet.

11) In lines 236 and 237, it's not at all clear where the Z_max values are coming from, and Fig. E.19 does not clarify that. There should be more guidance to the figure or in the figure captions to make these points clearer.

12) In the paragraph before Eq.(8), how are the diagonal blocks exactly defined? Is it possible to show a figure of these matrices and how the blocks are identified?

13) In Eq.(8), how exactly are the eigenvectors normalized?

14) The shuffle models are not clearly explained in the text, and so their relevance becomes difficult to judge. What is the point of comparing with the different shuffles? There should be a section explaining these in detail and their relevance. In particular, the shuffled couplings control is confusing, as it seems to only shuffle the neuron indices while preserving everything else. What's the point?

15) How surprising is it that the pivotal neurons fluctuate across random subsets, given the fact that the statistic that is used to compute the FIM is independent of state and neuron identity?

a) Relatedly, it should be possible to reliably obtain neuron identities, even if only for a subset of neurons. Certain neurons are known to be important for initiating and maintaining certain behaviors (e.g. AVA); where are these in your analysis? Do the most consistent pivotal neurons identified in your approach correspond to known important neurons for C. elegans behavior? It would be very interesting if you could establish any testable predictions regarding the underlying neurobiology.

16) In Fig. 4A, the inset matrices are hard to see. I recommend expanding them.

17) The discussion paragraph starting in line 320 is also confusing. Doesn't shuffling the interactions effectively give neurons different roles?

18) In the second paragraph of Appendix A there is a typo. Before references [90,91], "of" in "of the likelihood" should be removed.

19) It is difficult to see, comparing Fig. C14 with Fig. C11, that the distinction of pivotal neurons from uniformities with the MCH model is not as consistent as with the topological max-ent model introduced previously. Can this be explicitly quantified?

20) In Fig. A.6C, the figure caption for the inset seems to be wrong.

21) The results of Fig. B.8 are insufficient to show that the eigenvectors of the inferred FIMs are robust to the sample size K. In fact, there seem to be significant differences in the secondary eigenvector between K=10^5 and K=10^6. Such differences should be better quantified, and this analysis should be done across multiple samples, not just one example.

22) There is a collection of typos in the first sentence of Appendix C.

23) Across the supplementary figures, subsets A, B, C and D are used. How exactly are these subsets defined?

24) There is a typo in the last sentence of page 24: "and" in "consider and the fine" should be removed.

25) Some supplementary figures have been given the wrong labels. For instance, Fig. E.17 pertains to Appendix D but has been labelled as if belonging to Appendix E, and Figures G have nothing to do with Appendix G.
The labelling is confusing.

26) "Furthermore" is repeated in two subsequent sentences in the first paragraph of Appendix G.

Reviewer #2:

Summary of the paper:

The authors of the manuscript "Discovering sparse control strategies in C. elegans" describe an analytical perturbation protocol that is potentially experimentally feasible and should yield insight into the neurons that are most relevant to behavior. They use a (published) maximum entropy model extracted from the measured neuronal activity of C. elegans to test the impact of their pairwise perturbations in this system, finding that some specific 'pivotal' neurons are very sensitive to perturbation, possibly indicating that these are also relevant for behavior.

Review summary:

While I think the work itself is interesting and a relevant addition to the field, adding rigorous theoretical underpinnings to perturbation experiments that are now feasible or at least within reach, I have some major concerns about this paper, which I detail below. Foremost, these are concerned with the presentation and context of this work, rather than the theoretical results. I therefore believe they are addressable by a major revision and editing for clarity.

Major issues:

There are some inconsistencies in the framing of the paper: the authors claim in the introduction and the abstract that this perturbation protocol is "(...) for systematic perturbation of neural states that limits experimental complexity but still characterizes collective aspects of the neural-behavioral map.". The introduction refrains from considering any of the non-feedforward nature of neural systems, where not all neural signals map directly to behavior. Representations of behavior might be due to motor commands or sensory-motor transformation. But they might also reflect proprioception or efference copies.
The connection from neural activity to behavior and the concrete impact of this work need to be better explained, especially since the maxent model is based on calcium imaging data.

A key relevant citation for high-throughput optogenetics in C. elegans is missing: https://doi.org/10.1038/s41592-018-0233-6, which also finds a small number of locomotion-relevant neurons in the worm.

The introduction mixes organisms, techniques and results across a number of (very different) organisms to support its points, for example in ln. 13-17, and especially ln. 58, where Phineas Gage and worms find themselves in the same citation. The motivation could be strengthened significantly by selecting citations from the same or similar organisms that are technically amenable to these measurements, or alternatively by substantially expanding the discussion of the other results and how they affect the interpretation with respect to the proposed perturbation approach.

There exist a number of similar, published datasets from other groups (e.g. from the Zimmer lab). Do the results generalize to datasets obtained with slightly different imaging modalities? If not, why?

Clarity of text and presentation: Panels and one figure seem not to be cited in the main text (e.g., Figure 3, 4B). It is also a bit confusing when large sections of the text refer to figures that are relegated to the appendix; this is mostly the case in the section about the eigenvalue decay statistics. I think the work could benefit from a thorough editing for clarity in text and visual presentation. The appendix is clearly written and more approachable than the main text, and the main text would benefit from a similar tone (e.g., Appendix A and C are particularly well done). I wonder if the authors would consider compressing the main text even more and focusing on the main argument of the perturbation strategy and the results for the C. elegans datasets, to make this manuscript more easily readable and a useful resource for different readers.

In the calculations, a subset of N=50 neurons is often used, when the nervous system of the worm comprises 302 neurons. Does the fact that some couplings are missing in this subset affect the conclusions, especially the conclusion about sparse control? If not, do we learn something about an efficient experiment in which one only needs to observe a few neurons?

I am curious to know whether the fact that the model is derived from calcium imaging data changes the interpretation of the model at all. It seems that calcium imaging detects neural activation more readily than suppression, so there is a fundamental asymmetry in the underlying data. How does this impact the ability to detect certain couplings, and relatedly the column uniformity in Fig. 4?

I am a bit confused by the following statement (ln. 285-286): "In contrast with the typical notion that the identities of particular neurons are essential, pivotal neurons fluctuate between subsets and random samples as shown in Figure 5." This seems to contradict much of the literature, in which neurons do play distinct roles in behavior and these have been identified in the worm, e.g. AVA's role in reversals. It would be helpful for the reader to unpack this and similar statements about sloppy-sparse control and put them in the context of the real, biological network. This would add to the discussion.

Minor:
Paragraph ln. 93-106: missing citation 10.1016/j.cell.2020.12.012
Citation [48] should instead cite one or more of the many C. elegans behavioral papers that are relevant here
Panel 4C: the maxent model data is barely visible
Check the panel labelings in Fig. 4
Appendix C: '...turning them up, for a particular neuron is effective'... effective in what way?

Reviewer #3:

The authors develop a theoretical framework using perturbation experiments to discover the neurons in a nervous system that induce the most change in the behavior of the system.
They apply their methodology to a minimal model of C. elegans neural activity, and they find the model is most sensitive to a few principal perturbative modes.

One of the primary contributions of this work is the way in which it extends earlier work on "sloppy" models. Whereas the original work primarily relies on an analysis of the curvature of the manifold from the dynamical description of the system, this work estimates the curvature through local perturbations. This is of practical use to neuroscientists, who can only perturb their model living organisms.

My main issue with this paper is that it is hard to follow. This is partly because the approach is itself relatively sophisticated. But part of the difficulty is due to the writing and the organization.

According to the title, the expectation is that there is a certain focus on learning something specific about C. elegans. However, the C. elegans model as an example system diverts from the real focus of the contribution, which is the framework. The paper's main contribution does not seem to be the discovery of sparse control strategies in C. elegans. Instead, the paper's central contribution seems to be the framework for discovering the neural components that affect behavior and therefore are likely to be involved in the control of the nervous system.

Presumably the choice of a C. elegans model was to provide useful predictions for experimentalists. However, the predictions are not clear, particularly in relation to understanding the neural basis of C. elegans behavior. And unfortunately, the choice of the C. elegans model leaves the validity of the framework untested.
Given that the focus was on showing the utility of their novel perturbative approach, a simple toy neural-behavioral model, where the ground truth would be known and tractable, would have made the approach both more convincing and more easily interpretable.

My suggestion would be to improve the clarity of the contribution by either making the focus more clearly on C. elegans and elaborating more succinctly on the predictions, or by focusing more concretely on the framework and providing first a toy model that illustrates the usefulness and validity of the approach, before applying it to data from C. elegans.

**********

Have the authors made all data and (if applicable) computational code underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data and code underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data and code should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data or code, e.g. participant privacy or use of data from a third party, those must be specified.

Reviewer #1: No: The dataset had been previously published (although the reference provided in the current manuscript is wrong, see Comments to the Authors), but no explicit link to the data used in this manuscript is provided. Also, the code is available on github, but currently there is no section in the main text or appendix pointing to the code.
I believe that this information should be stated more explicitly, with direct links to both code and data appearing in the main text (or in the Appendix).

Reviewer #2: Yes
Reviewer #3: Yes

**********

PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files. If you choose "no", your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No
Reviewer #2: No
Reviewer #3: No

Figure Files:
While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email us at figures@plos.org.

Data Requirements:
Please note that, as a condition of publication, PLOS' data policy requires that you make available all data used to draw the conclusions outlined in your manuscript. Data must be deposited in an appropriate repository, included within the body of the manuscript, or uploaded as supporting information. This includes all numerical values that were used to generate graphs, histograms etc. For an example in PLOS Biology see here: http://www.plosbiology.org/article/info%3Adoi%2F10.1371%2Fjournal.pbio.1001908#s5.

Reproducibility:
To enhance the reproducibility of your results, we recommend that you deposit your laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future.
Additionally, PLOS ONE offers an option to publish peer-reviewed clinical study protocols. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols

Submitted filename: Review_PCOMPBIOL-D-21-01485.pdf

17 Feb 2022
Submitted filename: ref response.pages.pdf

18 Mar 2022

Dear Dr. Lee,

Thank you very much for submitting your manuscript "Discovering sparse control strategies in neural activity" for consideration at PLOS Computational Biology. As with all papers reviewed by the journal, your manuscript was reviewed by members of the editorial board and by several independent reviewers. The reviewers appreciated the attention to an important topic. Based on the reviews, we are very likely to accept this manuscript for publication, provided that you modify the manuscript as best as possible according to the remaining recommendations of reviewers #1 and #2.

Please prepare and submit your revised manuscript within 30 days. If you anticipate any delay, please let us know the expected resubmission date by replying to this email.

When you are ready to resubmit, please upload the following:

[1] A letter containing a detailed list of your responses to all review comments, and a description of the changes you have made in the manuscript. Please note, while forming your response, that if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments.
If eligible, we will contact you to opt in or out.

[2] Two versions of the revised manuscript: one with either highlights or tracked changes denoting where the text has been changed; the other a clean version (uploaded as the manuscript file).

Important additional instructions are given below your reviewer comments.

Thank you again for your submission to our journal. We hope that our editorial process has been constructive so far, and we welcome your feedback at any time. Please don't hesitate to contact us if you have any questions or comments.

Sincerely,

Matthieu Louis
Associate Editor
PLOS Computational Biology

Lyle Graham
Deputy Editor
PLOS Computational Biology

***********************

A link appears below if there are any accompanying review attachments. If you believe any reviews to be missing, please contact ploscompbiol@plos.org immediately:
[LINK]

Reviewer's Responses to Questions

Comments to the Authors:
Please note here if the review is uploaded as an attachment.

Reviewer #1: The review is uploaded as an attachment.

Reviewer #2: The authors have addressed my comments. I have a few remaining textual changes to improve the accuracy of the terms used.

"making small perturbations to internal states at the level of individual components"
Internal states has a distinct meaning in neuroscience, e.g., motivation, hunger, sleepiness, which could be confusing to the reader in this sentence. The authors should just use neural activity in its place here. Later in the same paragraph, the authors even refer to that meaning of 'internal state', but also use gene expression in the list. I would recommend staying with 'microscopic quantities' or similar to avoid any confusion.

"the anterior neural network" - this should be anatomically correctly named, as there is no such thing as an anterior neural network in C. elegans: a subset of neurons from the anterior ganglia in the worm head, or simpler, 'neurons located in the anterior of the worm'.

Minor: References to "Theoretical formulation." should be replaced with the new section title.

**********

Have the authors made all data and (if applicable) computational code underlying the findings in their manuscript fully available?

Reviewer #1: Yes
Reviewer #2: Yes

**********

Do you want your identity to be public for this peer review?

Reviewer #1: No
Reviewer #2: No
References:
Review your reference list to ensure that it is complete and correct. If you have cited papers that have been retracted, please include the rationale for doing so in the manuscript text, or remove these references and replace them with relevant current references. Any changes to the reference list should be mentioned in the rebuttal letter that accompanies your revised manuscript. If you need to cite a retracted article, indicate the article's retracted status in the References list and also include a citation and full reference for the retraction notice.

Submitted filename: Review PCOMPBIOL.pdf

24 Mar 2022
Submitted filename: ref response 2.pdf

1 Apr 2022

Dear Dr.
Lee,

We are pleased to inform you that your manuscript 'Discovering sparse control strategies in neural activity' has been provisionally accepted for publication in PLOS Computational Biology.

Before your manuscript can be formally accepted you will need to complete some formatting changes, which you will receive in a follow-up email. A member of our team will be in touch with a set of requests. Please note that your manuscript will not be scheduled for publication until you have made the required changes, so a swift response is appreciated.

IMPORTANT: The editorial review process is now complete. PLOS will only permit corrections to spelling, formatting or significant scientific errors from this point onwards. Requests for major changes, or any which affect the scientific understanding of your work, will cause delays to the publication date of your manuscript.

Should you, your institution's press office or the journal office choose to press release your paper, you will automatically be opted out of early publication. We ask that you notify us now if you or your institution is planning to press release the article. All press must be co-ordinated with PLOS.

Thank you again for supporting Open Access publishing; we are looking forward to publishing your work in PLOS Computational Biology.

Best regards,

Matthieu Louis
Associate Editor
PLOS Computational Biology

Lyle Graham
Deputy Editor
PLOS Computational Biology

***********************

3 May 2022
PCOMPBIOL-D-21-01485R2
Discovering sparse control strategies in neural activity

Dear Dr Lee,

I am pleased to inform you that your manuscript has been formally accepted for publication in PLOS Computational Biology. Your manuscript is now with our production department and you will be notified of the publication date in due course. The corresponding author will soon be receiving a typeset proof for review, to ensure errors have not been introduced during production.
Please review the PDF proof of your manuscript carefully, as this is the last chance to correct any errors. Please note that major changes, or those which affect the scientific understanding of the work, will likely cause delays to the publication date of your manuscript.

Soon after your final files are uploaded, unless you have opted out, the early version of your manuscript will be published online. The date of the early version will be your article's publication date. The final article will be published to the same URL, and all versions of the paper will be accessible to readers.

Thank you again for supporting PLOS Computational Biology and open-access publishing. We are looking forward to publishing your work!

With kind regards,

Zsanett Szabo
PLOS Computational Biology | Carlyle House, Carlyle Road, Cambridge CB4 3DN | United Kingdom
ploscompbiol@plos.org | Phone +44 (0) 1223-442824 | ploscompbiol.org | @PLOSCompBiol