
Sparse connectivity for MAP inference in linear models using sister mitral cells.

Sina Tootoonian, Andreas T Schaefer, Peter E Latham.

Abstract

Sensory processing is hard because the variables of interest are encoded in spike trains in a relatively complex way. A major goal in studies of sensory processing is to understand how the brain extracts those variables. Here we revisit a common encoding model in which variables are encoded linearly. Although there are typically more variables than neurons, this problem is still solvable because only a small number of variables appear at any one time (sparse prior). However, previous solutions require all-to-all connectivity, inconsistent with the sparse connectivity seen in the brain. Here we propose an algorithm that provably reaches the MAP (maximum a posteriori) inference solution, but does so using sparse connectivity. Our algorithm is inspired by the circuit of the mouse olfactory bulb, but our approach is general enough to apply to other modalities. In addition, it should be possible to extend it to nonlinear encoding models.


Year: 2022 | PMID: 35100264 | PMCID: PMC8830798 | DOI: 10.1371/journal.pcbi.1009808

Source DB: PubMed | Journal: PLoS Comput Biol | ISSN: 1553-734X | Impact factor: 4.475


1 Introduction

A common view of sensory systems is that they invert generative models of the environment to infer the causes underlying sensory input. Sensory input is typically ambiguous, so a given input can be explained by multiple causes. Consequently, correct inference requires adequately accounting for interactions among causes. For example, increased evidence for one cause often reduces the probability of, or “explains away”, competing causes (if you think the object you’re smelling is an orange, that makes it less likely to be a lemon). Any neural circuit performing inference must therefore implement mechanisms for inter-causal interaction. This typically results in dense—and in many cases all-to-all—connectivity between neurons representing causes. The myriad causes potentially responsible for a given sensory input often require a neuron representing a cause to connect to hundreds of thousands of others. Such dense connectivity is biologically implausible.

This problem is easy to demonstrate in linear models of sensory input. (Although linear may seem overly restrictive, in fact such models have been successful in explaining basic features of the visual [1], olfactory [2], and auditory [3] systems.) Consider noisy receptors y (e.g. retinal ganglion cells, olfactory glomeruli) linearly excited by causes x (e.g. edges, odours) according to a matrix A. Under this model, the excitation of the ith receptor is given by

    y_i = Σ_j A_ij x_j + noise.    (1)

The causes responsible for the observations, y, can be estimated by minimizing, for example, the squared error between the actual observations and the expected observations. A population of neurons whose individual firing rates represent the x_j can do this by gradient descent [4],

    τ dx_j/dt = Σ_i A_ij y_i − Σ_k (Σ_i A_ij A_ik) x_k.    (2)

These dynamics can be interpreted as balancing the evidence for the cause x_j due to the receptor inputs y (the first term) while accounting for the explanatory power of the other causes (the second term).
In particular, Σ_i A_ij A_ik reflects the contribution of cause x_k to the evidence for cause x_j. Importantly, even if the elements A_ij are sparse, Σ_i A_ij A_ik will be non-zero for most j and k, implying nearly all-to-all connectivity in a circuit implementing Eq (2). In common sensory settings there may be hundreds of thousands of causes that could explain a given input. This means that the neuron representing x_j must connect to hundreds of thousands of other neurons.

Below we show how the problem of all-to-all connectivity can be solved so that inference can be performed with realistically sparse connectivity. We begin by recapitulating the MAP inference problem, focusing on the olfactory setting for concreteness. This is basically sparse coding applied to olfaction, and suffers from all-to-all connectivity. We then derive a solution inspired by the anatomy of the vertebrate olfactory bulb, namely the presence of dozens of ‘sister cells’ that receive input from the same glomerulus. That solution leads to MAP inference, but using sparser connectivity. While we focus here on the olfactory system, our method is applicable to other modalities.
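The connectivity problem in Eq (2) can be made concrete with a small numerical sketch. The sizes and the random affinity matrix below are illustrative assumptions, not taken from this paper; the point is that the explaining-away couplings A^T A are dense, so every readout unit must interact with every other.

```python
import numpy as np

# Gradient descent on |y - A x|^2, as in Eq (2); sizes and the random
# affinity matrix are illustrative, not taken from the paper.
rng = np.random.default_rng(0)
M, N = 20, 100                      # receptors, causes (N > M)
A = rng.normal(0, 1/np.sqrt(M), (M, N))
x_true = np.zeros(N)
x_true[:3] = 1.0                    # sparse ground-truth causes
y = A @ x_true                      # noiseless observations

G = A.T @ A                         # explaining-away couplings: dense,
                                    # i.e. nearly all entries non-zero
x = np.zeros(N)
dt = 0.05
for _ in range(4000):
    x += dt * (A.T @ y - G @ x)     # Eq (2): evidence minus explaining-away

# The dynamics converge to a least-squares solution: A x reproduces y.
print(np.allclose(A @ x, y, atol=1e-4))
```

Note that `G` is dense even though `x_true` is sparse; in a circuit, each of its non-zero entries is a synapse between two readout units.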

2 Results

2.1 Olfaction as MAP inference

Animals observe odours indirectly via the excitation of olfactory receptor neurons that project their axons to spherical bundles of neuropil called glomeruli. Each receptor neuron is thought to express a single receptor gene from a large repertoire [5], and neurons expressing the same gene almost always converge onto one of two glomeruli, on either side of the olfactory bulb [6]. Thus each glomerulus represents the pooled activity of the receptor neurons expressing a single type of olfactory receptor. We represent this vector of glomerular activations by y = {y_1, y_2, …, y_M}, where y_i is the activation of the ith glomerulus, and M is the number of glomeruli per lobe of the olfactory bulb, or equivalently, the number of olfactory receptor genes expressed by the animal. This number is ∼50 for flies [7], ∼300 for humans [8], and ∼1000 for mice [9]. The task of the animal is to infer the odour, x* (which consists of N components, x*_1, x*_2, …, x*_N), from the receptor activations, y (see Fig 1A). There are two main interpretations for the x_j. One is that x_j is the concentration of the jth molecule in the odour, in which case N is the number of distinct molecular species that the animal may encounter in its environment. The other is that x_j represents a complete olfactory object (e.g., coffee, bacon, marmalade) rather than a molecular species; in this case, N is the number of learned odours. To estimate N under the first interpretation, we note that the study of an estimated 0.25% of all flowering plants has yielded 1700 floral scent compounds [10], suggesting an upper estimate for N on the order of 1700/0.0025, or roughly 700,000 (though the actual number could be far fewer if as-yet-unstudied floral scents largely reuse already-known molecules), on the same order as the 400,000 estimated in the literature [11]. For the second interpretation (odours as complex olfactory objects), N is difficult to approximate, but estimates for the number of distinguishable odour objects range from 10,000 [12] to 1 trillion [13].
Here we simply assume that under both interpretations N is large. In either case, odours are very sparse—either because only a few molecular species are present [14], or because only a few odour objects are present.
Fig 1

Olfaction as MAP inference.

(A) Animals observe odours x* indirectly via receptor activations. We assume the function of the olfactory system is to report the odour x most likely to have caused the observed receptor activations y. (B) Schematic representation of an odour, whose defining feature is that it’s sparse (meaning very few components are active). (C) A basic circuit for performing MAP inference on odours: receptor i projects directly to each readout unit x_j with weight A_ij, determined by the affinity of receptor i for molecule j; readout unit x_j reciprocally inhibits unit x_k with weight Σ_i A_ij A_ik. The latter term is likely to be non-zero even if A is sparse (since that requires only one term in the sum over i to be nonzero), resulting in each readout unit inhibiting, and being inhibited by, potentially ∼100,000 other units. (D) An alternate circuit that performs the same inference. Mitral cells now mediate between the glomeruli and the readout units. Each mitral cell λ_i excites each readout unit x_j with weight A_ij and is in turn inhibited by the same amount. No inhibition is needed between readout units, but a mitral cell must still excite, and be inhibited by, each of potentially ∼100,000 readout units.

From the point of view of learning and inference, there are advantages to both interpretations. If odours are represented as molecular concentration vectors, then information about how each molecule excites each receptor is determined only by the physical parameters of the molecule and receptor. It can, therefore, be learned on evolutionary timescales and hard-wired into the circuit, at least in principle, and it provides a simple substrate for the animal to generalize between chemically similar odours. It is disadvantageous in that the animal requires higher-order circuitry to infer learned olfactory objects (coffee, bacon, marmalade), which consist of many types of odour molecules. If odours are represented as complex objects, then those objects have to be learned within the lifetime of the animal.
However, once learned, further higher-order circuitry is not needed. Although these important representational issues are beyond the scope of this work, we mention them in passing as examples of the non-trivial assumptions required before a complete theory of olfactory circuit function can be developed.

We assume a very simple model of the transduction of odours into neural activations: odour components contribute linearly to the input current of a receptor, which is then converted into a firing rate by a static, invertible, pointwise nonlinearity. That is, the excitation of glomerulus i is described as

    y_i = f(Σ_j A_ij x_j + z_i),    (3)

where x_j is the concentration of the jth molecule, A_ij is the affinity of the ith receptor for the jth molecule, f converts input current to firing rate, and z_i is additive Gaussian noise with variance σ². Nonlinearities like f can be inverted without changing the nature of the inference problem, so, without loss of generality, we take f to be the identity. Thus, our likelihood is

    p(y|x) ∝ exp( −(1/2σ²) Σ_i (y_i − Σ_j A_ij x_j)² ).    (4)

Because the number of glomeruli, M, is likely to be much smaller than the number of molecular species, N, a whole manifold of odours can be consistent with any particular pattern of glomerular activation. We resolve this ambiguity by selecting the candidate odour that is most consistent with our prior information about odours. The prior p(x) encodes the animal’s background knowledge about the presence of odours in the environment. We assume that the animal makes the simplifying assumptions that molecules appear independently of each other (but see [15, 16]), and that the marginal probability distribution for each molecule, p(x_j), has the same form. For the prior we use an elastic net distribution (a combination of ℓ1 and ℓ2 penalties) [17, 18]. The ℓ1 penalty promotes sparsity, as is observed [14] (see Fig 1B); the ℓ2 penalty discourages very large concentrations.
In addition, we include a term enforcing the non-negativity of concentrations, yielding

    p(x) ∝ Π_j exp(−φ(x_j)),    (5)

where

    φ(x) = β x + (γ/2) x² + I(x).    (6)

The parameter β determines the degree of sparsity, γ penalizes excessively large concentrations, and the indicator function, defined as I(x) = 0 when x ≥ 0 and ∞ otherwise, enforces the non-negativity of concentrations.

The optimization problem is to determine the odour x most likely to have caused glomerular activations y, taking into account both the likelihood and the prior. The resulting MAP estimate is given by

    x̂ = argmin_x [ (1/2σ²) Σ_i (y_i − Σ_j A_ij x_j)² + Σ_j φ(x_j) ].    (7)

Because the objective function being minimized is strictly convex, it has a unique minimum. At this minimum the partial derivative of the objective with respect to each x_j (ignoring for the moment the potential non-differentiabilities introduced in Eq (6)) is zero, yielding

    φ′(x_j) = (1/σ²) Σ_i A_ij (y_i − Σ_k A_ik x_k).    (8)

A common approach for solving this equation is to perform gradient descent on the objective function (the right-hand side of Eq (7)) [1], for which the resulting dynamics is

    τ dx_j/dt = −φ′(x_j) + (1/σ²) Σ_i A_ij (y_i − Σ_k A_ik x_k).    (9)

(Note that we recover Eq (2) if we set β and γ to zero and σ² to 1, and drop the non-negativity constraint in Eq (6).) These dynamics have a natural interpretation: there’s a leak term due to the gradient of the prior (the first term on the right-hand side), feed-forward excitation of the readout unit x_j by the glomeruli (expanding the second term, Σ_i A_ij y_i), and recurrent inhibition among the readout units (the remaining term, −Σ_k (Σ_i A_ij A_ik) x_k); see Fig 1C.

As discussed above, the problem with this approach is that unless the affinity matrix A has sparse structure, the term Σ_i A_ij A_ik leads to dense connectivity. To remedy that, we can factor out the term in parentheses in Eq (9) and implement it with a new variable, λ, giving us

    τ_λ dλ_i/dt = −σ² λ_i + y_i − Σ_j A_ij x_j,    (10a)
    τ dx_j/dt = −φ′(x_j) + Σ_i A_ij λ_i.    (10b)

Although this describes a different neural system than that in Eq (9), it clearly has the same fixed point: there, λ_i = (1/σ²)(y_i − Σ_j A_ij x_j), so Eq (10b) reduces to Eq (9). The circuitry implied by Eq (10) is broadly consistent with that of the olfactory system (see Fig 1D). There is one λ_i for each glomerular input y_i, making it natural to identify λ_i with the activation of a mitral or tufted cell.
Mitral and tufted cells likely play different roles in olfactory processing [19], but our theory can be applied to both populations, so, for simplicity, we will refer to them collectively as mitral cells. Eq (10a) requires each mitral cell to be directly excited by its corresponding glomerulus (y_i) and to be inhibited by the readout units (x_j). Eq (10b) requires the readout units to be directly excited by the mitral cells. This pattern of interaction between the mitral cells and the readout units implies an identification of the readout units x_j with olfactory bulb granule cells, whose main source of excitation is the mitral cells, which they in turn inhibit. Note that in our model the mitral cell/granule cell connections are symmetric. Since granule cells lack axons, the results of the computation must be read out from the mitral cell activations. This can be done by mirroring the integration of mitral cell input by granule cells, as described in Methods Sec. 4.5, ‘Cortical readout’.
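The equivalence of the two circuits in Fig 1C and 1D can be checked numerically. The sketch below assumes dynamics of the form of Eq (10) with β = γ = 0 and σ² = 1, so the prior terms drop out and the common fixed point is simply a least-squares solution; sizes and the random affinity matrix are illustrative.

```python
import numpy as np

# Two-population dynamics in the spirit of Eq (10): lambda carries the
# scaled residual (y - A x)/sigma^2 and excites the readout units, so
# no readout-to-readout inhibition is needed. With beta = gamma = 0,
# the fixed point is a least-squares solution, as for Eq (2).
rng = np.random.default_rng(1)
M, N = 15, 40
sigma2 = 1.0
A = rng.normal(0, 1/np.sqrt(M), (M, N))
y = A @ np.where(np.arange(N) < 3, 1.0, 0.0)   # sparse true causes

lam = np.zeros(M)
x = np.zeros(N)
dt = 0.02
for _ in range(20000):
    dlam = y - A @ x - sigma2 * lam            # Eq (10a)
    dx = A.T @ lam                             # Eq (10b), with phi' = 0
    lam += dt * dlam
    x += dt * dx

# Same fixed point as the direct gradient dynamics: A x reproduces y.
print(np.allclose(A @ x, y, atol=1e-3))
```

Each mitral cell `lam[i]` here receives one glomerular input and N granule inputs, which is the dense-connectivity problem the sister-cell construction of the next section addresses.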

2.2 Incorporating sister mitral cells

Eq (10a) indicates that each mitral cell λ_i is inhibited by each granule cell x_j, of which there are hundreds of thousands in the mouse [20]. Thus, although the dynamics yield correct inference at convergence, if A is dense we are again faced with implausibly high connectivity. We take inspiration from the olfactory system to show how this problem can be addressed while still performing MAP inference. So far we have assumed that each glomerulus provides input to one mitral cell (left panel of Fig 2), but in reality, each vertebrate mitral cell has several dozen ‘sister’ cells (mitral and tufted cells) that all receive input from the same glomerulus [21] (right panel of Fig 2), and all receive inhibitory feedback from granule cells. This suggests a way to reduce the number of mitral cell/granule cell connections: let each sister mitral cell connect to a different, non-overlapping, set of granule cells. Given that there are approximately 25–50 sister cells per glomerulus [22, 23], that would reduce connectivity by a factor of 25–50, yielding biologically plausible levels.
Fig 2

Sister cells.

In the vertebrate olfactory bulb, each glomerulus, y_i, is sampled by not one (left panel) but approximately 25–50 ‘sister’ mitral and tufted cells, λ^s_i [22, 23] (right panel, with only three sisters shown to reduce clutter).

To derive circuitry that can implement this scheme, we start by assuming that the sister cells obey dynamics similar to Eq (10a),

    τ_λ dλ^s_i/dt = −σ² λ^s_i + y_i − Σ_j A^s_ij x_j,    (11)

where λ^s_i is the sth sister cell for glomerulus i, A^s_ij are its weights onto the granule cells, and S is the number of sister cells. Below we will choose A^s_ij to be zero for all but one s, which greatly reduces the number of granule cells that connect to each mitral cell, but for now we leave it arbitrary. To see how sister mitral cells can perform correct MAP inference, note that the average sister cell activity,

    λ̄_i ≡ (1/S) Σ_s λ^s_i,    (12)

evolves according to

    τ_λ dλ̄_i/dt = −σ² λ̄_i + y_i − Σ_j [(1/S) Σ_s A^s_ij] x_j.    (13)

Letting the weights obey

    (1/S) Σ_s A^s_ij = A_ij,    (14)

the average sister cell activity evolves according to

    τ_λ dλ̄_i/dt = −σ² λ̄_i + y_i − Σ_j A_ij x_j.    (15)

This is identical to Eq (10a), the time evolution equation for λ_i, implying that if λ̄_i = λ_i at t = 0, then λ̄_i = λ_i for all time. Consequently, rather than computing λ_i from Eq (10a), we can compute it by simply averaging over the sister mitral cells,

    λ_i = (1/S) Σ_s λ^s_i.    (16)

We can, therefore, replace λ_i in Eq (10b) with the right-hand side of Eq (16), and, so long as the sister mitral cells evolve according to Eq (11), our model will implement MAP inference.

Eq (14) tells us only that the weights of the sister mitral cells average to A_ij; beyond that we have complete freedom in choosing them. A trivial choice is A^s_ij = A_ij for every s (illustrated in Fig 3B). However, this doesn’t help, as each sister mitral cell still receives N inputs—one from each granule cell. What we want to do instead is make the connections sparse so that, as mentioned above, each sister cell receives input from a different, non-overlapping set of granule cells. If the sets are equal in size, this means each sister cell receives input from N/S granule cells, an S-fold sparsification (Fig 3C and 3D).
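The averaging argument of Eqs (11) through (16) is easy to check numerically. The sketch below assumes sister dynamics of the form of Eq (11), weights satisfying Eq (14) with each granule cell assigned to one randomly chosen sister per glomerulus (Eq (18)), and frozen granule activity; sizes are illustrative.

```python
import numpy as np

# Check that the sister-cell average obeys the single-lambda dynamics:
# split each weight A_ij across S sisters so the sister weights average
# to A_ij (Eq 14), one randomly chosen sister per (glomerulus, granule
# cell) pair carrying all the weight (Eq 18). Granule activity x is
# frozen; sizes and sigma^2 are illustrative.
rng = np.random.default_rng(2)
M, N, S = 10, 30, 3
sigma2 = 1.0
A = rng.normal(0, 1/np.sqrt(M), (M, N))
s_of = rng.integers(0, S, size=(M, N))       # sister chosen per (i, j)
As = np.array([np.where(s_of == s, S * A, 0.0) for s in range(S)])
assert np.allclose(As.mean(axis=0), A)       # Eq (14) holds

y = rng.normal(size=M)
x = rng.random(N)
lam_s = np.zeros((S, M))                     # sister cells, Eq (11)
lam = np.zeros(M)                            # single lambda, Eq (10a)
dt = 0.01
for _ in range(1000):
    lam_s += dt * (y - np.einsum('smn,n->sm', As, x) - sigma2 * lam_s)
    lam += dt * (y - A @ x - sigma2 * lam)

# The sister average tracks lambda at every time step.
print(np.allclose(lam_s.mean(axis=0), lam))
```

Because both systems are linear and start from the same initial condition, the agreement is exact up to floating-point error, mirroring the claim that λ̄ = λ for all time.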
Fig 3

Sparsifying connectivity with sister cells.

Each panel shows a schematic of the connectivity between mitral cells (triangles) and granule cells (circles). All connections are reciprocal. (A) Original scenario—no sister mitral cells. Connectivity is all-to-all, and the weights are A_ij. (B) A densely connected configuration in which every granule cell connects to all sisters on each glomerulus, with weight A_ij. The sister mitral cells receive the same number of connections as in panel A, but the granule cells receive twice as many as in panel A (S times as many in general). All-to-all connectivity is thus exacerbated. (C) Blocks of granule cells, indicated by the shades of green, connect to the same sister from each glomerulus, leading to maximally sparse connectivity. In this example, the first three granule cells connect to the first sister on each glomerulus, and the second three granule cells connect to the second sister; see Eq (17). Sister mitral cells now interact with a factor of 2 fewer granule cells (a factor of S fewer in general), sparsifying mitral cell connectivity. (D) A more realistic maximally sparse connectivity pattern. Each granule cell connects to a single, randomly selected sister cell from each glomerulus; see Eq (18). Sister mitral cells connect to a factor of S fewer granule cells on average, though individual mitral cells may connect to more (like the first green mitral cell) or fewer (like the second).

There are many ways to make connectivity sparse. One is to divide the granule cells into S blocks, and then have the granule cells in block s project to the sth sister cell of each glomerulus,

    A^s_ij = S A_ij if granule cell j is in block s, and 0 otherwise.    (17)

This scheme is shown in Fig 3C (with the blocks arranged sequentially). More realistic is for each granule cell to connect to a randomly selected sister cell from each glomerulus, so that the chosen sister also depends on the glomerulus i,

    A^s_ij = S A_ij if s = s_ij, and 0 otherwise,    (18)

where s_ij is drawn uniformly at random from {1, …, S}. This scheme is shown in Fig 3D. (The factor of S in Eqs (17) and (18) ensures that the weights still satisfy Eq (14).) In either case, each sister mitral cell now receives input from a factor of S fewer granule cells.
The granule cells still make M connections (recall that M is the number of glomeruli), but, at least in the olfactory system, M is relatively small, on the order of 10²–10³.

At this point we have demonstrated that the dynamics in Eqs (10b) and (11) will lead to MAP inference if λ_i is set to the average of the sister mitral cell activity; that is, if Eq (16) is satisfied. However, no known cell performs the averaging required in Eq (16) and then projects to the granule cells, as required in Eq (10b). We therefore take an alternative strategy: we augment the dynamics to ensure that all the sister mitral cells converge to the average. To do that, we introduce a new cell type (which we will ultimately identify as periglomerular cells) that evolves according to

    τ_μ dμ^s_i/dt = λ^s_i − λ̄_i.    (19)

This achieves the desired result: at equilibrium, when dμ^s_i/dt = 0, all sister mitral cells associated with glomerulus i have the same value—the average sister mitral cell activity. To ensure that μ^s_i converges to an equilibrium, rather than increasing or decreasing linearly with time, we need to couple μ^s_i to the sister mitral cells. A reasonable coupling is linear negative feedback, transforming Eq (11) to

    τ_λ dλ^s_i/dt = −σ² λ^s_i + y_i − Σ_j A^s_ij x_j − μ^s_i.    (20)

This certainly has the right flavor: positive μ^s_i tends to decrease λ^s_i, and vice versa, suggesting that all the sister mitral cells will eventually have the same value. But will they have the right value—the value implied by Eq (10a)? To answer that, we combine Eq (14) with Eq (20) to write

    τ_λ dλ̄_i/dt = −σ² λ̄_i + y_i − Σ_j A_ij x_j − μ̄_i,    (21)

where μ̄_i ≡ (1/S) Σ_s μ^s_i. Except for the last term, μ̄_i, Eq (21) is exactly the same as the equation for λ_i, Eq (10a). Note, however, that

    τ_μ dμ̄_i/dt = (1/S) Σ_s (λ^s_i − λ̄_i) = 0.    (22)

Hence, if we initialize the μ^s_i so that μ̄_i is zero, it will remain zero for all time. In that case, the equation for the sister cell average, Eq (21), is identical to Eq (10a). Consequently, the sister cells converge to the correct mean, and so we can replace Eq (10b) with

    τ dx_j/dt = −φ′(x_j) + (1/S) Σ_i Σ_s A^s_ij λ^s_i.    (23)

Thus, under the dynamics given in Eqs (19), (20) and (23), with the weights A^s_ij obeying Eq (14), the network performs MAP inference.
The μ^s_i variables in Eq (19) are driven by the deviation of each sister cell from the average sister cell activation at its glomerulus. The observed backpropagation of mitral cell action potentials to the glomeruli [24, 25] and the electrical coupling of sisters at the glomeruli [26] might contribute to the neural implementation of just such an average. We have provisionally identified the μ variables with olfactory bulb periglomerular cells because those cells inhibit the mitral cells, are in turn excited by them [27, 28], and do not receive direct receptor input themselves. Periglomerular interneurons constitute a diverse group of cells [28, 29], and there is currently limited insight into their detailed wiring diagram [28, 30]. Nevertheless, the type of cell described above (reciprocal connections with mitral/tufted cells, without direct receptor input) is reminiscent of the Type II periglomerular cells of Kosaka and Kosaka [29, 31] (see also [32]). To summarize, the introduction of sister cells allows exact MAP inference to be performed while reducing, by a factor of S, the number of granule cells each mitral cell must connect to. It is in this sense that sister cells allow MAP inference to be performed with sparse connectivity.
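The claim that the periglomerular coupling of Eqs (19) and (20) pulls the sisters to their common average, while the mean periglomerular activity stays at zero (Eq (22)), can be checked directly. The sketch below assumes the forms given for Eqs (19) and (20), freezes granule activity, and uses illustrative sizes and σ².

```python
import numpy as np

# Numerical check of Eqs (19)-(20): each periglomerular unit mu^s_i
# integrates its sister's deviation from the glomerular average and
# feeds back negatively. All sisters should converge to a common value,
# and the mean mu_bar should stay at its initial value of zero (Eq 22).
# Granule activity x is frozen; sizes and sigma^2 are illustrative.
rng = np.random.default_rng(3)
M, N, S = 8, 24, 4
sigma2, dt = 0.5, 0.02
A = rng.normal(0, 1/np.sqrt(M), (M, N))
s_of = rng.integers(0, S, size=(M, N))                    # Eq (18)
As = np.array([np.where(s_of == s, S * A, 0.0) for s in range(S)])
y = rng.normal(size=M)
x = rng.random(N)

lam = np.zeros((S, M))
mu = np.zeros((S, M))             # initialized so that mu_bar = 0
for _ in range(10000):
    lam_bar = lam.mean(axis=0)
    mu += dt * (lam - lam_bar)                            # Eq (19)
    lam += dt * (y - np.einsum('smn,n->sm', As, x)
                 - mu - sigma2 * lam)                     # Eq (20)

# Sisters agree, and the mean periglomerular activity stayed at zero.
print(np.allclose(lam, lam.mean(axis=0), atol=1e-6),
      np.allclose(mu.mean(axis=0), 0.0, atol=1e-9))
```

At the resulting equilibrium each μ^s has absorbed the weight asymmetry between its sister and the average, which is exactly what lets the sister average obey the original Eq (10a).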

2.3 Leaky periglomerular cells

The dynamics in Eq (19) implies that the periglomerular cells, μ^s_i, do not leak; i.e., they are perfect integrators. This is at odds with biology, since we imagine that integration is performed by neuronal membranes, and neuronal membranes are leaky [33]—though periglomerular cells may be less leaky than most other neurons [28, 34]. We can introduce a leak term into the dynamics,

    τ_μ dμ^s_i/dt = λ^s_i − λ̄_i − ε μ^s_i,    (24)

where ε sets the magnitude of the leak. One advantage of introducing this leak is that we no longer have to worry about initializing the μ^s_i so that their mean is zero, since with a leak term the mean periglomerular activation decays to zero,

    τ_μ dμ̄_i/dt = −ε μ̄_i.    (25)

The price we pay is that the system no longer computes the MAP solution exactly. As we show in Methods, Sec. 4.1, when there is leak the system of equations minimizes the wrong objective, Eq (26), whose terms are weighted by a leak- and S-dependent factor q, given in Eq (27). In the limit of no leak (ε → 0, so that q → 1), we recover the correct objective (compare to Eq (7)). For non-zero leak, the objective differs from the MAP objective, so solutions will differ from the MAP solution. However, as we show numerically in Sec. 2.5.2, for biologically relevant values of ε these deviations are small. Note that as the number of sister mitral cells S increases, q approaches 1. Naively, this suggests that we should recover the true objective in the large-S limit. However, this naive expectation ignores the fact that there is a factor of S in the second term of Eq (26); in the large-S limit, this cancels the 1/S dependence of (1 − q). Consequently, it is not immediately clear how inference depends on S when there is nonzero leak. We addressed this numerically, and found that the error relative to the MAP solution increases monotonically with the number of sisters; see Sec. 2.5.2.
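The effect of the leak on the mean periglomerular activity is simple to verify: with ε > 0, μ̄ decays to zero from an arbitrary initialization, so the zero-mean initial condition is no longer needed. A sketch with frozen sister activity and illustrative sizes and ε:

```python
import numpy as np

# Leaky periglomerular dynamics: with leak eps > 0 the mean activation
# mu_bar decays to zero even from an arbitrary initialization, and each
# mu^s settles at (lam^s - lam_bar)/eps. Sister activity lam is frozen;
# sizes, eps, and dt are illustrative.
rng = np.random.default_rng(6)
M, S = 8, 4
eps, dt = 0.2, 0.05
lam = rng.normal(size=(S, M))                # frozen sister activity
lam_bar = lam.mean(axis=0)

mu = rng.normal(size=(S, M))                 # arbitrary initialization
for _ in range(5000):
    mu += dt * (lam - lam_bar - eps * mu)    # leaky integration

# mu_bar has decayed to zero; each mu^s is at (lam^s - lam_bar)/eps.
print(np.allclose(mu.mean(axis=0), 0.0, atol=1e-8),
      np.allclose(mu, (lam - lam_bar) / eps, atol=1e-6))
```

The equilibrium value (λ^s − λ̄)/ε also shows why the leak perturbs inference: the feedback each sister receives is now scaled by the leak rather than integrating the deviation exactly.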

2.4 Implementation in neural circuitry

The mitral and periglomerular cell dynamics (Eqs (20) and (19), respectively) are in a form suitable for implementation by neural circuitry. However, the granule cell dynamics in Eq (23) cannot be implemented directly because of the presence of not-everywhere-differentiable terms in the prior, φ (see Eq (6)). We thus implement related, neurally plausible, dynamics that has the same fixed point. Specifically, we note that at the fixed point of Eq (23), x_j satisfies

    (1/S) Σ_i Σ_s A^s_ij λ^s_i ∈ γ x_j + β ∂(x_j + I(x_j)),    (28)

where ∂ is the subgradient operator [35]. If at the solution x_j > 0, then the subgradient operator reduces to the ordinary gradient and yields the value 1, and we have

    x_j = (1/γ) [ (1/S) Σ_i Σ_s A^s_ij λ^s_i − β ].    (29)

On the other hand, when x_j = 0 the subgradient is the set (−∞, 1], so we have

    (1/S) Σ_i Σ_s A^s_ij λ^s_i ∈ β · (−∞, 1],    (30)

which we can write as an inequality,

    (1/S) Σ_i Σ_s A^s_ij λ^s_i ≤ β.    (31)

Thus, x_j = 0 whenever Eq (30) is satisfied; combining that with Eq (29) (which is valid when x_j > 0), we have

    x_j = (1/γ) [ (1/S) Σ_i Σ_s A^s_ij λ^s_i − β ]₊,    (32)

where [⋅]₊ is the threshold-linear operator. A neural implementation of this function would have x_j responding instantaneously to changes in the mitral cell activations λ^s_i, which is implausible. Instead we employ a membrane voltage variable v_j, which integrates the mitral cell input, and interpret x_j as the resulting firing rate. The full set of equations describing the model is, therefore,

    τ_λ dλ^s_i/dt = −σ² λ^s_i + y_i − Σ_j A^s_ij x_j − μ^s_i,    (33a)
    τ_μ dμ^s_i/dt = λ^s_i − λ̄_i − ε μ^s_i,    (33b)
    τ_v dv_j/dt = −v_j + (1/S) Σ_i Σ_s A^s_ij λ^s_i,    (33c)
    x_j = (1/γ) [ v_j − β ]₊,    (33d)

where the weights A^s_ij satisfy Eq (14). Eqs (33a) and (33b) correspond to Eqs (20) and (19), respectively, and Eqs (33c) and (33d) implement Eq (32). The sparsity parameters β and γ from the prior, Eq (6), appear as the threshold and inverse gain of the granule cell firing rate, Eq (33d). Because these parameters reflect the statistics of the environment, we assume that they can be set appropriately on evolutionary time scales as the species adapts to its environment. But they can also be adjusted on faster time scales by, for example, using cortical feedback to add a background voltage to the bracketed term in Eq (33d), modifying the threshold of the granule cell firing rate, and so altering the sparsity prior.
We leave the investigation of such possibilities to future work. A circuit that implements these equations is shown schematically in Fig 4. As promised, each mitral cell interacts with only a subset of the granule cells, as in Fig 3D. This reduces mitral-granule connectivity by a factor of S (though the total number of mitral-granule synapses stays the same, due to the introduction of S sister mitral cells per glomerulus). The information from the other granule cells is delivered indirectly to each mitral cell via the glomerular average of the sister cell activations and the resulting periglomerular inhibition.
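The full circuit of Eq (33) can be sketched end to end. The forms below assume sister mitral cells with a σ² leak, zero periglomerular leak (ε = 0), sparse weights as in Fig 3D, and threshold-linear granule rates; parameter values are chosen for a fast, stable Euler demo and are not those of Table 1. Rather than comparing against an external solver, the check verifies the MAP stationarity condition at the fixed point.

```python
import numpy as np

# End-to-end sketch of the circuit summarized in Eq (33): sister mitral
# cells lam, periglomerular cells mu, granule voltages v, and rates
# x = [v - beta]+ / gamma. Parameters are illustrative, not Table 1.
rng = np.random.default_rng(5)
M, N, S = 20, 60, 3
sigma2, beta, gamma = 0.5, 0.1, 1.0
A = rng.normal(0, 1/np.sqrt(M), (M, N))
x_true = np.zeros(N); x_true[:3] = 1.0         # sparse true odour
y = A @ x_true

s_of = rng.integers(0, S, size=(M, N))         # random sister per (i, j)
As = np.array([np.where(s_of == s, S * A, 0.0) for s in range(S)])

lam = np.zeros((S, M)); mu = np.zeros((S, M)); v = np.zeros(N)
dt = 0.01
for _ in range(60000):
    x = np.maximum(v - beta, 0.0) / gamma                    # Eq (33d)
    lam_bar = lam.mean(axis=0)
    v += dt * (np.einsum('smn,sm->n', As, lam) / S - v)      # Eq (33c)
    mu += dt * (lam - lam_bar)                               # Eq (33b), eps = 0
    lam += dt * (y - np.einsum('smn,n->sm', As, x)
                 - mu - sigma2 * lam)                        # Eq (33a)

# At the fixed point, sisters agree and x satisfies the elastic-net
# stationarity condition x = [A^T (y - A x)/sigma^2 - beta]+ / gamma.
x = np.maximum(v - beta, 0.0) / gamma
x_fix = np.maximum(A.T @ (y - A @ x) / sigma2 - beta, 0.0) / gamma
print(np.allclose(x, x_fix, atol=1e-3),
      np.allclose(lam, lam.mean(axis=0), atol=1e-4))
```

Note that each granule cell receives input from only one sister per glomerulus (M connections), and each sister contacts roughly N/S granule cells, yet the converged solution is the same one the all-to-all circuit would compute.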
Fig 4

Sparsely connected inference circuit, with readout in piriform cortex.

Arrows and filled circles indicate excitatory and inhibitory connections, respectively. Each sister cell receives input y_i from a glomerulus and interacts with a subset of the granule cells x_j, reducing connectivity by a factor of S (S = 3 in this example). Information is shared between granule cells through the periglomerular cells (only a subset are shown, for clarity). Those cells receive excitatory input from the mitral cells and inhibitory input corresponding to the average sister cell activity (as required by Eq (33b)). The inhibitory input comes from back-propagating action potentials, which travel along mitral cell apical dendrites and are available at the glomeruli [24, 25]. The result of the inference is simultaneously available in granule cells and in neurons in piriform cortex (see Methods, Sec. 4.5, ‘Cortical readout’).


2.5 Simulations and linear analysis

To investigate the behaviour of the system, we performed a series of simulations using the model summarized in Eq (33). Three questions guided our choice of simulations; we list them here, along with brief answers.

First, what do the dynamics of sister cells look like? Our analysis so far shows only that if the dynamics converges, it yields the MAP solution; we have not shown that the dynamics necessarily converges, or said anything about transient behaviour. Thus the first goal of our simulations is to check convergence empirically, and to qualitatively assess the transient dynamics and its biological plausibility. Answer: we show numerically that the dynamics does indeed converge, and show analytically that solutions are stable in the parameter regime of interest.

Second, what is the effect of non-zero periglomerular leak (ε > 0)? The dynamics in Eq (33) yields the MAP solution at convergence only when the periglomerular cells have zero leak, yet any biological implementation of these dynamics will have non-zero leak. It is important to determine the extent to which realistic values of the leak affect the dynamics and the inferred solutions. Answer: for realistic values of the leak, the effect on MAP inference is small.

Finally, how do the various parameters affect the transient dynamics, and which parameters are most important? In particular, does the dynamics become qualitatively unrealistic when some parameters are changed within biologically reasonable ranges? Answer: the transient dynamics is extremely robust to parameters, and exhibits very little change over a broad range.

Below we expand on these answers.

2.5.1 System dynamics with sister cells

In our simulations we used the base parameters given in Table 1; departures from those parameters will be flagged. Sister cell connectivity was set according to Eq (18); see Methods Sec. 4.4 for further details.
Table 1

Base parameters used in the simulations.

Parameter | Description                                     | Base value
----------|-------------------------------------------------|-----------
M         | Number of glomeruli                             | 50
N         | Number of granule cells                         | 1200
S         | Number of sister cells per glomerulus           | 4
n         | Number of components present in the true odour | 3
σ²        | Receptor variance                               | 10⁻²
β         | ℓ1 penalty                                      | 3
γ         | ℓ2 penalty                                      | 1
τ_v       | Granule cell membrane time constant             | 35 ms
τ_λ       | Mitral cell membrane time constant              | 50 ms
τ_μ       | Periglomerular cell membrane time constant      | 35 ms
ε         | Periglomerular cell leak                        | 0
A_ij      | Elements of the affinity matrix                 | N(0, 1/M)
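The construction of the sister connectivity matrices (Eq (18)) is detailed in Methods; the following sketch illustrates the idea only for the maximally sparse case, where each granule cell contacts exactly one sister per glomerulus, so that the sister matrices sum to the affinity matrix A (variable names and the random assignment are ours, not from Methods):

```python
import numpy as np

rng = np.random.default_rng(0)
M, N, S = 50, 1200, 4          # glomeruli, granule cells, sisters (Table 1)

# Affinity matrix with entries ~ N(0, 1/M), as in Table 1.
A = rng.normal(0.0, np.sqrt(1.0 / M), size=(M, N))

# Maximally sparse sharing: each granule cell j contacts exactly one
# sister s per glomerulus i, chosen at random here.  The S sister
# matrices then sum to A, the condition required by Eq (14).
assign = rng.integers(0, S, size=(M, N))          # which sister gets A[i, j]
A_s = np.zeros((S, M, N))
for s in range(S):
    A_s[s] = A * (assign == s)

assert np.allclose(A_s.sum(axis=0), A)            # sisters share A exactly
```

With this partition each mitral cell carries roughly a 1/S fraction of the synapses a single cell per glomerulus would otherwise need.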
Fig 5 shows typical activity patterns for mitral, periglomerular, and granule cells. Panel A shows the response of all four sister mitral cells from a representative glomerulus, and panel B shows the response of the corresponding periglomerular cells. Although there is some initial variability in the responses—in particular decaying oscillations—the sister cells converge to the same value, as expected. Panel C shows granule cell activity, and demonstrates that the readout converges to the MAP solution within a few hundred milliseconds. This is reflected in the root mean square (RMS) error between the granule cell responses and the MAP estimate, which decays exponentially (panel D).
Fig 5

Example dynamics for a network with base parameters (Table 1).

(A) All four sister mitral cells from a representative glomerulus. Although the sister cells start off with identical activations, their activities quickly diverge due to their differing connectivities to the granule cells. Ultimately, however, they converge to the same value. (B) Periglomerular cells from the glomerulus in panel A. (C) Granule cells. Red arrows indicate the MAP solution for the three components present in the odour. Granule cells representing these components are shown in green, the others in gray. After an initial period of activity, the system settles into the MAP solution. (D) Time course of the root-mean-square error between the granule cell activations and the MAP solution normalized to its initial value, indicating convergence to the MAP solution.

In Fig 6 we show typical dynamics when there are S = 1, 8 and 25 mitral cells per glomerulus. In all cases, the granule cells converge to the MAP solution within a few hundred milliseconds. The main difference between the three values of S is that when S = 1, convergence to the MAP solution is slightly faster than when S > 1, as indicated by the slightly steeper RMS error curves in the bottom left panel. Otherwise, the dynamics in all three cases is qualitatively similar.
Fig 6

Typical dynamics as the number of mitral cells per glomerulus, S, is varied relative to the base model given in Table 1.

(A) S = 1, (B) S = 8, and (C) S = 25 mitral cells per glomerulus, compared to S = 4 in Fig 5. Note the lack of periglomerular cell activity and slightly faster convergence for S = 1. Otherwise, the dynamics is qualitatively similar for all values of S, including convergence to the MAP solution.

Linear analysis. Given that our network was designed to perform MAP inference, the asymptotic behavior shown in Figs 5 and 6 is relatively unsurprising. However, our analysis so far says nothing about the transient behavior, which, as can be seen in Figs 5 and 6, is characterized by large oscillations. To understand this behavior—in particular how stability and oscillation frequency depend on the parameters, including the number of sister mitral cells—we need to analyze the dynamics. Because of the rectifying nonlinearity, that is hard to do exactly. However, our simulations so far (in particular the granule cell activity in Figs 5 and 6) suggest that the composition of the ‘active’ set of granule cells stabilizes before the dynamics terminates. Once this active set has stabilized, so that the rectifications remain within the linear regime, the granule cell activations x can be replaced by their corresponding voltage variables v, and the system becomes linear. We thus performed linear analysis of Eq (33) around a solution with n active granule cells, and used the results to both explain the transient behavior we have seen so far, and guide further investigation of the system. The linearized dynamics relative to their input-dependent fixed-point for n active granule cells is given by (Methods, Sec. 4.2) where the δ in front of each variable indicates a small deviation from the fixed point. Note that we replaced the granule cell activations x by their membrane voltages v, as motivated above, and that the indexing of the latter variables is over the active set of n granule cells.
To solve these equations, we let the dynamical variables have the time dependence e^{ξt}, which results in an eigenvalue equation for ξ. That equation can’t be solved exactly (at least not for all eigenvalues), but the approximate eigenvalues, derived in Methods, Sec. 4.2, are reasonably close to the true ones, as shown in Fig 7. In that figure, eigenvalues near blue markers are for modes involving only mitral and periglomerular cells (δv = 0), while eigenvalues near orange and red markers are for modes involving granule cells as well (δv ≠ 0).
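The conversion from an eigenvalue ξ to the time constant and frequency plotted in Fig 7 can be sketched as follows; here J is a random stable stand-in for the actual Jacobian of the linearized Eq (33), so only the bookkeeping, not the spectrum, reflects the paper:

```python
import numpy as np

rng = np.random.default_rng(1)

# Stand-in Jacobian for a linearized system dz/dt = J z.  (The paper's
# Jacobian comes from linearizing Eq (33); here J is simply a random
# matrix shifted to be stable.)
n = 40
J = rng.normal(size=(n, n)) / np.sqrt(n) - 2.0 * np.eye(n)

xi = np.linalg.eigvals(J)            # modes evolve as e^{xi t}
tau = -1.0 / xi.real                 # decay time constant of each mode
freq = xi.imag / (2.0 * np.pi)       # oscillation frequency of each mode

assert np.all(xi.real < 0)           # all modes decay: equilibrium stable
```

Plotting tau against freq for each mode reproduces the layout of the axes in Fig 7.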
Fig 7

Eigenvalues of the linearized system computed numerically (gray crosses) and analytically (coloured symbols).

Eigenvalues labeled ‘decay’, Eq (57), ‘decayλ’, Eq (58) and ‘ξmp’, Eq (59), are exact; those with the superscripts ‘low’ or ‘high’, Eq (76), are approximate. (A) n = 20 active granule cells. (B) n = 55, so that n > M. In both panels, blue eigenvalues are for modes involving only mitral and periglomerular cells, while modes for orange and red eigenvalues also involve granule cells. The time constant (x-axis) and frequency (y-axis) for each eigenvalue ξ are −1/Re(ξ) and Im(ξ)/2π, respectively. We used large n to better show the spread of eigenvalues. Parameters from Table 1, except S = 25 and ε = 1/2; n depends on the panel. See Sec. 4.2 for additional details.

Each gray cross in Fig 7 corresponds to a mode of the system, and near the equilibrium the activity consists of a sum of these modes. However, because the modes cluster, the system admits only a handful of behaviors, which we summarize as follows:

- Low frequency oscillations, ‘ξlow’, whose frequency does not change with added sisters (see Methods, Eq (80)).
- Two high frequency oscillations: ‘ξhigh’ (Methods, Eq (78)), which involves all cell types, and ‘ξmp’ (Methods, Eq (59)), which involves only the mitral and periglomerular cells. The frequency of these oscillations increases with added sisters.
- Purely decaying modes (no oscillations): ‘decay’ (orange diamond in Fig 7; see Methods, Eq (57)), whose decay rate is proportional to ε, and ‘decayλ’ (Methods, Eq (58)), which is present only when n < M (blue diamond in the top panel of Fig 7). The latter involves the mitral and periglomerular cells, but not the granule cells.

Fig 7 shows the eigenvalue spectrum for only one set of parameters. What about other choices? The time constants, τ_λ, τ_μ/ε and τ_v, set the time scale for decay. The other relevant parameters are the number of sister mitral cells, S, and γ and σ²; the latter two appear in Eq (34a). As we show in Methods, Sec. 4.2, these have two main effects.
First, the oscillation frequencies ξhigh and ξmp scale with the parameters as given in Eq (78) and Eq (59), respectively, as shown in Fig 8A and 8B. The second effect is on the decay time constants, which are largely independent of S, except when S = 1; in that case there is no ξmp mode, which can affect the decay rates (see Fig 8C). For a detailed analysis of the effect of parameters on the transient behavior, see Methods Sec. 4.2.
Fig 8

Transient properties of inference dynamics.

(A) Dependence of the high-frequency mode of the mitral cell responses on the number of sisters, S. (B) Dependence on the standard deviation of the noise, σ. In both panels, the dashed lines are fits predicted by the linear analysis. (C) Dependence of the mitral cell decay time constant on the number of sisters S. Going from S = 1 to S = 2 sisters per glomerulus decreases the decay time constant by a factor of about 2. A large change in decay rate from S = 1 to 2, followed by a slower change for S ≥ 2, is typical, although the details are parameter-dependent. Other parameters as in Table 1.

The linear analysis can also tell us whether the MAP equilibrium can be unstable. We show in Methods, Sec. 4.2.2, that it is stable in the parameter regime of interest, which is σ² ≪ 1; we did not investigate stability when σ² isn’t especially small.

2.5.2 The effect of periglomerular leak

As discussed in Sec. 2.3, our network performs MAP inference only if the periglomerular leak, ε, is zero. What happens in the realistic case, when it’s not zero? Here we address that question through simulations. From Eq (33b), we see that the effective time constant of the periglomerular cells is τ_μ/ε = 35 ms/ε (see Table 1), so relevant values of ε are near 1. In Fig 9 we show typical dynamics for ε = 1 and ε = 2. Non-zero values of the leak mean the system no longer performs MAP inference; that’s reflected in the plateauing of the RMS error relative to the MAP solution. In both cases, however, the effect on granule cell activity is small. Non-zero leak also means that the periglomerular cells are no longer able to force sister cells to the same value at convergence. This becomes increasingly visible as the leak increases (compare sister cell activity at convergence in panels A and B of Fig 9). Again, though, the effect is small.
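The effective time constant τ_μ/ε can be checked with a minimal leaky-integrator sketch; this is a stand-in for the periglomerular equation in Eq (33b) with an illustrative constant drive I, not the full circuit dynamics:

```python
import numpy as np

tau, eps, I = 0.035, 2.0, 1.0     # 35 ms membrane time constant, leak, drive
dt, T = 1e-4, 0.5
t = np.arange(0.0, T, dt)

# Forward-Euler integration of  tau * dmu/dt = -eps * mu + I :
mu = np.zeros_like(t)
for k in range(1, len(t)):
    mu[k] = mu[k - 1] + dt / tau * (-eps * mu[k - 1] + I)

# The fixed point is I/eps and the effective time constant is tau/eps:
assert abs(mu[-1] - I / eps) < 1e-3
k63 = np.searchsorted(mu, (1 - np.exp(-1)) * I / eps)   # 63% rise point
assert abs(t[k63] - tau / eps) < 2e-3                   # ≈ 17.5 ms for eps = 2
```

Doubling the leak halves the effective time constant, which is why relevant values of ε sit near 1 for a 35 ms membrane.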
Fig 9

Effect of periglomerular leak ε on the dynamics.

Parameters from Table 1, but with the number of sisters S fixed at 8 and the leak at (A) ε = 1 and (B) ε = 2. Note that sister mitral cells no longer converge to the same value (top row). Increasing the leak results in higher RMS error relative to the MAP estimate (red triangles).

As discussed in Sec. 2.3, the effect of the periglomerular leak depends on the number of sister mitral cells, S. In Fig 10A we plot the asymptotic RMS error versus ε as we vary S. Increasing the number of sisters increases the steady state RMS error relative to the MAP solution, but past about S = 8 the number of sisters has very little effect on the error. In Fig 10B we plot the spread in sister cell activity relative to the mean for S = 25, showing that it remains small for large values of the leak.
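The spread measure used in Fig 10B (the coefficient of variation of the ratios of sister cell activities, taken over all unique pairs) can be sketched as follows; the activities here are hypothetical stand-ins, not the output of the full dynamics:

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(2)

# Hypothetical sister activities at convergence: a shared per-glomerulus
# value plus a small leak-induced spread (stand-in data, not Eq (33)).
S = 25
lam = 1.0 + 0.02 * rng.normal(size=S)

# CV of the ratios over all unique sister pairs, as in Fig 10B; a small
# CV means the sisters remain close to a common value.
ratios = np.array([lam[a] / lam[b] for a, b in combinations(range(S), 2)])
cv = ratios.std() / ratios.mean()
assert cv < 0.1
```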
Fig 10

Properties at convergence versus periglomerular leak ε.

(A) Mean RMS error at convergence, relative to its initial value, as a function of leak for different numbers of sister cells per glomerulus. The RMS error increases with leak and with the number of sister cells, but never gets very large. Means are computed over 10 different odours and 5 different olfactory bulbs. (B) Median (dots) and inter-quartile range (lines) of the coefficient of variation (CV; standard deviation divided by the mean) of the ratios of sister cell activity at convergence as a function of the leak parameter ε. CVs are computed over the ratios of all unique pairs of 25 sisters in a glomerulus, percentiles are computed over all glomeruli in response to 10 different odours across 5 different olfactory bulbs. Median CVs remain low even for high values of leak. Parameters from Table 1, except for S and ε.


2.5.3 Robustness

Finally, we examined how the parameters affect the dynamics in the non-leaky setting. Because the system always arrives at the MAP solution, we focus on the transient dynamics, and in particular on whether the system remains within a biologically plausible range. Number of odour components, n. Intuitively, we expect that as the number of odour components, n, increases, the inference problem will become harder. In Fig 11A we show an example with 10 odour components present, and indeed we see that inference (blue lines) is not great: although the odour components that are present are correctly inferred, many odour components that are not present are inferred as well. Fig 11B corroborates this: the number of recovered odour components exceeds the number of true ones, n, when n is sufficiently large. Note, though, that as M increases, the system can accurately infer more odours. Moreover, as can be seen in Fig 11C, for all values of n tested the dynamics converges to the MAP solution at similar rates.
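For reference, a MAP estimate of this kind can be computed offline with a generic proximal-gradient (ISTA) solver for a non-negative elastic-net objective of the form suggested by Table 1; this is a stand-in for the exact objective of Eqs (10) and (34a) and for the network dynamics, with problem sizes shrunk for speed:

```python
import numpy as np

rng = np.random.default_rng(3)
M, N, n = 50, 300, 3                  # glomeruli, odour components, active ones
sigma2, beta, gamma = 1e-2, 3.0, 1.0  # variance and penalties, as in Table 1

A = rng.normal(0.0, np.sqrt(1.0 / M), size=(M, N))
x_true = np.zeros(N)
support = rng.choice(N, n, replace=False)
x_true[support] = 10.0
y = A @ x_true + np.sqrt(sigma2) * rng.normal(size=M)

# Proximal gradient (ISTA) on the elastic-net MAP objective
#   (1/2 sigma^2)||y - A x||^2 + beta ||x||_1 + (gamma/2)||x||^2,  x >= 0.
L = np.linalg.norm(A, 2) ** 2 / sigma2 + gamma    # Lipschitz constant
step = 1.0 / L
x = np.zeros(N)
for _ in range(5000):
    grad = A.T @ (A @ x - y) / sigma2 + gamma * x
    x = np.maximum(x - step * (grad + beta), 0.0)  # nonneg soft-threshold

# With n = 3 components, the true support dominates the estimate:
assert set(np.argsort(x)[-n:]) == set(support)
```

The same solver run with larger n reproduces the qualitative failure mode of Fig 11: spurious small components appear once n exceeds what M glomeruli can support.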
Fig 11

The effect of the number of odour components, n, on inference.

(A) The MAP estimate for an odour with n = 10 components contains many more than 10 non-zero values. (B) Ratio of the number of odour components inferred using MAP inference to the true number, versus the true number of odour components, n for different numbers of glomeruli, M. An odour component was considered inferred when granule cell activation was greater than 10−2. Dots are averages over 12 trials, vertical lines are standard deviations (shown only for M = 50 for clarity). When using the default number of glomeruli, M = 50, extra odour components are inferred when n is above about 3. Adding glomeruli, however, allows more odours to be correctly inferred. (C) Time course of the RMS error between the granule cell activations and the MAP estimate, for 6 different trials at each n. Parameters from Table 1, except S = 8, and M = 50 unless indicated otherwise.

Number of cells. So far our simulations have used M = 50 glomeruli and N = 1200 granule cells. However, most olfactory systems are much larger than this (we used smaller populations solely to speed up simulations). For example, in the fly M ≈ 50 and N ≈ 2500, while in the mouse, M ≈ 1000 and N ≈ 10⁶. In Fig 12 we show circuit dynamics for larger systems, up to M = 200 and N = 4800. Consistent with the fact that the linear analysis (Methods, Sec. 4.2) doesn’t predict a strong dependence on size, the dynamics are qualitatively similar to our simulations with small M and N, and the MAP solution is achieved within a similar time frame.
Fig 12

Dynamics for different numbers of glomeruli, M, and granule cells, N (specified at the top of each column), are qualitatively similar.

(A) Mitral cells. The activity of one mitral cell is highlighted with a dark trace; the remaining sisters are overlayed with a light trace. (B) Granule cells. Granule cells representing molecules present in the true odour are coloured green and the 10 most active granule cells representing molecules not present in the true odour are coloured gray. Granule cell dynamics are all qualitatively similar and converge to the MAP solution (red triangles). (C) RMS error between the granule cell activity and the MAP estimate for 5 random systems of the same size as in the previous rows (gray) and the average (green). All systems converge with similar dynamics to the MAP estimate. Remaining parameters are from Table 1, except S = 25.


2.6 Relative sister cell activity at convergence

Our proposed scheme makes a strong experimental prediction about the relative activity of sister mitral cells at convergence, but it takes some analysis to determine exactly what that prediction is. According to Eq (33b), when there is no periglomerular leak (ε = 0), all sister mitral cells converge to the same value—the mean, , which depends only on which odours were presented. However, exact equality relies on a particular choice of coordinates. Suppose, for instance, that the experimentally recorded mitral cell firing rates, denoted , were related to those in our model, , by cell-specific invertible transformations , (invertibility is required because there must be a one-to-one mapping between trajectories in the transformed and non-transformed variables). Because our model predicts that at convergence , independent of odour, it follows that at convergence, This prediction would be hard to verify experimentally, because it requires knowledge of the transformations and . However, if we assume that the transformations are differentiable, we can arrive at a more useful expression by differentiating both sides with respect to and rearranging terms, Because and are invertible and differentiable, and thus monotonic, functions of their arguments, the sign of the derivatives is independent of the value of either or . This means that the sign of the right hand side is fixed, independent of or . Suppose, for definiteness, that it’s positive. In that case, if is larger for odour A than it is for odour B (at convergence), will also be larger for odour A than for odour B. If, on the other hand, the right hand side is negative, we’ll see the opposite: will be smaller for odour A than for odour B. Our model thus makes the prediction that if we plotted the values of two sister cells at convergence for a range of odours, that plot would be monotonic. For this prediction to hold, Eq (36) must be satisfied, which is guaranteed only when the periglomerular leak, ε, is zero.
When ε > 0 on the other hand, Eq (33b) tells us that at convergence, The ratio of sister mitral cell activations will thus acquire an odour dependence. But as we saw in Fig 10B, that dependence is relatively weak. Consistent with this, when we tested for monotonicity numerically, we found that it is largely maintained, as can be seen in Fig 13. We thus arrive at our main prediction, which holds even when there is periglomerular leak: at convergence, the activity of any mitral cell is an approximately monotonic function of the activity of any of its sisters.
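The monotonicity prediction can be checked numerically with Spearman's rank correlation, which, as exploited in Fig 13C, is invariant under monotonic transformations. A self-contained sketch with hypothetical sister activities and hypothetical readout nonlinearities (none of which come from the model equations):

```python
import numpy as np

rng = np.random.default_rng(4)

def spearman(a, b):
    # Spearman's rho = Pearson correlation of the ranks (no ties here).
    ra = np.argsort(np.argsort(a))
    rb = np.argsort(np.argsort(b))
    return np.corrcoef(ra, rb)[0, 1]

# Hypothetical convergence activities of two sister cells across 100
# odours: a shared odour-dependent mean plus a small leak-induced spread.
mean = rng.uniform(0.5, 2.0, size=100)
lam1 = mean * (1 + 0.01 * rng.normal(size=100))
lam2 = mean * (1 + 0.01 * rng.normal(size=100))

# Hypothetical cell-specific monotonic readout nonlinearities, standing
# in for the transformations of Fig 13C; any such transform preserves
# the ranks, so Spearman's rho is unchanged.
rec1 = np.tanh(2.0 * lam1)
rec2 = np.log1p(lam2) ** 2

assert np.isclose(spearman(lam1, lam2), spearman(rec1, rec2))
assert spearman(lam1, lam2) > 0.9     # sisters ~monotonically related
```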
Fig 13

Sister cell odour responses at convergence are approximately monotonically related, even when there is periglomerular leak.

(A) Full (bottom) and zoomed (top) distributions of Spearman’s rank correlation, ρ, comparing responses at convergence for pairs of sister cells (colours) and non-sisters (gray) in a single model olfactory bulb. The olfactory bulb was stimulated with 100 random odours; data is from all 50 glomeruli. Bars indicate the median, red lines indicate the interquartile range, red dots indicate the minimum correlation observed. Olfactory bulb parameters were as in Table 1 except S = 25, and leak as indicated. (B) Responses at convergence of a pair of sister cells with the median Spearman’s rank correlation in the highest leak setting, (ε = 2, top of right-most bar of panel A) to the 100 odours tested. (C) Responses from panel (B) after transformation through cell-specific monotonic nonlinearities (inset) modeling the transformation from model variables to experimentally recorded values . Because the transformations are monotonic, the Spearman’s rank correlations in panels B and C are identical.


3 Discussion

Most neural implementations of probabilistic inference require dense or all-to-all connectivity between elements, so that the explanatory power of all latent variables can be correctly accounted for. In common sensory settings, where inference over hundreds of thousands of latent variables is not uncommon, such connectivity can require individual neurons to connect to hundreds of thousands of others, which is biologically implausible. In this work we have taken inspiration from the vertebrate olfactory system to show how such inference problems can be solved using sparse connectivity. Naive olfactory inference would require each mitral cell to connect to hundreds of thousands of granule cells. However, in mice there are approximately 25–50 sister cells per glomerulus [22, 23], and we showed that the sisters can share the connectivity, resulting in a substantial reduction in the required number of synapses per mitral cell. This sharing of connectivity comes at a cost, however: it requires coordination by a neural population. We have identified that population as the periglomerular cells, based on their pattern of connectivity.

Our approach is not limited to the particular inference setting presented here. It can be applied to the more complex generative model in [2], and to a large class of nonlinear models (see Methods, Sec. 4.6). As another example, our approach can be readily applied to the ‘sparse incomplete representations’ of [36], whose Eqs. (5) and (6) are directly analogous to our Eq (10), and can thus be modified analogously to employ sparse connectivity.

3.1 Coordinating connectivity

One of the key requirements of our work is that sister cell connectivity matrices sum according to Eq (14). This condition implies that the weights connecting a given granule cell x_j to the sisters in glomerulus i sum to A_ij. In the sparsest connectivity setting this requires that each granule cell connect to exactly one sister cell from each glomerulus. This may seem difficult to implement, but given the similarity in the temporal responses of sister cells, such sparse connectivity may be achievable through lateral inhibition among granule cell spines: two spines each contacting sisters from the same glomerulus would receive very similar inputs, motivating the elimination of one to reduce redundancy. Nevertheless, although sparsifying connectivity was the motivation for our model, the model does not require it, as long as the condition in Eq (14) is met. Perhaps a more fundamental issue is that we have assumed throughout that the correct connectivity—the affinity matrix A—is known. The key question is then how such ‘correct’ connectivity patterns can be learned, particularly in a setting where connectivity has to be coordinated across sister cells according to conditions like Eq (14). Previous work [37] and current investigations [38] have examined this issue. However, neither of those studies used sister mitral cells to sparsify connectivity, and how such circuitry could be learned remains an open question.

3.2 Cortical readout

Although our work deals mainly with olfactory computation in the bulb, communicating the results of this computation to the cortex is obviously required. In Methods, Sec. 4.5, we outline one scheme by which this could be done, in which cortical neurons effectively mirror the computation of olfactory bulb granule cells. If the projections from the mitral cells to the piriform cortex satisfy Eq (95), the cortical readout at convergence would be an exact copy of the granule cell activations in the bulb. Needless to say, the main difficulty with such a scheme is how to satisfy this condition on the weights, a problem likely made harder by the physical separation of the bulb and the cortex. One possibility is that the feedback connections from the cortex to the bulb could be used to learn the right connectivity. Another possibility is that an exact copy of the granule cell output is not required, and that random projections from the bulb to the cortex would suffice [39]. We leave the resolution of such issues to future work.

3.3 Predictions

Our model makes a number of experimentally testable predictions. Perhaps the easiest to test is the monotonicity condition in Eq (37), supported by the numerical results in Fig 13, stating that the activity of a mitral cell at convergence is an approximately monotonic function of the activity of any of its sisters. It is admittedly hard to define convergence in animals that are subject to periodic olfactory input due to the breathing cycle. Nevertheless, we can approximate it by the activity at the end of each breathing cycle in an anesthetized animal presented with an odour, after several breathing cycles have elapsed. Our model would then predict that for any pair of sister cells, plotting their firing rates at convergence in response to a range of odours would reveal an approximately monotonic function.

A key computation required by our model is the coordination of sister mitral cell activity to arrive at the MAP solution, which we propose is performed by the periglomerular cells. We therefore predict that deactivating the periglomerular cells should eliminate sister cell coordination and push the results of inference away from the MAP solution, likely producing widespread, low amplitude activation of granule cells (see Methods, Sec. 4.3).

Although maximum sparsity is not required by our model (see, e.g., Fig 3B), if, as we propose, the function of the sister mitral cells is to sparsify the mitral-granule cell connectivity required for inference, then maximally sparse connectivity solutions would be expected, in which granule cells contact at most one sister cell per glomerulus. Connectomics studies in the olfactory bulb should be able to tell us how often granule cells receive input from two or more sister mitral cells.

Finally, in our model we grouped all olfactory bulb projection neurons together and referred to them as mitral cells. This was for simplicity only.
In fact, though, olfactory bulb projection neurons fall into two anatomically and physiologically distinct classes, mitral and tufted cells, and it is likely that these classes play different roles in olfactory processing [19]. Because our model describes how a given computation can be distributed among sister cells, if mitral and tufted cells perform different computations, then our predictions would apply independently to sister mitral cells and to sister tufted cells. It would, therefore, be important to establish that two sisters are of the same type before applying our prediction of sister cell monotonicity.

3.4 Summary

In summary, taking inspiration from the sister mitral cells in the vertebrate olfactory bulb, we showed how inference that typically requires dense connectivity can be performed using sparse connectivity. This means computations that would normally require hundreds of thousands of connections per neuron can instead be performed with a small fraction of that number. To the best of our knowledge, our work is the first to propose a computational role for the sister mitral cells, and it makes a number of experimentally testable predictions. Despite its olfactory origins, our approach is quite general, and can be applied to other inference models.

4 Methods

Here we provide additional details about the analyses used in the Main Text. In Sec. 4.1 we introduce a Lagrangian for the non-leaky (ε = 0) system, and modify it for the leaky case to gain insight into the resulting dynamics. In Sec. 4.2 we perform a linear analysis to determine how the various parameters influence the transient response properties of the system. In Sec. 4.3 we examine the implications of eliminating the periglomerular cells. In Sec. 4.4 we provide additional details about our simulations of the system dynamics. In Sec. 4.5 we propose a method for reading out odour concentrations in cortex. Finally, in Sec. 4.6 we show that the sister mitral cell mechanism can be applied to other inference models.

4.1 Leaky periglomerular cells

To gain insight into the effect of periglomerular leak, we start with a Lagrangian for the non-leaky setting, It is straightforward to show that Consequently, if we minimize the Lagrangian with respect to x and v, and maximize it with respect to λ, we recover the MAP solution. The resulting equations are These equations correspond, up to scaling factors, to Eqs (23), (20) and (19), respectively, except that the second equation above includes the additional term . That term is, in fact, necessary to yield the correct dynamics. We dropped it because we can instead choose the initial conditions so that it is zero, and under the periglomerular dynamics it will stay zero forever. Thus the circuit dynamics can be viewed as finding the saddle point of the Lagrangian of a constrained optimization problem derived from the original MAP objective. To gain insight into the effect of the periglomerular leak, we add a leak term to the Lagrangian in Eq (39), As is not hard to show, this introduces a term to the right hand side of Eq (41c), which is the same as the leak term in Eq (33b). To give us a Lagrangian that depends solely on x, we eliminate the auxiliary variables and . We start by minimizing with respect to , yielding so that The final term couples the to their mean value computed over sisters. Extremizing with respect to yields where and we used, as usual, the constraint on the weights, Eq (14). We can now insert this into Eq (44) to derive a Lagrangian that depends only on x. Doing so is straightforward, although somewhat algebra-intensive, and ultimately yields Eq (26).
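The idea that descending a Lagrangian in the primal variables while ascending it in the multipliers finds the constrained optimum can be illustrated on a toy problem. The Lagrangian below is a stand-in chosen purely for illustration (the paper's Lagrangian involves the full circuit and is not reproduced here): minimise ½‖x − b‖² subject to c·x = d by gradient descent-ascent, which converges to the analytic saddle point.

```python
import numpy as np

# Toy saddle-point dynamics (an illustrative assumption, not the paper's
# circuit Lagrangian): L(x, lam) = 0.5*||x - b||^2 + lam * (c @ x - d).
b = np.array([1.0, 2.0, 3.0])
c = np.array([1.0, 1.0, 1.0])
d = 0.0

x = np.zeros(3)
lam = 0.0
dt = 0.01
for _ in range(5000):
    x += dt * (-(x - b) - lam * c)   # gradient descent on x
    lam += dt * (c @ x - d)          # gradient ascent on the multiplier

# Analytic saddle point: lam* = (c@b - d)/||c||^2, x* = b - lam* c.
lam_star = (c @ b - d) / (c @ c)
x_star = b - lam_star * c
```

The descent-ascent dynamics spiral into the saddle point rather than diverging along the multiplier direction, which is the qualitative behaviour the circuit interpretation relies on.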

4.2 Linear dynamics

In addition to the number of sister cells S, our model has several parameters that affect the transient dynamics. To understand these effects, we performed a linear analysis of the model in Eq (33). Our aim is a qualitative understanding, so we frequently opt for approximations that yield simple and tractable results over exact solutions. We perform the linear analysis around the steady-state solutions to Eq (33). Because of the threshold nonlinearity in Eq (33d), only a small number of granule cell activations x will be non-zero. We take the deviations around steady-state small enough so that the composition of this active set does not change. Adopting notation where vectors are in sister cell space, we write Here quantities with the subscript 0 are steady-state solutions to Eq (33); quantities with a δ in front of them are infinitesimally small, and obey linear dynamics where the sth component of w is and Because these are linear equations, they (generically) have solutions whose temporal part is given by e^(ξt). Consequently, derivatives with respect to time can be replaced by ξ, leading to Our approach is to transform this set of equations to an eigenvalue equation in a single variable. To that end we eliminate δ and δv, leaving us, after a small amount of algebra, with The S × S term in parentheses on the right hand side is the (i, k)th block of an MS × MS matrix of rank n, whereas the set of vectors δλ contain MS components (k runs from 1 to M and δλ is an S-dimensional vector). Consequently, so long as n < MS (the regime we consider), that rank-n matrix has two different classes of eigenvectors: those with zero eigenvalue and those with non-zero eigenvalue. We consider the former first. For eigenvectors with zero eigenvalue, the left hand side of Eq (51) must be zero. For that to happen there are, naively, two possibilities: δλ ∝ 1, in which case ξ obeys and δλ · 1 = 0, in which case ξ obeys Again naively, this should result in four eigenvalues so long as MS > n. 
To make this rigorous—and to uncover exactly when the above eigenvalues apply (as the naive conclusions are not quite correct)—we need a more involved analysis. Before proceeding with the general case, however, we note that there's a special case: τξ + ε = 0, since in that case the right hand side of Eq (51) is identically zero. Examining Eq (50b) we see that when this holds, δλ ∝ 1, which then determines δv by Eq (50c), and δ by Eq (50b). Thus there are M modes, corresponding to the root of Eq (52); for these, When τξ + ε ≠ 0, there is an (MS − n)-dimensional space of vectors, denoted , that obey We need to choose a linear combination of these for which the left hand side of Eq (51) is zero; that is, we need to choose a set of a such that (after rearranging terms slightly) Since this equation must be satisfied for all i, there are MS equations (i runs from 1 to M and is an S-dimensional vector). But there are only MS − n adjustable parameters, so in general the only solution has all the a = 0. There are, though, two ways to find nontrivial solutions. One is to set the first term in parentheses to zero (i.e., enforce Eq (52)). Then, because spans S − 1 dimensions, Eq (56) corresponds to M(S − 1) equations. Because there are MS − n adjustable parameters, there is a nontrivial solution if MS − n − M(S − 1) > 0; that is, if M > n. Note that because we have already taken into account the solution with τξ + ε = 0, these solutions must have τλξ + 1 = 0. The other way to find nontrivial solutions is to set the term in brackets in Eq (56) to zero (i.e., enforce Eq (53)). Then, because spans one dimension, Eq (56) corresponds to M equations, and so there is a nontrivial solution if MS − n − M > 0; that is, if M(S − 1) > n. 
In summary, when the right hand side of Eq (51) is zero, we have the following eigenvalues, all of which are exact: M modes with [M − n]+ modes with and [M(S − 1) − n]+ modes with where the superscript ‘mp’ indicates that this mode involves only mitral and periglomerular cells (see next paragraph), and (when it is not an index). Because ξmp can take on two values, there are 2[M(S − 1) − n]+ modes of this type. Two comments are in order. First, when there is only one sister cell (S = 1), the mode in Eq (59) does not exist, as that mode requires M(S − 1) > n. Second, for the modes given in Eqs (57), (58) and (59), ∑ w ⋅ δλ = 0; this in turn implies, via Eq (50c), that δv = 0. Thus, these modes involve the periglomerular and mitral cells, but not the granule cells. When the right hand side of Eq (51) is nonzero, analysis in δλ space is difficult. However, we can instead work in δv space: eliminating δλ and δ from Eq (50), we can write down an eigenvalue equation for δv; after tedious but straightforward algebra, including application of the Sherman-Morrison formula to invert the operator on the left-hand side of Eq (51), we arrive at where we used Eq (14) to write . Finding exact non-trivial solutions to Eq (60) requires finding the eigenvalues of the sum of the two matrices on the right hand side of this equation. That's difficult in general, so instead we make an approximation: we replace the second matrix with the identity. We justify this by arguing that its eigenvalues are narrowly distributed around 1. To show that, we start by writing For a given k and j, The elements of are nonzero for only one value of s and zero for the rest (see Eq (18)). Consequently, they are not iid, which makes it difficult to compute the eigenvalue spectrum. However, a reasonable approximation to is In that case, , implying that ∑ w ⋅ w approximately follows a Marchenko-Pastur distribution with parameters (1, n/MS) [40]. 
For this distribution, the eigenvalues lie in the range When n ≪ MS, the regime of interest, these eigenvalues are very narrowly distributed around 1. Thus, the matrix in Eq (61) is, to good approximation, the identity. This approximation breaks down as n increases, but we’re mainly interested in small n, so that is not a problem. With this approximation, the only nontrivial matrix left in Eq (60) is the one involving the A. The elements of A are drawn iid from , so where MP denotes the Marchenko-Pastur distribution. Using ν to denote an eigenvalue of this distribution, we see that Eq (60) can be approximated as There are n eigenvalues, corresponding to the fact that j runs from 1 to n in Eq (60), so there are n sets of solutions to this equation. (We say “sets of solutions”, rather than just one, because Eq (65) is a polynomial in ξ, which has several roots). The positive eigenvalues lie in the range If n < M, all of the eigenvalues lie in this range, while if n ≥ M, only M eigenvalues lie in this range; the other n − M are zero.
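The claim that the eigenvalues of AᵀA concentrate in the Marchenko-Pastur range when n ≪ M is easy to check numerically. The sketch below uses arbitrary example dimensions; only the variance-1/M sampling of A follows the text.

```python
import numpy as np

rng = np.random.default_rng(1)
M, n = 2000, 200                       # arbitrary sizes with n << M

# Elements of A drawn iid with variance 1/M, as in the text.
A = rng.standard_normal((M, n)) / np.sqrt(M)
eig = np.linalg.eigvalsh(A.T @ A)

# Marchenko-Pastur bulk edges for aspect ratio n/M: the eigenvalues should
# lie (up to finite-size fluctuations) in [(1 - sqrt(n/M))^2, (1 + sqrt(n/M))^2].
r = n / M
lo, hi = (1 - np.sqrt(r)) ** 2, (1 + np.sqrt(r)) ** 2
```

As n/M shrinks, `lo` and `hi` both approach 1, which is the narrow-distribution-around-1 regime exploited in the approximation above.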

4.2.1 Approximate solutions

To solve Eq (65), our first step is to write it as a polynomial, We're looking for values of ξ that satisfy q(ξ) = 0. Note that q(ξ) depends on ν, which means solutions to q(ξ) = 0 will also depend on ν; we drop that dependence to reduce clutter. Because q(ξ) is quartic, an exact analytic expression for its roots is available, but it is too complex to yield insight. Instead, we take a perturbative approach, which rests on the observation that η is large, on the order of 100 (see its definition, Eq (49), and Table 1). To take advantage of this, we scale ξ by a factor of . Choosing a scaling that gives us a dimensionless quantity, we make the change of variables Then, defining the time constant ratios and working to first order in we find, after straightforward algebra, that q(ξ), expressed in terms of α (and denoted, in a slight abuse of notation, q(α)), is given approximately by where indicates approximate equality up to a multiplicative constant and In the large η limit, the roots of q(α) are determined by those of B(α). Defining the corresponding four roots are ±iα±. The argument of the square root is (κ − γκ)² + 4γκ κ(1 − ν/S), which is guaranteed to be positive if ν < S. From Eq (66), this requires . We'll restrict ourselves to this regime, which ensures that both and are negative, which in turn means all four of these roots are purely imaginary. To compute the corrections to these roots, we let α = α0 + α1, where α0 is any of the above four roots. Then, performing a Taylor expansion of q(α), Eq (70), around α0, we have Setting this to zero and solving for α1 gives Using Eq (71) for B(α0) and b(α0), setting to , and using Eq (72) to simplify the resulting expression, we arrive at We thus have (using Eq (68)) with α1,± given in Eq (68) and given in Eq (72). The “high” and “low” superscripts refer to the fact that , as is easy to see from Eq (72). It is instructive to consider the large S limit, which greatly simplifies the roots. 
Focusing first on the high frequency roots, in this limit we have so that, using Eq (68), To approximate the low frequency roots, which correspond to , we perform a Taylor expansion of the square root in Eq (72) around (κ + γκ)², yielding so that, using Eq (68), In summary, our goal was to find solutions to Eq (67) for each value of ν, where ν is drawn from the Marchenko-Pastur distribution MP(1, n/M). This implies that there is a distribution of solutions, ξ, which we can find by solving for ξ at each ν. We did that perturbatively, yielding the high and low frequency solutions given in Eq (76) (with approximate expressions for these quantities given in Eqs (78) and (80)). Note that if ν = 0 (which can happen when n > M), α+ = 0 (see Eq (72)). When that happens, takes on only one value, not two (see Eq (76b)), and so there are three possible solutions. The number of modes when the right hand side of Eq (51) is nonzero, then, depends on n. There are always 2n high frequency modes. When n ≤ M (so that ν is strictly positive), there are also 2n low-frequency oscillatory modes. When n > M, on the other hand, n − M of the eigenvalues, ν, are zero, and the rest are positive, resulting in n − M decaying modes and 2M low-frequency oscillatory modes. We thus have 2n − [n − M]+ decaying and low-frequency oscillatory modes, for a total of 4n − [n − M]+ modes for Eq (60). All modes of the system are tabulated in Table 2. They are given exactly by Eqs (57), (58) and (59), and approximately by Eq (76). All of these modes have a decay associated with them, and the latter three also have oscillation frequencies. For simplicity, we considered the large S limit, so we used Eqs (78) and (80) for the approximate modes given in Eq (76). Assuming M(S − 1) > n, the total number of modes is M + [M − n]+ + 2(M(S − 1) − n) + 4n − [n − M]+. Adding these together gives 2MS + n, as it should.
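The mode-count bookkeeping above can be checked programmatically. This is a small sketch; the parameter triples are arbitrary examples of ours satisfying M(S − 1) > n, covering both the n < M and n > M branches.

```python
# Sanity check: the linear-analysis mode counts should sum to 2MS + n,
# the dimension of the linearised system. pos(z) implements [z]_+ = max(z, 0).
def pos(z):
    return max(z, 0)

def total_modes(M, S, n):
    return (M                               # Eq (57)
            + pos(M - n)                    # Eq (58)
            + 2 * pos(M * (S - 1) - n)      # Eq (59): two xi values per mode
            + 4 * n - pos(n - M))           # Eq (76): high/low frequency modes

# Example parameter triples (our choices), all with M(S - 1) > n.
checks = [(50, 25, 3), (50, 25, 55), (10, 5, 12)]
counts = [total_modes(M, S, n) for M, S, n in checks]
```

For each triple the total equals 2MS + n, matching the dimension count stated in the text.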
Table 2

Linear analysis modes and their properties. The last two modes correspond to the large S limit of Eq (76).

| |Re(ξ)|^−1: Decay time constant | |Im(ξ)|: Oscillation frequency | Number of modes | Source |
| τλ/(εκμ) | 0 | M | Eq (57) |
| τλ | 0 | [M − n]+ | Eq (58) |
| 2τλ/(1 + εκμ) | (1/τλ)√(γηκμ/S) | 2[M(S − 1) − n]+ | Eq (59) |
| 2(γκμ + κv)τλ/(κv² + γεκμ² + γκμ + κv) | (1/τλ)√(η(γκμ + κv)/S) | 2n | Eq (78) |
| 2(γκμ + κv)τλ/((γ + ε)κμκv + γκμ + κv) | (1/τλ)√(νηγκμκv/(γκμ + κv)) | 2n − [n − M]+ | Eq (80) |

4.2.2 Stability

The perturbative corrections in Eq (75) allow us to assess the stability of the linearized dynamics. For stability, both α1,+ and α1,− (which are real) must be negative. Combining Eq (72) with Eq (75), we see that this gives us two conditions, which can be simplified to just one, As above (see comments following Eq (72)), in the regime of interest, , the left hand side of this equation is greater than |κ − γκ|. The ratio on the right hand side is less than 1, so the right hand side is less than |κ − γκ|. Consequently, the inequality is satisfied. Thus, at least in the large η limit, all roots are stable.

4.3 Inference without periglomerular cells

In our model of sister mitral cells, periglomerular cells are critical to the inference process. This suggests a natural test of our model: remove them experimentally, see what happens to inference, and compare to the predictions of our model. Here we delineate those predictions. For simplicity we'll assume only one odour is present, which, without loss of generality, we take to be odour 1. The input, y, is, therefore, given by . Setting to zero in Eq (33a) and eliminating and v in Eq (33), we see, after a small amount of algebra, that at equilibrium x obeys We'll make the Ansatz that and all the other x are significantly smaller. This allows us to solve for x1 by considering only x = x1 on the right hand side of Eq (84). This still leaves us with a matrix equation. However, given the discussion in Sec. 4.2, to lowest order we can approximate the matrices in Eq (84) as the identity, With these approximations, the equation for x1 becomes which has the solution To recover the no leak case (ε = 0), we can set S = 1, because in that case Eq (84) corresponds to the MAP solution. Thus, the first observation is that eliminating the periglomerular cells reduces the inferred amplitude of the odour component present by a factor equal to the number of periglomerular cells (in the small-noise limit, meaning small σ²). We can also determine the effect on the incorrect odours. For j ≠ 1, Eq (84) may be written On average, . Consequently, when j ≠ 1, on average . Using this, and also using Eq (87) for x1, the equation becomes Because the elements of A have variance 1/M, the sum over i is a random variable with variance 1/M. We thus have where . (Note that this underestimates the variance, because we are ignoring the additional variability in the matrix , so we'll be underestimating the effect of eliminating the periglomerular cells). 
This has the solution The main observation is that in the small noise regime, the regime we consider here, there's a big difference between no leak (S = 1) and infinite leak: in the former case, ; in the latter it scales as . Because σ² ≪ 1/S, this is a large effect. This analysis suggests that eliminating periglomerular cells decreases the amplitude of the correctly inferred odour components and increases the amplitude of the incorrectly inferred ones, justifying our claim in the Discussion that eliminating periglomerular cells from the circuit would result in low-amplitude, distributed activity.

4.4 Simulations

The base values of all parameters used in simulations are listed in Table 1. Membrane time constants for mitral and granule cells were set to be similar to the corresponding charging time constants τ0 in [41]. For simplicity, the time constant for the periglomerular cells was set to be the same as that of the granule cells, which is consistent with [28]. To model each granule cell connecting to a single sister cell from each glomerulus, we selected, for each glomerulus i and granule cell j, a random sister cell; see Eq (18). The non-zero concentrations of the presented odours were set to 1, except in Figs 5, 6, 9 and 12, where the true odour had the default number n = 3 of non-zero components, but at concentrations of 0.8, 1, and 1.2, to aid visual assessment of convergence to the MAP solution. All simulations were initialized with zero activity in all cell populations. To assess the variability of the various response characteristics, we usually chose to present the same odour to different random instances of the olfactory bulb, rather than picking different odours within the same olfactory bulb. Thus all references to trials are to the same odour presented to different olfactory bulbs, unless stated otherwise. In Fig 7 the Marchenko-Pastur parameter ν used to compute ξhigh and ξlow was set to max(1, n/M), i.e. to 1 in panel A and to 55/50 in panel B. The quantities in Fig 8 were computed as follows. Amplitude spectra for panels A and B were computed as the absolute value of the Fourier transform of the mitral cell responses in the time interval t = 0.3–0.6 seconds, averaged over all mitral cells and 20 trials. The decay time constants in panel C were computed from the slope of a linear fit to the log RMS error of the mitral cell activations relative to their final value, over the interval t = 0.4–0.6 seconds following odour onset, averaged over 20 trials. To produce Fig 13 we simulated the response of one olfactory bulb to 100 random odours. 
Each glomerulus contained S = 25 sister cells; all other parameters were the same as in Table 1. Simulations were run for 2.1 seconds to allow the bulb to converge. To generate cell-specific nonlinearities to model transformations from our model variables, , to those that might be experimentally recorded, , we first discretized the interval [−6, 6], spanning the range of values that we observed in our simulations, into 101 points x1, …, x101. We generated a covariance kernel Σ on these points according to Σij = exp(−(xi − xj)²/8). We then generated MS random samples (one for each of the S sisters from each of the M glomeruli) from a 101-dimensional multivariate Gaussian with mean 0 and covariance Σ, yielding, for each sample, a function y of x with discrete domain. To render these functions monotonically increasing, we first computed the minimum slope along each. When the minimum was negative, we added a term linear in x whose slope was 1.02 times the absolute value of that minimum negative slope. This rendered the minimal slopes of the resulting discretized functions positive, and the functions themselves monotonically increasing. The functions were then scaled to have the same y range as x range, and vertically offset so that their minimum y-value was zero. We used linear interpolation to expand the domain of these functions from the 101 discretization points to the full [−6, 6] interval. For all simulations we used the forward Euler method with a time step of 10^−3 ms. To confirm that our networks performed MAP inference, we compared solutions to those found by the convex optimization package CVXPY [42] using the splitting conic solver (SCS), with eps set to 5 × 10^−13, applied to the MAP problem in Eq (7) expressed as the constrained optimization The code used to run the simulations and produce the figures is available at https://github.com/stootoon/sister-mcs-release.
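The construction of the random monotonic nonlinearities described above can be sketched as follows. This is a minimal re-implementation for illustration; the random seed and the diagonal jitter added for numerical stability are our own choices, not from the paper.

```python
import numpy as np

rng = np.random.default_rng(2)
grid = np.linspace(-6.0, 6.0, 101)          # the 101 discretization points

# Squared-exponential covariance kernel, Sigma_ij = exp(-(x_i - x_j)^2 / 8),
# plus a tiny diagonal jitter (our addition) to keep sampling stable.
Sigma = np.exp(-(grid[:, None] - grid[None, :]) ** 2 / 8.0)
Sigma += 1e-10 * np.eye(grid.size)

def random_monotonic_fn(rng):
    """One random monotonically increasing nonlinearity on [-6, 6]."""
    y = rng.multivariate_normal(np.zeros(grid.size), Sigma)
    slopes = np.diff(y) / np.diff(grid)
    if slopes.min() < 0:
        # Add a linear ramp with slope 1.02x the worst negative slope,
        # making every discrete slope positive.
        y = y + 1.02 * abs(slopes.min()) * grid
    # Scale the y range to match the x range, then shift the minimum to zero.
    y = (y - y.min()) * (np.ptp(grid) / np.ptp(y))
    # Linear interpolation extends the function to all of [-6, 6].
    return lambda x, yy=y: np.interp(x, grid, yy)

f = random_monotonic_fn(rng)
```

Drawing MS such functions (one per sister cell) would reproduce the family of smooth, monotonically increasing, zero-minimum nonlinearities used for Fig 13.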

4.5 Cortical readout

In our model, the concentrations of the odour components are stored in granule cells, which don't project outside of the olfactory bulb, and in fact lack axons entirely [43]. Thus, the granule cells can't communicate any information to the rest of the brain. This can be remedied by projecting mitral cell activity to cortical readouts via projection weights In this circuit each cortical neuron is excited by the sister cells in the same way as the granule cells in the bulb, but is not required to provide feedback to the bulb. When computation in the bulb converges, we have (see Eq (33a), and recall that sister mitral cells all converge to the same activity), so that Thus, as long as the projection weights to the cortex satisfy a constraint analogous to Eq (14), each cortical neuron will have the same fixed point as its corresponding granule cell. This means the output of the computation in the bulb can be read out in cortex via a 1-to-1 correspondence between granule cells and cortical neurons. Thus basic olfactory inference can be performed entirely within the bulb, with the concomitant increase in computational speed, and the results can be read out in the cortex. As cortical feedback to the bulb, in particular to the granule cells, does in fact exist [43], its role may be to incorporate higher-level cognitive information and task contingencies into the inference. We leave the exploration of these ideas to future work.

4.6 Application to other models

Our model used sister mitral cells to sparsify connectivity in a circuit performing inference under a linear model of olfactory transduction with Gaussian noise. Our approach, however, is quite general, and can be applied to more complex models. For example, in [2] the authors also consider a linear model of olfactory transduction, but with Poisson noise and a spike-and-slab prior on odour concentrations. Eqs (28) and (29) from their model translate, with minor redefinitions to be consistent with our notation, and minor simplifications to reduce clutter, to where the connectivity matrices , and C are related to A by There are several differences between this model and ours. First, the input, which in our model was y, is now stochastic: r is the number of spikes in a bin of size Δt generated from a Poisson process with rate proportional to y, and is the expected number of spikes. Second, the granule cells and mitral cells communicate via dendro-dendritic connections at “spines”, denoted ; this results in several connection matrices rather than just one. Third, the cortical readout, , feeds back to the olfactory bulb. Fourth, the nonlinearity, , which is defined in terms of the digamma function, is very different from ours. And fifth, the equation for the mitral cells has a term on the right hand side. What they have in common is that the connectivity matrices, and , are dense, and so would require mitral cells to interact with nearly all granule cells. This results in the same all-to-all connectivity problem that we highlighted in Eq (10). But it can again be fixed using sister mitral cells and periglomerular cells, Because of Eq (98c), at equilibrium all the sister mitral cells (all the ) have the same value. 
Then, assuming, as above, that at t = 0 the average periglomerular activity is zero, it’s easy to see that the sister mitral cells have the same equilibrium values as they do in Eq (96) if Thus, sister mitral cells can be used in more complicated models than the purely linear one we considered here. 1 Sep 2021 Dear Dr. Tootoonian, Thank you very much for submitting your manuscript "Sparse connectivity for MAP inference in linear models using sister mitral cells" for consideration at PLOS Computational Biology. As with all papers reviewed by the journal, your manuscript was reviewed by members of the editorial board and by several independent reviewers. In light of the reviews (below this email), we would like to invite the resubmission of a significantly-revised version that takes into account the reviewers' comments. As you can see both reviewers valued the modelling idea but raised concerns about a potential departure from available biological evidence and an apparent lack of performance for specific scenarios. Reviewer #1 makes concrete suggestions of how to provide more compelling evidence for the introduction of mitral and sister cells into the model. It would be great if you addressed these first two points as the two key major points of reviewer #1. We cannot make any decision about publication until we have seen the revised manuscript and your response to the reviewers' comments. Your revised manuscript is also likely to be sent to reviewers for further evaluation. When you are ready to resubmit, please upload the following: [1] A letter containing a detailed list of your responses to the review comments and a description of the changes you have made in the manuscript. Please note while forming your response, if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. 
If eligible, we will contact you to opt in or out. [2] Two versions of the revised manuscript: one with either highlights or tracked changes denoting where the text has been changed; the other a clean version (uploaded as the manuscript file). Important additional instructions are given below your reviewer comments. Please prepare and submit your revised manuscript within 60 days. If you anticipate any delay, please let us know the expected resubmission date by replying to this email. Please note that revised manuscripts received after the 60-day due date may require evaluation and peer review similar to newly submitted manuscripts. Thank you again for your submission. We hope that our editorial process has been constructive so far, and we welcome your feedback at any time. Please don't hesitate to contact us if you have any questions or comments. Sincerely, Stefan Kiebel Associate Editor PLOS Computational Biology Samuel Gershman Deputy Editor PLOS Computational Biology *********************** Reviewer's Responses to Questions Comments to the Authors: Reviewer #1: See the attached document. Reviewer #2: Tootoonian and colleagues developed a rigorous model for sensory inference aimed at obtaining a maximum a posteriori (MAP) inference solution. Specifically, they are taking advantage of the olfactory circuit architecture to find a solution supported by biologically plausible sparse connectivity. This is in contrast with standard MAP implementations that require dense, biologically unrealistic, all-to-all connectivity. The authors’ model assigns different sister mitral cells which receive input from the same common glomerulus the job of extracting different latent variables and posits that each sister cell connects reciprocally and symmetrically to different sets of inhibitory granule cells interneurons. This strategy enables each mitral cell to interact with a smaller number of granule cells (by a factor S, ~25 given the number of sister cells per glomerulus). 
They further achieve coordination among sister mitral cells, necessary for their model implementation to arrive to MAP solution, by engaging back propagating action potentials from the sister mitral cells to drive the firing of periglomerular cells which, in turn, inhibit the sister mitral cells and drive them to converge at equilibrium to the same activity patterns, or at the same ratio in an odor independent fashion (a strong prediction of the model). In the current implementation, the readout units are the granule cells, which nevertheless do not relay information directly to the rest of the brain. The authors argue that a similar architecture with matched weights can be implemented at the level of pyramidal cells in the cortex that can also extract the same kind of information w/o having to relay it back to the mitral cells. The model solves the inference problem successfully for a small number of odors in the sensory scene, but its performance becomes poorer and poorer with increasing number of odors within the range sampled (up to 10). In my opinion, model is elegant and clearly spelled out, the manuscript is well-written, with numerous explanations of the assumptions and possible caveats and solution, and, thus, a pleasure to read. I also found the idea of using the sister cells as means to implement the necessary sparse connectivity very clever. However, my enthusiasm was reduced by the poor performance of the model in conditions where the number of odorants to represent is relatively small (>3-4), as well as by several implementation choices and its predictions that, in my opinion, depart from known biological evidence. I list below my main concerns: 1. No detail/reasoning is provided with respect with how the sparse prior is stored in the brain/model architecture. How is the prior information represented, stored and how is it acquired? 2. The model does not seem to work efficiently for simple sensory scenes where multiple odorants are present. 
It appears that starting from n=3 odors, the model starts to recover many more odors than are actually present. 3. The hard prediction of the model that all sister cells converge to the same activity is not necessarily supported by previous experimental work in anaesthetized and awake mice (Dhawale et al., 2010, Arneodo et al., 2017) that identified sister mitral cells in vivo and showed that their activity patterns are non-redundant and, if anything, diverge in time. Average firing rate changes of sister cells tend to be indeed somewhat similar to each other (more so than pairs of non-sister cells), but they span a wide range. Furthermore, the ratio of their responses appears to not odor-independent, but to vary as function of the odor presented. Their average odor responses at steady state are not simply scaled versions of each other, displaying instead a wide distribution of response correlations. Throughout the manuscript, the predictions of the model with respect to sister mitral cells converging to same activity patterns at equilibrium and whether they have or not the same ratio of response to different odors vary across different sections. With respect to their spike-timing, experimental work found that sister cells become decorrelated in an odor specific manner, both with respect to each other, as well as with the respiration clock. The model does not address these biological results (in the current implementation, it does not discuss the transmission of information with respect to different phases of the respiration cycle). 4. The authors choose to not implement cortical feedback in this version of their model (though in previous work they have, Grabska-Barwinska et al., 2017), which constraints its relation to the biology, since a cortical feedback is a key feature of the vertebrate olfactory circuit. For example, firing of the granule cells is heavily dependent of cortical input and controls the timing and gain of mitral cell firing. 
Also, cortical feedback may be important in relaying the required sparse prior.

5. As the authors note, the outputs of individual glomeruli consist of both mitral and tufted sister cells with very different intrinsic properties (including substantially different firing properties), local wiring, and projection patterns. However, in the current implementation both populations are bundled together. This is to some degree a technical detail, but it has important consequences for key requirements/benefits of the model: whether it is realistic to expect convergence to similar firing patterns over time (this appears different for tufted and mitral cells), and the size of S (Schwarz et al., 2018).

6. The choice of periglomerular cells activated by back-propagating action potentials of mitral cells as a mechanism to regulate the similarity of sister cells is interesting, but further discussion is needed of the assumptions made about the connectivity of these cells within the glomerulus and the timing of their spiking relative to the firing of mitral cells. The activity profiles of mitral, periglomerular, and granule cells shown in Figs. 9 & 12 appear quite different from documented activity patterns of such cells in the brain, and converge to a stable steady state within 100-200 ms of stimulus onset.

**********

Have the authors made all data and (if applicable) computational code underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data and code underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data and code should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available.
If there are restrictions on publicly sharing data or code (e.g. participant privacy or use of data from a third party), those must be specified.

Reviewer #1: Yes

Reviewer #2: Yes

**********

PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files. If you choose "no", your identity will remain anonymous but your review may still be made public. Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: Yes: Darío Cuevas Rivera

Reviewer #2: Yes: Dinu F Albeanu

Figure Files: While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email us at figures@plos.org.

Data Requirements: Please note that, as a condition of publication, PLOS' data policy requires that you make available all data used to draw the conclusions outlined in your manuscript. Data must be deposited in an appropriate repository, included within the body of the manuscript, or uploaded as supporting information. This includes all numerical values that were used to generate graphs, histograms, etc. For an example in PLOS Biology see here: http://www.plosbiology.org/article/info%3Adoi%2F10.1371%2Fjournal.pbio.1001908#s5.

Reproducibility: To enhance the reproducibility of your results, we recommend that you deposit your laboratory protocols in protocols.io, where a protocol can be assigned its own identifier (DOI) such that it can be cited independently in the future.
Additionally, PLOS ONE offers an option to publish peer-reviewed clinical study protocols. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols

4 Nov 2021

Submitted filename: point-by-point-responses.pdf

9 Dec 2021

Dear Dr. Tootoonian,

Thank you very much for submitting your manuscript "Sparse connectivity for MAP inference in linear models using sister mitral cells" for consideration at PLOS Computational Biology. As with all papers reviewed by the journal, your manuscript was reviewed by members of the editorial board and by several independent reviewers. The reviewers appreciated the attention to an important topic. Based on the reviews, we are likely to accept this manuscript for publication, provided that you modify the manuscript according to the review recommendations.

Please prepare and submit your revised manuscript within 30 days. If you anticipate any delay, please let us know the expected resubmission date by replying to this email. When you are ready to resubmit, please upload the following:

[1] A letter containing a detailed list of your responses to all review comments, and a description of the changes you have made in the manuscript. Please note, while forming your response, that if your article is accepted, you may have the opportunity to make the peer review history publicly available. The record will include editor decision letters (with reviews) and your responses to reviewer comments. If eligible, we will contact you to opt in or out.

[2] Two versions of the revised manuscript: one with either highlights or tracked changes denoting where the text has been changed; the other a clean version (uploaded as the manuscript file).

Important additional instructions are given below your reviewer comments.

Thank you again for your submission to our journal.
We hope that our editorial process has been constructive so far, and we welcome your feedback at any time. Please don't hesitate to contact us if you have any questions or comments.

Sincerely,

Stefan Kiebel, Associate Editor, PLOS Computational Biology
Samuel Gershman, Deputy Editor, PLOS Computational Biology

***********************

A link appears below if there are any accompanying review attachments. If you believe any reviews to be missing, please contact ploscompbiol@plos.org immediately: [LINK]

Reviewer's Responses to Questions

Comments to the Authors:

Reviewer #1: The authors have addressed most of the major and minor points from the previous round. However, one important point remains unaddressed. The transition from equation 9 to 10 seems to be a simple change of variables, where \lambda is defined as

\lambda_i = \frac{1}{\sigma^2}\left(y_i - \sum_k A_{ik} x_k\right).

Note that \frac{1}{\sigma^2} has to be part of the definition for equation 10b to hold, contrary to the text description of "factor out the term in parenthesis". Given the definition of \lambda, equation 10a can be rewritten as

\tau_\lambda \frac{d\lambda_i}{dt} = -\lambda_i + \frac{1}{\sigma^2}\left(y_i - \sum_k A_{ik} x_k\right) = -\lambda_i + \lambda_i = 0.

This means that \lambda_i is fixed in time, and the results that follow are based on the false premise that \lambda and (x, y) are independent of each other. As it stands, the description of the model up to equation 10 is written as if it were a derivation (and the word is used throughout the manuscript). However, given the point above, perhaps the authors meant to introduce an arbitrary model (eqn 10) and then show that it performs MAP inference, just as the system in eqn 9 does. While it might sound like a distinction without a difference, it is important to note that equations 9 and 10 do not describe the same system at all, but do converge to the same equilibrium point.
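The reviewer's algebraic point can be checked numerically: once \lambda is defined by the change of variables, the right-hand side of the quoted \lambda dynamics cancels identically, whatever x and y happen to be. A minimal sketch (the dimensions, noise level, and random A, x, y are illustrative stand-ins, not the paper's actual parameters):

```python
import numpy as np

rng = np.random.default_rng(0)
M, N = 5, 20          # receptors, causes (illustrative sizes)
sigma = 0.3           # illustrative observation noise

A = rng.standard_normal((M, N))   # stand-in affinity matrix
x = rng.standard_normal(N)        # arbitrary cause estimate
y = rng.standard_normal(M)        # arbitrary observations

# Change of variables from the review: lambda_i = (1/sigma^2)(y_i - sum_k A_ik x_k)
lam = (y - A @ x) / sigma**2

# Right-hand side of the lambda dynamics as quoted:
# tau_lambda dlambda_i/dt = -lambda_i + (1/sigma^2)(y_i - sum_k A_ik x_k)
rhs = -lam + (y - A @ x) / sigma**2

# By construction the two terms cancel, so dlambda/dt = 0 for every i.
print(np.allclose(rhs, 0.0))  # True
```

The cancellation holds for any A, x, and y, which is exactly the reviewer's point: substituting the definition of \lambda back into the dynamics leaves \lambda_i constant, so the two systems can only be related through their shared equilibrium, not through identical trajectories.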
If this was the authors' intention, please indicate so in the manuscript, especially in the transition from equation 9 to 10.

Minor comment: For completeness, a condition of continuity could be added to the transformations f, such that invertible implies monotonic (as stated in line 435).

Reviewer #2: The authors have addressed several of my concerns and substantially improved the manuscript, which, I think, will have an important impact in the field. I could still quibble with the exact implementation schemes (no cortical feedback) and the convergence of sister mitral cells (I am happy to share the data on their odor response spectra), but, in my opinion, these points could be addressed in further manuscripts, and this one is now ready for publication.

**********

Have the authors made all data and (if applicable) computational code underlying the findings in their manuscript fully available?

Reviewer #1: Yes

Reviewer #2: Yes

**********

Do you want your identity to be public for this peer review?
Reviewer #1: Yes: Dario Cuevas Rivera

Reviewer #2: Yes: Dinu Albeanu

References: Review your reference list to ensure that it is complete and correct. If you have cited papers that have been retracted, please include the rationale for doing so in the manuscript text, or remove these references and replace them with relevant current references.
Any changes to the reference list should be mentioned in the rebuttal letter that accompanies your revised manuscript. If you need to cite a retracted article, indicate the article's retracted status in the References list and also include a citation and full reference for the retraction notice.

30 Dec 2021

Submitted filename: point-by-point-r2.pdf

5 Jan 2022

Dear Dr. Tootoonian,

We are pleased to inform you that your manuscript 'Sparse connectivity for MAP inference in linear models using sister mitral cells' has been provisionally accepted for publication in PLOS Computational Biology.

Before your manuscript can be formally accepted you will need to complete some formatting changes, which you will receive in a follow-up email. A member of our team will be in touch with a set of requests. Please note that your manuscript will not be scheduled for publication until you have made the required changes, so a swift response is appreciated.

IMPORTANT: The editorial review process is now complete. PLOS will only permit corrections to spelling, formatting or significant scientific errors from this point onwards. Requests for major changes, or any which affect the scientific understanding of your work, will cause delays to the publication date of your manuscript.

Should you, your institution's press office or the journal office choose to press release your paper, you will automatically be opted out of early publication. We ask that you notify us now if you or your institution is planning to press release the article. All press must be coordinated with PLOS.

Thank you again for supporting Open Access publishing; we are looking forward to publishing your work in PLOS Computational Biology.
Best regards,

Stefan Kiebel, Associate Editor, PLOS Computational Biology
Samuel Gershman, Deputy Editor, PLOS Computational Biology

***********************************************************

25 Jan 2022

PCOMPBIOL-D-21-01274R2
Sparse connectivity for MAP inference in linear models using sister mitral cells

Dear Dr Tootoonian,

I am pleased to inform you that your manuscript has been formally accepted for publication in PLOS Computational Biology. Your manuscript is now with our production department and you will be notified of the publication date in due course.

The corresponding author will soon be receiving a typeset proof for review, to ensure errors have not been introduced during production. Please review the PDF proof of your manuscript carefully, as this is the last chance to correct any errors. Please note that major changes, or those which affect the scientific understanding of the work, will likely cause delays to the publication date of your manuscript.

Soon after your final files are uploaded, unless you have opted out, the early version of your manuscript will be published online. The date of the early version will be your article's publication date. The final article will be published to the same URL, and all versions of the paper will be accessible to readers.

Thank you again for supporting PLOS Computational Biology and open-access publishing. We are looking forward to publishing your work!

With kind regards,

Livia Horvath
PLOS Computational Biology | Carlyle House, Carlyle Road, Cambridge CB4 3DN | United Kingdom
ploscompbiol@plos.org | Phone +44 (0) 1223-442824 | ploscompbiol.org | @PLOSCompBiol
References: 32 in total

1.  Efficient coding of natural sounds.

Authors:  Michael S Lewicki
Journal:  Nat Neurosci       Date:  2002-04       Impact factor: 24.884

Review 2.  Neuronal organization of the main olfactory bulb revisited.

Authors:  Toshio Kosaka; Katsuko Kosaka
Journal:  Anat Sci Int       Date:  2015-10-29       Impact factor: 1.741

Review 3.  Dendritic processing within olfactory bulb circuits.

Authors:  Nathan E Schoppa; Nathan N Urban
Journal:  Trends Neurosci       Date:  2003-09       Impact factor: 13.837

4.  Exploring parameter space in detailed single neuron models: simulations of the mitral and granule cells of the olfactory bulb.

Authors:  U S Bhalla; J M Bower
Journal:  J Neurophysiol       Date:  1993-06       Impact factor: 2.714

5.  Visualizing an olfactory sensory map.

Authors:  P Mombaerts; F Wang; C Dulac; S K Chao; A Nemes; M Mendelsohn; J Edmondson; R Axel
Journal:  Cell       Date:  1996-11-15       Impact factor: 41.582

6.  A probabilistic approach to demixing odors.

Authors:  Agnieszka Grabska-Barwińska; Simon Barthelmé; Jeff Beck; Zachary F Mainen; Alexandre Pouget; Peter E Latham
Journal:  Nat Neurosci       Date:  2016-12-05       Impact factor: 24.884

Review 7.  Molecular recognition and olfactory processing in the mammalian olfactory system.

Authors:  K Mori; Y Yoshihara
Journal:  Prog Neurobiol       Date:  1995-04       Impact factor: 11.685

8.  Non-redundant odor coding by sister mitral cells revealed by light addressable glomeruli in the mouse.

Authors:  Ashesh K Dhawale; Akari Hagiwara; Upinder S Bhalla; Venkatesh N Murthy; Dinu F Albeanu
Journal:  Nat Neurosci       Date:  2010-10-17       Impact factor: 24.884

9.  CVXPY: A Python-Embedded Modeling Language for Convex Optimization.

Authors:  Steven Diamond; Stephen Boyd
Journal:  J Mach Learn Res       Date:  2016-04       Impact factor: 3.654

10.  Humans can discriminate more than 1 trillion olfactory stimuli.

Authors:  C Bushdid; M O Magnasco; L B Vosshall; A Keller
Journal:  Science       Date:  2014-03-21       Impact factor: 47.728

