
Strengthened second law for multi-dimensional systems coupled to multiple thermodynamic reservoirs.

David H. Wolpert

Abstract

The second law of thermodynamics can be formulated as a restriction on the evolution of the entropy of any system undergoing Markovian dynamics. Here I show that this form of the second law is strengthened for multi-dimensional, complex systems, coupled to multiple thermodynamic reservoirs, if we have a set of a priori constraints restricting how the dynamics of each coordinate can depend on the other coordinates. As an example, this strengthened second law (SSL) applies to complex systems composed of multiple physically separated, co-evolving subsystems, each identified as a coordinate of the overall system. In this example, the constraints concern how the dynamics of some subsystems are allowed to depend on the states of the other subsystems. Importantly, the SSL applies to such complex systems even if some of its subsystems can change state simultaneously, which is prohibited in a multipartite process. The SSL also strengthens previously derived bounds on how much work can be extracted from a system using feedback control, if the system is multi-dimensional. Importantly, the SSL does not require local detailed balance. So it potentially applies to complex systems ranging from interacting economic agents to co-evolving biological species. This article is part of the theme issue 'Emergent phenomena in complex physical and socio-technical systems: from cells to societies'.


Keywords:  entropy production; feedback control; multi-dimensional systems; multipartite processes; second law of thermodynamics; stochastic thermodynamics


Year:  2022        PMID: 35599569      PMCID: PMC9125225          DOI: 10.1098/rsta.2020.0428

Source DB:  PubMed          Journal:  Philos Trans A Math Phys Eng Sci        ISSN: 1364-503X            Impact factor:   4.019


Introduction

Statistical physics concerns experimental scenarios where we have restricted information concerning the state $x$ of a system, which is quantified as a probability distribution over those states, $p_x(t)$. In particular, the recently developed variant of statistical physics called 'stochastic thermodynamics' concentrates on systems that evolve according to a continuous-time Markov chain (CTMC). For a countable state space, this means that $p_x(t)$ evolves according to a linear differential equation,

$$\frac{\mathrm{d} p_x(t)}{\mathrm{d} t} \;=\; \sum_{x'} K_{x; x'}(t)\, p_{x'}(t). \qquad (1.1)$$

(Note that the rate matrix $K(t)$ can depend on time $t$.) Analysing systems that evolve according to equation (1.1) has led to formulations of the second law of thermodynamics which apply even if the system is evolving while arbitrarily far out of thermal equilibrium [1,2]. If we apply one of these formulations of the second law to any system evolving according to equation (1.1) while coupled to a single (infinite) heat bath at temperature $T$, and assume that the rate matrix is related to an underlying Hamiltonian via local detailed balance (LDB), we get

$$\frac{Q}{T} \;\le\; \Delta S, \qquad (1.2)$$

where $Q$ is the total heat flow into the system from its heat bath during the dynamics, and $\Delta S$ is the change in Shannon entropy of the system during the process. If LDB does not hold, equation (1.2) will not hold either, if we wish to interpret $Q$ as thermodynamic heat flow. However, for any rate matrix, regardless of whether it obeys LDB,

$$\int_0^{t_f} \mathrm{d}t \sum_{x \ne x'} K_{x; x'}(t)\, p_{x'}(t)\, \ln \frac{K_{x'; x}(t)}{K_{x; x'}(t)} \;\le\; S\bigl(p(t_f)\bigr) - S\bigl(p(0)\bigr) \qquad (1.3)$$

(for a process lasting from time $0$ to $t_f$). The quantity on the l.h.s. of equation (1.3) is called the total expected entropy flow (EF) into the system during the process. The difference between the entropy change of the system (the r.h.s. of equation (1.3)) and the EF is called the entropy production (EP), written as $\sigma$. So equation (1.3) can be re-expressed as

$$\sigma \;:=\; \Delta S - \mathrm{EF} \;\ge\; 0. \qquad (1.4)$$

Crucially, the inequality (1.4) holds for any CTMC, even a CTMC that has no thermodynamic interpretation, i.e. a CTMC which models a process that does not involve energy transduction. So equation (1.4) applies to dynamic models of everything from stock markets to the evolution of the joint state of an opinion network, so long as those models are CTMCs.

In many experimental scenarios, while we are restricted in the information we have concerning the system's state, we have some other information, in the form of conditions satisfied by the dynamics of the system. Recently, equation (1.4) has been strengthened by adding non-positive terms to the r.h.s. that incorporate this kind of information concerning the dynamics. Examples of these new results include 'thermodynamic uncertainty relations' (TURs [3-6]), 'speed limit theorems' (SLTs [7-11]), 'thermodynamic first passage bounds' [12-16], etc. Unlike equation (1.4) though, these bounds require measuring variables as they change during the process, in addition to knowing the beginning and ending distributions, $p(0)$ and $p(t_f)$. (For example, TURs rely on measuring accumulated currents, and SLTs rely on measuring integrated activity.) This limits their experimental applicability.

In this paper, I derive new strengthened forms of equation (1.4) that, like the TURs and SLTs, incorporate information concerning the dynamics of the system. However, unlike the TURs, SLTs, etc., these new strengthened forms of equation (1.4) do not require measuring variables as they change during the process; they only require knowing the beginning and ending distributions over states.
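To make these quantities concrete, the following is a minimal numerical sketch (not from the paper; the rate matrix, initial distribution and step size are arbitrary illustrative choices) of a three-state CTMC evolving under equation (1.1), together with the standard rate-matrix expressions for the EF rate into the system and the EP rate, which is non-negative for any CTMC.

```python
import numpy as np

# Illustrative 3-state CTMC (hypothetical rates): K[x, xp] is the rate of the jump xp -> x,
# and each column sums to zero so that probability is conserved under dp/dt = K p.
K = np.array([[-2.0,  1.0,  0.5],
              [ 1.5, -1.5,  1.0],
              [ 0.5,  0.5, -1.5]])

def ef_rate(K, p):
    """Expected entropy-flow rate into the system: sum_{x != x'} K[x,x'] p[x'] ln(K[x',x]/K[x,x'])."""
    r = 0.0
    for x in range(len(p)):
        for xp in range(len(p)):
            if x != xp and K[x, xp] > 0 and K[xp, x] > 0:
                r += K[x, xp] * p[xp] * np.log(K[xp, x] / K[x, xp])
    return r

def ep_rate(K, p):
    """Expected EP rate: sum_{x != x'} K[x,x'] p[x'] ln((K[x,x'] p[x']) / (K[x',x] p[x])) >= 0."""
    r = 0.0
    for x in range(len(p)):
        for xp in range(len(p)):
            if x != xp and K[x, xp] > 0 and K[xp, x] > 0:
                r += K[x, xp] * p[xp] * np.log((K[x, xp] * p[xp]) / (K[xp, x] * p[x]))
    return r

p = np.array([0.9, 0.05, 0.05])     # illustrative initial distribution
dt = 1e-3
for _ in range(5000):               # crude Euler integration of equation (1.1)
    p = p + dt * (K @ p)

print(ep_rate(K, p) >= 0)           # the EP rate stays non-negative along the trajectory
```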
These strengthened forms of equation (1.4) apply whenever we have information about which of the coordinates of the system can have their dynamics directly depend on which of the other coordinates. Formally, such information takes the form of constraints on the rate matrix of the CTMC governing the dynamics of the system. (See also [17].) I call this kind of restriction on the allowed dynamics a 'dependency constraint'. As an example, consider a random walker over a two-dimensional finite lattice, $\{1, \dots, L\}^2$. For simplicity take $L = km$ for two positive integers $k$ and $m$. The lattice is coarse-grained into a set of non-overlapping squares each of size $k \times k$, and the position of the walker in the lattice is represented three-dimensionally, by a pair of coordinates $(a, b)$ and an integer $w$. (The value $w$ specifies the precise coarse-grained square, while $(a, b)$ specifies the coordinates within that square.) In addition to position in the lattice, the walker has internal stores of two nutrients, $A$ and $B$, specified (up to some coarse-graining) by values $n_A$ and $n_B$ in two associated finite sets. So the state space of the walker is the product of the spaces of $a$, $b$, $w$, $n_A$ and $n_B$, i.e. those five variables are the five coordinates of the walker. We can suppose that both $a$ and $b$ evolve autonomously, independently of all other variables, according to two associated rate matrices, i.e. the walker engages in two independent random walks, one in each of the two directions across the lattice. Note though that $w$'s dynamics will depend on $a$ and $b$ in general, and that there will sometimes be simultaneous transitions of $w$ and some other coordinate. For example, suppose $a = k$, so the walker is at the extreme value of $a$ within some square, adjacent to the next coarse-grained square. Suppose as well that in the next step, the walker moves into that adjacent square. So $a$ simultaneously changes to 1 while $w$ must also change, since the coarse-grained square changes. However, for other changes in $a$, $w$ remains unchanged. We can also suppose that the dynamics of $n_A$ depends only on the walker's current position in the lattice and their current amount of nutrient $A$, i.e. it depends only on $(a, b, w, n_A)$. Similarly, the dynamics of $n_B$ depends only on $(a, b, w, n_B)$. (For example, this would be the case if densities of those two nutrients were arranged appropriately across the lattice, and the walker at a given location accumulates those nutrients based on their densities at that location.) Summarizing, the dependency constraints are that $n_A$ depends only on $(a, b, w)$ (in addition to depending on its own state), $n_B$ depends only on $(a, b, w)$ (in addition to its own state), $a$ and $b$ are autonomous, while $w$ can depend on $a$ and/or $b$ (in addition to itself). However, for this special issue on the topic of 'Emergence', perhaps the most important type of system that evolves subject to dependency constraints is a system that comprises a set of physically separated subsystems, co-evolving with one another, with each subsystem's state being identified as a different coordinate [18-20]. In this kind of system, dependency constraints governing the dynamics of each coordinate, specifying which other coordinates can directly affect its dynamics, amount to constraints on the dynamics of each subsystem, specifying which other subsystems can directly affect its dynamics. As a concrete illustration, consider the scenario investigated in [21,22], in which receptors in the wall of a cell sense the concentration of a ligand in the intercellular medium, and those receptors are in turn observed by a 'memory' subsystem inside the cell. Modify this scenario by introducing a second cell, which is observing the same external medium as the first cell.
Assume that the cells are far enough apart physically so that their dynamics are independent of one another. This gives us the precise scenario in figure 1, where subsystem 3 is the ligand concentration in the external medium, subsystem 2 is the state of the receptors of the first cell, subsystem 1 is the memory subsystem of the first cell and subsystem 4 is the state of the receptors of the second cell.
Figure 1

Four interacting subsystems, {1, 2, 3, 4}, grouped into three sets, {3}, {1, 2, 3} and {3, 4}. The red arrows indicate dependencies in the rate matrix of the overall system. So for example subsystem 3 evolves autonomously, but is continually observed by subsystems 2 and 4. (The implicit assumption that subsystem 3 is not affected by the back-action of the observation holds for many real systems such as colloidal particles and macromolecules [23].) Note that the statistical coupling between subsystems 2 and 4 could grow with time, even though the rate matrix does not directly couple their dynamics. The three overlapping sets indicated at the bottom of the figure specify the three units of a unit structure for this process, as discussed in the text. As an illustration of some of the definitions below, there is one reservoir coupled to the system that has subsystem 2 as its puppet set, with both subsystems 2, 3 as its leader set. (Online version in colour.)

My main result shows how a set of dependency constraints can strengthen equation (1.4), by adding an expression to its r.h.s. This expression involves only those dependency constraints and the starting and ending distributions of the system. As a caveat, this new lower bound on EP is not always positive, i.e. it is not always stronger than the conventional second law, equation (1.4). However, I show below that for any set of dependency constraints, there is a conditional distribution of the ending state given the initial state that can be implemented by a rate matrix obeying those constraints, together with an initial distribution, such that every rate matrix that implements that conditional distribution must result in a non-negative EP when applied to that initial distribution. Indeed, for some sets of dependency constraints, this new EP bound is stronger than the conventional second law no matter what the initial distribution and the conditional distribution are (so long as the conditional distribution is consistent with the dependency constraints).

Some of the TURs, SLTs, etc., rely on the dynamics obeying LDB. LDB is not required for the new extension of the second law derived here. This means that (for example) this new extension applies to multipartite systems that have 'directed' (sometimes called 'non-reciprocal') interactions rather than undirected interactions among the subsystems, i.e. interactions in which there is exactly zero back-action [22,24-30]. Very often, these systems violate strict LDB, and so their thermodynamic analyses are, at best, approximations. (See the discussion in the appendix of [19] of some conditions that justify this approximation.) By contrast, the result derived below applies exactly to any scenario where there is no back-action, with no approximation. In addition, this result holds even if the dynamics allows multiple coordinates to change simultaneously. In particular, in the special case that each coordinate is a separate subsystem, the result does not require that the dynamics be a multipartite process (MPP) [18]. Owing to these relaxations of the assumptions made in conventional stochastic thermodynamics, the results below are not restricted to thermodynamic systems involving energy transduction. The results hold for any CTMC, even if the rate matrix does not reflect physical coupling between the system and one or more external thermodynamic reservoirs, as it does in conventional applications of stochastic thermodynamics [2]. However, the strengthened second law derived below has special physical significance in the common scenario where the dependency constraints arise because the system's dynamics is governed by coupling with external reservoirs, and there are restrictions on that coupling.
For example, a common physical scenario is where the system has multiple subsystems, and each subsystem is coupled to a physically distinct part of a shared reservoir. Owing to the physical separation of those parts of the reservoir, each connected to a different subsystem, the usual assumption of time-scale separation between the dynamics of the overall system and that of the reservoirs means that the different subsystems are effectively coupled to reservoirs that are independent of one another.[1] Such systems evolve as an MPP, in which no transitions are allowed in which two or more subsystems change their states exactly simultaneously [18]. If the system is an MPP, and the dynamics of each subsystem obeys LDB, then we can use stochastic thermodynamics to identify various attributes of that dynamics with experimentally measurable thermodynamic quantities [18,24,33-35].

More generally, there are systems with multiple coordinates that are not usually viewed as separate 'subsystems', but where the global dynamics arises due to the system's coupling with thermodynamic reservoirs, and where each reservoir is only coupled to a single coordinate. These systems can also be modelled as MPPs, and analysed accordingly. Generalizing further, there are other kinds of systems that also have multiple coordinates, where the global dynamics arises due to the system's coupling with thermodynamic reservoirs, just like in an MPP. Also like in an MPP, each reservoir in these systems is only coupled to a proper subset of the coordinates, which results in dependency constraints. In contrast to an MPP however, some reservoirs are coupled to more than one coordinate. As an example, as stated in [36]: 'Fluctuations in biochemical networks, e.g. in a living cell, have a complex origin that precludes a description of such systems in terms of bipartite or MPPs, as is usually done in the framework of stochastic and/or information thermodynamics'. The strengthened second law I present below applies to these generalized forms of MPPs as well as to MPPs.

In the next section, I formalize dependency constraints as restrictions on the rate matrix of a CTMC. This is followed by a section in which I use this formalization to derive an expression for the EP of a system that involves the triple of {the rate matrix dependency constraints, the initial distribution over states, the final distribution over states}, together with certain other factors. In the following section I derive a lower bound on that expression for EP which depends only on the triple of {dependency constraints, initial distribution, final distribution}, without those other factors. In particular, this lower bound does not depend on any properties of the rate matrix, other than the dependency constraints. This lower bound is my main result. In the following section, I use this main result to analyse how the thermodynamics of feedback control [17,27,28] changes when we know that the system being controlled obeys a given set of dependency constraints. In the following section, I present a set of examples of my main result. I end with some discussion, in particular of the relation of the new result to other results in the literature.

Rate matrix unit structures

I begin by defining notation. First, I write the state space of the system as $X = \times_{i \in \mathcal{N}}\, X_i$, where each finite state space $X_i$ is a coordinate of the system. I write the set of coordinates as $\mathcal{N} = \{1, \dots, N\}$. As examples, each coordinate could specify the state of a physically separate subsystem of the overall system, or it could specify a position on one axis of a lattice, or it could indicate a degree of freedom in a multiscale specification of the state of the system. The system is assumed to evolve according to a CTMC.[2]

For any $A \subseteq \mathcal{N}$, I write $-A := \mathcal{N} \setminus A$. So for example, $x_{-A}$ is the vector of all components of $x$ other than those in $A$. For any set $Z$, $\Delta_Z$ is the associated unit-simplex. In addition, for any function defined over $X$, I use the analogous subscript notation for its dependence on subsets of the coordinates. The set of bits is $\{0, 1\}$. I write the Kronecker delta as $\delta(\cdot, \cdot)$. For any family of coordinate spaces, $\{X_i : i \in A\}$, I define $X_A := \times_{i \in A}\, X_i$. A distribution over a set of values $X_A$ at time $t$ is written as $p_{X_A}(t)$, with its value for $x_A$ written as $p_{x_A}(t)$. Similarly, I write $p_{X_A(t') \mid X_A(t)}$ for the conditional distribution of the state at time $t'$ given the state at time $t$, etc. I write Shannon entropy as $S(p_{X_A}(t))$, $S(X_A(t))$ or $S_A(t)$, depending on which would result in the cleanest equations, and write mutual information between two random variables as $I(\cdot\, ; \cdot)$.

The distribution over the overall system evolves according to the global rate matrix $K(t)$, as given by equation (1.1). A unit at time $t$ is a set of coordinates $\omega \subseteq \mathcal{N}$ such that, as the full system evolves according to $K(t)$, the marginal distribution over $X_\omega$ evolves according to the CTMC

$$\frac{\mathrm{d} p_{x_\omega}(t)}{\mathrm{d} t} \;=\; \sum_{x'_\omega} K^{\omega}_{x_\omega; x'_\omega}(t)\, p_{x'_\omega}(t) \qquad (2.1)$$

for all $t$, for some associated rate matrix $K^{\omega}(t)$. Intuitively, a unit is any set of coordinates whose evolution is independent of the states of the coordinates outside the unit. Since the dynamics of a unit is given by a self-contained CTMC, all the usual theorems of stochastic thermodynamics apply to any unit, e.g. the second law [2], SLTs [9,37], some of the fluctuation theorems [38] and, if LDB holds, then other fluctuation theorems [1], the thermodynamic uncertainty relations [4,6,39], first-passage time bounds [13], bounds on stopping times [15], etc. Any union of units is a unit. In addition, it is proven in electronic supplementary material, appendix A that any non-empty intersection of units is a unit. Note that since the dynamics of the full system is a CTMC, equation (2.1) applies with $\omega$ set to all coordinates in the system. So $\mathcal{N}$ is a unit. Note also that, in general, the evolution of a coordinate lying outside of a unit $\omega$ may depend on the states of coordinates lying inside $\omega$, even though the reverse is impossible by definition. As an example, in figure 1, subsystem 3 is its own unit, evolving independently of subsystems 2 and 1. By contrast, none of the other three subsystems are their own unit. (For example, subsystem 2's dynamics depends on the state of 3.)

A set of units defined over a set of coordinates $\mathcal{N}$ is called a unit structure if it obeys the following properties [19,20]: (i) the union of the units in the unit structure equals all of $\mathcal{N}$; (ii) the unit structure is closed under intersections of its units. I will generically write any particular unit structure defined over $\mathcal{N}$ as $\mathcal{N}^*$.[3] I will sometimes say that $\mathcal{N}^*$ represents the set of coordinates $\mathcal{N}$. Any process can be represented with at least one unit structure, e.g. by choosing a unit structure that contains only the single unit $\mathcal{N}$. (Typically, it can be represented by many possible unit structures.) Also, in general for any given rate matrix there are sets of coordinates that are not unions of units, and so cannot be represented by any unit structure. On the other hand, one can always construct a rate matrix that will implement any hypothesized unit structure over a set of coordinates, i.e. all unit structures can actually exist, for some appropriate rate matrix. (At worst, one can choose a rate matrix in which each coordinate evolves autonomously, i.e. a rate matrix that is a sum over all coordinates of independent rate matrices for each of those coordinates.)

For simplicity, from now on I assume that the unit structure does not change with $t$. In addition, I define a conditional distribution for the ending joint state given an initial joint state, $P(x(t_f) \mid x(0))$, to be consistent with a specified unit structure if there is some rate matrix that obeys that unit structure and that implements that conditional distribution. The dynamics of any two units $\omega_1$ and $\omega_2$ must be consistent with one another, i.e. for all $t$, the evolution of the marginal distribution over $X_{\omega_1 \cap \omega_2}$ implied by the CTMC for $\omega_1$ must equal that implied by the CTMC for $\omega_2$ (equation (2.2)). (Note that each side of equation (2.2) can be expanded using equation (2.1).) In particular, equation (2.2) must hold when the initial distribution is a delta function about any joint state. If we use equation (2.1) to evaluate the derivative on the l.h.s. and r.h.s. of equation (2.2), apply it for all such delta function choices of the initial distribution and then relabel, we see that for all $x$ and $t$, the rate matrices $K^{\omega_1}(t)$ and $K^{\omega_2}(t)$ must induce identical transition rates for the coordinates in $\omega_1 \cap \omega_2$ (equation (2.5)). (See electronic supplementary material, appendix B for more discussion of this result.) Conversely, if there is some rate matrix such that equation (2.5) holds for all $t$, then the two rate matrices are compatible, i.e. equation (2.2) holds. As an important special case of equation (2.5), if we take $\omega_2 = \mathcal{N}$ (so that the associated rate matrix is just the global rate matrix $K$), we see that for any unit $\omega$, the total rate of any given transition of $x_\omega$ under $K$, summed over the final states of the coordinates outside $\omega$, is independent of $x_{-\omega}$.
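For small systems, the condition just stated can be checked by brute force. The sketch below (my own helper code with illustrative names, not code from the paper) tests whether a candidate set of coordinates is a unit for a given global rate matrix over binary coordinates, by checking that the summed rate of every transition of the candidate's coordinates is independent of the state of the coordinates outside it.

```python
import itertools
import numpy as np

# Minimal sketch: K[j, i] is the rate of the joint-state jump i -> j. 'omega' is a set of
# coordinate indices. For every transition of x_omega we sum the rates over all final
# states of the outside coordinates, and require that the total does not depend on the
# initial state of the outside coordinates.

def is_unit(K, n_coords, omega):
    states = list(itertools.product([0, 1], repeat=n_coords))
    idx = {s: i for i, s in enumerate(states)}
    omega = sorted(omega)
    outside = [i for i in range(n_coords) if i not in omega]
    for om_from in itertools.product([0, 1], repeat=len(omega)):
        for om_to in itertools.product([0, 1], repeat=len(omega)):
            if om_to == om_from:
                continue
            totals = set()
            for out_from in itertools.product([0, 1], repeat=len(outside)):
                x = [0] * n_coords
                for c, v in zip(omega, om_from):
                    x[c] = v
                for c, v in zip(outside, out_from):
                    x[c] = v
                total = sum(K[idx[xp], idx[tuple(x)]]
                            for xp in states
                            if tuple(xp[c] for c in omega) == om_to)
                totals.add(round(total, 9))
            if len(totals) > 1:
                return False
    return True
```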

Example 2.1.

Recall that an MPP is a set of co-evolving subsystems evolving according to a CTMC in which no transitions are allowed in which two or more subsystems change their states simultaneously. Formally, in an MPP, for all $x \ne x'$, $K_{x; x'}(t) = 0$ unless there is some subsystem $i$ such that $x_{-i} = x'_{-i}$ [17,18]. The units in an MPP are sets of subsystems whose joint evolution is independent of the other subsystems. Equivalently, for every subsystem $i$ in an MPP, there is an associated rate matrix $K^{i}(t)$, zero for any transition that changes a coordinate other than $i$, such that the global rate matrix can be written as

$$K(t) \;=\; \sum_i K^{i}(t) \qquad (2.7)$$

and where, for every unit $\omega$ containing subsystem $i$, the rate matrix terms $K^{i}(t)$ are independent of the states of the coordinates outside $\omega$. Equation (2.2) always holds (and therefore so does equation (2.5)) in an MPP. At the other extreme from MPPs, equation (2.5) also holds for some rate matrices which only allow state transitions in which all subsystems change, i.e. rate matrices that are zero for any transition in which one subsystem changes its state while another does not. This is illustrated in electronic supplementary material, appendix B.

It will often be convenient to re-express a unit structure as a directed graph. Define the dependency graph of $\mathcal{N}^*$ by the rule that there is an edge from node $\omega$ to node $\omega'$ iff both: $\omega' \subset \omega$, and there is no intervening unit $\omega''$ with $\omega' \subset \omega'' \subset \omega$. (Note that this is a directed graph, which allows us to use standard graph theory terminology.) In a unit structure that contains $\mathcal{N}$ as one of its units the dependency graph has a single root, but if $\mathcal{N}$ is not one of the units, then the dependency graph has multiple roots. I will abuse notation and sometimes treat a unit as a set of coordinates while at other times I treat it as a single node in the dependency graph. I write the set of parents of any node as its parent set, and the set of its descendants as its descendant set; a node together with its descendants is the family of that node. The maximal number of nodes in any directed path that starts at a node is the height of that node. So any unit that has no subunits contained in it is a leaf node of the dependency graph, with height 1. (The maximal height of all nodes in the graph is simply called 'the height of the graph'.) I also write the set of root nodes of the dependency graph. As an example, the dependency graph of figure 1 has two root nodes, {1, 2, 3} and {3, 4}, and one leaf node, {3}, which is their common child. The height of the graph is 2.

For simplicity, from now on I assume that neither the number of reservoirs nor the associated puppet and leader sets (both defined below) changes with time $t$. There are several additional, technical conditions that I will impose on the unit structure, in order to simplify the algebra in the proofs of the results in §4. (These conditions can be ignored if the reader is only interested in understanding the results, not the details of their proofs.) Any CTMC can be represented with at least one unit structure meeting these three conditions (e.g. the unit structure that consists just of $\mathcal{N}$). I require that the unit structure is rich enough that if a joint state transition can occur that simultaneously changes the state of all coordinates in a set $A$, then there is some unit that contains $A$.[4] I call such a unit structure flush. A unit is vacuous if all of its coordinates are also in at least one of its subunits. I assume that no unit in any unit structure we are considering is vacuous.[5] I say that two units are equivalent at time $t$ if no transition allowed by the rate matrix at $t$ changes the value of any coordinate that lies in one of those units but not the other. I require that $\mathcal{N}^*$ does not contain any two equivalent units. This means that for any two units in the unit structure, there must be transitions that can occur in which some coordinate lying in one of the units but not the other changes its value. To connect these considerations to stochastic thermodynamics, from now on I suppose there are a total of $M$ thermodynamic reservoirs attached to the system [1,2].
I suppose further that each reservoir $\nu$ generates fluctuations of the joint state of an associated set of coordinates $\mathcal{P}(\nu)$, without any such direct effect on the other coordinates. (For example, $\nu$ may be able to do this by being directly physically coupled to the coordinates in $\mathcal{P}(\nu)$ and no others, via an implicit interaction Hamiltonian.) As is standard in stochastic thermodynamics, I suppose that if only one particular reservoir $\nu$ were attached to the system, then the resultant dynamics over $X$ would be a CTMC. $\mathcal{P}(\nu)$ is called the puppet set of reservoir $\nu$, with its elements called the puppets of $\nu$. The collection of all puppet sets covers $\mathcal{N}$. To minimize the amount of notation required, I assume that the set-valued function $\mathcal{P}(\cdot)$ is invertible, i.e. a given pair of reservoirs $\nu$ and $\nu'$ might both affect the dynamics of some shared coordinate, but there will always be at least one coordinate whose dynamics is not affected by both of those reservoirs $\nu$ and $\nu'$.

Example 2.2.

Return to the example of an MPP, where we identify each subsystem with a separate coordinate. Each subsystem has its own unique set of reservoirs, which jointly cause the fluctuations in its state. In other words, the puppet set of each reservoir is a singleton, the associated subsystem of that reservoir, and each subsystem is the puppet set of at least one reservoir.

I write $\mathcal{L}(\nu)$ for the minimal set of coordinates whose associated value directly affects how the coupling with reservoir $\nu$ affects the dynamics of $\mathcal{P}(\nu)$. I call this the leader set of $\nu$, or sometimes the leader set of $\mathcal{P}(\nu)$.[6] I write $K^{\nu}(t)$ for the associated rate matrix over $X$ induced by the coupling of the system to reservoir $\nu$. So $K^{\nu}(t)$ affects the dynamics of $\mathcal{P}(\nu)$, but leaves the other coordinates unchanged. Abusing notation, I write $K^{\nu}(t)$ as a Kronecker delta over the coordinates outside $\mathcal{P}(\nu)$ times a proper stochastic rate matrix over $X_{\mathcal{P}(\nu)}$ whose entries can depend on $x_{\mathcal{L}(\nu)}$ (equation (2.8)). Abusing notation further, I will sometimes rewrite equation (2.8) in terms of a 'rate matrix' over the full space, in the sense that all of its entries for transitions between distinct states are non-negative, with its diagonal entries fixed by normalization. See figure 1 above and example 3.1 below.

In general, any given coordinate may be in more than one reservoir's leader set and in more than one reservoir's puppet set. Accordingly, I extend the definitions above by writing $\mathcal{L}(A)$ for the union of the leader sets of all reservoirs whose puppet sets intersect $A$, where $A$ is any subset of $\mathcal{N}$. So $\mathcal{L}(i)$ is the set of all coordinates whose state can directly affect the dynamics of coordinate $i$, via arguments of a rate matrix, and similarly for $\mathcal{L}(A)$. Along the same lines, I define $\mathcal{P}(A)$ as the union of the puppet sets of all reservoirs whose puppet sets intersect $A$. So $\mathcal{P}(A)$ is the set of all coordinates, inside or outside of $A$, whose dynamics is governed jointly with that of any coordinate in $A$. In general $\mathcal{L}(\mathcal{P}(\nu))$ can differ from $\mathcal{L}(\nu)$, since there can be coordinates whose dynamics is affected by other reservoirs in addition to $\nu$. Note as well that for any unit, the leader set of that unit lies within the unit. So in particular, if any two different units have non-empty intersection, then since that intersection must also be a unit, the leader sets of all the coordinates in that intersection must lie within that intersection. In addition, the inverses of these set-valued functions are well-defined. In particular, for any set of coordinates $A$, $\mathcal{P}^{-1}(A)$ is the set of all reservoirs $\nu$ such that $\mathcal{P}(\nu) = A$. It will be convenient to introduce the shorthand that, for any subset $A \subseteq \mathcal{N}$, the reservoirs of $A$ are all reservoirs $\nu$ with $\mathcal{P}(\nu) \cap A \ne \emptyset$, i.e. the set of all reservoirs that affect the dynamics of any of the coordinates in $A$. As in conventional stochastic thermodynamics, the global rate matrix at time $t$ is the sum over all reservoirs of the rate matrices of those reservoirs, $K(t) = \sum_{\nu} K^{\nu}(t)$. In appendix K it is shown that this implies the following intuitive result:

Proposition 2.3.

For any unit $\omega$, the rate matrix $K^{\omega}(t)$ governing the dynamics of $\omega$ can be written as a sum, over the reservoirs coupled to coordinates in $\omega$, of terms each of which is a properly normalized rate matrix over $X_\omega$ and is independent of $x_{-\omega}$.

Example 2.4.

As a simple illustration of Proposition 2.3, consider any MPP where each subsystem is controlled by one reservoir, which controls no other subsystems. To reduce notation, consider the case where the unit $\omega$ is all of $\mathcal{N}$. In this case the sum over reservoirs runs over all subsystems in unit $\omega$, and each term is the rate matrix of the subsystem associated with the corresponding reservoir. So Proposition 2.3 reduces to equation (2.7) in Example 2.1, with each term in Proposition 2.3 re-expressed as the rate matrix of a single subsystem. Proposition 2.3 means that, as far as any single unit is concerned, we can replace all reservoirs with a given leader set and puppet set by a single reservoir with that leader set and puppet set. Here I assume that the unit structure has this property simultaneously for all units. Formally, I restrict attention to unit structures that only contain units $\omega$ with the property that for all reservoirs $\nu$ whose puppet set intersects $\omega$, $\mathcal{L}(\nu) \subseteq \omega$. I call such a unit structure tight. (Note that there is always at least one unit structure with this property, namely the unit structure with a single element, the unit $\mathcal{N}$.) For any unit $\omega$ in a tight unit structure, $\mathcal{L}(\omega) \subseteq \omega$.[7] Since $\mathcal{P}(A) \subseteq \mathcal{L}(A)$ for all sets $A$, it then follows that $\mathcal{P}(\omega) = \omega$ for any unit in a tight unit structure. In addition, in a tight unit structure the analogous property holds for certain sets of coordinates even though they are not units in general.[8] If LDB holds for all reservoirs with puppet set inside a unit, then other fluctuation theorems [35], the thermodynamic uncertainty relations [15,18,23], first-passage time bounds [10], bounds on stopping times [27], etc., all apply to the thermodynamics of that unit. (See also [49].) We can tighten Proposition 2.3 under our assumption of a tight unit structure. The following result is proven in appendix L:

Proposition 2.5.

For any unit $\omega$ in a tight unit structure, the dynamics of the marginal distribution over $X_\omega$ is generated by the sum of the rate matrices of precisely those reservoirs whose puppet sets lie within $\omega$, with each such rate matrix depending only on the coordinates in $\omega$.
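To close this section, here is a small illustrative sketch of the puppet-set and leader-set bookkeeping defined above. The particular reservoirs and sets are hypothetical, chosen to mirror figure 1; the helper names are mine, not the paper's.

```python
# Illustrative reservoirs: P(nu) is the puppet set, L(nu) the leader set (hypothetical,
# mirroring figure 1: subsystem 3 autonomous, 2 and 4 observe 3, 1 observes 2).
puppet = {"nu1": {2}, "nu2": {1}, "nu3": {4}, "nu4": {3}}           # P(nu)
leader = {"nu1": {2, 3}, "nu2": {1, 2}, "nu3": {3, 4}, "nu4": {3}}  # L(nu)

def reservoirs_affecting(A):
    """All reservoirs whose puppet set intersects the coordinate set A."""
    return {nu for nu, P in puppet.items() if P & set(A)}

def leaders_of(A):
    """All coordinates whose state can directly affect the dynamics of some coordinate in A."""
    return set().union(*(leader[nu] for nu in reservoirs_affecting(A)))

def puppets_of(A):
    """All coordinates whose dynamics is driven jointly with that of some coordinate in A."""
    return set().union(*(puppet[nu] for nu in reservoirs_affecting(A)))

# For the units of figure 1 the leader sets stay inside the units, as a tight unit
# structure requires: leaders_of({3}) == {3}, leaders_of({1, 2, 3}) == {1, 2, 3},
# leaders_of({3, 4}) == {3, 4}.
```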

Thermodynamics of composite systems

Following conventional stochastic thermodynamics, I identify the (expected) global EF rate at time $t$ as the rate of entropy flow into the attached reservoirs,

$$\sum_{\nu} \sum_{x \ne x'} K^{\nu}_{x; x'}(t)\, p_{x'}(t)\, \ln \frac{K^{\nu}_{x; x'}(t)}{K^{\nu}_{x'; x}(t)}.$$

The results below do not require LDB. However, if all reservoirs are purely thermal, with no associated particle exchange, and if LDB applies, then we can interpret the EF rate as (temperature-normalized) heat flow between the system and its reservoirs.[9] Similarly, the (expected) global EP rate at time $t$ is

$$\sum_{\nu} \sum_{x \ne x'} K^{\nu}_{x; x'}(t)\, p_{x'}(t)\, \ln \frac{K^{\nu}_{x; x'}(t)\, p_{x'}(t)}{K^{\nu}_{x'; x}(t)\, p_{x}(t)}. \qquad (3.4)$$
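A minimal sketch of this decomposition (illustrative only, reusing the conventions of the earlier snippet): with the global rate matrix written as a sum over reservoirs, the global EF and EP rates are sums of per-reservoir contributions, and each per-reservoir EP contribution is itself non-negative.

```python
import numpy as np

# Hypothetical pair of reservoir rate matrices over the same 3-state system; each column
# of each K_nu sums to zero, and the global rate matrix is their sum.
K_list = [np.array([[-1.0,  0.5,  0.0],
                    [ 1.0, -0.5,  0.0],
                    [ 0.0,  0.0,  0.0]]),
          np.array([[-0.2,  0.0,  0.3],
                    [ 0.0,  0.0,  0.0],
                    [ 0.2,  0.0, -0.3]])]

def flow_terms(Knu, p):
    """Per-reservoir EF rate (into the reservoir) and EP rate contribution."""
    ef, ep = 0.0, 0.0
    for x in range(len(p)):
        for xp in range(len(p)):
            if x != xp and Knu[x, xp] > 0 and Knu[xp, x] > 0:
                flux = Knu[x, xp] * p[xp]
                ef += flux * np.log(Knu[x, xp] / Knu[xp, x])
                ep += flux * np.log((Knu[x, xp] * p[xp]) / (Knu[xp, x] * p[x]))
    return ef, ep

p = np.array([0.5, 0.3, 0.2])                            # illustrative distribution
global_ef = sum(flow_terms(K, p)[0] for K in K_list)
global_ep = sum(flow_terms(K, p)[1] for K in K_list)     # each term, and the sum, is >= 0
```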

Example 3.1.

In an MPP, each coordinate is a 'subsystem'; there is a bijection between the set of reservoirs and the set of subsystems; and for every reservoir/subsystem $\nu$, the puppet set is just $\{\nu\}$. So a unit is any set of subsystems $\omega$ such that for all $i \in \omega$, $\mathcal{L}(i) \subseteq \omega$. In addition, equation (3.4) reduces to the familiar sum over subsystems of single-subsystem EP rates. See [18-20,22,34,40] and figure 1.

Following the same convention as for the global EF rate, I define the (expected) local EF rate of any unit $\omega$ at time $t$ as the entropy flow rate into the associated reservoirs, i.e. the reservoirs whose puppet sets lie within $\omega$ (equation (3.7)). Since no reservoir's puppet set can include both coordinates inside a unit $\omega$ and coordinates outside of $\omega$, for any two non-overlapping units the local EF rate of their union is the sum of their local EF rates. So, viewed as a function from the set of all units to the reals, the local EF rate obeys the countable additivity axiom of a signed measure over the sigma algebra generated by the units in $\mathcal{N}^*$. This allows us to extend the definition of local EF rate to that sigma algebra by using the set of values on the units to generate an entire signed measure. So for example, for every pair of units $\omega_1, \omega_2$ in $\mathcal{N}^*$, even if their union is not itself a member of $\mathcal{N}^*$, the local EF rate of $\omega_1 \cup \omega_2$ is the local EF rate of $\omega_1$ plus that of $\omega_2$ minus that of $\omega_1 \cap \omega_2$.

Recall that the dynamics of any unit is given by a self-contained CTMC, independent of the state of any coordinate outside of that unit. Accordingly, the EP rate of a unit is the sum of the derivative of the entropy of the distribution of the joint state of that unit and the EF rate into the reservoirs of that unit. Using Proposition 2.5 to evaluate that entropy derivative and equation (3.7) to evaluate the EF rate, we can evaluate the (expected) local EP rate of $\omega$ at time $t$ explicitly. Accordingly, I sometimes write the global EP rate given in equation (3.4) as the local EP rate of the unit $\mathcal{N}$. For any unit $\omega$, the local EP rate is non-negative, since it has the usual form of an EP rate of a single system. (See [17] for a discussion of the relation between local EP rates and similar quantities discussed in [18,35,41].) Write the local EP generated by a unit $\omega$ during the process as $\sigma_\omega$ and similarly write $\sigma$ for the global EP. (To minimize notation, I adopt the convention that angle brackets are implicit for time-extended thermodynamic quantities, as opposed to rates.) In electronic supplementary material, appendix C, equation (2.4) and the log sum inequality [42] are used to prove that for any two units $\omega_1 \subseteq \omega_2$, not necessarily part of a unit structure, the local EP rate of $\omega_2$ is at least that of $\omega_1$ at all times $t$. Therefore $\sigma_{\omega_2} \ge \sigma_{\omega_1}$. In particular, it is shown in [17,43] that in the special case where there is a set of units that have no overlap with one another, the global EP is lower-bounded by the drop in the multi-information among those units. (See also equation (6.4).)

Let $\mathcal{N}^*$ be a unit structure. For simplicity, from now on I restrict attention to unit structures with the properties introduced in the previous section. Suppose we have a set of real numbers, $\{a_\omega\}$, which are indexed by the units $\omega \in \mathcal{N}^*$. It will be convenient to use the associated shorthand for the alternating sum over intersections,

$$\widehat{\sum_{\omega \in \mathcal{N}^*}} a_\omega \;:=\; \sum_{i} a_{\omega_i} \;-\; \sum_{i < j} a_{\omega_i \cap \omega_j} \;+\; \sum_{i < j < k} a_{\omega_i \cap \omega_j \cap \omega_k} \;-\; \cdots$$

(Note that the precise assignment of integer indices to the units in $\mathcal{N}^*$ is irrelevant.) This quantity is called the inclusion–exclusion sum (or just 'in-ex sum' for short) of $\{a_\omega\}$ for the unit structure $\mathcal{N}^*$. Next, define the time-$t$ in-ex information as

$$\mathcal{I}(t) \;:=\; \widehat{\sum_{\omega \in \mathcal{N}^*}} S\bigl(p_{X_\omega}(t)\bigr) \;-\; S\bigl(p_{X}(t)\bigr), \qquad (3.18)$$

where all the terms in the sums on the r.h.s. are marginal entropies over the (distributions over the coordinates in) the indicated units. As an example, if $\mathcal{N}^*$ consists of two units, $\omega_1$ and $\omega_2$, with no intersection, then the expected in-ex information at time $t$ is just the mutual information between those units at that time. More generally, if there are an arbitrary number of units in $\mathcal{N}^*$ but none of them overlap, then the expected in-ex information is what is called the 'multi-information', or 'total correlation', among those units [17,44,45].
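The in-ex sum and the in-ex information are straightforward to compute for small systems. The following sketch (my own helper code, assuming the in-ex information is the in-ex sum of unit entropies minus the full joint entropy, as above) evaluates them for a joint distribution over binary coordinates.

```python
import itertools
import numpy as np

def entropy(p):
    p = p[p > 0]
    return -np.sum(p * np.log(p))

def marginal(p_joint, coords, n_coords):
    """Marginal over the coordinates in 'coords'; p_joint has shape (2,) * n_coords."""
    axes = tuple(i for i in range(n_coords) if i not in coords)
    return np.asarray(p_joint.sum(axis=axes)).ravel()

def inex_sum(values_by_unit, units):
    """Inclusion-exclusion sum of values indexed by units (frozensets of coordinate indices)."""
    total = 0.0
    for r in range(1, len(units) + 1):
        for combo in itertools.combinations(range(len(units)), r):
            inter = frozenset.intersection(*(units[i] for i in combo))
            total += (-1) ** (r + 1) * values_by_unit[inter]
    return total

def inex_information(p_joint, units, n_coords):
    units = [frozenset(u) for u in units]
    needed = {frozenset.intersection(*(units[i] for i in combo))
              for r in range(1, len(units) + 1)
              for combo in itertools.combinations(range(len(units)), r)}
    S = {u: entropy(marginal(p_joint, u, n_coords)) for u in needed}
    return inex_sum(S, units) - entropy(p_joint.ravel())

# Example: two non-overlapping singleton units, so the in-ex information is the mutual
# information between the two bits (the joint distribution below is hypothetical).
p = np.array([[0.4, 0.1], [0.1, 0.4]])
print(inex_information(p, [{0}, {1}], 2))
```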
In electronic supplementary material, appendix D, Rota's extension of the inclusion–exclusion principle [46] is used to show that in any composite unit structure the global EF rate equals the in-ex sum of the local EF rates of the units. This implies that the global EP rate is

$$\langle \dot{\sigma}(t) \rangle \;=\; \widehat{\sum_{\omega \in \mathcal{N}^*}} \langle \dot{\sigma}_\omega(t) \rangle \;-\; \frac{\mathrm{d} \mathcal{I}(t)}{\mathrm{d} t}. \qquad (3.20)$$

This is the first major result of this paper.[10] Integrating equation (3.20) from the start to the end of a process gives

$$\sigma \;=\; \widehat{\sum_{\omega \in \mathcal{N}^*}} \sigma_\omega \;+\; \mathcal{I}(0) - \mathcal{I}(t_f). \qquad (3.21)$$

As an example of this result, suppose that we have two physically separated subsystems undergoing an MPP, and that subsystem 2 never changes its state, while subsystem 1 executes a map from its initial state to its final state, independent of the state of subsystem 2. Note that if the rate matrix of subsystem 1 depends on the state of subsystem 2, i.e. subsystem 1 observes subsystem 2 as it evolves, then there is only one unit rather than two. Accordingly, equation (3.20) tells us that the global EP rate can depend on this property of whether subsystem 1 observes the state of subsystem 2 as subsystem 1 evolves, even though the conditional distribution of subsystem 1's final state given its initial state is independent of the state of subsystem 2. In general, this effect of the unit structure on the EP will occur whenever the two subsystems are initially statistically coupled. See electronic supplementary material, appendix E for a discussion.

Equation (3.21) applies to any unit structure. In addition, for any unit structure over a set of coordinates, equation (3.15) and the fact that the union of a set of units is itself a unit mean that the global EP is lower-bounded by the EP of that union. Therefore, using equation (3.21) to expand the EP of that union gives equation (3.22). Equation (3.22) holds even if that union is all of $\mathcal{N}$, and at the other extreme, even if none of the units involved is a member of the original unit structure $\mathcal{N}^*$.

Strengthened second law for composite systems

In general, to evaluate the in-ex sum of local EPs on the r.h.s. of equation (3.21) requires detailed knowledge of the precise rate matrices during the process. However, following Landauer, the goal in this paper is to derive bounds that are independent of those details, depending only on the starting distribution and the conditional distribution of the final state given the initial state. One might hope that one could achieve this goal simply by setting all local EPs to zero in equation (3.21), giving

$$\sigma \;\ge\; \mathcal{I}(0) - \mathcal{I}(t_f). \qquad (4.1)$$

Below I will sometimes refer to the r.h.s. of this equation as the drop in the in-ex information. Unfortunately, in general it is impossible to have the local EPs of all units in an arbitrary unit structure equal zero, even if one uses a quasi-statically slow process. Indeed, the unit structure itself, independent of any other properties of the rate matrix, may mean that it is impossible to have all local EPs equal zero.[11] This might seem to imply that we cannot lower-bound the EP as in equation (4.1). However, recall that in general there are many different unit structures that all apply to the same CTMC. We are free to choose among those unit structures. And as it turns out, no matter what the CTMC is, we can always choose the unit structure in a way that guarantees that equation (4.1) does in fact hold.

I prove this result in several steps. First, in electronic supplementary material, appendices F and G, I derive a set of lower bounds on EP that always apply, no matter what the unit structure. These lower bounds are summarized in electronic supplementary material, proposition F.1, and are my first main result. These bounds are not in the form of equation (4.1) though; while important in their own right, they do not yet achieve our goal. On the other hand, in general we can represent any CTMC with a unit structure of height 2. (For example, we can do that by combining all coordinates that are not members of a root node of the dependency graph into one, overarching unit.) In electronic supplementary material, appendix F, I derive a corollary of electronic supplementary material, proposition F.1, telling us that equation (4.1) holds for any such unit structure of height 2. This is my second main result.[12] Owing to this result, we can always choose the unit structure so that the global EP is bounded by equation (4.1).

Unfortunately, as illustrated below, there are some unit structures of height 2 where the bound on the r.h.s. of equation (4.1) is negative for an appropriate initial distribution and conditional distribution consistent with that unit structure. In such cases, equation (4.1) does not provide a stronger bound on EP than the conventional second law. This is not as much of a problem as one might fear though. For every unit structure, there are initial distributions and conditional distributions that are consistent with it where the r.h.s. of equation (4.1) is non-negative, so that the bound in equation (4.1) is at least as strong as the conventional second law. This is my third and final main result. (This result is presented in electronic supplementary material, proposition F.2, and is also proven in electronic supplementary material, appendix F, based on results in electronic supplementary material, appendix I.)

Thermodynamics of feedback control for composite systems

We can use equation (4.1) to extend previous work on the thermodynamics of feedback control [17,27,28] to account for a known set of dependency constraints of the system being controlled. Suppose we have a composite system with some associated unit structure and some desired initial and final joint distributions over the states of the system, $p_0$ and $p_{t_f}$, respectively. Suppose we also have a feedback controller, $C$, whose state space has values $c$. Before the system starts to evolve, the controller observes the initial state of the system through a noisy channel, $P(c \mid x_0)$. This observation does not affect that initial system state, i.e. there is no back-action. So the initial joint distribution immediately after the observation is $p_0(x_0)\, P(c \mid x_0)$. As is standard in the literature of the thermodynamics of feedback control, we do not consider the thermodynamics of this measurement process. Note that the marginal distribution over the system immediately after the observation is still $p_0$. After the measurement, $c$ does not change. However, the system can observe $c$ as it evolves. The result is a new final distribution, where we abuse notation and write $P(x_{t_f} \mid x_0, c)$ for the distribution over final states of the system conditioned on the initial state being $x_0$ and the feedback process state being $c$. For simplicity, we parallel the conventional analysis in the literature and require that the marginal final distribution equals $p_{t_f}$.

In order to analyse the thermodynamics of feedback control, one must define a Hamiltonian over the states of the system, so that one can define the work on/from the system. Following convention, I assume the Hamiltonian is uniform at both $t = 0$ and $t = t_f$, and assume it is related to the global rate matrix via LDB. Let $\mathcal{N}^*$ be some unit structure with height less than three representing the original system, without the feedback apparatus. Using equation (4.1), the EP without the feedback apparatus is lower-bounded by the associated drop in in-ex information. By coupling that original system to the feedback apparatus we construct a new system, which comprises the original system together with an extra subsystem (the feedback apparatus) and new dependencies of the original coordinates of the system on the state of that new subsystem. There are many possible unit structures over this new joint system-feedback-apparatus combination. For simplicity, exploit the fact that $C$ evolves independently of the other coordinates in the system (by not evolving at all) to construct a new unit structure directly from $\mathcal{N}^*$, by replacing each unit $\omega \in \mathcal{N}^*$ with a new unit that also contains $C$. So $\mathcal{N}^*$ and the new unit structure contain the same number of units, with each unit in the new unit structure containing the subsystem $C$. (It does not matter if we add an additional unit to the new unit structure, containing just $C$ itself.) This gives a new lower bound on the EP, namely the drop in the in-ex information of the new unit structure. In electronic supplementary material, appendix J, it is shown that the difference between the lower bound on EP in the new, feedback scenario and the lower bound on EP in the original, no-feedback scenario is a combination of mutual informations between the state of the controller and the states of the units, evaluated at the start and end of the process.

By conservation of energy, the work done on the system during the process is the change in its internal energy minus the heat flow into the system from all the reservoirs, which (given LDB) is the sum of the temperature-normalized entropy flows from the reservoirs into the system. (Equivalently, this is the negative of the work extracted from the system.) Since the Hamiltonian is uniform at $t = 0$ and $t = t_f$, the change in internal energy is zero. For simplicity assume all reservoirs have the same temperature, $T$, and choose units so that $k_{\rm B} T = 1$. Then that sum of entropy flows is the total change in the entropy of the system minus the EP. Combining this with equation (4.1), it is shown in electronic supplementary material, appendix J that the amount of work that can be extracted from the system under feedback control, if one takes into account the unit structure, is (perhaps loosely) upper-bounded by the total change in the entropy of the system minus the drop in the in-ex information of the unit structure that includes the controller. By contrast, the conventional analysis in the literature, in which one does not account for the unit structure of the system, results in an upper bound that lacks this in-ex information term [17,27,28]. The difference between these two bounds is how much the unit structure restricts the amount of work we can extract from a system by observing its state.

Examples of the strengthened second law

In this section, I work through some elementary examples illustrating equation (4.1). All unit structures in these examples are implicitly assumed to have height less than 3.

Example 1

Consider any process where every coordinate that is in the intersection of two or more distinct units stays constant throughout. In such a process, the drop in in-ex information reduces to an expression in which the sum runs only over the root nodes (equation (6.1)). Moreover, since any unit that never changes its state generates no EP, in this kind of process the in-ex sum of local EPs likewise reduces, by the definition of the in-ex sum, to a sum over the root nodes (equation (6.2)). The lowest each of those local EPs can be is zero (which occurs when each unit evolves quasi-statically slowly). Therefore we can combine equation (6.2) with equations (3.21) and (6.1) to establish that the lower bound on EP can be achieved exactly, i.e. equation (6.1) is a tight lower bound on the EP. This lower bound holds no matter what the initial distribution and the conditional distribution are, so long as the conditional distribution is consistent with the unit structure.

As an illustration of this result, suppose that no two units intersect one another, and that every unit contains just a single coordinate. Then the lower bound on EP is the drop among the coordinates in their multi-information, sometimes called 'total correlation',

$$\Bigl[\sum_i S\bigl(X_i(0)\bigr) - S\bigl(X(0)\bigr)\Bigr] \;-\; \Bigl[\sum_i S\bigl(X_i(t_f)\bigr) - S\bigl(X(t_f)\bigr)\Bigr].$$

(This lower bound on the EP was previously derived in [17,43], in the special case that each coordinate is a physically separate subsystem.) By repeated application of the data-processing inequality, it is easy to confirm that this lower bound on the EP is non-negative. Note though that equation (6.1) holds for any process with a height 2 unit structure, so long as the ending entropies of (the joint coordinates in the units corresponding to) the leaf nodes equal the associated starting entropies. In particular, this is true even if the coordinates in the leaf nodes do change state during the process. Since the dependency graph has height 2, equation (4.1) tells us that the expression in equation (6.1) is a lower bound on the EP of such a process. Furthermore, the same argument using the data-processing inequality establishes that that lower bound is non-negative. However, in general, if the coordinates in the leaf nodes change their states during the process, that lower bound may not be tight.
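As a numerical illustration of this special case (my own construction; the rate constants and initial distribution are arbitrary), the following sketch evolves two autonomously evolving binary coordinates, treats each as its own singleton unit, and checks that the time-integrated global EP is at least the drop in the mutual information between them.

```python
import numpy as np

K1 = np.array([[-1.0, 2.0], [1.0, -2.0]])     # rate matrix for coordinate 0 alone
K2 = np.array([[-3.0, 0.5], [3.0, -0.5]])     # rate matrix for coordinate 1 alone
I2 = np.eye(2)
K = np.kron(K1, I2) + np.kron(I2, K2)         # global rate matrix: independent dynamics

def entropy(p):
    p = p[p > 0]
    return -np.sum(p * np.log(p))

def mutual_info(p):                            # p is the length-4 joint distribution
    P = p.reshape(2, 2)
    return entropy(P.sum(1)) + entropy(P.sum(0)) - entropy(p)

def ep_rate(K, p):
    r = 0.0
    for x in range(4):
        for xp in range(4):
            if x != xp and K[x, xp] > 0 and K[xp, x] > 0:
                r += K[x, xp] * p[xp] * np.log((K[x, xp] * p[xp]) / (K[xp, x] * p[x]))
    return r

p = np.array([0.45, 0.05, 0.05, 0.45])         # initially correlated joint distribution
dt, ep = 1e-3, 0.0
mi0 = mutual_info(p)
for _ in range(20000):                         # Euler integration, accumulating the EP
    ep += ep_rate(K, p) * dt
    p = p + dt * (K @ p)

print(ep, mi0 - mutual_info(p))                # EP should exceed the mutual-information drop
```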

Example 2

Suppose that a system comprises three physically separated subsystems, {1, 2, 3}, each with two possible states, 0 and 1. Suppose as well that the dynamics can be represented with the height-2 unit structure whose units are {2}, {1, 2} and {2, 3}. So subsystem 2 evolves independently, while the dynamics of both subsystems 1 and 3 depend on the state of subsystem 2. Suppose as well that initially $x_1 = x_2$, with uniform probability over their two possible joint states, and that $x_3$ is independent of both $x_1$ and $x_2$, also with uniform probability over its states. Therefore the initial marginal entropies of the units, and so the initial in-ex information, can be evaluated directly (equation (6.7)). Assume that subsystem 2 eventually loses all information about its initial state. In addition, as required by the unit structure, have subsystems 1 and 3 evolve independently of one another, conditioned on the trajectory of subsystem 2, and presume that they both eventually lose all information about their own initial states and the initial state of subsystem 2. Combining, the final joint distribution factorizes accordingly (equation (6.11)), and so the final in-ex information can also be evaluated directly (equation (6.12)). Combining equations (6.7) and (6.12) establishes that the EP is lower-bounded by the resulting drop in in-ex information. Note that we can derive this lower bound on the EP even though both subsystems 1 and 3 are continually observing subsystem 2 during the process, even if subsystem 2's state is changing as they observe it. In addition, this lower bound holds no matter what the ending distribution is, so long as it can be written as in equation (6.11). (So in particular, as discussed in the introduction, it applies to a simple extension of the cell-sensing scenario analysed in [21,22].)

Example 3

Return to the example of a random walker presented in the Introduction, with an associated height 2 unit structure illustrated in figure 2. Plugging into equation (3.18) and using obvious shorthand gives the in-ex information for that unit structure.
Figure 2

The random walker scenario described in the Introduction and investigated in example 3. (a) In the left panel, the five coordinates are indicated by circles, with the associated rate matrix dependencies indicated by arrows, using the same convention as in figure 1. (b) The right panel shows a height-2 dependency graph for this rate matrix. Each square is a different unit, with the associated coordinates explicitly written. Note that in dependency graphs arrows indicate the partial order of subset inclusion. In this example, the number of units is the same as the number of coordinates, but that need not be the case in general. (Online version in colour.)

Suppose that at $t = 0$ the full system has a single specific state with probability 1. So the initial in-ex information is zero. Suppose as well that the position in the lattice is uniformly random at $t_f$. (For example, this will occur at large enough $t_f$ if the lattice has periodic boundary conditions and both $a$ and $b$ evolve by randomly choosing one of their two neighbours.) This means that knowing the values of the nutrient stores at $t_f$ tells us nothing about the most recent values of the walker's position not already given by the value of $w$ at $t_f$, and so in particular tells us nothing about the most likely value of $(a, b)$ then. The same is true concerning the value of $w$ at $t_f$. This all means that the in-ex information at $t_f$ is negative. Combining gives a strictly positive value of $\mathcal{I}(0) - \mathcal{I}(t_f)$. So equation (4.1) provides a strictly positive lower bound on the global EP. Note that this lower bound applies no matter what the dynamics of the process; it can be quasi-statically slow, it can involve Hamiltonian quenches, but so long as the unit structure does not change during the process, the EP is lower-bounded by that strictly positive value. Furthermore, so long as the Hamiltonian is uniform at both $t = 0$ and $t_f$, the total work extracted in the process is the gain in entropy of the full system minus the global EP. Combining establishes that the total work extracted is upper-bounded by the gain in entropy of the full system minus the drop in the in-ex information (equation (6.15)). Note that increasing the size of the coarse-grained squares while keeping the lattice size constant means that the precise value of $w$ tells us less about the precise lattice position. Equation (6.15) tells us that increasing the significance of the within-square coordinates in this way increases the upper bound on the total amount of work that can be extracted.

Discussion

In this paper, I consider the thermodynamics of multi-dimensional systems evolving according to a continuous-time Markov chain. My main result is a strengthened version of the conventional second law, which applies whenever we have an a priori set of 'dependency constraints' that specify, for each coordinate, which other coordinates can directly affect its dynamics via the rate matrix. The result holds for any coordinate system: the coordinates can be conventional phase space coordinates, they can be states of a set of separate interacting subsystems of an overall system, they can be positions in a sequence of more refined coarse-grainings of the state of the system, they can involve amounts of various chemicals in the system, etc.

To derive my result I first translate the dependency constraints into a 'unit structure'. This gives a sigma algebra that groups the coordinates into overlapping sets, in a way that respects the dependency constraints. In general, any set of dependency constraints can be translated into more than one unit structure. In turn, any unit structure specifies an information-theoretic functional of distributions over the states of the system, called the 'in-ex information'. To illustrate this, suppose the dependency constraints specify that each coordinate evolves autonomously, independent of the others. (As an example, this would be the case for the spatial coordinates of a particle freely evolving under over-damped Langevin dynamics in a uniform medium with no external forces.) We could then choose a unit structure that assigns each coordinate to its own unique unit. In this case the in-ex information reduces to the total correlation (sometimes called 'multi-information') of the system's distribution, with each coordinate viewed as a separate random variable.

The strengthened version of the second law derived in this paper says that the EP of the system is lower-bounded by the difference between the beginning and ending values of the system's in-ex information. This lower bound is independent of all features of the dynamics other than the beginning distribution, the ending distribution and the dependency constraints restricting how the dynamics could have caused the initial distribution to evolve into the ending distribution. Accordingly, we can use this strengthened second law to upper-bound the amount of work that can be extracted from a system as it evolves from one specified distribution to another [27,47], in a way that accounts for dependency constraints governing the system's dynamics. Similarly, this strengthened second law can be used to refine recent results in the thermodynamics of feedback control [48], to account for dependency constraints in the system being controlled.

In contrast to other similar recently derived lower bounds on EP [19,20], the one derived here does not require that the dynamics of the system be a multipartite process. Nor does it require that local detailed balance holds. These two features mean the lower bound applies to any system undergoing continuous-time Markovian dynamics, even if the system has no natural thermodynamic interpretation. As a result, we can apply these results to everything from (Markov models of) evolving opinion networks to replicator dynamics of a population of evolving organisms. A recent paper [17] used an information-geometric analysis to also derive bounds on minimal EP that arise due to constraints on the rate matrix of a system's dynamics.
To use the analysis in [17] one needs to first find an operator over the set of all joint distributions which both obeys the Pythagorean theorem of information theory and which commutes with the time-evolution operators defined by the set of allowed rate matrices. In general, there are many such operators, but different ones will result in different bounds on EP. The analogue in this paper of the constraints on the allowed rate matrices is the set of dependency constraints. The analogue of finding one (or more) such operators for the approach in this paper is choosing a coordinate system and associated unit structure that represents the dependency constraints and is rich enough for the lower bound on EP to be strictly positive. Similarly to the case with the approach in [17], where different operators all consistent with the constraints on the set of allowed rate matrices will result in different bounds on EP, in general different unit structures all consistent with the constraints on the set of allowed rate matrices will result in different bounds on EP. Kolchinsky & Wolpert [17] provide many examples of how constraints on the allowed rate matrices can be used to derive non-zero lower bounds on EP, including collective flashing ratchets, Szilard boxes where the particle is subject to a gravitational force in addition to driving by a piston, Szilard boxes where there are constraints on the way the piston can be used, evolving Ising spin systems, etc. Many of these examples can be formulated as systems evolving under dependency constraints (e.g. most of the examples in [17] involving 'modularity constraints' can be directly formulated this way). Future work involves comparing the EP bounds in this paper with the ones in [17], and more generally trying to synthesize the two approaches.
References (21 in total; first 9 shown below)

1.  Nonequilibrium thermodynamics of feedback control.

Authors:  Takahiro Sagawa; Masahito Ueda
Journal:  Phys Rev E Stat Nonlin Soft Matter Phys       Date:  2012-02-03

2.  Decision Making in the Arrow of Time.

Authors:  Édgar Roldán; Izaak Neri; Meik Dörpinghaus; Heinrich Meyr; Frank Jülicher
Journal:  Phys Rev Lett       Date:  2015-12-16       Impact factor: 9.161

3.  Minimal energy cost for thermodynamic information processing: measurement and information erasure.

Authors:  Takahiro Sagawa; Masahito Ueda
Journal:  Phys Rev Lett       Date:  2009-06-24       Impact factor: 9.161

4.  Stochastic thermodynamics with information reservoirs.

Authors:  Andre C Barato; Udo Seifert
Journal:  Phys Rev E Stat Nonlin Soft Matter Phys       Date:  2014-10-31

5.  Fluctuation theorem for partially masked nonequilibrium dynamics.

Authors:  Naoto Shiraishi; Takahiro Sagawa
Journal:  Phys Rev E Stat Nonlin Soft Matter Phys       Date:  2015-01-16

6.  Stochastic thermodynamics, fluctuation theorems and molecular machines.

Authors:  Udo Seifert
Journal:  Rep Prog Phys       Date:  2012-11-20

7.  Phase transitions in Ising models on directed networks.

Authors:  Adam Lipowski; António Luis Ferreira; Dorota Lipowska; Krzysztof Gontarek
Journal:  Phys Rev E Stat Nonlin Soft Matter Phys       Date:  2015-11-23

8.  Stochastic thermodynamics in the strong coupling regime: An unambiguous approach based on coarse graining.

Authors:  Philipp Strasberg; Massimiliano Esposito
Journal:  Phys Rev E       Date:  2017-06-01       Impact factor: 2.529

9.  Strengthened second law for multi-dimensional systems coupled to multiple thermodynamic reservoirs.

Authors:  David H Wolpert
Journal:  Philos Trans A Math Phys Eng Sci       Date:  2022-05-23       Impact factor: 4.019

