A Dual Formula for the Noncommutative Transport Distance.

Abstract

In this article we study the noncommutative transport distance introduced by Carlen and Maas and its entropic regularization defined by Becker and Li. We prove a duality formula that can be understood as a quantum version of the dual Benamou–Brenier formulation of the Wasserstein distance in terms of subsolutions of a Hamilton–Jacobi–Bellman equation.

Keywords: Duality; Quantum Markov semigroup; Quantum optimal transport

Year: 2022; PMID: 35509951; PMCID: PMC8993752; DOI: 10.1007/s10955-022-02911-9

Source DB: PubMed; Journal: J Stat Phys; ISSN: 0022-4715; Impact factor: 1.762

Introduction

The theory of optimal transport [28, 29] has experienced rapid growth in recent years with applications in diverse fields across pure and applied mathematics. Along with this growth came considerable interest in extending the methods of optimal transport beyond the scope of its original formulation as an optimization problem for the transport cost between two probability measures. One such extension deals with “quantum spaces”, where the probability measures are replaced by density matrices or density operators. Most of the work on quantum optimal transport in this sense can be grouped into one of the following two categories.

The first approach (see e.g. [6–9, 11, 22, 27]) takes a quantum Markov semigroup (QMS) as input datum and relies on a noncommutative analog of the Benamou–Brenier formulation [4] of the Wasserstein distance for probability measures on Euclidean space

$$W_2^2(\mu,\nu)=\inf\left\{\int_0^1\int_{\mathbb{R}^n}|v_t|^2\,d\rho_t\,dt : \dot\rho_t+\nabla\cdot(\rho_t v_t)=0,\ \rho_0=\mu,\ \rho_1=\nu\right\}.$$

In the simple case when the generator of the QMS is of the form with self-adjoint matrices , the associated noncommutative transport distance on the set of density matrices is given by where the infimum is taken over curves that satisfy , , and where For the definition of the metric in the more general case of a QMS satisfying the detailed balance condition (DBC), we refer to the next section. This approach has proven fruitful in applications to noncommutative functional inequalities, similar in spirit to the heuristics known as Otto calculus [8, 9, 12, 31].

The second approach (see e.g. [13, 14, 17, 23, 25, 26]) seeks to find a suitable noncommutative analog of the Monge–Kantorovich formulation [20] of the Wasserstein distance via couplings (or transport plans):

$$W_2^2(\mu,\nu)=\inf_{\pi\in\Pi(\mu,\nu)}\int_{\mathbb{R}^n\times\mathbb{R}^n}|x-y|^2\,d\pi(x,y),$$

where $\Pi(\mu,\nu)$ denotes the set of couplings of $\mu$ and $\nu$. This approach also allows one to consider a quantum version of the Monge–Kantorovich problem for arbitrary cost functions. So far, possible connections between these two approaches in the quantum world remain elusive. The focus of this article lies on the noncommutative transport distance introduced in the first approach.
More precisely, we prove a dual formula that is a noncommutative analog of the expression of the classical 2-Wasserstein distance in terms of subsolutions of the Hamilton–Jacobi equation [5, 24]

$$\frac{1}{2}W_2^2(\mu,\nu)=\sup\left\{\int u_1\,d\nu-\int u_0\,d\mu : \dot u_t+\frac{1}{2}|\nabla u_t|^2\le 0\right\},$$

where the supremum is taken over sufficiently regular functions $u\colon[0,1]\times\mathbb{R}^n\to\mathbb{R}$. This result yields a noncommutative version of the dual formula obtained independently by Erbar et al. [15] and Gangbo et al. [16] for the Wasserstein-like transport distance on graphs. In fact, we prove a dual formula that is not only valid for the metric , but also for the entropic regularization recently introduced by Becker–Li [3]. When the generator is again of the simple form discussed above, the entropic regularization is a metric obtained when replacing the constraint in the definition of by

With the notation introduced in the next section, the main result of this article reads as follows.

Theorem

Let be an invertible density matrix and an ergodic QMS on that satisfies the -DBC. The entropic regularization of the noncommutative transport distance induced by satisfies the following dual formula:

Here a QMS is said to satisfy the -DBC if for all and . If is the identity matrix, this is the case exactly when the generator is of the form with self-adjoint matrices . Moreover, stands for the set of all Hamilton–Jacobi–Bellman subsolutions, a suitable noncommutative variant of solutions of the differential inequality

Other metrics similar to also occur in the literature, most notably the one called the “anticommutator case” in [3, 10, 11]. In [9, 30], a class of such metrics was studied in a systematic way, and our main theorem in fact applies to this wider class of metrics. For the anticommutator case, this duality formula was obtained before in [10].

Some natural questions remain open. For one, we do not discuss the existence of optimizers. While for the primal problem this follows from a standard compactness argument, the question is more delicate for the dual problem, even when dealing with probability densities on discrete spaces instead of density matrices, and one has to relax the problem to obtain maximizers (see [16, Sects. 6–7]). Another interesting direction would be to extend the duality result from matrix algebras to infinite-dimensional systems. While a definition of the metric for QMSs on semi-finite von Neumann algebras is available [19, 30], the problem of duality seems to be much harder to address. Even for abstract diffusion semigroups, the best known result only shows that the primal distance is the upper length distance associated with the dual distance and leaves the question of equality open [2, Proposition 10.11].

Setting and Basic Definitions

In this section we introduce basic facts and definitions about QMSs that will be used later on. In particular, we review the definition of the noncommutative transport distance from [8] and its entropic regularization introduced in [3]. Our notation mostly follows [8, 9]. For a list of symbols we refer the reader to the end of this article.

Let denote the complex matrices and let be a unital *-subalgebra of . Let denote the self-adjoint part of , the cone of positive elements of and the subset of invertible positive elements. We write for the normalized trace on , that is, and for the Hilbert space formed by equipping with the GNS inner product The adjoint of a linear operator is denoted by . We write for the set of all density matrices on , that is, all positive elements with . The subset of invertible density matrices is denoted by .

A QMS on is a family of linear operators on that satisfy the following conditions: is unital and completely positive for every , , for all , is continuous. We consider a QMS on which extends to a QMS on satisfying the -detailed balance condition (-DBC) for some density matrix , that is, for and . For , this reduces to the symmetry condition .

Let denote the generator of , that is, the linear operator on given by We further assume that is ergodic (or primitive), that is, the kernel of is one-dimensional. This assumption is natural in this context as it ensures that the metric defined below is the geodesic distance induced by a Riemannian metric on and in particular that it is finite.

Generators of QMSs are often described by their Lindblad form, but here we will rely on the additional structure coming from the -DBC and use a presentation of provided by Alicki’s theorem [1, Theorem 3], [8, Theorem 3.1] instead: There exist a finite set , real numbers for and for with the following properties: for , for , for every there exists a unique with , for , such that for . The numbers are called Bohr frequencies of and are uniquely determined by .
The matrices are not uniquely determined by and , but in the following we will fix a set that satisfies the preceding conditions.

Next we will discuss how the data from Alicki’s theorem give rise to a differential structure associated with . Let where is a copy of for . This is the quantum analog of the space of tangent vector fields in our setting. We write for and which provide analogs of the partial derivatives and the usual gradient operator, respectively. The commutator satisfies the product rule

$$[V,AB]=[V,A]B+A[V,B]. \tag{1}$$

Note that in contrast to the usual partial derivatives, the order of the factors plays a role here. This is one central reason for many of the differences and intricacies of the quantum optimal transport distance compared to the classical Wasserstein distance. Continuing with the analogy with calculus, we write for the adjoint of , that is,

The crucial ingredient in the definition of , which allows one to deal with the noncommutativity of the product rule, is the operator , whose definition we recall next. For and define The motivation for this definition is a chain rule identity [8, Eq. (5.7)], which can best be illustrated in the case : Given , we define

For we write for the set of all pairs such that with , , and for a.e. . Here and in the following we write for the space of all maps such that for all . The space and other vector-valued function spaces occurring later are defined similarly. We define a metric on by where with the Bohr frequencies of . For , this is the noncommutative transport distance introduced in [8] (as distance function associated with a Riemannian metric on ), and for , this is the entropic regularization of introduced in [3].

A standard mollification argument shows that the infimum in the definition of can equivalently be taken over with . More precisely, if and is a mollifying kernel, then satisfies (2).
A suitable reparametrization of the time parameter gives a pair such that is smooth and

By a substitution one can reformulate the minimization problem for in such a way that the constraint becomes independent of . For that purpose define the relative entropy of with respect to by and the Fisher information of by According to [3, Theorem 1], one has

The metric is intimately connected to the relative entropy and is therefore well suited to studying its decay properties along the QMS. For other applications, variants of the metric have also proven useful (e.g. [10, 11]), for which the operator is replaced. A systematic framework of these metrics has been developed in [9, 30]. It can be conveniently phrased in terms of so-called operator connections.

Let H be an infinite-dimensional Hilbert space. A map is called an operator connection [21] if the following conditions hold: and imply for , for , , imply for . For example, for every the map is an operator connection. It can be shown that every operator connection satisfies for and unitary [21, Sect. 2]. Embedding into H, one can view as bounded linear operators on H, and the unitary invariance of ensures that does not depend on the embedding of into H.

For define Note that if , then so that L(X) is a positive operator, and the same holds for R(X). Thus we can define If and denotes the identity matrix, then is a scalar multiple of the identity as a consequence of the unitary invariance of discussed above. By a slight abuse of notation, this scalar will be denoted by . Since L(X) and R(X) commute, we have for and , where are the eigenvalues of X and the corresponding spectral projections.

More generally let be a family of operator connections and define Clearly, with the operator connection from above. Then one can define a distance by If as above, then we recover the original metric , while for (and ) one obtains the distance studied in [10, 11]. Later we will make the additional assumption that , where is the unique index in the Alicki representation of such that .
It follows from the representation theorem of operator means [21] that the class of metrics with subject to this symmetry condition is exactly the class of metrics satisfying Assumptions 7.2 and 9.5 in [9].

For technical reasons in the proof of Theorem 2, it will be necessary to allow for curves of density matrices that are not necessarily invertible. For this purpose, we make the following convention: If is a positive operator and , we define Since and is injective on , the element in this definition exists and is unique. Moreover, this convention is clearly consistent with the usual definition if is invertible. Alternatively, as a direct consequence of the spectral theorem, this expression can equivalently be defined as where are the eigenvalues of and an orthonormal basis of corresponding eigenvectors.
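As a numerical sanity check of the differential structure described in this section, the following Python sketch verifies two identities for 2 × 2 matrices: the product rule for the commutator, and the chain rule in what is assumed here to be the case of vanishing Bohr frequency. It assumes that the partial derivatives are commutators with a fixed matrix V and that the operator attached to a density matrix ρ acts as X ↦ ∫₀¹ ρ^s X ρ^(1−s) ds, as in [8]. The matrices V, A, B and ρ, as well as all function names, are hypothetical test data and not objects from the article.

```python
import math

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def sub(a, b):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def commutator(v, a):
    """[v, a] = v a - a v."""
    return sub(matmul(v, a), matmul(a, v))

def max_abs_diff(a, b):
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def log_mean(a, b):
    """Logarithmic mean: the integral of a^s b^(1-s) over s in [0, 1]."""
    return a if a == b else (a - b) / (math.log(a) - math.log(b))

# Hypothetical data: V is self-adjoint, A and B are arbitrary, and rho is a
# diagonal density matrix whose normalized trace (0.4 + 1.6) / 2 equals 1.
V = [[1.0, 2 - 1j], [2 + 1j, -0.5]]
A = [[0.3, 1j], [2.0, -1.0]]
B = [[1.5, -0.7], [0.2 + 0.4j, 3.0]]
eigenvalues = [0.4, 1.6]
rho = [[0.4, 0.0], [0.0, 1.6]]
log_rho = [[math.log(0.4), 0.0], [0.0, math.log(1.6)]]

# Product rule: [V, AB] = [V, A] B + A [V, B]; the order of the factors matters.
product_rule_error = max_abs_diff(
    commutator(V, matmul(A, B)),
    add(matmul(commutator(V, A), B), matmul(A, commutator(V, B))),
)

def rho_zero(lam, x):
    """Apply X -> integral of rho^s X rho^(1-s) ds in the eigenbasis of rho."""
    n = len(lam)
    return [[log_mean(lam[k], lam[l]) * x[k][l] for l in range(n)]
            for k in range(n)]

# Chain rule: applying the operator above to [V, log rho] gives back [V, rho].
chain_rule_error = max_abs_diff(
    rho_zero(eigenvalues, commutator(V, log_rho)),
    commutator(V, rho),
)

print(product_rule_error, chain_rule_error)
```

Both printed errors should be at the level of floating-point rounding. In the eigenbasis of ρ the operator multiplies the (k, l) entry by the logarithmic mean of the corresponding eigenvalues, which is the entrywise counterpart of the spectral expression mentioned above.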

Lemma 1

If are positive invertible operators that decrease monotonically to , then for all .

Proof

From the spectral expression it is easy to see that and the same for replaced by . Moreover, since , we have . Thus Since is monotonically increasing, this settles the claim.

Write for the set of all pairs such that with , , and for a.e. . The only difference from the definition of is that is not assumed to be invertible.

Proposition 1

For we have

Proof

It suffices to show that every curve can be approximated by curves in such that the action integrals converge. For that purpose let Since is assumed to be ergodic, by [8, Theorem 5.4] there exists for every a unique with such that and X(t) depends continuously on t. For let . Moreover, if is the smallest eigenvalue of , which is strictly positive by assumption, then . Thus as . Similarly one can show

By the same argument as above, for a.e. there exists a unique gradient such that and Since , the norm on the right side is bounded independently of , so that with a constant independent of . As for , this implies as .

With we have Furthermore, where we used the substitution . By Lemma 1 and the monotone convergence theorem we obtain Together with the convergence result for from above, this implies Altogether we have shown

Real subspaces

Since the proof of the main result relies on convex analysis methods for real Banach spaces, we need to identify suitable real subspaces for our purposes. For this is simply , but for this is less obvious and will be done in the following. For denote by the unique index in such that . Let be the linear span of , and define a linear map by By the product rule (1), also belongs to and Thus J interchanges left and right multiplication, that is, for and .

Lemma 2

The map J is anti-unitary.

Proof

For we have

Let By the previous lemma, is a real Hilbert space.

Lemma 3

Let be a family of operator connections such that for all . If and then .

Proof

For the statement follows directly from the definitions. For first note that as a consequence of the spectral representation (3) and the fact that J interchanges left and right multiplication. Thus

Duality

In this section we prove the duality theorem announced in the introduction. Our strategy follows the same lines as the proof in the commutative case in [15]. It crucially relies on the Rockafellar–Fenchel duality theorem quoted below. Throughout this section we fix an ergodic QMS with generator satisfying the -DBC for some and a family of operator connections such that for all . We need the following definition for the constraint of the dual problem. Here and in the following we write for and .

Definition 1

A function is said to be a Hamilton–Jacobi–Bellman subsolution if for a.e. we have The set of all Hamilton–Jacobi–Bellman subsolutions is denoted by .

Our proof will establish equality between the primal and dual problem, but before we begin, let us show that one inequality is actually quite easy to obtain.

Proposition 2

For all we have

Proof

For and we have where we used and for the first inequality and Young’s inequality for the second inequality.

To prove actual equality, our crucial tool is the Rockafellar–Fenchel duality theorem (see e.g. [28, Theorem 1.9]), which we quote here for the convenience of the reader. Recall that if E is a (real) normed space, the Legendre–Fenchel transform of a proper convex function $F\colon E\to\mathbb{R}\cup\{\infty\}$ is defined by

$$F^*\colon E^*\to\mathbb{R}\cup\{\infty\},\quad F^*(x^*)=\sup_{x\in E}\bigl(\langle x^*,x\rangle-F(x)\bigr).$$

Theorem 1

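Let E be a real normed space and $F,G\colon E\to\mathbb{R}\cup\{\infty\}$ proper convex functions with Legendre–Fenchel transforms $F^*,G^*$. If there exists $x_0\in E$ such that G is continuous at $x_0$ and $F(x_0),G(x_0)<\infty$, then

$$\sup_{x\in E}\bigl(-F(x)-G(x)\bigr)=\min_{x^*\in E^*}\bigl(F^*(x^*)+G^*(-x^*)\bigr).$$

Before we state the main result, we still need the following useful inequality.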

Lemma 4

For any operator connection the map is smooth and its Fréchet derivative satisfies for with equality if .

Proof

Smoothness of is a consequence of the representation theorem of operator connections [21, Theorem 3.4]. For the claim about the Fréchet derivative first note that is concave [21, Theorem 3.5]. Therefore for all and by [18, Proposition 2.2]. The fundamental theorem of calculus implies Since is 1-homogeneous by [21, Eq. (2.1)], its derivative is 0-homogeneous. Thus, if we replace B by and let , we obtain Moreover, the 1-homogeneity of implies , which settles the claim.
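The mechanism behind Lemma 4 can be illustrated in the scalar case. The sketch below takes the logarithmic mean as a stand-in for an operator connection (it is concave and 1-homogeneous on pairs of positive numbers) and checks that its derivative at a base point (a, b), applied to a direction (c, d), dominates the mean of (c, d), with equality when (c, d) is a multiple of (a, b). That the lemma has exactly this shape is an assumption made for illustration. The base point, the test directions and the finite-difference step are likewise hypothetical choices.

```python
import math

def log_mean(a, b):
    """Logarithmic mean of two positive numbers."""
    return a if a == b else (a - b) / (math.log(a) - math.log(b))

def directional_derivative(a, b, c, d, h=1e-6):
    """Derivative of log_mean at (a, b) in direction (c, d), by central differences."""
    da = (log_mean(a + h, b) - log_mean(a - h, b)) / (2 * h)
    db = (log_mean(a, b + h) - log_mean(a, b - h)) / (2 * h)
    return da * c + db * d

base = (1.0, 3.0)
directions = [(0.5, 0.5), (2.0, 1.0), (1.0, 3.0), (4.0, 0.1), (2.0, 6.0)]

# Each gap should be nonnegative up to discretization error, and close to zero
# for the directions (1, 3) and (2, 6), which are multiples of the base point.
gaps = [directional_derivative(*base, c, d) - log_mean(c, d) for c, d in directions]

for (c, d), gap in zip(directions, gaps):
    print((c, d), gap)
```

The inequality is the usual tangent-plane bound for a concave function, and the equality case follows from Euler's identity for 1-homogeneous functions. These are the same two properties of operator connections that the proof above invokes.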

Theorem 2

(Duality formula) For we have

Proof

The second inequality follows easily by mollifying. We will show the duality formula for Hamilton–Jacobi subsolutions in . For this purpose we use the Rockafellar–Fenchel duality formula from Theorem 1.

Let E be the real Banach space By the theory of linear ordinary differential equations, the map is a linear isomorphism. Thus the dual space can be isomorphically identified with via the dual pairing Define functionals by Here denotes the set of all pairs such that for all , .

It is easy to see that F and G are convex. Moreover, for and we have , hence , and for all , hence . Furthermore, G is clearly continuous at . Moreover,

Let us calculate the Legendre transforms of F and G, keeping in mind the identification of . For F we obtain Since the last expression is homogeneous in A, we have unless for all . This implies and and Thus Here denotes the set of all pairs satisfying , and The difference from the definitions of (or ) and is that we do not make any positivity or normalization constraints. Note however that if , then so that (and ).

Now let us turn to the Legendre transform of G. We have Since implies for all , we have unless . Furthermore, it follows from the definition of that unless for a.e. . For we have

We will show next that the inequalities are in fact equalities. Let and . Moreover, let with the notation from Lemma 4. Since is a bounded linear map that depends continuously on t, there exists a unique continuous map such that for every and . Let We claim that . Indeed, where the inequality follows from Lemma 4. Note that we have equality for . In particular, for we obtain On the other hand, where we again used Lemma 4 for the first inequality. Put together, we have and follows from the monotone convergence theorem. Hence if for a.e. .

Together with the formula for , we obtain where the last equality follows from Proposition 1. An application of the Rockafellar–Fenchel theorem yields the desired conclusion.
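The Rockafellar–Fenchel theorem that drives this proof can be tried out on a one-dimensional toy problem. The sketch below assumes the theorem is used in the form sup(−F − G) = min(F*(y) + G*(−y)), which is one standard way to state it, and approximates the Legendre–Fenchel transforms by maximizing over a grid. The functions F and G, the grids and the helper name `legendre` are hypothetical choices for illustration and have nothing to do with the functionals of the proof.

```python
def legendre(f, xs):
    """Grid approximation y -> sup_x (x * y - f(x)) of the Legendre-Fenchel transform."""
    return lambda y: max(x * y - f(x) for x in xs)

def F(x):
    return 0.5 * x * x

def G(x):
    return 0.5 * (x - 1.0) ** 2

# Grids chosen large enough to contain all maximizers needed below.
xs = [i / 100 for i in range(-500, 501)]
ys = [i / 100 for i in range(-300, 301)]

F_star = legendre(F, xs)
G_star = legendre(G, xs)

# Primal side: sup over x of -F(x) - G(x).
primal = max(-F(x) - G(x) for x in xs)
# Dual side: min over y of F*(y) + G*(-y).
dual = min(F_star(y) + G_star(-y) for y in ys)

print(primal, dual)
```

For these quadratics the transforms are known in closed form, F*(y) = y²/2 and G*(y) = y²/2 + y, so both sides can be computed by hand and equal −1/4, attained at x = 1/2 and y = 1/2 respectively.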
