
Analysis and Optimal Velocity Control of a Stochastic Convective Cahn-Hilliard Equation.

Abstract

A Cahn-Hilliard equation with stochastic multiplicative noise and a random convection term is considered. The model describes isothermal phase-separation occurring in a moving fluid, and accounts for the randomness appearing at the microscopic level both in the phase-separation itself and in the flow-inducing process. The call for a random component in the convection term stems naturally from applications, as the fluid's stirring procedure is usually caused by mechanical or magnetic devices. Well-posedness of the state system is addressed, and optimisation of a standard tracking type cost with respect to the velocity control is then studied. Existence of optimal controls is proved, and the Gâteaux-Fréchet differentiability of the control-to-state map is shown. Lastly, the corresponding adjoint backward problem is analysed, and the first-order necessary conditions for optimality are derived in terms of a variational inequality involving the intrinsic adjoint variables.


Keywords: Convection; Optimal velocity control; Optimality conditions; Stochastic Cahn–Hilliard equation; Well-posedness

Year: 2021 PMID: 34720441 PMCID: PMC8550788 DOI: 10.1007/s00332-021-09702-8

Source DB: PubMed Journal: J Nonlinear Sci ISSN: 0938-8974 Impact factor: 3.621

Introduction

The aim of this paper is to analyse the stochastic Cahn–Hilliard equation with convection, referred to throughout as the system (1.1)–(1.4). The equation is posed on a smooth bounded spatial domain up to a fixed final time T > 0, and its boundary conditions involve the outward unit normal vector on the boundary of the domain. The system (1.1)–(1.4) models isothermal phase-separation occurring in a moving fluid occupying the spatial domain during the time interval [0, T]. The order parameter, or phase-variable, represents the relative concentration between the pure phases, a second variable represents the chemical potential of the system, and the nonlinearity is a double-well potential with two global minima. The term u is an external random velocity field acting on the system, modelling possible stirring and mixing processes of the fluid which may affect phase-separation itself. The stochastic forcing describing the thermal fluctuations affecting phase-separation is modelled by means of a cylindrical Wiener process W on a given probability space and a W-integrable coefficient B, possibly depending on the phase variable itself, which calibrates the intensity of the noise.

The Cahn–Hilliard equation is a classical model employed in phase-separation, and nowadays has numerous applications in physics, biology, and engineering. Its introduction dates back to the pioneering work by Cahn and Hilliard (1958), where it was proposed, in its deterministic version, to describe spinodal decomposition in binary metallic alloys. In the last decades, the model has been extensively refined in several directions. For example, the description of possible viscous behaviours was originally presented in Elliott and Stuart (1996), Elliott and Songmu (1986), and Novick-Cohen (1988), and then generalised in Gurtin (1996). The presence of a further evolution close to the boundary, due to the interaction with the hard walls, has been accounted for by proposing several choices of dynamic boundary conditions, for which we refer to Fischer et al. (1997), Kenzler et al. (2001), and Gal (2012).
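
Since the displayed equations are not reproduced in this text, the following is only a sketch of the generic form taken by a stochastic convective Cahn–Hilliard system of the kind described above. The symbols \varphi (order parameter), \mu (chemical potential), \Psi (double-well potential), D (spatial domain), \mathbf{n} (outward unit normal), and \varphi_0 (initial datum) are assumed notation, and the matching with the labels (1.1)–(1.4) is inferred from how the text refers to them.

```latex
\begin{align}
  \mathrm{d}\varphi - \Delta\mu\,\mathrm{d}t + u\cdot\nabla\varphi\,\mathrm{d}t
    &= B(\varphi)\,\mathrm{d}W
    && \text{in } (0,T)\times D, \tag{1.1}\\
  \mu &= -\Delta\varphi + \Psi'(\varphi)
    && \text{in } (0,T)\times D, \tag{1.2}\\
  \partial_{\mathbf n}\varphi = \partial_{\mathbf n}\mu &= 0
    && \text{on } (0,T)\times\partial D, \tag{1.3}\\
  \varphi(0) &= \varphi_0
    && \text{in } D. \tag{1.4}
\end{align}
```

In this reading, the convection term u\cdot\nabla\varphi carries the velocity control, and the right-hand side of (1.1) carries the Wiener noise.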
The deterministic Cahn–Hilliard equation has proven to be extremely effective in describing phase-separation phenomena. Nevertheless, it presents some drawbacks. Indeed, the phase-separation process inevitably undergoes some disruptions acting at a microscopic level. These are due to unpredictable movements at the atomistic level, which may be caused, for example, by temperature oscillations, magnetic effects, or configurational interactions. As such, the classical Cahn–Hilliard system is unable to capture the erratic nature of the separation process. The most natural way to overcome this problem is to switch to a random setting instead, by introducing a suitable noise term in the equation that can effectively describe the unpredictability of the phenomenon at a small scale. This was proposed by Cook (1970) for Wiener-type noises and gave rise to the well-known Cahn–Hilliard–Cook stochastic model for phase-separation. The stochastic version of the model was then confirmed multiple times (Binder 1981; Pego 1989) to be the only one that can genuinely describe phase-separation in alloys. Since then, the random version of the equation has been increasingly studied, both in the physics literature (Rogers et al. 1988; Elder et al. 1988; Grant et al. 1985; Langer et al. 1975; Milchev et al. 1988) and in the direction of model validation and numerical simulations (Blömker et al. 2001, 2008, 2016; Hawick 2008, 2010; Hawick and Playne 2010; Lee et al. 2014).

The classical Cahn–Hilliard equation is the gradient flow, with respect to a dual-space metric of H^{-1} type, of a free energy functional made up of a gradient term and a double-well potential term. The gradient term penalises the oscillation of the order parameter, while the double-well potential models the tendency of each phase to concentrate. The form of the chemical potential in (1.2) then appears naturally from the differentiation of the free energy.
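
To make the last statement concrete, here is a short worked computation under assumed notation: \mathcal{E} for the free energy, \varphi for the order parameter, \Psi for the double-well potential, D for the domain, and h for a smooth test direction. It is a sketch of the standard Ginzburg–Landau-type energy, not a transcription of the display lost from the text.

```latex
\begin{align*}
  \mathcal{E}(\varphi)
    &= \int_D \Big( \tfrac12\,|\nabla\varphi|^2 + \Psi(\varphi) \Big)\,\mathrm{d}x, \\
  \frac{\mathrm{d}}{\mathrm{d}\varepsilon}\,
  \mathcal{E}(\varphi + \varepsilon h)\Big|_{\varepsilon = 0}
    &= \int_D \big( \nabla\varphi\cdot\nabla h + \Psi'(\varphi)\,h \big)\,\mathrm{d}x
     = \int_D \big( -\Delta\varphi + \Psi'(\varphi) \big)\,h\,\mathrm{d}x .
\end{align*}
```

The last equality uses integration by parts together with a homogeneous Neumann condition on \varphi, so the first variation of the energy is -\Delta\varphi + \Psi'(\varphi), which is the usual expression for the chemical potential.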
Typical examples of the double-well potential are the singular potential (1.5) and the polynomial potential (1.6). Although (1.5) is the most relevant choice in terms of thermodynamical consistency, its singular behaviour can be hard to tackle from the mathematical viewpoint, and in several models the polynomial approximation (1.6) is often employed.

The velocity field u models the transport effects due to convection terms acting on the system. In our analysis, this will be a prescribed external forcing field which will play the role of velocity control in a typical optimisation problem. Optimisation involving phase-separating fluids where the velocity is the control arises naturally in applications. For example, this is the case of block solidification of silicon crystals in photovoltaic applications. Here, the flow of the fluid acts as a control to optimise the distribution of certain impurities, at the atomistic level, in a process of solidification of silicon melt. For more details about the applications of optimal velocity control problems in phase-separating fluids, we refer to Kudla et al. (2013) and Rocca and Sprekels (2015).

In practice, the motion of the fluid can be achieved in several ways: as pointed out in Colli et al. (2018a) and Rocca and Sprekels (2015), the most common choices consist in employing either mechanical stirring devices or ultrasound emitters directly into the container. Another possibility is to prescribe a velocity on the fluid by means of magnetic fields: this is widely employed, for example, in the case of molten metals (Kudla et al. 2013) or bulk semiconductor crystals. Nevertheless, it is worth noting that in all these scenarios the velocity field is usually obtained in an indirect way, meaning that the motion of the fluid is achieved only as a consequence of more direct controls, such as mechanical devices or magnetic effects.
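
Returning to the two typical potentials (1.5) and (1.6): their explicit formulas are not reproduced in this text, so the sketch below uses the standard logarithmic and quartic double-well potentials from the Cahn–Hilliard literature as an assumption. The function names and the parameters `theta` and `theta0` are hypothetical and chosen only for illustration.

```python
import math


def polynomial_potential(r):
    """Quartic double-well potential (assumed form), with minima at r = -1 and r = +1."""
    return 0.25 * (r * r - 1.0) ** 2


def logarithmic_potential(r, theta=1.0, theta0=2.0):
    """Logarithmic double-well potential (assumed form) on the open interval (-1, 1).

    A double-well shape requires 0 < theta < theta0.
    """
    if not -1.0 < r < 1.0:
        raise ValueError("the logarithmic potential is only defined on (-1, 1)")
    entropy = (1.0 + r) * math.log(1.0 + r) + (1.0 - r) * math.log(1.0 - r)
    return 0.5 * theta * entropy - 0.5 * theta0 * r * r


def logarithmic_derivative(r, theta=1.0, theta0=2.0):
    """Derivative of the logarithmic potential; it blows up as r approaches -1 or +1."""
    if not -1.0 < r < 1.0:
        raise ValueError("the derivative is only defined on (-1, 1)")
    return 0.5 * theta * math.log((1.0 + r) / (1.0 - r)) - theta0 * r
```

The unbounded derivative near the endpoints is the kind of singular behaviour that makes the first choice harder to handle analytically, while the quartic potential is smooth on the whole real line.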
This being noticed, it is clear that the external prescription of a given velocity is strongly affected by microscopic noises, which may be caused, depending on the type of motion-inducing devices, by configurational or electromagnetic disturbances occurring in the flow-creating process. Also, the effective induction of the flow is strongly affected by the imprecision of the above-mentioned devices. From the modelling point of view, this strongly calls for the introduction of a further source of randomness in the velocity field and for abandoning the classical deterministic setting of the problem.

Let us stress that the random component of the velocity field is unrelated to the stochastic nature of the noise in equation (1.1): while the Wiener process W models microscopic turbulences occurring in phase-separation, the random nature of u takes into account the imprecision of the flow-inducing mechanisms. For example, in typical situations u would satisfy a further stochastic equation involving a further Wiener process, independent of W. Clearly, this extra equation would specifically depend on the model in consideration: here, in order to make the treatment as general and light as possible, we only require u to be a stochastic process. Let us point out that this choice implies that the microscopic fluctuations in u coming from a possible further noise are not taken into account explicitly here. Indeed, the box constraint for the controls (see Sect. 2 below) only requires some general measurability and integrability conditions on u, and does not prescribe any specific requirement on the microscopic fluctuations of u. To fix ideas, the reader can naturally think about focusing only on macroscopic controls, e.g. controls which are sufficiently regular in time and in space, thus neglecting the microscopic turbulence in u.
Here, since the methodology can be directly adapted to more general controls, we preferred to consider a broader class of admissible controls, for the sake of mathematical generality. Allowing the control variable to be random is crucial when dealing with a controlled stochastic equation (see, for example, Yong and Zhou 1999). Indeed, bearing in mind the typical perspective of Monte Carlo simulations, restricting to deterministic controls would mean choosing a priori a control which is independent of the possible outcomes of the evolution according to the prescribed underlying probability space. By contrast, stochastic controls ensure more freedom from the point of view of the controller, as they allow the control to be adapted to the random outcomes of the phenomenon itself. With this in mind, in our analysis u will be a prescribed stochastic process satisfying some natural box-constraints, possibly taking into account the random imprecision of the velocity-inducing devices.

The model that we study then presents two main sources of randomness: the first one is given by the Wiener noise in equation (1.1), taking into account the microscopic turbulence affecting phase-separation, and the second one is the stochastic component of the convection term, modelling the imprecision of the stirring procedure. Hence, one can think of the two random forcings as acting on two separate levels: a microscopic scale described by W, and a different uncorrelated scale rendered by u.

The mathematical literature dealing with the Cahn–Hilliard equation is extremely developed. In the deterministic case, attention has been widely devoted to the study of well-posedness, regularity, long-time behaviour of solutions, and asymptotics. Due to the considerable size of the literature, we prefer to quote the detailed overview by Miranville (2019) and the references therein for completeness. Let us only point out the contributions (Colli et al. 2014; Cherfils et al. 2011; Gilardi et al.
2009) dealing with well-posedness and (Colli et al. 2015a, b, 2016; Hintermüller and Wegner 2012) in the direction of distributed and boundary control problems. Possible relaxations and asymptotics of the Cahn–Hilliard equation have been recently studied in Bonetti et al. (2017, 2018, 2020), Colli and Scarpa (2016), and Scarpa (2019a), also with nonlinear viscosity terms.

In the stochastic case, the original contribution dealing with the Cahn–Hilliard equation is Da Prato and Debussche (1996), on the existence of mild solutions in the case of polynomial potentials. Further studies have then been carried out in Cornalba (2016) and Elezović and Mikelić (1991), again in the polynomial setting, and in Scarpa (2018, 2020) in the case of more general potentials in a variational framework. The stochastic Cahn–Hilliard equation with logarithmic potential has been studied in Debussche and Zambotti (2007), Debussche and Goudenège (2011), and Goudenège (2009) in relation to reflection measures, and in Scarpa (2019) in the case of degenerate mobility. In the context of phase-field modelling with stochastic forcing, it is worth mentioning the contributions (Antonopoulou et al. 2016; Feireisl and Petcu 2019a, b), as well as (Bauzet et al. 2017; Bertacco 2020; Orrieri and Scarpa 2019) on the stochastic Allen–Cahn equation. In the direction of optimal control, we point out Scarpa (2019b), dealing with a distributed optimal control problem for the stochastic Cahn–Hilliard equation, and the recent work Orrieri et al. (2020) on a stochastic phase-field model for tumour growth.

Concerning specifically the Cahn–Hilliard equation with convection, in the deterministic case well-posedness has been studied in Colli et al. (2018a) under general choices of dynamic boundary conditions, and in Porta and Grasselli (2015) in a local version with reaction terms, while some related optimal velocity control problems have been analysed in Colli et al.
(2018b, 2019), Rocca and Sprekels (2015), and Zhao and Liu (2013, 2014). Also, the relationship between the behaviour of the convection term and phase-separation has been analysed in the recent work Feng et al. (2020): here, the authors show that if the velocity field is sufficiently mixing, then no phase-separation occurs, and the solutions of the respective advective Cahn–Hilliard equation converge exponentially to a homogeneous mixed state instead. This may have important connections to related optimal control problems with a target distribution at a final time: in particular, the above-mentioned result makes the optimisation problem meaningful also when the final target state is not necessarily separated, but is a homogeneous mixed state. Also, it points out how powerful the action of the convection term is on the phase-separation, and motivates the study of phase-optimisation problems where the control is the velocity itself. The convective Cahn–Hilliard equation has also been considered in coupled systems, with a further equation for the velocity field: this is the case, for example, of Cahn–Hilliard–Navier–Stokes systems, studied in Abels (2009) and Frigeri et al. (2016, 2019, 2020).

By contrast, despite its strong relevance in application to stochastic optimal velocity control, the stochastic convective Cahn–Hilliard equation on its own has not been analysed yet. The only results available in the stochastic setting deal with coupled systems, for example in the context of stochastic Cahn–Hilliard–Navier–Stokes models (Deugoué and Medjo 2018a, b; Medjo 2017). This paper constitutes a first contribution to optimal velocity control for the stochastic convective Cahn–Hilliard equation.

The literature on stochastic optimal control is also quite extensive: for a general overview we refer to the monograph Yong and Zhou (1999). Stochastic optimal control is also studied in Fuhrman et al. (2012, 2013, 2018), Fuhrman and Orrieri (2016), Guatteri et al.
(2017) in the context of the heat equation and reaction-diffusion systems. For completeness, we refer also to the works (Du and Meng 2013; Lü and Zhang 2014) concerning the stochastic maximum principle. Relaxation of the optimality conditions has been addressed in Brzeźniak and Serrano (2013) and Barbu et al. (2018) for dissipative SPDEs and the Schrödinger equation, respectively. Deterministic optimal control problems of stochastic reaction–diffusion equations have been analysed in Stannat and Wessels (2019).

Let us now describe the main points that will be addressed in this work. First of all, we concentrate on the well-posedness of the state system (1.1)–(1.4), where the control u is arbitrary but fixed. Using a Yosida approximation on the nonlinearity and a time-regularisation on the velocity field, we show existence and uniqueness of solutions by means of variational techniques and stochastic compactness arguments. Thanks to monotone analysis tools, we are able to cover very general potentials, not necessarily of polynomial growth. Also, we prove continuous dependence of the variables with respect to the control, and this allows us to define a suitable control-to-state map S.

Secondly, we focus on the optimisation problem, which consists in minimising a tracking-type cost functional subject to the state system (1.1)–(1.4) and to the constraint that u is an admissible control, meaning that u belongs to a suitable bounded, closed subset of a space of p-integrable progressively measurable processes. The cost functional involves a running target and a final target, together with nonnegative weights. Cost functionals in this form arise very naturally from applications.
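
For orientation, a quadratic tracking-type cost of the kind just described is commonly written as below. This is a sketch under assumed notation, since the original display is not reproduced here: \varphi is the state, u the velocity control, x_Q and x_T the running and final targets, \alpha_1, \alpha_2, \alpha_3 the nonnegative weights, D the spatial domain, and \mathbb{E} the expectation.

```latex
\begin{equation*}
  J(\varphi, u)
  = \frac{\alpha_1}{2}\,\mathbb{E}\int_0^T\!\!\int_D |\varphi - x_Q|^2 \,\mathrm{d}x\,\mathrm{d}t
  + \frac{\alpha_2}{2}\,\mathbb{E}\int_D |\varphi(T) - x_T|^2 \,\mathrm{d}x
  + \frac{\alpha_3}{2}\,\mathbb{E}\int_0^T\!\!\int_D |u|^2 \,\mathrm{d}x\,\mathrm{d}t .
\end{equation*}
```

In this form, the first two terms measure the distance of the state from the running and final targets, and the third term measures the energy spent on inducing the flow.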
Roughly speaking, the optimisation problem amounts to identifying the optimal way of stirring and mixing the fluid in such a way that the state variable is as close as possible to the running target during the evolution and to the final target at the end of the evolution, without wasting too much energy in inducing the flow u. As we have anticipated above, a typical example that we have in mind appears in the solidification process of silicon crystals in the context of industrial photovoltaic applications (Kudla et al. 2013; Rocca and Sprekels 2015). Here, a certain mixture of impurities needs to be moved by convection from within the silicon melt to its boundary, in order to refine the quality of the final silicon block. The flow of the fluid then behaves as a control on the silicon melt, in order to make the relative distribution of impurities close enough to some prescribed targets. In particular, the final target distribution of impurities can be seen here as concentrated on the boundary and diluted in the interior. Analogous applications arise more generally in optimal distribution problems of melting materials: the local distribution of some substance contained in the separating fluid is optimised close to some desired targets by inducing a flow in the material itself.

The starting point in the analysis consists in addressing existence of optimal controls. This is one of the main differences with respect to the deterministic optimal control problem. Indeed, in the deterministic setting existence of optimal controls follows with no particular effort from the direct method of the calculus of variations, since one is able to obtain enough compactness from the well-posedness of the state system and the boundedness of the set of admissible controls. By contrast, in the stochastic case these uniform estimates on the minimising sequence of controls do not ensure enough compactness in probability, due to the stochastic nature of the problem itself.
Also, classical stochastic tools that are usually employed to bypass this problem, such as the well-known criterion à la Gyöngy–Krylov, do not work here: this is due to the non-uniqueness of optimal controls, which is caused by the highly nonlinear nature of the minimisation problem. To overcome this issue, we propose instead a relaxed notion of optimality, which may be considered as optimality in law, i.e. requiring that the stochastic basis and the Wiener process are themselves part of the definition of optimal control. This technique mimics the definition of probabilistically weak solution for stochastic evolution equations, and has been employed in other settings such as Barbu et al. (2018) and Orrieri et al. (2020). In this framework, we prove existence of relaxed optimal controls, and we show that when one restricts attention to deterministic controls only, it is possible to get existence in the classical (probabilistically strong) sense.

We then move to the study of the differentiability properties of the control-to-state map S. More specifically, we prove that S is Gâteaux and Fréchet differentiable between suitable Banach spaces. This is done by showing well-posedness of the so-called linearised system, obtained from (1.1)–(1.4) by formally differentiating with respect to u, and by carefully proving that the unique linearised solution actually coincides with the derivative of S. This allows us to characterise explicitly, thanks to the chain rule in Banach spaces, the derivative of the reduced cost functional, so that the optimisation problem can be seen only in terms of the control u. Consequently, it is possible to obtain a first rudimentary version of necessary conditions for optimality, by imposing the classical first-order variational inequality on a given optimal control.

The last part of the paper aims at refining the first version of necessary conditions, by removing any explicit dependence on the linearised variables.
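
The first version of the necessary conditions described above follows a standard abstract pattern, sketched here under assumed notation: \hat{J} is the reduced cost functional, \bar{u} an optimal control, \mathcal{U}_{ad} the set of admissible controls (assumed convex, as is the case for box constraints), and DS(\bar{u})[h] the derivative of the control-to-state map in the direction h, i.e. the linearised state.

```latex
\begin{align*}
  \hat{J}(u) &:= J\big(S(u), u\big), \\
  D\hat{J}(\bar{u})[h]
    &= D_\varphi J\big(S(\bar{u}), \bar{u}\big)\big[ DS(\bar{u})[h] \big]
     + D_u J\big(S(\bar{u}), \bar{u}\big)[h], \\
  D\hat{J}(\bar{u})[v - \bar{u}] &\ge 0
    \qquad \text{for all } v \in \mathcal{U}_{ad}.
\end{align*}
```

The inequality still involves the linearised state DS(\bar{u})[v - \bar{u}] for every admissible v, which is the dependence that an adjoint formulation is designed to eliminate.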
This is done by introducing and studying a suitable adjoint problem, which is formally related to the dual problem of the linearised system. The adjoint problem consists of a backward-in-time stochastic partial differential equation, and its analysis is the most challenging point of the work. The first main difficulty is indeed the backward nature of the equation: although this is not a great limitation in deterministic problems, in the stochastic case it calls for the introduction of an extra variable, in order to preserve adaptedness of the processes in play, and requires different analytical techniques such as martingale representation theorems. The second and most crucial difficulty depends instead on the nonlinear nature of the system. Indeed, the presence of the nonlinear term and the dual structure of the equation prevent us from obtaining uniform estimates directly on the adjoint system. Consequently, well-posedness cannot be obtained classically by tackling the adjoint problem straightaway, and a different idea is needed.

In this regard, we use a duality method. We consider a more general version of the linearised system, where an arbitrary forcing term is added, and we show that this is well posed and that the solutions depend continuously on the forcing term. Then, we prove that such a system is in duality with the adjoint problem that we want to study, and this allows us to recover by comparison some first uniform estimates on the adjoint variables. This tool is extremely powerful, as it allows us to bound the adjoint variables without even working on the adjoint system itself: the main intuition behind this is that the linearised system is usually much simpler to study, and the duality between the linearised and adjoint systems allows us to “transfer” uniform bounds on the solutions from one problem to the other. Once these first crucial estimates are obtained, using classical techniques we are then able to prove well-posedness of the adjoint problem.
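
The duality idea described above has a simple finite-dimensional, deterministic analogue, sketched below as an illustration only. A 2x2 matrix `A` stands in for the linearised operator, its transpose for the adjoint operator, `g` for the arbitrary forcing term, and `f` for the adjoint data. All names are hypothetical, and the backward stochastic setting of the paper is far more involved than this toy case.

```python
def solve_2x2(a, rhs):
    """Solve the 2x2 linear system a @ x = rhs by Cramer's rule."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    if det == 0:
        raise ValueError("singular matrix")
    x0 = (rhs[0] * a[1][1] - a[0][1] * rhs[1]) / det
    x1 = (a[0][0] * rhs[1] - rhs[0] * a[1][0]) / det
    return [x0, x1]


def transpose(a):
    """Transpose of a 2x2 matrix given as nested lists."""
    return [[a[0][0], a[1][0]], [a[0][1], a[1][1]]]


def dot(x, y):
    """Euclidean inner product of two vectors of length 2."""
    return x[0] * y[0] + x[1] * y[1]


A = [[2.0, 1.0], [0.5, 3.0]]  # stand-in for the linearised operator
g = [1.0, -2.0]               # arbitrary forcing of the "linearised" problem
f = [0.7, 0.3]                # data of the "adjoint" problem

theta = solve_2x2(A, g)             # linearised problem: A theta = g
p = solve_2x2(transpose(A), f)      # adjoint problem:    A^T p = f

# Duality relation: f . theta = (A^T p) . theta = p . (A theta) = p . g
lhs = dot(f, theta)
rhs = dot(p, g)
```

Because the relation holds for every forcing `g`, a bound on `theta` in terms of `g` translates into a bound on `p` in terms of `f`, without estimating the adjoint problem directly. This is the mechanism by which bounds are transferred from the linearised system to the adjoint one.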
Lastly, the duality relation is employed to refine the first-order conditions for optimality and to write them as a variational inequality only depending on the intrinsic adjoint variables.

The main novelty of the work is the presence of two sources of randomness in equation (1.1), accounting for noises both in the phase-separation process and in the flow-inducing procedure. As interesting as it may be from the applied point of view, this novel framework certainly does not come without effort on the mathematical side. Indeed, let us stress that the fact that u is assumed to be a stochastic process, and not a deterministic function, causes several non-trivial issues in estimating the solutions: this is due to a lack of satisfactory computational tools of Gronwall type in the genuinely stochastic case. Such difficulties are evident especially in the study of the forward problems, i.e. in the state system (1.1)–(1.4) and in the corresponding linearised system. Here, the idea is to argue instead by carefully combining the Hölder inequality and several iterative patching arguments, in order to avoid applying the Gronwall lemma, which does not work. In the adjoint problem, the situation is slightly better: we will show that the backward nature of the equation indeed allows us to use a very general and recent backward-in-time version of the stochastic Gronwall lemma (see Lemma 6.1).

We conclude by summarising here the structure of the paper. Section 2 contains the description of the setting of the work, the precise assumptions, and the main results that we prove. In Sect. 3, we prove well-posedness of the state system, while Sect. 4 focuses on the existence of optimal controls. Then, in Sects. 5 and 6, we study the linearised system and the adjoint system, respectively. Finally, in Sect. 7, we prove the two versions of the first-order conditions for optimality.

Setting and Assumptions

In this section, we specify the general setting, notation, and assumptions of the work. We then present the main results of the paper.

We work on a filtered probability space satisfying the usual conditions, where T > 0 is a fixed final time and W is a cylindrical Wiener process on a separable Hilbert space K. For convenience, let us fix now once and for all a complete orthonormal system of K. We also use the progressive σ-algebra on the product of the sample space with [0, T].

As far as notation is concerned, the dual of a given real Banach space E is denoted by E*, and we use the corresponding duality pairing between E* and E. Weak convergence in E and weak* convergence in E* are each denoted by their own symbol. Also, for every exponent q we employ the usual symbols for the spaces of q-Bochner integrable functions with values in E, over the sample space and over the time interval, and for the spaces of strongly and weakly continuous functions from [0, T] to E, respectively. For spaces of stochastic processes, the notation further specifies that measurability is also intended with respect to the progressive σ-algebra. For suitable exponents q and separable E, we explicitly define a further space as the dual of the corresponding Bochner space, which we recall can be characterised (Edwards 1965, Thm. 8.20.3) as the space of weak*-measurable random variables with finite q-moment in E*. Finally, for a pair of separable Hilbert spaces, we use a dedicated notation for the space of Hilbert–Schmidt operators from the first to the second. The symbol c is reserved to denote any generic positive constant, whose value depends on the structure of the problem and may be updated from line to line in the proofs.

The spatial domain is assumed to be smooth and bounded. We use the classical notation for the space–time cylinder over [0, T] and for its truncations at intermediate times. The outward unit normal vector on the boundary is denoted by its own symbol. We introduce the basic functional spaces, among them the Hilbert space H, each endowed with its natural norm.
We identify H with its dual, so that the functional spaces above are related by continuous and dense inclusions. For every element y of the relevant dual space, we use a dedicated notation for the spatial mean of y, and define the corresponding subspaces of zero-mean elements. Let us recall that the variational formulation of the Laplace operator with Neumann conditions is a well-defined linear operator, and its restriction to zero-mean elements is an isomorphism onto the corresponding zero-mean dual space. Its inverse is the resolvent operator associated with the abstract elliptic problem on the domain with homogeneous Neumann conditions, meaning that it maps every zero-mean datum to the unique solution with null mean of the corresponding elliptic problem. As a consequence of the Poincaré–Wirtinger inequality, it is immediate to check that this inverse operator yields an equivalent norm on the dual space. In particular, a compactness inequality follows.

We introduce a space of vector fields defined through a condition on the divergence, where the divergence is intended in the sense of distributions on the domain. The space of velocity controls that we focus on is built on this space. Let us note that this includes as a special case the choice of deterministic controls, which has also received strong mathematical interest on its own: see, for instance, Stannat and Wessels (2019). Indeed, a corresponding space of deterministic controls can be set accordingly.

The following assumptions on the problem will be in force throughout the paper.

A1. The double-well potential belongs to a prescribed regularity class and satisfies growth conditions expressed in terms of a constant and an exponent. Let us point out that the classical polynomial double-well potential satisfies these assumptions for a specific value of the exponent. Nonetheless, by allowing also smaller values we are able to include possibly more singular potentials, such as first-order exponentials. A suitable modification of the derivative of the potential is a nondecreasing function; hence, it can be identified with a maximal monotone (single-valued) graph. We also consider the associated convex lower semicontinuous function.

A2. Two further conditions are imposed on the data of the problem.

A3. The noise coefficient B satisfies a structural condition involving a constant, together with a further prescription. Let us note that in the case of additive noise these conditions are trivially satisfied under mild requirements depending on the space dimension: in particular, the classical polynomial case in dimension two and three is always covered. In the genuine multiplicative noise case, i.e.
when B is not constant in the phase variable, we also suppose that B takes values in a space of zero-mean elements: this amounts to requiring that the noise is conservative, in the sense that it preserves the mean of the phase-variable. A direct consequence is the conservation of mass, which is a fundamental feature of Cahn–Hilliard-type evolutions. This hypothesis on the noise is very classical and natural in the literature: for example, let us stress that a relevant multiplicative choice of B can be given in terms of a sequence of coefficient functions satisfying a suitable summability condition. It is not difficult to show that this example is admissible for all values of the exponent in every space dimension under consideration.

In the context of optimal velocity control, it will be useful to introduce a polynomial-growth assumption on the potential. This will be necessary only in the study of the optimisation problem, and is not needed for the well-posedness of the state system.

C1. The exponent in A1 takes a prescribed value, and a polynomial-growth bound holds for the potential.

Such a requirement is very natural in the Cahn–Hilliard context, since it is satisfied by the classical choice of the polynomial double-well potential of degree 4.

The first main result of the paper states existence and uniqueness of strong solutions, and their continuous dependence with respect to the velocity field.

Theorem 2.1

Assume A1–A3. Then, for every velocity field u in the space of controls, there exists a unique pair, formed by the phase variable and the chemical potential, enjoying suitable regularity and satisfying the variational formulation of the state system (1.1)–(1.4) for every time in [0, T], almost surely. Furthermore, there exists a constant, only depending on the structure of the problem, such that the solution associated with any such u satisfies a uniform estimate, and the solutions associated with any two such velocity fields satisfy a continuous dependence estimate. Lastly, if also C1 holds, then the solution enjoys additional regularity.

Once the analysis of well-posedness of the state system has been addressed, we can turn our attention to the optimal velocity control problem. As far as the controls are concerned, we consider classical box-constraints on the velocity controls, defining the set of admissible controls in terms of a prescribed constant L. The prescription of a box-constraint on the admissible controls is classical on the mathematical side. In applications, the constant L is typically related to the maximum capacity of the flow-inducing devices that convey the velocity field. It will be useful to introduce an enlarged bounded open set containing the set of admissible controls. Analogously, we introduce the corresponding sets of admissible deterministic controls.

The cost functional J that we study is of quadratic tracking-type: its weights are non-negative constants, and its targets are fixed in suitable spaces. The optimal velocity control problem consists in the following:

(CP) minimise the cost functional J with the constraints that u belongs to the set of admissible controls and that the phase variable is the unique corresponding solution component to the state system (1.1)–(1.4).

By virtue of the well-posedness Theorem 2.1, the control-to-state map S is well defined. This implies that the optimal control problem can be reduced to the only variable u, by introducing the so-called reduced cost functional, obtained by composing J with S.

Remark 2.2

Clearly, the well-posedness result in Theorem 2.1 continues to hold on any new stochastic basis, provided that the new spaces of controls and the new sets of admissible controls are defined analogously. Hence, if some new targets are also given on the new probability space with the same laws as the original ones, one can define the corresponding cost functional, the corresponding control-to-state map, and the new reduced cost functional on the new probability space, by simply replacing the original stochastic basis and targets with the new ones.

With this notation, we can state the exact definition of optimal control as follows. As anticipated, we also give some relaxed notions of optimality, one based on the concept of optimality-in-law and the other obtained by minimising only over the deterministic controls.

Definition 2.3

An optimal control for (CP) is an admissible control at which the reduced cost functional attains its minimum over the set of admissible controls.

A relaxed optimal control for (CP) is a family consisting of a probability space, a filtration on it satisfying the usual conditions, a K-cylindrical Wiener process on it, targets having the same laws as the original running and final targets, respectively, and an admissible control on the new stochastic basis that is optimal for the corresponding reduced cost functional.

A deterministic optimal control for (CP) is an admissible deterministic control that minimises the reduced cost functional over the set of admissible deterministic controls.

Our first result in the analysis of the optimisation problem (CP) concerns the existence of optimal controls. It is worth noting that, due to the non-uniqueness of optimal controls, in the genuinely stochastic case one can only show existence of relaxed optimal controls: this is typical in highly nonlinear stochastic optimal control problems, see, for example, Barbu et al. (2018) and Scarpa (2019b). By contrast, we show that deterministic optimal controls always exist.

Theorem 2.4

Assume A1–A3. Then, there exist a relaxed optimal control and a deterministic optimal control for problem (CP). Once existence of minimisers for (CP) is proved, we can now turn to the main focus of the work, i.e. the investigation of necessary conditions for optimality. The first main step in this direction is the study of the differentiability of the control-to-state map S, along with the characterisation of its derivative through the analysis of the linearised state system. This will allow to obtain a first version of the first-order conditions for optimality by means of a suitable variational inequality involving the derivative of the reduced cost functional. In this direction, we introduce the assumptions The linearised system can be formally obtained by differentiating the state system (1.1)–(1.4) with respect to the control in a given direction , and readsThe next result ensures exactly that the linearised system (2.6)–(2.9) is well posed in a suitable variational sense, and that the unique solution to (2.6)–(2.9) coincides with the derivative of the control-to-state map S in the point along the direction . the map is of class . Let us point out that this implies together with A3 that for all . Moreover, let us stress this requirement is very natural, and it is satisfied, for instance, in the relevant example described in A3, provided to replace with . is of class , , and it holds that This is a refinement of assumptions C1–C2 and ensures, as we will see, better differentiability properties for S. Still, C3 is satisfied by the polynomial potential and the relevant noise coefficient described in A3, provided to replace with .

Theorem 2.5

Assume A1–A3, C1–C2, and . Then, for all and , setting , there exists a unique pair with such that, for every , -almost surely,Furthermore, the control-to-state map is Gâteaux-differentiable in the following sense: for all and , as , it holds that Moreover, if and C3 holds, then is also Fréchet-differentiable as a map The second step in the analysis of necessary conditions for optimality consists in studying the so-called adjoint system and by proving a suitable duality relation with respect to the linearised system. The adjoint system can be formally obtained as the dual system of (2.6)–(2.9), and readsLet us point out that the adjoint system is backward in time: due to the stochastic framework of the problem, this necessarily requires the introduction of the additional variable Z in view of the classical martingale representation theorems. The situation here is then much more complex than the deterministic one: the variable of the adjoint system is indeed the couple (P, Z), with being an auxiliary variable. Due to the difficulty of analysis of the adjoint system, we will need to require more regularity on the targets, namely The next result ensures that the adjoint system (2.10)–(2.13) is well posed in a suitable variational sense, and state a duality relation between (2.6)–(2.9) and (2.10)–(2.13). and it holds that

Theorem 2.6

Assume A1–A3, C1–C2, and C4. Then, for all , setting , there exists a triplet , withsuch that, for every , -almost surely,Furthermore, the solution components , , and are unique in the spaces , , and , respectively. At this point, we are finally ready to state the necessary conditions for optimality: more specifically, we present here two different versions. The first one is deduced directly by the characterisation of the derivative of in Theorem 2.5, and consists of a variational inequality depending also on the linearised variables. The second one is a refinement of this, as it employs the adjoint problem and only depends on the intrinsic adjoint variables , not on the linearised ones.

Theorem 2.7

Assume A1–A3, C1–C2, and . If is an optimal control for (CP) and is its respective optimal state, thenwhere is the unique first solution component of the linearised system (2.6)–(2.9) with the choice , in the sense of Theorem 2.5.

Theorem 2.8

Assume A1–A3, C1–C2, and C4. If is an optimal control for (CP) and is its respective optimal state, thenwhere is the uniquely determined solution component of the adjoint system (2.10)–(2.13) in the sense of Theorem 2.6. In particular, if , then is the orthogonal projection of on the closed convex set in the Hilbert space .

Remark 2.9

Let us comment on the necessary condition for optimality. When handling the optimisation problem in practice, the main role of condition (2.15) is to restrict the class of possible candidates for optimal controls. Roughly speaking, the optimisation analysis begins with the identification of some natural candidates for the role of optimal controls. Secondly, for such controls the forward and the backward systems are solved, so that the respective variables and are identified. Finally, if condition (2.15) is not met, then the candidate is discarded from the analysis; otherwise it is retained. Nonetheless, let us stress again that condition (2.15) is only a necessary requirement, and can only help to restrict the class of potential optimal controls. In order to further refine the analysis, sufficient conditions for optimality should be investigated. The mathematical idea behind this is very natural: if the reduced cost functional can be shown to be twice (Fréchet or Gâteaux) differentiable, then any control satisfying the first-order stationary condition (2.15) and the positive definiteness condition is an optimal control. Such second-order analysis is extremely challenging, and to the best of the author’s knowledge, it has been performed so far only in relation to some selected optimal control problems in the deterministic setting (Colli et al. 2015b; Colli and Sprekels 2015). In the stochastic case, the second-order analysis is open and is currently being investigated in a work in preparation.
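The screening procedure described in this remark can be illustrated on a finite-dimensional toy problem. The sketch below is not the stochastic problem (CP): it assumes a quadratic cost 1/2 |Au - y|^2 + alpha/2 |u|^2 on a componentwise box [-bound, bound]^n, and every name in it (`project_box`, `gradient`, `first_order_gap`, the data `A`, `y`, `alpha`) is hypothetical. The gradient plays the role that the adjoint-based expression plays in (2.15), and a candidate passes the test when the minimum of gradient·(v - u) over the box is non-negative.

```python
# Toy analogue (assumption): box-constrained quadratic problem in R^n.
# This only mimics the structure of a first-order variational inequality.

def project_box(u, bound):
    """Orthogonal projection onto the box [-bound, bound]^n (componentwise clipping)."""
    return [max(-bound, min(bound, ui)) for ui in u]

def gradient(u, A, y, alpha):
    """Gradient of J(u) = 1/2 |A u - y|^2 + alpha/2 |u|^2."""
    residual = [sum(A[i][j] * u[j] for j in range(len(u))) - y[i]
                for i in range(len(y))]
    return [sum(A[i][j] * residual[i] for i in range(len(y))) + alpha * u[j]
            for j in range(len(u))]

def projected_gradient(A, y, alpha, bound, step, n_iter=2000):
    """Projected gradient iteration u <- P(u - step * grad J(u)), started at 0."""
    u = [0.0] * len(A[0])
    for _ in range(n_iter):
        g = gradient(u, A, y, alpha)
        u = project_box([ui - step * gi for ui, gi in zip(u, g)], bound)
    return u

def first_order_gap(u, g, bound):
    """Minimum over v in the box of g . (v - u); attained at v_i = -bound * sign(g_i)."""
    return sum(-bound * abs(gi) - gi * ui for ui, gi in zip(u, g))

A = [[2.0, 0.0], [0.0, 1.0]]
y = [4.0, 0.3]
alpha, bound = 0.5, 1.0

u_star = projected_gradient(A, y, alpha, bound, step=0.2)
gap_star = first_order_gap(u_star, gradient(u_star, A, y, alpha), bound)

u_bad = [0.0, 0.0]  # a candidate that should be discarded
gap_bad = first_order_gap(u_bad, gradient(u_bad, A, y, alpha), bound)
```

Because this toy problem is separable, its minimiser can be computed by hand as (1, 0.2): the first component is clipped at the bound, the second is interior. The candidate `u_bad` yields a negative gap and is therefore rejected, which is exactly the filtering role the remark assigns to the necessary condition. The admissible set in the paper may carry additional constraints that this sketch does not model.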

Well-posedness of the State System

This section is devoted to the proof of Theorem 2.1 about well-posedness of the state system.

Uniqueness

Let and let us denote by any respective solutions to (1.1)–(1.4) in the sense of Theorem 2.1. Let us set for brevity of notation , , : then we havewhere the equality is intended in the usual variational sense of Theorem 2.1. Taking as test function yields directly by assumption A3 that , so that actually and . Hence, Itô’s formula for the function yieldsNow, the mean value theorem and assumption A1 givewhile the inclusion , the Hölder and the Poincaré–Wirtinger inequalities yield Furthermore, assumption A3 ensures thatUsing the compactness inequality (2.1) and rearranging the terms, we are left withOn the right-hand side, we have, by the Hölder inequality in time,and, thanks to the Burkholder–Davis–Gundy and the Young inequalities, assumption A3, and again the compactness inequality (2.1),Consequently, taking power p/2 at both sides of (3.1) and rearranging the terms yieldHence, settingwe getSince is independent of the initial time, we can iterate the procedure and close the estimate on each subinterval for all until : summing up, noting that the number of such subintervals is less than , and renominating c independently of , we get thenfrom which uniqueness of solutions follows.

Approximation

We turn now to existence of solutions. First of all, for every let be the Yosida approximation of and be the Moreau–Yosida regularisation of , which are defined, respectively, as:Let us recall that is -Lipschitz continuous, is convex and quadratic at , and as it holds that and for all . For further details about the properties of and , we refer to the monograph (Barbu 2010, Ch. 2). We define the approximated double-well potential as:so that in particular we have for . Secondly, we definewhere is a classical non-anticipative sequence of mollifiers in time. In particular, let us point out that it holdsThe approximated system is obtained by replacing with and with in (1.1)–(1.4):We formulate (3.2)–(3.5) in an abstract way aswhere the variational operatorsare defined as:andSince is Lipschitz-continuous, it is not difficult to show (see, for example, Scarpa 2018, Lem. 3.1) that is weakly monotone, weakly coercive, and linearly bounded, in the sense that there are two constants such thatandAs far as the convection operator is concerned, since , thanks to the divergence theorem we haveand, thanks to the Hölder inequality and the inclusion ,Hence, the operator is weakly monotone, weakly coercive, and linearly bounded. Besides, due to the Lipschitz-continuity of and the regularity of , it is immediate to check that it is also hemicontinuous. Moreover, assumption A3 ensures that is Lipschitz-continuous. It follows then by the classical variational approach to SPDEs by Pardoux (1975) and Krylov and Rozovskiĭ (1979) that the evolution equation (3.6) admits a unique variational solutionLet us set as the approximated chemical potential.
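To make the regularisation step concrete, here is a minimal numerical sketch of the Yosida approximation and the Moreau–Yosida regularisation, following the standard definitions (see Barbu 2010, Ch. 2): with resolvent J_lam = (I + lam*beta)^{-1}, one sets beta_lam = (I - J_lam)/lam and beta_hat_lam(r) = lam/2 * |beta_lam(r)|^2 + beta_hat(J_lam(r)). The choice beta(r) = r^3 with primitive r^4/4 is an assumption made purely for illustration, as are all function names; the nonlinearity treated in the paper is the one fixed by assumption A1.

```python
# Illustrative monotone nonlinearity (assumption): beta(r) = r**3.

def beta(r):
    return r ** 3

def beta_hat(r):
    """Convex primitive of beta with beta_hat(0) = 0."""
    return r ** 4 / 4.0

def resolvent(r, lam, tol=1e-14, max_iter=200):
    """Solve s + lam * beta(s) = r by bisection.

    The map s -> s + lam * beta(s) is strictly increasing and vanishes at 0,
    so the unique root lies between 0 and r.
    """
    lo, hi = (0.0, r) if r >= 0 else (r, 0.0)
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if mid + lam * beta(mid) < r:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

def yosida(r, lam):
    """Yosida approximation beta_lam(r) = (r - J_lam(r)) / lam."""
    return (r - resolvent(r, lam)) / lam

def moreau_yosida(r, lam):
    """Moreau-Yosida regularisation lam/2 * beta_lam(r)**2 + beta_hat(J_lam(r))."""
    j = resolvent(r, lam)
    return 0.5 * lam * ((r - j) / lam) ** 2 + beta_hat(j)
```

In this sketch one can observe the properties recalled in the text: `yosida(., lam)` is Lipschitz with constant 1/lam, it is dominated in absolute value by `beta`, and `yosida(r, lam)` approaches `beta(r)` as `lam` decreases to 0, while `moreau_yosida(r, lam)` stays below `beta_hat(r)`.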

Uniform Estimates

Itô’s formula for the square of the H-norm yieldsNow, on the left-hand side, we have, thanks to the monotonicity of ,Also, by the Hölder inequality and the inclusion , it holdsThanks to the elliptic regularity theory for the Neumann problem (see, for example, Brezis 2011, §9.6) there is independent of such that for every : consequently, renominating c and using the Young inequality we getFurthermore, noting that since , assumption A3 yieldsPutting this information together and using assumption on the right-hand side we get, possibly updating the value of c, Taking now power p/2 at both sides, the stochastic integral on the right-hand side can be treated again thanks to A3, using classical computations based on the Burkholder–Davis–Gundy inequality (see, for example, Marinelli and Scarpa 2018, Lem. 4.3). Consequently, the same iterative argument used in Sect. 3.1 ensures thatIn order to deduce further estimates on and , we rely on the free-energy estimate. Namely, we consider the approximated energyClearly, is well defined and of class in , with derivativeso that in particular we have . Moreover, the Lipschitz-continuity of ensures that is actually Fréchet-differentiable withNow, we would like to write Itô’s formula for : in order to do this, we need to show first that and enjoy more regularity. This can be shown by performing a further approximation on the problem (for example, the classical Faedo–Galerkin approximation of the abstract evolution equation (3.6)). Indeed, by the classical variational theory on stochastic evolution equations (Liu and Röckner 2015), there is a sequence of finite-dimensional subspaces of H, included in and with dense in H, such that, setting as the orthogonal projection onto , the unique solution of the finite-dimensional systemsatisfy, as ,At this point, the finite-dimensional Itô formula for yieldsfor every , -almost surely. We show now uniform estimates on the terms on the right-hand side, independent of both and n. 
These will show a posteriori that and are actually more regular. For this reason and for brevity of notation, we omit from now on the dependence on n and refer to (Scarpa 2018, 2020) for more detail. To this end, noting that the definition of and assumption A1 implyon the left-hand side, we getOn the right-hand side, thanks to the Hölder and Young inequalities, the inclusion , and the estimate (3.7), proceeding as in Sect. 3.1, we haveMoreover, assumptions A3 and A1 yield, together with the Hölder inequality and (3.7),Finally, the Burkholder–Davis–Gundy and the Poincaré–Wirtinger inequalities give, together with assumption A3,for every , where we have updated the value of c and step-by-step, independently of . Putting all this information together, choosing sufficiently small, rearranging the terms, and updating again the value of c, we infer thatConsequently, we can close the estimate on a certain subinterval , where is chosen sufficiently small in order to incorporate the terms on the right-hand side into the corresponding ones on the left. Also, a patching argument as in Sect. 3.1 allows then to extend the estimate to the whole interval [0, T], and we obtainwhich by comparison in and estimate (3.7) gives alsoFinally, note that by assumption A3 and the estimate (3.8), we haveso that the classical result by Flandoli and Gatarek (1995, Lem. 2.1) ensures in particular thatConsequently, by comparison in (3.2), it is not difficult to check thatNow, recalling that , for all arbitrary we have that , so that the usual Sobolev embeddings ensure thatand we deduce that

Passage to the Limit

From the estimates (3.7)–(3.9), there exists a pair , withsuch that, as , on a non-relabelled subsequence we haveNow, since , we can fix , so that : with this choice, by the classical Aubin–Lions–Simon compactness results (Simon 1987, Cor. 5) we haveHence, setting as the closed ball of radius n in , we have that is compact in , for every . Consequently, denoting by the law of on for brevity, the Markov inequality and the uniform estimates (3.7), (3.8), and (3.11) yieldfrom whichBy the Prokhorov theorem, this implies thatSimilarly, estimate (3.10) ensures by the same argument thatLet us show now that, possibly on a further subsequence, we have also the strong convergenceTo this end, we use the following lemma due to Gyöngy and Krylov (1996, Lem. 1.1), which characterises the convergence in probability in a Polish space.

Lemma 3.1

Let be a Polish space and be a sequence of -valued random variables. Then, converges in probability if and only if for any pair of subsequences and , there exists a joint sub-subsequence converging in law to a probability measure on such that . We apply this lemma to and . Given two arbitrary subsequences and , since the laws of the pairs are tight on , there is a joint subsequence converging weakly to a probability measure on . By the Skorokhod representation theorem (Ikeda and Watanabe 1989, Thm. 2.7) and (van der Vaart and Wellner 1996, Thm. 1.10.4, Add. 1.10.5), there exist a new probability space and measurable maps , such that for every and for some measurable random variablesSimilarly, we havefor some measurable random variablesandNow, since in -almost surely on the whole sequence , for every arbitrary we havefrom which -almost surely due to the arbitrariness of f. Let us set then and : since the maps preserve the laws, from the uniform estimates (3.7)–(3.9) we deduce also thatfor some measurable random variablesNow, if we introduce the filtration as:using classical representation theorems for martingales (see Flandoli and Gatarek 1995 and Da Prato and Zabczyk 2014, § 8.4) we have that is a cylindrical Wiener process on andso that on the new probability space we havewhere the equations are intended in the usual variational sense (3.6). Now, the strong convergences of imply, together with the Lipschitz-continuity of B, that Introducing then the limiting filtration asa classical argument based again on the martingale representation theorem (see Flandoli and Gatarek 1995 and Da Prato and Zabczyk 2014, § 8.4) yields the identificationMoreover, the strong convergences of together with the uniform estimate (3.9) on the nonlinearities also givePutting all this information together, we deduce that solves the limit problem (1.1)–(1.4) in the sense of Theorem 2.1 on the new probability space , namelySince we have already proved uniqueness of solutions in Sect. 
3.1, we deduce that so that Lemma 3.1 ensures the strong convergence (3.12) also on the original probability space . Proceeding now in exactly the same way on instead, it is a standard matter to show that is the unique solution to the state system (1.1)–(1.4). Clearly, the global estimate (2.2) follows directly from the computations in Sect. 3.3 and assumption A3.

Continuous Dependence

Here we conclude the proof of Theorem 2.1 by showing the continuous dependence estimates (2.3)–(2.4). First of all, (2.3) is a consequence of the already proved (2.2) and Sect. 3.1. Now, let us focus on proving (2.4). To this end, we use the same notation of Sect. 3.1 and use Itô’s formula for the square of the H-norm instead, gettingThe third term on the left-hand side can be handled thanks to assumption A1, the Hölder and Young inequalities, and the embedding , asThe convection terms on the right-hand side can be treated similarly using the divergence theorem, the Hölder and Young inequalities, and the inclusion asHence, we rearrange the terms and take power p/6 at both sides, obtaining, thanks to the Hölder and Young inequalities,where the Burkholder–Davis–Gundy inequality and the Lipschitz-continuity of B yieldfor all . Hence, choosing sufficiently small and rearranging the terms, the continuous dependence (2.4) follows from the already proved estimates (2.2)–(2.3). This concludes the proof of Theorem 2.1.

Existence of Optimal Controls

In this section, we prove Theorem 2.4 showing that the optimisation problem (CP) always admits a relaxed optimal control and a deterministic optimal control . The main idea is to use the direct method from calculus of variations, combined with a stochastic compactness argument. Let be a minimising sequence for the functional , in the sense that and define as the unique respective solutions to the state system (1.1)–(1.4), in the sense of Theorem 2.1. Thanks to the definition of and the estimate (2.2), we deduce that there exist and a triplet with such that, as , possibly on a subsequence, Assumption A3 and the uniform estimates on ensure also that so that in particular By comparison in the equation (1.1), we infer then which ensures that the laws of are tight on the space . We argue now along the same lines as in Sect. 3.4. As a consequence of the Skorokhod theorem, there is a probability space and measurable maps with for all , such that Furthermore, on the new probability space we have where the stochastic integral is intended with respect to a suitably defined filtration . Proceeding as in Sect. 3.4, we infer that so that by assumption A3 and the martingale representation theorem we can pass to the limit as on the new probability space and get This shows that and that . To conclude that is a relaxed optimal control for the optimisation problem (CP), we note that by the weak lower semicontinuity of the cost functional J we have so that is a relaxed optimal control in the sense of Definition 2.3. In order to show existence of a deterministic optimal control, the argument is similar. We start by taking a minimising sequence such that Arguing exactly as above, thanks to the fact that are deterministic, in this case we have that for every . Consequently, in this case we can inherit some strong compactness properties on the original probability space, using a similar argument to the one of Sect. 3.4, by employing Lemma 3.1. Namely, we infer the strong convergence on the original probability space . It follows then that almost everywhere, and letting yields so that . At this point, the conclusion follows as above by lower semicontinuity of the cost functional.

Linearised System and Differentiability of the Control-to-State Map

The aim of this section is to prove that the linearised state system (2.6)–(2.9) is well posed and to characterise its solution as the derivative of the control-to-state map. Namely, we prove here Theorem 2.5.

Existence

Let and be arbitrary and fixed. Using the notation of Sect. 3.2, we consider the approximated linearised problemNoting that , the classical variational approach ensures existence and uniqueness of the approximated solutionin the sense that, for every , for every , -almost surely,Noting that , we can write Itô’s formula for , gettingNow, assumption A1, the Hölder–Young inequalities and the compactness inequality (2.1), and the embedding give, for all ,Similarly, by C2 and again the compactness inequality (2.1), we haveAs for the stochastic integral, the Burkholder–Davis–Gundy and Young inequalities give (see, for example, Marinelli and Scarpa 2020, Lem. 4.1), together with (2.1) and C2Consequently, using the same iterative-patching argument of Sect. 3.1, raising to power p/2, taking supremum in time and expectations, we infer thatNow, Itô’s formula for yieldswhere by the divergence theorem we haveHence, it is not difficult to see that, using again the Hölder, Young and Burkholder–Davis–Gundy inequalities, assumption C2, and the estimate (5.6), all the terms on the right-hand side can be handled, except the one containing . For this one, we proceed using C1, the embedding , aswhere, thanks to (5.6) and the Hölder inequality,Consequently, we deduce thatfrom which, by comparison in (5.2),We infer the existence of withsuch that, as (possibly on a subsequence),Since the systems (5.1)–(5.4) and (2.6)–(2.9) are linear, the passage to the limit is straightforward. Indeed, by assumption C2 and the dominated convergence theorem, it follows thatMoreover, thanks to C1 and the regularity of , we have , so in particularand also, thanks to (5.10),We deduce that letting in (5.5) we get that is a solution to (2.6)–(2.9) in the sense of Theorem 2.5. The strong continuity in H of follows a posteriori with a classical method by Itô’s formula on the limit equation (2.6). We show here that the linearised system (2.6)–(2.9) admits at most one solution. 
By linearity, it is enough to check that if is a solution to (2.6)–(2.9) in the sense of Theorem 2.5 with , then . To this end, we note that (2.6) yields , so that Itô’s formula gives Now, we can argue along the same lines as in Sect. 5.1 by using assumption A1 on , C2 on DB, together with the Burkholder–Davis–Gundy and Young inequalities to get from which , and also by comparison in (2.7). This shows that the linearised system (2.6)–(2.9) admits at most one solution.

Gâteaux-Differentiability

We prove here that is Gâteaux-differentiable. Let and be arbitrary and fixed: since is open in , there exists such that for all . For every such , setting and , the difference of the respective equations (for ) giveswhose natural variational formulation readsNow, by the continuous dependence estimate (2.4), we deduce that there exists a constant independent of such thatso that there exist withsuch that, as possibly on a subsequence,It follows in particular thatFurthermore, since , by the inclusion , the Hölder inequality, and the convergence (5.14), it holds thatAs far as the nonlinear term is concerned, thanks to the mean-value theorem we haveNow, by the strong convergence (5.16) and the continuity of , we havewhere, recalling that by C1 has quadratic growth, thanks to the embedding the left-hand side is uniformly bounded in the space , so thatfor every and . Taking (5.14) into account, we infer in particular thatfor every and . Similarly, thanks to C1 and the regularity of , we have , and the same argument as above yieldsfor every and . It follows thatLastly, let us handle the stochastic integral. By the Lipschitz-continuity of B in A3, we haveNow, the strong convergence (5.16), the continuity and boundedness of DB in C2 imply together with the dominated convergence theorem thatfor every . Since is bounded in by interpolation of (5.13)–(5.14), it follows thatfor every . Similarly, by the boundedness of DB in C2 and the convergence (5.14), we have alsoHence, we obtain thatFinally, letting in (5.12) using convergences (5.13)–(5.19), we deduce that actually is the unique solution of the linearised system (2.6)–(2.9) in the sense of Theorem 2.5. It remains to show now the strong convergence of . To this end, note that by the Lipschitz-continuity of B in A3 and (5.14), we havefrom which, thanks to the classical result (Flandoli and Gatarek 1995, Lem. 
2.1) we get By comparison in the equation (5.12) and the estimates proved above, we infer then that Now, recalling that by (Simon 1987, Cor. 5), we have so that the laws of are tight on . By using again Lemma 3.1 together with the uniqueness of the limit problem at , proceeding as in Sect. 3.4, we also get the strong convergence which in turn yields, together with (5.14), the strong convergence of Theorem 2.5. This proves that is Gâteaux-differentiable, and its derivative is a solution to the linearised system, in the sense of Theorem 2.5.
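The statement just proved, namely that difference quotients of the control-to-state map converge to the solution of the linearised system, can be checked numerically on a toy model. The sketch below is a deterministic scalar analogue and an assumption throughout: the state equation x' = x - x^3 - u x (a cubic nonlinearity with a control multiplying the state, loosely mimicking a convection term), its explicit Euler discretisation, and all names and parameter values are hypothetical and are not taken from the paper.

```python
import math

def solve_state(u, x0, dt):
    """Explicit Euler for x' = x - x**3 - u*x; returns the whole trajectory."""
    xs = [x0]
    for uk in u:
        x = xs[-1]
        xs.append(x + dt * (x - x ** 3 - uk * x))
    return xs

def solve_linearised(u, h, xs, dt):
    """Explicit Euler for theta' = (1 - 3*x**2 - u)*theta - h*x, theta(0) = 0.

    This is the state equation differentiated with respect to the control
    in the direction h; the function returns theta at the final time.
    """
    theta = 0.0
    for uk, hk, x in zip(u, h, xs):
        theta = theta + dt * ((1.0 - 3.0 * x ** 2 - uk) * theta - hk * x)
    return theta

def difference_quotient(u, h, x0, dt, eps):
    """(S(u + eps*h) - S(u)) / eps, evaluated at the final time."""
    perturbed = [uk + eps * hk for uk, hk in zip(u, h)]
    return (solve_state(perturbed, x0, dt)[-1] - solve_state(u, x0, dt)[-1]) / eps

n_steps, dt, x0 = 200, 0.01, 0.5
u = [0.3 * math.sin(k * dt) for k in range(n_steps)]
h = [math.cos(k * dt) for k in range(n_steps)]

theta_final = solve_linearised(u, h, solve_state(u, x0, dt), dt)
quotient = difference_quotient(u, h, x0, dt, eps=1e-6)
error = abs(quotient - theta_final)
```

Since the linearised recursion is exactly the derivative of the Euler scheme, `error` is expected to shrink proportionally to `eps` until rounding takes over. The stochastic setting of Theorem 2.5 requires the compactness argument given above and is not captured by this finite-difference check.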

Fréchet-Differentiability

We are only left to show the Fréchet-differentiability of . To this end, since is open in , there is a -ball of radius centred at such that . For all , we set , , and , so thatNoting that , Itô’s formula yields Now, the Young and Hölder inequalities give, together with the embedding ,and similarlyMoreover, note that by the mean value theorem and assumption A1 we havewhere, by the Hölder inequality, the compactness inequality (2.1), the embedding , and assumption C1, Lastly, we haveso that by A3, C2–C3, and the compactness inequality (2.1),Consequently, taking all this information into account, we can choose small enough and rearrange the terms to getThanks to the embedding , by (2.4) and (5.13)–(5.14), we havewhile (2.2) yieldswhere the constant c is independent of . Taking power at both sides, supremum in time and expectations, on the right-hand side we use the Hölder inequality with exponents to getand similarlyConsequently, arguing again as in Sect. 3.1, using an iterative argument and the Burkholder–Davis–Gundy and Young inequalities (see also Marinelli and Scarpa 2020, Lem. 4.1) gives thenThis proves the Fréchet-differentiability of and concludes the proof of Theorem 2.5.

Adjoint System

In this section, we study the adjoint problem (2.10)–(2.13), proving that it is well posed in the sense of Theorem 2.6. As anticipated in the Introduction, the presence of the extra random component in the convection term calls for non-trivial mathematical tools when deriving estimates on the solutions. Let us recall here a general backward version of the stochastic Gronwall lemma that will be used in this section: for details we refer to (Hun et al. 2020, Thm. 1) and (Wang and Fan 2018).

Lemma 6.1

Let be non-negative, with almost everywhere in , and be a non-negative process such thatThen, for every it holds that For every , using the approximations on and as in Sect. 3.2, we consider the approximated problemThis can be written in abstract form as:where is given byBy construction it holds that and , so that using similar arguments to the ones in Sect. 3.2, we have that the operator is progressively measurable, hemicontinuous, weakly monotone, weakly coercive, and linearly bounded. Moreover, the Lipschitz-continuity of B in A3 implies that is uniformly bounded as well. The classical variational theory for backward SPDEs (Du and Meng 2010, Sec. 3) ensures then that such approximated problem admits a unique variational solution , with Actually, let us note that thanks to the assumption on the target and the regularity of , the final value satisfies . Consequently, by a standard finite dimensional approximation of the approximated problem with fixed, it follows that the approximated solution actually inherits more regularity, namely We can then setso that satisfy, for every , -almost surely, for every ,

An Estimate by Duality Method

The first estimate that we prove is based on a duality method between the approximated adjoint system (6.1)–(6.4) and a suitably introduced approximated linearised system. This step is fundamental as it allows to obtain some preliminary estimates on the adjoint variables without working explicitly on the adjoint system, which may be not trivial. Such duality method is extremely powerful, and it will be crucial in showing well-posedness of the adjoint system. The idea is the following: we consider the -approximated version of the linearised system (2.6)–(2.9), in a more general version where the forcing term is given by an arbitrary termNamely, for we considerSince , the classical variational approach (see again Sects. 3.2 and 5.1) ensures that the system (6.5)–(6.8) admits a unique solutionMoreover, we can show that the system (6.5)–(6.8) is in duality with the approximated adjoint system (6.1)–(6.4). To this end, by Itô’s formula we have thatwhich readily implies by comparison in the two systems thatLet us set now for brevity of notation and with the choice . Noting that , Itô’s formula for yieldsUsing the fact that and the boundedness of in , thanks to the Hölder–Young inequalities and the compactness inequality (2.1) we get, for all ,We take now power at both sides, supremum in time, and expectations. Thanks to the Burkholder–Davis–Gundy inequality (see Marinelli and Scarpa 2020, Lem. 4.1), assumption C2, and (2.1), we getMoreover, since , by the Hölder inequality we haveSince and , we can close the estimate rearranging all the terms on for sufficiently small (independent of both and g). Using once more a classical iterative procedure on every subinterval until T, we infer that there exists a constant , independent of both and g, such thatNow, by assumption C4 and the regularity of (since for ), it holdsso that the duality relation (6.9) (with ) and the estimate (6.10) yieldBy the arbitrariness of g we obtain

Further Estimates

We show here that the initial estimate (6.11) allows to obtain uniform estimates on the adjoint variables. To this end, Itô’s formula for yields, recalling that , On the right-hand side, we have already noticed that . Moreover, by A1, the compactness inequality (2.1) and the fact that , for the second and third terms we haveand, thanks to the Hölder–Young inequalities, the embedding , and C1,Also, note that since , in particular it holds that . Hence, using the Young and Hölder inequalities, the embedding yields, for all ,and similarlyLastly, thanks to A3 and C2, and again the compactness inequality (2.1), we have thatChoosing small enough, rearranging the terms in (6.12), and conditioning (6.12) with respect to we are left withso that the backward version of the stochastic Gronwall Lemma 6.1 yieldsConsequently, taking expectations we infer thatwhere, by the Hölder inequality and the duality-estimate (6.11), we havewhich yields in turnWith this additional information, we can perform a classical refinement on the estimates going back to the inequality (6.12), repeating the same steps but this time taking first supremum in time and then expectations: the estimate (6.13) allows to apply the Burkholder–Davis–Gundy inequality on the stochastic integral, so that we obtain, thanks also to elliptic regularity, From (6.13)–(6.14), we infer that there exists withsuch that as , possibly on a subsequence,Now, thanks to C1 and the regularity of , we have , so in particularand also, thanks to (6.16),Similarly, since in for every , from (6.15) we haveLastly, convergence (6.17) readily implies thatwhile by the linearity and continuity of the stochastic integral we haveConsequently, we can let in the variational formulation of the approximated system (6.1)–(6.4) and deduce that solve the limit adjoint problem (2.10)–(2.13). The pathwise continuity of P, hence by comparison also of , follows by classical methods using Itô’s formula on the limit equation. 
By linearity of the adjoint system, it is enough to show that if is a solution of (2.10)–(2.13) with , then , , and . To this end, Itô’s formula for yieldsNow, as the computations are similar to the ones of Sect. 6.3, we avoid details for brevity. The terms on the right-hand side can be treated using A1, the Hölder–Young inequalities, the embedding , and the compactness inequality (2.1) asand similarly, since is -valued by A3, by the Poincaré–Wirtinger inequality and C2 we haveRearranging the terms and taking conditional expectations with respect to , we get thatso that applying the backward stochastic Gronwall Lemma 6.1 and then taking expectations yield almost everywhere in , hence also almost everywhere in since . Consequently, the stochastic integral appearing in the estimate above vanishes, and we deduce also in , from which in . Also, in . This concludes the proof of Theorem 2.6.
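The duality mechanism between a forward linearised system and a backward adjoint system can be seen in a discrete, deterministic, scalar analogue. The sketch below is an assumption in all its details: the recursions, the coefficients `a`, the forcing `h`, the observation weights `g`, and the function names are hypothetical, and the auxiliary variable Z of the stochastic adjoint system has no counterpart here. It verifies the summation-by-parts identity sum_{k=1..N} g[k]*theta[k] = sum_{k=0..N-1} p[k+1]*h[k].

```python
import random

def forward(a, h):
    """Forward recursion theta[k+1] = a[k]*theta[k] + h[k], theta[0] = 0.

    Returns the list theta[0..N].
    """
    theta = [0.0]
    for ak, hk in zip(a, h):
        theta.append(ak * theta[-1] + hk)
    return theta

def backward(a, g):
    """Backward recursion p[N] = g[N], p[k] = a[k]*p[k+1] + g[k] for k = N-1, ..., 1.

    Returns the list p[0..N]; the entry p[0] is unused and left at 0.
    """
    n = len(a)
    p = [0.0] * (n + 1)
    p[n] = g[n]
    for k in range(n - 1, 0, -1):
        p[k] = a[k] * p[k + 1] + g[k]
    return p

random.seed(0)
N = 50
a = [random.uniform(-1.0, 1.0) for _ in range(N)]      # time-dependent coefficients
h = [random.uniform(-1.0, 1.0) for _ in range(N)]      # forcing of the forward system
g = [random.uniform(-1.0, 1.0) for _ in range(N + 1)]  # weights; g[0] is unused

theta = forward(a, h)
p = backward(a, g)

lhs = sum(g[k] * theta[k] for k in range(1, N + 1))    # tested against the forward solution
rhs = sum(p[k + 1] * h[k] for k in range(N))           # tested against the forcing
```

The point of the identity is the same as in the continuous setting: the action of the data `g` on the forward solution can be rewritten as the action of the adjoint variable on the forcing, so that the forward variable no longer appears explicitly.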

Necessary Conditions for Optimality

In this last section, we prove the two versions of necessary conditions for optimality contained in Theorems 2.7–2.8. Let then be an optimal control for problem (CP) and let us set as its corresponding optimal state. Let us also fix an arbitrary . By convexity of we have for all . Hence, setting , for every the minimality of yieldswhich entails in turnNow, the functions and are Fréchet-differentiable on and , respectively. Hence, the mean-value theorem yieldsAt this point, as , we have in , so (2.3)–(2.4) imply thatMoreover, Theorem 2.5 ensures thatHence, noting that , letting we obtain exactly (2.14), and Theorem 2.7 is proved. Lastly, we note that (2.15) follows directly from (2.14) provided to show the duality relationIn order to prove this, we can take and in the duality relation (6.9), and then let thanks to the convergences (5.9)–(5.10). This concludes the proof of Theorem 2.8.
