An application of Pontryagin’s principle to Brownian particle engineered equilibration



Introduction
An increasing number of applications in micro and sub-micro scale physics call for the development of general techniques for engineered finite-time equilibration of systems operating in a thermally fluctuating environment. Possible concrete examples are the design of nano-thermal engines [13,45] or of micro-mechanical oscillators used for high precision timing or sensing of mass and forces [33].
A recent experiment [36] exhibited the feasibility of driving a micro-system between two equilibria over a control time several orders of magnitude faster than the natural equilibration time. The system was a colloidal micro-sphere trapped in an optical potential. There is consensus that non-equilibrium thermodynamics (see e.g. [49]) of optically trapped micron-sized beads is well captured by Langevin-Smoluchowski equations [24]. In particular, the authors of [36] took care to show that it is accurate to conceptualize the outcome of their experiment as the evolution of a Gaussian probability density according to a controlled Langevin-Smoluchowski dynamics with gradient drift and constant diffusion coefficient. Finite-time equilibration means that at the end of the control horizon, the probability density is a solution of the stationary Fokker-Planck equation. The experimental demonstration consisted in a compression of the confining potential. In such a case, the protocol steering the equilibration process is specified by the choice of the time evolution of the stiffness of the quadratic potential whose gradient yields the drift in the Langevin-Smoluchowski equation. As a result, the set of admissible controls is infinite. The selection of the control in [36] was then based on considerations of simplicity of implementation.
A compelling question is whether and how the selection of the protocol may stem from a notion of optimal efficiency. A natural indicator of efficiency in finite-time thermodynamics is entropy production. Transitions occurring at minimum entropy production set a lower bound in Clausius inequality. Optimal control of these transitions is, thus, equivalent to a refinement of the second law of thermodynamics in the form of an equality.
In the Langevin-Smoluchowski framework, entropy production optimal control takes a particularly simple form if states at the end of the transition are specified by sufficiently regular probability densities [6]. Namely, the problem admits an exact mapping into the well known Monge-Kantorovich optimal mass transport [50]. This feature is particularly useful because the dynamics of the Monge-Kantorovich problem is exactly solvable. Mass transport occurs along free-streaming Lagrangian particle trajectories. These trajectories satisfy boundary conditions determined by the map, called the Lagrangian map, transforming into each other the data of the problem, the initial and the final probability densities. Rigorous mathematical results [8,14,18] preside over the existence, qualitative properties and reconstruction algorithms for the Lagrangian map.
The aforementioned results cannot be directly applied to optimal protocols for engineered equilibration. Optimal protocols in finite time unavoidably attain minimum entropy by leaving the end probability densities out of equilibrium. The qualitative reason is that optimization is carried over the set of drifts sufficiently smooth to mimic all controllable degrees of freedom of the micro-system. Controllable degrees of freedom are defined as those varying over typical time scales much slower than the time scales of Brownian forces [3]. The set of admissible protocols defined in this way is too large for optimal engineered equilibration. The set of admissible controls for equilibration must take into account also extra constraints coming from the characteristic time scales of the forces acting on the system. From the experimental slant, we expect these restrictions to be strongly contingent on the nature and configuration of peripherals in the laboratory setup. From the theoretical point of view, self-consistence of Langevin-Smoluchowski modeling imposes a general restriction. The time variation of drift fields controlling the dynamics must be slow in comparison to Brownian and inertial forces.
In the present contribution, we propose a refinement of the entropy production optimal control adapted to engineered equilibration. We do this by restricting the set of admissible controls to those satisfying a non-holonomic constraint on accelerations. The constraint relates the bound on admissible accelerations to the pathwise displacement of the system degrees of freedom across the control horizon. Such displacement is a deterministic quantity, intrinsically stemming from the boundary conditions inasmuch as we determine it from the Lagrangian map.
This choice of the constraint has several inherent advantages. It yields an intuitive hold on the realizability of the optimal process. It also preserves the integrability properties of the unrestricted control problem specifying the lower bound to the second law. This is so because the bound allows us to maintain protocols within the admissible set by exerting on them uniform accelerating or decelerating forces. On the technical side, the optimal control problem can be handled by a direct application of Pontryagin maximum principle [34]. For the same reasons as for the refinement of the second law [6], the resulting optimal control is of deterministic type. This circumstance yields a technical simplification but it is not a necessary condition in view of extensions of our approach. We will return to this point in the conclusions.
The structure of the paper is as follows. In section 2 we briefly review the Langevin-Smoluchowski approach to non-equilibrium thermodynamics [47]. This section can be skipped by readers familiar with the topic. In section 3 we introduce the problem of optimizing the entropy production. In particular, we explain its relation with the Schrödinger diffusion problem [46,1]. This relation, already pointed out in [38], has recently attracted the attention of mathematicians and probabilists interested in rigorous application of variational principles in hydrodynamics [5]. In section 4 we formulate the Pontryagin principle for our problem. Our main result follows in section 5, where we solve for the optimal protocols in explicit form. Sections 6 and 7 are devoted to applications. In section 6 we revisit the theoretical model of the experiment [36], the primary motivation of our work. In section 7 we apply our results to a stylized model of controlled nucleation obtained by manipulating a double well potential. Landauer and Bennett availed themselves of this model to discuss the existence of an intrinsic thermodynamic cost of computing [31,9]. Optimal control of this model has motivated in more recent years several theoretical [19] and experimental works [11,28,27].
Finally, in section 8 we compare the optimal control we found with those of [7]. This reference applied a regularization technique coming from instanton calculus [4] to give a precise meaning to otherwise ill-defined problems in non-equilibrium thermodynamics, where terminal costs seem to depend on the control rather than being a given function of the final state of the system. In the conclusions we discuss possible extensions of the present work. The style of the presentation is meant to be discursive but relies on notions lying between non-equilibrium physics, optimal control theory and probability theory. For this reason we include in appendices some auxiliary information as a service to the interested reader.

Kinematics and thermodynamics of the model
We consider a physical process in a d-dimensional Euclidean space ($\mathbb{R}^d$) modeled by a Langevin-Smoluchowski dynamics

$$ d\xi_t = -\partial_{\xi_t} U(\xi_t, t)\, dt + \sqrt{\frac{2}{\beta}}\, d\omega_t \qquad (1) $$

The stochastic differential $d\omega_t$ stands here for the increment of a standard d-dimensional Wiener process at time $t$ [24]. $U\colon \mathbb{R}^d \times \mathbb{R} \to \mathbb{R}$ denotes a smooth scalar potential and $\beta^{-1}$ is a constant sharing the same canonical dimensions as $U$. We also suppose that the initial state of the system is specified by a smooth probability density

$$ \mathrm{P}(q \le \xi_{t_\iota} < q + dq) = p_\iota(q)\, d^d q \qquad (2) $$

Under rather general hypotheses, the Langevin-Smoluchowski equation (1) can be derived as the scaling limit of the overdamped non-equilibrium dynamics of a classical system weakly coupled to a heat bath [51]. The Wiener process in (1) thus embodies thermal fluctuations of order $\beta^{-1}$. The fundamental simplification entailed by (1) is the possibility to establish a framework of elementary relations linking the dynamical to the statistical levels of description of a non-equilibrium process [47,32]. In fact, the kinematics of (1) ensures that for any time-autonomous, confining potential the dynamics tends to a unique Boltzmann equilibrium state.
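The relaxation towards the Boltzmann state can be illustrated numerically. The following is a minimal sketch, assuming an Euler-Maruyama discretization of (1) with noise amplitude $\sqrt{2/\beta}$ (consistent with the generator quoted in appendix B); the function name, the harmonic potential and all parameter values are hypothetical choices made only for illustration.

```python
import numpy as np

def simulate_langevin_smoluchowski(grad_u, q0, beta, t_final, n_steps, rng):
    """Euler-Maruyama integration of d xi = -grad U(xi, t) dt + sqrt(2/beta) d omega."""
    dt = t_final / n_steps
    xi = np.array(q0, dtype=float)
    for k in range(n_steps):
        t = k * dt
        noise = rng.standard_normal(xi.shape)
        # Deterministic gradient drift plus thermal kick of variance 2 dt / beta.
        xi = xi - grad_u(xi, t) * dt + np.sqrt(2.0 * dt / beta) * noise
    return xi

# Hypothetical example: time-autonomous harmonic trap U(q) = stiffness * q^2 / 2.
beta, stiffness = 2.0, 1.5
rng = np.random.default_rng(0)
samples = simulate_langevin_smoluchowski(
    lambda q, t: stiffness * q,      # gradient of the assumed potential
    np.zeros(20000),                 # independent particles, all starting at q = 0
    beta, t_final=5.0, n_steps=2000, rng=rng)

empirical_variance = samples.var()
boltzmann_variance = 1.0 / (beta * stiffness)   # variance of exp(-beta U) for this trap
```

After a time long compared with the relaxation time of the trap, the empirical variance should agree with the Boltzmann value up to sampling and discretization errors.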
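Building on the foregoing observations [47], we may then identify $U$ over a finite time horizon with the internal energy of the system. The differential of $U$

$$ dU(\xi_t, t) = dt\, \partial_t U(\xi_t, t) + d\xi_t \overset{1/2}{\cdot} \partial_{\xi_t} U(\xi_t, t) \qquad (3) $$

yields the energy balance in the presence of thermal fluctuations due to interactions with the environment. We use the notation $\overset{1/2}{\cdot}$ for the Stratonovich differential [24]. From (3) we recover the first law of thermodynamics by averaging over the realizations of the Wiener process. In particular, we interpret

$$ W = \mathrm{E} \int_{t_\iota}^{t_f} dt\, \partial_t U(\xi_t, t) \qquad (4) $$

as the average work done on the system. Correspondingly,

$$ Q = -\mathrm{E} \int_{t_\iota}^{t_f} d\xi_t \overset{1/2}{\cdot} \partial_{\xi_t} U(\xi_t, t) \qquad (5) $$

is the average heat discarded by the system into the heat bath and therefore

$$ W - Q = \mathrm{E}\left[ U(\xi_{t_f}, t_f) - U(\xi_{t_\iota}, t_\iota) \right] \qquad (6) $$

is the embodiment of the first law.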
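The kinematics of stochastic processes [41] also allows us to write a meaningful expression for the second law of thermodynamics. The expectation value of a Stratonovich differential is in general amenable to the form

$$ \mathrm{E}\, d\xi_t \overset{1/2}{\cdot} \partial_{\xi_t} U(\xi_t, t) = \mathrm{E}\, v(\xi_t, t) \cdot \partial_{\xi_t} U(\xi_t, t)\, dt \qquad (7) $$

where

$$ v(q, t) = -\partial_q U(q, t) - \frac{1}{\beta}\, \partial_q \ln p(q, t) \qquad (8) $$

is the current velocity and $p(q, t)$ is the probability density of $\xi_t$. For a potential drift, the current velocity vanishes identically at equilibrium. As is well known from stochastic mechanics [20,40], the current velocity permits one to couch the Fokker-Planck equation into the form of a deterministic mass transport equation. Hence, upon observing that

$$ \mathrm{E}\, v(\xi_t, t) \cdot \partial_{\xi_t} \ln p(\xi_t, t) = \frac{d}{dt}\, \mathrm{E} \ln p(\xi_t, t) \qquad (9) $$

we can recast (7) into the form

$$ \beta\, Q - \mathrm{E} \ln \frac{p(\xi_{t_f}, t_f)}{p(\xi_{t_\iota}, t_\iota)} = \beta\, \mathrm{E} \int_{t_\iota}^{t_f} dt\, \| v(\xi_t, t) \|^2 \ge 0 \qquad (10) $$

which we interpret as the second law of thermodynamics (see e.g. [42]). Namely, if we define $\mathcal{E}$ as the total entropy change in $[t_\iota, t_f]$, i.e. the common value of the two sides of (10), then (10) states that the sum of the entropy generated by heat released into the environment plus the change of the Gibbs-Shannon entropy of the system is positive definite and vanishes only at equilibrium. The second law in the form (10) immediately implies a bound on the average work done on the system. To evince this fact, we avail ourselves of the equality

$$ W = Q + \mathrm{E}\left[ U(\xi_{t_f}, t_f) - U(\xi_{t_\iota}, t_\iota) \right] $$

and define the current velocity potential

$$ F(q, t) = U(q, t) + \frac{1}{\beta} \ln p(q, t) $$

We then obtain

$$ W - \mathrm{E}\left[ F(\xi_{t_f}, t_f) - F(\xi_{t_\iota}, t_\iota) \right] = \mathrm{E} \int_{t_\iota}^{t_f} dt\, \| v(\xi_t, t) \|^2 \ge 0 $$

In equilibrium thermodynamics the Helmholtz free energy is defined as the difference between the internal energy $U$ and entropy $S$ of a system at temperature $\beta^{-1}$. This relation admits a non-equilibrium extension by noticing that the information content [48] of the system probability density weighs the contribution of individual realizations of (1) to the Gibbs-Shannon entropy. We refer to [41] for the kinematic and thermodynamic interpretation of the information content as osmotic potential. We also emphasize that the notions above can be given an intrinsic meaning using the framework of stochastic differential geometry [40,38]. Finally, it is worth noticing that the above relations can be regarded as a special case of macroscopic fluctuation theory [10].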

Non-equilibrium thermodynamics and Schrödinger diffusion
We are interested in thermodynamic transitions between an initial state (2) at time $t_\iota$ and a pre-assigned final state at time $t_f$, also specified by a smooth probability density (12). We also suppose that the cumulative distribution functions of (2) and (12) are related by a Lagrangian map $\ell$ (13). According to the Langevin-Smoluchowski dynamics (1), the evolution of probability densities obeys a Fokker-Planck equation, a first order in time partial differential equation. As a consequence, a price we pay to steer transitions between assigned states is to regard the drift in (1) not as an assigned quantity but as a control. A priori a control is only implicitly characterized by the set of conditions which make it admissible. Informally speaking, admissible controls are all those drifts steering the process $\xi_t$, $t \in [t_\iota, t_f]$, between the assigned end states (2) and (12) while ensuring that at any time $t \in [t_\iota, t_f]$ the Langevin-Smoluchowski dynamics remains well-defined. Schrödinger [46] considered already in 1931 the problem of controlling a diffusion process between assigned states. His work was motivated by the quest for a statistical interpretation of quantum mechanics. In modern language [17,43], the problem can be rephrased as follows. Given (2) and (12) and a reference diffusion process, determine the diffusion process interpolating between (2) and (12) while minimizing the value of its Kullback-Leibler divergence (relative entropy) [30] with respect to the reference process. A standard application (appendix A) of Girsanov formula [24] shows that the Kullback-Leibler divergence of (1) with respect to the Wiener process is

$$ K(P \,\|\, P_\omega) = \frac{\beta}{4}\, \mathrm{E} \int_{t_\iota}^{t_f} dt\, \| \partial_{\xi_t} U(\xi_t, t) \|^2 \qquad (14) $$

$P$ and $P_\omega$ denote respectively the measures of the process solution of (1) with drift $-\partial_q U(q, t)$ and of the Wiener process $\omega$. The expectation value on the right hand side is with respect to $P$ as elsewhere in the text. A now well-established result in optimal control theory (see e.g. [17,43]) is that the optimal value of the drift satisfies a backward Burgers equation with terminal condition specified by the solution of the Beurling-Jamison integral equations. We refer to [17,43] for further details. What interests us here is to emphasize the analogy with the problem of minimizing the entropy production $\mathcal{E}$ in a transition between assigned states. Several observations are in order at this stage. The first observation is that also (10) can be directly interpreted as a Kullback-Leibler divergence between two probability measures. Namely, we can write (appendix A) the entropy production in terms of the Kullback-Leibler divergence $K(P \,\|\, P_R)$ (15), for $P_R$ the path-space measure of the process (16) evolving backward in time from the final condition (12) [25,15].
The second observation has more far-reaching consequences for optimal control. The entropy production depends upon the drift of (1) exclusively through the current velocity (8). Hence we can treat the current velocity itself as the natural control quantity for (15). This fact entails major simplifications [6]. The current velocity can be thought of as a deterministic rather than stochastic velocity field (see [41] and appendix B). Thus, we can couch the optimal control of (15) into the problem of minimizing the kinetic energy of a classical particle traveling from an initial position $q$ at time $t_\iota$ to a final position $\ell(q)$ at time $t_f$ specified by the Lagrangian map (13). In other words, entropy production minimization in the Langevin-Smoluchowski framework is equivalent to solving a classical optimal transport problem [50].
The third observation comes as a consequence of the second one. The optimal value of the entropy production is equal to the Wasserstein distance [26] between the initial and final probability measures of the system, see [21] for details. This fact yields a simple characterization of the Landauer bound and permits a fully explicit analysis of the thermodynamics of stylized isochoric micro-engines (see [39] and refs therein).
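In one dimension the Wasserstein distance is easy to evaluate, because the optimal map simply matches quantiles. The sketch below is only illustrative: it assumes Gaussian end states, uses the standard-library `NormalDist` for the quantile functions, and assumes the normalization $\mathcal{E}_{\min} = \beta\, W_2^2 / (t_f - t_\iota)$, which follows if the entropy production is $\beta$ times the time-integrated mean squared current velocity. Function names and parameter values are hypothetical.

```python
import numpy as np
from statistics import NormalDist

def gaussian_quantile(mean, std):
    """Vectorized quantile function of a Gaussian, built on statistics.NormalDist."""
    return np.vectorize(NormalDist(mean, std).inv_cdf)

def wasserstein2_squared_1d(quantile_a, quantile_b, n=20000):
    """Squared 2-Wasserstein distance in 1D: integral over u in (0,1) of (Qa(u) - Qb(u))^2."""
    u = (np.arange(n) + 0.5) / n          # midpoint rule avoids the endpoints u = 0, 1
    return float(np.mean((quantile_a(u) - quantile_b(u)) ** 2))

beta, horizon = 1.0, 1.0                  # hypothetical inverse temperature and duration
initial = gaussian_quantile(0.0, 1.0 / np.sqrt(beta))
final = gaussian_quantile(1.0, 0.5)

w2_numeric = wasserstein2_squared_1d(initial, final)
# Closed form for Gaussians: (difference of means)^2 + (difference of std deviations)^2.
w2_exact = (0.0 - 1.0) ** 2 + (1.0 - 0.5) ** 2
# Assumed normalization of the minimum entropy production (see the lead-in).
min_entropy_production = beta * w2_numeric / horizon
```

The quadrature reproduces the closed-form Gaussian value to within the midpoint-rule error, which is dominated by the tails of the quantile functions.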
Finally, the construction of Schrödinger diffusions via optimal control of (14) corresponds to a viscous regularization of the optimal control equations occasioned by the Schrödinger diffusion problem (15).

Pontryagin's principle for bounded accelerations
An important qualitative feature of the solution of the optimal control of the entropy production is that the system starts from (2) and reaches (12) with non-vanishing current velocity. This means that the entropy production attains a minimum value when the end-states of the transition are out-of-equilibrium. We refer to this lower bound as the refinement of the second law.
Engineered equilibration transitions are, however, subject to at least two further types of constraints not taken into account in the derivation of the refined second law. The first type of constraint is on the set of admissible controls. For example, admissible controls cannot vary in an arbitrary manner: the fastest time scale in the Langevin-Smoluchowski dynamics is set by the Wiener process. The second type is that end-states are at equilibrium. In mathematical terms, this means that the current velocity must vanish identically at t ι and t f .
We formalize a deterministic control problem modeling these constraints. Our goal is to minimize the functional (17) over the set of trajectories generated, for any given choice of the measurable control $\alpha_t$, by the differential equations (18) satisfying the boundary conditions (19), (20). We dub the dynamical variable $\chi_t$ the running Lagrangian map, as it describes the evolution of the Lagrangian map within the control horizon. We restrict the set of admissible controls $A = \{\alpha_t,\ t \in [t_\iota, t_f]\}$ to those enforcing equilibration at the boundaries of the control horizon whilst satisfying the bound (21). We suppose that the $K^{(i)}(q) > 0$, $i = 1, \dots, d$, are strictly positive functions of the initial data $q$ of the form (22). The constraint is non-holonomic inasmuch as it depends on the initial data of a trajectory. The proportionality (22) relates the bound on acceleration to the Lagrangian displacement needed to satisfy the control problem. We resort to Pontryagin's principle [34] to find normal extremals of (17). We defer the statement of Pontryagin's principle as well as the discussion of abnormal extremals to appendix C. We proceed in two steps. We first avail ourselves of Lagrange multipliers to define the effective cost functional subject to the boundary conditions (19), (20). Then, we couch the cost functional into an explicit Hamiltonian form with Pontryagin's principle yields a rigorous proof of the intuition that extremals of the optimal control equations correspond to stationary curves of the action (23) with Hamiltonian In view of the boundary conditions (19), (20), extremals satisfy the Hamilton system of equations formed by (18a) and In writing (24a) we adopt the convention and a "no-action" region specified by the conditions where $\chi_t$ follows a free streaming trajectory: We call switching times the values of $t$ corresponding to the boundary values of a no-action region. Switching times correspond to discontinuities of the acceleration $\alpha_t$.
Drawing from the intuition offered by the solution of the unbounded acceleration case, we compose push and no-action regions to construct a single solution trajectory satisfying the boundary conditions. If we surmise that during the control horizon only two switching times occur, we obtain which implies Self-consistency of the solutions fixes the initial data in (27), whilst the requirement of vanishing velocity at $t = T$ determines the relation between the switching times Self-consistency then dictates We are now ready to glean the information we unraveled by solving (24) to write the solution of (18a) The terminal condition on $\chi_t$ fixes the values of $t_1$ and $\operatorname{sgn} \theta_{t_0}$: The equation for $t_1$ is well posed only if The only admissible solution is then of the form The switching time is independent of $q$ in view of (22). It is realizable as long as The threshold value of $\delta$ corresponds to the acceleration needed to construct an optimal protocol consisting of two push regions matched at half control horizon.
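The composition of push and no-action regions can be made concrete with elementary kinematics. The sketch below is written under stated assumptions rather than as a transcription of the formulas of this section: a single degree of freedom starts and ends at rest, accelerates uniformly at the bound up to a first switching time `t1`, streams freely, and decelerates symmetrically from `horizon - t1`. The names `switching_time`, `profile`, `displacement`, `bound` and `horizon` are hypothetical.

```python
import math

def switching_time(displacement, bound, horizon):
    """First switching time t1 of an accelerate / coast / decelerate trajectory.

    Uniform acceleration `bound` on [0, t1], free streaming on [t1, horizon - t1]
    and uniform deceleration on [horizon - t1, horizon] cover a total distance
    bound * t1 * (horizon - t1); we return the root with t1 <= horizon / 2.
    """
    discriminant = 1.0 - 4.0 * abs(displacement) / (bound * horizon ** 2)
    if discriminant < 0.0:
        raise ValueError("acceleration bound too small to cover the displacement in time")
    return 0.5 * horizon * (1.0 - math.sqrt(discriminant))

def profile(t, t1, horizon):
    """Fraction of the total displacement covered at time t (0 at t=0, 1 at t=horizon)."""
    if t1 <= 0.0:
        return t / horizon                 # unbounded acceleration: pure free streaming
    norm = t1 * (horizon - t1)
    if t <= t1:                            # first push region
        return 0.5 * t * t / norm
    if t <= horizon - t1:                  # no-action region
        return (t - 0.5 * t1) / (horizon - t1)
    return 1.0 - 0.5 * (horizon - t) ** 2 / norm   # second push region

# Hypothetical numbers: displacement 2, acceleration bound 16, unit control horizon.
t1 = switching_time(2.0, 16.0, 1.0)
threshold_t1 = switching_time(2.0, 8.0, 1.0)   # smallest bound that still works
```

In this parametrization the smallest admissible bound is `4 * |displacement| / horizon**2`, for which the two push regions meet at half the control horizon; if the bound is taken proportional to the displacement, the switching time no longer depends on the initial point.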

Qualitative properties of the solution
Equation (28) complemented by (29) and the realizability bound (31) fully specifies the solution of the optimization problem we set out to solve. The solution is optimal because it is obtained by composing locally optimal solutions. Qualitatively, it states that transitions between equilibrium states are possible at the price of the formation of symmetric boundary layers determined by the occurrence of the switching times. For $\delta \gg 1$ the relative size of the boundary layers is In the same limit, the behavior of the current velocity far from the boundaries tends to the optimal value of the refined second law [6]. Namely, for $t \in [t_1, t_f]$ we find More generally, for any $0 \le t_1 \le T/2$, we can couch (28) into the form The use of the value of the switching time $t_1$ to parametrize the bound simplifies the derivation of the Eulerian representation of the current velocity. Namely, in order to find the field $v$: we can invert (32) by taking advantage of the fact that all the arguments of the curly brackets are independent of the position variable $q$.
We also envisage that the representation (32) may be of use to analyze experimental data when finite measurement resolution may affect the precision with which microscopic forces acting on the system are known.

Comparison with experimental swift engineering (ESE) protocols
The experiment reported in [36] showed that a micro-sphere immersed in water and trapped in an optical harmonic potential can be driven in finite time from one equilibrium state to another. The probability distribution of the particle in and out of equilibrium remained Gaussian within experimental accuracy.
It is therefore expedient to describe in more detail the solution of the optimal control problem in the case when the initial equilibrium distribution in one dimension is normal, i.e. Gaussian with zero mean and variance $\beta^{-1}$. We also assume that the final equilibrium state is Gaussian and satisfies (13) with Lagrangian map The parameters $h$ and $\sigma$ respectively describe a change of the mean and of the variance of the distribution. We apply (13) and (32) for any $t \in [0, T]$ to derive the minimum entropy production evolution of the probability density. In consequence of (22), the running Lagrangian map leaves Gaussian distributions invariant in form with mean value and variance Finally, we find that the Eulerian representation (33) of the current velocity at The foregoing expression allows us to write explicit expressions for all the thermodynamic quantities governing the energetics of the optimal transition. In particular, the minimum entropy production is with the value of the minimum entropy production appearing in the refinement of the second law [6]. In Fig. 1 we plot the evolution of the running average values of the work done on the system, the heat release and the entropy production during the control horizon. In particular, Fig. 1(a) illustrates the first law of thermodynamics during the control horizon. A transition between Gaussian equilibrium states occurs without any change in the internal energy of the system. The average heat and work must therefore coincide at the end of the control horizon. The theoretical results are consistent with the experimental results of [36].

Figure 1. First law (fig. 1(a)) and second law (fig. 1(b)) of thermodynamics for the same transition between Gaussian states as in [36]. The initial state is a normal distribution with variance $\beta^{-1}$. The final distribution is Gaussian with variance $\beta^{-1}/2$. The condition $K(q) \propto |\ell(q) - q|$ ensures that the probability density remains Gaussian at any time in the control horizon. The proportionality factor is chosen such that $t_1 = 0.3$ in (32). The behavior of the variance (inset of fig. 1(a)) is qualitatively identical to the one observed in Fig. 2 of [36]. The behavior of the average work and heat also reproduces the one of Fig. 3 of [36].
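The effect of the boundary layers on the entropy production of a Gaussian transition can be checked with a short computation. The sketch below rests on assumptions that are not taken from the text: an affine Lagrangian map $\ell(q) = \sigma q + h$, a running Lagrangian map $q + f(t)\,(\ell(q) - q)$ with a symmetric accelerate / coast / decelerate profile $f$, and the entropy production written as $\beta$ times the time-integrated mean squared velocity. All function names are hypothetical.

```python
import numpy as np

def profile_rate(t, t1, horizon):
    """Time derivative of the accelerate / coast / decelerate profile (trapezoidal shape)."""
    if not 0.0 < t1 <= 0.5 * horizon:
        raise ValueError("switching time must lie in (0, horizon / 2]")
    t = np.asarray(t, dtype=float)
    peak = 1.0 / (horizon - t1)                    # speed in the no-action region
    return np.minimum(peak, np.minimum(peak * t / t1, peak * (horizon - t) / t1))

def entropy_production(h, sigma, beta, t1, horizon, n=200001):
    """beta * E|l(q) - q|^2 * integral of f'(t)^2, for q Gaussian with variance 1/beta."""
    t = np.linspace(0.0, horizon, n)
    integrand = profile_rate(t, t1, horizon) ** 2
    integral = np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(t))  # trapezoid rule
    mean_square_displacement = h ** 2 + (sigma - 1.0) ** 2 / beta
    return beta * mean_square_displacement * integral

# Parameters echoing Fig. 1: final variance halved, t1 = 0.3, unit horizon (assumed).
beta, h, sigma, t1, horizon = 1.0, 0.0, 1.0 / np.sqrt(2.0), 0.3, 1.0
with_layers = entropy_production(h, sigma, beta, t1, horizon)
closed_form = beta * (h ** 2 + (sigma - 1.0) ** 2 / beta) * (horizon - 4.0 * t1 / 3.0) / (horizon - t1) ** 2
free_streaming = beta * (h ** 2 + (sigma - 1.0) ** 2 / beta) / horizon   # limit t1 -> 0
```

Under these assumptions the boundary layers raise the entropy production by the factor $T\,(T - 4 t_1/3)/(T - t_1)^2$ relative to the free-streaming value, which tends to one as $t_1 \downarrow 0$.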

Optimal controlled nucleation and Landauer bound
The form of the bound (22) and the running Lagrangian map formula (32) reduce the computational cost of the solution of the optimal entropy production control to the determination of the Lagrangian map (13). In general, the conditions presiding over the qualitative properties of the Lagrangian map have been studied in depth in the context of optimal mass transport [50]. We refer to [18] and [21] for self-contained overviews from the mathematics and physics slant, respectively. For illustrative purposes, we revisit here the stylized model of nucleation analyzed in [6]. Specifically, we consider the transition between two equilibria in one dimension. The initial state is described by the symmetric double well: In the final state the probability is concentrated around a single minimum of the potential: In the foregoing expressions $\sigma$ is a constant ensuring consistency of the canonical dimensions. We used the ensuing elementary algorithm to numerically determine the Lagrangian map. We first computed the median $z(1)$ of the assigned probability distributions and then evaluated first the left and then the right branch of the Lagrangian map. For the left branch, we proceeded iteratively in $z(k)$ as follows.
Step 1 We renormalized the distribution restricted to $[-\infty, z(k)]$.
Step 2 We computed the 0.9 quantile $z(k + 1) < z(k)$ of the remaining distribution.
Step 3 We solved the ODE for the Lagrangian map on the corresponding interval.
We skipped Step 3 whenever the difference $|z(k) - z(k - 1)|$ turned out to be smaller than a given threshold 'resolution'. We plot the results of this computation in Fig. 2.
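For comparison, a one-dimensional Lagrangian map can also be tabulated by matching cumulative distribution functions directly on a grid. The sketch below is a simpler alternative to the iterative quantile procedure described above, not a transcription of it; the function name `lagrangian_map_1d` and the Gaussian test densities are hypothetical, and the approach loses accuracy in tails where the final density is numerically negligible.

```python
import numpy as np

def lagrangian_map_1d(grid, p_initial, p_final):
    """Monotone map l with F_final(l(q)) = F_initial(q), tabulated on `grid`.

    `p_initial` and `p_final` are (possibly unnormalized) densities sampled on
    the increasing array `grid`.
    """
    def cdf(p):
        # Cumulative trapezoid rule, normalized so that the last value is 1.
        steps = 0.5 * (p[1:] + p[:-1]) * np.diff(grid)
        c = np.concatenate(([0.0], np.cumsum(steps)))
        return c / c[-1]

    f_initial, f_final = cdf(p_initial), cdf(p_final)
    # Invert the final CDF by linear interpolation: l(q) = F_final^{-1}(F_initial(q)).
    return np.interp(f_initial, f_final, grid)

# Hypothetical check on Gaussians, where the exact map is affine: l(q) = 1 + q / 2.
grid = np.linspace(-8.0, 8.0, 4001)
p_initial = np.exp(-0.5 * grid ** 2)
p_final = np.exp(-0.5 * ((grid - 1.0) / 0.5) ** 2)
ell = lagrangian_map_1d(grid, p_initial, p_final)
bulk = np.abs(grid) < 3.0
max_error = np.max(np.abs(ell[bulk] - (1.0 + 0.5 * grid[bulk])))
```

The same routine can be applied to a double-well initial density and a single-well final density once their explicit expressions are supplied on the grid.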
Once we know the Lagrangian map, we can numerically evaluate the running Lagrangian map (32) and its spatial derivatives. In Fig. 3 we report the evolution of the probability density in the control horizon for two reference values of the switching time. Fig. 4 illustrates the corresponding evolution of the current velocity. The qualitative behavior is intuitive. The current velocity starts and ends with vanishing value. In the bulk of the control horizon it catches up with the value for $t_1 \downarrow 0$, i.e. the value attained when the bound on acceleration tends to infinity. There the displacement described by the running Lagrangian map occurs at speed higher than in the $t_1 \downarrow 0$ case. The overall value of the entropy production is always higher than in the $t_1 \downarrow 0$ limit. From (32) we can also write the running values of average heat released by the system. The running average heat is and the running average work The second summand on the right hand side of (39) fixes the arbitrary constant in the Helmholtz potential in the same way as in the Gaussian case. In Fig. 5 we plot the running average work, heat and entropy production.
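The evolution of the probability density follows from the running Lagrangian map by a change of variables. The sketch below assumes a one-dimensional, strictly increasing running map of the form $q + f\,(\ell(q) - q)$, where $f \in [0, 1]$ is the fraction of the displacement covered at the time of interest; `transported_density` and its arguments are hypothetical names.

```python
import numpy as np

def transported_density(q, p_initial, lagrangian_map, fraction):
    """Push the initial density forward along chi(q) = q + fraction * (l(q) - q).

    Returns the transported positions chi and the density evaluated there,
    p(chi(q)) = p_initial(q) / (d chi / d q), valid for strictly increasing chi.
    """
    chi = q + fraction * (lagrangian_map - q)
    jacobian = np.gradient(chi, q)         # finite-difference derivative of chi
    return chi, p_initial / jacobian

# Hypothetical check with an affine map l(q) = 1 + q / 2 acting on a unit Gaussian.
q = np.linspace(-6.0, 6.0, 2001)
p_initial = np.exp(-0.5 * q ** 2) / np.sqrt(2.0 * np.pi)
chi, p_t = transported_density(q, p_initial, 1.0 + 0.5 * q, fraction=0.4)
# For this map chi = 0.8 q + 0.4, so the density stays Gaussian: N(0.4, 0.8^2).
expected = np.exp(-0.5 * ((chi - 0.4) / 0.8) ** 2) / (0.8 * np.sqrt(2.0 * np.pi))
```

Evaluating `fraction` from an accelerate / coast / decelerate profile at successive times gives the kind of density snapshots reported for the two reference switching times.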

Comparison with the valley method regularization
An alternative formalism to study transitions between equilibrium states in the Langevin-Smoluchowski limit was previously proposed in [7]. As in the present case, [7] takes advantage of the possibility to map the stochastic optimal control problem into a deterministic one via the current velocity formalism. Physical constraints on admissible controls are, however, enforced by adding to the entropy production rate a penalty term proportional to the squared current acceleration. In terms of the entropy production functional (17) we can couch the regularized functional of [7] into the form $\delta_\chi \mathcal{E}$ stands for the variation of $\mathcal{E}$ with respect to the running Lagrangian map. The idea behind the approach is the "valley method" advocated by [4] for instanton calculus. The upshot is to approximate field configurations satisfying Contrasted with the approach proposed in the present work, [7] has one evident drawback and one advantage. The drawback is that the quantities actually minimized are no longer the original thermodynamic functionals. The advantage is that the resulting optimal protocol has better analyticity properties. In particular, the running Lagrangian map takes the form In fig. 6(a) we compare the qualitative behavior of the universal part of the running Lagrangian map predicted by the valley method and by the bound (21) on admissible current accelerations. The corresponding values of the running average entropy production are in fig. 6(b). The upshot of the comparison is the weak sensitivity of the optimal protocol to the details of the optimization once the intensity of the constraint on the admissible control (i.e. the current acceleration) is fixed. We believe that this is an important observation for experimental applications (see, e.g., the discussion in the conclusions of [27]), as the details of how control parameters can be turned on and off in general depend on the laboratory setup and on the restrictions imposed by the available peripherals.

Conclusions and outlooks
We presented a stylized model of engineered equilibration of a micro-system. Owing to its explicit integrability modulo numerical reconstruction of the Lagrangian map, we believe that our model may provide a useful benchmark for the devising of efficient experimental setups. Furthermore, extensions of the current model are possible although at the price of some complications.

Figure 6. Qualitative comparison of the universal part of the running Lagrangian map predicted by the valley method and by the bound (21), fig. 6(a). In (40) we choose $\tau = 1$, $\varepsilon = 0.3$. Fig. 6(b) evinces, as to be expected, the qualitatively equivalent behaviors of the entropy production for finite value ($t_1 = 0.3$) of the switching time. The dashed green line is computed from (40). The continuous blue line is the lower bound for the transition as predicted by [6].
The first extension concerns the form of the constraint imposed on admissible protocols. Here we showed that choosing the current acceleration constraint in the form (22) greatly simplifies the determination of the switching times. It also guarantees that optimal control with only two switching times exists for all boundary conditions if we allow accelerations to take sufficiently large values. The non-holonomic form of the constraint (21) may turn out to be restrictive for the study of transitions for which admissible controls are specified by given forces. If the current velocity formalism is still applicable to these cases, then the design of optimal control still follows the steps we described here. In particular, uniformly accelerated Lagrangian displacements at the end of the control horizon correspond to the first terms of the integration of Newton's law in Peano-Picard series. The local form of the acceleration may then occasion some qualitative differences in the form of the running Lagrangian map. Furthermore, the analysis of the realizability conditions of the optimal control may also become more involved.
A further extension is optimal control when constraints on admissible controls are imposed directly on the drift field appearing in the stochastic evolution equation. Constraints of this type are natural when inertial effects become important and the dynamics is governed by the Langevin-Kramers equation in the so-called under-damped approximation. In the Langevin-Kramers framework, finding minimum entropy production thermodynamic transitions requires instead a full-fledged formalism of stochastic optimal control [39]. Nevertheless, it is possible also in that case to proceed in a way analogous to that of the present paper by applying the stochastic version of Pontryagin's principle [12,29,44].
We expect that considering these theoretical refinements will be of interest in view of the increasing available experimental resolution for efficient design of atomic force microscopes [33,16].

Acknowledgments
The authors thank S. Ciliberto for useful discussions. The work of KS was mostly performed during his stay at the Department of Mathematics and Statistics of the University of Helsinki. PMG acknowledges support from the Academy of Finland via the Centre of Excellence in Analysis and Dynamics Research (project No. 271983) and from the AtMath collaboration at the University of Helsinki.

Appendices A Evaluation of Kullback-Leibler divergences
Let us consider first the drift-less process with initial data (2). If we denote by P ω the path-space Wiener measure generated by (41) in [t ι , t f ], Girsanov formula yields The Kullback-Leibler divergence is defined as The expectation value is with respect the measure P generated by (1): The last expression readily recovers (14) as dξ t + dt ∂ ξ t U is a Wiener process with respect to P.
To show that the entropy production is proportional to the Kullback-Leibler divergence between the path-space measures of (1) and (16) we observe that The stochastic integral is evaluated in the post-point prescription as the Radon-Nikodym derivative between backward processes must be a martingale with respect the filtration of future event (see e.g. [37] for an elementary discussion).
We then avail ourselves of the time-reversal invariance of the Wiener process to express $dP/dP_R$. Finally, the definition $K(P\,\|\,P_R) = \mathrm{E}\,\ln(dP/dP_R)$, with the Radon-Nikodym derivative taken over $[t_\iota, t_f]$, recovers (15), since probability conservation entails $\mathrm{E}\,\partial_t \ln p = 0$, whilst the properties of the Stratonovich integral [40] determine the expectation value of the remaining stochastic integral.
We refer to e.g. [32,35,25,15] for thorough discussions of the significance and applications of entropy production in stochastic models of non-equilibrium statistical mechanics, and to [22,23] for applications to non-equilibrium fluctuating hydrodynamics and granular materials.
B Current velocity and acceleration in terms of the generator of the stochastic process
The current velocity is the conditional expectation, along the realizations of (1), of the time-symmetric conditional increment
$$ v(q,t) = \lim_{\tau\downarrow 0} \mathrm{E}\left[\frac{\xi_{t+\tau}-\xi_{t-\tau}}{2\,\tau}\,\Big|\,\xi_t = q\right]. $$
A relevant feature of the time symmetry is that the differential can be regarded as the result of the action of a generator including only first-order derivatives in space:
$$ v(\xi_t,t) = \bar{D}_{\xi_t}\,\xi_t, $$
where $\bar{D}_{\xi_t}$ is defined by (43). On the right-hand side of (43) there appear the scalar generator of (1),
$$ D_q = \partial_t - (\partial_q U)(q,t)\cdot\partial_q + \frac{1}{\beta}\,\partial_q^2, $$
and the generator of the dual process conjugated by time reversal of the probability density in $[t_\iota, t_f]$ [41,40]. The arithmetic average of these generators readily defines a first-order differential operator, as in the deterministic case. Analogously, we define the current acceleration as
$$ \alpha_t = a(\xi_t,t) = \bar{D}^2_{\xi_t}\,\xi_t. $$
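A small numerical check may help to fix ideas. The sketch below assumes a one-dimensional quadratic potential $U(q) = k q^2/2$ and a centered Gaussian density with variance $s(t)$. It uses the standard expression $v = -\partial_q U - \beta^{-1}\partial_q \ln p$ for the current velocity (the arithmetic average of the forward and backward drifts) and verifies by finite differences that the density is transported by $v$, i.e. $\partial_t p + \partial_q (v\,p) = 0$. All function names and parameter values are hypothetical choices for illustration.

```python
import math

BETA = 2.0   # inverse temperature (illustrative value)
K = 1.5      # stiffness of the assumed potential U(q) = K q^2 / 2
S0 = 0.2     # initial variance of the Gaussian density (illustrative value)


def variance(t):
    """Variance of the Gaussian solution: solves ds/dt = -2 K s + 2 / BETA."""
    s_eq = 1.0 / (BETA * K)
    return s_eq + (S0 - s_eq) * math.exp(-2.0 * K * t)


def density(q, t):
    """Centered Gaussian probability density with variance s(t)."""
    s = variance(t)
    return math.exp(-q * q / (2.0 * s)) / math.sqrt(2.0 * math.pi * s)


def current_velocity(q, t):
    """v = -dU/dq - (1/BETA) d ln p / dq evaluated on the Gaussian density."""
    return -K * q + q / (BETA * variance(t))


def continuity_residual(q, t, h=1e-4):
    """Central finite-difference estimate of dp/dt + d(v p)/dq."""
    dp_dt = (density(q, t + h) - density(q, t - h)) / (2.0 * h)

    def flux(x):
        return current_velocity(x, t) * density(x, t)

    dflux_dq = (flux(q + h) - flux(q - h)) / (2.0 * h)
    return dp_dt + dflux_dq


if __name__ == "__main__":
    # The residual should vanish up to finite-difference error.
    for q in (-1.0, 0.3, 0.8):
        print(q, continuity_residual(q, t=0.4))
```

At long times the variance relaxes to $1/(\beta k)$ and the current velocity vanishes, as expected for the equilibrium state of the assumed potential.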

C Pontryagin's principle
We recall the statement of Pontryagin's principle for fixed time and fixed boundary conditions [2,34].
Maximum Principle: Let the functional (44), the time integral of $L(\xi_t, \alpha_t, t)$ over $[t_\iota, t_f]$, be subject to the dynamical constraint
$$ \dot{\xi}_t = b(\xi_t, \alpha_t, t) \qquad (45) $$
and to endpoint constraints fixing $\xi_{t_\iota}$ and $\xi_{t_f}$, with the parameter $\alpha_t$ belonging for fixed $t$ to a set $U \subseteq \mathbb{R}^n$, the variable $\xi_t$ taking values in $\mathbb{R}^d$ or in an open subset $X$ of $\mathbb{R}^d$, and the time interval $[t_\iota, t_f]$ fixed. A necessary condition for a function $\bar{\alpha}_t \colon [t_\iota, t_f] \to U$ and a corresponding solution $\bar{\xi}_t$ of (45) to solve the minimization of (44) is that there exist a function $t \mapsto \bar{\pi}_t \colon [t_\iota, t_f] \to \mathbb{R}^d$ and a constant $\bar{p}_0 \le 0$ such that
• $(\bar{\pi}_t, \bar{p}_0) \neq (0, 0)$ for all $t \in [t_\iota, t_f]$ (non-triviality condition);
• for each fixed $t$, $H(q, p, p_0, t) = \max_{a \in U} \left[\, p \cdot b(q, a, t) + p_0\, L(q, a, t) \,\right]$ (maximum condition);
• $(\bar{\xi}_t, \bar{\pi}_t)$ obey the equations $\dot{\bar{\xi}}_t = \partial_{\bar{\pi}_t} H(\bar{\xi}_t, \bar{\pi}_t, \bar{p}_0, t)$ and $\dot{\bar{\pi}}_t = -\partial_{\bar{\xi}_t} H(\bar{\xi}_t, \bar{\pi}_t, \bar{p}_0, t)$ (Hamilton system condition).
The proof of the maximum principle requires subtle topological considerations culminating in the application of Brouwer's fixed point theorem. The maximum principle has, nevertheless, an intuitive content. Namely, we can reformulate the problem in an extended configuration space by adding the ancillary equation
$$ \dot{\zeta}_t = L(\xi_t, \alpha_t, t) \qquad (46a) $$
$$ \zeta_{t_\iota} = 0 \qquad (46b) $$
and looking for stationary points of an action functional in which the dynamical constraints are enforced by Lagrange multipliers. Let us make the simplifying assumption that any pair of trajectory and control variables satisfying the boundary conditions has a non-empty open neighborhood where linear variations are well defined. Looking for a stationary point of (44) entails considering variations of $\zeta_t$ that vanish at the endpoints $t_\iota$ and $t_f$. It then follows immediately that the stationary value of the Lagrange multiplier $\phi_t$ must satisfy $\dot{\phi}_t = 0$. This observation clarifies why the maximum principle is stated for some constant $p_0 \le 0$ such that $\phi_t = p_0$. In particular, if $p_0 < 0$ we can always rescale it to $p_0 = -1$ and recover the familiar form of the Hamilton equations.
Moreover, the maximum principle coincides with the Hamilton form of the stationary action principle if $b = \alpha_t$ and $L$ is quadratic in $\alpha_t$. If instead there exist stationary solutions for $p_0 = 0$, they describe abnormal controls.
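The extended-configuration-space argument above can be sketched explicitly. The sketch assumes that the extended action enforces (45) and (46a) through Lagrange multipliers $\pi_t$ and $\phi_t$; the symbol $\mathcal{S}$ and the overall sign convention are choices made here for illustration and need not coincide with those of the original display.

```latex
% Assumed extended action (symbol \mathcal{S} and signs are illustrative choices)
\[
\mathcal{S} = \int_{t_\iota}^{t_f} dt\,
  \Big[\pi_t\cdot\big(\dot{\xi}_t - b(\xi_t,\alpha_t,t)\big)
     + \phi_t\,\big(\dot{\zeta}_t - L(\xi_t,\alpha_t,t)\big)\Big]
\]
% Variation in \zeta_t with \delta\zeta vanishing at both endpoints:
\[
\delta_\zeta\mathcal{S} = -\int_{t_\iota}^{t_f} dt\,\dot{\phi}_t\,\delta\zeta_t = 0
\quad\Longrightarrow\quad \dot{\phi}_t = 0, \qquad \phi_t = p_0 .
\]
% Variations in \pi_t and \xi_t, with H = \pi\cdot b + p_0\,L evaluated at a stationary control:
\[
\dot{\xi}_t = \partial_{\pi_t} H, \qquad \dot{\pi}_t = -\partial_{\xi_t} H .
\]
% Example: b = \alpha_t, L = \|\alpha_t\|^2/2, p_0 = -1, unconstrained control
\[
\partial_{\alpha}\Big(\pi\cdot\alpha - \tfrac{1}{2}\|\alpha\|^2\Big) = 0
\quad\Longrightarrow\quad \alpha = \pi, \qquad H = \tfrac{1}{2}\|\pi\|^2 .
\]
```

Stationarity with respect to the control is the interior counterpart of the maximum condition; when the admissible set $U$ is bounded, the maximum may instead be attained on the boundary of $U$.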
Abnormal controls do not occur in the optimization problem considered in the main text. In the push regions, where the acceleration is non-vanishing, abnormal controls drive the Lagrange multiplier $\theta_t$ away from zero. Thus, they are not compatible with the occurrence of switching times between push and no-action regions. Looking for abnormal controls in the no-action region yields the requirement that all Lagrange multipliers vanish, which contradicts the non-triviality hypothesis of the maximum principle.
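To illustrate the maximum condition in the normal case $p_0 = -1$, the following sketch takes the scalar example $b = a$, $L = a^2/2$ with controls restricted to an interval $[-A, A]$. These choices, the bound `A` and the function names are assumptions made for illustration; they are not the cost functional of the main text. A brute-force maximization over a grid of admissible controls is compared with the closed-form maximizer, which is the momentum clipped to the admissible interval.

```python
def pre_hamiltonian(p, a, p0=-1.0):
    """p * b + p0 * L for the assumed example b = a, L = a^2 / 2."""
    return p * a + p0 * 0.5 * a * a


def maximize_on_grid(p, bound, n=20001):
    """Brute-force maximum of the pre-Hamiltonian over a in [-bound, bound]."""
    best_a, best_h = None, float("-inf")
    for i in range(n):
        a = -bound + 2.0 * bound * i / (n - 1)
        h = pre_hamiltonian(p, a)
        if h > best_h:
            best_a, best_h = a, h
    return best_a, best_h


def clipped_maximizer(p, bound):
    """Closed-form maximizer: a = p inside the interval, saturated outside."""
    return max(-bound, min(bound, p))


if __name__ == "__main__":
    A = 1.0  # hypothetical bound on the control
    for p in (-2.5, -0.4, 0.0, 0.7, 3.0):
        a_grid, h_grid = maximize_on_grid(p, A)
        print(p, a_grid, clipped_maximizer(p, A), h_grid)
```

For $|p| \le A$ the maximizer is interior and $H = p^2/2$, which is the familiar Hamiltonian of the stationary action principle; for $|p| > A$ the control saturates at the boundary of the admissible set.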