Maximum Principle and Second-Order Optimality Conditions in Control Problems with Mixed Constraints

Abstract: This article concerns optimality conditions for a smooth optimal control problem with endpoint and mixed constraints. Under the normality assumption, which corresponds to the full-rank condition of the associated controllability matrix, a simple proof of the second-order necessary optimality conditions based on the Robinson stability theorem is derived. The main novelty of this approach compared with the known results in this area is that only local regularity with respect to the mixed constraints, that is, regularity in an ε-tube about the minimizer, is required instead of the conventional, stronger global regularity hypothesis. This affects the maximum condition. Therefore, the normal set of Lagrange multipliers in question satisfies the maximum principle, albeit with a modified maximum condition, in which the maximum is taken over a reduced feasible set. In the second part of this work, we address the case of abnormal minimizers, that is, the case when the full-rank condition on the controllability matrix is not valid. The same type of reduced maximum condition is obtained.


Introduction
In this article, second-order necessary optimality conditions for an optimal control problem with mixed equality and inequality constraints are investigated. Under the normality condition, which is ensured by the full rank of the controllability matrix, a rather simple proof of the optimality conditions is proposed, based on Robinson's theorem on metric regularity for set-valued mappings. For the case in which the normality condition is violated, the second-order conditions are derived based on the index approach. This means that a certain reduced cone of Lagrange multipliers is invoked, which is defined by using the index of the quadratic form of the Lagrange function; see, e.g., [1][2][3].
In [4], two notions of regularity of an admissible trajectory with respect to the mixed constraints, strong and weak, were considered. Strong regularity means that the constraint qualification, the so-called Robinson condition, is satisfied for all time points and for all admissible control values. This corresponds to the regularity condition in the classical sense. Weak regularity means that this condition is satisfied merely in some neighborhood of the optimal process. By their nature, these two concepts correspond, respectively, to the global and local regularity settings. Under weak regularity, a refined maximum condition of Pontryagin's type has been obtained, in which the maximum is taken over the closure of the regular points of the feasible set, rather than over the entire feasible set. In this article, the results of [4] are carried over to the second-order conditions in the case of a global minimum.
The literature on optimality conditions for optimal control problems with mixed constraints is extensive. In the context of this research related to the study of mixed constraints, we note the works [5][6][7][8][9][10][11][12]. Regarding the second-order conditions in mixed-constrained problems, one may consult, e.g., [3,13,14] and the bibliography cited therein. At the same time, these selective lists of publications are far from exhaustive.
This work is organized as follows. In the next section, the problem formulation is presented, together with the main definitions and notation. In Section 3, the issue of normality is discussed. In Section 4, the main result of this work, namely the normal maximum principle and second-order optimality conditions, is formulated and proved. In Section 5, the abnormal situation is taken into consideration, and the result of the previous section is refined. Section 6 concludes the work with a short summary.
The mappings ϕ : R^{2n} → R and e_i : R^{2n} → R^{k_i}, i = 1, 2, satisfy the following hypothesis.
The vector p = (x_0, x_1) is termed the endpoint, and, accordingly, the constraints given by the mappings e_1, e_2 are termed the endpoint constraints. The scalar function ϕ(p) defines the functional to be minimized. The mappings r_1, r_2 define the mixed constraints, which are imposed on both the state and control variables.
This concept of the minimum is known as a global strong minimum. The purpose of this work is to derive the second-order necessary optimality conditions for this type of minimum under the normality assumptions, that is, to find a set of Lagrange multipliers that simultaneously satisfies the maximum principle and Legendre's condition, and for which λ_0 > 0. Such a set of multipliers must be unique upon normalization. The abnormal situation is examined after the normal case.
Consider the reference control process (x̄, ū), which can be optimal, extremal, regular, or normal in what follows. Denote by r = (r_1, r_2) the joint mapping acting into R^q, where q = q_1 + q_2. Let J(x, u, t) := {j : r^j(x, u, t) = 0} be the set of active indices, where the upper index specifies the vector component. Set J(u, t) := J(x̄(t), u, t). Let U(·) designate the closure of the function ū(·) w.r.t. the Lebesgue measure; that is, for a given t ∈ [0, 1], the set U(t) consists of the essential values of ū(·) at the point t, [8]. Recall that a vector a is said to be an essential value of a function u(·) at a point τ, provided that meas({t ∈ [τ − ε, τ + ε] : u(t) ∈ B_ε(a)}) > 0 for every ε > 0, where B_ε(a) is the closed ball centered at a with the radius ε, and meas designates the Lebesgue measure on R.
The main regularity concept is as follows.
Definition 1. The control process (x̄, ū) is said to be regular w.r.t. the mixed constraints, provided that, for all t ∈ [0, 1] and for all u ∈ U(t), the active gradients (r^j)_u(x̄(t), u, t), j ∈ J(u, t), are linearly independent.
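Definition 1 can be checked numerically at a fixed point (t, u): collect the gradients of the active constraints as the rows of a matrix and test for full row rank. The sketch below is illustrative only; the function name, the activity tolerance, and the concrete matrices are assumptions, not part of the article.

```python
import numpy as np

def is_regular_point(r_u, r_vals, tol=1e-10, active_tol=1e-8):
    """Regularity test at a fixed (t, u): the gradients (r_j)_u of the
    active constraints (those with r_j = 0) must be linearly independent.

    r_u    : (q, m) array whose rows are the gradients (r_j)_u
    r_vals : (q,) array of the constraint values r_j
    """
    active = np.abs(r_vals) <= active_tol   # activity test (tolerance is an assumption)
    G = r_u[active]
    if G.shape[0] == 0:
        return True                          # no active constraints at this point
    return bool(np.linalg.matrix_rank(G, tol=tol) == G.shape[0])

# two active gradients in R^3, linearly independent => regular point
G = np.array([[1.0, 0.0, 2.0],
              [0.0, 1.0, -1.0]])
vals = np.array([0.0, 0.0])
print(is_regular_point(G, vals))  # True
```

In the article, this test must hold for all t and all essential values u ∈ U(t); the sketch covers one point only.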
The following proposition represents an equivalent reformulation of the introduced regularity concept. For ε ≥ 0, define the set J^ε of ε-active indices, which is subject to the same conventions as the mapping J(x, u, t). It is clear that J ⊆ J^ε and J^0 = J.

Proposition 1. Let the control process (x̄, ū) be regular w.r.t. the mixed constraints. Then, there exists a number ε_0 > 0 such that, for all t ∈ [0, 1] and for almost all s ∈ [0, 1] such that |s − t| ≤ ε_0, the ε_0-active gradients (r^j)_u(x̄(s), ū(s), s), j ∈ J^{ε_0}(s), are linearly independent. Moreover, the number ε_0 can be chosen such that the modulus of surjectivity for this set of gradients is not lower than ε_0.
Under the regularity condition given in Definition 1, the multipliers ψ and ν are uniquely defined by the vector λ, where (λ, ψ, ν) is the set of Lagrange multipliers corresponding to (x̄, ū) in view of the maximum principle. This assertion simply follows from the Euler–Lagrange equation. Then, denote by Λ = Λ(x̄, ū) the set of vectors λ ∈ (R^{1+k_1+k_2})^* for which there exists (ψ, ν) such that the corresponding set of Lagrange multipliers (λ, ψ, ν) generated by λ satisfies the maximum principle.

Normality Condition
Let us introduce the notion of normality. This notion is based on the concept of linearization of the control problem and the corresponding variational differential system. Consider the reference control process (x̄, ū), and a pair (δx_0, δu) ∈ X := R^n × L_2([0, 1]; R^m). Denote by δx(·) the solution to the variational differential equation on the time interval [0, 1] which corresponds to (δx_0, δu), where δx(0) = δx_0. Such a solution exists on the entire time interval [0, 1]. In what follows, it is not restrictive to set e_1(p̄) = 0; thus, all the endpoint constraints of the inequality type are assumed to be active. Consider the following two subspaces in X. Here, e is the joint mapping of e_1, e_2; δp = (δx_0, δx_1), where δx_1 = δx(1); and D(t) is the diagonal q × q-matrix which has 1 in the position (j, j) iff j ∈ J(t), and 0 otherwise.
Consider the matrix M(t). Here, A^+ stands for the generalized inverse [15]. The generalized inverse R(t)^+ can be computed as follows. Let T(t) be a non-singular orthogonal linear transform which maps the subspace ker R(t)^⊥ onto the subspace of R^m with the first m − q(t) coordinates vanishing. Let Φ(t) be the solution to (8) with Φ(0) = I, and let P(t) be the matrix of orthogonal projection onto ker R(t).
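The quantities R(t)^+ and P(t) admit a direct numerical illustration: for a matrix R of full row rank (as guaranteed by regularity), the Moore–Penrose generalized inverse is available as numpy.linalg.pinv, and I − R^+R is the orthogonal projector onto ker R. A minimal sketch, in which the concrete matrix is a made-up example:

```python
import numpy as np

# R: active-gradient matrix at a fixed time t (full row rank by regularity);
# the concrete entries below are a made-up example
R = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, 1.0]])

R_plus = np.linalg.pinv(R)               # Moore-Penrose generalized inverse R^+
P = np.eye(R.shape[1]) - R_plus @ R      # orthogonal projector onto ker R

assert np.allclose(R @ P, 0)             # im P lies in ker R
assert np.allclose(P @ P, P)             # P is idempotent (a projector)
assert np.allclose(R @ R_plus, np.eye(2))  # full row rank => R R^+ = I
print("ok")
```

For full row rank, pinv(R) coincides with R^T(RR^T)^{-1}, which matches the construction via the orthogonal transform T(t) described above.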
It is clear that, by virtue of the construction, any element (δx_0, δu) ∈ N_r can be represented as in (10), where V[δx_0, δu] = M(t)δx(t). Conversely, any δx_0 ∈ R^n and δv ∈ L_2([0, 1]; R^m) with δv(t) ∈ ker R(t) a.e. yield an element of N_r as (δx_0, δv + V[δx_0, δv]) ∈ N_r. Therefore, there is a one-to-one correspondence between N_r and the space of the above-specified elements (δx_0, δv). At the same time, the formula for the solution δx in N_r is given by (10).
Let us proceed with the construction of the controllability matrix. Define the R^{n×k}-matrix A as A = e_{x_0}(p̄) + e_{x_1}(p̄)Φ(1), and let B(t) be the corresponding R^{m×k}-matrix. The controllability matrix Q is then introduced as the corresponding R^{k×k}-matrix.

Definition 3. The regular control process (x̄, ū) is said to be normal, provided that Q > 0 or, equivalently, rank Q = k.
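Since the defining formula for Q does not survive in this text, the sketch below illustrates only the normality test itself, Q > 0 (equivalently, rank Q = k), on a Gramian-type matrix assembled from samples of a hypothetical B(t); the Riemann-sum accumulation and the toy data are assumptions.

```python
import numpy as np

def gramian(B_samples, dt):
    """Riemann-sum approximation of a Gramian-type k x k matrix
    sum_t B(t)^T B(t) dt built from samples of B(t)."""
    k = B_samples[0].shape[1]
    Q = np.zeros((k, k))
    for B in B_samples:
        Q += B.T @ B * dt
    return Q

ts = np.linspace(0.0, 1.0, 101)
dt = ts[1] - ts[0]
Bs = [np.array([[1.0, t], [0.0, 1.0]]) for t in ts]  # hypothetical B(t), k = 2

Q = gramian(Bs, dt)
normal = bool(np.all(np.linalg.eigvalsh(Q) > 1e-12))  # Q > 0 <=> rank Q = k
print(normal)  # True for this toy B(t)
```

The eigenvalue test is numerically preferable to a rank computation here, since Q is symmetric by construction.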
Define the cone K. On the space X, consider the quadratic form Ω. Here and further, for convenience of notation, w = (x, u) and δw(t) = (δx(t), δu(t)). The main result of this section consists in the following theorem.

Theorem 1. Let (x̄, ū) be an optimal control process in Problem (1). Suppose that this process is normal. Then, Estimate (11) holds on the cone K.
Lemma 1. Consider linear bounded operators A and A_i, i = 1, 2, ..., acting in a given Hilbert space X, such that A_i → A pointwise. Assume that the spaces im A_i and im A_i^* are closed and that im A ⊆ im A_i for all i. Assume also that the sequence of norms ‖(A_i A_i^*)^{-1}|_{im A_i}‖ is uniformly bounded. Let C ⊆ X be a closed and convex set.
Let us confirm the inverse embedding. Given ξ_0, consider the problem of minimizing ‖ξ − ξ_0‖² over the set A_i^{-1}(Aξ_0), and denote its solution by ξ_i. The solution exists since im A ⊆ im A_i and since the quadratic functional is weakly lower semi-continuous, whereas the closed convex set A_i^{-1}(Aξ_0) is weakly closed. Since the image of A_i is closed, one can apply the Lagrange multiplier rule as follows. There exists a non-zero vector λ_i ∈ im A_i such that ξ_i − ξ_0 = A_i^* λ_i. Applying A_i, the multiplier is expressed as λ_i = (A_i A_i^*)^{-1}(Aξ_0 − A_i ξ_0). Therefore, ‖ξ_i − ξ_0‖ ≤ const · ‖Aξ_0 − A_i ξ_0‖ by the assumption of the lemma. However, A_i ξ_0 → Aξ_0, and thus, ξ_i → ξ_0.
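The minimization step in this proof has an explicit finite-dimensional analogue: the nearest point to ξ_0 in the affine set {ξ : A_i ξ = Aξ_0} is ξ_0 + A_i^+(Aξ_0 − A_i ξ_0), and it converges to ξ_0 as A_i → A. A toy numpy check, in which the matrices are invented for illustration:

```python
import numpy as np

def nearest_point(A_i, y, xi0):
    """Minimizer of |xi - xi0|^2 over the affine set {xi : A_i xi = y};
    for surjective A_i it equals xi0 + A_i^+ (y - A_i xi0)."""
    return xi0 + np.linalg.pinv(A_i) @ (y - A_i @ xi0)

A = np.array([[1.0, 0.0, 1.0]])         # limit operator (toy example)
xi0 = np.array([1.0, 2.0, 3.0])

for eps in (1e-1, 1e-2, 1e-3):
    A_i = A + eps * np.array([[0.0, 1.0, 0.0]])   # A_i -> A as eps -> 0
    xi_i = nearest_point(A_i, A @ xi0, xi0)       # solves A_i xi = A xi0
    print(eps, np.linalg.norm(xi_i - xi0))        # distance vanishes with eps
```

The printed distances shrink proportionally to eps, mirroring the estimate ‖ξ_i − ξ_0‖ ≤ const · ‖Aξ_0 − A_i ξ_0‖ in the proof.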
Proof of Theorem 1. By virtue of Theorem 3.5 in [4] and the regularity of the process (x̄, ū), there exists a set of multipliers (λ, ψ, ν) satisfying the maximum principle, such that λ ≠ 0.
Let us proceed to the proof of the second-order condition (11). Take a number ε > 0. Let D_ε(t) designate the diagonal q × q-matrix defined as D(t) but, now, with the set J(t) replaced by J_ε(t). Define the cone K_ε in the same way as K, but with the matrix D(t) replaced by D_ε(t). It is clear that K_ε ⊆ K for all ε > 0. Firstly, we prove (11) for the reduced cone K_ε. Consider the space Y_ε, and define the mapping F_ε : X_∞ → Y_ε as follows. The Fréchet derivative of F_ε is given by the pair (A_ε, B), where A_ε has the components A_j(δx_0, δu) := (r^j)_x(x̄(t), ū(t), t)δx(t) + (r^j)_u(x̄(t), ū(t), t)δu(t), t ∈ T_j^ε, and B(δx_0, δu) := e'(p̄)δp. The proof of this fact involves a standard argument. Firstly, consider this derivative as the extended linear mapping acting from X to ∏_{j=1}^q L_2(T_j^ε; R) × R^k, that is, in Hilbert spaces. Let us prove its surjectivity. Since the linear mapping A_ε is surjective due to regularity w.r.t. the mixed constraints (this is a simple task to ensure by solving the corresponding Volterra equation and using Proposition 1), it is sufficient to show that the linear mapping B is surjective on ker A_ε. Let Q_ε be the matrix constructed as Q, however, with the matrix D(t) replaced by D_ε(t). Therefore, one has Q_ε > 0 for all sufficiently small ε. At the same time, this condition implies that B(ker A_ε) = R^k. Therefore, it is simple to conclude that (A_ε, B) is a surjective linear mapping for all sufficiently small ε.
The surjectivity of (A_ε, B) as the linear mapping from X_∞ to Y_ε results from the following simple argument. Firstly, notice that, in the space X, one has the relation (15), which is clear due to Formulas (9) and (10), as these still hold when D(t) is replaced by D_ε(t) for a sufficiently small ε. Then, simply, N_r = N_r(ε) = ker A_ε. At the same time, the linear operator A_ε is surjective as the mapping from X_∞ to ∏_{j=1}^q L_∞(T_j^ε; R) by virtue of the same arguments involving the solution to a Volterra equation. However, the image of B is finite-dimensional, whereas, as has already been confirmed, B is surjective on the space ker A_ε. Therefore, by virtue of (15), one finds that B is surjective on the subspace ker A_ε ∩ X_∞. Thus, the derivative F'_ε(x̄_0, ū(·)) is surjective and, thereby, (x̄_0, ū(·)) is a normal point for the mapping F_ε.
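The finite-dimensional core of this surjectivity argument (A_ε surjective, plus B surjective on ker A_ε) can be sketched numerically. In the sketch, A and B are hypothetical matrix stand-ins for the infinite-dimensional operators A_ε and B:

```python
import numpy as np

def surjective_on_kernel(A, B, tol=1e-10):
    """Check that B maps ker A onto all of R^k, which (together with the
    surjectivity of A itself) yields the surjectivity of the pair (A, B)."""
    _, s, Vt = np.linalg.svd(A)
    rank_A = int(np.sum(s > tol))
    N = Vt[rank_A:].T                         # orthonormal basis of ker A
    k = B.shape[0]
    return bool(np.linalg.matrix_rank(B @ N, tol=tol) == k)

A = np.array([[1.0, 0.0, 0.0, 0.0]])          # ker A = span{e2, e3, e4}
B = np.array([[0.0, 1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, 0.0]])          # k = 2
print(surjective_on_kernel(A, B))  # True: B covers R^2 on ker A
```

The SVD-based kernel basis plays the role that Formulas (9) and (10) play in the article: it parametrizes ker A so that the surjectivity of B can be tested on that subspace alone.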
Consider the inequality (17), which results from the condition of minimum and from the fact that the control process (x(·; τ), u(·; τ)) is admissible. Consider the second-order variational system. Then, by expanding into the Taylor series in (17), one obtains the corresponding relations, where, here and from now on, the dependence on the optimal process is, for simplicity, omitted. Therefore, using these relations and the transversality conditions (3), and by gathering the terms with τ and τ² into two different groups, we obtain, as an implication of (5), Estimate (11) for the given (δx_0, δu) ∈ K_ε ∩ X_∞. Then, Estimate (11) is proven on the cone K_ε by a simple passage to the limit in X.
Let us pass to the limit as ε → 0 and prove (11) on the entire cone K. Take h ∈ K. One needs to justify that, for all ε > 0, there exists h_ε ∈ K_ε such that h_ε → h in X. Indeed, then, (11) is proven by a simple passage to the limit. The existence of such h_ε is yielded by Lemma 1. Indeed, it is a straightforward task to verify that the derivative operator F'_ε satisfies all the assumptions of this assertion: it obviously converges pointwise to F'_0 as ε → 0, while its image is closed, as was confirmed above. The image of the conjugate operator is closed due to the regularity of the reference control process with respect to the mixed constraints, which is merely a technical step to ensure. Another technical step is to assert that F'_ε [F'_ε]^* is positive due to normality. Moreover, the constant of covering does not depend on ε. It is also a straightforward task to verify that the rest of the assumptions hold if we consider C_0 as C.
The proof is complete.

Abnormal Case
In this section, we consider the case when rank Q < k. This case, when the normality condition is not satisfied, is called abnormal. Then, as a simple example can show, Theorem 1 fails to hold. Firstly, the normalized multiplier λ is no longer unique; moreover, there may not exist a multiplier from the cone Λ for which Estimate (11) is still valid everywhere on K. Therefore, Theorem 1 requires a certain refinement in the abnormal case. Let us formulate the "abnormal" version of this statement. In this endeavor, we follow the method based on the so-called index approach.
Consider the reduced cone of Lagrange multipliers Λ_a = Λ_a(x̄, ū), which contains the multipliers λ ∈ Λ satisfying the corresponding index condition. Here, the notation ind_X stands for the index of a quadratic form over the space X, and N = N_e ∩ N_r. Consider also the following extra hypotheses.
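In finite dimensions, the index of a quadratic form restricted to a subspace is the number of negative eigenvalues of the restricted form. A small numpy sketch, in which the matrix H and the subspace are hypothetical examples:

```python
import numpy as np

def index_on_subspace(H, V, tol=1e-10):
    """Index (number of negative eigenvalues) of the quadratic form with
    symmetric matrix H, restricted to the subspace spanned by the columns of V."""
    Qb, _ = np.linalg.qr(V)        # orthonormal basis of the subspace
    H_red = Qb.T @ H @ Qb          # matrix of the restricted form
    return int(np.sum(np.linalg.eigvalsh(H_red) < -tol))

H = np.diag([1.0, -2.0, -3.0])                        # toy quadratic form
V = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]])    # span{e1, e2}
print(index_on_subspace(H, V))   # 1: only the e2-direction is negative
```

Restricting to a subspace can only decrease the index (here, from 2 on the whole space to 1 on span{e1, e2}), which is the mechanism by which the index condition carves out the reduced cone Λ_a.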

Hypothesis 3 (H3).
The mixed constraints are globally regular, that is, U(x, t) = U_R(x, t) for all x and t. Moreover, the set-valued mapping U(x, t) is uniformly bounded.
The main result of this section is as follows.
Theorem 2. Let (x̄, ū) be an optimal control process in Problem (1). Suppose that this process is regular with respect to the mixed constraints.
Then, Λ_a ≠ ∅. Moreover, under (H2) and (H3), Estimate (18) holds. In the case of a local weak minimum, Estimate (18) has been proven in [14]. Here, our task is to prove it in the case of a global strong minimum, that is, a minimum of Pontryagin's type. In [3], the condition that the cone Λ_a is non-empty has been proven in the class of generalized controls. Note that, under the normality condition on the optimal control process, Estimate (18) implies (11), since the normalized multiplier is unique. Thus, Theorem 2, in essence, represents a stronger assertion than Theorem 1, albeit under some extra assumptions such as (H2) and (H3). These two assumptions are meant to simplify the presentation. Note that it suffices to impose (H3) along the optimal trajectory only.
Proof. The proof of the theorem is divided into two stages.

STAGE 1. In this stage, we prove that Λ_a ≠ ∅. In the beginning, suppose that Hypothesis (H3) is valid. Under (H1) and (H3), it is convenient to assume that f(x, u, t) and r(x, u, t) are constant with respect to (x, u) and t outside of some sufficiently large ball. This can be obtained by a simple problem reduction. In what follows, it will also not be restrictive to consider that ϕ(p̄) = 0 and, for simplicity of exposition, that all the constraints are scalar-valued. Let a, b be non-negative numbers, and consider the mapping ∆ = ∆(a, b). This function is lower semi-continuous. It will serve as a penalty function in the method applied below.
Take a pair (x_0, u) ∈ X, and consider the unique solution to the Cauchy problem ẋ(t) = f(x(t), u(t), t), x(0) = x_0, which exists on the entire time interval [0, 1] due to the above assumptions. Set p = (x_0, x_1), where x_1 = x(1). Note that p depends on (x_0, u). Let {ε_i} be an arbitrary sequence of positive numbers converging to zero. Consider the mapping ϕ_i^+(p) = (ϕ(p) + ε_i)^+, where a^+ = max{a, 0} for a ∈ R. Thus, the functional F_i over the space X is well-defined. The functional F_i is lower semi-continuous, which is a straightforward exercise to verify due to the assumptions made above regarding the mappings f and r. At the same time, this functional is everywhere positive: F_i > 0.
Consider the following problem: Minimize F_i(x_0, u) over (x_0, u) ∈ X.
Note that F_i(x̄_0, ū) = ε_i. By applying the smooth variational principle (see, e.g., [17]), for each i, there exist an element (x_{0,i}, u_i) ∈ X and a sequence of elements (x̃_j, ũ_j) ∈ X, j = 1, 2, ..., converging to (x_{0,i}, u_i), such that (19) holds and the pair (x_{0,i}, u_i) is the unique solution to the following problem. Suppose that ϕ_i^+(p_i) = 0. Then, ϕ(p_i) < 0 and, in view of optimality, taking into account that ϕ(p̄) = 0, it follows that some of the constraints in (1), e_1, e_2, or r, are violated. Therefore, by the definition of ∆, one has F_i(x_{0,i}, u_i) ≥ 1. However, this contradicts (19) for i > 1. Thus, ϕ_i^+(p_i) > 0. Consider a number δ_i > 0 such that ϕ_i^+(p) > 0 for all p with |p − p_i| ≤ δ_i. Then, by virtue of, again, the definition of ∆, the pair (x_{0,i}, u_i) is the unique global minimum in the following control problem (21). Denote by x_i, z_i the solution to (21), that is, the trajectory corresponding to the pair (x_{0,i}, u_i(·)). Note that the function z_i(·) is constant, and thus it can be treated simply as a number z_i ∈ R.
Problem (21) is, as a matter of fact, unconstrained. Consider the first- and second-order necessary optimality conditions for this problem.
The first-order conditions are stated as follows. There exist a number λ_i^0 > 0 and absolutely continuous conjugate functions ψ_i and σ_i, which correspond to x_i and z_i, respectively, such that relations (22)–(24) hold for a.a. t ∈ [0, 1].
Here, ρ_i ∈ R is the multiplier corresponding to the constraint z_0 = ϕ_i^+(p). Conditions (22)–(24) are the first-order optimality conditions in the form of the maximum principle. Consider the second-order optimality conditions for Problem (21).
The solution to (25) exists, and it is unique on the entire time interval [0, 1] due to the assumptions made above. The function δz_i(·) is obviously constant; thus, it is treated simply as a number δz_i in what follows.
On the space X, consider the quadratic form Ω_i. Then, the second-order necessary optimality condition is given by Inequality (26). (Note that the functional F(x_0, u) is not twice continuously differentiable. At the same time, the scalar function F(x_{0,i} + τδx_0, u_i + τδu) of τ possesses the second derivative w.r.t. τ at τ = 0, provided that (δx_0, δu) ∈ N_i. Using this fact, and the fact that Problem (21) is unconstrained, it is simple to derive (26) by applying direct variational arguments.) The next step is to pass to the limit as i → ∞ in the obtained optimality conditions. Firstly, it follows from (20) that x_{0,i} → x̄_0 and u_i(t) → ū(t) strongly in L_2, and, thereby, x_i(t) ⇒ x̄(t) uniformly on [0, 1]. Then, z_i → 0. Consider the normalization (27) for the multipliers. However, due to (19), one has ‖z_i^{-1} r(x_i(t), u_i(t), t)^+‖_{L_2} → 0. This, together with (27), implies that σ_i(0) → 0. Then, the transversality condition and, again, (19) and (27) simply yield that λ_i^0 − ρ_i → 0. By passing to a subsequence, in view of the compactness argument, one may assume from (27) that λ_i → λ, ψ_i(t) ⇒ ψ(t), and ν_i → ν weakly in L_2 as i → ∞ for some multipliers λ = (λ^0, λ^1, λ^2), ψ, and ν. Then, ρ_i → λ^0. It is also clear that, by passing to a subsequence, one can assert that λ_i^0 z_i^{-4} → ∞. Indeed, otherwise, all the multipliers converge to zero, contradicting (27). By virtue of the regularity of the optimal control process with respect to the mixed constraints, for each i, there exists a control function ζ_i such that ζ_i(t) ∈ U(x_i(t), t) a.e. and ζ_i → ū in L_∞. Thus, from the maximum condition (24), it follows that r(x_i(t), u_i(t), t)^+ → 0 uniformly. Since the set U(x̄(t), t) is uniformly bounded, this implies, again due to regularity, that the control functions u_i are essentially bounded uniformly with respect to i, that is, ‖u_i‖_{L_∞} ≤ const.
Then, as is known, ker A_i → X_ε(σ) := ker A ⊆ X(E_σ). It is clear that X_ε(σ) → X_ε := X_ε(0) as σ → 0 by virtue of its definition and the regularity condition. One needs to use the solution to a corresponding Volterra equation to prove this simple fact. Then, Lemma 1 yields that X_ε → N_r as ε → 0 if we consider C = {0}. Here, when treating the convergence of spaces, the symbol '→' stands for Limsup.
Let Π_i ⊆ X denote the kernel of the endpoint operator e'(p_i)δp_i. It is clear that codim Π_i ≤ k. Then, it is a simple exercise to ensure the existence of a subspace Π ⊆ N_e such that codim Π ≤ k and the corresponding limit relation holds, where Limsup is total: firstly, as i → ∞; then, as σ → 0; and, finally, as ε → 0. At the same time, note that T_i ∩ E_σ ⊆ T_ε(σ) for all large i. Therefore, one has the embedding ker A_i ∩ Π_i ⊆ N_i ∩ X(E_σ), and, then, the passage to the limit in (26) gives the condition Λ_a ≠ ∅. In the latter deduction, Proposition 1 of [14] has been used, as well as the fact that the terms with δz_i² in Ω_i converge to zero in view of (19) and (27). Now, it is necessary to remove the extra assumptions imposed in (H3) regarding the boundedness and global regularity. However, this can be done following precisely the same method as presented in [4]. Take c > 0, and consider the additional control constraint |u| ≤ c. For each ε > 0, there will be N specifically constructed regular selectors of U(x, t), which are surrounded with N ε-tubes, as in the above-cited source. Then, the passage to the limit, firstly as ε → 0, then as N → ∞, and, at the end, as c → ∞, will complete the proof of Stage 1.
The full proof of the next stage is rather lengthy. Therefore, let us present it schematically, in sketch form, exposing the main idea.