Mathematics
  • Article
  • Open Access

7 April 2021

Invariant Geometric Curvilinear Optimization with Restricted Evolution Dynamics

Department of Mathematics and Informatics, University Politehnica of Bucharest, 060042 Bucharest, Romania
This article belongs to the Special Issue New Advances in Differential Geometry and Optimizations on Manifolds

Abstract

This paper begins with a geometric statement of constrained optimization problems, which include both equality- and inequality-type restrictions. The cost to optimize is a curvilinear functional defined by a given differential one-form, while the optimal state to be determined is a differentiable curve connecting two given points, among all the curves satisfying some given primal feasibility conditions. The resulting outcome is an invariant curvilinear Fritz–John maximum principle. Afterward, this result is approached by means of parametric equations. The classical single-time Pontryagin maximum principle for curvilinear cost functionals is revealed as a consequence.

1. Introduction

An optimization problem generally refers to searching for the extrema of an objective function (which, depending on the context, may also be called the cost function, utility function, energy function, or reward function). Finding necessary conditions for optimality is the primary step in nonlinear programming. For unconstrained optimization, the necessary conditions are provided by Fermat’s theorem. Advancing to constrained optimization, when only equality-type restrictions are assumed, the main tool is the method of Lagrange multipliers []. If inequality-type constraints are also involved, then optimality is characterized by the so-called Fritz–John necessary conditions (see [,,,]). They also provide an important tool for the proof of the Karush–Kuhn–Tucker (KKT) conditions, provided that some regularity conditions are satisfied [,,]. Overall, both the Fritz–John and Karush–Kuhn–Tucker approaches to nonlinear programming generalize the method of Lagrange multipliers.
On the other hand, at a more extensive level, when the constraints are simultaneously dynamic (meaning that they define a certain type of evolution expressed in terms of differential equations) and controlled (meaning that some control variables are involved in the process), while the cost functional also depends on the controlling elements, the resulting problem is called an optimal control problem. Optimal control is an extension of the calculus of variations, and the solution method is provided by Pontryagin’s maximum principle (see [,,]). Interesting extensions of the classical optimal control problem were obtained by replacing the single-time variable with a multidimensional one; in this case, the evolution dynamics is defined by a PDE system, while the cost functional may take several forms ([,,,]). Recent developments in optimal control theory have also provided a geometric approach ([,,]).
This paper initiates a new geometric perspective for the optimization problem by going from punctual-type state variables to curvilinear-type state variables. The main result provides adequate Fritz–John optimality conditions. In light of this perspective, the dynamical features of the optimal control problems are subsequently included in the dynamics of the curves. By rethinking the time, states, and controls taken together as a generalized curvilinear-type state variable, the dynamic optimal control issue is restated as an optimization problem, and Pontryagin’s maximum principle results directly from the Fritz–John necessary conditions.

2. Geometric Setting

Let $W$ be an $s$-dimensional Riemannian manifold. We denote by $\mathcal{F}(W)$ the set of all differentiable functions on $W$ and by $\Lambda^1(W)$ the $\mathcal{F}(W)$-module of differential one-forms on $W$. If $x_0, x_1$ are two fixed points in $W$, let $S^1_{x_0 x_1}(W)$ be the set of all smooth oriented curves $\Gamma$ from $x_0$ to $x_1$.
Definition 1.
A function $F : S^1_{x_0 x_1}(W) \to \mathbb{R}$ for which there exists a differential one-form $\eta \in \Lambda^1(W)$ such that:
$$F[\Gamma] = \int_\Gamma \eta, \quad \forall\, \Gamma \in S^1_{x_0 x_1}(W)$$
is called a curvilinear functional.
In order to emphasize the connection between the curvilinear functional $F$ and the corresponding differential one-form $\eta$, in the following, we use the notation $F = [\eta]$. Therefore, if $\eta \in \Lambda^1(W)$, then:
$$[\eta](\Gamma) = \int_\Gamma \eta.$$
Let $[\Lambda^1(W)] = \{ [\eta] : \eta \in \Lambda^1(W) \}$. Moreover, let $\Lambda^1_*(W) \subset \Lambda^1(W)$ be the subset of all the exact one-forms (recall that $c$ is an exact one-form if there exists a differentiable function $f \in \mathcal{F}(W)$ such that $c = df$). Then, $[\Lambda^1_*(W)]$ is the set of constant curvilinear functionals. Indeed, if $f \in \mathcal{F}(W)$, then:
$$[df](\Gamma) = \int_\Gamma df = f(x_1) - f(x_0) = [df](\Gamma'), \quad \forall\, \Gamma, \Gamma' \in S^1_{x_0 x_1}(W).$$
In conclusion, a differential one-form $\eta$ is completely integrable if there is a non-constant differentiable function $\mu : W \to \mathbb{R}$ such that $[\mu\eta]$ is a constant curvilinear functional. In the physical approach, if $\eta = \mathbf{F} \cdot d\mathbf{r}$, then the curvilinear functional $[\eta]$ gives the work done by the force $\mathbf{F}$ along the curve $\Gamma$, where $\mathbf{F}$ is the force vector field acting on the object and $d\mathbf{r}$ is the infinitesimal displacement along the curve. In particular, when the force $\mathbf{F}$ is conservative, meaning that there exists a scalar potential $f$ such that $\eta = \mathbf{F} \cdot d\mathbf{r} = df$, the work depends only on the endpoints and not on the route followed, and it is called the potential energy.
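The difference between exact and non-exact one-forms can be checked numerically. The following is a minimal sketch in the plane, assuming a simple midpoint rule; the sample forms, the two curves, and all function names are illustrative choices and do not come from the paper.

```python
def line_integral(form, curve, n=2000):
    """Approximate the integral of a one-form P dx + Q dy along a curve.

    form(x, y) returns the coefficients (P, Q); curve(t) returns (x, y)
    for t in [0, 1]. A midpoint rule on n chords is used.
    """
    total = 0.0
    for k in range(n):
        x0, y0 = curve(k / n)
        x1, y1 = curve((k + 1) / n)
        p, q = form(0.5 * (x0 + x1), 0.5 * (y0 + y1))
        total += p * (x1 - x0) + q * (y1 - y0)
    return total


def exact_form(x, y):
    # df for the sample potential f(x, y) = x^2 * y
    return (2.0 * x * y, x * x)


def rotation_form(x, y):
    # -y dx + x dy, which is not exact
    return (-y, x)


def straight(t):
    # straight segment from (0, 0) to (1, 1)
    return (t, t)


def parabola(t):
    # parabolic arc from (0, 0) to (1, 1)
    return (t, t * t)


exact_straight = line_integral(exact_form, straight)
exact_parabola = line_integral(exact_form, parabola)
rot_straight = line_integral(rotation_form, straight)
rot_parabola = line_integral(rotation_form, parabola)
```

For the exact form, both curves should give approximately $f(1,1) - f(0,0) = 1$, while the non-exact form gives different values on the two curves (approximately $0$ and $1/3$), so the corresponding curvilinear functional is not constant.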
Definition 2.
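Let $\Gamma$ be an element of $S^1_{x_0 x_1}(W)$. A smooth variation of $\Gamma$ in $S^1_{x_0 x_1}(W)$ is a smooth surface on $W$, $\gamma = \gamma_\epsilon(t) : (-\delta, \delta) \times [0,1] \to W$, such that $\gamma_0([0,1]) = \Gamma$ and $\gamma_\epsilon(0) = x_0$, $\gamma_\epsilon(1) = x_1$, for each $\epsilon$. The set:
$$T_\Gamma S^1_{x_0 x_1}(W) = \left\{ X \in \mathcal{X}_\Gamma(W) : \exists \text{ a smooth variation } \gamma \text{ of } \Gamma \text{ in } S^1_{x_0 x_1}(W), \ \left.\frac{\partial \gamma_\epsilon}{\partial \epsilon}\right|_{\epsilon = 0} = X \right\}$$
is called the set of feasible vector fields along the curve $\Gamma$.
We note that:
$$T_\Gamma S^1_{x_0 x_1}(W) = \left\{ X \in \mathcal{X}_\Gamma(W) : X(x_0) = X(x_1) = 0 \right\}.$$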
The following definition is essential for expressing the invariant inequality constraints.
Definition 3.
Let $\omega_1, \omega_2 \in \Lambda^1(W)$ be two differential one-forms and $\Gamma \in S^1_{x_0 x_1}(W)$ be an oriented curve. We write:
$$\omega_1|_\Gamma \le \omega_2|_\Gamma$$
if there is a smooth positively oriented parametrization $\gamma : [a,b] \to W$ of $\Gamma$, such that $\omega_1(\dot\gamma(t)) \le \omega_2(\dot\gamma(t))$, $\forall\, t \in [a,b]$. Moreover, $\omega_1|_\Gamma < \omega_2|_\Gamma$ if $\omega_1(\dot\gamma(t)) < \omega_2(\dot\gamma(t))$ almost everywhere on $[a,b]$. Equality is similarly defined.
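A rough numerical reading of this definition, for curves in the plane, is sketched below. It only samples the pairing of each one-form with the velocity at finitely many parameter values, so it can refute the inequality but cannot prove it. The forms, the curve, and the helper names are assumptions made for the illustration.

```python
import math


def pairing(form, point, velocity):
    """Evaluate a one-form (given by its coefficient function) on a velocity."""
    return sum(c * v for c, v in zip(form(point), velocity))


def holds_along(form1, form2, curve, velocity, a, b, samples=201):
    """Sampled check of form1|_Gamma <= form2|_Gamma on gamma: [a, b] -> R^2."""
    for k in range(samples):
        t = a + (b - a) * k / (samples - 1)
        p, v = curve(t), velocity(t)
        if pairing(form1, p, v) > pairing(form2, p, v) + 1e-12:
            return False
    return True


def omega_radial(p):
    # x dx + y dy
    return (p[0], p[1])


def omega_rot(p):
    # -y dx + x dy
    return (-p[1], p[0])


def quarter_circle(t):
    return (math.cos(t), math.sin(t))


def quarter_circle_velocity(t):
    return (-math.sin(t), math.cos(t))


radial_le_rot = holds_along(omega_radial, omega_rot,
                            quarter_circle, quarter_circle_velocity,
                            0.0, math.pi / 2)
rot_le_radial = holds_along(omega_rot, omega_radial,
                            quarter_circle, quarter_circle_velocity,
                            0.0, math.pi / 2)
```

On the unit quarter circle, the radial form pairs to $0$ and the rotation form pairs to $1$ with the velocity, so the first check succeeds and the second fails. A positively oriented reparametrization multiplies both pairings by the same positive factor, so the comparison does not depend on which such parametrization is used.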

3. Optimal Control Problems with Inequality Constraints

The main purpose of this paper is to obtain the necessary first-order conditions for curvilinear optimization problems constrained by inequalities and equalities such as those described above.
Let $[\eta] : S^1_{x_0 x_1}(W) \to \mathbb{R}$ be a curvilinear functional, $\omega^i, \sigma^a \in \Lambda^1(W)$ and $c^i \in \Lambda^1_*(W)$, $i = 1, \dots, n$, $a = 1, \dots, m$. The most general problem discussed in this paper is:
$$(P) \quad \max_{\Gamma \in S^1_{x_0 x_1}(W)} [\eta](\Gamma)$$
subject to:
$$(PFE) \quad \omega^i|_\Gamma = c^i|_\Gamma, \quad i = 1, \dots, n;$$
$$(PFI) \quad \sigma^a|_\Gamma \le 0, \quad a = 1, \dots, m,$$
where the abbreviation $PF$ stands for primal feasibility, in equality or inequality form, respectively. Let $\mathcal{S} \subset S^1_{x_0 x_1}(W)$ denote the feasible set, that is, the set of all curves $\Gamma \in S^1_{x_0 x_1}(W)$ satisfying the feasibility conditions. Members of the feasible set are called feasible curves.
Without loss of generality, we shall assume that $c^i = 0$. We solve this problem in two steps. The first step is to find the necessary conditions for problem $(P)$ subject only to inequality constraints; the result is called the Fritz–John maximum principle. We then apply this technique to the optimal control problem with both equality and inequality constraints. In particular, we derive the Pontryagin maximum principle. The following statement provides an important tool for our further development.
Lemma 1
(Motzkin’s transposition theorem). Let $A$ be an $m \times n$ matrix, $B$ be an $l \times n$ matrix, and $H$ be an $r \times n$ matrix, where $B$ or $H$ may be omitted (but not $A$). Exactly one of the following alternatives holds.
(i)
there exists $x \in \mathbb{R}^n$ such that $Ax < 0$, $Bx \le 0$, $Hx = 0$ (where the vectorial inequalities or equalities hold component-wise), or
(ii)
there exist $u \in \mathbb{R}^m$, $v \in \mathbb{R}^l$ and $w \in \mathbb{R}^r$ such that $A^T u + B^T v + H^T w = 0$, $u \ge 0$, $v \ge 0$, $e^T \cdot u = 1$, where $e$ is the summing vector in $\mathbb{R}^m$, meaning that all its components are equal to one.
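The role of alternative (ii) as a certificate of infeasibility for (i) can be illustrated on a tiny instance. The sketch below assumes that the optional matrices $B$ and $H$ are omitted; the matrix, the vector `u`, and the helper names are hypothetical sample data. The random search is only a sanity check, not a proof.

```python
import random


def mat_vec(M, x):
    """Return the product M x for a matrix given as a list of rows."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]


def transpose_vec(M, u):
    """Return the product M^T u."""
    return [sum(M[i][j] * u[i] for i in range(len(M)))
            for j in range(len(M[0]))]


# Sample data: A x < 0 would require x_1 < 0 and -x_1 < 0 simultaneously.
A = [[1.0, 0.0], [-1.0, 0.0]]
u = [0.5, 0.5]

# Alternative (ii): u >= 0, e^T u = 1 and A^T u = 0.
certificate_ok = (
    all(ui >= 0 for ui in u)
    and abs(sum(u) - 1.0) < 1e-12
    and all(abs(c) < 1e-12 for c in transpose_vec(A, u))
)

# Alternative (i): look for x with A x < 0 component-wise.
random.seed(0)
found_strict_solution = False
for _ in range(10000):
    x = [random.uniform(-10.0, 10.0), random.uniform(-10.0, 10.0)]
    if all(v < 0 for v in mat_vec(A, x)):
        found_strict_solution = True
        break
```

The reason the two alternatives exclude each other is visible here: if $A^T u = 0$ with $u \ge 0$, $u \ne 0$, then $u^T (A x) = 0$ for every $x$, which is impossible when all components of $Ax$ are negative.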
Proposition 1.
Let $(N, g)$ be a paracompact differential manifold and $(f_1, \dots, f_k)$ be a family of differentiable functions on $N$. If there is no vector field $X \in \mathcal{X}(N)$ such that the Lie derivatives of these functions with respect to $X$ satisfy:
$$X(f_a)(p) < 0, \quad a = 1, \dots, k, \quad \forall\, p \in N,$$
then there exists a vector function $\mu = (\mu_1, \dots, \mu_k)$, $\mu_a : N \to \mathbb{R}$, $a = 1, \dots, k$, satisfying:
$$\sum_{a=1}^{k} \mu_a(p)\, df_a(p) = 0, \quad \mu_a(p) \ge 0, \quad a = 1, \dots, k, \quad \forall\, p \in N, \quad \mu \not\equiv 0.$$
Proof. 
Let $p \in N$ be an arbitrary fixed point on $N$. Suppose that:
$$(C(p)) \quad \forall\, X_p \in T_p N, \ \exists\, a \in \{1, \dots, k\}, \ X_p(f_a) \ge 0.$$
This means that the system $X_p(f_a) = df_a(p)(X_p) < 0$, $a = 1, \dots, k$, has no solution on $T_p N$. By Lemma 1, there exists a vector $u(p) = (u_1(p), \dots, u_k(p)) \ge 0$ satisfying $u_1(p) + \dots + u_k(p) = 1$ (hence, $u(p) \ne 0$) such that $\sum_{a=1}^{k} u_a(p)\, df_a(p) = 0$.
Let us prove that there are points $p \in N$ satisfying condition $C(p)$. In order to do so, we assume first the opposite, meaning that, $\forall\, p \in N$, there is $X_p \in T_p N$ such that:
$$X_p(f_a) < 0, \quad a = 1, \dots, k.$$
Then, there is a neighborhood $V_p$ of each point $p \in N$ such that $\tilde{X}^{(p)}(f_a)(q) < 0$, $a = 1, \dots, k$, $\forall\, q \in V_p$, where $\tilde{X}^{(p)}$ is a vector field extending the tangent vector $X_p$ on $V_p$. Since $N$ is a paracompact manifold, it follows that there exists a locally finite open cover $\{ V_{p_i} \mid i \in J \}$ of $N$ and a corresponding partition of unity $\{ \rho_i \mid i \in J \}$. Let $\tilde{X} = \sum_{i \in J} \rho_i \tilde{X}^{(p_i)}$. Then, $\tilde{X}(f_a)(p) < 0$, $a = 1, \dots, k$, $\forall\, p \in N$, contradicting the hypotheses.
In conclusion, we define:
$$\mu_a(p) = \begin{cases} u_a(p), & \text{if } p \text{ satisfies condition } (C(p)) \\ 0, & \text{otherwise.} \end{cases}$$
It follows that $\sum_{a=1}^{k} \mu_a(p)\, df_a(p) = 0$, $\mu_a(p) \ge 0$, $a = 1, \dots, k$, $\forall\, p \in N$, $\mu \not\equiv 0$. □
The necessary conditions for the constrained optimal control problem $(P)$, $(PF)$ rely on the weak Lagrangian one-form:
$$\bar\eta \in \Lambda^1(W \times \mathbb{R}^{n+m+1}), \quad \bar\eta(x, \nu, \mu, \lambda) = \lambda\,\eta(x) - \nu_i\,\omega^i(x) - \mu_a\,\sigma^a(x).$$
To simplify the notation, we use the Einstein summation convention: an index repeated in lower and upper position indicates a summation ($\mu_a \sigma^a(x) = \sum_{a=1}^{m} \mu_a \sigma^a(x)$). If such an expression appears together with an explicit index range, then no summation is performed (if we write $\mu_a \sigma^a = 0$, $a = 1, \dots, m$, then there is no summation involved).
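As a small worked instance of the convention, the weak Lagrangian one-form can be written out for a hypothetical case with $n = 1$ equality constraint and $m = 2$ inequality constraints (these sizes are chosen here only for illustration):

```latex
\bar{\eta}(x,\nu,\mu,\lambda)
  = \lambda\,\eta(x) - \nu_i\,\omega^i(x) - \mu_a\,\sigma^a(x)
  = \lambda\,\eta(x) - \nu_1\,\omega^1(x) - \mu_1\,\sigma^1(x) - \mu_2\,\sigma^2(x),
\qquad\text{whereas}\qquad
\mu_a\,\sigma^a = 0,\ a = 1, 2
\quad\text{means}\quad
\mu_1\,\sigma^1 = 0 \ \text{ and } \ \mu_2\,\sigma^2 = 0 .
```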
If the equality constraints are absent, then:
$$\bar\eta \in \Lambda^1(W \times \mathbb{R}^{m+1}), \quad \bar\eta(x, \mu, \lambda) = \lambda\,\eta(x) - \mu_a\,\sigma^a(x).$$
Theorem 1
(Fritz–John maximum principle). If $\Gamma^*$ is an optimal solution for the constrained optimal control problem $(P)$, $(PFI)$, then there exists an extended curve $\bar\Gamma^* = \{ (x^*, \mu^*, \lambda^*) \in W \times \mathbb{R}^{m+1} \mid x^* \in \Gamma^* \}$, whose projection on $W$ is $\Gamma^*$, satisfying:
$$(PFI) \quad \sigma^a|_{\Gamma^*} \le 0, \quad a = 1, \dots, m;$$
$$(C) \quad \mu_a^*\, \sigma^a|_{\Gamma^*} = 0, \quad a = 1, \dots, m;$$
$$(DF) \quad \lambda^* \ge 0, \quad \mu_a^* \ge 0, \quad a = 1, \dots, m; \quad (\lambda^*, \mu^*) \ne 0;$$
$$(S) \quad i_{U_\alpha} d\bar\eta\,\big|_{\bar\Gamma^*} = 0, \quad \alpha = 1, \dots, s,$$
where $\{ U_\alpha : \alpha = 1, \dots, s \}$ is a smooth frame of the $s$-dimensional manifold $W$ and $i_{U_\alpha} d\bar\eta$ stands for the differential of the one-form $\bar\eta$ contracted in the direction of the vector field $U_\alpha$ via the interior product operator $i : \mathcal{X}(W) \times \Lambda^2(W) \to \Lambda^1(W)$.
Here, $(C)$ stands for complementary slackness, $(DF)$ stands for dual feasibility, and $(S)$ denotes the stationarity condition.
Proof. 
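Suppose $\Gamma^*$ is an optimal solution, and define the set of active constraints:
$$I^* = \{ a \in \{1, \dots, m\} \mid \sigma^a|_{\Gamma^*} = 0 \}.$$
We shall study two different situations.
Case I: We assume $I^* = \emptyset$, and we choose $\lambda^* = 1$, $\mu_1^* = \dots = \mu_m^* = 0$. Then, conditions $(C)$ and $(DF)$ are identically satisfied, and the stationarity conditions are written as:
$$i_{U_\alpha} d\eta\,\big|_{\bar\Gamma^*} = 0, \quad \alpha = 1, \dots, s.$$
If $X \in T_{\Gamma^*} S^1_{x_0 x_1}(W)$ is a feasible vector field along $\Gamma^*$, let $\gamma$ be a smooth variation induced by $X$. Since for each $a = 1, \dots, m$, we have $\sigma^a(\dot\gamma_0(t)) < 0$, there is a $\delta$ small enough such that $\sigma^a(\dot\gamma_\epsilon(t)) < 0$ on $[0,1]$, for each $\epsilon$ with $|\epsilon| < \delta$. Hence, each curve $\Gamma_\epsilon = \gamma_\epsilon([0,1])$, $|\epsilon| < \delta$, is feasible. Since $\Gamma^*$ is an optimal solution, it follows that $\epsilon = 0$ is a maximum point for the function $\varphi(\epsilon) = [\eta](\Gamma_\epsilon)$. Therefore:
$$0 = \left.\frac{d}{d\epsilon}\right|_{\epsilon=0} \left( [\eta](\Gamma_\epsilon) \right) = \left.\frac{d}{d\epsilon}\right|_{\epsilon=0} \int_{\Gamma_\epsilon} \eta = \left.\frac{d}{d\epsilon}\right|_{\epsilon=0} \int_0^1 \gamma_\epsilon^*(\eta),$$
where $\gamma_\epsilon^*$ denotes the pull-back operator. Let us use the notations $\frac{\partial \gamma_\epsilon}{\partial t}(t) = \tilde{T}(\gamma_\epsilon(t))$, $\frac{\partial \gamma_\epsilon}{\partial \epsilon}(t) = \tilde{X}(\gamma_\epsilon(t))$, and $T(\gamma_0(t)) = \tilde{T}(\gamma_0(t))$ and assume that $[\tilde{X}, \tilde{T}]|_{\Gamma^*} = 0$. If $\mathcal{L}_{\tilde{X}}(\eta)$ is the Lie derivative of the differential one-form $\eta$ with respect to the vector field $\tilde{X}$, then:
$$\left.\frac{d}{d\epsilon}\right|_{\epsilon=0} \int_0^1 \gamma_\epsilon^*(\eta) = \left.\frac{d}{d\epsilon}\right|_{\epsilon=0} \int_0^1 \eta(\tilde{T})(\gamma_\epsilon(t))\, dt = \int_0^1 X(\eta(\tilde{T}))(\gamma_0(t))\, dt = \int_0^1 \left[ \mathcal{L}_{\tilde{X}}(\eta)(\tilde{T})(\gamma_0(t)) + \eta([\tilde{X}, \tilde{T}])(\gamma_0(t)) \right] dt = \int_0^1 \mathcal{L}_{\tilde{X}}(\eta)(T)(\gamma_0(t))\, dt = \int_0^1 \gamma_0^*(\mathcal{L}_{\tilde{X}}(\eta)) = \int_{\Gamma^*} \mathcal{L}_{\tilde{X}}(\eta).$$
Since $X$ is the restriction of $\tilde{X}$ along $\Gamma^*$, we obtain:
$$0 = \int_{\Gamma^*} \mathcal{L}_{\tilde{X}}(\eta) = \int_{\Gamma^*} \left[ i_{\tilde{X}} d\eta + d(\eta(\tilde{X})) \right] = \int_{\Gamma^*} (i_X d\eta) + \eta(X)\Big|_{x_0}^{x_1}.$$
Since $X$ vanishes at the endpoints $x_0$ and $x_1$, the second term vanishes; therefore:
$$0 = \int_{\Gamma^*} (i_X d\eta).$$
If $\{ U_\alpha : \alpha = 1, \dots, s \}$ is a smooth frame, we obtain:
$$i_{U_\alpha} d\eta\,\big|_{\Gamma^*} = 0, \quad \alpha = 1, \dots, s.$$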
Case II: Suppose now that $I^* \ne \emptyset$, $T \in \mathcal{X}_{\Gamma^*}(W)$ is a tangent vector field of the curve $\Gamma^*$ and $\tilde{T} \in \mathcal{X}(W)$ is an extension of $T$. Then, for each feasible vector field $X \in T_{\Gamma^*} S^1_{x_0 x_1}(W)$ satisfying:
$$X(\sigma^a(\tilde{T}))(x^*) < 0, \quad \forall\, a \in I^*, \quad \forall\, x^* \in \Gamma^*, \qquad (1)$$
we cannot have:
$$X(\eta(\tilde{T}))(x^*) > 0, \quad \forall\, x^* \in \Gamma^*. \qquad (2)$$
Indeed, let us suppose that $X$ satisfies Relation (1). We consider a smooth variation, and we use the same notations as in Case I. Then, for each $t \in [0,1]$, we have:
$$\sigma^a(\tilde{T})(\gamma_\epsilon(t)) = \sigma^a(\tilde{T})(\gamma_0(t)) + \epsilon\, \tilde{X}(\sigma^a(\tilde{T}))(\gamma_0(t)) + \epsilon\, \theta(\epsilon, t)$$
$$= \sigma^a(T)(\gamma_0(t)) + \epsilon\, X(\sigma^a(\tilde{T}))(\gamma_0(t)) + \epsilon\, \theta(\epsilon, t)$$
$$= \epsilon\, X(\sigma^a(\tilde{T}))(\gamma_0(t)) + \epsilon\, \theta(\epsilon, t), \quad \forall\, a \in I^*,$$
where $\theta(\epsilon, t)$ tends to zero as $\epsilon \to 0$. If $\epsilon > 0$ is small enough, then $X(\sigma^a(\tilde{T}))(\gamma_0(t)) + \theta(\epsilon, t) < 0$, $\forall\, t \in [0,1]$, and it follows that $\sigma^a(\tilde{T})(\gamma_\epsilon(t)) < 0$, that is $\sigma^a|_{\Gamma_\epsilon} < 0$, $\forall\, a \in I^*$. In conclusion, if $\epsilon > 0$ is small enough, then $\Gamma_\epsilon = \gamma_\epsilon([0,1])$ is a feasible curve.
Moreover, if $X$ satisfies Relation (2), then:
$$\left.\frac{d}{d\epsilon}\right|_{\epsilon=0} \int_{\Gamma_\epsilon} \eta = \int_0^1 X(\eta(\tilde{T}))(\gamma_0(t))\, dt > 0,$$
which contradicts the fact that $\Gamma^*$ is an optimal solution (realizes the maximum).
Therefore, by applying Proposition 1, we conclude that there exist functions $\lambda_0, \mu_a : [0,1] \to \mathbb{R}$, $a \in I^*$, such that $(\lambda_0, \mu_a) \ne 0$, $(\lambda_0(t), \mu_a(t)) \ge 0$, $\forall\, t \in [0,1]$, and:
$$\lambda_0\, X(\eta(\tilde{T})) - \sum_{a \in I^*} \mu_a\, X(\sigma^a(\tilde{T})) = 0.$$
In addition, let us set $\lambda(t) = \lambda_0(t)$ and $\mu_a(t) = 0$, for each $a \notin I^*$, for all $t \in [0,1]$. If $\bar\gamma_\epsilon(t) = (\gamma_\epsilon(t), \mu(t), \lambda(t))$, we denote by $\tilde{Y}$ the vector field satisfying $\tilde{Y}(\bar\gamma_\epsilon(t)) = \frac{\partial \bar\gamma_\epsilon}{\partial t}(t)$. Then, $X(\eta(\tilde{Y})) = X(\eta(\tilde{T}))$ and $X(\sigma^a(\tilde{Y})) = X(\sigma^a(\tilde{T}))$.
It follows:
$$X(\bar\eta(\tilde{Y})) = \lambda_0\, X(\eta(\tilde{Y})) - \mu_a\, X(\sigma^a(\tilde{Y})) = 0, \quad \forall\, X \in \mathcal{X}_{\Gamma^*}(W), \ X(x_0) = X(x_1) = 0,$$
and using the same arguments as for the proof in Case I, we find:
$$0 = \int_0^1 X(\bar\eta(\tilde{Y}))(\bar\gamma(t))\, dt = \int_{\bar\Gamma^*} \mathcal{L}_{\tilde{X}}(\bar\eta)$$
$$= \int_{\bar\Gamma^*} i_X d\bar\eta + \bar\eta(X)\Big|_{x_0}^{x_1} = \int_{\bar\Gamma^*} i_X d\bar\eta, \quad \forall\, X \in \mathcal{X}_{\Gamma^*}(W).$$
In conclusion, $i_{U_\alpha} d\bar\eta\,\big|_{\bar\Gamma^*} = 0$, $\alpha = 1, \dots, s$.
Corollary 1
(Parametric Fritz–John maximum principle). If $x^* : [0,1] \to W$ is a positively oriented parametrization of an optimal solution for the constrained optimal control problem $(P)$, $(PFI)$, then there exists a momentum vector function $(\mu^*, \lambda^*) : [0,1] \to \mathbb{R}^{m+1}$, such that:
$$(PFI) \quad \sigma^a_\alpha(x^*(t))\, \dot{x}^{*\alpha}(t) \le 0, \quad \forall\, t \in [0,1], \quad a = 1, \dots, m;$$
$$(C) \quad \mu_a^*(t)\, \sigma^a_\alpha(x^*(t))\, \dot{x}^{*\alpha}(t) = 0, \quad \forall\, t \in [0,1], \quad a = 1, \dots, m;$$
$$(DF) \quad \lambda^* \ge 0, \quad \mu_a^* \ge 0, \quad a = 1, \dots, m; \quad (\lambda^*, \mu^*) \ne 0;$$
$$(S) \quad \frac{\partial \bar\eta_\beta}{\partial x^\alpha}(x^*(t), \mu^*(t), \lambda^*(t))\, \dot{x}^{*\beta}(t) = \frac{d}{dt}\left( \bar\eta_\alpha(x^*(t), \mu^*(t), \lambda^*(t)) \right), \quad \alpha = 1, \dots, s, \quad \forall\, t \in [0,1].$$
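The passage from the invariant stationarity condition to its parametric form can be sketched in local coordinates as follows. This is a short worked derivation, assuming the coordinate frame $U_\alpha = \partial/\partial x^\alpha$ and writing the weak Lagrangian one-form as $\bar\eta = \bar\eta_\beta\, dx^\beta$, with coefficients depending on $(x, \mu, \lambda)$:

```latex
d\bar{\eta} = d\bar{\eta}_\beta \wedge dx^\beta ,
\qquad
i_{\partial/\partial x^\alpha}\, d\bar{\eta}
  = \frac{\partial \bar{\eta}_\beta}{\partial x^\alpha}\, dx^\beta - d\bar{\eta}_\alpha ,
\\[4pt]
\left( i_{\partial/\partial x^\alpha}\, d\bar{\eta} \right)\Big|_{\bar{\Gamma}^*}
  = \left[ \frac{\partial \bar{\eta}_\beta}{\partial x^\alpha}\, \dot{x}^{*\beta}(t)
    - \frac{d}{dt}\, \bar{\eta}_\alpha\big( x^*(t), \mu^*(t), \lambda^*(t) \big) \right] dt = 0 .
```

The total derivative in the last bracket collects the contributions of $dx^\beta$, $d\mu_a$ and $d\lambda$ contained in $d\bar\eta_\alpha$ once the form is pulled back along $t \mapsto (x^*(t), \mu^*(t), \lambda^*(t))$.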

4. Optimal Control with Equality and Inequality Constraints

In the following, we address the original problem $(P)$, $(PFE)$, $(PFI)$. A simple way to solve this problem is to transform it into an optimal control problem constrained only by inequalities, by replacing each equality $\omega^i|_\Gamma = 0$ with a pair of opposite inequalities:
$$(P) \quad \max_{\Gamma \in S^1_{x_0 x_1}(W)} [\eta](\Gamma)$$
subject to:
$$(PFE_1) \quad \omega^i|_\Gamma \le 0, \quad i = 1, \dots, n;$$
$$(PFE_2) \quad -\omega^i|_\Gamma \le 0, \quad i = 1, \dots, n;$$
$$(PFI) \quad \sigma^a|_\Gamma \le 0, \quad a = 1, \dots, m.$$
Theorem 2
(Generalized Fritz–John maximum principle). If $\Gamma^*$ is an optimal solution for the constrained optimal control problem $(P)$, $(PFE)$, $(PFI)$, then there exists a curve $\bar\Gamma^* = \{ (x^*, \nu^*, \mu^*, \lambda^*) \in W \times \mathbb{R}^{n+m+1} \mid x^* \in \Gamma^* \}$, satisfying:
$$(PFE) \quad \omega^i|_{\Gamma^*} = 0, \quad i = 1, \dots, n;$$
$$(PFI) \quad \sigma^a|_{\Gamma^*} \le 0, \quad a = 1, \dots, m;$$
$$(C) \quad \mu_a^*\, \sigma^a|_{\Gamma^*} = 0, \quad a = 1, \dots, m;$$
$$(DF) \quad \lambda^* \ge 0, \quad \mu_a^* \ge 0, \quad a = 1, \dots, m; \quad (\nu^*, \lambda^*, \mu^*) \ne 0;$$
$$(S) \quad i_{U_\alpha} d\bar\eta\,\big|_{\bar\Gamma^*} = 0, \quad \alpha = 1, \dots, s,$$
where:
$$\bar\eta \in \Lambda^1(W \times \mathbb{R}^{n+m+1}), \quad \bar\eta(x, \nu, \mu, \lambda) = \lambda\,\eta(x) - \nu_i\,\omega^i(x) - \mu_a\,\sigma^a(x).$$
Proof. 
Let us consider:
$$\tilde\eta \in \Lambda^1(W \times \mathbb{R}^{2n+m+1}), \quad \tilde\eta(x, \nu^1, \nu^2, \mu, \lambda) = \lambda\,\eta(x) - (\nu^1_i - \nu^2_i)\,\omega^i(x) - \mu_a\,\sigma^a(x)$$
and for each $z \in \mathbb{R}^n$,
$$\tilde\eta_z \in \Lambda^1(W \times \mathbb{R}^{2n+m+1}), \quad \tilde\eta_z(x, \nu^1, \nu^2, \mu, \lambda) = \lambda\,\eta(x) - (\nu^1_i - \nu^2_i + z_i)\,\omega^i(x) - \mu_a\,\sigma^a(x).$$
Applying the Fritz–John maximum principle for the inequality constrained problem $(P)$, $(PFE_1)$, $(PFE_2)$, $(PFI)$, we conclude that there exists a curve:
$$\tilde\Gamma^* = \{ (x^*, \nu^{1*}, \nu^{2*}, \mu^*, \lambda^*) \in W \times \mathbb{R}^{2n+m+1} \mid x^* \in \Gamma^* \},$$
satisfying:
$$(PFE_1) \quad \omega^i|_{\Gamma^*} \le 0, \quad i = 1, \dots, n;$$
$$(PFE_2) \quad -\omega^i|_{\Gamma^*} \le 0, \quad i = 1, \dots, n;$$
$$(PFI) \quad \sigma^a|_{\Gamma^*} \le 0, \quad a = 1, \dots, m;$$
$$(C) \quad \mu_a^*\, \sigma^a|_{\Gamma^*} = 0; \quad (\nu^{1*}_i - \nu^{2*}_i)\, \omega^i|_{\Gamma^*} = 0, \quad i = 1, \dots, n, \quad a = 1, \dots, m;$$
$$(DF) \quad \lambda^* \ge 0, \quad \mu_a^* \ge 0, \quad a = 1, \dots, m, \quad \nu^{1*}_i, \nu^{2*}_i \ge 0, \quad i = 1, \dots, n; \quad (\nu^{1*}, \nu^{2*}, \lambda^*, \mu^*) \ne 0;$$
$$(S) \quad i_{U_\alpha} d\tilde\eta\,\big|_{\tilde\Gamma^*} = 0, \quad \alpha = 1, \dots, s.$$
Then, $(PFE_1)$ and $(PFE_2)$ together give the condition $(PFE)$, and the complementary slackness follows from $(C)$. Moreover,
$$i_{U_\alpha} d\tilde\eta_z\,\big|_{\tilde\Gamma^*} = i_{U_\alpha} d\tilde\eta\,\big|_{\tilde\Gamma^*} - z_i\, i_{U_\alpha} d\omega^i\,\big|_{\Gamma^*}$$
and since $\omega^i|_{\Gamma^*} = 0$, we have:
$$i_{U_\alpha} d\omega^i\,\big|_{\Gamma^*} = 0;$$
therefore:
$$i_{U_\alpha} d\tilde\eta_z\,\big|_{\tilde\Gamma^*} = 0.$$
We choose $z^* \in \mathbb{R}^n$ such that:
$$(\nu^{1*} - \nu^{2*} + z^*, \lambda^*, \mu^*) \ne 0$$
and $\nu^* = \nu^{1*} - \nu^{2*} + z^*$. Then, $(x^*, \nu^*, \mu^*, \lambda^*)$ satisfies $(DF)$ and:
$$i_{U_\alpha} d\bar\eta\,\big|_{\bar\Gamma^*} = i_{U_\alpha} d\tilde\eta_{z^*}\,\big|_{\tilde\Gamma^*} = 0.$$
Corollary 2
(Parametric generalized Fritz–John maximum principle). If $x^* : [0,1] \to W$ is a positively oriented parametrization of an optimal solution for the constrained optimal control problem $(P)$, $(PFE)$, $(PFI)$, then there exists a momentum vector function $(\nu^*, \mu^*, \lambda^*) : [0,1] \to \mathbb{R}^{n+m+1}$, such that:
$$(PFE) \quad \omega^i_\alpha(x^*(t))\, \dot{x}^{*\alpha}(t) = 0, \quad i = 1, \dots, n, \quad \forall\, t \in [0,1];$$
$$(PFI) \quad \sigma^a_\alpha(x^*(t))\, \dot{x}^{*\alpha}(t) \le 0, \quad a = 1, \dots, m, \quad \forall\, t \in [0,1];$$
$$(C) \quad \mu_a^*(t)\, \sigma^a_\alpha(x^*(t))\, \dot{x}^{*\alpha}(t) = 0, \quad a = 1, \dots, m, \quad \forall\, t \in [0,1];$$
$$(DF) \quad \lambda^* \ge 0, \quad \mu_a^* \ge 0, \quad a = 1, \dots, m; \quad (\nu^*, \lambda^*, \mu^*) \ne 0;$$
$$(S) \quad \frac{\partial \bar\eta_\beta}{\partial x^\alpha}(x^*(t), \nu^*(t), \mu^*(t), \lambda^*(t))\, \dot{x}^{*\beta}(t) = \frac{d}{dt}\left( \bar\eta_\alpha(x^*(t), \nu^*(t), \mu^*(t), \lambda^*(t)) \right), \quad \alpha = 1, \dots, s, \quad \forall\, t \in [0,1].$$
Remark 1.
As in classical constrained optimization, the solutions obtained from the Fritz–John necessary conditions are not necessarily the maxima we are looking for. They could be minimum curves or maximum curves, or they may not be extremal at all. We call them critical curves. To decide what kind of optimization they perform (if any), we need an additional approach that formulates sufficient optimality conditions. The example below shows how the Fritz–John necessary conditions act and how they produce critical curves.
Example 1.
We consider the necessary conditions for a curve running on the paraboloid $\Sigma : x^2 + y^2 = 2(z - 1)$ between the points $p_0 = (-1, 1, 2)$ and $p_1 = (1, -1, 2)$ to optimize the work under the action of the force $\mathbf{F} = z\,\mathbf{i} + z\,\mathbf{j}$.
This issue could be rephrased as:
$$\max_{\Gamma \in S^1_{p_0 p_1}(\mathbb{R}^3)} \int_\Gamma z\, dx + z\, dy$$
subject to:
$$(x\, dx + y\, dy - dz)|_\Gamma = 0.$$
We identify $\eta = z\, dx + z\, dy$ and $\omega = x\, dx + y\, dy - dz$; hence, we compute (here the multiplier $\nu$ is taken with the opposite sign with respect to the general definition, which is harmless because $\nu$ is not sign-constrained):
$$\bar\eta = (\nu x + \lambda z)\, dx + (\nu y + \lambda z)\, dy - \nu\, dz.$$
From Corollary 2, we reach the stationary conditions, where the prime denotes the derivative with respect to the parameter $t$:
$$\nu x' = (\nu x + \lambda z)'; \quad \nu y' = (\nu y + \lambda z)'; \quad \lambda x' + \lambda y' = -\nu',$$
that is:
$$\nu' x = -(\lambda z)'; \quad \nu' y = -(\lambda z)'; \quad \lambda (x + y)' = -\nu'.$$
By subtracting the first two equations, we find $\nu'(x - y) = 0$; hence, we can analyze two separate cases.
Case I. Assume that $x = y$. It follows that the optimal curve results as the intersection between the plane $x = y$ and the paraboloid $x^2 + y^2 = 2(z - 1)$. A corresponding parametrization could be chosen as:
$$\gamma(t) = (t, t, t^2 + 1), \quad t \in \mathbb{R}.$$
Moreover, introducing this in the stationary conditions above, we find:
$$\nu' t = -\lambda'(t^2 + 1) - 2\lambda t; \quad 2\lambda = -\nu'.$$
It follows immediately that $\lambda' = 0$, that is $\lambda = A$, with $A$ being constant and $A \ge 0$ (from the dual feasibility condition). Moreover, we also have $\nu' = -2A$; hence, $\nu = -2At + B$, $B$ being a real constant such that $(A, B) \ne (0, 0)$.
Case II. If $\nu' = 0$, then $\nu = A$ and $(\lambda z)' = 0$. Since all the points of the paraboloid satisfy the condition $z \ne 0$, it follows that $\lambda = B/z$, with $A$, $B$ being real constants, $B \ge 0$, $(A, B) \ne (0, 0)$ (from the dual feasibility condition). Introducing these in the last stationary condition, we find that $(x + y)' = 0$, that is $x + y = C$. Therefore, the solutions are provided by the intersections between the planes $x + y = C$ and the paraboloid $x^2 + y^2 = 2(z - 1)$. A corresponding parametrization in this case could be:
$$\gamma(t) = \left( t, \ C - t, \ t^2 - Ct + C^2/2 + 1 \right), \quad t \in \mathbb{R}.$$
Furthermore, by asking for the solution to have the required endpoints, we find just one possibility, namely the critical curve:
$$\gamma^*(t) = (t, -t, t^2 + 1), \quad t \in [-1, 1],$$
corresponding to the optimal costates:
$$\lambda^*(t) = \frac{B}{t^2 + 1}; \quad \nu^*(t) = A, \quad A, B \in \mathbb{R}, \ B \ge 0, \ (A, B) \ne (0, 0).$$
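The critical curve of this example can be checked numerically. The sketch below assumes the curve $\gamma^*(t) = (t, -t, t^2 + 1)$ on $[-1, 1]$ with costates $\lambda^* = B/(t^2+1)$, $\nu^* = A$, and the stationary conditions in the form $\nu' x = -(\lambda z)'$, $\nu' y = -(\lambda z)'$, $\lambda (x+y)' = -\nu'$. The constants `A` and `B` are arbitrary sample values, and derivatives are approximated by central differences.

```python
A, B = 0.7, 1.3  # hypothetical sample constants with B >= 0


def gamma(t):
    """Candidate critical curve on the paraboloid x^2 + y^2 = 2 (z - 1)."""
    return (t, -t, t * t + 1.0)


def lam(t):
    return B / (t * t + 1.0)


def nu(t):
    return A


def ddt(f, t, h=1e-5):
    """Central-difference approximation of the derivative of f at t."""
    return (f(t + h) - f(t - h)) / (2.0 * h)


def residuals(t):
    x, y, z = gamma(t)
    dx = ddt(lambda s: gamma(s)[0], t)
    dy = ddt(lambda s: gamma(s)[1], t)
    dz = ddt(lambda s: gamma(s)[2], t)
    d_nu = ddt(nu, t)
    d_lam_z = ddt(lambda s: lam(s) * gamma(s)[2], t)
    return (
        x * x + y * y - 2.0 * (z - 1.0),  # the curve lies on the paraboloid
        x * dx + y * dy - dz,             # Pfaffian constraint on the velocity
        d_nu * x + d_lam_z,               # first stationary condition
        d_nu * y + d_lam_z,               # second stationary condition
        lam(t) * (dx + dy) + d_nu,        # third stationary condition
    )


max_residual = max(abs(r)
                   for k in range(21)
                   for r in residuals(-1.0 + 0.1 * k))
endpoints = (gamma(-1.0), gamma(1.0))
```

All residuals should vanish up to rounding error. Since $x + y$ is constant along this curve, the work $\int z\,(dx + dy)$ along it is zero, which is consistent with the remark that a critical curve need not be a maximum.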

5. Applications

5.1. Simple Curvilinear Optimization

Let $[\eta] : S^1_{x_0 x_1}(W) \to \mathbb{R}$ be a curvilinear functional. We try to solve the simple optimization problem:
$$\max_{\Gamma \in S^1_{x_0 x_1}(W)} [\eta](\Gamma).$$
In this particular case, the Fritz–John maximum principle leads directly to the necessary condition:
$$i_{\frac{\partial}{\partial x^\alpha}} d\eta\,\Big|_{\Gamma^*} = 0, \quad \alpha = 1, \dots, s.$$
Moreover, if $W = \mathbb{R}^3$ and $\eta = \mathbf{F} \cdot d\mathbf{r}$, then the necessary conditions for optimizing the work done by the force $\mathbf{F}$ along the optimal curve $\Gamma^*$ become:
$$\operatorname{rot} \mathbf{F} \times \mathbf{T} = 0,$$
where $\mathbf{T}$ is any tangent vector field of the curve. Consequently, the optimal curve is a field line of $\operatorname{rot} \mathbf{F}$.
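The field-line condition can be illustrated numerically. The sketch below approximates the curl by central differences for the sample force $\mathbf{F} = (z, z, 0)$, chosen here only as an illustration, and tests two candidate tangent directions. The helper names and the sample point are assumptions.

```python
def curl(F, p, h=1e-5):
    """Central-difference approximation of rot F at the point p in R^3."""
    def partial(i, j):
        # derivative of the i-th component of F with respect to x_j
        plus, minus = list(p), list(p)
        plus[j] += h
        minus[j] -= h
        return (F(plus)[i] - F(minus)[i]) / (2.0 * h)

    return (partial(2, 1) - partial(1, 2),
            partial(0, 2) - partial(2, 0),
            partial(1, 0) - partial(0, 1))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def force(p):
    x, y, z = p
    return (z, z, 0.0)


point = (0.3, -0.2, 1.5)
rot = curl(force, point)            # approximately (-1, 1, 0)

tangent_along = (-2.0, 2.0, 0.0)    # parallel to rot F
tangent_across = (1.0, 0.0, 0.0)    # not parallel to rot F

along_ok = all(abs(c) < 1e-6 for c in cross(rot, tangent_along))
across_ok = all(abs(c) < 1e-6 for c in cross(rot, tangent_across))
```

Only the direction parallel to $\operatorname{rot}\mathbf{F}$ satisfies the condition, so for this sample force the unconstrained critical curves would be straight lines with direction $(-1, 1, 0)$.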

5.2. Pfaffian Constrained Optimization

In the following, we consider an optimization problem subject only to equality-type restrictions (Pfaffian equations). Let $[\eta] : S^1_{x_0 x_1}(W) \to \mathbb{R}$ be a curvilinear functional, $\omega^i \in \Lambda^1(W)$ and $c^i \in \Lambda^1_*(W)$, $i = 1, \dots, n$. The optimization problem we want to solve is:
$$\max_{\Gamma \in S^1_{x_0 x_1}(W)} [\eta](\Gamma) \quad \text{subject to} \quad \omega^i|_\Gamma = c^i|_\Gamma.$$
The idea we use here is to replace the initial constrained problem with an unrestricted one, by using the Lagrange multipliers method. We introduce new variables $\nu_i$, and we consider a new curvilinear functional on $S^1_{(x_0, \nu_0)(x_1, \nu_1)}(W \times \mathbb{R}^n)$ defined as $[\bar\eta] = [\eta + \nu_i(\omega^i - c^i)]$.
Next, we prove that the solutions of the restricted problem are among the solutions of the unrestricted one.
Proposition 2
(Existence of Lagrange multipliers). If $\Gamma^*$ is a critical curve for the curvilinear functional $[\eta]$, then there exists another curve $\bar\Gamma^* = \{ (x^*, \nu^*) \in W \times \mathbb{R}^n \mid x^* \in \Gamma^* \}$, which is a critical curve for the unconstrained functional $[\bar\eta]$.
Proof. 
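We introduce the Hamiltonian one-form:
$$(H) \quad H(x, \nu) = \eta(x) + \nu_i\,\omega^i(x).$$
Then, $\bar\eta(x, \nu) = H(x, \nu) - \nu_i\, c^i(x)$ and, since the forms $c^i$ are exact, $d\bar\eta = dH - d\nu_i \wedge c^i$, and:
$$i_{\frac{\partial}{\partial x^\alpha}} d\bar\eta = i_{\frac{\partial}{\partial x^\alpha}} dH + c^i_\alpha\, d\nu_i, \quad \alpha = 1, \dots, s,$$
$$i_{\frac{\partial}{\partial \nu_i}} d\bar\eta = i_{\frac{\partial}{\partial \nu_i}} dH - c^i, \quad i = 1, \dots, n. \qquad (3)$$
Let $\Gamma^*$ be a critical point for $[\eta]$. We choose:
$$\bar\Gamma^* = \left\{ (x^*, \nu^*) \ \Big| \ x^* \in \Gamma^*, \ i_{\frac{\partial}{\partial x^\alpha}} dH = -c^i_\alpha\, d\nu_i \right\}.$$
Then:
$$i_{\frac{\partial}{\partial x^\alpha}} d\bar\eta\,\Big|_{\bar\Gamma^*} = 0, \quad \alpha = 1, \dots, s.$$
Moreover, $d\bar\eta = d\eta + d\nu_i \wedge (\omega^i - c^i) + \nu_i\, d\omega^i$, and it follows that:
$$i_{\frac{\partial}{\partial \nu_i}} d\bar\eta\,\Big|_{\bar\Gamma^*} = (\omega^i - c^i)\big|_{\Gamma^*} = 0, \quad i = 1, \dots, n. \qquad (4)$$
In conclusion, $\bar\Gamma^*$ defined above is a critical point for the curvilinear functional $[\bar\eta]$. Combining Equations (3) and (4), we also obtain:
$$i_{\frac{\partial}{\partial \nu_i}} dH\,\Big|_{\bar\Gamma^*} = c^i\big|_{\Gamma^*}, \quad i = 1, \dots, n.$$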
Corollary 3.
Let $\Gamma^*$ be an optimal solution for the constrained optimization problem:
$$\max_{\Gamma \in S^1_{x_0 x_1}(W)} [\eta](\Gamma) \quad \text{subject to} \quad \omega^i|_\Gamma = c^i|_\Gamma.$$
Then, there exists a curve $\bar\Gamma^* = \{ (x^*, \nu^*) \in W \times \mathbb{R}^n \mid x^* \in \Gamma^* \} \in S^1_{(x_0, \nu_0)(x_1, \nu_1)}(W \times \mathbb{R}^n)$ such that the Hamiltonian one-form defined by $(H)$ satisfies:
$$i_{\frac{\partial}{\partial x^\alpha}} dH\,\Big|_{\bar\Gamma^*} = -c^i_\alpha\, d\nu_i\,\big|_{\bar\Gamma^*}, \quad \alpha = 1, \dots, s,$$
$$i_{\frac{\partial}{\partial \nu_i}} dH\,\Big|_{\bar\Gamma^*} = c^i\big|_{\Gamma^*}, \quad i = 1, \dots, n.$$
A particular type of constrained optimization problem is obtained when all the exact forms $\omega^i$ are on the vector space generated by $(dx^1, \dots, dx^k)$, $k < s$. Then, $(x^1, \dots, x^k)$ are called state variables, and $(x^{k+1}, \dots, x^s)$ are called control variables and shall be denoted by $(u^1, \dots, u^{s-k})$.
Corollary 4.
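If $\Gamma^*$ is a solution for the constrained optimization problem:
$$\max_{\Gamma \in S^1_{x_0 x_1}(W)} [\eta](\Gamma) \quad \text{subject to} \quad \omega^i|_\Gamma = c^i|_\Gamma, \quad i = 1, \dots, n,$$
then there exists a curve $\bar\Gamma^* = \{ (x^*, u^*, \nu^*) \in W \times \mathbb{R}^n \mid (x^*, u^*) \in \Gamma^* \}$ such that the Hamiltonian one-form defined by $(H)$ satisfies:
$$i_{\frac{\partial}{\partial x^\alpha}} dH\,\Big|_{\bar\Gamma^*} = -c^i_\alpha\, d\nu_i\,\big|_{\bar\Gamma^*}, \quad \alpha = 1, \dots, k,$$
$$i_{\frac{\partial}{\partial u^\sigma}} dH\,\Big|_{\bar\Gamma^*} = 0, \quad \sigma = 1, \dots, s - k,$$
$$i_{\frac{\partial}{\partial \nu_i}} dH\,\Big|_{\bar\Gamma^*} = c^i\big|_{\Gamma^*}, \quad i = 1, \dots, n.$$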

5.3. Optimal Control Problems

In the following, we derive the single-time Pontryagin maximum principle from the Fritz–John system. We recall the single-time optimal control problem (see [,]),
$$\max_{u(\cdot)} J[u(\cdot)] = \int_0^{t_0} X(t, x(t), u(t))\, dt$$
subject to:
$$\frac{dx^i}{dt}(t) = X^i(t, x(t), u(t)), \quad i = 1, \dots, n,$$
$$x(t) \in \mathcal{X} \subset \mathbb{R}^n, \quad u(t) \in \mathcal{U} \subset \mathbb{R}^r, \quad \forall\, t \in [0, t_0], \quad x(0) = x_0, \quad x(t_0) = x_1.$$
Let us also assume that the feasible control functions have fixed endpoints, i.e., $u(0) = u_0$, $u(t_0) = u_1$. In order to adapt this problem to the previous arguments, we consider both the time variable $t$ and the control variables $u^\sigma$, $\sigma = 1, \dots, r$, as generalized state variables, and we read the previous problem as:
$$\max_\Gamma \int_\Gamma \eta$$
subject to:
$$\omega^i|_\Gamma = 0, \quad i = 1, \dots, n,$$
where $\Gamma = \{ (t, x(t), u(t)) : t \in [0, t_0] \}$, $\eta(t, x, u) = X(t, x, u)\, dt$, and $\omega^i(t, x, u) = dx^i - X^i(t, x, u)\, dt$.
The Pontryagin maximum principle is obtained by applying the Fritz–John maximum principle for equality constraints only, when considering the time variable as a parameter.
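For this restated problem, the weak Lagrangian one-form with equality constraints only can be expanded explicitly. The short computation below assumes $\eta = X\,dt$, $\omega^i = dx^i - X^i\,dt$ and the general form $\bar\eta = \lambda\,\eta - \nu_i\,\omega^i$, and writes $H = \lambda X + \nu_i X^i$ for the Hamiltonian used in the theorem that follows:

```latex
\bar{\eta}
  = \lambda\,\eta - \nu_i\,\omega^i
  = \lambda X\,dt - \nu_i\left( dx^i - X^i\,dt \right)
  = \left( \lambda X + \nu_i X^i \right) dt - \nu_i\,dx^i
  = H\,dt - \nu_i\,dx^i ,
\\[4pt]
\bar{\eta}_{t} = H, \qquad
\bar{\eta}_{x^i} = -\nu_i, \qquad
\bar{\eta}_{u^\sigma} = 0 .
```

The three groups of components correspond to the three types of generalized state variables: time, states, and controls.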
Theorem 3
(Pontryagin’s maximum principle). If $(x^*, u^*) : [0, t_0] \to \mathcal{X} \times \mathcal{U}$ is a parametrization of an optimal (state, control)-solution for the optimal control problem, then there exists a momentum vector function $(\lambda^*, \nu^*) : [0, t_0] \to \mathbb{R}^{n+1}$, such that:
$$\frac{d}{dt}\left( H(t, x^*(t), u^*(t), \nu^*(t), \lambda^*(t)) \right) = \frac{\partial H}{\partial t}(t, x^*(t), u^*(t), \nu^*(t), \lambda^*(t));$$
$$\frac{dx^{*i}}{dt}(t) = \frac{\partial H}{\partial \nu_i}(t, x^*(t), u^*(t), \nu^*(t), \lambda^*(t)), \quad i = 1, \dots, n;$$
$$\frac{d\nu_i^*}{dt}(t) = -\frac{\partial H}{\partial x^i}(t, x^*(t), u^*(t), \nu^*(t), \lambda^*(t)), \quad i = 1, \dots, n;$$
and:
$$\frac{\partial H}{\partial u^\sigma}(t, x^*(t), u^*(t), \nu^*(t), \lambda^*(t)) = 0, \quad \sigma = 1, \dots, r,$$
where $H(t, x, u, \nu, \lambda) = \lambda X(t, x, u) + \nu_i X^i(t, x, u)$ denotes the Hamiltonian.
Proof. 
If we denote $\tilde{x}^*(t) = (t, x^*(t), u^*(t))$, then Corollary 2 ensures the existence of the vector function $(\lambda^*, \nu^*) : [0, t_0] \to \mathbb{R}^{n+1}$ such that:
$$(PFE) \quad \omega^i_\alpha(t, x^*(t), u^*(t))\, \dot{\tilde{x}}^{*\alpha}(t) = 0, \quad i = 1, \dots, n;$$
$$(DF) \quad \lambda^* \ge 0, \quad (\lambda^*, \nu^*) \ne 0;$$
$$(S) \quad \frac{\partial \bar\eta_\beta}{\partial \tilde{x}^\alpha}(t, x^*(t), u^*(t), \nu^*(t), \lambda^*(t))\, \dot{\tilde{x}}^{*\beta}(t) = \frac{d}{dt}\left( \bar\eta_\alpha(t, x^*(t), u^*(t), \nu^*(t), \lambda^*(t)) \right), \quad \alpha = 0, \dots, n + r, \quad \forall\, t \in [0, t_0].$$
Since $\bar\eta = H\, dt - \nu_i\, dx^i$, when evaluating $(S)$ and $(PFE)$ for each type of variable,
$$\tilde{x}^\alpha = \begin{cases} t, & \alpha = 0 \\ x^i, \ i = \alpha, & \alpha = 1, \dots, n \\ u^\sigma, \ \sigma = \alpha - n, & \alpha = n + 1, \dots, n + r, \end{cases}$$
we obtain exactly the equations stated by the theorem. □
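The four groups of equations in the theorem can be checked on a toy problem. The sketch below assumes a hypothetical one-dimensional example, which is not taken from the paper: maximize $\int_0^1 -\tfrac{1}{2}u^2\,dt$ subject to $\dot{x} = u$, $x(0) = 0$, $x(1) = 1$, with the candidate $x^*(t) = t$, $u^* = 1$, $\nu^* = 1$, $\lambda^* = 1$. Partial and total derivatives of the Hamiltonian are approximated by central differences.

```python
def running_cost(t, x, u):
    return -0.5 * u * u


def dynamics(t, x, u):
    return u


def hamiltonian(t, x, u, nu, lam):
    # H = lambda * X + nu * X^1, as in the statement of the theorem
    return lam * running_cost(t, x, u) + nu * dynamics(t, x, u)


# Candidate state, control and costates (assumed for this toy problem).
def x_star(t):
    return t


def u_star(t):
    return 1.0


def nu_star(t):
    return 1.0


def lam_star(t):
    return 1.0


def diff(f, s, h=1e-5):
    """Central-difference approximation of the derivative of f at s."""
    return (f(s + h) - f(s - h)) / (2.0 * h)


def pmp_residuals(t):
    x, u, nu, lam = x_star(t), u_star(t), nu_star(t), lam_star(t)
    total_dH = diff(lambda s: hamiltonian(s, x_star(s), u_star(s),
                                          nu_star(s), lam_star(s)), t)
    dH_dt = diff(lambda s: hamiltonian(s, x, u, nu, lam), t)
    dH_dnu = diff(lambda v: hamiltonian(t, x, u, v, lam), nu)
    dH_dx = diff(lambda y: hamiltonian(t, y, u, nu, lam), x)
    dH_du = diff(lambda w: hamiltonian(t, x, w, nu, lam), u)
    return (
        total_dH - dH_dt,           # dH/dt along the solution equals H_t
        diff(x_star, t) - dH_dnu,   # state equation
        diff(nu_star, t) + dH_dx,   # costate equation
        dH_du,                      # stationarity in the control
    )


max_pmp_residual = max(abs(r)
                       for k in range(11)
                       for r in pmp_residuals(0.1 * k))
```

All residuals should vanish up to rounding error for this candidate. As with the Fritz–John conditions, this only confirms that the candidate is a critical curve; it does not by itself establish optimality.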

6. Conclusions

This paper was motivated by the search for an invariant way to phrase constrained optimization problems. When looking at classical single-time optimal control, we noticed that the optimal states taken together provide a parametrization for an optimal curve (a so-called optimal trajectory). This works fine in the Euclidean setting, but it raises concerns about global aspects when it comes to differential manifolds, where the use of coordinates is a local matter. To avoid this type of problem, we considered curvilinear-type state variables from the beginning, making the curve the key element, instead of a certain parametrization. Further, we introduced the cost functional as a curvilinear integral where the running cost is given by a differential one-form. Differential one-forms are also the main ingredients for the feasibility constraints. These are phrased as inequalities or equalities involving the result of the one-forms acting on a tangent vector field of the curve.
Theorem 1 (Fritz–John maximum principle) provides the necessary optimality conditions, when the problem involves just inequality-type constraints. This outcome was conveniently applied to find the optimum conditions in Theorem 2 (generalized Fritz–John maximum principle), where both equality and inequality-type restrictions were assumed. Although the main objective was to obtain invariant results, we cannot deny the value of the parametric approach. The optimum conditions were written in local coordinates in the corollaries that accompany the two main theorems.
A particularly important case was obtained when we considered only equality constraints (Pfaffian equations). An interesting approach that uses the Hamiltonian one-form was revealed in Corollary 3. This problem is of great importance, generalizing the classical optimal control problem, because the constraints in optimal control are particular Pfaffian equations. From this point of view, Pontryagin’s maximum principle in Theorem 3 is a direct consequence of the previous results.
However, we must point out that the resulting equations offer only necessary optimality conditions, not sufficient ones. Solving them leads to the so-called critical points (in this case, critical curves). In order to check whether these critical points really attain the maximum, other tools are needed. Formulating such sufficient conditions in an invariant manner is, for the time being, an open problem. Solving this problem will be the next step of our study and will complete the approach initiated here.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The author declares no conflict of interest.

References

  1. Rockafellar, R.T. Lagrange multipliers and optimality. SIAM Rev. 1993, 35, 183–238.
  2. John, F. Extremum problems with inequalities as subsidiary conditions. In Studies and Essays: Courant Anniversary Volume; Friedrichs, K.O., Neugebauer, O.E., Stoker, J.J., Eds.; Wiley-Interscience: New York, NY, USA, 1948; pp. 187–204.
  3. Mangasarian, O.L.; Fromovitz, S. The Fritz–John Necessary Optimality Conditions in the Presence of Equality and Inequality Constraints. J. Math. Anal. Appl. 1967, 17, 37–47.
  4. Bertsekas, D.P.; Ozdaglar, A.E.; Tseng, P. Enhanced Fritz–John Conditions for Convex Programming. SIAM J. Optim. 2006, 16, 766–797.
  5. Leal, U.; Silva, G.; Lodwick, W. Fritz–John Necessary Condition for Optimization Problem with an Interval-valued Objective Function. Proc. Ser. Braz. Soc. Appl. Comput. Math. 2018, 6.
  6. Karush, W. Minima of Functions of Several Variables with Inequalities. Ph.D. Thesis, Department of Mathematics, University of Chicago, Chicago, IL, USA, 1939.
  7. Kuhn, H.W. Nonlinear programming: A historical note. In History of Mathematical Programming; Lenstra, J.K., Rinnooy Kan, A.H.G., Schrijver, A., Eds.; North–Holland: Amsterdam, The Netherlands, 1991; pp. 77–96.
  8. Zalmai, G.J. The Fritz–John and Kuhn-Tucker Optimality Conditions in Continuous–Time Nonlinear Programming. J. Math. Anal. Appl. 1985, 110, 503–518.
  9. Pontriaguine, L.; Boltianski, V.; Gamkrelidze, R.; Michtchenko, E. Théorie Mathématique des Processus Optimaux; Edition MIR: Moscou, Russia, 1974.
  10. Evans, L.C. An Introduction to Mathematical Optimal Control Theory; Lecture Notes; University of California, Department of Mathematics: Berkeley, CA, USA, 2010.
  11. Tauchnitz, N. The Pontryagin Maximum Principle for Nonlinear Optimal Control Problems with Infinite Horizon. J. Optim. Theory Appl. 2015, 167, 27–48.
  12. Pickenhain, S.; Wagner, M. Pontryagin Principle for State–Constrained Control Problems Governed by First–Order PDE System. J. Optim. Theory Appl. 2000, 107, 297–330.
  13. Wagner, M. Pontryagin Maximum Principle for Dieudonne-Rashevsky Type Problems Involving Lipschitz functions. Optimization 1999, 46, 165–184.
  14. Udriste, C.; Dinu, S.; Tevy, I. Multitime optimal control for linear PDEs with curvilinear cost functional. Balk. J. Geom. Appl. 2013, 8, 87–100.
  15. Udriste, C.; Tevy, I. Multitime Dynamic Programming for Curvilinear Integral Actions. J. Optim. Theory Appl. 2010, 146, 189–207.
  16. Bejenaru, A.; Udriste, C. Multivariate Optimal Control with Payoffs Defined by Submanifold Integrals. Symmetry 2019, 11, 893.
  17. Treanţă, S. Optimal control problems with fundamental tensor evolution. J. Control Decis. 2020.
  18. Barbero-Liñán, M.; Muñoz-Lecanda, M.C. Geometric Approach to Pontryagin’s Maximum Principle. Acta Appl. Math. 2008, 108, 429–485.
