Manifold Calculus in System Theory and Control—Second Order Structures and Systems

Simone Fiori

doi:10.3390/sym14061144

Dipartimento di Ingegneria dell’Informazione, Università Politecnica delle Marche, Via Brecce Bianche, I-60131 Ancona, Italy

Symmetry 2022, 14(6), 1144; https://doi.org/10.3390/sym14061144

Abstract

The present tutorial paper constitutes the second of a series of tutorials on manifold calculus with applications in system theory and control. The aim of the present tutorial, in particular, is to explain and illustrate some key concepts in manifold calculus such as covariant derivation and manifold curvature. Such key concepts are then applied to the formulation, the control, and the analysis of non-linear dynamical systems whose state spaces are smooth (Riemannian) manifolds. The main flow of exposition is enriched by a number of examples whose aim is to clarify the notation used and the main theoretical findings through practical calculations.

Keywords:

second-order dynamical systems on manifold; feedback control system; smooth manifolds; covariant derivation; manifold curvature

1. Introduction

The present paper appears as the second part of a long tutorial covering notions of calculus on manifolds embedded in larger flat spaces, dynamical systems whose state spaces are structured as smooth manifolds and non-linear control systems on manifolds.

Regulation of dynamical systems on state manifolds is an appealing research field in non-linear control theory which is gaining increasing interest in the scientific community, especially in the field of mechanical systems control [1,2,3,4,5,6]. Depending on the application at hand, one may be facing two cases [7]: positional control, arising in the regulation of first-order as well as second-order systems, and velocity control, arising in the regulation of second-order systems on manifolds. An embedded-manifold setting proves useful in modeling large-scale constrained systems, since a state manifold allows one to take into account non-linear constraints on a non-linear system’s variables seamlessly. A number of scientific papers, of both theoretical and applied nature, were published over the years to investigate such extensions. Since the early days of control theory, it has been clear that geometric methods are of prime importance in the study of dynamical systems and their regulation [8,9,10,11]. For a short review of contributions in the theory of Lie-group as well as general manifold system regulation, interested readers might like to consult [7].

A challenging problem in feedback control on smooth manifolds is the synchronization in time of non-linear dynamical systems. An appealing application is the synchronization of a follower oscillator, whose state variables take the role of a masking carrier, to a leader oscillator, to be employed in data encryption for secure transmission over digital networks [12]. In such an application, it is assumed that the leader oscillator state variable is completely observable, that its output coincides with its state and that its dynamics is completely specified.

The present tutorial paper is devoted to describing and illustrating key concepts in manifold calculus such as covariant derivation and manifold curvature with applications to formulating, controlling and studying the properties of dynamical systems whose state-space is a smooth manifold. Its content may be summarized as follows:

It provides a clear and well-motivated introduction to advanced concepts in manifold calculus, the basis of system and control theory on manifolds, with special emphasis on their computational and application aspects. The present contribution provides practical formulas to cope with those real-valued manifolds which, in the author’s experience, are most frequently encountered in engineering and applied science problems. As a matter of fact, complex-valued manifolds are not treated at all;
It clearly states and illustrates the idea that, when dealing with a computer-based implementation of dynamical systems, it is necessary to discretize the differential equations that describe the system in a suitable way. Standard discretization methods (such as the ones based on Euler forward-backward discretization) are not advisable for this purpose, since they do not work as they stand on curved manifolds. It is therefore advisable to resort to more sophisticated integration techniques such as the one based on geodesics.
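The remark on discretization can be illustrated on the unit sphere $S^{2}$. The following is only a minimal sketch under stated assumptions, not a method taken from this tutorial: the sphere is assumed to carry its canonical metric, for which the geodesic-based step has the standard closed form $\exp_{x} (v) = x \cos ‖v‖ + (v / ‖v‖) \sin ‖v‖$, and the vector field `field` (a cross product with a fixed array) is a hypothetical choice made only because it is tangent to the sphere at every point.

```python
import math

def norm(v):
    return math.sqrt(sum(c * c for c in v))

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def field(x, a=(0.3, -0.2, 1.0)):
    # Hypothetical state-transition function: a x x is orthogonal to x,
    # hence tangent to the unit sphere at x.
    return cross(a, x)

def euler_step(x, h):
    # Forward Euler in the ambient space: x + h f(x)
    return [xi + h * vi for xi, vi in zip(x, field(x))]

def sphere_exp(x, v):
    # Assumed closed form of the exponential map on the unit sphere
    n = norm(v)
    if n < 1e-15:
        return list(x)
    return [xi * math.cos(n) + vi / n * math.sin(n) for xi, vi in zip(x, v)]

def geodesic_step(x, h):
    # Geodesic-based step: move along the geodesic leaving x with velocity h f(x)
    return sphere_exp(x, [h * vi for vi in field(x)])

def simulate(step, x0, h, n_steps):
    x = list(x0)
    for _ in range(n_steps):
        x = step(x, h)
    return x

x0 = [1.0, 0.0, 0.0]
x_euler = simulate(euler_step, x0, 0.05, 200)
x_geo = simulate(geodesic_step, x0, 0.05, 200)
print("Euler    : |x| - 1 =", norm(x_euler) - 1.0)
print("geodesic : |x| - 1 =", norm(x_geo) - 1.0)
```

In this sketch the Euler iterate drifts away from the sphere, because each step adds a tangent increment in the flat ambient space, whereas the geodesic-based iterate keeps unit norm up to rounding errors.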

The present paper is organized as follows. Section 2 is dedicated to a recapitulation of the notation used in the first part [13] as well as the present part, along with the main concepts covered by the first part. The recapitulation offered in Section 2 is not very detailed, as readers approaching the present part of the tutorial are supposed either to acquaint themselves sufficiently with the introductory material presented in the first part or to be already familiar with it. Section 3 introduces the key notion of covariant derivation of a vector field, which generalizes to manifolds the familiar notion of directional derivative from real calculus. Section 3 discusses such concept in detail, starting from the motivations behind its definition. Section 4 takes off from some observations about iterated covariant derivation and lands at the key concept of manifold curvature, which is a quintessential notion in manifold calculus. Section 5 surveys the notion of calculus of variations on manifolds with the help of a family of smooth curves and shows how to derive a general kind of continuous-time dynamical systems described by second-order differential equations on manifolds. Control systems to regulate such second-order dynamical systems are discussed in Section 6 with special emphasis on time synchronization. Section 7 presents an overview of second-order discrete-time systems on manifolds with special emphasis on ARMA-type random path generators. Section 8 concludes this tutorial.

As a distinctive aspect of the present tutorial, the main flow of discussion is based on coordinate-free (or component-free) expressions, which facilitates the implementation of the main equations by a matrix-friendly computation language such as MATLAB©. Starred subsections, namely subsections marked by an asterisk (*), treat some specific arguments, some related to coordinate-prone manifold calculus and some of minor importance in the present context, which may be skipped by uninterested readers without detriment to the comprehension of the main flow.

2. Notation and Recapitulation of Fundamentals

In the present section we recall some notation and the principal notions laid out in the first part [13]. For details on derivations, we point the reader to such introductory part.

In the present paper, a column array is denoted by a lower-case symbol, while a matrix is denoted by an upper-case symbol. An element of a manifold is however always denoted by a lower-case symbol.

It is convenient to recall a few manifolds of interest in application, defined in embedded (coordinate-free) terms:

Hypercube: The simplest manifold of interest is the hypercube $R^{p}$, which is essentially the set spanned by p real-valued variables (or p-tuples). A generalization of the hypercube is the set $R^{p \times q}$ of real-valued $p \times q$ matrices.
Hypersphere: A hypersphere is represented as $S^{p - 1} := \{x \in R^{p} \mid x^{⊤} x = 1\}$ and is the subset of points of the hypercube with unit Euclidean distance from the point 0. This is a smooth manifold of dimension $p - 1$.
General linear group and special linear group: The general linear group is defined as $GL (p) := \{X \in R^{p \times p} \mid \det (X) \neq 0\}$. It represents the set of invertible linear operators of given dimension. The special linear group is defined as $Sl (p) := \{X \in R^{p \times p} \mid \det (X) = 1\}$.
Special orthogonal group: The manifold of special orthogonal matrices is defined as $SO (p) := \{X \in R^{p \times p} \mid X^{⊤} X = I_{p}, \det (X) = 1\}$. It represents the set of hyper-rotations in a hypercube of given dimension. In the definition, the symbol $I_{n}$ denotes an $n \times n$ identity matrix.
Stiefel manifold: The (compact) Stiefel manifold is defined as:

$St (n, p) := \{X \in R^{n \times p} \mid X^{⊤} X = I_{p}\},$

(1)

where $p \leq n$.
Real symplectic group: The real symplectic group is defined as

$Sp (2 n) := \{Q \in R^{2 n \times 2 n} \mid Q^{⊤} J Q = J\}, \quad J := \begin{bmatrix} 0_{n} & I_{n} \\ - I_{n} & 0_{n} \end{bmatrix},$

(2)

where the symbol $0_{n}$ denotes a whole-zero $n \times n$ matrix.
Manifold of symmetric, positive-definite (SPD) matrices: The space of symmetric, positive-definite matrices is defined as

$S^{+} (p) := \{P \in R^{p \times p} \mid P = P^{⊤}, P > 0\},$

(3)

where the notation $P > 0$ indicates a positive-definite matrix.
Grassmann manifold: A Grassmann manifold $Gr (n, p)$ is a set of subspaces of $R^{n}$ spanned by p independent vectors, namely

$Gr (n, p) = \{\mathrm{span} (w_{1}, w_{2}, w_{3}, \dots, w_{p})\},$

(4)

with $w_{1}, w_{2}, w_{3}, \dots, w_{p} \in R^{n}$ being a p-tuple of arbitrary and linearly independent n-dimensional arrays.

A fundamental assumption of the present tutorial is that any manifold $M$ is embedded into an ambient space $A$ of suitable dimension. The ambient space is assumed to be metric, namely to be associated with a scalar product $⟨ \ast, \ast ⟩^{A}$.

A smooth curve is generally denoted as $γ : [- ϵ, ϵ] \to M$, with $ϵ > 0$. A curve may be thought of as a trajectory generated by a dynamical system whose state space is a smooth manifold.

The tangent space to the manifold $M$ at a point x is denoted as $T_{x} M$. The tangent bundle is a collection of tangent spaces, namely $T M := \{(x, v) \in A^{2} \mid x \in M, v \in T_{x} M\}$. The concept of normal space complements that of tangent space, namely, the normal space of an embedded manifold at a given point under a chosen metric is defined as $N_{x} M := \{z \in A \mid ⟨ z, v ⟩^{A} = 0, \forall v \in T_{x} M\}$.

A vector field on a manifold $M$ is a function $f : M \to T M$ that assigns a tangent vector $f (x) \in T_{x} M$ to every point $x \in M$. A vector field may even depend on the time parameter, in which case it will be denoted as $f (t, x)$ with $f : R \times M \to T M$.

In order to take into account non-linear constraints on the state variables of a dynamical system, it is convenient to introduce the notion of curved state space under the form of smooth manifolds. This logical process leads to a type of first-order (or single-integrator) dynamical systems described by

$\dot{x} (t) = f (t, x (t)), \quad x (0) = x_{0} \in M, \quad t \in [0, t_{f}],$

(5)

where $f : R \times M \to T M$ denotes the state-transition function of the system and $x \in M$ denotes the state of the system at any given time t.

Given two Riemannian manifolds $M$ and $N$, a smooth function $f : M \to N$ and a point $x \in M$, a pushforward map $d_{x} f : T_{x} M \to T_{f (x)} N$ is defined such that, for every smooth curve $γ_{x} : [- ϵ, ϵ] \to M$ with $γ_{x} (0) = x$, it holds:

$d_{x} f (\dot{γ}_{x} (0)) := \left. \frac{d}{d t} f (γ_{x} (t)) \right|_{t = 0} .$

(6)

In general, given a function $f : M \to N$ that maps a point x from the manifold $M$ to the point $f (x)$ on the manifold $N$, the map $d_{x} f$ associates to any tangent vector v belonging to the tangent space $T_{x} M$ a tangent vector $d_{x} f (v)$ belonging to the tangent space $T_{f (x)} N$ by ‘pushing’ the former to the latter space.

Manifolds that also enjoy the distinguishing properties of an algebraic group are termed Lie groups. The tangent space of a Lie group $G$ at the identity is termed Lie algebra and is denoted as $\mathfrak{g}$. Any Lie group admits an exponential map, denoted as $\exp : \mathfrak{g} \to G$.

A Riemannian manifold is defined as a smooth manifold $M$ whose tangent bundle is equipped with a smooth family of inner products $M ∋ x \mapsto ⟨ \ast, \ast ⟩_{x} \in R$ termed ‘metric’. There exists a canonical metric related to the ambient space $A$ that a manifold $M$ is embedded in, given by $⟨ u, w ⟩_{x} = ⟨ u, G_{x} (w) ⟩^{A}$, where $⟨ \ast, \ast ⟩^{A}$ denotes an inner product in $A$. Such expression is based on a metric kernel $G : T M \to T M$. A derivative of the metric kernel with respect to a curve is defined by

$G_{γ}^{\bullet} (w, \dot{γ}) := \frac{d}{d t} G_{γ} (w),$

(7)

where $γ$ denotes any smooth curve.

A geodesic arc is defined as the curve $γ$ satisfying the following differential inclusion

$\frac{\partial F}{\partial γ} - \frac{d}{d t} \frac{\partial F}{\partial \dot{γ}} \in N_{γ} M,$

(8)

where $F (x, v) := ⟨ v, v ⟩_{x}$ denotes a fundamental form. A solution of such differential inclusion that joins two points $x, y \in M$, taking x as the departure point, is denoted as $γ_{x}^{y}$, while a solution with prescribed departure point $x \in M$ and velocity at departure $v \in T_{x} M$ is denoted as $γ_{x, v}$. The length of a geodesic $γ_{x}^{y}$ defines a distance $D (x, y)$ between its endpoints.

To a geodesic arc $γ_{x, v} : [0, 1] \to M$ is associated a map $\exp_{x} : T_{x} M \to M$ defined as $\exp_{x} (v) := γ_{x, v} (1)$. Such function is termed exponential map with pole $x \in M$. The inverse function associated to the exponential map is termed logarithmic map. On a manifold $M$ on which an exponential map $\exp : T M \to M$ is defined, a logarithmic map is denoted as $\log : M^{2} \to T M$. A logarithmic map satisfies the identity $\exp_{x} (\log_{x} y) = y$.
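To make the pair of maps concrete, the following minimal sketch evaluates them on the unit sphere $S^{2}$ with its canonical metric. The closed forms used here are the standard ones for the sphere and are an assumption of this sketch (they are not derived in the present section); the function names `sphere_exp` and `sphere_log` are illustrative only.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def sphere_exp(x, v):
    """exp_x(v): point reached at t = 1 by the geodesic leaving x with velocity v."""
    n = math.sqrt(dot(v, v))
    if n < 1e-15:
        return list(x)
    return [xi * math.cos(n) + vi * math.sin(n) / n for xi, vi in zip(x, v)]

def sphere_log(x, y):
    """log_x(y): tangent vector at x pointing towards y, of length D(x, y).

    Not defined for antipodal points (y = -x), where sin(theta) vanishes.
    """
    c = max(-1.0, min(1.0, dot(x, y)))
    theta = math.acos(c)                      # geodesic distance D(x, y)
    if theta < 1e-15:
        return [0.0] * len(x)
    return [theta / math.sin(theta) * (yi - c * xi) for xi, yi in zip(x, y)]

x = [1.0, 0.0, 0.0]
y = [0.0, 0.6, 0.8]
v = sphere_log(x, y)
y_back = sphere_exp(x, v)
print("log_x(y)        =", v)
print("exp_x(log_x(y)) =", y_back)
print("<x, log_x(y)>   =", dot(x, v))   # tangency: should vanish
```

The last printed value checks that the logarithm is a tangent vector at x, while the second one reproduces the identity $\exp_{x} (\log_{x} y) = y$ up to rounding errors.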

An orthogonal projector $Π_{x} : A \to T_{x} M$ is a function that maps an element of the ambient space to a tangent vector in a specific tangent space, namely $Π_{x} (a) \in T_{x} M \Leftrightarrow a - Π_{x} (a) \in N_{x} M$. A derivative of the orthogonal projector with respect to a curve is $Π^{\bullet} : A \times T M \to A$, which is defined as

$Π_{γ}^{\bullet} (a, \dot{γ}) := \frac{d}{d t} Π_{γ} (a),$

(9)

where $γ$ denotes any smooth curve.

Let us recall a property of the derivative $Π^{\bullet}$ that will turn out to be useful in the following development. Whenever $Π^{\bullet}$ is applied to a pair of tangent vectors, it returns a normal vector, namely its restriction acts as $Π^{\bullet} : T M \times T M \to N M$.

Example 1.

Let us show the behavior of the derivative $Π^{\bullet}$ just described in the case of the manifold $SO (n)$ endowed with its canonical metric. In this case $Π_{R} (A) = \frac{1}{2} (A - R A^{⊤} R)$; therefore, taking $W, V \in T_{R} SO (n)$ and a curve $t \mapsto R (t)$ such that $\dot{R} = V$, it is seen that

$\begin{aligned} Π_{R}^{\bullet} (W, V) &= - \frac{1}{2} (V W^{⊤} R + R W^{⊤} V) \\ &= - \frac{1}{2} (V W^{⊤} + R W^{⊤} V R^{⊤}) R \\ &= - \frac{1}{2} (V W^{⊤} + W R^{⊤} R V^{⊤}) R \\ &= - \frac{1}{2} (V W^{⊤} + W V^{⊤}) R . \end{aligned}$

(10)

The matrix sum in parentheses is symmetric, hence $Π_{R}^{\bullet} (W, V)$ is a normal vector at R.
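The calculation of Example 1 may be checked numerically on $SO (3)$. The sketch below is only illustrative: the rotation angle and the two skew-symmetric matrices are arbitrary choices, tangent vectors are generated as $R Ω$ with $Ω$ skew-symmetric, and the derivative of the projector is approximated by a central difference along the straight line $R + t V$ in the ambient space, which has the same velocity V at $t = 0$ as any curve on the manifold with $\dot{R} = V$.

```python
import math

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(row) for row in zip(*A)]

def add(A, B, s=1.0):
    # Returns A + s * B
    return [[a + s * b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def scale(A, s):
    return [[s * a for a in row] for row in A]

def max_abs(A):
    return max(abs(a) for row in A for a in row)

def project(R, A):
    # Pi_R(A) = (A - R A^T R) / 2
    return scale(add(A, matmul(matmul(R, transpose(A)), R), -1.0), 0.5)

def project_dot(R, W, V):
    # First line of (10): -(V W^T R + R W^T V) / 2
    Wt = transpose(W)
    return scale(add(matmul(matmul(V, Wt), R), matmul(matmul(R, Wt), V)), -0.5)

c, s = math.cos(0.7), math.sin(0.7)
R = [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]                  # rotation about the z axis
Omega_w = [[0.0, -1.0, 2.0], [1.0, 0.0, -0.5], [-2.0, 0.5, 0.0]]  # skew-symmetric
Omega_v = [[0.0, 0.3, -1.0], [-0.3, 0.0, 0.8], [1.0, -0.8, 0.0]]  # skew-symmetric
W = matmul(R, Omega_w)   # tangent vector at R
V = matmul(R, Omega_v)   # tangent vector at R

N = project_dot(R, W, V)
S = matmul(N, transpose(R))          # N = S R, with S expected to be symmetric

# Central difference of t -> Pi_{R + t V}(W) at t = 0
h = 1e-6
fd = scale(add(project(add(R, V, h), W), project(add(R, V, -h), W), -1.0), 1.0 / (2 * h))

print("asymmetry of S       :", max_abs(add(S, transpose(S), -1.0)))
print("tangent part of N    :", max_abs(project(R, N)))
print("formula vs difference:", max_abs(add(fd, N, -1.0)))
```

All three printed quantities should be negligible: the first two confirm that the result has the form (symmetric matrix) times R and hence no tangent component, and the third one compares the closed form with the difference quotient.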

The Riemannian gradient of a smooth function $f : M \to R$ is denoted as $\mathrm{grad}_{x} f$ and may be written by the explicit formula $\mathrm{grad}_{x} f = G_{x}^{- 1} (Π_{x} (\partial_{x} f))$, where $\partial_{x} f$ denotes the Gateaux derivative of f in $A$. A golden calculation rule involving the notion of Riemannian gradient and that of Riemannian distance is $\mathrm{grad}_{x} D^{2} (x, y) = - 2 \log_{x} y$.
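The gradient formula and the rule on the squared distance may be checked together on the unit sphere. The sketch below rests on assumptions that are not stated in this section: the sphere carries its canonical metric, so that the metric kernel is the identity, the projector is $Π_{x} (a) = a - x (x^{⊤} a)$, the distance is $D (x, y) = \arccos (x^{⊤} y)$ and the logarithmic map has its standard closed form. The Gateaux derivative is approximated by central differences in the ambient space.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def project(x, a):
    # Assumed projector onto T_x S^2: Pi_x(a) = a - x (x^T a)
    s = dot(x, a)
    return [ai - s * xi for ai, xi in zip(a, x)]

def sphere_log(x, y):
    # Assumed closed form of the logarithmic map on the unit sphere
    c = max(-1.0, min(1.0, dot(x, y)))
    theta = math.acos(c)
    if theta < 1e-15:
        return [0.0] * len(x)
    return [theta / math.sin(theta) * (yi - c * xi) for xi, yi in zip(x, y)]

def sq_dist(x, y):
    # D^2(x, y) = arccos(x^T y)^2, evaluated by the same formula slightly off the sphere
    return math.acos(max(-1.0, min(1.0, dot(x, y)))) ** 2

def ambient_gradient(f, x, h=1e-6):
    # Central-difference approximation of the Gateaux derivative in the ambient space
    g = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        g.append((f(xp) - f(xm)) / (2.0 * h))
    return g

x = [0.0, 0.0, 1.0]
y = [0.6, 0.0, 0.8]
# grad_x f = G_x^{-1} Pi_x (d_x f), with G_x = identity in this setting
grad = project(x, ambient_gradient(lambda z: sq_dist(z, y), x))
expected = [-2.0 * c for c in sphere_log(x, y)]
print("numerical gradient:", grad)
print("-2 log_x(y)       :", expected)
```

The two printed arrays should agree up to the accuracy of the finite-difference approximation.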

Parallel transport of a vector $v \in T_{x} M$ along a geodesic joining points $x, y \in M$ is indicated by $P^{x \to y} (v)$. By this notation it is understood that x is the departure point. Its calculation is based on a Christoffel form of the second kind. It is convenient to define two Christoffel forms, namely the Christoffel form of the first kind $\bar{Γ} : (T M)^{2} \to A$ as

$\bar{Γ}_{x} := \frac{1}{2} G_{x}^{\bullet} + \text{a normal component},$

(11)

and a Christoffel form of the second kind $\bar{\bar{Γ}} : (T M)^{2} \to A$ as $\bar{\bar{Γ}}_{x} := G_{x}^{- 1} \bar{Γ}_{x}$. A property that involves parallel transport and manifold logarithm is $P^{x \to y} (\log_{x} y) = - \log_{y} x$.
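The property linking parallel transport and logarithm may be verified numerically on the unit sphere. The sketch below assumes the canonical metric and uses the standard closed forms for the sphere, namely $P^{x \to y} (v) = v - \frac{y^{⊤} v}{1 + x^{⊤} y} (x + y)$ for $y \neq - x$ and the usual logarithmic map; these expressions are assumptions of the sketch and are not derived in this section.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def sphere_log(x, y):
    # Assumed closed form of the logarithmic map on the unit sphere
    c = max(-1.0, min(1.0, dot(x, y)))
    theta = math.acos(c)
    if theta < 1e-15:
        return [0.0] * len(x)
    return [theta / math.sin(theta) * (yi - c * xi) for xi, yi in zip(x, y)]

def transport(x, y, v):
    """Assumed closed form of P^{x -> y}(v) on the unit sphere (y != -x)."""
    k = dot(y, v) / (1.0 + dot(x, y))
    return [vi - k * (xi + yi) for vi, xi, yi in zip(v, x, y)]

x = [1.0, 0.0, 0.0]
y = [0.6, 0.0, 0.8]

lhs = transport(x, y, sphere_log(x, y))     # P^{x -> y}(log_x y)
rhs = [-c for c in sphere_log(y, x)]        # -log_y x

u = [0.0, 2.0, -1.0]                        # a tangent vector at x (orthogonal to x)
Pu = transport(x, y, u)

print("P(log_x y) =", lhs)
print("-log_y x   =", rhs)
print("<Pu, y>    =", dot(Pu, y))                 # transported vector is tangent at y
print("|u|^2, |Pu|^2 =", dot(u, u), dot(Pu, Pu))  # inner products are preserved
```

Besides the stated identity, the sketch also checks that the transported vector lands in the tangent space at y and keeps its length.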

3. Covariant Derivative of a Vector Field

Covariant derivation provides a kind of derivative to evaluate the rate of change of a vector field along a given direction starting from a given point. Covariant derivation is a generalization to manifolds of the familiar notion of directional derivative.

The present section is organized as follows. Section 3.1 presents a brief review of the standard directional derivative and of its properties. Section 3.2 presents a coordinate-free conceptualization of covariant derivation along a curve, which is the main topic of the present section. Section 3.3 introduces the more general notion of covariant derivation of a vector field along a given direction and that of connection. Section 3.4 discusses again the notion of Riemannian Hessian operator and shows how it can be obtained from the covariant derivative of the Riemannian gradient. Section 3.5 discusses a definition of covariant derivation through axioms and shows that the introduced Levi-Civita connection meets such axioms. Section 3.6 retraces the notion of covariant derivation in coordinates. Section 3.7 introduces two new concepts, namely the commutator of two vector fields and the torsion field, which are indeed closely related. Section 3.8 illustrates and explains the consequence of a quite remarkable relationship between the Christoffel form of the second kind and the linearization of the projection operator. Section 3.9 discusses the notion of Lie derivative, a primeval attempt to define differentiation on tangent bundles.

3.1. Brief Review of Directional Derivative and of Its Properties

Let us briefly review the notion of directional derivative to better understand its manifold counterpart. Let us consider a smooth scalar multivariable function $f : R^{3} \to R$. Given a point $p \in R^{3}$, a directional derivative essentially measures the rate of change of the function f when moving away from p along a given direction. Two points come immediately to mind when attempting to define a directional derivative:

Since f is a multivariable function, because its domain is $R^{3}$, there are plenty of ways to ‘move away’ from the foot-point p. The rate of change of the function f depends on the direction towards which one moves away from the point p, hence a direction $v \in R^{3}$ needs to be specified. For example, if one takes $f (x, y, z) := 0.1 x^{2} + 10 y^{2} + e^{z}$, the function f changes little along the x axis, changes more along the y axis, increases along the positive direction of the z axis and decreases along the negative direction of the z axis, not to mention the behavior corresponding to every possible combination of the triple $(x, y, z)$.
Since f is arbitrary, its rate of change depends also on ‘how far’ one moves away from the foot-point p. Taking a large leap is, however, not very useful, as one may always replace the foot-point p with a new foot-point $p^{'}$ to know how the function behaves far from p. Therefore, it is understood that the rate of change is evaluated in close proximity of the point p.

Consequently, a possible definition of directional derivative is

$D_{p} f (v) := \lim_{h \to 0} \frac{f (p + h v) - f (p)}{h} \in R \quad \text{for } p \in R^{3}, \ v \in R^{3} .$

(12)

From multivariable calculus it is known that

$D_{p} f (v) = \left. \frac{d f (p + t v)}{d t} \right|_{t = 0} = \frac{\partial f}{\partial x} v_{x} + \frac{\partial f}{\partial y} v_{y} + \frac{\partial f}{\partial z} v_{z} = (\partial_{p} f)^{⊤} v,$

(13)

where $v = (v_{x}, v_{y}, v_{z})$ and $\partial_{p} f$ denotes the gradient (or Jacobian, or Gateaux derivative) of the function f with respect to its three scalar arguments, namely $\partial_{p} f := (\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}, \frac{\partial f}{\partial z})$. The rightmost quantity is, according to the general definition (6), the differential $d_{p} f (v)$ of the function f, at point p, applied to the vector v. In this case, however, since we are working in $R^{3}$, the ‘vectorial’ nature of v is not immediately apparent.
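As a quick numerical illustration of (12) and (13), the following sketch evaluates the directional derivative of the example function $f (x, y, z) = 0.1 x^{2} + 10 y^{2} + e^{z}$ along a few directions. The foot-point, the list of directions and the use of a central difference quotient in place of the one-sided limit are choices made here for illustration only.

```python
import math

def f(p):
    x, y, z = p
    return 0.1 * x ** 2 + 10.0 * y ** 2 + math.exp(z)

def gradient(p):
    # Analytic gradient of f, used on the right-hand side of (13)
    x, y, z = p
    return [0.2 * x, 20.0 * y, math.exp(z)]

def directional_derivative(func, p, v, h=1e-6):
    # Central-difference version of the difference quotient in (12)
    fwd = func([pi + h * vi for pi, vi in zip(p, v)])
    bwd = func([pi - h * vi for pi, vi in zip(p, v)])
    return (fwd - bwd) / (2.0 * h)

p = [1.0, 1.0, 0.0]
directions = {
    "x axis": [1.0, 0.0, 0.0],
    "y axis": [0.0, 1.0, 0.0],
    "+z axis": [0.0, 0.0, 1.0],
    "-z axis": [0.0, 0.0, -1.0],
    "mixed": [1.0, -2.0, 0.5],
}

results = {}
for name, v in directions.items():
    numeric = directional_derivative(f, p, v)
    exact = sum(g * c for g, c in zip(gradient(p), v))
    results[name] = (numeric, exact)
    print(f"{name:8s} numeric = {numeric: .6f}   gradient formula = {exact: .6f}")
```

At the chosen foot-point the gradient is $(0.2, 20, 1)$, so the rate of change is small along the x axis, large along the y axis, and of opposite sign along the two orientations of the z axis, in agreement with the qualitative discussion above.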

Let us now generalize this notion by taking a smooth function $f : R^{m} \to R^{n}$, which is an array-type multivariable function that we can call a multivariable vector field. The function f assigns an array to any point of its domain. Therefore $f (p)$ may be thought of as a vector field (once again, its vectorial nature as understood in manifold calculus is not apparent). Like before, a notion of ‘rate of change’ of a vector field along a given direction may be expressed as

$D_{p} f (v) := \lim_{h \to 0} \frac{f (p + h v) - f (p)}{h} \in R^{n} \quad \text{for } p \in R^{m}, \ v \in R^{m} .$

(14)

Although the definition just given is formally identical to (12), the spaces where the two directional derivatives live are different. From multivariable calculus we get:

$(D_{p} f (v))_{i} = \left( \left. \frac{d f (p + t v)}{d t} \right|_{t = 0} \right)_{i} = \sum_{j = 1}^{m} \frac{\partial f_{i}}{\partial p_{j}} v_{j},$

(15)

where $p_{i}$ denotes the ith component of the m-dimensional array p, $v_{i}$ denotes the ith component of the m-dimensional array v and $f_{i}$ denotes the ith component of the array-valued function f (with respect to the canonical basis of the domain $R^{m}$ and of the codomain space $R^{n}$). Hence, in this generalized scenario, $D_{p} f (v) = (\partial_{p} f) v$, where $\partial_{p} f$ denotes the Jacobian matrix of the transformation $p \mapsto f (p)$. In this case, the rate of change itself is multivariable, hence this directional derivative is related to the notion of differential in the sense that $(D_{p} f (v))_{i} = d_{p} f_{i} (v)$.

To complete this brief survey of directional derivation, let us emphasize a few properties that the directional derivative $D_{p} f$ enjoys:

($P_{1}$) Assume that the direction v along which the rate of change is sought is given by the superposition of two partial directions, namely $v = α u + β w$, with $α, β$ being two scalar fields, namely $α = α (p)$ and $β = β (p)$. How the rate of change $D_{p} f (v)$ relates to the rates of change $D_{p} f (u)$ and $D_{p} f (w)$ along partial directions is easily understood:

$D_{p} f (α u + β w) = (\partial_{p} f) (α u + β w) = α (\partial_{p} f) u + β (\partial_{p} f) w = α D_{p} f (u) + β D_{p} f (w) .$

(16)
($P_{2}$) Assume that the field f is given as the superposition of two concurrent fields, namely $f = b + g$. The way the rate of change $D_{p} f (v)$ relates to the rates of change $D_{p} b (v)$ and $D_{p} g (v)$ is also easily understood:

$D_{p} f (v) = D_{p} b (v) + D_{p} g (v) .$

(17)
($P_{3}$) In addition, assume that the field f is given as the dilation of another field, namely $f = α g$, with $α$ being a scalar field. How the rate of change $D_{p} f (v)$ relates to the rate of change $D_{p} g (v)$ is readily understood:

$D_{p} f (v) = g D_{p} α (v) + α D_{p} g (v) = g d_{p} α (v) + α D_{p} g (v)$

(18)

by the rule of partial derivation of compound functions.
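The three properties may be checked numerically with difference quotients. In the sketch below, the fields `g` and `b`, the scalar fields `alpha` and `beta`, the foot-point and the directions are all hypothetical choices made for illustration; directional derivatives are approximated by central differences, so the two sides of each property agree only up to the discretization error.

```python
import math

def num_dd(func, p, v, h=1e-6):
    """Central-difference estimate of D_p func (v) for an array-valued func."""
    fwd = func([pi + h * vi for pi, vi in zip(p, v)])
    bwd = func([pi - h * vi for pi, vi in zip(p, v)])
    return [(a - c) / (2.0 * h) for a, c in zip(fwd, bwd)]

# Hypothetical fields on R^2, chosen only for illustration
def g(p):
    return [math.sin(p[0]) * p[1], p[0] ** 2 + math.cos(p[1])]

def b(p):
    return [p[0] * p[1], math.exp(0.5 * p[0])]

def alpha(p):
    return 1.0 + p[0] ** 2 + 0.5 * p[1]

def beta(p):
    return math.cos(p[0]) - p[1]

def gap(x, y):
    return max(abs(a - c) for a, c in zip(x, y))

p = [0.4, -0.7]
u = [1.0, 2.0]
w = [-0.5, 0.3]
a0, b0 = alpha(p), beta(p)

# (P1) linearity in the direction, with alpha and beta evaluated at p
v = [a0 * ui + b0 * wi for ui, wi in zip(u, w)]
p1_lhs = num_dd(g, p, v)
p1_rhs = [a0 * x + b0 * y for x, y in zip(num_dd(g, p, u), num_dd(g, p, w))]

# (P2) additivity with respect to the field
p2_lhs = num_dd(lambda q: [x + y for x, y in zip(b(q), g(q))], p, u)
p2_rhs = [x + y for x, y in zip(num_dd(b, p, u), num_dd(g, p, u))]

# (P3) Leibniz-type rule for a field dilated by a scalar field
p3_lhs = num_dd(lambda q: [alpha(q) * x for x in g(q)], p, u)
d_alpha = num_dd(lambda q: [alpha(q)], p, u)[0]
p3_rhs = [gi * d_alpha + a0 * dgi for gi, dgi in zip(g(p), num_dd(g, p, u))]

print("P1 gap:", gap(p1_lhs, p1_rhs))
print("P2 gap:", gap(p2_lhs, p2_rhs))
print("P3 gap:", gap(p3_lhs, p3_rhs))
```

These are the same three requirements that a covariant derivative is asked to retain when the flat space is replaced by a curved manifold.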

Having recalled the notion of directional derivative of a vector field in $R^{m}$, we are now going to examine and discuss its extension to manifolds.

3.2. Coordinate-Free Conceptualization of Covariant Derivation along a Curve

The notion of covariant derivation may be conceptualized in different manners. We recall here two relevant ways:

Top-down from parallel transport: Assuming that a notion of parallel transport is available, which enables us to compare the values of a vector field at different points of a manifold, covariant derivation stems quite naturally by adapting the classical definition of directional derivative (14). In this sense, covariant derivation represents a ‘local’ version of parallel transport.
Direct axiomatization: In this instance, covariant derivative is defined to be an operator that meets certain requirements, which basically retrace properties $P_{1}$, $P_{2}$, $P_{3}$ noted in Section 3.1. Such definition is quite general and leads to a family of covariant derivatives. Moreover, such definition is given independently from that of parallel transport, which may, in turn, be defined as the solution of a covariant-derivative-based differential equation on the tangent bundle $T M$. In this sense, parallel transport represents a ‘global’ version of covariant derivation.

In the present tutorial paper, we are going to follow the first way of defining covariant derivation (based on parallel transport), not only because such a flow of discussion is coherent with the general setting of the present series of tutorials, but also because the top-down conceptualization is deemed to be clearer to control engineers and applied scientists and well-suited to developing numerical methods to simulate the behavior of second-order dynamical systems on manifolds.

As a first notion to be defined, we are going to deploy covariant derivation along a curve. Namely, we assume that a vector field $w \in Γ (M)$ is well-defined and regular along a smooth curve $γ : [a, b] \to M$ and we need to evaluate the rate of change of the vector field $γ \mapsto w_{γ}$ along the curve itself. Such rate of change will be denoted as $\nabla_{t}^{γ} w$, where $t \in [a, b]$ denotes the value of the parameter that uniquely identifies a point on the curve $γ$. Such rate of change is precisely the covariant derivative of the vector field w along the curve $γ$ at the point t, and is defined as

$\nabla_{t}^{γ} w := \lim_{h \to 0} \frac{P^{γ (t + h) \to γ (t)} w_{γ (t + h)} - w_{γ (t)}}{h} = \left. \frac{d}{d τ} P^{γ (τ) \to γ (t)} w_{γ (τ)} \right|_{τ = t} .$

(19)

Such definition appears as a natural extension of the directional derivative (14) of a vector field, properly corrected by parallel transport to align vectors belonging to different tangent spaces. It should be noticed that the curve $γ$ does not necessarily need to be a geodesic. In fact, in system and control theory, it likely represents a trajectory generated by a dynamical system over its state manifold, which only occasionally coincides with a geodesic arc.

In order to derive an explicit expression of the covariant derivative, let us evaluate the time-derivative of the function $Q : [a, b] \to R$ defined as follows:

$Q (t) := ⟨ w_{γ (t)}, \dot{γ} (t) ⟩_{γ (t)} .$

(20)

By definition of derivative with respect to the scalar parameter t, we have

$\dot{Q} (t) = \lim_{h \to 0} \frac{⟨ w_{γ (t + h)}, \dot{γ} (t + h) ⟩_{γ (t + h)} - ⟨ w_{γ (t)}, \dot{γ} (t) ⟩_{γ (t)}}{h} .$

(21)

By the property of conformal isometry of parallel transport (with respect to the inner product), it follows that

$⟨ w_{γ (t + h)}, \dot{γ} (t + h) ⟩_{γ (t + h)} = ⟨ P^{γ (t + h) \to γ (t)} (w_{γ (t + h)}), P^{γ (t + h) \to γ (t)} (\dot{γ} (t + h)) ⟩_{γ (t)},$

(22)

hence the derivative $\dot{Q}$ may be rewritten as follows, where the cross term between the two increments is of second order in h and therefore vanishes in the limit:

$\begin{aligned} \dot{Q} (t) &= \lim_{h \to 0} \frac{⟨ P^{γ (t + h) \to γ (t)} (w_{γ (t + h)}), P^{γ (t + h) \to γ (t)} (\dot{γ} (t + h)) ⟩_{γ (t)} - ⟨ w_{γ (t)}, \dot{γ} (t) ⟩_{γ (t)}}{h} \\ &= \lim_{h \to 0} \frac{⟨ P^{γ (t + h) \to γ (t)} (w_{γ (t + h)}) - w_{γ (t)} + w_{γ (t)}, P^{γ (t + h) \to γ (t)} (\dot{γ} (t + h)) - \dot{γ} (t) + \dot{γ} (t) ⟩_{γ (t)} - ⟨ w_{γ (t)}, \dot{γ} (t) ⟩_{γ (t)}}{h} \\ &= \lim_{h \to 0} \frac{⟨ P^{γ (t + h) \to γ (t)} (w_{γ (t + h)}) - w_{γ (t)}, \dot{γ} (t) ⟩_{γ (t)}}{h} + \lim_{h \to 0} \frac{⟨ w_{γ (t)}, P^{γ (t + h) \to γ (t)} (\dot{γ} (t + h)) - \dot{γ} (t) ⟩_{γ (t)}}{h} \\ &= ⟨ \nabla_{t}^{γ} w, \dot{γ} (t) ⟩_{γ (t)} + ⟨ w_{γ (t)}, \nabla_{t}^{γ} \dot{γ} ⟩_{γ (t)} . \end{aligned}$

(23)

On the basis of the above development, it is possible to write an explicit expression for the covariant derivative of a vector field along a curve based on the (derivative of the) metric kernel. Let us rewrite the function $Q$ as $Q (t) = ⟨ w_{γ (t)}, G_{γ (t)} (\dot{γ} (t)) ⟩^{A}$. Then, it holds that

$\dot{Q} = ⟨ \dot{w}_{γ}, G_{γ} (\dot{γ}) ⟩^{A} + ⟨ w_{γ}, G_{γ}^{\bullet} (\dot{γ}, \dot{γ}) ⟩^{A} + ⟨ w_{γ}, G_{γ} (\ddot{γ}) ⟩^{A},$

(24)

where $\dot{w}_{γ}$ denotes, for brevity, the total naïve derivative $\frac{d}{d t} (w_{γ (t)})$. Notice that

$\frac{d}{d t} (w_{γ}) = (D_{γ} w) (\dot{γ}) = (\partial_{γ} w) (\dot{γ}),$

(25)

where $\partial_{x} w$ denotes the usual Gateaux derivative and appears as a linear operator (apparently, such operation is defined in the ambient space $A$ rather than in $M$). Let us rewrite the above expression by introducing a simple trick:

$\dot{Q} = \underbrace{⟨ G_{γ} (\dot{γ}), \dot{w}_{γ} ⟩^{A}}_{(A)} + \frac{1}{2} \underbrace{⟨ w_{γ}, G_{γ}^{\bullet} (\dot{γ}, \dot{γ}) ⟩^{A}}_{(B)} + \underbrace{⟨ w_{γ}, G_{γ} (\ddot{γ}) ⟩^{A}}_{(C)} + \frac{1}{2} \underbrace{⟨ w_{γ}, G_{γ}^{\bullet} (\dot{γ}, \dot{γ}) ⟩^{A}}_{(D)} .$

(26)

Let us examine each term separately. The term (A) in (26) may be rewritten as

$⟨ G_{γ} (\dot{γ}), \dot{w}_{γ} ⟩^{A} = ⟨ G_{γ} (\dot{γ}), Π_{γ} (\dot{w}_{γ}) ⟩^{A}$

(27)

by definition of orthogonal projection.

The term (B) may be rewritten as

$⟨ w_{γ}, G_{γ}^{\bullet} (\dot{γ}, \dot{γ}) ⟩^{A} = ⟨ \dot{γ}, G_{γ}^{\bullet} (w_{γ}, \dot{γ}) ⟩^{A}$

thanks to the commutativity of the derivative $G^{\bullet}$ with respect to the ambient inner product. Moreover, inserting the identity $G_{γ} \circ G_{γ}^{- 1}$ gives $⟨ \dot{γ}, G_{γ}^{\bullet} (w_{γ}, \dot{γ}) ⟩^{A} = ⟨ G_{γ} G_{γ}^{- 1} \dot{γ}, G_{γ}^{\bullet} (w_{γ}, \dot{γ}) ⟩^{A}$, where cumbersome parentheses were (and will be) omitted for brevity. Invoking again the property of the metric kernel to be self-adjoint and the definition of orthogonal projection leads to the identity

$⟨ w_{γ}, G_{γ}^{\bullet} (\dot{γ}, \dot{γ}) ⟩^{A} = ⟨ G_{γ} (\dot{γ}), Π_{γ} G_{γ}^{- 1} G_{γ}^{\bullet} (w_{γ}, \dot{γ}) ⟩^{A},$

(28)

where we have used quite liberally the ‘extendability’ property of the metric kernel to the ambient space.

The term (C) may be rewritten as $⟨ w_{γ}, G_{γ} (\ddot{γ}) ⟩^{A} = ⟨ G_{γ} (w_{γ}), \ddot{γ} ⟩^{A}$ and, introducing an orthogonal projector, we get $⟨ G_{γ} (w_{γ}), \ddot{γ} ⟩^{A} = ⟨ G_{γ} (w_{γ}), Π_{γ} (\ddot{γ}) ⟩^{A}$, hence

$⟨ w_{γ}, G_{γ} (\ddot{γ}) ⟩^{A} = ⟨ G_{γ} (w_{γ}), Π_{γ} (\ddot{γ}) ⟩^{A} .$

(29)

It is now worth noticing that the term (D) may be rewritten equivalently as $⟨ w_{γ}, G_{γ}^{\bullet} (\dot{γ}, \dot{γ}) ⟩^{A} = ⟨ w_{γ}, G_{γ} G_{γ}^{- 1} G_{γ}^{\bullet} (\dot{γ}, \dot{γ}) ⟩^{A}$ and, in addition, observing that $⟨ w_{γ}, G_{γ} G_{γ}^{- 1} G_{γ}^{\bullet} (\dot{γ}, \dot{γ}) ⟩^{A} = ⟨ G_{γ} (w_{γ}), G_{γ}^{- 1} G_{γ}^{\bullet} (\dot{γ}, \dot{γ}) ⟩^{A}$, we may conclude that

$⟨ w_{γ}, G_{γ}^{\bullet} (\dot{γ}, \dot{γ}) ⟩^{A} = ⟨ G_{γ} (w_{γ}), Π_{γ} G_{γ}^{- 1} G_{γ}^{\bullet} (\dot{γ}, \dot{γ}) ⟩^{A} .$

(30)

Summing up the four terms in (26) written in the indicated ways, we obtain the expression

$\begin{aligned} \dot{Q} = {} & ⟨ G_{γ} (\dot{γ}), Π_{γ} (\dot{w}_{γ}) + \frac{1}{2} Π_{γ} G_{γ}^{- 1} G_{γ}^{\bullet} (w_{γ}, \dot{γ}) ⟩^{A} \\ & + ⟨ G_{γ} (w_{γ}), Π_{γ} (\ddot{γ}) + \frac{1}{2} Π_{γ} G_{γ}^{- 1} G_{γ}^{\bullet} (\dot{γ}, \dot{γ}) ⟩^{A} . \end{aligned}$

(31)

Comparing the above expression with the relationship (23) and recalling that the vector field $w_{γ}$ is arbitrary, one gets the two identities

$\nabla_{t}^{γ} w = Π_{γ} \left\{ \frac{d}{d t} (w_{γ}) + \frac{1}{2} G_{γ}^{- 1} (G_{γ}^{\bullet} (w_{γ}, \dot{γ})) \right\},$

(32)

$\nabla_{t}^{γ} \dot{γ} = Π_{γ} \left\{ \ddot{γ} + \frac{1}{2} G_{γ}^{- 1} (G_{γ}^{\bullet} (\dot{γ}, \dot{γ})) \right\} .$

(33)

The two expressions above are coherent: in fact, setting $w_{γ} := \dot{γ}$ in the relationship (32) yields the relationship (33).

The relationship (32) may be rewritten through the Christoffel form of the first kind. In fact from the expression (11) it follows that the term (B) in (26) may be rewritten as

{⟨ \dot{γ}, G_{γ}^{•} (w_{γ}, \dot{γ}) ⟩}^{A} = 2 {⟨ \dot{γ}, {\bar{Γ}}_{γ} (w_{γ}, \dot{γ}) ⟩}^{A} = 2 {⟨ G_{γ} (\dot{γ}), G_{γ}^{- 1} {\bar{Γ}}_{γ} (w_{γ}, \dot{γ}) ⟩}^{A} = 2 {⟨ G_{γ} (\dot{γ}), Π_{γ} G_{γ}^{- 1} {\bar{Γ}}_{γ} (w_{γ}, \dot{γ}) ⟩}^{A} = 2 {⟨ G_{γ} (\dot{γ}), Π_{γ} {\bar{\bar{Γ}}}_{γ} (w_{γ}, \dot{γ}) ⟩}^{A}

. As a consequence, the covariant derivative (32) may be rewritten equivalently as

\nabla_{t}^{γ} w = Π_{γ} \{\frac{d}{d t} (w_{γ}) + {\bar{\bar{Γ}}}_{γ} (w_{γ}, \dot{γ})\},

(34)

a classical expression in manifold calculus. In terms of Gateaux derivative of the vector field w, we may also write

\nabla_{t}^{γ} w = Π_{γ} \{(\partial_{γ} w_{γ}) (\dot{γ}) + {\bar{\bar{Γ}}}_{γ} (w_{γ}, \dot{γ})\},

(35)

which evidences how

\nabla_{t}^{γ} w

is linear in

\dot{γ}

.

Two special cases of the above relationship are particularly worth examining separately:

Case of normal Christoffel form. Whenever the manifold $M$ is endowed with a normal Christoffel form ${\bar{\bar{Γ}}}_{x}$ , namely whenever $Π_{x} {\bar{\bar{Γ}}}_{x} \equiv 0$ , the general expression of the covariant derivative may be simplified as

$\nabla_{t}^{γ} w = Π_{γ} \{\frac{d}{d t} (w_{γ})\},$

(36)

namely, the covariant derivative of a vector field coincides with an orthogonal projection of the naïve derivative of the vector field.
Case of geodesics. When $w_{γ} = \dot{γ}$ , namely when the vector field under derivation coincides with the velocity field of a curve, it holds that $\nabla_{t}^{γ} \dot{γ} = Π_{γ} \{\ddot{γ} + {\bar{\bar{Γ}}}_{γ} (\dot{γ}, \dot{γ})\}$ . If $γ$ is a geodesic curve, since $\ddot{γ} + {\bar{\bar{Γ}}}_{γ} (\dot{γ}, \dot{γ}) = 0$ at every point of a geodesic, it turns out that, on every geodesic line,

$\nabla_{t}^{γ} \dot{γ} = 0 .$

(37)

In fact, in a ‘direct axiomatization’ setting, the above relationship is taken as one defining a geodesic!
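The case of geodesics may be checked numerically on the unit sphere, where the Christoffel form is normal and the covariant acceleration reduces to the projected naïve acceleration. The following is a minimal sketch, assuming the projector $Π_{x} (a) = a - x x^{⊤} a$; the foot-point, the tangent directions and the two test curves are hypothetical choices made only for illustration, and the second derivative is estimated by central differences.

```python
import numpy as np

def project(x, a):
    """Orthogonal projection of a onto the tangent space of the unit sphere at x."""
    return a - x * (x @ a)

def covariant_acceleration(curve, t, h=1e-4):
    """Projected second derivative of a curve on the sphere (central differences)."""
    naive = (curve(t + h) - 2.0 * curve(t) + curve(t - h)) / h**2
    return project(curve(t), naive)

# Hypothetical data: a foot-point x0 and two orthonormal tangent directions e1, e2.
x0 = np.array([0.0, 0.0, 1.0])
e1 = np.array([1.0, 0.0, 0.0])
e2 = np.array([0.0, 1.0, 0.0])

def great_circle(t):
    """Geodesic through x0 with unit initial velocity e1."""
    return x0 * np.cos(t) + e1 * np.sin(t)

def small_circle(t, alpha=np.pi / 4):
    """Circle at angular distance alpha from x0: a curve on the sphere, not a geodesic."""
    return np.cos(alpha) * x0 + np.sin(alpha) * (np.cos(t) * e1 + np.sin(t) * e2)

acc_geodesic = np.linalg.norm(covariant_acceleration(great_circle, 0.7))
acc_small = np.linalg.norm(covariant_acceleration(small_circle, 0.7))
print(acc_geodesic, acc_small)
```

On the great circle the covariant acceleration vanishes up to discretization error, while on the small circle its norm equals $\sin α \cos α$, which is non-zero unless the circle is itself a great circle.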

We shall revisit the expression (34) in Section 3.8, where we shall show that the projection is not strictly necessary, although removing the projection operator does not necessarily simplify the related calculations!

From the above calculations one might infer that, for all practical purposes, the operators

G_{x}

and

Π_{x}

commute, namely

G_{x} Π_{x} = Π_{x} G_{x}

.

Example 2.

In all instances in which

G_{x} \equiv {id}_{x}

, as in the case of the unit hypersphere endowed with its canonical metric, the property

G_{x} Π_{x} = Π_{x} G_{x}

is trivially verified.

Let us consider the case of the manifold

S^{+} (n)

endowed with its canonical metric, for which

G_{P} (W) = P^{- 1} W P^{- 1}

and

Π (A) = \frac{1}{2} (A^{⊤} + A)

(uniform, namely independent of the point P). In this case

\begin{matrix} Π G_{P} (A) = & \frac{1}{2} ({(P^{- 1} A P^{- 1})}^{⊤} + P^{- 1} A P^{- 1}) \\ = & \frac{1}{2} (P^{- ⊤} A^{⊤} P^{- ⊤} + P^{- 1} A P^{- 1}) \\ = & \frac{1}{2} (P^{- 1} A^{⊤} P^{- 1} + P^{- 1} A P^{- 1}) \\ = & P^{- 1} \frac{A^{⊤} + A}{2} P^{- 1} \\ = & G_{P} Π (A) . \end{matrix}

(38)

Let us further consider the case of the Stiefel manifold

St (n, p)

endowed with its canonical metric, where

G_{X} (W) = (I_{n} - \frac{1}{2} X X^{⊤}) W

and

Π_{X} (A) = A - \frac{1}{2} X (X^{⊤} A + A^{⊤} X)

. In this case, it holds

\begin{matrix} Π_{X} G_{X} (A) = & (I_{n} - \frac{1}{2} X X^{⊤}) A - \frac{1}{2} X (X^{⊤} (I_{n} - \frac{1}{2} X X^{⊤}) A + A^{⊤} (I_{n} - \frac{1}{2} X X^{⊤}) X) \\ = & A - \frac{3}{4} X X^{⊤} A - \frac{1}{4} X A^{⊤} X, \\ G_{X} Π_{X} (A) = & (I_{n} - \frac{1}{2} X X^{⊤}) (A - \frac{1}{2} X (X^{⊤} A + A^{⊤} X)) \\ = & A - \frac{3}{4} X X^{⊤} A - \frac{1}{4} X A^{⊤} X, \end{matrix}

(39)

hence the property of commutativity of the two operators is verified once again.

In fact, the commutation property just exemplified may be proved quite easily by observing that, for every

a \in A

and

v \in T_{x} M

, the following equality chain holds:

\begin{matrix} {⟨ G_{x} Π_{x} {a}, v ⟩}_{x} & = {⟨ G_{x} Π_{x} {a}, G_{x} (v) ⟩}^{A} = {⟨ Π_{x} {a}, G_{x}^{2} (v) ⟩}^{A} = {⟨ a, G_{x}^{2} (v) ⟩}^{A} \\ = {⟨ G_{x} (a), G_{x} (v) ⟩}^{A} = {⟨ Π_{x} G_{x} (a), G_{x} (v) ⟩}^{A} = {⟨ Π_{x} G_{x} (a), v ⟩}_{x}, \end{matrix}

(40)

where we used the shortened notation

G_{x}^{2}

for

G_{x} \circ G_{x}

. The above equality chain implies that

G Π = Π G

.
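The commutation property of Example 2 lends itself to a quick numerical check. The sketch below is only illustrative: the matrix sizes, the random seed and the helper names are assumptions, while the expressions of $G$ and $Π$ are those recalled above for $S^{+} (n)$ and $St (n, p)$.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 5, 2

# Manifold S^+(n): G_P(W) = P^{-1} W P^{-1} and Pi(A) = (A + A^T) / 2.
M = rng.standard_normal((n, n))
P = M @ M.T + n * np.eye(n)            # a symmetric, positive-definite matrix
P_inv = np.linalg.inv(P)

def G_spd(W):
    return P_inv @ W @ P_inv

def Pi_spd(A):
    return 0.5 * (A + A.T)

# Stiefel manifold St(n, p): G_X(W) = (I - X X^T / 2) W and
# Pi_X(A) = A - X (X^T A + A^T X) / 2.
X, _ = np.linalg.qr(rng.standard_normal((n, p)))   # X^T X = I_p

def G_st(W):
    return (np.eye(n) - 0.5 * X @ X.T) @ W

def Pi_st(A):
    return A - 0.5 * X @ (X.T @ A + A.T @ X)

A_spd = rng.standard_normal((n, n))    # arbitrary ambient elements
A_st = rng.standard_normal((n, p))

err_spd = np.linalg.norm(G_spd(Pi_spd(A_spd)) - Pi_spd(G_spd(A_spd)))
err_st = np.linalg.norm(G_st(Pi_st(A_st)) - Pi_st(G_st(A_st)))

# Closed form obtained in (39) for both compositions on the Stiefel manifold.
closed_form = A_st - 0.75 * X @ X.T @ A_st - 0.25 * X @ A_st.T @ X
err_closed = np.linalg.norm(Pi_st(G_st(A_st)) - closed_form)
print(err_spd, err_st, err_closed)
```

All three discrepancies are at the level of round-off, in agreement with (38) and (39).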

Let us consider a few examples of calculation of covariant derivative.

Example 3.

As an example of calculation of covariant derivative, let us consider the case of the unit hypersphere

S^{n - 1}

endowed with its canonical metric. The orthogonal projector reads

Π_{x} (a) = (I_{n} - x x^{⊤}) a

. According to the relationship (36), the covariant derivative would read

\nabla_{t}^{γ} w = (I_{n} - γ γ^{⊤}) \frac{d}{d t} (w_{γ}) .

(41)

Let us verify the same result by applying the definition (19). Let us recall the expression of the parallel transport operator on the hyper-sphere

S^{n - 1}

that reads

P^{x \to y} (u) = (I_{n} - \frac{(x + y) y^{⊤}}{1 + x^{⊤} y}) u

. By direct calculation we get

\begin{matrix} \frac{d}{d τ} P^{γ (τ) \to γ (t)} w_{γ (τ)} = & \frac{d}{d τ} (w_{γ (τ)} - \frac{(γ (τ) + γ (t)) γ^{⊤} (t)}{1 + γ^{⊤} (τ) γ (t)} w_{γ (τ)}) \\ = & {\dot{w}}_{γ (τ)} \\ & - \frac{\dot{γ} (τ) γ^{⊤} (t) (1 + γ^{⊤} (τ) γ (t)) - (γ (τ) + γ (t)) γ^{⊤} (t) [{\dot{γ}}^{⊤} (τ) γ (t)]}{{(1 + γ^{⊤} (τ) γ (t))}^{2}} w_{γ (τ)} \\ & - \frac{(γ (τ) + γ (t)) γ^{⊤} (t)}{1 + γ^{⊤} (τ) γ (t)} {\dot{w}}_{γ (τ)} . \end{matrix}

(42)

Setting

τ = t

leads to

\begin{matrix} {\frac{d}{d τ} P^{γ (τ) \to γ (t)} w_{γ (τ)}|}_{τ = t} = & {\dot{w}}_{γ (t)} \\ - \frac{2 \dot{γ} (t) γ^{⊤} (t) - 2 γ (t) γ^{⊤} (t) [{\dot{γ}}^{⊤} (t) γ (t)]}{4} w_{γ (t)} - \frac{2 γ (t) γ^{⊤} (t)}{2} {\dot{w}}_{γ (t)} \\ = & {\dot{w}}_{γ (t)} - \frac{1}{2} \dot{γ} (t) (γ^{⊤} (t) w_{γ (t)}) - γ (t) γ^{⊤} (t) {\dot{w}}_{γ (t)} \\ = & (I_{n} - γ (t) γ^{⊤} (t)) {\dot{w}}_{γ (t)} . \end{matrix}

(43)

Hence the two results coincide, as expected.
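The agreement between the definition based on parallel transport and the projected naïve derivative may also be observed numerically. The following sketch makes a few assumptions that are not part of the text: the curve on $S^{2}$ and the tangent vector field are hypothetical test data, and all derivatives are approximated by central differences.

```python
import numpy as np

def transport(x, y, u):
    """Parallel transport P^{x->y}(u) = (I - (x + y) y^T / (1 + x^T y)) u on the sphere."""
    return u - (x + y) * (y @ u) / (1.0 + x @ y)

def curve(t):
    """A hypothetical smooth curve on S^2, obtained by normalizing a curve in R^3."""
    u = np.array([np.cos(t), np.sin(2.0 * t), 1.0 + 0.5 * t])
    return u / np.linalg.norm(u)

c = np.array([0.3, -1.2, 0.8])          # fixed ambient vector (hypothetical)

def field(x):
    """A hypothetical tangent vector field: projection of c onto T_x S^2."""
    return c - x * (x @ c)

t, h = 0.4, 1e-5
x = curve(t)

# Definition: derivative with respect to tau, at tau = t, of the field
# transported from gamma(tau) back to gamma(t).
def transported(tau):
    return transport(curve(tau), x, field(curve(tau)))

by_definition = (transported(t + h) - transported(t - h)) / (2.0 * h)

# Formula (41): orthogonal projection of the naive derivative of the field.
w_dot = (field(curve(t + h)) - field(curve(t - h))) / (2.0 * h)
by_projection = w_dot - x * (x @ w_dot)

gap = np.linalg.norm(by_definition - by_projection)
print(gap)
```

The two estimates differ only by the finite-difference error, and the vector obtained from the definition is tangent to the sphere at $γ (t)$.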

Let us now compute the covariant derivative of a vector field in

Γ (S^{+} (n))

. Let us recall that, for the manifold of symmetric, positive-definite matrices, we found (cf. [13]) that

{\bar{\bar{Γ}}}_{P} (V, W) = - \frac{1}{2} (V P^{- 1} W + W P^{- 1} V)

. It is straightforward to verify that, for every

P \in S^{+} (n)

,

V, W \in T_{P} S^{+} (n)

it holds that

{\bar{\bar{Γ}}}_{P} (V, W) \in T_{P} S^{+} (n)

, as well as that, for every vector field

W \in Γ (S^{+} (n))

and every smooth trajectory

t \mapsto P (t)

, it holds that the naïve derivative

\frac{d}{d t} W_{P (t)} \in T_{P (t)} S^{+} (n)

. As a consequence, in the present case the general relationship (34) simplifies into

\nabla_{t}^{P} W = \frac{d}{d t} W_{P} + {\bar{\bar{Γ}}}_{P} (W_{P}, \dot{P}),

(44)

which, by the expression of the Christoffel form, reads

\nabla_{t}^{P} W = \frac{d}{d t} W_{P} - \frac{1}{2} (W_{P} P^{- 1} \dot{P} + \dot{P} P^{- 1} W_{P}) .

(45)
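As an illustration of the formula (45), one may take the vector field under derivation to be the velocity field of a curve, in which case the covariant derivative reads $\ddot{P} - \dot{P} P^{- 1} \dot{P}$. The sketch below evaluates this quantity along two curves: a curve of the form $P_{0}^{1 / 2} \exp (t P_{0}^{- 1 / 2} V_{0} P_{0}^{- 1 / 2}) P_{0}^{1 / 2}$, which is assumed here to be the geodesic of the canonical metric (a standard closed form that is not derived in this section), and a straight line in the ambient space. The matrices $P_{0}$ and $V_{0}$ are hypothetical test data and the derivatives are estimated by central differences.

```python
import numpy as np

def sym_fun(S, fun):
    """Apply a scalar function to a symmetric matrix through its eigendecomposition."""
    lam, Q = np.linalg.eigh(S)
    return (Q * fun(lam)) @ Q.T

# Hypothetical test data: P0 is symmetric positive-definite, V0 is symmetric.
P0 = np.array([[2.0, 0.5, 0.0], [0.5, 1.5, 0.3], [0.0, 0.3, 1.0]])
V0 = np.array([[0.4, -0.2, 0.1], [-0.2, 0.3, 0.5], [0.1, 0.5, -0.6]])

def geodesic(t):
    """Assumed geodesic of S^+(n) through P0 with initial velocity V0."""
    R = sym_fun(P0, np.sqrt)
    R_inv = sym_fun(P0, lambda lam: 1.0 / np.sqrt(lam))
    S = R_inv @ V0 @ R_inv
    S = 0.5 * (S + S.T)                 # remove round-off asymmetry
    return R @ sym_fun(t * S, np.exp) @ R

def straight_line(t):
    """Ambient straight line through P0 (stays in S^+(n) for small t)."""
    return P0 + t * V0

def covariant_derivative(W_dot, W, P, P_dot):
    """Formula (45) for a vector field W along a trajectory P."""
    P_inv = np.linalg.inv(P)
    return W_dot - 0.5 * (W @ P_inv @ P_dot + P_dot @ P_inv @ W)

def covariant_acceleration(curve, t, h=1e-4):
    """Formula (45) with W equal to the velocity of the curve."""
    P = curve(t)
    P_dot = (curve(t + h) - curve(t - h)) / (2.0 * h)
    P_ddot = (curve(t + h) - 2.0 * P + curve(t - h)) / h**2
    return covariant_derivative(P_ddot, P_dot, P, P_dot)

res_geodesic = np.linalg.norm(covariant_acceleration(geodesic, 0.3))
res_line = np.linalg.norm(covariant_acceleration(straight_line, 0.3))
print(res_geodesic, res_line)
```

The residual vanishes (up to discretization error) along the assumed geodesic, whereas it does not along the straight line, whose covariant acceleration equals $- V_{0} P^{- 1} V_{0}$.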

To end with this series of examples, let us compute the covariant derivative of a vector field in

Γ (SO (n))

. Let us recall that, whenever the manifold

SO (n)

is endowed with its canonical metric, it holds

Π_{R} (A) = \frac{1}{2} A - \frac{1}{2} R A^{⊤} R

, while the Christoffel form of the second kind takes the expression

{\bar{\bar{Γ}}}_{R} (V, W) = \frac{1}{2} R (V^{⊤} W + W^{⊤} V)

. Indeed, the exact expression of the Christoffel form, in this case, is irrelevant, since it maps

T M \times T M

to the normal bundle

N M

. Therefore, over a trajectory

t \mapsto R (t)

, the covariant derivative of a vector field

R \mapsto W_{R}

reads

\nabla_{t}^{R} W = \frac{1}{2} \frac{d}{d t} W_{R} - \frac{1}{2} R {(\frac{d}{d t} W_{R})}^{⊤} R .

(46)

It is a useful exercise to verify by direct calculations from the definition that the above expression is correct.

3.3. Covariant Derivation along a Direction, Connections

Let

\nabla_{t}^{γ} w

denote the covariant derivative of a vector field

x \mapsto w_{x}

at a point t of a curve

γ

. In some cases, one might want to indicate explicitly the direction along which the derivative is taken, which coincides with

\dot{γ} (t)

. Referring the foot-point and the tangent direction at

t = 0

, in these cases one may set

v : = \dot{γ} (0)

and denote the covariant derivative by the shortened notation

\nabla_{v} w : = {\nabla_{t}^{γ} w|}_{t = 0} .

(47)

In this case, the covariant derivative does not depend on the curve

γ

, except for

γ (0)

and

\dot{γ} (0)

. (If it is necessary to specify the foot-point

x \in M

where all these computations take place, the elongated notation

{(\nabla_{v} w)}_{x}

may be used). For example, the covariant derivative of a velocity field with respect to itself, namely the covariant acceleration, would read

\nabla_{\dot{γ}} \dot{γ}

.

An extension of the covariant derivative of a vector field along a given direction leads to the notion of connection

\nabla : Γ (M) \times Γ (M) \to Γ (M)

. Let us consider a vector field

w \in Γ (M)

that is to be differentiated and a further vector field

v \in Γ (M)

that associates to each point of the manifold (or of a region of the manifold) a direction along which to differentiate. In other terms, at a point

x \in M

,

v_{x}

defines the direction along which the rate of change of the vector field w is sought. The corresponding covariant derivative, denoted by

x \mapsto \nabla_{v_{x}} w_{x}

or

{(\nabla_{v} w)}_{x}

, extends the notion of covariant derivation of a vector field along a curve and is based on the assumption that the vector field w is defined in every point of

M

(or within a region of the manifold). The formal definition goes like

x \mapsto {(\nabla_{v} w)}_{x} := \lim_{h \to 0} \frac{P^{\exp_{x} (h v_{x}) \to x} w_{\exp_{x} (h v_{x})} - w_{x}}{h} .

(48)

The connection described in the present tutorial is commonly referred to as Levi-Civita connection.

A metric connection is a covariant derivative that ‘agrees’ or is ‘coordinated’ with an inner product in the sense that the covariant derivative of an inner product of two vector fields depends only on the rate of change of such vector fields and does not depend on the rate of change of the inner product itself, namely, given two vector fields

v, w \in Γ (M)

and a tangent direction

u \in T_{x} M

, it holds that

\nabla_{u} {⟨ v_{x}, w_{x} ⟩}_{x} = {⟨ {(\nabla_{u} v)}_{x}, w_{x} ⟩}_{x} + {⟨ v_{x}, {(\nabla_{u} w)}_{x} ⟩}_{x} .

(49)

The definition (19) ensures that the resulting connection is metric. Notice, in fact, that

x \mapsto {⟨ u_{x}, w_{x} ⟩}_{x}

is a scalar field, a special case of vector field, that may be subjected to covariant derivation. In addition, notice that in a large number of references, the directional derivative

\nabla_{v} {⟨ u_{x}, w_{x} ⟩}_{x}

is denoted simply as

v (⟨ u, w ⟩)

, leaving understood that a vector field acts as a directional derivative.

Example 4.

We can see this even in the case of a flat manifold

R^{3}

endowed with an inner product

{⟨ u_{x}, w_{x} ⟩}_{x}

. In this case, we have that the directional derivative of the inner product of two vector fields along a direction v is

\nabla_{v} {⟨ u_{x}, w_{x} ⟩}_{x} := \lim_{h \to 0} \frac{{⟨ u_{x + h v}, w_{x + h v} ⟩}_{x + h v} - {⟨ u_{x}, w_{x} ⟩}_{x}}{h} .

(50)

Adding zero-sum terms, and rearranging them, gives:

\begin{matrix} \nabla_{v} {⟨ u_{x}, w_{x} ⟩}_{x} = & \lim_{h \to 0} \frac{{⟨ u_{x + h v}, w_{x + h v} ⟩}_{x + h v} - {⟨ u_{x + h v}, w_{x} ⟩}_{x + h v}}{h} + \\ & \lim_{h \to 0} \frac{{⟨ u_{x + h v}, w_{x + h v} ⟩}_{x + h v} - {⟨ u_{x}, w_{x + h v} ⟩}_{x + h v}}{h} + \\ & \lim_{h \to 0} \frac{{⟨ u_{x}, w_{x} ⟩}_{x + h v} - {⟨ u_{x}, w_{x} ⟩}_{x}}{h} + \\ & \lim_{h \to 0} \frac{{⟨ u_{x + h v}, w_{x} ⟩}_{x + h v} - {⟨ u_{x + h v}, w_{x + h v} ⟩}_{x + h v}}{h} + \\ & \lim_{h \to 0} \frac{{⟨ u_{x}, w_{x + h v} ⟩}_{x + h v} - {⟨ u_{x}, w_{x} ⟩}_{x + h v}}{h} . \end{matrix}

(51)

By the linearity and continuity of the inner product one then gets

\begin{matrix} \nabla_{v} {⟨ u_{x}, w_{x} ⟩}_{x} = & {⟨ u_{x}, {(\nabla_{v} w)}_{x} ⟩}_{x} + {⟨ {(\nabla_{v} u)}_{x}, w_{x} ⟩}_{x} - {⟨ u_{x}, {(\nabla_{v} w)}_{x} ⟩}_{x} + \\ & {⟨ u_{x}, {(\nabla_{v} w)}_{x} ⟩}_{x} + \lim_{h \to 0} \frac{{⟨ u_{x}, w_{x} ⟩}_{x + h v} - {⟨ u_{x}, w_{x} ⟩}_{x}}{h} . \end{matrix}

(52)

In conclusion, one gets

\begin{matrix} \nabla_{v} {⟨ u_{x}, w_{x} ⟩}_{x} = {⟨ u_{x}, {(\nabla_{v} w)}_{x} ⟩}_{x} + {⟨ {(\nabla_{v} u)}_{x}, w_{x} ⟩}_{x} + \lim_{h \to 0} \frac{{⟨ u_{x}, w_{x} ⟩}_{x + h v} - {⟨ u_{x}, w_{x} ⟩}_{x}}{h} . \end{matrix}

(53)

The first two addenda depend only on the directional derivatives of the vector fields, while the third term depends on the rate of change of the inner product along the line

x + h v

. Only when the last term is identically zero is the directional derivative said to be metric, in which case

\nabla_{v} {⟨ u_{x}, w_{x} ⟩}_{x} = {⟨ u_{x}, {(\nabla_{v} w)}_{x} ⟩}_{x} + {⟨ {(\nabla_{v} u)}_{x}, w_{x} ⟩}_{x}

.
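The decomposition (53) may be illustrated numerically. In the sketch below, the space $R^{3}$ is endowed with a position-dependent inner product ${⟨ a, b ⟩}_{x} = a^{⊤} G (x) b$, and $\nabla_{v}$ is taken to be the plain directional derivative; the metric kernel $G (x) = I_{3} + x x^{⊤}$ and the two vector fields are hypothetical choices, not taken from the text, and the derivatives are estimated by central differences.

```python
import numpy as np

def directional(fun, x, v, h=1e-5):
    """Central-difference estimate of the derivative of fun at x along v."""
    return (fun(x + h * v) - fun(x - h * v)) / (2.0 * h)

def metric(x):
    """Hypothetical position-dependent metric kernel G(x) = I + x x^T."""
    return np.eye(3) + np.outer(x, x)

def u(x):
    """Hypothetical vector field on R^3."""
    return np.array([x[1], x[0] * x[2], np.sin(x[0])])

def w(x):
    """Hypothetical vector field on R^3."""
    return np.array([x[2] ** 2, 1.0 + x[0], x[1] * x[0]])

def inner(x):
    """Scalar field x -> <u_x, w_x>_x."""
    return u(x) @ metric(x) @ w(x)

x = np.array([0.2, -0.5, 0.9])
v = np.array([1.0, 0.3, -0.7])

lhs = directional(inner, x, v)

# First two addenda of (53): contribution of the rate of change of the fields.
G = metric(x)
field_terms = u(x) @ G @ directional(w, x, v) + directional(u, x, v) @ G @ w(x)

# Third addendum of (53): rate of change of the inner product itself.
metric_term = u(x) @ directional(metric, x, v) @ w(x)

print(lhs, field_terms + metric_term, metric_term)
```

In this instance the third addendum is non-zero, hence the plain directional derivative is not metric with respect to the chosen inner product; replacing $G (x)$ by a constant matrix makes the third addendum vanish.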

3.4. Relationship of Gradient to Hessian by Covariant Derivation

A notable result from manifold calculus is that the Riemannian Hessian, covariant derivation and the Riemannian gradient are related to one another in a neat way, namely

H_{γ} f (\dot{γ}) = \nabla_{t}^{γ} ({grad}_{γ} f) .

(54)

The gradient builds a vector field over a curve based on a scalar function and the Hessian represents the rate of change of such vector field along the curve. In order to prove this result, it is convenient to re-derive the expression of the Riemannian gradient in a different way compared to the one derived in [13].

Let us recall that, given a scalar function

f : M \to R

and a smooth curve

γ

with image in

M

, it holds that

\frac{d}{d t} f (γ) = {⟨ {grad}_{γ} f, \dot{γ} ⟩}_{γ} = {⟨ {grad}_{γ} f, G_{γ} (\dot{γ}) ⟩}^{A},

(55)

where

A

denotes, as usual, the ambient space that the manifold

M

is embedded in and

⟨ \cdot, \cdot ⟩^{A}

denotes a Euclidean scalar product that the ambient space is endowed with. Differentiating the rightmost expression with respect to t gives

\frac{d^{2}}{d t^{2}} f (γ) = {⟨ \frac{d}{d t} {grad}_{γ} f, G_{γ} (\dot{γ}) ⟩}^{A} + {⟨ {grad}_{γ} f, G_{γ}^{•} (\dot{γ}, \dot{γ}) ⟩}^{A} + {⟨ {grad}_{γ} f, G_{γ} (\ddot{γ}) ⟩}^{A} .

(56)

Let us examine separately the above three addenda. The first addendum may be rewritten as

{⟨ \frac{d}{d t} {grad}_{γ} f, G_{γ} (\dot{γ}) ⟩}^{A} = {⟨ Π_{γ} \frac{d}{d t} {grad}_{γ} f, G_{γ} (\dot{γ}) ⟩}^{A} = {⟨ Π_{γ} \frac{d}{d t} {grad}_{γ} f, \dot{γ} ⟩}_{γ} .

(57)

The second addendum may be rewritten as

\begin{matrix} {⟨ {grad}_{γ} f, G_{γ}^{•} (\dot{γ}, \dot{γ}) ⟩}^{A} = {⟨ G_{γ}^{•} ({grad}_{γ} f, \dot{γ}), \dot{γ} ⟩}^{A} = {⟨ 2 {\bar{Γ}}_{γ} ({grad}_{γ} f, \dot{γ}), \dot{γ} ⟩}^{A} = \\ 2 {⟨ G_{γ}^{- 1} {\bar{Γ}}_{γ} ({grad}_{γ} f, \dot{γ}), G_{γ} (\dot{γ}) ⟩}^{A} = 2 {⟨ {\bar{\bar{Γ}}}_{γ} ({grad}_{γ} f, \dot{γ}), G_{γ} (\dot{γ}) ⟩}^{A} = \\ 2 {⟨ Π_{γ} {\bar{\bar{Γ}}}_{γ} ({grad}_{γ} f, \dot{γ}), G_{γ} (\dot{γ}) ⟩}^{A} = 2 {⟨ Π_{γ} {\bar{\bar{Γ}}}_{γ} ({grad}_{γ} f, \dot{γ}), \dot{γ} ⟩}_{γ} . \end{matrix}

(58)

Since the expression (56) is path-dependent, we choose the curve

γ

to be a geodesic line, hence the third addendum takes the equivalent expressions

\begin{matrix} {⟨ {grad}_{γ} f, G_{γ} (\ddot{γ}) ⟩}^{A} = {⟨ {grad}_{γ} f, - {\bar{Γ}}_{γ} (\dot{γ}, \dot{γ}) ⟩}^{A} = - {⟨ {\bar{Γ}}_{γ} ({grad}_{γ} f, \dot{γ}), \dot{γ} ⟩}^{A} = \\ - {⟨ G_{γ}^{- 1} {\bar{Γ}}_{γ} ({grad}_{γ} f, \dot{γ}), G_{γ} (\dot{γ}) ⟩}^{A} = - {⟨ {\bar{\bar{Γ}}}_{γ} ({grad}_{γ} f, \dot{γ}), G_{γ} (\dot{γ}) ⟩}^{A} = \\ - {⟨ Π_{γ} {\bar{\bar{Γ}}}_{γ} ({grad}_{γ} f, \dot{γ}), G_{γ} (\dot{γ}) ⟩}^{A} = - {⟨ Π_{γ} {\bar{\bar{Γ}}}_{γ} ({grad}_{γ} f, \dot{γ}), \dot{γ} ⟩}_{γ} . \end{matrix}

(59)

Gluing pieces together, one obtains

\frac{d^{2}}{d t^{2}} f (γ) = {⟨ Π_{γ} \frac{d}{d t} {grad}_{γ} f + Π_{γ} {\bar{\bar{Γ}}}_{γ} ({grad}_{γ} f, \dot{γ}), \dot{γ} ⟩}_{γ} .

(60)

By definition, the Hessian of a scalar function at a point

x \in M

, along a given tangent direction

v \in T_{x} M

, is such that

{\frac{d^{2}}{d t^{2}} f (γ)|}_{t = 0} = {⟨ H_{x} f (v), v ⟩}_{x},

(61)

where

γ (0) = x

and

\dot{γ} (0) = v

, hence we may conclude that

H_{x} f (v) = Π_{x} \{{\frac{d}{d t} {grad}_{γ} f|}_{t = 0} + {\bar{\bar{Γ}}}_{x} ({grad}_{x} f, v)\} .

(62)

Let us examine two examples of calculation of Hessian.

Example 5.

As a first example, let us consider the function

f (x) : = \frac{1}{2} D^{2} (x, r)

, where

D : M \times M \to R

denotes a Riemannian distance and

r \in M

denotes a fixed reference point (This function is sometimes referred to as Synge’s world function in relativistic physics [14]). As we have proven, in this case

{grad}_{x} f = - \log_{x} r

, where

\log_{x} : M \to T_{x} M

denotes a logarithmic map. In this example, therefore, it turns out that

\frac{1}{2} (H_{x} D^{2} (x, r)) (v) = - Π_{x} \{{\frac{d}{d t} \log_{γ} (r)|}_{t = 0} + {\bar{\bar{Γ}}}_{x} (\log_{x} r, v)\} .

(63)

The papers [15,16] further elaborate on the notion of Hessian of the Riemannian squared distance.

As a second example, let us consider the Hessian of Oja’s function (or Rayleigh quotient)

f (x) : = \frac{1}{2} x^{⊤} P_{0} x

in

S^{n - 1}

, with

P_{0} \in S^{+} (n)

constant. Since

{\bar{\bar{Γ}}}_{x}

is normal, we can conclude that

H_{x} f (v) = Π_{x} \{{\frac{d}{d t} {grad}_{γ} f|}_{t = 0}\}

, with

γ : [- ϵ, ϵ] \to M

denoting a geodesic arc such that

γ (0) = x

and

\dot{γ} (0) = v

. Since

{grad}_{x} f = (I_{n} - x x^{⊤}) P_{0} x,

(64)

we have

\frac{d}{d t} {grad}_{γ} f = \frac{d}{d t} (P_{0} γ - γ γ^{⊤} P_{0} γ) = P_{0} \dot{γ} - \dot{γ} γ^{⊤} P_{0} γ - γ {\dot{γ}}^{⊤} P_{0} γ - γ γ^{⊤} P_{0} \dot{γ} .

(65)

Setting

t = 0

gives

{\frac{d}{d t} {grad}_{γ} f|}_{t = 0} = P_{0} v - v x^{⊤} P_{0} x - x v^{⊤} P_{0} x - x x^{⊤} P_{0} v .

(66)

Notice that the third and the fourth addenda are normal to

T_{x} S^{n - 1}

, hence we conclude that

H_{x} f (v) = (I_{n} - x x^{⊤}) P_{0} v - 2 f (x) v .

(67)

In order to judge about the positive-definiteness of the Hessian, let us evaluate the inner product

\begin{matrix} {⟨ H_{x} f (v), v ⟩}_{x} = & v^{⊤} (I_{n} - x x^{⊤}) P_{0} v - 2 f (x) v^{⊤} v \\ = & v^{⊤} P_{0} v - (v^{⊤} v) (x^{⊤} P_{0} x) \\ = & {∥ v ∥}^{2} ({\hat{v}}^{⊤} P_{0} \hat{v} - x^{⊤} P_{0} x), \end{matrix}

(68)

with

\hat{v} : = v / ∥ v ∥

(recall that, in the positive-definiteness test,

v \neq 0

). The Hessian is positive-definite at a given point x if

{\hat{v}}^{⊤} P_{0} \hat{v} - x^{⊤} P_{0} x > 0

for every

v \in T_{x} S^{n - 1} \setminus \{0\}

.
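The expression of the Hessian of Oja's function may be cross-checked against the defining relationship (61), by comparing the quadratic form ${⟨ H_{x} f (v), v ⟩}_{x}$ with the second derivative of $f$ along a geodesic arc (a great circle). The following is a minimal sketch: the matrix $P_{0}$, the point $x$ and the unit tangent vector $v$ are randomly generated test data, and the second derivative is a central-difference estimate.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 4
M = rng.standard_normal((n, n))
P0 = M @ M.T + np.eye(n)               # symmetric, positive-definite test matrix

def f(x):
    """Oja's function f(x) = x^T P0 x / 2."""
    return 0.5 * x @ P0 @ x

def hessian(x, v):
    """Hessian (67): (I - x x^T) P0 v - 2 f(x) v."""
    return P0 @ v - x * (x @ P0 @ v) - 2.0 * f(x) * v

x = rng.standard_normal(n)
x /= np.linalg.norm(x)                 # point on the unit sphere
v = rng.standard_normal(n)
v -= x * (x @ v)                       # tangent direction at x
v /= np.linalg.norm(v)                 # unit speed, for convenience

def geodesic(t):
    """Great circle with gamma(0) = x and unit initial velocity v."""
    return x * np.cos(t) + v * np.sin(t)

h = 1e-4
second_derivative = (f(geodesic(h)) - 2.0 * f(geodesic(0.0)) + f(geodesic(-h))) / h**2
quadratic_form = hessian(x, v) @ v
closed_form = v @ P0 @ v - (v @ v) * (x @ P0 @ x)
print(second_derivative, quadratic_form, closed_form)
```

The three values agree up to discretization error; the sign of their common value tells whether the function is convex or concave along the chosen direction at the chosen point.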

Recalling the expression (34), where we may set

w_{γ} : = {grad}_{γ} f

, and comparing the result with the relationship (62), it is immediate to see that they coincide, hence showing the mentioned relationship between the Hessian and the covariant derivative of a gradient field.

3.5. Axiomatization of Covariant Derivative and Relationship to Parallelism

A notion of covariant derivative may be defined arbitrarily as long as it meets generalized versions of the properties (

P_{1}

), (

P_{2}

), (

P_{3}

) recalled in Section 3.1, which qualify covariant derivative as a generalized directional derivative. Denoting the covariant derivative of a vector field

u \in Γ (M)

in the direction

v \in T_{x} M

at a point

x \in M

as

\nabla_{v} u

, we may require it to satisfy the following properties:

\begin{matrix} (P_{1}^{'}) & \nabla_{α v + β w} u = α \nabla_{v} u + β \nabla_{w} u, \end{matrix}

(69)

\begin{matrix} (P_{2}^{'}) & \nabla_{v} (u + z) = \nabla_{v} u + \nabla_{v} z, \end{matrix}

(70)

\begin{matrix} (P_{3}^{'}) & \nabla_{v} (α u) = α \nabla_{v} u + u d_{x} α (v), \end{matrix}

(71)

with

u, z \in Γ (M)

,

v, w \in T_{x} M

and

α, β

being scalar fields and where

d α

denotes a tangent map (pushforward). The properties

P_{1}^{'}

and

P_{2}^{'}

establish the linearity of the covariant derivative with respect to its vectorial arguments, while the property

P_{3}^{'}

represents a sort of product rule.

We may verify that the connection defined in Section 3.2 satisfies the above axioms through the expression (35):

Proof of property $(P_{1}^{'})$ : The covariant derivative $\nabla_{α v + β w} u$ may be written explicitly as

${(\nabla_{α v + β w} u)}_{x} = Π_{x} \{(\partial_{x} u) (α (x) v_{x} + β (x) w_{x}) + {\bar{\bar{Γ}}}_{x} (u_{x}, α (x) v_{x} + β (x) w_{x})\} .$

(72)

Since both the Gateaux derivative and the Christoffel form are linear in their vectorial arguments, the property follows.
Proof of property $(P_{2}^{'})$ : The vector field $\nabla_{v} (u + z)$ may be expressed explicitly as

${(\nabla_{v} (u + z))}_{x} = Π_{x} \{(\partial_{x} (u + z)) (v_{x}) + {\bar{\bar{Γ}}}_{x} (u_{x} + z_{x}, v_{x})\} .$

(73)

By the linearity of the terms in parentheses with respect to their vectorial arguments, the proof follows.
Proof of property $(P_{3}^{'})$ : The covariant derivative $\nabla_{v} (α u)$ may be re-expressed as

$\begin{matrix} {(\nabla_{v} (α u))}_{x} = & Π_{x} \{(\partial_{x} (α u)) (v_{x}) + {\bar{\bar{Γ}}}_{x} (α (x) u_{x}, v_{x})\} \\ = & Π_{x} \{d_{x} α (v_{x}) u_{x} + α (x) (\partial_{x} u) (v_{x}) + α (x) {\bar{\bar{Γ}}}_{x} (u_{x}, v_{x})\} . \end{matrix}$

(74)

Noticing that $Π_{x} (u_{x}) \equiv u_{x}$ leads to the sought result.

Considering as primary the notion of covariant derivation, it is possible to recover the notion of parallelism from covariant derivation. Let us denote by

γ : [- ϵ, ϵ] \to M

a smooth curve on

M

and by

w \in Γ (M)

a smooth vector field defined (at least) on the image of

γ

. Such vector field is said to be parallel along γ if

\nabla_{\dot{γ}} w = 0 .

(75)

Conversely, setting up one such differential equation equipped with an initial condition

w (0) = w_{0} \in T_{γ (0)} M

, its solution generates a parallel transport field of the vector

w_{0}

along the curve γ from 0 to t denoted as

P_{γ}^{0 \to t} (w_{0})

.

A metric connection makes parallel transport a conformal isometry. To show this important property, let us consider again a curve

γ : [- ϵ, ϵ] \to M

such that

γ (0) = x_{0}

and two vectors

w_{0}, v_{0} \in T_{x_{0}} M

. Parallel transport defines two vector fields

w, v \in Γ (M)

along such curve, namely

w_{γ (t)} : = P_{γ}^{0 \to t} (w_{0}), v_{γ (t)} : = P_{γ}^{0 \to t} (v_{0}) .

(76)

Define

s (t) : = {⟨ w_{γ (t)}, v_{γ (t)} ⟩}_{γ (t)}

. Since the connection ∇ is metric, the function s stays constant for any

t \in [- ϵ, ϵ]

. In fact, by definition of metric connection, it holds that

\dot{s} = {⟨ \nabla_{\dot{γ}} w_{γ}, v_{γ} ⟩}_{γ} + {⟨ w_{γ}, \nabla_{\dot{γ}} v_{γ} ⟩}_{γ},

(77)

with no extra terms. Since the vector fields

v_{x}

and

w_{x}

are parallel along the curve

γ

, it holds that

\nabla_{\dot{γ}} w_{γ} = \nabla_{\dot{γ}} v_{γ} = 0

, which proves the assertion. A consequence of the above-proven property is that

{⟨ w_{γ (t)}, v_{γ (t)} ⟩}_{γ (t)} = {⟨ w_{γ (0)}, v_{γ (0)} ⟩}_{γ (0)} = {⟨ w_{0}, v_{0} ⟩}_{x_{0}}

(78)

for any

t \in [- ϵ, ϵ]

. This equality has two important consequences:

Parallel transport preserves the length of transported vectors: In fact, if we take $w_{0} = v_{0}$ , we get ${⟨ P_{γ}^{0 \to t} (v_{0}), P_{γ}^{0 \to t} (v_{0}) ⟩}_{γ (t)} = ∥ P_{γ}^{0 \to t} (v_{0}) ∥_{γ (t)}^{2} = {∥ v_{0} ∥}_{x_{0}}^{2}$ . This means that parallel transport realizes an isometry.
Parallel transport preserves the angle between transported vectors: In fact, if we define the cosine of the angle between two tangent vectors as their inner product normalized by their length, we get

$\frac{{⟨ P_{γ}^{0 \to t} (w_{0}), P_{γ}^{0 \to t} (v_{0}) ⟩}_{γ (t)}}{∥ P_{γ}^{0 \to t} (w_{0}) ∥_{γ (t)} \cdot {∥ P_{γ}^{0 \to t} (v_{0}) ∥}_{γ (t)}} = \frac{{⟨ w_{0}, v_{0} ⟩}_{x_{0}}}{∥ w_{0} ∥_{x_{0}} \cdot {∥ v_{0} ∥}_{x_{0}}},$

(79)

which means that parallel transport realizes a conformal map.
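Both consequences may be observed numerically on the unit hypersphere, using the parallel transport operator recalled in Example 3 (which transports a tangent vector from a point x to a point y, with $x \neq - y$). The sketch below relies on hypothetical, randomly generated points and tangent vectors.

```python
import numpy as np

def transport(x, y, u):
    """Parallel transport P^{x->y}(u) = (I - (x + y) y^T / (1 + x^T y)) u on the sphere."""
    return u - (x + y) * (y @ u) / (1.0 + x @ y)

def cosine(a, b):
    """Cosine of the angle between two vectors (Euclidean metric)."""
    return (a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))

rng = np.random.default_rng(3)
n = 5

x = rng.standard_normal(n)
x /= np.linalg.norm(x)
d = rng.standard_normal(n)
d /= np.linalg.norm(d)
y = x + 0.8 * d                         # keeps x^T y > 0, away from the antipode
y /= np.linalg.norm(y)

def tangent_at_x(a):
    """Project an ambient vector onto the tangent space at x."""
    return a - x * (x @ a)

w0 = tangent_at_x(rng.standard_normal(n))
v0 = tangent_at_x(rng.standard_normal(n))
w1, v1 = transport(x, y, w0), transport(x, y, v0)

print(w0 @ v0, w1 @ v1)                                 # inner products
print(np.linalg.norm(v0), np.linalg.norm(v1))           # lengths
print(cosine(w0, v0), cosine(w1, v1))                   # angles
```

The transported vectors are tangent at y and retain the inner product, the lengths and the angle of the original vectors, in accordance with (78) and (79).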

The following example shows explicitly a property of parallel fields along geodesics.

Example 6.

Let us prove directly that a vector field

w \in Γ (M)

parallel along a geodesic curve γ keeps the function

Q (t) : = {⟨ w_{γ (t)}, \dot{γ} (t) ⟩}_{γ (t)}

constant with respect to the parameter t. Recall that

Q = {⟨ w_{γ}, G_{γ} (\dot{γ}) ⟩}^{A}

, hence

\begin{matrix} \dot{Q} = & {⟨ {\dot{w}}_{γ}, G_{γ} (\dot{γ}) ⟩}^{A} + {⟨ w_{γ}, G_{γ}^{•} (\dot{γ}, \dot{γ}) ⟩}^{A} + {⟨ w_{γ}, G_{γ} (\ddot{γ}) ⟩}^{A} \\ = & {⟨ {\dot{w}}_{γ}, G_{γ} (\dot{γ}) ⟩}^{A} + {⟨ \dot{γ}, G_{γ}^{•} (w_{γ}, \dot{γ}) ⟩}^{A} + {⟨ w_{γ}, G_{γ} (\ddot{γ}) ⟩}^{A} . \end{matrix}

(80)

On a geodesic line it holds that

G_{γ} (\ddot{γ}) = - \frac{1}{2} G_{γ}^{•} (\dot{γ}, \dot{γ}) + \text{a normal component}

, hence

\begin{matrix} \dot{Q} = & {⟨ {\dot{w}}_{γ}, G_{γ} (\dot{γ}) ⟩}^{A} + {⟨ \dot{γ}, G_{γ}^{•} (w_{γ}, \dot{γ}) ⟩}^{A} - \frac{1}{2} {⟨ w_{γ}, G_{γ}^{•} (\dot{γ}, \dot{γ}) ⟩}^{A} \\ = & {⟨ {\dot{w}}_{γ}, G_{γ} (\dot{γ}) ⟩}^{A} + {⟨ \dot{γ}, G_{γ}^{•} (w_{γ}, \dot{γ}) ⟩}^{A} - \frac{1}{2} {⟨ \dot{γ}, G_{γ}^{•} (w_{γ}, \dot{γ}) ⟩}^{A} \\ = & {⟨ {\dot{w}}_{γ}, G_{γ} (\dot{γ}) ⟩}^{A} + \frac{1}{2} {⟨ \dot{γ}, G_{γ}^{•} (w_{γ}, \dot{γ}) ⟩}^{A} \\ = & {⟨ {\dot{w}}_{γ}, G_{γ} (\dot{γ}) ⟩}^{A} + \frac{1}{2} {⟨ G_{γ}^{- 1} G_{γ}^{•} (w_{γ}, \dot{γ}), G_{γ} (\dot{γ}) ⟩}^{A} \\ = & {⟨ {\dot{w}}_{γ} + \frac{1}{2} G_{γ}^{- 1} G_{γ}^{•} (w_{γ}, \dot{γ}), G_{γ} (\dot{γ}) ⟩}^{A} \\ = & {⟨ Π_{γ} {{\dot{w}}_{γ} + \frac{1}{2} G_{γ}^{- 1} G_{γ}^{•} (w_{γ}, \dot{γ})}, G_{γ} (\dot{γ}) ⟩}^{A} . \end{matrix}

(81)

Since

{\bar{\bar{Γ}}}_{x} (u, v) = \frac{1}{2} G_{x}^{- 1} G_{x}^{•} (u, v) + \text{a normal component}

, we may write that

\dot{Q} = {⟨ Π_{γ} \{{\dot{w}}_{γ} + {\bar{\bar{Γ}}}_{γ} (w_{γ}, \dot{γ})\}, \dot{γ} ⟩}_{γ} = {⟨ \nabla_{t}^{γ} w, \dot{γ} ⟩}_{γ} .

(82)

Since, by assumption, the vector field w is parallel along γ, its covariant derivative

\nabla_{t}^{γ} w = 0

, therefore

\dot{Q} = 0

, hence

Q (t)

stays constant at its initial value

Q (0)

.

3.6. Coordinate-Prone Covariant Derivation*

The fundamental requirement at the very core of a connection is that a covariant derivative of a basis vector field in a tangent direction given by another basis vector is a tangent vector itself, namely

\nabla_{\partial_{i}^{x}} \partial_{j}^{x} \in T_{x} M

. Such a basic fact may be expressed by the fundamental relationship:

\nabla_{\partial_{i}^{x}} \partial_{j}^{x} = C_{i j}^{k} (x) \partial_{k}^{x},

(83)

where

C_{i j}^{k} : M \to R

are coefficients that describe the structure of a connection. The elementary covariant derivative

\nabla_{\partial_{i}^{x}} \partial_{j}^{x}

quantifies the rate of change of the basis vector field

\partial_{j}^{x}

along the direction specified by the further basis vector field

\partial_{i}^{x}

.

By applying the properties (69)–(71), it is readily obtained that:

\nabla_{(v^{i} \partial_{i}^{x})} (u^{j} \partial_{j}^{x}) = v^{i} (C_{i j}^{k} u^{j} + \frac{\partial u^{k}}{\partial x^{i}}) \partial_{k}^{x},

(84)

where the functions

v^{i} = v^{i} (x)

and the functions

u^{i} = u^{i} (x)

are the components of vector fields in

Γ (M)

expressed in the canonical basis

{\partial_{i}^{x}}

. If

C_{i j}^{k} \equiv 0

, the basis vectors stay constant across the tangent bundle and the covariant derivative simply coincides with the directional derivative, namely

C_{i j}^{k} \equiv 0 \Rightarrow \nabla_{(v^{i} \partial_{i}^{x})} (u^{j} \partial_{j}^{x}) = v^{i} (\frac{\partial u^{k}}{\partial x^{i}}) \partial_{k}^{x} = d_{x} u^{k} (v) \partial_{k}^{x} .

(85)

In general, one needs

p^{3}

coefficient-functions to define a connection on a p-dimensional manifold, which may be specified arbitrarily and give rise to an independent definition of connection.

A special choice of connection coefficients leads to the Levi-Civita connection, where

C_{i j}^{k} : = Γ_{i j}^{k}

, namely the Christoffel symbols of the second kind associated to the metric tensor of components

g_{i j}

which read

Γ_{i j}^{k} = \frac{1}{2} g^{k h} (\frac{\partial g_{h j}}{\partial x^{i}} + \frac{\partial g_{i h}}{\partial x^{j}} - \frac{\partial g_{i j}}{\partial x^{h}}) .

(86)

Let us survey the rationale behind Levi-Civita connection through an example.

Example 7.

Let us revisit the notion of geodesic. On a manifold

M

, replacing

u^{k} = v^{k} = {\dot{γ}}^{k}

in the Equation (84), leads to the expression

\nabla_{\dot{γ}} \dot{γ} = {\dot{γ}}^{i} {\dot{γ}}^{j} C_{i j}^{k} \partial_{k}^{x} + {\ddot{γ}}^{k} \partial_{k}^{x}

. Setting the covariant derivative

\nabla_{\dot{γ}} \dot{γ}

to zero leads to the geodesic equations for the components

γ^{k}

:

{\ddot{γ}}^{k} + C_{i j}^{k} {\dot{γ}}^{i} {\dot{γ}}^{j} = 0 .

(87)

Choosing to endow the manifold

M

with a Levi-Civita connection, namely setting

C_{i j}^{k} \equiv Γ_{i j}^{k}

makes the geodesic Equation (87) coincide with the equation given in [13]. A different choice of connection would lead to a different geodesic equation and, ultimately, to a different calculus setting for the manifold

M

.
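To make the formulas (86) and (87) concrete, the sketch below evaluates the Christoffel symbols of the second kind from a metric tensor and then the coordinate acceleration prescribed by the geodesic equation. The unit 2-sphere in the coordinates (θ, φ), with metric components $g = diag (1, \sin^{2} θ)$, is a hypothetical test case chosen because its symbols are well known; the partial derivatives of the metric are approximated by central differences.

```python
import numpy as np

def sphere_metric(q):
    """Metric tensor of the unit 2-sphere in coordinates q = (theta, phi)."""
    return np.diag([1.0, np.sin(q[0]) ** 2])

def christoffel(metric, q, step=1e-5):
    """Gamma[k, i, j] from (86); metric derivatives by central differences."""
    p = len(q)
    dg = np.zeros((p, p, p))            # dg[l, a, b] = d g_ab / d x^l
    for l in range(p):
        e = np.zeros(p)
        e[l] = step
        dg[l] = (metric(q + e) - metric(q - e)) / (2.0 * step)
    g_inv = np.linalg.inv(metric(q))
    # bracket[h, i, j] = d_i g_hj + d_j g_ih - d_h g_ij
    bracket = np.einsum('ihj->hij', dg) + np.einsum('jih->hij', dg) - dg
    return 0.5 * np.einsum('kh,hij->kij', g_inv, bracket)

def geodesic_acceleration(metric, q, q_dot):
    """Second derivative of the coordinates prescribed by the geodesic equation (87)."""
    Gamma = christoffel(metric, q)
    return -np.einsum('kij,i,j->k', Gamma, q_dot, q_dot)

theta = 1.0
Gamma = christoffel(sphere_metric, np.array([theta, 0.3]))
print(Gamma[0, 1, 1], -np.sin(theta) * np.cos(theta))   # Gamma^theta_{phi phi}
print(Gamma[1, 0, 1], 1.0 / np.tan(theta))              # Gamma^phi_{theta phi}

# On the equator, motion at constant speed along phi requires no acceleration.
acc_equator = geodesic_acceleration(sphere_metric, np.array([np.pi / 2, 0.0]),
                                    np.array([0.0, 1.0]))
print(acc_equator)
```

Replacing the array returned by `christoffel` with any other set of $p^{3}$ coefficient functions would correspond to a different connection and hence, in general, to different geodesics.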

3.7. Commutator of Vector Fields and Torsion Field

Let us take again two vector fields

v, w \in Γ (M)

and a connection

\nabla : Γ (M) \times Γ (M) \to Γ (M)

. In general, no relation need hold between the covariant derivatives

\nabla_{v} w

and

\nabla_{w} v

. In the case of a Levi-Civita connection, however, there exists a precise relationship between such derivatives, which is usually expressed by saying that a Levi-Civita connection is torsion-free.

In order to elucidate such concept, let us define two notions, namely, commutator of two vector fields and torsion. The commutator (or Lie brackets) of two vector fields

v, w \in Γ (M)

is a further vector field denoted as

{[v, w]}_{x} := Π_{x} \{(\partial_{x} w) (v_{x}) - (\partial_{x} v) (w_{x})\},

(88)

namely, as a projected version of a comparison between the rate of change of the vector field v along the direction

w_{x}

and of the rate of change of the vector field w along the direction

v_{x}

. (As usual, the Gateaux derivatives

\partial_{x}

are taken in

A

.)

The torsion field

θ : Γ (M) \times Γ (M) \to Γ (M)

is defined as

θ_{x} (v, w) : = {(\nabla_{v} w)}_{x} - {(\nabla_{w} v)}_{x} - {[v, w]}_{x} .

(89)

Torsion measures the degree of non-commutativity of directional derivations as the difference between directional derivatives of two vector fields with respect to each other, net of their own non-commutativity. In other words, torsion measures how closely the commutator of vector fields can be recovered from the connection. Notice that, in Euclidean spaces, it holds that

{(\nabla_{v} w)}_{x} \equiv (\partial_{x} w) (v)

and

{(\nabla_{w} v)}_{x} \equiv (\partial_{x} v) (w)

, hence torsion is certainly null. As we shall see later, null torsion induces a relationship between two otherwise unrelated concepts, namely covariant derivation and Lie derivation.

Now, let us prove that the torsion field of the connection defined in Section 3.2 is identically null. Let us recall that

{(\nabla_{v} w)}_{x} = Π_{x} \{\partial_{x} w (v_{x}) + {\bar{\bar{Γ}}}_{x} (w_{x}, v_{x})\},

(90)

hence

\begin{matrix} {(\nabla_{v} w)}_{x} - {(\nabla_{w} v)}_{x} = & Π_{x} \{\partial_{x} w (v_{x})\} + Π_{x} \{{\bar{\bar{Γ}}}_{x} (w_{x}, v_{x})\} \\ & - Π_{x} \{\partial_{x} v (w_{x})\} - Π_{x} \{{\bar{\bar{Γ}}}_{x} (v_{x}, w_{x})\} \\ = & Π_{x} \{\partial_{x} w (v_{x}) - \partial_{x} v (w_{x})\} \\ = & {[v, w]}_{x}, \end{matrix}

(91)

since the Christoffel form is symmetric in its vectorial arguments. The assertion that a Levi-Civita connection is torsion-free hence follows. We underline how the result (91) is notable in itself, because it implies that, while the covariant derivative depends on the Christoffel form, the difference between two covariant derivatives with swapped arguments does not. Another way to see this property is to write

\nabla_{v} w = \nabla_{w} v + [v, w]

.

3.8. A Remarkable Relationship Linking the Operator $Π^{•}$ to the Christoffel Form $\bar{\bar{Γ}}$

Let us recall that the bilinear operator

Π^{•} : A \times T M \to T M

arises from the directional derivation of the orthogonal projection operator

Π

which, in turn, does not depend at all on the metric that a manifold

M

is endowed with, as it depends only on the structure of the tangent spaces to the manifold and on the Euclidean ambient space

A

’s metric. In fact, by definition of orthogonal projector, it should hold that

{⟨ Π_{x} (a), v ⟩}^{A} = {⟨ a, v ⟩}^{A}

, for every

a \in A

,

x \in M

and

v \in T_{x} M

.

Let us recall how orthogonal projectors are computed for a few manifolds of interest.

Example 8.

Let us recall how to determine the expression of an orthogonal projector

Π_{x} : R^{n} \to T_{x} S^{n - 1}

. By definition it must hold that

Π_{x} (a) - a \in N_{x} S^{n - 1}

, hence

Π_{x} (a) = a + λ x

. Since

Π_{x} (a) \in T_{x} S^{n - 1}

, it must also hold that

x^{⊤} (a + λ x) = 0

, therefore

λ = - x^{⊤} a

and hence

Π_{x} (a) = a - x x^{⊤} a

. The directional derivative of such projector along a tangent direction

v \in T_{x} S^{n - 1}

reads

Π_{x}^{•} (a, v) = - v (x^{⊤} a) - x (v^{⊤} a) .

(92)

Let us further recall how to derive the expression of an orthogonal projection operator

Π_{R} : R^{n \times n} \to T_{R} SO (n)

. By definition, it must hold that

Π_{R} (A) - A \in N_{R} SO (n)

, hence

Π_{R} (A) = A + R S

, with

S = S^{⊤}

. Since

Π_{R} (A) \in T_{R} SO (n)

, it must also hold that

R^{⊤} (A + R S) + {(A + R S)}^{⊤} R = 0

. Straightforward calculations show that

S = - \frac{1}{2} (R^{⊤} A + A^{⊤} R)

and hence

Π_{R} (A) = \frac{1}{2} (A - R A^{⊤} R)

, where we have used the cancellation implied by

R R^{⊤} = I_{n}

. The directional derivative of such projector along a tangent direction

V \in T_{R} SO (n)

reads

Π_{R}^{•} (A, V) = - \frac{1}{2} (V A^{⊤} R + R A^{⊤} V) .

(93)

In addition, let us survey the case of the manifold of symmetric, positive definite matrices by determining

Π_{P} : R^{n \times n} \to T_{P} S^{+} (n)

. By definition of orthogonal projection, it must hold that

Π_{P} (A) - A \in N_{P} S^{+} (n)

, therefore

Π_{P} (A) = A + H

, with

H + H^{⊤} = 0

. Since

Π_{P} (A) \in T_{P} S^{+} (n)

, it is necessary that

A + H = A^{⊤} - H

, therefore

H = \frac{1}{2} (A^{⊤} - A)

, and hence

Π_{P} (A) = \frac{1}{2} (A^{⊤} + A)

. The directional derivative of this projector along the tangent direction

V \in T_{P} S^{+} (n)

is identically zero, namely

Π_{P}^{•} (A, V) \equiv 0,

(94)

since

Π_{P} (A)

does not actually depend on the foot-point P.

To conclude this example, let us consider the case of the Stiefel manifold

St (n, p)

. In this case,

Π_{X} : R^{n \times p} \to T_{X} St (n, p)

. From the condition

Π_{X} (A) - A \in N_{X} St (n, p)

it follows that

Π_{X} (A) = A + X S

, with

S = S^{⊤}

. From the tangency condition

Π_{X} (A) \in T_{X} St (n, p)

it follows that

X^{⊤} (A + X S) + {(A + X S)}^{⊤} X = 0

, hence that

S = - \frac{1}{2} (X^{⊤} A + A^{⊤} X)

. Ultimately, we get

Π_{X} (A) = A - \frac{1}{2} X (X^{⊤} A + A^{⊤} X)

. The directional derivative of such projector along a tangent direction

V \in T_{X} St (n, p)

reads

Π_{X}^{•} (A, V) = - \frac{1}{2} X (V^{⊤} A + A^{⊤} V) - \frac{1}{2} V (X^{⊤} A + A^{⊤} X),

(95)

of which the expression (93) is the special case corresponding to

p = n

.
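
The closed forms (92), (93) and (95) lend themselves to a quick numerical cross-check. The following is a minimal NumPy sketch, not part of the original derivation: the helper names (`proj_sphere`, `dproj_stiefel`, `central_difference`, and so on), the manifold sizes and the random test data are illustrative assumptions. Each directional derivative is compared with a central finite difference of the corresponding projector formula, and the formula (95) is compared with the formula (93) in the case p = n.

```python
import numpy as np

rng = np.random.default_rng(0)

def proj_sphere(x, a):
    """Orthogonal projector onto T_x S^{n-1}: a - x x^T a."""
    return a - x * (x @ a)

def dproj_sphere(x, a, v):
    """Closed form (92) of the directional derivative of the projector."""
    return -v * (x @ a) - x * (v @ a)

def proj_stiefel(X, A):
    """Orthogonal projector onto T_X St(n, p): A - X (X^T A + A^T X) / 2."""
    return A - 0.5 * X @ (X.T @ A + A.T @ X)

def dproj_stiefel(X, A, V):
    """Closed form (95) of the directional derivative of the projector."""
    return -0.5 * X @ (V.T @ A + A.T @ V) - 0.5 * V @ (X.T @ A + A.T @ X)

def central_difference(proj, x, a, v, h=1e-5):
    """Finite-difference derivative of x -> proj(x, a) along the direction v."""
    return (proj(x + h * v, a) - proj(x - h * v, a)) / (2.0 * h)

# Hyper-sphere S^3 embedded in R^4.
x = rng.standard_normal(4)
x /= np.linalg.norm(x)
a = rng.standard_normal(4)
v = proj_sphere(x, rng.standard_normal(4))          # tangent direction at x
err_sphere = np.max(np.abs(central_difference(proj_sphere, x, a, v)
                           - dproj_sphere(x, a, v)))

# Stiefel manifold St(5, 3).
X, _ = np.linalg.qr(rng.standard_normal((5, 3)))    # orthonormal columns
A = rng.standard_normal((5, 3))
V = proj_stiefel(X, rng.standard_normal((5, 3)))    # tangent direction at X
err_stiefel = np.max(np.abs(central_difference(proj_stiefel, X, A, V)
                            - dproj_stiefel(X, A, V)))

# Case p = n: formula (95) evaluated at an orthogonal matrix versus (93).
R, _ = np.linalg.qr(rng.standard_normal((4, 4)))
B = rng.standard_normal((4, 4))
W = proj_stiefel(R, rng.standard_normal((4, 4)))    # satisfies R^T W + W^T R = 0
formula_93 = -0.5 * (W @ B.T @ R + R @ B.T @ W)
err_special = np.max(np.abs(dproj_stiefel(R, B, W) - formula_93))

print(err_sphere, err_stiefel, err_special)
```

Since both projectors are quadratic in the foot-point, the central difference is exact up to rounding, so the three printed discrepancies are expected to be negligibly small.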

We further recall that the Christoffel form of the second kind

\bar{\bar{Γ}} : {(T M)}^{2} \to A

does depend on the metric, through the metric kernel

G : T M \to T M

, and on the structure of the tangent bundle

T M

. Let us recall the expressions of some Christoffel forms of interest.

Example 9.

For the hyper-sphere

S^{n - 1}

endowed with the canonical metric, it holds

{\bar{\bar{Γ}}}_{x} (u, v) = x (u^{⊤} v) .

(96)

(We note, with apologies, that in the first part of this series of tutorials [13], the Christoffel form associated to the unit hyper-sphere was reported with an incorrect sign).

For the special orthogonal group, it holds

{\bar{\bar{Γ}}}_{R} (U, V) = \frac{1}{2} R (U^{⊤} V + V^{⊤} U) .

(97)

For the manifold of symmetric, positive definite matrices, the Christoffel form takes the expression

{\bar{\bar{Γ}}}_{P} (U, V) = - \frac{1}{2} U P^{- 1} V - \frac{1}{2} V P^{- 1} U .

(98)

Instead, the expression

{\bar{\bar{Γ}}}_{X} (U, V) = \frac{1}{2} X (U^{⊤} V + V^{⊤} U)

(99)

pertains to the Stiefel manifold endowed with the Euclidean metric.

Now we lay out a quite simple observation, that leads to a number of important consequences. Let us consider a smooth vector field

w \in Γ (M)

, namely, a function that assigns a tangent vector

w_{x} \in T_{x} M

to every point

x \in M

, and a smooth curve

ρ : [- ϵ, ϵ] \to M

. In the previous sections we familiarized ourselves with the naïve directional derivative

{\dot{w}}_{ρ}

, that actually stands for

\frac{d}{d t} w_{ρ (t)}

. Now the fundamental observation that we are underlining is that, although

{\dot{w}}_{ρ}

is an array in the ambient space

A

, this does not mean that its entries are unconstrained. In other terms, the naïve directional derivative of a tangent vector field is not a tangent vector field, yet it must obey some constraints.

Let us express such constraints in an analytic way. For the sake of conciseness, we are going to make use of the following notation: for any

a \in A

, we shall denote by

a_{x}^{⊥}

the quantity

a - Π_{x} (a)

.

Let us observe that, by definition of tangent vector field, on a smooth curve

ρ : [- ϵ, ϵ] \to M

it must hold that

Π_{ρ} (w_{ρ}) = w_{ρ}

. Differentiating with respect to the parameter t gives:

Π_{ρ}^{•} (w_{ρ}, \dot{ρ}) + Π_{ρ} ({\dot{w}}_{ρ}) = {\dot{w}}_{ρ} .

(100)

Setting

t = 0

,

x : = ρ (0)

and

v : = \dot{ρ} (0)

, gives

{[\partial_{x} w (v)]}^{⊥} = Π_{x}^{•} (w_{x}, v) .

(101)

The above result means that the normal component of the naïve velocity of a vector field is completely determined by the directional derivative

Π^{•}

(hence only its tangential component may be prescribed freely). (This finding further clarifies the meaning of the bilinear operator

Π^{•}

). It is worth underlining that the relation (101) only involves the restriction of

Π^{•}

to

{(T M)}^{2}

and is fully compliant with the previous finding that

Π^{•} : {(T M)}^{2} \to N M

.

An important consequence of the above finding is as follows. We know that a tangent vector field

w \in Γ (M)

is parallel along a curve

ρ : [- ϵ, ϵ] \to M

if it satisfies the differential equation (in

A

)

{\dot{w}}_{ρ} + {\bar{\bar{Γ}}}_{ρ} (w_{ρ}, \dot{ρ}) = 0 .

(102)

Splitting the members of the above equation into their tangential and normal components, and retaining the latter, we get, for a point

x : = ρ (0)

:

{[\partial_{x} w (v)]}^{⊥} = - {\bar{\bar{Γ}}}_{x}^{⊥} (w_{x}, v) .

(103)

where

v : = \dot{ρ} (0)

. Comparing the relationship (101) with the relation (103), we immediately get that

Π_{x}^{•} (u, v) + {\bar{\bar{Γ}}}_{x}^{⊥} (u, v) = 0, for every u, v \in T_{x} M,

(104)

namely, the normal component of the Christoffel form (of the second kind), with its sign reversed, equals the restriction to the tangent bundle of the directional derivative of the projection operator. Two seemingly unrelated operators are quite tightly bound to one another!

Let us verify the above important finding on a number of cases.

Example 10.

For the hyper-sphere

S^{n - 1}

endowed with the canonical metric, the restriction of the directional derivative

Π^{•}

to the tangent bundle reads

Π_{x}^{•} (u, v) = - v (x^{⊤} u) - x (v^{⊤} u) = - x (u^{⊤} v),

(105)

because

x^{⊤} u = 0

. Notice that, in this example,

{\bar{\bar{Γ}}}^{⊥} \equiv \bar{\bar{Γ}}

, hence it coincides with the Christoffel form (96) up to a sign switch.

For the special orthogonal group endowed with its canonical metric, the restriction of the directional derivative

Π^{•}

to the tangent bundle reads

\begin{matrix} Π_{R}^{•} (U, V) = & - \frac{1}{2} V U^{⊤} R - \frac{1}{2} R U^{⊤} V \\ = & + \frac{1}{2} V R^{⊤} U - \frac{1}{2} R U^{⊤} V \\ = & - \frac{1}{2} R V^{⊤} U - \frac{1}{2} R U^{⊤} V . \end{matrix}

(106)

To conduct the above calculations, we have used twice the property that

V^{⊤} R = - R^{⊤} V

for every

V \in T_{R} SO (n)

. Notice that, even in this case,

{\bar{\bar{Γ}}}^{⊥} \equiv \bar{\bar{Γ}}

, hence

Π_{R}^{•} (U, V)

coincides, up to a sign, with the Christoffel form (97).

In the case of the manifold of symmetric, positive definite matrices endowed with its canonical metric, the directional derivative of the orthogonal projector, although restricted to the tangent bundle, is still identically null, namely

Π_{P}^{•} (U, V) \equiv 0

. On the other hand, from the expression (98), it is immediately verified that

{\bar{\bar{Γ}}}^{⊥} \equiv 0

, hence the two expressions coincide.

For the Stiefel manifold, the directional derivative of the orthogonal projector restricted to the tangent bundle reads

\begin{matrix} Π_{X}^{•} (U, V) = & - \frac{1}{2} X (V^{⊤} U + U^{⊤} V) - \frac{1}{2} V (X^{⊤} U + U^{⊤} X) \\ = & - \frac{1}{2} X (V^{⊤} U + U^{⊤} V), \end{matrix}

(107)

since the second term on the right-hand side cancels. Once again, this quantity, which is purely normal, coincides (up to sign) with the Christoffel form of the second kind (99) which, in turn, appears to be purely normal.

Notice that in all these examples, the expression of

Π^{•}

turns out to be symmetric in its vectorial arguments.
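
The property (104) may also be checked numerically. The sketch below is only an illustration under stated assumptions: it considers the Stiefel manifold St(6, 3) with the Euclidean metric, it draws a random point and two random tangent vectors, and the function names `proj`, `dproj` and `christoffel` are hypothetical labels for the expressions (95) and (99) and for the orthogonal projector of Example 8.

```python
import numpy as np

rng = np.random.default_rng(1)

def proj(X, A):
    """Orthogonal projector onto T_X St(n, p)."""
    return A - 0.5 * X @ (X.T @ A + A.T @ X)

def dproj(X, A, V):
    """Directional derivative (95) of the projector along V."""
    return -0.5 * X @ (V.T @ A + A.T @ V) - 0.5 * V @ (X.T @ A + A.T @ X)

def christoffel(X, U, V):
    """Christoffel form (99) for the Euclidean metric."""
    return 0.5 * X @ (U.T @ V + V.T @ U)

X, _ = np.linalg.qr(rng.standard_normal((6, 3)))   # a point of St(6, 3)
U = proj(X, rng.standard_normal((6, 3)))           # two tangent vectors at X
V = proj(X, rng.standard_normal((6, 3)))

gamma = christoffel(X, U, V)
gamma_normal = gamma - proj(X, gamma)              # normal component of (99)

# Left-hand side of (104), symmetry of the restricted operator, and the
# tangential part of the restricted operator (expected to vanish).
residual_104 = np.max(np.abs(dproj(X, U, V) + gamma_normal))
asymmetry = np.max(np.abs(dproj(X, U, V) - dproj(X, V, U)))
tangent_part = np.max(np.abs(proj(X, dproj(X, U, V))))

print(residual_104, asymmetry, tangent_part)
```

All three printed quantities are expected to vanish up to rounding, in agreement with the relation (104), with the symmetry of the restricted operator and with the fact that the restricted operator takes values in the normal bundle.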

The property (104) prescribes specifically the normal component of the Christoffel form of the second kind to be independent of the metric. Such prescription implies several important consequences, that we are going to explore in the following.

A consequence to underline is that

Π_{x}^{•} (u, v) = Π_{x}^{•} (v, u)

for every pair

u, v \in T_{x} M

, namely, the restriction of the directional derivative of the orthogonal projector to the tangent bundle is symmetrical in its arguments.

A further consequence is that, although for a non-parallel vector field

w \in Γ (M)

along a curve

ρ

it holds that

{\dot{w}}_{ρ} + {\bar{\bar{Γ}}}_{ρ} (w_{ρ}, \dot{ρ}) \neq 0

, from the relationships (101) and (104) it follows that

{\dot{w}}_{ρ}^{⊥} + {\bar{\bar{Γ}}}_{ρ}^{⊥} (w_{ρ}, \dot{ρ}) \equiv 0 for every vector field w \in Γ (M) along ρ .

(108)

This latter relationship also implies that the normal component of the naïve velocity of variation of a (non-parallel) vector field may be written as a function of the vector field itself.

As a further consequence, let us examine the expression of the commutator of two vector fields as defined in Equation (88). Given two vector fields

u, v \in Γ (M)

, their commutator at a point x is given by

{[u, v]}_{x} = Π_{x} {\partial_{x} v (u_{x})} - Π_{x} {\partial_{x} u (v_{x})} .

(109)

Since for the two vector fields it must hold that

\{\begin{matrix} Π_{x} (\partial_{x} v (u_{x})) = \partial_{x} v (u_{x}) - Π_{x}^{•} (v_{x}, u_{x}), \\ Π_{x} (\partial_{x} u (v_{x})) = \partial_{x} u (v_{x}) - Π_{x}^{•} (u_{x}, v_{x}), \end{matrix}

(110)

by the symmetry of the operator

Π^{•}

restricted to the tangent bundle it follows that

{[u, v]}_{x} = \partial_{x} v (u_{x}) - \partial_{x} u (v_{x}) .

(111)
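
As a hedged numerical illustration of the relation (111), the following sketch takes the unit hyper-sphere in R^4 and two tangent vector fields chosen only for the sake of example, namely u_x = a - x (x^T a) and v_x = Ω x with Ω skew-symmetric (the names `jac_u`, `jac_v` and the random data are assumptions of this sketch). It shows that the difference of the two naïve directional derivatives is already tangent, so that projecting it, as in (109), changes nothing.

```python
import numpy as np

rng = np.random.default_rng(2)

n = 4
x = rng.standard_normal(n)
x /= np.linalg.norm(x)                      # a point of S^{n-1}
a = rng.standard_normal(n)                  # fixed ambient vector
M = rng.standard_normal((n, n))
Omega = M - M.T                             # skew-symmetric matrix

def proj(x, b):
    """Orthogonal projector onto T_x S^{n-1}."""
    return b - x * (x @ b)

# Two tangent vector fields and the Jacobians of their ambient expressions:
#   u_x = a - x (x^T a),   v_x = Omega x.
u_x = proj(x, a)
v_x = Omega @ x
jac_u = -(x @ a) * np.eye(n) - np.outer(x, a)   # derivative of x -> a - x x^T a
jac_v = Omega

naive = jac_v @ u_x - jac_u @ v_x                          # right-hand side of (111)
projected = proj(x, jac_v @ u_x) - proj(x, jac_u @ v_x)    # definition (109)

normal_part = x @ naive                     # normal component of the naive difference
gap = np.max(np.abs(naive - projected))
print(normal_part, gap)
```

Each of the two directional derivatives has, in general, a non-zero normal component; it is only their difference that is tangent, which is what the symmetry of the restricted operator guarantees.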

An even deeper consequence of the property (104) concerns the computation of the covariant derivative of a vector field. Let us recall that, given a vector field

w \in Γ (M)

and a tangent vector

v \in T_{x} M

, the covariant derivative of w in the direction v at a point x is given by

\begin{matrix} {(\nabla_{v} w)}_{x} = & Π_{x} {\partial_{x} w (v)} + Π_{x} {{\bar{\bar{Γ}}}_{x} (w_{x}, v)} \\ = & \partial_{x} w (v) - Π_{x}^{•} (w_{x}, v) + {\bar{\bar{Γ}}}_{x} (w_{x}, v) - {\bar{\bar{Γ}}}_{x}^{⊥} (w_{x}, v) . \end{matrix}

(112)

Since the second and the fourth terms in the above sum cancel out by virtue of the property (104), we are left with

\begin{matrix} {(\nabla_{v} w)}_{x} = \partial_{x} w (v) + {\bar{\bar{Γ}}}_{x} (w_{x}, v) . \end{matrix}

(113)

In other terms, the sum

\partial_{x} w (v) + {\bar{\bar{Γ}}}_{x} (w_{x}, v)

is already an element of the tangent bundle, without any need for projection. A similar relationship holds for the covariant derivative along a smooth curve, namely

\begin{matrix} \nabla_{t}^{γ} w = \partial_{γ} w (\dot{γ}) + {\bar{\bar{Γ}}}_{γ} (w_{γ}, \dot{γ}) . \end{matrix}

(114)

The above two expressions of the covariant derivative may be used interchangeably with those given in Section 3.2 whenever they are deemed to be easier to apply, as in the case of iterated derivatives covered in Section 4.2.

Let us verify the above general relations by a number of examples.

Example 11.

Let us consider, as a first case of study, the computation of the covariant derivative on the unit hyper-sphere

S^{n - 1}

endowed with the canonical metric. In Example 3 we have evaluated the covariant derivative

\nabla_{t}^{γ} w

of a vector field

w \in Γ (S^{n - 1})

in two different ways to prove their equivalency. Let us repeat such calculation by means of the relationship (114):

\nabla_{t}^{γ} w = {\dot{w}}_{γ} + (w_{γ}^{⊤} \dot{γ}) γ .

(115)

Now, since

γ^{⊤} w_{γ} = 0

, differentiating once gives

{\dot{γ}}^{⊤} w_{γ} + γ^{⊤} {\dot{w}}_{γ} = 0

, therefore

\nabla_{t}^{γ} w = {\dot{w}}_{γ} - (γ^{⊤} {\dot{w}}_{γ}) γ = (I_{n} - γ γ^{⊤}) {\dot{w}}_{γ},

(116)

as in Example 3.

The case regarding the manifold of symmetric, positive-definite matrices endowed with the canonical metric is self-explanatory, since the Christoffel form of the second kind is purely tangential.

The case concerning the special orthogonal group is more interesting. In particular, in [17] (Theorem 1), the present author gave the following expression for the covariant derivative of a vector field

W \in Γ (SO (n))

along a curve

R (t)

:

\nabla_{t}^{R} W = (\dot{R} R^{⊤} + \frac{3}{2} R {\dot{R}}^{⊤}) W_{R} + {\dot{W}}_{R} + \frac{1}{2} W_{R} {\dot{R}}^{⊤} R .

(117)

Such expression was derived by directly applying the definition in terms of parallel transport. We would like to show here that it coincides with the expression found in Example 3 and that the same result is obtained by applying Equation (114). Let us first simplify the expression (117) for the sake of easier comparison. From the orthogonality condition

R R^{⊤} = I_{n}

it follows that

\dot{R} R^{⊤} + R {\dot{R}}^{⊤} = 0

, hence

\dot{R} R^{⊤} + \frac{3}{2} R {\dot{R}}^{⊤} = \frac{1}{2} R {\dot{R}}^{⊤}

, therefore the expression (117) simplifies to

\nabla_{t}^{R} W = \frac{1}{2} R {\dot{R}}^{⊤} W_{R} + {\dot{W}}_{R} + \frac{1}{2} W_{R} {\dot{R}}^{⊤} R .

(118)

Further we notice that from the condition

R W_{R}^{⊤} + W_{R} R^{⊤} = 0

it follows that

W_{R} {\dot{R}}^{⊤} = - \dot{R} W_{R}^{⊤} - R {\dot{W}}_{R}^{⊤} - {\dot{W}}_{R} R^{⊤}

, hence

\begin{matrix} \nabla_{t}^{R} W = & \frac{1}{2} R {\dot{R}}^{⊤} W_{R} + {\dot{W}}_{R} - \frac{1}{2} (\dot{R} W_{R}^{⊤} + R {\dot{W}}_{R}^{⊤} + {\dot{W}}_{R} R^{⊤}) R \\ = & \frac{1}{2} R {\dot{R}}^{⊤} W_{R} + {\dot{W}}_{R} - \frac{1}{2} \dot{R} W_{R}^{⊤} R - \frac{1}{2} R {\dot{W}}_{R}^{⊤} R - \frac{1}{2} {\dot{W}}_{R} \\ = & \frac{1}{2} {\dot{W}}_{R} - \frac{1}{2} R {\dot{W}}_{R}^{⊤} R, \end{matrix}

(119)

in fact

R {\dot{R}}^{⊤} W_{R} = - \dot{R} R^{⊤} W_{R} = \dot{R} W_{R}^{⊤} R

, hence the first and the third terms in the second line cancel out. This is the expression found in Example 3. By applying the simplified Equation (114) we get:

\nabla_{t}^{R} W = {\dot{W}}_{R} + \frac{1}{2} R W_{R}^{⊤} \dot{R} + \frac{1}{2} R {\dot{R}}^{⊤} W_{R} .

(120)

On the other hand, from the tangency condition

W_{R}^{⊤} R + R^{⊤} W_{R} = 0

it follows that

W_{R}^{⊤} \dot{R} = - {\dot{W}}_{R}^{⊤} R - {\dot{R}}^{⊤} W_{R} - R^{⊤} {\dot{W}}_{R}

, hence

\begin{matrix} \nabla_{t}^{R} W = & {\dot{W}}_{R} - \frac{1}{2} R {\dot{W}}_{R}^{⊤} R - \frac{1}{2} R {\dot{R}}^{⊤} W_{R} - \frac{1}{2} R R^{⊤} {\dot{W}}_{R} + \frac{1}{2} R {\dot{R}}^{⊤} W_{R} \end{matrix}

(121)

which, upon easy simplifications, coincides with expression (119).
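
The chain of simplifications carried out for the special orthogonal group can be double-checked numerically. The sketch below is an illustration under explicit assumptions: the point R, the velocity R R^T-compatible matrix `R_dot = R Ω1` and the field value `W = R Ω2` are built from random skew-symmetric matrices, and `W_dot = R_dot Ω2 + R Ω3` is one admissible derivative of the field (Ω3 stands for the derivative of Ω2). The variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 4

def random_skew(n):
    """Random skew-symmetric matrix."""
    M = rng.standard_normal((n, n))
    return M - M.T

# A rotation R, obtained from a QR factorization with the determinant fixed to +1.
R, _ = np.linalg.qr(rng.standard_normal((n, n)))
if np.linalg.det(R) < 0:
    R[:, 0] = -R[:, 0]

# Velocity of the curve and a vector field W = R * Omega2(t) along it.
Omega1, Omega2, Omega3 = random_skew(n), random_skew(n), random_skew(n)
R_dot = R @ Omega1
W = R @ Omega2
W_dot = R_dot @ Omega2 + R @ Omega3      # Omega3 plays the role of d(Omega2)/dt

# The three expressions of the covariant derivative discussed in Example 11.
expr_117 = (R_dot @ R.T + 1.5 * R @ R_dot.T) @ W + W_dot + 0.5 * W @ R_dot.T @ R
expr_119 = 0.5 * W_dot - 0.5 * R @ W_dot.T @ R
expr_120 = W_dot + 0.5 * R @ W.T @ R_dot + 0.5 * R @ R_dot.T @ W

err_117_119 = np.max(np.abs(expr_117 - expr_119))
err_120_119 = np.max(np.abs(expr_120 - expr_119))
# Tangency at R: R^T (covariant derivative) must be skew-symmetric.
skew_defect = np.max(np.abs(R.T @ expr_120 + expr_120.T @ R))
print(err_117_119, err_120_119, skew_defect)
```

The three expressions are expected to agree up to rounding, and the projection-free expression (120) is expected to be tangent at R without any explicit projection.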

The projectionless expression of the covariant derivative of a vector field along a curve impacts on a number of differential operators, including the Hessian, which is defined as the covariant derivative of a gradient of a function. In fact, let us recall that, given a smooth function

f : M \to R

, its Hessian operator computed along a curve

γ

is defined as

H_{γ} f (\dot{γ}) : = \nabla_{t}^{γ} {grad}_{γ} f

, where now

\{\begin{matrix} \nabla_{t}^{γ} {grad}_{γ} f = \frac{d}{d t} {grad}_{γ} f + {\bar{\bar{Γ}}}_{γ} ({grad}_{γ} f, \dot{γ}), \\ {grad}_{γ} f = G_{γ}^{- 1} (Π_{γ} {\partial_{γ} f}), \end{matrix}

(122)

hence an alternative expression for the Hessian is

H_{γ} f (\dot{γ}) = \frac{d}{d t} G_{γ}^{- 1} (Π_{γ} {\partial_{γ} f}) + {\bar{\bar{Γ}}}_{γ} (G_{γ}^{- 1} (Π_{γ} {\partial_{γ} f}), \dot{γ}) .

(123)

The following example studies, in particular, two interesting problems involving covariant differentiation and the logarithmic map.

Example 12.

As a first example of relationship involving covariant differentiation and logarithmic map, let us consider the covariant derivative of a logarithmic map field. In particular, given a point

x \in M

and a smooth curve γ, the quantity

\log_{γ} x

represents a vector field along the curve γ. One might wonder what the covariant derivative of such vector field along the curve γ looks like. Applying the Equation (114) to the vector field

w_{γ} : = \log_{γ} x

(where x is fixed), we get

\nabla_{t}^{γ} \log_{γ} x = (\partial_{γ} \log_{γ} x) (\dot{γ}) + {\bar{\bar{Γ}}}_{γ} (\log_{γ} x, \dot{γ}) .

(124)

Technically speaking, one might notice that

\partial_{y} \log_{y} x \in End (T_{y} M)

and that it might be termed the logarithmic pushforward endomorphism. Such endomorphism was also mentioned in [18].

For the second example, let us consider a manifold

M_{2}

of dimension 2 endowed with a distance function

D : M_{2} \times M_{2} \to R

, take a point

x \in M_{2}

and define a curve γ implicitly such that

D^{2} (γ, x) = r^{2}

for some

r > 0

. Notice that γ is a curve only if the manifold is of dimension two (otherwise it would be a ‘surface’ or, more generally, a submanifold) and that such curve is clearly a ‘circle’, centered in x, of radius r. In addition, let us choose a special parametrization of the curve γ known as parametrization by arclength, which essentially means that at any point of the curve it presents unit speed, namely

∥ \dot{γ} ∥_{γ} = 1

. Now one might wonder as to whether the acceleration of the curve

\nabla_{t}^{γ} \dot{γ}

presents any special property (where ∇ denotes the Levi-Civita connection). This is indeed the case: let us examine the problem. As a first consideration, notice that the condition of unit speed implies that

\nabla_{t}^{γ} \dot{γ} ⊥ \dot{γ}

, in fact

{⟨ \dot{γ}, \dot{γ} ⟩}_{γ} = constant \overset{d / d t}{\Rightarrow} 2 {⟨ \nabla_{t}^{γ} \dot{γ}, \dot{γ} ⟩}_{γ} = 0 \Rightarrow \nabla_{t}^{γ} \dot{γ} ⊥ \dot{γ}

(125)

on each tangent space

T_{γ} M

. In addition, the condition

D^{2} (γ, x) = r^{2}

implies that

0 = \frac{d}{d t} D^{2} (γ, x) = {⟨ - 2 \log_{γ} x, \dot{γ} ⟩}_{γ} \Rightarrow \log_{γ} x ⊥ \dot{γ} .

(126)

Now, since at every point of the curve

dim (T_{γ} M) = 2

, one must conclude that

\nabla_{t}^{γ} \dot{γ} ‖ \log_{γ} x,

(127)

namely the covariant acceleration is directed radially. As a by-product, with reference to the first example, we may notice that the condition

D^{2} (γ, x) = r^{2}

also implies that

0 = \frac{d}{d t} {⟨ \log_{γ} x, \log_{γ} x ⟩}_{γ} = 2 {⟨ \nabla_{t}^{γ} \log_{γ} x, \log_{γ} x ⟩}_{γ} \Rightarrow \nabla_{t}^{γ} \log_{γ} x ⊥ \log_{γ} x,

(128)

which therefore implies

\nabla_{t}^{γ} \log_{γ} x ‖ \dot{γ} .

(129)

In fact we may notice that, for a Euclidean space, it holds that

\log_{γ} x \equiv x - γ

and hence it follows that

\nabla_{t}^{γ} \log_{γ} x = - \dot{γ}

.
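
The relations (125), (126) and (127) may be illustrated on the unit sphere S^2 endowed with the canonical metric. The sketch below rests on a few assumptions that are not part of the text above: the centre x is taken as the north pole, the radius is r = 0.8, the circle is written in closed form with unit speed, and the logarithmic map is evaluated through its standard closed form for the unit sphere. The covariant acceleration is computed by means of (114) applied to the velocity field, with the Christoffel form (96).

```python
import numpy as np

r = 0.8                                   # geodesic radius of the circle
x = np.array([0.0, 0.0, 1.0])             # centre of the circle (north pole)
omega = 1.0 / np.sin(r)                   # angular rate giving unit speed

def gamma(t):
    """Circle of geodesic radius r around x, parametrized by arclength."""
    return np.array([np.sin(r) * np.cos(omega * t),
                     np.sin(r) * np.sin(omega * t),
                     np.cos(r)])

def gamma_dot(t):
    return np.array([-np.sin(omega * t), np.cos(omega * t), 0.0])

def gamma_ddot(t):
    return -omega * np.array([np.cos(omega * t), np.sin(omega * t), 0.0])

def log_map(y, x):
    """Logarithmic map of the unit sphere (standard closed form, y != +-x)."""
    theta = np.arccos(np.clip(y @ x, -1.0, 1.0))
    return theta / np.sin(theta) * (x - np.cos(theta) * y)

t = 0.37
g, gd, gdd = gamma(t), gamma_dot(t), gamma_ddot(t)
acc = gdd + g * (gd @ gd)                 # covariant acceleration via (114) and (96)
lg = log_map(g, x)

speed = np.sqrt(gd @ gd)
acc_dot_vel = acc @ gd                    # relation (125)
log_dot_vel = lg @ gd                     # relation (126)
parallel_defect = np.linalg.norm(np.cross(acc, lg))   # relation (127)
distance = np.linalg.norm(lg)             # should equal r
print(speed, acc_dot_vel, log_dot_vel, parallel_defect, distance)
```

In this sketch the two vectors `acc` and `lg` belong to the same two-dimensional tangent space, hence a vanishing cross product in the ambient space R^3 signals that they are parallel.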

To end this section, let us examine a further, perhaps counter-intuitive, example of calculation.

Example 13.

We have seen that an important property of the Levi-Civita connection is that it is ‘metric’, which may be expressed concisely as

\frac{d}{d t} ⟨ 🟉, 🟉 ⟩_{γ (t)} \equiv 0

for any smooth curve γ. In intrinsic manifold calculus, such property is usually expressed through the sentence ‘the covariant derivative of the metric tensor is equal to zero’. One might thus wonder as to whether the same holds for the metric kernel G, namely if

\nabla_{t}^{γ} G_{γ} \overset{?}{\equiv} 0

. We are going to show now that this is not, indeed, the case.

Recalling that, for a smooth vector field

w \in Γ (M)

it holds that

\nabla_{t}^{γ} w = {\dot{w}}_{γ} + {\bar{\bar{Γ}}}_{γ} (w_{γ}, \dot{γ})

, we may compute

\begin{matrix} \nabla_{t}^{γ} G_{γ} (w_{γ}) = & \frac{d}{d t} G_{γ} (w_{γ}) + {\bar{\bar{Γ}}}_{γ} (G_{γ} (w_{γ}), \dot{γ}) \\ = & G_{γ}^{•} (w_{γ}, \dot{γ}) + G_{γ} ({\dot{w}}_{γ}) + {\bar{\bar{Γ}}}_{γ} (G_{γ} (w_{γ}), \dot{γ}) \\ = & G_{γ} (\nabla_{t}^{γ} w) + G_{γ}^{•} (w_{γ}, \dot{γ}) - G_{γ} ({\bar{\bar{Γ}}}_{γ} (w_{γ}, \dot{γ})) + {\bar{\bar{Γ}}}_{γ} (G_{γ} (w_{γ}), \dot{γ}) . \end{matrix}

(130)

Therefore, we can conclude that the covariant derivative of the metric kernel reads

\nabla_{t}^{γ} G_{γ} = G_{γ}^{•} (🟉, \dot{γ}) - G_{γ} ({\bar{\bar{Γ}}}_{γ} (🟉, \dot{γ})) + {\bar{\bar{Γ}}}_{γ} (G_{γ} (🟉), \dot{γ})

, where 🟉 denotes an empty slot again. The right-hand side of the above expression vanishes trivially only when the metric kernel coincides with the identity.

3.9. Lie Derivation*

Given two vector fields

u, v \in Γ (M)

, besides covariant derivation

\nabla_{u} v

, there exists a further way to compute the rate of change of v in the direction u, which constitutes the first attempt to define directional derivatives on manifolds: the Lie derivative. Lie derivation does not require any notion of connection, parallel transport or metrization. We shall however see that Lie derivation and Levi-Civita covariant derivation are related to one another. Although the notion of Lie derivative is presented in this section in extrinsic terms, it is a non-essential topic in this tutorial paper. However, besides its historical value, we would like to point out that it has important applications in control theory, as summarized in [19].

Let

u, v \in Γ (M)

denote two smooth tangent vector fields. The notation

u_{x}

indicates the value of u at x. The Lie derivative of a vector field is defined in an analogous way to the covariant derivative, except that the role of parallel transport is played by a pullback map of a flow. We define a flow

σ_{t} (x)

associated to the vector field u as the solution of the following differential equation:

{\dot{σ}}_{t} (x) = u_{σ_{t} (x)}, with initial condition σ_{0} (x) = x .

(131)

The function

σ_{t} : M \to M

takes a point in the manifold to another point in the manifold, hence its first-order approximation at a point x, given by its tangent (pushforward) map, may be defined as

d_{x} σ_{t} (v) : = {\frac{d}{d s} σ_{t} (τ_{s} (x))|}_{s = 0},

(132)

where

τ : [- ϵ, ϵ] \to M

denotes any smooth curve such that

τ_{0} (x) = x

and

{\dot{τ}}_{0} (x) = v_{x} \in T_{x} M

. Now, the pushforward map acts as

d_{x} σ_{t} : T_{x} M \to T_{σ_{t} (x)} M

, hence the associated pullback acts as

{(d_{x} σ_{t})}^{- 1} : T_{σ_{t} (x)} M \to T_{x} M

. Namely, the pullback

{(d_{x} σ_{t})}^{- 1}

takes a vector from the tangent space

T_{σ_{t} (x)} M

to the tangent space

T_{x} M

. Such operation resembles parallel transport but is however referred to as Lie dragging. Lie derivation is based on Lie dragging in place of parallel transport and is defined as

{(L_{u} v)}_{x} : = {\frac{d}{d t} {(d_{x} σ_{t})}^{- 1} v_{σ_{t} (x)}|}_{t = 0} .

(133)

We notice that the Lie derivative is also related to the differential of a scalar-valued function

f : M \to R

, since

d_{x} f (v) = {(L_{v} f)}_{x}

.

In order to find a rule to calculate the Lie derivative of a vector field, we can proceed as follows. By definition we have

\begin{matrix} {(L_{u} v)}_{x} = & lim_{h \to 0} \frac{{(d_{x} σ_{h})}^{- 1} v_{σ_{h} (x)} - v_{x}}{h} \\ = & lim_{h \to 0} {(d_{x} σ_{h})}^{- 1} \frac{v_{σ_{h} (x)} - (d_{x} σ_{h}) v_{x}}{h} \\ = & lim_{h \to 0} \frac{v_{σ_{h} (x)} - (d_{x} σ_{h}) v_{x}}{h} \\ = & lim_{h \to 0} \frac{v_{σ_{h} (x)} - {\frac{d}{d s} σ_{h} (τ_{s})|}_{s = 0}}{h}, \end{matrix}

(134)

where we used the fact that

{(d_{x} σ_{0})}^{- 1} = {id}_{x}

and the last equality is due to the definition (132). By Taylor-expanding the function

σ_{t} (x)

in t we get

σ_{t} (x) = x + t {\frac{d}{d t} σ_{t} (x)|}_{t = 0} + O (t^{2}) = x + t u_{x} + O (t^{2}) .

(135)

Hence, it follows that

σ_{t} (τ_{s} (x)) = τ_{s} (x) + t u_{τ_{s} (x)} + O (t^{2})

(136)

and, consequently, that

\begin{matrix} \frac{d}{d s} σ_{t} (τ_{s} (x)) & = {\dot{τ}}_{s} (x) + t \frac{d}{d s} u_{τ_{s} (x)} + O (t^{2}) \\ = {\dot{τ}}_{s} (x) + t (\partial_{x} u) ({\dot{τ}}_{s} (x)) + O (t^{2}) . \end{matrix}

(137)

Setting

s = 0

results in

\begin{matrix} {\frac{d}{d s} σ_{t} (τ_{s} (x))|}_{s = 0} = {\dot{τ}}_{0} (x) + t (\partial_{x} u) ({\dot{τ}}_{0} (x)) + O (t^{2}) = v_{x} + t (\partial_{x} u) (v_{x}) + O (t^{2}) . \end{matrix}

(138)

Likewise, Taylor-expanding the function

v_{σ_{t} (x)}

with respect to the parameter t gives

v_{σ_{t} (x)} = v_{x} + t (\partial_{x} v) ({\dot{σ}}_{0} (x)) + O (t^{2}) = v_{x} + t (\partial_{x} v) (u_{x}) + O (t^{2}) .

(139)

Gluing pieces together gives

{(L_{u} v)}_{x} = \partial_{x} v (u_{x}) - \partial_{x} u (v_{x}) = {[u, v]}_{x} .

(140)

Namely, the Lie derivative of a vector field along a given direction coincides with a commutator.
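
The identity (140) can be tested numerically in the simplest possible setting. The following sketch assumes a flat space (the manifold coincides with R^3) and two linear vector fields u_x = A x and v_x = B x, for which the flow is σ_t(x) = exp(tA) x and its pushforward is the matrix exp(tA); the helper `expm_series` is a hypothetical truncated-series stand-in for a matrix exponential routine. The Lie derivative is approximated by a central difference of the Lie-dragged field in the definition (133) and compared with the commutator.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 3
A = rng.standard_normal((n, n))     # u_x = A x
B = rng.standard_normal((n, n))     # v_x = B x
x = rng.standard_normal(n)

def expm_series(M, terms=25):
    """Truncated Taylor series of the matrix exponential (fine for small ||M||)."""
    result = np.eye(M.shape[0])
    term = np.eye(M.shape[0])
    for k in range(1, terms):
        term = term @ M / k
        result = result + term
    return result

def dragged(t):
    """(d_x sigma_t)^{-1} v_{sigma_t(x)} for the linear flow sigma_t(x) = exp(tA) x."""
    flow = expm_series(t * A)           # d_x sigma_t = exp(tA)
    return np.linalg.solve(flow, B @ (flow @ x))

h = 1e-5
lie_derivative = (dragged(h) - dragged(-h)) / (2.0 * h)   # definition (133)
commutator = B @ (A @ x) - A @ (B @ x)                    # formula (140)
err = np.max(np.abs(lie_derivative - commutator))
print(err)
```

The printed discrepancy is expected to be of the order of the finite-difference error only. On a curved manifold the same experiment would require integrating the flow on the manifold, which this sketch does not attempt.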

Next we discuss, as an example, the meaning of vanishing Lie derivative.

Example 14.

Given two vector fields

u, v \in Γ (M)

, it may happen that

L_{u} v = 0

. Intuitively, such condition means that the two vector fields are related to one another. In particular, recalling the meaning of the flow σ at the basis of the definition (133), mathematicians say that σ is a ‘symmetry transformation’ of the vector field v. In order to make such notion explicit, let us rewrite the expression (133) for a generic point of the flow, namely

{(L_{u} v)}_{σ_{t} (x)} : = \frac{d}{d t} {(d_{x} σ_{t})}^{- 1} v_{σ_{t} (x)} .

(141)

If the Lie derivative of the vector field v vanishes along the whole flow of the vector field u, the above expression implies that the function

t \mapsto {(d_{x} σ_{t})}^{- 1} v_{σ_{t} (x)}

is constant, hence that

{(d_{x} σ_{t})}^{- 1} v_{σ_{t} (x)} = {(d_{x} σ_{0})}^{- 1} v_{σ_{0} (x)}

. Ultimately, a vanishing Lie derivative implies that

v_{σ_{t} (x)} = (d_{x} σ_{t}) v_{x},

(142)

namely that the vector field v along the flow takes a value obtainable by Lie-dragging backward its source value and is completely ‘generated’ by it. This is reminiscent of the construction of a parallel vector field, from an initial seed, by parallel transport.

A consequence of the above result is that a torsion-free connection is nicely related to the Lie derivative, in fact, the condition

θ (u, v) = 0

implies that

\nabla_{u} v - \nabla_{v} u = L_{u} v

(143)

while, in general, connections need not be related to Lie derivatives. Notice that, from the above expression, it directly follows that

L_{u} v + L_{v} u = 0

. But there is more to this. In fact, upon introducing a metric and a notion of Lie derivation, a covariant derivative may be computed on the basis of Lie derivatives through the following formula due to Koszul:

2 ⟨ \nabla_{v} u, w ⟩ = L_{v} ⟨ u, w ⟩ + L_{u} ⟨ v, w ⟩ - L_{w} ⟨ v, u ⟩ - ⟨ u, L_{v} w ⟩ - ⟨ v, L_{u} w ⟩ - ⟨ w, L_{u} v ⟩,

(144)

where

u, v, w \in Γ (M)

are smooth vector fields and we have omitted for the ease of notation the point

x \in M

where all derivatives are taken. Proving the above identity is a matter of patience. First, let us recall that given a smooth curve

γ : [- ϵ, ϵ] \to M

such that

γ (0) = x \in M

and

\dot{γ} (0) = v_{x} \in T_{x} M

, and given two smooth vector fields

u, w \in Γ (M)

, for a metric connection it holds that

\frac{d}{d t} {⟨ u_{γ (t)}, w_{γ (t)} ⟩}_{γ (t)} = {⟨ \nabla_{t}^{γ} u, w_{γ (t)} ⟩}_{γ (t)} + {⟨ u_{γ (t)}, \nabla_{t}^{γ} w ⟩}_{γ (t)} .

(145)

Setting

t = 0

leads to

(d_{x} {⟨ u_{x}, w_{x} ⟩}_{x}) (v_{x}) = {(L_{v} ⟨ u, w ⟩)}_{x} = {⟨ {(\nabla_{v} u)}_{x}, w_{x} ⟩}_{x} + {⟨ u_{x}, {(\nabla_{v} w)}_{x} ⟩}_{x} .

(146)

Likewise, we may write that

\begin{matrix} {(L_{u} ⟨ v, w ⟩)}_{x} = & {⟨ {(\nabla_{u} v)}_{x}, w_{x} ⟩}_{x} + {⟨ v_{x}, {(\nabla_{u} w)}_{x} ⟩}_{x}, \\ - {(L_{w} ⟨ v, u ⟩)}_{x} = & - {⟨ {(\nabla_{w} v)}_{x}, u_{x} ⟩}_{x} - {⟨ v_{x}, {(\nabla_{w} u)}_{x} ⟩}_{x} . \end{matrix}

(147)

Summing up side by side yields:

\begin{matrix} {(L_{v} ⟨ u, w ⟩)}_{x} + {(L_{u} ⟨ v, w ⟩)}_{x} - {(L_{w} ⟨ v, u ⟩)}_{x} \\ = {⟨ {(\nabla_{v} u)}_{x} + {(\nabla_{u} v)}_{x}, w_{x} ⟩}_{x} + {⟨ {(\nabla_{v} w)}_{x} - {(\nabla_{w} v)}_{x}, u_{x} ⟩}_{x} \\ + {⟨ {(\nabla_{u} w)}_{x} - {(\nabla_{w} u)}_{x}, v_{x} ⟩}_{x} \\ = {⟨ {(\nabla_{v} u)}_{x} + {(\nabla_{v} u)}_{x} + {(L_{u} v)}_{x}, w_{x} ⟩}_{x} + {⟨ {(L_{v} w)}_{x}, u_{x} ⟩}_{x} + {⟨ {(L_{u} w)}_{x}, v_{x} ⟩}_{x} \\ = 2 {⟨ {(\nabla_{v} u)}_{x}, w_{x} ⟩}_{x} + {⟨ {(L_{u} v)}_{x}, w_{x} ⟩}_{x} + {⟨ {(L_{v} w)}_{x}, u_{x} ⟩}_{x} + {⟨ {(L_{u} w)}_{x}, v_{x} ⟩}_{x}, \end{matrix}

(148)

which proves the assertion.

To conclude this subsection, we examine an example that discusses the notion of Killing vector field.

Example 15.

A Killing vector field is one along which the Lie derivative of the metric vanishes. Let us make this notion more substantial. Let

u, v, w \in Γ (M)

denote three vector fields and let us recall that

L_{v} ⟨ u, w ⟩ = ⟨ \nabla_{v} u, w ⟩ + ⟨ u, \nabla_{v} w ⟩

, where ∇ denotes a Levi-Civita connection. Now, from the relationship (143) it follows that

\nabla_{v} u = \nabla_{u} v + L_{v} u, \nabla_{v} w = \nabla_{w} v + L_{v} w,

(149)

therefore, it holds that

\underbrace{L_{v} ⟨ u, w ⟩ - ⟨ L_{v} u, w ⟩ - ⟨ u, L_{v} w ⟩}_{: = L_{v} ⟨ 🟉, 🟉 ⟩} = ⟨ \nabla_{u} v, w ⟩ + ⟨ u, \nabla_{w} v ⟩ .

(150)

The left-hand side of such equation represents the Lie derivative of the metric, namely

L_{v} ⟨ 🟉, 🟉 ⟩

. Vanishing of the Lie derivative of the metric implies that

⟨ \nabla_{🟉} v, 🟉 ⟩ + ⟨ 🟉, \nabla_{🟉} v ⟩ = 0,

(151)

that is a condition on the vector field v referred to as Killing equation. Here, the empty slots remind us that the equation

⟨ \nabla_{u} v, w ⟩ + ⟨ u, \nabla_{w} v ⟩ = 0

must hold for every choice of

u, w \in Γ (M)

, hence the Killing equation involves only the differential structure of the manifold.
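
As an illustration of the Killing equation (151), let us consider, as an assumption of this sketch, the unit hyper-sphere with the canonical metric, for which the Christoffel form (96) is purely normal, so that the covariant derivative (90) reduces to the projected directional derivative. The rotation field v_x = Ω x, with Ω skew-symmetric, is expected to satisfy the Killing equation, whereas the field v_x = a - x (x^T a) is expected not to. The function name `killing_defect` and the random data are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 4
x = rng.standard_normal(n)
x /= np.linalg.norm(x)                           # a point of S^{n-1}

def proj(x, b):
    """Orthogonal projector onto T_x S^{n-1}."""
    return b - x * (x @ b)

u = proj(x, rng.standard_normal(n))              # arbitrary tangent vectors at x
w = proj(x, rng.standard_normal(n))

def killing_defect(jac_v):
    """<nabla_u v, w> + <u, nabla_w v>, with nabla_u v = Pi_x(d_x v (u))."""
    nabla_u_v = proj(x, jac_v @ u)
    nabla_w_v = proj(x, jac_v @ w)
    return nabla_u_v @ w + u @ nabla_w_v

# Rotation field v_x = Omega x (Omega skew-symmetric): expected to be Killing.
M = rng.standard_normal((n, n))
Omega = M - M.T
defect_rotation = killing_defect(Omega)

# Field v_x = a - x (x^T a), with Jacobian -(x^T a) I - x a^T: not Killing.
a = rng.standard_normal(n)
defect_gradient = killing_defect(-(x @ a) * np.eye(n) - np.outer(x, a))

print(defect_rotation, defect_gradient, -2.0 * (x @ a) * (u @ w))
```

For the second field a direct calculation gives the defect -2 (x^T a) (u^T w), which is the third printed value and which does not vanish for generic u and w.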

4. Iterated Derivatives and Riemannian Curvature

The notion of covariant derivative arises as a generalization of the directional derivative of a vector field. However, while directional derivation commutes, covariant derivation does not. Such observation leads to the notion of Riemannian curvature endomorphism.

The present section is organized as follows. Section 4.1 recalls some well-known properties of the iterated directional derivatives in the flat space

R^{n}

. Section 4.2 lays out the notion of iterated covariant derivatives and defines the notion of second covariant derivative, which is shown to be well-defined in contrast to a double iterated derivative. Section 4.3 describes the parallel transport of a tangent vector along a tiny loop and shows that the final result differs from the starting vector, and attributes such discrepancy to the curvedness of a manifold. In Section 4.4, we describe the key notion of Riemannian curvature endomorphism, which quantifies the non-commutativity of the second covariant derivative and is taken as a fundamental operator in describing the effects of curvature of a manifold. To end this section, Section 4.5 describes some curvature-related concepts, such as tidal effects, geodesic deviation and sectional curvature.

4.1. Commutativity of Directional Derivatives in $R^{n}$

Commutativity of directional derivatives is a consequence of Schwarz’s theorem. Recall that, given a vector field

f : R^{n} \to R^{m}

, and a vector

v \in R^{n}

, its directional derivative at a point p is defined as

D_{p} f (v) : = lim_{h \to 0} \frac{f (p + h v) - f (p)}{h} .

(152)

Given another direction

u \in R^{n}

, one may compute the directional derivative of the vector field

p \mapsto D_{p} f (v)

along u, namely

D_{p} (D_{p} f (v)) (u) : = lim_{k \to 0} \frac{D_{p + k u} f (v) - D_{p} f (v)}{k} .

(153)

In other terms, we have that

\begin{matrix} D_{p} (D_{p} f (v)) (u) \\ = lim_{k \to 0} \frac{1}{k} \{lim_{h \to 0} (\frac{f (p + h v + k u) - f (p + k u)}{h} - \frac{f (p + h v) - f (p)}{h})\} \\ = lim_{h \to 0} \frac{1}{h} \{lim_{k \to 0} (\frac{f (p + k u + h v) - f (p + h v)}{k} - \frac{f (p + k u) - f (p)}{k})\} \\ = lim_{h \to 0} \frac{D_{p + h v} f (u) - D_{p} f (u)}{h} \\ = D_{p} (D_{p} f (u)) (v), \end{matrix}

(154)

hence directional derivatives commute.
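The commutativity stated in Equation (154) may be illustrated numerically. The following is a minimal sketch, assuming a hypothetical smooth test map `f` and central finite differences in place of the limits; the names `f` and `directional` are illustrative and do not appear in the text.

```python
import numpy as np

def f(p):
    # Hypothetical smooth test map from R^3 to R^2 (illustrative choice only).
    x, y, z = p
    return np.array([np.sin(x * y) + z**2, x * np.exp(y) * np.cos(z)])

def directional(g, v, h=1e-4):
    # Returns the map p -> D_p g(v), approximated by a central difference.
    return lambda p: (g(p + h * v) - g(p - h * v)) / (2.0 * h)

p = np.array([0.3, -0.5, 0.8])
u = np.array([1.0, 2.0, -1.0])
v = np.array([0.5, -1.0, 3.0])

# Derive along v first and then along u, and the other way around.
d_v_then_u = directional(directional(f, v), u)(p)
d_u_then_v = directional(directional(f, u), v)(p)
print(d_v_then_u)
print(d_u_then_v)
```

The two printed vectors agree up to rounding, since both nested differences reduce to the same four-point stencil, which is the discrete counterpart of the exchange of limits in (154).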

The above-discussed commutativity test for directional derivatives is based on the notion of iterated directional derivation. To extend such a test to manifolds, it is necessary to discuss what is meant by iterated covariant derivation and through which formulas such iterated derivatives may be computed.

4.2. Iterated Covariant Derivatives and Second Covariant Derivative

Covariant derivation takes a vector field as input and returns a further vector field. Hence, it is possible to compute the covariant derivative of a covariant derivative (and so forth) of a given vector field.

In particular, given a smooth curve

γ

with image in

M

, which represents the trajectory generated by a dynamical system, it is conceivable to define the following vector fields along such trajectory:

\begin{matrix} \dot{γ} = & 𝔳 (velocity field), \\ \nabla_{t}^{γ} \dot{γ} = & 𝔞 (covariant acceleration), \\ \nabla_{t}^{γ} \nabla_{t}^{γ} \dot{γ} = & 𝔯 (covariant jolt / jerk), \\ \nabla_{t}^{γ} \nabla_{t}^{γ} \nabla_{t}^{γ} \dot{γ} = & 𝔫 (covariant snap / jounce), \\ {(\nabla_{t}^{γ})}^{4} \dot{γ} = & 𝔠 (covariant crackle / flounce), \\ {(\nabla_{t}^{γ})}^{5} \dot{γ} = & 𝔭 (covariant pop / pounce) . \end{matrix}

(155)

(It is an interesting research exercise to look for a translation of these terms in one’s local language. For the Italian readers, derivatives of a positional variable from the third order on are translated as strappo, sbalzo, crepitìo, schiocco).

In general, iterated covariant derivatives are best computed through the expression (114) which, applied to the velocity field, reads

\nabla_{t}^{γ} \dot{γ} = \ddot{γ} + {\bar{\bar{Γ}}}_{γ} (\dot{γ}, \dot{γ}) .

(156)

Let us consider an example of evaluation of covariant acceleration on the special orthogonal group.

Example 16.

On the special orthogonal group

SO (n)

endowed with its canonical metric, the covariant acceleration associated to a trajectory

t \to R (t)

reads, according to Equation (156):

\begin{matrix} 𝔞_{t}^{R} = \ddot{R} + {\bar{\bar{Γ}}}_{R} (\dot{R}, \dot{R}) = \ddot{R} + R {\dot{R}}^{⊤} \dot{R} . \end{matrix}

(157)

Recalling that

SO (n)

is a Lie group and writing the velocity

\dot{R}

as

R H

, with

H \in so (n)

, one may write

\ddot{R} = \dot{R} H + R \dot{H} = R (H^{2} + \dot{H})

, hence substituting:

\begin{matrix} 𝔞_{t}^{R} = \ddot{R} + R {\dot{R}}^{⊤} \dot{R} = R (H^{2} + \dot{H}) - R H^{2} = R \dot{H} . \end{matrix}

(158)

Whenever the group

SO (3)

is used to model the rotational dynamics of a rigid body, the above relation is interpreted as follows: The covariant acceleration equals the angular acceleration in the body-fixed reference frame brought back to an inertial reference frame.
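A quick numerical sanity check of Equations (157) and (158) can be sketched as follows, assuming a hypothetical trajectory in SO(3) generated through the Cayley transform of a time-varying skew-symmetric matrix and finite-difference time derivatives. The helper names (`hat`, `R_of_t`, `d1`, `d2`) are illustrative assumptions.

```python
import numpy as np

def hat(a):
    # Map a 3-vector to the corresponding skew-symmetric matrix in so(3).
    return np.array([[0.0, -a[2], a[1]],
                     [a[2], 0.0, -a[0]],
                     [-a[1], a[0], 0.0]])

def R_of_t(t):
    # Hypothetical trajectory in SO(3): Cayley transform of a skew-symmetric matrix.
    S = hat(np.array([0.3 * np.sin(t), 0.2 * t, 0.3 * np.cos(t)]))
    I = np.eye(3)
    return np.linalg.solve(I - S, I + S)

def d1(g, t, h=1e-4):
    # Central-difference first derivative.
    return (g(t + h) - g(t - h)) / (2.0 * h)

def d2(g, t, h=1e-4):
    # Central-difference second derivative.
    return (g(t + h) - 2.0 * g(t) + g(t - h)) / h**2

t0 = 0.7
R, Rd, Rdd = R_of_t(t0), d1(R_of_t, t0), d2(R_of_t, t0)

# Covariant acceleration according to Equation (157).
acc = Rdd + R @ Rd.T @ Rd

# H = R^T R' and its time derivative, for comparison with Equation (158).
H_dot = d1(lambda t: R_of_t(t).T @ d1(R_of_t, t), t0)

# A tangent vector at R has the form R times a skew-symmetric matrix.
skew_defect = np.linalg.norm(R.T @ acc + (R.T @ acc).T)
print(skew_defect, np.linalg.norm(acc - R @ H_dot))
```

Both printed numbers should be small (of the order of the finite-difference error), which indicates that the covariant acceleration is tangent to SO(3) at R and coincides with the product of R and the derivative of H.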

Applying Equation (114) again to the vector field

\ddot{γ} + {\bar{\bar{Γ}}}_{γ} (\dot{γ}, \dot{γ})

yields

\nabla_{t}^{γ} \nabla_{t}^{γ} \dot{γ} = \overset{⃛}{γ} + \frac{d}{d t} {\bar{\bar{Γ}}}_{γ} (\dot{γ}, \dot{γ}) + {\bar{\bar{Γ}}}_{γ} (\ddot{γ} + {\bar{\bar{Γ}}}_{γ} (\dot{γ}, \dot{γ}), \dot{γ}) .

(159)

The same procedure may be repeated to get the expressions of the higher-order iterated covariant derivatives. It is worth observing that, for the first time, we encounter a derivative of the Christoffel form,

\frac{d}{d t} \bar{\bar{Γ}}

, as well as nested Christoffel forms

\bar{\bar{Γ}} (\bar{\bar{Γ}}, 🟉)

. Such observation turns out to hold in the more general setting of nested connections applied to vector fields and plays a fundamental role in the computation of Riemannian curvature.

It is worth underlining two special cases.

A special case arises in the presence of a connection whose Christoffel form is normal, namely $Π_{γ} \bar{\bar{Γ}} \equiv 0$ , in which case we may access the simplified relation (36) to compute the covariant derivative, namely $\nabla_{t}^{γ} \dot{γ} = Π_{γ} (\ddot{γ})$ . In such a case, the covariant jolt may be computed as

$𝔯_{t}^{γ} = Π_{γ} \{\overset{⃛}{γ} + Π_{γ}^{•} (\ddot{γ}, \dot{γ})\} .$

(160)

An even more particular case is given by a geodesic curve, for which $\ddot{γ} = - {\bar{\bar{Γ}}}_{γ} (\dot{γ}, \dot{γ})$ . In this case, the acceleration $𝔞_{t}^{γ} = Π_{γ} \{\ddot{γ} + {\bar{\bar{Γ}}}_{γ} (\dot{γ}, \dot{γ})\} = 0$ , hence all covariant derivatives (from acceleration to pounce) are identically zero (no matter what the structure of the Christoffel form of the second kind is).

Let us examine a few examples of computation of covariant jolt

𝔯_{t}^{γ}

associated to a manifold-valued system trajectory.

Example 17.

The first case treated in the present example concerns the hypersphere

S^{n - 1}

endowed with its canonical metric. In this case, the Christoffel form of the second kind reads

{\bar{\bar{Γ}}}_{x} (u, v) = (u^{⊤} v) x

hence the covariant acceleration reads

𝔞_{t}^{x} = \ddot{x} + x {\dot{x}}^{⊤} \dot{x} .

(161)

In addition, according to Equation (159), the covariant jolt of a trajectory

t \mapsto x (t)

in

S^{n - 1}

reads

\begin{matrix} 𝔯_{t}^{x} = & \overset{⃛}{x} + \frac{d}{d t} ({\dot{x}}^{⊤} \dot{x} x) + {\dot{x}}^{⊤} (\ddot{x} + {\dot{x}}^{⊤} \dot{x} x) x \\ = & \overset{⃛}{x} + 3 x {\ddot{x}}^{⊤} \dot{x} + \dot{x} {\dot{x}}^{⊤} \dot{x} . \end{matrix}

(162)

Let us recall that from the condition

x^{⊤} x = 1

, differentiating three times with respect to time, one gets

3 {\ddot{x}}^{⊤} \dot{x} + x^{⊤} \overset{⃛}{x} = 0

, hence the last expression is equivalent to

\begin{matrix} 𝔯_{t}^{x} = & (I_{n} - x x^{⊤}) \overset{⃛}{x} + \dot{x} {\dot{x}}^{⊤} \dot{x} . \end{matrix}

(163)

We might have obtained the same expression more easily recalling that, in this case, the Christoffel form is purely normal, hence the jolt is given by the Equation (160). Let us recall that

Π_{x} (a) = a - x x^{⊤} a

, hence

Π_{x}^{•} (a, \dot{x}) = - \dot{x} x^{⊤} a - x {\dot{x}}^{⊤} a

. Therefore, the jolt is computed as

𝔯_{t}^{x} = (I - x x^{⊤}) \overset{⃛}{x} - \dot{x} x^{⊤} \ddot{x} .

(164)

Notice however that differentiating both sides of the condition

x^{⊤} x = 1

twice yields

x^{⊤} \ddot{x} + {\dot{x}}^{⊤} \dot{x} = 0

, hence the expression (164) is equivalent to the expression (163).
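The equivalence of the expressions (162), (163) and (164) may be checked numerically. The sketch below assumes a hypothetical trajectory on the unit sphere (a circle of colatitude `theta` traversed with a non-uniform angle `phi(t) = t + 0.3 t^2`), for which the time derivatives are available in closed form; all variable names are illustrative.

```python
import numpy as np

theta, t = 1.1, 0.4                      # colatitude of the circle and time instant
phi = t + 0.3 * t**2                     # hypothetical angle law
dphi, ddphi = 1.0 + 0.6 * t, 0.6         # its first and second derivatives (third is zero)
s, c = np.sin(theta), np.cos(theta)

x = np.array([s * np.cos(phi), s * np.sin(phi), c])      # point on S^2
e = np.array([-s * np.sin(phi), s * np.cos(phi), 0.0])   # dx/dphi
n = np.array([-s * np.cos(phi), -s * np.sin(phi), 0.0])  # d2x/dphi2 (note dn/dphi = -e)

xd = dphi * e                                # velocity
xdd = ddphi * e + dphi**2 * n                # second derivative
xddd = -dphi**3 * e + 3.0 * dphi * ddphi * n # third derivative

P = np.eye(3) - np.outer(x, x)               # orthogonal projector onto T_x S^2

jolt_162 = xddd + 3.0 * x * (xdd @ xd) + xd * (xd @ xd)
jolt_163 = P @ xddd + xd * (xd @ xd)
jolt_164 = P @ xddd - xd * (x @ xdd)
print(jolt_162, jolt_163, jolt_164)
```

The three vectors coincide up to rounding and are orthogonal to x, as expected from a tangent vector to the hypersphere.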

A second case study arises from the special orthogonal group

SO (n)

endowed with its canonical metric. In this case, the jolt of a trajectory

t \mapsto R (t)

in

SO (n)

computed through the general Equation (159), taking into account that

{\bar{\bar{Γ}}}_{R} (U, V) = \frac{1}{2} R (U^{⊤} V + V^{⊤} U)

, reads

𝔯_{t}^{R} = \overset{⃛}{R} + \frac{d}{d t} (R {\dot{R}}^{⊤} \dot{R}) + \frac{1}{2} R ({(\ddot{R} + R {\dot{R}}^{⊤} \dot{R})}^{⊤} \dot{R} + {\dot{R}}^{⊤} (\ddot{R} + R {\dot{R}}^{⊤} \dot{R})) .

(165)

Even in this case, the Christoffel form is purely normal, hence we may apply the relationship (160), upon recalling that

Π_{R} (A) = \frac{1}{2} A - \frac{1}{2} R A^{⊤} R

and hence

Π_{R}^{•} (A, \dot{R}) = - \frac{1}{2} \dot{R} A^{⊤} R - \frac{1}{2} R A^{⊤} \dot{R}

, obtaining

𝔯_{t}^{R} = Π_{R} \{\overset{⃛}{R} + \dot{R} {\dot{R}}^{⊤} \dot{R}\} .

(166)

In this case, since the manifold

SO (n)

is a Lie group, the velocity over the trajectory may be factored as

\dot{R} = R H

with

H \in so (n)

. Hence

\begin{matrix} \ddot{R} = & \dot{R} H + R \dot{H} = R (H^{2} + \dot{H}), \\ \overset{⃛}{R} = & \dot{R} (H^{2} + \dot{H}) + R (\dot{H} H + H \dot{H} + \ddot{H}) = R (\ddot{H} + 2 H \dot{H} + \dot{H} H + H^{3}) . \end{matrix}

(167)

Therefore one may verify that the above expressions are equivalent to one another and to

𝔯_{t}^{R} = R \ddot{H} + \frac{1}{2} R (H \dot{H} - \dot{H} H) .

(168)

Notice that the result of the calculation is a tangent vector in

T_{R} SO (n)

. The so-called rotational jolt

𝔯_{t}^{R}

(and, in particular, a function of its norm

∥ 𝔯_{t}^{R} ∥_{R (t)}

) was used in several contexts to measure the fluidity of rotational movements. An example may be found in [20,21], which dealt with hips sway fluency assessment in the context of movement analysis.
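The claimed equivalence of the expressions (165), (166) and (168) is purely algebraic once the relations (167) are accepted, hence it may be verified with arbitrary data. The following sketch assumes a randomly drawn rotation R (through a Cayley transform) and randomly drawn skew-symmetric values of H and of its first two time derivatives at a fixed instant; the seed and the names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4

def skew(M):
    # Skew-symmetric part of a square matrix.
    return 0.5 * (M - M.T)

# Point of SO(n) obtained through the Cayley transform of a skew-symmetric matrix.
S = skew(rng.standard_normal((n, n)))
R = np.linalg.solve(np.eye(n) - S, np.eye(n) + S)

# Arbitrary values of H and of its first and second time derivatives in so(n).
H, Hd, Hdd = (skew(rng.standard_normal((n, n))) for _ in range(3))

# Time derivatives of R according to Equation (167).
Rd = R @ H
Rdd = R @ (H @ H + Hd)
Rddd = R @ (Hdd + 2.0 * H @ Hd + Hd @ H + H @ H @ H)

def proj(R, A):
    # Orthogonal projection onto the tangent space T_R SO(n).
    return 0.5 * A - 0.5 * R @ A.T @ R

acc = Rdd + R @ Rd.T @ Rd                                  # covariant acceleration (157)
d_gamma = Rd @ Rd.T @ Rd + R @ Rdd.T @ Rd + R @ Rd.T @ Rdd # d/dt (R Rd^T Rd), product rule

jolt_165 = Rddd + d_gamma + 0.5 * R @ (acc.T @ Rd + Rd.T @ acc)
jolt_166 = proj(R, Rddd + Rd @ Rd.T @ Rd)
jolt_168 = R @ Hdd + 0.5 * R @ (H @ Hd - Hd @ H)
print(np.linalg.norm(jolt_165 - jolt_168), np.linalg.norm(jolt_166 - jolt_168))
```

Both printed norms are at rounding level, and the matrix obtained by multiplying the jolt on the left by the transpose of R is skew-symmetric, which confirms that the jolt belongs to the tangent space at R.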

The last example considered concerns the space of symmetric, positive-definite matrices

S^{+} (n)

endowed with its canonical metric. In this case,

{\bar{\bar{Γ}}}_{P} (U, V) = - \frac{1}{2} U P^{- 1} V - \frac{1}{2} V P^{- 1} U

, hence

𝔞_{t}^{P} = \ddot{P} - \dot{P} P^{- 1} \dot{P},

(169)

and, consequently,

\begin{matrix} 𝔯_{t}^{P} = & \overset{⃛}{P} + \frac{d}{d t} (- \dot{P} P^{- 1} \dot{P}) - \frac{1}{2} (\ddot{P} - \dot{P} P^{- 1} \dot{P}) P^{- 1} \dot{P} - \frac{1}{2} \dot{P} P^{- 1} (\ddot{P} - \dot{P} P^{- 1} \dot{P}) \\ = & \overset{⃛}{P} - \frac{3}{2} (\ddot{P} P^{- 1} \dot{P} + \dot{P} P^{- 1} \ddot{P}) + 2 \dot{P} P^{- 1} \dot{P} P^{- 1} \dot{P}, \end{matrix}

(170)

a (perhaps unexpectedly) convoluted expression. Notice that, in this case, the Christoffel form is not purely normal, hence the special relationship (160) does not hold.
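The closed-form expression in the second line of (170) may be compared with a direct evaluation of the jolt as the covariant derivative of the acceleration field (169). The sketch below assumes a hypothetical polynomial trajectory P(t) through randomly drawn symmetric coefficients and a central finite difference for the time derivative of the acceleration; the names and the seed are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 3

def sym(M):
    # Symmetric part of a square matrix.
    return 0.5 * (M + M.T)

A = rng.standard_normal((n, n))
P0 = A @ A.T + n * np.eye(n)                        # symmetric, positive-definite
P1, P2, P3 = (sym(rng.standard_normal((n, n))) for _ in range(3))

# Hypothetical cubic trajectory and its exact derivatives.
P = lambda t: P0 + t * P1 + 0.5 * t**2 * P2 + t**3 / 6.0 * P3
Pd = lambda t: P1 + t * P2 + 0.5 * t**2 * P3
Pdd = lambda t: P2 + t * P3

def gamma(Pm, U, V):
    # Christoffel form of the second kind of S^+(n) with its canonical metric.
    Pinv = np.linalg.inv(Pm)
    return -0.5 * (U @ Pinv @ V + V @ Pinv @ U)

def acc(t):
    # Covariant acceleration, Equation (169).
    return Pdd(t) + gamma(P(t), Pd(t), Pd(t))

t0, h = 0.0, 1e-5
acc_dot = (acc(t0 + h) - acc(t0 - h)) / (2.0 * h)
jolt_nested = acc_dot + gamma(P(t0), acc(t0), Pd(t0))   # first line of (170)

Pi, D1, D2 = np.linalg.inv(P(t0)), Pd(t0), Pdd(t0)
jolt_closed = P3 - 1.5 * (D2 @ Pi @ D1 + D1 @ Pi @ D2) + 2.0 * D1 @ Pi @ D1 @ Pi @ D1
print(np.linalg.norm(jolt_nested - jolt_closed))
```

The printed discrepancy is of the order of the finite-difference error, and the resulting jolt is a symmetric matrix, as any tangent vector to the manifold of symmetric positive-definite matrices must be.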

We mention that, if a curve

γ

has unit speed, namely

∥ \dot{γ} ∥_{γ} = 1

, then the norm of its covariant acceleration is taken as a measure of its curvature. Namely, the scalar field

κ_{g}^{γ} : = {∥ \nabla_{t}^{γ} \dot{γ} ∥}_{γ},

(171)

is called the geodesic curvature and quantifies how much a curve differs from a geodesic. Hence a geodesic curve has zero geodesic curvature.
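As an illustration of the definition (171), consider a circle of colatitude θ on the unit sphere, traversed at unit speed. Using the covariant acceleration (161), a direct calculation gives a geodesic curvature equal to |cot θ|, which vanishes on the equator (a great circle, hence a geodesic). The sketch below assumes this specific curve; the function name is illustrative.

```python
import numpy as np

def geodesic_curvature_latitude(theta, t=0.3):
    # Circle of colatitude theta on S^2; omega is chosen so that the speed equals 1.
    s, c = np.sin(theta), np.cos(theta)
    omega = 1.0 / s
    x = np.array([s * np.cos(omega * t), s * np.sin(omega * t), c])
    xd = s * omega * np.array([-np.sin(omega * t), np.cos(omega * t), 0.0])
    xdd = -s * omega**2 * np.array([np.cos(omega * t), np.sin(omega * t), 0.0])
    acc = xdd + x * (xd @ xd)          # covariant acceleration, Equation (161)
    return np.linalg.norm(acc)         # canonical metric: Euclidean norm

for theta in (np.pi / 6, np.pi / 3, np.pi / 2):
    print(theta, geodesic_curvature_latitude(theta), abs(1.0 / np.tan(theta)))
```

The second and third columns of the printed output coincide up to rounding.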

In a similar manner as we have done for the covariant derivatives along a curve, we may compute iterated derivatives of a given vector field along different directions. Notation-wise, given a vector field

w \in Γ (M)

whose covariant derivative is sought and two directions

u, v \in T_{x} M

, a nested derivative will be denoted as

\nabla_{u} \nabla_{v} w

. Let us make such abstract notation become substantial. The vector field w needs to be defined in a neighborhood of the point x. Likewise, it is necessary to set up a vector field

\bar{v} \in Γ (M)

, well-defined in a neighborhood of x, such that

{\bar{v}}_{x} = v

. In this way, the connection

\nabla_{\bar{v}} w \in Γ (M)

yields a vector field, well-defined in a neighborhood of x, which may be further covariantly differentiated along the direction u.

In an even more natural manner, given three vector fields

u, v, w \in Γ (M)

, one may define an iterated connection

\nabla_{u} \nabla_{v} w

, whose result may be expressed as

{(\nabla_{u} \nabla_{v} w)}_{x} = (\partial_{x} {(\nabla_{v} w)}_{x}) (u_{x}) + {\bar{\bar{Γ}}}_{x} ({(\nabla_{v} w)}_{x}, u_{x}) .

(172)

Notice that the term

\partial_{x} (\nabla_{v} w) (u)

involves the directional derivative

\partial_{x} {\bar{\bar{Γ}}}_{x}

of the Christoffel form, as well as the directional derivative of the vector fields w and v, in fact

\begin{matrix} (\partial_{x} {(\nabla_{v} w)}_{x}) (u_{x}) = & \partial_{x} (\partial_{x} w (v_{x}) + {\bar{\bar{Γ}}}_{x} (w_{x}, v_{x})) (u_{x}) \\ = & \partial_{x}^{2} w (v_{x}; u_{x}) + (\partial_{x} w) ((\partial_{x} v) (u_{x})) + (\partial_{x} {\bar{\bar{Γ}}}_{x}) (w_{x}, v_{x}; u_{x}) + \\ {\bar{\bar{Γ}}}_{x} (\partial_{x} w (u_{x}), v_{x}) + {\bar{\bar{Γ}}}_{x} (w_{x}, \partial_{x} v (u_{x})), \end{matrix}

(173)

where

\partial_{x}^{2} w (v_{x}; u_{x})

denotes the second directional derivative of the vector field w and

\partial_{x} ({\bar{\bar{Γ}}}_{x})

denotes the derivative of the Christoffel form with respect to the argument x only. Therefore, the nested connection may effectively be computed through the (a little convoluted) expression

\begin{matrix} {(\nabla_{u} \nabla_{v} w)}_{x} \\ = \partial_{x}^{2} w (v_{x}; u_{x}) + (\partial_{x} w) ((\partial_{x} v) (u_{x})) + (\partial_{x} {\bar{\bar{Γ}}}_{x}) (w_{x}, v_{x}; u_{x}) + \\ {\bar{\bar{Γ}}}_{x} (\partial_{x} w (u_{x}), v_{x}) + {\bar{\bar{Γ}}}_{x} (w_{x}, \partial_{x} v (u_{x})) + {\bar{\bar{Γ}}}_{x} (\partial_{x} w (v_{x}), u_{x}) + {\bar{\bar{Γ}}}_{x} ({\bar{\bar{Γ}}}_{x} (w_{x}, v_{x}), u_{x}), \end{matrix}

(174)

where we have made use repeatedly of the linearity of the Christoffel form of the second kind with respect to its vectorial arguments to extend it to the whole ambient space.

An unavoidable characteristic of the iterated derivative (174) is that it includes not only the rate of change of the vector field w and of the Christoffel form (which describes the underlying manifold) but also the rate of change of the vector field v, which may be an unwanted effect. Therefore, on the basis of the nested covariant derivative (174), one may define a second covariant derivative of the vector field w in the directions

u, v

as follows:

\nabla_{u, v}^{2} w : = \nabla_{u} \nabla_{v} w - \nabla_{\nabla_{u} v} w .

(175)

To verify the effects of such definition, it pays to compute first

\begin{matrix} {(\nabla_{\nabla_{u} v} w)}_{x} = & (\partial_{x} w) ({(\nabla_{u} v)}_{x}) + {\bar{\bar{Γ}}}_{x} (w_{x}, {(\nabla_{u} v)}_{x}) \\ = & (\partial_{x} w) (\partial_{x} v (u_{x}) + {\bar{\bar{Γ}}}_{x} (v_{x}, u_{x})) + {\bar{\bar{Γ}}}_{x} (w_{x}, \partial_{x} v (u_{x})) + {\bar{\bar{Γ}}}_{x} (w_{x}, {\bar{\bar{Γ}}}_{x} (v_{x}, u_{x})), \end{matrix}

(176)

so that the second covariant derivative may be expressed as

\begin{matrix} {(\nabla_{u, v}^{2} w)}_{x} \\ = \partial_{x}^{2} w (v_{x}; u_{x}) + (\partial_{x} w) ((\partial_{x} v) (u_{x})) + (\partial_{x} {\bar{\bar{Γ}}}_{x}) (w_{x}, v_{x}; u_{x}) \\ + {\bar{\bar{Γ}}}_{x} (\partial_{x} w (u_{x}), v_{x}) + {\bar{\bar{Γ}}}_{x} (w_{x}, \partial_{x} v (u_{x})) + {\bar{\bar{Γ}}}_{x} (\partial_{x} w (v_{x}), u_{x}) \\ + {\bar{\bar{Γ}}}_{x} ({\bar{\bar{Γ}}}_{x} (w_{x}, v_{x}), u_{x}) - (\partial_{x} w) (\partial_{x} v (u_{x})) - (\partial_{x} w) ({\bar{\bar{Γ}}}_{x} (v_{x}, u_{x})) \\ - {\bar{\bar{Γ}}}_{x} (w_{x}, \partial_{x} v (u_{x})) - {\bar{\bar{Γ}}}_{x} (w_{x}, {\bar{\bar{Γ}}}_{x} (v_{x}, u_{x})) \\ = \partial_{x}^{2} w (v_{x}; u_{x}) + (\partial_{x} {\bar{\bar{Γ}}}_{x}) (w_{x}, v_{x}; u_{x}) + {\bar{\bar{Γ}}}_{x} ({\bar{\bar{Γ}}}_{x} (w_{x}, v_{x}), u_{x}) \\ + {\bar{\bar{Γ}}}_{x} (\partial_{x} w (u_{x}), v_{x}) + {\bar{\bar{Γ}}}_{x} (\partial_{x} w (v_{x}), u_{x}) - (\partial_{x} w) ({\bar{\bar{Γ}}}_{x} (v_{x}, u_{x})) \\ - {\bar{\bar{Γ}}}_{x} (w_{x}, {\bar{\bar{Γ}}}_{x} (v_{x}, u_{x})) . \end{matrix}

(177)

The second derivative looks ‘purified’ from the rate of change of the vector field v.

Observation 1.

For the mathematically-minded, we underline that there exists a technical reason why the iterated covariant derivative

\nabla_{u} \nabla_{v} w

is, in some sense, not well-defined while the second covariant derivative

\nabla_{u, v}^{2} w

is well defined. Let us recall that the iterated covariant derivative

(\nabla_{u} \nabla_{v} w)

is meant to capture the rate of change, along the direction u, of the rate of change of the vector field w along the direction v, at a given point x. In principle, therefore,

v \in T_{x} M

is a single tangent vector, which need not be associated with any vector field. Nevertheless, in order to evaluate the covariant derivative along the direction u, it is necessary to ‘extend’ the direction v locally to a smooth vector field, say

\bar{v} \in Γ (M)

: such vector field needs only to obey the constraint

{\bar{v}}_{x} \equiv v

, being elsewhere unconstrained. Now, since the nested covariant derivative (174) depends explicitly on

\partial_{x} \bar{v}

, its value depends on the way we have extended the vector v by the field

\bar{v}

, which makes its expression ‘user-dependent’. As opposed to this, the second covariant derivative (177) does not exhibit such kind of dependence, namely the way one chooses the extension

\bar{v}

is irrelevant to the value of the second covariant derivative, which is therefore well-defined (‘user-independent’).
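The content of this observation may be illustrated numerically on the unit sphere, whose Christoffel form reads Γ_x(a, b) = (a^⊤ b) x. The sketch below assumes a hypothetical vector field w, a tangent direction u at a point x0, and a one-parameter family of extensions of the same tangent vector v, all of which agree at x0; directional derivatives are approximated by central differences. All names (`extension`, `nabla`, `const`) are illustrative assumptions.

```python
import numpy as np

x0 = np.array([0.0, 0.0, 1.0])     # base point on S^2
u = np.array([1.0, 0.5, 0.0])      # tangent direction at x0
c = np.array([1.0, 2.0, 3.0])      # hypothetical constant generating the field w
d = np.array([0.5, -1.0, 2.0])     # hypothetical constant generating the vector v

def proj(x, a):
    # Projection of an ambient vector a onto the tangent space at x.
    return a - x * (x @ a)

def gamma(x, a, b):
    # Christoffel form of the second kind of the unit sphere.
    return (a @ b) * x

def w(x):
    return proj(x, c)

def extension(k):
    # Family of extensions of v: every member takes the value proj(x0, d) at x0.
    e = np.array([1.0, 1.0, 0.0])
    f = np.array([2.0, 1.0, -1.0])
    return lambda x: proj(x, d) + k * (e @ (x - x0)) * proj(x, f)

def nabla(F, G, h=1e-4):
    # Field x -> dF(G_x) + Gamma_x(F_x, G_x), with a central-difference derivative.
    def field(x):
        g = G(x)
        return (F(x + h * g) - F(x - h * g)) / (2.0 * h) + gamma(x, F(x), g)
    return field

def const(a):
    # Constant direction; only its value at x0 is used.
    return lambda x: a

def iterated_and_second(vbar):
    iterated = nabla(nabla(w, vbar), const(u))(x0)        # nested derivative, as in (174)
    du_v = nabla(vbar, const(u))(x0)                      # covariant derivative of vbar along u
    second = iterated - nabla(w, const(du_v))(x0)         # definition (175)
    return iterated, second

it0, sec0 = iterated_and_second(extension(0.0))
it1, sec1 = iterated_and_second(extension(1.0))
print(np.linalg.norm(it1 - it0), np.linalg.norm(sec1 - sec0))
```

The first printed number is far from zero (the nested derivative depends on the chosen extension), whereas the second one is at the level of the finite-difference error (the second covariant derivative does not).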

Now, we may wonder whether the second covariant derivative, which depends on the second directional derivative, is independent of the order of derivation, namely if the result changes if one swaps the vectors u and v. From the last expression, we may immediately find that

\begin{matrix} {(\nabla_{v, u}^{2} w)}_{x} = & \partial_{x}^{2} w (u_{x}; v_{x}) + (\partial_{x} {\bar{\bar{Γ}}}_{x}) (w_{x}, u_{x}; v_{x}) + {\bar{\bar{Γ}}}_{x} ({\bar{\bar{Γ}}}_{x} (w_{x}, u_{x}), v_{x}) \\ + {\bar{\bar{Γ}}}_{x} (\partial_{x} w (v_{x}), u_{x}) + {\bar{\bar{Γ}}}_{x} (\partial_{x} w (u_{x}), v_{x}) - (\partial_{x} w) ({\bar{\bar{Γ}}}_{x} (u_{x}, v_{x})) \\ - {\bar{\bar{Γ}}}_{x} (w_{x}, {\bar{\bar{Γ}}}_{x} (u_{x}, v_{x})) . \end{matrix}

(178)

By exploiting the known symmetry of the Christoffel form along with evident cancellations, one immediately finds that

\begin{matrix} {(\nabla_{u, v}^{2} w)}_{x} - {(\nabla_{v, u}^{2} w)}_{x} = & (\partial_{x} {\bar{\bar{Γ}}}_{x}) (w_{x}, v_{x}; u_{x}) - (\partial_{x} {\bar{\bar{Γ}}}_{x}) (w_{x}, u_{x}; v_{x}) + \\ {\bar{\bar{Γ}}}_{x} ({\bar{\bar{Γ}}}_{x} (w_{x}, v_{x}), u_{x}) - {\bar{\bar{Γ}}}_{x} ({\bar{\bar{Γ}}}_{x} (w_{x}, u_{x}), v_{x}) . \end{matrix}

(179)

The net result is that the second covariant derivative does not commute, and the degree of non-commutativity depends solely on the behavior of the Christoffel form of the second kind, while it does not depend at all on the rate of change of the vector field being differentiated (w). A remarkable consequence of the latter observation is that the quantity

{(\nabla_{u, v}^{2} w)}_{x} - {(\nabla_{v, u}^{2} w)}_{x}

is an intrinsic property of manifolds.
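As a worked instance of the relation (179), consider again the unit hypersphere with Γ_x(a, b) = (a^⊤ b) x. For tangent vectors u, v, w at x, the nested terms vanish (since x^⊤ u = x^⊤ v = 0) and the right-hand side of (179) reduces to (w^⊤ v) u − (w^⊤ u) v. The sketch below evaluates the right-hand side of (179) for a generic Christoffel form passed as a function, with its derivative with respect to x approximated by a central difference; the function names are illustrative assumptions.

```python
import numpy as np

def gamma_sphere(x, a, b):
    # Christoffel form of the second kind of the unit hypersphere.
    return (a @ b) * x

def d_gamma(gamma, x, a, b, c, h=1e-5):
    # Derivative of x -> Gamma_x(a, b) along c, with a and b held fixed.
    return (gamma(x + h * c, a, b) - gamma(x - h * c, a, b)) / (2.0 * h)

def noncommutativity(gamma, x, u, v, w):
    # Right-hand side of Equation (179).
    return (d_gamma(gamma, x, w, v, u) - d_gamma(gamma, x, w, u, v)
            + gamma(x, gamma(x, w, v), u) - gamma(x, gamma(x, w, u), v))

x = np.array([0.0, 0.0, 1.0])      # point on S^2
u = np.array([1.0, 0.0, 0.0])      # tangent vectors at x
v = np.array([0.0, 1.0, 0.0])
w = np.array([2.0, -3.0, 0.0])

lhs = noncommutativity(gamma_sphere, x, u, v, w)
rhs = (w @ v) * u - (w @ u) * v
print(lhs, rhs)
```

Both vectors are approximately equal to (−3, −2, 0): the result depends only on the values of u, v, w at x and on the Christoffel form, not on any derivative of the field w.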

Let us underline that, for a torsion-free connection such as the Levi-Civita connection, the non-commutativity measure

\nabla_{u, v}^{2} - \nabla_{v, u}^{2}

may be written in terms of commutators. In fact, from the definition of second covariant derivative (175) we get that

\begin{matrix} \nabla_{u, v}^{2} w - \nabla_{v, u}^{2} w = & \nabla_{u} \nabla_{v} w - \nabla_{v} \nabla_{u} w - \nabla_{\nabla_{u} v} w + \nabla_{\nabla_{v} u} w \\ = & (\nabla_{u} \nabla_{v} - \nabla_{v} \nabla_{u} - \nabla_{\nabla_{u} v - \nabla_{v} u}) w . \end{matrix}

(180)

Now, recalling that

\nabla_{u} v - \nabla_{v} u = [u, v]

and defining the following covariant differential commutator

[\nabla_{u}, \nabla_{v}] : = \nabla_{u} \nabla_{v} - \nabla_{v} \nabla_{u}

, the above relationship may be rewritten compactly and meaningfully as

\nabla_{u, v}^{2} - \nabla_{v, u}^{2} = [\nabla_{u}, \nabla_{v}] - \nabla_{[u, v]},

(181)

which suggests that the lack of commutativity of the second covariant derivative is due to the lack of commutativity of covariant derivation, purged of the lack of commutativity of the vector fields involved.

To conclude this section, let us examine, as an example, how to generalize the relationship (159) to a generic tangent vector field.

Example 18.

Given a smooth vector field

w \in Γ (M)

, its covariant derivative along a smooth curve γ may be expressed through the relation (114) as

\nabla_{t}^{γ} w = {\dot{w}}_{γ} + {\bar{\bar{Γ}}}_{γ} (w_{γ}, \dot{γ})

. Applying again the Equation (114) to the vector field

{\dot{w}}_{γ} + {\bar{\bar{Γ}}}_{γ} (w_{γ}, \dot{γ})

yields

\begin{matrix} \nabla_{t}^{γ} \nabla_{t}^{γ} w = & {\ddot{w}}_{γ} + \frac{d}{d t} {\bar{\bar{Γ}}}_{γ} (w_{γ}, \dot{γ}) + {\bar{\bar{Γ}}}_{γ} ({\dot{w}}_{γ} + {\bar{\bar{Γ}}}_{γ} (w_{γ}, \dot{γ}), \dot{γ}) \\ = & {\ddot{w}}_{γ} + {\bar{\bar{Γ}}}_{γ}^{•} (w_{γ}, \dot{γ}; \dot{γ}) + {\bar{\bar{Γ}}}_{γ} ({\dot{w}}_{γ}, \dot{γ}) \\ + {\bar{\bar{Γ}}}_{γ} (w_{γ}, \ddot{γ}) + {\bar{\bar{Γ}}}_{γ} ({\dot{w}}_{γ}, \dot{γ}) + {\bar{\bar{Γ}}}_{γ} ({\bar{\bar{Γ}}}_{γ} (w_{γ}, \dot{γ}), \dot{γ}) \\ = & {\ddot{w}}_{γ} + {\bar{\bar{Γ}}}_{γ}^{•} (w_{γ}, \dot{γ}; \dot{γ}) + 2 {\bar{\bar{Γ}}}_{γ} ({\dot{w}}_{γ}, \dot{γ}) + {\bar{\bar{Γ}}}_{γ} (w_{γ}, \ddot{γ}) + {\bar{\bar{Γ}}}_{γ} ({\bar{\bar{Γ}}}_{γ} (w_{γ}, \dot{γ}), \dot{γ}), \end{matrix}

(182)

where the following abbreviation has been used

{\bar{\bar{Γ}}}_{γ}^{•} (a, b; \dot{γ}) : = \frac{d}{d t} {\bar{\bar{Γ}}}_{γ} (a, b),

(183)

with

a, b \in A

being arbitrary constants. As we shall see in the next section of this paper, such linear operator is not just a convenient placeholder but plays indeed a fundamental role in quantifying the Riemannian notion of curvature.
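The expanded formula (182) may be cross-checked against a literal double application of the relation (114). The sketch below assumes the unit sphere, for which Γ_γ(a, b) = (a^⊤ b) γ and hence the abbreviation (183) reads (a^⊤ b) γ', together with a hypothetical curve and a hypothetical tangent field along it; time derivatives are approximated by central differences, and all names are illustrative.

```python
import numpy as np

c = np.array([1.0, -2.0, 0.5])     # hypothetical constant generating the field w

def gam(t):
    # Hypothetical curve on S^2: a smooth curve in R^3 normalized to unit length.
    y = np.array([np.cos(t), np.sin(t), 0.5 + 0.3 * t])
    return y / np.linalg.norm(y)

def w(t):
    # Tangent vector field along the curve: projection of c onto the tangent space.
    x = gam(t)
    return c - x * (x @ c)

def d1(g, t, h=1e-4):
    return (g(t + h) - g(t - h)) / (2.0 * h)

def d2(g, t, h=1e-4):
    return (g(t + h) - 2.0 * g(t) + g(t - h)) / h**2

def cov(g):
    # Relation (114) on the sphere: t -> g'(t) + (g(t)^T gamma'(t)) gamma(t).
    return lambda t: d1(g, t) + (g(t) @ d1(gam, t)) * gam(t)

t0 = 0.4
nested = cov(cov(w))(t0)

x, xd, xdd = gam(t0), d1(gam, t0), d2(gam, t0)
w0, wd, wdd = w(t0), d1(w, t0), d2(w, t0)
formula = (wdd
           + (w0 @ xd) * xd                # derivative of the Christoffel form, (183)
           + 2.0 * (wd @ xd) * x           # 2 Gamma(w', gamma')
           + (w0 @ xdd) * x                # Gamma(w, gamma'')
           + (w0 @ xd) * (x @ xd) * x)     # Gamma(Gamma(w, gamma'), gamma')
print(np.linalg.norm(nested - formula))
```

The printed discrepancy is of the order of the finite-difference error.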

4.3. Manifestation of Manifold Curvature in Parallel Transport along a Loop

Let ℓ denote a simple loop, namely a simple curve on a manifold that starts and ends at the same point x and let

w_{x} \in T_{x} M

denote a test tangent vector. Let us parallel-transport the tangent vector

w_{x}

along the closed loop ℓ and let us denote the final result of transport as

w_{ℓ} \in T_{x} M

.

If a manifold is flat, the result of parallel transport along a loop is just the initial vector, namely

w_{ℓ} \equiv w_{x}

. On a curved manifold, this is not necessarily the case, as a manifestation of curvedness. Let us examine such phenomenon, referred to as anholonomy, in detail, in the hypothesis that the closed loop ℓ be very small.

To effect parallel transport along the loop ℓ parameterized by

γ : [0, 1] \to M

, with

γ (0) = x

, we should set up the differential equation

\frac{d}{d t} w_{γ} + {\bar{\bar{Γ}}}_{γ} (w_{γ}, \dot{γ}) = 0

(184)

in the transport field

w \in Γ (M)

, with initial condition

w_{γ (0)} = w_{x}

given. Let us observe that, according to the property of the Christoffel form underlined in Section 3.8, the solution of the above differential equation in

A

is indeed a tangent vector field along

γ

, hence we may write

w_{γ (t)} = w_{x} - \int_{0}^{t} {\bar{\bar{Γ}}}_{γ (τ)} (w_{γ (τ)}, \dot{γ} (τ)) d τ .

(185)

Setting

t = 1

leads to

w_{ℓ} = w_{x} - \oint_{ℓ} {\bar{\bar{Γ}}}_{γ (τ)} (w_{γ (τ)}, \dot{γ} (τ)) d τ,

(186)

since the loop is closed.

Under the hypothesis that the loop is very small, it makes sense to approximate the loop integral above by means of first-order approximations of the Christoffel form as well as the transport field around the foot-point x, namely

\begin{matrix} {\bar{\bar{Γ}}}_{γ (τ)} (a, b) = {\bar{\bar{Γ}}}_{x} (a, b) + {\bar{\bar{Γ}}}_{x}^{•} (a, b; γ (τ) - x) + O (τ^{2}), \end{matrix}

(187)

\begin{matrix} w_{γ (τ)} = w_{x} + τ {\dot{w}}_{x} + O (τ^{2}) = w_{x} - {\bar{\bar{Γ}}}_{x} (w_{x}, γ (τ) - x) + O (τ^{2}), \end{matrix}

(188)

with

τ \in [0, 1]

, where we have made use of the definition (183) and the last equation derives from the relationship (184). Also, we have made use of the approximation

\dot{γ} (τ) = \frac{γ (τ) - x}{τ} + O (τ)

. On the basis of the above approximations, we see that

\begin{matrix} {\bar{\bar{Γ}}}_{γ (τ)} (w_{γ (τ)}, \dot{γ} (τ)) = & {\bar{\bar{Γ}}}_{x} (w_{x} - {\bar{\bar{Γ}}}_{x} (w_{x}, γ (τ) - x) + O (τ^{2}), \dot{γ} (τ)) + \\ {\bar{\bar{Γ}}}_{x}^{•} (w_{x} - {\bar{\bar{Γ}}}_{x} (w_{x}, γ (τ) - x) + O (τ^{2}), \dot{γ} (τ); γ (τ) - x) + O (τ^{2}) \\ = & {\bar{\bar{Γ}}}_{x} (w_{x}, \dot{γ} (τ)) - {\bar{\bar{Γ}}}_{x} ({\bar{\bar{Γ}}}_{x} (w_{x}, γ (τ) - x), \dot{γ} (τ)) + \\ {\bar{\bar{Γ}}}_{x}^{•} (w_{x}, \dot{γ} (τ); γ (τ) - x) - {\bar{\bar{Γ}}}_{x}^{•} ({\bar{\bar{Γ}}}_{x} (w_{x}, γ (τ) - x), \dot{γ} (τ); γ (τ) - x) \\ + O (τ^{2}), \end{matrix}

(189)

therefore, making use of the approximation

{\bar{\bar{Γ}}}_{x}^{•} ({\bar{\bar{Γ}}}_{x} (w_{x}, γ (τ) - x), \dot{γ} (τ); γ (τ) - x) = O (τ^{2})

, we get

\begin{matrix} {\bar{\bar{Γ}}}_{γ (τ)} (w_{γ (τ)}, \dot{γ} (τ)) = & {\bar{\bar{Γ}}}_{x} (w_{x}, \dot{γ} (τ)) - {\bar{\bar{Γ}}}_{x} ({\bar{\bar{Γ}}}_{x} (w_{x}, γ (τ) - x), \dot{γ} (τ)) + \\ {\bar{\bar{Γ}}}_{x}^{•} (w_{x}, \dot{γ} (τ); γ (τ) - x) + O (τ^{2}) . \end{matrix}

(190)

Plugging the above relation into the loop integral (186), we get

\begin{matrix} w_{ℓ} \approx & w_{x} - \oint_{ℓ} {\bar{\bar{Γ}}}_{x} (w_{x}, \dot{γ} (τ)) d τ + \oint_{ℓ} {\bar{\bar{Γ}}}_{x} ({\bar{\bar{Γ}}}_{x} (w_{x}, γ (τ) - x), \dot{γ} (τ)) d τ \\ - \oint_{ℓ} {\bar{\bar{Γ}}}_{x}^{•} (w_{x}, \dot{γ} (τ); γ (τ) - x) d τ . \end{matrix}

(191)

Recalling that the Christoffel form is (bi)linear in its vectorial arguments, it is immediate to see that the first addendum is null, namely

\oint_{ℓ} {\bar{\bar{Γ}}}_{x} (w_{x}, \dot{γ} (τ)) d τ = {\bar{\bar{Γ}}}_{x} (w_{x}, \oint_{ℓ} \dot{γ} (τ) d τ) = {\bar{\bar{Γ}}}_{x} (w_{x}, 0) = 0,

(192)

therefore ultimately we get that the discrepancy between the transported vector after and before transport is quantified by

w_{ℓ} - w_{x} \approx \oint_{ℓ} \{{\bar{\bar{Γ}}}_{x} ({\bar{\bar{Γ}}}_{x} (w_{x}, γ (τ) - x), \dot{γ} (τ)) - {\bar{\bar{Γ}}}_{x}^{•} (w_{x}, \dot{γ} (τ); γ (τ) - x)\} d τ

(193)

up to order

O (τ^{3})

. The above relationship may be written in a more ‘symmetric’ way by the following arguments. Let us first notice that

\oint_{ℓ} \frac{d}{d τ} \{{\bar{\bar{Γ}}}_{x} ({\bar{\bar{Γ}}}_{x} (w_{x}, γ (τ)), γ (τ)) - {\bar{\bar{Γ}}}_{x}^{•} (w_{x}, γ (τ); γ (τ))\} d τ = 0,

(194)

therefore it holds that

\oint_{ℓ} \{{\bar{\bar{Γ}}}_{x} ({\bar{\bar{Γ}}}_{x} (w_{x}, γ), \dot{γ}) - {\bar{\bar{Γ}}}_{x}^{•} (w_{x}, \dot{γ}; γ)\} d τ + \oint_{ℓ} \{{\bar{\bar{Γ}}}_{x} ({\bar{\bar{Γ}}}_{x} (w_{x}, \dot{γ}), γ) - {\bar{\bar{Γ}}}_{x}^{•} (w_{x}, γ; \dot{γ})\} d τ = 0 .

(195)

In addition, notice that, by linearity and closedness of the integration path, it holds that

\begin{matrix} \oint_{ℓ} {\bar{\bar{Γ}}}_{x} ({\bar{\bar{Γ}}}_{x} (w_{x}, x), \dot{γ}) d τ = \oint_{ℓ} {\bar{\bar{Γ}}}_{x} ({\bar{\bar{Γ}}}_{x} (w_{x}, \dot{γ}), x) d τ = 0, \end{matrix}

(196)

\begin{matrix} \oint_{ℓ} {\bar{\bar{Γ}}}_{x}^{•} (w_{x}, \dot{γ}; x) d τ = \oint_{ℓ} {\bar{\bar{Γ}}}_{x}^{•} (w_{x}, x; \dot{γ}) d τ = 0, \end{matrix}

(197)

therefore, the relationship (195) may be rewritten as

\begin{matrix} \oint_{ℓ} \{{\bar{\bar{Γ}}}_{x} ({\bar{\bar{Γ}}}_{x} (w_{x}, γ - x), \dot{γ}) - {\bar{\bar{Γ}}}_{x}^{•} (w_{x}, \dot{γ}; γ - x)\} d τ = \\ - \oint_{ℓ} \{{\bar{\bar{Γ}}}_{x} ({\bar{\bar{Γ}}}_{x} (w_{x}, \dot{γ}), γ - x) - {\bar{\bar{Γ}}}_{x}^{•} (w_{x}, γ - x; \dot{γ})\} d τ . \end{matrix}

(198)

Through this equation, the discrepancy (193) may be rewritten as

\begin{matrix} w_{ℓ} - w_{x} & \approx \frac{1}{2} \oint_{ℓ} \{{\bar{\bar{Γ}}}_{x} ({\bar{\bar{Γ}}}_{x} (w_{x}, γ - x), \dot{γ}) + {\bar{\bar{Γ}}}_{x}^{•} (w_{x}, γ - x; \dot{γ})\} d τ \\ - \frac{1}{2} \oint_{ℓ} \{{\bar{\bar{Γ}}}_{x} ({\bar{\bar{Γ}}}_{x} (w_{x}, \dot{γ}), γ - x) + {\bar{\bar{Γ}}}_{x}^{•} (w_{x}, \dot{γ}; γ - x)\} d τ . \end{matrix}

(199)

One may hence infer that the integrand is responsible for the discrepancy between

w_{ℓ}

and

w_{x}

. In addition, we notice informally that the integrand appears to have exactly the same structure as the right-hand side of the relation (179). (Moreover, from the above relation it is easily guessed that the amount of discrepancy increases with the size of the loop.)
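The anholonomy discussed in this section may be observed numerically by integrating the transport Equation (184) around a closed loop. The sketch below assumes the unit sphere, with Γ_x(a, b) = (a^⊤ b) x, a circle of colatitude θ as the loop (parameterized over [0, 2π] rather than [0, 1], which does not affect the result of parallel transport) and a classical fourth-order Runge–Kutta integrator; the function name and the number of steps are illustrative choices.

```python
import numpy as np

def transport_around_latitude(theta, w0, steps=2000):
    # Loop: circle of colatitude theta on S^2, parameterized by tau in [0, 2*pi].
    s, c = np.sin(theta), np.cos(theta)
    gam = lambda tau: np.array([s * np.cos(tau), s * np.sin(tau), c])
    gam_dot = lambda tau: np.array([-s * np.sin(tau), s * np.cos(tau), 0.0])
    # Equation (184) with Gamma_x(a, b) = (a^T b) x.
    rhs = lambda tau, w: -(w @ gam_dot(tau)) * gam(tau)
    h = 2.0 * np.pi / steps
    w, tau = np.array(w0, dtype=float), 0.0
    for _ in range(steps):  # classical fourth-order Runge-Kutta step
        k1 = rhs(tau, w)
        k2 = rhs(tau + 0.5 * h, w + 0.5 * h * k1)
        k3 = rhs(tau + 0.5 * h, w + 0.5 * h * k2)
        k4 = rhs(tau + h, w + h * k3)
        w = w + (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
        tau += h
    return w

# Test vector, tangent to the sphere at the foot-point (sin theta, 0, cos theta).
w_x = np.array([0.0, 1.0, 0.0])
for theta in (0.1, 0.2, 0.4):
    w_l = transport_around_latitude(theta, w_x)
    angle = np.arccos(np.clip(w_x @ w_l, -1.0, 1.0))
    area = 2.0 * np.pi * (1.0 - np.cos(theta))   # area of the spherical cap enclosed by the loop
    print(theta, np.linalg.norm(w_l - w_x), angle, area)
```

For these small loops, the angle between the transported vector and the initial vector matches the area enclosed by the loop, and the discrepancy grows with the size of the loop, in agreement with the qualitative remark above.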

4.4. Riemannian Curvature Endomorphism

As we have seen, given a closed loop ℓ with foot-point

x \in M

and a tangent vector

w_{x} \in T_{x} M

, the parallel transport of the latter along the closed loop results in a tangent vector

w_{ℓ} \in T_{x} M

that differs from

w_{x}

. Since, however, both vectors

w_{x}

and

w_{ℓ}

belong to the same tangent space (irrespective of how large the loop is), one may postulate the existence of a function, call it

R_{x} : T_{x} M \to T_{x} M

, that transforms

w_{x}

to

w_{ℓ}

. Such function is called curvature endomorphism and we write

w_{ℓ} = R_{x} w_{x}

.

As is customary, we are going to determine the structure of the curvature endomorphism for a special loop, namely a parallelogram. In order to define a parallelogram on a smooth manifold, let us define the following elements:

four points $x, a, b, c \in M$ that denote the vertices of the parallelogram in $M$ around the point x,
two tangent vectors $u, v \in T_{x} M$ that define the ‘sides’ of the parallelogram,
a smooth curve $α : [0, 1] \to M$ that joins the point x to the point a, such that $α (0) = x$ and $\dot{α} (0) = u$ ; the curve $α$ is parametrized through the parameter s, namely a point on such curve is denoted by $α (s)$ ,
a smooth curve $β : [0, 1] \to M$ that joins the point a to the point c, such that $β (0) = α (s)$ and $\dot{β} (0) = P_{α}^{0 \to s} (v)$ ; the curve $β$ is parametrized through the parameter t, namely a point on such curve is denoted by $β (t)$ ,
a smooth curve $λ : [0, 1] \to M$ that joins the point x to the point b, such that $λ (0) = x$ and $\dot{λ} (0) = v$ ; the curve $λ$ is parametrized through the parameter t, namely a point on such curve is denoted by $λ (t)$ ,
a smooth curve $γ : [0, 1] \to M$ that joins the point b to the point c, such that $γ (0) = λ (t)$ and $\dot{γ} (0) = P_{λ}^{0 \to t} (u)$ ; the curve $γ$ is parametrized through the parameter s, namely a point on such curve is denoted by $γ (s)$ .

Notice that, by consistency, the point

β (t)

on the parallelogram coincides with the point

γ (s)

which, in turn, coincides with the point c, which lies at the ‘opposite’ corner to the point x in such parallelogram. Also, notice that the extension of the parallelogram depends on the values of the parameters s and t, hence such parallelogram may be made as small as one wishes by reducing the values of such parameters.

It will be necessary to make use of local extensions of the vectors

u, v

and of the test tangent vector

w \in T_{x} M

being parallel-transported. In order not to burden the notation with extra symbols, we shall denote such (arbitrary) extensions by

U, V, W \in Γ (M)

, respectively. These vector fields may be chosen arbitrarily as long as they are sufficiently regular to afford derivation and

U_{x} = u

,

V_{x} = v

,

W_{x} = w

.

As a further introductory element, let us recall a couple of very useful first-order approximations. Let

ρ : [- ϵ, ϵ] \to M

denote a smooth curve in

M

and

F \in Γ (M)

denote a smooth vector field. From the definition of covariant derivative of a vector field along a given direction we get the following first-order approximation of the vector field F at

ρ (0)

:

(A 1) : P_{ρ}^{τ \to 0} F_{ρ (τ)} = F_{ρ (0)} + τ \nabla_{\dot{ρ} (0)} F + O (τ^{2}) .

(200)

This relationship may be rewritten equivalently as

F_{ρ (τ)} = P_{ρ}^{0 \to τ} F_{ρ (0)} + τ P_{ρ}^{0 \to τ} (\nabla_{\dot{ρ} (0)} F) + O (τ^{2})

, which appears as a sort of Taylor expansion of a vector field along a curve departing from a given point. Such expression leads to the further approximation

(A 2) : P_{ρ}^{0 \to τ} F_{ρ (0)} = F_{ρ (τ)} - τ P_{ρ}^{0 \to τ} (\nabla_{\dot{ρ} (0)} F) + O (τ^{2}) .

(201)

The approximations

(A 1)

and

(A 2)

will prove instrumental in writing down the following necessary relations seamlessly.

The main idea in defining the curvature tensor is that transporting a tangent vector along a parallelogram and comparing the starting vector with the resulting vector is equivalent to transporting the same vector along two paths, namely along

x \to a \to c

and along

x \to b \to c

and then comparing the two results at c. Formally, we hence define the Riemannian curvature endomorphism as

R_{x}^{u, v} (w) : = lim_{s, t \to 0} \frac{P_{α}^{s \to 0} P_{β}^{t \to 0} W_{β (t)} - P_{λ}^{t \to 0} P_{γ}^{s \to 0} W_{γ (s)}}{s t} .

(202)

For the first branch, we have

\begin{matrix} P_{α}^{s \to 0} P_{β}^{t \to 0} W_{β (t)} = & P_{α}^{s \to 0} (W_{β (0)} + t \nabla_{\dot{β} (0)} W + O (t^{2})) \\ = & P_{α}^{s \to 0} W_{α (s)} + t P_{α}^{s \to 0} (\nabla_{P_{α}^{0 \to s} (v)} W) + O (t^{2}) \\ = & W_{α (0)} + s \nabla_{\dot{α} (0)} W + O (s^{2}) + t P_{α}^{s \to 0} (\nabla_{V_{α (s)} - s P_{α}^{0 \to s} (\nabla_{\dot{α} (0)} V) + O (s^{2})} W) \\ + O (t^{2}) \\ = & w + s \nabla_{u} W + O (s^{2}) + t P_{α}^{s \to 0} (\nabla_{V_{α (s)}} W - s \nabla_{P_{α}^{0 \to s} (\nabla_{u} V)} W) + O (t^{2}), \end{matrix}

where the first equality follows from the relation

A 1

and the third equality follows from the relations

A 1

and

A 2

. As a consequence, it holds that

\begin{matrix} P_{α}^{s \to 0} P_{β}^{t \to 0} W_{β (t)} = & w + s \nabla_{u} W + t P_{α}^{s \to 0} (\nabla_{V_{α (s)}} W) - s t P_{α}^{s \to 0} (\nabla_{P_{α}^{0 \to s} (\nabla_{u} V)} W) \\ + O (s^{2}) + O (t^{2}) . \end{matrix}

(203)

Analogously, for the second branch we get

\begin{matrix} P_{λ}^{t \to 0} P_{γ}^{s \to 0} W_{γ (s)} = & w + t \nabla_{v} W + s P_{λ}^{t \to 0} (\nabla_{U_{λ (t)}} W) - s t P_{λ}^{t \to 0} (\nabla_{P_{λ}^{0 \to t} (\nabla_{v} U)} W) \\ + O (s^{2}) + O (t^{2}) . \end{matrix}

(204)

Hence, after canceling out equal terms and grouping, we get

\begin{matrix} P_{α}^{s \to 0} P_{β}^{t \to 0} W_{β (t)} - P_{λ}^{t \to 0} P_{γ}^{s \to 0} W_{γ (t)} = \\ t \{P_{α}^{s \to 0} (\nabla_{V_{α (s)}} W) - \nabla_{v} W\} - s \{P_{λ}^{t \to 0} (\nabla_{U_{λ (t)}} W) - \nabla_{u} W\} \\ - s t \{P_{α}^{s \to 0} (\nabla_{P_{α}^{0 \to s} (\nabla_{u} V)} W) - P_{λ}^{t \to 0} (\nabla_{P_{λ}^{0 \to t} (\nabla_{v} U)} W)\} + O (s^{2}) + O (t^{2}) . \end{matrix}

(205)

Dividing both sides by the product

s t

and taking the limit for

s, t

that vanish to zero, we get

\begin{matrix} lim_{s, t \to 0} \frac{P_{α}^{s \to 0} P_{β}^{t \to 0} W_{β (t)} - P_{λ}^{t \to 0} P_{γ}^{s \to 0} W_{γ (t)}}{s t} = \\ lim_{s \to 0} \frac{P_{α}^{s \to 0} (\nabla_{V_{α (s)}} W) - \nabla_{v} W}{s} - lim_{t \to 0} \frac{P_{λ}^{t \to 0} (\nabla_{U_{λ (t)}} W) - \nabla_{u} W}{t} \\ - (\nabla_{\nabla_{u} V} W - \nabla_{\nabla_{v} U} W) . \end{matrix}

(206)

In conclusion, upon computing the above limits, we get the expression of the curvature endomorphism applied to a tangent vector as follows

\begin{matrix} R_{x}^{u, v} (w) = & {(\nabla_{u} \nabla_{V} W)}_{x} - {(\nabla_{v} \nabla_{U} W)}_{x} - {(\nabla_{\nabla_{u} V - \nabla_{v} U} W)}_{x}, \\ = & {(\nabla_{u} \nabla_{V} W)}_{x} - {(\nabla_{v} \nabla_{U} W)}_{x} - {(\nabla_{{[U, V]}_{x}} W)}_{x} . \end{matrix}

(207)

At first glance, it is not apparent that the combination of differential operators on the right-hand side of Equation (207) should equal a linear endomorphism as on the left-hand side. However, from Section 4.2 we know that not only is this indeed the case, but also that the right-hand side is completely independent of the way we choose the extensions

U, V, W

. In addition, from Section 4.3, we know that the expression on the right-hand side is exactly the combination of operators

{\bar{\bar{Γ}}}^{•}

and

\bar{\bar{Γ}} (\bar{\bar{Γ}}, 🟉)

that quantifies the curvedness of a manifold in the parallel transport around a tiny loop. As a matter of fact, the Riemannian curvature endomorphism, applied to a test tangent vector

w \in T_{x} M

, may be written explicitly (and solely) in terms of the Christoffel form of the second kind, as

R_{x}^{u, v} (w) = {\bar{\bar{Γ}}}_{x}^{•} (w, v; u) - {\bar{\bar{Γ}}}_{x}^{•} (w, u; v) + {\bar{\bar{Γ}}}_{x} ({\bar{\bar{Γ}}}_{x} (w, v), u) - {\bar{\bar{Γ}}}_{x} ({\bar{\bar{Γ}}}_{x} (w, u), v) .

(208)

Let us apply the above formula to compute the curvature of a few manifolds of interest.

Example 19.

The first case examined is that of a unit hyper-sphere endowed with its canonical metric. Let us recall that

{\bar{\bar{Γ}}}_{x} (u, v) = x (u^{⊤} v)

, hence

{\bar{\bar{Γ}}}_{x}^{•} (u, v; w) = w (u^{⊤} v) .

(209)

Applying the Equation (208) yields

R_{x}^{u, v} (w) = u (w^{⊤} v) - v (w^{⊤} u) + x ({(x (w^{⊤} v))}^{⊤} u) - x ({(x (w^{⊤} u))}^{⊤} v) .

(210)

Since the last two addenda are null because

x^{⊤} u = x^{⊤} v = 0

, we are left with

R_{x}^{u, v} (w) = (w^{⊤} v) u - (w^{⊤} u) v .

(211)

Notice that the result does not depend explicitly on the point x, hence such Riemannian curvature endomorphism is uniform (in the sense that it takes the same expression in all tangent spaces). It is interesting to observe that the vector

R_{x}^{u, v} (w)

belongs to the subspace

span \{u, v\} \subset T_{x} M

.
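The computation above can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the helper names (`gamma`, `gamma_dot`, `curvature_208`, `tangent`) are hypothetical and introduced only for illustration. It assembles the right-hand side of Equation (208) from the Christoffel form of the hyper-sphere and compares it with the closed form (211), and it also checks that the result lies in the span of u and v.

```python
import numpy as np

def gamma(x, u, v):
    """Christoffel form of the second kind on the unit hyper-sphere: x (u^T v)."""
    return x * (u @ v)

def gamma_dot(x, u, v, w):
    """Derivative of the Christoffel form with respect to x along w, as in (209)."""
    return w * (u @ v)

def curvature_208(x, u, v, w):
    """Curvature endomorphism assembled term by term as in Equation (208)."""
    return (gamma_dot(x, w, v, u) - gamma_dot(x, w, u, v)
            + gamma(x, gamma(x, w, v), u) - gamma(x, gamma(x, w, u), v))

def tangent(x, a):
    """Project an ambient vector a onto the tangent space at x (assumes |x| = 1)."""
    return a - x * (x @ a)

rng = np.random.default_rng(0)
x = rng.standard_normal(4)
x /= np.linalg.norm(x)                      # random point on the unit sphere in R^4
u, v, w = (tangent(x, rng.standard_normal(4)) for _ in range(3))

r_208 = curvature_208(x, u, v, w)
r_211 = (w @ v) * u - (w @ u) * v           # closed form (211)
err_sphere = np.linalg.norm(r_208 - r_211)

# Least-squares fit of r_208 on the basis {u, v}: a vanishing residual means
# that the curvature vector belongs to span{u, v}.
basis = np.column_stack([u, v])
coeffs = np.linalg.lstsq(basis, r_208, rcond=None)[0]
res_span = np.linalg.norm(basis @ coeffs - r_208)

print(err_sphere, res_span)
```

Both printed quantities are expected to be at the level of floating-point round-off.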

Next we tackle the case of the special orthogonal group

SO (n)

endowed with its canonical metric. Let us recall again that the Christoffel form of the second kind reads

{\bar{\bar{Γ}}}_{R} (U, V) = \frac{1}{2} R (U^{⊤} V + V^{⊤} U)

, therefore

{\bar{\bar{Γ}}}_{R}^{•} (U, V; W) = \frac{1}{2} W (U^{⊤} V + V^{⊤} U) .

(212)

Applying the Equation (208) gives

\begin{matrix} R_{R}^{U, V} (W) = & \frac{1}{2} U (W^{⊤} V + V^{⊤} W) - \frac{1}{2} V (W^{⊤} U + U^{⊤} W) \\ & + \frac{1}{2} R \{\frac{1}{2} {(W^{⊤} V + V^{⊤} W)}^{⊤} R^{⊤} U + \frac{1}{2} U^{⊤} R (W^{⊤} V + V^{⊤} W)\} \\ & - \frac{1}{2} R \{\frac{1}{2} {(W^{⊤} U + U^{⊤} W)}^{⊤} R^{⊤} V + \frac{1}{2} V^{⊤} R (W^{⊤} U + U^{⊤} W)\} \\ = & \frac{1}{2} U W^{⊤} V + \frac{1}{2} U V^{⊤} W - \frac{1}{2} V W^{⊤} U - \frac{1}{2} V U^{⊤} W \\ & + \frac{1}{4} R W^{⊤} V R^{⊤} U + \frac{1}{4} R V^{⊤} W R^{⊤} U + \frac{1}{4} R U^{⊤} R W^{⊤} V + \frac{1}{4} R U^{⊤} R V^{⊤} W \\ & - \frac{1}{4} R W^{⊤} U R^{⊤} V - \frac{1}{4} R U^{⊤} W R^{⊤} V - \frac{1}{4} R V^{⊤} R W^{⊤} U - \frac{1}{4} R V^{⊤} R U^{⊤} W . \end{matrix}

(213)

The addenda in the last expression do not look like they are going to add up and cancel nicely. Easy but lengthy calculations show that, indeed, they do. We are going to show only a few of the many transformations that are necessary:

\begin{matrix} R W^{⊤} (V R^{⊤}) U = - R (W^{⊤} R) V^{⊤} U = (R R^{⊤}) W V^{⊤} U = W V^{⊤} U, \\ R V^{⊤} (W R^{⊤}) U = - R (V^{⊤} R) W^{⊤} U = (R R^{⊤}) V W^{⊤} U = V W^{⊤} U \\ (R U^{⊤}) R W^{⊤} V = - U (R^{⊤} R) W^{⊤} V = - U W^{⊤} V . \end{matrix}

The net result is

R_{R}^{U, V} (W) = \frac{1}{4} \{(U V^{⊤} - V U^{⊤}) W + W (V^{⊤} U - U^{⊤} V)\}

(214)

By some matrix manipulations similar to the above and by introducing the

T SO (n)

-commutator

{[U, V]}_{R} : = R [R^{⊤} U, R^{⊤} V]

based on the matrix commutator

[A, B] : = A B - B A

, the above relationship may be rewritten as

R_{R}^{U, V} (W) = - \frac{1}{4} {[{[U, V]}_{R}, W]}_{R} .

(215)

Such expression is precisely the one reported in [22].
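Since the cancellations leading from (213) to (214) and (215) are lengthy, a numerical cross-check may be reassuring. The following is a minimal sketch, assuming NumPy; all function names are hypothetical and the random orthogonal matrix is generated through a QR factorization purely for convenience. It evaluates Equation (208) with the Christoffel form of SO(n) and compares the outcome with the closed forms (214) and (215).

```python
import numpy as np

def skew(rng, n):
    """Random skew-symmetric matrix."""
    a = rng.standard_normal((n, n))
    return a - a.T

def gamma(R, U, V):
    """Christoffel form of the second kind on SO(n)."""
    return 0.5 * R @ (U.T @ V + V.T @ U)

def gamma_dot(R, U, V, W):
    """Derivative of the Christoffel form with respect to R along W, as in (212)."""
    return 0.5 * W @ (U.T @ V + V.T @ U)

def curvature_208(R, U, V, W):
    """Curvature endomorphism assembled term by term as in Equation (208)."""
    return (gamma_dot(R, W, V, U) - gamma_dot(R, W, U, V)
            + gamma(R, gamma(R, W, V), U) - gamma(R, gamma(R, W, U), V))

def curvature_214(U, V, W):
    """Closed form (214)."""
    return 0.25 * ((U @ V.T - V @ U.T) @ W + W @ (V.T @ U - U.T @ V))

def bracket(R, U, V):
    """T SO(n)-commutator [U, V]_R := R [R^T U, R^T V]."""
    A, B = R.T @ U, R.T @ V
    return R @ (A @ B - B @ A)

def curvature_215(R, U, V, W):
    """Closed form (215)."""
    return -0.25 * bracket(R, bracket(R, U, V), W)

rng = np.random.default_rng(1)
n = 4
R, _ = np.linalg.qr(rng.standard_normal((n, n)))
if np.linalg.det(R) < 0:                    # flip one column to land in SO(n)
    R[:, 0] = -R[:, 0]
# Tangent vectors at R have the form R times a skew-symmetric matrix.
U, V, W = (R @ skew(rng, n) for _ in range(3))

err_214 = np.linalg.norm(curvature_208(R, U, V, W) - curvature_214(U, V, W))
err_215 = np.linalg.norm(curvature_208(R, U, V, W) - curvature_215(R, U, V, W))
print(err_214, err_215)
```

Both discrepancies are expected to be at round-off level for any tangent vectors U, V, W at R.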

As a last example, let us consider the space of symmetric, positive-definite matrices

S^{+} (n)

endowed with its canonical metric, for which we had found that

{\bar{\bar{Γ}}}_{P} (U, V) = - \frac{1}{2} U P^{- 1} V - \frac{1}{2} V P^{- 1} U

. The directional derivative of the Christoffel form reads

{\bar{\bar{Γ}}}_{P}^{•} (U, V; W) = \frac{1}{2} U P^{- 1} W P^{- 1} V + \frac{1}{2} V P^{- 1} W P^{- 1} U .

(216)

By applying the Equation (208) we obtain an expression that consists of 16 terms, which however either cancel or add up nicely to one another, resulting in the final expression

R_{P}^{U, V} (W) = \frac{1}{4} (V P^{- 1} U - U P^{- 1} V) P^{- 1} W + \frac{1}{4} W P^{- 1} (U P^{- 1} V - V P^{- 1} U),

(217)

which coincides with the one reported in [22].
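The reduction of Equation (208) to the compact expression (217) can likewise be verified numerically. The sketch below assumes NumPy; the function names and the way the random positive-definite matrix P is built are illustrative choices, not part of the original development.

```python
import numpy as np

def sym(rng, n):
    """Random symmetric matrix (a tangent vector to the SPD manifold)."""
    a = rng.standard_normal((n, n))
    return 0.5 * (a + a.T)

def gamma(P, U, V):
    """Christoffel form of the second kind on the SPD manifold."""
    Pi = np.linalg.inv(P)
    return -0.5 * (U @ Pi @ V + V @ Pi @ U)

def gamma_dot(P, U, V, W):
    """Directional derivative of the Christoffel form along W, as in (216)."""
    Pi = np.linalg.inv(P)
    return 0.5 * (U @ Pi @ W @ Pi @ V + V @ Pi @ W @ Pi @ U)

def curvature_208(P, U, V, W):
    """Curvature endomorphism assembled term by term as in Equation (208)."""
    return (gamma_dot(P, W, V, U) - gamma_dot(P, W, U, V)
            + gamma(P, gamma(P, W, V), U) - gamma(P, gamma(P, W, U), V))

def curvature_217(P, U, V, W):
    """Closed form (217)."""
    Pi = np.linalg.inv(P)
    return (0.25 * (V @ Pi @ U - U @ Pi @ V) @ Pi @ W
            + 0.25 * W @ Pi @ (U @ Pi @ V - V @ Pi @ U))

rng = np.random.default_rng(2)
n = 3
A = rng.standard_normal((n, n))
P = A @ A.T + n * np.eye(n)                 # symmetric, positive-definite by construction
U, V, W = (sym(rng, n) for _ in range(3))

err_spd = np.linalg.norm(curvature_208(P, U, V, W) - curvature_217(P, U, V, W))
print(err_spd)
```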

The Riemannian curvature endomorphism enjoys, together with linearity in its vectorial argument, a number of symmetries. We are going to recall here only those that we are going to actually utilize in Section 4.5, namely

\begin{matrix} ⟨ R^{u, v} (w), z ⟩ = - ⟨ R^{v, u} (w), z ⟩, \\ ⟨ R^{u, v} (w), z ⟩ = - ⟨ R^{u, v} (z), w ⟩ . \end{matrix}

(218)

In particular, one might notice that

⟨ R^{v, v} (🟉), 🟉 ⟩ = 0

and

⟨ R^{🟉, 🟉} (v), v ⟩ = 0

.
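The symmetries (218) can be observed on a concrete case. The following minimal sketch, assuming NumPy, uses the closed form (214) for SO(n); the inner product on the tangent spaces is taken here to be the Frobenius one, tr(U^T V), which is an assumption made for illustration, and all names are hypothetical.

```python
import numpy as np

def curvature(U, V, W):
    """Curvature endomorphism of SO(n) in the closed form (214)."""
    return 0.25 * ((U @ V.T - V @ U.T) @ W + W @ (V.T @ U - U.T @ V))

def inner(U, V):
    """Frobenius inner product, assumed here as the metric on T_R SO(n)."""
    return np.trace(U.T @ V)

def skew(rng, n):
    a = rng.standard_normal((n, n))
    return a - a.T

rng = np.random.default_rng(3)
n = 3
R, _ = np.linalg.qr(rng.standard_normal((n, n)))   # orthogonal base point
U, V, W, Z = (R @ skew(rng, n) for _ in range(4))  # tangent vectors at R

lhs = inner(curvature(U, V, W), Z)
swap_uv = inner(curvature(V, U, W), Z)   # expected to equal -lhs
swap_wz = inner(curvature(U, V, Z), W)   # expected to equal -lhs
same_uv = inner(curvature(V, V, W), Z)   # expected to vanish
print(lhs, swap_uv, swap_wz, same_uv)
```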

4.5. Tidal Effects, Geodesic Deviation, Jacobi Fields, Sectional Curvature

A massive body immersed in a spatially-varying gravitational field is subjected to tidal effects. In fact, since different parts of the body experience a gravitational pull that varies from point to point, the whole body is subjected to internal mechanical stress.

Formally, such physical effect has an analogue in manifold calculus in the separation of nearby geodesics, which may accelerate or decelerate according to the local curvature of a manifold. To quantify such effect, let us define a smooth family of geodesics

γ_{s} (t)

where

for a fixed value of the index $s \in [- ϵ, ϵ]$ , the curve $γ_{s} (t)$ represents a geodesic parametrized by the parameter $t \in [0, 1]$ ; the geodesic curve $γ_{0} (t)$ represents the fiducial geodesic; each tangent vector $v_{s} (t) : = \frac{\partial γ_{s} (t)}{\partial t}$ denotes the speed of the geodesic (we shall omit the values of the parameters for the sake of notation conciseness); all geodesics depart from the same point $x \in M$ , namely $γ_{s} (0) = x$ for any s;
for a fixed value of the parameter $t \in [0, 1]$ , the function $γ_{s} (t)$ represents a smooth curve (in general, not a geodesic); each tangent vector $ξ_{s} (t) : = \frac{\partial γ_{s} (t)}{\partial s}$ represents the separation between two nearby geodesics (in this case too, we shall omit the values of the parameters for the sake of notation conciseness); more intuitively, the infinitesimal separation between nearby geodesics whose indices s differ by an infinitesimal $d s$ is $ξ d s$ . Since all geodesics depart from the same point, it holds that $ξ_{s} (0) = 0$ for every value of the index s.

Let us observe that the vector fields v and

ξ

commute, in fact

[v, ξ] = \frac{\partial^{2}}{\partial t \partial s} γ_{s} (t) - \frac{\partial^{2}}{\partial s \partial t} γ_{s} (t) = 0

(219)

by the Schwarz-Clairaut theorem.

The vector

ξ

may be referred to as geodesic deviation. It turns out that such deviation obeys a very neat equation involving the curvature endomorphism. The velocity with which the separation

ξ

between two nearby geodesics changes is quantified by the covariant derivative

\nabla_{v} ξ

, while the acceleration of the geodesic deviation is quantified by

\nabla_{v} \nabla_{v} ξ

. Let us recall that, for a torsion-free connection, it holds that

\nabla_{v} ξ - \nabla_{ξ} v = [v, ξ]

. Since

[v, ξ] = 0

, it follows that

\nabla_{v} ξ = \nabla_{ξ} v

, hence

\nabla_{v} \nabla_{v} ξ = \nabla_{v} \nabla_{ξ} v .

(220)

From the Equation (207) it follows that

\nabla_{v} (\nabla_{ξ} v) - \nabla_{ξ} (\nabla_{v} v) - \nabla_{[v, ξ]} v = R^{v, ξ} (v),

(221)

where we omitted the base-point x. Since

[v, ξ] = 0

and

\nabla_{v} v = 0

because v is the velocity vector of a geodesic, it ultimately follows that

\nabla_{v} \nabla_{v} ξ = R^{v, ξ} (v),

(222)

which confirms that the acceleration with which two nearby geodesics originating from the same point deviate from one another depends on the local curvature of the underlying manifold.

The Equation (222) is called a Jacobi equation and its solution

ξ

is termed a Jacobi field. In some circumstances, the Jacobi equation may assume a particularly simple form. To survey one such case, let us introduce the notion of sectional curvature.

The curvature endomorphism describes completely the curvature of a manifold in a given point, however the local curvature of a manifold is more easily described by a scalar index, the sectional curvature

K : {(T M)}^{2} \to R

, defined as

K_{x}^{u, v} : = \frac{{⟨ R_{x}^{u, v} (v), u ⟩}_{x}}{{⟨ u, u ⟩}_{x} {⟨ v, v ⟩}_{x} - {⟨ u, v ⟩}_{x}^{2}} .

(223)

Although the sectional curvature returns a scalar value, it by no means carries diminished information about the curvature of a manifold: in fact, it is possible to recover the curvature endomorphism from the sectional curvature. A special case that serves the purpose of illustrating such property is that of spaces of constant sectional curvature. A manifold

M

possesses a constant sectional curvature

κ \in R

if, for every

x \in M

and every

u, v \in T_{x} M

, it holds that

K_{x}^{u, v} = κ

. In this case, the curvature endomorphism is given by

R_{x}^{u, v} (w) = κ (u {⟨ v, w ⟩}_{x} - v {⟨ u, w ⟩}_{x}) .

(224)

Let us illustrate the notion of constant curvature by a brief example.

Example 20.

Let us consider the hyper-sphere

S^{n - 1}

endowed with its canonical metric. We already know that its curvature may be expressed as

R_{x}^{u, v} (w) = u {⟨ v, w ⟩}_{x} - v {⟨ u, w ⟩}_{x}

hence, from the property (224), we conclude immediately that the hyper-sphere is indeed a manifold of constant sectional curvature

κ = 1

.

For a manifold of constant sectional curvature

κ

, the Jacobi equation reads

\nabla_{v} \nabla_{v} ξ = κ (⟨ ξ, v ⟩ v - {∥ v ∥}^{2} ξ),

(225)

which is clearly a linear equation in the unknown Jacobi field

ξ

.

In any case, if one chooses u and v orthogonal to each other and with unit norm, it is immediately seen that

K_{x}^{u, v} = {⟨ R_{x}^{u, v} (v), u ⟩}_{x}

, hence the sectional curvature and the Riemannian curvature essentially coincide.

The sectional curvature plays an important role in evaluating whether the acceleration of the geodesic deviation will increase or decrease (locally). To formalize such observation, let us define the quantity

ℓ^{2} : = ⟨ ξ, ξ ⟩

and let us evaluate its Taylor series expansion up to the fourth order. Here

ξ

denotes the solution of the Jacobi equation with initial conditions

\nabla_{v} ξ = w

for

t = 0

, with

∥ w ∥ = 1

together with

ξ = 0

for

t = 0

. Notice that

\begin{matrix} \frac{d}{d t} ℓ^{2} = 2 ⟨ \nabla_{v} ξ, ξ ⟩, \end{matrix}

(226)

\begin{matrix} \frac{d^{2}}{d t^{2}} ℓ^{2} = 2 ⟨ \nabla_{v} \nabla_{v} ξ, ξ ⟩ + 2 ⟨ \nabla_{v} ξ, \nabla_{v} ξ ⟩, \end{matrix}

(227)

\begin{matrix} \frac{d^{3}}{d t^{3}} ℓ^{2} = 2 ⟨ \nabla_{v}^{(3)} ξ, ξ ⟩ + 6 ⟨ \nabla_{v} \nabla_{v} ξ, \nabla_{v} ξ ⟩, \end{matrix}

(228)

\begin{matrix} \frac{d^{4}}{d t^{4}} ℓ^{2} = 2 ⟨ \nabla_{v}^{(4)} ξ, ξ ⟩ + 8 ⟨ \nabla_{v}^{(3)} ξ, \nabla_{v} ξ ⟩ + 6 ⟨ \nabla_{v} \nabla_{v} ξ, \nabla_{v} \nabla_{v} ξ ⟩ . \end{matrix}

(229)

Setting

t = 0

gives

\begin{matrix} {\frac{d}{d t} ℓ^{2}|}_{t = 0} = 2 ⟨ \nabla_{v} ξ, 0 ⟩ = 0, \end{matrix}

(230)

\begin{matrix} {\frac{d^{2}}{d t^{2}} ℓ^{2}|}_{t = 0} = 2 ⟨ \nabla_{v} \nabla_{v} ξ, 0 ⟩ + 2 ⟨ w, w ⟩ = 2, \end{matrix}

(231)

\begin{matrix} {\frac{d^{3}}{d t^{3}} ℓ^{2}|}_{t = 0} = 2 ⟨ \nabla_{v}^{(3)} ξ, 0 ⟩ + 6 ⟨ R^{v, 0} (v), w ⟩ = 0, \end{matrix}

(232)

\begin{matrix} {\frac{d^{4}}{d t^{4}} ℓ^{2}|}_{t = 0} = 2 ⟨ \nabla_{v}^{(4)} ξ, 0 ⟩ + 8 ⟨ \nabla_{v}^{(3)} ξ, w ⟩ + 6 ⟨ R^{v, 0} (v), R^{v, 0} (v) ⟩ = 8 ⟨ \nabla_{v}^{(3)} ξ, w ⟩ . \end{matrix}

(233)

As far as the last term is concerned, we observe that, due to the linearity of the curvature endomorphism in its vectorial arguments, we have

{(\nabla_{v} \nabla_{v} \nabla_{v} ξ)}_{x} = {(\nabla_{v} R_{x})}^{v, ξ} (v) + R_{x}^{\nabla_{v} v, ξ} (v) + R_{x}^{v, \nabla_{v} ξ} (v) + R_{x}^{v, ξ} (\nabla_{v} v) .

(234)

Setting

t = 0

and recalling that initially

ξ = 0

, we see that all terms on the right-hand side are null except for the third one, that reads

R_{x}^{v, w} (v)

, hence

\begin{matrix} {\frac{d^{4}}{d t^{4}} ℓ^{2}|}_{t = 0} = 8 {⟨ R_{x}^{v, w} (v), w ⟩}_{x}, \end{matrix}

(235)

therefore the Taylor expansion of the function

ℓ^{2}

around

t = 0

reads

ℓ^{2} (t) = t^{2} + \frac{1}{3} t^{4} {⟨ R_{x}^{v, w} (v), w ⟩}_{x} + \dots,

(236)

where the dots stand for higher-order terms that we are not going to pay attention to.

Under the additional hypotheses that

∥ v ∥ = 1

and

{⟨ v, w ⟩}_{x} = 0

, the above relationship may be rewritten in terms of the sectional curvature. In fact, by the symmetry (218) we know that

{⟨ R_{x}^{v, w} (v), w ⟩}_{x} = - {⟨ R_{x}^{v, w} (w), v ⟩}_{x}

which, in turn, equals

- K_{x}^{v, w}

, therefore we may write that

ℓ^{2} (t) = t^{2} - \frac{1}{3} t^{4} K_{x}^{v, w} + \dots,

(237)

therefore, locally, if

K_{x}^{v, w} > 0

the acceleration of geodesic deviation decreases along the fiducial geodesic, while if

K_{x}^{v, w} < 0

the acceleration increases. Hence the sectional curvature determines the speed of spreading of nearby geodesics. (Therefore, in some sense, curvature acts as a force, which is the fundamental observation that motivated Einstein to formulate his General Theory of Relativity).
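The expansion (237) can be illustrated on the unit sphere, whose sectional curvature equals 1. The sketch below, assuming NumPy, builds a family of great circles leaving the same point x with initial velocities obtained by rotating v towards w; the names `geodesic`, `family` and `separation`, the orthonormal triple x, v, w and the finite-difference step are illustrative assumptions. The separation vector is estimated by a central difference in the index s and its squared length is compared with the truncated Taylor series.

```python
import numpy as np

def geodesic(x, v, t):
    """Unit-speed great circle on the unit sphere from x with velocity v (|v| = 1, v orthogonal to x)."""
    return x * np.cos(t) + v * np.sin(t)

# Orthonormal triple: base point x, fiducial velocity v, initial deviation rate w.
x = np.array([1.0, 0.0, 0.0])
v = np.array([0.0, 1.0, 0.0])
w = np.array([0.0, 0.0, 1.0])

def family(s, t):
    """Geodesic number s of the family: all members depart from x, so xi(0) = 0."""
    return geodesic(x, np.cos(s) * v + np.sin(s) * w, t)

def separation(t, h=1e-5):
    """Central-difference estimate of xi = d(gamma_s)/ds at s = 0."""
    return (family(h, t) - family(-h, t)) / (2.0 * h)

t = 0.1
xi = separation(t)
ell2 = xi @ xi                          # squared length of the geodesic deviation
K = 1.0                                 # sectional curvature of the unit sphere
taylor = t**2 - (t**4 / 3.0) * K        # truncated expansion (237)
flat = t**2                             # what a flat space would give

print(ell2, taylor, flat)
```

On this example the deviation is shorter than in the flat case, consistently with a positive sectional curvature, and the mismatch between `ell2` and `taylor` is of the order of the neglected sixth-order term.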

A further interpretation of the sectional curvature in terms of separation of nearby geodesics is worth examining. Let us define again a family of geodesics

γ_{s} (t)

and let us assume, in particular, that these geodesic lines were shot in a peculiar way such that, if we denote again by v the tangent velocity along the fiducial geodesic and by

ξ

the separation vector along the same geodesic line, then it holds that

⟨ v, ξ ⟩ = 0

, namely the separation vector is orthogonal to the velocity vector along the fiducial geodesic (that is to say, the geodesic curves in this family are parallel to one another). By appropriate parametrization, one can make sure that

⟨ v, v ⟩ = ⟨ ξ, ξ ⟩ = 1

. Since

\nabla_{v} \nabla_{v} ξ

denotes the tidal acceleration, we may interpret the effect of the curvature on the spreading of the geodesics as follows:

whenever $⟨ \nabla_{v} \nabla_{v} ξ, ξ ⟩ > 0$ , the tidal force is directed in the same direction of the current separation, hence nearby geodesics tend to spread even further from the fiducial one;
whenever $⟨ \nabla_{v} \nabla_{v} ξ, ξ ⟩ < 0$ , the tidal force is directed in the opposite direction of the current separation vector, hence nearby geodesics tend to get closer to the fiducial one;
as opposed to the previous cases, whenever $⟨ \nabla_{v} \nabla_{v} ξ, ξ ⟩ = 0$ , the tide does not produce any effects and the geodesics get neither closer nor farther apart (in fact, this case corresponds to zero curvature).

Since, by the hypotheses made,

⟨ \nabla_{v} \nabla_{v} ξ, ξ ⟩ = ⟨ R^{v, ξ} (v), ξ ⟩ = - K^{v, ξ}

, it follows that

in a region of positive sectional curvature, two nearby parallel geodesics tend to converge towards each other,
in a region of negative curvature, two nearby parallel geodesics tend to deviate from one another,
in a region of zero curvature, two nearby parallel geodesic curves neither converge nor diverge.

Once again, the spreading of nearby geodesics is evaluated in terms of relative acceleration, because velocity spreading is not peculiar to curvature, since even in a flat space two curves may deviate with non-zero velocity.

A neat result about sectional curvature is that the number

K_{x}^{u, v}

does not actually depend on the two vectors

u, v \in T_{x} M

but rather on the subspace spanned by such vectors, namely if we define a plane

℘ : = span \{u, v\}

, the sectional curvature is better written as

K_{x}^{℘}

. This property is easily proven by defining any linear combination

\bar{u} = a u + b v, \bar{v} = c u + d v,

(238)

where

a, b, c, d \in R

such that

a d \neq b c

, and calculating sectional curvature at

\bar{u}, \bar{v}

, that takes the value

\begin{matrix} K_{x}^{\bar{u}, \bar{v}} = & \frac{{⟨ R_{x}^{\bar{u}, \bar{v}} (\bar{v}), \bar{u} ⟩}_{x}}{{⟨ \bar{u}, \bar{u} ⟩}_{x} {⟨ \bar{v}, \bar{v} ⟩}_{x} - {⟨ \bar{u}, \bar{v} ⟩}_{x}^{2}} \\ = & \frac{{⟨ R_{x}^{a u + b v, c u + d v} (c u + d v), a u + b v ⟩}_{x}}{{⟨ a u + b v, a u + b v ⟩}_{x} {⟨ c u + d v, c u + d v ⟩}_{x} - {⟨ a u + b v, c u + d v ⟩}_{x}^{2}} . \end{matrix}

(239)

Given the linearity of the curvature endomorphism and of the inner product in their vectorial arguments and the symmetry properties (218), it is a little tedious but not hard to find that

{⟨ R_{x}^{a u + b v, c u + d v} (c u + d v), a u + b v ⟩}_{x} = {(a d - b c)}^{2} {⟨ R_{x}^{u, v} (v), u ⟩}_{x},

(240)

and that

\begin{matrix} {⟨ a u + b v, a u + b v ⟩}_{x} {⟨ c u + d v, c u + d v ⟩}_{x} - {⟨ a u + b v, c u + d v ⟩}_{x}^{2} \\ = {(a d - b c)}^{2} ({⟨ u, u ⟩}_{x} {⟨ v, v ⟩}_{x} - {⟨ u, v ⟩}_{x}^{2}), \end{matrix}

(241)

hence the assertion.
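The invariance just proven can be observed numerically on a manifold whose sectional curvature is not constant. The sketch below, assuming NumPy, uses SO(4) with the closed-form curvature (214); the Frobenius inner product is assumed as the metric, and the coefficients a, b, c, d are arbitrary illustrative values with a d different from b c.

```python
import numpy as np

def curvature(U, V, W):
    """Curvature endomorphism of SO(n) in the closed form (214)."""
    return 0.25 * ((U @ V.T - V @ U.T) @ W + W @ (V.T @ U - U.T @ V))

def inner(U, V):
    """Frobenius inner product, assumed here as the metric on T_R SO(n)."""
    return np.trace(U.T @ V)

def sectional(U, V):
    """Sectional curvature as defined in (223)."""
    num = inner(curvature(U, V, V), U)
    den = inner(U, U) * inner(V, V) - inner(U, V) ** 2
    return num / den

def skew(rng, n):
    a = rng.standard_normal((n, n))
    return a - a.T

rng = np.random.default_rng(4)
n = 4
R, _ = np.linalg.qr(rng.standard_normal((n, n)))   # orthogonal base point
U, V = R @ skew(rng, n), R @ skew(rng, n)          # two tangent vectors at R

a, b, c, d = 2.0, -1.0, 0.5, 3.0                   # a*d - b*c = 6.5, non-zero
U_bar = a * U + b * V                              # change of basis (238)
V_bar = c * U + d * V

K_uv = sectional(U, V)
K_bar = sectional(U_bar, V_bar)
print(K_uv, K_bar)
```

The two printed values are expected to agree up to round-off, since both pairs of vectors span the same plane.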

We conclude this section by an example that justifies the property that, in a space with constant sectional curvature, the curvature endomorphism is parallel (which is often expressed through the dry statement

\nabla R = 0

).

Example 21.

Let us indicate by

u, v, w, z \in Γ (M)

four smooth vector fields and let us form the function

R_{x}^{u, v} (w)

. The application

x \mapsto R_{x}^{u, v} (w)

defines a further vector field, which may be covariantly differentiated along z to give

\nabla_{z} R_{x}^{u, v} (w)

. Since

R_{x}^{u, v} (w)

is non-linear in x but trilinear in its vectorial arguments, it holds that

\nabla_{z} (R_{x}^{u, v} (w)) = {(\nabla_{z} R_{x})}^{u, v} (w) + R^{\nabla_{z} u, v} (w) + R^{u, \nabla_{z} v} (w) + R^{u, v} (\nabla_{z} w),

(242)

therefore, in particular

{(\nabla_{z} R_{x})}^{u, v} (w) = \nabla_{z} (R_{x}^{u, v} (w)) - R^{\nabla_{z} u, v} (w) - R^{u, \nabla_{z} v} (w) - R^{u, v} (\nabla_{z} w) .

(243)

On a space with constant sectional curvature κ, it holds that

{(\nabla_{z} R_{x})}^{u, v} (w) = 0

for every choice of

u, v, w, z \in Γ (M)

. In fact, recalling that, in this case,

R_{x}^{u, v} (w) = κ (u {⟨ v, w ⟩}_{x} - v {⟨ u, w ⟩}_{x})

, we have:

\begin{matrix} {(\nabla_{z} R_{x})}^{u, v} (w) = & κ (\nabla_{z} u {⟨ v, w ⟩}_{x} + u {⟨ \nabla_{z} v, w ⟩}_{x} + u {⟨ v, \nabla_{z} w ⟩}_{x}) \\ & - κ (\nabla_{z} v {⟨ u, w ⟩}_{x} + v {⟨ \nabla_{z} u, w ⟩}_{x} + v {⟨ u, \nabla_{z} w ⟩}_{x}) \\ & - κ (\nabla_{z} u {⟨ v, w ⟩}_{x} - v {⟨ \nabla_{z} u, w ⟩}_{x}) \\ & - κ (u {⟨ \nabla_{z} v, w ⟩}_{x} - \nabla_{z} v {⟨ u, w ⟩}_{x}) \\ & - κ (u {⟨ v, \nabla_{z} w ⟩}_{x} - v {⟨ u, \nabla_{z} w ⟩}_{x}) . \end{matrix}

(244)

The terms on the right-hand side cancel two by two, hence the stated property holds.

5. Continuous-Time Dynamical Systems

This section presents a survey on continuous-time dynamical systems on manifolds with particular emphasis on Lagrangian systems and non-linear damped oscillators. This survey of second-order (or double-integrator) dynamical systems is presented in extrinsic terms, as per the main flow of this paper, as well as, for the interested readers, in coordinates and through a special formulation based on cotangent bundles.

The present section is organized in the following way. Section 5.1 retraces calculus of variation on manifold and Section 5.2 applies such technique to derive the general equation of motion on a smooth manifold by a Lagrangian approach. The aim of Section 5.3 is to illustrate the same derivation by using the coordinate formalism, while the purpose of Section 5.4 is to retrace the same derivation by using a cotangent-bundle formalism, which is shown to be as valid as the usually-invoked tangent-bundle one. In particular, Section 5.4 digs into non-linear oscillatory phenomena by first revising classical non-linear oscillators and then showing how to extend such dynamical systems to state manifolds.

5.1. Calculus of Variations on Manifold

In the first part [13] of this multi-part tutorial, we studied a problem related to the evaluation of the first-order variation of the fundamental form

F : T M \to R

, defined as

F (x, v) : = {⟨ v, v ⟩}_{x}

, on a manifold and we invoked its ‘extendability’ to a neighborhood of the tangent bundle

T M

in the ambient bundle

A^{2}

, namely, it is not necessary to assume the well-definedness of expressions like

F (x + y, v)

nor

F (x, v + w)

. On that occasion, we mentioned that extendability provides a convenient shortcut but that it is not actually necessary to arrive at the desired result. In the present section, we aim at clarifying this point.

Variational methods constitute a core tool in applied mathematics, with a specific weight in system theory and control, as they afford the formulation of complex dynamics through an elegant formalism and first principles. It is extremely interesting to examine variational methods on smooth manifolds from an extrinsic perspective. We shall briefly survey this topic as a preparation for the next sections, where we shall use it extensively.

Let

F : T M \to R

denote an integrable function on a smooth manifold

M

and let

γ : [0, 1] \to M

denote a smooth curve on a Riemannian manifold

M

. We are going to prove that the first variation of the integral (action) functional

A (γ) : = \int_{0}^{1} F (γ, \dot{γ}) d t,

(245)

takes the expression:

δ A (γ) = \int_{0}^{1} {⟨\frac{\partial F}{\partial γ} - \frac{d}{d t} \frac{\partial F}{\partial \dot{γ}}, δ γ⟩}^{A} d t,

(246)

where

⟨ 🟉, 🟉 ⟩^{A}

denotes the Euclidean inner product that the ambient space

A \supset M

is endowed with.

To understand the present development and the properties of the symbol

δ γ

in a coordinate-free setting, let us define a smoothly-parametrized family of smooth curves

c : [- ϵ, ϵ] \times [0, 1] \to M

. The notation

c (s, t)

may be interpreted as follows:

the index s determines which curve in the family is referred to; keeping the index s fixed and varying the affine parameter t, namely considering $c (🟉, t)$ , corresponds to tracing the $s^{th}$ curve in the family;
the affine parameter t determines which point on a curve is referred to; keeping t fixed and letting s vary, namely considering $c (s, 🟉)$ , corresponds to traversing the family of curves transversally in correspondence of homologous points; notice that $c (s, 🟉)$ still traces out a curve, although not necessarily endowed with the properties of $c (🟉, t)$ (for example, while curves $c (🟉, t)$ may be geodesics, transversal curves $c (s, 🟉)$ might not be geodesics).

Such family of smooth curves will be chosen such that

\{\begin{matrix} c (0, t) = γ (t), \forall t \in [0, 1] (termed ‘ fiducial curve ’), \\ c (s, 0) = γ (0), c (s, 1) = γ (1), \forall s \in [- ϵ, ϵ], \end{matrix}

(247)

namely, the family member corresponding to the index value

s = 0

coincides with the curve

γ

, while the other members are free except that they share the same endpoints as

γ

. As mentioned, for a fixed s, the map

t \mapsto c (s, t)

traces a smooth curve on the manifold

M

. Likewise, for a fixed t, the map

s \mapsto c (s, t)

traces a smooth ‘transversal’ curve on

M

. Smoothness implies

\frac{\partial^{2} c}{\partial t \partial s} = \frac{\partial^{2} c}{\partial s \partial t}

by the Schwarz-Clairaut theorem. Now, it pays to define the following partial derivatives:

δ γ (t) : = {\frac{\partial}{\partial s} c (s, t)|}_{s = 0} and \dot{c} (s, t) : = \frac{\partial}{\partial t} c (s, t) .

(248)

The function

δ γ

is the tangent vector to the transversal curve at the point t, hence

δ γ \in T_{γ} M

. In addition, from the properties (247), it follows that

δ γ (0) = δ γ (1) = 0

.

On the basis of the above settings, the variation

δ A (γ)

may be written as:

\begin{matrix} δ \int_{0}^{1} F (γ, \dot{γ}) d t : = & lim_{h \to 0} \frac{1}{h} \{\int_{0}^{1} F (c (h, t), \dot{c} (h, t)) d t - \int_{0}^{1} F (c (0, t), \dot{c} (0, t)) d t\} \\ = & \int_{0}^{1} lim_{h \to 0} \frac{F (c (h, t), \dot{c} (h, t)) - F (c (0, t), \dot{c} (0, t))}{h} d t \\ = & \int_{0}^{1} {\frac{\partial}{\partial s} F (c (s, t), \dot{c} (s, t))|}_{s = 0} d t . \end{matrix}

(249)

The partial derivative within the integral in Equation (249) may be written as:

{\frac{\partial F (c, \dot{c})}{\partial s}|}_{s = 0} = {{⟨\frac{\partial F (c, \dot{c})}{\partial c}, \frac{\partial c}{\partial s}⟩}^{A}|}_{s = 0} + {{⟨\frac{\partial F (c, \dot{c})}{\partial \dot{c}}, \frac{\partial \dot{c}}{\partial s}⟩}^{A}|}_{s = 0} .

(250)

Notice that the above expression does not require extendability to a neighborhood of

T M

but merely the existence of the partial derivatives

\partial_{c} F

and

\partial_{\dot{c}} F

. The variation of the action functional in a neighborhood of the fiducial curve

γ

reads, therefore:

δ \int_{0}^{1} F (γ, \dot{γ}) d t = \int_{0}^{1} {⟨\frac{\partial F (γ, \dot{γ})}{\partial γ}, δ γ⟩}^{A} d t + \int_{0}^{1} {⟨\frac{\partial F (γ, \dot{γ})}{\partial \dot{γ}}, \frac{d δ γ}{d t}⟩}^{A} d t,

(251)

where, in the second integral, we have used the interchangeability of partial derivations permitted by the Schwarz-Clairaut theorem. In addition, integration by parts yields:

\int_{0}^{1} {⟨\frac{\partial F}{\partial \dot{γ}}, \frac{d}{d t} δ γ⟩}^{A} d t = {{⟨\frac{\partial F}{\partial \dot{γ}}, δ γ⟩}^{A}|}_{0}^{1} - \int_{0}^{1} {⟨\frac{d}{d t} \frac{\partial F}{\partial \dot{γ}}, δ γ⟩}^{A} d t .

(252)

The first addendum on the right-hand side equals zero since the transversal variation

δ γ

vanishes at the endpoints of the fiducial curve

γ

, hence the variation of the integral functional assumes the final expression (246).

A noticeable consequence of this development is that the variation

δ A (γ)

depends only on the tangential components

δ γ \in T_{γ} M

of the perturbation.
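The formula (246) lends itself to a quick numerical illustration. The following is a minimal sketch, assuming NumPy, for the function F(x, v) = ⟨v, v⟩ on the unit sphere embedded in R^3, for which ∂F/∂γ = 0 and ∂F/∂γ̇ = 2γ̇. The fiducial curve, the ambient perturbation `eta`, the grid size and the way the family c(s, t) is built (ambient perturbation followed by normalization) are all illustrative assumptions. The derivative of the discretized action with respect to s is compared with the discretized right-hand side of (246).

```python
import numpy as np

N = 2000
t = np.linspace(0.0, 1.0, N + 1)
dt = t[1] - t[0]

def normalize(y):
    """Map each row of y back onto the unit sphere."""
    return y / np.linalg.norm(y, axis=1, keepdims=True)

# Fiducial curve on the unit sphere (deliberately not a geodesic).
gamma = normalize(np.column_stack([np.cos(t), np.sin(t), 0.5 * np.sin(np.pi * t) + 0.2]))
# Ambient perturbation that vanishes at both endpoints.
eta = np.column_stack([np.zeros_like(t), np.sin(np.pi * t), t * (1.0 - t)])

def family(s):
    """Member c(s, .) of the family; c(0, .) is the fiducial curve and endpoints are fixed."""
    return normalize(gamma + s * eta)

def action(curve):
    """Discretized action for F(x, v) = <v, v>: sum of |dc/dt|^2 dt over the grid."""
    d = np.diff(curve, axis=0)
    return np.sum(d * d) / dt

# delta gamma = d/ds c(s, t) at s = 0, i.e., the tangential part of eta.
delta_gamma = eta - gamma * np.sum(gamma * eta, axis=1, keepdims=True)

# Left-hand side: derivative of the action with respect to s by central differences.
h = 1e-4
lhs = (action(family(h)) - action(family(-h))) / (2.0 * h)

# Right-hand side of (246): the integrand reduces to <-2 gamma_ddot, delta gamma>,
# with the acceleration approximated by second differences at interior nodes.
gamma_ddot = (gamma[2:] - 2.0 * gamma[1:-1] + gamma[:-2]) / dt**2
rhs = np.sum(-2.0 * gamma_ddot * delta_gamma[1:-1]) * dt

print(lhs, rhs)
```

With this discretization the summation by parts is exact, so the residual mismatch between the two printed values stems only from the finite-difference step h.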

5.2. Second-Order Dynamical Systems on Manifold

In order to formulate second-order dynamical systems on manifolds we shall recall the Euler-Lagrange-d’Alembert modified minimal action principle. According to such principle, a dynamical system with which are associated a kinetic energy function

K

, a potential energy function

V

and a non-conservative force field f, generates a trajectory

γ : [0, 1] \to M

on a state manifold

M

such that:

δ \int_{0}^{1} (K - V) d t + \int_{0}^{1} {⟨ f, δ γ ⟩}_{γ} d t = 0 .

(253)

Let us survey in detail how the above constituent functions are defined, what their meaning is, and how to evaluate the variation of the integral. We shall assume

M

to be a smooth Riemannian manifold.

The kinetic energy

K : T M \to R

associated to a dynamical system is defined as

K : = \frac{1}{2} m {⟨ \dot{γ}, \dot{γ} ⟩}_{γ},

(254)

with

m > 0

. Except for the factor

\frac{m}{2}

, the kinetic energy coincides with the fundamental form

F

. As such, the kinetic energy depends on the state of the system as well as on its velocity. (Notice that the ‘mass’ term m has been introduced to retain familiarity with well-known physics concepts. In general, in the present context it is not necessary at all because the ‘mass’ may be more naturally thought of as part of the metric

⟨ \cdot, \cdot ⟩

, hence of the metric tensor G.)

The potential energy function associated to the dynamical system under construction is indicated by

V : M \to R

. The potential energy is generally a function of the system’s state only and, most frequently, of the Riemannian distance between the state and a reference point on the state-manifold.

The function

L : = K - V

is referred to as the Lagrangian function of the system. The integral of the Lagrangian function along the trajectory generated by the system, namely

A : = \int_{0}^{1} L d t

, is referred to as total action of the system. (In the international unit system, the Lagrangian is measured in Joule, while the action is measured in Joule · second.)

According to the Euler-Lagrange principle that a dynamical system is supposed to obey, in the absence of non-conservative forces acting on the system, the system generates a trajectory on the state-manifold that makes the total action stationary, namely

δ \int_{0}^{1} L d t = 0 .

(255)

There exists a further function, referred to as Hamiltonian function of the system, defined as

H : = K + V,

(256)

that represents the total energy content of the system (at any given time instant). In a system that obeys the Euler-Lagrange principle, the Hamiltonian is an invariant, namely it holds that

\frac{d}{d t} H = 0

.

In the presence of non-conservative forces, the former principle needs to be modified to accommodate the work of such forces, hence it gets extended as in (253), which we shall refer to as the Euler-Lagrange-d’Alembert principle.

In order to calculate the variation of the total action, according to the method surveyed in Section 5.1, let us define a smooth family of smooth curves

γ : [- ϵ, ϵ] \times [0, 1] \to M

denoted as

γ (s, t), 0 ⩽ t ⩽ 1, - ϵ ⩽ s ⩽ ϵ,

(257)

where the index s individuates a member of the family, while the affine parameter t individuates a point on that curve. Let us underline that, for a fixed value of the affine parameter, the map

s \mapsto γ (s, 🟉)

traces a smooth curve.

Let us observe that the partial derivative

\frac{\partial γ (s, t)}{\partial s} \in T_{γ (s, t)} M

represents a transversal tangent vector, computed as

\frac{\partial γ (s, t)}{\partial s} = lim_{h \to 0} \frac{γ (s + h, t) - γ (s, t)}{h} .

(258)

The above-defined family of curves possesses, in particular, two fixed endpoints, namely the endpoint

t = 0

and the endpoint

t = 1

. In formal terms, this is expressed by

\frac{\partial γ}{\partial s} (s, 0) = \frac{\partial γ}{\partial s} (s, 1) = 0, \forall s \in [- ϵ, ϵ] .

(259)

In other terms, the maps

s \mapsto γ (s, 0)

and

s \mapsto γ (s, 1)

are constants with respect to the index s. As a last definition, we set

δ γ (t) : = \frac{\partial γ}{\partial s} (0, t) .

(260)

A consequence of the assumption (259) is that

δ γ (0) = δ γ (1) = 0

. The function

δ γ (t)

represents the ensemble of the variations, namely all transversal tangent vectors at all points of the fiducial curve

γ (t) \equiv γ (0, t)

. (The fiducial curve represents, in fact, the trajectory actually followed by the system.)

The variation of the total action may be defined as:

δ \int_{0}^{1} (\frac{1}{2} m {⟨ \dot{γ}, \dot{γ} ⟩}_{γ} - V) d t : = {\{\frac{\partial}{\partial τ} \int_{0}^{1} (\frac{1}{2} m {⟨ \dot{γ}, G_{γ} (\dot{γ}) ⟩}^{A} - V) d t\}|}_{τ = 0},

(261)

where we have used the shorthand

γ

to denote

γ (τ, t)

, with the index τ playing the role of the family index s. In particular, the variation of the kinetic energy may be made explicit as follows:

\begin{matrix} \frac{\partial}{\partial τ} \int_{0}^{1} \frac{1}{2} m {⟨ \dot{γ}, G_{γ} (\dot{γ}) ⟩}^{A} d t = & \frac{1}{2} m \int_{0}^{1} {⟨\frac{\partial \dot{γ}}{\partial τ}, G_{γ} (\dot{γ})⟩}^{A} d t + \\ \frac{1}{2} m \int_{0}^{1} {⟨\dot{γ}, G_{γ}^{•} (\frac{\partial γ}{\partial τ}, \dot{γ})⟩}^{A} d t + \\ \frac{1}{2} m \int_{0}^{1} {⟨\dot{γ}, G_{γ} (\frac{\partial \dot{γ}}{\partial τ})⟩}^{A} d t . \end{matrix}

(262)

The first and the third integrals on the right-hand side take the same value. In the second integral, it is possible to swap two tangential arguments, therefore

\begin{matrix} \frac{\partial}{\partial τ} \int_{0}^{1} \frac{1}{2} m {⟨ \dot{γ}, G_{γ} (\dot{γ}) ⟩}^{A} d t = m \int_{0}^{1} {⟨\frac{\partial \dot{γ}}{\partial τ}, G_{γ} (\dot{γ})⟩}^{A} d t + \frac{1}{2} m \int_{0}^{1} {⟨\frac{\partial γ}{\partial τ}, G_{γ}^{•} (\dot{γ}, \dot{γ})⟩}^{A} d t . \end{matrix}

(263)

The first integral may be re-expressed by swapping the order of partial derivation and by applying integration by parts, which give:

\int_{0}^{1} {⟨\frac{d}{d t} \frac{\partial γ}{\partial τ}, G_{γ} (\dot{γ})⟩}^{A} d t = {{⟨\frac{\partial γ}{\partial τ}, G_{γ} (\dot{γ})⟩}^{A}|}_{t = 0}^{t = 1} - \int_{0}^{1} {⟨\frac{\partial γ}{\partial τ}, \frac{d}{d t} G_{γ} (\dot{γ})⟩}^{A} d t,

(264)

where, in particular, we may recall from the definition of the ‘bullet’ derivative

G^{•}

, that

\frac{d}{d t} G_{γ} (\dot{γ}) = G_{γ}^{•} (\dot{γ}, \dot{γ}) + G_{γ} (\ddot{γ}) .

(265)

Continuing with the evaluation of the variation of the integral of the kinetic energy, we get:

\begin{matrix} \frac{\partial}{\partial τ} \int_{0}^{1} \frac{1}{2} m {⟨ \dot{γ}, G_{γ} (\dot{γ}) ⟩}^{A} d t = & - m \int_{0}^{1} {⟨\frac{\partial γ}{\partial τ}, G_{γ}^{•} (\dot{γ}, \dot{γ}) + G_{γ} (\ddot{γ})⟩}^{A} d t \\ + \frac{1}{2} m \int_{0}^{1} {⟨\frac{\partial γ}{\partial τ}, G_{γ}^{•} (\dot{γ}, \dot{γ})⟩}^{A} d t \\ = & m \int_{0}^{1} {⟨\frac{\partial γ}{\partial τ}, - \frac{1}{2} G_{γ}^{•} (\dot{γ}, \dot{γ}) - G_{γ} (\ddot{γ})⟩}^{A} d t . \end{matrix}

(266)

Let us now evaluate the variation of the potential energy function, that reads

\begin{matrix} \frac{\partial}{\partial τ} \int_{0}^{1} V d t = \int_{0}^{1} \frac{\partial V}{\partial τ} d t = \int_{0}^{1} {⟨{grad}_{γ} V, \frac{\partial γ}{\partial τ}⟩}_{γ} d t = \int_{0}^{1} {⟨G_{γ} ({grad}_{γ} V), \frac{\partial γ}{\partial τ}⟩}^{A} d t . \end{matrix}

(267)

As concerns the work of the external forcing term, it ultimately turns out that

\int_{0}^{1} {⟨ f, δ γ ⟩}_{γ} d t = \int_{0}^{1} {⟨ G_{γ} (f), δ γ ⟩}^{A} d t .

(268)

Upon summing up all the above terms and setting

τ = 0

(wherever the variable

τ

is present) we obtain:

\begin{matrix} δ \int_{0}^{1} L d t + \int_{0}^{1} {⟨ f, δ γ ⟩}_{γ} d t = & m \int_{0}^{1} {⟨- \frac{1}{2} G_{γ}^{•} (\dot{γ}, \dot{γ}) - G_{γ} (\ddot{γ}), δ γ⟩}^{A} d t \\ - \int_{0}^{1} {⟨G_{γ} ({grad}_{γ} V), δ γ⟩}^{A} d t + \int_{0}^{1} {⟨ G_{γ} (f), δ γ ⟩}^{A} d t . \end{matrix}

(269)

Let us observe that, in the first integral on the right-hand side, the scalar product does not change if one adds any arbitrary normal term in

N_{γ} M

, therefore such integral is equivalent to:

m \int_{0}^{1} {⟨- G_{γ} (\nabla_{t}^{γ} \dot{γ}), δ γ⟩}^{A} d t .

(270)

In conclusion, the variational principle of Euler, Lagrange and d’Alembert leads to the integral equation

\begin{matrix} \int_{0}^{1} {⟨- m G_{γ} (\nabla_{t}^{γ} \dot{γ}) - G_{γ} ({grad}_{γ} V) + G_{γ} (f), δ γ⟩}^{A} d t = 0 . \end{matrix}

(271)

Since

δ γ

is an arbitrary (tangential) perturbation, the above equation implies that the integrand must be zero. Ultimately, the calculus of variations leads to a class of second-order dynamical systems on manifolds of the type

m \nabla_{t}^{γ} \dot{γ} = - {grad}_{γ} V + f .

(272)

At a glance, it appears to generalize Newton’s second law of dynamics to any smooth Riemannian manifold. In fact, let us observe that if

M \equiv R^{p}

, then

\nabla_{t}^{γ} \dot{γ} \equiv \ddot{γ}

, hence the above equation collapses to the familiar physics law “mass × acceleration = resultant of forces”.

As an example, let us compute the energy balance of a non-conservative system (and, as a by-product, let us verify the energy conservation of a conservative system).

Example 22.

By definition, the total energy of the system is

H = K + V

, hence the power law for the system (272) is

\begin{matrix} \frac{d H}{d t} = & \frac{d K}{d t} + \frac{d V}{d t} \\ = & \frac{1}{2} m \frac{d}{d t} {⟨ \dot{γ}, \dot{γ} ⟩}_{γ} + \frac{d V}{d t} \\ = & m {⟨ \nabla_{t}^{γ} \dot{γ}, \dot{γ} ⟩}_{γ} + {⟨ {grad}_{γ} V, \dot{γ} ⟩}_{γ} \\ = & {⟨ m \nabla_{t}^{γ} \dot{γ} + {grad}_{γ} V, \dot{γ} ⟩}_{γ} . \end{matrix}

(273)

From the expression (272) it follows that

m \nabla_{t}^{γ} \dot{γ} + {grad}_{γ} V = f

, and hence

\begin{matrix} \dot{H} = {⟨ f, \dot{γ} ⟩}_{γ} . \end{matrix}

(274)

The dissipative components of the external force fields are those that cause a decrease of the energy content of the system, namely those forcing terms such that

{⟨ f, \dot{γ} ⟩}_{γ} < 0

(such as, for example, the viscous damping

f : = - μ \dot{γ}

, with

μ > 0

). The active components are, on the other hand, those which inject energy into the system, such as those generated by actuators and due to active damping effects.
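The power law (274) may be checked numerically on a simple case. The sketch below assumes the unit sphere embedded in the ordinary space with the inherited metric, for which the covariant acceleration reads \ddot{γ} + ∥\dot{γ}∥^{2} γ (a standard fact, assumed here rather than derived), no potential (V = 0) and the viscous force f = - μ \dot{γ}. Under these assumptions, (274) predicts that the total energy decays as H(t) = H(0) exp(- 2 μ t / m). The numerical values of m and μ and the fourth-order Runge-Kutta stepper are illustrative choices.

```python
import math

M, MU = 2.0, 0.5  # illustrative mass and viscosity coefficient (assumed values)

def field(y):
    # y = [x1, x2, x3, v1, v2, v3], with x on the unit sphere and v tangent at x
    x, v = y[:3], y[3:]
    speed2 = sum(c * c for c in v)
    # m (x'' + |x'|^2 x) = -mu x'   =>   x'' = -|x'|^2 x - (mu / m) x'
    acc = [-speed2 * xi - (MU / M) * vi for xi, vi in zip(x, v)]
    return v + acc

def rk4_step(f, y, h):
    k1 = f(y)
    k2 = f([a + 0.5 * h * b for a, b in zip(y, k1)])
    k3 = f([a + 0.5 * h * b for a, b in zip(y, k2)])
    k4 = f([a + h * b for a, b in zip(y, k3)])
    return [a + h / 6.0 * (p + 2.0 * q + 2.0 * r + s)
            for a, p, q, r, s in zip(y, k1, k2, k3, k4)]

def total_energy(y):
    # V = 0, hence H = K = (1/2) m <v, v>
    return 0.5 * M * sum(c * c for c in y[3:])

def simulate(T=2.0, h=1e-3):
    y = [1.0, 0.0, 0.0, 0.0, 0.8, 0.6]  # x(0) on the sphere, v(0) tangent at x(0)
    for _ in range(int(round(T / h))):
        y = rk4_step(field, y, h)
    return y
```

Comparing `total_energy(simulate())` with the predicted value exp(-1) (for the default values above) gives a quick consistency check of the energy balance, while the norm of the position array stays close to 1, confirming that the trajectory does not leave the sphere.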

5.3. Coordinate-Prone Lagrangian Formulation of Dynamical Systems*

As recalled, on a Riemannian manifold

M

whose tangent spaces

T_{x} M

are endowed with an inner product

⟨ 🟉, 🟉 ⟩_{x}

, the non-conservative Euler-Lagrange-d’Alembert principle may be stated as:

\int_{0}^{1} δ (K_{x} (\dot{x}) - V_{x}) d t + \int_{0}^{1} {⟨ f_{x}, δ x ⟩}_{x} d t = 0,

(275)

where

f \in Γ (M)

denotes a vector field of external forces which, in the context of mathematical system modeling and control, may include energy-dissipation terms and energy-injection terms that include control fields. On each point of the fiducial trajectory

x (t)

, the quantity

δ x

denotes a variation with respect to the fiducial trajectory itself. A possible generalization of this principle would arise by replacing the potential function by a generalized potential [23], which would give rise to gyroscopic force fields.

On a smooth manifold

M

of dimension p, where a point

x \in M

is described by coordinates

(x^{1}, \dots, x^{p})

via a coordinate chart, working with intrinsic coordinates eases the expression of the solution of the variational problem (275) in explicit form. In order to express the derivation of the general law of dynamics arising from the extended Lagrangian formulation (275), the following facts are worth summarizing:

the standard notation for covariant and contravariant tensors indexes as well as Einstein’s convention on summation indexes are in force;
for any pair of tangent vectors $u, v \in T_{x} M$ , it holds that ${⟨ u, v ⟩}_{x} = g_{i j} u^{i} v^{j}$ , where $g_{i j}$ denote the components of the metric tensor associated to the inner product $⟨ 🟉, 🟉 ⟩_{x}$ , namely, based on the canonical basis ${\partial_{1}^{x}, \dots, \partial_{p}^{x}}$ , $g_{i j} (x) : = {⟨ \partial_{i}^{x}, \partial_{j}^{x} ⟩}_{x}$ ; the functions $g_{i j}$ depend on the coordinates $x^{k}$ and the symmetry property $g_{i j} = g_{j i}$ holds;
in coordinates, the Riemannian gradient of a regular function $f : M \to R$ at a point $x \in M$ is given by:

${({grad}_{x} f)}^{h} = g^{h k} \frac{\partial f}{\partial x^{k}};$

(276)

the quantities $\frac{\partial f}{\partial x^{k}}$ denote the components of the Euclidean gradient, while the spatial functions $g^{h k}$ denote the contravariant components of the metric tensor;
the Christoffel symbols of the first kind associated to the Levi-Civita connection corresponding to the metric tensor of components $g_{i j}$ are computed as:

$Γ_{k j i} = \frac{1}{2} (\frac{\partial g_{j i}}{\partial x^{k}} + \frac{\partial g_{i k}}{\partial x^{j}} - \frac{\partial g_{k j}}{\partial x^{i}});$

the Christoffel symbols of the second kind are computed as $Γ_{i k}^{h} = g^{h j} Γ_{i k j}$ ; the Christoffel symbols of the second kind are symmetric in the covariant indices, namely, $Γ_{i j}^{h} = Γ_{j i}^{h}$ ; the Christoffel form $Γ_{x} : T_{x} M \times T_{x} M \to R^{p}$ (a way to express the Christoffel symbol applied to two tangent directions as an array) has components ${[Γ_{x} (v, w)]}^{k} = Γ_{i j}^{k} v^{i} w^{j}$ ;
the kinetic energy functional is expressed by the symmetric bilinear form $\frac{1}{2} m {⟨ \dot{x}, \dot{x} ⟩}_{x}$ , namely $K_{x} (\dot{x}) = \frac{1}{2} m g_{i j} {\dot{x}}^{i} {\dot{x}}^{j}$ , where the constant $m > 0$ plays the role of a mass term; on a Riemannian manifold, the metric tensor is positive-definite, hence $K_{x} (\dot{x}) ⩾ 0$ on every trajectory $x (t) \in M$ ; in the context of abstract manifold calculus, the mass term is quite immaterial since it does not represent any actual physical quantity, nevertheless, we shall leave it in the equation to help readers although, as already mentioned, it might well be incorporated into the metric tensor g;
the potential energy function $V$ depends on the coordinates $x^{k}$ only;
the variations $δ x^{k}$ may be chosen arbitrarily except at the boundaries of the trajectory where they vanish, namely, $δ x^{k} (0) = δ x^{k} (1) = 0$ .
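The formulas for the Christoffel symbols summarized above may be sketched numerically as follows. The example assumes the round metric of the unit sphere in the coordinates (θ, φ), namely g = diag(1, sin^{2} θ), and estimates the partial derivatives of the metric by central differences; both choices are illustrative. The index ordering follows the convention of the list above, where the last index of the symbol of the first kind is the one that gets raised.

```python
import math

def metric(x):
    # round metric of the unit sphere in coordinates x = (theta, phi) (assumed example)
    return [[1.0, 0.0], [0.0, math.sin(x[0]) ** 2]]

def inverse_2x2(g):
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    return [[g[1][1] / det, -g[0][1] / det], [-g[1][0] / det, g[0][0] / det]]

def metric_derivative(x, k, h=1e-5):
    # central-difference estimate of d g_ij / d x^k, returned as a 2x2 array
    xp, xm = list(x), list(x)
    xp[k] += h
    xm[k] -= h
    gp, gm = metric(xp), metric(xm)
    return [[(gp[i][j] - gm[i][j]) / (2.0 * h) for j in range(2)] for i in range(2)]

def christoffel_first(x):
    dg = [metric_derivative(x, k) for k in range(2)]  # dg[k][i][j] = d g_ij / d x^k
    # Gamma_{kji} = (d_k g_ji + d_j g_ik - d_i g_kj) / 2, stored as G1[k][j][i]
    return [[[0.5 * (dg[k][j][i] + dg[j][i][k] - dg[i][k][j])
              for i in range(2)] for j in range(2)] for k in range(2)]

def christoffel_second(x):
    g_inv, G1 = inverse_2x2(metric(x)), christoffel_first(x)
    # Gamma^h_{ik} = g^{hj} Gamma_{ikj}, stored as G2[h][i][k]
    return [[[sum(g_inv[h][j] * G1[i][k][j] for j in range(2))
              for k in range(2)] for i in range(2)] for h in range(2)]

def christoffel_form(x, v, w):
    # [Gamma_x(v, w)]^k = Gamma^k_{ij} v^i w^j
    G2 = christoffel_second(x)
    return [sum(G2[k][i][j] * v[i] * w[j] for i in range(2) for j in range(2))
            for k in range(2)]
```

For this metric the only non-zero symbols of the second kind are known in closed form, namely Γ^{θ}_{φφ} = - sin θ cos θ and Γ^{φ}_{θφ} = Γ^{φ}_{φθ} = cot θ, which offers a direct way to validate the sketch.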

In coordinates, the principle (275) may be rewritten as:

\int_{0}^{1} [δ (\frac{1}{2} m g_{i j} {\dot{x}}^{i} {\dot{x}}^{j} - V_{x}) + f_{k} δ x^{k}] d t = 0,

(277)

where functions

f_{k}

denote the covariant components of a cotangent dissipation force field

f_{x}^{♭} \in T_{x}^{*} M

. By calculating the variation in the leftmost integral, the extended Lagrangian formulation may be expressed in the form:

\int_{0}^{1} φ_{k} δ x^{k} d t = 0, φ_{k} : = \frac{1}{2} m \frac{\partial g_{i j}}{\partial x^{k}} {\dot{x}}^{i} {\dot{x}}^{j} - m \frac{d}{d t} (g_{i k} {\dot{x}}^{i}) + f_{k} - \frac{\partial V_{x}}{\partial x^{k}},

(278)

through integration by parts. As the variations

δ x^{k}

may be chosen arbitrarily for the parameter t ranging in the open interval

(0, 1)

, the equations of dynamics on the manifold

M

are such that

φ_{k} = 0

, namely:

\frac{1}{2} m \frac{\partial g_{i j}}{\partial x^{k}} {\dot{x}}^{i} {\dot{x}}^{j} - m \frac{\partial g_{i k}}{\partial x^{j}} {\dot{x}}^{i} {\dot{x}}^{j} - m g_{i k} {\ddot{x}}^{i} + f_{k} - \frac{\partial V_{x}}{\partial x^{k}} = 0 .

(279)

Thanks to the Christoffel symbols of the first kind

Γ_{i j k}

, the above equation may be recast as:

m g_{i k} {\ddot{x}}^{i} + m Γ_{i j k} {\dot{x}}^{i} {\dot{x}}^{j} + \frac{\partial V_{x}}{\partial x^{k}} - f_{k} = 0 .

(280)

Upon multiplying both sides of the above equation by the contravariant tensor

g^{h k}

and upon summation with respect to the index k, the above equation takes the form:

m {\ddot{x}}^{h} + m g^{h k} Γ_{i j k} {\dot{x}}^{i} {\dot{x}}^{j} + g^{h k} (\frac{\partial V_{x}}{\partial x^{k}} - f_{k}) = 0 .

(281)

The above expression may be further condensed by noticing that:

the terms $g^{h k} Γ_{i j k}$ equal the Christoffel symbols of the second kind $Γ_{i j}^{h}$ ;
the terms $g^{h k} \frac{\partial V_{x}}{\partial x^{k}}$ denote the components of the Riemannian gradient ${grad}_{x} V_{x}$ ;
the terms $g^{h k} f_{k}$ denote the components of the contravariant (tangent) dissipation force field $f_{x} \in T_{x} M$ .

In conclusion, by introducing the coordinate array

x : = (x^{1}, x^{2}, \dots, x^{p})

and the velocity array

v : = (v^{1}, v^{2}, \dots, v^{p})

, the Equation (281) may be recast compactly as a system of first-order equations

\{\begin{matrix} \dot{x} = \frac{1}{m} v, \\ \dot{v} + \frac{1}{m} Γ_{x} (v, v) = - {grad}_{x} V_{x} + f_{x} . \end{matrix}

(282)

The following example treats in detail the case of viscous damping as external force field and determines the rate of energy dissipation.

Example 23.

The dissipation term is often drawn from viscosity theory, namely,

f_{x} : = - μ \dot{x}

, with

μ > 0

denoting a viscous damping coefficient. Since \dot{x} = \frac{1}{m} v, the system dynamics then assumes the expression:

\{\begin{matrix} \dot{x} = \frac{1}{m} v, \\ \dot{v} + \frac{1}{m} Γ_{x} (v, v) = - {grad}_{x} V_{x} - \frac{μ}{m} v . \end{matrix}

(283)

The dynamics (283) is governed by a driving force from a potential energy plus a viscosity term.

The law of energy balance for the dissipative system (283) may be found by the equation:

\int_{0}^{1} φ_{k} {\dot{x}}^{k} d t = 0 .

(284)

Plugging in the expression (278) gives:

\begin{matrix} \frac{1}{2} m \int_{0}^{1} \frac{\partial g_{i j}}{\partial x^{k}} {\dot{x}}^{k} {\dot{x}}^{i} {\dot{x}}^{j} d t - m \int_{0}^{1} \frac{d}{d t} (g_{i k} {\dot{x}}^{i}) {\dot{x}}^{k} d t - μ \int_{0}^{1} g_{h k} {\dot{x}}^{h} {\dot{x}}^{k} d t - \int_{0}^{1} \frac{\partial V_{x}}{\partial x^{k}} {\dot{x}}^{k} d t = 0 . \end{matrix}

(285)

The first integral may be rewritten, again upon integration by parts, as:

\frac{m}{2} \int_{0}^{1} \frac{\partial g_{i j}}{\partial x^{k}} {\dot{x}}^{k} {\dot{x}}^{i} {\dot{x}}^{j} d t = \frac{m}{2} \int_{0}^{1} \frac{d g_{i j}}{d t} {\dot{x}}^{i} {\dot{x}}^{j} d t = {K|}_{0}^{1} - m \int_{0}^{1} g_{i j} {\dot{x}}^{i} {\ddot{x}}^{j} d t .

The second term may be written, upon integration by parts, as:

m \int_{0}^{1} \frac{d}{d t} (g_{i k} {\dot{x}}^{i}) {\dot{x}}^{k} d t = 2 {K|}_{0}^{1} - m \int_{0}^{1} g_{i j} {\dot{x}}^{i} {\ddot{x}}^{j} d t,

while the remaining terms may be written as

μ \int_{0}^{1} g_{i k} {\dot{x}}^{i} {\dot{x}}^{k} d t + \int_{0}^{1} \frac{\partial V}{\partial x^{k}} {\dot{x}}^{k} d t = 2 \frac{μ}{m} \int_{0}^{1} K (t) d t + {V (t)|}_{0}^{1} .

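Collecting the above terms, and observing that the same steps hold when the final time 1 is replaced by any time t, the energy balance equation may thus be written as:

K (t) + V (t) = K (0) + V (0) - 2 \frac{μ}{m} \int_{0}^{t} K (s) d s,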

(286)

where

K (t)

stands for

K_{x} (\dot{x} (t))

and

V (t)

for

V_{x (t)}

. The above equation shows that the designed second-order dynamical system loses energy at a rate proportional to its kinetic energy.
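The energy balance of a viscously damped system may be verified numerically in coordinates. The sketch below assumes the unit sphere with coordinates (θ, φ) and metric diag(1, sin^{2} θ), the potential V = cos θ and illustrative values of m and μ; none of these choices comes from the text. The equations of motion are obtained from (281) with f_{k} = - μ g_{h k} {\dot{x}}^{h}, and the time integral of the kinetic energy is estimated by the trapezoidal rule.

```python
import math

M_MASS, MU = 1.5, 0.4  # illustrative mass and viscosity coefficient (assumed values)

def field(s):
    # s = [theta, phi, dtheta, dphi]; equations from (281) on the sphere with V = cos(theta)
    th, _, dth, dph = s
    ddth = (math.sin(th) * math.cos(th) * dph ** 2
            + math.sin(th) / M_MASS - MU / M_MASS * dth)
    ddph = -2.0 * math.cos(th) / math.sin(th) * dth * dph - MU / M_MASS * dph
    return [dth, dph, ddth, ddph]

def rk4_step(f, y, h):
    k1 = f(y)
    k2 = f([a + 0.5 * h * b for a, b in zip(y, k1)])
    k3 = f([a + 0.5 * h * b for a, b in zip(y, k2)])
    k4 = f([a + h * b for a, b in zip(y, k3)])
    return [a + h / 6.0 * (p + 2.0 * q + 2.0 * r + s)
            for a, p, q, r, s in zip(y, k1, k2, k3, k4)]

def kinetic(s):
    return 0.5 * M_MASS * (s[2] ** 2 + math.sin(s[0]) ** 2 * s[3] ** 2)

def potential(s):
    return math.cos(s[0])

def energy_balance(T=1.0, h=1e-3):
    s = [1.0, 0.0, 0.2, 0.5]  # initial state, chosen away from the coordinate poles
    h0 = kinetic(s) + potential(s)
    integral = 0.0
    for _ in range(int(round(T / h))):
        s_next = rk4_step(field, s, h)
        integral += 0.5 * h * (kinetic(s) + kinetic(s_next))  # trapezoidal rule
        s = s_next
    final_energy = kinetic(s) + potential(s)
    predicted = h0 - 2.0 * MU / M_MASS * integral
    return final_energy, predicted, h0
```

The two first values returned by `energy_balance()` are expected to agree up to the discretization error, and both lie below the initial energy, which reflects the dissipative nature of the viscous term.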

A non-dissipative system (

μ = 0

) devoid of a potential function, hence completely governed by the kinetic energy (that is, by the metric), produces a trajectory along which the only quantity that undergoes variation, namely the kinetic energy, stays constant. As we already knew, the Christoffel form

Γ_{x} (🟉, 🟉)

may thus be computed on the basis of the variational principle:

\int_{0}^{1} δ K_{x} d t = 0,

(287)

invoking which is often the preferred path for computing the Christoffel form.

5.4. Cotangent Bundle Derivation of Non-Linear Oscillators on Manifolds*

Nonlinear oscillators arise from the modeling of complex physical structures [24] and make up the basis for a number of modern applications [25]. The state of nonlinear oscillators may form complex, non-repeating (although deterministic) patterns. Contributions from the scientific literature suggest how the simplest harmonic-oscillator models may be generalized to complex single or coupled oscillators [26,27,28,29].

The cotangent bundle, namely the dual

T^{🟉} M

of the tangent bundle associated to a manifold, is as valid as the tangent bundle to formulate and study dynamical systems. We mention that some authors in manifold calculus explain the whole manifold calculus itself by starting from cotangent bundles and only later introducing tangent spaces. In the present section, we shall rather be showing how non-linear oscillators may be defined by means of tangent-bundle calculus. We shall start off by reviewing a number of classical oscillator models very well known in the scientific literature and shall then continue by showing how such mathematical models may be seamlessly extended to manifolds by means of cotangent-bundle calculus.

The present section makes frequent reference to the work [30] that it is based on.

5.4.1. Classical Oscillators on Euclidean Spaces

A number of nonlinear oscillator models appear in the scientific literature as dynamical systems in a single real descriptive variable. The simplest model known in the scientific literature is perhaps the harmonic oscillator described by the second-order dynamical system:

\ddot{x} + Ω_{0}^{2} x = 0,

(288)

where

x \in R

denotes a descriptive variable and

Ω_{0} > 0

denotes the natural oscillation frequency of the harmonic oscillator. The state

x = x (t)

is a function of time, with

t \in R

. As no damping is considered, the harmonic oscillator conserves its initial energy indefinitely. The same consideration applies to the simple pendulum model

\ddot{x} + Ω_{0}^{2} sin x = 0

, which appears as a second-order differential equation that may be easily recast as a system of two first-order differential equations:

\{\begin{matrix} \dot{x} = v, \\ \dot{v} = - \frac{d V_{x}}{d x}, \end{matrix}

(289)

where the forcing term on the right-hand side has been written as the gradient of a potential energy function

V : R \to R

which, in this case, may be defined as

V_{x} : = Ω_{0}^{2} (1 - cos x)

. (Notice that we have therewith abandoned the notion of ‘mass’ which, besides being immaterial in the present context, may always be factored out through appropriate normalization.)
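As a quick numerical illustration of the conservation property of the undamped pendulum written in the form (289), the following sketch integrates the system and monitors the total energy \frac{1}{2} v^{2} + V_{x}. The value of Ω_{0}, the initial state and the fourth-order Runge-Kutta stepper are assumptions made for the sake of the example.

```python
import math

OMEGA0 = 1.5  # illustrative natural frequency (assumed value)

def field(y):
    # system (289) with V_x = Omega0^2 (1 - cos x)
    x, v = y
    return [v, -OMEGA0 ** 2 * math.sin(x)]

def energy(y):
    x, v = y
    return 0.5 * v ** 2 + OMEGA0 ** 2 * (1.0 - math.cos(x))

def rk4_step(f, y, h):
    k1 = f(y)
    k2 = f([a + 0.5 * h * b for a, b in zip(y, k1)])
    k3 = f([a + 0.5 * h * b for a, b in zip(y, k2)])
    k4 = f([a + h * b for a, b in zip(y, k3)])
    return [a + h / 6.0 * (p + 2.0 * q + 2.0 * r + s)
            for a, p, q, r, s in zip(y, k1, k2, k3, k4)]

def simulate(y0, T=10.0, h=1e-3):
    y = list(y0)
    for _ in range(int(round(T / h))):
        y = rk4_step(field, y, h)
    return y
```

Up to the (small) integration error, `energy(simulate(y0))` coincides with `energy(y0)` for any initial state `y0`.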

A further well-studied nonlinear damped oscillator that arises from the analysis of vacuum tubes is the van der Pol oscillator [24], described by the second-order differential equation:

\ddot{x} - μ (1 - x^{2}) \dot{x} + Ω_{0}^{2} x = 0,

(290)

where again

x \in R

denotes a descriptive variable and

μ > 0

denotes a damping parameter. When

μ = 0

, the Equation (290) collapses into the equation of the harmonic oscillator (288). When

μ > 0

, the system will eventually enter a limit cycle. Near the origin

(x, \dot{x}) = (0, 0)

of the phase space the system is unstable, while far from the origin the system is heavily damped and hence stable. The single Equation (290) may be recast as the paired equations:

\{\begin{matrix} \dot{x} = v, \\ \dot{v} = - \frac{d V_{x}}{d x} - μ (x^{2} - 1) v, \end{matrix}

(291)

where now two variables,

x, v \in R

, are used to describe its dynamics and a potential energy function

V : R \to R

has been introduced to quantify the forcing term, which is defined as

V_{x} : = \frac{1}{2} Ω_{0}^{2} x^{2}

, where

Ω_{0} > 0

denotes a constant parameter. The van der Pol oscillator model may be regarded as a special case of the FitzHugh-Nagumo model [31] which, in turn, represents a simplified version of the Hodgkin-Huxley mathematical model that explains in a detailed manner the phenomena of activation and deactivation of spiking neurons. A related example arises from the examination of the driven van der Pol oscillator, described by the differential system:

\{\begin{matrix} \dot{x} = v, \\ \dot{v} = - \frac{d V_{x}}{d x} - μ (x^{2} - 1) v + A sin (Ω t), \end{matrix}

(292)

where

A > 0

and

Ω > 0

denote two constant parameters. In the present case, the right-hand side comprises, besides the non-linear damping term, a quantity (the driving term) that depends sinusoidally on the temporal variable t. The particular form of the damping causes an attenuation of the magnitude of strong oscillations and an amplification of the magnitude of weak oscillations. The original van der Pol oscillator has been generalized in different ways, as in the paper [32] that considers the introduction of fractional derivatives.
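The attenuation of strong oscillations and the amplification of weak ones may be observed numerically on the undriven system (291). The sketch below assumes μ = 1 and Ω_{0} = 1 (illustrative values) and measures the oscillation amplitude after the transient has died out, starting once near the origin and once far from it.

```python
MU, OMEGA0 = 1.0, 1.0  # illustrative parameter values (assumptions)

def field(y):
    # undriven van der Pol system (291) with V_x = Omega0^2 x^2 / 2
    x, v = y
    return [v, -OMEGA0 ** 2 * x - MU * (x * x - 1.0) * v]

def rk4_step(f, y, h):
    k1 = f(y)
    k2 = f([a + 0.5 * h * b for a, b in zip(y, k1)])
    k3 = f([a + 0.5 * h * b for a, b in zip(y, k2)])
    k4 = f([a + h * b for a, b in zip(y, k3)])
    return [a + h / 6.0 * (p + 2.0 * q + 2.0 * r + s)
            for a, p, q, r, s in zip(y, k1, k2, k3, k4)]

def late_amplitude(y0, T=60.0, h=0.01, tail=20.0):
    # largest |x| observed over the last `tail` time units of the simulation
    n, n_tail = int(round(T / h)), int(round(tail / h))
    y, amplitude = list(y0), 0.0
    for k in range(n):
        y = rk4_step(field, y, h)
        if k >= n - n_tail:
            amplitude = max(amplitude, abs(y[0]))
    return amplitude
```

With these settings, `late_amplitude([0.1, 0.0])` and `late_amplitude([4.0, 0.0])` are expected to return nearly the same value, close to 2, which is consistent with both trajectories settling on the same limit cycle.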

Another well-studied nonlinear dynamical system that exhibits a complex behavior is the Duffing oscillator [33]. Duffing’s dynamical system reads:

\{\begin{matrix} \dot{x} = v, \\ \dot{v} = - \frac{d V_{x}}{d x} - μ v + A sin (Ω t), \end{matrix}

(293)

where

A > 0

and

Ω > 0

denote constant parameters and the potential energy function

V : R \to R

is defined according to one of the three known models summarized below:

Case of a hard Duffing oscillator: In this case, the potential function is defined as $V_{x} : = \frac{1}{2} Ω_{0}^{2} x^{2} + \frac{1}{4} α Ω_{0}^{2} x^{4}$ ,
Case of a double-well Duffing oscillator: In this instance, the potential is defined as $V_{x} : = - \frac{1}{2} Ω_{0}^{2} x^{2} + \frac{1}{4} α Ω_{0}^{2} x^{4}$ ,
Case of a soft Duffing oscillator: In this case, the potential is defined as $V_{x} : = \frac{1}{2} Ω_{0}^{2} x^{2} - \frac{1}{4} α Ω_{0}^{2} x^{4}$ ,

where

Ω_{0} > 0

and

α > 0

denote constants of the models. The dynamical system (293) describes the motion of a damped oscillator with a more complicated potential than in the simple harmonic oscillator. It is able to model, for example, a spring pendulum whose spring’s stiffness does not exactly obey Hooke’s law. A direct comparison between the Duffing oscillator (293) and the driven van der Pol oscillator (292) reveals that the linear damping term

- μ v

in the Duffing system is replaced by the non-linear damping term

- μ (x^{2} - 1) v

in the van der Pol system. A number of generalized Duffing models have been proposed in the scientific literature. As an example, we mention the contribution [34] that describes a generalized double-well Duffing oscillator endowed with non-linear damping described by:

\{\begin{matrix} \dot{x} = v, \\ \dot{v} = Ω_{0}^{2} x - α Ω_{0}^{2} x^{3} - μ v {| v |}^{η - 1} + A sin (Ω t), \end{matrix}

(294)

where the constant

η ⩾ 1

is termed damping exponent and determines the degree of non-linearity of the damping term. The case

η = 1

corresponds to linear damping in the system (293). A similar analysis was performed in [35] about the universal escape oscillator endowed with the same nonlinear damping term.
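The three Duffing potentials listed above, together with the corresponding restoring force - \frac{d V_{x}}{d x} that enters the system (293), may be sketched as follows. The function names and the default parameter values are hypothetical and serve for illustration only.

```python
import math

# signs of the quadratic and quartic terms for the three models listed above
SIGNS = {"hard": (1.0, 1.0), "double-well": (-1.0, 1.0), "soft": (1.0, -1.0)}

def duffing_potential(kind, x, omega0=1.0, alpha=0.25):
    s2, s4 = SIGNS[kind]
    return (s2 * 0.5 * omega0 ** 2 * x ** 2
            + s4 * 0.25 * alpha * omega0 ** 2 * x ** 4)

def duffing_force(kind, x, omega0=1.0, alpha=0.25):
    # closed-form restoring force -dV/dx
    s2, s4 = SIGNS[kind]
    return -(s2 * omega0 ** 2 * x + s4 * alpha * omega0 ** 2 * x ** 3)

def duffing_field(kind, y, t, mu=0.2, A=0.3, Omega=1.2):
    # right-hand side of the system (293); mu, A, Omega are illustrative values
    x, v = y
    return [v, duffing_force(kind, x) - mu * v + A * math.sin(Omega * t)]
```

For the double-well model the force vanishes at x = 0 and at x = ± 1 / \sqrt{α}, the latter being the bottoms of the two wells, while the hard model admits x = 0 as its only equilibrium.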

Several two-variable oscillator models are known from the literature besides those examined in detail, such as the Lotka-Volterra model of prey-predator interaction and the Wilberforce torsional pendulum. A common factor of all two-dimensional models is that their dynamics is entirely described by two variables, a positional variable x and a velocity variable v, in such a way that the phase space of these systems is formed by all pairs

(x, v) \in R^{2}

, hence is the Euclidean space

R^{2}

. The state space, in particular, coincides with the real line

R

, and the velocity space coincides with the real line

R

. This appears to be a coincidence that does not carry over to the manifold case, in which the state manifold

M

and the velocity space

T M

are entirely different spaces. Clearly the tangent bundle

T M

plays a fundamental role in the description of a dynamical system on manifold, and so does its dual, namely the cotangent bundle

T^{🟉} M

.

Dynamical systems involving more than one variable are known in the scientific literature. The best known example is perhaps the Lorenz oscillator [36], which appears as a nonlinear, three-dimensional dynamical system that generates a complex flow. A Lorenz oscillator is described by the mathematical model:

\{\begin{matrix} {\dot{x}}_{1} = σ (x_{2} - x_{1}), \\ {\dot{x}}_{2} = x_{1} (ρ - x_{3}) - x_{2}, \\ {\dot{x}}_{3} = x_{1} x_{2} - β x_{3}, \end{matrix}

(295)

which includes three constant parameters

σ, ρ, β > 0

and three variables

(x_{1}, x_{2}, x_{3}) \in R^{3}

. For certain values of the three parameters, the Lorenz system exhibits complex behavior. The Lorenz equations were derived from the simplified relationships that govern convection rolls arising in the study of the atmosphere and were applied to achieve climate and weather predictions. A further noteworthy example is brought to our attention by the authors of [37], who studied the emergence of oscillations in an SIR epidemic model.

The Lorenz model (295) is of the first order, but it may be recast as a second-order model by differentiating both sides of each equation with respect to the temporal parameter t:

\{\begin{matrix} {\ddot{x}}_{1} = - σ {\dot{x}}_{1} + σ {\dot{x}}_{2}, \\ {\ddot{x}}_{2} = (ρ - x_{3}) {\dot{x}}_{1} - {\dot{x}}_{2} - x_{1} {\dot{x}}_{3}, \\ {\ddot{x}}_{3} = x_{2} {\dot{x}}_{1} + x_{1} {\dot{x}}_{2} - β {\dot{x}}_{3} . \end{matrix}

(296)

By associating to the Lorenz system the state array

x : = {[x_{1} x_{2} x_{3}]}^{⊤}

, the dynamical system (296) may be cast in the compact form:

\{\begin{matrix} \dot{x} = v, \\ \dot{v} = - C_{x} v, C_{x} : = [\begin{matrix} σ & - σ & 0 \\ x_{3} - ρ & 1 & x_{1} \\ - x_{2} & - x_{1} & β \end{matrix}] . \end{matrix}

(297)

The recast Lorenz system takes the form of a set of purely non-linearly damped oscillators on a flat space which may be thought of being endowed with a constant potential energy function.
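The recast form (297) may be cross-checked numerically: along a solution of (295), the acceleration obtained by differentiating the Lorenz vector field along the flow must coincide with - C_{x} v, where v is the Lorenz field itself. In the sketch below, the parameter values σ = 10, ρ = 28, β = 8/3 are the ones commonly used in the literature and are an assumption here, as are the function names.

```python
SIGMA, RHO, BETA = 10.0, 28.0, 8.0 / 3.0  # commonly used values (assumption)

def lorenz_field(x):
    # right-hand side of the first-order model (295)
    return [SIGMA * (x[1] - x[0]),
            x[0] * (RHO - x[2]) - x[1],
            x[0] * x[1] - BETA * x[2]]

def damping_matrix(x):
    # matrix C_x of the second-order model (297)
    return [[SIGMA, -SIGMA, 0.0],
            [x[2] - RHO, 1.0, x[0]],
            [-x[1], -x[0], BETA]]

def acceleration_from_matrix(x):
    # v' = -C_x v, evaluated with v given by the Lorenz field
    v, C = lorenz_field(x), damping_matrix(x)
    return [-sum(C[i][j] * v[j] for j in range(3)) for i in range(3)]

def acceleration_from_flow(x, h=1e-4):
    # derivative of the Lorenz field along its own direction (central differences)
    v = lorenz_field(x)
    fp = lorenz_field([x[i] + h * v[i] for i in range(3)])
    fm = lorenz_field([x[i] - h * v[i] for i in range(3)])
    return [(fp[i] - fm[i]) / (2.0 * h) for i in range(3)]
```

Since the Lorenz field is a quadratic polynomial in the state variables, the central difference used in `acceleration_from_flow` is exact up to round-off, so the two functions are expected to agree closely at any state.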

A general expression for coupled scalar oscillators in

R^{n}

that include, as a special instance, the Lorenz model (297), may be drawn from the study [38]:

\{\begin{matrix} \dot{x} = v, \\ \dot{v} = - \partial_{x} V - φ_{x} (v) + u f (t), \end{matrix}

(298)

where

x \in R^{n}

denotes the state array,

V : R^{n} \to R

denotes a potential energy function,

\partial_{x} V \in R^{n}

denotes the Euclidean gradient of the potential energy function,

φ : R^{n} \times R^{n} \to R^{n}

indicates a linear operator in the variable v (possibly nonlinear in the variable x) that represents damping and other internal sinks/sources of energy, while the term

f : R \to R

represents a forcing term in the fixed direction

u \in R^{n}

.

5.4.2. Cotangent Bundle Notation

In the present section we shall denote the inner product of two tangent vectors through a special symbol, namely

G_{x} : T_{x} M \times T_{x} M \to R

. The metric ‘flat’ operator, denoted as

G^{♭}

, turns a tangent vector into a cotangent vector, namely,

G_{x}^{♭} : T_{x} M \to T_{x}^{🟉} M

. Its inverse is termed ‘sharp’ operator and is denoted by

G^{♯}

. The differential of a differentiable function

ψ : M \to R

at a point

x \in M

is denoted by

d_{x} ψ \in T_{x}^{🟉} M

. The Riemannian gradient of the function

ψ

with respect to a metric

G

, evaluated at a point

x \in M

is expressed through the metric operator as

(G_{x}^{♯} \circ d_{x}) ψ

. The Christoffel form associated to the metric

G

is denoted as

\overset{G}{Γ}

. The Levi-Civita connection associated to this metric is denoted as

\overset{G}{\nabla}

and the parallel transport operator associated to the same metric is denoted as

\overset{G}{P}

.
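In coordinates, the ‘flat’ and ‘sharp’ operators reduce to multiplication by the matrix of the metric and by its inverse, respectively. The following minimal sketch, restricted to a two-dimensional manifold for simplicity, illustrates this together with the Riemannian gradient (G_{x}^{♯} \circ d_{x}) ψ. The function names and the representation of tangent and cotangent vectors as plain lists of components are assumptions.

```python
def inverse_2x2(G):
    det = G[0][0] * G[1][1] - G[0][1] * G[1][0]
    return [[G[1][1] / det, -G[0][1] / det], [-G[1][0] / det, G[0][0] / det]]

def flat(G, v):
    # tangent components v^j  ->  cotangent components G_ij v^j
    return [G[i][0] * v[0] + G[i][1] * v[1] for i in range(2)]

def sharp(G, w):
    # cotangent components w_j  ->  tangent components (G^{-1})^{ij} w_j
    return flat(inverse_2x2(G), w)

def pairing(w, v):
    # action of a cotangent vector on a tangent vector
    return w[0] * v[0] + w[1] * v[1]

def inner(G, u, v):
    # metric inner product G_x(u, v)
    return pairing(flat(G, u), v)

def riemannian_gradient(G, d_psi):
    # Riemannian gradient as the sharp of the differential
    return sharp(G, d_psi)
```

As a usage example, for the metric matrix diag(1/4, 1/4) and the differential with components (2, 0.5), `riemannian_gradient` returns the tangent vector with components (8, 2); by construction, the inner product of the gradient with any tangent vector w equals the action of the differential on w.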

Let us recall that a vector field

φ \in Γ (M)

that depends on a parameter, namely

φ_{x} : T_{x} M \to T_{x} M

, is positive-definite if

G_{x} (φ_{x} (v), v) > 0

for every

v \in T_{x} M \setminus \{0\}

and every

x \in M

.

Let us further recall that, given two points

x, y \in M

connectable by a geodesic line

γ : [0, 1] \to M

, their Riemannian distance is defined by

D (x, y) : = \int_{0}^{1} G_{γ (t)}^{\frac{1}{2}} (\dot{γ} (t), \dot{γ} (t)) d t .

(299)

A fundamental result of the calculus on manifolds states that the Riemannian gradient of a squared distance function reads:

G_{x}^{♯} (d_{x} D^{2} (x, y)) = - 2 \log_{x} (y),

(300)

as long as the logarithm is defined. The gradient of the squared distance is known in closed form only if the expression of the inverse exponential map is known explicitly.
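The relation (300) may be checked numerically on a manifold whose exponential and logarithmic maps are known in closed form. The sketch below assumes the unit sphere with the metric inherited from the ambient space, for which the distance between two points is the angle between them; the closed-form maps used here are standard but are not taken from the text. The directional derivative of the squared distance along a tangent direction w is compared with the inner product between - 2 \log_{x} (y) and w.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def sphere_exp(x, w):
    # exponential map of the unit sphere at x applied to the tangent vector w
    n = math.sqrt(dot(w, w))
    if n == 0.0:
        return list(x)
    return [math.cos(n) * xi + math.sin(n) * wi / n for xi, wi in zip(x, w)]

def sphere_log(x, y):
    # logarithmic map (undefined when y is antipodal to x)
    c = max(-1.0, min(1.0, dot(x, y)))
    theta = math.acos(c)
    if theta == 0.0:
        return [0.0, 0.0, 0.0]
    return [theta * (yi - c * xi) / math.sin(theta) for xi, yi in zip(x, y)]

def squared_distance(x, y):
    return math.acos(max(-1.0, min(1.0, dot(x, y)))) ** 2

def directional_derivative(x, y, w, h=1e-5):
    # central difference of D^2(., y) along the geodesic leaving x in the direction w
    forward = squared_distance(sphere_exp(x, [h * c for c in w]), y)
    backward = squared_distance(sphere_exp(x, [-h * c for c in w]), y)
    return (forward - backward) / (2.0 * h)
```

For any tangent direction w at x, the value `directional_derivative(x, y, w)` is expected to match `dot([-2.0 * c for c in sphere_log(x, y)], w)` up to the finite-difference error.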

5.4.3. Second-Order Dynamical Systems Formulation by Cotangent Bundles

A classical dynamical system model on the ordinary space

R^{3}

is represented by Newton’s second law of motion of a particle of mass m subjected to an external force

f : R^{3} \times R^{3} \times I \to R^{3}

, where

I

denotes a time interval. The trajectory

γ : I \to R^{3}

followed by such particle results as the solution of Newton’s equation:

m \frac{d^{2} γ}{d t^{2}} = f (γ, \frac{d γ}{d t}, t) .

(301)

The second-order time-derivative

\ddot{γ} (t)

quantifies the instantaneous translational acceleration of the test particle as a function of the absolute time t and the mass constant m quantifies the translational inertia of the particle. The resultant of ‘external’ forces f depends on the instantaneous position of the particle

γ (t)

, on its instantaneous translational velocity

\dot{γ} (t)

and may even depend explicitly on the time.

The current section recalls a formulation of second-order dynamical systems on manifolds that retraces the above-recalled dynamics of a virtual point-wise particle whose instantaneous position is bound to belong to a Riemannian manifold

M

—endowed with metric

G

—rather than the ordinary space

R^{3}

. As in the previous sections, Hamilton’s stationary-action principle, extended through the d’Alembert virtual work notion, shall be invoked to formulate the dynamics of such virtual particle. For this reason, it pays to revisit the notions of kinetic energy, potential energy and external excitation, mainly with the aim of clarifying the customized notation introduced at the beginning of the present section.

The kinetic energy function for a point-wise particle associated with the metric

G

is denoted by

K : T M \to R

and is defined by

K_{x} (v) : = \frac{1}{2} G_{x} (v, v)

for

(x, v) \in T M

. On a Riemannian manifold, the metric tensor is positive-definite, hence, on every trajectory

γ : I \to M

, it holds that

K_{γ} (\dot{γ}) ⩾ 0

. The potential energy function

V : M \to R

is assumed to depend solely on the positional variable

x \in M

; whenever any exogenous forcing is lacking, the trajectory

γ : I \to M

follows the landscape of the potential energy function. As opposed to what was stated in the previous sections, in the present context an exogenous force at a point

x \in M

appears as a cotangent vector, namely

f_{x} : T_{x} M \times I \to T_{x}^{🟉} M

.

On the basis of the above-recalled quantities and notation, the extended stationary-action principle that determines the evolution of the state on a Riemannian manifold

M

endowed with a metric

G

takes the form:

δ \int_{I} (K_{γ} (\dot{γ}) - V_{γ}) d t + \int_{I} G_{γ} (G_{γ}^{♯} (f_{γ}), δ γ) d t = 0,

(302)

where symbol

δ

again denotes variation. The integrand in the leftmost integral represents the classical Lagrangian function of the particle and its integral represents the total action of the particle. The rightmost integral represents the virtual work effected by the resultant of the external driving force and represents an extension of the classical Hamilton’s stationary-action principle that may be traced back to d’Alembert [39]. In the interior of the interval

I

, the variation

δ γ \in T_{γ} M

is arbitrary, while at the boundaries of the trajectory it vanishes to zero. Computing the variation leads to the dynamical system formulation:

(G_{γ}^{♭} \circ {\overset{G}{\nabla}}_{\dot{γ}}) \dot{γ} = - d_{γ} V_{γ} + f_{γ} .

(303)

By comparing the Equation (303) to Newton’s law (301), it is readily seen that the term

{\overset{G}{\nabla}}_{\dot{γ}} \dot{γ}

represents the covariant acceleration of the particle sliding on the manifold

M

, the operator

G^{♭}

plays the role of inertia tensor and the term

f_{x} - d_{x} V_{x}

is the total force that generates the motion of the particle. Of particular interest is the role played by the inertia tensor

G^{♭}

that replaces (and extends) the notion of inertial mass.

The covariant acceleration may be written in terms of the Christoffel form as

{\overset{G}{\nabla}}_{\dot{γ}} \dot{γ} = \ddot{γ} + {\overset{G}{Γ}}_{γ} (\dot{γ}, \dot{γ})

, therefore, the Equation (303) may be rewritten as the system of first-order differential equations:

\{\begin{matrix} \dot{x} = v, \\ \dot{v} = - {\overset{G}{Γ}}_{x} (v, v) - (G_{x}^{♯} \circ d_{x}) V_{x} + G_{x}^{♯} (f_{x}) . \end{matrix}

(304)

in the tangent-bundle arrays

(x (t), v (t)) \in T M

.

The total energy (or Hamiltonian function)

H : T M \times I \to R

of the particle sliding on the manifold, and hence of the dynamical system (304), is defined by:

H_{x} : = K_{x} + V_{x} .

(305)

Over a trajectory

γ : I \to M

of the system (304), it holds that:

\frac{d H_{γ}}{d t} = G_{γ} (G_{γ}^{♯} (f_{γ}), \dot{γ}),

(306)

which represents a power law for the dynamical system at hand. In the case that the external forcing is absent, the system (304) is conservative as the total energy

H_{γ}

remains constant over the trajectory; otherwise, the system is non-conservative and its energy varies along the trajectory.
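The power law (306) follows in a few steps. The derivation below is a sketch that assumes the kinetic energy is $K_{x} (v) = \frac{1}{2} G_{x} (v, v)$ and that the covariant derivative is compatible with the metric; the last equality uses Equation (303) rewritten as ${\overset{G}{\nabla}}_{\dot{γ}} \dot{γ} + G_{γ}^{♯} (d_{γ} V_{γ}) = G_{γ}^{♯} (f_{γ})$.

```latex
\frac{d H_{\gamma}}{d t}
  = G_{\gamma}\big(\overset{G}{\nabla}_{\dot{\gamma}} \dot{\gamma}, \dot{\gamma}\big)
    + d_{\gamma} V_{\gamma}(\dot{\gamma})
  = G_{\gamma}\big(\overset{G}{\nabla}_{\dot{\gamma}} \dot{\gamma}
    + G_{\gamma}^{\sharp}(d_{\gamma} V_{\gamma}), \dot{\gamma}\big)
  = G_{\gamma}\big(G_{\gamma}^{\sharp}(f_{\gamma}), \dot{\gamma}\big).
```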

In the present context, the force

f_{x} = f_{x} (v, t) \in T_{x}^{🟉} M

essentially represents damping effects and external forcing terms. In particular:

Friction-type damping: This kind of damping generalizes the Rayleigh damping and is expressed by the forcing term $- \frac{1}{2 ϵ} μ \frac{\partial (G_{x} {(v, v)}^{ϵ})}{\partial v^{σ}} d x^{σ}$ , with $ϵ ⩾ 1$ being a damping coefficient and $μ ⩾ 0$ being a viscosity coefficient. By computing the derivatives, it is found that the friction-type damping force equals $- μ {∥ v ∥}_{x}^{2 (ϵ - 1)} G_{x}^{♭} (v)$ .
Non-linear damping: It generalizes the nonlinear damping term that appears in the van der Pol system and assumes the expression $- (G_{x}^{♭} \circ φ_{x}) (v)$ , with $φ \in Γ (M)$ being a vector field that depends on a parameter, such that, for each $x \in M$ , $φ_{x} : T_{x} M \to T_{x} M$ is a linear endomorphism.
Sinusoidal driving force: It generalizes the notion of external sinusoidal forcing term from mono-dimensional dynamical systems. It takes the expression $A sin (Ω t) d x^{1}$ , with $A, Ω \in R$ and $t \in I$ .

With the above assumptions, the dynamical system (304) takes the expression:

\{\begin{matrix} \dot{x} = v, \\ \dot{v} = - {\overset{G}{Γ}}_{x} (v, v) - G_{x}^{♯} (d_{x} V_{x}) - μ {∥ v ∥}_{x}^{2 (ϵ - 1)} v - φ_{x} (v) - A sin (Ω t) G_{x}^{♯} (d x^{1}) . \end{matrix}

(307)

The rate of change of the total energy due to the above forcing terms reads:

\frac{d H_{x}}{d t} = - μ {∥ v ∥}_{x}^{2 ϵ} - G_{x} (φ_{x} (v), v) - A sin (Ω t) v^{1} .

(308)

If

μ \neq 0

, the first term on the right-hand side is purely dissipative and is proportional to the kinetic energy of the system (to

K_{x}^{ϵ}

, in fact), while the second term on the right-hand side is not necessarily dissipative and may bring energy into the system. (If the vector field

φ

is positive-definite, then the term

- φ_{x} (v)

in the system (307) is purely dissipative.)

Let us consider two examples that aim to clarify the notation used and to show that the considered non-linear oscillators on manifolds are indeed general enough to encompass the original classical oscillators.

Example 24.

In the special case that

M = R^{n}

, hence

T_{x} M \equiv R^{n}

, and

G_{x} (v, w) = v^{⊤} w

, it holds that

{\overset{G}{Γ}}_{x} = 0

,

G_{x}^{♯} (d_{x} V_{x}) = \partial_{x} V

,

G_{x}^{♯} (d x^{1}) = 1_{n}

, where symbol

1_{n}

denotes the column array

{[1, 0, \dots, 0]}^{⊤} \in R^{n}

. Hence, the dynamical system (307) becomes:

\{\begin{matrix} \dot{x} = v, \\ \dot{v} = - \partial_{x} V - μ {(v^{⊤} v)}^{ϵ - 1} v - φ_{x} (v) - A sin (Ω t) 1_{n}, \end{matrix}

(309)

where

φ : R^{n} \times R^{n} \to R^{n}

and

μ > 0

. The dynamical system (309) accounts for the exemplary systems (289), (291)–(294) and (298).

To further exemplify the above theoretical development, the present subsection discusses the special case of Riemannian manifolds

M

of dimension 1. The main interest of such manifolds lies in their analytic tractability. Examples of 1-dimensional manifolds are the unit sphere

S^{1}

(namely, the unit circle) and the special orthogonal group

SO (2)

.

In order to particularize the general Equation (304) to the case of a 1-dimensional Riemannian manifold, the following preliminary observations are in order. On a Riemannian manifold

M

of dimension 1, the tangent space

T_{x} M

is a linear space of dimension 1 whose basis vector is denoted by

\partial_{x}

, and the metric tensor

G_{x}

collapses into a scalar function

g_{x} > 0

,

x \in M

. Given two tangent vectors

v \partial_{x}, w \partial_{x} \in T_{x} M

, their inner product

G_{x} (v \partial_{x}, w \partial_{x}) = g_{x} v w

. Consequently, the squared norm

∥ v \partial_{x} ∥_{x}^{2}

equals

g_{x} v^{2}

. Likewise, the cotangent space

T_{x}^{🟉} M

has dimension 1 and its basis vector is denoted by

d x

. Given a cotangent vector

u d x

, it holds that

G_{x}^{♯} (u d x) = \frac{u}{g_{x}} \partial_{x}

. Hence, in the present example, the only Christoffel symbol

{\overset{G}{Γ}}_{11}^{1}

reads

{\overset{G}{γ}}_{x} : = \frac{1}{2} \frac{1}{g_{x}} \frac{d g_{x}}{d x} = \frac{1}{2} \frac{d \log g_{x}}{d x}

and the only non-zero value of the Christoffel form

{\overset{G}{Γ}}_{x} (v \partial_{x}, w \partial_{x})

is

{\overset{G}{γ}}_{x} v w

. Given a potential energy function

V : M \to R

, the only non-zero component of its gradient

G_{x}^{♯} (d_{x} V_{x})

equals

\frac{1}{g_{x}} \frac{d V_{x}}{d x}

. The general Equation (304), particularized to a 1-dimensional Riemannian manifold, reads:

\{\begin{matrix} \dot{x} = v, \\ \dot{v} = - {\overset{G}{γ}}_{x} v^{2} - \frac{1}{g_{x}} \frac{d V_{x}}{d x} - μ {(g_{x} v^{2})}^{ϵ - 1} v - φ_{x} (v) . \end{matrix}

(310)

An interesting special case is the one where the metric is independent of the position, namely

g_{x} = 1

, that implies

{\overset{G}{γ}}_{x} = 0

. In this case, the dynamical system (310) particularizes to:

\{\begin{matrix} \dot{x} = v, \\ \dot{v} = - \frac{d V_{x}}{d x} - μ v^{2 (ϵ - 1)} v - φ_{x} (v) . \end{matrix}

(311)

The above dynamical system is a prototype for the classical nonlinear damped oscillators, such as the van der Pol oscillator and the simple pendulum.
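As a numerical illustration of system (311), the following minimal sketch integrates a damped pendulum, namely the case $V_{x} = κ (1 - cos x)$ with $φ = 0$, and records the total energy $H = \frac{1}{2} v^{2} + V_{x}$ along the trajectory. The parameter values, the initial state and the choice of a classical fourth-order Runge-Kutta integrator are assumptions made here for illustration only.

```python
import numpy as np

# Illustrative parameter values (assumptions, not taken from the text).
KAPPA, MU, EPS = 1.0, 0.2, 1.0

def potential(x):
    """Pendulum-type potential V_x = kappa * (1 - cos x)."""
    return KAPPA * (1.0 - np.cos(x))

def rhs(state):
    """Right-hand side of system (311) with phi = 0."""
    x, v = state
    damping = MU * (v * v) ** (EPS - 1.0) * v
    return np.array([v, -KAPPA * np.sin(x) - damping])

def rk4_step(state, h):
    """One step of the classical fourth-order Runge-Kutta scheme."""
    k1 = rhs(state)
    k2 = rhs(state + 0.5 * h * k1)
    k3 = rhs(state + 0.5 * h * k2)
    k4 = rhs(state + h * k3)
    return state + (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)

def simulate(x0=2.0, v0=0.0, h=0.01, steps=2000):
    """Return the total energy recorded at every time step."""
    state = np.array([x0, v0])
    energies = []
    for _ in range(steps):
        energies.append(0.5 * state[1] ** 2 + potential(state[0]))
        state = rk4_step(state, h)
    return np.array(energies)

energies = simulate()
print(energies[0], energies[-1])
```

With $μ > 0$ the recorded energy is expected to decrease along the trajectory, in agreement with the power law (308) restricted to the friction-type term.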

The sphere

S^{1}

embedded in

R^{2}

may be described as the set of arrays of the form

x = {[cos θ sin θ]}^{⊤}

, with

θ \in R

. Then, it holds that:

\frac{d}{d t} [\begin{matrix} cos θ \\ sin θ \end{matrix}] = \dot{θ} [\begin{matrix} - sin θ \\ cos θ \end{matrix}] = v \partial_{x},

where

v : = \dot{θ}

and

\partial_{x} : = {[- sin θ cos θ]}^{⊤}

. Choosing the Euclidean inner product on the tangent spaces

T_{x} S^{1}

, it holds that

g_{x} = G_{x} (\partial_{x}, \partial_{x}) = [- sin θ cos θ] {[- sin θ cos θ]}^{⊤} = 1

. Hence, the dynamical equations on

S^{1}

endowed with the Euclidean metric are as in (311).

The exponential map corresponding to the above geometric setting may be expressed as:

\exp_{x} (v \partial_{x}) = cos (v) x + sin (v) \partial_{x} .

(312)

Given two points

y, z \in S^{1}

, solving the equation

z = \exp_{y} (v \partial_{y})

in the unknown v gives

v = arccos (z^{⊤} y)

, hence:

\log_{y} (z) = arccos (z^{⊤} y) \partial_{y},

(313)

and the geodesic distance between the points y and z is given by:

D (z, y) = arccos (z^{⊤} y) .

(314)

In the above expressions the symbol ‘arccos’ denotes the inverse cosine function. Parametrizing the point y by

{[cos θ sin θ]}^{⊤}

and the point z by

{[cos r sin r]}^{⊤}

, the squared geodesic distance may be written as

D^{2} (z, y) = {(θ - r)}^{2}

.
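The maps (312)–(314) on the unit circle are simple enough to be coded directly. The sketch below is illustrative only; the function names are hypothetical. Note that the expression $arccos (z^{⊤} y)$ only returns a magnitude, so the helper `log_coeff` attaches a sign according to the side of y on which z lies; this sign rule is an assumption added here so that the logarithm inverts the exponential.

```python
import numpy as np

def point(theta):
    """Point [cos(theta), sin(theta)]^T on the unit circle."""
    return np.array([np.cos(theta), np.sin(theta)])

def basis(x):
    """Unit tangent vector at x = [cos t, sin t]^T, namely [-sin t, cos t]^T."""
    return np.array([-x[1], x[0]])

def exp_map(x, v):
    """Exponential map (312): cos(v) x + sin(v) d_x."""
    return np.cos(v) * x + np.sin(v) * basis(x)

def dist(z, y):
    """Geodesic distance (314): arccos(z^T y)."""
    return float(np.arccos(np.clip(z @ y, -1.0, 1.0)))

def log_coeff(y, z):
    """Signed coefficient v with z = exp_y(v d_y), for |v| < pi.

    The sign factor is an assumption added for illustration.
    """
    s = np.sign(z @ basis(y))
    return (s if s != 0 else 1.0) * dist(z, y)

theta, r = 1.3, 0.4
y, z = point(theta), point(r)
v = log_coeff(y, z)
print(dist(z, y) ** 2, (theta - r) ** 2)
print(exp_map(y, v), z)
```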

5.4.4. Potential Energy Functions

The present subsection retraces classical potential energy functions summarized in Section 5.4.1 and extends such potentials to a general Riemannian manifold.

A well-documented potential energy function is the one arising in the study of the simple pendulum oscillating around the reference angle

r = 0

. In the classical case that

M = R

, it reads

V_{x} \propto 1 - cos x

. Such potential energy function may be extended to a general Riemannian manifold

M

, endowed with a Riemannian distance function

D (🟉, 🟉)

, as:

V_{x}^{(pen)} : = κ (1 - cos D (x, r)),

(315)

with

κ > 0

being a constant parameter and

r \in M

denoting a reference point. The potential

V_{x}^{(pen)}

presents one of its minima in

x = r

. Rewriting the pendulum-type potential as

V_{x}^{(pen)} = κ - κ cos [{(D^{2} (x, r))}^{\frac{1}{2}}]

it is immediate to see that, according to the calculation rule (300), its Riemannian gradient reads:

G_{x}^{♯} (d_{x} V_{x}^{(pen)}) = - κ \frac{sin D (x, r)}{D (x, r)} \log_{x} (r) .

(316)

The potential energy function introduced to state the van der Pol dynamical system in the mono-dimensional state space

R

is a quadratic function centered in the reference point

r = 0

, namely

V_{x} \propto x^{2}

. Such potential energy function may be extended to a general Riemannian manifold

M

by first generalizing the squared distance-to-zero

x^{2}

by a squared distance-to-reference

{(x - r)}^{2}

and successively by replacing the Euclidean distance by the geodesic distance, namely as

V_{x}^{(pol)} : = \frac{κ}{2} D^{2} (x, r),

(317)

with

κ > 0

being a constant parameter and

r \in M

denoting a reference point. The potential

V_{x}^{(pol)}

presents a single minimum in

x = r

. Moreover, according to the calculation rule (300), its Riemannian gradient reads:

G_{x}^{♯} (d_{x} V_{x}^{(pol)}) = - κ \log_{x} (r) .

(318)

The contribution [40] studied an extension of the (hard) Duffing oscillator to two dimensions, namely, to the state space

R^{2}

. The key point is to extend the original potential function

V_{x} : R \to R

through vector norm as:

V_{x} = \frac{1}{2} Ω_{0}^{2} {∥ x ∥}^{2} + \frac{1}{4} κ Ω_{0}^{2} {∥ x ∥}^{4},

(319)

where

∥ 🟉 ∥

denotes the Euclidean norm, while

Ω_{0} > 0

and

κ > 0

are free parameters. The quantity

∥ x ∥

coincides with the Euclidean distance of the state x to the reference point

r = 0

. Such observation paves the way to an extension to a general Riemannian manifold

M

, that is achieved by replacing the Euclidean distance with the Riemannian distance:

V_{x}^{(duf)} : = \pm \frac{1}{2} D^{2} (x, r) \pm \frac{1}{4} κ D^{4} (x, r),

(320)

where again the quantity

D (🟉, 🟉)

denotes the Riemannian distance between two points on the manifold

M

, the constant

κ > 0

denotes a free parameter that influences the strength of the conservative forcing term corresponding to the potential, and

r \in M

represents a reference point on the state manifold around which the trajectory of the dynamical system develops. The signs ± may be chosen arbitrarily and their choice gives rise to the soft and the double-well Duffing oscillator, as well as to the hard Duffing oscillator discussed in the paper [40]. According to the calculation rule (300), the Riemannian gradient of the extended Duffing potential

V_{x}^{(duf)}

, in the case of a hard Duffing oscillator, reads:

G_{x}^{♯} (d_{x} V_{x}^{(duf)}) = - (1 + κ D^{2} (x, r)) \log_{x} (r) .

(321)

Likewise, the Keplerian system mentioned in [40] may be extended to a general Riemannian manifold

M

through the introduction of the Riemannian distance

D (🟉, 🟉)

, by re-defining the Keplerian potential as:

V_{x}^{(kep)} : = - \frac{ρ}{D (x, r)} + ϵ D (x, r),

(322)

with

ρ, ϵ > 0

denoting parameters of the potential energy function. By rewriting such Keplerian-like potential as

- ρ {[D^{2} (x, r)]}^{- \frac{1}{2}} + ϵ {[D^{2} (x, r)]}^{\frac{1}{2}}

and invoking again the calculation rule (300), it is immediate to see that its Riemannian gradient takes the expression:

G_{x}^{♯} (d_{x} V_{x}^{(kep)}) = - (\frac{ρ}{D^{3} (x, r)} + \frac{ϵ}{D (x, r)}) \log_{x} (r) .

(323)
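The gradient expressions (316), (318) and (321) may be checked numerically on a concrete manifold. The sketch below is an illustration under stated assumptions: it takes the unit sphere $S^{2}$ embedded in $R^{3}$ with the Euclidean metric, for which $D (x, r) = arccos (x^{⊤} r)$, and compares each closed-form gradient against a central finite difference of the potential along a tangent direction. The value of κ, the test points and the helper names are hypothetical.

```python
import numpy as np

def dist(x, r):
    """Geodesic distance on the unit sphere."""
    return float(np.arccos(np.clip(x @ r, -1.0, 1.0)))

def log_map(x, r):
    """Logarithmic map on the unit sphere (r not antipodal to x)."""
    p = r - (x @ r) * x
    n = np.linalg.norm(p)
    return np.zeros(3) if n < 1e-12 else dist(x, r) * p / n

K = 0.7  # illustrative value of the parameter kappa

# For each potential: (V as a function of D, coefficient c(D) in grad V = -c(D) log_x(r)).
POTENTIALS = {
    "pen": (lambda D: K * (1.0 - np.cos(D)), lambda D: K * np.sin(D) / D),      # (315), (316)
    "pol": (lambda D: 0.5 * K * D**2, lambda D: K),                             # (317), (318)
    "duf": (lambda D: 0.5 * D**2 + 0.25 * K * D**4, lambda D: 1.0 + K * D**2),  # (320), (321)
}

def gradient(name, x, r):
    """Closed-form Riemannian gradient of the chosen potential."""
    _, coeff = POTENTIALS[name]
    return -coeff(dist(x, r)) * log_map(x, r)

def directional_derivative(name, x, r, w, h=1e-5):
    """Central difference of V along the curve t -> (x + t w) / |x + t w|."""
    V, _ = POTENTIALS[name]
    xp = (x + h * w) / np.linalg.norm(x + h * w)
    xm = (x - h * w) / np.linalg.norm(x - h * w)
    return (V(dist(xp, r)) - V(dist(xm, r))) / (2.0 * h)

x = np.array([1.0, 0.0, 0.0])
r = np.ones(3) / np.sqrt(3.0)
w = np.array([0.0, 0.3, -0.7])  # tangent at x, since x @ w == 0

errors = {name: abs(directional_derivative(name, x, r, w) - gradient(name, x, r) @ w)
          for name in POTENTIALS}
print(errors)
```

The Keplerian gradient (323) can be checked in the same way by adding an entry with coefficient $ρ / D^{3} + ϵ / D$.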

For the purpose of exemplification, let us consider two instances of non-linear damped oscillators on matrix manifolds.

Example 25.

As a first example, let us take as state space for non-linear manifold-type oscillators the manifold

S^{+} (n)

of symmetric positive-definite matrices, endowed with its canonical metric. If the potential energy function is written in terms of geodesic distance, its Riemannian gradient may be written by invoking the property (300). If the potential energy function is written as an explicit function of the matrix-variable

P \in S^{+} (n)

, then the Riemannian gradient, in terms of its Gateaux derivative

\partial_{P} V

, may be written as:

G_{P}^{♯} (d_{P} V_{P}) = \frac{1}{2} P (\partial_{P} V + \partial_{P}^{⊤} V) P .

(324)

By gathering the expressions of the Christoffel operator and of the Riemannian gradient of the potential energy function, the following dynamical system equations are obtained:

\{\begin{matrix} \dot{P} = U, \\ \begin{matrix} \dot{U} = & U P^{- 1} U - G_{P}^{♯} (d_{P} V_{P}) - μ {∥ U ∥}_{P}^{2 (ϵ - 1)} U - A sin (Ω t) P (1_{n \times n}^{⊤} + 1_{n \times n}) P \\ - φ_{P} (U) . \end{matrix} \end{matrix}

(325)

In the present case, the function

φ_{P} : S (n) \to S (n)

.

As a second example, let us assume a Stiefel manifold is endowed with its canonical metric. In some applications, the potential energy function is written explicitly in terms of the matrix-state

X \in St (n, p)

(rather than in terms of a Riemannian distance). In the principal/minor component analysis case, for instance, the potential energy function reads

V_{X} = \pm \frac{1}{2} tr (X^{⊤} Σ X)

, with

Σ \in S^{+} (n)

, and its Gateaux derivative reads

\partial_{X} V = \pm Σ X

. By the expressions of the Riemannian gradient of the potential energy function and of the Christoffel form, the following dynamical system is obtained:

\{\begin{matrix} \dot{X} = U, \\ \begin{matrix} \dot{U} = & - U U^{⊤} X - X U^{⊤} (I_{n} - X X^{⊤}) U - (\partial_{X} V - X \partial_{X}^{⊤} V X) - μ {∥ U ∥}_{X}^{2 (ϵ - 1)} U \\ - φ_{X} (U) - A sin (Ω t) (1_{n \times p} - X 1_{n \times p}^{⊤} X), \end{matrix} \end{matrix}

(326)

where it must hold that

φ_{X} : T_{X} St (n, p) \to T_{X} St (n, p)

.

6. Control Systems on Manifolds and Numerical Implementation

This section presents a survey of error feedback control systems on manifolds and their numerical implementation, with special emphasis on system synchronization.

Synchronization in a group of dynamical systems may happen forcibly or spontaneously. Spontaneous synchronization may be observed in the synchronous flashing of fireflies, in the flocking of fish swarms, in neuronal synchronization in the central nervous system as well as in arrays of semiconductor devices. As an example of a spontaneous synchronization model, we cite the Cucker-Smale system [41], which evolves on the state space

R^{p}

and that may be expressed as

{\ddot{x}}_{i} = \frac{κ}{N} \sum_{j = 1}^{N} ψ (∥ x_{i} - x_{j} ∥) ({\dot{x}}_{j} - {\dot{x}}_{i}),

(327)

where N denotes the number of individuals in the swarm,

i = 1, 2, \dots, N

denotes the ith individual whose state is denoted by

x_{i} \in R^{p}

,

κ > 0

denotes an interaction strength constant and

ψ

denotes a communication weight function that quantifies the degree of communication between individuals.

The classical Cucker-Smale model has been extended to manifolds, and its emergent properties have been studied, in the research paper [42]. The studied extension of the classical model (327) to a Riemannian manifold

M

reads

\nabla_{t}^{x_{i}} {\dot{x}}_{i} = \frac{κ}{N} \sum_{j = 1}^{N} ψ (D (x_{i}, x_{j})) (P^{x_{j} \to x_{i}} {\dot{x}}_{j} - {\dot{x}}_{i}),

(328)

where the Euclidean distance in

R^{p}

was replaced by a Riemannian distance in

M

, the state acceleration was replaced by covariant acceleration and the difference between velocities was replaced by a difference between parallel-transported velocities, which ensures that the trajectories

t \mapsto x_{i} (t)

develop entirely on the state manifold

M

.
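To make the classical model (327) concrete, the following minimal sketch simulates a small swarm in $R^{p}$ by forward-Euler integration. The communication weight $ψ (r) = {(1 + r^{2})}^{- β}$, the parameter values and the random initial states are assumptions made here for illustration; the text does not prescribe a specific ψ. The manifold version (328) would additionally require the parallel transport and the geodesic distance of the chosen manifold, which are not coded here.

```python
import numpy as np

def psi(r, beta=0.25):
    """Hypothetical communication weight, decreasing with the distance r."""
    return 1.0 / (1.0 + r**2) ** beta

def acceleration(x, v, kappa):
    """Right-hand side of (327) for all individuals at once."""
    n = x.shape[0]
    weights = psi(np.linalg.norm(x[:, None, :] - x[None, :, :], axis=2))
    dv = v[None, :, :] - v[:, None, :]  # dv[i, j] = v_j - v_i
    return (kappa / n) * np.einsum("ij,ijk->ik", weights, dv)

def simulate(n=5, p=2, kappa=1.0, h=0.01, steps=2000, seed=0):
    """Forward-Euler simulation; returns the velocity spread over time
    together with the initial and final mean velocities."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((n, p))
    v = rng.standard_normal((n, p))
    mean_start = v.mean(axis=0)
    spread = []
    for _ in range(steps):
        spread.append(np.linalg.norm(v - v.mean(axis=0)))
        a = acceleration(x, v, kappa)
        x = x + h * v
        v = v + h * a
    return np.array(spread), mean_start, v.mean(axis=0)

spread, mean_start, mean_end = simulate()
print(spread[0], spread[-1])
```

Since the weights are symmetric in i and j, the mean velocity is preserved, while the spread of the velocities around their mean shrinks as the swarm aligns.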

An application surveyed in this section is the synchronization of the motion of two quadrotor drones (although the achieved results can be extended even to a fleet of drones). In particular, we shall survey a Lie-group control theory to allow the synchronization of the motion of two twin quadrotor drones, whose model is of second order and evolves on a Lie group. A leader drone flies independently, while a follower should adapt its dynamics in such a way that its state synchronizes with the state of the leader. In the present context, we only focus on the rotational component of motion, which engages Lie-group control.

In order to state the quadcopter’s equations of motion, since its state space is a Lie group

(G, m, i, e)

, let us recall some relevant notation:

A left translation on a Lie group $G$ is denoted by $L : G \times G \to G$ and is defined as $L_{g} (h) : = m (i (g), h)$ . Since for the quadcopter model we deal with a matrix Lie group, we may take $m (g, h) : = g h$ , obtaining $L_{g} (h) = g^{- 1} h$ .
The Lie algebra $g$ associated to a Lie group $(G, m, i, e)$ is a vector space endowed with Lie brackets $[🟉, 🟉] : g \times g \to g$ and an adjoint endomorphism ${ad}_{ξ} η : = [ξ, η]$ . The pushforward map $d {(L_{g})}_{g} : T_{g} G \to g$ will be denoted as $d L_{g}$ for brevity.
Given a smooth function $ℓ : g \to R$ , for a matrix Lie group we can define the fiber derivative of ℓ, $\frac{δ ℓ}{δ ξ} \in g$ , with $ξ \in g$ , as the unique element on the algebra such that ${⟨\frac{δ ℓ}{δ ξ}, η⟩}_{e} = tr ({(\partial_{ξ} ℓ)}^{⊤} η)$ for each $η \in g$ , where the symbol $⟨ 🟉, 🟉 ⟩_{e}$ denotes an inner product on the vector space $g$ and the symbol $\partial_{ξ} ℓ$ denotes the Gateaux derivative (or Jacobian matrix) of the function ℓ with respect to the matrix-variable $ξ$ .
On a matrix Lie algebra, it holds that ${ad}_{ξ} η : = ξ η - η ξ$ . In order to ease the notation in the following sections, let us define a matrix anti-commutator $\{ξ, η\} : = ξ η + η ξ$ , which is a symmetric form, namely ${\{ξ, η\}}^{⊤} = \{η, ξ\}$ , no matter what is the structure of the arguments. Conversely, the matrix commutator in the algebra is an anti-symmetric bilinear form, namely $[ξ, η] + [η, ξ] = 0$ .

The present section is organized as follows. Section 6.1 presents a design and an analysis of a Lie-group synchronization theory, which is subsequently applied to synchronizing the motion of quadcopters in Section 6.2. Section 6.3 treats an important topic, namely synchronization with vanishing control effort. To end with, Section 6.4 describes, in general terms, the notion of feedback error velocity and sheds some light on the relationship between the principal pushforward map and the curvature endomorphism by means of homotopic nets.

6.1. Design and Analysis of a Lie-Group Synchronization Theory

The material reported in the present section is largely based on the previous contribution [43].

The Lie group considered for the quadcopter mathematical model is the special orthogonal group

SO (3)

whose Lie algebra is

so (3)

. The rotational dynamics of the leader system is governed by the equations

\{\begin{matrix} {\dot{R}}_{m} = R_{m} H_{m}, \\ {\dot{H}}_{m} = σ (H_{m}), \end{matrix}

(329)

with

(R_{m}, H_{m}) \in SO (3) \times so (3)

denoting the leader system’s state-pair, while the rotational dynamics of the controlled follower is governed by the equations

\{\begin{matrix} {\dot{R}}_{s} = R_{s} H_{s}, \\ {\dot{H}}_{s} = σ (H_{s}) + U, \end{matrix}

(330)

where

(R_{s}, H_{s}) \in SO (3) \times so (3)

denotes the follower system’s state-pair and

U \in so (3)

denotes a control field to be designed to make the two twin systems synchronize in time. The function

σ : so (3) \to so (3)

denotes the state-transition operator that ultimately describes the internal nature of the twin leader-follower systems. The mathematical model of the controlled follower system (330) will be further analyzed in Section 6.2.2 to verify the physical feasibility of the designed control strategy in relation to the effort demanded of the actuators by the control algorithm.

The control algorithm may be designed either to synchronize the two drones’ angular velocities or to synchronize their attitude. Such control goals are examined separately in what follows.

6.1.1. Velocity Synchronization

The aim of angular-velocity synchronization is to ensure that

lim_{t \to + \infty} {∥ H_{m} (t) - H_{s} (t) ∥}_{F} = 0 .

(331)

For this purpose, a proportional-type Lie-algebra controller of the following form may be designed:

U : = σ (H_{m}) - σ (H_{s}) + K_{P} E

(332)

where

E : = H_{m} - H_{s}

denotes the synchronization angular-velocity mismatch and

K_{P}

denotes a proportional action gain coefficient. By defining

κ (H_{m}, H_{s}) : = σ (H_{m}) - σ (H_{s}) \in so (3)

the control field can be rewritten as

U : = κ (H_{m}, H_{s}) + K_{P} E .

(333)

The function

κ (H_{m}, H_{s})

has the purpose of cancelling the dynamics of the follower and replacing it with the dynamics of the leader, hence it realizes a control design principle that we refer to as dynamics replacement. The leader-follower pair will be velocity-synchronized when

E \approx 0

.

The convergence of the synchronization error to zero is confirmed by the following result, proven in [44]: The control law (332), with

K_{P} > 0

, synchronizes in angular velocity the follower to the leader exponentially fast. This kind of controller does not ensure the full state synchronization for the leader-follower pair.
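The exponential rate can be read off directly from the models (329) and (330). The short derivation below is a sketch obtained by subtracting the second equation of (330) from the second equation of (329) and inserting the control law (332); it treats $K_{P}$ as a positive scalar gain.

```latex
\dot{E} = \dot{H}_{m} - \dot{H}_{s}
        = \sigma(H_{m}) - \sigma(H_{s}) - U
        = - K_{P} E
\quad \Longrightarrow \quad
E(t) = e^{- K_{P} t} \, E(0).
```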

6.1.2. Attitude Synchronization

The problem of attitude synchronization is more involved than the problem of velocity synchronization discussed above, since a purely proportional control field would not be effective. Attitude synchronization may be achieved by a proportional-derivative-type control. Attitude synchronization is achieved when

lim_{t \to + \infty} D (R_{m} (t), R_{s} (t)) = 0,

(334)

where

D : SO (3) \times SO (3) \to R

denotes a Riemannian distance in

SO (3)

.

The first step in the development of a synchronizing control algorithm is to define an error field that associates to each pair of leader/follower states a mismatch measure. We define an attitude synchronization error measure by the following expression

E : = Log (R_{s}^{⊤} R_{m}) .

(335)

This error expression is equivalent to a ‘difference’ between the states

R_{m}

and

R_{s}

, written for the

SO (3)

group and belongs to the Lie algebra

so (3)

. In order to compute the time-derivative

\dot{E}

, it is convenient to make use of the following workaround:

Exp (E) = R_{s}^{⊤} R_{m} \overset{d / d t}{\to} Exp (E) \dot{E} = {\dot{R}}_{s}^{⊤} R_{m} + R_{s}^{⊤} {\dot{R}}_{m},

(336)

where the symbol ‘

Exp

’ denotes the matrix exponential. Replacing the expressions of the time-derivatives of the states

R_{m}

and

R_{s}

by the expressions given in (329) and (330) leads to the following relationship relating the error velocity to the attitudes and the angular velocities of the two constituent subsystems

\dot{E} = H_{m} - {(R_{s}^{⊤} R_{m})}^{⊤} H_{s} (R_{s}^{⊤} R_{m}) .

(337)

Deriving once more with respect to the time parameter t, and noticing that

\frac{d}{d t} (R_{s}^{⊤} R_{m}) = (R_{s}^{⊤} R_{m}) \dot{E}

, gives

\ddot{E} = {\dot{H}}_{m} - {((R_{s}^{⊤} R_{m}) \dot{E})}^{⊤} H_{s} (R_{s}^{⊤} R_{m}) - {(R_{s}^{⊤} R_{m})}^{⊤} {\dot{H}}_{s} (R_{s}^{⊤} R_{m}) - {(R_{s}^{⊤} R_{m})}^{⊤} H_{s} (R_{s}^{⊤} R_{m}) \dot{E} .

(338)

Plugging in the expressions of

{\dot{H}}_{m}

and

{\dot{H}}_{s}

from (329) and (330) yields

\begin{matrix} \ddot{E} = & σ (H_{m}) + \dot{E} {(R_{s}^{⊤} R_{m})}^{⊤} H_{s} (R_{s}^{⊤} R_{m}) - {(R_{s}^{⊤} R_{m})}^{⊤} (σ (H_{s}) + U) (R_{s}^{⊤} R_{m}) \\ - {(R_{s}^{⊤} R_{m})}^{⊤} H_{s} (R_{s}^{⊤} R_{m}) \dot{E} . \end{matrix}

(339)

The Lie-algebra control field U is designed so that the error dynamics obeys the law

\ddot{E} = - K_{P} E - K_{D} \dot{E} .

(340)

In order to express the Lie-algebra control field U in an explicit form, it is necessary to equate the expression for

\ddot{E}

found in (339) to the desired expression of the error acceleration prescribed by the relation (340), which leads to

\begin{matrix} U : = & (R_{s}^{⊤} R_{m}) σ (H_{m}) {(R_{s}^{⊤} R_{m})}^{⊤} - σ (H_{s}) \\ + (R_{s}^{⊤} R_{m}) \dot{E} {(R_{s}^{⊤} R_{m})}^{⊤} H_{s} - H_{s} (R_{s}^{⊤} R_{m}) \dot{E} {(R_{s}^{⊤} R_{m})}^{⊤} \\ + K_{P} (R_{s}^{⊤} R_{m}) E {(R_{s}^{⊤} R_{m})}^{⊤} + K_{D} (R_{s}^{⊤} R_{m}) \dot{E} {(R_{s}^{⊤} R_{m})}^{⊤} . \end{matrix}

(341)

The error system (340) is asymptotically convergent to a zero synchronization error. In fact, under the assumption that

K_{P} > 0

and

K_{D} > 0

, the function

W : = - \frac{K_{P}}{2} tr (E^{2}) - \frac{1}{2} tr ({\dot{E}}^{2})

(342)

is Lyapunov for the error system (340). Notice that, since

W = \frac{K_{P}}{2} {∥ E ∥}_{F}^{2} + \frac{1}{2} {∥ \dot{E} ∥}_{F}^{2}

, from the condition

K_{P} > 0

, it follows that

W ⩾ 0

, with equality only when

E = \dot{E} = 0

. Upon derivation with respect to the time parameter, one gets

\begin{matrix} \dot{W} & = - K_{P} tr (E \dot{E}) - tr (\dot{E} \ddot{E}) \\ = - K_{P} tr (E \dot{E}) - tr (\dot{E} (- K_{P} E - K_{D} \dot{E})) \\ = - K_{P} tr (E \dot{E}) + K_{P} tr (\dot{E} E) + K_{D} tr ({\dot{E}}^{2}) \\ = K_{D} tr ({\dot{E}}^{2}) . \end{matrix}

(343)

Since it holds that

\dot{W} = - K_{D} {∥ \dot{E} ∥}_{F}^{2}

, from the condition

K_{D} > 0

, it follows that

\dot{W} ⩽ 0

with equality only when

\dot{E} = 0

.

The above result shows that the Lie-algebra control-field (341) makes the attitude of the follower subsystem (330) synchronize to that of the leader subsystem (329).
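The attitude-synchronizing control field (341) can be exercised in a small simulation. The sketch below is illustrative and rests on several assumptions that are not part of the text: the state-transition operator σ is taken as a free rigid body with a hypothetical inertia matrix `J`, the error velocity is computed from (337), the gains and initial states are arbitrary, and the integration uses a simple forward-Euler step on the algebra combined with the matrix exponential on the group (coded through Rodrigues' formula).

```python
import numpy as np

def hat(w):
    """Map a vector of R^3 to a skew-symmetric matrix in so(3)."""
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def vee(H):
    """Inverse of hat."""
    return np.array([H[2, 1], H[0, 2], H[1, 0]])

def Exp(H):
    """Matrix exponential on so(3) via Rodrigues' formula."""
    th = np.linalg.norm(vee(H))
    if th < 1e-10:
        return np.eye(3) + H + 0.5 * H @ H
    return np.eye(3) + (np.sin(th) / th) * H + ((1.0 - np.cos(th)) / th**2) * (H @ H)

def Log(Q):
    """Principal matrix logarithm on SO(3), valid away from rotation angle pi."""
    th = np.arccos(np.clip(0.5 * (np.trace(Q) - 1.0), -1.0, 1.0))
    if th < 1e-10:
        return 0.5 * (Q - Q.T)
    return (th / (2.0 * np.sin(th))) * (Q - Q.T)

J = np.diag([1.0, 2.0, 3.0])  # hypothetical inertia matrix (assumption)

def sigma(H):
    """Hypothetical state-transition operator: free rigid-body dynamics."""
    w = vee(H)
    return hat(np.linalg.solve(J, np.cross(J @ w, w)))

def control(Rm, Hm, Rs, Hs, KP, KD):
    """Control field (341), with the error velocity taken from (337)."""
    Q = Rs.T @ Rm
    E = Log(Q)
    A = Q @ (Hm - Q.T @ Hs @ Q) @ Q.T  # Q Edot Q^T
    return (Q @ sigma(Hm) @ Q.T - sigma(Hs) + A @ Hs - Hs @ A
            + KP * (Q @ E @ Q.T) + KD * A)

def simulate(T=10.0, h=2e-3, KP=4.0, KD=4.0):
    """Return the Frobenius norm of the error (335) at every time step."""
    Rm, Hm = np.eye(3), hat([0.3, -0.2, 0.5])
    Rs, Hs = Exp(hat([0.8, 0.4, -0.5])), np.zeros((3, 3))
    errors = []
    for _ in range(int(round(T / h))):
        errors.append(np.linalg.norm(Log(Rs.T @ Rm)))
        U = control(Rm, Hm, Rs, Hs, KP, KD)
        Rm, Hm = Rm @ Exp(h * Hm), Hm + h * sigma(Hm)
        Rs, Hs = Rs @ Exp(h * Hs), Hs + h * (sigma(Hs) + U)
    return np.array(errors)

errors = simulate()
print(errors[0], errors[-1])
```

With positive gains the attitude error is expected to decay towards zero, consistently with the Lyapunov argument based on the function (342).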

6.2. Application to Quadcopters Synchronization

The material reported in the present section is largely based on the previous contribution [45]. In the present section, we shall apply the above-recalled synchronization theory to mathematical models of quadcopters. For an introductory review, interested readers might want to consult, e.g., [46]. In particular, we are going to retrace the mathematical modeling of a quadcopter drone based on the standard approach of Lagrange-Euler-d’Alembert, where the exogenous generalized forces are computed on the basis of the specific structure of a quadrotor drone. The main difference with respect to the general case considered earlier in this tutorial paper is that the Lagrange-Euler-d’Alembert principle is applied to a system whose state space is a Lie group, which leads to the so-called Euler-Poincaré equations.

6.2.1. Lie-Group Model Formulation of a Quadrotor Drone

A quadrotor drone (also called quadcopter) is made of a body

B

and of four rotors

R_{a}

, with

a = 1, 2, 3, 4

, termed propellers, directed upwards and placed in a square formation with equal distance from the center. Among the four propellers, rotors 2 and 4 spin clockwise, while rotors 1 and 3 spin counterclockwise [45]. Translation along a horizontal plane may be achieved by tilting the quadrotor body by modulating the rotation speed of every rotor, while a vertical displacement is achieved by modulating the total thrust exerted by the rotors on the drone’s body. It is customary to attach a reference frame to the quadrotor’s body, whose three orthogonal axes are denoted by x, y and z. A rotation of the body around the x axis is termed roll, a rotation around the y axis is termed pitch, while a rotation around the z axis is termed yaw.

The mathematical model of a quadcopter recalled here holds under a number of assumptions, namely: (1) the structure of the quadrotor is rigid and axis-symmetrical, (2) the center of the body-fixed reference-frame axes coincides with the center of mass of the quadrotor’s structure, (3) the propellers are rigid, (4) the thrust and drag forces are proportional to the square of rotors’ angular speed.

A mathematical model of a composite roto-translating body may be written through the well-known formalism of the Lagrange-Euler-d’Alembert principle. For a dynamical system whose state space is a Lie group, the principle states that the system generates a trajectory

[a, b] ∋ t \mapsto g (t) \in G

such that

δ \int_{a}^{b} L (g, \dot{g}) d t + \int_{a}^{b} {⟨F (g, \dot{g}), δ g⟩}_{g} d t = 0,

(344)

where the quantity

L : T G \to R

denotes a Lagrangian function for the dynamical system, the symbol

⟨ 🟉, 🟉 ⟩_{🟉}

denotes a metric for the state space, the vector field

F : T G \to T G

denotes a generalized force field, the quantity

\dot{g}

denotes the state (or configuration) velocity and the symbol

δ

represents a variation across two infinitely-close trajectories.

It pays now to recall the notion of left-invariance with special emphasis on the terms that appear in the expression (344):

Left-invariance of the kinetic energy: The kinetic energy associated to a trajectory g is defined as $K_{g} (\dot{g}) = \frac{1}{2} {⟨ \dot{g}, \dot{g} ⟩}_{g}$ . We shall assume the inner product representing the metric to be left-invariant, which may be expressed analytically by the condition that

${⟨ d L_{g} (\dot{g}), d L_{g} (\dot{g}) ⟩}_{e} = {⟨ \dot{g}, \dot{g} ⟩}_{g},$

(345)

namely, the inner product is invariant under left-translation. In practice, this means that the inner product ${⟨ \dot{g}, \dot{g} ⟩}_{g}$ does not depend explicitly on the configuration g and on the velocity $\dot{g}$ , but only on their composition $d L_{g} (\dot{g})$ . Hence, the kinetic energy may be written as $K_{g} (\dot{g}) = k (d L_{g} (\dot{g}))$ , where $k : g \to R$ denotes a ‘reduced’ kinetic energy function.
Left-invariance of the potential energy: There is no notion of left-invariance for the potential function, since it does not depend on the velocity $\dot{g}$ . We shall henceforth assume (and we shall see that it is, in fact, the case) that the potential energy function for the system is simply constant with respect to the configuration g.
Left-invariance of the generalized force field: In general, the product ${⟨F (g, \dot{g}), δ g⟩}_{g}$ depends explicitly on the quantities g and $\dot{g}$ . Under the assumption that the inner product is left-invariant, we may however write that

${⟨F (g, \dot{g}), δ g⟩}_{g} = {⟨d L_{g} (F (g, \dot{g})), d L_{g} (δ g)⟩}_{e} .$

(346)

To this, we add the assumption that the force field F be left-invariant, namely that $d L_{g} (F (g, \dot{g})) = f (d L_{g} (\dot{g}))$ , where $f : g \to g$ denotes a ‘reduced force field’.

In a left-invariant setting, the principle (344) can be simplified as shown in [45] to

δ \int_{a}^{b} ℓ (d L_{g} (\dot{g})) d t + \int_{a}^{b} {⟨f (d L_{g} (\dot{g})), d L_{g} (δ g)⟩}_{e} d t = 0,

(347)

where

ℓ : g \to R

represents a reduced Lagrangian function.

Now, the following result is instrumental in order to cast the integral principle (347) in a differential form [47]: Let

g (s, t)

denote a continuous family of trajectories indexed by the variable s and define

ξ : = d L_{g} (\dot{g})

and

η : = d L_{g} (δ g)

, where the variation δ refers to the variable s. The solution of the integral Euler-Lagrange-d’Alembert Equation (344) under perturbations of the form

\frac{δ ξ}{δ s} = \dot{η} + {ad}_{ξ} η

satisfies the Euler-Poincaré equation

\frac{d}{d t} \frac{δ ℓ}{δ ξ} = {ad}_{ξ}^{*} (\frac{δ ℓ}{δ ξ}) + f,

(348)

where the superscript

^{*}

denotes a dual operator.

Thanks to such fundamental result, the equations that describe the motion of a dynamical system on a state Lie group

G

may be cast as

\{\begin{matrix} \dot{g} = {(d L_{g})}^{- 1} (ξ), \\ \frac{d}{d t} \frac{δ ℓ}{δ ξ} = {ad}_{ξ}^{*} (\frac{δ ℓ}{δ ξ}) + f . \end{matrix}

(349)

The exogenous forcing term f accounts for several contributions to the dynamics, such as energy dissipation due to friction and energy-injecting active control fields.

The Euler-Poincaré Equation (348) describes a dynamical system subjected to holonomic constraints only, subsumed by the underlying Lie group. A system subjected to non-holonomic constraints (such as a sphere rolling on a plane without slipping) may be described by a modified Euler-Poincaré equation that takes a substantially more involved expression [48]. The point is that the set of velocities attainable by the system is restricted to a subset of the tangent bundle termed a distribution, which complicates the equations of motion.

The rotational component of motion of a rigid body, such as a drone, takes as descriptor an element in the Lie group

SO (3)

and in its associated Lie algebra

so (3)

. Let us now define a basis of the vector space

so (3) = span (ξ_{x}, ξ_{y}, ξ_{z})

as follows:

ξ_{x} : = [\begin{matrix} 0 & 0 & 0 \\ 0 & 0 & - 1 \\ 0 & 1 & 0 \end{matrix}], ξ_{y} : = [\begin{matrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ - 1 & 0 & 0 \end{matrix}], ξ_{z} : = [\begin{matrix} 0 & - 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{matrix}] .

(350)
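As a quick sanity check on the basis (350), the following minimal sketch verifies the commutation relations among the three generators and their norms under the canonical metric. The helper names `matmul`, `bracket` and `metric` are illustrative choices and do not appear in the text.

```python
# Basis of so(3) from Equation (350), stored as nested lists.
XI_X = [[0, 0, 0], [0, 0, -1], [0, 1, 0]]
XI_Y = [[0, 0, 1], [0, 0, 0], [-1, 0, 0]]
XI_Z = [[0, -1, 0], [1, 0, 0], [0, 0, 0]]


def matmul(a, b):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def bracket(a, b):
    """Matrix commutator [a, b] = a b - b a."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(3)] for i in range(3)]


def metric(a, b):
    """Canonical metric <a, b>_e = tr(a^T b)."""
    return sum(a[i][j] * b[i][j] for i in range(3) for j in range(3))
```

With these helpers, `bracket(XI_X, XI_Y)` returns `XI_Z` (and cyclically for the other pairs), while the three generators are mutually orthogonal with squared norm 2 under the canonical metric.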

Furthermore, by taking the canonical metric

{⟨ξ, η⟩}_{e} : = tr (ξ^{⊤} η)

, it is readily shown that the fiber derivative of the reduced Lagrangian reads

\frac{δ ℓ}{δ ξ} = \frac{1}{2} (\partial_{ξ} ℓ - {(\partial_{ξ} ℓ)}^{⊤}),

(351)

where

e = I_{3}

denotes the identity element in the Lie group

SO (3)

and symbol

\partial_{x}

denotes, once more, a Gateaux derivative.

A quadcopter model includes two specific constants, namely its total mass and its non-standard tensor of inertia, defined as

M_{q} : = M_{b} + \sum_{a = 1}^{4} M_{a}, {\hat{J}}_{q} : = {\hat{J}}_{b} + \sum_{a = 1}^{4} {\hat{J}}_{a},

(352)

where

M_{b}

and

{\hat{J}}_{b}

denote respectively the body mass of the quadcopter and its non-standard tensor of inertia, while

M_{a}

and

{\hat{J}}_{a}

, with

a = 1, 2, 3, 4

, denote respectively the mass of the rotors and their non-standard tensors of inertia. Let us recall that the non-standard tensor of inertia

\hat{J}

of a rigid body relates to its standard tensor of inertia J by the relationship

\hat{J} = tr (J) I_{3} - 2 J

(we mention, with apologies, that the relation reported in [45] was incorrect by a factor

\frac{1}{2}

). It follows that the quadcopter’s non-standard tensor of inertia

{\hat{J}}_{q}

can be deduced from the standard tensor of inertia, which is assumed to take the form of a

3 \times 3

diagonal matrix

J_{q} : = diag (J_{x}, J_{y}, J_{z}) > 0

, as follows:

{\hat{J}}_{q} = tr (J_{q}) I_{3} - 2 J_{q} = diag (- J_{x} + J_{y} + J_{z}, J_{x} - J_{y} + J_{z}, J_{x} + J_{y} - J_{z}) .

(353)
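The relationship (353) and its inverse may be sketched in a few lines of code for a diagonal inertia tensor. The function names are hypothetical; the inverse formula follows from the observation that the two tensors share the same trace.

```python
def nonstandard_inertia(jx, jy, jz):
    """Diagonal of J_hat = tr(J) I_3 - 2 J for J = diag(jx, jy, jz), Equation (353)."""
    trace = jx + jy + jz
    return (trace - 2 * jx, trace - 2 * jy, trace - 2 * jz)


def standard_inertia(j_hat):
    """Inverse map: since tr(J_hat) = tr(J), one has J = (tr(J_hat) I_3 - J_hat) / 2."""
    trace = sum(j_hat)
    return tuple((trace - entry) / 2 for entry in j_hat)
```

For instance, the (purely illustrative) values `jx, jy, jz = 1.0, 2.0, 3.0` give the non-standard diagonal `(4.0, 2.0, 0.0)`, which shows that the non-standard tensor is not necessarily positive-definite.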

Let us proceed on recalling the equations that describe the rotational component of motion. The angular velocities of the propellers will hereafter be denoted as

ω_{1}, ω_{2}, ω_{3}, ω_{4}

. In order to define the equations for the rotational component of motion for a quadcopter, we shall assume that all rotors are characterized by the same coefficient of rotational inertia

J_{R}

and that such rotors develop an angular velocity around the z axis

ξ_{a} : = {(- 1)}^{a} ω_{a} ξ_{z}

, where the sign-factors

{(- 1)}^{a}

are meant to take into account that the rotors

R_{1}

and

R_{3}

spin counterclockwise, while rotors

R_{2}

and

R_{4}

spin clockwise.

Each rotor exerts a thrust, denoted as

f_{a} \in R^{3}

, on the body of the quadcopter, whose magnitude is taken to be proportional to the square of the corresponding angular velocity

ω_{a}

via a lift constant

b > 0

, namely

f_{a} : = \frac{1}{2} b ω_{a}^{2} e_{z}, where e_{z} : = [\begin{matrix} 0 \\ 0 \\ 1 \end{matrix}] .

(354)

(The factor

\frac{1}{2}

has been introduced for convenience). The thrust produced by each rotor also exerts a mechanical torque on the quadcopter’s body via a moment arm, denoted as

p_{a} \in R^{3}

, given by

p_{1} : = [\begin{matrix} r \\ 0 \\ 0 \end{matrix}], p_{2} : = [\begin{matrix} 0 \\ r \\ 0 \end{matrix}], p_{3} : = [\begin{matrix} - r \\ 0 \\ 0 \end{matrix}], p_{4} : = [\begin{matrix} 0 \\ - r \\ 0 \end{matrix}],

(355)

where

r > 0

denotes the distance between each propeller's center and the center of mass of the quadcopter body, assumed to be the same for every propeller by symmetry.

The body of the quadcopter is subjected to further mechanical torques due to the rotation of each propeller. In fact, by virtue of the principle of angular momentum conservation, the spinning of each propeller produces a counter-torque that acts on the whole body of the quadcopter and tends to make it rotate about its z axis in the direction opposite to the resultant of the torques produced by the propellers. These torques are termed drag torques and their magnitude is proportional to the square of the corresponding propeller's angular velocity via a drag constant

γ > 0

. The resulting drag torques take the expression

τ_{a} : = \frac{1}{2} γ {(- 1)}^{a} ω_{a}^{2} ξ_{z} .

(356)

(Again the factor

\frac{1}{2}

has been introduced solely for convenience).

The resultant mechanical torque

τ \in so (3)

acting on the body of the quadcopter hence takes the expression

τ : = \overbrace{\frac{1}{2} \sum_{a = 1}^{4} {(- 1)}^{a} (f_{a} p_{a}^{⊤} - p_{a} f_{a}^{⊤})}^{\text{Active torque}} + \overbrace{\frac{γ}{2} \sum_{a = 1}^{4} {(- 1)}^{a} ω_{a}^{2} ξ_{z}}^{\text{Drag torque}} .

(357)

By splitting such total mechanical torque with respect to the basis (350), namely as

τ = τ_{x} ξ_{x} + τ_{y} ξ_{y} + τ_{z} ξ_{z}

, one finds the components of the torques along the axes of the body-fixed reference system

\{\begin{matrix} τ_{x} = \frac{1}{2} b r (ω_{4}^{2} - ω_{2}^{2}), \\ τ_{y} = \frac{1}{2} b r (ω_{3}^{2} - ω_{1}^{2}), \\ τ_{z} = \frac{1}{2} γ (- ω_{1}^{2} + ω_{2}^{2} - ω_{3}^{2} + ω_{4}^{2}) . \end{matrix}

(358)

The components

τ_{x}

and

τ_{y}

of the resultant torque, which quantify the degree of rolling and pitching, are controlled by the unbalance of the spinning velocity of two rotors (in particular, notice that the torque along the x axis is produced by the propellers located along the y axis and vice-versa). The component

τ_{z}

of the total torque, responsible for the yawing of the quadrotor body, is produced by the unbalance between the clockwise and counterclockwise spins. Yawing is generally an unwanted effect, unless the torque

τ_{z}

is small enough to produce only a slow yawing; therefore such component is carefully kept near zero by the onboard control system.
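The components (358) translate directly into code. The sketch below is only an illustration: the function name and the argument order are assumptions, and the numerical values used to exercise it are arbitrary.

```python
def torque_components(omega, b, r, gamma):
    """Body-frame torque components (tau_x, tau_y, tau_z) of Equation (358).

    omega: the four rotor angular velocities (omega_1, ..., omega_4);
    b: lift constant; r: moment arm length; gamma: drag constant.
    """
    w1, w2, w3, w4 = (w * w for w in omega)  # squared rotor speeds
    tau_x = 0.5 * b * r * (w4 - w2)          # roll: rotors located on the y axis
    tau_y = 0.5 * b * r * (w3 - w1)          # pitch: rotors located on the x axis
    tau_z = 0.5 * gamma * (-w1 + w2 - w3 + w4)  # yaw: spin-direction unbalance
    return tau_x, tau_y, tau_z
```

When all four rotors spin at the same speed, the three components vanish, which corresponds to a quadcopter that neither rolls, pitches nor yaws under the action of its propellers.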

Let us now denote by

R (t) \in SO (3)

the attitude of a quadcopter and by

ξ (t) \in so (3)

the angular velocity of this quadcopter, both regarded as functions of time t. The time-evolution of the state pair

(R, ξ) \in T SO (3)

is described by the following system of tangent-bundle differential equations:

\{\begin{matrix} \dot{R} = R ξ, \\ \{{\hat{J}}_{q}, \dot{ξ}\} = [{\hat{J}}_{q}, ξ^{2}] + [β, ξ] - \dot{β} + 2 τ, \\ β : = (- ω_{1} + ω_{2} - ω_{3} + ω_{4}) J_{R} ξ_{z}, \\ τ : = \frac{1}{2} b r (ω_{4}^{2} - ω_{2}^{2}) ξ_{x} + \frac{1}{2} b r (ω_{3}^{2} - ω_{1}^{2}) ξ_{y} + \frac{1}{2} γ (- ω_{1}^{2} + ω_{2}^{2} - ω_{3}^{2} + ω_{4}^{2}) ξ_{z}, \end{matrix}

(359)

where

\{A, B\} : = A B + B A

denotes the matrix anticommutator and

[A, B] : = A B - B A

denotes the Lie bracket (or simply commutator, in this case) of two matrices

A, B

. To confirm the above result notice that, since on the Lie algebra

so (3)

it holds that

{ad}_{ξ}^{*} = - {ad}_{ξ}

and on the

SO (3)

group it is verified that

d L_{R}^{- 1} (ξ) = R ξ

, the system of Euler-Poincaré Equation (349) may be rewritten as

\{\begin{matrix} \dot{R} = R ξ, \\ \frac{d}{d t} \frac{δ ℓ}{δ ξ} = - {ad}_{ξ} (\frac{δ ℓ}{δ ξ}) + τ, \end{matrix}

(360)

where

τ

denotes the resultant of all external mechanical torques. In the present case, the reduced Lagrangian

ℓ (ξ)

of the quadrotor coincides with its kinetic energy, since there is no potential associated with the rotational component of motion. The fiber derivative of the reduced Lagrangian reads

\frac{δ ℓ}{δ ξ} = \frac{1}{2} \{{\hat{J}}_{b}, ξ\} + \frac{1}{2} \sum_{a = 1}^{4} \{{\hat{J}}_{a}, ξ + ξ_{a}\} = \frac{1}{2} \{{\hat{J}}_{q}, ξ\} + \frac{1}{2} β

(361)

where we have defined, for convenience,

β : = (- ω_{1} + ω_{2} - ω_{3} + ω_{4}) J_{R} ξ_{z} .

(362)

The resultant of the external torques acting on the body of a quadcopter drone is expressed by the relationship (358).

Whenever a drone is neither taking off nor landing, it is customary to assume the residual rotor angular velocity

Ω_{r} : = ω_{1} - ω_{2} + ω_{3} - ω_{4}

to be constant. Therefore, the mathematical model (359) of the rotational motion of a quadcopter may be written in the following simplified form:

\{\begin{matrix} \dot{R} = R ξ, \\ \{{\hat{J}}_{q}, \dot{ξ}\} = [{\hat{J}}_{q}, ξ^{2}] + [Ω_{r} J_{R} ξ_{z}, ξ] + 2 τ . \end{matrix}

(363)

In order to describe the translational motion of a quadrotor, let us denote by

q \in R^{3}

the coordinate of the center of mass of the drone expressed in the inertial reference frame. The translational component of motion in a quadcopter obeys Newton’s second law of dynamics written in an inertial reference frame. The linear acceleration

\ddot{q}

is determined by the total thrust

f_{T} : = \sum_{a = 1}^{4} f_{a}

, by the gravitational acceleration g and by an aerodynamic drag term. Since all the rotor thrusts are directed along the

R e_{z}

direction, the Newton law governing the translational motion can be written as:

M_{q} \ddot{q} = \frac{1}{2} b (ω_{1}^{2} + ω_{2}^{2} + ω_{3}^{2} + ω_{4}^{2}) R e_{z} - g M_{q} e_{z} - Γ \dot{q},

(364)

where

Γ \in R^{3 \times 3}

denotes a positive-definite (and often diagonal) aerodynamic drag tensor. The first term on the right-hand side of the above equation shows that whenever a quadcopter is level with the ground (namely

R = I_{3}

), its center of mass can only move vertically, while whenever its body is tilted away (namely

R \neq I_{3}

) its center of mass can also move horizontally.
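The role of the attitude matrix in Equation (364) may be illustrated by a minimal sketch that evaluates the linear acceleration of the center of mass. The function name, the argument layout (nested lists for matrices) and the numerical values used to exercise it are illustrative assumptions.

```python
def linear_acceleration(rot, vel, omega, mass, b, drag, g=9.81):
    """Acceleration of the center of mass, i.e., Equation (364) divided by M_q.

    rot: 3x3 attitude matrix R; vel: velocity of the center of mass;
    omega: the four rotor speeds; drag: 3x3 aerodynamic drag tensor.
    """
    thrust = 0.5 * b * sum(w * w for w in omega)  # magnitude of the total thrust
    e_z = (0.0, 0.0, 1.0)
    acc = []
    for i in range(3):
        lift = thrust * rot[i][2]  # (R e_z)_i is the third column of R
        damping = sum(drag[i][j] * vel[j] for j in range(3))
        acc.append((lift - damping) / mass - g * e_z[i])
    return acc
```

With `rot` equal to the identity and zero velocity, only the vertical component of the returned acceleration may differ from zero, whereas a tilted attitude produces a horizontal component as well.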

In order to write the set of equations that drive the drone’s flight, it is necessary to express explicitly the Euler-Poincaré equation to cast it into the form

\dot{ξ} = σ (ξ, τ)

, where

σ : so (3) \times so (3) \to so (3)

denotes the state-transition function associated to this particular dynamical system. To this aim, it is useful to define a new inertia matrix

D : = diag (\sqrt{\frac{2 J_{y} J_{z}}{J_{x}}}, \sqrt{\frac{2 J_{x} J_{z}}{J_{y}}}, \sqrt{\frac{2 J_{x} J_{y}}{J_{z}}}) .

(365)

Straightforward calculations show that

\{{\hat{J}}_{q}, \dot{ξ}\} = D \dot{ξ} D

and that

\det (D) = 2 \sqrt{2 J_{x} J_{y} J_{z}} > 0

, from which it follows that the inertia matrix D is invertible. Therefore, the Euler-Poincaré equation may be rewritten explicitly as

\dot{ξ} = σ (ξ, τ) : = D^{- 1} ([{\hat{J}}_{q}, ξ^{2}] + [β, ξ] - \dot{β} + 2 τ) D^{- 1} .

(366)
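The identity between the anticommutator with the non-standard inertia tensor and the two-sided multiplication by D, as well as the value of the determinant of D, may be checked numerically. The sketch below assumes illustrative inertia values and hypothetical helper names; it exploits the fact that both matrices are diagonal and that the argument is skew-symmetric.

```python
import math


def inertia_matrix_d(jx, jy, jz):
    """Diagonal entries of the matrix D defined in Equation (365)."""
    return (math.sqrt(2 * jy * jz / jx),
            math.sqrt(2 * jx * jz / jy),
            math.sqrt(2 * jx * jy / jz))


def anticommutator_diag(diag, xi):
    """diag(d) xi + xi diag(d), whose entries are (d_i + d_j) xi_ij."""
    return [[(diag[i] + diag[j]) * xi[i][j] for j in range(3)] for i in range(3)]


def sandwich_diag(diag, xi):
    """diag(d) xi diag(d), whose entries are d_i xi_ij d_j."""
    return [[diag[i] * xi[i][j] * diag[j] for j in range(3)] for i in range(3)]


jx, jy, jz = 1.0, 2.0, 3.0  # illustrative values only
trace = jx + jy + jz
j_hat = (trace - 2 * jx, trace - 2 * jy, trace - 2 * jz)  # Equation (353)
d = inertia_matrix_d(jx, jy, jz)
xi = [[0.0, -0.3, 0.2], [0.3, 0.0, -0.1], [-0.2, 0.1, 0.0]]  # skew-symmetric
lhs = anticommutator_diag(j_hat, xi)  # {J_hat, xi}
rhs = sandwich_diag(d, xi)            # D xi D
```

The two matrices `lhs` and `rhs` agree up to rounding errors. Note that the identity relies on the vanishing diagonal of a skew-symmetric matrix and would not hold for a generic matrix.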

Ultimately, the mathematical laws that describe a quadcopter flight may be written explicitly by collecting the equations obtained in (359), (364) and (366) as

\{\begin{matrix} \dot{R} = R ξ, \\ \dot{ξ} = D^{- 1} ([{\hat{J}}_{q}, ξ^{2}] + [β, ξ] - \dot{β} + 2 τ) D^{- 1}, \\ β : = (- ω_{1} + ω_{2} - ω_{3} + ω_{4}) J_{R} ξ_{z}, \\ τ : = \frac{1}{2} b r (ω_{4}^{2} - ω_{2}^{2}) ξ_{x} + \frac{1}{2} b r (ω_{3}^{2} - ω_{1}^{2}) ξ_{y} + \frac{1}{2} γ (- ω_{1}^{2} + ω_{2}^{2} - ω_{3}^{2} + ω_{4}^{2}) ξ_{z}, \\ \dot{q} = v, \\ \dot{v} = \frac{1}{2} \frac{b}{M_{q}} (ω_{1}^{2} + ω_{2}^{2} + ω_{3}^{2} + ω_{4}^{2}) R e_{z} - g e_{z} - \frac{1}{M_{q}} Γ v, \end{matrix}

(367)

where the variable

v \in R^{3}

denotes the translational velocity of the center of mass of the quadrotor.

6.2.2. Physical Realizability of the Synchronizing Controller

The follower drone described by Equation (330) differs from the free leader drone by the application of a control field, which must be physically realizable, namely the control field U prescribed by the control algorithm should be effectively realizable by the propellers. A quadcopter is solely operated by changing its rotors’ angular velocities, therefore it is natural to require that the values assumed by the Lie-algebra control field U must be such that it can effectively be translated into achievable angular velocities.

As a first step in a realizability analysis, it is necessary to discern, from the Equation (363) that governs the evolution of the angular velocities of the follower, those terms that represent externally-controllable torques from the terms that represent unoperable torques. Such classification follows from the physical meaning of the mathematical terms and may be written as

\dot{ξ} = \overbrace{D^{- 1} ([{\hat{J}}_{q}, ξ^{2}] + [Ω_{r} J_{R} ξ_{z}, ξ]) D^{- 1}}^{\text{Unoperable component}} + 2 D^{- 1} τ D^{- 1},

(368)

where it is assumed that the residual rotor speed

Ω_{r}

is constant, hence the terms in

β

may be neglected. By defining

σ_{uc} (ξ) : = D^{- 1} ([{\hat{J}}_{q}, ξ^{2}] + [Ω_{r} J_{R} ξ_{z}, ξ]) D^{- 1}

and by equating the control field U to the operable component

2 D^{- 1} τ D^{- 1}

, the mathematical system (330) can be recast as

\{\begin{matrix} {\dot{R}}_{s} = R_{s} ξ_{s}, \\ {\dot{ξ}}_{s} = σ_{uc} (ξ_{s}) + U . \end{matrix}

(369)

In order to compute the angular velocities of the four propelling rotors corresponding to the control field U, it is possible to derive three conditions from the Equation (357), namely:

\{\begin{matrix} b r (ω_{4}^{2} - ω_{2}^{2}) = {\hat{ψ}}_{x}, \\ b r (ω_{3}^{2} - ω_{1}^{2}) = {\hat{ψ}}_{y}, \\ γ (- ω_{1}^{2} + ω_{2}^{2} - ω_{3}^{2} + ω_{4}^{2}) = {\hat{ψ}}_{z}, \end{matrix}

(370)

where

{\hat{ψ}}_{x}

,

{\hat{ψ}}_{y}

and

{\hat{ψ}}_{z}

are scalar control fields defined by

\frac{1}{2} D U D = : {\hat{ψ}}_{x} ξ_{x} + {\hat{ψ}}_{y} ξ_{y} + {\hat{ψ}}_{z} ξ_{z}

. The system (370) is under-determined, since it totals three equations in four unknowns, therefore a fourth constraint needs to be added by considering the follower in a hovering condition. The added constraint is:

\frac{b}{2} (ω_{1}^{2} + ω_{2}^{2} + ω_{3}^{2} + ω_{4}^{2}) = M_{q} g .

(371)

The above system of equations is linear in the unknowns

ω_{1}^{2}

,

ω_{2}^{2}

,

ω_{3}^{2}

and

ω_{4}^{2}

and admits as solutions the angular velocities

\{\begin{matrix} ω_{1} = \frac{1}{2 \sqrt{b r γ}} \sqrt{- 2 γ {\hat{ψ}}_{y} - b r {\hat{ψ}}_{z} + 2 r g γ M_{q}}, \\ ω_{2} = \frac{1}{2 \sqrt{b r γ}} \sqrt{b r {\hat{ψ}}_{z} - 2 γ {\hat{ψ}}_{x} + 2 r g γ M_{q}}, \\ ω_{3} = \frac{1}{2 \sqrt{b r γ}} \sqrt{2 γ {\hat{ψ}}_{y} - b r {\hat{ψ}}_{z} + 2 r g γ M_{q}}, \\ ω_{4} = \frac{1}{2 \sqrt{b r γ}} \sqrt{2 γ {\hat{ψ}}_{x} + b r {\hat{ψ}}_{z} + 2 r g γ M_{q}} . \end{matrix}

(372)

The solutions (372) are real-valued, hence physically realizable, in principle, whenever the following two conditions are met:

the values taken by the scalar control components ${\hat{ψ}}_{x}$ , ${\hat{ψ}}_{y}$ and ${\hat{ψ}}_{z}$ do not differ too much from one another and are absolutely bounded,
the control field components are small compared to the mass-related term $2 r g γ M_{q}$ .
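The closed-form solution (372) and the realizability conditions stated above may be sketched as follows. The function name is hypothetical, and the choice of raising an exception whenever a radicand turns negative is an illustrative design decision, not a prescription of the text.

```python
import math


def rotor_speeds(psi, b, r, gamma, mass, g=9.81):
    """Rotor speeds (omega_1, ..., omega_4) from Equation (372).

    psi: the scalar control fields (psi_x, psi_y, psi_z);
    b: lift constant; r: moment arm length; gamma: drag constant.
    """
    px, py, pz = psi
    hover = 2.0 * r * g * gamma * mass  # mass-related term
    radicands = (
        -2.0 * gamma * py - b * r * pz + hover,
        b * r * pz - 2.0 * gamma * px + hover,
        2.0 * gamma * py - b * r * pz + hover,
        2.0 * gamma * px + b * r * pz + hover,
    )
    if min(radicands) < 0.0:
        raise ValueError("control field not realizable: negative radicand")
    scale = 1.0 / (2.0 * math.sqrt(b * r * gamma))
    return tuple(scale * math.sqrt(x) for x in radicands)
```

Substituting the returned speeds back into the conditions (370) and (371) recovers the prescribed control components and the hovering constraint, while a vanishing control field returns four identical speeds.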

In practice, the spinning velocity of the propellers differs little from the spinning velocity corresponding to a hovering condition, termed steady-state velocity. This is defined as the spinning velocity that keeps a quadrotor still in mid-air. From the Equation (371), it is not hard to determine such propellers’ spinning velocity as:

ω_{ss} = \sqrt{\frac{g M_{q}}{2 b}} (rad / s) .

(373)

In order to exemplify the values of the parameters mentioned in this section, an OS4 Mini-VTOL (Vertical Take-Off and Landing) drone has been taken as reference model, as reported in [49]. The numerical values of the parameters corresponding to such quadrotor model are shown in Table 1.

Table 1. Typical parameter values of an OS4 Mini-VTOL quadcopter.

The value of the gravitational acceleration constant g was approximated at

9.81 m \cdot s^{- 2}

. The calculated steady-state rotor velocity corresponding to such set of parameters is

ω_{ss} = \sqrt{g M_{q} / 2 b} \approx 319

rad/s (roughly corresponding to 3046 revolutions per minute (RPM), a rather fast spin). Numerical simulations show that even a fractional deviation of one RPM with respect to the steady-state velocity produces a dramatic impact on the motion of a quadrotor.
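The steady-state velocity (373) and the conversion from rad/s to RPM quoted above may be reproduced by the short sketch below. The function names are illustrative, and no parameter values are assumed beyond the figure of 319 rad/s reported in the text.

```python
import math


def steady_state_speed(mass, b, g=9.81):
    """Hovering rotor speed of Equation (373), in rad/s."""
    return math.sqrt(g * mass / (2.0 * b))


def rad_per_s_to_rpm(omega):
    """Convert an angular velocity from rad/s to revolutions per minute."""
    return omega * 60.0 / (2.0 * math.pi)
```

Calling `rad_per_s_to_rpm(319.0)` returns a value that rounds to 3046, in agreement with the figure reported above.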

6.2.3. Numerical Recipes to Implement Synchronization of Quadrotors

We shall now illustrate a numerical technique to implement the recalled mathematical model as well as the synchronization algorithm on a computing platform.

A pseudo-code corresponding to the numerical solution of the differential equations describing the rotational component of the motion of a single drone is illustrated in Algorithm 1. The invoked numerical integration method represents a counterpart of the well-known Euler method adapted to manifolds.

Algorithm 1 Pseudo-code to implement the (uncontrolled) mathematical model of a quadrotor on a computing platform

Initialize the attitude $R_{0}$ and the rotation velocity $ξ_{0}$
Set the step-size $h$ and the number of numerical steps $K$ to complete a simulation
for $k = 0$ to $K - 1$ do
    Update the attitude $R_{k + 1} = R_{k} Exp (h ξ_{k})$
    Update the angular velocity $ξ_{k + 1} = ξ_{k} + h σ_{uc} (ξ_{k})$
end for
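A possible rendition of Algorithm 1 in code is sketched below. It is not meant as a reference implementation: the exponential map is computed through the Rodrigues formula (a standard closed form for skew-symmetric matrices of size 3, swapped in here for a generic matrix exponential), the helper names are hypothetical, and the inertia values used to exercise it are illustrative.

```python
import math

IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]


def matmul(a, b):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def exp_so3(xi):
    """Exponential of a skew-symmetric 3x3 matrix (Rodrigues formula)."""
    theta = math.sqrt(xi[2][1] ** 2 + xi[0][2] ** 2 + xi[1][0] ** 2)
    if theta < 1e-12:
        a, c = 1.0, 0.5  # limits of sin(t)/t and (1 - cos(t))/t^2 as t -> 0
    else:
        a, c = math.sin(theta) / theta, (1.0 - math.cos(theta)) / theta ** 2
    xi2 = matmul(xi, xi)
    return [[IDENTITY[i][j] + a * xi[i][j] + c * xi2[i][j] for j in range(3)]
            for i in range(3)]


def make_sigma_uc(jx, jy, jz, omega_r, j_rotor):
    """Build sigma_uc for J_q = diag(jx, jy, jz), see Equations (353), (365), (368)."""
    trace = jx + jy + jz
    j_hat = (trace - 2 * jx, trace - 2 * jy, trace - 2 * jz)
    d = (math.sqrt(2 * jy * jz / jx), math.sqrt(2 * jx * jz / jy),
         math.sqrt(2 * jx * jy / jz))
    s = omega_r * j_rotor
    beta = [[0.0, -s, 0.0], [s, 0.0, 0.0], [0.0, 0.0, 0.0]]  # Omega_r J_R xi_z

    def sigma_uc(xi):
        xi2 = matmul(xi, xi)
        bx, xb = matmul(beta, xi), matmul(xi, beta)
        # [J_hat, xi^2]_ij = (J_hat_i - J_hat_j) (xi^2)_ij since J_hat is diagonal;
        # D^{-1} M D^{-1} divides the entry M_ij by d_i d_j.
        return [[((j_hat[i] - j_hat[j]) * xi2[i][j] + bx[i][j] - xb[i][j])
                 / (d[i] * d[j]) for j in range(3)] for i in range(3)]

    return sigma_uc


def simulate(r0, xi0, sigma_uc, h, steps):
    """Algorithm 1: both updates use the velocity at the current step k."""
    r, xi = r0, xi0
    for _ in range(steps):
        r_next = matmul(r, exp_so3([[h * x for x in row] for row in xi]))
        sig = sigma_uc(xi)
        xi = [[xi[i][j] + h * sig[i][j] for j in range(3)] for i in range(3)]
        r = r_next
    return r, xi
```

Since each attitude update multiplies a rotation by the exponential of a skew-symmetric matrix, the iterates stay on the group SO(3) up to rounding errors, which is the main advantage over a plain Euler step applied to the matrix entries.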

A pseudo-code corresponding to the numerical counterpart of the differential equations describing the rotational component of the motion of a leader-follower pair of drones along with the application of a velocity-synchronization control field is illustrated in Algorithm 2.

Algorithm 2 Pseudo-code to implement numerically a leader-follower pair and velocity synchronization by a proportional-type control algorithm

Initialize the attitudes $R_{m, 0}$ , $R_{s, 0}$ and the rotation velocities $ξ_{m, 0}$ , $ξ_{s, 0}$
Set the step-size $h$, the number of numerical steps $K$ and the proportional control gain $K_{P}$
for $k = 0$ to $K - 1$ do
    Update $R_{m, k + 1} = R_{m, k} Exp (h ξ_{m, k})$
    Update $ξ_{m, k + 1} = ξ_{m, k} + h σ_{uc} (ξ_{m, k})$
    Evaluate the control field $U_{k} = σ_{uc} (ξ_{m, k}) - σ_{uc} (ξ_{s, k}) + K_{P} (ξ_{m, k} - ξ_{s, k})$
    Update $R_{s, k + 1} = R_{s, k} Exp (h ξ_{s, k})$
    Update $ξ_{s, k + 1} = ξ_{s, k} + h (σ_{uc} (ξ_{s, k}) + U_{k})$
end for
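The velocity updates of Algorithm 2 imply that the velocity error obeys $ξ_{m, k + 1} - ξ_{s, k + 1} = (1 - h K_{P}) (ξ_{m, k} - ξ_{s, k})$, hence it contracts geometrically whenever $0 < h K_{P} < 2$. The sketch below illustrates this behavior on the velocity part only (the attitude updates are omitted because the control field of Algorithm 2 depends on the velocities alone). Velocities are stored as coordinate triples with respect to the basis (350), and `toy_sigma` is a hypothetical placeholder for the unoperable component, not the quadrotor model.

```python
def velocity_sync(xi_m, xi_s, sigma_uc, h, steps, k_p):
    """Velocity part of Algorithm 2; returns the error norm at every step."""
    errors = []
    for _ in range(steps):
        errors.append(sum((m - s) ** 2 for m, s in zip(xi_m, xi_s)) ** 0.5)
        sig_m, sig_s = sigma_uc(xi_m), sigma_uc(xi_s)
        # Control field U_k = sigma_uc(xi_m) - sigma_uc(xi_s) + K_P (xi_m - xi_s)
        u = [a - b + k_p * (m - s)
             for a, b, m, s in zip(sig_m, sig_s, xi_m, xi_s)]
        xi_s = [s + h * (b + c) for s, b, c in zip(xi_s, sig_s, u)]
        xi_m = [m + h * a for m, a in zip(xi_m, sig_m)]
    errors.append(sum((m - s) ** 2 for m, s in zip(xi_m, xi_s)) ** 0.5)
    return errors


def toy_sigma(w):
    """Hypothetical non-linear drift used only to exercise the loop."""
    return [w[1] * w[2], -w[0] * w[2], 0.0]


errors = velocity_sync([0.1, 0.2, 1.0], [0.0, 0.0, 0.0], toy_sigma,
                       h=0.01, steps=100, k_p=5.0)
```

With the values above, the ratio `errors[-1] / errors[0]` matches the predicted contraction factor `(1 - 0.01 * 5.0) ** 100` up to rounding errors, irrespective of the drift term.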

Moreover, a pseudo-code corresponding to the numerical implementation of attitude synchronization by a proportional-derivative-type controller is illustrated in Algorithm 3.

Algorithm 3 Pseudo-code to implement numerically a leader-follower pair and an attitude synchronization control by a proportional-derivative-type controller

Initialize the attitudes $R_{m, 0}$ , $R_{s, 0}$ and the rotation velocities $ξ_{m, 0}$ , $ξ_{s, 0}$
Set the discrete step-size $h$, the number of numerical steps $K$, the proportional control gain $K_{P}$ and the derivative control gain $K_{D}$
for $k = 0$ to $K - 1$ do
    Update $R_{m, k + 1} = R_{m, k} Exp (h ξ_{m, k})$
    Update $ξ_{m, k + 1} = ξ_{m, k} + h σ_{uc} (ξ_{m, k})$
    Evaluate the synchronization error $ε_{k} = Log (R_{s, k}^{⊤} R_{m, k})$
    Evaluate the synchronization error velocity ${\dot{ε}}_{k} = ξ_{m, k} - {(R_{s, k}^{⊤} R_{m, k})}^{⊤} ξ_{s, k} (R_{s, k}^{⊤} R_{m, k})$
    Evaluate the control field (from Equation (341)) $U_{k} = (R_{s, k}^{⊤} R_{m, k}) σ_{uc} (ξ_{m, k}) {(R_{s, k}^{⊤} R_{m, k})}^{⊤} - σ_{uc} (ξ_{s, k}) + (R_{s, k}^{⊤} R_{m, k}) {\dot{ε}}_{k} {(R_{s, k}^{⊤} R_{m, k})}^{⊤} ξ_{s, k} - ξ_{s, k} (R_{s, k}^{⊤} R_{m, k}) {\dot{ε}}_{k} {(R_{s, k}^{⊤} R_{m, k})}^{⊤} + K_{P} (R_{s, k}^{⊤} R_{m, k}) ε_{k} {(R_{s, k}^{⊤} R_{m, k})}^{⊤} + K_{D} (R_{s, k}^{⊤} R_{m, k}) {\dot{ε}}_{k} {(R_{s, k}^{⊤} R_{m, k})}^{⊤}$
    Update $R_{s, k + 1} = R_{s, k} Exp (h ξ_{s, k})$
    Update $ξ_{s, k + 1} = ξ_{s, k} + h (σ_{uc} (ξ_{s, k}) + U_{k})$
end for

If the leader quadrotor is subjected to a difference in its rotor velocities that is too large (the maximum value of the difference for which the controller is effective depends on the value of the gains set), the control algorithm may lose effectiveness and might hence no longer be able to synchronize the two subsystems. Such observation is in accordance with practice: as mentioned, a quadrotor usually keeps a low difference among its propellers' spinning velocities in order to be able to trace a stable trajectory.

6.3. Synchronization of Second-Order Systems on Manifolds with Vanishing Control Effort

In the previous section we surveyed synchronization of two dynamical systems on Lie groups without regard to control effort, which quantifies the effort demanded of the actuators by the control algorithm as well as their energy consumption. In the present section we are going to recall from [50] a control theory that ensures that the control effort vanishes asymptotically as synchronization takes place.

In the present section we are studying a category of dynamical systems on a manifold

M

expressed by

\{\begin{matrix} \dot{x} = v, with x (0) = x_{0} \in M, t ⩾ 0, \\ \nabla_{v} v = f (x, v), with v (0) = v_{0} \in T_{x_{0}} M, t ⩾ 0, \end{matrix}

(374)

where

f : T M \to T M

denotes a system’s state-transition function and

\nabla_{v} v

denotes the covariant acceleration of the state. The first equation denotes a state-transition law, in the manifold-valued variable x, while the second equation denotes a velocity-transition law in the vector field v. Since the pair

(x, v)

is an element of the tangent bundle, the tangent bundle

T M

represents a phase-space for the system (374).

Given a system trajectory

(t, x (t)) \in R \times M

and a control vector field

(x (t), u (t)) \in T M

, one may introduce a control effort index associated to the vector control field u as the scalar field

(x (t), σ (t)) \in M \times R_{0}^{+}

defined by

σ : = \frac{1}{2} {∥ u ∥}_{x}^{2} .

(375)

The scalar field

σ

is defined as the squared amplitude of the control field and hence quantifies the ‘effort’ demanded by the control algorithm to the actuators [51].

An undesirable effect of the control schemes discussed in the previous sections is that the control effort

σ

does not necessarily vanish after synchronization is achieved, which entails unnecessary consumption of energy (and may therefore cause premature discharge of battery). In the present section, we shall describe a control scheme, valid for a generic Riemannian manifold, capable of making the control effort vanish while synchronization is being achieved. By denoting as

x_{s} \in M

the state of the follower system, one such control scheme reads

\{\begin{matrix} {\dot{x}}_{s} = v_{s} + u, \\ \nabla_{{\dot{x}}_{s}} v_{s} = f (x_{s}, v_{s}) + k u, \end{matrix}

(376)

where

k \in R

denotes an effort-limitation gain (not to be confused with the proportional control gain) and

f : T M \to T M

denotes the follower state-transition function. Notice that the follower system is described by a couple of first-order differential equations on the tangent bundle

T M

through the paired variables

(x_{s}, v_{s})

and that the control field u is present not only in the equation governing the evolution of the velocity but also in the one governing the evolution of the state, as opposed to the previous synchronization scheme described by Equation (330). It is hence legitimate to wonder whether such a system of equations represents a physical object. To answer this question, let us recast the above two equations as a single, second-order, equation by applying the covariant differentiation operator

\nabla_{{\dot{x}}_{s}}

to both sides of the first equation, which gives

\nabla_{{\dot{x}}_{s}} {\dot{x}}_{s} = \nabla_{{\dot{x}}_{s}} v_{s} + \nabla_{{\dot{x}}_{s}} u = f (x_{s}, {\dot{x}}_{s} - u) + k u + \nabla_{{\dot{x}}_{s}} u .

(377)

The controlled system hence is described by the equation

\nabla_{{\dot{x}}_{s}} {\dot{x}}_{s} = f (x_{s}, {\dot{x}}_{s}) + \tilde{u}, with \tilde{u} : = k u + \nabla_{{\dot{x}}_{s}} u + f (x_{s}, {\dot{x}}_{s} - u) - f (x_{s}, {\dot{x}}_{s}),

(378)

that represents a physical (second-order or double-integrator) dynamical system controlled by an effective control field

\tilde{u}

that appears to depend on a primary control field u. As an instance, if the primary control u is proportional to an error feedback and the state-transition function is independent of the velocity field

v_{s}

, then the effective control field

\tilde{u}

is of proportional-derivative type.

It is worth emphasizing that the constant k plays a fundamental role in ensuring the asymptotic vanishing to zero of the control efforts. Before getting to the main result of this section, it is surely instructive to take a look at a couple of simpler cases.

Example 26.

Assume that

M \equiv R^{n}

and that the function f is continuous in the first argument and Lipschitz-continuous in the second argument, with Lipschitz constant

L_{f}

. In this case, we shall consider the control system

\{\begin{matrix} \begin{matrix} {\dot{x}}_{m} = v_{m}, \\ {\dot{v}}_{m} = f (x_{m}, v_{m}), \end{matrix}\} \text{(Free leader)} \\ \begin{matrix} {\dot{x}}_{s} = v_{s} + u, \\ {\dot{v}}_{s} = f (x_{s}, v_{s}) + k u, \end{matrix}\} \text{(Controlled follower)} \\ u : = v_{m} - v_{s} - \frac{c}{2} (x_{s} - x_{m}), \text{(Primary control field)} \end{matrix}

(379)

where

c > 0

is a constant (that determines the synchronization speed). The control field u makes the follower synchronize to the leader exponentially fast. Moreover, a main finding is that, provided

k \neq L_{f} + \frac{c}{4}

and

k > L_{f}

, the control effort asymptotically vanishes to zero.

In order to show that the above control scheme is synchronizing, let us adopt again a Lyapunov-function perspective and define

\{\begin{matrix} W : = \frac{1}{2} {(x_{m} - x_{s})}^{⊤} (x_{m} - x_{s}), \\ σ : = \frac{1}{2} u^{⊤} u . \end{matrix}

(380)

It is not hard to show that

W

is indeed a Lyapunov function for the controller-follower subsystem, in fact:

\begin{matrix} \dot{W} & = {(x_{m} - x_{s})}^{⊤} ({\dot{x}}_{m} - {\dot{x}}_{s}) \\ & = {(x_{m} - x_{s})}^{⊤} (v_{m} - v_{s} - u) \\ & = {(x_{m} - x_{s})}^{⊤} (v_{m} - v_{s} - v_{m} + v_{s} + \frac{c}{2} (x_{s} - x_{m})) \\ & = - \frac{c}{2} {(x_{m} - x_{s})}^{⊤} (x_{m} - x_{s}) \\ & = - c W . \end{matrix}

(381)

The last equation implies that

∥ x_{m} (t) - x_{s} (t) ∥ = D_{0} e^{- c t / 2}

, where

∥ \cdot ∥

denotes the Euclidean vector norm and

D_{0} : = ∥ x_{m} (0) - x_{s} (0) ∥

. Therefore, the inter-state distance tends asymptotically to zero with an exponential speed (namely, quickly at the beginning and slowly while approaching synchronization).

In order to ascertain the dynamics of the control effort, it pays to rewrite the expression of the control field in two equally meaningful ways, namely

\begin{matrix} v_{m} - v_{s} = u + \frac{c}{2} (x_{s} - x_{m}), \end{matrix}

(382)

\begin{matrix} v_{s} + u - v_{m} = - \frac{c}{2} (x_{s} - x_{m}) . \end{matrix}

(383)

The above two equations imply, respectively, that

\begin{matrix} ∥ v_{m} - v_{s} ∥ ⩽ ∥ u ∥ + \frac{c}{2} ∥ x_{s} - x_{m} ∥, \end{matrix}

(384)

\begin{matrix} {\dot{x}}_{s} - {\dot{x}}_{m} = - \frac{c}{2} (x_{s} - x_{m}) . \end{matrix}

(385)

The dynamics of the control effort is hence found to be governed by the following relation:

\begin{matrix} \dot{σ} & = u^{⊤} \dot{u} \\ & = u^{⊤} ({\dot{v}}_{m} - {\dot{v}}_{s} - \frac{c}{2} ({\dot{x}}_{s} - {\dot{x}}_{m})) \\ & = u^{⊤} (f (x_{m}, v_{m}) - f (x_{s}, v_{s}) - k u - \frac{c}{2} ({\dot{x}}_{s} - {\dot{x}}_{m})) \\ & = - k u^{⊤} u + u^{⊤} (f (x_{m}, v_{m}) - f (x_{s}, v_{s})) + \frac{1}{4} c^{2} u^{⊤} (x_{s} - x_{m}), \end{matrix}

(386)

with initial condition

σ (0) = σ_{0}

and with

t ⩾ 0

. Since the function f was assumed to be continuous with respect to its x-argument and Lipschitz-continuous with respect to its v-argument, for all constants

ε > 0

, vectors

v_{m}

and

v_{s}

, there exist two constants

δ, L_{f} > 0

such that, for every

x_{m}, x_{s}

such that

∥ x_{m} - x_{s} ∥ < δ

, it results that

∥ f (x_{m}, v_{m}) - f (x_{s}, v_{s}) ∥ < ε + L_{f} ∥ v_{m} - v_{s} ∥

. By exploiting such inequality together with the Cauchy-Schwarz inequality for the inner product of two vectors, the right-hand side of the differential Equation (386) may be bounded from above, which gives the following differential inequation:

\dot{σ} < - 2 k σ + \sqrt{2 σ} (ε + L_{f} ∥ v_{m} - v_{s} ∥) + \frac{c^{2}}{4} \sqrt{2 σ} ∥ x_{s} - x_{m} ∥ .

(387)

By invoking the relationships (384) and (385), from the above inequality one obtains

\begin{matrix} \dot{σ} & < - 2 k σ + \sqrt{2 σ} (ε + L_{f} ∥ u ∥ + \frac{1}{2} c L_{f} ∥ x_{s} - x_{m} ∥) + \frac{1}{4} c^{2} \sqrt{2 σ} ∥ x_{s} - x_{m} ∥ \\ & = (- 2 k + 2 L_{f}) σ + (ε + \frac{1}{2} c L_{f} D_{0} e^{- c t / 2} + \frac{1}{4} c^{2} D_{0} e^{- c t / 2}) \sqrt{2 σ} . \end{matrix}

(388)

For the sake of notation conciseness, it is convenient to define the following two quantities

\{\begin{matrix} k^{*} : = 2 (k - L_{f}), \\ ρ (t) : = ε + (\frac{c L_{f}}{2} + \frac{c^{2}}{4}) D_{0} e^{- c t / 2}, \end{matrix}

(389)

in such a way that the above differential inequation may be rewritten compactly as

\dot{σ} + k^{*} σ < ρ \sqrt{2 σ} .

(390)

In view of applying a Grönwall argument [52], it is convenient to study the associated differential equation which, upon the change of variable

2 σ = : ξ^{2}

, may be cast as

\dot{ξ} (t) + k^{*} ξ (t) = ρ (t) .

(391)

This is a first-order, inhomogeneous linear differential equation with initial condition

ξ (0) = \sqrt{2 σ_{0}}

whose closed-form solution reads

ξ (t) = ξ_{*} e^{- k^{*} t} + \frac{D_{0}}{2 k^{*} - c} (c L_{f} + \frac{c^{2}}{2}) e^{- c t / 2} + \frac{ε}{k^{*}},

(392)

where

ξ_{*}

depends on the initial condition

ξ (0)

and is given by

ξ_{*} : = ξ_{0} - \frac{ε}{k^{*}} - \frac{c D_{0}}{2 k^{*} - c} (L_{f} + \frac{c}{2}) .

(393)

Note that, by hypothesis,

k^{*} > 0

and

k^{*} \neq \frac{c}{2}

, hence the above quantities are well-defined. Therefore, by Grönwall's inequality, the solution of the differential inequation (390) satisfies

σ < \frac{1}{2} {\{ξ_{*} e^{- k^{*} t} + \frac{c D_{0}}{2 k^{*} - c} (L_{f} + \frac{c}{2}) e^{- c t / 2} + \frac{ε}{k^{*}}\}}^{2} .

(394)

The first and the second term within the braces converge to zero. Since the quantity ε is arbitrary, it may be taken as small as desired, from which the assertion follows.

Since the two exponential functions within the braces peter out with different speeds, the asymptotic vanishing speed of the control effort is dominated by the slower of the two, namely by the value

min \{k^{*}, \frac{c}{2}\}

.
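The statements of Example 26 may be checked numerically on the real line. The sketch below integrates the system (379) by a forward Euler scheme; the choice $f (x, v) = - μ v - x$ (a linearly damped oscillator, for which $L_{f} = μ$), the initial conditions and the values of the constants are illustrative assumptions, selected so that $k > L_{f}$ and $k \neq L_{f} + \frac{c}{4}$.

```python
import math


def simulate_example26(c=2.0, k=2.0, mu=0.5, h=1e-3, steps=10000):
    """Forward-Euler simulation of system (379) on M = R with f(x, v) = -mu v - x."""
    def f(x, v):
        return -mu * v - x  # illustrative choice, Lipschitz in v with L_f = mu

    xm, vm, xs, vs = 1.0, 0.0, 0.0, 0.5  # arbitrary initial conditions
    distance, effort = [], []
    for _ in range(steps + 1):
        u = vm - vs - 0.5 * c * (xs - xm)  # primary control field
        distance.append(abs(xm - xs))      # inter-state distance
        effort.append(0.5 * u * u)         # control effort, Equation (380)
        # All right-hand sides are evaluated at the current step.
        xm, vm, xs, vs = (xm + h * vm, vm + h * f(xm, vm),
                          xs + h * (vs + u), vs + h * (f(xs, vs) + k * u))
    return distance, effort


distance, effort = simulate_example26()
```

With the default values the simulation covers the interval from t = 0 to t = 10: the final inter-state distance stays within a few percent of the predicted value $D_{0} e^{- c t / 2}$ with $D_{0} = 1$, and the final control effort is several orders of magnitude smaller than the initial one.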

The second simpler result examines the case that

M

is a general manifold but under the assumption that the leader subsystem and the follower subsystem are perfectly synchronized.

Example 27.

Consider the coupled dynamical systems

\{\begin{matrix} {\dot{x}}_{m} = v_{m}, \\ \nabla_{{\dot{x}}_{m}} v_{m} = f (x_{m}, v_{m}), \\ {\dot{x}}_{s} = v_{s} + u, \\ \nabla_{{\dot{x}}_{s}} v_{s} = f (x_{s}, v_{s}) + k u, \\ u : = P^{x_{m} \to x_{s}} (v_{m} - \frac{c}{2} \log_{x_{m}} x_{s}) - v_{s} \end{matrix}

(395)

evolving on a Riemannian manifold

M

and assume that the velocity-transition function f is Lipschitz continuous in the second argument, uniformly in the first argument, with Lipschitz constant

L_{f}

. The state manifold is supposed to be equipped with a Levi-Civita connection ∇. At perfect synchronization, the control scheme (395) with

k > L_{f}

makes the control effort vanish to zero at least exponentially fast.

In order to justify such statement, assume that the leader subsystem and the follower subsystem are perfectly synchronized, namely, that

x_{s} = x_{m} : = x

,

{\dot{x}}_{s} = {\dot{x}}_{m} : = \dot{x}

. Notice that, even at perfect synchronization, one must assume that

v_{s} \neq v_{m}

, in general, although the pair

v_{s}, v_{m}

belongs to the same tangent space

T_{x} M

. At perfect synchronization, the control vector field

u = P^{x_{m} \to x_{s}} (v_{m} - \frac{c}{2} \log_{x_{m}} (x_{s})) - v_{s}

simplifies to

u = v_{m} - v_{s} \in T_{x} M

, therefore it holds that

\dot{σ} = {⟨ u, \nabla_{\dot{x}} u ⟩}_{x} = {⟨ u, \nabla_{\dot{x}} v_{m} - \nabla_{\dot{x}} v_{s} ⟩}_{x} = {⟨ u, f (x, v_{m}) - f (x, v_{s}) - k u ⟩}_{x} .

(396)

By the hypothesis made, the function f satisfies the Lipschitz inequality in the space

T_{x} M

∥ f (x, v_{m}) - f (x, v_{s}) ∥_{x} ⩽ L_{f} ∥ v_{m} - v_{s} ∥_{x} = L_{f} {∥ u ∥}_{x},

(397)

with the constant

L_{f}

being independent from the state x; therefore, from the relation (396), by applying the Cauchy-Schwarz inequality for the inner product and by recalling the above Lipschitz continuity condition, we may obtain the following differential inequation:

\dot{σ} ⩽ L_{f} {∥ u ∥}_{x}^{2} - k {⟨ u, u ⟩}_{x} = 2 (L_{f} - k) σ,

(398)

with initial condition

σ (0) = σ_{0}

and where it is understood that

t ⩾ 0

. Set

k^{*} : = 2 (k - L_{f})

as effective decay constant. By the Grönwall’s inequality, the above relation implies that the actual control effort magnitude

σ (t)

is always bounded from above by the exponential curve

σ_{0} e^{- k^{🟉} t}

, hence the control effort vanishes to zero at least exponentially fast.
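The bound just obtained can be illustrated numerically. The following minimal sketch works in flat space, where covariant derivatives reduce to ordinary ones, and assumes a toy velocity-transition function f(x, v) = L_f tanh(v), which is L_f-Lipschitz in v; this function, the numerical values and the name `simulate_effort` are illustrative assumptions and do not come from the original development. The sketch integrates the synchronized leader-follower pair by forward Euler steps and checks that σ(t) stays below σ_0 e^{-k^🟉 t}.

```python
import math

def simulate_effort(k=2.0, L_f=0.5, h=1e-3, steps=5000):
    """Forward-Euler run of the synchronized pair in flat R^2.

    Assumed toy model: f(x, v) = L_f * tanh(v), component-wise,
    which is L_f-Lipschitz in v (hypothetical choice for illustration).
    """
    v_m = [1.0, -0.5]          # leader velocity (arbitrary initial value)
    v_s = [-0.3, 0.8]          # follower velocity (arbitrary initial value)
    k_star = 2.0 * (k - L_f)   # effective decay constant k* = 2 (k - L_f)
    history = []
    for n in range(steps + 1):
        u = [a - b for a, b in zip(v_m, v_s)]     # u = v_m - v_s
        sigma = 0.5 * sum(c * c for c in u)       # sigma = |u|^2 / 2
        history.append((n * h, sigma))
        f_m = [L_f * math.tanh(a) for a in v_m]
        f_s = [L_f * math.tanh(b) for b in v_s]
        v_m = [a + h * fa for a, fa in zip(v_m, f_m)]
        v_s = [b + h * (fb + k * c) for b, fb, c in zip(v_s, f_s, u)]
    return k_star, history

k_star, history = simulate_effort()
sigma_0 = history[0][1]
bound_holds = all(
    s <= sigma_0 * math.exp(-k_star * t) + 1e-12 for t, s in history
)
print("exponential bound respected:", bound_holds)
```

The step size is chosen so that h k < 1, a regime in which the Euler iteration itself contracts each component of u by a factor no larger than 1 - h (k - L_f) per step.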

In the equations that describe second-order dynamical systems discussed in the paper [30], the velocity-transition function appears as the summation of a gradient term (that depends only on the system’s state) and of a number of non-linear damping terms that depend on the system’s state velocity and on the system’s state itself. We should underline that the assumption that the velocity-transition function appears Lipschitz in its whole domain is rather strict, although in some circumstances such requirement is indeed met, for instance:

in undamped systems, the function f depends only on the system’s state; therefore, the Lipschitz continuity condition is automatically verified with $L_{f} = 0$ and the differential inequality (398) becomes a differential equation, hence the control effort magnitude σ decays as $e^{- 2 k t}$ (namely, the norm of the control field decays as $e^{- k t}$ ), as long as $k > 0$ .
in linearly damped systems, the transition function is of the type $f (x, v) = - μ v$ plus a term that depends on the state x only. In this case, it is immediate to verify that the function f is globally $L_{f}$ -Lipschitz with $L_{f} = μ$ .

In a number of applications, the design of the leader-follower systems pair is up to the user and does not have to obey any particular physical constraints (this is the case, for example, in secure transmission of information by signal masking). Therefore, linear damping may be safely assumed (which includes the case of undamped oscillators, obtained for

μ = 0

).

On the basis of the above considerations, in the context of systems on manifolds it seems reasonable to prescribe that the momentum-transition function

f : T M \to T M

satisfies a mixed continuity condition in the first argument and Lipschitz continuity in the second argument. To express such requirement more precisely, let us define the following auxiliary function

φ_{x} (y, v) : = P^{y \to x} (f (y, P^{x \to y} (v))),

(399)

where

x, y \in M

,

v \in T_{x} M

and

φ_{x} (y, v) \in T_{x} M

. Note that the following identities hold true

\begin{matrix} φ_{x} (x, v) = P^{x \to x} (f (x, P^{x \to x} (v))) = f (x, v), \end{matrix}

(400)

\begin{matrix} f (x, v) - P^{y \to x} (f (y, w)) = φ_{x} (x, v) - φ_{x} (y, P^{y \to x} (w)), \end{matrix}

(401)

for every

w \in T_{y} M

. We shall say that the function (399) is uniformly continuous in the argument y and Lipschitz continuous in the argument v if for every

x \in M

, for every

r, v \in T_{x} M

and for every

ε_{φ} > 0

there exist a

δ_{φ} > 0

and a

L > 0

such that, for each pair

z, y \in B_{x}^{φ}

with

D (z, y) < δ_{φ}

, it holds that

{∥ φ_{x} (y, r) - φ_{x} (z, v) ∥}_{x} < ε_{φ} + L {∥ r - v ∥}_{x},

(402)

where

B_{x}^{φ} \subset M

denotes a neighborhood of the point x. Such condition, ultimately, ensures that

{∥ f (x, v) - P^{y \to x} (f (y, w)) ∥}_{x} < ε_{φ} + L {∥ v - P^{y \to x} (w) ∥}_{x},

(403)

for every

y \in B_{x}^{φ}

,

v \in T_{x} M

and

w \in T_{y} M

, at every point

x \in M

.

The above-defined mixed continuity condition is discussed in the following example concerning a specific instance of momentum-transition function.

Example 28.

Let us consider the transition function

f : T S^{n - 1} \to T S^{n - 1}

defined by

f (x, v) : = - S x + (x^{⊤} S x) x - μ v,

(404)

with S being a symmetric, positive-definite matrix and μ denoting a real-valued parameter. It is straightforward to verify that

(x, f (x, v))

represents a tangent vector field in

Γ (S^{n - 1})

, in fact

x^{⊤} f (x, v) = - x^{⊤} S x + (x^{⊤} S x) (x^{⊤} x) - μ (x^{⊤} v) = 0

. The above function is continuous in the variable x, while it is obviously Lipschitz-continuous in the variable v.
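The two properties claimed in this example, tangency of f(x, v) to the sphere and Lipschitz continuity in v with constant μ, can be checked numerically. The sketch below is only an illustration on the sphere S^2: the matrix S, the point x, the tangent vectors and the helper name `f_sphere` are assumed values chosen for the test and are not taken from the text.

```python
import math

def f_sphere(x, v, S, mu):
    """Transition function (404): f(x, v) = -S x + (x^T S x) x - mu v."""
    n = len(x)
    Sx = [sum(S[i][j] * x[j] for j in range(n)) for i in range(n)]
    xSx = sum(a * b for a, b in zip(x, Sx))
    return [-Sx[i] + xSx * x[i] - mu * v[i] for i in range(n)]

def norm(a):
    return math.sqrt(sum(c * c for c in a))

# Assumed test data: a point on S^2, two tangent vectors, a symmetric
# positive-definite matrix (diagonally dominant) and a damping parameter.
x = [1.0 / math.sqrt(3.0)] * 3
v1 = [1.0, -1.0, 0.0]            # x^T v1 = 0
v2 = [0.0, 2.0, -2.0]            # x^T v2 = 0
S = [[2.0, 0.5, 0.0],
     [0.5, 3.0, 0.2],
     [0.0, 0.2, 1.5]]
mu = 0.7

f1 = f_sphere(x, v1, S, mu)
f2 = f_sphere(x, v2, S, mu)

# Tangency: x^T f(x, v) should vanish up to round-off.
radial_component = sum(a * b for a, b in zip(x, f1))

# Lipschitz ratio in v at fixed x: |f(x, v1) - f(x, v2)| / |v1 - v2|.
lipschitz_ratio = norm([a - b for a, b in zip(f1, f2)]) / norm(
    [a - b for a, b in zip(v1, v2)]
)
print(radial_component, lipschitz_ratio)
```

Since the state-dependent terms cancel in the difference f(x, v1) - f(x, v2), the ratio equals μ for every pair of tangent vectors at the same point.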

In the following, the effects of the control term in the dynamical system (376) on the control effort magnitude in the case of a full dynamics will be analyzed in detail. In order to state the main findings, it is convenient to present and comment on some preliminary results.

A first instrumental result concerns the computation of the covariant derivative of a vector field obtained from the logarithmic map applied to a pair of (nearby) curves. Let

x = x (t)

and

y = y (t)

denote two given smooth curves on the Riemannian manifold

M

endowed with the Levi-Civita connection ∇. Consider the parametrized smooth vector field on

T M

given by

t \mapsto (x (t), λ (t))

, with

λ (t) : = \log_{x (t)} y (t)

. The covariant derivative of the vector field

λ

along the velocity field associated to the curve x may be proven to take the expression

\nabla_{\dot{x}} λ = {(d_{v}^{- 1} \exp)}_{x, λ} (\dot{y} - {(d_{x} \exp)}_{x, λ} (\dot{x})),

(405)

where

d_{x} \exp

and

d_{v} \exp

denote the partial pushforward maps of the exponential map computed with respect to its positional argument x and its tangential argument v, respectively. In order to explain such result, let a function

q : T M \to M

be defined as

q (x, v) : = \exp_{x} (v)

, with

x \in M

and

v \in T_{x} M

. Let us denote the pushforward map of the function q with respect to the first argument by

d_{x} q

and the pushforward map of q with respect to the second argument by

d_{v} q

. Since the function q is a local diffeomorphism in

T M

around

v = 0

, in a neighborhood of zero the linear map

d_{v} q

is full-rank, which ensures that its inverse exists, denoted by

d_{v}^{- 1} q

. In addition, notice that the identity

y (t) = \exp_{x (t)} (\log_{x (t)} y (t)) = \exp_{x (t)} (λ (t)) = q (x (t), λ (t))

holds. Differentiating both sides with respect to the parameter t gives

\dot{y} = {(d_{x} q)}_{x, λ} (\dot{x}) + {(d_{v} q)}_{x, λ} (\nabla_{\dot{x}} λ)

, therefore, by inversion, the relationship (405) is immediately obtained.

In order to get the reader acquainted with the notion of partial pushforwards of the exponential map, detailed computations in the case of a unit hyper-sphere are carried out in the following example.

Example 29.

Let us recall that the exponential map for the unit hypersphere corresponding to a Euclidean metric is given by

\exp_{x} (v) = x cos (∥ v ∥) + v sinc (∥ v ∥),

(406)

where

∥ \cdot ∥

denotes the standard 2-norm and ‘

sinc

’ denotes the cardinal sine function. Its partial pushforward maps may be represented as linear operators mapping

R^{n}

to itself, that is to say, by square matrices.

The computation of the sought tangent maps may be worked out by defining the curve

y (t) = \exp_{x (t)} (v (t)) = x (t) cos (∥ v (t) ∥) + v (t) sinc (∥ v (t) ∥),

(407)

where

x (t) : t ⩾ 0 \mapsto x \in M

,

y (t) : t ⩾ 0 \mapsto y \in M

, and

v (t) : t ⩾ 0 \mapsto v \in T_{x (t)} M

. Taking the derivative of both sides of the expression (407) with respect to the parameter t yields

\begin{matrix} \dot{y} = & \dot{x} cos (∥ v ∥) - x sin (∥ v ∥) \frac{d ∥ v ∥}{d t} + \dot{v} sinc (∥ v ∥) + v {sinc}^{'} (∥ v ∥) \frac{d ∥ v ∥}{d t} \\ = & \dot{x} cos (∥ v ∥) + \dot{v} sinc (∥ v ∥) + ({sinc}^{'} (∥ v ∥) v - x sin (∥ v ∥)) \frac{v^{⊤} \dot{v}}{∥ v ∥}, \end{matrix}

(408)

where

\dot{v}

denotes the naïve derivative of the vector field

v (t)

and

{sinc}^{'}

denotes the derivative of the cardinal sine function.

Recall, from Equation (115), that

\dot{v} = \nabla_{\dot{x}} v - x {\dot{x}}^{⊤} v

, therefore the above expression may be recast as

\begin{matrix} \dot{y} = \dot{x} cos (∥ v ∥) + (\nabla_{\dot{x}} v - x {\dot{x}}^{⊤} v) sinc (∥ v ∥) + ({sinc}^{'} (∥ v ∥) v - x sin (∥ v ∥)) \frac{v^{⊤} \nabla_{\dot{x}} v}{∥ v ∥}, \end{matrix}

(409)

because

x^{⊤} v = 0

. A comparison of the above expression with

\dot{y} = {(d_{x} \exp)}_{x, v} (\dot{x}) + {(d_{v} \exp)}_{x, v} (\nabla_{\dot{x}} v)

leads to a matrix-type representation of the sought partial pushforward maps as

\begin{matrix} 〚 {(d_{x} \exp)}_{x, v} 〛 & = & cos (∥ v ∥) I_{n} - sinc (∥ v ∥) x v^{⊤}, \end{matrix}

(410)

\begin{matrix} 〚 {(d_{v} \exp)}_{x, v} 〛 & = & \{\begin{matrix} sinc (∥ v ∥) (I_{n} - x v^{⊤}) + \frac{cos (∥ v ∥) - sinc (∥ v ∥)}{{∥ v ∥}^{2}} v v^{⊤}, & \mathrm{for}\ v \neq 0, \\ I_{n}, & \mathrm{for}\ v = 0, \end{matrix} \end{matrix}

(411)

where the symbol

〚 \cdot 〛

denotes a matrix representative of a linear operator.

In correspondence of the null vector

v = 0

, the pushforward map

d_{v} \exp

equals the identity and in a neighborhood of zero such pushforward map is invertible. For

v \neq 0

, since the matrix

〚 {(d_{v} \exp)}_{x, v} 〛

is written as the sum of two terms, one may utilize Woodbury’s matrix inversion lemma [53] to compute such inverse pushforward map. Notice that an application of the matrix inversion lemma gives

{(I_{n} - x v^{⊤})}^{- 1} = I_{n} + x v^{⊤}

. Applying such lemma a second time yields the sought result

〚 {(d_{v}^{- 1} \exp)}_{x, v} 〛 = \{\begin{matrix} \frac{I_{n} + x v^{⊤}}{sinc (∥ v ∥)} [I_{n} - \frac{cos (∥ v ∥) - sinc (∥ v ∥)}{{∥ v ∥}^{2} cos (∥ v ∥)} v v^{⊤}], & \mathrm{for}\ v \neq 0, \\ I_{n}, & \mathrm{for}\ v = 0 . \end{matrix}

(412)

Notice that a singularity occurs in correspondence of

∥ v ∥ = \frac{π}{2}

, where it holds that

〚 {(d_{v} \exp)}_{x, v} 〛 |_{∥ v ∥ = π / 2} = \frac{2}{π} I_{n} - (\frac{2}{π} x + \frac{8}{π^{3}} v) v^{⊤}

, whose rank is n - 1 (namely 2 when n = 3). A second singularity occurs when

∥ v ∥ = π

, where

〚 {(d_{v} \exp)}_{x, v} 〛 |_{∥ v ∥ = π} = - \frac{1}{π^{2}} v v^{⊤}

, whose rank is 1. Because of rank deficiency, on these spheres in

T_{x} M

such partial pushforward map is singular.
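As a numerical sanity check of the representations (411) and (412), the following minimal sketch applies both operators to vectors on the sphere S^2 without forming the matrices explicitly. It compares the action of (411) with a central finite difference of the exponential map (406) taken at a fixed base point x, and verifies that (412) undoes (411). The test point, the tangent vectors, the step size and the helper names (`exp_map`, `dv_exp`, `dv_exp_inv`) are assumptions made for illustration.

```python
import math

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def sinc(t):
    # Unnormalized cardinal sine sin(t)/t, with sinc(0) = 1.
    return math.sin(t) / t if t != 0.0 else 1.0

def exp_map(x, v):
    # Exponential map (406) on the unit sphere.
    t = math.sqrt(dot(v, v))
    return [math.cos(t) * xi + sinc(t) * vi for xi, vi in zip(x, v)]

def dv_exp(x, v, w):
    # Matrix (411) applied to w, case v != 0.
    t = math.sqrt(dot(v, v))
    c = (math.cos(t) - sinc(t)) / t ** 2
    vw = dot(v, w)
    return [sinc(t) * (wi - xi * vw) + c * vi * vw
            for xi, vi, wi in zip(x, v, w)]

def dv_exp_inv(x, v, z):
    # Matrix (412) applied to z, case v != 0 and cos(|v|) != 0.
    t = math.sqrt(dot(v, v))
    c = (math.cos(t) - sinc(t)) / (t ** 2 * math.cos(t))
    vz = dot(v, z)
    a = [zi - c * vi * vz for vi, zi in zip(v, z)]   # [I - c v v^T] z
    va = dot(v, a)
    return [(ai + xi * va) / sinc(t) for xi, ai in zip(x, a)]

# Assumed test data: x on S^2, v and w tangent at x, |v| < pi/2.
x = [0.0, 0.0, 1.0]
v = [0.6, -0.3, 0.0]
w = [0.2, 0.5, 0.0]
eps = 1e-6

plus = exp_map(x, [vi + eps * wi for vi, wi in zip(v, w)])
minus = exp_map(x, [vi - eps * wi for vi, wi in zip(v, w)])
finite_diff = [(p - m) / (2.0 * eps) for p, m in zip(plus, minus)]

pushed = dv_exp(x, v, w)
fd_error = max(abs(a - b) for a, b in zip(finite_diff, pushed))
recovered = dv_exp_inv(x, v, pushed)
roundtrip_error = max(abs(a - b) for a, b in zip(recovered, w))
print(fd_error, roundtrip_error)
```

Keeping the base point x fixed makes the covariant derivative of v coincide with its ordinary derivative, so the finite difference isolates the partial pushforward with respect to the tangential argument.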

The inverse map

{(d_{v}^{- 1} \exp)}_{x, \log_{x} y} : T_{y} M \to T_{x} M

plays a central role in the computation of the covariant derivative of the function

\log_{x} y

. Likewise, the map

{(d_{x} \exp)}_{x, \log_{x} y} : T_{x} M \to T_{y} M

will play a central role in the analysis of the control effort dynamics. Notice that whenever

x = y

, the partial pushforward map

{(d_{v} \exp)}_{x, \log_{x} y}

coincides with the identity

{id}_{T_{y} M}

, while the map

{(d_{x} \exp)}_{x, 0}

coincides with the identity

{id}_{T_{x} M}

. We are going to assume some sort of continuity of the discussed partial pushforward maps with respect to their positional and vectorial arguments. In order to make such assumptions precise, let us define two further auxiliary maps that play a role in the computation of the covariant derivative of the logarithmic-map-based vector field, namely:

\begin{matrix} μ_{x} (y, r) & : = & ({(d_{v}^{- 1} \exp)}_{x, \log_{x} y} \circ P^{x \to y}) (r), \end{matrix}

(413)

\begin{matrix} β_{x} (y, v) & : = & (P^{y \to x} \circ {(d_{x} \exp)}_{x, \log_{x} y}) (v), \end{matrix}

(414)

where

x, y \in M

and

r, v \in T_{x} M

. We chose the above notation to underline that the functions

μ

and

β

exhibit an asymmetric dependence on the variables x and y; in addition, such notation makes it easy to remember that, for every given pair

(x, y) \in M^{2}

, the operators

μ_{x} (y, 🟉)

and

β_{x} (y, 🟉)

are linear endomorphisms of

T_{x} M

. The following identities are immediate to verify:

\begin{matrix} μ_{x} (x, r) = ({(d_{v}^{- 1} \exp)}_{x, 0} \circ P^{x \to x}) (r) = r, \end{matrix}

(415)

\begin{matrix} β_{x} (x, v) = (P^{x \to x} \circ {(d_{x} \exp)}_{x, 0}) (v) = v, \end{matrix}

(416)

\begin{matrix} - \nabla_{\dot{x}} \log_{x} y = μ_{x} (y, β_{x} (y, \dot{x}) - P^{y \to x} (\dot{y})) . \end{matrix}

(417)

In the following we shall assume the function

μ_{x}

defined in (413) to be continuous with respect to the variable y, namely that for every

x \in M

, for every

r \in T_{x} M

and for every

ε_{μ} > 0

, there exists a constant

δ_{μ} > 0

such that, for any

y \in B_{x}^{μ}

that satisfies

D (x, y) < δ_{μ}

, it holds that

{∥ μ_{x} (y, r) - r ∥}_{x} < ε_{μ}

, where

B_{x}^{μ}

denotes a neighborhood of the point

x \in M

. Such assumption tells that the linear map

μ_{x} (y, 🟉)

, thought of as a function of y, behaves like an approximate identity when y gets close to x. Note that, because of the linearity in the argument r, it is not necessary to place any restriction on its value. The continuity assumption has a useful consequence. By virtue of the triangle inequality, it holds that

{∥ μ_{x} (y, r) - r ∥}_{x} ⩾ {∥ μ_{x} (y, r) ∥}_{x} - {∥ r ∥}_{x}

, therefore the assumption

{∥ μ_{x} (y, r) - r ∥}_{x} < ε_{μ}

implies that

{∥ μ_{x} (y, r) ∥}_{x} < ε_{μ} + {∥ r ∥}_{x} in B_{x}^{μ} .

(418)

Likewise, the map

β_{x} (y, 🟉)

will be assumed to be continuous with respect to the variable y, in a neighborhood

B_{x}^{β}

of a point x for every value

x \in M

. That is to say, for every point

x \in M

, every

v \in T_{x} M

, and every scalar

ε_{β} > 0

there exists a scalar

δ_{β} > 0

such that, for any

y \in B_{x}^{β}

with

D (x, y) < δ_{β}

, it holds that

{∥ β_{x} (y, v) - v ∥}_{x} < ε_{β}

.

By virtue of the triangle inequality of the norm, taking any tangent vector

w \in T_{y} M

, we may write that

{∥ β_{x} (y, v) - P^{y \to x} (w) - (v - P^{y \to x} (w)) ∥}_{x} ⩾ {∥ β_{x} (y, v) - P^{y \to x} (w) ∥}_{x} - {∥ v - P^{y \to x} (w) ∥}_{x}

. As a consequence, the assumption

{∥ β_{x} (y, v) - v ∥}_{x} < ε_{β}

implies that

{∥ β_{x} (y, v) - P^{y \to x} (w) ∥}_{x} < ε_{β} + {∥ v - P^{y \to x} (w) ∥}_{x} in B_{x}^{β} .

(419)

The implications (418) and (419), together with the identity (417), imply that, however small the positive values

ε_{μ}

and

ε_{β}

are chosen, there certainly exists a point

y \in B_{x}^{μ} \cap B_{x}^{β} \ {x}

such that

∥ {(d_{v}^{- 1} \exp)}_{x, \log_{x} y} [{(d_{x} \exp)}_{x, \log_{x} y} (v) - w] ∥_{x} < ε_{μ | β} + {∥ v - P^{y \to x} (w) ∥}_{x},

(420)

where

ε_{μ | β} : = ε_{μ} + ε_{β}

. The inequality (420) appears as a sort of mixed continuity/Lipschitz-continuity condition on the composition of the partial pushforward maps associated to the exponential map.

Let us evaluate the continuity of the expressions

μ_{x} (y, u)

and

β_{x} (y, v)

for the unit hyper-sphere, as an instructive case study, in the following example.

Example 30.

The expression of

μ_{x} (y, u)

, when

x \neq y \in S^{n - 1}

, may be obtained by composing the expression (412) and the expression of the parallel transport on the unit hyper-sphere, that we recall to take the expression

P^{x \to y} (u) = (I_{n} - \frac{(x + y) y^{⊤}}{1 + x^{⊤} y}) u .

(421)

We shall avoid writing the expression of the auxiliary function μ explicitly because it is quite cumbersome and not particularly illuminating. It suffices to note that, as we have seen in Example 29, the map (412) is continuous as long as

∥ v ∥ < \frac{π}{2}

, while the map (421) is continuous as long as

D (x, y) < π

. Therefore, the function

μ_{x} (y, 🟉)

is continuous in

B_{x}^{μ} : = {y \in S^{n - 1} | D (x, y) < \frac{π}{2}}

. Likewise, the expression of

β_{x} (y, v)

may be obtained by composing the expression (410) and the expression (421), to give:

β_{x} (y, v) = (I_{n} - \frac{(y + x) x^{⊤}}{1 + cos D (x, y)}) (I_{n} cos D (x, y) - x y^{⊤}) v .

(422)

Such function is continuous in

B_{x}^{β} : = {y \in S^{n - 1} | D (x, y) < π}

. Gluing the two above results, one gets

B_{x}^{μ} \cap B_{x}^{β} \ {x} = {y \in S^{n - 1} | 0 < D (x, y) < \frac{π}{2}}

.
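The approximate-identity behavior of the map β can be observed numerically from the expression (422). The sketch below, under assumed test data on S^2, evaluates the gap between β_x(y, v) and v for points y placed at decreasing geodesic distance from x along a fixed direction; the helper name `beta`, the chosen vectors and the list of distances are illustrative assumptions.

```python
import math

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def beta(x, y, v):
    """Expression (422) applied to v, using cos D(x, y) = x^T y."""
    cos_d = dot(x, y)
    yv = dot(y, v)
    # Inner factor: (I cos D - x y^T) v
    inner = [cos_d * vi - xi * yv for xi, vi in zip(x, v)]
    # Outer factor: I - (y + x) x^T / (1 + cos D)
    s = dot(x, inner) / (1.0 + cos_d)
    return [ii - (yi + xi) * s for xi, yi, ii in zip(x, y, inner)]

# Assumed test data: base point, tangent vector and a unit tangent direction.
x = [0.0, 0.0, 1.0]
v = [0.4, 0.3, 0.0]
direction = [1.0, 0.0, 0.0]

gaps = []
tangency = []
for d in (0.5, 0.1, 0.02):
    # Point at geodesic distance d from x along the chosen direction.
    y = [math.cos(d) * xi + math.sin(d) * di for xi, di in zip(x, direction)]
    b = beta(x, y, v)
    gaps.append(math.sqrt(sum((bi - vi) ** 2 for bi, vi in zip(b, v))))
    tangency.append(abs(dot(x, b)))
print(gaps)
```

The values collected in `gaps` shrink as y approaches x, in agreement with the identity (416), while `tangency` confirms that β_x(y, v) stays in the tangent space at x.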

Observation 2.

The inverse map

μ_{x}^{- 1} (y, 🟉)

has been studied in detail in the unpublished contribution [54]. (In particular, to make the notation uniform, the map

μ_{x}^{- 1} (y, 🟉)

corresponds to the map

E_{x} (\log_{x} y) : = I_{x}^{- 1} (\log_{x} y) \circ {(d_{v} \exp)}_{x, \log_{x} y}

in [54]). The contribution [54] shows that this map may be written explicitly as a Taylor series of higher-order covariant derivatives of the Riemannian curvature endomorphism.

As a further instrumental result in the present analysis, we may express the covariant derivative of the parallel transport operator

\nabla P

on a manifold

M

endowed with a Levi-Civita connection ∇. Consider the parametrized smooth vector field

t \mapsto (y (t), w (t)) \in T M

and define the new parametrized vector field

t \mapsto (x (t), P^{y (t) \to x (t)} (w (t))) \in T M

. The covariant derivative of such vector field along the velocity field of the curve

x (t)

takes the expression

\nabla_{\dot{x}} P^{y \to x} (w) = {(d_{x} P^{y \to x})}_{x, y, w} (\dot{x}) - (P^{y \to x} \circ {(d_{x} P^{y \to x})}_{y, x, v}) (\dot{y}) + P^{y \to x} (\nabla_{\dot{y}} w),

(423)

where

{(d_{x} P^{y \to x})}_{x, y, w} : T_{x} M \to T_{x} M

for every

y \in M

and

w \in T_{y} M

, and

{(d_{x} P^{y \to x})}_{y, x, v} : T_{y} M \to T_{y} M

for every

x \in M

and

v \in T_{x} M

. In addition, the ‘diagonal’ part of the linear map

{(d_{x} P^{y \to x})}_{x, y, w}

is identically zero, namely

{(d_{x} P^{y \to x})}_{x, x, v} \equiv 0

, for every point

x \in M

and tangent vector

v \in T_{x} M

.

The above statements may be motivated as follows. Let

h : M \times T M \to T M

be defined as

h (x, y, w) : = P^{y \to x} (w)

for any two points x and y in

M

and for any tangent vector

w \in T_{y} M

. Note, in particular, that the function h is linear in the third argument. Now, set

v (t) : = h (x (t), y (t), w (t))

and note that v is a vector field in

Γ (M)

, namely

v (t) \in T_{x (t)} M

. By definition of covariant derivative along a curve, we have that

\begin{matrix} \nabla_{\dot{x}} v = & lim_{s \to 0} \frac{1}{s} \{P^{x (t + s) \to x (t)} (v (t + s)) - v (t)\} \\ = & lim_{s \to 0} \frac{1}{s} \{P^{x (t + s) \to x (t)} (h (x (t + s), y (t + s), w (t + s))) - h (x (t), y (t), w (t))\} \\ = & lim_{s \to 0} \frac{1}{s} \{P^{x (t + s) \to x (t)} (h (x (t + s), y (t + s), w (t + s))) - h (x (t), y (t + s), w (t + s)) \\ + h (x (t), y (t + s), w (t + s) - P^{y (t) \to y (t + s)} (w (t))) \\ + h (x (t), y (t + s), P^{y (t) \to y (t + s)} (w (t))) - h (x (t), y (t), w (t))\} . \end{matrix}

(424)

The covariant derivative

\nabla_{\dot{x}} v

may thus be expressed as

\nabla_{\dot{x}} v = {(d_{x} h)}_{x, y, w} (\dot{x}) + {(d_{y} h)}_{x, y, w} (\dot{y}) + h (x, y, \nabla_{\dot{y}} w),

(425)

where

\dot{x} \in T_{x} M

,

\dot{y} \in T_{y} M

and

\nabla_{\dot{y}} w \in T_{y} M

. Since the notation might lead to some confusion, let us specify that the symbol

d_{x} h

denotes the pushforward map of the function h with respect to its first argument, while the symbol

d_{y} h

denotes the pushforward map of the function h evaluated with respect to its second argument. Such partial pushforward maps are defined as

\{\begin{matrix} {(d_{x} h)}_{x, y, w} (\dot{x}) : = lim_{s \to 0} \frac{1}{s} \{P^{x (t + s) \to x (t)} (h (x (t + s), y (t + s), w (t + s))) \\ - h (x (t), y (t + s), w (t + s))\}, \\ {(d_{y} h)}_{x, y, w} (\dot{y}) : = lim_{s \to 0} \frac{1}{s} \{h (x (t), y (t + s), P^{y (t) \to y (t + s)} (w (t))) - h (x (t), y (t), w (t))\} . \end{matrix}

(426)

Moreover, the notation

{(d_{x} h)}_{x, y, w}

indicates that the pushforward map, which is a function of three variables (besides its argument), is evaluated at a point where the first variable is instantiated to the value x, the second variable is instantiated to the value y and the third variable is instantiated to the value w.

Now, since

P^{x \to y} \circ P^{y \to x} = {id}_{T_{y} M}

, the identity

w = h (y, x, h (x, y, w))

holds. Applying to both sides the covariant derivative ∇ yields the identity

\nabla_{\dot{y}} w = {(d_{x} h)}_{y, x, v} (\dot{y}) + {(d_{y} h)}_{y, x, v} (\dot{x}) + h (y, x, {(d_{x} h)}_{x, y, w} (\dot{x}) + {(d_{y} h)}_{x, y, w} (\dot{y}) + h (x, y, \nabla_{\dot{y}} w)) .

(427)

Notice that

{(d_{x} h)}_{y, x, v}

is an operator that maps

T_{y} M

to itself, while the operator

{(d_{y} h)}_{x, y, w}

maps

T_{y} M

to

T_{x} M

. Since the velocity vector fields

\dot{x}

and

\dot{y}

are arbitrary, the above identity, rewritten as

\nabla_{\dot{y}} w = {(d_{x} h)}_{y, x, v} (\dot{y}) + {(d_{y} h)}_{y, x, v} (\dot{x}) + h (y, x, {(d_{x} h)}_{x, y, w} (\dot{x})) + h (y, x, {(d_{y} h)}_{x, y, w} (\dot{y})) + \nabla_{\dot{y}} w,

(428)

yields two independent operator identities, one of which is

h (y, x, {(d_{y} h)}_{x, y, w}) + {(d_{x} h)}_{y, x, h (x, y, w)} = 0

, namely

{(d_{y} h)}_{x, y, w} = - h (x, y, {(d_{x} h)}_{y, x, h (x, y, w)}) .

(429)

Using Equation (425) together with the definition of the function h leads to the sought result.

The statement about the diagonal part of the tangent map

{(d_{x} P^{y \to x})}_{x, y, w}

follows from the definitions (426), in fact

\begin{matrix} {(d_{x} h)}_{x, x, v} (\dot{x}) = & lim_{s \to 0} \frac{1}{s} \{P^{x (t + s) \to x (t)} (h (x (t + s), x (t + s), v (t + s))) \\ - h (x (t), x (t + s), v (t + s))\}, \\ = & lim_{s \to 0} \frac{1}{s} \{P^{x (t + s) \to x (t)} (v (t + s)) - h (x (t), x (t + s), v (t + s))\}, \\ = & lim_{s \to 0} \frac{1}{s} \{P^{x (t + s) \to x (t)} (v (t + s)) - P^{x (t + s) \to x (t)} (v (t + s))\}, \\ = & 0, \end{matrix}

(430)

no matter how the point

x \in M

and the tangent vector

v \in T_{x} M

are chosen in the tangent bundle. Since

\dot{x}

is arbitrary, the statement follows.

The following example clarifies the above findings.

Example 31.

Consider the parallel translation map for the unit hyper-sphere

S^{n - 1}

corresponding to a Euclidean metric. By applying the definitions of partial pushforward maps in terms of time-derivatives, it is not hard to compute the maps

{(d_{x} P^{y \to x})}_{x, y, w}

and

{(d_{y} P^{y \to x})}_{x, y, w}

. Let us begin by recalling that, in the present example,

h (x, y, w) : = (I_{n} - \frac{(y + x) x^{⊤}}{1 + x^{⊤} y}) w .

(431)

(Notice that this instance of parallel transport is from y to x! In fact,

w \in T_{y} M

). The naïve derivative of such instance of the function h reads

\begin{matrix} \dot{h} = & (- \frac{(\dot{y} + \dot{x}) x^{⊤} (1 + x^{⊤} y) + (y + x) {\dot{x}}^{⊤} (1 + x^{⊤} y) - (y + x) x^{⊤} ({\dot{x}}^{⊤} y + x^{⊤} \dot{y})}{{(1 + x^{⊤} y)}^{2}}) w \\ + h (x, y, \dot{w}) . \end{matrix}

(432)

By making use of the expression of the covariant derivative of the vector fields

v = h (x, y, w)

and w given in (115), the relation (425) may be recast as:

\dot{h} + x h^{⊤} \dot{x} = {(d_{x} h)}_{x, y, w} (\dot{x}) + {(d_{y} h)}_{x, y, w} (\dot{y}) + h (x, y, \dot{w} + y w^{⊤} \dot{y}) .

(433)

Plugging the Equation (432) into the relation (433) leads to

\begin{matrix} - \frac{((\dot{y} + \dot{x}) (x^{⊤} w) (1 + x^{⊤} y) + (y + x) ({\dot{x}}^{⊤} w) (1 + x^{⊤} y) - (y + x) (x^{⊤} w) ({\dot{x}}^{⊤} y + x^{⊤} \dot{y}))}{{(1 + x^{⊤} y)}^{2}} \\ + x w^{⊤} (I_{n} - \frac{x {(y + x)}^{⊤}}{1 + x^{⊤} y}) \dot{x} = {(d_{x} h)}_{x, y, w} (\dot{x}) + {(d_{y} h)}_{x, y, w} (\dot{y}) + (w^{⊤} \dot{y}) (I_{n} - \frac{(y + x) x^{⊤}}{1 + x^{⊤} y}) y . \end{matrix}

(434)

Upon splitting the terms in

\dot{x}

and

\dot{y}

, after some tedious but not hard calculations, we obtain a matrix representation of the sought partial pushforward maps given by

\begin{matrix} 〚 {(d_{x} P^{y \to x})}_{x, y, w} 〛 & = & - \frac{(x^{⊤} w) (I_{n} + x y^{⊤}) + (y + x) w^{⊤}}{1 + x^{⊤} y} + x w^{⊤} + \frac{(x^{⊤} w) (y + x) y^{⊤}}{{(1 + x^{⊤} y)}^{2}}, \end{matrix}

(435)

\begin{matrix} 〚 {(d_{y} P^{y \to x})}_{x, y, w} 〛 & = & - \frac{x^{⊤} w}{1 + x^{⊤} y} (I_{n} - \frac{(y + x) x^{⊤}}{1 + x^{⊤} y}) - y w^{⊤} + \frac{(x^{⊤} y) (y + x) w^{⊤}}{1 + x^{⊤} y} . \end{matrix}

(436)

It is not hard to verify that

x^{⊤} {(d_{x} P^{y \to x})}_{x, y, w} (\dot{x}) = x^{⊤} {(d_{y} P^{y \to x})}_{x, y, w} (\dot{y}) = 0

by using the above matrix representations. In addition, some lengthy, yet straightforward, calculations allow one to verify the identity (429), which may be rewritten explicitly as

{(d_{y} P^{y \to x})}_{x, y, w} (\dot{y}) = - P^{y \to x} \circ {(d_{x} P^{y \to x})}_{y, x, P^{y \to x} (w)} (\dot{y}) .

(437)

Such identity establishes that the partial pushforward map

d_{y} P

may be written in terms of the partial pushforward map

d_{x} P

, which justifies why the latter is termed principal pushforward map (of the parallel transport operator). It is also straightforward to verify directly that

〚 {(d_{x} P^{y \to x})}_{x, x, v} 〛 = 0

for the diagonal part of the partial pushforward map

d_{x} P

.
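The matrix representation (435) can be cross-checked against the defining relation (425) by finite differences. In the sketch below, the point y and the vector w are held fixed, so that only the term in the velocity of x survives in (425), and the covariant derivative on the sphere is computed as the ordinary derivative plus the correction term recalled in (433). The test data on S^2, the step size and the helper names (`h`, `dx_P`, `normalize`) are assumptions made for illustration.

```python
import math

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def normalize(a):
    n = math.sqrt(dot(a, a))
    return [ai / n for ai in a]

def h(x, y, w):
    # Parallel transport (431): moves w in T_y S^{n-1} to T_x S^{n-1}.
    s = dot(x, w) / (1.0 + dot(x, y))
    return [wi - (yi + xi) * s for xi, yi, wi in zip(x, y, w)]

def dx_P(x, y, w, tau):
    # Matrix (435) applied to a tangent vector tau in T_x S^{n-1}.
    g = 1.0 + dot(x, y)
    xw, wt, yt = dot(x, w), dot(w, tau), dot(y, tau)
    return [-(xw * (ti + xi * yt) + (yi + xi) * wt) / g
            + xi * wt
            + xw * (yi + xi) * yt / g ** 2
            for xi, yi, ti in zip(x, y, tau)]

# Assumed test data: x0 and y on S^2, w tangent at y, x_dot tangent at x0.
x0 = [0.0, 0.0, 1.0]
y = [0.6, 0.0, 0.8]
w = [0.8, 0.5, -0.6]
x_dot = [0.3, -0.7, 0.0]
eps = 1e-5

# Curve through x0 with velocity x_dot, sampled symmetrically around t = 0.
x_plus = normalize([a + eps * b for a, b in zip(x0, x_dot)])
x_minus = normalize([a - eps * b for a, b in zip(x0, x_dot)])
v_dot = [(p - m) / (2.0 * eps)
         for p, m in zip(h(x_plus, y, w), h(x_minus, y, w))]

# Covariant derivative on the sphere: v_dot + x (x_dot^T v), as in (433).
v0 = h(x0, y, w)
covariant = [vd + xc * dot(x_dot, v0) for vd, xc in zip(v_dot, x0)]

formula = dx_P(x0, y, w, x_dot)
fd_error = max(abs(a - b) for a, b in zip(covariant, formula))
tangency = abs(dot(x0, formula))

# Diagonal part: y = x0 and w tangent at x0 should give the null vector.
diagonal = dx_P(x0, x0, [0.4, 0.3, 0.0], x_dot)
diag_norm = math.sqrt(dot(diagonal, diagonal))
print(fd_error, tangency, diag_norm)
```

The three printed quantities measure, respectively, the agreement between (435) and the finite-difference covariant derivative, the tangency of the result at x, and the vanishing of the diagonal part.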

Some remarks on the partial pushforward maps of the parallel transport operator are in order. The map

{(d_{x} P^{y \to x})}_{x, y, w}

is linear in the variable w (as well as in its argument), therefore, the only way it could possibly be bounded over the whole domain

M \times T M

is that it is identically zero. Moreover, since the linear map

{(d_{x} P^{y \to x})}_{x, y, w}

gets close to the null map when the points x and y come close to one another, it is natural to express the continuity of the term

{(d_{x} P^{y \to x})}_{x, y, w} (\dot{x}) - (P^{y \to x} \circ {(d_{x} P^{y \to x})}_{y, x, v}) (\dot{y})

as follows: we shall say that the function

{(d_{x} P^{y \to x})}_{x, y, w} (v) - (P^{y \to x} \circ {(d_{x} P^{y \to x})}_{y, x, P^{y \to x} (w)}) (τ)

is continuous within its domain of definition if, for every

ε_{ψ} > 0

, there exists a scalar

δ_{ψ} > 0

such that, for any pair

(x, y) \in M^{2}

that satisfies the inequality

D (x, y) < δ_{ψ}

, it results that

∥ {(d_{x} P^{y \to x})}_{x, y, w} (v) - (P^{y \to x} \circ {(d_{x} P^{y \to x})}_{y, x, P^{y \to x} (w)}) (τ) ∥_{x} < ε_{ψ},

(438)

for every choice of the tangent vectors

w, τ \in T_{y} M

and

v \in T_{x} M

.

We have now set up all the analytic tools needed to justify the main result of the present section. We shall in fact show that the control scheme (376) ensures that the control effort vanishes to zero under conditions of continuity.

Let us assume that the parallel-transport operator and the manifold exponential map are locally continuous together with their tangent maps in the tangent bundle

T M

. Let us further assume that the velocity-transition function is continuous in its first argument and Lipschitz continuous in its second argument with Lipschitz constant

L_{f}

. If the constant k meets the conditions

k \neq L_{f} + \frac{c}{4}

and

k > L_{f}

, then the control effort converges to zero almost exponentially fast.

In order to justify such a statement, it is convenient to define a dual control field

(x_{m}, u^{🟉}) \in T M

as

u^{🟉} = v_{m} - P^{x_{s} \to x_{m}} (v_{s}) - \frac{c}{2} \log_{x_{m}} (x_{s}) .

(439)

It is worth noticing that, since

u^{🟉} = P^{x_{s} \to x_{m}} (u)

, it holds that

σ = \frac{1}{2} {∥ u ∥}_{x_{s}}^{2} = \frac{1}{2} {∥ u^{🟉} ∥}_{x_{m}}^{2} .

(440)

Moreover, since the manifold

M

was endowed with a metric connection ∇, the control effort possesses a time-dynamics described by the covariant derivative of the dual control field through the expression

\dot{σ} = {⟨ u^{🟉}, \nabla_{{\dot{x}}_{m}} u^{🟉} ⟩}_{x_{m}} .

(441)

The covariant derivative of the dual control field, in turn, consists of three terms:

\nabla_{{\dot{x}}_{m}} u^{🟉} = \nabla_{{\dot{x}}_{m}} v_{m} - \nabla_{{\dot{x}}_{m}} P^{x_{s} \to x_{m}} (v_{s}) - \frac{c}{2} \nabla_{{\dot{x}}_{m}} \log_{x_{m}} (x_{s}) .

(442)

According to the expression (405), the covariant derivative of the logarithmic map applied to the trajectories of the leader and of the follower subsystems reads

\nabla_{{\dot{x}}_{m}} \log_{x_{m}} (x_{s}) = {(d_{v}^{- 1} \exp)}_{x_{m}, \log_{x_{m}} (x_{s})} ({\dot{x}}_{s} - {(d_{x} \exp)}_{x_{m}, \log_{x_{m}} (x_{s})} ({\dot{x}}_{m})) .

(443)

In addition, according to the expression (423), the covariant derivative of the parallel-transport-related field reads

\begin{matrix} \nabla_{{\dot{x}}_{m}} P^{x_{s} \to x_{m}} (v_{s}) = & {(d_{x_{m}} P^{x_{s} \to x_{m}})}_{x_{m}, x_{s}, v_{s}} ({\dot{x}}_{m}) + P^{x_{s} \to x_{m}} (\nabla_{{\dot{x}}_{s}} v_{s}) \\ - (P^{x_{s} \to x_{m}} \circ {(d_{x_{m}} P^{x_{s} \to x_{m}})}_{x_{s}, x_{m}, P^{x_{s} \to x_{m}} (v_{s})}) ({\dot{x}}_{s}) \\ = & {(d_{x_{m}} P^{x_{s} \to x_{m}})}_{x_{m}, x_{s}, v_{s}} ({\dot{x}}_{m}) + P^{x_{s} \to x_{m}} (f (x_{s}, v_{s})) + k u^{🟉} \\ - (P^{x_{s} \to x_{m}} \circ {(d_{x_{m}} P^{x_{s} \to x_{m}})}_{x_{s}, x_{m}, P^{x_{s} \to x_{m}} (v_{s})}) ({\dot{x}}_{s}) . \end{matrix}

(444)

Putting all those terms together in a single expression gives the covariant derivative of the dual control field along the trajectory of the leader subsystem, which reads

\begin{matrix} \nabla_{{\dot{x}}_{m}} u^{🟉} = & f (x_{m}, v_{m}) - P^{x_{s} \to x_{m}} (f (x_{s}, v_{s})) \\ - {(d_{x_{m}} P^{x_{s} \to x_{m}})}_{x_{m}, x_{s}, v_{s}} ({\dot{x}}_{m}) + (P^{x_{s} \to x_{m}} \circ {(d_{x_{m}} P^{x_{s} \to x_{m}})}_{x_{s}, x_{m}, P^{x_{s} \to x_{m}} (v_{s})}) ({\dot{x}}_{s}) \\ - \frac{c}{2} {(d_{v}^{- 1} \exp)}_{x_{m}, \log_{x_{m}} (x_{s})} ({\dot{x}}_{s} - {(d_{x} \exp)}_{x_{m}, \log_{x_{m}} (x_{s})} ({\dot{x}}_{m})) \\ - k u^{🟉} . \end{matrix}

(445)

It is convenient to recall from the very definition of dual control field that

\{\begin{matrix} u^{🟉} + \frac{c}{2} \log_{x_{m}} (x_{s}) = v_{m} - P^{x_{s} \to x_{m}} (v_{s}), \\ {\dot{x}}_{m} - P^{x_{s} \to x_{m}} ({\dot{x}}_{s}) = v_{m} - P^{x_{s} \to x_{m}} (v_{s}) - u^{🟉}, \end{matrix}

(446)

from which, by the triangle inequality of the norm and since the norm of the logarithmic map equals the Riemannian distance, it follows directly that

\{\begin{matrix} ∥ v_{m} - P^{x_{s} \to x_{m}} (v_{s}) ∥_{x_{m}} ⩽ {∥ u^{🟉} ∥}_{x_{m}} + \frac{c}{2} D (x_{s}, x_{m}), \\ ∥ {\dot{x}}_{m} - P^{x_{s} \to x_{m}} ({\dot{x}}_{s}) ∥_{x_{m}} = \frac{c}{2} D (x_{s}, x_{m}) . \end{matrix}

(447)

Let us also recall that the distance

D (x_{s} (t), x_{m} (t))

takes on the expression

D_{0} \exp (- \frac{c}{2} t)

, where

D_{0} : = D (x_{s, 0}, x_{m, 0})

.

For the inner product in Equation (441), it holds that

\begin{matrix} {⟨ u^{🟉}, \nabla_{{\dot{x}}_{m}} u^{🟉} ⟩}_{x_{m}} = \\ - k {⟨ u^{🟉}, u^{🟉} ⟩}_{x_{m}} \\ + {⟨ u^{🟉}, f (x_{m}, v_{m}) - P^{x_{s} \to x_{m}} (f (x_{s}, v_{s})) ⟩}_{x_{m}} \\ + {⟨ u^{🟉}, (P^{x_{s} \to x_{m}} \circ {(d_{x_{m}} P^{x_{s} \to x_{m}})}_{x_{s}, x_{m}, P^{x_{s} \to x_{m}} (v_{s})}) ({\dot{x}}_{s}) - {(d_{x_{m}} P^{x_{s} \to x_{m}})}_{x_{m}, x_{s}, v_{s}} ({\dot{x}}_{m}) ⟩}_{x_{m}} \\ + \frac{c}{2} {⟨ u^{🟉}, {(d_{v}^{- 1} \exp)}_{x_{m}, \log_{x_{m}} (x_{s})} ({(d_{x} \exp)}_{x_{m}, \log_{x_{m}} (x_{s})} ({\dot{x}}_{m}) - {\dot{x}}_{s}) ⟩}_{x_{m}} . \end{matrix}

(448)

Let us examine the four terms in the right-hand side of the above lengthy expression:

the term $- k {⟨ u^{🟉}, u^{🟉} ⟩}_{x_{m}}$ clearly equals $- 2 k σ$ because of the identity (440);
the term ${⟨ u^{🟉}, f (x_{m}, v_{m}) - P^{x_{s} \to x_{m}} (f (x_{s}, v_{s})) ⟩}_{x_{m}}$ may be majorized by means of the Cauchy-Schwarz inequality, by the assumption that the function f is continuous in the first argument and Lipschitz-continuous in the second argument, and by the properties (447). Namely, $\forall ε_{φ} > 0$ , $\exists δ_{φ} > 0, L_{f} > 0$ such that $\forall x_{m}, x_{s} \in M$ with $D (x_{m}, x_{s}) < δ_{φ}$ , it holds that

$\begin{matrix} {⟨ u^{🟉}, f (x_{m}, v_{m}) - P^{x_{s} \to x_{m}} (f (x_{s}, v_{s})) ⟩}_{x_{m}} \\ ⩽ ∥ u^{🟉} ∥_{x_{m}} ∥ f (x_{m}, v_{m}) - P^{x_{s} \to x_{m}} (f (x_{s}, v_{s})) ∥_{x_{m}} \\ < ∥ u^{🟉} ∥_{x_{m}} (ε_{φ} + L_{f} {∥ v_{m} - P^{x_{s} \to x_{m}} (v_{s}) ∥}_{x_{m}}) \\ ⩽ ∥ u^{🟉} ∥_{x_{m}} (ε_{φ} + L_{f} (∥ u^{🟉} ∥_{x_{m}} + \frac{c}{2} D (x_{s}, x_{m}))) \\ = L_{f} ∥ u^{🟉} ∥_{x_{m}}^{2} + {∥ u^{🟉} ∥}_{x_{m}} (ε_{φ} + \frac{1}{2} L_{f} c D (x_{s}, x_{m})) \\ = 2 L_{f} σ + \sqrt{2 σ} (ε_{φ} + \frac{1}{2} L_{f} c D (x_{s}, x_{m})); \end{matrix}$

(449)
the further scalar component ${⟨ u^{🟉}, (P^{x_{s} \to x_{m}} \circ {(d_{x_{m}} P^{x_{s} \to x_{m}})}_{x_{s}, x_{m}, P^{x_{s} \to x_{m}} (v_{s})}) ({\dot{x}}_{s}) - {(d_{x_{m}} P^{x_{s} \to x_{m}})}_{x_{m}, x_{s}, v_{s}} ({\dot{x}}_{m}) ⟩}_{x_{m}}$ may be majorized by invoking the Cauchy-Schwarz inequality and the continuity of the parallel transport and of its tangent maps. Namely, $\forall ε_{ψ} > 0$ , $\exists δ_{ψ} > 0$ such that $\forall x_{m}, x_{s} \in M$ with $D (x_{m}, x_{s}) < δ_{ψ}$ , it holds that

$\begin{matrix} {⟨ u^{🟉}, (P^{x_{s} \to x_{m}} \circ {(d_{x_{m}} P^{x_{s} \to x_{m}})}_{x_{s}, x_{m}, P^{x_{s} \to x_{m}} (v_{s})}) ({\dot{x}}_{s}) - {(d_{x_{m}} P^{x_{s} \to x_{m}})}_{x_{m}, x_{s}, v_{s}} ({\dot{x}}_{m}) ⟩}_{x_{m}} \\ ⩽ ∥ u^{🟉} ∥_{x_{m}} {∥ (P^{x_{s} \to x_{m}} \circ {(d_{x_{m}} P^{x_{s} \to x_{m}})}_{x_{s}, x_{m}, P^{x_{s} \to x_{m}} (v_{s})}) ({\dot{x}}_{s}) - {(d_{x_{m}} P^{x_{s} \to x_{m}})}_{x_{m}, x_{s}, v_{s}} ({\dot{x}}_{m}) ∥}_{x_{m}} \\ < ε_{ψ} {∥ u^{🟉} ∥}_{x_{m}} \\ = \sqrt{2 σ} ε_{ψ}; \end{matrix}$

(450)
to end with, the term $\frac{c}{2} {⟨ u^{🟉}, {(d_{v}^{- 1} \exp)}_{x_{m}, \log_{x_{m}} (x_{s})} ({(d_{x} \exp)}_{x_{m}, \log_{x_{m}} (x_{s})} ({\dot{x}}_{m}) - {\dot{x}}_{s}) ⟩}_{x_{m}}$ may be majorized by the help of the Cauchy-Schwarz inequality, by the assumptions on the continuity of the pushforward maps of the exponential map and by the inequalities (447), namely, $\forall ε_{β} > 0$ , $\exists δ_{β} > 0$ such that $\forall x_{s}, x_{m} \in M$ for which $D (x_{s}, x_{m}) < δ_{β}$ then

$\begin{matrix} \frac{c}{2} {⟨ u^{🟉}, {(d_{v}^{- 1} \exp)}_{x_{m}, \log_{x_{m}} (x_{s})} ({(d_{x} \exp)}_{x_{m}, \log_{x_{m}} (x_{s})} ({\dot{x}}_{m}) - {\dot{x}}_{s}) ⟩}_{x_{m}} \\ ⩽ \frac{c}{2} ∥ u^{🟉} ∥_{x_{m}} {∥ {(d_{v}^{- 1} \exp)}_{x_{m}, \log_{x_{m}} (x_{s})} ({(d_{x} \exp)}_{x_{m}, \log_{x_{m}} (x_{s})} ({\dot{x}}_{m}) - {\dot{x}}_{s}) ∥}_{x_{m}} \\ < \frac{c}{2} {∥ u^{🟉} ∥}_{x_{m}} (ε_{β} + {∥ {\dot{x}}_{m} - P^{x_{s} \to x_{m}} ({\dot{x}}_{s}) ∥}_{x_{m}}) \\ = \frac{c}{2} {∥ u^{🟉} ∥}_{x_{m}} (ε_{β} + \frac{c}{2} D (x_{s}, x_{m})) \\ = \frac{c}{2} \sqrt{2 σ} (ε_{β} + \frac{c}{2} D (x_{s}, x_{m})) . \end{matrix}$

(451)

Majorizing the quantity in (448) by means of the inequalities (449)–(451), and choosing

ε_{φ} = \frac{ε}{3}

,

ε_{ψ} = \frac{ε}{3}

and

ε_{β} = \frac{2 ε}{3 c}

, for t sufficiently large to fulfill

D (x_{s}, x_{m}) < min {δ_{φ}, δ_{β}, δ_{ψ}}

, it holds that

\dot{σ} (t) < - k^{🟉} σ (t) + \sqrt{2 σ (t)} ρ (t),

(452)

with

k^{🟉}

and

ρ (t)

defined as in (389). By a Grönwall-like majorization argument, we conclude that the control effort decays to zero at least exponentially.
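The first step of such a majorization argument may be sketched as follows. The sketch assumes that the gain in (452) is a positive constant, written here as $k^{\star}$, and that $σ > 0$; the auxiliary variable $z$ and the initial time $t_{0}$ are introduced only for illustration. Since $\sqrt{2 σ}$ equals the norm of the control field, the substitution turns (452) into a linear differential inequality, to which the standard comparison lemma applies:

```latex
\begin{aligned}
z &:= \sqrt{2\sigma} \quad\Longrightarrow\quad \sigma = \tfrac{1}{2} z^{2}, \qquad \dot{\sigma} = z\,\dot{z}, \\
z\,\dot{z} &< -\tfrac{k^{\star}}{2}\, z^{2} + z\,\rho(t)
\quad\Longrightarrow\quad \dot{z} < -\tfrac{k^{\star}}{2}\, z + \rho(t) \qquad (z > 0), \\
z(t) &\le e^{-\frac{k^{\star}}{2}(t - t_{0})}\, z(t_{0}) + \int_{t_{0}}^{t} e^{-\frac{k^{\star}}{2}(t - \tau)}\, \rho(\tau)\, d\tau .
\end{aligned}
```

The decay rate of $z$ is therefore governed by $k^{\star} / 2$ and by the behavior of the forcing term $ρ (t)$ defined in (389).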

6.4. Feedback Error Velocity, Principal Pushforward Map and Relation to Curvature

The cornerstone of control theory is error feedback control, based on the definition of a suitable error figure between the actual state of a system to control and a desired state, to be fed to a control algorithm. The control algorithm, in turn, will produce the appropriate commands to the system aimed at reducing the value of the error, hence forming a closed loop. Denoting by e such control error, it pays to study the evolution law of the error, often determined by a differential equation in the terms

\dot{e}

and

\ddot{e}

, in order to make sure that the error effectively converges to zero (asymptotically). Such a differential equation describes an error system, as exemplified in Section 6.1. The above-mentioned time-derivatives of the error also intervene in designing the control algorithm itself, which in industrial applications is often chosen to include proportional, integral and derivative (PID) terms. It is, in fact, known that each of these terms may contribute to shaping the error system and to compensating for the others’ drawbacks in order to attain a desired dynamics for the error field e.

When considering two dynamical systems whose common state space is a curved Riemannian manifold

M

, whose states are described by the manifold-valued variables

x, y \in M

(where, say, the reference system has state-variable x), there is no straightforward definition of error. According to the prevailing literature (see, e.g., [55,56]) and to the present author’s personal experience, the error term

e \in T_{x} M

should be defined as a tangent field along the trajectory of one of the systems as, for example,

e : = \log_{x} y

. A problem with this definition often comes when formulating the derivative and integral control terms. What is most relevant to the present discussion is the computation of a derivative term which, for consistency, cannot be the naïve derivative

\dot{e}

of the error but needs to be, for example, the covariant derivative

ε : = \nabla_{t}^{x} e = \nabla_{t}^{x} \log_{x} y

. Such calculation was already detailed in Section 6.3 and we may recall from Equation (405) that

ε = \nabla_{\dot{x}} \log_{x} y = {(d_{v}^{- 1} \exp)}_{x, \log_{x} y} (\dot{y}) - {(d_{v}^{- 1} \exp)}_{x, \log_{x} y} ({(d_{x} \exp)}_{x, \log_{x} y} (\dot{x})) .

(453)

In practice, such an expression, which depends on the partial pushforwards of the exponential map, is deemed too convoluted to employ and hence is replaced by an expression that retains the meaning of error velocity and the property of being a tangent vector field, yet is much simpler to utilize, namely

ε : = P^{y \to x} (\dot{y}) - \dot{x} .

(454)

It is interesting to recall that such an error term appears in the Cucker-Smale model on manifold recalled in the introductory discussion of the present Section.
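As an illustration, the error and the simplified error velocity (454) may be sketched numerically on the unit sphere. The sketch below assumes the standard closed form of parallel transport along the minimizing geodesic of the sphere and the logarithmic map of the sphere; the function names and the numerical values are hypothetical and serve only as an example.

```python
import numpy as np

def sphere_log(x, y):
    """Logarithmic map log_x(y) on the unit sphere (x and y not antipodal)."""
    d = np.arccos(np.clip(x @ y, -1.0, 1.0))        # geodesic distance D(x, y)
    # np.sinc is the normalized sinc, sin(pi z)/(pi z), hence the division by pi
    return (y - (x @ y) * x) / np.sinc(d / np.pi)

def transport(a, b, w):
    """Parallel transport of w in T_a S^{n-1} to T_b S^{n-1} (assumed closed form, a != -b)."""
    return w - (b @ w) / (1.0 + a @ b) * (a + b)

def error_velocity(x, x_dot, y, y_dot):
    """Simplified error velocity (454): eps = P^{y->x}(y_dot) - x_dot."""
    return transport(y, x, y_dot) - x_dot

# Illustrative states and velocities (each velocity is tangent at its base point)
x = np.array([1.0, 0.0, 0.0]); x_dot = np.array([0.0, 0.3, 0.0])
y = np.array([0.0, 1.0, 0.0]); y_dot = np.array([0.0, 0.0, 0.5])

e = sphere_log(x, y)                       # error e = log_x(y), a vector of T_x S^2
eps = error_velocity(x, x_dot, y, y_dot)   # error velocity, again a vector of T_x S^2
```

Both quantities belong to the tangent space at x, so they may be combined linearly in a control law, which is the property that the definition (454) is meant to retain.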

Now, the dynamics of the error depends on the (covariant) time-derivative of the velocity of the error, namely on

\nabla_{t}^{x} ε = \nabla_{t}^{x} P^{y \to x} (\dot{y}) - \nabla_{t}^{x} \dot{x}

. The second term coincides with the state-acceleration of one of the systems, while the term

\nabla_{t}^{x} P^{y \to x} (\dot{y})

needs to be studied separately. The present section is devoted to the analysis of the quantity

\nabla_{\dot{x}} P^{y \to x} (\dot{y})

. The analysis carried out in the present section elucidates some interesting algebraic properties of the principal pushforward map associated to the parallel transport operator and the connection between such principal pushforward map and the curvature endomorphism of the state manifold.

The scientific content recalled in the present section is largely based on the earlier contribution [18].

6.4.1. Algebraic Properties of the Principal Pushforward Map of Parallel Transport

We have already shown in Section 6.3 and, in particular, by Equation (423) that, given two smooth curves

t \mapsto x (t)

and

t \mapsto y (t)

in

M

and a smooth vector field

w_{y (t)} \in T_{y (t)} M

, the covariant derivative of the vector field

x \mapsto P^{y \to x} (w_{y})

along the velocity field of the curve

x (t)

reads

\nabla_{\dot{x}} P^{y \to x} (w_{y}) = {(d_{x} P^{y \to x})}_{x, y, w_{y}} (\dot{x}) - P^{y \to x} ({(d_{x} P^{y \to x})}_{y, x, P^{y \to x} (w_{y})} (\dot{y})) + P^{y \to x} (\nabla_{\dot{y}} w_{y}),

(455)

where

{(d_{x} P)}_{x, 🟉, 🟉} \in End (T_{x} M)

for any

x \in M

. Such differential map is termed principal pushforward map since the covariant derivative

\nabla P

may be expressed entirely on the basis of such map without any need to refer to the further partial pushforward map

d_{v} P

.

One might notice that the covariant derivative

\nabla_{\dot{x}} P^{y \to x} (w_{y})

of the transported field equals the sum of an ‘obvious’ term, given by the transported covariant derivative

P^{y \to x} (\nabla_{\dot{y}} w_{y})

, and of further terms purely related to the dynamics of parallel transport and, hence, to the curvature of the state manifold

M

. (Such terms appear to be somehow reminiscent of the kinematic corrections due to a change in reference systems in physics.)

The principal pushforward map enjoys some interesting algebraic properties. Since these are not directly instrumental to the present analysis, let us recall them under the form of an example.

Example 32.

Denoting by

{(d_{x} P^{y \to x})}_{x, y, w_{y}}^{†} : T_{x} M \to T_{x} M

the adjoint (or dual) of the operator

{(d_{x} P^{y \to x})}_{x, y, w_{y}}

with respect to the Riemannian metric of

M

, it holds that

{(d_{x} P^{y \to x})}_{x, y, P^{x \to y} (v_{x})}^{†} (v_{x}) = 0

(456)

for every

x, y \in M

and

v_{x} \in T_{x} M

. In order to explain such statement, let us recall that, since the parallel transport is an isometry, the vector fields

t \mapsto (x (t), v (t))

and

t \mapsto (y (t), w (t))

enjoy the property

{∥ v_{x} (t) ∥}_{x (t)}^{2} = {∥ w_{y} (t) ∥}_{y (t)}^{2} .

(457)

Taking the derivative of both sides with respect to the parameter t yields the equation

{⟨ v_{x}, \nabla_{\dot{x}} v_{x} ⟩}_{x} = {⟨ w_{y}, \nabla_{\dot{y}} w_{y} ⟩}_{y} .

(458)

By replacing the expression (455) for the covariant derivative

\nabla_{\dot{x}} v_{x}

in (458), we get:

\begin{matrix} {⟨ v_{x}, {(d_{x} P^{y \to x})}_{x, y, w_{y}} (\dot{x}) ⟩}_{x} - {⟨ v_{x}, (P^{y \to x} \circ {(d_{x} P^{y \to x})}_{y, x, v_{x}}) (\dot{y}) ⟩}_{x} \\ + {⟨ P^{y \to x} (w_{y}), P^{y \to x} (\nabla_{\dot{y}} w_{y}) ⟩}_{x} = {⟨ w_{y}, \nabla_{\dot{y}} w_{y} ⟩}_{y} . \end{matrix}

By way of the isometry property of parallel transport, it is immediate to see that the last term on the left-hand side equals the term on the right-hand side, therefore the above property is equivalent to

{⟨ v_{x}, {(d_{x} P^{y \to x})}_{x, y, w_{y}} (\dot{x}) ⟩}_{x} = {⟨ v_{x}, (P^{y \to x} \circ {(d_{x} P^{y \to x})}_{y, x, v_{x}}) (\dot{y}) ⟩}_{x} .

(459)

It pays to recall that, given a linear operator

L : V \to V

on a vector space

V

endowed with a positive-definite inner product

⟨ \cdot, \cdot ⟩ : V^{2} \to R

, its adjoint

L^{†}

with respect to the metric satisfies the relation

⟨ L (v), u ⟩ = ⟨ v, L^{†} (u) ⟩

for every

u, v \in V

. The left-hand side of the Equation (459) may thus be rewritten as

{⟨ {(d_{x} P^{y \to x})}_{x, y, P^{x \to y} (v_{x})}^{†} (v_{x}), \dot{x} ⟩}_{x}

. Since x, y,

\dot{x}

and

\dot{y}

are arbitrary, the Equation (459) may hold only if the identity (456) holds.

Let us further recall the following two properties of linear operators on vector spaces:

Equivalence of matrix representations: Given a linear operator $L : V \to V$ on a vector space $V \subset R^{m}$ , two of its matrix representations $〚 L 〛_{1}$ and $〚 L 〛_{2}$ are equivalent on the vector space $V$ if $〚 L 〛_{1} x = 〚 L 〛_{2} x$ for any $x \in V$ , although the matrices $〚 L 〛_{1}$ and $〚 L 〛_{2}$ may appear different.
Representation of the adjoint: If $〚 L 〛$ is a matrix representation of a linear operator $L : V \to V$ on a vector space $V \subset R^{m}$ with respect to an orthonormal basis of $V$ , then the matrix representation of its adjoint, namely $〚 L^{†} 〛$ , coincides with the transpose $〚 L 〛^{T}$ (this is a reason why the adjoint is sometimes also referred to as transpose operator).
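The two properties above may be checked numerically on a tangent space of the unit hypersphere, regarded as a subspace of the ambient space. In the following minimal sketch, the matrix `L1` is a randomly generated, hypothetical representation and the coefficient 0.7 is arbitrary; adding any multiple of $x x^{⊤}$ does not alter the action of the matrix on vectors orthogonal to x.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
x = rng.standard_normal(n)
x /= np.linalg.norm(x)                      # a point on the unit sphere S^{n-1}
proj = np.eye(n) - np.outer(x, x)           # orthogonal projector onto T_x S^{n-1}

L1 = rng.standard_normal((n, n))            # hypothetical matrix representation of an operator
L2 = L1 + 0.7 * np.outer(x, x)              # a different matrix (0.7 is an arbitrary coefficient)

v = proj @ rng.standard_normal(n)           # two tangent vectors at x
u = proj @ rng.standard_normal(n)

# Equivalence: the two matrices act identically on the tangent space
same_action = np.allclose(L1 @ v, L2 @ v)

# Adjoint as transpose with respect to the Euclidean inner product: <L v, u> = <v, L^T u>
adjoint_gap = (L1 @ v) @ u - v @ (L1.T @ u)
```

The flag `same_action` compares the action of the two matrices on a tangent vector, while `adjoint_gap` measures the mismatch in the defining relation of the adjoint.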

It is immediate to recognize that the matrix-representation (435) of the operator

{(d_{x} P^{y \to x})}_{x, y, w_{y}}

is not unique. For example, if we add to the matrix

〚 {(d_{x} P^{y \to x})}_{x, y, w_{y}} 〛

the term

x^{⊤} w_{y} {(1 + x^{⊤} y)}^{- 1} x x^{⊤}

, we get an equivalent representation of the map

{(d_{x} P^{y \to x})}_{x, y, w_{y}}

given by:

\begin{matrix} 〚 {(d_{x} P^{y \to x})}_{x, y, w_{y}} 〛_{1} : = & - \frac{(x^{⊤} w_{y}) (I_{n} + x y^{⊤} - x x^{⊤}) + (y + x) w_{y}^{⊤}}{1 + x^{⊤} y} \\ + x w_{y}^{⊤} + \frac{(x^{⊤} w_{y}) (y + x) y^{⊤}}{{(1 + x^{⊤} y)}^{2}} . \end{matrix}

(460)

A further equivalent representation is

〚 {(d_{x} P^{y \to x})}_{x, y, w_{y}} 〛_{2} : = 〚 {(d_{x} P^{y \to x})}_{x, y, w_{y}} 〛_{1} - \frac{{(w_{y}^{⊤} x)}^{2}}{{(1 + x^{⊤} y)}^{2}} \frac{[P^{y \to x} (w_{y})] x^{⊤}}{∥ w_{y} ∥^{2}},

(461)

that serves to exemplify the property (456). In fact, in the present context, it holds that

〚 {(d_{x} P^{y \to x})}_{x, y, w_{y}}^{†} 〛 = 〚 {(d_{x} P^{y \to x})}_{x, y, w_{y}} 〛^{⊤}

and it is not hard to verify that

〚 {(d_{x} P^{y \to x})}_{x, y, P^{x \to y} (v_{x})} 〛_{2}^{⊤} v_{x} = 0

, no matter how one chooses the points

x \in S^{n - 1}

,

y \in S^{n - 1} \setminus \{- x\}

and the tangent vector v in

T_{x} S^{n - 1}

.

As a further property of the principal pushforward map, more related to system dynamics, let us consider two trajectories

t \mapsto x (t)

and

t \mapsto y (t)

that intersect at a point

p \in M

. Since

v_{x} = P^{y \to x} (w_{y})

, when

x = y = p

the vector fields

v_{x}

and

w_{y}

align to one another at p, namely

v_{p} \equiv w_{p} \in T_{p} M

. On the other hand, the two curves are arbitrary and we assume that

{\dot{x}}_{p} \neq {\dot{y}}_{p}

. An interesting finding is that at the point p,

\nabla_{\dot{x}} v_{x} = \nabla_{\dot{y}} w_{y}

. In fact, at any points

x, y \in M

and for every

w_{y} \in T_{y} M

, it holds that

{(\nabla_{\dot{x}} v_{x})}_{x} = {(d_{x} P^{y \to x})}_{x, y, w_{y}} ({\dot{x}}_{x}) - P^{y \to x} (({(d_{x} P^{y \to x})}_{y, x, P^{y \to x} (w_{y})}) ({\dot{y}}_{y}) - {(\nabla_{\dot{y}} w_{y})}_{y}) .

(462)

Setting

x = y = p

gives

{(\nabla_{\dot{x}} v_{x})}_{p} = {(d_{x} P^{y \to x})}_{p, p, w_{p}} ({\dot{x}}_{p}) - {(d_{x} P^{y \to x})}_{p, p, w_{p}} ({\dot{y}}_{p}) + {(\nabla_{\dot{y}} w_{y})}_{p},

(463)

namely

{(\nabla_{\dot{x}} v_{x})}_{p} - {(\nabla_{\dot{y}} w_{y})}_{p} = {(d_{x} P^{y \to x})}_{p, p, w_{p}} ({\dot{x}}_{p} - {\dot{y}}_{p}) .

(464)

The right-hand side is identically zero because the principal pushforward map, evaluated on the diagonal where the two base points coincide, vanishes, whence the conclusion follows.

The above statement, written explicitly for the unit hyper-sphere, leads indeed to non-trivial constraints on the naïve derivatives

{({\dot{v}}_{x})}_{p}, {({\dot{w}}_{y})}_{p} \in A

. By using the Equation (115) for the covariant derivative, the above property reads

{({\dot{v}}_{x})}_{p} + p v_{p}^{⊤} {\dot{x}}_{p} = {({\dot{w}}_{y})}_{p} + p w_{p}^{⊤} {\dot{y}}_{p} .

(465)

Since

v_{p} = w_{p}

, the above equation may be rewritten, for example, as

{({\dot{v}}_{x})}_{p} - {({\dot{w}}_{y})}_{p} = p v_{p}^{⊤} ({\dot{y}}_{p} - {\dot{x}}_{p}) .

(466)

Whenever

{\dot{y}}_{p} = {\dot{x}}_{p}

, it happens trivially that

{({\dot{v}}_{x})}_{p} = {({\dot{w}}_{y})}_{p}

; otherwise, the two naïve derivatives differ by the amount

v_{p}^{⊤} ({\dot{y}}_{p} - {\dot{x}}_{p})

along the radial direction p.
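The role of the radial correction may be illustrated numerically. The sketch below assumes that the covariant derivative on the unit sphere takes the form appearing in (465), namely the naïve derivative plus the radial term $x v^{⊤} \dot{x}$; the curve (a small circle of colatitude `a`) and the tangent field (its own velocity) are hypothetical choices made for illustration only.

```python
import numpy as np

a, t = np.pi / 3, 0.4   # colatitude of the small circle and evaluation time (illustrative)

# Curve x(t) on S^2 and its velocity
x     = np.array([np.sin(a) * np.cos(t), np.sin(a) * np.sin(t), np.cos(a)])
x_dot = np.sin(a) * np.array([-np.sin(t), np.cos(t), 0.0])

# Tangent field v(t) := x_dot(t) along the curve and its naive derivative
v     = x_dot.copy()
v_dot = np.sin(a) * np.array([-np.cos(t), -np.sin(t), 0.0])

# Covariant derivative in the form used in (465): naive derivative plus radial correction
cov = v_dot + x * (v @ x_dot)

radial_naive = x @ v_dot    # radial component of the naive derivative, equal to -(v . x_dot)
radial_cov   = x @ cov      # radial component of the covariant derivative
```

The naïve derivative has a non-zero component along the radial direction, which the correction term removes, so that the covariant derivative is tangent to the sphere.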

To conclude this review of the abstract properties of the principal pushforward map, let us suppose that one of the two trajectories

x (t)

and

y (t)

is constant, for example, suppose that

y (t) = y = \text{constant} \quad \text{for every } t \in I,

(467)

in such a way that its image reduces to a point-set

\{y\}

. As a consequence, we have that

\dot{y} = 0

and therefore

\nabla_{\dot{y}} w_{y} = 0

and

{(d_{x} P^{y \to x})}_{y, x, v_{x}} (\dot{y}) = 0

. Nevertheless, the vector field

v_{x} (t) = P^{y \to x (t)} (w_{y})

changes over time and its covariant derivative along the velocity field

\dot{x}

is still given by the relation (455) that, by the hypothesis made, reads:

{(d_{x} P^{y \to x})}_{x, y, w_{y}} (\dot{x}) = {\nabla_{\dot{x}} P^{y \to x} (w_{y})|}_{y (t) = const} .

(468)

The above relation supplies a meaning to the principal pushforward map

{(d_{x} P^{y \to x})}_{x, y, w_{y}}

as the covariant derivative of the transported ‘constant’ vector field

w_{y}

.
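The interpretation (468) suggests a direct way to evaluate the principal pushforward map numerically: keep the point y fixed, transport a fixed vector to a moving point x and take the covariant derivative of the result. The following sketch does so on the sphere $S^{2}$ by central differences. It assumes the standard closed forms of the exponential map and of parallel transport on the sphere, and the form of the covariant derivative appearing in (465); all names and numerical values are illustrative.

```python
import numpy as np

def sphere_exp(x, v):
    """Exponential map on the unit sphere (np.sinc is normalized, hence the division by pi)."""
    nv = np.linalg.norm(v)
    return x * np.cos(nv) + v * np.sinc(nv / np.pi)

def transport(a, b, w):
    """Parallel transport of w from T_a to T_b along the minimizing geodesic (a != -b)."""
    return w - (b @ w) / (1.0 + a @ b) * (a + b)

def principal_pushforward(x, y, w, x_dot, h=1e-5):
    """Finite-difference estimate of (d_x P^{y->x})_{x,y,w}(x_dot) via (468), y held fixed."""
    v_plus  = transport(y, sphere_exp(x,  h * x_dot), w)
    v_minus = transport(y, sphere_exp(x, -h * x_dot), w)
    v_dot = (v_plus - v_minus) / (2.0 * h)        # naive derivative of the transported field
    v = transport(y, x, w)
    return v_dot + x * (v @ x_dot)                # covariant derivative on the sphere

theta = 0.8
y = np.array([1.0, 0.0, 0.0])
x = np.array([np.cos(theta), np.sin(theta), 0.0])
tau = np.array([-np.sin(theta), np.cos(theta), 0.0])   # unit tangent of the geodesic from y at x
w = np.array([0.0, 0.0, 1.0])                          # vector of T_y S^2
x_dot = 0.5 * w                                        # velocity of x, transversal to the geodesic

dP = principal_pushforward(x, y, w, x_dot)
```

For this configuration, a direct calculation gives the value $0.5 \tan (θ / 2) τ$, which is orthogonal to the transported vector, in agreement with the isometry argument of Example 32 specialized to a constant y.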

6.4.2. Relationship between the Principal Pushforward Map and the Curvature Endomorphism

Parallel transport does not behave trivially under composition. For instance, given three points

x, y, z \in M

not aligned along the same geodesic arc, in general it happens that

P^{y \to x} \circ P^{z \to y} \neq P^{z \to x}

. Such a fundamental observation leads to holonomy theory, which is closely related to the theory of manifold curvature by the Ambrose–Singer theorem [57]. The aim of the current section is to explore and exemplify such a relationship in some detail.

To start with, let us recall that to any point

x \in M

is associated an algebraic structure

{Hol}_{x} (M)

termed holonomy group, formed by the set of linear operators that realize a parallel transport along any piecewise smooth closed loop originating and ending at the base-point x. An instance of a piecewise smooth closed loop that is easy to visualize is a geodesic triangle determined by three non-aligned points

x, y, z \in M

and by the three geodesic arcs that connect them. Then, given a tangent vector

w_{y} \in T_{y} M

, one may start by transporting the vector

w_{y}

from the point y to the point z along the geodesic arc originating at the point y and ending in the point z by means of the parallel transport operator

P^{y \to z}

, then one may transport the resulting tangent vector from the point z to the point x along the geodesic line connecting such points by means of the parallel transport operator

P^{z \to x}

, and then one may transport the resulting tangent vector from the point x back to the point y along the geodesic line connecting such pair of points by means of the parallel transport operator

P^{x \to y}

. The resulting tangent vector, that we denote as

{\bar{w}}_{y} : = (P^{x \to y} \circ P^{z \to x} \circ P^{y \to z}) (w_{y})

(469)

will, in general, differ from the vector

w_{y}

that one started with, because of the curvature of the manifold. The compound linear map

H_{y} (z, x) : = P^{x \to y} \circ P^{z \to x} \circ P^{y \to z}

is an element of the holonomy group, namely,

H_{y} \in {Hol}_{y} (M)

. Notice that, since each element of the parallel transport chain in the definition of the operator

H

realizes an isometry, the map

H_{y}

represents an isometry as well, therefore

∥ {\bar{w}}_{y} ∥_{y} = {∥ w_{y} ∥}_{y}

.
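The construction (469) may be illustrated on the ordinary sphere $S^{2}$ by taking the geodesic triangle whose vertices are the three canonical basis vectors, which bounds one octant of the sphere. The sketch assumes the standard closed form of parallel transport along minimizing geodesics of the sphere; the helper names are hypothetical.

```python
import numpy as np

def transport(a, b, w):
    """Parallel transport of w from T_a S^2 to T_b S^2 along the minimizing geodesic (a != -b)."""
    return w - (b @ w) / (1.0 + a @ b) * (a + b)

def holonomy(y, z, x, w):
    """H_y(z, x) w = (P^{x->y} o P^{z->x} o P^{y->z})(w), as in (469)."""
    return transport(x, y, transport(z, x, transport(y, z, w)))

y, z, x = np.eye(3)              # vertices e1, e2, e3 of the geodesic triangle
w = np.array([0.0, 1.0, 0.0])    # tangent vector at y = e1
w_bar = holonomy(y, z, x, w)     # vector obtained after the round trip y -> z -> x -> y
```

In this example the vector returns to the tangent space at y with the same norm but rotated by a right angle, which matches the area $π / 2$ enclosed by the triangle on the unit sphere.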

An interesting relationship between the notion of principal pushforward map

(d_{x} P^{y \to x})

and that of manifold holonomy descends directly from the definition (426). In fact, such definition may be rewritten equivalently as

\begin{matrix} {(d_{x} P^{y \to x})}_{x (t), y (t), w_{y (t)}} (\dot{x} (t)) \\ = lim_{s \to 0} \frac{1}{s} \{(P^{x (t + s) \to x (t)} \circ P^{y (t) \to x (t + s)}) (w_{y (t)}) - P^{y (t) \to x (t)} (w_{y (t)})\}, \\ = lim_{s \to 0} \frac{1}{s} P^{y (t) \to x (t)} \{(P^{x (t) \to y (t)} \circ P^{x (t + s) \to x (t)} \circ P^{y (t) \to x (t + s)}) (w_{y (t)}) - w_{y (t)}\}, \\ = P^{y (t) \to x (t)} \{lim_{s \to 0} \frac{1}{s} ((P^{x (t) \to y (t)} \circ P^{x (t + s) \to x (t)} \circ P^{y (t) \to x (t + s)}) (w_{y (t)}) - w_{y (t)})\} . \end{matrix}

(470)

The last line of the above relationship reveals that the argument of the limit involves a comparison between the vector

w_{y}

and the result of its transformation by the operator

H_{y (t)} (x (t + s), x (t)) = P^{x (t) \to y (t)} \circ P^{x (t + s) \to x (t)} \circ P^{y (t) \to x (t + s)}

belonging to the holonomy group

{Hol}_{y} (M)

and, in particular, the relationship

{(d_{x} P^{y \to x})}_{x, y, w_{y}} (\dot{x}) = P^{y \to x} \{lim_{s \to 0} \frac{1}{s} (H_{y} (x (🟉 + s), x) - {id}_{y}) w_{y}\}

(471)

holds true. In turn, as we have already observed, the above-defined operator

H

represents parallel transport along a geodesic triangle, which is an instance of a closed loop. Since the curvature endomorphism arises from parallel transport along an infinitesimal closed loop, one may anticipate that parallel transport along a non-infinitesimal loop may be expressed as a path-integral of the curvature.

In order to clarify the relationship that relates the principal pushforward map to the curvature endomorphism, it is instrumental to recall a few results from manifold calculus.

As a first result to recall, which stands by itself as a fundamental finding in manifold calculus, let us survey the notion of anti-covariant-derivative, a counterpart of the notion of anti-derivative in standard calculus. Such a notion may be expressed in the following way: Let

γ : [0, 1] \to M

denote a curve and

w \in Γ (M)

a smooth vector field defined at least on the image of

γ

and such that

w_{γ (0)} = 0

; then one might wonder how to recover the vector field w from its covariant derivative

v_{γ (t)} : = \nabla_{t}^{γ} w

, supposed to be known. The answer is given by the anti-covariant-derivative formula

w_{γ (t)} = \int_{0}^{t} P_{γ}^{τ \to t} (v_{γ (τ)}) d τ .

(472)

Let us verify the correctness of such assertion. By definition of covariant derivative, it holds that

\begin{matrix} \nabla_{t}^{γ} w = & lim_{s \to 0} \frac{1}{s} \{P_{γ}^{t + s \to t} \int_{0}^{t + s} P_{γ}^{τ \to t + s} (v_{γ (τ)}) d τ - \int_{0}^{t} P_{γ}^{τ \to t} (v_{γ (τ)}) d τ\} \\ = & lim_{s \to 0} \frac{1}{s} \{\int_{0}^{t + s} (P_{γ}^{t + s \to t} \circ P_{γ}^{τ \to t + s}) (v_{γ (τ)}) d τ - \int_{0}^{t} P_{γ}^{τ \to t} (v_{γ (τ)}) d τ\} \\ = & lim_{s \to 0} \frac{1}{s} \{\int_{0}^{t} (P_{γ}^{t + s \to t} \circ P_{γ}^{τ \to t + s} - P_{γ}^{τ \to t}) (v_{γ (τ)}) d τ + \int_{t}^{t + s} (P_{γ}^{t + s \to t} \circ P_{γ}^{τ \to t + s}) (v_{γ (τ)}) d τ\} . \end{matrix}

(473)

Moreover, let us recall that parallel transport operators along one and the same curve compose (no matter whether such a curve is a geodesic or a non-geodesic line), therefore

P_{γ}^{t + s \to t} \circ P_{γ}^{τ \to t + s} \equiv P_{γ}^{τ \to t}

. As a consequence, the first integral in the last line of the above multiline equation is identically zero, while the second integral equals

\int_{t}^{t + s} P_{γ}^{τ \to t} (v_{γ (τ)}) d τ = s v_{γ (t)} + O (s^{2}),

(474)

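by a continuity argument; dividing by s and taking the limit for s tending to zero returns the prescribed covariant derivative, hence the assertion (472) follows.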
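The anti-covariant-derivative formula (472) may be tested numerically on a simple case. In the sketch below, the curve is a unit-speed great circle of $S^{2}$, along which the unit tangent and the vector $e_{3}$ are parallel, so that the covariant derivative of a field expressed in this frame is obtained by differentiating its coefficients. Since the curve is a geodesic and the arcs involved are shorter than π, transport along the curve is assumed to coincide with the closed-form transport along the minimizing geodesic. The field and all names are illustrative.

```python
import numpy as np

def transport(a, b, w):
    """Parallel transport of w from T_a S^2 to T_b S^2 along the minimizing geodesic."""
    return w - (b @ w) / (1.0 + a @ b) * (a + b)

def gamma(t):
    """Unit-speed great circle on S^2."""
    return np.array([np.cos(t), np.sin(t), 0.0])

def tau(t):
    """Unit tangent of gamma, a parallel field along gamma."""
    return np.array([-np.sin(t), np.cos(t), 0.0])

e3 = np.array([0.0, 0.0, 1.0])    # a second parallel field along gamma

def v(t):
    """Covariant derivative of the field w(t) = t^2 tau(t) + sin(t) e3, with w(0) = 0."""
    return 2.0 * t * tau(t) + np.cos(t) * e3

def anti_covariant_derivative(t, n=2001):
    """Trapezoidal approximation of (472): integral over s in [0, t] of P^{s->t}(v(s))."""
    s = np.linspace(0.0, t, n)
    vals = np.array([transport(gamma(si), gamma(t), v(si)) for si in s])
    ds = s[1] - s[0]
    return ds * (vals[0] / 2 + vals[1:-1].sum(axis=0) + vals[-1] / 2)

t_end = 1.0
w_rec  = anti_covariant_derivative(t_end)                # field recovered through (472)
w_true = t_end**2 * tau(t_end) + np.sin(t_end) * e3      # field known in closed form
```

The recovered vector `w_rec` may be compared with `w_true`; the discrepancy is due only to the quadrature rule.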

The second result that is convenient to recall is based on the notion of geodesic homotopy. Let us define the function

h : [- ϵ, ϵ] \times [0, 1] \to M

to be a geodesic homotopy, namely, a family of curves such that for any fixed

s \in [- ϵ, ϵ]

,

t \mapsto h (s, t)

traces a geodesic curve on

M

connecting the point

h (s, 0)

to the point

h (s, 1)

. The image of a homotopy is sometimes referred to as a homotopic net. In addition, let us define

σ : [- ϵ, ϵ] \times [0, 1] \to T M

to be a smooth vector field parallel along the homotopy, namely, such that

σ (s, t) \in T_{h (s, t)} M

and

(\nabla_{\partial_{s} h} σ) (s, 0) = 0 \quad \text{and} \quad \nabla_{\partial_{t} h} σ (s, t) = 0

(475)

for every value of the index

s \in [- ϵ, ϵ]

and for every value of the affine parameter

t \in [0, 1]

. Namely, the vector field

σ

is parallel along the back-end curve

h (s, 0)

and along each geodesic

h (🟉, t)

of the homotopic net. Notice that one such vector field is entirely determined by its given value

σ (0, 0)

and by the homotopy itself. Then, the following relationship holds:

(\nabla_{\partial_{s} h} σ) (s, 1) = (\int_{0}^{1} P^{h (s, t) \to h (s, 1)} \circ R_{h (s, t)}^{\partial_{t} h, \partial_{s} h} \circ P^{h (s, 1) \to h (s, t)} d t) σ (s, 1),

(476)

where

R

denotes the Riemannian curvature endomorphism associated to the manifold and to its Levi-Civita connection. Namely, the transversal covariant derivative of the vector field along the front-end curve

h (s, 1)

is given by a linear operator, which depends on the curvature of the manifold along the homotopic net, applied to the field. In the following, to ease notation we shall denote as

ξ : = \partial_{s} h

the transversal velocity and as

v : = \partial_{t} h

the velocity along each geodesic belonging to the homotopic net. By definition of Riemannian curvature endomorphism, we have that

\nabla_{v} \nabla_{ξ} σ = \nabla_{ξ} \nabla_{v} σ + \nabla_{[v, ξ]} σ + R_{h}^{v, ξ} σ .

(477)

As we had already observed earlier, the vector fields v and

ξ

commute, hence the Lie-bracket term in (477) vanishes; moreover, by hypothesis, the vector field

σ

is parallel along the homotopy, therefore

\nabla_{v} σ = 0

. As a consequence, the expression (477) simplifies to

(\nabla_{v} \nabla_{ξ} σ) (s, t) = R_{h (s, t)}^{v (s, t), ξ (s, t)} σ (s, t) .

(478)

By the assumption of parallelism, for any given index s, the vector

σ (s, t)

may be recovered from its front-end value

σ (s, 1)

through parallel transport, hence

(\nabla_{v} \nabla_{ξ} σ) (s, t) = R_{h (s, t)}^{v (s, t), ξ (s, t)} P^{h (s, 1) \to h (s, t)} σ (s, 1) .

(479)

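Applying the anti-covariant-derivative formula (472) along the geodesic of index s to the expression (479), and recalling from the first of the conditions (475) that the transversal covariant derivative of the field vanishes on the back-end curve, gives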

(\nabla_{ξ} σ) (s, t) = \int_{0}^{t} P^{h (s, τ) \to h (s, t)} (R_{h (s, τ)}^{v (s, τ), ξ (s, τ)} P^{h (s, 1) \to h (s, τ)} σ (s, 1)) d τ .

(480)

Setting

t = 1

gives the sought result.

On the basis of the Equation (476), it is possible to express the principal pushforward map in terms of the Riemannian curvature endomorphism. Such relationship may be made explicit by observing that the principal pushforward map may be expressed in closed form as

\{\begin{matrix} {(d_{x} P^{y \to x})}_{x, y, w_{y}} (\dot{x}) = (F_{x, y}^{0} \circ P^{y \to x}) (w_{y}), \text{ where} \\ F_{x, y}^{0} : = \int_{0}^{1} P^{\exp_{y} (t \log_{y} x) \to x} \circ R_{x, y, \dot{x}}^{0} (t) \circ P^{x \to \exp_{y} (t \log_{y} x)} t d t, \text{ with} \\ R_{x, y, \dot{x}}^{0} (t) : = R_{\exp_{y} (t \log_{y} x)} ({(d \exp)}_{y, t \log_{y} x} (\log_{y} x), {(d \exp)}_{y, t \log_{y} x} ({(d_{x} \log)}_{y, x} (\dot{x}))) . \end{matrix}

(481)

Such statement may be explained by choosing an appropriate homotopy h and a vector field

σ

. In particular, let us choose the homotopy

h : [- ϵ, ϵ] \times [0, 1] \to M

as

h (s, t) : = \exp_{y} (t \log_{y} \exp_{x} (s \dot{x})),

(482)

where the back-end curve

s \mapsto h (s, 0)

is such that

h (0, 0) = y \in M

and the front-end curve

s \mapsto h (s, 1)

is such that

{\partial_{s} h (s, 1) |}_{s = 0} = {\dot{x} (s) |}_{s = 0}

. In addition, we have denoted, for simplicity, by

\dot{x}

the tangent vector

\dot{x} (0)

. The vector field

σ : [- ϵ, ϵ] \times [0, 1] \to T_{h (s, t)} M

along the corresponding homotopic net is defined as

σ (s, t) : = P^{y \to h (s, t)} (w_{y}),

(483)

with

w_{y} \in T_{y} M

fixed. Evidently, the homotopy (482) traces a smooth geodesic net: in fact, for fixed s, the curve

t \mapsto h (s, t)

traces a geodesic line that departs from the point y and ends at the point

\exp_{x} (s \dot{x})

. Moreover, the vector field (483) satisfies both conditions (475): in fact, for any fixed value of the index s, the vector field

t \mapsto σ (🟉, t)

arises as the result of parallel translating the vector

w_{y}

along the corresponding geodesic of the homotopy net, hence

σ (🟉, t)

is parallel along each geodesic; in addition, the tangent value

σ (s, 0)

is obtained by parallel transporting the vector

w_{y}

along the curve

h (s, 0)

, hence the vector field

s \mapsto σ (s, 0)

is parallel along the back-end arc of which

ξ

represents the velocity. Therefore, from the relationship (476), it follows that

\{\begin{matrix} (\nabla_{ξ} σ) (s, 1) = F (s) (σ (s, 1)), \text{ with} \\ F (s) : = \int_{0}^{1} P^{h (s, t) \to h (s, 1)} \circ R_{h (s, t)}^{v (s, t), ξ (s, t)} \circ P^{h (s, 1) \to h (s, t)} d t, \end{matrix}

(484)

where

F (s) \in End (T_{h (s, 1)} M)

for every

s \in [- ϵ, ϵ]

. The relationship between the covariant derivative

(\nabla_{ξ} σ)

and the principal pushforward map

{(d_{x} P^{y \to x})}_{x, y, w_{y}} (\dot{x})

is given by the observation that

\begin{matrix} {(d_{x} P^{y \to x})}_{h (s, 1), y, w_{y}} (ξ (s, 1)) = & \nabla_{s}^{h (s, 1)} P^{y \to x (s)} (w_{y}) \\ = & (\nabla_{ξ} σ) (s, 1) . \end{matrix}

(485)

Therefore, on the fiducial geodesic corresponding to the value

s = 0

, it holds that

\{\begin{matrix} {(d_{x} P^{y \to x})}_{x, y, w_{y}} (\dot{x}) = F_{0} (σ (0, 1)), \text{ with} \\ F_{0} = \int_{0}^{1} P^{h (0, t) \to h (0, 1)} \circ R_{h (0, t)}^{v (0, t), ξ (0, t)} \circ P^{h (0, 1) \to h (0, t)} d t, \end{matrix}

(486)

where, in particular,

h (0, t) = \exp_{y} (t \log_{y} x)

. The partial derivative of the homotopy (482) with respect to the parameter t, namely the velocity along each geodesic of the homotopic net, reads

(\partial_{t} h) (s, t) = {(d_{x} \exp)}_{y, t \log_{y} \exp_{x} (s \dot{x})} (\log_{y} \exp_{x} (s \dot{x})),

(487)

where

d_{x} \exp : {(T M)}^{2} \to T M

denotes the pushforward map of the exponential map

(x, v) \mapsto \exp_{x} (v)

with respect to the x argument, therefore the velocity of the fiducial geodesic reads

v (0, t) = {(d \exp)}_{y, t \log_{y} x} (\log_{y} x) .

(488)

Instead, the partial derivative of the homotopy (482) computed with respect to the index s reads

(\partial_{s} h) (s, t) = {(d_{x} \exp)}_{y, t \log_{y} \exp_{x} (s \dot{x})} (t {(d_{x} \log)}_{y, \exp_{x} (s \dot{x})} {(d_{x} \exp)}_{x, s \dot{x}} (\dot{x})),

(489)

where

d_{x} \log : M^{2} \times T M \to T M

denotes the pushforward map of the logarithmic map

(y, x) \mapsto \log_{y} x

with respect to the x argument. Such partial derivative at

s = 0

reads

ξ (0, t) = {(d_{x} \exp)}_{y, t \log_{y} x} (t {(d_{x} \log)}_{y, x} {(d \exp)}_{x, 0} (\dot{x})) .

(490)

Since

{(d \exp)}_{x, 0} = {id}_{x}

and the pushforward map

d \exp

is linear, the above relation may be simplified to

ξ (0, t) = t ({(d_{x} \exp)}_{y, t \log_{y} x} \circ {(d_{x} \log)}_{y, x}) (\dot{x}) .

(491)

Such expression emphasizes the fact that the transversal velocity

ξ (0, t)

is linear in the vector

\dot{x}

. In conclusion, substituting Equations (488) and (491) into the expression (486), we get:

\{\begin{matrix} {(d_{x} P^{y \to x})}_{x, y, w_{y}} (\dot{x}) = (F_{0} \circ P^{y \to x}) (w_{y}), \\ F_{0} = \int_{0}^{1} P^{h (0, t) \to h (0, 1)} \circ R_{h (0, t)}^{v (0, t), ξ (0, t)} \circ P^{h (0, 1) \to h (0, t)} d t, \\ v (0, t) = {(d \exp)}_{y, t \log_{y} x} (\log_{y} x), \\ ξ (0, t) = t ({(d_{x} \exp)}_{y, t \log_{y} x} \circ {(d_{x} \log)}_{y, x}) (\dot{x}), \end{matrix}

(492)

which justifies the assertion (481). Notice that the integral in the second equation of (481) is well defined because the integrand is a linear endomorphism in

End (T_{x} M)

.
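The relationship (486) between the principal pushforward map and the curvature may be checked numerically on the sphere $S^{2}$. The sketch below compares the curvature integral with a finite-difference evaluation of the covariant derivative of the transported vector. It assumes the standard closed forms of the exponential map, logarithmic map and parallel transport of the sphere, together with the curvature endomorphism $R^{u, v} (w) = (v^{⊤} w) u - (u^{⊤} w) v$ of the unit sphere; the velocities of the homotopic net are obtained by central differences of the homotopy (482) rather than through closed-form pushforward maps. All names and values are illustrative.

```python
import numpy as np

def sphere_exp(x, v):
    nv = np.linalg.norm(v)
    return x * np.cos(nv) + v * np.sinc(nv / np.pi)   # np.sinc is the normalized sinc

def sphere_log(y, x):
    d = np.arccos(np.clip(x @ y, -1.0, 1.0))
    return (x - (x @ y) * y) / np.sinc(d / np.pi)

def transport(a, b, w):
    """Parallel transport from T_a to T_b along the minimizing geodesic (a != -b)."""
    return w - (b @ w) / (1.0 + a @ b) * (a + b)

def curvature(u, v, w):
    """Curvature endomorphism of the unit sphere: R^{u,v}(w) = (v.w) u - (u.w) v."""
    return (v @ w) * u - (u @ w) * v

def homotopy(s, t, x, y, x_dot):
    """Geodesic homotopy (482)."""
    return sphere_exp(y, t * sphere_log(y, sphere_exp(x, s * x_dot)))

def pushforward_via_curvature(x, y, w, x_dot, n=401, h=1e-5):
    """Trapezoidal approximation of F_0 applied to P^{y->x}(w), as in (486)."""
    sigma1 = transport(y, x, w)
    ts = np.linspace(0.0, 1.0, n)
    vals = []
    for t in ts:
        p = homotopy(0.0, t, x, y, x_dot)
        v = (homotopy(0.0, t + h, x, y, x_dot) - homotopy(0.0, t - h, x, y, x_dot)) / (2 * h)
        xi = (homotopy(h, t, x, y, x_dot) - homotopy(-h, t, x, y, x_dot)) / (2 * h)
        vals.append(transport(p, x, curvature(v, xi, transport(x, p, sigma1))))
    vals = np.array(vals)
    dt = ts[1] - ts[0]
    return dt * (vals[0] / 2 + vals[1:-1].sum(axis=0) + vals[-1] / 2)

def pushforward_via_definition(x, y, w, x_dot, h=1e-5):
    """Covariant derivative of s -> P^{y->x(s)}(w) at s = 0, by central differences."""
    v_plus = transport(y, sphere_exp(x, h * x_dot), w)
    v_minus = transport(y, sphere_exp(x, -h * x_dot), w)
    return (v_plus - v_minus) / (2 * h) + x * (transport(y, x, w) @ x_dot)

theta = 0.8
y = np.array([1.0, 0.0, 0.0])
x = np.array([np.cos(theta), np.sin(theta), 0.0])
tau = np.array([-np.sin(theta), np.cos(theta), 0.0])
w = np.array([0.0, 0.6, 0.8])                        # vector of T_y S^2
x_dot = 0.3 * tau + 0.5 * np.array([0.0, 0.0, 1.0])  # vector of T_x S^2

lhs = pushforward_via_definition(x, y, w, x_dot)
rhs = pushforward_via_curvature(x, y, w, x_dot)
```

The two vectors `lhs` and `rhs` are computed through independent routes, so their agreement, up to discretization error, offers a sanity check of the sign and ordering conventions used in (486).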

Let us apply the above formula to the unit hypersphere as an example of calculation.

Example 33.

For the unit hypersphere

S^{n - 1}

endowed with the canonical metric we have determined that the curvature endomorphism reads

R^{u, v} (w) : = (v^{⊤} w) u - (u^{⊤} w) v,

(493)

for every

u, v, w \in T_{x} S^{n - 1}

and

x \in S^{n - 1}

. Since the curvature endomorphism does not depend explicitly on the point x, we omitted its indication.

A matrix representation of the curvature endomorphism

R

is the curvature tensor

〚 R^{u, v} 〛 : = u v^{⊤} - v u^{⊤}

, from which the bilinearity and antisymmetry properties are easily verified. For the ordinary sphere

S^{2}

, denoting by

\land : R^{3} \times R^{3} \to R^{3}

the outer product in

R^{3}

, the skew-symmetric matrix

〚 R^{u, v} 〛

coincides with the matrix representation of the 3-vector

u \land v

, namely

〚 R^{u, v} 〛^{\lor} = u \land v

.
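The agreement between the endomorphism (493) and its matrix representation, as well as the antisymmetry properties, may be verified numerically. The following sketch uses randomly generated tangent vectors at a random point of $S^{3}$; the function names are hypothetical.

```python
import numpy as np

def curvature(u, v, w):
    """Curvature endomorphism (493): R^{u,v}(w) = (v.w) u - (u.w) v."""
    return (v @ w) * u - (u @ w) * v

def curvature_matrix(u, v):
    """Matrix representation u v^T - v u^T of the endomorphism R^{u,v}."""
    return np.outer(u, v) - np.outer(v, u)

rng = np.random.default_rng(1)
x = rng.standard_normal(4)
x /= np.linalg.norm(x)                         # a point on S^3
proj = np.eye(4) - np.outer(x, x)              # projector onto the tangent space at x
u, v, w = (proj @ rng.standard_normal(4) for _ in range(3))

R = curvature_matrix(u, v)
```

The matrix `R` is skew-symmetric, changes sign when the two vectors u and v are swapped, and maps tangent vectors at x to tangent vectors at x.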

Let us recall that the exponential map is expressed as

\exp_{y} (w) = \{\begin{matrix} y cos (∥ w ∥) + w sinc (∥ w ∥), & \text{if } w \neq 0, \\ y, & \text{if } w = 0, \end{matrix}

(494)

therefore, its inverse, the logarithmic map, is expressed by the relation:

\log_{y} x = \frac{(I_{n} - y y^{⊤}) x}{sinc D (x, y)},

(495)

where it is assumed that the points

x, y \in S^{n - 1}

are not antipodal. In the above formula, the symbol

sinc : R \to R

denotes the cardinal sine function defined as:

sinc ζ : = \{\begin{matrix} ζ^{- 1} sin ζ & \text{for } ζ \neq 0, \\ 1 & \text{for } ζ = 0, \end{matrix}

(496)

and the geodesic distance between two points

x, y \in S^{n - 1}

associated to the canonical metric reads

D (x, y) = {∥ \log_{y} x ∥}_{y} = acos (x^{⊤} y),

(497)

where the inverse cosine function ranges in

[0, π]

.
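The expressions (494)–(497) translate directly into code. In the minimal sketch below, the function names are hypothetical; note that NumPy's `np.sinc` implements the normalized cardinal sine $sin (π z) / (π z)$, hence the rescaling of its argument to obtain the function (496).

```python
import numpy as np

def sinc(z):
    """Cardinal sine (496); np.sinc is normalized, hence the division by pi."""
    return np.sinc(z / np.pi)

def distance(x, y):
    """Geodesic distance (497); clipping guards against round-off outside [-1, 1]."""
    return np.arccos(np.clip(x @ y, -1.0, 1.0))

def sphere_exp(y, w):
    """Exponential map (494); the case w = 0 is covered because sinc(0) = 1."""
    nw = np.linalg.norm(w)
    return y * np.cos(nw) + w * sinc(nw)

def sphere_log(y, x):
    """Logarithmic map (495), for non-antipodal points x and y."""
    return (x - (x @ y) * y) / sinc(distance(x, y))

y = np.array([0.0, 0.0, 1.0])      # base point
w = np.array([0.3, -0.4, 0.0])     # tangent vector at y, of norm 0.5
x = sphere_exp(y, w)               # end point of the geodesic arc
```

The point x lies on the sphere at geodesic distance 0.5 from y, and the logarithmic map recovers the tangent vector w from the pair (y, x).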

Matrix representations of the derivatives of the exponential and the logarithmic maps are computed to be

\begin{matrix} 〚 {(d \exp)}_{y, w} 〛 = cos (∥ w ∥) I_{n} - sinc (∥ w ∥) y w^{⊤}, \end{matrix}

(498)

\begin{matrix} 〚 {(d_{x} \log)}_{y, x} 〛 = \frac{I_{n} - y y^{⊤}}{{sinc}^{2} D (x, y)} [sinc D (x, y) I_{n} - ({sinc}^{'} D (x, y)) {acos}^{'} (x^{⊤} y) x y^{⊤}], \end{matrix}

(499)

where the prime denotes derivation. The corresponding matrix representation of the endomorphism

R^{0}

takes the form:

〚 R^{0} 〛 = 〚 {(d \exp)}_{y, t \log_{y} x} 〛 \{(\log_{y} x) {\dot{x}}^{⊤} 〚 {(d_{x} \log)}_{y, x} 〛^{⊤} - 〚 {(d_{x} \log)}_{y, x} 〛 \dot{x} \log_{y}^{⊤} x\} 〚 {(d \exp)}_{y, t \log_{y} x} 〛^{⊤},

(500)

which denotes an anti-symmetric matrix.

7. Discrete-Time Dynamical Systems on Manifolds

Dynamical systems on manifolds may appear under the form of discrete-time systems as well as continuous-time systems. In turn, discrete-time systems may be natively discrete-time or may be derived from continuous-time systems through time sampling. Discrete-time dynamical systems are nowadays conceived as algorithms and implemented as pieces of code running on computing platforms.

The aim of this last section of the tutorial is to recall an auto-regressive moving-average (ARMA) mathematical model from [58] that generates a discrete-time temporal sequence on a smooth manifold. In such a setting, not only does the algorithmic structure of the ARMA model reflect the mathematical structure of the state manifold, but the input and the output of the ARMA dynamical system are also structured sequences that belong to the tangent bundle

T M

of same smooth manifold

M

.

The present section is organized as follows. Section 7.1 describes second-order discrete-time auto-regressive moving-average dynamical systems on Riemannian manifolds. Section 7.2 then illustrates examples of ARMA-type dynamical systems on manifolds.

7.1. Auto-Regressive Moving-Average (ARMA) Systems on Riemannian Manifolds

An ARMA model on a manifold is described in terms of a pair of sequences

(x_{k}, v_{k}) \in T M

, where

x_{k}

denotes the output signal of the model,

v_{k}

denotes the state of the model and

k \in Z

denotes a discrete-time index. In the familiar case that

M \equiv R^{p}

, choosing the standard identification

T_{x} R^{p} ≅ R^{p}

, for any

x \in R^{p}

, an ARMA(

r, q

) model on the tangent bundle

T R^{p}

appears under the form of a state-evolution equation and of a state-to-output transformation equation. In particular, the equation that governs the evolution of the internal state

v_{k} \in T_{x_{k}} R^{p}

reads:

v_{k} = \underbrace{\sum_{i = 1}^{r} A_{i} v_{k - i}}_{\text{Auto-regressive (AR)}} + \underbrace{\sum_{i = 0}^{q} B_{i} w_{k - i}}_{\text{Moving average (MA)}},

(501)

where the transition matrices

A_{i} \in R^{p \times p}

and

B_{i} \in R^{p \times p}

denote the parameters of the auto-regressive and of the moving-average subsystems, respectively, and the sequence

w_{k} \in T_{x_{k}} R^{p}

denotes a white noise. Let us recall the salient features of the two subsystems of an ARMA model:

Moving average (MA) subsystem: The moving-average subsystem consists of a linear combination of the current value and of previous samples of the input signal;
Auto-regressive (AR) subsystem: The auto-regressive subsystem consists of a linear combination of previous values of the internal state of the system.

The integer

r ⩾ 1

denotes the memory depth of the AR subsystem, while the integer

q ⩾ 0

denotes the memory depth of the MA subsystem. (Notice that speaking of ‘memory depth’ for the AR subsystem is purely conventional since an AR subsystem has an infinite memory span). In addition, the equation that describes how the internal state

v_{k}

transforms into the output

x_{k} \in R^{p}

reads:

x_{k + 1} = x_{k} + v_{k} + c,

(502)

with

c \in R^{p}

denoting a constant bias (or drift term). It is worth remarking that, since the tangent bundle

T R^{p}

is trivial (in fact,

T R^{p}

is isomorphic to

R^{p} \times R^{p}

), Equation (501) is based on vector addition. On a general Riemannian manifold

M

, however, vector addition in the tangent bundle

T M

(that is, in the AR part and in the MA part) can hardly be defined without invoking the concept of parallel transport. Likewise, Equation (502) is based on vector addition on

R^{p}

, which is not defined on a general (curved) Riemannian manifold and needs rather to be replaced by the exponential map.
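Before moving to curved manifolds, the flat-space model given by Equations (501) and (502) may be sketched in a few lines of code. The function name `euclidean_arma` and the parameter values below are illustrative assumptions; lagged samples with a negative time index are taken to be zero, which is an assumption of this sketch (plain zero-padding).

```python
import numpy as np

def euclidean_arma(A, B, w, x0, c):
    """Run Eqs. (501)-(502) on R^p.

    A : list of r matrices A_1, ..., A_r      (auto-regressive part)
    B : list of q+1 matrices B_0, ..., B_q    (moving-average part)
    w : array of shape (K+1, p) holding the input samples w_0, ..., w_K
    Returns the outputs x_0, ..., x_{K+1} and the states v_0, ..., v_K.
    """
    r, q = len(A), len(B) - 1
    steps, p = w.shape
    v = np.zeros((steps, p))
    x = np.zeros((steps + 1, p))
    x[0] = x0
    for k in range(steps):
        # Samples with negative index are skipped, i.e., treated as zero.
        ar = sum(A[i - 1] @ v[k - i] for i in range(1, r + 1) if k - i >= 0)
        ma = sum(B[i] @ w[k - i] for i in range(0, q + 1) if k - i >= 0)
        v[k] = ar + ma                 # state evolution, Eq. (501)
        x[k + 1] = x[k] + v[k] + c     # state-to-output map, Eq. (502)
    return x, v

# Illustrative ARMA(2, 1) run on R^2 (all values are arbitrary choices).
rng = np.random.default_rng(0)
p = 2
A = [0.5 * np.eye(p), -0.2 * np.eye(p)]
B = [np.eye(p), 0.3 * np.eye(p)]
w = rng.standard_normal((50, p))
x, v = euclidean_arma(A, B, w, x0=np.zeros(p), c=np.zeros(p))
```

Every operation in the loop is a plain vector addition in R^p, which is precisely what has to be replaced by parallel transport and by the exponential map on a curved manifold.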

The proposed extension of the above ARMA system to a Riemannian manifold

M

is based on the following considerations:

At each step k, it is necessary to compute a linear combination of several tangent vectors in $T_{x_{k}} M$ via the transition operators $A_{i} : (x, v) \in T M \mapsto w \in T_{x} M$ for the auto-regressive subsystem and $B_{i} : (x, v) \in T M \mapsto w \in T_{x} M$ for the moving-average subsystem, where the operators $A_{i}$ and $B_{i}$ are linear in the second (namely, the vectorial) argument.
In order to compute the linear combination of previous and current values of the state variable, at each step k it is necessary to parallel transport each tangent vector $v_{k - 1}$ , …, $v_{k - r}$ , $w_{k - 1}$ , …, $w_{k - q}$ to the tangent space $T_{x_{k}} M$ .
In order to propagate the constant bias, at each step k it is necessary to parallel transport it to the tangent space $T_{x_{k}} M$ .

On the basis of the above considerations, a manifold-type auto-regressive, moving average M-ARMA(

2, q

) dynamical system on the tangent bundle

T M

of the manifold

M

may be cast as:

\{\begin{matrix} v_{k} = \sum_{i = 1}^{2} A_{i} (x_{k}, P^{x_{k - i} \to x_{k}} (v_{k - i})) + \sum_{i = 0}^{q} B_{i} (x_{k}, P^{x_{k - i} \to x_{k}} (w_{k - i})), \\ x_{k + 1} = \exp_{x_{k}} (v_{k} + P^{x_{0} \to x_{k}} (c)), \end{matrix}

(503)

for

k = 0, 1, 2, \dots

. At

k = 0

, it is necessary to know the initial seed

x_{0} \in M

and to put into effect an extension of standard zero-padding to account for lagged vector-samples that are not available at

k = 0

:

\{\begin{matrix} v_{- 1} = v_{- 2} = 0, \\ w_{- 1} = w_{- 2} = \dots = w_{- q} = 0, \\ x_{- 1} = x_{- 2} = \dots = x_{- β} = x_{0}, \end{matrix}

(504)

with

β := \max \{2, q\}

denoting the memory depth of the whole M-ARMA system. Notice that the memory depth on the state variable v has been set to 2 for illustrative purposes and because this tutorial paper mostly covers second-order systems. The tangent state-vectors

v_{k}

are computed as the output of an auto-regressive moving-average system on the tangent bundle

T M

. In a M-ARMA(

2, q

) system whose dynamics manifests through a state-sequence

v_{k} \in T_{x_{k}} M

as a response to an input sequence

w_{k} \in T_{x_{k}} M

, the current state variable value

v_{k}

depends on the past state-variable values

v_{k - 1}

and

v_{k - 2}

and on the current and past input-variable values

w_{k}, w_{k - 1}, \dots, w_{k - q}

. The additional term

P^{x_{0} \to x_{k}} (c)

in the second equation of the mathematical model (503) represents a drift term that causes a slight change in the current state even when the scalar velocity

∥ v_{k} ∥_{x_{k}}

is close to zero.

The M-ARMA model (503) may be cast in a different way by replacing parallel transport with vector transport. The corresponding alternative version of such M-ARMA system reads

\{\begin{matrix} v_{k} = \sum_{i = 1}^{2} A_{i} (x_{k}, Π_{x_{k}} (v_{k - i})) + \sum_{i = 0}^{q} B_{i} (x_{k}, Π_{x_{k}} (w_{k - i})), \\ x_{k + 1} = \exp_{x_{k}} (v_{k} + Π_{x_{k}} (c)), \end{matrix}

(505)

for

k = 0, 1, 2, \dots

. On some occasions, even if a parallel transport formula is available, vector transport is invoked because of its allegedly lighter computational burden.

The steps required to implement the M-ARMA(

2, q

) models (503) and (505) are summarized in Algorithm 4.

Algorithm 4 Pseudocode to implement the M-ARMA( $2, q$ ) systems (503) and (505).
1:	Choose a metric for the manifold $M$ and make sure the exponential map $\exp : T M \to M$ and the parallel transport operator $P^{x \to y} : T_{x} M \to T_{y} M$ or the vector transport operator $Π_{x}$ , with $x, y \in M$ , are available
2:	Choose the initial seed $x_{0} \in M$ and the drift constant $c \in T_{x_{0}} M$
3:	Choose 2 transition operators $A_{i} : T M \to T M$ , $i = 1, 2$ , and $q + 1$ transition operators $B_{i} : T M \to T M$ , $i = 0, \dots, q$
4:	Set $β := \max \{2, q\}$ and set up the zero-padding $v_{- 1} = v_{- 2} = 0$ , $w_{- 1} = w_{- 2} = \dots = w_{- q} = 0$ , $x_{- 1} = x_{- 2} = \dots = x_{- β} = x_{0}$
5:	for $k = 0$ to K do
6:	Get a tangent vector $w_{k} \in T_{x_{k}} M$ from an input source
7:	Compute tangent vectors ${\tilde{v}}_{k, i} := P^{x_{k - i} \to x_{k}} (v_{k - i})$ and ${\tilde{w}}_{k, j} := P^{x_{k - j} \to x_{k}} (w_{k - j})$ , or ${\tilde{v}}_{k, i} := Π_{x_{k}} (v_{k - i})$ and ${\tilde{w}}_{k, j} := Π_{x_{k}} (w_{k - j})$ , for $i = 1, 2$ and $j = 0, \dots, q$ and the actualized drift term $d_{k} = P^{x_{0} \to x_{k}} (c)$ or $d_{k} = Π_{x_{k}} (c)$
8:	Compute the state-velocity term $v_{k} = \sum_{i = 1}^{2} A_{i} (x_{k}, {\tilde{v}}_{k, i}) + \sum_{j = 0}^{q} B_{j} (x_{k}, {\tilde{w}}_{k, j})$
9:	Compute the next term in the output sequence $x_{k + 1} = \exp_{x_{k}} (v_{k} + d_{k})$
10:	end for
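The steps of Algorithm 4 may be sketched as a manifold-agnostic driver, as shown below. This is only a sketch under stated assumptions: the function name `m_arma_2q` and its signature are hypothetical, the geometry is supplied by the caller through the callables `exp_map` and `transport`, and a vector transport as in (505) is accommodated by a `transport` function that ignores its first argument. As a sanity check, the driver is run on the flat space R^2, where the exponential map reduces to vector addition and transport to the identity.

```python
import numpy as np

def m_arma_2q(exp_map, transport, A, B, x0, c, draw_input, K):
    """Generic M-ARMA(2, q) driver following Algorithm 4.

    exp_map(x, v)        -> point reached from x along the tangent vector v
    transport(xa, xb, v) -> vector v moved from the tangent space at xa to xb
    A = [A_1, A_2], B = [B_0, ..., B_q] : callables (x, v) -> tangent vector at x
    draw_input(x)        -> input sample w_k in the tangent space at x
    Returns the list of outputs x_0, ..., x_{K+1}.
    """
    q = len(B) - 1
    beta = max(2, q)
    # Zero-padding (504). Histories are stored newest first: hist[i - 1] is lag i.
    xs = [x0] * beta
    vs = [np.zeros_like(c)] * 2
    ws = [np.zeros_like(c)] * q
    x, path = x0, [x0]
    for _ in range(K + 1):                     # k = 0, ..., K
        w = draw_input(x)                      # step 6
        v = B[0](x, w)                         # lag-0 term needs no transport
        for i in (1, 2):                       # steps 7-8, AR part
            v = v + A[i - 1](x, transport(xs[i - 1], x, vs[i - 1]))
        for j in range(1, q + 1):              # steps 7-8, MA part
            v = v + B[j](x, transport(xs[j - 1], x, ws[j - 1]))
        d = transport(x0, x, c)                # actualized drift term
        x_next = exp_map(x, v + d)             # step 9
        xs, vs, ws = ([x] + xs)[:beta], ([v] + vs)[:2], ([w] + ws)[:q]
        x = x_next
        path.append(x)
    return path

# Flat-space check on R^2 with a constant input (all values are arbitrary).
path = m_arma_2q(
    exp_map=lambda x, v: x + v,
    transport=lambda xa, xb, v: v,
    A=[lambda x, v: 0.5 * v, lambda x, v: -0.2 * v],
    B=[lambda x, v: v, lambda x, v: 0.3 * v],
    x0=np.zeros(2), c=np.zeros(2),
    draw_input=lambda x: np.ones(2), K=2,
)
```

With these settings the recursion gives v_0 = 1, v_1 = 1.8 and v_2 = 2 in each component, hence the outputs 0, 1, 2.8 and 4.8.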

By adding extra terms to the right-hand side of the first equation of the system (503) or (505), the discussed M-ARMA model may be extended to an M-ARMAX model (M-ARMA with exogenous inputs; see, for instance, [59]).

7.2. Examples of M-ARMA-Type Dynamical Systems

The aim of the present section is to illustrate some examples of the discussed M-ARMA model. The general M-ARMA(

2, q

) system and the notion of autocorrelation function on manifold will be illustrated through examples on the unit hypersphere, the manifold of symmetric positive-definite matrices and the compact Stiefel manifold. In particular, examples about how to choose the auto-regressive and the moving-average transition operators

A_{i}

and

B_{i}

will be given. In the following, we shall assume that the purpose of running an M-ARMA algorithm is to generate a non-white pseudo-random sequence on a given manifold, on the basis of a random number generator that is used to produce a pseudo-random input sequence

w_{k}

.

7.2.1. Generation of Pseudo-Random Paths on the Unit Hyper-Sphere

In order to generate a pseudo-random vector

v \in T_{x} S^{p - 1}

, given a point

x \in S^{p - 1}

, the following algorithm may be employed:

Generate a random vector $\tilde{v} \in R^{p}$ by stacking the outputs of a pseudo-random scalar generator;
Project the vector $\tilde{v}$ into the tangent space $T_{x} S^{p - 1}$ by $v = Π_{x} {\tilde{v}}$ .

The generation of a pseudo-random path over the hyper-sphere

S^{p - 1}

requires choosing the transition operators that take part in the M-ARMA(

2, q

) model (503). In particular, a possible choice for the transition operators

A_{i} : (x, v) \in T S^{p - 1} \mapsto w \in T_{x} S^{p - 1}

and

B_{i} : (x, v) \in T S^{p - 1} \mapsto w \in T_{x} S^{p - 1}

is:

A_{i} (x, v) = Π_{x} (A_{i} v), \quad i = 1, 2, \qquad B_{i} (x, v) = Π_{x} (B_{i} v), \quad i ⩾ 0,

(506)

where the square matrices

A_{i}, B_{i} \in R^{p \times p}

may be chosen arbitrarily.

Example 34.

To give an example of the above setting, let us choose

p = 3

and

q = 4

. Moreover, let us opt for the version based on parallel transport. The resulting pseudo-random path-generation algorithm M-ARMA

(2, 4)

on the hypersphere

S^{2}

may be particularized from the general expression (503) as follows:

\begin{matrix} v_{k} = & (I_{3} - x_{k} x_{k}^{⊤}) A_{1} P^{x_{k - 1} \to x_{k}} (v_{k - 1}) + (I_{3} - x_{k} x_{k}^{⊤}) A_{2} P^{x_{k - 2} \to x_{k}} (v_{k - 2}) + (I_{3} - x_{k} x_{k}^{⊤}) B_{0} w_{k} + \\ & (I_{3} - x_{k} x_{k}^{⊤}) B_{1} P^{x_{k - 1} \to x_{k}} (w_{k - 1}) + (I_{3} - x_{k} x_{k}^{⊤}) B_{2} P^{x_{k - 2} \to x_{k}} (w_{k - 2}) + \\ & (I_{3} - x_{k} x_{k}^{⊤}) B_{3} P^{x_{k - 3} \to x_{k}} (w_{k - 3}) + (I_{3} - x_{k} x_{k}^{⊤}) B_{4} P^{x_{k - 4} \to x_{k}} (w_{k - 4}), \\ x_{k + 1} = & \exp_{x_{k}} (v_{k} + P^{x_{0} \to x_{k}} (c)), \end{matrix}

(507)

with

x_{k} \in S^{2}

,

w_{k}, v_{k} \in T_{x_{k}} S^{2}

for

k = 0, 1, 2, \dots, K

and

c \in T_{x_{0}} S^{2}

generated randomly by projection over the tangent space

T_{x_{0}} S^{2}

. The value of the integer K may be chosen as large as one needs.
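The M-ARMA(2, 4) path generator of Example 34 may be sketched numerically as follows. The closed forms used for the exponential map and for the parallel transport along the minimizing geodesic of the unit sphere are the standard ones and are assumed here, since they are not restated in this section; the transport formula requires that the two points are not antipodal. The scaling of the random matrices, the seed and the value of K are arbitrary illustrative choices.

```python
import numpy as np

def proj(x, u):
    """Projection onto the tangent space at x: (I - x x^T) u."""
    return u - x * (x @ u)

def exp_map(x, v):
    """Exponential map on the unit sphere (assumed standard closed form)."""
    n = np.linalg.norm(v)
    return x.copy() if n < 1e-15 else np.cos(n) * x + np.sin(n) * v / n

def transport(x, y, v):
    """Parallel transport of v from x to y along the minimizing geodesic
    (assumed standard closed form, valid for y != -x)."""
    return v - (y @ v) / (1.0 + x @ y) * (x + y)

rng = np.random.default_rng(0)
p, q, K = 3, 4, 200
A = [0.1 * rng.standard_normal((p, p)) for _ in range(2)]       # A_1, A_2
B = [0.05 * rng.standard_normal((p, p)) for _ in range(q + 1)]  # B_0, ..., B_4

x0 = rng.standard_normal(p)
x0 = x0 / np.linalg.norm(x0)                   # random seed on S^2
c = 0.01 * proj(x0, rng.standard_normal(p))    # random drift in the tangent space at x_0

# Zero-padding (504); histories are stored newest first.
xs, vs, ws = [x0] * q, [np.zeros(p)] * 2, [np.zeros(p)] * q
x, path = x0, [x0]
for k in range(K + 1):
    w = proj(x, rng.standard_normal(p))        # pseudo-random tangent input w_k
    v = proj(x, B[0] @ w)
    for i in (1, 2):                           # auto-regressive terms
        v = v + proj(x, A[i - 1] @ transport(xs[i - 1], x, vs[i - 1]))
    for j in range(1, q + 1):                  # lagged moving-average terms
        v = v + proj(x, B[j] @ transport(xs[j - 1], x, ws[j - 1]))
    x_next = exp_map(x, v + transport(x0, x, c))
    xs, vs, ws = ([x] + xs)[:q], [v, vs[0]], ([w] + ws)[:q]
    x = x_next
    path.append(x)
```

Every generated point is expected to have unit norm up to round-off, since each state v_k is tangent at x_k by construction and the exponential map sends tangent vectors to points of the sphere.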

7.2.2. Generation of Pseudo-Random Paths on the Manifold of Symmetric, Positive-Definite Matrices

The starting point is to generate a pseudo-random tangent matrix

V \in T_{P} S^{+} (n)

, given a symmetric, positive-definite matrix

P \in S^{+} (n)

. Such a result may be achieved by putting into effect the following scheme:

Generate a pseudo-random matrix $\tilde{V} \in R^{n \times n}$ by means of a pseudo-random scalar generator;
Project the matrix $\tilde{V}$ into the tangent space $T_{P} S^{+} (n)$ by the rule $V = Π_{P} (\tilde{V}) = \frac{1}{2} (\tilde{V} + {\tilde{V}}^{⊤})$ .

In addition, we may recognize that a way to select the transition operators

A_{i} : (P, V) \in T S^{+} (n) \mapsto W \in T_{P} S^{+} (n)

and

B_{i} : (P, V) \in T S^{+} (n) \mapsto W \in T_{P} S^{+} (n)

is:

\{\begin{matrix} A_{i} (P, V) = a_{i} (P V + V P), i = 1, 2 \\ B_{i} (P, V) = b_{i} (P V + V P), i ⩾ 0, \end{matrix}

(508)

where

a_{i} \in R

and

b_{i} \in R

denote constant parameters chosen arbitrarily.
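A short numerical sketch of the tangent-matrix generation scheme and of the transition operators (508) is given below. The helper names `random_tangent_spd` and `transition`, as well as the way the base point P is built, are illustrative assumptions. The check at the end reflects the fact that P V + V P is symmetric whenever P and V are symmetric, so the operators (508) indeed return tangent matrices.

```python
import numpy as np

def random_tangent_spd(n, rng):
    """Pseudo-random tangent matrix: symmetric part of a random n x n matrix."""
    V = rng.standard_normal((n, n))
    return 0.5 * (V + V.T)

def transition(P, V, coeff):
    """Transition operator of Eq. (508): coeff * (P V + V P)."""
    return coeff * (P @ V + V @ P)

rng = np.random.default_rng(0)
n = 3
M = rng.standard_normal((n, n))
P = M @ M.T + n * np.eye(n)        # a symmetric, positive-definite base point
V = random_tangent_spd(n, rng)     # tangent matrix at P
W = transition(P, V, 0.5)          # e.g., A_1(P, V) with a_1 = 0.5
is_tangent = np.allclose(W, W.T)   # symmetry characterizes tangent matrices here
```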

7.2.3. Generation of Pseudo-Random Paths on the Compact Stiefel Manifold

The canonical metric is the metric of choice in the present section.

With the aim to generate a pseudo-random tangent matrix

V \in T_{X} St (n, p)

, given a point

X \in St (n, p)

, the following scheme may be used:

Generate a pseudo-random matrix $\tilde{V} \in R^{n \times p}$ by a pseudo-random scalar generator;
Project the matrix $\tilde{V}$ into the tangent space $T_{X} St (n, p)$ by the rule $V = \tilde{V} - X {\tilde{V}}^{⊤} X$ .

The generation of a pseudo-random path over the manifold

St (n, p)

requires selecting transition operators

A_{i} : (X, V) \in T St (n, p) \mapsto W \in T_{X} St (n, p)

and

B_{i} : (X, V) \in T St (n, p) \mapsto W \in T_{X} St (n, p)

. A possible choice is:

A_{i} (X, V) = a_{i} V, \quad i = 1, 2, \qquad B_{i} (X, V) = b_{i} X V^{⊤} X, \quad i ⩾ 0,

(509)

where

a_{i} \in R

and

b_{i} \in R

denote arbitrarily chosen constant parameters.

Example 35.

For exemplification purposes, one may select

n = 5

,

p = 2

and

q = 4

. The initial seed needs to be chosen randomly in

St (5, 2)

and one may choose, for simplicity, not to include any drift term in the equations. The pseudo-random path generation rule (505) particularizes, in this case, to:

\begin{matrix} V_{k} = Π_{X_{k}} (a_{1} V_{k - 1} + a_{2} V_{k - 2} + X_{k} (\sum_{i = 0}^{4} b_{i} W_{k - i})^{⊤} X_{k}), \\ X_{k + 1} = \exp_{X_{k}} (V_{k}), \end{matrix}

(510)

for

k = 0, 1, 2, \dots, K

. The exponential is implemented as:

\exp_{X} (V) = [X Q] Exp ([\begin{matrix} X^{⊤} V & - R^{⊤} \\ R & 0_{2} \end{matrix}]) [\begin{matrix} I_{2} \\ 0_{2} \end{matrix}],

(511)

where Q and R denote the factors of the compact QR decomposition of the matrix

(I_{5} - X X^{⊤}) V

.
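Example 35 may be sketched numerically as follows. The code assumes that the moving-average term enters as X_k (Σ b_i W_{k-i})^⊤ X_k, in agreement with the operators (509), and it evaluates the matrix exponential in (511) through a Hermitian eigendecomposition, which is an implementation choice of this sketch that relies on the block matrix being skew-symmetric. The coefficient values, the seed and the value of K are arbitrary.

```python
import numpy as np

def proj(X, V):
    """Map of Section 7.2.3 onto the tangent space at X: V - X V^T X."""
    return V - X @ V.T @ X

def expm_skew(S):
    """Exponential of a real skew-symmetric matrix S. Since -iS is Hermitian,
    exp(S) = U diag(exp(i*lam)) U^H, with (lam, U) its eigen-pairs."""
    lam, U = np.linalg.eigh(-1j * S)
    return ((U * np.exp(1j * lam)) @ U.conj().T).real

def exp_map(X, V):
    """Exponential map of Eq. (511) for a tangent matrix V at X."""
    p = X.shape[1]
    Q, R = np.linalg.qr(V - X @ (X.T @ V))     # compact QR of (I - X X^T) V
    S = np.block([[X.T @ V, -R.T], [R, np.zeros((p, p))]])
    S = 0.5 * (S - S.T)                        # remove round-off asymmetry
    return np.hstack([X, Q]) @ expm_skew(S)[:, :p]

rng = np.random.default_rng(0)
n, p, q, K = 5, 2, 4, 100
a = [0.4, -0.1]                                # a_1, a_2 (arbitrary)
b = [0.05, 0.04, 0.03, 0.02, 0.01]             # b_0, ..., b_4 (arbitrary)

X, _ = np.linalg.qr(rng.standard_normal((n, p)))   # random seed in St(5, 2)
Vs = [np.zeros((n, p))] * 2                    # zero-padding, newest first
Ws = [np.zeros((n, p))] * q
path = [X]
for k in range(K + 1):
    W = proj(X, rng.standard_normal((n, p)))   # pseudo-random tangent input W_k
    lagged = [W] + Ws                          # W_k, W_{k-1}, ..., W_{k-4}
    ma = sum(b[i] * lagged[i] for i in range(q + 1))
    V = proj(X, a[0] * Vs[0] + a[1] * Vs[1] + X @ ma.T @ X)   # Eq. (510)
    X = exp_map(X, V)
    Vs, Ws = [V, Vs[0]], lagged[:q]
    path.append(X)
```

Each iterate is expected to keep orthonormal columns up to round-off, which offers a simple consistency check of the implementation.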

8. Conclusions

The present tutorial paper, the second of a series of tutorials on manifold calculus with application to non-linear control systems, was devoted to topics from manifold calculus apt to describe properties of non-linear dynamical systems subject to holonomic constraints on their state variables, as well as to control non-linear dynamical systems whose states belong to curved manifolds. As an interesting case study for illustrative purposes, special emphasis was put on designing algorithms to achieve synchronization of manifold-type systems based on feedback control and on developing numerical methods tailored to curved manifolds to implement such systems and algorithms.

Aside from practical applications to system theory and control, the present tutorial also focuses on purely theoretical topics such as curvature and covariant derivation. The present tutorial paper does not cover further topics of great interest, such as integration on manifolds and systems of order larger than two. Such important topics will be treated in a forthcoming part of this multi-part tutorial.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Conflicts of Interest

The author declares no conflict of interest.

References

1. Bloch, A. An introduction to aspects of geometric control theory. In Nonholonomic Mechanics and Control; Krishnaprasad, P., Murray, R., Eds.; Interdisciplinary Applied Mathematics; Springer: New York, NY, USA, 2015; Volume 24.
2. Bullo, F.; Lewis, A. Geometric control of mechanical systems. In Texts in Applied Mathematics; Springer: New York, NY, USA; Heidelberg/Berlin, Germany, 2004; Volume 49.
3. Návrat, A.; Vašík, P. On geometric control models of a robotic snake. Note Mat. 2017, 37, 120–129.
4. Nijmeijer, H.; van der Schaft, A. Nonlinear Dynamical Control Systems; Springer: New York, NY, USA, 1990.
5. Sreenath, K.; Lee, T.; Kumar, V. Geometric control and differential flatness of a quadrotor UAV with a cable-suspended load. In Proceedings of the 52nd IEEE Conference on Decision and Control, Firenze, Italy, 10–13 December 2013; pp. 2269–2274.
6. Agrachev, A.; Sachkov, Y. Control Theory from the Geometric Viewpoint; Springer: Berlin/Heidelberg, Germany, 2004; Volume 87.
7. Fiori, S. Extension of PID regulators to dynamical systems on smooth manifolds (M-PID). SIAM J. Control Optim. 2021, 59, 78–102.
8. Brockett, R. Lie algebras and lie groups in control theory. In Geometric Methods in System Theory; Mayne, D., Brockett, R., Eds.; Springer Netherlands: Dordrecht, The Netherlands, 1973; pp. 43–82.
9. Jakubzyk, B. Introduction to geometric nonlinear control; controllability and Lie bracket. In Mathematical Control Theory, Lectures Notes of a Minicourse; Polish Academy of Sciences: Warsaw, Poland, 2002; pp. 107–168.
10. Jurdjevic, V. Optimal control on Lie groups and integrable Hamiltonian systems. Regul. Chaotic Dyn. 2011, 16, 514–535.
11. Sastry, S. Geometric nonlinear control. In Nonlinear Systems—Analysis, Stability, and Control; Sastry, S., Ed.; Springer Science + Business Media: New York, NY, USA, 1999; pp. 510–573.
12. Fiori, S. Non-delayed synchronization of non-autonomous dynamical systems on Riemannian manifolds and its applications. Nonlinear Dyn. 2018, 94, 3077–3100.
13. Fiori, S. Manifold calculus in system theory and control—Fundamentals and first-order systems. Symmetry 2021, 13, 2092.
14. Poisson, E.; Pound, A.; Vega, I. The motion of point particles in curved spacetime. Living Rev. Relativ. 2011, 14, 7.
15. Ferreira, R.; Xavier, J. Hessian of the Riemannian squared distance function on connected locally symmetric spaces with applications. In Proceedings of the 7th Portuguese conference on automatic control (Controlo 2006), Lisbon, Portugal, 11–13 September 2006.
16. Pennec, X. Hessian of the Riemannian Squared Distance; Technical report; Université Côte d’Azur and Inria Sophia-Antipolis Méditerranée: Valbonne, France, 2017.
17. Fiori, S. Gyroscopic signals smoothness assessment by geometric jolt estimation. Math. Methods Appl. Sci. 2017, 40, 5893–5905.
18. Fiori, S. Error-based control systems on Riemannian state manifolds: Properties of the principal pushforward map associated to parallel transport. Math. Control Relat. Fields 2021, 11, 143–167.
19. Röbenack, K. Computation of multiple Lie derivatives by algorithmic differentiation. J. Comput. Appl. Math. 2008, 213, 454–464.
20. Civita, A.; Fiori, S.; Romani, G. A mobile acquisition system and a method for hips sway fluency assessment. Information 2018, 9, 321.
21. Fiori, S. A closed-form expression of the instantaneous rotational lurch index to evaluate its numerical approximation. Symmetry 2019, 11, 1208.
22. Ferreira, R.; Xavier, J.; Costeira, J.; Barroso, V. Newton method for Riemannian centroid computation in naturally reductive homogeneous spaces. In Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, Toulouse, France, 14–19 May 2006; Volume 3.
23. Isvoranu, D.; Udrişte, C. Fluid flow versus geometric dynamics. In Proceedings of the 5th Conference on Differential Geometry, Mangalia, Romania, 29 August–2 September 2005; Geometry Balkan Press: Bucharest, Romania, 2006; pp. 70–82.
24. van der Pol, B. On “relaxation-oscillations”. Lond. Edinb. Dublin Philos. Mag. J. Sci. 1926, 2, 978–992.
25. Tôrres, L.; Aguirre, L. Transmitting information by controlling nonlinear oscillators. Physica D 2004, 196, 387–406.
26. Cao, T.; Yi, H. On the complex oscillation of higher order linear differential equations with meromorphic coefficients. J. Syst. Sci. Complex. 2007, 20, 135–148.
27. Chen, J.; Lu, J.; Wu, X. Bidirectionally coupled synchronization of the generalized Lorenz systems. J. Syst. Sci. Complex. 2011, 24, 433–448.
28. Huang, J.; Zhang, H. Bifurcations of periodic orbits in three-well Duffing system with a phase shift. J. Syst. Sci. Complex. 2011, 24, 519–531.
29. Mo, J.; Lin, W. Generalized variation iteration solution of an atmosphere-ocean oscillator model for global climate. J. Syst. Sci. Complex. 2011, 24, 271–276.
30. Fiori, S. Nonlinear damped oscillators on Riemannian manifolds: Fundamentals. J. Syst. Sci. Complex. 2016, 29, 22–40.
31. FitzHugh, R. Mathematical models of excitation and propagation in nerve. In Biological Engineering; H.P. Schwan, Ed.; McGraw-Hill: New York, NY, USA, 1969.
32. Barbosa, R.; Tenreiro Machado, J.; Vinagre, B.; Calderón, A. Analysis of the van der Pol oscillator containing derivatives of fractional order. J. Vib. Control 2007, 13, 1291–1301.
33. Duffing, G. Erzwungene Schwingungen bei Veränderlicher Eigenfrequenz und Ihre Technische Bedeutung; Braunschweig, F., Ed.; Vieweg & Sohn: Berlin, Germany, 1918.
34. Trueba, J.; Rams, J.; Sanjuán, M. Analytical estimates of the effect of nonlinear damping in some nonlinear oscillators. Int. J. Bifurc. Chaos 2000, 10, 2257–2267.
35. Sanjuán, M. The effect of nonlinear damping on the universal escape oscillator. Int. J. Bifurc. Chaos 1999, 9, 735–744.
36. Lorenz, E. Deterministic nonperiodic flow. J. Atmos. Sci. 1963, 20, 130–141.
37. Greer, M.; Saha, R.; Gogliettino, A.; Yu, C.; Zollo-Venecek, K. Emergence of oscillations in a simple epidemic model with demographic data. R. Soc. Open Sci. 2020, 7, 191187.
38. Georgiou, I.; Corless, M.; Bajaj, A. Dynamics of nonlinear structures with multiple equilibria: A singular perturbation-invariant manifold approach. Z. Angew. Math. Phys. 1999, 50, 892–924.
39. Shorek, S. A stationarity principle for non-conservative systems. Adv. Water Resour. 1984, 7, 85–88.
40. Molero, F.; Lara, M.; Ferrer, S.; Cèspedes, F. 2-D Duffing oscillator: Elliptic functions from a dynamical systems point of view. Qual. Theory Dyn. Syst. 2013, 12, 115–139.
41. Cucker, F.; Smale, S. Emergent behavior in flocks. IEEE Trans. Autom. Control 2007, 52, 852–862.
42. Ha, S.Y.; Kim, D.; Schlöder, F.W. Emergent behaviors of Cucker–Smale flocks on Riemannian manifolds. IEEE Trans. Autom. Control 2021, 66, 3020–3035.
43. Fiori, S.; Del Rossi, L. Minimal control effort and time Lie-group synchronisation design based on proportional-derivative control. Int. J. Control 2022, 95, 138–150.
44. Fiori, S. Synchronization of first-order autonomous oscillators on Riemannian manifolds. Discret. Contin. Dyn. Syst. Ser. B 2019, 24, 1725–1741.
45. Fiori, S. Model formulation over Lie groups and numerical methods to simulate the motion of gyrostats and quadrotors. Mathematics 2019, 7, 935.
46. Gaponov, I.; Razinkova, A. Quadcopter design and implementation as a multidisciplinary engineering course. In Proceedings of the IEEE International Conference on Teaching, Assessment, and Learning for Engineering (TALE) 2012, Hong Kong, China, 20–23 August 2012; pp. H2B-16–H2B-19.
47. Bloch, A.; Krishnaprasad, P.; Marsden, J.; Ratiu, T. The Euler-Poincaré equations and double bracket dissipation. Commun. Math. Phys. 1996, 175, 1–42.
48. Gajbhiye, S.; Banavar, R. The Euler-Poincaré equations for a spherical robot actuated by a pendulum. IFAC Proc. Vol. 2012, 45, 72–77.
49. Becker, M.; Sampaio, R.; Bouabdallah, S.; de Perrot, V.; Siegwart, R. In-flight collision avoidance controller based only on OS4 embedded sensors. J. Braz. Soc. Mech. Sci. Eng. 2012, 34, 295–297.
50. Fiori, S. A control-theoretic approach to the synchronization of second-order continuous-time dynamical systems on real connected Riemannian manifolds. SIAM J. Control Optim. 2020, 58, 787–813.
51. Magdy, M.; Ng, T. Regulation and control effort in self-tuning controllers. IEE Proc. D Control Theory Appl. 1986, 133, 289–292.
52. Grönwall, T. Note on the derivatives with respect to a parameter of the solutions of a system of differential equations. Ann. Math. 1919, 20, 292–296.
53. Henderson, H.; Searle, S. On deriving the inverse of a sum of matrices. SIAM Rev. 1981, 23, 53–60.
54. Gavrilov, A. The Taylor series related to the differential of the exponential map. arXiv 2012, arXiv:1205:2868v1.
55. Bullo, F.; Lewis, A. Geometric Control of Mechanical Systems: Modeling, Analysis, and Design for Mechanical Control Systems; Springer: Berlin/Heidelberg, Germany, 2005.
56. Osborne, J.; Hicks, G. The geodesic spring on the Euclidean sphere with parallel-transport-based damping. Not. AMS 2013, 60, 544–556.
57. Ambrose, W.; Singer, I.M. A theorem on holonomy. Trans. Am. Math. Soc. 1953, 75, 428–443.
58. Fiori, S. Auto-regressive moving-average discrete-time dynamical systems and autocorrelation functions on real-valued Riemannian matrix manifolds. Discret. Contin. Dyn. Syst. Ser. B 2014, 19, 2785–2808.
59. Escobar, J.; Poznyak, A. Robust parametric identification for ARMAX models with non-Gaussian and coloured noise: A survey. Mathematics 2022, 10, 1291.

Table 1. Typical parameter values of an OS4 Mini-VTOL quadcopter.

Description	Parameter	Value
Overall quadrotor mass	$M_{q}$	$650 g$
Inertia along x axis	$J_{x}$	$7.5 \times 10^{- 3} kg \cdot m^{2}$
Inertia along y axis	$J_{y}$	$7.5 \times 10^{- 3} kg \cdot m^{2}$
Inertia along z axis	$J_{z}$	$1.3 \times 10^{- 2} kg \cdot m^{2}$
Thrust coefficient	b	$3.13 \times 10^{- 5} N \cdot s^{2}$
Drag coefficient	$γ$	$7.5 \times 10^{- 7} N \cdot m \cdot s^{2}$
Rotational inertia	$J_{R}$	$6 \times 10^{- 5} kg \cdot m^{2}$
Arm length	r	$0.23 m$

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
