# Manifold Calculus in System Theory and Control—Fundamentals and First-Order Systems

## Abstract


## 1. Introduction

- It provides a clear and well-motivated introduction to manifold calculus, the basis of system and control theories on manifolds, with special emphasis on computational and applicational aspects. The present contribution provides practical formulas to deal with those real-valued manifolds that, in the author’s experience, are the most accessed in engineering and applied science problems. As a matter of fact, complex-valued manifolds are not treated at all.
- It clearly states and illustrates the idea that, when one wishes to perform a simulation, by a computing platform, of dynamical systems on manifolds described in terms of differential equations, it is necessary to time-discretize such differential equations in a suitable way. In order to achieve such a discretization, it is not safe to invoke standard discretization methods (such as the ones based on Euler forward–backward discretization), which do not work as they stand on curved manifolds. One should therefore resort to more sophisticated numerical integration techniques.
- By the author’s choice, the present tutorial paper carries neither graphical illustrations nor numerical simulation results. Readers who are interested in deepening their understanding of this topic are invited to sketch graphs on their own and to code examples in their favorite programming language.
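
The second point above can be made concrete with a small experiment. The following is a minimal sketch in Python with NumPy, not a method taken from this paper: the vector field (a rigid rotation with angular velocity `OMEGA`), the step size, and all function names are illustrative assumptions. It integrates a system on the sphere ${\mathbb{S}}^{2}$ once with a forward Euler step taken in the ambient space and once with a step along a great circle, and it measures how far each final state lies from the sphere.

```python
import numpy as np

OMEGA = np.array([0.0, 0.0, 1.0])  # illustrative angular velocity (assumption)

def field(x):
    """Vector field f(x) = OMEGA x x; it is orthogonal to x, hence tangent to the sphere."""
    return np.cross(OMEGA, x)

def euler_step(x, h):
    """Forward Euler step taken in the ambient space R^3."""
    return x + h * field(x)

def geodesic_step(x, h):
    """Move along the great circle leaving x with initial velocity f(x)."""
    v = field(x)
    speed = np.linalg.norm(v)
    if speed == 0.0:
        return x
    return np.cos(h * speed) * x + np.sin(h * speed) * v / speed

def integrate(step, x0, h, n_steps):
    x = x0.copy()
    for _ in range(n_steps):
        x = step(x, h)
    return x

x0 = np.array([1.0, 0.0, 0.0])
h, n_steps = 0.01, 1000          # final time t = 10
x_euler = integrate(euler_step, x0, h, n_steps)
x_geodesic = integrate(geodesic_step, x0, h, n_steps)

# Distance of the final state from the unit sphere
drift_euler = abs(np.linalg.norm(x_euler) - 1.0)
drift_geodesic = abs(np.linalg.norm(x_geodesic) - 1.0)
print(f"Euler drift: {drift_euler:.3e}, geodesic drift: {drift_geodesic:.3e}")
```

In this example every Euler step adds a component orthogonal to the current state, so the norm of the state grows monotonically, whereas the great-circle step keeps the state on the sphere up to rounding errors.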

Some subsections (marked by an asterisk, *) address some specific arguments, related to coordinate-prone manifold calculus, which may be skipped by uninterested readers without detriment to the comprehension of the main flow of this presentation.

## 2. Coordinate-Free Embedded Manifold Calculus

**Example**

**1.**

#### 2.1. General Notation and Properties

**Matrix trace**: The trace of a square matrix $M\in {\mathbb{R}}^{p\times p}$ (namely the sum of its principal-diagonal entries) is denoted by $\mathrm{tr}\left(M\right)$. The matrix trace is invariant under cyclic permutations. For example, given three conformable (i.e., mutually multipliable) matrices $A,B,C$, it holds that:$$\mathrm{tr}\left(ABC\right)=\mathrm{tr}\left(BCA\right)=\mathrm{tr}\left(CAB\right).$$

**Matrix square root**: Given a matrix $M\in {\mathbb{R}}^{p\times p}$, a square root of M is a matrix R such that ${R}^{2}=M$. Not every matrix admits a square root and, when a square root exists, it need not be unique. Special square roots (such as a symmetric square root) will be defined later.

**Spectral factorization**: Given a matrix $M\in {\mathbb{R}}^{p\times p}$, let us assume that there exists an orthogonal matrix X (i.e., one such that ${X}^{\top}X={I}_{p}$) and a real diagonal matrix D such that $M=XD{X}^{\top}$. The expression on the right-hand side denotes the spectral factorization of the matrix M. Such a factorization turns out to be very useful in evaluating matrix polynomials. For example, it holds that$$\begin{array}{cc}\hfill \phantom{\rule{1.em}{0ex}}& {M}^{4}={\left(XD{X}^{\top}\right)}^{4}=\left(XD{X}^{\top}\right)\left(XD{X}^{\top}\right)\left(XD{X}^{\top}\right)\left(XD{X}^{\top}\right)=\hfill \\ \hfill \phantom{\rule{1.em}{0ex}}& XD\left({X}^{\top}X\right)D\left({X}^{\top}X\right)D\left({X}^{\top}X\right)D{X}^{\top}=XDDDD{X}^{\top}=X{D}^{4}{X}^{\top}.\hfill \end{array}$$Computing powers of full matrices is cumbersome, while computing powers of diagonal matrices is straightforward.

**Thin QR factorization**: Any matrix $M\in {\mathbb{R}}^{p\times q}$, with $p\ge q$, may be factored as the product of a $p\times q$ matrix Q with orthonormal columns (i.e., such that ${Q}^{\top}Q={I}_{q}$) and a $q\times q$ upper triangular matrix R, namely $M=QR$. In general, a QR factorization is not unique [17]. 
To remove this indeterminacy, the R-factor may be chosen with strictly positive entries on its main diagonal (which is possible whenever M has full column rank), so that the factorization is unique.

**Compact singular value factorization (SVD)**: The compact SVD of a matrix $M\in {\mathbb{R}}^{p\times q}$ is a matrix factorization of the type $M=AD{B}^{\top}$ in which D is a square diagonal matrix of size $r\times r$, where $r\le \min \{p,q\}$ is the rank of M, whose diagonal entries are the nonzero singular values of M. In this variant, A denotes a $p\times r$ matrix and B denotes a $q\times r$ matrix, such that ${A}^{\top}A={B}^{\top}B={I}_{r}$ [18].

**Polar factorization**: Given a real-valued $p\times n$ matrix M, its polar factorization is written as $M=XS$, where X denotes a $p\times n$ matrix such that ${X}^{\top}X={I}_{n}$, termed polar factor, and S denotes a symmetric positive semidefinite $n\times n$ matrix [19]. The polar factorization of a matrix always exists and, if the matrix is full rank, its polar factor is unique.

**Matrix exponential**: Given a matrix $M\in {\mathbb{R}}^{p\times p}$, its matrix exponential is denoted as $\mathrm{Exp}\left(M\right)$. The matrix exponential is defined via a series as$$\mathrm{Exp}\left(M\right):=\sum _{k=0}^{\infty}\frac{{M}^{k}}{k!}.$$There exist special formulas to compute the matrix exponential via a finite number of operations for special matrices (see, for example, [20] and references therein). 
For example, for a symmetric matrix $M\in {\mathbb{R}}^{p\times p}$ that admits a spectral factorization $M=XD{X}^{\top}$, it holds that$$\mathrm{Exp}\left(M\right)=X\,\mathrm{Exp}\left(D\right){X}^{\top}.$$

**Principal matrix logarithm**: Given a matrix $M\in {\mathbb{R}}^{p\times p}$, its principal matrix logarithm is denoted as $\mathrm{Log}\left(M\right)$. The principal matrix logarithm is defined via a series as$$\mathrm{Log}\left(M\right):=-\sum _{k=1}^{\infty}\frac{{({I}_{p}-M)}^{k}}{k},\quad \mathrm{defined\ for}\ \parallel {I}_{p}-M\parallel <1.$$The specialized literature offers special formulas to compute the principal matrix logarithm via a finite number of operations for special matrices. For example, for a positive-definite matrix $M\in {\mathbb{R}}^{p\times p}$, which admits a spectral factorization $M=XD{X}^{\top}$, it holds that$$\mathrm{Log}\left(M\right)=X\,\mathrm{Log}\left(D\right){X}^{\top}.$$
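
The spectral-factorization formulas above lend themselves to a quick numerical check. The sketch below (Python with NumPy; the test matrix, the truncation length of the series, and the helper name `exp_series` are assumptions made for illustration) evaluates ${M}^{4}$, $\mathrm{Exp}\left(M\right)$ and $\mathrm{Log}\left(\mathrm{Exp}\left(M\right)\right)$ through the factorization $M=XD{X}^{\top}$ and compares the first two with a direct computation.

```python
import numpy as np

rng = np.random.default_rng(0)
B = rng.standard_normal((4, 4))
M = (B + B.T) / 2                 # symmetric test matrix
d, X = np.linalg.eigh(M)          # spectral factorization M = X diag(d) X^T

# Fourth power: X D^4 X^T versus repeated multiplication
M4_spectral = X @ np.diag(d**4) @ X.T
M4_direct = np.linalg.matrix_power(M, 4)

def exp_series(A, terms=60):
    """Truncated series sum_{k=0}^{terms-1} A^k / k! (illustrative helper)."""
    out = np.zeros_like(A)
    term = np.eye(A.shape[0])
    for k in range(terms):
        out = out + term
        term = term @ A / (k + 1)
    return out

# Matrix exponential: X Exp(D) X^T versus the truncated defining series
exp_spectral = X @ np.diag(np.exp(d)) @ X.T
exp_direct = exp_series(M)

# Principal logarithm of the positive-definite matrix P = Exp(M)
w, Y = np.linalg.eigh(exp_spectral)
log_P = Y @ np.diag(np.log(w)) @ Y.T   # expected to recover M

print(np.allclose(M4_spectral, M4_direct),
      np.allclose(exp_spectral, exp_direct),
      np.allclose(log_P, M))
```

Only scalar functions of the diagonal entries of D are evaluated, which is what makes the factorized expressions cheap once the factorization is available.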

#### 2.2. Manifolds and Embedded Manifolds (or Submanifolds)

**Hypercube**: The simplest manifold of interest is perhaps the hypercube ${\mathbb{R}}^{p}$, which is essentially the set spanned by p real-valued variables (or p-tuples).

**Hypersphere, oblique manifold, hyperellipsoid**: A hypersphere is represented as ${\mathbb{S}}^{p-1}:=\{x\in {\mathbb{R}}^{p}\mid {x}^{\top}x=1\}$ and is the subset of points of the hypercube with unit Euclidean distance from point 0. This is a smooth manifold of dimension $p-1$ embedded in the hypercube ${\mathbb{R}}^{p}$; in fact, with only $p-1$ coordinates, we can identify unequivocally any point on a sphere. In [21], it is shown how to ‘coordinatize’ such a manifold through, e.g., the stereographic projection, which requires two coordinate maps applied to two convenient neighborhoods (each including only one ‘pole’) on the sphere. Special cases are ${\mathbb{S}}^{1}$, the unit circle, and ${\mathbb{S}}^{2}$, the ordinary sphere. A number of applications rely on the hyperspheres ${\mathbb{S}}^{p-1}$ such as, for instance, blind deconvolution [22,23], data classification [24], adaptive pattern recognition [25] and motion planning, optimization, and verification in robotics and in computational biology [26]. A smooth manifold closely related to the unit hypersphere is the oblique manifold [27], defined as:$$\mathrm{OB}\left(p\right):=\{X\in {\mathbb{R}}^{p\times p}\mid \mathrm{diag}\left({X}^{\top}X\right)={I}_{p}\}.$$Since each column of a matrix in $\mathrm{OB}\left(p\right)$ has unit Euclidean norm, the oblique manifold may be identified with a product of hyperspheres:$$\mathrm{OB}\left(p\right)\cong \underset{p\phantom{\rule{4pt}{0ex}}\mathrm{times}}{\underbrace{{\mathbb{S}}^{p-1}\times {\mathbb{S}}^{p-1}\times \cdots \times {\mathbb{S}}^{p-1}}}.$$A further manifold related to the hypersphere is the hyperellipsoid, represented as:$${\mathrm{L}}^{p-1}:=\{x\in {\mathbb{R}}^{p}\mid {\parallel DRx\parallel}^{2}=1\}.$$

**General linear group and special linear group**: The general linear group is defined as $\mathbb{GL}\left(p\right):=\{X\in {\mathbb{R}}^{p\times p}\mid \det \left(X\right)\ne 0\}$. This is the subset of the space of $p\times p$ matrices ${\mathbb{R}}^{p\times p}$ made of all invertible matrices. 
The special linear group is defined as $\mathrm{Sl}\left(p\right):=\{X\in {\mathbb{R}}^{p\times p}\mid \det \left(X\right)=1\}$. This is the subset of the general linear group made of all matrices with unit determinant.

**Orthogonal group, special orthogonal group, special Euclidean group**: The orthogonal group of size p is defined by $\mathbb{O}\left(p\right):=\{X\in {\mathbb{R}}^{p\times p}\mid {X}^{\top}X={I}_{p}\}$. The manifold $\mathbb{O}\left(p\right)$ has dimension ${2}^{-1}p(p-1)$. In fact, every matrix in $\mathbb{O}\left(p\right)$ possesses ${p}^{2}$ entries which are constrained by ${2}^{-1}p(p+1)$ orthogonality/normality restrictions. The manifold of special orthogonal matrices is defined as $\mathbb{SO}\left(p\right):=\{X\in {\mathbb{R}}^{p\times p}\mid {X}^{\top}X={I}_{p},\phantom{\rule{4pt}{0ex}}\det \left(X\right)=1\}$. A smooth manifold closely related to the special orthogonal group is the special Euclidean group, denoted as $\mathbb{SE}\left(p\right)$, that finds applications in robotics (see, e.g., [29]). 
The special Euclidean group is a set of $(p+1)\times (p+1)$ matrices defined as:$$\mathbb{SE}\left(p\right):=\left\{\left[\begin{array}{cc}X& \delta \\ 0& 1\end{array}\right]\ \middle|\ X\in \mathbb{SO}\left(p\right),\phantom{\rule{4pt}{0ex}}\delta \in {\mathbb{R}}^{p}\right\}.$$

**Stiefel manifold**: The (compact) Stiefel manifold is defined as:$$\mathrm{St}(n,p):=\{X\in {\mathbb{R}}^{n\times p}\mid {X}^{\top}X={I}_{p}\}.$$A generalization of the Stiefel manifold, defined through a matrix B, is:$${\mathrm{St}}_{B}(n,p)=\{X\in {\mathbb{R}}^{n\times p}\mid {X}^{\top}BX={I}_{p}\}.$$

**Real symplectic group**: The real symplectic group is defined as$$\mathrm{Sp}\left(2n\right):=\{Q\in {\mathbb{R}}^{2n\times 2n}\mid {Q}^{\top}JQ=J\},\phantom{\rule{4pt}{0ex}}J:=\left[\begin{array}{cc}{0}_{n}& {I}_{n}\\ -{I}_{n}& {0}_{n}\end{array}\right].$$

**Manifold of symmetric, positive-definite (SPD) matrices**: The space of symmetric, positive-definite matrices is defined as$${\mathbb{S}}^{+}\left(p\right):=\{P\in {\mathbb{R}}^{p\times p}\mid P={P}^{\top},P>0\}.$$A closely related set, formed by the $n\times n$ symmetric, positive-semidefinite matrices of rank p, is$${\mathbb{S}}^{+}(n,p):=\{X{X}^{\top}\mid X\in {\mathbb{R}}^{n\times p},\,\mathrm{rank}\left(X\right)=p\}.$$The growing use of low-rank matrix approximations to retain tractability in large-scale applications boosted extensions of the calculus of positive-definite matrices to their low-rank counterparts [38].

**Grassmann manifold**: A Grassmann manifold $\mathrm{Gr}(n,p)$ is the set of subspaces of ${\mathbb{R}}^{n}$ spanned by p linearly independent vectors, namely$$\mathrm{Gr}(n,p)=\left\{\mathrm{span}({w}_{1},{w}_{2},{w}_{3},\dots ,{w}_{p})\right\}.$$
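
Since each manifold above is defined by explicit matrix constraints, membership can be tested numerically. The following is a minimal sketch in Python with NumPy; the function names, the tolerance, and the sample matrices are hypothetical and serve only as an illustration of the defining equations.

```python
import numpy as np

def on_stiefel(X, tol=1e-10):
    """X^T X = I_p, with X of size n x p."""
    return np.linalg.norm(X.T @ X - np.eye(X.shape[1])) < tol

def on_special_orthogonal(X, tol=1e-10):
    """X^T X = I_p and det(X) = 1, with X square."""
    return (X.shape[0] == X.shape[1] and on_stiefel(X, tol)
            and abs(np.linalg.det(X) - 1.0) < tol)

def on_symplectic(Q, tol=1e-10):
    """Q^T J Q = J, with J = [[0, I_n], [-I_n, 0]]."""
    n = Q.shape[0] // 2
    J = np.block([[np.zeros((n, n)), np.eye(n)], [-np.eye(n), np.zeros((n, n))]])
    return np.linalg.norm(Q.T @ J @ Q - J) < tol

def on_spd(P, tol=1e-10):
    """P = P^T and all eigenvalues strictly positive."""
    return (np.linalg.norm(P - P.T) < tol
            and bool(np.all(np.linalg.eigvalsh((P + P.T) / 2) > 0)))

rng = np.random.default_rng(0)

# Q-factor of a thin QR factorization: a point on St(5, 2)
Q_thin, _ = np.linalg.qr(rng.standard_normal((5, 2)))

# Planar rotation: a point on SO(2)
c, s = np.cos(0.3), np.sin(0.3)
rotation = np.array([[c, -s], [s, c]])

# Block-diagonal matrix diag(A, A^{-T}): a point on Sp(4)
A = np.array([[2.0, 1.0], [0.0, 1.0]])
Z = np.zeros((2, 2))
symplectic = np.block([[A, Z], [Z, np.linalg.inv(A).T]])

# B B^T + I: a point on S^+(3)
B = rng.standard_normal((3, 3))
spd = B @ B.T + np.eye(3)

print(on_stiefel(Q_thin), on_special_orthogonal(rotation),
      on_symplectic(symplectic), on_spd(spd))
```

Checks of this kind are useful as sanity tests in simulations, where a state that is supposed to evolve on a manifold may slowly drift away from it.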

**Example**

**2.**

**Theorem**

**1.**

**Proof.**

- If $g\left({x}_{0}\right)=0$, then $x={x}_{0}$;
- If $g\left({x}_{0}\right)\ne 0$, notice that $g(-{x}_{0})=f(-{x}_{0})-f(-(-{x}_{0}))=-(f\left({x}_{0}\right)-f(-{x}_{0}))=-g\left({x}_{0}\right)$. Since g is continuous and takes opposite signs at two points of its domain, there must exist at least one point $x\in {\mathbb{S}}^{1}$ such that $g\left(x\right)=0$.
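
The sign-change argument in the proof can be turned into a numerical procedure. The sketch below assumes, as the second bullet suggests, that $g\left(x\right):=f\left(x\right)-f(-x)$ for a continuous function f on ${\mathbb{S}}^{1}$; the specific function f, the parametrization of the circle by an angle, and the function names are illustrative assumptions. Bisection on the angle locates a point where g vanishes.

```python
import math

def f(theta):
    """Illustrative continuous function on the circle, parametrized by the angle theta."""
    return math.cos(theta) + 0.5 * math.sin(2.0 * theta) + 0.3 * math.sin(theta)

def g(theta):
    """g(x) = f(x) - f(-x); the antipode of the point at angle theta sits at theta + pi."""
    return f(theta) - f(theta + math.pi)

def antipodal_root(theta0, tol=1e-12):
    """Bisection on [theta0, theta0 + pi], where g takes opposite signs at the ends."""
    a, b = theta0, theta0 + math.pi
    ga = g(a)
    if ga == 0.0:
        return a
    while b - a > tol:
        mid = 0.5 * (a + b)
        gm = g(mid)
        if gm == 0.0:
            return mid
        if (gm > 0) == (ga > 0):
            a, ga = mid, gm     # sign change lies in the right half
        else:
            b = mid             # sign change lies in the left half
    return 0.5 * (a + b)

theta_star = antipodal_root(0.0)
residual = abs(f(theta_star) - f(theta_star + math.pi))
print(f"theta* = {theta_star:.6f}, |f(x) - f(-x)| = {residual:.2e}")
```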

**Example**

**3.**

- The manifolds ${\mathbb{S}}^{n-1}\subset \mathbb{A}$, with $\mathbb{A}:={\mathbb{R}}^{n}$, and $\mathbb{SO}\left(n\right)\subset \mathbb{A}$, with $\mathbb{A}:={\mathbb{R}}^{n\times n}$, are compact since there exists a ball $\mathbb{B}(0,r)\subset \mathbb{A}$, with $r<\infty $, that contains them.
- The manifold ${\mathbb{S}}^{+}\left(n\right)$ is non-compact since no finite-radius ball exists that contains it.

## 3. Smooth Curves, Tangent Vector Fields, Tangent Spaces and Bundle, Normal Spaces

#### 3.1. Curves and Bundles for Embedded Manifolds

**Example**

**4.**

**Hypercube**: Since the space ${\mathbb{R}}^{p}$ is linear, when it is embedded into itself, each tangent space coincides with the whole space, namely, for every $x\in {\mathbb{R}}^{p}$, it holds that ${T}_{x}{\mathbb{R}}^{p}\equiv {\mathbb{R}}^{p}$. Since a normal space is the orthogonal complement to a tangent space, we must conclude that ${N}_{x}{\mathbb{R}}^{p}=\left\{0\right\}$.

**Hypersphere**: At every point $x\in {\mathbb{S}}^{p-1}$, the tangent space has the structure$${T}_{x}{\mathbb{S}}^{p-1}:=\{v\in {\mathbb{R}}^{p}\mid {v}^{\top}x=0\}.$$The normal space ${N}_{x}{\mathbb{S}}^{p-1}$ at every point of the hypersphere, which is the orthogonal complement of the tangent space with respect to the ambient space $\mathbb{A}:={\mathbb{R}}^{p}$ that the manifold ${\mathbb{S}}^{p-1}$ is embedded in, has the structure$${N}_{x}{\mathbb{S}}^{p-1}:=\{\lambda x\mid \lambda \in \mathbb{R}\}.$$

**Special orthogonal group**: The tangent space of the manifold $\mathbb{SO}\left(p\right)$ has the structure$${T}_{X}\mathbb{SO}\left(p\right)=\{V\in {\mathbb{R}}^{p\times p}\mid {V}^{\top}X+{X}^{\top}V={0}_{p}\}.$$This may be proven by differentiating a generic curve $\gamma \left(t\right)\in \mathbb{SO}\left(p\right)$ passing through X at $t=0$. Every such curve satisfies the orthogonal-group characteristic equation ${\gamma}^{\top}\left(t\right)\gamma \left(t\right)={I}_{p}$; therefore, after differentiation, one obtains$${\dot{\gamma}}^{\top}\left(0\right)\gamma \left(0\right)+{\gamma}^{\top}\left(0\right)\dot{\gamma}\left(0\right)={0}_{p}.$$By recalling that the tangent space is formed by velocity vectors $\dot{\gamma}\left(0\right)$, the above-mentioned result is readily achieved. 
Provided the ambient space $\mathbb{A}:={\mathbb{R}}^{p\times p}$ is endowed with the canonical Euclidean metric ${\langle V,W\rangle}^{\mathbb{A}}:=\mathrm{tr}\left({V}^{\top}W\right)$, the normal space at a point X may be defined as$${N}_{X}\mathbb{SO}\left(p\right)=\{N\in {\mathbb{R}}^{p\times p}\mid \mathrm{tr}\left({N}^{\top}V\right)=0,\phantom{\rule{4pt}{0ex}}\forall V\in {T}_{X}\mathbb{SO}\left(p\right)\}.$$It is easy to convince oneself that every tangent vector $V\in {T}_{X}\mathbb{SO}\left(p\right)$ may be written as $V=XH$, with H skew-symmetric (i.e., such that ${H}^{\top}=-H$); then, any element $N\in {N}_{X}\mathbb{SO}\left(p\right)$ may be written as $N=XS$, with $S={S}^{\top}$. In fact, the normality condition implies $0=\mathrm{tr}\left({V}^{\top}\left(XS\right)\right)=\mathrm{tr}\left(S{V}^{\top}X\right)=\mathrm{tr}\left({X}^{\top}V{S}^{\top}\right)$, which is equivalent to $-\mathrm{tr}\left(\left({V}^{\top}X\right){S}^{\top}\right)$; therefore, the normality condition may be recast as $\mathrm{tr}\left(\left({V}^{\top}X\right)(S-{S}^{\top})\right)=0$. 
It is hence necessary and sufficient that $S={S}^{\top}$; therefore,$${N}_{X}\mathbb{SO}\left(p\right)=\{XS\mid {S}^{\top}=S\in {\mathbb{R}}^{p\times p}\}.$$

**Stiefel manifold**: Given a trajectory $[-\epsilon ,\phantom{\rule{4pt}{0ex}}\epsilon ]\ni t\mapsto X\left(t\right)\in \mathrm{St}(n,p)$, differentiation with respect to the parameter t yields ${\dot{X}}^{\top}X+{X}^{\top}\dot{X}=0$, which means that the tangent space to the manifold $\mathrm{St}(n,p)$ at a point $X\in \mathrm{St}(n,p)$ has the structure:$${T}_{X}\mathrm{St}(n,p)=\{V\in {\mathbb{R}}^{n\times p}\mid {V}^{\top}X+{X}^{\top}V=0\}.$$The normal space has the structure:$${N}_{X}\mathrm{St}(n,p)=\{XS\mid S\in {\mathbb{R}}^{p\times p},\phantom{\rule{4pt}{0ex}}{S}^{\top}-S=0\}.$$

**Real symplectic group**: The tangent space associated with the real symplectic group has the structure:$${T}_{Q}\mathrm{Sp}\left(2n\right)=\{V\in {\mathbb{R}}^{2n\times 2n}\mid {V}^{\top}JQ+{Q}^{\top}JV={0}_{2n}\}.$$The tangent space and the normal space (the latter taken with respect to the Euclidean metric of the ambient space) associated with the real symplectic group may be characterized as follows:$${T}_{Q}\mathrm{Sp}\left(2n\right)=\{QJS\mid S\in {\mathbb{R}}^{2n\times 2n},\phantom{\rule{4pt}{0ex}}{S}^{\top}=S\},\quad {N}_{Q}\mathrm{Sp}\left(2n\right)=\{JQ\Omega \mid \Omega \in {\mathbb{R}}^{2n\times 2n},\phantom{\rule{4pt}{0ex}}{\Omega}^{\top}=-\Omega \}.$$

**Space of symmetric, positive-definite matrices**: Given a point $P\in {\mathbb{S}}^{+}\left(p\right)$, the tangent space at P may be characterized simply by observing that every curve $\gamma :[-\epsilon ,\epsilon ]\to {\mathbb{S}}^{+}\left(p\right)$ satisfies $\gamma \left(t\right)=\gamma {\left(t\right)}^{\top}$ and $\gamma \left(t\right)>0$. 
Only the equality constraint influences the structure of the tangent space; therefore,$${T}_{P}{\mathbb{S}}^{+}\left(p\right)=\{S\in {\mathbb{R}}^{p\times p}\mid {S}^{\top}=S\}.$$Notice that all tangent spaces are identical, since they do not depend on the base point P.

**Grassmann manifold**: For every element $\left[X\right]\in \mathrm{Gr}(n,p)$, the tangent space may be represented as:$${T}_{\left[X\right]}\mathrm{Gr}(n,p)=\{V\in {\mathbb{R}}^{n\times p}\mid {X}^{\top}V=0\}.$$A tangent space ${T}_{\left[X\right]}\mathrm{Gr}(n,p)$ may be decomposed as the direct sum of a horizontal space and of a vertical space at $\left[X\right]\in \mathrm{Gr}(n,p)$ [40]. Starting from a point $\left[X\right]\in \mathrm{Gr}(n,p)$, moving along a horizontal direction causes a change in subspace, while moving along a vertical direction does not change the subspace $\left[X\right]$.
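
The characterization of the tangent and normal spaces of $\mathbb{SO}\left(p\right)$ given above, namely $V=XH$ with H skew-symmetric and $N=XS$ with S symmetric, can be checked numerically. The sketch below is written in Python with NumPy under illustrative assumptions (random test matrices, hypothetical names). The projection `project_to_tangent` is not stated in the text: it follows from splitting ${X}^{\top}W$ into its skew-symmetric and symmetric parts.

```python
import numpy as np

rng = np.random.default_rng(1)
p = 4

# A point on SO(p): Q-factor of a QR factorization, with one column flipped if det = -1
X, _ = np.linalg.qr(rng.standard_normal((p, p)))
if np.linalg.det(X) < 0:
    X[:, 0] = -X[:, 0]

A = rng.standard_normal((p, p))
H = A - A.T            # skew-symmetric matrix
S = A + A.T            # symmetric matrix
V = X @ H              # tangent vector at X
N = X @ S              # normal vector at X

# Tangency condition V^T X + X^T V = 0 and orthogonality tr(N^T V) = 0
tangency_residual = np.linalg.norm(V.T @ X + X.T @ V)
inner_VN = np.trace(N.T @ V)

def project_to_tangent(X, W):
    """Tangent part of an ambient matrix W at X: X times the skew-symmetric part of X^T W."""
    XtW = X.T @ W
    return X @ (XtW - XtW.T) / 2

W = rng.standard_normal((p, p))
W_tan = project_to_tangent(X, W)
W_nor = W - W_tan      # remaining part, expected to be of the form X S with S symmetric

print(tangency_residual, inner_VN)
```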

#### 3.2. Vector Fields

**Example**

**5.**

- **Points on a manifold**, denoted as $x\in \mathbb{M}\subset \mathbb{A}$;
- **Tangent vectors**, denoted as $v\in {T}_{x}\mathbb{M}\subset \mathbb{A}$.

#### 3.3. Canonical Curves, Canonical Basis of a Tangent Space*

**Example**

**6.**

## 4. First-Order Dynamical Systems on Manifolds

**Meaning of $\dot{x}\left(t\right)$ in the expression (70)**: Since we assumed the manifold $\mathbb{M}$ to be embedded in an ambient space $\mathbb{A}$ of type ${\mathbb{R}}^{p\times q}$, the quantity $\dot{x}\left(t\right)$ is an array or a matrix made of the derivatives of the entries of $x\left(t\right)$ with respect to the time parameter t. Let us recall, however, that even if such a specification helps one understand the subject, we shall never write any relation involving single components of such matrix-type objects, as we shall treat any state variable x as a whole (except in a low-dimensional example). The solution $x\left(t\right)$ of the differential Equation (70) is the trajectory of the system and is represented by a curve on the manifold $\mathbb{M}$; hence, $\dot{x}\left(t\right)$ denotes the velocity along the trajectory since, for every $t\in [0,\phantom{\rule{4pt}{0ex}}{t}_{\mathrm{f}}]$, it holds that $(x\left(t\right),\dot{x}\left(t\right))\in T\mathbb{M}$, namely $\dot{x}\in \Gamma \left(\mathbb{M}\right)$.

**Separation (or non-equivalency) of first-order and second-order systems**: When dealing with dynamical systems in ${\mathbb{R}}^{p\times q}$, the system (69) is actually fairly general since, upon introducing additional variables, it is possible to turn an n-th order system into a first-order system. Such a property stems from the legitimate ‘confusion’ between the state space and the velocity space. In contrast, when dealing with dynamical systems on manifolds, such confusion is not legitimate, since $x\in \mathbb{M}$ while $\dot{x}\in {T}_{x}\mathbb{M}$; namely, the system state and the system velocity belong to very different spaces. It suffices to recall that $\mathbb{M}$ is generally a curved (i.e., nonlinear) space while each ${T}_{x}\mathbb{M}$ is a vector space (flat, linear). For this reason, second-order systems cannot simply be assimilated to first-order systems.

**Example**

**7.**

**Example**

**8.**

**Example**

**9.**

- The instance $R={I}_{3}$ indicates that the object is horizontal with respect to the reference frame;
- The instance $R\ne {I}_{3}$ indicates that it is necessary to rotate the body-fixed axes to align them to the inertial axes.

## 5. Tangent Maps: Pushforward and Pullback

- Pushforward of a manifold-to-scalar function: The special case that $\mathbb{N}:=\mathbb{R}$, namely that f is a manifold-to-scalar function, is particularly important in applications. Such a special case will be covered later since it involves the notion of Riemannian gradient.
- Pushforward of a matrix-to-matrix function: This is the case that the smooth manifolds $\mathbb{M}$ and $\mathbb{N}$ are real matrix manifolds embedded in ${\mathbb{R}}^{p\times p}$. Any smooth function between any such pair of manifolds is of matrix-to-matrix type. Let us assume that the function f is analytic about a point ${X}_{0}\in \mathbb{M}$, namely, that it may be expressed as a power series:$$f\left(X\right)=\sum _{k=0}^{\infty}{a}_{k}{(X-{X}_{0})}^{k},\phantom{\rule{4pt}{0ex}}{a}_{k}\in \mathbb{R}.$$Then, the pushforward map ${\mathrm{d}}_{X}f\left(V\right)$ at a point $X\in \mathbb{M}$ applied to the tangent direction $V\in {T}_{X}\mathbb{M}$ may be expressed as:$${\mathrm{d}}_{X}f\left(V\right)=\sum _{k=1}^{\infty}{a}_{k}\sum _{r=0}^{k-1}{(X-{X}_{0})}^{r}V{(X-{X}_{0})}^{k-r-1}.$$It is easily recognized that the tangent map ${\mathrm{d}}_{X}f\left(V\right)$ is linear in the argument V. As a reference for the readers, we recall the analytic expansion of three matrix-to-matrix functions, that may be used to compute the corresponding pushforward maps:
  - **Matrix exponential**: For the map $f\left(X\right)=\mathrm{Exp}\left(X\right)$, it holds that ${X}_{0}=0$, ${a}_{k}={(k!)}^{-1}$ for $k\ge 0$;
  - **Principal matrix logarithm**: For the map $f\left(X\right)=\mathrm{Log}\left(X\right)$, it holds that ${X}_{0}={I}_{p}$, ${a}_{0}=0$, ${a}_{k}={(-1)}^{k+1}{k}^{-1}$ for $k\ge 1$;
  - **Matrix inversion**: For the map $f\left(X\right)={X}^{-1}$, it holds that ${X}_{0}={I}_{p}$, ${a}_{k}={(-1)}^{k}$ for $k\ge 0$.
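
The series expression of the pushforward map can be evaluated by truncation and compared with a finite-difference approximation. The sketch below does so for the matrix exponential (${X}_{0}=0$, ${a}_{k}=1/k!$) in Python with NumPy. The truncation length, the step size of the finite difference, the random test matrices, and the function names are assumptions made for illustration.

```python
import numpy as np

def exp_series(X, terms=40):
    """Truncated series sum_{k=0}^{terms-1} X^k / k!."""
    out = np.zeros_like(X)
    term = np.eye(X.shape[0])
    for k in range(terms):
        out = out + term
        term = term @ X / (k + 1)
    return out

def dexp_series(X, V, terms=40):
    """Truncated pushforward: sum_{k=1}^{terms} (1/k!) sum_{r=0}^{k-1} X^r V X^(k-r-1)."""
    n = X.shape[0]
    powers = [np.eye(n)]
    for _ in range(terms):
        powers.append(powers[-1] @ X)
    out = np.zeros_like(X)
    factorial = 1.0
    for k in range(1, terms + 1):
        factorial *= k
        inner = sum(powers[r] @ V @ powers[k - r - 1] for r in range(k))
        out = out + inner / factorial
    return out

rng = np.random.default_rng(3)
X = 0.3 * rng.standard_normal((3, 3))
V = rng.standard_normal((3, 3))
W = rng.standard_normal((3, 3))

# Central finite difference of Exp along the direction V
h = 1e-5
fd = (exp_series(X + h * V) - exp_series(X - h * V)) / (2 * h)
fd_error = np.linalg.norm(dexp_series(X, V) - fd)

# Linearity in the direction argument
lin_error = np.linalg.norm(
    dexp_series(X, 2.0 * V + W) - (2.0 * dexp_series(X, V) + dexp_series(X, W)))

print(f"finite-difference mismatch: {fd_error:.2e}, linearity mismatch: {lin_error:.2e}")
```

The double sum costs many matrix products and is shown only because it mirrors the formula term by term; it is not meant as an efficient way of computing the pushforward.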

## 6. Lie Groups, Lie Algebras, Lie Brackets

**Hypercube**: The hypercube, also known as a translation group, is a Lie group under standard matrix sum (matrix subtraction and zero matrix complete the group structure). The Lie algebra of ${\mathbb{R}}^{p}$ coincides with ${\mathbb{R}}^{p}$ itself.

**General linear group and the special linear group**: Both the general linear group $\mathbb{GL}\left(p\right)$ and the special linear group $\mathrm{Sl}\left(p\right)$ are Lie groups under standard matrix multiplication and inversion. The Lie algebra of the general linear group, namely $\mathfrak{gl}\left(p\right)$, coincides with ${\mathbb{R}}^{p\times p}$. The Lie algebra associated with the special linear group is more interesting and its determination involves some clever matrix computations. Let us consider $\mathrm{Sl}\left(p\right)$ endowed with standard matrix multiplication, inversion and ${I}_{p}$ as the group identity, and a curve $\gamma :[-\epsilon ,\phantom{\rule{4pt}{0ex}}\epsilon ]\to \mathrm{Sl}\left(p\right)$ defined by $\gamma \left(t\right):=\mathrm{Exp}\left(t\,A\right)$, where $\mathrm{Exp}$ denotes the matrix exponential and $A\in {\mathbb{R}}^{p\times p}$. Clearly, $\gamma \left(0\right)={I}_{p}$; hence, $\dot{\gamma}\left(0\right)$ represents an element of the Lie algebra $\mathfrak{sl}\left(p\right)$. It is not hard to prove that$$\dot{\gamma}\left(t\right)=A\,\mathrm{Exp}\left(t\,A\right),$$so that $\dot{\gamma}\left(0\right)=A$, and that$${\left.\frac{\mathrm{d}}{\mathrm{d}t}\det \left(\gamma \left(t\right)\right)\right|}_{t=0}=\mathrm{tr}\left(A\right).$$In the present case, since $\det \left(\gamma \left(t\right)\right)\equiv 1$, it follows from the above considerations that $\mathrm{tr}\left(A\right)=0$. 
In conclusion, we found that$$\mathfrak{sl}\left(p\right)=\{A\in {\mathbb{R}}^{p\times p}\mid \mathrm{tr}\left(A\right)=0\}.$$

**Special orthogonal group**: The $\mathbb{SO}\left(p\right)$ manifold is a Lie group under standard matrix multiplication. The Lie algebra associated with the special orthogonal group is the set of skew-symmetric matrices $\mathfrak{so}\left(p\right):=\{H\in {\mathbb{R}}^{p\times p}\mid H+{H}^{\top}={0}_{p}\}$. In fact, at the identity it holds that ${T}_{{I}_{p}}\mathbb{SO}\left(p\right)=\mathfrak{so}\left(p\right)$. The Lie algebra $\mathfrak{so}\left(p\right)$ is a vector space of dimension ${2}^{-1}p(p-1)$.

**Real symplectic group**: The Lie algebra associated with the real symplectic group may be characterized as follows:$$\mathfrak{sp}\left(2n\right)=\{H=JS\mid S\in {\mathbb{R}}^{2n\times 2n},\phantom{\rule{4pt}{0ex}}{S}^{\top}=S\}.$$

**Manifold of symmetric, positive-definite matrices**: The Lie algebra associated with the Lie group ${\mathbb{S}}^{+}\left(p\right)$ is the set of $p\times p$ symmetric matrices, namely ${\mathfrak{s}}^{+}\left(p\right):=\{S\in {\mathbb{R}}^{p\times p}\mid {S}^{\top}=S\}$. The space of symmetric, positive-definite matrices is not a group under standard matrix multiplication. We recall from [44] the following group structure $({\mathbb{S}}^{+}\left(p\right),m,i,e)$:

- **Multiplication**: $m(P,Q):=\mathrm{Exp}\left(\mathrm{Log}\,P+\mathrm{Log}\,Q\right)$ (logarithmic multiplication), with $P,Q\in {\mathbb{S}}^{+}\left(p\right)$, where ‘$\mathrm{Log}$’ denotes the principal matrix logarithm;
- **Identity element**: $e={I}_{p}$ (notice that $\mathrm{Log}\,{I}_{p}={0}_{p}$);
- **Inverse**: $i\left(P\right)=\mathrm{Exp}\left(-\mathrm{Log}\,P\right)$ (matrix inversion), with $P\in {\mathbb{S}}^{+}\left(p\right)$ (any symmetric, positive-definite matrix is non-singular).

It is easy to verify that the proposed instances of $m,\,i,\,e$ satisfy the algebraic-group axioms in ${\mathbb{S}}^{+}\left(p\right)$. Additionally, the logarithmic multiplication on ${\mathbb{S}}^{+}\left(p\right)$ is compatible with its smooth manifold structure, as the map $(P,Q)\mapsto m\left(P,i\left(Q\right)\right)=\mathrm{Exp}\left(\mathrm{Log}\,P-\mathrm{Log}\,Q\right)$ is smooth.
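
The group axioms for the logarithmic multiplication can also be verified numerically. The sketch below is written in Python with NumPy; the helper names, the random test matrices, and the use of a symmetric eigendecomposition to evaluate $\mathrm{Exp}$ and $\mathrm{Log}$ are illustrative choices rather than a prescribed implementation.

```python
import numpy as np

def sym_fun(P, fun):
    """Apply a scalar function to a symmetric matrix through its spectral factorization."""
    w, X = np.linalg.eigh(P)
    return X @ np.diag(fun(w)) @ X.T

def Log(P):
    return sym_fun(P, np.log)      # principal logarithm of an SPD matrix

def Exp(S):
    return sym_fun(S, np.exp)      # exponential of a symmetric matrix

def m(P, Q):
    """Logarithmic multiplication m(P, Q) = Exp(Log P + Log Q)."""
    return Exp(Log(P) + Log(Q))

def i(P):
    """Group inverse i(P) = Exp(-Log P)."""
    return Exp(-Log(P))

rng = np.random.default_rng(2)

def random_spd(p):
    A = rng.standard_normal((p, p))
    return A @ A.T + np.eye(p)

P, Q, R = random_spd(3), random_spd(3), random_spd(3)
e = np.eye(3)

checks = {
    "identity": np.allclose(m(P, e), P),
    "inverse": np.allclose(m(P, i(P)), e),
    "inverse equals matrix inverse": np.allclose(i(P), np.linalg.inv(P)),
    "associativity": np.allclose(m(m(P, Q), R), m(P, m(Q, R))),
    "result is symmetric": np.allclose(m(P, Q), m(P, Q).T),
}
print(checks)
```

By contrast, the ordinary product `P @ Q` of two symmetric, positive-definite matrices is in general not symmetric, which is why standard matrix multiplication does not yield a group structure on this space.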

- **Right translation**: Defined as a function ${R}_{x}:\mathbb{G}\to \mathbb{G}$ by ${R}_{x}\left(y\right):=m(y,x)$ for every pair $x,y\in \mathbb{G}$;
- **Left translation**: Defined as a function ${L}_{x}:\mathbb{G}\to \mathbb{G}$ by ${L}_{x}\left(y\right):=m(i\left(x\right),y)$ for every pair $x,y\in \mathbb{G}$.

**Example**

**10.**

**Example**

**11.**

## 7. Metrization, Riemannian Manifolds

#### 7.1. Coordinate-Free Metrization by Inner Products and Metric Kernels

- For every $u,v\in \mathbb{V}$, it holds that $\langle u,v\rangle =\langle v,u\rangle $,
- For every $u,v,w\in \mathbb{V}$, it holds that $\langle u+w,v\rangle =\langle u,v\rangle +\langle w,v\rangle $,
- For every $u,v\in \mathbb{V}$ and $c\in \mathbb{R}$, it holds that $\langle c\phantom{\rule{0.166667em}{0ex}}u,v\rangle =c\langle u,v\rangle $,
- The norm of a vector $v\in \mathbb{V}$ is defined as $\parallel v\parallel :=\sqrt{\langle v,v\rangle}$,
- An inner product is non-degenerate if and only if $\langle u,v\rangle =0$ for every $v\in \mathbb{V}$ implies $u=0$.

**Example**

**12.**

**Observation**

**1.**

**Example**

**13.**

**Hypercube**: For the space ${\mathbb{R}}^{p}$ of (column) arrays, the canonical metric is the Euclidean metric ${\langle u,v\rangle}_{x}:={u}^{\top}v$, for every $u,v\in {\mathbb{R}}^{p}$, while for the space ${\mathbb{R}}^{p\times q}$ of rectangular matrices, the canonical metric is the Euclidean metric ${\langle U,V\rangle}_{X}:=\mathrm{tr}\left({U}^{\top}V\right)$, for every $U,V\in {\mathbb{R}}^{p\times q}$. Notice that neither of these metrics depends explicitly on the point at which it is calculated; hence, both are uniform.

**Hypersphere**: The hypersphere ${\mathbb{S}}^{p-1}$ embedded into the ambient space ${\mathbb{R}}^{p}$ inherits its canonical metric; hence, we shall choose ${\langle u,v\rangle}_{x}:={u}^{\top}v$ for every $u,v\in {T}_{x}{\mathbb{S}}^{p-1}$. This metric is uniform as well.

**General linear group and the special linear group**: A metric for the general linear group $\mathbb{GL}\left(p\right)\ni A$ is:$${\langle U,V\rangle}_{A}:=\mathrm{tr}\left({\left({A}^{-1}U\right)}^{\top}\left({A}^{-1}V\right)\right),\phantom{\rule{4pt}{0ex}}\forall U,V\in {T}_{A}\mathbb{GL}\left(p\right).$$Such a metric was popularized, for instance, in [45], in the context of machine learning.

**Special orthogonal group**: The canonical metric in $\mathbb{SO}\left(p\right)$ is defined as ${\langle U,V\rangle}_{X}:=\mathrm{tr}\left({U}^{\top}V\right)$, for any $X\in \mathbb{SO}\left(p\right)$ and $U,V\in {T}_{X}\mathbb{SO}\left(p\right)$. 
Notice that the norm of a tangent vector $V\in {T}_{X}\mathbb{SO}\left(p\right)$ is ${\parallel V\parallel}_{X}=\sqrt{{\langle V,V\rangle}_{X}}=\sqrt{\mathrm{tr}\left({V}^{\top}V\right)}=:{\parallel V\parallel}_{\mathrm{F}}$, known as the Frobenius norm [18].

**Stiefel manifold**: There are two well-known metrics for the Stiefel manifold, namely, the Euclidean metric and the canonical metric.

- Euclidean metric: A possible metric that the Stiefel manifold may be endowed with is the Euclidean metric, inherited from the embedding of $\mathrm{St}(n,p)$ in ${\mathbb{R}}^{n\times p}$:$${\langle U,V\rangle}_{X}:=\mathrm{tr}\left({U}^{\top}V\right),\phantom{\rule{4pt}{0ex}}U,V\in {T}_{X}\mathrm{St}(n,p).$$
- Canonical metric: The Stiefel manifold may be endowed with a second kind of metric, termed ‘canonical metric’. The associated inner product reads:$${\langle U,V\rangle}_{X}:=\mathrm{tr}\left({U}^{\top}\left({I}_{n}-{\textstyle \frac{1}{2}}X{X}^{\top}\right)V\right),\phantom{\rule{4pt}{0ex}}U,V\in {T}_{X}\mathrm{St}(n,p).$$

**Real symplectic group**: Two metrics applied to the real symplectic group are known in the scientific literature.

- Khvedelidze–Mladenov metric: A metric for the real symplectic group $\mathrm{Sp}\left(2n\right)$ is:$${\langle U,V\rangle}_{Q}:=\mathrm{tr}\left({Q}^{-1}U{Q}^{-1}V\right),\phantom{\rule{4pt}{0ex}}\forall U,V\in {T}_{Q}\mathrm{Sp}\left(2n\right).$$It is referred to as the Khvedelidze–Mladenov metric (or KM metric, for short, [46]). This is an indefinite metric; hence, a manifold endowed with this metric is not Riemannian (it is in fact referred to as a pseudo-Riemannian manifold).
- Euclidean metric: A further metric for the real symplectic group $\mathrm{Sp}\left(2n\right)$ is:$${\langle U,V\rangle}_{Q}:=\mathrm{tr}\left({\left({Q}^{-1}U\right)}^{\top}\left({Q}^{-1}V\right)\right),\phantom{\rule{4pt}{0ex}}\forall U,V\in {T}_{Q}\mathrm{Sp}\left(2n\right).$$Such a metric is inherited from the embedding of a real symplectic group $\mathrm{Sp}\left(2n\right)$ into the space of real invertible matrices $\mathbb{GL}\left(2n\right)$ that, in turn, is embedded into the real hypercube ${\mathbb{R}}^{2n\times 2n}$ and hence inherits its canonical metric.

**Space of symmetric, positive-definite matrices**: The canonical metric in ${\mathbb{S}}^{+}\left(p\right)$ is defined as ${\langle U,V\rangle}_{P}:=\mathrm{tr}\left(U{P}^{-1}V{P}^{-1}\right)$, for any $P\in {\mathbb{S}}^{+}\left(p\right)$ and $U,V\in {T}_{P}{\mathbb{S}}^{+}\left(p\right)$. Clearly, this is not a uniform metric.

**Grassmann manifold**: The canonical metric on a Grassmann manifold is$${\langle U,V\rangle}_{\left[X\right]}=\mathrm{tr}\left({U}^{\top}V\right),\phantom{\rule{4pt}{0ex}}\forall U,V\in {T}_{\left[X\right]}\mathrm{Gr}(n,p).$$
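
The statement that the Khvedelidze–Mladenov metric is indefinite can be illustrated on the smallest case, $\mathrm{Sp}\left(2\right)$. The sketch below (Python with NumPy) uses tangent vectors of the form $V=QJS$ with S symmetric, which satisfy the tangency condition ${V}^{\top}JQ+{Q}^{\top}JV=0$ (checked in the code); the point Q, the two matrices S, and the function names are illustrative assumptions.

```python
import numpy as np

n = 1
J = np.block([[np.zeros((n, n)), np.eye(n)], [-np.eye(n), np.zeros((n, n))]])

# A point on Sp(2): block-diagonal matrix diag(A, A^{-T})
A = np.array([[2.0]])
Q = np.block([[A, np.zeros((n, n))], [np.zeros((n, n)), np.linalg.inv(A).T]])

def km_metric(Q, U, V):
    """Khvedelidze-Mladenov inner product tr(Q^{-1} U Q^{-1} V)."""
    Qi = np.linalg.inv(Q)
    return np.trace(Qi @ U @ Qi @ V)

def euclidean_metric(Q, U, V):
    """Inner product tr((Q^{-1} U)^T (Q^{-1} V))."""
    Qi = np.linalg.inv(Q)
    return np.trace((Qi @ U).T @ (Qi @ V))

# Two tangent vectors V = Q J S with S symmetric
S1 = np.eye(2)
S2 = np.diag([1.0, -1.0])
V1 = Q @ J @ S1
V2 = Q @ J @ S2

tangency = [np.linalg.norm(V.T @ J @ Q + Q.T @ J @ V) for V in (V1, V2)]
km_values = [km_metric(Q, V, V) for V in (V1, V2)]
euclidean_values = [euclidean_metric(Q, V, V) for V in (V1, V2)]
print("KM:", km_values, "Euclidean:", euclidean_values)
```

For these two directions the KM metric returns $\mathrm{tr}\left(JSJS\right)$, which equals $-2$ for the first choice of S and $2$ for the second, whereas the Euclidean-type metric is positive for both.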

**Example**

**14.**

- **Linearity**: ${G}_{x}\left(v\right)$ is linear in v, namely ${G}_{x}(v+\alpha \phantom{\rule{0.166667em}{0ex}}w)={G}_{x}\left(v\right)+\alpha \phantom{\rule{0.166667em}{0ex}}{G}_{x}\left(w\right)$, for every $\alpha \in \mathbb{R}$, $x\in \mathbb{M}$ and $v,w\in {T}_{x}\mathbb{M}$;
- **Symmetry**: ${G}_{x}$ is a self-adjoint operator, namely ${\langle u,{G}_{x}\left(w\right)\rangle}^{\mathbb{A}}={\langle {G}_{x}\left(u\right),w\rangle}^{\mathbb{A}}$;
- **Closure with respect to $T\mathbb{M}$**: ${G}_{x}$ is an endomorphism of ${T}_{x}\mathbb{M}$, namely ${G}_{x}\left(v\right)\in {T}_{x}\mathbb{M}$, for every $x\in \mathbb{M}$ and $v\in {T}_{x}\mathbb{M}$;
- **Invertibility**: ${G}_{x}$ is invertible, namely, its inverse ${G}_{x}^{-1}$ is well-defined for every $x\in \mathbb{M}$.

**Example**

**15.**

1. **Linearity**: ${G}_{P}\left(W\right)$ appears to be linear in W; in fact, it holds that ${G}_{P}(V+\alpha \phantom{\rule{0.166667em}{0ex}}W)={P}^{-1}(V+\alpha \phantom{\rule{0.166667em}{0ex}}W){P}^{-1}={P}^{-1}V{P}^{-1}+\alpha \phantom{\rule{0.166667em}{0ex}}{P}^{-1}W{P}^{-1}$, for every $\alpha \in \mathbb{R}$, $V,W\in {T}_{P}\mathbb{A}$,
2. **Symmetry**: ${G}_{P}$ is self-adjoint; in fact, it holds that ${\langle U,{G}_{P}\left(W\right)\rangle}^{\mathbb{A}}=\mathrm{tr}\left(U\left({P}^{-1}W{P}^{-1}\right)\right)=\mathrm{tr}\left(\left({P}^{-1}U{P}^{-1}\right)W\right)={\langle {G}_{P}\left(U\right),W\rangle}^{\mathbb{A}}$, by virtue of the cyclic permutation invariance of the trace operator,
3. **Closure**: ${G}_{P}$ turns out to be an endomorphism of ${T}_{P}{\mathbb{S}}^{+}\left(n\right)$; in fact, for every matrix $U\in {T}_{P}{\mathbb{S}}^{+}\left(n\right)$, it holds that ${G}_{P}\left(U\right)={P}^{-1}U{P}^{-1}$ is a symmetric matrix. Hence, it belongs to ${T}_{P}{\mathbb{S}}^{+}\left(n\right)$ (notice that ${\left({P}^{-1}U{P}^{-1}\right)}^{\top}={P}^{-\top}{U}^{\top}{P}^{-\top}={P}^{-1}U{P}^{-1}$ because both U and P are symmetric),
4. **Invertibility**: ${G}_{P}$ is invertible; in fact, if $W={G}_{P}\left(U\right)={P}^{-1}U{P}^{-1}$, then $U=PWP=:{G}_{P}^{-1}\left(W\right)$.
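The four properties verified above for ${G}_{P}\left(W\right)={P}^{-1}W{P}^{-1}$ lend themselves to a quick numerical sanity check. The sketch below assumes NumPy and uses randomly generated matrices; the helper names (`random_symmetric`, `G`, `checks`) are hypothetical and only serve this illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4

def random_symmetric(n):
    """Random symmetric matrix, i.e., a tangent direction of the SPD manifold."""
    A = rng.standard_normal((n, n))
    return (A + A.T) / 2

def G(P, W):
    """Metric kernel of the canonical SPD metric: G_P(W) = P^{-1} W P^{-1}."""
    P_inv = np.linalg.inv(P)
    return P_inv @ W @ P_inv

# A symmetric positive-definite base point and three symmetric directions.
A = rng.standard_normal((n, n))
P = A @ A.T + n * np.eye(n)
U, V, W = (random_symmetric(n) for _ in range(3))
alpha = 0.7

checks = {
    # G_P(V + alpha W) = G_P(V) + alpha G_P(W)
    "linearity": np.allclose(G(P, V + alpha * W), G(P, V) + alpha * G(P, W)),
    # tr(U G_P(W)) = tr(G_P(U) W)
    "symmetry": np.isclose(np.trace(U @ G(P, W)), np.trace(G(P, U) @ W)),
    # G_P(U) is again a symmetric matrix
    "closure": np.allclose(G(P, U), G(P, U).T),
    # P G_P(U) P recovers U
    "invertibility": np.allclose(P @ G(P, U) @ P, U),
}
print(checks)
```

A numerical check of this kind does not replace the algebraic proof; it is only a convenient way to catch transcription mistakes when coding a metric kernel.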

**Example**

**16.**

#### 7.2. Covariancy, Contravariancy, Tensors*

**Example**

**17.**

## 8. Geodesic Arc, Riemannian Distance, Exponential and Logarithmic Map

- On a general manifold, the concept of geodesic extends the concept of a straight line from a flat space to a curved space. In fact, let us consider a curved manifold $\mathbb{M}$ embedded in an ambient space ${\mathbb{R}}^{p}$. Such ambient space contains straight lines in the usual meaning, but the manifold, being curved, hardly accommodates any straight lines. Geodesics are curves that resemble straight lines in that they copy some of their distinguishing features.
- On a metrizable manifold, a geodesic connecting two points is locally defined as the shortest curve on the manifold connecting these endpoints. Therefore, once a metric is specified, the equation of the geodesic arises from the minimization of a length functional. Such a definition comes from the observation that any straight line in ${\mathbb{R}}^{p}$ is indeed the shortest path connecting any two given points.
- A further distinguishing feature of straight lines is that they are self-parallel, namely, sliding a straight line infinitesimally along itself returns the very same line. Such a concept gives rise to a definition of geodesic that requires specifying the mathematical meaning of ‘sliding a piece of line infinitesimally along itself’. The technical tool needed to access such a definition is covariant derivation, which is not covered in the present tutorial (while it will be covered in a forthcoming review paper).
- Another intuitive interpretation is based on the observation that a geodesic emanating from a point on a manifold coincides with the path followed by a particle sliding on the manifold at a constant speed. For a manifold embedded in a larger ambient space and in special circumstances, this is equivalent to the requirement that the naïve (or embedded) acceleration of the particle is either zero or perpendicular to the tangent space to the manifold at every point of its trajectory. (In the present tutorial paper, we use the term naïve acceleration to distinguish it from covariant acceleration, which will only be defined in a subsequent tutorial.)

#### 8.1. Coordinate-Free Embedded Geodesy

**Example**

**18.**

- Given a manifold $(\mathbb{M},\langle \cdot ,\cdot \rangle )$, take two arbitrary points $p,q\in \mathbb{M}$;
- Choose a curve $\gamma :[0,\phantom{\rule{4pt}{0ex}}1]\to \mathbb{M}$ that has p and q as endpoints, namely, such that $\gamma \left(0\right)=p$ and $\gamma \left(1\right)=q$;
- Take as distance $d(p,q)$ the length of such a curve, namely $\mathrm{L}\left(\gamma \right)$; such a definition is a good starting point, but it needs to be refined since there exist infinitely many curves joining two given points.
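The need to refine this tentative definition can be illustrated numerically. The following sketch, assuming NumPy and the metric induced on the unit sphere of ${\mathbb{R}}^{3}$ by the Euclidean ambient space, approximates the length of two different curves joining the same endpoints by a fine polygonal chain. The curves `arc` and `detour` and the helper `curve_length` are illustrative choices, not taken from the original text.

```python
import numpy as np

def curve_length(gamma, num=2001):
    """Approximate the length of gamma: [0, 1] -> R^3 by a fine polygonal chain."""
    t = np.linspace(0.0, 1.0, num)
    points = np.array([gamma(s) for s in t])
    return np.sum(np.linalg.norm(np.diff(points, axis=0), axis=1))

p = np.array([1.0, 0.0, 0.0])
q = np.array([0.0, 1.0, 0.0])

def arc(t):
    """Quarter of a great circle from p to q."""
    return np.array([np.cos(np.pi * t / 2), np.sin(np.pi * t / 2), 0.0])

def detour(t):
    """Another curve on the sphere from p to q, bulging toward the north pole."""
    w = arc(t) + np.array([0.0, 0.0, np.sin(np.pi * t)])
    return w / np.linalg.norm(w)

length_arc = curve_length(arc)
length_detour = curve_length(detour)
print(length_arc, length_detour)
```

Both curves have p and q as endpoints, yet their lengths differ, so the length of an arbitrary connecting curve cannot serve as a distance without a further selection criterion.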

**Example**

**19.**

**Theorem**

**2.**

**Proof.**

**Example**

**20.**

**Theorem**

**3.**

**Proof.**

**Observation**

**2.**

**Example**

**21.**

#### 8.2. Exponential and Logarithmic Maps

**Exponential map**: Given a point x and a vector v in ${\mathbb{R}}^{n}$, the exponential map ${\exp}_{x}\left(v\right):=x+v$ moves the point from x to $x+v$. (Indeed, the term ‘vector’ comes from the homonymous Latin term that means ‘transporter’.)

**Inverse exponential map**: Given two points x and y in ${\mathbb{R}}^{n}$, the inverse of the exponential map ${\exp}_{x}^{-1}\left(y\right)$ returns the vector $v=y-x$, which is such that ${\exp}_{x}\left(v\right)={\exp}_{x}\left({\exp}_{x}^{-1}\left(y\right)\right)=x+(y-x)\equiv y$. In other words, the inverse exponential map applied to two points $x,\phantom{\rule{0.166667em}{0ex}}y$ determines the vector v that ‘transports’ x to y.

**Hypercube**: The space ${\mathbb{R}}^{p}$ endowed with the Euclidean metric admits straight lines as geodesics; in fact, since ${T}_{x}{\mathbb{R}}^{p}={\mathbb{R}}^{p}$ it follows that ${N}_{x}{\mathbb{R}}^{p}=\left\{0\right\}$, and hence the geodesic equation is simply ${\ddot{\gamma}}_{x,v}=0$ and its solution is ${\gamma}_{x,v}\left(t\right)=x+t\,v$ for every $x\in {\mathbb{R}}^{p}$, $v\in {T}_{x}{\mathbb{R}}^{p}$ and $t\in [0,\,1]$. Now, take two points $x,y\in {\mathbb{R}}^{p}$ and look for a geodesic arc connecting them with $t\in [0,\,1]$. It is necessary to find a vector $v\in {T}_{x}{\mathbb{R}}^{p}$ such that ${\gamma}_{x,v}\left(1\right)=y$. Such a vector is clearly $v=y-x$; hence, the unique geodesic arc connecting x to y is ${\gamma}_{x}^{y}\left(t\right)=x+t(y-x)$. Since $\sqrt{{\langle {\dot{\gamma}}_{x}^{y}\left(0\right),{\dot{\gamma}}_{x}^{y}\left(0\right)\rangle}_{x}}=\parallel v\parallel $, the Riemannian distance reads$$d(x,y)=\parallel y-x\parallel .$$

**Hypersphere**: On the hypersphere ${\mathbb{S}}^{p-1}$ embedded in the Euclidean space ${\mathbb{R}}^{p}$, a geodesic line may be conceived as a curve on which a particle, departing from the point $x\in {\mathbb{S}}^{p-1}$ with velocity $v\in {T}_{x}{\mathbb{S}}^{p-1}$, slides with constant speed $\parallel v\parallel $, where $\parallel \cdot \parallel $ denotes the standard ${L}_{2}$ vector norm. On the hypersphere, we denote such a curve as ${\gamma}_{x,v}\left(t\right)$, where the variable $t\in [0,\,1]$ provides a parametrization of the curve.
The differential equation characterizing geodesics on the hypersphere may be determined by observing that, with the given conditions, the naïve acceleration of the particle must be either null or normal to the tangent space at any point of the hypersphere itself, namely ${\ddot{\gamma}}_{x,v}\left(t\right)\in {N}_{{\gamma}_{x,v}\left(t\right)}{\mathbb{S}}^{p-1}$. Since the normal space to a hypersphere at a point x is radial along x, the geodesic equation reads as ${\ddot{\gamma}}_{x,v}\left(t\right)=\lambda {\gamma}_{x,v}\left(t\right)$. In explicit form, the equation of the geodesic on the unit hypersphere may be written as [49]:$${\gamma}_{x,v}\left(t\right)=x\cos (\parallel v\parallel t)+v\sin (\parallel v\parallel t){\parallel v\parallel}^{-1},\phantom{\rule{4pt}{0ex}}t\in [0,\,1],\tag{184}$$and the associated exponential map reads$${\exp}_{x}\left(v\right):=x\cos (\parallel v\parallel )+v\sin (\parallel v\parallel ){\parallel v\parallel}^{-1}.$$The relationship (184) for the geodesic represents a ‘great circle’ on the hypersphere. Now let us take two points $x,y\in {\mathbb{S}}^{p-1}$ (non-antipodal, i.e., such that ${x}^{\top}y\ne -1$) and let us look for a geodesic arc of the form (184) connecting them. It is clearly necessary to find a vector $v\in {T}_{x}{\mathbb{S}}^{p-1}$ such that ${\gamma}_{x,v}\left(1\right)=y$.
Such an equation in the unknown v may be expressed explicitly as$$x\cos (\parallel v\parallel )+v\,\mathrm{sinc}(\parallel v\parallel )=y,$$whose solution, since ${x}^{\top}v=0$ implies $\cos (\parallel v\parallel )={x}^{\top}y$, is$$v=\frac{y-x\left({x}^{\top}y\right)}{\mathrm{sinc}\left(\mathrm{acos}\left({x}^{\top}y\right)\right)}.$$This expression represents the inverse of the exponential map applied to points $x,y\in {\mathbb{S}}^{p-1}$, namely$${\log}_{x}y:=\frac{({I}_{p}-x{x}^{\top})y}{\mathrm{sinc}\left(\mathrm{acos}\left({x}^{\top}y\right)\right)}.$$Notice that such a logarithmic map is defined only when ${x}^{\top}y\ne -1$, namely when the two points are not antipodal. The unique geodesic arc connecting x to y is given by$${\gamma}_{x}^{y}\left(t\right)=x\cos \left(\mathrm{acos}\left({x}^{\top}y\right)t\right)+\frac{y-x\left({x}^{\top}y\right)}{\mathrm{sinc}\left(\mathrm{acos}\left({x}^{\top}y\right)\right)}\,t\,\mathrm{sinc}\left(\mathrm{acos}\left({x}^{\top}y\right)t\right).$$A noticeable consequence is that, since $\sqrt{{\langle {\dot{\gamma}}_{x}^{y}\left(0\right),{\dot{\gamma}}_{x}^{y}\left(0\right)\rangle}_{x}}=\parallel v\parallel $, the Riemannian distance between the points x and y reads$$d(x,y)=\mathrm{acos}\left({x}^{\top}y\right).$$

**Special orthogonal group**: In general, it is not easy to obtain the expression of a geodesic arc on a given manifold in closed form.
In the present case, with the assumptions considered, the geodesic on $\mathbb{SO}\left(p\right)$ departing from the identity with velocity $H\in \mathfrak{so}\left(p\right)$ has expression $\tilde{\gamma}\left(t\right)=\mathrm{Exp}\left(t\,H\right)$. (It is important to verify that $\tilde{\gamma}\left(0\right)={I}_{p}$ and ${\left.\frac{d\tilde{\gamma}\left(t\right)}{dt}\right|}_{t=0}=H$.) It might be useful to verify such an essential result with the help of the following arguments. A geodesic $\tilde{\gamma}\left(t\right)$ on the Riemannian manifold $\mathbb{SO}\left(p\right)$, embedded in the Euclidean ambient space $\mathbb{A}:={\mathbb{R}}^{p\times p}$ and endowed with its canonical metric, departing from the identity ${I}_{p}$, should satisfy $\ddot{\tilde{\gamma}}\left(t\right)\in {N}_{\tilde{\gamma}\left(t\right)}\mathbb{SO}\left(p\right)$, and therefore it should hold:$$\ddot{\tilde{\gamma}}\left(t\right)=\tilde{\gamma}\left(t\right)S\left(t\right),\phantom{\rule{4pt}{0ex}}\text{with}\phantom{\rule{4pt}{0ex}}{S}^{\top}\left(t\right)=S\left(t\right).\tag{191}$$Additionally, we know that any geodesic arc belongs entirely to the base manifold; therefore ${\tilde{\gamma}}^{\top}\left(t\right)\tilde{\gamma}\left(t\right)={I}_{p}$.
By differentiating such an expression two times with respect to the parameter t, one obtains:$${\ddot{\tilde{\gamma}}}^{\top}\left(t\right)\tilde{\gamma}\left(t\right)+2{\dot{\tilde{\gamma}}}^{\top}\left(t\right)\dot{\tilde{\gamma}}\left(t\right)+{\tilde{\gamma}}^{\top}\left(t\right)\ddot{\tilde{\gamma}}\left(t\right)={0}_{p}.\tag{192}$$By plugging Equation (191) into Equation (192), we find that $S\left(t\right)=-{\dot{\tilde{\gamma}}}^{\top}\left(t\right)\dot{\tilde{\gamma}}\left(t\right)$, which leads to the second-order differential equation on the orthogonal group:$$\ddot{\tilde{\gamma}}\left(t\right)=-\tilde{\gamma}\left(t\right)\left({\dot{\tilde{\gamma}}}^{\top}\left(t\right)\dot{\tilde{\gamma}}\left(t\right)\right).\tag{193}$$The expression of the geodesic arc at a generic base point may be made explicit by taking advantage of the Lie-group structure of the orthogonal group endowed with the canonical metric. In fact, let us consider the pair $X\in \mathbb{SO}\left(p\right)$ and $V\in {T}_{X}\mathbb{SO}\left(p\right)$ as well as the geodesic $\gamma \left(t\right)$ that emanates from X, namely $\gamma \left(0\right)=X$, with velocity V. We claim that the geodesic departing from $X\in \mathbb{SO}\left(p\right)$ in the direction $V\in {T}_{X}\mathbb{SO}\left(p\right)$ is:$${\gamma}_{X,V}\left(t\right)=X\,\mathrm{Exp}\left(t\,{X}^{\top}V\right).$$In fact, let us consider the left-translated curve $\tilde{\gamma}\left(t\right)={X}^{\top}{\gamma}_{X,V}\left(t\right)$. The following properties hold:

- (1) The curve ${\gamma}_{X,V}\left(t\right)$ belongs to the orthogonal group at any time. This may be proven by computing the quantity ${\gamma}_{X,V}^{\top}\left(t\right){\gamma}_{X,V}\left(t\right)$ and taking into account that the identity ${\mathrm{Exp}}^{\top}\left(t\,H\right)=\mathrm{Exp}(-t\,H)$ holds true for every $H\in \mathfrak{so}\left(p\right)$. Therefore,$$\begin{array}{rl}{\gamma}_{X,V}^{\top}\left(t\right){\gamma}_{X,V}\left(t\right)=& {\mathrm{Exp}}^{\top}\left(t{X}^{\top}V\right){X}^{\top}X\,\mathrm{Exp}\left(t{X}^{\top}V\right)\\ =& \mathrm{Exp}(-t{X}^{\top}V)\,\mathrm{Exp}\left(t{X}^{\top}V\right)\\ =& {I}_{p}.\end{array}$$
- (2) The curve $\tilde{\gamma}$ satisfies Equation (193); hence, it is a geodesic. In fact, notice that $\tilde{\gamma}\left(t\right)={X}^{\top}X\,\mathrm{Exp}\left(t{X}^{\top}V\right)=\mathrm{Exp}\left(t{X}^{\top}V\right)$; hence, $\dot{\tilde{\gamma}}\left(t\right)={X}^{\top}V\,\mathrm{Exp}\left(t{X}^{\top}V\right)$ and $\ddot{\tilde{\gamma}}\left(t\right)={\left({X}^{\top}V\right)}^{2}\,\mathrm{Exp}\left(t{X}^{\top}V\right)$. Now, the right-hand side of Equation (193) has the expression $-\mathrm{Exp}\left(t{X}^{\top}V\right){\left({X}^{\top}V\,\mathrm{Exp}\left(t{X}^{\top}V\right)\right)}^{\top}{X}^{\top}V\,\mathrm{Exp}\left(t{X}^{\top}V\right)=-\mathrm{Exp}\left(t{X}^{\top}V\right)\mathrm{Exp}(-t{X}^{\top}V){\left({X}^{\top}V\right)}^{\top}{X}^{\top}V\,\mathrm{Exp}\left(t{X}^{\top}V\right)={\left({X}^{\top}V\right)}^{2}\,\mathrm{Exp}\left(t{X}^{\top}V\right)$ because ${X}^{\top}V\in \mathfrak{so}\left(p\right)$, hence the claim.
- (3) The curve ${\gamma}_{X,V}$ satisfies ${\gamma}_{X,V}\left(0\right)=X$ and ${\dot{\gamma}}_{X,V}\left(0\right)=X\left({X}^{\top}V\right)=V$; hence, it has the correct base point and direction.
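The three properties listed above may also be checked numerically. The sketch below assumes NumPy and evaluates the matrix exponential of a skew-symmetric matrix through the eigen-decomposition of the Hermitian matrix $\mathrm{i}H$, which is a convenience of this example rather than a method taken from the text. Derivatives are approximated by central finite differences, and all names (`expm_skew`, `so_geodesic`, `left_translated`) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
p = 3

def expm_skew(H):
    """Matrix exponential of a real skew-symmetric H via the Hermitian matrix iH."""
    w, U = np.linalg.eigh(1j * H)              # iH = U diag(w) U^H, with real w
    return ((U * np.exp(-1j * w)) @ U.conj().T).real

def so_geodesic(X, V, t):
    """Candidate geodesic X Exp(t X^T V) on SO(p)."""
    return X @ expm_skew(t * (X.T @ V))

# Random base point X in SO(p) and tangent vector V = X H, with H skew-symmetric.
X, _ = np.linalg.qr(rng.standard_normal((p, p)))
if np.linalg.det(X) < 0:
    X[:, 0] *= -1.0
B = rng.standard_normal((p, p))
H = (B - B.T) / 2
V = X @ H

def left_translated(t):
    """Left-translated curve X^T gamma_{X,V}(t), which departs from the identity."""
    return X.T @ so_geodesic(X, V, t)

t, h = 0.6, 1e-3
g = left_translated(t)
# Central finite differences for the first and second derivatives at time t.
g_dot = (left_translated(t + h) - left_translated(t - h)) / (2 * h)
g_ddot = (left_translated(t + h) - 2 * g + left_translated(t - h)) / h**2

gamma_t = so_geodesic(X, V, t)
orthogonality_error = np.linalg.norm(gamma_t.T @ gamma_t - np.eye(p))
base_point_error = np.linalg.norm(so_geodesic(X, V, 0.0) - X)
velocity_error = np.linalg.norm(
    (so_geodesic(X, V, h) - so_geodesic(X, V, -h)) / (2 * h) - V)
# Residual of the second-order equation: acceleration + curve (velocity^T velocity).
ode_residual = np.linalg.norm(g_ddot + g @ (g_dot.T @ g_dot))
print(orthogonality_error, base_point_error, velocity_error, ode_residual)
```

The first two errors are at the level of round-off, while the last two are limited by the accuracy of the finite-difference approximation.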
**Stiefel manifold**: Let us consider the expression of geodesics corresponding to two metrics.

- Euclidean metric: The solution of the geodesic equation, with the initial conditions ${\gamma}_{X,V}\left(0\right)=X\in \mathrm{St}(n,p)$ and ${\dot{\gamma}}_{X,V}\left(0\right)=V\in {T}_{X}\mathrm{St}(n,p)$, reads [33]:$${\gamma}_{X,V}\left(t\right)=\left[X\phantom{\rule{4pt}{0ex}}V\right]\,\mathrm{Exp}\left(t\left[\begin{array}{cc}{X}^{\top}V& -{V}^{\top}V\\ {I}_{p}& {X}^{\top}V\end{array}\right]\right){I}_{2p,p}\,\mathrm{Exp}(-t{X}^{\top}V),$$and the associated exponential map, obtained by setting $t=1$, reads$${\exp}_{X}\left(V\right):=\left[X\phantom{\rule{4pt}{0ex}}V\right]\,\mathrm{Exp}\left(\left[\begin{array}{cc}{X}^{\top}V& -{V}^{\top}V\\ {I}_{p}& {X}^{\top}V\end{array}\right]\right){I}_{2p,p}\,\mathrm{Exp}(-{X}^{\top}V).$$
- Canonical metric: The geodesic arc ${\gamma}_{X,V}$ may be computed as follows. Let Q and R denote the factors of the thin QR factorization of the matrix $({I}_{n}-X{X}^{\top})V$, then:$${\gamma}_{X,V}\left(t\right)=\left[X\phantom{\rule{4pt}{0ex}}Q\right]\,\mathrm{Exp}\left(t\left[\begin{array}{cc}{X}^{\top}V& -{R}^{\top}\\ R& {0}_{p}\end{array}\right]\right)\left[\begin{array}{c}{I}_{p}\\ {0}_{p}\end{array}\right],$$and the associated exponential map reads$${\exp}_{X}\left(V\right):=\left[X\phantom{\rule{4pt}{0ex}}Q\right]\,\mathrm{Exp}\left(\left[\begin{array}{cc}{X}^{\top}V& -{R}^{\top}\\ R& {0}_{p}\end{array}\right]\right)\left[\begin{array}{c}{I}_{p}\\ {0}_{p}\end{array}\right].$$Neither the logarithmic map nor the geodesic distance is known in closed form for a Stiefel manifold.
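The geodesic associated with the canonical metric may be sketched in a few lines. The example below assumes NumPy, assumes that the thin QR factorization is applied to the normal component $({I}_{n}-X{X}^{\top})V$ of the velocity (which is what makes the columns of $\left[X\phantom{\rule{4pt}{0ex}}Q\right]$ orthonormal), and assumes that this component has full column rank. The helper names are hypothetical, and the exponential of the skew-symmetric block matrix is computed through a Hermitian eigen-decomposition as a convenience.

```python
import numpy as np

rng = np.random.default_rng(2)
n, p = 5, 2

def expm_skew(S):
    """Matrix exponential of a real skew-symmetric S via the Hermitian matrix iS."""
    w, U = np.linalg.eigh(1j * S)
    return ((U * np.exp(-1j * w)) @ U.conj().T).real

def stiefel_canonical_geodesic(X, V, t):
    """Geodesic [X Q] Exp(t [[A, -R^T], [R, 0]]) [I_p; 0_p] with A = X^T V."""
    p = X.shape[1]
    A = X.T @ V                                   # skew-symmetric p x p block
    Q, R = np.linalg.qr(V - X @ A)                # thin QR of (I_n - X X^T) V
    block = np.block([[A, -R.T], [R, np.zeros((p, p))]])
    # Keeping the first p columns amounts to right-multiplying by [I_p; 0_p].
    return np.hstack([X, Q]) @ expm_skew(t * block)[:, :p]

# Random point X on St(n, p) and random tangent vector V (X^T V skew-symmetric).
X, _ = np.linalg.qr(rng.standard_normal((n, p)))
A0 = rng.standard_normal((p, p))
C = rng.standard_normal((n, p))
V = X @ (A0 - A0.T) / 2 + (C - X @ (X.T @ C))

Y = stiefel_canonical_geodesic(X, V, 0.8)
h = 1e-3
on_manifold_error = np.linalg.norm(Y.T @ Y - np.eye(p))
base_point_error = np.linalg.norm(stiefel_canonical_geodesic(X, V, 0.0) - X)
velocity_error = np.linalg.norm(
    (stiefel_canonical_geodesic(X, V, h)
     - stiefel_canonical_geodesic(X, V, -h)) / (2 * h) - V)
print(on_manifold_error, base_point_error, velocity_error)
```

The checks confirm that the computed curve stays on the Stiefel manifold, starts at X, and leaves it with velocity V (up to finite-difference accuracy).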

**Real symplectic group**: According to the two considered metrics, we have:

- KM metric: Under the pseudo-Riemannian metric (115), it is indeed possible to solve the geodesic equation in closed form. The geodesic curve ${\gamma}_{Q,V}:[0,\,1]\to \mathrm{Sp}\left(2n\right)$ with $Q\in \mathrm{Sp}\left(2n\right)$ and $V\in {T}_{Q}\mathrm{Sp}\left(2n\right)$ corresponding to the indefinite Khvedelidze–Mladenov metric (115) has the expression:$${\gamma}_{Q,V}\left(t\right)=Q\,\mathrm{Exp}\left(t{Q}^{-1}V\right).\tag{202}$$In fact, the geodesic equation in variational form is:$$\delta {\int}_{0}^{1}\mathrm{tr}\left({\gamma}^{-1}\dot{\gamma}{\gamma}^{-1}\dot{\gamma}\right)\mathrm{d}t=0.$$The calculation of this variation is facilitated by the following rules of the calculus of variations:$$\delta \left({Q}^{-1}\right)=-{Q}^{-1}\left(\delta Q\right){Q}^{-1},$$$$\delta \left(\frac{\mathrm{d}Q}{\mathrm{d}t}\right)=\frac{\mathrm{d}}{\mathrm{d}t}\left(\delta Q\right),$$$$\delta \left(QZ\right)=\left(\delta Q\right)Z+Q\left(\delta Z\right).$$Applying these rules and integrating by parts leads to:$${\int}_{0}^{1}\mathrm{tr}\left(\delta \gamma \left({\gamma}^{-1}\ddot{\gamma}{\gamma}^{-1}-{\gamma}^{-1}\dot{\gamma}{\gamma}^{-1}\dot{\gamma}{\gamma}^{-1}\right)\right)\mathrm{d}t=0.\tag{207}$$The variation $\delta \gamma \in {T}_{\gamma}\mathrm{Sp}\left(2n\right)$ is arbitrary. By the structure of the normal space ${N}_{\gamma}\mathrm{Sp}\left(2n\right)$, the equation $\mathrm{tr}\left({P}^{\top}\delta \gamma \right)=0$, with $\delta \gamma \in {T}_{\gamma}\mathrm{Sp}\left(2n\right)$, implies that ${P}^{\top}=HJ{\gamma}^{-1}$ with $H\in \mathfrak{so}\left(2n\right)$.
Therefore, Equation (207) is satisfied if and only if:$${\gamma}^{-1}\ddot{\gamma}{\gamma}^{-1}-{\gamma}^{-1}\dot{\gamma}{\gamma}^{-1}\dot{\gamma}{\gamma}^{-1}=HJ{\gamma}^{-1},\phantom{\rule{4pt}{0ex}}H\in \mathfrak{so}\left(2n\right),$$or, equivalently,$$\ddot{\gamma}-\dot{\gamma}{\gamma}^{-1}\dot{\gamma}=\gamma HJ.$$Moreover, differentiating the symplecticity condition twice with respect to the parameter t gives:$${\gamma}^{\top}J\gamma -J=0\Rightarrow {\ddot{\gamma}}^{\top}J\gamma +2{\dot{\gamma}}^{\top}J\dot{\gamma}+{\gamma}^{\top}J\ddot{\gamma}=0.$$Substituting the expression $\ddot{\gamma}=\dot{\gamma}{\gamma}^{-1}\dot{\gamma}+\gamma HJ$ into the above equation yields the condition $JHJ=0$. Hence, $H=0$ and the geodesic equation reads:$$\ddot{\gamma}-\dot{\gamma}{\gamma}^{-1}\dot{\gamma}=0.$$Its solution, with the initial conditions $\gamma \left(0\right)=Q\in \mathrm{Sp}\left(2n\right)$ and $\dot{\gamma}\left(0\right)=V\in {T}_{Q}\mathrm{Sp}\left(2n\right)$, is found to be of the form (202). By definition of matrix exponential, it follows that ${\dot{\gamma}}_{Q,V}\left(t\right)=V\,\mathrm{Exp}\left(t\,{Q}^{-1}V\right)$.
- Euclidean metric: The expression of the geodesic corresponding to the Euclidean metric was derived in [50]. Let $\gamma :[0,1]\to \mathrm{Sp}\left(2n\right)$ be a geodesic arc connecting the points $X,Y\in \mathrm{Sp}\left(2n\right)$. Let us define $h\left(t\right):={\gamma}^{-1}\left(t\right)\dot{\gamma}\left(t\right)$. The geodesic that minimizes the energy functional$${\int}_{0}^{1}\mathrm{tr}\left({h}^{\top}\left(t\right)h\left(t\right)\right)\mathrm{d}t$$satisfies the differential equation$$\dot{h}\left(t\right)={h}^{\top}\left(t\right)h\left(t\right)-h\left(t\right){h}^{\top}\left(t\right).$$Furthermore, for the initial conditions $\gamma \left(0\right)=X\in \mathrm{Sp}\left(2n\right)$ and $\dot{\gamma}\left(0\right)=V\in {T}_{X}\mathrm{Sp}\left(2n\right)$, the geodesic on the real symplectic group is given by$${\gamma}_{X,V}\left(t\right)=X\,\mathrm{Exp}\left(t{\left({X}^{-1}V\right)}^{\top}\right)\,\mathrm{Exp}\left(t[\left({X}^{-1}V\right)-{\left({X}^{-1}V\right)}^{\top}]\right).$$
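The closed-form geodesic associated with the KM metric, $Q\,\mathrm{Exp}\left(t{Q}^{-1}V\right)$, may be checked numerically. The sketch below assumes NumPy, assumes the convention $J=\left[\begin{array}{cc}0& I\\ -I& 0\end{array}\right]$ for the fundamental skew-symmetric matrix, and builds the tangent vector as $V=QJS$ with S symmetric, so that ${Q}^{-1}V$ is a Hamiltonian matrix. The helper `expm_taylor` is a hypothetical stand-in for a library routine such as `scipy.linalg.expm`, written out here only to keep the example self-contained.

```python
import numpy as np

def expm_taylor(A, terms=20):
    """Matrix exponential by scaling and squaring with a truncated Taylor series."""
    norm = np.linalg.norm(A, 1)
    s = max(0, int(np.ceil(np.log2(norm))) + 1) if norm > 0 else 0
    B = A / 2**s                                  # scaled matrix with small norm
    E = np.eye(A.shape[0])
    term = np.eye(A.shape[0])
    for k in range(1, terms + 1):
        term = term @ B / k                       # B^k / k!
        E = E + term
    for _ in range(s):                            # undo the scaling by squaring
        E = E @ E
    return E

n = 2
Id, Z = np.eye(n), np.zeros((n, n))
J = np.block([[Z, Id], [-Id, Z]])                 # assumed convention for J

# Base point Q = diag(A, A^{-T}), which satisfies Q^T J Q = J for invertible A.
A = np.array([[2.0, 1.0], [0.0, 0.5]])
Q = np.block([[A, Z], [Z, np.linalg.inv(A).T]])

# Tangent vector V = Q J S, with S symmetric.
rng = np.random.default_rng(3)
S = rng.standard_normal((2 * n, 2 * n))
S = (S + S.T) / 4
V = Q @ J @ S

def km_geodesic(t):
    """Curve Q Exp(t Q^{-1} V)."""
    return Q @ expm_taylor(t * np.linalg.solve(Q, V))

t, h = 0.7, 1e-3
g = km_geodesic(t)
g_dot = (km_geodesic(t + h) - km_geodesic(t - h)) / (2 * h)
g_ddot = (km_geodesic(t + h) - 2 * g + km_geodesic(t - h)) / h**2

symplecticity_error = np.linalg.norm(g.T @ J @ g - J)
velocity_error = np.linalg.norm(
    g_dot - V @ expm_taylor(t * np.linalg.solve(Q, V)))
# Residual of the geodesic equation: acceleration - velocity * curve^{-1} * velocity.
ode_residual = np.linalg.norm(g_ddot - g_dot @ np.linalg.solve(g, g_dot))
print(symplecticity_error, velocity_error, ode_residual)
```

The first check confirms that the curve remains in the symplectic group, while the other two compare finite-difference derivatives with the velocity formula and with the geodesic equation.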

**Space of symmetric, positive-definite matrices**: The geodesic arc, corresponding to the canonical metric, emanating from a point $P\in {\mathbb{S}}^{+}\left(p\right)$ in the direction $V\in {T}_{P}{\mathbb{S}}^{+}\left(p\right)$ has the expression:$${\gamma}_{P,V}\left(t\right)=\sqrt[\mathbb{S}]{P}\,\mathrm{Exp}\left(t\sqrt[\mathbb{S}]{{P}^{-1}}V\sqrt[\mathbb{S}]{{P}^{-1}}\right)\sqrt[\mathbb{S}]{P},$$and the associated exponential and logarithmic maps read$${\exp}_{P}\left(V\right)=\sqrt[\mathbb{S}]{P}\,\mathrm{Exp}\left(\sqrt[\mathbb{S}]{{P}^{-1}}V\sqrt[\mathbb{S}]{{P}^{-1}}\right)\sqrt[\mathbb{S}]{P},$$$${\log}_{P}\left(Q\right)=\sqrt[\mathbb{S}]{P}\,\mathrm{Log}\left(\sqrt[\mathbb{S}]{{P}^{-1}}Q\sqrt[\mathbb{S}]{{P}^{-1}}\right)\sqrt[\mathbb{S}]{P}.$$The symmetric matrix square root of a matrix $P\in {\mathbb{S}}^{+}\left(p\right)$ may be computed by means of its eigenvalue factorization. In fact, if the matrix P is factored as $X\,\mathrm{diag}({\lambda}_{1},{\lambda}_{2},{\lambda}_{3},\dots ,{\lambda}_{p}){X}^{\top}$, with $X\in \mathbb{O}\left(p\right)$ and every ${\lambda}_{k}>0$, then it holds that $\sqrt[\mathbb{S}]{P}=X\,\mathrm{diag}(\sqrt{{\lambda}_{1}},\sqrt{{\lambda}_{2}},\sqrt{{\lambda}_{3}},\dots ,\sqrt{{\lambda}_{p}}){X}^{\top}$.
The squared Riemannian distance between two points $P,Q\in {\mathbb{S}}^{+}\left(p\right)$ is given by$$\begin{array}{rl}{d}^{2}(P,Q)& ={\langle {\log}_{P}Q,{\log}_{P}Q\rangle}_{P}=\mathrm{tr}\left(\left({\log}_{P}Q\right){P}^{-1}\left({\log}_{P}Q\right){P}^{-1}\right)\\ & =\mathrm{tr}\left(\sqrt[\mathbb{S}]{P}\,\mathrm{Log}\left(\sqrt[\mathbb{S}]{{P}^{-1}}Q\sqrt[\mathbb{S}]{{P}^{-1}}\right)\sqrt[\mathbb{S}]{P}{P}^{-1}\sqrt[\mathbb{S}]{P}\,\mathrm{Log}\left(\sqrt[\mathbb{S}]{{P}^{-1}}Q\sqrt[\mathbb{S}]{{P}^{-1}}\right)\sqrt[\mathbb{S}]{P}{P}^{-1}\right)\\ & =\mathrm{tr}\left({\mathrm{Log}}^{2}\left(\sqrt[\mathbb{S}]{{P}^{-1}}Q\sqrt[\mathbb{S}]{{P}^{-1}}\right)\right),\end{array}$$where the last step follows from $\sqrt[\mathbb{S}]{P}{P}^{-1}\sqrt[\mathbb{S}]{P}={I}_{p}$ and from the cyclic permutation invariance of the trace. Notice that the following identity holds true:$$\mathrm{tr}\left({({I}_{p}-\sqrt[\mathbb{S}]{{P}^{-1}}Q\sqrt[\mathbb{S}]{{P}^{-1}})}^{k}\right)=\mathrm{tr}\left({({I}_{p}-Q{P}^{-1})}^{k}\right),$$for every integer $k\ge 0$, because the two matrices involved are similar; therefore$${d}^{2}(P,Q)=\mathrm{tr}\left({\mathrm{Log}}^{2}\left(Q{P}^{-1}\right)\right).$$

**Grassmann manifold**: A geodesic arc on a Grassmann manifold emanating from $\left[X\right]\in \mathrm{Gr}(n,p)$ with the velocity $V\in {T}_{\left[X\right]}\mathrm{Gr}(n,p)$ may be written as:$${\gamma}_{\left[X\right],V}\left(t\right)=\left[XB\phantom{\rule{4pt}{0ex}}A\right]\left[\begin{array}{c}\cos \left(Dt\right)\\ \sin \left(Dt\right)\end{array}\right]{B}^{\top},$$where $AD{B}^{\top}$ denotes the compact singular value decomposition of the matrix V. The exponential map associated with the canonical metric reads:$${\exp}_{\left[X\right]}\left(V\right)=XB\cos \left(D\right){B}^{\top}+A\sin \left(D\right){B}^{\top}.$$The logarithmic map ${\log}_{\left[X\right]}\left[Y\right]$ of two subspaces $\left[X\right],\left[Y\right]\in \mathrm{Gr}(n,p)$ is not easy to compute in general.
In [40], it is shown that if their Stiefel representatives $X,Y\in \mathrm{St}(n,p)$ are such that the product ${X}^{\top}Y$ is symmetric, then ${log}_{\left[X\right]}\left[Y\right]=AD{B}^{\top}$, where $B(cosD){B}^{\top}$ denotes the spectral factorization of the matrix ${X}^{\top}Y$ and $A:=(I-X{X}^{\top})YB{(sinD)}^{-1}$.
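Among the closed-form expressions collected above, those pertaining to the hypersphere and to the space of symmetric, positive-definite matrices are especially easy to code. The sketch below assumes NumPy and assumes that $\mathrm{sinc}(\theta )=\sin (\theta )/\theta $ (unnormalized cardinal sine). Matrix functions of symmetric matrices are evaluated through the eigenvalue factorization, in the same way as the symmetric square root, and all function names are hypothetical.

```python
import numpy as np

def sinc(theta):
    """Unnormalized cardinal sine sin(theta)/theta, with sinc(0) = 1."""
    return np.sinc(theta / np.pi)

def sphere_exp(x, v):
    nv = np.linalg.norm(v)
    return x * np.cos(nv) + v * sinc(nv)

def sphere_dist(x, y):
    return np.arccos(np.clip(x @ y, -1.0, 1.0))

def sphere_log(x, y):
    # Defined for non-antipodal points only (x^T y != -1).
    return (y - x * (x @ y)) / sinc(sphere_dist(x, y))

def sym_fun(P, fun):
    """Apply a scalar function to a symmetric matrix through its eigenvalues."""
    lam, X = np.linalg.eigh(P)
    return (X * fun(lam)) @ X.T

def inv_sqrt(lam):
    return 1.0 / np.sqrt(lam)

def spd_exp(P, V):
    R, Ri = sym_fun(P, np.sqrt), sym_fun(P, inv_sqrt)
    return R @ sym_fun(Ri @ V @ Ri, np.exp) @ R

def spd_log(P, Q):
    R, Ri = sym_fun(P, np.sqrt), sym_fun(P, inv_sqrt)
    return R @ sym_fun(Ri @ Q @ Ri, np.log) @ R

def spd_dist(P, Q):
    """Square root of tr(Log^2(sqrt(P^{-1}) Q sqrt(P^{-1})))."""
    Ri = sym_fun(P, inv_sqrt)
    return np.sqrt(np.sum(np.log(np.linalg.eigvalsh(Ri @ Q @ Ri)) ** 2))

# Hypersphere: the logarithmic map inverts the exponential map.
x = np.array([1.0, 0.0, 0.0])
y = np.array([0.0, 1.0, 0.0])
y_back = sphere_exp(x, sphere_log(x, y))

# SPD matrices: round trip and three equivalent evaluations of the distance.
P = np.array([[2.0, 1.0], [1.0, 2.0]])
Q = np.array([[3.0, 0.0], [0.0, 1.0]])
Q_back = spd_exp(P, spd_log(P, Q))
L, P_inv = spd_log(P, Q), np.linalg.inv(P)
d_closed = spd_dist(P, Q)
d_metric = np.sqrt(np.trace(L @ P_inv @ L @ P_inv))
d_similar = np.sqrt(np.sum(np.log(np.linalg.eigvals(Q @ P_inv).real) ** 2))
print(sphere_dist(x, y), d_closed, d_metric, d_similar)
```

The three values of the distance between P and Q coincide up to round-off, which reflects the chain of equalities leading to $\mathrm{tr}\left({\mathrm{Log}}^{2}\left(Q{P}^{-1}\right)\right)$.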

**Example**

**22.**

**Observation**

**3.**

#### 8.3. Geodesic Interpolation

**Example**

**23.**

- The notion of ‘mean value’ of objects in a metrizable space should reflect the intuitive understanding that the mean value is an element of the space that lies ‘amidst’ the available objects. Therefore, a fundamental tool in the definition of mean value is a measure of ‘how far apart’ elements in the sample space lie from one another.
- The notion of ‘metric variance’ of objects in a metrizable space should be defined in a way that accounts for the dispersion of these objects about their mean values and also depends on how the dissimilarity of such objects is measured.

#### 8.4. Coordinate-Prone Geodesy, Christoffel Symbols*

**Example**

**24.**

## 9. Riemannian Gradient of a Manifold-to-Scalar Function

#### 9.1. Riemannian Gradient: Motivation and Definition

**Observation**

**4.**

- **Tangency condition**: For every $x\in \mathbb{M}$, ${\mathrm{grad}}_{x}^{\mathbb{M}}f\in {T}_{x}\mathbb{M}$.
- **Metric compatibility condition**: For every $x\in \mathbb{M}$ and every $v\in {T}_{x}\mathbb{M}\subset \mathbb{A}$, it holds that ${\langle {\mathrm{grad}}_{x}^{\mathbb{M}}f,v\rangle}_{x}={\langle {\partial}_{x}f,v\rangle}^{\mathbb{A}}$.

**Hypercube**: In the space ${\mathbb{R}}^{p}$ endowed with the Euclidean metric, the Gateaux derivative of a regular function $f:{\mathbb{R}}^{p}\to \mathbb{R}$ is simply the column array of partial derivatives of function f with respect to the entries of the column array $x={[{x}^{\left(1\right)}\phantom{\rule{4pt}{0ex}}{x}^{\left(2\right)}\phantom{\rule{4pt}{0ex}}{x}^{\left(3\right)}\phantom{\rule{4pt}{0ex}}\cdots \phantom{\rule{4pt}{0ex}}{x}^{\left(p\right)}]}^{\top}$, namely$${\mathrm{grad}}_{x}^{{\mathbb{R}}^{p}}f={\partial}_{x}f:=\left[\begin{array}{c}\frac{\partial f}{\partial {x}^{\left(1\right)}}\\ \frac{\partial f}{\partial {x}^{\left(2\right)}}\\ \vdots \\ \frac{\partial f}{\partial {x}^{\left(p\right)}}\end{array}\right].$$Likewise, the Gateaux derivative of a regular function $f:{\mathbb{R}}^{p\times q}\to \mathbb{R}$ is the Jacobian matrix of partial derivatives of function f with respect to the entries of the matrix X, denoted by ${X}^{(i,j)}$, namely$${\mathrm{grad}}_{X}^{{\mathbb{R}}^{p\times q}}f={\partial}_{X}f:=\left[\begin{array}{cccc}\frac{\partial f}{\partial {X}^{(1,1)}}& \frac{\partial f}{\partial {X}^{(1,2)}}& \cdots & \frac{\partial f}{\partial {X}^{(1,q)}}\\ \frac{\partial f}{\partial {X}^{(2,1)}}& \frac{\partial f}{\partial {X}^{(2,2)}}& \cdots & \frac{\partial f}{\partial {X}^{(2,q)}}\\ \vdots & \vdots & \ddots & \vdots \\ \frac{\partial f}{\partial {X}^{(p,1)}}& \frac{\partial f}{\partial {X}^{(p,2)}}& \cdots & \frac{\partial f}{\partial {X}^{(p,q)}}\end{array}\right].$$

**Hypersphere**: Given a regular function $f:{\mathbb{S}}^{p-1}\to \mathbb{R}$, its Riemannian gradient at $x\in {\mathbb{S}}^{p-1}$ is denoted as ${\mathrm{grad}}_{x}^{{\mathbb{S}}^{p-1}}f\in {T}_{x}{\mathbb{S}}^{p-1}$.
The Riemannian gradient associated with the canonical metric has the expression:$${\mathrm{grad}}_{x}^{{\mathbb{S}}^{p-1}}f=({I}_{p}-x{x}^{\top}){\partial}_{x}f.$$In fact, the metric compatibility condition prescribes that $\langle {\mathrm{grad}}_{x}^{{\mathbb{S}}^{p-1}}f,v\rangle =\langle {\partial}_{x}f,v\rangle $, for every $v\in {T}_{x}{\mathbb{S}}^{p-1}$; hence, $\langle {\mathrm{grad}}_{x}^{{\mathbb{S}}^{p-1}}f-{\partial}_{x}f,v\rangle =0$ for every $v\in {T}_{x}{\mathbb{S}}^{p-1}$, and therefore ${\mathrm{grad}}_{x}^{{\mathbb{S}}^{p-1}}f-{\partial}_{x}f\in {N}_{x}{\mathbb{S}}^{p-1}$. It follows that$${\mathrm{grad}}_{x}^{{\mathbb{S}}^{p-1}}f={\partial}_{x}f+\lambda \,x,$$for some $\lambda \in \mathbb{R}$. The tangency condition ${x}^{\top}{\mathrm{grad}}_{x}^{{\mathbb{S}}^{p-1}}f=0$, together with ${x}^{\top}x=1$, gives $\lambda =-{x}^{\top}{\partial}_{x}f$, which returns the above expression.

**Special orthogonal group**: Let us compute the Riemannian gradient of a regular function $f:\mathbb{SO}\left(p\right)\to \mathbb{R}$. Let the manifold $\mathbb{SO}\left(p\right)$ be equipped with its canonical metric. Let ${\mathrm{grad}}_{R}^{\mathbb{SO}\left(p\right)}f$ denote the gradient vector of a function f at $R\in \mathbb{SO}\left(p\right)$ derived from the canonical metric.
According to the compatibility condition for the Riemannian gradient it must hold that:$$\langle V,{\mathrm{grad}}_{R}^{\mathbb{SO}\left(p\right)}f\rangle =\langle V,{\partial}_{R}f\rangle ,\phantom{\rule{4pt}{0ex}}\forall V\in {T}_{R}\mathbb{SO}\left(p\right),$$that is,$$\langle V,{\mathrm{grad}}_{R}^{\mathbb{SO}\left(p\right)}f-{\partial}_{R}f\rangle =0,\phantom{\rule{4pt}{0ex}}\forall V\in {T}_{R}\mathbb{SO}\left(p\right).$$This implies that the quantity ${\mathrm{grad}}_{R}^{\mathbb{SO}\left(p\right)}f-{\partial}_{R}f$ belongs to the normal space ${N}_{R}\mathbb{SO}\left(p\right)$, namely:$${\mathrm{grad}}_{R}^{\mathbb{SO}\left(p\right)}f={\partial}_{R}f+RS,\phantom{\rule{4pt}{0ex}}\text{with}\ S\ \text{being}\ p\times p\ \text{symmetric}.\tag{258}$$In order to determine the unknown matrix S, we may exploit the tangency condition, namely ${\left({\mathrm{grad}}_{R}^{\mathbb{SO}\left(p\right)}f\right)}^{\top}R+{R}^{\top}\left({\mathrm{grad}}_{R}^{\mathbb{SO}\left(p\right)}f\right)={0}_{p}$. Let us first pre-multiply both sides of Equation (258) by ${R}^{\top}$, which gives:$${R}^{\top}{\mathrm{grad}}_{R}^{\mathbb{SO}\left(p\right)}f={R}^{\top}{\partial}_{R}f+S.$$Transposing both sides of the above equation gives:$${\left({\mathrm{grad}}_{R}^{\mathbb{SO}\left(p\right)}f\right)}^{\top}R={\partial}_{R}^{\top}f\,R+S.$$Summing the last two equations side by side and invoking the tangency condition gives:$${R}^{\top}{\partial}_{R}f+{\partial}_{R}^{\top}f\,R=-2S,$$hence$$S=-\frac{1}{2}\left({R}^{\top}{\partial}_{R}f+{\partial}_{R}^{\top}f\,R\right).$$By plugging the expression (259) into the expression (258), we obtain the Riemannian gradient in the orthogonal group, corresponding to its canonical metric:$${\mathrm{grad}}_{R}^{\mathbb{SO}\left(p\right)}f=\frac{1}{2}\left({\partial}_{R}f-R\,{\partial}_{R}^{\top}f\,R\right).$$

**Stiefel manifold**: Let us compute the
expression of the Riemannian gradient of a regular function $f:\mathrm{St}(n,p)\to \mathbb{R}$ at a point $X\in \mathrm{St}(n,p)$ corresponding to the Euclidean and the canonical metrics. Recall that the Riemannian gradient ${\mathrm{grad}}_{X}^{\mathrm{St}(n,p)}f$ in a Stiefel manifold embedded in the Euclidean space ${\mathbb{R}}^{n\times p}$ is the unique matrix in ${T}_{X}\mathrm{St}(n,p)$ such that:$$\mathrm{tr}\left({U}^{\top}{\partial}_{X}f\right)={\langle U,{\mathrm{grad}}_{X}^{\mathrm{St}(n,p)}f\rangle}_{X},\phantom{\rule{4pt}{0ex}}\forall U\in {T}_{X}\mathrm{St}(n,p).$$
- Euclidean metric: The metric compatibility condition becomes $\mathrm{tr}\left({U}^{\top}\left({\partial}_{X}f-{\mathrm{grad}}_{X}^{\mathrm{St}(n,p)}f\right)\right)=0$ for every $U\in {T}_{X}\mathrm{St}(n,p)$, which implies that ${\mathrm{grad}}_{X}^{\mathrm{St}(n,p)}f={\partial}_{X}f+XS$, with S being $p\times p$ symmetric. Pre-multiplying both sides of this equation by the matrix ${X}^{\top}$ yields ${X}^{\top}{\mathrm{grad}}_{X}^{\mathrm{St}(n,p)}f={X}^{\top}{\partial}_{X}f+S$. From the condition ${\mathrm{grad}}_{X}^{\mathrm{St}(n,p)}f\in {T}_{X}\mathrm{St}(n,p)$, according to Equation (46), it follows that ${\left({\mathrm{grad}}_{X}^{\mathrm{St}(n,p)}f\right)}^{\top}X+{X}^{\top}{\mathrm{grad}}_{X}^{\mathrm{St}(n,p)}f=0$. Transposing both sides of the pre-multiplied equation, summing side by side, and invoking this tangency condition yields:$$S=-\frac{1}{2}\left({\partial}_{X}^{\top}f\phantom{\rule{0.166667em}{0ex}}X+{X}^{\top}{\partial}_{X}f\right).$$In conclusion, the sought Riemannian gradient reads:$${\mathrm{grad}}_{X}^{\mathrm{St}(n,p)}f={\partial}_{X}f-\frac{1}{2}X\left({\partial}_{X}^{\top}f\phantom{\rule{0.166667em}{0ex}}X+{X}^{\top}{\partial}_{X}f\right).$$
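- Canonical metric: The metric compatibility condition prescribes that:$$\mathrm{tr}\left({U}^{\top}\left({\partial}_{X}f-\left({I}_{n}-\frac{1}{2}X{X}^{\top}\right){\mathrm{grad}}_{X}^{\mathrm{St}(n,p)}f\right)\right)=0,\phantom{\rule{4pt}{0ex}}\forall U\in {T}_{X}\mathrm{St}(n,p).$$As in the Euclidean case, this implies that ${\partial}_{X}f-\left({I}_{n}-\frac{1}{2}X{X}^{\top}\right){\mathrm{grad}}_{X}^{\mathrm{St}(n,p)}f=XS$, with S being $p\times p$ symmetric, to be determined through the tangency condition. Solving for the Riemannian gradient yields the final expression:$${\mathrm{grad}}_{X}^{\mathrm{St}(n,p)}f={\partial}_{X}f-X\phantom{\rule{0.166667em}{0ex}}{\partial}_{X}^{\top}f\phantom{\rule{0.166667em}{0ex}}X.$$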
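
The two Stiefel-manifold gradient formulas above can be checked numerically. The following is a minimal sketch, assuming NumPy; the function names, the helper `random_tangent`, and the random matrix `dF` (which merely stands in for the Euclidean gradient ${\partial}_{X}f$) are illustrative choices and not part of the original text. It evaluates the tangency condition and the metric compatibility condition for both metrics.

```python
import numpy as np

def stiefel_grad_euclidean(X, dF):
    """Riemannian gradient on St(n,p), Euclidean metric."""
    return dF - 0.5 * X @ (dF.T @ X + X.T @ dF)

def stiefel_grad_canonical(X, dF):
    """Riemannian gradient on St(n,p), canonical metric."""
    return dF - X @ dF.T @ X

def random_tangent(X, rng):
    """Hypothetical helper: a random U with U^T X + X^T U = 0."""
    A = rng.standard_normal(X.shape)
    XtA = X.T @ A
    return A - 0.5 * X @ (XtA + XtA.T)

rng = np.random.default_rng(0)
n, p = 6, 3
X, _ = np.linalg.qr(rng.standard_normal((n, p)))  # X^T X = I_p
dF = rng.standard_normal((n, p))                  # stand-in for the Euclidean gradient
U = random_tangent(X, rng)

G_e = stiefel_grad_euclidean(X, dF)
G_c = stiefel_grad_canonical(X, dF)

# Tangency: G^T X + X^T G must vanish.
tangency_e = np.linalg.norm(G_e.T @ X + X.T @ G_e)
tangency_c = np.linalg.norm(G_c.T @ X + X.T @ G_c)

# Compatibility: tr(U^T dF) must equal the metric pairing of U with the gradient.
compat_e = np.trace(U.T @ (dF - G_e))
compat_c = np.trace(U.T @ (dF - (np.eye(n) - 0.5 * X @ X.T) @ G_c))

print(tangency_e, tangency_c, compat_e, compat_c)
```

All four printed quantities should vanish up to rounding errors; a noticeably non-zero value would point to a sign or transposition mistake in an implementation.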

**Real symplectic group**: According to the two considered metrics, the expression of the gradient may be computed as outlined below.
- KM metric: The structure of the pseudo-Riemannian gradient of a regular function $f:\mathrm{Sp}(2n)\to \mathbb{R}$ associated with the Khvedelidze–Mladenov metric (115) is given by$${\nabla}_{Q}f={\textstyle \frac{1}{2}}QJ\left({Q}^{\top}{\partial}_{Q}f\phantom{\rule{0.166667em}{0ex}}J-J\phantom{\rule{0.166667em}{0ex}}{\partial}_{Q}^{\top}f\phantom{\rule{0.166667em}{0ex}}Q\right).$$In fact, the pseudo-Riemannian gradient of a regular function $f:\mathrm{Sp}(2n)\to \mathbb{R}$ associated with the metric (115) is computed as the solution of the following system of equations:$$\left\{\begin{array}{l}{\langle V,{\nabla}_{Q}f\rangle}_{Q}=\mathrm{tr}\left({V}^{\top}{\partial}_{Q}f\right),\phantom{\rule{4pt}{0ex}}\forall V\in {T}_{Q}\mathrm{Sp}(2n),\\ {\nabla}_{Q}f\in {T}_{Q}\mathrm{Sp}(2n).\end{array}\right.$$The first constraint ensures the compatibility of the pseudo-Riemannian gradient with the chosen metric, while the second constraint enforces the requirement for the pseudo-Riemannian gradient to lie in the tangent space ${T}_{Q}\mathrm{Sp}(2n)$. In particular, the metric compatibility condition may be recast as:$$\mathrm{tr}\left({V}^{\top}\left({\partial}_{Q}f-{Q}^{-\top}{\left({\nabla}_{Q}f\right)}^{\top}{Q}^{-\top}\right)\right)=0.$$The above condition implies that ${\partial}_{Q}f-{Q}^{-\top}{\left({\nabla}_{Q}f\right)}^{\top}{Q}^{-\top}\in {N}_{Q}\mathrm{Sp}(2n)$, and hence that ${\partial}_{Q}f-{Q}^{-\top}{\left({\nabla}_{Q}f\right)}^{\top}{Q}^{-\top}=JQH$ with $H\in \mathfrak{so}(2n)$.
Therefore, the pseudo-Riemannian gradient of the criterion function f has the expression:$${\nabla}_{Q}f=Q\phantom{\rule{0.166667em}{0ex}}{\partial}_{Q}^{\top}f\phantom{\rule{0.166667em}{0ex}}Q-QHJ.\tag{270}$$In order to determine the value of the unknown skew-symmetric matrix H, it is sufficient to plug the expression (270) of the gradient into the tangency condition ${\left({\nabla}_{Q}f\right)}^{\top}JQ+{Q}^{\top}J\phantom{\rule{0.166667em}{0ex}}{\nabla}_{Q}f=0$, which becomes:$${\left(Q\phantom{\rule{0.166667em}{0ex}}{\partial}_{Q}^{\top}f\phantom{\rule{0.166667em}{0ex}}Q-QHJ\right)}^{\top}JQ+{Q}^{\top}J\left(Q\phantom{\rule{0.166667em}{0ex}}{\partial}_{Q}^{\top}f\phantom{\rule{0.166667em}{0ex}}Q-QHJ\right)=0.$$Solving for H (by means of the identities ${Q}^{\top}JQ=J$ and ${J}^{2}=-{I}_{2n}$) gives:$$H=-{\textstyle \frac{1}{2}}\left(J{Q}^{\top}{\partial}_{Q}f+{\partial}_{Q}^{\top}f\phantom{\rule{0.166667em}{0ex}}QJ\right).$$Plugging this expression back into (270) returns the gradient stated at the beginning of this item.
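- Euclidean metric: The structure of the Riemannian gradient ${\nabla}_{Q}f$ of a regular function $f:\mathrm{Sp}(2n)\to \mathbb{R}$ corresponding to the metric (116) reads:$${\nabla}_{Q}f={\textstyle \frac{1}{2}}QJ\left({\partial}_{Q}^{\top}f\phantom{\rule{0.166667em}{0ex}}QJ-J{Q}^{\top}{\partial}_{Q}f\right).$$In fact, the Riemannian gradient ${\nabla}_{Q}f$ must satisfy the conditions:$$\left\{\begin{array}{l}{\langle V,{\nabla}_{Q}f\rangle}_{Q}=\mathrm{tr}\left({V}^{\top}{\partial}_{Q}f\right),\phantom{\rule{4pt}{0ex}}\forall V\in {T}_{Q}\mathrm{Sp}(2n),\\ {\nabla}_{Q}f\in {T}_{Q}\mathrm{Sp}(2n),\end{array}\right.$$where the inner product ${\langle \cdot ,\cdot \rangle}_{Q}$ now denotes the metric (116).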
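
The KM-metric gradient on the real symplectic group can be verified numerically along the same lines. The sketch below rests on two assumptions that are not stated in this section: that the metric (115) takes the form ${\langle U,V\rangle}_{Q}=\mathrm{tr}({Q}^{-1}U{Q}^{-1}V)$, which is consistent with the recast compatibility condition above, and that tangent vectors may be written as $V=QJS$ with S symmetric. The construction of a test matrix Q from two elementary symplectic factors is likewise an illustrative choice.

```python
import numpy as np

n = 2
I, Z = np.eye(n), np.zeros((n, n))
J = np.block([[Z, I], [-I, Z]])
rng = np.random.default_rng(1)

# Build a symplectic test matrix Q as a product of two elementary symplectic factors.
A = np.eye(n) + np.tril(rng.standard_normal((n, n)), -1)  # unit lower-triangular, invertible
B = rng.standard_normal((n, n))
B = 0.5 * (B + B.T)                                       # symmetric block
Q = np.block([[I, B], [Z, I]]) @ np.block([[A, Z], [Z, np.linalg.inv(A).T]])

dF = rng.standard_normal((2 * n, 2 * n))                  # stand-in for the Euclidean gradient

# Pseudo-Riemannian gradient for the KM metric.
G = 0.5 * Q @ J @ (Q.T @ dF @ J - J @ dF.T @ Q)

# A tangent vector at Q (assumed parametrization V = Q J S, S symmetric).
S = rng.standard_normal((2 * n, 2 * n))
S = 0.5 * (S + S.T)
V = Q @ J @ S

Qinv = np.linalg.inv(Q)
symplecticity = np.linalg.norm(Q.T @ J @ Q - J)
tangency = np.linalg.norm(G.T @ J @ Q + Q.T @ J @ G)
# Assumed KM pairing tr(Q^{-1} G Q^{-1} V) against the Euclidean pairing tr(dF^T V).
compatibility = np.trace(Qinv @ G @ Qinv @ V) - np.trace(dF.T @ V)

print(symplecticity, tangency, compatibility)
```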

**Manifold of symmetric, positive-definite matrices**: The Riemannian gradient ${\mathrm{grad}}_{P}^{{\mathbb{S}}^{+}(p)}f$ of the function $f:{\mathbb{S}}^{+}(p)\to \mathbb{R}$ may be calculated as the unique vector in ${T}_{P}{\mathbb{S}}^{+}(p)$ that satisfies the following equation:$$\mathrm{tr}\left(V{\partial}_{P}f\right)=\mathrm{tr}\left(V{P}^{-1}\left({\mathrm{grad}}_{P}^{{\mathbb{S}}^{+}(p)}f\right){P}^{-1}\right),\phantom{\rule{4pt}{0ex}}\forall V\in {T}_{P}{\mathbb{S}}^{+}(p).$$Since V ranges over the symmetric matrices, the difference ${\partial}_{P}f-{P}^{-1}\left({\mathrm{grad}}_{P}^{{\mathbb{S}}^{+}(p)}f\right){P}^{-1}$ must be skew-symmetric and, since its second term is symmetric, it must coincide with the skew-symmetric part of ${\partial}_{P}f$. The solution of the above equation hence satisfies:$${\partial}_{P}f-{P}^{-1}\left({\mathrm{grad}}_{P}^{{\mathbb{S}}^{+}(p)}f\right){P}^{-1}={\textstyle \frac{1}{2}}\left({\partial}_{P}f-{\partial}_{P}^{\top}f\right),$$from which it follows that:$${\mathrm{grad}}_{P}^{{\mathbb{S}}^{+}(p)}f={\textstyle \frac{1}{2}}P\left({\partial}_{P}f+{\partial}_{P}^{\top}f\right)P.$$

**Grassmann manifold**: The Riemannian gradient ${\mathrm{grad}}_{\left[X\right]}^{\mathrm{Gr}(n,p)}f$ may be calculated by its definition and reads as:$${\mathrm{grad}}_{\left[X\right]}^{\mathrm{Gr}(n,p)}f=({I}_{n}-X{X}^{\top}){\partial}_{X}f.$$
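
As an illustration of the last two formulas, the following sketch (assuming NumPy; all variable names and the random test data are illustrative) evaluates the gradient on ${\mathbb{S}}^{+}(p)$ and on the Grassmann manifold. It checks that the former is symmetric and satisfies the trace condition above, and that the latter is orthogonal to the columns of X.

```python
import numpy as np

rng = np.random.default_rng(2)

# --- Symmetric positive-definite matrices ---
p = 4
A = rng.standard_normal((p, p))
P = A @ A.T + p * np.eye(p)            # a symmetric positive-definite point
dF = rng.standard_normal((p, p))       # stand-in for the Euclidean gradient (not symmetric)
grad_spd = 0.5 * P @ (dF + dF.T) @ P

V = rng.standard_normal((p, p))
V = 0.5 * (V + V.T)                    # tangent vectors are symmetric matrices
Pinv = np.linalg.inv(P)
spd_symmetry = np.linalg.norm(grad_spd - grad_spd.T)
spd_compat = np.trace(V @ dF) - np.trace(V @ Pinv @ grad_spd @ Pinv)

# --- Grassmann manifold ---
n, q = 6, 2
X, _ = np.linalg.qr(rng.standard_normal((n, q)))   # orthonormal representative of [X]
dFX = rng.standard_normal((n, q))
grad_gr = (np.eye(n) - X @ X.T) @ dFX
gr_orthogonality = np.linalg.norm(X.T @ grad_gr)   # X^T grad must vanish

print(spd_symmetry, spd_compat, gr_orthogonality)
```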

**Tangency**: ${\Pi}_{x}\left(a\right)\in {T}_{x}\mathbb{M}$ for $x\in \mathbb{M}$ and $a\in \mathbb{A}$.

**Complementarity**: ${\langle v,a-{\Pi}_{x}\left(a\right)\rangle}^{\mathbb{A}}=0$ for all $v\in {T}_{x}\mathbb{M}$.

**Observation**

**5.**

**Hypersphere**: An expression of an orthogonal projector ${\Pi}_{x}:{\mathbb{R}}^{p}\to {T}_{x}{\mathbb{S}}^{p-1}$, for $x\in {\mathbb{S}}^{p-1}$, where the ambient space ${\mathbb{R}}^{p}$ is endowed with a Euclidean metric, is:$${\Pi}_{x}\left(a\right):=({I}_{p}-x{x}^{\top})a.$$Notice that $a-{\Pi}_{x}\left(a\right)=\left({x}^{\top}a\right)x$ is radial (that is, directed along x) and hence normal.

**Stiefel manifold**: It might be useful to define an orthogonal projection operator ${\Pi}_{X}:{\mathbb{R}}^{n\times p}\to {T}_{X}\mathrm{St}(n,p)$, for $X\in \mathrm{St}(n,p)$. Let us assume that the ambient space $\mathbb{A}:={\mathbb{R}}^{n\times p}$ is endowed with a Euclidean metric ${\langle U,V\rangle}^{\mathbb{A}}:=\mathrm{tr}\left({U}^{\top}V\right)$. In this case, the orthogonal projection takes the expression$${\Pi}_{X}\left(U\right):=U-{\textstyle \frac{1}{2}}X({X}^{\top}U+{U}^{\top}X).$$

**Grassmann manifold**: An expression of orthogonal projection ${\Pi}_{\left[X\right]}:{\mathbb{R}}^{n\times p}\to {T}_{\left[X\right]}\mathrm{Gr}(n,p)$, for $\left[X\right]\in \mathrm{Gr}(n,p)$, is$${\Pi}_{\left[X\right]}\left(V\right):=({I}_{n}-X{X}^{\top})V.$$
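
The three projectors above lend themselves to a direct numerical check of the tangency and complementarity properties. The sketch below assumes NumPy and the Euclidean ambient metric; the function names are illustrative, not taken from the original text.

```python
import numpy as np

def proj_sphere(x, a):
    """Projection of a onto the tangent space of the hypersphere at x."""
    return a - x * (x @ a)

def proj_stiefel(X, U):
    """Projection of U onto T_X St(n,p), Euclidean ambient metric."""
    return U - 0.5 * X @ (X.T @ U + U.T @ X)

def proj_grassmann(X, V):
    """Projection of V onto the tangent space of Gr(n,p) at [X]."""
    return V - X @ (X.T @ V)

rng = np.random.default_rng(3)

# Hypersphere: tangency (x^T Pi(a) = 0) and idempotency.
p = 5
x = rng.standard_normal(p)
x /= np.linalg.norm(x)
a = rng.standard_normal(p)
pa = proj_sphere(x, a)
sphere_tangency = abs(x @ pa)
sphere_idempotency = np.linalg.norm(proj_sphere(x, pa) - pa)

# Stiefel manifold: tangency and complementarity.
n, q = 6, 3
X, _ = np.linalg.qr(rng.standard_normal((n, q)))
U = rng.standard_normal((n, q))
PU = proj_stiefel(X, U)
stiefel_tangency = np.linalg.norm(PU.T @ X + X.T @ PU)
W = proj_stiefel(X, rng.standard_normal((n, q)))      # an arbitrary tangent vector
stiefel_complementarity = abs(np.trace(W.T @ (U - PU)))

# Grassmann manifold: the projected matrix is orthogonal to the columns of X.
grassmann_tangency = np.linalg.norm(X.T @ proj_grassmann(X, U))

print(sphere_tangency, sphere_idempotency, stiefel_tangency,
      stiefel_complementarity, grassmann_tangency)
```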

**Example**

**25.**

**Example**

**26.**

**Example**

**27.**

**Example**

**28.**

#### 9.2. Application of Riemannian Gradient to Optimization on Manifold

- To determine those points ${x}_{\star}$ toward which the solutions of the differential Equation (303) tend;

**Theorem**

**4.**

**Proof.**

- The dynamical system $\dot{x}\left(t\right)=+{\mathrm{grad}}_{x\left(t\right)}f,\phantom{\rule{4pt}{0ex}}x\left(0\right)={x}_{0}\in \mathbb{M}$ generates a trajectory in the state space $\mathbb{M}$ that tends toward a point of maximum of the function f located near the initial state ${x}_{0}$,
- Conversely, the dynamical system $\dot{x}\left(t\right)=-{\mathrm{grad}}_{x\left(t\right)}f,\phantom{\rule{4pt}{0ex}}x\left(0\right)={x}_{0}\in \mathbb{M}$ generates a trajectory in the space $\mathbb{M}$ that tends toward a point of minimum of the function f located near the initial state ${x}_{0}$.

**Example**

**29.**

#### 9.3. A Golden Gradient Rule: Gradient of Squared Distance

**Example**

**30.**

**Theorem**

**5.**

**Proof.**

- An arbitrary sufficiently smooth curve ${\alpha}_{x}\left(t\right)$ such that ${\alpha}_{x}\left(0\right)=x$, while ${\dot{\alpha}}_{x}\left(t\right)$ is arbitrary;
- The fundamental form $\mathcal{F}(x,v):={\langle v,v\rangle}_{x}$;
- A geodesic curve $c(t,s)$ connecting the point y to the point ${\alpha}_{x}\left(t\right)$, emanating from the former, with parameter $s\in [0,\phantom{\rule{4pt}{0ex}}1]$;
- The partial derivatives $\dot{c}(t,s):={\textstyle \frac{\partial c}{\partial t}}(t,s)$ and ${c}^{\prime}(t,s):={\textstyle \frac{\partial c}{\partial s}}(t,s)$.

- (P$_{1}$) $c(t,0)=y$, therefore $\dot{c}(t,0)=0$;
- (P$_{2}$) $c(t,1)={\alpha}_{x}\left(t\right)$, therefore $\dot{c}(t,1)={\dot{\alpha}}_{x}\left(t\right)$;
- (P$_{3}$) ${c}^{\prime}(t,1)=-{\log}_{{\alpha}_{x}\left(t\right)}y$. (In fact, notice that if ${\gamma}_{x}^{y}:[0,\phantom{\rule{4pt}{0ex}}1]\to \mathbb{M}$ denotes a geodesic from x to y, then it holds that ${\dot{\gamma}}_{x}^{y}\left(0\right)={\log}_{x}y$ and ${\gamma}_{y}^{x}\left(s\right)={\gamma}_{x}^{y}(1-s)$; therefore, ${\dot{\gamma}}_{y}^{x}\left(1\right)=-{\dot{\gamma}}_{x}^{y}\left(0\right)=-{\log}_{x}y$. The stated property follows by replacing x with ${\alpha}_{x}\left(t\right)$.)

**Example**

**31.**

- Given a metric for the manifold $\mathbb{SO}\left(3\right)$, it is possible to define a distance function $d:\mathbb{SO}\left(3\right)\times \mathbb{SO}\left(3\right)\to {\mathbb{R}}_{0}^{+}$ between any pair of attitude matrices in $\mathbb{SO}\left(3\right)$;
- A hierarchy is established among the agents in a fleet through a set of weights ${w}_{ij}\ge 0$;
- Then, consensus optimization consists of determining an evolution law for each agent to minimize the distance between any pair of attitude matrices weighted according to the assigned hierarchy, namely, to minimize the function$${f}_{i}:=\frac{1}{2}\sum _{j=1}^{N}{w}_{ij}\phantom{\rule{0.166667em}{0ex}}{d}^{2}({R}_{i},{R}_{j}),\phantom{\rule{4pt}{0ex}}\text{with}\phantom{\rule{4pt}{0ex}}i=1,\phantom{\rule{4pt}{0ex}}2,\phantom{\rule{4pt}{0ex}}3,\phantom{\rule{4pt}{0ex}}\dots ,\phantom{\rule{4pt}{0ex}}N.$$Notice that $d({R}_{i},{R}_{i})=0$; hence, the values assigned to the diagonal coefficients ${w}_{ii}$ are unimportant.
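
The consensus scheme outlined in the list above may be sketched in code as follows. Several ingredients are assumptions made for illustration only: the metric ${\langle U,V\rangle}_{R}=\mathrm{tr}({U}^{\top}V)$, for which ${\log}_{{R}_{i}}{R}_{j}={R}_{i}\log ({R}_{i}^{\top}{R}_{j})$; the gradient of half the squared distance taken as $-{\log}_{{R}_{i}}{R}_{j}$; uniform weights; the stepsize h; and an update through the exponential map. The helper functions are hypothetical and the logarithm is valid only for rotation angles below $\pi $.

```python
import numpy as np

def hat(w):
    """Skew-symmetric matrix associated with a 3-vector (illustrative helper)."""
    return np.array([[0.0, -w[2], w[1]], [w[2], 0.0, -w[0]], [-w[1], w[0], 0.0]])

def so3_exp(W):
    """Matrix exponential of a 3x3 skew-symmetric matrix (Rodrigues formula)."""
    theta = np.sqrt(0.5 * np.sum(W * W))
    if theta < 1e-12:
        return np.eye(3) + W
    return (np.eye(3) + (np.sin(theta) / theta) * W
            + ((1.0 - np.cos(theta)) / theta**2) * (W @ W))

def so3_log(R):
    """Principal matrix logarithm of a rotation (angle assumed below pi)."""
    c = np.clip(0.5 * (np.trace(R) - 1.0), -1.0, 1.0)
    theta = np.arccos(c)
    if theta < 1e-12:
        return 0.5 * (R - R.T)
    return (theta / (2.0 * np.sin(theta))) * (R - R.T)

def distance(Ri, Rj):
    """Geodesic distance induced by the assumed metric tr(U^T V)."""
    return np.linalg.norm(so3_log(Ri.T @ Rj))

def consensus_step(Rs, w, h):
    """One synchronous step: each agent moves along minus the gradient of f_i."""
    new = []
    for i, Ri in enumerate(Rs):
        W = sum(w[i, j] * so3_log(Ri.T @ Rj) for j, Rj in enumerate(Rs))
        new.append(Ri @ so3_exp(h * W))
    return new

rng = np.random.default_rng(5)
N = 4
Rs = [so3_exp(hat(0.2 * rng.standard_normal(3))) for _ in range(N)]
w = np.ones((N, N))                      # uniform weights (illustrative)
spread0 = max(distance(Ri, Rj) for Ri in Rs for Rj in Rs)

for _ in range(200):
    Rs = consensus_step(Rs, w, h=0.1)

spread = max(distance(Ri, Rj) for Ri in Rs for Rj in Rs)
print(spread0, spread)
```

The update multiplies each attitude matrix by a rotation, so the agents remain on $\mathbb{SO}\left(3\right)$ at every step, and the largest pairwise distance is expected to shrink toward zero when the initial attitudes are sufficiently close to one another.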

#### 9.4. Riemannian Gradient in Coordinates*

- The Euclidean inner product reads $\langle {\partial}_{x}f,v\rangle =\frac{\partial f}{\partial {x}^{k}}{v}^{k}$.
- The inner product reads ${\langle {\mathrm{grad}}_{x}f,v\rangle}_{x}={g}_{ik}{\left({\mathrm{grad}}_{x}f\right)}^{i}{v}^{k}$.

**Sharp isomorphism (${}^{\sharp}$)**: The ‘sharp’ isomorphism ${}^{\sharp}:{T}^{*}\mathbb{M}\to T\mathbb{M}$ takes a cotangent vector and returns a tangent vector. Namely, given a cotangent vector $\omega ={\omega}_{i}\phantom{\rule{0.166667em}{0ex}}\mathrm{d}{x}^{i}$, the sharp isomorphism acts like $v:={\omega}^{\sharp}={g}^{ij}{\omega}_{i}{\partial}_{j}$. For example, in the case of gradient, one may say that ${\mathrm{grad}}_{x}f={\left(\mathrm{d}{f}_{x}\right)}^{\sharp}$.

**Flat isomorphism (${}^{\flat}$)**: The ‘flat’ isomorphism ${}^{\flat}:T\mathbb{M}\to {T}^{*}\mathbb{M}$ takes a tangent vector and returns a cotangent vector. Namely, given a tangent vector $v={v}^{i}\phantom{\rule{0.166667em}{0ex}}{\partial}_{i}^{x}$, the flat isomorphism acts like $\omega :={v}^{\flat}={g}_{ij}{v}^{i}\mathrm{d}{x}^{j}$. Hence, the differential is the dual of the gradient via the flat isomorphism, namely $\mathrm{d}{f}_{x}={\left({\mathrm{grad}}_{x}f\right)}^{\flat}$.
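
In coordinates, the sharp and flat isomorphisms amount to raising and lowering an index by means of the metric tensor. The sketch below illustrates this on the plane described in polar coordinates $(r,\theta )$, where ${g}_{ij}=\mathrm{diag}(1,{r}^{2})$; the choice of coordinates and of the function $f={r}^{2}\sin \theta $ is an assumption made only for the sake of the example.

```python
import numpy as np

def metric_polar(q):
    """Metric tensor g_ij of the Euclidean plane in polar coordinates q = (r, theta)."""
    r, _ = q
    return np.diag([1.0, r**2])

def sharp(g, omega):
    """Raise an index: v^j = g^{ij} omega_i."""
    return np.linalg.solve(g, omega)

def flat(g, v):
    """Lower an index: omega_j = g_ij v^i."""
    return g @ v

q = np.array([2.0, 0.3])
r, th = q
# Components of the differential of f = r^2 sin(theta): (df/dr, df/dtheta).
df = np.array([2.0 * r * np.sin(th), r**2 * np.cos(th)])

g = metric_polar(q)
grad = sharp(g, df)                   # components of the Riemannian gradient

v = np.array([0.7, -1.2])             # components of an arbitrary tangent vector
pairing_euclidean = df @ v            # (df/dx^k) v^k
pairing_metric = grad @ g @ v         # g_ik (grad f)^i v^k
print(grad, pairing_euclidean, pairing_metric)
```

The two pairings coincide by construction, which is the coordinate form of the compatibility condition, and `flat(g, grad)` returns the components of the differential.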

## 10. Parallelism and Parallel Transport along a Curve

#### 10.1. Properties and Definition of Parallel Transport

**Example**

**32.**

**Parallel transport along a geodesic arc joining two given points**: Given two (sufficiently close) points $x,y\in \mathbb{M}$, and a tangent vector $w\in {T}_{x}\mathbb{M}$, the parallel transport of w from x to y along the geodesic arc connecting x to y is denoted by ${\mathbb{P}}^{x\to y}\left(w\right)$, since this notation shows all relevant information. In fact, the notation ${\mathbb{P}}^{x\to y}$ for the parallel transport operator is a shortened version of ${\mathbb{P}}_{{\gamma}_{x}^{y}}^{0\to 1}:{T}_{x}\mathbb{M}\to {T}_{y}\mathbb{M}$, where ${\gamma}_{x}^{y}:[0,\phantom{\rule{4pt}{0ex}}1]\to \mathbb{M}$ denotes the geodesic arc such that ${\gamma}_{x}^{y}\left(0\right)=x$ and ${\gamma}_{x}^{y}\left(1\right)=y$.

**Parallel transport along a geodesic specified by a point and a tangent direction**: Given a point $x\in \mathbb{M}$, and a tangent direction $v\in {T}_{x}\mathbb{M}$, the parallel transport of a vector $w\in {T}_{x}\mathbb{M}$ along the geodesic arc departing from the point x toward the direction v is denoted by ${\mathbb{P}}_{{\gamma}_{x,v}}^{0\to t}\left(w\right)$, where ${\gamma}_{x,v}:[-\epsilon ,\phantom{\rule{4pt}{0ex}}\epsilon ]\to \mathbb{M}$, with $\epsilon >0$ and $t\in [-\epsilon ,\phantom{\rule{4pt}{0ex}}\epsilon ]$, denotes the geodesic curve such that ${\gamma}_{x,v}\left(0\right)=x$ and ${\dot{\gamma}}_{x,v}\left(0\right)=v$.

**Self parallel transport along a geodesic arc**: A distinguishing feature of a geodesic arc is that it parallel-transports its own initial slope. In a setting where the parallel transport operator is unknown in closed form, there is no closed-form way to transport a generic vector w along a geodesic curve ${\gamma}_{x,v}$. However, it is possible to parallel-transport the vector v itself along the geodesic ${\gamma}_{x,v}$, namely to compute ${\mathbb{P}}_{{\gamma}_{x,v}}^{0\to t}\left(v\right)$: it coincides exactly with ${\dot{\gamma}}_{x,v}\left(t\right)$. Such a numerical trick was invoked, for instance, in [39], Subsection III.A.

**Parallel transport along a closed loop**: Parallel transport of a vector along a (piece-wise continuous) loop $\ell $ is denoted as ${\mathbb{P}}_{\ell}:{T}_{x}\mathbb{M}\to {T}_{x}\mathbb{M}$, where x is termed the base of the loop. In general, ${\mathbb{P}}_{\ell}\left(v\right)\ne v$, for $v\in {T}_{x}\mathbb{M}$. This phenomenon is referred to as anholonomy. Whenever ${\mathbb{P}}_{\ell}$ realizes an isometry, since ${\parallel {\mathbb{P}}_{\ell}\left(v\right)\parallel}_{x}={\parallel v\parallel}_{x}$, the operator ${\mathbb{P}}_{\ell}$ changes only the orientation of v. For example, if $\mathbb{M}\subset {\mathbb{R}}^{3}$, then we may say that ${\mathbb{P}}_{\ell}\left(v\right)$ is a rotated version of a tangent vector v lying in the same tangent space, namely, ${\mathbb{P}}_{\ell}$ may be represented as an element of the orthogonal group $\mathbb{O}\left(3\right)$. Intuitively, holonomy is specifically related to curved spaces, and therefore holonomy must be related to curvature. This intuition is made precise by the Ambrose–Singer theorem [60].

#### 10.2. Coordinate-Free Derivation of Parallel Transport

**Tangency condition**: for every $t\in [0,\phantom{\rule{4pt}{0ex}}1]$ it should hold that ${\mathbb{P}}_{\gamma}^{0\to t}\left(w\right)\in {T}_{\gamma \left(t\right)}\mathbb{M}$.

**Conformal isometry condition**: for every value of the parameter $t\in [0,\phantom{\rule{4pt}{0ex}}1]$, the inner product ${\langle {\mathbb{P}}_{\gamma}^{0\to t}\left(u\right),{\mathbb{P}}_{\gamma}^{0\to t}\left(w\right)\rangle}_{\gamma \left(t\right)}$ should remain equal to ${\langle u,w\rangle}_{x}$ along the curve.

- $\overline{u}\left(t\right):={\mathbb{P}}_{\gamma}^{0\to t}\left(u\right)$: a vector field $(\gamma \left(t\right),\overline{u}\left(t\right))$ that represents the evolution, over the curve $\gamma $, of the vector u, which is tangent to the manifold at $\gamma \left(0\right)$ and, after transport, is tangent at $\gamma \left(1\right)$,
- $\overline{w}\left(t\right):={\mathbb{P}}_{\gamma}^{0\to t}\left(w\right)$: analogous to $\overline{u}\left(t\right)$; both are transport fields,
- $\mathcal{Q}\left(t\right):={\langle \overline{u}\left(t\right),\overline{w}\left(t\right)\rangle}_{\gamma \left(t\right)}$: a scalar field that represents the inner product between the above two transport fields along the curve $\gamma $.

**Taylor series expansion of the kernel G**: Given ${G}_{x}\left(v\right)$ with $x\in \mathbb{M}$, $v\in {T}_{x}\mathbb{M}$, let us write$${G}_{{\gamma}_{x,w}\left(t\right)}\left(v\right)={G}_{x}\left(v\right)+{G}_{x}^{\bullet}(v,t\phantom{\rule{0.166667em}{0ex}}w)+\mathcal{O}\left({t}^{2}\right),$$where$${G}_{x}^{\bullet}(v,w):={\left.\frac{\mathrm{d}}{\mathrm{d}t}{G}_{{\gamma}_{x,w}\left(t\right)}\left(v\right)\right|}_{t=0}.$$

**Commutativity with the inner product in the ambient space**: In the expression ${\langle u,{G}_{x}^{\bullet}(v,w)\rangle}^{\mathbb{A}}$ it is allowed to swap the arguments u and v. In fact, notice that$$\begin{array}{rl}{\langle u,{G}_{x}^{\bullet}(v,w)\rangle}^{\mathbb{A}}=& {\left\langle u,{\left.\frac{\mathrm{d}}{\mathrm{d}t}{G}_{{\gamma}_{x,w}\left(t\right)}\left(v\right)\right|}_{t=0}\right\rangle}^{\mathbb{A}}={\left.\frac{\mathrm{d}}{\mathrm{d}t}{\langle u,{G}_{{\gamma}_{x,w}\left(t\right)}\left(v\right)\rangle}^{\mathbb{A}}\right|}_{t=0}\\ =& {\left.\frac{\mathrm{d}}{\mathrm{d}t}{\langle {G}_{{\gamma}_{x,w}\left(t\right)}\left(u\right),v\rangle}^{\mathbb{A}}\right|}_{t=0}={\left\langle {\left.\frac{\mathrm{d}}{\mathrm{d}t}{G}_{{\gamma}_{x,w}\left(t\right)}\left(u\right)\right|}_{t=0},v\right\rangle}^{\mathbb{A}}={\langle {G}_{x}^{\bullet}(u,w),v\rangle}^{\mathbb{A}},\end{array}$$where the passage from the first row to the second one relies on the symmetry of the metric, namely ${\langle u,{G}_{x}\left(v\right)\rangle}^{\mathbb{A}}={\langle {G}_{x}\left(u\right),v\rangle}^{\mathbb{A}}$. This property of ${G}_{x}^{\bullet}$ holds even if the first argument of the derivative belongs to $\mathbb{A}$.

**Partial derivatives of the fundamental form**: Let us recall the definition of the fundamental form $\mathcal{F}(x,v):={\langle v,{G}_{x}\left(v\right)\rangle}^{\mathbb{A}}$. Its partial derivatives read$$\frac{\partial \mathcal{F}}{\partial x}(x,v)={G}_{x}^{\bullet}(v,v),\phantom{\rule{4pt}{0ex}}\frac{\partial \mathcal{F}}{\partial v}(x,v)=2\phantom{\rule{0.166667em}{0ex}}{G}_{x}\left(v\right).$$

**Example**

**33.**

**Example**

**34.**

**Tangency**: for every $t\in [0,\phantom{\rule{4pt}{0ex}}1]$ it holds that ${\mathbb{P}}_{{\gamma}_{x}^{y}}^{0\to t}\left({\dot{\gamma}}_{x}^{y}\left(0\right)\right)\in {T}_{{\gamma}_{x}^{y}\left(t\right)}\mathbb{M}$; in fact, ${\mathbb{P}}_{{\gamma}_{x}^{y}}^{0\to t}\left({\dot{\gamma}}_{x}^{y}\left(0\right)\right)={\dot{\gamma}}_{x}^{y}\left(t\right)$ is tangent to the curve at every point ${\gamma}_{x}^{y}\left(t\right)$.

**Conformal isometry**: for every $t\in [0,\phantom{\rule{4pt}{0ex}}1]$ it holds that ${\langle {\mathbb{P}}_{{\gamma}_{x}^{y}}^{0\to t}\left({\dot{\gamma}}_{x}^{y}\left(0\right)\right),{\mathbb{P}}_{{\gamma}_{x}^{y}}^{0\to t}\left({\dot{\gamma}}_{x}^{y}\left(0\right)\right)\rangle}_{{\gamma}_{x}^{y}\left(t\right)}={\parallel {\dot{\gamma}}_{x}^{y}\left(t\right)\parallel}_{{\gamma}_{x}^{y}\left(t\right)}^{2}$, which is constant along a geodesic.

**Example**

**35.**

**Linearity**: The linearity of ${\mathbb{P}}^{x\to y}\left(w\right)$ with respect to w (but not with respect to x and y) is quite apparent.

**Identity**: Letting $y\equiv x$ leads to an identity map. In fact, it holds that ${\mathbb{P}}^{x\to x}\left(w\right)=\left[{I}_{n}-\frac{(x+x){x}^{\top}}{1+{x}^{\top}x}\right]w=({I}_{n}-x{x}^{\top})w=w-x\left({x}^{\top}w\right)=w-0=w$.

**Tangency**: It should hold that ${y}^{\top}{\mathbb{P}}^{x\to y}\left(w\right)=0$. In fact: ${y}^{\top}\left[{I}_{n}-\frac{(x+y){y}^{\top}}{1+{x}^{\top}y}\right]w={y}^{\top}w-\frac{{y}^{\top}x+1}{1+{x}^{\top}y}\phantom{\rule{0.166667em}{0ex}}{y}^{\top}w=0$.
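
The three properties listed above can be checked numerically. The sketch below assumes that the transport operator under examination has the form ${\mathbb{P}}^{x\to y}\left(w\right)=\left[{I}_{n}-\frac{(x+y){y}^{\top}}{1+{x}^{\top}y}\right]w$, as suggested by the identity and tangency computations; the function name and the random test data are illustrative. It also checks that the norm of the transported vector is preserved.

```python
import numpy as np

def sphere_transport(x, y, w):
    """Parallel transport of w from T_x to T_y along the geodesic arc (y != -x)."""
    return w - (x + y) * (y @ w) / (1.0 + x @ y)

rng = np.random.default_rng(6)
n = 5
x = rng.standard_normal(n)
x /= np.linalg.norm(x)
y = rng.standard_normal(n)
y /= np.linalg.norm(y)

def random_tangent(x):
    """Hypothetical helper: a random vector orthogonal to x."""
    a = rng.standard_normal(n)
    return a - x * (x @ a)

w1, w2 = random_tangent(x), random_tangent(x)
Pw1 = sphere_transport(x, y, w1)

identity_err = np.linalg.norm(sphere_transport(x, x, w1) - w1)
tangency_err = abs(y @ Pw1)
isometry_err = abs(np.linalg.norm(Pw1) - np.linalg.norm(w1))
linearity_err = np.linalg.norm(
    sphere_transport(x, y, 2.0 * w1 - 3.0 * w2)
    - (2.0 * Pw1 - 3.0 * sphere_transport(x, y, w2))
)
print(identity_err, tangency_err, isometry_err, linearity_err)
```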

**Example**

**36.**

**Identity**: Setting $Q\equiv P$ makes parallel transport collapse to an identity map. Indeed, it holds that ${\mathbb{P}}^{P\to P}\left(W\right)=\sqrt{P{P}^{-1}}\phantom{\rule{0.166667em}{0ex}}W\sqrt{{P}^{-1}P}=W$.

**Tangency**: It holds that ${\left({\mathbb{P}}^{P\to Q}\left(W\right)\right)}^{\top}={\mathbb{P}}^{P\to Q}\left(W\right)$. In fact, since P, Q and W are symmetric, we have ${\left({\mathbb{P}}^{P\to Q}\left(W\right)\right)}^{\top}={\left({\left({P}^{-1}Q\right)}^{\frac{1}{2}}\right)}^{\top}{W}^{\top}{\left({\left(Q{P}^{-1}\right)}^{\frac{1}{2}}\right)}^{\top}={\left({Q}^{\top}{P}^{-\top}\right)}^{\frac{1}{2}}W{\left({P}^{-\top}{Q}^{\top}\right)}^{\frac{1}{2}}=\sqrt{Q{P}^{-1}}\phantom{\rule{0.166667em}{0ex}}W\sqrt{{P}^{-1}Q}={\mathbb{P}}^{P\to Q}\left(W\right)$.
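
The transport rule ${\mathbb{P}}^{P\to Q}\left(W\right)=\sqrt{Q{P}^{-1}}\phantom{\rule{0.166667em}{0ex}}W\sqrt{{P}^{-1}Q}$ involves square roots of non-symmetric matrices. One possible way to evaluate it, sketched below, exploits the similarity $Q{P}^{-1}={P}^{\frac{1}{2}}({P}^{-\frac{1}{2}}Q{P}^{-\frac{1}{2}}){P}^{-\frac{1}{2}}$, so that only symmetric eigenvalue decompositions are needed. This evaluation strategy, the helper names, and the metric $\mathrm{tr}({P}^{-1}U{P}^{-1}V)$ used in the isometry check are assumptions made for illustration.

```python
import numpy as np

def spd_power(P, a):
    """Power P^a of a symmetric positive-definite matrix via eigendecomposition."""
    lam, V = np.linalg.eigh(P)
    return (V * lam**a) @ V.T

def sqrt_QPinv(P, Q):
    """Principal square root of Q P^{-1}, obtained through a similarity with an SPD matrix."""
    Ph, Pmh = spd_power(P, 0.5), spd_power(P, -0.5)
    return Ph @ spd_power(Pmh @ Q @ Pmh, 0.5) @ Pmh

def spd_transport(P, Q, W):
    """Parallel transport of the symmetric matrix W from T_P to T_Q."""
    E = sqrt_QPinv(P, Q)
    return E @ W @ E.T        # sqrt(P^{-1} Q) is the transpose of sqrt(Q P^{-1})

rng = np.random.default_rng(7)
p = 4

def random_spd():
    A = rng.standard_normal((p, p))
    return A @ A.T + p * np.eye(p)

P, Q = random_spd(), random_spd()
W = rng.standard_normal((p, p))
W = 0.5 * (W + W.T)
T = spd_transport(P, Q, W)

E = sqrt_QPinv(P, Q)
sqrt_err = np.linalg.norm(E @ E - Q @ np.linalg.inv(P))
identity_err = np.linalg.norm(spd_transport(P, P, W) - W)
symmetry_err = np.linalg.norm(T - T.T)
Pinv, Qinv = np.linalg.inv(P), np.linalg.inv(Q)
isometry_err = abs(np.trace(Pinv @ W @ Pinv @ W) - np.trace(Qinv @ T @ Qinv @ T))
print(sqrt_err, identity_err, symmetry_err, isometry_err)
```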

**Example**

**37.**

**Hypercube**: The hypercube endowed with the standard Euclidean metric is a flat space; hence, parallel transport may be realized as a rigid translation. Namely, parallel transport is an identity map.

**Hypersphere**: Parallel transport on the hypersphere ${\mathbb{S}}^{p-1}$ of the tangent vector $w\in {T}_{x}{\mathbb{S}}^{p-1}$ along the geodesic arc ${\gamma}_{x,v}$ of an extent t may be computed by [39]:$${\mathbb{P}}_{{\gamma}_{x,v}}^{0}$$