Two criterion functions on the pseudo-orthogonal group that encode the notion of average squared distance may be constructed on the basis of the Frobenius norm and of the induced geodesic distance on the pseudo-orthogonal group $O(p,q)$. The related function minimization problem can be solved numerically by a pseudo-Riemannian-gradient-based algorithm.
In the present section, we deal with a matrix manifold; it therefore pays to recall that, on a matrix space, the Euclidean inner product is defined as ${\langle A,B\rangle}^{\mathrm{E}}:=\mathrm{tr}({A}^{\top}B)$, where ${}^{\top}$ denotes matrix transpose and “$\mathrm{tr}$” denotes the matrix trace operator. The matrix Frobenius norm is defined by ${\parallel A\parallel}_{\mathrm{F}}:=\sqrt{{\langle A,A\rangle}^{\mathrm{E}}}$.
3.1. Pseudo-Riemannian Geometric Structure of the Pseudo-Orthogonal Group
A pseudo-orthogonal group $O(p,q)$, as a non-compact matrix Lie group [29], is an instance of quadratic groups and is defined by
$$O(p,q):=\{X\in {\mathbb{R}}^{(p+q)\times (p+q)}\mid {X}^{\top}{R}_{p,q}X={R}_{p,q}\},\quad {R}_{p,q}:=\left(\begin{array}{cc}{I}_{p}& {O}_{p\times q}\\ {O}_{q\times p}& -{I}_{q}\end{array}\right),$$
where the symbol ${I}_{p}$ denotes a $p\times p$ identity matrix and the symbol ${O}_{p\times q}$ denotes a whole-zero $p\times q$ matrix. The matrix ${R}_{p,q}$ enjoys the properties ${R}_{p,q}^{2}={I}_{p+q}$ and ${R}_{p,q}^{\top}={R}_{p,q}^{-1}={R}_{p,q}$. Pseudo-orthogonal matrices enjoy two properties that are recalled in the following.
Lemma 2. Any pseudo-orthogonal matrix $X\in O(p,q)$ is invertible. In addition, it holds that ${\parallel X\parallel}_{\mathrm{F}}={\parallel {X}^{-1}\parallel}_{\mathrm{F}}$.
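Proof. From the defining property ${X}^{\top}{R}_{p,q}X={R}_{p,q}$, it follows that $\det({X}^{\top}{R}_{p,q}X)=\det({R}_{p,q})$, hence $\det{(X)}^{2}=1$, which proves the first assertion. From the definition of pseudo-orthogonal matrices, there follow the identities
$${X}^{-1}={R}_{p,q}{X}^{\top}{R}_{p,q},\quad {X}^{-\top}={R}_{p,q}X{R}_{p,q},\tag{14}$$
which are often useful in the course of calculations. By the first identity in Equation (14), it is easy to see that
$${\parallel {X}^{-1}\parallel}_{\mathrm{F}}^{2}=\mathrm{tr}({X}^{-\top}{X}^{-1})=\mathrm{tr}({X}^{-\top}{R}_{p,q}{X}^{\top}{R}_{p,q})=\mathrm{tr}({R}_{p,q}{X}^{-\top}{R}_{p,q}{X}^{\top})$$
by the circular shift property of the matrix trace operator. From the second identity in Equation (14), it follows that
$${\parallel {X}^{-1}\parallel}_{\mathrm{F}}^{2}=\mathrm{tr}({R}_{p,q}^{2}X{R}_{p,q}^{2}{X}^{\top})=\mathrm{tr}(X{X}^{\top})={\parallel X\parallel}_{\mathrm{F}}^{2},$$
which proves the second assertion. □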
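Lemma 2 can be checked numerically. The following Python sketch is only an illustration: NumPy, the helper names `make_R` and `boost`, and the choice of a hyperbolic rotation as a sample element of $O(1,1)$ are assumptions that do not appear in the original text.

```python
import numpy as np

def make_R(p, q):
    """Signature matrix R_{p,q} = diag(I_p, -I_q)."""
    return np.diag(np.concatenate([np.ones(p), -np.ones(q)]))

def boost(s):
    """A hyperbolic rotation, which belongs to O(1,1) (illustrative helper)."""
    return np.array([[np.cosh(s), np.sinh(s)],
                     [np.sinh(s), np.cosh(s)]])

R = make_R(1, 1)
X = boost(0.7)

# Defining property of the group: X^T R X = R.
in_group = np.allclose(X.T @ R @ X, R)
# det(X)^2 = 1, hence X is invertible.
det_sq = np.linalg.det(X) ** 2
# Inverse obtained from the identity X^{-1} = R X^T R.
X_inv = R @ X.T @ R
inverse_ok = np.allclose(X_inv @ X, np.eye(2))
# Lemma 2: the Frobenius norms of X and of its inverse coincide.
norms_equal = np.isclose(np.linalg.norm(X, "fro"),
                         np.linalg.norm(X_inv, "fro"))
```

The inverse is formed without any linear solve, which is one practical reason why the identity ${X}^{-1}={R}_{p,q}{X}^{\top}{R}_{p,q}$ is convenient in calculations.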
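The tangent bundle of a pseudo-orthogonal Lie group has the structure
$$TO(p,q)=\{(X,V)\mid X\in O(p,q),\ V\in {\mathbb{R}}^{(p+q)\times (p+q)},\ {V}^{\top}{R}_{p,q}X+{X}^{\top}{R}_{p,q}V={O}_{(p+q)\times (p+q)}\}$$
and the tangent space at the identity of $O(p,q)$, namely the Lie algebra $\mathfrak{o}(p,q)$, has the structure
$$\mathfrak{o}(p,q)=\{H\in {\mathbb{R}}^{(p+q)\times (p+q)}\mid {H}^{\top}{R}_{p,q}+{R}_{p,q}H={O}_{(p+q)\times (p+q)}\}.$$
By the embedding of $O(p,q)$ into the Euclidean space ${\mathbb{R}}^{{(p+q)}^{2}}$, the normal space at any point $X\in O(p,q)$ is defined by
$${N}_{X}O(p,q)=\{N\in {\mathbb{R}}^{(p+q)\times (p+q)}\mid \mathrm{tr}({N}^{\top}V)=0,\ \forall V\in {T}_{X}O(p,q)\}.$$
The tangent space, the Lie algebra and the normal space associated to the group-manifold $O(p,q)$ can be characterized as follows
$${T}_{X}O(p,q)=\{X{R}_{p,q}\mathsf{\Omega}\mid {\mathsf{\Omega}}^{\top}=-\mathsf{\Omega}\},\quad \mathfrak{o}(p,q)=\{{R}_{p,q}\mathsf{\Omega}\mid {\mathsf{\Omega}}^{\top}=-\mathsf{\Omega}\},\quad {N}_{X}O(p,q)=\{{R}_{p,q}XS\mid {S}^{\top}=S\}.$$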
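The characterization of the tangent and normal spaces used in the proofs of this section can be verified numerically. The sketch below assumes that tangent vectors take the form $V=X{R}_{p,q}\mathsf{\Omega}$ with $\mathsf{\Omega}$ skew-symmetric and that normal vectors take the form $N={R}_{p,q}XS$ with $S$ symmetric. NumPy, the sample point in $O(2,1)$ and the helper names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
p, q = 2, 1
n = p + q
R = np.diag(np.concatenate([np.ones(p), -np.ones(q)]))

# A sample point of O(2,1): a hyperbolic rotation acting on the last two axes.
s = 0.4
X = np.array([[1.0, 0.0, 0.0],
              [0.0, np.cosh(s), np.sinh(s)],
              [0.0, np.sinh(s), np.cosh(s)]])

def random_skew(n, rng):
    M = rng.standard_normal((n, n))
    return M - M.T

def random_sym(n, rng):
    M = rng.standard_normal((n, n))
    return M + M.T

V = X @ R @ random_skew(n, rng)    # tangent vector V = X R Omega
Nrm = R @ X @ random_sym(n, rng)   # normal vector  N = R X S

# Tangency condition: V^T R X + X^T R V = 0.
tangency_residual = np.linalg.norm(V.T @ R @ X + X.T @ R @ V)
# Normal vectors are Euclidean-orthogonal to tangent vectors: tr(N^T V) = 0.
euclid_inner = np.trace(Nrm.T @ V)
```

Both quantities vanish up to rounding errors, because $\mathrm{tr}({N}^{\top}V)=\mathrm{tr}(S\mathsf{\Omega})$ and the trace of the product of a symmetric matrix and a skew-symmetric matrix is zero.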
Let us consider the following indefinite inner product on the general linear group $GL(p,\mathbb{R})$
$${\langle U,V\rangle}_{X}:=\mathrm{tr}({X}^{-1}U{X}^{-1}V),\tag{19}$$
referred to as the Khvedelidze–Mladenov metric (see [30]). The following lemma proves two properties of such metric applied to the pseudo-orthogonal group.
Lemma 3. The Khvedelidze–Mladenov metric on the pseudo-orthogonal group $O(p,q)$ is: (i) non-degenerate; and (ii) indefinite.
Proof. Let us prove the two parts of the lemma separately.
Proof of Part (i): A metric on a finite-dimensional space is non-degenerate if and only if ${\langle U,V\rangle}_{X}=0$ for every $U$ implies $V=0$. Given $X\in O(p,q)$ and $U,V\in {T}_{X}O(p,q)$, by the structure of the tangent space ${T}_{X}O(p,q)$, it is known that ${X}^{-1}U={R}_{p,q}\mathsf{\Omega}$, with ${\mathsf{\Omega}}^{\top}=-\mathsf{\Omega}$ and ${X}^{-1}V={R}_{p,q}\mathsf{\Psi}$, with ${\mathsf{\Psi}}^{\top}=-\mathsf{\Psi}$, therefore, ${\langle U,V\rangle}_{X}=\mathrm{tr}({R}_{p,q}\mathsf{\Omega}{R}_{p,q}\mathsf{\Psi})$. Let us define ${\mathsf{\Omega}}_{\star}:={R}_{p,q}\mathsf{\Omega}{R}_{p,q}$. Calculations show that ${\mathsf{\Omega}}_{\star}^{\top}={R}_{p,q}^{\top}{\mathsf{\Omega}}^{\top}{R}_{p,q}^{\top}={R}_{p,q}(-\mathsf{\Omega}){R}_{p,q}=-{\mathsf{\Omega}}_{\star}$, hence, ${\mathsf{\Omega}}_{\star}$ is a skew-symmetric matrix. The proof of the claim follows from the observation that
$${\langle U,V\rangle}_{X}=\mathrm{tr}({\mathsf{\Omega}}_{\star}\mathsf{\Psi})=-\mathrm{tr}({\mathsf{\Omega}}_{\star}^{\top}\mathsf{\Psi})=-{\langle {\mathsf{\Omega}}_{\star},\mathsf{\Psi}\rangle}^{\mathrm{E}}$$
and that the Euclidean inner product is non-degenerate on the Lie algebra of skew-symmetric matrices.
Proof of Part (ii): Given $X\in O(p,q)$ and $V\in {T}_{X}O(p,q)$, it is known that ${X}^{-1}V={R}_{p,q}\mathsf{\Omega}$, with ${\mathsf{\Omega}}^{\top}=-\mathsf{\Omega}$ and
$$\mathsf{\Omega}=\left(\begin{array}{cc}A& B\\ -{B}^{\top}& C\end{array}\right),$$
with ${A}^{\top}=-A$, ${C}^{\top}=-C$ and $B$ arbitrary. Hence, ${\parallel V\parallel}_{X}^{2}=\mathrm{tr}({({X}^{-1}V)}^{2})=2\mathrm{tr}(B{B}^{\top})-\mathrm{tr}(A{A}^{\top})-\mathrm{tr}(C{C}^{\top})$ has indefinite sign. □
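The indefiniteness stated in Lemma 3 can be illustrated with three tangent vectors at the identity of $O(2,1)$. The following sketch assumes the squared norm ${\parallel V\parallel}_{X}^{2}=\mathrm{tr}({({X}^{-1}V)}^{2})$ and the block structure of $\mathsf{\Omega}$ described in the proof. NumPy, the helper name `km_sq_norm` and the specific matrices are illustrative choices.

```python
import numpy as np

R = np.diag([1.0, 1.0, -1.0])
X = np.eye(3)  # the identity belongs to O(2,1)

def km_sq_norm(X, V):
    """Squared Khvedelidze-Mladenov norm tr((X^{-1} V)^2)."""
    H = np.linalg.solve(X, V)
    return np.trace(H @ H)

# Omega with only the skew-symmetric block A non-zero (B = 0, C = 0).
Omega_A = np.array([[0.0, 1.0, 0.0],
                    [-1.0, 0.0, 0.0],
                    [0.0, 0.0, 0.0]])
# Omega with only the off-diagonal block B non-zero (A = 0, C = 0).
Omega_B = np.array([[0.0, 0.0, 1.0],
                    [0.0, 0.0, 0.0],
                    [-1.0, 0.0, 0.0]])

norm_A = km_sq_norm(X, X @ R @ Omega_A)               # -tr(A A^T) = -2
norm_B = km_sq_norm(X, X @ R @ Omega_B)               # 2 tr(B B^T) = 2
norm_0 = km_sq_norm(X, X @ R @ (Omega_A + Omega_B))   # the two terms cancel: 0
```

The three values are negative, positive and zero, respectively, so the same metric produces tangent vectors of all three kinds.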
The sign of the squared norm ${\parallel V\parallel}_{X}^{2}$ may be positive, negative or even zero; it vanishes whenever $2\mathrm{tr}(B{B}^{\top})+\mathrm{tr}({A}^{2})+\mathrm{tr}({C}^{2})=0$. Let us take, as a special case, the pseudo-orthogonal group $O(1,1)$, which is the group-manifold of choice in some of the numerical examples presented in Section 4 thanks to its low dimensionality, for which the following result holds:
Lemma 4. The Khvedelidze–Mladenov metric in Equation (19) on $O(1,1)$ is positive-definite.
Proof. Every element of $O(1,1)$ can be written in one of the four forms:
where
s is any real number. Moreover, the inverses of the above representations, which are necessary in the evaluation of the norms, read
The tangent vectors corresponding to the above four representations take the form
where
t is any real number. Hence, straightforward calculations lead to the following values for the tangent vector norms
By direct calculations, the assertion follows. □
It is worth remarking that, in particular, from the proof of the above lemma, it follows that ${T}_{X}^{0}O(1,1)=\{{O}_{2\times 2}\}$, for every $X\in O(1,1)$.
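Lemma 4 can be illustrated numerically. In $O(1,1)$ the skew-symmetric blocks $A$ and $C$ of the proof of Lemma 3 are $1\times 1$ and therefore vanish, so that the squared norm of the tangent vector $X{R}_{1,1}\mathsf{\Omega}$, with off-diagonal entry $t$ in $\mathsf{\Omega}$, reduces to $2{t}^{2}$. The sketch below assumes this parametrization; NumPy, the sample point and the helper names are illustrative.

```python
import numpy as np

R = np.diag([1.0, -1.0])

def km_sq_norm(X, V):
    """Squared Khvedelidze-Mladenov norm tr((X^{-1} V)^2)."""
    H = np.linalg.solve(X, V)
    return np.trace(H @ H)

def tangent(X, t):
    """Tangent vector X R Omega with Omega = [[0, t], [-t, 0]]."""
    Omega = np.array([[0.0, t], [-t, 0.0]])
    return X @ R @ Omega

s = -1.3
X = np.array([[np.cosh(s), np.sinh(s)],
              [np.sinh(s), np.cosh(s)]])   # a point of O(1,1)

# Pairs (t, squared norm); each squared norm should equal 2 t^2.
values = [(t, km_sq_norm(X, tangent(X, t))) for t in (-2.0, 0.5, 3.0)]
```

The squared norm is zero only for $t=0$, in agreement with the remark that the only null tangent vector on $O(1,1)$ is the zero matrix.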
Under the pseudo-Riemannian metric in Equation (19), it is possible to compute the expression of a geodesic over the pseudo-orthogonal group in closed form. To compute the expression of a geodesic curve on $O(p,q)$, we invoke the variational formulation recalled in Section 2.1.
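Theorem 1. The geodesic ${\gamma}_{X,V}:[0,1]\to O(p,q)$, with $X\in O(p,q)$, $V\in {T}_{X}O(p,q)$ corresponding to the indefinite Khvedelidze–Mladenov metric (19) has expression
$${\gamma}_{X,V}(t)=X\exp(t{X}^{-1}V).$$
Proof. On the strength of Lemma 1, the geodesic equation expressed in variational form reads
$$\delta {\int}_{0}^{1}\mathrm{tr}({\gamma}^{-1}\dot{\gamma}{\gamma}^{-1}\dot{\gamma})\,dt=0,$$
where the natural parametrization of the curve is assumed. By computing the variation above, we have that
$${\int}_{0}^{1}\mathrm{tr}\left(({\gamma}^{-1}\ddot{\gamma}{\gamma}^{-1}-{\gamma}^{-1}\dot{\gamma}{\gamma}^{-1}\dot{\gamma}{\gamma}^{-1})\delta \gamma \right)dt=0.$$
Since the variation $\delta \gamma \in {T}_{\gamma}O(p,q)$ is arbitrary, the transpose of the sum within the innermost parentheses must belong to the normal space at $\gamma $. By the structure of the normal space ${N}_{\gamma}O(p,q)$, we have
$${\gamma}^{-1}\ddot{\gamma}{\gamma}^{-1}-{\gamma}^{-1}\dot{\gamma}{\gamma}^{-1}\dot{\gamma}{\gamma}^{-1}=S{\gamma}^{\top}{R}_{p,q},\quad {S}^{\top}=S.$$
The curve $\gamma $ must belong entirely to the pseudo-orthogonal group, therefore ${\gamma}^{\top}{R}_{p,q}\gamma ={R}_{p,q}$. Differentiating this condition twice with respect to $t$ gives:
$${\ddot{\gamma}}^{\top}{R}_{p,q}\gamma +2{\dot{\gamma}}^{\top}{R}_{p,q}\dot{\gamma}+{\gamma}^{\top}{R}_{p,q}\ddot{\gamma}={O}_{(p+q)\times (p+q)}.$$
Substituting $\ddot{\gamma}=\dot{\gamma}{\gamma}^{-1}\dot{\gamma}+\gamma S{R}_{p,q}$ into the equation above yields $S=0$. Hence, the geodesic equation reads
$$\ddot{\gamma}=\dot{\gamma}{\gamma}^{-1}\dot{\gamma}.$$
Its solution, with initial conditions $\gamma (0)=X\in O(p,q)$ and $\dot{\gamma}(0)=V\in {T}_{X}O(p,q)$, is found to be ${\gamma}_{X,V}(t)=X\exp(t{X}^{-1}V)$. □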
In the above result, the symbol “exp” denotes the matrix exponential, defined on the basis of a Taylor series expansion. For low dimensions $p+q$, the matrix exponential may be computed through special Rodrigues-like closed-form expressions [31].
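The closed-form geodesic ${\gamma}_{X,V}(t)=X\exp(t{X}^{-1}V)$ is simple to evaluate numerically. The sketch below is a minimal illustration under stated assumptions: the helper `expm` is a basic scaling-and-squaring Taylor approximation written for this example (a library routine such as `scipy.linalg.expm` could be used instead), and the point and tangent vector in $O(2,1)$ are arbitrary choices.

```python
import numpy as np

def expm(A, terms=20):
    """Matrix exponential via scaling-and-squaring of a truncated Taylor
    series (illustrative stand-in for a library routine)."""
    norm = np.linalg.norm(A, 1)
    k = max(0, int(np.ceil(np.log2(norm))) + 1) if norm > 0 else 0
    B = A / 2.0 ** k
    E = np.eye(A.shape[0])
    term = np.eye(A.shape[0])
    for j in range(1, terms + 1):
        term = term @ B / j
        E = E + term
    for _ in range(k):
        E = E @ E
    return E

def geodesic(X, V, t):
    """gamma_{X,V}(t) = X exp(t X^{-1} V)."""
    return X @ expm(t * np.linalg.solve(X, V))

R = np.diag([1.0, 1.0, -1.0])
X = np.eye(3)                                  # starting point in O(2,1)
Omega = np.array([[0.0, 0.3, 0.5],
                  [-0.3, 0.0, -0.2],
                  [-0.5, 0.2, 0.0]])           # skew-symmetric
V = X @ R @ Omega                              # tangent vector at X

G = geodesic(X, V, 1.0)
# The end point must still satisfy G^T R G = R.
residual = np.linalg.norm(G.T @ R @ G - R)
# Central finite difference of the curve at t = 0, which should recover V.
h = 1e-4
velocity = (geodesic(X, V, h) - geodesic(X, V, -h)) / (2 * h)
```

The residual stays at rounding level, which reflects the fact that the exponential of an element of the Lie algebra $\mathfrak{o}(p,q)$ lies in the group.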
As an essential ingredient in the formulation of a pseudo-Riemannian-gradient stepping algorithm to minimize a smooth function on a pseudo-orthogonal group, the structure of the pseudo-Riemannian gradient associated to the Khvedelidze–Mladenov metric in Equation (19) in $O(p,q)$ is given by the following.
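Theorem 2. The pseudo-Riemannian gradient of a sufficiently regular function $f:O(p,q)\to \mathbb{R}$ associated to the Khvedelidze–Mladenov metric in Equation (19) reads
$${\nabla}_{X}f=\frac{1}{2}\left(X{\partial}_{X}^{\top}fX-{R}_{p,q}{\partial}_{X}f{R}_{p,q}\right).$$
Proof. According to the relationship in Equation (19), the gradient ${\nabla}_{X}f$ is computed as the solution of the following system of equations
$$\left\{\begin{array}{l}\mathrm{tr}({\partial}_{X}^{\top}f\,V)=\mathrm{tr}({X}^{-1}{\nabla}_{X}f{X}^{-1}V),\quad \forall V\in {T}_{X}O(p,q),\\ {\nabla}_{X}^{\top}f{R}_{p,q}X+{X}^{\top}{R}_{p,q}{\nabla}_{X}f={O}_{(p+q)\times (p+q)}.\end{array}\right.\tag{23}$$
Note that the first equation in Equation (23) can be rewritten as
$$\mathrm{tr}\left(({\partial}_{X}^{\top}f-{X}^{-1}{\nabla}_{X}f{X}^{-1})V\right)=0.$$
Since $V\in {T}_{X}O(p,q)$ is arbitrary, the condition above implies that ${({\partial}_{X}^{\top}f-{X}^{-1}{\nabla}_{X}f{X}^{-1})}^{\top}\in {N}_{X}O(p,q)$, hence that ${({\partial}_{X}^{\top}f-{X}^{-1}{\nabla}_{X}f{X}^{-1})}^{\top}={R}_{p,q}XS$, with $S={S}^{\top}$. Therefore, the pseudo-Riemannian gradient of the function $f$ has the expression
$${\nabla}_{X}f=X{\partial}_{X}^{\top}fX-XS{R}_{p,q}.\tag{25}$$
Substituting the relation in Equation (25) into the second equation of Equation (23) gives
$$S=\frac{1}{2}\left({\partial}_{X}^{\top}fX{R}_{p,q}+{R}_{p,q}{X}^{\top}{\partial}_{X}f\right).\tag{26}$$
Substituting back Equation (26) into the relation in Equation (25), and recalling that $X{R}_{p,q}{X}^{\top}={R}_{p,q}$, completes the proof. □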
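The two defining properties of the pseudo-Riemannian gradient, namely tangency and compatibility with the metric, can be checked numerically. The sketch below assumes the gradient expression $\frac{1}{2}(X{J}^{\top}X-{R}_{p,q}J{R}_{p,q})$, with $J$ the Euclidean gradient, which is the form used in Algorithm 1. NumPy, the function name and the random test data are illustrative assumptions.

```python
import numpy as np

def pseudo_riemannian_gradient(X, J, R):
    """0.5 * (X J^T X - R J R), with J the Euclidean gradient at X."""
    return 0.5 * (X @ J.T @ X - R @ J @ R)

rng = np.random.default_rng(1)
R = np.diag([1.0, 1.0, -1.0])
s = 0.6
X = np.array([[np.cosh(s), 0.0, np.sinh(s)],
              [0.0, 1.0, 0.0],
              [np.sinh(s), 0.0, np.cosh(s)]])   # a point of O(2,1)

J = rng.standard_normal((3, 3))                 # arbitrary Euclidean gradient
G = pseudo_riemannian_gradient(X, J, R)

# (a) Tangency: G^T R X + X^T R G = 0.
tangency_residual = np.linalg.norm(G.T @ R @ X + X.T @ R @ G)

# (b) Compatibility: tr(J^T V) = tr(X^{-1} G X^{-1} V) for any tangent V.
M = rng.standard_normal((3, 3))
V = X @ R @ (M - M.T)                           # tangent vector X R Omega
X_inv = R @ X.T @ R
lhs = np.trace(J.T @ V)
rhs = np.trace(X_inv @ G @ X_inv @ V)
```

Both checks hold for an arbitrary matrix $J$, which is consistent with the theorem being stated for any sufficiently regular function.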
A Riemannian setting for the metrization of the pseudo-orthogonal group was proposed and studied in [12]. We believe that both Riemannian and pseudo-Riemannian metrizations are worth investigating as they lead to quite different analytic results.
3.2. A Criterion Function Based on the Frobenius Norm over $O(p,q)$
In the present article, the pseudo-orthogonal group is treated as a pseudo-Riemannian manifold. Although it is possible to introduce a pseudo-distance function that is compatible with the pseudo-Riemannian metric, such function is not positive definite and cannot be interpreted as a distance function.
Therefore, as a first attempt in the construction of a criterion function to define an empirical mean, we consider the distance function on the pseudo-orthogonal group suggested in the research work [32], which is defined as
$${D}_{\mathrm{F}}(X,Y):={\parallel X-Y\parallel}_{\mathrm{F}}.$$
The criterion function $f:O(p,q)\to \mathbb{R}$ to be minimized to compute an average point out of a collection $\{{X}_{1},{X}_{2},\dots ,{X}_{N}\}$ of $O(p,q)$-samples is
$$f(X):=\frac{1}{2N}{\sum}_{k=1}^{N}{\parallel X-{X}_{k}\parallel}_{\mathrm{F}}^{2}.\tag{28}$$
For the sake of notational convenience, set $C:=\frac{1}{N}{\sum}_{k}{X}_{k}$, which is the empirical arithmetic average of the collection of points. The criterion function in Equation (28) can be recast as $f(X)={\textstyle \frac{1}{2}}{\parallel X-C\parallel}_{\mathrm{F}}^{2}+\mathrm{constant}$. On the basis of such expression, it is straightforward to verify that
$$df=\mathrm{tr}({(X-C)}^{\top}dX),$$
therefore the Euclidean gradient of the function $f$ with respect to $X$ is given by ${\partial}_{X}f=X-C$. According to Theorem 2, the pseudo-Riemannian gradient of the criterion function in Equation (28) on a pseudo-orthogonal group endowed with the Khvedelidze–Mladenov metric is given by
$${\nabla}_{X}f=\frac{1}{2}\left(X{(X-C)}^{\top}X-{R}_{p,q}(X-C){R}_{p,q}\right).$$
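The double squared pseudo-Riemannian norm of the pseudo-Riemannian gradient ${\nabla}_{X}f$ reads
$$2{\parallel {\nabla}_{X}f\parallel}_{X}^{2}=2\mathrm{tr}({(X-C)}^{\top}{\nabla}_{X}f)=\mathrm{tr}\left({({(X-C)}^{\top}X)}^{2}\right)-\mathrm{tr}\left({(X-C)}^{\top}{R}_{p,q}(X-C){R}_{p,q}\right).\tag{31}$$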
To prove the consistency of the pseudo-Riemannian function minimization algorithm with the function minimization problem at hand, it is necessary to evaluate the sign of the coefficients of the stepsize schedule, as discussed in Section 2.2.
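Lemma 5. The coefficients $\tilde{{f}_{1}}$ and $\tilde{{f}_{2}}$ of the function $\tilde{f}=f\circ {\gamma}_{X,V}(t)$ with $(X,V)\in TO(p,q)$ are given by
$$\tilde{{f}_{1}}=\mathrm{tr}({(X-C)}^{\top}V),\quad \tilde{{f}_{2}}={\parallel V\parallel}_{\mathrm{F}}^{2}+\mathrm{tr}({(X-C)}^{\top}V{X}^{-1}V).\tag{32}$$
Proof. Since $\tilde{f}=\frac{1}{2}{\parallel X\exp(t{X}^{-1}V)-C\parallel}_{\mathrm{F}}^{2}+\mathrm{constant}$, it holds that
$$\tilde{{f}_{1}}={\left.\frac{d\tilde{f}}{dt}\right|}_{t=0}=\mathrm{tr}({(X-C)}^{\top}V),\quad \tilde{{f}_{2}}={\left.\frac{{d}^{2}\tilde{f}}{d{t}^{2}}\right|}_{t=0}=\mathrm{tr}({V}^{\top}V)+\mathrm{tr}({(X-C)}^{\top}V{X}^{-1}V),$$
which proves the assertion. □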
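The coefficients of Lemma 5 can be compared with finite differences of $\tilde{f}$ along a geodesic. The sketch below assumes that $\tilde{{f}_{1}}$ and $\tilde{{f}_{2}}$ denote the first and second derivatives of $\tilde{f}$ at $t=0$, in the forms $\mathrm{tr}({(X-C)}^{\top}V)$ and ${\parallel V\parallel}_{\mathrm{F}}^{2}+\mathrm{tr}({(X-C)}^{\top}V{X}^{-1}V)$ used in Algorithm 1. NumPy, the helper `expm_taylor` and the sample data in $O(1,1)$ are illustrative assumptions.

```python
import numpy as np

def expm_taylor(A, terms=30):
    """Truncated Taylor series of the matrix exponential
    (adequate for the small arguments used here)."""
    E = np.eye(A.shape[0])
    term = np.eye(A.shape[0])
    for j in range(1, terms + 1):
        term = term @ A / j
        E = E + term
    return E

def boost(s):
    """A hyperbolic rotation, which belongs to O(1,1)."""
    return np.array([[np.cosh(s), np.sinh(s)],
                     [np.sinh(s), np.cosh(s)]])

R = np.diag([1.0, -1.0])
samples = [boost(s) for s in (0.1, 0.4, 0.7)]
C = sum(samples) / len(samples)                 # arithmetic average
X = boost(0.2)
V = X @ R @ np.array([[0.0, 0.5], [-0.5, 0.0]]) # tangent vector at X
H = np.linalg.solve(X, V)                       # X^{-1} V

def f_tilde(t):
    """Criterion restricted to the geodesic, up to an additive constant."""
    return 0.5 * np.linalg.norm(X @ expm_taylor(t * H) - C, "fro") ** 2

# Closed-form coefficients (note that V X^{-1} V = V H).
f1 = np.trace((X - C).T @ V)
f2 = np.trace(V.T @ V) + np.trace((X - C).T @ V @ H)

# Central finite differences at t = 0.
h = 1e-3
f1_fd = (f_tilde(h) - f_tilde(-h)) / (2 * h)
f2_fd = (f_tilde(h) - 2 * f_tilde(0.0) + f_tilde(-h)) / h ** 2
```

The additive constant of the criterion function does not affect either derivative, so it is omitted from `f_tilde`.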
The following two lemmas examine the signs of the coefficients $\tilde{{f}_{1}}$ and $\tilde{{f}_{2}}$.
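Lemma 6. The coefficient $\tilde{{f}_{1}}$ in Equation (32) is non-positive.
Proof. In the case that ${\nabla}_{X}f\in {T}_{X}^{+}O(p,q)\cup {T}_{X}^{0}O(p,q)$, the algorithm in Equation (7) takes $V=-{\nabla}_{X}f$, hence, by Equation (31),
$$\tilde{{f}_{1}}=-\mathrm{tr}({(X-C)}^{\top}{\nabla}_{X}f)=-{\parallel {\nabla}_{X}f\parallel}_{X}^{2}\le 0.$$
Conversely, when ${\nabla}_{X}f\in {T}_{X}^{-}O(p,q)$, the algorithm in Equation (7) takes $V={\nabla}_{X}f$, hence
$$\tilde{{f}_{1}}=\mathrm{tr}({(X-C)}^{\top}{\nabla}_{X}f)={\parallel {\nabla}_{X}f\parallel}_{X}^{2}<0,$$
which proves the assertion. □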
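Lemma 7. Fixing ${\parallel C\parallel}_{\mathrm{F}}$, for ${\parallel X-C\parallel}_{\mathrm{F}}$ sufficiently small, the coefficient $\tilde{{f}_{2}}$ in Equation (32) is positive.
Proof. The coefficient $\tilde{{f}_{2}}$ is computed as the sum of two terms, ${\parallel V\parallel}_{\mathrm{F}}^{2}$ and $\mathrm{tr}({(X-C)}^{\top}V{X}^{-1}V)$. The first term is non-negative for every $V\in {T}_{X}O(p,q)$, while the second term is indefinite. Note that ${\parallel {X}^{-1}\parallel}_{\mathrm{F}}={\parallel X\parallel}_{\mathrm{F}}$, therefore we have that
$$|\mathrm{tr}({(X-C)}^{\top}V{X}^{-1}V)|\le {\parallel X-C\parallel}_{\mathrm{F}}{\parallel X\parallel}_{\mathrm{F}}{\parallel V\parallel}_{\mathrm{F}}^{2}\le {\parallel X-C\parallel}_{\mathrm{F}}\left({\parallel X-C\parallel}_{\mathrm{F}}+{\parallel C\parallel}_{\mathrm{F}}\right){\parallel V\parallel}_{\mathrm{F}}^{2}.$$
As a consequence, fixing ${\parallel C\parallel}_{\mathrm{F}}$, for ${\parallel X-C\parallel}_{\mathrm{F}}$ sufficiently small, the coefficient $\tilde{{f}_{2}}$ is non-negative, and it is positive whenever $V$ is not the zero matrix. □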
A consequence of Lemma 7 is that the initial point
${X}_{(0)}$ may be chosen or randomly generated in
$O(p,q)$, provided it meets the condition
The proposed procedure to minimize the criterion function in Equation (28) can be summarized by the pseudocode listed in Algorithm 1, where it is assumed that the sequence $\ell \to {X}_{(\ell )}$ satisfies ${\nabla}_{{X}_{(\ell )}}f\notin {T}_{{X}_{(\ell )}}^{0}O(p,q)$. In Algorithm 1, the quantity $\ell $ denotes a step counter, the matrix ${J}_{(\ell )}$ represents the Euclidean gradient of the criterion function in Equation (28), the matrix ${U}_{(\ell )}$ represents its pseudo-Riemannian gradient and the sign of the scalar quantity ${s}_{(\ell )}$ determines whether the matrix ${U}_{(\ell )}$ belongs to the space ${T}_{{X}_{(\ell )}}^{+}O(p,q)$ or to the space ${T}_{{X}_{(\ell )}}^{-}O(p,q)$.
Algorithm 1 Pseudocode to implement mean-computation over $O(p,q)$ according to the function minimization rule (7) endowed with the stepsize-selection rule in Equation (11) and the stopping criterion in Equation (12).
Set ${R}_{p,q}=\left(\begin{array}{cc}{I}_{p}& {O}_{p\times q}\\ {O}_{q\times p}& -{I}_{q}\end{array}\right)$
Set $\ell =0$
Set $C=\frac{1}{N}{\sum}_{k}{X}_{k}$
Set ${X}_{(0)}$ to an initial point in $O(p,q)$
Set $\epsilon $ to desired precision
repeat
  Compute ${J}_{(\ell )}={X}_{(\ell )}-C$
  Compute ${U}_{(\ell )}=\frac{1}{2}({X}_{(\ell )}{J}_{(\ell )}^{\top}{X}_{(\ell )}-{R}_{p,q}{J}_{(\ell )}{R}_{p,q})$
  Compute ${s}_{(\ell )}=\mathrm{tr}({({X}_{(\ell )}^{-1}{U}_{(\ell )})}^{2})$
  if ${s}_{(\ell )}>0$ then
    Set ${V}_{(\ell )}=-{U}_{(\ell )}$
  else
    Set ${V}_{(\ell )}={U}_{(\ell )}$
  end if
  Compute ${\tilde{f}}_{1(\ell )}=\mathrm{tr}({J}_{(\ell )}^{\top}{V}_{(\ell )})$
  Compute ${\tilde{f}}_{2(\ell )}=\mathrm{tr}({V}_{(\ell )}^{\top}{V}_{(\ell )})+\mathrm{tr}({J}_{(\ell )}^{\top}{V}_{(\ell )}{X}_{(\ell )}^{-1}{V}_{(\ell )})$
  Set ${\widehat{t}}_{(\ell )}=-{\tilde{f}}_{1(\ell )}/{\tilde{f}}_{2(\ell )}$
  Set ${X}_{(\ell +1)}={X}_{(\ell )}\exp({\widehat{t}}_{(\ell )}{X}_{(\ell )}^{-1}{V}_{(\ell )})$
  Set $\ell =\ell +1$
until $|{\tilde{f}}_{1(\ell -1)}|<\epsilon $
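The steps of Algorithm 1 can be sketched in Python as follows. This is a minimal illustration rather than a reference implementation. It assumes the stepsize ${\widehat{t}}_{(\ell )}=-{\tilde{f}}_{1(\ell )}/{\tilde{f}}_{2(\ell )}$ and a stopping test on $|{\tilde{f}}_{1(\ell )}|$. The names `frobenius_mean`, `expm` and `boost`, the iteration cap `max_iter` and the test data in $O(1,1)$ are hypothetical choices made for this example.

```python
import numpy as np

def expm(A, terms=20):
    """Matrix exponential via scaling-and-squaring of a truncated Taylor
    series (illustrative stand-in for a library routine)."""
    norm = np.linalg.norm(A, 1)
    k = max(0, int(np.ceil(np.log2(norm))) + 1) if norm > 0 else 0
    B = A / 2.0 ** k
    E = np.eye(A.shape[0])
    term = np.eye(A.shape[0])
    for j in range(1, terms + 1):
        term = term @ B / j
        E = E + term
    for _ in range(k):
        E = E @ E
    return E

def frobenius_mean(samples, p, q, X0, eps=1e-12, max_iter=100):
    """Sketch of Algorithm 1 for the Frobenius-norm criterion."""
    R = np.diag(np.concatenate([np.ones(p), -np.ones(q)]))
    C = sum(samples) / len(samples)            # arithmetic average of samples
    X = X0.copy()
    for _ in range(max_iter):
        X_inv = R @ X.T @ R                    # inverse of a pseudo-orthogonal X
        J = X - C                              # Euclidean gradient
        U = 0.5 * (X @ J.T @ X - R @ J @ R)    # pseudo-Riemannian gradient
        s = np.trace(X_inv @ U @ X_inv @ U)    # squared pseudo-norm of U
        V = -U if s > 0 else U                 # search direction
        f1 = np.trace(J.T @ V)
        if abs(f1) < eps:                      # stopping criterion
            break
        f2 = np.trace(V.T @ V) + np.trace(J.T @ V @ X_inv @ V)
        X = X @ expm((-f1 / f2) * (X_inv @ V)) # step along the geodesic
    return X

def boost(s):
    """A hyperbolic rotation, which belongs to O(1,1)."""
    return np.array([[np.cosh(s), np.sinh(s)],
                     [np.sinh(s), np.cosh(s)]])

samples = [boost(s) for s in (0.1, 0.3, 0.5)]
X_mean = frobenius_mean(samples, 1, 1, X0=np.eye(2))
```

The initial point is the identity, which for these samples satisfies ${\parallel {X}_{(0)}-C\parallel}_{\mathrm{F}}{\parallel {X}_{(0)}\parallel}_{\mathrm{F}}<1$, so that ${\tilde{f}}_{2}$ is positive at the first step.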

3.3. A Criterion Function Based on the Geodesic Distance over $O(p,q)$
We may consider a second instance of distance between two points in the group-manifold $O(p,q)$ defined as follows
$${D}_{\mathrm{g}}(X,Y):=\sqrt{|\mathrm{tr}({\log}^{2}({X}^{-1}Y))|},\tag{37}$$
where the symbol $|\cdot |$ denotes the entrywise absolute value of the argument matrix.
On the basis of the above distance function, the criterion function $f:O(p,q)\to \mathbb{R}$ to be minimized in the context of computing a mean matrix out of a set of pseudo-orthogonal matrix-samples is defined by
$$f(X):=\frac{1}{2N}{\sum}_{k=1}^{N}{D}_{\mathrm{g}}^{2}(X,{X}_{k}).\tag{38}$$
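Let us fix an element $Y\in O(p,q)$ and compute the pseudo-Riemannian gradient of the map $X\to {\tilde{D}}_{\mathrm{g}}^{2}(X,Y)$, where we define an auxiliary function as ${\tilde{D}}_{\mathrm{g}}^{2}(X,Y):=\mathrm{tr}({\log}^{2}({X}^{-1}Y))$. According to Proposition 2.1 in [20], the differential of the auxiliary function may be written as
$$d{\tilde{D}}_{\mathrm{g}}^{2}(X,Y)=-2\mathrm{tr}(\log({X}^{-1}Y){X}^{-1}dX).$$
Therefore, the Euclidean gradient of the auxiliary function is given by
$${\partial}_{X}{\tilde{D}}_{\mathrm{g}}^{2}(X,Y)=-2{(\log({X}^{-1}Y){X}^{-1})}^{\top}=2{(\log({Y}^{-1}X){X}^{-1})}^{\top}.$$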
According to Theorem 2, the following expression for the pseudo-Riemannian gradient of the distance function in Equation (37) is readily obtained
The proposed procedure to minimize the criterion function (38) is summarized by the pseudocode listed in Algorithm 2, where the notation is the same as in Algorithm 1. In this empirical mean computation algorithm, a fixed stepsize $\eta $ has been selected, as opposed to Algorithm 1 that utilizes a variable stepsize schedule.
Algorithm 2 Pseudocode to implement mean-computation over $O(p,q)$ according to the function minimization rule in Equation (7).
Set ${R}_{p,q}=\left(\begin{array}{cc}{I}_{p}& {O}_{p\times q}\\ {O}_{q\times p}& -{I}_{q}\end{array}\right)$
Set $\ell =0$
Set ${X}_{(0)}$ to an initial point in $O(p,q)$
Set $\eta $ to a stepsize value
repeat
  Compute ${J}_{(\ell )}=\frac{1}{N}{\sum}_{k=1}^{N}{(\log({X}_{k}^{-1}{X}_{(\ell )}){X}_{(\ell )}^{-1})}^{\top}\,\mathrm{sign}\,\mathrm{tr}({\log}^{2}({X}_{(\ell )}^{-1}{X}_{k}))$
  Compute ${U}_{(\ell )}=\frac{1}{2}({X}_{(\ell )}{J}_{(\ell )}^{\top}{X}_{(\ell )}-{R}_{p,q}{J}_{(\ell )}{R}_{p,q})$
  Compute ${s}_{(\ell )}=\mathrm{tr}({({X}_{(\ell )}^{-1}{U}_{(\ell )})}^{2})$
  if ${s}_{(\ell )}>0$ then
    Set ${V}_{(\ell )}=-{U}_{(\ell )}$
  else
    Set ${V}_{(\ell )}={U}_{(\ell )}$
  end if
  Set ${X}_{(\ell +1)}={X}_{(\ell )}\exp(\eta {X}_{(\ell )}^{-1}{V}_{(\ell )})$
  Set $\ell =\ell +1$
until ${X}_{(\ell )}$ is close enough to a critical point of $f$
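A possible Python rendering of Algorithm 2 is sketched below under several stated assumptions. The matrix logarithm is computed by the series $\log M=2{\sum}_{k\ \mathrm{odd}}{Z}^{k}/k$ with $Z=(M-I){(M+I)}^{-1}$, which is valid only when the eigenvalues of $M$ have positive real part; a library routine such as `scipy.linalg.logm` could replace it. Closeness to a critical point is measured by the Frobenius norm of ${U}_{(\ell )}$, which is a choice of this sketch and not a rule stated in the text. The names `geodesic_mean`, `expm`, `logm` and `boost`, the tolerance `tol` and the iteration cap `max_iter` are hypothetical.

```python
import numpy as np

def expm(A, terms=20):
    """Matrix exponential via scaling-and-squaring of a truncated Taylor series."""
    norm = np.linalg.norm(A, 1)
    k = max(0, int(np.ceil(np.log2(norm))) + 1) if norm > 0 else 0
    B = A / 2.0 ** k
    E = np.eye(A.shape[0])
    term = np.eye(A.shape[0])
    for j in range(1, terms + 1):
        term = term @ B / j
        E = E + term
    for _ in range(k):
        E = E @ E
    return E

def logm(M, terms=40):
    """log(M) = 2 * sum over odd k of Z^k / k, with Z = (M - I)(M + I)^{-1}.
    Assumes the eigenvalues of M have positive real part."""
    I = np.eye(M.shape[0])
    Z = (M - I) @ np.linalg.inv(M + I)
    Z2 = Z @ Z
    term = Z.copy()
    L = np.zeros_like(M)
    for k in range(1, 2 * terms, 2):
        L = L + term / k
        term = term @ Z2
    return 2.0 * L

def geodesic_mean(samples, p, q, X0, eta=0.5, tol=1e-9, max_iter=500):
    """Sketch of Algorithm 2 with a fixed stepsize eta."""
    R = np.diag(np.concatenate([np.ones(p), -np.ones(q)]))
    X = X0.copy()
    for _ in range(max_iter):
        X_inv = R @ X.T @ R
        J = np.zeros_like(X)
        for Xk in samples:
            L = logm(X_inv @ Xk)                   # log(X^{-1} X_k)
            sign = np.sign(np.trace(L @ L))
            Lk = logm(R @ Xk.T @ R @ X)            # log(X_k^{-1} X)
            J += sign * (Lk @ X_inv).T / len(samples)
        U = 0.5 * (X @ J.T @ X - R @ J @ R)        # pseudo-Riemannian gradient
        if np.linalg.norm(U, "fro") < tol:         # assumed critical-point test
            break
        s = np.trace(X_inv @ U @ X_inv @ U)
        V = -U if s > 0 else U
        X = X @ expm(eta * (X_inv @ V))
    return X

def boost(s):
    """A hyperbolic rotation, which belongs to O(1,1)."""
    return np.array([[np.cosh(s), np.sinh(s)],
                     [np.sinh(s), np.cosh(s)]])

samples = [boost(s) for s in (0.1, 0.2, 0.6)]
X_mean = geodesic_mean(samples, 1, 1, X0=np.eye(2))
```

For hyperbolic rotations the logarithm is linear in the rotation parameter, so in this example the iteration contracts towards the hyperbolic rotation whose parameter is the arithmetic mean of the sample parameters, here $0.3$.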
