Kernel Density Estimation on the Siegel Space with an Application to Radar Processing

Main techniques of probability density estimation on Riemannian manifolds are reviewed in the case of the Siegel space. For computational reasons we chose to focus on the kernel density estimation. The main result of the paper is the expression of Pelletier’s kernel density estimator. The method is applied to density estimation of reﬂection coeﬃcients from radar observations.


Introduction
Probability density estimation is a vast topic.There exists various standard techniques in the Euclidean context, such as histograms, kernel methods, or the characteristic function approach.These methods can sometimes be transposed to the case of Riemannian manifolds.However, the transposition often introduces additional computational efforts.This additional effort depends on the method used and the nature of the manifold.The Siegel space is a generalization of the hyperbolic space.It has a structure of symmetric Riemannian manifold, which enables to adapt different density estimation methods at a reasonable cost.Convergence rates of the density estimation using kernels and orthogonal series were progressively generalized to Riemannian manifolds, see [2] [1].The Siegel space appears in radar processing in the study of Toeplitz block Toeplitz matrices, whose blocks represent covariance matrices of the signal, see [5,6,7].Information geometry is now a standard framework in radar processing, see [4,5,6,7,8,9,10].The information geometry on positive definite Teoplitz block Teoplitz matrices is directly related to the metric on the Siegel space, see [11].Indeed these Toeplitz block Toeplitz matrices can be represented by a symmetric positive definite matrix and a set of coefficients laying in the Siegel disk.The metric considered on Toeplitz block Toeplitz matrices is a product metric between a metric on symmetric positive definite matrices and the Siegel disk metric, see [11].One already encounters the problem of density estimation in the hyperbolic space for electrical impedance [3], networks [16] and radar signals [17].In [12] was proposed a generalization of the Gaussian law on the hyperbolic space.Apart from [13], where authors propose a generalization of the Gaussian law, probability density estimation on the Siegel space has not yet been addressed.The contributions of the paper are the following.We review the main non parametric density estimation techniques on the Siegel disk.We provide some rather simple explicit expressions of the kernels proposed by Pelletier in [1].These expressions makes the kernel density estimation the most adapted method.We present visual results of estimated densities in the simple case where the Siegel disk correspond to the Poincaré disk.The paper begins with an introduction to the Siegel space in Section 2. Section 3 reviews the main non-parametric density estimation techniques on the Siegel space.Section 4 presents an application to radar data estimation.

The Siegel space
This section presents facts about the Siegel space.The interested reader can find more details in [20,18]

The Siegel upper half space
The Siegel upper half space is a generalization of the Poincaré upper half space.Let Sym(n) be the space of real symmetric matrices of size n × n and Sym + (n) the set of real symmetric positive definite matrices of size n × n.The Siegel upper half space is defined by H n is equipped with the following metric The set of real symplectic matrices Sp(n, R) is defined by The metric ds is invariant under the following action of Sp(n, R), This action is transitive, i.e.
The stabilizer K of iI is the set of elements g of Sp(n, R) whose action leaves iI fixed.K is a subgroup of Sp(n, R) called the isotropy group.We can verify that A symmetric space is a Riemannian manifold where the reversal of the geodesics is well defined and is an isometry.Formally, exp p (u) → exp p (−u) is an isometry for each p on the manifold, where u is a vector in the tangent space at p, and exp p the Riemannian exponential application at p.In other words, the symmetry around each point is an isometry.H n is a symmetric space, see [18].The structure of a symmetric space can be studied through its isometry group and the Lie algebra of its isometry group.The present work will make use of the Cartan and Iwasawa decompositions of the Lie algebra of Sp(n, R).Let sp(n, R) be the Lie algebra of Sp(n, R).Given A, where Ad is the adjoint representation of Sp(n, R).

The Siegel disk
The Siegel disk D n is the set of complex matrices {Z|I − Z * Z ≥ 0} where ≥ stands for the Loewner order.Recall that for A and B to Hermitian matrices, A ≥ B according to the Loewner order means that A − B is positive definite.The transformation where A stands for the conjugate of A. The point iI in H n is identified with the null matrix noted 0 in D n .Let Z ∈ D n .There exists P a diagonal matrix with decreasing positive real entries and U a unitary matrix such that Z = U P U t .Let τ 1 ≥ ... ≥ τ n be such that It can be shown that We provide now a correspondence between the elements of D n and the elements of p defined in Eq. 1.Let and It can be shown that there exists k ∈ K such that Recall that Eq. 2 gives Ad k (H) ∈ p and kak ∈ exp(p).The distance between Z and 0 in D n is given by see [18] page 292.!!Negative curvature of the Siegle space!!

Non parametric density estimation on the Siegel space
Let Ω be a space, endowed with a σ-algebra and a probability measure p.
Let X be a random variable Ω → D n .The Riemannian measure of D n is called vol and the measure on D n induced by X is noted µ X .We assume that µ X has a density, noted f , with respect to vol, and that the support of X is a compact set noted Supp.Let (x 1 , .., x k ) ∈ D k n be a set of draws of X.The Dirac measure in point a is defined as: denotes the empirical measure of the set of draws.This section presents four non-parametric techniques of estimation of f from the set of draws (x 1 , .., x k ).The estimated density at x in D n is noted fk (x) = f (x, x 1 , ..., x k ).The relevance of a density estimation technique depends on several aspects.When the space allows it, the estimation technique should equally consider each directions and locations.This leads to an isotropy and a homogeneity condition.In the kernel method for instance, a kernel function K x i is placed at each observation x i .Firstly, in order to treat directions equally, the function K x i should be invariant under the isotropy group of x i .Secondly, for another observation x j , functions K x i and K x j should be similar up to the isometries that send x i on x j .These considerations strongly depend on the geometry of the space: if the space is not homogeneous and the isotropy group is empty, these indifference principles have no meaning.Since the Siegel space is symmetric, it is homogeneous and has a non empty isotropy group.Thus the density estimation technique should be chosen accordingly.The convergence of the different estimation techniques is widely studied.Results were first obtained in the Euclidean case, and are gradually extended to the probability densities on manifold, see [2,1,3].The last relevant aspect, is computational.Each estimation technique has its own computational framework, which presents pro and cons given the different applications.For instance, the estimation by orthogonal series presents an initial pre-processing, but provides a fast evaluation of the estimated density in compact manifolds.

Histograms
The histogram is the simplest density estimation method.Given a partition of the space D n = ∪ i A i , the estimated density is given by where 1 A i stands for the indicator function of A i .Following the considerations of the previous sections, the elements of the partition should firstly be as isotropic as possible, and secondly as similar as possible to each other.Regarding the problem of histograms, the case of the Siegel space is similar to the case of the hyperbolic space.There exist various uniform polygonal tilings on the Siegel space that could be used to compute histograms.However, there are ratio λ ∈ R for which there is no homothety.Thus it is not always possible to scale the size of the bins to a given set of draws of the random variable.Modifying the scale of the density estimation requires then a change of the structure of the tiling.Thus the study of histograms has not been deepened.

Orthogonal series
The estimation of the density f can be made out of the estimation of the scalar product between f and a set of orthonormal functions {e j }.The most standard choice for {e j } is the eigenfunctions of the Laplacian.When the variable X takes its values in R n , this estimation technique becomes the characteristic function method.When the underlying space is compact, the spectrum of the Laplacian operator is countable, while when the space is non-compact, the spectrum is uncountable.In the first case, the estimation of the density f is made through the estimation of a sum, while in the second case is made through the estimation an integral.In practice, the second situation present a larger computational complexity.Unfortunately, eigenfunctions of the Laplacian operator are known on D n but not on compact sub-domains.For this reason the study of this method has not been deepened.

Kernels
Let K : R + → R + be a map which verifies the following properties: i) Let p ∈ D n .Generally, given a point p on a Riemannian manifold, exp p defines an injective application only on a neighborhood of 0. The Siegel space is a non-compact symmetric space and has thus only negative sectional curvatures.Thus, exp p is injective on the whole space.When the tangent space T p D n is endowed with the local scalar product, where ||.|| is the Euclidean distance associated to the local scalar product and d(., .) is the Riemannian distance.The corresponding Lebesgue measure of T p D n is noted Leb p .Let exp * p (Leb p ) denote the push-forward measure of Lep p by exp p .The function θ p defined by: is the density of the Riemannian measure of D n with respect to the Lebesgue measure Leb p after identification of D n and T p D n induced by exp p , see Fig. 1.
Figure 1: M is a Riemannian manifold, T x M is its tangent space at x.The exponential application induces a volume change θ x between T x M and M.
Given K and a positive radius r, the estimator of f proposed by [1] is defined by: fk The corrective factor θ x i (x) −1 is necessary since the kernel K originally integrates to one with respect to the Lebesgue measure, instead of the Riemannian measure.It can be noted that this estimator is the usual kernel estimator in the case of Euclidean space.When the curvature of the space is negative, which is the case of the Siegel space, the distribution placed over each sample x i has x i as intrinsic mean.The following theorem provides convergence rate of the estimator.It is a direct adaptation of theorem 3.1 of [1].Theorem 3.1.Let (M, g) be a Riemannian manifold of dimension n and µ its Riemannian volume measure.Let X be a random variable taking its values in a compact subset C of (M, g).Let 0 < r ≤ r inj , where r inj is the infimum of the injectivity radius on C. Assume the law of X has a twice differentiable density f with respect to the Riemannian volume measure.Let fk be the estimator defined in Eq. 7. The exist a constant C f such that Proof.See appendix A.
Each location and direction are processed as similarly as possible.
In order to obtain the explicit expression of the estimator one must first have the explicit expression of the Riemannian exponential, its inverse, and of the function θ p , see Eq. 6-7.These expressions are difficult and sometimes impossible to obtain on general Riemannian manifolds.In the case of the Siegel space, the symmetric structure makes the computation possible.Since the space is homogeneous, the computation can be made at the origin iI ∈ H n or 0 ∈ D n and transported to the whole space.In the present work, the random variable lays in D n .However in the literature the Cartan and Iwasawa decompositions are usually given for the isometry group of H n .Thus our computation starts in H n before moving to D n .
The Killing form on the Lie algebra sp(n, R) of the isometry group of H n induces a scalar product on p.This scalar product can be transported on exp(p) by left multiplication.This operation gives exp(p) a Riemannian structure.It can be shown that on this Riemannian manifold, the Riemannian exponential at the identity coincides with the group exponential.Furthermore, is a bijective isometry, up to a scaling factor.Since the volume change θ p is invariant under rescaling of the metric, this scaling factor has no impact.Thus H n can be identified with exp(p) and exp iI∈Hn with exp |p .The expression of the Riemannian exponential is difficult to obtain in general, however it boils down to the group exponential in the case of symmetric spaces.This is the main element of the computation of θ p .The Riemannian volume measure on exp(p) is noted vol .Let Let a + be the diagonal matrices with strictly decreasing positive eigenvalues.Let Λ + be the set of positive roots of sp(n, R) as described in [18] page 282, Λ + = {e i + e j , i ≤ j} ∪ {e i − e j , i < j} where e i (H) is the i-th diagonal term of the diagonal matrix H. Let C c (E) be the set of continuous compactly supported function on the space E. In [19] is given page 73 that for all t ∈ C c (p), there exists where dY is a Lebesgue measure on the coefficients of Y .Let p = ψ(K ×a + ).λ ∈ Λ + never vanishes on a + and p \ p has a null measure.Thus where H Y is the point in a + such that there exists k in K such that ψ(k, H Y ) = Y .Calculation in [19] page 73 also gives that for all t ∈ C c (p), there exists c 2 > 0 such that where dg is the Haar measure on Sp(n, R) and Thus for all t ∈ C c (Sp(n, R)/K), where dx is the invariant measure on Sp(n, R)/K.After identifying Sp(n, R)/K and exp(p) the Riemannian measure on exp(p) coincides with the invariant measure on Sp(n, R)/K.Thus for all t ∈ C c (exp(p)), Using the notation H Y of Eq. 12, (16) Combining Eq. 15 and Eq.16 we obtain that exists c 3 such that p t(exp(Y )) The term sinh(λ(H)) can be extended by continuity on a thus p t(exp(Y )) Let dY be the Lebesgue measure corresponding to the metric.Then the exponential application does not introduce a volume change at 0 ∈ p.Since H 0 = 0 and sinh(λ(H)) where Leb 0 refers to the Lebesgue measure on the tangent space T 0 D n as in Eq. 6.Given Z ∈ D n , H Z from Eq. 4 verifies Cφ(exp(Ad k (H Z )))C −1 = Z for some k in K. Thus We have then where the (τ i ) are described in section 2.2.Given where g −1 Z 1 is defined in Eq. 3. It is thus possible to use the density estimator defined in Eq. 7. Indeed, where the (τ i ) are the diagonal elements of 4 Application to radar processing

Radar data
In STAP radar processing, the signal is formed by a succession of matrix X representing the realization of a temporal and spatial process.Let B + n,m be the set of positive definite block Teoplitz matrices composed of n × n blocks of m × m matrices (PD BT).For a stationary signal the autocorrelation matrix R is PD BT, see [11].Authors of [11] proposed a generalization of Verblunsky coefficients and defined a parametrization of PD BT matrices, in which the metric induced by the Kähler potential is the product metric of an affine invariant metric on Sym + and m − 1 times the metric of the Siegel disk, up to a scaling factor.Among other references, positive definite block Teoplitz matrices have been studied in the context of STAP-radar processing in [5,6,7].

Experimental results
In this section we show density estimation results of the marginal parameters Ω k .For the sake of visualization, only the Siegel disk D 1 is considered.
Recall that D 1 coincides with the Poincaré disk.The results are partly extracted from the conference paper [17].Data used in the experimental tests are radar observations from THALES X-band Radar, recorded during 2014 field trials campaign at Toulouse Blagnac Airport for European FP7 UFO study (Ultra-Fast wind sensOrs for wake-vortex hazards mitigation).Data are representative of Turbulent atmosphere monitored by radar.Fig. 2-3 illustrate the density estimation of six coefficients on the Poincaré unit disk.Fig. 2 correspond to a clear environment and Fig. 3 to a rainy environment.The densities are individually re-scaled for visualization purposes.For each environment the dataset is composed of 120 draws.The densities of the coefficients Ω k are representative of different backgrounds.These information on the background are expected to ease the detection of interesting targets.

Conclusion
Three non parametric density estimation techniques have been considered.The main advantage of histograms in the Euclidean context is there simplicity of use.This make histograms an interesting tool despite the fact that they do not present optimal convergence rate.On the Siegel space, histograms loses their simplicity advantage.They were thus not deeply studied.The orthogonal series density estimation also present technical disadvantages on the Siegel space.Indeed, the series become integrals which make the numerical computation of the estimator more difficult than in the Euclidean case.On the other hand, the use of the kernel density estimator does not present major differences with the Euclidean case.The convergence rate obtained in [1] can be extended to compactly supported random variables on non compact Riemannian manifolds.Furthermore the corrective term whose computation is required to use Euclidean kernels on Riemannian manifolds turns out to have a reasonably simple expression.Our future efforts will concentrate on the use of kernel density estimation on the Siegel space in radar signal processing.We strongly believe that the estimation of the densities of the Ω k will provide a interesting description of the different backgrounds.
on W 1 , vanishes outside of W 2 , and f 2 (x) = 1 outside W 1 and vanishes in U .Define g on M by g = f 1 g + f 2 g 0 on N and g = f 2 g 0 outside of N .Since f 1 + f 2 > 0, g is positive definite everywhere on M .Since f 1 vanishes outside of W 2 , g is smooth on M .Finally, since f 1 = 1 and f 2 = 0 on U , g = g on U .
We can now prove theorem 3.1.Let X be a random variable as in theorem 3.1.Following the notations of the theorem and the lemma, let U = {x ∈ M, d(x, C) < r inj }.U is open, relatively compact and contains C. Let (M , g ) be as in the lemma.Let f and f be the kernel density estimators defined on M and M respectively.Theorem 3.1 of [1] provides the desired results for f .for r ≤ r inj , the support and the values on the support of f and f coincide.Thus the desired result also hold for f .

B
and C three real n by n matrices, let us write A B C −A t = (A, B, C).We have that sp(n, R) = {(A, B, C)|B and C symmetric} The Cartan decomposition of sp(n, R) is given by sp(n, R) = t ⊕ p where t = {(A, B, −B)|B symmetric and A skew-symmetric} p = {(A, B, B)|A, B, symmetric} (1) The Iwasawa decomposition is given by sp(n, R) = t ⊕ a ⊕ n where a = {(H, 0, 0)|H diagonal} n = {(A, B, 0)|A upper triangular with 0 on the diagonal , B symmetric} It can be shown that p is an isometry between the Siegel upper half space and the Siegel disk.Let C = I −iI I iI .The application g ∈ Sp(n, R) → CgC −1 identifies the set of isometries of H n and of D n .Thus, it can be shown that a matrix g = A B A B ∈ Sp(n, C) acts isometrically on D n by g.Z = (AZ + B)(AZ + B) −1

Figure 3 :
Figure 3: Estimation of the density of 6 coefficients Ω k under rainy conditions.The expression of the used kernel is K(x) = 3 π (1−x 2 ) 2 1 x<1 .Densities are rescaled for visual purpose.

Figure 2 :
Figure 2: Estimation of the density of 6 coefficients Ω k under clear conditions.The expression of the used kernel is K(x) = 3 π (1−x 2 ) 2 1 x<1 .Densities are rescaled for visual purpose.