Article

A Generic Formula and Some Special Cases for the Kullback–Leibler Divergence between Central Multivariate Cauchy Distributions

1
ImhorPhen Unit, UMR INRAe IRHS, Institut Agro Rennes-Angers, Université d’Angers, 42 Rue Georges Morel, 49070 Beaucouzé, France
2
LARIS, UMR INRAe IRHS, Université d’Angers, 62 Avenue Notre Dame du Lac, 49000 Angers, France
*
Author to whom correspondence should be addressed.
Entropy 2022, 24(6), 838; https://doi.org/10.3390/e24060838
Submission received: 22 May 2022 / Revised: 15 June 2022 / Accepted: 15 June 2022 / Published: 17 June 2022
(This article belongs to the Special Issue Information and Divergence Measures)

Abstract

This paper introduces a closed-form expression for the Kullback–Leibler divergence (KLD) between two central multivariate Cauchy distributions (MCDs), which have recently been used in several signal and image processing applications where non-Gaussian models are needed. In this overview, the MCDs are surveyed, and some new results and properties of the KLD are derived and discussed. In addition, the KLD for MCDs is shown to be expressible as a function of the Lauricella D-hypergeometric series F_D^(p). Finally, the numerical value of the closed-form expression is compared with a Monte Carlo sampling approximation of the KLD; the Monte Carlo approximation is shown to converge to the theoretical value as the number of samples goes to infinity.

1. Introduction

The multivariate Cauchy distribution (MCD) belongs to the family of elliptically symmetric distributions [1] and is a special case of both the multivariate t-distribution [2] and the multivariate stable distribution [3]. The MCD has recently been used in several signal and image processing applications for which non-Gaussian models are needed, including speckle denoising, color image denoising, watermarking, and speech enhancement. Sahu et al. [4] presented a denoising method for speckle noise removal applied to retinal optical coherence tomography (OCT) images; the method was based on the wavelet transform, with the sub-band coefficients modeled by a Cauchy distribution. In [5], a dual-tree complex wavelet transform (DTCWT)-based despeckling algorithm was proposed for synthetic aperture radar (SAR) images, where the DTCWT coefficients in each subband were modeled with a multivariate Cauchy distribution. In [6], a new color image denoising method in the contourlet domain was suggested for reducing Gaussian noise, with the contourlet subband coefficients described by the heavy-tailed MCD. Sadreazami et al. [7] put forward a multiplicative watermarking scheme in the contourlet domain whose watermark detector, based on the bivariate Cauchy distribution, was designed to capture the across-scale dependencies of the contourlet coefficients. Fontaine et al. [8] proposed a semi-supervised multichannel speech enhancement system in which both speech and noise follow the heavy-tailed multivariate complex Cauchy distribution.
The Kullback–Leibler divergence (KLD), also called relative entropy, is one of the most fundamental measures in information theory and statistics [9,10]. The KLD was first introduced and studied by Kullback and Leibler [11] and Kullback [12] to measure the divergence between two probability mass functions in the case of discrete random variables, and between two univariate or multivariate probability density functions in the case of continuous random variables. In the literature, numerous entropy and divergence measures have been suggested for quantifying the similarity between probability distributions, such as the Rényi [13], Sharma–Mittal [14], Bhattacharyya [15,16], and Hellinger [17] divergence measures. Other general divergence families have also been introduced and studied, such as the ϕ-divergence family defined simultaneously by Csiszár [18] and by Ali and Silvey [19], of which the KLD is a special case; the Bregman divergence family [20]; the R-divergences introduced by Burbea and Rao [21,22,23]; the statistical f-divergences [24,25]; and, recently, the generalized family of (h, ϕ)-divergence measures introduced and studied by Menéndez et al. [26]. Readers are referred to [10] for details about these families of divergence measures.
The KLD has a specific interpretation in coding theory [27] and is therefore also the most popular and widely used divergence. Since information-theoretic divergences, and the KLD in particular, are ubiquitous in the information sciences [28,29], it is important to establish closed-form expressions for them [30]. An analytical expression of the KLD between two univariate Cauchy distributions was presented in [31,32]. To date, the KLD of MCDs has no known explicit form; in practice it is either approximated or estimated using expensive Monte Carlo stochastic integration. Monte Carlo sampling can estimate the KLD accurately provided that a large number of independent and identically distributed samples is available; nevertheless, it is too slow a process to be useful in many applications. The main contribution of this paper is to derive a closed-form expression for the KLD between two central MCDs in the general case, providing a benchmark for future approaches while avoiding expensive Monte Carlo (MC) estimation. The paper is organized as follows. Section 2 introduces the MCD and the KLD. Section 3 gives some definitions and propositions related to the multiple power series used to compute the closed-form expression of the KLD between two central MCDs. In Section 4 and Section 5, expressions of some expectations related to the KLD are developed by exploiting these propositions. Section 6 establishes the final results on the KLD for central MCDs. Section 7 presents some particular cases, such as the KLD for the univariate and bivariate Cauchy distributions. Section 8 describes the implementation of the KLD and a comparison with the Monte Carlo sampling method. A summary and some conclusions are provided in the final section.
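To illustrate the Monte Carlo baseline that the closed form is meant to replace, the following sketch (illustrative code, not from the paper; the function names are ours) estimates the KLD between two central univariate Cauchy distributions by sampling and compares it with the known closed form log((γ₁+γ₂)²/(4γ₁γ₂)) for the central case [31,32].

```python
import math
import random

def kld_cauchy_exact(g1, g2):
    """Closed form for central univariate Cauchy distributions [31,32]."""
    return math.log((g1 + g2) ** 2 / (4.0 * g1 * g2))

def kld_cauchy_mc(g1, g2, n=200_000, seed=0):
    """Monte Carlo estimate of KL(Cauchy(0, g1) || Cauchy(0, g2)).

    Draws x ~ Cauchy(0, g1) via the inverse CDF x = g1*tan(pi*(u - 1/2))
    and averages the log-density ratio ln f1(x) - ln f2(x), where
    f(x; g) = g / (pi (g^2 + x^2)).
    """
    rng = random.Random(seed)
    acc = 0.0
    for _ in range(n):
        x = g1 * math.tan(math.pi * (rng.random() - 0.5))
        acc += math.log(g1 / g2) + math.log((g2 * g2 + x * x) / (g1 * g1 + x * x))
    return acc / n
```

With γ₁ = 1 and γ₂ = 2, the exact value is ln(9/8) ≈ 0.1178, and the MC estimate converges at the usual O(n^(−1/2)) rate, illustrating why a closed form is preferable in applications.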

2. Multivariate Cauchy Distribution and Kullback–Leibler Divergence

Let X be a random vector of R^p that follows the MCD, characterized by the following probability density function (pdf) [2]:

f_X(\mathbf{x} \mid \boldsymbol{\mu}, \Sigma, p) = \frac{\Gamma\left(\frac{1+p}{2}\right)}{\pi^{p/2}\,\Gamma\left(\frac{1}{2}\right)}\, \frac{1}{|\Sigma|^{1/2}}\, \frac{1}{\left[1+(\mathbf{x}-\boldsymbol{\mu})^T \Sigma^{-1}(\mathbf{x}-\boldsymbol{\mu})\right]^{\frac{1+p}{2}}}. (1)
This holds for any x ∈ R^p, where p is the dimensionality of the sample space, μ is the location vector, Σ is a symmetric, positive-definite (p × p) scale matrix, and Γ(·) is the Gamma function. Let X_1 and X_2 be two random vectors that follow central MCDs with pdfs f_{X_1}(x|Σ_1, p) = f_{X_1}(x|0, Σ_1, p) and f_{X_2}(x|Σ_2, p) = f_{X_2}(x|0, Σ_2, p) given by (1). The KLD provides an asymmetric measure of the similarity of two pdfs; the KLD between the two central MCDs is given by
\mathrm{KL}(X_1 \| X_2) = \int_{\mathbb{R}^p} \ln\!\frac{f_{X_1}(\mathbf{x}|\Sigma_1,p)}{f_{X_2}(\mathbf{x}|\Sigma_2,p)}\; f_{X_1}(\mathbf{x}|\Sigma_1,p)\, d\mathbf{x} (2)
= \mathbb{E}_{X_1}\{\ln f_{X_1}(X)\} - \mathbb{E}_{X_1}\{\ln f_{X_2}(X)\}. (3)
Since the KLD is the relative entropy defined as the difference between the cross-entropy and the entropy, we have the following relation:
\mathrm{KL}(X_1 \| X_2) = H(f_{X_1}, f_{X_2}) - H(f_{X_1}) (4)
where H ( f X 1 , f X 2 ) = E X 1 { ln f X 2 ( X ) } denotes the cross-entropy and H ( f X 1 ) = E X 1 { ln f X 1 ( X ) } the entropy. Therefore, the determination of KLD requires the expression of the entropy and the cross-entropy. It should be noted that the smaller KL ( X 1 | | X 2 ) , the more similar are f X 1 ( x | Σ 1 , p ) and f X 2 ( x | Σ 2 , p ) . The symmetric KL similarity measure between X 1 and X 2 is d KL ( X 1 , X 2 ) = KL ( X 1 | | X 2 ) + KL ( X 2 | | X 1 ) . In order to compute the KLD, we have to derive the analytical expressions of E X 1 { ln f X 1 ( X ) } and E X 1 { ln f X 2 ( X ) } which depend, respectively, on E X 1 { ln [ 1 + X T Σ 1 1 X ] } and E X 1 { ln [ 1 + X T Σ 2 1 X ] } . Consequently, the closed-form expression of the KLD between two zero-mean MCDs is given by
\mathrm{KL}(X_1 \| X_2) = \frac{1}{2}\log\frac{|\Sigma_2|}{|\Sigma_1|} - \frac{1+p}{2}\left(\mathbb{E}_{X_1}\{\ln[1+X^T\Sigma_1^{-1}X]\} - \mathbb{E}_{X_1}\{\ln[1+X^T\Sigma_2^{-1}X]\}\right). (5)
To provide the expression of these two expectations, some tools based on the multiple power series are required. The next section presents some definitions and propositions used for this goal.
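For concreteness, the central MCD log-density in (1) can be evaluated as in the following sketch (illustrative code; for simplicity it assumes a diagonal scale matrix Σ, a restriction of ours, so that |Σ| and Σ⁻¹ are immediate):

```python
import math

def mcd_logpdf(x, sigma_diag):
    """Log-density of a central MCD, Eq. (1) with mu = 0.

    Assumes a diagonal scale matrix given by its diagonal `sigma_diag`
    (an illustrative simplification, not a restriction of the paper).
    """
    p = len(x)
    quad = sum(xi * xi / si for xi, si in zip(x, sigma_diag))  # x^T Sigma^{-1} x
    logdet = sum(math.log(s) for s in sigma_diag)              # log |Sigma|
    return (math.lgamma((1 + p) / 2) - (p / 2) * math.log(math.pi)
            - math.lgamma(0.5) - 0.5 * logdet
            - ((1 + p) / 2) * math.log1p(quad))
```

For p = 1 and Σ = 1 this reduces to the univariate Cauchy density 1/(π(1 + x²)), which provides a quick sanity check of the normalizing constant.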

3. Definitions and Propositions

This section presents some definitions and states some propositions related to the multiple power series used to derive closed-form expressions of the expectations E_{X_1}{ln[1 + X^T Σ_1^{-1} X]} and E_{X_1}{ln[1 + X^T Σ_2^{-1} X]}, and consequently of the KLD between two central MCDs.
Definition 1.
The Humbert series of n variables, denoted Φ 2 ( n ) , is defined for all x i C , i = 1 , , n , by the following multiple power series (Section 1.4 in [33])
\Phi_2^{(n)}(b_1,\ldots,b_n; c; x_1,\ldots,x_n) = \sum_{m_1=0}^{\infty}\cdots\sum_{m_n=0}^{\infty} \frac{(b_1)_{m_1}\cdots(b_n)_{m_n}}{(c)_{\sum_{i=1}^{n} m_i}} \prod_{i=1}^{n} \frac{x_i^{m_i}}{m_i!}. (6)
The Pochhammer symbol ( q ) i indicates the i-th rising factorial of q, i.e., for an integer i > 0
(q)_i = q(q+1)\cdots(q+i-1) = \prod_{k=0}^{i-1}(q+k) = \frac{\Gamma(q+i)}{\Gamma(q)}. (7)
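For the numerical work later in the paper, the Pochhammer symbol is straightforward to evaluate; the following sketch (our illustration, not code from the paper) implements it both by the product definition and by the Gamma-ratio identity above.

```python
import math

def poch(q, n):
    """Rising factorial (q)_n = q(q+1)...(q+n-1), with (q)_0 = 1."""
    r = 1.0
    for k in range(n):
        r *= q + k
    return r

def poch_gamma(q, n):
    """Same quantity via the identity (q)_n = Gamma(q+n)/Gamma(q), q > 0."""
    return math.exp(math.lgamma(q + n) - math.lgamma(q))
```

For example, (3)_4 = 3·4·5·6 = 360 and (1/2)_3 = 15/8; both routes agree to floating-point precision.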

3.1. Integral Representation for Φ 2 ( n )

Proposition 1.
The following integral representation holds for Real{c} > Real{∑_{i=1}^{n} b_i} > 0 and Real{b_i} > 0, where Real{·} denotes the real part of a complex number:

\int_{\Delta} \left(1-\sum_{i=1}^{n} u_i\right)^{c-\sum_{i=1}^{n} b_i-1} \prod_{i=1}^{n} u_i^{b_i-1} e^{x_i u_i}\, du_i = B\!\left(b_1,\ldots,b_n,\, c-\sum_{i=1}^{n} b_i\right) \Phi_2^{(n)}(b_1,\ldots,b_n; c; x_1,\ldots,x_n) (8)

where Δ = {(u_1, …, u_n) : 0 ≤ u_i ≤ 1, i = 1, …, n; 0 ≤ u_1 + ⋯ + u_n ≤ 1}, and the multivariate beta function B is the extension of the beta function to more than two arguments (also called the Dirichlet function), defined as (Section 1.6.1 in [34])

B(b_1,\ldots,b_n,b_{n+1}) = \frac{\prod_{i=1}^{n+1}\Gamma(b_i)}{\Gamma\!\left(\sum_{i=1}^{n+1} b_i\right)}. (9)
Proof. 
The power series of exponential function is given by
e^{x_i u_i} = \sum_{m_i=0}^{\infty} \frac{x_i^{m_i}}{m_i!}\, u_i^{m_i}. (10)
By substituting the expression of the exponential into the multiple integrals we have
\int_{\Delta} \left(1-\sum_{i=1}^{n} u_i\right)^{c-\sum b_i-1} \prod_{i=1}^{n} u_i^{b_i-1} e^{x_i u_i}\, du_i = \int_{\Delta} \left(1-\sum_{i=1}^{n} u_i\right)^{c-\sum b_i-1} \prod_{i=1}^{n} \sum_{m_i=0}^{\infty} \frac{x_i^{m_i}}{m_i!}\, u_i^{m_i+b_i-1}\, du_i = \sum_{m_1=0}^{\infty}\cdots\sum_{m_n=0}^{\infty} \prod_{i=1}^{n} \frac{x_i^{m_i}}{m_i!} \times I_D (11)
where the multivariate integral I D , which is a generalization of a beta integral, is the type-1 Dirichlet integral (Section 1.6.1 in [34]) given by
I_D = \int_{\Delta} \left(1-\sum_{i=1}^{n} u_i\right)^{c-\sum b_i-1} \prod_{i=1}^{n} u_i^{m_i+b_i-1}\, du_i = \frac{\prod_{i=1}^{n}\Gamma(b_i+m_i)\;\Gamma\!\left(c-\sum_{i=1}^{n} b_i\right)}{\Gamma\!\left(c+\sum_{i=1}^{n} m_i\right)}. (12)
Knowing that Γ(b_i + m_i) = Γ(b_i)(b_i)_{m_i}, the expression of I_D can be rewritten as

I_D = \frac{\prod_{i=1}^{n}\Gamma(b_i)\;\Gamma\!\left(c-\sum_{i=1}^{n} b_i\right)}{\Gamma(c)}\; \frac{\prod_{i=1}^{n}(b_i)_{m_i}}{(c)_{\sum_{i=1}^{n} m_i}}. (13)
Finally, plugging (13) back into (11) leads to the final result
\frac{\Gamma\!\left(c-\sum_{i=1}^{n} b_i\right)\prod_{i=1}^{n}\Gamma(b_i)}{\Gamma(c)} \sum_{m_1,\ldots,m_n=0}^{+\infty} \frac{\prod_{i=1}^{n}(b_i)_{m_i}}{(c)_{\sum_{i=1}^{n} m_i}} \prod_{i=1}^{n} \frac{x_i^{m_i}}{m_i!} = B\!\left(b_1,\ldots,b_n,\, c-\sum_{i=1}^{n} b_i\right) \Phi_2^{(n)}(b_1,\ldots,b_n; c; x_1,\ldots,x_n). (14) □
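The Dirichlet (multivariate beta) function appearing in Proposition 1 is easy to evaluate stably in log-space; a small sketch (the function name is ours):

```python
import math

def multibeta(*b):
    """Dirichlet (multivariate beta) function of Eq. (9):
    B(b_1, ..., b_{n+1}) = prod Gamma(b_i) / Gamma(sum b_i),
    computed via log-gamma to avoid overflow."""
    return math.exp(sum(math.lgamma(bi) for bi in b) - math.lgamma(sum(b)))
```

With two arguments it is the ordinary beta function, e.g. B(2, 3) = Γ(2)Γ(3)/Γ(5) = 1/12, and B(1, 1, 1) = 1/Γ(3) = 1/2.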
Given Proposition 1, we consider the particular cases n ∈ {1, 2} one by one as follows:
Case n = 1
\frac{1}{B(b_1, c-b_1)} \int_0^1 u_1^{b_1-1}\, e^{x_1 u_1} (1-u_1)^{c-b_1-1}\, du_1 = \sum_{m_1=0}^{\infty} \frac{(b_1)_{m_1}}{(c)_{m_1}} \frac{x_1^{m_1}}{m_1!} = \Phi_2^{(1)}(b_1; c; x_1) = {}_1F_1(b_1; c; x_1) (15)
where 1 F 1 ( . ) is the confluent hypergeometric function of the first kind (Section 9.21 in [35]).
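The n = 1 case can be checked numerically: a truncated series for ₁F₁ (a sketch with an arbitrary truncation depth of ours) reproduces known special values such as ₁F₁(1; 1; x) = e^x and ₁F₁(1; 2; x) = (e^x − 1)/x.

```python
import math

def hyp1f1(b, c, x, terms=60):
    """Truncated power series for the confluent hypergeometric function
    1F1(b; c; x) = sum_m (b)_m / (c)_m * x^m / m!."""
    s, t = 0.0, 1.0  # t holds the current term (b)_m/(c)_m x^m/m!
    for m in range(terms):
        s += t
        t *= (b + m) * x / ((c + m) * (m + 1))
    return s
```

The truncation depth of 60 terms is ample for the moderate arguments used in the checks; for large |x| an adaptive stopping rule would be preferable.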
Case n = 2
\frac{1}{B(b_1,b_2,c-b_1-b_2)} \iint_{u_1\ge 0,\; u_2\ge 0,\; u_1+u_2\le 1} u_1^{b_1-1} u_2^{b_2-1}\, e^{x_1u_1+x_2u_2} (1-u_1-u_2)^{c-b_1-b_2-1}\, du_1\, du_2 = \sum_{m_1=0}^{\infty}\sum_{m_2=0}^{\infty} \frac{(b_1)_{m_1}(b_2)_{m_2}}{(c)_{m_1+m_2}} \frac{x_1^{m_1}}{m_1!}\frac{x_2^{m_2}}{m_2!} = \Phi_2^{(2)}(b_1,b_2;c;x_1,x_2) = \Phi_2(b_1,b_2;c;x_1,x_2) (16)
where the double series Φ 2 is one of the components of the Humbert series of two variables [36] that generalize Kummer’s confluent hypergeometric series 1 F 1 of one variable. The double series Φ 2 converges absolutely at any x 1 , x 2 C .
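A numerical sanity check of the double series (sketch code of ours; the truncation depth is arbitrary): by the Chu–Vandermonde convolution, Φ₂(b₁, b₂; c; x, x) collapses to ₁F₁(b₁ + b₂; c; x), which for b₁ = b₂ = 1/2 and c = 2 equals (e^x − 1)/x.

```python
import math

def poch(q, n):
    """Rising factorial (q)_n."""
    r = 1.0
    for k in range(n):
        r *= q + k
    return r

def phi2(b1, b2, c, x1, x2, terms=40):
    """Truncated Humbert Phi_2 double series, Eq. (16)."""
    s = 0.0
    for m1 in range(terms):
        for m2 in range(terms):
            s += (poch(b1, m1) * poch(b2, m2) / poch(c, m1 + m2)
                  * x1 ** m1 / math.factorial(m1)
                  * x2 ** m2 / math.factorial(m2))
    return s
```

Since Φ₂ converges absolutely for all finite arguments, truncation error decays factorially and 40 × 40 terms are more than enough for small x.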

3.2. Multiple Power Series F N ( n )

Definition 2.
We define a new multiple power series, denoted by F N ( n ) and given by
F_N^{(n)}(a; b_1,\ldots,b_n; c, c_n; x_1,\ldots,x_n) = x_n^{-a} \sum_{m_1,\ldots,m_n=0}^{+\infty} \frac{(a)_{M}\,(a-c_n+1)_{M}}{(a+b_n-c_n+1)_{M}}\; \frac{\prod_{i=1}^{n-1}(b_i)_{m_i}}{(c)_{\sum_{i=1}^{n-1} m_i}} \prod_{i=1}^{n-1} \frac{(x_i x_n^{-1})^{m_i}}{m_i!}\; \frac{(1-x_n^{-1})^{m_n}}{m_n!}, \qquad M = \sum_{i=1}^{n} m_i. (17)

The multiple power series (17) is absolutely convergent on the region |x_i x_n^{-1}| + |1 − x_n^{-1}| < 1, i ∈ {1, …, n−1}, in C^n.
The multiple power series F N ( n ) ( . ) can also be transformed into two other expressions as follows
F_N^{(n)}(a; b_1,\ldots,b_n; c, c_n; x_1,\ldots,x_n) = \sum_{m_1,\ldots,m_n=0}^{+\infty} \frac{(a-c_n+1)_{\sum_{i=1}^{n-1}m_i}\,(b_n)_{m_n}\,(a)_{M}}{(a+b_n-c_n+1)_{M}}\; \frac{\prod_{i=1}^{n-1}(b_i)_{m_i}}{(c)_{\sum_{i=1}^{n-1}m_i}} \prod_{i=1}^{n-1}\frac{x_i^{m_i}}{m_i!}\; \frac{(1-x_n)^{m_n}}{m_n!}, (18)
= x_n^{1-c_n} \sum_{m_1,\ldots,m_n=0}^{+\infty} \frac{(a-c_n+1)_{M}\,(b_n-c_n+1)_{m_n}\,(a)_{\sum_{i=1}^{n-1}m_i}}{(a+b_n-c_n+1)_{M}}\; \frac{\prod_{i=1}^{n-1}(b_i)_{m_i}}{(c)_{\sum_{i=1}^{n-1}m_i}} \prod_{i=1}^{n-1}\frac{x_i^{m_i}}{m_i!}\; \frac{(1-x_n)^{m_n}}{m_n!}, (19)

where M = \sum_{i=1}^{n} m_i.
By Horn’s rule for the determination of the convergence region (see [37], Section 5.7.2), the multiple power series (18) and (19) are absolutely convergent on region | x i | < 1 , i { 1 , , n 1 } , | 1 x n | < 1 in C n .
Equation (18) can be deduced from (17) by the following development, in which the F_N^{(n)} function is written as
F_N^{(n)}(a; b_1,\ldots,b_n; c, c_n; x_1,\ldots,x_n) = x_n^{-a} \sum_{m_1,\ldots,m_{n-1}=0}^{+\infty} \frac{(a)_{\sigma}\,(a-c_n+1)_{\sigma}}{(a+b_n-c_n+1)_{\sigma}}\; \frac{\prod_{i=1}^{n-1}(b_i)_{m_i}}{(c)_{\sigma}} \prod_{i=1}^{n-1}\frac{(x_i x_n^{-1})^{m_i}}{m_i!} \sum_{m_n=0}^{\infty} \frac{(\alpha)_{m_n}\,(\alpha-c_n+1)_{m_n}}{(\alpha+b_n-c_n+1)_{m_n}}\; \frac{(1-x_n^{-1})^{m_n}}{m_n!} (20)

where σ = \sum_{i=1}^{n-1} m_i and α = a + σ are used to lighten the notation. Using the definition of Gauss' hypergeometric series {}_2F_1(·) [34] and the Pfaff transformation [38], we can write

\sum_{m_n=0}^{\infty} \frac{(\alpha)_{m_n}\,(\alpha-c_n+1)_{m_n}}{(\alpha+b_n-c_n+1)_{m_n}}\; \frac{(1-x_n^{-1})^{m_n}}{m_n!} = {}_2F_1\!\left(\alpha,\, \alpha-c_n+1;\; \alpha+b_n-c_n+1;\; 1-x_n^{-1}\right) (21)
= x_n^{\alpha}\; {}_2F_1\!\left(\alpha,\, b_n;\; \alpha+b_n-c_n+1;\; 1-x_n\right) (22)
= x_n^{\alpha} \sum_{m_n=0}^{\infty} \frac{(\alpha)_{m_n}\,(b_n)_{m_n}}{(\alpha+b_n-c_n+1)_{m_n}}\; \frac{(1-x_n)^{m_n}}{m_n!}. (23)
By substituting (23) into (20), and using the following two relations:
(a)_{\sum_{i=1}^{n-1}m_i}\,(\alpha)_{m_n} = (a)_{\sum_{i=1}^{n}m_i}, (24)
(a+b_n-c_n+1)_{\sum_{i=1}^{n-1}m_i}\,(\alpha+b_n-c_n+1)_{m_n} = (a+b_n-c_n+1)_{\sum_{i=1}^{n}m_i}, (25)
we can get (18).
The second transformation is given as follows
{}_2F_1\!\left(\alpha,\, \alpha-c_n+1;\; \alpha+b_n-c_n+1;\; 1-x_n^{-1}\right) = x_n^{\alpha-c_n+1}\; {}_2F_1\!\left(b_n-c_n+1,\, \alpha-c_n+1;\; \alpha+b_n-c_n+1;\; 1-x_n\right) (26)
= x_n^{\alpha-c_n+1} \sum_{m_n=0}^{\infty} \frac{(\alpha-c_n+1)_{m_n}\,(b_n-c_n+1)_{m_n}}{(\alpha+b_n-c_n+1)_{m_n}}\; \frac{(1-x_n)^{m_n}}{m_n!}. (27)
By substituting (27) into (20), we get (19).
Lemma 1.
The multiple power series F_N^{(n)} is equal to the Lauricella D-hypergeometric function F_D^{(n)} (see Appendix A) [39] when a − c_n + 1 = c, as follows:
F_N^{(n)}(a; b_1,\ldots,b_n; c, c_n; x_1,\ldots,x_n) = \sum_{m_1,\ldots,m_n=0}^{+\infty} \frac{(a)_{\sum_{i=1}^{n} m_i}\,\prod_{i=1}^{n}(b_i)_{m_i}}{(a+b_n-c_n+1)_{\sum_{i=1}^{n} m_i}} \prod_{i=1}^{n-1}\frac{x_i^{m_i}}{m_i!}\; \frac{(1-x_n)^{m_n}}{m_n!} (28)
= F_D^{(n)}(a, b_1,\ldots,b_n;\; a+b_n-c_n+1;\; x_1,\ldots,x_{n-1}, 1-x_n) (29)
Proof. 
Using Equation (18) for the multiple power series F_N^{(n)} and cancelling the factor (a − c_n + 1)_{\sum_{i=1}^{n-1} m_i} in the numerator against (c)_{\sum_{i=1}^{n-1} m_i} in the denominator (these coincide when a − c_n + 1 = c), we obtain the result. □

3.3. Integral Representation for F N ( n + 1 )

Proposition 2.
The following integral representation holds for Real{a} > 0, Real{a − c_{n+1} + 1} > 0, and Real{a − c_{n+1} + b_{n+1} + 1} > 0:
Γ ( a ) Γ ( a c n + 1 + 1 ) Γ ( a c n + 1 + b n + 1 + 1 ) F N ( n + 1 ) ( a ; b 1 , , b n + 1 ; c , c n + 1 ; x 1 , , x n + 1 ) = 0 e r r a 1 Φ 2 ( n ) ( b 1 , , b n ; c ; r x 1 , , r x n ) U ( b n + 1 , c n + 1 ; r x n + 1 ) d r
where U ( · ) is the confluent hypergeometric function of the second kind (Section 9.21 in [35]) defined for Real { b } > 0 , Real { z } > 0 by the following integral representation
U(b, c; z) = \frac{1}{\Gamma(b)} \int_0^{\infty} e^{-zt}\, t^{b-1} (1+t)^{c-b-1}\, dt (31)
and Φ 2 ( n ) ( · ) is defined by Equation (6).
Proof. 
The multiple power series Φ_2^{(n)} and the confluent hypergeometric function U(·) are absolutely convergent on [0, +∞). Substituting these functions into the above integral and interchanging integration and summation, which is justified by absolute convergence, we get
0 e r r a 1 Φ 2 ( n ) ( b 1 , , b n ; c ; r x 1 , , r x n ) U ( b n + 1 , c n + 1 ; r x n + 1 ) d r = m 1 = 0 . . m n = 0 ( b 1 ) m 1 ( b n ) m n ( c ) i = 1 n m i i = 1 n x i m i m i ! I
where integral I is defined as follows
I = 0 e r r a 1 + i = 1 n m i U ( b n + 1 , c n + 1 ; r x n + 1 ) d r .
Substituting the integral expression of U ( · ) in the previous equation and replacing α = a + i = 1 n m i to alleviate writing equations, we have
I = 1 Γ ( b n + 1 ) 0 0 e ( 1 + x n + 1 t ) r r α 1 t b n + 1 1 ( 1 + t ) ( c n + 1 b n + 1 1 ) d r d t .
Knowing that [35]
\int_0^{+\infty} e^{-(1+x_{n+1}t)r}\, r^{\alpha-1}\, dr = \frac{\Gamma(\alpha)}{(1+x_{n+1}t)^{\alpha}} (35)
and
0 t b n + 1 1 ( 1 + t ) c n + 1 b n + 1 1 ( 1 + x n + 1 t ) α d t = Γ ( b n + 1 ) Γ ( α c n + 1 + 1 ) Γ ( α + b n + 1 c n + 1 + 1 ) 2 F 1 α , b n + 1 ; α + b n + 1 c n + 1 + 1 ; 1 x n + 1
the new expression of I is then given by
I = Γ ( α ) Γ ( α c n + 1 + 1 ) Γ ( α + b n + 1 c n + 1 + 1 ) m n + 1 = 0 + ( α ) m n + 1 ( b n + 1 ) m n + 1 ( α + b n + 1 c n + 1 + 1 ) m n + 1 ( 1 x n + 1 ) m n + 1 m n + 1 ! .
Using the fact that Γ(α) = Γ(a)(a)_{\sum_{i=1}^{n} m_i} and (a)_{\sum_{i=1}^{n} m_i}(α)_{m_{n+1}} = (a)_{\sum_{i=1}^{n+1} m_i}, and applying the same reasoning to Γ(α + b_{n+1} − c_{n+1} + 1), the final expression of the integral is then given by
Γ ( a ) Γ ( a c n + 1 + 1 ) Γ ( a + b n + 1 c n + 1 + 1 ) m 1 = 0 . . m n + 1 = 0 ( b 1 ) m 1 ( b n ) m n ( c ) i = 1 n m i ( a c n + 1 + 1 ) i = 1 n m i ( b n + 1 ) m n + 1 ( a ) i = 1 n + 1 m i ( a + b n + 1 c n + 1 + 1 ) i = 1 n + 1 m i i = 1 n x i m i m i ! × ( 1 x n + 1 ) m n + 1 m n + 1 ! = Γ ( a ) Γ ( a c n + 1 + 1 ) Γ ( a c n + 1 + b n + 1 + 1 ) F N ( n + 1 ) ( a ; b 1 , , b n + 1 ; c , c n + 1 ; x 1 , , x n + 1 ) .

4. Expression of E X 1 { ln [ 1 + X T Σ 1 1 X ] }

Proposition 3.
Let X 1 be a random vector that follows a central MCD with pdf given by f X 1 ( x | Σ 1 , p ) . Expectation E X 1 { ln [ 1 + X T Σ 1 1 X ] } is given as follows
\mathbb{E}_{X_1}\{\ln[1+X^T\Sigma_1^{-1}X]\} = \psi\!\left(\frac{1+p}{2}\right) - \psi\!\left(\frac{1}{2}\right) (39)
where ψ ( . ) is the digamma function defined as the logarithmic derivative of the Gamma function (Section 8.36 in [35]).
Proof. 
Expectation E_{X_1}{ln[1 + X^T Σ_1^{-1} X]} is developed as follows:

\mathbb{E}_{X_1}\{\ln[1+X^T\Sigma_1^{-1}X]\} = \frac{A}{|\Sigma_1|^{1/2}} \int_{\mathbb{R}^p} \frac{\ln[1+\mathbf{x}^T\Sigma_1^{-1}\mathbf{x}]}{[1+\mathbf{x}^T\Sigma_1^{-1}\mathbf{x}]^{\frac{1+p}{2}}}\, d\mathbf{x} (40)

where A = \frac{\Gamma\left(\frac{1+p}{2}\right)}{\pi^{\frac{1+p}{2}}}. Using the property \int \ln(x) f(x)\, dx = \frac{\partial}{\partial a}\int x^a f(x)\, dx\,\big|_{a=0}, the expectation can be written as

\mathbb{E}_{X_1}\{\ln[1+X^T\Sigma_1^{-1}X]\} = \frac{A}{|\Sigma_1|^{1/2}}\, \frac{\partial}{\partial a} \int_{\mathbb{R}^p} [1+\mathbf{x}^T\Sigma_1^{-1}\mathbf{x}]^{a-\frac{1+p}{2}}\, d\mathbf{x}\, \Big|_{a=0} (41)
Consider the transformation y = Σ_1^{-1/2} x, where y = [y_1, y_2, …, y_p]^T; the Jacobian determinant gives dy = |Σ_1|^{-1/2} dx (Theorem 1.12 in [40]). The new expression of the expectation is given by
\mathbb{E}_{X_1}\{\ln[1+X^T\Sigma_1^{-1}X]\} = A\, \frac{\partial}{\partial a} \int_{\mathbb{R}^p} [1+\mathbf{y}^T\mathbf{y}]^{a-\frac{1+p}{2}}\, d\mathbf{y}\, \Big|_{a=0}. (42)
Let u = y T y be a transformation where the Jacobian determinant is given by (Lemma 13.3.1 in [41])
d\mathbf{y} = \frac{\pi^{p/2}}{\Gamma\left(\frac{p}{2}\right)}\, u^{\frac{p}{2}-1}\, du. (43)
The new expectation is as follows
\mathbb{E}_{X_1}\{\ln[1+X^T\Sigma_1^{-1}X]\} = \frac{\Gamma\left(\frac{1+p}{2}\right)}{\pi^{1/2}\,\Gamma\left(\frac{p}{2}\right)}\, \frac{\partial}{\partial a} \int_0^{+\infty} u^{\frac{p}{2}-1} (1+u)^{a-\frac{1+p}{2}}\, du\, \Big|_{a=0} (44)

Using the definition of the beta function, we can write

\int_0^{+\infty} u^{\frac{p}{2}-1} (1+u)^{a-\frac{1+p}{2}}\, du = \frac{\Gamma\left(\frac{p}{2}\right)\Gamma\left(\frac{1}{2}-a\right)}{\Gamma\left(\frac{1+p}{2}-a\right)}. (45)
The derivative of the last integral with respect to a is

\frac{\partial}{\partial a} \int_0^{+\infty} u^{\frac{p}{2}-1} (1+u)^{a-\frac{1+p}{2}}\, du\, \Big|_{a=0} = \frac{\Gamma\left(\frac{p}{2}\right)\Gamma\left(\frac{1}{2}\right)}{\Gamma\left(\frac{1+p}{2}\right)} \left[\psi\!\left(\frac{1+p}{2}\right) - \psi\!\left(\frac{1}{2}\right)\right] (46)

Finally, the expression of E_{X_1}{ln[1 + X^T Σ_1^{-1} X]} is given by

\mathbb{E}_{X_1}\{\ln[1+X^T\Sigma_1^{-1}X]\} = \psi\!\left(\frac{1+p}{2}\right) - \psi\!\left(\frac{1}{2}\right). (47) □
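Proposition 3 is easy to verify by simulation (an illustrative check of ours, not part of the paper): for Σ₁ = I_p, draw X from the MCD via its Gaussian scale-mixture representation (a multivariate t-distribution with one degree of freedom) and average ln(1 + XᵀX). For p = 2, the right-hand side ψ(3/2) − ψ(1/2) equals exactly 2.

```python
import math
import random

def mc_expect_log_quad(p, n=200_000, seed=1):
    """Monte Carlo estimate of E{ln(1 + X^T X)} for X ~ central MCD, Sigma = I_p.

    Uses X = Z / |g| with Z ~ N(0, I_p) and g ~ N(0, 1), i.e. the
    multivariate t-distribution with 1 degree of freedom.
    """
    rng = random.Random(seed)
    acc = 0.0
    for _ in range(n):
        g = rng.gauss(0.0, 1.0)
        quad = sum((rng.gauss(0.0, 1.0) / g) ** 2 for _ in range(p))
        acc += math.log1p(quad)
    return acc / n
```

Despite the heavy tails of the MCD, ln(1 + XᵀX) has finite variance, so the estimate converges at the standard Monte Carlo rate.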

5. Expression of E X 1 { ln [ 1 + X T Σ 2 1 X ] }

Proposition 4.
Let X 1 and X 2 be two random vectors that follow central MCDs with pdfs given, respectively, by f X 1 ( x | Σ 1 , p ) and f X 2 ( x | Σ 2 , p ) . Expectation E X 1 { ln [ 1 + X T Σ 2 1 X ] } is given as follows
\mathbb{E}_{X_1}\{\ln[1+X^T\Sigma_2^{-1}X]\} = \psi\!\left(\tfrac{1+p}{2}\right) - \psi\!\left(\tfrac{1}{2}\right) + \ln\lambda_p - \frac{\partial}{\partial a} F_D^{(p)}\!\left(a, \tfrac12,\ldots,\tfrac12, a+\tfrac12;\; a+\tfrac{1+p}{2};\; 1-\tfrac{\lambda_1}{\lambda_p},\ldots,1-\tfrac{\lambda_{p-1}}{\lambda_p},\, 1-\tfrac{1}{\lambda_p}\right)\Big|_{a=0} (48)
where λ_1, …, λ_p are the eigenvalues of the real matrix Σ_1 Σ_2^{-1}, and F_D^{(p)}(·) denotes the Lauricella D-hypergeometric function of p variables.
Proof. 
To prove Proposition 4, different steps are necessary. They are described in the following:

5.1. First Step: Eigenvalue Expression

Expectation E X 1 { ln [ 1 + X T Σ 2 1 X ] } is computed as follows
E X 1 { ln [ 1 + X T Σ 2 1 X ] } = A | Σ 1 | 1 2 R p ln [ 1 + x T Σ 2 1 x ] [ 1 + x T Σ 1 1 x ] 1 + p 2 d x
where A = Γ ( 1 + p 2 ) π 1 + p 2 . Consider transformation y = Σ 1 1 / 2 x where y = [ y 1 , y 2 , , y p ] T . The Jacobian determinant is given by d y = | Σ 1 | 1 / 2 d x (Theorem 1.12 in [40]) and matrix Σ = Σ 1 1 2 Σ 2 1 Σ 1 1 2 is a real symmetric matrix since Σ 1 and Σ 2 are real symmetric matrixes. Then, the expectation is evaluated as follows
E X 1 { ln [ 1 + X T Σ 2 1 X ] } = A R p ln [ 1 + y T Σ y ] [ 1 + y T y ] 1 + p 2 d y .
Matrix Σ can be diagonalized by an orthogonal matrix P with P 1 = P T and Σ = P D P 1 where D is a diagonal matrix composed of the eigenvalues of Σ . Considering that y T Σ y = tr ( Σ y y T ) = tr ( P D P T y y T ) = tr ( D P T y y T P ) , the expectation can be written as
E X 1 { ln [ 1 + X T Σ 2 1 X ] } = A R p ln [ 1 + tr ( D P T y y T P ) ] [ 1 + y T y ] 1 + p 2 d y .
Let z = P T y with z = [ z 1 , z 2 , , z p ] T be a transformation where the Jacobian determinant is given by d z = | P T | d y = d y . Using the fact that tr ( D P T y y T P ) = tr ( D z z T ) = z T D z and y T y = z T P T P z = z T z , then the previous expectation (51) is given as follows
\mathbb{E}_{X_1}\{\ln[1+X^T\Sigma_2^{-1}X]\} = A \int_{\mathbb{R}^p} \frac{\ln[1+\mathbf{z}^T D \mathbf{z}]}{[1+\mathbf{z}^T\mathbf{z}]^{\frac{1+p}{2}}}\, d\mathbf{z} (52)
= A \int_{\mathbb{R}}\!\cdots\!\int_{\mathbb{R}} \frac{\ln\left[1+\sum_{i=1}^{p}\lambda_i z_i^2\right]}{\left[1+\sum_{i=1}^{p} z_i^2\right]^{\frac{1+p}{2}}}\, dz_1 \cdots dz_p (53)
where λ 1 ,…, λ p are the eigenvalues of Σ 1 Σ 2 1 .

5.2. Second Step: Polar Decomposition

Let the independent real variables z_1, …, z_p be transformed to general polar coordinates r, θ_1, …, θ_{p−1}, where r > 0, −π/2 < θ_j ≤ π/2 for j = 1, …, p−2, and −π < θ_{p−1} ≤ π [40]:
z 1 = r sin θ 1
z 2 = r cos θ 1 sin θ 2
z j = r cos θ 1 cos θ 2 cos θ j 1 sin θ j , j = 2 , 3 , , p 1
z p = r cos θ 1 cos θ 2 cos θ p 1 .
The Jacobian determinant, according to Theorem 1.24 in [40], is

dz_1 \cdots dz_p = r^{p-1} \prod_{j=1}^{p-1} |\cos\theta_j|^{p-j-1}\, dr\, d\theta_j. (58)
It is clear that with the last transformations, we get i = 1 p z i 2 = r 2 and the multiple integral in (53) is then given as follows
E X 1 { ln [ 1 + X T Σ 2 1 X ] } = A 0 + r p 1 [ 1 + r 2 ] 1 + p 2 π / 2 π / 2 . . π π j = 1 p 1 | cos θ j | p j 1 × ln 1 + r 2 ( λ 1 sin 2 θ 1 + + λ p cos 2 θ 1 cos 2 θ p 1 ) d r j = 1 p 1 d θ j .
Replacing sin²θ_j by 1 − cos²θ_j for j = 1, …, p−1, we obtain

\lambda_1\sin^2\theta_1 + \cdots + \lambda_p\cos^2\theta_1\cdots\cos^2\theta_{p-1} = \lambda_1 + (\lambda_2-\lambda_1)\cos^2\theta_1 + \cdots + (\lambda_p-\lambda_{p-1})\cos^2\theta_1\cos^2\theta_2\cdots\cos^2\theta_{p-1}. (60)
Let x_i = cos²θ_i, so that dx_i = −2 x_i^{1/2}(1 − x_i)^{1/2} dθ_i. The expectation, written as a multiple integral over all θ_j, j = 1, …, p−1, then becomes
2 A 0 + r p 1 [ 1 + r 2 ] 1 + p 2 0 1 0 1 j = 1 p 1 x j p j 2 1 ( 1 x j ) 1 2 ln [ 1 + r 2 B p ( x 1 , , x p 1 ) ] d r d x 1 d x p 1
where B p ( x 1 , , x p 1 ) = λ 1 + ( λ 2 λ 1 ) x 1 + + ( λ p λ p 1 ) x 1 x 2 x p 1 , p 1 and B 1 = λ 1 . In the following, we use the notation B p instead of B p ( x 1 , , x p 1 ) to alleviate writing equations.
Let t = r². Then, one can write
= A 0 + t p 2 1 [ 1 + t ] 1 + p 2 0 1 0 1 j = 1 p 1 x j p j 2 1 ( 1 x j ) 1 2 ln [ 1 + t B p ] d t d x 1 d x p 1 .
To solve the integral in (62), we again use the property \int \ln(x) f(x)\, dx = \frac{\partial}{\partial a}\int x^a f(x)\, dx\,\big|_{a=0}, together with the identity

\left(1+B_p t\right)^{-a} = \frac{1}{\Gamma(a)} \int_0^{+\infty} y^{a-1} e^{-(1+B_p t)y}\, dy. (63)
Making use of the above equation, we obtain a new expression of (62) given as follows
E X 1 { ln [ 1 + X T Σ 2 1 X ] } = a A Γ ( a ) 0 + t p 2 1 [ 1 + t ] 1 + p 2 0 + y a 1 e ( 1 + B p t ) y 0 1 0 1 j = 1 p 1 x j p j 2 1 ( 1 x j ) 1 2 d x j d y d t | a = 0
= a A Γ ( a ) 0 + t p 2 1 [ 1 + t ] 1 + p 2 0 + y a 1 e y H ( t , y ) d y d t | a = 0
where H ( t , y ) is defined as
H ( t , y ) = 0 1 0 1 e B p t y j = 1 p 1 x j p j 2 1 ( 1 x j ) 1 2 d x j .

5.3. Third Step: Expression for H(t,y) by Humbert and Beta Functions

Substituting x_i → 1 − x_i for i = 1, …, p−1, each term of B_p transforms as

(\lambda_2-\lambda_1)x_1 \;\rightarrow\; (\lambda_2-\lambda_1)(1-x_1) (67)
(\lambda_3-\lambda_2)x_1x_2 \;\rightarrow\; (\lambda_3-\lambda_2)(1-x_1)(1-x_2) (68)
(\lambda_4-\lambda_3)x_1x_2x_3 \;\rightarrow\; (\lambda_4-\lambda_3)(1-x_1)(1-x_2)(1-x_3), \ldots (69)
(\lambda_p-\lambda_{p-1})\prod_{i=1}^{p-1}x_i \;\rightarrow\; (\lambda_p-\lambda_{p-1})\prod_{i=1}^{p-1}(1-x_i). (70)

Adding Equations (67) to (70), the new expression of B_p becomes

B_p = \lambda_p - (\lambda_p-\lambda_1)x_1 - (\lambda_p-\lambda_2)(1-x_1)x_2 - (\lambda_p-\lambda_3)(1-x_1)(1-x_2)x_3 - \cdots - (\lambda_p-\lambda_{p-1})(1-x_1)\cdots(1-x_{p-2})x_{p-1}. (71)
Then, the multiple integral H ( t , y ) given by (66) can be written otherwise
H ( t , y ) = 0 1 0 1 e B p t y j = 1 p 1 ( 1 x j ) p j 2 1 x j 1 2 d x 1 d x p 1 .
Let the real variables x_1, x_2, …, x_{p−1} be transformed to the real variables u_1, u_2, …, u_{p−1} as follows:

u_1 = x_1 (73)
u_2 = (1-x_1)x_2 = (1-u_1)x_2 (74)
u_3 = (1-x_1)(1-x_2)x_3 = (1-u_1-u_2)x_3, \ldots (75)
u_{p-1} = \prod_{i=1}^{p-2}(1-x_i)\, x_{p-1} = \left(1-\sum_{i=1}^{p-2}u_i\right) x_{p-1}. (76)

The Jacobian determinant is given by

du_1 \cdots du_{p-1} = \prod_{j=1}^{p-1}\left(1-\sum_{i=1}^{j-1}u_i\right)\, dx_1 \cdots dx_{p-1}. (77)

Accordingly, the new expression of B_p becomes

B_p = \lambda_p - \sum_{i=1}^{p-1}(\lambda_p-\lambda_i)\, u_i. (78)
As a consequence, the new domain of the multiple integral (72) is Δ = { ( u 1 , u 2 , , u p 1 ) R p 1 ; 0 u 1 1 , 0 u 2 1 u 1 , 0 u 3 1 u 1 u 2 , , and 0 u p 1 1 u 1 u 2 u p 2 } , and the expression of H ( t , y ) is given as follows
H ( t , y ) = Δ e B p t y j = 1 p 1 1 i = 1 j 1 u i 1 u j 1 i = 1 j 1 u i 1 2 j = 1 p 1 1 u j 1 i = 1 j 1 u i p j 2 1 d u j
= Δ e B p t y j = 1 p 1 u j 1 2 1 i = 1 j u i p j 2 1 1 i = 1 j 1 u i 1 2 p j 2 d u 1 d u p 1
= Δ e B p t y 1 i = 1 p 1 u i p 2 p 1 2 1 j = 1 p 1 u j 1 2 d u j
= e λ p t y Δ 1 i = 1 p 1 u i 1 2 i = 1 p 1 u i 1 2 e ( λ p λ i ) u i t y d u i .
Using Proposition 1, we subsequently find that
H ( t , y ) = e λ p t y B 1 2 , , 1 2 p Φ 2 ( p 1 ) 1 2 , , 1 2 p 1 ; p 2 ; ( λ p λ 1 ) t y , ( λ p λ 2 ) t y , , ( λ p λ p 1 ) t y .
where Φ 2 ( p 1 ) ( . ) is the Humbert series of p 1 variables and B ( 1 2 , , 1 2 ) is the multivariate beta function. Applying the following successive two transformations r = t y ( d r = t d y ) and u = 1 / t ( d u = u 2 d t ), the new expression of the expectation given by (65) is written as follows
E X 1 { ln [ 1 + X T Σ 2 1 X ] } = a { A Γ ( a ) B 1 2 , , 1 2 p 0 + r a 1 e λ p r × Φ 2 ( p 1 ) 1 2 , , 1 2 p 1 ; p 2 ; ( λ p λ 1 ) r , , ( λ p λ p 1 ) r 0 + u a 1 2 ( 1 + u ) 1 + p 2 e r u d u d r } | a = 0 .

5.4. Final Step

The last integral is related to the confluent hypergeometric function of the second kind U ( . ) as follows
0 + u a 1 2 ( 1 + u ) 1 + p 2 e r u d u = Γ ( a + 1 2 ) U ( a + 1 2 , a + 1 p 2 , r ) .
As a consequence, the new expression is
E X 1 { ln [ 1 + X T Σ 2 1 X ] } = a { A Γ ( a + 1 2 ) Γ ( a ) B 1 2 , , 1 2 × 0 + r a 1 e λ p r Φ 2 ( p 1 ) 1 2 , , 1 2 ; p 2 ; 1 ; ( λ p λ 1 ) r , , ( λ p λ p 1 ) r U ( a + 1 2 , a + 1 p 2 , r ) d r } | a = 0 .
Using the transformation r = λ p r and the Proposition 2, and taking into account the expression of A, the new expression becomes
E X 1 { ln [ 1 + X T Σ 2 1 X ] } = a { B ( a + 1 2 , p 2 ) B ( 1 2 , p 2 ) λ p a × F N ( p ) a ; 1 2 , , 1 2 , a + 1 2 p ; p 2 , a p 2 + 1 ; 1 λ 1 λ p , , 1 λ p 1 λ p , λ p 1 } | a = 0
Knowing that
a B ( p 2 , a + 1 2 ) B ( p 2 , 1 2 ) | a = 0 = ψ 1 2 ψ 1 + p 2 , and
F N ( p ) a ; 1 2 , , 1 2 , a + 1 2 ; p 2 , a p 2 + 1 ; 1 λ 1 λ p , , 1 λ p 1 λ p , λ p 1 | a = 0 = 1 ,
the new expression of E X 1 { ln [ 1 + X T Σ 2 1 X ] } becomes
E X 1 { ln [ 1 + X T Σ 2 1 X ] } = ψ 1 + p 2 ψ 1 2 a λ p a F N ( p ) a ; 1 2 , , 1 2 , a + 1 2 p ; p 2 , a p 2 + 1 ; 1 λ 1 λ p , , 1 λ p 1 λ p , λ p 1 | a = 0 .
Applying the expression given by (18) of Definition 2 and relying on Lemma 1, the final result corresponds to the D-hypergeometric function of Lauricella F D ( p ) ( . ) given by
E X 1 { ln [ 1 + X T Σ 2 1 X ] } = ψ 1 + p 2 ψ 1 2 a λ p a m 1 , , m p = 0 + ( a ) i = 1 p m i ( a + 1 2 ) m p i = 1 p 1 ( 1 2 ) m i ( a + 1 + p 2 ) i = 1 p m i i = 1 p 1 1 λ i λ p m i 1 m i ! ( 1 λ p 1 ) m p m p ! | a = 0
= ψ 1 + p 2 ψ 1 2 a λ p a F D ( p ) a , 1 2 , , 1 2 , a + 1 2 p ; a + 1 + p 2 ; 1 λ 1 λ p , , 1 λ p 1 λ p , 1 1 λ p | a = 0 .
The final development of the previous expression is as follows
\mathbb{E}_{X_1}\{\ln[1+X^T\Sigma_2^{-1}X]\} = \psi\!\left(\tfrac{1+p}{2}\right) - \psi\!\left(\tfrac{1}{2}\right) + \ln\lambda_p - \frac{\partial}{\partial a} F_D^{(p)}\!\left(a, \tfrac12,\ldots,\tfrac12, a+\tfrac12;\; a+\tfrac{1+p}{2};\; 1-\tfrac{\lambda_1}{\lambda_p},\ldots,1-\tfrac{\lambda_{p-1}}{\lambda_p},\, 1-\tfrac{1}{\lambda_p}\right)\Big|_{a=0}. (93)
In this section, we derived the exact expression of E_{X_1}{ln[1 + X^T Σ_2^{-1} X]}. Moreover, the multiple power series F_D^{(p)}, which appears here as a special case of F_N^{(p)}, possesses many properties and transformations (see Appendix A) that facilitate the convergence of the multiple power series. In the next section, we use this expectation to establish the closed-form expression of the KLD.

6. KLD between Two Central MCDs

Plugging (39) and (93) into (5) yields the closed-form expression of the KLD between two central MCDs with pdfs f X 1 ( x | Σ 1 , p ) and f X 2 ( x | Σ 2 , p ) . This result is presented in the following theorem.
Theorem 1.
Let X 1 and X 2 be two random vectors that follow central MCDs with pdfs given, respectively, by f X 1 ( x | Σ 1 , p ) and f X 2 ( x | Σ 2 , p ) . The Kullback–Leibler divergence between central MCDs is
\mathrm{KL}(X_1 \| X_2) = -\frac{1}{2}\log\prod_{i=1}^{p}\lambda_i + \frac{1+p}{2}\left[\log\lambda_p - \frac{\partial}{\partial a} F_D^{(p)}\!\left(a, \tfrac12,\ldots,\tfrac12, a+\tfrac12;\; a+\tfrac{1+p}{2};\; 1-\tfrac{\lambda_1}{\lambda_p},\ldots,1-\tfrac{\lambda_{p-1}}{\lambda_p},\, 1-\tfrac{1}{\lambda_p}\right)\Big|_{a=0}\right] (94)
where λ_1, …, λ_p are the eigenvalues of the real matrix Σ_1 Σ_2^{-1}, and F_D^{(p)}(·) denotes the Lauricella D-hypergeometric function of p variables.
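As a numerical illustration of the theorem (sketch code; the function name, the truncation depth, and the restriction to p ≤ 2 are ours, not the paper's), the a-derivative of the Lauricella series can be evaluated term by term: at a = 0 only the derivative of the leading factor (a)_{Σmᵢ} survives, and it equals (Σmᵢ − 1)!. The sketch assumes the eigenvalues lie in the series' convergence region (|1 − λᵢ/λ_p| < 1 and |1 − 1/λ_p| < 1) and is checked against the known univariate closed form log((σ₁+σ₂)²/(4σ₁σ₂)) [31,32], with λ₁ = σ₁²/σ₂².

```python
import math

def kld_central_mcd(lams, terms=60):
    """KLD between central MCDs via the truncated Lauricella series
    of Theorem 1, for dimensions p = 1 or 2 only (our restriction).
    `lams` lists the eigenvalues of Sigma1 * inv(Sigma2); lams[-1]
    plays the role of lambda_p and must satisfy the convergence region."""
    p = len(lams)
    lp = lams[-1]
    x = [1.0 - l / lp for l in lams[:-1]] + [1.0 - 1.0 / lp]
    c = (1 + p) / 2.0

    def poch(q, n):
        r = 1.0
        for k in range(n):
            r *= q + k
        return r

    # dF = d/da F_D^(p)(a, 1/2,...,1/2, a+1/2; a+(1+p)/2; x)|_{a=0}:
    # only the derivative of (a)_M survives at a = 0, giving (M-1)!.
    dF = 0.0
    if p == 1:
        # here ((1+p)/2)_M = (1)_M = M!, so the term is (1/2)_M x^M / (M * M!)
        for M in range(1, terms):
            dF += poch(0.5, M) * x[0] ** M / (M * math.factorial(M))
    elif p == 2:
        for m1 in range(terms):
            for m2 in range(terms):
                M = m1 + m2
                if M == 0:
                    continue
                dF += (math.factorial(M - 1) / poch(c, M)
                       * poch(0.5, m1) * poch(0.5, m2)
                       * x[0] ** m1 / math.factorial(m1)
                       * x[1] ** m2 / math.factorial(m2))
    else:
        raise NotImplementedError("sketch limited to p <= 2")
    return -0.5 * sum(math.log(l) for l in lams) + c * (math.log(lp) - dF)
```

The univariate check and the KL(X || X) = 0 case both hold to high precision; Section 8 of the paper performs the analogous comparison against Monte Carlo sampling.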
Lauricella [39] gave several transformation formulas (see Appendix A), of which relations (A5)–(A7) and (A9) are applied to F_D^{(p)}(·) in (94). The transformed expressions are as follows:
F_D^{(p)}\!\left(a, \tfrac12,\ldots,\tfrac12, a+\tfrac12;\; a+\tfrac{1+p}{2};\; 1-\tfrac{\lambda_1}{\lambda_p},\ldots,1-\tfrac{\lambda_{p-1}}{\lambda_p},\, 1-\tfrac{1}{\lambda_p}\right) = \lambda_p^{a+\frac{p}{2}} \prod_{i=1}^{p-1}\lambda_i^{-\frac12}\; F_D^{(p)}\!\left(\tfrac{1+p}{2}, \tfrac12,\ldots,\tfrac12, a+\tfrac12;\; a+\tfrac{1+p}{2};\; 1-\tfrac{\lambda_p}{\lambda_1},\ldots,1-\tfrac{\lambda_p}{\lambda_{p-1}},\, 1-\lambda_p\right) (95)
= \left(\tfrac{\lambda_1}{\lambda_p}\right)^{-a} F_D^{(p)}\!\left(a, \tfrac12,\ldots,\tfrac12, a+\tfrac12;\; a+\tfrac{1+p}{2};\; 1-\tfrac{\lambda_p}{\lambda_1},\ldots,1-\tfrac{\lambda_2}{\lambda_1},\, 1-\tfrac{1}{\lambda_1}\right) (96)
= \lambda_p^{a}\, F_D^{(p)}\!\left(a, \tfrac12,\ldots,\tfrac12;\; a+\tfrac{1+p}{2};\; 1-\lambda_1,\, 1-\lambda_2,\ldots,\, 1-\lambda_p\right) (97)
= \lambda_p^{a} \prod_{i=1}^{p}\lambda_i^{-\frac12}\; F_D^{(p)}\!\left(\tfrac{1+p}{2}, \tfrac12,\ldots,\tfrac12;\; a+\tfrac{1+p}{2};\; 1-\tfrac{1}{\lambda_1},\, 1-\tfrac{1}{\lambda_2},\ldots,\, 1-\tfrac{1}{\lambda_p}\right). (98)
Considering the above equations, it is straightforward to obtain the different expressions of KL(X_1 || X_2) shown in Table 1. The derivative of the Lauricella D-hypergeometric series with respect to a requires the derivative of the following expression:

\frac{\partial}{\partial a} F_D^{(p)}\!\left(a, \tfrac12,\ldots,\tfrac12, a+\tfrac12;\; a+\tfrac{1+p}{2};\; 1-\tfrac{\lambda_1}{\lambda_p},\ldots,1-\tfrac{\lambda_{p-1}}{\lambda_p},\, 1-\tfrac{1}{\lambda_p}\right)\Big|_{a=0}
= \sum_{m_1,\ldots,m_p=0}^{+\infty} \frac{\partial}{\partial a}\left.\frac{(a)_{M}\,(a+\frac12)_{m_p}}{(a+\frac{1+p}{2})_{M}}\right|_{a=0} \prod_{i=1}^{p-1}\left(\tfrac12\right)_{m_i} \prod_{i=1}^{p-1}\frac{\left(1-\frac{\lambda_i}{\lambda_p}\right)^{m_i}}{m_i!}\; \frac{\left(1-\frac{1}{\lambda_p}\right)^{m_p}}{m_p!}, \qquad M = \sum_{i=1}^{p} m_i.
The derivative with respect to $a$ of the Lauricella $D$-hypergeometric series and of its transformations goes through the following expressions, where $\alpha=\sum_{i=1}^{p}m_i$ and $\alpha\geq 1$ (the $\alpha=0$ term has zero derivative; see Appendix B for the demonstrations):

$$ \frac{\partial}{\partial a}\left[\frac{(a)_{\alpha}\,(a+\tfrac{1}{2})_{m_p}}{(a+\tfrac{1+p}{2})_{\alpha}}\right]_{a=0} = \frac{(\tfrac{1}{2})_{m_p}\,(1)_{\alpha}}{(\tfrac{1+p}{2})_{\alpha}\,\alpha}, $$

$$ \frac{\partial}{\partial a}\left[\frac{(a)_{\alpha}}{(a+\tfrac{1+p}{2})_{\alpha}}\right]_{a=0} = \frac{(1)_{\alpha}}{(\tfrac{1+p}{2})_{\alpha}\,\alpha}, $$

$$ \frac{\partial}{\partial a}\left[\frac{(a+\tfrac{1}{2})_{m_p}}{(a+\tfrac{1+p}{2})_{\alpha}}\right]_{a=0} = \frac{(\tfrac{1}{2})_{m_p}}{(\tfrac{1+p}{2})_{\alpha}}\left[\sum_{k=0}^{m_p-1}\frac{1}{k+\tfrac{1}{2}} - \sum_{k=0}^{\alpha-1}\frac{1}{k+\tfrac{1+p}{2}}\right], $$

$$ \frac{\partial}{\partial a}\left[\frac{1}{(a+\tfrac{1+p}{2})_{\alpha}}\right]_{a=0} = -\frac{1}{(\tfrac{1+p}{2})_{\alpha}}\sum_{k=0}^{\alpha-1}\frac{1}{k+\tfrac{1+p}{2}}. $$
To derive the closed-form expression of $d_{\mathrm{KL}}(X_1,X_2)$ we also have to evaluate $\mathrm{KL}(X_2\|X_1)$, which can easily be deduced from $\mathrm{KL}(X_1\|X_2)$ as follows:

$$ \mathrm{KL}(X_2\|X_1) = \frac{1}{2}\log\prod_{i=1}^{p}\lambda_i - \frac{1+p}{2}\left[\log\lambda_p + \frac{\partial}{\partial a}F_D^{(p)}\left(a,\tfrac{1}{2},\ldots,\tfrac{1}{2},a+\tfrac{1}{2};\,a+\tfrac{1+p}{2};\,1-\tfrac{\lambda_p}{\lambda_1},\ldots,1-\tfrac{\lambda_p}{\lambda_{p-1}},1-\lambda_p\right)\Bigg|_{a=0}\right]. $$
Proceeding in the same way with the Lauricella transformations, different expressions of $\mathrm{KL}(X_2\|X_1)$ are provided in Table 1. Finally, given the above results, it is straightforward to compute the symmetric KL similarity measure $d_{\mathrm{KL}}(X_1,X_2)$ between $X_1$ and $X_2$. Technically, any combination of the $\mathrm{KL}(X_1\|X_2)$ and $\mathrm{KL}(X_2\|X_1)$ expressions can be used; however, we choose expressions sharing the same convergence region when computing the distance. Some expressions of $d_{\mathrm{KL}}(X_1,X_2)$ are given in Table 1.
Table 1. KLD and KL distance computed when $X_1$ and $X_2$ are two random vectors following central MCDs with pdfs $f_{X_1}(x|\Sigma_1,p)$ and $f_{X_2}(x|\Sigma_2,p)$.
$\mathrm{KL}(X_1\|X_2)$:

$$ \mathrm{KL}(X_1\|X_2) = -\frac{1}{2}\log\prod_{i=1}^{p}\lambda_i + \frac{1+p}{2}\left[\log\lambda_p - \frac{\partial}{\partial a}F_D^{(p)}\left(a,\tfrac12,\ldots,\tfrac12,a+\tfrac12;\,a+\tfrac{1+p}{2};\,1-\tfrac{\lambda_1}{\lambda_p},\ldots,1-\tfrac{\lambda_{p-1}}{\lambda_p},1-\tfrac{1}{\lambda_p}\right)\Bigg|_{a=0}\right] \tag{106} $$

$$ = -\frac{1}{2}\log\prod_{i=1}^{p}\lambda_i - \frac{1+p}{2}\,\lambda_p^{\frac{p}{2}}\prod_{i=1}^{p-1}\lambda_i^{-\frac12}\,\frac{\partial}{\partial a}F_D^{(p)}\left(\tfrac{1+p}{2},\tfrac12,\ldots,\tfrac12,a+\tfrac12;\,a+\tfrac{1+p}{2};\,1-\tfrac{\lambda_p}{\lambda_1},\ldots,1-\tfrac{\lambda_p}{\lambda_{p-1}},1-\lambda_p\right)\Bigg|_{a=0} \tag{107} $$

$$ = -\frac{1}{2}\log\prod_{i=1}^{p}\lambda_i + \frac{1+p}{2}\left[\log\lambda_1 - \frac{\partial}{\partial a}F_D^{(p)}\left(a,\tfrac12,\ldots,\tfrac12,a+\tfrac12;\,a+\tfrac{1+p}{2};\,1-\tfrac{\lambda_p}{\lambda_1},\ldots,1-\tfrac{\lambda_2}{\lambda_1},1-\tfrac{1}{\lambda_1}\right)\Bigg|_{a=0}\right] \tag{108} $$

$$ = -\frac{1}{2}\log\prod_{i=1}^{p}\lambda_i - \frac{1+p}{2}\,\frac{\partial}{\partial a}F_D^{(p)}\left(a,\tfrac12,\ldots,\tfrac12;\,a+\tfrac{1+p}{2};\,1-\lambda_1,\ldots,1-\lambda_p\right)\Bigg|_{a=0} \tag{109} $$

$$ = -\frac{1}{2}\log\prod_{i=1}^{p}\lambda_i - \frac{1+p}{2}\prod_{i=1}^{p}\lambda_i^{-\frac12}\,\frac{\partial}{\partial a}F_D^{(p)}\left(\tfrac{1+p}{2},\tfrac12,\ldots,\tfrac12;\,a+\tfrac{1+p}{2};\,1-\tfrac{1}{\lambda_1},\ldots,1-\tfrac{1}{\lambda_p}\right)\Bigg|_{a=0} \tag{110} $$

$\mathrm{KL}(X_2\|X_1)$:

$$ \mathrm{KL}(X_2\|X_1) = \frac{1}{2}\log\prod_{i=1}^{p}\lambda_i - \frac{1+p}{2}\left[\log\lambda_p + \frac{\partial}{\partial a}F_D^{(p)}\left(a,\tfrac12,\ldots,\tfrac12,a+\tfrac12;\,a+\tfrac{1+p}{2};\,1-\tfrac{\lambda_p}{\lambda_1},\ldots,1-\tfrac{\lambda_p}{\lambda_{p-1}},1-\lambda_p\right)\Bigg|_{a=0}\right] \tag{111} $$

$$ = \frac{1}{2}\log\prod_{i=1}^{p}\lambda_i - \frac{1+p}{2}\,\lambda_p^{-\frac{p}{2}}\prod_{i=1}^{p-1}\lambda_i^{\frac12}\,\frac{\partial}{\partial a}F_D^{(p)}\left(\tfrac{1+p}{2},\tfrac12,\ldots,\tfrac12,a+\tfrac12;\,a+\tfrac{1+p}{2};\,1-\tfrac{\lambda_1}{\lambda_p},\ldots,1-\tfrac{\lambda_{p-1}}{\lambda_p},1-\tfrac{1}{\lambda_p}\right)\Bigg|_{a=0} \tag{112} $$

$$ = \frac{1}{2}\log\prod_{i=1}^{p}\lambda_i - \frac{1+p}{2}\left[\log\lambda_1 + \frac{\partial}{\partial a}F_D^{(p)}\left(a,\tfrac12,\ldots,\tfrac12,a+\tfrac12;\,a+\tfrac{1+p}{2};\,1-\tfrac{\lambda_1}{\lambda_p},\ldots,1-\tfrac{\lambda_1}{\lambda_2},1-\lambda_1\right)\Bigg|_{a=0}\right] \tag{113} $$

$$ = \frac{1}{2}\log\prod_{i=1}^{p}\lambda_i - \frac{1+p}{2}\,\frac{\partial}{\partial a}F_D^{(p)}\left(a,\tfrac12,\ldots,\tfrac12;\,a+\tfrac{1+p}{2};\,1-\tfrac{1}{\lambda_1},\ldots,1-\tfrac{1}{\lambda_p}\right)\Bigg|_{a=0} \tag{114} $$

$$ = \frac{1}{2}\log\prod_{i=1}^{p}\lambda_i - \frac{1+p}{2}\prod_{i=1}^{p}\lambda_i^{\frac12}\,\frac{\partial}{\partial a}F_D^{(p)}\left(\tfrac{1+p}{2},\tfrac12,\ldots,\tfrac12;\,a+\tfrac{1+p}{2};\,1-\lambda_1,\ldots,1-\lambda_p\right)\Bigg|_{a=0} \tag{115} $$

$d_{\mathrm{KL}}(X_1,X_2)$:

$$ d_{\mathrm{KL}}(X_1,X_2) = \frac{1+p}{2}\Bigg[\log\lambda_p - \frac{\partial}{\partial a}F_D^{(p)}\left(a,\tfrac12,\ldots,\tfrac12,a+\tfrac12;\,a+\tfrac{1+p}{2};\,1-\tfrac{\lambda_1}{\lambda_p},\ldots,1-\tfrac{\lambda_{p-1}}{\lambda_p},1-\tfrac{1}{\lambda_p}\right)\Bigg|_{a=0} - \lambda_p^{-\frac{p}{2}}\prod_{i=1}^{p-1}\lambda_i^{\frac12}\,\frac{\partial}{\partial a}F_D^{(p)}\left(\tfrac{1+p}{2},\tfrac12,\ldots,\tfrac12,a+\tfrac12;\,a+\tfrac{1+p}{2};\,1-\tfrac{\lambda_1}{\lambda_p},\ldots,1-\tfrac{\lambda_{p-1}}{\lambda_p},1-\tfrac{1}{\lambda_p}\right)\Bigg|_{a=0}\Bigg] \tag{116} $$

$$ = -\frac{1+p}{2}\Bigg[\frac{\partial}{\partial a}F_D^{(p)}\left(a,\tfrac12,\ldots,\tfrac12;\,a+\tfrac{1+p}{2};\,1-\lambda_1,\ldots,1-\lambda_p\right)\Bigg|_{a=0} + \prod_{i=1}^{p}\lambda_i^{\frac12}\,\frac{\partial}{\partial a}F_D^{(p)}\left(\tfrac{1+p}{2},\tfrac12,\ldots,\tfrac12;\,a+\tfrac{1+p}{2};\,1-\lambda_1,\ldots,1-\lambda_p\right)\Bigg|_{a=0}\Bigg] \tag{117} $$

$$ = -\frac{1+p}{2}\Bigg[\prod_{i=1}^{p}\lambda_i^{-\frac12}\,\frac{\partial}{\partial a}F_D^{(p)}\left(\tfrac{1+p}{2},\tfrac12,\ldots,\tfrac12;\,a+\tfrac{1+p}{2};\,1-\tfrac{1}{\lambda_1},\ldots,1-\tfrac{1}{\lambda_p}\right)\Bigg|_{a=0} + \frac{\partial}{\partial a}F_D^{(p)}\left(a,\tfrac12,\ldots,\tfrac12;\,a+\tfrac{1+p}{2};\,1-\tfrac{1}{\lambda_1},\ldots,1-\tfrac{1}{\lambda_p}\right)\Bigg|_{a=0}\Bigg] \tag{118} $$

7. Particular Cases: Univariate and Bivariate Cauchy Distribution

7.1. Case of p = 1

This case corresponds to the univariate Cauchy distribution. The KLD is given by

$$ \mathrm{KL}(X_1\|X_2) = -\frac{1}{2}\log\lambda - \frac{\partial}{\partial a}\,{}_2F_1\left(a,\tfrac12;a+1;1-\lambda\right)\Bigg|_{a=0} $$

where ${}_2F_1$ is the Gauss hypergeometric function. The expression of the derivative of ${}_2F_1$ is given as follows (see Appendix C.1 for the details of the computation):

$$ \frac{\partial}{\partial a}\,{}_2F_1\left(a,\tfrac12;a+1;1-\lambda\right)\Bigg|_{a=0} = \sum_{n=1}^{\infty}\frac{(\tfrac12)_n}{n}\frac{(1-\lambda)^n}{n!} = -2\ln\frac{1+\lambda^{1/2}}{2}. $$

Accordingly, the KLD is then expressed as

$$ \mathrm{KL}(X_1\|X_2) = \log\frac{(1+\lambda^{1/2})^2}{4\lambda^{1/2}} = \mathrm{KL}(X_2\|X_1), $$

the last equality following because the expression is invariant under $\lambda\mapsto 1/\lambda$.
We conclude that KLD between Cauchy densities is always symmetric. Interestingly, this is consistent with the result presented in [31].
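This symmetry is easy to check numerically. The sketch below (Python with NumPy; function names are ours) evaluates the closed form above, with $\lambda=(\gamma_1/\gamma_2)^2$ for scale parameters $\gamma_1,\gamma_2$, and compares it with a direct quadrature of $\int f_1\log(f_1/f_2)\,\mathrm{d}x$ using the substitution $x=\gamma_1\tan\theta$, which gives $f_1(x)\,\mathrm{d}x=\mathrm{d}\theta/\pi$:

```python
import numpy as np

def kl_cauchy_closed_form(g1, g2):
    # Closed form of Section 7.1 with lambda = (g1/g2)^2:
    # KL = log((1 + sqrt(lambda))^2 / (4 sqrt(lambda))), symmetric in (g1, g2)
    lam = (g1 / g2) ** 2
    return np.log((1.0 + np.sqrt(lam)) ** 2 / (4.0 * np.sqrt(lam)))

def kl_cauchy_quadrature(g1, g2, n=200_001):
    # KL(f1||f2) = int f1 log(f1/f2) dx with f(x) = g / (pi (x^2 + g^2)).
    # Substituting x = g1 tan(theta) maps the real line to (-pi/2, pi/2).
    theta = np.linspace(-np.pi / 2, np.pi / 2, n + 2)[1:-1]
    x = g1 * np.tan(theta)
    log_ratio = (np.log(g1 / g2)
                 + np.log(x ** 2 + g2 ** 2) - np.log(x ** 2 + g1 ** 2))
    # composite trapezoidal rule on the uniform theta grid
    h = theta[1] - theta[0]
    return h * (log_ratio.sum() - 0.5 * (log_ratio[0] + log_ratio[-1])) / np.pi
```

For $\gamma_1=1$, $\gamma_2=2$ (so $\lambda=1/4$) both evaluate to $\log(9/8)\approx 0.1178$, and swapping the two scales leaves the closed form unchanged.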

7.2. Case of p = 2

This case corresponds to the bivariate Cauchy distribution. The KLD is then given by

$$ \mathrm{KL}(X_1\|X_2) = -\frac{1}{2}\log(\lambda_1\lambda_2) - \frac{3}{2}\,\frac{\partial}{\partial a}F_1\left(a,\tfrac12,\tfrac12;a+\tfrac32;1-\lambda_1,1-\lambda_2\right)\Bigg|_{a=0} $$

where $F_1$ is the Appell hypergeometric function (see Appendix A). The expression of the derivative of $F_1$ can be further developed:

$$ \frac{\partial}{\partial a}F_1\left(a,\tfrac12,\tfrac12;a+\tfrac32;1-\lambda_1,1-\lambda_2\right)\Bigg|_{a=0} = \sum_{\substack{n,m=0\\(n,m)\neq(0,0)}}^{+\infty}\frac{(1)_{m+n}(\tfrac12)_n(\tfrac12)_m}{(\tfrac32)_{m+n}}\,\frac{1}{m+n}\,\frac{(1-\lambda_1)^n}{n!}\frac{(1-\lambda_2)^m}{m!}. $$
In addition, when the eigenvalues $\lambda_i$, $i=1,2$, take particular values, the expression of the KLD becomes very simple. We detail two such cases below.

• $(\lambda_1=1,\ \lambda_2\neq 1)$ or $(\lambda_2=1,\ \lambda_1\neq 1)$

For this particular case, we have

$$ \frac{\partial}{\partial a}F_1\left(a,\tfrac12,\tfrac12;a+\tfrac32;1-\lambda_i,0\right)\Bigg|_{a=0} = \frac{\partial}{\partial a}\,{}_2F_1\left(a,\tfrac12;a+\tfrac32;1-\lambda_i\right)\Bigg|_{a=0} = -\ln\lambda_i + \frac{1}{\sqrt{1-\lambda_i}}\ln\frac{1-\sqrt{1-\lambda_i}}{1+\sqrt{1-\lambda_i}} + 2. $$

The demonstration of the derivation is shown in Appendix C.2. The KLD then becomes

$$ \mathrm{KL}(X_1\|X_2) = \ln\lambda_i - \frac{3}{2}\,\frac{1}{\sqrt{1-\lambda_i}}\ln\frac{1-\sqrt{1-\lambda_i}}{1+\sqrt{1-\lambda_i}} - 3. $$
• $\lambda_1=\lambda_2=\lambda$

For this particular case, we have

$$ \frac{\partial}{\partial a}F_1\left(a,\tfrac12,\tfrac12;a+\tfrac32;1-\lambda,1-\lambda\right)\Bigg|_{a=0} = \frac{\partial}{\partial a}\,{}_2F_1\left(a,1;a+\tfrac32;1-\lambda\right)\Bigg|_{a=0} = -\frac{2}{\sqrt{1-1/\lambda}}\ln\left(\sqrt{\lambda}+\sqrt{\lambda-1}\right) + 2. $$

For more details about the demonstration, see Appendix C.3. The KLD becomes

$$ \mathrm{KL}(X_1\|X_2) = -\ln\lambda + \frac{3}{\sqrt{1-1/\lambda}}\ln\left(\sqrt{\lambda}+\sqrt{\lambda-1}\right) - 3. $$

It is easy to deduce, by replacing $\lambda$ with $1/\lambda$, that

$$ \mathrm{KL}(X_2\|X_1) = \ln\lambda + \frac{3}{\sqrt{1-\lambda}}\ln\left(\sqrt{\lambda^{-1}}+\sqrt{\lambda^{-1}-1}\right) - 3. $$
This result can be demonstrated using the same process as for $\mathrm{KL}(X_1\|X_2)$. It is worth noticing that $\mathrm{KL}(X_1\|X_2)\neq\mathrm{KL}(X_2\|X_1)$, which leads us to conclude that the symmetry observed in the univariate case no longer holds in the multivariate case. Nielsen et al. reached the same conclusion in [32].
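The asymmetry in the equal-eigenvalue bivariate case can be checked numerically. A minimal Python sketch (ours; the arcsin branch used for $\lambda<1$ is the analytic continuation of the log form above):

```python
import math

def kl_bicauchy_equal_eig(lam):
    # KL(X1||X2) for p = 2 and lambda1 = lambda2 = lam (Section 7.2):
    # KL = -ln(lam) + 3 ln(sqrt(lam)+sqrt(lam-1)) / sqrt(1 - 1/lam) - 3  (lam > 1)
    # KL = -ln(lam) + 3 sqrt(lam) asin(sqrt(1-lam)) / sqrt(1-lam) - 3    (lam < 1)
    if lam > 1.0:
        return (-math.log(lam)
                + 3.0 * math.log(math.sqrt(lam) + math.sqrt(lam - 1.0))
                / math.sqrt(1.0 - 1.0 / lam)
                - 3.0)
    if lam < 1.0:
        return (-math.log(lam)
                + 3.0 * math.sqrt(lam) * math.asin(math.sqrt(1.0 - lam))
                / math.sqrt(1.0 - lam)
                - 3.0)
    return 0.0  # lam = 1: identical distributions

kl_12 = kl_bicauchy_equal_eig(2.0)   # eigenvalues of Sigma1 Sigma2^{-1} equal 2
kl_21 = kl_bicauchy_equal_eig(0.5)   # reciprocal eigenvalues for the reverse KLD
```

With $\lambda=2$ this gives $\mathrm{KL}(X_1\|X_2)\approx 0.0462$ and $\mathrm{KL}(X_2\|X_1)\approx 0.0493$: both positive, and clearly unequal.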

8. Implementation and Comparison with Monte Carlo Technique

In this section, we show how we practically compute the numerical values of the KLD, especially since several equivalent expressions with different regions of convergence are available. To this end, the eigenvalues of $\Sigma_1\Sigma_2^{-1}$ are sorted so that $\lambda_p > \lambda_{p-1} > \cdots > \lambda_1 > 0$. This operation is justified by Equation (53), which shows that permuting the eigenvalues does not affect the expectation. Three cases can be identified from the expressions of the KLD.

8.1. Case 1 > λ p > λ p 1 > > λ 1 > 0

The expression of KL ( X 1 | | X 2 ) is given by Equation (109) and KL ( X 2 | | X 1 ) is given by (115).

8.2. Case λ p > λ p 1 > > λ 1 > 1

KL ( X 1 | | X 2 ) is given by the Equation (110) and KL ( X 2 | | X 1 ) is given by (114).

8.3. Case λ p > 1 and λ 1 < 1

This case guarantees that $0 \le 1-\lambda_j/\lambda_p < 1$ for $j=1,\ldots,p-1$ and $0 \le 1-1/\lambda_p < 1$. The expression of $\mathrm{KL}(X_1\|X_2)$ is given by Equation (106) and $\mathrm{KL}(X_2\|X_1)$ by (112) or (113). To evaluate the quality of the numerical approximation of the derivative of the Lauricella series, we consider a case where an exact and simple expression of $\partial_a\{F_D^{(p)}(\cdot)\}|_{a=0}$ is available. When $\lambda_1=\cdots=\lambda_p=\lambda$, the Lauricella series reduces to the Gauss hypergeometric function:

$$ F_D^{(p)}\left(a,\underbrace{\tfrac12,\ldots,\tfrac12}_{p};\,a+\tfrac{1+p}{2};\,1-\lambda,\ldots,1-\lambda\right) = {}_2F_1\left(a,\tfrac{p}{2};\,a+\tfrac{1+p}{2};\,1-\lambda\right). $$
This relation allows us to compare the computational accuracy of the approximation of the Lauricella series against the Gauss function. In addition, to compute the numerical value, the series indices run from 0 to $N$ instead of infinity, with $N$ chosen to ensure a good approximation of the Lauricella series. Table 2 shows the computation of the derivatives of $F_D^{(p)}(\cdot)$ and ${}_2F_1(\cdot)$, along with the absolute error $|\epsilon|$, for $p=2$ and $N=\{20,30,40\}$. The exact expression of $\partial_a\{{}_2F_1(\cdot)\}|_{a=0}$ when $p=2$ is given by Equation (129). Two conclusions can be drawn. First, the error is reasonably low and decreases as $N$ increases. Second, as expected, the error increases for values of $1-\lambda$ close to 1, which corresponds to the limit of the convergence region.
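This check is easy to reproduce. The following sketch (Python; ours, not the paper's attached Matlab code) evaluates the truncated double series for $\partial_a F_D^{(2)}(a,\tfrac12,\tfrac12;a+\tfrac32;1-\lambda,1-\lambda)|_{a=0}$, using the Appendix B.1 derivative of the Pochhammer ratio, and compares it with the exact equal-eigenvalue derivative from Appendix C.3:

```python
import math

def poch(q, k):
    # Pochhammer (rising factorial) (q)_k
    r = 1.0
    for i in range(k):
        r *= q + i
    return r

def d_lauricella_da_p2(x1, x2, n_max=30):
    # Truncated series for d/da F_D^(2)(a, 1/2, 1/2; a + 3/2; x1, x2) at a = 0,
    # using d/da [(a)_s / (a + 3/2)_s]|_{a=0} = (s - 1)! / (3/2)_s  (Appendix B.1)
    total = 0.0
    for n in range(n_max + 1):
        for m in range(n_max + 1):
            s = n + m
            if s == 0:
                continue  # the s = 0 term has zero derivative
            total += (math.factorial(s - 1) / poch(1.5, s)
                      * poch(0.5, n) * x1 ** n / math.factorial(n)
                      * poch(0.5, m) * x2 ** m / math.factorial(m))
    return total

def d_gauss_da_exact(lam):
    # Exact value for the equal-eigenvalue case (Appendix C.3), 0 < lam < 1:
    # d/da 2F1(a, 1; a + 3/2; 1 - lam)|_{a=0}
    return (2.0 - 2.0 * math.sqrt(lam)
            * math.asin(math.sqrt(1.0 - lam)) / math.sqrt(1.0 - lam))
```

For $1-\lambda=0.5$ both evaluate to $2-\pi/2\approx 0.4292$, matching the corresponding row of Table 2.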
In the following, we compare the Monte Carlo sampling method for approximating the KLD with the numerical value of its closed-form expression. The Monte Carlo method draws a large number of samples and replaces the integral by an empirical average. Here, for each sample size, the experiment is repeated 2000 times. The elements of $\Sigma_1$ and $\Sigma_2$ are given in Table 3. Figure 1 depicts the absolute bias, the mean square error (MSE) and the box plot of the difference between the approximated symmetric KL and its theoretical value, versus the sample size. As the sample size increases, the bias and the MSE decrease; accordingly, the approximation comes very close to the theoretical KLD when the number of samples is large. The computation times of the proposed approximation and of the classical Monte Carlo sampling method were recorded using Matlab on a 1.6 GHz processor with 16 GB of memory. For the proposed numerical approximation, the computation time is 1.56 s with $N=20$; $N$ can be increased to further improve the accuracy at the cost of a longer computation time. For the Monte Carlo sampling method, the mean time values at sample sizes of {65,536; 131,072; 262,144} are {2.71; 5.46; 10.78} seconds, respectively.
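A minimal version of this Monte Carlo estimator can be sketched as follows (Python with NumPy; the paper's own experiments use Matlab, and the function names here are ours). It exploits the fact that the central MCD is the multivariate t-distribution with one degree of freedom, so a sample is a Gaussian vector divided by an independent $|N(0,1)|$ variate:

```python
import numpy as np

def mcd_logpdf(x, sigma_inv, logdet_sigma, p):
    # log-density of the central multivariate Cauchy (multivariate t, nu = 1):
    # f(x) = Gamma((1+p)/2) / (pi^((p+1)/2) |Sigma|^(1/2)) (1 + x' Sigma^-1 x)^(-(1+p)/2)
    from math import lgamma, log, pi
    quad = np.einsum('ni,ij,nj->n', x, sigma_inv, x)
    c = lgamma((1 + p) / 2) - ((p + 1) / 2) * log(pi) - 0.5 * logdet_sigma
    return c - ((1 + p) / 2) * np.log1p(quad)

def kl_mcd_monte_carlo(sigma1, sigma2, n_samples=200_000, seed=0):
    # KL(X1||X2) ~ mean of log f1(X) - log f2(X) over X ~ f1,
    # with X = Z / |w|, Z ~ N(0, Sigma1), w ~ N(0, 1) independent.
    rng = np.random.default_rng(seed)
    p = sigma1.shape[0]
    z = rng.multivariate_normal(np.zeros(p), sigma1, size=n_samples)
    w = np.abs(rng.standard_normal(n_samples))
    x = z / w[:, None]
    s1_inv, s2_inv = np.linalg.inv(sigma1), np.linalg.inv(sigma2)
    ld1 = np.linalg.slogdet(sigma1)[1]
    ld2 = np.linalg.slogdet(sigma2)[1]
    return np.mean(mcd_logpdf(x, s1_inv, ld1, p) - mcd_logpdf(x, s2_inv, ld2, p))
```

As a sanity check, for $p=1$ with $\Sigma_1=1$ and $\Sigma_2=4$ the estimate converges to the univariate closed form $\log(9/8)\approx 0.1178$ as the sample size grows.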
To further encourage the dissemination of these results, we provide code as an attached file to this paper. It is given in Matlab for the specific case $p=3$ and can easily be extended to any value of $p$, thanks to the general closed-form expression established in this paper.

9. Conclusions

Since MCDs have various applications in signal and image processing, the KLD between central MCDs addresses an important problem for future work in statistics, machine learning and related fields of computer science. In this paper, we derived closed-form expressions for the KLD and the KL distance between two central MCDs. The similarity measure can be expressed as a function of the Lauricella $D$-hypergeometric series $F_D^{(p)}$. We also proposed a simple scheme to compute the Lauricella series easily and to bypass its convergence constraints. Codes and examples for the numerical calculations are presented and explained in detail. Finally, a comparison showed that the Monte Carlo sampling method gives approximations close to the theoretical KLD value. As a final note, these results on the KLD can also be extended to the multivariate t-distribution, since the MCD is a particular case of that distribution.

Author Contributions

Conceptualization, N.B.; methodology, N.B.; software, N.B.; writing (original draft preparation), N.B.; writing (review and editing), N.B. and D.R.; supervision, D.R. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Acknowledgments

The authors gratefully acknowledge the PHENOTIC platform node of the French national infrastructure on plant phenotyping ANR PHENOME 11-INBS-0012. The authors would also like to thank the anonymous reviewers for their valuable comments and suggestions.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A. Lauricella Function

In 1893, G. Lauricella [39] investigated the properties of four series F A ( n ) , F B ( n ) , F C ( n ) , F D ( n ) of n variables. When n = 2 , these functions coincide with Appell’s F 2 , F 3 , F 4 , F 1 , respectively. When n = 1 , they all coincide with Gauss’ 2 F 1 . We present here only the Lauricella series F D ( n ) given as follows
$$ F_D^{(n)}(a,b_1,\ldots,b_n;c;x_1,\ldots,x_n) = \sum_{m_1=0}^{\infty}\cdots\sum_{m_n=0}^{\infty}\frac{(a)_{m_1+\cdots+m_n}(b_1)_{m_1}\cdots(b_n)_{m_n}}{(c)_{m_1+\cdots+m_n}}\,\frac{x_1^{m_1}}{m_1!}\cdots\frac{x_n^{m_n}}{m_n!} $$
where $|x_1|,\ldots,|x_n|<1$. The Pochhammer symbol $(q)_i$ indicates the $i$-th rising factorial of $q$, i.e.,

$$ (q)_i = q(q+1)\cdots(q+i-1) = \frac{\Gamma(q+i)}{\Gamma(q)}, \quad i=1,2,\ldots, $$

and $(q)_0=1$. The function $F_D^{(n)}(\cdot)$ can be expressed in terms of multiple integrals as follows [42]:
$$ F_D^{(n)}(a,b_1,\ldots,b_n;c;x_1,\ldots,x_n) = \frac{\Gamma(c)}{\Gamma\left(c-\sum_{i=1}^{n}b_i\right)\prod_{i=1}^{n}\Gamma(b_i)}\int_{\Omega}\prod_{i=1}^{n}u_i^{b_i-1}\left(1-\sum_{i=1}^{n}u_i\right)^{c-\sum_{i=1}^{n}b_i-1}\left(1-\sum_{i=1}^{n}x_iu_i\right)^{-a}\prod_{i=1}^{n}\mathrm{d}u_i $$

where $\Omega=\{(u_1,u_2,\ldots,u_n):\ 0\le u_i\le 1,\ i=1,\ldots,n,\ \text{and}\ 0\le u_1+u_2+\cdots+u_n\le 1\}$, $\operatorname{Re}(b_i)>0$ for $i=1,\ldots,n$, and $\operatorname{Re}(c-b_1-\cdots-b_n)>0$. Lauricella's $F_D$ can also be written as a one-dimensional Euler-type integral for any number $n$ of variables, valid when $\operatorname{Re}(a)>0$ and $\operatorname{Re}(c-a)>0$:
$$ F_D^{(n)}(a,b_1,\ldots,b_n;c;x_1,\ldots,x_n) = \frac{\Gamma(c)}{\Gamma(a)\Gamma(c-a)}\int_{0}^{1}u^{a-1}(1-u)^{c-a-1}(1-ux_1)^{-b_1}\cdots(1-ux_n)^{-b_n}\,\mathrm{d}u. $$
Lauricella has given several transformation formulas, of which we use the following. More details can be found in Exton's book [43] on hypergeometric equations.

$$ F_D^{(n)}(a,b_1,\ldots,b_n;c;x_1,\ldots,x_n) $$

$$ = \prod_{i=1}^{n}(1-x_i)^{-b_i}\,F_D^{(n)}\left(c-a,b_1,\ldots,b_n;c;\frac{x_1}{x_1-1},\ldots,\frac{x_n}{x_n-1}\right) \tag{A5} $$

$$ = (1-x_1)^{-a}\,F_D^{(n)}\left(a,c-\sum_{i=1}^{n}b_i,b_2,\ldots,b_n;c;\frac{x_1}{x_1-1},\frac{x_1-x_2}{x_1-1},\ldots,\frac{x_1-x_n}{x_1-1}\right) \tag{A6} $$

$$ = (1-x_n)^{-a}\,F_D^{(n)}\left(a,b_1,\ldots,b_{n-1},c-\sum_{i=1}^{n}b_i;c;\frac{x_n-x_1}{x_n-1},\ldots,\frac{x_n-x_{n-1}}{x_n-1},\frac{x_n}{x_n-1}\right) \tag{A7} $$

$$ = (1-x_1)^{c-a}\prod_{i=1}^{n}(1-x_i)^{-b_i}\,F_D^{(n)}\left(c-a,c-\sum_{i=1}^{n}b_i,b_2,\ldots,b_n;c;x_1,\frac{x_2-x_1}{x_2-1},\ldots,\frac{x_n-x_1}{x_n-1}\right) \tag{A8} $$

$$ = (1-x_n)^{c-a}\prod_{i=1}^{n}(1-x_i)^{-b_i}\,F_D^{(n)}\left(c-a,b_1,\ldots,b_{n-1},c-\sum_{i=1}^{n}b_i;c;\frac{x_1-x_n}{x_1-1},\ldots,\frac{x_{n-1}-x_n}{x_{n-1}-1},x_n\right) \tag{A9} $$

Appendix B. Demonstration of Derivative

Appendix B.1. Demonstration

We use the notation $\alpha=\sum_{i=1}^{p}m_i$ to lighten the equations. Knowing that $\frac{\partial}{\partial c}(c)_k=(c)_k\left[\psi(c+k)-\psi(c)\right]$, $\psi(c+k)-\psi(c)=\sum_{\ell=0}^{k-1}\frac{1}{c+\ell}$ and $(c)_k=\prod_{i=0}^{k-1}(c+i)$, we can state that

$$ \frac{\partial}{\partial a}\frac{(a)_{\alpha}}{(a+\tfrac{1+p}{2})_{\alpha}} = \frac{(a)_{\alpha}\left[\psi(a+\alpha)-\psi(a)-\psi(a+\tfrac{1+p}{2}+\alpha)+\psi(a+\tfrac{1+p}{2})\right]}{(a+\tfrac{1+p}{2})_{\alpha}} = \frac{\prod_{k=0}^{\alpha-1}(a+k)\,\sum_{k=0}^{\alpha-1}\left(\frac{1}{a+k}-\frac{1}{a+\tfrac{1+p}{2}+k}\right)}{(a+\tfrac{1+p}{2})_{\alpha}}. $$

Using the fact that

$$ \prod_{k=0}^{\alpha-1}(a+k)\sum_{k=0}^{\alpha-1}\frac{1}{a+k} = \prod_{k=1}^{\alpha-1}(a+k) + \prod_{\substack{k=0\\k\neq 1}}^{\alpha-1}(a+k) + \cdots + \prod_{k=0}^{\alpha-2}(a+k), $$

we can state that

$$ \frac{\partial}{\partial a}\frac{(a)_{\alpha}}{(a+\tfrac{1+p}{2})_{\alpha}}\Bigg|_{a=0} = \frac{(\alpha-1)!}{(\tfrac{1+p}{2})_{\alpha}} = \frac{(1)_{\alpha}}{(\tfrac{1+p}{2})_{\alpha}}\,\frac{1}{\alpha}. $$
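This identity is easy to verify numerically with a central finite difference (Python sketch; ours):

```python
import math

def poch(q, k):
    # Pochhammer symbol (q)_k
    r = 1.0
    for i in range(k):
        r *= q + i
    return r

def ratio(a, alpha, p):
    # (a)_alpha / (a + (1+p)/2)_alpha
    return poch(a, alpha) / poch(a + (1 + p) / 2, alpha)

def check_b1(alpha, p, h=1e-6):
    # central finite difference of the ratio at a = 0 vs. the closed form
    fd = (ratio(h, alpha, p) - ratio(-h, alpha, p)) / (2 * h)
    exact = math.factorial(alpha - 1) / poch((1 + p) / 2, alpha)
    return fd, exact
```

For example, with $\alpha=4$ and $p=3$ the exact value is $3!/(2)_4 = 6/120 = 0.05$, and the finite difference matches it to high precision.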

Appendix B.2. Demonstration

$$ \frac{\partial}{\partial a}\frac{(a)_{\alpha}(a+\tfrac12)_{m_p}}{(a+\tfrac{1+p}{2})_{\alpha}} = \frac{(a+\tfrac12)_{m_p}(a)_{\alpha}\left[\psi(a+\alpha)-\psi(a)+\psi(a+\tfrac12+m_p)-\psi(a+\tfrac12)\right]}{(a+\tfrac{1+p}{2})_{\alpha}} - \frac{(a)_{\alpha}(a+\tfrac12)_{m_p}\left[\psi(a+\tfrac{1+p}{2}+\alpha)-\psi(a+\tfrac{1+p}{2})\right]}{(a+\tfrac{1+p}{2})_{\alpha}} $$

$$ = \frac{(a+\tfrac12)_{m_p}\prod_{k=0}^{\alpha-1}(a+k)\left[\sum_{k=0}^{\alpha-1}\left(\frac{1}{a+k}-\frac{1}{a+\tfrac{1+p}{2}+k}\right)+\sum_{k=0}^{m_p-1}\frac{1}{a+\tfrac12+k}\right]}{(a+\tfrac{1+p}{2})_{\alpha}}. $$

By developing the previous expression, at $a=0$ only the term coming from the derivative of $(a)_{\alpha}$ survives, so that

$$ \frac{\partial}{\partial a}\frac{(a)_{\alpha}(a+\tfrac12)_{m_p}}{(a+\tfrac{1+p}{2})_{\alpha}}\Bigg|_{a=0} = \frac{(\tfrac12)_{m_p}(\alpha-1)!}{(\tfrac{1+p}{2})_{\alpha}} = \frac{(\tfrac12)_{m_p}(1)_{\alpha}}{(\tfrac{1+p}{2})_{\alpha}}\,\frac{1}{\alpha}. $$

Appendix B.3. Demonstration

$$ \frac{\partial}{\partial a}\frac{(a+\tfrac12)_{m_p}}{(a+\tfrac{1+p}{2})_{\alpha}} = \frac{(a+\tfrac12)_{m_p}}{(a+\tfrac{1+p}{2})_{\alpha}}\left[\sum_{k=0}^{m_p-1}\frac{1}{a+\tfrac12+k} - \sum_{k=0}^{\alpha-1}\frac{1}{a+\tfrac{1+p}{2}+k}\right]. $$

As a consequence,

$$ \frac{\partial}{\partial a}\frac{(a+\tfrac12)_{m_p}}{(a+\tfrac{1+p}{2})_{\alpha}}\Bigg|_{a=0} = \frac{(\tfrac12)_{m_p}}{(\tfrac{1+p}{2})_{\alpha}}\left[\sum_{k=0}^{m_p-1}\frac{1}{\tfrac12+k} - \sum_{k=0}^{\alpha-1}\frac{1}{\tfrac{1+p}{2}+k}\right]. $$

Appendix B.4. Demonstration

$$ \frac{\partial}{\partial a}\frac{1}{(a+\tfrac{1+p}{2})_{\alpha}} = -\frac{\psi(a+\tfrac{1+p}{2}+\alpha)-\psi(a+\tfrac{1+p}{2})}{(a+\tfrac{1+p}{2})_{\alpha}} = -\frac{1}{(a+\tfrac{1+p}{2})_{\alpha}}\sum_{k=0}^{\alpha-1}\frac{1}{a+\tfrac{1+p}{2}+k}. $$

Finally,

$$ \frac{\partial}{\partial a}\frac{1}{(a+\tfrac{1+p}{2})_{\alpha}}\Bigg|_{a=0} = -\frac{1}{(\tfrac{1+p}{2})_{\alpha}}\sum_{k=0}^{\alpha-1}\frac{1}{\tfrac{1+p}{2}+k}. $$

Appendix C. Computations of Some Equations

Appendix C.1. Computation

Let $f$ be the function of $\lambda$ defined by

$$ f(\lambda) = \sum_{n=1}^{\infty}\frac{(\tfrac12)_n}{n}\frac{(1-\lambda)^n}{n!}. $$

Multiplying the derivative of $f$ with respect to $\lambda$ by $(1-\lambda)$ gives

$$ (1-\lambda)\frac{\partial f}{\partial\lambda} = -\sum_{n=1}^{\infty}(\tfrac12)_n\frac{(1-\lambda)^n}{n!} = 1-\lambda^{-1/2}. $$

As a consequence,

$$ \frac{\partial f}{\partial\lambda} = \frac{1-\lambda^{-1/2}}{1-\lambda} = -\frac{\lambda^{-1/2}}{1+\lambda^{1/2}}. $$

Finally,

$$ f(\lambda) = -2\ln\frac{1+\lambda^{1/2}}{2}. $$
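A quick numerical confirmation of this closed form (Python sketch; ours) sums the defining series directly and compares it with the logarithmic expression:

```python
import math

def poch(q, k):
    # Pochhammer symbol (q)_k
    r = 1.0
    for i in range(k):
        r *= q + i
    return r

def f_series(lam, n_max=100):
    # f(lam) = sum_{n>=1} (1/2)_n (1 - lam)^n / (n * n!)
    z = 1.0 - lam
    return sum(poch(0.5, n) * z ** n / (n * math.factorial(n))
               for n in range(1, n_max + 1))

def f_closed(lam):
    # f(lam) = -2 ln((1 + sqrt(lam)) / 2)
    return -2.0 * math.log((1.0 + math.sqrt(lam)) / 2.0)
```

For $\lambda=0.25$ both give $-2\ln(0.75)\approx 0.5754$.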

Appendix C.2. Computation

$$ \frac{\partial}{\partial a}\,{}_2F_1\left(a,\tfrac12;a+\tfrac32;1-\lambda_i\right)\Bigg|_{a=0} = \sum_{n=1}^{\infty}\frac{(\tfrac12)_n(1)_n}{(\tfrac32)_n\,n}\frac{(1-\lambda_i)^n}{n!} = f(\lambda_i) $$

where $f$ is a function of $\lambda_i$. Multiplying the derivative of $f$ with respect to $\lambda_i$ by $(1-\lambda_i)$ gives

$$ (1-\lambda_i)\frac{\partial f}{\partial\lambda_i} = -\sum_{n=1}^{\infty}\frac{(\tfrac12)_n(1)_n}{(\tfrac32)_n}\frac{(1-\lambda_i)^n}{n!} = 1 - {}_2F_1\left(\tfrac12,1;\tfrac32;1-\lambda_i\right). $$

Knowing that

$$ {}_2F_1\left(\tfrac12,1;\tfrac32;1-\lambda_i\right) = \frac{\arctan\left(\sqrt{\lambda_i-1}\right)}{\sqrt{\lambda_i-1}} = \frac{1}{2\sqrt{1-\lambda_i}}\ln\frac{1+\sqrt{1-\lambda_i}}{1-\sqrt{1-\lambda_i}}, $$

we can deduce the expression

$$ \frac{\partial f}{\partial\lambda_i} = \frac{\arctan\left(\sqrt{\lambda_i-1}\right)}{(\lambda_i-1)^{3/2}} + \frac{1}{1-\lambda_i}. $$

Accordingly,

$$ f(\lambda_i) = -\ln\lambda_i - \frac{2\arctan\left(\sqrt{\lambda_i-1}\right)}{\sqrt{\lambda_i-1}} + 2 = -\ln\lambda_i + \frac{1}{\sqrt{1-\lambda_i}}\ln\frac{1-\sqrt{1-\lambda_i}}{1+\sqrt{1-\lambda_i}} + 2. $$

Appendix C.3. Computation

$$ \frac{\partial}{\partial a}\,{}_2F_1\left(a,1;a+\tfrac32;1-\lambda\right)\Bigg|_{a=0} = \sum_{n=1}^{\infty}\frac{(1)_n(1)_n}{(\tfrac32)_n\,n}\frac{(1-\lambda)^n}{n!} = f(\lambda) $$

where $f$ is a function of $\lambda$. Multiplying the derivative of $f$ with respect to $\lambda$ by $(1-\lambda)$ gives

$$ (1-\lambda)\frac{\partial f}{\partial\lambda} = -\sum_{n=1}^{\infty}\frac{(1)_n(1)_n}{(\tfrac32)_n}\frac{(1-\lambda)^n}{n!} = 1 - {}_2F_1\left(1,1;\tfrac32;1-\lambda\right). $$

Knowing that

$$ {}_2F_1\left(1,1;\tfrac32;1-\lambda\right) = \frac{1}{\sqrt{\lambda}}\,\frac{\arcsin\left(\sqrt{1-\lambda}\right)}{\sqrt{1-\lambda}}, $$

we can state that

$$ \frac{\partial f}{\partial\lambda} = -\frac{\arcsin\left(\sqrt{1-\lambda}\right)}{\sqrt{\lambda}\,(1-\lambda)^{3/2}} + \frac{1}{1-\lambda}. $$

As a consequence,

$$ f(\lambda) = -2\sqrt{\lambda}\,\frac{\arcsin\left(\sqrt{1-\lambda}\right)}{\sqrt{1-\lambda}} + 2 = -\frac{2}{\sqrt{1-1/\lambda}}\ln\left(\sqrt{\lambda}+\sqrt{\lambda-1}\right) + 2. $$

References

  1. Ollila, E.; Tyler, D.E.; Koivunen, V.; Poor, H.V. Complex Elliptically Symmetric Distributions: Survey, New Results and Applications. IEEE Trans. Signal Process. 2012, 60, 5597–5625.
  2. Kotz, S.; Nadarajah, S. Multivariate T-Distributions and Their Applications; Cambridge University Press: Cambridge, UK, 2004.
  3. Press, S. Multivariate stable distributions. J. Multivar. Anal. 1972, 2, 444–462.
  4. Sahu, S.; Singh, H.V.; Kumar, B.; Singh, A.K. Statistical modeling and Gaussianization procedure based de-speckling algorithm for retinal OCT images. J. Ambient Intell. Humaniz. Comput. 2018, 1–14.
  5. Ranjani, J.J.; Thiruvengadam, S.J. Generalized SAR Despeckling Based on DTCWT Exploiting Interscale and Intrascale Dependences. IEEE Geosci. Remote Sens. Lett. 2011, 8, 552–556.
  6. Sadreazami, H.; Ahmad, M.O.; Swamy, M.N.S. Color image denoising using multivariate cauchy PDF in the contourlet domain. In Proceedings of the 2016 IEEE Canadian Conference on Electrical and Computer Engineering (CCECE), Vancouver, BC, Canada, 15–18 May 2016; pp. 1–4.
  7. Sadreazami, H.; Ahmad, M.O.; Swamy, M.N.S. A Study of Multiplicative Watermark Detection in the Contourlet Domain Using Alpha-Stable Distributions. IEEE Trans. Image Process. 2014, 23, 4348–4360.
  8. Fontaine, M.; Nugraha, A.A.; Badeau, R.; Yoshii, K.; Liutkus, A. Cauchy Multichannel Speech Enhancement with a Deep Speech Prior. In Proceedings of the 2019 27th European Signal Processing Conference (EUSIPCO), A Coruna, Spain, 2–6 September 2019; pp. 1–5.
  9. Cover, T.M.; Thomas, J.A. Elements of Information Theory (Wiley Series in Telecommunications and Signal Processing); Wiley-Interscience: Hoboken, NJ, USA, 2006.
  10. Pardo, L. Statistical Inference Based on Divergence Measures; CRC Press: Abingdon, UK, 2005.
  11. Kullback, S.; Leibler, R.A. On Information and Sufficiency. Ann. Math. Stat. 1951, 22, 79–86.
  12. Kullback, S. Information Theory and Statistics; Wiley: New York, NY, USA, 1959.
  13. Rényi, A. On Measures of Entropy and Information. In Proceedings of the Fourth Berkeley Symposium on Mathematical Statistics and Probability; University of California Press: Berkeley, CA, USA, 1961; Volume 1, pp. 547–561.
  14. Sharma, B.D.; Mittal, D.P. New non-additive measures of relative information. J. Comb. Inf. Syst. Sci. 1977, 2, 122–132.
  15. Bhattacharyya, A. On a measure of divergence between two statistical populations defined by their probability distributions. Bull. Calcutta Math. Soc. 1943, 35, 99–109.
  16. Kailath, T. The Divergence and Bhattacharyya Distance Measures in Signal Selection. IEEE Trans. Commun. Technol. 1967, 15, 52–60.
  17. Giet, L.; Lubrano, M. A minimum Hellinger distance estimator for stochastic differential equations: An application to statistical inference for continuous time interest rate models. Comput. Stat. Data Anal. 2008, 52, 2945–2965.
  18. Csiszár, I. Eine informationstheoretische Ungleichung und ihre Anwendung auf den Beweis der Ergodizität von Markoffschen Ketten. Publ. Math. Inst. Hung. Acad. Sci. Ser. A 1963, 8, 85–108.
  19. Ali, S.M.; Silvey, S.D. A General Class of Coefficients of Divergence of One Distribution from Another. J. R. Stat. Soc. Ser. B (Methodol.) 1966, 28, 131–142.
  20. Bregman, L. The relaxation method of finding the common point of convex sets and its application to the solution of problems in convex programming. USSR Comput. Math. Math. Phys. 1967, 7, 200–217.
  21. Burbea, J.; Rao, C. On the convexity of some divergence measures based on entropy functions. IEEE Trans. Inf. Theory 1982, 28, 489–495.
  22. Burbea, J.; Rao, C. On the convexity of higher order Jensen differences based on entropy functions (Corresp.). IEEE Trans. Inf. Theory 1982, 28, 961–963.
  23. Burbea, J.; Rao, C. Entropy differential metric, distance and divergence measures in probability spaces: A unified approach. J. Multivar. Anal. 1982, 12, 575–596.
  24. Csiszar, I. Information-type measures of difference of probability distributions and indirect observation. Stud. Sci. Math. Hung. 1967, 2, 229–318.
  25. Nielsen, F.; Nock, R. On the chi square and higher-order chi distances for approximating f-divergences. IEEE Signal Process. Lett. 2014, 21, 10–13.
  26. Menéndez, M.L.; Morales, D.; Pardo, L.; Salicrú, M. Asymptotic behaviour and statistical applications of divergence measures in multinomial populations: A unified study. Stat. Pap. 1995, 36, 1–29.
  27. Cover, T.M.; Thomas, J.A. Information theory and statistics. Elem. Inf. Theory 1991, 1, 279–335.
  28. MacKay, D.J.C. Information Theory, Inference and Learning Algorithms; Cambridge University Press: Cambridge, UK, 2003.
  29. Ruiz, F.E.; Pérez, P.S.; Bonev, B.I. Information Theory in Computer Vision and Pattern Recognition; Springer Science & Business Media: Berlin/Heidelberg, Germany, 2009.
  30. Nielsen, F. Statistical Divergences between Densities of Truncated Exponential Families with Nested Supports: Duo Bregman and Duo Jensen Divergences. Entropy 2022, 24, 421.
  31. Chyzak, F.; Nielsen, F. A closed-form formula for the Kullback–Leibler divergence between Cauchy distributions. arXiv 2019, arXiv:1905.10965.
  32. Nielsen, F.; Okamura, K. On f-divergences between Cauchy distributions. arXiv 2021, arXiv:2101.12459.
  33. Srivastava, H.; Karlsson, P.W. Multiple Gaussian Hypergeometric Series; Ellis Horwood Series in Mathematics and Its Applications; Horwood/Halsted Press: Chichester, UK; New York, NY, USA, 1985.
  34. Mathai, A.M.; Haubold, H.J. Special Functions for Applied Scientists; Springer Science+Business Media: New York, NY, USA, 2008.
  35. Gradshteyn, I.; Ryzhik, I. Table of Integrals, Series, and Products, 7th ed.; Academic Press: Cambridge, MA, USA, 2007.
  36. Humbert, P. The Confluent Hypergeometric Functions of Two Variables. Proc. R. Soc. Edinb. 1922, 41, 73–96.
  37. Erdélyi, A. Higher Transcendental Functions; McGraw-Hill: New York, NY, USA, 1953; Volume I.
  38. Koepf, W. Hypergeometric Summation: An Algorithmic Approach to Summation and Special Function Identities, 2nd ed.; Universitext; Springer: London, UK, 2014.
  39. Lauricella, G. Sulle funzioni ipergeometriche a piu variabili. Rend. Del Circ. Mat. Palermo 1893, 7, 111–158.
  40. Mathai, A.M. Jacobians of Matrix Transformations and Functions of Matrix Argument; World Scientific: Singapore, 1997.
  41. Anderson, T.W. An Introduction to Multivariate Statistical Analysis; John Wiley & Sons: Hoboken, NJ, USA, 2003.
  42. Hattori, A.; Kimura, T. On the Euler integral representations of hypergeometric functions in several variables. J. Math. Soc. Jpn. 1974, 26, 1–16.
  43. Exton, H. Multiple Hypergeometric Functions and Applications; Wiley: New York, NY, USA, 1976.
Figure 1. Top row: bias (left) and MSE (right) of the difference between the approximated and theoretical symmetric KL for the MCD. Bottom row: box plot of the error; the mean error is the bias. Outliers are larger than $Q_3 + 1.5\times IQR$ or smaller than $Q_1 - 1.5\times IQR$, where $Q_1$, $Q_3$ and $IQR$ are the 25th percentile, the 75th percentile and the interquartile range, respectively.
Table 2. Computation of $A=\partial_a\{{}_2F_1(\cdot)\}|_{a=0}$ and $B=\partial_a\{F_D^{(p)}(\cdot)\}|_{a=0}$ when $p=2$ and $\lambda_1=\cdots=\lambda_p=\lambda$; the error is $|\epsilon|=|A-B|$.

| 1−λ  | A      | B (N=20) | error (N=20) | B (N=30) | error (N=30) | B (N=40) | error (N=40) |
|------|--------|----------|--------------|----------|--------------|----------|--------------|
| 0.1  | 0.0694 | 0.0694   | 9.1309e−16   | 0.0694   | 9.1309e−16   | 0.0694   | 9.1309e−16   |
| 0.3  | 0.2291 | 0.2291   | 3.7747e−14   | 0.2291   | 1.1102e−16   | 0.2291   | 1.1102e−16   |
| 0.5  | 0.4292 | 0.4292   | 2.6707e−9    | 0.4292   | 1.2458e−12   | 0.4292   | 6.6613e−16   |
| 0.7  | 0.7022 | 0.7022   | 5.9260e−6    | 0.7022   | 8.2678e−8    | 0.7022   | 1.3911e−9    |
| 0.9  | 1.1673 | 1.1634   | 0.0038       | 1.1665   | 7.2760e−4    | 1.1671   | 1.6081e−4    |
| 0.99 | 1.7043 | 1.5801   | 0.1241       | 1.6267   | 0.0776       | 1.6514   | 0.0529       |
Table 3. Parameters $\Sigma_1$ and $\Sigma_2$ used to compute the KLD for the central MCD.

| Σ  | Σ11, Σ22, Σ33, Σ12, Σ13, Σ23 |
|----|------------------------------|
| Σ1 | 1, 1, 1, 0.6, 0.2, 0.3       |
| Σ2 | 1, 1, 1, 0.3, 0.1, 0.4       |

