A New Quantum f-Divergence for Trace Class Operators in Hilbert Spaces

A new quantum f-divergence for trace class operators in Hilbert spaces is introduced. It is shown that for normalised convex functions it is nonnegative. Some upper bounds are provided. Applications for some classes of convex functions of interest are also given.


Introduction
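Let $(X, \mathcal{A})$ be a measurable space satisfying $|\mathcal{A}| > 2$ and let $\mu$ be a σ-finite measure on $(X, \mathcal{A})$. Let $\mathcal{P}$ be the set of all probability measures on $(X, \mathcal{A})$ which are absolutely continuous with respect to $\mu$. For $P, Q \in \mathcal{P}$, let $p = \frac{dP}{d\mu}$ and $q = \frac{dQ}{d\mu}$ denote the Radon-Nikodym derivatives of $P$ and $Q$ with respect to $\mu$. Two probability measures $P, Q \in \mathcal{P}$ are said to be orthogonal, and we denote this by $Q \perp P$, if:
$$P(\{q = 0\}) = Q(\{p = 0\}) = 1.$$
Let $f : [0, \infty) \to (-\infty, \infty]$ be a convex function that is continuous at zero, i.e., $f(0) = \lim_{u \downarrow 0} f(u)$. In 1963, I. Csiszár [1] introduced the concept of f-divergence as follows.
Definition 1. Let $P, Q \in \mathcal{P}$. Then:
$$I_f(Q, P) = \int_X p(x)\, f\!\left[\frac{q(x)}{p(x)}\right] d\mu(x) \tag{1}$$
is called the f-divergence of the probability distributions $Q$ and $P$.
Remark 1. Observe that the integrand in the formula (1) is undefined when $p(x) = 0$. The way to overcome this problem is to postulate for $f$ as above that:
$$0\, f\!\left[\frac{q(x)}{0}\right] = q(x) \lim_{u \downarrow 0}\left[u f\!\left(\frac{1}{u}\right)\right], \quad x \in X.$$
For $f$ continuous convex on $[0, \infty)$ we obtain the *-conjugate function of $f$ by:
$$f^*(u) = u f\!\left(\frac{1}{u}\right), \quad u \in (0, \infty),$$
and:
$$f^*(0) = \lim_{u \downarrow 0} f^*(u).$$
It is also known that if $f$ is continuous convex on $[0, \infty)$, then so is $f^*$.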
The following two theorems contain the most basic properties of f-divergences. For their proofs we refer the reader to Chapter 1 of [2] (see also [3]).
Theorem 1 (Uniqueness and symmetry theorem). Let $f, f_1$ be continuous convex on $[0, \infty)$. We have:
$$I_{f_1}(Q, P) = I_f(Q, P)$$
for all $P, Q \in \mathcal{P}$ if and only if there exists a constant $c \in \mathbb{R}$ such that:
$$f_1(u) = f(u) + c(u - 1) \quad \text{for any } u \in [0, \infty).$$
Theorem 2 (Range of values theorem). Let $f : [0, \infty) \to \mathbb{R}$ be a continuous convex function on $[0, \infty)$. For any $P, Q \in \mathcal{P}$, we have the double inequality:
$$f(1) \le I_f(Q, P) \le f(0) + f^*(0). \tag{3}$$
(i) If $P = Q$, then the equality holds in the first part of (3). If $f$ is strictly convex at one, then the equality holds in the first part of (3) if and only if $P = Q$;
(ii) If $Q \perp P$, then the equality holds in the second part of (3). If $f(0) + f^*(0) < \infty$, then equality holds in the second part of (3) if and only if $Q \perp P$.
The following result is a refinement of the second inequality in Theorem 2 (see Theorem 3 in [3]).
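(1) The class of $\chi^\alpha$-divergences. The f-divergences of this class, which is generated by the function $\chi^\alpha(u) = |u - 1|^\alpha$, $u \in [0, \infty)$, with $\alpha \in [1, \infty)$, have the form:
$$I_f(Q, P) = \int_X p \left|\frac{q}{p} - 1\right|^\alpha d\mu = \int_X p^{1-\alpha} |q - p|^\alpha \, d\mu.$$
From this class only the parameter $\alpha = 1$ provides a distance in the topological sense, namely the total variation distance $V(Q, P) = \int_X |q - p| \, d\mu$. The most prominent special case of this class is, however, Karl Pearson's $\chi^2$-divergence:
$$\chi^2(Q, P) = \int_X \frac{(q - p)^2}{p} \, d\mu = \int_X \frac{q^2}{p} \, d\mu - 1.$$
(2) Dichotomy class. From this class, generated by the function $f_\alpha : [0, \infty) \to \mathbb{R}$:
$$f_\alpha(u) = \begin{cases} u - 1 - \ln u & \text{for } \alpha = 0, \\ \dfrac{1}{\alpha(1 - \alpha)}\left[\alpha u + 1 - \alpha - u^\alpha\right] & \text{for } \alpha \in \mathbb{R} \setminus \{0, 1\}, \\ 1 - u + u \ln u & \text{for } \alpha = 1, \end{cases}$$
only the parameter $\alpha = \frac{1}{2}$, for which $f_{1/2}(u) = 2(\sqrt{u} - 1)^2$, provides a distance, namely, the Hellinger distance:
$$H(Q, P) = \left[\int_X \left(\sqrt{q} - \sqrt{p}\right)^2 d\mu\right]^{1/2}.$$
Another important divergence is the Kullback-Leibler divergence obtained for $\alpha = 1$:
$$KL(Q, P) = \int_X q \ln\!\left(\frac{q}{p}\right) d\mu.$$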
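(5) Divergences of Arimoto-type. This class is generated by the functions:
$$\Psi_\alpha(u) = \begin{cases} \dfrac{\alpha}{\alpha - 1}\left[(1 + u^\alpha)^{1/\alpha} - 2^{1/\alpha - 1}(1 + u)\right] & \text{for } \alpha \in (0, \infty) \setminus \{1\}, \\ (1 + u)\ln 2 + u \ln u - (1 + u)\ln(1 + u) & \text{for } \alpha = 1, \\ \frac{1}{2}|1 - u| & \text{for } \alpha = \infty. \end{cases}$$
It has been shown in [17] that this class provides the distances $\left[I_{\Psi_\alpha}(Q, P)\right]^{\min(\alpha, 1/\alpha)}$ for $\alpha \in (0, \infty)$ and $\frac{1}{2} V(Q, P)$ for $\alpha = \infty$.

In order to introduce a quantum f-divergence for trace class operators in Hilbert spaces and study its properties, we need some preliminary facts as follows.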
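As an illustration of Definition 1, the following minimal sketch evaluates $I_f(Q, P)$ on a finite set with the counting measure. It assumes every $p(x) > 0$, so the convention of Remark 1 is never needed. The helper names `f_divergence` and `t_log_t` and the two sample distributions are hypothetical choices made for this example only.

```python
import math


def f_divergence(q, p, f):
    """Classical f-divergence I_f(Q, P) = sum_x p(x) * f(q(x) / p(x)).

    Assumes a finite sample space and p(x) > 0 for every x.
    """
    return sum(p_x * f(q_x / p_x) for q_x, p_x in zip(q, p))


def t_log_t(t):
    """Generator f(t) = t ln t of the Kullback-Leibler divergence, f(0) := 0."""
    return t * math.log(t) if t > 0 else 0.0


# Two hypothetical distributions on a two-point space.
q = [0.5, 0.5]
p = [0.25, 0.75]

total_variation = f_divergence(q, p, lambda t: abs(t - 1.0))  # f(t) = |t - 1|
chi_square = f_divergence(q, p, lambda t: (t - 1.0) ** 2)     # f(t) = (t - 1)^2
kullback_leibler = f_divergence(q, p, t_log_t)                # f(t) = t ln t

print(total_variation, chi_square, kullback_leibler)
```

The generator $(t - 1)^2$ used here differs from $t^2 - 1$ by the multiple $-2(t - 1)$, so by Theorem 1 both produce the same $\chi^2$-divergence. The same remark applies to $t \ln t$ and the dichotomy generator $1 - u + u \ln u$.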

Trace of Operators
Let $(H, \langle \cdot, \cdot \rangle)$ be a complex Hilbert space and $\{e_i\}_{i \in I}$ an orthonormal basis of $H$. We say that $A \in B(H)$ is a Hilbert-Schmidt operator if:
$$\sum_{i \in I} \|Ae_i\|^2 < \infty. \tag{6}$$
It is well known that, if $\{e_i\}_{i \in I}$ and $\{f_j\}_{j \in J}$ are orthonormal bases for $H$ and $A \in B(H)$, then:
$$\sum_{i \in I} \|Ae_i\|^2 = \sum_{j \in J} \|Af_j\|^2 = \sum_{j \in J} \|A^* f_j\|^2, \tag{7}$$
showing that the definition (6) is independent of the orthonormal basis and that $A$ is a Hilbert-Schmidt operator iff $A^*$ is a Hilbert-Schmidt operator. Let $B_2(H)$ be the set of Hilbert-Schmidt operators in $B(H)$. For $A \in B_2(H)$ we define:
$$\|A\|_2 := \left(\sum_{i \in I} \|Ae_i\|^2\right)^{1/2}$$
for $\{e_i\}_{i \in I}$ an orthonormal basis of $H$. This definition does not depend on the choice of the orthonormal basis.
Using the triangle inequality in $\ell^2(I)$, one checks that $B_2(H)$ is a vector space and that $\|\cdot\|_2$ is a norm on $B_2(H)$, which is usually called the Hilbert-Schmidt norm.
Denote the modulus of an operator $A \in B(H)$ by $|A| := (A^*A)^{1/2}$. If $\{e_i\}_{i \in I}$ is an orthonormal basis of $H$, we say that $A \in B(H)$ is trace class if:
$$\|A\|_1 := \sum_{i \in I} \langle |A| e_i, e_i \rangle < \infty.$$
The definition of $\|A\|_1$ does not depend on the choice of the orthonormal basis $\{e_i\}_{i \in I}$. We denote by $B_1(H)$ the set of trace class operators in $B(H)$.
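The following proposition holds:
Proposition 1. If $A \in B(H)$, then the following are equivalent: (i) $A \in B_1(H)$; (ii) $|A|^{1/2} \in B_2(H)$.
The following properties are also well known:
Theorem 5. With the above notations:
(i) We have $\|A\|_1 = \|A^*\|_1$ and $\|A\|_2 \le \|A\|_1$ for any $A \in B_1(H)$;
(ii) $B_1(H)$ is an operator ideal in $B(H)$, i.e., $B(H) B_1(H) B(H) \subseteq B_1(H)$;
(iii) We have $B_2(H) B_2(H) = B_1(H)$;
(iv) We have:
$$\|A\|_1 = \sup\left\{ |\langle A, B \rangle_2| : B \in B_2(H), \ \|B\| \le 1 \right\};$$
(v) $(B_1(H), \|\cdot\|_1)$ is a Banach space;
(vi) We have the following isometric isomorphisms:
$$B_1(H) \cong K(H)^* \quad \text{and} \quad B_1(H)^* \cong B(H),$$
where $K(H)^*$ is the dual space of $K(H)$, the space of compact operators on $H$, and $B_1(H)^*$ is the dual space of $B_1(H)$.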
We define the trace of a trace class operator $A \in B_1(H)$ to be:
$$\operatorname{tr}(A) := \sum_{i \in I} \langle Ae_i, e_i \rangle, \tag{14}$$
where $\{e_i\}_{i \in I}$ is an orthonormal basis of $H$. Note that this coincides with the usual definition of the trace if $H$ is finite-dimensional. We observe that the series (14) converges absolutely and that it is independent of the choice of basis.
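The following result collects some properties of the trace:
Theorem 6. We have:
(i) If $A \in B_1(H)$, then $A^* \in B_1(H)$ and $\operatorname{tr}(A^*) = \overline{\operatorname{tr}(A)}$;
(ii) If $A \in B_1(H)$ and $T \in B(H)$, then $AT, TA \in B_1(H)$, $\operatorname{tr}(AT) = \operatorname{tr}(TA)$ and $|\operatorname{tr}(AT)| \le \|A\|_1 \|T\|$;
(iii) $\operatorname{tr}(\cdot)$ is a bounded linear functional on $B_1(H)$ with $\|\operatorname{tr}\| = 1$;
(iv) If $A, B \in B_2(H)$, then $AB, BA \in B_1(H)$ and $\operatorname{tr}(AB) = \operatorname{tr}(BA)$.
Utilising the trace notation, we obviously have that:
$$\langle A, B \rangle_2 = \operatorname{tr}(B^*A) = \operatorname{tr}(AB^*) \quad \text{and} \quad \|A\|_2^2 = \operatorname{tr}(A^*A) = \operatorname{tr}\left(|A|^2\right)$$
for any $A, B \in B_2(H)$.
The following Hölder's type inequality has been obtained by Ruskai in [18]:
$$|\operatorname{tr}(AB)| \le \operatorname{tr}(|AB|) \le \left[\operatorname{tr}\left(|A|^{1/\alpha}\right)\right]^{\alpha} \left[\operatorname{tr}\left(|B|^{1/(1-\alpha)}\right)\right]^{1-\alpha},$$
where $\alpha \in (0, 1)$ and $A, B \in B(H)$ with $|A|^{1/\alpha}, |B|^{1/(1-\alpha)} \in B_1(H)$. In particular, for $\alpha = \frac{1}{2}$ we get the Schwarz inequality:
$$|\operatorname{tr}(AB)| \le \operatorname{tr}(|AB|) \le \left[\operatorname{tr}\left(|A|^2\right)\right]^{1/2} \left[\operatorname{tr}\left(|B|^2\right)\right]^{1/2}$$
with $A, B \in B_2(H)$.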
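If $A \ge 0$ and $P \in B_1(H)$ with $P \ge 0$, then:
$$0 \le \operatorname{tr}(PA) \le \|A\| \operatorname{tr}(P). \tag{19}$$
Indeed, since $A \ge 0$, then $\langle Ax, x \rangle \ge 0$ for any $x \in H$. If $\{e_i\}_{i \in I}$ is an orthonormal basis of $H$, then:
$$0 \le \left\langle AP^{1/2}e_i, P^{1/2}e_i \right\rangle \le \|A\| \left\|P^{1/2}e_i\right\|^2 = \|A\| \langle Pe_i, e_i \rangle$$
for any $i \in I$. Summing over $i \in I$, we get:
$$0 \le \sum_{i \in I} \left\langle AP^{1/2}e_i, P^{1/2}e_i \right\rangle \le \|A\| \sum_{i \in I} \langle Pe_i, e_i \rangle = \|A\| \operatorname{tr}(P),$$
and since:
$$\sum_{i \in I} \left\langle AP^{1/2}e_i, P^{1/2}e_i \right\rangle = \sum_{i \in I} \left\langle P^{1/2}AP^{1/2}e_i, e_i \right\rangle = \operatorname{tr}\left(P^{1/2}AP^{1/2}\right) = \operatorname{tr}(PA),$$
we obtain the desired result (19). This obviously implies that, if $A$ and $B$ are self-adjoint operators with $A \le B$ and $P \in B_1(H)$ with $P \ge 0$, then:
$$\operatorname{tr}(PA) \le \operatorname{tr}(PB). \tag{20}$$
Now, if $A$ is a self-adjoint operator, then we know that:
$$|\langle Ax, x \rangle| \le \langle |A| x, x \rangle \quad \text{for any } x \in H.$$
This inequality follows by Jensen's inequality for the convex function $f(t) = |t|$ defined on a closed interval containing the spectrum of $A$.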
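The inequalities (19) and (20) can be checked numerically in the simplest finite-dimensional setting. The sketch below is only an illustration under stated assumptions: it works with real symmetric 2x2 matrices stored as nested lists, and the matrices `P`, `A`, `B` and the helper names are hypothetical choices, not objects from the paper.

```python
import math


def matmul(X, Y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def trace(X):
    """Trace of a 2x2 matrix."""
    return X[0][0] + X[1][1]


def largest_eigenvalue(M):
    """Largest eigenvalue of a real symmetric 2x2 matrix.

    For a positive matrix this equals the operator norm.
    """
    mean = 0.5 * (M[0][0] + M[1][1])
    rad = math.hypot(0.5 * (M[0][0] - M[1][1]), M[0][1])
    return mean + rad


P = [[0.6, 0.2], [0.2, 0.4]]   # positive, trace one
A = [[2.0, 1.0], [1.0, 1.0]]   # positive
B = [[3.0, 1.0], [1.0, 2.0]]   # B = A + identity, hence A <= B

tr_PA = trace(matmul(P, A))
tr_PB = trace(matmul(P, B))
upper = largest_eigenvalue(A) * trace(P)   # right-hand side of (19)

print(tr_PA, upper, tr_PB)
```

Here `tr_PA` lies between zero and `upper`, as (19) requires, and `tr_PA <= tr_PB`, in agreement with the monotonicity property (20).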

Classical Quantum f-Divergence
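On the complex Hilbert space $(B_2(H), \langle \cdot, \cdot \rangle_2)$, where the Hilbert-Schmidt inner product is defined by:
$$\langle U, V \rangle_2 := \operatorname{tr}(V^*U), \quad U, V \in B_2(H),$$
consider, for $A, B \in B_+(H)$, the operators $L_A, R_B : B_2(H) \to B_2(H)$ defined by $L_A T := AT$ and $R_B T := TB$. We observe that they are well defined and positive in the operator order of $B(B_2(H))$, since $\langle L_A T, T \rangle_2 = \operatorname{tr}(T^*AT) \ge 0$ and $\langle R_B T, T \rangle_2 = \operatorname{tr}(T^*TB) \ge 0$ for any $T \in B_2(H)$. Since $\operatorname{tr}\left(|X^*|^2\right) = \operatorname{tr}\left(|X|^2\right)$ for any $X \in B_2(H)$, then also:
$$\operatorname{tr}(T^*AT) = \operatorname{tr}\left(\left|A^{1/2}T\right|^2\right) = \operatorname{tr}\left(\left|T^*A^{1/2}\right|^2\right)$$
for $A \ge 0$ and $T \in B_2(H)$. We observe that $L_A$ and $R_B$ commute, therefore the product $L_A R_B$ is a self-adjoint positive operator on $B_2(H)$ for any positive operators $A, B \in B(H)$.
For $A, B \in B_+(H)$ with $B$ invertible, we define the Araki transform $\mathfrak{A}_{A,B} : B_2(H) \to B_2(H)$ by $\mathfrak{A}_{A,B} := L_A R_{B^{-1}}$. We observe that for $T \in B_2(H)$ we have $\mathfrak{A}_{A,B} T = ATB^{-1}$ and:
$$\langle \mathfrak{A}_{A,B} T, T \rangle_2 = \left\langle ATB^{-1}, T \right\rangle_2 = \operatorname{tr}\left(T^*ATB^{-1}\right).$$
Observe also, by the properties of trace, that:
$$\operatorname{tr}\left(T^*ATB^{-1}\right) = \operatorname{tr}\left(B^{-1/2}T^*A^{1/2}A^{1/2}TB^{-1/2}\right) = \operatorname{tr}\left(\left|A^{1/2}TB^{-1/2}\right|^2\right),$$
giving that:
$$\langle \mathfrak{A}_{A,B} T, T \rangle_2 = \operatorname{tr}\left(\left|A^{1/2}TB^{-1/2}\right|^2\right) \ge 0$$
for any $T \in B_2(H)$.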
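Let $U$ be a self-adjoint linear operator on a complex Hilbert space $(K; \langle \cdot, \cdot \rangle)$. The Gelfand map establishes a *-isometric isomorphism $\Phi$ between the set $C(\operatorname{Sp}(U))$ of all continuous functions defined on the spectrum of $U$, denoted $\operatorname{Sp}(U)$, and the $C^*$-algebra $C^*(U)$ generated by $U$ and the identity operator $1_K$ on $K$ as follows. For any $f, g \in C(\operatorname{Sp}(U))$ and any $\alpha, \beta \in \mathbb{C}$ we have:
(i) $\Phi(\alpha f + \beta g) = \alpha \Phi(f) + \beta \Phi(g)$;
(ii) $\Phi(fg) = \Phi(f)\Phi(g)$ and $\Phi(\bar{f}) = \Phi(f)^*$;
(iii) $\|\Phi(f)\| = \|f\| := \sup_{t \in \operatorname{Sp}(U)} |f(t)|$;
(iv) $\Phi(f_0) = 1_K$ and $\Phi(f_1) = U$, where $f_0(t) = 1$ and $f_1(t) = t$ for $t \in \operatorname{Sp}(U)$.
With this notation we define:
$$f(U) := \Phi(f) \quad \text{for all } f \in C(\operatorname{Sp}(U)),$$
and we call it the continuous functional calculus for a self-adjoint operator $U$.
If $U$ is a self-adjoint operator and $f$ is a real valued continuous function on $\operatorname{Sp}(U)$, then $f(t) \ge 0$ for any $t \in \operatorname{Sp}(U)$ implies that $f(U) \ge 0$, i.e., $f(U)$ is a positive operator on $K$. Moreover, if both $f$ and $g$ are real valued continuous functions on $\operatorname{Sp}(U)$, then the following important property holds:
$$f(t) \ge g(t) \ \text{for any } t \in \operatorname{Sp}(U) \ \text{implies that} \ f(U) \ge g(U) \tag{P}$$
in the operator order of $B(K)$.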
Let $f : [0, \infty) \to \mathbb{R}$ be a continuous function. Utilising the continuous functional calculus for the Araki self-adjoint operator $\mathfrak{A}_{Q,P} \in B(B_2(H))$, we can define the quantum f-divergence for $Q, P \in S(H) := \{P \in B_1(H), \ P \ge 0 \text{ with } \operatorname{tr}(P) = 1\}$ and $P$ invertible, by:
$$S_f(Q, P) := \left\langle f\left(\mathfrak{A}_{Q,P}\right) P^{1/2}, P^{1/2} \right\rangle_2.$$
If we consider the continuous convex function $f : [0, \infty) \to \mathbb{R}$, with $f(0) := 0$ and $f(t) = t \ln t$ for $t > 0$, then for $Q, P \in S(H)$ and $Q, P$ invertible, we have:
$$S_f(Q, P) = \operatorname{tr}\left[Q(\ln Q - \ln P)\right] =: U(Q, P),$$
which is the Umegaki relative entropy.
If we take the continuous convex function $f : [0, \infty) \to \mathbb{R}$, $f(t) = |t - 1|$ for $t \ge 0$, then for $Q, P \in S(H)$ with $P$ invertible, we have:
$$S_f(Q, P) = V(Q, P),$$
where $V(Q, P)$ is the variational distance.

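If we consider the convex function $f : [0, \infty) \to \mathbb{R}$, $f(t) = \frac{1}{2}\left(\sqrt{t} - 1\right)^2$ for $t \ge 0$, then for $Q, P \in S(H)$ with $P$ invertible, we have:
$$S_f(Q, P) = 1 - \operatorname{tr}\left(Q^{1/2}P^{1/2}\right) =: h^2(Q, P),$$
which is known as Hellinger discrimination.
If we take $f : (0, \infty) \to \mathbb{R}$, $f(t) = -\ln t$, then for $Q, P \in S(H)$ and $Q, P$ invertible, we have:
$$S_f(Q, P) = \operatorname{tr}\left[P(\ln P - \ln Q)\right] = U(P, Q).$$
In the important case of a finite dimensional space $H$ and the generalized inverse $P^{-1}$, numerous properties of the quantum f-divergence have been obtained in the recent papers [33][34][35][36] and the references therein. We omit the details.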

A New Quantum f-Divergence
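In order to simplify the writing, we denote by $S_1(H)$ the set of all density operators, i.e., the elements of $B_1^+(H)$ having unit trace. We observe that, if $P, Q$ are self-adjoint with $P, Q \ge 0$ and $P$ is invertible, then $P^{-1/2}QP^{-1/2} \ge 0$. Let $f : [0, \infty) \to \mathbb{R}$ be a continuous convex function on $[0, \infty)$. We can define the following new quantum f-divergence functional:
$$D_f(Q, P) := \operatorname{tr}\left[P f\left(P^{-1/2}QP^{-1/2}\right)\right]$$
for $Q, P \in S_1(H)$ with $P$ invertible. The definition can be extended to any continuous function.
If we take the convex function $f(t) = t^2 - 1$, $t \ge 0$, then we get:
$$D_f(Q, P) = \operatorname{tr}\left(Q^2P^{-1}\right) - 1 =: \chi^2(Q, P)$$
for $Q, P \in S_1(H)$ with $P$ invertible, which is the Karl Pearson's $\chi^2$-divergence version for trace class operators. This divergence is the same as the one generated by the classical f-divergence, see (23).
More generally, if we take the convex function $f(t) = t^n - 1$, $t \ge 0$, with $n$ a natural number, $n \ge 2$, then we get:
$$D_f(Q, P) = \operatorname{tr}\left[\left(QP^{-1}\right)^{n-1}Q\right] - 1$$
for $Q, P \in S_1(H)$ with $P$ invertible.
If we take the convex function $f(t) = t \ln t$ for $t > 0$ and $f(0) := 0$, then we get:
$$D_f(Q, P) = \operatorname{tr}\left[P^{1/2}QP^{-1/2} \ln\left(P^{-1/2}QP^{-1/2}\right)\right]$$
for $Q, P \in S_1(H)$ with $P$ and $Q$ invertible. We observe that this is not the same as the Umegaki relative entropy introduced above.
If we take the convex function $f(t) = -\ln t$ for $t > 0$, then we get:
$$D_f(Q, P) = -\operatorname{tr}\left[P \ln\left(P^{-1/2}QP^{-1/2}\right)\right]$$
for $Q, P \in S_1(H)$ with $P$ and $Q$ invertible.
If we take the convex function $f(t) = |t - 1|$, $t \ge 0$, then we get:
$$D_f(Q, P) = \operatorname{tr}\left[P \left|P^{-1/2}QP^{-1/2} - 1_H\right|\right]$$
for $Q, P \in S_1(H)$ with $P$ invertible.
If we consider the convex function $f(t) = \frac{1}{t} - 1$, $t > 0$, then:
$$D_f(Q, P) = \operatorname{tr}\left(P^2Q^{-1}\right) - 1 = \chi^2(P, Q)$$
for $Q, P \in S_1(H)$ with $P$ and $Q$ invertible.
If we take the convex function $f(t) = f_q(t) = \frac{1 - t^q}{1 - q}$, $q \in (0, 1)$, then we get:
$$D_{f_q}(Q, P) = \frac{1}{1 - q}\left[1 - \operatorname{tr}\left(P\left(P^{-1/2}QP^{-1/2}\right)^q\right)\right],$$
which is different, in general, from the Tsallis relative entropy introduced above.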
Other examples may be considered by taking the convex functions from the introduction. The details are omitted.
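The examples above can be explored numerically in dimension two. The following is a minimal sketch, assuming real symmetric 2x2 density matrices stored as nested lists and using Sylvester's interpolation formula for the functional calculus. It evaluates $D_f(Q, P) = \operatorname{tr}[P f(P^{-1/2}QP^{-1/2})]$. All helper names and the matrices `P`, `Q` are hypothetical choices made for this illustration.

```python
import math


def matmul(X, Y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def trace(X):
    return X[0][0] + X[1][1]


def eig_sym(M):
    """Eigenvalues (lo, hi) of a real symmetric 2x2 matrix."""
    mean = 0.5 * (M[0][0] + M[1][1])
    rad = math.hypot(0.5 * (M[0][0] - M[1][1]), 0.5 * (M[0][1] + M[1][0]))
    return mean - rad, mean + rad


def funm_sym(M, f):
    """f(M) for a real symmetric 2x2 matrix via Sylvester's formula."""
    lo, hi = eig_sym(M)
    if hi - lo < 1e-14:                     # M is numerically a multiple of 1_H
        return [[f(lo), 0.0], [0.0, f(lo)]]
    out = [[0.0, 0.0], [0.0, 0.0]]
    for i in range(2):
        for j in range(2):
            delta = 1.0 if i == j else 0.0
            out[i][j] = (f(hi) * (M[i][j] - lo * delta)
                         - f(lo) * (M[i][j] - hi * delta)) / (hi - lo)
    return out


def quantum_f_divergence(Q, P, f):
    """D_f(Q, P) = tr[P f(P^{-1/2} Q P^{-1/2})] for invertible P."""
    P_inv_half = funm_sym(P, lambda t: t ** -0.5)
    A = matmul(matmul(P_inv_half, Q), P_inv_half)
    return trace(matmul(P, funm_sym(A, f)))


# Two hypothetical invertible density matrices (positive, unit trace).
P = [[0.6, 0.2], [0.2, 0.4]]
Q = [[0.5, -0.1], [-0.1, 0.5]]

# f(t) = t^2 - 1 should reproduce tr(Q^2 P^{-1}) - 1.
chi2_via_f = quantum_f_divergence(Q, P, lambda t: t * t - 1.0)
P_inv = funm_sym(P, lambda t: 1.0 / t)
chi2_direct = trace(matmul(matmul(Q, Q), P_inv)) - 1.0

# f(t) = t ln t, and the divergence of P from itself.
d_tlogt = quantum_f_divergence(Q, P, lambda t: t * math.log(t) if t > 0 else 0.0)
d_same = quantum_f_divergence(P, P, lambda t: t * t - 1.0)

print(chi2_via_f, chi2_direct, d_tlogt, d_same)
```

The two $\chi^2$ values agree up to rounding, and $D_f(P, P)$ vanishes for a normalised $f$. Sylvester's formula is used only because it avoids computing eigenvectors in the 2x2 case; in higher dimensions one would diagonalise instead.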
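Suppose that $I$ is an interval of real numbers with interior $\mathring{I}$ and $f : I \to \mathbb{R}$ is a convex function on $I$. Then, $f$ is continuous on $\mathring{I}$ and has finite left and right derivatives at each point of $\mathring{I}$. Moreover, if $x, y \in \mathring{I}$ and $x < y$, then $f'_-(x) \le f'_+(x) \le f'_-(y) \le f'_+(y)$, which shows that both $f'_-$ and $f'_+$ are nondecreasing functions on $\mathring{I}$. It is also known that a convex function must be differentiable except for at most countably many points.
For a convex function $f : I \to \mathbb{R}$, the subdifferential of $f$, denoted by $\partial f$, is the set of all functions $\varphi : I \to [-\infty, \infty]$ such that $\varphi(\mathring{I}) \subset \mathbb{R}$ and:
$$f(x) \ge f(a) + (x - a)\varphi(a) \quad \text{for any } x, a \in I. \tag{24}$$
It is also well known that if $f$ is convex on $I$, then $\partial f$ is nonempty, $f'_-, f'_+ \in \partial f$ and if $\varphi \in \partial f$, then:
$$f'_-(x) \le \varphi(x) \le f'_+(x) \quad \text{for any } x \in \mathring{I}.$$
In particular, $\varphi$ is a nondecreasing function.
If $f$ is differentiable and convex on $\mathring{I}$, then $\partial f = \{f'\}$.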
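We have the following result:
Theorem 7. Let $f$ be a continuous convex function on $[0, \infty)$ with $f(1) = 0$. Then we have:
$$0 \le D_f(Q, P) \tag{25}$$
for any $Q, P \in S_1(H)$ with $P$ invertible. If $f$ is continuously differentiable on $(0, \infty)$, then we also have:
$$D_f(Q, P) \le D_{\ell f'}(Q, P) - D_{f'}(Q, P), \tag{26}$$
where $\ell$ is the identity function, $\ell(t) = t$.
Proof. For any $x \ge 0$, we have from the gradient inequality (24) that:
$$f(x) \ge f(1) + (x - 1)f'_+(1),$$
and since $f$ is normalised, then:
$$f(x) \ge (x - 1)f'_+(1).$$
Utilising the property (P) for the positive operator $A = P^{-1/2}QP^{-1/2}$, where $Q, P \in S_1(H)$ with $P$ invertible, we have the inequality in the operator order:
$$f\left(P^{-1/2}QP^{-1/2}\right) \ge f'_+(1)\left(P^{-1/2}QP^{-1/2} - 1_H\right). \tag{28}$$
Utilising the property (20) for the inequality (28), we have:
$$\operatorname{tr}\left[P f\left(P^{-1/2}QP^{-1/2}\right)\right] \ge f'_+(1)\operatorname{tr}\left[P\left(P^{-1/2}QP^{-1/2} - 1_H\right)\right] = f'_+(1)\left[\operatorname{tr}(Q) - \operatorname{tr}(P)\right] = 0,$$
and the inequality (25) is proven.
From the gradient inequality, we also have for any $x > 0$:
$$f(1) \ge f(x) + (1 - x)f'(x),$$
and since $f$ is normalised, then:
$$f(x) \le (x - 1)f'(x),$$
which, as above, implies that:
$$f\left(P^{-1/2}QP^{-1/2}\right) \le \left(P^{-1/2}QP^{-1/2} - 1_H\right) f'\left(P^{-1/2}QP^{-1/2}\right). \tag{29}$$
Making use of the property (20) for the inequality (29), we get:
$$\operatorname{tr}\left[P f\left(P^{-1/2}QP^{-1/2}\right)\right] \le \operatorname{tr}\left[P\left(P^{-1/2}QP^{-1/2} - 1_H\right) f'\left(P^{-1/2}QP^{-1/2}\right)\right],$$
which is the required inequality (26).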
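Remark 2. If we take $f(t) = -\ln t$, $t > 0$, in Theorem 7, then we get:
$$0 \le -\operatorname{tr}\left[P \ln\left(P^{-1/2}QP^{-1/2}\right)\right] \le \operatorname{tr}\left(P^2Q^{-1}\right) - 1$$
for any $Q, P \in S_1(H)$ with $P$ and $Q$ invertible.
If we take the convex function $\varepsilon(t) = e^{t-1} - 1$, then:
$$D_\varepsilon(Q, P) = \operatorname{tr}\left[P \exp\left(P^{-1/2}QP^{-1/2} - 1_H\right)\right] - 1,$$
where $Q, P \in S_1(H)$ with $P$ invertible. By Theorem 7, we get:
$$0 \le \operatorname{tr}\left[P \exp\left(P^{-1/2}QP^{-1/2} - 1_H\right)\right] - 1 \le \operatorname{tr}\left[P\left(P^{-1/2}QP^{-1/2} - 1_H\right)\exp\left(P^{-1/2}QP^{-1/2} - 1_H\right)\right], \tag{32}$$
where $Q, P \in S_1(H)$ with $P$ invertible. The second inequality in (32) is equivalent to:
$$\operatorname{tr}\left[P\left(2 \cdot 1_H - P^{-1/2}QP^{-1/2}\right)\exp\left(P^{-1/2}QP^{-1/2} - 1_H\right)\right] \le 1,$$
where $Q, P \in S_1(H)$ with $P$ invertible.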
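The logarithmic case of Remark 2 can be checked numerically. The sketch below assumes real symmetric 2x2 density matrices and a Sylvester-formula functional calculus. It compares $D_f(Q, P)$ for $f(t) = -\ln t$ with the upper bound $\operatorname{tr}(P^2Q^{-1}) - 1$. The helper names and the matrices are hypothetical and serve only this illustration.

```python
import math


def matmul(X, Y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def trace(X):
    return X[0][0] + X[1][1]


def funm_sym(M, f):
    """f(M) for a real symmetric 2x2 matrix via Sylvester's formula."""
    mean = 0.5 * (M[0][0] + M[1][1])
    rad = math.hypot(0.5 * (M[0][0] - M[1][1]), 0.5 * (M[0][1] + M[1][0]))
    lo, hi = mean - rad, mean + rad
    if hi - lo < 1e-14:                     # numerically a multiple of 1_H
        return [[f(lo), 0.0], [0.0, f(lo)]]
    out = [[0.0, 0.0], [0.0, 0.0]]
    for i in range(2):
        for j in range(2):
            delta = 1.0 if i == j else 0.0
            out[i][j] = (f(hi) * (M[i][j] - lo * delta)
                         - f(lo) * (M[i][j] - hi * delta)) / (hi - lo)
    return out


# Hypothetical invertible density matrices.
P = [[0.6, 0.2], [0.2, 0.4]]
Q = [[0.5, -0.1], [-0.1, 0.5]]

# A = P^{-1/2} Q P^{-1/2}
P_inv_half = funm_sym(P, lambda t: t ** -0.5)
A = matmul(matmul(P_inv_half, Q), P_inv_half)

# Left-hand quantity: D_f(Q, P) = -tr[P ln(A)]
d_neg_log = trace(matmul(P, funm_sym(A, lambda t: -math.log(t))))

# Right-hand bound: tr(P^2 Q^{-1}) - 1
Q_inv = funm_sym(Q, lambda t: 1.0 / t)
bound = trace(matmul(matmul(P, P), Q_inv)) - 1.0

print(d_neg_log, bound)
```

For these matrices the divergence is nonnegative and stays below the bound, as Remark 2 predicts.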
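The following lemma is of interest in itself:
Lemma 1. Let $S$ be a self-adjoint operator such that $\gamma 1_H \le S \le \Gamma 1_H$ for some real constants $\Gamma \ge \gamma$. Then, for any $P \in B_1^+(H) \setminus \{0\}$, we have:
$$0 \le \frac{\operatorname{tr}\left(PS^2\right)}{\operatorname{tr}(P)} - \left(\frac{\operatorname{tr}(PS)}{\operatorname{tr}(P)}\right)^2 \le \frac{1}{2}(\Gamma - \gamma)\frac{1}{\operatorname{tr}(P)}\operatorname{tr}\left(P\left|S - \frac{\operatorname{tr}(PS)}{\operatorname{tr}(P)}1_H\right|\right) \le \frac{1}{2}(\Gamma - \gamma)\left[\frac{\operatorname{tr}\left(PS^2\right)}{\operatorname{tr}(P)} - \left(\frac{\operatorname{tr}(PS)}{\operatorname{tr}(P)}\right)^2\right]^{1/2} \le \frac{1}{4}(\Gamma - \gamma)^2. \tag{33}$$
Taking the modulus in (34) and using the properties of trace, we have: that proves the last part of (33).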
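We observe that if $Q, P \in S_1(H)$ with $P$ invertible and there exist $r, R > 0$ with:
$$r 1_H \le P^{-1/2}QP^{-1/2} \le R 1_H, \tag{38}$$
then by the property (20), we get:
$$r = r \operatorname{tr}(P) \le \operatorname{tr}\left(P \cdot P^{-1/2}QP^{-1/2}\right) = \operatorname{tr}(Q) = 1 \le R \operatorname{tr}(P) = R,$$
showing that $r \le 1 \le R$.
The following result provides a simple upper bound for the quantum f-divergence $D_f(Q, P)$.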
Theorem 8. Let $f$ be a continuous convex function on $[0, \infty)$ with $f(1) = 0$. Then, we have: for any $Q, P \in S_1(H)$ with $P$ invertible and satisfying the condition (38).
Proof. Without loss of generality, we prove the inequality in the case when $f$ is continuously differentiable on $(0, \infty)$.
We have: for any $\lambda \in \mathbb{R}$ and for any $Q, P \in S_1(H)$ with $P$ invertible. Since $f'$ is monotonic nondecreasing on $[r, R]$, then:
$$f'(r) \le f'(t) \le f'(R) \quad \text{for any } t \in [r, R].$$
This implies in the operator order that:
$$f'(r) 1_H \le f'\left(P^{-1/2}QP^{-1/2}\right) \le f'(R) 1_H.$$
From (30) and (40), we have: which proves the first inequality in (39). The rest follows by (37).
Example 1. (1) If we take $f(t) = -\ln t$, $t > 0$, in Theorem 8, then we get: for any $Q, P \in S_1(H)$ with $P, Q$ invertible and satisfying the condition: with $r > 0$.
(2) With the same conditions as in (1) for $Q, P$, if we take $f(t) = t \ln t$, $t > 0$, in Theorem 8, then we get:
(3) If we take in (39) $f(t) = f_q(t) = \frac{1 - t^q}{1 - q}$, then we get: provided that $Q, P \in S_1(H)$, with $P, Q$ invertible and satisfying the condition (43).
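We have the following upper bound, as well:
Theorem 9. Let $f : [0, \infty) \to \mathbb{R}$ be a continuous convex function that is normalized. If $Q, P \in S_1(H)$, with $P$ invertible, and there exist $R \ge 1 \ge r \ge 0$ such that the condition (38) is satisfied, then:
$$0 \le D_f(Q, P) \le \frac{(R - 1)f(r) + (1 - r)f(R)}{R - r}. \tag{46}$$
Proof. By the convexity of $f$, we have:
$$f(t) = f\left(\frac{(R - t)r + (t - r)R}{R - r}\right) \le \frac{(R - t)f(r) + (t - r)f(R)}{R - r}$$
for any $t \in [r, R]$. This inequality implies the following inequality in the operator order of $B(H)$:
$$f\left(P^{-1/2}QP^{-1/2}\right) \le \frac{\left(R 1_H - P^{-1/2}QP^{-1/2}\right)f(r) + \left(P^{-1/2}QP^{-1/2} - r 1_H\right)f(R)}{R - r} \tag{47}$$
for $Q, P \in S_1(H)$, with $P$ invertible, and $R \ge 1 \ge r \ge 0$ such that the condition (38) is satisfied.
Utilising the property (20), together with $\operatorname{tr}(P) = 1$ and $\operatorname{tr}\left(P \cdot P^{-1/2}QP^{-1/2}\right) = \operatorname{tr}(Q) = 1$, we get from (47) that:
$$\operatorname{tr}\left[P f\left(P^{-1/2}QP^{-1/2}\right)\right] \le \frac{(R - 1)f(r) + (1 - r)f(R)}{R - r},$$
and the inequality (46) is thus proven.
If we take in (46) $f(t) = t \ln t$, then we get the inequality:
$$0 \le \operatorname{tr}\left[P^{1/2}QP^{-1/2} \ln\left(P^{-1/2}QP^{-1/2}\right)\right] \le \frac{(R - 1)\, r \ln r + (1 - r)\, R \ln R}{R - r},$$
provided that $Q, P \in S_1(H)$, with $P, Q$ invertible and satisfying the condition (38). With the same assumptions for $P, Q$ and with $r > 0$, if we take in (46) $f(t) = -\ln t$, then we get the inequality:
$$0 \le -\operatorname{tr}\left[P \ln\left(P^{-1/2}QP^{-1/2}\right)\right] \le -\frac{(R - 1)\ln r + (1 - r)\ln R}{R - r}.$$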
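The bound (46) can be checked numerically for the two examples above. The following sketch assumes real symmetric 2x2 density matrices and uses the hypothetical constants `r = 0.5` and `R = 2.5`, which enclose the spectrum of $P^{-1/2}QP^{-1/2}$ for the chosen matrices. All names are illustrative and not taken from the paper.

```python
import math


def matmul(X, Y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def trace(X):
    return X[0][0] + X[1][1]


def eig_sym(M):
    """Eigenvalues (lo, hi) of a real symmetric 2x2 matrix."""
    mean = 0.5 * (M[0][0] + M[1][1])
    rad = math.hypot(0.5 * (M[0][0] - M[1][1]), 0.5 * (M[0][1] + M[1][0]))
    return mean - rad, mean + rad


def funm_sym(M, f):
    """f(M) for a real symmetric 2x2 matrix via Sylvester's formula."""
    lo, hi = eig_sym(M)
    if hi - lo < 1e-14:                     # numerically a multiple of 1_H
        return [[f(lo), 0.0], [0.0, f(lo)]]
    out = [[0.0, 0.0], [0.0, 0.0]]
    for i in range(2):
        for j in range(2):
            delta = 1.0 if i == j else 0.0
            out[i][j] = (f(hi) * (M[i][j] - lo * delta)
                         - f(lo) * (M[i][j] - hi * delta)) / (hi - lo)
    return out


def chord_bound(f, r, R):
    """Right-hand side of (46): ((R - 1) f(r) + (1 - r) f(R)) / (R - r)."""
    return ((R - 1.0) * f(r) + (1.0 - r) * f(R)) / (R - r)


# Hypothetical invertible density matrices.
P = [[0.6, 0.2], [0.2, 0.4]]
Q = [[0.5, -0.1], [-0.1, 0.5]]

P_inv_half = funm_sym(P, lambda t: t ** -0.5)
A = matmul(matmul(P_inv_half, Q), P_inv_half)
lo, hi = eig_sym(A)          # spectrum of P^{-1/2} Q P^{-1/2}
r, R = 0.5, 2.5              # assumed constants with r 1_H <= A <= R 1_H

generators = {
    "t ln t": lambda t: t * math.log(t) if t > 0 else 0.0,
    "-ln t": lambda t: -math.log(t),
}

# For each generator, store (D_f(Q, P), bound from (46)).
results = {}
for name, f in generators.items():
    d_f = trace(matmul(P, funm_sym(A, f)))
    results[name] = (d_f, chord_bound(f, r, R))
    print(name, results[name])
```

In dimension two the bound becomes an equality when $r$ and $R$ are exactly the two eigenvalues of $P^{-1/2}QP^{-1/2}$, because $f$ then coincides with its chord on the whole spectrum. Looser constants, as chosen here, give a strict inequality.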

Further Upper Bounds
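We also have:
Theorem 10. Let $f : [0, \infty) \to \mathbb{R}$ be a continuous convex function that is normalized. If $Q, P \in S_1(H)$, with $P$ invertible, and there exist $R > 1 > r \ge 0$ such that the condition (38) is satisfied, then:
$$0 \le D_f(Q, P) \le \frac{(R - 1)(1 - r)}{R - r}\sup_{t \in (r, R)}\Psi_f(t; r, R) \le \frac{(R - 1)(1 - r)}{R - r}\left[f'_-(R) - f'_+(r)\right] \le \frac{1}{4}(R - r)\left[f'_-(R) - f'_+(r)\right],$$
where $\Psi_f(\cdot; r, R) : (r, R) \to \mathbb{R}$ is defined by:
$$\Psi_f(t; r, R) = \frac{f(R) - f(t)}{R - t} - \frac{f(t) - f(r)}{t - r}.$$
We also have:
Theorem 11. Let $f : [0, \infty) \to \mathbb{R}$ be a continuous convex function that is normalized. If $Q, P \in S_1(H)$, with $P$ invertible, and there exist $R > 1 > r \ge 0$ such that the condition (38) is satisfied, then:
Proof. We recall the following result (see for instance [37]) that provides a refinement and a reverse for the weighted Jensen's discrete inequality:
$$n \min_{i \in \{1, \dots, n\}}\{p_i\}\left[\frac{1}{n}\sum_{i=1}^n f(x_i) - f\left(\frac{1}{n}\sum_{i=1}^n x_i\right)\right] \le \sum_{i=1}^n p_i f(x_i) - P_n f\left(\frac{1}{P_n}\sum_{i=1}^n p_i x_i\right) \le n \max_{i \in \{1, \dots, n\}}\{p_i\}\left[\frac{1}{n}\sum_{i=1}^n f(x_i) - f\left(\frac{1}{n}\sum_{i=1}^n x_i\right)\right], \tag{63}$$
where $f : C \to \mathbb{R}$ is a convex function defined on the convex subset $C$ of the linear space $X$, $\{x_i\}_{i \in \{1, \dots, n\}} \subset C$ are vectors and $\{p_i\}_{i \in \{1, \dots, n\}}$ are nonnegative numbers with $P_n := \sum_{i=1}^n p_i > 0$. For $n = 2$, we deduce from (63) that:
$$2 \min\{s, 1 - s\}\left[\frac{f(x) + f(y)}{2} - f\left(\frac{x + y}{2}\right)\right] \le s f(x) + (1 - s) f(y) - f(sx + (1 - s)y) \le 2 \max\{s, 1 - s\}\left[\frac{f(x) + f(y)}{2} - f\left(\frac{x + y}{2}\right)\right] \tag{64}$$
for any $x, y \in C$ and $s \in [0, 1]$. Now, if we use the second inequality in (64) for $x = r$, $y = R$, $s = \frac{R - t}{R - r}$ with $t \in [r, R]$, then we have:
$$\frac{(R - t)f(r) + (t - r)f(R)}{R - r} - f(t) \le 2 \max\left\{\frac{R - t}{R - r}, \frac{t - r}{R - r}\right\}\left[\frac{f(r) + f(R)}{2} - f\left(\frac{r + R}{2}\right)\right]$$
for any $t \in [r, R]$.

$\operatorname{tr}(T^*AT) = \operatorname{tr}\left(|T^*|^2 A\right) \ge 0$ and: $\langle R_B T, T \rangle_2 = \langle TB, T \rangle_2 = \operatorname{tr}(T^*TB) = \operatorname{tr}\left(|T|^2 B\right) \ge 0$ for any $T \in B_2(H)$, they are also positive in the operator order of $B(B_2(H))$, the Banach algebra of all bounded operators on $B_2(H)$ with the norm $\|\cdot\|_2$, where $\|T\|_2 = \left[\operatorname{tr}\left(|T|^2\right)\right]^{1/2}$, $T \in B_2(H)$.

$|A| := (A^*A)^{1/2}$. Because $\||A|x\| = \|Ax\|$ for all $x \in H$, $A$ is Hilbert-Schmidt iff $|A|$ is Hilbert-Schmidt and $\|A\|_2 = \||A|\|_2$. From (7) we have that if $A \in B_2(H)$, then $A^* \in B_2(H)$ and $\|A\|_2 = \|A^*\|_2$. The following theorem collects some of the most important properties of Hilbert-Schmidt operators: If $\{e_i\}_{i \in I}$ is an orthonormal basis of $H$, we say that $A \in B(H)$ is trace class if: i∈I Ae i , Be i = i∈I B * Ae i , e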