On the Monotonicity of Relative Entropy: A Comparative Study of Petz’s and Uhlmann’s Approaches

Matheus, Santiago; Bottacin, Francesco; Provenzi, Edoardo

doi:10.3390/e27090954


by Santiago Matheus ¹, Francesco Bottacin ¹ and Edoardo Provenzi ²,*

¹ Dipartimento di Matematica, Università degli Studi di Padova, Via Trieste 63, 35121 Padova, Italy

² Institute of Mathematics, Université de Bordeaux, CNRS, Bordeaux INP, IMB, UMR 5251, 351 Cours de la Libération, F-33400 Talence, France

* Author to whom correspondence should be addressed.

Entropy 2025, 27(9), 954; https://doi.org/10.3390/e27090954

Submission received: 19 August 2025 / Revised: 10 September 2025 / Accepted: 12 September 2025 / Published: 14 September 2025

(This article belongs to the Section Quantum Information)


Abstract

We revisit the monotonicity of relative entropy under the action of quantum channels, a foundational result in quantum information theory. Among the several available proofs, we focus on those by Petz and Uhlmann, which we reformulate within a unified, finite-dimensional operator-theoretic framework. In the first part, we examine Petz’s strategy, identify a subtle flaw in his original use of Jensen’s contractive operator inequality, and point out how it was corrected to restore the validity of his line of reasoning. In the second part, we develop Uhlmann’s approach, which is based on interpolations of positive sesquilinear forms and applies automatically to non-invertible density operators. By comparing these two approaches, we highlight their complementary strengths: Petz’s method is more direct and clear; Uhlmann’s method is more abstract and general. Our treatment aims to clarify the mathematical structure underlying the monotonicity of relative entropy and to make these proofs more accessible to a broader audience interested in both the foundations and applications of quantum information theory.

Keywords: quantum relative entropy; monotonicity; data processing inequality

1. Introduction

The (quantum) relative entropy is a central concept in quantum information theory, quantifying the distinguishability between quantum states and serving as a key tool in the analysis of information processing tasks.

One of its most important properties is the monotonicity under quantum channels (a quantum channel is a completely positive trace-preserving linear map; see later for a rigorous definition), which expresses the idea that state distinguishability cannot increase during the dynamical evolution of a quantum system that interacts with an environment. This property is also known as the data processing inequality (DPI).

The concept of relative entropy was first introduced by Umegaki in 1962 [1] in the setting of $σ$-finite von Neumann algebras, and it was later extended to arbitrary von Neumann algebras by Araki in 1976 [2] by means of Tomita–Takesaki modular theory.

The proof of the monotonicity of relative entropy is closely related to the proof of the strong subadditivity (SSA) of the von Neumann entropy. The latter serves as a measure of the mixedness of quantum states, and the validity of the SSA ensures that quantum uncertainty behaves consistently across composite systems, placing fundamental constraints on how information and correlations can be distributed among subsystems.

The first step toward proving SSA was made in 1968 by Lanford and Robinson [3], who established the subadditivity of the von Neumann entropy and conjectured its stronger form. In 1973, Lieb [4], building on earlier work by Wigner, Yanase, and Dyson, proved several key properties concerning the convexity and concavity of operator functions and trace functionals.

These results enabled Lieb and Ruskai to establish the full proof of the strong subadditivity of the von Neumann entropy for both finite and infinite-dimensional Hilbert spaces later that same year in their landmark paper [5]. In that work, they also derived, though without emphasizing it as such, the monotonicity of relative entropy under the partial trace operation, which constitutes a special case of quantum channels.

The first explicit and general proof of the monotonicity of relative entropy under the action of quantum channels was provided two years later by Lindblad in his seminal 1975 paper [6], thanks to the results established by Lieb and Ruskai in [5].

A further breakthrough was achieved by Uhlmann in 1977, who extended the property of monotonicity to a broader class of transformations: the adjoints of unital Schwarz maps [7].

The equivalence between SSA and the monotonicity of relative entropy was rigorously established by Petz in the 1980s [8,9]. Later, in [10], Petz proposed a new proof of monotonicity and posed the question of whether this property also holds for positive (but not necessarily completely positive) trace-preserving linear maps. This question remained open for several years until it was affirmatively resolved in 2017 by Müller-Hermes and Reeb in [11]. Other pertinent references related to the monotonicity of relative entropy are [12,13,14,15].

In this paper, we focus specifically on the demonstration strategies proposed by Petz and Uhlmann in [10] and [7] respectively, which we believe offer particularly interesting and useful complementary perspectives. Petz’s proof relies on operator-algebraic methods and an explicit representation of the relative entropy in terms of a suitable inner product, an idea inspired by the previously quoted work of Araki.

However, as we show, his original use of Jensen’s operator inequality contains a subtle flaw when applied in its contractive form. We point out how Petz himself and Nielsen corrected the problem in order to restore the validity of Petz’s original approach.

Uhlmann’s proof, on the other hand, is formulated in terms of interpolations of positive sesquilinear forms, a technique that naturally extends to non-invertible states and arbitrary quantum channels.

Despite their foundational importance, both Petz’s and Uhlmann’s proofs are often considered technically demanding and conceptually opaque. Petz’s approach, while elegant, involves intricate manipulations of operator inequalities that can obscure the overall structure of the argument. Uhlmann’s method introduces a formalism that is unfamiliar to many working in quantum information theory and is rarely presented in full detail in the literature.

We aim to clarify and systematize both strategies by reformulating them in a unified, finite-dimensional, operator-theoretic framework, thus making both proofs more accessible to a wider audience.

This paper is structured as follows: In Section 2, we recall the necessary preliminaries on operator convexity, partial trace, and quantum channels. In Section 3, we analyze Petz’s proof, highlight its limitations, and discuss its correction. In Section 4, we develop Uhlmann’s approach in detail and derive the monotonicity of relative entropy in the general setting.

2. Mathematical Preliminaries

In this section, we recall the basic definitions and results needed in the rest of the paper.

Given a finite-dimensional Hilbert state space $(H, {〈, 〉}_{H})$ over the field $F = R$ or $C$, $B (H)$ indicates the $F$-algebra of linear (bounded) operators $A : H \to H$.

We recall that an operator $A \in B (H)$ is:

Positive semi-definite, written as $A \geq 0$ if ${〈 x, A x 〉}_{H} \geq 0$ for all $x \in H$ ;
Positive definite, written as $A > 0$ if ${〈 x, A x 〉}_{H} > 0$ for all $x \neq 0_{H} \in H$ ;
Hermitian if $A = A^{†}$ , where $A^{†}$ is the adjoint operator of A, defined by the formula ${〈 A^{†} x, y 〉}_{H} = {〈 x, A y 〉}_{H}$ for all $x, y \in H$ .

If $F = C$, then a positive semi-definite, or positive definite, operator is automatically Hermitian, but this is not the case if $F = R$.

Suppose now that $H$ is the state space of a quantum system. The density operator (also called the density matrix) $ρ \in B (H)$ associated with a given state s of the system is positive semi-definite, Hermitian, and such that $Tr (ρ) = 1$. Hence, $ρ$ has eigenvalues $0 \leq λ_{j} \leq 1$, which sum up to 1.

As is well-known, if s is a pure state, then $ρ$ is a rank-one orthogonal projector and thus not invertible.

$B (H)$ itself becomes an $F$-Hilbert space when it is endowed with the Hilbert–Schmidt operator inner product

{〈 A, B 〉}_{B (H)} : = Tr (A^{†} B), A, B \in B (H) .

(1)

The subset of $B (H)$ given by Hermitian operators on $H$, indicated with $B_{H} (H)$, is a real Hilbert space with respect to the inner product inherited from $B (H)$, and it is also a partially ordered set with respect to the Löwner ordering, defined as follows: for all $A, B \in B_{H} (H)$, $A \leq B \Leftrightarrow B - A \geq 0$.

Given a function $f : I \subseteq R \to R$ and $A \in B_{H} (H)$ with spectral decomposition $A = U D U^{†}$, with U being unitary and D being diagonal, with entries given by the eigenvalues $λ_{j}$ of A, all supposed to belong to I, we write as usual

f (A) = U f (D) U^{†},

(2)

where $f (D)$ is diagonal, with non-trivial entries given by $f (λ_{j})$.

If, for every finite-dimensional $H$ and every pair of operators $A, B \in B_{H} (H)$ with spectra contained in I, we have

A \leq B \Rightarrow f (A) \leq f (B),

(3)

then f is said to be operator monotone on I. If, instead, for all such A and B we have

f (\frac{A + B}{2}) \leq \frac{f (A) + f (B)}{2},

(4)

then f is said to be operator convex on I. If the last inequality is reversed, f is said to be operator concave on I.

By Löwner’s theorem (see, e.g., [16], Chapter V.4), $f : I \to R$ is operator monotone if and only if it has an analytic continuation that maps the upper half-plane $H_{+}$ into itself. Notable examples of operator monotone functions on $(0, + \infty)$ are $x \mapsto log (x)$ and $x \mapsto - 1 / x$.

If $f : I \to R$ is a continuous and operator monotone function on I, then, for every $a \in I$, the function $F : I \to R$ given by

F (x) = \int_{a}^{x} f (t) d t

(5)

is operator convex on I; see again [16]. So, since $t \mapsto - 1 / t$ is operator monotone on $(0, + \infty)$,

x ⟼ \int_{1}^{x} - \frac{1}{t} d t = - log (x)

(6)

is operator convex on $(0, + \infty)$, which implies that $x \mapsto log (x)$ is operator concave on $(0, + \infty)$.
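As a numerical illustration of these notions, the following sketch (Python with NumPy, a tooling choice made for this example and not taken from the paper) checks implication (3) for the logarithm on one pair of 2 × 2 matrices and shows that $t \mapsto t^{2}$, although increasing on $[0, + \infty)$ as a scalar function, violates it. The helper names `herm_fun` and `is_psd` and the test matrices are hypothetical choices; a single example can refute operator monotonicity but of course cannot prove it.

```python
import numpy as np

def herm_fun(f, A):
    """Apply a scalar function to a Hermitian matrix via A = U D U^dagger, as in Equation (2)."""
    w, U = np.linalg.eigh(A)
    return (U * f(w)) @ U.conj().T

def is_psd(M, tol=1e-10):
    """True if the Hermitian part of M has no eigenvalue below -tol."""
    return bool(np.linalg.eigvalsh((M + M.conj().T) / 2).min() >= -tol)

# A <= B in the Loewner order, since B - A = diag(1, 0) >= 0.
A = np.array([[1.0, 1.0], [1.0, 1.0]])
B = np.array([[2.0, 1.0], [1.0, 1.0]])

# t -> t^2 is not operator monotone: B^2 - A^2 has a negative eigenvalue.
square_ok = is_psd(herm_fun(np.square, B) - herm_fun(np.square, A))

# The logarithm needs strictly positive spectra, so shift both matrices by the identity.
A1, B1 = A + np.eye(2), B + np.eye(2)
log_monotone_ok = is_psd(herm_fun(np.log, B1) - herm_fun(np.log, A1))

# Midpoint operator concavity of log, i.e., inequality (4) reversed.
log_concave_ok = is_psd(
    herm_fun(np.log, (A1 + B1) / 2)
    - (herm_fun(np.log, A1) + herm_fun(np.log, B1)) / 2
)

print(square_ok, log_monotone_ok, log_concave_ok)
```

Here $B^{2} - A^{2}$ has determinant $- 1$, which is why the squaring map fails the test even though $A \leq B$.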

As is well-known, a convex function $f : I \to R$ satisfies the Jensen inequality

f (\sum_{i = 1}^{n} λ_{i} x_{i}) \leq \sum_{i = 1}^{n} λ_{i} f (x_{i}),

(7)

for all $x_{1}, \dots, x_{n} \in I$ and non-negative weights $λ_{1}, \dots, λ_{n}$ such that $\sum_{i = 1}^{n} λ_{i} = 1$.

The Jensen inequality can be generalized to operator convex functions, as stated in the following celebrated theorem, proven by Hansen and Pedersen [17].

Theorem 1 (Jensen’s operator inequality). For a continuous function $f : I \to R$, the following conditions are equivalent:

(i): f is operator convex on I.
(ii): For each natural number $n \geq 1$ , the following inequality holds:

$f (\sum_{i = 1}^{n} A_{i}^{†} X_{i} A_{i}) \leq \sum_{i = 1}^{n} A_{i}^{†} f (X_{i}) A_{i},$

(8)

for every n-tuple of bounded Hermitian operators $X_{1}, \dots, X_{n}$ on an arbitrary Hilbert space $H$ , with spectra contained in I, and every n-tuple of operators $(A_{1}, \dots, A_{n})$ on $H$ satisfying $\sum_{i = 1}^{n} A_{i}^{†} A_{i} = i d_{H} .$
(iii): $f (V^{†} X V) \leq V^{†} f (X) V$ for every isometry $V : K \to H$ from an arbitrary Hilbert space $K$ into an arbitrary Hilbert space $H$ and every Hermitian operator X on $H$ with spectrum contained in I.

An immediate consequence of this theorem is the following corollary, also known as Jensen’s contractive operator inequality.

Corollary 1 (Contractive version). Let $f : I \to R$ be a continuous function, and suppose that $0 \in I$. Then, f is operator convex on I and $f (0) \leq 0$ if and only if, for some (and hence for every) $n \geq 1$, inequality (8) is valid for every n-tuple of bounded Hermitian operators $X_{1}, \dots, X_{n}$ on an arbitrary Hilbert space $H$, with spectra contained in I, and every n-tuple of operators $(A_{1}, \dots, A_{n})$ on $H$ satisfying $\sum_{i = 1}^{n} A_{i}^{†} A_{i} \leq i d_{H}$.

The reason for the adjective ‘contractive’ can be easily understood by setting $n = 1$; in this case, it follows that f is operator convex on an interval I containing 0 with $f (0) \leq 0$ if and only if

f (A^{†} X A) \leq A^{†} f (X) A,

(9)

for every bounded Hermitian operator X with spectrum in I and every operator A such that $A^{†} A \leq i d$, which means that A is a contraction, i.e., $∥ A x ∥ \leq ∥ x ∥$ for all $x \in H$ or, equivalently, $∥ A ∥ \leq 1$.
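The role of the hypothesis $f (0) \leq 0$ in inequality (9) can be seen in a small numerical experiment. The sketch below (Python with NumPy; the helper names, the random seed, and the two test functions are assumptions made only for this illustration) uses $f (t) = t^{2}$, which is operator convex with $f (0) = 0$, and $g (t) = t^{2} + 1$, which is operator convex but has $g (0) = 1 > 0$. For the zero contraction $A = 0$, inequality (9) would require $g (0) \, id \leq 0$, which is false.

```python
import numpy as np

rng = np.random.default_rng(0)

def herm_fun(f, A):
    """Functional calculus for a Hermitian matrix, as in Equation (2)."""
    w, U = np.linalg.eigh(A)
    return (U * f(w)) @ U.conj().T

def min_eig(M):
    """Smallest eigenvalue of the Hermitian part of M."""
    return float(np.linalg.eigvalsh((M + M.conj().T) / 2).min())

d = 4
G = rng.normal(size=(d, d))
X = (G + G.T) / 2                      # random real symmetric X
A = rng.normal(size=(d, d))
A *= 0.9 / np.linalg.norm(A, 2)        # spectral norm 0.9, hence A^T A <= id

# f(t) = t^2: the gap A^T f(X) A - f(A^T X A) should be positive semi-definite.
f = np.square
gap_f = A.T @ herm_fun(f, X) @ A - herm_fun(f, A.T @ X @ A)

# g(t) = t^2 + 1 with the zero contraction: the gap equals -id.
g = lambda t: t**2 + 1.0
A0 = np.zeros((d, d))
gap_g = A0.T @ herm_fun(g, X) @ A0 - herm_fun(g, A0.T @ X @ A0)

print(min_eig(gap_f), min_eig(gap_g))
```

For $f (t) = t^{2}$ the gap can be written as $A^{†} X (id - A A^{†}) X A$, which is manifestly positive semi-definite whenever A is a contraction.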

Let us now consider two interacting quantum systems a and b with Hilbert state spaces $H_{a}$ and $H_{b}$, respectively. The associated composite quantum system has Hilbert state space $H_{a b} = H_{a} \otimes H_{b}$.

In the following, we indicate with $X_{a}, X_{b}, X_{a b}$ generic operators of $B_{H} (H_{a})$, $B_{H} (H_{b})$, and $B_{H} (H_{a b})$, respectively.

The partial trace ${Tr}_{b}$ over subsystem b is a ‘superoperator’, i.e., a linear map ${Tr}_{b} \in B (B_{H} (H_{a b}), B_{H} (H_{a}))$, defined as the linear extension to the whole $B_{H} (H_{a b})$ of the map

{Tr}_{b} (X_{a} \otimes X_{b}) = Tr (X_{b}) X_{a} .

(10)

${Tr}_{b}$ satisfies the following property (see, e.g., [18], page 100):

Tr ({Tr}_{b} (X_{a b}) X_{a}) = Tr (X_{a b} (X_{a} \otimes i d_{b})) .

(11)

The partial trace ${Tr}_{b}$ is:

Trace-preserving: $Tr (X_{a b}) = Tr ({Tr}_{b} (X_{a b}))$ ;
Positive: if $X_{a b} \geq 0$ , then ${Tr}_{b} (X_{a b}) \geq 0$ .

As a consequence, ${Tr}_{b}$ maps states of $H_{a b}$ into states of $H_{a}$.
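The defining relation (10), property (11), and the two listed properties of the partial trace can be checked numerically. The following is a minimal sketch in Python with NumPy, assuming that operators on $H_{a} \otimes H_{b}$ are stored as matrices in the Kronecker-product ordering used by `np.kron`; the function names, dimensions, and random seed are hypothetical choices for this illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def rand_herm(n):
    """Random complex Hermitian n x n matrix."""
    G = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return (G + G.conj().T) / 2

def partial_trace_b(X_ab, da, db):
    """Tr_b of a (da*db) x (da*db) matrix in Kronecker ordering."""
    # Indices after reshape: (i, k, j, l) with i, j on H_a and k, l on H_b.
    return np.trace(X_ab.reshape(da, db, da, db), axis1=1, axis2=3)

da, db = 2, 3
X_a, X_b, X_ab = rand_herm(da), rand_herm(db), rand_herm(da * db)

# Equation (10) on a product operator.
eq10_ok = np.allclose(partial_trace_b(np.kron(X_a, X_b), da, db),
                      np.trace(X_b) * X_a)

# Property (11).
lhs = np.trace(partial_trace_b(X_ab, da, db) @ X_a)
rhs = np.trace(X_ab @ np.kron(X_a, np.eye(db)))
eq11_ok = np.isclose(lhs, rhs)

# Trace preservation.
trace_ok = np.isclose(np.trace(X_ab), np.trace(partial_trace_b(X_ab, da, db)))

# Positivity, tested on one random positive semi-definite operator.
G = rng.normal(size=(da * db, da * db)) + 1j * rng.normal(size=(da * db, da * db))
P_ab = G @ G.conj().T
positive_ok = np.linalg.eigvalsh(partial_trace_b(P_ab, da, db)).min() >= -1e-10

print(eq10_ok, eq11_ok, trace_ok, positive_ok)
```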

Moreover, ${Tr}_{b}$ is completely positive; i.e., ${Tr}_{b} \otimes i d_{H}$ is a positive linear map for all Hilbert spaces $H$. A trace-preserving and completely positive (or ‘CPTP’) linear map is called a channel.

The partial trace is actually one of the three main elements of every channel. In fact, thanks to the Stinespring theorem (see [19]), given any channel $C \in B (B_{H} (H_{a}), B_{H} (H_{a}))$, there exist a density operator $Y \in B_{H} (H_{b})$ and a unitary operator U on $H_{a} \otimes H_{b}$ such that

C (X) = {Tr}_{b} (U (X \otimes Y) U^{†}), \forall X \in B_{H} (H_{a}) .

(12)

By the Riesz representation theorem, we can define the adjoint of the partial trace ${Tr}_{b}$ as the only operator ${Tr}_{b}^{†} : B_{H} (H_{a}) \to B_{H} (H_{a b})$ satisfying

{〈 X_{a b}, {Tr}_{b}^{†} (X_{a}) 〉}_{B_{H} (H_{a b})} = {〈 {Tr}_{b} (X_{a b}), X_{a} 〉}_{B_{H} (H_{a})} .

(13)

Writing the inner products explicitly and using property (11), we find

Tr (X_{a b} {Tr}_{b}^{†} (X_{a})) = Tr ({Tr}_{b} (X_{a b}) X_{a}) = Tr (X_{a b} (X_{a} \otimes i d_{b})),

(14)

which allows us to write the explicit action of ${Tr}_{b}^{†}$ as

{Tr}_{b}^{†} (X_{a}) = X_{a} \otimes i d_{b} .

(15)

This formula implies that ${({Tr}_{b}^{†} (X_{a}))}^{†} = {Tr}_{b}^{†} (X_{a}^{†})$ and that ${Tr}_{b}^{†}$ is unital; i.e., it maps the identity of its domain to the identity of its range:

{Tr}_{b}^{†} (i d_{a}) = i d_{a b} .

(16)
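The duality relation (13), with the explicit form (15) of the adjoint, and the unitality property (16) can be verified numerically. The sketch below is written in Python with NumPy under the same storage assumption as `np.kron` (Kronecker ordering of $H_{a} \otimes H_{b}$); all function names and dimensions are hypothetical and serve only this illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

def rand_herm(n):
    """Random complex Hermitian n x n matrix."""
    G = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return (G + G.conj().T) / 2

def partial_trace_b(X_ab, da, db):
    """Tr_b of a (da*db) x (da*db) matrix in Kronecker ordering."""
    return np.trace(X_ab.reshape(da, db, da, db), axis1=1, axis2=3)

def partial_trace_b_adjoint(X_a, db):
    """Explicit adjoint from Equation (15): X_a -> X_a (x) id_b."""
    return np.kron(X_a, np.eye(db))

def hs_inner(A, B):
    """Hilbert-Schmidt inner product of Equation (1)."""
    return np.trace(A.conj().T @ B)

da, db = 2, 3
X_a, X_ab = rand_herm(da), rand_herm(da * db)

# Equation (13): <X_ab, Tr_b^dagger(X_a)> = <Tr_b(X_ab), X_a>.
lhs = hs_inner(X_ab, partial_trace_b_adjoint(X_a, db))
rhs = hs_inner(partial_trace_b(X_ab, da, db), X_a)

# Equation (16): the adjoint is unital.
unital_ok = np.allclose(partial_trace_b_adjoint(np.eye(da), db), np.eye(da * db))

print(np.isclose(lhs, rhs), unital_ok)
```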

Being a completely positive unital transformation, ${Tr}_{b}^{†}$ is a so-called Schwarz map (see [20], Corollary 2.8); i.e., it satisfies the following inequality:

{Tr}_{b}^{†} (X_{a}^{†}) {Tr}_{b}^{†} (X_{a}) \leq {Tr}_{b}^{†} (X_{a}^{†} X_{a}) .

(17)

This property is shared by the adjoint of any channel $C$.

Relative Entropy in Quantum Information Theory

We devote a separate subsection to relative entropy, as its proper treatment entails the detailed development of several technical aspects.

This is essential for addressing certain subtleties that are often overlooked in the literature yet play a crucial role in the rigorous analysis of relative entropy.

Given any finite-dimensional Hilbert space $H$ and any density operator $ρ \in B_{H} (H)$ with eigenvalues $λ_{j} \geq 0$, we indicate with $Sp (ρ)$ its spectrum and with ${Sp}_{+} (ρ)$ the subset of $Sp (ρ)$ composed only of strictly positive eigenvalues.

It will be convenient to have the following index sets at hand: $I (ρ) = \{ j : λ_{j} \in Sp (ρ) \}$ and $I_{+} (ρ) = \{ j : λ_{j} \in {Sp}_{+} (ρ) \}$.

The support of $ρ$, denoted with $supp (ρ)$, is the subset of $H$ defined as follows:

\begin{aligned} supp (ρ) & : = span \{ x_{j} \in H : x_{j} \text{ is an eigenvector of } ρ \text{ with eigenvalue } λ_{j} \in {Sp}_{+} (ρ) \} \\ & = ⨁_{j \in I_{+} (ρ)} E_{λ_{j}}, \end{aligned}

(18)

where $E_{λ_{j}}$ is the eigenspace relative to the eigenvalue $λ_{j}$.

From this definition and the spectral theorem for Hermitian operators, it immediately follows that

supp (ρ) = ker {(ρ)}^{⊥} = Im (ρ) .

(19)

This implies that $H = ker (ρ) \oplus supp (ρ)$, so the positive definite operator $ρ |_{supp (ρ)}$ has a trivial kernel; hence, it is invertible.

In particular, if $ρ$ is positive definite, then $ρ$ is invertible everywhere, and its image and support coincide with the entire $H$. On the other hand, non-invertible density operators, for instance, those corresponding to pure states, have support strictly included in $H$.

The deviation from purity, or mixedness, of $ρ$ can be measured by its von Neumann entropy, defined as follows:

S (ρ) : = - Tr (ρ log ρ) .

(20)

The condition $S (ρ) = 0$ is satisfied if and only if $ρ$ is pure. In the literature, the precise definition of the logarithm of a matrix may vary according to the specific aims and context in which the matrix logarithm is employed.

For the purposes of this paper, the key property that the logarithm of a density operator $ρ$, or, more generally, of a Hermitian operator, must satisfy is $exp (log ρ) = ρ$. For this reason, we adopt a definition of log via functional calculus in the extended real numbers with the following conventions:

log (0) = - \infty, exp (- \infty) = 0, 0 log 0 = 0,

(21)

which are justified by the limits $log ε \to - \infty$ as $ε \to 0^{+}$, $exp (- M) \to 0^{+}$ as $M \to + \infty$, and $ε log ε \to 0$ as $ε \to 0^{+}$.

Using the spectral theorem, if $ρ$ is decomposed as

ρ = \sum_{j \in I_{+} (ρ)} λ_{j} P_{j} + 0 P_{0},

(22)

where $P_{j}$ denotes the orthogonal projector onto the eigenspace $E_{λ_{j}}$, $j \in I_{+} (ρ)$, and $P_{0}$ is the orthogonal projector on $ker (ρ)$, then, using the previous conventions, we have

log ρ : = \sum_{j \in I_{+} (ρ)} log (λ_{j}) P_{j} + (- \infty) P_{0},

(23)

and so

exp (log ρ) = \sum_{j \in I_{+} (ρ)} exp (log λ_{j}) P_{j} + exp (- \infty) P_{0} = \sum_{j \in I_{+} (ρ)} λ_{j} P_{j} + 0 P_{0} = ρ .

(24)

Since the projectors satisfy the orthogonality relation $P_{i} P_{j} = δ_{i j} P_{j}$ and by the convention $0 log 0 = 0$, which accounts for the zero eigenvalue, we obtain

S (ρ) = - Tr (ρ log ρ) = - \sum_{j \in I_{+} (ρ)} λ_{j} log (λ_{j}) Tr (P_{j}),

(25)

where $Tr (P_{j})$ is the multiplicity of the eigenvalue $λ_{j}$. In the literature, it is actually more common to write just

S (ρ) = - \sum_{j \in I_{+} (ρ)} λ_{j} log (λ_{j}),

(26)

with the understanding that each eigenvalue is repeated according to its multiplicity.
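Formula (26), together with the convention $0 log 0 = 0$, translates directly into a few lines of code. The following is a minimal sketch in Python with NumPy; the function name `von_neumann_entropy` and the numerical threshold `tol`, below which an eigenvalue is treated as zero, are assumptions of this illustration and not part of the paper.

```python
import numpy as np

def von_neumann_entropy(rho, tol=1e-12):
    """S(rho) = -sum_j lambda_j log(lambda_j) over strictly positive eigenvalues.

    Eigenvalues are returned by eigvalsh with multiplicity, as in Equation (26);
    values not exceeding tol are dropped, which implements 0 log 0 = 0.
    """
    lam = np.linalg.eigvalsh(rho)
    lam = lam[lam > tol]
    return float(-np.sum(lam * np.log(lam)))

pure = np.diag([1.0, 0.0])              # rank-one projector: a pure state
mixed = np.eye(3) / 3                   # maximally mixed state on a 3-dim space
other = np.diag([0.5, 0.25, 0.25])

print(von_neumann_entropy(pure),
      von_neumann_entropy(mixed),
      von_neumann_entropy(other))
```

The pure state has zero entropy, while the maximally mixed state on a d-dimensional space attains the value $log (d)$.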

The von Neumann entropy does not, by itself, tell us how one state differs from another. To capture the distinguishability of two states, the concept of relative entropy (sometimes called the quantum Kullback–Leibler divergence) must be introduced.

The definition of the relative entropy between two density operators $ρ$ and $σ$ is subject to a condition on the compatibility of their supports, precisely the following:

S (ρ ∥ σ) : = \begin{cases} Tr (ρ log ρ - ρ log σ) & \text{if } supp (ρ) \subseteq supp (σ) \\ + \infty & \text{if } supp (ρ) ⊈ supp (σ) \end{cases},

(27)

which is known as the ‘support-based definition’ of relative entropy.

In order to understand why the condition $supp (ρ) \subseteq supp (σ)$ is both necessary and sufficient for $S (ρ ∥ σ)$ to be well-defined, we first note that

Tr (ρ log ρ - ρ log σ) = Tr (ρ log ρ) - Tr (ρ log σ),

(28)

so the first term of Formula (28) is minus the von Neumann entropy, which is always finite. In order to analyze the second term, let us consider, alongside Equation (22), the analogous spectral decomposition of $σ$:

σ = \sum_{k \in I_{+} (σ)} μ_{k} Π_{k} + 0 Π_{0},

(29)

so that

log σ = \sum_{k \in I_{+} (σ)} log (μ_{k}) Π_{k} + log (0) Π_{0},

(30)

where $Π_{k}$ is the orthogonal projector on the eigenspace relative to the positive eigenvalue $μ_{k}$ of $σ$, and $Π_{0}$ is the projector on $ker (σ)$. Then,

Tr (ρ log σ) = \sum_{j \in I_{+} (ρ)} \sum_{k \in I_{+} (σ)} λ_{j} log (μ_{k}) Tr (P_{j} Π_{k}) + \sum_{j \in I_{+} (ρ)} λ_{j} log (0) Tr (P_{j} Π_{0}),

(31)

where the contribution of $P_{0}$ does not appear due to the convention $0 log (0) = 0$.

The first term is always finite, since only strictly positive eigenvalues appear. Instead, the behavior of the second term depends on the value of $Tr (P_{j} Π_{0})$. To compute it, let us fix $j \in I_{+} (ρ)$ and consider any orthonormal eigenbasis ${(x_{i})}_{i}$ of $H$ for $ρ$, so that $P_{j} x_{i} = x_{i}$ if $x_{i} \in E_{λ_{j}}$ and $P_{j} x_{i} = 0$ otherwise, and use the fact that orthogonal projectors are Hermitian and idempotent to write

\begin{aligned} Tr (P_{j} Π_{0}) & = \sum_{i} 〈 x_{i}, P_{j} Π_{0} x_{i} 〉 = \sum_{i} 〈 P_{j} x_{i}, Π_{0} x_{i} 〉 = \sum_{i : x_{i} \in E_{λ_{j}}} 〈 x_{i}, Π_{0} x_{i} 〉 \\ & = \sum_{i : x_{i} \in E_{λ_{j}}} 〈 x_{i}, Π_{0} Π_{0} x_{i} 〉 = \sum_{i : x_{i} \in E_{λ_{j}}} 〈 Π_{0} x_{i}, Π_{0} x_{i} 〉 = \sum_{i : x_{i} \in E_{λ_{j}}} {∥ Π_{0} x_{i} ∥}^{2} \geq 0 . \end{aligned}

(32)

The second term in Equation (31) diverges to $- \infty$ if and only if $Tr (P_{j} Π_{0}) > 0$ for at least one $j \in I_{+} (ρ)$, i.e., when there exists at least one eigenvector $x_{i} \in supp (ρ)$ with a non-trivial projection onto $ker (σ)$, i.e., $Π_{0} x_{i}$ is non-null, which is equivalent to saying that $x_{i} \notin ker {(σ)}^{⊥} = supp (σ)$. Therefore,

Tr (ρ log σ) = - \infty \Leftrightarrow \exists j \in I_{+} (ρ) : Tr (P_{j} Π_{0}) > 0 \Leftrightarrow supp (ρ) ⊈ supp (σ),

(33)

which justifies the definition given in (27).

Instead, if $supp (ρ) \subseteq supp (σ)$, then $Tr (P_{j} Π_{0}) = 0$ for all $j \in I_{+} (ρ)$; using again the convention $0 log (0) = 0$, the second term in Equation (31) vanishes, and we are left with just the first term, which can be written in an alternative form. Consider now the orthonormal eigenbases ${(x_{j})}_{j \in I_{+} (ρ)}$, ${(y_{k})}_{k \in I_{+} (σ)}$ of $supp (ρ)$ and $supp (σ)$, respectively, with each eigenvalue repeated according to its multiplicity, so that every $P_{j}$ and $Π_{k}$ has rank one; then, we have the following well-known formula (see, e.g., [21]):

Tr (P_{j} Π_{k}) = {| 〈 x_{j}, y_{k} 〉 |}^{2},

(34)

which shows that the factor $Tr (P_{j} Π_{k})$ codifies the transition probability between the pure states represented by the unit vectors $x_{j}$ and $y_{k}$.

In summary, when $supp (ρ) \subseteq supp (σ)$, the relative entropy between $ρ$ and $σ$ can be written explicitly as follows:

\begin{aligned} S (ρ ∥ σ) & = - S (ρ) - \sum_{j \in I_{+} (ρ)} \sum_{k \in I_{+} (σ)} λ_{j} log (μ_{k}) Tr (P_{j} Π_{k}) \\ & = \sum_{j \in I_{+} (ρ)} λ_{j} \left( log (λ_{j}) - \sum_{k \in I_{+} (σ)} log (μ_{k}) {| 〈 x_{j}, y_{k} 〉 |}^{2} \right) . \end{aligned}

(35)

Two cases are particularly relevant in practical contexts:

If $ρ, σ > 0$ , then their supports coincide with the entire Hilbert space $H$ , and so their relative entropy is the finite real number given by Equation (35);
Instead, if $ρ > 0$ but $σ$ is not, then their relative entropy is infinite. This happens, for instance, when $ρ$ is a full-rank mixed state and $σ$ is a pure state.
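The support-based definition (27), with the explicit expression (35) in the finite case, can be turned into a short numerical routine. The sketch below uses Python with NumPy; the function name `relative_entropy` and the threshold `tol` used to decide which eigenvalues and overlaps count as zero are assumptions of this illustration. Eigenvalues are taken with multiplicity, so every spectral projector has rank one and Equation (34) applies.

```python
import numpy as np

def relative_entropy(rho, sigma, tol=1e-12):
    """Support-based relative entropy S(rho || sigma), Equations (27) and (35)."""
    lam, X = np.linalg.eigh(rho)        # columns of X: eigenvectors x_j of rho
    mu, Y = np.linalg.eigh(sigma)       # columns of Y: eigenvectors y_k of sigma
    overlap = np.abs(X.conj().T @ Y) ** 2   # |<x_j, y_k>|^2, Equation (34)

    pos_rho = lam > tol                 # indices in I_+(rho)
    ker_sigma = mu <= tol               # eigenvectors spanning ker(sigma)

    # supp(rho) is not contained in supp(sigma) exactly when some x_j with
    # lambda_j > 0 has a non-zero component in ker(sigma).
    if np.any(overlap[np.ix_(pos_rho, ker_sigma)] > tol):
        return np.inf

    lam_p = lam[pos_rho]
    mu_p = mu[~ker_sigma]
    term_rho = np.sum(lam_p * np.log(lam_p))
    term_sigma = np.sum(lam_p[:, None]
                        * overlap[np.ix_(pos_rho, ~ker_sigma)]
                        * np.log(mu_p)[None, :])
    return float(term_rho - term_sigma)

rho = np.diag([0.5, 0.5])
sigma = np.diag([0.25, 0.75])
pure = np.diag([1.0, 0.0])

print(relative_entropy(rho, sigma))     # both full rank: finite value
print(relative_entropy(pure, rho))      # pure state against full-rank state: finite
print(relative_entropy(rho, pure))      # full-rank state against pure state: inf
```

The last call reproduces the second case of the list above: a full-rank $ρ$ compared with a pure $σ$ gives an infinite relative entropy.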

An equivalent and useful definition of the von Neumann and relative entropy appears in the literature (see, e.g., [22]). Rather than relying on the support of the density operators, this alternative definition is based on a regularization procedure.

Specifically, given a state $ρ$ on $H$, one defines the regularized operator

ρ_{ε} : = ρ + ε i d_{H},

(36)

with $ε > 0$. The operator $ρ_{ε}$ is positive definite and Hermitian, and the spectral decompositions of $ρ_{ε}$ and $log ρ_{ε}$ are

ρ_{ε} = \sum_{j \in I (ρ)} (λ_{j} + ε) P_{j}, log ρ_{ε} = \sum_{j \in I (ρ)} log (λ_{j} + ε) P_{j} .

(37)

We have

(λ_{j} + ε) log (λ_{j} + ε) \underset{ε \to 0^{+}}{⟶} \begin{cases} λ_{j} log (λ_{j}) & \text{if } j \in I_{+} (ρ) \\ 0 & \text{if } λ_{j} = 0 \end{cases},

(38)

so the von Neumann entropy of $ρ$ is well-defined via the limit

S (ρ) : = - lim_{ε \to 0^{+}} Tr (ρ_{ε} log ρ_{ε}) .

(39)

Similarly, setting $σ_{ε} : = σ + ε i d_{H}$, with $ε > 0$, the relative entropy between $ρ$ and $σ$ can be defined as

S (ρ ∥ σ) : = lim_{ε \to 0^{+}} Tr (ρ_{ε} log ρ_{ε} - ρ_{ε} log σ_{ε}),

(40)

known as the ‘regularized definition’ of relative entropy.

Let us verify that the regularized and support-based definitions of relative entropy coincide. Equation (40) splits into two terms; the first equals minus the regularized definition of the von Neumann entropy, which is finite by Equation (39), so the only issue to address concerns the second term of Equation (40).

For that, we write the spectral decomposition of $σ_{ε}$ as follows:

σ_{ε} = \sum_{k \in I_{+} (σ)} (μ_{k} + ε) Π_{k} + ε Π_{0} .

(41)

Repeating the same computations performed in the case of the support-based definition of $S (ρ ∥ σ)$, we obtain

Tr (ρ_{ε} log σ_{ε}) = \sum_{k \in I_{+} (σ)} log (μ_{k} + ε) Tr (ρ_{ε} Π_{k}) + log (ε) Tr (ρ_{ε} Π_{0}) .

(42)

If $supp (ρ) \subseteq supp (σ)$, then $Tr (ρ Π_{0}) = 0$, so $Tr (ρ_{ε} Π_{0}) = ε Tr (Π_{0})$ and the last term in Equation (42) equals $ε log (ε) Tr (Π_{0})$, which tends to 0 when $ε \to 0^{+}$; hence, the limit converges to the value given by the support-based definition.

Instead, if $supp (ρ) ⊈ supp (σ)$, then $Tr (ρ_{ε} Π_{0}) \to α : = Tr (ρ Π_{0}) > 0$ when $ε \to 0^{+}$; consequently, the second term in Equation (42) diverges to $- \infty$ and the limit in Equation (40) is $+ \infty$, thus matching the behavior of the support-based definition.
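The two regimes just described can be observed numerically by evaluating the expression inside the limit (40) for decreasing values of $ε$. The following sketch is written in Python with NumPy; the function names, the two pairs of diagonal states, and the chosen values of $ε$ are hypothetical and only illustrate the convergence and the divergence.

```python
import numpy as np

def logm_h(A):
    """Logarithm of a positive definite Hermitian matrix via Equation (2)."""
    w, U = np.linalg.eigh(A)
    return (U * np.log(w)) @ U.conj().T

def regularized_relative_entropy(rho, sigma, eps):
    """Tr(rho_eps log rho_eps - rho_eps log sigma_eps) for a fixed eps > 0."""
    d = rho.shape[0]
    r = rho + eps * np.eye(d)
    s = sigma + eps * np.eye(d)
    return float(np.real(np.trace(r @ logm_h(r) - r @ logm_h(s))))

pure = np.diag([1.0, 0.0])
mixed = np.diag([0.5, 0.5])
eps_values = [1e-3, 1e-6, 1e-9]

# supp(pure) is contained in supp(mixed): the values approach log(2).
convergent = [regularized_relative_entropy(pure, mixed, e) for e in eps_values]

# supp(mixed) is not contained in supp(pure): the values grow like -0.5 * log(eps).
divergent = [regularized_relative_entropy(mixed, pure, e) for e in eps_values]

print(convergent)
print(divergent)
```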

The relative entropy has several important properties (see, e.g., [23]):

Klein’s inequality: $S (ρ ∥ σ) \geq 0$ for all $ρ, σ$, and $S (ρ ∥ σ) = 0$ if and only if $ρ = σ$. This property explains why the relative entropy, despite lacking symmetry in its arguments, is taken to be a measure of the distinguishability of states in quantum theories.
Invariance under unitary conjugation: $S (U ρ U^{†} ∥ U σ U^{†}) = S (ρ ∥ σ)$ for all unitary operators U acting on the same Hilbert space as $ρ$ and $σ$ .
Additivity w.r.t. tensor product: $S (ρ_{1} \otimes ρ_{2} ∥ σ_{1} \otimes σ_{2}) = S (ρ_{1} ∥ σ_{1}) + S (ρ_{2} ∥ σ_{2})$ for all density operators $σ_{j}, ρ_{j}, j = 1, 2$ .

The monotonicity of S under partial trace is represented by the inequality

S ({Tr}_{b} (ρ) ∥ {Tr}_{b} (σ)) \leq S (ρ ∥ σ),

(43)

which, together with Stinespring’s theorem and the three previously mentioned properties of S, allows one to prove that quantum distinguishability does not increase under the action of a generic channel $C$:

S (C (ρ) ∥ C (σ)) \leq S (ρ ∥ σ),

(44)

a formula also known as the data processing inequality (DPI). In fact, with U and Y as in Equation (12),

\begin{aligned} S (C (ρ) ∥ C (σ)) & = S ({Tr}_{b} (U (ρ \otimes Y) U^{†}) ∥ {Tr}_{b} (U (σ \otimes Y) U^{†})) \\ & \leq S (U (ρ \otimes Y) U^{†} ∥ U (σ \otimes Y) U^{†}) \\ & = S (ρ \otimes Y ∥ σ \otimes Y) = S (ρ ∥ σ) + S (Y ∥ Y) \\ & = S (ρ ∥ σ), \end{aligned}

(45)

where the inequality is (43), the next two equalities follow from unitary invariance and additivity, and $S (Y ∥ Y) = 0$ by Klein’s inequality.

Inequality (44) explains why, in the quantum information literature, a channel is often referred to as a coarse-graining procedure, a term borrowed from statistical mechanics.

This terminology reflects the idea that information about different quantum states is lost through the action of the channel, as previously distinguishable states may become indistinguishable after the transformation.

While the first three properties of S mentioned above are relatively straightforward to prove, its monotonicity under partial trace is considerably more subtle. In the next two sections, we provide a detailed analysis of the proofs originally proposed by Petz and by Uhlmann.
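Before turning to the proofs, inequality (43) can at least be observed numerically on randomly generated full-rank states. The sketch below uses Python with NumPy; the sampling scheme (normalized matrices of the form $G G^{†}$ with Gaussian G), the function names, the dimensions, and the number of trials are assumptions made for this illustration. Such an experiment is a sanity check and not a substitute for a proof.

```python
import numpy as np

rng = np.random.default_rng(0)

def rand_state(n):
    """Random density matrix G G^dagger / Tr(G G^dagger), almost surely full rank."""
    G = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    rho = G @ G.conj().T
    return rho / np.trace(rho).real

def logm_h(A):
    """Logarithm of a positive definite Hermitian matrix via Equation (2)."""
    w, U = np.linalg.eigh(A)
    return (U * np.log(w)) @ U.conj().T

def rel_ent(rho, sigma):
    """S(rho || sigma) = Tr(rho log rho - rho log sigma) for positive definite states."""
    return float(np.real(np.trace(rho @ (logm_h(rho) - logm_h(sigma)))))

def partial_trace_b(X_ab, da, db):
    """Tr_b of a (da*db) x (da*db) matrix in Kronecker ordering."""
    return np.trace(X_ab.reshape(da, db, da, db), axis1=1, axis2=3)

da, db = 2, 3
gaps = []
for _ in range(20):
    rho, sigma = rand_state(da * db), rand_state(da * db)
    reduced = rel_ent(partial_trace_b(rho, da, db), partial_trace_b(sigma, da, db))
    gaps.append(rel_ent(rho, sigma) - reduced)   # should be >= 0 by (43)

print(min(gaps))
```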

3. Petz’s Proof of the Monotonicity of the Relative Entropy Under Partial Trace

In this section, we examine the strategy proposed by Petz in [10] for proving the monotonicity of relative entropy under the partial trace operation, which is based on a clever reformulation of the expression of the relative entropy as a suitable inner product inspired by an analogous construction by Araki [2].

We will show that Petz’s proof is flawed due to an incorrect application of the contractive version of Jensen’s operator inequality. We will explain how this issue can be circumvented, thus restoring the validity of Petz’s overall approach. Furthermore, we will show how to extend it to also incorporate non-invertible density operators.

The notation that will be used throughout this section is detailed below:

Given the finite-dimensional Hilbert spaces $(H_{a}, {〈, 〉}_{a})$ and $(H_{b}, {〈, 〉}_{b})$ , we define $H_{a b} : = H_{a} \otimes H_{b}$ , with inner product ${〈, 〉}_{a b}$ induced by those of $H_{a}$ and $H_{b}$ ;
Operators of $B_{H} (H_{a})$ will be denoted as $X, Y, Z$ , and operators of $B_{H} (H_{a b})$ will be denoted by $A, B, C$ ;
$ρ, σ \in B_{H} (H_{a b})$ are two positive definite (invertible) density operators (actually, for the following analysis, only $ρ$ has to be invertible; however, as we noted with the support-based definition of the relative entropy, if $ρ > 0$ , then its support is the entire $H$ , and so, for our analysis to be meaningful, we also have to demand $σ > 0$ ): $ρ, σ > 0$ .

In order to rewrite the relative entropy as an inner product, we must introduce some suitable superoperators.

Precisely, for fixed $B, C \in B_{H} (H_{a b})$, $B > 0$, consider the Hermitian superoperators $R_{B}, L_{C}, Δ_{B, C} \in B_{H} (B_{H} (H_{a b}))$ given by the right and left operator multiplications and the relative modular operator, respectively, i.e.,

R_{B} (A) : = A B, L_{C} (A) : = C A, Δ_{B, C} (A) : = L_{C} \circ {R_{B}}^{- 1} (A) = C A B^{- 1} .

(46)

Clearly, ${R_{B}}^{- 1} = R_{B^{- 1}}$, ${L_{C}}^{- 1} = L_{C^{- 1}}$, and

[L_{C}, R_{B}] = [L_{C}, {R_{B}}^{- 1}] = 0 .

(47)

All these superoperators are Hermitian; in fact,

{〈 R_{B} (A), C 〉}_{a b} = Tr ({(A B)}^{†} C) = Tr (B A C) = Tr (A C B) = {〈 A, R_{B} (C) 〉}_{a b},

(48)

and similarly for $L_{C}$. Regarding $Δ_{B, C}$, using Equation (47), we have

Δ_{B, C}^{†} = {(L_{C} {R_{B}}^{- 1})}^{†} = {({R_{B}}^{- 1})}^{†} L_{C} = {(R_{B}^{†})}^{- 1} L_{C} = R_{B}^{- 1} \circ L_{C} = L_{C} \circ {R_{B}}^{- 1} = Δ_{B, C} .

(49)

If we consider the specific case in which $B = ρ > 0$ and $C = σ > 0$, then, since $L_{σ}$ and $R_{ρ}^{- 1}$ commute by Equation (47), we obtain

\begin{aligned} log (Δ_{ρ, σ}) & = log (L_{σ} \circ R_{ρ}^{- 1}) = log (exp (log (L_{σ})) exp (log (R_{ρ}^{- 1}))) \\ & = log (exp (log (L_{σ}) - log (R_{ρ}))) = log (L_{σ}) - log (R_{ρ}) . \end{aligned}

(50)

Since $log (L_{σ}) = L_{log (σ)}$ and $log (R_{ρ}) = R_{log (ρ)}$, applying formula (50) to $ρ^{1 / 2}$ gives

log (Δ_{ρ, σ}) (ρ^{1 / 2}) = log (σ) ρ^{1 / 2} - ρ^{1 / 2} log (ρ) .

(51)

This last identity, the cyclic property of the trace, and the property $[ρ^{1 / 2}, log (ρ)] = 0$ imply that

\begin{aligned} S (ρ ∥ σ) & = Tr [ρ log (ρ) - ρ log (σ)] = Tr [ρ^{1 / 2} (ρ^{1 / 2} log (ρ) - ρ^{1 / 2} log (σ))] \\ & = Tr [(ρ^{1 / 2} log (ρ) - ρ^{1 / 2} log (σ)) ρ^{1 / 2}] \\ & = Tr [ρ^{1 / 2} log (ρ) ρ^{1 / 2} - ρ^{1 / 2} log (σ) ρ^{1 / 2}] \\ & = Tr [ρ^{1 / 2} ρ^{1 / 2} log (ρ) - ρ^{1 / 2} log (σ) ρ^{1 / 2}] \\ & = - Tr [ρ^{1 / 2} (log (σ) ρ^{1 / 2} - ρ^{1 / 2} log (ρ))] \\ & = - {〈 ρ^{1 / 2}, log (Δ_{ρ, σ}) (ρ^{1 / 2}) 〉}_{a b}, \end{aligned}

(52)

which is the ‘inner product reformulation’ of the relative entropy between the positive definite states $ρ$ and $σ$ that we were searching for.
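The inner product reformulation (52) lends itself to a direct numerical check once the superoperators are represented as matrices. The sketch below (Python with NumPy) assumes the row-major vectorization of operators, under which $A \mapsto C A B$ is represented by the Kronecker product of C with the transpose of B, so that $Δ_{ρ, σ}$ becomes `np.kron(sigma, inv(rho).T)`; this representation, the function names, and the random states are choices made for this illustration and are not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def rand_state(n):
    """Random density matrix, almost surely positive definite."""
    G = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    rho = G @ G.conj().T
    return rho / np.trace(rho).real

def herm_fun(f, A):
    """Functional calculus for a Hermitian matrix, as in Equation (2)."""
    w, U = np.linalg.eigh(A)
    return (U * f(w)) @ U.conj().T

d = 3
rho, sigma = rand_state(d), rand_state(d)

# Left-hand side of (52): the usual trace formula.
S_direct = float(np.real(np.trace(rho @ (herm_fun(np.log, rho) - herm_fun(np.log, sigma)))))

# Relative modular operator A -> sigma A rho^{-1} as a d^2 x d^2 matrix.
# Row-major vec: vec(C A B) = (C kron B^T) vec(A).
Delta = np.kron(sigma, np.linalg.inv(rho).T)
Delta = (Delta + Delta.conj().T) / 2          # remove rounding asymmetry
log_Delta = herm_fun(np.log, Delta)

# Right-hand side of (52): -<rho^{1/2}, log(Delta)(rho^{1/2})> in the
# Hilbert-Schmidt inner product, which is vdot on vectorized operators.
v = herm_fun(np.sqrt, rho).reshape(-1)
S_inner = float(-np.real(np.vdot(v, log_Delta @ v)))

print(S_direct, S_inner)
```

Computing `log_Delta` from the spectral decomposition of `Delta` itself, rather than from $L_{log (σ)} - R_{log (ρ)}$, means that the agreement of the two numbers also tests identity (50).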

We can obtain an analogous formula for the relative entropy of the partial traces of $ρ$ and $σ$. To this end, if we fix $Y, Z \in B_{H} (H_{a})$, $Y > 0$, then we can define the Hermitian superoperators $R_{Y}^{a}, L_{Z}^{a}, Δ_{Y, Z}^{a} \in B_{H} (B_{H} (H_{a}))$ as follows:

R_{Y}^{a} (X) : = X Y, L_{Z}^{a} (X) : = Z X, Δ_{Y, Z}^{a} (X) : = L_{Z}^{a} \circ {(R_{Y}^{a})}^{- 1} (X) = Z X Y^{- 1} .

(53)

By carrying out computations analogous to those in Equation (52) but this time using the superoperators

R_{{Tr}_{b} (ρ)}^{a}, L_{{Tr}_{b} (σ)}^{a}, Δ_{ρ, σ}^{a} : = L_{{Tr}_{b} (σ)}^{a} \circ R_{{Tr}_{b} {(ρ)}^{- 1}}^{a}

, we obtain

S ({Tr}_{b} (ρ) ∥ {Tr}_{b} (σ)) = - {〈 {Tr}_{b} {(ρ)}^{1 / 2}, log (Δ_{ρ, σ}^{a}) ({Tr}_{b} {(ρ)}^{1 / 2}) 〉}_{a} .

(54)

Due to the minus sign in front of the inner products appearing in Equations (52) and (54), we have that the monotonicity of the relative entropy under partial trace is equivalent to

{〈 ρ^{1 / 2}, log (Δ_{ρ, σ}) (ρ^{1 / 2}) 〉}_{a b} \leq {〈 {Tr}_{b} {(ρ)}^{1 / 2}, log (Δ_{ρ, σ}^{a}) ({Tr}_{b} {(ρ)}^{1 / 2}) 〉}_{a} .

(55)

These inner products are defined on two different Hilbert spaces,

B_{H} (H_{a b})

and

B_{H} (H_{a})

; in order to perform a meaningful comparison and prove the inequality, Petz introduced a superoperator

V_{ρ} : B_{H} (H_{a}) \to B_{H} (H_{a b})

through the explicit formula

V_{ρ} (X {Tr}_{b} {(ρ)}^{1 / 2}) : = {Tr}_{b}^{†} (X) ρ^{1 / 2},

(56)

which serves as a bridge between the reduced state

{Tr}_{b} (ρ)

and the full state

ρ

. While, as we are going to show, this definition is computationally effective, it may at first appear somewhat ad hoc. Actually, a seemingly more natural choice for

V_{ρ}

would have been

{Tr}_{b}^{†}

for two reasons: first, as for

V_{ρ}

,

{Tr}_{b}^{†}

is a map between

B_{H} (H_{a})

and

B_{H} (H_{a b})

, and, second, it satisfies the Schwarz inequality (17), which, in the following, will play a crucial role in the proof of inequality (55).

It turns out that

V_{ρ}

is tightly related to the adjoint of the partial trace

{Tr}_{b}

, not w.r.t. the original inner products of

B_{H} (H_{a})

and

B_{H} (H_{a b})

, but w.r.t. suitably weighted inner products that naturally emerge from the previous structural analysis of the relative entropy in terms of superoperators.

In fact, the reformulations of the relative entropy obtained in Equations (52) and (54) involve inner products weighted by

ρ^{1 / 2}

and

{Tr}_{b} {(ρ)}^{1 / 2}

, respectively. This observation suggests that the correct notion of adjoint to consider for

{Tr}_{b}

is the one defined relative to the inner products (the positive definiteness of these inner products is guaranteed by the fact that

ρ, {Tr}_{b} (ρ) > 0

)

{〈 A, B 〉}_{a b, ρ} : = {〈 R_{ρ^{- 1 / 2}} (A), B 〉}_{a b}, {〈 X, Y 〉}_{a, ρ} : = {〈 R_{{Tr}_{b} {(ρ)}^{- 1 / 2}}^{a} (X), Y 〉}_{a} .

(57)

We have

\begin{matrix} {〈 X, {Tr}_{b} (A) 〉}_{a, ρ} & = {〈 R_{{Tr}_{b} {(ρ)}^{- 1 / 2}}^{a} (X), {Tr}_{b} (A) 〉}_{a} = {〈 {Tr}_{b}^{†} \circ R_{{Tr}_{b} {(ρ)}^{- 1 / 2}}^{a} (X), A 〉}_{a b} \\ = {〈 R_{ρ^{- 1 / 2}} \circ R_{ρ^{1 / 2}} \circ {Tr}_{b}^{†} \circ R_{{Tr}_{b} {(ρ)}^{- 1 / 2}}^{a} (X), A 〉}_{a b} \\ = {〈 R_{ρ^{1 / 2}} \circ {Tr}_{b}^{†} \circ R_{{Tr}_{b} {(ρ)}^{- 1 / 2}}^{a} (X), A 〉}_{a b, ρ} . \end{matrix}

(58)

So, the adjoint of

{Tr}_{b}

w.r.t. the weighted inner products defined above is the operator

{Tr}_{b}^{†, ρ} : = R_{ρ^{1 / 2}} \circ {Tr}_{b}^{†} \circ R_{{Tr}_{b} {(ρ)}^{- 1 / 2}}^{a}

, which, for all

X \in B_{H} (H_{a})

, satisfies

(R_{ρ^{1 / 2}} \circ {Tr}_{b}^{†} \circ R_{{Tr}_{b} {(ρ)}^{- 1 / 2}}^{a}) (X {Tr}_{b} {(ρ)}^{1 / 2}) = (R_{ρ^{1 / 2}} \circ {Tr}_{b}^{†}) (X) = {Tr}_{b}^{†} (X) ρ^{1 / 2},

(59)

and, therefore,

V_{ρ} = {Tr}_{b}^{†, ρ}

; i.e.,

V_{ρ} = R_{ρ^{1 / 2}} \circ {Tr}_{b}^{†} \circ R_{{Tr}_{b} {(ρ)}^{- 1 / 2}}^{a} .

(60)

From this fact, we obtain that the explicit action of

V_{ρ}

on any

X \in B_{H} (H_{a})

is

V_{ρ} (X) = (X {Tr}_{b} {(ρ)}^{- 1 / 2} \otimes i d_{b}) ρ^{1 / 2},

(61)

and this implies immediately that

V_{ρ}

transforms

{Tr}_{b} {(ρ)}^{1 / 2}

into

ρ^{1 / 2}

:

V_{ρ} ({Tr}_{b} {(ρ)}^{1 / 2}) = ρ^{1 / 2} .

(62)

Repeatedly using the cyclic property of the trace and Schwarz’s inequality (17) satisfied by

{Tr}_{b}^{†}

, we can prove that

V_{ρ}

is a contraction

\begin{matrix} ∥ V_{ρ} (X {Tr}_{b} {(ρ)}^{1 / 2}) ∥^{2} & = {〈 {Tr}_{b}^{†} (X) ρ^{1 / 2}, {Tr}_{b}^{†} (X) ρ^{1 / 2} 〉}_{a b} = Tr [{({Tr}_{b}^{†} (X) ρ^{1 / 2})}^{†} {Tr}_{b}^{†} (X) ρ^{1 / 2}] \\ = Tr [ρ^{1 / 2} {Tr}_{b}^{†} (X^{†}) {Tr}_{b}^{†} (X) ρ^{1 / 2}] = Tr [{Tr}_{b}^{†} (X^{†}) {Tr}_{b}^{†} (X) ρ] \\ \leq Tr [{Tr}_{b}^{†} (X^{†} X) ρ] = {〈 {Tr}_{b}^{†} (X^{†} X), ρ 〉}_{a b} = {〈 X^{†} X, {Tr}_{b} (ρ) 〉}_{a} \\ = Tr (X^{†} X {Tr}_{b} {(ρ)}^{1 / 2} {Tr}_{b} {(ρ)}^{1 / 2}) = Tr [{Tr}_{b} {(ρ)}^{1 / 2} X^{†} X {Tr}_{b} {(ρ)}^{1 / 2}] \\ = Tr [{(X {Tr}_{b} {(ρ)}^{1 / 2})}^{†} X {Tr}_{b} {(ρ)}^{1 / 2}] = {〈 X {Tr}_{b} {(ρ)}^{1 / 2}, X {Tr}_{b} {(ρ)}^{1 / 2} 〉}_{a} \\ = ∥ X {Tr}_{b} {(ρ)}^{1 / 2} ∥^{2} . \end{matrix}

(63)

Note now that

\begin{matrix} {〈 Δ_{ρ, σ}^{a} (X) {Tr}_{b} {(ρ)}^{1 / 2}, X {Tr}_{b} {(ρ)}^{1 / 2} 〉}_{a} & = {〈 {Tr}_{b} (σ) X {Tr}_{b} {(ρ)}^{- 1 / 2}, X {Tr}_{b} {(ρ)}^{1 / 2} 〉}_{a} \\ = Tr [{({Tr}_{b} (σ) X {Tr}_{b} {(ρ)}^{- 1 / 2})}^{†} X {Tr}_{b} {(ρ)}^{1 / 2}] \\ = Tr [{Tr}_{b} {(ρ)}^{- 1 / 2} X^{†} {Tr}_{b} (σ) X {Tr}_{b} {(ρ)}^{1 / 2}] \\ = Tr [X X^{†} {Tr}_{b} (σ)] . \end{matrix}

(64)

Using the equality just proven, we can show that

V_{ρ}^{†} Δ_{ρ, σ} V_{ρ} \leq Δ_{ρ, σ}^{a}

; in fact,

\begin{matrix} {〈 V_{ρ}^{†} Δ_{ρ, σ} V_{ρ} (X {Tr}_{b} {(ρ)}^{1 / 2}), X {Tr}_{b} {(ρ)}^{1 / 2} 〉}_{a} & = {〈 Δ_{ρ, σ} V_{ρ} (X {Tr}_{b} {(ρ)}^{1 / 2}), V_{ρ} (X {Tr}_{b} {(ρ)}^{1 / 2}) 〉}_{a b} \\ = {〈 Δ_{ρ, σ} ({Tr}_{b}^{†} (X) ρ^{1 / 2}), {Tr}_{b}^{†} (X) ρ^{1 / 2} 〉}_{a b} \\ = {〈 σ {Tr}_{b}^{†} (X) ρ^{- 1 / 2}, {Tr}_{b}^{†} (X) ρ^{1 / 2} 〉}_{a b} \\ = Tr [ρ^{- 1 / 2} {Tr}_{b}^{†} (X^{†}) σ {Tr}_{b}^{†} (X) ρ^{1 / 2}] \\ = Tr [{Tr}_{b}^{†} (X) {Tr}_{b}^{†} (X^{†}) σ] \\ \leq Tr [{Tr}_{b}^{†} (X X^{†}) σ] = {〈 {Tr}_{b}^{†} (X X^{†}), σ 〉}_{a b} \\ = {〈 X X^{†}, {Tr}_{b} (σ) 〉}_{a} = Tr [X X^{†} {Tr}_{b} (σ)] \\ = {〈 Δ_{ρ, σ}^{a} (X) {Tr}_{b} {(ρ)}^{1 / 2}, X {Tr}_{b} {(ρ)}^{1 / 2} 〉}_{a}, \end{matrix}

(65)

where we again used Schwarz’s inequality, and, to write the last equality, we applied Equation (64).

Since

log (x)

is an operator monotone function, the inequality

V_{ρ}^{†} Δ_{ρ, σ} V_{ρ} \leq Δ_{ρ, σ}^{a}

implies

log (V_{ρ}^{†} Δ_{ρ, σ} V_{ρ}) \leq log (Δ_{ρ, σ}^{a}),

(66)

and so

\begin{matrix} S ({Tr}_{b} (ρ) ∥ {Tr}_{b} (σ)) & = {〈 {Tr}_{b} {(ρ)}^{1 / 2}, - log (Δ_{ρ, σ}^{a}) ({Tr}_{b} {(ρ)}^{1 / 2}) 〉}_{a} \\ \leq {〈 {Tr}_{b} {(ρ)}^{1 / 2}, - log (V_{ρ}^{†} Δ_{ρ, σ} V_{ρ}) ({Tr}_{b} {(ρ)}^{1 / 2}) 〉}_{a} . \end{matrix}

(67)

Petz’s strategy to make the relative entropy

S (ρ ∥ σ)

appear on the right-hand side of the previous inequality consists of using the fact that

V_{ρ}

is a contraction to apply the contractive Jensen operator inequality (9) with

f (x) = - log (x)

. In this way, due to (62), we would get

\begin{matrix} S ({Tr}_{b} (ρ) ∥ {Tr}_{b} (σ)) & \leq {〈 {Tr}_{b} {(ρ)}^{1 / 2}, - log (V_{ρ}^{†} Δ_{ρ, σ} V_{ρ}) ({Tr}_{b} {(ρ)}^{1 / 2}) 〉}_{a} \\ \leq {〈 {Tr}_{b} {(ρ)}^{1 / 2}, V_{ρ}^{†} (- log (Δ_{ρ, σ})) V_{ρ} ({Tr}_{b} {(ρ)}^{1 / 2}) 〉}_{a} \\ = {〈 V_{ρ} {Tr}_{b} {(ρ)}^{1 / 2}, - log (Δ_{ρ, σ}) V_{ρ} ({Tr}_{b} {(ρ)}^{1 / 2}) 〉}_{a b} \\ = {〈 ρ^{1 / 2}, - log (Δ_{ρ, σ}) (ρ^{1 / 2}) 〉}_{a b} \\ = S (ρ ∥ σ), \end{matrix}

(68)

and so the proof of the monotonicity of S w.r.t. the partial trace

{Tr}_{b}

would be achieved.

For this argument to be valid,

- log (x)

is required to be operator convex, which is true, to be well-defined at

x = 0

and to satisfy

- log (0) \leq 0

.

Petz circumvented the lack of definition of

- log (x)

in

x = 0

by using the following integral identity:

\begin{matrix} \int_{0}^{+ \infty} (\frac{1}{x + ξ} - \frac{1}{1 + ξ}) d ξ & = lim_{M \to + \infty} {(log (x + ξ) - log (1 + ξ))|}_{0}^{M} \\ = lim_{M \to + \infty} log (\frac{x + M}{1 + M}) - log (x) \\ = - log (x) . \end{matrix}

(69)
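The integral identity (69) is easy to verify numerically for a few positive values of x. The sketch below is only an illustration under stated assumptions: the function names, the cutoff, and the quadrature rule (a midpoint rule after the substitution ξ = u / (1 - u), which maps the half-line onto the unit interval) are hypothetical choices and do not come from the original text.

```python
import math

def neg_log_via_cutoff(x, cutoff):
    """Integral of 1/(x+xi) - 1/(1+xi) over [0, cutoff], from the antiderivative in (69)."""
    return math.log((x + cutoff) / (1.0 + cutoff)) - math.log(x)

def neg_log_via_quadrature(x, n=20000):
    """Midpoint rule for the same integral over [0, +inf), with xi = u / (1 - u)."""
    total = 0.0
    for k in range(n):
        u = (k + 0.5) / n
        xi = u / (1.0 - u)
        jacobian = 1.0 / (1.0 - u) ** 2      # d(xi)/du
        total += (1.0 / (x + xi) - 1.0 / (1.0 + xi)) * jacobian
    return total / n

for x in (0.1, 0.5, 2.0, 7.0):
    print(x, neg_log_via_quadrature(x), neg_log_via_cutoff(x, 1e12), -math.log(x))
```

For each x, the three printed values should agree up to a small numerical error.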

Since the integral over

[0, + \infty)

coincides with that over

(0, + \infty)

, we may restrict our attention to strictly positive values of

ξ

. If we denote by

i d_{a b}

the identity operator on

B_{H} (H_{a b})

, we have

Δ_{ρ, σ, ξ} \equiv {(Δ_{ρ, σ} + ξ i d_{a b})}^{- 1} - {(i d_{a b} + ξ i d_{a b})}^{- 1} = {(Δ_{ρ, σ} + ξ i d_{a b})}^{- 1} - i d_{a b} {(1 + ξ)}^{- 1},

(70)

moreover,

{〈 ρ^{1 / 2}, - i d_{a b} {(1 + ξ)}^{- 1} ρ^{1 / 2} 〉}_{a b} = - {(1 + ξ)}^{- 1} Tr (ρ) = - {(1 + ξ)}^{- 1},

(71)

thus,

{〈 ρ^{1 / 2}, Δ_{ρ, σ, ξ} ρ^{1 / 2} 〉}_{a b} = {〈 ρ^{1 / 2}, {(Δ_{ρ, σ} + ξ i d_{a b})}^{- 1} ρ^{1 / 2} 〉}_{a b} - {(1 + ξ)}^{- 1} .

(72)

It follows that

S (ρ ∥ σ) = \int_{0}^{\infty} ({〈 ρ^{1 / 2}, {(Δ_{ρ, σ} + ξ i d_{a b})}^{- 1} ρ^{1 / 2} 〉}_{a b} - {(1 + ξ)}^{- 1}) d ξ,

(73)

and, analogously,

S ({Tr}_{b} (ρ) ∥ {Tr}_{b} (σ)) = \int_{0}^{\infty} ({〈 {Tr}_{b} {(ρ)}^{1 / 2}, {(Δ_{ρ, σ}^{a} + ξ i d_{a})}^{- 1} {Tr}_{b} {(ρ)}^{1 / 2} 〉}_{a} - {(1 + ξ)}^{- 1}) d ξ .

(74)

For all

ξ \in (0, + \infty)

, the function

g_{ξ}

given by

x \mapsto {(x + ξ)}^{- 1} - {(1 + ξ)}^{- 1}

is well-defined for

x = 0

, and it is operator convex and operator monotone decreasing; see [16], Chapter V.1. Thanks to this last property, applied to the operator inequality established in (65), we have

{(Δ_{ρ, σ}^{a} + ξ)}^{- 1} \leq {(V_{ρ}^{†} Δ_{ρ, σ} V_{ρ} + ξ)}^{- 1} .

(75)

However,

g_{ξ} (0) = ξ^{- 1} - {(1 + ξ)}^{- 1} > 0

for all

ξ \in (0, + \infty)

, so the contractive Jensen operator inequality cannot be used to write

{(V_{ρ}^{†} Δ_{ρ, σ} V_{ρ} + ξ)}^{- 1} \leq V_{ρ}^{†} {(Δ_{ρ, σ} + ξ)}^{- 1} V_{ρ},

(76)

which would lead to the proof of the monotonicity of the relative entropy with computations analogous to those shown in Formula (68).

A simple counterexample illustrating the failure of inequality (76) arises in the scalar case, where a contraction reduces to a multiplication by a real coefficient

α \in (0, 1]

and satisfies

α^{†} = α

. The inequality

{(α x α + ξ)}^{- 1} \leq α {(x + ξ)}^{- 1} α,

(77)

is false for every coefficient α strictly smaller than 1, for all non-negative x, and for all

ξ \in (0, + \infty)

, as shown in Figure 1.

Note that the inequality

- log (V^{†} X V) \leq - V^{†} log (X) V

is also false for a generic contraction V, as we show in Figure 2 with a counterexample in the scalar case.
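Both scalar counterexamples can be reproduced with a few lines of code. In the scalar case, inequality (77) is equivalent to x + ξ ≤ α² (α² x + ξ), which cannot hold when α < 1, ξ > 0, and x ≥ 0; the logarithmic inequality, in turn, fails for sufficiently small x. The sketch below is a hedged illustration: the function names and the sampled values of α, x, and ξ are arbitrary assumptions and are not taken from Figures 1 and 2.

```python
import math

def holds_77(alpha, x, xi):
    """True if the scalar inequality (77) holds: (a x a + xi)^-1 <= a (x + xi)^-1 a."""
    return 1.0 / (alpha * x * alpha + xi) <= alpha * (1.0 / (x + xi)) * alpha

def holds_log(alpha, x):
    """True if -log(a x a) <= -a log(x) a holds for the scalars a = alpha and x > 0."""
    return -math.log(alpha * x * alpha) <= -alpha * math.log(x) * alpha

# Sample grid with alpha < 1, x >= 0 and xi > 0 (illustrative values only).
grid_77 = [(a, x, xi)
           for a in (0.2, 0.5, 0.9)
           for x in (0.0, 0.5, 1.0, 10.0)
           for xi in (0.01, 1.0, 100.0)]

any_77 = any(holds_77(*point) for point in grid_77)
print(any_77)                                        # False: (77) fails on the whole grid
print(holds_log(0.5, 0.01), holds_log(0.5, 100.0))   # False True
```

The last line shows that the logarithmic inequality is violated for small x even though it may hold for large x, so it cannot be valid for a generic contraction.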

To summarize, showing that

V_{ρ}

is a contraction does not permit the use of the contractive version of Jensen’s operator inequality to prove the monotonicity of the relative entropy.

3.1. Correction of Petz’s Strategy to Prove the Relative Entropy Monotonicity

There is a simple correction of Petz’s strategy that restores the validity of his approach to prove the monotonicity of the relative entropy in a rigorous way. The correction was provided by Petz himself, together with Nielsen, in [24]. The same line of reasoning can also be found in [25,26].

The key idea of the correction lies in recognizing that the operator

V_{ρ}

is not merely a contraction but an isometry; i.e.,

V_{ρ}^{†} V_{ρ}

is the identity operator on

B_{H} (H_{a})

. In fact, using Equations (11) and (56), we have

\begin{matrix} {〈 Y {Tr}_{b} {(ρ)}^{1 / 2}, V_{ρ}^{†} V_{ρ} (X {Tr}_{b} {(ρ)}^{1 / 2}) 〉}_{a} & = {〈 V_{ρ} (Y {Tr}_{b} {(ρ)}^{1 / 2}), V_{ρ} (X {Tr}_{b} {(ρ)}^{1 / 2}) 〉}_{a b} \\ = {〈 {Tr}_{b}^{†} (Y) ρ^{1 / 2}, {Tr}_{b}^{†} (X) ρ^{1 / 2} 〉}_{a b} \\ = Tr [(Y^{†} \otimes i d_{b}) (X \otimes i d_{b}) ρ] \\ = Tr [(Y^{†} X \otimes i d_{b}) ρ] = Tr [Y^{†} X {Tr}_{b} (ρ)] \\ = Tr [{Tr}_{b} {(ρ)}^{1 / 2} Y^{†} X {Tr}_{b} {(ρ)}^{1 / 2}] \\ = Tr [{(Y {Tr}_{b} {(ρ)}^{1 / 2})}^{†} X {Tr}_{b} {(ρ)}^{1 / 2}] \\ = {〈 Y {Tr}_{b} {(ρ)}^{1 / 2}, X {Tr}_{b} {(ρ)}^{1 / 2} 〉}_{a} . \end{matrix}

(78)

This allows us to apply point

(i i i)

of Theorem 1 with

H = B_{H} (H_{a b})

,

K = B_{H} (H_{a})

and

X = Δ_{ρ, σ}

, which ensures that the inequality

- log (V_{ρ}^{†} Δ_{ρ, σ} V_{ρ}) \leq - V_{ρ}^{†} log (Δ_{ρ, σ}) V_{ρ},

(79)

holds true, confirming the validity of the computations in (68), thereby establishing the monotonicity of the relative entropy.
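The isometry property (78) and the identity (62) can be checked numerically with the explicit formula (61). The following sketch assumes NumPy, a bipartite space with dimensions 2 and 3, and the convention that the a-factor is the left factor of the Kronecker product; all function names (`herm_fun`, `partial_trace_b`, `hs`, `v_rho`) are hypothetical helpers introduced only for this illustration.

```python
import numpy as np

da, db = 2, 3                      # illustrative dimensions of H_a and H_b
rng = np.random.default_rng(1)

def herm_fun(m, f):
    """Apply the scalar function f to a Hermitian matrix via its spectrum."""
    w, u = np.linalg.eigh(m)
    return (u * f(w)) @ u.conj().T

def partial_trace_b(m):
    """Partial trace over H_b, with the a-index as the left tensor factor."""
    return np.einsum('ibjb->ij', m.reshape(da, db, da, db))

def hs(x, y):
    """Hilbert-Schmidt inner product Tr(X^dagger Y)."""
    return np.trace(x.conj().T @ y)

def random_matrix(n):
    return rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))

g = random_matrix(da * db)
rho = g @ g.conj().T + 0.1 * np.eye(da * db)   # positive definite
rho = rho / np.trace(rho).real                 # unit trace
tau = partial_trace_b(rho)                     # reduced state Tr_b(rho)

sqrt_rho = herm_fun(rho, np.sqrt)
sqrt_tau = herm_fun(tau, np.sqrt)
inv_sqrt_tau = herm_fun(tau, lambda w: w ** -0.5)

def v_rho(x):
    """Formula (61): V_rho(X) = (X Tr_b(rho)^{-1/2} (x) id_b) rho^{1/2}."""
    return np.kron(x @ inv_sqrt_tau, np.eye(db)) @ sqrt_rho

x, y = random_matrix(da), random_matrix(da)
lhs = hs(v_rho(y @ sqrt_tau), v_rho(x @ sqrt_tau))   # inner product in B(H_ab)
rhs = hs(y @ sqrt_tau, x @ sqrt_tau)                 # inner product in B(H_a)
print(abs(lhs - rhs))
print(np.allclose(v_rho(sqrt_tau), sqrt_rho))        # identity (62)
```

The first printed value should be of the order of the machine precision, which is consistent with the fact that the inner product is preserved and not merely contracted.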

3.2. Extension of Petz’s Proof to Non-Invertible Density Operators

The corrected version of Petz’s strategy for proving the monotonicity of relative entropy can be extended to include non-invertible density operators. To this end, the interplay between the support-based and the regularized definition of relative entropy given in (27) and (40), respectively, will prove to be useful.

First of all, note that, if

supp (ρ) ⊈ supp (σ)

, then

S (ρ ∥ σ) = + \infty

, and the monotonicity statement is trivially true.

We therefore restrict our attention to the case

supp (ρ) \subseteq supp (σ)

, which ensures that the relative entropy is finite. Within this setting, we assume that

ρ

is non-invertible, while

σ

may or may not be invertible. This covers all cases not addressed by Petz’s original analysis in [10].

To establish the monotonicity of relative entropy in this broader context, we must first ensure that

S ({Tr}_{b} (ρ) ∥ {Tr}_{b} (σ)) < + \infty

, i.e., that

supp ({Tr}_{b} (ρ)) \subseteq supp ({Tr}_{b} (σ))

. Observe that it is not necessary to check this condition when

ρ, σ > 0

, since, in that case,

{Tr}_{b} (ρ)

and

{Tr}_{b} (σ)

are also positive definite, and, thus, their supports coincide with

H_{a}

.

We need a preliminary lemma regarding the kernel of positive semi-definite operators.

Lemma 1.

Let

T_{1}

and

T_{2}

be Hermitian positive semi-definite operators on a finite-dimensional Hilbert space

H

. Then,

ker (T_{1} + T_{2}) = ker (T_{1}) \cap ker (T_{2}) .

(80)

Proof.

Proving the inclusion

ker (T_{1}) \cap ker (T_{2}) \subseteq ker (T_{1} + T_{2})

is trivial: if we have

x \in ker (T_{1}) \cap ker (T_{2})

, then

0_{H} = T_{1} x + T_{2} x = (T_{1} + T_{2}) x

, so

x \in ker (T_{1} + T_{2})

.

To show the inclusion

ker (T_{1} + T_{2}) \subseteq ker (T_{1}) \cap ker (T_{2})

, we first note that, thanks to the positive semi-definiteness of

T_{1}

and

T_{2}

, for all

x \in H

, we have

〈 (T_{1} + T_{2}) x, x 〉 = 〈 T_{1} x, x 〉 + 〈 T_{2} x, x 〉 \geq 0,

(81)

with equality to 0 if and only if

〈 T_{1} x, x 〉 = 〈 T_{2} x, x 〉 = 0

, i.e., if and only if

T_{1} x = T_{2} x = 0

because the eigenvalues of both

T_{1}

and

T_{2}

are non-negative.

Hence, if

x \in ker (T_{1} + T_{2})

, then

〈 (T_{1} + T_{2}) x, x 〉 = 0

, so, due to the previous considerations,

T_{1} x = T_{2} x = 0

, and, therefore,

x \in ker (T_{1}) \cap ker (T_{2})

. □
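Lemma 1 can be illustrated numerically by comparing kernel dimensions. The sketch below assumes NumPy; the construction of the two positive semi-definite matrices, the tolerance, and the helper name `kernel_dim` are illustrative assumptions. The final example suggests why positivity cannot be dropped: for two indefinite Hermitian matrices, the kernel of the sum can be strictly larger than the intersection of the kernels.

```python
import numpy as np

def kernel_dim(m, tol=1e-10):
    """Dimension of the kernel of m, computed as (number of columns) - rank."""
    return m.shape[1] - np.linalg.matrix_rank(m, tol=tol)

rng = np.random.default_rng(2)
n = 5

# Positive semi-definite matrices whose ranges lie in the first three coordinates.
w1 = np.zeros((n, 2))
w1[:3] = rng.normal(size=(3, 2))
w2 = np.zeros((n, 1))
w2[:3] = rng.normal(size=(3, 1))
t1, t2 = w1 @ w1.T, w2 @ w2.T

# x lies in ker(T1) and ker(T2) if and only if the stacked matrix [T1; T2] annihilates x.
dim_of_sum = kernel_dim(t1 + t2)
dim_of_intersection = kernel_dim(np.vstack([t1, t2]))
print(dim_of_sum, dim_of_intersection)

# Without positivity the statement fails: S1 + S2 = 0, but ker(S1) and ker(S2) are trivial.
s1, s2 = np.diag([1.0, -1.0]), np.diag([-1.0, 1.0])
print(kernel_dim(s1 + s2), kernel_dim(np.vstack([s1, s2])))
```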

Proposition 1.

For all density operators

ρ, σ

such that

supp (ρ) \subseteq supp (σ)

, it holds that

supp ({Tr}_{b} (ρ)) \subseteq supp ({Tr}_{b} (σ)) .

(82)

Proof.

Using the notation of Section 2, consider again the spectral decompositions of

ρ

and

σ

:

ρ = \sum_{j \in I_{+} (ρ)} λ_{j} P_{j} + 0 P_{0}, σ = \sum_{k \in I_{+} (σ)} μ_{k} Π_{k} + 0 Π_{0},

(83)

where

P_{j}

and

Π_{k}

are the orthogonal projectors onto the eigenspaces corresponding to the positive eigenvalues

λ_{j}

of

ρ

and

μ_{k}

of

σ

, respectively. Then, the following operator sums yield the orthogonal projectors onto

supp (ρ)

and

supp (σ)

, respectively:

P = \sum_{j \in I_{+} (ρ)} P_{j}, Π = \sum_{k \in I_{+} (σ)} Π_{k} .

(84)

Using the linearity and positivity of the partial trace and the fact that multiplying a positive semi-definite operator by a strictly positive scalar does not change its support, we have

\begin{matrix} supp ({Tr}_{b} (ρ)) & = supp (\sum_{j \in I_{+} (ρ)} λ_{j} {Tr}_{b} (P_{j})) = supp (\sum_{j \in I_{+} (ρ)} {Tr}_{b} (P_{j})) \\ = supp ({Tr}_{b} (\sum_{j \in I_{+} (ρ)} P_{j})) = supp ({Tr}_{b} (P)) . \end{matrix}

(85)

The same argument applies to

σ

and

Π

so that

supp ({Tr}_{b} (σ)) = supp ({Tr}_{b} (Π)) .

(86)

Moreover, according to a standard result about orthogonal projectors, the inclusion

supp (ρ) \subseteq supp (σ)

is equivalent to the fact that the operator

Q : = Π - P

is an orthogonal projector on

supp (σ) \cap supp {(ρ)}^{⊥} = supp (σ) \cap ker (ρ)

. By the linearity of the partial trace, we can write

{Tr}_{b} (Π) = {Tr}_{b} (P) + {Tr}_{b} (Q),

(87)

and, moreover, using the fact that

{Tr}_{b}

is a positive map and Lemma 1, we have

\begin{matrix} supp ({Tr}_{b} (Π)) & = supp ({Tr}_{b} (P) + {Tr}_{b} (Q)) \\ = {[ker ({Tr}_{b} (P) + {Tr}_{b} (Q))]}^{⊥} \\ = {[ker ({Tr}_{b} (P)) \cap ker ({Tr}_{b} (Q))]}^{⊥} \\ = span (ker {({Tr}_{b} (P))}^{⊥} \cup ker {({Tr}_{b} (Q))}^{⊥}) \\ = supp ({Tr}_{b} (P)) + supp ({Tr}_{b} (Q)) \supseteq supp ({Tr}_{b} (P)), \end{matrix}

(88)

having used a standard property of the orthogonal complement of the intersection of two vector subspaces. Therefore,

supp ({Tr}_{b} (ρ)) = supp ({Tr}_{b} (P)) \subseteq supp ({Tr}_{b} (Π)) = supp ({Tr}_{b} (σ)),

(89)

as claimed. □

While the support-based definition of relative entropy is useful to prove Equation (82), in order to prove the monotonicity of relative entropy for a non-invertible density operator

ρ

, the regularized definition (40) turns out to be more adequate.

Since both

ρ_{ε} > 0

and

σ_{ε} > 0

for any arbitrarily small

ε > 0

, the modular operator

Δ_{ρ_{ε}, σ_{ε}}

is well-defined. Similarly, the modular operator

Δ_{ρ_{ε}, σ_{ε}}^{a}

is also well-defined.

By applying the same steps used in the corrected version of Petz’s proof and using Equation (52), we obtain that, for any

ε > 0

,

\begin{matrix} Tr [{Tr}_{b} (ρ_{ε}) log {Tr}_{b} (ρ_{ε}) - {Tr}_{b} (ρ_{ε}) log {Tr}_{b} (σ_{ε})] & = {〈 {Tr}_{b} {(ρ_{ε})}^{1 / 2}, - log (Δ_{ρ_{ε}, σ_{ε}}^{a}) {Tr}_{b} {(ρ_{ε})}^{1 / 2} 〉}_{a} \\ \leq {〈 ρ_{ε}^{1 / 2}, - log (Δ_{ρ_{ε}, σ_{ε}}) ρ_{ε}^{1 / 2} 〉}_{a b} \\ = Tr [ρ_{ε} log ρ_{ε} - ρ_{ε} log σ_{ε}] . \end{matrix}

(90)

Notice that we can deal with

ρ_{ε}, σ_{ε}

instead of

ρ, σ

even if they do not have unit trace because this property is used in Petz’s original (and flawed) argument only in Equation (71), which does not play any role in the corrected version outlined in Section 3.1.

By continuity, taking the limit

ε \to 0^{+}

on both sides of the previous inequality leads to

\begin{matrix} S ({Tr}_{b} (ρ) ∥ {Tr}_{b} (σ)) & = lim_{ε \to 0^{+}} Tr [{Tr}_{b} (ρ_{ε}) log {Tr}_{b} (ρ_{ε}) - {Tr}_{b} (ρ_{ε}) log {Tr}_{b} (σ_{ε})] \\ \leq lim_{ε \to 0^{+}} Tr [ρ_{ε} log ρ_{ε} - ρ_{ε} log σ_{ε}] = S (ρ ∥ σ), \end{matrix}

(91)

which completes the proof for non-invertible

ρ

and general

σ

.

As a final remark, we note that, in [10], Petz claimed that his proof applies not only to quantum channels but also to adjoints of unital Schwarz maps. However, that claim relies on the flawed argument that we have pointed out. The validity of Petz’s proof can be restored by proving that

V_{ρ}

is an isometry, a property that holds if we are dealing with the partial trace but not with the adjoint of a general unital Schwarz map.

4. Uhlmann’s Proof of the Monotonicity of the Relative Entropy Under Partial Trace

The proof offered by Uhlmann in [7] (see also [27] for a review) was more general than that offered by Petz because it also naturally encompassed the case of non-invertible density operators. However, thanks to the correction and generalization of Petz’s proof outlined in the previous section, we can now state that the two procedures have the same generality.

Uhlmann’s proof is based on the concept of interpolations of positive sesquilinear forms, which we recall in the following subsection. Consistently with the analysis developed so far, we will consider only the case of finite-dimensional vector spaces.

4.1. Interpolations of Positive Sesquilinear Forms

Let V be a vector space of finite dimension d over the field

F = R

or

C

, and let

F (V)

be the set of sesquilinear forms over V, assumed to be linear in the second variable and conjugate-linear in the first. The results of this subsection also encompass the case of bilinear forms.

We say that

α \in F (V)

is positive if

α (v, v) \geq 0

\forall v \in V

, and we denote the space of positive sesquilinear forms over V by

F_{+} (V)

.

We can endow

F_{+} (V)

with a Löwner-like partial ordering: given

α, β \in F_{+} (V)

, we say that

β \leq α

if

α - β \geq 0

.

Fixing a basis of V, for any form

α \in F (V)

, there exists a unique Hermitian operator

T \in {End}_{F} (V)

such that, for all

v, w \in V

, written in coordinates as

x, y \in F^{d}

, one has

α (v, w) = x^{†} T y .

(92)

It can be easily proven that, for a positive form

α \in F_{+} (V)

, the kernel of

α

coincides with the isotropic cone

ker (α) = \{v \in V : α (v, v) = 0\},

(93)

and they are both equal to the kernel of the positive semi-definite operator T.

Let

H

now be a Hilbert space with inner product

{〈, 〉}_{H}

,

h : V \to H

be a linear surjective map, and

A \in B_{H} (H)

. Then,

α \in F_{+} (V)

is said to be represented by

(H, h, A)

, indicated with

α \sim (H, h, A)

, if

α (v, w) = {〈 h (v), A h (w) 〉}_{H}, \forall v, w \in V .

(94)

Two representations of positive sesquilinear forms

α \sim (H, h, A)

and

β \sim (H, h, B)

are said to be compatible if

[A, B] = 0

. As we shall see shortly, compatibility is a key concept in constructing a functional calculus for sesquilinear forms.

The following theorem shows that compatible representations exist, and its constructive proof provides a (non-unique) way to build them.

Theorem 2.

Let

α, β \in F_{+} (V)

. Then, there exist representations

α \sim (H, h, A)

and

β \sim (H, h, B)

(with the same mapping h) such that

[A, B] = 0

.

Proof.

Let

N \subset V

be the kernel of the form

α + β

N = \{v \in V : α (v, v) + β (v, v) = 0\} .

(95)

By fixing a basis of V, we can associate

α

and

β

with two Hermitian and positive semi-definite operators

T_{1}, T_{2} \in {End}_{F} (V)

via Equation (92). It follows that

N = ker (α + β) = ker (T_{1} + T_{2}) = ker (T_{1}) \cap ker (T_{2}),

(96)

where the last equality is provided by Lemma 1.

Now, by setting

H : = V / N

, we can define the following inner product:

\begin{matrix} 〈, 〉 : & H \times H & ⟶ & F \\ & (v + N, w + N) & ⟼ & 〈 v + N, w + N 〉 = α (v, w) + β (v, w) . \end{matrix}

(97)

Note that this inner product is well-defined thanks to the equalities (96).

By taking h as the (surjective) quotient map

h : V \to H

,

v \mapsto h (v) = v + N

, we can define the following positive sesquilinear forms on

H

:

\tilde{α} (h (v), h (w)) = \tilde{α} (v + N, w + N) : = α (v, w),

(98)

\tilde{β} (h (v), h (w)) = \tilde{β} (v + N, w + N) : = β (v, w),

(99)

for all

v, w \in V

.

Thanks to the Riesz representation theorem, there exists a unique pair of positive Hermitian operators

A, B \in B_{H} (H)

such that

\tilde{α} (h (v), h (w)) = 〈 h (v), A h (w) 〉 = α (v, w),

(100)

\tilde{β} (h (v), h (w)) = 〈 h (v), B h (w) 〉 = β (v, w) .

(101)

It follows that, for all

v, w \in V

,

\begin{matrix} 〈 h (v), h (w) 〉 & = α (v, w) + β (v, w) = \tilde{α} (h (v), h (w)) + \tilde{β} (h (v), h (w)) \\ = 〈 h (v), (A + B) h (w) 〉, \end{matrix}

(102)

and, hence,

A + B = i d_{H}

, which implies that A and B commute. □
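The construction used in the proof of Theorem 2 can be made concrete in coordinates. The sketch below assumes NumPy and considers, for simplicity, the case in which the sum of the two matrices is invertible, so that N is trivial and the quotient space coincides with V; the matrix names (`t1`, `t2`, `gram`, `a`, `b`) and the helper `random_psd` are hypothetical and serve only this illustration.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 4

def random_psd(rank):
    """Random Hermitian positive semi-definite matrix of the given rank."""
    w = rng.normal(size=(n, rank)) + 1j * rng.normal(size=(n, rank))
    return w @ w.conj().T

# Matrices of alpha and beta in a fixed basis, as in (92). Each is singular,
# but the ranks are chosen so that t1 + t2 is (generically) invertible.
t1, t2 = random_psd(2), random_psd(3)
gram = t1 + t2                     # Gram matrix of the inner product (97)

# <x, A y> = x^dagger gram A y must equal alpha(x, y) = x^dagger t1 y,
# hence A = gram^{-1} t1, and similarly B = gram^{-1} t2.
a = np.linalg.solve(gram, t1)
b = np.linalg.solve(gram, t2)

print(np.allclose(a + b, np.eye(n)))   # A + B is the identity
print(np.allclose(a @ b, b @ a))       # hence A and B commute
print(np.allclose(gram @ a, t1))       # A represents alpha w.r.t. the new inner product
```

Note that, in these coordinates, A and B are Hermitian with respect to the inner product induced by `gram`, not with respect to the standard one.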

Consider now

R_{+}^{2} = \{(x, y) \in R^{2} : x \geq 0, y \geq 0\}

, and let J be the set of homogeneous (of degree 1), measurable, and locally bounded functions

f : R_{+}^{2} \to R

. It is possible to develop the concept of function of positive sesquilinear forms thanks to the following theorem, whose rather lengthy proof can be found in [28].

Theorem 3.

Let V be a vector space,

α, β \in F_{+} (V)

, and let

α \sim (H, h, A)

and

β \sim (H, h, B)

be two compatible representations of α and β. Then, for any

f \in J

, the function

γ : V \times V \to F

,

γ (v, w) : = 〈 h (v), f (A, B) h (w) 〉, v, w \in V,

(103)

is a well-defined sesquilinear form on V; i.e., γ is independent of the choice of representations, and, for any given f, γ depends only on

α, β \in F_{+} (V)

.

Note that, on the right-hand side of (103),

f (A, B)

is intended as an operator function, which is well-defined since A and B commute, and so they can be simultaneously diagonalized.

Combining Theorems 2 and 3, every

f \in J

can be extended to a function of positive sesquilinear forms. Keeping the same symbol for simplicity, we can define

f : F_{+} (V) \times F_{+} (V) ⟶ F (V),

(104)

as follows: given

α, β \in F_{+} (V)

and any compatible representations

α \sim (H, h, A)

and

β \sim (H, h, B)

, the sesquilinear form

f (α, β)

is represented as

f (α, β) \sim (H, h, f (A, B)) .

(105)

The elements of J used to define the concept of the interpolation of positive sesquilinear forms are the positive functions

f^{t} (x, y) = x^{1 - t} y^{t}

, where

t \in [0, 1]

. Given

α, β \in F_{+} (V)

, we call interpolation from α to β the positive sesquilinear form

\begin{matrix} γ_{α \to β}^{t} : = f^{t} (α, β) : & V \times V & ⟶ & F \\ & (v, w) & ⟼ & f^{t} (α, β) (v, w), \end{matrix}

(106)

which means that, for any compatible representation

α \sim (H, h, A)

and

β \sim (H, h, B)

,

\begin{matrix} γ_{α \to β}^{t} (v, w) & = 〈 h (v), A^{1 - t} B^{t} h (w) 〉, \forall v, w \in V . \end{matrix}

(107)

The interpolation is said to go from

α

to

β

because, clearly,

γ_{α \to β}^{0} = α

and

γ_{α \to β}^{1} = β

. Moreover, the interpolation of two interpolations is another interpolation. In fact, given

t_{1}, t_{2} \in [0, 1]

, we have

\begin{matrix} γ_{γ_{α \to β}^{t_{1}} \to γ_{α \to β}^{t_{2}}}^{t} (v, w) & = 〈 h (v), {(A^{1 - t_{1}} B^{t_{1}})}^{1 - t} {(A^{1 - t_{2}} B^{t_{2}})}^{t} h (w) 〉 \\ = 〈 h (v), A^{1 - t_{1} - t + t_{1} t} A^{t - t_{2} t} B^{t_{1} - t_{1} t} B^{t_{2} t} h (w) 〉 \\ = 〈 h (v), A^{1 - (t_{1} (1 - t) + t_{2} t)} B^{t_{1} (1 - t) + t_{2} t} h (w) 〉 \\ = γ_{α \to β}^{t^{'}} (v, w), \end{matrix}

(108)

with

t^{'} = t_{1} (1 - t) + t_{2} t

.
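The composition rule (108) is easiest to see in the one-dimensional case, where A and B reduce to positive numbers a and b. The following sketch, whose names and numerical values are purely illustrative assumptions, checks the rule for a generic choice of parameters and for the dyadic step that produces the parameter 1/4 from the parameters 0 and 1/2.

```python
def interp(a, b, t):
    """Scalar version of (107): a^(1-t) * b^t for positive numbers a and b."""
    return a ** (1.0 - t) * b ** t

a, b = 2.0, 5.0
t1, t2, t = 0.2, 0.9, 0.3

# Interpolation (with parameter t) of the two interpolations with parameters t1 and t2.
nested = interp(interp(a, b, t1), interp(a, b, t2), t)

# According to (108), this equals the interpolation with parameter t' = t1 (1 - t) + t2 t.
t_prime = t1 * (1.0 - t) + t2 * t
direct = interp(a, b, t_prime)
print(nested, direct)

# Dyadic step: t1 = 0 and t = t2 = 1/2 give t' = 1/4.
quarter = interp(interp(a, b, 0.0), interp(a, b, 0.5), 0.5)
print(quarter, interp(a, b, 0.25))
```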

The interpolation of

α, β

computed at the value

t = 1 / 2

corresponds to a particularly important positive sesquilinear form, indicated with

\sqrt{α β} : = γ_{α \to β}^{1 / 2},

(109)

and called the geometric mean of

α, β

. Clearly, we have

\sqrt{α β} (v, w) = γ_{α \to β}^{1 / 2} (v, w) = 〈 h (v), A^{1 / 2} B^{1 / 2} h (w) 〉, \forall v, w \in V .

(110)

An important property of the geometric mean of

α

and

β

will emerge in connection with the following concept: a positive sesquilinear form

r \in F_{+} (V)

is said to be dominated by

α, β \in F_{+} (V)

if

{| r (v, w) |}^{2} \leq α (v, v) β (w, w), \forall v, w \in V .

(111)

The following theorem establishes that

\sqrt{α β}

is the ‘maximal’ positive sesquilinear form dominated by

α

and

β

.

Theorem 4.

Let V be a vector space, and let

α, β \in F_{+} (V)

; then,

\sqrt{α β}

is dominated by

α, β

. Moreover, interpreting

F_{+} (V)

as a partially ordered set, let

S \subset F_{+} (V)

be the subset of positive sesquilinear forms dominated by

α, β

; then,

\sqrt{α β} = sup S

, i.e.,

r \leq \sqrt{α β}

for all

r \in S

.

Proof.

The first statement is simply an application of the Cauchy–Schwarz inequality. For all

v, w \in V

and for any compatible representations of

α

and

β

, we have

\begin{matrix} | 〈 h (v), A^{1 / 2} B^{1 / 2} h (w) 〉 |^{2} & = | 〈 A^{1 / 2} h (v), B^{1 / 2} h (w) 〉 |^{2} \\ \leq 〈 A^{1 / 2} h (v), A^{1 / 2} h (v) 〉 〈 B^{1 / 2} h (w), B^{1 / 2} h (w) 〉 \\ = 〈 h (v), A h (v) 〉 〈 h (w), B h (w) 〉, \end{matrix}

(112)

having used the fact that A and B, and hence their square roots, are Hermitian operators. Thus,

| \sqrt{α β} {(v, w) |}^{2} \leq α (v, v) β (w, w), \forall v, w \in V,

(113)

so

\sqrt{α β}

is dominated by

α

and

β

.

To prove the second statement, consider a form

r \in F_{+} (V)

dominated by

α, β

, and let

α \sim (H, h, A)

and

β \sim (H, h, B)

be compatible representations. By applying the constructive proof of Theorem 2 to r, we can find a positive operator

C \in B_{H} (H)

such that

r (v, w) = 〈 h (v), C h (w) 〉

; however, the representation

r \sim (H, h, C)

will not, in general, be compatible with that of

α

and

β

. Since r is dominated by

α

and

β

, we have

{| 〈 h (v), C h (w) 〉 |}^{2} \leq 〈 h (v), A h (v) 〉 〈 h (w), B h (w) 〉, \forall v, w \in V,

(114)

or, thanks to the fact that h is surjective on

H

,

{| 〈 x, C y 〉 |}^{2} \leq 〈 x, A x 〉 〈 y, B y 〉, \forall x, y \in H .

(115)

Now, the second statement of the theorem, i.e.,

r \leq \sqrt{α β}

, means that

(\sqrt{α β} - r) (v, v) \geq 0

for all

v \in V

and all

r \in S

, which is equivalent to

〈 u, (A^{1 / 2} B^{1 / 2} - C) u 〉 \geq 0

for all

u \in H

and all C satisfying inequality (115).

It follows that the second statement of the theorem will be proven if we manage to show that

C \leq A^{1 / 2} B^{1 / 2}

. In order to obtain this result, a regularization procedure applied to the operators

A, B

will be helpful: since

A \geq 0

and

B \geq 0

, for all

ε > 0

,

A_{ε} : = A + ε i d_{H}

,

B_{ε} : = B + ε i d_{H}

, and their square roots

A_{ε}^{1 / 2}

,

B_{ε}^{1 / 2}

are Hermitian, positive, and invertible. Moreover, it is clear that

A \leq A_{ε}

,

B \leq B_{ε}

, hence

A_{ε}^{- 1 / 2} A A_{ε}^{- 1 / 2} \leq i d_{H}, B_{ε}^{- 1 / 2} B B_{ε}^{- 1 / 2} \leq i d_{H} .

(116)

Finally,

A_{ε}

and

B_{ε}

are monotonically decreasing in

ε

w.r.t. the Löwner ordering and

A_{ε} \to A

and

B_{ε} \to B

, as

ε \to 0

. For all

x, y \in H

, there exist unique vectors

u, v \in H

such that

x = A_{ε}^{- 1 / 2} u, y = B_{ε}^{- 1 / 2} v,

(117)

then inequality (115) becomes

| 〈 A_{ε}^{- 1 / 2} u, C B_{ε}^{- 1 / 2} v 〉 |^{2} \leq 〈 A_{ε}^{- 1 / 2} u, A A_{ε}^{- 1 / 2} u 〉 〈 B_{ε}^{- 1 / 2} v, B B_{ε}^{- 1 / 2} v 〉,

(118)

i.e.,

| 〈 u, A_{ε}^{- 1 / 2} C B_{ε}^{- 1 / 2} v 〉 |^{2} \leq 〈 u, A_{ε}^{- 1 / 2} A A_{ε}^{- 1 / 2} u 〉 〈 v, B_{ε}^{- 1 / 2} B B_{ε}^{- 1 / 2} v 〉 \leq 〈 u, u 〉 〈 v, v 〉,

(119)

having used the inequalities written in (116).

By considering

u = v

and taking into account that

A_{ε}^{- 1 / 2} C B_{ε}^{- 1 / 2}

is a positive operator, we can write

〈 u, A_{ε}^{- 1 / 2} C B_{ε}^{- 1 / 2} u 〉 \leq 〈 u, u 〉, \forall u \in H,

(120)

which implies that

A_{ε}^{- 1 / 2} C B_{ε}^{- 1 / 2} \leq i d_{H}

, thus

C \leq A_{ε}^{1 / 2} B_{ε}^{1 / 2}

. By taking the limit

ε \to 0

, we get

C \leq A^{1 / 2} B^{1 / 2}

, and so the second statement of the theorem is also proven. □

The property of the geometric mean just proven allows us to extend the ordering relation between two positive sesquilinear forms to their interpolations, in the sense specified by the following theorem.

Theorem 5.

Let V be a vector space, and let

α, α^{'}, β, β^{'} \in F_{+} (V)

such that

α^{'} \leq α

and

β^{'} \leq β

; then,

γ_{α^{'} \to β^{'}}^{t} \leq γ_{α \to β}^{t}, \forall t \in [0, 1] .

(121)

Proof.

The statement is clearly satisfied for

t = 0, 1

. Setting

t = 1 / 2

, thanks to the first part of Theorem 4, we get

| γ_{α^{'} \to β^{'}}^{1 / 2} {(v, w) |}^{2} \leq α^{'} (v, v) β^{'} (w, w) \leq α (v, v) β (w, w), \forall v, w \in V,

(122)

where the second inequality follows from the hypotheses of this theorem. This means that

γ_{α^{'} \to β^{'}}^{1 / 2}

is dominated by

α, β

, and so the extremality of the geometric mean implies that

γ_{α^{'} \to β^{'}}^{1 / 2} \leq γ_{α \to β}^{1 / 2} .

(123)

If we now use Equation (108) with

t = t_{2} = 1 / 2

and

t_{1} = 0

, we get

γ_{α \to β}^{1 / 4} = γ_{γ_{α \to β}^{0} \to γ_{α \to β}^{1 / 2}}^{1 / 2}, γ_{α^{'} \to β^{'}}^{1 / 4} = γ_{γ_{α^{'} \to β^{'}}^{0} \to γ_{α^{'} \to β^{'}}^{1 / 2}}^{1 / 2} .

(124)

By repeating the previous argument, this time using Equation (123), we show that

γ_{α^{'} \to β^{'}}^{1 / 4} \leq γ_{α \to β}^{1 / 4}

.

By iterating this procedure, we can prove the statement of the theorem for any

t \in [0, 1]

of the type

t_{k, n} = k / 2^{n}

, with

n, k \in N

,

k \leq 2^{n}

, which is a dense subset of

[0, 1]

.

Finally, the functions

t_{k, n} \to γ_{α \to β}^{t_{k, n}} (v, w)

and

t_{k, n} \to γ_{α^{'} \to β^{'}}^{t_{k, n}} (v, w)

are continuous for every fixed

v, w \in V

; hence, the theorem holds for all

t \in [0, 1]

. □
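Theorem 5 can be probed numerically in coordinates. The sketch below assumes NumPy and positive definite matrices, so that the compatible representation built in the proof of Theorem 2 is available with a trivial kernel N; under this assumption, the matrix of the interpolation is G^{1/2} Ã^{1-t} (1 - Ã)^{t} G^{1/2}, with G the sum of the two matrices and Ã = G^{-1/2} T_1 G^{-1/2}. The helper names (`herm_fun`, `interp_matrix`, `random_pd`, `min_eig`) are hypothetical.

```python
import numpy as np

def herm_fun(m, f):
    """Apply the scalar function f to a Hermitian matrix via its spectrum."""
    w, u = np.linalg.eigh(m)
    return (u * f(w)) @ u.conj().T

def interp_matrix(t1, t2, t):
    """Matrix of the interpolation from alpha to beta, assuming t1 + t2 is invertible."""
    g_half = herm_fun(t1 + t2, np.sqrt)
    g_inv_half = herm_fun(t1 + t2, lambda w: w ** -0.5)
    a = g_inv_half @ t1 @ g_inv_half          # spectrum in [0, 1]; B equals 1 - A
    w, u = np.linalg.eigh((a + a.conj().T) / 2)
    w = np.clip(w, 0.0, 1.0)                  # remove rounding noise outside [0, 1]
    f = w ** (1.0 - t) * (1.0 - w) ** t
    return g_half @ ((u * f) @ u.conj().T) @ g_half

def min_eig(m):
    """Smallest eigenvalue of the Hermitian part of m."""
    return np.linalg.eigvalsh((m + m.conj().T) / 2).min()

rng = np.random.default_rng(5)
n = 3

def random_pd(shift):
    w = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return w @ w.conj().T + shift * np.eye(n)

alpha_p, beta_p = random_pd(0.5), random_pd(0.5)     # matrices of alpha', beta'
alpha = alpha_p + random_pd(0.0)                     # alpha' <= alpha
beta = beta_p + random_pd(0.0)                       # beta' <= beta

# Smallest eigenvalue of (interpolation of alpha, beta) - (interpolation of alpha', beta').
gaps = {t: min_eig(interp_matrix(alpha, beta, t) - interp_matrix(alpha_p, beta_p, t))
        for t in (0.0, 0.25, 0.5, 0.75, 1.0)}
print(gaps)
```

All the printed eigenvalues should be non-negative up to rounding errors, in agreement with inequality (121).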

In a similar way, we can prove another important result. Let

ψ : U \to V

be a linear map between vector spaces, and let

α, β \in F_{+} (V)

. Then,

ψ

allows us to pull back these sesquilinear forms on U as follows:

\begin{matrix} ψ^{*} α : & U \times U & ⟶ & F \\ & (v, w) & ⟼ & α (ψ (v), ψ (w)), \\ ψ^{*} β : & U \times U & ⟶ & F \\ & (v, w) & ⟼ & β (ψ (v), ψ (w)) . \end{matrix}

(125)

The following theorem shows that the pull-back of an interpolation of the forms in

F_{+} (V)

is always ‘smaller’ than the interpolation of their pull-backs, with respect to the partial ordering of

F_{+} (U)

.

Theorem 6.

Let

U, V

be vector spaces,

ψ : U \to V

be a linear map, and

α, β \in F_{+} (V)

; then,

ψ^{*} γ_{α \to β}^{t} \leq γ_{ψ^{*} α \to ψ^{*} β}^{t}, \forall t \in [0, 1] .

(126)

Proof.

The argument that we use is quite similar to the one appearing in the previous proof. The statement is true for

t = 0

and

t = 1

because, in these cases, we have

ψ^{*} α \leq ψ^{*} α

and

ψ^{*} β \leq ψ^{*} β

, respectively.

Let us now consider

t = 1 / 2

, which gives rise to the geometric mean, so

{| ψ^{*} \sqrt{α β} (v, w) |}^{2} = {| \sqrt{α β} (ψ (v), ψ (w)) |}^{2}, \forall v, w \in U .

(127)

Since

\sqrt{α β}

is dominated by

α, β

, we can write

{| ψ^{*} \sqrt{α β} (v, w) |}^{2} \leq α (ψ (v), ψ (v)) β (ψ (w), ψ (w)) = ψ^{*} α (v, v) ψ^{*} β (w, w),

(128)

which shows that

ψ^{*} \sqrt{α β}

is dominated by

ψ^{*} α

,

ψ^{*} β

. Thus, by the extremal property of the geometric mean, the statement of the theorem holds for

t = 1 / 2

.

By iterating this reasoning as done in the proof of the previous theorem, the validity of inequality (126) can be generalized to all

t \in [0, 1]

. □
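Theorem 6 can be illustrated in the same finite-dimensional matrix model. The sketch below assumes that a form with matrix P on C^n is pulled back along a linear map with matrix M (n rows, m columns) to the form with matrix M†PM on C^m, and that the interpolation is modelled by the weighted matrix geometric mean. Both modelling choices and all helper names are assumptions made for illustration.

```python
import numpy as np

def herm_fun(M, f):
    """Apply the scalar function f to a Hermitian matrix via its eigendecomposition."""
    w, U = np.linalg.eigh(M)
    return (U * f(w)) @ U.conj().T

def geo_mean(A, B, t):
    """Weighted geometric mean A #_t B (assumed matrix model of the interpolation)."""
    A_half = herm_fun(A, np.sqrt)
    A_inv_half = herm_fun(A, lambda w: 1.0 / np.sqrt(w))
    C = A_inv_half @ B @ A_inv_half
    C = (C + C.conj().T) / 2  # remove rounding asymmetry before calling eigh
    return A_half @ herm_fun(C, lambda w: w ** t) @ A_half

def rand_pd(rng, n):
    """Random positive definite matrix with eigenvalues >= 1."""
    G = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return G @ G.conj().T + np.eye(n)

def pull_back(P, M):
    """Matrix of the pulled-back form (u, u') -> (M u)^dagger P (M u')."""
    Q = M.conj().T @ P @ M
    return (Q + Q.conj().T) / 2

rng = np.random.default_rng(1)
n, m = 5, 3
A, B = rand_pd(rng, n), rand_pd(rng, n)
# Matrix of psi : C^m -> C^n; a random M has full column rank almost surely,
# so the pulled-back matrices stay positive definite.
M = rng.normal(size=(n, m)) + 1j * rng.normal(size=(n, m))

# Smallest eigenvalue of  (psi^* A) #_t (psi^* B)  -  psi^* (A #_t B)  over a grid of t.
min_gap = min(
    np.linalg.eigvalsh(
        geo_mean(pull_back(A, M), pull_back(B, M), t) - pull_back(geo_mean(A, B, t), M)
    ).min()
    for t in np.linspace(0.0, 1.0, 11)
)
print(f"min eigenvalue of the gap: {min_gap:.3e}")
```

At t = 0 and t = 1 the gap vanishes up to rounding, in agreement with the first step of the proof; for intermediate t it is expected to be positive semidefinite.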

4.2. Definition of the Relative Entropy in Terms of Interpolations of Forms

In this subsection, we apply the results previously established to reformulate the relative entropy in a manner that will facilitate the proof of its monotonicity under partial trace, which is to be presented in the next subsection.

Adopting notations analogous to those introduced at the beginning of Section 3, we identify the vector space V, on which the forms of interest for us will be defined, with

B (H_{a b})

. As we know, this is a Hilbert space w.r.t. the Hilbert–Schmidt inner product

{〈 A, B 〉}_{a b} = Tr (A^{†} B)

,

A, B \in B (H_{a b})

, which is a positive definite sesquilinear form.

Given two density operators

ρ, σ \in B_{H} (H_{a b})

, we can define the following positive sesquilinear forms

ρ_{L}, σ_{R} : B (H_{a b}) \times B (H_{a b}) \to F

:

ρ_{L} (A, B) : = Tr (ρ B A^{†}) = Tr (A^{†} ρ B) = 〈 A, L_{ρ} B 〉,

(129)

σ_{R} (A, B) : = Tr (σ A^{†} B) = Tr (A^{†} B σ) = 〈 A, R_{σ} B 〉,

(130)

where the operators

L_{ρ}

and

R_{σ}

are defined as in Equation (46). We immediately recognize the two representations

ρ_{L} \sim (B (H_{a b}), i d_{a b}, L_{ρ}), σ_{R} \sim (B (H_{a b}), i d_{a b}, R_{σ}),

(131)

where

i d_{a b}

is the identity map on

B (H_{a b})

. These representations are compatible because, thanks to Equation (47),

[L_{ρ}, R_{σ}] = 0

.
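The two forms and the commutation of the associated superoperators can be checked on small matrices. The sketch below assumes that L_ρ and R_σ act as left multiplication by ρ and right multiplication by σ, and it represents them as Kronecker products acting on row-major vectorised matrices; this matrix representation and the variable names are illustrative assumptions.

```python
import numpy as np

def rand_density(rng, d):
    """Random density matrix: positive semidefinite with unit trace."""
    G = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    W = G @ G.conj().T
    return W / np.trace(W).real

rng = np.random.default_rng(1)
d = 3
rho, sigma = rand_density(rng, d), rand_density(rng, d)
I = np.eye(d)

# Row-major vectorisation: vec(X Y Z) = kron(X, Z.T) @ vec(Y).
L_rho = np.kron(rho, I)        # superoperator B -> rho B
R_sigma = np.kron(I, sigma.T)  # superoperator B -> B sigma

A = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
B = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))

# [L_rho, R_sigma] = 0
comm_norm = np.linalg.norm(L_rho @ R_sigma - R_sigma @ L_rho)

# rho_L(A, B) = Tr(rho B A^dagger) = <A, L_rho B>   (np.vdot conjugates its first argument)
err_L = abs(np.vdot(A.ravel(), L_rho @ B.ravel()) - np.trace(rho @ B @ A.conj().T))

# sigma_R(A, B) = Tr(sigma A^dagger B) = <A, R_sigma B>
err_R = abs(np.vdot(A.ravel(), R_sigma @ B.ravel()) - np.trace(sigma @ A.conj().T @ B))

print(comm_norm, err_L, err_R)
```

All three printed quantities should be at the level of floating-point rounding.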

We can now define the relative entropy positive sesquilinear form between the two density operators

ρ, σ

, indicated with

S_{ρ ∥ σ} : B (H_{a b}) \times B (H_{a b}) \to F

, as minus the rate of change at

t = 0^{+}

of the interpolation

γ_{ρ_{L} \to σ_{R}}^{t}

, which starts at

γ_{ρ_{L} \to σ_{R}}^{0} = ρ_{L}

:

S_{ρ ∥ σ} (A, B) = - \underset{t \to 0^{+}}{lim inf} \frac{γ_{ρ_{L} \to σ_{R}}^{t} (A, B) - ρ_{L} (A, B)}{t}, A, B \in B (H_{a b}) .

(132)

Remark 1.

The use of lim inf instead of an ordinary limit is motivated by the following two arguments:

1. When ρ and σ are not invertible, the interpolation function $t \mapsto γ_{ρ_{L} \to σ_{R}}^{t} (A, B)$ may not be differentiable at $t = 0^{+}$, so the ordinary limit of the difference quotient might fail to exist. The lim inf, however, always exists (possibly infinite), which ensures that the entropy form $S_{ρ ∥ σ} (A, B)$ is always well-defined.
2. For $A = B$, the function $t \mapsto γ_{ρ_{L} \to σ_{R}}^{t} (A, A)$ is convex in t, because, for fixed $x, y > 0$, the scalar interpolation $f_{t} (x, y) = x^{1 - t} y^{t} = x e^{t log (y / x)}$ is convex in t. For a convex function on $[0, 1]$, the natural notion of derivative at the endpoint $t = 0$ is the right derivative, which coincides with the lower right Dini derivative, i.e., the lim inf of the difference quotient. Thus, the use of lim inf aligns with standard practice in convex analysis.

The relative entropy between states

ρ, σ

can be recovered as follows.

Theorem 7.

For all density operators

ρ, σ \in B_{H} (H_{a b})

, we have

S (ρ ∥ σ) = S_{ρ ∥ σ} (i d_{a b}, i d_{a b}) .

(133)

Proof.

Let us first consider the case of invertible density operators

ρ > 0

and

σ > 0

. Then,

\begin{matrix} S_{ρ ∥ σ} (i d_{a b}, i d_{a b}) & = - \underset{t \to 0^{+}}{lim inf} \frac{γ_{ρ_{L} \to σ_{R}}^{t} (i d_{a b}, i d_{a b}) - ρ_{L} (i d_{a b}, i d_{a b})}{t} \\ & = - \underset{t \to 0^{+}}{lim inf} \frac{〈 i d_{a b}, L_{ρ}^{1 - t} R_{σ}^{t} i d_{a b} 〉 - 〈 i d_{a b}, L_{ρ} i d_{a b} 〉}{t} \\ & = - \underset{t \to 0^{+}}{lim inf} \frac{Tr (ρ^{1 - t} σ^{t}) - Tr (ρ)}{t} = - {\left. \frac{d}{d t} \right|}_{t = 0} Tr (ρ^{1 - t} σ^{t}) \\ & = - {\left. \frac{d}{d t} \right|}_{t = 0} Tr (exp ((1 - t) log ρ) exp (t log σ)) \\ & = - Tr {[- exp ((1 - t) log ρ) log ρ exp (t log σ)} \\ & \quad {+ exp ((1 - t) log ρ) exp (t log σ) log σ]}_{t = 0} \\ & = - Tr (- ρ log ρ + ρ log σ) = Tr (ρ log ρ - ρ log σ) \\ & = S (ρ ∥ σ) . \end{matrix}

(134)

When

σ

and

ρ

are not invertible, we can use the regularized version of relative entropy. We have seen an equivalent definition of the relative entropy in (40), and, thus, it is natural to consider

lim_{ϵ \to 0^{+}} S_{ρ_{ϵ} ∥ σ_{ϵ}} (i d_{a b}, i d_{a b}) .

(135)

Now, we have that both

ρ_{ϵ}, σ_{ϵ}

are positive definite, and, thus,

log ρ_{ϵ}

,

log σ_{ϵ}

are well-defined. By repeating the steps in (134), we get that

lim_{ϵ \to 0^{+}} S_{ρ_{ϵ} ∥ σ_{ϵ}} (i d_{a b}, i d_{a b}) = lim_{ϵ \to 0^{+}} Tr (ρ_{ϵ} log ρ_{ϵ} - ρ_{ϵ} log σ_{ϵ}) .

(136)

We obtain the same definition of relative entropy as in (27). □
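The invertible case of Theorem 7 lends itself to a quick numerical check: a one-sided difference quotient of Tr(ρ^{1-t} σ^{t}) at small t should approach Tr(ρ log ρ - ρ log σ). The sketch below is only an illustration on random full-rank states; the step size, the helper names and the way the random states are generated are assumptions, not part of the text.

```python
import numpy as np

def herm_fun(M, f):
    """Apply the scalar function f to a Hermitian matrix via its eigendecomposition."""
    w, U = np.linalg.eigh(M)
    return (U * f(w)) @ U.conj().T

def rand_density(rng, d):
    """Random full-rank density matrix (the identity shift keeps it well conditioned)."""
    G = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    W = G @ G.conj().T + np.eye(d)
    return W / np.trace(W).real

def rel_entropy(rho, sigma):
    """S(rho || sigma) = Tr(rho log rho - rho log sigma), natural logarithm."""
    return np.trace(rho @ (herm_fun(rho, np.log) - herm_fun(sigma, np.log))).real

def interp_trace(rho, sigma, t):
    """Tr(rho^(1-t) sigma^t), i.e. the interpolation evaluated at (id, id)."""
    return np.trace(
        herm_fun(rho, lambda w: w ** (1.0 - t)) @ herm_fun(sigma, lambda w: w ** t)
    ).real

rng = np.random.default_rng(2)
rho, sigma = rand_density(rng, 4), rand_density(rng, 4)

t = 1e-6  # small step for the one-sided difference quotient
quotient = -(interp_trace(rho, sigma, t) - np.trace(rho).real) / t
exact = rel_entropy(rho, sigma)
print(f"difference quotient: {quotient:.6f}, Tr(rho log rho - rho log sigma): {exact:.6f}")
```

The two printed numbers should agree up to an error of the order of the step size.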

4.3. Proof of the Monotonicity of the Relative Entropy Under Partial Trace

Thanks to formula (133), the monotonicity of the relative entropy under the partial trace

{Tr}_{b} : B (H_{a b}) \to B (H_{a})

will be proven if we show that

S_{{Tr}_{b} (ρ) ∥ {Tr}_{b} (σ)} (i d_{a}, i d_{a}) \leq S_{ρ ∥ σ} (i d_{a b}, i d_{a b}) .

(137)

As in Petz’s proof, the adjoint of the partial trace plays a central, though conceptually distinct, role in establishing the monotonicity of relative entropy.

Specifically, in Uhlmann’s approach,

{Tr}_{b}^{†}

acts as a pull-back map; i.e., we define

ψ : = {Tr}_{b}^{†} : B (H_{a}) \to B (H_{a b})

and use it to pull back the positive sesquilinear forms

ρ_{L}

and

σ_{R}

introduced in Equations (129) and (130), respectively. Thanks to Theorem 6, we have

ψ^{*} γ_{ρ_{L} \to σ_{R}}^{t} \leq γ_{ψ^{*} ρ_{L} \to ψ^{*} σ_{R}}^{t},

(138)

i.e.,

γ_{ρ_{L} \to σ_{R}}^{t} ({Tr}_{b}^{†} (X), {Tr}_{b}^{†} (X)) = ψ^{*} γ_{ρ_{L} \to σ_{R}}^{t} (X, X) \leq γ_{ψ^{*} ρ_{L} \to ψ^{*} σ_{R}}^{t} (X, X),

(139)

for all

X \in B (H_{a})

. Now, we can use the fact that

{Tr}_{b}

preserves density operators and that

{Tr}_{b}^{†}

is a Schwarz map to write

\begin{matrix} ψ^{*} ρ_{L} (X, X) & = ρ_{L} ({Tr}_{b}^{†} (X), {Tr}_{b}^{†} (X)) = Tr ({Tr}_{b}^{†} {(X)}^{†} ρ {Tr}_{b}^{†} (X)) = Tr (ρ {Tr}_{b}^{†} (X) {Tr}_{b}^{†} {(X)}^{†}) \\ & \leq Tr (ρ {Tr}_{b}^{†} (X X^{†})) = 〈 ρ, {Tr}_{b}^{†} (X X^{†}) 〉 = 〈 {Tr}_{b} (ρ), X X^{†} 〉 = Tr ({Tr}_{b} (ρ) X X^{†}) \\ & = {Tr}_{b} {(ρ)}_{L} (X, X) . \end{matrix}

(140)

Replacing

ρ_{L}

with

σ_{R}

, we find

ψ^{*} σ_{R} (X, X) \leq {Tr}_{b} {(σ)}_{R} (X, X) .

(141)
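Inequalities (140) and (141) can be evaluated explicitly on small matrices. The sketch below uses the standard fact that the Hilbert–Schmidt adjoint of the partial trace is X ↦ X ⊗ id_b, implemented with a Kronecker product; the index convention (i, j) ↦ i·d_b + j for the composite system and the function names are assumptions made for illustration.

```python
import numpy as np

def partial_trace_b(M, da, db):
    """Tr_b on H_a (x) H_b, with composite index (i, j) -> i * db + j."""
    return np.einsum("ijkj->ik", M.reshape(da, db, da, db))

def adjoint_partial_trace_b(X, db):
    """Hilbert-Schmidt adjoint of Tr_b: X -> kron(X, id_b)."""
    return np.kron(X, np.eye(db))

def rand_density(rng, d):
    """Random density matrix: positive semidefinite with unit trace."""
    G = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    W = G @ G.conj().T
    return W / np.trace(W).real

rng = np.random.default_rng(3)
da, db = 2, 3
rho, sigma = rand_density(rng, da * db), rand_density(rng, da * db)
X = rng.normal(size=(da, da)) + 1j * rng.normal(size=(da, da))
lift = adjoint_partial_trace_b(X, db)

# Duality <rho, Tr_b^dagger(X)> = <Tr_b(rho), X> for the Hilbert-Schmidt inner product.
duality_err = abs(np.vdot(rho, lift) - np.vdot(partial_trace_b(rho, da, db), X))

# Left- and right-hand sides of (140), evaluated at (X, X).
lhs_rho = np.trace(rho @ lift @ lift.conj().T).real
rhs_rho = np.trace(partial_trace_b(rho, da, db) @ X @ X.conj().T).real

# Left- and right-hand sides of (141), evaluated at (X, X).
lhs_sigma = np.trace(sigma @ lift.conj().T @ lift).real
rhs_sigma = np.trace(partial_trace_b(sigma, da, db) @ X.conj().T @ X).real

print(duality_err, lhs_rho, rhs_rho, lhs_sigma, rhs_sigma)
```

In this particular example, the two sides of each inequality agree up to rounding, because X ↦ X ⊗ id_b is multiplicative and therefore saturates the Schwarz inequality.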

Thus, we have proven that

ψ^{*} ρ_{L} \leq {Tr}_{b} {(ρ)}_{L}

and

ψ^{*} σ_{R} \leq {Tr}_{b} {(σ)}_{R}

, and so Theorem 5 implies that, for all

t \in [0, 1]

, it holds that

γ_{ψ^{*} ρ_{L} \to ψ^{*} σ_{R}}^{t} \leq γ_{{Tr}_{b} {(ρ)}_{L} \to {Tr}_{b} {(σ)}_{R}}^{t} .

(142)

This result and inequality (139) imply

γ_{ρ_{L} \to σ_{R}}^{t} ({Tr}_{b}^{†} (X), {Tr}_{b}^{†} (X)) \leq γ_{{Tr}_{b} {(ρ)}_{L} \to {Tr}_{b} {(σ)}_{R}}^{t} (X, X),

(143)

for all

X \in B (H_{a})

. By considering in particular

X = i d_{a}

and recalling that

{Tr}_{b}^{†}

is a unital map, i.e.,

{Tr}_{b}^{†} (i d_{a}) = i d_{a b}

, we obtain

\begin{matrix} γ_{ρ_{L} \to σ_{R}}^{t} (i d_{a b}, i d_{a b}) \leq γ_{{Tr}_{b} {(ρ)}_{L} \to {Tr}_{b} {(σ)}_{R}}^{t} (i d_{a}, i d_{a}) . \end{matrix}

(144)

Now, since

{Tr}_{b}

is trace-preserving, we have

\begin{matrix} {Tr}_{b} {(ρ)}_{L} (i d_{a}, i d_{a}) & = {〈 i d_{a}, L_{{Tr}_{b} (ρ)} i d_{a} 〉}_{a} = Tr ({Tr}_{b} (ρ)) = Tr (ρ) = {〈 i d_{a b}, L_{ρ} i d_{a b} 〉}_{a b} \\ & = ρ_{L} (i d_{a b}, i d_{a b}), \end{matrix}

(145)

and, thus, we can rewrite inequality (144) as follows:

γ_{ρ_{L} \to σ_{R}}^{t} (i d_{a b}, i d_{a b}) - ρ_{L} (i d_{a b}, i d_{a b}) \leq γ_{{Tr}_{b} {(ρ)}_{L} \to {Tr}_{b} {(σ)}_{R}}^{t} (i d_{a}, i d_{a}) - {Tr}_{b} {(ρ)}_{L} (i d_{a}, i d_{a}) .

(146)

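Dividing both sides of this last inequality by $t > 0$, taking the lim inf as $t \to 0^{+}$, and changing sign (which reverses the inequality), we obtain inequality (137) thanks to Equations (132) and (133), i.e., the monotonicity of the relative entropy under partial trace.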
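The two key inequalities of this subsection, (144) and the resulting monotonicity of the relative entropy, can be probed numerically on random full-rank states of a small bipartite system. The sketch below is an illustration under stated assumptions: the dimensions, the random states, the index convention for the tensor product and all function names are chosen for the example and are not taken from the text.

```python
import numpy as np

def herm_fun(M, f):
    """Apply the scalar function f to a Hermitian matrix via its eigendecomposition."""
    w, U = np.linalg.eigh(M)
    return (U * f(w)) @ U.conj().T

def rand_density(rng, d):
    """Random full-rank density matrix (the identity shift keeps it well conditioned)."""
    G = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    W = G @ G.conj().T + np.eye(d)
    return W / np.trace(W).real

def partial_trace_b(M, da, db):
    """Tr_b on H_a (x) H_b, with composite index (i, j) -> i * db + j."""
    return np.einsum("ijkj->ik", M.reshape(da, db, da, db))

def rel_entropy(rho, sigma):
    """S(rho || sigma) = Tr(rho log rho - rho log sigma), natural logarithm."""
    return np.trace(rho @ (herm_fun(rho, np.log) - herm_fun(sigma, np.log))).real

def interp_trace(rho, sigma, t):
    """Tr(rho^(1-t) sigma^t), i.e. the interpolation evaluated at (id, id)."""
    return np.trace(
        herm_fun(rho, lambda w: w ** (1.0 - t)) @ herm_fun(sigma, lambda w: w ** t)
    ).real

rng = np.random.default_rng(4)
da, db = 2, 3
rho, sigma = rand_density(rng, da * db), rand_density(rng, da * db)
rho_a, sigma_a = partial_trace_b(rho, da, db), partial_trace_b(sigma, da, db)

# Inequality (144): right-hand side minus left-hand side on a grid of t in [0, 1].
gaps_144 = [
    interp_trace(rho_a, sigma_a, t) - interp_trace(rho, sigma, t)
    for t in np.linspace(0.0, 1.0, 11)
]

# Monotonicity: S(Tr_b rho || Tr_b sigma) <= S(rho || sigma).
S_full = rel_entropy(rho, sigma)
S_reduced = rel_entropy(rho_a, sigma_a)
print(f"min gap in (144): {min(gaps_144):.3e}, S_full = {S_full:.6f}, S_reduced = {S_reduced:.6f}")
```

The gap in (144) vanishes at t = 0 and t = 1, where both sides equal 1, and is expected to be non-negative in between.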

5. Conclusions

We revisited the monotonicity of relative entropy under the action of quantum channels by focusing on two important proofs: those by Petz and Uhlmann. While both approaches are foundational, their complexity has often hindered their pedagogical dissemination.

Our aim was to clarify and reconstruct these strategies within a finite-dimensional operator framework. In particular, we pointed out a subtle flaw in Petz’s original argument, whose validity was nonetheless restored by Petz and Nielsen soon after, and we showed how to rigorously extend this approach to incorporate non-invertible density operators.

It is also worth noting that our explicit construction of the isometric operator defined in Equation (60) sheds new light on its structural role within the broader context of quantum information theory. In particular, this operator can be seen as an essential component of the Petz recovery map, originally introduced in [9] in the setting of von Neumann algebras. Given a quantum channel

C

and a fixed full-rank state

σ \in B_{H} (H_{a b})

, the Petz recovery map associated with

C

and

σ

is defined by

P_{σ, C} (ρ) : = σ^{1 / 2} C^{†} (C {(σ)}^{- 1 / 2} ρ C {(σ)}^{- 1 / 2}) σ^{1 / 2},

(147)

or, equivalently, in terms of superoperators,

P_{σ, C} = L_{σ^{1 / 2}} \circ R_{σ^{1 / 2}} \circ C^{†} \circ R_{C {(σ)}^{- 1 / 2}}^{a} \circ L_{C {(σ)}^{- 1 / 2}}^{a} .

(148)

When

C = {Tr}_{b}

, this reduces to

\begin{matrix} P_{σ, {Tr}_{b}} & = L_{σ^{1 / 2}} \circ (R_{σ^{1 / 2}} \circ {Tr}_{b}^{†} \circ R_{{Tr}_{b} {(σ)}^{- 1 / 2}}^{a}) \circ L_{{Tr}_{b} {(σ)}^{- 1 / 2}}^{a} \\ & = L_{σ^{1 / 2}} \circ V_{σ} \circ L_{{Tr}_{b} {(σ)}^{- 1 / 2}}^{a}, \end{matrix}

(149)

or

V_{σ} = L_{σ^{- 1 / 2}} \circ P_{σ, {Tr}_{b}} \circ L_{{Tr}_{b} {(σ)}^{1 / 2}}^{a} .

(150)

So, just as the Petz recovery map characterizes the reversibility of quantum channels and identifies conditions for saturation of the monotonicity inequality, the operator

V_{σ}

explicitly captures the mechanism by which relative entropy is contracted under partial trace.
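To make formula (147) concrete for the partial trace, the following sketch implements the map on matrices, using the standard fact that the adjoint of the partial trace is X ↦ X ⊗ id_b. It then checks two well-known properties of the Petz recovery map: it sends the reduced reference state back to σ and it preserves the trace. The argument name omega (an operator on the reduced system), the index convention and the helper names are assumptions introduced for this example.

```python
import numpy as np

def herm_fun(M, f):
    """Apply the scalar function f to a Hermitian matrix via its eigendecomposition."""
    w, U = np.linalg.eigh(M)
    return (U * f(w)) @ U.conj().T

def rand_density(rng, d):
    """Random full-rank density matrix (the identity shift keeps it well conditioned)."""
    G = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    W = G @ G.conj().T + np.eye(d)
    return W / np.trace(W).real

def partial_trace_b(M, da, db):
    """Tr_b on H_a (x) H_b, with composite index (i, j) -> i * db + j."""
    return np.einsum("ijkj->ik", M.reshape(da, db, da, db))

def petz_recovery_partial_trace(omega, sigma, da, db):
    """Formula (147) with C = Tr_b: sigma^(1/2) kron(s_a^(-1/2) omega s_a^(-1/2), id_b) sigma^(1/2)."""
    sigma_a = partial_trace_b(sigma, da, db)
    sigma_half = herm_fun(sigma, np.sqrt)
    sigma_a_inv_half = herm_fun(sigma_a, lambda w: 1.0 / np.sqrt(w))
    inner = sigma_a_inv_half @ omega @ sigma_a_inv_half
    return sigma_half @ np.kron(inner, np.eye(db)) @ sigma_half

rng = np.random.default_rng(5)
da, db = 2, 3
rho, sigma = rand_density(rng, da * db), rand_density(rng, da * db)
rho_a, sigma_a = partial_trace_b(rho, da, db), partial_trace_b(sigma, da, db)

# The reduced reference state is recovered exactly: P(Tr_b sigma) = sigma.
recover_err = np.linalg.norm(petz_recovery_partial_trace(sigma_a, sigma, da, db) - sigma)

# Applied to another reduced state, the output is again a density matrix.
recovered_rho = petz_recovery_partial_trace(rho_a, sigma, da, db)
trace_err = abs(np.trace(recovered_rho) - 1.0)
min_eig = np.linalg.eigvalsh((recovered_rho + recovered_rho.conj().T) / 2).min()

print(recover_err, trace_err, min_eig)
```

The recovered version of ρ generally differs from ρ itself; only the reference state σ is guaranteed to be reproduced exactly.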

In recent developments, particularly in the work of Fawzi and Renner [29], the Petz recovery map plays a central role in quantitative refinements of the data processing inequality. Specifically, for states

ρ

and

σ

and a channel

C

, the inequality

S (ρ ∥ σ) - S (C (ρ) ∥ C (σ)) \geq - 2 log F (ρ, (P_{σ, C} \circ C) (ρ)),

(151)

bounds the loss of distinguishability in terms of the fidelity F between the original state

ρ

and its recovered approximation via

P_{σ, C} \circ C

.

The explicit identification of

V_{σ}

offers a concrete realization of this recovery mechanism, reinforcing its interpretive clarity and suggesting further applications in entropy inequalities and recoverability conditions.

Author Contributions

Conceptualization, S.M., F.B. and E.P.; Formal analysis, S.M., F.B. and E.P.; Writing—original draft, S.M., F.B. and E.P. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Acknowledgments

The authors are grateful to Mark M. Wilde for his valuable insights on the literature concerning relative entropy and, in particular, for clarifying the correction of the flawed Petz argument through the proof that the operator

V_{ρ}

is an isometry.

Conflicts of Interest

The authors declare no conflicts of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

References

1. Umegaki, H. Conditional expectation in an operator algebra, IV (entropy and information). Kodai Math. Semin. Rep. 1962, 14, 59–85.
2. Araki, H. Relative Entropy of States of von Neumann Algebras. Publ. Res. Inst. Math. Sci. 1976, 11, 809–833.
3. Lanford, O.E.; Robinson, D.W. Mean Entropy of States in Quantum Statistical Mechanics. J. Math. Phys. 1968, 9, 1120–1125.
4. Lieb, E.H. Convex Trace Functions and the Wigner-Yanase-Dyson Conjecture. Adv. Math. 1973, 11, 267–288.
5. Lieb, E.H.; Ruskai, M.B. Proof of the Strong Subadditivity of Quantum-Mechanical Entropy. J. Math. Phys. 1973, 14, 1938–1941.
6. Lindblad, G. Completely Positive Maps and Entropy Inequalities. Commun. Math. Phys. 1975, 40, 147–151.
7. Uhlmann, A. Relative entropy and the Wigner-Yanase-Dyson-Lieb concavity in an interpolation theory. Commun. Math. Phys. 1977, 54, 21–32.
8. Petz, D. Sufficient subalgebras and the relative entropy of states of a von Neumann algebra. Commun. Math. Phys. 1986, 105, 123–131.
9. Petz, D. Sufficiency of channels over von Neumann algebras. Q. J. Math. 1988, 39, 97–108.
10. Petz, D. Monotonicity of Quantum Relative Entropy Revisited. Rev. Math. Phys. 2003, 15, 79–91.
11. Müller-Hermes, A.G.; Reeb, D. Monotonicity of the Quantum Relative Entropy under Positive Maps. Ann. Henri Poincaré 2017, 18, 1777–1788.
12. Sharma, N. More on a trace inequality in quantum information theory. arXiv 2015, arXiv:1512.00226.
13. Datta, N.; Wilde, M.M. Quantum Markov chains, sufficiency of quantum channels, and Rényi information measures. J. Phys. A Math. Theor. 2015, 48, 505301.
14. Zhang, L. A strengthened monotonicity inequality of quantum relative entropy: A unifying approach via Rényi relative entropy. Lett. Math. Phys. 2016, 106, 557–573.
15. Junge, M.; Renner, R.; Sutter, D.; Wilde, M.M.; Winter, A. Universal recovery maps and approximate sufficiency of quantum relative entropy. Ann. Henri Poincaré 2018, 19, 2955–2978.
16. Bhatia, R. Matrix Analysis; Springer: New York, NY, USA, 1997.
17. Hansen, F.; Pedersen, G.K. Jensen’s operator inequality. Bull. Lond. Math. Soc. 2003, 35, 553–564.
18. Heinosaari, T.; Ziman, M. The Mathematical Language of Quantum Theory: From Uncertainty to Entanglement; Cambridge University Press: Cambridge, UK, 2011.
19. Stinespring, W.F. Positive functions on C*-algebras. Proc. Am. Math. Soc. USA 1955, 6, 211–216.
20. Choi, M.D. A Schwarz inequality for positive linear maps on C*-algebras. Ill. J. Math. 1974, 18, 565–574.
21. Moretti, V. Spectral Theory and Quantum Mechanics; Springer International Publishing: Berlin/Heidelberg, Germany, 2017.
22. Wilde, M.M. Quantum Information Theory, 2nd ed.; Cambridge University Press: Cambridge, UK, 2017.
23. Vedral, V. The role of relative entropy in quantum information theory. Rev. Mod. Phys. 2002, 74, 197.
24. Nielsen, M.A.; Petz, D. A simple proof of the strong subadditivity inequality. Quantum Inf. Comput. 2005, 5, 480–486.
25. Tomamichel, M.; Colbeck, R.; Renner, R. A Fully Quantum Asymptotic Equipartition Property. IEEE Trans. Inf. Theory 2009, 55, 5840–5847.
26. Khatri, S.; Wilde, M.M. Principles of Quantum Communication Theory: A Modern Approach. arXiv 2024, arXiv:2011.04672.
27. Pérez-Pardo, J.M. On Uhlmann’s proof of the Monotonicity of the Relative Entropy. In Particles, Fields and Topology: Celebrating AP Balachandran; World Scientific: Singapore, 2023; pp. 145–155.
28. Pusz, W.; Woronowicz, S.L. Functional Calculus for Sesquilinear Forms and the Purification Map. Rep. Math. Phys. 1975, 8, 159–170.
29. Fawzi, O.; Renner, R. Quantum Conditional Mutual Information and Approximate Markov Chains. Commun. Math. Phys. 2015, 340, 575–611.

Figure 1. Comparison of

{(α x α + ξ)}^{- 1}

and

α {(x + ξ)}^{- 1} α

, illustrating the failure of the Jensen-type inequality in the scalar case, with

α = 0.5

and

ξ = 0.5

.

Figure 2. Counterexample showing that the inequality

- log (α x α) \leq - α log (x) α

fails for

α = 0.5

.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
