Belavkin–Staszewski Relative Entropy, Conditional Entropy, and Mutual Information

Belavkin–Staszewski relative entropy naturally captures the effects of the possible noncommutativity of quantum states. In this paper, two new conditional entropy terms and four new mutual information terms are first defined by replacing the quantum relative entropy with the Belavkin–Staszewski relative entropy. Next, their basic properties are investigated, especially in classical-quantum settings. In particular, we show the weak concavity of the Belavkin–Staszewski conditional entropy and obtain a chain rule for the Belavkin–Staszewski mutual information. Finally, the subadditivity of the Belavkin–Staszewski relative entropy is established; i.e., the Belavkin–Staszewski relative entropy of a joint system is bounded above by the sum of the relative entropies of its subsystems, up to certain multiplicative and additive factors. We also provide a corresponding subadditivity result for the geometric Rényi relative entropy.


Introduction
Rényi proposed an axiomatic approach to derive the Shannon entropy, which led to a family of entropies with parameter α ∈ [0, 1) ∪ (1, ∞), called the Rényi entropy. The same axiomatic approach extends to relative entropy and yields the Rényi relative entropy [1]. Relative entropy (or Kullback-Leibler divergence [2]) is a special case of Rényi relative entropy and an important ingredient of the mathematical framework of information theory. It has operational meaning in information-theoretic tasks and can be used to describe the closeness of two random variables [3,4]. The axiomatic approach introduced by Rényi can be readily generalized to quantum settings [5,6]. Because of the noncommutativity of quantum states, there are at least three distinct ways to generalize the classical Rényi relative entropy [6][7][8][9][10], namely the Petz-Rényi relative entropy [11,12], the sandwiched Rényi relative entropy [6,13], and the geometric Rényi relative entropy [14,15]. These quantities are meaningful in different information-theoretic tasks, including source coding, hypothesis testing, state merging, and channel coding.
Quantum relative entropy arises as the limit α → 1 of both the Petz-Rényi and sandwiched Rényi relative entropies. The geometric Rényi relative entropy, however, converges to the Belavkin-Staszewski (BS) relative entropy in the same limit. It is noteworthy that both the quantum and BS relative entropies are important extensions of the classical relative entropy to quantum settings [16][17][18]. Quantum relative entropy, a direct generalization of the classical relative entropy, has been studied extensively in recent decades. BS relative entropy is also an enticing and crucial quantity for quantum information tasks: it captures the effects of the possible noncommutativity of quantum states, which the quantum relative entropy cannot capture well. Additionally, BS relative entropy has recently attracted the attention of researchers. More precisely, Katariya and Wilde employed BS relative entropy to discuss quantum channel estimation and discrimination [19], and Bluhm and Capel contributed a strengthened data-processing inequality for BS relative entropy [20]; this property was first established by Hiai and Petz [21]. Bluhm et al. produced some weak quasi-factorization results for BS relative entropy [22]. Fang and Fawzi studied quantum channel capacities with respect to the geometric Rényi relative entropy [23].
It is commonly known that von Neumann entropy, quantum conditional entropy, and quantum mutual information play vital roles in quantum information theory. Apart from the above entropic measures derived from the quantum relative entropy, other useful entropy-like quantities have also been well studied recently, such as max-information [24], collision entropy [25], and min- and max-entropies [26][27][28]. All of these information measures are generated from quantum Rényi relative entropies by taking different limits.
BS relative entropy can be seen as a fresh and forceful tool for specific quantum information-processing tasks. Concurrently, the main use of the geometric Rényi and BS relative entropies is to establish upper bounds on the rates of feedback-assisted quantum communication protocols [29]. To the best of our knowledge, there has been no systematic analysis of conditional entropy and mutual information defined from BS relative entropy. Therefore, this paper explores some basic but necessary results for BS relative entropy. More precisely, we first provide a class of new definitions of conditional entropy (called BS conditional entropy, see Definition 2) and new mutual information (called BS mutual information, see Definition 3) via BS relative entropy. Additionally, we show that the von Neumann entropy can be expressed through the BS relative entropy. Second, we establish an order relation between the BS conditional entropies of bipartite and tripartite quantum systems. Subsequently, since classical-quantum states play an essential role in quantum channel coding and classical data compression with quantum side information, we discuss some valuable properties of BS conditional entropy and BS mutual information in classical-quantum settings. We establish the weak concavity of the BS conditional entropy and obtain chain rules for the BS mutual information. Last but not least, the subadditivity of the geometric Rényi and BS relative entropies is established with the help of some multiplicative and additive factors (the factors are different linear combinations of the quantum max-relative entropy [30]); i.e., the geometric Rényi/BS relative entropy of a joint system is bounded by the sum of those of its subsystems. This paper is organized as follows. In Section 2, we present the mathematical terminology and formal definitions necessary for the formulation of our results. Our results are shown in Section 3. The paper ends with a conclusion.

Basic Notations and Definitions
We denote a finite-dimensional Hilbert space by H. Normalized quantum states form the set S_=(H) := {ρ ∈ P(H) : Tr ρ = 1}, and subnormalized states form the set S_≤(H) := {ρ ∈ P(H) : 0 < Tr ρ ≤ 1}. We use P_+(H) and P(H) to denote the sets of positive definite and positive semi-definite operators on H, respectively. The identity operator is denoted by I. The Hilbert spaces corresponding to different physical systems are distinguished by capital Latin subscripts. A compound system is modeled by the Hilbert space H_AB = H_A ⊗ H_B. For a bipartite classical-quantum system H_XB, the corresponding state ρ_XB takes the form
ρ_XB = ∑_x p(x) |x⟩⟨x| ⊗ ρ_B^x,
where {|x⟩} is an orthonormal basis of the classical system H_X, ρ_B^x is a quantum state on the quantum system H_B, and p(x) is a probability distribution with ∑_x p(x) = 1 [17,29]. We also refer to a tripartite classical-quantum state ρ_XAB = ∑_x p(x) |x⟩⟨x| ⊗ ρ_AB^x, where ρ_AB^x is a quantum state on the quantum system H_AB.
In quantum information theory, one can generalize the Rényi relative entropy to the quantum case; these quantities depend on a parameter α ∈ (0, 1) ∪ (1, ∞), and their values at α ∈ {0, 1, ∞} are obtained by taking limits. Both the Petz-Rényi relative entropy [11,12] and the sandwiched Rényi relative entropy [6,13] recover the well-known quantum relative entropy in the limit α → 1. For ρ ∈ S_=(H) and σ ∈ S_≤(H) with supp(ρ) ⊆ supp(σ), the quantum relative entropy of ρ and σ is defined as
D(ρ‖σ) := Tr[ρ(log ρ − log σ)];
otherwise, it is defined as +∞. Throughout this paper, the logarithm is taken to base 2. The quantum relative entropy is nonnegative and satisfies the data-processing inequality, which has good applications in quantum hypothesis testing and quantum resource theory [5,31,32]. We now define the geometric Rényi relative entropy [5,14,15,29].
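Before turning to the geometric quantities, the formula above can be checked numerically. The following snippet is our own illustration (function names and test states are ours; it assumes NumPy/SciPy, base-2 logarithms, and full-rank states so that matrix logarithms exist); for commuting (diagonal) states, D reduces to the classical KL divergence:

```python
import numpy as np
from scipy.linalg import logm

def quantum_relative_entropy(rho, sigma):
    """D(rho||sigma) = Tr[rho (log rho - log sigma)] in bits; assumes supp(rho) ⊆ supp(sigma)."""
    return (np.trace(rho @ (logm(rho) - logm(sigma))) / np.log(2)).real

# Diagonal (hence commuting) states: D reduces to the classical KL divergence.
rho = np.diag([0.5, 0.5])
sigma = np.diag([0.25, 0.75])
kl = 0.5 * np.log2(0.5 / 0.25) + 0.5 * np.log2(0.5 / 0.75)
print(abs(quantum_relative_entropy(rho, sigma) - kl) < 1e-10)  # classical limit check
```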
Definition 1.
For ρ ∈ S_=(H) and σ ∈ S_≤(H) with supp(ρ) ⊆ supp(σ), the geometric Rényi relative entropy is defined as
D̂_α(ρ‖σ) := (1/(α − 1)) log Tr[σ(σ^{−1/2} ρ σ^{−1/2})^α], α ∈ (0, 1) ∪ (1, ∞).
Taking the limit α → 1 yields the BS relative entropy
D̂(ρ‖σ) := Tr[ρ log(ρ^{1/2} σ^{−1} ρ^{1/2})].
Obviously, if ρ and σ commute, the BS relative entropy reduces to the quantum relative entropy. In this paper, we also employ the quantum max-relative entropy [5,6,30,34], which arises from the sandwiched Rényi relative entropy in the limit α → ∞, and is given by
D_max(ρ‖σ) := log min{λ : ρ ≤ λσ} = log‖σ^{−1/2} ρ σ^{−1/2}‖_∞.
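As a numerical sanity check of the ordering among these quantities (our own sketch; function names and the test pair are illustrative, assuming full-rank qubit states so that inverses and matrix logarithms exist), one can verify D(ρ‖σ) ≤ D̂(ρ‖σ) ≤ D_max(ρ‖σ) on a noncommuting pair, and equality of D and D̂ on a commuting pair:

```python
import numpy as np
from scipy.linalg import logm, sqrtm

def quantum_relative_entropy(rho, sigma):
    """D(rho||sigma) in bits."""
    return (np.trace(rho @ (logm(rho) - logm(sigma))) / np.log(2)).real

def bs_relative_entropy(rho, sigma):
    """BS relative entropy Tr[rho log(rho^{1/2} sigma^{-1} rho^{1/2})] in bits."""
    r = sqrtm(rho)
    return (np.trace(rho @ logm(r @ np.linalg.inv(sigma) @ r)) / np.log(2)).real

def max_relative_entropy(rho, sigma):
    """D_max(rho||sigma) = log ||sigma^{-1/2} rho sigma^{-1/2}||_inf in bits."""
    s = np.linalg.inv(sqrtm(sigma))
    return float(np.log2(np.linalg.norm(s @ rho @ s, 2)))

# A noncommuting pair of full-rank qubit states.
rho = np.array([[0.5, 0.3], [0.3, 0.5]])
sigma = np.diag([0.7, 0.3])
d = quantum_relative_entropy(rho, sigma)
b = bs_relative_entropy(rho, sigma)
m = max_relative_entropy(rho, sigma)
print(d <= b + 1e-9 and b <= m + 1e-9)  # D <= D̂ <= D_max
```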

Main Results
Once again, the Petz-/sandwiched and geometric Rényi relative entropies disagree in the limit α → 1, which leads to many differences between the BS relative entropy D̂(ρ‖σ) (generated from the geometric Rényi relative entropy) and the quantum relative entropy D(ρ‖σ) (generated from the Petz-/sandwiched Rényi relative entropy). However, both the quantum relative entropy and the BS relative entropy reduce to the von Neumann entropy for any quantum state ρ when one takes σ = I, i.e.,
S(ρ) = −D(ρ‖I) = −D̂(ρ‖I).

Belavkin-Staszewski Conditional Entropy
One of the significant properties of the relative entropy is that it generates the conditional entropy and the mutual information in information theory. The quantum relative entropy is the quantum analogue of the Kullback-Leibler divergence. In quantum mechanics there is no analogue of the joint probability distribution of two variables at different times; in other words, there is no genuine conditional quantum state with which to process quantum information tasks. Thus, one adopts the formal definition of the quantum conditional entropy [17,31], i.e.,
S(A|B)_ρ := S(ρ_AB) − S(ρ_B),
where ρ_B = Tr_A(ρ_AB) is the reduced state of the bipartite quantum state ρ_AB. The quantum conditional entropy S(A|B)_ρ can be expressed via the quantum relative entropy [5,29], i.e.,
S(A|B)_ρ = −D(ρ_AB‖I_A ⊗ ρ_B).
In fact, from the basic properties of the quantum relative entropy, the above equation has another equivalent expression [5,29],
S(A|B)_ρ = −min_{σ_B} D(ρ_AB‖I_A ⊗ σ_B),
where the minimum is taken over all sub-normalized states on H_B. Combining Equation (9) with Equation (10), we have
−D(ρ_AB‖I_A ⊗ ρ_B) = −min_{σ_B} D(ρ_AB‖I_A ⊗ σ_B).
However, from the property of Equation (7) and the definition of Equation (11), we find that, in general, the conditional entropy defined by the BS relative entropy differs from the quantum conditional entropy of Equation (11). Therefore, we define a new conditional entropy based on the BS relative entropy in the following: the so-called BS conditional entropy.

Definition 2.
For any quantum state ρ_AB ∈ S_=(H_AB), the BS conditional entropy is defined as
Ŝ(A|B)_ρ := −D̂(ρ_AB‖I_A ⊗ ρ_B).
Similar to Equation (12), we can also define the alternative BS conditional entropy, i.e.,
Ŝ_m(A|B)_ρ := −min_{σ_B} D̂(ρ_AB‖I_A ⊗ σ_B),
where σ_B ∈ S_≤(H_B). In general, the optimal state here is not necessarily ρ_B. From the relation of Equation (7), we further have
Ŝ(A|B)_ρ ≤ Ŝ_m(A|B)_ρ ≤ S(A|B)_ρ.
Additionally, if one considers the above quantities for bipartite classical-quantum systems, they coincide, as follows.
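The inequality Ŝ(A|B)_ρ ≤ S(A|B)_ρ can be checked numerically. The following sketch is ours (the partial-trace helper and the noisy Bell test state are illustrative; it assumes a full-rank ρ_AB so that matrix logarithms exist):

```python
import numpy as np
from scipy.linalg import logm, sqrtm

def bs_relative_entropy(rho, sigma):
    r = sqrtm(rho)
    return (np.trace(rho @ logm(r @ np.linalg.inv(sigma) @ r)) / np.log(2)).real

def partial_trace_A(rho_AB, dA, dB):
    # Trace out the first (A) tensor factor of a (dA*dB)x(dA*dB) density matrix.
    return np.einsum('aiaj->ij', rho_AB.reshape(dA, dB, dA, dB))

def von_neumann_entropy(rho):
    ev = np.linalg.eigvalsh(rho)
    ev = ev[ev > 1e-12]
    return float(-np.sum(ev * np.log2(ev)))

def bs_conditional_entropy(rho_AB, dA, dB):
    """Ŝ(A|B) = -D̂(rho_AB || I_A ⊗ rho_B)."""
    rho_B = partial_trace_A(rho_AB, dA, dB)
    return -bs_relative_entropy(rho_AB, np.kron(np.eye(dA), rho_B))

# Full-rank two-qubit state: a Bell state mixed with white noise.
phi = np.zeros(4); phi[0] = phi[3] = 1 / np.sqrt(2)
rho_AB = 0.7 * np.outer(phi, phi) + 0.3 * np.eye(4) / 4
s_hat = bs_conditional_entropy(rho_AB, 2, 2)
s_q = von_neumann_entropy(rho_AB) - von_neumann_entropy(partial_trace_A(rho_AB, 2, 2))
print(s_hat <= s_q + 1e-9)  # Ŝ(A|B) <= S(A|B)
```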

Lemma 1.
For any bipartite classical-quantum state ρ_XB, we have
Ŝ(B|X)_ρ = Ŝ_m(B|X)_ρ = S(B|X)_ρ.
Proof. Without loss of generality, let σ_X = ∑_x q(x)|x⟩⟨x| and ρ_X = ∑_x p(x)|x⟩⟨x|. Using the definition of Equation (1), we have
D̂(ρ_XB‖σ_X ⊗ I_B) = ∑_x p(x) log(p(x)/q(x)) + ∑_x p(x) Tr[ρ_B^x log ρ_B^x] = D(p(x)‖q(x)) − ∑_x p(x)S(ρ_B^x),
where the last equality follows from the fact that p(x)ρ_B^x commutes with q(x)I_B and Tr ρ_B^x = 1. Next, minimizing both sides of Equation (18) over σ_X, we have
Ŝ_m(B|X)_ρ = −min_{σ_X} D̂(ρ_XB‖σ_X ⊗ I_B) = ∑_x p(x)S(ρ_B^x).
The optimization over σ_X only affects the first term, which is minimized if and only if p(x) = q(x) for all x, in which case D(p(x)‖q(x)) = 0. Furthermore, combining the definitions of Equation (15) and Equation (5) with Equation (19), it holds that
Ŝ(B|X)_ρ = ∑_x p(x)S(ρ_B^x) = Ŝ_m(B|X)_ρ.
Finally, the same computation with the quantum relative entropy yields the same result for S(B|X)_ρ. We thus complete this proof.
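Lemma 1 can be verified numerically on a small example. This sketch is ours (the distribution p and the conditional states ρ_B^x are illustrative full-rank qubit states; the X system is encoded as the first tensor factor):

```python
import numpy as np
from scipy.linalg import logm, sqrtm

def bs_relative_entropy(rho, sigma):
    r = sqrtm(rho)
    return (np.trace(rho @ logm(r @ np.linalg.inv(sigma) @ r)) / np.log(2)).real

def von_neumann_entropy(rho):
    ev = np.linalg.eigvalsh(rho)
    return float(-np.sum(ev * np.log2(ev)))  # assumes full-rank input here

# Classical-quantum state rho_XB = sum_x p(x) |x><x| ⊗ rho_B^x (X first, B second).
p = np.array([0.4, 0.6])
rho0 = np.array([[0.8, 0.1], [0.1, 0.2]])
rho1 = np.array([[0.5, -0.2], [-0.2, 0.5]])
rho_XB = np.zeros((4, 4))
rho_XB[:2, :2] = p[0] * rho0
rho_XB[2:, 2:] = p[1] * rho1
rho_X = np.diag(p)

# Lemma 1: Ŝ(B|X) = -D̂(rho_XB || rho_X ⊗ I_B) equals sum_x p(x) S(rho_B^x).
lhs = -bs_relative_entropy(rho_XB, np.kron(rho_X, np.eye(2)))
rhs = p[0] * von_neumann_entropy(rho0) + p[1] * von_neumann_entropy(rho1)
print(abs(lhs - rhs) < 1e-8)
```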
Since the BS relative entropy satisfies the data-processing inequality (6), for any tripartite quantum system H_ABC it is easy to see that conditioning reduces entropy, i.e.,
Ŝ(A|BC)_ρ ≤ Ŝ(A|B)_ρ.
This property also holds for the quantum conditional entropy S(A|B)_ρ. However, it is not always true that S(A|B)_ρ ≤ S(AC|B)_ρ (Problem 11.3(2) in [17]). For the BS conditional entropy, this paper provides another result in the following.

Lemma 2.
For any tripartite quantum state ρ_ABC ∈ S_=(H_ABC), we have
Ŝ(AC|B)_ρ ≤ Ŝ(C|B)_ρ + log d_A,
where d_A is the dimension of the subsystem H_A.
Proof. Since the BS relative entropy satisfies the data-processing inequality (Corollary 4.53 in [29]) and the partial trace is a quantum channel [5], we have
D̂(ρ_CB‖Tr_A(I_AC ⊗ ρ_B)) ≤ D̂(ρ_ACB‖I_AC ⊗ ρ_B).
Applying the additivity of the BS relative entropy (Proposition 4.54, [29]) together with Tr_A(I_AC ⊗ ρ_B) = d_A(I_C ⊗ ρ_B), we then have
D̂(ρ_CB‖I_C ⊗ ρ_B) − log d_A ≤ D̂(ρ_ACB‖I_AC ⊗ ρ_B).
Recalling the definition of the BS conditional entropy, this gives
Ŝ(AC|B)_ρ ≤ Ŝ(C|B)_ρ + log d_A.
Similarly, for the alternative BS conditional entropy Ŝ_m(AC|B)_ρ, taking the minimization over σ_B yields the same bound.
The quantum conditional entropy of Equation (10) also satisfies concavity, which plays an important role in quantum information processing [5,17,31]. Additionally, for the BS conditional entropy of a tripartite classical-quantum state, we obtain the following result.

Theorem 1.
For any tripartite classical-quantum state ρ_XAB, we have
∑_x p(x)Ŝ(A_x|B)_ρ + H(X) ≤ Ŝ(A|B)_ρ + log d_X,
and the analogous inequality for Ŝ_m, where H(X) is the Shannon entropy and Ŝ(A_x|B)_ρ is the BS conditional entropy for the quantum state ρ_AB^x.
Proof. For any tripartite classical-quantum state ρ_XAB, we have
D̂(ρ_XAB‖I_XA ⊗ ρ_B) = ∑_x p(x) log p(x) + ∑_x p(x)D̂(ρ_AB^x‖I_A ⊗ ρ_B),
since ρ_XAB is block diagonal with blocks p(x)ρ_AB^x. For the first term of the right-hand side, we have
∑_x p(x) log p(x) = −H(X),
where we use the fact that Tr ρ_AB^x = 1. Similarly, for the second term, we further have
−∑_x p(x)D̂(ρ_AB^x‖I_A ⊗ ρ_B) = ∑_x p(x)Ŝ(A_x|B)_ρ,
where Ŝ(A_x|B)_ρ is the BS conditional entropy for the quantum state ρ_AB^x. Therefore, using the definition of the BS conditional entropy Ŝ(XA|B)_ρ, we can obtain
Ŝ(XA|B)_ρ = H(X) + ∑_x p(x)Ŝ(A_x|B)_ρ.
Applying Lemma 2, we further have
Ŝ(XA|B)_ρ ≤ Ŝ(A|B)_ρ + log d_X.
Substituting Equation (27) into the above inequality (28), we obtain the first inequality of Theorem 1.
For the alternative BS conditional entropy, we replace ρ_B with σ_B and argue in the same way; i.e., we take the optimization over σ_B in the above equality and combine it with the definition of Equation (15). We then obtain the desired result.
From the above results, we see that there are two additional terms, H(X) and log d_X, one on each side of the inequality (24), which distinguishes it from the concavity of the quantum conditional entropy S(A|B)_ρ. To mark this distinction, we call the statement of Theorem 1 the weak concavity of the BS conditional entropy.
Combining the above fact with Theorem 1, we can establish the relationship between Ŝ(A|XB)_ρ and Ŝ(XA|B)_ρ. Using the direct sum property of the BS relative entropy (Proposition 4.54 in [29]), we have
Ŝ(A|XB)_ρ = −∑_x p(x)D̂(ρ_AB^x‖I_A ⊗ ρ_B^x).
We cannot determine the order relation between ρ_B^x and ρ_B in general, but one can always compare D̂(ρ_AB^x‖I_A ⊗ ρ_B^x) with D̂(ρ_AB^x‖I_A ⊗ ρ_B). For the case where the former is less than the latter for all x, we have
Ŝ(A|XB)_ρ ≥ −∑_x p(x)D̂(ρ_AB^x‖I_A ⊗ ρ_B) = ∑_x p(x)Ŝ(A_x|B)_ρ.
Applying Equation (26), we can obtain
Ŝ(A|XB)_ρ ≥ Ŝ(XA|B)_ρ − H(X).
In addition, as a special case of the inequality (20), we can easily obtain that Ŝ(A|XB)_ρ ≤ Ŝ(A|B)_ρ. Therefore, it holds that
Ŝ(XA|B)_ρ − H(X) ≤ Ŝ(A|XB)_ρ ≤ Ŝ(A|B)_ρ.
Otherwise, for the case of D̂(ρ_AB^x‖I_A ⊗ ρ_B^x) ≥ D̂(ρ_AB^x‖I_A ⊗ ρ_B) for all x, we then obtain
Ŝ(A|XB)_ρ ≤ Ŝ(XA|B)_ρ − H(X).
For the quantum conditional entropy of Equation (10), we know that a bipartite pure state is entangled if and only if S(A|B) < 0. Here, we are also interested in the BS conditional entropy. Without loss of generality, let |ψ⟩_AB = ∑_i λ_i|i⟩_A|i⟩_B be any bipartite pure state, where the λ_i are non-negative real numbers satisfying ∑_i λ_i^2 = 1, known as Schmidt coefficients, and {|i⟩_A} and {|i⟩_B} are orthonormal states for A and B, respectively. The number of non-zero values λ_i is called the Schmidt number of the pure state |ψ⟩_AB [17]. We have
Ŝ(A|B)_ψ = −log r,
where r is the Schmidt number of |ψ⟩. We remark that a bipartite pure state is entangled if its Schmidt number is greater than 1.
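The pure-state value −log r follows from D̂(|ψ⟩⟨ψ|‖I_A ⊗ ρ_B) = log⟨ψ|(I_A ⊗ ρ_B)^{−1}|ψ⟩, since ρ^{1/2} = ρ for a rank-one projector. A small worked check (ours; the Schmidt coefficients are illustrative):

```python
import numpy as np

# Schmidt decomposition |psi> = sum_i lam_i |i>_A |i>_B on a 2x2 system.
lam2 = np.array([0.6, 0.4])        # squared Schmidt coefficients
psi = np.zeros(4)
psi[0] = np.sqrt(lam2[0])          # |00>
psi[3] = np.sqrt(lam2[1])          # |11>
rho_B = np.diag(lam2)

# <psi|(I_A ⊗ rho_B)^{-1}|psi> = sum_i lam_i^2 / lam_i^2 = r, the Schmidt number.
sigma_inv = np.kron(np.eye(2), np.linalg.inv(rho_B))
c = psi @ sigma_inv @ psi
print(np.isclose(c, 2.0))                 # Schmidt number r = 2
print(np.isclose(-np.log2(c), -1.0))      # Ŝ(A|B)_psi = -log r = -1
```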

Belavkin-Staszewski Mutual Information
The quantum mutual information is another important measure in quantum information theory; it describes the total correlations between bipartite quantum subsystems and has important applications in quantum channel capacity, quantum cryptography, and quantum thermodynamics [17,35]. Based on the properties of the quantum relative entropy, there are four equal definitions of the quantum mutual information, i.e.,
I(A;B)_ρ = D(ρ_AB‖ρ_A ⊗ ρ_B) = min_{σ_B} D(ρ_AB‖ρ_A ⊗ σ_B) = min_{σ_A} D(ρ_AB‖σ_A ⊗ ρ_B) = min_{σ_A,σ_B} D(ρ_AB‖σ_A ⊗ σ_B),
where the minima are taken over all density operators σ_A and σ_B on the quantum systems H_A and H_B, respectively. However, for other general relative entropies these equalities do not hold, as in the case of max-information [24]. In this section, we consider a new mutual information via the BS relative entropy. Similar to Equation (35), we define four different BS mutual information terms as follows.

Definition 3.
For any quantum state ρ_AB ∈ S_=(H_AB), the BS mutual information terms are defined as
Î_1(A;B)_ρ := D̂(ρ_AB‖ρ_A ⊗ ρ_B),
Î_2(A;B)_ρ := min_{σ_B} D̂(ρ_AB‖ρ_A ⊗ σ_B),
Î'_2(A;B)_ρ := min_{σ_A} D̂(ρ_AB‖σ_A ⊗ ρ_B),
Î_3(A;B)_ρ := min_{σ_A,σ_B} D̂(ρ_AB‖σ_A ⊗ σ_B).
Notice that Î_2(A;B)_ρ and Î'_2(A;B)_ρ can be thought of as swapping the roles of the optimization operators σ_A and σ_B, so we consider only one of them. Intuitively, the remaining three BS mutual information terms Î_i(A;B)_ρ decrease with i, i.e.,
Î_3(A;B)_ρ ≤ Î_2(A;B)_ρ ≤ Î_1(A;B)_ρ.
Additionally, recalling the inequality (7), we see that the BS mutual information is never less than the quantum mutual information, i.e.,
Î_i(A;B)_ρ ≥ I(A;B)_ρ, i = 1, 2, 3.
From the monotonicity of the BS relative entropy, it follows that discarding quantum systems does not increase the BS mutual information, i.e.,
Î_1(A;BC)_ρ ≥ Î_1(A;B)_ρ.
Subsequently, for the quantum mutual information (35), it holds that
I(A;B)_ρ = S(ρ_A) − S(A|B)_ρ = S(ρ_B) − S(B|A)_ρ.
The above two relations are called chain rules for the quantum mutual information. A chain rule can be regarded as a 'bridge' between conditional entropy and mutual information in information theory. We are also interested in exploring chain rules for the BS mutual information for the bipartite classical-quantum system H_XB (the fully quantum scenario needs further discussion and remains open). It is well-known that a classical-quantum state possesses only classical correlation and no quantum correlation, which leads us to some meaningful and interesting results. We first give the following result before discussing the chain rules for the BS mutual information.
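The relation Î_1(A;B)_ρ ≥ I(A;B)_ρ ≥ 0 can be illustrated numerically. This sketch is ours (function names and the noisy Bell test state are illustrative; it assumes a full-rank ρ_AB):

```python
import numpy as np
from scipy.linalg import logm, sqrtm

def bs_relative_entropy(rho, sigma):
    r = sqrtm(rho)
    return (np.trace(rho @ logm(r @ np.linalg.inv(sigma) @ r)) / np.log(2)).real

def von_neumann_entropy(rho):
    ev = np.linalg.eigvalsh(rho)
    ev = ev[ev > 1e-12]
    return float(-np.sum(ev * np.log2(ev)))

# Noisy Bell state; both marginals are maximally mixed.
phi = np.zeros(4); phi[0] = phi[3] = 1 / np.sqrt(2)
rho_AB = 0.7 * np.outer(phi, phi) + 0.3 * np.eye(4) / 4
rho_A = rho_B = np.eye(2) / 2

I1_hat = bs_relative_entropy(rho_AB, np.kron(rho_A, rho_B))      # Î_1(A;B)
I_q = (von_neumann_entropy(rho_A) + von_neumann_entropy(rho_B)
       - von_neumann_entropy(rho_AB))                            # I(A;B)
print(I1_hat >= I_q - 1e-9 and I_q >= -1e-9)  # Î_1 >= I >= 0
```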

Lemma 3.
For any bipartite classical-quantum state ρ_XB, we have Î_3(X;B)_ρ = Î_2(X;B)_ρ = min_{σ_B} ∑_x p(x)D̂(ρ_B^x‖σ_B), and Î_1(X;B)_ρ = ∑_x p(x)D̂(ρ_B^x‖ρ_B).

Proof.
The proof is similar to that of Lemma 1, so we omit some calculation steps. We first consider Î_3(X;B)_ρ. Let σ_X = ∑_x q(x)|x⟩⟨x|. For any quantum state σ_B, we have
D̂(ρ_XB‖σ_X ⊗ σ_B) = D(p(x)‖q(x)) + ∑_x p(x)D̂(ρ_B^x‖σ_B).
Taking the minimum of both sides of Equation (45) over σ_X and σ_B, respectively, we then have
Î_3(X;B)_ρ = min_{σ_B} ∑_x p(x)D̂(ρ_B^x‖σ_B).
The last equality follows from the fact that the relative entropy D(p(x)‖q(x)) is nonnegative; i.e., min_{σ_X} D(p(x)‖q(x)) = 0, attained if and only if p(x) = q(x) for all x. For Î_2(X;B)_ρ, only the optimization over σ_B is required by its definition, so we directly have
Î_2(X;B)_ρ = min_{σ_B} ∑_x p(x)D̂(ρ_B^x‖σ_B) = Î_3(X;B)_ρ.
Notice that the BS mutual information Î_1(X;B)_ρ does not involve any optimization, so we have
Î_1(X;B)_ρ = ∑_x p(x)D̂(ρ_B^x‖ρ_B).
This result shows that the BS mutual information Î_1(X;B)_ρ is identical in form to the quantum mutual information I(X;B)_ρ, the latter being the well-known Holevo information. In addition, if we consider the tripartite classical-quantum state, we can also obtain a sum form of the BS mutual information for i = 1, i.e.,
Î_1(XA;B)_ρ = ∑_x p(x)Î_1(A_x;ρ_B),
where Î_1(A_x;ρ_B) is the BS mutual information between the quantum states ρ_AB^x and ρ_A^x ⊗ ρ_B. The other cases of the BS mutual information are similar, so we do not go into detail. Based on the above results, we obtain the chain rules for the BS mutual information for bipartite classical-quantum states as follows.
Theorem 2.
For any bipartite classical-quantum state ρ_XB, we have
Î_3(X;B)_ρ = Î_2(X;B)_ρ = S(ρ_X) − Ŝ_m(X|B)_ρ.
Proof. The proof of this theorem is similar to that of Theorem 1, so we omit some of the repetition. For any bipartite classical-quantum state ρ_XB, we have
D̂(ρ_XB‖I_X ⊗ σ_B) = ∑_x p(x) log p(x) + ∑_x p(x)D̂(ρ_B^x‖σ_B).
Employing the definition of Equation (15), we further have
Ŝ_m(X|B)_ρ = H(X) − min_{σ_B} ∑_x p(x)D̂(ρ_B^x‖σ_B).
Applying Lemma 3 to Equation (50), we can then complete the proof.
Similarly, for Î_1(X;B)_ρ, we give a chain rule with respect to the BS conditional entropy Ŝ(X|B)_ρ as follows.

Corollary 1.
For any bipartite classical-quantum state ρ_XB ∈ S_≤(H_XB), we have
Î_1(X;B)_ρ = S(ρ_X) − Ŝ(X|B)_ρ.
Proof. From Definition 2, we can directly obtain that
Ŝ(X|B)_ρ = −D̂(ρ_XB‖I_X ⊗ ρ_B) = H(X) − D̂(ρ_XB‖ρ_X ⊗ ρ_B).
Combining Equation (46) with Equation (52), we then obtain the desired result.
Recalling the Holevo information and applying the result of Lemma 1, we further have
χ = S(ρ_B) − ∑_x p(x)S(ρ_B^x) = S(ρ_B) − Ŝ(B|X)_ρ.
This result shows that the BS conditional entropy with classical side information can also be used to describe the Holevo information. In addition, employing the inequality (41), we have
Î_3(X;B)_ρ ≥ I(X;B)_ρ = S(ρ_X) − S(X|B)_ρ.
Comparing this to Theorem 2, we find that, when the side information is classical, the equality in the chain rule for Î_3(X;B)_ρ does not hold in general.
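Numerically, the Holevo-type form Î_1(X;B)_ρ = ∑_x p(x)D̂(ρ_B^x‖ρ_B) and the bound Î_1 ≥ χ can be checked on a small ensemble. This sketch is ours (the ensemble {p(x), ρ_B^x} is illustrative and full rank):

```python
import numpy as np
from scipy.linalg import logm, sqrtm

def bs_relative_entropy(rho, sigma):
    r = sqrtm(rho)
    return (np.trace(rho @ logm(r @ np.linalg.inv(sigma) @ r)) / np.log(2)).real

def von_neumann_entropy(rho):
    ev = np.linalg.eigvalsh(rho)
    return float(-np.sum(ev * np.log2(ev)))

# Classical-quantum state rho_XB (X first), with full-rank conditional states.
p = np.array([0.4, 0.6])
rho0 = np.array([[0.8, 0.1], [0.1, 0.2]])
rho1 = np.array([[0.5, -0.2], [-0.2, 0.5]])
rho_B = p[0] * rho0 + p[1] * rho1
rho_XB = np.zeros((4, 4))
rho_XB[:2, :2] = p[0] * rho0
rho_XB[2:, 2:] = p[1] * rho1

I1_hat = bs_relative_entropy(rho_XB, np.kron(np.diag(p), rho_B))
direct = p[0] * bs_relative_entropy(rho0, rho_B) + p[1] * bs_relative_entropy(rho1, rho_B)
holevo = (von_neumann_entropy(rho_B)
          - p[0] * von_neumann_entropy(rho0) - p[1] * von_neumann_entropy(rho1))
print(np.isclose(I1_hat, direct) and I1_hat >= holevo - 1e-9)
```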

Subadditivity for the BS Relative Entropy
It is necessary to study the relationship of entropic measures between a joint system and its subsystems, which plays a vital role in estimating channel capacity bounds and analyzing error exponents. The quantum relative entropy satisfies subadditivity and superadditivity, both of which are fundamental properties [31]. More precisely, for any bipartite quantum state ρ_AB and a product state σ_A ⊗ σ_B, the superadditivity of the quantum relative entropy reads
D(ρ_AB‖σ_A ⊗ σ_B) ≥ D(ρ_A‖σ_A) + D(ρ_B‖σ_B).
This was extended to a more general setting [36]. This paper does not determine whether the BS relative entropy satisfies the same property. However, the following result shows an opposite relationship for the BS relative entropy, i.e., subadditivity. We first give an equivalent expression of the BS relative entropy for obtaining the desired result.

Lemma 4.
For any quantum state ρ ∈ S_=(H) and σ ∈ S_≤(H) with supp(ρ) ⊆ supp(σ), we have
D̂(ρ‖σ) = Tr[σ η(σ^{−1/2} ρ σ^{−1/2})], where η(x) := x log x.
Proof. Let Π_σ be the projection onto the support of σ. From supp(ρ) ⊆ supp(σ), one obtains ρ = ρΠ_σ = Π_σρ. From Equation (5), we thus have
D̂(ρ‖σ) = Tr[ρ log(ρ^{1/2} σ^{−1} ρ^{1/2})],
where the equality uses the fact that Π_σ = σ^{1/2} σ^{−1/2}. Employing Lemma 2.6 in [29] and the cyclic property of the trace, the square roots of ρ can be moved through the logarithm, and since η(x) = x log x, the desired result follows by applying the cyclic property of the trace again.
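Both expressions equal Tr[ρ log(σ^{−1}ρ)] by a similarity transformation inside the logarithm, and their agreement can be checked numerically. This sketch is ours (the test pair is an illustrative full-rank noncommuting pair):

```python
import numpy as np
from scipy.linalg import logm, sqrtm

rho = np.array([[0.5, 0.3], [0.3, 0.5]])
sigma = np.array([[0.7, 0.1], [0.1, 0.3]])

# Definition (5): D̂(rho||sigma) = Tr[rho log(rho^{1/2} sigma^{-1} rho^{1/2})].
r = sqrtm(rho)
lhs = np.trace(rho @ logm(r @ np.linalg.inv(sigma) @ r)).real / np.log(2)

# Lemma 4 form: D̂(rho||sigma) = Tr[sigma X log X] with X = sigma^{-1/2} rho sigma^{-1/2}.
s = np.linalg.inv(sqrtm(sigma))
X = s @ rho @ s
rhs = np.trace(sigma @ X @ logm(X)).real / np.log(2)
print(np.isclose(lhs, rhs))
```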

Theorem 3. For any quantum state ρ_AB ∈ S_=(H_AB) and states σ_A ∈ S_≤(H_A) and σ_B ∈ S_≤(H_B) with supp(ρ_A) ⊆ supp(σ_A) and supp(ρ_B) ⊆ supp(σ_B), the BS relative entropy of the joint system is bounded above by the sum of the BS relative entropies of the subsystems, up to multiplicative and additive factors given by linear combinations of quantum max-relative entropies.
For any quantum states σ_A and σ_B satisfying supp(ρ_A) ⊆ supp(σ_A) and supp(ρ_B) ⊆ supp(σ_B), respectively, applying Lemma 4, we can expand D̂(ρ_AB‖σ_A ⊗ σ_B). Employing the basic operator inequalities of Lemma 2.13 in [29], we obtain an operator upper bound; substituting this bound into Equation (61) then yields the claimed inequality, where the equality step follows from the linearity of the trace and the last inequality holds by the definition of the quantum max-relative entropy. Similarly, one obtains the bound with the roles of A and B exchanged. As mentioned above, the geometric Rényi relative entropy converges to the BS relative entropy in the limit α → 1. More generally, we also provide an upper bound for the geometric Rényi relative entropy.

Theorem 4. For any quantum state ρ_AB ∈ S_=(H_AB) and states σ_A ∈ S_≤(H_A) and σ_B ∈ S_≤(H_B) with supp(ρ_A) ⊆ supp(σ_A) and supp(ρ_B) ⊆ supp(σ_B), the geometric Rényi relative entropy satisfies an analogous subadditivity relation.
For σ_A and σ_B with supp(ρ_A) ⊆ supp(σ_A) and supp(ρ_B) ⊆ supp(σ_B), respectively, employing the relation of Equation (4.6.21) in [29] and combining the resulting bound with Definition 1, we obtain the claimed inequality. Similarly, one can define a new mutual information term via the geometric Rényi relative entropy as
Î_α(A;B)_ρ := D̂_α(ρ_AB‖ρ_A ⊗ ρ_B).
It is then easy to draw the following conclusion.

Corollary 2.
For any quantum state ρ_AB ∈ S_=(H_AB), σ_A ∈ S_≤(H_A), or σ_B ∈ S_≤(H_B), the mutual information Î_α(A;B)_ρ satisfies the corresponding subadditivity bound. Notably, if one considers the classical-quantum state, there is no result analogous to Lemma 3 for the mutual information defined by the geometric Rényi relative entropy. Specifically, for α ∈ (0, 1), the geometric Rényi relative entropy of a classical-quantum state is bounded above by the corresponding weighted sum of its block terms, where the inequality comes from Jensen's inequality applied to −log t. For α ∈ (1, ∞), we obtain the opposite inequality. Furthermore, if one considers the conditional entropy defined by the geometric Rényi relative entropy, the analogous one-sided bounds hold in the two parameter ranges α ∈ (0, 1) and α ∈ (1, ∞).
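The convergence of the geometric Rényi relative entropy to the BS relative entropy as α → 1, and its monotonicity in α, can be checked numerically. This sketch is ours (function names and the test pair are illustrative; α = 1.001 stands in for the limit):

```python
import numpy as np
from scipy.linalg import logm, sqrtm, fractional_matrix_power

def geometric_renyi(rho, sigma, alpha):
    """D̂_alpha(rho||sigma) = (1/(alpha-1)) log Tr[sigma (sigma^{-1/2} rho sigma^{-1/2})^alpha], in bits."""
    s = np.linalg.inv(sqrtm(sigma))
    X = s @ rho @ s
    return float(np.log2(np.trace(sigma @ fractional_matrix_power(X, alpha)).real) / (alpha - 1))

def bs_relative_entropy(rho, sigma):
    r = sqrtm(rho)
    return (np.trace(rho @ logm(r @ np.linalg.inv(sigma) @ r)) / np.log(2)).real

rho = np.array([[0.5, 0.3], [0.3, 0.5]])
sigma = np.array([[0.7, 0.1], [0.1, 0.3]])
print(abs(geometric_renyi(rho, sigma, 1.001) - bs_relative_entropy(rho, sigma)) < 1e-2)
```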

Conclusions
This paper investigates the subadditivity of the geometric Rényi and BS relative entropies and explores the indispensable properties of the BS conditional entropy and mutual information, especially in classical-quantum settings. The subadditivity of the geometric Rényi and BS relative entropies can provide valuable new bounds for estimating channel capacity and analyzing error exponents. As mentioned above, the BS relative entropy represents a different quantum generalization of the classical relative entropy. The main use of the BS relative entropy is in establishing upper bounds on the rates of feedback-assisted quantum communication protocols. The primary goal of further research on BS relative entropy is to explore the intrinsic properties of its associated conditional entropy and mutual information and to gain a better understanding of their operational relevance. We hope that the formal tools provided in this paper will be useful for this purpose.
One question worth answering is whether there is a chain rule for the mutual information in terms of the geometric Rényi relative entropy, i.e., whether
Î_α(A;B)_ρ = Ŝ_α(ρ_A) − Ŝ_α(A|B)_ρ,
or one of its other forms, holds, where Ŝ_α(ρ_A) is the quantum Rényi entropy. Subsequently, the duality of conditional entropy is an important property for tripartite pure-state systems, which can be effectively applied in random number extraction and channel coding [26]. Further research will focus on the duality of the BS conditional entropy.