On the Classical Capacity of General Quantum Gaussian Measurement

In this paper, we consider the classical capacity problem for Gaussian measurement channels. We establish Gaussianity of the average state of the optimal ensemble in the general case and discuss the Hypothesis of Gaussian Maximizers concerning the structure of the ensemble. Then, we consider the case of one mode in detail, including the dual problem of accessible information of a Gaussian ensemble. Our findings are relevant to practical situations in quantum communications where the receiver is Gaussian (say, a general-dyne detection) and concatenation of the Gaussian channel and the receiver can be considered as one Gaussian measurement channel. Our efforts in this and preceding papers are then aimed at establishing full Gaussianity of the optimal ensemble (usually taken as an assumption) in such schemes.


Introduction
From the viewpoint of information theory, measurements are hybrid communication channels that transform input quantum states into classical output data. As such, they are described by the classical information capacity, which is the most fundamental quantity characterizing their ultimate information-processing performance [1][2][3][4]. Channels with continuous output, such as bosonic Gaussian measurements, do not admit a direct embedding into properly quantum channels and, hence, require separate treatment. In particular, their output entropy is the Shannon differential entropy instead of the quantum entropy, which completely changes the pattern of the capacity formulas. The classical capacity of multimode Gaussian measurement channels was computed in Reference [5] under the so-called threshold condition (which includes phase-insensitive or gauge-covariant channels as a special case [6]). The essence of this condition is that it reduces the classical capacity problem to the minimum output differential entropy problem solved in Reference [7] (in the context of quantum Gaussian channels, a similar condition was introduced and studied in References [8,9]; also see references therein).
In this paper, we approach the classical capacity problem for Gaussian measurement channels without imposing any kind of threshold condition. In particular, in the framework of quantum communication, this means that both (noisy) heterodyne and (noisy/noiseless) homodyne measurements [10,11] are treated from a common viewpoint. We prove Gaussianity of the average state of the optimal ensemble in general and discuss the Hypothesis of Gaussian Maximizers (HGM) concerning the structure of the ensemble. The proof uses the approach of the paper of Wolf, Giedke, and Cirac [12] applied to the convex closure of the output differential entropy. Then, we discuss the case of one mode in detail, including the dual problem of accessible information of a Gaussian ensemble.
In quantum communications, there are several studies of the classical capacity in the transmission scheme where not only the Gaussian channel but also the receiver is fixed, and the optimization is performed over a certain set of input ensembles (see References [10,13,14,15] and references therein). These studies are practically important in view of the greater complexity of the optimal receiver in the Quantum Channel Coding (HSW) theorem (see, e.g., Reference [16]). Our findings are relevant to such a situation where the receiver is Gaussian and the concatenation of the channel and the receiver can be considered as one Gaussian measurement channel. Our efforts in this and preceding papers are then aimed at establishing full Gaussianity of the optimal ensemble (usually taken as a key assumption) in such schemes.

The Measurement Channel and Its Classical Capacity
An ensemble E = {π(dx), ρ(x)} consists of a probability measure π(dx) on a standard measurable space X and a measurable family of density operators (quantum states) x → ρ(x) on the Hilbert space H of the quantum system. The average state of the ensemble is the barycenter of this measure,
$$\bar\rho_{\mathcal{E}}=\int_{\mathcal{X}}\rho(x)\,\pi(dx),$$
the integral existing in the strong sense in the Banach space of trace-class operators on H.
Let M = {M(dy)} be an observable (POVM) on H with outcome standard measurable space Y. There exists a σ-finite measure µ(dy) such that, for any density operator ρ, the probability measure Tr ρM(dy) is absolutely continuous w.r.t. µ(dy), thus having the probability density p_ρ(y) (one can take µ(dy) = Tr ρ_0 M(dy), where ρ_0 is a nondegenerate density operator). The affine map M : ρ → p_ρ(·) will be called the measurement channel.
The joint probability distribution of x, y on X × Y is uniquely defined by the relation
$$P(A\times B)=\int_{A}\operatorname{Tr}\rho(x)\,M(B)\,\pi(dx),$$
where A is an arbitrary Borel subset of X and B is that of Y. The classical Shannon information between x, y is equal to
$$I(\mathcal{E},M)=\iint P(dx\,dy)\,\log\frac{P(dx\,dy)}{\pi(dx)\,P(dy)}.$$
In what follows, we will consider POVMs having (uniformly) bounded operator density, M(dy) = m(y)µ(dy), with ‖m(y)‖ ≤ b, so that the probability densities p_ρ(y) = Tr ρ m(y) are uniformly bounded, 0 ≤ p_ρ(y) ≤ b. (The probability densities corresponding to the Gaussian observables we will be dealing with possess this property.) Moreover, without loss of generality [6], we can assume b = 1. Then, the output differential entropy
$$h_M(\rho)=-\int p_\rho(y)\log p_\rho(y)\,\mu(dy) \tag{1}$$
is well defined with values in [0, +∞] (see Reference [6] for the details). The output differential entropy is a concave, lower semicontinuous (w.r.t. trace norm) functional of the density operator ρ. The concavity follows from the fact that the function p → −p log p, p ∈ [0, 1], is concave. Lower semicontinuity follows by an application of the Fatou–Lebesgue lemma from the fact that this function is nonnegative and continuous and that |p_ρ(y) − p_σ(y)| ≤ ‖ρ − σ‖₁. Next, we define the convex closure of the output differential entropy (1):
$$e_M(\rho)=\inf_{\pi:\,\bar\rho_\pi=\rho}\int h_M(\rho(x))\,\pi(dx), \tag{2}$$
which is the "measurement channel analog" of the convex closure of the output entropy for a quantum channel [17].
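As a small numerical sketch (with illustrative parameters, not taken from the text): the output differential entropy (1) for a Gaussian output density N(0, σ²) can be evaluated on a grid and compared with the closed form ½ log(2πeσ²) (natural logarithms, entropy in nats).

```python
import numpy as np

# Numerical sketch of the output differential entropy (1):
# h = -Integral p(y) log p(y) dy for a Gaussian density N(0, sigma2),
# compared with the closed form (1/2) log(2*pi*e*sigma2).

def differential_entropy(p, dy):
    """Riemann sum of -p log p on a uniform grid, skipping p = 0 points."""
    mask = p > 0
    return -np.sum(p[mask] * np.log(p[mask])) * dy

sigma2 = 1.7                                    # illustrative output variance
y, dy = np.linspace(-40, 40, 400001, retstep=True)
p = np.exp(-y**2 / (2 * sigma2)) / np.sqrt(2 * np.pi * sigma2)

h_num = differential_entropy(p, dy)
h_exact = 0.5 * np.log(2 * np.pi * np.e * sigma2)
assert abs(h_num - h_exact) < 1e-6
```

The grid here is wide enough that the truncated Gaussian tails are numerically negligible.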

Lemma 1.
The functional e_M(ρ) is convex, lower semicontinuous and strongly superadditive:
$$e_{M_1\otimes M_2}(\rho_{12})\ge e_{M_1}(\rho_1)+e_{M_2}(\rho_2). \tag{3}$$
As is well known, the property (3) along with the definition (2) implies additivity: if ρ_12 = ρ_1 ⊗ ρ_2, then
$$e_{M_1\otimes M_2}(\rho_{12})=e_{M_1}(\rho_1)+e_{M_2}(\rho_2). \tag{4}$$
Proof. The lower semicontinuity follows from the similar property of the output differential entropy much in the same way as in the case of quantum channels, treated in Reference [17], Proposition 4; also see Reference [18], Proposition 1. Let us prove strong superadditivity. Let
$$\rho_{12}=\int \rho_{12}(x)\,\pi(dx) \tag{5}$$
be a decomposition of a density operator ρ_12 on H_1 ⊗ H_2. Denoting by p_x(y_1, y_2) the joint output density of M_1 ⊗ M_2 on ρ_12(x), the chain rule for differential entropy gives
$$h_{M_1\otimes M_2}(\rho_{12}(x))=h_{M_1}(\rho_1(x))+\int p_x(y_1)\,h_{M_2}(\rho_2(x,y_1))\,\mu_1(dy_1),$$
where ρ_1(x) is the first partial state of ρ_12(x) and ρ_2(x, y_1) is the posterior partial state of the second system given the outcome y_1. Since {π(dx) p_x(y_1) µ_1(dy_1), ρ_2(x, y_1)} is a decomposition of ρ_2, integrating over π(dx) yields
$$\int h_{M_1\otimes M_2}(\rho_{12}(x))\,\pi(dx)\ge e_{M_1}(\rho_1)+e_{M_2}(\rho_2),$$
whence, taking the infimum over decompositions (5), we obtain (3).
Let H be a Hamiltonian in the Hilbert space H of the quantum system and E a positive number. Then, the energy-constrained classical capacity of the channel M is equal to
$$C(M,H,E)=\sup_{\mathcal{E}}I(\mathcal{E},M), \tag{6}$$
where the maximization is over the input ensembles of states E satisfying the energy constraint
$$\operatorname{Tr}\bar\rho_{\mathcal{E}}H\le E, \tag{7}$$
as shown in Reference [5], Proposition 1.
Note that the measurement channel is entanglement-breaking [16]; hence, its classical capacity is additive and is given by the one-shot expression (6). By using (7) and (2), we obtain
$$C(M,H,E)=\sup_{\rho:\,\operatorname{Tr}\rho H\le E}\left[h_M(\rho)-e_M(\rho)\right].$$

Gaussian Maximizers for Multimode Bosonic Gaussian Observable
Consider now a multimode bosonic Gaussian system with the quadratic Hamiltonian H = RεR^t, where ε > 0 is the energy matrix and R = [q_1, p_1, . . . , q_s, p_s] is the row vector of the bosonic position–momentum observables satisfying the canonical commutation relations with the commutation matrix Δ (see, e.g., References [11,16]). This describes the quantization of a linear classical system with s degrees of freedom, such as a finite number of physically relevant electromagnetic modes on the receiver's aperture in quantum optics. From now on, we will consider only states with finite second moments. By S(α), we denote the set of all states ρ with the fixed correlation matrix
$$\alpha=\operatorname{Re}\operatorname{Tr}\rho\,R^{t}R.$$
For centered states (i.e., states with vanishing first moments), the covariance matrix and the matrix of second moments coincide. We denote by ρ_α the centered Gaussian state with the correlation matrix α. The energy constraint reduces to
$$\operatorname{Sp}\alpha\varepsilon\le E. \tag{9}$$
(We denote by Sp the trace of matrices as distinct from the trace of operators on H.) For a fixed correlation matrix α, we will study the α-constrained capacity
$$C(M;\alpha)=\sup_{\mathcal{E}:\,\bar\rho_{\mathcal{E}}\in S(\alpha)}I(\mathcal{E},M). \tag{10}$$
With the Hamiltonian H = RεR^t, the energy-constrained classical capacity of the observable M is
$$C(M,H,E)=\max_{\alpha:\,\operatorname{Sp}\alpha\varepsilon\le E}C(M;\alpha).$$
We will be interested in the approximate position–momentum measurement (observable, POVM)
$$M(d^{2s}z)=D(z)\,\rho_\beta\,D(z)^{*}\,\frac{d^{2s}z}{(2\pi)^{s}}, \tag{11}$$
where ρ_β is a centered Gaussian density operator with the covariance matrix β and D(z) are the unitary displacement operators. Thus, µ(dz) = d^{2s}z/(2π)^s, and the operator-valued density of the POVM (11) is m(z) = D(z)ρ_β D(z)^*. In quantum optics, some authors [11,19] call such measurements (noisy) general-dyne detections.
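A brief numerical illustration in the notation above (a sketch, dropping the additive constant c that comes from the normalization of µ): the general-dyne POVM (11) maps a centered Gaussian state ρ_α to the classical normal density N(0, α + β), so its output differential entropy is ½ log det(α + β) up to the constant.

```python
import numpy as np

# Sketch: output differential entropy of the general-dyne measurement (11) on a
# centered Gaussian state rho_alpha, up to the additive constant c:
# h_M(rho_alpha) = (1/2) log det(alpha + beta) + c.

def output_entropy(alpha, beta):
    return 0.5 * np.log(np.linalg.det(alpha + beta))

# One mode: vacuum state (alpha = I/2) and ideal heterodyne noise (beta = I/2);
# det(alpha + beta) = 1, so the entropy reduces to the constant c alone.
alpha = 0.5 * np.eye(2)
beta = 0.5 * np.eye(2)
assert abs(output_entropy(alpha, beta)) < 1e-12

# Squeezing the input state increases det(alpha + beta) and, hence, the entropy:
alpha_sq = np.diag([2.0, 0.125])   # alpha_q * alpha_p = 1/4, a pure squeezed state
assert output_entropy(alpha_sq, beta) > 0
```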
In what follows, we will consider n independent copies of our bosonic system on the Hilbert space H^{⊗n}. We will supply all the quantities related to the k-th copy (k = 1, . . . , n) with the upper index (k), and we will use a tilde to denote quantities related to the whole collection of n copies. Thus, µ̃(dz̃) = µ(dz^{(1)}) · · · µ(dz^{(n)}).
Lemma 2.
Let O = [O_{kj}], k, j = 1, . . . , n, be a real orthogonal n × n matrix and U the unitary operator on H^{⊗n} implementing the linear symplectic transformation
$$\tilde R^{(k)}\to\sum_{j=1}^{n}O_{kj}\tilde R^{(j)}. \tag{12}$$
Then, for any state ρ̃ on H^{⊗n}, e_{M^{⊗n}}(ρ̃) = e_{M^{⊗n}}(Uρ̃U^*).

Lemma 3.
Let M be the Gaussian measurement (11). For any state ρ with finite second moments, e M (ρ) ≥ e M (ρ α ), where α is the covariance matrix of ρ.
Proof. The proof follows the pattern of Lemma 1 from the paper of Wolf, Giedke, and Cirac [12]. Without loss of generality, we can assume that ρ is centered. We have
$$e_M(\rho)\overset{(1)}{=}\frac1n\,e_{M^{\otimes n}}\!\left(\rho^{\otimes n}\right)\overset{(2)}{=}\frac1n\,e_{M^{\otimes n}}(\tilde\rho)\overset{(3)}{\ge}\frac1n\sum_{k=1}^{n}e_M\!\left(\tilde\rho^{(k)}\right),$$
where ρ̃ = Uρ^{⊗n}U^* with the symplectic unitary U in H^{⊗n} corresponding to an orthogonal matrix O as in Lemma 2, and ρ̃^{(k)} is the k-th partial state of ρ̃.
Step (1) follows from the additivity (4), step (2) follows from Lemma 2, and step (3) follows from the superadditivity of e_M (Lemma 1). The final step of the proof uses the ingeniously constructed U from Reference [12] and the lower semicontinuity of e_M (Lemma 1). Namely, n = 2^m, and U corresponds via (12) to the special orthogonal matrix O with entries ±1/√n: every row of the n × n matrix O, except the first one, which has all elements positive, has n/2 = 2^{m−1} positive and n/2 negative elements. Then, the quantum characteristic function of the states ρ̃^{(k)}, k = 2, . . . , n, is equal to
$$\tilde\phi^{(k)}(z)=\left[\phi\!\left(z/\sqrt n\right)\phi\!\left(-z/\sqrt n\right)\right]^{n/2},$$
where φ(z) = Tr ρD(z) is the quantum characteristic function of the state ρ. This allows us to apply the Quantum Central Limit Theorem [20] to show that ρ̃^{(k)} → ρ_α as n → ∞ in a uniform way, implying
$$e_M(\rho)\ge\liminf_{n\to\infty}\frac1n\sum_{k=1}^{n}e_M\!\left(\tilde\rho^{(k)}\right)\ge e_M(\rho_\alpha); \tag{16}$$
see Reference [12] for the details.

Theorem 1.
The optimizing density operator ρ in (10) is the (centered) Gaussian density operator ρ_α:
$$C(M;\alpha)=h_M(\rho_\alpha)-e_M(\rho_\alpha), \tag{17}$$
and, hence,
$$C(M,H,E)=\max_{\alpha:\,\operatorname{Sp}\alpha\varepsilon\le E}\left[h_M(\rho_\alpha)-e_M(\rho_\alpha)\right]. \tag{18}$$

Proof. Lemma 3 implies that, for any ρ with finite second moments, e_M(ρ) ≥ e_M(ρ_α), where α is the covariance matrix of ρ. On the other hand, by the maximum entropy principle, h_M(ρ) ≤ h_M(ρ_α) for ρ ∈ S(α), so the difference in (17) is maximized by the Gaussian density operator ρ_α.
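The Hadamard-type matrix used in the final step can be sketched numerically (normalization ±1/√n assumed here so that O is orthogonal; the recursive construction is the standard one):

```python
import numpy as np

# Recursive Hadamard-type construction of the orthogonal matrix O of size n = 2**m:
# entries are +-1/sqrt(n); the first row has all entries equal, and every other
# row has n/2 positive and n/2 negative entries.

def hadamard_orthogonal(m):
    O = np.array([[1.0]])
    for _ in range(m):
        O = np.block([[O, O], [O, -O]]) / np.sqrt(2.0)
    return O

m = 4
n = 2**m
O = hadamard_orthogonal(m)

assert np.allclose(O @ O.T, np.eye(n))        # O is orthogonal
assert np.allclose(O[0], 1.0 / np.sqrt(n))    # first row: all elements equal
for k in range(1, n):
    assert np.sum(O[k] > 0) == n // 2         # half positive, half negative entries
```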

Remark 1.
The proof of Lemma 2 and, hence, of Theorem 1 can be extended to a general Gaussian observable M in the sense of References [16,21], defined via the operator-valued characteristic function of the form
$$\phi_M(w)=\exp\left(iRKw-\frac12 w^{t}\gamma w\right), \tag{19}$$
where K is a scaling matrix, γ is the measurement noise covariance matrix, and γ ≥ ±(i/2) K^t ΔK. Then, the Fourier transform of the measurement probability density p_ρ(z) is equal to Tr ρ φ_M(w), and one can use this function to obtain a generalization of the relation (14) for the measurement probability densities. The case (11) corresponds to the type 1 Gaussian observable [21] with K = I_{2s}, γ = β. However, (19) also includes type 2 and 3 observables (noisy and noiseless multimode homodyning), in which case K is a projection onto an isotropic subspace of Z (i.e., one on which the symplectic form Δ vanishes).

Remark 2.
Theorem 1 establishes Gaussianity of the average state of the optimal ensemble for a general Gaussian measurement channel. However, a Gaussian average state can appear in a non-Gaussian ensemble. An immediate example is a thermal state represented as a mixture of the Fock states with geometric distribution. Thus, Theorem 1 does not necessarily imply full Gaussianity of the optimal ensemble, as formulated in the following conjecture.

Hypothesis of Gaussian Maximizers (HGM).
Let M be an arbitrary Gaussian measurement channel. Then, for Gaussian ρ, there exists an optimal Gaussian ensemble for the convex closure of the output differential entropy (2) and, hence, for the energy-constrained classical capacity (6) of the channel M. More explicitly, the optimal ensemble consists of (properly squeezed) coherent states with the displacement parameter having a Gaussian probability distribution.
For Gaussian measurement channels of type 1 (essentially of the form (11); see Reference [21] for the complete classification) and Gaussian states ρ_α satisfying the "threshold condition", we have
$$e_M(\rho_\alpha)=\min_{\rho}h_M(\rho), \tag{20}$$
with the minimum attained on a squeezed coherent state, which implies the validity of the HGM and an efficient computation of C(M, H, E); see Reference [5]. On the other hand, the problem remains open in the case where the "threshold condition" is violated and, in particular, for all Gaussian measurement channels of type 2 (noisy homodyning), with the generic example of the energy-constrained approximate measurement of the positions [q_1, . . . , q_s] subject to Gaussian noise (see Reference [22], where the entanglement-assisted capacity of such a measurement was computed). In the following section, we will touch upon the HGM in this case for the one-mode system.

Gaussian Measurements in One Mode
Our framework in this section will be one bosonic mode described by the canonical position and momentum operators q, p. We recall that D(x, y) = exp i(yq − xp), x, y ∈ ℝ, are the unitary displacement operators.
We will be interested in the observable
$$M(dx\,dy)=D(x,y)\,\rho_\beta\,D(x,y)^{*}\,\frac{dx\,dy}{2\pi}, \tag{21}$$
where ρ_β is a centered Gaussian density operator with the covariance matrix
$$\beta=\begin{bmatrix}\beta_q&0\\0&\beta_p\end{bmatrix}. \tag{22}$$
Let ρ_α be a centered Gaussian density operator with the covariance matrix
$$\alpha=\begin{bmatrix}\alpha_q&0\\0&\alpha_p\end{bmatrix}. \tag{23}$$
The problem is to compute e_M(ρ_α) and, hence, the classical capacity C(M, H, E) for the oscillator Hamiltonian H = ½(q² + p²) (as shown in the Appendix of Reference [22], we can restrict to Gaussian states ρ_α with the diagonal covariance matrix in this case). The energy constraint (9) takes the form
$$\alpha_q+\alpha_p\le 2E. \tag{24}$$
The measurement channel corresponding to the POVM (21) acts on the centered Gaussian state ρ_α by the formula
$$p_{\rho_\alpha}(x,y)=\frac{1}{\sqrt{(\alpha_q+\beta_q)(\alpha_p+\beta_p)}}\exp\left[-\frac{x^{2}}{2(\alpha_q+\beta_q)}-\frac{y^{2}}{2(\alpha_p+\beta_p)}\right], \tag{25}$$
so that
$$h_M(\rho_\alpha)=\frac12\log\left[(\alpha_q+\beta_q)(\alpha_p+\beta_p)\right]+c. \tag{26}$$
In these expressions, c is a fixed constant depending on the normalization of the underlying measure µ in (1). It does not enter the information quantities, which are differences of two differential entropies.
Assuming the validity of the HGM, we will optimize over ensembles of squeezed coherent states
$$\rho_\Lambda(x,y)=D(x,y)\,\rho_\Lambda\,D(x,y)^{*},$$
where ρ_Λ is the centered Gaussian state with the correlation matrix Λ = [δ 0; 0 1/(4δ)], and the vector (x, y) has a centered Gaussian distribution with the covariance matrix [γ_q 0; 0 γ_p]. Then, the average state ρ̄_E of the ensemble is the centered Gaussian ρ_α with the covariance matrix (23), where
$$\alpha_q=\gamma_q+\delta,\qquad\alpha_p=\gamma_p+\frac{1}{4\delta}.$$
For this ensemble,
$$\iint h_M(\rho_\Lambda(x,y))\,p(x,y)\,dx\,dy=\frac12\log\left[(\delta+\beta_q)\left(\frac{1}{4\delta}+\beta_p\right)\right]+c.$$
Then, the hypothetical value is
$$e_M(\rho_\alpha)=\min_{\delta}\frac12\log\left[(\delta+\beta_q)\left(\frac{1}{4\delta}+\beta_p\right)\right]+c,\qquad \frac{1}{4\alpha_p}\le\delta\le\alpha_q. \tag{27}$$
The derivative of the minimized expression vanishes for δ = ½√(β_q/β_p). Thus, depending on the position of this value with respect to the interval in (27), we obtain three possibilities, collected in Tables 1 and 2. Here, the column C corresponds to the case where the "threshold condition" holds, implying (20). The full validity of the HGM in the much more general multimode situation was then established in Reference [5]. All the quantities in this column, as well as the value of C(M, H, E) in the central column of Table 2, were obtained in that paper as an example. On the other hand, the HGM remains open in the cases of the mutually symmetric columns L and R (for the derivation of the quantities in column L of Tables 1 and 2, see Appendix A).
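The stationary point of the minimized expression can be checked numerically (illustrative noise variances; the constant c drops out):

```python
import numpy as np

# Minimized expression from the hypothetical value of e_M(rho_alpha), constant c dropped:
# f(delta) = (1/2) log[(delta + beta_q)(1/(4 delta) + beta_p)].
# Its derivative vanishes at delta* = (1/2) sqrt(beta_q / beta_p).

def f(delta, beta_q, beta_p):
    return 0.5 * np.log((delta + beta_q) * (1.0 / (4.0 * delta) + beta_p))

beta_q, beta_p = 0.3, 2.0                       # illustrative noise variances
delta_star = 0.5 * np.sqrt(beta_q / beta_p)

grid = np.linspace(1e-3, 5.0, 500001)
delta_num = grid[np.argmin(f(grid, beta_q, beta_p))]
assert abs(delta_num - delta_star) < 1e-3
```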
Maximizing C(M; α) over α_q, α_p satisfying the energy constraint (24) with equality, α_q + α_p = 2E, we obtain C(M, H, E) depending on the signal energy E and the measurement noise variances β_q, β_p, where we introduced the "energy threshold function" separating the three cases. In the gauge-invariant case, when β_q = β_p = β, the threshold condition amounts to E ≥ 1/2, which is fulfilled by definition, and the capacity formula gives the expression
$$C(M,H,E)=\log\frac{E+\beta}{\beta+1/2},$$
equivalent to the one obtained in Hall's 1994 paper [13]. Let us stress that, as opposed to column C, the values of C(M, H, E) in the L and R columns are hypothetical, conditional upon the validity of the HGM. Looking into the left column, one can see that C(M; α) and C(M, H, E) do not depend at all on β_p. Thus, we can let the variance of the momentum measurement noise β_p → +∞ and, in fact, set β_p = +∞, which is equivalent to the approximate measurement of the position q alone, described by the POVM
$$M(dx)=\frac{1}{\sqrt{2\pi\beta_q}}\exp\left[-\frac{(x-q)^{2}}{2\beta_q}\right]dx, \tag{29}$$
which belongs to type 2 according to the classification of Reference [21]. In other words, one makes the "classical" measurement of the observable X = q + ξ, where ξ ∼ N(0, β_q) is an independent Gaussian noise, with the quantum energy constraint Tr ρ(q² + p²) ≤ 2E. The measurement channel corresponding to the POVM (29) acts on the centered Gaussian state ρ_α by the formula
$$p_{\rho_\alpha}(x)=\frac{1}{\sqrt{2\pi(\alpha_q+\beta_q)}}\exp\left[-\frac{x^{2}}{2(\alpha_q+\beta_q)}\right]. \tag{30}$$
In this case, we have
$$h_M(\rho_\alpha)=\frac12\log(\alpha_q+\beta_q)+c, \tag{31}$$
$$e_M(\rho_\alpha)=\frac12\log\left(\frac{1}{4\alpha_p}+\beta_q\right)+c, \tag{32}$$
which differ from the values in the case of finite β_p by the absence of the factor α_p + β_p under the logarithms, while the difference C(M; α) = h_M(ρ_α) − e_M(ρ_α) and the capacity C(M, H, E) have the same expressions as in that case (column L).
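Conditional on the HGM, the gauge-invariant capacity formula can be recovered numerically by the two-level optimization above (maximum over α with α_q + α_p = 2E, minimum over the squeezing δ); the parameters are illustrative:

```python
import numpy as np

# Gauge-invariant case beta_q = beta_p = beta: maximizing
# C(M; alpha) = h_M(rho_alpha) - e_M(rho_alpha) over alpha_q + alpha_p = 2E
# recovers Hall's formula C = log[(E + beta)/(beta + 1/2)] (constant c cancels).

def C_of_alpha(alpha_q, alpha_p, beta):
    h = 0.5 * np.log((alpha_q + beta) * (alpha_p + beta))
    deltas = np.linspace(1.0 / (4.0 * alpha_p), alpha_q, 20001)
    e = np.min(0.5 * np.log((deltas + beta) * (1.0 / (4.0 * deltas) + beta)))
    return h - e

E, beta = 2.0, 0.5                              # illustrative energy and noise
alphas_p = np.linspace(0.26, 2 * E - 0.26, 2001)
C_num = max(C_of_alpha(2 * E - ap, ap, beta) for ap in alphas_p)
C_hall = np.log((E + beta) / (beta + 0.5))
assert abs(C_num - C_hall) < 1e-3
```

The maximum is attained at the symmetric point α_q = α_p = E, with the optimal squeezing δ = 1/2 (coherent states), as expected in the phase-insensitive case.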

The Dual Problem: Accessible Information
Let us sketch here the ensemble–observable duality [1,2,4] (see Reference [6] for the details of the mathematically rigorous description in the infinite-dimensional case).
Given a pair (E, M) with a nondegenerate average state ρ̄_E, the duality interchanges the roles of the ensemble and the observable: the dual ensemble E′ consists of the states ρ′(y) = ρ̄_E^{1/2} m(y) ρ̄_E^{1/2}/p(y) taken with the outcome probability density p(y) = Tr ρ̄_E m(y), while the dual observable M′ has the operator density ρ̄_E^{−1/2} ρ(x) ρ̄_E^{−1/2} w.r.t. π(dx). Then, the average states of both ensembles coincide,
$$\bar\rho_{\mathcal{E}'}=\bar\rho_{\mathcal{E}},$$
and the joint distribution of x, y is the same for both pairs (E, M) and (E′, M′), so that
$$I(\mathcal{E},M)=I(\mathcal{E}',M').$$
Moreover,
$$\sup_{M'}I(\mathcal{E},M')=\sup_{\mathcal{E}'}I(\mathcal{E}',M),$$
where the supremum in the right-hand side is taken over all ensembles E′ satisfying the condition ρ̄_{E′} = ρ̄_E. It can be shown (Reference [6], Proposition 4) that the supremum in the left-hand side remains the same if it is taken over all observables M′ (not only of the special kind with the density we started with), and then it is called the accessible information A(E) of the ensemble E. Thus,
$$A(\mathcal{E})=\sup_{\mathcal{E}':\,\bar\rho_{\mathcal{E}'}=\bar\rho_{\mathcal{E}}}I(\mathcal{E}',M).$$
Since the application of the duality to the pair {E′, M′} results in the initial pair {E, M}, we also have
$$A(\mathcal{E}')=\sup_{\mathcal{E}:\,\bar\rho_{\mathcal{E}}=\bar\rho_{\mathcal{E}'}}I(\mathcal{E},M').$$
Coming to the case of the bosonic mode, we fix the Gaussian state ρ_α and restrict to ensembles E with ρ̄_E = ρ_α. Let M be the measurement channel corresponding to the POVM (21). Then, the dual ensemble is E′ = {p(x, y), ρ′(x, y)}, where p(x, y) is the Gaussian probability density (25) and
$$\rho'(x,y)=\frac{\rho_\alpha^{1/2}\,D(x,y)\rho_\beta D(x,y)^{*}\,\rho_\alpha^{1/2}}{2\pi\,p(x,y)}.$$
By using the formula for √ρ₁ ρ₂ √ρ₁, where ρ₁, ρ₂ are Gaussian operators (see Reference [24] and also the Corollary in the Appendix of Reference [25]), we obtain that ρ′(x, y) is a Gaussian density operator ρ_{α′}(x′, y′) with a covariance matrix α′ and a displacement (x′, y′) depending linearly on (x, y). Since (x, y)^t ∼ N(0, α + β), the displacements (x′, y′) have a centered normal distribution with a covariance matrix γ′. Denoting by p_{γ′}(x′, y′) the density of this normal distribution, we can equivalently rewrite the ensemble E′ as E′ = {p_{γ′}(x′, y′), ρ_{α′}(x′, y′)} with the average state ρ_α, α = α′ + γ′. Then, the HGM is equivalent to the statement
$$A(\mathcal{E}')=C(M;\alpha),$$
where the values of C(M; α) are given in Table 1; however, they should be re-expressed in terms of the ensemble parameters γ′, α′. In Reference [25], we treated the case C in the multimode situation, establishing that the optimal measurement is Gaussian, and described it.
Here, we will discuss the case L (R is similar) and show that, for large β_p (including β_p = +∞), the HGM is equivalent to the following statement: the value of the accessible information is attained on the sharp position measurement M₀(dξ) = |ξ⟩⟨ξ| dξ (in fact, this refers to the whole domain L: ½√(β_q/β_p) < 1/(4α_p), which, however, has a rather cumbersome description in the new variables γ′, α′; cf. Reference [25]).
In the case of the position measurement channel M corresponding to the POVM (29) (β_p = +∞), we have α′_p = α_p; otherwise, the argument is essentially the same. Thus, we obtain that the HGM concerning e_M(ρ) in case L is equivalent to the following: the accessible information of a Gaussian ensemble E′ = {p′(x), ρ′(x)} is given by the expression (43) and attained on the sharp position measurement M₀(dξ) = |ξ⟩⟨ξ| dξ.
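The classical content of this statement can be illustrated by a Monte Carlo sketch (parameters illustrative): for position-squeezed states with intrinsic position variance δ and displacement x ∼ N(0, γ_q), the sharp position measurement yields an additive Gaussian noise channel y = x + ξ with ξ ∼ N(0, δ), whose mutual information is ½ log(1 + γ_q/δ).

```python
import numpy as np

# Sharp position measurement of an ensemble of position-squeezed states:
# displacement x ~ N(0, gamma_q), intrinsic position variance delta, so the
# outcome is y = x + xi, xi ~ N(0, delta), and I(x; y) = (1/2) log(1 + gamma_q/delta).
# Parameters below are illustrative.

rng = np.random.default_rng(0)
gamma_q, delta = 2.0, 0.25
n = 1_000_000
x = rng.normal(0.0, np.sqrt(gamma_q), n)
y = x + rng.normal(0.0, np.sqrt(delta), n)

I_hat = 0.5 * np.log(np.var(y) / delta)     # Gaussian MI from the sample variance
I_exact = 0.5 * np.log(1 + gamma_q / delta)
assert abs(I_hat - I_exact) < 1e-2
```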

Discussion
In this paper, we investigated the classical capacity problem for Gaussian measurement channels. We established Gaussianity of the average state of the optimal ensemble in full generality and discussed the Hypothesis of Gaussian Maximizers concerning the detailed structure of the ensemble. Gaussian systems form the backbone of information theory with continuous variables, both in the classical and in the quantum case. Starting from them, other, non-linear models can be constructed and investigated. Therefore, the quantum Gaussian models must be studied exhaustively. Despite the progress made, there are still intriguing gaps along this way. A major problem remains the proof (or refutation) of the hypothesis of Gaussian optimizers for various entropy characteristics of quantum Gaussian systems and channels. So far, the proof of this hypothesis in special cases has required tricky and special constructions, such as in the path-breaking paper [7] concerning gauge-covariant channels, or in Section 3 of the present work concerning general Gaussian measurement channels. It seems plausible that quantum Gaussian systems may have some as yet undiscovered structural property from which a proof of this hypothesis in its maximal generality would follow in a natural way.

Acknowledgments: The author is grateful to M. J. W. Hall for sending a copy of his paper [13] and to M. E. Shirokov for the comments improving the presentation.

Conflicts of Interest:
The author declares no conflict of interest.


Appendix A. Case L in Tables 1 and 2
By taking the Gaussian ensemble parameters in (28) as
$$\delta=\frac{1}{4\alpha_p},\qquad\gamma_q=\alpha_q-\frac{1}{4\alpha_p},\qquad\gamma_p=0, \tag{A1}$$
we get the hypothetical value
$$e_M(\rho_\alpha)=\frac12\log\left[\left(\frac{1}{4\alpha_p}+\beta_q\right)(\alpha_p+\beta_p)\right]+c; \tag{A2}$$
hence, taking into account (26),
$$C(M;\alpha)=h_M(\rho_\alpha)-e_M(\rho_\alpha)=\frac12\log\frac{\alpha_q+\beta_q}{\frac{1}{4\alpha_p}+\beta_q}. \tag{A3}$$
The Gaussian constrained capacity is
$$C_{\mathrm{Gauss}}(M,H,E)=\max_{\alpha_q+\alpha_p\le 2E}C(M;\alpha)=\max_{\alpha_p}\frac12\log\frac{2E-\alpha_p+\beta_q}{\frac{1}{4\alpha_p}+\beta_q},$$
where, in the second expression, we took the maximal value α_q = 2E − α_p. Differentiating, we obtain the equation for the optimal value α_p:
$$4\beta_q\alpha_p^{2}+2\alpha_p-(2E+\beta_q)=0, \tag{A4}$$
the positive solution of which is
$$\alpha_p=\frac{\sqrt{1+4\beta_q(2E+\beta_q)}-1}{4\beta_q}, \tag{A5}$$
whence
$$C_{\mathrm{Gauss}}(M,H,E)=\frac12\log\frac{2E-\alpha_p+\beta_q}{\frac{1}{4\alpha_p}+\beta_q} \tag{A6}$$
with α_p given by (A5). The parameters of the optimal Gaussian ensemble are obtained by substituting the value (A5) into (A1) with α_q = 2E − α_p. The above derivation concerns the measurement (21) (β_p < ∞). The case of the measurement (29) (β_p = +∞) is treated similarly, with (A2), (26) replaced by (32), (31). Notably, in this case, the expression (A6) coincides with the one obtained in Reference [13] by optimizing the information from applying the sharp position measurement to noisy optimally squeezed states (the author is indebted to M. J. W. Hall for this observation).
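The stationarity equation and its positive root can be verified numerically (illustrative E and β_q; the reconstruction of (A4)–(A5) is assumed as stated above):

```python
import numpy as np

# Case L sketch: maximize C(alpha_p) = (1/2) log[(2E - alpha_p + beta_q)/(1/(4 alpha_p) + beta_q)]
# over alpha_p; the stationary point solves 4 beta_q alpha_p^2 + 2 alpha_p - (2E + beta_q) = 0,
# with the positive root alpha_p = (sqrt(1 + 4 beta_q (2E + beta_q)) - 1)/(4 beta_q).

E, beta_q = 3.0, 0.2                            # illustrative parameters

def C(alpha_p):
    return 0.5 * np.log((2 * E - alpha_p + beta_q) / (1.0 / (4.0 * alpha_p) + beta_q))

grid = np.linspace(0.05, 2 * E - 0.05, 400001)
ap_num = grid[np.argmax(C(grid))]
ap_root = (np.sqrt(1 + 4 * beta_q * (2 * E + beta_q)) - 1) / (4 * beta_q)
assert abs(ap_num - ap_root) < 1e-3
```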