
Entropy 2017, 19(7), 301; https://doi.org/10.3390/e19070301

Article
Measurement Uncertainty Relations for Position and Momentum: Relative Entropy Formulation
1
Dipartimento di Matematica, Politecnico di Milano, Piazza Leonardo da Vinci 32, I-20133 Milano, Italy
2
Istituto Nazionale di Alta Matematica (INDAM-GNAMPA), 00185 Roma, Italy
3
Istituto Nazionale di Fisica Nucleare (INFN), Sezione di Milano, 20133 Milano, Italy
*
Author to whom correspondence should be addressed.
Received: 26 May 2017 / Accepted: 21 June 2017 / Published: 24 June 2017

Abstract

Heisenberg’s uncertainty principle has recently led to general measurement uncertainty relations for quantum systems: incompatible observables can be measured jointly or in sequence only with some unavoidable approximation, which can be quantified in various ways. The relative entropy is the natural theoretical quantifier of the information loss when a ‘true’ probability distribution is replaced by an approximating one. In this paper, we provide a lower bound for the amount of information that is lost by replacing the distributions of the sharp position and momentum observables, as they could be obtained with two separate experiments, by the marginals of any smeared joint measurement. The bound is obtained by introducing an entropic error function, and optimizing it over a suitable class of covariant approximate joint measurements. We fully exploit two cases of target observables: (1) n-dimensional position and momentum vectors; (2) two components of position and momentum along different directions. In (1), we connect the quantum bound to the dimension n; in (2), going from parallel to orthogonal directions, we show the transition from highly incompatible observables to compatible ones. For simplicity, we develop the theory only for Gaussian states and measurements.
Keywords:
measurement uncertainty relations; relative entropy; position; momentum
PACS:
03.65.Ta; 03.65.Ca; 03.67.-a; 03.65.Db
MSC:
81P15; 81P16; 94A17; 81P45

1. Introduction

Uncertainty relations for position and momentum [1] have always been deeply related to the foundations of Quantum Mechanics. For several decades, their axiomatization has been of ‘preparation’ type: an inviolable lower bound for the widths of the position and momentum distributions, holding in any quantum state. Uncertainty relations of this kind, now known as preparation uncertainty relations (PURs), were later extended to arbitrary sets of $n \ge 2$ observables [2,3,4,5]. All PURs trace back to Robertson’s celebrated formulation [6] of Heisenberg’s uncertainty principle: for any two observables, represented by self-adjoint operators A and B, the product of the variances of A and B is bounded from below by the expectation value of their commutator; in formulae, $\operatorname{Var}_\rho(A)\,\operatorname{Var}_\rho(B) \ge \frac{1}{4}\left|\operatorname{Tr}\{\rho\,[A,B]\}\right|^2$, where $\operatorname{Var}_\rho$ is the variance of an observable measured in any system state $\rho$. In the case of position Q and momentum P, this inequality gives Heisenberg’s relation $\operatorname{Var}_\rho(Q)\,\operatorname{Var}_\rho(P) \ge \hbar^2/4$. About 30 years after Heisenberg and Robertson’s formulation, Hirschman attempted a first statement of position and momentum uncertainties in terms of informational quantities. This led him to a formulation of PURs based on Shannon entropy [7]; his bound was later refined [8,9], and extended to discrete observables [10]. Other entropic quantities have also been used [11]. We refer to [12,13] for an extensive review of entropic PURs.
However, Heisenberg’s original intent [1] was more focused on the unavoidable disturbance that a measurement of position produces on a subsequent measurement of momentum [14,15,16,17,18,19,20,21]. Trying to give a better understanding of his idea, more recently new formulations were introduced, based on a ‘measurement’ interpretation of uncertainty, rather than giving bounds on the probability distributions of the target observables. Indeed, with the modern development of the quantum theory of measurement and the introduction of positive operator valued measures and instruments [3,22,23,24,25,26], it became possible to deal with approximate measurements of incompatible observables and to formulate measurement uncertainty relations (MURs) for position and momentum, as well as for more general observables. The MURs quantify the degree of approximation (or inaccuracy and disturbance) made by replacing the original incompatible observables with a joint approximate measurement of them. A very rich literature on this topic flourished in the last 20 years, and various kinds of MURs have been proposed, based on distances between probability distributions, noise quantifications, conditional entropy, etc. [12,14,15,16,17,18,19,20,21,27,28,29,30,31,32].
In this paper, we develop a new information-theoretical formulation of MURs for position and momentum, using the notion of the relative entropy (or Kullback-Leibler divergence) of two probabilities. The relative entropy $S(p\|q)$ is an informational quantity which is precisely tailored to quantify the amount of information that is lost by using an approximating probability q in place of the target one p. Although classical and quantum relative entropies have already been used in the evaluation of the performances of quantum measurements [24,27,30,33,34,35,36,37,38,39,40], their first application to MURs is very recent [41].
In [41], only MURs for discrete observables were considered. The present work is a first attempt to extend that information-theoretical approach to the continuous setting. This extension is not trivial and reveals peculiar problems that are not present in the discrete case. However, the nice properties of the relative entropy, such as its scale invariance, allow for a satisfactory formulation of the entropic MURs also for position and momentum.
We deal with position and momentum in two possible scenarios. Firstly, we consider the case of n-dimensional position and momentum, since it allows one to treat scalar particles, vector ones, or even multi-particle systems. This is the natural level of generality, and our treatment extends without difficulty to it. Then, we consider a pair made up of one position and one momentum component along two different directions of the n-space. In this case, we can see how our theory behaves when one moves continuously from a highly incompatible case (parallel components) to a compatible case (orthogonal ones).
The continuous case needs much care when dealing with arbitrary quantum states and approximating observables. Indeed, it is difficult to evaluate or even bound the relative entropy if no assumption is made on the probability distributions. In order to overcome these technicalities and focus on the quantum content of MURs, in this paper we consider only the case of Gaussian preparation states and Gaussian measurement apparatuses [2,4,5,42,43,44,45]. Moreover, we identify the class of the approximate joint measurements with the class of the joint POVMs satisfying the same symmetry properties as their target position and momentum observables [3,23]. We are supported in this assumption by the fact that, in the discrete case [41], symmetry covariant measurements turn out to be the best approximations without any hypothesis (see also [17,19,20,29,32] for a similar appearance of covariance within MURs for different uncertainty measures).
We now sketch the main results of the paper. In the vector case, we consider approximate joint measurements $\mathsf{M}$ of the position $\mathbf{Q} \equiv (Q_1,\ldots,Q_n)$ and the momentum $\mathbf{P} \equiv (P_1,\ldots,P_n)$. We find the following entropic MUR (Theorem 5, Remark 14): for every choice of two positive thresholds $\epsilon_1, \epsilon_2$, with $\epsilon_1\epsilon_2 \ge \hbar^2/4$, there exists a Gaussian state $\rho$ with position variance matrix $A^\rho \ge \epsilon_1 \mathbb{1}$ and momentum variance matrix $B^\rho \ge \epsilon_2 \mathbb{1}$ such that
$$S(\mathsf{Q}_\rho \| \mathsf{M}_{1,\rho}) + S(\mathsf{P}_\rho \| \mathsf{M}_{2,\rho}) \ge n\,(\log e)\left\{\ln\left(1 + \frac{\hbar}{2\sqrt{\epsilon_1\epsilon_2}}\right) - \frac{\hbar}{\hbar + 2\sqrt{\epsilon_1\epsilon_2}}\right\} \tag{1}$$
for all Gaussian approximate joint measurements $\mathsf{M}$ of $\mathbf{Q}$ and $\mathbf{P}$. Here $\mathsf{Q}_\rho$ and $\mathsf{P}_\rho$ are the distributions of position and momentum in the state $\rho$, and $\mathsf{M}_\rho$ is the distribution of $\mathsf{M}$ in the state $\rho$, with marginals $\mathsf{M}_{1,\rho}$ and $\mathsf{M}_{2,\rho}$; the two marginals turn out to be noisy versions of $\mathsf{Q}_\rho$ and $\mathsf{P}_\rho$. The lower bound is strictly positive and it increases linearly with the dimension n. The thresholds $\epsilon_1$ and $\epsilon_2$ are peculiar to the continuous case and they have a classical explanation: the relative entropy $S(p\|q) \to +\infty$ if the variance of p vanishes faster than the variance of q, so that, given $\mathsf{M}$, it is trivial to find a state $\rho$ satisfying (1) if arbitrarily small variances are allowed. What is relevant in our result is that the total loss of information $S(\mathsf{Q}_\rho \| \mathsf{M}_{1,\rho}) + S(\mathsf{P}_\rho \| \mathsf{M}_{2,\rho})$ exceeds the lower bound even if we forbid target distributions with small variances.
The MUR (1) shows that there is no Gaussian joint measurement which can approximate arbitrarily well both $\mathbf{Q}$ and $\mathbf{P}$. The lower bound (1) is a consequence of the incompatibility between $\mathbf{Q}$ and $\mathbf{P}$ and, indeed, it vanishes in the classical limit $\hbar \to 0$. Both the relative entropies and the lower bound in (1) are scale invariant. Moreover, for fixed $\epsilon_1$ and $\epsilon_2$, we prove the existence and uniqueness of an optimal approximate joint measurement, and we fully characterize it.
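As a rough numerical illustration of the right-hand side of (1), the following sketch evaluates the bound in bits. The function name `vector_mur_bound` and the default `hbar=1.0` are illustrative assumptions, not notation from the paper.

```python
import math

def vector_mur_bound(n, eps1, eps2, hbar=1.0):
    """Right-hand side of the entropic MUR (1), in bits.

    n     : number of position/momentum components
    eps1  : threshold on the position variances
    eps2  : threshold on the momentum variances
    hbar  : value of the reduced Planck constant in the chosen units (assumption)
    """
    if eps1 <= 0 or eps2 <= 0:
        raise ValueError("thresholds must be positive")
    if eps1 * eps2 < hbar**2 / 4:
        raise ValueError("the bound is stated for eps1 * eps2 >= hbar^2 / 4")
    root = math.sqrt(eps1 * eps2)
    ratio = hbar / (2.0 * root)
    # log1p keeps precision when hbar is small (classical limit)
    return n * math.log2(math.e) * (math.log1p(ratio) - hbar / (hbar + 2.0 * root))

if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n, vector_mur_bound(n, 0.5, 0.5))
```

The bound depends on the thresholds only through the product $\epsilon_1\epsilon_2$, which is one way to see its scale invariance, and it scales linearly with n.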
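In the scalar case, we consider approximate joint measurements $\mathsf{M}$ of the position $Q_{\mathbf{u}} = \mathbf{u}\cdot\mathbf{Q}$ along the direction $\mathbf{u}$ and the momentum $P_{\mathbf{v}} = \mathbf{v}\cdot\mathbf{P}$ along the direction $\mathbf{v}$, where $\mathbf{u}\cdot\mathbf{v} = \cos\alpha$. We find two different entropic MURs. The first entropic MUR in the scalar case is similar to the vector case (Theorem 3, Remark 11). The second one is (Theorem 1):
$$S(\mathsf{Q}_{\mathbf{u},\rho} \| \mathsf{M}_{1,\rho}) + S(\mathsf{P}_{\mathbf{v},\rho} \| \mathsf{M}_{2,\rho}) \ge c_\rho(\alpha),$$
$$c_\rho(\alpha) = (\log e)\left\{\ln\left(1 + \frac{\hbar\,|\cos\alpha|}{2\sqrt{\operatorname{Var}(\mathsf{Q}_{\mathbf{u},\rho})\operatorname{Var}(\mathsf{P}_{\mathbf{v},\rho})}}\right) - \frac{\hbar\,|\cos\alpha|}{\hbar\,|\cos\alpha| + 2\sqrt{\operatorname{Var}(\mathsf{Q}_{\mathbf{u},\rho})\operatorname{Var}(\mathsf{P}_{\mathbf{v},\rho})}}\right\},$$
for all Gaussian states $\rho$ and all Gaussian joint approximate measurements $\mathsf{M}$ of $Q_{\mathbf{u}}$ and $P_{\mathbf{v}}$. This lower bound holds for every Gaussian state $\rho$ without constraints on the position and momentum variances $\operatorname{Var}(\mathsf{Q}_{\mathbf{u},\rho})$ and $\operatorname{Var}(\mathsf{P}_{\mathbf{v},\rho})$; it is strictly positive unless $\mathbf{u}$ and $\mathbf{v}$ are orthogonal, but it is state dependent. Again, the relative entropies and the lower bound are scale invariant.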
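The state-dependent bound $c_\rho(\alpha)$ can be tabulated in the same spirit. This is only a sketch: the function name `scalar_mur_bound`, its argument names and the default `hbar=1.0` are assumptions made for illustration.

```python
import math

def scalar_mur_bound(alpha, var_q, var_p, hbar=1.0):
    """State-dependent lower bound c_rho(alpha) of the scalar MUR, in bits.

    alpha : angle between the position direction u and the momentum direction v
    var_q : variance of the position component along u in the state rho
    var_p : variance of the momentum component along v in the state rho
    """
    if var_q <= 0 or var_p <= 0:
        raise ValueError("variances must be positive")
    k = hbar * abs(math.cos(alpha))
    root = math.sqrt(var_q * var_p)
    return math.log2(math.e) * (math.log1p(k / (2.0 * root)) - k / (k + 2.0 * root))

if __name__ == "__main__":
    # From parallel (alpha = 0) to orthogonal (alpha = pi/2) directions
    for deg in (0, 30, 60, 90):
        print(deg, scalar_mur_bound(math.radians(deg), 0.5, 0.5))
```

The printed values decrease as the directions go from parallel to orthogonal, matching the transition from incompatible to compatible observables described in the text.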
The paper is organized as follows. In Section 2, we introduce our target position and momentum observables, we discuss their general properties and define some related quantities (spectral measures, mean vectors and variance matrices, PURs for second order quantum moments, Weyl operators, Gaussian states). Section 3 is devoted to the definitions and main properties of the relative and differential (Shannon) entropies. Section 4 is a review on the entropic PURs in the continuous case [7,8,9,46], with a particular focus on their lack of scale invariance. This is a flaw due to the very definition of differential entropy, and one of the reasons that lead us to introduce relative entropy based MURs. In Section 5 we construct the covariant observables which will be used as approximate joint measurements of the position and momentum target observables. Finally, in Section 6 the main results on MURs that we sketched above are presented in detail. Some conclusions are discussed in Section 7.

2. Target Observables and States

Let us start with the usual position and momentum operators, which satisfy the canonical commutation rules:
$$\mathbf{Q} \equiv (Q_1,\ldots,Q_n), \qquad \mathbf{P} \equiv (P_1,\ldots,P_n), \qquad [Q_i, P_j] = i\hbar\,\delta_{ij}. \tag{3}$$
Each of the vector operators has n components; it could be the case of a single particle in one or more dimensions ($n = 1, 2, 3$), or several scalar or vector particles, or the quadratures of n modes of the electromagnetic field. We assume the Hilbert space $\mathcal{H}$ to be irreducible for the algebra generated by the canonical operators $\mathbf{Q}$ and $\mathbf{P}$. An observable of the quantum system $\mathcal{H}$ is identified with a positive operator valued measure (POVM); in the paper, we shall consider observables with outcomes in $\mathbb{R}^k$ endowed with its Borel $\sigma$-algebra $\mathcal{B}(\mathbb{R}^k)$. The use of POVMs to represent observables in quantum theory is standard and the definition can be found in many textbooks [22,23,26,47]; the alternative name “non-orthogonal resolutions of the identity” is also used [3,4,5]. Following [5,23,26,31], a sharp observable is an observable represented by a projection valued measure (pvm); it is standard to identify a sharp observable on the outcome space $\mathbb{R}^k$ with the k self-adjoint operators corresponding to it by the spectral theorem. Two observables are jointly measurable or compatible if there exists a POVM having them as marginals. Because of the non-vanishing commutators, each pair $Q_i$, $P_i$, as well as the vectors $\mathbf{Q}$, $\mathbf{P}$, are not jointly measurable.
We denote by $\mathcal{T}(\mathcal{H})$ the trace class operators on $\mathcal{H}$, by $\mathcal{S} \subset \mathcal{T}(\mathcal{H})$ the subset of the statistical operators (or states, preparations), and by $\mathcal{L}(\mathcal{H})$ the space of the bounded linear operators.

2.1. Position and Momentum

Our target observables will be either n-dimensional position and momentum (vector case) or position and momentum along two different directions of $\mathbb{R}^n$ (scalar case). The second case allows us to give an example ranging continuously from maximally incompatible observables to compatible ones.

2.1.1. Vector Observables

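As target observables we take $\mathbf{Q}$ and $\mathbf{P}$ as in (3) and we denote by $\mathsf{Q}(A)$, $\mathsf{P}(B)$, $A, B \in \mathcal{B}(\mathbb{R}^n)$, their pvm’s, that is
$$Q_i = \int_{\mathbb{R}^n} x_i\, \mathsf{Q}(d\boldsymbol{x}), \qquad P_i = \int_{\mathbb{R}^n} p_i\, \mathsf{P}(d\boldsymbol{p}).$$
Then, the distributions in the state $\rho \in \mathcal{S}$ of sharp position and sharp momentum measurements (denoted by $\mathsf{Q}_\rho$ and $\mathsf{P}_\rho$) are absolutely continuous with respect to the Lebesgue measure; we denote by $f(\cdot|\rho)$ and $g(\cdot|\rho)$ their probability densities: $\forall A, B \in \mathcal{B}(\mathbb{R}^n)$,
$$\mathsf{Q}_\rho(A) = \operatorname{Tr}\{\rho\, \mathsf{Q}(A)\} = \int_A f(\boldsymbol{x}|\rho)\, d\boldsymbol{x}, \qquad \mathsf{P}_\rho(B) = \operatorname{Tr}\{\rho\, \mathsf{P}(B)\} = \int_B g(\boldsymbol{p}|\rho)\, d\boldsymbol{p}.$$
In the Dirac notation, if $|\boldsymbol{x}\rangle$ and $|\boldsymbol{p}\rangle$ are the improper position and momentum eigenvectors, these densities take the expressions $f(\boldsymbol{x}|\rho) = \langle \boldsymbol{x}|\rho|\boldsymbol{x}\rangle$ and $g(\boldsymbol{p}|\rho) = \langle \boldsymbol{p}|\rho|\boldsymbol{p}\rangle$, respectively. The mean vectors and the variance matrices of these distributions will be given in (7) and (8).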

2.1.2. Scalar Observables

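As target observables we take the position along a given direction $\mathbf{u}$ and the momentum along another given direction $\mathbf{v}$:
$$Q_{\mathbf{u}} = \mathbf{u}\cdot\mathbf{Q}, \qquad P_{\mathbf{v}} = \mathbf{v}\cdot\mathbf{P}, \qquad \text{with } \mathbf{u}, \mathbf{v} \in \mathbb{R}^n, \quad |\mathbf{u}| = |\mathbf{v}| = 1, \quad \mathbf{u}\cdot\mathbf{v} = \cos\alpha. \tag{6}$$
In this case we have $[Q_{\mathbf{u}}, P_{\mathbf{v}}] = i\hbar\cos\alpha$, so that $Q_{\mathbf{u}}$ and $P_{\mathbf{v}}$ are not jointly measurable, unless the directions $\mathbf{u}$ and $\mathbf{v}$ are orthogonal.
Their pvm’s are denoted by $\mathsf{Q}_{\mathbf{u}}$ and $\mathsf{P}_{\mathbf{v}}$, their distributions in a state $\rho$ by $\mathsf{Q}_{\mathbf{u},\rho}$ and $\mathsf{P}_{\mathbf{v},\rho}$, and their corresponding probability densities by $f_{\mathbf{u}}(\cdot|\rho)$ and $g_{\mathbf{v}}(\cdot|\rho)$: $\forall A, B \in \mathcal{B}(\mathbb{R})$,
$$\mathsf{Q}_{\mathbf{u},\rho}(A) = \operatorname{Tr}\{\mathsf{Q}_{\mathbf{u}}(A)\rho\} = \int_A f_{\mathbf{u}}(x|\rho)\, dx, \qquad \mathsf{P}_{\mathbf{v},\rho}(B) = \operatorname{Tr}\{\mathsf{P}_{\mathbf{v}}(B)\rho\} = \int_B g_{\mathbf{v}}(p|\rho)\, dp.$$
Of course, the densities in the scalar case are marginals of the densities in the vector case. Means and variances will be given in (11).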

2.2. Quantum Moments

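Let $\mathcal{S}_2$ be the set of states for which the second moments of position and momentum are finite:
$$\mathcal{S}_2 := \left\{\rho \in \mathcal{S} : \int_{\mathbb{R}^n} |\boldsymbol{x}|^2 f(\boldsymbol{x}|\rho)\, d\boldsymbol{x} < +\infty, \quad \int_{\mathbb{R}^n} |\boldsymbol{p}|^2 g(\boldsymbol{p}|\rho)\, d\boldsymbol{p} < +\infty\right\}.$$
Then, the mean vector and the variance matrix of the position $\mathbf{Q}$ in the state $\rho \in \mathcal{S}_2$ are
$$a_i^\rho := \int_{\mathbb{R}^n} x_i f(\boldsymbol{x}|\rho)\, d\boldsymbol{x} \equiv \operatorname{Tr}\{\rho\, Q_i\}, \qquad A_{ij}^\rho := \int_{\mathbb{R}^n} (x_i - a_i^\rho)(x_j - a_j^\rho) f(\boldsymbol{x}|\rho)\, d\boldsymbol{x} \equiv \operatorname{Tr}\{\rho\, (Q_i - a_i^\rho)(Q_j - a_j^\rho)\}, \tag{7}$$
while for the momentum $\mathbf{P}$ we have
$$b_i^\rho := \int_{\mathbb{R}^n} p_i g(\boldsymbol{p}|\rho)\, d\boldsymbol{p} \equiv \operatorname{Tr}\{\rho\, P_i\}, \qquad B_{ij}^\rho := \int_{\mathbb{R}^n} (p_i - b_i^\rho)(p_j - b_j^\rho) g(\boldsymbol{p}|\rho)\, d\boldsymbol{p} \equiv \operatorname{Tr}\{\rho\, (P_i - b_i^\rho)(P_j - b_j^\rho)\}. \tag{8}$$
For $\rho \in \mathcal{S}_2$ it is possible to introduce also the mixed ‘quantum covariances’
$$C_{ij}^\rho := \operatorname{Tr}\left\{\rho\, \frac{(Q_i - a_i^\rho)(P_j - b_j^\rho) + (P_j - b_j^\rho)(Q_i - a_i^\rho)}{2}\right\}. \tag{9}$$
Since there is no joint measurement for the position $\mathbf{Q}$ and momentum $\mathbf{P}$, the quantum covariances $C_{ij}^\rho$ are not covariances of a joint distribution, and thus they do not have a classical probabilistic interpretation.
By means of the moments above, we construct the three real $n \times n$ matrices $A^\rho$, $B^\rho$, $C^\rho$, the $2n$-dimensional vector $\mu^\rho$ and the symmetric $2n \times 2n$ matrix $V^\rho$, with
$$\mu^\rho := \begin{pmatrix} \boldsymbol{a}^\rho \\ \boldsymbol{b}^\rho \end{pmatrix}, \qquad V^\rho := \begin{pmatrix} A^\rho & C^\rho \\ (C^\rho)^T & B^\rho \end{pmatrix}. \tag{10}$$
We say $V^\rho$ is the quantum variance matrix of position and momentum in the state $\rho$. In [2] dimensionless canonical operators are considered, but apart from this, our matrix $V^\rho$ corresponds to their “noise matrix in real form”; the name “variance matrix” is also used [44,48].
In a similar way, we can introduce all the moments related to the position $Q_{\mathbf{u}}$ and momentum $P_{\mathbf{v}}$ introduced in (6). For $\rho \in \mathcal{S}_2$, the means and variances are respectively
$$\mathbf{u}\cdot\boldsymbol{a}^\rho, \quad \operatorname{Var}(\mathsf{Q}_{\mathbf{u},\rho}) = \mathbf{u}\cdot A^\rho\mathbf{u}, \qquad \mathbf{v}\cdot\boldsymbol{b}^\rho, \quad \operatorname{Var}(\mathsf{P}_{\mathbf{v},\rho}) = \mathbf{v}\cdot B^\rho\mathbf{v}. \tag{11}$$
Similarly to (9), we have also the ‘quantum covariance’ $\mathbf{u}\cdot C^\rho\mathbf{v} \equiv \mathbf{v}\cdot (C^\rho)^T\mathbf{u}$. Then, we collect the two means in a single vector and we introduce the variance matrix:
$$\mu_{\mathbf{u},\mathbf{v}}^\rho := \begin{pmatrix} \mathbf{u}\cdot\boldsymbol{a}^\rho \\ \mathbf{v}\cdot\boldsymbol{b}^\rho \end{pmatrix}, \qquad V_{\mathbf{u},\mathbf{v}}^\rho := \begin{pmatrix} \mathbf{u}\cdot A^\rho\mathbf{u} & \mathbf{u}\cdot C^\rho\mathbf{v} \\ \mathbf{u}\cdot C^\rho\mathbf{v} & \mathbf{v}\cdot B^\rho\mathbf{v} \end{pmatrix}.$$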
Proposition 1.
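Let $V = \begin{pmatrix} A & C \\ C^T & B \end{pmatrix}$ be a real symmetric $2n \times 2n$ block matrix with the same dimensions as a quantum variance matrix. Define
$$V_\pm := \begin{pmatrix} A & C \pm \frac{i\hbar}{2}\mathbb{1} \\ C^T \mp \frac{i\hbar}{2}\mathbb{1} & B \end{pmatrix} \equiv V \pm \frac{i\hbar}{2}\,\Omega, \qquad \text{with } \Omega := \begin{pmatrix} 0 & \mathbb{1} \\ -\mathbb{1} & 0 \end{pmatrix}. \tag{13}$$
Then
$$V = V^\rho \ \text{for some state } \rho \in \mathcal{S}_2 \iff V_+ \ge 0 \iff V_- \ge 0. \tag{14}$$
In this case we have: $V \ge 0$, $A > 0$, $B > 0$, and
$$(\boldsymbol{u}\cdot A\boldsymbol{u})(\boldsymbol{v}\cdot B\boldsymbol{v}) \ge (\boldsymbol{u}\cdot C\boldsymbol{v})^2 + \frac{\hbar^2}{4}(\boldsymbol{u}\cdot\boldsymbol{v})^2, \qquad \forall \boldsymbol{u} \in \mathbb{R}^n, \ \forall \boldsymbol{v} \in \mathbb{R}^n. \tag{15}$$
The inequalities (14) for $V_\pm$ tell us exactly when a (positive semi-definite) real matrix V is the quantum variance matrix of position and momentum in a state $\rho$. Moreover, they are the multidimensional version of the usual uncertainty principle expressed through the variances [2,3,5], hence they represent a form of PURs. The block matrix $\Omega$ in the definition of $V_\pm$ is useful to compress formulae involving position and momentum; moreover, it makes it simpler to compare our equations with their frequent dimensionless versions (with $\hbar = 1$) in the literature [43,44].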
Proof. 
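Equivalences (14) are well known, see e.g., [3] (Section 1.1.5), [5] (Equation (2.20)), and [2] (Theorem 2). Then $V = \frac{1}{2}V_+ + \frac{1}{2}V_- \ge 0$.
By using the real block vector $\begin{pmatrix} \alpha\boldsymbol{u} \\ \beta\boldsymbol{v} \end{pmatrix}$, with arbitrary $\alpha, \beta \in \mathbb{R}$ and given $\boldsymbol{u}, \boldsymbol{v} \in \mathbb{R}^n$, the semi-positivity (14) implies
$$\begin{pmatrix} \boldsymbol{u}\cdot A\boldsymbol{u} & \boldsymbol{u}\cdot C\boldsymbol{v} \pm \frac{i\hbar}{2}\,\boldsymbol{u}\cdot\boldsymbol{v} \\ \boldsymbol{v}\cdot C^T\boldsymbol{u} \mp \frac{i\hbar}{2}\,\boldsymbol{v}\cdot\boldsymbol{u} & \boldsymbol{v}\cdot B\boldsymbol{v} \end{pmatrix} \ge 0, \qquad \forall \boldsymbol{u} \in \mathbb{R}^n, \ \forall \boldsymbol{v} \in \mathbb{R}^n,$$
which in turn implies $A \ge 0$, $B \ge 0$ and (15). Then, by choosing $\boldsymbol{u} = \boldsymbol{v} = \boldsymbol{u}_i$, where $\boldsymbol{u}_1, \ldots, \boldsymbol{u}_n$ are the eigenvectors of A (since A is a real symmetric matrix, $\boldsymbol{u}_i \in \mathbb{R}^n$ for all i), the right-hand side of (15) is strictly positive, and one gets the strict positivity of all the eigenvalues of A; analogously, one gets $B > 0$. ☐
Choosing in inequality (15) the vectors $\boldsymbol{u}$ and $\boldsymbol{v}$ to be the fixed unit directions $\mathbf{u}$ and $\mathbf{v}$ of (6), and $V = V^\rho$, gives the uncertainty rule à la Robertson [6] for the observables in (6) (a position component and a momentum component spanning an arbitrary angle $\alpha$):
$$\operatorname{Var}(\mathsf{Q}_{\mathbf{u},\rho})\operatorname{Var}(\mathsf{P}_{\mathbf{v},\rho}) \ge (\mathbf{u}\cdot C^\rho\mathbf{v})^2 + \frac{\hbar^2}{4}(\cos\alpha)^2. \tag{16}$$
Inequality (16) is equivalent to
$$V_{\mathbf{u},\mathbf{v}}^\rho \pm \frac{i\hbar}{2}\cos\alpha \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} \ge 0.$$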
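The equivalence between the Robertson-type inequality (16) and the positivity of the $2 \times 2$ complex matrix above can be checked numerically. The sketch below is only illustrative: the function names are assumptions, the inputs are taken to be non-negative variances and a real covariance, and positivity of the Hermitian matrix is tested through its diagonal entries and determinant.

```python
import math

def robertson_holds(var_q, var_p, cov, alpha, hbar=1.0):
    """Inequality (16): Var(Q_u) Var(P_v) >= cov^2 + (hbar cos(alpha) / 2)^2."""
    return var_q * var_p >= cov**2 + (hbar * math.cos(alpha) / 2.0) ** 2

def shifted_matrix_is_psd(var_q, var_p, cov, alpha, hbar=1.0):
    """Positivity of [[var_q, z], [conj(z), var_p]] with z = cov + i*hbar*cos(alpha)/2.

    A 2x2 Hermitian matrix is positive semi-definite iff both diagonal
    entries and the determinant are non-negative. The matrix with the
    opposite sign of the imaginary part is the complex conjugate, so it
    has the same spectrum and needs no separate check.
    """
    z = complex(cov, hbar * math.cos(alpha) / 2.0)
    det = var_q * var_p - (z.real**2 + z.imag**2)
    return var_q >= 0 and var_p >= 0 and det >= 0

if __name__ == "__main__":
    samples = [(1.0, 1.0, 0.0, 0.0), (0.4, 0.4, 0.0, 0.0),
               (0.4, 0.4, 0.0, math.pi / 2), (1.0, 1.0, 0.9, 0.0)]
    for s in samples:
        print(s, robertson_holds(*s), shifted_matrix_is_psd(*s))
```

On each sample the two tests agree, and the pair of variances (0.4, 0.4) becomes admissible only when the two directions are orthogonal.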
Since $V_\pm$ are block matrices, their positive semi-definiteness can be studied by means of the Schur complements [49,50,51]. However, as $V_\pm$ are complex block matrices with a very peculiar structure, special results hold for them. Before summarizing the properties of $V_\pm$ in the next proposition, we need a simple auxiliary algebraic lemma.
Lemma 1.
Let A and B be complex self-adjoint matrices such that $A \ge B > 0$. Then $\det A \ge \det B > 0$, and the equality $\det A = \det B$ holds iff $A = B$.
Proof.
Let $\lambda_i^\downarrow(A)$ and $\lambda_i^\downarrow(B)$ be the eigenvalues of A and B, respectively, arranged in decreasing order. Then, by Weyl’s inequality, $A \ge B > 0$ implies $\lambda_i^\downarrow(A) \ge \lambda_i^\downarrow(B) > 0$ for every i [52] (Section III.2). Since the determinant is the product of the eigenvalues, this gives the first statement. Moreover, if $A \ge B > 0$ and $\det A = \det B$, we get $\lambda_i^\downarrow(A) = \lambda_i^\downarrow(B)$ for every i. Then $A = B$ because $A - B \ge 0$ and $\operatorname{Tr}\{A - B\} = 0$. ☐
Proposition 2.
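Let $V = \begin{pmatrix} A & C \\ C^T & B \end{pmatrix}$ be a real symmetric $2n \times 2n$ matrix with the same dimensions as a quantum variance matrix. Then $V_+ \ge 0$ (or, equivalently, $V_- \ge 0$) if and only if $A > 0$ and
$$B \ge \left(C^T \mp \frac{i\hbar}{2}\mathbb{1}\right) A^{-1} \left(C \pm \frac{i\hbar}{2}\mathbb{1}\right) \equiv C^T A^{-1} C + \frac{\hbar^2}{4} A^{-1} \mp \frac{i\hbar}{2}\left(A^{-1}C - C^T A^{-1}\right). \tag{18}$$
In this case we have
$$B \ge C^T A^{-1} C + \frac{\hbar^2}{4} A^{-1} \ge \frac{\hbar^2}{4} A^{-1} > 0. \tag{19}$$
Moreover, we have also the following properties for the various determinants:
$$(\det A)(\det B) \ge \det V = (\det A)\det\left(B - C^T A^{-1} C\right) \ge \left(\frac{\hbar}{2}\right)^{2n}, \tag{20}$$
$$\det V = \left(\frac{\hbar}{2}\right)^{2n} \iff \begin{cases} B = C^T A^{-1} C + \frac{\hbar^2}{4} A^{-1}, \\ C A = A C^T, \end{cases} \tag{21}$$
$$(\det A)(\det B) = \left(\frac{\hbar}{2}\right)^{2n} \iff B = \frac{\hbar^2}{4} A^{-1}, \quad C = 0. \tag{22}$$
By interchanging A with B and C with $C^T$ in (18)–(22) equivalent results are obtained.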
Proof. 
Since we already know that $V_+ \ge 0$ implies the invertibility of A, the equivalence between (14) and (18) with $A > 0$ follows from [49] (Theorem 1.12 p. 34) (see also [50] (Theorem 11.6) or [51] (Lemma 3.2)).
In (19), the first inequality follows by averaging the two inequalities in (18). The last two are immediate by the positivity of $A^{-1}$.
The equality in (20) is Schur’s formula for the determinant of block matrices ([49], Theorem 1.1 p. 19). Then, the first inequality is immediate by the lemma above and the trivial relation $B \ge B - C^T A^{-1} C$; the second one follows from (19):
$$B - C^T A^{-1} C \ge \frac{\hbar^2}{4} A^{-1} \ \Longrightarrow \ \det\left(B - C^T A^{-1} C\right) \ge \det\left(\frac{\hbar^2}{4} A^{-1}\right) = \frac{(\hbar/2)^{2n}}{\det A}.$$
The equality $\det V = (\hbar/2)^{2n}$ is equivalent to $\det\left(B - C^T A^{-1} C\right) = \det\left(\frac{\hbar^2}{4} A^{-1}\right)$; since the latter two determinants are evaluated on ordered positive matrices by (19), they coincide if and only if the respective arguments are equal (Lemma 1); this shows the equivalence in (21). Then, by (18), the self-adjoint matrix $\frac{i\hbar}{2}\left(A^{-1}C - C^T A^{-1}\right)$ is both positive semi-definite and negative semi-definite; hence it is null, that is, $C A = A C^T$.
Finally, $B = \frac{\hbar^2}{4} A^{-1}$ gives $(\det A)(\det B) = (\hbar/2)^{2n}$ trivially. Conversely, $(\det A)(\det B) = (\hbar/2)^{2n}$ implies $\det B = \det\left(B - C^T A^{-1} C\right)$ by (20); since $B \ge B - C^T A^{-1} C > 0$ by (19), Lemma 1 then implies $C^T A^{-1} C = 0$ and so $C = 0$. ☐
By (18) and (19), every time three matrices A, B, C define the quantum variance matrix of a state $\rho$, the same holds for A, B, $\tilde{C} = 0$. This fact can be used to characterize when two positive matrices A and B are the diagonal blocks of some quantum variance matrix, or two positive numbers $c_Q$ and $c_P$ are the position and momentum variances of a quantum state along the two directions $\mathbf{u}$ and $\mathbf{v}$.
Proposition 3.
Two real matrices $A > 0$ and $B > 0$, having the dimension of the square of a length and momentum, respectively, are the diagonal blocks of a quantum variance matrix $V^\rho$ if and only if
$$B \ge \frac{\hbar^2}{4} A^{-1}.$$
Two real numbers $c_Q > 0$ and $c_P > 0$, having the dimension of the square of a length and momentum, respectively, are such that $c_Q = \operatorname{Var}(\mathsf{Q}_{\mathbf{u},\rho})$ and $c_P = \operatorname{Var}(\mathsf{P}_{\mathbf{v},\rho})$ for some state $\rho$ if and only if
$$c_Q\, c_P \ge \left(\frac{\hbar}{2}\cos\alpha\right)^2.$$
Proof. 
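For A and B, the necessity follows from (19). The sufficiency comes from (18) by choosing $V^\rho = \begin{pmatrix} A & 0 \\ 0 & B \end{pmatrix}$.
For $c_Q$ and $c_P$, the necessity follows from (15). The sufficiency comes from (18) with $V^\rho = \begin{pmatrix} A & 0 \\ 0 & B \end{pmatrix}$ and for example the following choices of A and B:
  • if $\cos\alpha = \pm 1$, we take $A = c_Q \mathbb{1}$ and $B = c_P \mathbb{1}$;
  • if $\cos\alpha = 0$, we let
    $$A = c_Q\, \mathbf{u}\mathbf{u}^T + \frac{\hbar^2}{4 c_P}\, \mathbf{v}\mathbf{v}^T + A_\perp, \qquad B = \frac{\hbar^2}{4 c_Q}\, \mathbf{u}\mathbf{u}^T + c_P\, \mathbf{v}\mathbf{v}^T + B_\perp,$$
    where $A_\perp$ and $B_\perp$ are any two scalar multiples of the orthogonal projection onto $\{\mathbf{u}, \mathbf{v}\}^\perp$ satisfying $B_\perp \ge \frac{\hbar^2}{4} A_\perp^{-1}$ when restricted to $\{\mathbf{u}, \mathbf{v}\}^\perp$;
  • if $\cos\alpha \notin \{0, \pm 1\}$, we choose
    $$A = c_Q \left[\mathbf{u}\mathbf{u}^T - \frac{1}{\cos\alpha}\left(\mathbf{u}\mathbf{v}^T + \mathbf{v}\mathbf{u}^T\right) + \frac{2}{(\cos\alpha)^2}\, \mathbf{v}\mathbf{v}^T\right] + A_\perp, \qquad B = \frac{c_P}{(\sin\alpha)^4} \left[\frac{2(\sin\alpha)^2 + (\cos\alpha)^4}{(\cos\alpha)^2}\, \mathbf{u}\mathbf{u}^T - \frac{1}{\cos\alpha}\left(\mathbf{u}\mathbf{v}^T + \mathbf{v}\mathbf{u}^T\right) + \mathbf{v}\mathbf{v}^T\right] + B_\perp,$$
    where $A_\perp$ and $B_\perp$ are as in the previous item.
In the last two cases, we chose A and B in such a way that, when restricted to the linear span of $\{\mathbf{u}, \mathbf{v}\}$, B is a scalar multiple of $A^{-1}$: $B = \frac{\hbar^2}{4} A^{-1}$ if $\cos\alpha = 0$, and $B = \frac{c_Q c_P}{(\cos\alpha)^2} A^{-1}$ otherwise. ☐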
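The construction for $\cos\alpha \notin \{0, \pm 1\}$ can be sanity-checked numerically in the smallest setting. The sketch below assumes n = 2 (so that the orthogonal complement of the span of the two directions is trivial and $A_\perp$, $B_\perp$ do not appear), builds A from the formula in the proof, and obtains B as the multiple $c_Q c_P/(\cos\alpha)^2$ of $A^{-1}$ by direct inversion. The function name and the returned dictionary keys are illustrative assumptions.

```python
import math

def check_prop3_construction(c_q, c_p, alpha, hbar=1.0):
    """Numerical check of the A, B construction for n = 2 (illustrative sketch)."""
    c, s = math.cos(alpha), math.sin(alpha)
    u, v = (1.0, 0.0), (c, s)  # unit vectors with u.v = cos(alpha)

    def outer(x, y):
        return [[x[i] * y[j] for j in range(2)] for i in range(2)]

    def quad(m, x):
        return sum(x[i] * m[i][j] * x[j] for i in range(2) for j in range(2))

    uu, vv, uv, vu = outer(u, u), outer(v, v), outer(u, v), outer(v, u)
    # A = c_Q [u u^T - (u v^T + v u^T)/cos(alpha) + 2 v v^T / cos(alpha)^2]
    A = [[c_q * (uu[i][j] - (uv[i][j] + vu[i][j]) / c + 2.0 * vv[i][j] / c**2)
          for j in range(2)] for i in range(2)]
    det_a = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    a_inv = [[A[1][1] / det_a, -A[0][1] / det_a],
             [-A[1][0] / det_a, A[0][0] / det_a]]
    # B is taken proportional to the inverse of A on the span of u and v
    kappa = c_q * c_p / c**2
    B = [[kappa * a_inv[i][j] for j in range(2)] for i in range(2)]
    return {
        "u.Au": quad(A, u),               # should equal c_q
        "v.Bv": quad(B, v),               # should equal c_p
        "det A": det_a,                   # positive, so A > 0 (A[0][0] = c_q > 0)
        "kappa - hbar^2/4": kappa - hbar**2 / 4.0,  # >= 0 iff B >= (hbar^2/4) A^{-1}
    }

if __name__ == "__main__":
    print(check_prop3_construction(2.0, 0.5, math.pi / 3))
```

Since $B - \frac{\hbar^2}{4}A^{-1}$ is then a multiple of the positive matrix $A^{-1}$, the last entry is non-negative exactly when $c_Q c_P \ge (\frac{\hbar}{2}\cos\alpha)^2$, which is the condition of the proposition.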

2.3. Weyl Operators and Gaussian States

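In the following, we shall introduce Gaussian states, Gaussian observables and covariant observables on the phase-space. In all these instances, the Weyl operators are involved; here we recall their definition and some properties (see e.g., [4] (Section 5.2) or [5] (Section 12.2), where, however, the definition differs from ours in that the Weyl operators are composed with the map $\Omega^{-1}$ of (13)).
Definition 1.
The Weyl operators are the unitary operators defined by
$$W(\boldsymbol{x}, \boldsymbol{p}) := \exp\left\{\frac{i}{\hbar}\left(\boldsymbol{p}\cdot\mathbf{Q} - \boldsymbol{x}\cdot\mathbf{P}\right)\right\} = \prod_{j=1}^n e^{\frac{i}{\hbar}\left(p_j Q_j - x_j P_j\right)} = \prod_{j=1}^n e^{\frac{i}{\hbar} p_j Q_j}\, e^{-\frac{i}{\hbar} x_j P_j}\, e^{-\frac{i}{2\hbar} x_j p_j}. \tag{23}$$
The Weyl operators (23) satisfy the composition rule
$$W(\boldsymbol{x}_1, \boldsymbol{p}_1)\, W(\boldsymbol{x}_2, \boldsymbol{p}_2) = \exp\left\{-\frac{i}{2\hbar}\left(\boldsymbol{x}_1\cdot\boldsymbol{p}_2 - \boldsymbol{x}_2\cdot\boldsymbol{p}_1\right)\right\} W(\boldsymbol{x}_1 + \boldsymbol{x}_2, \boldsymbol{p}_1 + \boldsymbol{p}_2);$$
in particular, this implies the commutation relation
$$W(\boldsymbol{x}_1, \boldsymbol{p}_1)\, W(\boldsymbol{x}_2, \boldsymbol{p}_2) = \exp\left\{\frac{i}{\hbar}\begin{pmatrix} \boldsymbol{x}_1^T & \boldsymbol{p}_1^T \end{pmatrix} \Omega^{-1} \begin{pmatrix} \boldsymbol{x}_2 \\ \boldsymbol{p}_2 \end{pmatrix}\right\} W(\boldsymbol{x}_2, \boldsymbol{p}_2)\, W(\boldsymbol{x}_1, \boldsymbol{p}_1).$$
These commutation relations imply the translation property
$$W(\boldsymbol{x}, \boldsymbol{p})^* Q_i\, W(\boldsymbol{x}, \boldsymbol{p}) = Q_i + x_i, \qquad W(\boldsymbol{x}, \boldsymbol{p})^* P_i\, W(\boldsymbol{x}, \boldsymbol{p}) = P_i + p_i, \qquad \forall i = 1, \ldots, n;$$
due to this property, the Weyl operators are also known as displacement operators.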
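The phase bookkeeping in the composition and commutation rules can be cross-checked with a few lines of arithmetic. The sketch below assumes the convention $W(\boldsymbol{x}, \boldsymbol{p}) = \exp\{\frac{i}{\hbar}(\boldsymbol{p}\cdot\mathbf{Q} - \boldsymbol{x}\cdot\mathbf{P})\}$ and $\Omega^{-1} = \begin{pmatrix} 0 & -\mathbb{1} \\ \mathbb{1} & 0 \end{pmatrix}$; the helper names are hypothetical, and only the scalar phases are handled, not the operators themselves.

```python
def symplectic(z1, z2):
    """Bilinear form z1^T Omega^{-1} z2 = p1.x2 - x1.p2 for z = (x, p)."""
    (x1, p1), (x2, p2) = z1, z2
    dot = lambda a, b: sum(ai * bi for ai, bi in zip(a, b))
    return dot(p1, x2) - dot(x1, p2)

def composition_phase(z1, z2, hbar=1.0):
    """Angle theta such that W(z1) W(z2) = exp(i * theta) W(z1 + z2)."""
    return symplectic(z1, z2) / (2.0 * hbar)

def commutation_phase(z1, z2, hbar=1.0):
    """Angle phi such that W(z1) W(z2) = exp(i * phi) W(z2) W(z1)."""
    return composition_phase(z1, z2, hbar) - composition_phase(z2, z1, hbar)

def add(z1, z2):
    """Component-wise sum of two phase-space points."""
    return (tuple(a + b for a, b in zip(z1[0], z2[0])),
            tuple(a + b for a, b in zip(z1[1], z2[1])))

if __name__ == "__main__":
    z1, z2 = ((1.0, 0.0), (0.5, 2.0)), ((0.3, -1.0), (1.5, 0.2))
    print(commutation_phase(z1, z2), symplectic(z1, z2))
```

With `hbar=1.0` the two printed numbers coincide, which is the statement that the commutation phase is the symplectic form divided by $\hbar$; the antisymmetry of the form is what makes the phase of the reversed product opposite.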
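With a slight abuse of notation, we shall sometimes use the identification
$$W(\boldsymbol{x}, \boldsymbol{p}) \equiv W\begin{pmatrix} \boldsymbol{x} \\ \boldsymbol{p} \end{pmatrix},$$
where $\begin{pmatrix} \boldsymbol{x} \\ \boldsymbol{p} \end{pmatrix}$ is a block column vector belonging to the phase-space $\mathbb{R}^n \times \mathbb{R}^n \equiv \mathbb{R}^{2n}$; here, the first block $\boldsymbol{x}$ is a position and the second block $\boldsymbol{p}$ is a momentum.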
By means of the Weyl operators, it is possible to define the characteristic function of any trace-class operator.
Definition 2.
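For any operator $\rho \in \mathcal{T}(\mathcal{H})$, its characteristic function is the complex valued function $\hat{\rho} : \mathbb{R}^{2n} \to \mathbb{C}$ defined by
$$\hat{\rho}(w) := \operatorname{Tr}\left\{\rho\, W(-\hbar\,\Omega w)\right\}, \qquad w \equiv \begin{pmatrix} \boldsymbol{k} \\ \boldsymbol{l} \end{pmatrix}.$$
Note that $\boldsymbol{k}$ is the inverse of a length and $\boldsymbol{l}$ is the inverse of a momentum, so that w is a block vector living in the space $\mathbb{R}^{2n} \equiv \mathbb{R}^n \times \mathbb{R}^n$ regarded as the dual of the phase-space.
Instead of the characteristic function, sometimes the so called Weyl transform $\operatorname{Tr}\{W(\boldsymbol{x}, \boldsymbol{p})\rho\}$ is introduced [4,44].
By [4] (Proposition 5.3.2, Theorem 5.3.3), we have $\hat{\rho} \in L^2(\mathbb{R}^{2n})$ and the following trace formula holds: $\forall \rho, \sigma \in \mathcal{T}(\mathcal{H})$,
$$\operatorname{Tr}\{\sigma^*\rho\} = \left(\frac{\hbar}{2\pi}\right)^n \int_{\mathbb{R}^{2n}} \overline{\hat{\sigma}(w)}\, \hat{\rho}(w)\, dw. \tag{28}$$
As a corollary [4] (Corollary 5.3.4), we have that a state $\rho \in \mathcal{S}$ is pure if and only if
$$\left(\frac{\hbar}{2\pi}\right)^n \int_{\mathbb{R}^{2n}} \left|\hat{\rho}(w)\right|^2 dw = 1.$$
By [53] (Lemma 3.1) or [26] (Proposition 8.5.(e)), the trace formula also implies
$$\frac{1}{(2\pi\hbar)^n} \int_{\mathbb{R}^{2n}} W(\boldsymbol{x}, \boldsymbol{p})\, \rho\, W(\boldsymbol{x}, \boldsymbol{p})^*\, d\boldsymbol{x}\, d\boldsymbol{p} = \operatorname{Tr}\{\rho\}\, \mathbb{1}, \qquad \forall \rho \in \mathcal{T}(\mathcal{H}).$$
Moreover, the following inversion formula ensures that the characteristic function $\hat{\rho}$ completely characterizes the state $\rho$ [4] (Corollary 5.3.5):
$$\rho = \left(\frac{\hbar}{2\pi}\right)^n \int_{\mathbb{R}^{2n}} W(\hbar\,\Omega w)\, \hat{\rho}(w)\, dw, \qquad \forall \rho \in \mathcal{T}(\mathcal{H}).$$
The last two integrals are defined in the weak operator topology.
Finally, for $\rho \in \mathcal{S}_2$, the moments (7)–(10) can be expressed as in [4] (Section 5.4):
$$-i\, \frac{\partial \hat{\rho}(w)}{\partial w_i}\bigg|_{w=0} = \mu_i^\rho, \qquad -\frac{\partial^2 \hat{\rho}(w)}{\partial w_i\, \partial w_j}\bigg|_{w=0} = V_{ij}^\rho + \mu_i^\rho \mu_j^\rho. \tag{30}$$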
Definition 3 
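([2,3,4,5,44,48]). A state $\rho$ is Gaussian if
$$\hat{\rho}(w) = \exp\left\{i\, w^T \mu^\rho - \frac{1}{2}\, w^T V^\rho w\right\} = \exp\left\{i\left(\boldsymbol{k}\cdot\boldsymbol{a}^\rho + \boldsymbol{l}\cdot\boldsymbol{b}^\rho\right) - \frac{1}{2}\left(\boldsymbol{k}\cdot A^\rho\boldsymbol{k} + \boldsymbol{l}\cdot B^\rho\boldsymbol{l}\right) - \boldsymbol{k}\cdot C^\rho\boldsymbol{l}\right\}, \tag{31}$$
for a vector $\mu^\rho \in \mathbb{R}^{2n}$ and a real $2n \times 2n$ matrix $V^\rho$ such that $V_+^\rho \ge 0$.
The condition $V_+^\rho \ge 0$ is necessary and sufficient in order that the function (31) defines the characteristic function of a quantum state [4] (Theorem 5.5.1), [5] (Theorem 12.17). Therefore, Gaussian states are exactly the states whose characteristic function is the exponential of a second order polynomial [4] (Equation (5.5.49)), [5] (Equation (12.80)).
We shall denote by $\mathcal{G}$ the set of the Gaussian states; we have $\mathcal{G} \subset \mathcal{S}_2 \subset \mathcal{S}$. By (30), the vectors $\boldsymbol{a}^\rho$, $\boldsymbol{b}^\rho$ and the matrices $A^\rho$, $B^\rho$, $C^\rho$ characterizing a Gaussian state $\rho$ are just its first and second order quantum moments introduced in (7)–(9). By (31), the corresponding distributions of position and momentum are Gaussian, namely
$$\mathsf{Q}_\rho = \mathcal{N}(\boldsymbol{a}^\rho; A^\rho), \quad \mathsf{Q}_{\mathbf{u},\rho} = \mathcal{N}(\mathbf{u}\cdot\boldsymbol{a}^\rho; \mathbf{u}\cdot A^\rho\mathbf{u}), \quad \mathsf{P}_\rho = \mathcal{N}(\boldsymbol{b}^\rho; B^\rho), \quad \mathsf{P}_{\mathbf{v},\rho} = \mathcal{N}(\mathbf{v}\cdot\boldsymbol{b}^\rho; \mathbf{v}\cdot B^\rho\mathbf{v}).$$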
Proposition 4 (Pure Gaussian states).
For $\rho \in \mathcal{G}$, we have $\det V^\rho = \left(\frac{\hbar}{2}\right)^{2n}$ if and only if $\rho$ is pure.
Proof.
The trace formula (28) and (31) give $\operatorname{Tr}\{\rho^2\} = \dfrac{(\hbar/2)^n}{\sqrt{\det V^\rho}}$, and this implies the statement. ☐
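For a single degree of freedom (n = 1) the purity formula used in the proof reduces to elementary arithmetic on the three second moments. The sketch below is an illustration under that assumption; the function name and the small tolerance used in the admissibility check are choices made here, not part of the paper.

```python
import math

def gaussian_purity_1d(var_q, var_p, cov, hbar=1.0):
    """Tr{rho^2} = (hbar/2) / sqrt(det V) for a one-mode Gaussian state.

    var_q, var_p : position and momentum variances
    cov          : symmetrized 'quantum covariance' C
    """
    det_v = var_q * var_p - cov**2
    # For n = 1 the condition V_+ >= 0 reads det V >= hbar^2 / 4
    # (a tiny relative tolerance absorbs rounding at the boundary)
    if var_q <= 0 or var_p <= 0 or det_v < (hbar / 2.0) ** 2 * (1.0 - 1e-12):
        raise ValueError("not the variance matrix of a quantum state")
    return (hbar / 2.0) / math.sqrt(det_v)

if __name__ == "__main__":
    print(gaussian_purity_1d(0.5, 0.5, 0.0))   # minimum uncertainty state
    print(gaussian_purity_1d(1.0, 0.5, 0.5))   # pure state with correlations
    print(gaussian_purity_1d(1.0, 1.0, 0.0))   # mixed state
```

The second example has $\operatorname{Var}(Q)\operatorname{Var}(P) > \hbar^2/4$ and is nevertheless pure, because purity is governed by $\det V^\rho$ rather than by the product of the variances alone.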
Proposition 5 (Minimum uncertainty states).
For $\rho \in \mathcal{S}_2$, we have $(\det A^\rho)(\det B^\rho) = \left(\frac{\hbar}{2}\right)^{2n}$ if and only if $\rho$ is a pure Gaussian state and it factorizes into the product of minimum uncertainty states up to a rotation of $\mathbb{R}^n$.
Proof.
If $(\det A^\rho)(\det B^\rho) = \left(\frac{\hbar}{2}\right)^{2n}$, then the equivalence (22) gives $B^\rho = \frac{\hbar^2}{4}(A^\rho)^{-1}$, so that the variance matrices $A^\rho$ and $B^\rho$ have a common eigenbasis $\mathbf{u}_1, \ldots, \mathbf{u}_n$. Thus, all the corresponding pairs of position $Q_{\mathbf{u}_i}$ and momentum $P_{\mathbf{u}_i}$ have minimum uncertainties: $\operatorname{Var}(\mathsf{Q}_{\mathbf{u}_i,\rho})\operatorname{Var}(\mathsf{P}_{\mathbf{u}_i,\rho}) = \frac{\hbar^2}{4}$. Therefore, if we consider the factorization of the Hilbert space $\mathcal{H} = \mathcal{H}_1 \otimes \cdots \otimes \mathcal{H}_n$ corresponding to the basis $\mathbf{u}_1, \ldots, \mathbf{u}_n$, all the partial traces of the state $\rho$ on each factor $\mathcal{H}_i$ are minimum uncertainty states. Since for $n = 1$ the minimum uncertainty states are pure and Gaussian, the state $\rho$ is a pure product Gaussian state.
The converse is immediate. ☐

3. Relative and Differential Entropies

In this paper, we will be concerned with entropic quantities of classical type [54,55,56]. We express them in ‘bits’, that is, we use base-2 logarithms: $\log a \equiv \log_2 a$.
We deal only with probabilities on the measurable space $\left(\mathbb{R}^n, \mathcal{B}(\mathbb{R}^n)\right)$ which admit densities with respect to the Lebesgue measure. So, we define the relative entropy and differential entropy only for such probabilities; moreover, we list only the general properties used in the following.

3.1. Relative Entropy or Kullback-Leibler Divergence

The fundamental quantity is the relative entropy, also called information divergence, discrimination information, or Kullback-Leibler divergence (or information, distance, discrepancy). The relative entropy of a probability p with respect to a probability q is defined for any pair of probabilities p, q on the same probability space.
Given two probabilities p and q on $\left(\mathbb{R}^n, \mathcal{B}(\mathbb{R}^n)\right)$ with densities f and g, respectively, the relative entropy of p with respect to q is
$$S(p\|q) = \int_{\mathbb{R}^n} f(\boldsymbol{x}) \log\frac{f(\boldsymbol{x})}{g(\boldsymbol{x})}\, d\boldsymbol{x}. \tag{33}$$
The value $+\infty$ is allowed for $S(p\|q)$; the usual convention $0\log(0/0) = 0$ is understood. The relative entropy (33) is the amount of information that is lost when q is used to approximate p [54] (p. 51). Of course, if $\boldsymbol{x}$ is dimensioned, then the densities f and g have the same dimension (that is, the inverse of the dimension of $d\boldsymbol{x}$), and the argument of the logarithm is dimensionless, as it must be.
Proposition 6 
([56], Theorem 8.6.1). The following properties hold.
(i) $S(p\|q) \ge 0$.
(ii) $S(p\|q) = 0 \iff p = q \iff f = g$ a.e.
(iii) $S(p\|q)$ is invariant under a change of the unit of measurement.
(iv) If $p = \mathcal{N}(\boldsymbol{a}; A)$ and $q = \mathcal{N}(\boldsymbol{b}; B)$ with invertible variance matrices A and B, then
$$2\, S(p\|q) = (\log e)\left\{(\boldsymbol{a} - \boldsymbol{b})\cdot B^{-1}(\boldsymbol{a} - \boldsymbol{b}) + \operatorname{Tr}\left\{B^{-1}A - \mathbb{1}\right\}\right\} + \log\frac{\det B}{\det A}.$$
As $S(p\|q)$ is scale invariant, it quantifies a relative error for the use of q as an approximation of p, not an absolute one.
Let us employ the relative entropy to evaluate the effect of an additive Gaussian noise $\nu \sim \mathcal{N}(b; \beta^2)$ on an independent Gaussian random variable X. If $X \sim \mathcal{N}(a; \alpha^2)$, then $X + \nu \sim \mathcal{N}(a + b; \alpha^2 + \beta^2)$, and the relative entropy of the true distribution of X with respect to its disturbed version $X + \nu$ is
$$S(X \| X + \nu) = \frac{\log e}{2}\, \frac{b^2 - \beta^2}{\alpha^2 + \beta^2} + \frac{1}{2}\log\frac{\alpha^2 + \beta^2}{\alpha^2}.$$
This expression vanishes if the noise becomes negligible with respect to the true distribution, that is if $\beta^2/\alpha^2 \to 0$ and $b^2/\alpha^2 \to 0$. On the other hand, $S(X \| X + \nu)$ diverges if the noise becomes too strong with respect to the true distribution, or, in other words, if the true distribution becomes too peaked with respect to the noise, that is, $\beta^2/\alpha^2 \to +\infty$ or $b^2/\alpha^2 \to +\infty$.
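The one-dimensional case of property (iv) and the additive-noise formula are easy to evaluate and compare numerically. The following is a minimal sketch; the function names `kl_gauss_1d` and `noise_information_loss` are hypothetical, and all entropies are in bits.

```python
import math

LOG2_E = math.log2(math.e)

def kl_gauss_1d(a, var_a, b, var_b):
    """S(N(a; var_a) || N(b; var_b)) in bits: property (iv) with n = 1."""
    if var_a <= 0 or var_b <= 0:
        raise ValueError("variances must be positive")
    return 0.5 * (LOG2_E * ((a - b) ** 2 / var_b + var_a / var_b - 1.0)
                  + math.log2(var_b / var_a))

def noise_information_loss(alpha2, b, beta2):
    """S(X || X + nu) for X ~ N(a; alpha2) and independent noise nu ~ N(b; beta2)."""
    return (0.5 * LOG2_E * (b**2 - beta2) / (alpha2 + beta2)
            + 0.5 * math.log2((alpha2 + beta2) / alpha2))

if __name__ == "__main__":
    # The closed form agrees with property (iv) applied to X and X + nu
    print(noise_information_loss(2.0, 0.3, 0.5), kl_gauss_1d(1.0, 2.0, 1.3, 2.5))
    # Scale invariance: rescaling the variable by 3 leaves the value unchanged
    print(kl_gauss_1d(0.0, 1.0, 1.0, 2.0), kl_gauss_1d(0.0, 9.0, 3.0, 18.0))
```

The mean a of X does not enter `noise_information_loss` because only the shift b between the two means matters.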

3.2. Differential Entropy

The differential entropy of an absolutely continuous random vector $\boldsymbol{X}$ with a probability density f is
$$H(\boldsymbol{X}) := -\int_{\mathbb{R}^n} f(\boldsymbol{x}) \log f(\boldsymbol{x})\, d\boldsymbol{x}.$$
This quantity is commonly used in the literature, even if it lacks many of the nice properties of the Shannon entropy for discrete random variables. For example, $H(\boldsymbol{X})$ is not scale invariant, and it can be negative [56] (p. 244).
Since the density f enters in the logarithm argument, the definition of $H(\boldsymbol{X})$ is meaningful only when f is dimensionless, which is the same as $\boldsymbol{X}$ being dimensionless. Note that, if $\boldsymbol{X}$ is dimensioned and $c > 0$ is a real parameter making $\tilde{\boldsymbol{X}} = c\boldsymbol{X}$ a dimensionless random variable, then
$$H(\tilde{\boldsymbol{X}}) = -\int_{\mathbb{R}^n} \frac{f(\boldsymbol{u}/c)}{c^n} \log\frac{f(\boldsymbol{u}/c)}{c^n}\, d\boldsymbol{u} = -\int_{\mathbb{R}^n} f(\boldsymbol{x}) \log\frac{f(\boldsymbol{x})}{c^n}\, d\boldsymbol{x}.$$
In the following, we shall consider the differential entropy only for dimensionless random vectors $\boldsymbol{X}$.
Proposition 7 ([56], Section 8.6). The following properties hold.
(i) If $X$ is an absolutely continuous random vector with variance matrix $A$, then
$$H(X) \leq \frac{1}{2}\log\left[(2\pi e)^n\det A\right] = \frac{n}{2}\log(2\pi e) + \frac{1}{2}\operatorname{Tr}\log A.$$
The equality holds iff $X$ is Gaussian with variance matrix $A$ and arbitrary mean vector $a$.
(ii) If $X = (X_1,\ldots,X_n)$ is an absolutely continuous random vector, then
$$H(X) \leq \sum_{i=1}^n H(X_i).$$
The equality holds iff the components $X_1,\ldots,X_n$ are independent.
Remark 1.
In property (i) we have used the following well-known matrix identity, which follows by diagonalization:
$$\log\det A = \operatorname{Tr}\log A, \qquad \forall A > 0.$$
Remark 2.
Property (i) yields that the differential entropy of a Gaussian random variable $X \sim N(a;\alpha^2)$ is
$$H(X) = \frac{1}{2}\log\left(2\pi e\,\alpha^2\right),$$
which is an increasing function of the variance $\alpha^2$, and thus it is a measure of the uncertainty of $X$. Note that $H(X) \geq 0$ iff $\alpha^2 \geq 1/(2\pi e)$.
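The two features just mentioned (possible negativity and lack of scale invariance) can be seen in a short Python sketch. It assumes base-2 logarithms and a scalar Gaussian; the function name and the sample values are illustrative only.

```python
import math

def gaussian_entropy_bits(var):
    """H(X) = (1/2) log2(2*pi*e*var) for a scalar Gaussian with variance var."""
    return 0.5 * math.log2(2 * math.pi * math.e * var)

# Variance at which the differential entropy changes sign.
threshold = 1.0 / (2 * math.pi * math.e)

# Rescaling X -> c*X multiplies the variance by c**2 and shifts H by log2(c),
# the n = 1 case of H(cX) = H(X) + n*log(c).
c, var = 10.0, 0.7   # hypothetical values
shift = gaussian_entropy_bits(c ** 2 * var) - gaussian_entropy_bits(var)
```

Here `gaussian_entropy_bits(threshold)` is zero up to rounding, smaller variances give negative entropies, and `shift` equals `math.log2(c)`.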

4. Entropic PURs for Position and Momentum

The idea of having an entropic formulation of the PURs for position and momentum goes back to [7,8,9]. However, we have just seen that, due to the presence of the logarithm, the Shannon differential entropy needs dimensionless probability densities. So, this leads us to introduce dimensionless versions of position and momentum.
Let $\lambda > 0$ be a dimensionless parameter and $\varkappa$ a second parameter with the dimension of a mass times a frequency. Then, we introduce the dimensionless versions of position and momentum:
$$\tilde{Q} := \sqrt{\frac{\varkappa}{\hbar}}\,Q, \qquad \tilde{P} := \frac{\lambda}{\sqrt{\hbar\varkappa}}\,P \qquad\Rightarrow\qquad \left[\tilde{Q}_i,\tilde{P}_j\right] = i\lambda\,\delta_{ij}.$$
We use a single dimensional constant $\varkappa$, in order to respect rotation symmetry and not to distinguish between different particles. In any case, there is no natural link between the parameter multiplying $Q$ and the parameter multiplying $P$; this is the reason for introducing $\lambda$. As we see from the commutation rules, the constant $\lambda$ plays the role of a dimensionless version of $\hbar$; in the literature on PURs, $\lambda = 1$ is often used [8,9,12,46].

4.1. Vector Observables

Let $\tilde{\mathsf{Q}}$ and $\tilde{\mathsf{P}}$ be the pvm's of $\tilde{Q}$ and $\tilde{P}$; then, $\tilde{\mathsf{Q}}_\rho$ and $\tilde{\mathsf{P}}_\rho$ are their probability distributions in the state $\rho$. The total preparation uncertainty is quantified by the sum of the two differential entropies $H(\tilde{\mathsf{Q}}_\rho) + H(\tilde{\mathsf{P}}_\rho)$. For $\rho \in \mathcal{G}$, by Proposition 7 we get
$$H(\tilde{\mathsf{Q}}_\rho) + H(\tilde{\mathsf{P}}_\rho) = n\log(\pi e\lambda) + \frac{1}{2}\log\left[\left(\frac{4}{\hbar^2}\right)^{n}\det A^\rho\,\det B^\rho\right]. \tag{36}$$
In the case of product states of minimum uncertainty, we have $\det A^\rho\,\det B^\rho = \left(\hbar^2/4\right)^n$; then, by taking (20) into account, we get
$$\inf_{\rho\in\mathcal{G}}\left\{H(\tilde{\mathsf{Q}}_\rho) + H(\tilde{\mathsf{P}}_\rho)\right\} = n\log(\pi e\lambda). \tag{37}$$
Thus, the bound (37) arises from quantum relations between $Q$ and $P$; indeed, there would be no lower bound for (36) if we could take both $\det A^\rho$ and $\det B^\rho$ arbitrarily small.
By item (ii) of Proposition 7, the differential entropy for the distribution of a random vector is not larger than the sum of the entropies of its marginals; however, the final bound (37) is a tight bound for both $H(\tilde{\mathsf{Q}}_\rho) + H(\tilde{\mathsf{P}}_\rho)$ and $\sum_{i=1}^n H(\tilde{\mathsf{Q}}_{i,\rho}) + \sum_{i=1}^n H(\tilde{\mathsf{P}}_{i,\rho})$.
By the results of [8,9], the same bound (37) is obtained even if the minimization is done over all the states, not only the Gaussian ones.
The uncertainty result (37) depends on $\lambda$, this being a consequence of the lack of scale invariance of the differential entropy; note that the bound is positive if and only if $\lambda > 1/(\pi e)$. Sometimes in the literature the parameter $\hbar$ appears in the argument of the logarithm [27,30]; this fact has to be interpreted as the appearance of a parameter with the numerical value of $\hbar$, but without dimensions. In this sense the formulation (37) is consistent with both the cases $\lambda = 1$ and $\lambda = \hbar$. Sometimes the smaller bound $\ln 2\pi$ appears in place of $\log \pi e$ [10]; this is connected to a state dependent formulation of the entropic PUR [12] (Section V.B).

4.2. Scalar Observables

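The dimensionless versions of the scalar observables introduced in (6) are
$$\tilde{Q}_u = \sqrt{\frac{\varkappa}{\hbar}}\,Q_u, \qquad \tilde{P}_v = \frac{\lambda}{\sqrt{\hbar\varkappa}}\,P_v \qquad\Rightarrow\qquad \left[\tilde{Q}_u,\tilde{P}_v\right] = i\lambda\cos\alpha.$$
We denote by $\tilde{\mathsf{Q}}_{u,\rho}$ and $\tilde{\mathsf{P}}_{v,\rho}$ the associated distributions in the state $\rho$. For $\rho \in \mathcal{S}_2$, the respective means and variances are
$$\sqrt{\frac{\varkappa}{\hbar}}\;u\cdot a^\rho, \qquad \frac{\lambda}{\sqrt{\hbar\varkappa}}\;v\cdot b^\rho, \qquad \operatorname{Var}(\tilde{\mathsf{Q}}_{u,\rho}) = \frac{\varkappa}{\hbar}\,u\cdot A^\rho u, \qquad \operatorname{Var}(\tilde{\mathsf{P}}_{v,\rho}) = \frac{\lambda^2}{\hbar\varkappa}\,v\cdot B^\rho v,$$
with $\sqrt{\operatorname{Var}(\tilde{\mathsf{Q}}_{u,\rho})\operatorname{Var}(\tilde{\mathsf{P}}_{v,\rho})} \geq \lambda\left|\cos\alpha\right|/2$.
As in the vector case, the total preparation uncertainty is quantified by the sum of the two differential entropies $H(\tilde{\mathsf{Q}}_{u,\rho}) + H(\tilde{\mathsf{P}}_{v,\rho})$. For $\rho \in \mathcal{G}$, Proposition 7 gives
$$H(\tilde{\mathsf{Q}}_{u,\rho}) + H(\tilde{\mathsf{P}}_{v,\rho}) = \log\left(2\pi e\sqrt{\operatorname{Var}(\tilde{\mathsf{Q}}_{u,\rho})\operatorname{Var}(\tilde{\mathsf{P}}_{v,\rho})}\right). \tag{39}$$
Then, we have the lower bound
$$\inf_{\rho\in\mathcal{G}}\left\{H(\tilde{\mathsf{Q}}_{u,\rho}) + H(\tilde{\mathsf{P}}_{v,\rho})\right\} = \log\left(\pi e\lambda\left|\cos\alpha\right|\right) = \frac{1+\ln\left(\pi\lambda\left|\cos\alpha\right|\right)}{\ln 2}, \tag{40}$$
which depends on $\lambda$, but not on $\varkappa$. Of course, because of (39), for Gaussian states a lower bound for the sum $H(\tilde{\mathsf{Q}}_{u,\rho}) + H(\tilde{\mathsf{P}}_{v,\rho})$ is equivalent to a lower bound for the product $\operatorname{Var}(\tilde{\mathsf{Q}}_{u,\rho})\operatorname{Var}(\tilde{\mathsf{P}}_{v,\rho})$. By the generalization of the results of [8,9] given in [46], the bound (40) is obtained also when the minimization is done over all the states.
Let us note that the bound in (40) is positive for $\lambda\left|\cos\alpha\right| > 1/(\pi e)$, and it goes to $-\infty$ for $\alpha \to \pi/2$, which is the case of compatible $\mathsf{Q}_{u,\rho}$ and $\mathsf{P}_{v,\rho}$. In the case $\alpha = 0$, the bound (40) is the same as (37) for $n = 1$.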
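The relation between (39) and (40) can be checked numerically with a small Python sketch. It assumes base-2 logarithms and simply plugs in dimensionless variances that saturate the variance bound; whether a given state realizes those variances is not checked here, and all names and sample values are hypothetical.

```python
import math

def entropy_sum_bits(var_q, var_p):
    """Formula (39): log2(2*pi*e*sqrt(var_q*var_p)) for dimensionless Gaussian marginals."""
    return math.log2(2 * math.pi * math.e * math.sqrt(var_q * var_p))

def pur_bound_bits(lam, alpha):
    """First form of (40): log2(pi*e*lam*|cos(alpha)|)."""
    return math.log2(math.pi * math.e * lam * abs(math.cos(alpha)))

def pur_bound_bits_alt(lam, alpha):
    """Second form of (40): (1 + ln(pi*lam*|cos(alpha)|)) / ln 2."""
    return (1 + math.log(math.pi * lam * abs(math.cos(alpha)))) / math.log(2)

lam, alpha = 1.0, math.pi / 3          # hypothetical parameters
var_q = 0.8                            # arbitrary dimensionless position variance
var_p = (lam * abs(math.cos(alpha)) / 2) ** 2 / var_q   # saturates the variance bound
saturated = entropy_sum_bits(var_q, var_p)
```

With these inputs `saturated` coincides with `pur_bound_bits(lam, alpha)`, and any larger variance product gives a strictly larger entropy sum.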

5. Approximate Joint Measurements of Position and Momentum

In order to deal with MURs for position and momentum observables, we have to introduce the class of approximate joint measurements of position and momentum, whose marginals we will compare with the respective sharp observables. As done in [3,4,18,57], it is natural to characterize such a class by requiring suitable properties of covariance under the group of space translations and velocity boosts: namely, by approximate joint measurement of position and momentum we will mean any POVM on the product space of the position and momentum outcomes sharing the same covariance properties as the two target sharp observables. As we have already discussed, two approximation problems concern us: the approximation of the position and momentum vectors (vector case, with outcomes in the phase-space $\mathbb{R}^n\times\mathbb{R}^n$), and the approximation of one position and one momentum component along two arbitrary directions (scalar case, with outcomes in $\mathbb{R}\times\mathbb{R}$). In order to treat the two cases together, we consider POVMs with outcomes in $\mathbb{R}^m\times\mathbb{R}^m = \mathbb{R}^{2m}$, which we call bi-observables; they correspond to a measurement of $m$ position components and $m$ momentum components. The specific covariance requirements will be given in Definitions 5–7.
In studying the properties of probability measures on $\mathbb{R}^k$, a very useful notion is that of the characteristic function, that is, the Fourier cotransform of the measure at hand; the analogous quantity for POVMs turns out to have the same relevance. Different names have been used in the literature to refer to the characteristic function of POVMs, or, more generally, quantum instruments, such as characteristic operator or operator characteristic function [3,24,34,44,58,59,60,61,62]. As a variant, the symplectic Fourier transform also appears quite often [5] (Section 12.4.3). The characteristic function has been used, for instance, to study the quantum analogues of the infinite-divisible distributions [3,34,58,59,60,62] and measurements of Gaussian type [5,44,61]. Here, we are interested only in the latter application, as our approximating bi-observables will typically be Gaussian. Since we deal with bi-observables, we limit our definition of the characteristic function only to POVMs on $\mathbb{R}^m\times\mathbb{R}^m$, which have the same number of variables of position and momentum type.
Being measures, POVMs can be used to construct integrals, whose theory is presented e.g., in [26] (Section 4.8) and [4] (Section 2.9, Proposition 2.9.1).
Definition 4.
Given a bi-observable $\mathsf{M}: \mathcal{B}(\mathbb{R}^{2m}) \to \mathcal{L}(\mathcal{H})$, the characteristic function of $\mathsf{M}$ is the operator valued function $\hat{\mathsf{M}}: \mathbb{R}^{2m} \to \mathcal{L}(\mathcal{H})$, with
$$\hat{\mathsf{M}}(k,l) = \int_{\mathbb{R}^{2m}} e^{i(k\cdot x + l\cdot p)}\,\mathsf{M}(dx\,dp). \tag{41}$$
In this definition the dimensions of the vector variables $k$ and $l$ are the inverses of a length and momentum, respectively, as in the definition of the characteristic function of a state (27). This definition is given so that $\operatorname{Tr}\{\hat{\mathsf{M}}(k,l)\rho\}$ is the usual characteristic function of the probability distribution $\mathsf{M}_\rho$ on $\mathbb{R}^{2m}$.

5.1. Covariant Vector Observables

In terms of the pvm's (4), the translation property (25) is equivalent to the symmetry properties
$$W(x,p)\,\mathsf{Q}(A)\,W(x,p)^* = \mathsf{Q}(A+x), \qquad W(x,p)\,\mathsf{P}(B)\,W(x,p)^* = \mathsf{P}(B+p), \qquad \forall A,B \in \mathcal{B}(\mathbb{R}^n),$$
and they are taken as the transformation property defining the following class of POVMs on $\mathbb{R}^{2n}$ [23,26,44,53,57].
Definition 5.
A covariant phase-space observable is a bi-observable $\mathsf{M}: \mathcal{B}(\mathbb{R}^{2n}) \to \mathcal{L}(\mathcal{H})$ satisfying the covariance relation
$$W(x,p)\,\mathsf{M}(Z)\,W(x,p)^* = \mathsf{M}\left(Z + \begin{pmatrix} x \\ p \end{pmatrix}\right), \qquad \forall Z \in \mathcal{B}(\mathbb{R}^{2n}), \quad \forall x,p \in \mathbb{R}^n.$$
We denote by $\mathcal{C}$ the set of all the covariant phase-space observables.
The interpretation of covariant phase-space observables as approximate joint measurements of position and momentum is based on the fact that their marginal POVMs
$$\mathsf{M}_1(A) = \mathsf{M}(A\times\mathbb{R}^n), \qquad \mathsf{M}_2(B) = \mathsf{M}(\mathbb{R}^n\times B), \qquad A,B \in \mathcal{B}(\mathbb{R}^n),$$
have the same symmetry properties as $\mathsf{Q}$ and $\mathsf{P}$, respectively. Although $Q$ and $P$ are not jointly measurable, the following well-known result says that there are plenty of covariant phase-space observables [4] (Theorem 4.8.3), [63,64]. In (43) below, we use the parity operator $\Pi$ on $\mathcal{H}$, which is such that
$$\Pi\,W(x,p)\,\Pi = W(-x,-p) = W(x,p)^*. \tag{42}$$
Proposition 8.
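The covariant phase-space observables are in one-to-one correspondence with the states on $\mathcal{H}$, so that we have the identification $\mathcal{S} \sim \mathcal{C}$; such a correspondence $\sigma \leftrightarrow \mathsf{M}^\sigma$ is given by
$$\mathsf{M}^\sigma(B) = \int_B M^\sigma(x,p)\,dx\,dp, \quad \forall B \in \mathcal{B}(\mathbb{R}^{2n}), \qquad M^\sigma(x,p) = \frac{1}{(2\pi\hbar)^n}\,W(x,p)\,\Pi\sigma\Pi\,W(x,p)^*. \tag{43}$$
The characteristic function (41) of a measurement $\mathsf{M}^\sigma \in \mathcal{C}$ has a very simple structure in terms of the characteristic function (27) of the corresponding state $\sigma \in \mathcal{S}$.
Proposition 9.
The characteristic function of $\mathsf{M}^\sigma \in \mathcal{C}$ is given by
$$\hat{\mathsf{M}}^\sigma(k,l) = W(-\hbar\,\Omega w)\,\hat{\sigma}(w), \qquad w \equiv \begin{pmatrix} k \\ l \end{pmatrix} \in \mathbb{R}^{2n}, \tag{44}$$
and the characteristic function of the probability $\mathsf{M}^\sigma_\rho$ is
$$\operatorname{Tr}\left\{\hat{\mathsf{M}}^\sigma(k,l)\,\rho\right\} = \hat{\rho}(w)\,\hat{\sigma}(w). \tag{45}$$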
In (44) we have used the identification (26). The characteristic function of a state is introduced in (27).
Proof. 
By the commutation relations (24) we have
$$W(-\hbar l,\hbar k)\,W(x,p)\,W(-\hbar l,\hbar k)^* = e^{i(k\cdot x + l\cdot p)}\,W(x,p).$$
Then, we get
$$\begin{aligned}
\hat{\mathsf{M}}^\sigma(k,l) &= \frac{1}{(2\pi\hbar)^n}\int_{\mathbb{R}^{2n}} e^{i(k\cdot x + l\cdot p)}\,W(x,p)\,\Pi\sigma\Pi\,W(x,p)^*\,dx\,dp \\
&= \frac{1}{(2\pi\hbar)^n}\int_{\mathbb{R}^{2n}} W(-\hbar l,\hbar k)\,W(x,p)\,W(-\hbar l,\hbar k)^*\,\Pi\sigma\Pi\,W(x,p)^*\,dx\,dp \\
&= W(-\hbar l,\hbar k)\,\operatorname{Tr}\left\{W(-\hbar l,\hbar k)^*\,\Pi\sigma\Pi\right\},
\end{aligned}$$
where we used the formula (29). By (42) and the definition (27), we get (44). Again by (27), we get (45). ☐
In terms of probability densities, measuring $\mathsf{M}^\sigma$ on the state $\rho$ yields the density function $h^\sigma(x,p|\rho) = \operatorname{Tr}\{M^\sigma(x,p)\rho\}$. Then, by (45), the densities of the marginals $\mathsf{M}^\sigma_{1,\rho}$ and $\mathsf{M}^\sigma_{2,\rho}$ are the convolutions
$$h_1^\sigma(\cdot|\rho) = f(\cdot|\rho) * f(\cdot|\sigma), \qquad h_2^\sigma(\cdot|\rho) = g(\cdot|\rho) * g(\cdot|\sigma),$$
where $f$ and $g$ are the sharp densities introduced in (5). By the arbitrariness of the state $\rho$, the marginal POVMs of $\mathsf{M}^\sigma$ turn out to be the convolutions (or 'smearings')
$$\mathsf{M}_1^\sigma(A) = \int_A dx \int_{\mathbb{R}^n} f(x-x'|\sigma)\,\mathsf{Q}(dx'), \qquad \mathsf{M}_2^\sigma(B) = \int_B dp \int_{\mathbb{R}^n} g(p-p'|\sigma)\,\mathsf{P}(dp')$$
(see e.g., [23] (Section III, Equations (2.48) and (2.49))).
Let us remark that the distribution of the approximate position observable $\mathsf{M}_1^\sigma$ in a state $\rho$ is the distribution of the sum of two independent random vectors: the first one is distributed as the sharp position $\mathsf{Q}$ in the state $\rho$, the second one is distributed as the sharp position $\mathsf{Q}$ in the state $\sigma$. In this sense, the approximate position $\mathsf{M}_1^\sigma$ looks like a sharp position plus an independent noise given by $\sigma$. Of course, a similar fact holds for the momentum. However, this statement about the distributions cannot be extended to a statement involving the observables. Indeed, since $Q$ and $P$ are incompatible, nobody can jointly observe $\mathsf{M}^\sigma$, $\mathsf{Q}$ and $\mathsf{P}$, so that the convolutions (46) do not correspond to sums of random vectors that actually exist when measuring $\mathsf{M}^\sigma$.

5.2. Covariant Scalar Observables

Now we focus on the class of approximate joint measurements of the observables $Q_u$ and $P_v$ representing position and momentum along two possibly different directions $u$ and $v$ (see Section 2.1.2). As in the case of covariant phase-space observables, this class is defined in terms of the symmetries of its elements: we require them to transform as if they were joint measurements of $Q_u$ and $P_v$. Recall that $\mathsf{Q}_u$ and $\mathsf{P}_v$ denote the spectral measures of $Q_u$, $P_v$.
Due to the commutation relation (24), the following covariance relations hold
$$W(x,p)\,\mathsf{Q}_u(A)\,W(x,p)^* = \mathsf{Q}_u(A + u\cdot x), \qquad W(x,p)\,\mathsf{P}_v(B)\,W(x,p)^* = \mathsf{P}_v(B + v\cdot p),$$
for all $A,B \in \mathcal{B}(\mathbb{R})$ and $x,p \in \mathbb{R}^n$. We employ covariance to define our class of approximate joint measurements of $Q_u$ and $P_v$.
Definition 6.
A $(u,v)$-covariant bi-observable is a POVM $\mathsf{M}: \mathcal{B}(\mathbb{R}^2) \to \mathcal{L}(\mathcal{H})$ such that
$$W(x,p)\,\mathsf{M}(Z)\,W(x,p)^* = \mathsf{M}\left(Z + \begin{pmatrix} u\cdot x \\ v\cdot p \end{pmatrix}\right), \qquad \forall Z \in \mathcal{B}(\mathbb{R}^2), \quad \forall x,p \in \mathbb{R}^n.$$
We denote by $\mathcal{C}_{u,v}$ the class of such bi-observables.
So, our approximate joint measurements of $Q_u$ and $P_v$ will be all the bi-observables in the class $\mathcal{C}_{u,v}$.
Example 1.
The marginal of a covariant phase-space observable $\mathsf{M}^\sigma$ along the directions $u$ and $v$ is a $(u,v)$-covariant bi-observable. Actually, it can be proved that, if $\cos\alpha \neq 0$, all $(u,v)$-covariant bi-observables can be obtained in this way.
It is useful to work with a little more generality, and merge Definitions 5 and 6 into a single notion of covariance.
Definition 7.
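Suppose $J$ is a $k\times 2n$ real matrix. A POVM $\mathsf{M}: \mathcal{B}(\mathbb{R}^k) \to \mathcal{L}(\mathcal{H})$ is a $J$-covariant observable on $\mathbb{R}^k$ if
$$W(x,p)\,\mathsf{M}(Z)\,W(x,p)^* = \mathsf{M}\left(Z + J\begin{pmatrix} x \\ p \end{pmatrix}\right), \qquad \forall Z \in \mathcal{B}(\mathbb{R}^k), \quad \forall x,p \in \mathbb{R}^n.$$
Thus, approximate joint observables of $Q_u$ and $P_v$ are just $J$-covariant observables on $\mathbb{R}^2$ for the choice of the $2\times 2n$ matrix
$$J = \begin{pmatrix} u^T & 0^T \\ 0^T & v^T \end{pmatrix}. \tag{47}$$
On the other hand, covariant phase-space observables constitute the class of $\mathbb{1}_{2n}$-covariant observables on $\mathbb{R}^{2n}$, where $\mathbb{1}_{2n}$ is the identity map of $\mathbb{R}^{2n}$.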

5.3. Gaussian Measurements

When dealing with Gaussian states, the following class of bi-observables quite naturally arises.
Definition 8.
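A POVM $\mathsf{M}: \mathcal{B}(\mathbb{R}^{2m}) \to \mathcal{L}(\mathcal{H})$ is a Gaussian bi-observable if
$$\hat{\mathsf{M}}(k,l) = W\left(-\hbar\,\Omega\,(J^{\mathsf{M}})^T\begin{pmatrix} k \\ l \end{pmatrix}\right)\exp\left\{i\begin{pmatrix} k^T & l^T \end{pmatrix}\begin{pmatrix} a^{\mathsf{M}} \\ b^{\mathsf{M}} \end{pmatrix} - \frac{1}{2}\begin{pmatrix} k^T & l^T \end{pmatrix}V^{\mathsf{M}}\begin{pmatrix} k \\ l \end{pmatrix}\right\} \tag{48}$$
for two vectors $a^{\mathsf{M}}, b^{\mathsf{M}} \in \mathbb{R}^m$, a real $2m\times 2n$ matrix $J^{\mathsf{M}}$ and a real symmetric $2m\times 2m$ matrix $V^{\mathsf{M}}$ satisfying the condition
$$V^{\mathsf{M}} \pm \frac{i\hbar}{2}\,J^{\mathsf{M}}\,\Omega\,(J^{\mathsf{M}})^T \geq 0. \tag{49}$$
We set $\mu^{\mathsf{M}} = \begin{pmatrix} a^{\mathsf{M}} \\ b^{\mathsf{M}} \end{pmatrix}$. The triple $(\mu^{\mathsf{M}}, V^{\mathsf{M}}, J^{\mathsf{M}})$ is the set of the parameters of the Gaussian observable $\mathsf{M}$.
In this definition, the vector $a^{\mathsf{M}}$ has the dimension of a length, and $b^{\mathsf{M}}$ of a momentum; similarly, the matrices $J^{\mathsf{M}}$, $V^{\mathsf{M}}$ decompose into blocks of different dimensions. The condition (49) is necessary and sufficient in order that the function (48) defines the characteristic function of a POVM.
For unbiased Gaussian measurements, i.e., Gaussian bi-observables with $a^{\mathsf{M}} = b^{\mathsf{M}} = 0$, the previous definition coincides with the one of [5] (Section 12.4.3). It is also a particular case of the more general definition of Gaussian observables on arbitrary (not necessarily symplectic) linear spaces that is given in [43,44]. We refer to [5,44] for the proof that Equation (48) is actually the characteristic function of a POVM.
Measuring the Gaussian observable $\mathsf{M}$ on the Gaussian state $\rho$ yields the probability distribution $\mathsf{M}_\rho$ whose characteristic function is
$$\begin{aligned}
\operatorname{Tr}\{\hat{\mathsf{M}}(k,l)\rho\} &= \hat{\rho}\left((J^{\mathsf{M}})^T\begin{pmatrix} k \\ l \end{pmatrix}\right)\exp\left\{i\begin{pmatrix} k^T & l^T \end{pmatrix}\begin{pmatrix} a^{\mathsf{M}} \\ b^{\mathsf{M}} \end{pmatrix} - \frac{1}{2}\begin{pmatrix} k^T & l^T \end{pmatrix}V^{\mathsf{M}}\begin{pmatrix} k \\ l \end{pmatrix}\right\} \\
&= \exp\left\{i\begin{pmatrix} k^T & l^T \end{pmatrix}\left[\begin{pmatrix} a^{\mathsf{M}} \\ b^{\mathsf{M}} \end{pmatrix} + J^{\mathsf{M}}\begin{pmatrix} a^\rho \\ b^\rho \end{pmatrix}\right] - \frac{1}{2}\begin{pmatrix} k^T & l^T \end{pmatrix}\left[V^{\mathsf{M}} + J^{\mathsf{M}}V^\rho(J^{\mathsf{M}})^T\right]\begin{pmatrix} k \\ l \end{pmatrix}\right\};
\end{aligned}$$
hence the output distribution is Gaussian,
$$\mathsf{M}_\rho = N\left(J^{\mathsf{M}}\mu^\rho + \mu^{\mathsf{M}};\; J^{\mathsf{M}}V^\rho(J^{\mathsf{M}})^T + V^{\mathsf{M}}\right).$$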

5.3.1. Covariant Gaussian Observables

For Gaussian bi-observables, J-covariance has a very easy characterization.
Proposition 10.
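Suppose $\mathsf{M}$ is a Gaussian bi-observable on $\mathbb{R}^{2m}$ with parameters $(\mu^{\mathsf{M}}, V^{\mathsf{M}}, J^{\mathsf{M}})$. Let $J$ be any $2m\times 2n$ real matrix. Then, the POVM $\mathsf{M}$ is a $J$-covariant observable if and only if $J^{\mathsf{M}} = J$.
Proof. 
For $x,p \in \mathbb{R}^n$, we let $\mathsf{M}'$ and $\mathsf{M}''$ be the two POVMs on $\mathbb{R}^{2m}$ given by
$$\mathsf{M}'(Z) = W(x,p)\,\mathsf{M}(Z)\,W(x,p)^*, \qquad \mathsf{M}''(Z) = \mathsf{M}\left(Z + J\begin{pmatrix} x \\ p \end{pmatrix}\right), \qquad \forall Z \in \mathcal{B}(\mathbb{R}^{2m}).$$
By the commutation relations (24) for the Weyl operators, we immediately get
$$\hat{\mathsf{M}}'(k,l) = W(x,p)\,\hat{\mathsf{M}}(k,l)\,W(x,p)^* = \exp\left\{-i\begin{pmatrix} x^T & p^T \end{pmatrix}\Omega^{-1}\Omega\,(J^{\mathsf{M}})^T\begin{pmatrix} k \\ l \end{pmatrix}\right\}\hat{\mathsf{M}}(k,l) = \exp\left\{-i\begin{pmatrix} k^T & l^T \end{pmatrix}J^{\mathsf{M}}\begin{pmatrix} x \\ p \end{pmatrix}\right\}\hat{\mathsf{M}}(k,l);$$
we have also
$$\hat{\mathsf{M}}''(k,l) = \int_{\mathbb{R}^{2m}}\exp\left\{i\begin{pmatrix} k^T & l^T \end{pmatrix}\left[\begin{pmatrix} x' \\ p' \end{pmatrix} - J\begin{pmatrix} x \\ p \end{pmatrix}\right]\right\}\mathsf{M}(dx'\,dp') = \exp\left\{-i\begin{pmatrix} k^T & l^T \end{pmatrix}J\begin{pmatrix} x \\ p \end{pmatrix}\right\}\hat{\mathsf{M}}(k,l).$$
Since $\hat{\mathsf{M}}(k,l) \neq 0$ for all $k,l$, by comparing the last two expressions we see that $\mathsf{M}' = \mathsf{M}''$ if and only if
$$\exp\left\{-i\begin{pmatrix} k^T & l^T \end{pmatrix}J^{\mathsf{M}}\begin{pmatrix} x \\ p \end{pmatrix}\right\} = \exp\left\{-i\begin{pmatrix} k^T & l^T \end{pmatrix}J\begin{pmatrix} x \\ p \end{pmatrix}\right\}, \qquad \forall x,p \in \mathbb{R}^n, \quad \forall k,l \in \mathbb{R}^m,$$
which in turn is equivalent to $J^{\mathsf{M}} = J$. ☐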

Vector Observables

Let us point out the structure of the Gaussian approximate joint measurements of $Q$ and $P$.
Proposition 11.
A bi-observable $\mathsf{M}^\sigma \in \mathcal{C}$ is Gaussian if and only if the state $\sigma$ is Gaussian. In this case, the covariant bi-observable $\mathsf{M}^\sigma$ is Gaussian with parameters
$$\mu^{\mathsf{M}^\sigma} = \mu^\sigma, \qquad V^{\mathsf{M}^\sigma} = V^\sigma, \qquad J^{\mathsf{M}^\sigma} = \mathbb{1}_{2n}.$$
Proof. 
By comparing (31), (44) and (48), and using the fact that $W(x_1,p_1) \propto W(x_2,p_2)$ if and only if $x_1 = x_2$ and $p_1 = p_2$, we have the first statement. Then, for $\sigma \in \mathcal{G}$, we see immediately that $\mathsf{M}^\sigma$ is a Gaussian observable with the above parameters. ☐
We call $\mathcal{C}^G$ the class of the Gaussian covariant phase-space observables. By (50), observing $\mathsf{M}^\sigma$ on a Gaussian state $\rho \in \mathcal{G}$ yields the normal probability distribution $\mathsf{M}^\sigma_\rho = N\left(\mu^\rho + \mu^\sigma;\, V^\rho + V^\sigma\right)$, with marginals
$$\mathsf{M}^\sigma_{1,\rho} = N\left(a^\rho + a^\sigma;\, A^\rho + A^\sigma\right), \qquad \mathsf{M}^\sigma_{2,\rho} = N\left(b^\rho + b^\sigma;\, B^\rho + B^\sigma\right).$$
When $a^\sigma = 0$ and $b^\sigma = 0$, we have an unbiased measurement.

Scalar Observables

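We now study the Gaussian approximate joint measurements of the target observables $Q_u$ and $P_v$ defined in (6).
Proposition 12.
A Gaussian bi-observable $\mathsf{M}$ with parameters $(\mu^{\mathsf{M}}, V^{\mathsf{M}}, J^{\mathsf{M}})$ is in $\mathcal{C}_{u,v}$ if and only if $J^{\mathsf{M}} = J$, where $J$ is given by (47). In this case, the condition (49) is equivalent to
$$V^{\mathsf{M}}_{11} \geq 0, \qquad V^{\mathsf{M}}_{22} \geq 0, \qquad V^{\mathsf{M}}_{11}V^{\mathsf{M}}_{22} \geq \frac{\hbar^2}{4}(\cos\alpha)^2 + \left(V^{\mathsf{M}}_{12}\right)^2. \tag{52}$$
Proof. 
The first statement follows from Proposition 10. Then, the matrix inequality (49) reads
$$V^{\mathsf{M}} \pm \frac{i\hbar}{2}\begin{pmatrix} 0 & \cos\alpha \\ -\cos\alpha & 0 \end{pmatrix} \geq 0,$$
which is equivalent to (52). ☐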
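The equivalence between the matrix condition (49) and the scalar inequalities (52) can be spot-checked numerically. The following Python sketch is only an illustration under stated assumptions: units with $\hbar = 1$ by default, a small numerical tolerance, and hypothetical function names. It uses the closed-form smallest eigenvalue of a $2\times 2$ Hermitian matrix.

```python
import math

def min_eig_hermitian_2x2(a, d, b):
    """Smallest eigenvalue of [[a, b], [conj(b), d]] with real a, d and complex b."""
    return 0.5 * (a + d - math.sqrt((a - d) ** 2 + 4 * abs(b) ** 2))

def satisfies_49(v11, v22, v12, cos_alpha, hbar=1.0, tol=1e-12):
    """Condition (49) in the scalar case: V +/- (i*hbar/2) [[0, c], [-c, 0]] >= 0."""
    off_diagonal = complex(v12, 0.5 * hbar * cos_alpha)
    # The two signs give complex-conjugate matrices, which share the same spectrum.
    return min_eig_hermitian_2x2(v11, v22, off_diagonal) >= -tol

def satisfies_52(v11, v22, v12, cos_alpha, hbar=1.0, tol=1e-12):
    """Scalar inequalities (52)."""
    gap = v11 * v22 - (hbar ** 2 / 4) * cos_alpha ** 2 - v12 ** 2
    return v11 >= 0 and v22 >= 0 and gap >= -tol
```

For instance, `satisfies_49(1.0, 1.0, 0.0, 1.0)` and `satisfies_52(1.0, 1.0, 0.0, 1.0)` both hold, while shrinking both variances to `0.1` violates both conditions.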
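We write $\mathcal{C}^G_{u,v}$ for the class of the Gaussian $(u,v)$-covariant phase-space observables. An observable $\mathsf{M} \in \mathcal{C}^G_{u,v}$ is thus characterized by the couple $(\mu^{\mathsf{M}}, V^{\mathsf{M}})$. From (50) with $J^{\mathsf{M}} = J$ given by (47), we get that measuring $\mathsf{M} \in \mathcal{C}^G_{u,v}$ on a Gaussian state $\rho$ yields the probability distribution $\mathsf{M}_\rho = N\left(\mu^\rho_{u,v} + \mu^{\mathsf{M}};\, V^\rho_{u,v} + V^{\mathsf{M}}\right)$ with $\mu^\rho_{u,v}$ and $V^\rho_{u,v}$ given by (12). Its marginals with respect to the first and second entry are, respectively,
$$\mathsf{M}_{1,\rho} = N\left(u\cdot a^\rho + a^{\mathsf{M}};\, \operatorname{Var}(\mathsf{Q}_{u,\rho}) + V^{\mathsf{M}}_{11}\right), \qquad \mathsf{M}_{2,\rho} = N\left(v\cdot b^\rho + b^{\mathsf{M}};\, \operatorname{Var}(\mathsf{P}_{v,\rho}) + V^{\mathsf{M}}_{22}\right).$$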
Example 2.
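Let us construct an example of an approximate joint measurement of $Q_u$ and $P_v$, by using a noisy measurement of position along $u$ followed by a sharp measurement of momentum along $v$. Let $\Delta$ be a positive real number yielding the precision of the position measurement, and consider the POVM $\mathsf{M}$ on $\mathbb{R}^2$ given by
$$\mathsf{M}(A\times B) = \frac{1}{\sqrt{2\pi\Delta}}\int_A \exp\left\{-\frac{(x-Q_u)^2}{4\Delta}\right\}\mathsf{P}_v(B)\exp\left\{-\frac{(x-Q_u)^2}{4\Delta}\right\}dx, \qquad \forall A,B \in \mathcal{B}(\mathbb{R}).$$
The characteristic function of $\mathsf{M}$ is
$$\begin{aligned}
\hat{\mathsf{M}}(k,l) &= \frac{1}{\sqrt{2\pi\Delta}}\int_{\mathbb{R}} e^{ikx}\exp\left\{-\frac{(x-Q_u)^2}{4\Delta}\right\}\left[\int_{\mathbb{R}} e^{ilp}\,\mathsf{P}_v(dp)\right]\exp\left\{-\frac{(x-Q_u)^2}{4\Delta}\right\}dx \\
&= \frac{1}{\sqrt{2\pi\Delta}}\int_{\mathbb{R}}\exp\left\{ikx - \frac{(x-Q_u)^2}{4\Delta}\right\}e^{ilP_v}\exp\left\{-\frac{(x-Q_u)^2}{4\Delta}\right\}dx \\
&= \frac{e^{ilP_v}}{\sqrt{2\pi\Delta}}\int_{\mathbb{R}}\exp\left\{ikx - \frac{(x-Q_u+\hbar l\,u\cdot v)^2}{4\Delta}\right\}\exp\left\{-\frac{(x-Q_u)^2}{4\Delta}\right\}dx \\
&= \frac{1}{\sqrt{2\pi\Delta}}\exp\left\{ilP_v - \frac{(\hbar l\cos\alpha)^2}{8\Delta}\right\}\int_{\mathbb{R}}\exp\left\{ikx - \frac{(x-Q_u+\hbar l\cos\alpha/2)^2}{2\Delta}\right\}dx \\
&= e^{ilP_v}\exp\left\{ik\left(Q_u - \frac{\hbar l\cos\alpha}{2}\right)\right\}\exp\left\{-\frac{\Delta}{2}k^2 - \frac{(\hbar\cos\alpha)^2}{8\Delta}l^2\right\} \\
&= W(-\hbar l v,\hbar k u)\exp\left\{-\frac{\Delta}{2}k^2 - \frac{(\hbar\cos\alpha)^2}{8\Delta}l^2\right\}.
\end{aligned}$$
Therefore, $\mathsf{M}$ is a Gaussian bi-observable with parameters $a^{\mathsf{M}} = 0$, $b^{\mathsf{M}} = 0$ and $J^{\mathsf{M}} = J$, where $J$ is given by (47) and $V^{\mathsf{M}}_{11} = \Delta$, $V^{\mathsf{M}}_{22} = \frac{(\hbar\cos\alpha)^2}{4\Delta}$ and $V^{\mathsf{M}}_{12} = 0$. This implies $\mathsf{M} \in \mathcal{C}^G_{u,v}$; in particular, the set $\mathcal{C}^G_{u,v}$ is non-empty. Moreover, the lower bound $V^{\mathsf{M}}_{11}V^{\mathsf{M}}_{22} = \frac{\hbar^2}{4}(\cos\alpha)^2$ is attained, cf. (52).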
Example 3.
Let us consider the case $\alpha = \pm\pi/2$; now the target observables $Q_u$ and $P_v$ are compatible and we can define a pvm $\mathsf{M}$ on $\mathbb{R}^2$ by setting $\mathsf{M}(A\times B) = \mathsf{Q}_u(A)\,\mathsf{P}_v(B)$ for all $A,B \in \mathcal{B}(\mathbb{R})$. Its characteristic function is
$$\hat{\mathsf{M}}(k,l) = \int_{\mathbb{R}} e^{ikx}\,\mathsf{Q}_u(dx)\int_{\mathbb{R}} e^{ilp}\,\mathsf{P}_v(dp) = e^{i(kQ_u + lP_v)} = W(-\hbar l v,\hbar k u).$$
Then, $\mathsf{M} \in \mathcal{C}^G_{u,v}$ with parameters $a^{\mathsf{M}} = 0$, $b^{\mathsf{M}} = 0$, $V^{\mathsf{M}} = 0$ and $J^{\mathsf{M}} = J$ given by (47). Note that $\mathsf{M}$ can be regarded as the limit case of the observables of the previous example when $\cos\alpha = 0$ and $\Delta \downarrow 0$.

6. Entropic MURs for Position and Momentum

In the case of two discrete target observables, in [41] we found an entropic bound for the precision of their approximate joint measurements, which we named entropic incompatibility degree. Its definition followed a three-step procedure. Firstly, we introduced an error function: when the system is in a given state $\rho$, such a function quantifies the total amount of information that is lost by approximating the target observables by means of the marginals of a bi-observable; the error function is nothing but the sum of the two relative entropies of the respective distributions. Then, we considered the worst possible case by maximizing the error function over $\rho$, thus obtaining an entropic divergence quantifying the approximation error in a state independent way. Finally, we got our index of the incompatibility of the two target observables by minimizing the entropic divergence over all bi-observables. In particular, when symmetries are present, we showed that the minimum is attained at some covariant bi-observables. So, the covariance followed as a byproduct of the optimization procedure, and was not a priori imposed upon the class of approximating bi-observables.
As we shall see, the extension of the previous procedure to position and momentum target observables is not straightforward, and peculiar problems of the continuous case arise. In order to overcome them, in this paper we shall fully analyse only a case in which explicit computations can be done: Gaussian preparations, and Gaussian bi-observables, which we a priori assume to be covariant. We conjecture that the final result should be independent of these simplifications, as we shall discuss in Section 7.
As we said in Section 5, by “approximate joint measurement” we mean “a bi-observable with the ‘right’ covariance properties”.

6.1. Scalar Observables

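Given the directions $u$ and $v$, the target observables are $Q_u$ and $P_v$ in (6) with pvm's $\mathsf{Q}_u$ and $\mathsf{P}_v$. For $\rho \in \mathcal{G}$ with parameters $(\mu^\rho, V^\rho)$ given in (10), the target distributions $\mathsf{Q}_{u,\rho}$ and $\mathsf{P}_{v,\rho}$ are normal with means and variances (11).
An approximate joint measurement of $Q_u$ and $P_v$ is given by a covariant bi-observable $\mathsf{M} \in \mathcal{C}_{u,v}$; then, we denote its marginals with respect to the first and second entry by $\mathsf{M}_1$ and $\mathsf{M}_2$, respectively. For a Gaussian covariant bi-observable $\mathsf{M} \in \mathcal{C}^G_{u,v}$ with parameters $(\mu^{\mathsf{M}}, V^{\mathsf{M}})$, the distribution of $\mathsf{M}$ in a Gaussian state $\rho$ is normal,
$$\mathsf{M}_\rho = N\left(\mu^\rho_{u,v} + \mu^{\mathsf{M}};\, V^\rho_{u,v} + V^{\mathsf{M}}\right),$$
so that its marginal distributions $\mathsf{M}_{1,\rho}$ and $\mathsf{M}_{2,\rho}$ are normal with means $u\cdot a^\rho + a^{\mathsf{M}}$ and $v\cdot b^\rho + b^{\mathsf{M}}$ and variances
$$\operatorname{Var}(\mathsf{M}_{1,\rho}) = \operatorname{Var}(\mathsf{Q}_{u,\rho}) + V^{\mathsf{M}}_{11}, \qquad \operatorname{Var}(\mathsf{M}_{2,\rho}) = \operatorname{Var}(\mathsf{P}_{v,\rho}) + V^{\mathsf{M}}_{22}.$$
Let us recall that $|u| = 1$, $|v| = 1$, $u\cdot v = \cos\alpha$, and that by (16) and (52), we have
$$\operatorname{Var}(\mathsf{Q}_{u,\rho})\operatorname{Var}(\mathsf{P}_{v,\rho}) \geq \frac{\hbar^2}{4}(\cos\alpha)^2, \qquad V^{\mathsf{M}}_{11}V^{\mathsf{M}}_{22} \geq \frac{\hbar^2}{4}(\cos\alpha)^2. \tag{55}$$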

6.1.1. Error Function

The relative entropy is the amount of information that is lost when an approximating distribution is used in place of a target one. For this reason, we use it to give an informational quantification of the error made in approximating the distributions of sharp position and momentum by means of the marginals of a joint covariant observable.
Definition 9.
Given the preparation $\rho \in \mathcal{S}$ and the covariant bi-observable $\mathsf{M} \in \mathcal{C}_{u,v}$, the error function for the scalar case is the sum of the two relative entropies:
$$S(\rho,\mathsf{M}) := S(\mathsf{Q}_{u,\rho}\|\mathsf{M}_{1,\rho}) + S(\mathsf{P}_{v,\rho}\|\mathsf{M}_{2,\rho}). \tag{56}$$
The relative entropy is invariant under a change of the unit of measurement, so that the error function is scale invariant, too; indeed, it quantifies a relative error, not an absolute one. In the Gaussian case the error function can be explicitly computed.
Proposition 13 (Error function for the scalar Gaussian case).
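For $\rho \in \mathcal{G}$ and $\mathsf{M} \in \mathcal{C}^G_{u,v}$, the error function is
$$S(\rho,\mathsf{M}) = \frac{\log e}{2}\left[s(x) + s(y) + \Delta(\rho,\mathsf{M})\right], \tag{57}$$
where
$$x := \frac{V^{\mathsf{M}}_{11}}{\operatorname{Var}(\mathsf{Q}_{u,\rho})}, \qquad y := \frac{V^{\mathsf{M}}_{22}}{\operatorname{Var}(\mathsf{P}_{v,\rho})}, \qquad \Delta(\rho,\mathsf{M}) := \frac{(a^{\mathsf{M}})^2}{\operatorname{Var}(\mathsf{M}_{1,\rho})} + \frac{(b^{\mathsf{M}})^2}{\operatorname{Var}(\mathsf{M}_{2,\rho})},$$
and $s: [0,+\infty) \to [0,+\infty)$ is the following $C^\infty$ strictly increasing function with $s(0) = 0$:
$$s(x) := \ln(1+x) - \frac{x}{1+x}.$$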
Proof. 
The statement follows by a straightforward combination of (32), (34), (53) and (56). ☐
Note that the error function does not depend on the mixed covariances $u\cdot C^\rho v$ and $V^{\mathsf{M}}_{12}$. Note also that, if we select a possible approximation $\mathsf{M}$, then the error function $S(\rho,\mathsf{M})$ decreases for states $\rho$ with increasing sharp variances $\operatorname{Var}(\mathsf{Q}_{u,\rho})$ and $\operatorname{Var}(\mathsf{P}_{v,\rho})$: the loss of information decreases when the sharp distributions make the approximation error negligible. Finally, note that
$$s(x) + s(y) = \ln\left[(1+x)(1+y)\right] + (1+x)^{-1} + (1+y)^{-1} - 2,$$
$$1 + x = \frac{\operatorname{Var}(\mathsf{M}_{1,\rho})}{\operatorname{Var}(\mathsf{Q}_{u,\rho})}, \qquad 1 + y = \frac{\operatorname{Var}(\mathsf{M}_{2,\rho})}{\operatorname{Var}(\mathsf{P}_{v,\rho})}.$$
This means that, apart from the term $\Delta(\rho,\mathsf{M})$ due to the bias, our error function $S(\rho,\mathsf{M})$ only depends on the two ratios “variance of the approximating distribution over variance of the target distribution”. Thus, in order to optimize the error function, one has to optimize these two ratios.
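A minimal Python sketch of the Gaussian error function (57) is given below, cross-checked against the sum of two scalar Gaussian relative entropies. It assumes base-2 logarithms; the function names and the sample variances, noise parameters and biases are hypothetical.

```python
import math

LOG2E = math.log2(math.e)  # "log e" for base-2 logarithms (assumption)

def s(x):
    """s(x) = ln(1 + x) - x / (1 + x), for x >= 0."""
    return math.log1p(x) - x / (1.0 + x)

def error_function(var_q, var_p, v11, v22, a_m=0.0, b_m=0.0):
    """Formula (57) in bits for scalar Gaussian target variances and measurement parameters."""
    x, y = v11 / var_q, v22 / var_p
    bias = a_m ** 2 / (var_q + v11) + b_m ** 2 / (var_p + v22)
    return 0.5 * LOG2E * (s(x) + s(y) + bias)

def relative_entropy(var_true, shift, var_noise):
    """S(N(m; var_true) || N(m + shift; var_true + var_noise)) in bits."""
    total = var_true + var_noise
    return 0.5 * (LOG2E * (shift ** 2 / total + var_true / total - 1.0)
                  + math.log2(total / var_true))

# Hypothetical numbers: target variances, noise variances and biases.
direct = error_function(2.0, 3.0, 0.5, 0.7, a_m=0.1, b_m=-0.2)
summed = relative_entropy(2.0, 0.1, 0.5) + relative_entropy(3.0, -0.2, 0.7)
```

The values `direct` and `summed` coincide up to rounding, and increasing the target variances at fixed measurement parameters lowers the error, as noted above.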
We use formula (57) to firstly give a state dependent MUR, and then, following the scheme of [41], a state independent MUR.
A lower bound for the error function can be found by minimizing it over all possible approximate joint measurements of $Q_u$ and $P_v$. First of all, let us remark that this minimization makes sense because we consider only $(u,v)$-covariant bi-observables: if we minimized over all possible bi-observables, then the minimum would be trivially zero for every given preparation $\rho$. Indeed, the trivial bi-observable $\mathsf{M}(A\times B) = \mathsf{Q}_{u,\rho}(A)\,\mathsf{P}_{v,\rho}(B)\,\mathbb{1}$ yields $S(\rho,\mathsf{M}) = 0$.
When minimizing the error function over all $(u,v)$-covariant bi-observables, both the minimum and the best measurement attaining it are state dependent. When $\alpha = \pm\pi/2$, the two target observables are compatible, so that their joint measurement trivially exists (see Example 3) and we get $\inf_{\mathsf{M}\in\mathcal{C}_{u,v}} S(\rho,\mathsf{M}) = 0$. In order to have explicit results for any angle $\alpha$, we consider only the Gaussian case.
Theorem 1 (State dependent MUR, scalar observables).
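For every $\rho \in \mathcal{G}$ and $\mathsf{M} \in \mathcal{C}^G_{u,v}$,
$$S(\mathsf{Q}_{u,\rho}\|\mathsf{M}_{1,\rho}) + S(\mathsf{P}_{v,\rho}\|\mathsf{M}_{2,\rho}) \geq c_\rho(\alpha), \tag{59}$$
where the lower bound is
$$c_\rho(\alpha) = s(z_\rho)\log e = (\log e)\left[\ln\left(1 + \frac{\hbar\left|\cos\alpha\right|}{2\sqrt{\operatorname{Var}(\mathsf{Q}_{u,\rho})\operatorname{Var}(\mathsf{P}_{v,\rho})}}\right) - \frac{\hbar\left|\cos\alpha\right|}{\hbar\left|\cos\alpha\right| + 2\sqrt{\operatorname{Var}(\mathsf{Q}_{u,\rho})\operatorname{Var}(\mathsf{P}_{v,\rho})}}\right], \tag{60}$$
with
$$z_\rho := \frac{\hbar\left|\cos\alpha\right|}{2\sqrt{\operatorname{Var}(\mathsf{Q}_{u,\rho})\operatorname{Var}(\mathsf{P}_{v,\rho})}} \in [0,1]. \tag{61}$$
The lower bound is tight and the optimal measurement is unique: $c_\rho(\alpha) = S(\rho,\mathsf{M}_*)$, for a unique $\mathsf{M}_* \in \mathcal{C}^G_{u,v}$; such a Gaussian $(u,v)$-covariant bi-observable is characterized by
$$\mu^{\mathsf{M}_*} = 0, \qquad V^{\mathsf{M}_*}_{12} = 0, \qquad V^{\mathsf{M}_*}_{11} = \frac{\hbar}{2}\sqrt{\frac{\operatorname{Var}(\mathsf{Q}_{u,\rho})}{\operatorname{Var}(\mathsf{P}_{v,\rho})}}\left|\cos\alpha\right|, \qquad V^{\mathsf{M}_*}_{22} = \frac{\hbar}{2}\sqrt{\frac{\operatorname{Var}(\mathsf{P}_{v,\rho})}{\operatorname{Var}(\mathsf{Q}_{u,\rho})}}\left|\cos\alpha\right|. \tag{62}$$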
Proof. 
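As already discussed, the case $\cos\alpha = 0$ is trivial. If $\cos\alpha \neq 0$, we have to minimize the error function (57) over $\mathsf{M}$. First of all we can eliminate the positive term $\Delta(\rho,\mathsf{M})$ by taking an unbiased measurement. Then, since $s$ is an increasing function, by the second condition in (55) we can also take $V^{\mathsf{M}_*}_{11}V^{\mathsf{M}_*}_{22} = \frac{\hbar^2}{4}(\cos\alpha)^2$. This implies $V^{\mathsf{M}_*}_{12} = 0$ by (52). In this case the error function (57) reduces to
$$S(\rho,\mathsf{M}_*) = \frac{\log e}{2}\left[s(x) + s(z_\rho^2/x)\right], \qquad x = \frac{V^{\mathsf{M}_*}_{11}}{\operatorname{Var}(\mathsf{Q}_{u,\rho})},$$
with $z_\rho$ given by (61); by the first of (55), we have $z_\rho \in (0,1]$.
Now, we can minimize the error function with respect to $x$ by studying its first derivative:
$$\frac{d}{dx}\left[s(x) + s(z_\rho^2/x)\right] = \frac{x}{(1+x)^2} - \frac{z_\rho^4}{x\left(z_\rho^2+x\right)^2} = \frac{\left(x^2 - z_\rho^2\right)\left(x^2 + 2z_\rho^2 x + z_\rho^2\right)}{x\left(z_\rho^2+x\right)^2(1+x)^2}.$$
Having $x > 0$, we immediately get that $x = z_\rho$ gives the unique minimum. Thus
$$S(\rho,\mathsf{M}) \geq S(\rho,\mathsf{M}_*) = s(z_\rho)\log e = \log(1+z_\rho) - \frac{z_\rho}{1+z_\rho}\log e,$$
and
$$V^{\mathsf{M}_*}_{11} = z_\rho\operatorname{Var}(\mathsf{Q}_{u,\rho}) = \frac{\hbar}{2}\sqrt{\frac{\operatorname{Var}(\mathsf{Q}_{u,\rho})}{\operatorname{Var}(\mathsf{P}_{v,\rho})}}\left|\cos\alpha\right|, \qquad V^{\mathsf{M}_*}_{22} = z_\rho\operatorname{Var}(\mathsf{P}_{v,\rho}) = \frac{\hbar}{2}\sqrt{\frac{\operatorname{Var}(\mathsf{P}_{v,\rho})}{\operatorname{Var}(\mathsf{Q}_{u,\rho})}}\left|\cos\alpha\right|,$$
which concludes the proof. ☐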
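The minimization step of the proof can be reproduced numerically. The Python sketch below is an illustration only: it assumes $\hbar = 1$, base-2 logarithms, hypothetical target variances, and a crude logarithmic grid scan in place of the analytic argument.

```python
import math

LOG2E = math.log2(math.e)

def s(x):
    """s(x) = ln(1 + x) - x / (1 + x)."""
    return math.log1p(x) - x / (1.0 + x)

def z_rho(var_q, var_p, cos_alpha, hbar=1.0):
    """z_rho as in (61)."""
    return hbar * abs(cos_alpha) / (2.0 * math.sqrt(var_q * var_p))

def c_rho(var_q, var_p, cos_alpha, hbar=1.0):
    """Lower bound (60) in bits."""
    return s(z_rho(var_q, var_p, cos_alpha, hbar)) * LOG2E

def error_on_boundary(x, z):
    """Error for an unbiased M saturating the second bound in (55), so that y = z**2 / x."""
    return 0.5 * LOG2E * (s(x) + s(z * z / x))

# Hypothetical target variances and angle, with Var(Q)Var(P) above the bound in (55).
var_q, var_p, cos_alpha = 1.3, 0.9, 0.6
z = z_rho(var_q, var_p, cos_alpha)
# Logarithmic grid of x = V11 / Var(Q) around z (two decades each side).
grid = [z * 10 ** (k / 200.0) for k in range(-400, 401)]
x_best = min(grid, key=lambda x: error_on_boundary(x, z))
v11_opt, v22_opt = x_best * var_q, (z * z / x_best) * var_p
```

On this grid the minimizer `x_best` is the point `x = z`, the minimal error equals `c_rho(var_q, var_p, cos_alpha)`, and `v11_opt`, `v22_opt` reproduce the optimal variances in (62) for the chosen inputs.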
Remark 3.
The minimum information loss $c_\rho(\alpha)$ depends on both the preparation $\rho$ and the angle $\alpha$. When $\alpha \neq \pm\pi/2$, that is when the target observables are not compatible, $c_\rho(\alpha)$ is strictly greater than zero. This is a peculiar quantum effect: given $\rho$, $u$ and $v$, there is no Gaussian approximate joint measurement of $Q_u$ and $P_v$ that can approximate them arbitrarily well. On the other hand, in the limit $\alpha \to \pm\pi/2$, the lower bound $c_\rho(\alpha)$ goes to zero; so, the case of commuting target observables is approached with continuity.
Remark 4.
The lower bound $c_\rho(\alpha)$ goes to zero also in the classical limit $\hbar \to 0$. This holds for every angle $\alpha$ and every Gaussian state $\rho$.
Remark 5.
Another case in which $c_\rho(\alpha) \to 0$ is the limit of large uncertainty states, that is, if we let the product $\operatorname{Var}(\mathsf{Q}_{u,\rho})\operatorname{Var}(\mathsf{P}_{v,\rho}) \to +\infty$: our entropic MUR disappears because, roughly speaking, the variance of (at least) one of the two target observables goes to infinity, its relative entropy vanishes by itself, and an optimal covariant bi-observable $\mathsf{M}_*$ has to take care of (at most) only the other target observable.
Remark 6.
Actually, something similar to the previous remark happens also at the macroscopic limit, and does not require the measuring instrument to be an optimal one; indeed, unbiasedness is enough in this case. This happens because the error function $S(\rho,\mathsf{M})$ quantifies a relative error; even if the measurement approximation $\mathsf{M}$ is fixed, such an error can be reduced by suitably changing the preparation $\rho$. Indeed, if we consider the position and momentum of a macroscopic particle, for instance the center of mass of many particles, it is natural that its state has much larger position and momentum uncertainties than the intrinsic uncertainties of the measuring instrument; that is, $\frac{V^{\mathsf{M}}_{11}}{\operatorname{Var}(\mathsf{Q}_{u,\rho})} \ll 1$ and $\frac{V^{\mathsf{M}}_{22}}{\operatorname{Var}(\mathsf{P}_{v,\rho})} \ll 1$, implying that the error function (57) is negligible. In practice, this is a classical case: the preparation has large position and momentum uncertainties and the measuring instrument is relatively good. In this situation we do not see the difference between the joint measurement of position and momentum and their separate sharp observations.
Remark 7.
The optimal approximating joint measurement $\mathsf{M}_* \in \mathcal{C}^G_{u,v}$ is unique; by (62) it depends on the preparation $\rho$ one is considering, as well as on the directions $u$ and $v$. A realization of $\mathsf{M}_*$ is the measuring procedure of Example 2.
Remark 8.
The MUR (59) is scale invariant, as both the error function $S(\rho,\mathsf{M})$ and the lower bound $c_\rho(\alpha)$ are such.
Remark 9.
For $\cos\alpha \neq 0$, we get $\inf_{\mathsf{M}\in\mathcal{C}^G_{u,v}} S(\rho,\mathsf{M}) = s(z_\rho)\log e$, where $z_\rho$ is defined by (61). As $z_\rho$ ranges in the interval $(0,1]$, the quantity $\inf_{\mathsf{M}\in\mathcal{C}^G_{u,v}} S(\rho,\mathsf{M})$ takes all the values in the interval $\left(0,\, 1 - \frac{\log e}{2}\right]$, so that
$$\sup_{\rho\in\mathcal{G}}\;\inf_{\mathsf{M}\in\mathcal{C}^G_{u,v}} S(\rho,\mathsf{M}) = 1 - \frac{\log e}{2}. \tag{63}$$
In order to get this result, we needed $\cos\alpha \neq 0$; however, the final result does not depend on $\alpha$. Therefore, in the $\sup_\rho\inf_{\mathsf{M}}$-approach of (63), the continuity from quantum to classical is lost.
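The constant in (63) is easy to evaluate. The short Python sketch below assumes base-2 logarithms, so that $\log 2 = 1$; the variable names are illustrative.

```python
import math

LOG2E = math.log2(math.e)

def s(x):
    """s(x) = ln(1 + x) - x / (1 + x)."""
    return math.log1p(x) - x / (1.0 + x)

# s is increasing, so s(z) * log2(e) on (0, 1] is largest at z = 1.
sup_value = s(1.0) * LOG2E          # (ln 2 - 1/2) * log2(e)
closed_form = 1.0 - LOG2E / 2.0     # right-hand side of (63) in bits
```

Both expressions give approximately 0.279 bits.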

6.1.2. Entropic Divergence of $\mathsf{Q}_u$, $\mathsf{P}_v$ from $\mathsf{M}$

Now we want to find an entropic quantification of the error made in observing $\mathsf{M}\in\mathcal{C}_{\mathbf{u},\mathbf{v}}$ as an approximation of $\mathsf{Q}_{\mathbf{u}}$ and $\mathsf{P}_{\mathbf{v}}$ in an arbitrary state $\rho$. The procedure of [41], already suggested in [19] (Section VI.C) for a different error function, is to consider the worst case by maximizing the error function over all the states. However, in the continuous framework this is not possible for the error function (56); indeed, from (57) we get $\sup_{\rho\in\mathcal{G}} S(\rho,\mathsf{M}) = +\infty$ even if we restrict to unbiased covariant bi-observables.
Anyway, the reason for $S(\rho,\mathsf{M})$ to diverge is classical: it depends only on the continuous nature of $\mathsf{Q}_{\mathbf{u}}$ and $\mathsf{P}_{\mathbf{v}}$, without any relation to their (quantum) incompatibility. Indeed, as we noted in Section 3.1, if an instrument measuring a random variable $X \sim N(a;\alpha^2)$ adds an independent noise $\nu \sim N(b;\beta^2)$, thus producing an output $X+\nu \sim N(a+b;\alpha^2+\beta^2)$, then the relative entropy $S(X\|X+\nu)$ diverges for $\alpha^2 \to 0$; this is what happens if we fix the noise and we allow for arbitrarily peaked preparations. Thus, the sum $S(\mathsf{Q}_{\mathbf{u},\rho}\|\mathsf{M}_{1,\rho}) + S(\mathsf{P}_{\mathbf{v},\rho}\|\mathsf{M}_{2,\rho})$ diverges if, fixed $\mathsf{M}$, we let $\operatorname{Var}(\mathsf{Q}_{\mathbf{u},\rho})$ or $\operatorname{Var}(\mathsf{P}_{\mathbf{v},\rho})$ go to 0.
The difference between the classical and quantum frameworks emerges if we bound from below the variances of the sharp position and momentum observables. Indeed, in the classical framework we have $\inf_{b,\beta^2}\sup_{\alpha^2\ge\epsilon} S(X\|X+\nu) = 0$ for every $\epsilon>0$; the same holds for the sum of two relative entropies if no relation exists between the two noises. On the contrary, in the quantum framework the entropic MURs appear due to the relation between the position and momentum errors occurring in any approximate joint measurement.
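The classical mechanism just described can be illustrated numerically. The following minimal Python sketch evaluates $S(X\|X+\nu)$ for Gaussian $X$ and $\nu$ through the standard closed form for the relative entropy of two normal laws. It assumes that entropies are measured in bits (so that $\log e = \log_2 e$) and that $s(x) = \ln(1+x) - x/(1+x)$, the form that reproduces $s(1) = \ln 2 - 1/2$; the function names are ours and purely illustrative.

```python
import math

LOG2_E = math.log2(math.e)  # "log e" when entropies are measured in bits (assumption)

def s(x):
    """s(x) = ln(1 + x) - x / (1 + x), assumed form of the function in (58)."""
    return math.log1p(x) - x / (1.0 + x)

def rel_entropy_noise(alpha2, b, beta2):
    """S(X || X + nu) in bits for X ~ N(a; alpha2) and independent nu ~ N(b; beta2).

    The result does not depend on the mean a of X.
    """
    return 0.5 * LOG2_E * (s(beta2 / alpha2) + b * b / (alpha2 + beta2))

# Fixed unbiased noise, increasingly peaked input: the relative entropy grows without bound.
peaked = [rel_entropy_noise(a2, 0.0, 1.0) for a2 in (1.0, 1e-2, 1e-4, 1e-6)]

# Input variance bounded below by eps: the worst case is alpha2 = eps,
# and it becomes arbitrarily small when the noise variance is reduced.
eps = 1e-3
worst = [rel_entropy_noise(eps, 0.0, beta2) for beta2 in (1e-3, 1e-5, 1e-7)]

print(peaked)
print(worst)
```

The first list increases as the input variance shrinks, while the second decreases towards zero as the noise variance shrinks, in agreement with the vanishing of the classical $\inf\sup$.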
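In order to avoid that $S(\rho,\mathsf{M}) \to +\infty$ due to merely classical effects, we thus introduce the following subset of the Gaussian states:
$$\mathcal{G}^{\mathbf{u},\mathbf{v}}_{\boldsymbol{\epsilon}} := \left\{\rho\in\mathcal{G} : \operatorname{Var}(\mathsf{Q}_{\mathbf{u},\rho}) \ge \epsilon_1,\ \operatorname{Var}(\mathsf{P}_{\mathbf{v},\rho}) \ge \epsilon_2\right\}, \qquad \epsilon_i > 0, \tag{64}$$
and we evaluate the error made in approximating $\mathsf{Q}_{\mathbf{u}}$ and $\mathsf{P}_{\mathbf{v}}$ with the marginals of a $(\mathbf{u},\mathbf{v})$-covariant bi-observable by maximizing the error function over all these states.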
Definition 10.
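The Gaussian $\boldsymbol{\epsilon}$-entropic divergence of $\mathsf{Q}_{\mathbf{u}}, \mathsf{P}_{\mathbf{v}}$ from $\mathsf{M}\in\mathcal{C}_{\mathbf{u},\mathbf{v}}$ is
$$D^{G}_{\boldsymbol{\epsilon}}(\mathsf{Q}_{\mathbf{u}},\mathsf{P}_{\mathbf{v}}\|\mathsf{M}) := \sup_{\rho\in\mathcal{G}^{\mathbf{u},\mathbf{v}}_{\boldsymbol{\epsilon}}} S(\rho,\mathsf{M}). \tag{65}$$
For Gaussian $\mathsf{M}$, depending on the choice of the thresholds $\epsilon_1$ and $\epsilon_2$, the divergence $D^{G}_{\boldsymbol{\epsilon}}(\mathsf{Q}_{\mathbf{u}},\mathsf{P}_{\mathbf{v}}\|\mathsf{M})$ can be easily computed or at least bounded.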
Theorem 2.
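Let the bi-observable $\mathsf{M}\in\mathcal{C}^{G}_{\mathbf{u},\mathbf{v}}$ be fixed.
(i) 
For $\epsilon_1\epsilon_2 \ge \frac{\hbar^2}{4}(\cos\alpha)^2$, the divergence $D^{G}_{\boldsymbol{\epsilon}}(\mathsf{Q}_{\mathbf{u}},\mathsf{P}_{\mathbf{v}}\|\mathsf{M})$ is given by
$$D^{G}_{\boldsymbol{\epsilon}}(\mathsf{Q}_{\mathbf{u}},\mathsf{P}_{\mathbf{v}}\|\mathsf{M}) = S(\rho_{\boldsymbol{\epsilon}}(\mathbf{u},\mathbf{v}),\mathsf{M}) = \frac{\log e}{2}\left[s(x_{\boldsymbol{\epsilon}}) + s(y_{\boldsymbol{\epsilon}}) + \Delta(\boldsymbol{\epsilon};\mathsf{M})\right], \tag{66}$$
where $\rho_{\boldsymbol{\epsilon}}(\mathbf{u},\mathbf{v})$ is any Gaussian state with $\operatorname{Var}(\mathsf{Q}_{\mathbf{u},\rho_{\boldsymbol{\epsilon}}(\mathbf{u},\mathbf{v})}) = \epsilon_1$ and $\operatorname{Var}(\mathsf{P}_{\mathbf{v},\rho_{\boldsymbol{\epsilon}}(\mathbf{u},\mathbf{v})}) = \epsilon_2$, and
$$x_{\boldsymbol{\epsilon}} := \frac{V^{\mathsf{M}}_{11}}{\epsilon_1}, \qquad y_{\boldsymbol{\epsilon}} := \frac{V^{\mathsf{M}}_{22}}{\epsilon_2}, \qquad \Delta(\boldsymbol{\epsilon};\mathsf{M}) := \frac{(a^{\mathsf{M}})^2}{V^{\mathsf{M}}_{11}+\epsilon_1} + \frac{(b^{\mathsf{M}})^2}{V^{\mathsf{M}}_{22}+\epsilon_2}.$$
(ii) 
For $\epsilon_1\epsilon_2 < \frac{\hbar^2}{4}(\cos\alpha)^2$, the divergence $D^{G}_{\boldsymbol{\epsilon}}(\mathsf{Q}_{\mathbf{u}},\mathsf{P}_{\mathbf{v}}\|\mathsf{M})$ is bounded from below by
$$D^{G}_{\boldsymbol{\epsilon}}(\mathsf{Q}_{\mathbf{u}},\mathsf{P}_{\mathbf{v}}\|\mathsf{M}) \ge S(\rho_{\boldsymbol{\epsilon}}(\mathbf{u},\mathbf{v}),\mathsf{M}) = \frac{\log e}{2}\left[s(x_{\boldsymbol{\epsilon}}) + s(y_{\boldsymbol{\epsilon}}) + \Delta(\boldsymbol{\epsilon};\mathsf{M})\right], \tag{67}$$
where $\rho_{\boldsymbol{\epsilon}}(\mathbf{u},\mathbf{v})$ is any Gaussian state with $\operatorname{Var}(\mathsf{Q}_{\mathbf{u},\rho_{\boldsymbol{\epsilon}}(\mathbf{u},\mathbf{v})}) = \epsilon_1$ and $\operatorname{Var}(\mathsf{P}_{\mathbf{v},\rho_{\boldsymbol{\epsilon}}(\mathbf{u},\mathbf{v})}) = \frac{\hbar^2}{4\epsilon_1}(\cos\alpha)^2$, and
$$x_{\boldsymbol{\epsilon}} := \frac{V^{\mathsf{M}}_{11}}{\epsilon_1}, \qquad y_{\boldsymbol{\epsilon}} := \frac{4\epsilon_1 V^{\mathsf{M}}_{22}}{\hbar^2(\cos\alpha)^2}, \qquad \Delta(\boldsymbol{\epsilon};\mathsf{M}) := \frac{(a^{\mathsf{M}})^2}{V^{\mathsf{M}}_{11}+\epsilon_1} + \frac{(b^{\mathsf{M}})^2}{V^{\mathsf{M}}_{22}+\frac{\hbar^2}{4\epsilon_1}(\cos\alpha)^2}.$$
The existence of the above states $\rho_{\boldsymbol{\epsilon}}(\mathbf{u},\mathbf{v})$ is guaranteed by Proposition 3.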
Proof. 
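By Proposition 3, maximizing the error function over the states in $\mathcal{G}^{\mathbf{u},\mathbf{v}}_{\boldsymbol{\epsilon}}$ is the same as maximizing (57) over the parameters $\operatorname{Var}(\mathsf{Q}_{\mathbf{u},\rho})$ and $\operatorname{Var}(\mathsf{P}_{\mathbf{v},\rho})$ satisfying (55) and (64) (note that in the bias $\Delta(\rho,\mathsf{M})$, the variances $\operatorname{Var}(\mathsf{M}_{1,\rho})$ and $\operatorname{Var}(\mathsf{M}_{2,\rho})$ depend on $\operatorname{Var}(\mathsf{Q}_{\mathbf{u},\rho})$ and $\operatorname{Var}(\mathsf{P}_{\mathbf{v},\rho})$ by (54)).
(i)
In the case $\epsilon_1\epsilon_2 \ge \frac{\hbar^2}{4}(\cos\alpha)^2$, the thresholds themselves satisfy the Heisenberg uncertainty relation, and so equality (66) follows from the expression (57) and the fact that the functions $s(x)$, $s(y)$, $\Delta(\rho,\mathsf{M})$ are decreasing in $\operatorname{Var}(\mathsf{Q}_{\mathbf{u},\rho})$ and $\operatorname{Var}(\mathsf{P}_{\mathbf{v},\rho})$.
(ii)
In the case $\epsilon_1\epsilon_2 < \frac{\hbar^2}{4}(\cos\alpha)^2$, we have to take into account the relation (55) for $\operatorname{Var}(\mathsf{Q}_{\mathbf{u},\rho})$ and $\operatorname{Var}(\mathsf{P}_{\mathbf{v},\rho})$: the supremum of $S(\rho,\mathsf{M})$ is achieved when $\operatorname{Var}(\mathsf{Q}_{\mathbf{u},\rho})\operatorname{Var}(\mathsf{P}_{\mathbf{v},\rho}) = \frac{\hbar^2}{4}(\cos\alpha)^2$, with $\operatorname{Var}(\mathsf{Q}_{\mathbf{u},\rho}) \ge \epsilon_1$ and $\operatorname{Var}(\mathsf{P}_{\mathbf{v},\rho}) \ge \epsilon_2$. Then inequality (67) follows by choosing $\operatorname{Var}(\mathsf{Q}_{\mathbf{u},\rho}) = \epsilon_1$ and $\operatorname{Var}(\mathsf{P}_{\mathbf{v},\rho}) = \frac{\hbar^2}{4\epsilon_1}(\cos\alpha)^2$.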
 ☐
Remark 10.
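The conditions on the states $\rho_{\boldsymbol{\epsilon}}(\mathbf{u},\mathbf{v})$ do not depend on $\mathsf{M}$, but only on the parameters defining $\mathcal{G}^{\mathbf{u},\mathbf{v}}_{\boldsymbol{\epsilon}}$. Thus, in the case $\epsilon_1\epsilon_2 \ge \frac{\hbar^2}{4}(\cos\alpha)^2$, any choice of $\rho_{\boldsymbol{\epsilon}}(\mathbf{u},\mathbf{v})$ yields a state which is the worst one for every Gaussian approximate joint measurement $\mathsf{M}$.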

6.1.3. Entropic Incompatibility Degree of Q u and P v

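The last step is to optimize the state independent $\boldsymbol{\epsilon}$-entropic divergence (65) over all the approximate joint measurements of $\mathsf{Q}_{\mathbf{u}}$ and $\mathsf{P}_{\mathbf{v}}$. This is done in the next definition.
Definition 11.
The Gaussian $\boldsymbol{\epsilon}$-entropic incompatibility degree of $\mathsf{Q}_{\mathbf{u}}, \mathsf{P}_{\mathbf{v}}$ is
$$c^{G}_{\mathrm{inc}}(\mathsf{Q}_{\mathbf{u}},\mathsf{P}_{\mathbf{v}};\boldsymbol{\epsilon}) := \inf_{\mathsf{M}\in\mathcal{C}^{G}_{\mathbf{u},\mathbf{v}}} D^{G}_{\boldsymbol{\epsilon}}(\mathsf{Q}_{\mathbf{u}},\mathsf{P}_{\mathbf{v}}\|\mathsf{M}) \equiv \inf_{\mathsf{M}\in\mathcal{C}^{G}_{\mathbf{u},\mathbf{v}}}\ \sup_{\rho\in\mathcal{G}^{\mathbf{u},\mathbf{v}}_{\boldsymbol{\epsilon}}} S(\rho,\mathsf{M}). \tag{68}$$
Again, depending on the choice of the thresholds $\epsilon_1$ and $\epsilon_2$, the entropic incompatibility degree $c^{G}_{\mathrm{inc}}(\mathsf{Q}_{\mathbf{u}},\mathsf{P}_{\mathbf{v}};\boldsymbol{\epsilon})$ can be easily computed or at least bounded.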
Theorem 3. 
(i) 
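For $\epsilon_1\epsilon_2 \ge \frac{\hbar^2}{4}(\cos\alpha)^2$, the incompatibility degree $c^{G}_{\mathrm{inc}}(\mathsf{Q}_{\mathbf{u}},\mathsf{P}_{\mathbf{v}};\boldsymbol{\epsilon})$ is given by
$$c^{G}_{\mathrm{inc}}(\mathsf{Q}_{\mathbf{u}},\mathsf{P}_{\mathbf{v}};\boldsymbol{\epsilon}) = (\log e)\left[\ln\left(1+\frac{\hbar|\cos\alpha|}{2\sqrt{\epsilon_1\epsilon_2}}\right) - \frac{\hbar|\cos\alpha|}{2\sqrt{\epsilon_1\epsilon_2}+\hbar|\cos\alpha|}\right]. \tag{69}$$
The infimum in (68) is attained and the optimal measurement is unique, in the sense that
$$c^{G}_{\mathrm{inc}}(\mathsf{Q}_{\mathbf{u}},\mathsf{P}_{\mathbf{v}};\boldsymbol{\epsilon}) = D^{G}_{\boldsymbol{\epsilon}}(\mathsf{Q}_{\mathbf{u}},\mathsf{P}_{\mathbf{v}}\|\mathsf{M}_{\boldsymbol{\epsilon}}) \tag{70}$$
for a unique $\mathsf{M}_{\boldsymbol{\epsilon}}\in\mathcal{C}^{G}_{\mathbf{u},\mathbf{v}}$; such a bi-observable is characterized by
$$a^{\mathsf{M}_{\boldsymbol{\epsilon}}} = 0, \qquad b^{\mathsf{M}_{\boldsymbol{\epsilon}}} = 0, \qquad V^{\mathsf{M}_{\boldsymbol{\epsilon}}}_{11} = \frac{\hbar}{2}\sqrt{\frac{\epsilon_1}{\epsilon_2}}\,|\cos\alpha|, \qquad V^{\mathsf{M}_{\boldsymbol{\epsilon}}}_{22} = \frac{\hbar}{2}\sqrt{\frac{\epsilon_2}{\epsilon_1}}\,|\cos\alpha|, \qquad V^{\mathsf{M}_{\boldsymbol{\epsilon}}}_{12} = 0. \tag{71}$$
(ii) 
For $\epsilon_1\epsilon_2 < \frac{\hbar^2}{4}(\cos\alpha)^2$, the incompatibility degree $c^{G}_{\mathrm{inc}}(\mathsf{Q}_{\mathbf{u}},\mathsf{P}_{\mathbf{v}};\boldsymbol{\epsilon})$ is bounded from below by
$$c^{G}_{\mathrm{inc}}(\mathsf{Q}_{\mathbf{u}},\mathsf{P}_{\mathbf{v}};\boldsymbol{\epsilon}) \ge (\log e)\left(\ln 2 - \frac{1}{2}\right). \tag{72}$$
The latter bound is
$$(\log e)\left(\ln 2 - \frac{1}{2}\right) = S(\rho_{\boldsymbol{\epsilon}}(\mathbf{u},\mathbf{v}),\mathsf{M}_{\boldsymbol{\epsilon}}) = \inf_{\mathsf{M}\in\mathcal{C}^{G}_{\mathbf{u},\mathbf{v}}} S(\rho_{\boldsymbol{\epsilon}}(\mathbf{u},\mathbf{v}),\mathsf{M}), \tag{73}$$
where the state $\rho_{\boldsymbol{\epsilon}}(\mathbf{u},\mathbf{v})$ is defined in item (ii) of Theorem 2 and $\mathsf{M}_{\boldsymbol{\epsilon}}$ is the bi-observable in $\mathcal{C}^{G}_{\mathbf{u},\mathbf{v}}$ such that
$$a^{\mathsf{M}_{\boldsymbol{\epsilon}}} = 0, \qquad b^{\mathsf{M}_{\boldsymbol{\epsilon}}} = 0, \qquad V^{\mathsf{M}_{\boldsymbol{\epsilon}}}_{11} = \epsilon_1, \qquad V^{\mathsf{M}_{\boldsymbol{\epsilon}}}_{22} = \frac{\hbar^2}{4\epsilon_1}(\cos\alpha)^2, \qquad V^{\mathsf{M}_{\boldsymbol{\epsilon}}}_{12} = 0. \tag{74}$$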
Proof. 
(i)
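In the case $\epsilon_1\epsilon_2 \ge \frac{\hbar^2}{4}(\cos\alpha)^2$, due to (66), the proof is the same as that of Theorem 1 with the replacements $\operatorname{Var}(\mathsf{Q}_{\mathbf{u},\rho}) \mapsto \epsilon_1$ and $\operatorname{Var}(\mathsf{P}_{\mathbf{v},\rho}) \mapsto \epsilon_2$.
(ii)
In the case $\epsilon_1\epsilon_2 < \frac{\hbar^2}{4}(\cos\alpha)^2$, starting from (67), the proof is the same as that of Theorem 1 with the replacements $\operatorname{Var}(\mathsf{Q}_{\mathbf{u},\rho}) \mapsto \epsilon_1$ and $\operatorname{Var}(\mathsf{P}_{\mathbf{v},\rho}) \mapsto \frac{\hbar^2}{4\epsilon_1}(\cos\alpha)^2$.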
 ☐
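As a numerical cross-check of item (i) of Theorem 3, the following minimal Python sketch compares the closed form (69) with a brute-force minimization of the divergence (66). It is only a sketch under stated assumptions: entropies are in bits, $\hbar = 1$, $s(x) = \ln(1+x) - x/(1+x)$, and the minimization runs over unbiased measurements with $V^{\mathsf{M}}_{12} = 0$ that saturate $V^{\mathsf{M}}_{11}V^{\mathsf{M}}_{22} = \frac{\hbar^2}{4}(\cos\alpha)^2$, a constraint inferred from the optimal values stated in the theorem. Since $s$ is increasing and the bias term is non-negative, relaxing these restrictions cannot lower the minimum. All function names are ours.

```python
import math

LOG2_E = math.log2(math.e)  # "log e" when entropies are measured in bits (assumption)

def s(x):
    """s(x) = ln(1 + x) - x / (1 + x), assumed form of the function in (58)."""
    return math.log1p(x) - x / (1.0 + x)

def divergence_66(v11, v22, eps1, eps2, a=0.0, b=0.0):
    """Right-hand side of (66) for a measurement with variances v11, v22 and biases a, b."""
    bias = a * a / (v11 + eps1) + b * b / (v22 + eps2)
    return 0.5 * LOG2_E * (s(v11 / eps1) + s(v22 / eps2) + bias)

def c_inc_closed_form(eps1, eps2, cos_alpha, hbar=1.0):
    """Closed form (69), rewritten as (log e) * s(z) with z = hbar |cos alpha| / (2 sqrt(eps1 eps2))."""
    z = hbar * abs(cos_alpha) / (2.0 * math.sqrt(eps1 * eps2))
    return LOG2_E * s(z)

def c_inc_grid(eps1, eps2, cos_alpha, hbar=1.0, steps=6001):
    """Minimum of (66) over a logarithmic grid in v11, with v22 fixed by the saturated constraint."""
    k = (hbar * cos_alpha) ** 2 / 4.0
    best = float("inf")
    for i in range(steps):
        v11 = 10.0 ** (-3.0 + 6.0 * i / (steps - 1))
        best = min(best, divergence_66(v11, k / v11, eps1, eps2))
    return best

eps1, eps2, cos_alpha = 2.0, 0.5, 0.6  # eps1 * eps2 = 1 >= 0.09, i.e., case (i) with hbar = 1
closed = c_inc_closed_form(eps1, eps2, cos_alpha)
numeric = c_inc_grid(eps1, eps2, cos_alpha)
print(closed, numeric)
```

With these hypothetical thresholds the grid minimum agrees with the closed form up to the grid resolution, and it is located near $V^{\mathsf{M}}_{11} = \frac{\hbar}{2}\sqrt{\epsilon_1/\epsilon_2}\,|\cos\alpha| = 0.6$.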
Remark 11 (State independent MUR, scalar observables).
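By means of the above results, we can formulate a state independent entropic MUR for the position $\mathsf{Q}_{\mathbf{u}}$ and the momentum $\mathsf{P}_{\mathbf{v}}$ in the following way. Chosen two positive thresholds $\epsilon_1$ and $\epsilon_2$, there exists a preparation $\rho_{\boldsymbol{\epsilon}}(\mathbf{u},\mathbf{v})\in\mathcal{G}^{\mathbf{u},\mathbf{v}}_{\boldsymbol{\epsilon}}$ (introduced in Theorem 2) such that, for all Gaussian approximate joint measurements $\mathsf{M}$ of $\mathsf{Q}_{\mathbf{u}}$ and $\mathsf{P}_{\mathbf{v}}$, we have
$$S(\mathsf{Q}_{\mathbf{u},\rho_{\boldsymbol{\epsilon}}(\mathbf{u},\mathbf{v})}\|\mathsf{M}_{1,\rho_{\boldsymbol{\epsilon}}(\mathbf{u},\mathbf{v})}) + S(\mathsf{P}_{\mathbf{v},\rho_{\boldsymbol{\epsilon}}(\mathbf{u},\mathbf{v})}\|\mathsf{M}_{2,\rho_{\boldsymbol{\epsilon}}(\mathbf{u},\mathbf{v})}) \ge \begin{cases} (\log e)\left[\ln\left(1+\dfrac{\hbar|\cos\alpha|}{2\sqrt{\epsilon_1\epsilon_2}}\right) - \dfrac{\hbar|\cos\alpha|}{2\sqrt{\epsilon_1\epsilon_2}+\hbar|\cos\alpha|}\right], & \text{if } \epsilon_1\epsilon_2 \ge \frac{\hbar^2}{4}(\cos\alpha)^2, \\[2ex] (\log e)\left(\ln 2 - \dfrac{1}{2}\right), & \text{if } \epsilon_1\epsilon_2 < \frac{\hbar^2}{4}(\cos\alpha)^2. \end{cases} \tag{75}$$
The inequality follows by (66) and (69) in the case $\epsilon_1\epsilon_2 \ge \frac{\hbar^2}{4}(\cos\alpha)^2$, and (73) in the case $\epsilon_1\epsilon_2 < \frac{\hbar^2}{4}(\cos\alpha)^2$.
What is relevant is that, for every approximate joint measurement $\mathsf{M}$, the total information loss $S(\rho,\mathsf{M})$ does exceed the lower bound (75) even if the set of states $\mathcal{G}^{\mathbf{u},\mathbf{v}}_{\boldsymbol{\epsilon}}$ forbids preparations $\rho$ with too peaked target distributions. Indeed, without the thresholds $\epsilon_1$, $\epsilon_2$, it would be trivial to exceed the lower bound (75), as we noted in Section 6.1.2.
We also remark that, chosen $\epsilon_1$ and $\epsilon_2$, we found a single state $\rho_{\boldsymbol{\epsilon}}(\mathbf{u},\mathbf{v})$ in $\mathcal{G}^{\mathbf{u},\mathbf{v}}_{\boldsymbol{\epsilon}}$ that satisfies (75) for every $\mathsf{M}$, so that $\rho_{\boldsymbol{\epsilon}}(\mathbf{u},\mathbf{v})$ is a ‘bad’ state for all Gaussian approximate joint measurements of position and momentum.
When $\epsilon_1\epsilon_2 \ge \frac{\hbar^2}{4}(\cos\alpha)^2$, the optimal approximate joint measurement $\mathsf{M}_{\boldsymbol{\epsilon}}$ is unique in the class of Gaussian $(\mathbf{u},\mathbf{v})$-covariant bi-observables; it depends only on the class of preparations $\mathcal{G}^{\mathbf{u},\mathbf{v}}_{\boldsymbol{\epsilon}}$: it is the best measurement for the worst choice of the preparation in the class $\mathcal{G}^{\mathbf{u},\mathbf{v}}_{\boldsymbol{\epsilon}}$.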
Remark 12.
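The entropic incompatibility degree $c^{G}_{\mathrm{inc}}(\mathsf{Q}_{\mathbf{u}},\mathsf{P}_{\mathbf{v}};\boldsymbol{\epsilon})$ is strictly positive for $\cos\alpha \neq 0$ (incompatible target observables) and it goes to zero in the limits $\alpha \to \pm\pi/2$ (compatible observables), $\hbar \to 0$ (classical limit), and $\epsilon_1\epsilon_2 \to \infty$ (large uncertainty states).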
Remark 13.
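The scale invariance of the relative entropy extends to the error function $S(\rho,\mathsf{M})$, hence to the divergence $D^{G}_{\boldsymbol{\epsilon}}(\mathsf{Q}_{\mathbf{u}},\mathsf{P}_{\mathbf{v}}\|\mathsf{M})$ and the entropic incompatibility degree $c^{G}_{\mathrm{inc}}(\mathsf{Q}_{\mathbf{u}},\mathsf{P}_{\mathbf{v}};\boldsymbol{\epsilon})$, as well as the entropic MUR (75).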

6.2. Vector Observables

Now the target observables are $\mathbf{Q}$ and $\mathbf{P}$ given in (3), with pvm’s $\mathsf{Q}$ and $\mathsf{P}$; the approximating bi-observables are the covariant phase-space observables $\mathcal{C}$ of Definition 5. Each bi-observable $\mathsf{M}\in\mathcal{C}$ is of the form $\mathsf{M} = \mathsf{M}^{\sigma}$ for some $\sigma\in\mathcal{S}$, where $\mathsf{M}^{\sigma}$ is given by (43). $\mathcal{C}^{G}$ is the subset of the Gaussian bi-observables in $\mathcal{C}$, and $\mathsf{M}^{\sigma}\in\mathcal{C}^{G}$ if and only if $\sigma$ is a Gaussian state.
We proceed to define the analogues of the scalar quantities introduced in Sections 6.1.1, 6.1.2 and 6.1.3. To do so, in the next proposition we recall some known results on matrices.
Proposition 14.
([50,51,52,65]). Let $M_1$ and $M_2$ be $n\times n$ complex matrices such that $M_1 > M_2 > 0$. Then, we have $0 < M_1^{-1} < M_2^{-1}$. Moreover, if $s:\mathbb{R}_+\to\mathbb{R}$ is a strictly increasing continuous function, we have $\operatorname{Tr}\{s(M_1)\} > \operatorname{Tr}\{s(M_2)\}$.

6.2.1. Error Function

Definition 12.
Given the preparation $\rho\in\mathcal{S}$ and the covariant phase-space observable $\mathsf{M}^{\sigma}$, with $\sigma\in\mathcal{S}$, the error function for the vector case is the sum of the two relative entropies:
$$S(\rho,\mathsf{M}^{\sigma}) := S(\mathsf{Q}_{\rho}\|\mathsf{M}^{\sigma}_{1,\rho}) + S(\mathsf{P}_{\rho}\|\mathsf{M}^{\sigma}_{2,\rho}). \tag{76}$$
As in the scalar case, the error function is scale invariant, it quantifies a relative error, and we always have $S(\rho,\mathsf{M}^{\sigma}) > 0$ because position and momentum are incompatible. Indeed, since the marginals of a bi-observable $\mathsf{M}^{\sigma}\in\mathcal{C}$ turn out to be convolutions of the respective sharp observables $\mathsf{Q}$ and $\mathsf{P}$ with some probability densities on $\mathbb{R}^n$, we have $\mathsf{Q}_{\rho} \neq \mathsf{M}^{\sigma}_{1,\rho}$ and $\mathsf{P}_{\rho} \neq \mathsf{M}^{\sigma}_{2,\rho}$ for all states $\rho$; this is an easy consequence, for instance, of Problem 26.1, p. 362, in [66].
In the Gaussian case the error function can be explicitly computed.
Proposition 15 (Error function for the vector Gaussian case).
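For $\rho,\sigma\in\mathcal{G}$, the error function has the two equivalent expressions:
$$S(\rho,\mathsf{M}^{\sigma}) = \frac{\log e}{2}\Big\{\operatorname{Tr}\big\{s(E_{\rho,\sigma}) + s(F_{\rho,\sigma})\big\} + a^{\sigma}\cdot(A^{\rho}+A^{\sigma})^{-1}a^{\sigma} + b^{\sigma}\cdot(B^{\rho}+B^{\sigma})^{-1}b^{\sigma}\Big\} \tag{77a}$$
$$S(\rho,\mathsf{M}^{\sigma}) = \frac{\log e}{2}\Big\{\operatorname{Tr}\big\{s(N_{\rho,\sigma}^{-1}) + s(R_{\rho,\sigma}^{-1})\big\} + a^{\sigma}\cdot(A^{\rho}+A^{\sigma})^{-1}a^{\sigma} + b^{\sigma}\cdot(B^{\rho}+B^{\sigma})^{-1}b^{\sigma}\Big\}, \tag{77b}$$
where the function $s$ is defined in (58), and
$$E_{\rho,\sigma} := (A^{\rho})^{-1/2}A^{\sigma}(A^{\rho})^{-1/2}, \qquad F_{\rho,\sigma} := (B^{\rho})^{-1/2}B^{\sigma}(B^{\rho})^{-1/2}, \tag{78a}$$
$$N_{\rho,\sigma} := (A^{\sigma})^{-1/2}A^{\rho}(A^{\sigma})^{-1/2}, \qquad R_{\rho,\sigma} := (B^{\sigma})^{-1/2}B^{\rho}(B^{\sigma})^{-1/2}. \tag{78b}$$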
Proof. 
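First of all, recall that
$$\mathsf{Q}_{\rho} = N(a^{\rho};A^{\rho}), \qquad \mathsf{M}^{\sigma}_{1,\rho} = N(a^{\rho}+a^{\sigma};A^{\rho}+A^{\sigma}), \qquad \mathsf{P}_{\rho} = N(b^{\rho};B^{\rho}), \qquad \mathsf{M}^{\sigma}_{2,\rho} = N(b^{\rho}+b^{\sigma};B^{\rho}+B^{\sigma}).$$
A direct application of (34) yields
$$S(\mathsf{Q}_{\rho}\|\mathsf{M}^{\sigma}_{1,\rho}) = \frac{1}{2}\log\frac{\det(A^{\rho}+A^{\sigma})}{\det A^{\rho}} + \frac{\log e}{2}\Big\{\operatorname{Tr}\big\{(A^{\rho}+A^{\sigma})^{-1}A^{\rho} - \mathbb{1}\big\} + a^{\sigma}\cdot(A^{\rho}+A^{\sigma})^{-1}a^{\sigma}\Big\}.$$
We can transform this equation by using
$$\frac{\det(A^{\sigma}+A^{\rho})}{\det A^{\rho}} = \det\left[(A^{\rho})^{-1/2}(A^{\sigma}+A^{\rho})(A^{\rho})^{-1/2}\right] = \det(\mathbb{1}+E_{\rho,\sigma}),$$
$$\ln\det(\mathbb{1}+E_{\rho,\sigma}) = \operatorname{Tr}\ln(\mathbb{1}+E_{\rho,\sigma}),$$
$$\operatorname{Tr}\big\{(A^{\rho}+A^{\sigma})^{-1}A^{\rho} - \mathbb{1}\big\} = \operatorname{Tr}\big\{(A^{\rho})^{1/2}(A^{\rho}+A^{\sigma})^{-1}(A^{\rho})^{1/2} - \mathbb{1}\big\} = -\operatorname{Tr}\big\{(\mathbb{1}+E_{\rho,\sigma})^{-1}E_{\rho,\sigma}\big\}.$$
This gives
$$S(\mathsf{Q}_{\rho}\|\mathsf{M}^{\sigma}_{1,\rho}) = \frac{\log e}{2}\Big\{\operatorname{Tr}\{s(E_{\rho,\sigma})\} + a^{\sigma}\cdot(A^{\rho}+A^{\sigma})^{-1}a^{\sigma}\Big\}.$$
In the same way a similar expression is obtained for $S(\mathsf{P}_{\rho}\|\mathsf{M}^{\sigma}_{2,\rho})$ and (77a) is proved.
On the other hand, by using
$$\ln\frac{\det(A^{\sigma}+A^{\rho})}{\det A^{\rho}} = \ln\frac{\det(\mathbb{1}+N_{\rho,\sigma})}{\det N_{\rho,\sigma}} = \ln\det(\mathbb{1}+N_{\rho,\sigma}^{-1}) = \operatorname{Tr}\ln(\mathbb{1}+N_{\rho,\sigma}^{-1}),$$
$$\operatorname{Tr}\big\{(A^{\rho}+A^{\sigma})^{-1}A^{\rho} - \mathbb{1}\big\} = -\operatorname{Tr}\big\{(A^{\rho}+A^{\sigma})^{-1}A^{\sigma}\big\} = -\operatorname{Tr}\big\{(\mathbb{1}+N_{\rho,\sigma}^{-1})^{-1}N_{\rho,\sigma}^{-1}\big\},$$
and the analogous expressions involving $B^{\rho}$ and $R_{\rho,\sigma}$, one gets (77b). ☐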
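The identity proved above can be checked numerically in a simple special case. The following minimal Python sketch compares (77a) with the sum of the two relative entropies computed directly from the standard formula for Gaussian distributions. It is restricted, as a simplifying assumption of ours, to diagonal covariance matrices (so that each trace reduces to a sum over diagonal entries), it measures entropies in bits, it sets $\hbar = 1$, and it takes $s(x) = \ln(1+x) - x/(1+x)$; the numerical values and the function names are hypothetical.

```python
import math

LOG2_E = math.log2(math.e)  # "log e" when entropies are measured in bits (assumption)

def s(x):
    """s(x) = ln(1 + x) - x / (1 + x), assumed form of the function in (58)."""
    return math.log1p(x) - x / (1.0 + x)

def kl_diag_gauss(mean0, var0, mean1, var1):
    """S(N(mean0; diag var0) || N(mean1; diag var1)) in bits (standard Gaussian formula)."""
    total = 0.0
    for m0, v0, m1, v1 in zip(mean0, var0, mean1, var1):
        total += math.log(v1 / v0) + v0 / v1 - 1.0 + (m1 - m0) ** 2 / v1
    return 0.5 * LOG2_E * total

def error_77a(A_rho, B_rho, A_sig, B_sig, a_sig, b_sig):
    """Formula (77a) for diagonal covariance matrices, given as lists of diagonal entries."""
    trace = sum(s(As / Ar) for As, Ar in zip(A_sig, A_rho))
    trace += sum(s(Bs / Br) for Bs, Br in zip(B_sig, B_rho))
    bias = sum(a * a / (Ar + As) for a, Ar, As in zip(a_sig, A_rho, A_sig))
    bias += sum(b * b / (Br + Bs) for b, Br, Bs in zip(b_sig, B_rho, B_sig))
    return 0.5 * LOG2_E * (trace + bias)

# Hypothetical n = 2 example with hbar = 1 (every pair satisfies A_ii * B_ii >= 1/4).
a_rho, A_rho = [0.3, -1.0], [1.0, 2.0]
b_rho, B_rho = [0.0, 2.0], [0.5, 0.25]
a_sig, A_sig = [0.1, 0.0], [0.5, 1.0]
b_sig, B_sig = [0.0, -0.2], [0.5, 0.5]

# Marginals of the joint measurement: means and variances of rho and sigma add up.
direct = kl_diag_gauss(a_rho, A_rho,
                       [x + y for x, y in zip(a_rho, a_sig)],
                       [x + y for x, y in zip(A_rho, A_sig)])
direct += kl_diag_gauss(b_rho, B_rho,
                        [x + y for x, y in zip(b_rho, b_sig)],
                        [x + y for x, y in zip(B_rho, B_sig)])
closed = error_77a(A_rho, B_rho, A_sig, B_sig, a_sig, b_sig)
print(direct, closed)
```

The two numbers coincide up to rounding, and neither depends on the means $a^{\rho}$, $b^{\rho}$ of the preparation.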

State Dependent Lower Bound

In principle, a state dependent lower bound for the error function could be found by analogy with Theorem 1, by taking again the infimum over all joint covariant measurements, that is $\inf_{\sigma} S(\rho,\mathsf{M}^{\sigma})$. By considering only Gaussian states $\rho$ and measurements $\mathsf{M}^{\sigma}$, from (18), (77a) and (78a), the infimum over $\sigma\in\mathcal{G}$ can be reduced to an infimum over the matrices $A^{\sigma}$:
$$\inf_{\sigma\in\mathcal{G}} S(\rho,\mathsf{M}^{\sigma}) = \frac{\log e}{2}\inf_{A^{\sigma}}\operatorname{Tr}\left\{s\Big((A^{\rho})^{-1/2}A^{\sigma}(A^{\rho})^{-1/2}\Big) + s\Big(\frac{\hbar^2}{4}(B^{\rho})^{-1/2}(A^{\sigma})^{-1}(B^{\rho})^{-1/2}\Big)\right\}.$$
The above equality follows since the monotonicity of $s$ (Proposition 14) implies that the trace term in (77a) attains its minimum when $B^{\sigma} = \frac{\hbar^2}{4}(A^{\sigma})^{-1}$. However, it remains an open problem to explicitly compute the infimum over the matrices $A^{\sigma}$ when the preparation $\rho$ is arbitrary.
Nevertheless, the computations can be done at least for a preparation $\rho_*$ of minimum uncertainty (Proposition 5). Indeed, by (22) we get
$$\inf_{\sigma\in\mathcal{G}} S(\rho_*,\mathsf{M}^{\sigma}) = \frac{\log e}{2}\inf_{A^{\sigma}}\operatorname{Tr}\left\{s(E_{\rho_*,\sigma}) + s(E_{\rho_*,\sigma}^{-1})\right\}.$$
Now we can diagonalize $E_{\rho_*,\sigma}$ and minimize over its eigenvalues; since $s(x) + s(x^{-1})$ attains its minimum value at $x = 1$, this procedure gives $E_{\rho_*,\sigma} = \mathbb{1}$. So, by denoting by $\sigma_*$ the state giving the minimum, we have
$$A^{\sigma_*} = A^{\rho_*}, \qquad B^{\sigma_*} = B^{\rho_*} = \frac{\hbar^2}{4}(A^{\rho_*})^{-1}, \tag{79}$$
$$\inf_{\sigma\in\mathcal{G}} S(\rho_*,\mathsf{M}^{\sigma}) = S(\rho_*,\mathsf{M}^{\sigma_*}) = n\,s(1)\log e.$$
For an arbitrary $\rho\in\mathcal{G}$, we can use the last formula to deduce an upper bound for $\inf_{\sigma\in\mathcal{G}} S(\rho,\mathsf{M}^{\sigma})$. Indeed, if $\rho_*$ is a minimum uncertainty state with $A^{\rho_*} = A^{\rho}$, then $B^{\rho} \ge \frac{\hbar^2}{4}(A^{\rho})^{-1} = B^{\rho_*}$ by (19), and, using again the state $\sigma_*$ of (79), we find
$$\inf_{\sigma\in\mathcal{G}} S(\rho,\mathsf{M}^{\sigma}) \le S(\rho,\mathsf{M}^{\sigma_*}) \le S(\rho_*,\mathsf{M}^{\sigma_*}) = n\,s(1)\log e.$$
The second inequality in the last formula follows from (77b), (78b) and the monotonicity of $s$ (Proposition 14).
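The one-variable minimization used above for minimum uncertainty preparations is easy to verify numerically. The following minimal Python sketch scans the per-eigenvalue cost $s(x) + s(x^{-1})$ on a logarithmic grid and evaluates the resulting value $n\,s(1)\log e$; it assumes $s(x) = \ln(1+x) - x/(1+x)$ and entropies in bits, and the choice $n = 3$ is only an example.

```python
import math

def s(x):
    """s(x) = ln(1 + x) - x / (1 + x), assumed form of the function in (58)."""
    return math.log1p(x) - x / (1.0 + x)

def g(x):
    """Per-eigenvalue cost s(x) + s(1/x); it simplifies to ln((1 + x)^2 / x) - 1."""
    return s(x) + s(1.0 / x)

# Logarithmic grid from 1e-3 to 1e3; the grid contains x = 1.
grid = [10.0 ** (-3.0 + 6.0 * i / 6000) for i in range(6001)]
x_min = min(grid, key=g)

n = 3  # example dimension
bound_bits = n * s(1.0) * math.log2(math.e)  # n s(1) log e, in bits
print(x_min, g(x_min), bound_bits)
```

The minimizer found on the grid is $x = 1$, where the cost equals $2s(1) = 2\ln 2 - 1$; for entropies in bits the value $n\,s(1)\log e$ equals $n\left(1 - \frac{\log_2 e}{2}\right)$.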

6.2.2. Entropic Divergence of Q , P from M σ

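In order to define a state independent measure of the error made in regarding the marginals of $\mathsf{M}^{\sigma}$ as approximations of $\mathsf{Q}$ and $\mathsf{P}$, we can proceed along the lines of the scalar case in Section 6.1.2. To this end, we introduce the following vector analogue of the Gaussian states defined in (64):
$$\mathcal{G}_{\boldsymbol{\epsilon}} := \left\{\rho\in\mathcal{G} : A^{\rho} \ge \epsilon_1\mathbb{1},\ B^{\rho} \ge \epsilon_2\mathbb{1}\right\}, \qquad \boldsymbol{\epsilon} \equiv (\epsilon_1,\epsilon_2), \quad \epsilon_i > 0. \tag{81}$$
In the vector case, Definition 10 then reads as follows.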
Definition 13.
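The Gaussian $\boldsymbol{\epsilon}$-entropic divergence of $\mathsf{Q}, \mathsf{P}$ from $\mathsf{M}^{\sigma}\in\mathcal{C}$ is
$$D^{G}_{\boldsymbol{\epsilon}}(\mathsf{Q},\mathsf{P}\|\mathsf{M}^{\sigma}) := \sup_{\rho\in\mathcal{G}_{\boldsymbol{\epsilon}}} S(\rho,\mathsf{M}^{\sigma}). \tag{82}$$
As in the scalar case, when $\mathsf{M}^{\sigma}$ is Gaussian, depending on the choice of the product $\epsilon_1\epsilon_2$, we can compute the divergence $D^{G}_{\boldsymbol{\epsilon}}(\mathsf{Q},\mathsf{P}\|\mathsf{M}^{\sigma})$ or at least bound it from below.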
Theorem 4.
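Let the bi-observable $\mathsf{M}^{\sigma}\in\mathcal{C}^{G}$ be fixed.
(i) 
For $\epsilon_1\epsilon_2 \ge \frac{\hbar^2}{4}$, the divergence $D^{G}_{\boldsymbol{\epsilon}}(\mathsf{Q},\mathsf{P}\|\mathsf{M}^{\sigma})$ is given by
$$D^{G}_{\boldsymbol{\epsilon}}(\mathsf{Q},\mathsf{P}\|\mathsf{M}^{\sigma}) = S(\rho_{\boldsymbol{\epsilon}},\mathsf{M}^{\sigma}) = \frac{\log e}{2}\Big[\operatorname{Tr}\big\{s(A^{\sigma}/\epsilon_1) + s(B^{\sigma}/\epsilon_2)\big\} + a^{\sigma}\cdot(A^{\sigma}+\epsilon_1\mathbb{1})^{-1}a^{\sigma} + b^{\sigma}\cdot(B^{\sigma}+\epsilon_2\mathbb{1})^{-1}b^{\sigma}\Big], \tag{83}$$
where $\rho_{\boldsymbol{\epsilon}}$ is any Gaussian state with $A^{\rho_{\boldsymbol{\epsilon}}} = \epsilon_1\mathbb{1}$ and $B^{\rho_{\boldsymbol{\epsilon}}} = \epsilon_2\mathbb{1}$.
(ii) 
For $\epsilon_1\epsilon_2 < \frac{\hbar^2}{4}$, the divergence $D^{G}_{\boldsymbol{\epsilon}}(\mathsf{Q},\mathsf{P}\|\mathsf{M}^{\sigma})$ is bounded from below by
$$D^{G}_{\boldsymbol{\epsilon}}(\mathsf{Q},\mathsf{P}\|\mathsf{M}^{\sigma}) \ge S(\rho_{\boldsymbol{\epsilon}},\mathsf{M}^{\sigma}) = \frac{\log e}{2}\Big[\operatorname{Tr}\big\{s(A^{\sigma}/\epsilon_1) + s(4\epsilon_1 B^{\sigma}/\hbar^2)\big\} + a^{\sigma}\cdot(A^{\sigma}+\epsilon_1\mathbb{1})^{-1}a^{\sigma} + b^{\sigma}\cdot\Big(B^{\sigma}+\frac{\hbar^2}{4\epsilon_1}\mathbb{1}\Big)^{-1}b^{\sigma}\Big], \tag{84}$$
where $\rho_{\boldsymbol{\epsilon}}$ is any Gaussian state with $A^{\rho_{\boldsymbol{\epsilon}}} = \epsilon_1\mathbb{1}$ and $B^{\rho_{\boldsymbol{\epsilon}}} = \frac{\hbar^2}{4\epsilon_1}\mathbb{1}$.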
Proof. 
(i)
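In the case $\epsilon_1\epsilon_2 \ge \frac{\hbar^2}{4}$, for $\rho\in\mathcal{G}_{\boldsymbol{\epsilon}}$ we have $N_{\rho,\sigma} \ge \epsilon_1(A^{\sigma})^{-1}$ and $R_{\rho,\sigma} \ge \epsilon_2(B^{\sigma})^{-1}$; by Proposition 14 we get
$$\operatorname{Tr}\{s(N_{\rho,\sigma}^{-1})\} \le \operatorname{Tr}\{s(A^{\sigma}/\epsilon_1)\}, \qquad \operatorname{Tr}\{s(R_{\rho,\sigma}^{-1})\} \le \operatorname{Tr}\{s(B^{\sigma}/\epsilon_2)\},$$
$$(A^{\rho}+A^{\sigma})^{-1} \le (\epsilon_1\mathbb{1}+A^{\sigma})^{-1}, \qquad (B^{\rho}+B^{\sigma})^{-1} \le (\epsilon_2\mathbb{1}+B^{\sigma})^{-1}.$$
By using these inequalities in the expression (77b), we get (83).
(ii)
In the case $\epsilon_1\epsilon_2 < \frac{\hbar^2}{4}$, the lower bound (84) follows by evaluating $S(\rho,\mathsf{M}^{\sigma})$ at the state $\rho = \rho_{\boldsymbol{\epsilon}}\in\mathcal{G}_{\boldsymbol{\epsilon}}$ with $A^{\rho_{\boldsymbol{\epsilon}}} = \epsilon_1\mathbb{1}$ and $B^{\rho_{\boldsymbol{\epsilon}}} = \frac{\hbar^2}{4\epsilon_1}\mathbb{1}$.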
 ☐
Note that $\rho_{\boldsymbol{\epsilon}}$ does not depend on $\sigma$, but only on the parameters defining $\mathcal{G}_{\boldsymbol{\epsilon}}$: again, in the case $\epsilon_1\epsilon_2 \ge \frac{\hbar^2}{4}$, the error attains its maximum at a state which is independent of the approximate measurement.

6.2.3. Entropic Incompatibility Degree of Q and P

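By analogy with Section 6.1.3, we can optimize the $\boldsymbol{\epsilon}$-entropic divergence over all the approximate joint measurements of $\mathsf{Q}$ and $\mathsf{P}$.
Definition 14.
The Gaussian $\boldsymbol{\epsilon}$-entropic incompatibility degree of $\mathsf{Q}$ and $\mathsf{P}$ is
$$c^{G}_{\mathrm{inc}}(\mathsf{Q},\mathsf{P};\boldsymbol{\epsilon}) := \inf_{\sigma\in\mathcal{G}} D^{G}_{\boldsymbol{\epsilon}}(\mathsf{Q},\mathsf{P}\|\mathsf{M}^{\sigma}) \equiv \inf_{\sigma\in\mathcal{G}}\ \sup_{\rho\in\mathcal{G}_{\boldsymbol{\epsilon}}} S(\rho,\mathsf{M}^{\sigma}). \tag{85}$$
Again, depending on the product $\epsilon_1\epsilon_2$, we can compute or at least bound $c^{G}_{\mathrm{inc}}(\mathsf{Q},\mathsf{P};\boldsymbol{\epsilon})$ from below.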
Theorem 5. 
(i) 
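For $\epsilon_1\epsilon_2 \ge \frac{\hbar^2}{4}$, the incompatibility degree $c^{G}_{\mathrm{inc}}(\mathsf{Q},\mathsf{P};\boldsymbol{\epsilon})$ is given by
$$c^{G}_{\mathrm{inc}}(\mathsf{Q},\mathsf{P};\boldsymbol{\epsilon}) = n\,(\log e)\left[\ln\left(1+\frac{\hbar}{2\sqrt{\epsilon_1\epsilon_2}}\right) - \frac{\hbar}{2\sqrt{\epsilon_1\epsilon_2}+\hbar}\right]. \tag{86}$$
The infimum in (85) is attained and the optimal measurement is unique, in the sense that
$$c^{G}_{\mathrm{inc}}(\mathsf{Q},\mathsf{P};\boldsymbol{\epsilon}) = D^{G}_{\boldsymbol{\epsilon}}(\mathsf{Q},\mathsf{P}\|\mathsf{M}^{\sigma_{\boldsymbol{\epsilon}}}) \tag{87}$$
for a unique $\sigma_{\boldsymbol{\epsilon}}\in\mathcal{G}$; such a state is the minimal uncertainty state characterized by
$$a^{\sigma_{\boldsymbol{\epsilon}}} = 0, \qquad b^{\sigma_{\boldsymbol{\epsilon}}} = 0, \qquad A^{\sigma_{\boldsymbol{\epsilon}}} = \frac{\hbar}{2}\sqrt{\frac{\epsilon_1}{\epsilon_2}}\,\mathbb{1}, \qquad B^{\sigma_{\boldsymbol{\epsilon}}} = \frac{\hbar}{2}\sqrt{\frac{\epsilon_2}{\epsilon_1}}\,\mathbb{1}, \qquad C^{\sigma_{\boldsymbol{\epsilon}}} = 0. \tag{88}$$
(ii) 
For $\epsilon_1\epsilon_2 < \frac{\hbar^2}{4}$, the incompatibility degree $c^{G}_{\mathrm{inc}}(\mathsf{Q},\mathsf{P};\boldsymbol{\epsilon})$ is bounded from below by
$$c^{G}_{\mathrm{inc}}(\mathsf{Q},\mathsf{P};\boldsymbol{\epsilon}) \ge n\,(\log e)\left(\ln 2 - \frac{1}{2}\right). \tag{89}$$
The latter bound is
$$n\,(\log e)\left(\ln 2 - \frac{1}{2}\right) = S(\rho_{\boldsymbol{\epsilon}},\mathsf{M}^{\sigma_{\boldsymbol{\epsilon}}}) = \inf_{\sigma\in\mathcal{G}} S(\rho_{\boldsymbol{\epsilon}},\mathsf{M}^{\sigma}), \tag{90}$$
where the preparation $\rho_{\boldsymbol{\epsilon}}$ is defined in item (ii) of Theorem 4 and $\sigma_{\boldsymbol{\epsilon}}$ is the state in $\mathcal{G}$ such that
$$a^{\sigma_{\boldsymbol{\epsilon}}} = 0, \qquad b^{\sigma_{\boldsymbol{\epsilon}}} = 0, \qquad A^{\sigma_{\boldsymbol{\epsilon}}} = \epsilon_1\mathbb{1}, \qquad B^{\sigma_{\boldsymbol{\epsilon}}} = \frac{\hbar^2}{4\epsilon_1}\mathbb{1}, \qquad C^{\sigma_{\boldsymbol{\epsilon}}} = 0. \tag{91}$$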
Proof. 
(i)
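In the case $\epsilon_1\epsilon_2 \ge \frac{\hbar^2}{4}$, from the expression (83) we get immediately $a^{\sigma_{\boldsymbol{\epsilon}}} = 0$, $b^{\sigma_{\boldsymbol{\epsilon}}} = 0$ and by (19) we have $B^{\sigma} \ge \frac{\hbar^2}{4}(A^{\sigma})^{-1}$. So, by (83) and Propositions 3 and 14, we get $B^{\sigma} = \frac{\hbar^2}{4}(A^{\sigma})^{-1}$, and
$$\inf_{\sigma\in\mathcal{G}}\ \sup_{\rho\in\mathcal{G}_{\boldsymbol{\epsilon}}} S(\rho,\mathsf{M}^{\sigma}) = \frac{\log e}{2}\inf_{A^{\sigma}}\operatorname{Tr}\left\{s(A^{\sigma}/\epsilon_1) + s\Big(\frac{\hbar^2}{4\epsilon_2}(A^{\sigma})^{-1}\Big)\right\}.$$
By minimizing over all the eigenvalues of $A^{\sigma}$, we get the minimum (86), which is attained if and only if $A^{\sigma}$ is as in (88). Hence, $A^{\sigma_{\boldsymbol{\epsilon}}}$ and $B^{\sigma_{\boldsymbol{\epsilon}}}$ are as in (88). This implies that any optimal state $\sigma_{\boldsymbol{\epsilon}}$ is a minimum uncertainty state; so, $C^{\sigma_{\boldsymbol{\epsilon}}} = 0$ and the state $\sigma_{\boldsymbol{\epsilon}}$ is unique.
(ii)
In the case $\epsilon_1\epsilon_2 < \frac{\hbar^2}{4}$, by (19) and Proposition 14, inequality (84) implies
$$\inf_{\sigma\in\mathcal{G}}\ \sup_{\rho\in\mathcal{G}_{\boldsymbol{\epsilon}}} S(\rho,\mathsf{M}^{\sigma}) \ge \frac{\log e}{2}\inf_{A^{\sigma}}\operatorname{Tr}\left\{s(A^{\sigma}/\epsilon_1) + s\big(\epsilon_1(A^{\sigma})^{-1}\big)\right\}.$$
By minimizing over all the eigenvalues of $A^{\sigma}$, we get (89). Then (89) holds for $\rho_{\boldsymbol{\epsilon}}$ as in item (ii) of Theorem 4 and $\sigma_{\boldsymbol{\epsilon}}$ in (91).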
 ☐
Remark 14 (State independent MUR, vector observables).
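By means of the above results, we can formulate the following state independent entropic MUR for the position $\mathbf{Q}$ and momentum $\mathbf{P}$. Chosen two positive thresholds $\epsilon_1$ and $\epsilon_2$, there exists a preparation $\rho_{\boldsymbol{\epsilon}}\in\mathcal{G}_{\boldsymbol{\epsilon}}$ (introduced in Theorem 4) such that, for all Gaussian approximate joint measurements $\mathsf{M}^{\sigma}$ of $\mathsf{Q}$ and $\mathsf{P}$, we have
$$S(\mathsf{Q}_{\rho_{\boldsymbol{\epsilon}}}\|\mathsf{M}^{\sigma}_{1,\rho_{\boldsymbol{\epsilon}}}) + S(\mathsf{P}_{\rho_{\boldsymbol{\epsilon}}}\|\mathsf{M}^{\sigma}_{2,\rho_{\boldsymbol{\epsilon}}}) \ge \begin{cases} n\,(\log e)\left[\ln\left(1+\dfrac{\hbar}{2\sqrt{\epsilon_1\epsilon_2}}\right) - \dfrac{\hbar}{2\sqrt{\epsilon_1\epsilon_2}+\hbar}\right], & \text{if } \epsilon_1\epsilon_2 \ge \frac{\hbar^2}{4}, \\[2ex] n\,(\log e)\left(\ln 2 - \dfrac{1}{2}\right), & \text{if } \epsilon_1\epsilon_2 < \frac{\hbar^2}{4}. \end{cases} \tag{92}$$
The inequality follows by (83) and (86) for $\epsilon_1\epsilon_2 \ge \frac{\hbar^2}{4}$, and (90) for $\epsilon_1\epsilon_2 < \frac{\hbar^2}{4}$.
Thus, also in the vector case, for every approximate joint measurement $\mathsf{M}^{\sigma}$, the total information loss $S(\rho,\mathsf{M}^{\sigma})$ does exceed the lower bound (92) even if $\mathcal{G}_{\boldsymbol{\epsilon}}$ forbids preparations $\rho$ with too peaked target distributions. Moreover, chosen $\epsilon_1$ and $\epsilon_2$, one can fix again a single ‘bad’ state $\rho_{\boldsymbol{\epsilon}}$ in $\mathcal{G}_{\boldsymbol{\epsilon}}$ that satisfies (92) for all Gaussian approximate joint measurements $\mathsf{M}^{\sigma}$ of $\mathsf{Q}$ and $\mathsf{P}$.
Whenever $\epsilon_1\epsilon_2 \ge \frac{\hbar^2}{4}$, the optimal approximating joint measurement $\mathsf{M}^{\sigma_{\boldsymbol{\epsilon}}}$ is unique in the class of Gaussian covariant bi-observables; it corresponds to a minimum uncertainty state $\sigma_{\boldsymbol{\epsilon}}$ which depends only on the chosen class of preparations $\mathcal{G}_{\boldsymbol{\epsilon}}$, that is, on the thresholds $\epsilon_1$ and $\epsilon_2$: $\mathsf{M}^{\sigma_{\boldsymbol{\epsilon}}}$ is the best measurement for the worst choice of the preparation in that class.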
Remark 15.
For $n = 1$, the vector lower bound in (92) reduces to the scalar lower bound found in (75) for two parallel directions $\mathbf{u}$ and $\mathbf{v}$; for $n \ge 1$, the bound linearly increases with $n$.
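The dependence of the lower bound in (92) on the dimension $n$ and on the thresholds can be tabulated with a few lines of code. The following minimal Python sketch is only illustrative: it assumes entropies in bits, $\hbar = 1$ by default, and $s(x) = \ln(1+x) - x/(1+x)$, so that the closed form (86) reads $n\,(\log e)\,s\big(\hbar/(2\sqrt{\epsilon_1\epsilon_2})\big)$; the function name is ours.

```python
import math

LOG2_E = math.log2(math.e)  # "log e" when entropies are measured in bits (assumption)

def s(x):
    """s(x) = ln(1 + x) - x / (1 + x), assumed form of the function in (58)."""
    return math.log1p(x) - x / (1.0 + x)

def mur_lower_bound(n, eps1, eps2, hbar=1.0):
    """Right-hand side of (92) in bits.

    Uses the closed form (86) when eps1 * eps2 >= hbar^2 / 4,
    and the bound n (log e)(ln 2 - 1/2) of (89) otherwise.
    """
    if eps1 * eps2 >= hbar ** 2 / 4.0:
        return n * LOG2_E * s(hbar / (2.0 * math.sqrt(eps1 * eps2)))
    return n * LOG2_E * s(1.0)

for n in (1, 2, 3):
    print(n, mur_lower_bound(n, 1.0, 1.0))   # linear growth with n
print(mur_lower_bound(1, 0.5, 0.5))          # threshold product exactly hbar^2 / 4
print(mur_lower_bound(1, 0.1, 0.1))          # below the threshold: same value
print(mur_lower_bound(3, 1e4, 1e4))          # large uncertainty states: nearly zero
```

The two branches match at $\epsilon_1\epsilon_2 = \hbar^2/4$, where both give $n\,(\log e)\,s(1)$, and the bound becomes negligible when $\epsilon_1\epsilon_2 \gg \hbar^2/4$.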
Remark 16.
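The entropic incompatibility degree $c^{G}_{\mathrm{inc}}(\mathsf{Q}_{\mathbf{u}},\mathsf{P}_{\mathbf{v}};\boldsymbol{\epsilon})$ is strictly positive for $\cos\alpha \neq 0$ (incompatible target observables) and it goes to zero in the limits $\alpha \to \pm\pi/2$ (compatible observables), $\hbar \to 0$ (classical limit), and $\epsilon_1\epsilon_2 \to \infty$ (large uncertainty states).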
Remark 17.
Similarly to Remark 6 for scalar target observables, also the MUR (92) is actually ineffective for macroscopic systems. Indeed, suppose we are concerned with position and momentum of a macroscopic particle, say the center of mass of a multi-particle system (in this case $n = 3$). The states $\rho$ which can be prepared in practice have macroscopic widths, say $\rho\in\mathcal{G}_{\boldsymbol{\epsilon}}$ with ‘large’ thresholds $\boldsymbol{\epsilon}$ and $\epsilon_1\epsilon_2 \gg \hbar^2/4$. Then, we consider a measuring instrument $\mathsf{M}^{\sigma_*}$ having a high precision with respect to this class of states, but not necessarily attaining a precision near the quantum limits. For instance, let us take $\mathsf{M}^{\sigma_*}\in\mathcal{C}^{G}$ with $A^{\sigma_*} = \delta_1\mathbb{1}$, $B^{\sigma_*} = \delta_2\mathbb{1}$, and $0 < \delta_1 \ll \epsilon_1$, $0 < \delta_2 \ll \epsilon_2$; we assume $\mathsf{M}^{\sigma_*}$ is also unbiased: $a^{\sigma_*} = 0$, $b^{\sigma_*} = 0$. Obviously, $\delta_1\delta_2 \ge \hbar^2/4$ must hold. Then, $\forall\rho\in\mathcal{G}_{\boldsymbol{\epsilon}}$ by (77a) and (78a) we have
$$E_{\rho,\sigma_*} = \delta_1(A^{\rho})^{-1} \le \frac{\delta_1}{\epsilon_1}\mathbb{1}, \qquad F_{\rho,\sigma_*} = \delta_2(B^{\rho})^{-1} \le \frac{\delta_2}{\epsilon_2}\mathbb{1},$$
$$0 < S(\rho,\mathsf{M}^{\sigma_*}) = \frac{\log e}{2}\operatorname{Tr}\big\{s(E_{\rho,\sigma_*}) + s(F_{\rho,\sigma_*})\big\} \le \frac{n\log e}{2}\big[s(\delta_1/\epsilon_1) + s(\delta_2/\epsilon_2)\big].$$
By (58) the function $s$ is increasing and it behaves as $s(x) \simeq x^2/2$ in a neighborhood of zero; in the present case $\delta_1/\epsilon_1 \ll 1$ and $\delta_2/\epsilon_2 \ll 1$, thus implying that the error function is negligible. This is practically a ‘classical’ case: the preparation has ‘large’ position and momentum uncertainties and the measuring instrument is ‘relatively good’. In this situation we do not see the difference between the joint measurement of position and momentum and their separate sharp distributions. Of course the bound (92) continues to hold, but it is also negligible since $\epsilon_1\epsilon_2 \gg \hbar^2/4$.
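To give an idea of the orders of magnitude involved, the following minimal Python sketch evaluates the upper bound $\frac{n\log e}{2}\big[s(\delta_1/\epsilon_1) + s(\delta_2/\epsilon_2)\big]$ and compares it with the quadratic approximation $s(x)\simeq x^2/2$. The ratio $10^{-3}$ is a hypothetical value chosen only for illustration; entropies are assumed to be in bits and $s(x) = \ln(1+x) - x/(1+x)$.

```python
import math

def s(x):
    """s(x) = ln(1 + x) - x / (1 + x), assumed form of the function in (58)."""
    return math.log1p(x) - x / (1.0 + x)

def macroscopic_error_bound(n, ratio1, ratio2):
    """Upper bound (n log e / 2) [s(ratio1) + s(ratio2)] in bits,
    with ratio1 = delta1 / eps1 and ratio2 = delta2 / eps2."""
    return 0.5 * n * math.log2(math.e) * (s(ratio1) + s(ratio2))

ratio = 1e-3  # hypothetical instrument-to-preparation variance ratio
bound = macroscopic_error_bound(3, ratio, ratio)
quadratic = 0.5 * 3 * math.log2(math.e) * ratio ** 2  # same bound with s(x) replaced by x^2 / 2
print(bound, quadratic)
```

For this ratio the bound is of the order of $10^{-6}$ bits, and the quadratic approximation reproduces it to within a fraction of a percent.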
Remark 18.
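Also in the vector case, the scale invariance of the relative entropy extends to the error function $S(\rho,\mathsf{M}^{\sigma})$, the divergence $D^{G}_{\boldsymbol{\epsilon}}(\mathsf{Q},\mathsf{P}\|\mathsf{M}^{\sigma})$ and the entropic incompatibility degree $c^{G}_{\mathrm{inc}}(\mathsf{Q},\mathsf{P};\boldsymbol{\epsilon})$, as well as the entropic MUR (92). Indeed, let us consider the dimensionless versions of position and momentum (35) and their associated projection valued measures $\widetilde{\mathsf{Q}}$, $\widetilde{\mathsf{P}}$ introduced in Section 4. Accordingly, we rescale the joint measurement $\mathsf{M}^{\sigma}$ of (43) in the same way, obtaining the POVM
$$\widetilde{\mathsf{M}}^{\sigma}(B) = \int_B \widetilde{M}^{\sigma}(\tilde{x},\tilde{p})\,\mathrm{d}\tilde{x}\,\mathrm{d}\tilde{p}, \qquad \widetilde{M}^{\sigma}(\tilde{x},\tilde{p}) = \frac{1}{(2\pi\lambda)^n}\exp\left\{\frac{i}{\lambda}\big(\tilde{p}\cdot\widetilde{Q} - \tilde{x}\cdot\widetilde{P}\big)\right\}\Pi\sigma\Pi\exp\left\{-\frac{i}{\lambda}\big(\tilde{p}\cdot\widetilde{Q} - \tilde{x}\cdot\widetilde{P}\big)\right\}.$$
Here, both the vector variables $\tilde{x}$ and $\tilde{p}$, as well as the components of the Borel set $B$, are dimensionless. By the scale invariance of the relative entropy, the error function takes the same value as in the dimensioned case:
$$S(\widetilde{\mathsf{Q}}_{\rho}\|\widetilde{\mathsf{M}}^{\sigma}_{1,\rho}) + S(\widetilde{\mathsf{P}}_{\rho}\|\widetilde{\mathsf{M}}^{\sigma}_{2,\rho}) = S(\mathsf{Q}_{\rho}\|\mathsf{M}^{\sigma}_{1,\rho}) + S(\mathsf{P}_{\rho}\|\mathsf{M}^{\sigma}_{2,\rho}).$$
Then, the scale invariance holds for the entropic divergence and incompatibility degree, too:
$$D^{G}_{\tilde{\boldsymbol{\epsilon}}}(\widetilde{\mathsf{Q}},\widetilde{\mathsf{P}}\|\widetilde{\mathsf{M}}^{\sigma}) = D^{G}_{\boldsymbol{\epsilon}}(\mathsf{Q},\mathsf{P}\|\mathsf{M}^{\sigma}), \qquad c^{G}_{\mathrm{inc}}(\widetilde{\mathsf{Q}},\widetilde{\mathsf{P}};\tilde{\boldsymbol{\epsilon}}) = c^{G}_{\mathrm{inc}}(\mathsf{Q},\mathsf{P};\boldsymbol{\epsilon}),$$
where $\tilde{\epsilon}_1 := \varkappa\,\epsilon_1$ and $\tilde{\epsilon}_2 := \frac{\lambda^2\epsilon_2}{\hbar^2\varkappa}$. In particular $\tilde{\epsilon}_1\tilde{\epsilon}_2 \ge \frac{\lambda^2}{4} \iff \epsilon_1\epsilon_2 \ge \frac{\hbar^2}{4}$ and, in this case, we have
$$n\,(\log e)\,s\left(\frac{\lambda}{2\sqrt{\tilde{\epsilon}_1\tilde{\epsilon}_2}}\right) = c^{G}_{\mathrm{inc}}(\widetilde{\mathsf{Q}},\widetilde{\mathsf{P}};\tilde{\boldsymbol{\epsilon}}) = c^{G}_{\mathrm{inc}}(\mathsf{Q},\mathsf{P};\boldsymbol{\epsilon}) = n\,(\log e)\,s\left(\frac{\hbar}{2\sqrt{\epsilon_1\epsilon_2}}\right).$$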

7. Conclusions

We have extended the relative entropy formulation of MURs given in [41] from the case of discrete incompatible observables to a particular instance of continuous target observables, namely the position and momentum vectors, or two components of them along two possibly non parallel directions. The entropic MURs we found share the nice property of being scale invariant and well-behaved in the classical and macroscopic limits. Moreover, in the scalar case, when the angle spanned by the position and momentum components goes to $\pm\pi/2$, the entropic bound correctly reflects their increasing compatibility by approaching zero with continuity.
Although our results are limited to the case of Gaussian preparation states and covariant Gaussian approximate joint measurements, we conjecture that the bounds we found still hold for arbitrary states and general (not necessarily covariant or Gaussian) bi-observables. Let us see with some more detail how this should work in the case when the target observables are the vectors $\mathbf{Q}$ and $\mathbf{P}$.
The most general procedure should be to consider the error function $S(\mathsf{Q}_{\rho}\|\mathsf{M}_{1,\rho}) + S(\mathsf{P}_{\rho}\|\mathsf{M}_{2,\rho})$ for an arbitrary POVM $\mathsf{M}$ on $\mathbb{R}^n\times\mathbb{R}^n$ and any state $\rho\in\mathcal{S}$. First of all, we need states for which neither the position nor the momentum dispersion are too small; the obvious generalization of the test states (81) is
$$\mathcal{S}_{\boldsymbol{\epsilon}} := \left\{\rho\in\mathcal{S}_2 : A^{\rho} \ge \epsilon_1\mathbb{1},\ B^{\rho} \ge \epsilon_2\mathbb{1}\right\}, \qquad \epsilon_i > 0.$$
Then, the most general definitions of the entropic divergence and incompatibility degree are:
$$D_{\boldsymbol{\epsilon}}(\mathsf{Q},\mathsf{P}\|\mathsf{M}) := \sup_{\rho\in\mathcal{S}_{\boldsymbol{\epsilon}}}\left[S(\mathsf{Q}_{\rho}\|\mathsf{M}_{1,\rho}) + S(\mathsf{P}_{\rho}\|\mathsf{M}_{2,\rho})\right], \tag{94}$$
$$c_{\mathrm{inc}}(\mathsf{Q},\mathsf{P};\boldsymbol{\epsilon}) := \inf_{\mathsf{M}} D_{\boldsymbol{\epsilon}}(\mathsf{Q},\mathsf{P}\|\mathsf{M}). \tag{95}$$
It may happen that $\mathsf{Q}_{\rho}$ is not absolutely continuous with respect to $\mathsf{M}_{1,\rho}$, or $\mathsf{P}_{\rho}$ with respect to $\mathsf{M}_{2,\rho}$; in this case, the error function and the entropic divergence take the value $+\infty$ by definition. So, we can restrict to bi-observables that are (weakly) absolutely continuous with respect to the Lebesgue measure. However, the true difficulty is that, even with this assumption, here we are not able to estimate (94), hence (95). It could be that the symmetrization techniques used in [17,19] can be extended to the present setting, and one can reduce the evaluation of the entropic incompatibility index to optimizing over all covariant bi-observables. Indeed, in the present paper we a priori selected only covariant approximating measurements; we would like to understand if, among all approximating measurements, the relative entropy approach selects covariant bi-observables by itself. However, even if $\mathsf{M}$ is covariant, there remains the problem that we do not know how to evaluate (94) if $\rho$ and $\mathsf{M}$ are not Gaussian. It is reasonable to expect that some continuity and convexity arguments should apply, and the bounds in Theorem 5 might be extended to the general case by taking dense convex combinations. Also the techniques used for the PURs in [8,9] could be of help in order to extend what we did with Gaussian states to arbitrary states. This leads us to conjecture:
$$c_{\mathrm{inc}}(\mathsf{Q},\mathsf{P};\boldsymbol{\epsilon}) = c^{G}_{\mathrm{inc}}(\mathsf{Q},\mathsf{P};\boldsymbol{\epsilon}). \tag{96}$$
Conjecture (96) is also supported since the uniqueness of the optimal approximating bi-observable in Theorem 5(i) is reminiscent of what happens in the discrete case of two Fourier conjugated mutually unbiased bases (MUBs); indeed, in the latter case, the optimal bi-observable is actually unique among all the bi-observables, not only the covariant ones (see [41] (Theorem 5)).
Similar considerations obviously apply also to the case of scalar target observables. We leave a deeper investigation of equality (96) to future work.
As a final consideration, one could be interested in finding error/disturbance bounds involving sequential measurements of position and momentum, rather than considering all their possible approximate joint measurements. As sequential measurements are a proper subset of the set of all the bi-observables, optimizing only over them should lead to bounds that are not smaller than $c_{\mathrm{inc}}$. This is why an error/disturbance entropic bound, denoted by $c_{\mathrm{ed}}$ and distinct from $c_{\mathrm{inc}}$, was introduced in [41]. However, it was also proved that the equality $c_{\mathrm{inc}} = c_{\mathrm{ed}}$ holds when one of the target observables is discrete and sharp. Now, in the present paper, only sharp target observables are involved; although the argument of [41] cannot be extended to the continuous setting, the optimal approximating joint observables we found in Theorems 3(i) and 5(i) actually are sequential measurements. Indeed, the optimal bi-observable in Theorem 3(i) is one of the POVMs described in Examples 2 and 3 (see (74)); all these bi-observables have a (trivial) sequential implementation in terms of an unsharp measurement of $Q_{\mathbf{u}}$ followed by sharp $P_{\mathbf{v}}$. On the other hand, in the vector case, it was shown in ([67], Corollary 1) that all covariant phase-space observables can be obtained as a sequential measurement of an unsharp version of the position $\mathbf{Q}$ followed by the sharp measurement of the momentum $\mathbf{P}$. Therefore, $c_{\mathrm{inc}} = c_{\mathrm{ed}}$ also for target position and momentum observables, in both the scalar and vector case.

Author Contributions

The three authors equally contributed to the paper.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Heisenberg, W. Über den anschaulichen Inhalt der quantentheoretischen Kinematik und Mechanik. Zeitschr. Phys. 1927, 43, 172–198.
  2. Simon, R.; Mukunda, N.; Dutta, B. Quantum-noise matrix for multimode systems: U(n) invariance, squeezing, and normal forms. Phys. Rev. A 1994, 49, 1567–1583.
  3. Holevo, A.S. Statistical Structure of Quantum Theory; Lecture Notes in Physics Monographs 67; Springer: Berlin, Germany, 2001.
  4. Holevo, A.S. Probabilistic and Statistical Aspects of Quantum Theory; Quaderni della Normale; Edizioni della Normale: Pisa, Italy, 2011.
  5. Holevo, A.S. Quantum Systems, Channels, Information; De Gruyter: Berlin, Germany, 2012.
  6. Robertson, H. The uncertainty principle. Phys. Rev. 1929, 34, 163–164.
  7. Hirschman, I.I. A note on entropy. Am. J. Math. 1957, 79, 152–156.
  8. Beckner, W. Inequalities in Fourier analysis. Ann. Math. 1975, 102, 159–182.
  9. Białynicki-Birula, I.; Mycielski, J. Uncertainty relations for information entropy in wave mechanics. Commun. Math. Phys. 1975, 44, 129–132.
  10. Maassen, H.; Uffink, J.B.M. Generalized entropic uncertainty relations. Phys. Rev. Lett. 1988, 60, 1103–1106.
  11. Gibilisco, P.; Isola, T. On a refinement of Heisenberg uncertainty relation by means of quantum Fisher information. J. Math. Anal. Appl. 2011, 375, 270–275.
  12. Coles, P.J.; Berta, M.; Tomamichel, M.; Wehner, S. Entropic uncertainty relations and their applications. Rev. Mod. Phys. 2017, 89, 015002.
  13. Wehner, S.; Winter, A. Entropic uncertainty relations—A survey. New J. Phys. 2010, 12, 025009. [Google Scholar] [CrossRef]
  14. Ozawa, M. Position measuring interactions and the Heisenberg uncertainty principle. Phys. Lett. A 2002, 299, 1–7. [Google Scholar] [CrossRef]
  15. Ozawa, M. Physical content of Heisenberg’s uncertainty relation: Limitation and reformulation. Phys. Lett. A 2003, 318, 21–29. [Google Scholar] [CrossRef]
  16. Ozawa, M. Universally valid reformulation of the Heisenberg uncertainty principle on noise and disturbance in measurement. Phys. Rev. A 2003, 67, 042105.
  17. Werner, R.F. The uncertainty relation for joint measurement of position and momentum. Quantum Inf. Comput. 2004, 4, 546–562.
  18. Busch, P.; Heinonen, T.; Lahti, P. Heisenberg’s Uncertainty Principle. Phys. Rep. 2007, 452, 155–176.
  19. Busch, P.; Lahti, P.; Werner, R. Measurement uncertainty relations. J. Math. Phys. 2014, 55, 042111.
  20. Busch, P.; Lahti, P.; Werner, R. Quantum root-mean-square error and measurement uncertainty relations. Rev. Mod. Phys. 2014, 86, 1261–1281.
  21. Ozawa, M. Heisenberg’s original derivation of the uncertainty principle and its universally valid reformulations. Curr. Sci. 2015, 109, 2006–2016.
  22. Davies, E.B. Quantum Theory of Open Systems; Academic: London, UK, 1976.
  23. Busch, P.; Grabowski, M.; Lahti, P. Operational Quantum Physics; Springer: Berlin, Germany, 1997.
  24. Barchielli, A.; Gregoratti, M. Quantum Trajectories and Measurements in Continuous Time: The Diffusive Case; Lecture Notes in Physics; Springer: Berlin/Heidelberg, Germany, 2009; Volume 782.
  25. Heinosaari, T.; Ziman, M. The Mathematical Language of Quantum Theory: From Uncertainty to Entanglement; Cambridge University Press: Cambridge, UK, 2012.
  26. Busch, P.; Lahti, P.; Pellonpää, J.-P.; Ylinen, K. Quantum Measurement; Springer: Berlin, Germany, 2016.
  27. Buscemi, F.; Hall, M.J.W.; Ozawa, M.; Wilde, M.M. Noise and disturbance in quantum measurements: An information-theoretic approach. Phys. Rev. Lett. 2014, 112, 050401.
  28. Busch, P.; Heinosaari, T.; Schultz, J.; Stevens, N. Comparing the degrees of incompatibility inherent in probabilistic physical theories. Europhys. Lett. 2013, 103, 10002.
  29. Busch, P.; Lahti, P.; Werner, R. Proof of Heisenberg’s error-disturbance relation. Phys. Rev. Lett. 2013, 111, 160405.
  30. Coles, P.J.; Furrer, F. State-dependent approach to entropic measurement-disturbance relations. Phys. Lett. A 2015, 379, 105–112.
  31. Heinosaari, T.; Schultz, J.; Toigo, A.; Ziman, M. Maximally incompatible quantum observables. Phys. Lett. A 2014, 378, 1695–1699.
  32. Werner, R.F. Uncertainty relations for general phase spaces. Front. Phys. 2016, 11, 110305.
  33. Buscemi, F.; Das, S.; Wilde, M.M. Approximate reversibility in the context of entropy gain, information gain, and complete positivity. Phys. Rev. A 2016, 93, 062314.
  34. Barchielli, A.; Lupieri, G. Instrumental processes, entropies, information in quantum continual measurements. Quantum Inf. Comput. 2004, 4, 437–449.
  35. Barchielli, A.; Lupieri, G. Instruments and channels in quantum information theory. Opt. Spectrosc. 2005, 99, 425–432.
  36. Barchielli, A.; Lupieri, G. Quantum measurements and entropic bounds on information transmission. Quantum Inf. Comput. 2006, 6, 16–45.
  37. Barchielli, A.; Lupieri, G. Instruments and mutual entropies in quantum information. Banach Center Publ. 2006, 73, 65–80.
  38. Barchielli, A.; Lupieri, G. Entropic bounds and continual measurements. In Quantum Probability and Infinite Dimensional Analysis; QP-PQ: Quantum Probability and White Noise Analysis; Accardi, L., Freudenberg, W., Schürmann, M., Eds.; World Scientific: Singapore, 2007; Volume 20, pp. 79–89.
  39. Barchielli, A.; Lupieri, G. Information gain in quantum continual measurements. In Quantum Stochastics and Information; Belavkin, V.P., Guţǎ, M., Eds.; World Scientific: Singapore, 2008; pp. 325–345.
  40. Maccone, L. Entropic information-disturbance tradeoff. EPL 2007, 77, 40002.
  41. Barchielli, A.; Gregoratti, M.; Toigo, A. Measurement uncertainty relations for discrete observables: Relative entropy formulation. arXiv, 2016; arXiv:1608.01986.
  42. Braunstein, S.L.; van Loock, P. Quantum information with continuous variables. Rev. Mod. Phys. 2005, 77, 513–577.
  43. Heinosaari, T.; Kiukas, J.; Schultz, J. Breaking Gaussian incompatibility on continuous variable quantum systems. J. Math. Phys. 2015, 56, 082202.
  44. Kiukas, J.; Schultz, J. Informationally complete sets of Gaussian measurements. J. Phys. A Math. Theor. 2013, 46, 485303.
  45. Weedbrook, C.; Pirandola, S.; García-Patrón, R.; Cerf, N.J.; Ralph, T.C.; Shapiro, J.H.; Lloyd, S. Gaussian quantum information. Rev. Mod. Phys. 2012, 84, 621–669.
  46. Huang, Y. Entropic uncertainty relations in multidimensional position and momentum spaces. Phys. Rev. A 2011, 83, 052124.
  47. Heinosaari, T.; Miyadera, T.; Ziman, M. An invitation to quantum incompatibility. J. Phys. A Math. Theor. 2016, 49, 123001.
  48. Simon, R.; Sudarshan, E.C.G.; Mukunda, N. Gaussian-Wigner distributions in quantum mechanics and optics. Phys. Rev. A 1987, 36, 3868–3880.
  49. Horn, R.A.; Zhang, F. Basic Properties of the Schur Complement. In The Schur Complement and Its Applications; Zhang, F., Ed.; Numerical Methods and Algorithms; Springer: Berlin, Germany, 2005; pp. 17–46.
  50. Petz, D. Quantum Information Theory and Quantum Statistics; Springer: Berlin, Germany, 2008.
  51. Carlen, E. Trace Inequalities and Quantum Entropy: An Introductory Course. In Entropy and the Quantum; Contemporary Mathematics; American Mathematical Society: Providence, RI, USA, 2010; Volume 529, pp. 73–140.
  52. Bhatia, R. Matrix Analysis; Springer: New York, NY, USA, 1997.
  53. Werner, R.F. Quantum harmonic analysis on phase spaces. J. Math. Phys. 1984, 25, 1404–1411.
  54. Burnham, K.P.; Anderson, D.R. Model Selection and Multimodel Inference: A Practical Information-Theoretic Approach; Springer: New York, NY, USA, 2002.
  55. Topsøe, F. Basic concepts, identities and inequalities - the toolkit of Information Theory. Entropy 2001, 3, 162–190.
  56. Cover, T.M.; Thomas, J.A. Elements of Information Theory, 2nd ed.; Wiley: Hoboken, NJ, USA, 2006.
  57. Carmeli, C.; Heinonen, T.; Toigo, A. Position and momentum observables on R and on R3. J. Math. Phys. 2004, 45, 2526–2539.
  58. Barchielli, A.; Lupieri, G. Quantum stochastic calculus, operation valued stochastic processes and continual measurements in quantum mechanics. J. Math. Phys. 1985, 26, 2222–2230.
  59. Barchielli, A.; Lupieri, G. A quantum analogue of Hunt’s representation theorem for the generator of convolution semigroups on Lie groups. Probab. Theory Rel. Fields 1991, 88, 167–194.
  60. Barchielli, A.; Holevo, A.S.; Lupieri, G. An analogue of Hunt’s representation theorem in quantum probability. J. Theor. Probab. 1993, 6, 231–265.
  61. Holevo, A.S. Investigations in the General Theory of Statistical Decisions. Proc. Steklov Inst. Math. 1978, 124, 1–140.
  62. Holevo, A.S. Infinitely divisible measurements in quantum probability theory. Theory Probab. Appl. 1986, 31, 493–497.
  63. Cassinelli, G.; De Vito, E.; Toigo, A. Positive operator valued measures covariant with respect to an irreducible representation. J. Math. Phys. 2003, 44, 4768–4775.
  64. Kiukas, J.; Lahti, P.; Ylinen, K. Normal covariant quantization maps. J. Math. Anal. Appl. 2006, 319, 783–801.
  65. Ohya, M.; Petz, D. Quantum Entropy and Its Use; Springer: Berlin, Germany, 1993.
  66. Billingsley, P. Probability and Measure, 2nd ed.; Wiley: New York, NY, USA, 1986.
  67. Carmeli, C.; Heinosaari, T.; Toigo, A. Sequential measurements of conjugate observables. J. Phys. A Math. Theor. 2011, 44, 285304.

© 2017 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).