1. Introduction
Uncertainty relations for position and momentum [1] have always been deeply related to the foundations of Quantum Mechanics. For several decades, their axiomatization has been of 'preparation' type: an inviolable lower bound for the widths of the position and momentum distributions, holding in any quantum state. Such uncertainty relations, now known as preparation uncertainty relations (PURs), were later extended to arbitrary sets of observables [2,3,4,5]. All PURs trace back to Robertson's celebrated formulation [6] of Heisenberg's uncertainty principle: for any two observables, represented by self-adjoint operators A and B, the product of the variances of A and B is bounded from below by the expectation value of their commutator; in formulae, $\operatorname{Var}_\rho(A)\,\operatorname{Var}_\rho(B)\geq\frac{1}{4}\left|\langle[A,B]\rangle_\rho\right|^2$, where $\operatorname{Var}_\rho(X)$ is the variance of an observable X measured in the system state $\rho$. In the case of position Q and momentum P, this inequality yields Heisenberg's relation $\operatorname{Var}_\rho(Q)\,\operatorname{Var}_\rho(P)\geq\hbar^2/4$.
About 30 years after the formulations of Heisenberg and Robertson, Hirschman attempted a first statement of position and momentum uncertainties in terms of informational quantities. This led him to a formulation of PURs based on Shannon entropy [7]; his bound was later refined [8,9] and extended to discrete observables [10]. Other entropic quantities have been employed as well [11]. We refer to [12,13] for an extensive review of entropic PURs.
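For concreteness, the entropic PUR for position and momentum can be stated as follows (a standard form of the Hirschman–Beckner–Białynicki-Birula–Mycielski bound, written here in terms of the position and momentum probability densities f and g introduced later; the constant assumes the symmetric ħ-dependent Fourier convention):

```latex
H(f) + H(g) \;\geq\; n \log(\pi e \hbar),
\qquad
H(f) \;=\; -\int_{\mathbb{R}^n} f(x)\,\log f(x)\,\mathrm{d}x .
```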
However, Heisenberg's original intent [1] was focused rather on the unavoidable disturbance that a measurement of position produces on a subsequent measurement of momentum [14,15,16,17,18,19,20,21]. In order to capture this idea, new formulations were introduced more recently, based on a 'measurement' interpretation of uncertainty, rather than on bounds for the probability distributions of the target observables. Indeed, with the modern development of the quantum theory of measurement and the introduction of positive operator valued measures and instruments [3,22,23,24,25,26], it became possible to deal with approximate measurements of incompatible observables and to formulate measurement uncertainty relations (MURs) for position and momentum, as well as for more general observables. MURs quantify the degree of approximation (or inaccuracy and disturbance) incurred by replacing the original incompatible observables with a joint approximate measurement of them. A very rich literature on this topic has flourished in the last 20 years, and various kinds of MURs have been proposed, based on distances between probability distributions, noise quantifications, conditional entropy, etc. [12,14,15,16,17,18,19,20,21,27,28,29,30,31,32].
In this paper, we develop a new information-theoretic formulation of MURs for position and momentum, based on the notion of the relative entropy (or Kullback-Leibler divergence) of two probabilities. The relative entropy $S(p\|q)$ is an informational quantity precisely tailored to quantify the amount of information that is lost by using an approximating probability q in place of the target one p. Although classical and quantum relative entropies have already been used to evaluate the performance of quantum measurements [24,27,30,33,34,35,36,37,38,39,40], their first application to MURs is very recent [41].
In [41], only MURs for discrete observables were considered. The present work is a first attempt to extend that information-theoretic approach to the continuous setting. This extension is not trivial and reveals peculiar problems that are absent in the discrete case. However, the nice properties of the relative entropy, such as its scale invariance, allow for a satisfactory formulation of entropic MURs for position and momentum as well.
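As an illustration of the central quantity, here is a minimal numerical sketch (ours, not from [41]) of the relative entropy $S(p\|q)$ for one-dimensional Gaussians, comparing the closed form with direct numerical integration; all numbers are illustrative:

```python
import numpy as np

def kl_gauss(m_p, s2_p, m_q, s2_q):
    """Closed-form relative entropy S(p||q) in nats for 1-d Gaussian densities."""
    return 0.5 * (np.log(s2_q / s2_p) + (s2_p + (m_p - m_q) ** 2) / s2_q - 1.0)

def kl_numeric(m_p, s2_p, m_q, s2_q, lo=-40.0, hi=40.0, n=400001):
    """Direct numerical integration of p(x) * log(p(x)/q(x)) on a fine grid."""
    x = np.linspace(lo, hi, n)
    p = np.exp(-((x - m_p) ** 2) / (2 * s2_p)) / np.sqrt(2 * np.pi * s2_p)
    q = np.exp(-((x - m_q) ** 2) / (2 * s2_q)) / np.sqrt(2 * np.pi * s2_q)
    mask = p > 0  # avoid 0 * log(0/0) in the far tails
    return np.sum(p[mask] * np.log(p[mask] / q[mask])) * (x[1] - x[0])

print(kl_gauss(0.0, 1.0, 1.0, 2.0))    # 0.5 * log 2 ~ 0.3466 nats
print(kl_numeric(0.0, 1.0, 1.0, 2.0))  # same value, up to discretization error
```

The loss of information is zero only when the approximating Gaussian q coincides with the target p.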
We deal with position and momentum in two possible scenarios. Firstly, we consider the case of n-dimensional position and momentum, which allows us to treat scalar particles, vector particles, and even multi-particle systems; this is the natural level of generality, and our treatment extends to it without difficulty. Then, we consider a couple made up of one position component and one momentum component along two different directions of the n-dimensional space. In this case, we can see how our theory behaves when one moves with continuity from a highly incompatible case (parallel components) to a compatible case (orthogonal ones).
The continuous case requires much care when dealing with arbitrary quantum states and approximating observables: it is difficult to evaluate, or even to bound, the relative entropy without some assumption on the probability distributions involved. In order to overcome these technicalities and focus on the quantum content of MURs, in this paper we consider only Gaussian preparation states and Gaussian measurement apparatuses [2,4,5,42,43,44,45]. Moreover, we identify the class of approximate joint measurements with the class of joint POVMs satisfying the same symmetry properties as their target position and momentum observables [3,23]. We are supported in this assumption by the fact that, in the discrete case [41], symmetry covariant measurements turn out to be the best approximations without any hypothesis (see also [17,19,20,29,32] for a similar appearance of covariance within MURs for different uncertainty measures).
We now sketch the main results of the paper. In the vector case, we consider approximate joint measurements $\mathsf{M}$ of the position Q and the momentum P. We find the following entropic MUR (Theorem 5, Remark 14): for every choice of two positive thresholds $\epsilon_1$, $\epsilon_2$, with $\epsilon_1\epsilon_2 \geq \hbar^2/4$, there exists a Gaussian state $\rho$ with position variance matrix $A^\rho \geq \epsilon_1 \mathbb{1}$ and momentum variance matrix $B^\rho \geq \epsilon_2 \mathbb{1}$ such that the total information loss $S(\rho_Q \| \mathsf{M}_1^\rho) + S(\rho_P \| \mathsf{M}_2^\rho)$ stays above a strictly positive lower bound (Equation (1)) for all Gaussian approximate joint measurements $\mathsf{M}$ of Q and P. Here $\rho_Q$ and $\rho_P$ are the distributions of position and momentum in the state $\rho$, and $\mathsf{M}^\rho$ is the distribution of $\mathsf{M}$ in the state $\rho$, with marginals $\mathsf{M}_1^\rho$ and $\mathsf{M}_2^\rho$; the two marginals turn out to be noisy versions of $\rho_Q$ and $\rho_P$. The lower bound is strictly positive and increases linearly with the dimension n. The thresholds $\epsilon_1$ and $\epsilon_2$ are peculiar to the continuous case and have a classical explanation: the relative entropy $S(p\|q)$ diverges if the variance of p vanishes faster than the variance of q, so that, given $\mathsf{M}$, it would be trivial to find a state $\rho$ enjoying (1) if arbitrarily small variances were allowed. What is relevant in our result is that the total loss of information exceeds the lower bound even if we forbid target distributions with small variances.
The MUR (1) shows that no Gaussian joint measurement can approximate both Q and P arbitrarily well. The lower bound in (1) is a consequence of the incompatibility between Q and P; indeed, it vanishes in the classical limit $\hbar \to 0$. Both the relative entropies and the lower bound in (1) are scale invariant. Moreover, for fixed $\epsilon_1$ and $\epsilon_2$, we prove the existence and uniqueness of an optimal approximate joint measurement, and we fully characterize it.
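The scale invariance can be checked directly on Gaussians: rescaling both the target and the approximating densities by the same change of units leaves the relative entropy unchanged. A sketch with assumed illustrative numbers:

```python
import numpy as np

def kl_gauss(m_p, s2_p, m_q, s2_q):
    # Relative entropy S(p||q) in nats between 1-d Gaussian densities.
    return 0.5 * (np.log(s2_q / s2_p) + (s2_p + (m_p - m_q) ** 2) / s2_q - 1.0)

lam = 7.3  # an arbitrary change of units, x -> lam * x
before = kl_gauss(0.2, 1.5, -0.4, 2.5)
after = kl_gauss(lam * 0.2, lam**2 * 1.5, lam * (-0.4), lam**2 * 2.5)
print(before, after)  # identical: S(p||q) is invariant under a common rescaling
```

The invariance is visible in the closed form: a common factor $\lambda^2$ cancels in both the variance ratio and the normalized squared mean difference.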
In the scalar case, we consider approximate joint measurements $\mathsf{M}$ of the position $Q_u$ along the direction u and the momentum $P_v$ along the direction v, where $u \cdot v = \cos\alpha$. We find two different entropic MURs. The first entropic MUR in the scalar case is similar to the vector one (Theorem 3, Remark 11). The second one (Theorem 1) holds for all Gaussian states $\rho$ and all Gaussian joint approximate measurements $\mathsf{M}$ of $Q_u$ and $P_v$. This lower bound holds for every Gaussian state $\rho$ without constraints on the position and momentum variances; it is strictly positive unless u and v are orthogonal, but it is state dependent. Again, the relative entropies and the lower bound are scale invariant.
The paper is organized as follows. In Section 2, we introduce our target position and momentum observables, discuss their general properties, and define some related quantities (spectral measures, mean vectors and variance matrices, PURs for second order quantum moments, Weyl operators, Gaussian states). Section 3 is devoted to the definitions and main properties of the relative and differential (Shannon) entropies. Section 4 reviews the entropic PURs in the continuous case [7,8,9,46], with a particular focus on their lack of scale invariance; this flaw is due to the very definition of differential entropy, and it is one of the reasons that led us to introduce MURs based on relative entropy. In Section 5 we construct the covariant observables which will be used as approximate joint measurements of the position and momentum target observables. Finally, in Section 6 the main results on MURs sketched above are presented in detail. Some conclusions are discussed in Section 7.
  2. Target Observables and States
Let us start with the usual position and momentum operators, which satisfy the canonical commutation relations:
$$[Q_i, P_j] = i\hbar\,\delta_{ij}\,\mathbb{1}, \qquad [Q_i, Q_j] = [P_i, P_j] = 0.$$
Each of the two vector operators has n components; this setting covers the case of a single particle in one or more dimensions, of several scalar or vector particles, and of the quadratures of n modes of the electromagnetic field. We assume the Hilbert space $\mathcal{H}$ to be irreducible for the algebra generated by the canonical operators Q and P. An observable of the quantum system is identified with a positive operator valued measure (POVM); in this paper, we shall consider observables with outcomes in $\mathbb{R}^k$ endowed with its Borel $\sigma$-algebra $\mathcal{B}(\mathbb{R}^k)$. The use of POVMs to represent observables in quantum theory is standard and the definition can be found in many textbooks [22,23,26,47]; the alternative name "non-orthogonal resolutions of the identity" is also used [3,4,5]. Following [5,23,26,31], a sharp observable is an observable represented by a projection valued measure (pvm); it is standard to identify a sharp observable on the outcome space $\mathbb{R}^k$ with the k self-adjoint operators corresponding to it via the spectral theorem. Two observables are jointly measurable or compatible if there exists a POVM having them as marginals. Because of the non-vanishing commutators, each couple $Q_i$, $P_i$, as well as the vectors Q, P, fails to be jointly measurable.
We denote by $\mathcal{T}(\mathcal{H})$ the trace class operators on $\mathcal{H}$, by $\mathcal{S}(\mathcal{H}) \subset \mathcal{T}(\mathcal{H})$ the subset of the statistical operators (or states, preparations), and by $\mathcal{L}(\mathcal{H})$ the space of the linear bounded operators.
  2.1. Position and Momentum
Our target observables will be either the n-dimensional position and momentum (vector case) or the position and momentum along two different directions of $\mathbb{R}^n$ (scalar case). The second case provides an example ranging with continuity from maximally incompatible observables to compatible ones.
  2.1.1. Vector Observables
As target observables we take Q and P as in (3), and we denote by $\mathrm{Q}(\cdot)$ and $\mathrm{P}(\cdot)$ their pvm's, that is, $\mathrm{Q}(E) = \mathbb{1}_E(Q)$ and $\mathrm{P}(F) = \mathbb{1}_F(P)$ for all Borel sets E, F, by the joint functional calculus of the commuting components. Then, the distributions in the state $\rho$ of a sharp position and a sharp momentum measurement (denoted by $\rho_Q$ and $\rho_P$) are absolutely continuous with respect to the Lebesgue measure; we denote by f and g their probability densities: $\rho_Q(\mathrm{d}x) = f(x)\,\mathrm{d}x$ and $\rho_P(\mathrm{d}p) = g(p)\,\mathrm{d}p$ (5). In the Dirac notation, if $|x\rangle$ and $|p\rangle$ are the improper position and momentum eigenvectors, these densities take the expressions $f(x) = \langle x|\rho|x\rangle$ and $g(p) = \langle p|\rho|p\rangle$, respectively. The mean vectors and the variance matrices of these distributions will be given in (7) and (8).
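As a numerical illustration (our sketch, with ħ = 1 and a Gaussian wavepacket), the densities f and g can be computed on a grid, obtaining g from the discrete Fourier transform of the wavefunction; the momentum variance then matches the analytic value $\hbar^2/(4\sigma^2)$:

```python
import numpy as np

hbar = 1.0
sigma = 0.8                     # position standard deviation of the wavepacket
N, L = 4096, 80.0               # grid points and box length
dx = L / N
x = (np.arange(N) - N // 2) * dx

# normalized Gaussian wavefunction; f(x) = |psi(x)|^2
psi = (2 * np.pi * sigma**2) ** (-0.25) * np.exp(-x**2 / (4 * sigma**2))
f = np.abs(psi) ** 2

# momentum density g(p) = |psi_hat(p)|^2 via FFT (grid phases drop out of |.|^2)
p = 2 * np.pi * hbar * np.fft.fftfreq(N, d=dx)
g = (dx**2 / (2 * np.pi * hbar)) * np.abs(np.fft.fft(psi)) ** 2
dp = 2 * np.pi * hbar / L

print(np.sum(f) * dx)           # ~ 1: normalization of f
print(np.sum(g) * dp)           # ~ 1: normalization of g
print(np.sum(p**2 * g) * dp)    # ~ hbar^2 / (4 sigma^2) = 0.390625
```

The product of the two variances, $\sigma^2 \cdot \hbar^2/(4\sigma^2) = \hbar^2/4$, saturates Heisenberg's relation, as expected for a minimum uncertainty wavepacket.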
  2.1.2. Scalar Observables
As target observables we take the position along a given direction u and the momentum along another given direction v:
$$Q_u = u \cdot Q, \qquad P_v = v \cdot P, \qquad u, v \in \mathbb{R}^n, \quad |u| = |v| = 1. \tag{6}$$
In this case we have $[Q_u, P_v] = i\hbar\,(u \cdot v)\,\mathbb{1}$, so that $Q_u$ and $P_v$ are not jointly measurable, unless the directions u and v are orthogonal. Their pvm's are denoted by $\mathrm{Q}_u(\cdot)$ and $\mathrm{P}_v(\cdot)$, their distributions in a state $\rho$ by $\rho_{Q_u}$ and $\rho_{P_v}$, and their corresponding probability densities by $f_u$ and $g_v$. Of course, the densities in the scalar case are marginals of the densities in the vector case. Means and variances will be given in (11).
  2.2. Quantum Moments
Let $\mathcal{S}_2(\mathcal{H}) = \{\rho \in \mathcal{S}(\mathcal{H}) : \mathrm{Tr}[\rho\,Q_i^2] < \infty,\ \mathrm{Tr}[\rho\,P_i^2] < \infty\ \forall i\}$ be the set of states for which the second moments of position and momentum are finite. Then, the mean vector and the variance matrix of the position Q in the state $\rho$ are
$$a^\rho = \mathrm{Tr}[\rho\,Q], \qquad A^\rho_{ij} = \mathrm{Tr}\big[\rho\,(Q_i - a^\rho_i)(Q_j - a^\rho_j)\big], \tag{7}$$
while for the momentum P we have
$$b^\rho = \mathrm{Tr}[\rho\,P], \qquad B^\rho_{ij} = \mathrm{Tr}\big[\rho\,(P_i - b^\rho_i)(P_j - b^\rho_j)\big]. \tag{8}$$
For $\rho \in \mathcal{S}_2(\mathcal{H})$ it is also possible to introduce the mixed 'quantum covariances'
$$C^\rho_{ij} = \mathrm{Tr}\Big[\rho\,\tfrac{1}{2}\big\{Q_i - a^\rho_i,\; P_j - b^\rho_j\big\}\Big]. \tag{9}$$
Since there is no joint measurement for the position Q and the momentum P, the quantum covariances $C^\rho_{ij}$ are not covariances of a joint distribution, and thus they do not have a classical probabilistic interpretation.
By means of the moments above, we construct the three real $n \times n$ matrices $A^\rho$, $B^\rho$, $C^\rho$, the $2n$-dimensional mean vector and the symmetric $2n \times 2n$ matrix $V^\rho$, with
$$V^\rho = \begin{pmatrix} A^\rho & C^\rho \\ (C^\rho)^{\mathsf T} & B^\rho \end{pmatrix}. \tag{10}$$
We say that $V^\rho$ is the quantum variance matrix of position and momentum in the state $\rho$. In [2], dimensionless canonical operators are considered; apart from this, our matrix $V^\rho$ corresponds to their "noise matrix in real form". The name "variance matrix" is also used [44,48].
In a similar way, we can introduce all the moments related to the position $Q_u$ and the momentum $P_v$ introduced in (6). For $\rho \in \mathcal{S}_2(\mathcal{H})$, the means and variances are, respectively,
$$\mathrm{Tr}[\rho\,Q_u] = u \cdot a^\rho, \quad \operatorname{Var}(\rho_{Q_u}) = u \cdot A^\rho u, \qquad \mathrm{Tr}[\rho\,P_v] = v \cdot b^\rho, \quad \operatorname{Var}(\rho_{P_v}) = v \cdot B^\rho v. \tag{11}$$
Similarly to (9), we also have the 'quantum covariance' $u \cdot C^\rho v$. Then, we collect the two means in a single vector and introduce the variance matrix
$$V^\rho_{u,v} = \begin{pmatrix} u \cdot A^\rho u & u \cdot C^\rho v \\ u \cdot C^\rho v & v \cdot B^\rho v \end{pmatrix}. \tag{12}$$
Proposition 1. Let V be a real symmetric $2n \times 2n$ block matrix with the same dimensions of a quantum variance matrix, and define
$$\Omega = \begin{pmatrix} 0 & \mathbb{1} \\ -\mathbb{1} & 0 \end{pmatrix}, \qquad V_\pm := V \pm \frac{i\hbar}{2}\,\Omega. \tag{13}$$
Then V is the quantum variance matrix of position and momentum in some state $\rho$ if and only if $V_\pm \geq 0$ (14); in this case, moreover, $V \geq 0$, $A > 0$, $B > 0$, and the scalar inequality (15) holds.
The inequalities (14) for $V = V^\rho$ tell us exactly when a (positive semi-definite) real matrix V is the quantum variance matrix of position and momentum in a state $\rho$. Moreover, they are the multidimensional version of the usual uncertainty principle expressed through the variances [2,3,5]; hence they represent a form of PURs. The block matrix $\Omega$ in the definition of $V_\pm$ is useful to compress formulae involving position and momentum; moreover, it makes it simpler to compare our equations with their frequent dimensionless versions (with $\hbar = 1$) in the literature [43,44].
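The test (14) is easy to run numerically (a sketch with our assumed $\Omega$ convention and illustrative numbers): build a candidate variance matrix V in block form and check positive semi-definiteness of $V + \frac{i\hbar}{2}\Omega$ through its eigenvalues; a second candidate violating the per-mode bound $a_i b_i \geq \hbar^2/4$ fails.

```python
import numpy as np

hbar, n = 1.0, 2
Omega = np.block([[np.zeros((n, n)), np.eye(n)],
                  [-np.eye(n), np.zeros((n, n))]])

def is_quantum_variance(A, B, C):
    """Check V + (i hbar / 2) Omega >= 0 for V = [[A, C], [C^T, B]]."""
    V = np.block([[A, C], [C.T, B]])
    M = V + 0.5j * hbar * Omega          # Hermitian matrix
    return bool(np.all(np.linalg.eigvalsh(M) >= -1e-12))

A = np.diag([1.0, 2.0])                  # position variances
C = np.zeros((n, n))                     # no quantum covariances
print(is_quantum_variance(A, np.diag([0.5, 0.3]), C))   # True:  a_i * b_i >= 1/4
print(is_quantum_variance(A, np.diag([0.5, 0.05]), C))  # False: 2 * 0.05 < 1/4
```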
Proof. The equivalences in (14) are well known; see e.g., [3] (Section 1.1.5), [5] (Equation (2.20)) and [2] (Theorem 2). Then $V \geq 0$. By using a real block vector with an arbitrary component in one block and a fixed vector in the other, the semi-positivity (14) implies a quadratic inequality, which in turn implies $A \geq 0$, $B \geq 0$ and (15). Then, by choosing the test vectors along the eigenvectors of A (since A is a real symmetric matrix, its eigenvectors can be taken real), one gets the strict positivity of all the eigenvalues of A; analogously, one gets $B > 0$. ☐
Inequality (15) for the directions u and v becomes the uncertainty rule à la Robertson [6] for the observables in (6) (a position component and a momentum component spanning an arbitrary angle $\alpha$):
$$\operatorname{Var}(\rho_{Q_u})\,\operatorname{Var}(\rho_{P_v}) \geq \frac{\hbar^2}{4}\,\cos^2\alpha. \tag{16}$$
Inequality (16) is equivalent to the matrix inequality (17) for the scalar variance matrix (12). Since the matrices $V_\pm$ are block matrices, their positive semi-definiteness can be studied by means of the Schur complements [49,50,51]. However, as $V_\pm$ are complex block matrices with a very peculiar structure, special results hold for them. Before summarizing the properties of $V_\pm$ in the next proposition, we need a simple auxiliary algebraic lemma.
Lemma 1. Let A and B be complex self-adjoint matrices such that $A \geq B \geq 0$. Then $\det A \geq \det B$, and the equality $\det A = \det B > 0$ holds iff $A = B$.
Proof. Let $\{\alpha_i\}$ and $\{\beta_i\}$ be the ordered decreasing sequences of the eigenvalues of A and B, respectively. Then, by Weyl's inequality, $A \geq B$ implies $\alpha_i \geq \beta_i$ for every i [52] (Section III.2). This gives the first statement. Moreover, if $\det A = \det B > 0$, we get $\alpha_i = \beta_i > 0$ for every i. Then $A = B$, because $A - B \geq 0$ and $\mathrm{Tr}[A - B] = \sum_i (\alpha_i - \beta_i) = 0$. ☐
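A quick numerical illustration of the lemma (ours, with randomly generated matrices): adding any positive semi-definite increment to a positive definite B can only increase the determinant, and the ordered eigenvalues dominate pairwise, as in Weyl's inequality.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 4
X = rng.standard_normal((d, d))
B = X @ X.T + 0.1 * np.eye(d)   # positive definite B
Y = rng.standard_normal((d, d))
A = B + Y @ Y.T                 # A >= B in the matrix (Loewner) order

# Weyl's inequality: each ordered eigenvalue of A dominates that of B
print(np.all(np.linalg.eigvalsh(A) >= np.linalg.eigvalsh(B) - 1e-10))
# hence det A >= det B, as Lemma 1 asserts
print(np.linalg.det(A) >= np.linalg.det(B))
```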
Proposition 2. Let V be a real symmetric $2n \times 2n$ matrix with the same dimensions of a quantum variance matrix. Then $V_+ \geq 0$ (or, equivalently, $V_- \geq 0$) if and only if $A > 0$ and the Schur-complement inequality (18) holds. Moreover, we have the properties (19)–(22) for the various determinants. By interchanging A with B and C with $C^{\mathsf T}$ in (18)–(22), equivalent results are obtained.
Proof. Since we already know that (14) implies the invertibility of A, the equivalence between (14) and (18) follows from [49] (Theorem 1.12, p. 34); see also [50] (Theorem 11.6) or [51] (Lemma 3.2).
In (19), the first inequality follows by summing up the two inequalities in (18); the last two are immediate by positivity.
The equality in (20) is Schur's formula for the determinant of block matrices [49] (Theorem 1.1, p. 19). Then, the first inequality is immediate by the lemma above, and the second one follows from (19).
The equality of determinants in (21) involves two determinants evaluated on ordered positive matrices by (19); hence they coincide if and only if the respective arguments are equal (Lemma 1); this shows the equivalence in (21). Then, by (18), the relevant self-adjoint matrix is both positive semi-definite and negative semi-definite; hence it is null.
Finally, one direction of (22) is trivial. Conversely, the equality of determinants in (22) implies, by (20) and (19) together with Lemma 1, the corresponding equality of matrices. ☐
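The Schur-complement machinery used above can be checked numerically (a generic real sketch, not the special $\hbar$-dependent structure): for a block matrix V with invertible block A, positivity of V is equivalent to $A > 0$ together with positivity of $B - C^{\mathsf T} A^{-1} C$, and $\det V = \det A \cdot \det(B - C^{\mathsf T} A^{-1} C)$.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 3
Z = rng.standard_normal((2 * n, 2 * n))
V = Z @ Z.T                          # generic positive definite block matrix
A, C, B = V[:n, :n], V[:n, n:], V[n:, n:]

S = B - C.T @ np.linalg.inv(A) @ C   # Schur complement of A in V
ok_schur = np.all(np.linalg.eigvalsh(A) > 0) and np.all(np.linalg.eigvalsh(S) > -1e-10)
ok_direct = np.all(np.linalg.eigvalsh(V) > -1e-10)
print(ok_schur == ok_direct)         # the two positivity tests agree

# Schur's determinant formula, used in the proof of (20)
print(np.isclose(np.linalg.det(V), np.linalg.det(A) * np.linalg.det(S)))
```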
By (18) and (19), every time three matrices A, B, C define the quantum variance matrix of a state $\rho$, the same holds for the triple obtained by dropping the off-diagonal block C. This fact can be used to characterize when two positive matrices A and B are the diagonal blocks of some quantum variance matrix, or when two positive numbers $\alpha$ and $\beta$ are the position and momentum variances of a quantum state along the two directions u and v.
Proposition 3. Two real matrices A and B, having the dimension of the square of a length and of a momentum, respectively, are the diagonal blocks of a quantum variance matrix if and only if they are strictly positive and satisfy $A \geq \frac{\hbar^2}{4}\,B^{-1}$. Two real numbers $\alpha > 0$ and $\beta > 0$, having the dimension of the square of a length and of a momentum, respectively, are such that $\alpha = \operatorname{Var}(\rho_{Q_u})$ and $\beta = \operatorname{Var}(\rho_{P_v})$ for some state $\rho$ if and only if $\alpha\beta \geq \frac{\hbar^2}{4}\,(u \cdot v)^2$.
Proof. For A and B, the necessity follows from (19); the sufficiency comes from (18) by choosing $C = 0$. For $\alpha$ and $\beta$, the necessity follows from (15); the sufficiency comes from (18) with $C = 0$ and suitable choices of A and B. In the degenerate cases, we choose A and B in such a way that the uncertainty bound is saturated when restricted to the linear span of u and v. ☐
   2.3. Weyl Operators and Gaussian States
In the following, we shall introduce Gaussian states, Gaussian observables and covariant observables on the phase-space. In all these instances, the Weyl operators are involved; here we recall their definition and some properties (see e.g., [4] (Section 5.2) or [5] (Section 12.2), where, however, the definition differs from ours in that the Weyl operators are composed with the map of (13)).
Definition 1. The Weyl operators are the unitary operators defined by (23).
The Weyl operators (23) satisfy the composition rule (24), which in particular yields the commutation relation between any two of them. These commutation relations imply the translation property (25); due to this property, the Weyl operators are also known as displacement operators.
With a slight abuse of notation, we shall sometimes use the identification $W(x, p) \equiv W(z)$ (26), where $z = \binom{x}{p}$ is a block column vector belonging to the phase-space $\mathbb{R}^{2n}$; here, the first block x is a position and the second block p is a momentum.
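A single-mode sketch of the Weyl/displacement operators in a truncated Fock basis (ours; the truncation level N is arbitrary and is not part of the formalism above). The exponent stays skew-Hermitian after truncation, so the computed matrix remains exactly unitary, and displacing the vacuum gives approximately a coherent state:

```python
import numpy as np

N = 40                                    # Fock-space truncation
a = np.diag(np.sqrt(np.arange(1, N)), 1)  # annihilation operator
alpha = 0.7 + 0.3j

# D = exp(alpha a^dag - conj(alpha) a), computed via its Hermitian generator
H = 1j * (alpha * a.conj().T - np.conj(alpha) * a)   # Hermitian matrix
w, U = np.linalg.eigh(H)
D = U @ np.diag(np.exp(-1j * w)) @ U.conj().T        # D = exp(-i H)

print(np.allclose(D.conj().T @ D, np.eye(N)))        # unitarity survives truncation
vac = np.zeros(N, dtype=complex); vac[0] = 1.0
psi = D @ vac                                        # approximately |alpha>
print(np.vdot(psi, a @ psi))                         # ~ alpha = 0.7 + 0.3j
```

The last line is the displacement property in action: the displaced vacuum has mean annihilation amplitude $\alpha$.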
By means of the Weyl operators, it is possible to define the characteristic function of any trace-class operator.
Definition 2. For any operator $\rho \in \mathcal{T}(\mathcal{H})$, its characteristic function is the complex valued function defined by (27). Note that the first block of the argument w is the inverse of a length and the second block is the inverse of a momentum, so that w is a block vector living in the space $\mathbb{R}^{2n}$ regarded as the dual of the phase-space.
Instead of the characteristic function, sometimes the so-called Weyl transform is introduced [4,44].
By [4] (Proposition 5.3.2, Theorem 5.3.3), the characteristic function is square-integrable and the trace formula (28) holds for all trace-class operators. As a corollary [4] (Corollary 5.3.4), a state $\rho$ is pure if and only if the squared norm of its characteristic function attains its maximal value. By [53] (Lemma 3.1) or [26] (Proposition 8.5.(e)), the trace formula also implies the identity (29). Moreover, an inversion formula ensures that the characteristic function completely characterizes the state $\rho$ [4] (Corollary 5.3.5); the integrals appearing in it are defined in the weak operator topology.
Finally, for $\rho \in \mathcal{S}_2(\mathcal{H})$, the moments (7)–(10) can be expressed in terms of the characteristic function as in [4] (Section 5.4) (Equation (30)).
Definition 3 ([2,3,4,5,44,48]). A state $\rho$ is Gaussian if its characteristic function has the form (31), for a mean vector and a real $2n \times 2n$ variance matrix V such that $V \pm \frac{i\hbar}{2}\Omega \geq 0$.
This condition on V is necessary and sufficient in order that the function (31) defines the characteristic function of a quantum state [4] (Theorem 5.5.1), [5] (Theorem 12.17). Therefore, Gaussian states are exactly the states whose characteristic function is the exponential of a second order polynomial [4] (Equation (5.5.49)), [5] (Equation (12.80)).
We shall denote by $\mathcal{G}$ the set of the Gaussian states; we have $\mathcal{G} \subset \mathcal{S}_2(\mathcal{H})$. By (30), the vectors and the matrices characterizing a Gaussian state $\rho$ are just its first and second order quantum moments introduced in (7)–(9). By (31), the corresponding distributions of position and momentum are Gaussian, namely $\rho_Q = \mathcal{N}(a^\rho; A^\rho)$ and $\rho_P = \mathcal{N}(b^\rho; B^\rho)$ (32).
Proposition 4 (Pure Gaussian states). For $\rho \in \mathcal{G}$, we have $\det V^\rho = (\hbar/2)^{2n}$ if and only if $\rho$ is pure.
Proof. The trace formula (28) and (31) give $\mathrm{Tr}[\rho^2] = (\hbar/2)^n / \sqrt{\det V^\rho}$, and this implies the statement. ☐
Proposition 5 (Minimum uncertainty states). For $\rho \in \mathcal{G}$, equality in the uncertainty bound holds if and only if $\rho$ is a pure Gaussian state and it factorizes into a product of minimum uncertainty states, up to a rotation of $\mathbb{R}^n$.
Proof. If equality holds, then the equivalence (22) gives $B^\rho = \frac{\hbar^2}{4}(A^\rho)^{-1}$, so that the variance matrices $A^\rho$ and $B^\rho$ have a common eigenbasis. Thus, all the corresponding couples of position and momentum components have minimum uncertainties. Therefore, if we consider the factorization of the Hilbert space $\mathcal{H}$ corresponding to this basis, all the partial traces of the state $\rho$ on each factor are minimum uncertainty states. Since for n = 1 the minimum uncertainty states are pure and Gaussian, the state $\rho$ is a pure product Gaussian state.
The converse is immediate. ☐
   5. Approximate Joint Measurements of Position and Momentum
In order to deal with MURs for position and momentum observables, we have to introduce the class of approximate joint measurements of position and momentum, whose marginals we will compare with the respective sharp observables. As done in [3,4,18,57], it is natural to characterize such a class by requiring suitable covariance properties under the group of space translations and velocity boosts: namely, by approximate joint measurement of position and momentum we will mean any POVM on the product space of the position and momentum outcomes sharing the same covariance properties as the two target sharp observables. As we have already discussed, two approximation problems will concern us: the approximation of the position and momentum vectors (vector case, with outcomes in the phase-space $\mathbb{R}^{2n}$), and the approximation of one position and one momentum component along two arbitrary directions (scalar case, with outcomes in $\mathbb{R}^2$). In order to treat the two cases together, we consider POVMs with outcomes in $\mathbb{R}^{2m}$, which we call bi-observables; they correspond to a measurement of m position components and m momentum components. The specific covariance requirements will be given in Definitions 5–7.
In studying the properties of probability measures on $\mathbb{R}^{2m}$, a very useful notion is the characteristic function, that is, the Fourier cotransform of the measure at hand; the analogous quantity for POVMs turns out to have the same relevance. Different names have been used in the literature for the characteristic function of POVMs or, more generally, of quantum instruments, such as characteristic operator or operator characteristic function [3,24,34,44,58,59,60,61,62]. As a variant, the symplectic Fourier transform also appears quite often [5] (Section 12.4.3). The characteristic function has been used, for instance, to study the quantum analogues of the infinitely divisible distributions [3,34,58,59,60,62] and measurements of Gaussian type [5,44,61]. Here, we are interested only in the latter application, as our approximating bi-observables will typically be Gaussian. Since we deal with bi-observables, we limit our definition of the characteristic function to POVMs on $\mathbb{R}^{2m}$, which have the same number of variables of position and momentum type.
Being measures, POVMs can be used to construct integrals, whose theory is presented e.g., in [26] (Section 4.8) and [4] (Section 2.9, Proposition 2.9.1).
Definition 4. Given a bi-observable, its characteristic function is the operator valued function defined by (41).
In this definition, the dimensions of the two vector variables are the inverses of a length and of a momentum, respectively, as in the definition of the characteristic function of a state (27). This definition is given so that, on every state, the expectation of the characteristic function is the usual characteristic function of the outcome probability distribution on $\mathbb{R}^{2m}$.
  5.1. Covariant Vector Observables
In terms of the pvm's (4), the translation property (25) is equivalent to the symmetry properties of the position and momentum pvm's under conjugation with the Weyl operators, which translate the respective outcome sets by the corresponding displacement. These symmetry properties are taken as the transformation property defining the following class of POVMs on $\mathbb{R}^{2n}$ [23,26,44,53,57].
Definition 5. A covariant phase-space observable is a bi-observable on $\mathbb{R}^{2n}$ satisfying the analogous covariance relation under conjugation with the Weyl operators. We denote by $\mathcal{C}$ the set of all the covariant phase-space observables.
The interpretation of covariant phase-space observables as approximate joint measurements of position and momentum is based on the fact that their marginal POVMs (42) have the same symmetry properties as Q and P, respectively. Although Q and P are not jointly measurable, the following well-known result says that there are plenty of covariant phase-space observables [4] (Theorem 4.8.3), [63,64]. In (43) below, we use the parity operator $\Pi$ on $\mathcal{H}$, which is such that $\Pi Q \Pi = -Q$ and $\Pi P \Pi = -P$.
Proposition 8. The covariant phase-space observables are in one-to-one correspondence with the states, so that the two sets can be identified; such a correspondence $\sigma \mapsto \mathsf{M}^\sigma$ is given by (43).
The characteristic function (41) of a measurement $\mathsf{M}^\sigma$ has a very simple structure in terms of the characteristic function (27) of the corresponding state $\sigma$.
Proposition 9. The characteristic function of $\mathsf{M}^\sigma$ is given by (44), and the characteristic function of the outcome probability in a state $\rho$ is given by (45).
In (44) we have used the identification (26). The characteristic function of a state was introduced in (27).
Proof. By the commutation relations (24) we obtain the transformation of the Weyl operators under conjugation. Then we compute the required trace, where we use the formula (29). By (42) and the definition (27), we get (44). Again by (27), we get (45). ☐
In terms of probability densities, measuring $\mathsf{M}^\sigma$ on the state $\rho$ yields a density function on the phase-space. Then, by (45), the densities of the two marginals are the convolutions (46), where f and g are the sharp densities introduced in (5). By the arbitrariness of the state $\rho$, the marginal POVMs of $\mathsf{M}^\sigma$ turn out to be convolutions (or 'smearings') of the sharp position and momentum observables (see e.g., [23] (Section III, Equations (2.48) and (2.49))).
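A numerical sketch of the smearing (46), with assumed Gaussian densities: the marginal density is the convolution of the sharp density with the apparatus noise density, so the variances add.

```python
import numpy as np

x = np.linspace(-30.0, 30.0, 6001)
dx = x[1] - x[0]

def gauss(x, s2):
    return np.exp(-x**2 / (2 * s2)) / np.sqrt(2 * np.pi * s2)

f = gauss(x, 1.0)        # sharp position density in the state rho
noise = gauss(x, 0.5)    # noise density contributed by the apparatus

# marginal density as in (46): convolution of the sharp density with the noise
smeared = np.convolve(f, noise, mode="same") * dx
print(np.sum(smeared) * dx)         # ~ 1: still a probability density
print(np.sum(x**2 * smeared) * dx)  # ~ 1.0 + 0.5: variances add under convolution
```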
Let us remark that the distribution of the approximate position observable in a state $\rho$ is the distribution of the sum of two independent random vectors: the first one is distributed as the sharp position Q in the state $\rho$, the second one is distributed as the sharp position in the apparatus state. In this sense, the approximate position looks like a sharp position plus an independent noise given by the apparatus. Of course, a similar fact holds for the momentum. However, this statement about the distributions cannot be extended to a statement involving the observables: since Q and P are incompatible, nobody can jointly observe them together with $\mathsf{M}^\sigma$, so that the convolutions (46) do not correspond to sums of random vectors that actually exist when measuring $\mathsf{M}^\sigma$.
  5.2. Covariant Scalar Observables
Now we focus on the class of approximate joint measurements of the observables $Q_u$ and $P_v$, representing position and momentum along two possibly different directions u and v (see Section 2.1.2). As in the case of covariant phase-space observables, this class is defined in terms of the symmetries of its elements: we require them to transform as if they were joint measurements of $Q_u$ and $P_v$. Recall that $\mathrm{Q}_u(\cdot)$ and $\mathrm{P}_v(\cdot)$ denote the spectral measures of $Q_u$ and $P_v$.
Due to the commutation relation (24), covariance relations hold for the spectral measures $\mathrm{Q}_u$ and $\mathrm{P}_v$ under conjugation with the Weyl operators. We employ this covariance to define our class of approximate joint measurements of $Q_u$ and $P_v$.
Definition 6. A $(u, v)$-covariant bi-observable is a POVM on $\mathbb{R}^2$ satisfying the analogous covariance relation. We denote by $\mathcal{C}_{u,v}$ the class of such bi-observables.
So, our approximate joint measurements of $Q_u$ and $P_v$ will be all the bi-observables in the class $\mathcal{C}_{u,v}$.
Example 1. The marginal of a covariant phase-space observable along the directions u and v is a $(u, v)$-covariant bi-observable. Actually, it can be proved that, if u and v are not parallel, all $(u, v)$-covariant bi-observables can be obtained in this way.
It is useful to work with a little more generality and merge Definitions 5 and 6 into a single notion of covariance.
Definition 7. Suppose J is a $2m \times 2n$ real matrix. A POVM on $\mathbb{R}^{2m}$ is a J-covariant observable if the covariance relation of Definition 5 holds with the phase-space translations acting through J.
Thus, approximate joint observables of $Q_u$ and $P_v$ are just J-covariant observables on $\mathbb{R}^2$ for the choice of the $2 \times 2n$ matrix given in (47). On the other hand, covariant phase-space observables constitute the class of $\mathbb{1}$-covariant observables on $\mathbb{R}^{2n}$, where $\mathbb{1}$ is the identity map of $\mathbb{R}^{2n}$.
  5.3. Gaussian Measurements
When dealing with Gaussian states, the following class of bi-observables quite naturally arises.
Definition 8. A POVM on $\mathbb{R}^{2m}$ is a Gaussian bi-observable if its characteristic function has the Gaussian form (48), for two vectors, a real matrix and a real symmetric matrix satisfying the condition (49). This triple is the set of the parameters of the Gaussian observable.
In this definition, the first vector has the dimension of a length and the second one of a momentum; similarly, the matrices decompose into blocks of different dimensions. The condition (49) is necessary and sufficient in order that the function (48) defines the characteristic function of a POVM.
For unbiased Gaussian measurements, i.e., Gaussian bi-observables with vanishing mean parameters, the previous definition coincides with the one of [5] (Section 12.4.3). It is also a particular case of the more general definition of Gaussian observables on arbitrary (not necessarily symplectic) linear spaces given in [43,44]. We refer to [5,44] for the proof that Equation (48) is actually the characteristic function of a POVM.
Measuring the Gaussian observable on a Gaussian state yields a probability distribution whose characteristic function is given by (50); hence the output distribution is Gaussian (51).
  5.3.1. Covariant Gaussian Observables
For Gaussian bi-observables, J-covariance has a very easy characterization.
Proposition 10. Suppose a Gaussian bi-observable on $\mathbb{R}^{2m}$ is given, with its triple of parameters, and let J be any $2m \times 2n$ real matrix. Then, the POVM is a J-covariant observable if and only if its matrix parameter coincides with J.
Proof. For each phase-space translation, we compare the original POVM and its translated version. By the commutation relations (24) for the Weyl operators, we immediately get the characteristic function of the translated POVM, and we also compute the characteristic function required by J-covariance. Since this comparison must hold for all translations, the two expressions coincide if and only if the matrix parameter equals J. ☐
   Vector Observables
Let us point out the structure of the Gaussian approximate joint measurements of Q and P.
Proposition 11. A covariant bi-observable $\mathsf{M}^\sigma$ is Gaussian if and only if the state $\sigma$ is Gaussian; in this case, $\mathsf{M}^\sigma$ is a Gaussian observable with parameters determined by the first and second moments of $\sigma$.
Proof. By comparing (31), (44) and (48), and using the fact that a characteristic function is of Gaussian type if and only if it is the exponential of a second order polynomial with an admissible variance matrix, we get the first statement. Then, for Gaussian $\sigma$, we see immediately that $\mathsf{M}^\sigma$ is a Gaussian observable with the above parameters. ☐
We call $\mathcal{C}^{\mathrm{G}}$ the class of the Gaussian covariant phase-space observables. By (50), observing $\mathsf{M}^\sigma$ on a Gaussian state $\rho$ yields a normal probability distribution, whose marginals are Gaussian smearings of the sharp position and momentum distributions. When the mean position and momentum of $\sigma$ vanish, we have an unbiased measurement.
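At the level of outcome statistics, the unbiased Gaussian case behaves like 'signal plus independent Gaussian noise': the covariance of the output is the sum of the state and apparatus contributions. A Monte Carlo sketch under that distributional assumption (all numbers illustrative; this mimics only the statistics, not the observables, in line with the caveat above):

```python
import numpy as np

rng = np.random.default_rng(42)
n_samples = 200_000

V_state = np.array([[1.0, 0.2],
                    [0.2, 0.8]])   # assumed covariance of the sharp (q, p) pair
V_noise = np.array([[0.6, 0.0],
                    [0.0, 0.9]])   # assumed covariance of the apparatus noise

sharp = rng.multivariate_normal([0.0, 0.0], V_state, size=n_samples)
noise = rng.multivariate_normal([0.0, 0.0], V_noise, size=n_samples)
outcome = sharp + noise            # unbiased Gaussian measurement, in distribution

V_out = np.cov(outcome.T)
print(np.allclose(V_out, V_state + V_noise, atol=0.05))  # covariances add
```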
  Scalar Observables
We now study the Gaussian approximate joint measurements of the target observables 
 and 
 defined in (
6).
Proposition 12. A Gaussian bi-observable  with parameters  is in  if and only if , where J is given by (47). In this case, the condition (49) is equivalent to  Proof.  The first statement follows from Proposition 10. Then, the matrix inequality (
49) reads
            
            which is equivalent to (
52). ☐
 We write 
 for the class of the Gaussian 
-covariant phase-space observables. An observable 
 is thus characterized by the pair 
. From (
50) with 
 given by (
47), we get that measuring 
 on a Gaussian state 
 yields the probability distribution 
 with 
 and 
 given by (
12). Its marginals with respect to the first and second entry are, respectively,
          
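Since all the distributions involved here are normal, the relative entropies that quantify the measurement error have a standard closed form in the one-dimensional case, S(N(m₁,v₁)‖N(m₂,v₂)) = ½[ln(v₂/v₁) + (v₁ + (m₁−m₂)²)/v₂ − 1]. The following sketch (with illustrative numbers) shows this formula and its invariance under a common rescaling of both variances, in line with the scale invariance of the entropic bounds.

```python
import math

def kl_gauss(m1, v1, m2, v2):
    """Relative entropy S(N(m1, v1) || N(m2, v2)) in nats (standard closed form)."""
    return 0.5 * (math.log(v2 / v1) + (v1 + (m1 - m2) ** 2) / v2 - 1.0)

# For an unbiased but noisier marginal (same mean, larger variance) the
# penalty depends only on the variance ratio, hence it is scale invariant:
print(kl_gauss(0.0, 2.0, 0.0, 2.5))
print(kl_gauss(0.0, 6.0, 0.0, 7.5))   # same value: both variances rescaled by 3
```

This is why the error functions built on these Gaussian marginals do not change under a simultaneous rescaling of the target and output distributions.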
Example 2. Let us construct an example of an approximate joint measurement of  and , by using a noisy measurement of position along  followed by a sharp measurement of momentum along . Let Δ 
be a positive real number yielding the precision of the position measurement, and consider the POVM  on  given by
The characteristic function of  is
Therefore,  is a Gaussian bi-observable with parameters ,  and , where J is given by (47) and ,  and . This implies ; in particular, the set  is non-empty. Moreover, the lower bound  is attained, cf. (52).
Example 3. Let us consider the case ; now the target observables  and  are compatible, and we can define a pvm  on  by setting  for all . Its characteristic function is
Then,  with parameters , ,  and  given by (47). Note that  can be regarded as the limit case of the observables of the previous example when  and .
7. Conclusions
We have extended the relative entropy formulation of MURs given in [
41] from the case of discrete incompatible observables to a particular instance of continuous target observables, namely the position and momentum vectors, or two of their components along two possibly non-parallel directions. The entropic MURs we found have the nice property of being scale invariant and well behaved in the classical and macroscopic limits. Moreover, in the scalar case, when the angle spanned by the position and momentum components goes to 
, the entropic bound correctly reflects their increasing compatibility by continuously approaching zero.
Although our results are limited to the case of Gaussian preparation states and covariant Gaussian approximate joint measurements, we conjecture that the bounds we found still hold for arbitrary states and general (not necessarily covariant or Gaussian) bi-observables. Let us see in some more detail how this should work in the case when the target observables are the vectors  and .
The most general procedure should be to consider the error function 
 for an arbitrary POVM 
 on 
 and any state 
. First of all, we need states for which neither the position nor the momentum dispersion is too small; the obvious generalization of the test states (
81) is
      
Then, the most general definitions of the entropic divergence and incompatibility degree are: 
 It may happen that 
 is not absolutely continuous with respect to 
, or 
 with respect to 
; in this case, the error function and the entropic divergence take the value 
 by definition. So, we can restrict to bi-observables that are (weakly) absolutely continuous with respect to the Lebesgue measure. However, the true difficulty is that, even with this assumption, we are not able to estimate (
94), hence (
95). It could be that the symmetrization techniques used in [
17,
19] can be extended to the present setting, and one can reduce the evaluation of the entropic incompatibility index to optimizing over all covariant bi-observables. Indeed, in the present paper we a priori selected only covariant approximating measurements; we would like to understand whether, among all approximating measurements, the relative entropy approach selects covariant bi-observables by itself. However, even if 
 is covariant, there remains the problem that we do not know how to evaluate (
94) if 
 and 
 are not Gaussian. It is reasonable to expect that some continuity and convexity arguments should apply, and the bounds in Theorem 5 might be extended to the general case by taking dense convex combinations. The techniques used for the PURs in [
8,
9] could also help to extend what we did for Gaussian states to arbitrary states. This leads us to conjecture:
Conjecture (
96) is also supported by the uniqueness of the optimal approximating bi-observable in Theorem 5(i), which is reminiscent of what happens in the discrete case of two Fourier-conjugated mutually unbiased bases (MUBs); indeed, in the latter case, the optimal bi-observable is actually unique among all bi-observables, not only the covariant ones (see [
41] (Theorem 5)).
Similar considerations obviously apply to the case of scalar target observables as well. We leave a deeper investigation of equality (
96) to future work.
As a final consideration, one could be interested in finding error/disturbance bounds involving sequential measurements of position and momentum, rather than considering all their possible approximate joint measurements. As sequential measurements form a proper subset of all bi-observables, optimizing only over them should lead to bounds that are greater than 
. This is why in [
41] an error/disturbance entropic bound, denoted by 
 and distinct from 
, was introduced. However, it was also proved that the equality 
 holds when one of the target observables is discrete and sharp. Now, in the present paper, only sharp target observables are involved; although the argument of [
41] cannot be extended to the continuous setting, the optimal approximating joint observables we found in Theorems 3(i) and 5(i) 
are actually sequential measurements. Indeed, the optimal bi-observable in Theorem 3(i) is one of the POVMs described in Examples 2 and 3 (see (
74)); all these bi-observables have a (trivial) sequential implementation in terms of an unsharp measurement of 
 followed by sharp 
. On the other hand, in the vector case, it was shown in ([
67], Corollary 1) that all covariant phase-space observables can be obtained as a sequential measurement of an unsharp version of the position 
 followed by the sharp measurement of the momentum 
. Therefore, 
 also for target position and momentum observables, in both the scalar and vector cases.
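The sequential picture of an unsharp position measurement followed by a sharp momentum measurement can be illustrated by a toy Gaussian model (our assumption, with ħ = 1, not a formula quoted from the paper): a resolution Δ adds Δ² to the position output variance and at least 1/(4Δ²) of back-action noise to the momentum output variance, so the product of the two output variances never drops below 1, four times the preparation bound 1/4.

```python
import numpy as np

def output_variances(var_q, var_p, delta):
    """Toy Gaussian model of 'unsharp position, then sharp momentum' (hbar = 1):
    the position output gains the resolution delta**2, the momentum output
    gains the minimal back-action 1/(4*delta**2). Illustrative assumption,
    not a formula from the text."""
    return var_q + delta ** 2, var_p + 1.0 / (4.0 * delta ** 2)

# Minimum-uncertainty Gaussian input: var_q * var_p = 1/4
var_q = 0.8
var_p = 0.25 / var_q

# Scan the measurement resolution and look at the product of output variances
products = [np.prod(output_variances(var_q, var_p, d))
            for d in np.linspace(0.2, 3.0, 200)]
print(min(products))   # stays >= 1.0 = (2 * 1/2)**2, minimized near delta**2 = var_q
```

The minimum is attained when the resolution matches the intrinsic position spread, which is the same trade-off exhibited by the optimal covariant observables above.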