Dipartimento di Matematica, Politecnico di Milano, Piazza Leonardo da Vinci 32, I-20133 Milano, Italy
Istituto Nazionale di Alta Matematica (INDAM-GNAMPA), 00185 Roma, Italy
Istituto Nazionale di Fisica Nucleare (INFN), Sezione di Milano, 20133 Milano, Italy
Author to whom correspondence should be addressed.
Received: 26 May 2017 / Accepted: 21 June 2017 / Published: 24 June 2017
Heisenberg’s uncertainty principle has recently led to general measurement uncertainty relations for quantum systems: incompatible observables can be measured jointly or in sequence only with some unavoidable approximation, which can be quantified in various ways. The relative entropy is the natural theoretical quantifier of the information loss when a ‘true’ probability distribution is replaced by an approximating one. In this paper, we provide a lower bound for the amount of information that is lost by replacing the distributions of the sharp position and momentum observables, as they could be obtained with two separate experiments, by the marginals of any smeared joint measurement. The bound is obtained by introducing an entropic error function, and optimizing it over a suitable class of covariant approximate joint measurements. We fully exploit two cases of target observables: (1) n-dimensional position and momentum vectors; (2) two components of position and momentum along different directions. In (1), we connect the quantum bound to the dimension n; in (2), going from parallel to orthogonal directions, we show the transition from highly incompatible observables to compatible ones. For simplicity, we develop the theory only for Gaussian states and measurements.
Uncertainty relations for position and momentum have always been deeply related to the foundations of Quantum Mechanics. For several decades, their axiomatization has been of ‘preparation’ type: an inviolable lower bound for the widths of the position and momentum distributions, holding in any quantum state. Such uncertainty relations, which are now known as preparation uncertainty relations (PURs), were later extended to arbitrary sets of observables [2,3,4,5]. All PURs trace back to Robertson’s celebrated formulation of Heisenberg’s uncertainty principle: for any two observables, represented by self-adjoint operators A and B, the product of the variances of A and B is bounded from below in terms of the expectation value of their commutator; in formulae, $\operatorname{Var}_\rho(A)\,\operatorname{Var}_\rho(B) \ge \frac{1}{4}\,\bigl|\langle [A,B] \rangle_\rho\bigr|^2$, where $\operatorname{Var}_\rho$ is the variance of an observable measured in any system state $\rho$. In the case of position Q and momentum P, this inequality gives Heisenberg’s relation $\operatorname{Var}_\rho(Q)\,\operatorname{Var}_\rho(P) \ge \hbar^2/4$. About 30 years after Heisenberg and Robertson’s formulation, Hirschman attempted a first statement of position and momentum uncertainties in terms of informational quantities. This led him to a formulation of PURs based on Shannon entropy; his bound was later refined [8,9], and extended to discrete observables. Other entropic quantities have also been used. We refer to [12,13] for an extensive review on entropic PURs.
However, Heisenberg’s original intent was more focused on the unavoidable disturbance that a measurement of position produces on a subsequent measurement of momentum [14,15,16,17,18,19,20,21]. To give a better understanding of this idea, new formulations were introduced more recently, based on a ‘measurement’ interpretation of uncertainty, rather than on bounds for the probability distributions of the target observables. Indeed, with the modern development of the quantum theory of measurement and the introduction of positive operator valued measures and instruments [3,22,23,24,25,26], it became possible to deal with approximate measurements of incompatible observables and to formulate measurement uncertainty relations (MURs) for position and momentum, as well as for more general observables. The MURs quantify the degree of approximation (or inaccuracy and disturbance) made by replacing the original incompatible observables with a joint approximate measurement of them. A very rich literature on this topic has flourished in the last 20 years, and various kinds of MURs have been proposed, based on distances between probability distributions, noise quantifications, conditional entropy, etc. [12,14,15,16,17,18,19,20,21,27,28,29,30,31,32].
In this paper, we develop a new information-theoretical formulation of MURs for position and momentum, using the notion of the relative entropy (or Kullback-Leibler divergence) of two probabilities. The relative entropy is an informational quantity which is precisely tailored to quantify the amount of information that is lost by using an approximating probability q in place of the target one p. Although classical and quantum relative entropies have already been used in the evaluation of the performances of quantum measurements [24,27,30,33,34,35,36,37,38,39,40], their first application to MURs is very recent.
In that first application, only MURs for discrete observables were considered. The present work is a first attempt to extend that information-theoretical approach to the continuous setting. This extension is not trivial and reveals peculiar problems that are not present in the discrete case. However, the nice properties of the relative entropy, such as its scale invariance, allow for a satisfactory formulation of the entropic MURs also for position and momentum.
We deal with position and momentum in two possible scenarios. Firstly, we consider the case of n-dimensional position and momentum, since it allows us to treat scalar particles, vector ones, or even multi-particle systems. This is the natural level of generality, and our treatment extends without difficulty to it. Then, we consider a pair made up of one position and one momentum component along two different directions of the n-space. In this case, we can see how our theory behaves when one moves with continuity from a highly incompatible case (parallel components) to a compatible case (orthogonal ones).
The continuous case needs much care when dealing with arbitrary quantum states and approximating observables. Indeed, it is difficult to evaluate or even bound the relative entropy if some assumption is not made on probability distributions. In order to overcome these technicalities and focus on the quantum content of MURs, in this paper we consider only the case of Gaussian preparation states and Gaussian measurement apparatuses [2,4,5,42,43,44,45]. Moreover, we identify the class of the approximate joint measurements with the class of the joint POVMs satisfying the same symmetry properties as their target position and momentum observables [3,23]. We are supported in this assumption by the fact that, in the discrete case, symmetry covariant measurements turn out to be the best approximations without any hypothesis (see also [17,19,20,29,32] for a similar appearance of covariance within MURs for different uncertainty measures).
We now sketch the main results of the paper. In the vector case, we consider approximate joint measurements of the position and the momentum . We find the following entropic MUR (Theorem 5, Remark 14): for every choice of two positive thresholds , with , there exists a Gaussian state with position variance matrix and momentum variance matrix such that
for all Gaussian approximate joint measurements of and . Here and are the distributions of position and momentum in the state , and is the distribution of in the state , with marginals and ; the two marginals turn out to be noisy versions of and . The lower bound is strictly positive and it linearly increases with the dimension n. The thresholds and are peculiar to the continuous case and they have a classical explanation: the relative entropy diverges if the variance of p vanishes faster than the variance of q, so that, given , it is trivial to find a state satisfying (1) if arbitrarily small variances are allowed. What is relevant in our result is that the total loss of information exceeds the lower bound even if we forbid target distributions with small variances.
The MUR (1) shows that there is no Gaussian joint measurement which can approximate arbitrarily well both and . The lower bound (1) is a consequence of the incompatibility between and and, indeed, it vanishes in the classical limit $\hbar \to 0$. Both the relative entropies and the lower bound in (1) are scale invariant. Moreover, for fixed and , we prove the existence and uniqueness of an optimal approximate joint measurement, and we fully characterize it.
In the scalar case, we consider approximate joint measurements of the position along the direction and the momentum along the direction , where . We find two different entropic MURs. The first entropic MUR in the scalar case is similar to the vector case (Theorem 3, Remark 11). The second one is (Theorem 1):
for all Gaussian states and all Gaussian joint approximate measurements of and . This lower bound holds for every Gaussian state without constraints on the position and momentum variances and , it is strictly positive unless and are orthogonal, but it is state dependent. Again, the relative entropies and the lower bound are scale invariant.
The paper is organized as follows. In Section 2, we introduce our target position and momentum observables, we discuss their general properties and define some related quantities (spectral measures, mean vectors and variance matrices, PURs for second order quantum moments, Weyl operators, Gaussian states). Section 3 is devoted to the definitions and main properties of the relative and differential (Shannon) entropies. Section 4 is a review on the entropic PURs in the continuous case [7,8,9,46], with a particular focus on their lack of scale invariance. This is a flaw due to the very definition of differential entropy, and one of the reasons that lead us to introduce relative entropy based MURs. In Section 5 we construct the covariant observables which will be used as approximate joint measurements of the position and momentum target observables. Finally, in Section 6 the main results on MURs that we sketched above are presented in detail. Some conclusions are discussed in Section 7.
2. Target Observables and States
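Let us start with the usual position and momentum operators, which satisfy the canonical commutation rules:
$$[Q_i, P_j] = i\hbar\,\delta_{ij}, \qquad [Q_i, Q_j] = [P_i, P_j] = 0, \qquad i, j = 1, \ldots, n. \tag{3}$$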
Each of the vector operators has n components; it could be the case of a single particle in one or more dimensions (), or several scalar or vector particles, or the quadratures of n modes of the electromagnetic field. We assume the Hilbert space to be irreducible for the algebra generated by the canonical operators and . An observable of the quantum system is identified with a positive operator valued measure (POVM); in the paper, we shall consider observables with outcomes in $\mathbb{R}^k$ endowed with its Borel $\sigma$-algebra. The use of POVMs to represent observables in quantum theory is standard and the definition can be found in many textbooks [22,23,26,47]; the alternative name “non-orthogonal resolutions of the identity” is also used [3,4,5]. Following [5,23,26,31], a sharp observable is an observable represented by a projection valued measure (pvm); it is standard to identify a sharp observable on the outcome space with the k self-adjoint operators corresponding to it by the spectral theorem. Two observables are jointly measurable or compatible if there exists a POVM having them as marginals. Because of the non-vanishing commutators, each couple , , as well as the vectors , , are not jointly measurable.
We denote by the trace class operators on , by the subset of the statistical operators (or states, preparations), and by the space of the linear bounded operators.
2.1. Position and Momentum
Our target observables will be either n-dimensional position and momentum (vector case) or position and momentum along two different directions of (scalar case). The second case allows us to give an example ranging with continuity from maximally incompatible observables to compatible ones.
2.1.1. Vector Observables
As target observables we take and as in (3) and we denote by , , their pvm’s, that is
Then, the distributions in the state of a sharp position and a sharp momentum measurements (denoted by and ) are absolutely continuous with respect to the Lebesgue measure; we denote by and their probability densities: ,
In the Dirac notation, if and are the improper position and momentum eigenvectors, these densities take the expressions and , respectively. The mean vectors and the variance matrices of these distributions will be given in (7) and (8).
2.1.2. Scalar Observables
As target observables we take the position along a given direction and the momentum along another given direction :
In this case we have , so that and are not jointly measurable, unless the directions and are orthogonal.
Their pvm’s are denoted by and , their distributions in a state by and , and their corresponding probability densities by and : ,
Of course, the densities in the scalar case are marginals of the densities in the vector case. Means and variances will be given in (11).
2.2. Quantum Moments
Let be the set of states for which the second moments of position and momentum are finite:
Then, the mean vector and the variance matrix of the position in the state are
while for the momentum we have
For it is possible to introduce also the mixed ‘quantum covariances’
Since there is no joint measurement for the position and momentum , the quantum covariances are not covariances of a joint distribution, and thus they do not have a classical probabilistic interpretation.
By means of the moments above, we construct the three real matrices , the -dimensional vector and the symmetric matrix , with
We say is the quantum variance matrix of position and momentum in the state . In  dimensionless canonical operators are considered, but apart from this, our matrix corresponds to their “noise matrix in real form”; the name “variance matrix” is also used [44,48].
In a similar way, we can introduce all the moments related to the position and momentum introduced in (6). For , the means and variances are respectively
Similarly to (9), we have also the ‘quantum covariance’ . Then, we collect the two means in a single vector and we introduce the variance matrix:
Let be a real symmetric block matrix with the same dimensions of a quantum variance matrix. Define
In this case we have: , , , and
The inequalities (14) for tell us exactly when a (positive semi-definite) real matrix V is the quantum variance matrix of position and momentum in a state . Moreover, they are the multidimensional version of the usual uncertainty principle expressed through the variances [2,3,5], hence they represent a form of PURs. The block matrix in the definition of is useful to compress formulae involving position and momentum; moreover, it makes it simpler to compare our equations with their frequent dimensionless versions (with ) in the literature [43,44].
Equivalences (14) are well known, see e.g.,  (Section 1.1.5),  (Equation (2.20)), and  (Theorem 2). Then .
By using the real block vector , with arbitrary and given , the semi-positivity (14) implies
which in turn implies , and (15). Then, by choosing , where are the eigenvectors of A (since A is a real symmetric matrix, for all i), one gets the strict positivity of all the eigenvalues of A; analogously, one gets . ☐
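As a numerical illustration of the positivity condition (14), the following minimal sketch tests whether a real symmetric matrix can be a quantum variance matrix. It assumes the block ordering (position first, momentum second) and takes the block matrix of (13) to be the standard symplectic matrix scaled by ℏ; the helper name `is_quantum_variance_matrix` and the tolerance are hypothetical choices made only for this example. Since the two sign choices in (14) are complex conjugates of each other, checking one of them suffices.

```python
import numpy as np

def is_quantum_variance_matrix(V, hbar=1.0, tol=1e-12):
    """Check positive semi-definiteness of V + (i*hbar/2)*Omega.

    V is a real symmetric 2n x 2n matrix in (position, momentum) block order;
    Omega is the standard symplectic matrix (an assumption of this sketch).
    """
    V = np.asarray(V, dtype=float)
    n = V.shape[0] // 2
    I, Z = np.eye(n), np.zeros((n, n))
    Omega = np.block([[Z, I], [-I, Z]])
    # Hermitian: V is real symmetric and i*Omega is self-adjoint.
    M = V + 0.5j * hbar * Omega
    return bool(np.linalg.eigvalsh(M).min() >= -tol)

# One degree of freedom, hbar = 1: diag(a, b) passes iff a*b >= 1/4.
ok_generic = is_quantum_variance_matrix(np.diag([1.0, 1.0]))
ok_minimal = is_quantum_variance_matrix(np.diag([0.5, 0.5]))
too_sharp = is_quantum_variance_matrix(np.diag([0.25, 0.25]))
```

For a single degree of freedom with vanishing covariance, the test reduces to the familiar product bound on the two variances.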
Inequality (15) for and becomes the uncertainty rule à la Robertson  for the observables in (6) (a position component and a momentum component spanning an arbitrary angle ):
Since are block matrices, their positive semi-definiteness can be studied by means of the Schur complements [49,50,51]. However, as are complex block matrices with a very peculiar structure, special results hold for them. Before summarizing the properties of in the next proposition, we need a simple auxiliary algebraic lemma.
Let A and B be complex self-adjoint matrices such that . Then , and the equality holds iff .
Let and be the ordered decreasing sequences of the eigenvalues of A and B, respectively. Then, by Weyl’s inequality, implies for every i  (Section III.2). This gives the first statement. Moreover, if and , we get for every i. Then because and . ☐
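The content of the lemma, namely that ordered positive matrices have ordered eigenvalues and hence ordered determinants, can be checked numerically. The sketch below is only an illustration on one randomly generated pair of matrices (the seed, the dimension and all variable names are arbitrary choices of this example), not a substitute for the proof.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
Y = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))

B = X @ X.conj().T + np.eye(4)   # self-adjoint and positive definite
A = B + Y @ Y.conj().T           # A - B is positive semi-definite

# Eigenvalues in decreasing order, as in the proof.
eig_A = np.linalg.eigvalsh(A)[::-1]
eig_B = np.linalg.eigvalsh(B)[::-1]

# Determinants as products of the (real, positive) eigenvalues.
det_A = float(np.prod(eig_A))
det_B = float(np.prod(eig_B))
```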
Let be a real symmetric matrix with the same dimensions of a quantum variance matrix. Then (or, equivalently, ) if and only if and
In this case we have
Moreover, we have also the following properties for the various determinants:
By interchanging A with B and C with in (18)–(22) equivalent results are obtained.
Since we already know that implies the invertibility of A, the equivalence between (14) and (18) with follows from  (Theorem 1.12 p. 34) (see also  (Theorem 11.6) or  (Lemma 3.2)).
In (19), the first inequality follows by summing up the two inequalities in (18). The last two ones are immediate by the positivity of .
The equality in (20) is Schur’s formula for the determinant of block matrices (, Theorem 1.1 p. 19). Then, the first inequality is immediate by the lemma above and the trivial relation ; the second one follows from (19):
The equality is equivalent to ; since the latter two determinants are evaluated on ordered positive matrices by (19), they coincide if and only if the respective arguments are equal (Lemma 1); this shows the equivalence in (21). Then, by (18), the self-adjoint matrix is both positive semi-definite and negative semi-definite; hence it is null, that is, .
Finally, gives trivially. Conversely, implies by (20); since by (19), Lemma 1 then implies and so . ☐
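Schur’s formula for the determinant of a block matrix, used in the proof above, is easy to verify numerically. The following sketch assumes the block layout with A and B on the diagonal and C, together with its transpose, off the diagonal; the random test matrix and the variable names are hypothetical and serve only as an example.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 3
G = rng.normal(size=(2 * n, 2 * n))
V = G @ G.T + np.eye(2 * n)          # real symmetric, positive definite

A = V[:n, :n]                        # upper-left diagonal block
C = V[:n, n:]                        # upper-right off-diagonal block
B = V[n:, n:]                        # lower-right diagonal block

# Schur complement of A in V: B - C^T A^{-1} C.
schur_complement = B - C.T @ np.linalg.solve(A, C)

lhs = np.linalg.det(V)
rhs = np.linalg.det(A) * np.linalg.det(schur_complement)
```

Interchanging the roles of A and B (and of C and its transpose) gives the analogous factorization through the other Schur complement.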
By (18) and (19), every time three matrices define the quantum variance matrix of a state , the same holds for . This fact can be used to characterize when two positive matrices A and B are the diagonal blocks of some quantum variance matrix, or two positive numbers and are the position and momentum variances of a quantum state along the two directions and .
Two real matrices and , having the dimension of the square of a length and momentum, respectively, are the diagonal blocks of a quantum variance matrix if and only if
Two real numbers and , having the dimension of the square of a length and momentum, respectively, are such that and for some state ρ if and only if
For A and B, the necessity follows from (19). The sufficiency comes from (18) by choosing .
For and , the necessity follows from (15). The sufficiency comes from (18) with and for example the following choices of A and B:
if , we take and ;
if , we let
where and are any two scalar multiples of the orthogonal projection onto satisfying when restricted to ;
if , we choose
where and are as in the previous item.
In the last two cases, we chose A and B in such a way that when restricted to the linear span of . ☐
2.3. Weyl Operators and Gaussian States
In the following, we shall introduce Gaussian states, Gaussian observables and covariant observables on the phase-space. In all these instances, the Weyl operators are involved; here we recall their definition and some properties (see e.g.,  (Section 5.2) or  (Section 12.2), where, however, the definition differs from ours in that the Weyl operators are composed with the map of (13)).
The Weyl operators are the unitary operators defined by
The Weyl operators (23) satisfy the composition rule
in particular, this implies the commutation relation
These commutation relations imply the translation property
due to this property, the Weyl operators are also known as displacement operators.
With a slight abuse of notation, we shall sometimes use the identification
where is a block column vector belonging to the phase-space ; here, the first block is a position and the second block is a momentum.
By means of the Weyl operators, it is possible to define the characteristic function of any trace-class operator.
For any operator , its characteristic function is the complex valued function defined by
Note that is the inverse of a length and is the inverse of a momentum, so that w is a block vector living in the space regarded as the dual of the phase-space.
Instead of the characteristic function, sometimes the so called Weyl transform is introduced [4,44].
By  (Proposition 5.3.2, Theorem 5.3.3), we have and the following trace formula holds: ,
As a corollary  (Corollary 5.3.4), we have that a state is pure if and only if
By  (Lemma 3.1) or  (Proposition 8.5.(e)), the trace formula also implies
Moreover, the following inversion formula ensures that the characteristic function completely characterizes the state  (Corollary 5.3.5):
The last two integrals are defined in the weak operator topology.
Finally, for , the moments (7)–(10) can be expressed as in  (Section 5.4):
The condition is necessary and sufficient in order that the function (31) defines the characteristic function of a quantum state  (Theorem 5.5.1),  (Theorem 12.17). Therefore, Gaussian states are exactly the states whose characteristic function is the exponential of a second order polynomial  (Equation (5.5.49)),  (Equation (12.80)).
We shall denote by the set of the Gaussian states; we have . By (30), the vectors , and the matrices , , characterizing a Gaussian state are just its first and second order quantum moments introduced in (7)–(9). By (31), the corresponding distributions of position and momentum are Gaussian, namely
Proposition 4 (Pure Gaussian states).
For , we have if and only if ρ is pure.
The trace formula (28) and (31) give , and this implies the statement. ☐
Proposition 5 (Minimum uncertainty states).
For , we have if and only if ρ is a pure Gaussian state and it factorizes into the product of minimum uncertainty states up to a rotation of .
If , then the equivalence (22) gives , so that the variance matrices and have a common eigenbasis . Thus, all the corresponding couples of position and momentum have minimum uncertainties: . Therefore, if we consider the factorization of the Hilbert space corresponding to the basis , all the partial traces of the state on each factor are minimum uncertainty states. Since for the minimum uncertainty states are pure and Gaussian, the state is a pure product Gaussian state.
The converse is immediate. ☐
3. Relative and Differential Entropies
In this paper, we will be concerned with entropic quantities of classical type [54,55,56]. We express them in ‘bits’, that is, we use base-2 logarithms: $\log \equiv \log_2$.
We deal only with probabilities on the measurable space which admit densities with respect to the Lebesgue measure. So, we define the relative entropy and differential entropy only for such probabilities; moreover, we list only the general properties used in the following.
3.1. Relative Entropy or Kullback-Leibler Divergence
The fundamental quantity is the relative entropy, also called information divergence, discrimination information, Kullback-Leibler divergence or information or distance or discrepancy. The relative entropy of a probability p with respect to a probability q is defined for any couple of probabilities p, q on the same probability space.
Given two probabilities p and q on with densities f and g, respectively, the relative entropy of p with respect to q is
$$S(p\|q) = \int f(x)\,\log\frac{f(x)}{g(x)}\,\mathrm{d}x. \tag{33}$$
The value $+\infty$ is allowed for $S(p\|q)$; the usual convention $0\log(0/0)=0$ is understood. The relative entropy (33) is the amount of information that is lost when q is used to approximate p (p. 51). Of course, if is dimensioned, then the densities f and g have the same dimension (that is, the inverse of ), and the argument of the logarithm is dimensionless, as it must be.
(, Theorem 8.6.1).The following properties hold.
is invariant under a change of the unit of measurement.
If and with invertible variance matrices A and B, then
As is scale invariant, it quantifies a relative error for the use of q as an approximation of p, not an absolute one.
Let us employ the relative entropy to evaluate the effect of an additive Gaussian noise on an independent Gaussian random variable X. If , then , and the relative entropy of the true distribution of X with respect to its disturbed version is
This expression vanishes if the noise becomes negligible with respect to the true distribution, that is if and . On the other hand, diverges if the noise becomes too strong with respect to the true distribution, or, in other words, if the true distribution becomes too peaked with respect to the noise, that is, or .
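The behaviour just described can be explored numerically in the scalar case. The sketch below uses the standard closed form of the relative entropy between two one-dimensional Gaussian distributions, converted to bits; the function names and the parametrization by means and variances are assumptions of this example rather than notation taken from the text.

```python
import math

def kl_gauss_bits(mean_p, var_p, mean_q, var_q):
    """Relative entropy S(p||q) in bits for p = N(mean_p, var_p), q = N(mean_q, var_q)."""
    nats = 0.5 * ((mean_p - mean_q) ** 2 / var_q
                  + var_p / var_q - 1.0
                  + math.log(var_q / var_p))
    return nats / math.log(2.0)

def noise_loss_bits(var_x, mean_noise, var_noise, mean_x=0.0):
    """Information lost when X is replaced by X + independent Gaussian noise."""
    return kl_gauss_bits(mean_x, var_x, mean_x + mean_noise, var_x + var_noise)

weak_noise = noise_loss_bits(1.0, 0.0, 1e-6)     # noise negligible: loss close to 0
strong_noise = noise_loss_bits(1e-6, 0.0, 1.0)   # peaked target: large loss
# Changing the unit of measurement by a factor 10 leaves the value unchanged.
original_units = kl_gauss_bits(0.0, 1.0, 1.0, 2.0)
rescaled_units = kl_gauss_bits(0.0, 100.0, 10.0, 200.0)
```

The comparison of `original_units` and `rescaled_units` illustrates the scale invariance stated in the proposition: the loss depends only on the ratios between the noise parameters and the variance of the target distribution.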
3.2. Differential Entropy
The differential entropy of an absolutely continuous random vector $X$ with a probability density f is
$$H(X) = -\int f(x)\,\log f(x)\,\mathrm{d}x.$$
This quantity is commonly used in the literature, even if it lacks many of the nice properties of the Shannon entropy for discrete random variables. For example, is not scale invariant, and it can be negative  (p. 244).
Since the density f enters in the logarithm argument, the definition of is meaningful only when f is dimensionless, which is the same as being dimensionless. Note that, if is dimensioned and is a real parameter making a dimensionless random variable, then
In the following, we shall consider the differential entropy only for dimensionless random vectors .
(, Section 8.6).The following properties hold.
If is an absolutely continuous random vector with variance matrix A, then
The equality holds iff is Gaussian with variance matrix A and arbitrary mean vector .
If is an absolutely continuous random vector, then
The equality holds iff the components are independent.
In property (i) we have used the following well-known matrix identity, which follows by diagonalization: $\log\det A = \operatorname{Tr}\log A$.
Property (i) yields that the differential entropy of a Gaussian random variable is
which is an increasing function of the variance , and thus it is a measure of the uncertainty of X. Note that iff .
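These properties of the Gaussian differential entropy can be illustrated with a short sketch. It uses the standard expression $\frac{1}{2}\log(2\pi e\,\sigma^2)$ for a scalar Gaussian variable and, for a Gaussian vector, the trace-of-logarithm form obtained by diagonalizing the variance matrix; the function names are hypothetical and the variances are understood as dimensionless.

```python
import math
import numpy as np

def gaussian_entropy_bits(variance):
    """Differential entropy (bits) of a scalar Gaussian with the given variance."""
    return 0.5 * math.log2(2.0 * math.pi * math.e * variance)

def gaussian_vector_entropy_bits(A):
    """Differential entropy (bits) of a Gaussian vector with variance matrix A.

    Uses log det A = Tr log A, evaluated on the eigenvalues of A.
    """
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    trace_log2 = float(np.sum(np.log2(np.linalg.eigvalsh(A))))
    return 0.5 * n * math.log2(2.0 * math.pi * math.e) + 0.5 * trace_log2

threshold = 1.0 / (2.0 * math.pi * math.e)       # variance at which the entropy is zero
negative_entropy = gaussian_entropy_bits(threshold / 2.0)
# Rescaling X by a factor 2 multiplies the variance by 4 and shifts the entropy by 1 bit.
shift = gaussian_entropy_bits(4.0 * 0.7) - gaussian_entropy_bits(0.7)
```

The one-bit shift under a rescaling by 2 is the lack of scale invariance mentioned above, and the negative value below the threshold variance shows why the differential entropy cannot be read as an absolute amount of information.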
4. Entropic PURs for Position and Momentum
The idea of having an entropic formulation of the PURs for position and momentum goes back to [7,8,9]. However, we have just seen that, due to the presence of the logarithm, the Shannon differential entropy needs dimensionless probability densities. So, this leads us to introduce dimensionless versions of position and momentum.
Let be a dimensionless parameter and a second parameter with the dimension of a mass times a frequency. Then, we introduce the dimensionless versions of position and momentum:
We use a single dimensional constant , in order to respect rotation symmetry and not to distinguish different particles. Anyway, there is no natural link between the parameter multiplying and the parameter multiplying ; this is the reason for introducing . As we see from the commutation rules, the constant plays the role of a dimensionless version of ℏ; in the literature on PURs, often is used [8,9,12,46].
4.1. Vector Observables
Let and be the pvm’s of and ; then, and are their probability distributions in the state . The total preparation uncertainty is quantified by the sum of the two differential entropies . For , by Proposition 7 we get
In the case of product states of minimum uncertainty, we have ; then, by taking (20) into account, we get
Thus, the bound (37) arises from quantum relations between and ; indeed, there would be no lower bound for (36) if we could take both and arbitrarily small.
By item (ii) of Proposition 7, the differential entropy for the distribution of a random vector is smaller than the sum of the entropies of its marginals; however, the final bound (37) is a tight bound for both and .
By the results of [8,9], the same bound (37) is obtained even if the minimization is done over all the states, not only the Gaussian ones.
The uncertainty result (37) depends on , this being a consequence of the lack of scale invariance of the differential entropy; note that the bound is positive if and only if . Sometimes in the literature the parameter ℏ appears in the argument of the logarithm [27,30]; this fact has to be interpreted as the appearance of a parameter with the numerical value of ℏ, but without dimensions. In this sense the formulation (37) is consistent with both the cases with or . Sometimes the smaller bound appears in place of ; this is connected to a state dependent formulation of the entropic PUR  (Section V.B).
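The dependence of the entropic bound on the dimensionless version of ℏ can be made concrete for one degree of freedom. The sketch below evaluates the sum of the two Gaussian differential entropies on minimum uncertainty states, where the product of the two standard deviations equals half of the dimensionless constant; the name `hbar_tilde` and both function names are assumptions of this example, and only the one-dimensional Gaussian case is covered.

```python
import math

def entropy_sum_bits(var_q, var_p):
    """Sum of the differential entropies (bits) of two scalar Gaussian distributions."""
    def h(v):
        return 0.5 * math.log2(2.0 * math.pi * math.e * v)
    return h(var_q) + h(var_p)

def min_uncertainty_sum_bits(hbar_tilde, var_q):
    """Entropy sum on a minimum uncertainty state: var_q * var_p = hbar_tilde**2 / 4."""
    var_p = hbar_tilde ** 2 / (4.0 * var_q)
    return entropy_sum_bits(var_q, var_p)

# The value does not depend on how the uncertainty is split between the two variables.
values = [min_uncertainty_sum_bits(1.0, v) for v in (0.1, 1.0, 7.0)]
# It changes sign when the dimensionless constant crosses 1/(pi*e).
at_threshold = min_uncertainty_sum_bits(1.0 / (math.pi * math.e), 0.3)
```

In this setting the common value is $\log(\pi e\,\tilde\hbar)$, so a mere change of the dimensionless constant moves the bound and can even make it negative, in line with the remark on the lack of scale invariance.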
4.2. Scalar Observables
The dimensionless versions of the scalar observables introduced in (6) are
We denote by and the associated distributions in the state . For , the respective means and variances are
As in the vector case, the total preparation uncertainty is quantified by the sum of the two differential entropies . For , Proposition 7 gives
Then, we have the lower bound
which depends on , but not on . Of course, because of (39), for Gaussian states a lower bound for the sum is equivalent to a lower bound for the product . By the generalization of the results of [8,9] given in , the bound (40) is obtained also when the minimization is done over all the states.
Let us note that the bound in (40) is positive for , and it goes to for , which is the case of compatible and . In the case , the bound (40) is the same as (37) for .
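The transition from incompatible to compatible components can be sketched numerically. The example below assumes that the commutator of the two components is proportional to the cosine of the angle between the two directions, so that the Robertson-type bound on the product of the variances is $\tilde\hbar^2\cos^2\alpha/4$; the function names and the parametrization by `cos_alpha` are hypothetical, and only Gaussian distributions are considered.

```python
import math

def scalar_entropic_bound_bits(hbar_tilde, cos_alpha):
    """Lower bound (bits) for the Gaussian entropy sum at a given angle.

    Returns -inf for orthogonal directions, where no constraint survives.
    """
    c = abs(cos_alpha)
    if c == 0.0:
        return -math.inf
    return math.log2(math.pi * math.e * hbar_tilde * c)

def gaussian_entropy_sum_bits(var_q, var_p):
    """Sum of the differential entropies (bits) of two scalar Gaussian distributions."""
    return math.log2(2.0 * math.pi * math.e * math.sqrt(var_q * var_p))

# Variances saturating var_q * var_p = (hbar_tilde * cos_alpha / 2)**2.
hbar_tilde, cos_alpha, var_q = 1.0, 0.5, 0.3
var_p = (hbar_tilde * cos_alpha / 2.0) ** 2 / var_q
saturated = gaussian_entropy_sum_bits(var_q, var_p)
```

As the directions approach orthogonality the bound decreases without limit, which reflects the fact that two orthogonal components can both have arbitrarily peaked distributions.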
5. Approximate Joint Measurements of Position and Momentum
In order to deal with MURs for position and momentum observables, we have to introduce the class of approximate joint measurements of position and momentum, whose marginals we will compare with the respective sharp observables. As done in [3,4,18,57], it is natural to characterize such a class by requiring suitable properties of covariance under the group of space translations and velocity boosts: namely, by approximate joint measurement of position and momentum we will mean any POVM on the product space of the position and momentum outcomes sharing the same covariance properties as the two target sharp observables. As we have already discussed, two approximation problems will be of our concern: the approximation of the position and momentum vectors (vector case, with outcomes in the phase-space ), and the approximation of one position and one momentum component along two arbitrary directions (scalar case, with outcomes in ). In order to treat the two cases altogether, we consider POVMs with outcomes in , which we call bi-observables; they correspond to a measurement of m position components and m momentum components. The specific covariance requirements will be given in the Definitions 5–7.
In studying the properties of probability measures on , a very useful notion is that of the characteristic function, that is, the Fourier cotransform of the measure at hand; the analogous quantity for POVMs turns out to have the same relevance. Different names have been used in the literature to refer to the characteristic function of POVMs, or, more generally, quantum instruments, such as characteristic operator or operator characteristic function [3,24,34,44,58,59,60,61,62]. As a variant, also the symplectic Fourier transform quite often appears  (Section 12.4.3). The characteristic function has been used, for instance, to study the quantum analogues of the infinite-divisible distributions [3,34,58,59,60,62] and measurements of Gaussian type [5,44,61]. Here, we are interested only in the latter application, as our approximating bi-observables will typically be Gaussian. Since we deal with bi-observables, we limit our definition of the characteristic function only to POVMs on , which have the same number of variables of position and momentum type.
Being measures, POVMs can be used to construct integrals, whose theory is presented e.g., in  (Section 4.8) and  (Section 2.9, Proposition 2.9.1).
Given a bi-observable , the characteristic function of is the operator valued function , with
In this definition the dimensions of the vector variables and are the inverses of a length and momentum, respectively, as in the definition of the characteristic function of a state (27). This definition is given so that is the usual characteristic function of the probability distribution on .
5.1. Covariant Vector Observables
In terms of the pvm’s (4), the translation property (25) is equivalent to the symmetry properties
and they are taken as the transformation property defining the following class of POVMs on [23,26,44,53,57].
A covariant phase-space observable is a bi-observable satisfying the covariance relation
We denote by the set of all the covariant phase-space observables.
The interpretation of covariant phase-space observables as approximate joint measurements of position and momentum is based on the fact that their marginal POVMs
have the same symmetry properties as and , respectively. Although and are not jointly measurable, the following well-known result says that there are plenty of covariant phase-space observables  (Theorem 4.8.3), [63,64]. In (43) below, we use the parity operator on , which is such that
The covariant phase-space observables are in one-to-one correspondence with the states on , so that we have the identification ; such a correspondence is given by
The characteristic function (41) of a measurement has a very simple structure in terms of the characteristic function (27) of the corresponding state .
The characteristic function of is given by
and the characteristic function of the probability is
In (44) we have used the identification (26). The characteristic function of a state is introduced in (27).
where we used the formula (29). By (42) and the definition (27), we get (44). Again by (27), we get (45). ☐
In terms of probability densities, measuring on the state yields the density function . Then, by (45), the densities of the marginals and are the convolutions
where f and g are the sharp densities introduced in (5). By the arbitrariness of the state , the marginal POVMs of turn out to be the convolutions (or ‘smearings’)
(see e.g.,  (Section III, Equations (2.48) and (2.49))).
Let us remark that the distribution of the approximate position observable in a state is the distribution of the sum of two independent random vectors: the first one is distributed as the sharp position in the state , the second one is distributed as the sharp position in the state . In this sense, the approximate position looks like a sharp position plus an independent noise given by . Of course, a similar fact holds for the momentum. However, this statement about the distributions cannot be extended to a statement involving the observables. Indeed, since and are incompatible, nobody can jointly observe , and , so that the convolutions (46) do not correspond to sums of random vectors that actually exist when measuring .
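As a purely numerical illustration of the convolution structure in (46), the following minimal Python sketch takes a one-dimensional Gaussian density playing the role of the sharp distribution and a Gaussian noise density, and checks that their convolution is again Gaussian, with mean and variance given by the sums of the two. The function names, the integration parameters and the restriction to one dimension are assumptions made only for this illustration; the sketch concerns the distributions alone and, in line with the remark above, says nothing about random vectors that could be jointly observed.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of the normal distribution N(mean, var) at the point x."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def convolve_at(x, mean_f, var_f, mean_n, var_n, half_width=12.0, steps=4000):
    """Midpoint-rule approximation of (f * noise)(x) = integral of f(x - y) noise(y) dy.

    f is N(mean_f, var_f) and noise is N(mean_n, var_n); the integration
    window spans half_width standard deviations of the noise on each side.
    """
    sd = math.sqrt(var_n)
    lower = mean_n - half_width * sd
    dy = 2.0 * half_width * sd / steps
    total = 0.0
    for k in range(steps):
        y = lower + (k + 0.5) * dy
        total += gaussian_pdf(x - y, mean_f, var_f) * gaussian_pdf(y, mean_n, var_n)
    return total * dy


if __name__ == "__main__":
    # Hypothetical values: sharp variance 1.0, noise variance 0.5.
    smeared = convolve_at(0.3, 0.0, 1.0, 0.0, 0.5)
    expected = gaussian_pdf(0.3, 0.0, 1.5)  # means and variances add up
    print(smeared, expected)
```

In this Gaussian toy setting the smeared density is broader than the sharp one by exactly the noise variance, which is the feature that the relative entropy will later quantify.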
5.2. Covariant Scalar Observables
Now we focus on the class of approximate joint measurements of the observables and representing position and momentum along two possibly different directions and (see Section 2.1.2). As in the case of covariant phase-space observables, this class is defined in terms of the symmetries of its elements: we require them to transform as if they were joint measurements of and . Recall that and denote the spectral measures of , .
Due to the commutation relation (24), the following covariance relations hold
for all and . We employ covariance to define our class of approximate joint measurements of and .
A-covariant bi-observable is a POVM such that
We denote by the class of such bi-observables.
So, our approximate joint measurements of and will be all the bi-observables in the class .
The marginal of a covariant phase-space observable along the directions and is a -covariant bi-observable. Actually, it can be proved that, if , all -covariant bi-observables can be obtained in this way.
It is useful to work with a little more generality, and merge Definitions 5 and 6 into a single notion of covariance.
Suppose J is a real matrix. A POVM is a J -covariant observable on if
Thus, approximate joint observables of and are just J-covariant observables on for the choice of the matrix
On the other hand, covariant phase-space observables constitute the class of -covariant observables on , where is the identity map of .
5.3. Gaussian Measurements
When dealing with Gaussian states, the following class of bi-observables quite naturally arises.
A POVM is a Gaussian bi-observable if
for two vectors , a real matrix and a real symmetric matrix satisfying the condition
We set . The triple is the set of the parameters of the Gaussian observable .
In this definition, the vector has the dimension of a length, and of a momentum; similarly, the matrices , decompose into blocks of different dimensions. The condition (49) is necessary and sufficient in order that the function (48) defines the characteristic function of a POVM.
For unbiased Gaussian measurements, i.e., Gaussian bi-observables with , the previous definition coincides with the one of  (Section 12.4.3). It is also a particular case of the more general definition of Gaussian observables on arbitrary (not necessarily symplectic) linear spaces that is given in [43,44]. We refer to [5,44] for the proof that Equation (48) is actually the characteristic function of a POVM.
Measuring the Gaussian observable on the Gaussian state yields the probability distribution whose characteristic function is
hence the output distribution is Gaussian,
5.3.1. Covariant Gaussian Observables
For Gaussian bi-observables, J-covariance has a very easy characterization.
Suppose is a Gaussian bi-observable on with parameters . Let J be any real matrix. Then, the POVM is a J-covariant observable if and only if .
For , we let and be the two POVMs on given by
By the commutation relations (24) for the Weyl operators, we immediately get
we have also
Since for all , by comparing the last two expressions we see that if and only if
which in turn is equivalent to . ☐
Let us point out the structure of the Gaussian approximate joint measurements of and .
A bi-observable is Gaussian if and only if the state σ is Gaussian. In this case, the covariant bi-observable is Gaussian with parameters
By comparing (31), (44) and (48), and using the fact that if and only if and , we have the first statement. Then, for , we see immediately that is a Gaussian observable with the above parameters. ☐
We call the class of the Gaussian covariant phase-space observables. By (50), observing on a Gaussian state yields the normal probability distribution , with marginals
When and , we have an unbiased measurement.
We now study the Gaussian approximate joint measurements of the target observables and defined in (6).
A Gaussian bi-observable with parameters is in if and only if , where J is given by (47). In this case, the condition (49) is equivalent to
The first statement follows from Proposition 10. Then, the matrix inequality (49) reads
We write for the class of the Gaussian -covariant phase-space observables. An observable is thus characterized by the couple . From (50) with given by (47), we get that measuring on a Gaussian state yields the probability distribution with and given by (12). Its marginals with respect to the first and second entry are, respectively,
Let us construct an example of an approximate joint measurement of and , by using a noisy measurement of position along followed by a sharp measurement of momentum along . Let Δ be a positive real number yielding the precision of the position measurement, and consider the POVM on given by
The characteristic function of is
Therefore, is a Gaussian bi-observable with parameters , and , where J is given by (47) and , and . This implies ; in particular, the set is non-empty. Moreover, the lower bound is attained, cf. (52).
Let us consider the case ; now the target observables and are compatible and we can define a pvm on by setting for all . Its characteristic function is
Then, with parameters , , and given by (47). Note that can be regarded as the limit case of the observables of the previous example when and .
6. Entropic MURs for Position and Momentum
In the case of two discrete target observables, in  we found an entropic bound for the precision of their approximate joint measurements, which we named entropic incompatibility degree. Its definition followed a three-step procedure. Firstly, we introduced an error function: when the system is in a given state , such a function quantifies the total amount of information that is lost by approximating the target observables by means of the marginals of a bi-observable; the error function is nothing else than the sum of the two relative entropies of the respective distributions. Then, we considered the worst possible case by maximizing the error function over , thus obtaining an entropic divergence quantifying the approximation error in a state independent way. Finally, we got our index of the incompatibility of the two target observables by minimizing the entropic divergence over all bi-observables. In particular, when symmetries are present, we showed that the minimum is attained at some covariant bi-observables. So, the covariance followed as a byproduct of the optimization procedure, and was not a priori imposed upon the class of approximating bi-observables.
As we shall see, the extension of the previous procedure to position and momentum target observables is not straightforward, and peculiar problems of the continuous case arise. In order to overcome them, in this paper we shall fully analyse only a case in which explicit computations can be done: Gaussian preparations, and Gaussian bi-observables, which we a priori assume to be covariant. We conjecture that the final result should be independent of these simplifications, as we shall discuss in Section 7.
As we said in Section 5, by “approximate joint measurement” we mean “a bi-observable with the ‘right’ covariance properties”.
6.1. Scalar Observables
Given the directions and , the target observables are and in (6) with pvm’s and . For with parameters given in (10), the target distributions and are normal with means and variances (11).
An approximate joint measurement of and is given by a covariant bi-observable ; then, we denote its marginals with respect to the first and second entry by and , respectively. For a Gaussian covariant bi-observable with parameters , the distribution of in a Gaussian state is normal,
so that its marginal distributions and are normal with means and and variances
Let us recall that , , , and that by (16) and (52), we have
6.1.1. Error Function
The relative entropy is the amount of information that is lost when an approximating distribution is used in place of a target one. For this reason, we use it to give an informational quantification of the error made in approximating the distributions of sharp position and momentum by means of the marginals of a joint covariant observable.
Given the preparation and the covariant bi-observable , the error function for the scalar case is the sum of the two relative entropies:
The relative entropy is invariant under a change of the unit of measurement, so that the error function is scale invariant, too; indeed, it quantifies a relative error, not an absolute one. In the Gaussian case the error function can be explicitly computed.
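The scale invariance just mentioned can be checked numerically in the simplest setting. The sketch below uses the standard closed form of the relative entropy between two one-dimensional normal distributions, in natural units (nats); a different logarithm base would only multiply every value by a constant. The function names and the numerical values are hypothetical and serve only as an illustration.

```python
import math


def gaussian_relative_entropy(mean_p, var_p, mean_q, var_q):
    """Relative entropy S(p || q) in nats, with p = N(mean_p, var_p), q = N(mean_q, var_q)."""
    return 0.5 * (
        math.log(var_q / var_p)
        + (var_p + (mean_p - mean_q) ** 2) / var_q
        - 1.0
    )


def rescale(mean, var, k):
    """Parameters of the same distribution after the change of unit x -> k * x."""
    return k * mean, k * k * var


if __name__ == "__main__":
    # Hypothetical target N(0, 1) and approximating distribution N(0.2, 1.5).
    original = gaussian_relative_entropy(0.0, 1.0, 0.2, 1.5)
    # Same pair expressed in a unit that is 1000 times smaller.
    rescaled = gaussian_relative_entropy(*rescale(0.0, 1.0, 1000.0),
                                         *rescale(0.2, 1.5, 1000.0))
    print(original, rescaled)
```

Both calls return the same value up to rounding, since only the ratio of the variances and the squared bias in units of the approximating variance enter the formula.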
Proposition 13 (Error function for the scalar Gaussian case).
For and , the error function is
and is the following strictly increasing function with :
The statement follows by a straightforward combination of (32), (34), (53) and (56). ☐
Note that the error function does not depend on the mixed covariances and . Note also that, if we select a possible approximation , then the error function decreases for states with increasing sharp variances and : the loss of information decreases when the sharp distributions make the approximation error negligible. Finally, note that
This means that, apart from the term due to the bias, our error function only depends on the two ratios “variance of the approximating distribution over variance of the target distribution”. Thus, in order to optimize the error function, one has to optimize these two ratios.
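The dependence on the two variance ratios can be made concrete with a short numerical sketch. It assumes, as a reading of the Gaussian formulas above, that each approximating marginal is normal with variance equal to the sharp variance plus a noise variance and with mean shifted by a bias, and that s(x) = (1/2)[ln(1 + x) - x/(1 + x)] in nats. The names `s`, `error_function` and all parameter names are hypothetical, introduced only for this illustration.

```python
import math


def s(x):
    """Assumed form s(x) = (1/2) [ln(1 + x) - x / (1 + x)], in nats; s(0) = 0."""
    return 0.5 * (math.log1p(x) - x / (1.0 + x))


def gaussian_relative_entropy(var_target, var_noise, bias):
    """S( N(m, var_target) || N(m + bias, var_target + var_noise) ) in nats."""
    var_approx = var_target + var_noise
    return 0.5 * (
        math.log(var_approx / var_target)
        + (var_target + bias ** 2) / var_approx
        - 1.0
    )


def error_function(var_q, var_p, noise_q, noise_p, bias_q=0.0, bias_p=0.0):
    """Sum of the two relative entropies, written through the variance ratios."""
    x = noise_q / var_q  # noise variance over sharp position variance
    y = noise_p / var_p  # noise variance over sharp momentum variance
    bias_term = 0.5 * (
        bias_q ** 2 / (var_q + noise_q) + bias_p ** 2 / (var_p + noise_p)
    )
    return s(x) + s(y) + bias_term


if __name__ == "__main__":
    direct = (gaussian_relative_entropy(2.0, 0.3, 0.1)
              + gaussian_relative_entropy(0.5, 0.7, -0.2))
    print(direct, error_function(2.0, 0.5, 0.3, 0.7, 0.1, -0.2))
```

For an unbiased measurement, multiplying all sharp and noise variances by a common factor leaves the value unchanged, which is the ratio dependence described above.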
We use formula (57) to firstly give a state dependent MUR, and then, following the scheme of , a state independent MUR.
A lower bound for the error function can be found by minimizing it over all possible approximate joint measurements of and . First of all, let us remark that this minimization makes sense because we consider only -covariant bi-observables: if we minimized over all possible bi-observables, then the minimum would be trivially zero for every given preparation . Indeed, the trivial bi-observable yields .
When minimizing the error function over all -covariant bi-observables, both the minimum and the best measurement attaining it are state dependent. When , the two target observables are compatible, so that their joint measurement trivially exists (see Example 3) and we get . In order to have explicit results for any angle , we consider only the Gaussian case.
The lower bound is tight and the optimal measurement is unique: , for a unique ; such a Gaussian -covariant bi-observable is characterized by
As already discussed, the case is trivial. If , we have to minimize the error function (57) over . First of all we can eliminate the positive term by taking an unbiased measurement. Then, since s is an increasing function, by the second condition in (55) we can also take . This implies by (52). In this case the error function (57) reduces to
with given by (61); by the first of (55), we have .
Now, we can minimize the error function with respect to x by studying its first derivative:
Having , we immediately get that gives the unique minimum. Thus
which concludes the proof. ☐
The minimum information loss depends on both the preparation ρ and the angle α. When , that is when the target observables are not compatible, is strictly greater than zero. This is a peculiar quantum effect: given ρ, and , there is no Gaussian approximate joint measurement of and that can approximate them arbitrarily well. On the other hand, in the limit , the lower bound goes to zero; so, the case of commuting target observables is approached with continuity.
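The one-variable minimization carried out in the proof can be cross-checked numerically. The sketch below assumes that, after the reductions described above, the error function takes the form s(x) + s(z²/x), where x is the ratio of the position noise variance to the sharp position variance, the product of the two ratios is pinned to z², and s(x) = (1/2)[ln(1 + x) - x/(1 + x)] in nats. These forms, the symbol z and the function names are assumptions of this illustration, not a restatement of (60)–(62).

```python
import math


def s(x):
    """Assumed form s(x) = (1/2) [ln(1 + x) - x / (1 + x)], in nats."""
    return 0.5 * (math.log1p(x) - x / (1.0 + x))


def reduced_error(x, z):
    """Assumed reduced error function: the two variance ratios obey x * y = z**2."""
    return s(x) + s(z * z / x)


def grid_minimum(z, decades=3.0, points=20001):
    """Scan x on a logarithmic grid centred at z; return (argmin, minimum value)."""
    best_x, best_val = None, float("inf")
    for k in range(points):
        t = -decades + 2.0 * decades * k / (points - 1)
        x = z * 10.0 ** t
        val = reduced_error(x, z)
        if val < best_val:
            best_x, best_val = x, val
    return best_x, best_val


if __name__ == "__main__":
    z = 0.5  # hypothetical value of the constrained product's square root
    x_min, value = grid_minimum(z)
    print(x_min, value, 2.0 * s(z))
```

Under these assumptions the scan locates the minimum at x = z, where the two ratios coincide, and the minimum value is 2 s(z); this agrees with setting the derivative of the reduced error function to zero.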
The lower bound goes to zero also in the classical limit . This holds for every angle α and every Gaussian state ρ.
Another case in which is the limit of large uncertainty states, that is, if we let the product : our entropic MUR disappears because, roughly speaking, the variance of (at least) one of the two target observables goes to infinity, its relative entropy vanishes by itself, and an optimal covariant bi-observable has to take care of (at most) only the other target observable.
Actually, something similar to the previous remark happens also at the macroscopic limit, and does not require the measuring instrument to be an optimal one; indeed, unbiasedness is enough in this case. This happens because the error function quantifies a relative error; even if the measurement approximation is fixed, such an error can be reduced by suitably changing the preparation ρ. Indeed, if we consider the position and momentum of a macroscopic particle, for instance the center of mass of many particles, it is natural that its state has much larger position and momentum uncertainties than the intrinsic uncertainties of the measuring instrument; that is, and , implying that the error function (57) is negligible. In practice, this is a classical case: the preparation has large position and momentum uncertainties and the measuring instrument is relatively good. In this situation we do not see the difference between the joint measurement of position and momentum and their separate sharp observations.
The optimal approximating joint measurement is unique; by (62) it depends on the preparation ρ one is considering, as well as on the directions and . A realization of is the measuring procedure of Example 2.
The MUR (59) is scale invariant, as both the error function and the lower bound are such.
For , we get , where is defined by (61). As ranges in the interval , the quantity takes all the values in the interval , so that
In order to get this result, we needed ; however, the final result does not depend on α. Therefore, in the -approach of (63), the continuity from quantum to classical is lost.
6.1.2. Entropic Divergence of , from
Now we want to find an entropic quantification of the error made in observing as an approximation of and in an arbitrary state . The procedure of , already suggested in  (Section VI.C) for a different error function, is to consider the worst case by maximizing the error function over all the states. However, in the continuous framework this is not possible for the error function (56); indeed, from (57) we get even if we restrict to unbiased covariant bi-observables.
Anyway, the reason for to diverge is classical: it depends only on the continuous nature of and , without any relation to their (quantum) incompatibility. Indeed, as we noted in Section 3.1, if an instrument measuring a random variable adds an independent noise , thus producing an output , then the relative entropy diverges for ; this is what happens if we fix the noise and we allow for arbitrarily peaked preparations. Thus, the sum diverges if, fixed , we let or go to 0.
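This classical divergence is easy to see numerically when both the signal and the independent noise are Gaussian. The following sketch, which assumes zero-mean normal distributions and uses hypothetical names and values, evaluates the relative entropy of the signal distribution with respect to the noisy output distribution while the signal variance shrinks and the noise variance is kept fixed.

```python
import math


def relative_entropy_signal_vs_noisy(var_signal, var_noise):
    """S( N(0, var_signal) || N(0, var_signal + var_noise) ) in nats."""
    x = var_noise / var_signal
    return 0.5 * (math.log1p(x) - x / (1.0 + x))


# Fixed noise variance (hypothetical value) and increasingly peaked signals.
noise = 1.0
values = [relative_entropy_signal_vs_noisy(10.0 ** (-k), noise) for k in range(7)]

if __name__ == "__main__":
    for k, value in enumerate(values):
        print(f"signal variance 1e-{k}: relative entropy = {value:.4f} nats")
```

The values grow without bound, roughly like half the logarithm of the ratio between noise and signal variances, and no quantum ingredient enters the computation.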
The difference between the classical and quantum frameworks emerges if we bound from below the variances of the sharp position and momentum observables. Indeed, in the classical framework we have for every ; the same holds for the sum of two relative entropies if no relation exists between the two noises. On the contrary, in the quantum framework the entropic MURs appear due to the relation between the position and momentum errors occurring in any approximate joint measurement.
In order to avoid that due to merely classical effects, we thus introduce the following subset of the Gaussian states:
and we evaluate the error made in approximating and with the marginals of a -covariant bi-observable by maximizing the error function over all these states.
The Gaussian -entropic divergence of from is
For Gaussian , depending on the choice of the thresholds and , the divergence can be easily computed or at least bounded.
Let the bi-observable be fixed.
For , the divergence is given by
where is any Gaussian state with and , and
For , the divergence is bounded from below by
where is any Gaussian state with and , and
The existence of the above states is guaranteed by Proposition 3.
By Proposition 3, maximizing the error function over the states in is the same as maximizing (57) over the parameters and satisfying (55) and (64) (note that in the bias , the variances and depend on and by (54)).
In the case , the thresholds themselves satisfy Heisenberg's uncertainty relation, and so equality (66) follows from the expression (57) and the fact that the functions , , are decreasing in and .
In the case , we have to take into account the relation (55) for and : the supremum of is achieved when , with and . Then inequality (67) follows by choosing and .
The conditions on the states do not depend on , but only on the parameters defining . Thus, in the case , any choice of yields a state which is the worst one for every Gaussian approximate joint measurement .
6.1.3. Entropic Incompatibility Degree of and
The last step is to optimize the state independent -entropic divergence (65) over all the approximate joint measurements of and . This is done in the next definition.
The Gaussian -entropic incompatibility degree of , is
Again, depending on the choice of the thresholds and , the entropic incompatibility degree can be easily computed or at least bounded.
For , the incompatibility degree is given by
The infimum in (68) is attained and the optimal measurement is unique, in the sense that
for a unique ; such a bi-observable is characterized by
For , the incompatibility degree is bounded from below by
The latter bound is
where the state is defined in item (ii) of Theorem 2 and is the bi-observable in such that
In the case , due to (66), the proof is the same as that of Theorem 1 with the replacements and .
In the case , starting from (67), the proof is the same as that of Theorem 1 with the replacements and .
By means of the above results, we can formulate a state independent entropic MUR for the position and the momentum in the following way. Given two positive thresholds and , there exists a preparation (introduced in Theorem 2) such that, for all Gaussian approximate joint measurements of and , we have
The inequality follows by (66) and (69) in the case , and (73) in the case .
What is relevant is that, for every approximate joint measurement , the total information loss does exceed the lower bound (75) even if the set of states forbids preparations ρ with too peaked target distributions. Indeed, without the thresholds , , it would be trivial to exceed the lower bound (75), as we noted in Section 6.1.2.
We also remark that, once and are chosen, we found a single state in that satisfies (75) for every , so that is a ‘bad’ state for all Gaussian approximate joint measurements of position and momentum.
When , the optimal approximate joint measurement is unique in the class of Gaussian -covariant bi-observables; it depends only on the class of preparations : it is the best measurement for the worst choice of the preparation in the class .
The entropic incompatibility degree is strictly positive for (incompatible target observables) and it goes to zero in the limits (compatible observables), (classical limit), and (large uncertainty states).
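The limiting behaviours listed in this remark can be explored with a small numerical sketch. It rests on a loudly stated assumption: that, for thresholds whose product is at least ħ/2, the lower bound has the closed form 2 s(z) with z = ħ |cos α| / (2 ε₁ ε₂) and s(x) = (1/2)[ln(1 + x) - x/(1 + x)] in nats. This form, the symbols ε₁, ε₂ for the thresholds, and the function name are assumptions of the illustration and should be checked against (69) and (75) before being relied upon.

```python
import math


def s(x):
    """Assumed form s(x) = (1/2) [ln(1 + x) - x / (1 + x)], in nats."""
    return 0.5 * (math.log1p(x) - x / (1.0 + x))


def assumed_scalar_bound(alpha, eps1, eps2, hbar=1.0):
    """Assumed closed form 2 s(z), with z = hbar |cos(alpha)| / (2 eps1 eps2).

    The sketch covers only thresholds with eps1 * eps2 >= hbar / 2.
    """
    if eps1 * eps2 < hbar / 2.0:
        raise ValueError("sketch covers only eps1 * eps2 >= hbar / 2")
    z = hbar * abs(math.cos(alpha)) / (2.0 * eps1 * eps2)
    return 2.0 * s(z)


if __name__ == "__main__":
    print(assumed_scalar_bound(0.0, 1.0, 1.0))            # parallel directions
    print(assumed_scalar_bound(math.pi / 3, 1.0, 1.0))    # intermediate angle
    print(assumed_scalar_bound(math.pi / 2, 1.0, 1.0))    # orthogonal directions
    print(assumed_scalar_bound(0.0, 1.0, 1.0, hbar=1e-6)) # small hbar
    print(assumed_scalar_bound(0.0, 100.0, 100.0))        # large thresholds
```

Under the stated assumption the value is largest for parallel directions and becomes negligible for orthogonal directions, for small ħ, and for large threshold products, in agreement with the three limits of the remark.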
The scale invariance of the relative entropy extends to the error function , hence to the divergence and the entropic incompatibility degree , as well as the entropic MUR (75).
6.2. Vector Observables
Now the target observables are and given in (3), with pvm’s and ; the approximating bi-observables are the covariant phase-space observables of Definition 5. Each bi-observable is of the form for some , where is given by (43). is the subset of the Gaussian bi-observables in , and if and only if is a Gaussian state.
We proceed to define the analogues of the scalar quantities introduced in Section 6.1.1, Section 6.1.2 and Section 6.1.3. In order to do it, in the next proposition we recall some known results on matrices.
([50,51,52,65]). Let and be complex matrices such that . Then, we have . Moreover, if is a strictly increasing continuous function, we have .
6.2.1. Error Function
Given the preparation and the covariant phase-space observable , with , the error function for the vector case is the sum of the two relative entropies:
As in the scalar case, the error function is scale invariant, it quantifies a relative error, and we always have because position and momentum are incompatible. Indeed, since the marginals of a bi-observable turn out to be convolutions of the respective sharp observables and with some probability densities on , and for all states ; this is an easy consequence, for instance, of Problem 26.1, p. 362, in .
In the Gaussian case the error function can be explicitly computed.
Proposition 15 (Error function for the vector Gaussian case).
For , the error function has the two equivalent expressions:
In the same way a similar expression is obtained for and (77a) is proved.
On the other hand, by using
and the analogous expressions involving and , one gets (77b). ☐
State Dependent Lower Bound
In principle, a state dependent lower bound for the error function could be found by analogy with Theorem 1, by taking again the infimum over all joint covariant measurements, that is . By considering only Gaussian states and measurements , from (18), (77a) and (78a), the infimum over can be reduced to an infimum over the matrices :
The above equality follows since the monotonicity of s (Proposition 14) implies that the trace term in (77a) attains its minimum when . However, it remains an open problem to explicitly compute the infimum over the matrices when the preparation is arbitrary.
Nevertheless, the computations can be done at least for a preparation of minimum uncertainty (Proposition 5). Indeed, by (22) we get
Now we can diagonalize and minimize over its eigenvalues; since attains its minimum value at , this procedure gives . So, by denoting by the state giving the minimum, we have
For an arbitrary , we can use the last formula to deduce an upper bound for . Indeed, if is a minimum uncertainty state with , then by (19), and, using again the state of (79), we find
The second inequality in the last formula follows from (77b), (78b) and the monotonicity of s (Proposition 14).
6.2.2. Entropic Divergence of from
In order to define a state independent measure of the error made in regarding the marginals of as approximations of and , we can proceed along the lines of the scalar case in Section 6.1.2. To this end, we introduce the following vector analogue of the Gaussian states defined in (64):
In the vector case, Definition 10 then reads as follows.
The Gaussian -entropic divergence of from is
As in the scalar case, when is Gaussian, depending on the choice of the product , we can compute the divergence or at least bound it from below.
Let the bi-observable be fixed.
For , the divergence is given by
where is any Gaussian state with and .
For , the divergence is bounded from below by
where is any Gaussian state with and .
In the case , for we have and ; by Proposition 14 we get
By using these inequalities in the expression (77b), we get (83).
In the case , the lower bound (84) follows by evaluating at the state with and .
Note that does not depend on , but only on the parameters defining : again, in the case , the error attains its maximum at a state which is independent of the approximate measurement.
6.2.3. Entropic Incompatibility Degree of and
By analogy with Section 6.1.3, we can optimize the -entropic divergence over all the approximate joint measurements of and .
The Gaussian -entropic incompatibility degree of and is
Again, depending on the product , we can compute or at least bound from below.
For , the incompatibility degree is given by
The infimum in (85) is attained and the optimal measurement is unique, in the sense that
for a unique ; such a state is the minimal uncertainty state characterized by
For , the incompatibility degree is bounded from below by
The latter bound is
where the preparation is defined in item (ii) of Theorem 4 and is the state in such that
In the case , from the expression (83) we get immediately , and by (19) we have . So, by (83) and Propositions 3 and 14, we get , and
By minimizing over all the eigenvalues of , we get the minimum (86), which is attained if and only if is as in (88). Hence, and are as in (88). This implies that any optimal state is a minimum uncertainty state; so, and the state is unique.
In the case , by (19) and Proposition 14, inequality (84) implies
By minimizing over all the eigenvalues of , we get (89). Then (89) holds for as in item (ii) of Theorem 4 and in (91).
By means of the above results, we can formulate the following state independent entropic MUR for the position and momentum . Given two positive thresholds and , there exists a preparation (introduced in Theorem 4) such that, for all Gaussian approximate joint measurements of and , we have
The inequality follows by (83) and (86) for , and (90) for .
Thus, also in the vector case, for every approximate joint measurement , the total information loss does exceed the lower bound (92) even if forbids preparations ρ with too peaked target distributions. Moreover, once and are chosen, one can fix again a single ‘bad’ state in that satisfies (92) for all Gaussian approximate joint measurements of and .
Whenever , the optimal approximating joint measurement is unique in the class of Gaussian covariant bi-observables; it corresponds to a minimum uncertainty state which depends only on the chosen class of preparations , that is, on the thresholds and : is the best measurement for the worst choice of the preparation in that class.
For , the vector lower bound in (92) reduces to the scalar lower bound found in (75) for two parallel directions and ; for , the bound linearly increases with n.
The entropic incompatibility degree is strictly positive for (incompatible target observables) and it goes to zero in the limit (compatible observables), (classical limit), and (large uncertainty states).
Similarly to Remark 6 for scalar target observables, also the MUR (92) is actually ineffective for macroscopic systems. Indeed, suppose we are concerned with position and momentum of a macroscopic particle, say the center of mass of a multi-particle system (in this case ). The states ρ which can be prepared in practice have macroscopic widths, say with ‘large’ thresholds and . Then, we consider a measuring instrument having a high precision with respect to this class of states, but not necessarily attaining a precision near the quantum limits. For instance, let us take with , , and , ; we assume is also unbiased: , . Obviously, must hold. Then, by (77a) and (78a) we have
By (58) the function s is increasing and it behaves as in a neighborhood of zero; in the present case and , thus implying that the error function is negligible. This is practically a ‘classical’ case: the preparation has ‘large’ position and momentum uncertainties and the measuring instrument is ‘relatively good’. In this situation we do not see the difference between the joint measurement of position and momentum and their separate sharp distributions. Of course the bound (92) continues to hold, but it is also negligible since .
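The smallness of the error function in this regime can be illustrated numerically. Assuming s(x) = (1/2)[ln(1 + x) - x/(1 + x)] in nats, a Taylor expansion gives the leading term x²/4 near zero; the sketch below, whose function names are hypothetical, compares s with this quadratic term for small ratios of noise variance to sharp variance.

```python
import math


def s(x):
    """Assumed form s(x) = (1/2) [ln(1 + x) - x / (1 + x)], in nats."""
    return 0.5 * (math.log1p(x) - x / (1.0 + x))


def ratio_to_quadratic(x):
    """s(x) divided by its leading small-x term x**2 / 4."""
    return s(x) / (x * x / 4.0)


if __name__ == "__main__":
    for x in (1e-1, 1e-2, 1e-3):
        print(f"x = {x:g}: s(x) = {s(x):.3e}, ratio to x^2/4 = {ratio_to_quadratic(x):.5f}")
```

The ratio tends to one as x decreases, so a measuring instrument whose noise variances are, say, a thousandth of the sharp variances contributes an information loss of order 10⁻⁷ nats per component under these assumptions.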
Also in the vector case, the scale invariance of the relative entropy extends to the error function , the divergence and the entropic incompatibility degree , as well as the entropic MUR (92). Indeed, let us consider the dimensionless versions of position and momentum (35) and their associated projection valued measures , introduced in Section 4. Accordingly, we rescale the joint measurement of (43) in the same way, obtaining the POVM
Here, both the vector variables and , as well as the components of the Borel set B, are dimensionless. By the scale invariance of the relative entropy, the error function takes the same value as in the dimensioned case:
Then, the scale invariance holds for the entropic divergence and incompatibility degree, too:
where and . In particular and, in this case, we have
We have extended the relative entropy formulation of MURs given in  from the case of discrete incompatible observables to a particular instance of continuous target observables, namely the position and momentum vectors, or two components of them along two possibly non-parallel directions. The entropic MURs we found share the nice property of being scale invariant and well-behaved in the classical and macroscopic limits. Moreover, in the scalar case, when the angle spanned by the position and momentum components goes to , the entropic bound correctly reflects their increasing compatibility by approaching zero with continuity.
Although our results are limited to the case of Gaussian preparation states and covariant Gaussian approximate joint measurements, we conjecture that the bounds we found still hold for arbitrary states and general (not necessarily covariant or Gaussian) bi-observables. Let us see in some more detail how this should work in the case when the target observables are the vectors and .
The most general procedure should be to consider the error function for an arbitrary POVM on and any state . First of all, we need states for which neither the position nor the momentum dispersion is too small; the obvious generalization of the test states (81) is
Then, the most general definitions of the entropic divergence and incompatibility degree are:
It may happen that is not absolutely continuous with respect to , or with respect to ; in this case, the error function and the entropic divergence take the value by definition. So, we can restrict to bi-observables that are (weakly) absolutely continuous with respect to the Lebesgue measure. However, the true difficulty is that, even with this assumption, here we are not able to estimate (94), hence (95). It could be that the symmetrization techniques used in [17,19] can be extended to the present setting, and one can reduce the evaluation of the entropic incompatibility index to optimizing over all covariant bi-observables. Indeed, in the present paper we a priori selected only covariant approximating measurements; we would like to understand if, among all approximating measurements, the relative entropy approach selects covariant bi-observables by itself. However, even if is covariant, there remains the problem that we do not know how to evaluate (94) if and are not Gaussian. It is reasonable to expect that some continuity and convexity arguments should apply, and the bounds in Theorem 5 might be extended to the general case by taking dense convex combinations. Also the techniques used for the PURs in [8,9] could be of help in order to extend what we did with Gaussian states to arbitrary states. This leads us to conjecture:
Conjecture (96) is further supported by the fact that the uniqueness of the optimal approximating bi-observable in Theorem 5(i) is reminiscent of what happens in the discrete case of two Fourier conjugated mutually unbiased bases (MUBs); indeed, in the latter case, the optimal bi-observable is actually unique among all the bi-observables, not only the covariant ones (see  (Theorem 5)).
Similar considerations also apply to the case of scalar target observables. We leave a deeper investigation of equality (96) to future work.
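The convention recalled above, namely that the relative entropy is set to +∞ when absolute continuity fails, can be illustrated in the much simpler finite discrete case. The following Python sketch is only an illustration under stated assumptions: the function name `relative_entropy`, the restriction to finite probability vectors, and the use of natural logarithms (nats) are choices made here and are not taken from the paper, which works with continuous distributions.

```python
import math

def relative_entropy(p, q):
    """Relative entropy S(p||q) in nats for two finite probability vectors.

    Hypothetical helper for illustration only: p plays the role of the
    'true' distribution and q that of the approximating one.
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0.0:
            # Terms with p_i = 0 contribute nothing (0 * log 0 := 0).
            continue
        if qi == 0.0:
            # p is not absolutely continuous with respect to q:
            # the relative entropy is +infinity by definition.
            return math.inf
        total += pi * math.log(pi / qi)
    return total

# A 'true' distribution supported on one point, approximated by a uniform one.
finite_loss = relative_entropy([1.0, 0.0], [0.5, 0.5])
# Roles exchanged: the approximating distribution misses part of the support.
infinite_loss = relative_entropy([0.5, 0.5], [1.0, 0.0])
```

In this toy setting `finite_loss` equals log 2 in nats, while `infinite_loss` is infinite, which also shows that the relative entropy is not symmetric in its two arguments.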
As a final consideration, one could be interested in finding error/disturbance bounds involving sequential measurements of position and momentum, rather than considering all their possible approximate joint measurements. As sequential measurements are a proper subset of the set of all the bi-observables, optimizing only over them should lead to bounds that are greater than . This is why in  an error/disturbance entropic bound, denoted by and distinct from , was introduced. However, it was also proved that the equality holds when one of the target observables is discrete and sharp. Now, in the present paper, only sharp target observables are involved; although the argument of  cannot be extended to the continuous setting, the optimal approximating joint observables we found in Theorems 3(i) and 5(i) are in fact sequential measurements. Indeed, the optimal bi-observable in Theorem 3(i) is one of the POVMs described in Examples 2 and 3 (see (74)); all these bi-observables have a (trivial) sequential implementation in terms of an unsharp measurement of followed by sharp . On the other hand, in the vector case, it was shown in (, Corollary 1) that all covariant phase-space observables can be obtained as a sequential measurement of an unsharp version of the position followed by the sharp measurement of the momentum . Therefore, the same equality also holds for target position and momentum observables, in both the scalar and the vector case.
The three authors contributed equally to the paper.
Conflicts of Interest
The authors declare no conflict of interest.
Heisenberg, W. Über den anschaulichen Inhalt der quantentheoretischen Kinematik und Mechanik. Zeitschr. Phys. 1927, 43, 172–198.
Simon, R.; Mukunda, N.; Dutta, B. Quantum-noise matrix for multimode systems: U(n) invariance, squeezing, and normal forms. Phys. Rev. A 1994, 49, 1567–1583.
Holevo, A.S. Statistical Structure of Quantum Theory; Lecture Notes in Physics Monographs 67; Springer: Berlin, Germany, 2001.
Holevo, A.S. Probabilistic and Statistical Aspects of Quantum Theory; Quaderni della Normale; Edizioni della Normale: Pisa, Italy, 2011.
Buscemi, F.; Hall, M.J.W.; Ozawa, M.; Wilde, M.M. Noise and disturbance in quantum measurements: An information-theoretic approach. Phys. Rev. Lett. 2014, 112, 050401.
Busch, P.; Heinosaari, T.; Schultz, J.; Stevens, N. Comparing the degrees of incompatibility inherent in probabilistic physical theories. Europhys. Lett. 2013, 103, 10002.
Busch, P.; Lahti, P.; Werner, R. Proof of Heisenberg’s error-disturbance relation. Phys. Rev. Lett. 2013, 111, 160405.
Coles, P.J.; Furrer, F. State-dependent approach to entropic measurement-disturbance relations. Phys. Lett. A 2015, 379, 105–112.
Heinosaari, T.; Schultz, J.; Toigo, A.; Ziman, M. Maximally incompatible quantum observables. Phys. Lett. A 2014, 378, 1695–1699.
Werner, R.F. Uncertainty relations for general phase spaces. Front. Phys. 2016, 11, 110305.
Buscemi, F.; Das, S.; Wilde, M.M. Approximate reversibility in the context of entropy gain, information gain, and complete positivity. Phys. Rev. A 2016, 93, 062314.
Barchielli, A.; Lupieri, G. Instrumental processes, entropies, information in quantum continual measurements. Quantum Inf. Comput. 2004, 4, 437–449.
Barchielli, A.; Lupieri, G. Instruments and channels in quantum information theory. Opt. Spectrosc. 2005, 99, 425–432.
Barchielli, A.; Lupieri, G. Quantum measurements and entropic bounds on information transmission. Quantum Inf. Comput. 2006, 6, 16–45.
Barchielli, A.; Lupieri, G. Instruments and mutual entropies in quantum information. Banach Center Publ. 2006, 73, 65–80.
Barchielli, A.; Lupieri, G. Entropic bounds and continual measurements. In Quantum Probability and Infinite Dimensional Analysis; QP-PQ: Quantum Probability and White Noise Analysis; Accardi, L., Freudenberg, W., Schürmann, M., Eds.; World Scientific: Singapore, 2007; Volume 20, pp. 79–89.
Barchielli, A.; Lupieri, G. Information gain in quantum continual measurements. In Quantum Stochastic and Information; Belavkin, V.P., Guţǎ, M., Eds.; World Scientific: Singapore, 2008; pp. 325–345.
Huang, Y. Entropic uncertainty relations in multidimensional position and momentum spaces. Phys. Rev. A 2011, 83, 052124.
Heinosaari, T.; Miyadera, T.; Ziman, M. An invitation to quantum incompatibility. J. Phys. A Math. Theor. 2016, 49, 123001.
Simon, R.; Sudarshan, E.C.G.; Mukunda, N. Gaussian-Wigner distributions in quantum mechanics and optics. Phys. Rev. A 1987, 36, 3868–3880.
Horn, R.A.; Zhang, F. Basic Properties of the Schur Complement. In The Schur Complement and Its Applications; Zhang, F., Ed.; Numerical Methods and Algorithms; Springer: Berlin, Germany, 2005; pp. 17–46.
Petz, D. Quantum Information Theory and Quantum Statistics; Springer: Berlin, Germany, 2008.
Carlen, E. Trace Inequalities and Quantum Entropy: An Introductory Course. In Entropy and the Quantum; Contemporary Mathematics; American Mathematical Society: Providence, RI, USA, 2010; Volume 529, pp. 73–140.
Bhatia, R. Matrix Analysis; Springer: New York, NY, USA, 1997.
Werner, R.F. Quantum harmonic analysis on phase spaces. J. Math. Phys. 1983, 25, 1404–1411.
Burnham, K.P.; Anderson, D.R. Model Selection and Multimodel Inference: A Practical Information-Theoretic Approach; Springer: New York, NY, USA, 2002.
Topsøe, F. Basic concepts, identities and inequalities: The toolkit of Information Theory. Entropy 2011, 3, 162–190.
Cover, T.M.; Thomas, J.A. Elements of Information Theory, 2nd ed.; Wiley: Hoboken, NJ, USA, 2006.
Carmeli, C.; Heinonen, T.; Toigo, A. Position and momentum observables on R and on R3. J. Math. Phys. 2004, 45, 2526–2539.
Barchielli, A.; Lupieri, G. Quantum stochastic calculus, operation valued stochastic processes and continual measurements in quantum mechanics. J. Math. Phys. 1985, 26, 2222–2230.
Barchielli, A.; Lupieri, G. A quantum analogue of Hunt’s representation theorem for the generator of convolution semigroups on Lie groups. Probab. Theory Rel. Fields 1991, 88, 167–194.
Barchielli, A.; Holevo, A.S.; Lupieri, G. An analogue of Hunt’s representation theorem in quantum probability. J. Theor. Probab. 1993, 6, 231–265.
Holevo, A.S. Investigations in the General Theory of Statistical Decisions. Proc. Steklov Inst. Math. 1978, 124, 1–140.
Holevo, A.S. Infinitely divisible measurements in quantum probability theory. Theory Probab. Appl. 1986, 31, 493–497.
Cassinelli, G.; De Vito, E.; Toigo, A. Positive operator valued measures covariant with respect to an irreducible representation. J. Math. Phys. 2003, 44, 4768–4775.
Kiukas, J.; Lahti, P.; Ylinen, K. Normal covariant quantization maps. J. Math. Anal. Appl. 2006, 319, 783–801.
Ohya, M.; Petz, D. Quantum Entropy and Its Use; Springer: Berlin, Germany, 1993.
Billingsley, P. Probability and Measure, 2nd ed.; Wiley: New York, NY, USA, 1986.
Carmeli, C.; Heinonen, T.; Toigo, A. Sequential measurements of conjugate observables. J. Phys. A Math. Theor. 2011, 44, 285304.