Cosmic Inflation, Quantum Information and the Pioneering Role of John S Bell in Cosmology

According to the theory of cosmic inflation, the large scale structures observed in our Universe (galaxies, clusters of galaxies, Cosmic Microwave Background (CMB) anisotropies, ...) are of quantum mechanical origin. They are nothing but vacuum fluctuations, stretched to cosmological scales by the cosmic expansion and amplified by gravitational instability. At the end of inflation, these perturbations are placed in a two-mode squeezed state with the strongest squeezing ever produced in Nature (much larger than anything that can be made in the laboratory on Earth). This article studies whether astrophysical observations could unambiguously reveal this quantum origin by borrowing ideas from quantum information theory. It is argued that some of the tools needed to carry out this task were discussed long ago by J. Bell in a so far largely unrecognized contribution. A detailed study of his paper and of the criticisms that have been put forward against his work is presented. Although J. Bell could not have realized it when he wrote his letter, since the quantum state of cosmological perturbations was not yet fully characterized at that time, it is also shown that Cosmology and cosmic inflation represent the most interesting frameworks to apply the concepts he investigated. This confirms that cosmic inflation is not only a successful paradigm to understand the early Universe. It is also the only situation in Physics where one crucially needs General Relativity and Quantum Mechanics to derive the predictions of a theory and where, at the same time, we have high-accuracy data to test these predictions, making inflation a playground of utmost importance to discuss foundational issues in Quantum Mechanics.


Introduction
The theory of cosmic inflation [1][2][3][4][5][6] is considered the leading paradigm for describing the physical conditions that prevailed in the early Universe. It is a very successful theory because it solves the puzzles of the standard model of Cosmology but, also, because it has made predictions that have been observationally verified (for a recent assessment of the scientific status of inflation, see Refs. [7,8]). For instance, it predicts the presence of Doppler peaks in the Cosmic Microwave Background (CMB) multipole moments, a vanishing spatial curvature and a power spectrum of cosmological perturbations close to scale invariance but not exactly scale invariant [9][10][11][12][13][14] (for reviews, see e.g. Refs. [15,16]). This last prediction has been verified, at a statistically significant level, only recently, thanks to the release of the European Space Agency (ESA) Planck satellite data [17][18][19][20][21].
But inflation is also interesting because it combines Quantum Mechanics and General Relativity. Indeed, according to inflation, all structures in our Universe are of quantum-mechanical origin. This claim, although very strong, seems to be empirically correct in the sense that all conclusions that can be derived from this assumption fit well the data at our disposal. However, it would clearly be interesting to go beyond an indirect proof and to be able to find an explicit and unambiguous signature of the quantum origin of the structures present in our Universe.
This article is devoted to this question and discusses the tools, often borrowed from Quantum Information Theory, that can be used to address it [22][23][24][25][26].
We also argue that crucial insights into those issues were anticipated by John Bell in a so far unrecognized contribution, "EPR correlations and EPW distributions" [27], reproduced in his book "Speakable and unspeakable in quantum mechanics" [28]. This letter was written after the invention of inflation but before the quantum state of cosmological perturbations was fully characterized by Grishchuk and Sidorov [29]. It discusses important ideas related to the classical limit of Quantum Mechanics. We study Bell's paper but also the criticisms that have been put forward against it [30]. There was indeed a long and controversial discussion about the validity of the results obtained by Bell. What was not realized before, however, is that the domains to which the ideas developed by Bell in his article apply best are Cosmology and the scenario of cosmic inflation.
The article is organized as follows. In the next section, Sec. 2, we first review the theory of inflationary cosmological perturbations, at the classical level in Sec. 2.1 and then at the quantum level in Sec. 2.2. In Sec. 2.3, we study in more detail the quantum state in which inflationary perturbations are placed, namely a two-mode squeezed state. In Sec. 2.4, we present some simple considerations that allow us to intuitively understand what a squeezed state is and, in Sec. 2.5, we show that this type of state in fact belongs to a larger class of quantum states known as Gaussian states. In Sec. 3, we study the quantum-to-classical transition of cosmological perturbations. In Sec. 3.1, we investigate whether the fluctuations can be described by a classical stochastic process. In Sec. 3.2, we use tools borrowed from quantum information theory, namely the quantum discord, to address the question of the classicality of cosmological perturbations. In Sec. 4, we come back to the question of the classical limit using Bell's ideas. In Sec. 4.1, we explain why the non-positivity of the Wigner function can be taken as a criterion for the existence of genuine quantum effects. In Sec. 4.2, we discuss this idea in the context of the Wentzel-Kramers-Brillouin (WKB) approximation. In Sec. 4.3, we review in detail the paper by John Bell mentioned earlier and show that it is especially relevant for our purposes. In Sec. 4.4, we present the criticisms that have been made of Bell's letter and, in Sec. 4.5, we comment on these criticisms. In Sec. 4.6, in the light of the previous considerations, we conclude about the status of Bell's letter. In Sec. 4.7, we explain how the whole situation has been clarified by the publication of a theorem due to Revzen. In Sec. 5, based on the previous considerations, we study whether the Bell inequality can be constructed for CMB observables and we briefly discuss our results in Sec. 6. Finally, in Sec. 7, we briefly present our conclusions.

Classical Perturbations
On large scales, the Universe is homogeneous and isotropic (the so-called cosmological principle) and is well described by the Friedmann-Lemaître-Robertson-Walker (FLRW) metric, $ds^2 = a^2(\eta)\left(-d\eta^2 + \delta_{ij}\,dx^i dx^j\right)$, where $\eta$ is the conformal time and $a(\eta)$ the scale factor. The scale factor describes how the Universe expands. We now have an accurate picture of the behavior of $a(\eta)$ from epochs possibly characterized by an energy scale as high as $\sim 10^{15}$ GeV to present times. This cosmic history constitutes the standard model of Cosmology, known as the ΛCDM model [31]. This model is a six parameter model and correctly accounts for all known cosmological observations. The earliest epoch of this ΛCDM model, namely the one which describes the very early Universe, is known as inflation. It is a phase of accelerated expansion and it is believed that it was driven by a scalar field, the "inflaton", the physical nature of which is still unknown.
However, in order to understand the large scale structures in our Universe, such as clusters of galaxies or CMB anisotropies, it is clearly necessary to go beyond the previous description, namely beyond the cosmological principle. A crucial remark is that, in the early Universe, the deviations from homogeneity and isotropy are small (recall that $\delta T/T \sim 10^{-5}$ on the last scattering surface). As a consequence, one can use perturbative methods: as a matter of fact, linear perturbation theory will be sufficient. Therefore, we perturb the FLRW metric tensor introduced before and write [32] $g_{\mu\nu} = g^{\rm FLRW}_{\mu\nu}(\eta) + \delta g_{\mu\nu}(\eta, x) + \cdots$, where $g^{\rm FLRW}_{\mu\nu}(\eta)$ is the metric tensor introduced before, which only depends on time since it describes a homogeneous and isotropic Universe. The perturbed part is $\delta g_{\mu\nu}(\eta, x)$, which is supposed to be small compared to $g^{\rm FLRW}_{\mu\nu}(\eta)$, and which is time but also space dependent. It represents small ripples on top of an expanding Universe, the expansion itself being described by the scale factor $a(\eta)$. Then, exactly in the same way as a vector can be decomposed into a curl-free and a divergence-free component (the Helmholtz theorem that can be found in any textbook on electromagnetism), a rank-two tensor can be decomposed into a scalar, a vector and a tensor part, a result known as the Stewart lemma [32]. If one restricts oneself to scalar perturbations (tensor modes, or primordial gravity waves, can be treated in a similar fashion and vector modes are absent during inflation), then the perturbed metric can be written as The above perturbed metric depends on four functions because we need four functions to write the components of the perturbed metric in terms of scalar functions only (for instance, as can be seen in the above equation, the time-space component of the metric has been written in terms of the scalar $B$ since $\delta g_{0i} = \partial_i B$). Obviously, these four functions are time and space dependent.
However, this description is redundant because of gauge freedom [32]. This means that there are infinitesimal changes of coordinates that can mimic perturbative solutions. These fictitious solutions must be removed and this is accomplished in the gauge-invariant formalism. It consists in working with quantities that are invariant under infinitesimal changes of coordinates. For instance, the gravitational sector can be described by a single quantity, the so-called Bardeen potential defined by $\Phi_{\rm B}(\eta, x) = \phi + [a(B - E')]'/a$, a prime denoting a derivative with respect to conformal time. The changes in the functions $\phi$, $B$ and $E$ caused by a small diffeomorphism exactly compensate each other in the above combination of $\phi$, $B$ and $E$, which is the essence of what a gauge-invariant quantity is. In the same way, the perturbations of matter can be described by a single quantity. For instance, if one studies the perturbations during inflation, then this single quantity is the gauge-invariant fluctuation of the inflaton scalar field, $\delta\varphi^{\rm (gi)}(\eta, x) = \delta\varphi + \varphi'(B - E')$, where the superscript "gi" stands for gauge-invariant. Moreover, since the two above-mentioned quantities are related through the perturbed Einstein equations, it is in fact the whole scalar sector that can be reduced to the study of a single quantity, which can be chosen to be the curvature perturbation, usually denoted $\zeta(\eta, x)$, and defined by where $\mathcal{H} = a'/a$ and $\epsilon_1$ is the first Hubble-flow function given by $\epsilon_1 = 1 - \mathcal{H}'/\mathcal{H}^2$. $\zeta(\eta, x)$ is directly related to the perturbed three-dimensional curvature scalar, hence its name. Physically, $\zeta(\eta, x)$ is a very relevant quantity because, in the real world, it can be measured (and has been measured). Indeed, the temperature anisotropy, $\frac{\delta T}{T}(\theta, \phi) = \sum_{\ell, m} a_{\ell m} Y_{\ell m}(\theta, \phi)$, where $\theta$ and $\phi$ define a direction in the sky and $Y_{\ell m}$ are spherical harmonics, is an observable and has now been measured by many different experiments. The first one was the COBE satellite in 1992 [33].
The most recent and most accurate observation is by the European Space Agency (ESA) Planck satellite [17][18][19][20][21], see Fig. 1. The so-called Sachs-Wolfe effect [34] relates the presence of small inhomogeneities, living in three-dimensional space and described by curvature perturbations $\zeta(\eta, x)$, to the temperature anisotropy of Fig. 1, namely where we have defined $z \equiv a M_{\rm Pl}\sqrt{2\epsilon_1}$. One recognizes the equation of motion of an oscillator, $(z\zeta_k)'' + \omega^2(\eta)\, z\zeta_k = 0$, whose fundamental frequency, $\omega^2 = k^2 - z''/z$, is time-dependent. In other words, we deal with a parametric oscillator: a classical analogy would be a pendulum the length of which can change in time. Here the time dependence is fixed by $z$, which is only determined by the dynamics of the expansion since $z$ depends on the scale factor and its derivatives. The solutions to the above equation are easily analyzed. In an inflationary Universe, the Hubble radius $H^{-1}$ is constant while the wavelength of a given Fourier mode, which grows proportionally to the scale factor, is stretched beyond the Hubble radius. Therefore, initially, $k^2 \gg z''/z$ (small scales limit) and the quantity $z\zeta_k$ oscillates where $A_k$ and $B_k$ are two arbitrary integration constants. The reason for this behavior is that the wavelength of the mode is so small that it does not feel the curvature of spacetime and behaves as if it lived in flat spacetime. In principle, the two constants $A_k$ and $B_k$ are fixed by the initial conditions. At the classical level, it is simply unclear what they should be. Then, as time goes on, the mode exits the Hubble radius and the regime $k^2 \ll z''/z$ (large scales limit) becomes relevant. In that case, the solution can be written where $C_k$ and $D_k$ are two integration constants. The first branch, proportional to $C_k$, is the growing mode and the second one is the decaying mode.
This can easily be verified if, for instance, one considers scale factors of the form $a(\eta) \propto (-\eta)^{1+\beta}$, recalling that inflation corresponds to $\beta \lesssim -2$ and that the conformal time during inflation is negative and tends to zero (from negative values) as inflation proceeds. The above solution shows that the growing mode is, on large scales, constant, namely $\zeta_k \simeq C_k$, which means that the curvature perturbation has the advantage (among others) of being conserved on large scales.
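The oscillation on small scales and the freezing of $\zeta_k$ on large scales can be illustrated with a minimal sketch. It assumes an exact de Sitter background, for which $z''/z = 2/\eta^2$ and the mode equation has the closed-form solutions $e^{\mp ik\eta}(1 \mp i/k\eta)$, and it adopts units in which $z = -1/\eta$. The function names and the values chosen for $A_k$ and $B_k$ are illustrative assumptions, as is the choice of attaching $A_k$ to the $e^{-ik\eta}$ branch.

```python
import cmath

def v_mode(k, eta, A, B):
    """General solution of v'' + (k^2 - 2/eta^2) v = 0 for v = z*zeta_k (eta < 0)."""
    f = cmath.exp(-1j * k * eta) * (1 - 1j / (k * eta))
    return A * f + B * f.conjugate()

def zeta_mode(k, eta, A, B):
    """Curvature perturbation zeta_k = v / z, with z = -1/eta (assumed units)."""
    return -eta * v_mode(k, eta, A, B)

# Arbitrary (hypothetical) classical initial data: nothing fixes A_k and B_k here.
k, A, B = 1.0, 0.7, 0.2 + 0.1j

for eta in (-100.0, -10.0, -1.0, -1e-2, -1e-4):
    v = abs(v_mode(k, eta, A, B))
    zeta = abs(zeta_mode(k, eta, A, B))
    print(f"k*eta = {k * eta:10.4f}   |z zeta_k| = {v:12.4f}   |zeta_k| = {zeta:.6f}")
```

For $|k\eta| \gg 1$ the combination $z\zeta_k$ stays bounded and oscillates, while for $|k\eta| \ll 1$ the printed $|\zeta_k|$ stops changing, which is the conservation on large scales described above. In this sketch the frozen value is $i(A_k - B_k)/k$, so it still depends on the unspecified initial data.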
Usually, the properties of CMB anisotropies are characterized by the correlation functions of $\delta T(e)/T$ which are, thanks to Eq. (4), directly related to the correlation functions of the curvature perturbation at the end of inflation. The two-point correlation function of $\zeta(\eta, x)$ reads where the brackets are supposed to represent an average over some classical distribution such that . At the end of inflation, $\zeta_k(\eta_{\rm end}) \simeq C_k$ since, as explained before, the decaying mode can be neglected. But one needs to specify the scale dependence of $C_k$. This can be done by matching the large scale regime to the small scale regime which, in practice, amounts to expressing $C_k$ in terms of $A_k$ and $B_k$. The problem is thus moved to determining the scale dependence of the coefficients $A_k$ and $B_k$. At the classical level, as mentioned before, there is no clear, well-motivated way to do this.

Quantum Perturbations
The above considerations, therefore, leave one important question unanswered: what is the origin of these perturbations? The beauty of inflation is that it can also provide an answer to this important question: inflation says that the primordial perturbations originate from the vacuum quantum fluctuations of the inflaton and gravitational fields at the beginning of inflation. This means that all structures in our Universe are nothing but quantum fluctuations stretched over cosmological distances by the expansion of the Universe and amplified by gravitational instability.
At the technical level, this means that, now, the perturbed metric $\delta g_{\mu\nu}$ is viewed as a quantum operator, $\delta\hat{g}_{\mu\nu}$, satisfying the perturbed Einstein equations viewed as equations for quantum operators, $\delta\hat{G}_{\mu\nu} = M_{\rm Pl}^{-2}\,\delta\hat{T}_{\mu\nu}$. Even more concretely, in this formulation, the curvature perturbation can be viewed as a (test) quantum scalar field living in the expanding spacetime and can be written as where $\hat{c}_k(\eta)$ and $\hat{c}^\dagger_k(\eta)$ are the annihilation and creation operators satisfying the usual equal time commutation relations, $[\hat{c}_k(\eta), \hat{c}^\dagger_p(\eta)] = \delta(k - p)$. Curvature perturbations are then related to the creation and annihilation operators through We notice that $\hat{\zeta}_k$ and $\hat{\zeta}'_k$ mix creation and annihilation operators of momentum $k$ and $-k$. The evolution of $\hat{\zeta}(\eta, x)$ is controlled by the following Hamiltonian This Hamiltonian comes from a second order expansion in $\zeta$ of the action of GR plus a scalar field (since inflation is driven by this type of field). This action (and, therefore, the corresponding Hamiltonian) remains quadratic in $\zeta$ and higher order terms are ignored because the perturbations are small. As already mentioned, this is well established at the time of recombination, where the deviations are measured to be of the order $10^{-5}$. Since the fluctuations grow by gravitational instability, they were certainly even smaller during inflation. The Hamiltonian (11) is made of two pieces. The first part describes the free Hamiltonian of a collection of harmonic oscillators with fundamental frequency $\omega = k$ (as appropriate for massless excitations). The second piece describes the interaction of the quantum field $\hat{\zeta}(\eta, x)$ with the classical background characterized by the scale factor $a(\eta)$. This term is responsible for particle creation. If space-time is static (namely Minkowski space-time), then $a' = 0$, the "time-dependent" coupling constant $z'/z$ vanishes and no particles are created.
Figure 1: CMB temperature anisotropy measured by the Planck satellite [17][18][19][20][21]. It represents the most exquisite measurement of the CMB ever done.
Moreover, the momentum structure of this second piece, in $\hat{c}_k\hat{c}_{-k}$ or $\hat{c}^\dagger_{-k}\hat{c}^\dagger_k$, indicates that particles are created in pairs with opposite momenta (in accordance with momentum conservation). At this point, one should clarify the following. In quantum field theory, a quadratic action (or Hamiltonian) usually describes free fields while interactions are described by higher order terms. There is, however, one exception, namely the case where a quantum field interacts with a classical source. It is still described by a quadratic action, the presence of the interaction manifesting itself only by giving a time dependence to the effective frequency of the field oscillators. In other words, a free field is equivalent to a collection of harmonic oscillators while a field in interaction with a classical source is equivalent to a collection of parametric oscillators. The classic example is the Schwinger effect, where a fermionic field interacts with a classical electric field [16,36]. The case of a scalar field in a cosmological background is another example.
The Heisenberg equation, $i\,d\hat{c}_k/d\eta = [\hat{c}_k, \hat{H}]$, allows us to calculate the equation of motion of the operator $\hat{\zeta}_k(\eta)$. This leads to from which one deduces that that is to say Eq. (5) but written at the operator level. A fundamental assumption of inflation is that the system starts in the vacuum state $|0\rangle$ defined by the condition $\hat{c}_k(\eta_{\rm ini})|0\rangle = 0$. In order to see what this implies for the field $\hat{\zeta}_k(\eta)$, let us first solve the time dependence of the creation and annihilation operators, see Eq. (12). This can be done by means of a Bogoliubov transformation, namely where the functions $u_k(\eta)$ and $v_k(\eta)$ obey and, by definition, have initial conditions $u_k(\eta_{\rm ini}) = 1$ and $v_k(\eta_{\rm ini}) = 0$. Let us notice that $u_k$ and $v_k$ depend on the modulus of the wavenumber $k$ only. The Bogoliubov transformation allows us to re-express the field expansion as It is easy to verify from Eqs. (15) that the function namely the same equation as $z\zeta_k$. Recalling the initial conditions for $u_k$ and $v_k$, this implies that the mode function in (16) in the small scales limit. In other words, the assumption that the fluctuations are quantum and start in the vacuum state has completely fixed the initial conditions. As a consequence, in Eq. (6), one should choose Then, the initial conditions being now known, the calculation of the power spectrum can be performed explicitly. Indeed, we no longer face the issues discussed after Eq. (8): $C_k$ can be related to $A_k$ and $B_k$ but, as just explained, these are now fully determined. The calculation leads to an almost scale invariant power spectrum, $P_\zeta(k) = k^3|\zeta_k(\eta_{\rm end})|^2/(2\pi^2) \simeq A_{\rm S}\, k^{n_{\rm S}-1}$, where $n_{\rm S} \simeq 1$ plus small corrections that depend on the model of inflation considered. Scale invariance means that, if $n_{\rm S} = 1$, then $P_\zeta(k)$ no longer depends on $k$. Since, according to inflation, $n_{\rm S} \simeq 1$ but $n_{\rm S} \neq 1$, we have in fact almost scale invariance.
Moreover, measuring the small deviations from scale invariance allows us to constrain inflation since the corrections, as just mentioned above, depend on the scenario of inflation. The fact that $n_{\rm S} \simeq 1$ crucially rests on the choice $A_k \propto 1/\sqrt{2k}$. Had we had another initial scale dependence for $A_k$ and $B_k$, the power spectrum would have been completely different and, generically, far from scale invariance. Remarkably, according to the most recent data obtained by the Planck satellite [20,21], everything is precisely consistent with a power spectrum of the form $P_\zeta(k) = A_{\rm S}\, k^{n_{\rm S}-1}$, with $\ln(10^{10} A_{\rm S}) = 3.044 \pm 0.014$ and $n_{\rm S} = 0.9649 \pm 0.0042$. The value of the spectral index $n_{\rm S}$ is not predicted by inflation in general (more precisely, it is predicted once a model of inflation is given, but the correct model of inflation is not known). It was known for a long time that $n_{\rm S}$ should be around one but it is only very recently that Planck demonstrated that $n_{\rm S} \neq 1$ at a significant statistical level (namely more than $5\sigma$).
The Planck results are, therefore, one of the main reasons to trust inflation and its mechanism of structure formation, according to which structures in the Universe are nothing but quantum fluctuations. This fascinating conclusion is now well supported by astrophysical data.
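The link between the vacuum initial condition and scale invariance can be checked with a short sketch. As a simplifying assumption it uses an exact de Sitter background (so that it returns $n_{\rm S} = 1$ exactly, without the small model-dependent corrections), units in which $z = -1/\eta$, and the convention that $A_k$ multiplies the $e^{-ik\eta}$ branch; the function names and the alternative, $k$-independent choice of $A_k$ are hypothetical.

```python
import cmath
import math

def zeta_mode(k, eta, A, B=0.0):
    """zeta_k = v/z in de Sitter, with v = A f + B f*, f = e^{-ik eta}(1 - i/(k eta))."""
    f = cmath.exp(-1j * k * eta) * (1 - 1j / (k * eta))
    return -eta * (A * f + B * f.conjugate())

def power_spectrum(k, eta_end, A, B=0.0):
    """P_zeta(k) = k^3 |zeta_k(eta_end)|^2 / (2 pi^2)."""
    return k**3 * abs(zeta_mode(k, eta_end, A, B)) ** 2 / (2 * math.pi**2)

eta_end = -1e-8  # late time: all modes considered are far outside the Hubble radius

for k in (0.1, 1.0, 10.0):
    vacuum = power_spectrum(k, eta_end, A=1 / math.sqrt(2 * k))  # A_k = 1/sqrt(2k), B_k = 0
    other = power_spectrum(k, eta_end, A=1.0)                    # hypothetical k-independent A_k
    print(f"k = {k:5.1f}   vacuum: {vacuum:.6f}   k-independent A_k: {other:.6f}")
```

With the vacuum choice the printed spectrum is the same for all three wavenumbers, while the alternative choice gives a spectrum growing linearly with $k$, in line with the statement that scale invariance rests on $A_k \propto 1/\sqrt{2k}$.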

The Quantum State of Inflationary Perturbations
Before discussing the properties of the state of cosmological perturbations, let us make the following remark. We saw in Eqs. (10) that the definition of $\hat{\zeta}_k$ mixes creation and annihilation operators of modes $k$ and $-k$. This is because, as already mentioned, particles are created in pairs with opposite momenta. But this is uncommon from a quantum information point of view, where one likes to view the total system as a collection of subsystems associated with the different modes. In other words, if $\mathcal{E}$ is the Hilbert space of the full system and $\mathcal{E}_k$ the Hilbert space associated with the mode $k$, we would like to have $\mathcal{E} = \bigotimes_k \mathcal{E}_k$. For this reason, we now introduce, for a fixed mode $k$, the "position" $\hat{q}_k$ and the "momentum" $\hat{\pi}_k$ defined by These two operators are Hermitian. The relation between $\hat{\zeta}_k$ and the position and momentum operators can be easily derived and reads In the following, we consider a description of the system based on these operators since we want to make use of the formalism of quantum information. In fact, the above definitions allow us to describe cosmological perturbations as a continuous variable system. A continuous variable system is a system that is described by Hermitian operators satisfying canonical commutation relations, $[\hat{q}_k, \hat{\pi}_{k'}] = i\delta(k - k')$. The number of degrees of freedom is infinite and labeled by the wavenumber $k$.
Another idea from Quantum Information Theory that will play an important role in the following is that of a bipartite system. Indeed, for a system with more than one degree of freedom, the set of degrees of freedom can be split into two subsets. This defines a partition and allows us to see the whole system as a bipartite system. Then, one can study the nature of the correlations between the two subsystems, which is a way to assess the "quantumness" of the whole system. An important point is that one can define several partitions for the same system. As an introductory example, let us consider a four dimensional system with degrees of freedom $\hat{q}_i$ and vector $\hat{r} = (\hat{q}_i, \hat{\pi}_i)^{\rm T}$, $i = 1, \cdots, 4$. We split the system into two subsystems, A and B, and define a partition such that $\mathcal{E} = \mathcal{E}_A \otimes \mathcal{E}_B$. For instance, we can choose the subsystem A to contain $\hat{q}_1$ and $\hat{q}_2$ and, therefore, to be described by $\hat{r}_A = (\hat{q}_1, \hat{q}_2, \hat{\pi}_1, \hat{\pi}_2)^{\rm T}$, while the subsystem B contains $\hat{q}_3$ and $\hat{q}_4$ and is described by $\hat{r}_B = (\hat{q}_3, \hat{q}_4, \hat{\pi}_3, \hat{\pi}_4)^{\rm T}$. Then, we define the vector $\hat{R}$ by $\hat{R} = (\hat{r}_A, \hat{r}_B)^{\rm T}$. This definition of $\hat{R}$, namely the way we list its components, is, implicitly, a definition of a partition. In Cosmology, because of pair creation, the partition $\mathcal{E} = \bigotimes_{k \in \mathbb{R}^{3+}} \mathcal{E}_k \otimes \mathcal{E}_{-k}$ is, at least at first sight, very natural and, in the following, we work with it.
After these preliminary remarks, let us come back to the quantum state of the perturbations. As mentioned earlier, in Cosmology, one starts from the vacuum state $\bigotimes_k |0_k, 0_{-k}\rangle$ at time $\eta = \eta_{\rm ini}$, that is to say $\hat{c}_k(\eta_{\rm ini})|0\rangle = 0$ and $\hat{c}_{-k}(\eta_{\rm ini})|0\rangle = 0$. Then, by solving the Schrödinger equation, with the Hamiltonian given by Eq. (11), one can show that this state evolves into a two-mode squeezed state given by where $|n_k\rangle$ is an eigenvector of the particle number operator in the mode $k$. $r_k$ and $\varphi_k$ are the squeezing parameter and squeezing angle, respectively. They are time dependent functions controlled by the following equations The fact that we encounter a squeezed state should not come as a surprise. It is indeed well known that, while the quantization of a harmonic oscillator naturally leads to coherent states, the quantization of a parametric oscillator leads to squeezed states. Since we saw before that, because of its interaction with the dynamical scale factor, the field $\hat{\zeta}_k$ acquires a time-dependent fundamental frequency and, therefore, can be viewed as a parametric oscillator, the whole picture appears to be consistent.

Physical Interpretation
Let us now come back to the quantum state (21) and try to gain physical intuition about it. The corresponding wavefunction can be expressed as where the functions $A(r_k, \varphi_k)$ and $B(r_k, \varphi_k)$ are given by This explicit form for the wavefunction allows us to understand what, physically, a two-mode squeezed state means [37]. Before treating the case of a two-mode squeezed state, however, let us recall some well-known facts. Let us consider one mode $k$. The corresponding vacuum state is a coherent state, that is to say, a state with wavefunction in position space given by $\Psi_0(q_k) = \pi^{-1/4} e^{-q_k^2/2}$, which is nothing but the wavefunction for the ground state of the harmonic oscillator in non-relativistic quantum mechanics. The same state, written in the momentum basis, can be expressed as $\tilde{\Psi}_0(\pi_k) = \pi^{-1/4} e^{-\pi_k^2/2}$. From the knowledge of the wavefunction, one can calculate the dispersion in field amplitude and momentum. This gives $\Delta q_k^2 = \Delta\pi_k^2 = 1/2$, which saturates the Heisenberg inequality, namely $\Delta q_k^2\, \Delta\pi_k^2 = 1/4$. Let us now consider a one-mode squeezed state. Its wavefunction, in the field amplitude and conjugate momentum bases, can be written as $\Psi_R(q_k) = \sqrt{R}\,\pi^{-1/4} e^{-R^2 q_k^2/2}$ and $\tilde{\Psi}_R(\pi_k) = R^{-1/2}\pi^{-1/4} e^{-\pi_k^2/(2R^2)}$, where we have introduced a new parameter, $R$. The physical interpretation of this parameter can be found by calculating again the dispersion in position and momentum. One finds $\Delta q_k^2 = 1/(2R^2)$ and $\Delta\pi_k^2 = R^2/2$. Although they still saturate the Heisenberg inequality, the two dispersions are no longer equal. If $R > 1$, then the dispersion in field amplitude is smaller than the dispersion in conjugate momentum (and, interestingly enough, smaller than that of the vacuum state). We say that the state is squeezed in position or field amplitude, hence its name. Of course, since one has to satisfy the Heisenberg inequality, the dispersion in momentum is larger. If $R < 1$, we have the opposite situation and the state is squeezed in momentum.
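The dispersions quoted above can be verified numerically with a minimal sketch. It assumes the normalized Gaussian form $\Psi_R(q) = \sqrt{R}\,\pi^{-1/4} e^{-R^2 q^2/2}$ for the one-mode squeezed state (the form consistent with the quoted dispersions), whose momentum-space density is the same Gaussian with $R \to 1/R$. The function names and the integration grid are illustrative choices.

```python
import math

def squeezed_density(x, R):
    """|Psi_R(x)|^2 = R/sqrt(pi) * exp(-R^2 x^2) (assumed one-mode squeezed state)."""
    return R / math.sqrt(math.pi) * math.exp(-((R * x) ** 2))

def variance(R, n=20001, width=10.0):
    """Second moment of squeezed_density by a plain Riemann sum on [-width/R, width/R]."""
    half = width / R
    dx = 2 * half / (n - 1)
    return sum((-half + i * dx) ** 2 * squeezed_density(-half + i * dx, R)
               for i in range(n)) * dx

def dispersions(R):
    """Return (Delta q^2, Delta pi^2); the momentum density is the R -> 1/R Gaussian."""
    return variance(R), variance(1.0 / R)

for R in (1.0, 2.0, 0.5):
    dq2, dpi2 = dispersions(R)
    print(f"R = {R:3.1f}   dq^2 = {dq2:.6f}   dpi^2 = {dpi2:.6f}   product = {dq2 * dpi2:.6f}")
```

The product of the two dispersions stays at $1/4$ for every $R$, while the individual dispersions move away from the vacuum value $1/2$ in opposite directions.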
After these preliminary comments, we now come to the two-mode squeezed state. As the name of the state indicates, we must now consider two modes and, of course, we choose $k$ and $-k$. Following the tradition in quantum information theory, we can also call mode $k$ "Alice" and mode $-k$ "Bob". In the field amplitude basis, the vacuum state of this bipartite system can be written as $\Psi_0(q_k, q_{-k}) = \pi^{-1/2} e^{-(q_k^2 + q_{-k}^2)/2}$. We see that the positions of Alice and Bob are uncorrelated. Then, in a way which is exactly similar to what has already been done above, we introduce the following state where the squeezing factor $R$ appears again and where the two functions $A(R)$ and $B(R)$ are defined by The state (28) is, by definition, a two-mode squeezed state. Clearly, since Eqs. (28) and (23) are similar, this means that the state (23) is also a two-mode squeezed state. In Eq. (28), we have ignored the squeezing angle and, therefore, we should identify the function $A(R)$ with $A(r_k, 0) = -(e^{2r_k} + e^{-2r_k})/4$, which immediately leads to $r_k = \ln R$. One checks that this is consistent since $B(r_k, 0) = -(e^{2r_k} - e^{-2r_k})/2$ is indeed equal to $B(R)$ and the normalization factor $\cosh r_k \sqrt{1 - e^{-4i\varphi_k}\tanh^2 r_k}$ goes to one when the squeezing angle is zero. We notice that the positions of Alice and Bob are now correlated and that these correlations are genuinely quantum since the state (28) is an entangled state, namely it cannot be written as the product of a wavefunction for Alice and a wavefunction for Bob. This means that the state (23) implies the existence of genuine quantum correlations between the field amplitudes $q_k$ and $q_{-k}$. It is also interesting to remark that the two-mode squeezed state does not lead to squeezing for Alice or Bob. Indeed, it is easy to verify that $\Delta q_k^2 = \Delta q_{-k}^2 = (R^2 + R^{-2})/4 = \cosh(2r_k)/2$. These dispersions are always larger than the dispersions obtained for the vacuum state. This is related to the fact that, if one traces out, say, Alice's degree of freedom, one is left with a state for Bob that is not a one-mode squeezed state but a thermal state.
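The absence of squeezing for Alice or Bob taken separately can be made explicit with a small sketch. It assumes, at zero squeezing angle, a Gaussian wavefunction of the form $\Psi \propto \exp[A\,(q_k^2 + q_{-k}^2) + B\, q_k q_{-k}]$ with the values $A(r_k, 0)$ and $B(r_k, 0)$ quoted above. The sign in front of the cross term is a convention assumed here; it only affects the sign of the correlation, not the single-mode dispersions. The function name is hypothetical.

```python
import math

def tmss_position_moments(r):
    """Single-mode variance and cross-covariance of (q_k, q_-k) at zero squeezing angle.

    Assumes |Psi|^2 ∝ exp(-x^T M x / 2) with x = (q_k, q_-k), which follows from
    Psi ∝ exp[A (q_k^2 + q_-k^2) + B q_k q_-k] with real A and B.
    """
    A = -(math.exp(2 * r) + math.exp(-2 * r)) / 4
    B = -(math.exp(2 * r) - math.exp(-2 * r)) / 2
    m_diag, m_off = -4 * A, -2 * B          # entries of the 2x2 matrix M
    det = m_diag**2 - m_off**2              # equals 4 for any r
    variance = m_diag / det                 # (M^{-1})_{11} = (M^{-1})_{22}
    covariance = -m_off / det               # (M^{-1})_{12}
    return variance, covariance

for r in (0.0, 0.5, 1.0, 2.0):
    var, cov = tmss_position_moments(r)
    print(f"r = {r:3.1f}   var = {var:8.4f}   cosh(2r)/2 = {math.cosh(2 * r) / 2:8.4f}"
          f"   |correlation| = {abs(cov) / var:.4f}")
```

The single-mode variance equals $\cosh(2r_k)/2$, never below the vacuum value $1/2$, while the magnitude of the correlation coefficient between the two amplitudes, $\tanh(2r_k)$, approaches one for strong squeezing.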
The two-mode squeezed state that is present in Cosmology is quite peculiar: it is probably the strongest squeezed state ever produced in Nature. The squeezing factor is often expressed in decibels with the help of the following definition, see also Eqs. (14) and (15) of Ref. [38]: $-10\log_{10}\left(e^{-2r}\right)\,{\rm dB} = \frac{20\,r}{\ln 10}\,{\rm dB}$.
In Cosmology, one can achieve $r \simeq 50$, which means a squeezing of $\simeq 434$ dB, to be compared with $\sim 15$ dB which is the world record in the lab, see Refs. [39] and [40].
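The conversion between the squeezing parameter and decibels is simple enough to check directly. The sketch below evaluates the definition $-10\log_{10} e^{-2r}$ and compares it with the closed form $20\,r/\ln 10 \approx 8.69\,r$; the function names are illustrative.

```python
import math

def squeezing_db(r):
    """Squeezing in decibels from the definition -10 log10(exp(-2 r))."""
    return -10 * math.log10(math.exp(-2 * r))

def squeezing_parameter(db):
    """Inverse relation: r = db * ln(10) / 20."""
    return db * math.log(10) / 20

print(f"r = 50   ->  {squeezing_db(50):.1f} dB  (closed form: {20 * 50 / math.log(10):.1f} dB)")
print(f"15 dB    ->  r = {squeezing_parameter(15):.2f}")
```

In these units, a 15 dB laboratory record corresponds to a squeezing parameter of order $r \approx 1.7$, which gives a sense of the gap with the cosmological value.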

Gaussian States
Another interesting property of a two mode squeezed state is that it belongs to a wider class of quantum states called "Gaussian states". We indeed check that the wavefunction (23) is Gaussian. Gaussian states play a fundamental role in quantum mechanics. They arise in many different branches of Physics such as Laser Physics, Quantum Field Theory (in curved spacetime or not), Solid State Physics or Cosmology and they are ubiquitous in Quantum Information Theory. Gaussian states naturally occur as ground (coherent or squeezed) or thermal equilibrium states of any physical quantum system described by a quadratic Hamiltonian. Moreover, with existing technologies, they are easily manipulable in the lab.
At the technical level, a Gaussian state is a state the characteristic function of which is a Gaussian. The characteristic function $\chi(\xi)$ is defined by where $\hat{\rho}$ is the density matrix of the quantum state and where $\hat{W}(\xi)$ is the Weyl operator, which can be expressed as where the vector $\hat{R}$ has already been introduced before and is defined by (we have slightly modified the definition by introducing a $k^{1/2}$ in front of positions and a $k^{-1/2}$ in front of momenta in order to work with dimensionless quantities). For a two-mode squeezed state, the characteristic function is indeed a Gaussian (justifying the fact that it belongs to the class of Gaussian states) where $\gamma$ is the covariance matrix, given by The covariance matrix is related to the two-point correlation functions of the position and/or momentum Another, equivalent, way to define a Gaussian state is the following: a Gaussian state is a state which has a Gaussian Wigner function. For a state with density matrix $\hat{\rho}$, the Wigner function is defined by Physically, the Wigner function is the quantum generalization of the classical distribution in phase space. It is related to the characteristic function introduced before by the formula Using this result and the characteristic function of a Gaussian state, see Eq. (33), it is easy to demonstrate that This shows that the Wigner function is also a Gaussian. If one uses the expression (34) of the covariance matrix, then the explicit expression of the Wigner function of a two-mode squeezed state reads

The Quantum-to-Classical Transition of the Cosmological Perturbations
From the above considerations, why curvature perturbations are viewed as genuinely quantum should now be clear. However, when CMB anisotropies are analyzed by astronomers, curvature perturbations are treated classically without any reference to their quantum origin. Is this just wrong or can we justify this approach by claiming that some sort of quantum-to-classical transition took place in the early Universe? This question is reminiscent of the question of classical limit in Quantum Mechanics. It is known that this problem is subtle and we will argue that, in the context of Cosmology, it is even more subtle than in ordinary situations.

Stochastic Description?
The fact that a two-mode squeezed state is Gaussian implies, as discussed before, that its Wigner function is positive definite. In fact, one can show that the only states for which this is the case are precisely the Gaussian states [41]. This leads to the idea that the Wigner function could be used as a classical distribution over phase space. If so, this would mean that there exists a classical, stochastic, description of the properties of the system. This would indicate that a quantum-to-classical transition has taken place.

At the technical level, the previous argument can be formulated as follows. Let us consider a function $O$ of position and momentum, namely $O(q_k,\pi_k,q_{-k},\pi_{-k})$. According to the above considerations, one would define the classical average of $O$ as
$$\langle O\rangle_{\rm stocha} = \int O(q_k,\pi_k,q_{-k},\pi_{-k})\,W(q_k,\pi_k,q_{-k},\pi_{-k})\,dq_k\,d\pi_k\,dq_{-k}\,d\pi_{-k}.$$
Using the correspondence principle, one can define the corresponding operator $\hat{O}$ as $\hat{O} = O(\hat{q}_k,\hat{\pi}_k,\hat{q}_{-k},\hat{\pi}_{-k})$. If the system is classical, then one must have
$$\langle\hat{O}\rangle = \langle O\rangle_{\rm stocha}.$$
We now establish under which conditions the above equation holds. Let us first define the Weyl transform of the operator $\hat{O}$ by
$$\tilde{O}(q_k,\pi_k,q_{-k},\pi_{-k}) = \int dx\,dy\; e^{-i\pi_k x - i\pi_{-k}y}\left\langle q_k+\frac{x}{2}, q_{-k}+\frac{y}{2}\right|\hat{O}\left|q_k-\frac{x}{2}, q_{-k}-\frac{y}{2}\right\rangle .$$
It is of course reminiscent of the Wigner function. In fact, up to a factor $(2\pi)^{-2}$, the Wigner function is the Weyl transform of the density matrix, namely $\tilde{\rho} = (2\pi)^2 W$. The Weyl transform associates a function in phase space to any operator in Hilbert space. The fundamental property of the Weyl transform is that, for two operators $\hat{A}$ and $\hat{B}$, one has
$$\mathrm{Tr}\left(\hat{A}\hat{B}\right) = \frac{1}{(2\pi)^2}\int \tilde{A}\,\tilde{B}\;dq_k\,d\pi_k\,dq_{-k}\,d\pi_{-k},$$
where the seemingly awkward factor $(2\pi)^2$ comes from the fact that the phase space we consider here is four-dimensional. It follows that
$$\langle\hat{O}\rangle = \mathrm{Tr}\left(\hat{\rho}\,\hat{O}\right) = \int \tilde{O}\,W\;dq_k\,d\pi_k\,dq_{-k}\,d\pi_{-k}.$$
Comparing the above formula to Eq. (40), we see that quantum and stochastic averages coincide if
$$\tilde{O} = O.$$
Therefore, whether or not a quantum-to-classical transition takes place boils down to the above equation and to whether it holds in general. For instance, it is easy to show that it is always valid for any power of the operator $\hat{q}_k$, namely $\widetilde{\hat{q}_k^{\,m}} = q_k^m$. In the same fashion, one also has $\widetilde{\hat{\pi}_k^{\,m}} = \pi_k^m$. However, $\widetilde{\hat{q}_k\hat{\pi}_k} = q_k\pi_k + i/2$. For quadratic combinations, one can summarize the previous results as
$$\widetilde{\hat{R}_a\hat{R}_b} = R_aR_b + \frac{i}{2}J_{ab},$$
where the matrix $J$ has been defined in Eq. (35). Moreover, any combination of operators of mode $k$ and mode $-k$ has a trivial Weyl transform, for instance $\widetilde{\hat{q}_k^{\,m_1}\hat{\pi}_{-k}^{\,m_2}} = q_k^{m_1}\pi_{-k}^{m_2}$.

Using Eqs. (20), one can now calculate $\widetilde{\hat{\zeta}_k^{\,2}}$ in order to see whether a stochastic calculation of the curvature perturbations power spectrum is possible. Expanding $\hat{\zeta}_k^{\,2}$ in terms of the operators $\hat{q}_{\pm k}$ and $\hat{\pi}_{\pm k}$, one finds that only four of the terms in the resulting expression have a non-trivial Weyl transform. However, if $\hat{q}_k\hat{\pi}_k$ and $\hat{\pi}_k\hat{q}_k$ have, separately, a non-trivial Weyl transform, the sum of these two terms actually has a trivial Weyl transform because the additional factors $i/2$ originating from Eq. (46) cancel out. Therefore, we conclude that $\widetilde{\hat{\zeta}_k^{\,2}} = \zeta_k^2$. As a consequence, the quantum two-point correlation function of curvature perturbations, $\langle\hat{\zeta}(\eta,x)\hat{\zeta}(\eta,y)\rangle$, can be exactly reproduced in a classical, stochastic, approach. One can also show that this is the case for $\langle\hat{\zeta}(\eta,x)\hat{\zeta}'(\eta,y) + \hat{\zeta}'(\eta,x)\hat{\zeta}(\eta,y)\rangle$ or $\langle\hat{\zeta}'(\eta,x)\hat{\zeta}'(\eta,y)\rangle$. Notice that this is true whatever the state of the system. Of course, in order for Eq. (40) to be not only mathematically correct but also physically meaningful, the distribution $W$ has to be positive definite and this is the case only for Gaussian states. The identification of the quantum and stochastic correlators being valid for any state, it is obviously valid for any Gaussian state, in particular for any values of the squeezing parameter and angle. It is sometimes claimed that this identification is possible only in the limit $r_k \to +\infty$ and we see that this is not the case.

However, for higher order correlators, the story is different. Rather than a long demonstration, let us take an example. One can show that
$$\widetilde{\hat{q}_k^{\,2}\hat{\pi}_k^{\,2} + \hat{\pi}_k^{\,2}\hat{q}_k^{\,2}} = 2q_k^2\pi_k^2 - 1.$$
We see that this particular correlator has a non-trivial Weyl transform. However, if one uses Eqs. (20) to calculate the higher order correlation functions of curvature perturbations, one finds
$$\widetilde{\hat{\zeta}_k^{\,3}} = \zeta_k^3, \qquad \widetilde{\hat{\zeta}_k^{\,4}} = \zeta_k^4.$$
Some combinations participating in $\hat{\zeta}_k^{\,3}$ and $\hat{\zeta}_k^{\,4}$ have a non-trivial Weyl transform but these extra contributions all cancel out to produce the above result. Again, this result is obtained without the help of the large squeezing limit. In fact, one can even show that it is valid for any power of $\hat{\zeta}_k$, namely $\widetilde{\hat{\zeta}_k^{\,n}} = \zeta_k^n$, see Eq. (13.5) in Ref. [42].
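The two non-trivial Weyl transforms quoted above can be checked numerically. The sketch below is only an illustration under stated assumptions: it uses a single mode with $\hbar = 1$, represents $\hat{q}$ and $\hat{p}$ by matrices in a truncated number basis (the truncation size `N` and all helper names are hypothetical choices), and relies on the standard fact that the operator whose Weyl transform is a monomial is the fully symmetrized product. Only the low-lying block of each matrix is compared, since the truncation spoils the last few rows and columns.

```python
import math

N = 24  # Fock-space truncation (assumption; only the low-lying block is trusted)

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]

def prod(*ops):
    """Ordered product of several operators."""
    out = ops[0]
    for op in ops[1:]:
        out = matmul(out, op)
    return out

def combine(terms):
    """Linear combination sum_i c_i * M_i, given as [(c_i, M_i), ...]."""
    return [[sum(c * m[i][j] for c, m in terms) for j in range(N)]
            for i in range(N)]

def max_dev(m, c, block=N - 6):
    """Largest deviation of m from c * identity inside the trusted block."""
    return max(abs(m[i][j] - (c if i == j else 0.0))
               for i in range(block) for j in range(block))

# Annihilation operator a|n> = sqrt(n)|n-1> and its adjoint.
a = [[math.sqrt(j) if j == i + 1 else 0.0 for j in range(N)] for i in range(N)]
adag = [[a[j][i] for j in range(N)] for i in range(N)]

s = 1.0 / math.sqrt(2.0)
q = combine([(s, a), (s, adag)])             # q = (a + a^dagger)/sqrt(2)
p = combine([(-1j * s, a), (1j * s, adag)])  # p = (a - a^dagger)/(i sqrt(2))

# (i) q p minus its symmetrized form (q p + p q)/2 should equal i/2.
dev_quadratic = max_dev(combine([(0.5, prod(q, p)), (-0.5, prod(p, q))]), 0.5j)

# (ii) q^2 p^2 + p^2 q^2 minus twice the symmetrized q^2 p^2 should equal -1.
orderings = [(q, q, p, p), (q, p, q, p), (q, p, p, q),
             (p, q, q, p), (p, q, p, q), (p, p, q, q)]
sym = combine([(1.0 / 6.0, prod(*o)) for o in orderings])
lhs = combine([(1.0, prod(q, q, p, p)), (1.0, prod(p, p, q, q))])
dev_quartic = max_dev(combine([(1.0, lhs), (-2.0, sym)]), -1.0)

print(dev_quadratic, dev_quartic)
```

Both deviations should come out at the level of floating-point round-off, which is the matrix counterpart of the statements $\widetilde{\hat{q}\hat{\pi}} = q\pi + i/2$ and of the constant shift in the quartic example.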

Quantum Discord and Inflation
The previous results seem to indicate that a system described by a two-mode squeezed state is classical since the correlation functions of curvature perturbations can also be obtained by means of a stochastic process. Even quantities such as $\hat{q}_k^{\,2}\hat{\pi}_k^{\,2} + \hat{\pi}_k^{\,2}\hat{q}_k^{\,2}$ become identical to their Weyl transform in the large squeezing limit. So there is indeed a quantum-to-classical transition and this would explain why astronomers can treat the perturbations classically. However, we have already noticed that a two-mode squeezed state is also an entangled state which, on the contrary, is usually viewed as the prototype of a non-classical state. Moreover, in the limit $r \to +\infty$, one has $\Psi_R(q_k,q_{-k}) \propto \delta(q_k - q_{-k})$, which is an Einstein-Podolsky-Rosen (EPR) quantum state [43], also considered as "the" state that can be used to illustrate the non-intuitive features of Quantum Mechanics.
In addition, more recently, in the context of Quantum Information Theory, a quantitative measure of the "quantumness" of a system has been introduced. This measure is called the "quantum discord", see Refs. [44] and [45]. Very briefly, the main idea is the following. In order to decide whether a system is quantum or classical, one can divide it into two parts (a "bipartite" system as already discussed before) and check whether the correlations between the two subsystems can be understood classically or not. In the cosmological context, as already discussed as well, the two subsystems can be taken to be the mode $k$ and the mode $-k$.

Let us first discuss the idea classically. For this purpose, let us consider two random variables $a$ and $b$ having a joint probability distribution $p(a_i,b_j)$ (the indices $i$ and $j$ label the possible realizations; of course, they can be continuous indices if we deal with continuous random variables). Each random variable has a distribution that can be obtained from the joint distribution by marginalization, namely $p(a_i) = \sum_j p(a_i,b_j)$ and $p(b_j) = \sum_i p(a_i,b_j)$. Then, the mutual information is given by
$$I(a,b) = S[p(a_i)] + S[p(b_j)] - S[p(a_i,b_j)],$$
where $S[p(a_i)] = -\sum_i p(a_i)\ln p(a_i)$ is the Shannon entropy. As is well-known, the entropy quantifies the uncertainty of the possible outcomes $a_i$ endowed with probability distribution $p(a_i)$. For instance, if there are only two possible outcomes, $a_1$ and $a_2$, with probability $p$ and $1-p$, then $S[p(a_i)] = -p\ln p - (1-p)\ln(1-p)$. If $p = 0$ or $p = 1$, then $S[p(a_i)] = 0$. The first case corresponds to a situation where the probability of having $a_1$ vanishes and the probability of having $a_2$ is one. The second case corresponds to the opposite situation. Clearly, in both cases, the outcomes are certain and, therefore, the uncertainty is zero. The uncertainty is maximal if $p = 1/2$, which is also very intuitive. Coming back to Eq. (49), this quantity is a measure of the correlations between the two subsystems since, when they are independent, the joint distribution factorizes, $p(a_i,b_j) = p(a_i)p(b_j)$, and $I(a,b) = 0$. We can also view it as the "distance" between two distributions, also known as the Kullback-Leibler divergence [46]. If they are one-dimensional, then the distance between two distributions $p(a_i)$ and $\bar{p}(a_i)$ is
$$D(p\,\|\,\bar{p}) = \sum_i p(a_i)\ln\frac{p(a_i)}{\bar{p}(a_i)},$$
and so on for higher-dimensional distributions. Let us remark, however, that this is not a real distance since it is not symmetric. It is easy to show that the mutual information discussed above is nothing but the distance between the joint distribution and the product of the two marginalized distributions, namely
$$I(a,b) = \sum_{i,j} p(a_i,b_j)\ln\frac{p(a_i,b_j)}{p(a_i)\,p(b_j)}.$$
When the two random variables are independent, the distance between $p(a_i,b_j)$ and $p(a_i)p(b_j)$ vanishes. One can also discuss the above concepts in another way. Let $p(b_j|a_i)$ be the probability to observe $b_j$ given that $a_i$ has been observed. Then the uncertainty associated with the outcomes $b_j$ is defined by
$$S[p(b_j|a_i)] = -\sum_j p(b_j|a_i)\ln p(b_j|a_i).$$
Of course, this quantity depends on the measured quantity $a_i$. In order to have the total uncertainty, one can average the previous quantity using the distribution $p(a_i)$. This leads us to define the conditional entropy by
$$S(b|a) = \sum_i p(a_i)\,S[p(b_j|a_i)].$$
Then, if one uses the Bayes theorem, one finds that the mutual information can also be written as
$$I(a,b) = S[p(b_j)] - S(b|a) \equiv J(a,b).$$

Let us now discuss the previous considerations again but in Quantum Mechanics. First of all, let us recall that the entropy $S$ of a system characterized by the state $\hat{\rho}$ is defined by $S = -\mathrm{Tr}(\hat{\rho}\log_2\hat{\rho})$. The interpretation of this quantity is the same as before. Then we can define the quantum-mechanical mutual information by the following expression,
$$I(k,-k) = S[\hat{\rho}(k)] + S[\hat{\rho}(-k)] - S[\hat{\rho}(k,-k)],$$
where the density matrices $\hat{\rho}(k)$ and $\hat{\rho}(-k)$ are obtained from $\hat{\rho}(k,-k)$ by tracing out the degrees of freedom associated with $-k$ and $k$, respectively. The non-trivial part comes from the quantum-mechanical generalization of the expression of $I$ in terms of the conditional entropy, see Eq. (55). This expression is based on conditional probabilities, which deal with the concept of observing an outcome given that another outcome has been observed or measured. It is well known that the concept of measurement is subtle in Quantum Mechanics and, in some sense, highly "non-classical". So let us suppose that we perform a measurement on the system $-k$. Measurements in Quantum Mechanics are represented by projectors and we denote by $\hat{\Pi}_j$ the projector associated with the measurement of the system $-k$ (it is therefore an operator living in the Hilbert space associated with the subsystem $-k$). After the measurement, the state of the system is $\hat{\rho}(k,-k)\hat{\Pi}_j/p_j$ with probability $p_j = \mathrm{Tr}[\hat{\rho}(k,-k)\hat{\Pi}_j]$. If we only have access to the system $k$, we trace out the degrees of freedom associated with the system $-k$ and we arrive at $\hat{\rho}(k;\hat{\Pi}_j) = \mathrm{Tr}_{-k}[\hat{\rho}(k,-k)\hat{\Pi}_j]/p_j$. This allows us to calculate probabilities for outcomes associated with the system $k$ given that a measurement has been performed on the system $-k$, and to define the quantum counterpart of $J$,
$$J(k,-k) = S[\hat{\rho}(k)] - \sum_j p_j\,S[\hat{\rho}(k;\hat{\Pi}_j)].$$
In Quantum Mechanics, contrary to the case where classical probability calculus applies, $I(k,-k)$ and $J(k,-k)$ need not coincide. In fact, we use this difference as a signature of the fact that the system is not classical. This leads us to define the "quantum discord" by
$$\delta(k,-k) = \min_{\{\hat{\Pi}_j\}}\left[I(k,-k) - J(k,-k)\right],$$
where a minimization over the set of all possible projectors is performed in order for the discord to be independent of the choice of a particular $\hat{\Pi}_j$. Having defined what the quantum discord is, one can now calculate it for a two-mode squeezed state. Straightforward manipulations lead to [23]
$$\delta(k,-k) = \cosh^2 r_k\,\log_2\cosh^2 r_k - \sinh^2 r_k\,\log_2\sinh^2 r_k .$$
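The classical part of the argument, namely that the two ways of writing the mutual information coincide when ordinary probability calculus applies, can be illustrated with a small numerical sketch. The joint distribution below is a made-up example, natural logarithms are used as in the Shannon entropy above, and the function names are hypothetical.

```python
import math

def entropy(dist):
    """Shannon entropy -sum p ln p of a list of probabilities."""
    return -sum(x * math.log(x) for x in dist if x > 0.0)

def marginals(joint):
    """Marginal distributions p(a_i) and p(b_j) of a joint table p(a_i, b_j)."""
    p_a = [sum(row) for row in joint]
    p_b = [sum(col) for col in zip(*joint)]
    return p_a, p_b

def mutual_information(joint):
    """I(a,b) = S[p(a)] + S[p(b)] - S[p(a,b)]."""
    p_a, p_b = marginals(joint)
    flat = [x for row in joint for x in row]
    return entropy(p_a) + entropy(p_b) - entropy(flat)

def kl_divergence(p, p_bar):
    """Kullback-Leibler divergence sum p ln(p / p_bar); not symmetric."""
    return sum(x * math.log(x / y) for x, y in zip(p, p_bar) if x > 0.0)

def conditional_information(joint):
    """J(a,b) = S[p(b)] - sum_i p(a_i) S[p(b|a_i)]."""
    p_a, p_b = marginals(joint)
    cond = sum(pa * entropy([x / pa for x in row])
               for pa, row in zip(p_a, joint) if pa > 0.0)
    return entropy(p_b) - cond

# Hypothetical correlated joint distribution of two binary variables.
joint = [[0.4, 0.1],
         [0.1, 0.4]]
p_a, p_b = marginals(joint)
product = [pa * pb for pa in p_a for pb in p_b]
flat = [x for row in joint for x in row]

i_value = mutual_information(joint)
j_value = conditional_information(joint)
kl_value = kl_divergence(flat, product)
print(i_value, j_value, kl_value)
```

The three printed numbers agree, whereas for a factorized table such as `[[0.25, 0.25], [0.25, 0.25]]` they all vanish. The quantum discord measures precisely the failure of the first equality once the conditional entropy is defined through a measurement.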
Let us notice that the discord does not depend on the squeezing angle. The discord is represented in Fig. 2. For a vanishing squeezing parameter, the discord is zero and then grows with $r_k$. For large values of $r_k$, it simply grows linearly since $\delta(k,-k) = 2r_k/\ln 2 - 2 + 1/\ln 2 + O(e^{-2r_k})$. Therefore, one concludes that a two-mode squeezed state is not a classical state at all, at least if one accepts the quantum discord as a meaningful criterion for classicality.
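As a quick numerical illustration, the exact discord of a two-mode squeezed state can be compared with the linear asymptote obtained by expanding $\cosh^2 r_k\log_2\cosh^2 r_k - \sinh^2 r_k\log_2\sinh^2 r_k$ at large $r_k$. This is a minimal sketch; the function names are hypothetical.

```python
import math

def discord(r):
    """Exact discord cosh^2 r log2 cosh^2 r - sinh^2 r log2 sinh^2 r."""
    c2 = math.cosh(r) ** 2
    s2 = math.sinh(r) ** 2
    tail = s2 * math.log2(s2) if s2 > 0.0 else 0.0  # x log x -> 0 as x -> 0
    return c2 * math.log2(c2) - tail

def discord_large_r(r):
    """Linear asymptote, valid up to corrections of order exp(-2 r)."""
    return 2.0 * r / math.log(2.0) - 2.0 + 1.0 / math.log(2.0)

for r in (0.0, 1.0, 3.0, 6.0):
    print(r, discord(r), discord_large_r(r))
```

The discord vanishes at $r_k = 0$, increases monotonically, and the two columns agree to better than $10^{-4}$ already at $r_k = 6$, far below the squeezing reached during inflation.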

Negativity of the Wigner Distribution as a Criterion of Non-Classicality
We have reached a seemingly paradoxical stage. On the one hand, the considerations in Sec. 3.1 seem to indicate that the system is classical. Everything can indeed be described by means of a stochastic distribution, namely the Wigner function. This function is positive definite because we deal with a Gaussian state, and the Weyl transform of any power of the curvature perturbation is "trivial", which indicates that any quantum correlation function can be obtained as a stochastic correlation function using the Wigner distribution. On the other hand, a two-mode squeezed state tends to an EPR state and, moreover, modern criteria designed in the context of Quantum Information Theory, such as the quantum discord calculated in the last section, Sec. 3.2, unambiguously show that the system is quantum. How can we understand something which, at first sight, looks like a contradiction? It should be added that, although interesting in general, this question is especially relevant in Cosmology which, as argued before, is the situation in Physics where the strongest squeezing and, therefore, the largest discord, are obtained.
In fact this question has a fascinating history although it was realized only very recently that Cosmology is probably the most interesting context to discuss it. To our knowledge, it started in 1986 with the letter "EPR correlations and EPW distributions" [27] that J. Bell presented in a conference organized by the New York Academy of Sciences and which is also reproduced in his famous book "Speakable and unspeakable in quantum mechanics" (this is chapter 21) [28]. The letters "EPW" in the title stand for "Eugene Paul Wigner" since Bell's letter is dedicated to Professor E. P. Wigner. Amusingly enough, the inflationary mechanism for structure formation was invented only a few years before and, soon after Bell's letter, in 1990, Grishchuk and Sidorov [29] realized for the first time that a two-mode squeezed state was involved in the inflationary scenario. Remarkably, the Grishchuk and Sidorov paper contains a calculation of the Wigner function of cosmological perturbations.
The main idea of Bell's paper is to relate the presence of non-classical, quantum, correlations to the non-positivity of the Wigner function. The idea that a negative Wigner function signals non-classicality is intuitive since a classical probability function must be positive. It is best illustrated in the case of a Schrödinger cat state (we consider, for simplicity but without loss of generality, a one-dimensional system), namely the superposition (60) of two wave packets, with $N_{\rm CAT} = [1 + e^{-m\omega q_0^2}\cos(2q_0p_0)]^{-1/2}$ in order for the wave function (60) to be properly normalized. Inserting this expression into the definition of the Wigner function (for a one-dimensional system, it reads $W(q,p) = (2\pi)^{-1}\int dx\, e^{-ipx}\,\Psi^*(q-x/2)\Psi(q+x/2)$), one arrives at an expression, see Ref. [47], which is the sum of three contributions: $W_\pm(q,p)$, which represent the Wigner function of a single wave packet, and $W_{\rm int}(q,p)$, which is an interference term. This Wigner function is represented in Fig. 3. We see two peaks where the Wigner function is positive and, in between, a series of oscillations due to the cosine term in Eq. (65) where the Wigner function can be negative. The oscillations in the Wigner function are clearly due to interferences between the two wave packets. Therefore, interferences, which are a typical quantum phenomenon, are responsible for the non-positivity of the Wigner function, hence the idea to view the non-positivity of the Wigner function as a criterion for non-classicality.
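The negativity of the cat-state Wigner function can be illustrated by evaluating the one-dimensional Wigner integral numerically. The sketch below makes several simplifying assumptions that are not taken from the text: units with $m\omega = \hbar = 1$, a vanishing momentum $p_0 = 0$ (so that the wavefunction is real and the normalization reduces to $[1+e^{-q_0^2}]^{-1/2}$), two Gaussian packets centred at $\pm q_0$, and a plain Riemann sum for the integral. All names are hypothetical.

```python
import math

def cat_psi(q, q0=2.0):
    """Real cat-state wavefunction: two unit-width Gaussians at +q0 and -q0."""
    norm = (1.0 + math.exp(-q0 ** 2)) ** -0.5 * math.pi ** -0.25 / math.sqrt(2.0)
    return norm * (math.exp(-0.5 * (q - q0) ** 2) + math.exp(-0.5 * (q + q0) ** 2))

def norm_squared(q0=2.0, half_width=12.0, n=4000):
    """Riemann-sum estimate of the integral of |psi|^2 (should be one)."""
    h = 2.0 * half_width / n
    return h * sum(cat_psi(-half_width + i * h, q0) ** 2 for i in range(n + 1))

def wigner(q, p, q0=2.0, half_width=12.0, n=4000):
    """W(q,p) = (1/2 pi) int dx cos(p x) psi(q - x/2) psi(q + x/2) for real psi."""
    h = 2.0 * half_width / n
    total = 0.0
    for i in range(n + 1):
        x = -half_width + i * h
        total += cat_psi(q - 0.5 * x, q0) * cat_psi(q + 0.5 * x, q0) * math.cos(p * x)
    return total * h / (2.0 * math.pi)

w_peak = wigner(2.0, 0.0)            # on top of one wave packet
w_fringe = wigner(0.0, math.pi / 4)  # in the interference region
print(norm_squared(), w_peak, w_fringe)
```

With $q_0 = 2$, the value on top of a packet is positive while the value at $q = 0$, $p = \pi/(2q_0)$ is negative: the oscillating interference term, not the individual packets, is what makes the distribution fail as a classical probability density.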

WKB
Given the considerations presented in the previous section, it is interesting to calculate the Wigner function of a WKB state since the WKB approximation is often viewed as a way to study the classical limit in Quantum Mechanics. Surprisingly, the calculation of the WKB Wigner function is not as straightforward as one might think at first sight. In Cosmology, it even has a rich and controversial history. The WKB Wigner function has indeed been applied to various questions in Cosmology such as the interpretation of the wavefunction of the Universe (Quantum Cosmology) and the quantum-to-classical transition of inflationary perturbations, this last topic being obviously especially relevant for the present article.
Although the original calculation of the semi-classical Wigner function was performed by M. Berry in 1976 [48], it started to be applied in Cosmology only at the end of the 1980s, in Ref. [49]. The question of how to interpret the wavefunction of the Universe in Quantum Cosmology was the issue tackled in this article. Usually, the solution of the Wheeler-DeWitt equation makes sense only in the WKB approximation because this is the regime where positive probabilities can be extracted from this formalism (recall that the Wheeler-DeWitt equation is similar to a Klein-Gordon equation and, hence, does not always lead to positive probabilities). Then, the idea was to look for correlations in the WKB Wigner function. The calculation of Ref. [49] proceeds as follows. Inserting the WKB wavefunction,
$$\Psi_{\rm WKB}(q,t) = C(q,t)\,e^{iS(q,t)/\hbar},$$
where we have provisionally re-established $\hbar$ and where $S(q,t)$ is the classical action of the system, into the definition of the Wigner function, one arrives at
$$W_{\rm WKB}(q,p) = \frac{1}{2\pi}\int dx\; C^*\!\left(q-\frac{\hbar x}{2}\right)C\!\left(q+\frac{\hbar x}{2}\right)\exp\left\{-ipx + \frac{i}{\hbar}\left[S\!\left(q+\frac{\hbar x}{2}\right) - S\!\left(q-\frac{\hbar x}{2}\right)\right]\right\}.$$
Expanding the amplitude and the phase in $\hbar$, one obtains
$$W_{\rm WKB}(q,p) \simeq \frac{|C(q)|^2}{2\pi}\int dx\; e^{-i\left(p - \partial S/\partial q\right)x},$$
that is to say, by performing the integration over $x$,
$$W_{\rm WKB}(q,p) \simeq |C(q)|^2\,\delta\!\left(p - \frac{\partial S}{\partial q}\right).$$
Therefore, the conclusion of Ref. [49] was that the WKB approximation is really a classical limit in the sense that the Wigner function is positive definite and peaked over the classical trajectories $p = \partial S/\partial q$, with a weight given by the squared amplitude $|C(q)|^2$. However, after the publication of Ref. [49], it was pointed out in Refs. [50] and [51] that the calculation of the WKB Wigner function (69) is in fact unjustified and that, moreover, the correct formula was derived, as already mentioned, by M. Berry in Ref. [48]. The trouble with Eq. (69) is that one cannot truncate the expansion of the phase and, then, perform the integration over $x$. It is true that the higher order terms are proportional to powers of $\hbar$ (which goes to zero) but also to powers of $x$ (the range of which is the entire real axis), so that it is unclear whether the contributions of higher order terms are really negligible. The correct method consists in fact in using the saddle point approximation. This leads to Berry's formula, Eq. (70), which expresses the Wigner function in terms of an Airy function $\mathrm{Ai}(z)$ [52]. In this expression, $E$ is the energy shell, namely the quantity such that the Hamiltonian of the system satisfies $H(p,q) = E$. The points 1 and 2 are the points of coordinates $q \pm x_0(q,p)$, $p(q \pm x_0)$ satisfying the stationary phase condition $p(q - x_0/2, E) + p(q + x_0/2, E) = 2p$. They lie on the classical trajectory and their position is determined such that the arithmetic mean of their momenta is $p$. Finally, $A(q,p)$ is the area comprised between the chord 1-2 and the classical torus $H = E$. One can show that, when the exact Wigner function is known, the above formula (70) matches the exact result very well in the regime where the WKB condition is valid. But the most important property of Eq. (70) is that it is not positive definite and usually displays oscillations in phase space. This shows that the semi-classical limit cannot be viewed as being a truly classical regime.
The WKB approximation has also been applied to the quantum-to-classical transition of cosmological perturbations in Ref. [53]. In that paper, it is claimed that this transition is achieved because the quantum state of the perturbations precisely satisfies the WKB approximation on super Hubble scales. Based on what we have just seen about the WKB Wigner function, this last statement should be taken with great care. In fact, as we are going to see, the behavior of cosmological perturbations on large scales is especially relevant when it comes to semi-classical methods in phase space.
Based on the fact that the quantum state of the perturbations is a squeezed state, Ref. [53] considers a simplified model consisting of an inverted harmonic oscillator whose Lagrangian is given by
$$L = \frac{m}{2}\dot{q}^2 + \frac{k}{2}q^2,$$
where the potential term has the "wrong" sign, $V(q) = -kq^2/2$. As we are going to see, the state of the oscillator evolves into a strongly squeezed state, which justifies considering this simple system. The corresponding Hamiltonian reads $H = p^2/(2m) - kq^2/2$, with $p = \partial L/\partial\dot{q} = m\dot{q}$, and is not bounded from below. Then, creation and annihilation operators can be introduced in the standard fashion,
$$\hat{c} = \sqrt{\frac{m\omega}{2}}\left(\hat{q} + \frac{i}{m\omega}\hat{p}\right), \qquad \hat{c}^\dagger = \sqrt{\frac{m\omega}{2}}\left(\hat{q} - \frac{i}{m\omega}\hat{p}\right),$$
where $\omega^2 = k/m$. This allows us to express the Hamiltonian as
$$\hat{H} = -\frac{\omega}{2}\left(\hat{c}^2 + \hat{c}^{\dagger 2}\right).$$
Of course, the most striking property of this Hamiltonian is the presence of an overall minus sign, which is just the consequence of the inverted nature of the oscillator. Then, the equations of motion are given by
$$\frac{d\hat{c}}{dt} = i\omega\,\hat{c}^\dagger, \qquad \frac{d\hat{c}^\dagger}{dt} = -i\omega\,\hat{c}.$$
As usual, they can be solved by means of a Bogoliubov transformation, namely
$$\hat{c}(t) = u(t)\,\hat{c}(t_{\rm ini}) + v(t)\,\hat{c}^\dagger(t_{\rm ini}),$$
where the functions $u(t)$ and $v(t)$ satisfy the following equations,
$$\frac{du}{dt} = i\omega\,v^*, \qquad \frac{dv}{dt} = i\omega\,u^*,$$
with initial conditions $u(0) = 1$ and $v(0) = 0$ (here, for simplicity, we have taken $t_{\rm ini} = 0$). Combining these two first order differential equations, one can obtain one second order equation for $u$ (and/or $v$) which reads
$$\frac{d^2u}{dt^2} - \omega^2 u = 0,$$
which gives $u(t) = \cosh(\omega t)$ and $v(t) = i\sinh(\omega t)$. As a consequence, the operator $\hat{c}(t)$ can be rewritten as a unitary transformation of $\hat{c}_{\rm ini}$ generated by an operator $\hat{S}$, where we have written $\hat{c}(t_{\rm ini}) = \hat{c}_{\rm ini}$ and where $\hat{S}$ has squeezing parameter and angle given by $r = \omega t$ and $\varphi = -\pi/4$. $\hat{S}$ is the squeezing operator and is responsible, as announced above, for the appearance of squeezed states in the problem. In order to mimic the behavior of cosmological perturbations, we assume that the initial state of the system is the vacuum $|0\rangle$. Then, the state at a subsequent time $t$ can be found using techniques based on the operator ordering theorem [54], which allows us to rewrite the operator $\hat{S}$ in a factorized form, from which it follows that the state is a superposition of the number states $|2p\rangle$. This state is a one-mode squeezed state. It slightly differs from the two-mode squeezed state considered before in Eq. (21). In particular, we see that the sum is only over states with an even number of particles. Since $r = \omega t$, the squeezing goes to infinity in the large time limit.

Our next move is to calculate the wavefunction. It is given by a sum over $p$ involving $H_{2p}(z)$, a Hermite polynomial of order $2p$ [52], which appears in the $q$-representation of the state $|2p\rangle$. Then, using $\sum_{n=0}^{\infty} t^n H_{2n}(x)/n! = e^{4tx^2/(1+4t)}/\sqrt{1+4t}$ (here, one has $4t = e^{-2i\varphi}\tanh r$) and recalling that $\varphi = -\pi/4$, one arrives at a Gaussian wavefunction which, in units where $m\omega = 1$ and up to a $q$-independent phase, reads
$$\Psi(q) = \frac{1}{\left[\pi\cosh(2r)\right]^{1/4}}\exp\left[-\frac{q^2}{2\cosh(2r)} + \frac{i}{2}q^2\tanh(2r)\right],$$
where one verifies that this wavefunction is correctly normalized. As noticed in Ref. [53], it can be written in a WKB form (66) with
$$C(q) = \frac{1}{\left[\pi\cosh(2r)\right]^{1/4}}\,e^{-q^2/[2\cosh(2r)]}, \qquad S(q) = \frac{q^2}{2}\tanh(2r),$$
and, in the large time limit or, equivalently, strong squeezing limit, the semi-classical approximation is extremely well-verified since the WKB condition is satisfied, $|C\,\partial_q S/\partial_q C| = \sinh(2r) \gg 1$. Hence the claim that cosmological perturbations on super Hubble scales, which is equivalent to strong squeezing, are "semi-classical". It is then tempting to go from "semi-classical" to "classical" and consider that the quantum-to-classical transition has been achieved. However, as seen above, given that the WKB Wigner function is not positive definite, one should a priori resist this temptation. Let us therefore calculate the Wigner function of the state (83) exactly. One obtains a Gaussian,
$$W(q,p) = \frac{1}{\pi}\exp\left\{-\frac{q^2}{\cosh(2r)} - \cosh(2r)\left[p - q\tanh(2r)\right]^2\right\},$$
which is consistent with the fact that the wavefunction (83) is a Gaussian. This means that this Wigner function, contrary to the WKB Wigner function (70), is positive definite which, at first sight, seems surprising since $\Psi(q)$ in Eq. (83) satisfies the WKB approximation. Moreover, writing $\epsilon = 1/[4\cosh(2r)]$ which, in the strong squeezing limit, goes to zero, representing the Dirac function by $\delta_\epsilon(x) = e^{-x^2/(4\epsilon)}/[2\sqrt{\pi\epsilon}]$ and, finally, noticing that $p = \partial S/\partial q = q\tanh(2r)$, the Wigner function (85) can be re-written as
$$W(q,p) = |C(q)|^2\,\delta_\epsilon\!\left(p - \frac{\partial S}{\partial q}\right),$$
where $C$ is given in Eq. (84). This equation is nothing but Eq. (69) which was, as discussed in Refs. [50] and [51], supposed to be incorrect! What happened is in fact very simple. It was pointed out before that the expansion of the phase performed in Ref. [49] is not justified because the order of the various terms of this expansion is in fact indeterminate. There is of course one exception to this claim, which is when the calculation of Ref. [49] is not an expansion but is exact. This is exactly what happens here since the phase is quadratic in $q$. So ignoring higher order terms, which is usually unjustified, is, in the present case, totally valid simply because these higher order terms are just not present. This is consistent with the fact that the Wigner function of a Gaussian state is a Gaussian and, therefore, is positive definite. This shows how peculiar and subtle the quantum-to-classical transition of cosmological perturbations is.
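The generating-function identity used above to resum the series of even Hermite polynomials, $\sum_{n} t^n H_{2n}(x)/n! = e^{4tx^2/(1+4t)}/\sqrt{1+4t}$, can be checked numerically for $|4t| < 1$, including the purely imaginary value $4t = i\tanh r$ relevant for $\varphi = -\pi/4$. The sketch below uses the physicists' Hermite polynomials built from their three-term recurrence; the truncation order and the sample values of $x$ and $r$ are arbitrary choices made for illustration.

```python
import cmath
import math

def hermite_even(x, n_max):
    """[H_0(x), H_2(x), ..., H_{2 n_max}(x)] from H_{k+1} = 2x H_k - 2k H_{k-1}."""
    values = [1.0, 2.0 * x]
    for k in range(1, 2 * n_max):
        values.append(2.0 * x * values[k] - 2.0 * k * values[k - 1])
    return values[0:2 * n_max + 1:2]

def series(x, t, n_max=60):
    """Partial sum of sum_n t^n H_{2n}(x) / n! (converges for |4t| < 1)."""
    total = 0.0
    coeff = 1.0  # running value of t^n / n!
    for n, h in enumerate(hermite_even(x, n_max)):
        total += coeff * h
        coeff *= t / (n + 1)
    return total

def closed_form(x, t):
    """exp(4 t x^2 / (1 + 4t)) / sqrt(1 + 4t), principal branch."""
    return cmath.exp(4.0 * t * x * x / (1.0 + 4.0 * t)) / cmath.sqrt(1.0 + 4.0 * t)

x = 0.7
r = 0.5
t_squeezed = 0.25j * math.tanh(r)  # corresponds to 4t = i tanh(r)
err_complex = abs(series(x, t_squeezed) - closed_form(x, t_squeezed))
err_real = abs(series(x, 0.1) - closed_form(x, 0.1))
print(err_complex, err_real)
```

Both errors are at the level of round-off. The series itself only converges for $\tanh r < 1$, which always holds at finite squeezing, although more terms are needed as $r$ grows.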

Bell's Paper on the Wigner Function
After these preliminary considerations, let us now come back to the letter written by J. Bell in 1986 [27]. Based on the original EPR article [43], Bell imagines a situation where there are two free particles traveling in space along a given axis (the particles can propagate in both directions). Then, Bell assumes that one can measure position (of the two particles) only. As he notices himself, this slightly differs from the standard EPR argument where it is also assumed that momenta can be measured. The article makes use of the "two-time" Wigner function defined by
$$W(q_1,q_2,p_1,p_2,t_1,t_2) = \frac{1}{(2\pi)^2}\int dy_1\,dy_2\; e^{-ip_1y_1 - ip_2y_2}\,\Psi\!\left(q_1+\frac{y_1}{2}, q_2+\frac{y_2}{2}, t_1, t_2\right)\Psi^*\!\left(q_1-\frac{y_1}{2}, q_2-\frac{y_2}{2}, t_1, t_2\right),$$
where $\Psi(q_1,q_2,t_1,t_2)$ is the wavefunction of the system. If one considers two freely-moving particles, then $\Psi$ satisfies the Schrödinger equations, $i\,\partial\Psi/\partial t_{1,2} = \hat{p}_{1,2}^2/(2m_{1,2})\,\Psi$ and, as a consequence, the Wigner function obeys (in units where $m_1 = m_2 = 1$) $(\partial/\partial t_{1,2} + p_{1,2}\,\partial/\partial q_{1,2})W = 0$. This means that $W(q_1,q_2,p_1,p_2,t_1,t_2) = W(q_1 - p_1t_1, q_2 - p_2t_2, p_1, p_2, 0, 0)$ and this allows us to calculate the Wigner function at any time from the sole knowledge of the initial wavefunction. Of course, the calculation of $W(q_1 - p_1t_1, q_2 - p_2t_2, p_1, p_2, 0, 0)$ still requires the knowledge of the wavefunction and, in the following, several possibilities are considered.
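The statement that free evolution simply shears the Wigner function along the position axis can be verified with finite differences. The sketch below considers a single particle of unit mass and a hypothetical Gaussian initial Wigner function (not one of the states studied in the text), and checks that $W(q,p,t) = W(q - pt, p, 0)$ satisfies the transport equation $(\partial_t + p\,\partial_q)W = 0$.

```python
import math

def w_initial(q, p):
    """Hypothetical initial Wigner function: a normalized unit Gaussian."""
    return math.exp(-q * q - p * p) / math.pi

def w_free(q, p, t):
    """Free evolution for unit mass: W(q, p, t) = W(q - p t, p, 0)."""
    return w_initial(q - p * t, p)

def transport_residual(q, p, t, h=1e-4):
    """Central-difference estimate of (d/dt + p d/dq) W at (q, p, t)."""
    dw_dt = (w_free(q, p, t + h) - w_free(q, p, t - h)) / (2.0 * h)
    dw_dq = (w_free(q + h, p, t) - w_free(q - h, p, t)) / (2.0 * h)
    return dw_dt + p * dw_dq

residual = transport_residual(0.3, -0.8, 1.7)
print(residual)
```

The residual vanishes up to discretization error, for any choice of the initial distribution: the only input needed to obtain the Wigner function at later times is therefore its value at $t = 0$.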
Then, Bell proceeds and shows how his famous inequality can be implemented in the situation described before. More precisely, he does so in the so-called Clauser, Horne, Shimony and Holt (CHSH) formulation (see footnote 3), which deals with quantities that can only take the values $\pm 1$. This is why, usually, the CHSH inequality is experimentally tested with spin variables. However, in the case considered by Bell, the particles are spinless and, as already mentioned, we only have access to position measurements. Although he does not present it exactly in this way, what Bell does to circumvent this problem is to introduce the two following operators,
$$\hat{S}_1(t_1) = \mathrm{sign}\left[\hat{q}_1(t_1) + \frac{q_0}{2}\right], \qquad \hat{S}_2(t_2) = \mathrm{sign}\left[\hat{q}_2(t_2) - \frac{q_0}{2}\right],$$
which represent the sign of $q_1 + q_0/2$ and $q_2 - q_0/2$ at times $t_1$ and $t_2$, respectively, $q_0$ being an arbitrary position. Clearly, the spectra of $\hat{S}_1$ and $\hat{S}_2$ only consist of two values, namely $\pm 1$, as required. Interestingly enough, this is exactly what is done in Refs. [55][56][57] and then, in the context of Cosmology, in Refs. [25] and [26]. Therefore, remarkably, Bell's paper already contains the idea of fictitious spin operators that, as we will see later on, can be used in order to design a cosmic Bell experiment. Then, once we have discrete variables, one can just mimic the usual approach which, as recalled above, is formulated in terms of spins. The first step consists in defining the two-point correlators
$$E(t_1,t_2) = \langle\Psi|\hat{S}_1(t_1)\hat{S}_2(t_2)|\Psi\rangle,$$
where $|\Psi\rangle$ is the quantum state in which the system is placed. Let us also remark that the two times $t_1$ and $t_2$ play the role of the polarizer settings in the standard CHSH formulation. Following the usual considerations, one can then prove that the CHSH combination $B(t_1,t_2,t_1',t_2')$ of these correlators, Eq. (90), satisfies $B(t_1,t_2,t_1',t_2') \le 2$ if the correlators are interpreted as stochastic averages and if locality holds [59]. On the contrary, in Quantum Mechanics, one just has $B(t_1,t_2,t_1',t_2') \le 2\sqrt{2}$, hence the idea to look for experimental configurations for which $2 < B(t_1,t_2,t_1',t_2') \le 2\sqrt{2}$. In the following, Bell inequality violation (or CHSH inequality violation, we use the two expressions interchangeably) will refer to a situation where $B(t_1,t_2,t_1',t_2') > 2$.
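The two bounds can be made concrete with a short sketch. It assumes one common sign convention for the CHSH combination, $B = |E(a,b) + E(a,b') + E(a',b) - E(a',b')|$, which may differ from the one used in Eq. (90) by a relabeling of the settings. The classical bound is obtained by enumerating all deterministic assignments of $\pm 1$, and the quantum value is illustrated with the textbook spin-singlet correlator $E = -\cos(\theta_a - \theta_b)$, not with Bell's position-based operators.

```python
import itertools
import math

def chsh(e_ab, e_abp, e_apb, e_apbp):
    """CHSH combination |E(a,b) + E(a,b') + E(a',b) - E(a',b')| (assumed convention)."""
    return abs(e_ab + e_abp + e_apb - e_apbp)

def classical_maximum():
    """Maximum of the CHSH combination over deterministic +/-1 outcomes."""
    best = 0
    for s1, s1p, s2, s2p in itertools.product((-1, 1), repeat=4):
        best = max(best, chsh(s1 * s2, s1 * s2p, s1p * s2, s1p * s2p))
    return best

def singlet_correlator(angle_a, angle_b):
    """Spin-singlet correlator for analyzers at the two given angles."""
    return -math.cos(angle_a - angle_b)

def singlet_chsh(a, a_prime, b, b_prime):
    return chsh(singlet_correlator(a, b), singlet_correlator(a, b_prime),
                singlet_correlator(a_prime, b), singlet_correlator(a_prime, b_prime))

b_classical = classical_maximum()
b_quantum = singlet_chsh(0.0, math.pi / 2, math.pi / 4, -math.pi / 4)
print(b_classical, b_quantum)
```

Since any local stochastic model is a mixture of deterministic assignments, the first number bounds all such models, while the second reaches $2\sqrt{2}$.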
As discussed at the beginning of this section, Bell wants to relate the non-positivity of the Wigner function to a violation of the CHSH inequality. Technically, the link between the inequality and the Wigner function is expressed as follows. It is easy to check that the quantities $\hat{S}_1$ and $\hat{S}_2$ are such that $\tilde{S}_1 = S_1$ and $\tilde{S}_2 = S_2$, see the discussion above in Sec. 3.1 and, as a consequence, thanks to Eqs. (43), (44), the expression of the correlator $E(t_1,t_2)$ can be rewritten as
$$E(t_1,t_2) = \int dq_1\,dq_2\;\mathrm{sign}\left(q_1+\frac{q_0}{2}\right)\mathrm{sign}\left(q_2-\frac{q_0}{2}\right)\rho(q_1,q_2,t_1,t_2),$$
where the two-time probability distribution $\rho(q_1,q_2,t_1,t_2)$ is defined by
$$\rho(q_1,q_2,t_1,t_2) = \int_{-\infty}^{\infty}dp_1\int_{-\infty}^{\infty}dp_2\;W(q_1-p_1t_1, q_2-p_2t_2, p_1, p_2, 0, 0).$$
The above equations exactly represent the relation needed as it allows us to estimate the left hand side of Eq. (90) in terms of the Wigner function. Using the definition of $S_1$ and $S_2$, one can also show that
$$E(t_1,t_2) = 1 - 2\left[\int_{-\infty}^{-q_0/2}dq_1\int_{q_0/2}^{\infty}dq_2\;\rho(q_1,q_2,t_1,t_2) + \int_{-q_0/2}^{\infty}dq_1\int_{-\infty}^{q_0/2}dq_2\;\rho(q_1,q_2,t_1,t_2)\right],$$
where we used the fact that the Wigner function is normalized to one.

Footnote 3: Amusingly, throughout his paper, Bell mixes the acronyms and refers to this inequality as the CHHS inequality although he cites the correct reference (i.e. the correct journal, issue and page), namely Ref. [58]. A closer look at the list of references shows that he has also permuted the two "H", that is to say he puts Holt before Horne while, in the original paper, it is the opposite. In fact, Bell has simply taken the liberty to re-establish the alphabetical order!
To go further and concretely calculate the correlators and, hence, verify whether the CHSH inequality is violated or not, one needs to specify the state in which the system is placed. The first example considered by Bell is simply the original EPR wavefunction [43] (supposed to hold at initial times), $\Psi_{\rm EPR}(q_1,q_2) = N_{\rm EPR}\,\delta(q_1 - q_2 + q_0)$, where $N_{\rm EPR}$ is a normalization constant. Then, he calculates the corresponding Wigner function and finds
$$W_{\rm EPR}(q_1,q_2,p_1,p_2,0,0) \propto \delta(q_1 - q_2 + q_0)\,\delta(p_1 + p_2).$$
Bell remarks that this Wigner function is positive everywhere and that "the EPR correlations are precisely those between two classical particles in independent free classical motion. With the wave function (8) [namely the EPR wavefunction], then, there is no non-locality problem when the incompleteness of the wave function description is admitted". Therefore, Bell explicitly relates classicality (namely no violation of the CHSH inequality) and positivity of the Wigner function. However, it is interesting to notice that he does not explicitly check whether the CHSH inequality is violated in that case [namely he does not explicitly calculate the two-point correlators for the EPR wavefunction nor the combination (90)], despite the fact that he clearly suggests it is not violated. Although he does not explain why, one can maybe guess the reason. If one takes the EPR Wigner function (95) and tries to calculate $\rho(q_1,q_2,t_1,t_2)$, one finds an infinite result. Indeed, the first integration, say on $p_1$, kills the Dirac delta function $\delta(p_1 + p_2)$. Then there remains an integral over $p_2$ of something which does not depend on $p_2$, which gives an infinite result. This remark is interesting because it turns out to be at the core of all the literature that is devoted to this question and to Bell's paper: this will be discussed at length in the following sections. The reason for this problem is in fact related to the normalization $N_{\rm EPR}$. Indeed, the wavefunction $\Psi_{\rm EPR}(q_1,q_2)$ is not properly normalized. A correct way to normalize it is to modify the wavefunction such that it now reads
$$\Psi_{\rm EPR}(q_1,q_2) = N_{\rm EPR}(b,\varepsilon)\;e^{-(q_1+q_2)^2/(2b^2)}\;\frac{e^{-(q_1-q_2+q_0)^2/\varepsilon^2}}{\varepsilon\sqrt{\pi}}.$$
This is suggested by Bell himself in the continuation of his article. Strictly speaking, in Bell's paper, this trick is not applied to the EPR wavefunction but to another wavefunction considered later on in his article [and, in the present paper as well, see Eq. (108)]. Here, we have just anticipated his guess and have applied it to the EPR wavefunction. We will comment more on this point in the following. The "new" wavefunction (96) is made of three pieces: a normalization factor, a new factor depending on a new parameter $b$, $e^{-(q_1+q_2)^2/(2b^2)}$, and a last factor depending on another new parameter $\varepsilon$, $e^{-(q_1-q_2+q_0)^2/\varepsilon^2}/(\varepsilon\sqrt{\pi})$. This last factor is just a finite representation of the Dirac function $\delta(q_1 - q_2 + q_0)$, which is recovered in the limit $\varepsilon \to 0$. The second factor is necessarily present to make the wavefunction normalizable. Even if one uses the finite representation of the Dirac function, it is not possible to make $\int|\Psi_{\rm EPR}|^2\,dq_1\,dq_2$ finite without the factor $e^{-(q_1+q_2)^2/(2b^2)}$, as is easy to see by introducing the new variables $q_1 \pm q_2$ in the previous integral. Finally, the first factor is the normalization coefficient implied by the two other factors.

In his article, Bell claims (see footnote 4) that the second factor can be ignored by taking the limit $b \to \infty$. However, we see that this limit, as well as the limit $\varepsilon \to 0$, is very problematic due to the normalization factor. For this reason, in the following, we work with Eq. (96) without taking any limit. Since the wavefunction is Gaussian, the calculations are tractable. Indeed, using the wavefunction (96), it is easy to calculate the Wigner function $W_{\rm EPR}(q_1,q_2,p_1,p_2,0,0)$, Eq. (97). Of course, since the state (96) is Gaussian, we find that the Wigner function is also a Gaussian, see Sec. 2.5. One checks that the Wigner function is properly normalized, $\int dq_1\,dq_2\,dp_1\,dp_2\,W_{\rm EPR} = 1$, which is a consistency check of Eq. (97). The next move consists in calculating the distribution $\rho$. Recall that this requires the calculation of $W_{\rm EPR}(q_1 - p_1t_1, q_2 - p_2t_2, p_1, p_2, 0, 0)$. Using Eq. (97), this can be expressed as the exponential of a quadratic form in the momentum vector $P^{\rm T} = (p_1,p_2)$. Then, the distribution $\rho$ can be straightforwardly evaluated by applying well-known formulas for Gaussian integrals, which leads to Eq. (101). We are now in a position where the correlator $E_{\rm EPR}(t_1,t_2)$ can be calculated by plugging Eq. (101) into Eq. (94).

Footnote 4: Again, this in fact applies to another wavefunction in the paper. However, since the problem is exactly the same for the EPR wavefunction, we take the liberty to propagate this statement to the situation studied here.

Figure 4.
Bell quantity $B(0, t_2, 0, t_2')$ as a function of the two times $t_2$ and $t_2'$ calculated for the EPR state, a state which has a positive definite Wigner function. As is apparent from the plot, one always has $B(0, t_2, 0, t_2') < 2$, which means that Bell inequality is never violated.
We see that, in the arguments of all exponentials, $q_0$ cancels out, which makes the correlation function independent of $q_0$. This makes sense since this quantity is arbitrary and, therefore, a physical result cannot depend on its value. This also implies that the arguments of the exponentials are in fact quadratic in the two integration variables. As a consequence, by a simple change of variable, the second term in the square bracket is in fact equal to the first one. Finally, one obtains the integral (103), which involves three coefficients $c_1$, $c_2$ and $c_3$. This integral can easily be performed. As required, $E_{\rm EPR}(t_1, t_2)$ varies between $-1$ and $1$. It is interesting to notice that the form of the correlator is typical of what one obtains with pseudo spin operators, see for instance Eq. (26) of Ref. [26]: the resemblance is striking.
Having obtained the correlators, it is now easy to verify whether the CHSH inequality (90) is violated or not. In Fig. 4, we have represented the quantity $B(0, t_2, 0, t_2')$ defined in Eq. (90) and evaluated with the correlator obtained above. We see that this quantity is always smaller than two so that Bell inequality is never violated. Therefore, we have confirmed Bell's suggestion that the EPR state, which has a positive Wigner function, does not lead to any Bell inequality violation. This was done with a method that avoids the technical issues present in Bell's letter. In short, we conclude that Bell's calculations are problematic but that, despite those issues, the overall result, at least for the example of the EPR state, is correct.
After this description of the warm-up example in Bell's paper, we now come to the core of it. As noticed by Bell and as we have seen in detail previously, the EPR state corresponds to a positive Wigner function. However, Bell remarks that this is not always the case and that, for other wavefunctions, the Wigner distribution can take negative values [see, for instance, the case of the Schrödinger cat state (60)]. The next and crucial step of Bell's paper is then to study the CHSH inequality for states corresponding to Wigner functions that are not positive definite. In particular, Bell considers the initial wavefunction of Eq. (108), which contains a free parameter $a$. Bell notices that this wavefunction is not properly normalized but he suggests that it could easily be made so by including a factor $e^{-(q_1+q_2)^2/(2b^2)}$, where $b$ is a new parameter; this gives the wavefunction (109). Notice that, while only the difference $q_1 - q_2$ appears in Eq. (108), this extra factor introduces an additional dependence on $q_1 + q_2$ in the wavefunction. It can be checked that the wavefunction (109) is indeed correctly normalized. However, Bell argues that the limit $b \to \infty$ can be taken from the very beginning so that we can work with (108) and ignore the more complicated form (109). The justification given by Bell is that only relative probabilities will be calculated in the rest of his article. So, for all practical purposes, he argues that one can work with Eq. (108), replacing the proportionality symbol by a $q_1$- and $q_2$-independent "normalization constant" $N_{\rm BELL}$. In other words, $N_{\rm BELL}$ which, according to Eq. (109), satisfies $N_{\rm BELL}^2 = 8/(11\pi a^5 b)\, e^{-(q_1+q_2)^2/b^2}$, is treated as a simple constant. Then, it is easy to calculate the corresponding Wigner function which, if the choice $a = 1$ is made (namely the value considered by Bell in his letter), is given by Eq. (110). This equation is in perfect agreement with Eq. (13) of Bell's letter [27] and, in addition, a comparison with that equation allows us to identify $K$, the "unimportant constant" introduced by Bell. The main point of this example is that the Wigner function (110) can be negative in some region, as can be seen in Fig. 5. Then, we follow the same procedure as the one already used and explained for the EPR wavefunction: the next step consists in calculating the distribution $\rho$. Using the definition of this quantity, see below Eq. (93), one arrives at an expression written in terms of $\tau \equiv t_1 + t_2$. This expression coincides with Eq. (15) of Ref. [27] and, again, we can identify the constant $K$ in Ref. [27], namely $K = N_{\rm BELL}^2$. The final step consists in using the distribution $\rho_{\rm BELL}$ in Eq. (94) in order to calculate the two-point correlators of the pseudo spin operators. This leads to Eq. (112), which is identical to Eq. (18) of Ref. [27] and, therefore, is in perfect agreement with Bell's calculations. Once more this allows us to identify the constant called $K$ by Bell and we have $K = 5N_{\rm BELL}^2$. Finally, one can compute the quantity $B(t_1, t_2, t_1', t_2')$ given in Eq. (90), which is our final result. We have an explicit form of the Bell operator mean value which allows us to test the CHSH inequality.

Figure 6.
The two curves match perfectly well. Since they are negative for some values of $x$, we have violation of the CHSH inequality.
In his paper, Bell calculates $B_{\rm BELL}(t_1, t_2, t_1', t_2')$ for $t_1 = -2x$, $t_2 = x$, $t_1' = 0$ and $t_2' = 3x$. If, given Eq. (112), one writes, as Bell does, $E_{\rm BELL}(t_1, t_2) = 1 - F_{\rm BELL}(\tau)$, then the Bell quantity can be expressed in terms of $F_{\rm BELL}$ alone. In fact, since $F_{\rm BELL}(\tau)$ is a function of $\tau^2$ only, it reduces to $B_{\rm BELL}(-2x, x, 0, 3x) = 2 - [3F_{\rm BELL}(x) - F_{\rm BELL}(3x)]$, which can also be written explicitly, see Eq. (114). In Fig. 6, we have plotted the quantity $[2 - B_{\rm BELL}(-2x, x, 0, 3x)]/N_{\rm BELL}^2$ (solid blue line). When this quantity is negative, the CHSH inequality is violated. We see on the plot that this is indeed the case provided $x > 0.989761$, a conclusion also reached by Bell. Notice that one can obtain this result regardless of the value of $N_{\rm BELL}$. In Fig. 6, we have also represented the quantity $[3F_{\rm BELL}(x) - F_{\rm BELL}(3x)]/N_{\rm BELL}^2$ (dashed red line), where $F_{\rm BELL}$ has been defined above. According to the previous considerations, it should exactly coincide with $[2 - B_{\rm BELL}(-2x, x, 0, 3x)]/N_{\rm BELL}^2$, and one checks that this is indeed the case. The condition $3F_{\rm BELL}(x) - F_{\rm BELL}(3x) \geq 0$ for a non-violation of the CHSH inequality is Eq. (25) of Bell's paper.
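The reduction used above can be spelled out in a few lines. This is a sketch assuming that the CHSH combination of Eq. (90) has the standard form with a minus sign on the last correlator, and that $E(t_1, t_2) = 1 - F(t_1 + t_2)$ with Bell's convention $\tau = t_1 + t_2$:

```latex
\begin{align}
B(-2x, x, 0, 3x)
  &= E(-2x, x) + E(-2x, 3x) + E(0, x) - E(0, 3x) \\
  &= [1 - F(-x)] + [1 - F(x)] + [1 - F(x)] - [1 - F(3x)] \\
  &= 2 - \left[ F(-x) + 2F(x) - F(3x) \right] \\
  &= 2 - \left[ 3F(x) - F(3x) \right]
     \qquad \text{if } F(-x) = F(x).
\end{align}
```

The last step is the only one that uses the fact that $F$ depends on $\tau^2$ only.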
Based on the previous considerations, Bell concludes that the non-positivity of the Wigner function (110) implies a violation of the CHSH inequality. He also adds that "I do not know that the failure of W to be non-negative is a sufficient condition in general for a locality paradox". Although it is fair to say that the main message is not explicitly expressed in this way, it is however clear that Bell's letter suggests a (one-to-one?) correspondence between a violation of the CHSH inequality and the non-positivity of the Wigner distribution. This argument seems to be supported by the EPR example (positive Wigner function and no violation of CHSH) and by the example we have just studied (no positive definite Wigner function and violation of CHSH). After all, the non-positivity of the Wigner function certainly signals that some genuine quantum effects are at play and, when this is the case, it is natural to think that Bell inequality could be violated. Therefore, at first sight, this conclusion appears to be meaningful and correct. It has very important consequences for Cosmology. Indeed, as we have already seen, cosmological perturbations are placed in a two-mode squeezed state, which is a Gaussian state, and, therefore, has a positive definite Wigner function. As a consequence, Bell's result, if true, would imply that no violation of his inequality can be observed in the CMB.

Is Bell's Paper Wrong?
In 1997, the article [30] was published by L. Johansen in Physics Letters A. In brief, this article claims that Bell's paper is wrong. The main argument is that working with Eq. (108), namely with a wavefunction $\Psi_{\rm BELL}(q_1, q_2)$ in which $N_{\rm BELL}$ is just viewed as a constant, is incorrect because, as we have already noticed in the previous section, this wavefunction is not correctly normalized. Ref. [30] quotes the book by A. Peres, "Quantum Theory: Concepts and Methods" where, on p. 80, one is warned that not properly normalizing the wavefunction can lead to negative probabilities or to probabilities larger than one.
Then, Ref. [30] makes its argument more precise and states that Bell's mistake is in fact to treat the normalization factor $N_{\rm BELL}$ as time independent. However, what Ref. [30] does in practice is in fact much more interesting for the question discussed here. The idea is to consider a Wigner function which is positive definite, then use Bell's mathematical trick described above and, finally, show that this implies a violation of the CHSH inequality. Since, according to Ref. [30], one cannot have a CHSH inequality violation if the Wigner function is positive definite, it follows that Bell's mathematical trick and, therefore, Bell's result must be incorrect. This is a reductio ad absurdum proof. What is especially interesting is that the correspondence between the positivity of the Wigner distribution and the impossibility of violating Bell inequality is taken for granted, or considered as obvious, by the author.
Let us now study in detail the results of Ref. [30]. The starting point is the Wigner function of Eq. (115), which is obviously positive definite. In fact, this is the product of the Wigner function of a coherent state with the Wigner function of a squeezed state. One can also check that this Wigner function is correctly normalized. Then, Ref. [30] proceeds and applies Bell's trick, which consists in working with unnormalized states, to the Wigner function (115). In order to see what it means in the present context, the best is to calculate the Wigner function associated with the wavefunction (109). Recall that the wavefunction (109) is the correctly normalized version of the unnormalized wavefunction (108) considered before and used by Bell to show that a non-positive Wigner function may cause a CHSH inequality violation. Assuming the wavefunction (109), the corresponding Wigner function is given by Eq. (116). As discussed before, Bell claims that one can take the limit $b \to \infty$ from the very beginning, which consists in killing the term proportional to $(q_1 + q_2)^2$ and boosting the term proportional to $(p_1 + p_2)^2$ in the argument of the first exponential. In the limit $b \to \infty$, one has $b\sqrt{\pi}\, e^{-b^2 (p_1+p_2)^2/4} \to 2\pi\,\delta(p_1 + p_2)$ and, recalling that $N_{\rm BELL}^2 = 8/(11\pi a^5 b)\, e^{-(q_1+q_2)^2/b^2}$, we see that we exactly arrive at the Wigner function considered by Bell in his article, namely Eq. (110) (for $a = 1$, which is the choice made in Bell's article). The conclusion is that Bell's limit or trick is equivalent to killing the term proportional to $(q_1 + q_2)^2$ and boosting the term $\propto (p_1 + p_2)^2$ in the Wigner function.
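The coefficient $2\pi$ in the delta-function limit quoted above can be checked with a one-line Gaussian integral. This is only a consistency sketch, writing $p \equiv p_1 + p_2$:

```latex
\begin{equation}
\int_{-\infty}^{+\infty} dp \; b\sqrt{\pi}\, e^{-b^2 p^2/4}
  = b\sqrt{\pi}\, \sqrt{\frac{4\pi}{b^2}}
  = 2\pi ,
\end{equation}
```

The integral is independent of $b$ while the width of the Gaussian, of order $2/b$, shrinks to zero as $b \to \infty$, which is the behavior expected of $2\pi\,\delta(p)$.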
Therefore, coming back to Eq. (115) and to the article [30], this corresponds to taking the limit $s \to 0$, leading to Eq. (117), where $K = \sqrt{2/\pi}\, s\, e^{-s^2 (q_1+q_2)^2/2}$. We see that this Wigner function also contains a Dirac function of $p_1 + p_2$ as in Eq. (110), which confirms that Bell's limit has indeed been correctly implemented in Eq. (115). As already noticed before, Ref. [30] remarks that $K$ is what Bell calls "an unimportant constant". Here, we have calculated $K$ in terms of $s$, which is not done in Ref. [30].
Then, we repeat once more the standard procedure. We first calculate $\rho_J(q_1, q_2, t_1, t_2)$ by integrating the Wigner function over $p_1$ and $p_2$. We find Eq. (118), where $\tau \equiv (t_1 + t_2)/2$ (recall that Bell defines $\tau$ as $t_1 + t_2$) and $q_0(\tau) \equiv q_0 + p_0 \tau$. This expression exactly coincides with Eq. (10) of Ref. [30]. The only difference is that the constant $K$ is divided by $\sqrt{\pi}$ instead of $\sqrt{2\pi}$ in the above expression. This factor $\sqrt{2}$ is just due to the fact that Ref. [30] has a slightly different definition of $K$: according to its Eq. (3), it is indeed the overall constant for the Wigner function (117) if the Dirac function is written as $\delta[(p_1 + p_2)/\sqrt{2}]$, while in our case the Dirac function is simply written $\delta(p_1 + p_2)$. This difference accounts for the $\sqrt{2}$ between the two expressions.
Although Ref. [30] is supposed to mimic Bell's paper exactly, there are other differences between the two articles. One, which is only a detail, is that Ref. [30] defines the sign operators, or pseudo spin operators, with $q_0 = 0$, namely $\hat S_1(t_1) = {\rm sgn}(q_1, t_1)$ and $\hat S_2(t_2) = {\rm sgn}(q_2, t_2)$. However, this does not affect the discussion since it was shown before that Bell's result does not depend on $q_0$. This also means that, in Eq. (94), the integrals of $\rho(q_1, q_2, t_1, t_2)$ are now split at $q_1 = 0$ and $q_2 = 0$. Inserting Eq. (118) in this expression, one finds a correlator $E_J(t_1, t_2)$ which coincides with Eq. (14) of Ref. [30] (up to the factor $\sqrt{2}$ already mentioned above). Following Bell, Ref. [30] simply defines $F_J(\tau)$ by $E_J(t_1, t_2) = 1 - F_J(\tau)$, in agreement with Eq. (14) of this paper. Finally, Ref. [30] computes the mean value of the Bell operator given in Eq. (114), namely for $t_1 = -2x$, $t_2 = x$, $t_1' = 0$, $t_2' = 3x$. Following Bell, Ref. [30] studies the function $3F_J(x) - F_J(3x)$ which, if it takes negative values, signals a violation of the CHSH inequality, see the discussion around Eq. (114). Ref. [30] notices that, if, for instance, one chooses $q_0 = 1$ and $p_0 = -1$, this is precisely the case for suitable values of $x$, see Fig. 1 of this article. In Fig. 7, we have checked that, indeed, the function $3F_J(x) - F_J(3x)$ can be negative, see the green solid line. Notice that the scales in Fig. 7 and in Fig. 1 of Ref. [30] do not coincide because of the slight difference in the definition of $\tau$ already signaled before. Therefore, by using Bell's trick, Ref. [30] arrives at a violation of the CHSH inequality starting with a Wigner function which is positive definite. According to this paper, this is impossible because a positive Wigner function necessarily implies that the CHSH inequality cannot be violated. As a consequence, Ref. [30] concludes that the only way out is that Bell's trick, and therefore his entire paper, is incorrect. According to Ref. [30], the deep reason for this mistake is that "one is not allowed to assume that K is time independent".

Are Criticisms Against Bell's Paper Wrong?
Let us now examine in more detail the considerations of Ref. [30] presented in the previous section.
A first remark is that Ref. [30] has $p_0 \neq 0$ while there is no $p_0$ in Bell's article. This difference turns out to be crucial because, while Bell's function $F_{\rm BELL}$ defined in Eq. (112) only depends on $\tau^2$, $F_J$ in Ref. [30], defined in Eq. (119), depends on $\tau$ itself precisely because of the presence of $p_0$ [since $q_0(\tau) = q_0 + p_0\tau$]. Only when $p_0 = 0$ does it become a function of $\tau^2$ only. This has drastic consequences. Indeed, since $F_J(-x) \neq F_J(x)$, one cannot say that $F_J(-x) + 2F_J(x) - F_J(3x) = 3F_J(x) - F_J(3x)$, and the signature of a CHSH inequality violation is $F_J(-x) + 2F_J(x) - F_J(3x) \leq 0$ and no longer $3F_J(x) - F_J(3x) \leq 0$. It turns out that, although $3F_J(x) - F_J(3x)$ does become negative (see the solid green line in Fig. 7), this is not the case for $F_J(-x) + 2F_J(x) - F_J(3x)$ (see the dashed red line in Fig. 7). This clearly invalidates the whole reasoning of Ref. [30]: Bell's mathematical trick has not led to a fake CHSH violation for the Wigner function (115).
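The difference between the two criteria can be illustrated with a short numerical sketch. It assumes the standard CHSH combination with a minus sign on the last correlator, a correlator of the form $E(t_1, t_2) = 1 - F(t_1 + t_2)$ (Bell's convention for $\tau$; Ref. [30] uses half of this), and two toy functions `F_even` and `F_shifted` that are purely hypothetical and are neither $F_{\rm BELL}$ nor $F_J$.

```python
def chsh_bell_times(F, x):
    """B(-2x, x, 0, 3x) for a correlator E(t1, t2) = 1 - F(t1 + t2)."""
    def E(t1, t2):
        return 1.0 - F(t1 + t2)

    t1, t2, t1p, t2p = -2.0 * x, x, 0.0, 3.0 * x
    return E(t1, t2) + E(t1, t2p) + E(t1p, t2) - E(t1p, t2p)

def even_shortcut(F, x):
    """2 - [3 F(x) - F(3x)], valid only when F(-x) = F(x)."""
    return 2.0 - (3.0 * F(x) - F(3.0 * x))

def general_form(F, x):
    """2 - [F(-x) + 2 F(x) - F(3x)], valid for any F."""
    return 2.0 - (F(-x) + 2.0 * F(x) - F(3.0 * x))

def F_even(tau):
    """Hypothetical toy function of tau^2 only."""
    return 0.5 * tau**2 / (1.0 + tau**2)

def F_shifted(tau):
    """Hypothetical toy function that is not even in tau."""
    return 0.5 * (tau - 1.0) ** 2 / (1.0 + (tau - 1.0) ** 2)
```

For `F_even` the shortcut and the full CHSH combination agree, while for `F_shifted` only `general_form` reproduces `chsh_bell_times`; testing the sign of $3F(x) - F(3x)$ is therefore not a valid violation criterion once $F$ stops being even.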
Second, Ref. [30] claims that Bell's mistake is to have ignored the time dependence of the constant $K$. But the constant $K$ was calculated before and reads $K = \sqrt{2/\pi}\, s\, e^{-s^2 (q_1+q_2)^2/2}$. It does not contain any time dependence, so this argument is incorrect as well.
Third, let us notice that the Wigner function considered in Ref. [30], namely Eq. (115), is nothing but a special case of the EPR Wigner function (97) considered in Sec. 4.3. Indeed, if one takes b = 2/s, ε = 2, q 0 → √ 2q 0 , then Eq. (97) becomes identical to Eq. (115) with p 0 = 0. But we have established before that, if the state of the system is the EPR state, then no CHSH violation can occur. This means that if we had found a violation of the CHSH inequality starting from the Wigner function (115), this would have indeed indicated a mathematical inconsistency somewhere, as argued by Ref. [30]. But this is not the case since the CHSH inequality is never violated as can be seen in Fig. 7 where (2 − B J )/K is represented (solid blue line) and is always positive, a conclusion already obtained before from the plot of the combination [F J (−x) + 2F J (x) − F J (3x)]/K represented by the dashed red line. Therefore, this is an additional reason why the argument of Ref. [30], which is entirely based on this belief, would have remained, in any case, problematic. We therefore conclude that the criticisms put forward in Ref. [30] against Bell's paper are incorrect.

Correct or not Correct?
Before closing this section, let us examine again Bell's results. After all, the fact that the criticisms against them are wrong does not mean that Bell's paper is correct. As guessed in Ref. [30], we believe that Bell's treatment of the wavefunction normalization is problematic. We have established before that Bell's main idea is to ignore the dependence on $q_1 + q_2$ by sending $b$ to infinity in the argument of the exponential. But the normalization factor itself carries an inverse power of $b$ [recall that $N_{\rm BELL}^2 = 8/(11\pi a^5 b)\, e^{-(q_1+q_2)^2/b^2}$], so that this limit in fact sends the whole normalization factor, and therefore the whole wavefunction, to zero! The same remark of course applies to the exact Wigner function calculated in Eq. (116). Therefore, this procedure cannot be correct. Moreover, $N_{\rm BELL}$ is a function of $q_1$ and $q_2$ and, as a consequence, when one integrates $\rho_{\rm BELL}$ over these two variables in Eq. (112), it is simply incorrect to treat $N_{\rm BELL}$ as a constant. Therefore, we also reach the conclusion that Bell's article is incorrect, even if for completely different reasons from those put forward in Ref. [30].

Revzen's Theorem
We are apparently in a complex situation: we have found that Bell's paper establishing a connection between the non-positivity of the Wigner function and a CHSH inequality violation is incorrect but we have also reached the conclusion that the criticisms expressed against that paper are wrong as well! Moreover, all these authors seem to agree that, if the Wigner function is positive definite, then no CHSH inequality violation can occur which, we recall, would have important conceptual consequences for Cosmology.
In fact, the situation was clarified in 2004 in Refs. [60] and [61]. In these papers, Revzen establishes that, under certain conditions that we are going to describe, Bell inequality can be violated even if the Wigner distribution is positive definite. Notice that this both invalidates Bell's paper [27], since Revzen's result shows that a Bell inequality violation is not necessarily associated to a negative Wigner function, but also Ref. [30] since its reasoning was entirely based on the fact that violating Bell's inequality is impossible if the Wigner function is positive. In his paper, Revzen mentions Bell's paper [27] but says that "Bell's original argument that nonnegativity of Wigner's function suffices to preclude Bell inequality violation was shown to be inaccurate" in Ref. [30]. He adds that "Difficulties in handling normalization of the EPR state considered by Bell were shown to involve a misleading factor". As explained in the last section, this description is not very accurate since we have just shown that the criticisms of Ref. [30] are in fact not valid.
Let us now come to the main result obtained by Revzen. In brief, Revzen shows that Bell inequality can be violated even if the Wigner function is positive definite provided the variables considered are "improper", namely if the Weyl transform [defined in Eq. (42)] of an operator takes values different from the spectrum of that operator. Let us give an example of a proper and of an improper operator. First, let us consider the pseudo spin operators used by Bell, namely $\hat S = {\rm sgn}(q, t)$. Its Weyl transform only takes the values $\pm 1$. Therefore this operator is proper since these are exactly the values taken by the spectrum of the operator. This explains why, in the EPR state, Bell inequality was not violated in Sec. 4.3. This was a situation where the Wigner function was positive and the operator used to construct the Bell operator was proper.
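As an illustration of why the sign operator is proper, here is a sketch of the computation at $t = 0$. It assumes the common convention $\tilde O(q, p) = \int dx\, e^{-ipx} \langle q + x/2 | \hat O | q - x/2 \rangle$ with $\hbar = 1$, which may differ from Eq. (42) by constant factors:

```latex
\begin{align}
\tilde S(q, p)
  &= \int dx \; e^{-ipx}\,
     \langle q + x/2 |\, \mathrm{sgn}(\hat q) \,| q - x/2 \rangle \\
  &= \int dx \; e^{-ipx}\, \mathrm{sgn}(q - x/2)\, \delta(x)
   = \mathrm{sgn}(q) .
\end{align}
```

The phase-space function takes the values $\pm 1$ only, exactly like the eigenvalues of the operator.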
Let us now give an example of an improper operator. Let us consider the operator $\hat s_z$ of Eq. (123); the reason for this notation will be clarified below. It is built from the eigenvectors $|n\rangle$ of the number operator. It is easy to show that the spectrum of this operator is $\pm 1$ because $\hat s_z^2 = \hat I$. The matrix element of $\hat s_z$ is given by $\langle m|\hat s_z|m'\rangle = \pm\delta_{mm'}$, with a plus sign if $m$ is odd and a minus sign if $m$ is even. This allows us to rewrite $\hat s_z$ in an equivalent form: indeed, one can show that this form leads to the same matrix element, using the properties of the Hermite polynomial $H_m$ of order $m$ [52]. It follows that the Weyl transform of $\hat s_z$ clearly does not take the values $\pm 1$ and, as a consequence, this operator is improper. The Revzen theorem rests on Eqs. (43) and (44). Indeed, according to these equations, the mean value of an operator is the average of its Weyl transform weighted by the Wigner function. If the Weyl transform takes the same values as the spectrum of the operator, it means that any quantum average can be obtained through the usual, classical, laws of random variables. But, if this is so, Bell's theorem precisely tells us that its inequality cannot be violated.
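The last step of this argument can be made concrete with a small enumeration. The sketch below assumes the standard CHSH combination $ab + ab' + a'b - a'b'$ and treats the four outcomes as classical variables taking the values $\pm 1$, which is the situation of proper operators averaged with a positive weight; the weights used in the usage note are arbitrary illustrative numbers.

```python
from itertools import product

def chsh_value(a, a_prime, b, b_prime):
    """CHSH combination for one assignment of classical +/-1 outcomes."""
    return a * b + a * b_prime + a_prime * b - a_prime * b_prime

# All 16 deterministic assignments of the four dichotomic variables.
ASSIGNMENTS = list(product((-1, 1), repeat=4))
VALUES = [chsh_value(*s) for s in ASSIGNMENTS]

def classical_average(weights):
    """Average of the CHSH combination over a non-negative weight function."""
    if len(weights) != len(ASSIGNMENTS) or any(w < 0 for w in weights):
        raise ValueError("need 16 non-negative weights")
    total = float(sum(weights))
    return sum(w * v for w, v in zip(weights, VALUES)) / total
```

Every entry of `VALUES` is either $-2$ or $+2$, so `classical_average(list(range(1, 17)))`, or any other positive weighting, cannot exceed 2 in absolute value. A violation therefore requires either a weight that is not positive or variables whose phase-space values differ from $\pm 1$.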
We conclude that the history of the relationship between the possibility of violating Bell inequality and the positivity of the Wigner function is a long, chaotic and rich one. For Cosmology, this question is absolutely crucial since the Wigner function is, in this case, positive definite. The Revzen theorem establishes the possibility of a Bell inequality violation in the sky, a fascinating possibility indeed. In his paper, Revzen precisely discusses his theorem with the help of a two-mode squeezed state. What was not realized before is that Cosmology provides a perfect situation to illustrate this problem. It was not realized by the cosmology community because the issues related to quantum foundations are, usually, far from its everyday interests and it was not realized by people working on Quantum Mechanics because the inflationary mechanism and the fact that cosmological perturbations are placed in a two-mode squeezed state was largely ignored by people working in this field. In fact, given that Cosmology is the part of physics where the largest squeezing is achieved, one can even argue that it is the most interesting situation to discuss the issues tackled in this section.

Bell Inequality Violation in the CMB?
We have just seen that, even if the CMB is placed in a quantum state with positive Wigner function, Revzen's theorem is compatible with Bell inequality violation in the sky. Based on this result, the next question is of course to identify improper variables in the CMB that could be used for that purpose. In fact, it turns out that it is possible to build improper pseudo spin operators out of a continuous variable, here of course taken to be the Fourier amplitude of curvature perturbations. A first example has been considered by Banaszek and Wódkiewicz (BW) in Ref. [55] and Chen, Pan, Hou and Zhang in Ref. [56]. They have defined three operators $\hat s_x$, $\hat s_y$ and $\hat s_z$ in terms of the eigenvectors $|n_k\rangle$ of the particle number operator. These operators are spin operators because they satisfy $[\hat s_x, \hat s_y] = 2i\hat s_z$, $[\hat s_x, \hat s_z] = -2i\hat s_y$ and $[\hat s_y, \hat s_z] = 2i\hat s_x$ and because their spectrum is $\pm 1$. Notice that the z-component is precisely the operator we discussed in Eq. (123). Another example of fictitious spin operators are those introduced by Gour, Khanna, Mann and Revzen (GKMR) in Ref. [57], which are built from two states $|E_k\rangle$ and $|O_k\rangle$, themselves defined in terms of $|q_k\rangle$, an eigenstate of the position operator for the mode $k$. Let us notice that, in principle, it is not an eigenstate of curvature perturbations, see also Eq. (137) below. There exists a third way to define fictitious spin operators, as shown by Larsson in Ref. [62]. The z-component of the Larsson spin operators involves a free parameter that can be arbitrarily chosen. The other components can then be introduced once the corresponding step operator is defined. It can be checked that, among each set of fictitious spin operators, there are at least two spin components that are improper operators [24][25][26]. As discussed before, this implies that, despite the positivity of the Wigner function, they can be used to define a quantity that violates Bell inequality. What should be done is just to follow the pioneering ideas of Bell's paper and implement Bell inequality with the pseudo spin operators in its CHSH version.

Figure 8.
We notice that Bell inequality is violated since $|\langle\Psi_{2\,{\rm sq}}|\hat B(k, -k)|\Psi_{2\,{\rm sq}}\rangle| \geq 2$. Right panel: same quantity for the Larsson fictitious spin operators. The value of the Bell operator mean, after maximization, is indicated by a color code. If less than two, the color is blue; otherwise the corresponding value is indicated by the color bar. Bell inequality is violated in some region of the squeezing parameter space corresponding to large squeezing and small squeezing angle. Credit: Refs. [24][25][26].
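The spin algebra quoted above for the number-state operators can be verified on a truncated Fock space. The explicit definitions used below are an assumption: they are the parity-based form commonly found in the literature, $\hat s_z = \sum_n (|2n+1\rangle\langle 2n+1| - |2n\rangle\langle 2n|)$ and $\hat s_+ = \sum_n |2n+1\rangle\langle 2n|$, with $\hat s_x = \hat s_+ + \hat s_-$ and $\hat s_y = -i(\hat s_+ - \hat s_-)$. Only the sign convention of $\hat s_z$ is confirmed by the text; the helper names are illustrative.

```python
def zeros(d):
    return [[0j] * d for _ in range(d)]

def matmul(A, B):
    d = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(d)) for j in range(d)]
            for i in range(d)]

def scale(c, A):
    return [[c * x for x in row] for row in A]

def sub(A, B):
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def commutator(A, B):
    return sub(matmul(A, B), matmul(B, A))

def max_abs_diff(A, B):
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))

def spin_operators(n_pairs):
    """Assumed parity-based operators on the first 2*n_pairs number states.

    Matrix entry M[i][j] stands for <i|M|j>. The truncation is exact because
    each pair (|2n>, |2n+1>) forms an independent two-level block.
    """
    d = 2 * n_pairs
    sz, sp, sm = zeros(d), zeros(d), zeros(d)
    for n in range(n_pairs):
        even, odd = 2 * n, 2 * n + 1
        sz[odd][odd], sz[even][even] = 1 + 0j, -1 + 0j
        sp[odd][even] = 1 + 0j   # |2n+1><2n|
        sm[even][odd] = 1 + 0j   # Hermitian conjugate of s_plus
    sx = [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(sp, sm)]
    sy = scale(-1j, sub(sp, sm))
    return sx, sy, sz

def identity(d):
    return [[(1 + 0j) if i == j else 0j for j in range(d)] for i in range(d)]
```

With `sx, sy, sz = spin_operators(3)`, comparing `commutator(sx, sy)` with `scale(2j, sz)` through `max_abs_diff` gives zero, and `matmul(sz, sz)` is the identity, so the spectrum is indeed $\pm 1$.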
In practice, this is done as follows. First, we view the CMB as a bipartite system made of two sub-systems corresponding to the modes $k$ and $-k$. Second, we calculate the quantity $\langle\Psi_{2\,{\rm sq}}|\hat B(k, -k)|\Psi_{2\,{\rm sq}}\rangle = E(\theta_n, \theta_m) + E(\theta_n, \theta_m') + E(\theta_n', \theta_m) - E(\theta_n', \theta_m')$, where the two-point correlation function $E(n, m)$ is defined by $E(n, m) = \langle\Psi_{2\,{\rm sq}}|\, n\cdot\hat S(k) \otimes m\cdot\hat S(-k)\, |\Psi_{2\,{\rm sq}}\rangle$, $\hat S$ denoting a spin operator of one of the three types introduced before. The vector $n$ is a unit vector that can be written as $n = (\sin\theta_n\cos\varphi_n, \sin\theta_n\sin\varphi_n, \cos\theta_n)$ (in the following, we choose vanishing azimuthal angles). We have calculated the mean of the Bell operator for the three sets of pseudo spin operators introduced before, see Refs. [25] and [26], and the result is displayed in Fig. 8. For each of them, we find that Bell inequality can be violated (and, as a consistency check, we find that $|\langle\Psi_{2\,{\rm sq}}|\hat B(k, -k)|\Psi_{2\,{\rm sq}}\rangle| < 2\sqrt{2}$, the Cirel'son bound [63]).
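The structure of this computation can be sketched with a toy correlator. The form $E(\theta_n, \theta_m) = \cos\theta_n\cos\theta_m + K\sin\theta_n\sin\theta_m$ used below is an assumption made for illustration, with $K \in [0, 1]$ a hypothetical correlation strength; the actual correlators of Refs. [25] and [26] depend on the squeezing parameter and angle and are not reproduced here.

```python
import math

def correlator(theta_n, theta_m, K):
    """Assumed two-point function for vanishing azimuthal angles."""
    return (math.cos(theta_n) * math.cos(theta_m)
            + K * math.sin(theta_n) * math.sin(theta_m))

def bell_mean(theta_n, theta_m, theta_n_prime, theta_m_prime, K):
    """CHSH combination E(n,m) + E(n,m') + E(n',m) - E(n',m')."""
    return (correlator(theta_n, theta_m, K)
            + correlator(theta_n, theta_m_prime, K)
            + correlator(theta_n_prime, theta_m, K)
            - correlator(theta_n_prime, theta_m_prime, K))

def bell_mean_at_good_angles(K):
    """Angles n = 0, n' = pi/2, m = -m' = atan(K), giving 2 sqrt(1 + K^2)."""
    theta = math.atan(K)
    return bell_mean(0.0, theta, math.pi / 2.0, -theta, K)
```

For this toy form, `bell_mean_at_good_angles(K)` equals $2\sqrt{1 + K^2}$: it exceeds 2 as soon as $K > 0$ and stays below the Cirel'son bound $2\sqrt{2}$, which it only reaches for $K = 1$.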

Discussion
Let us recap what has been achieved. According to inflation, curvature perturbations that source CMB anisotropies are placed in a two-mode squeezed state. This state has a positive Wigner function but, nevertheless, is highly non-classical. One way to highlight this non-classical nature is by studying the Bell inequality. It turns out that, from the curvature perturbation, which is a continuous variable, one can extract dichotomic spin operators (following Bell's paper), which allows us to study the Bell inequality in its CHSH incarnation. We have checked that, if the system is placed in a two mode squeezed state, then this inequality is indeed violated. Observing this violation would be the definitive proof that CMB anisotropies are of quantum mechanical origin. Is it possible in practice?
The first question is what it means to "measure" the spin operators. Concretely, we measure the temperature anisotropy. But, through the so-called Sachs-Wolfe effect, $\delta T/T$ is in fact a direct measurement of the curvature perturbation, as explained in Eq. (4). The definition of the spin operators introduced before, however, involves $\hat q_k$ and not $\hat\zeta_k$. These two quantities are related by Eq. (137). We see that the knowledge of $\hat\zeta_k$ is not sufficient to infer $\hat q_k$, since its time derivative is also needed. However, the amplitude of this derivative is in fact related to the decaying mode, as was established before in Eq. (7). Since the curvature perturbation is conserved on large scales, this decaying mode is in fact negligible. If one accepts this reasoning, then a measurement of $\hat\zeta_k$ is equivalent to a measurement of $\hat q_k$. The next question is whether this allows us to measure the spin operators. Let us discuss this question for the GKMR operators defined in Eqs. (130), (131) and (132) (the same conclusion applies to the two other sets). It is interesting to notice that $\hat S_x(k)$ commutes with $\hat q_k$, which means that $\hat S_x(k)$ can be, in principle, inferred from a measurement of $\hat q_k$. But for $\hat S_y$ one has $\langle q_k|[\hat S_y(k), \hat q_k]|q_k'\rangle \neq 0$.
This means that measuring $\hat q_k$ is not sufficient to measure the y-component. This implies that we would need to measure another, non-commuting, operator, which can only be the time derivative of $\hat\zeta_k$. There are two problems with this idea. First, we would need to measure the system again and it is not clear what this means when what is measured is the Universe itself! Maybe, by applying some ergodic reasoning, this would mean measuring different horizon patches on the last scattering surface? But, even if we succeed in doing so, measuring the derivative of the curvature perturbation means measuring the decaying mode. Although this is a priori possible in principle, in practice it is clearly impossible. Recall that its amplitude is typically of the order $e^{-50}$, which seems out of reach forever.

Conclusions
Let us now present our conclusions. In this article, we have studied the quantum mechanical aspects of inflationary perturbations. According to inflation, CMB anisotropies and large scale structures are nothing but quantum fluctuations amplified by gravitational instability and stretched to cosmological scales by the cosmic expansion. This raises two questions: first, data analysis in Cosmology is usually done without any reference to the quantum origin of the perturbations since these ones look classical to us. Then, how can we understand this quantum-to-classical transition? Second, if the perturbations are really of quantum-mechanical origin, is there a signature of this origin left over somewhere in the cosmological data?
It seems fair to acknowledge the frustrating aspect of the results established above. We have shown that the quantum mechanical origin of the perturbations is still encoded in the CMB map which, in some sense, contains many Schrödinger cats, a fascinating conclusion indeed! However, highlighting this signature essentially appears to be impossible in practice due to the smallness of the signal. We have therefore to find another method to check the quantum origin of the galaxies. A suggestion recently made in Ref. [24] is to use Leggett-Garg inequality rather than Bell inequality since the former one only requires the measurement of a single spin component. The price to pay, however, is that it should be done at different times that is to say at different redshifts. Maybe a future experiment such as Euclid [64] could be useful for that purpose since it plans to perform a "tomography" of the power spectrum. The quest continues! To address these tricky questions, we have also shown that the ideas and questions that John Bell discusses in his letter "EPR correlations and EPW distributions", despite the technical problems of his paper, are crucial to investigate these issues. They allow us to better understand the quantum-to-classical transition of the fluctuations and they help us to imagine what could be an unambiguous signature of their origin.
The contribution of John Bell to Cosmology is usually summarized by his letter "Quantum mechanics for cosmologists" (this is chapter 15 of the book "Speakable and unspeakable in quantum mechanics") where he emphasizes that the interpretational issues of Quantum Mechanics are exacerbated in the context of Cosmology, see the famous quote "Was the world wavefunction waiting to jump for thousands of millions of years until a single-celled living creature appeared? Or did it have to wait a little longer for some more highly qualified measurer -with a Ph. D.?". Here, we have argued that the article "EPR correlations and EPW distributions" is another, unrecognized, Bell's contribution to Cosmology and to the theory of inflation which is probably even more important than "Quantum mechanics for cosmologists". The intriguing point is that Bell's contribution was written before the cosmology community, thanks to Grishchuk and Sidorov, realized that the inflationary perturbations are placed in a two-mode squeezed state, namely an entangled state with positive definite Wigner function.
Once more, John Bell has been a precursor!