*Entropy* **2019**, *21*(2), 185; https://doi.org/10.3390/e21020185

Article

Bounding the Plausibility of Physical Theories in a Device-Independent Setting via Hypothesis Testing

^{1}

Department of Physics and Center for Quantum Frontiers of Research & Technology (QFort), National Cheng Kung University, Tainan 701, Taiwan

^{2}

NTT Basic Research Laboratories and NTT Research Center for Theoretical Quantum Physics, NTT Corporation, 3-1 Morinosato-Wakamiya, Atsugi, Kanagawa 243-0198, Japan

^{*}

Authors to whom correspondence should be addressed.

Received: 15 December 2018 / Accepted: 12 February 2019 / Published: 15 February 2019

## Abstract

The device-independent approach to physics is one where conclusions about physical systems (and hence of Nature) are drawn directly and solely from the observed correlations between measurement outcomes. This operational approach to physics arose as a byproduct of Bell’s seminal work to distinguish, via a Bell test, quantum correlations from the set of correlations allowed by local-hidden-variable theories. In practice, since one can only perform a finite number of experimental trials, deciding whether an empirical observation is compatible with some class of physical theories will have to be carried out via the task of hypothesis testing. In this paper, we show that the prediction-based-ratio method, initially developed for performing a hypothesis test of local-hidden-variable theories, can equally well be applied to test many other classes of physical theories, such as those constrained only by the nonsignaling principle, and those that are constrained to produce any of the outer approximations to the quantum set of correlations due to Navascués-Pironio-Acín. We numerically simulate Bell tests using hypothetical nonlocal sources of correlations to illustrate the applicability of the method in both the independent and identically distributed (i.i.d.) scenario and the non-i.i.d. scenario. As a further application, we demonstrate how this method allows us to unveil an apparent violation of the nonsignaling conditions in certain experimental data collected in a Bell test. This, in turn, highlights the importance of the randomization of measurement settings, as well as a consistency check of the nonsignaling conditions in a Bell test.

Keywords: quantum nonlocality; Bell test; device-independent; p-value; hypothesis testing; nonsignaling

## 1. Introduction

In physics, the terminology “device-independent” apparently made its first appearance in Ref. [1] where the authors drew a connection between the celebrated discovery by Bell [2] and the vibrant field of quantum cryptography [3]. As of today, device-independent quantum information has become a well-established research area where Bell-inequality-violating correlations find applications not only in the distribution of secret keys [4,5,6] (see also Ref. [7]), but also in the generation of random bits [8,9,10], as well as in the assessment of uncharacterized devices (see, e.g., Refs. [11,12,13,14,15,16,17]). For a comprehensive review, see Refs. [18,19].

A device-independent approach to physics, however, could be traced back, for example, to the work of Bell [2]. There, he showed that any local-hidden-variable (LHV) theory [20] must be incompatible with certain quantum predictions. The proof is “device-independent” in the sense that one needs no further assumption about the nature of the theory (including the detailed functioning of any devices that one may use to test the theory). Rather, the proof relies on a common ingredient of operational physical theories—correlations between measurement outcomes, i.e., the probability of getting particular measurement outcomes conditioned on certain measurement choices being made—to manifest the incompatibility.

By now, this incompatibility has been verified in various loophole-free Bell tests, such as those reported in Refs. [21,22,23,24,25]. Importantly, any real experiment must involve only a finite number of experimental trials. Statistical fluctuations must thus be carefully taken into account in order to draw any conclusion against a hypothetical theory, such as an LHV theory. For example, using the observed relative frequencies as a naïve estimator of the underlying correlations would generically (see, e.g., Refs. [26,27]) lead to a violation of the nonsignaling conditions [28,29]. Since the assumption of nonsignaling is a prerequisite for any Bell test, it is only natural that a Bell test of LHV theories must also be accompanied by the corresponding test of this assumption [22,23,24,25,30] (see also Refs. [31,32,33]).

The effects of statistical fluctuations in a Bell test were (in fact, still are) often reported in terms of the number of standard deviations by which the estimated Bell violation exceeds the corresponding local bound (see, e.g., Refs. [34,35,36,37,38,39,40,41,42]). However, there are several problems with such a statement (see Refs. [19,43] for detailed discussions). Alternatively, as a common practice in hypothesis testing, one could also present the p-value according to a certain null hypothesis (e.g., the hypothesis that an LHV theory holds true). The corresponding p-value then describes the probability that the statistical model (associated with the null hypothesis) produces some quantity (e.g., the amount of Bell-inequality violation) at least as extreme as that observed.

A pioneering work in this regard is that due to Gill [44], where he presented a p-value upper bound according to the hypothesis of an LHV theory based on the violation of the Clauser-Horne-Shimony-Holt (CHSH) [45] Bell inequality. A few years later, a systematic method that works directly on the observed data (without relying on any predetermined Bell inequality), known as the prediction-based-ratio method, was developed by one of the present authors and coworkers [43] (see also Ref. [46]). This method was designed for computing a p-value upper bound, based on the data collected in a Bell test, according to LHV theories. As we shall show in this work, essentially the same method can be applied for the hypothesis testing of some other nonlocal physical theories, thus allowing us to bound the plausibility of physical theories beyond LHV theories.

Indeed, since the pioneering work by Popescu and Rohrlich [28], there has been an ongoing effort (see, e.g., Refs. [47,48,49,50]) to find well-motivated physical [51,52] or information-theoretic [53,54,55,56] principles to recover precisely the set of quantum correlations. Unfortunately, none of these has succeeded. Rather, they each define a set of correlations that outer approximates the quantum set [57]. In other words, they also contain correlations that are more nonlocal than those allowed by quantum theory. For example, the so-called “almost-quantum” [50] set of correlations is one such superset of the quantum set, yet it satisfies essentially all the proposed principles known to date. In the rest of this work, it suffices to think of this set as a fairly good outer approximation to the quantum set of correlations.

In this work, we show that the prediction-based-ratio method can be applied to test any physical theory that is constrained to produce correlations that are amenable to a semidefinite programming [58] characterization. In particular, it can be applied to test any physical theory that is constrained to produce nonsignaling [28] correlations, any theory that respects macroscopic locality [51], or any theory that gives rise to the almost-quantum [50] set of correlations.

## 2. Methods

#### 2.1. Preliminaries

For a complete description of the prediction-based-ratio method and a comparison of its strength against the martingale-based method [44], we refer the reader to Ref. [43]. Here, we merely recall the necessary ingredients of the prediction-based-ratio method and show how it can be used to achieve the purpose of bounding the plausibility of physical theories based on the data collected in a Bell test, with minimal assumptions. Making this possibility evident and demonstrating how well it works in practice are the main contributions of the present work.

For simplicity, the following discussions are based on a Bell test that involves two parties (Alice and Bob) who are each allowed to perform one of two measurements randomly selected at each trial, each of which produces one of two possible outcomes. Generalization to other Bell scenarios will be evident. To this end, let us denote the measurement choice (input) of Alice (Bob) by x (y) and the corresponding measurement outcome (output) by a (b), where $a,b,x,y\in \{0,1\}$. The extent to which the distant measurement outcomes are correlated is then succinctly summarized by the collection of joint conditional probability distributions $\overrightarrow{P}=\left\{P(a,b|x,y)\right\}{}_{a,b,x,y}$.

In an LHV theory, the outcome probability distributions can be produced with the help of some LHV $\lambda $ (distributed according to ${q}_{\lambda}$) via the local response functions satisfying $0\le {P}_{\lambda}^{A}\left(a\right|x),{P}_{\lambda}^{B}\left(b\right|y)\le 1$ and ${\sum}_{a}{P}_{\lambda}^{A}\left(a\right|x)={\sum}_{b}{P}_{\lambda}^{B}\left(b\right|y)=1$ such that [2]:

$$P(a,b|x,y)=\sum _{\lambda}{q}_{\lambda}{P}_{\lambda}^{A}\left(a\right|x){P}_{\lambda}^{B}\left(b\right|y).$$

Hereafter, we refer to any $\overrightarrow{P}$ that can be decomposed in the above manner as a (Bell-) local correlation and denote the set of such correlations as $\mathcal{L}$.

In contrast, if Alice and Bob conduct the experiment by performing local measurements on some shared quantum state $\rho $, quantum theory predicts setting-dependent outcome distributions for all $a,b,x,y$ of the form:

$$P(a,b|x,y)=\mathrm{tr}(\rho \phantom{\rule{0.166667em}{0ex}}{M}_{a|x}^{A}\otimes {M}_{b|y}^{B}),$$

where ${M}_{a|x}^{A}$ and ${M}_{b|y}^{B}$ denote, respectively, the local positive-operator-valued-measure element associated with the a-th outcome of Alice’s x-th measurement and the b-th outcome of Bob’s y-th measurement. Accordingly, we refer to any $\overrightarrow{P}$ that can be written in the form of Equation (2) as a quantum correlation and the set of such correlations as $\mathcal{Q}$.

Importantly, both local and quantum correlations satisfy the nonsignaling conditions [29]:

$$\begin{array}{c}\hfill {P}_{A}\left(a\right|x,y)={P}_{A}\left(a\right|x,{y}^{\prime}):={P}_{A}\left(a\right|x)\phantom{\rule{1.em}{0ex}}\forall \phantom{\rule{0.166667em}{0ex}}a,x,y,{y}^{\prime},\\ \hfill {P}_{B}\left(b\right|x,y)={P}_{B}\left(b\right|{x}^{\prime},y):={P}_{B}\left(b\right|y)\phantom{\rule{1.em}{0ex}}\forall \phantom{\rule{0.166667em}{0ex}}b,x,{x}^{\prime},y,\end{array}$$

where ${P}_{A}\left(a\right|x,y):={\sum}_{b}P(a,b|x,y)$ and ${P}_{B}\left(b\right|x,y):={\sum}_{a}P(a,b|x,y)$ are marginal probability distributions of $P(a,b|x,y)$. Should (any of) these conditions be violated in a way that is independent of spatial separation, Alice and Bob would be able to communicate faster-than-light [28] via the choice of measurement $x,y$. We shall denote the set of $\overrightarrow{P}$ satisfying Equation (3) as $\mathcal{NS}$. It is known that $\mathcal{L}$, $\mathcal{Q}$, and $\mathcal{NS}$ are convex sets and that they satisfy the strict inclusion relations $\mathcal{L}\subset \mathcal{Q}\subset \mathcal{NS}$ (see, e.g., Ref. [19] and references therein).
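As an illustration only, the nonsignaling conditions of Equation (3) can be checked numerically for the two-input, two-output scenario considered here. The following minimal Python sketch assumes that a correlation is stored as a dictionary keyed by `(a, b, x, y)`; the function names and the two toy correlations are hypothetical and are not taken from the original analysis.

```python
from itertools import product

def marginal_A(P, a, x, y):
    """P_A(a|x,y) = sum_b P(a,b|x,y)."""
    return sum(P[(a, b, x, y)] for b in (0, 1))

def marginal_B(P, b, x, y):
    """P_B(b|x,y) = sum_a P(a,b|x,y)."""
    return sum(P[(a, b, x, y)] for a in (0, 1))

def is_nonsignaling(P, tol=1e-12):
    """Check Equation (3) for binary inputs and binary outputs."""
    for a, x in product((0, 1), repeat=2):
        # Alice's marginal must not depend on Bob's input y.
        if abs(marginal_A(P, a, x, 0) - marginal_A(P, a, x, 1)) > tol:
            return False
    for b, y in product((0, 1), repeat=2):
        # Bob's marginal must not depend on Alice's input x.
        if abs(marginal_B(P, b, 0, y) - marginal_B(P, b, 1, y)) > tol:
            return False
    return True

# White noise: every outcome pair is equally likely for every setting pair.
white_noise = {k: 0.25 for k in product((0, 1), repeat=4)}

# A signaling toy example: Alice's output copies Bob's input (a = y), b = 0.
signaling = {(a, b, x, y): float(a == y and b == 0)
             for a, b, x, y in product((0, 1), repeat=4)}

print(is_nonsignaling(white_noise))  # True
print(is_nonsignaling(signaling))    # False
```

The tolerance `tol` is an arbitrary choice that only absorbs floating-point rounding; it is not a statistical test of the nonsignaling hypothesis.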

A few other convex sets of correlations are worth mentioning for the purpose of subsequent discussions. To this end, note that deciding if a given $\overrightarrow{P}$ is in $\mathcal{Q}$ is generally a difficult problem. However, the characterization of $\mathcal{Q}$ can, in principle, be achieved by solving a converging hierarchy of semidefinite programs [58] due to Navascués, Pironio, and Acín (NPA) [59,60] (see also Refs. [61,62]). The lowest level outer approximation of $\mathcal{Q}$ in this hierarchy, often denoted by ${\mathcal{Q}}_{1}\supset \mathcal{Q}$, happens to be exactly the set of correlations that is characterized by the physical principle of macroscopic locality [51]. A finer outer approximation of $\mathcal{Q}$ corresponding to the lowest-level hierarchy of Ref. [62], which we denote by $\tilde{\mathcal{Q}}$, is known in the literature as the almost-quantum set [50], as it appears to satisfy all the physical principles that have been proposed to characterize $\mathcal{Q}$. In Section 3, we use $\tilde{\mathcal{Q}}$ and $\mathcal{NS}$ as examples to illustrate how the prediction-based-ratio method can be adapted to test physical theories that are constrained to produce correlations from these sets.

#### 2.2. Finite Statistics and the Prediction-Based-Ratio Method

Coming back to an actual Bell test, let ${N}_{\mathrm{total}}$ be the total number of experimental trials carried out during the course of the experiment. During each experimental trial, x and y are to be chosen randomly according to some fixed probability distribution ${P}_{xy}$ (this distribution may be varied from one trial to another but, for simplicity of discussion, we consider in this work only the case where it is fixed once and for all before the experiment begins). From the data collected in a Bell test, a naïve (but very commonly adopted) way to estimate the correlation $\overrightarrow{P}$ between measurement outcomes is to compute the relative frequencies $\overrightarrow{f}$ with which each combination of outcomes $(a,b)$ occurs given the choice of measurement $(x,y)$, i.e.,

$$f(a,b|x,y)=\frac{{N}_{a,b,x,y}}{{N}_{x,y}},$$

where ${N}_{a,b,x,y}$ is the total number of trials in which the event $(a,b,x,y)$ is registered and ${N}_{x,y}={\sum}_{a,b}{N}_{a,b,x,y}$ is the number of times the particular combination of measurement settings $(x,y)$ is chosen. By definition, ${N}_{\mathrm{total}}={\sum}_{x,y}{N}_{x,y}$.
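For concreteness, the relative frequencies of Equation (4) could be computed from raw trial records along the following lines. This is a minimal sketch that assumes each trial is recorded as a tuple `(a, b, x, y)` with binary entries; the function name and the toy data are hypothetical.

```python
from collections import Counter
from itertools import product

def relative_frequencies(trials):
    """Return f(a,b|x,y) = N_{a,b,x,y} / N_{x,y} from a list of (a, b, x, y)."""
    counts = Counter(trials)                                    # N_{a,b,x,y}
    setting_counts = Counter((x, y) for _, _, x, y in trials)   # N_{x,y}
    f = {}
    for a, b, x, y in product((0, 1), repeat=4):
        n_xy = setting_counts[(x, y)]
        if n_xy == 0:
            raise ValueError(f"setting pair {(x, y)} was never chosen")
        f[(a, b, x, y)] = counts[(a, b, x, y)] / n_xy
    return f

# Toy data (assumed for illustration): one tuple (a, b, x, y) per trial.
trials = [(0, 0, 0, 0), (0, 0, 0, 0), (1, 1, 0, 0), (0, 1, 0, 0),
          (0, 0, 0, 1), (1, 1, 0, 1),
          (0, 0, 1, 0), (1, 1, 1, 0),
          (0, 1, 1, 1), (1, 0, 1, 1)]
f = relative_frequencies(trials)
print(f[(0, 0, 0, 0)])  # 0.5
```

In this toy data set, Alice's estimated marginal for $a=0$, $x=0$ equals 3/4 when $y=0$ but 1/2 when $y=1$, a small-sample instance of how raw relative frequencies can fail the nonsignaling conditions even when the underlying source satisfies them.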

If the experimental trials are independent and identically distributed (i.i.d.) corresponding to a fixed state $\rho $ with fixed measurement strategies ${\{{M}_{a|x}^{A}\}}_{a,x},{\{{M}_{b|y}^{B}\}}_{b,y}$, then in the asymptotic limit, $\lim_{{N}_{\mathrm{total}}\to \infty}f(a,b|x,y)=P(a,b|x,y)$ where $\overrightarrow{P}$ here would satisfy Equation (2). In this limit, the amount of statistical evidence in the data against a particular hypothesis $\mathfrak{H}$ can be quantified by the Kullback-Leibler (KL) divergence [63] (also known as the relative entropy) from $\overrightarrow{P}$ to $\mathcal{L}$; see Refs. [64,65] for a detailed explanation with quantum experiments. We remark that the KL divergence is directly related to the Fisher information metric and so it measures the distinguishability of a distribution from its neighborhood. This provides a motivation for using the KL divergence as a measure of statistical evidence.

In the (original) prediction-based-ratio method of Ref. [43] (see also Ref. [66]), the hypothesis of interest is that the experimental data can be produced using an LHV theory, in other words, that the underlying correlation $\overrightarrow{P}\in \mathcal{L}$. For convenience, we shall refer to this hypothesis as $\mathfrak{L}$. In this case, given $\overrightarrow{f}$ and ${P}_{xy}$, the relevant KL divergence from $\overrightarrow{f}$ to $\mathcal{L}$ reads as

$${D}_{\mathrm{KL}}\left(\overrightarrow{f}\left|\right|\mathcal{L}\right):=\phantom{\rule{0.166667em}{0ex}}\underset{\overrightarrow{P}\in \mathcal{L}}{min}\sum _{a,b,x,y}{P}_{xy}f(a,b|x,y)log\left[\frac{f(a,b|x,y)}{P(a,b|x,y)}\right]$$

As the objective function in Equation (5) is strictly convex in $\overrightarrow{P}$ and the feasible set $\mathcal{L}$ is convex, the minimizer of the above optimization problem, which we shall denote by ${\overrightarrow{P}}_{\mathrm{KL}}^{\mathcal{L},*}$, is unique (see, e.g., Ref. [27]). It follows from the results presented in Ref. [43] that this unique minimizer ${\overrightarrow{P}}_{\mathrm{KL}}^{\mathcal{L},*}$ can be used to construct a Bell inequality:

$$\sum _{a,b,x,y}R(a,b,x,y){P}_{xy}P(a,b|x,y)\stackrel{\mathcal{L}}{\le}1,$$

where the non-negative coefficients of the Bell inequality are defined via the ratios

$$R(a,b,x,y):=\frac{f(a,b|x,y)}{{\overrightarrow{P}}_{\mathrm{KL}}^{\mathcal{L},*}(a,b|x,y)}.$$

This Bell inequality is the key ingredient of the prediction-based-ratio method and is ideally suited for performing a hypothesis test of $\mathfrak{L}$.
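To make the construction concrete, the sketch below approximates the minimizer of Equation (5) over $\mathcal{L}$ and then evaluates the resulting Bell inequality of Equation (6) on all 16 local deterministic strategies. It is an illustration under stated assumptions rather than the procedure of Ref. [43]: the minimization is done with a simple expectation-maximization-type update on the mixture weights of the deterministic strategies (a technique swapped in here for simplicity), the helper names are hypothetical, and the "training frequencies" are stood in for by an ideal mixture of a Popescu-Rohrlich box and white noise of the kind used later in Section 3.1.

```python
from itertools import product

KEYS = list(product((0, 1), repeat=4))  # all (a, b, x, y)

def deterministic_strategies():
    """The 16 local deterministic points: a = fa[x], b = fb[y]."""
    points = []
    for fa in product((0, 1), repeat=2):
        for fb in product((0, 1), repeat=2):
            points.append({(a, b, x, y): float(a == fa[x] and b == fb[y])
                           for a, b, x, y in KEYS})
    return points

def kl_minimizer_local(f, p_xy, iters=500):
    """EM-type update of the mixture weights q over deterministic points.

    Returns the local correlation sum_j q_j D_j that approximates the
    minimizer of Equation (5). The iteration count is an arbitrary choice.
    """
    D = deterministic_strategies()
    q = [1.0 / len(D)] * len(D)
    for _ in range(iters):
        model = {k: sum(qi * Di[k] for qi, Di in zip(q, D)) for k in KEYS}
        # Multiply each weight by sum_k P_xy f(k) D_j(k) / model(k).
        q = [qi * sum(p_xy[k[2:]] * f[k] * Di[k] / model[k]
                      for k in KEYS if f[k] > 0 and qi * Di[k] > 0)
             for qi, Di in zip(q, D)]
    return {k: sum(qi * Di[k] for qi, Di in zip(q, D)) for k in KEYS}

def isotropic(v):
    """v * (PR box) + (1 - v) * (white noise); used as a stand-in for f."""
    return {(a, b, x, y): v * 0.5 * ((a ^ b) == (x & y)) + (1 - v) * 0.25
            for a, b, x, y in KEYS}

p_xy = {(x, y): 0.25 for x, y in product((0, 1), repeat=2)}
f = isotropic(0.72)                      # assumed training frequencies
p_star = kl_minimizer_local(f, p_xy)
# Ratios of Equation (6); assumes p_star[k] > 0, which holds in this example.
R = {k: f[k] / p_star[k] for k in KEYS}

# Left-hand side of the Bell inequality for each deterministic strategy.
bell_values = [sum(R[k] * p_xy[k[2:]] * D[k] for k in KEYS)
               for D in deterministic_strategies()]
print(max(bell_values))  # should not exceed 1 up to numerical error
```

Because every local correlation is a convex mixture of the deterministic strategies, checking the inequality on those 16 points suffices to check it on all of $\mathcal{L}$.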

To understand the method, we introduce the random variables X and Y to denote the random inputs and the variables A and B to denote the random outputs of Alice and Bob at a trial. The ability to select measurement settings randomly, in particular, is an indispensable prerequisite of the prediction-based-ratio method, or more generally, a proper Bell test (see, e.g., Ref. [20]). We further denote the possible values of inputs and outputs by the respective lower-case letters. Then we can think of the ratio R in Equation (6) as a non-negative function of the inputs $X,Y$ and outputs $A,B$ at each experimental trial such that its expectation according to an arbitrary $\overrightarrow{P}\in \mathcal{L}$ with the fixed input distribution ${P}_{xy}$ satisfies

$$\langle R(A,B,X,Y)\rangle \stackrel{\mathcal{L}}{\le}1.$$

Equation (7) is an alternative way of expressing the Bell inequality of Equation (6). A real experiment necessarily involves only a finite number ${N}_{\mathrm{total}}=({N}_{\mathrm{est}}+{N}_{\mathrm{test}})$ of experimental trials in time order. Here, we have split the experimental data into two sets: the data from the first ${N}_{\mathrm{est}}$ trials as the training data and the data from the remaining ${N}_{\mathrm{test}}$ trials as the hypothesis-testing data. In practice, we first construct the function R using the training data and then perform a hypothesis test with the test data. Since the ratio R is determined before the hypothesis test based on the prediction according to the training data, R is called a prediction-based ratio.

Given a prediction-based ratio and a finite number ${N}_{\mathrm{test}}$ of test data, we can quantify the evidence against the hypothesis $\mathfrak{L}$ by a p-value. For concreteness, suppose that the actual measurements chosen at the i-th test trial are ${x}_{i}$, ${y}_{i}$ and the corresponding measurement outcomes observed are ${a}_{i}$, ${b}_{i}$. Then the value of the prediction-based ratio at the i-th test trial is $R({a}_{i},{b}_{i},{x}_{i},{y}_{i})$, abbreviated as ${r}_{i}$. We introduce a test statistic T as the product of the possible values of the prediction-based ratio at all test trials, so the observed value of the test statistic is $t={\prod}_{i=1}^{{N}_{\mathrm{test}}}{r}_{i}$. If we denote by ${N}_{a,b,x,y}^{\prime}$ the total number of counts registered for the input-output combination $(a,b,x,y)$ in the test data, then t can be expressed also as

$$t=\prod _{a,b,x,y}R{(a,b,x,y)}^{{N}_{a,b,x,y}^{\prime}}.$$

According to Ref. [43], the p-value, which is defined as the maximum probability according to the hypothesis $\mathfrak{L}$ of obtaining a value of T at least as high as t actually observed in the experiment, is bounded by

$$p\le min\{1/t,1\}.$$

The smaller the p-value, the stronger the evidence against the hypothesis $\mathfrak{L}$ is, in other words, the less plausible LHV theories are. It is worth noting that the p-value bound computed in this manner remains valid even if the experimental trials are not i.i.d., while when the experimental trials are i.i.d., the p-value bound is asymptotically optimal (or tight) [43].
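Equations (8) and (9) translate directly into a few lines of code. The sketch below is illustrative only: the function name and the toy numbers are hypothetical, and the product in Equation (8) is accumulated in log space, since $t$ can easily overflow or underflow for realistic values of ${N}_{\mathrm{test}}$. The form of the bound is what one expects from Markov's inequality applied to a non-negative statistic whose expectation under the hypothesis is at most 1.

```python
import math

def p_value_bound(R, test_counts):
    """Upper bound min(1/t, 1), with t = prod_k R[k] ** N'_k (Equations (8)-(9)).

    R maps (a, b, x, y) to the prediction-based ratio; test_counts maps
    (a, b, x, y) to the number of test trials with that input-output record.
    """
    log_t = 0.0
    for k, n in test_counts.items():
        if n == 0:
            continue
        if R[k] == 0:
            return 1.0           # t = 0, so the bound is trivial
        log_t += n * math.log(R[k])
    if log_t <= 0:
        return 1.0               # t <= 1, so the bound is trivial
    return math.exp(-log_t)      # equals 1/t

# Toy numbers (assumed for illustration): t = 2**12 * 0.5**2 = 1024.
R = {(0, 0, 0, 0): 2.0, (0, 1, 0, 0): 0.5}
counts = {(0, 0, 0, 0): 12, (0, 1, 0, 0): 2}
print(p_value_bound(R, counts))  # approximately 1/1024
```

For very strong evidence, `math.exp(-log_t)` underflows to `0.0`; in that regime it is more informative to report `-log_t / math.log(10)`, the base-10 logarithm of the bound.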

#### 2.3. Generalization for Hypothesis Testing Beyond LHV Theories

The following two simple observations, which allow one to apply the prediction-based-ratio method to test physical theories beyond those described by LHV, are where our novel contribution enters. Firstly, we make the observation that in the above arguments leading to the p-value bound of Equation (9), the actual hypothesis $\mathfrak{L}$ only enters at Equation (6) via the set of correlations $\mathcal{L}$ compatible with the hypothesis $\mathfrak{L}$. In particular, if we are to consider the hypothesis $\mathfrak{H}$ that the data observed is produced by a physical theory H (e.g., a nonsignaling theory), then we merely have to replace $\mathcal{L}$ by the (convex) set of correlations $\mathcal{H}$ (e.g., $\mathcal{NS}$) associated with H in the optimization problem of Equation (5). The method then allows us to bound the plausibility of the hypothesis $\mathfrak{H}$ via the p-value bound in Equation (9) with the possible values of the prediction-based ratio given by

$$R(a,b,x,y):=\frac{f(a,b|x,y)}{{\overrightarrow{P}}_{\mathrm{KL}}^{\mathcal{H},*}(a,b|x,y)},$$

where ${\overrightarrow{P}}_{\mathrm{KL}}^{\mathcal{H},*}$ is the unique minimizer of the optimization problem:

$${D}_{\mathrm{KL}}\left(\overrightarrow{f}\left|\right|\mathcal{H}\right):=\phantom{\rule{0.166667em}{0ex}}\underset{\overrightarrow{P}\in \mathcal{H}}{min}\sum _{a,b,x,y}{P}_{xy}f(a,b|x,y)log\left[\frac{f(a,b|x,y)}{P(a,b|x,y)}\right].$$

Although Equations (8), (9), and (10) together provide us, in principle, with a recipe to test the plausibility of a general physical theory H, its implementation depends on the nature of the set of correlations associated with the hypothesis. Indeed, a crucial part of the procedure is to solve the optimization problem of Equation (11) for the convex set of correlations $\mathcal{H}$ compatible with H, which is generally far from trivial. If $\mathcal{H}$ is a convex polytope, such as $\mathcal{L}$, $\mathcal{NS}$, or the set of correlations associated with the models considered in Refs. [67,68], it is known [43] that Equation (11) can indeed be solved numerically.

Our second observation is that for the convex sets of correlations that are amenable to a semidefinite programming characterization, such as those considered in Refs. [59,62,69,70], Equation (11) is an instance of a conic program [58] that can be efficiently solved using a freely available solver, such as PENLAB [71]. To see this, one first notes that, apart from the constant factor ${P}_{xy}$, the optimization of Equation (11) is essentially the same as that considered in Ref. [27]. A straightforward adaptation of the argument presented in Appendix D 2 of Ref. [27] would then allow us to complete the aforementioned observation. The data observed in a Bell test can thus be used to test not only $\mathfrak{L}$, but also the nonsignaling hypothesis $\mathfrak{N}$ (that the underlying correlation lies in $\mathcal{NS}$) and even the hypothesis $\mathfrak{Q}$ that the observation is compatible with Born’s rule, cf. Equation (2), via outer approximations of $\mathcal{Q}$ (such as ${\mathcal{Q}}_{1}$ and $\tilde{\mathcal{Q}}$).

A remark is now in order. In order to avoid so-called p-value hacking, it is essential that the test data used in the computation of the test statistic T is not used to determine $\overrightarrow{f}$, and hence the values of the prediction-based ratio R in Equation (10). In this work, for simplicity we use the first ${N}_{\mathrm{est}}$ trials of an experiment as the training data for estimating $\overrightarrow{f}$ and further constructing a prediction-based ratio R that is applied for all test trials. In principle, we can use different training data for different test trials. For example, we can define the training data for a test trial as the data from all trials performed before this test trial, and then we can adapt the construction of the prediction-based ratio for each individual test trial. We refer to Ref. [43] for more details on the adaptability of the prediction-based ratio.

## 3. Results

To illustrate how well the prediction-based-ratio method works in identifying data that are not even explicable by some nonlocal physical theories, such as quantum theory, we now consider a few examples of applications of the method. As above, we restrict our attention to a bipartite Bell test, where each party performs two binary-outcome measurements randomly selected at each trial. Throughout this section, we assume that the input distribution is uniform, specifically ${P}_{xy}=\frac{1}{4}$ for all combinations of $x,y\in \{0,1\}$. In Section 3.2 and Section 3.3 we study the behaviour of numerically simulated Bell tests based on hypothetical sources of correlations described in Section 3.1, while in Section 3.4, we analyze the real experimental data reported in Ref. [72].

#### 3.1. Modeling a Bell Test

For our numerical simulations, we consider a $\overrightarrow{P}$ that resembles a nonlocal source targeted in various actual Bell experiments [35,36,37,72,73]:

$$\overrightarrow{P}\left(v\right):=v{\overrightarrow{P}}_{\mathrm{PR}}+(1-v){\overrightarrow{P}}_{\mathbb{I}},$$

where $v\in [0,1]$, ${\overrightarrow{P}}_{\mathrm{PR}}$ is the Popescu-Rohrlich (PR) correlation [28] ${P}_{\mathrm{PR}}(a,b|x,y)={\textstyle \frac{1}{2}}{\delta}_{a\oplus b,xy}$ with $a,b,x,y\in \{0,1\},$ and ${P}_{\mathbb{I}}(a,b|x,y)={\textstyle \frac{1}{4}}$ for all $a,b,x,y$ is the white-noise distribution. In Equation (12), the real parameter v can be seen as the weight associated with ${\overrightarrow{P}}_{\mathrm{PR}}$ in the convex mixture. Importantly, the nonlocal source represented by such a mixture can (in principle) be produced by performing appropriate local measurements on a maximally entangled two-qubit state if and only if $v\le {v}_{c}:=\frac{1}{\sqrt{2}}\approx 0.71$ (see, e.g., Refs. [27,57]). In particular, when $v={v}_{c}$, corresponding to an ideal nonlocal source, the mixture gives rise to the maximal quantum violation of the CHSH [45] Bell inequality.

To mimic an experimental scenario with noise (something unavoidable in practice), we shall introduce a slight perturbation to the ideal source $\overrightarrow{P}\left(v\right)$ of Equation (12). Specifically, we require the measurement outcomes observed at each trial in the simulated Bell test to be governed by the nonlocal source $(1-\epsilon)\overrightarrow{P}\left(v\right)+\epsilon{\overrightarrow{P}}_{\mathrm{noise}}$, where $\epsilon\ll 1$ is the weight associated with the noise term ${\overrightarrow{P}}_{\mathrm{noise}}$. Moreover, for the purpose of illustrating the effectiveness of the method in identifying non-quantum-compatible data, we set $v>{v}_{c}$. In our simulations, we set $\epsilon=0.01$ and $v=0.72>{v}_{c}$. However, as long as the given mixture lies outside $\tilde{\mathcal{Q}}$ (and hence also outside $\mathcal{Q}$), the actual choices of $\epsilon\ll 1$ and $v\in ({v}_{c},1]$ are irrelevant. The only impact that these choices may have is the number of trials ${N}_{\mathrm{total}}$ needed to falsify the hypothesis

“The observed data is compatible with a physical theory that is constrained to produce only the almost-quantum set of correlations.”

with the same level of confidence. Inspired by the experiments of Ref. [72] where ${N}_{\mathrm{total}}={10}^{5}$∼${10}^{6}$, we set in our simulations ${N}_{\mathrm{total}}={10}^{6}$. Note also that instead of $\tilde{\mathcal{Q}}$, we can equally well choose another set of correlations that admits a semidefinite programming characterization, such as those described in Refs. [59,62].

Since we are interested in modeling a nonlocal source that obeys the nonsignaling conditions of Equation (3), there is no loss of generality in considering ${\overrightarrow{P}}_{\mathrm{noise}}\in \mathcal{NS}$. To this end, let ${\overrightarrow{P}}_{j}^{\mathrm{Ext}}$ be the j-th extreme point of the nonsignaling polytope [29]; then we may write ${\overrightarrow{P}}_{\mathrm{noise}}={\sum}_{j}{p}_{j}{\overrightarrow{P}}_{j}^{\mathrm{Ext}}$ where ${p}_{j}$ is the weight associated with ${\overrightarrow{P}}_{j}^{\mathrm{Ext}}$ in the convex decomposition of ${\overrightarrow{P}}_{\mathrm{noise}}$. We may thus write the nonlocal source of interest as:

$$\overrightarrow{P}(v,\epsilon,\left\{{p}_{j}\right\}):=(1-\epsilon)\overrightarrow{P}\left(v\right)+\epsilon\sum _{j}{p}_{j}{\overrightarrow{P}}_{j}^{\mathrm{Ext}}.$$

Finally, to simulate the raw data ${\left\{({a}_{i},{b}_{i},{x}_{i},{y}_{i})\right\}}_{i=1}^{N}$ obtained in an N-trial Bell test for any given input distribution ${P}_{xy}$ and correlation $\overrightarrow{P}$, we make use of the MATLAB toolbox Lightspeed developed by Minka [74].
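The simulations reported here rely on the MATLAB toolbox cited above. Purely as an illustration of the sampling step, the following Python sketch (standard library only, with hypothetical function names) builds the ideal source of Equation (12), evaluates the standard CHSH expression ${E}_{00}+{E}_{01}+{E}_{10}-{E}_{11}$ on it, and draws raw trial records for a given input distribution. The seed and the number of trials are arbitrary choices made for the example.

```python
import random
from itertools import product

KEYS = list(product((0, 1), repeat=4))  # all (a, b, x, y)

def noisy_pr(v):
    """Equation (12): v * P_PR + (1 - v) * white noise."""
    return {(a, b, x, y): v * 0.5 * ((a ^ b) == (x & y)) + (1 - v) * 0.25
            for a, b, x, y in KEYS}

def chsh(P):
    """E_00 + E_01 + E_10 - E_11, with E_xy = sum_{a,b} (-1)^(a+b) P(a,b|x,y)."""
    E = {(x, y): sum((-1) ** (a + b) * P[(a, b, x, y)]
                     for a, b in product((0, 1), repeat=2))
         for x, y in product((0, 1), repeat=2)}
    return E[(0, 0)] + E[(0, 1)] + E[(1, 0)] - E[(1, 1)]

def simulate_trials(P, p_xy, n, rng):
    """Draw n i.i.d. records (a, b, x, y): settings from p_xy, outcomes from P."""
    settings = list(p_xy)
    setting_weights = [p_xy[s] for s in settings]
    outcomes = list(product((0, 1), repeat=2))
    trials = []
    for _ in range(n):
        x, y = rng.choices(settings, weights=setting_weights)[0]
        outcome_weights = [P[(a, b, x, y)] for a, b in outcomes]
        a, b = rng.choices(outcomes, weights=outcome_weights)[0]
        trials.append((a, b, x, y))
    return trials

p_xy = {(x, y): 0.25 for x, y in product((0, 1), repeat=2)}
source = noisy_pr(0.72)
print(chsh(source))      # 4 * v, which exceeds 2 * sqrt(2) for v = 0.72
trials = simulate_trials(source, p_xy, 1000, random.Random(2019))
print(trials[:3])
```

The CHSH value of $\overrightarrow{P}(v)$ equals $4v$, so $v=0.72$ gives a value slightly above the quantum maximum $2\sqrt{2}$, consistent with the choice $v>{v}_{c}$ made above.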

#### 3.2. Simulations of Bell Tests with an i.i.d. Nonlocal Source

Let us begin with the case of i.i.d. trials, corresponding to a source of correlation that remains unchanged throughout the experiment, and where the inputs at each trial are independent of the inputs of the previous trials. To this end, we first sample the weights ${\left\{{p}_{j}\right\}}_{j}$ uniformly from the interval $[0,1]$ and renormalize them such that ${\sum}_{j}{p}_{j}=1$. With our choice of $v=0.72$ and $\epsilon=0.01$, it is easy to find such a randomly generated correlation $\overrightarrow{P}(v,\epsilon,\left\{{p}_{j}\right\})$ that lies outside $\tilde{\mathcal{Q}}$. (Verifying that any given $\overrightarrow{P}$ is (not) in $\tilde{\mathcal{Q}}$ can be carried out by solving a semidefinite program. Specifically, for any given correlation $\overrightarrow{P}$, if the maximal white-noise visibility $\nu $ such that $\nu \overrightarrow{P}+(1-\nu ){\overrightarrow{P}}_{\mathbb{I}}\in \tilde{\mathcal{Q}}$ is smaller than 1, then $\overrightarrow{P}\notin \tilde{\mathcal{Q}}\supset \mathcal{Q}$, and hence outside $\mathcal{Q}$, otherwise $\overrightarrow{P}\in \tilde{\mathcal{Q}}$.) For convenience, we denote by $\mathcal{P}$ the specific set of ${\left\{{p}_{j}\right\}}_{j}$ employed in our simulation of 500 Bell tests, each with ${N}_{\mathrm{total}}={10}^{6}$ trials. In Figure 1, we summarize the steps involved in our analysis of the numerically simulated data using the prediction-based-ratio method. The resulting p-value upper bounds are summarized in Table 1.

As expected, despite statistical fluctuations, the data does not suggest any obvious evidence against the nonsignaling hypothesis. In fact, among the 500 p-value bounds obtained, 97% of them are trivial (i.e., equal to unity), while the smallest non-trivial p-value bound obtained is approximately 0.14. On the contrary, for the hypothesis test of the almost-quantum set of correlations, more than half of the simulated Bell tests give a p-value upper bound that is less than ${10}^{-10}$. Although there are also 5.8% of these simulated Bell tests that give a trivial p-value bound according to the almost-quantum hypothesis, we see that the method generally works very well in falsifying this hypothesis. In fact, a separate calculation (not shown in the table) shows that when we increase ${N}_{\mathrm{total}}$ to ${10}^{7}$, all the 500 p-value upper bounds obtained according to the almost-quantum hypothesis are less than or equal to ${10}^{-10}$.

#### 3.3. Simulations of Bell tests with a non-i.i.d. Nonlocal Source

In a real experiment, the assumption that the experimental trials are i.i.d. is often far from justifiable, as that would require, for example, that the experimental setup remain unchanged over the entire course of the experiment. We therefore also consider the case where the source that generates the data varies from one trial to another. To this end, for the i-th trial of the Bell test, we simulate according to the conditional outcome distributions:

$${\overrightarrow{P}}_{i}(v,\epsilon,{n}_{i})=(1-\epsilon)\overrightarrow{P}\left(v\right)+\epsilon{\overrightarrow{P}}_{{n}_{i}}^{\mathrm{Ext}},$$

where ${n}_{i}=1,2,\dots ,24$ labels the single nonsignaling extreme point used to mix with $\overrightarrow{P}\left(v\right)$ at this trial, cf. Equation (13) with ${p}_{j}=1$ if $j={n}_{i}$ and ${p}_{j}=0$ otherwise. Moreover, to facilitate a comparison with the i.i.d. case, before the i-th trial, we randomly pick ${n}_{i}$ according to the probability $P({n}_{i}=j)={p}_{j}$, where ${p}_{j}\in \mathcal{P}$ is exactly the probability employed in the simulation of Section 3.2. With this choice, the outcome distributions governed by the nonlocal source of Equation (14) (for the i-th trial) average to that of Equation (13) when the number of trials ${N}_{\mathrm{total}}\to \infty $. Again, we follow the steps summarized in Figure 1 to compute the relevant p-value upper bounds using the prediction-based-ratio method. The resulting p-value upper bounds are summarized in Table 2.
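The trial-by-trial sampling described here can be sketched as follows. This is only an illustrative sketch, not the code used for the reported simulations: the function names, the ordering of the 24 extreme points (16 local deterministic points followed by 8 PR boxes), and the argument `P_target` (a stand-in for $\overrightarrow{P}(v)$, whose explicit form is given by Equation (13)) are assumptions made for illustration. The measurement settings are drawn uniformly.

```python
import numpy as np

def ns_extreme_points():
    """24 extreme points of the nonsignaling polytope with binary inputs and outputs.

    Each point is an array P[a, b, x, y] = P(a, b | x, y).
    The ordering is an arbitrary choice made for this sketch.
    """
    points = []
    # 16 local deterministic points: a = a0 XOR (a1 AND x), b = b0 XOR (b1 AND y).
    for a0 in (0, 1):
        for a1 in (0, 1):
            for b0 in (0, 1):
                for b1 in (0, 1):
                    P = np.zeros((2, 2, 2, 2))
                    for x in (0, 1):
                        for y in (0, 1):
                            P[a0 ^ (a1 & x), b0 ^ (b1 & y), x, y] = 1.0
                    points.append(P)
    # 8 PR boxes: a XOR b = xy XOR alpha*x XOR beta*y XOR gamma, uniform marginals.
    for alpha in (0, 1):
        for beta in (0, 1):
            for gamma in (0, 1):
                P = np.zeros((2, 2, 2, 2))
                for a, b, x, y in np.ndindex(2, 2, 2, 2):
                    if (a ^ b) == ((x & y) ^ (alpha & x) ^ (beta & y) ^ gamma):
                        P[a, b, x, y] = 0.5
                points.append(P)
    return points

def simulate_non_iid(P_target, eps, p_mix, n_trials, rng):
    """Sample n_trials records (a, b, x, y) from a source that changes every trial."""
    ext = ns_extreme_points()
    records = np.empty((n_trials, 4), dtype=int)
    for i in range(n_trials):
        n_i = rng.choice(len(ext), p=p_mix)             # extreme point for this trial
        P_i = (1.0 - eps) * P_target + eps * ext[n_i]   # per-trial mixture
        x, y = rng.integers(0, 2, size=2)               # uniformly random settings
        k = rng.choice(4, p=P_i[:, :, x, y].ravel())    # k = 2a + b
        records[i] = (k // 2, k % 2, x, y)
    return records
```

Averaging the per-trial distribution over the random choice of ${n}_{i}$ reproduces the i.i.d. mixture, which is why the two simulations are directly comparable.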

As with the i.i.d. case, for these 500 simulated Bell tests, our application of the prediction-based-ratio method does not lead to any obvious evidence against the nonsignaling hypothesis $\mathfrak{N}$. However, for the hypothesis associated with the almost-quantum set $\tilde{\mathcal{Q}}$, our results (last row of Table 2) give more than half of the p-value upper bounds that are less than ${10}^{-4}$ (accordingly, 17% if we set the cutoff at ${10}^{-10}$). Although there are 24% of these instances where the returned p-value upper bound for the same hypothesis is trivial, we see that, as with the i.i.d. case, the method remains very effective in showing that the observed data cannot be entirely accounted for using a theory that is constrained to produce only almost-quantum correlations. In addition, as with the i.i.d. case, our separate calculation shows that the effectiveness of this method can be substantially improved when we increase ${N}_{\mathrm{total}}$ to ${10}^{7}$: all the 500 p-value upper bounds obtained according to the almost-quantum hypothesis become less than or equal to ${10}^{-10}$.

#### 3.4. Application to Some Real Experimental Data

Armed with the experience gained in the above analyses, let us now analyze the experimental results presented in Figure 3 of Ref. [72] using the prediction-based-ratio method. One of the goals of Ref. [72] was to experimentally approach the boundary of the quantum set of correlations in the two-dimensional subspace spanned by the two Bell parameters:

$$\begin{array}{c}\hfill {\mathcal{S}}_{\mathrm{CHSH}}={E}_{00}+{E}_{01}+{E}_{10}-{E}_{11},\\ \hfill {\mathcal{S}}_{\mathrm{CHSH}}^{\prime}=-{E}_{00}+{E}_{01}+{E}_{10}+{E}_{11},\end{array}$$

where ${E}_{xy}:={\sum}_{a,b=0}^{1}{(-1)}^{a+b}P(a,b|x,y)$ is the correlator. To this end, the Bell parameter ${\mathcal{S}}_{\mathrm{CHSH}}\cos\theta +{\mathcal{S}}_{\mathrm{CHSH}}^{\prime}\sin\theta $ was estimated for 180 uniformly spaced values of $\theta \in \{{\theta}_{1},{\theta}_{2},\dots ,{\theta}_{180}\}\subset [0,2\pi )$ by performing the measurements presented in Appendix A of Ref. [72] on a two-qubit maximally entangled state.
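As a concrete illustration of these definitions, the following sketch estimates the correlators and the two Bell parameters from a table of counts, using relative frequencies in place of $P(a,b|x,y)$. The function names and the array layout `counts[a, b, x, y]` are assumptions made for this example.

```python
import numpy as np

def correlators(counts):
    """Estimate E[x, y] = sum_{a,b} (-1)^(a+b) f(a, b | x, y) from counts[a, b, x, y]."""
    counts = np.asarray(counts, dtype=float)
    # Relative frequencies, normalized separately for each setting pair (x, y).
    freqs = counts / counts.sum(axis=(0, 1), keepdims=True)
    sign = np.array([[1.0, -1.0], [-1.0, 1.0]])  # (-1)^(a+b)
    return np.einsum("ab,abxy->xy", sign, freqs)

def chsh_pair(counts):
    """Return the estimates of (S_CHSH, S'_CHSH)."""
    E = correlators(counts)
    S = E[0, 0] + E[0, 1] + E[1, 0] - E[1, 1]
    S_prime = -E[0, 0] + E[0, 1] + E[1, 0] + E[1, 1]
    return S, S_prime

def bell_parameter(counts, theta):
    """Return the estimate of S_CHSH cos(theta) + S'_CHSH sin(theta)."""
    S, S_prime = chsh_pair(counts)
    return S * np.cos(theta) + S_prime * np.sin(theta)
```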

Unfortunately, only the total counts ${N}_{a,b,x,y}$ for each input-output combination (rather than the time sequences of raw data) are available for each value of $\theta $ [75]. Therefore, in analogy with the analyses presented above, we use the relative frequencies obtained for ${\theta}_{k}$ as the training data to derive a prediction-based ratio (which corresponds to a Bell-like inequality) for the hypothesis test using the data associated with ${\theta}_{k+1}$ (for the case of $k=180$, the hypothesis test uses the data associated with ${\theta}_{1}$). The analysis therefore essentially follows the steps outlined in Figure 1, but with the computation of t carried out using Equation (8) instead, since we do not have the time sequences of raw data. Moreover, to apply the prediction-based-ratio method, we assume, as with the numerical experiments reported earlier, that the input distributions are uniform, i.e., ${P}_{xy}={\textstyle \frac{1}{4}}$ for all combinations of $x,y\in \{0,1\}$. A summary of the p-value upper bounds obtained from these 180 Bell tests is given in Table 3.
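When only counts are available, the product $t={\prod}_{i}{r}_{i}$ of Figure 1 can be regrouped by input-output combination, so that each factor $R(a,b,x,y)$ appears as many times as that combination was observed. The sketch below assumes this is what Equation (8) amounts to; the function names are hypothetical, and the ratio `R` is taken as a given input rather than derived here. Working with logarithms avoids overflow when the bound is as small as the values reported in this section.

```python
import numpy as np

def log10_p_value_bound(R, counts):
    """Return log10 of min{1/t, 1}, where t = prod R[a,b,x,y] ** counts[a,b,x,y]."""
    R = np.asarray(R, dtype=float)
    counts = np.asarray(counts, dtype=float)
    seen = counts > 0                 # unobserved combinations contribute a factor of 1
    if np.any(R[seen] <= 0.0):        # an observed event with R = 0 forces t = 0
        return 0.0                    # trivial bound, i.e., p = 1
    log_t = np.sum(counts[seen] * np.log(R[seen]))
    return float(min(0.0, -log_t / np.log(10.0)))

def p_value_bound(R, counts):
    """Return the p-value upper bound min{1/t, 1} itself."""
    return 10.0 ** log10_p_value_bound(R, counts)
```

Because only the counts enter, any reordering of the hypothesis-testing trials leaves the bound unchanged.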

For both hypotheses, approximately half of the p-value upper bounds obtained are trivial. At the same time, about the same fraction of the p-value bounds obtained are less than ${10}^{-2}$ (with the majority of them being less than ${10}^{-4}$). In fact, the smallest of the p-value upper bounds are remarkably small: $3.2\times {10}^{-55}$ for the hypothesis of nonsignaling $\mathfrak{N}$ and $2.7\times {10}^{-55}$ for the hypothesis of almost-quantum $\tilde{\mathfrak{Q}}$. These results strongly suggest that under the assumption that the measurement settings were randomly chosen according to a uniform input distribution, it is extremely unlikely that a physical theory associated with each of these hypotheses can produce the observed relative frequencies.

These conclusions that the observed data are incompatible with the fundamental principle of nonsignaling or with quantum theory (via the almost-quantum hypothesis), however, turn out to be flawed, as it was brought to our attention [75] that during the course of the experiment, the measurement bases were not at all randomized: the measurements were carried out in blocks using the same combination of $(x,y)$ before moving to another. Why should this pose a problem? In the extreme scenario, if the measurement settings were fully correlated to some local hidden variable, it is known that the resulting correlation between measurement outcomes can violate the nonsignaling conditions of Equation (3), see, e.g., Ref. [76]. Consequently, it is not surprising that in the prediction-based-ratio method (as well as any other methods employed for the statistical analysis of a Bell test), the measurement inputs $({x}_{i},{y}_{i})$ during the i-th trial, as discussed in Section 2, ought to be randomly chosen.

## 4. Discussion

As discussed in the last section, the conclusion that “the experimental data of Ref. [72] show a violation of the nonsignaling principle”, being based on an erroneous application of the prediction-based-ratio method, is unfounded. The results are nonetheless thought-provoking. For example, suppose for now that we had access to the raw data for all trials. Since the analysis was flawed because of the nonrandomization of measurement settings, one can imagine that, under the assumption that the trials are exchangeable, we first artificially randomize the hypothesis-testing trials to simulate the randomization of measurement settings in the experiment. Should we then expect to obtain p-value bounds with fundamentally different features? The answer is negative. The reason is that in our crude application of the method, only the number of counts ${N}_{a,b,x,y}^{\prime}$ for each input-output combination matters, see Equation (8). In particular, the actual trials in which a particular combination of $(a,b,x,y)$ appears are irrelevant in such an analysis.

So, if one holds the view that the nonsignaling principle cannot be flawed, then one must come to the conclusion that “should the measurement choices be randomized, it would be impossible to register the same number of counts ${N}_{a,b,x,y}^{\prime}$ for each input-output combination”. A plausible cause for this is that the experimental setup suffered from some systematic drift during the course of the experiment, which is exactly a manifestation of the fact that the experimental trials are not i.i.d. It might then appear that a hypothesis test of the nonsignaling principle is hopeless in such a scenario. However, as mentioned above, the prediction-based-ratio method is applicable even for non-i.i.d. experimental trials. Indeed, as we illustrate in Section 3.3 (see, specifically, Table 2), such fluctuations have not led to any false positive in the sense of giving a very small p-value upper bound according to the nonsignaling hypothesis.

More generally, as the above example of Section 3.4 illustrates, an unexpectedly small p-value upper bound according to the nonsignaling hypothesis may be a consequence that certain premises needed to perform a sensible Bell test are violated. In other words, an apparent violation as such does not necessarily pose a problem to any physical principle, such as the nonsignaling principle that is rooted in the theory of relativity. However, as nonlocal correlations also find applications in device-independent quantum information processing [18,19], it is important to carry out such consistency checks alongside the violation of a Bell inequality before one applies the estimated nonlocal correlation in any such protocols.
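A crude first look at such a consistency check can be sketched as follows, assuming the nonsignaling conditions take their usual form (each party's marginal outcome distribution is independent of the other party's setting). The function name and the array layout `counts[a, b, x, y]` are assumptions made for this example. Deviations of the order of the inverse square root of the number of trials per setting are expected from statistical fluctuations alone, so this raw comparison is no substitute for the hypothesis test discussed in the text.

```python
import numpy as np

def marginal_deviations(counts):
    """Largest setting dependence of each party's estimated marginal distribution.

    counts[a, b, x, y] holds the number of trials with outcomes (a, b)
    for settings (x, y). Returns (deviation for Alice, deviation for Bob).
    """
    counts = np.asarray(counts, dtype=float)
    freqs = counts / counts.sum(axis=(0, 1), keepdims=True)
    alice = freqs.sum(axis=1)   # alice[a, x, y]: estimate of P(a | x, y)
    bob = freqs.sum(axis=0)     # bob[b, x, y]:   estimate of P(b | x, y)
    # Nonsignaling: Alice's marginal must not depend on y, Bob's must not depend on x.
    dev_alice = np.abs(alice[:, :, 0] - alice[:, :, 1]).max()
    dev_bob = np.abs(bob[:, 0, :] - bob[:, 1, :]).max()
    return float(dev_alice), float(dev_bob)
```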

Of course, an unexpectedly small p-value upper bound according to the nonsignaling hypothesis could also be a consequence of mere statistical fluctuation. Indeed, our results in Section 3.2 and Section 3.3 show that when a null hypothesis holds true, it can still happen that one obtains a relatively small p-value upper bound (of the order of ${10}^{-1}$) even after a large number of trials (${N}_{\mathrm{total}}={10}^{6}$). However, as explained in Appendix 1 of Ref. [43], if a null hypothesis is correct, the probability of obtaining a p-value upper bound smaller than q with the prediction-based-ratio method is no larger than q. Indeed, in each of these instances, p-value upper bounds that are less than ${10}^{-1}$ occur far fewer than 50 times among the 500 simulated experiments. In any case, this means that even though the prediction-based-ratio method already gets rid of the often unjustifiable i.i.d. assumption involved in such an analysis, the interpretation of the significance of a small p-value upper bound must still be carried out with care, as advised, for example, in Refs. [77,78,79].
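The bound on false positives quoted above can be sketched in one line. We assume here, as a summary of the construction of Ref. [43] rather than a derivation of it, that each prediction-based ratio is non-negative and has expectation at most one under the null hypothesis $\mathcal{H}$ (conditioned on all earlier trials), so that the product t satisfies ${\mathbb{E}}_{\mathcal{H}}[t]\le 1$. Markov's inequality then gives, for any $0<q<1$,

$$\mathrm{Prob}_{\mathcal{H}}\left(\min\left\{\tfrac{1}{t},1\right\}\le q\right)=\mathrm{Prob}_{\mathcal{H}}\left(t\ge \tfrac{1}{q}\right)\le q\,{\mathbb{E}}_{\mathcal{H}}[t]\le q.$$

For example, with $q={10}^{-1}$ and 500 independent experiments generated under the null hypothesis, one expects at most 50 of them to return a p-value upper bound of ${10}^{-1}$ or smaller.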

## 5. Conclusions

In this work, we revisited the prediction-based-ratio method developed in Ref. [43], in the context of a Bell test, for performing hypothesis tests of LHV theories. We showed that with the two observations presented in Section 2.3, the method can equally well be applied to perform hypothesis tests of other physical theories, specifically those that are constrained to produce correlations amenable to a semidefinite programming characterization. Prime examples of such theories include those that obey the principle of nonsignaling [28], those that satisfy the principle of macroscopic locality [51], the so-called v-causal models [67], as well as physical theories that are constrained to produce the almost-quantum set [50] or any other outer approximations [59,62,69] of the quantum set of correlations.

To illustrate the effectiveness of the method, we first numerically simulated 500 Bell tests using a hypothetical source of correlations that lies somewhat outside the almost-quantum set of correlations. We then applied the method to obtain a p-value upper bound according to both the almost-quantum hypothesis and the nonsignaling hypothesis for the simulated data obtained in each of these Bell tests. In the majority ($>90\%$) of these 500 instances, the p-value upper bound according to the almost-quantum hypothesis is less than ${10}^{-2}$. Since a p-value upper bound quantifies the evidence against the assumed (almost-quantum) theory given the observed data, these results show that in most of these simulated Bell tests, the data is unlikely to be explicable by the assumed theory. In a similar manner, we numerically simulated another 500 Bell tests using a hypothetical source that varies from one trial to another. Again, the method remained very effective (giving a p-value upper bound that is less than ${10}^{-2}$ for 69% of the instances) in identifying the incompatibility between the observed data and the assumed (almost-quantum) theory in such a non-i.i.d. scenario.

Finally, we applied the prediction-based-ratio method to the experimental data of Ref. [72]. To this end, we assumed that the measurement settings were randomly chosen with uniform distributions. An application of the method under this assumption again led to very small p-value upper bounds ($\le {10}^{-4}$) for more than 40% of the 180 Bell tests analyzed, not only for the almost-quantum hypothesis, but also for the nonsignaling hypothesis. Such a violation of the nonsignaling conditions, however, is only apparent, as we learned after the analysis that the measurement settings were not randomized during the course of the experiments, thereby invalidating one of the basic assumptions needed in the application of the prediction-based-ratio method. Nonetheless, as we remarked in the Discussion section, the analysis unveils the possibility of using the prediction-based-ratio method to identify a situation where a certain premise needed to perform a proper Bell test, such as the randomization of settings, is invalidated.

Note added: While preparing this manuscript, we became aware of the work of Smania et al. [80], which also discussed, among others, the implication of not randomizing the settings in a Bell test, and its relevance in quantitative applications.

## Author Contributions

Both authors contributed toward the computation of the numerical results and the preparation of the manuscript.

## Funding

This work is supported by the Ministry of Science and Technology, Taiwan (Grants No. 104-2112-M-006-021-MY3, 107-2112-M-006-005-MY2, 107-2627-E-006-001) and the National Center for Theoretical Science, Taiwan (R.O.C.).

## Acknowledgments

Y.C.L. is grateful to Adán Cabello, Bradley Christensen, Ehtibar Dzhafarov, Nicolas Gisin, Scott Glancy, Paul Kwiat, Jan-Åke Larsson, Denis Rosset, and Lev Vaidman for useful discussions.

## Conflicts of Interest

The authors declare no conflict of interest.

## References

1. Acín, A.; Gisin, N.; Masanes, L. From Bell’s Theorem to Secure Quantum Key Distribution. Phys. Rev. Lett. **2006**, 97, 120405.
2. Bell, J.S. On the Einstein-Podolsky-Rosen paradox. Physics **1964**, 1, 195.
3. Gisin, N.; Ribordy, G.; Tittel, W.; Zbinden, H. Quantum cryptography. Rev. Mod. Phys. **2002**, 74, 145–195.
4. Barrett, J.; Hardy, L.; Kent, A. No Signaling and Quantum Key Distribution. Phys. Rev. Lett. **2005**, 95, 010503.
5. Acín, A.; Brunner, N.; Gisin, N.; Massar, S.; Pironio, S.; Scarani, V. Device-Independent Security of Quantum Cryptography against Collective Attacks. Phys. Rev. Lett. **2007**, 98, 230501.
6. Vazirani, U.; Vidick, T. Fully Device-Independent Quantum Key Distribution. Phys. Rev. Lett. **2014**, 113, 140501.
7. Ekert, A.K. Quantum cryptography based on Bell’s theorem. Phys. Rev. Lett. **1991**, 67, 661–663.
8. Colbeck, R. Quantum and Relativistic Protocols for Secure Multi-Party Computation. arXiv, 2009; arXiv:0911.3814.
9. Pironio, S.; Acín, A.; Massar, S.; de La Giroday, A.B.; Matsukevich, D.N.; Maunz, P.; Olmschenk, S.; Hayes, D.; Luo, L.; Manning, T.A.; Monroe, C. Random numbers certified by Bell’s theorem. Nature (London) **2010**, 464, 1021.
10. Colbeck, R.; Kent, A. Private randomness expansion with untrusted devices. J. Phys. A Math. Theor. **2011**, 44, 095305.
11. Mayers, D.; Yao, A. Self Testing Quantum Apparatus. Quantum Inf. Comput. **2004**, 4, 273.
12. Brunner, N.; Pironio, S.; Acín, A.; Gisin, N.; Méthot, A.A.; Scarani, V. Testing the Dimension of Hilbert Spaces. Phys. Rev. Lett. **2008**, 100, 210503.
13. Reichardt, B.W.; Unger, F.; Vazirani, U. Classical command of quantum systems. Nature (London) **2013**, 496, 456.
14. Yang, T.H.; Vértesi, T.; Bancal, J.D.; Scarani, V.; Navascués, M. Robust and Versatile Black-Box Certification of Quantum Devices. Phys. Rev. Lett. **2014**, 113, 040401.
15. Liang, Y.C.; Rosset, D.; Bancal, J.D.; Pütz, G.; Barnea, T.J.; Gisin, N. Family of Bell-like Inequalities as Device-Independent Witnesses for Entanglement Depth. Phys. Rev. Lett. **2015**, 114, 190401.
16. Coladangelo, A.; Goh, K.T.; Scarani, V. All pure bipartite entangled states can be self-tested. Nat. Comm. **2017**, 8, 15485.
17. Sekatski, P.; Bancal, J.D.; Wagner, S.; Sangouard, N. Certifying the Building Blocks of Quantum Computers from Bell’s Theorem. Phys. Rev. Lett. **2018**, 121, 180505.
18. Scarani, V. The device-independent outlook on quantum physics. Acta Phys. Slovaca **2012**, 62, 347.
19. Brunner, N.; Cavalcanti, D.; Pironio, S.; Scarani, V.; Wehner, S. Bell nonlocality. Rev. Mod. Phys. **2014**, 86, 419–478.
20. Bell, J.S. Speakable and Unspeakable in Quantum Mechanics: Collected Papers on Quantum Philosophy, 2nd ed.; Cambridge University Press: Cambridge, UK, 2004.
21. Hensen, B.; Bernien, H.; Dreau, A.E.; Reiserer, A.; Kalb, N.; Blok, M.S.; Ruitenberg, J.; Vermeulen, R.F.L.; Schouten, R.N.; Abellan, C.; et al. Loophole-free Bell inequality violation using electron spins separated by 1.3 kilometres. Nature **2015**, 526, 682–686.
22. Shalm, L.K.; Meyer-Scott, E.; Christensen, B.G.; Bierhorst, P.; Wayne, M.A.; Stevens, M.J.; Gerrits, T.; Glancy, S.; Hamel, D.R.; Allman, M.S.; et al. Strong Loophole-Free Test of Local Realism. Phys. Rev. Lett. **2015**, 115, 250402.
23. Giustina, M.; Versteegh, M.A.M.; Wengerowsky, S.; Handsteiner, J.; Hochrainer, A.; Phelan, K.; Steinlechner, F.; Kofler, J.; Larsson, J.A.; Abellán, C.; et al. Significant-Loophole-Free Test of Bell’s Theorem with Entangled Photons. Phys. Rev. Lett. **2015**, 115, 250401.
24. Rosenfeld, W.; Burchardt, D.; Garthoff, R.; Redeker, K.; Ortegel, N.; Rau, M.; Weinfurter, H. Event-Ready Bell Test Using Entangled Atoms Simultaneously Closing Detection and Locality Loopholes. Phys. Rev. Lett. **2017**, 119, 010402.
25. Li, M.H.; Wu, C.; Zhang, Y.; Liu, W.Z.; Bai, B.; Liu, Y.; Zhang, W.; Zhao, Q.; Li, H.; Wang, Z.; et al. Test of Local Realism into the Past without Detection and Locality Loopholes. Phys. Rev. Lett. **2018**, 121, 080404.
26. Schwarz, S.; Bessire, B.; Stefanov, A.; Liang, Y.C. Bipartite Bell inequalities with three ternary-outcome measurements - from theory to experiments. New J. Phys. **2016**, 18, 035001.
27. Lin, P.S.; Rosset, D.; Zhang, Y.; Bancal, J.D.; Liang, Y.C. Device-independent point estimation from finite data and its application to device-independent property estimation. Phys. Rev. A **2018**, 97, 032309.
28. Popescu, S.; Rohrlich, D. Quantum nonlocality as an axiom. Found. Phys. **1994**, 24, 379–385.
29. Barrett, J.; Linden, N.; Massar, S.; Pironio, S.; Popescu, S.; Roberts, D. Nonlocal correlations as an information-theoretic resource. Phys. Rev. A **2005**, 71, 022101.
30. Liu, Y.; Zhao, Q.; Li, M.H.; Guan, J.Y.; Zhang, Y.; Bai, B.; Zhang, W.; Liu, W.Z.; Wu, C.; Yuan, X.; et al. Device-independent quantum random-number generation. Nature **2018**, 562, 548–551.
31. Adenier, G.; Khrennikov, A.Y. Test of the no-signaling principle in the Hensen loophole-free CHSH experiment. Fortschr. Phys. **2017**, 65, 1600096.
32. Bednorz, A. Analysis of assumptions of recent tests of local realism. Phys. Rev. A **2017**, 95, 042118.
33. Kupczynski, M. Is Einsteinian no-signalling violated in Bell tests? Open Phys. **2017**, 15, 739.
34. Aspect, A.; Dalibard, J.; Roger, G. Experimental Test of Bell’s Inequalities Using Time-Varying Analyzers. Phys. Rev. Lett. **1982**, 49, 1804–1807.
35. Tittel, W.; Brendel, J.; Zbinden, H.; Gisin, N. Violation of Bell Inequalities by Photons More Than 10 km Apart. Phys. Rev. Lett. **1998**, 81, 3563–3566.
36. Weihs, G.; Jennewein, T.; Simon, C.; Weinfurter, H.; Zeilinger, A. Violation of Bell’s Inequality under Strict Einstein Locality Conditions. Phys. Rev. Lett. **1998**, 81, 5039–5043.
37. Rowe, M.A.; Kielpinski, D.; Meyer, V.; Sackett, C.A.; Itano, W.M.; Monroe, C.; Wineland, D.J. Experimental violation of a Bell’s inequality with efficient detection. Nature **2001**, 409, 791–794.
38. Giustina, M.; Mech, A.; Ramelow, S.; Wittmann, B.; Kofler, J.; Beyer, J.; Lita, A.; Calkins, B.; Gerrits, T.; Nam, S.W.; et al. Bell violation using entangled photons without the fair-sampling assumption. Nature **2013**, 497, 227.
39. Christensen, B.G.; McCusker, K.T.; Altepeter, J.B.; Calkins, B.; Gerrits, T.; Lita, A.E.; Miller, A.; Shalm, L.K.; Zhang, Y.; Nam, S.W.; et al. Detection-Loophole-Free Test of Quantum Nonlocality, and Applications. Phys. Rev. Lett. **2013**, 111, 130406.
40. Erven, C.; Meyer-Scott, E.; Fisher, K.; Lavoie, J.; Higgins, B.L.; Yan, Z.; Pugh, C.J.; Bourgoin, J.P.; Prevedel, R.; Shalm, L.K.; et al. Experimental three-photon quantum nonlocality under strict locality conditions. Nature Photonics **2014**, 8, 292.
41. Lanyon, B.P.; Zwerger, M.; Jurcevic, P.; Hempel, C.; Dür, W.; Briegel, H.J.; Blatt, R.; Roos, C.F. Experimental Violation of Multipartite Bell Inequalities with Trapped Ions. Phys. Rev. Lett. **2014**, 112, 100403.
42. Shen, L.; Lee, J.; Thinh, L.P.; Bancal, J.D.; Cerè, A.; Lamas-Linares, A.; Lita, A.; Gerrits, T.; Nam, S.W.; Scarani, V.; et al. Randomness Extraction from Bell Violation with Continuous Parametric Down-Conversion. Phys. Rev. Lett. **2018**, 121, 150402.
43. Zhang, Y.; Glancy, S.; Knill, E. Asymptotically optimal data analysis for rejecting local realism. Phys. Rev. A **2011**, 84, 062118.
44. Gill, R.D. Time, Finite Statistics, and Bell’s Fifth Position. arXiv, 2003; arXiv:quant-ph/0301059.
45. Clauser, J.F.; Horne, M.A.; Shimony, A.; Holt, R.A. Proposed Experiment to Test Local Hidden-Variable Theories. Phys. Rev. Lett. **1969**, 23, 880–884.
46. Zhang, Y.; Glancy, S.; Knill, E. Efficient quantification of experimental evidence against local realism. Phys. Rev. A **2013**, 88, 052119.
47. Cavalcanti, D.; Salles, A.; Scarani, V. Macroscopically local correlations can violate information causality. Nat. Commun. **2010**, 1, 136.
48. Fritz, T.; Sainz, A.B.; Augusiak, R.; Brask, J.B.; Chaves, R.; Leverrier, A.; Acín, A. Local orthogonality as a multipartite principle for quantum correlations. Nat. Commun. **2013**, 4, 2263.
49. Amaral, B.; Cunha, M.T.; Cabello, A. Exclusivity principle forbids sets of correlations larger than the quantum set. Phys. Rev. A **2014**, 89, 030101.
50. Navascués, M.; Guryanova, Y.; Hoban, M.J.; Acín, A. Almost quantum correlations. Nat. Commun. **2015**, 6, 6288.
51. Navascués, M.; Wunderlich, H. A glance beyond the quantum model. Proc. R. Soc. A **2010**, 466, 881.
52. Rohrlich, D. PR-Box Correlations Have No Classical Limit. In Quantum Theory: A Two-Time Success Story; Struppa, D.C., Tollaksen, J.M., Eds.; Springer Milan: Milano, Italy, 2014; pp. 205–211.
53. Van Dam, W. Implausible consequences of superstrong nonlocality. Nat. Comput. **2013**, 12, 9–12.
54. Brassard, G.; Buhrman, H.; Linden, N.; Méthot, A.A.; Tapp, A.; Unger, F. Limit on Nonlocality in Any World in Which Communication Complexity Is Not Trivial. Phys. Rev. Lett. **2006**, 96, 250401.
55. Linden, N.; Popescu, S.; Short, A.J.; Winter, A. Quantum Nonlocality and Beyond: Limits from Nonlocal Computation. Phys. Rev. Lett. **2007**, 99, 180502.
56. Pawłowski, M.; Paterek, T.; Kaszlikowski, D.; Scarani, V.; Winter, A.; Żukowski, M. Information causality as a physical principle. Nature **2009**, 461, 1101.
57. Goh, K.T.; Kaniewski, J.; Wolfe, E.; Vértesi, T.; Wu, X.; Cai, Y.; Liang, Y.C.; Scarani, V. Geometry of the set of quantum correlations. Phys. Rev. A **2018**, 97, 022104.
58. Boyd, S.; Vandenberghe, L. Convex Optimization, 1st ed.; Cambridge University Press: Cambridge, UK, 2004.
59. Navascués, M.; Pironio, S.; Acín, A. Bounding the Set of Quantum Correlations. Phys. Rev. Lett. **2007**, 98, 010401.
60. Navascués, M.; Pironio, S.; Acín, A. A convergent hierarchy of semidefinite programs characterizing the set of quantum correlations. New J. Phys. **2008**, 10, 073013.
61. Doherty, A.C.; Liang, Y.C.; Toner, B.; Wehner, S. The Quantum Moment Problem and Bounds on Entangled Multi-prover Games. In Proceedings of the 2008 23rd Annual IEEE Conference on Computational Complexity, College Park, MD, USA, 23–26 June 2008; pp. 199–210.
62. Moroder, T.; Bancal, J.D.; Liang, Y.C.; Hofmann, M.; Gühne, O. Device-Independent Entanglement Quantification and Related Applications. Phys. Rev. Lett. **2013**, 111, 030501.
63. Kullback, S.; Leibler, R.A. On Information and Sufficiency. Ann. Math. Statist. **1951**, 22, 79–86.
64. Van Dam, W.; Gill, R.D.; Grunwald, P.D. The statistical strength of nonlocality proofs. IEEE Trans. Inf. Theor. **2005**, 51, 2812–2835.
65. Acín, A.; Gill, R.; Gisin, N. Optimal Bell Tests Do Not Require Maximally Entangled States. Phys. Rev. Lett. **2005**, 95, 210402.
66. Zhang, Y.; Knill, E.; Glancy, S. Statistical strength of experiments to reject local realism with photon pairs and inefficient detectors. Phys. Rev. A **2010**, 81, 032117.
67. Bancal, J.D.; Pironio, S.; Acin, A.; Liang, Y.C.; Scarani, V.; Gisin, N. Quantum non-locality based on finite-speed causal influences leads to superluminal signalling. Nat. Phys. **2012**, 8, 867–870.
68. Barnea, T.J.; Bancal, J.D.; Liang, Y.C.; Gisin, N. Tripartite quantum state violating the hidden-influence constraints. Phys. Rev. A **2013**, 88, 022123.
69. Chen, S.L.; Budroni, C.; Liang, Y.C.; Chen, Y.N. Natural Framework for Device-Independent Quantification of Quantum Steerability, Measurement Incompatibility, and Self-Testing. Phys. Rev. Lett. **2016**, 116, 240401.
70. Chen, S.L.; Budroni, C.; Liang, Y.C.; Chen, Y.N. Exploring the framework of assemblage moment matrices and its applications in device-independent characterizations. Phys. Rev. A **2018**, 98, 042127.
71. Fiala, J.; Kočvara, M.; Stingl, M. PENLAB: A MATLAB solver for nonlinear semidefinite optimization. arXiv, 2013; arXiv:1311.5240.
72. Christensen, B.G.; Liang, Y.C.; Brunner, N.; Gisin, N.; Kwiat, P.G. Exploring the Limits of Quantum Nonlocality with Entangled Photons. Phys. Rev. X **2015**, 5, 041052.
73. Poh, H.S.; Joshi, S.K.; Cerè, A.; Cabello, A.; Kurtsiefer, C. Approaching Tsirelson’s Bound in a Photon Pair Experiment. Phys. Rev. Lett. **2015**, 115, 180408.
74. Minka, T. The Lightspeed Matlab Toolbox. Available online: https://github.com/tminka/lightspeed (accessed on 18 June 2017).
75. Christensen, B.G.; (University of Wisconsin-Madison, Madison, WI, USA). Personal communication, 2017.
76. Pütz, G.; Rosset, D.; Barnea, T.J.; Liang, Y.-C.; Gisin, N. Arbitrarily Small Amount of Measurement Independence Is Sufficient to Manifest Quantum Nonlocality. Phys. Rev. Lett. **2014**, 113, 190402.
77. Nuzzo, R. Statistical errors: P values, the ’gold standard’ of statistical validity, are not as reliable as many scientists assume. Nature **2014**, 506, 150.
78. Leek, J.T.; Peng, R.D. P values are just the tip of the iceberg. Nature **2015**, 520, 612.
79. Wasserstein, R.L.; Lazar, N.A. The ASA’s Statement on p-Values: Context, Process, and Purpose. Am. Stat. **2016**, 70, 129–133.
80. Smania, M.; Kleinmann, M.; Cabello, A.; Bourennane, M. Avoiding apparent signaling in Bell tests for quantitative applications. arXiv, 2018; arXiv:1801.05739.

**Figure 1.** Flowchart summarizing the steps involved in our application of the prediction-based-ratio method on the simulated data ${\left\{({a}_{i},{b}_{i},{x}_{i},{y}_{i})\right\}}_{i=1}^{{N}_{\mathrm{total}}}$ of a single Bell test. In the first step, we separate the data into two sets, with the data collected from the first ${N}_{\mathrm{est}}$ trials serving as the training data while the rest is used for the actual hypothesis testing. Specifically, the training data is used to compute the relative frequencies $\overrightarrow{f}$ and to minimize the KL divergence ${D}_{\mathrm{KL}}(\overrightarrow{f}\,\|\,\mathcal{H})$ with respect to the set of correlations $\mathcal{H}\in \{\mathcal{NS},\tilde{\mathcal{Q}}\}$ associated, respectively, with the hypothesis of $\mathfrak{N}$ and $\tilde{\mathfrak{Q}}$. The correlation ${\overrightarrow{P}}_{\mathrm{KL}}^{\mathcal{H},*}\in \mathcal{H}$ that minimizes ${D}_{\mathrm{KL}}(\overrightarrow{f}\,\|\,\mathcal{H})$ gives rise to a Bell-like inequality with coefficients ${\left\{R(A=a,B=b,X=x,Y=y)\right\}}_{x,y,a,b}$. The remaining data is then used to compute $t={\prod}_{i>{N}_{\mathrm{est}}}{r}_{i}$ where ${r}_{i}:=R({a}_{i},{b}_{i},{x}_{i},{y}_{i})$. Finally, a p-value bound according to the hypothesis is obtained by computing $\min\{\frac{1}{t},1\}$.

**Table 1.** Summary of frequency distributions of the p-value upper bounds obtained from 500 numerically simulated Bell tests, each consisting of ${N}_{\mathrm{total}}={10}^{6}$ trials and assuming the same i.i.d. nonlocal source $\overrightarrow{P}(v,\epsilon,\left\{{p}_{j}\right\})$ of Equation (13) that lies outside $\tilde{\mathcal{Q}}$. The second and third rows give, respectively, the frequency distributions according to the hypothesis associated with $\mathcal{NS}$ (nonsignaling) and $\tilde{\mathcal{Q}}$ (almost-quantum). For these hypotheses, the smallest p-value upper bounds found among these 500 Bell tests are, respectively, 0.14 and $5.7\times {10}^{-20}$. The second to the fifth columns give, respectively, the fraction of simulated Bell tests having a p-value upper bound (for each hypothesis) that satisfies the given (increasing) threshold (e.g., ${10}^{-10}$ for the second column). Similarly, in the last column, we give the fraction of instances where the p-value upper bound obtained is trivial, i.e., exactly equal to 1. The smaller the p-value upper bound, the less likely it is that a physical theory associated with the hypothesis produces the observed data. Thus, the larger the value in the second (to the fourth) column, the less likely it is that the assumed physical theory holds true. In contrast, the larger the value in the rightmost column, the weaker the empirical evidence against the assumed theory is.

p-Value Bound | $\le {10}^{-10}$ | $\le {10}^{-4}$ | $\le {10}^{-2}$ | $\le {10}^{-1}$ | Trivial |
---|---|---|---|---|---|
$\mathcal{NS}$ | 0 | 0 | 0 | 0 | 97% |
$\tilde{\mathcal{Q}}$ | 58% | 85% | 90% | 93% | 5.8% |

**Table 2.** Summary of frequency distributions of the p-value upper bounds obtained from 500 numerically simulated Bell tests. Each of these Bell tests involves ${N}_{\mathrm{total}}={10}^{6}$ trials and each trial assumes a varying source ${\overrightarrow{P}}_{i}(v,\epsilon,{n}_{i})$ of Equation (14). For the hypothesis of $\mathfrak{N}$ and $\tilde{\mathfrak{Q}}$, associated with $\mathcal{NS}$ (second row) and $\tilde{\mathcal{Q}}$ (third row), respectively, the smallest p-value upper bounds found among these 500 instances are 0.21 and $1.3\times {10}^{-15}$. The significance of each column follows that described in the caption of Table 1.

p-Value Bound | $\le {10}^{-10}$ | $\le {10}^{-4}$ | $\le {10}^{-2}$ | $\le {10}^{-1}$ | Trivial |
---|---|---|---|---|---|
$\mathcal{NS}$ | 0 | 0 | 0 | 0 | 97% |
$\tilde{\mathcal{Q}}$ | 17% | 59% | 69% | 72% | 24% |

**Table 3.** Summary of frequency distributions of the p-value upper bounds obtained from the 180 Bell tests of Ref. [72] according to the hypothesis of $\mathfrak{N}$ and $\tilde{\mathfrak{Q}}$ (associated, respectively, with $\mathcal{NS}$, the second row, and $\tilde{\mathcal{Q}}$, the third row) under the assumption that the measurement settings were randomly chosen according to a uniform distribution. The significance of each column follows that described in the caption of Table 1.

p-Value Bound | $\le {10}^{-10}$ | $\le {10}^{-4}$ | $\le {10}^{-2}$ | $\le {10}^{-1}$ | Trivial |
---|---|---|---|---|---|
$\mathcal{NS}$ | 38% | 45% | 48% | 51% | 48% |
$\tilde{\mathcal{Q}}$ | 35% | 44% | 47% | 49% | 49% |

© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).