Probability Theory as a Physical Theory Gives Insight in Big Topics. Questions to Mathematicians

There is something puzzling about probability theory: does it describe individual events (or systems), or rather ensembles of similar systems? At any rate, probabilities are always measured on ensembles. In this sense probability theory, as a physical theory, is unique: other physical theories describe individual measurements and individual systems. Here it is argued that probability theory can be seen as a general theory of causality (or determinism), thus dealing with the underlying causal connections between systems. This simple, albeit radical, interpretation suggests new avenues of research for fundamental issues in physics and mathematics. For example, it suggests 1) a generalization of the Central Limit Theorem; and 2) a different approach to address the unification of quantum mechanics and relativity theory. Throughout the article precise questions to mathematicians are formulated to advance this research.

As a mathematical theory, probability theory was axiomatized by Andrei Kolmogorov in his reference work of 1933 [4]. This work solved, as a matter of principle, all purely mathematical issues of probability theory. But probability theory is also a theory about the real world, in particular the physical world: it can be applied to physical situations. To do so, one needs to interpret the concept of probability. Interestingly, this is where subtleties and problems come in, giving rise to a long series of paradoxes (such as Bertrand's paradox, the Monty Hall problem, the Borel-Kolmogorov paradox, Bernstein's paradox, etc.). As Tijms writes: "Probability theory is the branch of mathematics in which it is easiest to make mistakes" [5]. Many interpretations of probability are on the market, such as the classical interpretation of Laplace, the frequency interpretation, the subjective or Bayesian interpretation, etc.¹ [6][7][8][9][10][11]. In physics (and beyond), the standard interpretation is the frequency interpretation, stipulating, roughly, that an event's probability is the limit of its relative frequency in a large number of trials. Note that this is a rough definition: phrased like this, it is a marvellous source of bad applications of the theory (see further). The author who gave the most elaborate analysis of the interpretational theory of probability is, it seems, Richard von Mises² [10][11]. Based on his work it becomes clear, if it is not obvious from the start, that probability theory can also be seen as a physical theory, for specific systems. Many of von Mises' basic ideas are quite straightforward; I suppose they are intuitively, implicitly applied by most practitioners of probability theory. But it is in paradoxical or complex situations that an explicit formulation of a precise interpretation of probability becomes necessary, as argued below.
In the following I will start by summarizing a few straightforward ideas that were elaborated in some detail in [12][13], mostly inspired by von Mises.
It seems clear that probability is a property that, strictly speaking, can only be attributed to ensembles of sufficiently similar systems, all characterized by a sufficiently similar 'environment' (or boundary conditions if one prefers). Usual physical properties can be measured on individual systems. But probabilities can only be measured by a series of sufficiently similar experiments (all involving sufficiently similar trial systems). Here it is important to remember that the precise boundary conditions of the experiment (the environment) determine the numerical values of the probabilities [12][13]. In this sense, probabilities 'emerge' out of (long runs of) experiments. The role of the environment or boundary conditions is often forgotten: in mundane cases no-one feels compelled to mention them. "The probability P6 to throw a 6 with this normal die" seems well-defined; everyone expects P6 = 1/6; and everyone knows how to verify this. And yet, as a matter of principle, P6 is not well-defined; in principle one should consider and mention the boundary conditions of the die throw, and these conditions involve both 'initializing' and 'measurement' (or 'observing') conditions.

¹ In the philosophy and the foundations of mathematics there is an enormous body of literature devoted to the interpretation of probability [6][7][8][9][10][11], of which I suspect most physicists are not even aware (I was not before getting interested in philosophy of physics). One more visible indicator of this debate on interpretation is the rise in popularity of the subjective or Bayesian interpretation in quantum mechanics.

² I will not consider nor use here von Mises' mathematical calculus of probability (the so-called 'calculus of collectives'): Kolmogorov's calculus is much simpler. So I focus here on von Mises' interpretational or physical theory. Kolmogorov refers to von Mises as the primary source for the physical interpretation of probability [4].
The initializing conditions of a normal die throw involve randomizing (e.g. by sufficiently vigorous throwing), and the observing conditions involve e.g. interaction with a hard table. One may well imagine specific conditions of throwing a die on a table covered with glue, such that P6, in these experimental conditions, is not 1/6 but close to 1.

Let's briefly look at a famous paradox, Bertrand's, that has intrigued a good part of all famous probability theorists, including Kolmogorov [14]. It goes as follows: "A chord is drawn randomly in a circle. What is the probability that it is longer than the side of the inscribed equilateral triangle?" The reader may try: after enough trying it becomes clear that three possible answers can be given, depending on the physical realization of randomly choosing the chord; this can be done in three non-equivalent ways³. Thus the problem is not well posed; one should specify the precise experimental conditions that allow one to perform a well-defined probabilistic experiment, in particular the initializing conditions. Analyzing problems such as the above confers the definite impression that probability paradoxes arise due to the neglect of the experimental conditions that determine any probability value⁴. The fact that even skilled mathematicians disagree about them seems to point to this reminder: one should not forget that applied probability theory is a physical theory, and that probability as a physical property depends on the environment, the context; it is relational, as so many other physical properties (position, momentum, energy,…).
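The dependence of the answer on the 'initializing conditions' can be checked numerically. The following is a minimal sketch, assuming a unit circle and three commonly used randomizing procedures (the function names are illustrative, not taken from the references); it estimates the probability that a random chord exceeds the side of the inscribed equilateral triangle, which has length √3.

```python
import math
import random

SIDE = math.sqrt(3.0)  # side of the equilateral triangle inscribed in a unit circle


def chord_from_endpoints(rng):
    """Procedure 1: two independent, uniformly distributed points on the circle."""
    t1 = rng.uniform(0.0, 2.0 * math.pi)
    t2 = rng.uniform(0.0, 2.0 * math.pi)
    return 2.0 * abs(math.sin((t1 - t2) / 2.0))


def chord_from_radial_distance(rng):
    """Procedure 2: distance of the chord to the centre uniform on [0, 1)."""
    d = rng.uniform(0.0, 1.0)
    return 2.0 * math.sqrt(1.0 - d * d)


def chord_from_midpoint(rng):
    """Procedure 3: chord midpoint uniformly distributed over the disc area."""
    d = math.sqrt(rng.uniform(0.0, 1.0))  # radius of a point uniform in the disc
    return 2.0 * math.sqrt(1.0 - d * d)


def prob_longer(procedure, n=200_000, seed=0):
    """Relative frequency of chords longer than the triangle side."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(n) if procedure(rng) > SIDE)
    return hits / n


if __name__ == "__main__":
    for procedure in (chord_from_endpoints, chord_from_radial_distance, chord_from_midpoint):
        print(procedure.__name__, round(prob_longer(procedure), 3))
```

Under these assumptions the three procedures yield frequencies near 1/3, 1/2 and 1/4 respectively: the same verbal question corresponds to three different, individually well-defined experiments.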
The role of the experimental boundary conditions becomes particularly clear in quantum mechanics. Quantum mechanics is a probabilistic theory (and maybe the most paradigmatic of all of them) in the sense that, in general, the measurable quantities of quantum mechanics are probabilities. One of the key ideas of the orthodox Copenhagen interpretation is expressed in the following quote by Bohr. It is taken from his 1935 reply to Einstein, Podolsky and Rosen in their debate on the completeness of quantum mechanics [15]: "The procedure of measurement has an essential influence on the conditions on which the very definition of the physical quantities in question rests". It seems that Bohr says here that quantum properties (quantum probabilities) are determined by the observing conditions, or in other words by the observing subsystem of the experiment 'generating' them. But this dependence holds, in principle, for all probabilistic systems, quantum or classical. For classical systems one needs some attention to realize the influence of the observing system (recall the die example above); in quantum mechanics the 'in principle' becomes basic, as correctly emphasized by Bohr. As an example: the probability that an x-polarized photon passes a y-polarizer obviously depends on x and y. In this sense quantum properties are determined by the 'observer' or rather the 'observing system / conditions'. As will be elaborated elsewhere, it thus seems that a precise interpretation of probability solves the infamous measurement problem, i.e. the enigmatic role of the 'observer' in quantum mechanics (see preliminary ideas in [12][13]). Any probability value, quantum or classical, is determined by the 'observer', or if one prefers, by the detector parameters.

³ One can for instance randomly choose two points (homogeneously distributed) on the circle by two independent spins of a pointer; a procedure that leads to the probability 1/3, as can be measured and calculated. But two other 'initializing conditions' (experimental randomizing conditions) exist that lead to a different probability.

⁴ For probabilities 'out there in nature', the initializing and observing conditions are the natural environment.
Let us here also recall that the hallmark of probabilistic or random systems is 1) unpredictability of individual outcomes, and more importantly, 2) 'frequency stabilization'. The genuine probabilistic nature of an event can, in the end, only be verified by the fact that its relative frequency, as it is measured in a repeated experiment, and which is a measure of the event's probability, stabilizes towards a fixed number when the number of trials increases. All probabilistic systems, stemming from the enormous variety of probabilistic disciplines and sub-fields, show this frequency stabilization and satisfy the simple rules of probability theory. To the surely biased taste of this author, this 'universal scheme of necessity' has a mysterious touch to it. And much of the remainder of the article is devoted to exploring this aspect of physical probability.
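Frequency stabilization is easy to observe in a simulation. The sketch below assumes that a seeded pseudo-random generator may stand in for a physical die throw under fixed boundary conditions; the function name and the seed are illustrative choices.

```python
import random


def relative_frequency_of_six(n_throws, seed=1):
    """Relative frequency of the outcome 6 in n_throws simulated die throws."""
    rng = random.Random(seed)
    sixes = sum(1 for _ in range(n_throws) if rng.randint(1, 6) == 6)
    return sixes / n_throws


if __name__ == "__main__":
    # The frequency fluctuates for short runs and settles as the run grows.
    for n in (10, 100, 1_000, 10_000, 100_000):
        print(n, relative_frequency_of_six(n))
```

For long runs the printed frequency should settle close to 1/6, whereas individual outcomes remain unpredictable without knowledge of the generator's internal state.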
The above introductory remarks are intended to show that probability theory is also a physical theory. Note that it is moreover a vastly corroborated theory, and perhaps the most general physical theory. But of what precisely ?

Q1.
What is the subject matter of probability theory as a physical theory? This is the first of a series of questions (Q1-Q6) addressed to mathematicians, physicists, and philosophers; the next questions in particular are of a mathematical nature.
Let us however start cautiously with the simple question Q1. A first and obvious possible answer to Q1 is: probability theory is the theory of random physical systems, or more precisely of ensembles of random systems. Thus probability theory is unique among physical theories, unique in that it describes ensemble characteristics, moreover of an uncountable variety of types (ensembles from chance games, fluid mechanics, population dynamics, quantum mechanics, engineering, biology, etc.). While other branches of physics (let us exclude quantum mechanics and quantum field theories) deterministically describe individual systems, probability describes collections of systems; moreover its measurement normally necessitates a series of repeated experiments. Of course, one can always more or less metaphorically speak of the probability of this individual die to show a 6 etc., and attach probability to individual events, but in an obvious sense this is metaphorical. Probability is a strange animal as a physical property, and I believe this is at the basis of most paradoxes of probability and quantum theory.
Deterministic theories attribute values to events or properties with necessity, in that one can predict them with certainty, based on the knowledge of these theories and of the particular causes (e.g. initial values) acting on the system under scrutiny. Hence there is a necessity underlying deterministic systems; to put it metaphorically, a deterministic system has no choice. But there is a necessity attached to probabilistic systems too. As we just said, any probabilistic system must satisfy with necessity the laws of probability calculus. Remembering Occam's razor, and keeping things as simple as possible, one is tempted to assume that these two types of necessity are, in reality, one and the same. In other words, one is tempted to assume⁵:

HYP-1. Probabilistic systems are, in reality, deterministic systems in disguise. Any probability emerges from underlying deterministic processes.

This is a very old hypothesis, tracing back at least to Laplace, who famously considered probabilities as emerging from a deterministic substratum; a creature with infinite knowledge would not need the tool of probability but would be able to predict everything with necessity. In this context, recall that countless probabilistic systems are indeed known to be reducible to …

This 'deterministic' or 'causal' interpretation of probability theory, as I will term it, points to a next possible (radical) answer to Q1. Namely:

HYP-2.
Probability theory is the most general theory of causality (or causation or determinism), or more precisely the most general theory of the causality of ensembles of random systems.
Instead of assuming HYP-2 one can be more cautious and state that "probability theory is closely related to causality / determinism". But our exercise today is to put things as sharply as possible, and bring hypotheses to their logical end.
Let us now explore a few more arguments in favor of the closely related HYP-1 and HYP-2, as well as of their heuristic fruitfulness. The most obvious argument is the following. Any practitioner of probability theory intuitively acknowledges the link between causality and probability, via the concept of probabilistic dependence. One knows that there is a (somewhat subtle) link between correlation and causation; note that this link has been verified countless times.
Indeed, books have recently been published that try to mathematically describe various forms of causation with probabilistic tools (in essence, the concept of probabilistic dependence), cf. e.g. [16]. But it seems rarely acknowledged that this is a genuine argument in favor of the causal / deterministic interpretation of probability theory, HYP-1 and HYP-2. To put things in a slightly more dramatic perspective, here is a question by Kolmogorov in his Foundations of the Theory of Probability ([4], p. 9): "In consequence, one of the most important problems in the philosophy of the natural sciences is, in addition to the well-known one regarding the essence of the concept of probability itself, to make precise the premises which would make it possible to regard any given events as independent. This question, however, is beyond the scope of this book." Now, HYP-1 and HYP-2 allow one to answer the question: two probabilistic events are independent if they are not causally connected, i.e. if one event is not causally determining the other one, and if there is no common cause determining the two events. We intuitively feel that the outcomes of two thrown dice are independent because we intuitively feel that there is no causal connection between the outcomes, in normal circumstances. If the great Kolmogorov intuits that there is something fundamentally puzzling about the notion of probabilistic dependence, and if HYP-1 allows one to solve the puzzle, perhaps we should take this hypothesis seriously.
A further argument in favor of HYP-1 is given by the Central Limit Theorem (CLT) of probability theory. There is something intriguing about the CLT, to which such mathematicians as de Moivre, Laplace, Chebyshev, Lyapunov, Pólya, Gnedenko and Kolmogorov contributed.
It has been termed the "unofficial sovereign of probability theory" ([5], p. 169); it is not rare to read that it is "one of the most remarkable theorems in the whole of mathematics" (e.g. [17]). Francis Galton wrote of it: "I know of scarcely anything so apt to impress the imagination as the wonderful form of cosmic order expressed by the Law of Frequency of Error. The law would have been personified by the Greeks and deified, if they had known of it. It reigns with serenity and in complete self-effacement amidst the wildest confusion. The huger the mob, and the greater the apparent anarchy, the more perfect is its sway. It is the supreme law of Unreason. Whenever a large sample of chaotic elements are taken in hand and marshalled in the order of their magnitude, an unsuspected and most beautiful form of regularity proves to have been latent all along."

Are there good grounds why mathematicians, usually so sober, become lyrical in the face of this theorem? In essence, the CLT states that any probabilistic variable that is the average, i.e. the (suitably normalized) weighted sum, of many independent variables has, in the limit, a normal (Gaussian) distribution, independently of the distribution of the contributing variables (provided, e.g., that their variances are finite). Now, this theorem of pure mathematics is again interpreted causally in almost all works on (applied) probability. Textbooks typically present the theorem as an explanation of why the Gaussian distribution is so overwhelmingly present in the world, describing an unlimited variety of stochastic phenomena and properties. The textbook rationale is simple: many properties are the average result of many other independent phenomena; or more precisely: many properties are the effect of many independent causes. Hence we see deterministic reasoning appearing again in applied probability. This fact corroborates the conjecture that probability theory and the CLT are about the causal connectedness of things.
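The content of the CLT can be illustrated numerically. The following is a minimal sketch under stated assumptions: the contributing variables are taken to be exponentially distributed (a strongly skewed, clearly non-Gaussian choice), and skewness plus the fraction of values within one standard deviation serve as crude indicators of normality. All function names are illustrative.

```python
import random
import statistics


def draw_exponential(rng):
    """One draw from an exponential distribution with mean 1 (skewness 2)."""
    return rng.expovariate(1.0)


def sample_means(draw, n_terms, n_samples, seed=0):
    """Return n_samples averages, each over n_terms independent draws."""
    rng = random.Random(seed)
    return [sum(draw(rng) for _ in range(n_terms)) / n_terms for _ in range(n_samples)]


def skewness(xs):
    """Third standardized moment; 0 for a normal distribution."""
    m = statistics.fmean(xs)
    s = statistics.pstdev(xs)
    return sum(((x - m) / s) ** 3 for x in xs) / len(xs)


def fraction_within_one_sigma(xs):
    """Fraction of values within one standard deviation of the mean
    (about 0.683 for a normal distribution)."""
    m = statistics.fmean(xs)
    s = statistics.pstdev(xs)
    return sum(1 for x in xs if abs(x - m) <= s) / len(xs)


if __name__ == "__main__":
    for n_terms in (1, 5, 50):
        means = sample_means(draw_exponential, n_terms, 10_000)
        print(n_terms, round(skewness(means), 2), round(fraction_within_one_sigma(means), 3))
```

As the number of averaged terms grows, the skewness shrinks towards 0 and the one-sigma fraction approaches the Gaussian value, although each contributing variable remains far from Gaussian.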
But if this interpretation of the CLT corroborating HYP-1 is correct, then one wonders whether the CLT cannot be generalized. Note first that, in physics, the usual way to formalize causal dependence is via functional dependence: if a variable y is caused or determined by variables (x1,…,xN) then y = f(x1,…,xN) for some function f. (Throughout this article it is useful for mathematicians to remember that in physics all functions can be considered as sufficiently "well-behaved".) Now, if it indeed suffices, as proposed in the usual rationale to interpret the CLT, that a variable is determined by a large number of causes to have a normal distribution, and if causal determination is expressed as functional dependence, then one is tempted to ask:

Q2. Can the Central Limit Theorem be generalized as follows? "Under broad conditions, a stochastic variable that is a function of many independent variables will be normally distributed, independently of the distribution of the contributing variables".
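Q2 can be probed numerically before any proof is attempted. The sketch below is only an exploratory experiment under assumptions of my choosing: the contributing variables are uniform on [0, 1), and two hypothetical test functions are compared, a smooth function of the mean and the maximum. Skewness is used as a crude indicator of departure from normality.

```python
import math
import random
import statistics


def smooth_of_mean(xs):
    """A smooth nonlinear function in which each variable has a small influence."""
    return math.sin(sum(xs) / len(xs))


def largest_value(xs):
    """A function dominated by a single extreme variable."""
    return max(xs)


def simulate(f, n_vars=50, n_samples=10_000, seed=0):
    """Evaluate f on n_samples independent sets of n_vars uniform variables."""
    rng = random.Random(seed)
    return [f([rng.random() for _ in range(n_vars)]) for _ in range(n_samples)]


def skewness(ys):
    """Third standardized moment; 0 for a normal distribution."""
    m = statistics.fmean(ys)
    s = statistics.pstdev(ys)
    return sum(((y - m) / s) ** 3 for y in ys) / len(ys)


if __name__ == "__main__":
    print("smooth function of the mean:", round(skewness(simulate(smooth_of_mean)), 2))
    print("maximum:", round(skewness(simulate(largest_value)), 2))
```

In this experiment the smooth function gives a nearly symmetric, Gaussian-looking output, while the maximum stays strongly skewed. This suggests that the "broad conditions" in Q2 would have to exclude functions dominated by a few of their arguments.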
The CLT seems to restrictively consider a special case for the function f, namely $y = \frac{1}{N}\sum_{i=1}^{N} x_i$. Here is a first hint. According to Taylor's theorem one can approximate a function f(x1,…,xN), to first order around a point $(x_1^0,\dots,x_N^0)$, as follows:

$f(x_1,\dots,x_N) \approx f(x_1^0,\dots,x_N^0) + \sum_{i=1}^{N} \left.\frac{\partial f}{\partial x_i}\right|_{x^0} (x_i - x_i^0)$.   (1)

…

Let us therefore see on which assumptions the venerable theorem is based; these are all probabilistic assumptions. Getting insight into this theorem is useful for another reason: it seems to be a fundamental obstacle in the unification of quantum mechanics and general relativity.
In essence, Bell assumes that in a Bell experiment, in which a quantum property such as spin or polarisation (σ) is measured on two entangled electrons or photons flying off in opposite directions, the average product of σ1 and σ2 is given by the following formula, if one assumes hidden determinism:

$M(a,b) = \langle \sigma_1 \cdot \sigma_2 \rangle = \int \sigma_1(a,\lambda)\, \sigma_2(b,\lambda)\, \rho(\lambda)\, d\lambda$.   (2)

A formula as (2) …

But here a few puzzling thoughts pop up immediately, or rather, a few questions to mathematicians. Formula (2) seems a priori a perfectly acceptable expression to physicists; it has been used and tested against experiment in countless other statistical contexts. But first note that (2) assumes 'partial' determinism, in that it still introduces a probability measure ρ(λ); HYP-1 stipulates that even this ρ should, in principle, be explainable by deterministic processes. It seems the answer to Q4 is 'yes' for instance if there are realistic deterministic scenarios in which no probability density ρ(λ) exists; but as far as I know there are no compelling reasons to assume this. But in any case, if mathematical conditions stipulated in Q3-Q4 can be found and correspond to realistic physical conditions (which is clearly suggested by the fact that applied probability theory is a physical theory), then Bell's analysis is not applicable.
Let us focus on Q3. Could there be correlations between freely chosen analyzer settings (a and b) and spacelike separated particle properties? A handful of researchers have proposed that the answer to Q3 is yes, by arguing that in a truly deterministic world (and not just in a partially deterministic one as in (2)) there can, in principle, exist correlations between the particles and the choices 'a' and 'b' (a list of references is given in [23][24]). This objection to Bell's theorem is old, and Bell himself was aware of it; but it has always been dismissed by the large majority as a 'conspiratorial' solution, in 'obvious' contradiction with the existence of free will. According to this prevailing position, ρ(λ|a,b) or equivalently ρ(a,b|λ) make no sense: freely chosen parameters cannot be stochastically determined by particle properties, since this would contradict free will. Fans of determinism and HYP-1 are not so sure. They argue that, in principle, since the Big Bang all events are correlated and have common causes; hence (3), i.e. (2) with ρ(λ|a,b) in place of ρ(λ), could well be the better formula.
Moreover, invoking 'free will' to justify (2), as is often done in the quantum foundations community, is neglecting a mainstream conclusion from other communities: the majority of researchers (e.g. philosophers and neuroscientists) having studied free will professionally have come to the conclusion that free will is compatible with total determinism ( [25], p. 242). According to these experts one cannot invoke free will to justify (2).
I have often discussed this issue with physicists, and the argument that always comes up is the following: it would be a mind-boggling cosmic conspiracy that ρ(λ|a,b) would always (whatever the precise sequences of the choices of a and b are, and whatever the way a and b are determined, be it by free choice of experimenters or by the number of mouse droppings in a lab [26] or by cosmic photons [21][22][23]) be exactly such that it gives rise, via (3), to the quantum result for M(a,b). Is it really strange that a correlation exists between the λ-field (taken at the Big Bang) and (a,b)? A priori and in principle, in a fully deterministic or fully correlated system any probability density ρ(λ) depends on all other variables (the set X) of the system, so that ρ(λ|X) ≠ ρ(λ) (all variables can be taken at any time). In practice, many of the variables X play no role. But in the Bell experiment a and b can be expected to play a predominant role since they are the experimental 'boundary conditions', the analyzer settings. (In this debate I believe it is wise to remember the rich history of 'conspiracies': usually they are in the eye of the beholder. Seeing a conspiracy by higher powers is, most of the time, not seeing the necessary deterministic processes under the surface [24].) But admittedly, this qualitative argument may only convince the convinced. For sure, fans of determinism have to admit that (2) is a formula that has always worked for macroscopic systems.
To the best of my knowledge there is no exception known to (2) in the macroscopic world in comparable statistical cases: an average value of a stochastic variable x(λ) = σ1(a,λ)·σ2(b,λ) can be calculated by (2) by integrating over ρ(λ), the latter density being independent of freely or randomly chosen (a,b). Still, it may make sense to perform experiments on well-chosen macroscopic systems to verify this claim. In Ref. [27] it is argued that fluid-mechanical pilot-wave systems are interesting candidates to answer Q3 or Q4 experimentally; they consist of oil droplets resonantly interacting with a surface wave. These fluid systems have an intriguing capacity to mimic quantum properties [40][41][42][43]. Furthermore they have a holistic aspect to them, in that they are determined by the boundary conditions of the whole experimental set-up (just as quantum systems); as a consequence they show massive correlation (again just as quantum systems). Finally, these are nonlinear hence potentially chaotic systems, as has been experimentally proven. That means that even infinitesimally weak (but non-zero) nonlocal effects in them might lead to violation of the Bell inequality⁷ [28]. Others have proposed to test Eq. (3) in the quantum realm [46].
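The difference between a setting-independent density ρ(λ) and a setting-dependent density ρ(λ|a,b) can be made concrete with two toy models. The sketch below is purely illustrative and rests on assumptions not taken from the references: in the first model λ is an angle with uniform density and the outcomes are signs of cosines; in the second, λ = (s1, s2) simply fixes both outcomes, and its density is chosen by hand so as to depend on the settings. The CHSH form of the Bell inequality, S ≤ 2, is used as the test.

```python
import math


def sign(x):
    """Return +1 for x >= 0 and -1 otherwise."""
    return 1 if x >= 0 else -1


def m_setting_independent(a, b, k=20_000):
    """Average product for a toy local model with rho(lambda) = 1/(2*pi),
    independent of a and b; the integral is a midpoint sum over lambda."""
    total = 0
    for i in range(k):
        lam = 2.0 * math.pi * (i + 0.5) / k
        total += -sign(math.cos(lam - a)) * sign(math.cos(lam - b))
    return total / k


def m_setting_dependent(a, b):
    """Average product for a toy model in which lambda = (s1, s2) fixes both
    outcomes and the density rho(lambda | a, b) depends on the settings."""
    total = 0.0
    for s1 in (+1, -1):
        for s2 in (+1, -1):
            rho = (1.0 - s1 * s2 * math.cos(a - b)) / 4.0  # assumed, sums to 1
            total += s1 * s2 * rho
    return total


def chsh(m, a, a2, b, b2):
    """CHSH combination |M(a,b) - M(a,b')| + |M(a',b) + M(a',b')|."""
    return abs(m(a, b) - m(a, b2)) + abs(m(a2, b) + m(a2, b2))


if __name__ == "__main__":
    settings = (0.0, math.pi / 2, math.pi / 4, 3 * math.pi / 4)
    print("rho(lambda):      S =", round(chsh(m_setting_independent, *settings), 3))
    print("rho(lambda|a,b):  S =", round(chsh(m_setting_dependent, *settings), 3))
```

With the setting-independent density the CHSH value stays at the bound 2, whereas the hand-built setting-dependent density gives M(a,b) = -cos(a-b) and hence 2√2, the value quantum mechanics predicts for the spin singlet. The sketch shows only that such a density suffices formally; it says nothing about whether it is physically realized.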
But in the meanwhile question Q3 remains entirely pertinent. There may exist convincing quantitative arguments from probability theory or elsewhere to address it. Here is what I consider at present the most convincing semi-quantitative argument why Eq. (3) could be correct in the subquantum realm, while Eq. (2) works in the macroscopic realm. Roughly put, it is based on the idea that the λ are not just any sub-quantum variables, but the 'ultimate' variables taken at the Big Bang (they thus easily escape from the spectacular attempt to probe 600-year old variables correlated with light from Milky Way stars [21]). Indeed, assume for a moment that the λ are the degrees of freedom arising in the Theory of Everything, as envisaged by 't Hooft notably [26,29]. Now, one can certainly not exclude that at this very fundamental level "everything is causally connected or correlated with everything" (again, an idea that the determinist finds highly convincing, nay obvious, in view of the Big Bang), and that one has to consider for these ultimate degrees of freedom their correlations with a and b. Then (3) rather than (2) is the correct formula. Only when considering sets of 'usual' variables such as (σ1, σ2, a, b) that belong to the macroscopic (a, b) or quantum (σ1, σ2) level can one sometimes assume independence between variables (e.g. a and b are independent as free variables, as one can easily experimentally verify). But already on these levels some independencies make no sense anymore (e.g. … does not make sense, one has to consider …). Going one stage deeper to the fundamental level λ then induces the aboriginal correlations as in (3). So in a nutshell, the idea is that at the macroscopic level correlations may be lost or washed out due to a statistical or large-ensemble effect, while for the most fundamental degrees of freedom this is not the case: these keep their universal correlation. Note that probability theory allows that general variables (a, b, λ) are correlated as in (3), even if (a, b) are not correlated: this is an application of Bernstein's paradox [30,4]; and see the explicit calculation in [26].

⁷ If such violation would occur in the droplet system, this would lend support to 'background-based' hidden-variable theories such as stochastic electrodynamics and related theories [44][45].
Bernstein's paradox states that a set of pair-wise independent variables may not be jointly independent (there is no independence between all subensembles) [4,31]. This may seem counterintuitive at first sight, but interestingly, probability theorists have found that this is the rule rather than the exception [31]. (No wonder, the determinist thinks: everything is in principle connected to everything.) Maybe this offers a further line of research. Let us therefore ask:

Q5. Does Bernstein's paradox offer insights in Q3-Q4?
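Bernstein's paradox can be verified by exhaustive enumeration. The following is a minimal sketch using a standard variant of the example (an assumption of this illustration, not necessarily the construction of [30]): two fair binary variables x and y, and a third variable z defined as their exclusive-or.

```python
from fractions import Fraction
from itertools import combinations, product

# Four equally likely outcomes (x, y, z) with z = x XOR y.
OUTCOMES = [(x, y, x ^ y) for x, y in product((0, 1), repeat=2)]


def prob(event):
    """Exact probability of an event (a predicate on outcomes)."""
    return Fraction(sum(1 for o in OUTCOMES if event(o)), len(OUTCOMES))


def pairwise_independent():
    """Check P(A and B) = P(A) P(B) for every pair of variables and values."""
    for i, j in combinations(range(3), 2):
        for u, v in product((0, 1), repeat=2):
            joint = prob(lambda o: o[i] == u and o[j] == v)
            if joint != prob(lambda o: o[i] == u) * prob(lambda o: o[j] == v):
                return False
    return True


def jointly_independent():
    """Check whether the joint distribution of (x, y, z) factorizes completely."""
    for u, v, w in product((0, 1), repeat=3):
        joint = prob(lambda o: o == (u, v, w))
        marginals = (prob(lambda o: o[0] == u)
                     * prob(lambda o: o[1] == v)
                     * prob(lambda o: o[2] == w))
        if joint != marginals:
            return False
    return True


if __name__ == "__main__":
    print("pairwise independent:", pairwise_independent())
    print("jointly independent: ", jointly_independent())
```

Every pair of variables passes the independence test, yet the triple does not factorize: for instance P(x=1, y=1, z=1) = 0 while the product of the marginals is 1/8. Any two of the variables fully determine the third.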
Here is a simple proof, then, that Bell's theorem evaporates in the above-mentioned conditions. (The first proof that (3) can lead not only to violation of the Bell inequality but to the correct quantum expression for M(a,b) was given by Brans [32].) The model below assumes that the ultimate variables are fully determined, correlated with emerging variables (such as a, b), and that nontrivial (≠ 0, 1) probabilities can emerge at a higher level, which is HYP-1. As was realized by Bell soon after his seminal article of 1964, Eq. (2) can actually be generalized. This more general variant assumes that the λ do not deterministically determine σ1 and σ2, but stochastically determine their probability. In other words, in this case:

$M(a,b) = \int \sum_{\sigma_1,\sigma_2} \sigma_1 \sigma_2 \, P(\sigma_1|a,\lambda)\, P(\sigma_2|b,\lambda)\, \rho(\lambda|a,b)\, d\lambda$.   (4)

This expression is only based on standard rules of probability calculus and on the physical assumption that $P(\sigma_1,\sigma_2|a,b,\lambda) = P(\sigma_1|a,\lambda)\, P(\sigma_2|b,\lambda)$. This is the so-called Clauser-Horne factorability condition, and is generally accepted as a consequence of local causality (relativity theory), since (λ, a, b) are mutually spacelike separated in advanced experiments. Note that the deterministic case (3) is a special case of (4), for $P(\sigma_1|a,\lambda) = \delta_{\sigma_1,\sigma_1(a,\lambda)}$ and $P(\sigma_2|b,\lambda) = \delta_{\sigma_2,\sigma_2(b,\lambda)}$ (δ is the Kronecker delta; both probabilities are then only 0 or 1). If the ab-initio degrees of freedom are fully determined, λ, having gigantically many components, takes one value, λ0, and …

This type of M(a,b) can easily lead to a violation of the Bell inequality, as is well-known and straightforward to prove (note that …

Q6. Are there simple model systems (such as cellular automata, spin-lattices,…) that can help answer Q3-Q4? E.g. systems that are deterministic at bottom, and in which probabilities as ρ(λ|a,b) emerge (even when (λ, a, b) are mutually spacelike separated).
't Hooft's Cellular Automaton Interpretation of quantum mechanics is evidently much more sophisticated and developed than e.g. spin-lattices, but it is a cellular automaton model too. Other related models are investigated in [33]. That cellular automata are potent devices to simulate complex dynamical systems, is for instance shown by the fact they can reproduce the Navier-Stokes equation [34]. More on this equation in a moment.
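To give a flavour of how a strictly deterministic rule can produce probability-like behaviour, here is a minimal sketch of an elementary cellular automaton. It uses Wolfram's Rule 30 purely as a well-known illustration; it is an assumption of this example and is not 't Hooft's model nor any of the models of [33]. The row of cells is stored as a Python integer, and the relative frequency of 1s in the centre column is measured.

```python
def rule30_center_column(n_steps):
    """Centre-cell values of Rule 30 over n_steps rows, starting from a single
    live cell. Bit i of `state` holds cell i; bit i+1 is its left neighbour."""
    state = 1 << n_steps  # single 1 at the centre (bit index n_steps)
    column = []
    for _ in range(n_steps):
        column.append((state >> n_steps) & 1)
        # Rule 30: new cell = left XOR (centre OR right)
        state = (state >> 1) ^ (state | (state << 1))
    return column


def relative_frequency_of_ones(n_steps):
    """Relative frequency of 1s in the first n_steps centre cells."""
    column = rule30_center_column(n_steps)
    return sum(column) / len(column)


if __name__ == "__main__":
    print(rule30_center_column(8))
    for n in (100, 1_000, 2_000):
        print(n, relative_frequency_of_ones(n))
```

Every cell value follows with necessity from the initial row, yet the centre column looks irregular and its relative frequency of 1s appears to settle near 1/2 in such runs. Whether automata of this kind can also generate correlations of the form ρ(λ|a,b) is exactly what Q6 asks.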
Going over these arguments pro and contra, at times the author is enthusiastically convinced that HYP-1 must be true and Bell's theorem inapplicable. HYP-1 corresponds to the simplest world picture, and seems of a singular heuristic import, which can be summarized by the following sane rule of conduct: keep on hunting for causes. HYP-1 states that all phenomena, including quantum ones, have deterministic causes. Maintaining, as quantum orthodoxy has it, that P(σ1|a) = ½ is determined by nothing, so that the measured spin of an electron assumes a value say +1 rather than -1 on the basis of nothing, is a difficult thing to swallow for the determinist. Of course, it may well be that it is ultimately impossible to decide whether the universe is fully deterministic or probabilistic; but this is not the most important point; what is essential is the question whether extra variables (as deterministic or probabilistic causes) are conceivable (cf. footnote 6). One might therefore prefer to use 'determinism' in a somewhat broader sense, and replace in HYP-1 'Any probability emerges from underlying deterministic processes' by 'Any probability emerges from underlying causal processes, i.e. variables determining this probability'.
According to the above interpretation, Bell's theorem teaches us not that nature is non-local⁸ in the sense of superluminal; nor that nature is irreducibly acausal, indeterministic; but rather that there is no intermediate λ-level (underneath the quantum level) in which (2) …

Once this interpretation is adopted, we must bring it to its logical conclusion, even if it goes against mainstream views. One of the pressing problems of physics is the unification of quantum mechanics (or QFTs) and general relativity, which seems to run up against problems of principle. From the very abstract point of view of probability theory, it may be that the problem is this: quantum mechanics is a probabilistic theory, while relativity is a deterministic one. But HYP-1 suggests that the schism is not insurmountable after all. HYP-1, taken at face value, suggests the following strategy for the unification program: one should start from deterministic theories that can incorporate gravity and see whether these can generate quantum mechanics or QFT (as an emergent or effective theory). In this speculative exercise, fluid mechanics comes to mind. The mathematics of fluid mechanics, in essence the Navier-Stokes equation, is surprisingly powerful; better understanding it is one of the seven 'Millennium Problems' of mathematics. It is well-known that the Navier-Stokes equation can be rewritten so as to yield the 1-particle Schrödinger equation [35][36][37]. Upon closer inspection, the probabilistic character of the latter equation comes in via the assumption of stochastically fluctuating 'elementary units' or 'singularities' within a fluid element [36]. More recently, it was shown by W. Unruh that the Navier-Stokes equation can be rewritten as the equation for a massless scalar field in a geometry with a Schwarzschild metric near the horizon of a black hole. The quantized motion of sound waves in a convergent fluid flow is an analog model of a quantum field in a classical gravitational field.
This allowed the prediction of 'sonic black holes' emitting a phononic version of Hawking radiation [38]; a prediction that has been verified [39]. From the perspective developed in this article, it seems these results deserve more attention in the unification program.

⁸ … many semi-professional text will do: https://www.nytimes.com/2015/10/22/science/quantum-theory-experiment-saidto-prove-spooky-interactions.html (dated 21.10.2015, retrieved 05.01.2019).
In conclusion, it was argued here that probability theory, if taken seriously as a physical theory, offers a unifying framework allowing one to address, besides some puzzles of quantum interpretation, several foundational questions. Specifically, it was argued that there is one hypothesis which offers an explanation for 1) Kolmogorov's problem of probabilistic dependence; 2) the interpretation of the Central Limit Theorem; and 3) Bell's theorem, namely HYP-1, in short, the hypothesis of determinism. On the other hand, indeterminism ('no hidden variables') remains entirely silent regarding 1) and 2), and leaves 3) as an obstacle rather than an argument for the unification program. Sure, HYP-1 is non-standard in modern physics; but then, physics seems to need new ideas [1][2][3].
As a summary, here is the list of hypotheses and questions proposed above (HYP-2 is a possible answer to Q1). It is hoped that these questions will inspire others to go further.
HYP-1. Probabilistic systems are, in reality, deterministic systems in disguise. Any probability emerges from underlying deterministic processes.

HYP-2.
Probability theory is the most general theory of causality (or causation or determinism), or more precisely the most general theory of the causality of ensembles of random systems.

Q1.
What is the subject matter of probability theory as a physical theory ?
Q2. Can the Central Limit Theorem be generalized as follows ?: "Under broad conditions, a stochastic variable that is a function of many independent variables will be normally distributed, independently of the distribution of the contributing variables".
Q3. In Eq. (2), should ρ(λ|a,b) be used instead of ρ(λ) to describe hidden-variable scenarios in Bell experiments (even in the most advanced dynamic ones)?
Q4. Are there mathematical conditions in which Eq. (2) is not applicable to calculate the average product of two stochastic variables <σ1.σ2> that are determined by other variables?

Q5. Does Bernstein's paradox offer insights in Q3-Q4?
Q6. Are there simple model systems (such as cellular automata, spin-lattices,…) that can help answer Q3-Q4? E.g. systems that are deterministic at bottom, and in which probabilities as ρ(λ|a,b) emerge (even when (λ, a, b) are mutually spacelike separated).
Acknowledgements. I would like to greatly thank, for discussion, Henry E. Fischer.