Article

On the Relativity of Quantumness as Implied by Relativity of Arithmetic and Probability

Wydział Fizyki Technicznej i Matematyki Stosowanej, Politechnika Gdańska, 80-233 Gdańsk, Poland
Entropy 2025, 27(9), 922; https://doi.org/10.3390/e27090922
Submission received: 27 July 2025 / Revised: 20 August 2025 / Accepted: 27 August 2025 / Published: 2 September 2025
(This article belongs to the Special Issue Quantum Measurement)

Abstract

A hierarchical structure of isomorphic arithmetics is defined by a bijection $g_{\mathbb{R}}:\mathbb{R}\to\mathbb{R}$. It entails a hierarchy of probabilistic models, with probabilities $p_k=g^k(p)$, where $g$ is the restriction of $g_{\mathbb{R}}$ to the interval $[0,1]$, $g^k$ is the $k$th iterate of $g$, and $k$ is an arbitrary integer (positive, negative, or zero; $g^0(x)=x$). The relation between $p$ and $g^k(p)$, $k>0$, is analogous to the one between probability and neural activation function. For $k\ll -1$, $g^k(p)$ is essentially white noise (all processes are equally probable). The choice of $k=0$ is physically as arbitrary as the choice of origin of a line in space, hence what we regard as experimental binary probabilities, $p_{\rm exp}$, can be given by any $k$, $p_{\rm exp}=g^k(p)$. Quantum binary probabilities are defined by $g(p)=\sin^2\frac{\pi}{2}p$. With this concrete form of $g$, one finds that any two neighboring levels of the hierarchy are related to each other in a quantum–subquantum relation. In this sense, any model in the hierarchy is probabilistically quantum in appropriate arithmetic and calculus. And the other way around: any model is subquantum in appropriate arithmetic and calculus. Probabilities involving more than two events are constructed by means of trees of binary conditional probabilities. We discuss from this perspective singlet-state probabilities and Bell inequalities. We find that singlet-state probabilities involve simultaneously three levels of the hierarchy: quantum, hidden, and macroscopic. As a by-product of the analysis, we discover a new (arithmetic) interpretation of the Fubini–Study geodesic distance.

1. Introduction

In brief, the quantum measurement problem consists of finding a rule that correlates states of a quantum system with those of a macroscopic observer. When phrased in probabilistic terms, the problem is to find a consistent rule of replacing joint probabilities, $p(a,b)$, by conditional probabilities, $p(a|b)$, where $a$ and $b$ represent states (or properties) of the system and the observer, respectively. In standard quantum mechanics the rule can be inferred from Bayes law by the following sequence of equivalences:
$$p(a|b)=\frac{p(a,b)}{p(b)}=\frac{\mathrm{Tr}(\rho P_b P_a P_b)}{\mathrm{Tr}(\rho P_b)}=\mathrm{Tr}\left(\frac{P_b\rho P_b}{\mathrm{Tr}(P_b\rho P_b)}P_a\right)=\mathrm{Tr}(\rho_b P_a). \tag{1}$$
Thus, the process of conditioning by the event “b has occurred” can be represented by the “state vector reduction”,
$$\rho\mapsto\rho_b=\frac{P_b\rho P_b}{\mathrm{Tr}(P_b\rho P_b)}. \tag{2}$$
However, do we really need (2)? From an operational point of view, it is enough if we know the joint probability,
$$p(a,b)=\mathrm{Tr}(\rho P_b P_a P_b), \tag{3}$$
and the probability of the condition,
$$p(b)=\mathrm{Tr}(\rho P_b). \tag{4}$$
Both numbers are directly related to experimental data, so (2) is redundant.
If we try to generalize the above procedure beyond quantum mechanics, various possibilities arise. In nonlinear quantum mechanics, for example, once we obtain $p(a,b)$ and $p(b)$, we can deduce the mathematical form of an effective state vector reduction, but it will not coincide with (2), because the sequence of transformations (1) will no longer be true (cf. [1] for the details). A naive combination of (2) with nonlinear evolution of states implies the inconsistency known as faster-than-light communication [2,3,4,5]. Of course, one can work with the projection postulate even in nonlinear quantum mechanics (eliminating the faster-than-light effect), but the form of state vector reduction must first be derived in a consistent way from Bayes law [1]. Here, consistency is the keyword.
Bayes law, when written as $p(a,b)=p(a|b)\,p(b)$, is known as the product rule. Jaynes [6] (following the ideas of Aczél [7] and Cox [8]) derives the product rule from some very general desiderata of consistent and plausible reasoning but, interestingly, what one finds turns out to be more general,
$$p(a,b)=g^{-1}\big(g(p(a|b))\,g(p(b))\big), \tag{5}$$
where $g$ is some monotone non-negative function (cf. Equation (2.27) in [6]). Still, for Jaynes, $p(\cdot)$ is not yet a probability. His intuition tells him that the probability (or, rather, a measure of plausibility) is given by $g(p(\cdot))$, so that the product rule is reconstructed in the standard form,
$$g(p(a,b))=g(p(a|b))\,g(p(b)). \tag{6}$$
What we will discuss later on in this paper employs a possibility that was not taken into account by Jaynes. Namely, we will treat formulas such as (5) as a definition of a new product, $\odot$, so that
$$p(a,b)=g^{-1}\big(g(p(a|b))\,g(p(b))\big)=p(a|b)\odot p(b). \tag{7}$$
We will also see that $g(p)$ and its higher iterates have intriguing similarities to neural activation functions, whereas higher iterates of $g^{-1}(p)$ resemble white noise.
A new product is an element of a new arithmetic, leading us ultimately to a whole hierarchical structure of such generalized models. As one of the conclusions, we will find that both $p$ and $g(p)$ may be treated as genuine probabilities, provided $g$ is restricted to the class discussed in detail in Section 2. One of the possibilities, directly related to the measurement problem, is that $p$ are probabilities at a hidden-variable level, whereas $g(p)$ are the quantum ones. We will see that any two neighboring levels of the hierarchy are related to each other in a way that may be regarded as a form of a quantum–subquantum relationship. This will lead to the idea of relativity of quantumness.
In any such generalized and fundamental theory one is necessarily confronted with the chicken-or-egg dilemma: What was first, $p(a,b)$ and $p(b)$, or $p(a|b)$ and $p(b)$? The Bayes law that defines the conditional probability in terms of the joint probability, or the product rule that defines the joint probability in terms of the conditional probability?
An alternative form of the dilemma can be expressed in terms of the projection postulate: Do we first define conditional probabilities in terms of some given form of state vector reduction, or do we begin with joint probabilities and then infer the form of state vector reduction? In nonlinear quantum mechanics, the latter strategy is superior to the former one. However, in the Bayesian approach to probability, one updates probabilities on the basis of prior information, so the conditional probabilities are superior to the joint ones.
The formalism of arithmetic hierarchies discussed in the present paper clearly prefers the Bayesian approach. The reason lies in the three fundamental lemmas discussed in Section 2, which are true only for binary probabilities. Binary coding has priority, because probabilities involving more than two events have to be constructed in terms of binary trees of conditional probabilities. Binary coding becomes as fundamental for probability theory as the two-spinors are fundamental for relativistic physics [9].
We begin in Section 2 by recalling the three fundamental lemmas about the functional equation $g(p)+g(1-p)=1$. In Section 3, we construct a hierarchy of isomorphic arithmetics associated with $g(p)$. The hierarchy of arithmetics leads to a hierarchy of probabilities introduced in Section 4. A hierarchical ordering relation, briefly discussed in Section 5, will allow us to unambiguously employ symbols such as < and >. A family of product rules, discussed in Section 6, is employed in the problem of hidden-variables representation of singlet-state probabilities in Section 7. We explain, in particular, that one encounters here three types of arithmetic levels in a single formula for joint probabilities: quantum, macroscopic, and hidden. Section 8 introduces some elements of hierarchical calculi, with special emphasis on non-Newtonian integration. We make here a digression on Rényi’s entropy, which is implicitly based on a generalized arithmetic but does not take advantage of the possibilities inherent in generalized calculus. Section 9 is devoted to local hidden-variable models of singlet-state probabilities constructed in terms of the generalized calculus. This seems to be the most controversial aspect of the formalism, as it clearly contradicts common wisdom about Bell’s theorem. Section 10 brings us to the intriguing role played in quantum mechanics by the geodesic distance in the projective space of quantum states. A typical discussion of the Fubini–Study metric is restricted in the literature to its geometric interpretation. Here, we reveal a previously unnoticed aspect: its role in the arithmetic structure of quantum states. It seems that $g(p)=\sin^2\frac{\pi}{2}p$ is a fundamental bijection that determines the arithmetic of the subquantum world. In Section 11, we give a simple argument explaining why the effective number of distinguishable probabilistic levels of the hierarchy is finite. We also point out a possible interpretation of the hierarchy of probabilities in terms of neural activation functions. At such a formal level, the only means of relating formal probabilities to experiment is via the laws of large numbers, discussed in Section 12. In Section 13, we return to the problem of Bell’s inequalities. We depart here a little from the formalism we developed in a series of earlier papers, where the same arithmetic was used at the hidden and the macroscopic levels. Our current understanding of the problem is that it is better to employ the freedom of combining different arithmetics simultaneously. We end the paper with remarks on open problems in Section 14, and a personal perspective is given in Section 15. Appendix A is devoted to certain technicalities which cannot be found in the literature.

2. Three Fundamental Lemmas

The hierarchical structure of (binary) probabilities is a consequence of the following three lemmas. They do not have a sufficiently nontrivial generalization beyond the binary case (cf. the discussion in [10]), hence the non-binary case has to be treated in terms of trees of conditional probabilities constructed in analogy to binary Huffman codes [11].
Lemma 1. 
$g:[0,1]\to[0,1]$ is a solution of the functional equation $g(p)+g(1-p)=1$ if and only if
$$g(p)=\frac{1}{2}+h\left(p-\frac{1}{2}\right), \tag{8}$$
where $h(-x)=-h(x)$, $h:[-1/2,1/2]\to[-1/2,1/2]$, i.e., $h$ is an arbitrary odd mapping of the closed interval into itself. Any such $g$ has a fixed point at $p=1/2$.
Lemma 2. 
Consider two functions $g_j:[0,1]\to[0,1]$, $j=1,2$, that satisfy the assumptions of Lemma 1,
$$g_j(p)=\frac{1}{2}+h_j\left(p-\frac{1}{2}\right), \tag{9}$$
where $h_j(-x)=-h_j(x)$. Then $g_{12}=g_1\circ g_2$ also satisfies Lemma 1 with $h_{12}=h_1\circ h_2$,
$$g_{12}(p)=\frac{1}{2}+h_{12}\left(p-\frac{1}{2}\right). \tag{10}$$
Accordingly,
$$g_{12}(p)+g_{12}(1-p)=1 \tag{11}$$
for any $p\in[0,1]$.
Lemma 3. 
Let $g^k=g\circ\dots\circ g$, $g^{-k}=g^{-1}\circ\dots\circ g^{-1}$ ($k$ times), $g^0(x)=x$. If $g$ satisfies Lemma 1,
$$g(p)=\frac{1}{2}+h\left(p-\frac{1}{2}\right), \tag{12}$$
then the $k$th iterate $g^k$ also satisfies Lemma 1 for any $k\in\mathbb{Z}$,
$$g^k(p)=\frac{1}{2}+h^k\left(p-\frac{1}{2}\right), \tag{13}$$
where $h^k$ is the $k$th iterate of $h$. Accordingly,
$$g^k(p)+g^k(1-p)=1 \tag{14}$$
for any $p\in[0,1]$, and any integer $k$. In particular
$$g^{-1}(p)+g^{-1}(1-p)=1. \tag{15}$$
The proofs can be found in [12,13].
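The statement of Lemma 3 is easy to probe numerically. The following is a minimal sketch, assuming the concrete bijection $g(p)=\sin^2\frac{\pi}{2}p$ used later in the paper; the helper names `g_iter` and `lemma3_defect` are hypothetical and introduced only for illustration.

```python
import math

def g(p):
    """Bijection of [0, 1] assumed in this sketch: g(p) = sin^2(pi p / 2)."""
    return math.sin(math.pi * p / 2) ** 2

def g_inv(p):
    """Inverse of g on [0, 1]; the clamp only guards against rounding."""
    p = min(1.0, max(0.0, p))
    return (2 / math.pi) * math.asin(math.sqrt(p))

def g_iter(p, k):
    """k-th iterate g^k(p) for any integer k (g^0 is the identity)."""
    step = g if k >= 0 else g_inv
    for _ in range(abs(k)):
        p = step(p)
    return p

def lemma3_defect(p, k):
    """Deviation of g^k(p) + g^k(1 - p) from 1."""
    return abs(g_iter(p, k) + g_iter(1 - p, k) - 1)

# Largest deviation on a grid of probabilities and iterates k = -4, ..., 4.
max_defect = max(
    lemma3_defect(i / 20, k) for i in range(21) for k in range(-4, 5)
)
print(max_defect)
```

The printed deviation should be at the level of floating-point rounding, which is a consistency check and not a substitute for the proofs.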
Armed with the lemmas we can construct a hierarchy of arithmetics, entailing a hierarchy of probabilities.

3. Hierarchy of Isomorphic Arithmetics

Assume that $g:[0,1]\to[0,1]$ occurring in the above three lemmas is a restriction of a bijection $g_{\mathbb{R}}:\mathbb{R}\to\mathbb{R}$, i.e., $g(x)=g_{\mathbb{R}}(x)$ for $x\in[0,1]$. It does not matter what the properties of $g_{\mathbb{R}}(x)$ are if $x\notin[0,1]$, except for the bijectivity of $g_{\mathbb{R}}$. Put differently, $g$ belongs to the equivalence class $[g_{\mathbb{R}}]$ of bijections whose restrictions to $[0,1]$ are identical. Following the notation of Lemma 3, we denote $g^k=g_{\mathbb{R}}\circ\dots\circ g_{\mathbb{R}}$, $g^{-k}=g_{\mathbb{R}}^{-1}\circ\dots\circ g_{\mathbb{R}}^{-1}$, $g^0(x)=x$. Now, let $x,y\in\mathbb{R}$. Define,
$$x\oplus_k y=g^k\big(g^{-k}(x)+g^{-k}(y)\big), \tag{16}$$
$$x\ominus_k y=g^k\big(g^{-k}(x)-g^{-k}(y)\big), \tag{17}$$
$$x\odot_k y=g^k\big(g^{-k}(x)\cdot g^{-k}(y)\big), \tag{18}$$
$$x\oslash_k y=g^k\big(g^{-k}(x)/g^{-k}(y)\big). \tag{19}$$
The arithmetic $\mathbb{R}_k$ is the set $\mathbb{R}$ equipped with the above four operations, i.e., $\mathbb{R}_k=\{\mathbb{R},\oplus_k,\ominus_k,\odot_k,\oslash_k\}$. The ordering relation is independent of $k$ if $g$ is increasing, which we therefore assume, hence $g^k(x)<g^k(y)$ if and only if $x<y$. The neutral elements of addition, $0_k=g^k(0)$, and multiplication, $1_k=g^k(1)$,
$$x\oplus_k 0_k=x\odot_k 1_k=x,\quad\text{for any }x, \tag{20}$$
can be regarded as bits, in principle applicable to some form of binary coding. Greater natural numbers are obtained by the $n$-times repeated addition of $1_k$,
$$n_k=\underbrace{1_k\oplus_k\dots\oplus_k 1_k}_{n\ \text{times}}=g^k(n), \tag{21}$$
$$n_k\oplus_k m_k=g^k(n+m)=(n+m)_k, \tag{22}$$
$$n_k\odot_k m_k=g^k(nm)=(nm)_k. \tag{23}$$
An $n$th power of $x$,
$$x^{n_k}=\underbrace{x\odot_k\dots\odot_k x}_{n\ \text{times}}, \tag{24}$$
satisfies
$$x^{n_k}\odot_k x^{m_k}=x^{(n+m)_k}=x^{n_k\oplus_k m_k}. \tag{25}$$
Rational numbers are those of the form
$$n_k\oslash_k m_k=g^k(n/m)=(n/m)_k,\quad n,m\in\mathbb{Z}. \tag{26}$$
The notion of rationality is arithmetic-dependent. Indeed, let $n/m$ be a rational number in the arithmetic $\mathbb{R}_0=\{\mathbb{R},+,-,\cdot,/\}$. Then, typically, $g^k(n/m)$, $k\neq 0$, is not a rational number in $\mathbb{R}_0$. Still, it is a rational number in the arithmetic $\mathbb{R}_k=\{\mathbb{R},\oplus_k,\ominus_k,\odot_k,\oslash_k\}$ in consequence of (26).
For any $k,l\in\mathbb{Z}$, the four arithmetic operations are related by
$$x\oplus_{k+l}y=g^l\big(g^{-l}(x)\oplus_k g^{-l}(y)\big)=g^k\big(g^{-k}(x)\oplus_l g^{-k}(y)\big), \tag{27}$$
$$x\ominus_{k+l}y=g^l\big(g^{-l}(x)\ominus_k g^{-l}(y)\big)=g^k\big(g^{-k}(x)\ominus_l g^{-k}(y)\big), \tag{28}$$
$$x\odot_{k+l}y=g^l\big(g^{-l}(x)\odot_k g^{-l}(y)\big)=g^k\big(g^{-k}(x)\odot_l g^{-k}(y)\big), \tag{29}$$
$$x\oslash_{k+l}y=g^l\big(g^{-l}(x)\oslash_k g^{-l}(y)\big)=g^k\big(g^{-k}(x)\oslash_l g^{-k}(y)\big). \tag{30}$$
The bijection $f^k=g^{-k}$ is an isomorphism of $\mathbb{R}_{k+l}$ and $\mathbb{R}_l$, for any $k,l\in\mathbb{Z}$,
$$f^k(x\oplus_{k+l}y)=f^k(x)\oplus_l f^k(y), \tag{31}$$
$$f^k(x\ominus_{k+l}y)=f^k(x)\ominus_l f^k(y), \tag{32}$$
$$f^k(x\odot_{k+l}y)=f^k(x)\odot_l f^k(y), \tag{33}$$
$$f^k(x\oslash_{k+l}y)=f^k(x)\oslash_l f^k(y). \tag{34}$$
The value $l=0$ is not privileged. The role of a 0th level can be played by any $l$. The notation where
$$\mathbb{R}_l=\{\mathbb{R},\oplus_l,\ominus_l,\odot_l,\oslash_l\}=\{\mathbb{R},+,-,\cdot,/\}, \tag{35}$$
is perfectly acceptable, hence any $\mathbb{R}_l$ can be regarded as “the” ordinary arithmetic we are taught at school. The latter statement is the content of the “arithmetic Copernican principle”, introduced in [13] and discussed further in [14]. In the present paper we nevertheless simplify notation and assume $\mathbb{R}_0=\{\mathbb{R},+,-,\cdot,/\}$. This is analogous to the usual habit of imposing initial conditions in Newtonian dynamics “at $t=0$” instead of a general $t=t_0$.
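The four operations and the isomorphism property can be sketched in a few lines of code. The paper only fixes $g$ on $[0,1]$, so the extension `g_real` below (a staircase of shifted copies of $\sin^2\frac{\pi}{2}x$) is purely an assumption made for illustration, as are all function names; any other increasing bijection of $\mathbb{R}$ with the same restriction would serve equally well.

```python
import math

def g_real(x):
    """Hypothetical increasing bijection of R that restricts to sin^2(pi x / 2) on [0, 1]."""
    n = math.floor(x)
    return n + math.sin(math.pi * (x - n) / 2) ** 2

def g_real_inv(y):
    """Inverse of g_real; the clamp only guards against rounding."""
    n = math.floor(y)
    frac = min(1.0, max(0.0, y - n))
    return n + (2 / math.pi) * math.asin(math.sqrt(frac))

def g_iter(x, k):
    """k-th iterate of g_real for any integer k."""
    step = g_real if k >= 0 else g_real_inv
    for _ in range(abs(k)):
        x = step(x)
    return x

# The four operations of the arithmetic R_k.
def oplus(x, y, k):
    return g_iter(g_iter(x, -k) + g_iter(y, -k), k)

def ominus(x, y, k):
    return g_iter(g_iter(x, -k) - g_iter(y, -k), k)

def odot(x, y, k):
    return g_iter(g_iter(x, -k) * g_iter(y, -k), k)

def oslash(x, y, k):
    return g_iter(g_iter(x, -k) / g_iter(y, -k), k)

# Isomorphism check: f^k maps addition in R_{k+l} to addition in R_l, with f^k = g^{-k}.
x, y, k, l = 0.3, 0.2, 1, 1
lhs = g_iter(oplus(x, y, k + l), -k)
rhs = oplus(g_iter(x, -k), g_iter(y, -k), l)
print(lhs, rhs)

# Ordinary and level-1 sums of the same two numbers differ.
print(x + y, oplus(x, y, 1))
```

With this particular extension the integers are fixed points of `g_real`, so $n_k=n$ for every $k$; this is a feature of the chosen extension and not of the general formalism.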
The hierarchy of arithmetics leads to the hierarchy of probabilities.

4. Hierarchy of Probabilities

Let $g(1)=1$, so that $1_k=g^k(1)=1$ and $0_k=g^k(0)=0$, for any $k$. Now, let $p$, $q$, $p+q=1$, be probabilities. Assuming that $g$ satisfies the assumptions of Lemma 1, we find (in consequence of Lemmas 2 and 3, and $g^k(1)=1$ for any $k\in\mathbb{Z}$)
$$p+q=1, \tag{36}$$
$$g^k(p)+g^k(q)=1, \tag{37}$$
$$p\oplus_k q=g^k\big(g^{-k}(p)+g^{-k}(q)\big)=1, \tag{38}$$
for any $k\in\mathbb{Z}$. The Copernican aspect is visible at the level of probabilities as well, if we define $P=g^k(p)$, $Q=g^k(q)$, so that
$$g^{-k}(P)+g^{-k}(Q)=1, \tag{39}$$
$$P+Q=1, \tag{40}$$
$$P\oplus_k Q=g^k\big(g^{-k}(P)+g^{-k}(Q)\big)=1, \tag{41}$$
for any $k\in\mathbb{Z}$. Indeed, how to distinguish between (36)–(38) and (39)–(41), if we bear in mind that $k$ can be positive, negative, or zero, and the formulas are true for all $k$? How to distinguish between the two levels if in both cases we find $p+q=1$ and $P+Q=1$? Which of the probabilities, $p$ or $P$, is the one we measure in experiment? Which iterate, $k$, $0$, or $-k$, is the one that defines the probabilities we experimentally determine in terms of frequencies of successes? Which natural numbers, $n_k$, $n=n_0$, or $n_{-k}$, are the ones we use to define numbers of trials and successes?
Formula (38) shows that probabilities $p$ and $q$ sum to 1 in infinitely many ways, corresponding to infinitely many values of $k$ in $\oplus_k$. Formula (37) shows that probabilities $p$ and $q$ generate infinitely many probabilities $p_k=g^k(p)$ and $q_k=g^k(q)$ that sum to 1 by means of the same addition $+=\oplus_0$. The Arithmetic Copernican Principle is a relativity principle which states that any value of $k$ can correspond to the arithmetic and probability that we regard as “the human and experimental one”.
Still, this is not the end of the story. Replacing in (37) $k$ by $k-l$,
$$g^{k-l}(p)+g^{k-l}(q)=1, \tag{42}$$
and acting on both sides with $g^l$, we find
$$g^l\big(g^{k-l}(p)+g^{k-l}(q)\big)=g^k(p)\oplus_l g^k(q)=1, \tag{43}$$
for any $k,l\in\mathbb{Z}$. The resulting wealth of available probability models implied by a single bijection $g$ is truly overwhelming, yet ignored by those who study quantum probabilities and the hidden variables problem.
Let us now consider the concrete case of the equivalence class of a function $g_{\mathbb{R}}$ whose restriction to $[0,1]$ is given by $g(x)=\sin^2\frac{\pi}{2}x$. Then,
$$h(x)=g\left(x+\frac{1}{2}\right)-\frac{1}{2}=\frac{1}{2}\sin\pi x,\quad -\frac{1}{2}\le x\le\frac{1}{2}, \tag{44}$$
$$g(p)=\frac{1}{2}+h\left(p-\frac{1}{2}\right)=\frac{1}{2}+\frac{1}{2}\sin\pi\left(p-\frac{1}{2}\right),\quad 0\le p\le 1. \tag{45}$$
Let $p=(\pi-\theta)/\pi$ be the probability of finding a point belonging to the overlap of two half-circles rotated by $\theta\in[0,\pi]$. Then, for $k=1$, $q=\theta/\pi$,
$$P=g(p)=g^k(p)=\sin^2\left(\frac{\pi}{2}\,\frac{\pi-\theta}{\pi}\right)=\cos^2\frac{\theta}{2}, \tag{46}$$
$$Q=g(q)=g^k(q)=\sin^2\left(\frac{\pi}{2}\,\frac{\theta}{\pi}\right)=\sin^2\frac{\theta}{2}, \tag{47}$$
in which we recognize the conditional probabilities for two successive measurements of spin-1/2 in two Stern–Gerlach devices placed one after another, with relative angle $\theta$.
By Lemma 3, we have in fact much more, because $k=1$ can be replaced by any integer. For example, the second iterate
$$P=g^2(p)=g\big(g(p)\big)=\sin^2\left(\frac{\pi}{2}\cos^2\frac{\theta}{2}\right), \tag{48}$$
satisfies $g^2(p)+g^2(q)=1$, of course, as can be proved by a straightforward but instructive calculation [14]. The minus-first iterate,
$$P=g^{-1}(p)=\frac{2}{\pi}\arcsin\sqrt{p}=\frac{2}{\pi}\arcsin\sqrt{\frac{\pi-\theta}{\pi}}, \tag{49}$$
satisfies $g^{-1}(p)+g^{-1}(q)=1$, and so on.
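The behavior of the iterates can be tabulated directly. The sketch below evaluates $g^k(p)$ and $g^k(q)$ for the half-circle probabilities $p=(\pi-\theta)/\pi$, $q=\theta/\pi$ at several levels $k$; the function name `level_probabilities` and the sample angle $\theta=\pi/3$ are illustrative assumptions.

```python
import math

def g(p):
    """g(p) = sin^2(pi p / 2) on [0, 1]."""
    return math.sin(math.pi * p / 2) ** 2

def g_inv(p):
    """Inverse of g; the clamp only guards against rounding."""
    p = min(1.0, max(0.0, p))
    return (2 / math.pi) * math.asin(math.sqrt(p))

def g_iter(p, k):
    """k-th iterate g^k(p) for any integer k."""
    step = g if k >= 0 else g_inv
    for _ in range(abs(k)):
        p = step(p)
    return p

def level_probabilities(theta, k):
    """Return (g^k(p), g^k(q)) for p = (pi - theta)/pi and q = theta/pi."""
    p = (math.pi - theta) / math.pi
    q = theta / math.pi
    return g_iter(p, k), g_iter(q, k)

theta = math.pi / 3  # sample relative angle
for k in (-8, -1, 0, 1, 2, 6):
    P, Q = level_probabilities(theta, k)
    print(f"k={k:+d}  P={P:.6f}  Q={Q:.6f}  P+Q={P + Q:.6f}")
```

At $k=1$ the pair reproduces $\cos^2(\theta/2)$ and $\sin^2(\theta/2)$. For large negative $k$ both values drift toward $1/2$, and for large positive $k$ they approach a step-like 0/1 response, which is the behavior the text compares to white noise and to activation functions.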
Clearly, we have absolutely no criterion that could indicate which level of the hierarchy is the one we regard as our human one, a fact that justifies the adjective “Copernican”. For example, rewriting (49) as
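$$P=g^{1-2}(p)=g^1\big(g^{-2}(p)\big)=g^1\big(1-g^{-2}(q)\big)=g^1\left(1-\frac{\alpha}{\pi}\right)=\cos^2\frac{\alpha}{2}, \tag{50}$$
we find the relation between the two parameters, $\alpha$ and $\theta$, corresponding to the two levels of the hierarchy (see Figure 1),
$$\alpha(\theta)=\pi g^{-2}(q)=2\arcsin\sqrt{\frac{2}{\pi}\arcsin\sqrt{\frac{\theta}{\pi}}}. \tag{51}$$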
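The relation between the two angular parameters can be checked numerically. This sketch computes $\alpha(\theta)=\pi g^{-2}(\theta/\pi)$ in two ways and verifies that $\cos^2(\alpha/2)$ coincides with the minus-first iterate $g^{-1}(p)$; the function names are hypothetical.

```python
import math

def g_inv(p):
    """Inverse of g(p) = sin^2(pi p / 2) on [0, 1]."""
    p = min(1.0, max(0.0, p))
    return (2 / math.pi) * math.asin(math.sqrt(p))

def alpha(theta):
    """alpha(theta) = pi * g^{-2}(theta / pi), for theta in [0, pi]."""
    return math.pi * g_inv(g_inv(theta / math.pi))

def alpha_closed_form(theta):
    """The same relation written with nested arcsines and square roots."""
    inner = (2 / math.pi) * math.asin(math.sqrt(theta / math.pi))
    return 2 * math.asin(math.sqrt(inner))

for theta in (0.25, 1.0, math.pi / 2, 2.5):
    p = (math.pi - theta) / math.pi
    print(theta, alpha(theta), math.cos(alpha(theta) / 2) ** 2, g_inv(p))
```

The last two printed columns should agree, and $\theta=\pi/2$ is mapped to $\alpha=\pi/2$ because $1/2$ is a fixed point of $g$.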
The usual tests of classicality and quantumness are based on inequalities. However, in order to discuss an inequality we have to control ordering relations such as ≤ and ≥. Fortunately, with our assumptions about g the problem is trivial.

5. Hierarchical Ordering Relation

We assume that the bijection $g$ is strictly increasing, i.e., $x<y$ if and only if $g(x)<g(y)$. A composition of two strictly increasing functions is strictly increasing, hence $x<y$ implies $g^k(x)\ominus_l g^k(y)<0_l=0$ for any $k,l\in\mathbb{Z}$. The latter leads to a unique ordering relation at the level of the entire hierarchy of arithmetics. This is why it is safe to use the symbols <, >, ≤, ≥ at any level of the hierarchy.
So far, we have restricted our analysis to binary events. An extension to higher dimensional problems needs the notion of a product rule.

6. Hierarchical Product Rules

The standard product rule states that the probability of a sequence of two events, first $a_1$ then $a_2$, is given by the product of the prior $p(a_1)$ (a probability of the condition) with the posterior $p(a_2|a_1)$ (a conditional probability of $a_2$ under the condition that $a_1$ has happened). The sums of binary probabilities,
$$g^{k_1}\big(p(0)\big)\oplus_l g^{k_1}\big(p(1)\big)=1,\quad\text{for any }k_1,l\in\mathbb{Z}, \tag{52}$$
$$g^{k_2}\big(p(0|a_1)\big)\oplus_l g^{k_2}\big(p(1|a_1)\big)=1,\quad\text{for any }k_2,l\in\mathbb{Z}, \tag{53}$$
as implied by the lemmas, are naturally related to
$$g^{k_2}\big(p(a_2|a_1)\big)\odot_l g^{k_1}\big(p(a_1)\big),\quad\text{for any }k_1,k_2,l\in\mathbb{Z}, \tag{54}$$
because
$$\mathop{{\bigoplus}_{l}}_{a_1,a_2}\, g^{k_2}\big(p(a_2|a_1)\big)\odot_l g^{k_1}\big(p(a_1)\big)=1,\quad\text{for any }k_1,k_2,l\in\mathbb{Z}. \tag{55}$$
A sequence of results, $a_n,a_{n-1},\dots,a_1$, implies their joint probability
$$g^{k_n}\big(p(a_n|a_{n-1}\dots a_1)\big)\odot_l\dots\odot_l g^{k_2}\big(p(a_2|a_1)\big)\odot_l g^{k_1}\big(p(a_1)\big) \tag{56}$$
normalized by
$$\mathop{{\bigoplus}_{l}}_{a_1\dots a_n}\, g^{k_n}\big(p(a_n|a_{n-1}\dots a_1)\big)\odot_l\dots\odot_l g^{k_2}\big(p(a_2|a_1)\big)\odot_l g^{k_1}\big(p(a_1)\big)=g^l(1)=1. \tag{57}$$
In particular, for $l=0$,
$$g^{k_1}\big(p(0)\big)+g^{k_1}\big(p(1)\big)=1,\quad\text{for any }k_1\in\mathbb{Z}, \tag{58}$$
$$g^{k_2}\big(p(0|a_1)\big)+g^{k_2}\big(p(1|a_1)\big)=1,\quad\text{for any }k_2\in\mathbb{Z}, \tag{59}$$
and
$$\sum_{a_1,a_2}g^{k_2}\big(p(a_2|a_1)\big)\,g^{k_1}\big(p(a_1)\big)=1,\quad\text{for any }k_1,k_2\in\mathbb{Z}. \tag{60}$$
At the other extreme is the case of $l=k_1=k_2=k$,
$$g^k\big(p(a_2|a_1)\big)\odot_k g^k\big(p(a_1)\big)=g^k\big(p(a_2|a_1)\,p(a_1)\big), \tag{61}$$
with normalization
$$\mathop{{\bigoplus}_{k}}_{a_1,a_2}\, g^k\big(p(a_2|a_1)\,p(a_1)\big)=g^k\Big(\sum_{a_1,a_2}p(a_2|a_1)\,p(a_1)\Big)=1,\quad\text{for any }k\in\mathbb{Z}. \tag{62}$$
It is striking that in formulas such as (56) each of the k-indices can be in principle different. In effect, (56) may be regarded as a component of a tensor.
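The normalization of the hierarchical product rule for two binary events can be verified with independent indices $k_1$, $k_2$, $l$. The sketch below assumes $g(p)=\sin^2\frac{\pi}{2}p$ and an arbitrary, made-up pair of prior and conditional distributions; the names `odot`, `oplus_sum`, and `total_probability` are hypothetical.

```python
import math

def g(p):
    return math.sin(math.pi * p / 2) ** 2

def g_inv(p):
    p = min(1.0, max(0.0, p))  # guard against rounding just outside [0, 1]
    return (2 / math.pi) * math.asin(math.sqrt(p))

def g_iter(p, k):
    step = g if k >= 0 else g_inv
    for _ in range(abs(k)):
        p = step(p)
    return p

def odot(x, y, l):
    """Level-l product of two numbers in [0, 1]."""
    return g_iter(g_iter(x, -l) * g_iter(y, -l), l)

def oplus_sum(values, l):
    """Level-l sum of a list of numbers."""
    return g_iter(sum(g_iter(v, -l) for v in values), l)

# Illustrative (made-up) binary prior and conditional probabilities.
prior = {0: 0.3, 1: 0.7}                            # p(a1)
cond = {0: {0: 0.2, 1: 0.8}, 1: {0: 0.6, 1: 0.4}}   # cond[a1][a2] = p(a2|a1)

def total_probability(k1, k2, l):
    """Level-l sum over a1, a2 of g^{k2}(p(a2|a1)) (.)_l g^{k1}(p(a1))."""
    terms = [
        odot(g_iter(cond[a1][a2], k2), g_iter(prior[a1], k1), l)
        for a1 in (0, 1)
        for a2 in (0, 1)
    ]
    return oplus_sum(terms, l)

# Negative l is equally valid in exact arithmetic, but g^{-1} is
# ill-conditioned near 1, so only l >= 0 is sampled here.
for k1, k2, l in [(1, 1, 0), (2, -1, 0), (-1, 2, 1), (1, 1, 1), (0, 3, 2)]:
    print(k1, k2, l, total_probability(k1, k2, l))
```

Every printed total should equal 1 up to rounding, independently of how the three indices are chosen.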
A truly nontrivial application of generalized product rules occurs in the problem of singlet-state probabilities, quantum entangled states, and Bell’s theorem.

7. Singlet-State Probabilities

Singlet-state probabilities occur in experiments where two parties (“Alice” and “Bob”) are macroscopically separated, but the measurements they perform are the quantum ones. Such probabilities naturally occur in the context of the hierarchical product rule. Indeed, consider the following probabilities,
$$p(0)=p(1)=\frac{1}{2}, \tag{63}$$
$$p(0|0)=p(1|1)=\frac{\theta}{\pi}, \tag{64}$$
$$p(1|0)=p(0|1)=\frac{\pi-\theta}{\pi}, \tag{65}$$
whose geometric interpretation is evident. As the bijection, take the one occurring in (45)–(47). Then,
$$g\big(p(0|0)\big)\,g\big(p(0)\big)=g\big(p(1|1)\big)\,g\big(p(1)\big)=\frac{1}{2}\sin^2\frac{\theta}{2}, \tag{66}$$
$$g\big(p(1|0)\big)\,g\big(p(0)\big)=g\big(p(0|1)\big)\,g\big(p(1)\big)=\frac{1}{2}\cos^2\frac{\theta}{2}, \tag{67}$$
are the probabilities typical of the singlet state. Let us note that we have employed the product rule,
$$g^{k_2}\big(p(a|b)\big)\odot_l g^{k_1}\big(p(b)\big)=g^{1}\big(p(a|b)\big)\odot_0 g^{k_1}\big(p(b)\big), \tag{68}$$
with $k_2\neq l$. The index $k_1$ can be arbitrary because $g(1/2)=1/2=g^{k_1}(1/2)$ for any $g$ that satisfies Lemma 1. For simplicity, we set $k_1=1$. Now, the joint probability can be interpreted as follows:
$$P(a,b)=\underbrace{g\big(\underbrace{p(a|b)}_{\text{hidden}}\big)}_{\text{quantum}}\ \underbrace{\odot_0}_{\text{macroscopic}}\ \underbrace{g\big(\underbrace{p(b)}_{\text{hidden}}\big)}_{\text{quantum}}. \tag{69}$$
Let us further note that we could have started with the following:
$$g^k\big(p(0)\big)=g^k\big(p(1)\big)=\frac{1}{2}, \tag{70}$$
$$g^k\big(p(0|0)\big)=g^k\big(p(1|1)\big)=\frac{\theta}{\pi}, \tag{71}$$
$$g^k\big(p(0|1)\big)=g^k\big(p(1|0)\big)=\frac{\pi-\theta}{\pi}. \tag{72}$$
Then, $g^{k+1}\big(p(a_2|a_1)\big)\,g^{k+1}\big(p(a_1)\big)$ would be the singlet-state probabilities.
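The three-level structure of the singlet-state formula can be made concrete with a short computation: hidden probabilities are mapped by $g$ to the quantum level and then multiplied with the ordinary (macroscopic) product. The sketch assumes $k_1=k_2=1$, $l=0$, as in the text; the function name `singlet_joint` is hypothetical.

```python
import math

def g(p):
    """Quantum-level image of a hidden probability: g(p) = sin^2(pi p / 2)."""
    return math.sin(math.pi * p / 2) ** 2

def singlet_joint(theta):
    """Joint probabilities P(a, b) = g(p(a|b)) * g(p(b)) for a, b in {0, 1}."""
    prior = {0: 0.5, 1: 0.5}                 # hidden p(b)
    same = theta / math.pi                   # hidden p(0|0) = p(1|1)
    diff = (math.pi - theta) / math.pi       # hidden p(1|0) = p(0|1)
    cond = {(0, 0): same, (1, 1): same, (1, 0): diff, (0, 1): diff}
    # Ordinary multiplication plays the role of the macroscopic product.
    return {ab: g(cond[ab]) * g(prior[ab[1]]) for ab in cond}

theta = 2 * math.pi / 5  # sample angle
joint = singlet_joint(theta)
for ab, value in sorted(joint.items()):
    print(ab, value)
print("total:", sum(joint.values()))
```

The equal-outcome entries reproduce $\frac{1}{2}\sin^2(\theta/2)$, the opposite-outcome entries $\frac{1}{2}\cos^2(\theta/2)$, and the four values sum to 1 with ordinary addition.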
One concludes that the notion of a quantum level is a relative one. In fact, any level is quantum, and any level is hidden; moreover, any l can play the role of the macroscopic arithmetic. What counts is the neighboring location in the hierarchy. The so-called violation of Bell’s inequality is an inconsistency that occurs if we apply the arithmetic of a hidden level to calculations performed at the neighboring quantum one. An analogous inconsistency that occurs between non-neighboring levels leads to violations beyond the Tsirelson bound [13,15].
In order to perform calculations at different levels of the hierarchy, we have to understand what the consequences are of the hierarchical structure of arithmetics for the resulting hierarchy of calculi.

8. Hierarchy of Calculi

A hierarchy of arithmetics leads to a hierarchy of “non-Newtonian” calculi [16,17,18,19,20,21]. Here, functions such as $A:\mathbb{R}\to\mathbb{R}$ have to be treated as mappings between arithmetics and not between sets, hence it is more appropriate to write
$$A_{lk}:\mathbb{R}_k\to\mathbb{R}_l, \tag{73}$$
with some $k,l\in\mathbb{Z}$. Otherwise the notions of derivative and integral are ambiguous. The derivative of $A_{lk}$ is
$$\frac{D_lA_{lk}(x)}{D_kx}=\lim_{\delta\to 0}\big(A_{lk}(x\oplus_k\delta_k)\ominus_l A_{lk}(x)\big)\oslash_l\delta_l. \tag{74}$$
As before, $\delta_k=g^k(\delta)$, $\delta_l=g^l(\delta)$. The derivative is $\mathbb{R}_l$-linear and satisfies an appropriate Leibniz rule,
$$\frac{D_l\big(A_{lk}(x)\oplus_l B_{lk}(x)\big)}{D_kx}=\frac{D_lA_{lk}(x)}{D_kx}\oplus_l\frac{D_lB_{lk}(x)}{D_kx}, \tag{75}$$
$$\frac{D_l\big(A_{lk}(x)\odot_l B_{lk}(x)\big)}{D_kx}=\frac{D_lA_{lk}(x)}{D_kx}\odot_l B_{lk}(x)\oplus_l A_{lk}(x)\odot_l\frac{D_lB_{lk}(x)}{D_kx}. \tag{76}$$
Integration of $A_{lk}:\mathbb{R}_k\to\mathbb{R}_l$ is defined in a way that guarantees the two fundamental theorems of calculus (under standard assumptions about differentiability and continuity):
$$\int_a^b\frac{D_lA_{lk}(x)}{D_kx}D_kx=A_{lk}(b)\ominus_l A_{lk}(a), \tag{77}$$
$$\frac{D_l}{D_kx}\int_a^xA_{lk}(y)D_ky=A_{lk}(x). \tag{78}$$
The formulas become less abstract if one considers the following commutative diagram ($f=g^{-1}$)
$$\begin{array}{ccc}\mathbb{R}_k&\xrightarrow{\ A_{lk}\ }&\mathbb{R}_l\\ f^k\Big\downarrow&&\Big\uparrow g^l\\ \mathbb{R}_0&\xrightarrow{\ A_{00}\ }&\mathbb{R}_0\\ g^n\Big\downarrow&&\Big\uparrow f^m\\ \mathbb{R}_n&\xrightarrow{\ A_{mn}\ }&\mathbb{R}_m\end{array} \tag{79}$$
leading to a very simple and useful form of the derivative (74),
$$\frac{D_lA_{lk}(x)}{D_kx}=g^l\left(\frac{dA_{00}\big(f^k(x)\big)}{df^k(x)}\right), \tag{80}$$
while the integral reads,
$$\int_a^bA_{lk}(x)D_kx=g^l\left(\int_{f^k(a)}^{f^k(b)}A_{00}(r)\,dr\right). \tag{81}$$
Here, $dr$ denotes the usual (Riemann, Lebesgue, etc.) integral in $\mathbb{R}_0$. Formula (80) is derived under the assumption that $g:\mathbb{R}\to\mathbb{R}$ is continuous (in the usual meaning of the term employed in ordinary “Newtonian” real analysis), which, however, is automatically guaranteed because $g$ is a strictly increasing bijection of $\mathbb{R}$. Importantly, neither $g$ nor its inverse $f$ has to be differentiable in the standard Newtonian sense. The latter makes an important difference with respect to ordinary differential geometry, where functions such as $g(x)=x^{1/3}$ would be excluded as non-differentiable at $x=0$. In the non-Newtonian formalism, any bijection $g$, as well as its inverse $f$, is automatically smooth with respect to the non-Newtonian differentiation defined by the same $g$. Various explicit examples can be found in [22,23].
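The reduction of non-Newtonian derivatives and integrals to ordinary ones can be illustrated numerically. The sketch below takes the cube root $g(x)=x^{1/3}$ mentioned in the text as the bijection, pulls a map back to its Newtonian image $A_{00}=f^l\circ A_{lk}\circ g^k$, and uses a central difference and a midpoint rule as stand-ins for the ordinary derivative and integral. All function names, step sizes, and the sample map $x\mapsto x^2$ are assumptions made for illustration only.

```python
import math

def g(x):
    """Bijection chosen for this sketch: the real cube root."""
    return math.copysign(abs(x) ** (1.0 / 3.0), x)

def f(x):
    """Inverse bijection f = g^{-1}."""
    return x ** 3

def iterate(x, k):
    """g^k(x) for k >= 0 and f^{|k|}(x) for k < 0."""
    step = g if k >= 0 else f
    for _ in range(abs(k)):
        x = step(x)
    return x

def newtonian_image(A, k, l):
    """A_00 = f^l o A_lk o g^k, read off the commutative diagram."""
    return lambda r: iterate(A(iterate(r, k)), -l)

def nn_derivative(A, x, k, l, h=1e-6):
    """Non-Newtonian derivative: g^l applied to the slope of A_00 at f^k(x)."""
    A00 = newtonian_image(A, k, l)
    r = iterate(x, -k)
    slope = (A00(r + h) - A00(r - h)) / (2 * h)
    return iterate(slope, l)

def nn_integral(A, a, b, k, l, steps=2000):
    """Non-Newtonian integral: g^l applied to a midpoint-rule integral of A_00."""
    A00 = newtonian_image(A, k, l)
    ra, rb = iterate(a, -k), iterate(b, -k)
    dr = (rb - ra) / steps
    total = sum(A00(ra + (i + 0.5) * dr) for i in range(steps)) * dr
    return iterate(total, l)

# g itself, regarded as a map R_0 -> R_1, has a finite derivative at x = 0.
print(nn_derivative(g, 0.0, 0, 1))

# Fundamental theorem check for A(x) = x^2 regarded as a map R_1 -> R_1.
def square(x):
    return x * x

def d_square(x):
    return nn_derivative(square, x, 1, 1)

lhs = nn_integral(d_square, 0.5, 1.2, 1, 1)
rhs = g(f(square(1.2)) - f(square(0.5)))  # A(b) (-)_1 A(a)
print(lhs, rhs)
```

The first printed value is close to 1, in contrast with the divergent Newtonian slope of the cube root at the origin, and the last two values agree up to discretization error.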
Linearity of the integral must be understood in the sense of $\mathbb{R}_l$,
$$\int_a^b\big(A_{lk}(x)\oplus_l B_{lk}(x)\big)D_kx=\int_a^bA_{lk}(x)D_kx\oplus_l\int_a^bB_{lk}(x)D_kx, \tag{82}$$
$$\int_a^bA_l\odot_l B_{lk}(x)D_kx=A_l\odot_l\int_a^bB_{lk}(x)D_kx,\quad\text{for a constant }A_l\in\mathbb{R}_l, \tag{83}$$
a property of fundamental importance for Bell-type inequalities [13]. An analogous form of generalized linearity of integrals occurs in fuzzy calculus [24,25,26,27,28].
Diagram (79) implies
$$A_{lk}=g^l\circ A_{00}\circ f^k=g^{l-m}\circ g^m\circ A_{00}\circ f^n\circ f^{k-n}=g^{l-m}\circ A_{mn}\circ f^{k-n}, \tag{84}$$
which leads to a new type of a chain rule, relating derivatives and integrals at different levels of the hierarchy,
$$\frac{D_lA_{lk}(x)}{D_kx}=g^{l-m}\left(\frac{D_mA_{mn}\big(f^{k-n}(x)\big)}{D_nf^{k-n}(x)}\right), \tag{85}$$
$$\int_a^bA_{lk}(x)D_kx=g^{l-m}\left(\int_{f^{k-n}(a)}^{f^{k-n}(b)}A_{mn}(x)D_nx\right). \tag{86}$$
Formulas (85) and (86) do not seem to appear in the literature, so we prove them in Appendix A.

Digression: Logarithm and Rényi Entropies

The exponential function is defined by the differential equation,
$$\frac{D_l\exp_{lk}(x)}{D_kx}=g^l\left(\frac{d\exp_{00}\big(f^k(x)\big)}{df^k(x)}\right)=\exp_{lk}(x)=g^l\Big(\exp_{00}\big(f^k(x)\big)\Big), \tag{87}$$
$$\exp_{lk}(0_k)=1_l. \tag{88}$$
The solution is given by $\exp_{00}(x)=e^x$ and satisfies
$$\exp_{lk}(x\oplus_k y)=\exp_{lk}(x)\odot_l\exp_{lk}(y). \tag{89}$$
The inverse is given by
$$\ln_{kl}(x)=g^k\Big(\ln_{00}\big(f^l(x)\big)\Big), \tag{90}$$
where $\ln_{00}(x)=\ln x$, and
$$\ln_{kl}(x\odot_l y)=\ln_{kl}(x)\oplus_k\ln_{kl}(y). \tag{91}$$
Now, consider $\phi_\alpha(x)=e^{(1-\alpha)x}$, $\phi_\alpha^{-1}(x)=\frac{1}{1-\alpha}\ln x$. Rényi introduced his $\alpha$-entropy as a Kolmogorov–Nagumo average [29,30,31,32,33,34,35,36] of the Shannon amount of information [37] (we prefer the natural logarithm to the original $\log_2$ from [33], but this is just a choice of units of information),
$$S_\alpha=\phi_\alpha^{-1}\Big(\sum_p p\,\phi_\alpha(-\ln p)\Big)=\frac{1}{1-\alpha}\ln\sum_p p^\alpha. \tag{92}$$
It is clear that (92) can be expressed in several different ways by means of generalized arithmetics. For example,
$$\ominus_1\ln_{1,0}(x)=g^1\big(-\ln f^0(x)\big)=g(-\ln x), \tag{93}$$
has the same functional form as $\phi_\alpha(-\ln p)$. Alternatively, defining
$$x\oplus y=\phi_\alpha^{-1}\big(\phi_\alpha(x)+\phi_\alpha(y)\big), \tag{94}$$
$$x\odot y=\phi_\alpha^{-1}\big(\phi_\alpha(x)\,\phi_\alpha(y)\big), \tag{95}$$
and $\phi_\alpha^{-1}(p)=P$, we find
$$S_\alpha=\phi_\alpha^{-1}\Big(\sum_P\phi_\alpha(P)\,\phi_\alpha\big(-\ln\phi_\alpha(P)\big)\Big)=\mathop{\bigoplus}_P P\odot\ln\big(1/\phi_\alpha(P)\big). \tag{96}$$
Rényi’s choice of $\phi_\alpha(x)=e^{(1-\alpha)x}$ was dictated by the assumed additivity of entropy for independent (i.e., uncorrelated) systems. Our general formalism suggests various hierarchical generalizations of the notion of entropy, automatically inheriting the additivity properties from the arithmetics involved. Some examples can be found in [10].
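The Kolmogorov–Nagumo form of the Rényi entropy and its additivity for independent systems can be checked with a few lines of code. This is a sketch for $\alpha\neq 1$ with made-up sample distributions; the function names are hypothetical.

```python
import math

def phi(x, alpha):
    """phi_alpha(x) = exp((1 - alpha) x)."""
    return math.exp((1 - alpha) * x)

def phi_inv(y, alpha):
    """Inverse of phi_alpha (alpha != 1)."""
    return math.log(y) / (1 - alpha)

def renyi_kn(probs, alpha):
    """Renyi entropy as a Kolmogorov-Nagumo average of -ln p."""
    mean = sum(p * phi(-math.log(p), alpha) for p in probs if p > 0)
    return phi_inv(mean, alpha)

def renyi_direct(probs, alpha):
    """Closed form: ln(sum of p^alpha) / (1 - alpha)."""
    return math.log(sum(p ** alpha for p in probs if p > 0)) / (1 - alpha)

def product_distribution(ps, qs):
    """Joint distribution of two independent systems."""
    return [p * q for p in ps for q in qs]

ps = [0.5, 0.3, 0.2]  # illustrative distributions
qs = [0.9, 0.1]
alpha = 2.0

print(renyi_kn(ps, alpha), renyi_direct(ps, alpha))
print(renyi_kn(product_distribution(ps, qs), alpha),
      renyi_kn(ps, alpha) + renyi_kn(qs, alpha))
```

Each printed pair should agree: the first confirms that the two expressions for $S_\alpha$ coincide, the second confirms additivity for the product distribution.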

9. Application: Local Hidden-Variable Models Based on Non-Newtonian Integration

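Consider an integral representation of the standard $\mathbb{R}_0$-valued probability, with probability densities $\rho_{00}$ and characteristic functions
$$\chi_{\varphi,00}(\lambda)=\begin{cases}1&\text{if }\lambda\in[\varphi-\pi/2,\varphi+\pi/2]\\ 0&\text{if }\lambda\notin[\varphi-\pi/2,\varphi+\pi/2]\end{cases} \tag{97}$$
treated as mappings $\mathbb{R}_0\to\mathbb{R}_0$. For example, setting $\theta=\alpha-\beta$ in (63) and (64) one can express the probabilities in integral forms,
$$\frac{1}{2}=\int\chi_{\alpha,00}(\lambda)\rho_{00}(\lambda)\,d\lambda=\frac{1}{2\pi}\int_{\alpha-\pi/2}^{\alpha+\pi/2}d\lambda, \tag{98}$$
$$\frac{1}{2}\,\frac{\alpha-\beta}{\pi}=\int\chi_{\alpha,00}(\lambda)\chi_{\beta+\pi,00}(\lambda)\rho_{00}(\lambda)\,d\lambda=\frac{1}{2\pi}\int_{\beta+\pi/2}^{\alpha+\pi/2}d\lambda,\quad\beta\le\alpha. \tag{99}$$
$\chi_{\varphi,00}(\lambda)$ is the characteristic function of the half-circle located symmetrically with respect to the angle $\varphi$; $\rho_{00}(\lambda)=1/(2\pi)$ is the uniform probability density on the circle. Formula (99) is local in the sense of Bell [38] and Clauser and Horne [39], because of the product structure of the term
$$\chi_{\alpha,00}(\lambda)\chi_{\beta+\pi,00}(\lambda)=\chi_{\alpha,00}(\lambda)\odot_0\chi_{\beta+\pi,00}(\lambda). \tag{100}$$
The case $k=l=0$ of Bayes law discussed in Section 6 is (with $\theta=\alpha-\beta$)
$$\frac{\alpha-\beta}{\pi}=\frac{\int\chi_{\beta+\pi,00}(\lambda)\chi_{\alpha,00}(\lambda)\rho_{00}(\lambda)\,d\lambda}{\int\chi_{\alpha,00}(\lambda)\rho_{00}(\lambda)\,d\lambda}=\frac{p(0_2,0_1)}{p(0_1)}=\frac{p(1_2,1_1)}{p(1_1)} \tag{101}$$
$$=\int\chi_{\beta+\pi,00}(\lambda)\,\frac{\chi_{\alpha,00}(\lambda)\rho_{00}(\lambda)}{\int\chi_{\alpha,00}(\lambda)\rho_{00}(\lambda)\,d\lambda}\,d\lambda, \tag{102}$$
which is equivalent to the assumption that the first measurement reduces the probability density according to
$$\rho_{00}(\lambda)\mapsto\frac{\chi_{\alpha,00}(\lambda)\rho_{00}(\lambda)}{\int\chi_{\alpha,00}(\lambda)\rho_{00}(\lambda)\,d\lambda}. \tag{103}$$
Equation (103) is an example of a classical projection postulate in theories based on $\mathbb{R}_0$ arithmetic.
Returning to the singlet case, corresponding to $k=1$, $l=0$, we can write it in analogy to (98) and (99),
$$g\big(p(a_2|a_1)\big)\,g\big(p(a_1)\big)=g\left(\frac{\int\chi_{a_1,00}(\lambda)\chi_{a_2,00}(\lambda)\rho_{00}(\lambda)\,d\lambda}{\int\chi_{a_1,00}(\lambda)\rho_{00}(\lambda)\,d\lambda}\right)g\left(\int\chi_{a_1,00}(\lambda)\rho_{00}(\lambda)\,d\lambda\right) \tag{104}$$
$$=\frac{1}{2}\,g\left(2\int\chi_{a_1,00}(\lambda)\chi_{a_2,00}(\lambda)\rho_{00}(\lambda)\,d\lambda\right) \tag{105}$$
$$=G\left(\int\chi_{a_1,00}(\lambda)\chi_{a_2,00}(\lambda)\rho_{00}(\lambda)\,d\lambda\right) \tag{106}$$
$$=G\left(\int\chi_{a_1a_2,00}(\lambda)\rho_{00}(\lambda)\,d\lambda\right), \tag{107}$$
where $G(x)=\frac{1}{2}g(2x)$, and
$$\chi_{a_1a_2,00}(\lambda)=\chi_{a_1,00}(\lambda)\chi_{a_2,00}(\lambda) \tag{108}$$
is the characteristic function representing the conjunction “$a_1$ and $a_2$”. Notice that (106) is a non-Newtonian integral
$$G\left(\int\chi_{a_1,00}(\lambda)\chi_{a_2,00}(\lambda)\rho_{00}(\lambda)\,d\lambda\right)=\int\chi_{a_1,11}(\lambda)\odot_1\chi_{a_2,11}(\lambda)\odot_1\rho_{11}(\lambda)\,D_1\lambda, \tag{109}$$
of the function
$$\chi_{a_1,11}\odot_1\chi_{a_2,11}\odot_1\rho_{11}:\mathbb{R}_1\to\mathbb{R}_1, \tag{110}$$
where
$$\begin{array}{ccc}\mathbb{R}_1&\xrightarrow{\ \chi_{a_1,11}\ }&\mathbb{R}_1\\ G^{-1}\Big\downarrow&&\Big\uparrow G\\ \mathbb{R}_0&\xrightarrow{\ \chi_{a_1,00}\ }&\mathbb{R}_0\end{array},\qquad\begin{array}{ccc}\mathbb{R}_1&\xrightarrow{\ \rho_{11}\ }&\mathbb{R}_1\\ G^{-1}\Big\downarrow&&\Big\uparrow G\\ \mathbb{R}_0&\xrightarrow{\ \rho_{00}\ }&\mathbb{R}_0\end{array}, \tag{111}$$
and the multiplication is given by
$$x\odot_1y=G\big(G^{-1}(x)\odot_0G^{-1}(y)\big)=G\big(G^{-1}(x)\,G^{-1}(y)\big). \tag{112}$$
The right-hand side of (109) again has the Bell–Clauser–Horne product form, the only difference being that instead of $\odot_0$ one employs $\odot_1$. This is why (109) can be regarded as a local hidden-variable representation of singlet-state probabilities, hence a counterexample to Bell’s theorem. This is the main idea of the approach to singlet-state correlations introduced in [12] and further discussed in [10,13,14].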
A formal basis of the construction from [10,12,13,14] is given by the following:
Lemma 4. 
Consider four joint probabilities $p_{0_10_2}$, $p_{1_11_2}$, $p_{0_11_2}$, $p_{1_10_2}$, satisfying
$$\sum_{ab}p_{ab}=1, \tag{113}$$
$$\sum_ap_{aa_2}=\sum_ap_{a_1a}=\frac{1}{2}. \tag{114}$$
A sufficient condition for
$$\sum_{ab}G(p_{ab})=1, \tag{115}$$
is given by $G(p)=\frac{1}{2}g(2p)$, where $g$ satisfies Lemma 1. Any such $G$ has a fixed point at $p=1/4$.
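Lemma 4 can be tested for more than one choice of $g$. In the sketch below, the marginal constraints force $p_{00}=p_{11}=t$ and $p_{01}=p_{10}=\frac{1}{2}-t$; besides the quantum bijection, a cubic $g$ built from the odd map $h(x)=4x^3$ is used as a second, purely illustrative solution of the functional equation of Lemma 1. All names are hypothetical.

```python
import math

def g_quantum(p):
    """g(p) = sin^2(pi p / 2)."""
    return math.sin(math.pi * p / 2) ** 2

def g_cubic(p):
    """Another solution of g(p) + g(1 - p) = 1, with odd h(x) = 4 x^3."""
    return 0.5 + 4 * (p - 0.5) ** 3

def G(p, g):
    """G(p) = g(2p) / 2, defined for p in [0, 1/2]."""
    return 0.5 * g(2 * p)

def symmetric_joint(t):
    """Joint probabilities with all four marginals equal to 1/2 (0 <= t <= 1/2)."""
    return {(0, 0): t, (1, 1): t, (0, 1): 0.5 - t, (1, 0): 0.5 - t}

def transformed_total(t, g):
    """Sum of G(p_ab) over the four joint probabilities."""
    return sum(G(p, g) for p in symmetric_joint(t).values())

for t in (0.0, 0.1, 0.25, 0.4, 0.5):
    print(t, transformed_total(t, g_quantum), transformed_total(t, g_cubic))
```

Both columns should equal 1 for every admissible $t$, and $G(1/4)=1/4$ holds for either choice of $g$.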
A disadvantage of the construction based on Lemma 4 is its restriction to “rotationally symmetric” probabilities, i.e., those fulfilling (114). Moreover, being in itself sufficient as a counterexample to Bell’s theorem, it lacks the generality typical of arbitrary $k,l\in\mathbb{Z}$.
The fundamental structure of the quantum probability model seems to be best described by Formula (69).
So far, the angles occurring in singlet-state probabilities were interpretable as experimental parameters (angles between polarizers or Stern–Gerlach devices). But what about arbitrary quantum states, even those described by infinite-dimensional Hilbert spaces? It turns out that the parameter in question can be interpreted in geometric terms, independently of the physical nature of the problem.

10. Fubini–Study Geodesic Distance as a Hidden Variable

The scalar product $\langle a|b\rangle$ of two vectors belonging to some Hilbert space defines their Fubini–Study geodesic distance $\theta(a,b)$ [40,41,42,43,44,45],
$$|\langle a|b\rangle|^2 = \langle a|a\rangle\,\langle b|b\rangle\,\cos^2\theta(a,b).$$
Let $P_b$ be a projector, $|b\rangle = P_b|a\rangle$, and $\langle a|a\rangle = 1$, so that $\langle b|b\rangle = \langle a|b\rangle = \langle a|P_b|a\rangle = P(b|a)$ is a conditional quantum probability. The geodesic distance between $|a\rangle$ and $|b\rangle$ satisfies
$$|\langle a|b\rangle|^2 = \langle a|P_b|a\rangle^2 = \langle a|P_b|a\rangle\,\cos^2\theta(a,b),$$
and thus,
$$P(b|a) = \cos^2\theta(a,b). \tag{118}$$
The formal angle θ ( a , b ) between the two vectors in the Hilbert space acquires a direct physical interpretation if a and b represent linear polarizations of photons: θ ( a , b ) becomes the angle between two polarizers. In the analogous case of the electrons, θ ( a , b ) would represent one half of the angle between two Stern–Gerlach devices.
Next, let us rewrite (118) as
$$P(b|a) = \cos^2\theta(a,b) = \sin^2\!\Big(\frac{\pi}{2}\,p(b|a)\Big) = g\big(p(b|a)\big) = \cos^2\!\Big(\frac{\pi}{2}\big(1 - p(b|a)\big)\Big),$$
where $g : [0,1] \to [0,1]$ is the bijection we have introduced in the context of the singlet state. Probabilities $p(b|a)$ and $P(b|a) = g(p(b|a))$ represent, respectively, the hidden and the quantum neighboring levels of the hierarchy of (conditional) probabilities. The hidden probability is thus directly related to the Fubini–Study geodesic distance,
$$\theta(a,b) = \frac{\pi}{2}\big(1 - p(b|a)\big),$$
$$p(b|a) = 1 - \frac{\theta(a,b)}{\pi/2},$$
$$q(b|a) = 1 - p(b|a) = \frac{\theta(a,b)}{\pi/2},$$
where $q(b|a)$ is the probability that two randomly chosen and intersecting straight lines intersect at an angle not exceeding $\theta(a,b) \in [0, \pi/2]$.
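The chain from a quantum conditional probability to the hidden one can be traced numerically. The following is a minimal sketch, assuming a rank-one projector $P_b = |e\rangle\langle e|$ onto a unit vector $e$ (the text allows a general projector); all function names are illustrative choices of mine.

```python
import math


def g(p):
    """Quantum bijection from the text: g(p) = sin^2(pi p / 2)."""
    return math.sin(0.5 * math.pi * p) ** 2


def quantum_probability(a, e):
    """P(b|a) = <a|P_b|a> = |<e|a>|^2 for unit vectors a, e (rank-one P_b assumed)."""
    overlap = sum(complex(ei).conjugate() * complex(ai) for ei, ai in zip(e, a))
    return abs(overlap) ** 2


def fubini_study_angle(P):
    """theta(a, b) obtained from P(b|a) = cos^2(theta); P is clipped to [0, 1]."""
    return math.acos(math.sqrt(min(1.0, max(0.0, P))))


def hidden_probability(P):
    """p(b|a) = 1 - theta / (pi / 2)."""
    return 1.0 - fubini_study_angle(P) / (0.5 * math.pi)


if __name__ == "__main__":
    # Linear polarizations rotated by 30 degrees with respect to each other.
    a = (1.0, 0.0)
    e = (math.cos(math.pi / 6), math.sin(math.pi / 6))
    P = quantum_probability(a, e)
    p = hidden_probability(P)
    print("P(b|a) =", P)          # cos^2(30 deg)
    print("p(b|a) =", p)          # 1 - (pi/6)/(pi/2)
    print("g(p)   =", g(p))       # should reproduce P(b|a)
```

In this polarization example the Fubini–Study angle coincides with the angle between the polarizers, so the hidden probability is linear in that angle.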
The Fubini–Study geodesic distance has been turned into a classical measure of a subset of a quarter-circle. It defines the whole hierarchy of probabilities, $g^k\big(p(b|a)\big)$, where $k = 1$ is the quantum one. Note that $g(p) = \sin^2\frac{\pi}{2}p$ has been elevated to the role of a universal bijection, defining an arithmetic applicable to all the possible (pure) quantum states. Explicitly, we find
\begin{align*}
g^{-1}\big(p(b|a)\big) &= \frac{1}{\pi/2}\arcsin\sqrt{1 - \frac{\arccos\sqrt{P(b|a)}}{\pi/2}}, \tag{123} \\
g^{0}\big(p(b|a)\big) &= 1 - \frac{\arccos\sqrt{P(b|a)}}{\pi/2}, \tag{124} \\
g^{1}\big(p(b|a)\big) &= \sin^2\!\left(\frac{\pi}{2}\Big(1 - \frac{\arccos\sqrt{P(b|a)}}{\pi/2}\Big)\right) = P(b|a), \tag{125} \\
g^{2}\big(p(b|a)\big) &= \sin^2\!\Big(\frac{\pi}{2}\,P(b|a)\Big), \tag{126} \\
g^{3}\big(p(b|a)\big) &= \sin^2\!\Big(\frac{\pi}{2}\sin^2\!\Big(\frac{\pi}{2}\,P(b|a)\Big)\Big). \tag{127}
\end{align*}
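Since $\langle a|P_b|a\rangle = P(b|a)$ is real, it can be written as a real quadratic form,
$$\langle a|P_b|a\rangle = \sum_{rs} \Re(a_r)\,A_{rs}\,\Re(a_s) + \sum_{rs} \Im(a_r)\,B_{rs}\,\Im(a_s) + \sum_{rs} \Re(a_r)\,C_{rs}\,\Im(a_s). \tag{128}$$
Hence,
\begin{align*}
g^{2}\big(p(b|a)\big) &= g^{1}\big(P(b|a)\big) \\
&= g^{1}\Big(\sum_{rs} \Re(a_r)\,A_{rs}\,\Re(a_s) + \sum_{rs} \Im(a_r)\,B_{rs}\,\Im(a_s) + \sum_{rs} \Re(a_r)\,C_{rs}\,\Im(a_s)\Big) \\
&= \bigoplus_{rs} g\big(\Re(a_r)\big) \odot_1 g(A_{rs}) \odot_1 g\big(\Re(a_s)\big) \oplus_1 \bigoplus_{rs} g\big(\Im(a_r)\big) \odot_1 g(B_{rs}) \odot_1 g\big(\Im(a_s)\big) \\
&\quad \oplus_1 \bigoplus_{rs} g\big(\Re(a_r)\big) \odot_1 g(C_{rs}) \odot_1 g\big(\Im(a_s)\big) \\
&= \langle g(a)|g(P_b)|g(a)\rangle = \langle a_1| \odot_1 P_{b,1} \odot_1 |a_1\rangle, \tag{132}
\end{align*}
where $\langle g(a)|g(P_b)|g(a)\rangle$ in (132) is defined in a way that parallels the form of
$$\langle a|P_b|a\rangle = \langle a_0| \odot_0 P_{b,0} \odot_0 |a_0\rangle$$
in (128), but with all the “standard” sums $+ = \oplus_0$ and products $\cdot = \odot_0$ replaced by $\oplus_1$ and $\odot_1$, and all the coefficients transformed by $g$. In effect, the difference between (128) and (132) is purely notational, as one can write the whole hierarchy of probabilities in a “quantum” form as well,
\begin{align*}
g^{0}\big(p(b|a)\big) &= \langle a_{-1}| \odot_{-1} P_{b,-1} \odot_{-1} |a_{-1}\rangle, \tag{134} \\
g^{1}\big(p(b|a)\big) &= \langle a_{0}| \odot_{0} P_{b,0} \odot_{0} |a_{0}\rangle, \tag{135} \\
g^{2}\big(p(b|a)\big) &= \langle a_{1}| \odot_{1} P_{b,1} \odot_{1} |a_{1}\rangle, \tag{136} \\
g^{3}\big(p(b|a)\big) &= \langle a_{2}| \odot_{2} P_{b,2} \odot_{2} |a_{2}\rangle. \tag{137}
\end{align*}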
This is the Copernican principle in action. The choice of the “quantum” level of the hierarchy is just a matter of convention. In fact, any formula from (123)–(127) can represent quantum mechanics known from textbooks.
It is perhaps more striking that any of these levels can be regarded as a hidden-variable level, where the hidden variable is given by an appropriate geodesic distance.
The concrete example of $g(p) = \sin^2\frac{\pi}{2}p$ can help us to understand the structure of the whole hierarchy. We will see that, in spite of the infinite dimension of the hierarchy, one effectively deals with a finite-dimensional structure.

11. Effective Truncation of the Infinite Hierarchy of Probabilities

Figure 2 explains why in spite of the infinite number of levels, those that statistically differ between one another may be limited to a finite “band” in the hierarchy. What it practically means is that if our level of the hierarchy is given by some l (say, l = 0 ) then, depending on the available precision of our experiments, we may restrict the analysis to a finite collection of probabilities. In the example depicted in Figure 2, we can restrict the analysis to 31 levels,
$$\{g^{-15}(p), \dots, g^{-1}(p), p, g(p), \dots, g^{15}(p)\},$$
because the full infinite hierarchy is indistinguishable from
$$\{\dots, g^{-15}(p), \dots, g^{-15}(p), \dots, g^{-1}(p), p, g(p), \dots, g^{15}(p), \dots, g^{15}(p), \dots\}.$$
When increasing $k$ in $g^k$, we effectively obtain a theory that may look discrete, because $g^k(p)$, $k > k_{\max}$, are indistinguishable from the red step function in Figure 2. For $g^k(p)$, $k < k_{\min}$, we obtain an analogous behavior of the inverse functions.
Let us stress that the above argument for indistinguishability has been formulated only for probabilities, $p \in [0,1]$, hence for $g(p)$, and not for $g_{\mathbb{R}}(x)$, $x \notin [0,1]$. In principle, for $x \notin [0,1]$, all the levels of the hierarchy may be distinguishable.
Notice that for this concrete $g(p) = \sin^2\frac{\pi}{2}p$, one finds $g^{15}(p) \approx 0$ if $p < 1/2$, $g^{15}(1/2) = 1/2$, and $g^{15}(p) \approx 1$ if $p > 1/2$. Thus, the higher-level probabilities possess several obvious analogies to neural activation functions [46], making links between the hierarchical structure and the measurement problem even more intriguing. An observer who measures $g^{15}(p)$ probabilities ignores practically all the events whose probability is smaller than 1/2, and treats all $p > 1/2$ as certain.
This type of behavior is the essence of learning algorithms. An intriguing possibility arises that $g(p)$ is a probability related to the act of learning that events with probability $p$ are true. Hence, the natural question: Is the stabilization of the large $k > 0$ iterates on what is effectively the step function a formal counterpart of stabilization of self-observation, a creation of self-awareness?
For the negative iterates, instead of a threshold function we tend toward a “white noise”: $g^{-15}(0) = 0$, $g^{-15}(1/2) = 1/2$, $g^{-15}(1) = 1$, and $g^{-15}(p) \approx 1/2$ for $0 < p < 1$. The lower levels of the hierarchy become less and less diverse from the point of view of a higher-level observer. Here, the analogy with observations of micro-scale events is quite evident. The relativity of probability becomes analogous to the “relativity of smallness”: what is small to us may be large for a bacterium or an atom.
It is worth recalling that $g^{15}(p)$ and $g^{-15}(p)$ only look discrete due to our limited resolution. In reality, both maps are continuous bijections of $[0,1]$ onto itself.
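The threshold and white-noise limits described above are easy to reproduce. The sketch below iterates $g(p) = \sin^2\frac{\pi}{2}p$ and its inverse $g^{-1}(x) = \frac{2}{\pi}\arcsin\sqrt{x}$; the helper `g_iter` and the sample values of $p$ are illustrative choices, not part of the original analysis.

```python
import math


def g(p):
    """g(p) = sin^2(pi p / 2) on [0, 1]."""
    return math.sin(0.5 * math.pi * p) ** 2


def g_inv(x):
    """Inverse bijection: g^{-1}(x) = (2 / pi) arcsin(sqrt(x))."""
    return (2.0 / math.pi) * math.asin(math.sqrt(x))


def g_iter(p, k):
    """k-th iterate g^k(p); negative k iterates the inverse, k = 0 is the identity."""
    step = g if k >= 0 else g_inv
    for _ in range(abs(k)):
        p = step(p)
    return p


if __name__ == "__main__":
    for p in (0.1, 0.4, 0.5, 0.6, 0.9):
        print(f"p = {p:.1f}   g^15(p) = {g_iter(p, 15):.6f}   "
              f"g^-15(p) = {g_iter(p, -15):.6f}")
```

The positive iterates converge quickly because $g(p) \approx (\pi p/2)^2$ near $p = 0$, whereas the negative iterates approach $1/2$ only geometrically, at a rate of roughly $2/\pi$ per level.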
Now, what about experiment and laws of large numbers? Can they somehow discriminate between all these probabilities?

12. Hierarchical Laws of Large Numbers

Laws of large numbers formalize the relations between probabilities (real numbers), (natural) numbers of trials and successes, and (rational) numbers of their relative frequencies. However, as we already know, all these notions are arithmetic dependent: a natural number $n_k = g^k(n) \in \mathbb{R}_k$ may not be a natural number from the point of view of some other $\mathbb{R}_l$, a rational number $(n/m)_k = g^k(n/m) \in \mathbb{R}_k$ may not be a rational number from the point of view of $\mathbb{R}_l$, and so on. The most general law of large numbers should involve all the levels of the hierarchy simultaneously. Dealing with binary events, we need an appropriate generalization of the Bernoulli law of large numbers.
To begin with, let us imagine we “live” in a world where all the possible computations are performed in terms of the arithmetic $\mathbb{R}_l$. If we toss a coin, say, one hundred times, and observe heads forty times, the arithmetic formulation of the experiment involves $n_l = 40_l$ heads in $N_l = 100_l$ trials. The experimental ratio is $n_l \oslash_l N_l = 40_l \oslash_l 100_l$. This is a rational number in $\mathbb{R}_l$.
If the same experiment is described by an observer who employs arithmetic $\mathbb{R}_j$, $j \neq l$, the experimental ratio is given by $n_j \oslash_j N_j = 40_j \oslash_j 100_j$. In terms of $g^l$ and $g^j$ we can write $40_l \oslash_l 100_l = g^l(40/100)$ and $40_j \oslash_j 100_j = g^j(40/100)$. Yet, if we demanded $g^l(40/100) = g^j(40/100)$, it would imply that $g^{l-j}(40/100) = 40/100$, i.e., $40/100$ is a fixed point of $g^{l-j}$. Since the same argument can be applied to any rational number, one arrives at the conclusion that the trivial case $g(x) = g^0(x) = x$ is the only solution.
One concludes that a nontrivial $g$ generically implies $n_j \oslash_j N_j \neq n_l \oslash_l N_l$ for $l \neq j$. In other words, the same experiment can be described by different probabilities, $p_l = g^l(p) \neq p_j = g^j(p)$, although from the frequentist perspective both descriptions involve forty successes in one hundred trials. We inevitably arrive at the whole hierarchy.
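The level dependence of the experimental ratio can be made concrete for the coin example. The sketch below assumes the quantum bijection $g(p) = \sin^2\frac{\pi}{2}p$ and evaluates $g^l(40/100)$ on a few neighboring levels; `level_ratio` is a hypothetical helper name.

```python
import math


def g(p):
    """g(p) = sin^2(pi p / 2) on [0, 1]."""
    return math.sin(0.5 * math.pi * p) ** 2


def g_inv(x):
    """g^{-1}(x) = (2 / pi) arcsin(sqrt(x))."""
    return (2.0 / math.pi) * math.asin(math.sqrt(x))


def level_ratio(n, N, l):
    """Ratio n_l (/)_l N_l = g^l(n / N) seen by an observer using arithmetic R_l."""
    value = n / N
    step = g if l >= 0 else g_inv
    for _ in range(abs(l)):
        value = step(value)
    return value


if __name__ == "__main__":
    for l in range(-2, 3):
        print(f"l = {l:+d}   40_l / 100_l = {level_ratio(40, 100, l):.6f}   "
              f"50_l / 100_l = {level_ratio(50, 100, l):.6f}")
```

Forty heads in one hundred tosses yield a different number on every level, while fifty heads yield 1/2 on all of them, because 1/2 is a fixed point of this particular $g$.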
This is my tentative interpretation of the hierarchical structure. However, the links with neural activation functions deserve a separate study.
In order to formulate a generalized Bernoulli law of large numbers, we have to estimate the probability that
$$\big|g^k(p) \ominus_l n_l \oslash_l N_l\big| = \big|g^l\big(g^{k-l}(p) - n/N\big)\big| = g^l\big(\big|g^{k-l}(p) - n/N\big|\big) \geq \varepsilon_l = g^l(\varepsilon). \tag{140}$$
The modulus is defined in $\mathbb{R}_l$ in the standard way,
$$|x| = \begin{cases} x & \text{if } x \geq 0_l, \\ \ominus_l x & \text{if } x < 0_l, \end{cases}$$
where we keep in mind that, by assumption, $0_l = g^l(0) = 0$ and the ordering relation is unaffected by a strictly increasing $g$. Inequality (140) effectively boils down to
$$\big|g^{k-l}(p) - n/N\big| \geq \varepsilon.$$
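Next, we note that probabilities depicted in the lower part of Figure 3 are normalized in consequence of the identity
$$\big(g^k(p) \oplus_l g^k(q)\big)^{N_l} = {\bigoplus\nolimits_{l}}_{n=0}^{N} \binom{N_l}{n_l} \odot_l g^k(q)^{(N-n)_l} \odot_l g^k(p)^{n_l} = 1_l = 1,$$
$$\binom{N_l}{n_l} = g^l\!\left(\binom{N}{n}\right).$$
The probability
\begin{align*}
p(n_l, N_l)_k &= \binom{N_l}{n_l} \odot_l g^k(q)^{(N-n)_l} \odot_l g^k(p)^{n_l} \\
&= g^l\!\left(\binom{N}{n}\, g^{k-l}(q)^{N-n}\, g^{k-l}(p)^{n}\right)
\end{align*}
corresponds to $n_l$ successes in $N_l$ trials. The expected number of successes and the corresponding variance read
\begin{align*}
\langle n_l\rangle_k &= {\bigoplus\nolimits_{l}}_{n=0}^{N} n_l \odot_l \binom{N_l}{n_l} \odot_l g^k(q)^{(N-n)_l} \odot_l g^k(p)^{n_l} \\
&= g^l\big(N g^{k-l}(p)\big) = N_l \odot_l g^k(p), \\
\langle n_l^{2_l}\rangle_k \ominus_l \langle n_l\rangle_k^{2_l} &= g^l\big(N g^{k-l}(p)\, g^{k-l}(q)\big) = N_l \odot_l g^k(p) \odot_l g^k(q) \tag{149} \\
&= N_l^{2_l} \odot_l {\bigoplus\nolimits_{l}}_{n=0}^{N} \big(n_l \oslash_l N_l \ominus_l g^k(p)\big)^{2_l} \odot_l p(n_l, N_l)_k. \tag{150}
\end{align*}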
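Applying $g^{-l}$ to (149) and (150), we find
$$g^{k-l}(p)\,g^{k-l}(q)/N = \sum_{n=0}^{N} \big(n/N - g^{k-l}(p)\big)^2 \binom{N}{n}\, g^{k-l}(q)^{N-n}\, g^{k-l}(p)^{n}.$$
Now, let $n \in \mathcal{N}_{\varepsilon, g^{k-l}(p), N}$ if $\big|n/N - g^{k-l}(p)\big| \geq \varepsilon$. Then,
\begin{align*}
g^{k-l}(p)\,g^{k-l}(q)/N &\geq \sum_{n \in \mathcal{N}_{\varepsilon, g^{k-l}(p), N}} \big(n/N - g^{k-l}(p)\big)^2 \binom{N}{n}\, g^{k-l}(q)^{N-n}\, g^{k-l}(p)^{n} \\
&\geq \varepsilon^2 \sum_{n \in \mathcal{N}_{\varepsilon, g^{k-l}(p), N}} \binom{N}{n}\, g^{k-l}(q)^{N-n}\, g^{k-l}(p)^{n} \\
&= \varepsilon^2\, p\big(n \in \mathcal{N}_{\varepsilon, g^{k-l}(p), N}\big),
\end{align*}
where $p\big(n \in \mathcal{N}_{\varepsilon, g^{k-l}(p), N}\big) \in \mathbb{R}_0$ is the 0th-level probability that $|n/N - g^{k-l}(p)| \geq \varepsilon$. In this way we have arrived at the standard Bernoulli law of large numbers in $\mathbb{R}_0$,
$$p\big(n \in \mathcal{N}_{\varepsilon, g^{k-l}(p), N}\big) \leq \frac{g^{k-l}(p)\,g^{k-l}(q)}{N \varepsilon^2}. \tag{155}$$
Of course, the left-hand side of (155) cannot be greater than 1, so the number of trials $N$ must be chosen so that
$$\frac{g^{k-l}(p)\,g^{k-l}(q)}{\varepsilon^2} \leq N.$$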
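The level-0 bound (155) is the usual Chebyshev estimate applied to a binomial distribution with success probability $g^{k-l}(p)$, and it can be verified by direct summation. The sketch below assumes $g(p) = \sin^2\frac{\pi}{2}p$; the values of $p$, $N$, $\varepsilon$ and all function names are illustrative assumptions.

```python
import math


def g(p):
    """g(p) = sin^2(pi p / 2) on [0, 1]."""
    return math.sin(0.5 * math.pi * p) ** 2


def g_inv(x):
    """g^{-1}(x) = (2 / pi) arcsin(sqrt(x))."""
    return (2.0 / math.pi) * math.asin(math.sqrt(x))


def g_iter(p, m):
    """m-th iterate of g (negative m uses the inverse map)."""
    step = g if m >= 0 else g_inv
    for _ in range(abs(m)):
        p = step(p)
    return p


def tail_probability(P, N, eps):
    """Level-0 probability that |n/N - P| >= eps for n ~ Binomial(N, P)."""
    return sum(
        math.comb(N, n) * P ** n * (1.0 - P) ** (N - n)
        for n in range(N + 1)
        if abs(n / N - P) >= eps
    )


def chebyshev_bound(P, Q, N, eps):
    """Right-hand side of (155): P Q / (N eps^2)."""
    return P * Q / (N * eps ** 2)


if __name__ == "__main__":
    p, N, eps = 0.3, 200, 0.1   # illustrative values
    for m in (-1, 0, 1):        # m plays the role of k - l
        P, Q = g_iter(p, m), g_iter(1.0 - p, m)
        print(f"k-l = {m:+d}  P = {P:.4f}  tail = {tail_probability(P, N, eps):.3e}  "
              f"bound = {chebyshev_bound(P, Q, N, eps):.3e}")
```

The bound is far from tight, as is typical for Chebyshev-type estimates, but it holds on each level with the level-dependent probability $g^{k-l}(p)$.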
For $p_l = g^l(p)$ we find, denoting $\varepsilon_l = g^l(\varepsilon)$, $N_l = g^l(N)$,
$$p_l\big(n \in \mathcal{N}_{\varepsilon, g^{k-l}(p), N}\big) \leq g^l\!\left(\frac{g^{k-l}(p)\,g^{k-l}(q)}{N \varepsilon^2}\right) = g^l\!\left(\frac{g^{k-l}(p)\,g^{k-l}(q)}{g^{-l}(N_l)\,g^{-l}(\varepsilon_l)^2}\right) = g^k(p) \odot_l g^k(q) \oslash_l N_l \oslash_l \varepsilon_l^{2_l},$$
for any $k \in \mathbb{Z}$.
In order to get a feel for the influence of $l \in \mathbb{Z}$ on the rate of convergence of experimental ratios to probabilities, consider the simple case of a symmetric coin, $p = q = 1/2$, and the universal quantum bijection $g(x) = \sin^2\frac{\pi}{2}x$. Since $g^{k-l}(1/2) = 1/2$ for any $k, l$, we have to estimate
$$p_l\big(n \in \mathcal{N}_{\varepsilon, g^{k-l}(p), N}\big) \leq g^l\!\left(\frac{1}{4 N \varepsilon^2}\right), \tag{158}$$
$$\frac{1}{4\varepsilon^2} \leq N.$$
Figure 4 illustrates the right-hand side of (158) for $\varepsilon = 0.1$ and $25 \leq N \leq 75$, for the first four iterates of $g$, from $g^1(x) = \sin^2\frac{\pi}{2}x$ to
$$g^4(x) = \sin^2\!\Big(\frac{\pi}{2}\sin^2\!\Big(\frac{\pi}{2}\sin^2\!\Big(\frac{\pi}{2}\sin^2\!\Big(\frac{\pi}{2}x\Big)\Big)\Big)\Big).$$
The graphs are intriguing. Their interpretation is additionally obscured by the fact that Wolfram Mathematica operates in the arithmetic R 0 , which is not used by any of the four observers. The problem requires further studies.
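The right-hand side of (158) can be tabulated without plotting. The following sketch evaluates $g^l\big(1/(4N\varepsilon^2)\big)$ for $\varepsilon = 0.1$ and a few values of $N$ in the range quoted above. Like the Mathematica computation mentioned in the text, it runs in the ordinary arithmetic $\mathbb{R}_0$; the function names are my own.

```python
import math


def g(x):
    """g(x) = sin^2(pi x / 2)."""
    return math.sin(0.5 * math.pi * x) ** 2


def bound(N, eps, l):
    """Right-hand side of (158), g^l(1 / (4 N eps^2)), for l >= 1.

    The argument is clipped to 1 to absorb floating-point rounding at N = 1 / (4 eps^2).
    """
    value = min(1.0, 1.0 / (4.0 * N * eps ** 2))
    for _ in range(l):
        value = g(value)
    return value


if __name__ == "__main__":
    eps = 0.1
    print("  N " + "".join(f"   l = {l}  " for l in range(1, 5)))
    for N in (25, 30, 40, 50, 60, 75):
        row = "".join(f"  {bound(N, eps, l):.5f}" for l in range(1, 5))
        print(f"{N:4d}{row}")
```

At $N = 50$ the argument equals the fixed point 1/2, so all four levels give the same bound there; above it the bounds decrease with $l$, below it they increase.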

13. Hierarchical Approach to Bell’s Theorem—Revisited

If we are able to reconstruct singlet-state probabilities in a hidden-variable way, it means that Bell’s inequality (in any form) cannot be proved for the model. In the hierarchical context the obstacle to proving the inequality lies in the lack of the $k$-level additivity of the $l$-level integrals, if $l \neq k$. The usual derivation, when seen from the hierarchical perspective, assumes $\oplus_0$-additivity of $D_1\lambda$ integrals, which is untrue for a nontrivial $g$, and for $g(p) = \sin^2\frac{\pi}{2}p$ in particular. Hence the inequality derived at level zero does not apply to level 1: Level-0 formulas are “violated” by level-1 probabilities (and the other way around).
Let us see how it works. Consider the joint probabilities
$$P(a_1, a_2) = P(\text{first } a_1 \text{ then } a_2) = g\big(p(a_2|a_1)\big)\,g\big(p(a_1)\big) = g\big(p(a_1|a_2)\big)\,g\big(p(a_2)\big) = P(\text{first } a_2 \text{ then } a_1) = P(a_2, a_1), \tag{161}$$
where we assume the independence of the order in which the measurements are performed. This is typical of the scenarios involving “observer 1 measuring a 1 ” (“Alice”) and “observer 2 measuring a 2 ” (“Bob”) who are space-like separated and thus the order is undefined.
Now, we will derive an analog of the Clauser–Horne inequality [39]. We will work with probabilities (161). Let us stress that an analogous derivation was presented in [13], but was based on the form occurring in (109), that is, by means of the bijection $G$. The derivation we will discuss now is based on $g(x)$, and not on $G(x) = \frac{1}{2}g(2x)$. Why? Because we want a proof that is easy to generalize to any $k, l \in \mathbb{Z}$.
We assume a local hidden-variable form of the probabilities that occur at the hidden level (level zero), hence
$$p(a_2|a_1) = \frac{\int \chi_{a_1,00}(\lambda)\,\chi_{a_2,00}(\lambda)\,\rho_{00}(\lambda)\,d\lambda}{\int \chi_{a_1,00}(\lambda)\,\rho_{00}(\lambda)\,d\lambda},$$
$$p(a_1) = \int \chi_{a_1,00}(\lambda)\,\rho_{00}(\lambda)\,d\lambda.$$
Level-one conditional probabilities
\begin{align*}
g\big(p(a_2|a_1)\big) &= g\!\left(\frac{\int \chi_{a_1,00}(\lambda)\,\chi_{a_2,00}(\lambda)\,\rho_{00}(\lambda)\,d\lambda}{\int \chi_{a_1,00}(\lambda)\,\rho_{00}(\lambda)\,d\lambda}\right) \\
&= \int \chi_{a_1,11} \odot_1 \chi_{a_2,11} \odot_1 \rho_{11}(\lambda)\,D_1\lambda \oslash_1 \int \chi_{a_1,11} \odot_1 \rho_{11}(\lambda)\,D_1\lambda,
\end{align*}
can be rewritten in several useful forms. First of all, introducing the reduced (conditional) probability density we obtain the “projection postulate”,
$$\rho_{11}(\lambda) \mapsto \rho_{a_1,11}(\lambda) = \chi_{a_1,11} \odot_1 \rho_{11}(\lambda) \oslash_1 \int \chi_{a_1,11} \odot_1 \rho_{11}(\lambda)\,D_1\lambda,$$
$$g\big(p(a_2|a_1)\big) = \int \chi_{a_2,11} \odot_1 \rho_{a_1,11}(\lambda)\,D_1\lambda.$$
Secondly, we can explicitly express the conditional probability in a local Clauser–Horne form (in the arithmetic $\mathbb{R}_1$),
$$g\big(p(a_2|a_1)\big) = \int x_{a_1,11}(\lambda) \odot_1 y_{a_2,11}(\lambda) \odot_1 \rho_{11}(\lambda)\,D_1\lambda,$$
where
\begin{align*}
x_{a_1,11}(\lambda) &= \chi_{a_1,11}(\lambda) \oslash_1 \int \chi_{a_1,11}(\lambda) \odot_1 \rho_{11}(\lambda)\,D_1\lambda \\
&= g_{\mathbb{R}}\!\left(\frac{\chi_{a_1,00}(\lambda)}{\int \chi_{a_1,00}(\lambda)\,\rho_{00}(\lambda)\,d\lambda}\right), \\
y_{a_2,11}(\lambda) &= \chi_{a_2,11}(\lambda) = g\big(\chi_{a_2,00}(\lambda)\big) = \chi_{a_2,00}(\lambda),
\end{align*}
(because $g(0) = 0$, $g(1) = 1$).
Repeating step by step the derivation of the Clauser–Horne inequality [39], but here in the arithmetic $\mathbb{R}_1$, we can derive an analogous inequality which must be satisfied at the quantum level of the hierarchy. Such an inequality cannot be violated by quantum probabilities. For simplicity let us reduce the analysis to singlet-state probabilities and $g_{\mathbb{R}}(n) = n$ for any $n \in \mathbb{Z}$. Then,
$$\int x_{a_1,11}(\lambda) \odot_1 \rho_{11}(\lambda)\,D_1\lambda = 1,$$
$$\int y_{a_2,11}(\lambda) \odot_1 \rho_{11}(\lambda)\,D_1\lambda = 1/2,$$
$$0 \leq x_{a_1,11}(\lambda) \leq X = g_{\mathbb{R}}\!\left(\frac{1}{\int \chi_{a_1,00}(\lambda)\,\rho_{00}(\lambda)\,d\lambda}\right) = 2,$$
$$0 \leq y_{a_2,11}(\lambda) \leq Y = 1,$$
and
$$g\big(p(a_2|a_1)\big) = g\big(p(a_1|a_2)\big).$$
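Next, we consider the Clauser–Horne linear combination
\begin{align*}
CH(\lambda) ={}& x_{a_1,11} \odot_1 y_{b_2,11}(\lambda) \ominus_1 x_{a_1,11} \odot_1 y_{b'_2,11}(\lambda) \oplus_1 x_{a'_1,11} \odot_1 y_{b_2,11}(\lambda) \oplus_1 x_{a'_1,11} \odot_1 y_{b'_2,11}(\lambda) \\
& \ominus_1 x_{a'_1,11} \odot_1 Y \ominus_1 X \odot_1 y_{b_2,11}(\lambda).
\end{align*}
Repeating in $\mathbb{R}_1$ the reasoning from [39], we obtain
$$\ominus_1 2 \leq CH(\lambda) \leq 0.$$
$\mathbb{R}_1$-multiplying the latter by $\rho_{11}(\lambda)$, integrating with $D_1\lambda$, and taking into account the $\mathbb{R}_1$-linearity of the $D_1\lambda$ integral, we find
$$0 \leq g\big(p(a_1|b_2)\big) \ominus_1 g\big(p(a_1|b'_2)\big) \oplus_1 g\big(p(a'_1|b_2)\big) \oplus_1 g\big(p(a'_1|b'_2)\big) \leq 2. \tag{179}$$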
Notice that inequality (179) involves conditional probabilities, as opposed to the original Clauser–Horne one, which was based on joint probabilities. The inequalities derived in the arithmetic induced by $G(x)$ and discussed in [12,13] were also based on joint probabilities. However, joint probabilities involve the “macroscopic” level-0 multiplication of $1/2$ by $\cos^2(\alpha/2)$, whereas the conditional probabilities involve only the arithmetic of the “microscopic” level-1 probability $\cos^2(\alpha/2)$.
When investigating the violation of inequalities such as (179) one should keep in mind the difference between $g(x) = \sin^2\frac{\pi}{2}x$, for $x \in [0,1]$, and its extension $g_{\mathbb{R}}(x)$ beyond the interval $[0,1]$. Here, (179) is derived under the assumption that $g_{\mathbb{R}}(n) = n$, for any integer $n$. Readers interested in explicit examples of $g_{\mathbb{R}}$ may consult [12,13,14].
The inequality that can indeed be violated is
$$0 \leq g\big(p(a_1|b_2)\big) - g\big(p(a_1|b'_2)\big) + g\big(p(a'_1|b_2)\big) + g\big(p(a'_1|b'_2)\big) \leq 2, \tag{180}$$
but it cannot be proved for the model, so it is simply untrue. The technical difficulty in proving (180) is the lack of $\mathbb{R}_0$-linearity of the $D_1\lambda$ integral.
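The difference between the two inequalities can be illustrated with numbers. The sketch below rests on several assumptions of mine: the level-1 conditional probability is taken as $\cos^2(\alpha/2)$ for a relative angle $\alpha \in [0, \pi]$, the corresponding hidden probability is $p = 1 - \alpha/\pi$ (so that $g(p) = \cos^2(\alpha/2)$), and the four detector settings are the standard choice that maximizes the ordinary Clauser–Horne sum. Since $x \oplus_1 y = g_{\mathbb{R}}\big(g_{\mathbb{R}}^{-1}(x) + g_{\mathbb{R}}^{-1}(y)\big)$ with an increasing $g_{\mathbb{R}}$ satisfying $g_{\mathbb{R}}(0) = 0$ and $g_{\mathbb{R}}(2) = 2$, the level-1 inequality is equivalent to the same bounds on the ordinary sum of hidden probabilities, which is what the code evaluates.

```python
import math


def conditional_quantum(alpha):
    """Level-1 conditional probability cos^2(alpha / 2) (assumed convention)."""
    return math.cos(0.5 * alpha) ** 2


def conditional_hidden(alpha):
    """Level-0 probability p with g(p) = cos^2(alpha / 2), for 0 <= alpha <= pi."""
    return 1.0 - alpha / math.pi


def ch_sum(prob, a1, a1p, b2, b2p):
    """Clauser-Horne combination built with ordinary + and - (arithmetic R_0)."""
    return (prob(abs(a1 - b2)) - prob(abs(a1 - b2p))
            + prob(abs(a1p - b2)) + prob(abs(a1p - b2p)))


# Hypothetical detector settings (radians), chosen for illustration only.
SETTINGS = (0.0, math.pi / 2, math.pi / 4, 3 * math.pi / 4)

if __name__ == "__main__":
    quantum = ch_sum(conditional_quantum, *SETTINGS)
    hidden = ch_sum(conditional_hidden, *SETTINGS)
    print("level-0 sum of level-1 probabilities:", quantum)   # exceeds 2
    print("level-0 sum of level-0 probabilities:", hidden)    # stays within [0, 2]
```

The first number exceeds 2 because level-1 probabilities are combined with level-0 addition, whereas the second stays within the bounds because probabilities and arithmetic belong to the same level.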
The notion of “violation” of a formula is, in my opinion, very confusing. In the same sense one could say that the real-number inequality $x^2 \geq 0$ is violated by complex numbers. Instead of saying that $i^2 = -1$ violates $x^2 \geq 0$ one rather says that $x^2 \geq 0$ cannot be proved for all $x \in \mathbb{C}$. The same happens with the Bell inequality, derived in $\mathbb{R}_0$ but not valid in $\mathbb{R}_1$. On the other hand, the inequalities that can be derived in $\mathbb{R}_1$ are never “violated” in $\mathbb{R}_1$, but certainly will be untrue in some other $\mathbb{R}_k$.

14. Interference, Propagators, Dynamics…

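Formulas (134)–(137) show that the conditional probabilities can be written in scalar-product forms,
\begin{align*}
g^{0}\big(p(b|a)\big) &= \langle b_{-1}| \odot_{-1} |a_{-1}\rangle = \langle b|a\rangle_{-1}, \tag{181} \\
g^{1}\big(p(b|a)\big) &= \langle b_{0}| \odot_{0} |a_{0}\rangle = \langle b|a\rangle_{0}, \tag{182} \\
g^{2}\big(p(b|a)\big) &= \langle b_{1}| \odot_{1} |a_{1}\rangle = \langle b|a\rangle_{1}, \tag{183} \\
g^{3}\big(p(b|a)\big) &= \langle b_{2}| \odot_{2} |a_{2}\rangle = \langle b|a\rangle_{2}, \tag{184}
\end{align*}
where we have introduced the compact notation,
$$\langle b_k| \odot_k |a_k\rangle = \langle b|a\rangle_k = g\big(\langle b|a\rangle_{k-1}\big).$$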
These concrete scalar products are real. However, a complex scalar product can be always treated as a pair of reals with, in principle, different arithmetics for real and imaginary parts (see Appendix A). This type of generalized complex numbers was applied to non-Newtonian Fourier analysis on fractals [47], and proved very useful in circumventing certain impossibility theorems about Fourier transforms on the triadic Cantor set. Scalar products (181)–(184) when generalized to complex numbers (see Appendix A) can be used to generalize Feynman’s path integral formalism to its hierarchical form, ultimately leading to propagators and time evolution.
We leave it for a future paper.

15. An Open Ending

Standard modern physics involves a three-level hierarchy: quantum, classical and cosmological. As human observers, we are positioned at the center of this hierarchy, but the connections with the remaining two levels remain unclear. We do not understand how it is that we observe quantum properties (the measurement problem). Similarly, we do not understand our relation with the large-scale universe (the dark energy problem). In both cases the arithmetic freedom is probably essential [12,48] but generally overlooked by our scientific community.
Bell’s theorem is generally believed to eliminate levels lower than the quantum one, but the hierarchical picture questions this viewpoint: Quantum and classical probabilities typical of the singlet state belong to neighboring levels in the hierarchy—any two neighboring levels. Elimination of any of the levels, thus, would destroy the whole hierarchical structure, all quantum levels included [14].
To the best of my knowledge, the first systematic study of generalized arithmetics in physics was initiated by my paper [49], in which the relativity of arithmetic was interpreted in terms of a fundamental symmetry. However, I merely rediscovered a structure that had previously been introduced to calculus by Grossman and Katz (non-Newtonian calculus) [16,17,18], Maslov (idempotent analysis) [50] and Pap (g-calculus) [19]. The origins of the idea of generalized arithmetic and calculus can be traced back to the works of Volterra on the product integral [51], Kolmogorov [34], and Nagumo [35] on generalized means, and Rényi on generalized entropies [33]. Studies of a nonstandard number theory were initiated by Rashevsky [52] and, in a concrete form of non-Diophantine arithmetic, developed by Burgin [53,54,55,56]. Generalized forms of arithmetic can be found in Benioff’s attempts to formulate a coherent theory of physics and mathematics [57,58,59,60,61]. Mathematical constructions such as Lad’s impediment functions [62], cepstral signal analysis [63,64], fractal $F^\alpha$-calculus [65,66,67,68,69,70], or nonextensive statistics [71,72,73,74,75], involve certain formal elements analogous to non-Newtonian integration or differentiation. The first application of non-Newtonian calculus to probability of which I am aware was provided by Meginniss in his analysis of the objectivity of $p$ versus the subjectivity of $g(p)$, with applications to gambling theory [76]. Another field in which generalized arithmetic and non-Newtonian calculus are starting to attract attention is mathematical finance [77,78].
From my personal perspective, the most important achievements of the new formalism include circumventing the limitations of Bell’s theorem and Tsirelson bounds in quantum mechanics [12,13]; the arithmetic of time, which appears to eliminate dark energy from cosmology in the same way that the arithmetic of velocities eliminated the luminiferous aether from special relativity [48]; formulating wave propagation along fractal coastlines [23]; and overcoming the limitations of Fourier analysis on Cantor sets [47].
The two most important observations of the present study seem to be the interpretation of the singlet-state probabilities in terms of several different arithmetic levels occurring in a single Formula (69),
$$P(a,b) = \underbrace{g\big(\underbrace{p(a|b)}_{\text{hidden}}\big)}_{\text{quantum}} \;\underbrace{\odot_0}_{\text{macroscopic}}\; \underbrace{g\big(\underbrace{p(b)}_{\text{hidden}}\big)}_{\text{quantum}},$$
and the possible links with neural network learning algorithms.
The hierarchical structure is clearly “there”. What we have understood so far is just the tip of the iceberg.

Funding

This research received no external funding.

Acknowledgments

Calculations in Mathematica were carried out at the Academic Computer Center in Gdańsk (CI TASK project pt01234).

Conflicts of Interest

The author declares no conflicts of interest.

Appendix A

Appendix A.1. Proof of (85):

D l A l k ( x ) D k x = lim δ 0 A l k ( x k δ k ) l A l k ( x ) l δ l = lim δ 0 g l m A m n f k n ( x k δ k ) l g l m A m n f k n ( x ) l δ l = lim δ 0 g l g l + l m A m n f k n g k f k ( x ) + f k ( δ k ) g l + l m A m n f k n ( x ) l δ l = lim δ 0 g l g m A m n f n f k ( x ) + f k ( δ k ) g m A m n f k n ( x ) l δ l = lim δ 0 g l g m A m n f n f k ( x ) + δ g m A m n f k n ( x ) / δ = lim δ 0 g l m g m f m g m g m A m n f n f k ( x ) + δ g m A m n f k n ( x ) / f m ( δ m ) = lim δ 0 g l m g m g m A m n f n f k ( x ) + δ g m A m n f k n ( x ) m δ m = g l m lim δ 0 A m n g n f k ( x ) + δ m A m n f k n ( x ) m δ m = g l m lim δ 0 A m n g n f n f k n ( x ) + f n ( δ n ) m A m n f k n ( x ) m δ m = g l m lim δ 0 A m n f k n ( x ) n δ n m A m n f k n ( x ) m δ m = g l m D m A m n f k n ( x ) D n f k n ( x ) .

Appendix A.2. Proof of (86):

Define
\begin{align*}
B_{lk}(x) &= \int_a^x A_{lk}(y)\,D_k y = g^l\!\left(\int_{f^k(a)}^{f^k(x)} A_{00}(r)\,dr\right) = g^l\big(B_{00}(f^k(x))\big), \\
C_{lk}(x) &= g^{l-m}\!\left(\int_{f^{k-n}(a)}^{f^{k-n}(x)} A_{mn}(y)\,D_n y\right) = g^{l-m}\big(C_{mn}(f^{k-n}(x))\big).
\end{align*}
Now compute the derivatives:
\begin{align*}
\frac{D_l}{D_k x} B_{lk}(x) &= \frac{D_l}{D_k x} \int_a^x A_{lk}(y)\,D_k y \\
&= \frac{D_l}{D_k x}\, g^l\big(B_{00}(f^k(x))\big) \\
&= g^l\!\left(\frac{d}{d f^k(x)} B_{00}(f^k(x))\right) \\
&= g^l\!\left(\frac{d}{d f^k(x)} \int_{f^k(a)}^{f^k(x)} A_{00}(r)\,dr\right) \\
&= g^l\big(A_{00}(f^k(x))\big) = A_{lk}(x), \\
\frac{D_l C_{lk}(x)}{D_k x} &= g^{l-m}\!\left(\frac{D_m C_{mn}(f^{k-n}(x))}{D_n f^{k-n}(x)}\right) \\
&= g^{l-m}\!\left(\frac{D_m}{D_n f^{k-n}(x)} \int_{f^{k-n}(a)}^{f^{k-n}(x)} A_{mn}(y)\,D_n y\right) \\
&= g^{l-m}\big(A_{mn}(f^{k-n}(x))\big) = A_{lk}(x).
\end{align*}
The derivatives are identical,
$$\frac{D_l}{D_k x}\, g^{l-m}\!\left(\int_{f^{k-n}(a)}^{f^{k-n}(x)} A_{mn}(y)\,D_n y\right) = \frac{D_l}{D_k x} \int_a^x A_{lk}(y)\,D_k y,$$
which implies
$$g^{l-m}\!\left(\int_{f^{k-n}(a)}^{f^{k-n}(x)} A_{mn}(y)\,D_n y\right) = \int_a^x A_{lk}(y)\,D_k y \oplus_l \text{constant}.$$
Setting $x = a$ we find
$$0_l = 0_l \oplus_l \text{constant} = \text{constant},$$
hence
$$g^{l-m}\!\left(\int_{f^{k-n}(a)}^{f^{k-n}(x)} A_{mn}(y)\,D_n y\right) = \int_a^x A_{lk}(y)\,D_k y.$$

Appendix A.3. Powers (Repeated Multiplications)

In order to introduce generalized arithmetics of complex numbers we need a useful concept of a “first power” [22]. To this end, consider two sets $X$ and $Y$ and a map $A : X \to Y$ which can be described by a convergent power series. If $X$ and $Y$ are equipped with different arithmetics we first have to clarify the meaning of “power”. This will be performed as follows. Consider two bijections $f_X : X \to \mathbb{R}$, $f_Y : Y \to \mathbb{R}$, and their composition $f = f_Y^{-1} \circ f_X$. The map
$$X \ni x \mapsto x^{1_{XY}} = f(x) \in Y$$
defines a first $Y$-valued power of $x \in X$.
Lemma A1. 
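$$\big(x^{1_{XY}}\big)^{1_{YZ}} = x^{1_{XZ}},$$
$$\big(x^{1_{XY}}\big)^{1_{YX}} = x = x^{1_{XX}},$$
$$(x \odot_X y)^{1_{XY}} = x^{1_{XY}} \odot_Y y^{1_{XY}},$$
$$(x \oplus_X y)^{1_{XY}} = x^{1_{XY}} \oplus_Y y^{1_{XY}}.$$
Proof. 
\begin{align*}
\big(x^{1_{XY}}\big)^{1_{YZ}} &= f_Z^{-1} \circ f_Y \circ f_Y^{-1} \circ f_X(x) = f_Z^{-1} \circ f_X(x) = x^{1_{XZ}}, \\
(x \odot_X y)^{1_{XY}} &= f_Y^{-1} \circ f_X(x \odot_X y) = f_Y^{-1}\big(f_X(x)\,f_X(y)\big) = f_Y^{-1}\big(f_Y(x^{1_{XY}})\,f_Y(y^{1_{XY}})\big) = x^{1_{XY}} \odot_Y y^{1_{XY}}, \\
(x \oplus_X y)^{1_{XY}} &= f_Y^{-1} \circ f_X(x \oplus_X y) = f_Y^{-1}\big(f_X(x) + f_X(y)\big) = f_Y^{-1}\big(f_Y(x^{1_{XY}}) + f_Y(y^{1_{XY}})\big) = x^{1_{XY}} \oplus_Y y^{1_{XY}}.
\end{align*}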
The remaining properties are obvious. □
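Let $n_Y = f_Y^{-1}(n)$, with $n \in \mathbb{N}$ being a natural number satisfying the arithmetic defined by $f_{\mathbb{R}}(x) = x$. Then, first of all,
$$n_Y = f_Y^{-1} \circ f_{\mathbb{R}}(n) = n^{1_{\mathbb{R}Y}}, \qquad n = n_{\mathbb{R}}.$$
More generally,
$$(n_Y)^{1_{YZ}} = f_Z^{-1} \circ f_Y \circ f_Y^{-1}(n) = f_Z^{-1}(n) = n_Z,$$
and, in particular,
$$(0_Y)^{1_{YZ}} = f_Z^{-1}(0) = 0_Z, \qquad (1_Y)^{1_{YZ}} = f_Z^{-1}(1) = 1_Z,$$
are the relations between neutral elements in $Y$ and $Z$.
An $n$th $Y$-valued power reads
\begin{align*}
x^{n_{XY}} &= x^{1_{XY}} \odot_Y \dots \odot_Y x^{1_{XY}} \quad (n \text{ times}) \\
&= f_Y^{-1}\big(f_Y(x^{1_{XY}}) \cdots f_Y(x^{1_{XY}})\big) = f_Y^{-1}\big(f_X(x) \cdots f_X(x)\big) \\
&= f_Y^{-1} \circ f_X\big(x \odot_X \dots \odot_X x\big) = \big(x \odot_X \dots \odot_X x\big)^{1_{XY}}, \\
x^{n_{XY}} \odot_Y x^{m_{XY}} &= x^{(n+m)_{XY}}.
\end{align*}
Note that one naturally arrives at the definition of
$$x^{n_X} = x \odot_X \dots \odot_X x = x^{1_{XX}} \odot_X \dots \odot_X x^{1_{XX}} = x^{n_{XX}},$$
which coincides with the definition of $x^{n_k}$ discussed earlier. So, $n_X$, understood as a power, can be identified with $n_X = f_X^{-1}(n)$, since
$$n_X \oplus_X m_X = f_X^{-1}(n + m) = (n + m)_X,$$
and thus one obtains the expected relation between products and sums,
$$x^{n_X} \odot_X x^{m_X} = x^{(n+m)_X} = x^{n_X \oplus_X m_X}.$$
Finally, let us compute
\begin{align*}
\big(x^{n_{XY}}\big)^{m_{YZ}} &= \Big(\big(\underbrace{x \odot_X \dots \odot_X x}_{n}\big)^{1_{XY}}\Big)^{1_{YZ}} \odot_Z \dots \odot_Z \Big(\big(\underbrace{x \odot_X \dots \odot_X x}_{n}\big)^{1_{XY}}\Big)^{1_{YZ}} \\
&= \big(x \odot_X \dots \odot_X x\big)^{1_{XZ}} \odot_Z \dots \odot_Z \big(x \odot_X \dots \odot_X x\big)^{1_{XZ}} \\
&= \big(x^{1_{XZ}} \odot_Z \dots \odot_Z x^{1_{XZ}}\big) \odot_Z \dots \odot_Z \big(x^{1_{XZ}} \odot_Z \dots \odot_Z x^{1_{XZ}}\big) \\
&= x^{1_{XZ}} \odot_Z \dots \odot_Z x^{1_{XZ}} \quad (nm \text{ times}) \\
&= x^{(nm)_{XZ}}.
\end{align*}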

Appendix A.4. Generalized Complex Numbers

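Let $(x, y) \in X_1 \times X_2$. The arithmetic of complex numbers is defined as
\begin{align*}
x \oplus y &= (x_1, x_2) \oplus (y_1, y_2) = \big(x_1 \oplus_{X_1} y_1,\; x_2 \oplus_{X_2} y_2\big), \\
x \odot y &= (x_1, x_2) \odot (y_1, y_2) = \Big(x_1 \odot_{X_1} y_1 \ominus_{X_1} x_2^{1_{X_2X_1}} \odot_{X_1} y_2^{1_{X_2X_1}},\; x_1^{1_{X_1X_2}} \odot_{X_2} y_2 \oplus_{X_2} x_2 \odot_{X_2} y_1^{1_{X_1X_2}}\Big)
\end{align*}
(we feel free to represent pairs of numbers as either columns or rows). Neutral elements of multiplication and addition are given by
$$1_X = \big(1_{X_1}, 0_{X_2}\big),$$
$$0_X = \big(0_{X_1}, 0_{X_2}\big).$$
The “imaginary unit” is represented by
$$i_X = \big(0_{X_1}, 1_{X_2}\big).$$
In order to simplify notation $i_X$ will sometimes be denoted by $i$. We get the standard “i squared equals minus one” rule,
\begin{align*}
i \odot (y_1, y_2) &= \Big(0_{X_1} \odot_{X_1} y_1 \ominus_{X_1} 1_{X_2}^{1_{X_2X_1}} \odot_{X_1} y_2^{1_{X_2X_1}},\; 0_{X_1}^{1_{X_1X_2}} \odot_{X_2} y_2 \oplus_{X_2} 1_{X_2} \odot_{X_2} y_1^{1_{X_1X_2}}\Big) \\
&= \Big(\ominus_{X_1} y_2^{1_{X_2X_1}},\; y_1^{1_{X_1X_2}}\Big) = (z_1, z_2), \\
i \odot (z_1, z_2) &= \Big(\ominus_{X_1} z_2^{1_{X_2X_1}},\; z_1^{1_{X_1X_2}}\Big) = \Big(\ominus_{X_1} \big(y_1^{1_{X_1X_2}}\big)^{1_{X_2X_1}},\; \big(\ominus_{X_1} y_2^{1_{X_2X_1}}\big)^{1_{X_1X_2}}\Big) \\
&= \big(\ominus_{X_1} y_1,\; \ominus_{X_2} y_2\big) = i \odot i \odot (y_1, y_2) = \ominus (y_1, y_2).
\end{align*}
Thinking of the plane as a representation of complex numbers, we identify real and imaginary parts as follows:
$$\Re x = \big(x_1, 0_{X_2}\big), \qquad \Im x = \big(x_2^{1_{X_2X_1}}, 0_{X_2}\big), \qquad i\,\Im x = i \odot \big(x_2^{1_{X_2X_1}}, 0_{X_2}\big) = \big(0_{X_1}, x_2\big),$$
(the imaginary part is also real!). Decomposition of a general complex $x$ into its real and imaginary parts can be expressed in the usual way by means of addition,
\begin{align*}
x = \Re x \oplus i\,\Im x &= \big(x_1, 0_{X_2}\big) \oplus i \odot \big(x_2^{1_{X_2X_1}}, 0_{X_2}\big) \\
&= \big(x_1, 0_{X_2}\big) \oplus \Big(\ominus_{X_1} 0_{X_2}^{1_{X_2X_1}},\; \big(x_2^{1_{X_2X_1}}\big)^{1_{X_1X_2}}\Big) \\
&= \big(x_1 \oplus_{X_1} 0_{X_1},\; 0_{X_2} \oplus_{X_2} x_2\big) = (x_1, x_2).
\end{align*}
Complex conjugation reads
\begin{align*}
x^* = \Re x \ominus i\,\Im x &= \big(x_1, 0_{X_2}\big) \ominus i \odot \big(x_2^{1_{X_2X_1}}, 0_{X_2}\big) \\
&= \big(x_1, 0_{X_2}\big) \ominus \Big(\ominus_{X_1} 0_{X_2}^{1_{X_2X_1}},\; \big(x_2^{1_{X_2X_1}}\big)^{1_{X_1X_2}}\Big) \\
&= \big(x_1 \oplus_{X_1} 0_{X_1},\; 0_{X_2} \ominus_{X_2} x_2\big) = \big(x_1,\; \ominus_{X_2} x_2\big).
\end{align*}
Modulus squared is real,
\begin{align*}
x \odot x^* &= (x_1, x_2) \odot \big(x_1, \ominus_{X_2} x_2\big) \\
&= \Big(x_1^{2_{X_1}} \oplus_{X_1} x_2^{2_{X_2X_1}},\; \ominus_{X_2} x_1^{1_{X_1X_2}} \odot_{X_2} x_2 \oplus_{X_2} x_2 \odot_{X_2} x_1^{1_{X_1X_2}}\Big) \\
&= \Big(x_1^{2_{X_1}} \oplus_{X_1} x_2^{2_{X_2X_1}},\; 0_{X_2}\Big).
\end{align*}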
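The definition of addition is obvious, but let us take a closer look at multiplication. Recalling that $x^{1_{X_2X_1}} = f_{X_1}^{-1} \circ f_{X_2}(x)$, $x^{1_{X_1X_2}} = f_{X_2}^{-1} \circ f_{X_1}(x)$, we rephrase real and imaginary parts of the product as
\begin{align*}
(x \odot y)_1 &= f_{X_1}^{-1}\Big(f_{X_1}(x_1)\,f_{X_1}(y_1) - f_{X_1}\big(x_2^{1_{X_2X_1}}\big)\,f_{X_1}\big(y_2^{1_{X_2X_1}}\big)\Big) \\
&= f_{X_1}^{-1}\big(f_{X_1}(x_1)\,f_{X_1}(y_1) - f_{X_2}(x_2)\,f_{X_2}(y_2)\big) = f_{X_1}^{-1}\big(\Re(\tilde{x}\tilde{y})\big), \\
(x \odot y)_2 &= f_{X_2}^{-1}\Big(f_{X_2}\big(x_1^{1_{X_1X_2}}\big)\,f_{X_2}(y_2) + f_{X_2}(x_2)\,f_{X_2}\big(y_1^{1_{X_1X_2}}\big)\Big) \\
&= f_{X_2}^{-1}\big(f_{X_1}(x_1)\,f_{X_2}(y_2) + f_{X_2}(x_2)\,f_{X_1}(y_1)\big) = f_{X_2}^{-1}\big(\Im(\tilde{x}\tilde{y})\big),
\end{align*}
where $\tilde{x} = f_{X_1}(x_1) + i f_{X_2}(x_2) = \tilde{x}_1 + i\tilde{x}_2$, etc. Analogously,
\begin{align*}
(x \oplus y)_1 &= f_{X_1}^{-1}\big(f_{X_1}(x_1) + f_{X_1}(y_1)\big) = f_{X_1}^{-1}\big(\Re(\tilde{x} + \tilde{y})\big), \\
(x \oplus y)_2 &= f_{X_2}^{-1}\big(f_{X_2}(x_2) + f_{X_2}(y_2)\big) = f_{X_2}^{-1}\big(\Im(\tilde{x} + \tilde{y})\big).
\end{align*}
Accordingly,
\begin{align*}
x \odot y &= \Big(f_{X_1}^{-1}\big(\Re(\tilde{x}\tilde{y})\big),\; f_{X_2}^{-1}\big(\Im(\tilde{x}\tilde{y})\big)\Big), \\
x \oslash y &= \Big(f_{X_1}^{-1}\big(\Re(\tilde{x}/\tilde{y})\big),\; f_{X_2}^{-1}\big(\Im(\tilde{x}/\tilde{y})\big)\Big), \\
x \oplus y &= \Big(f_{X_1}^{-1}\big(\Re(\tilde{x} + \tilde{y})\big),\; f_{X_2}^{-1}\big(\Im(\tilde{x} + \tilde{y})\big)\Big), \\
x \ominus y &= \Big(f_{X_1}^{-1}\big(\Re(\tilde{x} - \tilde{y})\big),\; f_{X_2}^{-1}\big(\Im(\tilde{x} - \tilde{y})\big)\Big).
\end{align*}
One can still further simplify operations with complex numbers. Note that
$$(x_1, x_2) = \Big(f_{X_1}^{-1}\big(f_{X_1}(x_1)\big),\; f_{X_2}^{-1}\big(f_{X_2}(x_2)\big)\Big) = \Big(f_{X_1}^{-1}(\Re\tilde{x}),\; f_{X_2}^{-1}(\Im\tilde{x})\Big),$$
so that
$$\Big(f_{X_1}^{-1}(\Re\tilde{x}),\; f_{X_2}^{-1}(\Im\tilde{x})\Big) \odot \Big(f_{X_1}^{-1}(\Re\tilde{y}),\; f_{X_2}^{-1}(\Im\tilde{y})\Big) = \Big(f_{X_1}^{-1}\big(\Re(\tilde{x}\tilde{y})\big),\; f_{X_2}^{-1}\big(\Im(\tilde{x}\tilde{y})\big)\Big), \tag{A30}$$
$$\Big(f_{X_1}^{-1}(\Re\tilde{x}),\; f_{X_2}^{-1}(\Im\tilde{x})\Big) \oplus \Big(f_{X_1}^{-1}(\Re\tilde{y}),\; f_{X_2}^{-1}(\Im\tilde{y})\Big) = \Big(f_{X_1}^{-1}\big(\Re(\tilde{x} + \tilde{y})\big),\; f_{X_2}^{-1}\big(\Im(\tilde{x} + \tilde{y})\big)\Big), \tag{A31}$$
which are, perhaps, the most convenient forms of generalized complex arithmetic. The operations $\oplus$ and $\odot$ are denoted by identical symbols no matter which arithmetic is used. This will not lead to inconsistencies and should be clear from the context. The standard Diophantine complex numbers are denoted either as $x = x_1 + i x_2$ or $x = (x_1, x_2)$ and it is understood that the two notations mean the same, i.e., it is allowed to write $x_1 + i x_2 = (x_1, x_2)$.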
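The rule "map to ordinary complex numbers, compute there, map back" translates directly into code. The following is a minimal sketch in which the two bijections $f_{X_1}(x) = x^3$ and $f_{X_2}(x) = \sinh x$ are hypothetical choices of mine, used only to show that real and imaginary parts may carry different arithmetics; none of the function names come from the original paper.

```python
import math


# Hypothetical bijections R -> R for the real and imaginary parts (illustrative only).
def f1(x):
    return x ** 3


def f1_inv(r):
    return math.copysign(abs(r) ** (1.0 / 3.0), r)


def f2(x):
    return math.sinh(x)


def f2_inv(r):
    return math.asinh(r)


def to_tilde(x):
    """Pair (x1, x2) -> ordinary complex number f1(x1) + i f2(x2)."""
    return complex(f1(x[0]), f2(x[1]))


def from_tilde(z):
    """Ordinary complex number -> pair (f1^{-1}(Re z), f2^{-1}(Im z))."""
    return (f1_inv(z.real), f2_inv(z.imag))


def add(x, y):
    """Generalized addition of pairs."""
    return from_tilde(to_tilde(x) + to_tilde(y))


def mul(x, y):
    """Generalized multiplication of pairs."""
    return from_tilde(to_tilde(x) * to_tilde(y))


ONE = from_tilde(1 + 0j)         # (1_{X1}, 0_{X2})
I = from_tilde(1j)               # (0_{X1}, 1_{X2})
MINUS_ONE = from_tilde(-1 + 0j)  # additive inverse of ONE


def close(x, y, tol=1e-9):
    """Componentwise comparison of two pairs."""
    return abs(x[0] - y[0]) < tol and abs(x[1] - y[1]) < tol


if __name__ == "__main__":
    x, y, z = (0.5, 1.2), (-0.3, 0.7), (1.1, -0.4)
    print("i * i         =", mul(I, I), " expected", MINUS_ONE)
    print("x * (y + z)   =", mul(x, add(y, z)))
    print("x * y + x * z =", add(mul(x, y), mul(x, z)))
```

Associativity, commutativity and distributivity are inherited from ordinary complex arithmetic, because every operation is conjugated by the same pair of bijections.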
Lemma A2. 
$\oplus$ and $\odot$ are associative and commutative, and $\odot$ is distributive with respect to $\oplus$.
The proof is an immediate consequence of (A30) and (A31).

Appendix A.5. Complex-Valued Scalar Product via Non-Newtonian Integration

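A complex-valued function is defined by the diagram
$$
\begin{array}{ccc}
X & \xrightarrow{\ A\ } & Y = Y_1 \times Y_2 \\
{\scriptstyle f_X}\downarrow & & \downarrow{\scriptstyle f_Y = f_{Y_1} \times f_{Y_2}} \\
\mathbb{R} & \xrightarrow{\ \tilde{A}\ } & \mathbb{R}^2
\end{array}
\tag{A32}
$$
Consider two functions $A, B : X \to Y_1 \times Y_2$ associated with diagrams of the form (A32). Let $\tilde{A}(r) = \tilde{A}_1(r) + i\tilde{A}_2(r)$, $\tilde{B}(r) = \tilde{B}_1(r) + i\tilde{B}_2(r)$, and
$$\langle \tilde{A}|\tilde{B}\rangle = \int_{-f_X(T)/2}^{f_X(T)/2} \overline{\tilde{A}(r)}\,\tilde{B}(r)\,dr.$$
$f_X(T)$ can be finite or infinite. The scalar product of two functions $A, B : X \to Y_1 \times Y_2$ is defined as
$$\langle A|B\rangle = \int_{\ominus_X T \oslash_X 2_X}^{T \oslash_X 2_X} A(x)^* \odot B(x)\,Dx.$$
The same symbol of the scalar product for both $\langle \tilde{A}|\tilde{B}\rangle$ and $\langle A|B\rangle$ will not lead to ambiguities.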
Lemma A3. 
(Properties of the scalar product)
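$$
\begin{aligned}
\langle A | B \rangle &= \Big(f_{Y_1}^{-1}\big(\Re \langle \tilde{A} | \tilde{B} \rangle\big),\; f_{Y_2}^{-1}\big(\Im \langle \tilde{A} | \tilde{B} \rangle\big)\Big),\\
\langle A | B \rangle^* &= \langle B | A \rangle,\\
\langle A | B \oplus C \rangle &= \langle A | B \rangle \oplus \langle A | C \rangle,\\
\langle A | \lambda \odot B \rangle &= \lambda \odot \langle A | B \rangle.
\end{aligned}
$$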
Proof. 
The integrand
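$$
\begin{aligned}
A(x)^* \odot B(x) &= \big(A_1(x),\; \ominus_{Y_2} A_2(x)\big) \odot \big(B_1(x),\; B_2(x)\big)\\
&= \Big(f_{Y_1}^{-1}\big[f_{Y_1}(A_1(x))\, f_{Y_1}(B_1(x)) - f_{Y_2}(\ominus_{Y_2} A_2(x))\, f_{Y_2}(B_2(x))\big],\\
&\qquad f_{Y_2}^{-1}\big[f_{Y_1}(A_1(x))\, f_{Y_2}(B_2(x)) + f_{Y_2}(\ominus_{Y_2} A_2(x))\, f_{Y_1}(B_1(x))\big]\Big)\\
&= \Big(f_{Y_1}^{-1}\big[\tilde{A}_1(f_X(x))\, \tilde{B}_1(f_X(x)) + \tilde{A}_2(f_X(x))\, \tilde{B}_2(f_X(x))\big],\\
&\qquad f_{Y_2}^{-1}\big[\tilde{A}_1(f_X(x))\, \tilde{B}_2(f_X(x)) - \tilde{A}_2(f_X(x))\, \tilde{B}_1(f_X(x))\big]\Big)\\
&= \Big(f_{Y_1}^{-1}\big[\tilde{C}_1(f_X(x))\big],\; f_{Y_2}^{-1}\big[\tilde{C}_2(f_X(x))\big]\Big) = C(x).
\end{aligned}
$$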
So,
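$$
\begin{aligned}
\langle A | B \rangle &= \int_{\ominus_X T \oslash_X 2_X}^{T \oslash_X 2_X} C(x)\, \mathrm{D}x\\
&= \bigg(f_{Y_1}^{-1}\Big(\int_{f_X(\ominus_X T \oslash_X 2_X)}^{f_X(T \oslash_X 2_X)} \tilde{C}_1(r)\, \mathrm{d}r\Big),\; f_{Y_2}^{-1}\Big(\int_{f_X(\ominus_X T \oslash_X 2_X)}^{f_X(T \oslash_X 2_X)} \tilde{C}_2(r)\, \mathrm{d}r\Big)\bigg)\\
&= \bigg(f_{Y_1}^{-1}\Big(\int_{-f_X(T)/2}^{f_X(T)/2} \tilde{C}_1(r)\, \mathrm{d}r\Big),\; f_{Y_2}^{-1}\Big(\int_{-f_X(T)/2}^{f_X(T)/2} \tilde{C}_2(r)\, \mathrm{d}r\Big)\bigg)\\
&= \bigg(f_{Y_1}^{-1}\Big(\int_{-f_X(T)/2}^{f_X(T)/2} \Re\big(\overline{\tilde{A}(r)}\, \tilde{B}(r)\big)\, \mathrm{d}r\Big),\; f_{Y_2}^{-1}\Big(\int_{-f_X(T)/2}^{f_X(T)/2} \Im\big(\overline{\tilde{A}(r)}\, \tilde{B}(r)\big)\, \mathrm{d}r\Big)\bigg)\\
&= \Big(f_{Y_1}^{-1}\big(\Re \langle \tilde{A} | \tilde{B} \rangle\big),\; f_{Y_2}^{-1}\big(\Im \langle \tilde{A} | \tilde{B} \rangle\big)\Big).
\end{aligned}
$$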
By definition of complex conjugation
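$$
\begin{aligned}
\langle A | B \rangle^* &= \Big(f_{Y_1}^{-1}\big(\Re \langle \tilde{A} | \tilde{B} \rangle\big),\; \ominus_{Y_2} f_{Y_2}^{-1}\big(\Im \langle \tilde{A} | \tilde{B} \rangle\big)\Big)\\
&= \Big(f_{Y_1}^{-1}\big(\Re \langle \tilde{A} | \tilde{B} \rangle\big),\; f_{Y_2}^{-1}\big(-\Im \langle \tilde{A} | \tilde{B} \rangle\big)\Big)\\
&= \Big(f_{Y_1}^{-1}\big(\Re \langle \tilde{B} | \tilde{A} \rangle\big),\; f_{Y_2}^{-1}\big(\Im \langle \tilde{B} | \tilde{A} \rangle\big)\Big) = \langle B | A \rangle.
\end{aligned}
$$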
The remaining two properties follow from associativity and distributivity of the arithmetic operations, supplemented by linearity of the integral. □
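The first two properties of Lemma A3 can be checked numerically. The sketch below evaluates $\langle \tilde{A} | \tilde{B} \rangle$ by a midpoint rule and then maps its real and imaginary parts back through $f_{Y_1}^{-1}$ and $f_{Y_2}^{-1}$. The bijections (a cube for $Y_1$, $\sinh$ for $Y_2$), the test functions `A_t` and `B_t`, the value `half_width` standing in for $f_X(T)/2$, and all helper names are assumptions made for illustration only.

```python
import math

# Assumed bijections for the two components of Y (illustrative only).
def f_Y1_inv(r):
    # Inverse of f_Y1(y) = y**3, i.e., the real cube root.
    return math.copysign(abs(r) ** (1.0 / 3.0), r)

f_Y2, f_Y2_inv = math.sinh, math.asinh

def inner_tilde(A_t, B_t, half_width, n=2000):
    """Midpoint-rule estimate of the ordinary integral of conj(A~(r)) B~(r)."""
    h = 2.0 * half_width / n
    total = 0j
    for k in range(n):
        r = -half_width + (k + 0.5) * h
        total += A_t(r).conjugate() * B_t(r)
    return total * h

def inner(A_t, B_t, half_width):
    """Generalized scalar product as a pair, following the first property."""
    z = inner_tilde(A_t, B_t, half_width)
    return (f_Y1_inv(z.real), f_Y2_inv(z.imag))

def conj(pair):
    """Generalized conjugation: negate the second component in Y2's arithmetic."""
    return (pair[0], f_Y2_inv(-f_Y2(pair[1])))

# Hypothetical test functions, given directly in their "tilde" form.
def A_t(r):
    return complex(1.0, r)

def B_t(r):
    return complex(r * r, 1.0 + r)

half_width = 1.0  # plays the role of f_X(T)/2

ab = inner(A_t, B_t, half_width)
ba = inner(B_t, A_t, half_width)
print("<A|B>  =", ab)
print("<A|B>* =", conj(ab))
print("<B|A>  =", ba)
```

For these test functions the ordinary integral can be done by hand and equals $4/3 + 2i$, so the two components of `ab` should approximate the cube root of $4/3$ and $\operatorname{arsinh} 2$, respectively.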

Appendix A.6. Continuous Transition Between Two Levels of the Hierarchy

There exists a simple generalization of the hierarchies, allowing for a continuous transition between the levels. It is based on the following
Lemma A4. 
Consider a collection (finite or not) of parameters $\lambda_j$, $\sum_j \lambda_j = 1$, and a collection of bijections $g_j$ that satisfy Lemma 1. Then $g = \sum_j \lambda_j g_j$ also satisfies Lemma 1.
Proof. 
Each $g_j$ can be written as
$$
g_j(p) = \frac{1}{2} + h_j\Big(p - \frac{1}{2}\Big),
$$
where $h_j(-x) = -h_j(x)$, so
$$
g(p) = \sum_j \lambda_j g_j(p) = \underbrace{\frac{1}{2} \sum_j \lambda_j}_{1/2} + \underbrace{\sum_j \lambda_j h_j\Big(p - \frac{1}{2}\Big)}_{h(p - 1/2)} .
$$
The function $h(x) = \sum_j \lambda_j h_j(x)$ is odd, as a linear combination of odd functions. □
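The argument of the proof can be illustrated numerically. The sketch below mixes $g(p) = \sin^2\frac{\pi}{2}p$ with the identity map and checks that the resulting $h(x) = g(x + 1/2) - 1/2$ is odd. Lemma 1 itself is not restated in this appendix, so the code tests only the odd-$h$ representation used in the proof; the weights and the helper names `mix` and `h_of` are illustrative assumptions.

```python
import math

def g_quantum(p):
    """g(p) = sin^2(pi p / 2), the quantum choice discussed in the paper."""
    return math.sin(math.pi * p / 2.0) ** 2

def g_identity(p):
    """The trivial bijection g(p) = p."""
    return p

def mix(weights, maps):
    """Return g = sum_j lambda_j g_j for weights that sum to one."""
    assert abs(sum(weights) - 1.0) < 1e-12
    def g(p):
        return sum(w * m(p) for w, m in zip(weights, maps))
    return g

def h_of(g):
    """Return h such that g(p) = 1/2 + h(p - 1/2)."""
    return lambda x: g(x + 0.5) - 0.5

# Hypothetical weights; any choice summing to one can be tried here.
g = mix([0.3, 0.7], [g_quantum, g_identity])
h = h_of(g)

grid = [k / 100.0 for k in range(51)]          # x in [0, 1/2]
worst = max(abs(h(-x) + h(x)) for x in grid)   # zero for an odd function
print("max |h(-x) + h(x)| =", worst)
```

Oddness of $h$ is equivalent to $g(p) + g(1 - p) = 1$, which is why the same check can be phrased directly in terms of $g$.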
As a side remark, let us note that $g(p) = \sin^2\frac{\pi}{2}p$ satisfies all the concavity and monotonicity properties required for the existence of non-integer iterations $g^r$, $r \notin \mathbb{Z}$ (cf. [79], Chapter XV). The problem is intriguing as an alternative possibility of continuously switching between quantum and subquantum levels of the hierarchy (cf. the discussion of zero-multiplier iterative roots in [80], Chapter 11.5).
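The integer iterates that such non-integer iterations would interpolate are easy to explore numerically. The sketch below computes $g^k$ for positive and negative integer $k$, using the inverse $g^{-1}(p) = \frac{2}{\pi}\arcsin\sqrt{p}$ on $[0, 1]$; it does not attempt to construct fractional iterates. The function names `g_inv` and `iterate` and the sample points are illustrative assumptions.

```python
import math

def g(p):
    """g(p) = sin^2(pi p / 2) on [0, 1]."""
    return math.sin(math.pi * p / 2.0) ** 2

def g_inv(p):
    """Inverse of g on [0, 1]; clamped to guard against rounding above 1."""
    return min(1.0, (2.0 / math.pi) * math.asin(math.sqrt(p)))

def iterate(p, k):
    """k-th iterate of g at p; negative k applies the inverse |k| times."""
    step = g if k >= 0 else g_inv
    for _ in range(abs(k)):
        p = step(p)
    return p

for k in (1, 2, 5, 15):
    print(k, iterate(0.4, k), iterate(0.6, k))      # drift toward 0 and 1
for k in (-1, -2, -5, -15):
    print(k, iterate(0.01, k), iterate(0.99, k))    # drift toward 1/2
```

Positive iterates push every $p \neq 1/2$ toward 0 or 1, while negative iterates pull all interior points toward $1/2$, in line with the step-like and flattened curves described in the caption of Figure 2.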

References

  1. Czachor, M.; Doebner, H.-D. Correlation experiments in nonlinear quantum mechanics. Phys. Lett. A 2002, 301, 139–152.
  2. Gisin, N. Stochastic quantum dynamics and relativity. Helv. Phys. Acta 1989, 62, 363–371.
  3. Gisin, N. Weinberg’s non-linear quantum mechanics and supraluminal communications. Phys. Lett. A 1990, 143, 1–2.
  4. Polchinski, J. Weinberg’s nonlinear quantum mechanics and the Einstein-Podolsky-Rosen paradox. Phys. Rev. Lett. 1991, 66, 397–400.
  5. Jordan, T.F. Reconstructing a nonlinear dynamical framework for testing quantum mechanics. Ann. Phys. 1993, 225, 83–113.
  6. Jaynes, E.T. Probability Theory: The Logic of Science; Bretthorst, G.L., Ed.; Cambridge University Press: Cambridge, UK, 2003.
  7. Aczél, J. Lectures on Functional Equations and Their Applications; Academic Press: New York, NY, USA, 1966.
  8. Cox, R.T. The Algebra of Probable Inference; Johns Hopkins University Press: Baltimore, MD, USA, 1961.
  9. Penrose, R.; Rindler, W. Spinors and Space-Time: Volume 1: Two-Spinor Calculus and Relativistic Fields; Cambridge University Press: Cambridge, UK, 1984.
  10. Czachor, M. Unifying aspects of generalized calculus. Entropy 2020, 22, 1180.
  11. Cormen, T.H.; Leiserson, C.E.; Rivest, R.L.; Stein, C. Introduction to Algorithms, 4th ed.; MIT Press: Cambridge, MA, USA; McGraw-Hill: New York, NY, USA, 2022.
  12. Czachor, M. Arithmetic loophole in Bell’s Theorem: Overlooked threat to entangled-state quantum cryptography. Acta Phys. Polon. A 2021, 139, 70–83.
  13. Czachor, M.; Nalikowski, K. Imitating quantum probabilities: Beyond Bell’s theorem and Tsirelson bounds. Found. Sci. 2024, 29, 281–305.
  14. Czachor, M. Contra Bellum: Bell’s theorem as a confusion of languages. Acta Phys. Polon. A 2023, 143, S158–S170.
  15. Cirel’son, B.S. Quantum generalizations of Bell’s inequality. Lett. Math. Phys. 1980, 4, 93–100.
  16. Grossman, M.; Katz, R. Non-Newtonian Calculus; Lee Press: Pigeon Cove, MA, USA, 1972.
  17. Grossman, M. The First Nonlinear System of Differential and Integral Calculus; Mathco: Rockport, TX, USA, 1979.
  18. Grossman, M. Bigeometric Calculus: A System with Scale-Free Derivative; Archimedes Foundation: Rockport, TX, USA, 1983.
  19. Pap, E. g-Calculus. Zb. Rad. Prirod.-Fak. Ser. Mat. 1993, 23, 145–156.
  20. Pap, E. Generalized real analysis and its applications. Int. J. Approx. Reason. 2008, 47, 368–386.
  21. Grabisch, M.; Marichal, J.-L.; Mesiar, R.; Pap, E. Aggregation Functions; Cambridge University Press: Cambridge, UK, 2009.
  22. Burgin, M.; Czachor, M. Non-Diophantine Arithmetics in Mathematics, Physics, and Psychology; World Scientific: Singapore, 2020.
  23. Czachor, M. Waves along fractal coastlines: From fractal arithmetic to wave equations. Acta Phys. Polon. B 2019, 50, 813–831.
  24. Zimmermann, H.-J. Fuzzy Set Theory—And Its Applications, 3rd ed.; Kluwer: Boston, MA, USA, 1996.
  25. Dubois, D.; Prade, H. Towards fuzzy differential calculus. Part 1: Integration of fuzzy mappings. Fuzzy Sets Syst. 1982, 8, 1–17.
  26. Dubois, D.; Prade, H. Towards fuzzy differential calculus. Part 2: Integration on fuzzy intervals. Fuzzy Sets Syst. 1982, 8, 105–116.
  27. Dubois, D.; Prade, H. Towards fuzzy differential calculus. Part 3: Differentiation. Fuzzy Sets Syst. 1982, 8, 225–233.
  28. Zhang, D.; Mesiar, R.; Pap, E. Pseudo-integral and generalized Choquet integral. Fuzzy Sets Syst. 2022, 446, 193–221.
  29. Czachor, M.; Naudts, J. Thermostatistics based on Kolmogorov-Nagumo averages: Unifying framework for extensive and nonextensive generalizations. Phys. Lett. A 2002, 298, 369–374.
  30. Jizba, P.; Arimitsu, T. Observability of Rényi’s entropy. Phys. Rev. E 2004, 69, 026128.
  31. Jizba, P.; Arimitsu, T. The world according to Rényi: Thermodynamics of fractal systems. AIP Conf. Proc. 2001, 597, 341.
  32. Jizba, P.; Korbel, J. When Shannon and Khinchin meet Shore and Johnson: Equivalence of information theory and statistical inference axiomatics. Phys. Rev. E 2020, 101, 042126.
  33. Rényi, A. Some fundamental questions of information theory. MTA III. Oszt. Közl. 1960, 10, 251–282, Reprinted in Selected Papers of Alfréd Rényi, Turán, P., Ed.; Akadémiai Kiadó: Budapest, Hungary, 1976.
  34. Kolmogorov, A.N. Sur la notion de la moyenne. Atti. Acad. Naz. Lincei. Rend. 1930, 12, 388–391, Reprinted in Selected Works of A.N. Kolmogorov. Vol.1. Mathematics and Mechanics; Tikhomirov, V.M., Ed.; Kluwer: Dordrecht, The Netherlands, 1991.
  35. Nagumo, M. Über eine Klasse der Mittelwerte. Jpn. J. Math. 1930, 7, 71–79, Reprinted in Mitio Nagumo Collected Papers; Yamaguti, M., Nirenberg, L., Mizohata, S., Sibuya, Y., Eds.; Springer: Tokyo, Japan, 1993.
  36. Naudts, J. Generalized Thermostatistics; Springer: London, UK, 2011.
  37. Shannon, C.E. A mathematical theory of communication. Bell Syst. Tech. J. 1948, 27, 379–423, 623–653.
  38. Bell, J.S. On the Einstein-Podolsky-Rosen paradox. Physics 1964, 1, 195.
  39. Clauser, J.F.; Horne, M.A. Experimental consequences of objective local theories. Phys. Rev. D 1974, 10, 526.
  40. Fubini, G. Sulle metriche definite da una forma Hermitiana. Atti Istit. Veneto 1904, 63, 502–513.
  41. Study, E. Kürzeste Wege im komplexen Gebiet. Math. Ann. 1905, 60, 321–378.
  42. Field, T.R.; Hughston, L.P. The geometry of coherent states. J. Math. Phys. 1999, 40, 2568–2583.
  43. Brody, D.C.; Hughston, L.P. Geometric quantum mechanics. J. Geom. Phys. 2001, 38, 19–53.
  44. Bengtsson, I.; Życzkowski, K. Geometry of Quantum States: An Introduction to Quantum Entanglement; Cambridge University Press: Cambridge, UK, 2006.
  45. Chruściński, D. Geometric aspects of quantum mechanics and quantum entanglement. J. Phys. Conf. Ser. 2006, 30, 9.
  46. Kunc, V.; Kléma, J. Three decades of activations: A comprehensive survey of 400 activation functions for neural networks. arXiv 2024, arXiv:2402.09092.
  47. Aerts, D.; Czachor, M.; Kuna, M. Fourier transforms on Cantor sets: A study in non-Diophantine arithmetic and calculus. Chaos Solitons Fractals 2016, 91, 461–468.
  48. Czachor, M. Non-Newtonian mathematics instead of non-Newtonian physics: Dark matter and dark energy from a mismatch of arithmetics. Found. Sci. 2021, 26, 75–95.
  49. Czachor, M. Relativity of arithmetic as a fundamental symmetry of physics. Quantum Stud. Math. Found. 2016, 3, 123–133.
  50. Maslov, V.P. On a new superposition principle for optimization problems. Uspekhi Mat. Nauk 1987, 42, 43. (In Russian)
  51. Volterra, V.; Hostinský, B. Opérations Infinitésimales Linéaires. Applications Aux équations Différentielles et Fonctionnelles; Gauthier-Villars: Paris, France, 1938.
  52. Rashevsky, P.K. On the dogma of the natural numbers. Uspekhi Mat. Nauk. 1973, 28, 243–246. (In Russian)
  53. Burgin, M.S. Nonclassical models of the natural numbers. Uspekhi Mat. Nauk 1977, 32, 209–210. (In Russian)
  54. Burgin, M.S. Non-Diophantine Arithmetics, or Is It Possible That 2 + 2 Is Not Equal to 4? Ukrainian Academy of Information Sciences: Kiev, Ukraine, 1997. (In Russian)
  55. Burgin, M. Introduction to projective arithmetics. arXiv 2010, arXiv:1010.3287.
  56. Burgin, M.; Meissner, G. 1 + 1 = 3: Synergy arithmetics in economics. Appl. Math. 2017, 8, 133–134.
  57. Benioff, P. Towards a coherent theory of physics and mathematics. Found. Phys. 2002, 32, 989–1029.
  58. Benioff, P. Towards a coherent theory of physics and mathematics. The theory-experiment connection. Found. Phys. 2005, 35, 1825–1856.
  59. Benioff, P. Fiber bundle description of number scaling in gauge theory and geometry. Quant. Stud. Math. Found. 2015, 2, 289–313.
  60. Benioff, P. Space and time dependent scaling of numbers in mathematical structures: Effects on physical and geometric quantities. Quant. Inf. Proc. 2016, 15, 1081–1102.
  61. Benioff, P. Effects of a scalar scaling field on quantum mechanics. Quant. Inf. Proc. 2016, 15, 3005–3034.
  62. Lad, F. Embedding Bayes’ theorem in general learning rules: Connections between idealized behaviour and empirical research on learning. Brit. J. Math. Stat. Psychol. 1978, 31, 113–125.
  63. Childers, D.G.; Skinner, D.P.; Kemerait, R.C. The cepstrum: A guide to processing. Proc. IEEE 1977, 65, 1428–1443.
  64. Oppenheim, A.V.; Schafer, R.W. From frequency to quefrency: A history of the cepstrum. IEEE Signal Process. Mag. 2004, 21, 95.
  65. Parvate, A.; Gangal, A.D. Calculus on fractal subsets of real line. (I) Formulation. Fractals 2009, 17, 53.
  66. Parvate, A.; Gangal, A.D. Calculus on fractal subsets of real line. (II) Conjugacy with ordinary calculus. Fractals 2011, 19, 271.
  67. Golmankhaneh, A.K.; Baleanu, D. New derivatives on the fractal subset of real line. Entropy 2016, 18, 1.
  68. Golmankhaneh, A.K.; Baleanu, D. Non-local integrals and derivatives on fractal sets with applications. Open Phys. 2016, 14, 542–548.
  69. Golmankhaneh, A.K.; Tunc, C. On the Lipschitz condition in the fractal calculus. Chaos Soliton Fract. 2017, 95, 140–147.
  70. Golmankhaneh, A.K. Fractal Calculus and Its Applications: Fα-Calculus; World Scientific: Singapore, 2022.
  71. Tsallis, C.; Mendes, R.; Plastino, A. The role of constraints within generalized nonextensive statistics. Physica A 1998, 261, 543–554.
  72. Touchette, H. When is a quantity additive, and when is it extensive? Physica A 2002, 305, 84–88.
  73. Nivanen, L.; Le Mehaute, A.; Wang, Q.A. Generalized algebra within a nonextensive statistics. Rep. Math. Phys. 2003, 52, 437–444.
  74. Kaniadakis, G. Nonlinear kinetics underlying generalized statistics. Physica A 2001, 296, 405.
  75. Kaniadakis, G. Theoretical foundations and mathematical formalism of the power-law tailed statistical distributions. Entropy 2013, 15, 3983–4010.
  76. Meginniss, J.R. Non-Newtonian calculus applied to probability, utility, and Bayesian analysis. In Proceedings of the Business and Economic Statistics Section; American Statistical Association: Washington, DC, USA, 1980; pp. 405–410.
  77. Carr, P.; Cherubini, U. Option pricing generators. Front. Math. Financ. 2023, 2, 150–169.
  78. Carr, P.; Cirillo, P. A pseudo-analytic generalization of the memoryless property for continuous random variables and its use in pricing contingent claims. R. Soc. Open Sci. 2024, 11, 231690.
  79. Kuczma, M. Functional Equations in a Single Variable; Polish Scientific Publishers: Warszawa, Poland, 1968.
  80. Kuczma, M.; Choczewski, B.; Ger, R. Iterative Functional Equations; Cambridge University Press: Cambridge, UK, 1990.
Figure 1. The relation between $\alpha$ and $\theta$ as given by (51). There are three fixed points: $\alpha(0) = 0$, $\alpha(\pi/2) = \pi/2$, $\alpha(\pi) = \pi$. Here, $\alpha$ is the geometric angle between the two Stern–Gerlach devices, whereas $\theta$ is a hidden parameter.
Figure 2. A total of 1, 2, 5 and 15 iterations of $g(p) = \sin^2\frac{\pi}{2}p$ (upper plots). All the curves cross at $p = 1/2$ and are of the sigmoidal form, analogously to activation functions occurring in learning algorithms. Is it just a coincidence, or are there deeper connections to the problem of measurement, learning, or consciousness? Iterates $g^k$ with $k > 15$ are practically indistinguishable within the precision of the plot: they all look like the red step function. An analogous phenomenon occurs for the negative iterates, $k = -1, -2, -5, -15$, but here almost all events described by $g^{-15}(p)$ are equally probable, hence indistinguishable for level-0 observers (lower plots). Effectively, even though the number of levels is infinite, the distinguishable ones are restricted to a finite “band” $k_{\min} \le k \le k_{\max}$. Of course, the Copernican aspect of the hierarchy means that the same happens in a neighborhood of any $l$, and not only $l = 0$ depicted here.
Figure 3. The upper diagram: An $\mathbb{R}_l$-valued branch of a binary tree of conditional probabilities. This is how one can include events with more results than just two. Assuming independent events and the same value of all $k_j$ (the lower diagram), we can derive a hierarchical analog of the Bernoulli law of large numbers. Laws of large numbers are the places where theory and experiment meet.
Figure 4. Hierarchical law of large numbers in action. Upper bound on the probability of disagreement between theory and experiment in $N$ tosses of a symmetric coin for four different arithmetics $\mathbb{R}_l$ of the observer. Plot of the right-hand side of (158) with $\varepsilon = 0.1$, for the four iterates $g^l$, $l = 1, 2, 3, 4$, of $g(x) = \sin^2\frac{\pi}{2}x$. The number of coin tosses is $25 \le N \le 75$. Plots are made in the arithmetic $\mathbb{R}_0$, implicitly assumed in Wolfram Mathematica 14.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Czachor, M. On the Relativity of Quantumness as Implied by Relativity of Arithmetic and Probability. Entropy 2025, 27, 922. https://doi.org/10.3390/e27090922
