Inflation propensity of Collatz orbits: a new proof-of-work for blockchain applications

Abstract: Cryptocurrencies like Bitcoin rely on a proof-of-work system to validate transactions and prevent attacks or double-spending. Reliance on a few standard proofs-of-work such as hashcash, Ethash or Scrypt increases the systemic risk of the whole crypto-economy. Diversification of proofs-of-work is a strategy to counter potential threats to the stability of electronic payment systems. To this end, another proof-of-work is introduced: it is based on a new metric associated with the algorithmically undecidable Collatz algorithm: the inflation propensity is defined as the cardinality of new maxima in a developing Collatz orbit. It is numerically verified that the distribution of inflation propensity slowly converges to a geometric distribution of parameter $0.714 \approx (\pi - 1)/3$ as the sample size increases. This pseudo-randomness opens the door to a new class of proofs-of-work based on congruential graphs.


Introduction
A decentralized electronic payment system relies on a ledger of transactions shared on a network.
The decentralization of a transaction ledger raises the question of security and integrity of the ledger.
In the original Bitcoin protocol, the problem of double-spending or alteration of the ledger is solved by the use of a blockchain, a system that requires proof-of-work by a network of computers to confirm transactions. In cryptography, intensive computation as proof-of-work allows one party to verify with little computational effort that a counterparty has spent a large amount of computational effort.
The concept was originally developed by [6] as a spam prevention technique. [16] used a proof-of-work based on [1] for Bitcoin. The protocol consists in finding a nonce value such that applying the SHA256 hashing algorithm to a combination of that nonce and a block of information gives a hash starting with a series of zeroes, so that it falls below a given target threshold. The idea behind the proof-of-work is that participants have an incentive to cooperate rather than to cheat because the computational power required to cheat is too large.
However, as cryptocurrencies became more popular and diverse, an over-reliance on mainstream proof-of-work protocols, such as hashcash, Ethash (Wood, 2014) or Scrypt (Percival, 2009), creates a new type of systemic risk in which a cryptographic breakdown would jeopardize the cryptocurrencies that rely on these standard proofs-of-work. Diversification of the proofs-of-work is a credible way to mitigate this systemic risk. Other types of proof-of-work have been designed, such as prime number verification (King, 2013), graph-theoretic proof-of-work (Tromp, 2015) or asymmetric proof-of-work based on the generalized birthday problem (Biryukov & Khovratovich, 2017).
Attacks on proofs-of-work could prevent new transactions or alter past ones. In financial markets, exchanges have the possibility to cancel trades in case of infrastructure breakdown or malfunction. However, a systemic failure of the proof-of-work system in decentralized cryptocurrency markets could mean the destruction of the whole history of transactions. Potential risks clouding the proof-of-work system include innovation in mathematics and cryptography that could compromise the existing proofs-of-work. Further diversification of proofs-of-work could help mitigate that systemic risk and improve the robustness of the nascent crypto-economy. However, additional threats to the cryptocurrency ecosystems could also come from technological innovation, such as, for instance, the introduction of quantum
computers. Post-quantum algorithms are currently being developed in the field of security, see e.g. [2]. In particular, [10] propose a quantum-safe blockchain that utilizes quantum key distribution. The application presented in the following sections does not address the problems posed by a post-quantum world but suggests new ways to tackle immediate systemic threats by offering a proof-of-work based on properties of the Collatz algorithm.
In order to describe this algorithm, consider the following function from $\mathbb{N}_0$ to $\mathbb{N}_0$:
$$T(x) = \begin{cases} (3x + 1)/2 & \text{if } x \text{ is odd} \\ x/2 & \text{if } x \text{ is even} \end{cases}$$
Now, apply the following iterate of T:
$$T^0(x) = x, \qquad T^k(x) = T\left(T^{k-1}(x)\right), \quad k \geq 1$$
The Collatz conjecture states that $\forall x \in \mathbb{N}_0$, $\exists$ a finite $k$ such that $T^k(x) = 1$. [14] uses the following terminology: the "total stopping time" is defined as
$$\sigma_\infty(x) = \inf\{k : T^k(x) = 1\}$$
and the "stopping time" as
$$\sigma(x) = \inf\{k : T^k(x) < x\}$$
For instance, let us consider the case for x = 3:
$$T^0(3) = 3, \quad T^1(3) = 5, \quad T^2(3) = 8, \quad T^3(3) = 4, \quad T^4(3) = 2, \quad T^5(3) = 1$$
In this example, the Collatz sequence (also called "trajectory" or "forward orbit") is 3, 5, 8, 4, 2, 1 and $\sigma_\infty(3)$ equals 5 while $\sigma(3) = 4$. By definition, the value of $\sigma_\infty(x)$ depends on the starting point of the algorithm. For example, $\sigma_\infty(2^\alpha) = \alpha$ for any $\alpha \in \mathbb{N}_0$.
Analyzing the total stopping time $\forall x \in \mathbb{N}_0$ has proven challenging: the lack of clear patterns and the absence of an analytical shortcut to estimate $\sigma_\infty(x)$ have left practitioners with numerical methods to compute it and verify the conjecture. [18] verified computationally that the conjecture holds up to $x = 20 \times 2^{58}$. Current computational capabilities have allowed confirming the conjecture for very large numbers. For example, [8] introduced a GPU-based method to verify the Collatz algorithm. The authors could verify $1.31 \times 10^{12}$ 64-bit numbers per second.
A probabilistic approach is also a frequent workaround to justify the validity of the Collatz conjecture: assuming the function $T^k(x)$ is "random enough", [5] showed that half of the time, the next odd number in the sequence will be (3x + 1)/2, then 1/4 of the time it will be (3x + 1)/4, then 1/8 of the time it will be (3x + 1)/8 and so on, so that the average growth factor between consecutive odd terms is $(3/2)^{1/2}(3/4)^{1/4}(3/8)^{1/8}\cdots = 3/4 < 1$ as $k \to \infty$. These elements tend to indicate that $T^k(x)$ does not
diverge to infinity as k grows. Using [15] machines, [4] showed that a problem generalizing the Collatz problem is not algorithmically decidable. [12] extended the proof to show that this generalization is $\Pi^0_2$-complete. If the problem is algorithmically undecidable, then no information about the future inflation of the Collatz map is passed from one step k to the next step k + 1.
To explore that hypothesis and the properties of this "pseudo-randomness", let us define the inflation propensity of order K, $\xi(x, K)$, as the cardinality of the set of steps that lead to a number strictly larger than all previous numbers in the same sequence:
$$\xi(x, K) = \mathrm{card}\left\{k : T^k(x) > \max_{0 \leq j < k} T^j(x),\ k = 1, \dots, K\right\}$$
where $\xi(x, \sigma_\infty(x))$ is a particular case. For ease of notation: $\xi(x) \equiv \xi(x, \sigma_\infty(x))$.
In the above example of x = 3, $\xi(3) = 2$. Indeed, the set of numbers strictly larger than the previous maxima in the sequence is {5, 8}, so that $\xi(3) = \mathrm{card}\{5, 8\} = 2$. In the other example presented supra with $x = 2^\alpha$, $\xi(2^\alpha) = 0$ $\forall \alpha \in \mathbb{N}_0$ since no number in their sequences can be strictly larger than the initial one.
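The definitions above can be illustrated with a short Python sketch. The function names are illustrative choices, not taken from the paper's implementation, and the sketch assumes the shortcut map with T(x) = (3x + 1)/2 for odd x, which is consistent with the orbit 3, 5, 8, 4, 2, 1 given in the text.

```python
def collatz_step(x):
    """Shortcut Collatz map T: x/2 if x is even, (3x + 1)/2 if x is odd."""
    return x // 2 if x % 2 == 0 else (3 * x + 1) // 2


def orbit(x):
    """Return x, T(x), T^2(x), ... up to the first occurrence of 1."""
    seq = [x]
    while x != 1:
        x = collatz_step(x)
        seq.append(x)
    return seq


def total_stopping_time(x):
    """Smallest k such that T^k(x) = 1."""
    return len(orbit(x)) - 1


def stopping_time(x):
    """Smallest k such that T^k(x) < x (assumes x > 1)."""
    k, y = 0, x
    while y >= x:
        y = collatz_step(y)
        k += 1
    return k


def inflation_propensity(x):
    """Number of steps that reach a value strictly above every earlier value."""
    running_max, count = x, 0
    while x != 1:
        x = collatz_step(x)
        if x > running_max:
            running_max, count = x, count + 1
    return count


print(orbit(3))                      # [3, 5, 8, 4, 2, 1]
print(total_stopping_time(3))        # 5
print(stopping_time(3))              # 4
print(inflation_propensity(3))       # 2
print(inflation_propensity(2 ** 10)) # 0
```

The loops terminate only for starting values whose orbit reaches 1, which is what the Collatz conjecture asserts for every positive integer.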
This research paper investigates the distribution of $\xi(x)$, the inflation propensity, as a deterministic variable that resembles random behavior. If past maxima anywhere in the sequence are independent from new maxima later computed in that orbit, we should have that ξ

Inflation propensity
[13] describes the 3x + 1 conjecture as "a deterministic process that simulates random behaviour" and goes further to mention that the problem seems "structureless". [22] formally proves the non-regularity of the Collatz graph. As a visual illustration of this "structurelessness", the total stopping time for the first 1e6 natural numbers as a function of their value is presented in Figure 1.
The equally "structureless" empirical distribution of the total stopping time for the same numbers is presented in Figure 2. Because the Collatz graph is non-regular, its complexity gives rise to a pseudo-random behaviour. [17] and [11] explore similarities between the Collatz model and the following dynamical system:

where $b_3$ is a constant and $Y_k$ are IID (independent and identically distributed) Bernoulli random variables. The stochastic models predict that all orbits converge to a bounded set and that the total stopping time $\sigma_\infty(x)$ for the 3x + 1 map of a random starting point x is about 6.95212 log x steps as $x \to \infty$, with a normal distribution centered around that value. The authors point out that a suitable scaling limit for the trajectories is a geometric Brownian motion.
This approach is extended in the current research: the empirical distribution of $\xi(x)$ defined in (5) is presented in Figure 3. Let us assume that the number of new maxima in any given orbit is independent from the number of previously found maxima in that orbit. This would mean that for any random starting point $x > 4 \in \mathbb{N}_0$:
$$P\left(\xi(x) \geq m + n \mid \xi(x) \geq m\right) = P\left(\xi(x) \geq n\right), \quad m, n \in \mathbb{N}$$
Assuming such memorylessness naturally yields a geometric distribution of the inflation propensity. The longer the stopping time, the larger the orbits. Let us assume that the probability to reach a new maximum is memoryless for large orbits and that the density $f(\xi(x) = y)$ follows a geometric distribution. It would mean that
$$f(\xi(x) = y) = (1 - \rho)\,\rho^{y}$$
with $\rho \in\ ]0; 1[$ and $y \in \mathbb{N}$. The moments are given by
$$\mu_n = \sum_{y \geq 0} y^n (1 - \rho)\,\rho^{y} = (1 - \rho)\,\mathrm{Li}_{-n}(\rho)$$
where $\mathrm{Li}_n(\rho)$ is the nth polylogarithm of ρ and
$$\hat{\rho} = \frac{\mu_1}{1 + \mu_1} \qquad (10)$$
is the corresponding estimator of ρ based on (9). It is also the maximum likelihood estimator. The next step is to test the hypothesis that $\xi(x) \sim G(\rho)$.
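As an illustration of the estimator, the following Python sketch computes the inflation propensity of the first integers and derives an estimate of ρ from the sample mean. It assumes the density (1 − ρ)ρ^y on y = 0, 1, 2, ..., whose mean is ρ/(1 − ρ); the function names and the small sample size are illustrative choices, far below the sample sizes discussed in the paper.

```python
def collatz_step(x):
    """Shortcut Collatz map: x/2 if x is even, (3x + 1)/2 if x is odd."""
    return x // 2 if x % 2 == 0 else (3 * x + 1) // 2


def inflation_propensity(x):
    """Number of steps that reach a value strictly above every earlier value."""
    running_max, count = x, 0
    while x != 1:
        x = collatz_step(x)
        if x > running_max:
            running_max, count = x, count + 1
    return count


def estimate_rho(values):
    """Moment estimator rho = mu_1 / (1 + mu_1), with mu_1 the sample mean."""
    values = list(values)
    mu_1 = sum(values) / len(values)
    return mu_1 / (1 + mu_1)


def geometric_pmf(y, rho):
    """Assumed density of the inflation propensity: (1 - rho) * rho**y."""
    return (1 - rho) * rho ** y


# Small illustrative sample: the first 20,000 positive integers.
sample = [inflation_propensity(x) for x in range(1, 20001)]
rho_hat = estimate_rho(sample)
print("estimated rho:", rho_hat)
print("fitted P(xi = 0):", geometric_pmf(0, rho_hat))
```

Since the text reports slow convergence, an estimate obtained on such a small sample should not be expected to match the value reported for 1e11 integers.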

Empirical results
The samples consist of the first 1e8, 1e9, 1e10 and 1e11 positive integers. For each sample, the maximum likelihood estimator of ρ is computed, then tests are performed to see if elements of the distribution follow a geometric distribution of parameter ρ:
$$H_0 : f(\xi(x) = y) = (1 - \hat{\rho})\,\hat{\rho}^{\,y}, \quad y = 0, \dots, q$$
where $q \in [0, N]$ and N is the largest observed inflation propensity in the sample. When q = N, the entire distribution is tested for goodness of fit with a geometric distribution of parameter ρ. The tests are performed using Pearson's $\chi^2$ test at a 10% significance level. Table 1 summarizes the results of the tests. As the sample size increases, the hypothesis is not rejected for the first quantiles of the distribution. For the last sample (1e11), the hypothesis that the distribution of the inflation propensity follows a geometric distribution cannot be rejected up to the 91st percentile, compared to the 49th percentile for the 1e9 sample. Computational limitations currently prevent the investigation of larger sample sizes, so the geometric behaviour of the inflation propensity over the entire domain ($\mathbb{N}_0$) needs to be conjectured. Interestingly, the estimator for ρ also seems to converge to a given value as the size of the sample increases and is very close to $(\pi - 1)/3$, which is coincidentally the solution to the equation 3x + 1 = π (see Figure 4).
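A goodness-of-fit check of this kind could be sketched as follows. This is not the paper's test procedure: pooling all values above q into a single tail class, the resulting degrees of freedom and the tiny sample are assumptions made for illustration, as are the function names.

```python
from collections import Counter


def collatz_step(x):
    """Shortcut Collatz map: x/2 if x is even, (3x + 1)/2 if x is odd."""
    return x // 2 if x % 2 == 0 else (3 * x + 1) // 2


def inflation_propensity(x):
    """Number of steps that reach a value strictly above every earlier value."""
    running_max, count = x, 0
    while x != 1:
        x = collatz_step(x)
        if x > running_max:
            running_max, count = x, count + 1
    return count


def pearson_statistic(sample, rho, q):
    """Pearson chi-square statistic against the density (1 - rho) * rho**y.

    Classes are y = 0, ..., q plus one pooled class for y > q (an assumption).
    Returns the statistic and the degrees of freedom: (q + 2) classes, minus 1,
    minus 1 for the estimated parameter. The test is only meaningful for q >= 1.
    """
    n = len(sample)
    counts = Counter(sample)
    stat = 0.0
    for y in range(q + 1):
        expected = n * (1 - rho) * rho ** y
        stat += (counts.get(y, 0) - expected) ** 2 / expected
    observed_tail = sum(c for y, c in counts.items() if y > q)
    expected_tail = n * rho ** (q + 1)
    stat += (observed_tail - expected_tail) ** 2 / expected_tail
    return stat, q


sample = [inflation_propensity(x) for x in range(1, 20001)]
mean = sum(sample) / len(sample)
rho_hat = mean / (1 + mean)
stat, dof = pearson_statistic(sample, rho_hat, q=5)
print("chi-square statistic:", stat, "degrees of freedom:", dof)
```

The statistic would then be compared with the chi-square critical value for the chosen significance level and degrees of freedom, for instance through `scipy.stats.chi2.sf` if SciPy is available.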

Collatz-based proof-of-work
Because the distribution of the inflation propensity of Collatz orbits can be assumed to be geometric over large samples, and because a generalization of the Collatz problem has been proven to be undecidable, the inflation propensity can be considered as a new candidate to generate proofs-of-work. Consider the following problem: find the smallest possible value X such that the sum of X and a hashed block of information B will have an inflation propensity of a given value Q. In other terms:
$$\min \left\{ X \in \mathbb{N} : \xi(X + B) = Q \right\} \qquad (13)$$
Assuming $\xi(x) \sim G\left(\frac{\pi - 1}{3}\right)$, the difficulty of the proof-of-work is tailored in a very straightforward manner: higher targets Q will be exponentially more difficult to find. A natural extension is to find a set $X = \{X_1, X_2, \dots, X_n\}$ whose n elements all lead to a combination of values $Q = \{Q_1, Q_2, \dots, Q_n\}$, so that the probability of occurrence can be more precisely selected. However, verifying the proof Q given X and B is computationally straightforward, a desirable property for a proof-of-work. Contrary to the hashcash proof-of-work, there can exist only one solution to the proof-of-work with the Collatz approach.
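The search and verification steps described above can be sketched in Python as follows. The names `mine` and `verify`, the search cap `max_tries` and the value of B are hypothetical; in particular, B is an arbitrary large integer standing in for a hashed block, and the search starts at X = 0, which is an assumption.

```python
def collatz_step(x):
    """Shortcut Collatz map: x/2 if x is even, (3x + 1)/2 if x is odd."""
    return x // 2 if x % 2 == 0 else (3 * x + 1) // 2


def inflation_propensity(x):
    """Number of steps that reach a value strictly above every earlier value."""
    running_max, count = x, 0
    while x != 1:
        x = collatz_step(x)
        if x > running_max:
            running_max, count = x, count + 1
    return count


def mine(B, Q, max_tries=100_000):
    """Return the smallest X in [0, max_tries) with xi(B + X) == Q, else None."""
    for X in range(max_tries):
        if inflation_propensity(B + X) == Q:
            return X
    return None


def verify(B, X, Q):
    """Check that the claimed nonce X reaches the target Q (one orbit only)."""
    return inflation_propensity(B + X) == Q


B = 2 ** 200 + 12345   # hypothetical stand-in for a hashed block
Q = 3                  # deliberately easy target for the illustration
X = mine(B, Q)
print("nonce:", X, "valid:", X is not None and verify(B, X, Q))
```

Note that `verify` only checks that the target is reached. Confirming that X is also the smallest such value would require recomputing the orbits of all smaller candidates.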
With the exception of the nonce and the target Q, the remainder of a blockchain application based on Collatz can be identical to the existing Bitcoin protocol. The nonce is simply the value of X. The target set Q can be selected by the network so that, similar to Bitcoin, 6 blocks are mined per hour.
Every 2016 blocks, clients can compare the performance of the network and adjust the difficulty accordingly. Thanks to the geometric nature of the inflation propensity, a protocol for this adjustment is straightforward. Let us assume $U_0$ is the average amount of time required by the network to find any single value $\xi(x)$. Any total computational time $U_T \geq U_0$ can be easily selected by finding a set Q solving the following problem:
Three additional constraints must be considered for the protocol to have a unique solution and be properly defined: the set must be chosen so that 0 ≤ ≤ $U_0$, the cardinality of the set must be as small as possible and max(Q) should be set to an arbitrary number based on contemporary knowledge of the empirical distribution of the inflation propensity. It is suggested to set that maximum value to 50, which corresponds to an observed empirical probability of occurrence of 1.44e-08 in the first 1e11 integers.
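To give a sense of how the target drives the difficulty, the sketch below evaluates the probability of a given inflation propensity under the geometric assumption, together with the implied expected number of candidates to try. It assumes the density (1 − ρ)ρ^Q with ρ = (π − 1)/3; the function names are illustrative, and the sketch does not reproduce the adjustment problem itself.

```python
import math

RHO = (math.pi - 1) / 3  # conjectured parameter, about 0.714


def target_probability(Q, rho=RHO):
    """Probability that a random orbit has inflation propensity exactly Q."""
    return (1 - rho) * rho ** Q


def expected_trials(Q, rho=RHO):
    """Expected number of candidates before one reaches the target Q."""
    return 1 / target_probability(Q, rho)


for Q in (10, 20, 30, 40, 50):
    print(Q, target_probability(Q), round(expected_trials(Q)))
```

Under these assumptions the probability is roughly 4e-07 for Q = 40 and roughly 1.4e-08 for Q = 50, the same order of magnitude as the empirical frequencies quoted in the text, and each increment of Q multiplies the expected effort by 1/ρ, about 1.4.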

Example: Bitcoin genesis hash
A new Bitcoin genesis hash is created using the original inputs of [16], but exploiting the inflation propensity proof-of-work instead of hashcash. The inputs are: a hash merkle root that condenses all information related to the first Bitcoin transaction, a version number, a public key, a date, a time stamp that is used as coinbase parameter, and a target for complexity. A genesis block is the first block of a blockchain. To create a genesis hash using inflation propensity as proof-of-work, only two adjustments to the Bitcoin protocol are required. First, the target for complexity is expressed as an integer, which is the targeted inflation propensity. This directly relates to a specific probability of occurrence. Second, the hashcash is replaced with the inflation propensity algorithm. In practice, the block header is hashed using SHA256 then converted into an integer using hexadecimal encoding.
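The conversion of a block header into an integer can be sketched as follows. The header bytes are a hypothetical placeholder, the function name is illustrative, and a single SHA256 pass is assumed, as stated in the text.

```python
import hashlib


def block_to_integer(header: bytes) -> int:
    """Hash the header with SHA256 and read the hexadecimal digest as an integer."""
    digest_hex = hashlib.sha256(header).hexdigest()
    return int(digest_hex, 16)


header = b"hypothetical block header bytes"  # placeholder, not Bitcoin's genesis header
B = block_to_integer(header)
print(B)
```

Reading the hexadecimal digest as a base-16 number is equivalent to interpreting the 32 digest bytes as a big-endian integer.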
This corresponds to B in equation (13). The target set Q is arbitrarily set to a single value of 40 for the generation of this first hash, which corresponds to a probability of occurrence of ∼4e-07. The value of B given Nakamoto's other initial inputs is ∼3.57e98. The nonce X is then incrementally added to the integer B and the inflation propensity is computed until the target of 40 is reached. The values obtained from each iteration are hereafter named "Xis". In the Python implementation of the

Advantages of the Collatz-based proof-of-work
The advantages of a Collatz-based proof-of-work are many. From a practitioner's perspective, the algorithm is easy to implement since the underlying problem is made of simple arithmetic operations.
Also, a generalization of the Collatz problem is known to be algorithmically undecidable, which supports the asymmetry of the proof: it is difficult to find the targeted value but easy to verify. Furthermore, the inflation-propensity based proof-of-work has only one easily verifiable solution, which simplifies the consensus for a proof. The geometric distribution allows a very convenient tailoring of the computational complexity, by adjusting a specific targeted inflation propensity, or a combination of targets. The same algorithm can also be indefinitely extended to meet new computational improvements since the upper bound of the orbits is infinity. In addition to this scalability, it could be possible to generalize the 3x + 1 algorithm to other congruential graphs exhibiting the same properties (for example, the 5x + 1 graph). Provided further research confirms this hypothesis, such a feature could allow more possibilities to generate proofs-of-work.
From a purely computational perspective, computing as many orbits as possible and recording the inflation propensities of integers is useful to marginally decrease the computational time of mining a block. The practical consequence of pre-computation is very clear: miners do not actually need to compute the entire orbit for proof-of-work in real-time, but have the possibility to store orbits in memory. This also offers the possibility of ASIC (Application-Specific Integrated Circuit) resistance by the use of memory-hardness, since holding orbits in memory gives a substantial advantage over real-time computation. Finally, computing proofs-of-work by exploring Collatz orbits contributes directly to the numerical verification of the Collatz conjecture for large numbers. This provides an elegant purpose to using computational power for proof-of-work.
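One possible way to reuse stored orbits is sketched below. This is a hypothetical caching scheme, not the paper's implementation. It relies on the observation that the new maxima of the orbit starting at x are exactly the record values of the orbit starting at T(x) that exceed x, so that storing the increasing list of record values for each visited integer is enough to recover its inflation propensity.

```python
from bisect import bisect_right


def collatz_step(x):
    """Shortcut Collatz map: x/2 if x is even, (3x + 1)/2 if x is odd."""
    return x // 2 if x % 2 == 0 else (3 * x + 1) // 2


# Cache: starting value -> increasing tuple of the record values of its orbit,
# with the starting value itself counted as the first record.
_records = {1: (1,)}


def orbit_records(x):
    """Record values of the orbit of x, reusing every orbit tail already stored."""
    pending = []
    while x not in _records:
        pending.append(x)
        x = collatz_step(x)
    records = _records[x]
    while pending:
        y = pending.pop()
        # Keep only the records of the tail that are strictly larger than y.
        keep_from = bisect_right(records, y)
        records = (y,) + records[keep_from:]
        _records[y] = records
    return records


def inflation_propensity_cached(x):
    """Inflation propensity: number of records beyond the starting value."""
    return len(orbit_records(x)) - 1


def inflation_propensity_direct(x):
    """Reference computation that walks the full orbit."""
    running_max, count = x, 0
    while x != 1:
        x = collatz_step(x)
        if x > running_max:
            running_max, count = x, count + 1
    return count


print(inflation_propensity_cached(3))   # 2
print(inflation_propensity_cached(27) == inflation_propensity_direct(27))
```

The trade-off is the one described in the text: each lookup stops as soon as the orbit reaches a stored value, at the cost of keeping the record lists in memory.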

Conclusion
For the classical 3x + 1 map, it is conjectured that the inflation propensity $\xi(x) = \mathrm{card}\left\{k : T^k(x) > \max_{0 \leq j < k} T^j(x),\ k = 1, \dots, \sigma_\infty(x)\right\}$ has a geometric density distribution whose parameter's value is $\rho \approx (\pi - 1)/3$. This has been verified numerically for the first 1e11 integers. Standardization of proofs-of-work has led to increased systemic risk in case of cryptographic failure of the hashcash, Ethash or Scrypt protocols. The inflation propensity of Collatz orbits is a new metric that exhibits properties particularly well suited to be the base for new cryptography applications. A new proof-of-work is suggested: finding the smallest possible value X such that the sum of X and a hashed block of information B has an inflation propensity of a given value Q.

The interests of fitting a density distribution to $\xi(x)$ are multiple: first, in the absence of a proof of the Collatz conjecture, numerical analysis of the problem stays relevant towards resolving the question. Second, by properly addressing the behavior of the series for large numbers, one can help anticipate the computational challenges related to exploring the orbits of the Collatz map. Third, identifying pseudo-random behaviour of the Collatz inflation propensity directly leads to a new class of proofs-of-work for blockchain applications. The remainder of this document is built as follows: the next section discusses the empirical distributions of $\sigma_\infty(x)$, $\sigma(x)$ and $\xi(x)$ $\forall x \in \mathbb{N}_0$. The third section details the observed density of $\xi(x)$. The density parameter of a geometric distribution is estimated using all natural numbers up to 1e11 as sample. The fourth section presents a new proof-of-work based on inflation propensity. The last section concludes.

Figure 1. "Structureless" total stopping time for the first 1e6 natural numbers

Figure 2. "Structureless" distribution of the total stopping time for the first 1e6 natural numbers

Figure 3.

Figure 4. ρ as a function of the sample size (log10-scale)

Table 1. Pearson's $\chi^2$ tests for goodness of fit with a geometric distribution
Table A1 in the Appendix indicates the distribution of inflation propensities for the first 1e11 integers.

algorithm, 1248 Xis are computed per second on an Intel Core i7-4700MQ CPU with 8 x 2.40GHz. After 11 minutes of computation, the solution is found. Table 2 describes diagnostics and results of the genesis hash. Using this first instance to calibrate the computational difficulty, the smallest set Q that solves equation (14) and that would yield an expected computational time of 10 minutes for the next block would be {2, 15, 20, 26, 30, 40}.

Table 2. A genesis hash based on original Bitcoin's inputs for genesis but using inflation propensity as proof-of-work

Table A1. Distribution of inflation propensity $\xi(x)$ for the first 1e11 integers

```python
# Excerpts from the Python 2 implementation; the parts of the script between them are not shown.

# Write the current nonce into the last four bytes of the block header (little-endian).
data_block = data_block[0:len(data_block) - 4] + struct.pack('<I', nonce)

def generate_hashes_from_block(data_block):
    # Double SHA-256 of the header, with the byte order reversed.
    header_hash = hashlib.sha256(hashlib.sha256(data_block).digest()).digest()[::-1]

print "genesis hash: " + genesis_hash.encode('hex_codec')

main()
```

Preprints (www.preprints.org) | NOT PEER-REVIEWED | Posted: 25 September 2018 | doi:10.20944/preprints201809.0472.v1