Stochastic Computing Implementation of Chaotic Systems

: An exploding demand for processing capabilities related to the emergence of the Internet of Things (IoT), Artiﬁcial Intelligence (AI), and big data, has led to the quest for increasingly efﬁcient ways to expeditiously process the rapidly increasing amount of data. These ways include different approaches like improved devices capable of going further in the more Moore path but also new devices and architectures capable of going beyond Moore and getting more than Moore. Among the solutions being proposed, Stochastic Computing has positioned itself as a very reasonable alternative for low-power, low-area, low-speed, and adjustable precision calculations—four key-points beneﬁcial to edge computing. On the other hand, chaotic circuits and systems appear to be an attractive solution for (low-power, green) secure data transmission in the frame of edge computing and IoT in general. Classical implementations of this class of circuits require intensive and precise calculations. This paper discusses the use of the Stochastic Computing (SC) framework for the implementation of nonlinear systems, showing that it can provide results comparable to those of classical integration, with much simpler hardware, paving the way for relevant applications.


Introduction
The increasing pervasion of devices related to wireless networks, data process, and transport and in general the Internet of Things (IoT) has led to a resultant rapid increase of edge devices.The numbers are enormous and the predictions are wondrous [1]; in five years (by 2025) 180 ZBytes will be the amount of data to be handled (International Data Corporation-IDC).In addition, IoT devices will exceed 150 billions and it is estimated that the data produced by them will be about 70% of the data produced worldwide (IDC) [1].It is apparent that all the types of centralized processing, even in the form of cloud, cannot properly and efficiently support this new computing landscape, taking also into account the fact that IoT has to go hand in hand with other technologies regarding artificial intelligence, big data, mobile computing etc. which are being referred to as ubiquitous computing platforms.Therefore, edge computing rises in the horizon, calling for data processing at the edge of the network.
All these technologies call for innovative approaches [2,3] and some of those approaches are approximate computing [4][5][6], deep learning [7], new post-CMOS devices and architectures [8], or advanced processing techniques [9,10].One of the fields where more computational power is required is communications, especially when data privacy protection and security are included.Thus, data encryption emerges as a mandatory element.Nowadays, many techniques and approaches are proposed in the context of the IoT, in all hierarchical levels of data communications, others for ensuring privacy [11] and others for practically ensuring security [12].As a result, there exist numerous proposals in this sense, usually ubiquitous ones; one such promising option seems to lean towards using chaotic-based encoder-decoder schemes to secure and/or authenticate data transmission in general [13][14][15][16].There are examples of such secure-communication systems, analog [17,18], and digital [19,20] ones demonstrating merits like low-cost, circuit simplicity, low-power operation etc. [9,10,21,22].
On the other hand, it seems that approximate computing enables high power savings [6], making this technique a viable candidate for IoT edge devices.This framework offers energy savings by trading accuracy for energy.There are a handful of methods that can successfully implement approximate computing: programming methods and algorithms, hardware implementations, and other ubiquitous solutions.As a curious side note, this strongly reminds one of the way chaos was encountered by Lorenz, finding different solutions of his equations because of truncated number storage [23].In fact, as the legend goes, Lorenz found that saving the state of a simulation and using it as the new initial condition, resulted in a very different simulation than simply continuing this without saving.After a lot of analysis effort, it was found that results were stored with less resolution than was used internally to perform the computations.This difference in the number of bits caused minuscule differences that led to divergent simulations because of the chaotic nature of the system.Thus, it is quite important to analyze if the system to be implemented allows the use of approximate computing.
An interesting approach had been already introduced by Von Neumann in 1956 [24], based on a series of lectures given by R.S. Pierce in 1952 at California Institute of Technology.This approach, namely Stochastic Computing (SC) or Stochastic Logic, makes a trade off between calculation time and accuracy.This approach has been successfully applied in fields as diverse as neural network implementation [25,26], data mining [27], data compression [28], or mathematical calculations (FFT) [29], control [30], or even A/D conversion [31], among others.Indicative of its advantages, it allows for a high reduction in the number of components, thus reducing the power required to run the circuit.However, this comes to a price, since the time required to perform the operation also increases exponentially with the number of bits.This, in turn, may increase the total energy consumption when the number of bits exceeds 16-17 [32,33].There are, however, techniques that allow this problem to be alleviated, and make this competitive even for higher bit-numbers [33].Notice, that stochasticity is entering only in the proper (for calculation) number generation and has nothing to do with the systems implemented within this framework.In fact, it could be considered as an equivalent of typical noisy system.
There are cases where this trade-off plays a crucial role, like in the case of chaotic systems.It is known that key features of all nonlinear, chaotic systems are: long-term bounded aperiodic behavior, enhanced sensitivity to parameters and initial conditions, and fast decorrelation between past and present [34].These features, especially sensitivity to initial conditions and parameter values, make proper implementation of chaotic systems within approximate computing frameworks, difficult, not impossible though.A successful example is the case of implementing a chaotic oscillator in a purely digital environment [20] but with the cost of creating a much more complicated (higher-dimensional) implementation.
In this paper, having in focus possible utilization of chaotic systems for edge computing applications, like the IoT, an unconventional approach to the computation of nonlinear dynamical systems, is investigated.Specifically, we show how the stochastic computing (SC) framework, an option for approximate computing, can be utilized to properly implement nonlinear, chaotic operating systems.It is apparent that approximate computing methods for implementing systems sensitive to initial conditions and system parameters is an important issue that needs thorough investigation.In Section 2 an introduction to Stochastic Computing is apposed, presenting some very basic examples; Section 3 presents a method in the form of an example, on how to implement a nonlinear system using both this framework and a classical integration, as well.Then, established nonlinear dynamics tools and metrics like fractal dimension, Kolmogorov-Sinai entropy etc. are utilized to estimate how well the described SC implementation behaves, compared to the conventional approach.Finally, Section 4 concludes the paper.

Introduction to Stochastic Computing
The SC approach means representing real numbers by a string of random binary numbers.The average value of this string (between zero and one) is correlated to the number represented [35].These strings are called Stochastic Computing numbers (SCN) or Stochastic Encoded Numbers (SEN).In this paper, we will opt for the second, using also the term Binary Encoded Numbers (BEN) for those encoded as classical binary numbers.There are two main mappings from real to SEN: one from the real domain [0..1], and another one from the real domain [−1,1].
Depending on the chosen mapping, several operations can then be carried out with different simple logic gates.In the case of the real [0,1] domain, multiplication of Stochastic Computing numbers is calculated using an AND gate.In the case of the [−1,1] domain, multiplication requires the use of an XNOR gate, as shown in Figure 1.The case of addition is slightly more complex, since 1 + 1 = 2, and we cannot represent any number as a probability higher than one.Thus, the operation that should be implemented is (x + y)/2, which will always be 1 at the most.This operation is usually implemented using a multiplexer, as shown in Figure 2a, where the p (0.5) means a signal with a probability of 0.5 to be "1" or "0".This signal is generated using one of the bits generated in the RNG, so no additional circuitry is needed.It is worth pointing out that this gate is the same both for the [0,1] and the [−1,1] domains.Other more complex operations (division [30], square roots [30], reversible gates [36], etc.) are also discussed in the literature, though they are not to be presented here to focus on the implementation procedure of the non-linear equation system.Another important point is the conversion from BEN to SEN.This is usually achieved by using a scheme similar to that in Figure 2b, where an N-bit random number is generated by utilizing a random number generator (RNG) and compared to the value of the N-bit BEN.If the output of the RNG is below the BEN, the converter's output would be a 1-bit " 1", or a 1-bit " 0" otherwise.In the opposite operation, converting SEN back to its BEN representation, the number of 1's included in the signal needs to be calculated; something that can be achieved by a simple counter.
It is apparent that the error in the approximation of the SEN to its actual value is equivalent to the error provided by a random walk process of length n, and thus proportional to √ n, as it has been discussed in the literature [37].Therefore, using N bits, we may consider that all the noise caused by the process is included in the lowest N/2 bits.This way, the noise figure NF for a signal of power S p with noise power N p caused by the use of the SCN is: Notice that for this equation we have considered the maximum possible amplitude for the input signal.In order to consider the minimum amplitude over the noise, we consider we are 1 bit over the noise (N/2 + 1).In this case, we obtain: Thus, the system is expected to have a NF between 3dB and 3(N/2 + 1)dB, assuming as above that we use more than 1 bit.This NF sets the required number of bits, which is related to the sensitivity of the equation system to noise.Empirically, we have seen that linear equations allow for a low N, while nonlinear systems call for higher values.Notice that a value of the NF = 20 dB calls for N=12 bits, while N=32 bits would provide NF = 54 dB.

Implementation of Basic Differential Equations
Implementation of differential equations using SC is similar to the case of implementing them in discrete form.The process involves, mainly, rewriting the equation in a specific way so they can be integrated using SC.This requires three different transformations: 1.
Rewriting the equation in a form suited to SC.In the most basic case, this implies replacing all the additions by half additions: a + b → 2(a/2 + b/2).Other, more complex operations may require a harder reworking of the equations to ensure all the operations can be implemented in SC in the [−1,1] or [0,1] range.For instance, in the case of implementing a division, one has to ensure that the result is always going to be in range, which may require an additional scaling and shifting of the variables involved.

2.
Scaling all the variables into the [−1..1] or [0..1] domains, since those are the values that can be dealt with in SC.

3.
Then, a final transformation ensures that all the modules of the coefficients in the equations are lower than 1.This is equivalent to a time scaling.
Once these three transformations are performed, the equations may be processed as a SC system.It is apparent that in this procedure multiplications are implemented by the gates appearing in Figure 1 and additions by the half-additions implemented by the gate in Figure 2a.
In order to implement the integrator the scheme appearing in Figure 3 is utilized.This figure shows the symbol to be used in Figure 3a, while the scheme is shown in Figure 3b.It performs a continuous integration, by counting up or down depending on whether the input is 1 or 0 and comparing the output of the counter with a random number generator RNG to create a SCN.The implementation of the integrator and the RNG implies making a decision on the effective number of bits demanded in order to define the size of the counter.Notice that this is equivalent to determining the precision of the integrator, since numbers are represented between [0..1] (or [−1..1]) with N bits of resolution and a noise figure provided by Equation (1).Related to the number of RNG needed to implement this scheme, it has been shown in [38] that using the same RNG in both inputs actually improves the accuracy, thus simplifying the design.Related to this number of bits is the issue of way time is mapped to the number of iterations, in the relevant SC equations.This is an issue strongly dependent on the integration method; assuming that a simple first order rectangular integration, with no explicit time dependence is utilized, then, it is apparent that the following Equation ( 3) holds valid: Using the integrator scheme presented in Figure 3, we see that allowing for a high enough number of N acc iterations, the output at the counter Int(x(t)) would be: where p(1) is the probability of having a 1 at the input.It has to be noted that p(1) corresponds to the actual value to be represented in the SCN , and N acc is the number of "ones" the accumulator has counted.Thus, equating Equation (3) with Equation (4), we find that the effective time step is given in Equation (5).
Notice that the integrator depicted in Figure 3 also includes a binary to stochastic (B2S) converter, which is simply a random number generator (RNG), plus a comparator.This way, the output of the integrator is already also a SC number that can be used further in the circuit.

Basic Examples
As a proof of concept two basic examples are presented: a system that integrates twice a constant and a simple oscillator.We have implemented these systems into a DE2-70 Field Programmable Gate Array (FPGA) by Altera, using Quartus-II to compile it.The resulting circuits were implemented using less than 500 gates or less than a 2% of the available number of gates.Communication with the computer was implemented using the JTAG interface, controlled within Matlab.

Integrating a Constant (Twice)
As a first example, we have designed the system shown in Figure 4 and Equation (6) and initial conditions ( ẍ, ẋ, x) = (k 2 , k 1 , k 0 ).The system then integrates the equations over time.The analytical solution of the system above, is provided in Figure 4 and the results for a specific set of initial conditions are depicted in Figure 5 (for initial conditions k 0 = k 1 = 0, and k 2 = 0.08).In this figure, we have used N acc = 5000 iterations and N = 21 bits, resulting in a time-step ∆t = 5000 2 21 ≈ 2.384 ms.Using this relation, we expect the line corresponding to ẋ reaching ẍ (k 2 ) at the 420 th iteration and crossing the parabola generated by x at the 839 th iteration.The parabola is expected to cross k 2 at iteration 593.Actually, all these crossings emerged exactly at the expected points, as shown in Figure 5.

A Simple Oscillator
The simplest possible autonomous system to be implemented was that of a selfoscillating system; this is a simple coupled system described in Equation ( 7) implementing a simple oscillator of frequency one (ω = 1), corresponding to the system illustrated in Figure 6.
Figure 6.Basic implementation scheme of a coupled second order ODE, as in Equation ( 7).
In this case, the implementation is direct, by simply replacing the blocks in Figure 6, with those discussed above.The resulting waveforms of both x and y variables are shown in Figure 7.In this figure, we have used N acc = 2 4 = 16 iterations and N = 18 bits.This provides a ∆t = 2 4  2 18 = 1 8192 s.Using this relation, the period is expected to be 2π/∆t ≈ 51497 iterations, as it actually happens in Figure 7.

Implementing Chaotic Systems in SC
Implementation of chaotic systems using stochastic computing can be achieved by using the same three step process presented above.The main issue in this case regards the sensitivity to noise, for such systems.Thus, an adequate NF will require a high number of bits.To compare the results coming from integrating within the proposed SC environment, to those utilizing the conventional methods, the Shimizu-Morioka system [39] was utilized as a paradigm.Therefore, a comparison between these two implementations of some typical nonlinear time series analysis estimators (Kolmogorov entropy, correlation dimension, etc.) were performed; documenting an evaluation of the effectiveness of the SC implementation.It should be mentioned that the classical integration was performed using Matlab, while the SCN was performed using the same FPGA as in the previous section.
Regarding the RNG, we have used a single 64-bit implementation of the Mersenne algorithm, which provides a period long enough for our purposes.Notice that most of the RNG we need are only 1-bit long, so we can extract all of them from the same RNG.In addition, we have used the technique proposed in [40], which allows for the use of a single generator to create different random numbers to the integrators and constants to the generators.The scaling constants are stored as fixed registers, which are then used to generate the random sequence through a B2S converter.It is worth noticing that the RNG is actually responsible of a large part of the total energy consumption, due to the large number of computations needed by the algorithm.This energy consumption could be addressed in the future by using memristors to generate the random bits, as proposed in [41][42][43].

The Shimizu-Morioka System
A system algebraically simpler than the Lorenz system has been proposed by Shimizu and Morioka [39] in 1980.The original equation formulation appears below: A usual set of parameter values, leading to the emergence of chaos in this system, is µ = 0.81 and α = 0.375.By applying this set, the system is operating in a deterministic chaotic mode, demonstrating an elegant chaotic attractor in the corresponding 3D phase space.As expected for deterministic chaotic systems, the trajectories comprising this strange attractor are bounded, while the whole system behavior appears to be bounded within a box (the phase space), the limits of which are explicitly presented in relations (9).

Equation Preparation
It is apparent that the next step in implementing equation set (8) in SC environment, is to get through a normalization procedure, as this has been discussed above.This would ensure that all three variables (x, y, z) would be within the SC working interval.In our case, we opted for normalizing all the variables to be within the [0,1] range.In order to achieve this, the following variable transform (10) was applied, as a first step: It is apparent that the proposed change of variables transformed equation set (8) into the form appearing in equation set (12), after getting through the relations in (11).
Thus, the resulting equation system in (12) has all its (X, Y, Z) variables oscillating within the proper range for stochastic logic, but the equations are not yet ready to be implemented in a SC environment.This is due to the fact that some of the system-parameters are outside the SC range (in this case higher than one).To solve this, we divided all the terms by a number c r higher than the highest coefficient appearing in the equations.Notice that this is equivalent to a scaling of time of a magnitude t = c r t, thus no qualitative change emerges.In this specific case, using the value c r = 32, the equation set (12) However, it has to be noted that, even if all the parameters and variables are within the proper range, the operations are not in a form ensuring that their results would fall within the SC range.In specific, the additions must be in the form suggested in Equation (14).
Rewriting all the equations according to this requirement, the equation set ( 13) transforms into in the one appearing in (15), which is the final form of the equations to be implemented in the SC environment.

Implementation
The equation set (15) was implemented in a Matlab environment, being integrated in a conventional way by utilizing its built-in functions and the ode45 solver with variable time step.The solution of the system for all three system-variables in the time domain appears in Figure 8a for specific initial conditions.In Figure 8b the corresponding attractor of the system, in its 3-dimensional phase space, is apposed.
The same system (beginning from the same initial conditions) has also been implemented in a SC environment using the previously discussed basic gates and it is presented in Figure 9.In this figure, the x i variable corresponds to any of the signals x, delayed by i clock cycles, an essential approach that improves decorrelation.Note that the delay element is not shown, but it is simply implemented by a shift register, taking into account that the variables are only 1 bit long.The number of bits used in this implementation was N = 22, with N acc = 2 12 iterations.This would be equivalent to a conventional system using 17-bit binary arithmetic, once the noise has been taken into account.The results corresponding to this integration are presented in the form of time series (in the time-domain) in Figure 10a, where all three variables X, Y, Z appear.The corresponding attractor, embedded in a 3D phase space, appears in Figure 10b.Comparing these two figures to the corresponding Figure 8a,b of the classical implementation, one gets the subjective perception that the two implementations evolve in time in a quite similar way, not the same though.Indeed, although in both implementations the systems begin evolving in time from the same initial conditions, their evolution is not identical, due to the approximate computing approach implemented in SC environment.However, the emerging attractors are in both cases demonstrating almost identical structure in their embedding phase space (3D in this case).A very draft remark is that the time series and the resulting attractor appears to be slightly more noisy, in the case of SC implementation, something expected because of the approximate nature of calculations in the case of SL.

Chaotic Evaluation
Due to the nature of deterministic chaotic systems and in order to perform a more objective investigation of the fidelity of the dynamics demonstrated by the Shimizu-Morioka equation system, in the stochastic computing environment, a procedure including a variety of relevant metrics was applied.
Initially, the power spectrum of one of the state variables, namely Z(t), was calculated in both cases.It is presented in Figure 11, where the red line regards the classical integration method and the green line the SC method.The similarity between them is obvious, including the minimum at around 0.35 mHz and the maximum at 0.5 mHz.The most apparent difference appears in the higher frequencies, which can be attributed to both the error in number quantization [44] and the noise from a random walk [37] as expressed in Equations ( 1) and (2).Further investigation and analysis of the demonstrated chaotic behavior included calculation of correlation dimension, Kolmogorov entropy, as well as Lyapunov exponents.To this direction, the z(t) and Z(t) variable from the three time series appearing in Figures 8a and 10a correspondingly, were considered.
Applying the Takens theory [45] to the Z(t) time series, the topologically equivalent attractor was reconstructed, in the proper phase space.It is noted that according to this theory, the technique of displaced vectors is applied.So initially, the appropriate time delay τ, needed for the reconstruction of the phase space, was calculated by both the mutual information's first local minimum and the autocorrelation function's first zero.In both cases, the conventional and the SC solution, the value emerging for τ appeared to have a significantly lower value when calculated through the mutual information, and therefore this was the one adopted [23]; for the specific set of parameters and initial conditions: τ = 16 for the conventionally calculated solution and τ = 8 for the SC solution (in both cases we refer to measurement points).
The correlation integrals C(2, ) , for different embedding dimensions, have been numerically calculated, according to the Grassberger-Procaccia method [46][47][48], Equation ( 16): where parameter is the hypercube dimension in the hyperphase space for a series of specific embedding dimensions m, and Θ the Heaviside function.These integrals provided information characterizing the attractor and were calculated for embedding dimension up to m = 6, for both the conventional and the SC solutions of Z(t).In both cases, the integrals appeared to almost parallelize for embedding dimensions m = 3 and above.The slopes of the correlation integrals' linear parts (in a double logarithmic scaling) were determined for each embedding dimension (v vs m) and their values appear in Figure 12.In this plot, the black line refers to conventional solution, while the red one to SC.In both cases a saturation plateau appears for m ≥ 3, thus, this is the sufficient phase space dimension, necessary to host the system's global dynamics under any circumstances [23].
In the case of the conventional solution (black line in Figure 12), the saturation tends to the noninteger value of v = 2.10, which is the correlation dimension of the attractor under these circumstances, further proving the deterministic chaotic nature of the studied time series (and the corresponding system).The closest (to the correlation dimension) higher, integer value defines the minimum embedding dimension, which in this case appears to be m min = 3, as expected for a three state-variable system.It is apparent that for the Shimizu Morioka system, the calculated minimum embedding dimension coincides with the minimum sufficient phase space dimension m min = m su f f = 3.In the case of the SC solution (red line in Figure 12) the relation of the correlation integral slope, for all the embedding dimensions (v vs m) is again saturated after m = 3, but this time to a slightly higher noninteger value; thus the correlation dimension in this case is v = 2.30.This difference is something expected and fully explainable, since the approximate computing nature of SC is introducing a specific level of noise, which is expected (and hereby verified) to increase the attractor's dimensionality.However, this increase is within the system's minimum dimensions m min = m su f f = 3.
In order to investigate the global chaotic dynamics, the Kolmogorov-Sinai Entropy and the Lyapunov exponents were calculated.The Kolmogorov Entropy appears in Figure 13, in both cases.It clearly possesses positive value, K 2 = 0.44 bits/τ for the conventional solution, and K 2 = 0.54 bits/τ for the SC solution.These values are almost the same depicting similar rate of loss of information of the past state of the system; therefore, there is a similar deterministic chaotic nature for both implementations.Finally, the three (as expected for a three-dimensional system) corresponding average local Lyapunov exponents, as they were calculated by the observed Z(t) time series [49], are presented in Table 1.The maximal exponent provides with a measure of the predictability of system producing the studied time series [49,50].If at least one of them is positive then the system is chaotic.In both investigated cases, the system demonstrates one positive exponent, one nearly zero (due to the finite time series and the approximation introduced by the calculating method) and one negative, hinting for a simple chaotic system.Moreover, in both implementations the values were close one to the other.The higher value of λ 1 in the case of the SC solutions, is again expected and due to the noise introduced by the approximate calculations taking place in the case of the SC implementation [23,34].Additionally, the second exponent λ 2 is zero in both cases as expected, and the third ones λ 3 are also close in value.It is noted that in this case the value of the maximal exponent (the only positive) also provides the lower bound of the Kolmogorov-Sinai entropy.
All the above calculations of nonlinear dynamics established metrics prove and confirm the ability to implement the chaotic dynamics of Shimizu-Morioka system in SC, indifferent to the approximate nature of this kind of calculations.

Discussion
Having in focus edge computing and data trafficking in the frame of the IoT, approximate computing emerges as an essential technological option, being able to provide low cost in both area and energy for (relatively) low precision calculations.To this direction, in this work we have presented a detailed procedure to implement nonlinear equations using stochastic computing (a kind of an approximate computing method).This procedure involves a renormalization of the equations in three phases: first, it sets the possible values of the variables inside the values [−1,1]; secondly, all the constants inside the equations are recalculated so that they would also lay within the same range, which is equivalent to a scaling of the time variable; finally, all the operations are rewritten in a form compatible to the available operations in stochastic computing.
We have also discussed the effect of changing the number of bits used to represent the variables, showing that this number is related to the signal-to-noise ratio that the system can withstand.Specifically, since the process is a random walk, the noise figure is expected to grow according to the bit-string length, as √ N. The main advantage of the presented approach is the low number of components needed to implement the equations, which is far below the required number demanded in the classical approach, due to the method of implementing SC calculations.This makes the specific approach useful for small (lightweight) systems that require making complex calculations with a low energy consumption, mainly if the needed number of bits is not large.In addition, the robustness of nonlinear systems in the frame of SC was also shown.This is specially relevant, since nonlinear systems are highly sensitive to initial conditions and parameter variations, which may greatly change their long-term behavior.In our case, we have shown that this long-term behavior is kept, even when all the constants and parameters have been defined as SCN.
As examples of application, we have implemented three different systems representing as many differential equation systems: a double integrator, a simple oscillator, and a nonlinear system.As already discussed above, the results from the first two cases appear to be exactly the same as the conventional system implementations.In the case of the Shimizu-Morioka system, which has been used as the toy model for nonlinear system implementation, the results obtained using SC are consistent with a conventional implementation.The metrics used for comparing the implementation in the SC versus the conventional one were three: the correlation dimension (2.10 for the conventional case, 2.30 for the SC); the Kolmogorov entropy (0.44 bits/τ for the conventional, 0.54 bits/τ for the SC); and the Lyapunov exponents, which were also found to be very similar.
Related to the performance of the system in stochastic computing; it is clear that its worse aspect is the time needed to perform the calculations.In the proposed FPGA implementation, the throughput of the system is one bit per clock cycle, and it needs 2 N cycles to perform a single δt iteration.This total time can be reduced by parallelization.Thus, using M identical blocks in parallel as proposed in [33], the total time needed to perform the iteration is reduced accordingly to be t SC = 2 N /M clock cycles.On the other hand, when considering conventional binary operations implemented sequentially, the total time would be given by: where τ x is the number of cycles needed to implement the arithmetic operation x, which appears N x times in the circuit.In our case, we only have used multiplication (mul), addition (add), subtraction (sub), and integration (int).Of those, multiplication needs usually between 3 and 7 clock cycles, and the others only 1, since the integral is just another addition.Thus, for our system, we need a total of clock cycles between 40 and 76, with an expected value around 67 cycles.Thus, for a 16 bit implementation with M parallel SC branches, the ratio between t conv and t SC is: Thus, for a simple implementation with 32 parallel branches, the conventional arithmetic would be 32 times faster than a Stochastic Computing equivalent.This, as has been discussed above, leads to a trade-off between speed, precision, and power consumption.Notice, however, that for an application requiring a lower number of bits (as, for instance, in [51], where only 10 bits are needed), this is changed to ≈ M/16 and an SC implementation could be actually faster and simpler than a conventional one.Related to the other performance parameters of the system, we have to note that in this paper we are focusing only on the possibility of implementing such systems, with no emphasis on the energy consumption or area optimization.These two aspects are still under consideration for our proposal, since the literature seems to be unclear in this aspect.However, as an example, related to the area optimization, we can compare the difference between the number of logic elements needed to implement a stochastic computing (always a simple gate) against a usual digital implementation of a vedic multiplication [52] into a certain FPGA, as in [53].These results are compiled in Table 2 and show a clear advantage of SC over conventional implementation.Related to energy consumption, it is known [33] that for short bit-streams (less than 16-17 bits) SC performs better than conventional.Notice, in any case, that the energy consumption needed to generate the random bits is not taken into account, since they can be generated in multiple ways.For instance, a 64-bit Mersenne RNG needs a lot of power to perform all the calculations that lead to the pseudorandom sequence of bits, but if we can use memristors for this task [41][42][43] this energy is drastically reduced, as well as the needed area.Additionally, a final ASIC implementation of the design could utilize some improvements, as extensively discussed in [33], that allow for exponential improvement of consumption.
Table 2. Comparison of the number of Field Programmable Gate Array (FPGA) parts used for implementation of the vedic multiplication algorithm [53] and the SC multiplication.(*) The number of LUTs used in SC is considered to be 1/3 of a 6-input LUT, as those in the FPGA used in [53].The results emerging from the SC implementation are thus showing that this technique is capable of implementing complex nonlinear systems, in an area-efficient way, with small loss of precision, equivalent to an analog circuit implementation.Taking into account their possible application for secure data transmission, they are one possible alternative to be considered in IoT, or Edge computing systems, paving the way for an even wider spread of IoT.

Figure 2 .
Figure 2. Basic implementation schemes (a) of a SC adder using a multiplexer and (b) a Binary Encoded Number (BEN) to a Stochastic Encoded Number (SEN), using a Random Number Generator (RNG).

Figure 3 .
Figure 3. (a) Symbol for the integrator block; (b) Basic implementation scheme of a SC integrator.Notice that both the input ẋ(t) and the output x(t) are SEN numbers.

Figure 4 .
Figure 4. Basic implementation scheme of a second order ODE with constant input and initial conditions ( ẍ, ẋ, x) = (k 2 , k 1 , k 0 ).The analytical solution is also provided at the end of each stage.

Figure 7 .
Figure 7. Output of the scheme in Figure 6, showing both x and ẋ.In this case, N acc = 16, N = 18 bits, resulting in a ∆t = 1/8192 s.

Figure 8 .
Figure 8.For the normalized Shimizu-Morioka system, beginning from initial conditions (x, y, z)=(0.51,0.51,0.51),(a) the nonlinear time series of the normalized Shimizu-Morioka system and (b) the corresponding attractor in a 3D phase space, are presented.

Figure 9 .
Figure 9. Implementation of the Shimizu-Morioka equations using SC.The subindex in the variables means a delay equivalent to the number used to decorrelate them.The constants c i are those corresponding to Equation (15).

Figure 11 .
Figure 11.Power Spectra obtained from the Z(t) time series using the SC (green) and classical (red) integration methods.

Figure 12 .
Figure 12.Correlation Dimension for the conventionally calculated Z(t) time series (black line) and the one calculated in the SC environment (red line).

Figure 13 .
Figure 13.Kolmogorov-Sinai Entropy for the conventionally calculated Z(t) time series (black line) and the one calculated in the SC environment (red line).

Table 1 .
Values of the first three Lyapunov exponents for the "Z" variable in the cases of classical integration and SC integration.