Search Results (17)

Search Parameters:
Keywords = hyperbolic bit

24 pages, 389 KB  
Article
The Power of the Lorentz Quantum Computer
by Qi Zhang and Biao Wu
Entropy 2026, 28(3), 266; https://doi.org/10.3390/e28030266 - 28 Feb 2026
Cited by 1 | Viewed by 285
Abstract
We analyze the power of the recently proposed Lorentz quantum computer (LQC), a theoretical model leveraging hyperbolic bits (hybits) governed by complex Lorentz transformations. We define the complexity class BLQP (bounded-error Lorentz quantum polynomial-time) and demonstrate its equivalence to the complexity class P^PP (the class of problems solvable by a deterministic polynomial-time Turing machine with access to a PP oracle). LQC algorithms are shown to solve NP-hard problems, such as the maximum independent set (MIS), in polynomial time, thereby placing NP and co-NP within BLQP. Furthermore, we establish that LQC can efficiently simulate quantum computing with postselection (PostBQP), while the reverse is not possible, highlighting LQC’s unique “super-postselection” capability. By proving BLQP = P^PP, we situate the entire polynomial hierarchy (PH) within BLQP and reveal profound connections between computational complexity and physical frameworks like Lorentz quantum mechanics. These results underscore LQC’s theoretical superiority over conventional quantum computing models and its potential to redefine boundaries in complexity theory. Full article
(This article belongs to the Special Issue Quantum Computation, Quantum AI, and Quantum Information)

24 pages, 5038 KB  
Article
Dynamic Analysis, FPGA Implementation and Application of Memristive Hopfield Neural Network with Synapse Crosstalk
by Minghao Shan, Yuyao Yang, Qianyi Tang, Xintong Hu and Fuhong Min
Electronics 2025, 14(12), 2464; https://doi.org/10.3390/electronics14122464 - 17 Jun 2025
Viewed by 915
Abstract
In a biological nervous system, neurons are connected to each other via synapses to transmit information. Synaptic crosstalk is the phenomenon of mutual interference or interaction between neighboring synapses of neurons. This phenomenon is prevalent in biological neural networks and has an important impact on the function and information processing of the nervous system. In order to simulate and study this phenomenon, this paper proposes a memristor model based on the hyperbolic tangent function for simulating the activation function of neurons, and constructs a three-neuron Hopfield neural network (HNN) model by coupling two memristors, which brings it close to the real behavior of biological neural networks and provides a new tool for studying complex neural dynamics. The intricate nonlinear dynamics of the memristive HNN (MHNN) are examined using techniques such as Lyapunov exponent analysis and bifurcation diagrams. The viability of the MHNN is confirmed through both analog circuit simulation and FPGA implementation. Moreover, an image encryption approach based on the chaotic system and a dynamic key generation mechanism are presented, highlighting the potential of the MHNN for real-world applications. The histogram shows that the encryption algorithm is effective in destroying the features of the original image. According to the sensitivity analysis, the bit change rate of the key is close to 50% when small perturbations are applied to each of the three parameters of the system, indicating that the system is highly resistant to differential attacks. The findings indicate that the MHNN displays a wide range of dynamical behaviors and high sensitivity to initial conditions, making it well-suited for applications in neuromorphic computing and information security. Full article
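The key-sensitivity figure quoted above (a bit change rate near 50% under tiny perturbations) is easy to reproduce in miniature. The sketch below uses a logistic map as a hypothetical stand-in for the paper's memristive-HNN chaotic system; `keystream` and all of its parameters are illustrative, not taken from the paper:

```python
def keystream(x0: float, r: float = 3.99, n: int = 2000, skip: int = 300) -> bytes:
    """Byte keystream from a chaotic logistic map. The paper derives its key
    from a memristive HNN; the logistic map here is only a stand-in used to
    illustrate sensitivity to the key parameters."""
    x = x0
    for _ in range(skip):               # discard the transient
        x = r * x * (1.0 - x)
    out = bytearray()
    for _ in range(n):
        x = r * x * (1.0 - x)
        out.append(int(x * 256) % 256)
    return bytes(out)

def bit_change_rate(a: bytes, b: bytes) -> float:
    """Fraction of differing bits between two equal-length keystreams."""
    diff = sum(bin(x ^ y).count("1") for x, y in zip(a, b))
    return diff / (8 * len(a))
```

Perturbing the initial condition by 1e-10 drives the two keystreams apart within a few dozen iterations, after which the bit change rate settles close to 0.5, the behavior the abstract reports for its three system parameters.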

10 pages, 2040 KB  
Article
Optical Full Adder Based on Integrated Diffractive Neural Network
by Chenchen Deng, Yilong Wang, Guangpu Li, Jiyuan Zheng, Yu Liu, Chao Wang, Yuyan Wang, Yuchen Guo, Jingtao Fan, Qingyang Du and Shaoliang Yu
Micromachines 2025, 16(6), 681; https://doi.org/10.3390/mi16060681 - 4 Jun 2025
Viewed by 1585
Abstract
Light has been intensively investigated as a computing medium due to its high-speed propagation and large operation bandwidth. Since the invention of the first laser in 1960, the development of optical computing technologies has presented both challenges and opportunities. Recent advances in artificial intelligence over the past decade have opened up new horizons for optical computing applications. This study presents an end-to-end truth table direct mapping approach using on-chip deep diffractive neural network (D2NN) technology to achieve highly parallel optical logic operations. To enable precise logical operations, we propose an on-chip nonlinear solution leveraging the similarity between the hyperbolic tangent (tanh) function and reverse saturable absorption characteristics of quantum dots. We design and demonstrate a 4-bit on-chip D2NN full adder circuit. The simulation results show that the proposed architecture achieves 100% accuracy for 4-bit full adders across the entire dataset. Full article

25 pages, 5363 KB  
Article
Power-Optimized Field-Programmable Gate Array Implementation of Neural Activation Functions Using Continued Fractions for AI/ML Workloads
by Chanakya Hingu, Xingang Fu, Taofiki Saliyu, Rui Hu and Ramkrishna Mishan
Electronics 2024, 13(24), 5026; https://doi.org/10.3390/electronics13245026 - 20 Dec 2024
Cited by 3 | Viewed by 1451
Abstract
The increasing demand for energy-efficient hardware platforms to support artificial intelligence (AI) and machine learning (ML) algorithms in edge computing has driven the adoption of system-on-chip (SoC) architectures. Implementing neural network (NN) activation functions, such as the hyperbolic tangent (tanh), on hardware presents challenges due to computational complexity, high resource requirements, and power consumption. This paper aims to optimize the hardware implementation of the tanh function using continued fraction and polynomial approximations to minimize resource consumption and power usage while preserving computational accuracy. Five models of the tanh function, including continued fraction and quadratic approximations, were implemented on Intel field-programmable gate arrays (FPGAs) using VHDL and Intel’s ALTFP toolbox, with 32-bit floating-point outputs validated against MATLAB’s 64-bit floating-point results. Detailed analyses of resource utilization, power optimization, clock latency, and bit-level accuracy were conducted, focusing on minimizing logic elements and digital signal processing (DSP) blocks while achieving high precision and low power consumption. The most optimized model was further integrated into a four-input, two-output recurrent neural network (RNN) structure to assess real-time performance. Experimental results demonstrate that the continued fraction-based models significantly reduce resource usage, computation time, and power consumption, enhancing FPGA performance for AI/ML applications in resource-constrained and power-sensitive environments. Full article
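The continued-fraction route to tanh that this abstract describes can be sketched in software. Lambert's continued fraction is the classic form of the idea; the paper's FPGA models, 32-bit datapaths, and truncation depths are not reproduced here:

```python
import math

def tanh_cf(x: float, depth: int = 10) -> float:
    """Lambert's continued fraction tanh(x) = x / (1 + x^2/(3 + x^2/(5 + ...))),
    evaluated bottom-up with 'depth' partial denominators 1, 3, 5, ...
    A software sketch of the continued-fraction idea only; hardware versions
    trade depth against area and latency."""
    x2 = x * x
    acc = 2.0 * depth + 1.0             # innermost partial denominator
    for k in range(depth - 1, 0, -1):
        acc = 2.0 * k + 1.0 + x2 / acc
    return x / (1.0 + x2 / acc)
```

At `depth = 10` the approximation agrees with `math.tanh` to better than 1e-9 on [-2, 2], which is why a short, multiplier-light recurrence of this shape is attractive on an FPGA.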

20 pages, 8952 KB  
Article
Research on High-Frequency Torsional Oscillation Identification Using TSWOA-SVM Based on Downhole Parameters
by Tao Zhang, Wenjie Zhang, Zhuoran Meng, Jun Li and Miaorui Wang
Processes 2024, 12(10), 2153; https://doi.org/10.3390/pr12102153 - 2 Oct 2024
Cited by 2 | Viewed by 2430
Abstract
The occurrence of downhole high-frequency torsional oscillations (HFTO) can lead to significant damage to drilling tools and can adversely affect drilling efficiency. Therefore, establishing a reliable HFTO identification model is crucial. This paper proposes an improved whale-optimization-algorithm-based support vector machine (TSWOA-SVM) for accurate HFTO identification. Initially, the population is initialized using Fuch chaotic mapping and a reverse learning strategy to enhance population quality and accelerate the convergence of the whale optimization algorithm (WOA). Subsequently, the hyperbolic tangent function is introduced to dynamically adjust the inertia weight coefficient, balancing the global search and local exploration capabilities of the WOA. A simulated annealing strategy is incorporated to guide the population in accepting suboptimal solutions with a certain probability, based on the Metropolis criterion and temperature, ensuring the algorithm can escape local optima. Finally, the optimized whale optimization algorithm is applied to enhance the support vector machine, leading to the establishment of the HFTO identification model. Experimental results demonstrate that the TSWOA-SVM model significantly outperforms the genetic algorithm SVM (GA-SVM), gray wolf optimizer SVM (GWO-SVM), and whale optimization algorithm SVM (WOA-SVM) models in HFTO identification, achieving a classification accuracy exceeding 97%. A 5-fold cross-validation experiment further showed that the TSWOA-SVM model had the highest average accuracy and the smallest accuracy variance. Overall, the non-parametric TSWOA-SVM algorithm effectively mitigates uncertainties introduced by modeling errors and enhances the accuracy and speed of HFTO identification. By integrating advanced optimization techniques, this method minimizes the influence of initial parameter values and balances global exploration with local exploitation.
The findings of this study can serve as a practical guide for managing near-bit states and optimizing drilling parameters. Full article
(This article belongs to the Special Issue Condition Monitoring and the Safety of Industrial Processes)
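The tanh-based inertia weight mentioned in this abstract can be sketched as follows. The listing does not give the paper's exact schedule, so the functional form and every constant below are assumed for illustration only:

```python
import math

def inertia_weight(t: int, t_max: int, w_min: float = 0.4, w_max: float = 0.9) -> float:
    """Hypothetical tanh-based inertia schedule: close to w_max early in the
    run (favoring global search), decaying smoothly toward w_min (favoring
    local exploitation). The bounds 0.4/0.9 and the factor 3.0 are assumed
    values, not the paper's."""
    return w_min + (w_max - w_min) * (1.0 - math.tanh(3.0 * t / t_max))
```

The appeal of tanh here is the smooth, saturating transition: the weight stays high for a while, then rolls off without the kink a piecewise-linear schedule would introduce.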

16 pages, 1568 KB  
Article
A Neural-Network-Based Watermarking Method Approximating JPEG Quantization
by Shingo Yamauchi and Masaki Kawamura
J. Imaging 2024, 10(6), 138; https://doi.org/10.3390/jimaging10060138 - 6 Jun 2024
Cited by 3 | Viewed by 2687
Abstract
We propose a neural-network-based watermarking method that introduces a quantized activation function approximating the quantization step of JPEG compression. Many neural-network-based watermarking methods have been proposed. Conventional methods acquire robustness against various attacks by introducing an attack simulation layer between the embedding network and the extraction network, in which the quantization process of JPEG compression is replaced by a noise addition process. In this paper, we propose a quantized activation function that can simulate the JPEG quantization standard directly, in order to improve robustness against JPEG compression. Our quantized activation function consists of several hyperbolic tangent functions and is applied as an activation function for neural networks. Our network was introduced into the attack layer of ReDMark, proposed by Ahmadi et al., to compare it with their method; that is, the embedding and extraction networks had the same structure. We compared ordinary JPEG-compressed images with images produced by the quantized activation function. The results showed that a network with quantized activation functions can approximate JPEG compression with high accuracy. We also compared the bit error rate (BER) of estimated watermarks generated by our network with those generated by ReDMark, and found that our network produced estimated watermarks with lower BERs than those of ReDMark. Therefore, our network outperformed the conventional method with respect to image quality and BER. Full article
(This article belongs to the Special Issue Robust Deep Learning Techniques for Multimedia Forensics and Security)
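The idea of building a quantization staircase out of several hyperbolic tangent functions can be sketched directly. The construction below is a generic differentiable rounding made of shifted tanh terms, not ReDMark's layer or the paper's exact function; the range and steepness are assumed:

```python
import math

def soft_round(x: float, lo: int = -8, hi: int = 8, s: float = 30.0) -> float:
    """Differentiable staircase built from shifted tanh terms, approximating
    round(x) on [lo, hi]. Each term 0.5*(1 + tanh(s*(x - k - 0.5))) switches
    from 0 to 1 as x crosses the step at k + 0.5; summing them and offsetting
    by lo reproduces the integer staircase while keeping gradients finite."""
    acc = float(lo)
    for k in range(lo, hi):
        acc += 0.5 * (1.0 + math.tanh(s * (x - k - 0.5)))
    return acc

def soft_quantize(coeff: float, q: float) -> float:
    """JPEG-style quantization round(coeff / q) * q with the smooth staircase,
    so the operation can sit inside a trainable attack layer."""
    return soft_round(coeff / q) * q
```

Because every step is a tanh, the whole staircase is differentiable, which is what lets the attack layer backpropagate through a realistic JPEG-style quantizer instead of an additive-noise surrogate.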

20 pages, 1513 KB  
Article
Design of Hardware IP for 128-Bit Low-Latency Arcsinh and Arccosh Functions
by Junfeng Chang and Mingjiang Wang
Electronics 2023, 12(22), 4658; https://doi.org/10.3390/electronics12224658 - 15 Nov 2023
Cited by 1 | Viewed by 1629
Abstract
With the rapid development of technologies like artificial intelligence, high-performance computing chips are playing an increasingly vital role. The inverse hyperbolic sine and inverse hyperbolic cosine functions are of utmost importance in fields such as image deblurring and robot joint control. Therefore, there is an urgent need for research into high-precision, high-performance hardware Intellectual Property (IP) for the arcsinh and arccosh functions. To address this issue, this paper introduces a novel 128-bit low-latency floating-point hardware IP for arcsinh and arccosh, employing an enhanced Coordinate Rotation Digital Computer (CORDIC) algorithm and achieving a computation precision of 113 bits in just 32 computation cycles. This significantly enhances computational efficiency while reducing hardware implementation latency. The results indicate that, when compared to Python standard results, the calculated error of the proposed hardware IP does not exceed 8×10^-34. Furthermore, this paper synthesizes the completed IP using the TSMC 65 nm process, with a total IP area of 2.1056 mm². Operating at a frequency of 300 MHz, its power is 22.4 mW. Finally, hardware implementation and resource analysis are conducted and compared on a Field-Programmable Gate Array (FPGA). The results show that the improved algorithm trades a slight area increase for lower latency and higher accuracy. The designed hardware IP is expected to provide a more accurate and efficient computational tool for applications like image processing, thereby advancing technological development. Full article
(This article belongs to the Section Circuit and Signal Processing)
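A software reference for the two functions this IP computes is the standard logarithmic identity; a check against "Python standard results", as the abstract describes, can be built on this form (the IP itself uses an enhanced CORDIC, not a logarithm):

```python
import math

def asinh_ref(x: float) -> float:
    """Reference identity arcsinh(x) = ln(x + sqrt(x^2 + 1)).
    Matches math.asinh at double precision for moderate x."""
    return math.log(x + math.sqrt(x * x + 1.0))

def acosh_ref(x: float) -> float:
    """arccosh(x) = ln(x + sqrt(x^2 - 1)), defined for x >= 1."""
    return math.log(x + math.sqrt(x * x - 1.0))
```

Note that double precision carries only 53 mantissa bits, so a check at the IP's stated 8×10^-34 error level would need 113-bit (quad-precision) arithmetic, e.g. Python's `mpmath`; the naive log form also loses accuracy for large negative arguments of arcsinh.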

17 pages, 347 KB  
Article
Design of Hardware Accelerators for Optimized and Quantized Neural Networks to Detect Atrial Fibrillation in Patch ECG Device with RISC-V
by Ingo Hoyer, Alexander Utz, André Lüdecke, Holger Kappert, Maurice Rohr, Christoph Hoog Antink and Karsten Seidl
Sensors 2023, 23(5), 2703; https://doi.org/10.3390/s23052703 - 1 Mar 2023
Cited by 8 | Viewed by 4372
Abstract
Atrial Fibrillation (AF) is one of the most common heart arrhythmias. It is known to cause up to 15% of all strokes. In current times, modern detection systems for arrhythmias, such as single-use patch electrocardiogram (ECG) devices, have to be energy efficient, small, and affordable. In this work, specialized hardware accelerators were developed. First, an artificial neural network (NN) for the detection of AF was optimized. Special attention was paid to the minimum requirements for the inference on a RISC-V-based microcontroller. Hence, a 32-bit floating-point-based NN was analyzed. To reduce the silicon area needed, the NN was quantized to an 8-bit fixed-point datatype (Q7). Based on this datatype, specialized accelerators were developed. Those accelerators included single-instruction multiple-data (SIMD) hardware as well as accelerators for activation functions such as sigmoid and hyperbolic tangents. To accelerate activation functions that require the e-function as part of their computation (e.g., softmax), an e-function accelerator was implemented in the hardware. To compensate for the losses of quantization, the network was expanded and optimized for run-time and memory requirements. The resulting NN has a 7.5% lower run-time in clock cycles (cc) without the accelerators and 2.2 percentage points (pp) lower accuracy compared to a floating-point-based net, while requiring 65% less memory. With the specialized accelerators, the inference run-time was lowered by 87.2% while the F1-Score decreased by 6.1 pp. Implementing the Q7 accelerators instead of the floating-point unit (FPU), the silicon area needed for the microcontroller in 180 nm-technology is below 1 mm2. Full article
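The Q7 datatype mentioned above is a signed 8-bit fixed-point format with 7 fractional bits. A minimal sketch of quantization and a Q7 multiply follows; the saturation and floor-rounding choices are conventional ones, and the paper's accelerators may differ:

```python
def float_to_q7(x: float) -> int:
    """Quantize a float in [-1, 1) to Q7: a signed 8-bit fixed-point format
    with 7 fractional bits, i.e. value = q / 128. Saturates instead of
    wrapping on overflow."""
    q = int(round(x * 128.0))
    return max(-128, min(127, q))

def q7_to_float(q: int) -> float:
    return q / 128.0

def q7_mul(a: int, b: int) -> int:
    """Q7 x Q7 multiply: the 16-bit product has 14 fractional bits; an
    arithmetic right shift by 7 (floor rounding) returns it to Q7,
    again with saturation."""
    return max(-128, min(127, (a * b) >> 7))
```

For example, `float_to_q7(0.5)` gives 64, and `q7_mul(64, 64)` gives 32, i.e. 0.25. Shrinking weights and activations to this format is what removes the need for an FPU and keeps the accelerator datapaths to 8/16 bits.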

25 pages, 2455 KB  
Article
An Optimized Method for Nonlinear Function Approximation Based on Multiplierless Piecewise Linear Approximation
by Hongjiang Yu, Guoshun Yuan, Dewei Kong, Lei Lei and Yuefeng He
Appl. Sci. 2022, 12(20), 10616; https://doi.org/10.3390/app122010616 - 20 Oct 2022
Cited by 8 | Viewed by 2850
Abstract
In this paper, we propose an optimized method for nonlinear function approximation based on multiplierless piecewise linear approximation computation (ML-PLAC), which we call OML-PLAC. OML-PLAC finds the minimum number of segments with the predefined fractional bit width of input/output, maximum number of shift-and-add operations, user-defined widths of intermediate data, and maximum absolute error (MAE). In addition, OML-PLAC minimizes the actual MAE as much as possible by iterating. As a result, under the condition of satisfying the maximum number of segments, the MAE can be minimized. Tree-cascaded 2-input and 3-input multiplexers are used to replace multi-input multiplexers in hardware architecture as well, reducing the depth of the critical path. The optimized method is applied to logarithmic, antilogarithmic, hyperbolic tangent, sigmoid and softsign functions. The results of the implementation prove that OML-PLAC has better performance than the current state-of-the-art method. Full article
(This article belongs to the Section Electrical, Electronics and Communications Engineering)
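The multiplierless idea, piecewise-linear segments whose slopes are sums of powers of two so that only shifts and adds are needed, can be sketched for tanh. The segment count and breakpoints below are hand-picked for illustration; OML-PLAC searches for them automatically under a user-set maximum absolute error:

```python
import math

def pwl_tanh(x: float) -> float:
    """Hypothetical 5-segment multiplierless tanh: every slope is a sum of
    powers of two (x/2, x/4, x/16 are single shifts in hardware), so a
    datapath needs only shift-and-add, no multiplier. Segments are chosen
    by hand here and meet continuously at the breakpoints."""
    s, x = (-1.0, -x) if x < 0 else (1.0, x)    # tanh is odd: fold to x >= 0
    if x < 0.5:
        y = x - x / 16                          # slope 1 - 2^-4
    elif x < 1.0:
        y = 0.1875 + x / 2 + x / 16             # slope 2^-1 + 2^-4
    elif x < 1.5:
        y = 0.4375 + x / 4 + x / 16             # slope 2^-2 + 2^-4
    elif x < 2.5:
        y = 0.8125 + x / 16                     # slope 2^-4
    else:
        y = 1.0                                 # saturation region
    return s * y
```

This hand-built example stays within about 0.03 of tanh; the point of the paper's optimization is to hit a prescribed maximum absolute error with the fewest such segments and shift-and-add terms.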

24 pages, 6083 KB  
Article
Learning Hyperbolic Embedding for Phylogenetic Tree Placement and Updates
by Yueyu Jiang, Puoya Tabaghi and Siavash Mirarab
Biology 2022, 11(9), 1256; https://doi.org/10.3390/biology11091256 - 24 Aug 2022
Cited by 14 | Viewed by 5178
Abstract
Phylogenetic placement, used widely in ecological analyses, seeks to add a new species to an existing tree. A deep learning approach was previously proposed to estimate the distance between query and backbone species by building a map from gene sequences to a high-dimensional space that preserves species tree distances. They then use a distance-based placement method to place the queries on that species tree. In this paper, we examine the appropriate geometry for faithfully representing tree distances while embedding gene sequences. Theory predicts that hyperbolic spaces should provide a drastic reduction in distance distortion compared to the conventional Euclidean space. Nevertheless, hyperbolic embedding imposes its own unique challenges related to arithmetic operations, exponentially-growing functions, and limited bit precision, and we address these challenges. Our results confirm that hyperbolic embeddings have substantially lower distance errors than Euclidean space. However, these better-estimated distances do not always lead to better phylogenetic placement. We then show that the deep learning framework can be used not just to place on a backbone tree but to update it to obtain a fully resolved tree. With our hyperbolic embedding framework, species trees can be updated remarkably accurately with only a handful of genes. Full article
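The paper's embedding model and training procedure are not reproduced here, but the distance function at the heart of such embeddings has a closed form in the Poincaré ball model commonly used for hyperbolic space:

```python
import math

def poincare_dist(u, v):
    """Geodesic distance in the Poincare ball model of hyperbolic space:
    d(u, v) = arccosh(1 + 2|u - v|^2 / ((1 - |u|^2)(1 - |v|^2))).
    Distances blow up near the boundary |u| -> 1, which is what lets a
    low-dimensional hyperbolic space embed tree metrics with little
    distortion, and also why limited bit precision becomes a challenge."""
    duv = sum((a - b) ** 2 for a, b in zip(u, v))
    nu = 1.0 - sum(a * a for a in u)
    nv = 1.0 - sum(b * b for b in v)
    return math.acosh(1.0 + 2.0 * duv / (nu * nv))
```

From the origin the distance reduces to 2·artanh(r) for a point at radius r, so points a fixed Euclidean step apart grow arbitrarily far apart hyperbolically as they approach the boundary, mirroring the exponential growth of nodes in a tree.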

20 pages, 2148 KB  
Article
Low-Latency and Minor-Error Architecture for Parallel Computing X^Y-like Functions with High-Precision Floating-Point Inputs
by Ming Liu, Wenjia Fu and Jincheng Xia
Electronics 2022, 11(1), 69; https://doi.org/10.3390/electronics11010069 - 27 Dec 2021
Cited by 8 | Viewed by 3116
Abstract
This paper proposes a novel architecture for the computation of X^Y-like functions based on the QH CORDIC (Quadruple-Step-Ahead Hyperbolic Coordinate Rotation Digital Computer) methodology. The proposed architecture converts direct computation of the function X^Y into logarithm, multiplication, and exponent operations. The QH CORDIC methodology is a parallel variant of the traditional CORDIC algorithm; traditional CORDIC suffers from long latency and large area, while the QH CORDIC has much lower latency. The computation of the functions ln x and e^x is accomplished with the QH CORDIC. To overcome the QH CORDIC's limited range of convergence, this paper employs two specific techniques to enlarge the range of convergence for ln x and e^x, making it possible to handle high-precision floating-point inputs. Hardware modeling of the function X^Y using the QH CORDIC is presented in this paper. Under the TSMC 65 nm standard cell library, this paper designs and synthesizes a reference circuit. The ASIC implementation results show that the proposed architecture achieves maximum relative error and average relative error roughly 30 orders of magnitude better than the state-of-the-art. On top of that, the proposed architecture is also superior to the state-of-the-art in terms of latency, word length, and energy efficiency (power × latency × period / efficient bits). Full article
(This article belongs to the Section Computer Science & Engineering)
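The decomposition this architecture pipelines can be stated in three lines: t = ln(x), p = y·t, result = e^p. In the sketch below, `math.log` and `math.exp` stand in for the two QH CORDIC evaluations:

```python
import math

def pow_via_ln_exp(x: float, y: float) -> float:
    """X^Y decomposed into the three stages the abstract names: a logarithm,
    a multiplication, and an exponential. The two transcendental stages are
    the ones the paper maps onto the QH CORDIC."""
    assert x > 0.0, "the logarithm stage requires a positive base"
    return math.exp(y * math.log(x))
```

The decomposition also explains the range-of-convergence problem the abstract mentions: the CORDIC kernels for ln and exp only converge on limited argument intervals, so floating-point inputs must first be range-reduced into those intervals.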

18 pages, 4038 KB  
Article
Low-Latency Hardware Implementation of High-Precision Hyperbolic Functions sinh x and cosh x Based on Improved CORDIC Algorithm
by Wenjia Fu, Jincheng Xia, Xu Lin, Ming Liu and Mingjiang Wang
Electronics 2021, 10(20), 2533; https://doi.org/10.3390/electronics10202533 - 17 Oct 2021
Cited by 16 | Viewed by 3989
Abstract
The CORDIC algorithm is widely used in low-cost hardware implementations to calculate transcendental functions. This paper proposes a low-latency, high-precision architecture for the computation of the hyperbolic functions sinh x and cosh x based on an improved CORDIC algorithm, the QH-CORDIC. The principle, structure, and range of convergence of the QH-CORDIC are discussed, and the hardware circuit architecture for sinh x and cosh x using the QH-CORDIC is presented. The proposed architecture is implemented using an FPGA device, showing that its latency is only 75% and 50% of that of the two latest prior works. In synthesis using the TSMC 65 nm standard cell library, ASIC implementation results show that the proposed architecture is also superior to the two latest prior works in terms of total time (latency × period), ATP (area × total time), total energy (power × total time), energy efficiency (total energy / efficient bits), and area efficiency (efficient bits / area / total time). Comparison with related works indicates that the proposed architecture is much more favorable for high-precision floating-point computation of sinh x and cosh x than the LUT method, stochastic computing, and other CORDIC algorithms. Full article
(This article belongs to the Section Computer Science & Engineering)
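The underlying hyperbolic CORDIC iteration, in its conventional one-rotation-per-step form rather than the paper's quadruple-step QH-CORDIC, can be sketched as follows:

```python
import math

def cordic_sinh_cosh(theta: float, n: int = 24):
    """Hyperbolic CORDIC in rotation mode. Iterations k = 4 and k = 13 are
    repeated, as the classic convergence argument requires; valid for
    |theta| < ~1.118. Each step is a shift-and-add in hardware; the
    pre-scaling by 1/gain absorbs the cumulative sqrt(1 - 2^-2k) factors."""
    ks = list(range(1, n + 1))
    for r in (4, 13):                     # repeat these indices
        if r <= n:
            ks.insert(ks.index(r), r)
    gain = 1.0
    for k in ks:
        gain *= math.sqrt(1.0 - 2.0 ** (-2 * k))
    x, y, z = 1.0 / gain, 0.0, theta      # start on the unit hyperbola
    for k in ks:
        d = 1.0 if z >= 0.0 else -1.0     # rotate toward z = 0
        x, y, z = (x + d * y * 2.0 ** -k,
                   y + d * x * 2.0 ** -k,
                   z - d * math.atanh(2.0 ** -k))
    return y, x                           # (sinh theta, cosh theta)
```

With n = 24 iterations the result matches `math.sinh`/`math.cosh` to roughly 1e-7 for |theta| ≤ 1; the QH-CORDIC's contribution is retiring four such micro-rotations per cycle to cut the latency.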

21 pages, 13331 KB  
Article
Palm Date Leaf Clipping: A New Method to Reduce PAPR in OFDM Systems
by Brahim Bakkas, Reda Benkhouya, Idriss Chana and Hussain Ben-Azza
Information 2020, 11(4), 190; https://doi.org/10.3390/info11040190 - 1 Apr 2020
Cited by 16 | Viewed by 5869
Abstract
Orthogonal frequency division multiplexing (OFDM) is the key technology used in high-speed communication systems. One of the major drawbacks of OFDM systems is the high peak-to-average power ratio (PAPR) of the transmitted signal. The transmitted signal with a high PAPR requires a very large linear range of the Power Amplifier (PA) on the transmitter side. In this paper, we propose and study a new clipping method named Palm Clipping (Palm date leaf) based on hyperbolic cosine. To evaluate and analyze its performance in terms of the PAPR and Bit Error Rate (BER), we performed some computer simulations by varying the Clipping Ratio (CR) and modulation schemes. The obtained results show that it is possible to achieve a gain of between 7 and 9 dB in terms of PAPR reduction depending on the type of modulation. In addition, comparison with several techniques in terms of PAPR and BER shows that our method is a strong alternative that can be adopted as a PAPR reduction technique for OFDM-based communication systems. Full article
(This article belongs to the Section Information and Communications Technology)

18 pages, 798 KB  
Article
Fast Approximations of Activation Functions in Deep Neural Networks when using Posit Arithmetic
by Marco Cococcioni, Federico Rossi, Emanuele Ruffaldi and Sergio Saponara
Sensors 2020, 20(5), 1515; https://doi.org/10.3390/s20051515 - 10 Mar 2020
Cited by 30 | Viewed by 6571
Abstract
With increasing real-time constraints being placed on the use of Deep Neural Networks (DNNs), there is a need to review how information is represented. A very challenging path is to employ an encoding that allows fast processing and a hardware-friendly representation of information. Among the proposed alternatives to the IEEE 754 standard for floating-point representation of real numbers, the recently introduced Posit format has been theoretically shown to be promising in satisfying these requirements. However, in the absence of proper hardware support for this novel type, evaluation can be conducted only through software emulation. While waiting for the widespread availability of Posit Processing Units (the equivalent of the Floating Point Unit (FPU)), we can already exploit the Posit representation and the currently available Arithmetic-Logic Unit (ALU) to speed up DNNs by manipulating the low-level bit string representations of Posits. As a first step, in this paper, we present new arithmetic properties of the Posit number system with a focus on the configuration with 0 exponent bits. In particular, we propose a new class of Posit operators called L1 operators, consisting of fast, approximated versions of existing arithmetic operations and functions (e.g., hyperbolic tangent (TANH) and extended linear unit (ELU)) that use only integer arithmetic. These operators introduce very interesting properties and results: (i) faster evaluation than the exact counterpart with negligible accuracy degradation; (ii) an efficient ALU emulation of a number of Posit operations; and (iii) the possibility to vectorize operations on Posits using existing vectorized ALU operations (such as the Scalable Vector Extension of ARM CPUs or Advanced Vector Extensions on Intel CPUs).
As a second step, we test the proposed activation functions on Posit-based DNNs, showing that 16-bit down to 10-bit Posits are an exact replacement for 32-bit floats, while 8-bit Posits can be an interesting alternative to 32-bit floats: their accuracy is slightly lower, but their high speed and low storage requirements are very appealing (leading to lower bandwidth demand and more cache-friendly code). Finally, we point out that small Posits (i.e., up to 14 bits long) are very interesting while waiting for PPUs to become widespread, since Posit operations can be tabulated very efficiently (see details in the text). Full article
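One of the identities such fast operators lean on is that tanh is an affine rescaling of the sigmoid. The float version below only illustrates the identity; the paper's L1 operators compute a fast approximate sigmoid by manipulating the bit string of an es = 0 Posit with integer-only ALU instructions, which is not reproduced here:

```python
import math

def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

def tanh_from_sigmoid(x: float) -> float:
    """Exact identity tanh(x) = 2 * sigmoid(2x) - 1. Given any fast sigmoid
    kernel (bit-level or otherwise), tanh comes for free via one doubling,
    one scaling, and one subtraction."""
    return 2.0 * sigmoid(2.0 * x) - 1.0
```

The practical point is composability: once one activation admits a cheap integer-arithmetic approximation, identities of this kind extend it to a family of activations without new kernels.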

13 pages, 2599 KB  
Article
A New Simplified Model and Parameter Estimations for a HfO2-Based Memristor
by Valeri Mladenov
Technologies 2020, 8(1), 16; https://doi.org/10.3390/technologies8010016 - 7 Mar 2020
Cited by 2 | Viewed by 4950
Abstract
The purpose of this paper was to propose a complete analysis and parameter estimation of a new simplified and highly nonlinear hafnium dioxide memristor model that is appropriate for high-frequency signals. For the simulations, a nonlinear window function previously offered by the author, together with a highly nonlinear memristor model, was used. This model was tuned according to an experimentally recorded current–voltage relationship of a HfO2 memristor. This study offered an estimation of the optimal model parameters using a least-squares algorithm in SIMULINK and a methodology for adjusting the model by varying its parameters over broad ranges. The optimal values of the memristor model parameters were obtained by minimizing the error between the experimental and simulated current–voltage characteristics. A comparison of the obtained errors between the simulated and experimental current–voltage relationships was made; the error derived by the optimization algorithm was slightly lower than that obtained by the proposed methodology. To avoid convergence problems, the step function in the considered model was replaced by a differentiable hyperbolic tangent function. A PSpice library model of the HfO2 memristor based on its mathematical model was created. The considered model was successfully applied and tested in a multilayer memristor neural network with bridge memristor–resistor synapses. Full article
(This article belongs to the Special Issue MOCAST 2019: Modern Circuits and Systems Technologies on Electronics)
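The replacement of the step function by a hyperbolic tangent, mentioned at the end of this abstract, is a standard smoothing device. A minimal sketch, with an assumed steepness constant:

```python
import math

def smooth_step(x: float, k: float = 50.0) -> float:
    """Differentiable stand-in for the Heaviside step in a memristor state
    equation: 0.5 * (1 + tanh(k * x)) approaches step(x) as k grows. The
    steepness k = 50 is an assumed value, not taken from the paper."""
    return 0.5 * (1.0 + math.tanh(k * x))
```

This substitution is what removes solver convergence problems of the kind the abstract mentions: the derivative exists everywhere, so SPICE-style integrators no longer stall at the discontinuity, at the cost of a narrow transition band of width ~1/k.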
