
J. Low Power Electron. Appl., Volume 12, Issue 1 (March 2022) – 18 articles

  • Issues are regarded as officially published after their release is announced to the table of contents alert mailing list.
  • You may sign up for e-mail alerts to receive table of contents of newly released issues.
  • Papers are published in both HTML and PDF forms; the PDF is the official format. To view a paper in PDF form, click on the "PDF Full-text" link and open it with the free Adobe Reader.
Article
Towards Integration of a Dedicated Memory Controller and Its Instruction Set to Improve Performance of Systems Containing Computational SRAM
J. Low Power Electron. Appl. 2022, 12(1), 18; https://doi.org/10.3390/jlpea12010018 - 16 Mar 2022
Cited by 1 | Viewed by 1550
Abstract
In-memory computing (IMC) aims to close the performance gap between the CPU and memories introduced by the memory wall. However, it does not address the energy wall problem caused by data transfer across the memory hierarchy. This paper proposes the data-locality management unit (DMU) to efficiently transfer data from a DRAM memory to a computational SRAM (C-SRAM) memory that allows IMC operations. The DMU is tightly coupled to the C-SRAM and aligns the data structure so that effective in-memory computation can be performed. We propose a dedicated instruction set within the DMU to issue data transfers. Compared to a reference scalar system architecture, a system integrating C-SRAM with the DMU improves the speed-up from ×5.73 to ×11.01 and the energy reduction from ×29.49 to ×46.67, relative to a system integrating C-SRAM without any transfer mechanism. Full article
(This article belongs to the Special Issue Low Power Memory/Memristor Devices and Systems)
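The alignment role the abstract describes for the DMU can be illustrated with a toy gather (all names below — dram, csram_row, gather — are invented for illustration, not the paper's ISA): a strided operand is copied out of a flat "DRAM" array into one contiguous "C-SRAM" row so that a vector-style in-memory operation sees aligned data.

```python
# Toy model of a DMU-style strided gather: copy a matrix column
# (stride-N in flat "DRAM") into one contiguous "C-SRAM" row, so an
# in-memory vector operation can process it in a single aligned pass.
# All names are illustrative, not the paper's hardware or ISA.

def gather(dram, base, stride, count):
    """Collect `count` elements starting at `base`, `stride` apart."""
    return [dram[base + i * stride] for i in range(count)]

# 4x4 row-major matrix flattened into "DRAM": element (r, c) lives at r*4 + c.
dram = list(range(16))

# Align column 2 into a contiguous C-SRAM row.
csram_row = gather(dram, base=2, stride=4, count=4)
print(csram_row)    # column 2 of the matrix
```

In the actual system the equivalent transfer would be issued through the DMU's dedicated instructions rather than a software loop.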

Article
Implementation of a Fuel Estimation Algorithm Using Approximated Computing
J. Low Power Electron. Appl. 2022, 12(1), 17; https://doi.org/10.3390/jlpea12010017 - 16 Mar 2022
Cited by 2 | Viewed by 1592
Abstract
The rising concerns about global warming have motivated the international community to take remedial actions to lower greenhouse gas emissions. The transportation sector is believed to be one of the largest air polluters. The quantity of greenhouse gas emissions is directly linked to the fuel consumption of vehicles. Eco-driving is an emergent driving style that aims at improving gas mileage. Real-time fuel estimation is a critical feature of eco-driving and eco-routing. There are numerous approaches to fuel estimation. The first approach uses instantaneous values of speed and acceleration, which can be obtained either from GPS data or by direct reading through the OBDII interface. The second approach uses average values of speed and acceleration, which can be measured using historical data or through web mapping. The former cannot be used for route planning; the latter can be used for eco-routing. This paper elaborates on a highly pipelined VLSI architecture for the fuel estimation algorithm. Several high-level transformation techniques have been applied to reduce the complexity of the algorithm. Three competing architectures have been implemented on FPGA and compared: the first uses a binary search algorithm, the second employs a direct address table, and the last uses approximation techniques. The complexity of the algorithm is further reduced by combining approximated computing with precalculation. This approach reduced the floating-point operations by 30% compared with the state-of-the-art implementation. Full article
(This article belongs to the Special Issue Advanced Researches in Embedded Systems)
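The trade-off between the binary-search and direct-address-table architectures mentioned in the abstract can be sketched in software. The fuel-rate curve below is invented for illustration; the paper's actual model is not reproduced.

```python
import bisect

# Hypothetical piecewise fuel-rate model: speed breakpoints (km/h) and
# a fuel rate (L/h) for each speed band. Values are illustrative only.
speeds = [0, 20, 40, 60, 80, 100, 120]
rates  = [0.8, 1.2, 1.9, 2.8, 3.9, 5.3, 7.0]

def rate_binary_search(v):
    """Binary-search architecture: locate the speed band in O(log n)."""
    i = bisect.bisect_right(speeds, v) - 1
    return rates[max(0, min(i, len(rates) - 1))]

# Direct-address-table architecture: precalculate one entry per km/h,
# trading memory for a single O(1) indexed read (no search loop).
table = [rate_binary_search(v) for v in range(0, 131)]

def rate_direct(v):
    return table[min(int(v), 130)]

print(rate_binary_search(65), rate_direct(65))
```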

Article
A 0.5 V Sub-Threshold CMOS Current-Controlled Ring Oscillator for IoT and Implantable Devices
J. Low Power Electron. Appl. 2022, 12(1), 16; https://doi.org/10.3390/jlpea12010016 - 09 Mar 2022
Cited by 1 | Viewed by 2170
Abstract
A current-controlled CMOS ring oscillator topology, which exploits the bulk voltages of the inverter stages as control terminals to tune the oscillation frequency, is proposed and analyzed. The solution can be adopted in sub-1 V applications, as it exploits MOSFETs in the subthreshold regime. Oscillators with 3, 5, and 7 stages, designed in a standard 28 nm technology and supplied at 0.5 V, were simulated. By exploiting a programmable capacitor array, the circuit allows a very large range of oscillation frequencies to be set, from 1 MHz to about 1 GHz, with limited current consumption. Considering, for example, the five-stage topology, a nominal oscillation frequency of 516 MHz is obtained with an average power dissipation of about 29 µW. The solution provides a tuneable oscillation frequency, which can be adjusted from 360 to 640 MHz by controlling the bias current, with a sensitivity of 0.43 MHz/nA. Full article
(This article belongs to the Special Issue Ultra-Low-Power ICs for the Internet of Things)
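The reported figures are consistent with the standard ring-oscillator relation f = 1/(2·N·t_d). A quick back-of-the-envelope check on the five-stage operating point (a sanity check, not the paper's analysis):

```python
# Standard ring-oscillator relation f = 1 / (2 * N * t_d): back out the
# per-stage delay implied by the reported 5-stage, 516 MHz operating
# point. This is a sanity check on the abstract's numbers only.

def stage_delay(freq_hz, n_stages):
    return 1.0 / (2.0 * n_stages * freq_hz)

td = stage_delay(516e6, 5)
print(f"{td * 1e12:.0f} ps per stage")   # ~194 ps
```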

Article
Cooperative Design of Devices and Services to Balance Low Power and User Experience
J. Low Power Electron. Appl. 2022, 12(1), 15; https://doi.org/10.3390/jlpea12010015 - 08 Mar 2022
Cited by 1 | Viewed by 1703
Abstract
CPS (Cyber-Physical Systems) is an approach often adopted for improving real-world activities by utilizing data. It can also be used to improve customer experiences in service applications by analyzing customer behavior captured by sensing devices and by supporting the service providers' use of those data to improve the system. In developing such systems, no method has been established to systematically evaluate the impact of individual component design on the user experience. Knowledge Experience Design is a method for distilling and validating information that affects the quality of the user experience by focusing on user activities and the underlying knowledge. This methodology has been applied to a system for a museum, in which visitor activities are observed by sensing devices to aid the curator's awareness for improving museum services. As a result, a cooperative process for designing devices and the user experience as a service was derived, in which the competing interests of lower power consumption and user experience improvement were reconciled. The proposed design method can be used for the co-design of systems built on the close coordination of hardware devices and software applications to provide value-oriented services to users, which aids the realization of CPS oriented to evaluating and improving such environments. Full article

Article
Silicon-Compatible Memristive Devices Tailored by Laser and Thermal Treatments
J. Low Power Electron. Appl. 2022, 12(1), 14; https://doi.org/10.3390/jlpea12010014 - 02 Mar 2022
Viewed by 1889
Abstract
Nowadays, memristors are of considerable interest to researchers and engineers due to the promise they hold for the creation of power-efficient memristor-based information or computing systems. In particular, this refers to memristive devices based on the resistive switching phenomenon, which in most cases are fabricated in the form of metal–insulator–metal structures. At the same time, the demand for compatibility with the standard complementary metal–oxide–semiconductor fabrication process makes it practically relevant to fabricate memristive devices directly on a silicon or SOI (silicon-on-insulator) substrate. Here we have investigated the electrical characteristics and resistive switching of SiOx- and SiNx-based memristors fabricated on SOI substrates and subjected to additional laser treatment and thermal treatment. The investigated memristors do not require electroforming and demonstrate a synaptic type of resistive switching. It is found that these treatments remarkably improve the resistive switching parameters of SiOx- and SiNx-based memristors on SOI substrates. In particular, the laser treatment gives rise to a significant increase in the hysteresis loop in the I–V curves of SiNx-based memristors. Moreover, for SiOx-based memristors, the thermal treatment applied after the laser treatment produces a notable decrease in the resistive switching voltage. Full article
(This article belongs to the Special Issue Low Power Memory/Memristor Devices and Systems)

Article
A Model for the Evaluation of Monostable Molecule Signal Energy in Molecular Field-Coupled Nanocomputing
J. Low Power Electron. Appl. 2022, 12(1), 13; https://doi.org/10.3390/jlpea12010013 - 01 Mar 2022
Viewed by 1533
Abstract
Molecular Field-Coupled Nanocomputing (FCN) is a computational paradigm promising high-frequency information elaboration at ambient temperature. This work proposes a model to evaluate the signal energy involved in propagating and elaborating the information. It splits the evaluation into several energy contributions calculated with closed-form expressions without computationally expensive calculation. The essential features of the 1,4-diallylbutane cation are evaluated with Density Functional Theory (DFT) and used in the model to evaluate circuit energy. This model enables understanding the information propagation mechanism in the FCN paradigm based on monostable molecules. We use the model to verify the bistable factor theory, describing the information propagation in molecular FCN based on monostable molecules, analyzed so far only from an electrostatic standpoint. Finally, the model is integrated into the SCERPA tool and used to quantify the information encoding stability and possible memory effects. The obtained results are consistent with state-of-the-art considerations and comparable with DFT calculation. Full article

Article
A Tree-Based Architecture for High-Performance Ultra-Low-Voltage Amplifiers
J. Low Power Electron. Appl. 2022, 12(1), 12; https://doi.org/10.3390/jlpea12010012 - 17 Feb 2022
Cited by 6 | Viewed by 1777
Abstract
In this paper, we introduce a novel tree-based architecture which allows the implementation of Ultra-Low-Voltage (ULV) amplifiers. The architecture exploits a body-driven input stage to guarantee a rail-to-rail input common-mode range, and body-diode loading to avoid Miller compensation thanks to the absence of high-impedance internal nodes. The tree-based structure improves the CMRR of the proposed amplifier with respect to conventional OTA architectures and makes it possible to achieve a reasonable CMRR even at supply voltages as low as 0.3 V, without the tail current generators that cannot be used in ULV circuits. The bias currents and the static output voltages of all the stages implementing the architecture are accurately set through the gate terminals of biasing transistors in order to guarantee good robustness against PVT variations. The proposed architecture and the implementing stages are investigated from an analytical point of view, and design equations for the main performance metrics are presented to provide insight into circuit behavior. A 0.3 V supply, subthreshold, ultra-low-power (ULP) OTA based on the proposed tree-based architecture was designed in a commercial 130 nm CMOS process. Simulation results show a dc gain higher than 52 dB with a gain-bandwidth product of about 35 kHz and reasonable values of CMRR and PSRR, even at such low supply voltages and considering mismatches. The power consumption is as low as 21.89 nW, and state-of-the-art small-signal and large-signal FoMs are achieved. Extensive parametric and Monte Carlo simulations show the robustness of the proposed circuit to PVT variations and mismatch. These results confirm that the proposed OTA is a good candidate to implement ULV, ULP, high-performance analog building blocks for directly harvested IoT nodes. Full article
(This article belongs to the Special Issue Ultra-Low-Power ICs for the Internet of Things)

Article
DSCU: Accelerating CNN Inference in FPGAs with Dual Sizes of Compute Unit
J. Low Power Electron. Appl. 2022, 12(1), 11; https://doi.org/10.3390/jlpea12010011 - 13 Feb 2022
Cited by 2 | Viewed by 1776
Abstract
FPGA-based accelerators have shown great potential in improving the performance of CNN inference. However, the existing FPGA-based approaches suffer from a low compute unit (CU) efficiency due to their large number of redundant computations, thus leading to high levels of performance degradation. In this paper, we show that no single CU can perform best across all the convolutional layers (CONV-layers). To this end, we propose the use of dual sizes of compute unit (DSCU), an approach that aims to accelerate CNN inference in FPGAs. The key idea of DSCU is to select the best combination of CUs via dynamic programming scheduling for each CONV-layer and then assemble each CONV-layer combination into a computing solution for the given CNN to deploy in FPGAs. The experimental results show that DSCU can achieve a performance density of 3.36 × 10³ GOPs/slice on a Xilinx Zynq ZU3EG, which is 4.29 times higher than that achieved by other approaches. Full article
(This article belongs to the Special Issue Low Power AI)
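The dynamic-programming flavor of the per-layer CU selection can be sketched with a toy scheduler. All latencies and the reconfiguration penalty below are invented; the paper's cost model is more detailed.

```python
# Toy dynamic-programming scheduler in the spirit of DSCU: pick one of
# two compute-unit sizes per CONV-layer to minimize total latency,
# paying a fixed reconfiguration penalty when consecutive layers
# switch CU size. All numbers are invented for illustration.

LAT = {                     # cycles per layer for each CU size
    "small": [90, 40, 85, 30],
    "large": [50, 70, 45, 60],
}
SWITCH = 15                 # cycles to reconfigure between sizes

def schedule(lat, switch):
    sizes = list(lat)
    n = len(lat[sizes[0]])
    # best[s] = minimal cost of layers 0..i with layer i on CU size s
    best = {s: lat[s][0] for s in sizes}
    for i in range(1, n):
        best = {
            s: lat[s][i] + min(best[p] + (switch if p != s else 0)
                               for p in sizes)
            for s in sizes
        }
    return min(best.values())

print(schedule(LAT, SWITCH))
```

With a zero switching cost the schedule degenerates to picking the per-layer minimum; a nonzero cost is what makes the choice a dynamic program rather than a greedy pick.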

Article
Mapping Transformation Enabled High-Performance and Low-Energy Memristor-Based DNNs
J. Low Power Electron. Appl. 2022, 12(1), 10; https://doi.org/10.3390/jlpea12010010 - 10 Feb 2022
Viewed by 1996
Abstract
When deep neural networks (DNNs) are extensively utilized for edge AI (Artificial Intelligence), for example in the Internet of Things (IoT) and autonomous vehicles, they impose overly large computing loads on conventional CMOS (Complementary Metal Oxide Semiconductor)-based computers. Memristor-based devices are emerging as an option to perform in-memory computing for DNNs, making them faster, much more energy efficient, and accurate. Despite these excellent properties, memristor-based DNNs are not yet commercially available because of Stuck-At-Fault (SAF) defects. A Mapping Transformation (MT) method is proposed in this paper to mitigate SAF defects in memristor-based DNNs. First, the weight distribution of the VGG8 model with the CIFAR10 dataset is presented and analyzed. Then, the MT method is used for recovering inference accuracies at 0.1% to 50% SAFs with two typical cases, SA1 (Stuck-At-One):SA0 (Stuck-At-Zero) = 5:1 and 1:5, respectively. The experimental results show that the MT method can recover DNNs to their original inference accuracies (90%) when the ratio of SAFs is smaller than 2.5%. Moreover, even in the extreme condition of 50% SAFs, it still recovers the inference accuracy to 80% and 21% for the two cases, respectively. What is more, the MT method acts as a regulator to avoid the energy and latency overhead generated by SAFs. Finally, the immunity of the MT method against non-linearity is investigated, and we conclude that the MT method can benefit accuracy, energy, and latency even with high non-linearity (LTP = 4 and LTD = −4). Full article
(This article belongs to the Special Issue Low Power AI)
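The general idea of recovering accuracy by remapping weights around known faulty cells can be illustrated with a toy example. The complement-mapping trick below is a generic sketch, not necessarily the paper's MT method.

```python
# Toy illustration of stuck-at-fault (SAF) mitigation by remapping.
# Each cell stores a conductance in [0, 1]. A faulty cell is stuck at
# 0 (SA0) or 1 (SA1). For every weight row we may store either w or
# its complement 1 - w (undone digitally on readout), and we pick
# whichever mapping loses less under the row's known faults. This is
# a generic sketch, not the specific MT method of the paper.

def readout(stored, faults):
    """Apply SAFs: `faults` maps cell index -> stuck value (0.0 or 1.0)."""
    return [faults.get(i, v) for i, v in enumerate(stored)]

def row_error(w, faults, flip):
    stored = [1 - x for x in w] if flip else list(w)
    seen = readout(stored, faults)
    recovered = [1 - x for x in seen] if flip else seen
    return sum(abs(a - b) for a, b in zip(recovered, w))

w = [0.9, 0.8, 0.1, 0.95]
faults = {0: 0.0, 3: 0.0}        # cells 0 and 3 stuck at zero (SA0)

direct = row_error(w, faults, flip=False)
flipped = row_error(w, faults, flip=True)
print(direct, flipped)           # the flipped mapping suits SA0 here
```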

Article
Fully Differential Miller Op-Amp with Enhanced Large- and Small-Signal Figures of Merit
J. Low Power Electron. Appl. 2022, 12(1), 9; https://doi.org/10.3390/jlpea12010009 - 08 Feb 2022
Cited by 1 | Viewed by 1784
Abstract
A highly power-efficient, fully differential Miller op-amp with an accurately controlled output quiescent current is introduced. The op-amp can drive both capacitive and resistive loads due to the presence of an auxiliary amplifier, which helps to achieve class AB operation of the proposed op-amp. The fully differential auxiliary amplifier is compact, uses a resistive local common-mode feedback network, and consumes only 6% of the total current of the op-amp. The proposed op-amp has several innovative features. Incorporating the auxiliary amplifier improves the unity-gain frequency, power efficiency, slew rate, and common-mode rejection ratio of the proposed op-amp. It can drive a wide range of resistive (200 Ω–1 MΩ) and capacitive (5 pF–300 pF) loads. The op-amp has a large-signal dynamic current efficiency of 8.6 and a large-signal static current efficiency of 7.9. The small-signal figure of merit is 8.7 for RL = 1 MΩ and 7.3 for RL = 200 Ω. Full article

Article
CondenseNeXtV2: Light-Weight Modern Image Classifier Utilizing Self-Querying Augmentation Policies
J. Low Power Electron. Appl. 2022, 12(1), 8; https://doi.org/10.3390/jlpea12010008 - 03 Feb 2022
Viewed by 1652
Abstract
Artificial Intelligence (AI) combines computer science and robust datasets to mimic, to a certain extent, the natural intelligence demonstrated by human beings in problem-solving and decision-making. From Apple's virtual personal assistant, Siri, to Tesla's self-driving cars, research and development in the field of AI is progressing rapidly. At the same time, privacy concerns surrounding the usage and storage of user data on external servers have further fueled the need for modern, ultra-efficient AI networks and algorithms. The work presented in this paper introduces a modern image classifier, a light-weight and ultra-efficient CNN intended to be deployed on local embedded systems, also known as edge devices, for general-purpose usage. This work extends the award-winning paper entitled 'CondenseNeXt: An Ultra-Efficient Deep Neural Network for Embedded Systems' published at the 2021 IEEE 11th Annual Computing and Communication Workshop and Conference (CCWC). The proposed neural network, dubbed CondenseNeXtV2, utilizes a new self-querying augmentation policy technique on the target dataset, along with adaptation to the latest version of the PyTorch framework and activation functions, resulting in improved efficiency and accuracy in image classification. Finally, we deploy the trained weights of CondenseNeXtV2 on the NXP BlueBox, an edge device designed to serve as a development platform for self-driving cars, and draw conclusions accordingly. Full article
(This article belongs to the Special Issue Advanced Researches in Embedded Systems)

Editorial
Acknowledgment to Reviewers of Journal of Low Power Electronics and Applications in 2021
J. Low Power Electron. Appl. 2022, 12(1), 7; https://doi.org/10.3390/jlpea12010007 - 25 Jan 2022
Viewed by 1398
Abstract
Rigorous peer-reviews are the basis of high-quality academic publishing [...] Full article
Article
Hardware/Software Solution for Low Power Evaluation of Tsunami Danger
J. Low Power Electron. Appl. 2022, 12(1), 6; https://doi.org/10.3390/jlpea12010006 - 21 Jan 2022
Cited by 1 | Viewed by 1642
Abstract
Carbon footprint reduction issues have been drawing more and more attention these days, and reducing energy consumption is among the basic directions along this line. This paper concerns a low-energy approach to tsunami danger evaluation. After several disastrous tsunamis of the 21st century, the question arises whether it is possible to evaluate, within a couple of minutes, the tsunami wave parameters expected at a particular geolocation. The point is that it takes around 20 min for the wave to approach the nearest coast after a seismic event offshore of Japan. Currently, the main tool for studying tsunamis is computer modeling. In particular, the expected tsunami height near the coastline, when a major underwater earthquake is detected, can be estimated by a series of numerical experiments covering various scenarios of generation and subsequent wave propagation. Reducing the calculation time of such scenarios, and the energy they consume, is the scope of this study. Moreover, in the case of a major earthquake, an electric power shutdown is possible (e.g., the accident at the Fukushima nuclear power station in Japan on 11 March 2011), so the solution should consume little energy, preferably being based on regular personal computers (PCs) or laptops. The way to achieve the requested performance of numerical modeling on the PC platform is a combination of efficient algorithms and their hardware acceleration. Following this strategy, a solution for the fast numerical simulation of tsunami wave propagation has been proposed. Most tsunami researchers use the shallow-water approximation to simulate tsunami wave propagation in deep-water areas. For the software implementation, the MacCormack finite-difference scheme has been chosen, as it is suitable for pipelining. For hardware code acceleration, a special processor (calculator) has been designed on a field-programmable gate array (FPGA) platform.
This combination was tested for precision by comparison with the reference code and with exact solutions (known for some special cases of the bottom profile). The achieved performance made it possible to calculate the wave propagation over a 1000 × 500 km water area in 1 min (the mesh size was about 250 m). This was nearly 300 times faster than a regular PC and 10 times faster than the use of a central processing unit (CPU). This result, once implemented in tsunami warning systems, will make it possible to reduce human casualties and economic losses from so-called near-field tsunamis. The present paper discusses a new aspect of such an implementation, namely low energy consumption. The corresponding measurements for three platforms (a PC and two types of FPGA) have been performed, and a comparison of the obtained energy consumption results is given. As the numerical simulation of numerous tsunami propagation scenarios from different sources is needed for coastal tsunami zoning, the total energy saving is expected to be substantial. So far, tsunami researchers have not used FPGA-based acceleration of computer code execution; perhaps the energy-saving aspect can promote the use of FPGAs in tsunami research. The approach of designing special FPGA-based processors for the fast solution of various engineering problems on a PC could be extended to other areas, such as bioinformatics (motif search in DNA sequences and other algorithms of genome analysis and molecular dynamics) and seismic data processing (three-dimensional (3D) wave packet decomposition, data compression, noise suppression, etc.). Full article
(This article belongs to the Special Issue Low Power AI)
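The MacCormack scheme named in the abstract is a classic two-step predictor-corrector method. A minimal flat-bottom 1D shallow-water sketch (not the accelerated FPGA pipeline itself) looks like this:

```python
# One MacCormack predictor-corrector step for the 1D shallow-water
# equations: h_t + q_x = 0, q_t + (q^2/h + g*h^2/2)_x = 0, with
# q = h*u. Minimal flat-bottom sketch; boundaries are held fixed.

g = 9.81

def flux(h, q):
    return q, q * q / h + 0.5 * g * h * h

def maccormack_step(h, q, dx, dt):
    n = len(h)
    hp, qp = h[:], q[:]
    # Predictor: forward differences.
    for i in range(n - 1):
        f1a, f2a = flux(h[i], q[i])
        f1b, f2b = flux(h[i + 1], q[i + 1])
        hp[i] = h[i] - dt / dx * (f1b - f1a)
        qp[i] = q[i] - dt / dx * (f2b - f2a)
    hn, qn = h[:], q[:]
    # Corrector: backward differences on the predicted state.
    for i in range(1, n - 1):
        f1a, f2a = flux(hp[i - 1], qp[i - 1])
        f1b, f2b = flux(hp[i], qp[i])
        hn[i] = 0.5 * (h[i] + hp[i] - dt / dx * (f1b - f1a))
        qn[i] = 0.5 * (q[i] + qp[i] - dt / dx * (f2b - f2a))
    return hn, qn

# A lake at rest must stay at rest (a basic consistency check).
h0 = [100.0] * 8
q0 = [0.0] * 8
h1, q1 = maccormack_step(h0, q0, dx=250.0, dt=1.0)
print(max(abs(a - 100.0) for a in h1))
```

The regular predictor/corrector stencil is what makes the scheme attractive for pipelined FPGA implementation, as the abstract notes.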

Article
Design Aspects of a Single-Output Multi-String WLED Driver Using 40 nm CMOS Technology
J. Low Power Electron. Appl. 2022, 12(1), 5; https://doi.org/10.3390/jlpea12010005 - 18 Jan 2022
Cited by 1 | Viewed by 1596
Abstract
This work presents various essential features and design aspects of a single-inductor, common-output, multi-string White Light Emitting Diode (WLED) driver for low-power portable devices. High efficiency is one of the main features of such a device. Here, the efficiency improvement is achieved by selecting the proper arrangement of WLEDs and a proper sensing-circuit technique to determine the minimum real-time output voltage needed. This minimum voltage necessary to activate all WLEDs depends on the number of strings and the forward voltage drops among the WLEDs. Advanced CMOS technology is advantageous in mixed-signal environments such as WLED drivers. However, this process suffers from low on-resistance, which degrades the accuracy of the current sinks. To accommodate the above features and mitigate the low-node process issue, a single-output boost converter loaded with three strings of 6 WLEDs each is presented. The designed driver has an input voltage range of 3.2–4.2 V. The proposed solution is realized with ultra-low-power circuits and verified using ADS tools in 40 nm 1P9M TSMC CMOS technology. An inter-string current accuracy of 0.2% and a peak efficiency of 91% are achieved with an output voltage of up to 25 V. The integrated WLED driver circuitry enables a high switching frequency of 1 MHz and reduces the size of the passive elements in the power stage. Full article

Article
CORDIC Hardware Acceleration Using DMA-Based ISA Extension
J. Low Power Electron. Appl. 2022, 12(1), 4; https://doi.org/10.3390/jlpea12010004 - 15 Jan 2022
Cited by 2 | Viewed by 2008
Abstract
The use of RISC-based embedded processors aimed at low cost and low power is becoming an increasingly popular ecosystem for both hardware and software development. High-performance yet low-power embedded processors may be attained via the use of hardware acceleration and Instruction Set Architecture (ISA) extension. Recent AI publications have demonstrated the use of the Coordinate Rotation Digital Computer (CORDIC) as a dedicated low-power solution for computing the nonlinear functions applied in Neural Networks (NNs). This paper proposes an ISA extension to support floating-point CORDIC, providing efficient hardware acceleration for mathematical functions. A new DMA-based ISA extension approach integrated with a pipelined CORDIC accelerator is proposed. The CORDIC ISA extension is directly interfaced with a standard processor data path, allowing efficient implementation of new trigonometric ALU-based custom instructions. The proposed DMA-based CORDIC accelerator can also perform repeated array calculations, offering a significant speedup over software implementations. The proposed accelerator is evaluated on an Intel Cyclone-IV FPGA as an extension to the Nios processor. Experimental results show a speedup of over three orders of magnitude compared with the software implementation when applied to trigonometric arrays, outperforming the existing commercial CORDIC hardware accelerator. Full article
(This article belongs to the Special Issue Low Power AI)
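The CORDIC kernel being accelerated computes trigonometric functions with shift-and-add iterations. A floating-point software sketch of rotation-mode CORDIC (not the paper's RTL) is:

```python
import math

# Rotation-mode CORDIC computing (cos t, sin t) from a table of
# arctangents of powers of two; each iteration is a shift-and-add
# micro-rotation, the operation the proposed ISA extension pipelines.

N = 32
ANGLES = [math.atan(2.0 ** -i) for i in range(N)]
K = 1.0
for i in range(N):
    K /= math.sqrt(1.0 + 2.0 ** (-2 * i))   # aggregate gain correction

def cordic(theta):
    """theta in radians, within the CORDIC convergence range (~±1.74)."""
    x, y, z = K, 0.0, theta
    for i in range(N):
        d = 1.0 if z >= 0 else -1.0
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        z -= d * ANGLES[i]
    return x, y                              # (cos theta, sin theta)

c, s = cordic(math.pi / 5)
print(c, s)
```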

Article
A Time-Domain z⁻¹ Circuit with Digital Calibration
J. Low Power Electron. Appl. 2022, 12(1), 3; https://doi.org/10.3390/jlpea12010003 - 03 Jan 2022
Cited by 1 | Viewed by 1795
Abstract
This paper presents a novel circuit for the z⁻¹ operation which is suitable, as a basic building block, for time-domain topologies and signal processing. The proposed circuit employs a time register based on the capacitor-discharging method. The large variation of the capacitor discharging slope over the technology process and chip temperature, which affects the z⁻¹ accuracy, is mitigated using a novel digital calibration loop. The circuit is designed in a 28 nm Samsung FD-SOI process under a 1 V supply voltage with a 5 MHz sampling frequency. Simulation results validate the theoretical analysis, presenting a variation of the capacitor voltage discharging slope of less than 5% over worst-case process corners for temperatures between 0 °C and 100 °C, while consuming only 30 μA. Also, the worst-case accuracy of the z⁻¹ operation is better than 33 ps for input pulse widths between 5 ns and 45 ns, a huge improvement compared with the uncalibrated operator. Full article
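The capacitor-discharging time register can be modeled with a toy calculation (all slope and voltage values below are invented, not the designed circuit's) showing both the ideal one-sample delay and why slope variation, the target of the digital calibration loop, corrupts it:

```python
# Toy model of a time register built on the capacitor-discharging
# method: an input pulse width is stored as a voltage drop and read
# back as a time interval, realizing a one-sample delay. All slope
# and voltage values are illustrative only.

V0 = 1.0  # precharge voltage (V)

def write(t_in, slope=0.02):
    """Discharge from V0 at `slope` V/ns for the pulse width t_in (ns)."""
    return V0 - slope * t_in

def read(v_held, slope=0.02):
    """Time (ns) to discharge from V0 down to the held voltage."""
    return (V0 - v_held) / slope

# With matched slopes the pulse width is reproduced exactly; a slope
# mismatch (process/temperature variation) corrupts the stored time,
# which is what a digital calibration loop must correct.
exact = read(write(20.0))                  # matched slopes
drift = read(write(20.0), slope=0.021)     # 5% slope mismatch on read
print(exact, drift)
```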

Article
A Framework for Ultra Low-Power Hardware Accelerators Using NNs for Embedded Time Series Classification
J. Low Power Electron. Appl. 2022, 12(1), 2; https://doi.org/10.3390/jlpea12010002 - 31 Dec 2021
Viewed by 2223
Abstract
In embedded applications that use neural networks (NNs) for classification tasks, it is important to minimize the power consumption not only of the NN calculation but of the whole system. Optimization approaches exist for individual parts, such as quantization of the NN or analog calculation of arithmetic operations. However, there is no holistic approach for a complete embedded system design that is generic enough to be applied to different applications during the design process, yet specific enough in the hardware implementation to waste no energy for a given application. We therefore present a novel framework that enables an end-to-end ASIC implementation of low-power hardware for time series classification using NNs. This includes a neural architecture search (NAS) that jointly optimizes the NN configuration for accuracy and energy efficiency. The optimization targets a custom-designed hardware architecture derived from the key properties of time series classification tasks. Additionally, a hardware generation tool creates a complete system from the definition of the NN. This system uses local multi-level RRAM memory for weight and bias storage to avoid external memory accesses. Exploiting the non-volatility of these devices, the system can enter a power-down mode to save significant energy during data acquisition. Detection of atrial fibrillation (AFib) in electrocardiogram (ECG) data is used as an example to evaluate the framework, achieving a reduction of more than 95% in energy consumption compared to state-of-the-art solutions. Full article
(This article belongs to the Special Issue Hardware for Machine Learning)
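The authors' NAS algorithm is not specified in the abstract. As a toy sketch of the general idea of jointly optimizing accuracy and energy, one can scalarize the two objectives and run a random search over a made-up configuration space; the search space, the evaluation model, and the weighting `alpha` below are all assumptions for illustration:

```python
import random

def joint_score(accuracy, energy_uj, alpha=0.5):
    """Scalarized objective: reward accuracy, penalize energy per inference."""
    return alpha * accuracy - (1 - alpha) * energy_uj

def random_nas(evaluate, search_space, trials=200, seed=0):
    """Random-search NAS: sample configurations, keep the best joint score."""
    rng = random.Random(seed)
    best_cfg, best_score = None, float("-inf")
    for _ in range(trials):
        cfg = {k: rng.choice(v) for k, v in search_space.items()}
        acc, energy = evaluate(cfg)
        score = joint_score(acc, energy)
        if score > best_score:
            best_cfg, best_score = cfg, score
    return best_cfg
```

In a hardware-aware setting, `evaluate` would return measured or modeled accuracy together with an energy estimate for the target ASIC, so the search naturally avoids configurations that are accurate but too power-hungry.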

Article
LoRaWAN Base Station Improvement for Better Coverage and Capacity
J. Low Power Electron. Appl. 2022, 12(1), 1; https://doi.org/10.3390/jlpea12010001 - 30 Dec 2021
Cited by 2 | Viewed by 1862
Abstract
Low Power Wide Area Network (LPWAN) technologies provide long range and low power consumption for many battery-powered devices used in the Internet of Things (IoT). One of the most widely used LPWAN technologies is LoRaWAN (Long Range WAN), with over 700 million connections expected by the year 2023. LoRaWAN base stations need to ensure stable and energy-efficient communication without unnecessary repetitions, while offering sufficient range coverage and good capacity. To meet these requirements, a simple and efficient upgrade to the LoRaWAN base station design is proposed, based on using two or more concentrators. The development steps are outlined in this paper, and the enhanced base station is evaluated through a series of measurements conducted in Zagreb, Croatia, comparing received messages and communication parameters between the novel and standard base stations. The results show a significant increase in the probability of successful message reception at the novel base station, which corresponds to increased base station capacity and can be very beneficial for the energy consumption of most LoRaWAN end devices. Full article
(This article belongs to the Special Issue Advanced Researches in Embedded Systems)
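The abstract does not describe the base-station software. As a hedged illustration of why multiple concentrators raise the probability of successful reception, uplink frames heard by several concentrators can be merged and deduplicated by device address and frame counter, keeping the strongest copy; the field names and log layout below are assumptions for this sketch:

```python
def merge_uplinks(concentrator_logs):
    """Merge frames received by multiple concentrators.

    LoRaWAN uplinks are identified by (device address, frame counter),
    so the same transmission heard twice can be collapsed into one
    frame; the copy with the best RSSI is kept."""
    best = {}
    for log in concentrator_logs:
        for frame in log:
            key = (frame["dev_addr"], frame["fcnt"])
            if key not in best or frame["rssi"] > best[key]["rssi"]:
                best[key] = frame
    return list(best.values())
```

With independent per-concentrator loss probability p, the merged loss probability is roughly p² for two concentrators, which is consistent with the reported gain in reception probability without any change to the end devices.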
