Journal of Low Power Electronics and Applications

JLPEA, Vol. 14, Pages 25: Gate-Level Hardware Priority Resolvers for Embedded Systems

Padmanabhan Balasubramanian — 2024-04-17

Journal of Low Power Electronics and Applications doi: 10.3390/jlpea14020025

Authors: Padmanabhan Balasubramanian Douglas L. Maskell

An N-bit priority resolver having N inputs and N outputs functions as polling hardware in an embedded system, enabling access to a resource when multiple devices initiate access requests at its inputs which may be located on-chip or off-chip. Subsystems such as data buses, comparators, fixed- and floating-point arithmetic units, interconnection network routers, etc., utilize the priority resolver function. In the literature, there are many transistor-level designs for the priority resolver based on dynamic CMOS logic, some of which are modular and others are not. This article presents a novel gate-level modular design of priority resolvers that can accommodate any number of inputs and outputs. Based on our modular design architecture, small-size priority resolvers can be conveniently combined to form medium- or large-size priority resolvers along with extra logic. The proposed modular design approach helps to reduce the coding complexity compared to the conventional direct design approach and facilitates scalability. We discuss the gate-level implementation of 4-, 8-, 16-, 32-, 64-, and 128-bit priority resolvers based on the direct and modular approaches and provide a performance comparison between these based on the design metrics. According to the modular approach, different sizes of priority resolver modules were used to implement larger-size priority resolvers. For example, a 4-bit priority resolver module was used to implement 8-, 16-, 32-, 64-, and 128-bit priority resolvers in a modular fashion. We used a 28 nm CMOS standard digital cell library and Synopsys EDA tools to synthesize the priority resolvers. The estimated design metrics show that the modular approach tends to facilitate increasing reductions in delay and power-delay product (PDP) compared to the direct approach, especially as the size of the priority resolver increases. 
For example, a 32-bit modular priority resolver utilizing 16-bit priority resolver modules had a 39.4% reduced delay and a 23.1% reduced PDP compared to a directly implemented 32-bit priority resolver, and a 128-bit modular priority resolver utilizing 16-bit priority resolver modules had a 71.8% reduced delay and a 61.4% reduced PDP compared to a directly implemented 128-bit priority resolver.
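
The modular idea described above can be illustrated behaviorally. The following is a minimal Python sketch, not the authors' gate-level netlist: it assumes index 0 is the highest-priority input, and the function names `resolve_direct` and `resolve_modular` are hypothetical. Each small module resolves its own slice, and the "extra logic" is modeled as an enable that masks a module whenever a higher-priority module has seen a request.

```python
def resolve_direct(requests):
    """Direct N-bit priority resolver: grant only the first active request.

    Assumption: index 0 is the highest-priority input.
    """
    grants = [0] * len(requests)
    for i, req in enumerate(requests):
        if req:
            grants[i] = 1
            break
    return grants


def resolve_modular(requests, module_size=4):
    """Build a larger resolver from fixed-size modules plus enable logic."""
    if len(requests) % module_size != 0:
        raise ValueError("input width must be a multiple of module_size")
    grants = []
    higher_request_seen = 0  # OR of all requests in higher-priority modules
    for start in range(0, len(requests), module_size):
        block = requests[start:start + module_size]
        local = resolve_direct(block)          # small module output
        enable = 0 if higher_request_seen else 1
        grants.extend(g & enable for g in local)
        higher_request_seen |= int(any(block))
    return grants
```

Under these assumptions, both functions produce the same one-hot grant vector for any request pattern; the modular form only changes how the logic is organized.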

JLPEA, Vol. 14, Pages 24: Efficient Addition Circuits Using Three-Gate Reconfigurable Field Effect Transistors

Fanny Spagnolo — 2024-04-14

Journal of Low Power Electronics and Applications doi: 10.3390/jlpea14020024

Authors: Fanny Spagnolo Pasquale Corsonello Fabio Frustaci Stefania Perri

Reconfigurable FETs (RFETs) are widely recognized as a promising way to overcome the limitations of conventional CMOS architectures. This paper presents novel addition circuits intentionally designed to exploit the ability of RFETs to operate efficiently on demand as n- or p-type FETs. First, a novel Full Adder (FA) is proposed and characterized. A comparison with other designs shows that the proposed FA achieves a worst-case delay and a dynamic power consumption that are up to 43.5% and 79% lower, respectively. As a drawback, its estimated area is up to 32% larger than that of the competitors. Then, the new FA is used to implement Ripple-Carry Adders (RCAs). A 32-bit adder designed as proposed herein reaches an energy–delay product (EDP) ~25.7× and ~141× lower than its CMOS and RFET-based counterparts, respectively.
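
For readers unfamiliar with how a full adder is chained into a ripple-carry adder, the following is a minimal behavioral sketch in Python. It models only the logic function, not the RFET circuit proposed in the paper; the function names are hypothetical.

```python
def full_adder(a, b, cin):
    """One-bit full adder: returns (sum, carry_out) for bits a, b, cin."""
    s = a ^ b ^ cin
    cout = (a & b) | (cin & (a ^ b))
    return s, cout


def ripple_carry_add(x, y, width=32):
    """Add two unsigned integers by chaining `width` full adders.

    The carry out of each stage feeds the carry in of the next, which is
    why the worst-case delay grows with the adder width.
    """
    carry = 0
    result = 0
    for i in range(width):
        a = (x >> i) & 1
        b = (y >> i) & 1
        s, carry = full_adder(a, b, carry)
        result |= s << i
    return result, carry  # carry is the final carry-out bit
```

For example, `ripple_carry_add(0xFFFFFFFF, 1)` returns `(0, 1)`: the sum wraps to zero and the carry-out is set.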

JLPEA, Vol. 14, Pages 23: Vehicle Detection in Adverse Weather: A Multi-Head Attention Approach with Multimodal Fusion

Nujhat Tabassum — 2024-04-13

Journal of Low Power Electronics and Applications doi: 10.3390/jlpea14020023

Authors: Nujhat Tabassum Mohamed El-Sharkawy

In the realm of autonomous vehicle technology, the multimodal vehicle detection network (MVDNet) represents a significant leap forward, particularly in the challenging context of weather conditions. This paper focuses on the enhancement of MVDNet through the integration of a multi-head attention layer, aimed at refining its performance. The integrated multi-head attention layer in the MVDNet model is a pivotal modification, advancing the network’s ability to process and fuse multimodal sensor information more efficiently. The paper validates the improved performance of MVDNet with multi-head attention through comprehensive testing, which includes a training dataset derived from the Oxford Radar RobotCar. The results clearly demonstrate that the multi-head MVDNet outperforms the other related conventional models, particularly in the average precision (AP) of estimation, under challenging environmental conditions. The proposed multi-head MVDNet not only contributes significantly to the field of autonomous vehicle detection but also underscores the potential of sophisticated sensor fusion techniques in overcoming environmental limitations.

JLPEA, Vol. 14, Pages 22: A Low Power Injection-Locked CDR Using 28 nm FDSOI Technology for Burst-Mode Applications

Yuqing Mao — 2024-04-07

Journal of Low Power Electronics and Applications doi: 10.3390/jlpea14020022

Authors: Yuqing Mao Yoann Charlon Yves Leduc Gilles Jacquemod

In this paper, a low-power Injection-Locked Clock and Data Recovery (ILCDR) using a 28 nm Ultra-Thin Body and Box-Fully Depleted Silicon On Insulator (UTBB-FDSOI) technology is presented. The back-gate auto-biasing of UTBB-FDSOI transistors enables the creation of a Quadrature Ring Oscillator (QRO) reducing both size and power consumption compared to an LC tank oscillator. By injecting a digital signal into this circuit, we realize an Injection-Locked Oscillator (ILO) with low jitter. Thanks to the good performance of this oscillator, we propose a low-power ILCDR with fast locking time and low jitter for burst-mode applications. The main novelty consists of the implementation of a complementary QRO based on back-gate control using FDSOI technology to realize a simple and efficient ILCDR circuit. With a Pseudo-Random Binary Sequence (PRBS7) at 868 Mbps, the recovered clock jitter is 26.7 ps (2.3% UIp-p) and the recovered data jitter is 11.9 ps (1% UIp-p). With a 0.6 V power supply, the power consumption is 318 μW. All the results presented here are based on post-layout simulations, as no prototypes have been produced. Similarly, we can estimate the surface area of the chip (without the pad ring) at around 6600 μm².
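
The jitter figures quoted as a percentage of the unit interval (UI) can be checked with simple arithmetic. The sketch below assumes one UI equals one bit period (1 / bit rate); the helper name is hypothetical.

```python
def jitter_percent_ui(jitter_ps, bit_rate_bps):
    """Express a jitter value in picoseconds as a percentage of one UI."""
    ui_ps = 1e12 / bit_rate_bps  # one bit period in picoseconds
    return 100.0 * jitter_ps / ui_ps


clock_jitter = jitter_percent_ui(26.7, 868e6)  # roughly 2.3 % UI
data_jitter = jitter_percent_ui(11.9, 868e6)   # roughly 1.0 % UI
```

At 868 Mbps one UI is about 1152 ps, so 26.7 ps and 11.9 ps correspond to roughly 2.3% and 1% of a UI, consistent with the values stated above.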

JLPEA, Vol. 14, Pages 21: A 0.3 V OTA with Enhanced CMRR and High Robustness to PVT Variations

Riccardo Della Sala — 2024-04-02

Journal of Low Power Electronics and Applications doi: 10.3390/jlpea14020021

Authors: Riccardo Della Sala Francesco Centurelli Giuseppe Scotti Alessandro Trifiletti

In this paper, we present a 0.3 V body-driven operational transconductance amplifier (OTA) that exploits a biasing approach based on the use of a replica loop with gain. An auxiliary amplifier is exploited both in the current mirror load of the first stage of the OTA and in the replica loop in order to achieve super-diode behavior, resulting in low mirror gain error, which enhances CMRR, and robust biasing. Common-mode feedforward, provided by the replica loop, further enhances CMRR. Simulations in a 180 nm CMOS technology show 65 dB gain with 2 kHz unity-gain frequency on a 200 pF load when consuming 9 nW. Very high linearity with a 0.24% THD at 90% full-scale and robustness to PVT variations are also achieved.

JLPEA, Vol. 14, Pages 20: Dual-Band Large-Frequency Ratio Power Divider Using Mode Composite Transmission Line for 5G Communication Systems

Kaijun Song — 2024-03-31

Journal of Low Power Electronics and Applications doi: 10.3390/jlpea14020020

Authors: Kaijun Song Lele Fang Yedi Zhou

In this paper, a novel kind of mode composite transmission line (MC-TL) is proposed, and a dual-band power divider with a large frequency ratio using this novel MC-TL for 5G communication systems was developed. The proposed MC-TL was developed using spoof surface plasmon polaritons (SSPPs) and a corrugated substrate-integrated waveguide (CSIW) transmission line, which supports both a surface plasmon mode and TE10 mode, independently. The surface plasmon mode operates in the grooves of the surface metal layer, while the TE10 mode works in the substrate between two metal layers. These two parts can transmit different modes at independent frequencies. This structure can be used in dual-band transmission lines with a high frequency ratio. The characteristics and design of the MC-TL (SSPPs and CSIW) are analyzed and illustrated. The MC-TL was fabricated and measured to demonstrate its performance. Moreover, based on the proposed MC-TL, a dual-band power divider with a large frequency ratio (operating at 3 GHz and 28 GHz simultaneously) was also designed and fabricated. It can cover the frequency of a fifth-generation communication system perfectly. The measured outcomes align closely with the simulated results, demonstrating robust agreement and showcasing excellent transmission capabilities.

JLPEA, Vol. 14, Pages 19: A Citizen Science Tool Based on an Energy Autonomous Embedded System with Environmental Sensors and Hyperspectral Imaging

Charalampos S. Kouzinopoulos — 2024-03-27

Journal of Low Power Electronics and Applications doi: 10.3390/jlpea14020019

Authors: Charalampos S. Kouzinopoulos Eleftheria Maria Pechlivani Nikolaos Giakoumoglou Alexios Papaioannou Sotirios Pemas Panagiotis Christakakis Dimosthenis Ioannidis Dimitrios Tzovaras

Citizen science reinforces the development of emergent tools for the surveillance, monitoring, and early detection of biological invasions, enhancing biosecurity resilience. The contribution of farmers and farm citizens is vital, as volunteers can strengthen the effectiveness and efficiency of environmental observations, improve surveillance efforts, and aid in delimiting areas affected by plant-spread diseases and pests. This study presents a robust, user-friendly, and cost-effective smart module for citizen science that incorporates a cutting-edge hyperspectral imaging (HI) module, integrated in a single, energy-independent device and paired with a smartphone. The proposed module can empower farmers, farming communities, and citizens to easily capture and transmit data on crop conditions, plant disease symptoms (biotic and abiotic), and pest attacks. The developed HI-based module is interconnected with a smart embedded system (SES), which allows for the capture of hyperspectral images. Simultaneously, it enables multimodal analysis using the integrated environmental sensors on the module. These data are processed at the edge using lightweight Deep Learning algorithms for the detection and identification of Tuta absoluta (Meyrick), the most important and devastating invasive alien pest of tomato. The innovative Artificial Intelligence (AI)-based module offers open interfaces to passive surveillance platforms, Decision Support Systems (DSSs), and early warning surveillance systems, establishing a seamless environment where innovation and utility converge to enhance crop health, productivity, and biodiversity protection.

JLPEA, Vol. 14, Pages 18: A Compact 0.73~3.1 GHz CMOS VCO Based on Active-Inductor and Active-Resistor Topology

Chatrpol Pakasiri — 2024-03-25

Journal of Low Power Electronics and Applications doi: 10.3390/jlpea14020018

Authors: Chatrpol Pakasiri Ke-Chung Hsu Sen Wang

In this paper, a wideband VCO that covers popular Long-Term Evolution (LTE) 0.7 GHz and LTE 2.6 GHz frequencies is designed and developed in a standard 0.18 μm CMOS process. The VCO utilizes active inductors to achieve coarse-tuning of the inductance and a compact chip area. Moreover, an active feedback resistor is introduced into the active inductor for fine-tuning of the inductance. The feedback resistor also affects the equivalent resistance of the active inductor; therefore, wide inductance tuning and low power consumption can be obtained by optimizing the resistor. The core area of the fabricated CMOS chip is merely 0.046 mm², excluding all testing pads. With a 6.7~10.1 mW DC consumption, the measured oscillation frequencies range from 0.73 GHz to 3.1 GHz, which demonstrates a 123.8% tuning range. At the frequencies of interest, the measured phase noises are from −80.7 to −84.5 dBc/Hz at a 1 MHz offset frequency.
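
The 123.8% tuning range quoted above is consistent with the common definition of fractional tuning range, namely the frequency span divided by the center frequency. The following sketch assumes that definition; the function name is hypothetical.

```python
def tuning_range_percent(f_min, f_max):
    """Fractional tuning range: span relative to the center frequency."""
    center = (f_min + f_max) / 2
    return 100 * (f_max - f_min) / center


tr = tuning_range_percent(0.73, 3.1)  # frequencies in GHz; about 123.8 %
```

Here the span is 2.37 GHz and the center frequency is 1.915 GHz, giving approximately 123.8%.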

JLPEA, Vol. 14, Pages 17: A Simplified Gm − C Filter Technique for Reference Spur Reduction in Phase-Locked Loop

P. Purushothama Chary — 2024-03-20

Journal of Low Power Electronics and Applications doi: 10.3390/jlpea14010017

Authors: P. Purushothama Chary Rizwan Shaik Peerla Ashudeb Dutta

This paper presents a wideband approach for an L5- and S-band integer-N phase-locked loop (PLL) targeting Indian Regional Navigation Satellite System (IRNSS) applications. A reference spur reduction technique using a Gm−C filter is proposed. The reference spur is improved by 7 dB compared with the same design without a Gm−C filter. The wideband integer-N PLL is designed and fabricated in a UMC 65 nm CMOS process. The Gm−C filter block consumes 200 μA of current. The wideband voltage-controlled oscillator (VCO) oscillates from 1.6 GHz to 3.2 GHz with a tuning range (TR) of 40%, achieving a best and worst phase noise of ≈−122 dBc/Hz and ≈−116 dBc/Hz at a 1 MHz offset, respectively.

JLPEA, Vol. 14, Pages 16: Design of Impedance Matching Network for Low-Power, Ultra-Wideband Applications

Sepideh Hassani — 2024-03-19

Journal of Low Power Electronics and Applications doi: 10.3390/jlpea14010016

Authors: Sepideh Hassani Chih-Hung Chen Natalia K. Nikolova

This paper addresses the design of ultra-wideband (UWB) impedance matching networks operating in the unlicensed 3.1–10.6 GHz frequency band for low-power applications. It improves the simplified real frequency technique (SRFT) by adding a realizability check and employing an iterative approach with different initial guesses in optimization to achieve realizable solutions under the requirements of UWB, low-power consumption, and a minimum number of circuit components. The comparison of solutions obtained using the SRFT with published solutions based on the Chebyshev filter theory is presented. It is shown that the optimal SRFT solution requires fewer components in the impedance matching network, maximizes the RF power delivery over the UWB spectrum with a reflection coefficient below −10 dB, and allows for circuit optimization to reduce power consumption. Using the improved SRFT, it demonstrates a systematic approach to find the strategies and limitations of designing the input matching networks for low-power UWB applications using GlobalFoundries 90 nm BiCMOS technology.

JLPEA, Vol. 14, Pages 15: Control of Vibratory Feeder Device Mechanical Frequency Using the Modification of the Sinusoidal Supply Voltage Signal

Žydrūnas Kavaliauskas — 2024-03-06

Journal of Low Power Electronics and Applications doi: 10.3390/jlpea14010015

Authors: Žydrūnas Kavaliauskas Igor Šajev

In industrial and sales processes, dosing systems of various constructions whose operation is based on mechanical vibrations (vibratory feeders) are very often used. These systems face many problems, such as resonant frequency, flow instability of the dosed product, and instability of the mechanical vibration amplitude, because most of them are based on controlling the frequency of the electrical signal of the supply voltage. All these factors negatively affect the durability and reliability of vibratory feeder systems. In this research, an automatic control system for a vibratory feeder was created, whose control process is based on the modification of the sinusoidal signal (partially changing the signal area). Such a way of controlling the vibratory feeder is not discussed in the literature. As the research conducted in this paper has shown, using sinusoidal signal modification made it possible to achieve a stable flow rate of bulk product (the flow rate varied from 0 to 100 g/s when the frequency of mechanical vibrations changed from 1 to 50 Hz) and a stable amplitude of mechanical oscillations equal to 1.5 mm. The control system is based on the PIC24FV32KA302 microcontroller, for which special software was developed. The BTA16 thyristor used for voltage modification of the sinusoidal signal ensured reliable control of the sinusoidal voltage modification process.

JLPEA, Vol. 14, Pages 14: CMOS Design of Chaotic Systems Using Biquadratic OTA-C Filters

Eduardo Juarez-Mendoza — 2024-03-04

Journal of Low Power Electronics and Applications doi: 10.3390/jlpea14010014

Authors: Eduardo Juarez-Mendoza Francisco Asahel del Angel-Diaz Alejandro Diaz-Sanchez Esteban Tlelo-Cuautle

This manuscript shows the CMOS design of Lorenz systems using operational transconductance amplifiers (OTAs). Two Lorenz systems are then synchronized in a master–slave topology and used to implement a CMOS secure communication system. The contribution is devoted to the correct design of first- and second-order OTA-C filters, using 180 nm CMOS technology, to guarantee chaotic behavior. First, Simulink is used to simulate a secure communication system using two Lorenz systems connected in a master–slave topology, which is tested using sinusoidal signals that are masked by chaotic signals. Second, the Lorenz systems are scaled to have amplitudes of the state variables below 1 Volt, to allow for CMOS design using OTA-C filters. The transconductances of the OTAs are tuned to accomplish a Laplace transfer function. In this manner, this work highlights the design of a second-order CMOS OTA-C filter, whose damping factor is tuned to generate appropriate chaotic behavior. Finally, chaotic masking is performed by designing a whole CMOS secure communication system by using OTA-C based Lorenz systems, and its SPICE simulation results show its appropriateness for hardware security applications.

JLPEA, Vol. 14, Pages 13: A Sub-1-V Nanopower MOS-Only Voltage Reference

Siqi Wang — 2024-02-29

Journal of Low Power Electronics and Applications doi: 10.3390/jlpea14010013

Authors: Siqi Wang Zhenghao Lu Kunpeng Xu Hongguang Dai Zhanxia Wu Xiaopeng Yu

A novel low-power MOS-only voltage reference is presented. The Enz–Krummenacher–Vittoz (EKV) model is adopted to provide a new perspective on the operating principle. The normalized charge density, introduced as a new variable, serves as an indicator when trimming the output temperature coefficient. The proposed voltage reference consists of a specific current generator and a 5-bit trimmable load. Thanks to the good match between the current source stage and the output stage, the nonlinear temperature dependence of carrier mobility is automatically canceled out. The circuit is designed using 55 nm CMOS technology. The operating temperature ranges from −40 °C to 120 °C. The average temperature coefficient of the output voltage can be reduced to 21.7 ppm/°C by trimming. The power consumption is only 23.2 nW with a supply voltage of 0.8 V. The line sensitivity and the power supply rejection ratio at 100 Hz are 0.011 %/V and −89 dB, respectively.

JLPEA, Vol. 14, Pages 12: A Low-Power BL Path Design for NAND Flash Based on an Existing NAND Interface

Hikaru Makino — 2024-02-19

Journal of Low Power Electronics and Applications doi: 10.3390/jlpea14010012

Authors: Hikaru Makino Toru Tanzawa

This paper is an extended version of a previously reported conference paper regarding a low-power design for NAND Flash. As the number of bits per NAND Flash die increases with cost scaling, the IO data path speed increases to minimize the page access time with a scaled CMOS in IOs. The power supply for IO buffers, namely, VDDQ, decreases from 3 V to 1.2 V, accordingly. In this paper, the way in which a reduction in VDDQ can contribute to power reduction in the BL path is discussed and validated. Conventionally, a BL voltage of about 0.5 V has been supplied from a supply voltage source (VDD) of 3 V. The BL path power can be reduced by a factor of VDDQ to VDD when the BL voltage is supplied by VDDQ. To maintain a sense margin at the sense amplifiers, the supply source for BLs is switched from VDDQ to VDD before sensing. As a result, power reduction and an equivalent sense margin can be realized at the same time. The overhead of implementing this operation is an increase in the BL access time of about 2% for switching the power supply from VDDQ to VDD and an increase in the die size of about 0.01% for adding the switching circuit, both of which are not significant in comparison to the significant power reduction in the BL path power of the NAND die of about 60%. The BL path is then designed in 180 nm CMOS to validate the design. When the cost for powering the SSD becomes quite significant, especially for data centers, an additional lower voltage supply, such as 0.8 V, dedicated to BL charging for read and program verifying operations may be the best option for future applications.
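
The statement that the BL path power scales by the ratio of VDDQ to VDD can be illustrated with a first-order model. The sketch below is an assumption-laden simplification, not the paper's analysis: it assumes the energy drawn from a supply to charge a bit-line capacitance to the BL voltage is C × V_BL × V_supply, so the BL capacitance and BL voltage cancel in the ratio. The capacitance value and function names are hypothetical.

```python
def bl_charge_energy(c_bl, v_bl, v_supply):
    """First-order energy drawn from the supply to charge a bit line.

    Assumption: charge Q = c_bl * v_bl is delivered at the supply voltage,
    so the supply provides Q * v_supply regardless of the final BL voltage.
    """
    return c_bl * v_bl * v_supply


def reduction_percent(v_low, v_high):
    """Relative saving when the same charge is drawn from a lower supply."""
    return 100 * (1 - v_low / v_high)


c_bl = 1e-12  # hypothetical bit-line capacitance in farads
e_vdd = bl_charge_energy(c_bl, 0.5, 3.0)   # BL charged from VDD = 3 V
e_vddq = bl_charge_energy(c_bl, 0.5, 1.2)  # BL charged from VDDQ = 1.2 V
saving = reduction_percent(1.2, 3.0)       # about 60 %
```

Under this model, moving the BL supply from 3 V to 1.2 V gives a saving of about 60%, in line with the figure quoted above; the same model would give roughly 73% for a dedicated 0.8 V supply.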

JLPEA, Vol. 14, Pages 11: Extrema-Triggered Conversion for Non-Stationary Signal Acquisition in Wireless Sensor Nodes

Swagat Bhattacharyya — 2024-02-17

Journal of Low Power Electronics and Applications doi: 10.3390/jlpea14010011

Authors: Swagat Bhattacharyya Jennifer O. Hasler

While wireless sensor nodes (WSNs) have proliferated with the rise of the Internet of Things (IoT), uniformly sampled analog–digital converters (ADCs) have traditionally reigned paramount in the signal processing pipeline. The large volume of data generated by uniformly sampled ADCs while capturing most real-world signals, which are highly non-stationary and sparse in information content, considerably strains the power budget of WSNs during data transmission. Given the pressing need for intelligent sampling, this work proposes an extrema pulse generator devised to trigger ADCs at significant signal extrema, thereby curbing the volume of data points collected and transmitted, and mitigating transmission power draw. After providing a comprehensive signal-theoretic rationale, we construct and experimentally validate these circuits on a system-on-chip field-programmable analog array in a 350 nm complementary metal-oxide-semiconductor (CMOS) process. Operating within a power range of 4.3–12.3 µW (contingent on the input bandwidth requirements), the extrema pulse generator has proven to be capable of effectively sampling both synthetic and natural signals, achieving significant reductions in data volume and signal reconstruction error. Using a nonideality-resilient reconstruction algorithm that we develop in this work, experimental comparisons between extrema and uniform sampling show that extrema sampling achieves an 18-fold lower normalized root mean square reconstruction error for a quadratic chirp signal, despite requiring 5-fold fewer sample points. Similar improvements in both the reconstruction error and effective sampling rate objectives are found experimentally for an electrocardiogram signal. Using both theoretical and experimental methods, this work demonstrates the potential of extrema-triggered systems for extending Pareto frontiers in modern, resource-constrained sensing scenarios.
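
The core idea of sampling only at signal extrema can be sketched in software. The following Python illustration is not the authors' analog pulse generator or their reconstruction algorithm; it simply marks the points of an already-sampled sequence where the slope changes sign, and the function name is hypothetical.

```python
def extrema_indices(samples):
    """Return indices of local extrema, plus both endpoints.

    A point counts as an extremum when the slope before it and the slope
    after it have opposite signs (a strict peak or valley).
    """
    if len(samples) < 2:
        return list(range(len(samples)))
    idx = [0]
    for i in range(1, len(samples) - 1):
        left = samples[i] - samples[i - 1]
        right = samples[i + 1] - samples[i]
        if left * right < 0:
            idx.append(i)
    idx.append(len(samples) - 1)
    return idx


signal = [0, 1, 2, 1, 0, 1, 2]
kept = extrema_indices(signal)  # [0, 2, 4, 6]
```

In this toy sequence only four of the seven points are retained, which hints at how slowly varying stretches of a signal contribute few samples under extrema-triggered conversion.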

JLPEA, Vol. 14, Pages 10: A Low-Power, 65 nm 24.6-to-30.1 GHz Trusted LC Voltage-Controlled Oscillator Achieving 191.7 dBc/Hz FoM at 1 MHz

Abdullah Kurtoglu — 2024-02-02

Journal of Low Power Electronics and Applications doi: 10.3390/jlpea14010010

Authors: Abdullah Kurtoglu Amir H. M. Shirazi Shahriar Mirabbasi Hossein Miri Lavasani

This work presents a novel trusted LC voltage-controlled oscillator (VCO) with an embedded compact analog Physically Unclonable Function (PUF) used for authentication. The trusted VCO is implemented in a 1P9M 65 nm standard CMOS process and consumes 1.75 mW. It exhibits a measured phase noise (PN) of −104.8 dBc/Hz @ 1 MHz and −132.2 dBc/Hz @ 10 MHz offset, resulting in Figures of Merit (FoMs) of 191.7 dBc/Hz and 199.1 dBc/Hz, respectively. With the measured frequency tuning range (TR) of ~5.5 GHz, the FoM with tuning (FoMT) reaches 197.6 dBc/Hz and 205.0 dBc/Hz at 1 MHz and 10 MHz offset, respectively. The analog PUF consists of CMOS cross-coupled pairs in the main VCO to change analog characteristics. Benefiting from the impedance change and parasitic capacitance of the cross-coupled pairs, the AC and DC responses of the VCO are utilized for multiple responses for each input. The PUF consumes 0.83 pJ/bit when operating at 1.5 Gbps. The proposed PUF exhibits a measured Inter-Hamming Distance (HD) of 0.5058b and 0.4978b, with Intra-HD reaching 0.0055b and 0.0053b for the current consumption and fosc, respectively. The autocorrelation function (ACF) of 0.0111 and 0.0110 is obtained for the current consumption and fosc, respectively, at a 95% confidence level.
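
The abstract does not state which FoM definition is used. The sketch below assumes the widely used oscillator figure of merit, FoM = −PN + 20·log10(f_osc / Δf) − 10·log10(P / 1 mW), and uses it only as a consistency check between the two offsets. The oscillation frequency passed in is a hypothetical placeholder, since the abstract does not say at which frequency the phase noise was measured.

```python
import math


def vco_fom(pn_dbc_hz, f_osc_hz, f_offset_hz, power_mw):
    """Commonly used VCO figure of merit in dBc/Hz (assumed definition)."""
    return (-pn_dbc_hz
            + 20 * math.log10(f_osc_hz / f_offset_hz)
            - 10 * math.log10(power_mw))


f_osc = 27e9  # hypothetical value inside the 24.6-to-30.1 GHz range
fom_1m = vco_fom(-104.8, f_osc, 1e6, 1.75)
fom_10m = vco_fom(-132.2, f_osc, 10e6, 1.75)
delta = fom_10m - fom_1m  # independent of f_osc and power
```

Under this definition the difference between the two FoM values depends only on the phase-noise values and the offsets: 27.4 dB − 20 dB = 7.4 dB, which matches the gap between the reported 199.1 dBc/Hz and 191.7 dBc/Hz.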

JLPEA, Vol. 14, Pages 9: PANDA: Processing in Magnetic Random-Access Memory-Accelerated de Bruijn Graph-Based DNA Assembly

Shaahin Angizi — 2024-02-02

Journal of Low Power Electronics and Applications doi: 10.3390/jlpea14010009

Authors: Shaahin Angizi Naima Ahmed Fahmi Deniz Najafi Wei Zhang Deliang Fan

In this work, we present an efficient Processing in MRAM-Accelerated De Bruijn Graph-based DNA Assembly platform, named PANDA, based on an optimized and hardware-friendly genome assembly algorithm. PANDA is able to assemble large-scale DNA sequence datasets from all-pair overlaps. We first design a PANDA platform that exploits MRAM as computational memory and converts it to a potent processing unit for genome assembly. PANDA can execute not only the efficient bulk bit-wise X(N)OR-based comparison/addition operations heavily required for the genome assembly task but also a full set of 2-/3-input logic operations inside the MRAM chip. We then develop a highly parallel, step-by-step, hardware-friendly DNA assembly algorithm for PANDA that only requires the developed in-memory logic operations. The platform is then configured with a novel data partitioning and mapping technique that provides local storage and processing to fully utilize the algorithm-level parallelism. The cross-layer simulation results demonstrate that PANDA reduces the run time and power by a factor of 18 and 11, respectively, compared with a CPU. Moreover, speed-ups of 2.5× to 10× can be obtained over other recent processing-in-memory platforms performing the same task, such as STT-MRAM, ReRAM, and DRAM.
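
As background for the two ingredients named above, the following Python sketch shows a textbook de Bruijn graph construction from k-mers and an XNOR-style equality check on 2-bit-encoded k-mers. It is a software illustration only, not PANDA's in-memory algorithm; the 2-bit encoding and all function names are assumptions made for this example.

```python
from collections import defaultdict

ENCODING = {"A": 0, "C": 1, "G": 2, "T": 3}  # assumed 2-bit base encoding


def de_bruijn_graph(reads, k):
    """Map each (k-1)-mer prefix to the (k-1)-mer suffixes that follow it."""
    graph = defaultdict(list)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].append(kmer[1:])
    return dict(graph)


def encode(kmer):
    """Pack a k-mer into an integer, two bits per base."""
    value = 0
    for base in kmer:
        value = (value << 2) | ENCODING[base]
    return value


def xnor_equal(a, b, k):
    """Two encoded k-mers match when the XNOR of all 2k bits is all ones."""
    mask = (1 << (2 * k)) - 1
    return ((~(a ^ b)) & mask) == mask
```

For instance, `de_bruijn_graph(["ACGT"], 3)` yields `{"AC": ["CG"], "CG": ["GT"]}`, and `xnor_equal(encode("ACG"), encode("ACG"), 3)` is `True`.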

JLPEA, Vol. 14, Pages 8: LC Tank Oscillator Based on New Negative Resistor in FDSOI Technology

Yuqing Mao — 2024-02-01

Journal of Low Power Electronics and Applications doi: 10.3390/jlpea14010008

Authors: Yuqing Mao Yoann Charlon Yves Leduc Gilles Jacquemod

Although Moore’s Law is reaching its limits, it has never applied to analog and RF circuits. For example, due to the short channel effect (SCE), drain-induced barrier lowering (DIBL), and sub-threshold slope (SS)…, longer transistors are required to implement analog cells. From 22 nm CMOS technology and beyond, for reasons of variability, the channel of the transistors is no longer doped. Two technologies then emerged: FinFET transistors for digital applications and UTBB FDSOI transistors, suitable for analog and mixed applications. In a previous paper, a new topology was proposed utilizing some advantages of the FDSOI technology. Thanks to this technology, a novel cross-coupled back-gate (BG) technique was implemented to improve analog and mixed-signal cells in order to reduce the surface of the integrated circuit. This technique was applied to a current mirror to reduce the short channel effect and to provide high output impedance. It was demonstrated that it is possible to overcompensate the SCE and DIBL effects and to create a negative output resistor. This paper presents a new LC tank oscillator based on this current mirror functioning as a negative resistor.

JLPEA, Vol. 14, Pages 7: Array-Designed Triboelectric Nanogenerator for Healthcare Diagnostics: Current Progress and Future Perspectives

Zequan Zhao — 2024-01-22

Journal of Low Power Electronics and Applications doi: 10.3390/jlpea14010007

Authors: Zequan Zhao Qiliang Zhu Yifei Wang Muhammad Shoaib Xia Cao Ning Wang

Array-designed triboelectric nanogenerators (AD-TENGs) have firmly established themselves as state-of-the-art technologies for adeptly converting mechanical interactions into electrical signals. Central to the AD-TENG’s prowess is its inherent modularity and the multifaceted, grid-like design that pave the way to robust and adaptable detection platforms for wearables and real-time health monitoring systems. In this review, we aim to elucidate the quintessential role of array design in AD-TENGs for healthcare detection, emphasizing its ability to heighten sensitivity, spatial resolution, and dynamic monitoring while ensuring redundancy and simultaneous multi-detection. We begin from the fundamental aspects, such as working principles and design basis, then venture into methodologies for optimizing AD-TENGs that ensure the capture of intricate physiological changes, from nuanced muscle movements to sensitive electronic skin. After this, our exploration extends to the possible cutting-edge electronic systems that are built with specific advantages in filtering noise, magnifying signal-to-noise ratios, and interpreting complex real-time datasets on the basis of AD-TENGs. Culminating our discourse, we highlight the challenges and prospective pathways in the evolution of array-designed AD-TENGs, stressing the necessity to refine their sensitivity, adaptability, and reliability to perfectly align with the exacting demands of contemporary healthcare diagnostics.

JLPEA, Vol. 14, Pages 6: Advancing Smart Lighting: A Developmental Approach to Energy Efficiency through Brightness Adjustment Strategies

Vandha Pradwiyasma Widartha — 2024-01-15

JLPEA, Vol. 14, Pages 6: Advancing Smart Lighting: A Developmental Approach to Energy Efficiency through Brightness Adjustment Strategies

Journal of Low Power Electronics and Applications doi: 10.3390/jlpea14010006

Authors: Vandha Pradwiyasma Widartha Ilkyeun Ra Su-Yeon Lee Chang-Soo Kim

Smart lighting control systems represent an advanced approach to reducing energy use. These systems leverage advanced technology to provide users with better control over their lighting, allowing them to manually, remotely, and automatically modify the brightness, color, and timing of their lights. In this study, we aimed to enhance the energy efficiency of smart lighting systems by using light source data. A multifaceted approach was employed, involving the following three scenarios: sensing device, daylight data, and a combination of both. A low-cost sensor and third-party API were used for data collection, and a prototype application was developed for real-time monitoring. The results showed that combining sensor and daylight data effectively reduced energy consumption, and the rule-based algorithm further optimized energy usage. The prototype application provided real-time monitoring and actionable insights, thus contributing to overall energy optimization.

JLPEA, Vol. 14, Pages 5: A Scalable Formal Framework for the Verification and Vulnerability Analysis of Redundancy-Based Error-Resilient Null Convention Logic Asynchronous Circuits

Dipayan Mazumder — 2024-01-14

JLPEA, Vol. 14, Pages 5: A Scalable Formal Framework for the Verification and Vulnerability Analysis of Redundancy-Based Error-Resilient Null Convention Logic Asynchronous Circuits

Journal of Low Power Electronics and Applications doi: 10.3390/jlpea14010005

Authors: Dipayan Mazumder Mithun Datta Alexander C. Bodoh Ashiq A. Sakib

The increasing demand for high-speed, energy-efficient, and miniaturized electronics has led to significant challenges and compromises in the domain of conventional clock-based digital designs, most notably reduced circuit reliability, particularly in mission-critical hardware. At scaled technology nodes, devices are vulnerable to transient or soft errors, such as Single Event Upset (SEU) and Single Event Latch-up (SEL). External radiation, internal electromagnetic interference (EMI), or noise are the primary sources of these errors, which can compromise the circuit functionality. In response to these challenges, the Quasi-Delay-Insensitive (QDI) Null Convention Logic (NCL) asynchronous design paradigm has emerged as a promising alternative, offering advantages such as ultra-low power performance, reduced noise and EMI, and resilience to process, voltage, and temperature variations. Moreover, its unique architecture and insensitivity to timing variations offer a degree of resistance against transient errors; however, it is not entirely resilient. Several resiliency schemes are available to detect and mitigate soft errors in QDI circuits, with approaches based on redundancy proving to be the most effective in ensuring complete resilience across all major QDI implementation paradigms, including NCL, Pre-charge/Weak-charge Half Buffers (PCHB/WCHB), and Sleep Convention Logic (SCL). This research focuses on one such redundancy-based resiliency scheme for QDI NCL circuits, known as the dual-modular redundancy-based NCL (DMR-NCL) architecture, and addresses the absence of formal methods for the verification and analysis of such circuits. A novel methodology has been proposed for formally verifying the correctness of DMR-NCL circuits synthesized from their synchronous counterparts, covering both safety (functional correctness) and liveness (the absence of deadlock). In addition, this research introduces a formal framework for the vulnerability analysis of DMR-NCL circuits against SEU/SEL. To demonstrate the framework’s efficacy and scalability, a prototype computer-aided support tool has been developed, which verifies and analyzes multiple DMR-NCL benchmark circuits of varying sizes and complexities.

JLPEA, Vol. 14, Pages 4: Understanding Timing Error Characteristics from Overclocked Systolic Multiply–Accumulate Arrays in FPGAs

Andrew Chamberlin — 2024-01-09

JLPEA, Vol. 14, Pages 4: Understanding Timing Error Characteristics from Overclocked Systolic Multiply–Accumulate Arrays in FPGAs

Journal of Low Power Electronics and Applications doi: 10.3390/jlpea14010004

Authors: Andrew Chamberlin Andrew Gerber Mason Palmer Tim Goodale Noel Daniel Gundi Koushik Chakraborty Sanghamitra Roy

Artificial Intelligence (AI) hardware accelerators have seen tremendous developments in recent years due to the rapid growth of AI in multiple fields. Many such accelerators use a Systolic Multiply–Accumulate Array (SMA) as their computational brain. In this paper, we investigate the faulty output characterization of an SMA in a real silicon FPGA board. Experiments were run on a single Zybo Z7-20 board to control for process variation, at nominal voltage, and in small batches to control for temperature. The data sheet rates the FPGA at up to 800 MHz, a limit set by the maximum frequency of the PLL. The design is written in Verilog for the FPGA and C++ for the processor, and is synthesized with a chosen constraint of a 125 MHz clock. We then operate the system at a frequency range of 125 MHz to 450 MHz for the FPGA and the nominal 667 MHz for the processor core to produce timing errors in the FPGA without affecting the processor. Our extensive experimental platform with a hardware–software ecosystem provides a methodological pathway that reveals fascinating characteristics of SMA behavior under an overclocked environment. While one may intuitively expect that timing errors resulting from overclocked hardware may produce a wide variation in output values, our post-silicon evaluation reveals a lack of variation in erroneous output values. We found an intriguing pattern where error output values are stable for a given input across a range of operating frequencies far exceeding the rated frequency of the FPGA.

JLPEA, Vol. 14, Pages 3: Design and Assessment of Hybrid MTJ/CMOS Circuits for In-Memory-Computation

Prashanth Barla — 2024-01-06

JLPEA, Vol. 14, Pages 3: Design and Assessment of Hybrid MTJ/CMOS Circuits for In-Memory-Computation

Journal of Low Power Electronics and Applications doi: 10.3390/jlpea14010003

Authors: Prashanth Barla Hemalatha Shivarama Ganesan Deepa Ujjwal Ujjwal

Hybrid magnetic tunnel junction/complementary metal oxide semiconductor (MTJ/CMOS) circuits based on the in-memory-computation (IMC) architecture are considered next-generation candidates for digital integrated circuits. However, the energy consumption during the MTJ write process is a matter of concern in these hybrid circuits. In this regard, we have developed a novel write circuit for the contemporary three-terminal perpendicular-MTJs that works on the voltage-gated spin orbit torque (VG+SOT) switching mechanism to store the information in hybrid circuits for IMC architecture. Investigation of the novel write circuit reveals a remarkable reduction in the total energy consumption (and energy delay product) of 92.59% (95.81%) and 92.28% (42.03%) compared to the conventional spin transfer torque (STT) and spin-Hall effect assisted STT (SHE+STT) write circuits, respectively. Further, we have developed all the hybrid logic gates followed by nonvolatile full adders (NV-FAs) using VG+SOT, STT, and SHE+STT MTJs. Simulation results show that with the VG+SOT NOR-OR, NAND-AND, XNOR-XOR, and NV-FA circuits, the reduction in the total power dissipation is 5.35% (4.27%), 5.62% (3.2%), 3.51% (2.02%), and 4.46% (2.93%), respectively, compared to STT (SHE+STT) MTJs.

JLPEA, Vol. 14, Pages 2: Multi-Ported GC-eDRAM Bitcell with Dynamic Port Configuration and Refresh Mechanism

Roman Golman — 2024-01-04

JLPEA, Vol. 14, Pages 2: Multi-Ported GC-eDRAM Bitcell with Dynamic Port Configuration and Refresh Mechanism

Journal of Low Power Electronics and Applications doi: 10.3390/jlpea14010002

Authors: Roman Golman Robert Giterman Adam Teman

Embedded memories occupy an increasingly dominant part of the area and power budgets of modern systems-on-chips (SoCs). Multi-ported embedded memories, commonly used by media SoCs and graphics processing units, occupy even more area and consume higher power due to larger memory bitcells. Gain-cell eDRAM is a high-density alternative for multi-ported operation with a small silicon footprint. However, conventional gain-cell memories have limited data availability, as they require periodic refresh operations to maintain their data. In this paper, we propose a novel multi-ported gain-cell design, which provides up to N read ports and M independent write ports (NRMW). In addition, the proposed design features a configurable mode of operation, supporting a hidden refresh mechanism for improved memory availability, as well as a novel opportunistic refresh port approach. An 8 kbit memory macro was implemented using a four-transistor bitcell with four ports (2R2W) in a 28 nm FD-SOI technology, offering up to a 3× reduction in bitcell area compared to other dual-ported SRAM memory options, while also providing 100% memory availability, as opposed to conventional dynamic memories, which are hindered by limited availability.

JLPEA, Vol. 14, Pages 1: Speed, Power and Area Optimized Monotonic Asynchronous Array Multipliers

Padmanabhan Balasubramanian — 2023-12-24

JLPEA, Vol. 14, Pages 1: Speed, Power and Area Optimized Monotonic Asynchronous Array Multipliers

Journal of Low Power Electronics and Applications doi: 10.3390/jlpea14010001

Authors: Padmanabhan Balasubramanian Nikos E. Mastorakis

Multiplication is a fundamental arithmetic operation in electronic processing units such as microprocessors and digital signal processors as it plays an important role in various computational tasks and applications. There exist many designs of synchronous multipliers in the literature. However, in the domain of Input–Output Mode (IOM) asynchronous design, there is relatively less published research on multipliers. Some existing works have considered quasi-delay-insensitive (QDI) asynchronous implementations of multipliers. However, the QDI asynchronous design paradigm, in general, is not area- and speed-efficient. This article presents an efficient alternative implementation of IOM asynchronous multipliers based on the concept of monotonic Boolean networks. The array multiplier architecture has been considered for demonstrating the usefulness of our proposition. The building blocks of the multiplier, such as the partial product generator, half adder, and full adder, were implemented monotonically. The popular dual-rail encoding scheme was considered for encoding the multiplier inputs and outputs, and four-phase return-to-zero handshaking (RZH) and return-to-one handshaking (ROH) were separately considered for communication. Compared to the best of the existing QDI asynchronous array multipliers, the proposed monotonic asynchronous array multiplier achieves the following reductions in design metrics: (i) a 40.1% (44.3%) reduction in cycle time (which is the asynchronous equivalent of synchronous clock timing), a 37.7% (37.7%) reduction in area, and a 4% (4.5%) reduction in power for 4 × 4 multiplication corresponding to RZH (ROH), and (ii) a 58.1% (60.2%) reduction in cycle time, a 45.2% (45.2%) reduction in area, and a 10.3% (11%) reduction in power for 8 × 8 multiplication corresponding to RZH (ROH). The multipliers were implemented using a 28 nm CMOS process technology.
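The dual-rail encoding with four-phase return-to-zero handshaking mentioned above can be illustrated with a minimal sketch. It assumes the common convention in which each bit travels on two wires, the all-zero pair acts as the spacer, and data words alternate with spacers; the function names and the (rail1, rail0) ordering are illustrative choices, not taken from the article, and the return-to-one variant is not modeled.

```python
SPACER = (0, 0)  # RZH spacer: both rails low (assumed convention)


def encode_bit(bit):
    """Dual-rail encode one bit as (rail1, rail0): 1 -> (1, 0), 0 -> (0, 1)."""
    return (1, 0) if bit else (0, 1)


def encode_word(value, width):
    """Encode an unsigned integer, least significant bit first."""
    return [encode_bit((value >> i) & 1) for i in range(width)]


def is_valid_data(word):
    """A data word is complete when every rail pair carries exactly one 1."""
    return all(pair in ((0, 1), (1, 0)) for pair in word)


def is_spacer(word):
    """A spacer word has every rail pair at (0, 0)."""
    return all(pair == SPACER for pair in word)


def rzh_sequence(values, width):
    """Interleave data words with spacers: data, spacer, data, spacer, ..."""
    sequence = []
    for value in values:
        sequence.append(encode_word(value, width))
        sequence.append([SPACER] * width)
    return sequence


for word in rzh_sequence([5, 3], width=4):
    print("data  " if is_valid_data(word) else "spacer", word)
```

Because every valid data word has exactly one rail high per bit, a receiver can detect data arrival and the return to the spacer without a clock, which is what the handshaking relies on.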

JLPEA, Vol. 13, Pages 65: An Ultra Low Power Integer-N PLL with a High-Gain Sampling Phase Detector for IOT Applications in 65 nm CMOS

Javad Tavakoli — 2023-12-17

JLPEA, Vol. 13, Pages 65: An Ultra Low Power Integer-N PLL with a High-Gain Sampling Phase Detector for IOT Applications in 65 nm CMOS

Journal of Low Power Electronics and Applications doi: 10.3390/jlpea13040065

Authors: Javad Tavakoli Hossein Miri Lavasani Samad Sheikhaei

A low-power and low-jitter 1.2 GHz Integer-N PLL (INPLL) is designed in a 65 nm standard CMOS process. A novel high-gain sampling phase detector (PD), which takes advantage of a transconductance (Gm) cell to boost the gain, is developed to increase the phase detection gain by ~100× compared to the Phase-Frequency Detectors (PFDs) used in conventional PLLs. Using this high detection gain, the noise contribution of the PFD and Charge Pump (CP), reference clock, and dividers on the PLL output is minimized, enabling low output jitter at low power, even when using low-frequency reference clocks. To provide a sufficient frequency locking range, an auxiliary frequency-locked loop (AFLL) is embedded within the INPLL. An integrated Lock Detector (LD) detects the INPLL locked state and disables the AFLL to save power and minimize its impact on the INPLL jitter. The proposed INPLL layout measures 700 µm × 350 µm, consumes 350 µW, and exhibits an integrated phase noise (IPN) of −37 dBc (from 10 kHz to 10 MHz), equivalent to 2.9 ps rms jitter, while keeping the spur level 64 dBc lower, resulting in a jitter figure of merit (FoMjitter) of ~−236 dB.

JLPEA, Vol. 13, Pages 64: Design of a Low-Power Delay-Locked Loop-Based 8× Frequency Multiplier in 22 nm FDSOI

Naveed — 2023-12-12

JLPEA, Vol. 13, Pages 64: Design of a Low-Power Delay-Locked Loop-Based 8× Frequency Multiplier in 22 nm FDSOI

Journal of Low Power Electronics and Applications doi: 10.3390/jlpea13040064

Authors: Naveed Jeff Dix

A low-power delay-locked loop (DLL)-based frequency multiplier is presented. The multiplier is designed in 22 nm FDSOI and achieves 8× multiplication. The proposed DLL uses a new, simple duty cycle correction circuit and XOR-based logic for frequency multiplication. Current-starved delay cells are used to make the circuit power efficient. The circuit uses three 2× stages instead of an edge combiner to achieve 8× multiplication, thus requiring far less power and chip area than conventional phase-locked loop (PLL) circuits. The proposed 8× multiplier occupies an active area of 0.09 mm². The measurement result shows ultra-low power consumption of 130 µW at a 0.8 V supply. The post-layout simulation shows a timing jitter of 24 ps (pk-pk) at 2.44 GHz.
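How three cascaded 2× stages reach 8× can be sketched with an idealized sampled model. The sketch assumes each stage XORs its input with a copy delayed by exactly a quarter period and that the input has a 50% duty cycle; whether the authors' stages are built this way is not stated in the abstract, and all names and the sample counts below are illustrative.

```python
def square_wave(period, n_samples):
    """Ideal 50% duty-cycle square wave, one list entry per sample."""
    return [1 if (n % period) < period // 2 else 0 for n in range(n_samples)]


def xor_double(signal, period):
    """Double the frequency by XOR-ing the signal with a copy delayed by a
    quarter of its period. The circular shift is only valid because the
    buffer holds a whole number of periods (an assumption of this sketch)."""
    delay = period // 4
    delayed = signal[-delay:] + signal[:-delay]
    return [a ^ b for a, b in zip(signal, delayed)]


def count_rising_edges(signal):
    """Count 0 -> 1 transitions, treating the buffer as circular."""
    return sum(1 for n in range(len(signal))
               if signal[n - 1] == 0 and signal[n] == 1)


ref_period = 64                      # samples per reference period (assumed)
ref = square_wave(ref_period, 4 * ref_period)

out, period = ref, ref_period
for _ in range(3):                   # three cascaded 2x stages -> 8x
    out = xor_double(out, period)
    period //= 2

print(count_rising_edges(ref), count_rising_edges(out))  # prints: 4 32
```

In this model each stage only produces a clean 50% duty-cycle output when its input also has a 50% duty cycle, which suggests why a duty cycle correction circuit matters in a cascade of XOR doublers.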

JLPEA, Vol. 13, Pages 63: Signal Filtering Using Neuromorphic Measurements

Dorian Florescu — 2023-12-06

JLPEA, Vol. 13, Pages 63: Signal Filtering Using Neuromorphic Measurements

Journal of Low Power Electronics and Applications doi: 10.3390/jlpea13040063

Authors: Dorian Florescu Daniel Coca

Digital filtering is a fundamental technique in digital signal processing, which operates on a digital sequence without any information on how the sequence was generated. This paper proposes a methodology for designing the equivalent of digital filtering for neuromorphic samples, which are a low-power alternative to conventional digital samples. In the literature, filtering using neuromorphic samples is performed by filtering the reconstructed analog signal, which is required to belong to a predefined input space. We show that this requirement is not necessary, and introduce a new method for computing the neuromorphic samples of the filter output directly from the input samples, backed by theoretical guarantees. We show numerically that we can achieve a similar accuracy compared to that of the conventional method. However, given that we bypass the analog signal reconstruction step, our results show significantly reduced computation time for the proposed method and good performance even when signal recovery is not possible.

JLPEA, Vol. 13, Pages 62: Applications of Sustainable Hybrid Energy Harvesting: A Review

Hamna Shaukat — 2023-11-26

JLPEA, Vol. 13, Pages 62: Applications of Sustainable Hybrid Energy Harvesting: A Review

Journal of Low Power Electronics and Applications doi: 10.3390/jlpea13040062

Authors: Hamna Shaukat Ahsan Ali Shaukat Ali Wael A. Altabey Mohammad Noori Sallam A. Kouritem

This paper provides a short review of sustainable hybrid energy harvesting and its applications. The potential usage of self-powered wireless sensor network (WSN) systems has recently drawn a lot of attention to sustainable energy harvesting. The objective of this research is to determine the potential of hybrid energy harvesters to help single energy harvesters overcome their energy deficiency problems. The major findings of the study demonstrate how hybrid energy harvesting, which integrates various energy conversion technologies, may increase power outputs and improve space utilization efficiency. Hybrid energy harvesting involves collecting energy from multiple sources and converting it into electrical energy using various transduction mechanisms. By properly integrating different energy conversion technologies, hybridization can significantly increase power outputs and improve space utilization efficiency. Here, we present a review of recent progress in hybrid energy-harvesting systems for sustainable green energy harvesting and their applications in different fields. This paper starts with an introduction to hybrid energy harvesting, showing different hybrid energy harvester configurations, i.e., the integration of piezoelectric and electromagnetic energy harvesters; the integration of piezoelectric and triboelectric energy harvesters; the integration of piezoelectric, triboelectric, and electromagnetic energy harvesters; and others. The output performance of common hybrid systems that are reported in the literature is also outlined in this review. Afterwards, various potential applications of hybrid energy harvesting are discussed, showing the practical attainability of the technology. Finally, this paper concludes by making recommendations for future research to overcome the difficulties in developing hybrid energy harvesters. The recommendations revolve around improving energy conversion efficiency, developing advanced integration techniques, and investigating new hybrid configurations. Overall, this study offers insightful information on sustainable hybrid energy harvesting together with quantitative information, numerical findings, and useful research recommendations that progress and promote the use of this technology.

JLPEA, Vol. 13, Pages 61: Application Specific Reconfigurable Processor for Eyeblink Detection from Dual-Channel EOG Signal

Diba Das — 2023-11-23

JLPEA, Vol. 13, Pages 61: Application Specific Reconfigurable Processor for Eyeblink Detection from Dual-Channel EOG Signal

Journal of Low Power Electronics and Applications doi: 10.3390/jlpea13040061

Authors: Diba Das Mehdi Hasan Chowdhury Aditta Chowdhury Kamrul Hasan Quazi Delwar Hossain Ray C. C. Cheung

The electrooculogram (EOG) is one of the most significant signals carrying eye movement information, such as blinks and saccades. There are many human–computer interface (HCI) applications based on eye blinks. For example, the detection of eye blinks can be useful for paralyzed people in controlling wheelchairs. Eye blink features from EOG signals can be useful in drowsiness detection. In some applications of electroencephalograms (EEGs), eye blinks are considered noise. The accurate detection of eye blinks can help achieve denoised EEG signals. In this paper, we aimed to design an application-specific reconfigurable binary EOG signal processor to classify blinks and saccades. This work used dual-channel EOG signals containing horizontal and vertical EOG signals. At first, the EOG signals were preprocessed, and then, by extracting only two features, the root mean square (RMS) and standard deviation (STD), blinks and saccades were classified. In the classification stage, 97.5% accuracy was obtained using a support vector machine (SVM) at the simulation level. Further, we implemented the system on Xilinx Zynq-7000 FPGAs by hardware/software co-design. The processing was entirely carried out using a hybrid serial–parallel technique for low-power hardware optimization. The overall hardware accuracy for detecting blinks was 95%. The on-chip power consumption for this design was 0.8 watts, of which the dynamic power was 0.684 watts (86%) and the static power was 0.116 watts (14%).
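The two features named above are simple to compute, as the following sketch shows for one window of samples. It assumes the population form of the standard deviation and a plain list of samples per window; the abstract does not specify either, and the function names and sample values are illustrative only.

```python
import math
import statistics


def rms(window):
    """Root mean square of the samples in one window."""
    return math.sqrt(sum(x * x for x in window) / len(window))


def extract_features(window):
    """Return the (RMS, STD) feature pair for one window of EOG samples.
    The population standard deviation is an assumption of this sketch."""
    return rms(window), statistics.pstdev(window)


# Hypothetical windows, not real EOG data: a flat baseline with an offset
# and a zero-mean swing of the same magnitude.
baseline = [3.0, 3.0, 3.0, 3.0]
swing = [3.0, -3.0, 3.0, -3.0]

print(extract_features(baseline))
print(extract_features(swing))
```

The two windows have the same RMS but different standard deviations, which illustrates why the pair carries more information than either feature alone; the resulting pairs would then be passed to a classifier such as the SVM the abstract mentions.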

JLPEA, Vol. 13, Pages 60: Design of Current Equalization Circuit in Dual Ethernet Power Supply System

Xingyu Guan — 2023-11-18

JLPEA, Vol. 13, Pages 60: Design of Current Equalization Circuit in Dual Ethernet Power Supply System

Journal of Low Power Electronics and Applications doi: 10.3390/jlpea13040060

Authors: Xingyu Guan Xinyuan Hu Junkai Zhang Yanfeng Jiang

This paper presents the design of a current-balancing circuit for a dual-channel Ethernet power supply system, which addresses the mismatch between the two channels caused by unavoidable factors such as mismatched resistances, temperatures, and voltages. With this design, the current mismatch between the two power transmission paths can be held to less than 1% of its original value, and this holds as the load current and the PSE output voltage change. The maximum output power of the dual-channel power supply can reach up to 96.5 W. When the DC–DC conversion efficiency is less than 75%, it can still provide 72 W for the PD end, meeting the requirements of the PoE power system. The current-balancing circuit designed in the paper has potential application value in improving dual PoE power supply systems.

JLPEA, Vol. 13, Pages 59: From SW Timing Analysis and Safety Logging to HW Implementation: A Possible Solution with an Integrated and Low-Power Logger Approach

Francesco Cosimi — 2023-11-02

JLPEA, Vol. 13, Pages 59: From SW Timing Analysis and Safety Logging to HW Implementation: A Possible Solution with an Integrated and Low-Power Logger Approach

Journal of Low Power Electronics and Applications doi: 10.3390/jlpea13040059

Authors: Francesco Cosimi Antonio Arena Paolo Gai Sergio Saponara

In this manuscript, we propose a configurable hardware device for building a coherent data log unit. We address the need for analyzing mixed-criticality systems, thus guaranteeing the best performance without introducing additional sources of interference. Log data are essential to inspect the behavior of running applications when safety analyses or worst-case execution time measurements are performed. Furthermore, performance and timing investigations are useful for solving scheduling issues to balance resource budgets and investigate misbehavior and failure causes. We additionally present a performance evaluation and log capabilities by means of simulations on a RISC-V use case. The simulations highlight that such a data log unit can trace the execution from a single- to an octa-core microcontroller. Such an analysis allows a silicon developer to obtain the right sizings and timings of devices during the development phase. Finally, we present an analysis of a real RISC-V implementation for a Xilinx UltraScale+ FPGA, which was obtained with Vivado 2018. The results show that our data log unit implementation does not introduce a significant area overhead compared to the RISC-V core targeted for tests, and that the timing constraints are not violated.

JLPEA, Vol. 13, Pages 58: Analog System High-Level Synthesis for Energy-Efficient Reconfigurable Computing

Afolabi Ige — 2023-10-26

JLPEA, Vol. 13, Pages 58: Analog System High-Level Synthesis for Energy-Efficient Reconfigurable Computing

Journal of Low Power Electronics and Applications doi: 10.3390/jlpea13040058

Authors: Afolabi Ige Linhao Yang Hang Yang Jennifer Hasler Cong Hao

The design of analog computing systems requires significant human resources and domain expertise due to the lack of automation tools to enable these highly energy-efficient, high-performance computing nodes. This work presents the first automated tool flow from a high-level representation to a reconfigurable physical device. This tool begins with a high-level algorithmic description, utilizing either our custom Python framework or the XCOS GUI, to compile and optimize computations for integration into an Integrated Circuit (IC) design or a Field Programmable Analog Array (FPAA). An energy-efficient embedded speech classifier benchmark demonstrates the tool, which automatically generates either a GDSII layout or an FPAA switch list for the chosen target.

JLPEA, Vol. 13, Pages 57: Design and Implementation of an Open-Source and Internet-of-Things-Based Health Monitoring System

Sehrash Ashraf — 2023-10-22

JLPEA, Vol. 13, Pages 57: Design and Implementation of an Open-Source and Internet-of-Things-Based Health Monitoring System

Journal of Low Power Electronics and Applications doi: 10.3390/jlpea13040057

Authors: Sehrash Ashraf Shahnaz Parveen Khattak Mohammad Tariq Iqbal

Across the globe, COVID-19 had far-reaching impacts on healthcare facilities, public health, and all forms of transport. Hospitals were experiencing staffing shortages at the same time as patients were experiencing healthcare issues. Consequently, even in developing countries without full access to technology, remote health monitoring became necessary. The pandemic was more severe in countries with fewer financial and technical resources. It became evident that remote health monitoring systems that allow users not only to monitor their basic health information but also to communicate that information to healthcare personnel were essential. In this article, we present an open-source, Internet-of-Things (IoT)-based health monitoring system that is intended to mitigate the basic healthcare challenges faced in remote areas of developing countries. To facilitate remote health monitoring, an IoT server has been configured on an ESP32 chip as part of this study. The microcontroller was also connected to a MAX30100 sensor, a DHT11 sensor, and a global positioning system (GPS) module. As a result, the system can measure the heart rate (HR), blood oxygen level (SpO2), human body temperature, ambient temperature and humidity, as well as the location of the user. Through the internet protocol, the important vital signs can be displayed in real time on the dashboard using a private communication network. This article presents the details of a complete system design, implementation, testing, and results. Such systems can help limit the spread of infectious diseases like COVID-19.

JLPEA, Vol. 13, Pages 56: Theoretical Validation and Hardware Implementation of Dynamic Adaptive Scheduling for Heterogeneous Systems on Chip

A. Alper Goksoy — 2023-10-17

JLPEA, Vol. 13, Pages 56: Theoretical Validation and Hardware Implementation of Dynamic Adaptive Scheduling for Heterogeneous Systems on Chip

Journal of Low Power Electronics and Applications doi: 10.3390/jlpea13040056

Authors: A. Alper Goksoy Sahil Hassan Anish Krishnakumar Radu Marculescu Ali Akoglu Umit Y. Ogras

Domain-specific systems on chip (DSSoCs) aim to narrow the gap between general-purpose processors and application-specific designs. CPU clusters enable programmability, whereas hardware accelerators tailored to the target domain minimize task execution times and power consumption. Traditional operating system (OS) schedulers can diminish the potential of DSSoCs, as their execution times can be orders of magnitude larger than the task execution time. To address this problem, we propose a dynamic adaptive scheduling (DAS) framework that combines the advantages of a fast, low-overhead scheduler and a sophisticated, high-performance scheduler with a larger overhead. We present a novel runtime classifier that chooses the better scheduler type as a function of the system workload, leading to improved system performance and energy-delay product (EDP). Experiments with five real-world streaming applications indicate that DAS consistently outperforms both the fast, low-overhead scheduler and the slow, sophisticated scheduler. DAS achieves a 1.29× speedup and a 45% lower EDP than the sophisticated scheduler under low data rates and a 1.28× speedup and a 37% lower EDP than the fast scheduler when the workload complexity increases. Furthermore, we demonstrate that the superior performance of the DAS framework also applies to hardware platforms, with up to a 48% and 52% reduction in the execution time and EDP, respectively.

JLPEA, Vol. 13, Pages 55: A Low-Power Analog Cell for Implementing Spiking Neural Networks in 65 nm CMOS

John S. Venker — 2023-10-17

JLPEA, Vol. 13, Pages 55: A Low-Power Analog Cell for Implementing Spiking Neural Networks in 65 nm CMOS

Journal of Low Power Electronics and Applications doi: 10.3390/jlpea13040055

Authors: John S. Venker Luke Vincent Jeff Dix

A Spiking Neural Network (SNN) is realized within a 65 nm CMOS process to demonstrate the feasibility of its constituent cells. Analog hardware neural networks have shown improved energy efficiency in edge computing for real-time-inference applications, such as speech recognition. The proposed network uses a leaky integrate-and-fire neuron scheme for computation, interleaved with a Spike Timing Dependent Plasticity (STDP) circuit for implementing synaptic-like weights. The low-power, asynchronous analog neurons and synapses are tailored for the VLSI environment needed to effectively make use of hardware SNN systems. To demonstrate functionality, a feed-forward Spiking Neural Network composed of two layers, the first with ten neurons and the second with six, is implemented. The neuron design consumes 2.1 pJ of energy per spike and 20 pJ per synaptic operation.
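The leaky integrate-and-fire behavior referred to above can be sketched in software as follows. This is a generic discrete-time textbook model, not the authors' analog cell; the leak factor, threshold, reset value, and input current are all assumed values chosen for illustration.

```python
def simulate_lif(input_current, leak=0.9, threshold=1.0, v_reset=0.0):
    """Discrete-time leaky integrate-and-fire neuron (illustrative model).

    At each step the membrane value decays by the leak factor, the input is
    added, and a spike is emitted with a reset once the threshold is reached.
    Returns the list of time steps at which the neuron spiked.
    """
    v = v_reset
    spike_times = []
    for t, i_in in enumerate(input_current):
        v = leak * v + i_in          # leak, then integrate the input
        if v >= threshold:           # fire
            spike_times.append(t)
            v = v_reset              # reset after the spike
    return spike_times


# A constant drive strong enough to overcome the leak gives regular spikes.
print(simulate_lif([0.3] * 20))      # prints: [3, 7, 11, 15, 19]

# A weak drive settles at 0.05 / (1 - 0.9) = 0.5, below threshold: no spikes.
print(simulate_lif([0.05] * 200))    # prints: []
```

The second call shows the effect of the leak: an input that would eventually fire a perfect integrator never reaches the threshold here.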

JLPEA, Vol. 13, Pages 54: Design and Optimization of an Ultra-Low-Power Cross-Coupled LC VCO with a DFF Frequency Divider for 2.4 GHz RF Receivers Using 65 nm CMOS Technology

Muhammad Faisal Siddiqui — 2023-10-07

JLPEA, Vol. 13, Pages 54: Design and Optimization of an Ultra-Low-Power Cross-Coupled LC VCO with a DFF Frequency Divider for 2.4 GHz RF Receivers Using 65 nm CMOS Technology

Journal of Low Power Electronics and Applications doi: 10.3390/jlpea13040054

Authors: Muhammad Faisal Siddiqui Mukesh Kumar Maheshwari Muhammad Raza Aurangzeb Rashid Masud

This article presents the design and optimization of a tunable quadrature differential LC CMOS voltage-controlled oscillator (VCO) with a D flip-flop (DFF) frequency divider. The VCO is designed for the low-power and low-phase-noise applications of 2.4 GHz IoT/BLE receivers and wireless sensor devices. The proposed design comprises the proper stacking of an LC VCO and a DFF frequency divider and is simulated using a TSMC 65 nm CMOS technology, and it has a tuning range of 4.4 to 5.7 GHz. The voltage headroom is preserved using a high-impedance on-chip passive inductor at the tail for filtering and enabling true differential operation. The VCO and frequency divider together consume as little as 2.02 mW, with the VCO section consuming only 0.47 mW. The active area of the chip including the pads is only 0.47 mm². Compared to other related research, the designed VCO achieved a better phase noise of −118.36 dBc/Hz at a 1 MHz offset frequency with a 1.2 V supply voltage, and a better FoM of −196.44 dBc/Hz.

JLPEA, Vol. 13, Pages 53: A Power-Gated 8-Transistor Physically Unclonable Function Accelerates Evaluation Speeds

Yujin Zheng — 2023-09-29

JLPEA, Vol. 13, Pages 53: A Power-Gated 8-Transistor Physically Unclonable Function Accelerates Evaluation Speeds

Journal of Low Power Electronics and Applications doi: 10.3390/jlpea13040053

Authors: Yujin Zheng Alex Yakovlev Alex Bystrov

The proposed 8-Transistor (8T) Physically Unclonable Function (PUF), in conjunction with the power gating technique, can make a single evaluation cycle more than 100,000 times faster than that of a 6-Transistor (6T) Static Random-Access Memory (SRAM) PUF. The 8T PUF is built to swiftly eliminate data remanence and maximise physical mismatch. Moreover, a two-phase power gating module is devised to provide controllable power on/off cycles for the chosen PUF clusters in order to facilitate fast statistical measurements and curb the in-rush current. The architecture and hardware implementation of the power-gated PUF are developed to accommodate fast multiple evaluations of PUF Responses. The fast speed enables a new data processing method, which coordinates Dark-bit masking and Multiple Temporal Majority Voting (TMV) in different Process, Voltage and Temperature (PVT) corners or during field usage, hence greatly reducing the Bit Error Rate (BER) and the hardware penalty for error correction. The designs are based on the UMC 65 nm technology and aim to tape out an Application-Specific Integrated Circuit (ASIC) chip. Post-layout Monte Carlo (MC) simulations are performed with Cadence, and the extracted PUF Responses are processed with Matlab to evaluate the 8T PUF performance and statistical metrics for subsequent inclusion in PUF Responses, which comprise the novelty of this approach.

JLPEA, Vol. 13, Pages 52: FFC-NMR Power Supply with Hybrid Control of the Semiconductor Devices

António Roque — 2023-09-19

JLPEA, Vol. 13, Pages 52: FFC-NMR Power Supply with Hybrid Control of the Semiconductor Devices

Journal of Low Power Electronics and Applications doi: 10.3390/jlpea13030052

Authors: António Roque Duarte M. Sousa Pedro J. Sebastião Vítor Silva Elmano Margato

The performance of FFC-NMR power supplies is evaluated not only considering the technique requirements but also comparing efficiencies and power consumption. Since the characteristics of FFC-NMR power supplies depend on the power circuit topology and on the control solutions, the control design is a core aspect for the development of new FFC systems. A new hybrid solution is described that allows the power semiconductors to be controlled either as switches (ON/OFF mode) or as linear devices. The approach avoids over-design of the power supply and makes it possible to implement new low-power solutions; its novelty lies in a continuous match between the ON/OFF mode and the linear control of the power semiconductor devices.

JLPEA, Vol. 13, Pages 51: An Investigation of the Operating Principles and Power Consumption of Digital-Based Analog Amplifiers

Anna Richelli — 2023-09-08

JLPEA, Vol. 13, Pages 51: An Investigation of the Operating Principles and Power Consumption of Digital-Based Analog Amplifiers

Journal of Low Power Electronics and Applications doi: 10.3390/jlpea13030051

Authors: Anna Richelli Paolo Faustini Andrea Rosa Luigi Colalongo

Digital-based differential amplifiers (DDAs) are particularly well suited to low-voltage digital integrated circuit technologies. This paper presents an exhaustive analysis of digital-based analog amplifiers to take advantage of today’s high-performance digital technologies, and of computer-aided design (CAD), which is commonly employed to design integrated circuits. The operating principle and the main mathematical relations of digital-based differential amplifiers are discussed along with an exhaustive explanation of their operating regions and of the corresponding power consumption. These aspects, which are not discussed in the literature, are very important for circuit designers. Finally, a detailed description of the design procedure in the UMC 180 nm standard CMOS technology is provided.

JLPEA, Vol. 13, Pages 50: Address Obfuscation to Protect against Hardware Trojans in Network-on-Chips

Thomas Mountford — 2023-09-06

JLPEA, Vol. 13, Pages 50: Address Obfuscation to Protect against Hardware Trojans in Network-on-Chips

Journal of Low Power Electronics and Applications doi: 10.3390/jlpea13030050

Authors: Thomas Mountford Abhijitt Dhavlle Andrew Tevebaugh Naseef Mansoor Sai Manoj Pudukotai Dinakarrao Amlan Ganguly

In modern computing, which relies on the interconnection of networks used in many/multi-core systems, any system can be critically subverted if the interconnection is compromised. This can be done in a multitude of ways, but the threat of a hardware Trojan (HT) being injected into a system is particularly prevalent due to the increase in third-party manufacturers for system-on-chip (SoC) designs. With a local injection of an HT in an SoC, an adversary can gain access to information about applications running on the system by revealing specific communications of the SoC, and the network-on-chip (NoC) as a whole. This heavily compromises the system and gives information to the attacker, which can lead to more tailored, compromising attacks. In this paper, we demonstrate an HT that exploits communication patterns inside an SoC to reveal applications that are running on an NoC with multi/many-core processors. This is performed by leaking packet counts, after which the attacker then uses machine learning techniques to identify applications running on processors, and the SoC as a whole. We also propose a LUT-based obfuscation technique to limit the information available to the hardware Trojan. Our results indicate that this obfuscation method can reduce the accuracy of this attack from 99% to <8% in multi/many-core systems.

JLPEA, Vol. 13, Pages 49: An Improved Lightweight Network Using Attentive Feature Aggregation for Object Detection in Autonomous Driving

Priyank Kalgaonkar — 2023-08-10

JLPEA, Vol. 13, Pages 49: An Improved Lightweight Network Using Attentive Feature Aggregation for Object Detection in Autonomous Driving

Journal of Low Power Electronics and Applications doi: 10.3390/jlpea13030049

Authors: Priyank Kalgaonkar Mohamed El-Sharkawy

Object detection, a more advanced application of computer vision than image classification, utilizes deep neural networks to predict objects in an input image and determine their locations through bounding boxes. The field of artificial intelligence has increasingly focused on the demands of autonomous driving, which require both high accuracy and fast inference speeds. This research paper aims to address this demand by introducing an efficient lightweight network for object detection specifically designed for self-driving vehicles. The proposed network, named MobDet3, incorporates a modified MobileNetV3 as its backbone, leveraging its lightweight convolutional neural network algorithm to extract and aggregate image features. Furthermore, the network integrates altered techniques in computer vision and adjusts to the most recent iteration of the PyTorch framework. The MobDet3 network enhances not only object positioning ability but also the reusability of feature maps across different scales. Extensive evaluations were conducted to assess the effectiveness of the proposed network, utilizing an autonomous driving dataset, as well as large-scale everyday human and object datasets. These evaluations were performed on NXP BlueBox 2.0, an advanced edge development platform designed for autonomous vehicles. The results demonstrate that the proposed lightweight object detection network achieves a mean precision of up to 58.30% on the BDD100K dataset and a high inference speed of up to 88.92 frames per second on NXP BlueBox 2.0, making it well-suited for real-time object detection in autonomous driving applications.

JLPEA, Vol. 13, Pages 48: TCI Tester: A Chip Tester for Inductive Coupling Wireless Through-Chip Interface

Hideto Kayashima — 2023-08-04

JLPEA, Vol. 13, Pages 48: TCI Tester: A Chip Tester for Inductive Coupling Wireless Through-Chip Interface

Journal of Low Power Electronics and Applications doi: 10.3390/jlpea13030048

Authors: Hideto Kayashima Hideharu Amano

The building block computation system is constructed by stacking various chips three-dimensionally. The stacked chips incorporate the same TCI IP (Through Chip Interface Intellectual Property) but cannot provide identical characteristics, requiring adjustments in power supply and bias voltage. However, providing characteristics measurement hardware for all chips is difficult due to limitations in chip area or pin count. To address this problem, we developed TCI Tester, a small chip that measures electric characteristics when stacked on the TCI of every chip. By stacking two TCI Tester chips, it appears that the up-directional data transfer has a stricter condition than the down-directional one on power supply voltage and operational frequency. Also, the transfer performance is poorer than designed. Similar measurement results are obtained by stacking TCI Tester on other chips with TCI IP. To investigate the reason, we analyzed the power grid resistance of various chips with the TCI IP. Results also showed that the chips with higher resistance have a narrower operational condition and poorer performance. The results suggest that the power grid design is important for keeping the performance through the TCI channel.

JLPEA, Vol. 13, Pages 47: Programmable Energy-Efficient Analog Multilayer Perceptron Architecture Suitable for Future Expansion to Hardware Accelerators

Jeff Dix — 2023-07-31

JLPEA, Vol. 13, Pages 47: Programmable Energy-Efficient Analog Multilayer Perceptron Architecture Suitable for Future Expansion to Hardware Accelerators

Journal of Low Power Electronics and Applications doi: 10.3390/jlpea13030047

Authors: Jeff Dix Jeremy Holleman Benjamin J. Blalock

A programmable, energy-efficient analog hardware implementation of a multilayer perceptron (MLP) is presented featuring a highly programmable system that offers the user the capability to create an MLP neural network hardware design within the available framework. In addition to programmability, this implementation provides energy-efficient operation via analog/mixed-signal design. The configurable system is made up of 12 neurons and is fabricated in a standard 130 nm CMOS process occupying approximately 1 mm² of on-chip area. The system architecture is analyzed in several different configurations with each achieving a power efficiency of greater than 1 tera-operations per watt. This work offers an energy-efficient and scalable alternative to digital configurable neural networks that can be built upon to create larger networks capable of standard machine learning applications, such as image and text classification. This research details a programmable hardware implementation of an MLP that achieves a peak power efficiency of 5.23 tera-operations per watt while consuming considerably less power than comparable digital and analog designs. This paper describes circuit elements that can readily be scaled up at the system level to create a larger neural network architecture capable of improved energy efficiency.

JLPEA, Vol. 13, Pages 46: Review of Orthogonal Frequency Division Multiplexing-Based Modulation Techniques for Light Fidelity

Rahmayati Alindra — 2023-07-26

JLPEA, Vol. 13, Pages 46: Review of Orthogonal Frequency Division Multiplexing-Based Modulation Techniques for Light Fidelity

Journal of Low Power Electronics and Applications doi: 10.3390/jlpea13030046

Authors: Rahmayati Alindra Purnomo Sidi Priambodo Kalamullah Ramli

Light Fidelity (LiFi) technology has gained attention and is growing rapidly today. Utilizing light as a propagation medium allows LiFi to promise a wider bandwidth than existing Wireless Fidelity (WiFi) technology and enables the implementation of cellular technology to improve bandwidth utilization. In addition, LiFi is very attractive because it can utilize lighting facilities consisting of light-emitting diodes (LEDs). A LiFi system that uses intensity modulation and direct detection requires the signal of orthogonal frequency division multiplexing (OFDM) to have a real and non-negative value; therefore, certain adjustments must be made. The proposed methods for generating unipolar signals vary from adding a direct current, clipping the signal, superposing several unipolar signals, and hybrid methods as in DC-biased optical (DCO)-OFDM, asymmetrically clipped optical (ACO)-OFDM, layered ACO (LACO)-OFDM, and asymmetrically clipped DC-biased optical (ADO)-OFDM, respectively. In this paper, we review and compare various modulation techniques to support the implementation of LiFi systems using commercial LEDs. The main objective is to obtain a modulation technique with good energy efficiency, efficient spectrum utilization, and low computational complexity so that it is easy for us to apply it in experiments on a laboratory scale.

JLPEA, Vol. 13, Pages 45: BFT—Low-Latency Bit-Slice Design of Discrete Fourier Transform

Cataldo Guaragnella — 2023-07-18

JLPEA, Vol. 13, Pages 45: BFT—Low-Latency Bit-Slice Design of Discrete Fourier Transform

Journal of Low Power Electronics and Applications doi: 10.3390/jlpea13030045

Authors: Cataldo Guaragnella Agostino Giorgio Maria Rizzi

Structures for the evaluation of fast Fourier transforms are important components in several signal-processing applications and communication systems. Their capabilities play a key role in the performance enhancement of the whole system in which they are embedded. In this paper, a novel implementation of the discrete Fourier transform is proposed, based on a bit-slice approach and on the exploitation of the input sequence finite word length. Input samples of the sequence to be transformed are split into binary sequences and each one is Fourier transformed using only complex sums. An FPGA-based solution characterized by low latency and low power consumption is designed. Simulations have been carried out, first in the Matlab environment, then emulated in the Intel Quartus IDE. The hardware implementation of the conceived system and the test for the functional accuracy verification have been performed, adopting the DE2-115 development board from Terasic, which is equipped with the Cyclone IV EP4CE115F29C7 FPGA by Intel.
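The bit-slice idea rests on the linearity of the DFT: if each unsigned sample is written as a sum of its bits weighted by powers of two, the transform is the same weighted sum of the transforms of the individual bit planes, and each bit-plane transform only adds twiddle factors where the bit is 1. The sketch below illustrates that decomposition in software; it assumes unsigned integer inputs, and the function names are illustrative rather than taken from the paper or its FPGA design.

```python
import cmath
from typing import List


def bit_slice_dft(samples: List[int], word_length: int) -> List[complex]:
    """DFT of unsigned integers computed one bit plane at a time."""
    n = len(samples)
    twiddle = [cmath.exp(-2j * cmath.pi * m / n) for m in range(n)]
    spectrum = [0j] * n
    for b in range(word_length):
        plane = [(s >> b) & 1 for s in samples]  # binary sequence for bit b
        for k in range(n):
            acc = 0j
            for idx, bit in enumerate(plane):
                if bit:  # only complex additions, no multiplications
                    acc += twiddle[(idx * k) % n]
            # Weighting by 2**b is a shift in hardware.
            spectrum[k] += acc * (1 << b)
    return spectrum


def direct_dft(samples: List[int]) -> List[complex]:
    """Reference DFT by the textbook definition, for comparison."""
    n = len(samples)
    return [sum(x * cmath.exp(-2j * cmath.pi * i * k / n)
                for i, x in enumerate(samples)) for k in range(n)]


data = [3, 0, 7, 5, 1, 6, 2, 4]  # 3-bit samples
sliced = bit_slice_dft(data, word_length=3)
reference = direct_dft(data)
print(max(abs(a - b) for a, b in zip(sliced, reference)))
```

The printed maximum difference should be at floating-point rounding level. The sketch makes no claim about latency or resource use, which depend on how the bit planes are processed in parallel in hardware.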

JLPEA, Vol. 13, Pages 44: Electromigration-Aware Memory Hierarchy Architecture

Freddy Gabbay — 2023-07-11

JLPEA, Vol. 13, Pages 44: Electromigration-Aware Memory Hierarchy Architecture

Journal of Low Power Electronics and Applications doi: 10.3390/jlpea13030044

Authors: Freddy Gabbay Avi Mendelson

New mission-critical applications, such as autonomous vehicles and life-support systems, set a high bar for the reliability of modern microprocessors that operate in highly challenging conditions. However, while cutting-edge integrated circuit (IC) technologies have intensified microprocessors by providing remarkable reductions in the silicon area and power consumption, they also introduce new reliability challenges through the complex design rules they impose, creating a significant hurdle in the design process. In this paper, we focus on electromigration (EM), which is a crucial factor impacting IC reliability. EM refers to the degradation process of IC metal nets when used for both power supply and interconnecting signals. Typically, EM concerns have been addressed at the backend, circuit, and layout levels, where EM rules are enforced assuming extreme conditions to identify and resolve violations. This study presents new techniques that leverage architectural features to mitigate the effect of EM on the memory hierarchy of modern microprocessors. Architectural approaches can reduce the complexity of solving EM-related violations, and they can also complement and enhance common existing methods. In this study, we present a comprehensive simulation analysis that demonstrates how the proposed solution can significantly extend the lifetime of a microprocessor’s memory hierarchy with minimal overhead in terms of performance, power, and area while relaxing EM design efforts.

JLPEA, Vol. 13, Pages 43: An Extended Range Divider Technique for Multi-Band PLL

Rizwan Shaik Peerla — 2023-07-05

JLPEA, Vol. 13, Pages 43: An Extended Range Divider Technique for Multi-Band PLL

Journal of Low Power Electronics and Applications doi: 10.3390/jlpea13030043

Authors: Rizwan Shaik Peerla Ashudeb Dutta Bibhu Datta Sahoo

This paper presents a multiplexer-based extended range multi-modulus divider (ER-MMD) technique for multi-band phase locked loop (PLL). The architecture maintains a modular structure by using conventional 2/3 divider cells and a multiplexer without adding any extra logic circuitry. The area and power overhead is minimal. The 2/3 divider cells are designed using true single phase clock (TSPC) logic for ER-MMD to operate in the sub-10 GHz range. A division range of 2 to 511 is achieved using this logic. The ER-MMD operates at a maximum frequency of 6 GHz with a worst-case current of 625 μA when powered with a 1 V supply. A dual voltage controlled oscillator (VCO), L5/S band PLL for Indian Regional Navigation Satellite System (IRNSS) application is designed, which incorporates an ER-MMD based on the proposed approach as a proof of concept. This technique achieves the best power efficiency of 12 GHz/mW, among the state-of-the-art ER-MMD designs.
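The stated division range of 2 to 511 is consistent with the usual arithmetic for a chain of 2/3 divider cells: a chain of n cells with control bits p_0 … p_(n−1) divides by 2^n + Σ p_i·2^i, and allowing the effective chain length to vary from 1 to n extends the range down to 2. The sketch below only reproduces that arithmetic. The cell count of eight is inferred from the 2 to 511 range and is not stated in the abstract, and the code does not model the multiplexer-based selection the paper proposes.

```python
from typing import List, Tuple


def mmd_division_ratio(control_bits: List[int]) -> int:
    """Division ratio of a chain of 2/3 cells: 2**n + sum(p_i * 2**i)."""
    n = len(control_bits)
    return (1 << n) + sum(bit << i for i, bit in enumerate(control_bits))


def extended_range(max_cells: int) -> Tuple[int, int]:
    """Range reachable when the effective chain length is 1..max_cells."""
    return 2, (1 << (max_cells + 1)) - 1


print(mmd_division_ratio([0]))        # shortest chain, all cells dividing by 2
print(mmd_division_ratio([1] * 8))    # eight cells, all dividing by 3
print(extended_range(8))              # assumed eight-cell chain
```

With a fixed eight-cell chain the ratio would be limited to 256 through 511; it is the variable effective length that fills in the ratios from 2 to 255.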

JLPEA, Vol. 13, Pages 42: FTFNet: Multispectral Image Segmentation

Justin Edwards — 2023-06-30

JLPEA, Vol. 13, Pages 42: FTFNet: Multispectral Image Segmentation

Journal of Low Power Electronics and Applications doi: 10.3390/jlpea13030042

Authors: Justin Edwards Mohamed El-Sharkawy

Semantic segmentation is a machine learning task that is seeing increased utilization in multiple fields, from medical imagery to land demarcation and autonomous vehicles. A real-time autonomous system must be lightweight while maintaining reasonable accuracy. This research focuses on leveraging the fusion of long-wave infrared (LWIR) imagery with visual spectrum imagery to fill in the inherent performance gaps when using visual imagery alone. This approach culminated in the Fast Thermal Fusion Network (FTFNet), which shows marked improvement over the baseline architecture of the Multispectral Fusion Network (MFNet) while maintaining a low footprint.

JLPEA, Vol. 13, Pages 41: Resonator Arrays for Linear Position Sensors

Mattia Simonazzi — 2023-06-07

JLPEA, Vol. 13, Pages 41: Resonator Arrays for Linear Position Sensors

Journal of Low Power Electronics and Applications doi: 10.3390/jlpea13020041

Authors: Mattia Simonazzi Leonardo Sandrolini Andrea Mariscotti

A contactless position sensor based on an array of magnetically coupled resonators and an external single coil cell is discussed for both stationary and dynamic applications. The simple structure allows the sensor to be adapted to the system in which it is installed and can be used to detect the positions of objects in motion that bear an external resonator coil that does not necessitate a supply. By exploiting the unique behaviour of the array input impedance, it is possible to identify the position of the external resonator by exciting the first array cell with an external voltage source and measuring the resulting input current. The system is robust and suitable for application in harsh environments. The sensitivity of the measured input impedance to the space variation is adjustable with the definition of the array geometry and is analysed. Different configurations of the array and external resonator are considered, and the effects of various termination conditions and the resulting factor of merit after changing the coil resistance are discussed. The proposed procedure is numerically validated for an array of ten identical magnetically coupled resonators with 15 cm side lengths. Simulations carried out for a distance of up to 20 cm show that, with a quality factor lower than 100 and optimal terminations of both the array and external coil, it is possible to detect the position of the latter.

JLPEA, Vol. 13, Pages 40: Efficient GEMM Implementation for Vision-Based Object Detection in Autonomous Driving Applications

Fatima Zahra Guerrouj — 2023-06-06

JLPEA, Vol. 13, Pages 40: Efficient GEMM Implementation for Vision-Based Object Detection in Autonomous Driving Applications

Journal of Low Power Electronics and Applications doi: 10.3390/jlpea13020040

Authors: Fatima Zahra Guerrouj Sergio Rodríguez Flórez Mohamed Abouzahir Abdelhafid El Ouardi Mustapha Ramzi

Convolutional Neural Networks (CNNs) have been incredibly effective for object detection tasks. YOLOv4 is a state-of-the-art object detection algorithm designed for embedded systems. It is based on YOLOv3 and has improved accuracy, speed, and robustness. However, deploying CNNs on embedded systems such as Field Programmable Gate Arrays (FPGAs) is difficult due to their limited resources. To address this issue, FPGA-based CNN architectures have been developed to improve the resource utilization of CNNs, resulting in improved accuracy and speed. This paper examines the use of General Matrix Multiplication Operations (GEMM) to accelerate the execution of YOLOv4 on embedded systems. It reviews the most recent GEMM implementations and evaluates their accuracy and robustness. It also discusses the challenges of deploying YOLOv4 on autonomous vehicle datasets. Finally, the paper presents a case study demonstrating the successful implementation of YOLOv4 on an Intel Arria 10 embedded system using GEMM.

JLPEA, Vol. 13, Pages 39: Nanomaterial-Based Sensor Array Signal Processing and Tuberculosis Classification Using Machine Learning

Chenxi Liu — 2023-05-29

JLPEA, Vol. 13, Pages 39: Nanomaterial-Based Sensor Array Signal Processing and Tuberculosis Classification Using Machine Learning

Journal of Low Power Electronics and Applications doi: 10.3390/jlpea13020039

Authors: Chenxi Liu Israel Cohen Rotem Vishinkin Hossam Haick

Tuberculosis (TB) has long been recognized as a significant health concern worldwide. Recent advancements in noninvasive wearable devices and machine learning (ML) techniques have enabled rapid and cost-effective testing for the real-time detection of TB. However, small datasets are often encountered in biomedical and chemical engineering domains, which can hinder the success of ML models and result in overfitting issues. To address this challenge, we propose various data preprocessing methods and ML approaches, including long short-term memory (LSTM), convolutional neural network (CNN), Gramian angular field-CNN (GAF-CNN), and multivariate time series with MinCutPool (MT-MinCutPool), for classifying a small TB dataset consisting of multivariate time series (MTS) sensor signals. Our proposed methods are compared with state-of-the-art models commonly used in MTS classification (MTSC) tasks. We find that lightweight models are more appropriate for small-dataset problems. Our experimental results demonstrate that the average performance of our proposed models outperformed the baseline methods in all aspects. Specifically, the GAF-CNN model achieved the highest accuracy of 0.639 and the highest specificity of 0.777, indicating its superior effectiveness for MTSC tasks. Furthermore, our proposed MT-MinCutPool model surpassed the baseline MTPool model in all evaluation metrics, demonstrating its viability for MTSC tasks.
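For readers unfamiliar with the Gramian angular field used in the GAF-CNN model, the sketch below shows the commonly used summation variant: the series is rescaled to [−1, 1], each value is mapped to an angle φ = arccos(x), and the image entry (i, j) is cos(φ_i + φ_j). The abstract does not say whether the summation or the difference variant was used, nor how the multivariate signals were combined, so this is a single-channel illustration under those assumptions with an illustrative function name.

```python
import math
from typing import List


def gramian_angular_summation_field(series: List[float]) -> List[List[float]]:
    """Encode a 1-D series as a GASF image (assumes max(series) > min(series))."""
    lo, hi = min(series), max(series)
    scaled = [2.0 * (x - lo) / (hi - lo) - 1.0 for x in series]
    # Clamp against rounding before taking arccos.
    phi = [math.acos(max(-1.0, min(1.0, v))) for v in scaled]
    return [[math.cos(a + b) for b in phi] for a in phi]


image = gramian_angular_summation_field([0.0, 1.0, 2.0])
for row in image:
    print([round(v, 3) for v in row])
```

The resulting matrix is symmetric and has one row and column per time step, which is what allows a 2-D CNN to be applied to a time series.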

JLPEA, Vol. 13, Pages 38: Ultra-Low-Power ICs for the Internet of Things

Orazio Aiello — 2023-05-26

JLPEA, Vol. 13, Pages 38: Ultra-Low-Power ICs for the Internet of Things

Journal of Low Power Electronics and Applications doi: 10.3390/jlpea13020038

Authors: Orazio Aiello

The collection of research works in this Special Issue focuses on Ultra-Low-Power (ULP) Integrated Circuits (ICs) operating under a tight budget of power as a criterion to build electronic devices relying less and less on batteries [...]

JLPEA, Vol. 13, Pages 37: Ultra-Low Power Programmable Bandwidth Capacitively-Coupled Chopper Instrumentation Amplifier Using 0.2 V Supply for Biomedical Applications

Xuan Thanh Pham — 2023-05-24

JLPEA, Vol. 13, Pages 37: Ultra-Low Power Programmable Bandwidth Capacitively-Coupled Chopper Instrumentation Amplifier Using 0.2 V Supply for Biomedical Applications

Journal of Low Power Electronics and Applications doi: 10.3390/jlpea13020037

Authors: Xuan Thanh Pham Xuan Thuc Kieu Manh Kha Hoang

This paper presents a capacitively coupled chopper instrumentation amplifier (CCIA) with ultra-low power consumption and programmable bandwidth for biomedical applications. To achieve a flexible bandwidth from 0.2 to 10 kHz without additional power consumption, a programmable Miller compensation technique is proposed and used in the CCIA. By using a squeezed-inverter amplifier (SQI) that employs a 0.2 V supply, the proposed CCIA addresses the primary noise source in the first stage, resulting in high noise power efficiency. The proposed CCIA is designed using a 0.18 µm CMOS technology process and has a chip area of 0.083 mm². With a power consumption of 0.47 µW at 0.2 and 0.8 V supply, the proposed amplifier architecture achieves a thermal noise of 28 nV/√Hz, an input-referred noise (IRN) of 0.9 µVrms, a closed-loop gain (AV) of 40 dB, a power supply rejection ratio (PSRR) of 87.6 dB, and a common-mode rejection ratio (CMRR) of 117.7 dB according to post-simulation data. The proposed CCIA achieves a noise efficiency factor (NEF) of 1.47 and a power efficiency factor (PEF) of 0.56, which allows comparison with the latest research results.

JLPEA, Vol. 13, Pages 36: AMA: An Ageing Task Migration Aware for High-Performance Computing

Emmanuel Ofori-Attah — 2023-05-22

JLPEA, Vol. 13, Pages 36: AMA: An Ageing Task Migration Aware for High-Performance Computing

Journal of Low Power Electronics and Applications doi: 10.3390/jlpea13020036

Authors: Emmanuel Ofori-Attah Michael Opoku Agyeman

The dark-silicon challenge poses a design problem for future many-core systems. As a result of this, several techniques have been introduced to improve the number of processing elements that can be powered on. One of the techniques employed by many is Task Migration. In this paper, an Ageing Task Migration Aware for High-Performance Computing (AMA) is proposed to improve the lifetime of nodes. The proposed method determines which clusters applications are mapped to and migrates high-demand tasks amongst nodes to improve the lifetime at every epoch. Experimental results show that the proposed method outperforms state-of-the-art techniques by more than 10%.

JLPEA, Vol. 13, Pages 34: Evaluation of Polylactic Acid Polymer as a Substrate in Rectenna for Ambient Radiofrequency Energy Harvesting

Pangsui Usifu Linge — 2023-05-12

JLPEA, Vol. 13, Pages 34: Evaluation of Polylactic Acid Polymer as a Substrate in Rectenna for Ambient Radiofrequency Energy Harvesting

Journal of Low Power Electronics and Applications doi: 10.3390/jlpea13020034

Authors: Pangsui Usifu Linge Tony Gerges Pascal Bevilacqua Jean-Marc Duchamp Philippe Benech Jacques Verdier Philippe Lombard Michel Cabrera Pierre Tsafack Fabien Mieyeville Bruno Allard

This work details the design and experimental characterization of a 2D rectenna for scavenging radio frequency energy at 2.45 GHz (WiFi band), fabricated on polylactic acid polymer (PLA) using a plastronics approach. PLA is the RF substrate of both antenna and rectifier. The two transmission line (TTL) approach is used to characterize the substrate properties to be considered during design. A linearly polarized patch antenna with microstrip transmission feeding is connected to a single series diode rectifier through a T-matching network. The antenna has simulated and measured gain of 7.6 dB and 7.5 dB, respectively. The rectifier has a measured DC output power of 0.96 μW at an optimal load of 2 kΩ under RF input power of −20 dBm at 2.45 GHz. The power conversion efficiency is 9.6% in the latter conditions for a 54 × 36 mm patch antenna of a 1.5 mm thick PLA substrate obtained from additive manufacturing. The power conversion efficiency reaches a value of 28.75% when the input power is −10 dBm at 2.45 GHz. This corresponds to a peak DC power of 28.75 μW when the optimal load is 1.5 kΩ. The results compare significantly with the ones of a similar rectenna circuit manufactured on preferred RF substrate.
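The efficiency figures above follow directly from converting the RF input power from dBm to watts and dividing the DC output power by it. The short sketch below reproduces that arithmetic from the numbers in the abstract; the function names are illustrative.

```python
def dbm_to_watts(power_dbm: float) -> float:
    """Convert a power level in dBm to watts (0 dBm = 1 mW)."""
    return 1e-3 * 10 ** (power_dbm / 10.0)


def power_conversion_efficiency(p_in_dbm: float, p_dc_watts: float) -> float:
    """Ratio of DC output power to RF input power."""
    return p_dc_watts / dbm_to_watts(p_in_dbm)


# -20 dBm is 10 uW of RF input; 0.96 uW of DC output was measured.
print(f"{power_conversion_efficiency(-20, 0.96e-6):.2%}")
# -10 dBm is 100 uW of RF input; 28.75 uW of DC output was measured.
print(f"{power_conversion_efficiency(-10, 28.75e-6):.2%}")
```

Both results agree with the reported 9.6% and 28.75%, so the efficiency and DC power figures in the abstract are mutually consistent.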

JLPEA, Vol. 13, Pages 35: A 0.15-to-0.5 V Body-Driven Dynamic Comparator with Rail-to-Rail ICMR

Riccardo Della Sala — 2023-05-11

JLPEA, Vol. 13, Pages 35: A 0.15-to-0.5 V Body-Driven Dynamic Comparator with Rail-to-Rail ICMR

Journal of Low Power Electronics and Applications doi: 10.3390/jlpea13020035

Authors: Riccardo Della Sala Valerio Spinogatti Cristian Bocciarelli Francesco Centurelli Alessandro Trifiletti

In this paper, a novel dynamic body-driven ultra-low voltage (ULV) comparator is presented. The proposed topology takes advantage of the back-gate configuration by driving the input transistors’ gates with a clocked positive feedback loop made of two AND gates. This allows for the removal of the clocked tail generator, which decreases the number of stacked transistors and improves performance at low VDD. Furthermore, the clocked feedback loop causes the comparator to behave as a full CMOS latch during the regeneration phase, which means no static power consumption occurs after the outputs have settled. Thanks to body driving, the proposed comparator also achieves rail-to-rail input common mode range (ICMR), which is a critical feature for circuits that operate at low and ultra-low voltage headrooms. The comparator was designed and optimized in a 130-nm technology from STMicroelectronics at VDD=0.3 V and is able to operate at up to 2 MHz with an input differential voltage of 1 mV. The simulations show that the comparator remains fully operational even when the supply voltage is scaled down to 0.15 V, in which case the circuit exhibits a maximum operating frequency of 80 kHz at Vid=1 mV.

JLPEA, Vol. 13, Pages 33: In-Pipeline Processor Protection against Soft Errors

Ján Mach — 2023-05-10

JLPEA, Vol. 13, Pages 33: In-Pipeline Processor Protection against Soft Errors

Journal of Low Power Electronics and Applications doi: 10.3390/jlpea13020033

Authors: Ján Mach Lukáš Kohútka Pavel Čičák

The shrinking of technology nodes allows higher performance, but susceptibility to soft errors increases. The protection has been implemented mainly by lockstep or hardened process techniques, which results in a lower frequency, a larger area, and higher power consumption. We propose a protection technique that only slightly affects the maximal frequency. The area and power consumption increase are comparable with dual lockstep architectures. A reaction to faults and the ability to recover from them is similar to triple modular redundancy architectures. The novelty lies in applying redundancy into the processor’s pipeline and its separation into two sections. The protection provides fast detection of faults, simple recovery by a flush of the pipeline, and allows a large prediction unit to be unprotected. A proactive component automatically scrubs a register file to prevent fault accumulation. The whole protection scheme can be fully implemented at the register transfer level. We present the protection scheme implemented inside the RISC-V core with the RV32IMC instruction set. Simulations confirm that the protection can handle the injected faults. Synthesis shows that the protection lowers the maximum frequency by only about 3.9%. The area increased by 108% and power consumption by 119%.

JLPEA, Vol. 13, Pages 32: A Time-Mode PWM 1st Order Low-Pass Filter

Konstantinos P. Pagkalos — 2023-05-06

JLPEA, Vol. 13, Pages 32: A Time-Mode PWM 1st Order Low-Pass Filter

Journal of Low Power Electronics and Applications doi: 10.3390/jlpea13020032

Authors: Konstantinos P. Pagkalos Orfeas Panetas-Felouris Spyridon Vlassis

In this work, a first-order low-pass filter is proposed as suitable for time-mode PWM signal processing. In time-mode PWM signal processing, the pulse width of a rectangular pulse is the processing variable. The filter is constructed using basic time-mode building blocks such as time registers and time adders and so it is characterized by low complexity which can lead to the modular and versatile design of higher-order filters. All the building blocks of the filter were designed and verified in a TSMC 65 nm technology process. The sampling frequency was 5 MHz, the gain of the filter at low frequencies was at −0.016 dB, the cut-off frequency was 1.2323 MHz, and the power consumption was around 59.1 μW.

JLPEA, Vol. 13, Pages 31: Batteryless Sensor Devices for Underground Infrastructure—A Long-Term Experiment on Urban Water Pipes

Manuel Boebel — 2023-04-29

JLPEA, Vol. 13, Pages 31: Batteryless Sensor Devices for Underground Infrastructure—A Long-Term Experiment on Urban Water Pipes

Journal of Low Power Electronics and Applications doi: 10.3390/jlpea13020031

Authors: Manuel Boebel Fabian Frei Frank Blumensaat Christian Ebi Marcel Louis Meli Andreas Rüst

Drinking water is becoming increasingly scarce as the world’s population grows and climate change continues. However, there is great potential to improve drinking water pipelines, as 30% of fresh water is lost between the supplier and consumer. While systematic process monitoring could play a crucial role in the early detection and repair of leaks, current practice requires manual inspection, which is both time-consuming and costly. This project envisages maintenance-free measurements at numerous locations within the underground infrastructure, a goal that is to be achieved through the use of a harvesting device mounted on the water pipe. This device extracts energy from the temperature difference between the water pipe and the soil using a TEG (thermoelectric generator), takes sensor measurements, processes the data and transmits it wirelessly via LoRaWAN. We built 16 harvesting devices, installed them in four locations and continuously evaluated their performance throughout the project. In this paper, we focus on two devices of a particular type. The data for a full year show that enough energy was available on 94% of the days, on average, to take measurements and transmit data. This study demonstrates that it is possible to power highly constrained sensing devices with energy harvesting in underground environments.

JLPEA, Vol. 13, Pages 30: Energy-Efficient Audio Processing at the Edge for Biologging Applications

Jonathan Miquel — 2023-04-27

Journal of Low Power Electronics and Applications doi: 10.3390/jlpea13020030

Authors: Jonathan Miquel Laurent Latorre Simon Chamaillé-Jammes

Biologging refers to the use of animal-borne recording devices to study wildlife behavior. In the case of audio recording, such devices generate large amounts of data over several months, and thus require some level of processing automation for the raw data collected. Academics have widely adopted offline deep-learning-classification algorithms to extract meaningful information from large datasets, mainly using time-frequency signal representations such as spectrograms. Because of the high deployment costs of animal-borne devices, the autonomy/weight ratio remains by far the fundamental concern. Basically, power consumption is addressed using onboard mass storage (no wireless transmission), yet the energy cost associated with data storage activity is far from negligible. In this paper, we evaluate various strategies to reduce the amount of stored data, making the fair assumption that audio will be categorized using a deep-learning classifier at some point of the process. This assumption opens up several scenarios, from straightforward raw audio storage paired with further offline classification on one side, to a fully embedded AI engine on the other side, with embedded audio compression or feature extraction in between. This paper investigates three approaches focusing on data-dimension reduction: (i) traditional inline audio compression, namely ADPCM and MP3, (ii) full deep-learning classification at the edge, and (iii) embedded pre-processing that only computes and stores spectrograms for later offline classification. We characterized each approach in terms of total (sensor + CPU + mass-storage) edge power consumption (i.e., recorder autonomy) and classification accuracy. Our results demonstrate that ADPCM encoding brings 17.6% energy savings compared to the baseline system (i.e., uncompressed raw audio samples). Using such compressed data, a state-of-the-art spectrogram-based classification model still achieves 91.25% accuracy on open speech datasets. Performing inline data-preparation can significantly reduce the amount of stored data allowing for a 19.8% energy saving compared to the baseline system, while still achieving 89% accuracy during classification. These results show that while massive data reduction can be achieved through the use of inline computation of spectrograms, it translates to little benefit on device autonomy when compared to ADPCM encoding, with the added downside of losing original audio information.

JLPEA, Vol. 13, Pages 29: Battery Parameter Analysis through Electrochemical Impedance Spectroscopy at Different State of Charge Levels

Yuchao Wu — 2023-04-26

Journal of Low Power Electronics and Applications doi: 10.3390/jlpea13020029

Authors: Yuchao Wu Sneha Sundaresan Balakumar Balasingam

This paper presents a systematic approach to extract electrical equivalent circuit model (ECM) parameters of the Li-ion battery (LIB) based on electrochemical impedance spectroscopy (EIS). Particularly, the proposed approach is suitable for practical applications where the measurement noise can be significant, resulting in a low signal-to-noise ratio. Given the EIS measurements, the proposed approach can be used to obtain the ECM parameters of a battery. Then, a time domain approach is employed to validate the accuracy of estimated ECM parameters. In order to investigate whether the ECM parameters vary as the battery’s state of charge (SOC) changes, the EIS experiment was repeated at nine different SOCs. The experimental results show that the proposed approach is consistent in estimating the ECM parameters. It is found that the battery parameters, such as internal resistance, capacitance and inductance, remain the same for practical SOC ranges starting from 20% until 90%. The ECM parameters saw a significant change at low SOC levels. Furthermore, the experimental data show that the resistive components estimated in the frequency domain are very close to the internal resistance estimated in the time domain. The proposed approach was applied to eight different battery cells from two different manufacturers and produced consistent results.

JLPEA, Vol. 13, Pages 28: Class AB Voltage Follower and Low-Voltage Current Mirror with Very High Figures of Merit Based on the Flipped Voltage Follower

Jaime Ramírez-Angulo — 2023-04-24

Journal of Low Power Electronics and Applications doi: 10.3390/jlpea13020028

Authors: Jaime Ramírez-Angulo Anindita Paul Manaswini Gangineni Jose Maria Hinojo-Montero Jesús Huerta-Chua

The application of the flipped voltage follower to implement two high-performance circuits is presented: (1) The first is a class AB cascode flipped voltage follower that shows an improved slew rate and an improved bandwidth by very large factors and that has a higher output range than the conventional flipped voltage follower. It has a small signal figure of merit FOMSS = 46 MHz pF/µW and a current efficiency figure of merit FOMCE = 118. This is achieved by just introducing an additional output current sourcing PMOS transistor (P-channel Metal Oxide Semiconductor Field Effect Transistor) that provides dynamic output current enhancement and increases the quiescent power dissipation by less than 10%. (2) The other is a high-performance low-voltage current mirror with a nominal gain accuracy better than 0.01%, 0.212 Ω input resistance, 112 GΩ output resistance, 1 V supply voltage requirements, 0.15 V input, and 0.2 V output compliance voltages. These characteristics are achieved by utilizing two auxiliary amplifiers and a level shifter that increase the power dissipation just moderately. Post-layout simulations verify the performance of the circuits in a commercial 180 nm CMOS (Complementary Metal Oxide Semiconductor) technology.

JLPEA, Vol. 13, Pages 27: Buck-Boost Charge Pump Based DC-DC Converter

Evi Keramida — 2023-04-21

Journal of Low Power Electronics and Applications doi: 10.3390/jlpea13020027

Authors: Evi Keramida George Souliotis Spyridon Vlassis Fotis Plessas

This paper presents a novel inductorless dual-mode buck-boost charge pump (CP) based DC-DC converter. The proposed architecture allows the same circuit to accomplish two modes of operation, buck and boost, for lowering or raising the output voltage, respectively, relative to the input. To achieve each mode, only a switching of the input–output connections is needed without any other modification in the design of the DC-DC converter. The dual-mode configuration aims to merge two different functions into one circuit, minimizing the design time and the area the DC-DC converter occupies on the die. The proposed buck-boost CP has been designed using TSMC 65 nm complementary metal–oxide–semiconductor (CMOS) technology. The functional input voltage range of the CP in boost mode is 1.2 V to 1.8 V and the typical output voltage is 1.8 V. For the buck mode, the input voltage range is 3.2 V to 3.6 V and the output is 1.5 V. For both modes, the output can be easily modified to new values by changing the comparator configuration. Efficiency results are also provided for the two modes.

JLPEA, Vol. 13, Pages 26: Innovative Characterization and Comparative Analysis of Water Level Sensors for Enhanced Early Detection and Warning of Floods

Rula Tawalbeh — 2023-04-11

Journal of Low Power Electronics and Applications doi: 10.3390/jlpea13020026

Authors: Rula Tawalbeh Feras Alasali Zahra Ghanem Mohammad Alghazzawi Ahmad Abu-Raideh William Holderbaum

Given projections that flooding will increase in future years due to factors such as climate change and urbanization, the need for dependable and accurate water sensor systems is greater than ever. In this study, the performance of four different water level sensors, including ultrasonic, infrared (IR), and pressure sensors, is analyzed based on innovative characterization and comparative analysis, to determine whether or not these sensors have the ability to detect rising water levels and flash floods at an earlier stage under different conditions. During our exhaustive tests, we subjected the device to a variety of conditions, including clean and contaminated water, light and darkness, and an analogue connection to a display. When it came to monitoring water levels, the ultrasonic sensors stood out because of their remarkable precision and consistency. To address this issue, this study provides a novel and comparative examination of four water level sensors to determine which is the most effective and cost-effective in detecting floods and water level fluctuations. The IR sensor delivered accurate findings; however, it demonstrated some degree of variability throughout the course of the experiment. In addition, the results of our research show that the pressure sensor is a legitimate alternative to ultrasonic sensors. This presents a possibility that is more advantageous financially when it comes to the development of effective water level monitoring systems. The findings of this study are extremely helpful in improving the dependability and accuracy of flood detection systems and, eventually, in lessening the devastation caused by natural catastrophes.

JLPEA, Vol. 13, Pages 25: First Review of Conductive Electrets for Low-Power Electronics

D. D. L. Chung — 2023-04-06

Journal of Low Power Electronics and Applications doi: 10.3390/jlpea13020025

Authors: D. D. L. Chung

This is the first review of conductive electrets (unpoled carbons and metals), which provide a new avenue for low-power electronics. The electret provides low DC voltage (μV) while allowing low DC current (μA) to pass through. Ohm’s Law is obeyed. The voltage scales with the inter-electrode distance. Series connection of multiple electret components provides a series voltage that equals the sum of the voltages of the components if there is no bending at the connection between the components. Otherwise, the series voltage is below the sum. Bending within the component also diminishes the voltage because of the polarization continuity decrease. The electret originates from the interaction of a tiny fraction of the carriers with the atoms. This interaction results in the charge in the electret. Dividing the electret charge by the electret voltage V’ provides the electret-based capacitance C’, which is higher than the permittivity-based capacitance (conventional) by a large number of orders of magnitude. The C’ governs the electret energy (1/2 C’V’²) and electret discharge time constant (RC’, where R = resistance), as shown for metals. The discharge time is promoted by a larger inter-electrode distance. The electret discharges upon short-circuiting and charges back upon subsequent open-circuiting. The discharge or charge of the electret amounts to the discharge or charge of C’.
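
The relations quoted in this abstract (C’ = charge / V’, energy = 1/2 C’V’², time constant = RC’) can be sketched numerically. The snippet below is only an illustration: the function names and the example values are hypothetical and are not taken from the review.

```python
def electret_capacitance(charge_c: float, voltage_v: float) -> float:
    """Electret-based capacitance C' = Q / V' (farads)."""
    return charge_c / voltage_v


def electret_energy(c_prime_f: float, voltage_v: float) -> float:
    """Electret energy E = 1/2 * C' * V'^2 (joules)."""
    return 0.5 * c_prime_f * voltage_v ** 2


def discharge_time_constant(resistance_ohm: float, c_prime_f: float) -> float:
    """Discharge time constant tau = R * C' (seconds)."""
    return resistance_ohm * c_prime_f


if __name__ == "__main__":
    # Hypothetical values, chosen only to exercise the formulas.
    charge, voltage, resistance = 2e-6, 1e-6, 0.5  # coulombs, volts, ohms
    c_prime = electret_capacitance(charge, voltage)
    print("C' [F]:", c_prime)
    print("E  [J]:", electret_energy(c_prime, voltage))
    print("tau [s]:", discharge_time_constant(resistance, c_prime))
```

Because the voltage is in the μV range, a modest charge already implies a very large C’, which is consistent with the abstract's remark that C’ exceeds the permittivity-based capacitance by many orders of magnitude.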

JLPEA, Vol. 13, Pages 24: A 0.6 V Bulk-Driven Class-AB Two-Stage OTA with Non-Tailed Differential Pair

Andrea Ballo — 2023-03-28

Journal of Low Power Electronics and Applications doi: 10.3390/jlpea13020024

Authors: Andrea Ballo Alfio Dario Grasso Salvatore Pennisi

This work presents a two-stage operational transconductance amplifier suitable for sub-1 V operation. This characteristic is achieved thanks to the adoption of a bulk-driven non-tailed differential pair. Local positive feedback is exploited to boost the equivalent transconductance of the first stage and the quasi-floating gate approach enables the class AB operation of the second stage. Implemented in a standard 180 nm CMOS technology and supplied at 0.6 V, the amplifier exhibits a 350 kHz gain bandwidth product and a phase margin of 69° while driving a 150 pF load. Compared to other solutions in the literature, the proposed one exhibits a considerable performance improvement, especially for large signal operation.

JLPEA, Vol. 13, Pages 23: A Ka-Band SiGe BiCMOS Quasi-F−1 Power Amplifier Using a Parasitic Capacitance Cancellation Technique †

Vasileios Manouras — 2023-03-24

Journal of Low Power Electronics and Applications doi: 10.3390/jlpea13020023

Authors: Vasileios Manouras Ioannis Papananos

This paper deals with the design, analysis, and implementation of a Ka-band, single-stage, quasi-inverse class F power amplifier (PA). A detailed methodology for the evaluation of the active device’s output capacitance is described, enabling the design of a second-harmonically tuned load and resulting in enhanced performance. A simplified model for the extraction of time-domain intrinsic voltage and current waveforms at the output of the main active core is introduced, enforcing the implementation process of the proposed quasi-inverse class F technique. The PA is fabricated in a 130 nm SiGe BiCMOS technology with fT/fmax = 250/370 GHz and it is suitable for 5G applications. It achieves 33% peak power-added efficiency (PAE), 18.8 dBm saturation output power Psat, and 14.7 dB maximum large-signal power gain G at the operating frequency of 38 GHz. The PA’s response is also tested under a modulated-signal excitation and simulation results are reported in this paper. The chip size is 0.605 × 0.712 mm² including all pads.

JLPEA, Vol. 13, Pages 22: Extreme Path Delay Estimation of Critical Paths in Within-Die Process Fluctuations Using Multi-Parameter Distributions

Miikka Runolinna — 2023-03-20

Journal of Low Power Electronics and Applications doi: 10.3390/jlpea13010022

Authors: Miikka Runolinna Matthew Turnquist Jukka Teittinen Pauliina Ilmonen Lauri Koskinen

Two multi-parameter distributions, namely the Pearson type IV and metalog distributions, are discussed and suggested as alternatives to the normal distribution for modelling path delay data that determines the maximum clock frequency (FMAX) of a microprocessor or other digital circuit. These distributions outperform the normal distribution in goodness-of-fit statistics for simulated path delay data derived from a fabricated microcontroller, with the six-term metalog distribution offering the best fit. Furthermore, 99.7% confidence intervals are calculated for some extreme quantiles on each dataset using the previous distributions. Considering the six-term metalog distribution estimates as the golden standard, the relative errors in single paths vary between 4 and 14% for the normal distribution. Finally, the within-die (WID) variation maximum critical path delay distribution for multiple critical paths is derived under the assumption of independence between the paths. Its density function is then used to compute different maximum delays for varying numbers of critical paths, assuming each path has one of the previous distributions with the metalog estimates as the golden standard. For 100 paths, the relative errors are at most 14% for the normal distribution. With 1000 and 10,000 paths, the corresponding errors extend up to 16 and 19%, respectively.
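
The independence assumption mentioned in this abstract has a simple consequence: the CDF of the maximum delay over several independent paths is the product of the per-path CDFs. The sketch below illustrates this for identically distributed paths. It uses the normal distribution only because it ships with the Python standard library (the paper argues that the Pearson type IV and metalog distributions fit better), and the mean and standard deviation are hypothetical.

```python
from statistics import NormalDist


def max_delay_cdf(path: NormalDist, n_paths: int, t: float) -> float:
    """P(maximum of n independent, identically distributed path delays <= t)."""
    return path.cdf(t) ** n_paths


def max_delay_quantile(path: NormalDist, n_paths: int, q: float) -> float:
    """Delay t such that the maximum of n i.i.d. path delays is <= t with probability q."""
    # F_max(t) = F(t)^n = q  =>  F(t) = q^(1/n)
    return path.inv_cdf(q ** (1.0 / n_paths))


if __name__ == "__main__":
    # Hypothetical single-path delay: mean 1.00 ns, standard deviation 0.05 ns.
    single_path = NormalDist(mu=1.00, sigma=0.05)
    for n in (1, 100, 1000, 10000):
        print(n, round(max_delay_quantile(single_path, n, 0.9987), 4))
```

The required quantile of a single path moves further into the tail as the number of paths grows, which is why the tail fit of the per-path distribution matters more for larger path counts.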

JLPEA, Vol. 13, Pages 21: DycSe: A Low-Power, Dynamic Reconfiguration Column Streaming-Based Convolution Engine for Resource-Aware Edge AI Accelerators

Weison Lin — 2023-03-16

Journal of Low Power Electronics and Applications doi: 10.3390/jlpea13010021

Authors: Weison Lin Yajun Zhu Tughrul Arslan

Edge AI accelerators are utilized to accelerate the computation in edge AI devices such as image recognition sensors on robotics, door lockers, drones, and remote sensing satellites. Instead of using a general-purpose processor (GPP) or graphics processing unit (GPU), an edge AI accelerator brings a customized design to meet the requirements of the edge environment. The requirements include real-time processing, low-power consumption, and resource-awareness, including resources on field programmable gate array (FPGA) or limited application-specific integrated circuit (ASIC) area. The system’s reliability (e.g., permanent fault tolerance) is essential if the devices target radiation fields such as space and nuclear power stations. This paper proposes a dynamic reconfigurable column streaming-based convolution engine (DycSe) with programmable adder modules for low-power and resource-aware edge AI accelerators to meet the requirements. The proposed DycSe design does not target the FPGA platform only. Instead, it is an intellectual property (IP) core design. The FPGA platform used in this paper is for prototyping the design evaluation. This paper uses the Vivado synthesis tool to evaluate the power consumption and resource usage of DycSe. Since the synthesis tool is limited to giving the final complete system result in the designing stage, we compare DycSe to a commercial edge AI accelerator for cross-reference with other state-of-the-art works. The commercial architecture shares the competitive performance within the low-power ultra-small (LPUS) edge AI scopes. The results show that DycSe consumes 3.56% less power with a slight (1%) resource overhead, while offering reconfigurable flexibility.

JLPEA, Vol. 13, Pages 20: Efficient Dual Output Regulating Rectifier and Adiabatic Charge Pump for Biomedical Applications Employing Wireless Power Transfer

Noora Almarri — 2023-03-04

Journal of Low Power Electronics and Applications doi: 10.3390/jlpea13010020

Authors: Noora Almarri Peter Langlois Dai Jiang Andreas Demosthenous

A power management unit (PMU) is an essential block for diversified multi-functional low-power Internet of Things (IoT) and biomedical electronics. This paper includes a theoretical analysis of a high current, single-stage ac-dc, reconfigurable, dual output, regulating rectifier consisting of pulse width modulation (PWM) and pulse frequency modulation (PFM). The regulating rectifier provides two independently regulated supply voltages of 1.8 V and 3.3 V from an input ac voltage. The PFM control feedback consists of feedback-driven regulation to adjust the driving frequency of the power transistors through adaptive buffers in the active rectifier. The PWM/PFM mode control provides a feedback loop to adjust the conduction duration accurately and minimize power losses. The design also includes an adiabatic charge pump (CP) to provide a higher voltage level. The adiabatic CP consists of latch-up and power-saving topologies to enhance its power efficiency. Simulation results show that the dual regulating rectifier has 94.3% voltage conversion efficiency with an ac input magnitude of 3.5 Vp. The power conversion efficiency of the regulated 3.3 V output voltage is 82.3%. The adiabatic CP has an overall voltage conversion efficiency (VCE) of 92.9% with a total on-chip capacitance of 60 pF. The circuit was designed using 180 nm CMOS technology.

JLPEA, Vol. 13, Pages 19: Radio-Frequency Energy Harvesting Using Rapid 3D Plastronics Protoyping Approach: A Case Study

Xuan Viet Linh Nguyen — 2023-02-17

Journal of Low Power Electronics and Applications doi: 10.3390/jlpea13010019

Authors: Xuan Viet Linh Nguyen Tony Gerges Pascal Bevilacqua Jean-Marc Duchamp Philippe Benech Jacques Verdier Philippe Lombard Pangsui Usifu Linge Fabien Mieyeville Michel Cabrera Bruno Allard

Harvesting of ambient radio-frequency energy is largely covered in the literature. The RF energy harvester is considered most of the time as a standalone board. There is an interest to add the RF harvesting function on an already-designed object. Polymer objects are considered here, manufactured through an additive process and the paper focuses on the rapid prototyping of the harvester using a plastronic approach. An array of four antennas is considered for circular polarization with high self-isolation. The RF circuit is obtained using an electroless copper metallization of the surface of a 3D substrate fabricated using stereolithography printing. The RF properties of the polymer resin are not optimal; thus, the interest of this work is to investigate the potential capabilities of such an implementation, particularly in terms of freedom of 3D design and ease of fabrication. The electromagnetic properties of the substrate are characterized over a band of 0.5–2.5 GHz applying the two-transmission-line method. A circular polarization antenna is tested as a rapid prototyping vehicle and yields a gain of 1.26 dB. A lab-scale prototype of the rectifier and power management unit is evaluated using discrete components. The cold start-up circuit accepts a minimum voltage of 180 mV. The main DC/DC converter operates under 1.4 V but is able to compensate losses for an input DC voltage as low as 100 mV (10 μW). The rectifier alone is capable of 3.5% efficiency at −30 dBm input RF power. The global system of circularly polarized antenna, rectifier, and voltage conversion features a global experimental efficiency of 14.7% at an input power of −13.5 dBm. The possible application of such results is discussed.
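
To put the quoted efficiencies into absolute terms, the dBm input levels can be converted to watts. The sketch below assumes that efficiency means DC output power divided by RF input power, which the abstract does not state explicitly; the function names are hypothetical.

```python
def dbm_to_watts(power_dbm: float) -> float:
    """Convert a power level in dBm to watts: P[W] = 10^(P[dBm] / 10) / 1000."""
    return 10 ** (power_dbm / 10.0) / 1000.0


def harvested_power_watts(input_dbm: float, efficiency: float) -> float:
    """Output power for a given RF input level and efficiency (0..1)."""
    return dbm_to_watts(input_dbm) * efficiency


if __name__ == "__main__":
    # Figures quoted in the abstract: 3.5% at -30 dBm, 14.7% at -13.5 dBm.
    print(harvested_power_watts(-30.0, 0.035))   # rectifier alone
    print(harvested_power_watts(-13.5, 0.147))   # antenna + rectifier + conversion
```

Under that assumption, −30 dBm is 1 μW, so the rectifier alone delivers about 35 nW, while −13.5 dBm is roughly 44.7 μW, so the full system delivers about 6.6 μW.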

JLPEA, Vol. 13, Pages 18: Self-Parameterized Chaotic Map for Low-Cost Robust Chaos

Partha Sarathi Paul — 2023-02-13

Journal of Low Power Electronics and Applications doi: 10.3390/jlpea13010018

Authors: Partha Sarathi Paul Anurag Dhungel Maisha Sadia Md Razuan Hossain Md Sakib Hasan

This paper presents a general method, called “self-parameterization”, for designing one-dimensional (1-D) chaotic maps that provide wider chaotic regions compared to existing 1-D maps. A wide chaotic region is a desirable property, as it helps to provide robust performance by enlarging the design space in many hardware-security applications, including reconfigurable logic and encryption. The proposed self-parameterization scheme uses only one existing chaotic map, referred to as the seed map, and a simple transformation block. The effective control parameter of the seed map is treated as an intermediate variable derived from the input and control parameter of the self-parameterized map, under some constraints, to achieve the desired functionality. The widening of the chaotic region after adding self-parameterization is first demonstrated on three ideal map functions: Logistic; Tent; and Sine. A digitized version of the scheme was developed and realized in a field-programmable gate array (FPGA) implementation. An analog version of the proposed scheme was developed with very low transistor-count analog topologies for hardware-constrained integrated circuit (IC) implementation. The chaotic performance of both digital and analog implementations was evaluated with bifurcation plots and four established chaotic entropy metrics: the Lyapunov Exponent; the Correlation Coefficient; the Correlation Dimension; and Approximate Entropy. An application of the proposed scheme was demonstrated in a random number generator design, and the statistical randomness of the generated sequence was verified with the NIST test.
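
As background for the metrics listed in this abstract, the sketch below estimates the Lyapunov exponent of the plain Logistic seed map. It does not implement the self-parameterization transformation, whose details the abstract does not give; the function names, initial condition, and iteration counts are hypothetical choices.

```python
import math


def logistic_map(x: float, r: float) -> float:
    """Logistic seed map: x_{n+1} = r * x_n * (1 - x_n)."""
    return r * x * (1.0 - x)


def lyapunov_exponent(r: float, x0: float = 0.2,
                      n_transient: int = 1000, n_iter: int = 20000) -> float:
    """Estimate the Lyapunov exponent as the orbit average of ln|f'(x)|, f'(x) = r(1 - 2x)."""
    x = x0
    for _ in range(n_transient):           # discard the initial transient
        x = logistic_map(x, r)
    total, count = 0.0, 0
    for _ in range(n_iter):
        slope = abs(r * (1.0 - 2.0 * x))
        if slope > 0.0:                    # skip x = 0.5, where ln|f'| diverges
            total += math.log(slope)
            count += 1
        x = logistic_map(x, r)
    return total / count


if __name__ == "__main__":
    for r in (2.5, 3.2, 3.9, 4.0):
        print(r, round(lyapunov_exponent(r), 3))
```

A positive exponent indicates chaos and a negative one indicates a stable fixed point or cycle. For the unmodified Logistic map, positive values occur only in parts of the control-parameter range, which is the limitation a wider chaotic region is meant to address.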

JLPEA, Vol. 13, Pages 17: Decoding Algorithms and HW Strategies to Mitigate Uncertainties in a PCM-Based Analog Encoder for Compressed Sensing

Carmine Paolino — 2023-02-13

Journal of Low Power Electronics and Applications doi: 10.3390/jlpea13010017

Authors: Carmine Paolino Alessio Antolini Francesco Zavalloni Andrea Lico Eleonora Franchi Scarselli Mauro Mangia Alex Marchioni Fabio Pareschi Gianluca Setti Riccardo Rovatti Mattia Luigi Torres Marcella Carissimi Marco Pasotti

Analog In-Memory computing (AIMC) is a novel paradigm looking for solutions to prevent the unnecessary transfer of data by distributing computation within memory elements. One such operation is matrix-vector multiplication (MVM), a workhorse of many fields ranging from linear regression to Deep Learning. The same concept can be readily applied to the encoding stage in Compressed Sensing (CS) systems, where an MVM operation maps input signals into compressed measurements. With a focus on an encoder built on top of a Phase-Change Memory (PCM) AIMC platform, the effects of device non-idealities, namely programming spread and drift over time, are observed in terms of the reconstruction quality obtained for synthetic signals, sparse in the Discrete Cosine Transform (DCT) domain. PCM devices are simulated using statistical models summarizing the properties experimentally observed in an AIMC prototype, designed in a 90 nm STMicroelectronics technology. Different families of decoders are tested, and tradeoffs in terms of encoding energy are analyzed. Furthermore, the benefits of a hardware drift compensation strategy are also observed, highlighting its necessity to prevent the need for a complete reprogramming of the entire analog array. The results show >30 dB average reconstruction quality for mid-range conductances and a suitably selected decoder right after programming. Additionally, the hardware drift compensation strategy enables robust performance even when different drift conditions are tested.

JLPEA, Vol. 13, Pages 16: Exploring Topological Semi-Metals for Interconnects

Satwik Kundu — 2023-02-09

Journal of Low Power Electronics and Applications doi: 10.3390/jlpea13010016

Authors: Satwik Kundu Rupshali Roy M. Saifur Rahman Suryansh Upadhyay Rasit Onur Topaloglu Suzanne E. Mohney Shengxi Huang Swaroop Ghosh

The size of transistors has drastically reduced over the years. Interconnects have likewise been scaled down. Today, conventional copper (Cu)-based interconnects face a significant impediment to further scaling since their electrical conductivity decreases at smaller dimensions, which also worsens the signal delay and energy consumption. As a result, alternative scalable materials such as semi-metals and 2D materials are being investigated as potential Cu replacements. In this paper, we experimentally showed that CoPt can provide better resistivity than Cu at thin dimensions and proposed hybrid poly-Si with a CoPt coating for local routing in standard cells for compactness. We evaluated the performance gain for DRAM/eDRAM, and area vs. performance trade-off for D-Flip-Flop (DFF) using hybrid poly-Si with a thin film of CoPt. We gained up to a 3-fold reduction in delay and a 15.6% reduction in cell area with the proposed hybrid interconnect. We also studied the system-level interconnect design using NbAs, a topological semi-metal with high electron mobility at the nanoscale, and demonstrated its advantages over Cu in terms of resistivity, propagation delay, and slew rate. Our simulations revealed that NbAs could reduce the propagation delay by up to 35.88%. We further evaluated the potential system-level performance gain for NbAs-based interconnects in cache memories and observed an instructions per cycle (IPC) improvement of up to 23.8%.

JLPEA, Vol. 13, Pages 15: A 1.1 V 25 ppm/°C Relaxation Oscillator with 0.045%/V Line Sensitivity for Low Power Applications

Yizhuo Liao — 2023-02-07

Journal of Low Power Electronics and Applications doi: 10.3390/jlpea13010015

Authors: Yizhuo Liao Pak Kwong Chan

A fully-integrated CMOS relaxation oscillator, realized in 40 nm CMOS technology, is presented. The oscillator includes a stable two-transistor based voltage reference without an operational amplifier, a simple current reference employing a temperature-compensated composite resistor, and approximated complementary to absolute temperature (CTAT) delay-based comparators that compensate for the approximated proportional to absolute temperature (PTAT) delay arising from the leakage currents in the switches. This relaxation oscillator is designed to output a square wave with a frequency of 64 kHz and a duty cycle of 50% at a 1.1 V supply. The simulation results demonstrated that the circuit can generate a square wave, with stable frequency, against temperature and supply variation, while exhibiting low current consumption. For the temperature range from −20 °C to 80 °C at a 1.1 V supply, the oscillator’s output frequency achieved a temperature coefficient (T.C.) of 12.4 ppm/°C in a typical corner in one sample simulation. For a 200-sample Monte Carlo simulation, the obtained T.C. is 25 ppm/°C. Under typical corners and room temperature, the simulated line sensitivity is 0.045%/V with the supply from 1.1 V to 1.6 V, and the dynamic current consumption is 552 nA. A better figure-of-merit (FoM), which equals 0.129%, is displayed when compared to the representative prior-art works.
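
The temperature coefficient and line sensitivity figures can be related to absolute frequency deviations. The sketch below assumes the common "box method" definition of T.C. and a simple relative-change-per-volt definition of line sensitivity; the abstract does not say which definitions the authors used, and the function names are hypothetical.

```python
def temperature_coefficient_ppm(f_max_hz: float, f_min_hz: float, f_nom_hz: float,
                                t_max_c: float, t_min_c: float) -> float:
    """Box-method T.C. in ppm/°C: (f_max - f_min) / (f_nom * (T_max - T_min)) * 1e6."""
    return (f_max_hz - f_min_hz) / (f_nom_hz * (t_max_c - t_min_c)) * 1e6


def line_sensitivity_pct_per_v(f_high_hz: float, f_low_hz: float, f_nom_hz: float,
                               v_high: float, v_low: float) -> float:
    """Line sensitivity in %/V: relative frequency change per volt of supply change."""
    return (f_high_hz - f_low_hz) / f_nom_hz / (v_high - v_low) * 100.0


if __name__ == "__main__":
    f_nom = 64_000.0  # Hz, nominal output frequency from the abstract
    # A 79.36 Hz spread over -20 °C to 80 °C corresponds to 12.4 ppm/°C.
    print(temperature_coefficient_ppm(f_nom + 79.36, f_nom, f_nom, 80.0, -20.0))
    # A 14.4 Hz shift from 1.1 V to 1.6 V corresponds to 0.045 %/V.
    print(line_sensitivity_pct_per_v(f_nom + 14.4, f_nom, f_nom, 1.6, 1.1))
```

Under these assumed definitions, the typical-corner T.C. corresponds to a total frequency spread of roughly 79 Hz across the 100 °C range, and the line sensitivity to a shift of about 14 Hz across the 0.5 V supply range.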

JLPEA, Vol. 13, Pages 13: Minimum Active Component Count Design of a PIλDμ Controller and Its Application in a Cardiac Pacemaker System

Julia Nako — 2023-02-02

Journal of Low Power Electronics and Applications doi: 10.3390/jlpea13010013

Authors: Julia Nako Costas Psychalinos Ahmed S. Elwakil

A generalized structure for implementing fractional-order controllers is introduced in this paper. This is achieved thanks to the consideration of the controller transfer function as a ratio of integer and non-integer impedances. The non-integer order impedance is implemented using RC networks, such as the Foster and Cauer networks. The main benefit over the corresponding conventional implementations is the reduced active and passive component count. To demonstrate the versatility of the proposed concept, a controller suitable for implementing a cardiac pacemaker control system is designed. The evaluation of the performance of the system is performed through circuit simulation results, using a second-generation voltage conveyor as the active element.

JLPEA, Vol. 13, Pages 14: Wideband Cascaded and Stacked Receiver Front-Ends Employing an Improved Clock-Strategy Technique

Arash Abbasi — 2023-02-02

Journal of Low Power Electronics and Applications doi: 10.3390/jlpea13010014

Authors: Arash Abbasi Frederic Nabki

A wideband cascaded receiver and a stacked receiver using an improved clock strategy are proposed to support the software-defined radio (SDR). The improved clock strategy reduces the number of mixer switches and the number of LO clock paths required to drive the mixer switches. This reduces the dynamic power consumption. The cascaded receiver includes an inverter-based low-noise transconductance amplifier (LNTA) using a feed-forward technique to enhance the noise performance; a passive mixer; and an inverter-based transimpedance amplifier (TIA). The stacked receiver architecture is used to reduce the power consumption by sharing the current between the LNTA and the TIA from a single supply. It utilizes a wideband LNTA with a capacitor cross-coupled (CCC) common-gate (CG) topology, a passive mixer to convert the RF current to an IF current, an active inductor (AI) and a 1/f noise-cancellation (NC) technique to improve the noise performance, and a TIA to convert the IF current to an IF voltage at the output. Both cascaded and stacked receivers are simulated in 22 nm CMOS technology. The cascaded receiver achieves a conversion-gain from 26 dB to 36 dB, a double-sideband noise-figure (NFDSB) from 1.4 dB to 3.9 dB, S11<−10 dB and an IIP3 from −7.5 dBm to −10.5 dBm, over the RF operating band from 0.4 GHz to 12 GHz. The stacked receiver achieves a conversion-gain from 34.5 dB to 36 dB, a NFDSB from 4.6 dB to 6.2 dB, S11<−10 dB, and an IIP3 from −21 dBm to −17.5 dBm, over the RF operating band from 2.2 GHz to 3.2 GHz. The cascaded receiver consumes 11 m from a 1 V supply voltage, while the stacked receiver consumes 2.4 m from a 1.2 V supply voltage.

JLPEA, Vol. 13, Pages 12: Energy Autonomous Wireless Sensing Node Working at 5 Lux from a 4 cm2 Solar Cell

Marcel Louis Meli — 2023-02-01

JLPEA, Vol. 13, Pages 12: Energy Autonomous Wireless Sensing Node Working at 5 Lux from a 4 cm2 Solar Cell

Journal of Low Power Electronics and Applications doi: 10.3390/jlpea13010012

Authors: Marcel Louis Meli Sebastien Favre Benjamin Maij Stefan Stajic Manuel Boebel Philip John Poole Martin Schellenberg Charalampos S. Kouzinopoulos

Harvesting energy for IoT nodes in places that are permanently poorly lit is important, as many such places exist in buildings and other locations. The need for energy-autonomous devices working in such environments has so far received little attention. This work reports the design and test results of an energy-autonomous sensor node powered solely by solar cells. The system can cold-start and run in low light conditions (in this case 20 lux and below, using white LEDs as light sources). Four solar cells of 1 cm2 each are used, yielding a total active surface of 4 cm2. The system includes a capacitive sensor that acts as a touch detector, a crystal-accurate real-time clock (RTC), and a Cortex-M3-compatible microcontroller integrating a Bluetooth Low Energy radio (BLE) and the necessary stack for communication. A capacitor of 100 μF is used as energy storage. A low-power comparator monitors the level of the energy storage and powers up the system. The combination of the RTC and touch sensor enables the MCU load to be powered up periodically or using an asynchronous user touch activity. First tests have shown that the system can perform the basic work of cold-starting, sensing, and transmitting frames at +0 dBm, at illuminances as low as 5 lux. Harvesting starts earlier, meaning that the potential for full function below 5 lux is present. The system has also been tested with other light sources. The comparator is a test chip developed for energy harvesting. Other elements are off-the-shelf components. The use of commercially available devices, the reduced number of parts, and the absence of complex storage elements enable a small node to be built in the future, for use in constantly or intermittently poorly lit places.

JLPEA, Vol. 13, Pages 11: Study of Nitrogen-Doped Carbon Nanotubes for Creation of Piezoelectric Nanogenerator

Marina V. Il’ina — 2023-01-22

JLPEA, Vol. 13, Pages 11: Study of Nitrogen-Doped Carbon Nanotubes for Creation of Piezoelectric Nanogenerator

Journal of Low Power Electronics and Applications doi: 10.3390/jlpea13010011

Authors: Marina V. Il’ina Olga I. Soboleva Soslan A. Khubezov Vladimir A. Smirnov Oleg I. Il’in

The creation of sustainable power sources for wearable electronics and self-powered systems is a promising direction of modern electronics. At the moment, a search for functional materials with high values of piezoelectric coefficient and elasticity, as well as non-toxicity, is underway to generate such power sources. In this paper, nitrogen-doped carbon nanotubes (N-CNTs) are considered as a functional material for a piezoelectric nanogenerator capable of converting nanoscale deformations into electrical energy. The effect of defectiveness and of geometric and mechanical parameters of N-CNTs on the current generated during their deformation is studied. It was established that the piezoelectric response of N-CNTs increased nonlinearly with an increase in the Young’s modulus and the aspect ratio of the length to diameter of the nanotube and, on the contrary, decreased with an increase in defectiveness not caused by the incorporation of nitrogen atoms. The advantages of using N-CNT to create energy-efficient piezoelectric nanogenerators are shown.

JLPEA, Vol. 13, Pages 10: A Power-Efficient Neuromorphic Digital Implementation of Neural–Glial Interactions

Angeliki Bicaku — 2023-01-18

JLPEA, Vol. 13, Pages 10: A Power-Efficient Neuromorphic Digital Implementation of Neural–Glial Interactions

Journal of Low Power Electronics and Applications doi: 10.3390/jlpea13010010

Authors: Angeliki Bicaku Maria Sapounaki Athanasios Kakarountas Sotiris K. Tasoulis

Throughout the last decades, neuromorphic circuits have incited the interest of scientists, as they are potentially a powerful tool for the treatment of neurological diseases. To this end, it is essential to consider the biological principles of the CNS and develop the appropriate area- and power-efficient circuits. Motivated by studies that outline the indispensable role of astrocytes in the dynamic regulation of synaptic transmission and their active contribution to neural information processing in the CNS, in this work we propose a digital implementation of neuron–astrocyte bidirectional interactions. In order to describe the neuronal dynamics and the astrocytes’ calcium dynamics, a modified version of the original Izhikevich neuron model was combined with a linear approximation of the Postnov functional neural–glial interaction model. For the implementation of the neural–glial computation core, only three pipeline stages and a 10.10 fixed point representation were utilized. Regarding the results obtained from the FPGA implementation and the comparisons to other works, the proposed neural–glial circuit reported significant savings in area requirements (from 22.53% up to 164.20%) along with considerable savings in total power consumption of 28.07% without sacrificing output computation accuracy. Finally, an RMSE analysis was conducted, confirming that this particular implementation produces more accurate results compared to previous studies.
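To make the neuron side of this abstract more concrete, here is a minimal sketch of the original, published Izhikevich model with its state rounded to a 2^-10 grid after every step. It is not the authors' modified model and it omits the astrocyte (Postnov) part entirely. Reading "10.10" as 10 integer bits plus 10 fractional bits is an assumption, as are the function names, the 1 ms Euler step, and the regular-spiking parameter set.

```python
FRAC_BITS = 10            # assumed: "10.10" means 10 integer + 10 fractional bits
SCALE = 1 << FRAC_BITS


def quantize(x: float) -> float:
    """Round x to the nearest multiple of 2**-10 (integer-range saturation is not modeled)."""
    return round(x * SCALE) / SCALE


def izhikevich_step(v, u, current, a=0.02, b=0.2, c=-65.0, d=8.0, dt=1.0):
    """One forward-Euler step of the original Izhikevich equations.

    v' = 0.04 v^2 + 5 v + 140 - u + I,  u' = a (b v - u),
    with reset v <- c, u <- u + d once v reaches 30 mV.
    """
    dv = 0.04 * v * v + 5.0 * v + 140.0 - u + current
    du = a * (b * v - u)
    v = quantize(v + dt * dv)
    u = quantize(u + dt * du)
    if v >= 30.0:
        return quantize(c), quantize(u + d), True
    return v, u, False


def count_spikes(current: float, steps: int = 1000) -> int:
    """Simulate `steps` time steps from rest and count the spikes."""
    v, u, spikes = -65.0, -13.0, 0
    for _ in range(steps):
        v, u, fired = izhikevich_step(v, u, current)
        spikes += int(fired)
    return spikes
```

Calling `count_spikes(10.0)` drives the neuron with a constant input current, while `count_spikes(0.0)` leaves it at rest; comparing a quantized run against a floating-point run is one simple way to gauge the accuracy cost of the chosen word length.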

JLPEA, Vol. 13, Pages 9: Acknowledgment to the Reviewers of Journal of Low Power Electronics and Applications in 2022

JLPEA Editorial Office — 2023-01-16

JLPEA, Vol. 13, Pages 9: Acknowledgment to the Reviewers of Journal of Low Power Electronics and Applications in 2022

Journal of Low Power Electronics and Applications doi: 10.3390/jlpea13010009

Authors: JLPEA Editorial Office

High-quality academic publishing is built on rigorous peer review [...]

JLPEA, Vol. 13, Pages 8: Numerical Optimization of a Nonlinear Nonideal Piezoelectric Energy Harvester Using Deep Learning

Andreas Hegendörfer — 2023-01-12

JLPEA, Vol. 13, Pages 8: Numerical Optimization of a Nonlinear Nonideal Piezoelectric Energy Harvester Using Deep Learning

Journal of Low Power Electronics and Applications doi: 10.3390/jlpea13010008

Authors: Andreas Hegendörfer Paul Steinmann Julia Mergheim

This contribution addresses the numerical optimization of the harvested energy of a mechanically and electrically nonlinear and nonideal piezoelectric energy harvester (PEH) under triangular shock-like excitation, taking into account a nonlinear stress constraint. In the optimization problem, a bimorph electromechanical structure equipped with the Greinacher circuit or the standard circuit is considered and different electrical and mechanical design variables are introduced. Using a very accurate coupled finite element-electronic circuit simulator method, deep neural network (DNN) training data are generated, allowing for a computationally efficient evaluation of the objective function. Subsequently, a genetic algorithm using the DNNs is applied to find the electrical and mechanical design variables that optimize the harvested energy. It is found that the maximum harvested energy is obtained at the maximum possible mechanical stresses and that the optimum storage capacitor for the Greinacher circuit is much smaller than that for the standard circuit, while the total harvested energy by both configurations is similar.

JLPEA, Vol. 13, Pages 7: Electromigration-Aware Architecture for Modern Microprocessors

Freddy Gabbay — 2023-01-11

JLPEA, Vol. 13, Pages 7: Electromigration-Aware Architecture for Modern Microprocessors

Journal of Low Power Electronics and Applications doi: 10.3390/jlpea13010007

Authors: Freddy Gabbay Avi Mendelson

Reliability is a fundamental requirement in microprocessors that guarantees correct execution over their lifetimes. The reliability-related design rules depend on the process technology and device operating conditions. To meet reliability requirements, advanced process nodes impose challenging design rules, which place a major burden on the VLSI implementation flow because they impose severe physical constraints. This paper focuses on electromigration (EM), one of the critical factors affecting semiconductor reliability. EM is the aging process of on-die wires in integrated circuits (ICs). Traditionally, EM issues have been handled at the physical design level, which enforces reliability rules using worst-case scenario analysis to detect and solve violations. In this paper, we offer solutions that exploit architectural characteristics to reduce EM impact. The use of architectural methods can simplify EM solutions, and such methods can be incorporated with standard physical-design-based solutions to enhance current methods. Our comprehensive physical simulation results show that, with minimal area, power, and performance overhead, the proposed solution can relax EM design efforts and significantly extend a microprocessor’s lifetime.

JLPEA, Vol. 13, Pages 6: FPGA-Based Decision Support System for ECG Analysis

Agostino Giorgio — 2023-01-07

JLPEA, Vol. 13, Pages 6: FPGA-Based Decision Support System for ECG Analysis

Journal of Low Power Electronics and Applications doi: 10.3390/jlpea13010006

Authors: Agostino Giorgio Cataldo Guaragnella Maria Rizzi

The high mortality rate associated with cardiac abnormalities highlights the need to detect heart disorders accurately at an early stage so as to avoid severe health consequences for patients. Health trackers have become popular in the form of wearable devices. They aim to perform cardiac monitoring outside of medical clinics during people’s daily lives. Our paper proposes a new diagnostic algorithm and its implementation in an FPGA-based design. The conceived system automatically detects the most common arrhythmias and is also able to evaluate QT-segment lengthening and pulmonary embolism risk often caused by myocarditis. Debugging and simulations were carried out first in the MATLAB environment and then in the Quartus IDE by Intel. The hardware implementation of the embedded system and the functional accuracy tests were performed on the DE1_SoC development board by Terasic, which is equipped with the Cyclone V 5CSEMA5F31C6 FPGA by Intel. Properly modified real ECG signals corrupted by a mixture of muscle noise, electrode movement artifacts, and baseline wander are used as a test bench. An accuracy of 99.20% is achieved for a noise voltage with a root mean square value of 0.02 mV. The implemented low-power circuit is suitable as a wearable decision support device.

JLPEA, Vol. 13, Pages 5: A Bottom-Up Methodology for the Fast Assessment of CNN Mappings on Energy-Efficient Accelerators

Guillaume Devic — 2023-01-05

JLPEA, Vol. 13, Pages 5: A Bottom-Up Methodology for the Fast Assessment of CNN Mappings on Energy-Efficient Accelerators

Journal of Low Power Electronics and Applications doi: 10.3390/jlpea13010005

Authors: Guillaume Devic Gilles Sassatelli Abdoulaye Gamatié

The execution of machine learning (ML) algorithms on resource-constrained embedded systems is very challenging in edge computing. To address this issue, ML accelerators are among the most efficient solutions. They are the result of aggressive architecture customization. Finding energy-efficient mappings of ML workloads on accelerators, however, is a very challenging task. In this paper, we propose a design methodology by combining different abstraction levels to quickly address the mapping of convolutional neural networks on ML accelerators. Starting from an open-source core adopting the RISC-V instruction set architecture, we define in RTL a more flexible and powerful multiply-and-accumulate (MAC) unit, compared to the native MAC unit. Our proposal contributes to improving the energy efficiency of the RISC-V cores of PULPino. To effectively evaluate its benefits at system level, while considering CNN execution, we build a corresponding analytical model in the Timeloop/Accelergy simulation and evaluation environment. This enables us to quickly explore CNN mappings on a typical RISC-V system-on-chip model, manufactured under the name of GAP8. The modeling flexibility offered by Timeloop makes it possible to easily evaluate our novel MAC unit in further CNN accelerator architectures such as Eyeriss and DianNao. Overall, the resulting bottom-up methodology assists designers in the efficient implementation of CNNs on ML accelerators by leveraging the accuracy and speed of the combined abstraction levels.

JLPEA, Vol. 13, Pages 4: Simple Technique to Improve Essentially the Performance of One-Stage Op-Amps in Deep Submicrometer CMOS Technologies

Jaime Ramirez-Angulo — 2023-01-04

JLPEA, Vol. 13, Pages 4: Simple Technique to Improve Essentially the Performance of One-Stage Op-Amps in Deep Submicrometer CMOS Technologies

Journal of Low Power Electronics and Applications doi: 10.3390/jlpea13010004

Authors: Jaime Ramirez-Angulo Alejandra Diaz-Armendariz Jesus E. Molinar-Solis Alejandro Diaz-Sanchez Jesus Huerta-Chua

A comparative study of the performance improvement of one-stage op-amps, based on simulations in 22 nm, 45 nm, 90 nm, and 180 nm deep submicrometer CMOS technologies, is discussed. Generic SPICE models were used to simulate the circuits. It is shown that in all cases a simple modification using resistive local common mode feedback increases open-loop gain and gain-bandwidth product, peak output currents, and slew rate by close to an order of magnitude. It is shown that this modification is especially appropriate for current CMOS technologies since large factor improvements were not available in previous technologies. The OTAs with resistive local common mode feedback require simple phase lead compensation with a very small additional silicon area and keep supply requirements and static power dissipation unchanged.

JLPEA, Vol. 13, Pages 3: A Fully-Differential CMOS Instrumentation Amplifier for Bioimpedance-Based IoT Medical Devices

Israel Corbacho — 2022-12-30

JLPEA, Vol. 13, Pages 3: A Fully-Differential CMOS Instrumentation Amplifier for Bioimpedance-Based IoT Medical Devices

Journal of Low Power Electronics and Applications doi: 10.3390/jlpea13010003

Authors: Israel Corbacho Juan M. Carrillo José L. Ausín Miguel Á. Domínguez Raquel Pérez-Aloe J. Francisco Duque-Carrillo

The implementation of a fully-differential (FD) instrumentation amplifier (IA), based on indirect current feedback (ICF) and aimed at electrical impedance measurements in an Internet of Things (IoT) biomedical scenario, is presented. The IA consists of two FD transconductors, to process the input signal and feed back the output signal, a summing stage, used to add both contributions and generate the correcting current feedback signal, and a common-mode feedback network, which controls the DC level at the output nodes of the circuit. The transconductors are formed by a voltage-to-current conversion resistor and two voltage buffers, which are based on a super source follower cell in order to improve the overall response of the circuit. As a result, a compact single-stage structure, suitable for achieving a high bandwidth and a low power consumption, is obtained. The FD ICF IA has been designed and fabricated in 180 nm CMOS technology to operate with a 1.8-V supply and provide a nominal gain of 4 V/V. Experimental results show a voltage gain of 3.78 ± 0.06 V/V, a BW of 5.83 MHz, a CMRR at DC around 70 dB, a DC current consumption of 266.4 μA and a silicon area occupation of 0.0304 mm².

JLPEA, Vol. 13, Pages 2: Evaluation of Dynamic Triple Modular Redundancy in an Interleaved-Multi-Threading RISC-V Core

Marcello Barbirotta — 2022-12-28

JLPEA, Vol. 13, Pages 2: Evaluation of Dynamic Triple Modular Redundancy in an Interleaved-Multi-Threading RISC-V Core

Journal of Low Power Electronics and Applications doi: 10.3390/jlpea13010002

Authors: Marcello Barbirotta Abdallah Cheikh Antonio Mastrandrea Francesco Menichelli Marco Ottavi Mauro Olivieri

Functional safety is a key requirement in several application domains in which microprocessors are an essential part. A number of redundancy techniques have been developed with the common purpose of protecting circuits against single event upset (SEU) faults. In microprocessors, functional redundancy may be achieved through multi-core or simultaneous-multi-threading architectures, with techniques that are broadly classifiable as Double Modular Redundancy (DMR) and Triple Modular Redundancy (TMR), involving the duplication or triplication of architecture units, respectively. RISC-V plays an interesting role in this context for its inherent extendability and the availability of open-source microarchitecture designs. In this work, we present a novel way to exploit the advantages of both DMR and TMR techniques in an Interleaved-Multi-Threading (IMT) microprocessor architecture, leveraging its replicated threads for redundancy, and obtaining a system that can dynamically switch from DMR to TMR in the case of faults. We demonstrated the approach for a specific family of RISC-V cores, modifying the microarchitecture and proving its effectiveness with an extensive RTL fault-injection simulation campaign.
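The difference between DMR and TMR, and the idea of escalating from one to the other, can be sketched in software. This is only an illustration of the general principle, not the authors' microarchitecture: the function names and the policy of running a third copy only after a mismatch are assumptions made for the example.

```python
from typing import Callable, Tuple


def dmr_agree(a: int, b: int) -> bool:
    """DMR: two copies can detect a mismatch but cannot tell which copy is wrong."""
    return a == b


def tmr_vote(a: int, b: int, c: int) -> int:
    """TMR: bitwise majority of three copies masks a fault confined to one copy."""
    return (a & b) | (a & c) | (b & c)


def dynamic_redundancy(copy_result: Callable[[int], int]) -> Tuple[int, str]:
    """Run two redundant copies; consult a third only if they disagree.

    `copy_result(i)` is a hypothetical hook returning the result produced by
    the i-th redundant thread for the same instruction.
    """
    first, second = copy_result(0), copy_result(1)
    if dmr_agree(first, second):
        return first, "DMR"
    third = copy_result(2)
    return tmr_vote(first, second, third), "TMR"


# Usage: thread 1 suffers a single bit flip, so the third copy is consulted.
faulty_run = [0b1010, 0b1110, 0b1010]
value, mode = dynamic_redundancy(lambda i: faulty_run[i])
```

In the fault-free case only two copies are evaluated, which is the motivation for staying in DMR until a disagreement is observed.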

JLPEA, Vol. 13, Pages 1: CCALK: (When) CVA6 Cache Associativity Leaks the Key

Valentin Martinoli — 2022-12-27

JLPEA, Vol. 13, Pages 1: CCALK: (When) CVA6 Cache Associativity Leaks the Key

Journal of Low Power Electronics and Applications doi: 10.3390/jlpea13010001

Authors: Valentin Martinoli Elouan Tourneur Yannick Teglia Régis Leveugle

In this work, we study an end-to-end implementation of a Prime + Probe covert channel on the CVA6 RISC-V processor implemented on an FPGA target and running a Linux OS. We develop the building blocks of the covert channel and provide a detailed view of its behavior and effectiveness. We propose a realistic scenario for extracting information from an AES-128 encryption algorithm implementation. Throughout this work, we discuss the challenges brought by the presence of a running OS while carrying out a microarchitectural covert channel. These include the effects of other running processes, unwanted cache evictions, and the OS’s timing behavior. We also analyze the relationship between the data cache’s characteristics and the developed covert channel’s capacity to extract information. Based on our experimental results, we present guidelines on how to build and configure a cache that is resilient to microarchitectural covert channels in a mono-core mono-thread scenario.

JLPEA, Vol. 12, Pages 65: Energy Sustainability in Wireless Sensor Networks: An Analytical Survey

Emmanouil Andreas Evangelakos — 2022-12-16

JLPEA, Vol. 12, Pages 65: Energy Sustainability in Wireless Sensor Networks: An Analytical Survey

Journal of Low Power Electronics and Applications doi: 10.3390/jlpea12040065

Authors: Emmanouil Andreas Evangelakos Dionisis Kandris Dimitris Rountos George Tselikis Eleftherios Anastasiadis

Wireless Sensor Networks (WSNs) are considered to be among the most important scientific domains. Yet, the exploitation of WSNs suffers from the severe energy restrictions of their electronic components. For this reason, numerous methods have been proposed to extend the lifetime of WSNs through energy saving, energy harvesting, or energy transfer. This study aims to analytically examine all of the existing hardware-based and algorithm-based mechanisms of this kind. The operating principles of 48 approaches are studied, their relative advantages and weaknesses are highlighted, open research issues are discussed, and concluding remarks are drawn.

JLPEA, Vol. 12, Pages 64: All-Standard-Cell-Based Analog-to-Digital Architectures Well-Suited for Internet of Things Applications

Ana Correia — 2022-12-07

JLPEA, Vol. 12, Pages 64: All-Standard-Cell-Based Analog-to-Digital Architectures Well-Suited for Internet of Things Applications

Journal of Low Power Electronics and Applications doi: 10.3390/jlpea12040064

Authors: Ana Correia Vítor Grade Tavares Pedro Barquinha João Goes

In this paper, the analog-to-digital (A/D) converters (ADCs) best suited for Internet of Things (IoT) applications are compared in terms of complexity, dynamic performance, and energy efficiency. Among them, an innovative hybrid topology, a digital–delta (Δ) modulator (ΔM) ADC employing noise shaping (NS), is proposed. To implement the active building blocks, several standard-cell-based synthesizable comparators and amplifiers are examined and compared in terms of their key performance parameters. The simulation results of a fully synthesizable Digital-ΔM with NS using passive and standard-cell-based circuitry show a peak of 72.5 dB in the signal-to-noise and distortion ratio (SNDR) for a 113 kHz input signal and 1 MHz bandwidth (BW). The estimated Walden figure of merit (FoM) is close to 16.2 fJ/conv.-step.
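For readers unfamiliar with the metrics quoted above, the following sketch shows the standard textbook relations between SNDR, effective number of bits (ENOB), and the Walden figure of merit. The abstract does not state the power consumption or the exact FoM definition used, so the function names and the 2 × BW conversion rate are assumptions for illustration.

```python
def enob_from_sndr(sndr_db: float) -> float:
    """Effective number of bits from SNDR in dB: ENOB = (SNDR - 1.76) / 6.02."""
    return (sndr_db - 1.76) / 6.02


def walden_fom(power_w: float, enob_bits: float, bandwidth_hz: float) -> float:
    """Walden FoM in joules per conversion step.

    Assumes the common definition P / (2**ENOB * 2 * BW), i.e. a
    Nyquist-equivalent conversion rate of twice the signal bandwidth.
    """
    return power_w / (2.0 ** enob_bits * 2.0 * bandwidth_hz)


# The reported peak SNDR of 72.5 dB corresponds to roughly 11.75 effective bits.
peak_enob = enob_from_sndr(72.5)
```

A lower FoM means less energy spent per resolved conversion step, which is why the metric is a common yardstick for comparing low-power ADCs.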

JLPEA, Vol. 12, Pages 63: A Spintronic 2M/7T Computation-in-Memory Cell

Atousa Jafari — 2022-12-06

JLPEA, Vol. 12, Pages 63: A Spintronic 2M/7T Computation-in-Memory Cell

Journal of Low Power Electronics and Applications doi: 10.3390/jlpea12040063

Authors: Atousa Jafari Christopher Münch Mehdi Tahoori

Computing data-intensive applications on the von Neumann architecture leads to significant performance and energy overheads. The concept of computation in memory (CiM) addresses the bottleneck of von Neumann machines by reducing the data movement in the computing system. Emerging resistive non-volatile memory technologies, as well as volatile memories (SRAM and DRAM), can be used to realize architectures based on the CiM paradigm. In this paper, we propose a hybrid cell design to provide the opportunity for CiM by combining the magnetic tunnel junction (MTJ) and the conventional 6T-SRAM cell. The cell performs CiM operations based on stateful in-array computation, which has better scalability for multiple operands compared with stateless computation in the periphery. Various logic operations such as XOR, OR, and IMP can be performed with the proposed design. In addition, the proposed cell can also operate as a conventional memory cell to read and write volatile as well as non-volatile data. The obtained simulation results show that the proposed CiM-A design can increase the performance of regular memory architectures by reducing the delay by 8 times and the energy by 13 times for database query applications consisting of consecutive bitwise operations with minimum overhead.

JLPEA, Vol. 12, Pages 62: 0.6-V 1.65-μW Second-Order Gm-C Bandpass Filter for Multi-Frequency Bioimpedance Analysis Based on a Bootstrapped Bulk-Driven Voltage Buffer

Juan M. Carrillo — 2022-11-30

JLPEA, Vol. 12, Pages 62: 0.6-V 1.65-μW Second-Order Gm-C Bandpass Filter for Multi-Frequency Bioimpedance Analysis Based on a Bootstrapped Bulk-Driven Voltage Buffer

Journal of Low Power Electronics and Applications doi: 10.3390/jlpea12040062

Authors: Juan M. Carrillo Carlos A. de la Cruz-Blas

A bootstrapping technique used to increase the intrinsic voltage gain of a bulk-driven MOS transistor is described in this paper. The proposed circuit incorporates a capacitor and a cutoff transistor to be connected to the gate terminal of a bulk-driven MOS device, thus achieving a quasi-floating-gate structure. As a result, the contribution of the gate transconductance is cancelled out and the voltage gain of the device is correspondingly increased. The technique allows for implementing a voltage follower with a voltage gain much closer to unity as compared to the conventional bulk-driven case. This voltage buffer, along with a pseudo-resistor, is used to design a linearized transconductor. The proposed transconductance cell includes an economic continuous tuning mechanism that permits programming the effective transconductance in a range sufficiently wide to counteract the typical variations that process parameters suffer during fabrication. The transconductor has been used to implement a second-order Gm-C bandpass filter with a relatively high selectivity factor, suited for multi-frequency bioimpedance analysis in a very low-voltage environment. All the circuits have been designed in 180 nm CMOS technology to operate with a 0.6-V single-supply voltage. Simulated results show that the proposed technique allows for increasing the linearity and reducing the input-referred noise of the bootstrapped bulk-driven MOS transistor, which results in an improvement of the overall performance of the transconductor. The center frequency of the bandpass filter designed can be programmed in the frequency range from 6.5 kHz to 37.5 kHz with a power consumption ranging between 1.34 μW and 2.19 μW. The circuit presents an in-band integrated noise of 190.5 μVrms and is able to process signals of 110 mVpp with a THD below −40 dB, thus leading to a dynamic range of 47.4 dB.

JLPEA, Vol. 12, Pages 61: Hardware Solutions for Low-Power Smart Edge Computing

Lucas Martin Wisniewski — 2022-11-25

JLPEA, Vol. 12, Pages 61: Hardware Solutions for Low-Power Smart Edge Computing

Journal of Low Power Electronics and Applications doi: 10.3390/jlpea12040061

Authors: Lucas Martin Wisniewski Jean-Michel Bec Guillaume Boguszewski Abdoulaye Gamatié

The edge computing paradigm for Internet-of-Things brings computing closer to data sources, such as environmental sensors and cameras, using connected smart devices. Over the last few years, research in this area has been both interesting and timely. Typical services like analysis, decision, and control, can be realized by edge computing nodes executing full-fledged algorithms. Traditionally, low-power smart edge devices have been realized using resource-constrained systems executing machine learning (ML) algorithms for identifying objects or features, making decisions, etc. Initially, this paper discusses recent advances in embedded systems that are devoted to energy-efficient ML algorithm execution. A survey of the mainstream embedded computing devices for low-power IoT and edge computing is then presented. Finally, CYSmart is introduced as an innovative smart edge computing system. Two operational use cases are presented to illustrate its power efficiency.

JLPEA, Vol. 12, Pages 60: Ultra-Low-Power Circuits for Intermittent Communication

Alessandro Torrisi — 2022-11-13

JLPEA, Vol. 12, Pages 60: Ultra-Low-Power Circuits for Intermittent Communication

Journal of Low Power Electronics and Applications doi: 10.3390/jlpea12040060

Authors: Alessandro Torrisi Kasım Sinan Yıldırım Davide Brunelli

Self-sustainable energy harvesting for Internet of Things devices is challenging since ambient energy may be sporadic and unpredictable. This situation causes frequent power failures and therefore intermittent operation, which undermines the reliability of data communications. This article presents fundamental hardware circuitry that enables reliable intermittent communications over wireless batteryless node networks. We emphasize two main mechanisms that ensure energy awareness and reliability: energy status-sharing and synchronized operation. We introduce novel low-power and self-sustainable plug-and-play circuits to support these mechanisms.

JLPEA, Vol. 12, Pages 59: Towards Low-Power Machine Learning Architectures Inspired by Brain Neuromodulatory Signalling

Taylor Barton — 2022-11-04

JLPEA, Vol. 12, Pages 59: Towards Low-Power Machine Learning Architectures Inspired by Brain Neuromodulatory Signalling

Journal of Low Power Electronics and Applications doi: 10.3390/jlpea12040059

Authors: Taylor Barton Hao Yu Kyle Rogers Nancy Fulda Shiuh-hua Wood Chiang Jordan Yorgason Karl F. Warnick

We present a transfer learning method inspired by modulatory neurotransmitter mechanisms in biological brains and explore applications for neuromorphic hardware. In this method, the pre-trained weights of an artificial neural network are held constant and a new, similar task is learned by manipulating the firing sensitivity of each neuron via a supplemental bias input. We refer to this as neuromodulatory tuning (NT). We demonstrate empirically that neuromodulatory tuning produces results comparable with traditional fine-tuning (TFT) methods in the domain of image recognition in both feed-forward deep learning and spiking neural network architectures. In our tests, NT reduced the number of parameters to be trained by four orders of magnitude as compared with traditional fine-tuning methods. We further demonstrate that neuromodulatory tuning can be implemented in analog hardware as a current source with a variable supply voltage. Our analog neuron design implements the leaky integrate-and-fire model with three bi-directional binary-scaled current sources comprising the synapse. Signals approximating modulatory neurotransmitter mechanisms are applied via adjustable power domains associated with each synapse. We validate the feasibility of the circuit design using high-fidelity simulation tools and propose an efficient implementation of neuromodulatory tuning using integrated analog circuits that consume significantly less power than digital hardware (GPU/CPU).
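The core idea of neuromodulatory tuning, as described above, is that pre-trained weights stay fixed while only a per-neuron supplemental bias is learned. The sketch below illustrates that idea on a single dense ReLU layer in plain Python. It is a toy under stated assumptions, not the authors' training setup: the layer shape, the squared-error loss, the learning rate, and all names are hypothetical.

```python
def forward(weights, bias, modulation, x):
    """Dense ReLU layer; `modulation` is the supplemental per-neuron bias input."""
    outputs = []
    for w_row, b, m in zip(weights, bias, modulation):
        pre_activation = sum(w * xi for w, xi in zip(w_row, x)) + b + m
        outputs.append(max(0.0, pre_activation))
    return outputs


def tune_modulation(weights, bias, samples, lr=0.1, epochs=200):
    """Gradient descent on squared error, updating only the modulation terms.

    `weights` and `bias` are read but never written, so the pre-trained
    parameters stay frozen; one scalar per neuron is trained.
    """
    modulation = [0.0] * len(weights)
    for _ in range(epochs):
        for x, target in samples:
            y = forward(weights, bias, modulation, x)
            for i, (yi, ti) in enumerate(zip(y, target)):
                if yi > 0.0:  # the ReLU passes gradient only when the neuron is active
                    modulation[i] -= lr * 2.0 * (yi - ti)
    return modulation


# Usage: adapt a frozen identity layer to a task whose targets are shifted.
pretrained_w = [[1.0, 0.0], [0.0, 1.0]]
pretrained_b = [0.0, 0.0]
new_task = [([1.0, 1.0], [1.5, 0.5])]
learned = tune_modulation(pretrained_w, pretrained_b, new_task)
```

Here the layer has four weights but only two trainable modulation values; in larger networks the gap between weight count and neuron count grows, which is the source of the parameter savings the abstract reports.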

JLPEA, Vol. 12, Pages 58: Tunnel Field-Effect Transistor: Impact of the Asymmetric and Symmetric Ambipolarity on Fault and Performance in Digital Circuits

Chiara Elfi Spano — 2022-10-31

JLPEA, Vol. 12, Pages 58: Tunnel Field-Effect Transistor: Impact of the Asymmetric and Symmetric Ambipolarity on Fault and Performance in Digital Circuits

Journal of Low Power Electronics and Applications doi: 10.3390/jlpea12040058

Authors: Chiara Elfi Spano Fabrizio Mo Roberta Antonina Claudino Yuri Ardesi Massimo Ruo Roch Gianluca Piccinini Marco Vacca

Tunnel Field-Effect Transistors (TFETs) have been considered one of the most promising technologies to complement or replace CMOS for ultra-low-power applications, thanks to their subthreshold slope below the well-known room-temperature limit of 60 mV/dec that holds for MOSFET technologies. Nevertheless, TFET technology still suffers from ambipolar conduction, limiting its applicability in digital systems. In this work, we analyze, through SPICE simulations, the impact of symmetric and asymmetric ambipolarity on failure and power consumption in TFET-based complementary logic circuits. Our results clarify the circuit-level effects induced by ambipolarity, demonstrating that it affects the correct functioning of logic gates and strongly impacts power consumption. We believe that our outcomes motivate further research towards technological solutions for ambipolarity suppression in TFET technology for near-future ultra-low-power applications.

JLPEA, Vol. 12, Pages 57: Ocelli: Efficient Processing-in-Pixel Array Enabling Edge Inference of Ternary Neural Networks

Sepehr Tabrizchi — 2022-10-30

JLPEA, Vol. 12, Pages 57: Ocelli: Efficient Processing-in-Pixel Array Enabling Edge Inference of Ternary Neural Networks

Journal of Low Power Electronics and Applications doi: 10.3390/jlpea12040057

Authors: Sepehr Tabrizchi Shaahin Angizi Arman Roohi

Convolutional Neural Networks (CNNs), due to their recent successes, have gained considerable attention in various vision-based applications. They have proven to produce excellent results, especially on big data, which imposes high processing demands. However, these processing demands have limited the usage of CNNs in embedded edge devices with constrained energy budgets and hardware. This paper proposes an efficient new architecture, namely Ocelli, which includes a ternary compute pixel (TCP) consisting of a CMOS-based pixel and a compute add-on. The proposed Ocelli architecture offers several features: (I) Because of the compute add-on, TCPs can produce ternary values (i.e., −1, 0, +1) regarding the light intensity as pixels’ inputs; (II) Ocelli realizes analog convolutions enabling low-precision ternary weight neural networks. Since the first layer’s convolution operations are the performance bottleneck of accelerators, Ocelli mitigates the overhead of analog buffers and analog-to-digital converters. Moreover, our design supports a zero-skipping scheme to further reduce power; (III) Ocelli exploits non-volatile magnetic RAMs to store the CNN’s weights, which remarkably reduces the static power consumption; and finally, (IV) Ocelli has two modes, sensing and processing. Once the object is detected, the architecture switches to the typical sensing mode to capture the image. Compared with conventional pixels and existing edge detection algorithms, it achieves an average 10% power-efficiency improvement in lane detection. Moreover, considering different CNN workloads, our design shows more than 23% power efficiency over conventional designs, while it can achieve better accuracy.

JLPEA, Vol. 12, Pages 56: Templatized Fused Vector Floating-Point Dot Product for High-Level Synthesis

Dionysios Filippas — 2022-10-17

JLPEA, Vol. 12, Pages 56: Templatized Fused Vector Floating-Point Dot Product for High-Level Synthesis

Journal of Low Power Electronics and Applications doi: 10.3390/jlpea12040056

Authors: Dionysios Filippas Chrysostomos Nicopoulos Giorgos Dimitrakopoulos

Machine-learning accelerators rely on floating-point matrix and vector multiplication kernels. To reduce their cost, customized many-term fused architectures are preferred, which improve the latency, power, and area of the designs. In this work, we design a parameterized fused many-term floating-point dot product architecture that is ready for high-level synthesis. In this way, we can exploit the efficiency offered by a well-structured fused dot-product architecture and the freedom offered by high-level synthesis in tuning the design’s pipeline to the selected floating-point format and architectural constraints. When compared with optimized dot-product units implemented directly in RTL, the proposed design offers lower-latency implementations under the same clock frequency with marginal area savings. This result holds for a variety of floating-point formats, including standard and reduced-precision representations.