Journal of Low Power Electronics and Applications

19 pages, 1347 KB

Open Access Article

Flexible, Scalable and Energy Efficient Bio-Signals Processing on the PULP Platform: A Case Study on Seizure Detection

by Fabio Montagna, Simone Benatti and Davide Rossi

J. Low Power Electron. Appl. 2017, 7(2), 16; https://doi.org/10.3390/jlpea7020016 - 11 Jun 2017

Cited by 14 | Viewed by 10052

Ultra-low power operation and extreme energy efficiency are strong requirements for a number of high-growth application areas requiring near-sensor processing, including elaboration of biosignals. Parallel near-threshold computing is emerging as an approach to achieve significant improvements in energy efficiency while overcoming the performance degradation typical of low-voltage operations. In this paper, we demonstrate the capabilities of the PULP (Parallel Ultra-Low Power) platform on an algorithm for seizure detection, representative of a wide range of EEG signal processing applications. Starting from the 28-nm FD-SOI (Fully Depleted Silicon On Insulator) technology implementation of the third embodiment of the PULP architecture, we analyze the energy-efficient implementation of the seizure detection algorithm on PULP. The proposed parallel implementation exploits the dynamic voltage and frequency scaling capabilities, as well as the embedded power knobs of the PULP platform, reducing energy consumption for a seizure detection by up to 10× with respect to a sequential implementation at the nominal supply voltage and by 4.2× with respect to a sequential implementation with voltage scaling. Moreover, we analyze the trans-precision optimization of the algorithm on PULP, by means of a hybrid fixed- and floating-point implementation. This approach reduces the energy consumption by up to 43% with respect to the plain fixed-point and floating-point implementations, leveraging the requirements in terms of the precision of the kernels composing the processing chain to improve energy efficiency. Thanks to the proposed architecture and system-level approach for optimization, we demonstrate that PULP reduces energy consumption by up to 140× with respect to commercial low-power microcontrollers, being able to satisfy the real-time constraints typical of bio-medical applications, breaking the barrier of microwatts for a 50-ms complete seizure detection and a few milliwatts for a 5-ms detection latency on a fully-programmable architecture.

(This article belongs to the Special Issue Selected Papers from IEEE S3S Conference 2016)

16 pages, 8028 KB

Open Access Article

Predictive Direct Torque Control Application-Specific Integrated Circuit of an Induction Motor Drive with a Fuzzy Controller

by Guo-Ming Sung, Wei-Yu Wang, Wen-Sheng Lin and Chih-Ping Yu

J. Low Power Electron. Appl. 2017, 7(2), 15; https://doi.org/10.3390/jlpea7020015 - 10 Jun 2017

Cited by 7 | Viewed by 8906

Abstract

This paper proposes a modified predictive direct torque control (PDTC) application-specific integrated circuit (ASIC) of a motor drive with a fuzzy controller for eliminating sampling and calculating delay times in hysteresis controllers. These delay times degrade the control quality and increase both torque and flux ripples in a motor drive. The proposed fuzzy PDTC ASIC calculates the stator’s magnetic flux and torque by detecting the three-phase current, three-phase voltage, and rotor speed, and eliminates the ripples in the torque and flux by using a fuzzy controller and predictive scheme. The Verilog hardware description language was used to implement the hardware architecture, and the ASIC was fabricated by the Taiwan Semiconductor Manufacturing Company through a 0.18-μm 1P6M CMOS process that involved a cell-based design method. The measurements revealed that the proposed fuzzy PDTC ASIC of the three-phase induction motor yielded a test coverage of 96.03%, fault coverage of 95.06%, chip area of 1.81 × 1.81 mm², and power consumption of 296 mW, at an operating frequency of 50 MHz and a supply voltage of 1.8 V.

24 pages, 610 KB

Open Access Article

Architectural Techniques for Improving the Power Consumption of NoC-Based CMPs: A Case Study of Cache and Network Layer

by Emmanuel Ofori-Attah, Washington Bhebhe and Michael Opoku Agyeman

J. Low Power Electron. Appl. 2017, 7(2), 14; https://doi.org/10.3390/jlpea7020014 - 29 May 2017

Cited by 14 | Viewed by 12442

Abstract

The disparity between memory and CPU has been ameliorated by the introduction of Network-on-Chip-based Chip-Multiprocessors (NoC-based CMPs). However, power consumption continues to be an aggressive stumbling block halting the progress of technology. Miniaturized transistors invoke many-core integration at the cost of high power consumption caused by the components in NoC-based CMPs, particularly caches and routers. If NoC-based CMPs are to be standardised as the future of technology design, it is imperative that the power demands of their components are optimized. Much research effort has been put into finding techniques that can improve the power efficiency for both cache and router architectures. This work presents a survey of power-saving techniques for efficient NoC designs with a focus on the cache and router components, such as the buffer and crossbar. The aim of this work is to compile a quick reference guide of power-saving techniques for engineers and researchers.

(This article belongs to the Special Issue Emerging Network-on-Chip Architectures for Low Power Embedded Systems)

18 pages, 852 KB

Open Access Article

Global Adaptation Controlled by an Interactive Consistency Protocol

by Alina Lenz and Roman Obermaisser

J. Low Power Electron. Appl. 2017, 7(2), 13; https://doi.org/10.3390/jlpea7020013 - 28 May 2017

Cited by 7 | Viewed by 8600

Abstract

Static schedules for systems can lead to an inefficient usage of the resources, because the system’s behavior cannot be adapted at runtime. To improve the runtime system performance in current time-triggered Multi-Processor System on Chip (MPSoC), a dynamic reaction to events is performed locally on the cores. The effects of this optimization can be increased by coordinating the changes globally. To perform such global changes, a consistent view on the system state is needed, on which to base the adaptation decisions. This paper proposes such an interactive consistency protocol with low impact on the system w.r.t. latency and overhead. We show that an energy optimizing adaptation controlled by the protocol can enable a system to save up to 43% compared to a system without adaptation.

(This article belongs to the Special Issue Emerging Network-on-Chip Architectures for Low Power Embedded Systems)

12 pages, 2110 KB

Open Access Article

SoC Hardware Implementation of Real-Time Video Segmentation based on the Mixture of Gaussian Algorithm

by Peng Li and Hua Tang

J. Low Power Electron. Appl. 2017, 7(2), 12; https://doi.org/10.3390/jlpea7020012 - 18 May 2017

Cited by 1 | Viewed by 7981

Abstract

Video segmentation based on the Mixture of Gaussian (MoG) algorithm is widely used in video processing systems, and hardware implementations have been proposed in the past years. Most previous work focused on high-performance custom design of the MoG algorithm to meet the real-time requirements of high-frame-rate, high-resolution video segmentation tasks. This paper focuses on the System-on-Chip (SoC) design and the priority is SoC integration of the system for flexibility/adaptability, while at the same time, custom design of the original MoG algorithm is included. To maximally retain the accuracy of the MoG algorithm for best segmentation performance, we minimally modified the MoG algorithm for hardware implementation at the cost of hardware resources. The MoG algorithm is custom-implemented as a hardware IP (Intellectual Property), which is then integrated within an SoC platform together with other video processing components, so that some key control parameters can be configured on-line, which makes the video segmentation system suitable for different scenarios. The proposed implementation has been demonstrated and tested on a Xilinx Spartan-3A DSP Video Starter Board. Experimental results show that under a clock frequency of 25 MHz, this design meets the real-time requirement for VGA resolution (640 × 480) at 30 fps (frames per second).
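
As a rough check on the real-time figure quoted in this abstract, the cycle budget per pixel follows directly from the stated clock frequency and video format. The sketch below is illustrative arithmetic only: it counts active pixels, ignores blanking intervals and memory-access overhead (neither is described in the abstract), and the function name `cycles_per_pixel` is hypothetical.

```python
def cycles_per_pixel(clock_hz, width, height, fps):
    """Clock cycles available per active pixel (blanking intervals ignored)."""
    pixels_per_second = width * height * fps
    return clock_hz / pixels_per_second


# Figures taken from the abstract: 25 MHz clock, 640 x 480 at 30 fps.
budget = cycles_per_pixel(25_000_000, 640, 480, 30)
print(f"{budget:.2f} cycles per pixel")  # prints "2.71 cycles per pixel"
```

Under these assumptions the design has fewer than three clock cycles per pixel, which suggests a pipelined datapath that accepts roughly one pixel every two to three cycles.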

12 pages, 1899 KB

Open Access Article

Ultra Low Energy FDSOI Asynchronous Reconfiguration Network for Adaptive Circuits

by Soundous Chairat, Edith Beigne, Ivan Miro-Panades and Marc Belleville

J. Low Power Electron. Appl. 2017, 7(2), 11; https://doi.org/10.3390/jlpea7020011 - 11 May 2017

Viewed by 8205

Abstract

This paper introduces a plug-and-play on-chip asynchronous communication network aimed at the dynamic reconfiguration of a low-power adaptive circuit such as an internet of things (IoT) system. By using a separate communication network, we can address both digital and analog blocks at a lower configuration cost, increasing the overall system power efficiency. As reconfiguration only occurs according to specific events and has to be automatically in stand-by most of the time, our design is fully asynchronous using handshake protocols. The paper presents the circuit’s architecture, performance results, and an example of the reconfiguration of frequency locked loops (FLL) to validate our work. We obtain an overall energy per bit of 0.07 pJ/bit for one stage, in a 28 nm Fully Depleted Silicon On Insulator (FDSOI) technology at 0.6 V and a 1.1 ns/bit latency per stage.

(This article belongs to the Special Issue Selected Papers from IEEE S3S Conference 2016)

16 pages, 283 KB

Open Access Article

A General-Purpose Graphics Processing Unit (GPGPU)-Accelerated Robotic Controller Using a Low Power Mobile Platform

by Syed Tahir Hussain Rizvi, Gianpiero Cabodi, Denis Patti and Muhammad Majid Gulzar

J. Low Power Electron. Appl. 2017, 7(2), 10; https://doi.org/10.3390/jlpea7020010 - 5 May 2017

Cited by 5 | Viewed by 8321

Abstract

Robotic controllers have to execute various complex independent tasks repeatedly. Massive processing power is required by the motion controllers to compute the solution of these computationally intensive algorithms. General-purpose graphics processing unit (GPGPU)-enabled mobile phones can be leveraged for acceleration of these motion controllers. Embedded GPUs can replace several dedicated computing boards with a single powerful and less power-consuming GPU. In this paper, a numeric controller based on the inverse kinematic algorithm is proposed and realized using the GPGPU of a handheld mobile device. This work is the extension of a desktop GPU-accelerated robotic controller presented at DAS’16 where the comparative analysis of different sequential and concurrent controllers is discussed. First of all, the inverse kinematic algorithm is sequentially realized using an Arduino Due microcontroller, and a field-programmable gate array (FPGA) is used for its parallel implementation. Execution speeds of these controllers are compared with two different GPGPU architectures (Nvidia Quadro K2200 and Nvidia Shield K1 Tablet), programmed with the Compute Unified Device Architecture (CUDA) computing language. Experimental data shows that the proposed mobile platform-based scheme outperforms the FPGA by 5× and boasts a 100× speedup over the Arduino-based sequential implementation.

10 pages, 845 KB

Open Access Article

High Performance Receiver Design for RX Carrier Aggregation

by Jusung Kim, Bon-Hyun Ku, Sanghun Lee, Sungchan Kim and Keunkwan Ryu

J. Low Power Electron. Appl. 2017, 7(2), 9; https://doi.org/10.3390/jlpea7020009 - 1 May 2017

Cited by 5 | Viewed by 8554

Abstract

Carrier aggregation is one of the key features to increase the data rate given a scarce bandwidth spectrum. This paper describes the design of a high performance receiver suitable for carrier aggregation in LTE-Advanced and future 5G standards. The proposed architecture is versatile to support legacy mode (single carrier), inter-band carrier aggregation, and intra-band carrier aggregation. Performance with carrier-aggregation support is as good as legacy receivers. Contradicting requirements of high linearity and low noise are satisfied with the single-gm receiver architecture in addition to supporting carrier aggregation. The proposed cascode-shutoff low-noise trans-conductance amplifier (LNTA) achieves 57.1 dB voltage gain, 1.76 dB NF (noise figure), and −6.7 dBm IIP3 (third-order intercept point) with the power consumption of 21.3 mW in the intra-band carrier aggregation scenario. With legacy mode, the same receiver signal path achieves 56.6 dB voltage gain, 1.33 dB NF, and −6.2 dBm IIP3 with a low power consumption of 7.4 mW.

(This article belongs to the Special Issue Selected Papers from IEEE ISOCC Conference 2016)

20 pages, 2210 KB

Open Access Article

The Design and Implementation of a Low-Power Gating Scan Element in 32/28 nm CMOS Technology

by Mahshid Mojtabavi Naeini, Sreedharan Baskara Dass and Chia Yee Ooi

J. Low Power Electron. Appl. 2017, 7(2), 7; https://doi.org/10.3390/jlpea7020007 - 28 Apr 2017

Cited by 5 | Viewed by 9994

Abstract

Excessive power consumption during test application time has severely negative effects on chip reliability since it has an inevitable role in hot spots that appear, degradation of performance, premature circuit destruction, and functional failures. In scan-based designs, rippling transitions caused by test patterns shifting along the scan chain not only elevate power consumption in the scan chain but also introduce spurious switching activities in the combinational logic. In this work, a new low power gating scan cell for scan-based designs has been proposed in order to reduce power consumption in the scan chain as well as the combinational part during shifting. We have modified the conventional scan cell and augmented it with state preserving and gating logic that enables an average power reduction in combinational logic during shift mode. The new scan cell mitigates the number of transitions during shift and capture cycles. Thus, it reduces the average power consumption inside the scan cell and as a result the scan chain during scan shifting with a low impact on peak power during the capture cycle. Furthermore, due to introducing a new shorter shift path, improvements are observed in terms of propagation delay and power consumption in the scan chain during shifting. This leads to higher feasible shift frequency whereby the shift frequency is limited by the maximum power budget and hence results in reducing the test application time. The post-layout SPICE simulation results show a 7.21% reduction in total power consumption, an average 12.25% reduction of shift power consumption, and a 50.7% improvement in the clock (CLK)-to-shift propagation delay over the conventional scan cell in Synopsys 32/28 nm standard CMOS technology.

(This article belongs to the Special Issue Ultra-Low Power VLSI Design for Emerging Applications)

23 pages, 2001 KB

Open Access Article

Extending the Performance of Hybrid NoCs beyond the Limitations of Network Heterogeneity

by Michael Opoku Agyeman, Wen Zong, Alex Yakovlev, Kin-Fai Tong and Terrence Mak

J. Low Power Electron. Appl. 2017, 7(2), 8; https://doi.org/10.3390/jlpea7020008 - 26 Apr 2017

Cited by 13 | Viewed by 9123

Abstract

To meet the performance and scalability demands of the fast-paced technological growth towards exascale and big data processing with the performance bottleneck of conventional metal-based interconnects (wireline), alternative interconnect fabrics, such as inhomogeneous three-dimensional integrated network-on-chip (3D NoC) and hybrid wired-wireless network-on-chip (WiNoC), have emanated as a cost-effective solution for emerging system-on-chip (SoC) design. However, these interconnects trade off optimized performance for cost by restricting the number of area and power hungry 3D routers and wireless nodes. Moreover, the non-uniform distributed traffic in a chip multiprocessor (CMP) demands an on-chip communication infrastructure that can avoid congestion under high traffic conditions while possessing minimal pipeline delay at low-load conditions. To this end, in this paper, we propose a low-latency adaptive router with a low-complexity single-cycle bypassing mechanism to alleviate the performance degradation due to the slow 2D routers in such emerging hybrid NoCs. The proposed router transmits a flit using dimension-ordered routing (DoR) in the bypass datapath at low-loads. When the output port required for intra-dimension bypassing is not available, the packet is routed adaptively to avoid congestion. The router also has a simplified virtual channel allocation (VA) scheme that yields a non-speculative low-latency pipeline. By combining the low-complexity bypassing technique with adaptive routing, the proposed router is able to balance the traffic in hybrid NoCs to achieve low-latency communication under various traffic loads. Simulation shows that the proposed router can reduce applications’ execution time by an average of 16.9% compared to low-latency routers, such as SWIFT. By reducing the latency between 2D routers (or wired nodes) and 3D routers (or wireless nodes), the proposed router can improve the performance efficiency in terms of average packet delay by an average of 45% (or 50%) in 3D NoCs (or WiNoCs).
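
The routing behaviour summarised in this abstract (dimension-ordered routing by default, adaptive routing when the required output port is unavailable) can be illustrated with a minimal sketch. This is not the authors' router: it assumes a plain 2D mesh with (x, y) coordinates, hypothetical port names `E`, `W`, `N`, `S` and `LOCAL`, and a simple fallback policy of trying the other minimal-path port. Bypassing, virtual channel allocation and deadlock avoidance are not modelled.

```python
def xy_next_hop(current, dest):
    """Dimension-ordered (XY) routing on a 2D mesh: resolve X first, then Y."""
    cx, cy = current
    dx, dy = dest
    if cx != dx:
        return "E" if dx > cx else "W"
    if cy != dy:
        return "N" if dy > cy else "S"
    return "LOCAL"


def route_with_fallback(current, dest, busy_ports):
    """Prefer the DoR port; if it is busy, try another minimal-path port.

    The fallback policy is an assumption made for illustration only.
    """
    preferred = xy_next_hop(current, dest)
    if preferred == "LOCAL" or preferred not in busy_ports:
        return preferred
    cx, cy = current
    dx, dy = dest
    productive = []
    if dx != cx:
        productive.append("E" if dx > cx else "W")
    if dy != cy:
        productive.append("N" if dy > cy else "S")
    for port in productive:
        if port not in busy_ports:
            return port
    return preferred  # no free minimal-path port: wait on the DoR port


print(xy_next_hop((0, 0), (2, 1)))                 # E
print(route_with_fallback((0, 0), (2, 1), {"E"}))  # N
```

A real adaptive router would also need virtual channels or turn restrictions to stay deadlock-free, which this sketch deliberately leaves out.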

(This article belongs to the Special Issue Emerging Network-on-Chip Architectures for Low Power Embedded Systems)

13 pages, 1187 KB

Open Access Article

Design of a Wideband Antenna for Wireless Network-On-Chip in Multimedia Applications

by Fernando Gutierrez

J. Low Power Electron. Appl. 2017, 7(2), 6; https://doi.org/10.3390/jlpea7020006 - 29 Mar 2017

Cited by 9 | Viewed by 9936

Abstract

To allow fast communication, at several Gb/s, of multimedia content among processors and memories in a multi-processor system-on-chip, a new approach is emerging in the literature: Wireless Network-on-Chip (WiNoC). With reference to this scenario, this paper presents the design of the key element of the WiNoC: the antenna. Specifically, a bow-tie antenna is proposed, which operates at mm-waves and can be implemented on-chip using the top metal layer of a conventional silicon CMOS (Complementary Metal Oxide Semiconductor) technology. The antenna performance is discussed in the paper and is compared to the state-of-the-art, including the zig-zag antenna topology that is typically used in the literature as a reference for WiNoC. The proposed bow-tie antenna design for WiNoC stands out for its good trade-off among bandwidth, gain, size and beamwidth vs. the state-of-the-art.

(This article belongs to the Special Issue Emerging Network-on-Chip Architectures for Low Power Embedded Systems)

J. Low Power Electron. Appl., Volume 7, Issue 2 (June 2017) – 11 articles
