Open AccessArticle
Energy Efficiency Effects of Vectorization in Data Reuse Transformations for Many-Core Processors—A Case Study †
J. Low Power Electron. Appl. 2017, 7(1), 5; doi:10.3390/jlpea7010005 -
Abstract
Thread-level and data-level parallel architectures have become the design of choice in many of today’s energy-efficient computing systems. However, these architectures put substantially higher requirements on the memory subsystem than scalar architectures, making memory latency and bandwidth critical in their overall efficiency. Data
[...] Read more.
Thread-level and data-level parallel architectures have become the design of choice in many of today’s energy-efficient computing systems. However, these architectures put substantially higher requirements on the memory subsystem than scalar architectures, making memory latency and bandwidth critical in their overall efficiency. Data reuse exploration aims at reducing the pressure on the memory subsystem by exploiting the temporal locality in data accesses. In this paper, we investigate the effects on performance and energy from a data reuse methodology combined with parallelization and vectorization in multi- and many-core processors. As a test case, a full-search motion estimation kernel is evaluated on Intel® CoreTM i7-4700K (Haswell) and i7-2600K (Sandy Bridge) multi-core processors, as well as on an Intel® Xeon PhiTM many-core processor (Knights Landing) with Streaming Single Instruction Multiple Data (SIMD) Extensions (SSE) and Advanced Vector Extensions (AVX) instruction sets. Results using a single-threaded execution on the Haswell and Sandy Bridge systems show that performance and EDP (Energy Delay Product) can be improved through data reuse transformations on the scalar code by a factor of ≈3× and ≈6×, respectively. Compared to scalar code without data reuse optimization, the SSE/AVX2 version achieves ≈10×/17× better performance and ≈92×/307× better EDP, respectively. These results can be improved by 10% to 15% using data reuse techniques. Finally, the most optimized version using data reuse and AVX512 achieves a speedup of ≈35× and an EDP improvement of ≈1192× on the Xeon Phi system. While single-threaded execution serves as a common reference point for all architectures to analyze the effects of data reuse on both scalar and vector codes, scalability with thread count is also discussed in the paper. Full article
Figures

Figure 1

Open AccessArticle
A Novel Design Flow for a Security-Driven Synthesis of Side-Channel Hardened Cryptographic Modules
J. Low Power Electron. Appl. 2017, 7(1), 4; doi:10.3390/jlpea7010004 -
Abstract
Over the last few decades, computer-aided engineering (CAE) tools have been developed and improved in order to ensure a short time-to-market in the chip design business. Up to now, these design tools do not yet support an integrated design strategy for the development
[...] Read more.
Over the last few decades, computer-aided engineering (CAE) tools have been developed and improved in order to ensure a short time-to-market in the chip design business. Up to now, these design tools do not yet support an integrated design strategy for the development of side-channel-resistant hardware implementations. In order to close this gap, a novel framework named AMASIVE (Adaptable Modular Autonomous SIde-Channel Vulnerability Evaluator) was developed. It supports the designer in implementing devices hardened against power attacks by exploiting novel security-driven synthesis methods. The article at hand can be seen as the second of the two contributions that address the AMASIVE framework. While the first one describes how the framework automatically detects vulnerabilities against power attacks, the second one explains how a design can be hardened in an automatic way by means of appropriate countermeasures, which are tailored to the identified weaknesses. In addition to the theoretical introduction of the fundamental concepts, we demonstrate an application to the hardening of a complete hardware implementation of the block cipher PRESENT. Full article
Figures

Open AccessArticle
Completing the Complete ECC Formulae with Countermeasures
J. Low Power Electron. Appl. 2017, 7(1), 3; doi:10.3390/jlpea7010003 -
Abstract
This work implements and evaluates the recent complete addition formulae for the prime order elliptic curves of Renes, Costello and Batina on an FPGA platform. We implement three different versions:(1) an unprotected architecture; (2) an architecture protected through coordinate randomization; and (3) an
[...] Read more.
This work implements and evaluates the recent complete addition formulae for the prime order elliptic curves of Renes, Costello and Batina on an FPGA platform. We implement three different versions:(1) an unprotected architecture; (2) an architecture protected through coordinate randomization; and (3) an architecture with both coordinate randomization and scalar splitting in place. The evaluation is done through timing analysis and test vector leakage assessment (TVLA). The results show that applying an increasing level of countermeasures leads to an increasing resistance against side-channel attacks. This is the first work looking into side-channel security issues of hardware implementations of the complete formulae. Full article
Figures

Figure 1

Open AccessArticle
On Improving Reliability of SRAM-Based Physically Unclonable Functions
J. Low Power Electron. Appl. 2017, 7(1), 2; doi:10.3390/jlpea7010002 -
Abstract
Physically unclonable functions (PUFs) have been touted for their inherent resistance to invasive attacks and low cost in providing a hardware root of trust for various security applications. SRAM PUFs in particular are popular in industry for key/ID generation. Due to intrinsic process
[...] Read more.
Physically unclonable functions (PUFs) have been touted for their inherent resistance to invasive attacks and low cost in providing a hardware root of trust for various security applications. SRAM PUFs in particular are popular in industry for key/ID generation. Due to intrinsic process variations, SRAM cells, ideally, tend to have the same start-up behavior. SRAM PUFs exploit this start-up behavior. Unfortunately, not all SRAM cells exhibit reliable start-up behavior due to noise susceptibility. Hence, design enhancements are needed for improving reliability. Some of the proposed enhancements in literature include fuzzy extraction, error-correcting codes and voting mechanisms. All enhancements involve a trade-off between area/power/performance overhead and PUF reliability. This paper presents a design enhancement technique for reliability that improves upon previous solutions. We present simulation results to quantify improvement in SRAM PUF reliability and efficiency. The proposed technique is shown to generate a 128-bit key in ≤0.2 μs at an area estimate of 4538 μm2 with error rate as low as 106 for intrinsic error probability of 15%. Full article
Figures

Figure 1

Open AccessEditorial
Acknowledgement to Reviewers of Journal of Low Power Electronics and Applications in 2016
J. Low Power Electron. Appl. 2017, 7(1), 1; doi:10.3390/jlpea7010001 -
Abstract The editors of Journal of Low Power Electronics and Applications would like to express their sincere gratitude to the following reviewers for assessing manuscripts in 2016.[...] Full article
Open AccessReview
Low Power Design for Future Wearable and Implantable Devices
J. Low Power Electron. Appl. 2016, 6(4), 20; doi:10.3390/jlpea6040020 -
Abstract
With the fast progress in miniaturization of sensors and advances in micromachinery systems, a gate has been opened to the researchers to develop extremely small wearable/implantable microsystems for different applications. However, these devices are reaching not to a physical limit but a power
[...] Read more.
With the fast progress in miniaturization of sensors and advances in micromachinery systems, a gate has been opened to the researchers to develop extremely small wearable/implantable microsystems for different applications. However, these devices are reaching not to a physical limit but a power limit, which is a critical limit for further miniaturization to develop smaller and smarter wearable/implantable devices (WIDs), especially for multi-task continuous computing purposes. Developing smaller and smarter devices with more functionality requires larger batteries, which are currently the main power provider for such devices. However, batteries have a fixed energy density, limited lifetime and chemical side effect plus the fact that the total size of the WID is dominated by the battery size. These issues make the design very challenging or even impossible. A promising solution is to design batteryless WIDs scavenging energy from human or environment including but not limited to temperature variations through thermoelectric generator (TEG) devices, body movement through Piezoelectric devices, solar energy through miniature solar cells, radio-frequency (RF) harvesting through antenna etc. However, the energy provided by each of these harvesting mechanisms is very limited and thus cannot be used for complex tasks. Therefore, a more comprehensive solution is the use of different harvesting mechanisms on a single platform providing enough energy for more complex tasks without the need of batteries. In addition to this, complex tasks can be done by designing Integrated Circuits (ICs), as the main core and the most power consuming component of any WID, in an extremely low power mode by lowering the supply voltage utilizing low-voltage design techniques. Having the ICs operational at very low voltages, will enable designing battery-less WIDs for complex tasks, which will be discussed in details throughout this paper. In this paper, a path towards battery-less computing is drawn by looking at device circuit co-design for future system-on-chips (SoCs). Full article
Figures

Figure 1

Open AccessArticle
InGaAs-OI Substrate Fabrication on a 300 mm Wafer
J. Low Power Electron. Appl. 2016, 6(4), 19; doi:10.3390/jlpea6040019 -
Abstract
In this work, we demonstrate for the first time a 300-mm indium–gallium–arsenic (InGaAs) wafer on insulator (InGaAs-OI) substrates by splitting in an InP sacrificial layer. A 30-nm-thick InGaAs layer was successfully transferred using low temperature direct wafer bonding (DWB) and Smart CutTM
[...] Read more.
In this work, we demonstrate for the first time a 300-mm indium–gallium–arsenic (InGaAs) wafer on insulator (InGaAs-OI) substrates by splitting in an InP sacrificial layer. A 30-nm-thick InGaAs layer was successfully transferred using low temperature direct wafer bonding (DWB) and Smart CutTM technology. Three key process steps of the integration were therefore specifically developed and optimized. The first one was the epitaxial growing process, designed to reduce the surface roughness of the InGaAs film. Second, direct wafer bonding conditions were investigated and optimized to achieve non-defective bonding up to 600 °C. Finally, we adapted the splitting condition to detach the InGaAs layer according to epitaxial stack specifications. The paper presents the overall process flow that achieved InGaAs-OI, the required optimization, and the associated characterizations, namely atomic force microscopy (AFM), scanning acoustic microscopy (SAM), and HR-XRD, to insure the crystalline quality of the post transferred layer. Full article
Figures

Figure 1

Open AccessArticle
A 4.1 W/mm2 Hybrid Inductive/Capacitive Converter for 2–140 mA-DVS Load under Inductor
J. Low Power Electron. Appl. 2016, 6(3), 18; doi:10.3390/jlpea6030018 -
Abstract
This work presents a fully integrated hybrid inductive/capacitive converter maintaining high efficiency for a load range of 2 mA to 140 mA (70×) suitable for the dynamic voltage scaling (DVS) based loads. This high efficiency is achieved by using an inductive converter for
[...] Read more.
This work presents a fully integrated hybrid inductive/capacitive converter maintaining high efficiency for a load range of 2 mA to 140 mA (70×) suitable for the dynamic voltage scaling (DVS) based loads. This high efficiency is achieved by using an inductive converter for higher loads (15–140 mA, 0.50–0.9 V) and a capacitive converter for lighter loads (2–5 mA, 0.40–0.55 V) with a 50 mV hysteresis margin. A digital state machine activates the appropriate converter based on the power efficiency and enables the converter hand-over. The functional feasibility of implementing digital circuits as representative loads under the inductor is shown thereby increasing the peak converter power density from 0.387 W/mm2 to 4.1 W/mm2 with only a minor hit on the efficiency. The maximum measured efficiency is achieved in inductive mode of operation and decreases from 76.4% to 71% when digital circuits are present under the inductor. The design was fabricated in IBM’s 32 nm SOI technology. Full article
Figures

Figure 1

Open AccessArticle
A Fully Integrated 2:1 Self-Oscillating Switched-Capacitor DC–DC Converter in 28 nm UTBB FD-SOI
J. Low Power Electron. Appl. 2016, 6(3), 17; doi:10.3390/jlpea6030017 -
Abstract
The importance of energy-constrained processors continues to grow especially for ultra-portable sensor-based platforms for the Internet-of-Things (IoT). Processors for these IoT applications primarily operate at near-threshold (NT) voltages and have multiple power modes. Achieving high conversion efficiency within the DC–DC converter that supplies
[...] Read more.
The importance of energy-constrained processors continues to grow especially for ultra-portable sensor-based platforms for the Internet-of-Things (IoT). Processors for these IoT applications primarily operate at near-threshold (NT) voltages and have multiple power modes. Achieving high conversion efficiency within the DC–DC converter that supplies these processors is critical since energy consumption of the DC–DC/processor system is proportional to the DC–DC converter efficiency. The DC–DC converter must maintain high efficiency over a large load range generated from the multiple power modes of the processor. This paper presents a fully integrated step-down self-oscillating switched-capacitor DC–DC converter that is capable of meeting these challenges. The area of the converter is 0.0104 mm2 and is designed in 28 nm ultra-thin body and buried oxide fully-depleted SOI (UTBB FD-SOI). Back-gate biasing within FD-SOI is utilized to increase the load power range of the converter. With an input of 1 V and output of 460 mV, measurements of the converter show a minimum efficiency of 75% for 79 nW to 200 µW loads. Measurements with an off-chip NT processor load show efficiency up to 86%. The converter’s large load power range and high efficiency make it an excellent fit for energy-constrained processors. Full article
Figures

Open AccessFeature PaperArticle
Sizing of SRAM Cell with Voltage Biasing Techniques for Reliability Enhancement of Memory and PUF Functions
J. Low Power Electron. Appl. 2016, 6(3), 16; doi:10.3390/jlpea6030016 -
Abstract
Static Random Access Memory (SRAM) has recently been developed into a physical unclonable function (PUF) for generating chip-unique signatures for hardware cryptography. The most compelling issue in designing a good SRAM-based PUF (SPUF) is that while maximizing the mismatches between the transistors in
[...] Read more.
Static Random Access Memory (SRAM) has recently been developed into a physical unclonable function (PUF) for generating chip-unique signatures for hardware cryptography. The most compelling issue in designing a good SRAM-based PUF (SPUF) is that while maximizing the mismatches between the transistors in the cross-coupled inverters improves the quality of the SPUF, this ironically also gives rise to increased memory read/write failures. For this reason, the memory cells of existing SPUFs cannot be reused as storage elements, which increases the overheads of cryptographic system where long signatures and high-density storage are both required. This paper presents a novel design methodology for dual-mode SRAM cell optimization. The design conflicts are resolved by using word-line voltage modulation, dynamic voltage scaling, negative bit-line and adaptive body bias techniques to compensate for reliability degradation due to transistor downsizing. The augmented circuit-level techniques expand the design space to achieve a good solution to fulfill several otherwise contradicting key design qualities for both modes of operation, as evinced by our statistical analysis and simulation results based on complementary metal–oxide–semiconductor (CMOS) 45 nm bulk Predictive Technology Model. Full article
Figures

Open AccessArticle
Stochastic-Based Spin-Programmable Gate Array with Emerging MTJ Device Technology
J. Low Power Electron. Appl. 2016, 6(3), 15; doi:10.3390/jlpea6030015 -
Abstract
This paper describes the stochastic-based Spin-Programmable Gate Array (SPGA), an innovative architecture attempting to exploit the stochastic switching behavior newly found in emerging spintronic devices for reconfigurable computing. While many recently studies have investigated using Spin Transfer Torque Memory (STTM) devices to replace
[...] Read more.
This paper describes the stochastic-based Spin-Programmable Gate Array (SPGA), an innovative architecture attempting to exploit the stochastic switching behavior newly found in emerging spintronic devices for reconfigurable computing. While many recently studies have investigated using Spin Transfer Torque Memory (STTM) devices to replace configuration memory in field programmable gate arrays (FPGAs), our study, for the first time, attempts to use the quantum-induced stochastic property exhibited by spintronic devices directly for reconfiguration and logic computation. Specifically, the SPGA was designed from scratch for high performance, routability, and ease-of-use. It supports variable-granularity multiple-input-multiple-output (MIMO) logic blocks and variable-length bypassing interconnects with a symmetrical structure. Due to its unconventional architectural features, the SPGA requires several major modifications to be made in the standard VPR placement/routing CAD flow, which include a new technology mapping algorithm based on computing (k, l)-cut, a new placement algorithm, and a modified delay-based routing procedure.Previous studies have shown that, simply replacing reconfiguration memory bits with spintronic devices, the conventional 2D island-style FPGA architecture can achieve approximately 5 times area savings, 2 times speedup and 1.6 times power savings. Our mixed-mode simulation results have shown that, with FPGA architecture innovations, on average, a SPGA can further achieve more than 10 times improvement in logic density, about 5 times improvement in average net delay, and about 5 times improvement in the critical-path delay for the largest 12 MCNC benchmark circuits over an island-style baseline FPGA with spintronic configuration bits. Full article
Figures

Open AccessArticle
Remote System Setup Using Large-Scale Field Programmable Analog Arrays (FPAA) to Enabling Wide Accessibility of Configurable Devices
J. Low Power Electron. Appl. 2016, 6(3), 14; doi:10.3390/jlpea6030014 -
Abstract
We present a novel remote test system, an integrated remote testing system requiring minimal technology support overhead, enabled by configurable analog–digital Integrated Circuits (IC) to create a simple interface for a wide range of experiments. Our remote test system requires no additional setup,
[...] Read more.
We present a novel remote test system, an integrated remote testing system requiring minimal technology support overhead, enabled by configurable analog–digital Integrated Circuits (IC) to create a simple interface for a wide range of experiments. Our remote test system requires no additional setup, resulting both from using highly configurable devices, as well as from the advancement of straight-forward digital interfaces (i.e., USB) for the resulting experimental system. The system overhead requirements require simple email handling, available over almost all network systems with no additional requirements. The system is empowered through large-scale Field Programmable Analog Array (FPAA) devices and Baseline Tool Framework (BTF), where we present a range of experimentally measured examples illustrating the range of user interfacing available for the remote user. Full article
Open AccessArticle
Scaling Floating-Gate Devices Predicting Behavior for Programmable and Configurable Circuits and Systems
J. Low Power Electron. Appl. 2016, 6(3), 13; doi:10.3390/jlpea6030013 -
Abstract
This paper presents scaling of Floating-Gate (FG) devices, and the resulting implication to large-scale Field Programmable Analog Arrays (FPAA) systems. The properties of FG circuits and systems in one technology (e.g., 350 nm CMOS) are experimentally shown to roughly translate to FG circuits
[...] Read more.
This paper presents scaling of Floating-Gate (FG) devices, and the resulting implication to large-scale Field Programmable Analog Arrays (FPAA) systems. The properties of FG circuits and systems in one technology (e.g., 350 nm CMOS) are experimentally shown to roughly translate to FG circuits in scaled down processes in a way predictable through MOSFET physics concepts. Scaling FG devices results in higher frequency response, (e.g., FPAA fabric) as well as lower parasitic capacitance and lower power consumption. FPAA architectures, limited to 50–100 MHz frequency ranges could be envisioned to operate at 500 MHz–1 GHz for 130 nm line widths, and operate around 4 GHz for 40 nm line widths. Full article
Open AccessArticle
Energy-Efficient Hardware Implementation of an LR-Aided K-Best MIMO Decoder for 5G Networks
J. Low Power Electron. Appl. 2016, 6(3), 12; doi:10.3390/jlpea6030012 -
Abstract
Energy efficiency is a primary design goal for future green wireless communication technologies. Multiple-input multiple-output (MIMO) schemes have been proposed in the literature to improve the throughput of communication systems, and they are expected to play a prominent role in the upcoming fifth
[...] Read more.
Energy efficiency is a primary design goal for future green wireless communication technologies. Multiple-input multiple-output (MIMO) schemes have been proposed in the literature to improve the throughput of communication systems, and they are expected to play a prominent role in the upcoming fifth generation (5G) standard. This paper presents a novel, high-efficiency MIMO decoder based on the K-Best algorithm with lattice reduction. We have designed a novel hardware architecture for this decoder, which was implemented using 32 nm standard CMOS technology. Our results show that the proposed decoder can achieve on average a four-fold reduction in the power costs compared to recently published designs for 5G networks. The throughput of the design is 506 Mbits/s, which is comparable to existing designs. Full article
Open AccessArticle
A Design and Theoretical Analysis of a 145 mV to 1.2 V Single-Ended Level Converter Circuit for Ultra-Low Power Low Voltage ICs
J. Low Power Electron. Appl. 2016, 6(3), 11; doi:10.3390/jlpea6030011 -
Abstract
This paper presents an ultra-low swing level converter with integrated charge pumps that shows measured conversion in a 130-nm CMOS test chip from an input at a 145-mV swing to a 1.2-V output. Lowering the input allowable for a single-ended level converter supports
[...] Read more.
This paper presents an ultra-low swing level converter with integrated charge pumps that shows measured conversion in a 130-nm CMOS test chip from an input at a 145-mV swing to a 1.2-V output. Lowering the input allowable for a single-ended level converter supports energy harvesting systems that need to use very low voltages. Full article
Open AccessArticle
A 0.2 V, 23 nW CMOS Temperature Sensor for Ultra-Low-Power IoT Applications
J. Low Power Electron. Appl. 2016, 6(2), 10; doi:10.3390/jlpea6020010 -
Abstract
We propose a fully on-chip CMOS temperature sensor in which a sub-threshold (sub-VT) proportional-to-absolute-temperature (PTAT) current element starves a current-controlled oscillator (CCO). Sub-VT design enables ultra-low-power operation of this temperature sensor. However, such circuits are highly sensitive to process variations,
[...] Read more.
We propose a fully on-chip CMOS temperature sensor in which a sub-threshold (sub-VT) proportional-to-absolute-temperature (PTAT) current element starves a current-controlled oscillator (CCO). Sub-VT design enables ultra-low-power operation of this temperature sensor. However, such circuits are highly sensitive to process variations, thereby causing varying circuit currents from die to die. We propose a bit-weighted current mirror (BWCM) architecture to resist the effect of process-induced variation in the PTAT current. The analog core constituting the PTAT, the CCO, and the BWCM is operational down to 0.2 V supply voltage. A digital block operational at 0.5 V converts the temperature information into a digital code that can be processed and used by other components in a system-on-chip (SoC). The proposed temperature sensor system also supports resolution-power trade-off for Internet-of-things (IoT) applications with different sampling rates and energy needs. The system power consumption is 23 nW and the maximum temperature inaccuracy is +1.5/−1.7 °C from 0 °C to 100 °C with a two-point calibration. The temperature sensor system was designed in a 130 nm CMOS technology and its total area is 250 × 250 μm2. Full article
Open AccessReview
Mastering the Art of High Mobility Material Integration on Si: A Path towards Power-Efficient CMOS and Functional Scaling
J. Low Power Electron. Appl. 2016, 6(2), 9; doi:10.3390/jlpea6020009 -
Abstract
In this work, we will review the current progress in integration and device design of high mobility devices. With main focus on (Si)Ge for PMOS and In(Ga)As for NMOS, the benefits and challenges of integrating these materials on a Si platform will be
[...] Read more.
In this work, we will review the current progress in integration and device design of high mobility devices. With main focus on (Si)Ge for PMOS and In(Ga)As for NMOS, the benefits and challenges of integrating these materials on a Si platform will be discussed for both density scaling (“more Moore”) and functional scaling to enhance on-chip functionality (“more than Moore”). Full article
Open AccessArticle
A Sub-Threshold 8T SRAM Macro with 12.29 nW/KB Standby Power and 6.24 pJ/access for Battery-Less IoT SoCs
J. Low Power Electron. Appl. 2016, 6(2), 8; doi:10.3390/jlpea6020008 -
Abstract
We present an ultra-low power (ULP) 1 KB SRAM macro for Internet of Things (IoT) battery-less systems-on-chip (SoCs) operating under varying energy harvesting conditions. The unique combination of features within this array allows battery-less SoCs to retain important information for a significantly longer
[...] Read more.
We present an ultra-low power (ULP) 1 KB SRAM macro for Internet of Things (IoT) battery-less systems-on-chip (SoCs) operating under varying energy harvesting conditions. The unique combination of features within this array allows battery-less SoCs to retain important information for a significantly longer period of time when energy harvesting conditions are poor. The array uses 8T high-threshold (high-VT) static random access memory (SRAM) cells with word line boosting to eliminate write failures coupled with a read-before-write scheme to address read-disturb in half-selected cells. Due to the reduced on current in high-VT devices, read word line boosting is implemented to improve the drive strength of the read buffer, and to eliminate read failures. Leakage currents through the unselected cells during a read operation is addressed by boosting the footer virtual VSS (VVSS) of the read port to the supply voltage (VDD). To reduce the power consumption of instruction memories in battery-less SoCs, two features were utilized in this array: a read burst mode is used when reading consecutive addresses to reduce the read energy, and instructions with higher percentages of “1” data are defined since reading a “1” is less costly than reading a “0” in 8T cells. The proposed array can operate at a wide range of supply voltages (350–700 mV) and has two ULP modes: standby with retention (1.5 pW/bit) and shutdown without retention (0.13 pW/bit). Aggressive power gating of all peripherals during the standby state reduces the array power consumption down to 12.29 nW/KB at 320 mV with data retention. Compared to previously published 8T arrays, the proposed design provides the lowest standby power. The complete shutdown of the array allows further reduction down to 1.09 nW/KB and is suitable for reducing the power consumption of data memories in battery-less SoCs. The measured results from a commercial 130 nm chip show that the proposed array consumes a minimum of 6.24 pJ/access with a 17.16 nW standby power at 400 mV. The read burst mode allows up to 22% reduction in energy/access at 400 mV. Full article
Open AccessArticle
A 36 nW, 7 ppm/°C on-Chip Clock Source Platform for Near-Human-Body Temperature Applications
J. Low Power Electron. Appl. 2016, 6(2), 7; doi:10.3390/jlpea6020007 -
Abstract
We propose a fully on-chip clock-source system in which an ultra-low-power diode-based temperature-uncompensated oscillator (OSCdiode) serves as the main clock source and frequency locks to a higher-power temperature-compensated oscillator (OSCcmp) that is disabled after each locking event to save
[...] Read more.
We propose a fully on-chip clock-source system in which an ultra-low-power diode-based temperature-uncompensated oscillator (OSCdiode) serves as the main clock source and frequency locks to a higher-power temperature-compensated oscillator (OSCcmp) that is disabled after each locking event to save power. The locking allows the stability of the uncompensated oscillator to stay within the stability bound of the compensated design. This paper demonstrates the functionality of a locking controller that uses a periodic (counter-based) scheme implemented on-chip and a prediction (temperature-drift-based) scheme. The flexible clock source platform is validated in a 130 nm CMOS technology. In the demonstrated system, it achieves an effective average temperature stability of 7 ppm/°C in the human body temperature range from 20 °C to 40 °C with a power consumption of 36 nW at 0.7 V. It achieves a frequency range of 12 kHz to 150 kHz at 0.7 V. Full article
Open AccessArticle
Toward a Faster Screening of Faulty Digital Chips via Current-Bound Estimation Based on Device Size and Threshold Voltage
J. Low Power Electron. Appl. 2016, 6(2), 6; doi:10.3390/jlpea6020006 -
Abstract
Observations of peak and average currents are important for designed circuits, as faulty circuits have abnormal peaks and average currents. Using current bounds to detect faulty chips is a comparatively innovative idea, and many advanced schemes without them use it as a component
[...] Read more.
Observations of peak and average currents are important for designed circuits, as faulty circuits have abnormal peaks and average currents. Using current bounds to detect faulty chips is a comparatively innovative idea, and many advanced schemes without them use it as a component in statistical outlier analysis. However, these previous research works have focused on the discussion of the testing impact without a proposed method to define reference current bounds to find faulty chips. A software framework is proposed to synthesize high-performance, power-performance optimized, noise-immune, and low-power circuits with current-bound estimations for testing. This framework offers a rapid methodology to quickly screen potential faulty chips by using the peak and average current bounds for different purposed circuits. The proposed estimation technique generates suitable reference current bounds from transistor threshold voltage and size adjustments. The SPICE-level simulation leads to the most accurate estimations. However, such simulations are not feasible for a large digital circuit. Hence, this work proposes constructing a feasible gate-level software framework for large digital circuits that will serve all of simulation purposes. In comparison with transistor-level Nanosim simulations, the proposed gate-level simulation framework has a margin of error of less than 2% in the peak current, and the computation time is 334 times faster. Full article