Ultralow-Power SOTB CMOS Technology Operating Down to 0.4 V

OPEN ACCESS. J. Low Power Electron. Appl. 2014, 4.

Ultralow-voltage (ULV) CMOS will be a core building block of highly energy-efficient electronics. Although operation at the minimum energy point (MEP) is effective for ultralow-power (ULP) CMOS circuits, its slow operating speed means that it is not used in many applications. Silicon-on-thin-buried-oxide (SOTB) CMOS is a strong candidate for ULP electronics because of its small variability and back-bias control. Proper power and performance optimization with adaptive $V_{th}$ control, taking advantage of SOTB's features, can achieve ULP operation with acceptably high speed and low leakage. This paper describes our results on the ULV operation of logic circuits (CPU, SRAM, ring oscillator, and other logic circuits) and shows that the operation speed is now sufficiently high for many ULP applications. The "Perpetuum-Mobile" micro-controllers operating down to 0.4 V or lower are expected to be implemented in a huge number of electronic devices in the internet-of-things (IoT) era.


Introduction: Issues for ULV Operation Possibly Staying on MEP Point
A huge number of small electronic devices contributing to big data are expected to be used across the globe as the "internet of things" (IoT). The CMOS integrated circuit is a core part of these devices, so the energy efficiency of CMOS circuits should be greatly improved. The operating voltage ($V_{dd}$) is well known to be the primary parameter for reducing the energy per operation cycle in CMOS circuits. As shown in Figure 1, the energy is the sum of the active ($E_{ac}$) and leakage ($E_{leak}$) energy, which Equation (1) gives in simplified form:

$$E = E_{ac} + E_{leak} = a\,C_{load}\,V_{dd}^{2} + \frac{I_{leak}\,V_{dd}}{f} \qquad (1)$$

where $C_{load}$, $I_{leak}$, $a$, and $f$ denote load capacitance, leakage current, activity, and frequency, respectively. With decreasing $V_{dd}$, $E_{ac}$ decreases since it is proportional to $C_{load}V_{dd}^{2}$. However, $E_{leak}$ increases in relative terms because $f$ decreases with decreasing $V_{dd}$. These two terms determine the minimum energy point (MEP).

The energy efficiency of CMOS circuits has been greatly improved by the miniaturization of CMOS transistors. This improvement has mainly come from the reduction of $E_{ac}$ through the reduction of $C_{load}$ with transistor scaling. However, most circuits do not operate at the MEP, so the efficiency improvement has been incomplete so far. In recent generations, scaling has increased the $V_{dd}$ at the MEP, as shown in Figure 2 [1,2]. This is because $E_{leak}$ has tended to increase with miniaturization in recent generations, especially for performance-oriented applications. Even in energy-efficiency-conscious designs such as [1], $E_{min}$ already shows an increasing trend (its minimum is at the 90-nm node), as shown in Figure 2.
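The trade-off behind Equation (1) can be illustrated numerically. The following is a minimal sketch, not the authors' model: it takes the simplified form $E = a\,C_{load}V_{dd}^{2} + I_{leak}V_{dd}/f$ implied by the symbol definitions in the text, and pairs it with an assumed smooth sub-/super-threshold drive-current model for $f$ and an assumed exponential leakage model. The functions `frequency`, `leakage_current`, and `find_mep`, and every parameter value (`f0`, `i0`, `a`, `c_load`, the slope factor), are hypothetical and chosen only so that the minimum falls inside the swept range.

```python
import math

# Assumed subthreshold slope factor (1.5) times thermal voltage (26 mV).
N_VT = 1.5 * 0.026

def frequency(vdd, vth, f0=1e7):
    """Assumed clock frequency in Hz: smooth drive-current model divided by vdd."""
    drive = math.log1p(math.exp((vdd - vth) / (2 * N_VT))) ** 2
    return f0 * drive / vdd

def leakage_current(vth, i0=1.0):
    """Assumed total circuit leakage in A, exponential in -vth."""
    return i0 * math.exp(-vth / N_VT)

def energy_per_cycle(vdd, vth, a=0.1, c_load=1e-9):
    """Simplified Equation (1): returns (E_ac, E_leak) in joules."""
    e_ac = a * c_load * vdd ** 2
    e_leak = leakage_current(vth) * vdd / frequency(vdd, vth)
    return e_ac, e_leak

def find_mep(vth, v_lo=0.10, v_hi=1.20, step_mv=10):
    """Grid search over vdd for the minimum total energy per cycle."""
    best = None
    for mv in range(round(v_lo * 1000), round(v_hi * 1000) + 1, step_mv):
        vdd = mv / 1000
        total = sum(energy_per_cycle(vdd, vth))
        if best is None or total < best[1]:
            best = (vdd, total)
    return best

vdd_mep, e_min = find_mep(vth=0.30)
print(f"MEP at V_dd = {vdd_mep:.2f} V, E_min = {e_min * 1e12:.1f} pJ/cycle")
```

With these illustrative parameters the active term dominates at high $V_{dd}$ and the leakage term dominates at low $V_{dd}$, so the total has an interior minimum, which is the qualitative behavior sketched in Figure 1.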
Near- or sub-$V_{th}$ operation is attractive for improving the energy efficiency. The operating speed of such circuits, however, is not high: the maximum frequency drops rapidly with decreasing $V_{dd}$ in conventional CMOS [3]. In device design for ULP circuits, it is important to optimize both $V_{dd}$ and $V_{th}$. With decreasing $V_{dd}$, $V_{th}$ should increase to minimize the energy [4]. This drastically decreases the frequency; in many cases MEP operation is sub-$V_{th}$ operation, and its frequency is below the MHz range. The variable-$V_{th}$ approach with adaptive back-bias control can mitigate this situation by optimizing the frequency while lowering the energy as far as possible toward the MEP value. Both $V_{dd}$ and $V_{th}$ are controlled to minimize the energy while satisfying the required workload, i.e., the required frequency. In dynamic voltage and frequency scaling (DVFS), only $V_{dd}$ is controlled. To achieve higher energy efficiency, $V_{th}$ control should accompany $V_{dd}$ control.

The variability of transistor characteristics is well known to be a major obstacle to the performance/power tradeoff, especially at low $V_{dd}$. Increasing variability also increases the $V_{dd}$ at the MEP [5], simply because the leakage current of a circuit, which is the sum of the transistor leakages, increases [6]. Moreover, increasing transistor variation causes delay variation, especially at low $V_{dd}$, and leads to a significant performance drop [7]. Designing to cope with the increasing variability at low $V_{dd}$ becomes more complex. Variability-tolerant design prefers a larger transistor width $W$; however, this directly increases the power [8,9]. Another variability-tolerant logic design style prefers fewer pipeline stages and a longer logic depth. However, these design strategies decrease the frequency [10,11] and increase $E_{min}$.
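The difference between DVFS (only $V_{dd}$ is tuned) and joint $V_{dd}$/$V_{th}$ control can be sketched with a small grid search. This is an illustrative toy, not the authors' optimization flow: the frequency and leakage models, the parameters `f0`, `i0`, and `a_c_load`, the fixed $V_{th}$ of 0.30 V used for the DVFS case, and the two workloads are all assumptions. The only property that holds by construction is that the adaptive search, whose grid contains the fixed-$V_{th}$ point, can never do worse than DVFS alone.

```python
import math

# Assumed subthreshold slope factor (1.5) times thermal voltage (26 mV).
N_VT = 1.5 * 0.026

def frequency(vdd, vth, f0=1e7):
    """Assumed clock frequency in Hz (smooth sub-/super-threshold model)."""
    return f0 * math.log1p(math.exp((vdd - vth) / (2 * N_VT))) ** 2 / vdd

def energy(vdd, vth, a_c_load=1e-10, i0=1.0):
    """Energy per cycle in J: active term plus leakage term."""
    i_leak = i0 * math.exp(-vth / N_VT)
    return a_c_load * vdd ** 2 + i_leak * vdd / frequency(vdd, vth)

def best_point(f_req, vth_grid_mv, vdd_grid_mv=range(100, 1201, 10)):
    """Lowest-energy (energy, vdd, vth) on the grid that still meets f_req."""
    feasible = [
        (energy(vdd_mv / 1000, vth_mv / 1000), vdd_mv / 1000, vth_mv / 1000)
        for vth_mv in vth_grid_mv
        for vdd_mv in vdd_grid_mv
        if frequency(vdd_mv / 1000, vth_mv / 1000) >= f_req
    ]
    return min(feasible) if feasible else None

results = {}
for f_req in (1e6, 200e6):  # hypothetical light and heavy workloads
    fixed = best_point(f_req, vth_grid_mv=[300])                   # DVFS only
    adaptive = best_point(f_req, vth_grid_mv=range(100, 501, 10))  # V_dd and V_th
    results[f_req] = (fixed, adaptive)
    print(f"f_req = {f_req / 1e6:.0f} MHz: "
          f"DVFS {fixed[0] * 1e12:.1f} pJ at V_dd = {fixed[1]:.2f} V; "
          f"adaptive {adaptive[0] * 1e12:.1f} pJ at "
          f"V_dd = {adaptive[1]:.2f} V, V_th = {adaptive[2]:.2f} V")
```

In this toy model the search picks a higher $V_{th}$ for the light workload and a lower $V_{th}$ for the heavy one, which is the kind of workload-dependent $V_{th}$ adjustment that back-bias control makes possible.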
We hypothesize that the main issues for highly energy-efficient CMOS circuits are adaptive $V_{th}$ control and small variability, as described in this section. To solve these issues, we are developing silicon on thin buried oxide (SOTB) [12–16]. In this article, we show SOTB's low-voltage capability, including small variability and back-bias control, through device and circuit results.

There are four major factors: (i) small local variability due to the low-impurity fully depleted SOI channel; (ii) high back-bias coefficient due to the thin BOX layer and doped ground plane (n GP and p GP) just below the BOX layer; (iii) flexible $V_{th}$ tuning by impurity density control of the ground plane; and (iv) high design compatibility with conventional CMOS due to a planar layout mostly identical to bulk and a hybrid bulk integration for I/O. Details of the SOTB fabrication process are reported elsewhere [15,16].

Figure 3. Schematic cross section of silicon on thin buried oxide (SOTB) and hybrid bulk transistors. $V_{bp}$ and $V_{bn}$ denote the back-bias terminals for p- and n-type SOTB, respectively.
$V_{th}$ control optimized for ULV operation has been an important process issue. In this optimization, we set the $V_{th}$ values at around 0.2 V, about 0.2 V lower than for the low-standby-power (LSTP) application, with a multiple-$V_{th}$ option using a poly-silicon and high-k gate-stack technology and proper doping profile control of the ground plane [15]. As shown in Figure 4, we mix a small amount of high-k (Hf and Al) oxide into the conventional SiON gate dielectric and obtain the proper effective work function (EWF) for both NMOS and PMOS. Typical $I_d$-$V_g$ characteristics of the triple-$V_{th}$ option are shown in Figure 5. LVT, SVT, and HVT denote the low, standard, and high $V_{th}$ options, respectively. The off-state leakage current can be varied by two orders of magnitude solely by changing the doping density below the BOX layer, with the same gate stack.

We have demonstrated a significant reduction of the $V_{th}$ variability. The Pelgrom coefficient ($A_{VT}$) of SOTB was about 1.2-1.3 mV·μm [16], which is less than half that of bulk. Moreover, we measured the $V_{th}$ variation of one million transistors (gate length and width: 0.06 and 0.14 μm, respectively) [16], as shown in Figure 6, and confirmed a regular distribution with no dropout transistors. The variation of the on-state current $I_{on}$ is important and must be decreased, since it directly affects the delay variation of circuits. The $I_{on}$ variation was demonstrated to be less than half that of bulk [17], and we confirmed a significant reduction of the $I_{on}$ variability for one million transistors, as shown in Figure 7 [16]. The lowest $V_{th}$ values of SOTB and bulk are the same, as shown in Figure 6. This means that the highest leakage current among the one million transistors is the same. On the other hand, the transistor with the highest $V_{th}$ determines the delay. As shown in Figure 7, the smallest $I_{on}$ value for SOTB is about twice as high as the worst value for bulk. This is a strong advantage of SOTB's small variability in terms of circuit performance.
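To give a feel for what the quoted Pelgrom coefficient means for the measured transistor size, the sketch below applies the standard Pelgrom relation $\sigma(V_{th}) = A_{VT}/\sqrt{L\,W}$ to the numbers in the text. Whether the reported $A_{VT}$ refers to single-device or matched-pair variation is not stated here, so the result should be read as an order-of-magnitude estimate; the function name `sigma_vth_mv` is hypothetical.

```python
import math

def sigma_vth_mv(a_vt_mv_um, length_um, width_um):
    """Pelgrom relation: sigma(V_th) = A_VT / sqrt(L * W), result in mV."""
    return a_vt_mv_um / math.sqrt(length_um * width_um)

L_UM, W_UM = 0.06, 0.14   # gate length and width quoted in the text

for a_vt in (1.2, 1.3):   # SOTB Pelgrom coefficient range quoted in the text
    sigma = sigma_vth_mv(a_vt, L_UM, W_UM)
    print(f"A_VT = {a_vt} mV*um -> sigma(V_th) = {sigma:.1f} mV")
```

For this geometry the estimate comes out at roughly 13 to 14 mV. Since the text puts the bulk coefficient at more than twice the SOTB value, the same relation would give more than twice this spread for an identically sized bulk transistor.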

$V_{min}$ Reduction of 6T-SRAM and Leakage Control by Back-Bias
Thanks to the significant reduction of the variability shown in Figures 6 and 7, we successfully demonstrated 2-Mbit SRAM operation at $V_{min}$ = 0.37 V [16] with the standard six-transistor (6T) layout and without assist circuits, as shown in Figure 8. Owing to the $V_{th}$ optimization mentioned in the previous section, a very small access time (5.5 ps at $V_{dd}$ = 0.4 V) was demonstrated. This enables circuit operation with SRAM at several tens of MHz. The standby leakage decreased by more than two orders of magnitude under reverse back-biasing and reached 1.2 pA/cell without destroying the data.

Moreover, the above $V_{min}$ value can be kept low at elevated temperatures with proper back-bias control. In Figure 9, the $V_{min}$ value at room temperature was the same as the value shown in Figure 8. At 80 °C, the $V_{min}$ value increased to 0.46 V. This is because the $V_{th}$ values of the NMOS and PMOS transistors shifted from their room-temperature values by different amounts. Although both $V_{th}$ values became smaller than at room temperature (which raised the leakage current by about two orders of magnitude), the balance of the $V_{th}$ values (current drivabilities) between the NMOS and PMOS transistors was lost. This degraded the SRAM cell stability and increased $V_{min}$. By applying proper back-bias voltages to both transistors, the $V_{min}$ value was again reduced to less than 0.4 V, as shown in Figure 9. The leakage current can also be minimized by back-bias control regardless of temperature, to roughly the room-temperature value.

Ring Oscillator Circuit Results
We have developed a standard logic cell library for the SOTB technology with a hybrid bulk I/O library. The delay characteristics of the cells were evaluated through ring oscillator (RO) measurements [18]. Figure 10 shows the propagation delay $t_{pd}$ of a 101-stage inverter RO for SOTB and bulk. The delay of SOTB was 42% smaller than that of bulk at $V_{dd}$ = 0.4 V. Note that the $V_{th}$ values of SOTB and bulk were the same at $V_{dd}$ = 0.4 V. The speed gain of SOTB was higher at lower $V_{dd}$ because of its better $I_{eff}/I_{off}$ and smaller DIBL.

The delay variability was then investigated. The standard deviation of $t_{pd}$ for SOTB exhibited a very weak $1/\sqrt{N}$ dependence on the number of stages $N$. This result means that the local delay variability of SOTB is very small. We also succeeded in significantly reducing the die-to-die delay variability by proper back-biasing [19]. Logic circuits contain various types of logic cells, such as inverters, NANDs, and NORs. We found that back-biasing that takes the drivability balance of the NMOS and PMOS transistors into account is essential for effectively suppressing the die-to-die delay variability across the various cell types.

The minimum energy consumption of SOTB logic circuits of 50 kgates was estimated from the RO results by optimizing the back-bias voltage [20]. At the same energy per operation, SOTB operates about 10 times faster than bulk. A power consumption of 44 μW at 10 MHz (4.4 pJ/cycle) is expected at $V_{dd}$ = 0.33 V, whereas bulk operates at 1 MHz with the same energy per cycle as SOTB.

Figure 10. Inverter delay $t_{pd}$ as a function of $V_{dd}$ [18].
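The relation between the quoted power, frequency, and energy per cycle is simply $E = P/f$. The short sketch below checks the 44 μW / 10 MHz / 4.4 pJ figures from the text and derives the power that the bulk circuit would draw at 1 MHz with the same energy per cycle. The helper names are hypothetical, and the bulk power is a derived value, not a number reported in the source.

```python
def energy_per_cycle_pj(power_uw, freq_mhz):
    """E = P / f. Microwatts divided by megahertz gives picojoules."""
    return power_uw / freq_mhz

def power_uw(energy_pj, freq_mhz):
    """P = E * f. Picojoules times megahertz gives microwatts."""
    return energy_pj * freq_mhz

sotb_energy_pj = energy_per_cycle_pj(44.0, 10.0)  # SOTB: 44 uW at 10 MHz
bulk_power_uw = power_uw(sotb_energy_pj, 1.0)     # bulk: same energy, 1 MHz
speed_ratio = 10.0 / 1.0                          # SOTB clock / bulk clock

print(f"SOTB energy per cycle: {sotb_energy_pj:.1f} pJ")
print(f"Bulk power at 1 MHz with the same energy per cycle: {bulk_power_uw:.1f} uW")
print(f"Throughput ratio at equal energy per cycle: {speed_ratio:.0f}x")
```

At equal energy per cycle, a fixed workload costs both circuits the same total energy, but the SOTB circuit finishes it in one tenth of the time.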

Demonstration of ULV and ULP Operation of Logic Circuits
The design flow for SOTB integrated circuits is basically the same as the conventional one. Using our newly developed design flow with the SOTB/bulk hybrid library, several ULV circuits were designed. A significant power reduction was demonstrated by post-layout timing and power analysis. A reconfigurable accelerator named cool mega array (CMA) was designed, and silicon results were obtained [21]. The bulk CMA operates at 0.8-1.2 V (with dynamic voltage scaling) and 210 MHz, and the SOTB version operates at 0.4 V (with back-biasing) and 65 MHz. The energy efficiency when executing the Alpha blender test program was 38 and 65 MOPS/mW for bulk and SOTB, respectively.
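The MOPS/mW figures can be restated as energy per operation, which makes the speed-versus-efficiency trade-off between the two CMA versions easier to read. This is a minimal sketch using only the numbers quoted above; the conversion relies on 1 MOPS/mW being $10^{9}$ operations per joule, and the helper name `pj_per_op` is hypothetical.

```python
def pj_per_op(mops_per_mw):
    """1 MOPS/mW = 1e9 operations per joule, so E_op = 1000 / x in pJ."""
    return 1000.0 / mops_per_mw

# (energy efficiency in MOPS/mW, clock in MHz) as quoted in the text
cma = {"bulk": (38.0, 210.0), "SOTB": (65.0, 65.0)}

for name, (efficiency, clock_mhz) in cma.items():
    print(f"{name}: {pj_per_op(efficiency):.1f} pJ/op at {clock_mhz:.0f} MHz")

efficiency_gain = cma["SOTB"][0] / cma["bulk"][0]
clock_ratio = cma["SOTB"][1] / cma["bulk"][1]
print(f"SOTB/bulk efficiency gain: {efficiency_gain:.2f}x, "
      f"clock ratio: {clock_ratio:.2f}x")
```

Read this way, the SOTB version spends roughly 40% less energy per operation while running at about one third of the bulk clock rate.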
Back-bias control offers a strong advantage for FPGA circuits. The SOTB version of the flex-power FPGA was implemented for the first time, and silicon results were obtained [22]. After FPGA configuration, back-bias control allows only the critical-path logic elements to be set to low $V_{th}$. This significantly reduces the leakage power with no penalty in operating speed.
A high-efficiency back-bias voltage generator is important for the SOTB technology, since the standby leakage current is kept low by applying a reverse back-bias voltage. An advantage of the SOTB technology is that the current load of the back-bias generator is very small, because the back-gate region of the SOTB transistor is electrically isolated by the BOX layer. This leads to a significant reduction in the current consumption of the back-bias generator itself. We designed the generator circuit using the standard Dickson charge pump for the SOTB and bulk hybrid platform, and silicon results were obtained [23]. The generator operates at $V_{dd}$ = 0.1 V and higher, and generates sufficient back-bias voltages for NMOS and PMOS of 0.85 and −1.5 V, respectively, at $V_{dd}$ = 0.4 V with a current consumption of only 13 μA. By applying these back-bias voltages, the leakage current of a 500-kgate logic circuit was reduced to 2 μA, corresponding to 4 pA/gate. Several points remain to be optimized, and the generator current consumption is still higher than our target specification. This optimization is now under way.
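A quick budget calculation shows why the generator current is the remaining target for optimization. The sketch below reproduces the 4 pA/gate figure from the text and compares the generator current with the residual leakage. It assumes, purely for illustration, that both currents are drawn from the same 0.4 V supply and simply add; the source does not state this, and the variable names are hypothetical.

```python
GATES = 500_000          # logic circuit size quoted in the text
LEAK_TOTAL_UA = 2.0      # leakage with reverse back-bias (from the text)
GEN_CURRENT_UA = 13.0    # generator current at V_dd = 0.4 V (from the text)
VDD = 0.4                # supply voltage in volts

# 1 uA = 1e6 pA
leak_per_gate_pa = LEAK_TOTAL_UA * 1e6 / GATES

# Assumption: generator and leakage currents share one 0.4 V supply and add.
standby_total_ua = LEAK_TOTAL_UA + GEN_CURRENT_UA
generator_share = GEN_CURRENT_UA / standby_total_ua
standby_power_uw = standby_total_ua * VDD

print(f"Leakage per gate: {leak_per_gate_pa:.1f} pA")
print(f"Generator share of standby current: {generator_share:.0%}")
print(f"Standby power under the shared-supply assumption: {standby_power_uw:.1f} uW")
```

Under this assumption the generator accounts for most of the standby current, which is consistent with the statement that its consumption is still above the target specification.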
We have confirmed successful operation [24] of the prototype ULV micro-controller chip shown in Figure 11. This chip is composed of a 32-bit RISC CPU with a five-stage pipeline, 144 kByte of SRAM, and interfaces (ROM, UART, SPI, and GP), and can be connected to sensors and RF modules to form a sensor-network node. The micro-controller chip operates at $V_{dd}$ = 0.35 V and consumes only $E$ = 13.4 pJ per cycle ($f$ = 14 MHz), as shown in Figure 12. Note that the $E$ values for SOTB and bulk at $V_{dd}$ ≥ 0.8 V are identical because $E$ is determined by $E_{ac}$ in this region (same $C_{load}$ in the same 65-nm technology). The sleep current is only 0.14 μA. By taking advantage of the ULP capability of our SOTB micro-controller chip, named "Perpetuum-Mobile", the sensor node is expected to operate at a sufficient frequency (>10 MHz) for a long period on a single battery, or even longer with an energy harvester.
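The quoted energy per cycle and sleep current can be turned into a rough power budget for a duty-cycled sensor node. The sketch below is an estimate under stated assumptions, not a result from the source: it assumes the sleep current is drawn at the same 0.35 V supply, and the duty cycles and the 675 mWh battery capacity are hypothetical. It also ignores battery self-discharge, regulator losses, and the power of sensors and radio, all of which would shorten real lifetimes considerably.

```python
E_CYCLE_PJ = 13.4   # energy per cycle at V_dd = 0.35 V (from the text)
F_MHZ = 14.0        # clock frequency (from the text)
VDD = 0.35          # supply voltage in volts (from the text)
SLEEP_UA = 0.14     # sleep current (from the text)

active_uw = E_CYCLE_PJ * F_MHZ   # pJ * MHz = uW
sleep_uw = SLEEP_UA * VDD        # assumption: sleep current drawn at 0.35 V

def average_power_uw(duty):
    """Average power for a node that is active for a fraction `duty` of the time."""
    return duty * active_uw + (1.0 - duty) * sleep_uw

def lifetime_years(battery_mwh, duty):
    """Idealized lifetime: battery energy divided by average power."""
    hours = battery_mwh * 1000.0 / average_power_uw(duty)
    return hours / (24 * 365)

BATTERY_MWH = 675.0  # hypothetical coin-cell-sized energy budget
print(f"Active power: {active_uw:.1f} uW, sleep power: {sleep_uw * 1000:.0f} nW")
for duty in (1.0, 0.1, 0.01):  # hypothetical duty cycles
    print(f"duty {duty:>4}: {average_power_uw(duty):7.2f} uW average, "
          f"{lifetime_years(BATTERY_MWH, duty):.1f} years (idealized)")
```

Because the sleep power is several thousand times smaller than the active power under this assumption, the average power scales almost linearly with the duty cycle down to very low activity levels.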

Conclusions
Silicon on thin buried oxide (SOTB) is suitable for ULV operation thanks to its small variability and back-gate bias controllability. We have demonstrated a significant variability reduction, 0.4-V operation of SRAM, and reduced power consumption of logic circuits, including a micro-controller chip, with a significant speed gain even at ULV. Many ULP applications are expected for SOTB chips. The "Perpetuum-Mobile" micro-controller chips will work as core electronic parts in various types of electronic apparatus in the "internet of things".

Figure 1. Schematic relationship between energy per operation $E$ and operating voltage $V_{dd}$.

Figure 2. Minimum energy $E_{min}$ and $V_{dd}$ at the MEP as a function of technology node, after [1,2].


Figure 4. Effective work function (EWF) control with the high-k/SiON gate stack [15]. Circles and diamonds represent the EWF of the P-type and N-type gate stacks for PMOS and NMOS, respectively.

Figure 6. $V_{th}$ distribution of 1 M transistors [16]. The vertical-axis value shows the deviation from the median $V_{th}$ value. Positive or negative σ values indicate that the corresponding $V_{th}$ value is above or below the median, respectively.

Figure 8. Fail-bit count of the 2-Mbit SRAM array as a function of $V_{dd}$ [16].

Figure 9. $V_{min}$ of the 2-Mbit SRAM array as a function of back-bias voltage $|V_b|$, the absolute value of the back-bias voltage for NMOS and PMOS: $|V_b| = |V_{bn}| = |V_{bp}|$.

Figure 12. Energy per cycle $E$ as a function of operating voltage $V_{dd}$ for the "Perpetuum-Mobile" micro-controller chip [24].