JLPEA 2012, 2(4), 282-300; doi:10.3390/jlpea2040282
Abstract: This paper presents an energy efficient bootstrapped CMOS driver that enhances the switching speed when driving large RC loads in ultra-low-voltage CMOS VLSI. The proposed bootstrapped driver eliminates the leakage paths in the conventional bootstrapped driver to achieve and maintain more positive and more negative boosted voltage levels at the boosted nodes, thus improving boosting efficiency and enhancing driver switching speed. Measured performance from test chips implemented in UMC 65 nm low-power CMOS technology (VTN ≈ VTP ≈ 0.5 V) indicates that, compared with the conventional bootstrapped driver, the proposed driver provides a rising-delay improvement of 37%–50% and a falling-delay improvement of 25%–47% at 0.3 V for loads ranging from 0 to 24 mm of M6 metal line. Although designed and optimized for subthreshold ultra-low-voltage operation, the proposed bootstrapped driver is shown to be advantageous at higher, near-threshold supply voltages as well. The proposed driver provides a rising-delay improvement of 20%–52% and a falling-delay improvement of 23%–43% for VDD ranging from 0.3 V to 0.5 V, while consuming about 15% less average power than the conventional bootstrapped driver when driving a 16 mm long M6 wire.
1. Introduction
With the scaling of semiconductor technology, more and more transistors are integrated on a single chip. Thus, low-power circuit design has become increasingly crucial in nanoscale CMOS VLSI. One effective way of reducing the power consumption of a digital CMOS VLSI circuit is to scale down the Vdd supply voltage, which reduces both active energy and leakage power. For some emerging applications, such as implantable devices, medical instruments and wireless sensor networks, the supply voltage may be scaled down to the subthreshold region to prolong battery life [1,2]. However, reducing the supply voltage causes a substantial speed penalty, since the drain-source voltage and gate-source voltage of CMOS transistors are simultaneously reduced, leading to significant degradation of driving current and switching speed.
On the other hand, as the complexity of nanoscale CMOS VLSI grows, long interconnect metal lines and output drivers with a large distributed RC loading constrain the achievable performance. In order to drive a large distributed RC load, the design of an energy efficient driver circuit has become a critical concern for switching speed and power consumption. Since the threshold voltage of a CMOS device cannot be easily scaled down with supply voltage, the design of an energy efficient high-performance driver operating at a low supply voltage (or subthreshold voltage level) poses a significant challenge [3,4,5]. A bootstrapped CMOS driver circuit (shown in Figure 1) was proposed previously. This driver mainly consists of positive and negative voltage bootstrapped circuits and a pair of corresponding driver transistors (MP1 and MN1). When the bootstrapped driver operates, the gate nodes of the driver transistors, V1n and V1p, are boosted above Vdd and below ground, respectively, to enhance the current driving ability of the driver transistors (MN1 and MP1), thus improving the switching speed of circuits with a large distributed RC loading. The enhanced current driving ability of the driver circuit is determined by the voltage levels of nodes V1n and V1p, which are boosted by two coupling capacitors, Cbn and Cbp, respectively. The extent of the internal voltage undershoot (overshoot) of V1p (V1n) is determined by the ratio of Cbp (Cbn) to the parasitic capacitance at the right side of Cbp (Cbn), due to charge sharing. Ideally, by using a large Cbp (Cbn), the boosted voltage level of V1p (V1n) can be pulled down (up) to near −Vdd (2Vdd). Thus, almost a −Vdd (2Vdd) voltage level is fed to the gate node of the driver transistor MP1 (MN1), providing a bootstrapping effect to enhance the driving capability.
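The charge-sharing relation described above can be sketched numerically. The following is a minimal illustration, assuming an ideal capacitive divider with no leakage; the function name `boosted_levels` and the parasitic capacitance values are hypothetical and are not taken from the paper.

```python
def boosted_levels(vdd, c_boost, c_par):
    """Ideal boosted gate levels of a bootstrapped driver (no leakage).

    vdd     : supply voltage in volts
    c_boost : boosting capacitor (Cbn or Cbp) in farads
    c_par   : parasitic capacitance on the boosted side in farads
              (assumed value, not given in the source)

    Returns (v_high, v_low): the overshoot level for the NMOS gate node
    and the undershoot level for the PMOS gate node.
    """
    if c_boost <= 0 or c_par < 0:
        raise ValueError("c_boost must be positive and c_par non-negative")
    # Fraction of the Vdd swing that is coupled onto the floating node.
    ratio = c_boost / (c_boost + c_par)
    v_high = vdd + vdd * ratio   # approaches 2*Vdd as c_boost >> c_par
    v_low = -vdd * ratio         # approaches -Vdd as c_boost >> c_par
    return v_high, v_low


if __name__ == "__main__":
    # Hypothetical sweep of the parasitic capacitance at Vdd = 0.3 V.
    for c_par in (0.0, 5e-15, 10e-15, 40e-15):
        hi, lo = boosted_levels(0.3, 40e-15, c_par)
        print(f"Cpar = {c_par * 1e15:4.0f} fF -> {hi:.3f} V / {lo:.3f} V")
```

With zero parasitic capacitance the levels reach 2Vdd and −Vdd exactly; any parasitic capacitance, or any leakage that this sketch ignores, pulls the levels back toward the rails.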
Unfortunately, for the conventional bootstrapped CMOS driver in Figure 1, even with a large boosting capacitor Cbp (Cbn), near −Vdd (2Vdd) voltage value cannot be generated and maintained at the corresponding gate node of the driver device, due to a leakage path through MN4 (MP4), which charges (discharges) node V1p (V1n). During the input-rising transition, V2p is boosted to below-ground voltage level (−Vdd), which is passed through an “On” transistor MN3 to node V1p to enhance the gate-drive of MP1. However, as V2p is over-driven to the below-ground level, a leakage current flows from the ground through MN4 (with its gate at the ground) to charge up V2p (and hence V1p), thus reducing/limiting the gate over-drive of MP1 and the boosting efficiency. The complementary situation occurs where the leakage path through the transistor MP4 discharges V2n (and hence V1n) during the input-falling transition.
In addition to the conventional bootstrapped driver, several bootstrapped driver designs have been reported [6,7,8,9,10]. Two of these designs utilize pre-charged capacitors to couple the output loading charges directly, thus incurring a large area overhead, as the capacitors have to be sized in proportion to the large output loading. The area overhead is exacerbated by the fact that a capacitor implemented with a MOS structure has quite poor area efficiency (low capacitance per unit area) in low-voltage subthreshold operation. Another bootstrapped CMOS driver provides a short turn-on delay for faster speed. It, however, requires almost twice the transistor count of the conventional bootstrapped driver and has larger parasitic capacitance at the charge-sharing gate nodes of the output driver devices, resulting in more power dissipation and degraded boosting efficiency. A further bootstrapped driver requires four capacitors, twice as many as the conventional bootstrapped driver, and hence incurs a large area overhead. It also has a large cross-over current and, hence, a large power consumption. A conditional bootstrapped driver circuit employs only one capacitor, but needs two independent input signals to determine the state of the conditional output.
In this paper, an energy efficient bootstrapped CMOS driver is proposed to overcome the shortcomings of the conventional bootstrapped driver. The proposed succinct single-input to single-output scheme eliminates the leakage paths of the conventional scheme to prevent charging/discharging of the boosted nodes, thus raising the boosted voltage levels and allowing them to be maintained, which improves the boosting efficiency and switching speed. The proposed scheme requires the same number (i.e., two) of boosting capacitors and only four more transistors than the conventional bootstrapped driver. In Section 2, we describe the proposed bootstrapped driver circuit structure and its operation. Section 3 presents the detailed test chip implementation in UMC 65 nm low-power CMOS technology. Section 4 describes the testing environment, chip measurement results, and the measured performance comparison with the conventional bootstrapped driver. The conclusion of the paper is given in Section 5.
2. Energy Efficient Bootstrapped Driver
Figure 2 shows the proposed energy efficient bootstrapped CMOS driver. It consists of the basic driver block and the bootstrap block. In the basic driver block, a PMOS (MP1) and an NMOS (MN1) push and pull the output load. In the bootstrap block, there are two PMOS (MP2, MP3) and three NMOS (MN4, MN5, MN6) for the push-up transient and two NMOS (MN2, MN3) and three PMOS (MP4, MP5, MP6) for the pull-down transient. Moreover, two coupling capacitors (Cpos and Cneg) and one inverter are used for storing and coupling charges in the bootstrap block.
Notice that MN3, MN4 in the conventional driver (Figure 1) and MN4, MN5 and MN6 in the proposed driver (Figure 2) are shown with the source-body tied (VSB = 0). In the following simulations and test chip implementation, the process used is the UMC 65 nm Low-Power (LP) CMOS process, a standard foundry bulk (PMOS in N-Well) process. So, the bodies of MN3, MN4 in the conventional driver and MN4, MN5 and MN6 in the proposed driver are tied to GND. As such, the body-to-source junctions of these NMOSs are reverse-biased (which raises VT and reduces leakage).
Figure 3 shows pertinent transient waveforms of the energy efficient bootstrapped driver, which drives a series of RC loading segments during the push-up transient. Note that the supply Vdd is set at 0.3 V for ultra low-voltage subthreshold operation. The driver circuit and RC loading segment are designed based on the UMC 65 nm low-power CMOS process. Each RC segment is composed of a 120 Ω resistor and two 504 fF capacitors, corresponding to 4 mm of M6 wire, extracted from a long metal wire in the 65 nm CMOS process. Fast push-up operation of the energy efficient bootstrapped driver is illustrated when the input rises from 0 V to 0.3 V.
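To give a feel for how such a segmented load scales with wire length, the sketch below computes a first-order Elmore delay for a ladder of these RC segments. It is an illustration only: the Elmore approximation is not the method used in the paper, the arrangement of the two capacitors as a pi section (one at each end of the resistor) is an assumption, and the driver resistance `r_driver` is a hypothetical parameter.

```python
def node_capacitances(n_segments, c_half=504e-15):
    """Capacitance at each ladder node for n pi-type RC segments.

    Each segment places one c_half capacitor at each end of its resistor
    (assumed pi arrangement), so interior nodes see 2 * c_half.
    """
    caps = [0.0] * (n_segments + 1)
    for i in range(n_segments):
        caps[i] += c_half
        caps[i + 1] += c_half
    return caps


def elmore_delay(n_segments, r_driver, r_seg=120.0, c_half=504e-15):
    """First-order Elmore delay (seconds) from driver to the far end.

    r_driver is an assumed effective driver resistance in ohms;
    r_seg and c_half default to the per-segment values quoted in the text.
    """
    caps = node_capacitances(n_segments, c_half)
    # The driver resistance charges the whole wire capacitance.
    delay = r_driver * sum(caps)
    # Each segment resistor charges only the capacitance downstream of it.
    for k in range(1, n_segments + 1):
        delay += r_seg * sum(caps[k:])
    return delay


if __name__ == "__main__":
    # One segment corresponds to 4 mm of M6 wire, so 24 mm is 6 segments.
    for length_mm in (0, 8, 16, 24):
        n = length_mm // 4
        # 100 kOhm is a placeholder driver resistance, not a measured value.
        print(length_mm, "mm ->", elmore_delay(n, r_driver=100e3), "s")
```

When `r_driver` is much larger than the total wire resistance, the first term dominates and the estimate grows roughly linearly with wire length.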
The equivalent driver circuit at time T1 before the push-up transient is shown in Figure 4(a). At time T1, input Vi is at 0 V, and the output of the inverter, Vx, is at 0.3 V. Therefore, in the basic driver block, MP1 is off, and MN1 is on. In the bootstrap block, MN2, MN3, MN4 and MN6 are off and the remaining transistors are on. Accordingly, the output node, Vo, is driven by MN1 to 0 V. The gate node of MN1, Vn, retains a high voltage because of the boosting of the Vpos1 node to nearly 0.6 V by Cpos and the fact that MP4 is on. Note that Vpos1 and Vn have been boosted in the preceding pull-down transient prior to the present push-up transient.
Notice also that there is no discharging current path from charge sharing nodes, Vpos1, Vpos2 and Vn. On the other hand, at T1, MP3 and MN5 of the bootstrap block are on, setting the top side of Cneg, Vneg1, to 0 V. As a result, the top and bottom side of the capacitor Cneg, Vneg1 and Vx are pre-charged to 0 V and 0.3 V, respectively. Therefore, at T1, capacitor Cneg has stored a charge of 0.3Cneg Coulombs.
Figure 4(b) illustrates the equivalent circuit of the energy efficient bootstrapped driver at time T2 after the push-up transient. At T2, the input Vi has ramped up to 0.3 V. MP3 and MP2 are off. MN6 and MN4 turn on, and the output of the inverter Vx is pulled down to 0 V. The top side of bootstrap capacitor Cneg (Vneg1) is coupled from 0 V to near -0.3 V. As MN6 and MN4 are on at T2, Vp and Vneg2 are also coupled to near −0.3 V, thus providing under-drive to enhance the driving capability of MP1, while simultaneously cutting off the leakage current path through MN5 by equalizing voltages of Vneg1 and Vneg2. The pertinent waveforms for the push-up transient are shown in Figure 3. The pull-down bootstrap transient operates in a complementary manner.
Table 1 lists the bootstrapped device states of the proposed Energy Efficient (EE) driver and the Conventional (Conv.) driver in the rising and falling transients in detail. It can be seen that MN4 (MP4) in the conventional driver has a large Vgs (Vsg) ≈ VDD during the boosting transient, thus causing significantly higher leakage than in the proposed EE driver.
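The sensitivity of leakage to gate bias can be illustrated with the textbook subthreshold current relation, in which the current grows exponentially with Vgs. The sketch below is a rough estimate under stated assumptions: the slope factor `n = 1.5` is a hypothetical value, drain-induced barrier lowering and body effect are ignored, and both devices are assumed to see the same drain-source voltage so that the Vds term cancels in the ratio.

```python
import math

BOLTZMANN = 1.380649e-23      # J/K
ELECTRON_CHARGE = 1.602176634e-19  # C


def thermal_voltage(temp_c=25.0):
    """Thermal voltage kT/q in volts at the given temperature in Celsius."""
    return BOLTZMANN * (temp_c + 273.15) / ELECTRON_CHARGE


def subthreshold_current_ratio(vgs_a, vgs_b, n=1.5, temp_c=25.0):
    """Ratio I(vgs_a) / I(vgs_b) for a device operating below threshold.

    Uses I proportional to exp(Vgs / (n * kT/q)); n is an assumed
    subthreshold slope factor, not a value from the source.
    """
    return math.exp((vgs_a - vgs_b) / (n * thermal_voltage(temp_c)))


if __name__ == "__main__":
    # Conventional driver: Vgs of about VDD = 0.3 V on the leaking device.
    # Proposed driver: Vgs = 0 on the corresponding device.
    ratio = subthreshold_current_ratio(0.3, 0.0)
    print(f"Leakage ratio (Vgs = 0.3 V vs 0 V): about {ratio:.0f}x")
```

Under these assumptions the device biased at Vgs ≈ VDD conducts on the order of a thousand times more current than the one held at Vgs = 0, which is consistent with the qualitative argument in the text.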
Figure 5(a,b) shows the waveforms of the boosted node voltages (V1n and V1p of the conventional bootstrapped driver and Vn and Vp of the proposed driver) at Vdd = 0.3 V, 25 °C, TT corner. Vn (Vp) of the proposed driver is boosted to a higher (lower) voltage level than V1n (V1p) of the conventional driver. The decay of the voltage levels of V1n and V1p due to the leakage paths through MP4 and MN4 (discussed earlier in Section 1) can be seen. Figure 5(c) compares the delays of the conventional driver and the proposed driver for wire lengths ranging from 1 mm to 4 mm. The performance advantage of the proposed driver can be clearly seen across this loading range.
Monte-Carlo simulations with 16,000 samples for three sigma Vt mismatch are performed. The results for the delay distributions of the conventional driver and the proposed driver at 0.3 V are shown in Figure 5(d). It can be seen clearly that the proposed driver offers better performance and tighter delay distribution.
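Table 1. Bootstrapped device states of the Conventional (Conv.) and Energy Efficient (EE) drivers in the rising and falling transients.
| Bootstrapping Comparison | Device State | NMOS Vgs/PMOS Vsg | \|Vds\| |
|---|---|---|---|
| Conv. (Rising) | MN4 (turn ON) | ≈VDD > 0 | |
| | MP2 (turn OFF) | Vsg ≈ −2VDD < 0 | |
| Conv. (Falling) | MP4 (turn ON) | ≈VDD > 0 | |
| | MN2 (turn OFF) | Vgs ≈ −2VDD < 0 | |
| EE. (Rising) | MN5 (turn OFF) | Vgs = 0 | ≈VDD > 0 |
| | MP2 (turn OFF) | Vsg ≈ −2VDD < 0 | |
| | MP3 (turn OFF) | Vsg ≈ −2VDD < 0 | |
| EE. (Falling) | MP5 (turn OFF) | Vsg = 0 | |
| | MN2 (turn OFF) | Vgs ≈ −2VDD < 0 | |
| | MN3 (turn OFF) | Vgs ≈ −2VDD < 0 | |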
3. Test Chip Implementation
A test chip is designed in UMC 65 nm low-power CMOS technology for performance evaluation and comparison with the conventional bootstrapped driver. The test chip structure is shown in Figure 6. The input Level Shifter down-shifts the normal (high) voltage input to the desired low voltage for the driver input. The voltage down-shift is performed in five successive steps with five power domains to mitigate ESD reliability issues between power domains with large voltage disparity. The output of the Level Shifter is fed to the proposed energy efficient driver and the conventional bootstrapped driver. Four NMOS-only pass-transistor-based multiplexer switches are provided at the output of each driver to allow selection of four different loading segments of 0 mm, 8 mm, 16 mm and 24 mm long M6 metal line, respectively. The signals at the far end of the loading segments are then up-shifted by the output Level Shifter, again in five successive steps with five power domains, to the normal (high) voltage. The successive down/up-shift with multiple power domains not only alleviates ESD reliability issues between power domains with large voltage disparity, but also provides better shift-voltage level control under Process, Voltage and Temperature (PVT) variation. The decoder of the NMOS-only pass-transistor-based multiplexer operates in the standard high-voltage domain to over-drive the pass-transistors, so the VT loss through the NMOS-only pass-gate and the multiplexer delay are negligible.
Figure 7 shows the layout view of the test chip, and the table summarizes the device count, total device width, MOS coupling capacitor area and core area of the conventional driver and the proposed energy efficient driver. The proposed energy efficient bootstrapped driver is located in the upper half, while the conventional bootstrapped driver is in the lower half. The 8 mm, 16 mm and 24 mm loading segments are drawn by folding M6 metal lines. The insets show enlarged views of the relatively small bootstrapped driver blocks, where the two boosting capacitors for each driver can be clearly seen. While the proposed driver has a higher device count, the space between N-wells for the PMOS capacitor can accommodate the extra devices, resulting in an area identical to that of the conventional driver.
In the energy efficient and the conventional bootstrapped drivers, all transistors, except MP1, of Figure 1 and Figure 2 have a minimum length and width. MP1 has a minimum length, and the width is three times larger than the minimum width. The MOS-based capacitances of Cneg(Cbp) and Cpos(Cbn) in Figure 2 (Figure 1) are 40 fF and 30 fF, respectively. The inverter in both figures is composed of a PMOS with a minimum length and nine times minimum width and an NMOS with a minimum length and three times minimum width. The test chip is designed and taped out in UMC 65 nm low-power CMOS technology. Figure 8 shows the packaged test chip and the internal transection screen within the QFP208-type package.
4. Measurement and Performance Evaluation
Figure 9 shows the measurement environment setup, including oscilloscope, PCB and multiple power supplies. A 50% duty-cycle clock signal is supplied to the input of the test chip. Figure 10 shows the pertinent measured oscilloscope waveforms, including the input (CK_IN) signal and two of the eight output signals (selected by the 2-bit multiplexer, one output for each driver scheme as shown in Figure 6) for the Energy Efficient (EE.) driver and the Conventional (Conv.) bootstrapped driver. The 50% to 50% rising and falling delay times (Trd and Tfd) from the input to output are measured. Notice that Trd and Tfd are “external” delays, which include the delays of the input and output level shifters, so we have to deduct the delays of the level shifters to obtain the “internal” delays (shown as Trdi and Tfdi in Figure 6), which represent the true delays from the driver input to the far end of the load. Separate power supply pins are used for each type of driver, so the power consumption of each driver can be measured precisely.
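The 50%-to-50% delay definition above can be expressed as a small routine operating on sampled waveforms. This is a minimal sketch assuming uniformly or non-uniformly sampled (time, voltage) lists and linear interpolation between samples; the function names and the example waveforms are hypothetical and do not represent the authors' measurement scripts.

```python
def crossing_time(times, volts, level, rising=True):
    """Time at which the waveform first crosses `level`, by linear interpolation."""
    for i in range(1, len(times)):
        v0, v1 = volts[i - 1], volts[i]
        if rising:
            crossed = v0 < level <= v1
        else:
            crossed = v0 > level >= v1
        if crossed:
            frac = (level - v0) / (v1 - v0)
            return times[i - 1] + frac * (times[i] - times[i - 1])
    raise ValueError("waveform never crosses the requested level")


def delay_50_to_50(times, v_in, v_out, in_swing, out_swing, rising=True):
    """Delay from the 50% point of the input edge to the 50% point of the output edge.

    in_swing and out_swing are the full logic swings of the two signals
    (they may differ if the signals sit in different voltage domains).
    """
    t_in = crossing_time(times, v_in, 0.5 * in_swing, rising)
    t_out = crossing_time(times, v_out, 0.5 * out_swing, rising)
    return t_out - t_in


if __name__ == "__main__":
    # Idealized example waveforms in arbitrary time units.
    t = [0.0, 1.0, 2.0, 3.0, 4.0]
    vin = [0.0, 1.0, 1.0, 1.0, 1.0]
    vout = [0.0, 0.0, 0.0, 1.0, 1.0]
    print("rising delay:", delay_50_to_50(t, vin, vout, 1.0, 1.0))
```

Calling the same routine with `rising=False` on falling edges gives the falling delay.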
Figure 11 shows the post-simulation and measured on-current of the NMOS and PMOS transistors versus supply voltage ranging from 0.3 V to 0.5 V. As can be seen, for both NMOS and PMOS, the measured on-current is much larger than the post-simulation results. The blue lines (Nratio and Pratio), corresponding to the right Y-axis, represent the ratio of the measured on-current to the post-simulation on-current. The test chip devices clearly have larger current driving capability than the post-simulation results based on the device models. The discrepancy becomes larger at lower supply voltage. This is attributed to the fact that the device models from the foundry are typically well-calibrated only for operation in the super-threshold region (e.g., 0.7–1.2 V), and not for operation in the subthreshold region, and to the large variability of device characteristics in the subthreshold region.
Figure 12 shows the post-simulation and the measured “external” rising and falling delay times (Trd and Tfd) at 0.3 V supply. The smaller delays of the measured results are due to the significantly larger device current driving capability than that predicted by the device models as shown in Figure 11.
Figure 13(a) shows the measured “external” rising and falling delay times and average power consumption of the Energy Efficient (EE.) and Conventional (Conv.) bootstrapped driver circuit versus the M6 wire loading length at 0.3 V. Figure 13(b) shows the measured “external” rising and falling delay times and average power consumption of the Energy Efficient (EE.) and Conventional (Conv.) bootstrapped driver circuit versus VDD with a loading of 16 mm M6 wire.
As mentioned previously, Trd and Tfd are “external” delays, which include the delays of the input and output level shifters, and we have to deduct the delays of the level shifters to obtain the “internal” delays Trdi and Tfdi. Notice that if one simply constructs a path (chain) consisting of only the input level shifters and output level shifters, the so-obtained level shifter delay does not represent the actual situation, since in Figure 6, the last stage of the input level shifters is driving the proposed or conventional bootstrapped driver, whereas in the level shifter chain, it is driving the first stage of the output level shifters.
Similarly, in Figure 6, the input of the first stage of the output level shifters is driven by the far end of the M6 loading wire, while in the level shifter chain, it is driven by the last stage of the input level shifters. If the device models were accurate, one could also rely on post-simulations to obtain the delays of the level shifters and deduct them from the “external” delays. However, given the large discrepancy in on-currents between the post-simulation results and measured hardware results shown in Figure 11 (and the large discrepancy between the “external” delays from post-simulations and measured hardware results in Figure 12), direct deduction of post-simulation level shifter delays from the measured “external” delays would yield inaccurate (and even meaningless) data.
In order to obtain reasonable estimates of the level shifter delays (hence “internal” delays Trdi and Tfdi), we calculate the ratio of “internal” delays to “external” delays based on post-simulation results (as shown in Figure 14) and then apply the ratio to the measured “external” delays to obtain the “internal” delays. The deduced “internal” delays, together with the rising and falling delay improvement (in %) of the proposed energy efficient driver with respect to the conventional bootstrapped driver vs. M6 wire loading length, are shown in Figure 15.
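The ratio-based deduction described above amounts to a simple scaling, sketched below together with the percentage-improvement calculation. The function names and all numeric values in the example are hypothetical placeholders chosen for illustration; they are not measured or simulated data from the paper.

```python
def deduce_internal_delay(external_measured, internal_sim, external_sim):
    """Scale a measured external delay by the simulated internal/external ratio."""
    if external_sim <= 0:
        raise ValueError("external_sim must be positive")
    ratio = internal_sim / external_sim
    return external_measured * ratio


def improvement_percent(conv_delay, ee_delay):
    """Delay improvement of the EE driver relative to the conventional driver, in percent."""
    if conv_delay <= 0:
        raise ValueError("conv_delay must be positive")
    return 100.0 * (conv_delay - ee_delay) / conv_delay


if __name__ == "__main__":
    # Placeholder numbers in nanoseconds, for illustration only.
    conv_int = deduce_internal_delay(external_measured=100.0,
                                     internal_sim=30.0, external_sim=60.0)
    ee_int = deduce_internal_delay(external_measured=80.0,
                                   internal_sim=15.0, external_sim=40.0)
    print("Conv. internal delay:", conv_int, "ns")
    print("EE internal delay:  ", ee_int, "ns")
    print("Improvement:", improvement_percent(conv_int, ee_int), "%")
```

The approach assumes that, even though the absolute simulated delays are off, the simulated split between driver delay and level-shifter delay is representative of the hardware.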
As can be seen from Figure 15, the proposed driver provides an “internal” rising-delay improvement of 37%–50% and an “internal” falling-delay improvement of 25%–47% at 0.3 V for loads ranging from 0 to 24 mm of M6 metal line. Figure 16 shows the deduced “internal” rising and falling delay times and the average power improvement of the Energy Efficient (EE.) driver with respect to the Conventional (Conv.) bootstrapped driver versus VDD with a loading of 16 mm M6 wire. Although designed and optimized for subthreshold ultra-low-voltage operation, the proposed bootstrapped driver can be seen to be advantageous at the higher, near-threshold supply voltage (0.5 V) as well. The proposed driver provides an “internal” rising-delay improvement of 20%–52% and a falling-delay improvement of 23%–43% for VDD ranging from 0.3 V to 0.5 V, while driving a 16 mm long M6 wire. Furthermore, the proposed driver consumes about 15% less average power (at higher speed) than the conventional bootstrapped driver over the same supply voltage range.
5. Conclusions
In this paper, a novel energy efficient bootstrapped CMOS driver was proposed. The circuit scheme eliminated the leakage paths of the boosted nodes to improve the boosting efficiency, achieve higher (or lower) boosted voltage levels and retain the boosted voltage levels, thus enhancing the switching speed. Performance evaluation based on measurements from a test chip implemented in UMC 65 nm low-power CMOS technology indicated that the proposed driver provided a rising-delay improvement of 37%–50% and a falling-delay improvement of 25%–47% at 0.3 V for a loading ranging from 0 to 24 mm M6 metal lines compared with the conventional bootstrapped driver. Although designed and optimized for subthreshold ultra low-voltage operation, the proposed bootstrapped driver was shown to be advantageous at higher nearly-threshold supply voltage as well. The proposed driver provided a rising delay improvement of 20% to 52% and a falling delay improvement of 23%–43% for VDD ranging from 0.3 V to 0.5 V with 16 mm long M6 wire, while consuming about 15% less average power compared with the conventional bootstrapped driver.
Acknowledgments
The authors would like to thank the Faraday Technology Corporation for their design environment and technical resource support and United Microelectronics Corporation (UMC) for chip fabrication. This work was supported by the Ministry of Education in Taiwan under the ATU Program, and the Ministry of Economic Affairs in Taiwan under Contract 99-EC-17-A-01-S1-124.
References
1. Wang, A.; Chandrakasan, A.; Kosonocky, D. Optimal Supply and Threshold Scaling for Sub-threshold CMOS Circuits. In Proceedings of the 2002 IEEE Computer Society Annual Symposium on VLSI, Pittsburgh, PA, USA, 25–26 April 2002; Volume 4, pp. 5–9.
2. Wang, A.; Calhoun, B.H.; Chandrakasan, A. Sub-Threshold Design for Ultra Low-Power Systems; Springer: New York, NY, USA, 2008.
3. Lou, J.H.; Kuo, J.B. 1.5 V full-swing bootstrapped CMOS large capacitive-load driver circuit suitable for low-voltage CMOS VLSI. IEEE J. Solid State Circuits 1997, 32, 119–121.
4. Kong, B.S.; Jun, Y.H. Power-efficient low-voltage bootstrapped CMOS latched driver. Electron. Lett. 1999, 35, 2113–2115.
5. Lu, H.W.; Su, C.C.; Liu, C.N. A Scalable Digitalized Buffer for Gigabit I/O. IEEE Trans. Circuits Syst. II Express Br. 2008, 55, 1026–1030.
6. Chen, P.C.; Kuo, J.B. Sub-1 V CMOS large capacitive-load driver circuit using direct bootstrap technique for low-voltage CMOS VLSI. Electron. Lett. 2002, 38, 265–266.
7. Garcia, J.C.; Montiel-Nelson, J.A.; Nooshabadi, S. A Single-capacitor bootstrapped power-efficient CMOS driver. IEEE Trans. Circuits Syst. II Express Br. 2006, 53, 877–881.
8. Chong-Fatt, L.; Kiat-Seng, Y.; Rofail, S.S. Sub-1 V bootstrapped CMOS driver for giga-scale-integration era. Electron. Lett. 1999, 35, 392–394.
9. Kil, J.; Gu, J.; Kim, C.H. A High-speed variation-tolerant interconnect technique for sub-threshold circuits using capacitive Boosting. IEEE Trans. Very Large Scale Integr. VLSI Syst. 2008, 16, 456–465.
10. Kim, J.W.; Kong, B.S. Low-voltage bootstrapped CMOS drivers with efficient conditional bootstrapping. IEEE Trans. Circuits Syst. II Express Br. 2008, 55, 556–560.
11. Lu, C.Y.; Chuang, C.T. Energy efficient bootstrapped CMOS large RC-load driver circuit for ultra low-voltage VLSI. In Proceedings of the IEEE International Symposium on VLSI Design, Automation and Test (VLSI-DAT), Hsinchu, Taiwan, 26–29 April 2010; pp. 70–73.
12. Tran, C.Q.; Kawaguchi, H.; Sakurai, T. Low-power high-speed level shifter design for block-level dynamic voltage scaling environment. In Proceedings of the 2005 International Conference on Integrated Circuit Design and Technology (ICICDT 2005), Austin, TX, USA, 9–11 May 2005; Volume 35, pp. 229–232.
© 2012 by the authors; licensee MDPI, Basel, Switzerland. This article is an open-access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/3.0/).