Design Techniques for Low-Power and Low-Voltage Bandgaps

Abstract: Reverse bandgaps generate PVT-independent reference voltages by summing pairs of currents over individual matched resistors: one current (CTAT) is proportional to $V_{EB}$; the other (PTAT) is proportional to the thermal voltage $V_T$. This paper presents and discusses design guidelines and techniques for a CMOS low-power reverse bandgap reference. It first explains how to design the components of the bandgap branches to minimize circuit current. Second, error amplifier topologies are studied to identify the best one for the given operating conditions. Finally, a low-voltage bandgap in 65 nm CMOS developed with these techniques is presented; it achieves 5 ppm/°C, a DC PSR of −91 dB, a power consumption of 5.2 µW and an area of 0.0352 mm².


Introduction
Bandgap (BG) voltage references are widely used in integrated circuits because they provide a constant voltage regardless of process, supply voltage and temperature (PVT) variations. In recent years, the trend in electronics has pushed power supplies down to 1.2 V or lower while maintaining or improving performance measures such as robustness and current consumption. Voltage-mode BGs therefore cannot be used, since the natural silicon bandgap voltage of 1.25 V would be higher than the supply. For this reason, and to avoid switching structures [1,2], current-mode reverse BGs (R-BGs) are typically used [3-8]; their conceptual scheme is shown in Figure 1a. This scheme produces a PVT-independent output voltage $V_{REF}$ by summing two currents over the output resistance $R_3$: a $V_T$-based current ($V_T = kT/q$) proportional to absolute temperature (PTAT component, $I_1$) and a $V_{EB}$-based current complementary to absolute temperature (CTAT component, $I_2$). The $V_T$-based current is multiplied by a constant factor so that the PTAT component balances the CTAT one. The summed current then flows into a resistor to generate the temperature-independent output voltage $V_{REF}$.
In this paper, the R-BG implementation scheme of Figure 1b is used as the benchmark [9]. The design of this R-BG circuit is analyzed in detail, and optimization guidelines are proposed to guarantee overall state-of-the-art (SoA) performance (across a number of parameters, whereas other proposals focus on only a few) and industrial yield. As validation, an R-BG circuit was developed in 65 nm CMOS technology to operate from a 1 V supply, consuming 5.2 µW with 1% $V_{REF}$ accuracy in the temperature range [−40, 100] °C and a DC PSR below −91 dB. This performance is guaranteed over 3σ yield for applications in industrial audio products, and the device compares favorably with the SoA.
This paper is organized as follows. Section 2 presents R-BG design techniques, focusing on the bandgap branches, the error amplifier (EA) choice and power supply rejection (PSR) optimization. Section 3 shows the actual design of the LV-LP R-BG, and Section 4 presents the conclusions.

Low-Voltage Bandgap Design
Figure 1b shows the low-voltage (LV) R-BG voltage reference conceptual circuit [9] adopted for discussion. The structure can be divided into two main parts: the BG branches (including Q2, a single PNP device, and Q1, made of N PNP devices) and the error amplifier (EA). In the following, design strategies to minimize power consumption (that is, current consumption while operating at LV) are analyzed and optimized, taking into account reliability and performance in the presence of PVT variations for the adopted technology.
As a general guideline for R-BG design, accuracy is favored over bandwidth. This leads to larger devices, since a longer $L$ offers a larger output impedance, which translates to higher gain. Moreover, devices with a larger area (i.e., larger $W \times L$) guarantee lower device mismatch and lower offsets.

Bandgap Branches
The current in the R-BG branches is minimized while guaranteeing yield under PVT variations. The minimum current in Q1 and Q2 (which operate at different current densities) is set by the minimum current per unit PNP ($I_\alpha$) in the range where the $\beta$ factor (the ratio between the collector current $I_C$ and the base current $I_B$) is constant for both devices. The current in each R-BG branch is therefore the minimum current per unit PNP multiplied by the Q1 size, i.e., $N$.
A key design parameter is the device ratio ($N$) between Q1 and Q2. For BJT matching purposes, the layout adopts a common-centroid structure, because it averages out geometrical inaccuracies. The value of $N$ follows the equation (with $n$ odd) [10]:
$$N = n^2 - 1 \tag{1}$$
In this way, the current $I_C$ in each bipolar device (Q1 and Q2) is $N \cdot I_\alpha$, and a lower $N$ value reduces power consumption. A higher $N$ value increases the PTAT component, which improves robustness and reduces the weight of circuit non-idealities (such as opamp offset and component mismatch). In fact, the PTAT component has to be larger than the EA offset, evaluated as $V_{off} \cdot (1 + R_{2a}/R_{2b})$ [10], where $V_{off}$ is the effective EA offset voltage. This is achieved for large $N$ values. However, because of the natural logarithm, a significant advantage would require an excessive increase in $N$ (and so a larger area and higher power consumption). As a trade-off between power consumption and robustness, $N = 8$ ($n = 3$) is adopted for the common-centroid layout. The current flowing in each PNP (Q1 and Q2) is then $I_1 = 8 \times I_\alpha \cdot (1 + \beta)/\beta$. With this choice, $R_1$ and $R_2$ ($= R_{2a} + R_{2b}$) can be designed as follows:
$$R_1 = \frac{\Delta V_{EB}}{I_1} = \frac{V_T \ln(N)}{I_1} \tag{2}$$
The value of the temperature-independent constant $m$, defined as $\left[(\partial V_{EB}/\partial T)/(\partial V_T/\partial T)\right]_{300\,K}$, depends on the technology. Thus, by equating $m$ with $(R_2/R_1) \cdot \ln(N)$, the value of $R_2$ is:
$$R_2 = \frac{m \cdot R_1}{\ln(N)} \tag{3}$$
The current through $R_2$ is therefore $I_2 = V_{EB2}/R_2 = I_1 \cdot V_{EB}/(m \cdot V_T)$. $R_2$ is composed of $R_{2a}$ and $R_{2b}$ (Figure 1b). The partition between $R_{2a}$ and $R_{2b}$ has to be optimized as a trade-off between two trends: increasing $R_{2b}$ (reducing $R_{2a}$) reduces the $V_{off}$ contribution at the output; decreasing $R_{2b}$ (increasing $R_{2a}$) lowers the bias voltage that the EA input nodes derive from $V_X$ and $V_Y$.
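The sizing steps above can be sketched numerically. The snippet below is a minimal illustration, not the authors' design flow: it assumes the common-centroid relation $N = n^2 - 1$ (consistent with $N = 8$ for $n = 3$), $R_1 = V_T \ln(N)/I_1$ and $R_2 = m \cdot R_1/\ln(N)$, and it borrows the values quoted later for the 65 nm design ($I_1$ = 680 nA, $m \approx 20.3$, $V_T$ = 26 mV). The function names are hypothetical.

```python
import math


def common_centroid_ratio(n: int) -> int:
    """Q1 unit devices surrounding one central Q2 in an n x n array (assumed N = n^2 - 1)."""
    if n < 3 or n % 2 == 0:
        raise ValueError("n must be odd and >= 3")
    return n * n - 1


def size_branch_resistors(i1: float, n_ratio: int, m: float, v_t: float = 0.026):
    """Return (delta_veb, r1, r2) in volts and ohms for a PTAT current i1 in amperes."""
    delta_veb = v_t * math.log(n_ratio)   # PTAT voltage across R1
    r1 = delta_veb / i1                   # R1 = V_T * ln(N) / I1
    r2 = m * r1 / math.log(n_ratio)       # from m = (R2 / R1) * ln(N)
    return delta_veb, r1, r2


# Values taken from the 65 nm design section; V_T = 26 mV is the value used there.
N = common_centroid_ratio(3)
delta_veb, r1, r2 = size_branch_resistors(i1=680e-9, n_ratio=N, m=20.3)
print(f"N = {N}, dV_EB = {delta_veb * 1e3:.1f} mV, "
      f"R1 = {r1 / 1e3:.1f} kOhm, R2 = {r2 / 1e3:.0f} kOhm")
```

With these inputs the results land within a few percent of the 81 kΩ and 790 kΩ reported for the design; the small difference comes from rounding in the quoted values.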
In conclusion, the total current in each BG branch (flowing through the PMOS current mirrors M1 and M2) is $I_1 + I_2$. This current is mirrored into the output branch by M3. For the defined output voltage $V_{REF,n}$ (the peak value of the BG curve in nominal conditions), $R_3$ should be designed according to the equation:
$$R_3 = \frac{V_{REF,n}}{I_1 + I_2} \tag{4}$$
The term $\partial V_{EB}/\partial T$ is not constant, and it depends on the technology; its value is usually around −2 mV/°C at 27 °C. Flatness is optimized for $m$ calculated at 27 °C (300 K). By contrast, the term $\partial V_T/\partial T$ is constant, with a value of 0.086 mV/°C. The typical, slow and fast corners of the BG curve are compared in Figure 2b. As can be seen, in the ss and ff corners the curve is not centered at 27 °C as it is in the tt corner. This is due to the change in the $R_1$ value, which is calibrated on the nominal corner.
High accuracy has to be achieved in the current mirror M1/M2-M3. Without any countermeasure, different $V_{DS}$ values would cause a current mirror error, which should be minimized by using either long-$L$ devices or a cascode current mirror, if the available voltage headroom permits. To reduce the mismatch between M1, M2 and M3, these devices are designed with large $L$ values.
The size ratio between M3 and M1/M2 can be reduced to decrease the power consumption. When the M3 current is reduced by a factor $\gamma$, the current in the output branch is $(I_1 + I_2)/\gamma$, and $V_{REF,n}$ is:
$$V_{REF,n} = R_3^{*} \cdot \frac{I_1 + I_2}{\gamma} \tag{5}$$
This means that $R_3$ is multiplied by the same factor $\gamma$, i.e., $R_3^{*} = \gamma \cdot R_3$, where $R_3$ is given by (4). Large $\gamma$ values should be avoided, because $\gamma$ increases the mismatch between M1/M2 and M3.
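A short numerical sketch of the output-branch sizing follows. It assumes that the output resistor satisfies $V_{REF,n} = R_3 \cdot (I_1 + I_2)$ before scaling and $R_3^{*} = \gamma \cdot R_3$ after the M3 current is reduced by $\gamma$, with $I_2 = V_{EB}/R_2$. The numerical inputs are the ones quoted later for the 65 nm design, and the function name is hypothetical.

```python
def output_resistor(v_ref: float, i1: float, v_eb: float, r2: float,
                    gamma: float = 1.0) -> float:
    """Return the output resistor in ohms for a mirror ratio reduced by gamma."""
    i2 = v_eb / r2               # CTAT current through R2
    r3 = v_ref / (i1 + i2)       # unscaled resistor (gamma = 1)
    return gamma * r3            # R3* = gamma * R3 keeps V_REF unchanged


# Inputs quoted in the 65 nm design section.
r3_star = output_resistor(v_ref=0.600, i1=680e-9, v_eb=0.6909, r2=790e3, gamma=3)
branch_current = 680e-9 + 0.6909 / 790e3
print(f"I1 + I2 = {branch_current * 1e9:.0f} nA, R3* = {r3_star / 1e3:.0f} kOhm")
```

The result is close to the 1170 kΩ reported for the design. The sketch also makes the power argument visible: the output branch carries only $(I_1 + I_2)/\gamma$, at the price of a resistor that is $\gamma$ times larger.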
To improve the current matching between the BG branches and the output branch, a cascode current mirror can be implemented. Its use can be enabled by operating all the transistors in the sub-threshold region, where $V_{GS} < V_{TH}$.
The voltage divider formed by $R_{2a}$ and $R_{2b}$ introduces a voltage shift at the input of the error amplifier. This allows proper biasing of the EA differential PMOS input pair despite the low VDD value.
The minimum VDD for proper operation of the bandgap branches is given in [10] in terms of $V_X$, the voltage at node X (= Y), and $V_{GS,sth}$, the $V_{GS}$ in the sub-threshold region. If a cascode current mirror is not used, only a $V_{DS,sat}$ is needed.

Trimming Resistor
The R-BG is conceived to minimize the effect of PVT variations, but it does not reduce constant deviations (such as offset and mismatch), which therefore produce a constant deviation of $V_{REF}$. Such constant deviations are compensated by digitally controlled trimming of $R_3$ (used in the test bench), implemented as a resistive array whose design is driven by the trade-off between complexity and accuracy (the TC is not affected by the trimming circuit). Other trimming implementations (such as changing the M3 size) could reduce PSR performance.
The main sources of $V_{REF}$ deviation are the EA offset ($V_{off}$), the resistor mismatch ($\varepsilon_R$, defined as $\delta R/R$) and the M1/M2-M3 current mirror mismatch ($\varepsilon_M$, defined as $\delta I/I$), where $\delta R$ and $\delta I$ are the deviations from $R$ and $I$, respectively. The contributions of these terms to $V_{ERR}$ (defined as the deviation from $V_{REF}$) are collected in (7), in which the first term on the right-hand side is called $V_{ERR,PTAT}$. The full-scale (FS) trimming correction range is designed to cover this total error. Assuming a maximum acceptable error ($\Delta V_{ERR}$) and $n$ bits for controlling the resistive array, the trimming full-scale correction range is $FS_{trim} = \Delta V_{ERR} \cdot 2^n$, which is allocated as $\pm FS_{trim}/2$ around the nominal $V_{REF}$ value. The R-BG design then has to keep the $V_{REF}$ deviation within this full-scale correction range.

Error Amplifier
The error amplifier (EA, Figure 1b) forces $V_X = V_Y$ and has to ensure that the residual induced error is lower than the target accuracy. As the R-BG produces a DC voltage reference, the EA design gives priority to static performance (bias, DC gain, offset) over dynamic performance (bandwidth and slew rate).

EA Bias
The EA bias point has to be compatible with the R-BG operating point [10] at both the input and output nodes under LV conditions.
For the input nodes, the $R_{2a}$-$R_{2b}$ partition is defined to bias the EA input nodes close to GND, which allows the use of a PMOS differential pair. With the same consideration, a minimum supply $VDD_{minEA}$ is defined for the EA [10]. For the output node, proper biasing of the M1, M2 and M3 gates requires $V_o = VDD - V_{GS,M1}$.

EA Offset Specification
The effect of the EA offset ($V_{off}$) on $V_{REF}$ is given by the $V_{ERR,PTAT}$ term in (7). Typically, $R_{2a}/R_{2b}$ is about 1 and $R_3/R_1 > 1$, so $V_{off}$ is greatly amplified at $V_{REF}$. However, $V_{ERR,PTAT}$ is a constant error and is therefore compensated by trimming, at the cost of using up part of the correction range fixed by $FS_{trim}$. The specification for $V_{off}$ is obtained by allocating 50% of $FS_{trim}$ to $V_{off}$ compensation. Since $R_1/R_3$ is typically very small, the resulting $V_{off}$ requirement is very stringent and challenging (such as $V_{off}$ < 0.5 mV or less).

EA Topology
The above requirements have to be satisfied by the EA design. The required DC gain can be achieved by using long devices and/or multistage opamp structures. $V_{off}$ can be reduced, without resorting to switching schemes, by using large-area devices (i.e., large $W \times L$) [11]. In addition, lowering the current level with the MOS devices in the saturation region would also increase the DC gain. The device design can then be optimized for low power by operating the transistors in the sub-threshold region, which maximizes the intrinsic gain of the transistor (proportional to $V_A/V_T$, where $V_A$ is the Early voltage) for a given current level. Moreover, a lower $V_{GS}$ is required, which reduces the minimum supply voltage and/or enables cascode structures. Different EA topologies can be compared as follows [12].
The single-stage operational transconductance amplifier (OTA), the simplest opamp structure (Figure 4a), is widely used for high supply voltages; nonetheless, its DC gain is limited to $g_{m,in} \cdot r_{out}$, which appears too low to meet the DC-gain requirement described above. Furthermore, the intrinsic $V_{DS}$ difference in the input devices could introduce a systematic $V_{off}$ larger than the requirement. Finally, the requirement $V_o = VDD - V_{GS,M1}$ could be critical for this opamp structure. For these reasons, other topologies are considered.
The symmetrical OTA, shown in Figure 4b, achieves a low systematic $V_{off}$, since the input transistors have the same $V_{DS}$. Moreover, its DC gain ($k \cdot g_{m,in} \cdot r_{out}$, where $k$ is the current ratio between the input and output branches) can be higher than for the single-stage OTA and sufficiently large for the specification. This comes at the cost of the extra current of the output branches. The output branch has only a single $V_{DS}$ between VDD and $V_o$, which helps with the correct biasing of M1, M2 and M3.
The two-stage Miller OTA (Figure 4c) reaches a higher DC gain, given by $A_1 \cdot A_2$ ($= g_{m,in} r_{out,1} \cdot g_{m,out} r_{out,2}$). However, the frequency response of the two-stage structure requires a large compensation capacitor and a large current in the output stage. As for the correct biasing of M1, M2 and M3, the two-stage Miller OTA is similar to the symmetrical one.
The folded cascode OTA, shown in Figure 4d, is very similar to the symmetrical and the two-stage Miller OTAs. A large gain can be achieved by using long devices at the output node, and this topology allows the correct biasing of M1, M2 and M3. The extra cost comes from stability, since a large compensation capacitor from $V_o$ to $V_+$ is needed, which occupies a large die area.
Among the four OTA options, the symmetrical one appears to be the best choice, and so it is suggested for high-performance LV R-BGs. Its additional current compared to the single-stage OTA is negligible, since it is much lower than the current required by the BG branches.
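The DC-gain expressions quoted for three of the topologies can be compared with a few lines of code. This is only an illustrative sketch: the small-signal values below are hypothetical placeholders, not data from the design, and the function names are invented for the example. Only the gain formulas ($g_{m,in} \cdot r_{out}$, $k \cdot g_{m,in} \cdot r_{out}$ and $g_{m,in} r_{out,1} \cdot g_{m,out} r_{out,2}$) come from the text.

```python
import math


def to_db(gain: float) -> float:
    """Convert a linear voltage gain to decibels."""
    return 20.0 * math.log10(gain)


def ea_dc_gains(gm_in: float, r_out: float, k: float,
                gm_out: float, r_out2: float) -> dict:
    """Linear DC gain of three EA topologies, using the expressions in the text."""
    return {
        "single_stage": gm_in * r_out,
        "symmetrical": k * gm_in * r_out,
        "two_stage_miller": (gm_in * r_out) * (gm_out * r_out2),
    }


# Hypothetical sub-threshold values, chosen only to exercise the formulas.
gains = ea_dc_gains(gm_in=0.5e-6, r_out=2e9, k=4, gm_out=1e-6, r_out2=1e8)
for name, gain in gains.items():
    print(f"{name:18s} {to_db(gain):6.1f} dB")
```

For a fixed $g_{m,in}$ and $r_{out}$, the symmetrical OTA gains $20 \log_{10}(k)$ dB over the single-stage one, which is the margin the text relies on when it prefers this topology.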

Power Supply Rejection
Power supply rejection (PSR) [13] is a critical parameter, particularly in LV circuits, where any disturbance weighs more than at high voltages. At low power, the impedance level is higher, so small current errors can result in large voltage errors. For the basic R-BG scheme, and assuming a symmetrical EA, the PSR frequency response is shown by the solid line in Figure 5. The DC-PSR value depends on the following quantities: $g_{m,M3}$, the transconductance of M3; $r_0$, the output resistance of M3; $g_{m,ea}$, the transconductance of the symmetrical EA ($k \cdot g_{m,in}$); $r_{ea}$, the output resistance of the EA; and $R_J$ ($R_K$), the resistance at node J (K). With $C_C$ connected from $V_o$ to GND, the PSR presents a set of poles and a first zero $z_1$. The performance can be improved by increasing the current consumption, i.e., by reducing the output impedance of the EA.
To move $z_1$ to a higher frequency, that is, to increase the bandwidth over which the DC PSR holds, it is useful to connect the capacitor $C_C$ from $V_o$ to VDD instead of to GND. With this solution, $z_1$ increases by a factor $g_{m,M3} \cdot r_0$, as the dashed line in Figure 5 shows, while the other poles keep the same values. Figure 5 also shows that without $C_C$ the first zero is shifted to higher frequencies. This is positive, but the PSR then reaches worse values at higher frequencies. Because $z_1$ and $z_{1c}$ are close to each other, the slope of the PSR is about −40 dB/dec. The zeros are due to the parasitic capacitors. If the application requires a good PSR only over a low range of frequencies, $C_C$ can be omitted, with a consequent area saving. However, if a good PSR is needed over a high range of frequencies, it is better to place the first zero at a lower frequency, which gives a good PSR over the whole range of frequencies.
Figure 5. Comparison of typical PSR shapes: without the coupling capacitor $C_C$ (dotted line), with $C_C$ connected from $V_o$ to VDD (dashed line) and with $C_C$ connected from $V_o$ to GND (solid line). The graph shows that $z_1$ is moved to higher frequencies by a factor $g_{m,M3} \cdot r_0$.

LV-LP BG Design in 65 nm Technology
An LV-LP BG for audio applications, designed in 65 nm CMOS technology according to the previous guidelines, is proposed [14]. The developed circuit was fully characterized under PVT variations, and post-layout Monte Carlo simulations were used to validate its performance and robustness.
The R-BG is required to operate from a nominal VDD = 1.2 V ± 0.2 V and has to provide a $V_{REF,n}$ of 600 mV with a maximum deviation of 6 mV at 3σ. Figure 6 shows the developed R-BG structure. As a first step, $N = 8$ was adopted as a trade-off between minimum current and large $\Delta V_{EB}$ (2). This means that $\Delta V_{EB}$ = 26 mV·ln($N$) = 54 mV, $I_1 = 8 \times I_\alpha \cdot (1 + \beta)/\beta$ = 680 nA and $R_1 = \Delta V_{EB}/I_1$ = 81 kΩ. Consequently, from (3), $R_2$ = 790 kΩ ($m \approx 20.3$). With $V_{EB}$ = 690.9 mV and $V_{REF,n}$ = 600 mV in (4), the value of $R_3$ is 1170 kΩ with $\gamma = 3$. $R_3$ is a 4-bit trimmable resistor with a trimming step of 5 mV, resulting in $FS_{trim}$ = 75 mV around the nominal $V_{REF,n}$ = 600 mV: the value of $V_{REF}$ can be adjusted by up to 40 mV above and 35 mV below (one of the 16 trimming codes applies no change). To leave enough biasing headroom without increasing the contribution of $V_{off}$ to $V_{REF}$, $R_{2a} = R_{2b}$ = 395 kΩ was chosen.
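The 4-bit trimming range described above can be enumerated with a small sketch. It treats each trimming code as a fixed 5 mV shift of $V_{REF}$, which is a simplification of trimming $R_3$, and it assumes that code 7 is the one that applies no change; the text only states that one of the 16 codes does so, so this mapping and the function names are hypothetical.

```python
def trim_offsets_mv(n_bits: int, step_mv: int, zero_code: int) -> list[int]:
    """Offset in mV applied by each trimming code (assumed linear mapping)."""
    return [(code - zero_code) * step_mv for code in range(2 ** n_bits)]


def best_trim_code(v_meas_mv: float, v_target_mv: float,
                   offsets_mv: list[int]) -> int:
    """Code whose offset brings the measured V_REF closest to the target."""
    residuals = [abs(v_meas_mv + off - v_target_mv) for off in offsets_mv]
    return residuals.index(min(residuals))


offsets = trim_offsets_mv(n_bits=4, step_mv=5, zero_code=7)
span_mv = offsets[-1] - offsets[0]

# Extremes of the untrimmed Monte Carlo range reported in the paper.
code_low = best_trim_code(575.2, 600.0, offsets)
code_high = best_trim_code(630.3, 600.0, offsets)
print(f"range {offsets[0]} .. +{offsets[-1]} mV, span {span_mv} mV")
print(f"codes for 575.2 mV and 630.3 mV: {code_low}, {code_high}")
```

Sixteen codes give fifteen steps, which is why the span is 75 mV rather than 80 mV, and the asymmetric placement of the no-change code yields the quoted +40 mV and −35 mV limits.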

Bandgap Branch Design
All devices operate in the sub-threshold region, which minimizes the required $V_{GS}$. Assuming $VDD_{min}$ = 1.0 V, the voltage headroom for the current mirror (M1-M2) is $VDD_{min} - V_{EB}$, about 350 mV, which allows cascode current mirrors to be used with devices in the sub-threshold region. This also improves the current accuracy of the output stage (M3).

Error Amplifier Structure
As discussed above, a symmetrical EA is used (Figure 7). All transistors operate in the sub-threshold region, and the current mirrors can also be used from $VDD_{min}$ = 1.0 V. $V_{off}$ is reduced to slightly below 0.5 mV by using large-area input devices. In this way, the maximum $V_{ERR}$ due to offset is about 20 mV; i.e., 50% of $FS_{trim}/2$ is allocated for $V_{off}$ correction.
According to Figure 3, to guarantee a $\Delta V_{REF}$ lower than 1 mV in the temperature range [−40, 100] °C with a maximum trimming range of about 30 mV, an EA DC gain larger than 70 dB is needed. To keep the error below this limit in PVT simulations, a gain of 80 dB was chosen. The input-stage and output-stage currents are on the order of 40 nA each, which is negligible with respect to the BG branches, as expected. Figure 8a reports a pie chart with the power breakdown of the whole BG structure. Figure 8b shows the power consumption and the total current of the BG as a function of the supply.
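To relate a $\Delta V_{REF}$ budget over temperature to a temperature coefficient, a box-method calculation can be used. The paper does not state which TC definition it adopts, so the box method below is an assumption, and the function names are hypothetical; the 600 mV nominal value and the [−40, 100] °C range are taken from the text.

```python
def box_tc_ppm(v_min: float, v_max: float, v_nom: float,
               t_min: float, t_max: float) -> float:
    """Box-method temperature coefficient in ppm/degC."""
    return (v_max - v_min) / (v_nom * (t_max - t_min)) * 1e6


def allowed_spread(tc_ppm: float, v_nom: float,
                   t_min: float, t_max: float) -> float:
    """Peak-to-peak V_REF variation in volts allowed by a given TC."""
    return tc_ppm * 1e-6 * v_nom * (t_max - t_min)


# A 1 mV peak-to-peak variation around 600 mV over [-40, 100] degC.
tc = box_tc_ppm(v_min=0.5995, v_max=0.6005, v_nom=0.600, t_min=-40.0, t_max=100.0)
# Variation corresponding to 5 ppm/degC over the same range.
spread = allowed_spread(tc_ppm=5.0, v_nom=0.600, t_min=-40.0, t_max=100.0)
print(f"TC for 1 mV spread: {tc:.1f} ppm/degC")
print(f"spread for 5 ppm/degC: {spread * 1e3:.2f} mV")
```

Under this definition, a 1 mV spread corresponds to roughly 12 ppm/°C, while 5 ppm/°C corresponds to about 0.4 mV over the full range.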

Start-Up and Biasing Circuit
The start-up circuit (Figure 6a) guarantees that the BG reaches its proper operating point. For example, a wrong operating point can occur when no current flows in the circuit. During start-up, transistor $M_{su}$ provides the current needed to reach the correct operating point. Once this has been reached, $M_{su}$ turns off because its $V_{GS}$ becomes zero. Figure 8c shows the start-up transient of $V_{REF}$, of the EA output $V_o$ and of the current flowing in M3, with a supply rise time of 100 µs.
The current consumption of the start-up circuit in the steady-state condition is around 400 nA, which is 9% of the total current (Figure 8a). This solution guarantees robust operation.
The biasing circuit is presented in Figure 7. It biases the cascodes in the EA and the current mirror attached to the sources of the differential input pair. Furthermore, it biases the BG cascode current mirror composed of M1C, M2C and M3C through $V_C$ (Figure 6a).

PSR Simulation
PSR performance is shown in Figure 9. The position of $z_1$ was shifted one decade higher because the value of $g_{m,M3} \cdot r_0$ was about 10. To save area, $C_C$ was implemented with PMOS transistors whose drains and sources are connected to VDD and whose gates are connected to $V_o$.

Monte Carlo Simulation
Based on Monte Carlo simulations (2000 runs), before trimming the $V_{REF}$ value varies from 575.2 to 630.3 mV with a σ of 7.77 mV. After trimming, the range is reduced: $V_{REF}$ varies from 596.3 to 602.6 mV with a σ of 1.5 mV. This means a variation at 3σ of 1% instead of 4.5% without trimming. Figure 11 shows the histogram of the simulations at 27 °C, before and after trimming. Moreover, Monte Carlo simulations revealed an EA $V_{off}$ of 471.2 µV at 1σ. This means that $V_{ERR}$ shifts $V_{REF}$ by about 15 mV, to be corrected by the trimming. Table 1 compares the performance of the proposed R-BG with the state of the art. The aggressive 5 ppm/°C outperforms SoA BGs with comparable power consumption. Moreover, Table 2 presents the simulated performance.

Conclusions
In this paper, a methodology to design a low-power bandgap was presented. We focused on component sizing to reduce the current consumption, and then covered the design of the bandgap branches and the choice of EA, depending on the target specifications.
A 65 nm CMOS low-power bandgap designed with the proposed guidelines was presented and compared to the SoA. The proposed design had a higher PSR, a superior temperature coefficient and a lower power consumption. In comparison with [13], the PSR was worse, but the TC and the power consumption were better. To conclude, this work can be considered a good trade-off between high performance and low power consumption.

Conflicts of Interest:
The authors declare no conflict of interest.