An Output-Capacitorless Low-Dropout Regulator with Slew-Rate Enhancement

A novel output-capacitorless low-dropout regulator (OCL-LDO) with an embedded slew-rate-enhancement (SRE) circuit is presented in this paper. The SRE circuit adopts a transient current-boost strategy to improve the slew rate at the gate of the power transistor when a large voltage spike at the output is detected. In addition, a feed-forward transconductance cell is introduced to form a push–pull output structure with the power transistor. The simulation results show that the maximum transient output voltage variation is 23.5 mV when the load current ILOAD is stepped from 0 to 100 mA in 100 ns with a load capacitance of 100 pF, and the settling time is 1.2 μs. The proposed OCL-LDO consumes a quiescent current of 30 μA and has a dropout voltage of 200 mV for the maximum output current of 100 mA.


Introduction
Power management units are popular in system-on-chip (SoC) applications because multiple voltage regulators can be used to individually power system sub-modules [1]. Among the many candidates for on-chip power management, low-dropout (LDO) regulators, which can provide accurate and clean supply voltages, are considered suitable for SoC applications. Traditional LDOs rely on large off-chip capacitors on the order of µF at the output to ensure system stability while improving transient response and power supply rejection (PSR) [2][3][4]. For portable systems with SoC architectures, bulky off-chip capacitors are not desirable. This led to the development of LDO regulators without off-chip capacitors at the output [5][6][7].
For portable electronic devices, the low quiescent power consumption of OCL-LDOs is critical for improving power efficiency to extend battery runtime. However, OCL-LDOs trade off power consumption and other performance metrics such as loop stability and dynamic performance [8]. The ability to drive large load currents while achieving low dropout voltage requires a PMOS (positive channel metal oxide semiconductor) transistor with a large size as the power device. Since the gate capacitance of the power transistor is proportional to its width, on the one hand, a low-frequency pole is introduced into the system, which affects the stability of the OCL-LDO, and on the other hand, the time for charging and discharging the gate parasitic capacitance of the power transistor is greatly increased. Especially for applications that require low power consumption, the system faces the problem of reduced bandwidth and slew rate, so improving the transient performance of OCL-LDOs is one of the main design challenges.
Currently, many LDO regulators without large off-chip capacitors have been reported. To cater to the need for the low-power consumption of portable devices in standby mode, some LDOs are designed to operate at currents in the order of nA [9,10]. LDOs with nA bias currents struggle to respond quickly to the load transitions because unity-gain

Topology
The topology of the proposed OCL-LDO is shown in Figure 1. It includes an error amplifier as the first stage, a non-inverting amplifier as the second stage, a power transistor as the third stage, a frequency compensation network, a transient-current boosting circuit, and a feedback network. The compensation network consists of $C_m$, $C_t$, and $g_{mt1}$, and the transient-current boosting circuit consists of two current boosters. $R_L$ represents the effective output resistance. The total capacitance at the output is the lumped equivalent of the load capacitor $C_L$, in the range of 0-100 pF, plus the equivalent parasitic capacitance of the power transistor. The input voltage of the transconductance cell $g_{mt1}$ is denoted as $V_C$. In the proposed architecture, the frequency compensation capacitors $C_m$ and $C_t$ couple the output voltage variation during the load transients and pass it to the current boosters for transient enhancement.
The transient-current boosting circuit consists of two current boosters, as shown in Figure 1. The output current $I_{1,2}$ is quadratically dependent on the booster-cell differential input voltage. Due to the action of the two inverters, the voltages at the positive and negative inputs of the current boosters always change in opposite directions during transients. That is to say, when the voltage at the positive input terminal of booster 1,2 changes by $\Delta V$, the voltage at the negative input terminal changes by $-\Delta V$, so the total input voltage change is $\Delta V_{in1,2} = 2 \cdot \Delta V$. Therefore, even with small bias currents, $I_1$ and $I_2$ can be boosted during load transients, which means that the slew rate at the power transistor gate and the output node can be enhanced.
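The benefit of doubling the differential input swing can be illustrated numerically. The sketch below is not the authors' booster circuit: it assumes an ideal square-law device, and the bias current, overdrive voltage, and spike amplitude are hypothetical values chosen only for illustration.

```python
def booster_current(i_bias, v_ov, dv_in):
    """Square-law output current for a differential input change dv_in (volts).

    Assumption: I = i_bias * ((v_ov + dv_in) / v_ov)**2, with the device
    treated as cut off (zero current) once v_ov + dv_in <= 0.
    """
    v_eff = v_ov + dv_in
    if v_eff <= 0.0:
        return 0.0
    return i_bias * (v_eff / v_ov) ** 2


I_BIAS = 1e-6  # hypothetical 1 uA bias current
V_OV = 0.1     # hypothetical 100 mV overdrive voltage
DV = 0.05      # hypothetical 50 mV spike coupled to each input

# Only one input moves by DV.
single_ended = booster_current(I_BIAS, V_OV, DV)
# Both inputs move in opposite directions, so the total change is 2 * DV.
differential = booster_current(I_BIAS, V_OV, 2 * DV)

print(f"single-ended drive: {single_ended * 1e6:.2f} uA")
print(f"differential drive: {differential * 1e6:.2f} uA")
```

Under these assumed values, the same spike produces a noticeably larger current when it reaches both inputs with opposite polarity, which is the qualitative point made in the text.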

Stability Analysis
The stability of the proposed OCL-LDO is achieved by the TCFC compensation technique, which can provide higher current-bandwidth efficiency [22]. Figure 2 shows the equivalent small-signal model of the proposed OCL-LDO, where $g_{mi}$ is defined as the transconductance of each stage, whereas $R_i$ and $C_i$ represent the output resistance and lumped parasitic capacitance, respectively. $g_{m2}$ and $g_{mt}$ compose the non-inverting second stage. $r_{ds19}$ is the output resistance of M19, which is a pFET in saturation. $g_{mp}$ is the transconductance of the power transistor Mp. The effective output resistance is defined by $R_L = R_o \parallel R_{LOAD}$, where $R_o$ and $R_{LOAD}$ are the output resistance of the output stage and the load resistance, respectively. $C_L$ models the load capacitance as defined above. The Miller compensation capacitor $C_m$ forms an external feedback loop, and the internal compensation capacitor $C_t$ feeds back the output signal to the gate of the power transistor through the transconductance $g_{mt1}$. In order to improve the transient performance of the system, a feed-forward transconductance stage $g_{mf}$ is introduced in the OCL-LDO, which can form a push-pull structure with the power transistor to further improve the slew rate at the output node.

Both $G_{m1}$ and $G_{m2}$ are given by the equivalent transconductance $G_m$ of the circuit structure shown in Figure 3. $G_m$ is defined as: [equation missing in the source]. $G_m$ can be deduced as follows:

$$G_m = \frac{g_m}{1 + g_m R_s}$$

where $g_m$ is the transconductance of M2. In the proposed design, $R_s$ is actually realized by the $r_{ds}$ of M15 and M21, which are two nFETs in saturation, showing large resistance, so $g_m R_s \gg 1$. Specifically,

$$G_{m1} = \frac{g_{mt1}}{1 + g_{mt1} r_{ds15}}$$

It can be concluded that [relation missing in the source]. Compared with $g_{mt}$ and $g_{mt1}$, the contributions of $G_{m1}$ and $G_{m2}$ to the current are insignificant and therefore can be ignored. Thus, the small-signal model in Figure 2 can be simplified as shown in Figure 4.

For simplicity, we assume that the DC gain of each stage is large enough, and the compensation capacitance $C_m$ is larger than the parasitic capacitance $C_1$ of the first stage. $C_m$ and $C_t$ are much smaller than the load capacitance $C_L$, as given by: [equation missing in the source]. It is worth noting that $C_2$ includes the gate parasitic capacitance of the power transistor and is therefore large. The derived small-signal transfer function for the open-loop gain of the OCL-LDO is given by: [equation missing in the source].

$A_{dc}$ and $p_{-3dB}$ are the low-frequency gain and the dominant pole, respectively, which are given as: [equation missing in the source]. Hence, the gain-bandwidth product (GBW) can be obtained as: [equation missing in the source].

Since the load current will change, the stability of the proposed LDO should be discussed for different load conditions.

Case I (low output current): In this case, $R_L$ is very large, so that $g_{mt1} C_2 C_L R_L \gg C_2 C_t$. The non-dominant poles and zeros can be expressed as: [equations missing in the source]. From the above analysis, it can be seen that $p_3$ and $z_1$ can cancel each other out. The other two zeros, $z_2$ and $z_3$, only appear at high frequencies.
For a third-order Butterworth frequency response with the damping factor $\zeta = 1/(2Q) = 0.707$, the stability conditions are given by: [equations missing in the source]. When $g_{m2}/g_{m1}$ and $g_{mp}/g_{m1}$ are large, Equation (15) is easily satisfied. It can be noticed that $p_2$ is proportional to $g_{mp}$, so the worst-case stability of the circuit occurs with no load current and maximum load capacitance. As the load current increases, $p_2$ is pushed to higher frequencies and the phase margin increases.
Case II (moderate to maximum output current): In this case, $R_L$ is small, as it is greatly affected by the load current ($R_L \propto 1/I_{LOAD}$). The expressions for the zeros, dominant pole, and GBW remain the same. The non-dominant poles change, as given by: [equations missing in the source]. It can be observed that $p_1$ remains the same. Since GBW does not vary with the load current, $p_1 = 2 \cdot \mathrm{GBW}$ still holds. With a small $R_L$, $p_3$ is located at a higher frequency than GBW and has no effect on LDO stability. Hence, the loop stability only depends on the location of $p_2$. Compared to the case discussed before, even though $R_L$ is smaller, the larger $g_{mp}$ pushes $p_2$ to higher frequencies, thus improving the phase margin. Furthermore, the zero $z_1$ is located slightly beyond the GBW, which enhances the phase margin.
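The trend described for Case II, where a higher $p_2$ gives a larger phase margin, can be illustrated with a first-order estimate. This is a simplified sketch, not the paper's transfer function: it assumes real, well-separated left-half-plane poles, a dominant pole that contributes 90° of lag at crossover, a unity-gain frequency equal to GBW, and it ignores all zeros. The $p_2$/GBW ratios are hypothetical; only $p_1 = 2 \cdot \mathrm{GBW}$ is taken from the text.

```python
import math


def phase_margin_deg(gbw, nondominant_poles):
    """First-order phase margin estimate in degrees.

    Assumptions: real left-half-plane poles, the dominant pole contributes
    90 degrees at the unity-gain frequency, the unity-gain frequency equals
    gbw, and zeros are ignored.
    """
    pm = 90.0
    for pole in nondominant_poles:
        pm -= math.degrees(math.atan(gbw / pole))
    return pm


GBW = 1.0                      # normalized gain-bandwidth product
P1 = 2.0 * GBW                 # p1 = 2 * GBW, as stated in the text
P2_RATIOS = (3.0, 6.0, 12.0)   # hypothetical p2 / GBW ratios

margins = [phase_margin_deg(GBW, [P1, r * GBW]) for r in P2_RATIOS]
for ratio, pm in zip(P2_RATIOS, margins):
    print(f"p2 = {ratio:4.1f} x GBW -> estimated PM = {pm:.1f} deg")
```

The absolute values are not expected to match the simulated margins reported later, since the real loop includes zeros and possibly complex poles. The sketch only shows that the estimate rises monotonically as $p_2$ moves away from GBW.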
In fact, the stability of the circuit is improved with SRE. Specifically, we return to Figure 2 for a detailed analysis of the true equivalent transconductance $g'_{m2}$ of the second gain stage. It follows that $g'_{m2} = g_{m2} R_t \cdot (G_{m1} + g_{mt})$, where $R_t = \frac{1}{g_{mt}} \parallel r_{ds19}$. It can be found that $g'_{m2} < g_{m2}$, which means that when the SRE circuit fails and the system is under a light load, $p_1$ and $p_2$ will move closer to the unity-gain bandwidth and the stability of the circuit will be slightly worse. At heavy loads, this situation is improved, as $p_2$ is still pushed to high frequencies.
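The claim in this section that $G_{m1}$ is negligible next to $g_{mt1}$ follows from the source-degeneration form $G_{m1} = g_{mt1}/(1 + g_{mt1} r_{ds15})$. The following sketch evaluates that expression numerically; the transconductance and output resistance values are hypothetical, since the paper does not list them.

```python
def degenerated_gm(gm, rs):
    """Equivalent transconductance of a source-degenerated device: gm / (1 + gm * rs)."""
    return gm / (1.0 + gm * rs)


GM_T1 = 20e-6   # hypothetical g_mt1 of 20 uS
R_DS15 = 5e6    # hypothetical r_ds15 of 5 Mohm, so g_mt1 * r_ds15 = 100

g_m1_eq = degenerated_gm(GM_T1, R_DS15)
upper_bound = 1.0 / R_DS15  # limit approached when gm * rs >> 1

print(f"G_m1           = {g_m1_eq * 1e6:.4f} uS")
print(f"1 / r_ds15     = {upper_bound * 1e6:.4f} uS")
print(f"G_m1 / g_mt1   = {g_m1_eq / GM_T1:.4f}")
```

With a degeneration product of about 100, the equivalent transconductance is roughly 1% of $g_{mt1}$ and sits just below $1/r_{ds15}$, which is why dropping it from the simplified model has little effect.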

Schematic
The full schematic of the proposed OCL-LDO is depicted in Figure 5. The first gain stage is realized by a single folded-cascode error amplifier with M1-M9. The differential pair M2 and M3 provides the transconductance $g_{m1}$. The second stage is a non-inverting amplifier composed of M10-M19. Mp is the power transistor, which together with the feed-forward transconductance module M21 constitutes a push-pull output stage. $C_m$ and $C_t$ are capacitors for frequency compensation. $R_L$ and $C_L$ represent the equivalent output resistance and load capacitance, respectively. The transconductances of transistors M11, M14, M20, and M21 are $g_{mt}$, $g_{mt1}$, $g_{mt2}$, and $g_{mf}$, respectively. $V_{bn}$, $V_{bp}$, $V_{cn}$, and $V_{cp}$ are the bias voltages provided by the bias circuit. The circuit consumes a total of 30 µA quiescent current, of which the first, second, and output stages consume 3 µA, 15 µA, and 9 µA, respectively, and the remaining 3 µA is consumed by the bias circuit.

Overshoot and Undershoot Reduction
The slew rate at the power transistor gate node and output node affects the transient response. As shown in Figure 5, these two nodes correspond to two charging and discharging paths: one is composed of M13 and M14, and the other is composed of Mp, M20, and M21. Therefore, it is important to dynamically increase the current in these two critical paths. This paper uses the coupling effect of $C_m$ and $C_t$ during a load current step to sense the change of $V_{out}$ and pass it to the two current boosters composed of M14 and M20, which accelerates the charging and discharging of the load capacitor and the gate parasitic capacitance of the power transistor.
When $V_{out}$ generates a spike $\Delta V$ in response to an abrupt load current change, $C_m$ detects the spike and changes the gate voltage of M14 by $-\Delta V$ through the inverter formed by M10 and M17, while its source voltage changes by $\Delta V$ due to the coupling effect of $C_t$. This causes the $V_{GS}$ of M14 to change by $-2 \cdot \Delta V$. When $V_{out}$ undershoots, the current of M14 is boosted and the current of M13 is decreased through the replication of the current mirror formed by M12 and M13. On the one hand, the second stage can therefore withdraw more current to discharge the gate parasitic capacitance of Mp. When $V_{out}$ overshoots, the circuit operates in the opposite way to quickly charge the gate capacitance of Mp. On the other hand, for the output node, the push-pull output stage formed by M21 and Mp helps to enhance the slew rate. It should be noted that the path formed by M20 and M21 is the primary channel to discharge the extra current when $V_{out}$ overshoots. Therefore, while reducing the current of Mp, it is more important to increase the current through M20 and M21 to suppress the overshoot of $V_{out}$. M20 can do this by pulling a large current in a similar manner to M14. When $V_{out}$ is regulated back to a steady state, the dynamic current boost is automatically shut down to save energy.

Simulation Results and Discussions
The simulated loop gain responses of the proposed regulator at different load current conditions are shown in Figure 6. In the case of $C_L$ = 100 pF, the regulator achieves a minimum phase margin (PM) of 74.1° and a minimum gain margin (GM) of 11.2 dB for the load current range from 0 to 100 mA. As the load current rises, the PM and GM increase to 77.2° and 28.1 dB. At heavy load conditions, $R_L$ reduces dramatically when Mp enters the triode region. In this case, the gain of the output stage $g_{mp} R_L$ is reduced, as is $A_{dc}$. However, because the proposed regulator has three gain stages, the minimum $A_{dc}$, found at $I_{LOAD}$ = 100 mA, is still 86.3 dB. Moreover, the stability of the proposed OCL-LDO for $C_L$ = 0 is investigated using the loop gain response in Figure 7. A minimum PM of 77.2° and a minimum GM of 21.4 dB are achieved. Theoretical analysis shows that the system has the worst PM and GM when $I_{LOAD}$ = 0 and $C_L$ = 100 pF. Therefore, for further verification, Monte Carlo simulations are performed under the condition of $I_{LOAD}$ = 0 and $C_L$ = 100 pF. As Figure 8a,b illustrate, the average PM and GM achieved by the proposed OCL-LDO are 74.2° and 11.5 dB, respectively. Meanwhile, Table 1 shows the simulated PM and GM across PVT variations. The results shown in Figure 8 and Table 1 verify that the stability of the proposed OCL-LDO can be guaranteed.

The proposed circuit is able to supply a load current from 0 to 100 mA with a dropout voltage of 200 mV for a supply of 1.1 V. The circuit, including the bias circuit, consumes 30 µA of quiescent current over the specified load current range. The simulated load transient responses under different load capacitor conditions are given in Figure 9.
As shown in Figure 9a, when the load current is switched between 0 and 100 mA with an edge time of 100 ns for the case of $C_L$ = 0, the simulated undershoot and overshoot are 17.0 mV and 17.4 mV, respectively. On the other hand, the maximum undershoot and overshoot for $C_L$ = 100 pF are 23.5 mV and 17.2 mV, as shown in Figure 9b. The maximum output voltage variation is about 2.6% (23.5 mV/900 mV) with load step changes of 100 mA/100 ns, and the output returns to its final state within 1.2 µs.
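The 2.6% figure can be reproduced from the reported numbers. The short calculation below takes the supply, dropout, spike, and current values from the text; the current-efficiency expression $I_{LOAD}/(I_{LOAD} + I_Q)$ is the usual textbook definition and is added here as an assumption, since the paper does not state it.

```python
V_SUPPLY = 1.1        # supply voltage in volts (from the text)
V_DROPOUT = 0.2       # dropout voltage in volts (from the text)
MAX_SPIKE = 23.5e-3   # worst-case undershoot in volts (C_L = 100 pF)
I_LOAD_MAX = 100e-3   # maximum load current in amperes
I_Q = 30e-6           # quiescent current in amperes

v_out = V_SUPPLY - V_DROPOUT                 # regulated output voltage
variation_pct = 100.0 * MAX_SPIKE / v_out    # spike relative to V_out

# Assumed definition: fraction of the input current delivered to the load.
current_efficiency_pct = 100.0 * I_LOAD_MAX / (I_LOAD_MAX + I_Q)

print(f"V_out              = {v_out * 1e3:.0f} mV")
print(f"output variation   = {variation_pct:.2f} %")
print(f"current efficiency = {current_efficiency_pct:.2f} % at full load")
```

The output voltage evaluates to 900 mV and the variation to roughly 2.6%, in agreement with the value quoted above.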
Generally speaking, if the output is connected to a large load capacitor, the overshoot and undershoot caused by load current changes can be effectively reduced because the capacitor charges and discharges the output node. However, as shown in Figure 9, the undershoot with $C_L$ = 100 pF is even larger than in the case with $C_L$ = 0 pF. This is because the pole of the output node is close to the unity-gain bandwidth when the LDO is connected to a 100 pF load capacitor. During the transition of the load current, the bias voltage and bias current of the amplifier deviate greatly. In particular, the gate-source voltage of M14 deviates sharply because its gate and source move in opposite directions, resulting in nonlinear behavior of the circuit. This deviation causes the pole and zero frequencies to change during the load transition, so the circuit shows a larger voltage spike in this case. In addition, the nonlinear behavior of the circuit leads to ringing in the transient response, as shown in Figure 9b. If the gate of M14 is connected to a fixed bias while the circuit structure, transistor size, and bias current are kept unchanged, the deviation of the bias current of M14 during the load transition decreases, and the ringing is reduced.
To verify the proposed SRE technique of the OCL-LDO, the transient waveforms of the output voltage are simulated with and without the SRE circuit. For a fair comparison, the only difference is that the gate voltages of the transistors M14 and M20 are biased to a fixed value, while the circuit structure, transistor size, and bias current remain the same. As shown in Figure 10, with the help of the slew-rate-enhancement technique, the undershoot is reduced by more than 45 mV and the settling time is also improved.

It can be seen from Figure 10 that without SRE, the undershoot of the LDO is much larger than the overshoot. This is because when the circuit is switched from light to heavy loads, the gate voltage of the power transistor cannot be pulled down quickly due to the large parasitic capacitance, so it cannot provide a large current to the output in time. To solve this problem, the designed SRE circuit provides a larger discharge current for the gate capacitance of the power transistor during load transitions. Therefore, the improvement in the undershoot is significantly greater than the improvement in the overshoot. Moreover, without SRE, the output exhibits ringing when the circuit steps from heavy to light loads, as shown in Figure 10b. This shows that the SRE circuit is helpful to the stability of the system, which is consistent with the previous stability analysis.
Since the PSR is related to the loop gain at low frequencies, and the large load capacitance bypasses the output ripple to the ground at high frequencies, we present the worst-case PSR in Figure 11. As depicted, the PSR has its best value at low frequencies.
Because the proposed LDO has a three-stage gain structure and an optimized gain-bandwidth product under TCFC compensation, the proposed OCL-LDO is capable of providing a good PSR. In order to more objectively evaluate the performance improvement in the proposed OCL-LDO resulting from the slew-rate-enhancement technique, a comparison with state-of-the-art work is given in Table 2. A figure-of-merit (FOM) for OCL-LDOs is adopted to compare the transient performance [23]. Comparisons are also made using a new figure-of-merit ($FOM_N$) that takes into account the effects of parasitic capacitances under different processes [14]. It is given by: [equation missing in the source], where $K$ is the edge time ratio, defined by: [equation missing in the source], and $L$ is the minimum channel length associated with the process. A smaller $FOM_N$ value means a better transient performance metric. The $FOM_N$ value of the proposed design is second only to that reported in [9]. However, the maximum load capacitance in [9] is only 10 pF, which limits its application. In [17], the dropout voltage of the LDO is designed to be 150 mV. A smaller dropout voltage results in higher power efficiency, but at the expense of a larger power transistor for the same drive capability. This means that the gate parasitic capacitance of the power transistor is larger, so the transient response is significantly worse than that of this paper. With the proposed circuit architecture, the voltage-spike detection scheme, and the SRE technique, the transient performance of the designed OCL-LDO has a greater advantage compared to other designs with the same power.

Conclusions
A low-power OCL-LDO regulator with embedded transient enhancement is implemented with a 40 nm standard CMOS process. With the proposed transient enhancement technique and circuit architecture, the OCL-LDO can guarantee stability over the full load range of 0-100 mA without the limitation of a minimum load current. The dropout voltage is 200 mV. The simulation results show that the undershoot of the proposed OCL-LDO is significantly improved, and the quiescent power consumption does not increase when the system is heavily loaded. Compared with the prior art, the proposed OCL-LDO regulator achieves a better transient performance indicator and also provides good performance parameters in terms of line regulation, load regulation, and PSR. The above work will be helpful for on-chip applications.