Performance Analysis and Design Optimization of Parallel-Type Slew-Rate Enhancers for Switched-Capacitor Applications

The design of single-stage OTAs for accurate switched-capacitor circuits involves challenging trade-offs between speed and power consumption. The addition of a Slew-Rate Enhancer (SRE) circuit placed in parallel to the main OTA (parallel-type SRE) constitutes a viable solution to reduce the settling time, at the cost of a low power overhead and with no modification of the main OTA. In this work, a practical analytical model has been developed to predict the settling-time reduction achievable with OTA/SRE systems and to show the effect of the various design parameters. The model has been applied to a real case, consisting of the combination of a standard folded-cascode OTA with an existing parallel-type SRE solution. Simulations performed on a circuit designed with a commercial 180-nm CMOS technology revealed that the actual settling-time reduction was significantly smaller than predicted by the model. This discrepancy was explained by taking into account the internal delays of the SRE, which are exacerbated when a high output current gain is combined with high power efficiency. To overcome this problem, we propose a simple modification of the original SRE circuit, consisting in the addition of a single capacitor which temporarily boosts the OTA/SRE currents, reducing the internal turn-on delay. With the proposed approach, a settling-time reduction of 57% has been demonstrated with an SRE that introduces only a 10% power overhead with respect to the single OTA solution. The robustness of the results has been validated by means of Monte-Carlo simulations.


Introduction
Switched-capacitor (SC) circuits are commonly used in mixed-signal ICs to process data related to sensors and actuators, in blocks including filters, analog-to-digital and digital-to-analog converters. The IoT paradigm, with its many expressions, imposes extremely low power consumption on such blocks, pushing the boundaries of the design techniques. Slew-rate (SR) enhancement techniques tackle power consumption reduction in SC circuits, since they can be employed to shorten the settling times of the commutation transients and to assist the charge transfer without raising the internal bias currents of the active blocks. SC circuits are the common implementation of sampled-data signal processing techniques, where the signal needs to correctly settle at the end of the sampling phase, regardless of its previous transient evolution. For this reason, circuit solutions that allow for output currents much higher than the quiescent current absorption (such as class-AB stages) can be applied to SC circuits with more relaxed distortion constraints than to their continuous-time counterparts, whose output signals need to be valid during the whole transient evolution. In addition, the absence of resistive loading favours the use of single-stage OTAs in most SC architectures. This is reflected in the large number of works dedicated to new single-stage OTAs for SC applications, optimized for the power vs. speed trade-off, which can be classified into four major categories:
• Class-AB input pairs employ input devices with a variable bias current, which allows for a dynamic increase of both OTA transconductance and current capacity following large voltage steps. Class-AB input pairs can be implemented by dynamically controlling the tail current injected into the common source of the input differential pair, or by topological modifications of the input pair based on the flipped-voltage follower [1].
Different strategies have been proposed: (i) charge release synchronized with clock phases [2], (ii) current boosting by sensing input voltage steps [3][4][5][6][7] or (iii) at other OTA's internal nodes [8][9][10]. However, large impulsive currents in the internal branches do not contribute directly to the current delivered to the output capacitive load and, consequently, represent wasted power; furthermore, the upset caused by large increases of the internal currents may result in a settling penalty due to saturation of sensitive devices [11] or it may even cause trojan states [12].
• Non-linear current steering: this category comprises many families of single-stage OTA topologies that differ in the specific current operations performed by the network that conveys the currents of the input devices to the output port. Generally, all the topologies falling in this category aim to achieve class-AB action on the output branches while maintaining the simplicity of the differential pair in the input section. This can be implemented through: (i) non-linear current mirrors [13][14][15][16][17], (ii) transconductor nesting [18,19] or (iii) current recycling [20][21][22][23][24][25][26][27][28][29]. The important feature of such implementations is that the increased output current capability does not involve an increased dc supply power, in contrast with what happens with a static current amplification approach. However, in many of these implementations, the appearance of non-dominant internal singularities makes the optimization difficult for small-to-medium capacitive loads due to phase margin degradation [30].
• RC-tie [31] or quasi-floating gate [32,33] configurations for OTA output branches efficiently provide class-AB action directly on the OTA output branches. They are implemented through a decoupling capacitor across the gates of the respective p-type and n-type transistors, statically charged to maintain an adequate bias current through the stacked output devices.
Any ac signal coupled to either the NMOS or the PMOS will be transferred to the stack in a push-pull fashion. Globally, the OTA presents a band-pass open-loop characteristic which can be tuned to cover the frequency range of interest.
• An auxiliary OTA structure or part of it can be used as a slew-rate enhancer (SRE), which aims to add an extra current directly to the output load in parallel with the main OTA current path. In order to achieve low-power operation, the auxiliary circuit automatically turns off once the current needed by the load is small. This is fundamental in single-stage architectures, where the main OTA provides gain through its high-impedance output nodes. Any gain loss due to the presence of parallel branches in the output nodes would cause precision loss at the end of the operation time. This can be avoided by forcing the cut-off state of the auxiliary SRE. This technique is of general use and can be extended also to multi-stage OTA configurations [34,35]. Auxiliary SRE implementations contemplate main-OTA internal-node sensing [36][37][38][39][40][41], or direct sensing of the OTA inputs (parallel-type SRE) [42][43][44]. In the latter case, a complete OTA/SRE system is used as shown in Figure 1. A totally passive, OTA-free SRE technique has been recently proposed for ∆Σ modulators [45].
The aforementioned classification is provided as a general guide to the working principles of advanced single-stage OTAs, although many of the cited works combine techniques to achieve superior performance in terms of slew-rate (SR), gain-bandwidth product (GBW) and static gain, as can be found in the super class-AB topologies [14,15,26,27,28]. Still, few efforts have been made to emphasize and contextualize the auxiliary parallel-type SREs with clear design guidelines. This work focuses on the parallel-type auxiliary SRE, based on the pioneering works of Nagaraj [42,43], which, as will be discussed later, allows for decoupled specifications and optimization with respect to the main OTA. The principle of the parallel-type SRE is illustrated in Figure 1, showing an OTA/SRE combination used as a power-efficient gain stage for a switched-capacitor, fully differential integrator.
Figure 1. Example of SC integrator based on single-stage OTA and parallel-type SRE.

In this work, a practical settling time model is presented, useful for the design of OTA-SC circuits, including SC amplifiers, SC integrators and SC resonators, which find extensive applications in filters, capacitive-load drivers, ∆Σ and pipeline ADCs. The extension of this model to SREs, and in particular to Nagaraj's SRE [42], clarifies essential conditions and trade-offs to make SREs attractive solutions to speed-up the circuit settling. Moreover, an improved version of the original Nagaraj's SRE is presented, which exploits a capacitive boosting to reduce the SRE internal delays and maximize its effectiveness.
The remainder of this paper is organized as follows: Section 2 describes the aforementioned analytical settling time model, which, owing to its generality, can be applied to any single-stage amplifier regardless of the presence of output current boosting mechanisms; Section 3 describes the application of the model to a particular case of an OTA/SRE combination and compares the prediction of the model with simulated results obtained from the combination of a standard folded-cascode OTA and Nagaraj's SRE. In the same section, the improved SRE circuit is described and the achievable advantages are demonstrated by means of simulations. Finally, conclusions are drawn in Section 4.

The Switched-Capacitors Stage
Accurate SC circuits are generally implemented using fully differential architectures such as that of Figure 1. Nevertheless, it is convenient to represent the behaviour of a fully differential circuit using a single-ended equivalent circuit such as that of Figure 2a. Voltages and currents of the single-ended model represent the total differential-mode components of the original fully differential circuit. It can be easily shown that the only transformation that has to be applied concerns the input capacitance of the OTA in the equivalent circuit, C P in Figure 2a, which should be set to twice the input capacitance of the fully differential OTA. Considering Figure 1, the other capacitances of the circuit are simply replicated in the single-ended equivalent, e.g., C S = C S1 = C S2 . The circuit represented in Figure 2a models the charge transfer operation from capacitor C S , where the input signal is initially stored in the form of a charge sample Q S = C S ∆V S , to the OTA feedback capacitor C F . The presence of a following stage is here modeled through a capacitive load C L .
This represents exactly what happens in the circuit of Figure 1, once the mentioned fully differential to single-ended transformation is considered. Nevertheless, it can be easily shown that the circuit in Figure 2a provides an exact representation of many other frequently used switching strategies, such as the case where one terminal of C S is fixed to the OTA input and the other terminal is switched between two sources with a voltage difference of ∆V S [46,47]. Furthermore, as far as the voltage across capacitor C F is concerned, we are interested only in the variation caused by the input transition (switch S1), independently of whether this capacitor is discharged or not before the S1 transition; hence the following analysis is applicable to both SC amplifiers [47] and integrators [48].
Since we are focused on the settling time, in the following discussion we will neglect the OTA finite-gain effects, considering the gain to be high enough to guarantee a perfect virtual ground across the OTA's input terminals in stationary conditions. For the same reason, offset and input referred noise will not be considered here. Therefore, we will assume that the input voltage of the OTA will asymptotically tend to zero after any transient.
At the instant t = 0, the charge initially present on C S is discharged into the amplifier inverting input. In the first place, the inertia against voltage changes of the capacitors' net causes V o to step in the opposite direction with respect to the desired final value. Neglecting any series resistance in the switches and interconnections, the OTA output and input are instantaneously displaced as shown in Figure 2b by: where the feedback coefficient, β, the equivalent input-referred capacitance, C S and the input attenuation c 1 are defined as Asymptotically, the output voltage increment, ∆V o (∞), tends to: The fact that ∆V o (∞) is independent from the values of C P and C L is simply a consequence of assuming a perfect virtual ground in stationary conditions (asymptotic virtual ground).
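The capacitive quantities used in this section can be checked numerically. The sketch below is only illustrative: the display equations are not reproduced in this text, so the expressions for β, for the total load and for c 2 are assumptions based on standard charge-redistribution analysis, and the inverting sign of the final output step is inferred from the simulation values quoted later (∆V S = −1.8 V giving V od = 0.45 V). The capacitor values are those of the case study used later for Figure 5.

```python
# Case-study capacitances in pF (values quoted later in the text).
C_S, C_F, C_L, C_P = 1.5, 6.0, 1.0, 0.32


def feedback_factor(c_s, c_f, c_p):
    """Assumed feedback coefficient: beta = C_F / (C_S + C_P + C_F)."""
    return c_f / (c_s + c_p + c_f)


def total_load(c_s, c_f, c_l, c_p):
    """Assumed total load: C_L in parallel with C_F in series with (C_S + C_P)."""
    return c_l + c_f * (c_s + c_p) / (c_f + c_s + c_p)


def final_output_step(delta_v_s, c_s, c_f):
    """Asymptotic output increment with an ideal virtual ground.

    The inverting sign is an assumption inferred from the simulated case.
    """
    return -(c_s / c_f) * delta_v_s


beta = feedback_factor(C_S, C_F, C_P)
c_load = total_load(C_S, C_F, C_L, C_P)
# Input-referred capacitance seen by the slewing input node (assumed: load / beta).
c_s_equiv = c_load / beta
# Candidate expression for c2; it reproduces the value of about 5.2 quoted in the text.
c2 = (C_S + C_P + C_F) / C_S

print(f"beta = {beta:.3f}, total load = {c_load:.3f} pF, "
      f"input-referred capacitance = {c_s_equiv:.3f} pF, c2 = {c2:.2f}")
print(f"final output step for -1.8 V input: {final_output_step(-1.8, C_S, C_F):.2f} V")
```

The final output step does not depend on C P or C L , in line with the asymptotic virtual ground assumption.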
In practical cases, the output voltage is sampled after a finite time, thus the actual output variations will present an error with respect to the ideal value. Defining the relative error as follows: we are interested in the minimum settling time t S required to guarantee that R is equal or smaller than a target value. Since both R and t S are usually given as specifications, the designer asserts Equation (4) by proper choice of OTA topology, devices geometrical features and current consumption.

Simplified Model of Charge Transfer Transient
We will now consider that the very fast (ideally instantaneous) input and output transitions described by Equation (1) are completed. From this instant on, the system evolves towards the asymptotic value according to the following equations: where C L is the total load capacitance given by: In this section, we model the saturation of the OTA current using the following piece-wise linear approximation, which considers a linear relationship between the output current and the input voltage for values of the latter up to a magnitude V dmax and a constant current of magnitude I omax for input voltages that exceed V dmax , i.e., when the OTA is in slew-rate (SR) regime: where G m is the OTA transconductance in the linear region. For the sake of clarity, we just recall here that I o is equal to the differential output current of the original fully differential circuit, i.e., I on − I op in Figure 1. The application of the piece-wise linear approximation to the characteristic of a class-A OTA is shown in Figure 3a. In this case, the output current is simply proportional to the differential current of the input pair and it is convenient to set G m to the OTA small signal transconductance and V dmax to I omax /G m , avoiding discontinuities in the transfer characteristic [49]. On the other hand, class-AB stages such as the OTA/SRE combination of Figure 1 can be modelled with the discontinuous characteristic of Figure 3b, where I omax > G m V dmax .
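The piece-wise linear approximation described above can be written as a short function. This is a minimal sketch of the verbal description (linear for input magnitudes up to V dmax , constant magnitude I omax beyond); the function name and the numerical values are hypothetical and chosen only for illustration.

```python
import math


def ota_output_current(v_i, g_m, v_dmax, i_omax):
    """Piece-wise linear OTA output current versus input voltage.

    Linear region: |v_i| <= v_dmax, current = g_m * v_i.
    Slew-rate region: |v_i| > v_dmax, current magnitude fixed at i_omax.
    """
    if abs(v_i) <= v_dmax:
        return g_m * v_i
    return math.copysign(i_omax, v_i)


# Class-A case (Figure 3a): continuous characteristic, i_omax = g_m * v_dmax.
g_m, v_dmax = 1e-3, 0.1          # hypothetical values: 1 mS, 100 mV
class_a_max = g_m * v_dmax

# Class-AB case (Figure 3b): discontinuous characteristic, i_omax > g_m * v_dmax.
class_ab_max = 5 * class_a_max   # hypothetical boost factor

for v in (-0.5, -0.05, 0.05, 0.5):
    print(v, ota_output_current(v, g_m, v_dmax, class_a_max),
          ota_output_current(v, g_m, v_dmax, class_ab_max))
```

In the class-AB case the current jumps at |V i | = V dmax , which is the discontinuity shown in Figure 3b.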
In this work, we are interested in all the cases where the slew-rate phenomenon strongly affects the settling time, thus we will consider only input stimuli ∆V S large enough to bring the OTA input voltage out of the linearity region at t = 0 + . This occurs when V i (0 + ), given by Equation (1), exceeds V dmax . The occurrence of such large input stimuli is frequent in many SC circuits and should be regarded as a worst case when determining the settling time. In these conditions the amplifier starts the transient in slew-rate and remains in this non-linear condition for a period t 1 (SR time), ending when |V i | gets smaller than V dmax . After the slew period, the amplifier enters the linear region, where the voltage evolution is exponential. This simplified view is represented in Figure 2b. The settling time t S is then given by the sum t S = t 1 + t 2 (Equation (8)), where t 2 is the period of time spent in the linear response region until the relative error defined by Equation (4) gets and remains smaller than the target value. Initially (0 ≤ t ≤ t 1 ) the OTA responds with its maximum output current, I omax . As a consequence, both V o and V i are bound to slew linearly with a slope fixed by I omax . From Equations (5) and (7), it can be shown that the OTA's input slews at a rate I omax /C S . This situation persists until V i reaches V dmax . From Equation (5) and the initial condition on the input voltage given in (1), the SR time t 1 can be calculated (Equation (9)). Notice that in the rightmost member of Equation (9) we have introduced a factor k AB defined as k AB = I omax /I sup (Equation (10)), where I sup is the total supply current of the OTA. This coefficient represents a sort of current efficiency of the given amplifier topology and is generally smaller than one in pure class-A architectures. Section 3 will discuss k AB for OTA/SRE systems.
For t > t 1 , the OTA input voltage gets smaller than V dmax and, according to the approximation given by Equation (7), the output current starts being proportional to the input voltage through the overall transconductance G m . We will also assume that the open-loop frequency response of the amplifier is dominated by the pole associated to the output port. In these conditions, it can be easily shown that the linear transient is a simple exponential decay characterized by the time constant: where ω L is the 0-dB frequency of circuit loop-gain. Then, the linear time t 2 will be simply given by: where V in is the value assumed by the input voltage when V o has settled to the final asymptotic value with a margin equal or smaller than the target relative error R . From Equations (3)-(5), V in turns out to be: Using the values found so far for the SR time t 1 and the linear time t 2 , we can now calculate the total settling time given by: This expression can be simplified by considering that the ratio I sup /G m has the dimensions of voltage, so that we can express it as a function of V dmax by defining the dimensionless factor k G as: The maximum input voltage V dmax generally depends on the input devices of the OTA, while the overall G m is proportional to the transconductance of the input devices (g mi ) through a factor m g defined as: Using Equation (15), the expression of the settling time can be finally written as: where the expression has been made more compact by introducing time t X and coefficient c 2 , defined as: Equation (17) is applicable if the following conditions are verified: (c 1 ≥ V dmax /∆V S ) and (c 2 V dmax > R ∆V S ), which correspond respectively to a transient starting in SR regime and ending in the linear region when the settling time is reached. The apparent complexity of expressions in Equation (17) can be clarified by identifying the various parameters: • t X , R : both parameters descend from system-level specifications. 
The former contains: (i) C S which is strictly related to C S (see Equation (2)) and thus to kT/C-noise specifications, (ii) ∆V S , which is the maximum stimulus that can be applied to the circuit (may approach the supply voltage in SC ADCs) and (iii) I sup , which is determined by the power budget. On the other hand, R can be related to precision, linearity and maximum tolerable harmonic distortion, depending on the application of the SC amplifier/integrator.
• c 1 and c 2 : both parameters mainly depend on the capacitive feedback network (C S , C F ) and on the load C L . The C S /C F ratio is determined at system level to achieve the desired gain or integrator coefficient, through Equation (3). The contribution of the input capacitance C P to the coefficients c 1 and c 2 may be significant when input devices with large gate area are chosen to minimize the offset voltage and the flicker noise and/or particularly small values are chosen for C S , C F and C L to enable fast clock frequencies.
• k AB and k G express the efficiency by which the OTA uses the given supply current to produce large output currents and large transconductances, respectively.
• ∆V S /V dmax is composed by a specification (∆V S ), dictated by the application, and by V dmax , which is a real degree of freedom that characterizes the design of the OTA.
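The roles of the parameters listed above can be explored with a small numerical model. The display form of Equation (17) is not reproduced in this text, so the normalized expression below is a reconstruction from the verbal derivation (a slewing phase at constant current followed by a single-pole exponential decay) and may differ in detail from the published one. It assumes k AB = I omax /I sup , k G = G m V dmax /I sup and an input voltage at the end of settling equal to R ∆V S /c 2 ; the function name is hypothetical.

```python
import math


def normalized_settling_time(x, k_ab, k_g, c1, c2, eps):
    """Return (t1/tX, t2/tX) for the two-phase settling model.

    x    : V_dmax / delta_V_S (inverse of the ratio used on the Figure 5a axis)
    k_ab : I_omax / I_sup
    k_g  : G_m * V_dmax / I_sup (assumed definition)
    c1   : initial input step as a fraction of delta_V_S
    c2   : capacitive ratio relating the residual error to the input voltage
    eps  : target relative error
    """
    if c1 < x:
        raise ValueError("transient does not start in slew-rate regime")
    if c2 * x <= eps:
        raise ValueError("target error is reached before leaving slew-rate regime")
    t1 = (c1 - x) / k_ab                       # slewing from c1*dVs down to V_dmax
    t2 = (x / k_g) * math.log(c2 * x / eps)    # exponential decay down to the target
    return t1, t2


# Folded-cascode case study: k_AB = k_G = 1/2, c1 = 0.63, c2 = 5.2.
for ratio in (5, 10, 20):
    for eps in (1e-5, 1e-4):
        t1, t2 = normalized_settling_time(1 / ratio, 0.5, 0.5, 0.63, 5.2, eps)
        print(f"dVs/Vdmax = {ratio:2d}, eps = {eps:.0e}: "
              f"t1/tX = {t1:.3f}, t2/tX = {t2:.3f}, tS/tX = {t1 + t2:.3f}")
```

Under these assumptions, the slewing term depends on k AB only and the linear term on k G only, which is why boosting the output current can shorten t 1 but not t 2 .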
In the next section, Nagaraj's SRE will be applied to the standard folded cascode (FC) OTA shown in Figure 4. The FC OTA is still widely used in SC circuits for its high gain, high speed and circuit simplicity. We will consider the typical bias current distribution shown in the figure, where the common-source stage (Mip, Min) and the common-gate one (Mcn1, Mcn2) are both biased by the same current I t /2. This choice results in k AB = 1/2. Using for this stage the approximation shown in Figure 3a and considering definition (15), we find k G = 1/2. For the topology in Figure 4, we have studied the effect of V dmax on the settling time. For this test, we have assumed a case study with C S = 1.5 pF, C F = 6 pF, C L = 1 pF and C P = 0.32 pF, resulting in the following values for the capacitance ratios: c 1 ≈ 0.63 and c 2 ≈ 5.2. Figure 5a shows the settling time, normalized to time t X , calculated by Equation (17) for the OTA of Figure 4 as a function of the ∆V S /V dmax ratio. We imagine starting with a large input voltage step ∆V S , e.g., 2 V, and then varying V dmax by sizing the input devices. Large values of ∆V S /V dmax , which are beneficial for t S /t X , can be achieved by reducing V dmax , which is the remaining degree of freedom in the OTA design once the above-mentioned choices on the capacitors are made. The two curves in Figure 5a refer to two different target values of the residual relative error R , namely 10 and 100 ppm. For small values of ∆V S /V dmax , the linear time t 2 represents the main contribution to the whole transient, so the choice of R affects t S /t X according to the logarithmic dependence expressed in Equation (17). The first design indication that can be derived from Figure 5a is that, in a single-stage OTA with a fixed supply current budget and fixed maximum magnitude of the input stimulus, the minimum values of the settling time are obtained by minimizing V dmax .
Note that, even pushing the input devices into the subthreshold region, V dmax cannot be smaller than several tens of millivolts; hence, with an input stimulus ∆V S of the order of a few volts, the feasible values of ∆V S /V dmax cannot exceed a few tens, so that the asymptotic behaviour is only a mathematical extrapolation that does not correspond to feasible circuit solutions. The fact that the curves converge at high ∆V S /V dmax ratios means that the settling time is dominated by the SR time, which becomes independent of the target residual error when V dmax gets negligible with respect to ∆V S .
The advantage of increasing the maximum output current by means of a parallel-type SRE circuit is illustrated in Figure 5b, where all parameters are kept constant and k AB is swept, starting from the value 1/2, which represents the original OTA of Figure 4 with no SRE applied. For this investigation, parameter ∆V S /V dmax was fixed to 20. The curve in Figure 5b clearly indicates that, in a typical SC design case such as the one considered here, most of the t S reduction is obtained with moderate k AB ratios and there is no real advantage in seeking extreme ratios between the maximum output current and the static supply current. In addition, it is apparent that the maximum advantage is about a factor of 2 for the FC OTA/SRE combination. This result confirms that the application of these output current boosting strategies can at best reduce the SR time (t 1 ) to a negligible value, leaving t 2 unchanged. Larger relative advantages can be obtained by increasing k AB when the target accuracy is lower (larger R ), since the contribution of t 2 to t S will be smaller with respect to the contribution of t 1 .
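The upper bound on the achievable speed-up can be estimated by letting t 1 vanish. The sketch below uses the same reconstructed two-phase model (an assumption, since Equation (17) is not reproduced in this text) and ignores the SRE power overhead, so it only indicates the trend with the target error; the function name is hypothetical.

```python
import math


def max_speedup(eps, x=0.05, c1=0.63, c2=5.2, k_ab=0.5, k_g=0.5):
    """Ratio t_S(no SRE) / t_S(ideal SRE), where the ideal SRE makes t1 negligible.

    x = V_dmax / delta_V_S (0.05 corresponds to the ratio of 20 used for Figure 5b).
    """
    t1 = (c1 - x) / k_ab
    t2 = (x / k_g) * math.log(c2 * x / eps)
    return (t1 + t2) / t2


for eps in (1e-5, 1e-4, 1e-3):
    print(f"target error {eps:.0e}: maximum speed-up = {max_speedup(eps):.2f}")
```

With these numbers the bound stays between roughly 2 and 3 and grows as the accuracy requirement is relaxed, consistent with the trend described above.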
In the next section, we will investigate the actual advantages that can be obtained by means of Nagaraj's SRE, showing the main deviations from the simplified model underlying Equation (17) by means of detailed electrical simulations.

Ideal Behaviour and Static Power Overhead of Parallel-Type SRE
We refer to the configuration depicted in Figure 1, where the output currents of the main OTA and the SRE sum at the output nodes. In the ideal case, the SRE provides a non-zero output current only during the SR time of the main OTA, helping to accelerate the circuit settling. The piece-wise linear representation used to model the OTA/SRE output current as a function of the input differential voltage is shown in Figure 3b, where I omax is the sum of the maximum currents of the OTA and the SRE circuit, indicated with I omax,OTA and I omax,SRE , respectively. Thus, according to Equation (10), k AB = k AB,OTA + k AB,SRE (Equation (19)), where k AB,OTA = I omax,OTA /I sup and k AB,SRE = I omax,SRE /I sup designate the current efficiencies of the main OTA and the SRE, referred to the total current consumption. The introduction of the SRE inevitably introduces a power overhead with respect to the OTA alone, so the total supply current I sup is now calculated as I sup = I sup,OTA + I sup,SRE = (1 + η) I sup,OTA (Equation (20)), where the parameter η describes the power overhead of the SRE circuit. To obtain an advantage in terms of the power vs. settling time trade-off, I omax,SRE should be significantly larger than I omax,OTA with a negligible power overhead (η ≪ 1). One possible implementation of the SRE circuit was proposed by Nagaraj in [42,43]. In the following sections, Nagaraj's SRE will be described together with its major limitations. A simple modification of the original scheme, proposed here, shows how these shortcomings can be avoided.

Nagaraj's SRE
Nagaraj's SRE, shown in Figure 6a, can be seen as a derivation of a mirror-based OTA with complementary input pairs, with shunt current sources implemented through Mb1p-Mb2p and Mb1n-Mb2n. The effect of the shunt current sources is to create a dead zone in the SRE characteristic around the V id = V ip − V in = 0 condition. Within the dead zone, all the available current provided by the differential pair is absorbed by the shunt current sources, hence cutting off the current mirrors Mm1p-Mm2p, Mm3p-Mm4p, Mm1n-Mm2n, Mm3n-Mm4n and the output current. In particular, the dead zone is designed to be large enough to guarantee that the mentioned mirrors are all turned off at the end of the settling transient. In this way, the SRE output resistance is virtually infinite and no degradation of the OTA dc gain occurs. Due to the dead zone, no current flows into the four current mirrors for input voltages between −V a and V a , where the threshold voltage V a can be tailored by tuning currents I tail and I th . In order for a dead zone to exist, the condition I th > I tail /2 should be respected. Furthermore, to allow a non-zero current to be conveyed into the selected mirrors when the input pair is fully unbalanced by a large input voltage, I tail should be greater than I th . Setting I th = (3/4) I tail constitutes a robust choice that satisfies both conditions with a good margin against process errors. Once currents I tail and I th are chosen, V a can be tuned to the desired value by varying the aspect ratio and, consequently, the overdrive voltage of the input devices (Minn-Mipn and Minp-Mipp). In order to approximate the characteristic of Figure 3b as closely as possible, V a was adjusted to be equal to V dmax of the main OTA.
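The two bias conditions stated above (I tail /2 < I th < I tail ) can be checked with a few lines of code. This is a simple sketch; the function name is hypothetical and the currents are expressed in arbitrary units.

```python
def dead_zone_margins(i_tail, i_th):
    """Check the dead-zone bias conditions of the SRE.

    Returns (valid, lower_margin, upper_margin), where
    lower_margin = i_th - i_tail / 2 must be positive for a dead zone to exist and
    upper_margin = i_tail - i_th must be positive for the mirrors to turn on
    when the input pair is fully unbalanced.
    """
    lower_margin = i_th - i_tail / 2
    upper_margin = i_tail - i_th
    return lower_margin > 0 and upper_margin > 0, lower_margin, upper_margin


# The choice I_th = (3/4) I_tail sits midway between the two limits.
print(dead_zone_margins(1.0, 0.75))   # (True, 0.25, 0.25)
# A mismatch that pushes I_th below I_tail/2 removes the dead zone.
print(dead_zone_margins(1.0, 0.45))
```

With I th = (3/4) I tail both margins equal I tail /4, which is why this choice tolerates mismatch of either sign equally well.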
Figure 6. Fully differential Nagaraj's SRE schematic (a); simplified schematic during turn-on transient (b). The capacitor C B , not present in the original works of [42,43], is proposed here to reduce the SRE turn-on delay as explained in Sections 3.3 and 3.4.
The steady-state current consumption of the SRE is I sup,SRE = 2I tail , which from Equation (20) is also related to I sup through η. This allows us to find design expressions for I omax,SRE and k AB,SRE (Equations (21) and (22)) in terms of the SRE mirroring factor k. Assuming I th = (3/4) I tail and η ≪ 1, k AB,SRE ≈ kη/4.
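The relation between k, η and k AB,SRE can be illustrated numerically. Equations (21) and (22) are not reproduced in this text, so the expression below is an assumption: the differential maximum SRE output current is taken as 2k(I tail − I th ), a form chosen because it reduces to the stated kη/4 for I th = (3/4) I tail and η ≪ 1. The function name is hypothetical.

```python
def k_ab_sre(k, eta, i_th_ratio=0.75):
    """Assumed SRE current efficiency referred to the total supply current.

    k          : SRE mirroring factor
    eta        : SRE supply current divided by the OTA supply current
    i_th_ratio : I_th / I_tail (0.75 in the design discussed in the text)

    With I_sup,SRE = 2 * I_tail = eta * I_sup,OTA and I_sup = (1 + eta) * I_sup,OTA,
    an output current of 2 * k * (I_tail - I_th) gives the expression below.
    """
    return k * (1.0 - i_th_ratio) * eta / (1.0 + eta)


for eta in (0.02, 0.10):
    for k in (10, 30, 100):
        exact = k_ab_sre(k, eta)
        approx = k * eta / 4
        print(f"eta = {eta:.2f}, k = {k:3d}: k_AB,SRE = {exact:.3f} "
              f"(small-eta approximation {approx:.3f})")
```

The approximation overestimates the efficiency by the factor (1 + η), which is about 10% for η = 10%.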

SRE Simulations and Turn-On/Off Effects
Apparently, Equation (22) suggests that an arbitrarily small η can be set for a desired k AB,SRE by increasing k. Electrical simulations have been performed on the circuit depicted in Figure 1, to check whether the t S reduction expected from Figure 5b can actually be obtained with this approach. The OTA and the SRE are the FC OTA and Nagaraj's SRE depicted in Figures 4 and 6, respectively; the circuit was designed with the UMC 180 nm CMOS process under a 1.8-V supply and was simulated with the Spectre TM simulator. The capacitive network was sized as in the case study used in Figure 5; the relative error R was set to 100 ppm. The OTA has been designed with a supply current I sup,OTA = 400 µA, while the SRE supply current is I sup,SRE = η I sup,OTA , where η will be specified later. The differential input signal ∆V S = −1.8 V is high enough (in absolute value) to bring the OTA far from its linearity range; the final differential output voltage is V od = V op − V on = 0.45 V, neglecting the finite dc gain effects. The clock edge controlling the switches' commutation occurs at 1 µs.
The differential output voltage V od and the SRE differential output current I od,SRE in Figure 7a are obtained with η = 10% and different values of the SRE mirroring factor k (k = 0 represents the configuration without SRE). Figure 7b shows the settling time t S vs. k (for different η factors), plotting both the nominal case and the average estimated over 100 Monte Carlo (MC) runs. The reason for the latter will be explained later in this section. As is evident from the curve for η = 10% in Figure 7b, the settling time is progressively reduced by increasing k, e.g., with k = 10 and k = 30. As an example, the case k = 30 implies a settling time reduction of 34% with respect to t S of the FC OTA alone. Larger k AB values, obtained by increasing the k factor, would bring the settling time reduction up to 56%, according to the analytical model. However, electrical simulations clearly show that increasing k from 30 to 100 does not introduce further benefits; on the contrary, t S becomes even higher than the one obtained without SRE. This is qualitatively evident from Figure 7a: the higher SRE output current due to the higher k is not provided as promptly as in the case of k = 30. Moreover, an increase of the k factor is followed by a less than proportional increment of I od,SRE . Finally, the delay of the SRE turn-off causes the overshoot of V od shown in the inset, resulting in a longer time to settle.
To better understand the origin of this turn-on delay, a simplified turn-on transient is described here. For the sake of simplicity, let us consider the turn-on transient of current mirror Mm1n-Mm2n, whose input device is represented in Figure 6b together with the input currents applied when the input voltage step magnitude is large enough to completely unbalance the input differential pair, Minp-Mipp, of the SRE. The current (I tail − I th ) linearly charges the parasitic capacitance C k at the mirror input until V k reaches V th and Mm1n turns on. Then, the transient will be governed by the non-linear differential equation described in [50] and characterized by a time constant given by Equation (23), where g m,m1n is the transconductance of Mm1n when its drain current is equal to the asymptotic value I tail − I th . This simple model already suggests that the mirror turn-on delay will be aggravated both by large values of k (increased Mm2n gate area) and by small values of I tail , i.e., small values of η, as confirmed by the simulations shown in Figure 7b: for η = 2%, indeed, the SRE is ineffective or even detrimental for all the k values. Analogous effects occur also during the SRE turn-off transient, but the actual analysis is made more complex by the input stimulus at Mm1n, which cannot be assumed to be an instantaneous current step as in the turn-on transient. However, it is reasonable to assume a turn-off transient with a time constant similar to the one expressed in Equation (23). An optimum k which minimizes t S (depending on the value of η) is visible in Figure 7b. This phenomenon has been explained considering that the mentioned turn-off delay may cause the SRE current impulse to stop when the OTA has already entered the linear region, significantly reducing also the linear time t 2 . A further increase in the SRE delay would result in overshooting the condition V id = 0, producing the overshoot previously described and visible in Figure 7a for k = 100.
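The qualitative dependence of the turn-on delay on k and I tail can be illustrated with a rough first-order estimate. Everything in the sketch below is an assumption made for illustration: Equation (23) is not reproduced in this text, so the time constant is taken as C k /g m,m1n ; the parasitic capacitance is assumed to scale as (1 + k) times a unit gate capacitance; the transconductance uses the square-law relation g m = 2I D /V ov ; and the numerical defaults (10 fF, 0.45 V, 0.2 V) are hypothetical and do not come from the simulated design.

```python
def turn_on_estimate(k, i_tail, i_th_ratio=0.75, c_unit=10e-15, v_th=0.45, v_ov=0.2):
    """Rough estimate of the two phases of the mirror turn-on transient.

    Returns (t_charge, tau) in seconds:
    t_charge : time for (I_tail - I_th) to charge C_k up to the threshold V_th
    tau      : assumed time constant C_k / g_m after the input device turns on
    """
    i_in = i_tail * (1.0 - i_th_ratio)   # current available at the mirror input
    c_k = (1 + k) * c_unit               # assumed: input device plus k-times-larger output device
    t_charge = c_k * v_th / i_in
    g_m = 2.0 * i_in / v_ov              # square-law transconductance at I_D = i_in
    tau = c_k / g_m
    return t_charge, tau


for i_tail in (4e-6, 20e-6):             # hypothetical tail currents for small and large eta
    for k in (10, 30, 100):
        t_charge, tau = turn_on_estimate(k, i_tail)
        print(f"I_tail = {i_tail * 1e6:4.0f} uA, k = {k:3d}: "
              f"t_charge = {t_charge * 1e9:7.1f} ns, tau = {tau * 1e9:6.1f} ns")
```

Both terms grow in proportion to (1 + k) and in inverse proportion to I tail , which matches the trend described above: large mirror gains combined with small bias currents give the slowest turn-on.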
Recovering from this overshoot clearly causes the t S increase visible at high k values. The hypothesis that this optimum is the result of a critical compensation among different contributions prompted us to perform Monte-Carlo (MC) simulations to test the robustness of this effect against process variations. The MC curves in Figure 7b are obtained by averaging the settling time extracted from sets of 100 Monte-Carlo runs, involving both local and global process variations. As expected, the minimum is still visible in the MC average, but it is significantly less prominent than in the nominal case.
Another noticeable difference between nominal and MC simulations appears for η = 2% and k > 30. The higher t S in the MC curves is caused by an incomplete turn-off of the output branches of Nagaraj's SRE in some of the MC runs: besides reducing the overall output impedance, the incorrect steady state of the OTA/SRE worsens the overall settling time. The incomplete SRE turn-off is due to mismatch in the current mirrors providing I tail and I th , which causes the condition I th > I tail /2 to be violated. Note that the η factor was varied by keeping the SRE transistor sizing optimized for the case η = 10% and scaling only the bias currents (I tail and I th ) proportionally. At small η values, the resulting reduction of the critical overdrive voltages increased the matching errors enough to make I th smaller than I tail /2 in a few runs, preventing the SRE from completely turning off at the end of the transient. This problem could arguably be solved by resizing the SRE for the smaller bias current of the η = 2% case. In summary, the simulated results clearly show that the actual advantage that can be obtained by increasing the I omax /I sup ratio, i.e., the k AB ratio, is significantly smaller than the analytical model prediction. This discrepancy is much more evident when a reduced current budget is assigned to the SRE. Nevertheless, the analytical model is still useful to estimate the theoretical limit that could be reached if the mentioned internal delays could be overcome. A possible solution that goes in this direction is shown in the next section.
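The sensitivity of the turn-off condition I th > I tail /2 to mirror mismatch can be illustrated with a small Monte-Carlo sketch. It is not a substitute for the circuit-level MC runs reported here: the Gaussian mismatch model, the nominal margin of I th over I tail /2, and the two mismatch levels are hypothetical values chosen only to show how a larger relative mismatch (as expected at reduced overdrive voltages) increases the fraction of runs in which the SRE fails to turn off.

```python
import random

def turn_off_failure_fraction(sigma_rel, margin=1.1, runs=10_000, seed=0):
    """Fraction of runs in which mismatch makes I_th <= I_tail / 2.

    sigma_rel: relative standard deviation of each bias current (assumed Gaussian).
    margin:    nominal ratio I_th / (I_tail / 2); 1.1 is a hypothetical design margin.
    """
    rng = random.Random(seed)  # fixed seed for repeatable results
    failures = 0
    for _ in range(runs):
        # Currents are normalized to the nominal I_tail.
        i_tail = 1.0 * (1.0 + rng.gauss(0.0, sigma_rel))
        i_th = 0.5 * margin * (1.0 + rng.gauss(0.0, sigma_rel))
        if i_th <= i_tail / 2.0:
            failures += 1
    return failures / runs

# Hypothetical mismatch levels: small (large overdrive) and large (reduced overdrive).
for sigma in (0.01, 0.10):
    frac = turn_off_failure_fraction(sigma)
    print(f"sigma = {sigma:.2f}: fraction of incomplete turn-off = {frac:.3f}")
```

In this simplified picture, either a larger nominal margin or a resizing that restores the overdrive voltages (and hence a smaller sigma_rel) reduces the failure fraction.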

Capacitive-Boosted Nagaraj's SRE
The fact that the asymptotic settling time reduction predicted by Figure 5b cannot be obtained in practice, due to the mentioned SRE delays, limits the applicability of the described solution in low-power applications, where negligible power overheads (i.e., small η) are required. A higher I tail current would indeed reduce the internal delays, but at the cost of a larger static power consumption. To overcome this limitation, we propose a dynamic current boosting technique, obtained by simply adding a capacitor C B between the source terminals of the complementary input differential pairs of the original Nagaraj's SRE, as depicted in Figure 6a.
Let us consider a large negative voltage step applied as differential input voltage. At the instant of the step, the capacitor C B behaves as a short circuit, creating a direct path between the power rails through Mm3p, Minn, Mipp, Mm1n. The two turned-on input devices establish a large current flow, which is no longer limited by I tail but depends on the input differential voltage according to a square law (the devices being biased in strong inversion). This large current impulse immediately starts charging the parasitic capacitances of the current mirrors, e.g., C k in Figure 6b, and increases the transconductances of the current mirrors, thus reducing the turn-on delay and making the output currents both faster and larger, which effectively reduces the SR time.
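The difference between the tail-limited current and the square-law current enabled by C B can be sketched as follows. This is only a first-order illustration: the current factor, the bias current, and in particular the assumption that the input step is shared equally between the two conducting input devices (parameter `share`) are hypothetical and not taken from the designed circuit.

```python
import math

def square_law_current(beta, v_ov):
    """Strong-inversion drain current: I = (beta / 2) * v_ov**2 for v_ov > 0."""
    if v_ov <= 0.0:
        return 0.0
    return 0.5 * beta * v_ov ** 2

def boosted_peak_current(beta, i_tail, v_step, share=0.5):
    """Estimated peak current right after a differential step of amplitude v_step.

    Assumes each input device carries i_tail / 2 before the step and that a
    fraction `share` of the step adds to its overdrive while C_B holds its voltage.
    """
    v_ov0 = math.sqrt(i_tail / beta)  # overdrive at the quiescent current i_tail / 2
    return square_law_current(beta, v_ov0 + share * abs(v_step))

# Hypothetical values, for illustration only.
BETA = 200e-6    # A/V^2
I_TAIL = 1e-6    # A

for v_step in (0.1, 0.2, 0.4):
    i_peak = boosted_peak_current(BETA, I_TAIL, v_step)
    print(f"step = {v_step:.1f} V: peak / I_tail = {i_peak / I_TAIL:.1f}")
```

With these numbers the estimated peak current exceeds I tail by a factor that grows roughly quadratically with the step amplitude, consistent with the qualitative description above.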
When operating in a periodic steady state, as in typical SC applications, C B cannot provide an average charge different from zero, because it needs to be periodically recharged by the internal current sources (Mtn and Mtp); it is, however, able to concentrate its effect in a short time at the beginning of the transient. An analytical model of this phenomenon is very complex and beyond the scope of this paper; however, electrical simulations make it possible to find, for each given value of η, an optimum value of the capacitor C B that, together with a proper value of k, minimizes the SR time t 1 and consequently t S . Figure 8 shows the settling time as a function of the mirror factor k for different C B values, in two different conditions of static current consumption (η = 10% and η = 2%). Each t S point is obtained by averaging over 100 MC runs. Figure 8a clearly shows that increasing C B enhances the SRE performance, achieving a t S reduction of 57% compared to the settling time of the OTA without SRE, consistent with the maximum reduction indicated by the analytical model. For C B larger than 500 fF, no further benefits are obtained. Figure 8b shows the effectiveness of C B even in the case of η = 2%, for which the original Nagaraj's SRE was not able to introduce benefits: for C B ≥ 500 fF, settling time reductions up to 44% can be achieved.
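The zero-average-charge constraint on C B mentioned above implies that the charge released during the boost must be restored by the tail current sources before the next transient. A rough first-order estimate of the required recharge time is sketched below, assuming a constant recharge current; the voltage excursion across C B , the recharge current, and the phase durations are hypothetical values, and only the 500 fF capacitance is taken from the text.

```python
def recharge_time(c_b, delta_v, i_recharge):
    """First-order time needed to restore a voltage change delta_v across c_b
    with a constant recharge current i_recharge."""
    if i_recharge <= 0:
        raise ValueError("i_recharge must be positive")
    return c_b * abs(delta_v) / i_recharge

def fits_in_phase(c_b, delta_v, i_recharge, t_phase):
    """True if C_B can be recharged within one clock phase of duration t_phase."""
    return recharge_time(c_b, delta_v, i_recharge) <= t_phase

C_B = 500e-15      # F, value discussed in the text
DELTA_V = 0.2      # V, hypothetical excursion across C_B during the boost
I_RECHARGE = 1e-6  # A, hypothetical tail current available for recharging

t_r = recharge_time(C_B, DELTA_V, I_RECHARGE)
print(f"estimated recharge time: {t_r:.2e} s")
print("fits in a 1 us phase:", fits_in_phase(C_B, DELTA_V, I_RECHARGE, 1e-6))
```

Since the estimate scales as C B /I tail , this simple picture suggests that larger capacitors or smaller η values require longer clock phases for the boost to be fully available at every cycle.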
Note that the monotonic behaviour of t S as a function of k AB in Figure 5b is not preserved for the boosted SRE either, since t S increases at high k values, corresponding to high k AB values. Indeed, very large output current gains are obtained by increasing the width (and thus the gate area) of the output devices to such an extent that even the boosted current impulse is unable to overcome the internal delay. In addition, it should be observed that the turn-off delay is likely to be less affected by the presence of capacitor C B , since the SRE switch-off is not triggered by a large voltage step, as the turn-on event is, but by an input voltage that evolves within the linear operating region of the SRE input pairs. However, as the results of Figure 8 clearly show, the t S reduction that can be obtained at moderate values of k with the modified SRE reaches the asymptotic values predicted by the simple model described in Section 2.

Conclusions
Simulations performed on the combination of the folded-cascode (FC) OTA with the original Nagaraj's SRE showed that the latter is unable to produce the settling time reduction predicted by a simple model that neglects the internal delays of both the OTA and SRE units. The effectiveness of Nagaraj's SRE progressively degrades as its relative power overhead (the η factor) is reduced. This might be the reason for the limited presence of this kind of slew-rate enhancing approach in the literature, despite its attractive characteristics and, in particular, the fact that it completely turns off at the end of the transient, leaving the noise, offset and gain properties of the original OTA unchanged. Analysis of the output current dynamics revealed that the high gains of the output mirrors, required to achieve adequate power efficiency, result in a significant delay in the delivery of the output current impulse, making the SRE ineffective or even detrimental. This problem was solved by the simple introduction of a bypass capacitor (C B ), which turned the original Nagaraj's SRE into what is here dubbed the capacitive-boosted SRE. The simulation results clearly show that this simple modification allows the capacitive-boosted SRE to provide benefits, in terms of settling time reduction, that reach the prediction of the simplified analytical model in the case of a 10% power overhead and, more remarkably, get close to the predicted 50% t S reduction even for a small 2% power overhead, at which the original Nagaraj's SRE was completely ineffective. Finally, we emphasize that these improvements are obtained at the cost, in terms of silicon area, of only a sub-pF capacitor.

Conflicts of Interest:
The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.