A Novel OTA Architecture Exploiting Current Gain Stages to Boost Bandwidth and Slew-Rate

Abstract: A novel architecture and design approach that make it possible to boost the bandwidth and slew-rate performance of operational transconductance amplifiers (OTAs) are proposed and employed to design a low-power OTA with top-of-class small-signal and large-signal figures of merit (FOMs). The proposed approach makes it possible to enhance the gain, bandwidth and slew-rate for a given power consumption and capacitive load, achieving more than an order of magnitude better performance than a comparable conventional folded cascode amplifier. Current mirrors with gain and a push-pull topology are exploited to achieve symmetrical sinking and sourcing output currents, and hence class-AB behavior. The resulting OTA was implemented using the 130 nm STMicroelectronics process, with a supply voltage of 1 V and a power consumption of only 1 µW. Simulations with a 200 pF load capacitance showed a gain of 92 dB, a unity-gain frequency of 141 kHz, and a peak slew-rate of 30 V/ms, with a phase margin of 80°, and good noise, PSRR and CMRR performance. The small-signal and large-signal current and power FOMs are the highest reported in the literature for comparable amplifiers. Extensive parametric and Monte Carlo simulations show that the OTA is robust against process, supply voltage and temperature (PVT) variations, as well as against mismatches.


Introduction
Nowadays, society is reliant on portable and lightweight devices. Biomedical and Internet-of-Things (IoT) applications are among the most relevant topics in the research community [1][2][3]. A large variety of biomedical products have been proposed for monitoring people's health [4][5][6][7][8]. Many of these devices require lightweight building blocks able to operate with low supply voltages (LV) and low power (LP) consumption, to improve battery life in portable devices and allow the use of energy harvesting techniques [9][10][11].
In this scenario, one of the most useful, and challenging, building blocks is the operational transconductance amplifier (OTA) [12]. Many ideas have been proposed in the literature to develop LV-LP amplifiers with supply voltages of 1 V or less. Besides new architectures, new operating regions, notably sub-threshold and deep sub-threshold operation, have been considered. Indeed, the strong inversion region is not a good choice in energy-harvested systems, because it requires higher supply voltages than weak inversion operation, which is considered the best solution for power consumption optimization, showing good gain and appropriate bandwidth for IoT and biomedical applications [13,14].
In References [13][14][15][16][17], several techniques to enhance low-voltage OTA performance were analyzed. In Reference [18], Algueta Miguel et al. presented an OTA with floating-gate and quasi-floating-gate techniques. Moreover, OTAs with class-AB behavior were presented in [19,20], showing significant improvement in both common-mode rejection ratio (CMRR) and slew-rate performance. As supply voltages are reduced, the CMRR becomes one of the most challenging aspects of OTA design. Thus, in [21][22][23], new approaches

Proposed Architecture and Design Approach
This section describes the theoretical idea behind the proposed OTA architecture and design approach. Small-signal, slew-rate, and noise analyses are performed to highlight the advantages of the technique.
A simplified schematic of the proposed OTA architecture is presented in Figure 1a together with a single-transistor, common-source amplifier, shown in Figure 1b. All the devices in Figure 1 are assumed to have a channel length and a unit width (with separate values for NMOS and PMOS devices), and to be biased with a unit current $I_B$, set by the bias voltages. Unit devices are then scaled by the integer factors $K$, $H$, $S \geq 1$. The scaling factors can be implemented by placing more devices in parallel, to ensure maximum matching. With the above assumptions, the unit device has transconductance $g_m$, input capacitance $C_{GS}$, and output resistance $r_0$. The scaling factors increase the transconductance and capacitance and lower the output resistance proportionally.
The following analysis in this section proves that there is a clear advantage in gain, bandwidth, and slew-rate, and a moderate worsening of noise performance, for the circuit in Figure 1a with respect to the circuit in Figure 1b. This demonstrates that the use of current mirrors with gain allows excellent figures of merit to be achieved, as shown in the rest of this paper.
For the two circuits above to have the same power consumption, we set $S = 2K + H$ for the amplifier in Figure 1b. We also assume the two amplifiers to have the same load capacitance $C_L$. In the following, we compare these two topologies to explain how the topology in Figure 1a improves gain, bandwidth and slew-rate, with a slight penalty in noise performance. We will also show in Section 3 that this latter penalty is much lower in the actual (differential-input, push-pull output) amplifier.

Small-Signal Analysis
For the computation of DC gain, we assume that the output conductance $g_0 = r_0^{-1}$ of MOS devices is much lower than their transconductance $g_m$, so that the two current mirrors $M_3$-$M_4$ and $M_6$-$M_7$ in Figure 1a exhibit current gains of $K$ and $H$, respectively. If the output resistances were taken into account, the current gain would be slightly lower. However, in the actual implementation of the OTA architecture, cascode current mirrors can be used, and in this case, due to the increased output resistance ($g_m r_0^2$), the current mirrors will exhibit a current gain very close to the ideal one.
To simplify the calculations and to gain insight into circuit behavior, it is convenient to split the DC gain of the amplifier in Figure 1a into the product of three gains: from the input to node x, from node x to node y, and from node y to the output. Hence, the proposed OTA can be seen as a three-stage amplifier, where the first two stages are loaded by a diode-connected MOS device. Since the diode-connected load transistors are smaller than the common-source devices which drive them, both the first and second stage exhibit a DC gain approximately equal to $K$. Starting from these assumptions, simple calculations yield the DC gain of the proposed OTA in Figure 1a (new), Equation (2), and that of the conventional common-source amplifier in Figure 1b (CS), Equation (3).
From Equations (2) and (3) it is evident that the DC gain of the proposed architecture is enhanced by a factor of $K^2$ with respect to the conventional common-source amplifier (about 20 dB higher gain can be achieved for $K$ around 3). The additional DC gain results in better feedback performance at low frequencies, for example higher linearity and a more accurate closed-loop gain.
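The gain expressions referred to above can be sketched as follows. This is a reconstruction under the unit-device model of this section, not necessarily the authors' exact form: it assumes that the output device of size $H$ and its current-source load each contribute an output resistance $r_0/H$ (and likewise $r_0/S$ in the common-source stage), and it considers magnitudes only. The symbols $A_{new}$, $A_{CS}$, $v_{in}$, $v_x$, $v_y$ and $v_{out}$ are notation introduced here for illustration.

```latex
\begin{align}
A_{new} &= \frac{v_x}{v_{in}} \cdot \frac{v_y}{v_x} \cdot \frac{v_{out}}{v_y}
  \approx K \cdot K \cdot \left( H g_m \cdot \frac{r_0}{2H} \right)
  = \frac{K^2 g_m r_0}{2}, \\
A_{CS} &\approx S g_m \cdot \frac{r_0}{2S} = \frac{g_m r_0}{2},
\qquad
\frac{A_{new}}{A_{CS}} = K^2 .
\end{align}
```

Under these assumptions, $K = 3$ gives $20\log_{10}(9) \approx 19$ dB of additional gain, in line with the figure of about 20 dB quoted in the text.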
However, the two additional nodes x and y in Figure 1a complicate the frequency response of the amplifier. In fact, neglecting the output conductances and the gate-drain parasitic capacitances of MOS devices, it can be shown that the proposed OTA exhibits three poles: a dominant pole at the output node, also present in the conventional common-source amplifier, and two additional poles which arise at nodes x and y. To compute the frequency response of the amplifier, we observe that the equivalent conductance and the equivalent capacitance at node x are $g_{mP}$ and $(1 + K)C_{GSP}$, respectively, whereas the conductance and capacitance at node y are $g_{mN}$ and $(1 + H)C_{GSN}$. Under these assumptions, the frequency response of the proposed architecture has three poles, whereas that of the conventional common-source amplifier has only the dominant one. The proposed amplifier thus has two additional poles, at frequencies of about $f_{TP}/K$ and $f_{TN}/H$, where $f_{TX}$ is the transition frequency of the NMOS ($X = N$) and PMOS ($X = P$) devices. This is the cost of using this topology.
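A possible form of the two frequency responses, written from the node conductances and capacitances stated above, is sketched here. The symbols $H_{new}(s)$, $H_{CS}(s)$, $A_{new}$, $A_{CS}$, $\omega_{out}$ and $\omega_{out,CS}$ (DC gains and dominant output poles) are illustrative notation, and the transition frequency is assumed to be $f_{TX} = g_{mX}/(2\pi C_{GSX})$.

```latex
\begin{align}
H_{new}(s) &\approx \frac{A_{new}}
  {\left(1 + \dfrac{s}{\omega_{out}}\right)
   \left(1 + s\,\dfrac{(1+K)\,C_{GSP}}{g_{mP}}\right)
   \left(1 + s\,\dfrac{(1+H)\,C_{GSN}}{g_{mN}}\right)}, \\
H_{CS}(s) &\approx \frac{A_{CS}}{1 + \dfrac{s}{\omega_{out,CS}}} .
\end{align}
```

With this definition of $f_{TX}$, the two non-dominant poles sit at $f_{TP}/(1+K)$ and $f_{TN}/(1+H)$, which reduce to roughly $f_{TP}/K$ and $f_{TN}/H$ for $K, H \gg 1$.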
However, there is a great advantage in terms of bandwidth, because the unity-gain frequencies of the two amplifiers are $f_{u,new} = K^2 H g_m / (2\pi C_L)$ and $f_{u,CS} = S g_m / (2\pi C_L)$. These relations show another fundamental property of the proposed circuit. The condition for the two amplifiers to have the same biasing current and power consumption is $S = 2K + H$, which is a linear function of the scaling factors $K$ and $H$. However, the bandwidth of the proposed amplifier increases by a factor $K^2 H$, which is polynomial in the scaling factors. Hence, a very high bandwidth can be achieved for the same capacitive load and power consumption, by choosing $K, H \gg 1$. For instance, for $K = 3$ and $H = 12$, and thus $S = 18$, the ratio between the bandwidth of the proposed and the reference single-transistor amplifier is $K^2 H / S = 6$. The proposed amplifier is thus (in this case) 6 times faster than a common-source amplifier implemented in the same technology, with the same load, biasing point, and power consumption.
The disadvantage of the proposed topology is the presence of two additional poles. In this case, a well-behaved frequency response requires that the unity-gain frequency is lower than the two additional poles, which, however, are at fairly high frequencies, proportional to the transition frequency of the devices and scaled down by the factors $K$ and $H$. Hence, the amplifier is compensated when the load capacitor is sufficiently large to push the unity-gain frequency to sufficiently low frequencies. The compensation technique is similar to that of cascode amplifiers, except for the presence of two additional poles instead of one, and at lower frequencies, owing to the scaling factors.
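The bandwidth comparison above can be checked numerically with a minimal sketch. It assumes only the equal-power condition $S = 2K + H$ and the bandwidth ratio $K^2 H / S$ from the text; the function names are illustrative and not part of the original work.

```python
# Sketch under the unit-device model of this section; names are illustrative.


def cs_size_for_same_power(K: int, H: int) -> int:
    """Size S of the common-source reference that draws the same bias current."""
    return 2 * K + H


def bandwidth_ratio(K: int, H: int) -> float:
    """f_u(proposed) / f_u(common-source) = K^2 * H / S, for equal load and power."""
    return K**2 * H / cs_size_for_same_power(K, H)


# The power budget grows linearly in K and H, the bandwidth polynomially.
for K, H in [(2, 4), (3, 12), (4, 16)]:
    S = cs_size_for_same_power(K, H)
    print(f"K={K}, H={H}: S={S}, bandwidth ratio={bandwidth_ratio(K, H):.2f}")
```

The pair $(K, H) = (3, 12)$ reproduces the factor of 6 quoted in the text; the other pairs are arbitrary examples.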

Slew-Rate Analysis
The circuit in Figure 1a is in class-A on the rising edge, owing to the current source at the output. A push-pull complementary structure is thus needed to implement class-AB behavior on both signal edges. In this subsection, we consider the circuits in Figure 1 to be half-circuits, and we only consider the falling edge, because the rising edge behaves similarly in the complementary push-pull architecture that has actually been implemented, and which will be detailed in Section 3.
We assume that the input stage is a differential pair, so that the current flowing in the transistors can vary from 0 to $2I_B$ (multiplied by the respective scaling factor). The slew-rate of the common-source amplifier is thus limited to $SR_{CS} = 2 S I_B / C_L$. The slew-rate of the proposed amplifier (on the falling edge) can be computed by assuming that the current flowing in the input transistor is twice the biasing current, i.e., $2KI_B$. Because of the PMOS current source above, only $(K + 1)I_B$ flows in the PMOS diode. The corresponding current mirror has a gain of $K$, bringing the current to $K(K + 1)I_B$. The NMOS current source removes $(K - 1)I_B$, so that the NMOS diode's current is $(K^2 + 1)I_B$. The gain of the second current mirror is $H$, and the final sinking current of the output stage is thus $K^2 H I_B$, also considering the current sourced by the PMOS current source at the output. Hence, $SR_{new} = K^2 H I_B / C_L$. Once again, we notice that the proposed amplifier has a much higher peak output current than the conventional common-source stage, for $K, H \gg 1$. For $K = 3$, $H = 12$, the slew-rate improvement is a factor of 3. Hence, for the same load and total power consumption, the proposed amplifier has a much larger slew-rate than the common-source amplifier.
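The current bookkeeping of the falling edge can be traced step by step with the sketch below. It follows the chain described in the text. The output current source is assumed to carry $H I_B$, which is inferred from the final result $K^2 H I_B$, and the common-source peak current is taken as $2 S I_B$, which is consistent with the quoted improvement factor of 3. Function names are illustrative.

```python
def peak_sink_current(K: int, H: int, I_B: float) -> float:
    """Peak sinking output current of the half-circuit in Figure 1a."""
    i_input = 2 * K * I_B                      # input device carries twice its bias
    i_diode_p = i_input - (K - 1) * I_B        # PMOS source of size K-1 leaves (K+1)*I_B
    i_mirror_1 = K * i_diode_p                 # first mirror, gain K
    i_diode_n = i_mirror_1 - (K - 1) * I_B     # NMOS source removes (K-1)*I_B
    i_output_dev = H * i_diode_n               # second mirror, gain H
    return i_output_dev - H * I_B              # assumed output source of size H


def cs_peak_current(S: int, I_B: float) -> float:
    """Peak current assumed for the common-source reference (2*S*I_B)."""
    return 2 * S * I_B


K, H, I_B = 3, 12, 1.0                         # I_B normalized to 1
S = 2 * K + H
improvement = peak_sink_current(K, H, I_B) / cs_peak_current(S, I_B)
print(f"peak sink current = {peak_sink_current(K, H, I_B)} I_B, improvement = {improvement}")
```

Dividing either current by $C_L$ gives the corresponding slew-rate, so the ratio of currents equals the ratio of slew-rates for the same load.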
We point out again that the above slew-rate analysis only holds true for the sinking output current, as the source output current is limited by the current generator. However, the complementary push-pull architecture, which will be detailed in Section 3, will have a symmetrical slew-rate behavior, and the slew-rate improvement will occur symmetrically on both the rising and falling signal edges.

Noise Analysis
While the gain, slew-rate and bandwidth performance of the proposed amplifier increase polynomially with the gain of the current mirrors, noise can be shown to increase linearly, as in the single-stage amplifier. For the two amplifiers in Figure 1, the proposed one has a higher noise power density by a (small) constant factor, as the noise in both amplifiers is approximately linear with the scaling factors.
To derive a simple formulation, we neglect the output resistance $r_0$ of all MOS transistors and only consider white thermal noise, with a given excess noise factor $\gamma \geq 1$ for both NMOS and PMOS devices to account for short-channel effects. We further assume both PMOS and NMOS devices to have the same transconductance.
Under these hypotheses it is straightforward to compute the total input-referred voltage noise of the common-source amplifier in Figure 1b, considering the noise of both the main transistor and the active load. In the resulting expression, $K_B$ is the Boltzmann constant, and $T$ denotes the absolute temperature.
To compute the equivalent input voltage noise for the proposed amplifier in Figure 1a, we refer to the simplified scheme shown in Figure 2, where we notice that the noise injected at node x is amplified by $KH$, the noise injected at node y is amplified by $H$, and the noise injected at the output is not amplified. Furthermore, the total transconductance of the stage is $K^2 H g_m$. Referring these contributions to the input yields Equation (12). For $K \gg 1$, Equation (12b) can be simplified further. Hence, the proposed amplifier has higher noise power density only because the first stage devices have a width $K$, which is lower than $S = 2K + H$ (for the same power consumption). This means that noise only increases linearly with the scaling factors.
However, this is a worst-case scenario, because we are comparing the baseline proposed amplifier with a single-transistor amplifier. As will be better shown in Section 3, the actual implementation of the amplifiers in Figure 1 will have a differential input stage, so that more devices (and more power consumption) are required. Furthermore, to achieve sufficient gain, a folded cascode amplifier is usually employed in place of the conventional common-source stage, resulting in more current branches and thus a reduction in the value of $S$ that is necessary to have the same power consumption.
Electronics 2021, 10, x FOR PEER REVIEW 7 of 19
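The noise comparison can be illustrated with a rough numerical sketch. It is a reconstruction under the hypotheses of this subsection, not the authors' exact expressions. Each device of size $n$ is assumed to inject a drain noise current density $4 K_B T \gamma\, n\, g_m$. The devices assumed at nodes x and y have sizes $K$, $K - 1$ and 1, those at the output $H$ and $H$, and those of the common-source stage $S$ and $S$. All results are normalized to $4 K_B T \gamma / g_m$, and the function names are illustrative.

```python
def noise_cs(S: int) -> float:
    """Input-referred noise of the common-source stage, in units of 4*K_B*T*gamma/g_m."""
    # Main device and active load, both of size S, referred through (S*g_m)^2.
    return (S + S) / S**2


def noise_new(K: int, H: int) -> float:
    """Input-referred noise of the proposed stage, in units of 4*K_B*T*gamma/g_m."""
    devices_per_node = K + (K - 1) + 1         # driver, current source, diode
    at_x = devices_per_node * (K * H) ** 2     # node-x noise is amplified by K*H
    at_y = devices_per_node * H**2             # node-y noise is amplified by H
    at_out = H + H                             # output device and its load
    return (at_x + at_y + at_out) / (K**2 * H) ** 2


K, H = 3, 12
S = 2 * K + H
ratio = noise_new(K, H) / noise_cs(S)
print(f"noise power ratio (proposed / CS) = {ratio:.2f}, asymptotic S/K = {S / K:.2f}")
```

In this model the first-stage term dominates for $K \gg 1$, so the ratio approaches $S/K = 2 + H/K$, which grows only linearly with the scaling factors, as stated in the text.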

Amplifier Design
The analysis in Section 2 showed that, by using current mirrors with gain, a potentially large increase in gain, bandwidth and slew-rate performance can be achieved, with at most a slight cost in terms of noise, and the creation of two high-frequency poles limiting stability for small capacitive loads.
In this section, we present the actual push-pull implementation of the novel OTA architecture analyzed in Section 2. The proposed implementation is based on a complementary-input push-pull topology with class-AB behavior and exploits input stages with current mirror active loads that improve the common-mode rejection ratio (CMRR). Finally, we compare this implementation against a conventional, class-A, folded cascode topology, with complementary inputs, which serves as reference for the simulations in Section 4.
Figure 3 shows the detailed schematic of the proposed OTA. It is made up of two input differential pairs with active loads based on current mirrors. All current mirrors are realized as high-swing cascode current mirrors (HSCCMs) to boost the output resistance, increase the small-signal gain and improve the mirroring accuracy. The devices in panel (a) form the input differential pairs, whose CMRR is boosted both by the use of tail current generators and by the active load, which cancels the common-mode current by mirroring it to the output with opposite signs (while the differential input is doubled). The devices in the differential pairs have size $K$. The devices in panel (b) are the loads of the first stage, which have size 1 and set the gain at node x. Panel (c) shows the second stage, of size $K$, with current generators of size $K - 1$ and diode loads of size 1, ending at node y. Finally, the output stage has size $H$ and is shown in panel (d).
The use of complementary inputs allows rail-to-rail behavior, and improves gain, noise and bandwidth performance. The most important advantage, however, is that the slew-rate behavior is now approximately symmetric, so that both on the positive and negative edges the peak output currents will be very large.
The use of cascoding increases the gain to $K^2 g_m^2 r_0^2$, still $K^2$ times higher than that of a conventional folded cascode OTA. There will be further poles (also in the conventional cascode) due to the use of HSCCM mirrors, but the main high-frequency poles will still be those at nodes x and y, because they will be at frequencies $f_T/(K + 1)$ and $f_T/(H + 1)$, while the poles of the HSCCM are at frequencies $f_T$ ($f_T$ denotes the transition frequency of the MOS devices). The tail current generators of the input differential pairs were not cascoded, due to limitations in the voltage headroom.
Figure 4 shows the topology of the complementary-input folded cascode OTA assumed to be the actual implementation of the conventional common-source amplifier used as a reference in Section 2. It has to be noted that a telescopic (Arbel, for a complementary input) cascode is better than the folded cascode in terms of power efficiency, but it is hard to bias and has limited output signal swing; therefore, we consider the folded cascode to be the most appropriate reference for comparisons.
$V_{BN}$ and $V_{BP}$ are the biasing voltages, which set the biasing currents across the amplifier, whereas $V_{CN}$ and $V_{CP}$ are the biasing voltages for the gates of the common-gate stages in the cascoded devices. They are generated by a straightforward biasing network (not shown), which can be common to both amplifiers, and is composed of HSCCM current mirrors (both NMOS and PMOS) and an ideal current source.
The total current consumption of the folded cascode in Figure 4 is $6SI_B$, while the total current consumption of the proposed amplifier in Figure 3 is $(6K + H + 2)I_B$, and hence, to have the same current consumption, $S = K + (H + 2)/6$. For the design choice of $K = 3$, $H = 12$, we have $S = 16/3$. We chose $S = 6$ to have an integer scaling factor. Since the disadvantage in noise performance was mostly due to the large value of $S$ (in Section 2.3), the much lower $S$ for the actual amplifier will yield much better bandwidth and slew-rate performance, with a negligible penalty in noise performance, for the proposed amplifier.
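The equal-current condition can be verified exactly with a short sketch. It uses only the two current totals stated above. Rounding up to the next integer is an assumption that matches the choice of $S = 6$ reported in the text, and the function name is illustrative.

```python
from fractions import Fraction
import math


def equal_current_size(K: int, H: int) -> Fraction:
    """Exact S for which 6*S*I_B equals (6*K + H + 2)*I_B."""
    return Fraction(6 * K + H + 2, 6)


S_exact = equal_current_size(3, 12)
S_used = math.ceil(S_exact)   # assumed rounding rule: next integer scaling factor
print(f"exact S = {S_exact}, integer S = {S_used}")
```

Rounding up gives the folded cascode reference slightly more current than the proposed amplifier, which makes the comparison mildly conservative.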

OTA Design and Simulation Results
This section reports details about the OTA design and the simulation results. Simulations in typical conditions confirm the theoretical result that gain, bandwidth and peak slew-rate improve significantly for the same load and power consumption, while noise density increases slightly. The results of Monte Carlo and parametric simulations also show very good robustness to PVT variations and mismatches.
The proposed OTA, whose topology is shown in Figure 3, was designed in a commercial 130 nm CMOS technology from STMicroelectronics with a supply voltage of 1 V and a total power consumption of about 1 µW. Table 1 shows the sizing of the devices together with the main design parameters. Several considerations need to be taken into account to minimize area occupation. The proposed architecture, as well as the folded cascode, employs stacked transistors. Using a single body voltage for all NMOS and all PMOS devices, area occupation is minimized with respect to body-driven amplifiers, where separate wells are needed for each input device, requiring space for well isolation. The proposed architecture employs PMOS body terminals connected to $V_{DD}$ and NMOS body terminals connected to $-V_{SS}$, further reducing area occupation.
We define the following Figures of Merit (Equation (14)), which are often used in the literature to compare different amplifier designs. In their definitions, $B_W$ is the closed-loop unity-gain bandwidth, $C_L$ the load capacitance, $SR_{\pm}$ are the positive and negative slew-rates, and $P_{diss}$ ($I_{tot}$) is the total power (current) consumption. L and S in (14) stand for large-signal and small-signal, while the $S(L)FOM_N$ are normalized with respect to the layout area of the OTA. We consider both positive and negative slew-rate large-signal FOMs, to take into account OTAs with asymmetric slew-rate behavior, while our design, being complementary, is almost symmetrical.
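A minimal sketch of how such figures of merit are typically evaluated is given below. It assumes the commonly used form "performance metric times load capacitance divided by power (or current)"; constant prefactors used by some authors and the area normalization are omitted, so the exact definitions of Equation (14) may differ. The inputs are the headline values from the abstract, used as illustrative numbers, and all names are hypothetical.

```python
def fom(metric: float, c_load_pf: float, budget: float) -> float:
    """Generic figure of merit: metric * C_L / budget (power or current)."""
    return metric * c_load_pf / budget


# Illustrative inputs taken from the abstract (assumed, not Table 2 entries).
C_L_PF = 200.0         # load capacitance, pF
F_U_MHZ = 0.141        # unity-gain frequency, MHz (141 kHz)
SR_V_PER_US = 0.030    # slew-rate, V/us (30 V/ms)
P_DISS_MW = 1e-3       # power consumption, mW (1 uW)

fom_small = fom(F_U_MHZ, C_L_PF, P_DISS_MW)      # MHz*pF/mW
fom_large = fom(SR_V_PER_US, C_L_PF, P_DISS_MW)  # (V/us)*pF/mW
print(f"small-signal FOM ~ {fom_small:.0f} MHz*pF/mW")
print(f"large-signal FOM ~ {fom_large:.0f} V*pF/(us*mW)")
```

The current-based variants follow by passing the total bias current instead of the power as `budget`.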

Results of Typical Simulations
The open-loop frequency response of the proposed OTA from 1 Hz to 10 MHz is shown in Figure 5. The response has a dominant pole at a very low frequency and all higher-frequency poles are beyond the unity-gain frequency, so that the phase margin is high (80°). There are at least three poles at frequencies beyond 1 MHz.
Figure 6 shows the closed-loop frequency response in unity-gain configuration. The high phase margin causes the frequency response to be monotonic, and the high DC gain results in a low-frequency closed-loop gain very close to 0 dB.
The DC transfer characteristic and the DC gain versus the amplitude of the input signal for values ranging from the negative supply voltage $V_{SS}$ to the positive supply voltage $V_{DD}$ are shown in Figure 7. The non-inverting buffer configuration is a critical test of large-signal performance, because the input common-mode signal varies together with the input signal; rail-to-rail behavior in this configuration therefore proves that both the output and the input common-mode ranges are rail-to-rail. This is mostly due to the complementary input; at least one differential pair is active at each input signal level. Furthermore, linearity is significantly improved by the large DC gain, so that even at −0.45 V and 0.45 V, just 50 mV from the supplies, gain is still 0.95, i.e., −0.45 dB. The folded cascode OTA reported in Figure 4, even if it adopts complementary input differential pairs, exhibits lower linearity (not shown), owing to the much lower DC open-loop gain.
Figure 8 shows the response to a sinusoidal input in time and frequency. The output is 500 mVpp, i.e., 50% of the supply rail, and the frequency is 10 kHz. For this input and a capacitive load of 200 pF, the peak output current is 3.1 µA, which is 8.6 times larger than the biasing current of the output stage (360 nA), highlighting the class-AB behavior of the circuit. This also explains why the folded cascode, in class-A, cannot produce an acceptable output for this signal swing and frequency.
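The peak output current quoted for the sinusoidal test can be cross-checked with a short calculation. It assumes an ideal sinusoidal output across a purely capacitive load, so that the peak current is $C_L \cdot 2\pi f \cdot V_{pp}/2$; the function name is illustrative.

```python
import math


def sine_peak_current(v_pp: float, freq_hz: float, c_load: float) -> float:
    """Peak capacitor current for a sinusoidal output: C_L * 2*pi*f * (V_pp / 2)."""
    return c_load * 2 * math.pi * freq_hz * (v_pp / 2)


i_peak = sine_peak_current(v_pp=0.5, freq_hz=10e3, c_load=200e-12)
ratio = i_peak / 360e-9   # relative to the 360 nA output-stage bias current
print(f"peak current = {i_peak * 1e6:.2f} uA, {ratio:.1f} times the bias current")
```

The ideal estimate lands close to the simulated 3.1 µA and the quoted factor of 8.6, which supports the class-AB interpretation given in the text.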
Simulations at 1 kHz from 300 mVpp to 900 mVpp input signal swing show rail-to-rail behavior with high linearity (54 dB at 900 mVpp).
The step response is shown in Figure 9. The transient is monotonic, owing to the large phase margin, and the peak current is significantly larger than the output biasing current. In fact, the slew-rate is 30 V/ms on both rising and falling edges, which corresponds to 6 µA over a 200 pF load, 17 times larger than the quiescent current.
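The relation between slew-rate and output current used above is simply $I = SR \cdot C_L$, as the following sketch shows. The quiescent current is assumed to be the 360 nA output-stage bias current mentioned for the sinusoidal test; the function name is illustrative.

```python
def slew_current(slew_rate_v_per_s: float, c_load: float) -> float:
    """Current needed to slew a capacitive load: I = SR * C_L."""
    return slew_rate_v_per_s * c_load


i_slew = slew_current(30e3, 200e-12)   # 30 V/ms = 30e3 V/s, 200 pF load
ratio = i_slew / 360e-9                # assumed quiescent current of 360 nA
print(f"slewing current = {i_slew * 1e6:.1f} uA, about {ratio:.0f} times the quiescent current")
```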
circuit. This also explains why the folded cascode, in class-A, cannot produce a decent output for this signal swing and frequency.
Simulations at 1 kHz from 300 mVpp to 900 mVpp input signal swing show rail-torail behavior with high linearity (54 dB at 900 mVpp). The step response is shown in Figure 9. The transient is monotonic, owing to the large phase margin, and the peak current is significantly larger than output biasing current. In fact, the slew-rate is 30 V/ms on both rising and falling edges, which corresponds to 6 µ A over a 200 pF load, 17 times larger than the quiescent current. The output noise spectral density is shown in Figure 10. The noise density for white noise is slightly higher than in the reference cascode amplifier (not shown), because the input stage is smaller in the proposed OTA: = 3, = 6. However, the average spectral noise density (total output noise power divided by the closed-loop bandwidth) is better in the proposed amplifier, due to its much larger bandwidth. In fact, in the conventional folded cascode amplifier, the contribution of flicker noise is dominant, since the noise corner frequency is at several kHz and therefore very close to the amplifier bandwidth. The output noise spectral density is shown in Figure 10. The noise density for white noise is slightly higher than in the reference cascode amplifier (not shown), because the input stage is smaller in the proposed OTA: K = 3, S = 6. However, the average spectral noise density (total output noise power divided by the closed-loop bandwidth) is better in the proposed amplifier, due to its much larger bandwidth. In fact, in the conventional folded cascode amplifier, the contribution of flicker noise is dominant, since the noise corner frequency is at several kHz and therefore very close to the amplifier bandwidth. Finally, PSRR and CMRR performance are reported in Figure 11. The CMRR and PSRR are very good, thanks to the small common-mode and supply gains, and the large differential gain. Positive and negative PSRR are almost symmetrical, owing to the complementary push-pull architecture. 
The results of the simulations in typical conditions are summarized in Table 2, where $A_0$, $f_u$, $m_\phi$, $I_{TOT}$, $V_{os}$, $V_{oPP}$, $V_{on}$, HDx, SNR and SNDR denote the DC gain, unity-gain frequency, phase margin, total bias current, offset voltage, peak-to-peak output voltage amplitude, output noise integrated between 1 Hz and 10 MHz, x-th order harmonic distortion, signal-to-noise ratio and signal-to-noise-and-distortion ratio, respectively. The reference folded cascode amplifier has S = 6 in order to have roughly the same power consumption, but its bandwidth and slew-rate are insufficient for processing the 500 mVpp, 10 kHz input sinusoid and square wave, so transient data for it are not reported in Table 2. Its gain is also significantly lower, whereas its stability margins are better, owing to the lower number of high-frequency poles and the much lower unity-gain frequency.
The results in Table 2 show that the proposed OTA exhibits 25 dB higher gain, 18 times larger bandwidth, and 30 times larger peak slew-rate with respect to the reference folded cascode OTA, even with a slightly lower power consumption and the same capacitive load. CMRR and PSRR (positive and negative) data are also very good, owing to the low common-mode and supply voltage gains toward the output. Closed-loop offset ($V_{os}$) is lower in our design, owing to the higher DC gain. The PSRR of the proposed amplifier is significantly better than that of the reference folded cascode, because amplification is obtained by current mirroring, and current mirrors are insensitive to supply voltage variations. Moreover, the area-normalized FOMs prove the effectiveness of the approach also when area consumption is a concern; the slight penalty in area is more than compensated by the large improvement in bandwidth and slew-rate.

Temperature and Supply Voltage Simulations
Parametric simulations in temperature were performed at −30, 0, 30, 60, 90 and 120 °C, and the best and worst results of the main performance indicators of the OTA in this temperature range are reported in Table 3, confirming the robustness of the OTA with respect to temperature variations, despite the adopted sub-threshold operation of the MOS devices. The bandwidth variation is compatible with sub-threshold operation, given the inverse relation between temperature and transconductance in sub-threshold MOS devices. Parametric simulations were also performed for supply voltages equal to 0.9, 1.0 and 1.1 V, i.e., over a ±10% variation of the nominal supply voltage. Results for 0.9 V and 1.1 V are reported in Table 4, showing that the amplifier is fundamentally insensitive to supply voltage variations, except for a slight dependence of slew-rate behavior on supply voltage, owing to the larger $V_{GS}$ voltages allowing larger currents in the devices.
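The inverse relation between temperature and transconductance mentioned above follows from the textbook weak-inversion expression g_m = I_D / (n · kT/q). The sketch below evaluates it at the simulated temperatures. It assumes a constant bias current and a temperature-independent slope factor n; the values I_D = 100 nA and n = 1.3 are hypothetical placeholders, not taken from the paper, and they cancel in the ratio.

```python
# Weak-inversion transconductance versus temperature: gm = I_D / (n * kT/q).
BOLTZMANN_J_PER_K = 1.380649e-23     # exact SI value
ELECTRON_CHARGE_C = 1.602176634e-19  # exact SI value


def thermal_voltage(temp_c: float) -> float:
    """Thermal voltage kT/q in volts for a temperature in degrees Celsius."""
    return BOLTZMANN_J_PER_K * (temp_c + 273.15) / ELECTRON_CHARGE_C


def gm_subthreshold(i_d_a: float, temp_c: float, slope_factor: float = 1.3) -> float:
    """Sub-threshold gm in siemens; slope_factor is an assumed value."""
    return i_d_a / (slope_factor * thermal_voltage(temp_c))


TEMPS_C = [-30, 0, 30, 60, 90, 120]  # simulated temperatures from the text
I_D_A = 100e-9                       # hypothetical bias current, for illustration only

gms = [gm_subthreshold(I_D_A, t) for t in TEMPS_C]
gm_ratio = gms[0] / gms[-1]          # coldest over hottest; depends only on T

for temp_c, gm in zip(TEMPS_C, gms):
    print(f"{temp_c:>4} C: gm = {gm * 1e6:.3f} uS")
print(f"gm(-30 C) / gm(120 C) = {gm_ratio:.3f}")
```

Under these assumptions the unity-gain frequency, which is proportional to g_m over the load capacitance, would scale by the same ratio of absolute temperatures. The actual spread in Table 3 also depends on how the bias current itself varies with temperature.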

Process and Mismatch Monte Carlo Simulations
Monte Carlo simulations were carried out using accurate statistical models of MOS transistors provided by the IC manufacturer, to assess the robustness of the proposed OTA to process and mismatch variations. Tables 5 and 6 show the results of Monte Carlo simulations referring to process and mismatch variations, respectively. Table 5 also reports process corner simulations with fast (F) and slow (S) CMOS devices. The amplifier is fundamentally insensitive to process variations, as the load capacitor is ideal (200 pF) and the transconductance depends on transistor ratios (K, H) and on the biasing current (owing to subthreshold biasing). Sensitivity to mismatches is higher, mostly due to mismatches in the current mirrors, causing variations in the total biasing current (and an offset voltage of 2.6 mVrms). Still, the amplifier is also robust and reliable under mismatch variations, thanks to the large size of the MOS devices.

Conclusions and Comparisons with the Literature
This paper proposes a novel architecture and design approach to boost the gain, bandwidth and slew-rate performance of operational transconductance amplifiers, and shows remarkable improvements with respect to a conventional folded cascode with the same load capacitor and power consumption. Adding intermediate stages that employ current mirrors with gain yields a polynomial increase (in the current mirrors' gain and the number of additional stages) in gain, bandwidth and peak slew-rate for the same power consumption and load. Hence, the use of current-mode gain stages allows the design of OTAs with extremely high FOMs. The resulting amplifier is shown to be robust to PVT variations and mismatches, and to have good CMRR, PSRR and noise performance, together with much higher gain (25 dB higher than a comparable folded cascode), bandwidth (about 20 times higher) and slew-rate (30 times higher), and a slight improvement in average noise power density. Table 7 shows a comparison with state-of-the-art OTAs taken from the literature. The proposed amplifier has the highest reported FOMs (both small- and large-signal) in the literature, large gain, and a small area footprint.
It is evident that the FOMs of the proposed amplifier are much larger than those of any comparable amplifier. The best in the literature so far, [6], has about one half of the small-signal FOM and a 17% lower large-signal FOM. Simulations show that the proposed architecture achieves record-breaking bandwidth and slew-rate performance, besides large gain increases, for low-power amplifiers. Moreover, the proposed architecture shows remarkable improvement in terms of area-normalized FOMs.
Indeed, the gate-driven approach allows the layout area to be minimized, given that, unlike body-driven amplifiers, it does not require separate wells for the devices. As a result, the proposed architecture shows the highest $\mathrm{S(L)FOM}_N$, and therefore proves to be an excellent choice when bandwidth, slew-rate and area consumption are all important requirements.
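As a rough illustration of the figures of merit compared above, the sketch below uses the definitions commonly adopted in the OTA literature: the small-signal FOM is the unity-gain frequency times the load capacitance divided by the total current, and the large-signal FOM replaces the unity-gain frequency with the slew-rate. These definitions and their units are assumptions here, since the section does not restate them. The total current of 1 µA is also an assumption, inferred from the quoted power of about 1 µW at a 1 V supply, so the results are indicative and need not match Table 7 exactly.

```python
# Current-based figures of merit as commonly defined in the OTA literature
# (assumed definitions; the exact ones used in the paper may differ).


def fom_small_signal(f_u_hz: float, c_load_f: float, i_tot_a: float) -> float:
    """Small-signal FOM in MHz*pF/mA: f_u * C_L / I_TOT."""
    return (f_u_hz / 1e6) * (c_load_f / 1e-12) / (i_tot_a / 1e-3)


def fom_large_signal(sr_v_per_s: float, c_load_f: float, i_tot_a: float) -> float:
    """Large-signal FOM in (V/us)*pF/mA: SR * C_L / I_TOT."""
    return (sr_v_per_s / 1e6) * (c_load_f / 1e-12) / (i_tot_a / 1e-3)


F_U_HZ = 141e3     # unity-gain frequency, 141 kHz (from the abstract)
SR_V_PER_S = 30e3  # peak slew-rate, 30 V/ms (from the abstract)
C_LOAD_F = 200e-12 # load capacitance, 200 pF (from the abstract)
I_TOT_A = 1e-6     # ASSUMED: about 1 uW at 1 V; the exact value is in Table 2

fom_s = fom_small_signal(F_U_HZ, C_LOAD_F, I_TOT_A)
fom_l = fom_large_signal(SR_V_PER_S, C_LOAD_F, I_TOT_A)

print(f"small-signal FOM: {fom_s:.0f} MHz*pF/mA")
print(f"large-signal FOM: {fom_l:.0f} (V/us)*pF/mA")
```

Because both FOMs divide by the total current, any improvement in bandwidth or slew-rate obtained at constant power translates directly into a proportional FOM increase, which is why the current-gain stages have such a large effect on these metrics.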