A 0.6-µW Chopper Amplifier Using a Noise-Efficient DC Servo Loop and Squeezed-Inverter Stage for Power-Efficient Biopotential Sensing

To realize an ultra-low-power and low-noise instrumentation amplifier (IA) for neural and biopotential signal sensing, we investigate two design techniques. The first technique is a noise-efficient DC servo loop (DSL); the conventional DSL has been shown to be a major noise contributor. The proposed approach offers several advantages: (i) both the electrode offset and the input offset are rejected, (ii) a large capacitor is not needed in the DSL, (iii) by removing the charge dividing effect, the input-referred noise (IRN) is reduced, (iv) the noise from the DSL is further reduced by the gain of the first stage and by the transconductance ratio, and (v) the proposed DSL allows interfacing with a squeezed-inverter (SQI) stage. The proposed technique reduces the noise from the DSL to 12.5% of the overall noise. The second technique optimizes noise performance using an SQI stage. Because the supply voltage of the SQI stage is reduced to a saturation limit of 2VDSAT, the bias current can be increased to reduce noise while maintaining low power consumption. The challenge of handling the mismatch in the SQI stage is addressed using a shared common-mode feedback (CMFB) loop, which achieves a common-mode rejection ratio (CMRR) of 105 dB. Using the proposed techniques, a capacitively-coupled chopper instrumentation amplifier (CCIA) was fabricated in a 0.18-µm CMOS process. The measured results show a relatively low noise density of 88 nV/rtHz and an integrated noise of 1.5 µVrms. These results correspond to a favorable noise efficiency factor (NEF) of 5.9 and a power efficiency factor (PEF) of 11.4.


Introduction
Recently, there has been growing interest in wearable, portable, and personal health monitoring. By detecting abnormal health conditions during daily monitoring, this approach provides a new method of preventive healthcare. Monitoring human biopotential and neural signals is also important for early diagnosis and medical treatment [1]. In biopotential monitoring applications, sensors and interfaces that provide high-quality signals are of great importance. In addition, these sensor devices demand both a long operating time and a compact form factor. Batteries are widely used; however, they require frequent recharging or replacement, and their size is limited in some applications such as implantable sensors. To achieve a compact form factor by reducing the volume of the battery, low power consumption is in great demand for wearable and portable sensor devices.
Signals from humans have an amplitude of around 1 mV for an electrocardiogram (ECG) and from 10 to 100 µV for an electroencephalogram (EEG) over a frequency band from 0.5 to 150 Hz [2]. The local field potential (LFP) has a typical amplitude of 1 mV over 1 to 200 Hz. These low-frequency signals must first be amplified before any signal processing can be applied. One issue with amplification is the overlap of these signals with 1/f noise. To mitigate the effect of 1/f noise, a chopping technique can be applied to instrumentation amplifiers (IAs) [3–12]. Another issue is the electrode offset voltage V EOS generated at the tissue-electrode interface by electrochemical effects. To reject V EOS , a DC servo loop (DSL) has been used in a capacitively-coupled chopper instrumentation amplifier (CCIA) [3]. This approach has the advantage of removing bulky external capacitors. However, the DSL achieves the V EOS rejection at the cost of increased input-referred noise (IRN). The IRN of a CCIA can be expressed as [3]
$$\overline{V_{n,in}^2} = \left(\frac{C_{in} + C_{fb} + C_{P} + C_{hp}}{C_{in}}\right)^2 \overline{V_{n,in,Gm}^2} \qquad (1)$$
where $\overline{V_{n,in,Gm}^2}$ represents the input-referred noise of a transconductor. C in , C fb , and C P are the input, feedback, and parasitic capacitors connected to the input of the CCIA, respectively, and C hp is the capacitor in the DSL that creates a high-pass corner to reject V EOS . Equation (1) shows that a large C hp increases the IRN through charge division, which makes the DSL a major noise contributor; previous studies often neglected this issue. For example, the IRN increases from 0.7 to 6.7 µV rms [4] and from 2.8 to 4.7 µV rms [6] when the DSL is enabled. Thus, in these cases, the DSL contributes 89.5% [4] and 40.4% [6] of the overall noise. The increased noise significantly degrades both the noise efficiency factor (NEF) [7] and the power efficiency factor (PEF) [10].
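The DSL noise shares quoted above can be checked with a few lines of arithmetic. The sketch below assumes that the share is taken as the increase in rms noise divided by the rms noise with the DSL enabled, which is the convention that matches the quoted figures; the function names are illustrative and do not come from the cited works. A noise-power-based share is included only for comparison and is not used in the text.

```python
def dsl_share_rms(irn_off_uv, irn_on_uv):
    """DSL share as the rms-noise increase relative to the total (assumed convention)."""
    return (irn_on_uv - irn_off_uv) / irn_on_uv


def dsl_share_power(irn_off_uv, irn_on_uv):
    """Alternative share based on noise power, shown for comparison only."""
    return 1.0 - (irn_off_uv / irn_on_uv) ** 2


# (reference, IRN with DSL disabled, IRN with DSL enabled), both in uVrms
cases = [("[4]", 0.7, 6.7), ("[6]", 2.8, 4.7)]

for ref, off, on in cases:
    rms_share = 100.0 * dsl_share_rms(off, on)
    pwr_share = 100.0 * dsl_share_power(off, on)
    print(f"{ref}: rms-based share {rms_share:.1f} %, power-based share {pwr_share:.1f} %")
```

Under the rms-based convention, the two cases come out within rounding of the 89.5% and 40.4% stated above.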
Several methods have been proposed to improve the DSL. In [5], a digitally-assisted foreground calibration is used to allow the DSL to handle residual offset. In [7], a dual DSL which consists of coarse and fine DSLs reduces the value of C hp from 670 to 100 fF. In [8], the output of a DSL is connected to the cascode branch of a transconductor to mitigate the charge dividing effect. Nevertheless, these CCIAs consume 3.48 µW [7] and 2.13 µW [8], which results in relatively high PEFs of 18.3 and 10.5 (over a 10-kHz bandwidth), respectively. The results indicate that previous work suffers from high noise contribution from the DSL and achieves a relatively low noise-power efficiency.
In this paper, we investigate two design techniques to realize a 0.6-µW chopper amplifier with a PEF of 11.4 over a 200-Hz bandwidth. The first technique optimizes noise performance using a squeezed-inverter (SQI) stage. Because the SQI stage allows the supply voltage to be reduced to a saturation limit of 2V DSAT , its bias current can be increased to reduce noise. The second technique reduces the relatively high noise from the DSL. Unlike conventional DSLs, which are connected to the input of the CCIA through C hp , we apply the output of the DSL to the body of a transconductor. The proposed approach not only removes the charge dividing effect but also reduces the noise by the transconductance ratio and the open-loop gain. Furthermore, this approach solves the problem of interfacing the DSL to the SQI stage, which has a different supply voltage. Using this approach, the noise contribution of the DSL is reduced to 12.5%. The fabricated CCIA achieves a relatively low noise density of 88 nV/rtHz with an integrated noise of 1.5 µV rms . The result corresponds to a favorable NEF of 5.9 and a PEF of 11.4 while consuming only 0.68 µW, demonstrating a power-efficient low-noise amplifier.
Figure 1 shows the schematic of the proposed CCIA. The input transconductor G m1 is realized using an SQI stage biased at V DD,L = 0.2 V. The transconductors G m2 , G m3 , and G m4 are folded-cascode, two-stage opamp, and common-source stages biased at V DD,H = 0.8 V, respectively. Transconductor G m3 is used as the integrator in the DSL. We consider the input offset voltages V OSi (i = 1 to 3) for G mi . The input V in is up-modulated to the chopping frequency f CH by the chopper CH in , then down-modulated to baseband by CH out . The common-mode (CM) voltage V CM2 = V DD,H /2, which bypasses the chopper CH out , is used to bias G m2 through pseudo-resistors R b1,2 . Miller capacitors C m1,2 are used for stability.
The mid-band gain of the CCIA is defined by input capacitor C in1,2 and feedback capacitor C fb1,2 . The current consumptions of G m1 , G m2 , G m3 , and G m4 are 1.61 µA, 60 nA, 210 nA, and 80 nA, respectively. Although the SQI stage provides low-noise operation, interfacing it with the DSL poses a challenge. This is because the input range of the SQI stage is limited by V DD,L = 0.2 V, while the DSL senses the output V out with a wide swing. We believe that this is one reason why previous studies do not implement a DSL [10]. To solve this problem, we modify the conventional DSL by connecting the output V O,DSL of the DSL to G m2 using the body terminal. We note that this approach is different from the previous approach wherein the output of the DSL is connected to the virtual ground node of the input transconductor through C hp [3,4,6]. The proposed approach offers several advantages: (1) because the proposed DSL uses G m2 instead of C hp , the charge dividing effect is removed and noise from the DSL is reduced, (2) the noise from the DSL is further reduced by the open-loop voltage gain A V1 of G m1 as well as by the square of the transconductance ratio, and (3) by rejecting both V EOS and V OS2 , the output offset is suppressed.
Figure 2 shows the simplified model of the proposed CCIA. Offset voltage V OS1 creates an output ripple due to finite amplifier bandwidth. An expression for the amplitude of this output ripple is given in [4]. To suppress this ripple, we use capacitors C b1,2 in front of CH out . Because V OS1 is blocked by C b1,2 , the residual ripple appearing at f CH can be neglected. Both V OS2 and V EOS create an offset V out,OS at the output. The rejection of V EOS and V OS2 is explained as follows. When V EOS is up-modulated to f CH , it is partially suppressed by C fb1,2 at the virtual input node of G m1 . The residual offset V EOS,ω existing at f CH can be expressed as

Design
the overall open-loop voltage gain of the amplifier. This residual offset is amplified by A V1 . Simulation results show that A V1 is 29 dB with a low-pass corner of 954 kHz. Additionally, a high-pass corner frequency of 1 Hz is created by C in1,2 and bias resistor R 1,2 inside G m1 (see Figure 3). Then, V EOS,ω is down-converted by CH out to create an offset voltage V EOS,Gm2 = A V1 V EOS,ω at the input of G m2 . The total offset voltage seen at the input of G m2 is therefore V OS2,tot = V OS2 + V EOS,Gm2 . Transconductors G m2 and G m4 have a low-pass characteristic with a 3-dB frequency of about 10 Hz, and V OS2,tot generates an offset current I O,Gm2 at the output of G m2 . The offset current is integrated by G m4 , which creates the output offset V out,OS . This offset is sensed by the DSL, and V O,DSL is applied to the body of the differential pair of G m2 . The body transconductance g mb1,2 converts V O,DSL into a current that counteracts I O,Gm2 . The DSL continues integrating, and V out,OS is suppressed by a factor of 1/LG(s), where the loop gain LG can be expressed as LG(s) = g mb1,2 /(s² C m1,2 R DSL1,2 C DSL1,2 ).
The selection of f CH involves tradeoffs among input impedance, output ripple, and residual offset. V out,ripple can be reduced by increasing f CH . One drawback of increasing f CH is that it reduces the input impedance Z in . In addition, charge injection and clock feed-through during the switching of the chopper become greater [13]. To determine a suitable f CH , we perform periodic steady-state (PSS) and periodic noise (PNOISE) simulations. Considering these tradeoffs and the amplifier bandwidth, we select f CH = 10 kHz.
Figure 3a shows a schematic of G m1 implemented using an SQI stage [10] modified to improve the common-mode rejection ratio (CMRR). The transistors in the SQI stage are biased in the subthreshold region using V DD,L = 0.2 V. The IRN of G m1 can be expressed as

Circuit Implementation
where I DC = 800 nA is the bias current, g m,n and g m,p are the transconductances of M n1 and M p1 , respectively, U th = 26 mV is the thermal voltage, and n = 1.5 is the subthreshold factor [9]. The SQI stage reduces the noise by increasing I DC . Because the supply voltage is reduced to a saturation limit of 2V DSAT ≈ 0.2 V, both low-noise and low-power operation can be achieved.
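To give a feel for the numbers quoted above, the following sketch estimates the transconductance of one subthreshold transistor and the power drawn by the SQI branch pair. It assumes the standard weak-inversion relation g m = I D /(n U th ), assumes that I DC = 800 nA flows in each of two branches, and assumes a thermal-noise-limited voltage noise density that scales as 1/√g m . These assumptions and all function names are illustrative and are not taken from the paper's noise expression.

```python
import math

U_TH = 26e-3    # thermal voltage in volts (from the text)
N_SUB = 1.5     # subthreshold factor (from the text)
I_DC = 800e-9   # bias current in amperes (from the text, assumed per branch)
V_DD_L = 0.2    # SQI supply in volts (from the text)


def gm_weak_inversion(i_d, n=N_SUB, u_th=U_TH):
    """Weak-inversion transconductance, g_m = I_D / (n * U_th)."""
    return i_d / (n * u_th)


def relative_noise_density(current_scale):
    """Thermal-noise-limited density relative to baseline when I_DC is scaled."""
    return 1.0 / math.sqrt(current_scale)


def sqi_power_w(i_dc=I_DC, vdd=V_DD_L, branches=2):
    """Supply power of the SQI branches (two branches assumed)."""
    return branches * i_dc * vdd


gm = gm_weak_inversion(I_DC)
print(f"g_m per transistor: {gm * 1e6:.1f} uS")
print(f"noise density after doubling I_DC: x{relative_noise_density(2.0):.2f}")
print(f"SQI branch power: {sqi_power_w() * 1e6:.2f} uW")
```

Under these assumptions, doubling the bias current lowers the noise density by about 30%, while the 0.2-V supply keeps the added power small. This is the tradeoff the SQI stage exploits.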
To generate I DC , bias voltages beyond the supply rails are used for M n1 and M p1 . The bias voltage for M n1 is pushed above the supply rail by using a common-mode feedback (CMFB) loop. The bias voltage V NEG for M p1 is pushed below ground by using a negative voltage generator, and is applied to the gate of M p1 through a pseudo-resistor R 3,4 . Because the transistors work in the subthreshold region without a tail current source, balancing the bias current for the input pair is challenging. To address this, we use a shared CMFB loop. Figure 3b shows the schematic of the CMFB circuit for the SQI stage. It monitors the CM voltage of outputs V 1,ON and V 1,OP . Then, the output V CMFB1 of the CMFB circuit is applied to the gates of M n1,2 through pseudo-resistors R 1,2 . Because any change in V CMFB1 affects the input pair by the same amount, this approach provides balanced bias currents for the SQI stage.
Figure 4a shows the schematic of the negative voltage generator. It consists of a 1/10-scaled current replica, two switched-capacitor (SC) paths, a level shifter, and a folded-cascode (FC) amplifier. The SC network consists of the main path and a low-noise replica path. The FC amplifier and the main SC path generate the bias voltage V G for M 1B by regulating V D to V DD,L /2. The replica path is responsible for copying V G to generate V NEG . The current mirror sets a current of 80 nA through M 1B , which is 1/10 of the current of M p1,2 . The negative voltage generator draws 18 nA from V DD,H and 80 nA from V DD,L .
Compared to previous work which uses two separate CMFB loops [10], the proposed approach increases the CMRR from 85 to 105 dB. This indicates that the proposed shared CMFB loop is effective in improving CMRR. Figure 5c shows the gain of the SQI stage as a function of V DD,L at different temperatures. Because the transistors are biased in the subthreshold region, the increased threshold voltage with temperature reduces the gain [10].
We note that the SQI stage still provides a gain >20 dB when V DD,L is reduced to 0.15 V at 70 °C. At room temperature, the SQI stage achieves a gain of 29 dB with V DD,L = 0.2 V.
Figure 6 shows a schematic of G m2 with the body-controlled DSL. The bias current of G m2 is 40 nA. The CMFB circuit (not shown) generates the output V CMFB2 using a 20 nA bias current (see Table 1 for the power consumed by the CMFB circuits). The overall current of G m2 is only 60 nA. Figure 7a shows a schematic of the DSL. R DSL1,2 and C DSL1,2 are the resistors and capacitors in the DSL, respectively. R DSL1,2 is a variable pseudo-resistor controlled by V PR , which is realized by cascading floating PMOS transistors. The input of G m3 is associated with offset V OS3 . Voltage V OS3 can disturb V out of the CCIA similarly to the other offsets (V OS1 , V OS2 , V EOS,Gm2 ). To reduce the effect of V OS3 , two choppers, CH D1 and CH D2 , are added to the integrator. Because the bandwidth of the integrator is relatively narrow (~30 mHz), V OS3 is up-modulated outside the integrator's bandwidth by CH D2 . Figure 7b shows a schematic of the two-stage opamp for G m3 . The first stage is biased using 5 nA. The second stage is biased at 200 nA for enhanced swing. The CMFB circuit generates V CMFB3 using a 5 nA bias current. The overall current is 210 nA.
The transfer function of the DSL has a low-pass characteristic for V out . It can be expressed as −g mb1,2 /(sR DSL1,2 C DSL1,2 ), where g mb1,2 is the body transconductance integrated into G m2 . Within the feedback loop, the DSL creates a high-pass corner to reject V EOS . Using the condition C fb1,2 << C in1,2 , the transfer function of the CCIA can be expressed as where g m1,2 is the transconductance of the input pair of G m2 , η = (g mb1,2 /g m1,2 ) ≈ 0.25, ω ugb = 2πf ugb = 1/(R DSL1,2 C DSL1,2 ) is the unity-gain frequency of the integrator, and β = C fb1,2 /C in1,2 is the feedback factor.
Using (2), we obtain a high-pass corner frequency f hp = (η/βA V1 ) f ugb . Because f hp created by the DSL depends on the value of the pseudo-resistor, we investigate the variability of R DSL1,2 . Figure 8a shows the value of R DSL1,2 as a function of temperature for various V PR . The resistance increases with V PR while it decreases with temperature. Figure 8b shows the statistical distribution of the resistance obtained from Monte Carlo simulations at 27 °C and V PR = 0.4 V. The result shows that the average value of R DSL1,2 is 34.1 GΩ with a standard deviation of 1.6 GΩ. Figure 9 shows a schematic of the bias generator. It consists of a constant-g m current reference and six branches to generate the bias voltages for the amplifier. The overall current consumption is 47.5 nA.
The proposed CCIA uses a narrow margin for the stacked transistors in the SQI stage. Therefore, we investigate the effect of supply and temperature on the performance of the amplifier. Figure 10 shows the effect of V DD,L on the bias current (SQI stage only), noise, and bandwidth. When V DD,L is increased, it is tracked by V D and V G in the negative voltage generator, which increases V NEG to maintain the bias current. When V DD,L is reduced below 0.15 V, the two stacked transistors are driven into the deep subthreshold region, which reduces the current and the gain. We note that the CCIA still operates with an integrated noise < 1.5 µV rms when V DD,L is reduced to 0.15 V. The amplifier bandwidth gradually increases with V DD,L , which agrees with the previous result [10].
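The high-pass corner relation f hp = (η/βA V1 ) f ugb quoted earlier in this section can be evaluated numerically. The sketch below uses η ≈ 0.25, A V1 = 29 dB, and the capacitor values C in1,2 = 4 pF and C fb1,2 = 40 fF given in the noise analysis. The value of C DSL1,2 is not stated in the text, so the 10 pF used here is purely hypothetical and serves only to show how R DSL1,2 tunes the corner; function names are likewise illustrative.

```python
import math

ETA = 0.25        # g_mb1,2 / g_m1,2 (from the text)
A_V1_DB = 29.0    # open-loop gain of G_m1 in dB (from the text)
C_IN = 4e-12      # input capacitor in farads (from the noise analysis)
C_FB = 40e-15     # feedback capacitor in farads (from the noise analysis)


def db_to_linear(gain_db):
    """Convert a voltage gain in dB to a linear ratio."""
    return 10.0 ** (gain_db / 20.0)


def hp_corner_ratio(eta=ETA, c_in=C_IN, c_fb=C_FB, a_v1_db=A_V1_DB):
    """Ratio f_hp / f_ugb = eta / (beta * A_V1), with beta = C_fb / C_in."""
    beta = c_fb / c_in
    return eta / (beta * db_to_linear(a_v1_db))


def integrator_ugb_hz(r_dsl, c_dsl):
    """Unity-gain frequency of the DSL integrator, 1 / (2 * pi * R * C)."""
    return 1.0 / (2.0 * math.pi * r_dsl * c_dsl)


ratio = hp_corner_ratio()
print(f"f_hp / f_ugb = {ratio:.3f}")

R_DSL = 34.1e9         # mean pseudo-resistance at V_PR = 0.4 V (from the text)
C_DSL_ASSUMED = 10e-12  # hypothetical value, not given in the text
f_ugb = integrator_ugb_hz(R_DSL, C_DSL_ASSUMED)
print(f"f_ugb = {f_ugb:.3f} Hz, f_hp = {ratio * f_ugb:.3f} Hz (with assumed C_DSL)")
```

With the stated parameters, the ratio is slightly below one, so the high-pass corner of the CCIA nearly coincides with the unity-gain frequency of the integrator and moves in proportion to 1/R DSL1,2 .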
Because V DD,L is relatively low, external electromagnetic interference can affect the sensor interface. In the proposed CCIA, the differential input signal V IN is up-modulated to f CH while the CM signal is not chopped. Therefore, chopping provides some rejection of external interference. Interference near f CH can still affect the CCIA; however, this frequency is well beyond the amplifier bandwidth. When the CCIA is used for the sensor readout, the theoretical input range calculated using a gain of 40 dB and the maximum output swing of 0.8 V pp is 8 mV pp , which is comparable to the measured value of 6 mV. Because the input is capacitively coupled, it blocks DC levels up to the voltage rating of C in1,2 .
Figure 11 shows the effect of temperature on the amplifier. The bias current increases with temperature, as expected from the constant-g m current reference, which increases V NEG . The two temperature-dependent parameters of the subthreshold current are the mobility and the threshold voltage [14]. The increased threshold voltage with temperature reduces the gain A v . The bandwidth can be expressed as BW = ω p (1+βA v ), where ω p is the 3-dB frequency and β is the feedback factor. Accordingly, the bandwidth decreases as the temperature increases [15,16]. The amplifier achieves an integrated noise of less than 2 µV rms over the temperature range from −5 °C to 45 °C.
The IRN of the CCIA, V 2 n,in , can be expressed as where C tot = C in1,2 + C fb1,2 + C p , V 2 n,in,Gm1 and V 2 n,in,Gm2 are the input-referred noise of G m1 and G m2 , respectively, V 2 n,out,DSL is the output-referred noise of the DSL, and g mi represents the transconductance of the transistors in G m2 . The noise from the DSL includes the thermal noise of R DSL1,2 and the noise V n,in,OTA = 1.8 nV/√Hz of the two-stage opamp.
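The input-range estimate given earlier in this section follows directly from the closed-loop gain and the output swing. The sketch below reproduces that arithmetic and cross-checks the 40-dB gain against the capacitor ratio C in1,2 /C fb1,2 , using the 4 pF and 40 fF values from the noise analysis. The function names are illustrative, and the calculation ignores any headroom loss at the output, which is one possible reason the measured range is smaller.

```python
import math


def db_to_linear(gain_db):
    """Convert a voltage gain in dB to a linear ratio."""
    return 10.0 ** (gain_db / 20.0)


def midband_gain_db(c_in, c_fb):
    """Closed-loop mid-band gain of a capacitive-feedback amplifier, C_in / C_fb."""
    return 20.0 * math.log10(c_in / c_fb)


def max_input_swing_vpp(out_swing_vpp, gain_db):
    """Largest input swing that keeps the output within its swing limit."""
    return out_swing_vpp / db_to_linear(gain_db)


gain_db = midband_gain_db(4e-12, 40e-15)
v_in_max = max_input_swing_vpp(0.8, gain_db)
print(f"mid-band gain: {gain_db:.1f} dB")
print(f"theoretical input range: {v_in_max * 1e3:.1f} mVpp")
```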
We note that V 2 n,out,DSL is not only multiplied by (g mb1,2 /g m1,2 )² << 1, but is also reduced by A V1 = 29 dB. Using the values g m1,2 = 0.7 µS, g m3,4 = 0.35 µS, g m9,10 = 0.7 µS, C in1,2 = 4 pF, C fb1,2 = 40 fF, and C p = 66.5 fF, we obtain an input-referred noise density of 84.2 nV/√Hz. Using the shot noise model [10], we obtain a similar value. Over the signal bandwidth of 200 Hz, the integrated noise contributions from G m1 , G m2 , the DSL, and the other blocks are 44.9%, 39.1%, 12.5%, and 3.5%, respectively.
Figure 12 shows a microphotograph of the CCIA fabricated using a 180-nm CMOS process. The core area is 0.19 mm² . The supply voltages V DD,L and V DD,H are generated using external power supplies. Figure 13 shows the measured frequency response of the CCIA. The result shows a mid-band gain of 40 dB with a 3-dB bandwidth of 800 Hz. The high-pass corner f hp was successfully created using the proposed DSL and varies from 0.36 to 2.4 Hz when V PR is changed from 0.68 to 0.35 V. Figure 14 shows a measured low-frequency CMRR > 105 dB. The power supply rejection ratios (PSRRs) measured at V DD,L and V DD,H show that low-frequency PSRR L > 80 dB and PSRR H > 75 dB, respectively.
Figure 15 shows the measured noise spectral density. The input-referred noise density is 88 nV/rtHz, which is slightly higher than the calculated value of 84.2 nV/rtHz. When the DSL is enabled, the noise integrated from 1 to 200 Hz increases from 1.3 to 1.5 µV rms . We note that the noise contribution from the DSL is just 12.5%, which is much lower than the previous results of 89.5% [4] and 40.4% [6]. Figure 16 shows the measured output of the CCIA for prerecorded human EEG (~100 µV) and ECG (~1 mV) input signals [17]. Table 1 shows the power breakdown of the proposed CCIA. Table 2 shows a performance comparison with the state of the art. The tradeoff between the noise and power can be evaluated using PEF as

Measured Results
$$\mathrm{PEF} = V_{ni,rms}^2 \cdot \frac{2 P_{DC}}{\pi \cdot U_{th} \cdot 4kT \cdot BW}$$
where V ni,rms is the input-referred root-mean-square (rms) noise voltage, P DC is the power consumption, BW is the amplifier bandwidth, U th is the thermal voltage, k is the Boltzmann constant, and T is the absolute temperature. The previous approaches [4,7,12] use relatively high currents to reduce noise. Because a high supply voltage V DD > 1 V is used except in [6], the large power consumption >1.8 µW leads to a relatively high PEF. By using the SQI stage with an ultra-low voltage, the proposed CCIA achieves a competitive noise performance of 1.5 µV rms at a relatively low power of 0.61 µW (0.68 µW including bias generators). Our work achieves a good PEF of 10.2 (11.4 with bias generators), which is the lowest of the works shown in Table 2. In addition, the proposed CCIA has the lowest noise contribution from the DSL, 12.5%. The work in [10] achieves a good NEF/PEF = 2.1/1.6; however, that design does not include a DSL, so a direct comparison is difficult. Although the dual-supply approach requires an additional buck converter, a high-efficiency (>80%) converter consuming sub-nW power can be used for the voltage step-down [18,19].
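The reported PEF values can be approximately reproduced from the measured noise, power, and bandwidth. The sketch below assumes the commonly used definition PEF = V ni,rms ² · 2P DC /(π · U th · 4kT · BW), with U th = 26 mV and T = 300 K; the temperature is an assumption, since it is not stated in the text. It also sums the supply power of the four transconductors from the currents and supply voltages quoted in the design section, as a rough cross-check of the 0.61-µW figure. Function names are illustrative.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K


def pef(v_ni_rms, p_dc, bw_hz, u_th=26e-3, temp_k=300.0):
    """Power efficiency factor under the assumed standard definition."""
    return (v_ni_rms ** 2) * 2.0 * p_dc / (math.pi * u_th * 4.0 * K_B * temp_k * bw_hz)


def core_power_w(vdd_l=0.2, vdd_h=0.8):
    """Supply power of G_m1 (on V_DD,L) plus G_m2, G_m3, G_m4 (on V_DD,H)."""
    i_gm1 = 1.61e-6
    i_high = 60e-9 + 210e-9 + 80e-9
    return vdd_l * i_gm1 + vdd_h * i_high


V_NI_RMS = 1.5e-6  # integrated input-referred noise in Vrms (from the text)
BW_HZ = 200.0      # signal bandwidth in Hz (from the text)

print(f"PEF without bias generators: {pef(V_NI_RMS, 0.61e-6, BW_HZ):.1f}")
print(f"PEF with bias generators:    {pef(V_NI_RMS, 0.68e-6, BW_HZ):.1f}")
print(f"core power from currents:    {core_power_w() * 1e6:.2f} uW")
```

Under these assumptions, the computed PEF values land within about 1% of the reported 10.2 and 11.4; the small residual difference is consistent with rounding of the noise figure and the assumed temperature.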

Conclusions
In this paper, we investigated a sub-µW chopper amplifier using a noise-efficient DSL and a power-efficient SQI stage. The proposed DSL not only removes the charge dividing effect but also reduces its noise contribution by both the transconductance ratio and the open-loop gain. Using the proposed approach, the noise contribution from the DSL is reduced to 12.5%, which is much lower than the values seen in previous work. For power efficiency, we use an SQI stage with a supply voltage reduced to the 2V DSAT saturation limit. The challenges of biasing the SQI stage and interfacing it with a DSL in a different supply domain are addressed. Measurement of the fabricated CCIA shows an IRN of 1.5 µV rms with the DSL enabled. The noise density is 88 nV/rtHz at a 40 dB gain when consuming 0.6 µW. The PEF is 11.4, which compares favorably with the state of the art.
Author Contributions: X.T.P. designed the circuit, performed the experimental work, and wrote the manuscript. N.T.N. performed noise analysis and revised the manuscript. V.T.N. performed circuit simulations and revised the manuscript. J.-W.L. conceived the project, organized the paper content, and edited the manuscript. All authors have read and agreed to the published version of the manuscript.