A Low-Distortion 20 GS/s Four-Channel Time-Interleaved Sample-and-Hold Amplifier in 0.18 μm SiGe BiCMOS

Abstract: This paper presents a 20 GS/s four-channel time-interleaved sample-and-hold amplifier (SHA), which aims to improve the harmonic distortion performance, eliminate the common-mode voltage fall in the track-to-hold transition, and ease the calibration of timing mismatch among the sampling channels. In the data path, the harmonic distortion of the track-hold switch is reduced by introducing a distortion-improving resistor into the switched emitter follower. The common-mode voltage fall is eliminated by an inserted delay-regulating transistor. Additionally, broadband data buffers are utilized to further guarantee a wide bandwidth. In the clock path, an interpolator-based phase regulator is implemented in the analog domain to calibrate the timing mismatch, hence avoiding the large area cost and complicated algorithms of digital-domain calibration. The SHA is fabricated in a 0.18 μm SiGe BiCMOS process, and the experimental results show that it achieves a bandwidth of 16 GHz and a total harmonic distortion of −39.6 to −51.8 dB with a −3 dBm input. By applying the proposed sampling phase regulator, the timing mismatch can be reduced to satisfy the requirement of 6-bit resolution at a 4 × 5 GS/s sampling rate. The proposed SHA shows prominent performance in both bandwidth and linearity, which makes it suitable for ultra-high-speed communication networks.


Introduction
The analog-to-digital converter (ADC) is an essential bridge between the analog domain and the digital domain. Due to the rapid growth of wireless and wireline communications, high-speed broadband ADCs with low distortion are required. For example, 10 to 30 GS/s ADCs with 6 to 8-bit resolution are demanded in PAM-4 wireline receivers [1][2][3]. The sample-and-hold amplifier (SHA) at the front end of an ADC is a key block, because it not only alleviates the bandwidth requirement for the following blocks, but also eliminates the influence of clock jitter and signal skew in the following ADCs [4]. In particular, the master-slave structured SHA, where the output is held constant over the whole clock cycle, supplies sufficient operating time for the ADCs [5]. It is difficult for a standalone SHA to achieve a sampling rate of tens of GS/s with high precision. The time-interleaved structure, in which several identical SHAs sample successively under the control of multiphase clocks, is an effective method to solve this problem. In theory, the sampling rate is multiplied by the number of channels. However, the offset mismatch, gain mismatch, and timing mismatch between interleaved channels limit how far the sampling rate can be raised in practice, and the timing mismatch is the most difficult of the three to calibrate [6][7][8][9].
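The interleaving idea described above can be illustrated with a minimal sketch. It assumes ideal, mismatch-free clocks; the function name and the use of exact fractions are illustrative choices, not part of the original design. Four channels running at 5 GS/s each, offset by a quarter of the channel period, produce a uniform 20 GS/s sampling grid.

```python
from fractions import Fraction

def interleaved_sample_times(num_channels, channel_rate_hz, samples_per_channel):
    """Return sorted (time, channel) pairs for an ideal time-interleaved sampler."""
    period = Fraction(1, channel_rate_hz)   # sampling period of one channel
    step = period / num_channels            # clock phase offset between adjacent channels
    events = []
    for ch in range(num_channels):
        for n in range(samples_per_channel):
            events.append((n * period + ch * step, ch))
    return sorted(events)

# Four channels at 5 GS/s each (values taken from the text).
events = interleaved_sample_times(4, 5_000_000_000, 3)
times = [t for t, _ in events]
gaps = {b - a for a, b in zip(times, times[1:])}

# With ideal clocks every gap is identical, so the aggregate rate is 1 / gap.
aggregate_rate = 1 / next(iter(gaps))
channel_order = [ch for _, ch in events]
print(f"aggregate rate: {float(aggregate_rate) / 1e9:.0f} GS/s, order: {channel_order[:8]}")
```

Any timing mismatch between channels would make the gaps unequal, which is why the clock phases have to be calibrated.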
A variety of multi-GS/s SHAs have been reported, but several aspects still need to be improved. First, most current sampling switches are based on the switched emitter follower (SEF), which is simple in structure and easy to realize but suffers from high distortion [10][11][12]. Usually, a small sampling capacitor is used to suppress harmonic distortion. However, decreasing the sampling capacitor deteriorates the signal feedthrough, the hold-mode voltage drop, and the signal-to-noise ratio (SNR). In other words, the properties of the SHA restrict each other, and it is difficult to make appropriate trade-offs by regulating the sampling capacitance alone. Second, a common-mode voltage fall is induced during the track-to-hold transition as a result of the mismatch between the turn-on and turn-off speeds of the sampling switches [13][14][15]. Although this common-mode fall can be cancelled by taking the output differentially, it still influences the performance of the following blocks, especially when their common-mode rejection ratio is not sufficiently high. Third, the timing mismatch in many interleaved SHAs is calibrated in the digital domain using complicated algorithms, which may cause a significant overhead in terms of area and power consumption [7,16,17].
In this paper, a four-channel time-interleaved 20 GS/s master-slave SHA fabricated in a 0.18 μm SiGe BiCMOS process is presented. The contributions of this paper are mainly focused on the following aspects. First, a distortion-improving resistor is introduced into the SEF-based track-hold switch, which reduces the distortion by enhancing the emitter negative feedback without deteriorating the signal feedthrough and hold-mode voltage drop performance. In other words, these mutually restrictive properties are decoupled, and the design difficulty is reduced. Second, a delay-regulating transistor is added in the path of the hold clock. By adjusting the size of this transistor, the common-mode voltage fall in the track-to-hold transition can be eliminated. Third, an interpolator-based phase regulator is implemented to remove the timing mismatch among the four interleaved channels. As a result, the large area consumption and complex mismatch calibration algorithms of the digital domain are avoided. Finally, the fabrication process is another important consideration besides the circuit structure. The adopted SiGe BiCMOS process is less costly than the InP process [5,18] and provides a higher cut-off frequency and better matching than the CMOS process [12,19]. The measurement results show that the proposed SHA achieves a 16 GHz bandwidth and a THD of −39.6 to −51.8 dB with a −3 dBm input.
The remainder of this paper is organized as follows. Section 2 shows the system architecture of the proposed SHA. The distortion-improved broadband track-hold switch, the interpolator-based phase regulator, and the data buffers are elaborated in Section 3. The measurement results are presented in Section 4, and the conclusions are drawn in Section 5.

SHA Architecture
Figure 1 presents the block diagram of the proposed SHA, which consists of a data path and a clock path. In the data path, the input signal is sampled successively by the four identical sampling channels. Each channel is composed of two sampling stages (master stage and slave stage) and three data buffers (input buffer, middle buffer, and output buffer). The two sampling stages are driven by a pair of clocks with a 180-degree phase shift (i.e., CKAM for the master stage and CKAS for the slave stage in channel A), and thus operate in track mode and hold mode alternately. When the master stage is tracking the input signal, the slave stage is in hold mode to keep the output constant. When the master stage turns to hold mode, the slave stage quickly tracks the voltage held in the master stage. As a result, the output is kept stable during almost the whole clock cycle, providing sufficient operating time for the following data conversion. The data buffers provide isolation and driving capability. In traditional designs, a small sampling capacitance is usually used in the track-hold switch to achieve low distortion at the cost of deteriorating the feedthrough performance and the hold-mode voltage drop [10]. It is rather difficult to make appropriate trade-offs among these properties at the same time. Besides, due to the nonlinearity of the clock switches, a common-mode voltage fall is induced at the transition from track mode to hold mode. In order to solve these two problems, a distortion-improving resistor and a delay-regulating transistor are implemented in the track-hold switches, which will be described in Section 3.
The temporal relation of the four interleaved channels is controlled by four-phase sampling clocks generated in the clock path. Compared with offset mismatch and gain mismatch, timing mismatch among the sampling clocks is more critical for bit resolution, as it directly affects the sampling accuracy and is more costly and more difficult to calibrate. For the sake of saving area and reducing algorithm complexity, it is necessary to take timing mismatch calibration into consideration in the architecture design instead of relying entirely on digital calibration. An interpolator-based phase regulator is utilized for this purpose, where the divided clock signals (I/IN, Q/QN) are the inputs and the driving clocks for each channel are the outputs, whose phase difference can be regulated by a control voltage (i.e., dly_a controls the phase relation among CKAM to CKDM or CKAS to CKDS).

Broadband Distortion-Improved Track-Hold Switch
In track mode, Q3 acts as an emitter follower, and the input is captured quickly and accurately by the sampling capacitor CH. In hold mode, Q2 is switched off by the low-level signal Track and Q1 is turned on by the high-level signal Hold. ISEF is steered to resistor R1, inducing a significant voltage drop on the base of Q3. As a result, Q3 is switched off, and the voltage is held by CH. Harmonic distortion is one of the most important parameters to characterize the performance of the track-hold switch. The distortion of the track-hold switch is mainly induced by the nonlinear base-emitter voltage modulation of Q3, as its collector current, Ic, has an exponential relationship to the base-emitter voltage, Vbe, which is expressed as follows:
$$I_c = I_S \exp\left(\frac{V_{be}}{V_T}\right)$$
where $I_S$ is the saturation current and $V_T$ is the thermal voltage.
Compared with the traditional design, an extra resistor Rt is placed in series between the Q3 emitter and the sampling capacitor CH. Applying Volterra analysis and negative feedback theory, closed-form expressions for the second and third harmonic distortions, HD2 and HD3, can be derived (see Appendix A for detailed derivations). These expressions indicate that HD2 and HD3 are functions of the signal amplitude, A, the thermal voltage, VT, the tail current, ISEF, the distortion-optimizing resistor, Rt, and the sampling capacitance, CH. Compared with the traditional design, the addition of Rt improves both HD2 and HD3, which can be attributed to the enhanced Q3 emitter negative feedback. Figure 3a further gives the simulated relationship between HD3 and Rt at different frequencies, which verifies the effectiveness of Rt. It can be seen that a larger Rt yields better harmonic performance. Figure 3b shows the improvement in HD3 relative to the traditional design (Rt = 0) at different frequencies.
It can be seen that the HD3 improvement becomes more significant as the input frequency increases. Ideally, the voltage on the sampling capacitor should be kept constant during the hold period. However, the base leakage current Ib6 of Q6 discharges CH and induces a voltage drop, which is called the hold-mode voltage drop. In addition, a portion of the input is coupled to the sampling capacitor through the Q3 base-emitter parasitic capacitance Cbe3, which is called signal feedthrough. Meanwhile, the bandwidth is dominated by the sampling capacitor, CH, and the equivalent resistance at the sampling capacitor. Table 1 summarizes the comparison of these properties between the proposed switch and the traditional one. In the table, the hold-mode voltage drop rate refers to the voltage drop per unit time on the sampling capacitor during hold mode. Decreasing CH is the main method to achieve low distortion in the traditional design, but it comes at the cost of aggravated signal feedthrough and hold-mode voltage drop. It is rather difficult to optimize these properties at the same time by regulating CH alone. In the proposed structure, the addition of Rt suppresses the harmonic distortion without aggravating the signal feedthrough and hold-mode voltage drop. Because the emitter output resistance of Q3 is small, a broad bandwidth is easy to achieve even when Rt is implemented. Thus, although the addition of Rt decreases the bandwidth to some degree, a bandwidth of tens of GHz is still available.
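The frequency dependence noted above can be made plausible with a first-order proxy, sketched below. It is not the Volterra analysis of the paper: it only evaluates the emitter feedback factor |1 + gm·ZE| with ZE = Rt + 1/(jωCH), using the master-stage values of Rt and CH quoted later in this section. The tail current is an assumed value, since the text does not state it.

```python
import math

VT = 0.02585      # thermal voltage near 300 K, in volts
I_SEF = 5e-3      # ASSUMED tail current in amperes (not given in the text)
C_H = 78e-15      # master-stage sampling capacitance from the text, in farads
R_T = 39.0        # master-stage series resistor from the text, in ohms

def feedback_factor(freq_hz, r_t):
    """Return |1 + gm * Z_E| for Z_E = r_t + 1 / (j * 2*pi*f * C_H)."""
    gm = I_SEF / VT
    z_e = complex(r_t, -1.0 / (2 * math.pi * freq_hz * C_H))
    return abs(1 + gm * z_e)

def extra_feedback_db(freq_hz):
    """Feedback gained by adding R_T, relative to the R_T = 0 case, in dB."""
    return 20 * math.log10(feedback_factor(freq_hz, R_T) / feedback_factor(freq_hz, 0.0))

freqs_hz = (1e9, 5e9, 10e9, 20e9)
extra_db = [extra_feedback_db(f) for f in freqs_hz]
for f, db in zip(freqs_hz, extra_db):
    print(f"{f / 1e9:5.1f} GHz: extra feedback {db:.3f} dB")
```

At low frequency the capacitor impedance dominates ZE and Rt adds almost nothing; as the frequency rises the capacitive part shrinks and Rt contributes a growing share of the feedback, which is consistent with the trend described for Figure 3b.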

Table 1. Comparison between the proposed track-hold switch and the traditional design (properties include hold-mode voltage drop rate and bandwidth).
Benefiting from the master-slave sampling structure, the master stage and the slave stage can be optimized separately. The master stage is required to be broadband and low-distortion, as it is utilized to capture the input signal precisely. The slave stage needs to possess high isolation and low feedthrough, as its input is held by the master stage. Consequently, a small resistor R = 39 Ω and capacitance CH = 78 fF are chosen in the master stage, achieving a simulated 38 GHz bandwidth and −85 dB HD3 at 5 GHz. In the slave stage, a larger R = 50 Ω and CH = 135 fF are adopted to achieve a simulated 20 GHz bandwidth, 2.1 mV/ns voltage drop rate, −72 dB HD3, and −60 dB feedthrough at 5 GHz.
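As a rough sanity check on these component values, the sketch below computes a single-pole RC bandwidth bound and the leakage current implied by the droop rate. It assumes the series resistor is the only resistance seen by CH, so the results are upper bounds; the real circuit adds the emitter output resistance and parasitics. The 100 ps hold interval (half of a 200 ps channel period at 5 GS/s) is also an assumption.

```python
import math

def rc_bandwidth_hz(r_ohm, c_farad):
    """-3 dB frequency of a single-pole RC network."""
    return 1.0 / (2 * math.pi * r_ohm * c_farad)

def droop_current_a(c_farad, droop_v_per_s):
    """Leakage current implied by a droop rate: I = C * dV/dt."""
    return c_farad * droop_v_per_s

master_bw = rc_bandwidth_hz(39.0, 78e-15)     # master stage: 39 ohm, 78 fF
slave_bw = rc_bandwidth_hz(50.0, 135e-15)     # slave stage: 50 ohm, 135 fF

droop_rate = 2.1e-3 / 1e-9                    # 2.1 mV/ns expressed in V/s
leak = droop_current_a(135e-15, droop_rate)
hold_droop = droop_rate * 100e-12             # ASSUMED 100 ps hold interval

print(f"master RC bound: {master_bw / 1e9:.1f} GHz (simulated: 38 GHz)")
print(f"slave RC bound:  {slave_bw / 1e9:.1f} GHz (simulated: 20 GHz)")
print(f"implied leakage: {leak * 1e6:.3f} uA, droop per hold: {hold_droop * 1e3:.2f} mV")
```

Both RC bounds sit above the simulated bandwidths, as expected when additional series resistance and parasitic capacitance are present.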
The transition from track mode to hold mode is another factor influencing dynamic performance [13][14][15]. Due to the exponential relationship between the collector current and the base-emitter voltage of a bipolar transistor, Q1 turns on faster than Q2 switches off during the transition from track mode to hold mode. The base voltage fall of Q3, induced by the tail current drawn from R1, is transmitted to CH through the incompletely switched-off Q3. As a result, the common-mode voltage on CH is pulled down during the hold cycle, as shown in Figure 4. Although the voltage drop can be cancelled by the differential topology, it shifts the DC operating point and deteriorates the performance of the following transistors. In particular, dynamic performance can be severely affected when the common-mode rejection ratio is not high enough. In order to address this issue, an extra transistor Q4 is utilized. The collector current IC1 of Q1 discharges the parasitic capacitors at node B before pulling down the base voltage of Q3 through R1, so a delay is generated between the rising edge of the Hold signal and the fall of the Q3 base voltage. During the inserted delay, the switching off of Q2 causes the Q3 emitter voltage to rise, which compensates the common-mode voltage fall on the sampling capacitor. The compensation effectiveness is determined by the delay time, which is tuned by the size of Q4. The delay should be optimized carefully, because a short delay induces under-compensation of the common-mode voltage fall, while a long delay results in overcompensation. The small current, I1, is utilized to suppress the voltage fluctuation on node B; otherwise, it would take more time for the node B voltage (VB) to decrease from its high level and switch on Q4 when turning to hold mode. Without I1, the fall time of VB could be comparable to the hold period at high sampling rates, resulting in overcompensation.
Figure 5 reveals that the common-mode voltage fall is significantly reduced by the added transistor Q4. A Q4 size of 0.15 μm × 16 μm would be ideal, because the corresponding voltage fall is about 0 mV. However, considering the area cost and the effect on the bandwidth of node A, the size of Q4 is chosen to be 0.15 μm × 5 μm, which reduces the voltage fall from the original 24 mV to an acceptable 3 mV and guarantees a 40 GHz bandwidth at node A.

Interpolator-Based Phase Regulator
The sampling timing is controlled by the four-phase differential clocks generated by the phase regulator, as shown in Figure 1. For 6-bit resolution with an input frequency of 10 GHz, the standard deviation of the timing mismatch is required to be less than 0.2 ps, which corresponds to 0.36 degrees [20][21][22]. An interpolator-based phase regulator is employed in the clock path to calibrate the timing mismatch in the analog domain, so the large area cost and complicated algorithms of digital calibration are avoided. Taking I/IN as an example, Figure 6a shows the schematic of the phase adjustment and Figure 6b shows the circuit implementation of the interpolator. One input pair of the interpolator (ap/an) is the buffered I/IN, and the other input pair (bp/bn) is ap/an delayed by a CML buffer. The output CKAM drives the track-hold switch in the master stage directly, and drives the slave stage after a CML buffer, matching the delay between the track-hold switches of the two stages. The two pairs of inputs are interpolated according to the control voltage dly_ap/an, generated from dly_a by a single-ended-to-differential (S2D) converter. The output of the interpolator, y(t), can be expressed as a function of the inputs ap/an and bp/bn:
$$y(t) = \alpha \sin(\omega t) + \beta \sin(\omega t + \varphi)$$
where $\sin(\omega t)$ indicates the input ap/an, $\sin(\omega t + \varphi)$ is the other input bp/bn, $\varphi$ denotes the phase difference between ap/an and bp/bn generated by the CML buffer, and $\alpha$ and $\beta$ are the corresponding interpolation weights controlled by dly_ap/an. It can be seen that the output amplitude of the interpolator is modulated by the interpolating weights α and β, indicating that the gain of the interpolator varies when the delay is tuned. In order to mitigate the effect of this gain variation, a two-stage rectifier is employed in the practical circuit to convert the output amplitude to full swing before it is applied to the track-hold switches (see Figure 6a). In this way, the amplitude of the clock signal is kept constant during the timing mismatch calibration.
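The dependence of both phase and amplitude on the interpolation weights can be seen in a short sketch. It assumes ideal sinusoidal inputs and complementary weights β = 1 − α, and the 24-degree phase difference between the two inputs is an assumed value; none of these are stated in the text.

```python
import math

def interpolate(alpha, beta, phi_rad):
    """Amplitude and phase of alpha*sin(wt) + beta*sin(wt + phi).

    The sum equals Im[(alpha + beta*exp(j*phi)) * exp(j*wt)], so the result is
    the magnitude and argument of the phasor alpha + beta*exp(j*phi).
    """
    re = alpha + beta * math.cos(phi_rad)
    im = beta * math.sin(phi_rad)
    return math.hypot(re, im), math.atan2(im, re)

PHI = math.radians(24.0)   # ASSUMED phase difference introduced by the CML buffer

for alpha in (1.0, 0.75, 0.5, 0.25, 0.0):
    amp, phase = interpolate(alpha, 1.0 - alpha, PHI)   # ASSUMED: beta = 1 - alpha
    print(f"alpha={alpha:.2f}  phase={math.degrees(phase):6.2f} deg  amplitude={amp:.4f}")
```

The output phase sweeps from 0 to φ as the weight shifts from one input to the other, while the amplitude dips in between, which is the gain variation that the subsequent full-swing conversion is meant to remove.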
Figure 7a shows the simulated Monte Carlo timing mismatch of the sampling channels, and Figure 7b presents the adjusting range of the interpolator at different frequencies. The variation range of the simulated phase difference and the phase adjusting range of the interpolator for each frequency are annotated in Figure 7a,b, respectively. It can be seen that the phase mismatch can be covered by adjusting the control voltage between 0 and 3.3 V. Taking 5 GHz as an example, the phase adjusting range of the interpolator is 0° to 24.1°, and the variation range of the phase difference is 75.8° to 99.5°, a spread of 23.7° that falls within the adjusting range, indicating that the timing mismatch can be mitigated.
Data buffers, including the input buffer, middle buffer, and output buffer, play an important role in the SHA. They are employed to provide appropriate broadband voltage gain and to isolate the track-hold switches from other blocks. Figure 8 presents their circuit implementation, where emitter degeneration resistors are utilized to achieve low distortion and broad bandwidth. It is worth noting that a common-mode feedback topology is utilized in the middle buffer to protect the input common-mode voltage of the slave track-hold switch from the voltage fall in the master track-hold switch. The feedback path can be described as follows. When Vcm increases because of process-voltage-temperature (PVT) variation, If2 tends to increase along with If1, but If1 + If2 is kept constant at Is1, so V1 rises to suppress If2. If3 also decreases, as the increase of V1 reduces the gate-source voltage of Qfb. Vfb is then pulled down, because a larger current If4 = Is0 − If3 is drawn from resistor Rfb, which in turn inhibits the increase of Vcm. The Miller capacitor C1 is utilized to keep the feedback loop stable, and C2 is implemented to maintain a clean voltage at Vfb.

Measurement Results
The proposed four-channel time-interleaved SHA was fabricated in a 0.18 μm SiGe BiCMOS process, and it occupies an area of 1.26 × 1.60 mm². Figure 9a presents the chip microphotograph, and the power breakdown is displayed in Figure 9b. The chip consumes 2200 mW from 4.5 V and 3.3 V supplies when operating at a 4 × 5 GS/s sampling rate. The measurement setup is presented in Figure 10, where the differential input signal and sampling clock are provided by analog signal generators through broadband baluns. The output spectrum is measured by a spectrum analyzer, and the real-time output waveform is observed with a Teledyne LeCroy MCM-Zi-A oscilloscope.
Figure 11a shows the time-domain measurement result of one channel, where a 600 MHz sinusoidal input is sampled at 5 GS/s. It can be seen that the fabricated SHA has good hold-mode performance, with less than 0.4 mV feedthrough. Figure 11b presents the result of a 13.6 GHz input sampled at 5 GS/s, where the curves remain flat during the hold period, revealing good oversampling performance of the SHA. The four-channel output waveforms are presented in Figure 12. Because the oscilloscope is equipped with only two broadband channels, the outputs of the other channels are displayed one at a time alongside the output of channel A. It can be seen that the channel timing mismatch is limited to 0.118 ps, verifying the effectiveness of the proposed interpolator-based phase regulator.
The spectral measurement results are shown in Figure 13. Since a single-ended spectrum analyzer is used, only the single-ended output spectrum is measured. Figures 14 and 15 give the THD and bandwidth curves of the proposed SHA, respectively. It can be seen that, owing to the broadband distortion-improved track-hold switch topology, the SHA achieves a THD of −39.6 to −51.8 dB with the input frequency swept up to 11 GHz, as well as a broad 16 GHz bandwidth. Table 2 compares the proposed SHA with several previous designs. The results demonstrate that this design achieves a higher sampling rate and wider bandwidth than the designs in 65 nm CMOS and similar 0.18 μm SiGe processes [23][24][25][26]. The THD performance is even comparable with that of designs utilizing advanced InP processes [5,18].
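The timing-mismatch figures quoted in this paper (a 0.2 ps requirement for 6-bit resolution at a 10 GHz input, and a measured 0.118 ps) can be cross-checked with the sketch below. It uses the common jitter-style bound SNR = −20·log10(2π·f_in·σ) and the usual ENOB conversion. Treating inter-channel skew like random jitter is a simplifying assumption, not the exact analysis of the cited references.

```python
import math

def skew_limited_snr_db(f_in_hz, sigma_s):
    """SNR bound set by timing error sigma_s for a full-scale sine at f_in_hz."""
    return -20 * math.log10(2 * math.pi * f_in_hz * sigma_s)

def enob(snr_db):
    """Effective number of bits from SNR, using the 6.02*N + 1.76 dB relation."""
    return (snr_db - 1.76) / 6.02

def skew_in_degrees(sigma_s, f_clk_hz):
    """Timing error expressed as a phase of a clock at f_clk_hz."""
    return 360.0 * sigma_s * f_clk_hz

required_bits = enob(skew_limited_snr_db(10e9, 0.2e-12))    # 0.2 ps requirement
measured_bits = enob(skew_limited_snr_db(10e9, 0.118e-12))  # measured mismatch
required_deg = skew_in_degrees(0.2e-12, 5e9)                # phase at the 5 GHz clock

print(f"0.2 ps   -> {required_bits:.2f} bits, {required_deg:.2f} degrees at 5 GHz")
print(f"0.118 ps -> {measured_bits:.2f} bits")
```

Under this approximation the 0.2 ps requirement corresponds to about 6 bits and 0.36 degrees of a 5 GHz clock, and the measured 0.118 ps leaves some margin above 6 bits.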

Conclusion
In this paper, a four-channel time-interleaved SHA was developed, with an area of 1.26 × 1.60 mm² and a power consumption of 2200 mW. In order to solve the low-distortion design problem, a track-hold switch with a distortion-improving resistor and a delay-tuning transistor is proposed. The measurement results show that a 16 GHz bandwidth and a THD of −39.6 to −51.8 dB are achieved. In addition, an interpolator-based phase regulator is implemented in the clock path, which calibrates the timing mismatch to satisfy the requirement of 6-bit resolution at the 4 × 5 GS/s sampling rate. Performance comparison shows that the proposed SHA has advantages in sampling rate, bandwidth, and distortion performance, revealing good prospects for application in direct RF sampling receivers, digital oscilloscopes, and ADC-DSP-based PAM-4 wireline receivers. Future work will focus on developing an integrated self-adaptive calibration algorithm for timing mismatch, gain mismatch, and offset mismatch. The distortion, bandwidth, and sampling rate are expected to be improved by utilizing a more advanced process and new circuit topologies. Furthermore, a high-speed time-interleaved ADC based on the proposed SHA is planned.

Conflicts of Interest:
The authors declare that there is no conflict of interest.

Appendix A Harmonic Distortion Analysis of the Track-Hold Switch
It is generally known that the input/output relationship of a weakly nonlinear amplifier can be modeled by a Taylor series:
$$y(t) = a_1 x(t) + a_2 x^2(t) + a_3 x^3(t) + \cdots$$
where the second and third harmonic distortions, HD2 and HD3, can be expressed as follows [10,27]:
$$HD_2 = \frac{1}{2}\,\frac{a_2}{a_1}\,A, \qquad HD_3 = \frac{1}{4}\,\frac{a_3}{a_1}\,A^2$$
where A is the amplitude of the input. When the negative feedback in Figure A1 is applied to the signal path, HD2 and HD3 will vary with the open-loop gain. Based on the abovementioned analysis, the exponential relationship between the collector current IC and the base-emitter voltage Vin of the bipolar transistor in Figure A2a can be expressed as follows [11,28,29]:
$$I_C + i_c = I_C \exp\left(\frac{V_{in}}{V_T}\right) = I_C\left[1 + \frac{V_{in}}{V_T} + \frac{1}{2}\left(\frac{V_{in}}{V_T}\right)^2 + \frac{1}{6}\left(\frac{V_{in}}{V_T}\right)^3 + \cdots\right]$$
where ic is the variation of the collector current, and IC is the corresponding DC operating current. The simplified small-signal model of the track-hold switch in track mode is shown in Figure A2b, where a negative feedback impedance, ZE, is introduced into the emitter. The corresponding second and third harmonic distortions can be expressed as follows:
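For orientation, the following is a sketch of the standard low-frequency result for a bipolar transistor with series emitter feedback. It assumes a purely resistive feedback impedance, $Z_E = R_E$, and therefore omits the dependence on CH that the paper's own expressions include; it is not a reproduction of those expressions. The symbols $b_1$, $b_2$, $b_3$, $g_m$, and $T$ are introduced here for illustration.

```latex
% Open-loop coefficients of the exponential characteristic (g_m = I_C / V_T):
%   a_1 = I_C / V_T,  a_2 = I_C / (2 V_T^2),  a_3 = I_C / (6 V_T^3)
% Feedback factor f = R_E, loop gain T = a_1 f = g_m R_E.
\begin{align}
  b_1 &= \frac{a_1}{1+T}, \qquad
  b_2 = \frac{a_2}{(1+T)^3}, \qquad
  b_3 = \frac{a_3 (1+T) - 2 a_2^2 R_E}{(1+T)^5} \\
  HD_2 &= \frac{1}{2}\,\frac{b_2}{b_1}\,A
        = \frac{1}{4}\,\frac{A}{V_T}\,\frac{1}{(1+T)^2} \\
  HD_3 &= \frac{1}{4}\,\frac{b_3}{b_1}\,A^2
        = \frac{1}{24}\left(\frac{A}{V_T}\right)^2 \frac{1-2T}{(1+T)^4}
\end{align}
```

Setting $T = 0$ recovers the open-loop values $HD_2 = A/(4V_T)$ and $HD_3 = A^2/(24V_T^2)$, and any additional emitter impedance raises $T$ and lowers the magnitude of both terms, which is the mechanism attributed to Rt in the main text.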