Review of Analog-To-Digital Conversion Characteristics and Design Considerations for the Creation of Power-Efficient Hybrid Data Converters

This article reviews design challenges for low-power CMOS high-speed analog-to-digital converters (ADCs). Basic ADC architectures (flash ADCs, interpolating and folding ADCs, subranging and two-step ADCs, pipelined ADCs, successive approximation ADCs) are described with particular focus on their suitability for the construction of power-efficient hybrid ADCs. The overview includes discussions of channel offsets and gain mismatches, timing skews, channel bandwidth mismatches, and other considerations for low-power hybrid ADC design. As an example, a hybrid ADC architecture is introduced for applications requiring 1 GS/s with 6-8 bit resolution and power consumption below 11 mW. The hybrid ADC was fabricated in 130-nm CMOS technology, and has a subranging architecture with a 3-bit flash ADC as a first stage, and a 5-bit four-channel time-interleaved comparator-based asynchronous binary search (CABS) ADC as a second stage. Testing considerations and chip measurement results are summarized to demonstrate its low-power characteristics.


Introduction
High sampling rate (1-3 GS/s) analog-to-digital converters (ADCs) with medium resolutions (6-10 bits) are utilized in diverse applications, including wireless communication systems [1,2], ultra-wideband (UWB) [3], direct-sampling TV receivers [4,5], and digital oscilloscopes [6]. Other applications of these wideband ADCs are in high-speed communication systems such as serial-link receivers [7,8], optical communications [9], and the read channels of disk drives [10-12]. Designing energy-efficient wideband ADCs is essential, especially for portable battery-powered devices. Traditionally, flash ADCs have been popular for high-speed analog-to-digital conversion [13-15]. However, their input capacitance and power consumption increase exponentially with the number of bits, which makes them less power-efficient when designed for higher resolution. A time-interleaved (TI) architecture [3,4,16,17] that uses lower-speed ADCs in parallel is an alternative that simultaneously achieves a high sampling rate and high energy efficiency. The energy per conversion step of a TI-ADC is ideally equal to that of the sub-ADC in each channel, but in practice, there is a power overhead caused by multi-channel clock generation and the calibration of channel mismatches [4,18,19]. Successive approximation register (SAR) ADCs are commonly used to implement each channel of a TI-ADC [4,16,17]. SAR ADCs are usually power-efficient for medium resolutions (6-10 bits) and medium sampling rates (10-200 MS/s), and their digital nature benefits from modern CMOS technologies [20-22]. However, their sampling rate is limited by the need for a high-speed clock.


Flash ADCs
In a flash ADC, the input voltage is compared simultaneously against a ladder of reference levels by a bank of parallel comparators. The comparators whose reference voltages lie below the input amplitude output 1 s, while the remaining comparators output 0 s; hence, along the comparator bank, there will be a series of 1 s transitioning to a series of 0 s. This pattern, which is often referred to as thermometer code, enables determining the quantization level that is closest to the input amplitude. A thermometer-to-binary encoder can generate the final binary output.
Since all of the comparators operate in parallel and at the same time, the conversion time of a flash ADC only equals one clock cycle, making it ideal for high-speed applications. However, since the number of comparators increases exponentially with the number of bits, the flash ADC is not area and power-efficient when high resolution is required. Increasing the number of comparators creates a significantly high input capacitance that originates from the total parasitic capacitances of the transistors at the input of the comparators, as well as from the total routing capacitance from distributing the input signal to the comparators on the chip; this limits the input bandwidth.
Due to the inevitable mismatches between the input and clock routing networks to the comparators, the RC delays of the signal paths to each comparator vary. These timing mismatches can result in significant conversion errors (especially for high input frequencies), because each comparator processes a different (i.e., delayed) voltage during the same conversion. To avoid this problem, a front-end sample-and-hold (S/H) or track-and-hold (T/H) is often included in flash ADCs. Moreover, the parasitic capacitances of the transistors at the comparator inputs vary with the applied input voltage amplitude. Therefore, the total input capacitance of a flash ADC changes nonlinearly with the input voltage. This nonlinear input capacitance can cause SNDR degradation at the output of the front-end S/H. The input-referred offset voltage of the comparators in a flash ADC is another problem, which creates nonlinearity errors that become more severe for higher resolutions. To reduce the input-referred offset, a common method is to use preamplifiers for each comparator [34], such that the input-referred offset is divided by the gain of the preamplifier. However, these preamplifiers should be designed with high bandwidth and gain, which significantly increases the power consumption. As a low-power alternative, there are several offset calibration techniques to suppress comparator offset errors [14,35,36]. In flash ADCs with high resolution, the offset requirement not only becomes more stringent, the number of comparators to be calibrated also increases exponentially; this makes such calibration systems more complicated while requiring more layout area for on-chip implementation.
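The thermometer-code behavior described above can be summarized in a short behavioral sketch. The following Python model is illustrative only; the function name, the uniform 0-to-v_fs reference ladder, and the parameter values are assumptions, not details from the article:

```python
def flash_adc(v_in, n_bits=3, v_fs=1.0):
    """Return (thermometer_code, binary_code) for one ideal flash conversion."""
    n_comp = 2 ** n_bits - 1                       # comparator count grows exponentially
    lsb = v_fs / 2 ** n_bits
    refs = [(i + 1) * lsb for i in range(n_comp)]  # uniform reference ladder
    # All comparators decide in parallel within one clock cycle.
    thermometer = [1 if v_in > r else 0 for r in refs]
    # A thermometer-to-binary encoder simply counts the 1 s.
    binary = sum(thermometer)
    return thermometer, binary

therm, code = flash_adc(0.40, n_bits=3, v_fs=1.0)
print(therm)  # 1 s below the input level, 0 s above
print(code)
```

Note that n_comp doubles with every added bit of resolution, which is the root of the power, area, and input-capacitance problems discussed above.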

Interpolating and Folding ADCs
Interpolating and folding architectures have been introduced to alleviate some of the main limitations of flash ADCs, such as high power consumption and a large layout area [37]. These architectures operate with single-step conversion, and can be as fast as a flash ADC in theory. Figure 2 displays the concept of interpolation in ADCs. One output from each of the two adjacent preamplifiers is connected to the middle comparator in a way that it effectively compares the signal with a reference voltage in the middle of VR1 and VR2. Interpolating reduces the number of preamplifiers as well as resistors in the reference ladder to half (or less, depending on the interpolation factor) in comparison to a conventional flash ADC; this results in a reduction of the total area and power consumption. However, the number of latched comparators in an interpolating ADC is still the same as in a standard flash ADC with the same resolution. Another benefit of interpolation is its improved linearity due to the averaging and distribution of the errors [38]. In addition, further interpolation levels are possible with extra interpolation resistor ladders between the preamplifiers and latched comparators [39].

Figure 3 displays the block diagram of a folding ADC. A coarse ADC resolves the most significant bits (MSBs). In parallel, a folding circuit divides the input signal into several regions, and then, a fine ADC converts the least significant bits (LSBs) independently of the coarse ADC outputs. In a folding ADC, the MSBs are resolved in parallel with the folding operation and fine ADC decision, but in practice, there is a small delay for the folding operation. Similar to the flash ADC, a folding ADC requires a front-end S/H to ensure that the same sampled value is processed by the folding circuit and the coarse ADC to avoid conversion errors. Folding significantly reduces the number of comparators, because it divides the high-resolution ADC into coarse and fine ADCs with lower resolutions. Combining interpolation and folding architectures is a popular technique to achieve better power and area efficiency, as in [40,41], for instance. The high power consumption and limited bandwidth of the preamplifiers, nonlinearity of the practical folding circuit, and the delay of the folding path are the main limiting factors in folding and interpolating ADC architectures. Therefore, for high-speed analog-to-digital conversion with medium to high resolutions, other architectures such as time-interleaved ADCs are usually more power-efficient.
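As a rough behavioral sketch of the coarse/folding/fine partition, the hypothetical function below idealizes folding as mapping the input into its region; a real folding circuit produces a continuous folded waveform and suffers from the bandwidth and nonlinearity limitations noted above:

```python
def fold_and_convert(v_in, coarse_bits=2, fine_bits=3, v_fs=1.0):
    """Ideal coarse + folding + fine conversion; returns the combined code."""
    n_regions = 2 ** coarse_bits
    region_width = v_fs / n_regions
    msb = min(int(v_in / region_width), n_regions - 1)      # coarse ADC
    v_fold = v_in - msb * region_width                      # idealized folded signal
    fine_lsb = region_width / 2 ** fine_bits
    lsbs = min(int(v_fold / fine_lsb), 2 ** fine_bits - 1)  # fine ADC
    return (msb << fine_bits) | lsbs
```

With ideal components, the combined code equals a direct 5-bit quantization of the input, while only 2^2 - 1 + 2^3 - 1 = 10 comparators are needed instead of 31.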

Subranging and Two-Step ADCs
The number of comparators in flash ADCs exponentially increases with the number of bits, resulting in high power consumption and a large chip area. Subranging and two-step architectures were introduced as a solution to this problem. In a subranging or two-step ADC, a high-resolution conversion task is divided between two ADCs with lower resolution that operate sequentially, as depicted in Figure 4. The coarse ADC operates with full-scale range, and resolves the MSBs from the sampled input voltage. The DAC generates a quantized reference level according to the MSBs. Next, the DAC output is subtracted from the sampled input voltage, generating a residue voltage. Finally, the fine ADC in the second stage, operating with a subrange of the full-scale range, resolves the LSBs from the residue voltage. A two-step ADC is similar to a subranging ADC, but utilizes a gain stage to amplify the residue voltage before delivering it to the fine ADC [42]. Using such architectures can significantly reduce the number of comparators in an ADC. For example, a flash ADC with 8-bit resolution requires 255 comparators activated in parallel, while an equivalent subranging ADC consisting of a 4-bit coarse flash ADC and 4-bit fine flash ADC only requires 30 comparators in total, leading to significant reduction of power consumption and area on the chip.
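The comparator-count arithmetic quoted above (255 versus 30 comparators) can be verified with a two-line calculation:

```python
def flash_comparators(n):
    """An N-bit flash ADC needs 2^N - 1 parallel comparators."""
    return 2 ** n - 1

def subranging_comparators(l, m):
    """An L-bit coarse plus M-bit fine flash stage, operating sequentially."""
    return (2 ** l - 1) + (2 ** m - 1)

print(flash_comparators(8))           # 255
print(subranging_comparators(4, 4))   # 30
```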

Despite the advantages of subranging and two-step ADC architectures, there are some drawbacks that should be considered. Since the second stage must wait for the completion of the first conversion and the residue generation, there is an inevitable latency in the final digital output. It should also be noted that a front-end S/H is necessary in subranging or two-step ADCs. Moreover, if there is any mismatch between the generated residue voltage range and the input range of the fine ADC, then the fine ADC converts an erroneous residue voltage, causing severe linearity issues such as missing codes or non-monotonicity. Thus, the design should be robust and well-trimmed to assure sufficient matching between the two stages for a given target resolution. The use of redundancy (implying additional resolution in the coarse ADC and/or fine ADC) as well as calibration are techniques to suppress such range-mismatch issues [43,44]. It is also noteworthy that any non-ideality in the DAC can introduce errors in the LSBs generated by the fine ADC.
An N-bit subranging (two-step) ADC can be constructed from an L-bit coarse ADC and an M-bit fine ADC, where N = L + M. Although the coarse and fine ADCs have lower resolutions, they still have to be designed and optimized for N-bit offset accuracy. In an N-bit two-step ADC with an L-bit coarse stage, an amplifier with a gain of 2^L relaxes the offset requirement for the M-bit fine ADC to only M-bit accuracy. In practice, the gain, linearity, bandwidth, and offset of the amplifier should also satisfy the N-bit accuracy of the combined ADC to avoid errors in the residue voltage. Moreover, the power consumption of such an amplifier can be significantly high due to the stringent requirements for high resolutions, especially at high conversion rates.
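A minimal behavioral sketch of the two-step conversion follows, assuming an ideal coarse ADC, DAC, residue amplifier of gain 2^L, and fine ADC; all names and values are illustrative, and none of the offset, range-mismatch, or amplifier non-idealities discussed above are modeled:

```python
def two_step_adc(v_in, l_bits=4, m_bits=4, v_fs=1.0):
    """Ideal N = L + M bit two-step conversion with a 2^L residue amplifier."""
    coarse_lsb = v_fs / 2 ** l_bits
    msb = min(int(v_in / coarse_lsb), 2 ** l_bits - 1)    # coarse ADC (full scale)
    v_dac = msb * coarse_lsb                              # DAC reference level
    residue = (v_in - v_dac) * 2 ** l_bits                # residue amplified to full scale
    fine_lsb = v_fs / 2 ** m_bits
    lsbs = min(int(residue / fine_lsb), 2 ** m_bits - 1)  # fine ADC (full scale)
    return (msb << m_bits) | lsbs
```

With ideal components, the result matches a direct 8-bit quantization; in silicon, any error in the DAC level or amplifier gain corrupts the residue and hence the LSBs.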

Pipelined ADCs
Pipelined ADC architectures (such as the one in Figure 5) combine the concept of two-step analog-to-digital conversion with a pipelining technique to extend high-resolution operation to higher conversion rates. After sampling, the first stage starts to generate its corresponding output bits. It also generates the residue voltage and amplifies it to full-scale to be delivered to the next stage. This operation continuously occurs in all of the subsequent stages. While the current stage is converting a sample, the preceding stage is processing the next sample. The last few LSBs in the final stage of a pipelined ADC are generally resolved by a low-resolution flash ADC. The final digital output code is generated at the same conversion rate as that of one pipelined stage. There is a substantial latency in pipelined ADCs, because a complete digital output will be resolved after all of the stages finish their conversion. Nevertheless, the output codes are generated with high speeds thanks to the pipelining operation. Each pipelined stage realizes the functions of a sample-and-hold, subtraction, DAC, and amplification; this is similar to a two-step ADC, but without fine ADC. Such a subsystem is referred to as a multiplying digital-to-analog converter (MDAC) [43,45]. A switched-capacitor circuit, comparators, and a high-performance operational amplifier (opamp) are usually used to build MDACs [46]. High-speed high-resolution pipelined ADCs require high-performance power-hungry amplifiers [47,48], which are often the bottlenecks of the design. Similar to subranging ADCs, the use of redundancy techniques is very popular in pipelined ADCs. For example, a 1.5-bit pipelined stage can significantly relax the offset requirement of the comparators in each stage [46,49].
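The stage-by-stage operation can be sketched behaviorally as follows. This idealized model resolves exact bits per stage and omits the 1.5-bit redundancy and digital correction used in practical designs; the function name and stage sizes are illustrative:

```python
def pipelined_adc(v_in, stage_bits=(2, 2, 2, 2), v_fs=1.0):
    """Ideal pipeline: each stage resolves its bits and passes an amplified residue."""
    code = 0
    residue = v_in
    for b in stage_bits:
        lsb = v_fs / 2 ** b
        d = min(int(residue / lsb), 2 ** b - 1)  # sub-ADC decision in the MDAC
        residue = (residue - d * lsb) * 2 ** b   # subtract DAC level, amplify by 2^b
        code = (code << b) | d                   # concatenate stage bits, MSB first
    return code
```

Each stage hands its residue to the next while accepting a new sample, so throughput equals the rate of one stage, while the latency is the full pipeline depth.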
Popular methods to reduce the power consumption of a pipelined ADC are stage power scaling [43,50], opamp sharing [51], and switched-opamp [52] techniques. However, these techniques cannot be used for high-speed high-resolution ADCs due to limitations such as memory effects and an additional delay because of extra clock phases. Alternative techniques such as comparator-based and charge-pump based pipelined ADC architectures have been introduced together with calibrations to reduce the power consumption [53,54]. However, they are not appropriate for high resolution at high speeds.

Successive Approximation Register ADCs
The simplified block diagram of a typical successive approximation register (SAR) ADC is shown in Figure 6, which consists of an S/H, comparator, DAC, and digital logic (controller). A full analog-to-digital conversion in a SAR ADC is performed over multiple clock cycles. The reference voltage of the comparator is generated by a DAC that has a resolution equal to the SAR ADC resolution, and that is controlled by the digital SAR logic. The first clock cycle is generally dedicated to sampling the input. In the first cycle after the sampling, the DAC output is set to VFS/2, such that the comparator compares the sampled voltage with the middle reference level, resolving the MSB. Depending on the comparator's output after each comparison, the SAR logic sets the DAC input bits to generate the appropriate reference voltage for the next comparison. The process continues until the last bit is resolved. Using a successive approximation algorithm (e.g., binary search algorithm), one output bit is generated during each conversion cycle. Therefore, a minimum of N + 1 clock cycles is required to carry out a full N-bit conversion with a basic SAR ADC.
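The binary-search procedure described above can be summarized in a short behavioral sketch (illustrative names; an ideal comparator and DAC are assumed):

```python
def sar_adc(v_sampled, n_bits=8, v_fs=1.0):
    """Resolve one held sample by successive approximation (binary search)."""
    code = 0
    for i in range(n_bits - 1, -1, -1):      # one bit per conversion cycle, MSB first
        trial = code | (1 << i)              # tentatively set the next bit
        v_dac = trial * v_fs / 2 ** n_bits   # DAC level (first trial is VFS/2)
        if v_sampled >= v_dac:               # comparator decision
            code = trial                     # keep the bit; otherwise it stays cleared
    return code
```

Because the N decisions are strictly sequential, the loop body must execute N times per sample, which mirrors the requirement that the internal clock, comparator, and DAC settle at least N times faster than the nominal sampling rate.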
SAR ADCs are generally power-efficient, because they do not require a power-hungry component such as an opamp. However, their conversion speed is limited due to the large number of conversion cycles required for a full analog-to-digital conversion. For an N-bit SAR ADC, the internal clock frequency, comparator delay, and settling time of the DAC have to be optimized to be at least N times faster than the nominal sampling rate, which can easily reach the practical limits of a CMOS technology. Since there is only one comparator in a conventional SAR ADC, the comparator offset will appear as a universal offset for the complete ADC transfer function. Such an offset impact can be calibrated with simple means, which is another advantage of this architecture.

Due to the high compatibility of SAR ADCs with digital CMOS and modern deep submicron technologies, they have become very popular in recent years. Depending on the topology and fabrication technology, SAR ADCs can achieve a wide range of characteristics as standalone ADCs, such as ultralow power [55,56], high speed [57], and high resolution [58]. Furthermore, they can be used as a part of a hybrid ADC, such as in [4,8,59]. There are many different techniques and architectures to implement SAR ADCs, which will be elaborated in Section 3.

Figure 6. Successive approximation register (SAR) ADC architecture.

Time-Interleaved ADCs
In a time-interleaved (TI) ADC, multiple ADCs operate in parallel to effectively achieve higher sampling rates. Figure 7 shows the block diagram of a simple M-channel time-interleaved ADC. By time-interleaving M ADCs, each with a sampling rate of fs, a total sampling rate of M × fs is attainable in theory. Therefore, conversion speeds of several GS/s are achievable with this architecture. In each channel, a sample-and-hold captures the input signal, and the sub-ADC resolves the digital output at a conversion rate of fs. After combining the channel outputs with a digital multiplexer, the effective ADC output is generated at the rate of M × fs. Each channel samples with a delay of 1/(M × fs) seconds relative to its neighboring channels. In many applications, a time-interleaved ADC can employ several power-efficient ADCs in parallel to achieve the same performance as a flash ADC but with lower power consumption, depending also on the characteristics of a given CMOS technology. It is possible to use different types of ADCs (such as SAR, pipelined, and flash ADCs) in a time-interleaved architecture, which has to be determined by the designer under consideration of the specific application and power-efficiency requirements. The main challenges with time-interleaving are offset mismatch, gain mismatch, timing mismatch (timing skew), and bandwidth mismatch among the channels; these will be discussed in Section 3. Thus, TI-ADCs often require calibration techniques to achieve medium-to-high resolutions at relatively high speeds. In theory, the total power consumption of an M-channel TI-ADC is equal to M times the power of the single ADC in each channel, so the overall TI-ADC is expected to achieve the same energy per conversion step as the single ADC used in the channels. However, a TI-ADC will always be less energy-efficient than its sub-ADCs, because of the power overhead associated with interleaving [4].
This overhead includes the generation and distribution of multiple clock phases, the distribution of the input and reference signals to all of the channels, and the correction of errors from channel mismatches by overdesign or calibration. Therefore, to achieve the best efficiency, the power consumption of each individual channel as well as the time-interleaving overhead should be minimized.
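The round-robin operation of Figure 7 can be sketched behaviorally. In the following Python model (idealized; all names and values are ours), M perfectly matched sub-ADCs convert every M-th sample and a digital multiplexer reassembles the full-rate stream:

```python
import numpy as np

def ideal_adc(v, n_bits=6, vref=1.0):
    """Idealized sub-ADC: quantize v in [0, vref) to an n_bits binary code."""
    code = np.floor(v / vref * 2 ** n_bits).astype(int)
    return np.clip(code, 0, 2 ** n_bits - 1)

def ti_adc(vin, m=4, n_bits=6):
    """M-channel TI-ADC: channel i converts samples i, i+M, i+2M, ... at 1/M
    of the overall rate; a multiplexer reassembles the full-rate output."""
    out = np.empty(len(vin), dtype=int)
    for i in range(m):
        out[i::m] = ideal_adc(vin[i::m], n_bits)
    return out

t = np.arange(256)
vin = 0.5 + 0.45 * np.sin(2 * np.pi * 5 * t / 256)
# With ideal, perfectly matched channels, the interleaved output is identical
# to that of a single ADC running at the full rate:
assert np.array_equal(ti_adc(vin, m=4), ideal_adc(vin))
```

With ideal channels the result equals a single full-rate ADC; the mismatch effects discussed next are precisely the deviations from this ideal.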

Summary
Basic Nyquist-rate ADC architectures were reviewed in this section. Figure 8 depicts a conceptual plot for a visual comparison of the typical operating ranges of popular Nyquist-rate ADC architectures as well as hybrid/time-interleaved types. Flash, folding, and interpolating ADCs resolve the digital outputs in one cycle, achieving high conversion rates. However, they are neither area-efficient nor power-efficient when designed for medium to high resolutions, due to the high number of active comparators. Subranging, pipelined, and SAR ADCs require multiple clock cycles to complete a full analog-to-digital conversion, resulting in output latency. Although they are usually slower, these architectures tend to be more power-efficient than flash ADCs. On the other hand, time-interleaved ADCs can achieve high conversion rates. As exemplified in the next section, a hybrid ADC architecture can be constructed by combining different basic ADC architectures to accomplish a higher sampling rate and greater efficiency.

Design Considerations for Time-Interleaving and SAR ADCs within Hybrid Architectures
Time-interleaving is an effective technique to increase the sampling rate of ADCs while maintaining low power consumption. In this section, the general design challenges of TI-ADCs are briefly reviewed first, followed by a concise overview of SAR ADC architectures, as well as a review of some existing high-frequency ADC architectures. Afterwards, a specific low-power hybrid ADC architecture is introduced, along with a description of system-level design aspects.

Channel Offset Mismatch
Here, the impacts of channel mismatches are outlined to bring attention to the most common issues in time-interleaved ADCs. Interested readers can refer to [18,60,61,62,63] for additional theory and analysis. Offset mismatches among TI-ADC channels can originate from differences between the DC offsets of the buffers or amplifiers, charge injection errors, and the offset errors of the sub-ADCs in the TI channels. Figure 9 models the total input-referred offset voltage (VOS) of each channel in an M-channel TI-ADC. The offset mismatches between the channels cause an error signal with fixed amplitude and a periodic pattern in the time-domain ADC output [18]. In the frequency domain, the undesired frequency components due to the offset mismatch error of an M-channel TI-ADC occur at:

f = k·(fs/M), k = 0, 1, ..., M − 1, (1)
where fs is the sampling frequency of the TI-ADC. The SNR degradation due to the offset mismatch is constant and independent of the input frequency and amplitude, and it can be calculated from the amount of offset mismatch (σos), since the random channel offsets add a noise power of σos² to the output. Requiring this noise to stay at or below the quantization noise of an ideal N-bit converter yields the required standard deviation of the channel offset in an M-channel N-bit TI-ADC [62]:

σos ≤ √P · 10^(−(6.02·N + 1.76)/20), (2)

where P is the input signal power.
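The spur locations predicted above are easy to reproduce numerically. This short Python experiment (offset values arbitrary, our own illustration) injects four channel offsets into a coherently sampled sine and checks that tones appear at multiples of fs/M, regardless of the input frequency:

```python
import numpy as np

m, n = 4, 4096              # channels, record length (fs normalized to n bins)
t = np.arange(n)
vin = 0.5 * np.sin(2 * np.pi * 211 * t / n)       # coherent input tone at bin 211

offsets = np.array([0.01, -0.02, 0.015, -0.005])  # per-channel offsets (arbitrary)
vout = vin + offsets[t % m]                       # channel i handles samples i, i+M, ...

spec = np.abs(np.fft.rfft(vout)) / n
# Offset-mismatch spurs sit at k*fs/M (bins k*n/M), independent of the input bin:
assert spec[n // m] > 1e-3 and spec[n // 2] > 1e-3  # bins 1024 and 2048
assert spec[211] > 0.2                              # the signal tone itself
```

Changing the input tone bin moves the signal but leaves the spurs fixed at k·fs/M, matching the statement that the offset-mismatch error is independent of the input frequency.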

Channel Gain Mismatch
Gain mismatches in a TI-ADC mainly result from mismatches between the gains of the buffers (or amplifiers) and the gain errors of the sub-ADCs in the TI channels. The gain of each channel can be modeled as shown in Figure 10. The largest error magnitude due to gain mismatch occurs at the peaks of the sinusoidal input signal, which is similar to amplitude modulation (AM) [18]. In the frequency domain, the undesired components due to gain mismatch errors in an M-channel TI-ADC fall at the following locations:

f = ±fin + k·(fs/M), k = 1, 2, ..., M − 1, (3)

where fin is the input signal frequency. The SNR degradation due to gain mismatches is independent of the input frequency, but depends on the amplitude of the input signal. Since the gain-mismatch noise power scales with the signal power, the required standard deviation of the channel gain (σGain) in an M-channel N-bit TI-ADC can be obtained with [62]:

σGain ≤ 10^(−(6.02·N + 1.76)/20). (4)

Figure 9. Modeling of voltage offsets between channels in a time-interleaved (TI)-ADC.

Figure 10. Modeling of gain errors between channels in a TI-ADC.
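A matching numerical experiment for gain errors (again with arbitrary values of our own choosing) shows the images at ±fin + k·fs/M predicted by Equation (3):

```python
import numpy as np

m, n, k_in = 4, 4096, 211
t = np.arange(n)
vin = 0.5 * np.sin(2 * np.pi * k_in * t / n)      # coherent input tone

gains = np.array([1.00, 0.99, 1.02, 0.995])       # per-channel gains (arbitrary)
vout = gains[t % m] * vin

spec = np.abs(np.fft.rfft(vout)) / n
# Gain-mismatch images appear at +/-fin + k*fs/M, i.e., around bins k*n/M:
for b in (n // m - k_in, n // m + k_in, n // 2 - k_in):  # bins 813, 1235, 1837
    assert spec[b] > 1e-4
assert spec[k_in] > 0.2                            # main tone still dominates
```

Unlike the offset spurs, these images track the input tone: moving fin moves the images, which is why they behave like amplitude-modulation sidebands.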

Channel Timing Mismatch (Timing Skews)
Timing mismatches between sampling clocks, also known as clock skews or timing skews, are systematic errors due to the small differences between the actual sampling clock edges of the TI channels and the ideal sampling moments. Figure 11 exemplifies the timing skews of the sampling clocks for a four-channel TI-ADC. The main sources of timing skews are device mismatches in the sampling clock generation circuitry, threshold voltage mismatches of the MOS switches [64] in each sample-and-hold, and routing mismatches of the sampling clock signals on the chip [65].

Figure 11. Timing mismatches (clock skews) of sampling clocks between time-interleaved channels.

Figure 12 visualizes the timing mismatches of sampling clocks in an M-channel TI-ADC, where Δti is the deviation of the sampling moment in channel "i" from its ideal value. In the time domain, the largest error occurs when the input signal has the highest slew rate (at the zero crossings for a differential sinusoidal input), which resembles phase modulation (PM) noise [18]. In the frequency domain, the undesired frequency components due to timing mismatch errors occur at:

f = ±fin + k·(fs/M), k = 1, 2, ..., M − 1, (5)

which are the same locations as in the case of gain mismatch, according to Equation (3). Notably, the amplitudes of these frequency components increase with increasing input frequency. The SNR degradation due to timing mismatches therefore depends on both the amplitude and the frequency of the input signal [18,60]. The SNR degrades as the input frequency increases, which can be a severe issue in TI-ADCs because they are mainly used for broadband applications. The calibration of timing skews in TI-ADCs is more complicated than offset and gain mismatch calibration. Many timing-skew calibration techniques have been proposed in theory, and have also been implemented on-chip or off-chip [16,29,59,60,66,67]. The key design considerations related to clock signal generation for TI-ADCs are discussed in [65].

Figure 12. Modeling of clock skews amongst channels in a TI-ADC.
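The frequency dependence of skew-induced errors distinguishes them from offset spurs, and it can be demonstrated with a small simulation (skew values arbitrary, expressed as fractions of a sample period; the model is our own):

```python
import numpy as np

def skewed_sine(k_in, skews, n=4096):
    """Sample a unit sine at nominally uniform instants, each channel shifted
    by its timing skew (in fractions of the sample period), as in Figure 12."""
    t = np.arange(n, dtype=float)
    t_skewed = t + np.asarray(skews)[np.arange(n) % len(skews)]
    return np.sin(2 * np.pi * k_in * t_skewed / n)

def worst_spur(x, k_in):
    spec = np.abs(np.fft.rfft(x)) / len(x)
    spec[0] = spec[k_in] = 0.0        # ignore DC and the main tone
    return spec.max()

skews = (0.0, 0.03, -0.02, 0.01)      # arbitrary skews in sample periods
low = worst_spur(skewed_sine(31, skews), 31)       # low input frequency
high = worst_spur(skewed_sine(1501, skews), 1501)  # high input frequency
# Unlike offset spurs, skew-induced spurs grow with the input frequency:
assert high > 5 * low
```

The spur amplitude grows roughly in proportion to the input frequency, mirroring the phase-modulation interpretation given above.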

Channel Bandwidth Mismatch
Mismatches between the sampling bandwidths of TI channels cause SNR degradation [63]. Each sample-and-hold (S/H) can be approximately modeled with an RC circuit, functioning like a low-pass filter with a cutoff frequency (or bandwidth) of fc = 1/(2π·R·C), where R and C are the total resistance of the sampling path and the total sampling capacitance, respectively [58,63]. As shown in Figure 13, there are differences between the bandwidths of the TI channels, which originate from several sources [59,68]. First, there is random RC mismatch of the MOS switch resistance and the sampling capacitance in each sample-and-hold. Second, there is systematic RC mismatch between the input signal routes of the channels on the chip. Moreover, if a buffer amplifier is used in each S/H [34], the amplifier bandwidth mismatch will also contribute to the TI bandwidth mismatch [68]. The analysis of channel bandwidth mismatches is usually performed by writing the transfer function of the sampling channel to evaluate the impact of the bandwidth mismatch on both amplitude and phase [18,68]. The impact of bandwidth mismatch on SNR degradation is a combination of gain and phase mismatches, where for low input frequencies, the impact of the phase errors is dominant [63]. The bandwidth mismatch depends nonlinearly on both the input signal amplitude and frequency [61].
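As a numerical illustration of the first-order RC model (component values are illustrative, not from the paper), the following sketch evaluates H(f) = 1/(1 + j·f/fc) for two channels with a 1% capacitor mismatch and separates the amplitude and phase components of the resulting error:

```python
import numpy as np

def sh_gain(f_in, r, c):
    """Complex gain of a first-order S/H front end, H(f) = 1/(1 + j*f/fc),
    with fc = 1/(2*pi*R*C)."""
    fc = 1.0 / (2 * np.pi * r * c)
    return 1.0 / (1.0 + 1j * f_in / fc)

# Illustrative values: 200-ohm switch, 100-fF sampling cap, +1% cap mismatch.
r, c = 200.0, 100e-15
h1, h2 = sh_gain(500e6, r, c), sh_gain(500e6, r, 1.01 * c)

gain_err = abs(abs(h2) - abs(h1))             # amplitude component of the error
phase_err = abs(np.angle(h2) - np.angle(h1))  # phase component of the error
assert gain_err > 0.0 and phase_err > 0.0

# Both error components shrink toward low input frequencies:
h1l, h2l = sh_gain(5e6, r, c), sh_gain(5e6, r, 1.01 * c)
assert abs(abs(h2l) - abs(h1l)) < gain_err
assert abs(np.angle(h2l) - np.angle(h1l)) < phase_err
```

Evaluating both components separately reflects the analysis approach described above, where the channel transfer functions are compared in amplitude and phase.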

Sub-ADC Architectures in Time-Interleaved ADCs
A time-interleaved ADC requires several channels to achieve high conversion rates, which in turn increases the overall chip area and power consumption. Therefore, designing a low-power, high-performance sub-ADC for each TI channel is of significant importance in order to optimize the total power and area of a time-interleaved ADC. SAR ADCs are commonly used in TI ADC channels [4,16,25,29,69,70]. However, depending on the application, other ADC architectures such as flash and pipelined can also be utilized to construct a TI-ADC [67,71]. Next, we will review some of the low-power architectures (including SAR and binary search ADCs) that can be employed in time-interleaved architectures.

Suitability of SAR ADCs for Low-Power Hybrid ADC Architectures
SAR ADCs have become very popular over the last decade as CMOS process technologies have evolved, because their performance benefits significantly from technology scaling. The quality and density of capacitors, as well as the switching speed of transistors, keep improving, which helps to implement more efficient SAR ADCs with capacitive DACs. In this brief overview, we categorize SAR ADCs by their DAC structures, conversion-speed enhancement techniques, and switching techniques for higher energy efficiency.
The most common DAC architecture in SAR ADCs is the binary-weighted capacitive DAC (CDAC). However, it has a large total capacitance of 2^N·Cu, where N is the number of bits and Cu is the unit capacitor value. This limits the sampling speed of the ADC and increases the required chip area as the resolution increases. On the other hand, this architecture has very good device matching characteristics, which results in high linearity. Another type of SAR ADC has a split-capacitor architecture (segmented DAC), which uses two split capacitor banks connected by an attenuation (bridge) capacitor [72]. A SAR ADC with a C-2C ladder DAC is an alternative technique [5]. These two latter architectures have the advantage of reducing the total capacitance in comparison to the conventional binary-weighted CDAC. However, they are more sensitive to parasitic capacitances, which causes considerable nonlinearity errors, so they generally require calibration. Most state-of-the-art SAR ADCs contain binary-weighted CDACs because these CDACs make higher sampling rates with a small area possible in modern short-channel technologies, which often necessitates very small custom-designed capacitors (≤1 fF) [4,29].
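The capacitance saving that motivates the split-capacitor approach can be quantified with a back-of-the-envelope model (idealized and our own; parasitics and the exact bridge-capacitor sizing are ignored, and an even bit count is assumed):

```python
def binary_weighted_total(n_bits):
    """Total capacitance of a binary-weighted CDAC, in units of Cu: 2^N."""
    return 2 ** n_bits

def split_cap_total(n_bits):
    """Idealized total for a split-capacitor (segmented) DAC: two sub-banks of
    2^(N/2) units joined by one bridge capacitor. Real designs differ and, as
    noted above, typically need calibration due to parasitic sensitivity."""
    half = n_bits // 2
    return 2 * 2 ** half + 1

# For 10 bits, the reduction in total capacitance (and hence area) is large:
assert binary_weighted_total(10) == 1024
assert split_cap_total(10) == 65
```

The comparison makes the area/linearity tradeoff concrete: the segmented array is over an order of magnitude smaller, but the bridge node is exactly where parasitics introduce the nonlinearity mentioned above.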
Resistive DACs can be used in SAR ADCs. However, their problems are the static power consumption and the need for a separate S/H. Nonetheless, a few high-speed state-of-the-art SAR ADCs contain resistive DACs [25] instead of capacitive DACs. In [73], a hybrid resistive-capacitive SAR architecture has been reported to save area on the chip because the total capacitance is significantly reduced. However, there is a tradeoff between area and linearity due to the higher mismatches of resistors in comparison to capacitors.
For an N-bit synchronous SAR ADC with a conversion rate of fs, an internal clock with a frequency of (N + 1)·fs is required. Therefore, the comparator must operate with such a high-speed clock. For every output bit, the comparator decision and DAC settling must be completed in one clock cycle. Thus, (N + 1) clock cycles are required to perform one complete conversion, limiting the overall speed of synchronous SAR ADCs. Several techniques have been proposed to improve the speed of SAR ADCs. An asynchronous SAR algorithm was introduced in [74], where the internal comparisons from MSB to LSB are triggered by a ripple-like procedure. Hence, the quantization time allocated to each bit is no longer limited by the slowest conversion bit, but rather is determined by the average conversion time, leading to a speed enhancement in comparison to synchronous architectures. Asynchronous architectures have been used frequently in recent designs [26,27,57] to shorten the overall conversion time. While the asynchronous technique helps to achieve higher speeds, it usually requires more complicated digital blocks to generate signals with unequal pulse widths. Converting more than one bit per cycle is another effective way of increasing the conversion speed of a SAR ADC. Several SAR or TI-SAR ADCs with two bits/cycle have been introduced, such as in [3,23,24,25]. They can achieve higher sampling rates because they require a smaller number of cycles compared to a 1 bit/cycle SAR ADC. However, the disadvantages of multi-bit/cycle SAR ADCs are the larger number of comparators and the more complex DAC structure. In addition, unlike in a 1 bit/cycle architecture, offset calibration is often required for the comparators in multi-bit/cycle SAR ADCs.
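The timing advantage of the asynchronous scheme can be captured in a toy model (all delay numbers arbitrary, the model our own): a synchronous design must budget a full worst-case clock period per bit, while the asynchronous ripple procedure spends only each comparison's actual delay:

```python
import random

def synchronous_conversion_time(n_bits, t_clk):
    """Synchronous SAR: every bit decision gets a full clock period sized for
    the worst case, so a conversion needs (N + 1) periods (N bits + sampling)."""
    return (n_bits + 1) * t_clk

def asynchronous_conversion_time(comparison_times, t_sample):
    """Asynchronous SAR: each comparison triggers the next as soon as it
    resolves, so the conversion time is the sum of the actual delays."""
    return t_sample + sum(comparison_times)

random.seed(7)
n, t_clk = 8, 1.0                  # clock period sized for the slowest comparison
actual = [random.uniform(0.3, 1.0) for _ in range(n)]   # per-bit delays (arbitrary)
assert asynchronous_conversion_time(actual, t_clk) < synchronous_conversion_time(n, t_clk)
```

Because most comparisons resolve faster than the worst case, the asynchronous sum is smaller than the synchronous budget, which is the speed-up described in [74].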
Several techniques have been reported to increase the energy efficiency of SAR ADCs, especially by reducing the power consumed by the switching operations in the CDAC. The switching scheme determines the DAC size and energy efficiency of a SAR ADC. Capacitive SAR ADCs operate based on one of two concepts: charge redistribution and charge sharing. In charge redistribution architectures, the total DAC capacitance is fixed, and the DAC output is set by changing the voltage on the bottom plates of the capacitors [29]. Most standard capacitive SAR ADCs operate based on charge redistribution. Moreover, there is no attenuation of the sampled input voltage (when neglecting the impact of parasitic capacitances). Monotonic capacitor switching [75] is an example of efficient charge redistribution that requires one cycle less than conventional capacitive SAR ADCs. Furthermore, the total capacitance is reduced to half of that of a typical SAR ADC. Monotonic switching reduces the power consumption and increases the sampling rate. However, the variation of the input common-mode voltage in monotonic switching causes signal-dependent offsets, which degrades the linearity of the ADC. In charge-sharing architectures, the DAC output is varied by connecting pre-charged capacitors to the DAC nodes. Therefore, the total capacitance of the DAC increases during the conversion [76]. A charge-sharing switching scheme has better energy efficiency than the conventional charge redistribution switching technique. Nevertheless, the input voltage is attenuated at the output of the DAC due to the increase of the total capacitance after sampling. In addition, unlike charge redistribution architectures, the charge-sharing approach necessitates an explicit S/H.

Comparator-Based Asynchronous Binary Search (CABS) ADC
The comparator-based asynchronous binary search (CABS) ADC [26] can be regarded as an architecture in between the flash and SAR ADCs, having characteristics that resemble both types. Unlike a SAR ADC, which employs one comparator (for 1 bit/cycle) and varies the reference level in each cycle, a CABS ADC consists of a comparator tree. For N-bit resolution, it requires 2^N − 1 comparators (as in a flash ADC), but only N comparators are activated during a complete conversion. The first comparator is triggered by a clock, while the others are triggered asynchronously by the output of a previous comparator. The CABS architecture combines the advantages of both flash and SAR ADCs to realize high-speed operation with low power consumption. However, due to its large number of comparators, it has a high input capacitance and occupies a relatively large chip area. A version of the CABS ADC with fewer comparators is presented in [27], where the total number of comparators is reduced to 2·N − 1. Nevertheless, the time required for reference settling and the operation of additional digital gates for each comparison can limit the speed of this reduced architecture.
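The CABS principle (a full comparator tree in which only one root-to-leaf path fires per conversion) can be sketched as follows; this is a behavioral model of our own, with each node holding the ideal midpoint reference of its sub-range:

```python
def cabs_convert(vin, n_bits=3, vref=1.0):
    """CABS conversion sketched as a walk down a tree of 2^N - 1 comparators,
    each holding a fixed reference; every decision asynchronously triggers one
    child comparator, so only N of the 2^N - 1 comparators fire."""
    code, fired = 0, 0
    for i in range(n_bits - 1, -1, -1):
        ref = (code + (1 << i)) * vref / 2 ** n_bits  # this node's fixed reference
        fired += 1                                    # one comparator activates
        if vin >= ref:
            code += 1 << i                            # descend into the upper half
    return code, fired

total_comparators = 2 ** 3 - 1        # 7 comparators in a 3-bit CABS tree
code, fired = cabs_convert(0.7)
assert (code, fired) == (5, 3)        # code 5 of 8, only 3 comparators fired
assert fired < total_comparators
```

The model highlights the tradeoff stated above: the full tree must exist (input capacitance and area scale like a flash ADC), yet the dynamic activity per conversion is only N comparisons, as in a SAR ADC.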

Power-Efficient High-Speed Medium-Resolution ADCs
Before engaging in a case study throughout the remainder of this section, this subsection provides additional design aspects and tradeoffs of basic ADCs with regard to their efficacy in hybrid architectures. Flash, folding, and interpolating ADC architectures are inherently suitable for high-speed analog-to-digital conversion. However, for medium resolutions and above, their power efficiency degrades significantly due to the large number of active comparators [9,13,14,77,78]. For pipelined ADCs, high-performance design becomes more difficult as scaled CMOS supply voltages continue to decrease, due to the stringent gain and bandwidth requirements for the opamps. In addition, the high power consumption of high-performance opamps limits the minimum power of pipelined ADCs. However, high-speed single-channel pipelined ADCs have recently been reported with considerably low power consumption thanks to techniques such as calibration [79] and incomplete settling [7], which relax the required opamp specifications. Moreover, time-interleaved pipelined ADC design is another method to achieve high-speed low-power performance, as in [71], which also uses opamp sharing for further energy savings.
Technology scaling with reduced supply voltages in digital CMOS processes favors ADC topologies that have only a few analog elements, such as SAR ADCs. In deep submicron technologies, SAR ADCs can achieve faster sampling rates, lower power consumption, and smaller layout areas. Their sampling rate is limited by the need for a high-speed internal clock for the SAR logic and the settling time of the DAC in every cycle. However, with time-interleaved SAR ADCs, high-speed and low-power performance is achievable [3-5,24,59]. Most of the SAR ADCs reviewed in Section 3 can be used in a TI architecture. The conversion speed of the SAR ADC determines the number of required TI channels. In addition, there are tradeoffs between the number of channels, the total power consumption, and the input bandwidth of the TI-ADC. The ADCs in [3,24] utilize multi-bit per cycle SAR ADCs in TI-SAR ADCs for high-speed low-power performance. The higher conversion speed of multi-bit/cycle SAR ADCs helps to reduce the number of time-interleaved channels. Nevertheless, when fabricated in CMOS technologies with relatively long channel lengths [3], they are not as power-efficient as in short-channel CMOS technologies [23,24].
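The relation between sub-ADC speed and channel count mentioned above is simple arithmetic, sketched below (the helper name is illustrative):

```python
import math

def ti_channels(fs_total_hz, fs_sub_hz):
    # number of interleaved sub-ADCs needed to sustain the aggregate rate
    return math.ceil(fs_total_hz / fs_sub_hz)

# e.g. 1 GS/s aggregate from 250 MS/s sub-ADCs
print(ti_channels(1e9, 250e6))  # 4
```

A faster sub-ADC (e.g., multi-bit/cycle SAR) directly reduces this channel count, which in turn eases clock generation and channel-mismatch calibration.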
Hybrid ADC architectures benefit from the combination of several ADCs to achieve high-speed and power-efficient operation. Using subranging or two-step architectures can facilitate the reduction of power consumption in ADCs [1,28]. The ADCs reported in [10,80] have similar two-step and subranging architectures. They combine two-step, time-interleaved, and flash ADC architectures together. The MSBs are resolved by one flash ADC in the first stage, while the conversions for LSBs are performed by two-channel time-interleaved flash ADCs in the second stage. In addition, an MDAC is used to amplify the residue voltage, because both (coarse and fine) ADCs operate with full-scale range. Using more power-efficient ADC architectures inside a hybrid ADC can lead to additional power-saving for higher resolutions. For example, a subranging flash SAR ADC has been described in [22], which resolves the MSBs with a flash ADC that controls the MSB capacitors of the CDAC in the SAR ADC of the second stage to resolve the remaining LSBs. Some low-power hybrid ADC architectures have recently been introduced, combining methods of subranging and time-interleaving with flash and SAR stages [29][30][31]. A subranging TI-ADC has been reported in [29], which uses a front-end flash ADC to resolve the MSBs at the full conversion rate of 1 GS/s. The fine stage consists of eight time-interleaved 10-bit SAR ADCs, where their MSB capacitors in the CDAC are controlled by the flash ADC outputs. Redundancy in the flash ADC and the CDAC is used to relax the offset constraints of the flash ADC. In [30], two flash TI-SAR ADCs are time-interleaved to lower the sampling rate of the front-end flash in order to save more power compared to [29]. It is worthwhile to mention that for these architectures, the mismatches between the two subranging stages must be reconciled through calibration or redundancy techniques to avoid linearity problems. 
Due to their high power-efficiency with high speeds and medium resolutions, hybrid ADCs are viable alternatives to conventional TI-SAR ADCs.

Architecture Case Study
The work in [32,33] exemplified the concept of a subranging time-interleaved architecture by realizing a hybrid flash TI-ADC with four time-interleaved CABS ADCs. This architecture employs a CABS ADC in a time-interleaved hybrid ADC configuration to take advantage of its power efficiency at relatively high speeds compared to conventional SAR ADCs. A flash ADC resolves the most significant bits (MSBs), whereas a time-interleaved ADC resolves the least significant bits (LSBs). The fast MSB conversion by the flash ADC together with subranging helps to reduce the number of interleaved ADCs, which results in higher input bandwidth. A merged sample-and-hold and capacitive digital-to-analog converter (SHDAC) performs sampling as well as residue generation for the subranging operation. The systematic and random offsets of the flash ADC comparators are calibrated using a foreground calibration technique.
A single-ended illustration of the ADC architecture from [32,33] is displayed in Figure 14. The architecture is a subranging ADC comprised of a 3-bit 1 GS/s flash ADC in the first stage and four time-interleaved 250 MS/s 5-bit CABS ADCs in the second stage. The hybrid ADC has a fully differential architecture to increase the dynamic range and suppress common-mode distortion and noise. It does not require an extra front-end sample-and-hold, because the SHDAC in each channel both samples the input for the flash ADC and performs the residue generation for the CABS ADC. In particular, since each SHDAC is shared by the flash ADC and the CABS ADC, it is ensured that an identical sampled voltage is processed by both stages. In comparison to hybrid ADC architectures in which the signal is sampled separately for the MSB stage and the LSB stage [30], this architecture is more immune to the sampling errors that originate from clock skews and resistor-capacitor (RC) mismatches between the two stages. However, it requires additional bootstrap switches to connect the flash ADC to the proper SHDAC during each sampling cycle.
In comparison to conventional asynchronous SAR ADCs, a CABS ADC can support faster speed, since it does not depend on settling delays associated with switched capacitors or changing reference voltages for each comparator decision. Due to the asynchronous operation, achieving 5-bit resolution at a relatively high speed of 250 MS/s is more feasible with a CABS ADC than with a synchronous SAR ADC. However, due to the large number of comparators, the CABS ADC has an input capacitance that is comparable to the total sampling capacitance in the SHDAC. Thus, the loading effect can change the output of the capacitive network, causing errors in the CABS ADC decisions. To alleviate this issue, a unity-gain voltage buffer is placed between the SHDAC and the CABS ADC (Figure 14). The buffer also isolates the SHDAC from the kickback noise of the CABS ADC. To calibrate the flash ADC's offsets, one extra sampling channel is present, which is only activated when the ADC is in the calibration mode. The flash ADC's systematic and mismatch offsets are calibrated with a foreground calibration that imitates the sampling conditions in the main channels. Since the extra calibration channel is disconnected during normal operation, it does not significantly affect the input bandwidth when switches with small parasitic capacitances are employed.

A complete analog-to-digital conversion in each channel of this hybrid ADC is completed in four phases: (1) the SHDAC samples the input signal; (2) the flash ADC resolves the MSBs; (3) the SHDAC performs the residue generation; and (4) the CABS ADC resolves the LSBs. Figure 15 shows the clock diagram for one channel. Each channel has the same timing scheme with a delay of 1 ns compared to its neighboring channel. The clock phases are generated from a 1-GHz master clock. At the beginning of a conversion, the SHDAC of channel i (clocked by CLK_SAMP.i) samples the input signal, and the switch between the SHDAC and flash ADC (controlled by CLK_SAMPX.i) is closed while the ones in the other channels are opened. The clock signals CLK_SAMPX.1 through CLK_SAMPX.4 must be non-overlapping in order to avoid changing the charge that is held on the adjacent SHDACs. After the sampling phase is complete, the flash ADC resolves the first three MSBs. Based on the latched thermometer output of the flash ADC at the beginning of the third phase, the SHDAC generates the residue voltage that passes through the unity-gain buffer. Finally, the buffered residue voltage is delivered to the CABS ADC, which operates with one-eighth of the full-scale range. Adding redundancy is a way to relax the offset requirement in subranging ADCs. However, it necessitates including additional comparators. In the flash ADC, for example, this would increase the power consumption, input parasitic capacitance, kickback, and chip area. Similarly, adding redundancy to the second stage becomes challenging when the speed/power tradeoff is of foremost importance, especially due to the increased loading from the second stage (with a CABS ADC) or the increased decision time (with a conventional SAR ADC). Furthermore, driving the extra comparators would increase the power consumption in the clock generation circuitry.
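The four-phase subranging conversion can be sketched behaviorally. This is an idealized model (offsets, redundancy, and buffer effects are ignored; the function name is illustrative):

```python
# Behavioral sketch of the subranging conversion: flash MSBs, residue, fine LSBs.
def subranging_convert(vin, vref=1.0, l_bits=3, m_bits=5):
    lsb_coarse = vref / 2 ** l_bits
    msb = min(int(vin / lsb_coarse), 2 ** l_bits - 1)     # phase 2: flash ADC
    residue = vin - msb * lsb_coarse                      # phase 3: SHDAC
    lsb_fine = lsb_coarse / 2 ** m_bits                   # fine stage works on
    lsbs = min(int(residue / lsb_fine), 2 ** m_bits - 1)  # 1/8 of full scale
    return (msb << m_bits) | lsbs                         # phase 4: CABS ADC

print(subranging_convert(0.7))  # 179, the ideal 8-bit code for 0.7 of full scale
```

Note that the fine stage only ever sees a residue within one coarse LSB, which is why the CABS ADC can operate with one-eighth of the full-scale range.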
A foreground offset calibration method was developed [32,81] to achieve the low-power performance at the cost of area overhead. Since the calibration is offline, it does not increase the total power consumption.
When designing the discussed hybrid ADC architecture for a particular application, the decision concerning the number of bits for the flash ADC and the CABS ADC should be made under consideration of power and area impacts. The analytical formulas and quantitative comparisons in terms of power efficiency and total area are provided in Table 1 to compare the different implementation options for the 8-bit subranging architecture.
In Table 1, P_flash.comp = 143 µW and A_flash.comp = 448 µm² are the power consumption and area of each comparator in the flash ADC, and P_CABS.comp = 48 µW and A_CABS.comp = 504 µm² are the power consumption and area of each comparator in the CABS ADC, respectively. L designates the flash ADC resolution, and M designates the CABS ADC resolution. According to Table 1, there is a tradeoff between power efficiency and area, as also visualized through the plots in Figure 16. The L-M = 2-6 architecture occupies the largest area, and L-M = 5-3 has the highest power consumption, making them the least suitable options. Minimizing the power consumption was the main priority of this work, which is why the 3-5 configuration was chosen over the 4-4 configuration.
In addition to power and area considerations, selecting a 3-bit flash instead of a 4-bit flash ADC leads to approximately half the amount of kickback noise and input capacitance.
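The tradeoff can be reproduced numerically from the per-comparator figures quoted above. The exact formulas of Table 1 are not reproduced in this text, so the following is a reconstruction under stated assumptions: all 2^L − 1 flash comparators fire every conversion, only the M active CABS comparators contribute to power, and the area counts all comparators in all four TI channels:

```python
# Reconstructed comparator power/area tradeoff for an 8-bit split (L + M = 8).
P_FLASH, A_FLASH = 143e-6, 448.0   # per flash comparator (W, um^2)
P_CABS, A_CABS = 48e-6, 504.0      # per CABS comparator (W, um^2)
N_CHANNELS = 4

def comparator_power_uW(L, M):
    # flash: 2^L - 1 comparators active; CABS: only M fire per conversion
    return ((2 ** L - 1) * P_FLASH + M * P_CABS) * 1e6

def comparator_area_um2(L, M):
    # area counts every comparator, including all four CABS channels
    return (2 ** L - 1) * A_FLASH + N_CHANNELS * (2 ** M - 1) * A_CABS

for L in (2, 3, 4, 5):
    M = 8 - L
    print(f"L-M = {L}-{M}: {comparator_power_uW(L, M):6.0f} uW, "
          f"{comparator_area_um2(L, M):8.0f} um^2")
```

Under these assumptions the 2-6 split indeed has the largest area and the 5-3 split the highest power, consistent with the text; the 3-5 choice sits near the power-optimal end while keeping the area moderate.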

Prototype Chip Layout and Testing Considerations
This subsection provides measurement setup details and explanations for the hybrid ADC example architecture discussed in Section 4.2, beyond the information in [32,33]. The chip was fabricated in 130-nm CMOS technology and assembled in a TQFP128 package with a ground plane underneath. The full layout of the ADC is shown in Figure 17. A pad frame with 132 pads was created, where three of the pads were bonded to the ground plane of the package and one pad was unused. Due to the metal density rules of the technology, the majority of the chip area was filled with top metal layers. To benefit from the available 4 mm × 4 mm silicon area, most of the unused areas contained large decoupling MIM capacitors to reduce the high-frequency noise on the sensitive reference and supply voltage lines. All of the pins on the left and right side of the chip were assigned to the 8-bit low-voltage differential signaling (LVDS) outputs of channels 1-4 (64 pins in total). Several pads were assigned to power supply and ground to reduce the overall bonding wire inductances in order to minimize bouncing noise. The pads assigned to the ADC input and clock signals were located on two opposite sides (bottom and top) of the chip to minimize the coupling between the corresponding bonding wires, as well as the on-chip routings. The ADC core occupies 0.69 mm², including all of the SHDACs, CABS ADCs, flash ADC, bootstrap switches, DFFs, and thermometer-to-binary encoders. The clock generation circuits occupied 0.03 mm², including the on-chip clock buffer, which was only added for the prototype test interface. Combined, the ADC core and clock generation occupied an area of 0.72 mm². The digital calibration occupied 0.71 mm², which included the calibration logic, DAC, test signal generation, and the extra calibration channel (SHDAC, unity-gain buffer).
As shown in Figure 18, an on-chip matching network with two 50-Ω high-precision poly resistors ensured impedance matching. The common-mode voltage of the differential input signals was generated off-chip and buffered through an opamp IC (TI OPA2626) in unity-gain configuration. The opamp had an open-loop output resistance of only 1 Ω, which provided a very low impedance for the reference voltage. The 2-Ω series resistor at the output of the opamp improved stability. The single-ended input signal was generated with an RF signal generator (Keysight N5173B). A tunable band-pass filter (BPF) served as an antialiasing filter to ensure the removal of undesired harmonics of the sinusoidal input signal. Using a BPF is optional, but highly recommended. The DC blockers removed the DC level from the external signals, because the DC level of the inputs was set by the on-chip matching network.

A single-ended to differential balun (Marki BAL-0006) was used to convert the single-ended input signal to differential signals at the ADC inputs (Figure 18). However, it was observed that the linearity and amplitude balance of the balun's differential outputs degraded for frequencies below 50 MHz. Therefore, an evaluation board with a single-ended to differential amplifier configuration (TI THS4509 EVM) was used to drive the ADC for that input frequency range (fin < 50 MHz), as depicted in Figure 19. The 1-GHz clock signal was generated by a low-jitter RF signal generator (Agilent E8257D) and applied as shown in Figure 20. A bias tee (Mini-Circuits ZFBT-6GW-FT+) was used to set the DC level of the sinusoidal clock signal. The clock signal was terminated on the chip with an AC-coupled 50-Ω resistor to ensure impedance matching.

Figure 18. Test setup configuration at the ADC's differential inputs.
Figure 19. Test setup configuration at the ADC's differential inputs for low-frequency measurements.
Figure 20. Test setup configuration for the single-ended 1-GHz input clock signal.

Figure 21 displays the interface to acquire the digital outputs of the hybrid ADC. An LVDS receiver IC (TI LVDT386) converts the differential LVDS outputs to single-ended signals (LVTTL) before acquisition with a logic analyzer (Keysight 16852A). This LVDS receiver IC can detect differential voltages as low as 100 mVp-p and consists of 16 receiver channels with an integrated 110-Ω termination resistor for each channel. The logic analyzer can read the LVTTL signals at sampling rates of up to 2.5 GS/s, which is 10 times faster than the ADC output channels.
A printed circuit board (PCB) was designed to evaluate the hybrid ADC. As shown in Figure 22, SMA connectors were used for the ADC inputs and clock signal, as well as for the 10-MHz clock for the calibration logic. Adjustable voltage regulators (Maxim MAX8526) generated separate 1.2-V analog and digital supply voltages for the hybrid ADC chip, as well as 2.5 V for the opamp ICs and 3.3 V for the LVDS receiver ICs. The reference voltages were generated on board and delivered to the ADC through unity-gain voltage buffers with low output impedance, similar to the generation of V_CM in Figure 18. For the bias voltages that were connected to high-impedance nodes on the ADC chip (gates of MOS transistors), simple voltage dividers were employed without buffering.
The ADC can operate in both normal mode and calibration mode, which can be set by an on-board switch. Another switch is used to set and reset the clock generation circuitry.
To also allow manual calibration, the PCB contains DIP switches for the 6-bit input data of the coarse and fine codes, 4-bit address lines, and control signals of the calibration logic.

Summary of Measurement Results
The ADC performance was measured with the optimum calibration codes written into the on-chip memory. The sinusoidal input and 1-GHz clock signals were applied as described in Section 4.3. After recording the digital output data from the four channels with the logic analyzer, the data were evaluated in MATLAB. Detailed measurement results are reported in [33], whereas this section summarizes the key observations made during the testing of the example hybrid ADC.
A histogram testing method was employed to evaluate the DNL and INL errors of the hybrid ADC [82]. The DNL and INL errors were measured with a 3.234863-MHz sinusoidal input signal with a swing slightly larger than the full-scale voltage. This swing ensured that the output of the ADC was slightly clipped at the peaks, such that all of the codes would be present in the acquired data. With the histogram testing method, a large number of data points is required to assure the accuracy of the calculated results. The DNL and INL were statistically calculated in MATLAB by comparing the resulting histogram of the sinusoidal output data to the bathtub-shaped histogram of an ideal ADC with the same sinusoidal input. The measured DNL and INL for the 8-bit outputs of the hybrid ADC revealed large nonlinearity errors (DNL of −1/+1.88 LSB and INL of −3.58/+2.79 LSB at the 8-bit level after flash ADC calibration), which also limited the dynamic performance (SNDR and SFDR) of the ADC for 8-bit accuracy. In retrospect, the main factor behind the large 8-bit DNL/INL was the variation of the fabricated CABS ADC comparators' offsets caused by random device mismatches. The CABS comparator offsets were estimated under the assumption of correlations between parameters during the Monte Carlo simulations, based on the proximity of the devices on the chip and the matched layout configurations. However, the measurements revealed that the offsets after fabrication were higher than estimated. As observed during the measurements, removing the last two LSBs of the CABS ADC stage led to suitable linearity performance for the ADC when evaluated with 6-bit accuracy. For this reason, the results in this section focus mainly on tests with 6-bit resolution (i.e., not using the last two LSB outputs).
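The sine-histogram comparison can be sketched in a few lines. This is an illustrative reconstruction (assuming NumPy; `dnl_inl_from_sine_hist` is a hypothetical helper, and an ideal quantizer stands in for the measured output codes):

```python
import numpy as np

def dnl_inl_from_sine_hist(codes, n_bits):
    """Estimate DNL/INL (in LSB) from the code histogram of a sine input
    by comparing it against the ideal bathtub-shaped sine histogram.
    The two end bins, which absorb the clipped peaks, are discarded."""
    n = 2 ** n_bits
    hist = np.bincount(codes, minlength=n).astype(float)
    inner = hist[1:-1]
    # ideal occupation probability of each inner code for a full-scale sine:
    # arcsine-distributed amplitude between adjacent transition levels
    edges = np.arange(1, n) / n * 2.0 - 1.0
    p = (np.arcsin(edges[1:]) - np.arcsin(edges[:-1])) / np.pi
    ideal = inner.sum() * p / p.sum()
    dnl = inner / ideal - 1.0
    inl = np.cumsum(dnl)
    return dnl, inl

# sanity check with an ideal 6-bit quantizer and a full-scale sine
rng = np.random.default_rng(1)
x = np.sin(2 * np.pi * rng.random(500_000))
codes = np.clip(np.floor((x + 1.0) * 32), 0, 63).astype(int)
dnl, inl = dnl_inl_from_sine_hist(codes, 6)
print(np.max(np.abs(dnl)))  # small residual, set by the finite record length
```

For an ideal quantizer, both DNL and INL stay near zero; the prototype's recorded codes would take the place of `codes`, and the record length sets the statistical floor of the estimate.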
The measured DNL and INL errors of the hybrid ADC with 6-bit equivalence before and after flash ADC calibration were within −0.41/+0.50 LSB and −0.77/+0.52 LSB, respectively. The results indicate that the nonlinearity errors of the hybrid ADC were significantly reduced by the calibration of the flash ADC. Figure 23 shows the measured output spectra of the hybrid ADC with 6-bit evaluation at a sampling rate of 1 GS/s with low-frequency and high-frequency (near Nyquist rate) sinusoidal full-scale input signals before and after flash ADC offset calibration. Note that by dropping the last two LSBs from the 8-bit output (i.e., 6-bit evaluation), the effective number of bits (ENOB) at the Nyquist rate reduced by 0.22 bit in comparison to the 8-bit case. The flash offset calibration improved the ENOB of the 6-bit hybrid ADC output by 1.87 and 1.22 bits for the low and high input frequencies, respectively. The 6-bit 1 GS/s hybrid ADC achieved 34.94 dB SNDR and 48.52 dB SFDR with a low input frequency, and 33.42 dB SNDR and 45.71 dB SFDR with a near-Nyquist-rate input frequency. The 6-bit evaluation of the 1 GS/s hybrid ADC revealed an ENOB of 5.26 with a near-Nyquist-rate input frequency. The undesired frequency component at 250 MHz in the output spectra originated from the offset mismatches among the TI channels. Furthermore, it could be observed that the components caused by the timing mismatch between the TI channels (Equation (5)) increased with high input frequencies (Figure 23b). As discussed in Section 3.3, such non-idealities can be calibrated off-chip or on-chip in order to enhance the performance of a TI-ADC. However, for the evaluations of this work, the post-processing did not involve the calibration of the impacts from time-interleaved channel mismatches.
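The reported SNDR and ENOB values are consistent with the standard conversion formula, which can be checked in a couple of lines:

```python
# Standard relation between SNDR (dB) and effective number of bits (ENOB).
def enob(sndr_db):
    return (sndr_db - 1.76) / 6.02

# near-Nyquist measurement quoted above: 33.42 dB SNDR
print(round(enob(33.42), 2))  # 5.26, matching the reported ENOB
```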
The simulated total harmonic distortion (THD) of the low-power buffer was 4.7 dB higher, which is acceptable considering that the 6-bit hybrid ADC requires approximately 12 dB less SNDR performance compared to the 8-bit hybrid ADC. This would reduce the total power consumption of the unity-gain buffers by a factor of 2, if the 5-bit CABS were replaced by a 3-bit CABS, leading to an estimated total power consumption of 8.7 mW for a 6-bit hybrid ADC.  Table 2 lists the measured performance specifications of the described hybrid ADC chip in comparison to the other reported ADCs with similar resolutions and sampling rates. The power consumption of the hybrid ADC is reported as the measured power for the 8-bit and 6-bit cases. Furthermore, an estimated power for a 6-bit redesign of the hybrid ADC is reported for comparison based on the reasoning in the previous two paragraphs. The hybrid ADC fabricated in 130-nm CMOS technology stands amongst the ADCs with a low FoM due to its power efficiency. A lower FoM would be expected if the hybrid ADC would be designed and fabricated in a technology with a smaller channel length, because the architecture would benefit from transistors with higher transition frequencies (fT) and considerably lower parasitic capacitances. Hence, a design in a newer CMOS technology would lead to lower power consumption and a significant area reduction. Overall, in comparison to the specifications of the other works in Table 2, the measurement results of the proposed ADC architecture provide a proof-of-concept for its efficiency and performance. The total measured analog power was 6.6 mW, including the flash ADC (1.05 mW), all of the CABS ADCs (1.19 mW), the SHDACs with decoders (0.81 mW), and the four unity-gain buffers with eight operational transconductance amplifiers (OTAs) (3.55 mW). The total digital power of 0.86 mW included the DFFs and thermometer-to-binary encoders at the flash ADC output. 
The measured power consumptions of the clock generation circuitry and the clock buffer were 3.54 mW and 2.35 mW, respectively. Accordingly, the total power consumption of the 8-bit 1 GS/s ADC was 11 mW, excluding the clock buffer, which was only used for testing. The power consumption measurements were performed at room temperature while applying a 10.2-MHz full-scale sinusoidal input signal to the ADC clocked at 1 GHz. The power consumption of each circuit was measured by disconnecting its corresponding 1.2-V supply voltage jumper and measuring the average current with a multimeter.
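The reported power budget can be cross-checked with simple arithmetic; the sketch below (dictionary labels are illustrative, values are the measured figures in mW) reproduces the 6.6-mW analog total and the 11-mW ADC total:

```python
# Cross-check of the reported power budget (all values in mW).
analog_blocks = {
    "flash ADC": 1.05,
    "CABS ADCs (all four channels)": 1.19,
    "SHDACs with decoders": 0.81,
    "unity-gain buffers (8 OTAs)": 3.55,
}
analog_total = sum(analog_blocks.values())  # reported as 6.6 mW
digital_total = 0.86                        # DFFs + thermometer-to-binary encoders
clock_generation = 3.54                     # on-chip clock generation circuitry
clock_buffer = 2.35                         # excluded from the total (test-only)

adc_total = analog_total + digital_total + clock_generation  # reported as 11 mW
print(f"analog: {analog_total:.2f} mW, ADC total: {adc_total:.2f} mW")
```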
For a fair 6-bit evaluation of the hybrid ADC, the power consumption should be adjusted to account for the last two LSBs of the ADC output not being used. By reducing the CABS ADC resolution to 3 bits, the number of CABS comparators would be reduced from 31 to 7. This would result in a significant reduction of the area and input capacitance of the CABS ADC by a factor of 4. Moreover, the number of active comparators in the CABS ADC would be reduced from 5 to 3, leading to a 40% reduction (approximately 0.5 mW) of the power consumed by the CABS ADCs. Therefore, the total power consumption of the hybrid ADC for the 6-bit evaluation was 10.5 mW, based strictly on the measurement data. It is worth mentioning that, for a 3-bit redesign of the CABS ADC, the total area of the four CABS ADCs would become more than four times smaller, resulting in further area savings.
Furthermore, the operational transconductance amplifier (OTA) in the unity-gain buffer was designed to drive the input capacitance of the 5-bit CABS ADC. In the case of a 3-bit CABS ADC design, the load capacitance of the buffer would be reduced from 250 fF to only 60 fF. To estimate the buffer power consumption associated with driving a 3-bit CABS ADC, the OTA in the buffer was simulated with lower bias currents and a 60-fF load capacitance. According to the simulation results, the same OTA could drive a 60-fF load while consuming only 320 µW of power and achieving a DC gain and GBW (35.6 dB, 1.53 GHz) similar to the OTA on the prototype chip. The simulated total harmonic distortion (THD) of the low-power buffer was 4.7 dB higher, which is acceptable considering that the 6-bit hybrid ADC requires approximately 12 dB less SNDR performance than the 8-bit hybrid ADC. If the 5-bit CABS ADCs were replaced by 3-bit CABS ADCs, the total power consumption of the unity-gain buffers would thus be halved, leading to an estimated total power consumption of 8.7 mW for a 6-bit hybrid ADC. Table 2 lists the measured performance specifications of the described hybrid ADC chip in comparison to other reported ADCs with similar resolutions and sampling rates. The power consumption of the hybrid ADC is reported as the measured power for the 8-bit and 6-bit cases. Furthermore, an estimated power for a 6-bit redesign of the hybrid ADC is reported for comparison, based on the reasoning in the previous two paragraphs. The hybrid ADC fabricated in 130-nm CMOS technology stands amongst the ADCs with a low FoM due to its power efficiency. A lower FoM would be expected if the hybrid ADC were designed and fabricated in a technology with a smaller channel length, because the architecture would benefit from transistors with higher transition frequencies (fT) and considerably lower parasitic capacitances.
Hence, a design in a newer CMOS technology would lead to lower power consumption and a significant area reduction. Overall, in comparison to the specifications of the other works in Table 2, the measurement results of the proposed ADC architecture provide a proof-of-concept for its efficiency and performance.
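For reference, a figure of merit for the measured and estimated power figures can be computed with the widely used Walden FoM, FoM = P/(2^ENOB · f_s). Note that this is an illustrative check under the assumption that Table 2 uses this definition; the redesign estimate further assumes an unchanged ENOB of 5.26:

```python
def walden_fom(power_w: float, enob: float, fs_hz: float) -> float:
    """Walden figure of merit in joules per conversion step:
    FoM = P / (2^ENOB * f_s)."""
    return power_w / (2**enob * fs_hz)

# Measured 6-bit case: 10.5 mW, ENOB 5.26 at Nyquist, 1 GS/s.
fom_measured = walden_fom(10.5e-3, 5.26, 1e9)  # ~274 fJ/conversion-step
# Estimated 6-bit redesign: 8.7 mW at an (assumed) unchanged ENOB.
fom_redesign = walden_fom(8.7e-3, 5.26, 1e9)

print(f"measured: {fom_measured * 1e15:.0f} fJ/step, "
      f"redesign estimate: {fom_redesign * 1e15:.0f} fJ/step")
```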

Conclusions
The common Nyquist-rate ADC architectures were reviewed with a focus on their advantages and drawbacks in relation to more complex hybrid ADCs. In general, hybrid ADC architectures benefit from the combination of several ADCs to achieve high-speed and power-efficient operation. Even though their composition and building blocks vary depending on the application, a trend can be observed towards the more frequent use of SAR ADCs to take advantage of the fast and efficient switching operations that result from CMOS technology scaling. Nevertheless, it was pointed out that non-idealities such as channel offset and gain mismatches, timing skews, and channel bandwidth mismatches must be addressed when designing low-power hybrid ADCs. As an example, a 1-GS/s hybrid ADC for high-speed low-power applications was described. The discussed subranging architecture contains a flash ADC first stage and a second stage with four time-interleaved comparator-based asynchronous binary search ADCs. Prototype chip testing results were summarized to show low-power operation with 6-bit resolution at 1 GS/s. The measured ENOB was above 5.26 up to the Nyquist frequency with a power consumption of 10.5 mW from a 1.2-V supply.

Conflicts of Interest:
The authors declare no conflict of interest.