A Fast Lock All-Digital MDLL Using a Cyclic Vernier TDC for Burst-Mode Links

Abstract: An all-digital multiplying delay-locked loop (MDLL)-based clock multiplier featuring a time-to-digital converter (TDC) to achieve fast power-on capability is presented. The proposed MDLL adopts a new offset-free cyclic Vernier TDC to achieve a fast lock time of 15 reference clock cycles while maintaining a wide detection range and high resolution. The proposed offset-free TDC also uses a correlated double sampling technique to remove mismatch and offset issues, resulting in low jitter characteristics. After the MDLL is quickly locked, the TDC is turned off, and the MDLL goes into a delta-sigma modulator (DSM)-based sequential tracking mode to reduce power consumption and improve jitter performance. Implemented in a 65-nm 1.0-V CMOS process, the proposed MDLL occupies an active area of 0.043 mm² and generates a 2.4-GHz output clock from a 75-MHz reference clock (multiplication factor N = 32). It achieves an effective peak-to-peak jitter of 9.4 ps and consumes 3.3 mW at 2.4 GHz.


Introduction
As the demand for high-speed off-chip I/O bandwidth in computing systems increases, the importance of energy efficiency of serial links is rapidly increasing. One approach to address this problem is to use burst-mode communication. Burst-mode data communication, traditionally applied to passive optical networks (PONs), has recently begun to be applied to electrical chip-to-chip serial links [1][2][3][4]. In conventional serial links, there is idle power consumed by transceivers even when the link is not in use. However, in a burst-mode-based energy proportional link [4], energy efficiency can be increased because the link and transceivers are powered-on/-off rapidly only when there is a data transmission request.
One of the most critical building blocks in this high-speed energy-proportional link design is a fast power-on (or fast lock) clock multiplier for burst-mode operation. The fast power-on/-off clock multiplier should be able to quickly multiply the reference clock and complete the phase lock to generate a de-skewed high-frequency output clock.
In general, clock multipliers have been designed based on phase-locked loops (PLLs). However, the loop bandwidth of a typical PLL cannot be easily increased to shorten the lock time due to stability problems [5]. Various techniques have been proposed in the PLL structure to reduce the locking time. Among the PLLs showing reasonable power and performance, the digital PLL from [6] achieved a lock time of forty reference clock cycles, which is insufficient for use in burst mode serial link applications.
In this paper, instead of using a PLL, we introduce a clock multiplier that uses a digital multiplying delay-locked loop (MDLL) to obtain fast lock characteristics. A typical MDLL generates an output clock frequency fclkout that is N times the reference clock frequency fclkref, where N is the frequency multiplication factor [7][8][9][10][11][12][13][14][15][16][17][18][19]. Figure 1 shows the block diagram of a typical MDLL. It consists of a multiplexer (MUX), a multiplexed ring oscillator (MRO), a phase detector (PD), a charge pump (CP) and loop filter (LF), a divide-by-N divider, and select logic. By periodically injecting a clean reference clock edge, the MDLL can achieve better jitter performance with reduced loop bandwidth limit issues. The extended loop bandwidth of the MDLL can bring fast lock characteristics, but most of the MDLLs presented so far have been mainly concerned with jitter and phase noise characteristics.
Among the various all-digital MDLL architectures [8][9][10][11][12][13][14][15][16][17][18][19], the digital MDLL in [14] achieved a lock time of forty clock cycles by using a successive approximation register (SAR)-based binary search algorithm. To further improve the locking time of a digital MDLL, we propose a new method of using a time-to-digital converter (TDC) [20][21][22][23][24][25][26] in this paper. Conventionally, the purpose of using a TDC in digital MDLL design was to replace the phase detector (PD) and generate a digital code proportional to the phase difference between two inputs. Since the quantization error of a TDC increases jitter, the main issue of conventional TDC-based MDLLs was the design of a high-resolution TDC with low power consumption [8,13]. In these conventional digital MDLLs, low jitter and low reference spur characteristics were the main concern, and little attention was paid to fast power-on or locking time.
In this paper, we present a new all-digital MDLL that features a cyclic Vernier TDC to achieve fast power-on capability. This is the first fast lock all-digital MDLL that utilizes a cyclic Vernier architecture [20][21][22][23][24][25][26] to achieve a wide detection range and high resolution. The rest of this paper is organized as follows. Section 2 presents the architecture and operation of the proposed all-digital MDLL, Section 3 shows the experimental results, and Section 4 presents the conclusion.

Proposed MDLL Architecture
Figure 2 shows a conceptual diagram of the proposed MDLL detecting the initial phase error (=∆t) using a TDC at the beginning of the operation, where TREF is the period of the reference clock (clkref), T1 is the period of the initial output clock (clkout), and T2 is the period of clkout after locking. Ideally, T2 = T1 + ∆t/N after locking, where N is the frequency multiplication factor.
Figure 3a shows the proposed all-digital MDLL architecture, which consists of an offset-free cyclic Vernier TDC, a lock detector (LD), a digital loop filter (DLF), a bang-bang phase detector (BBPD), a second-order delta-sigma modulator (DSM), three binary-to-thermometer decoders (coarse/fine/DSM), a digitally controlled multiplexed ring oscillator (MRO), a divide-by-16 frequency divider, a divide-by-2 frequency divider, and select logic. The MRO is based on pseudo-differential inverters with three types of varactor delay cells (63 coarse delay cells, 15 fine delay cells, 3 DSM cells). As shown in Figure 3b, the proposed MDLL has two operation modes: TDC mode and sequential tracking mode. When the proposed MDLL is enabled, the MRO starts at the maximum operating frequency.
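The period relation T2 = T1 + ∆t/N can be checked numerically. The sketch below is only an illustration: it assumes that, before lock, N free-running output periods plus the initial phase error span one reference period (N·T1 + ∆t = TREF), and it borrows the 75 MHz reference, N = 32, and the 2.9 ns initial phase error quoted elsewhere in this paper. The function and variable names are hypothetical.

```python
def locked_period(t1: float, delta_t: float, n: int) -> float:
    """Output period after lock, using T2 = T1 + delta_t / N."""
    return t1 + delta_t / n


F_REF = 75e6        # reference clock frequency in Hz
N = 32              # frequency multiplication factor
T_REF = 1.0 / F_REF
DELTA_T = 2.9e-9    # initial phase error in seconds (value from the simulated locking process)

# Assumption: before lock, N free-running periods plus the phase error
# fill exactly one reference period, i.e. N * T1 + delta_t = T_REF.
t1 = (T_REF - DELTA_T) / N
t2 = locked_period(t1, DELTA_T, N)

print(f"T1 = {t1 * 1e12:.1f} ps, T2 = {t2 * 1e12:.1f} ps")
print(f"locked output frequency = {1.0 / t2 / 1e9:.3f} GHz")
```

Under this assumption the locked period equals TREF/N, so the output frequency is N times the reference frequency, as expected for a locked MDLL.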
The proposed cyclic Vernier TDC measures the initial phase error (∆t) between the (N + 1)th rising edge of clkout and the rising edge of clkref (as shown in Figure 2) and converts this ∆t value to a 10-bit digital TDC code. The TDC code is then filtered by the DLF, which generates the 16-bit output, LF [15:0], that controls the MRO through the decoders.
As shown in Figure 3b, in the TDC mode, the TDC search can be performed repeatedly. Each TDC search requires three reference clock cycles and uses a correlated double sampling technique [13,27] to eliminate mismatch and offset issues. Ideally, the TDC mode can be completed with only one TDC search. However, the mismatch between the MRO and the TDC remains unless offset calibration is used, which leads to repeated TDC searches. Since a single TDC search takes three TREF cycles and the TDC mode is completed within a maximum of five TDC search iterations, the TDC mode takes at most 15 reference clock cycles. Subsequently, when phase lock is completed, the LD generates the lock signal, and the MDLL enters the sequential tracking mode. In this mode, the TDC is turned off, and both the BBPD and the DSM are enabled. Therefore, after fast phase locking, the MDLL operates in a closed loop and can track process, voltage, and temperature (PVT) and environment variations while simultaneously reducing power consumption and improving jitter performance. The DSM receives the 6-bit LSBs, LF [5:0], of the DLF and generates a 2-bit binary signal at a rate 16 times higher than the BBPD operating frequency. Then, the DSM decoder generates the dither [2:0] signals, which operate at high speed to control the DSM cells of the MRO and effectively reduce the dithering jitter of the digital MDLL [15].

Proposed Offset-Free Cyclic Vernier TDC
Figure 4 shows the block diagram of the proposed offset-free cyclic Vernier TDC. The proposed TDC consists of an EN generator, a reset generator, a slow ring oscillator (RO), a fast RO, an edge detector, two multiplexers, and a 10-bit up/down counter. The fast RO has a period Tfast that is slightly shorter than the period Tslow of the slow RO. As shown in the lower right of Figure 4, the TDC is used to measure the initial phase error (=∆t) between the (N + 1)th rising edge of clkout and the rising edge of clkref.
Figure 5 shows the detailed operation process of the proposed TDC mode with an example of N = 4. Each TDC search process takes three reference clock cycles. When the TDC is activated, the EN generator creates the ENslow and ENfast signals that enable the two ROs. The initial phase difference between these two signals is equal to tcyc + ∆t, where tcyc is the free-running period of clkout (=output of the MRO). The 10-bit counter counts the oscillation number (=m) of the OSCslow signal during the time from the rising edge of ENslow to the rising edge of ENfast. Instead of using two separate counters, the same 10-bit counter then counts the oscillation number (=n) of OSCfast during the time from the rising edge of ENfast to the rising edge of the detect signal: in this example, m = 3 and n = 4. The edge detector shown in Figure 4 compares the OSCslow and OSCfast signals and generates the detect signal when the rising edge of OSCfast leads the rising edge of OSCslow. The detect signal makes the reset signal go to logic high, making the outputs of the EN generator fall to logic low. Then, the detect signal falls to logic low again. As a result, in the first reference clock cycle, tcyc + ∆t is measured and can be determined as follows:

$$t_{cyc} + \Delta t = T_C + T_F = m \cdot T_{slow} + n \cdot (T_{slow} - T_{fast})$$

where TC is the coarse delay, TF is the fine delay, and Tslow − Tfast is the fine resolution of the proposed TDC. Similarly, the second reference clock cycle is used to measure tcyc. By subtracting the code value of the second cycle from the code value of the first cycle, the required initial phase error ∆t can be obtained. The third reference clock cycle is used to apply this subtracted TDC code to the DLF and the MRO; accordingly, the delay of the MRO is changed, and the MDLL approaches the coarse phase lock state.
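The coarse/fine counting described for the cyclic Vernier TDC can be mimicked with a small behavioral model. This is a minimal sketch rather than the circuit itself: the oscillator periods (400 ps and 394 ps) are hypothetical values chosen only so that their difference matches the 6 ps resolution reported for this design, time is handled in integer picoseconds, and the edge-counting convention is an assumption.

```python
def vernier_tdc(interval_ps: int, t_slow_ps: int = 400, t_fast_ps: int = 394):
    """Behavioral model of a cyclic Vernier conversion.

    Returns (m, n, estimate_ps), where m is the coarse count of slow-RO
    periods, n is the fine count of fast-RO periods, and the estimate is
    m * Tslow + n * (Tslow - Tfast).
    """
    resolution = t_slow_ps - t_fast_ps

    # Coarse phase: count slow-RO periods completed before the stop event.
    m = 0
    while (m + 1) * t_slow_ps <= interval_ps:
        m += 1

    # Fine phase: the fast RO starts at the stop event and gains
    # `resolution` on the slow RO every cycle until its edge leads.
    n = 0
    while interval_ps + n * t_fast_ps > (m + n) * t_slow_ps:
        n += 1

    return m, n, m * t_slow_ps + n * resolution


m, n, estimate = vernier_tdc(1000)
print(f"m = {m}, n = {n}, estimate = {estimate} ps")
```

In this model the estimate never undershoots the true interval and overshoots it by less than one resolution step, which is the usual quantization behavior of a Vernier converter.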
Electronics 2021, 10, x FOR PEER REVIEW 6 of 10
When the TDC measures the time difference, there can always be a time offset, ∆offset. This ∆offset is caused by analog nonidealities, such as signal path mismatch and device mismatch, and it increases the deterministic jitter of an MDLL. To overcome this problem, the proposed TDC mode adopts a correlated double-sampling technique [13,22] to eliminate mismatch and offset problems and improve jitter performance.
In Figure 5, what the TDC actually measures in the first sampling period is not tcyc + ∆t but tcyc + ∆t + ∆offset, which is a value including the time offset ∆offset. The value measured in the second sampling period is not tcyc but tcyc + ∆offset. Therefore, if the values of two consecutive measurement codes are subtracted from each other, the ∆offset can be removed, and the correct ∆t can be obtained. When the ∆t value becomes smaller than the resolution (=Tslow − Tfast = 6 ps in this design) of the Vernier TDC after up to a maximum of five TDC searches are performed, the lock detector generates the lock signal, and the MDLL enters the sequential tracking mode. Then, the TDC is disabled to reduce power consumption, and both the BBPD and the DSM are enabled.
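The offset cancellation by correlated double sampling can be illustrated with a simple quantizer model. This sketch assumes an ideal TDC that rounds up to the next 6 ps step and a constant offset that is identical in both samples. The free-running period of 326 ps and the offset values are hypothetical numbers used only for illustration.

```python
RES_PS = 6  # TDC resolution in ps (Tslow - Tfast in this design)


def tdc_code(interval_ps: int, offset_ps: int) -> int:
    """Ideal TDC code for an interval corrupted by a constant offset (rounds up)."""
    return -(-(interval_ps + offset_ps) // RES_PS)


def cds_phase_error(t_cyc_ps: int, delta_t_ps: int, offset_ps: int) -> int:
    """Phase error in ps recovered by subtracting two consecutive codes."""
    first = tdc_code(t_cyc_ps + delta_t_ps, offset_ps)   # tcyc + dt + offset
    second = tdc_code(t_cyc_ps, offset_ps)               # tcyc + offset
    return (first - second) * RES_PS


def single_sample_phase_error(t_cyc_ps: int, delta_t_ps: int, offset_ps: int) -> int:
    """Phase error in ps from one sample only, with a nominal tcyc removed."""
    return tdc_code(t_cyc_ps + delta_t_ps, offset_ps) * RES_PS - t_cyc_ps


for offset in (0, 17, 45, 90):
    with_cds = cds_phase_error(326, 2900, offset)
    without = single_sample_phase_error(326, 2900, offset)
    print(f"offset = {offset:2d} ps: CDS -> {with_cds} ps, single sample -> {without} ps")
```

With the subtraction, the recovered value stays within one resolution step of the true ∆t regardless of the offset, whereas the single-sample estimate carries the offset directly into the result.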
The lower six bits, LF [5:0], of the DLF output are used for the DSM. The DSM operates 16 times faster than the reference clock and generates a 2-bit signal for the DSM decoder. Then, the DSM decoder generates dither [2:0], which controls the three DSM cells at high frequency. This DSM-based dithering jitter reduction scheme greatly improves the deterministic jitter performance of the proposed MDLL with a large N value [15].
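To make the dithering path concrete, the sketch below models a second-order modulator driven by a 6-bit fractional word and a thermometer decoder for three cells. The paper specifies only a second-order DSM with a 2-bit output; the MASH 1-1 structure, the +1 level shift before decoding, and the bit ordering of the thermometer code are assumptions made for this illustration.

```python
def mash11(frac: int, cycles: int, bits: int = 6) -> list:
    """Second-order MASH 1-1 modulator (assumed structure).

    `frac` is the fractional input word (0 .. 2**bits - 1). Each output
    sample lies in -1..2, and the long-run average approaches frac / 2**bits.
    """
    mod = 1 << bits
    acc1 = acc2 = 0
    prev_c2 = 0
    out = []
    for _ in range(cycles):
        c1, acc1 = divmod(acc1 + frac, mod)   # first-stage carry, 0 or 1
        c2, acc2 = divmod(acc2 + acc1, mod)   # second-stage carry, 0 or 1
        out.append(c1 + c2 - prev_c2)         # noise-shaped combination
        prev_c2 = c2
    return out


def thermometer(level: int, width: int = 3) -> list:
    """Thermometer code with `level` ones, for driving `width` unit cells."""
    return [1 if i < level else 0 for i in range(width)]


samples = mash11(frac=20, cycles=640)
levels = [y + 1 for y in samples]             # shift -1..2 to 0..3 (2-bit word)
dither = [thermometer(lv) for lv in levels]   # one 3-bit pattern per DSM clock

print("average output:", sum(samples) / len(samples), "target:", 20 / 64)
```

Four output levels map naturally onto zero to three active cells, which is consistent with a 2-bit modulator output driving three DSM cells. The constant level shift would be absorbed by the loop in a real design.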

Experimental Results
The proposed MDLL has been implemented in a 65-nm CMOS process. Figure 6 shows the layout of the proposed MDLL core, where the active area is about 0.043 mm². When the 75 MHz input reference clock is multiplied by N = 32 to generate an output clock of 2.4 GHz, the power consumption is about 3.3 mW from a 1.0 V supply.
Figure 7 shows the simulated locking process of the proposed all-digital offset-free cyclic Vernier-based MDLL. When the MDLL is enabled and the TDC starts operating at 160 ns, the initial phase error (=∆t) is about 2.9 ns. Since each TDC search process takes three reference clock cycles, the 10-bit output value of the DLF, LF [15:6], is changed every three reference clock cycles. After five TDC search operations taking 15 reference clock cycles, the phase error becomes less than 2.3 ps, and the MDLL starts the sequential tracking mode. At this point, the MDLL is phase-locked, the TDC is turned off, and the BBPD and DSM are turned on to maintain the lock state.
Figure 8 shows the simulated jitter and reference spur performances of the proposed fast-lock all-digital MDLL at 2.4 GHz (N = 32). It achieves a root-mean-square (RMS) jitter of 0.82 ps and a peak-to-peak (p-p) jitter of only 4.0 ps. It also achieves a reference spur of −38.1 dBc.
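The repeated TDC searches in the locking process of Figure 7 can be pictured with a toy iteration. This is a sketch under stated assumptions, not a model of the fabricated circuit: each search is assumed to measure the remaining phase error with 6 ps quantization and to correct only a fraction `gain` of it because of MRO/TDC mismatch. The value gain = 0.77 is purely illustrative and does not come from the paper.

```python
def simulate_lock(initial_error_ps: float, gain: float,
                  res_ps: float = 6.0, max_searches: int = 5):
    """Iterate TDC searches until the residual error is below the TDC resolution."""
    error = initial_error_ps
    history = []
    for search in range(1, max_searches + 1):
        measured = res_ps * round(error / res_ps)  # quantized TDC reading
        error -= gain * measured                   # imperfect correction of the MRO
        history.append(error)
        if abs(error) < res_ps:                    # lock detector condition
            break
    return search, history


T_REF_NS = 1e9 / 75e6        # reference period in ns
CYCLES_PER_SEARCH = 3        # each TDC search takes three reference cycles

searches, residuals = simulate_lock(2900.0, gain=0.77)
lock_time_ns = searches * CYCLES_PER_SEARCH * T_REF_NS
print(f"searches = {searches}, final residual = {residuals[-1]:.2f} ps, "
      f"lock time = {lock_time_ns:.1f} ns")

ideal_searches, _ = simulate_lock(2900.0, gain=1.0)
print(f"with perfect matching: {ideal_searches} search")
```

With perfect matching a single search suffices in this model, while a gain error forces several searches. Five searches of three reference cycles each at 75 MHz correspond to 15 reference clock cycles, or 200 ns.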
As shown in Figure 9, with an intentionally injected 8.08 ps p-p input clock jitter noise, the proposed MDLL obtains a 17.46 ps p-p (RMS jitter = 2.58 ps) output clock jitter. This means that even when input noise is injected, the effective p-p jitter is only 9.38 ps (=17.46 ps − 8.08 ps). Table 1 compares the performance of the proposed MDLL with previous digital MDLLs. Among the digital MDLLs, the proposed MDLL has the fastest locking time of less than 15 reference clock cycles, which is suitable for use in energy proportional serial link applications.