Analysis and Modeling of Mueller–Muller Clock and Data Recovery Circuits

Abstract: In this paper, we propose an accurate linear model of the clock and data recovery circuit based on the Mueller–Muller phase detector (MMPD), referred to as the MM-CDR. The model addresses several critical points of the MM-CDR, including the linearization of the MMPD and the gain of the voter. Using our technique, the jitter between the recovered clock and the input data can be estimated with sub-picosecond accuracy, as demonstrated by simulation results of a 56 Gb/s quarter-rate MM-CDR implemented in 28 nm CMOS.


Introduction
With the rapid development of integrated circuits and the emergence of advanced process nodes, the data rate of high-speed serial interfaces (i.e., serializer/deserializer links) has grown exponentially [1]. At these rates, the classical Bang-Bang phase detector (BBPD) is no longer well suited to high-speed clock and data recovery (CDR). The Mueller-Muller phase detector (MMPD), in contrast, requires only one sample per symbol, which relaxes the tight timing constraints of the BBPD and alleviates the exponential growth of power consumption at high speed. As a result, Mueller-Muller baud-rate sampling is widely used in serial I/O design [2][3][4][5][6].
Far more modeling work has been done on the Bang-Bang CDR than on the Mueller-Muller baud-rate sampling technique. In [7], Jee et al. linearized the BBPD using the effects of metastability and input jitter, and combined this with large-signal behavioral characteristics to characterize the jitter transfer and jitter tolerance of the CDR. For jitter transfer, however, we are more concerned with the bandwidth of the CDR after locking (i.e., over a small phase range). In [8], Sonntag et al. described a linear model for a general digital CDR architecture. They linearized the BBPD using Gaussian jitter, derived its gain formula, and combined it with an analysis of the other blocks of the digital CDR loop to obtain a linear model of the CDR; in this model, the input jitter of the BBPD must be simulated accurately each time. The authors of [9] summarized the existing linear models of the BBPD and analyzed the effect of random jitter and finite loop delay on the loop characteristics. Nevertheless, previous work on MM-CDRs contains no linearization analysis and focuses instead on applying the Mueller-Muller baud-rate sampling technique. For example, Dokania et al. [10] introduced an unequalized MM-CDR that adjusts the phase lock position according to the extracted channel response and thereby adapts to the channel, while Choi et al. [11] presented a weight-adjusting sign-sign Mueller-Muller clock and data recovery circuit. These studies apply the MMPD without a systematic analysis of the MM-CDR, which can degrade the performance of the resulting design and lengthen the design cycle. To solve these problems, we derive, for the first time, the gain formula of the MMPD from the jitter characteristics of the input, then give a joint analysis method for the voter and the MMPD, and finally analyze the error jitter of the overall system. This modeling addresses the theoretical gap and provides a basis for MM-CDR design.
In this work, we propose a linear model for a Mueller-Muller CDR. The remainder of this article is organized as follows. Section 2 discusses the working principle of the MMPD and presents a linear gain model for it. Section 3 describes the gain model of the MMPD-based voter and builds a small-signal model of the MM-CDR to analyze the jitter tolerance and jitter transfer. Section 4 analyzes the error jitter between the recovered clock and the input data. Section 5 presents simulation results that validate the analysis of Section 4, and Section 6 concludes the paper.

Working Principle of MMPD
The traditional oversampling design samples the input data more than once per unit interval, which gives it a large frequency capture range and a fast capture speed. However, as the data rate increases, oversampling designs run into clocking and power-consumption bottlenecks, and Mueller-Muller baud-rate sampling becomes the preferred sampling method at high speed. Compared to a classical oversampling detector (e.g., the BBPD), the MMPD requires only half of the sampling clocks and half of the high-speed front-end sampling circuitry under the same conditions, which effectively reduces area and power consumption. Figure 1 shows the block diagram of the MMPD. The MMPD mainly consists of an error sampler, a data sampler, and a digital logic circuit [12]. The error sampler first compares the input data stream ($D_{in}$) with the threshold voltages $+V_{ref}$ and $-V_{ref}$ to generate the signals Errp and Errn according to the following rule: when $D_{in} > +V_{ref}$, Errp = 1, otherwise Errp = 0; when $D_{in} < -V_{ref}$, Errn = 1, otherwise Errn = 0. The signals Errp and Errn are retimed by the recovered clock (CK) and XORed to obtain the error information (Err). The data sampler determines the data bits ($D_n$) using a threshold voltage of 0. The digital logic circuit then combines three adjacent data bits with the error information to produce the early/late decisions. Compared to the BBPD, the MMPD relaxes the timing constraint but adds two voltage thresholds to obtain the amplitude information of the sampling points (namely Errp and Errn); it detects the phase of the input data by using this amplitude information together with the waveform.
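The sampler rule described above can be sketched in a few lines of Python. This is only a behavioral illustration of the comparisons stated in the text; the function name `mmpd_samples` and the example voltages are hypothetical, and retiming by CK and the subsequent early/late logic are not modeled.

```python
def mmpd_samples(d_in, v_ref):
    """Behavioral sketch of the error and data samplers for one sampled voltage.

    d_in  : sampled input voltage (hypothetical example values below)
    v_ref : magnitude of the error-sampler threshold (+V_ref / -V_ref)
    Returns (data_bit, errp, errn, err).
    """
    errp = 1 if d_in > v_ref else 0    # Errp = 1 when D_in > +V_ref
    errn = 1 if d_in < -v_ref else 0   # Errn = 1 when D_in < -V_ref
    err = errp ^ errn                  # Err is the exclusive OR of Errp and Errn
    data = 1 if d_in > 0 else 0        # data sampler slices at a threshold of 0
    return data, errp, errn, err


# Example with an assumed threshold of 0.5 (arbitrary units).
for v in (0.8, 0.2, -0.1, -0.7):
    print(v, mmpd_samples(v, 0.5))
```

In this sketch Err is 1 whenever the sample magnitude exceeds the threshold, regardless of polarity, which is the amplitude information the digital logic combines with the data bits.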
The MMPD provides updates only on data transitions and therefore requires waveform screening. Figure 2 shows the four waveforms that can update the phase information, each in both the early and the late situation. For example, during a 0-0-1 data transition, when the recovered clock lags the input data, the clock sampling instants are denoted by green arrows and, according to the rule above, the error information is ×10; when the recovered clock leads the input data, the clock sampling instants are denoted by blue arrows and the error information is 10×. The early/late sampling information can therefore be uniquely determined from the error information obtained during a specific data transition [13]. From these principles, it also follows that the MMPD does not have a unique locking point: the sampling phase is free to wander over a range that depends on the flatness of the received square-wave pulse. Table 1 provides the truth table for the four data-transition waveforms in Figure 2. With the Nth-order PRBS codes commonly used to emulate real data streams (PRBS7, PRBS23, PRBS31, etc.), the probability that one of the four data-transition waveforms appears within a sequence of length $2^N - 1$ bits is 0.5. In other words, the detection density of the MMPD is 0.5, the same as that of the BBPD. It is worth noting that different data encoding methods may have different detection densities.
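The detection density quoted above can be checked numerically on a short PRBS. The sketch below generates one period of PRBS7 with the standard polynomial $x^7 + x^6 + 1$ and counts three-bit windows. Since Figure 2 and Table 1 are not reproduced here, it assumes that the four updating waveforms are the three-bit patterns containing exactly one transition (0-0-1, 0-1-1, 1-1-0, 1-0-0); that pattern set, like the function names, is an assumption of this example.

```python
from collections import Counter


def prbs7(length=127, seed=0x7F):
    """One period of PRBS7 (x^7 + x^6 + 1) from a 7-bit Fibonacci LFSR."""
    state = seed & 0x7F
    bits = []
    for _ in range(length):
        new_bit = ((state >> 6) ^ (state >> 5)) & 1  # taps at stages 7 and 6
        state = ((state << 1) | new_bit) & 0x7F
        bits.append(new_bit)
    return bits


# Assumed update patterns: three-bit windows with exactly one transition.
UPDATE_PATTERNS = {(0, 0, 1), (0, 1, 1), (1, 1, 0), (1, 0, 0)}


def detection_density(bits):
    """Count cyclic three-bit windows that match an update pattern."""
    n = len(bits)
    windows = Counter(
        (bits[i], bits[(i + 1) % n], bits[(i + 2) % n]) for i in range(n)
    )
    hits = sum(windows[p] for p in UPDATE_PATTERNS)
    return hits, hits / n


hits, density = detection_density(prbs7())
print(hits, round(density, 4))
```

Under this assumption, 64 of the 127 windows in one PRBS7 period qualify, a density of about 0.504, which approaches 0.5 for longer sequences.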

Linear Gain of MMPD
As has been done for the BBPD in [7,8,14], the binary output characteristic of the MMPD is smoothed out by the jitter inherent in the input data and the recovered clock. First, we assume that the phase error (i.e., the phase difference between the input data and the recovered clock) consists of two terms: a deterministic phase error $u_{e,U}$ (e.g., deterministic ISI or sinusoidal jitter) and a random phase error $u_{e,G}$ (e.g., Gaussian white noise) [9]. That is:
$$u = \alpha\, u_{e,G} + \beta\, u_{e,U} \qquad (1)$$
where $u$ is the average output of the MMPD, and $\alpha$ and $\beta$ are the weights of the random phase error and the deterministic phase error, respectively. When the phase error between the input data and the recovered clock is dominated by the random error term, we model the input jitter with a Gaussian probability density function with a standard deviation of $\sigma$ and a mean value of 0.
Following the operation of the MMPD in the clock and data recovery circuit, we assign "1" to phase "late" and "−1" to phase "early", and then sum the positive and negative samples with weights given by the probabilities of their occurrence. Therefore, the average output of the MMPD is:
In Figure 3a, $\varphi$ indicates the average phase difference between the input data and the recovered clock, and $2\varphi_m$ is the drift bit width, namely the part of the input data whose voltage amplitude is greater than $+V_{ref}$ or less than $-V_{ref}$. According to the Gaussian distribution probability density function, we obtain
Let
Similarly, we can have
Substituting (4) and (5) into (2), we get
When the phase error between the input data and the recovered clock is dominated by the deterministic error term, we model the input jitter with a uniform probability density function with a standard deviation of $\sigma$ and a mean value of 0. From Figure 3b, we get
Since the variance of the uniform distribution can be expressed as
Substituting (8) into (7), then
Similarly, we can attain
Substituting (9) and (10) into (2), we get
Note that the detection density of the input data is not considered in (6) and (11). If the detection density of the input data is $\lambda$, the gain of the MMPD based on Gaussian jitter or uniform jitter is $\lambda u$. By analyzing these two extreme jitter distributions, we linearize the MMPD and bound its gain. If the CDR system has a clean channel, Gaussian jitter dominates and the gain of the MMPD is given by (6); if deterministic jitter dominates the system, the gain of the MMPD is given by (11). It can also be seen from (6) and (11) that the gain of the MMPD is inversely proportional to the input jitter: the higher the jitter, the lower the gain. This is a key consideration in the design of the MM-CDR system because, as Section 3 shows, the performance metrics of the CDR system vary significantly as the jitter changes.
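The probability-weighted sum described above can be illustrated with a short Python sketch. It assumes a simple dead-zone decision model that is not taken verbatim from the paper: the detector outputs +1 when the instantaneous phase error exceeds $+\varphi_m$, −1 when it is below $-\varphi_m$, and 0 otherwise. The function names and numeric values are hypothetical, and the resulting expressions are not claimed to be identical to (6) and (11); they only reproduce the qualitative behavior stated in the text, namely a gain that falls as the jitter grows.

```python
import math


def u_gaussian(phi, sigma, phi_m):
    """Average output P(late) - P(early) for Gaussian jitter (assumed dead-zone model)."""
    p_late = 0.5 * math.erfc((phi_m - phi) / (sigma * math.sqrt(2)))
    p_early = 0.5 * math.erfc((phi_m + phi) / (sigma * math.sqrt(2)))
    return p_late - p_early


def u_uniform(phi, sigma, phi_m):
    """Average output for uniform jitter with standard deviation sigma."""
    a = math.sqrt(3) * sigma  # half-width of a uniform distribution with std sigma

    def cdf(x):
        # CDF of jitter uniformly distributed on [-a, a]
        return min(1.0, max(0.0, (x + a) / (2 * a)))

    p_late = 1.0 - cdf(phi_m - phi)  # jitter pushes the phase error above +phi_m
    p_early = cdf(-phi_m - phi)      # jitter pushes the phase error below -phi_m
    return p_late - p_early


def small_signal_gain(u, sigma, phi_m, step=1e-4):
    """Numerical slope of u(phi) around phi = 0."""
    return (u(step, sigma, phi_m) - u(-step, sigma, phi_m)) / (2 * step)


# Hypothetical values in UI: drift bit width 2*phi_m = 0.05.
for sigma in (0.1, 0.2):
    print(sigma,
          round(small_signal_gain(u_gaussian, sigma, 0.025), 3),
          round(small_signal_gain(u_uniform, sigma, 0.025), 3))
```

Multiplying either slope by the detection density $\lambda$ gives the effective gain seen by the loop under this model.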
To verify the MMPD linear gain model, we add the corresponding jitter to the sampler clock, sweep the phase difference at the input, and count the numbers of early and late decisions to compute the average gain. Figure 4a,b show that the theoretical calculation agrees well with the simulation results. In addition, we find that for Gaussian jitter with a standard deviation of 0.1 UI and a drift bit width of 0.05 UI, the linear region of the MMPD is about $|\varphi| < 0.1$ UI; for uniform jitter with a standard deviation of 0.12 UI and a drift bit width of 0.05 UI, the linear region of the MMPD is about $|\varphi| < 0.15$ UI. When the phase difference at the input is increased further, the MMPD no longer has linear gain and enters the nonlinear region, as illustrated in Figure 4c,d.
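A verification procedure of this kind can be mimicked with a small Monte Carlo experiment. The sketch below is not the authors' circuit simulation: it assumes a behavioral dead-zone detector (output +1 above $+\varphi_m$, −1 below $-\varphi_m$, 0 in between), adds Gaussian jitter to a fixed phase offset, and averages the decisions. All names, the seed, and the numeric values are hypothetical.

```python
import random


def mmpd_decision(phase_error, phi_m):
    """Assumed behavioral detector: late (+1), early (-1), or no update (0)."""
    if phase_error > phi_m:
        return 1
    if phase_error < -phi_m:
        return -1
    return 0


def average_output(phi, sigma, phi_m, trials=100_000, seed=1):
    """Average detector output at phase offset phi with Gaussian jitter of std sigma."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        total += mmpd_decision(phi + rng.gauss(0.0, sigma), phi_m)
    return total / trials


def sweep(phis, sigma, phi_m, trials=20_000):
    """Sweep the input phase offset and return (phi, average output) pairs."""
    return [(phi, average_output(phi, sigma, phi_m, trials)) for phi in phis]


# Hypothetical sweep in UI: sigma = 0.1, drift bit width 2*phi_m = 0.05.
for phi, u in sweep([-0.3, -0.1, 0.0, 0.1, 0.3], sigma=0.1, phi_m=0.025):
    print(f"{phi:+.2f} {u:+.3f}")
```

Plotting the swept averages against the phase offset shows a roughly linear region around zero that flattens toward ±1 for large offsets, which is the kind of behavior the comparison in Figure 4 is meant to capture.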

Linear Voting Model
Majority voting is implemented in the CDR by summing the N parallel PD outputs and then taking the sign of the result. Compared to a boxcar filter, voting reduces the loop delay and lowers the output noise of the MMPD. Hence, digital CDRs generally adopt voting as the decimation operation.
The gain of the voter depends on the MMPD output and must be analyzed jointly with the MMPD. Taking a bank of four Mueller-Muller phase detectors followed by a voter as an example, Table 2 lists all input combinations and the resulting outputs. If the input jitter is Gaussian, the probabilities $P_0$, $P_{ea}$, and $P_{la}$ that the MMPD outputs 0, −1, and 1, respectively, are determined once the phase difference is fixed. By elementary probability, the probability that the voting output is "late" (i.e., 1) is
$$P_{vote}(late) = C_4^1 P_0^3 P_{la} + C_4^2 P_0^2 P_{la}^2 + C_4^2 C_2^1 P_{la}^2 P_0 P_{ea} + C_4^1 P_{la}^3 P_{ea} + C_4^1 P_{la}^3 P_0 + P_{la}^4.$$
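The closed-form expression for the four-detector vote can be cross-checked by brute-force enumeration. The sketch below assumes that the four detector outputs are independent and identically distributed and that the voter reports "late" when their sum is positive; the function names and the example probabilities are hypothetical.

```python
from itertools import product
from math import comb


def p_vote_late_closed_form(p_la, p_ea):
    """P_vote(late) for four detectors, written term by term as in the text."""
    p_0 = 1.0 - p_la - p_ea
    return (comb(4, 1) * p_0**3 * p_la
            + comb(4, 2) * p_0**2 * p_la**2
            + comb(4, 2) * comb(2, 1) * p_la**2 * p_0 * p_ea
            + comb(4, 1) * p_la**3 * p_ea
            + comb(4, 1) * p_la**3 * p_0
            + p_la**4)


def p_vote_late_enumerated(p_la, p_ea, n=4):
    """Sum the probability of every output combination whose total is positive."""
    prob = {1: p_la, -1: p_ea, 0: 1.0 - p_la - p_ea}
    total = 0.0
    for outcome in product((-1, 0, 1), repeat=n):
        if sum(outcome) > 0:  # majority says "late"
            p = 1.0
            for o in outcome:
                p *= prob[o]
            total += p
    return total


# Hypothetical per-detector probabilities at some fixed phase difference.
print(p_vote_late_closed_form(0.4, 0.2), p_vote_late_enumerated(0.4, 0.2))
```

The enumeration generalizes to other bank sizes by changing `n`, whereas the closed form is specific to four detectors.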
Finally, we obtain the joint gain of the MMPD and the voter:
To compare the impact of the voter and the boxcar filter on the loop and to extract the gain of the voter, we simulate the joint gain of the MMPD and the voter, shown by the red line in Figure 5, which can be expressed as $K_{MD} \times K_V$. We also simulate the joint gain of the MMPD and the boxcar filter, shown by the blue line in Figure 5, which can be expressed as $K_{MD} \times 1$. The ratio of these two gains, obtained from simulation, gives the gain of the voter. Figure 5a,b show that MMPD-based voting is strongly affected by the jitter and the drift bit width: the gain of the voter is around 0.54 when the jitter is much larger than the drift bit width, and around 0.78 when the drift bit width is close to the jitter. Therefore, accurate values of the jitter and the drift bit width are essential for analyzing and fully understanding the impact of voting.

Linearized Analysis of CDR System
Based on the linear model of the MMPD in Section 2.2 and the analysis of the digital CDR in [8], we can obtain the small-signal model of the MM-CDR. In this section, for the architecture in Figure 6, we first present the linearized model (see Figure 7) and then analyze its transfer function and jitter tolerance. Generally, the CDR must meet certain jitter performance specifications. Therefore, we first obtain the loop gain:
(15)
where $K_{MD}$ is the gain of the MMPD; $K_V$ is the gain of the voter; $K_p$ and $K_i$ are the gains of the proportional and integral paths from the output of the voter to the digital-to-phase converter (DPC); and $K_{DPC}$ is the gain of the DPC. These parameters are listed in Table 3. Then, the transfer function of the MM-CDR can be calculated by the following well-known equation:
The jitter tolerance is the maximum amount of input jitter that the clock and data recovery circuit can tolerate at a given bit error rate (BER) [15]. For a target BER of $10^{-10}$, the jitter tolerance is computed by (17):
where $\sigma_{ER}$ is the error jitter between the recovered clock and the data, and $T$ is the data period. Figures 8 and 9 plot the jitter transfer and the jitter tolerance, respectively. As the jitter increases, the bandwidth of the MM-CDR decreases and the jitter tolerance becomes smaller.

Jitter Analysis of MM-CDR
As discussed in Section 3.2 and [16], the MM-CDR loop characteristics vary with the error jitter $\sigma_{ER}$. An accurate MM-CDR model therefore requires the exact error jitter. One option is to simulate the entire system accurately once, measure the error jitter, and then build the linear model; however, this greatly lengthens the design cycle. If we instead estimate the error jitter in advance from the circuit characteristics, we obtain accurate CDR performance metrics and also shorten the design cycle. Here, we focus on the two main contributors to the error jitter: input data jitter and the quantization noise of the phase interpolator (PI) [17].
In this work, we assume a clean channel, so that the noise of the input data is generated by the oscillator at the transmitter. For simplicity, we consider only the dominant noise of the oscillator (i.e., thermal noise). Its noise power spectral density (PSD) as a function of the offset frequency $f$ is given by [18],
where $n$ is the coefficient of the oscillator thermal noise. The period jitter of the input data, $\sigma_{DAT}$, can be obtained by integrating the phase noise shaped by the transfer function $1 - z^{-1}$ [19].
Substituting (19) into (18), the noise of the input data can be written as
According to [20], a larger damping factor $\zeta$ is usually chosen during design to reduce the loop bandwidth and the jitter peaking, so the transfer function can be approximated by that of a one-pole system. Using time instead of phase, the error transfer function is
where $\omega_{-3dB}$ is the bandwidth of the MM-CDR, $T_{PI}$ is the period of the phase interpolator's input clock, and $T_{DLF}$ is the period of the digital filter. Combining (20) and (21), the error jitter contributed by the input data jitter is
According to Section 2.2, denoting by $\sigma_{ER,in}$ the rms value of the jitter out of the MMPD, we can write:
Note that we take the error jitter $\sigma_{ER,in}$ caused by the input data noise as the total error jitter here; Section 5 justifies this choice. Combining (21)-(23), we have
Moreover, the quantization noise of the PI is not negligible, and we consider the case in which the PI oscillates between two phases. Assuming that the data phase has a delay of "x" with respect to the recovered clock, each rotation of the PI shifts the phase of the recovered clock by $\Delta = T_{PI}/N_{PI}$ relative to the previous moment, where $\Delta$ is the minimum step of the PI. Therefore, for random jitter, where "x" varies continuously, the average error jitter is
In other words, the absolute jitter of the CDR due to the quantization noise of the PI can be expressed as
The total error jitter of the system is obtained by combining the individual random-jitter (RJ) contributions through root-sum-square (RSS) summation,
which gives an estimate of the error jitter and will be verified by the simulation results in Section 5.
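The two relations stated explicitly in this section, the minimum PI step $\Delta = T_{PI}/N_{PI}$ and the RSS combination of independent jitter contributions, can be sketched as follows. The helper names and all numeric values are hypothetical illustrations; in particular, the two contribution values are placeholders and are not computed from the paper's expressions for $\sigma_{ER,in}$ or the PI quantization jitter.

```python
import math


def pi_step(t_pi, n_pi):
    """Minimum PI step: Delta = T_PI / N_PI."""
    return t_pi / n_pi


def rss(*sigmas):
    """Root-sum-square combination of independent rms jitter contributions."""
    return math.sqrt(sum(s * s for s in sigmas))


# Hypothetical numbers in picoseconds, for illustration only.
t_pi = 71.4          # assumed PI input clock period
n_pi = 64            # assumed number of PI phases
sigma_er_in = 0.60   # placeholder for the input-jitter contribution
sigma_er_pi = 0.35   # placeholder for the PI quantization contribution

print(round(pi_step(t_pi, n_pi), 3))
print(round(rss(sigma_er_in, sigma_er_pi), 3))
```

Because the contributions add in quadrature, the larger one dominates the total: halving the smaller term in this example changes the result far less than halving the larger one.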

Results
The simulations in Figure 10 are obtained using a 56 Gb/s quarter-rate Mueller-Muller clock and data recovery circuit in 28 nm CMOS, with the structure shown in Figure 6. We use a 56 GHz clock with period jitter to generate a PRBS31 pattern as the input data. We vary the input data jitter, the number of phases of the PI, and the sub-resolution of the phase integrator, and measure the error jitter between the recovered clock and the input data. The results are shown in Figure 10, where the black dashed, green solid, and red dashed lines correspond to (24), (26), and (27), respectively. The blue line with circles represents the simulated error jitter of the CDR operating on 56 Gb/s serial data. Figure 10a-c show the dependence of the results on the input data jitter, the number of phases of the PI, and the sub-resolution of the phase integrator, respectively. Figure 10a shows that the error jitter caused by the input jitter is the dominant factor in this case, and that the total error jitter increases as $\sigma_{DAT}$ increases.
According to Figure 10b, the total error jitter follows the quantization noise of the PI when $N_{PI}$ is low, whereas $\sigma_{ER,tot}$ follows $\sigma_{ER,in}$ when $N_{PI}$ is high. The reason is that the MM-CDR bandwidth is large at low $N_{PI}$ and (21) is high-pass, so the input jitter has little effect on the total error jitter and the quantization noise of the PI dominates. This is confirmed by the simulation results in Figure 10b.
In Figure 10c, although the contribution of the input jitter to the total error jitter increases as $N_{SR}$ increases, it remains small compared to the quantization noise of the PI, and the variation of $\sigma_{ER,in}$ does not increase the total error jitter. When $\sigma_{ER,in}$ is the dominant factor, the total error jitter approaches the error jitter caused by the input jitter, so the estimate using (23) is reasonable.

Conclusions
In this paper, we present a linear model of the MM-CDR. First, we derive a model for the MMPD gain, given by Equation (6) when Gaussian jitter dominates and by Equation (11) when uniform jitter dominates. To complete the linear model, we introduce an estimation model for the error jitter, which achieves sub-picosecond accuracy in the high-speed case. It is worth mentioning that the gain of the MMPD-based voter varies greatly with the error jitter and the drift bit width, so it needs to be simulated accurately.