Article

A 1.55-to-32-Gb/s Four-Lane Transmitter with 3-Tap Feed Forward Equalizer and Shared PLL in 28-nm CMOS

1 Institute of Microelectronics, Chinese Academy of Sciences, Beijing 100029, China
2 State-Key Laboratory of Analog and Mixed-Signal VLSI/IME and the DECE/Faculty of Science and Technology, University of Macau, Macau 999078, China
* Authors to whom correspondence should be addressed.
Electronics 2021, 10(16), 1873; https://doi.org/10.3390/electronics10161873
Submission received: 12 June 2021 / Revised: 26 July 2021 / Accepted: 30 July 2021 / Published: 4 August 2021
(This article belongs to the Section Circuit and Signal Processing)

Abstract

This paper presents a fully integrated physical-layer (PHY) transmitter (TX) that suits multiple industrial protocols and is compatible with different protocol versions. To cover a wide operating range, an LC-based phase-locked loop (PLL) with dual voltage-controlled oscillators (VCOs) is integrated to provide a low-jitter clock. Each lane adopts a configurable serialization scheme so that the data rate can be adjusted flexibly. To achieve high-speed data transmission, several bandwidth-extension techniques are introduced, and an optimized output driver with a 3-tap feed-forward equalizer (FFE) is proposed to accomplish high-quality data transmission and equalization. The TX prototype was fabricated in a 28-nm CMOS process. A single-lane TX occupies an active area of only 0.048 mm², and the shared PLL and clock distribution circuits occupy 0.54 mm². The PLL supports a tuning range of 6.2 to 16 GHz. Each lane's data rate ranges from 1.55 to 32 Gb/s, the energy efficiency is 1.89 pJ/bit/lane at 32 Gb/s, and the equalization can be tuned up to 10 dB.

1. Introduction

As data centers rapidly evolve to accommodate higher information transfer rates, high-speed serial interfaces have become the main candidate for data transmission [1,2,3,4,5,6,7]. Accordingly, multiple industrial standards and protocols have been introduced, such as JESD204B, Thunderbolt, Peripheral Component Interconnect Express, and Universal Serial Bus [8,9,10,11]. The bandwidth requirements of these protocols keep increasing, and the shrinking unit interval (UI) becomes a bottleneck in high-speed transmitter (TX) design because it makes the timing budget extremely tight. Hence, the TX must support a wide range of data rates together with appropriate equalization, which is the most challenging issue in this design.
Despite the availability of high-speed CMOS circuits, high-speed data transmission is still severely restricted by the wireline channels. The bandwidth-limited channel attenuates the high-frequency components of the transmitted data because of the skin effect and dielectric loss [12,13,14,15], resulting in inter-symbol interference. A feed-forward equalizer (FFE) is usually used in the TX to compensate for the channel loss. When an FFE is embedded in a current-mode logic (CML) output driver, the output impedance and the swing can be adjusted by the termination resistor and the bias current, respectively. Moreover, a CML-based output driver can fully exploit the potential of the process, as its compact NMOS driving topology naturally features a fast current switching speed and small parasitic capacitance [16,17]. In addition, as the data rate increases, a low-jitter clock is needed to meet the timing budget, which motivates a phase-locked loop (PLL) based on an LC voltage-controlled oscillator (VCO) [18,19].
This paper reports a 1.55-to-32-Gb/s four-lane TX fabricated in 28-nm CMOS technology. To support a wide operating range and multiple protocols, a wide-operating-range PLL is designed to generate the multi-frequency differential clocks, and multi-rate TX lanes are proposed in which the frequency of the multi-phase clocks can be configured according to the expected data rate. The proposed TX prototype also supports high-speed data transmission, so several circuit techniques are adopted to extend the circuit bandwidth and relieve the severe timing constraints. An optimized combiner with a 3-tap FFE is proposed to compensate for the high-frequency channel loss.
This paper is organized as follows: Section 2 presents the top architecture of the proposed four-lane TX. Sections 3 and 4 present the design details and critical considerations of the TX lane and the on-chip PLL, respectively. Measurement results are presented in Section 5, and the conclusion is drawn in Section 6.

2. Top Architecture

Figure 1 shows the overall architecture of the proposed four-lane TX, which consists of a built-in self-test (BIST) module, four TX lanes with the same structure, a shared PLL, clock distribution circuits (CDCs), and a clock buffer. The BIST module integrates pseudo-random binary sequence (PRBS) generation, data coding, and register control, and it sends the parallel data to the TX lanes on the feedback clock. A multi-rate lane is essential for a TX with a wide data-rate range; it is realized through a configurable signal mode and adjustable data serialization. The high-speed data is driven by an optimized CML-based buffer with the 3-tap FFE. Additionally, an LC-PLL offers excellent jitter performance and reduces high-frequency clock routing. A single LC-VCO-based PLL can be shared by 4 to 8 lanes as a central multiplying unit [20]. Thus, the proposed TX adopts a shared LC-VCO-based PLL for the local half-rate clock generation. In this way, the power consumption and chip area of the PLL are amortized over the TX lanes, improving the overall energy efficiency.
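The PRBS generation performed by the BIST module can be illustrated with a short linear-feedback shift register model. The paper does not state which PRBS order is used, so the sketch below assumes PRBS7 (polynomial x^7 + x^6 + 1) purely for illustration; the function name `prbs7` and the seed value are hypothetical.

```python
def prbs7(seed=0x7F, nbits=127):
    """Generate nbits of a PRBS7 pattern (assumed order; x^7 + x^6 + 1)."""
    state = seed & 0x7F
    if state == 0:
        raise ValueError("seed must be non-zero, otherwise the LFSR locks up")
    out = []
    for _ in range(nbits):
        # XOR of the two most significant register bits forms the feedback bit
        new_bit = ((state >> 6) ^ (state >> 5)) & 1
        out.append(new_bit)
        # Shift left and insert the feedback bit at the LSB
        state = ((state << 1) | new_bit) & 0x7F
    return out


pattern = prbs7()
# A maximal-length 7-bit sequence repeats every 127 bits
two_periods = prbs7(nbits=254)
```

A receiver-side checker can regenerate the same pattern from a known seed and compare it bit by bit with the received data.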

3. Multi-Rate TX Lane Implementation

Figure 2 depicts the implementation of the single-lane TX. Each lane employs a half-rate architecture to relax the timing constraint of the critical path, and it is composed of a clock path and a data path. In the clock path, the input differential clocks generated by the shared PLL are first converted to rail-to-rail clocks by the CML-to-CMOS circuits. They are then divided to produce the proper clocks for the serialization trees and latch arrays. The frequency division factor is configured according to the desired transmission data rate. The clock dividers and data multiplexers (MUXs) use CMOS logic to reduce power consumption. In the data path, both the odd and even bits of the 40-bit parallel data are serialized by the MUX trees and retimed by an interleaved latch array to generate differential data sequences, i.e., DO,PRE/MAIN/PST and DE,PRE/MAIN/PST. The high-speed 2:1 MUXs generate the full-rate data streams applied to the final CML-based output driver. The 3-tap FFE is embedded in the output driver to improve the output eye diagram.
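The half-rate data path can be sketched behaviorally: the parallel word is split into even and odd streams, and the final 2:1 MUX interleaves them on the two phases of the half-rate clock. This is a minimal model, assuming that the even bit is selected on the high clock phase and the odd bit on the low phase; the actual phase assignment and the intermediate MUX-tree stages are not modeled, and the function names are hypothetical.

```python
def split_even_odd(word_bits):
    """Split a parallel word into even-indexed and odd-indexed bit streams."""
    return word_bits[0::2], word_bits[1::2]


def half_rate_mux(even, odd):
    """Final 2:1 MUX: one even bit and one odd bit per half-rate clock cycle."""
    out = []
    for d_even, d_odd in zip(even, odd):
        out.append(d_even)  # assumed: selected while the clock is high
        out.append(d_odd)   # assumed: selected while the clock is low
    return out


word = [(i * 7) % 3 == 0 for i in range(40)]   # arbitrary 40-bit test word
even, odd = split_even_odd(word)
serial = half_rate_mux(even, odd)
```

Because each clock cycle carries two bits, the clock only needs to run at half the data rate, which is what relaxes the timing of the critical path.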
The proposed single lane can be configured for multiple data rates through the serialization process at the front end. The BIST module transfers the encoded parallel data to the PHY layer and decides the effective data bits. In the TX lanes, the dividers and the MUX tree are configured for the corresponding data rate, so that the half-rate DE/DO streams are consistent with the protocol requirements. In this way, the data rate can be changed flexibly. For example, if the required data rate is DR = 16 Gb/s and the local clock generated by the PLL is locked to fCK = 8 GHz, then the dividers are set to DIV1 and DIV5, respectively; hence, the final TXOUT is at 16 Gb/s, as desired.
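The relationship between the data rate, the half-rate lane clock, and the PLL frequency can be sketched as a small planning routine. The PLL range of 6.2 to 16 GHz is taken from the paper; the set of clock division ratios (1, 2, 4, 8) and the function name `plan_clock` are assumptions made only for this illustration and do not describe the actual divider settings of the chip.

```python
PLL_MIN_GHZ, PLL_MAX_GHZ = 6.2, 16.0   # PLL range reported in the paper
EPS = 1e-9                             # tolerance for floating-point comparison


def plan_clock(data_rate_gbps, dividers=(1, 2, 4, 8)):
    """Return (pll_freq_ghz, divider) such that pll_freq / divider = data_rate / 2.

    The divider list is hypothetical; the smallest usable ratio is preferred.
    """
    lane_clk_ghz = data_rate_gbps / 2.0          # half-rate architecture
    for div in dividers:
        pll_freq = lane_clk_ghz * div
        if PLL_MIN_GHZ - EPS <= pll_freq <= PLL_MAX_GHZ + EPS:
            return pll_freq, div
    raise ValueError("data rate not reachable with the assumed dividers")


f16, d16 = plan_clock(16.0)     # the example in the text: 8 GHz, no division
f32, d32 = plan_clock(32.0)
f155, d155 = plan_clock(1.55)
```

Under these assumptions, the 16-Gb/s case reproduces the 8-GHz clock of the example above, and the two ends of the data-rate range map onto the two ends of the PLL range.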

3.1. Multi-Rate Lane Timing

The most critical timing constraint appears at the final 2:1 MUX, and it becomes more severe because the constraint changes with the data rate. Hence, a latch array and three 2:1 MUXs are employed to satisfy the changing timing, as shown in Figure 3a. For high-speed data, a CML-based MUX can easily incorporate an active peaking technique in the pre-driver [21] or charge enhancement techniques [22] to extend the bandwidth, but it consumes considerable power. A CMOS-based MUX is employed in this design to minimize the transmission delay in the critical timing path and to reduce the power consumption and area overhead.
Figure 3b illustrates the timing diagram of the latch array and MUX. The latch array guarantees the phase relationship between the clock and data paths and then outputs the complementary data streams with a 1-UI timing offset for the FFE combiner. Taking the pre-tap MUX as an example, the differential half-rate clock works as the selection signal, and the logic gates accomplish the data serialization. The negative-half and positive-half sides have the same structure and timing constraint. In this way, the phase difference between the CK rising edge and the data transition edge is fixed to Tsetup and is not influenced by changes in the UI. An appropriate Tsetup is set through post-layout simulation, considering the PVT variations. The maximum and minimum Tsetup are 12 ps and 5 ps, respectively, under all the PVT variations in a 28-nm CMOS process.
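To put the reported Tsetup values in context, the short calculation below expresses them as a fraction of the unit interval at a given data rate. It uses only the 5-ps and 12-ps figures and the data rates from the paper; the helper names are hypothetical.

```python
def ui_ps(data_rate_gbps):
    """Unit interval in picoseconds for a given data rate in Gb/s."""
    return 1000.0 / data_rate_gbps


def setup_fraction(data_rate_gbps, t_setup_ps):
    """Tsetup expressed as a fraction of one UI."""
    return t_setup_ps / ui_ps(data_rate_gbps)


ui_32 = ui_ps(32.0)                      # 31.25 ps at 32 Gb/s
worst_32 = setup_fraction(32.0, 12.0)    # largest Tsetup over PVT
best_32 = setup_fraction(32.0, 5.0)      # smallest Tsetup over PVT
worst_155 = setup_fraction(1.55, 12.0)   # same Tsetup at the lowest data rate
```

At 32 Gb/s the fixed Tsetup spans a noticeable part of the 31.25-ps UI, whereas at 1.55 Gb/s it is a small fraction of the UI, which illustrates why the constraint is tightest at the highest data rate.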

3.2. The High-Speed CML-Based Output Driver

The proposed TX must support both a wide data-rate range and high-speed data transmission. Thus, the most critical circuit in the signal path is the output driver, which requires both sufficient bandwidth and reasonable gain. The conventional differential CML topology utilizes a tail current source, and the input devices need to be sized sufficiently large to keep the tail current source in the saturation region. Hence, as shown in Figure 4a, a tailless CML-based output driver is proposed to reduce the size of the NMOS input devices and their parasitic capacitance. It is worth mentioning that resistors in parallel with the input transistors are adopted to keep the cascode transistors in the saturation region and to set a low drain-source voltage (around 0.25 V) when the input transistors are turned off. Here, the values of the parallel resistors in the main/post/pre-tap slices are 4/8/12 kΩ, and the total current of the parallel resistors is around 110 µA, which is negligible compared with the current consumption of the output driver.
Another primary function of the output driver is to combine the three data sequences to implement signal equalization. The 3-tap FFE is a finite impulse response filter embedded in the proposed output driver. The pre-tap and post-tap data streams are added by connecting CML-based driver slices in parallel. They have the opposite polarity to the main-tap signal to realize the addition and subtraction operations. The equalization levels are determined by the ratio of the tail currents of the CML drivers, which are tuned by the bias generator shown in Figure 4b. A DAC is used to set the TX output swing. The shunt currents of the pre-tap and post-tap are adjusted by the weight control signals (WPRE and WPST), and an amplifier is adopted to generate the bias voltage while ensuring a sufficient phase margin. The nominal value of the load resistor RT is 50 Ω for impedance matching; to account for PVT variations, RT can be adjusted from 39 Ω to 75 Ω. In addition, in AC-coupling mode, the driver swing is compressed because the common-mode voltage of the output signal is pulled down. Hence, a parallel current source is added to provide an appropriate bias and to increase the driver output level at higher data rates.
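The behavior of the 3-tap FFE as a finite impulse response filter can be sketched as follows. The model assumes normalized tap weights and ideal NRZ symbols of ±1, with the pre-tap and post-tap subtracted from the main tap as described above. The specific weights (0.05, 0.75, 0.2) are illustrative values, not the settings used on the chip, and the function names are hypothetical.

```python
import math


def ffe_output(bits, c_pre, c_main, c_post):
    """3-tap FIR: y[n] = -c_pre*x[n+1] + c_main*x[n] - c_post*x[n-1]."""
    x = [1.0 if b else -1.0 for b in bits]      # ideal NRZ levels
    y = []
    for n in range(1, len(x) - 1):              # skip edges that lack a neighbor
        y.append(-c_pre * x[n + 1] + c_main * x[n] - c_post * x[n - 1])
    return y


def deemphasis_db(c_pre, c_main, c_post):
    """Low-frequency level relative to the peak (transition) level, in dB."""
    low = c_main - c_pre - c_post               # long run of identical bits
    peak = c_main + c_pre + c_post              # alternating 1010... pattern
    return 20.0 * math.log10(low / peak)


C_PRE, C_MAIN, C_POST = 0.05, 0.75, 0.20        # assumed example weights
steady = ffe_output([1, 1, 1, 1, 1], C_PRE, C_MAIN, C_POST)
toggling = ffe_output([1, 0, 1, 0, 1], C_PRE, C_MAIN, C_POST)
eq_db = deemphasis_db(C_PRE, C_MAIN, C_POST)
```

With these example weights, a long run of identical bits is driven at half the amplitude of an alternating pattern, i.e., about 6 dB of de-emphasis; increasing the pre-tap and post-tap weights relative to the main tap raises the equalization level.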

4. Wide-Operating-Range PLL

As shown in Figure 5a, the designed PLL is composed of a phase-frequency detector (PFD), a charge pump (CP), a second-order loop filter (LF), two parallel VCOs, and frequency dividers. CML-based buffers transfer the half-rate clocks to each lane and to the output PADs for PLL performance measurement. Typically, the reference clock (CKREF) operates from tens to hundreds of MHz. The PFD detects both the phase and frequency errors between CKREF and the divided high-speed clock (CKDIV). When CKDIV leads CKREF, the rising edge of CKDIV drives DN high, which makes the CP discharge the loop filter to ground. Hence, the VCO control voltage (VCTRL) decreases and the oscillation frequency drops. The rising edge of the subsequent CKREF then drives UP high and generates a reset signal at the same time. The opposite process occurs when CKDIV lags behind CKREF.
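The PFD operation described above can be sketched as an event-driven model of a tri-state detector. The model assumes ideal flip-flops and a zero-delay reset, which a real PFD does not have; the function name and the edge times are hypothetical. A negative result means DN dominates (the CP discharges and VCTRL falls), and a positive result means UP dominates.

```python
def pfd_net_pulse(ref_edges, div_edges):
    """Return the net (UP minus DN) high time for lists of rising-edge times."""
    events = sorted([(t, "ref") for t in ref_edges] +
                    [(t, "div") for t in div_edges])
    up = dn = False
    net = 0.0
    last_t = 0.0
    for t, src in events:
        # Accumulate the time spent in the state held since the last event
        net += (t - last_t) * (int(up) - int(dn))
        last_t = t
        if src == "ref":
            up = True        # CKREF rising edge sets UP
        else:
            dn = True        # CKDIV rising edge sets DN
        if up and dn:        # both set: the reset clears both (ideal, no delay)
            up = dn = False
    return net


# CKDIV leads CKREF by 0.1 time units in every cycle
div_leads = pfd_net_pulse(ref_edges=[1.0, 2.0, 3.0], div_edges=[0.9, 1.9, 2.9])
# CKDIV lags CKREF by 0.1 time units in every cycle
div_lags = pfd_net_pulse(ref_edges=[1.0, 2.0, 3.0], div_edges=[1.1, 2.1, 3.1])
```

In the leading case the accumulated pulse is negative, matching the discharge behavior in the text, and in the lagging case it is positive by the same amount.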
The presented TX supports a wide data-rate range; hence, several techniques are adopted to extend the PLL operating range. The dual VCOs, combined with switch-controlled capacitors, cover the required operating range coarsely, and the CP supports a wide VCTRL range to fine-tune the clock frequency. An LC-VCO is superior to a ring VCO for multi-GHz serial links in terms of noise characteristics such as phase noise and clock jitter. However, its limited tuning range remains a challenge. A single LC-VCO centered at frequency f1 can cover only its own tuning range. To support a wider range of data rates, an additional LC-VCO centered at f2 is adopted. Therefore, multiple standards can be supported at the cost of an acceptable area overhead. In addition, the switch-controlled capacitor array shown in Figure 5b is integrated into each VCO to tune the VCO operating range.

5. Measurement Results

The proposed four-lane TX was realized in a 28-nm CMOS process. The chip micrograph and block description are shown in Figure 6. The total chip size is 2.97 mm × 1.08 mm, dominated by the input/output testing PADs and the on-chip PLL. A single-lane TX occupies an active area of only 0.048 mm², and the shared PLL and CDCs occupy a core area of 0.54 mm². The prototype chip was wire-bonded on the printed circuit board for all measurements. The four lanes shared a 0.9-V supply for the core circuits and a 1.2-V supply for the bias generator and output driver; the corresponding power consumptions were 26.5 mW and 34.1 mW, respectively. The PLL and shared circuits operated from a 0.9-V and a 1.8-V supply, and the corresponding power consumptions were 63.2 mW and 8.4 mW, respectively. Note that the independent power supplies help suppress the output jitter effectively. Therefore, the TX dissipates 60.6 mW per lane at a 32-Gb/s data rate, corresponding to an energy efficiency of 1.89 pJ/bit, and the power consumption of the PLL is 71.6 mW.
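The reported energy efficiency follows directly from the measured power figures, as the short check below shows. It uses only numbers stated in this section; the helper name is hypothetical.

```python
def energy_pj_per_bit(power_mw, data_rate_gbps):
    """Energy per bit: mW divided by Gb/s gives pJ/bit."""
    return power_mw / data_rate_gbps


lane_power_mw = 26.5 + 34.1   # 0.9-V core circuits + 1.2-V bias generator and driver
pll_power_mw = 63.2 + 8.4     # 0.9-V and 1.8-V supplies of the PLL and shared circuits
lane_efficiency = energy_pj_per_bit(lane_power_mw, 32.0)
```

The per-lane figure excludes the shared PLL, whose power is reported separately.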
The measurement environments are shown in Figure 7. The TX chip was wire-bonded to the demo PCB, the SPI control signal was connected to the FPGA design kit (VCU118) for register configuration, and the input reference clock for the PLL was provided by an analog signal generator (Keysight N5173). The PLL performance was measured with a signal analyzer (Agilent Technologies N9030A). The TX lanes were measured with both the FPGA environment and an oscilloscope (Teledyne LeCroy SDA MCM-ZI-A) to characterize the overall performance.
The PLL was measured through the clock buffers in the proposed TX prototype. Figure 8a gives the phase noise and spur performance at 10 GHz, where the phase noise is around −92.6 dBc at a 1-MHz offset, and the spur is better than −56 dBc. Figure 8b shows the tuning range of the designed PLL, where VCO1 supports an operation range of 6.2 to 10.7 GHz and VCO2 supports an operation range of 10.2 to 17.1 GHz. Figure 8c further shows a group of the measured eye diagrams. The designed PLL with dual VCOs can cover a wide frequency range from 6.2 to 16 GHz with a low output jitter.
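Because the measured ranges of the two VCOs overlap between 10.2 and 10.7 GHz, a target frequency in that region can be served by either oscillator. The sketch below uses the measured ranges from this section and picks the VCO that leaves the largest margin to its band edges. This selection policy and the function name are assumptions for illustration; the paper does not describe how the VCO is chosen.

```python
VCO_RANGES_GHZ = {"VCO1": (6.2, 10.7), "VCO2": (10.2, 17.1)}  # measured ranges


def select_vco(freq_ghz):
    """Pick the VCO covering freq_ghz with the largest margin to its band edges."""
    best = None
    for name, (lo, hi) in VCO_RANGES_GHZ.items():
        if lo <= freq_ghz <= hi:
            margin = min(freq_ghz - lo, hi - freq_ghz)
            if best is None or margin > best[1]:
                best = (name, margin)
    if best is None:
        raise ValueError("frequency outside both VCO ranges")
    return best[0]


choice_8 = select_vco(8.0)      # only VCO1 covers 8 GHz
choice_16 = select_vco(16.0)    # only VCO2 covers 16 GHz
choice_103 = select_vco(10.3)   # overlap region, closer to the VCO2 lower edge
choice_105 = select_vco(10.5)   # overlap region, closer to the VCO1 upper edge
```

The overlap ensures that no frequency gap opens between the two bands when the ranges shift with process, voltage, and temperature.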
In measurement environment 1, the FPGA design kit receives the four-lane differential data signals and checks the TX data against the specified PRBS patterns. All four lanes were checked; the received data were correct, and the BER was below 10⁻¹⁰ at data rates up to 25 Gb/s, which shows that the TX data were correctly serialized and that the delay mismatch between lanes does not affect the chip function. The oscilloscope was adopted in measurement environment 2 to observe the eye diagram and to confirm the high-frequency performance of the proposed TX chip. Figure 9 summarizes the measured eye diagrams of TX lane-0; the TX achieves an eye height of >0.9 Vppd and an eye width of >0.72 UI at different data rates, such as 1.55, 10, 28, and 32 Gb/s. As shown in Figure 10, the four TX lanes achieve consistent performance. Figure 11a displays the measured S-parameter curves of the 0.5-m, 2.2-m, and 3.2-m paired cables. Figure 11b shows the eye diagrams before and after applying the equalization at a data rate of 32 Gb/s. It can be observed that the designed 3-tap FFE improves the eye opening for all cable channels with −4.1/−6.6/−9.7-dB equalization, respectively.
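As a rough guide to what a BER bound of 10⁻¹⁰ implies for test length, the standard zero-error confidence relation N = −ln(1 − CL)/BER gives the number of error-free bits needed to claim the bound at confidence level CL. The 95% confidence level is an assumption chosen for illustration; the paper does not state the confidence level or the duration of the BER test, and the function names are hypothetical.

```python
import math


def bits_for_confidence(ber_target, confidence=0.95):
    """Error-free bits needed to claim BER < ber_target at the given confidence."""
    return -math.log(1.0 - confidence) / ber_target


def measure_time_s(ber_target, data_rate_gbps, confidence=0.95):
    """Time to transmit that many bits at the given data rate."""
    return bits_for_confidence(ber_target, confidence) / (data_rate_gbps * 1e9)


n_bits = bits_for_confidence(1e-10)      # roughly 3e10 bits
t_25g = measure_time_s(1e-10, 25.0)      # duration at 25 Gb/s
```

Under this assumption, about 3 × 10¹⁰ error-free bits are required, which takes on the order of a second per lane at 25 Gb/s.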
Table 1 summarizes the chip performance and compares this work with recent TXs [23,24,25,26] operating at a similar data rate and in a similar CMOS process. Our work shows a wider range of data rates, better energy efficiency, and wider eye width.

6. Conclusions

A four-lane transmitter suited to multiple serial communication protocols is presented in this study. To support a wide operation range, a dual-VCO scheme is proposed in the designed LC-PLL to produce the local half-rate clock, and a configurable data serialization architecture is adopted in the signal lanes to match the transmission data rate. Additionally, an optimized output driver with a 3-tap FFE is proposed to provide sufficient bandwidth and compensate for the wireline channel loss in high-speed communication. The transmitter prototype is fabricated in a 28-nm CMOS process; the active areas of the transmitter lane and of the PLL with CDCs are 0.048 mm² and 0.54 mm², respectively. The prototype supports a wide data-rate range of 1.55 to 32 Gb/s and consumes 1.89 pJ/bit/lane at a data rate of 32 Gb/s.

Author Contributions

C.C. and X.Z. designed the circuits, analyzed the measurement data, and wrote the manuscript. D.W. and J.L. assisted with the circuit simulation and implementation. L.Z. and J.W. assisted with the chip package implementation and the PCB design. D.L. performed the chip test and assisted with the chip measurement. Y.C. contributed to the technical discussions and reviewed the manuscript. X.L. gave valuable guidance and confirmed the final version of the manuscript. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Key Research and Development Program of China, no. 2018YFB2202302.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Frans, Y.; McLeod, S.; Hedayati, H.; Elzeftawi, M.; Namkoong, J.; Lin, W.; Im, J.; Upadhyaya, P.; Chang, K. A 40-to-64 Gb/s NRZ Transmitter with Supply-Regulated Front-End in 16 nm FinFET. IEEE J. Solid-State Circuits 2016, 51, 3167–3177.
  2. Zheng, X.; Zhang, C.; Lv, F.; Zhao, F.; Yuan, S.; Yue, S.; Wang, Z.; Li, F.; Wang, Z.; Jiang, H. A 40-Gb/s Quarter-Rate SerDes Transmitter and Receiver Chipset in 65-nm CMOS. IEEE J. Solid-State Circuits 2017, 52, 2963–2978.
  3. Chen, Y.; Mak, P.I.; Boon, C.C.; Martins, R.P. A 36-Gb/s 1.3-mW/Gb/s duobinary-signal transmitter exploiting power-efficient cross-quadrature clocking multiplexers with maximized timing margin. IEEE Trans. Circuits Syst. I Regul. Pap. 2018, 65, 3014–3026.
  4. Yuan, S.; Wu, L.; Wang, Z.; Zheng, X.; Zhang, C.; Wang, Z. A 70 mW 25 Gb/s quarter-rate SerDes transmitter and receiver chipset with 40 dB of equalization in 65 nm CMOS technology. IEEE Trans. Circuits Syst. I Regul. Pap. 2016, 63, 939–949.
  5. Chen, Y.; Mak, P.I.; Zhang, L.; Qian, H.; Wang, Y. Pre-emphasis transmitter (0.007 mm², 8 Gbit/s, 0–14 dB) with improved data zero-crossing accuracy in 65 nm CMOS. Electron. Lett. 2013, 49, 929–930.
  6. Chen, Y.; Mak, P.; Boon, C.C.; Martins, R.P. A 27-Gb/s Time-Interleaved Duobinary Transmitter Achieving 1.44-mW/Gb/s FOM in 65-nm CMOS. IEEE Microw. Wirel. Compon. Lett. 2017, 27, 839–841.
  7. Zheng, X.; Zhang, C.; Lv, F.; Zhao, F.; Yue, S.; Wang, Z.; Li, F.; Jiang, H.; Wang, Z. A 4–40 Gb/s PAM4 transmitter with output linearity optimization in 65 nm CMOS. In Proceedings of the 2017 IEEE Custom Integrated Circuits Conference (CICC), Austin, TX, USA, 30 April–3 May 2017.
  8. Yin, P.; Shu, Z.; Xia, Y.; Shen, T.; Guan, X.; Wang, X.; Mohammad, U.; Zang, J.; Fu, D.; Zeng, X.; et al. A Low-Area and Low-Power Comma Detection and Word Alignment Circuits for JESD204B/C Controller. IEEE Trans. Circuits Syst. I Regul. Pap. 2021, 68, 2925–2935.
  9. Gao, J.; Cheng, H.; Wu, H.C.; Liu, G.; Lau, E.; Yuan, L.; Krause, C. Thunderbolt Interconnect-Optical and Copper. J. Light. Technol. 2017, 35, 3125–3129.
  10. Sung, G.M.; Tung, L.F.; Wang, H.K.; Lin, J.H. USB Transceiver with a Serial Interface Engine and FIFO Queue for Efficient FPGA-to-FPGA Communication. IEEE Access 2020, 8, 69788–69799.
  11. Bae, W.; Cho, S.Y.; Jeong, D.K. A 1.93-pJ/bit PCI Express Gen4 PHY Transmitter with On-Chip Supply Regulators in 28 nm CMOS. Electronics 2021, 10, 68.
  12. Chun, Y.; Anand, T. An ISI-Resilient Data Encoding for Equalizer-Free Wireline Communication-Dicode Encoding and Error Correction for 24.2-dB Loss with 2.56 pJ/bit. IEEE J. Solid-State Circuits 2020, 55, 567–579.
  13. Maina, R.; Tumiatti, V.; Pompili, M.; Bartnikas, R. Dielectric loss characteristics of copper-contaminated transformer oils. IEEE Trans. Power Deliv. 2010, 25, 1673–1677.
  14. Oh, K.S. Accurate transient simulation of transmission lines with the skin effect. IEEE Trans. Comput. Des. Integr. Circuits Syst. 2000, 19, 389–396.
  15. Chen, Y.; Mak, P.; Zhang, L.; Wang, Y. A 0.002-mm² 6.4-mW 10-Gb/s Full-Rate Direct DFE Receiver with 59.6% Horizontal Eye Opening at 10⁻¹² BER Under 23.3-dB Channel Loss at Nyquist. IEEE Trans. Microw. Theory Technol. 2014, 62, 3107–3117.
  16. Aroca, R.A.; Voinigescu, S.P. A large swing, 40-Gb/s SiGe BiCMOS driver with adjustable pre-emphasis for data transmission over 75 Ω coaxial cable. IEEE J. Solid-State Circuits 2008, 43, 2177–2186.
  17. Chae, J.H.; Ko, H.; Park, J.; Kim, S. A 12.8-Gb/s Quarter-Rate Transmitter Using a 4:1 Overlapped Multiplexing Driver Combined With an Adaptive Clock Phase Aligner. IEEE Trans. Circuits Syst. II Express Briefs 2019, 66, 372–376.
  18. Svelto, F.; Deantoni, S.; Castello, R. 1.3 GHz low-phase noise fully tunable CMOS LC VCO. IEEE J. Solid-State Circuits 2000, 35, 356–361.
  19. Zhao, Y.; Chen, Z.; Liu, Z.; Li, X.; Wang, X. A 4.1 GHz–9.2 GHz programmable frequency divider for Ka band PLL frequency synthesizer. Electronics 2020, 9, 1773.
  20. Hossain, M.; El-Halwagy, W.; Hossain, A.D.; Aurangozeb, A. Fractional-N DPLL-Based Low-Power Clocking Architecture for 1–14 Gb/s Multi-Standard Transmitter. IEEE J. Solid-State Circuits 2017, 52, 2647–2662.
  21. Kim, J.; Balankutty, A.; Elshazly, A.; Huang, Y.Y.; Song, H.; Yu, K.; O’Mahony, F. A 16-to-40 Gb/s quarter-rate NRZ/PAM4 dual-mode transmitter in 14 nm CMOS. In Proceedings of the 2015 IEEE International Solid-State Circuits Conference (ISSCC) Digest of Technical Papers, San Francisco, CA, USA, 22–26 February 2015; Volume 58, pp. 60–61.
  22. Menolfi, C.; Braendli, M.; Francese, P.A.; Morf, T.; Cevrero, A.; Kossel, M.; Kull, L.; Luu, D.; Ozkaya, I.; Toifl, T. A 112 Gb/s 2.6 pJ/b 8-Tap FFE PAM-4 SST TX in 14 nm CMOS. In Proceedings of the Digest of Technical Papers-IEEE International Solid-State Circuits Conference, San Francisco, CA, USA, 11–15 February 2018; IEEE: Piscataway, NJ, USA, 2018; Volume 61, pp. 104–106.
  23. Roshan-Zamir, A.; Elhadidy, O.; Yang, H.W.; Palermo, S. A Reconfigurable 16/32 Gb/s Dual-Mode NRZ/PAM4 SerDes in 65-nm CMOS. IEEE J. Solid-State Circuits 2017, 52, 2430–2447.
  24. Ahn, C.; Hong, J.; Shin, J.; Kim, B.; Park, H.J.; Sim, J.Y. An 18-Gb/s NRZ Transceiver with a Channel-Included 2-UI Impulse-Response Filtering FFE and 1-Tap DFE Compensating up to 32-dB Loss. IEEE Trans. Circuits Syst. II Express Briefs 2020, 67, 2863–2867.
  25. Choi, M.C.; Jeong, D.K.; Cho, S.Y.; Shim, M.; Kim, B.; Ko, H.G.; Ju, H.; Park, K.; Kim, H.; Kim, K. A 2.5–28 Gb/s Multi-Standard Transmitter with Two-Step Time-Multiplexing Driver. IEEE Trans. Circuits Syst. II Express Briefs 2019, 66, 1927–1931.
  26. Bichan, M.; Ting, C.; Zand, B.; Wang, J.; Shulyzki, R.; Guthrie, J.; Tyshchenko, K.; Zhao, J.; Parsafar, A.; Liu, E.; et al. A 32 Gb/s NRZ 37 dB SerDes in 10 nm CMOS to Support PCI Express Gen 5 Protocol. In Proceedings of the 2020 IEEE Custom Integrated Circuits Conference (CICC), Boston, MA, USA, 22–25 March 2020; pp. 19–22.
Figure 1. The top-level architecture of the proposed four-lane TX with on-chip PLL.
Figure 2. The implementation of the proposed multi-rate single lane.
Figure 3. (a) The details and (b) timing diagram of the latch array and 2:1 MUX.
Figure 4. (a) The proposed CML-based output driver with a 3-tap FFE. (b) The detailed circuit of the bias generator.
Figure 5. (a) The block diagram of the wide-operation-range PLL. (b) The designed LC-VCO.
Figure 6. Chip micrograph.
Figure 7. Measurement environment with the FPGA and the oscilloscope.
Figure 8. (a) Measured spur and phase noise of the designed PLL at 10 GHz; (b) measured frequency range of VCO1 and VCO2; (c) measured PLL output eye diagrams at 6.2/10/14/16 GHz.
Figure 9. Measured output eye diagrams of the proposed TX lane-0 at 1.55/10/28/32 Gb/s.
Figure 10. Measured output eye diagrams of the proposed TX lane-0/1/2/3 at 32 Gb/s.
Figure 11. (a) The frequency response of the three different channels and (b) measured output eye diagrams of the proposed TX lane-0 at 32 Gb/s with the FFE off and on.
Table 1. Performance summary and comparison.
| | This Work | [23] | [24] | [25] | [26] |
|---|---|---|---|---|---|
| CMOS Technology | 28 nm | 65 nm | 28 nm | 65 nm | 10 nm |
| Modulation | NRZ | NRZ/PAM-4 | NRZ | NRZ | NRZ |
| Data Rate (Gb/s) | 1.55–32 | 16/32 | 18 | 2.5–28 | 1–32 |
| FFE Tap | 3 | 4 | 3 | 3 | 5 |
| Eye Width (UI) # | 0.72 | 0.6 | N/A | 0.5 | 0.64 |
| Eye Height (V) # | 0.35 | 0.07/0.5 | 0.03 | 0.25 | 0.16 |
| Output Swing (V) # | 0.84 | 0.2/0.08 | 0.2 | 0.84 | 0.5 |
| Energy Efficiency * (pJ/bit) | 1.89 | 9.44/4.99 | 5.38 | 2.62 | 10.18 |
| Supply (V) | 0.9/1.2/1.8 | 1.2 | N/A | N/A | 1/1.8 |
| Area/Lane (mm²) | 0.048 | 0.074 | 0.021 | 0.196 | 0.24 |
# The eye width, eye height, and differential output swing are measured at the highest data rate. * The energy efficiency is for a single transmitter lane.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Cai, C.; Zheng, X.; Chen, Y.; Wu, D.; Luan, J.; Lu, D.; Zhou, L.; Wu, J.; Liu, X. A 1.55-to-32-Gb/s Four-Lane Transmitter with 3-Tap Feed Forward Equalizer and Shared PLL in 28-nm CMOS. Electronics 2021, 10, 1873. https://doi.org/10.3390/electronics10161873