An Area-Efficient and Programmable 4 × 25-to-28.9 Gb/s Optical Receiver with DCOC in 0.13 μm SiGe BiCMOS

Abstract: In this paper, we present an area-efficient, noise-optimized, programmable 4 × 25-to-28.9 Gb/s optical receiver. Both high- and low-power modes are available for the receiver to meet different requirements. Emitter degeneration provides the input transimpedance amplifier (TIA) stage with improved stability. The noise of the TIA with emitter degeneration is analyzed, and an improved noise optimization method for the TIA is proposed. A sink current source with emitter degeneration in a DC offset cancellation (DCOC) loop reduces the noise introduced by the DCOC circuit. Moreover, with parasitic capacitor utilization in the DCOC loop and capacitive emitter degeneration in the variable-gain amplifier (VGA) stage, the chip area is minimized. Fabricated in a 0.13 μm SiGe BiCMOS technology, the receiver achieves a small area of 0.54 mm² per lane. The measured bit error rate (BER) is 10⁻¹² with the input signal varying from 110 μApp to 1150 μApp. The one-lane power dissipation values in the low-power and high-power modes are 84.97 mW and 123.75 mW, respectively.


Introduction
The exponential growth of data traffic has created higher data rate requirements for electronic components [1]. To meet the increasing demand among computational blocks, communication links with data rates exceeding hundreds of Gb/s [2] are needed. Compared with telecommunication, multilane optical communication [3] has been widely used for its excellent capability [4].
In previous research, several multilane optical receivers with high data rates and noise optimization have been developed [5][6][7][8][9][10][11] in CMOS and SiGe BiCMOS technology. A multilane optical receiver with data rates exceeding hundreds of Gb/s was first functionally realized by Shibasaki, T. [2] in 2015. To enhance the performance of optical receivers, a wide-band receiver design was analyzed by Kokolov, A.A. [12] in 2019, a noise optimization method was proposed by Li, D. [9] in 2016, and power and area reductions in advanced CMOS technology were achieved by Shahramian, S. [13] in 2019. However, these receivers still have shortcomings: poor compatibility with more than one operating mode; complicated trade-offs among the stability [14], bandwidth, and noise [9] of the input transimpedance amplifier (TIA) stage; and high manufacturing costs due to the large chip area in SiGe BiCMOS technology.
In this paper, we present an area-efficient, noise-optimized, programmable 4 × 25-to-28.9 Gb/s optical receiver for a multilane optical fiber system in 0.13 μm SiGe BiCMOS. With a flexible power mode and a received signal strength indication (RSSI) function, the proposed optical receiver works at 4 × 25-to-28.9 Gb/s data rates and provides variable gain and output swing. To avoid intersymbol interference (ISI) caused by instability of the input TIA stage, the stability of that stage is improved by introducing emitter degeneration. A noise model of the input TIA stage with emitter degeneration is then analyzed, and an improved noise optimization method based on the noise model and parameter scaling is proposed. As a result, complicated trade-offs among noise, bandwidth, stability, and S11 (scattering parameters) are avoided. Two measures reduce the chip area: the capacitors in the lowpass filters of the DC offset cancellation (DCOC) loop are omitted by utilizing parasitic capacitors, and capacitive emitter degeneration, rather than the commonly used large passive inductors, is applied for bandwidth extension in the variable-gain amplifier (VGA) stage. Additionally, the sink current source with emitter degeneration reduces the input-referred noise that the DCOC circuit introduces into the receiver.
This paper is organized as follows. The chip architecture is introduced first in Section 2. Then, detailed circuit designs are presented in Section 3, followed by the measurement results in Section 4. Finally, conclusions are drawn in Section 5.

Architecture Design
Figure 1 presents the detailed architecture of the optical receiver with four parallel lanes. PIN_Kx (x = 1, 2, 3, 4) are the output pads of the on-chip low dropout regulators (LDOs) of the four lanes, which are connected to the cathodes of the photodiodes by bond wires to provide a clean bias. PIN_Ax (x = 1, 2, 3, 4) are the input pads of the four lanes, which are connected to the anodes of the photodiodes by bond wires. Furthermore, the series inductance introduced by the bond wire helps to isolate the photodiode capacitance from the TIA input capacitance [15]. Additionally, Voutx (x = 1, 2, 3, 4) denotes the differential output pads of lane x.
Each lane of the optical receiver contains a TIA, a single-ended-to-differential amplifier (STD), a VGA, an output amplifier, a DCOC loop, and an RSSI module. At the input stage, the TIA preamplifies the small current signal into a larger voltage signal. Then, the STD stage converts the single-ended voltage signal to a differential signal. The VGA amplifies the voltage signal to a level sufficient for the reliable operation of the subsequent stages [16]. Finally, the output amplifier drives the off-chip load. The DCOC loop is implemented to eliminate the output DC offset due to process variation [17]. Additionally, the RSSI module controls the adjusting resistors in the VGA according to the photodiode's DC current, which is provided by the on-chip LDO. Furthermore, the output swing, gain, and bandwidth are adjusted by controlling the current sources of the STD, the VGA, and the output amplifier via the power mode controller. The four lanes share a single on-chip bandgap module in order to minimize the chip area.

Transimpedance Amplifier
As shown in Figure 2, a single-ended shunt-shunt feedback TIA architecture with emitter degeneration followed by a cascode configuration was designed. The feedback resistor RF provides a good trade-off between the low-noise and wideband characteristics of the TIA [18]. Cascode Q2 is inserted for better isolation and to avoid excessive collector-emitter voltage, and it is followed by an emitter follower, Q3, which drives the next stage. Because the input voltage of the TIA is around 1 V, a level-shift circuit is needed in the feedback loop. Therefore, a diode-connected bipolar transistor, Q4, is inserted in series in the feedback loop for level shifting, which saves power by avoiding an additional emitter follower stage. Because poor linearity causes ISI and harmonic distortion, a tantalum nitride resistor with good linearity was chosen as RF to avoid degrading the deterministic jitter.
An ideal TIA must satisfy the requirements of low noise, high gain, relatively wide bandwidth, and small deterministic jitter. Hence, there is an inevitable trade-off [19] among them, and a circuit model is helpful for optimizing the TIA. Considering the time constants of the nodes in the TIA feedback loop, the input pole (NODE0) of the TIA and the output pole (NODE1) of common emitter (CE) stage Q1 with emitter degeneration dominate the Bode plot of the TIA feedback system. The capacitance at the input node of the optical receiver is the largest among all the nodes, so it gives the largest time constant and sets the dominant pole. Thus, the small-signal equivalent circuit of the input TIA can be simplified into a second-order system, which is shown in Figure 3. The transfer function of the simplified small-signal equivalent circuit is given by Equation (1), where CT denotes the total input capacitance and gm denotes the transconductance of Q1. CT consists of CEX (the capacitance of the pad and the photodiode) and Cb (the input capacitance of Q1).
Derived from Equation (1), the damping factor ξ of the input TIA is given by Equation (2). As Equation (2) shows, the emitter degeneration term gmRE increases ξ and thus improves the stability of the input TIA; ξ must remain larger than 0.71 across PVT (process, voltage, and temperature) variation. Assuming that 1/(RLCL) is larger than the bandwidth of the input TIA stage, BW, the bandwidth is given by Equation (3). According to Equation (3), the −3 dB bandwidth of the input TIA depends mainly on the input pole [20] and the gain (AV) of CE stage Q1 with emitter degeneration. Since the bandwidth requirement of the input TIA is 0.7 times the data rate [21], the value of AV is derived from the given RF and CT.
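For orientation, the textbook second-order model of a shunt-feedback TIA can be written out explicitly. The following is a sketch under stated assumptions and not a reproduction of Equations (1) to (3): it assumes that the CE stage behaves as a voltage amplifier A(s) = AV/(1 + sRLCL) with AV = gmRL/(1 + gmRE), that RF does not load NODE1, and that the symbol ZT(s) (an assumed name) denotes the transimpedance.

```latex
% Sketch only: textbook second-order shunt-feedback model,
% assuming A(s) = A_V / (1 + s R_L C_L) and A_V = g_m R_L / (1 + g_m R_E).
\begin{align*}
Z_T(s) &= \frac{v_{out}}{i_{in}}
        = -\frac{A_V R_F}{A_V + 1}\cdot
          \frac{1}{1 + s\,\dfrac{R_F C_T + R_L C_L}{A_V + 1}
                     + s^2\,\dfrac{R_F C_T\,R_L C_L}{A_V + 1}} \\
\xi &= \frac{R_F C_T + R_L C_L}{2\sqrt{(A_V + 1)\,R_F C_T\,R_L C_L}} \\
BW &\approx \frac{A_V + 1}{2\pi R_F C_T}
   \qquad \text{when } \frac{1}{R_L C_L} \gg 2\pi\,BW
\end{align*}
```

In this form, a larger gmRE lowers AV, which raises ξ at the cost of bandwidth. This matches the qualitative statements in the text, although the exact expressions used by the authors may differ.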
The noise of the first stage (the input TIA) dominates the noise performance of the optical receiver [22]. With the given requirements on ξ and BW, the optimum performance of the input TIA can be obtained by optimizing its noise. The 1/f noise of the transistors in Figure 2 is ignored for high-speed applications, so the thermal noise dominates [23]. In a TIA without emitter degeneration, the thermal noise of the CE stage and RF contributes most of the total noise [24]. With the emitter degeneration shown in Figure 2, the input-referred noise voltage spectrum of CE stage Q1 with emitter degeneration is expressed by Equation (4), where rb is the base resistance of Q1. The input-referred noise current spectrum of the input TIA is then given by Equation (5).
There are two terms in Equation (5); the first term is the shaped noise (f 2 noise) of CE stage Q1 with emitter degeneration, while the second term is the noise introduced by RF. There are inevitable trade-offs between the five parameters in Equation (5) for noise optimization, namely, rb, gm, RE, RF, and CT. However, conventional noise optimization methods [9] have several shortcomings: the noise introduced by RE is not considered, nor are the relationships among IC (the collector current of Q1), rb, gm, and CT. With multiple parameters to trade off, the conventional noise optimization method is complicated.
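A hedged sketch of what a two-term noise model of this kind typically looks like is given below. It uses the standard thermal and collector shot-noise approximations for a degenerated bipolar CE stage and neglects base shot noise and the 1/RF² contribution. The spectrum symbols are assumptions introduced here, and the authors' Equations (4) and (5) may contain additional factors.

```latex
% Sketch only: standard approximations; the spectrum symbols are assumed names.
\begin{align*}
\overline{v_{n}^{2}} &\approx 4kT\left(r_b + R_E + \frac{1}{2 g_m}\right) \\
\overline{i_{n}^{2}}(f) &\approx \overline{v_{n}^{2}}\,(2\pi f C_T)^{2} + \frac{4kT}{R_F}
\end{align*}
```

The first term grows with f² and involves rb, gm, RE, and CT, while the second is the frequency-independent thermal noise of RF, so the same five parameters named in the text appear.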
Because the performance of a bipolar transistor in SiGe BiCMOS technology varies strongly with the DC operating point, Q1 of a given size has to be biased at the optimum operating point for high fT (characteristic frequency) and β (common-emitter current gain). Therefore, the size of Q1 is determined by the given IC. The size of Q1, Cb, and gm are then proportional to IC [25], and rb is inversely proportional to the size of Q1 and hence to IC. Consequently, IC determines the values of rb, gm, and CT for a given CEX. The improved noise optimization method substitutes these relationships among IC, rb, gm, and CT into the noise model (Equation (5)), so that the optimum IC can be obtained from the given RF, RE, and CEX.
As a rule of thumb, the value of CEX is 100 fF. As given by Equation (5), the noise is also suppressed by a large RF. However, a large RF would degrade the bandwidth of the TIA (Equation (3)) and S11. As a trade-off, RF was selected to be 230 Ω. Although emitter degeneration suppresses the noise and crosstalk coupled from the ground wire, the degeneration resistor RE itself increases the input-referred noise current of the input TIA stage. Therefore, RE has to be small enough not to deteriorate the input-referred noise. The results of the noise optimization method, computed in MATLAB with RE varying from 3 Ω to 11 Ω, are shown in Figure 4. Considering the requirement on ξ, RE was selected to be 7 Ω from Figure 5. The optimum collector current is then 4 mA, and the corresponding emitter length of Q1 is 6 μm. The value of AV was derived from the given RF and CT using Equation (3) to meet the bandwidth requirement. As shown in Figure 6, the simulated BW of the TIA in Figure 2 is 29 GHz. Meanwhile, the simulated in,rms of the TIA in Figure 2 is 2.6 μA, which confirms the accuracy of the noise model in Equation (5) and of the improved noise optimization method. With the stability of the TIA enhanced by emitter degeneration, the simulated damping factor ξ (Equation (2)) is 0.74, which is sufficient for stability under PVT variation.
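The scaling-based search for the optimum collector current can be illustrated with a short sketch. This is not the authors' MATLAB code. It assumes the standard two-term noise model (thermal and shot noise of the degenerated CE stage shaped by f², plus the thermal noise of RF), integrates it from 0 to BW, and uses the scaling rules gm = IC/VT, Cb ∝ IC, and rb ∝ 1/IC. The constants `K_CB` and `K_RB` are hypothetical placeholders rather than process data, and `required_gain` assumes the first-order estimate BW ≈ (1 + AV)/(2πRFCT).

```python
import math

K_B = 1.380649e-23                 # Boltzmann constant, J/K
T = 300.0                          # assumed temperature, K
V_T = K_B * T / 1.602176634e-19    # thermal voltage kT/q, about 25.9 mV

R_F = 230.0                        # feedback resistor from the text, ohm
C_EX = 100e-15                     # external input capacitance from the text, F
BW = 0.7 * 28.9e9                  # bandwidth target: 0.7 x data rate, Hz
K_CB = 2e-11                       # HYPOTHETICAL: Cb per ampere of IC, F/A
K_RB = 0.08                        # HYPOTHETICAL: rb times IC, ohm*A


def input_noise_rms(i_c, r_f, r_e, c_ex, bw, k_cb, k_rb):
    """Integrated input-referred noise current (A rms) from 0 to bw."""
    g_m = i_c / V_T                # gm proportional to IC
    c_t = c_ex + k_cb * i_c        # Cb proportional to IC
    r_b = k_rb / i_c               # rb inversely proportional to IC
    v_n2 = 4 * K_B * T * (r_b + r_e + 1 / (2 * g_m))        # V^2/Hz
    shaped = v_n2 * (2 * math.pi * c_t) ** 2 * bw ** 3 / 3  # integral of f^2 term
    thermal_rf = 4 * K_B * T / r_f * bw                     # RF thermal noise
    return math.sqrt(shaped + thermal_rf)


def optimum_ic(r_f, r_e, c_ex, bw, k_cb, k_rb,
               ic_min=0.5e-3, ic_max=20e-3, steps=2000):
    """Grid search for the collector current that minimizes the rms noise."""
    best_ic, best_noise = None, float("inf")
    for n in range(steps + 1):
        i_c = ic_min + (ic_max - ic_min) * n / steps
        noise = input_noise_rms(i_c, r_f, r_e, c_ex, bw, k_cb, k_rb)
        if noise < best_noise:
            best_ic, best_noise = i_c, noise
    return best_ic, best_noise


def required_gain(bw, r_f, c_t):
    """CE-stage gain AV needed for bw, assuming BW = (1 + AV) / (2*pi*RF*CT)."""
    return 2 * math.pi * bw * r_f * c_t - 1


if __name__ == "__main__":
    for r_e in (3.0, 7.0, 11.0):
        i_c, noise = optimum_ic(R_F, r_e, C_EX, BW, K_CB, K_RB)
        print(f"RE = {r_e:4.1f} ohm: IC_opt = {i_c * 1e3:.2f} mA, "
              f"in_rms = {noise * 1e6:.2f} uA")
```

With RE = 0 the search recovers the classic matching condition Cb = CEX, and a nonzero RE moves the optimum toward a smaller collector current. The absolute numbers depend entirely on the placeholder constants and should not be compared directly with the reported 4 mA and 2.6 μA.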

Single-Ended-Differential Amplifier
The current signal from the photodiode is a unipolar non-return-to-zero (NRZ) single-ended signal. Compared with a single-ended signal, a differential signal offers advantages [26] that make it the preferred choice for signal processing. The STD stage following the input TIA stage transforms the single-ended signal to a differential signal. The STD stage block diagram is shown in Figure 7. The STD stage comprises a common differential amplifier with a cascode structure and a variable current source. The variable current source, controlled by the power mode, equips the STD stage with variable gain. The differential inputs of the STD stage are the output of the input TIA stage and a DC bias generated by a dummy TIA. This bias is set equal to the output DC component of the TIA stage, so that the phase and amplitude discrepancy between the differential signals is minimized. As shown in Figure 8a,b, the simulated gain and phase discrepancies of the optical receiver are smaller than 0.05 dBc and 0.4°, respectively, within the bandwidth of interest, which indicates good matching performance.

Variable-Gain Amplifiers
Sufficient gain in the optical receiver ensures a signal level adequate for the reliable operation of the subsequent stages. The two-stage VGA provides the optical receiver with high gain. A schematic of each stage is given in Figure 9. Each VGA stage consists of the capacitive degeneration structure, two adjusting load resistors, and two identical variable current sources.

The adjusting resistor consists of a fixed resistor and a PMOS transistor. The PMOS transistors, controlled by Vctrl in Figure 9 (generated by the RSSI module in Figure 1), act as variable resistors whose resistance varies with the LDO output current, which indicates the input signal strength. Since inductors would consume too much chip area, capacitive degeneration, which acts as an active inductor, was applied for bandwidth extension instead. The variable current source, controlled by the power mode, equips the VGA stage with variable gain. Each VGA stage is followed by an emitter follower stage to drive the next module. Finally, the simulated two-stage VGA frequency response with different LDO output currents in low-power mode and high-power mode is shown in Figure 10. The simulated gain of the VGA varies from 16.5 dB to 9.1 dB in high-power mode, and from 9.1 dB to 1.8 dB with the current reduced in low-power mode.

Output Amplifier
A schematic of the output amplifier is shown in Figure 11. The bandwidth of the output stage is limited by unavoidable large parasitic capacitance. An inductor is used in the output stage to compensate for the effect of this parasitic capacitance. The inductor in series with a resistor introduces a zero that peaks the response and extends the bandwidth. The adjusting current source is programmable for variable output swing. To obtain a cleaner eye diagram with a large eye height, the current source was set to a large output current in the high-power mode. The simulated 25 Gb/s and 28.9 Gb/s eye diagrams of the optical receiver in high-power mode are shown in Figure 12a,b, and the simulated 25 Gb/s and 28.9 Gb/s eye diagrams of the optical receiver in low-power mode are shown in Figure 13a,b. The simulated eye heights in high-power mode and low-power mode are 255 mV and 90 mV, respectively. The eye height also decreases at higher data rates.

DC Offset Cancellation Loop
There are a large number of nonideal factors that can increase the output offset, such as inevitable device mismatch, large gain of the receiver, flicker noise, the sink current of the photodiode, and PVT variation. Therefore, a DCOC circuit following the VGA stage [27] was designed to eliminate the output offset.
As shown in Figure 14, the DCOC circuit consists of a lowpass filter, an error amplifier, and a sink current source. The DCOC circuit should extract the offset signal from the output of the VGA stage without deteriorating the data signal. Therefore, lowpass filters with large resistors are commonly implemented as the first stage in a DCOC circuit. The bandwidth of the lowpass filter in a DCOC loop is typically designed to be a few megahertz. A nitride metal-insulator-metal (MIM) capacitor occupies around 0.001 mm²/pF in 0.13 μm SiGe BiCMOS technology. With the given R1 and R2, the capacitors would therefore occupy about 0.032 mm² to realize the typical 4 MHz bandwidth of a lowpass filter.

In our design, the DCOC loop still performs DC offset cancellation without dedicated capacitors in the lowpass filter, and the bandwidth (BW) and gain-bandwidth product (GBW) of the DCOC loop remain unchanged. The parasitic capacitors take over the filtering role; without them, the high-swing input signal would cause the error amplifier in the DCOC loop to work improperly. Relying on parasitic capacitors instead of the additional capacitors of a typical design reduces the chip area by 6%.
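The quoted area figures can be cross-checked with a few lines of arithmetic. The sketch below assumes that the 0.032 mm² capacitor area applies per lane and that the reported 0.54 mm² one-lane area already excludes those capacitors; neither assumption is stated explicitly in the text.

```python
AREA_PER_PF_MM2 = 0.001       # MIM capacitor area per pF, from the text
FILTER_CAP_AREA_MM2 = 0.032   # capacitor area quoted for the 4 MHz filters
LANE_AREA_MM2 = 0.54          # reported one-lane area (ASSUMED to exclude the capacitors)


def total_capacitance_pf(cap_area_mm2, area_per_pf_mm2):
    """Capacitance that a given MIM area corresponds to, in pF."""
    return cap_area_mm2 / area_per_pf_mm2


def area_saving_fraction(cap_area_mm2, lane_area_mm2):
    """Area saved relative to a lane that still contained the capacitors."""
    return cap_area_mm2 / (lane_area_mm2 + cap_area_mm2)


if __name__ == "__main__":
    cap_pf = total_capacitance_pf(FILTER_CAP_AREA_MM2, AREA_PER_PF_MM2)
    saving = area_saving_fraction(FILTER_CAP_AREA_MM2, LANE_AREA_MM2)
    print(f"Omitted filter capacitance: about {cap_pf:.0f} pF")
    print(f"Area saving per lane: about {saving * 100:.1f} %")
```

Under these assumptions the omitted capacitance is about 32 pF and the saving is a little under 6% of the lane area, which is consistent with the rounded figure quoted above.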
As shown in Figures 15 and 16, the variations of the total parasitic capacitance in the following error amplifier (the sum of CGS1 and CGB1) caused by the differential input voltage and by process and temperature are smaller than 3.8% and 1.7%, respectively. This variation of the parasitic capacitors is too small to affect the DCOC loop and can therefore be considered acceptable.
An error amplifier following the lowpass filter was implemented to amplify the DC offset. The effective value of CC becomes large due to the Miller effect, which pushes the dominant pole of the error amplifier to low frequency and prevents the data signal from coupling from the VGA stage to the input TIA stage through the DCOC loop. The simulated coupling isolation is about −128 dBc at 12.5 GHz.
The DCOC loop shown in Figure 1 is based on a feedback architecture [28], and the transfer function of the DCOC loop is given by Equation (6), where HAMP(s) denotes the transimpedance transfer function from the input TIA to the VGA stage, Herror(s) is the transfer function of the error amplifier in the DCOC circuit, and gm5 is the transconductance of Q5. Because HAMP(s) and Herror(s) are much greater than 1 within the bandwidth of the DCOC loop, HDCOC(s) << 1, and, according to Equation (6), the DC offset and flicker noise are attenuated by the feedback architecture.
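The behavior described here follows from the generic form of a negative-feedback offset-cancellation loop, sketched below. This is an illustrative form only and not necessarily the authors' Equation (6): Gm,sink is an assumed symbol for the effective transconductance of the degenerated sink current source, written with the usual degeneration factor.

```latex
% Sketch only: generic negative-feedback form; G_{m,sink} is an assumed symbol.
\begin{align*}
H_{DCOC}(s) &\approx \frac{1}{1 + H_{AMP}(s)\,H_{error}(s)\,G_{m,sink}},
\qquad
G_{m,sink} = \frac{g_{m5}}{1 + g_{m5} R_{E5}}
\end{align*}
```

When the loop gain HAMP(s)Herror(s)Gm,sink is much greater than 1, HDCOC(s) is much smaller than 1 and low-frequency offset is suppressed; at high frequencies the loop gain collapses and the data signal passes unaffected.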
Because the DCOC loop deteriorates the input-referred noise of the optical receiver, the input-referred noise contributions of the error amplifier and the sink current source were analyzed. The input-referred noise current of the optical receiver contributed by the error amplifier, in,in_error, is given by Equation (7), where vn,error denotes the input-referred noise voltage of the error amplifier.
Since HAMP(s)Herror(s) is far less than 1 at high frequencies due to the limited bandwidth of the DCOC loop, in,in_error in Equation (7) can be ignored. The emitter degeneration term gm5RE5 of the sink current source Isink helps to further reduce in,in_error. Meanwhile, the input-referred noise current of the optical receiver introduced by the sink current source, in,in_sink_cur, is given by Equation (8), where vn,sink_cur denotes the input-referred noise voltage of the sink current source. Compared with a sink current source built from a plain CE stage, the emitter degeneration term gm5RE5 in Equation (8) gives the proposed sink current source (Isink) additional noise suppression. The simulated input-referred noise current introduced by a 100 μA sink current source, implemented either as a CE stage or as a CE stage with emitter degeneration, is given in Figure 17. in,in_sink_cur decreases by 0.82 pA/√(Hz) due to emitter degeneration.

Measurement Results
To evaluate the eye diagram performance of the optical receiver, an INOPOTICALS Bit Analyzer BA8042 (INOPOTICALS, Taiwan) was used to supply 2⁷ − 1 pseudo-random bit sequence (PRBS7) data patterns to the optical receiver, while a Keysight digital signal analyzer DSAZ254A (Keysight, Santa Rosa, CA, USA) was used to measure the output eye diagram. The Bit Analyzer BA8042 was also used to measure the BER performance with a given PRBS7 input signal.
The measured eye diagrams with a 300 μApp input signal (for PRBS7) are presented in Figure 19a,b, where the receiver works in high-power mode. The measured results of the eye diagrams in Figure 19a,b are given in Table 1. The eye heights with input signals at 25 Gb/s and 28.9 Gb/s are 154.2 mV and 114.8 mV, respectively, so the eye height at 25 Gb/s is considerably larger than that at 28.9 Gb/s. The crossing of the eye diagram becomes larger at higher data rates, as shown in Table 1. Additionally, the measured BER of the optical receiver with a 200 μApp input signal (for PRBS7) at data rates from 25 Gb/s to 28.9 Gb/s is 10⁻¹² in both high-power mode and low-power mode.
A performance summary of the presented optical receiver and a comparison with previous works are given in Table 2. The measured minimum input current of the proposed receiver is 110 μApp, owing to the minimized noise of the DCOC loop and the input TIA stage. The minimum input current is also kept low by the improved stability of the input TIA. Meanwhile, with parasitic capacitor utilization in the DCOC loop and capacitive emitter degeneration in the VGA stage, the one-lane area of the receiver is reduced to 0.54 mm², almost half that of prior works in SiGe BiCMOS technology. The one-lane area of the receiver is even smaller than that of state-of-the-art designs in advanced CMOS technology. The power supply solution is simplified to a single power supply. Compared with previous works, this work provides a wider range of data rates and transimpedance, which suits different communication environments.

Conclusions
An area-efficient programmable 4 × 25-to-28.9 Gb/s optical receiver was designed and implemented herein. With the area optimizations described above, the one-lane area is only 0.54 mm². Meanwhile, the measured BER is 10⁻¹² with a 110 μApp minimum input current due to the improved noise optimization and programmability of the optical receiver. With a 3.3 V power supply, the one-lane power dissipation is 84.97 mW in low-power mode and 123.75 mW in high-power mode. Consequently, the proposed optical receiver shows good performance in terms of area efficiency and programmability when compared to previous designs.