A 90 GHz Broadband Balanced 8-Way Power Amplifier for High Precision FMCW Radar Sensors in 65-nm CMOS

We present a W-band 8-way wideband power amplifier (PA) for a high precision frequency modulated continuous wave (FMCW) radar in 65-nm CMOS technology. To achieve a broadband operation with an improved output power for a high range resolution and high distance coverage of FMCW radar sensors, a balanced architecture is employed with the Lange coupler which naturally combines the output powers from two 4-way push-pull PAs. By utilizing a transformer-based push-pull structure with a cross-coupled capacitive neutralization technique, the gate-drain capacitance of the 4-way PA is compensated for the stabilization with an improved power gain. Interstage matching was performed with transformers for a reduced loss from the matching network and minimal area occupation. The implemented balanced 8-way PA achieved a saturated output power (Psat) of 16.5 dBm, a 1-dB compressed output power (OP1dB) of 13.3 dBm, a power-added efficiency (PAE) of 9.9% at 90 GHz and 3-dB power bandwidth was 20.4 GHz (79.2–99.6 GHz).


Introduction
Recently, frequency-modulated continuous-wave (FMCW) radar sensors have been widely used for airborne radars, automotive cruise control, and weather radars. FMCW radar sensors that operate at millimeter-wave can achieve better spatial and range resolution owing to their wide sweep bandwidth, as well as a sharper beamwidth for a given antenna size. Moreover, low oxygen attenuation and availability of small size antennas at W-band are appealing in improving propagation loss and implementing a coherent multi-receiver integration for better sensitivity [1]. In this regard, one of the most crucial parameters of a high-precision FMCW radar is a sweep bandwidth that improves high range resolution, e.g., FMCW radar sweep bandwidth should be as wide as 10 GHz to distinguish objects that are 1.5 cm apart [2], which is around 10% of the fractional bandwidth of a 90 GHz FMCW radar sensor.
With the advancement of CMOS technology, a bulk silicon-based CMOS process has been considered one of the most popular technologies for mass production, owing to its low-cost and high-level integration features. However, the design of the wideband power amplifier in a nanoscale CMOS is quite challenging due to the serious channel length modulation effect in the nanoscale CMOS, which directly determines the output power, linearity, and power gain. Moreover, the reduction of the maximum available gain (MAG) and maximum stable gain (MSG) with the increase in operating frequency correspondingly degrades the power gain and efficiency and it hinders the gain flatness to achieve wide bandwidth for FMCW radar applications at the millimeter-wave regime.
Parasitic capacitances in RF transistors have been identified as the main factor of the MAG and MSG reduction. To compensate for parasitic capacitances, the common source (CS) topology with the capacitive cross-coupling neutralization method has been widely applied at millimeter-wave amplifiers [3]. The gate-drain capacitance C gd constructs undesirable shunt feedback which decreases the gain and the stability of systems. This can be relieved by using cross-coupled neutralization capacitors C n between the drain node of one side of the device and the gate node of the other side in a push-pull structure. Additionally, a value of C n should be carefully chosen, since C n can create positive feedback when the capacitive neutralization is over-compensated. Thus, the power gain and poweradded efficiency (PAE) should be carefully compromised with stability.
In designing millimeter-wave CMOS PAs, one of the most important parameters is the saturated output power (P sat ) which is determined by an available active device size considering a feasible output matching network. Since CMOS PAs are usually implemented with a low supply voltage of around 1 V due to low breakdown voltages from the thin oxide layer in MOS, it is hard to achieve an efficient output power matching network with a relatively large size device. To resolve this issue, there have been enormous efforts in the literature. By stacking the MOS transistor, the output voltage can be scaled up but necessitates an expensive SOI technology if the required stack is more than two [4]. Various power combining techniques have been more prevalently used to boost the output power of a PA. Transmission-line-based combiner [5,6], λ/4-based combiner [7,8], and transformerbased combiner [9][10][11][12][13] are widely used to boost the output power. Among these approaches, the transformer-based power combiner is considered the widely adopted technique due to its simplified matching network, and no additional AC coupling capacitor is required, which has a low-quality factor at the millimeter-wave regime. It should be noticed that the 3-dB power bandwidth of the employed PA must satisfy the sweep bandwidth requirement. Moreover, the flatness variation in the output power level should be kept minimized to avoid sidelobes from the unwanted amplitude modulation.
In this paper, a wideband balanced 8-way power amplifier with the Lange couplers is presented by combining two 4-way transformer-based push-pull PAs for advanced performances in the output power, efficiency, and wide power bandwidth with improved flatness. The W-band PA was implemented in TSMC 65 nm CMOS, and it achieved the output power of 16.4 dBm with 26.7 dB gain, 9.8% PAE at 91.2 GHz, 13 GHz (83-96 GHz) of 3-dB gain bandwidth, more than 20 GHz of 3-dB output power bandwidth, and 1.5 dB of variation in the saturated output power. Analysis of choosing a proper neutralization capacitor was performed considering the stability factor, MAG, and MSG of the transistor. Moreover, the transformer-based matching networks and hybrid combiner were investigated to achieve a high output power and PAE with improved 3-dB power bandwidth at W-band. By utilizing a hybrid combining technique together with transformer-based push-pull PAs, the proposed PA achieves improved output power, gain, and bandwidth, which results in the highest Figure-of-Merit (FoM) among the recently reported CMOS PAs operating at the 90 GHz region. Moreover, the presented PA is robust to the load variations owing to the balanced configuration.

Design of 4-Way Push-Pull Power Amplifier
The proposed balanced 8-way PA consists of the two transformer-based 4-way pushpull PAs combined with the Lange couplers. In this section, the design of the 4-way PA is described by utilizing the transformer-based power divider and combiner in the current mode architecture.

Architecture of 4-Way PA
The schematic of the proposed 4-way 4-stage PA is presented in Figure 1. It is composed of an input power divider and output power combiner in the current domain, two 2-way PAs with the push-pull pairs utilizing transformer-based interstage matching networks. For each push-pull pair for the 2-way PA, the capacitive neutralization technique was employed to enhance the impedance matching and stability of the circuit. In this architecture, two push-pull PAs are combined in the current domain with a transformer-based combiner, which can increase the effective impedance seen from each push-pull PA by twice, which eventually helps in choosing a proper output device size of the push-pull PA to improve the power efficiency and the current handling capability of the output matching network.

Differential PA Unit Cell
We utilized the push-pull structure with a differential pair using the capacitive neutralization technique to improve the stability factor as presented in Figure 1. This architecture eventually results in better impedance matching and increases the isolation between input and output as well as the gain of each stage. The value of the neutralization capacitor was selected to be close to the C gd of the transistor. Figure 2 shows the MSG/MAG of the differential pair in the power stage depending on the neutralization capacitance C n4 over frequency. In this design, C n4 = 19 fF was chosen, and it is verified that the active device with the capacitive neutralization technique was stable from the extracted K and |∆| at W-band. The gate width (W) of the first and second driver stage was W 1,2 = 32 × 0.8 µm. For the transistors at the 3rd driver stage and power stage, W 3 = 32 × 1 µm, and W 4 = 48 × 1 µm were used, respectively. To achieve optimal capacitive neutralization for a given device size in each stage, the neutralization capacitors were set by C n1,2 = 12 fF, C n3 = 14.3 fF, and C n4 = 19 fF.

Transformer-Based Matching Network
To model an on-chip transformer (TF) for a PA design in CMOS technology, a lowfrequency model with six parameters (L 1 , R 1 , L 2 , R 2 , M, and R f ) has been used to characterize the two winding inductors and the inductive coupling, the capacitive coupling, respectively, as illustrated in Figure 3 [14][15][16]. The source and the load of the transformer can be either the 50-Ω terminal, or the gate or the drain of transistors in interstage matching network design. The coupling coefficient and the quality factors are defined as below: The optimum source (Z s = R s + jX s ) and load (Z L = R L + jX L ) for a given transformer are given in [15], and they are written in terms of admittances as: where Y S = jB S + G S is the source admittance and Y L = jB L + G L is the load admittance of the transformer. The conductance and susceptance of the source and load can be transferred into the forms of parallel resistance (R p ) and reactance (X p ) as below: For an on-chip transformer in the CMOS process, the five-parameter model with R f = 0, can properly model the electrical behavior of a winding transformer up to around 70% of its SRF with 10% of precise tolerance [16]. It is noteworthy that the intrinsic insertion loss of a TF is reduced monotonically for k 2 Q 1 Q 2 .
In designing the interstage matching networks, transformers with a center tap were employed and named TF1, TF2, and TF3, as illustrated in Figure 1. Considering the device size and neutralization capacitors of each unit cell amplifier, interstage matching networks are designed using the lumped components model of the transformer, presented in Figure 3. The turns ratio of all the interstage transformers is 1:1, and the transformers operate well below the self-resonance frequency (SRF) to guarantee the validity of the lumped component model. In the case of TF3 at the input of the final stage, additional series inductors were applied to resonate out the drain-source capacitance. Figure 4 presents the trace at each interstage by using the lumped components model. Since the size of M1 equals M2, the conjugate source admittance is plotted at the same point. The impedance of the lumped model corresponds well with 3D EM simulated results, as presented in Figure 4.

Transformer-Based Parallel Combiner
In implementing the transformer-based combiner, the ultra-thick metal (UTM) with the thickness T = 3.4 µm was used for the primary coil to minimize the voltage drop in the path of VDD, while M8 was used for the secondary coil. In this structure, the input impedance of the matching network at each differential port of the primary coils is well balanced, even though the secondary coil is a single-ended structure connected to the ground. The 3D EM model of the input divider and output combiner is described in Figure 5a,b, respectively. All the simulations of the passive components were performed with Ansoft HFSS. The diameter of the divider and the outside coils of the combiner are 52 µm and 46 µm, respectively. For the transformer-based divider in Figure 5a, the primary coils have a larger self-inductance compared to the secondary coil as the diameter of the secondary coil is 39 µm. To improve the coupling factor k for the transformer, broadsidecoupled coils were implemented on the M8 layer, which has a smaller minimum spacing rule than the UTM layer. The simulated insertion losses of the combiner and the divider are shown in Figure 6a. In implementing the output parallel combiner for each 4-way push-pull PA, vertical-coupled transformers were used to combine output signals at the coils. This on-chip vertical-coupled structure is advantageous to combine signals with low insertion loss, as it can achieve a very small distance between the primary and secondary coils. Owing to the symmetry in the layout, the parallel combiner with the current mode is highly balanced. To verify the electrical symmetry of the transformer-based combiner and divider, the amplitude and the phase imbalance of each component are simulated, which shows almost 0 dB and below 1-degree imbalance below 140 GHz, as presented in Figure 6b. The small amplitude and phase imbalances are caused by the paths for cross-coupled capacitors in different layers.  The output power matching has been performed using the Load-pull analysis with Harmonic Balanced (HB) simulation. The differential optimum impedance Z opt1diff and Z opt2diff of the power stage are extracted using load-pull simulations, and each input impedance at differential ports of the output matching network is matched to Z opt1diff and Z opt2diff . The simulated S-parameters of the designed 4-way PA are shown in Figure 7a. The 4-way PA achieves a peak gain of 27.5 dB at 91 GHz with a 3-dB gain bandwidth of 11 GHz (85.5-96.5 GHz). The input and output return losses are larger than 10 dB within the 3-dB gain bandwidth. Also, the large-signal results of the 4-way PA were simulated. The simulated output power and the 1-dB compressed output power () are 14.2 dBm and 10.8 dBm at 90 GHz, respectively. The maximum power efficiency of the 4-way PA is simulated to be 11% at 90 GHz. In the simulations of 4-way PA, the input and output port impedance were set to 35 Ω, which is a characteristic impedance of the transmission line in the Lange coupler. The simulation and layout of the designed PA were performed with Cadence Virtuoso. Figure 7b shows simulated results of the large-signal performance versus frequency.

W-Band Microstrip Lange Coupler
The advantage of utilizing a balanced power amplifier configuration is its wideband matching at input and output. Moreover, it guarantees a wide range of stability since the reflected powers from the impedance mismatches are dissipated in the 35 Ω termination of the isolation port at the coupler. Through the balanced structure, the flatness can also be improved as the unbalanced output power from the mismatches of the two 4-way PAs must be dissipated in the termination as well [17]. Owing to this characteristic, a balanced PA is ideal for a high-precision FMCW radar sensor.
To achieve the balanced architecture with an improved P sat for a wideband operation, the Lange couplers were employed to combine the output powers from each 4-way PA. It is noteworthy that the size of the Lange coupler becomes comparable to that of the couplers with lumped components at W-band. Moreover, the insertion loss of the transmission-linebased couplers becomes even smaller than its counterparts at 90 GHz. The designed Lange coupler utilizes the UTM layer as a coupled line to mitigate the insertion loss (IL).
To construct the Microstrip-line, M1 and M2 layers were stacked together with via to form a solid ground plane, and the UTM (ultra-thick metal layer) layer was used as a signal line. For the Lange coupler, the bridges were formed with the M8 layer as illustrated in Figure 8a. The length of the hybrid coupled combiner is λ g /4 ≈ 450 µm for 90 GHz. It should be noted that the reference impedance was chosen to be 35 Ω considering the limited minimum width of the UTM layer in implementing the coupled line of the Lange coupler. Even though the input and output VSWR of the whole PA highly rely on that of the Lange coupler, the return loss (RL) from the mismatch between 35 and 50 Ω load is still good enough (RL≈19 dB). Hence, the load impedance of the 4-way PA was also designed to be 35 Ω. Moreover, a reference impedance of 35 Ω in the Microstrip-line features wide signal paths that allow more current to flow [7]. As illustrated in Figure 8b, the implemented 8-way PA forms a balanced structure with two 4-way four-stage push-pull PAs utilizing the input and output Lange couplers with its high coupling feature, owing to the multiple fingers. The 4-way PAs were implemented with two transformer-based push-pull PAs using transformer-based combiners and dividers, as presented in Figure 1. The simulated magnitude and phase of the S-parameters for the designed Lange coupler are shown in Figure 9a,b. Transistors in two internal paths were placed away from DC pads, and it can increase common-mode inductance due to relatively lengthy supply lines. Herein, bypass capacitors were placed not only around DC pads, but also between two quadrature paths in the coupler.

Robustness to Load Mismatch
In this design, the extracted differential optimum load impedance Z optdiff = 11.3 + j43 Ω at 90 GHz is matched from a 50 Ω load. However, the load impedance cannot exactly be 50 Ω, in practice, since the connection between an antenna at the output and the PA causes load variation. To check the robustness to the load variations, the impedance of the load R L was varied from 20 to 80 Ω (±60% variations) in the simulation. Then, we observed the output power, output return loss, and the input impedance variation of the transformer-based combiner preceding the Lange coupler. The output matching network and unintentionally variable antenna load are shown in Figure 10a. The load impedance does not affect much the input impedance of the output matching network, owing to the balanced structure of the designed PA as shown in Figure 10b. The arrows in Figure 11 indicate the improved output return loss (RL) of the 8-way PA compared with that of the 4-way PA. Meanwhile, the output RL of the 8-way PA with an 80 Ω load achieved the output RL better than the acceptable level (>8 dB) in the working range between 75 and 105 GHz; the S22 of the 4-way PA was distinctively shifted depending on the load variations, which resulted in 5 dB of RL in the operating band. In the load variations, the PA achieved the minimum P sat of 15.7 dBm with the 80 Ω load and the maximum P sat of 16.2 dBm with the 50 Ω load at 90 GHz. As such, the simulated output power results show that the Lange coupler compensates for the load variations, which effectively mitigates the performance degradations of the PA from the load variations.

Measurement Results
The chip size of the implemented PA is 0.94 × 0.8 mm 2 , including RF pads and DC pads. Without RF pads and DC pads, the size of the core area is 0.62 × 0.62 mm 2 . The chip photograph of PA is presented in Figure 12. For the measurement of S-parameters of the implemented balanced PA, a vector network analyzer N5224A (Keysight, CA, USA) combined with an extension module (75-110 GHz) was used with an on-wafer probe station, and on-wafer setup was calibrated with a CS-5 calibration kit. The implemented PA consumes a DC-current of 370 mA from a 1.2-V supply. In measuring the large-signal performance, the W-band input signal was generated using a ×6 frequency multiplier and an X-band signal generator 83623B (Agilent, CA, USA), A W-band attenuator QAD-W00000 (Quinstar, CA, USA) was placed at the output of the frequency multiplier to control the input power. The output power of the PA was measured using a power meter E4419B (Agilent, CA, USA) and a W-band power sensor W8486A (Agilent, CA, USA). The setup for the S-parameters and the large-signal measurement at W-band are illustrated in Figure 13a,b, respectively. The measured S-parameters of the balanced PA are presented in Figure 14. The peak gain improvement of 27.4 dB was observed at 86.4 GHz, and the 3 dB gain bandwidth of the balanced PA was measured to be 13 GHz (83-96 GHz). Owing to the broadband feature of the Lange coupler, the input and output return loss are above 10 dB from 75 GHz to 105 GHz, except for around 97 GHz. Figure 15a,b show results of the large-signal measurement over the frequency and input power, respectively. The implemented PA achieved the peak P sat of 16.5 dBm, OP 1dB of 13.3 dBm, and PAE was 9.9% at 90 GHz. Within the 3-dB gain bandwidth, the designed balanced 8-way PA performs P sat higher than 14.9 dBm, where the maximum variation of 1.6 dB is observed within the 3-dB gain bandwidth. To demonstrate the output power flatness, which is advantageous for the application of PA in high-precision FMCW radars, the 1-dB power bandwidth (P sat BW -1dB ) and 3-dB power bandwidth (P sat BW -3dB ) were evaluated from the measurement results, as shown in Figure 15a. P sat BW -1dB is 13.8 GHz (81.6-95.4 GHz), and P sat W -1dB is 20.4 GHz (79.2-99.6 GHz). Even though the power bandwidths were degraded from estimated values in simulation due to the low P sat above 94 GHz in the measurement, the implemented balanced PA achieved a broad power bandwidth representing the availability of highprecision FMCW radars.   Table 1 summarizes the performance of the implemented PA in this work and other recently reported W-band CMOS PAs. Both PAs in [5,6] utilized transmission-line (T-line)based combiners. The PA in [5] achieved a 38 GHz of bandwidth, and the 16-way PA in [6] performed the highest output power among W-band PAs. However, it is noticed that the PAs with T-line combiner had relatively lower PAE due to the lossy output matching network. Works in [9][10][11]18] utilized compact transformer-based combiners, and they achieved the saturated output power higher than 14 dBm with small area occupancy. However, with this configuration, P sat BW -1dB is relatively lower than other reported PAs. Meanwhile, our proposed balanced PA features better 1-dB and 3-dB power bandwidth by efficiently combining two transformer-based push-pull PAs with Lange couplers. The implemented 8-way PA achieved the highest FoM and FoM BW among the recently reported CMOS PAs operating above 90 GHz to date.  [20]. CS: Common source, CC: Cascode, BA: Balanced amplifier.

Conclusions
We demonstrated a W-band balanced power amplifier (PA) that achieves a 20.4-GHz of 3-dB power bandwidth with +16.5 dBm of the peak saturated output power (P sat ) and 9.9% of the peak power added efficiency (PAE) in 65nm CMOS technology. To achieve the broadband 8-way PA at 90 GHz, each transformer-based 4-way PA was implemented by combining two push-pull PAs in the current mode. By utilizing on-chip Lange couplers as balancing hybrids, the two transformer-based 4-way PAs were effectively combined for the wideband operations. The implemented 8-way PA demonstrated a wideband operation to be used for high-precision FMCW radars with the highest FoM among the recently published s CMOS PAs at the 90 GHz regime.

Conflicts of Interest:
The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.