A Fully Integrated Bluetooth Low-Energy Transceiver with Integrated Single Pole Double Throw and Power Management Unit for IoT Sensors

This paper presents a low-power Gaussian Frequency-Shift Keying (GFSK) transceiver (TRX) with a high-efficiency power management unit (PMU) and an integrated Single-Pole Double-Throw (SPDT) switch for Bluetooth low energy (BLE) applications. The receiver (RX) is implemented with an RF front-end consisting of an inductor-less low-noise transconductance amplifier (LNTA) and 25% duty-cycle current-driven passive mixers, followed by a low-IF baseband analog (BBA) section with a complex Band Pass Filter (BPF). The transmitter (TX) employs an analog phase-locked loop (PLL) with one-point GFSK modulation and a class-D digital Power Amplifier (PA) to reduce current consumption. In the analog PLL, a low-power Voltage Controlled Oscillator (VCO) is designed, and an automatic bandwidth calibration (ABC) is proposed to optimize bandwidth, settling time, and phase noise by adjusting the charge pump current, the VCO gain, and the resistor and capacitor values of the loop filter. The Analog-to-Digital Converter (ADC) adopts a straightforward architecture to reduce current consumption. The DC-DC buck converter automatically selects the optimum mode among three modes, Pulse Width Modulation (PWM), Pulse Frequency Modulation (PFM), and retention, depending on the load current. The TRX is implemented in a 1P6M 55-nm Complementary Metal–Oxide–Semiconductor (CMOS) technology, and the die area is 1.79 mm². The TRX consumes 5 mW in RX mode and 6 mW in TX mode when the PA output power is 0 dBm. The measured RX sensitivity is −95 dBm at 2.44 GHz. The efficiency of the DC-DC buck converter is over 89% when the load current is higher than 2.5 mA in the PWM mode. The quiescent current consumption is 400 nA from a supply voltage of 3 V in the retention mode.


Introduction
Recently, the Internet of Things (IoT) has been applied to various applications such as wearable devices, sensor networks, and health care [1]. One of the essential requirements of IoT applications is low-power wireless connectivity for long battery life. The Bluetooth low-energy (BLE) standard is a promising wireless connectivity solution for IoT applications [2]. BLE operates in the 2.4-GHz industrial, scientific, and medical (ISM) band and uses GFSK with a modulation index h = 0.5 for signaling. The nominal frequency deviation, f_dev, is derived as Equation (1) [2].
Figure 1 shows a block diagram of the proposed transceiver for BLE applications. It is composed of a frequency synthesizer using an analog PLL, the transmitter with a class-D RF PA, an integrated SPDT switch, and the PMU with a triple-mode DC-DC buck converter and LDOs. Since the on-chip SPDT switch eliminates the need for external RF components between the chip and the antenna, minimum-sized modules can be configured. BLE RF applications require the PCB area to be as small as possible; sharing the matching components of the TX and RX front-ends is therefore beneficial in terms of area and cost. The TX architecture is based on direct modulation of an analog PLL instead of IQ up-conversion, and the RX architecture is based on low-IF down-conversion. Although a digital PLL can have a benefit in terms of area, an analog PLL is implemented in this paper to avoid the complexity of the additional calibration logic needed in a digital PLL [8]. The proposed TX uses an inductor-less class-D PA for small area and high efficiency [9]. Because the PLL is directly modulated at 1 Mbps, a PLL bandwidth of 1 MHz is required [10]. The proposed low-IF RX uses an inductor-less LNTA, a passive quadrature down-conversion mixer, and trans-impedance amplifiers (TIAs).
Sensors 2019, 19, 2420
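As a worked illustration of the quantity referenced by Equation (1), which is not reproduced in this text, the standard relation between the modulation index of binary FSK and the peak frequency deviation can be applied. The symbol R_b for the data rate is introduced here for illustration, and the 1 Mbps value is taken from the direct-modulation data rate stated above.

```latex
h = \frac{2 f_{dev}}{R_b}
\quad\Rightarrow\quad
f_{dev} = \frac{h \, R_b}{2} = \frac{0.5 \times 1\,\text{MHz}}{2} = 250\,\text{kHz}
```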

RX Front-End
Figure 2 shows the schematic of the RX RF front-end (RF-FE). The RF-FE is composed of an LNTA, passive mixers with a 25% duty-cycle generator, and TIAs. In general, when a 50% duty-cycle LO signal is used, two LNTAs are required because the LO signals overlap, which deteriorates the noise performance of the RF-FE [6]. To reduce power consumption, only a single-ended LNTA with a 25% duty-cycle LO is used in this paper instead of two LNTAs. A 25% duty-cycle LO increases the conversion gain by 3 dB, which lowers the noise contribution of the mixer compared to a 50% duty-cycle LO. Since there is no overlapped period between the LOs, one LNTA can drive the IQ passive mixers without performance degradation. Additionally, current-mode operation is adopted for high gain and low noise. The proposed inductor-less LNTA combines a self-biased inverter-type amplifier with separate biasing (V_B1). This architecture is more suitable for a low-power structure than a self-biasing technique alone because it achieves a high gain even though the MOSFETs are small [11].
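The 3-dB conversion-gain advantage of the 25% duty-cycle LO can be checked with a first-order model. The sketch below is an illustration under stated assumptions, not the authors' analysis: the LO is modeled as an ideal 0/1 rectangular window, and with a 50% duty cycle the I and Q switches are assumed to conduct at the same time so that the LNTA current splits equally between them. The function names are hypothetical.

```python
import math


def fundamental_coeff(duty: float) -> float:
    """Fundamental Fourier amplitude of an ideal 0/1 rectangular LO window."""
    return (2.0 / math.pi) * math.sin(math.pi * duty)


def path_gain(duty: float, current_share: float) -> float:
    """Current conversion gain of one mixer path (I or Q).

    current_share is the fraction of the LNTA current that reaches this path
    while its switch is on (assumed 0.5 when I and Q windows overlap).
    """
    return current_share * fundamental_coeff(duty)


# 50% duty cycle: I and Q windows overlap, so each path gets half the current.
gain_50 = path_gain(0.50, current_share=0.5)
# 25% duty cycle: windows do not overlap, so each path gets the full current.
gain_25 = path_gain(0.25, current_share=1.0)

advantage_db = 20.0 * math.log10(gain_25 / gain_50)
print(f"25% vs 50% duty-cycle LO: {advantage_db:.2f} dB")
```

Under these assumptions the ratio of the two gains is the square root of two, which corresponds to the 3 dB quoted in the text.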
The noise figure and gain matching of the LNTA are optimized by controlling V_B1. Impedance matching of the RX is realized by sharing the SPDT with the TX. Although the SPDT provides some isolation, the RX design should consider the impedance of the TX when it is turned off. A well-estimated TX impedance mitigates the performance degradation of the RF-FE and ensures that the RX has sufficient gain and a low noise figure in the desired RX band. In the BLE standard, the linearity of the mixers is a significant issue because of the presence of interferers [2]. Passive mixers are used in this work to allow operation at a low supply voltage. A current conveyor structure is used for the LNTA and passive mixer. C_M prevents the image current that flows when switches are turned on simultaneously by an overlap between the 25% duty-cycle LO signals.

Phase-Locked Loop (PLL) for RX LO Generator and TX Modulator
Figure 3 shows the block diagram of the proposed analog PLL. It is composed of a VCO, dividers (/2, /2-/3, and a fractional-N divider with SDM), a PFD, a CP, an internal third-order loop filter, the ABC, and the GFSK modulator. For a PLL based on one-point modulation, the bandwidth must be wider than the data rate in the TX mode [10]; therefore, a bandwidth of about 1 MHz or more is used. In the RX mode, however, a bandwidth of approximately 200 kHz is used to meet the bandwidth specification of the BPF (1 MHz ± 600 kHz).
The bandwidth of a PLL with a third-order loop filter can be written as Equation (2) [3], where ω_C is the bandwidth of the PLL, and K_VCO, I_CP, and N are the gain of the VCO, the current of the CP, and the division ratio of the divider, respectively. R_1, R_2, C_1, C_2, and C_3 are the resistors and capacitors of the third-order loop filter. In Equation (2), the bandwidth is determined by the parameters of the third-order loop filter. Table 1 shows the PLL loop parameters for the loop bandwidths of the TX and RX modes. After the parameters of the loop filter are determined, the target frequency and bandwidth of the PLL are set by the ABC. To obtain the desired bandwidth, the VCO frequency is calibrated to the target frequency and K_VCO is measured. I_CP can then be calculated by applying the measured K_VCO and the fixed N value to Equation (2). As can be seen from the table, the TX and RX bandwidths differ from each other. The PLL lock time is on the order of 4/bandwidth, which makes the RX PLL lock time about 20 µs [12]. Since the TX/RX switching time is 150 µs in the BLE specification [2], the RX PLL lock time satisfies the BLE specification with margin. The ABC block operates in two steps as follows.
Step 1: The metal-oxide-metal (MOM) capacitances of the VCO capacitor bank are controlled by VCO_CAP<9:0>. The optimum MOM capacitance is selected by the VCO Frequency Tuning Controller. After the frequency tuning is completed, the free-running frequency of the VCO is near the target channel frequency.
Step 2: Calculation of the bandwidth begins when the frequency tuning of the VCO is completed. K_VCO is defined as the change in VCO frequency with respect to a change in V_CTRL. Thus, V_CTRL is changed by changing VC<1:0>, and K_VCO is calculated as Equation (3).
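The lock-time estimate and the K_VCO measurement of Step 2 can be sketched numerically. This is a minimal illustration, not the ABC implementation: Equation (3) is not reproduced in this text, so K_VCO is taken here simply as the frequency difference divided by the control-voltage difference, following its definition above. The two frequency and voltage points are hypothetical values chosen only for the example.

```python
BLE_TURNAROUND_S = 150e-6  # TX/RX switching time quoted from the BLE specification


def lock_time_s(bandwidth_hz: float) -> float:
    """Rule-of-thumb lock time used in the text: about 4 / loop bandwidth."""
    return 4.0 / bandwidth_hz


def estimate_kvco(f_low_hz: float, f_high_hz: float,
                  v_low: float, v_high: float) -> float:
    """K_VCO in Hz/V from the VCO frequencies at two V_CTRL settings."""
    return (f_high_hz - f_low_hz) / (v_high - v_low)


rx_lock = lock_time_s(200e3)          # RX-mode bandwidth from the text
margin = BLE_TURNAROUND_S - rx_lock   # time left within the turnaround budget

# Hypothetical measurement: two V_CTRL points selected through VC<1:0>.
kvco = estimate_kvco(2.438e9, 2.442e9, v_low=0.4, v_high=0.6)

print(f"RX lock time: {rx_lock * 1e6:.1f} us, margin: {margin * 1e6:.1f} us")
print(f"K_VCO estimate: {kvco / 1e6:.1f} MHz/V")
```

With a measured K_VCO and the fixed N, the charge pump current would then follow from Equation (2), which is why the measurement precedes the bandwidth setting.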
When TX modulation is enabled and the frequency changes abruptly, the spurious emission mask cannot be met at the transmitter output because of harmonic tones. Depending on whether the TX data bit is '1' or '0', the modulation deviation value is added to or subtracted from the carrier frequency. By mapping and filtering the data before it changes the inputs to the DSM in the PLL, the levels of the spurious tones can be reduced.
The fractional-N divider is composed of a pulse-swallow counter with divide-by-4/5 and a third-order DSM. The output frequency of the proposed analog PLL is calculated as Equation (4).
The value of PRE_DIV is 3 for channel 15 and 2 for the other channels. The values of PC, SC, and MC are determined from the channel table of the PLL according to the channel value (CH). The value of F_DEV is 10,923 for channel 15 and 16,384 for the other channels. The modulator adds F_DEV to the value of MC when TX_DATA is '1' and subtracts it when TX_DATA is '0'.
Figure 5 shows the proposed VCO with a MOM capacitor bank. The MOM capacitors are stacked from Metal 3 to Metal 5 for high capacitance density and reduced die area [13].
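The mapping of TX data onto the modulation word can be sketched as follows. This is an illustrative model only: it covers the unfiltered add/subtract step, omits the Gaussian mapping and filtering, and uses a hypothetical MC value because the channel table is not given in this text. The function names are assumptions.

```python
F_DEV_CH15 = 10_923   # F_DEV for channel 15 (from the text)
F_DEV_OTHER = 16_384  # F_DEV for all other channels (from the text)


def pre_div(channel: int) -> int:
    """PRE_DIV is 3 for channel 15 and 2 otherwise."""
    return 3 if channel == 15 else 2


def f_dev_word(channel: int) -> int:
    """Deviation word added to or subtracted from MC."""
    return F_DEV_CH15 if channel == 15 else F_DEV_OTHER


def modulated_mc(mc: int, tx_bit: int, channel: int) -> int:
    """Add F_DEV to MC for a '1' bit and subtract it for a '0' bit."""
    dev = f_dev_word(channel)
    return mc + dev if tx_bit == 1 else mc - dev


HYPOTHETICAL_MC = 20_000  # placeholder, not a value from the channel table

print(modulated_mc(HYPOTHETICAL_MC, 1, channel=3))
print(modulated_mc(HYPOTHETICAL_MC, 0, channel=15))

# PRE_DIV * F_DEV is nearly the same for both cases (32,769 vs 32,768).
print(pre_div(15) * f_dev_word(15), pre_div(3) * f_dev_word(3))
```

The near-equal products of PRE_DIV and F_DEV suggest that the two F_DEV values are scaled to give the same frequency deviation for either pre-divider setting, although the text does not state this explicitly.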

Low IF Base-Band Analog
The low-IF RX requires a block for image rejection. The proposed receiver uses a pair of complex BPFs to reject the image band. Figure 6 shows the block diagram of the proposed low-IF BBA. It is composed of three VGA stages, two second-order BPFs, and three DCOCs [14]. The total dynamic range of the BBA is 88 dB. The gains of each VGA (VGA1, VGA2, VGA3) and each BPF (BPF1, BPF2) are 20 dB and 14 dB, respectively. A high-pass filter (HPF) is used for the DCOC. The gain of the BBA is changed according to the amplitude of the RF signal.
Figure 6. Block diagram of low IF baseband.
The baseband produces a constant output voltage for the ADC input range during AGC timing. The proposed BLE receiver uses the VGAs to achieve a constant baseband output. The gain of the VGAs is adjusted by using a resistor ratio. An AGC is proposed to control the gain of the BBA automatically during the preamble duration of eight symbols (8 µs) [2]. At 8 µs in the BLE specification, the preamble is very short. The settling time of the AGC is determined by the group delay of the BBA, which is dominated by the bandwidth of the DCOC (BW_DCOC). If BW_DCOC is made too wide during the Measure_AGC period to reduce the group delay, the MODEM obtains invalid gain information because the output of the BBA is attenuated too much. Conversely, if BW_DCOC is kept too narrow to preserve the attenuation characteristics, the MODEM also obtains invalid gain information because the output of the BBA cannot settle properly. Therefore, in this paper, BW_DCOC is controlled by the DCOC controller so that the bandwidth is wide enough during the preamble to operate the AGC loop.
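The 88 dB figure and the 8 µs preamble can be cross-checked with simple arithmetic. The sketch below assumes that each VGA and each BPF contributes its listed gain and that the stage gains add in dB; the 1 Msym/s symbol rate is an assumption consistent with the 1 Mbps data rate stated for the transmitter.

```python
VGA_STAGE_GAIN_DB = 20   # per VGA stage (from the text)
BPF_STAGE_GAIN_DB = 14   # per BPF stage (from the text)
NUM_VGA = 3
NUM_BPF = 2

PREAMBLE_SYMBOLS = 8
SYMBOL_RATE_HZ = 1e6     # assumed 1 Msym/s


def bba_total_gain_db() -> int:
    """Sum of the stage gains in dB for the cascaded BBA chain."""
    return NUM_VGA * VGA_STAGE_GAIN_DB + NUM_BPF * BPF_STAGE_GAIN_DB


total_gain_db = bba_total_gain_db()
preamble_s = PREAMBLE_SYMBOLS / SYMBOL_RATE_HZ

print(f"BBA total gain: {total_gain_db} dB")
print(f"Preamble duration: {preamble_s * 1e6:.1f} us")
```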
Figure 7 shows the timing diagram of the proposed DCOC controller according to the operation of the AGC. The AGC, driven by the MODEM, should control the gain of the BBA within 2 µs. Therefore, the gain of the BBA can be controlled a maximum of three times, since the preamble time is 8 µs.
First, during the ED interval in AGC_G1, an input level of −50 dBm is detected by the peak detector, and the initial gain is set by the MODEM. In AGC_G2, the AGC starts and the gain of the BBA is determined coarsely. If the BBA output cannot reach the desired level, the gain of the BBA is determined finely in AGC_G3. After AGC_G3, the gain value is fixed.
In the AGC operation, BW_DCOC should be changed by controlling R_DCOC<1:0> to guarantee completion of the AGC within 8 µs [2]. BW_DCOC is 2.5 MHz during 36 clocks of CK so that BB_OUT settles quickly, since the gain information of the BBA is not critical in the Change_AGC period. After 36 clocks, BW_DCOC is changed to 350 kHz, since the 3-dB bandwidth of the BBA is from 400 kHz to 1.6 MHz. When the AGC operation is finished, the AGC freeze signal (AGC_FR) becomes high, and BW_DCOC is changed to 100 kHz, which does not affect the bandwidth of the BBA. As shown in Figure 7, if BW_DCOC is fixed, the BBA gain cannot be determined accurately because the common-mode level of the BBA outputs does not settle properly. If the settling is not completed during the preamble period, data errors can occur.
As shown in Figure 8, a pseudo-differential structure is used for the VGAs to obtain a very high (ideally infinite) input impedance. The gain steps of VGA1, VGA2, and VGA3 are 4 dB, 2 dB, and 1 dB, respectively [15]. Each VGA uses a differential-to-single-ended two-stage amplifier. Because the input impedance of the VGA is high, it does not load the previous stage. The VGA is designed to have a wide dynamic range, and its gain is controlled digitally by the modem through a resistor bank.
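The DCOC bandwidth schedule described in the AGC operation can be expressed as a small selection function. This is a behavioral sketch under an assumption that is not stated explicitly in the text: the 36-clock wide-bandwidth window is taken to restart at every gain change (the Change_AGC period). The function and argument names are hypothetical.

```python
BW_FAST_HZ = 2.5e6      # during the first 36 CK clocks after a gain change
BW_MEASURE_HZ = 350e3   # while the gain is being measured
BW_FROZEN_HZ = 100e3    # after the AGC freeze signal (AGC_FR) goes high
FAST_WINDOW_CLOCKS = 36


def dcoc_bandwidth_hz(clocks_since_gain_change: int, agc_frozen: bool) -> float:
    """Select BW_DCOC from the AGC state, following the schedule in the text."""
    if agc_frozen:
        return BW_FROZEN_HZ
    if clocks_since_gain_change < FAST_WINDOW_CLOCKS:
        return BW_FAST_HZ
    return BW_MEASURE_HZ


for clocks, frozen in [(0, False), (35, False), (36, False), (100, True)]:
    print(clocks, frozen, dcoc_bandwidth_hz(clocks, frozen))
```

The 350 kHz setting sits just below the 400 kHz lower edge of the BBA passband, and the 100 kHz setting is far enough below it that the wanted signal is not affected once the gain is frozen.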
The gain range is from 0 dB to 60 dB. The gain (A_V) of the VGA is determined by the ratio of R_1 and R_var, as shown in Equation (5). Since the gain is set by the relative ratio of resistors, the gain error due to PVT variations is small.

\[ A_V = 1 + \frac{R_1}{R_{var}} \qquad (5) \]

Figure 9 shows the designed second-order Chebyshev complex BPF. To achieve complex operation, it uses the in-phase and quadrature signals. The BPF characteristic is obtained by shifting a low-pass filter (LPF) characteristic from DC to the IF using cross-coupled resistors [14]. To reduce the effect of process variation, the capacitor arrays are composed of capacitors and MOS switches. They control the bandwidth of the BPF, and their control signals (C_BPF) are determined by the FTC. The center frequency of the complex BPF is 1 MHz and the 3-dB bandwidth is 1.2 MHz. The image rejection ratio is 36 dB [15].
Figure 9. Block diagram of the second order complex BPF.
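The frequency shift that turns the LPF into a complex BPF can be written out as a short worked relation. The notation (H_BPF, H_LPF, ω_IF, R_C, C) is introduced here for illustration, and the last line assumes a conventional active-RC integrator realization, which the text does not describe in detail.

```latex
% Complex band-pass response as a frequency-shifted low-pass response
H_{BPF}(j\omega) = H_{LPF}\bigl(j(\omega - \omega_{IF})\bigr),
\qquad \omega_{IF} = 2\pi \times 1\,\text{MHz}

% Passband edges from the 1 MHz center frequency and 1.2 MHz bandwidth
f_{IF} \pm \frac{BW}{2} = 1\,\text{MHz} \pm 0.6\,\text{MHz}
\;\Rightarrow\; 0.4\,\text{MHz} \le f \le 1.6\,\text{MHz}

% The image at -f_{IF} is 2 MHz away from the passband center and is rejected
\lvert -f_{IF} - f_{IF} \rvert = 2\,\text{MHz} > \frac{BW}{2}

% Shift set by the cross-coupled resistor (assumed active-RC integrators)
\omega_{IF} = \frac{1}{R_C \, C}
```

The resulting passband of 0.4 MHz to 1.6 MHz matches the 3-dB bandwidth quoted for the BBA.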
Figure 10 shows a schematic of the FTC. It must compensate for the capacitance and resistance variations caused by process variation. It is composed of a current mirror and a capacitor array for generating the charging voltage (V_CH), a comparator, and a filter tuning controller. If the values of the resistors and capacitors change because of process variation, the charging time of V_CH changes. The filter tuning controller then compares the V_CH charging time with a reference charging time. In this paper, only the capacitor is tuned because the resistor variation is reflected in the I_REF variation. The resistance of R_REF is the same as the bandwidth resistor value of the BPF so that the same ratio of resistance variation is applied. This reduces the tuning time and die area.
Figure 11 shows the timing diagram of the FTC. When the CH_ON signal is high, V_CH is increased by charging the BPF CAP BANK replica. The output of the comparator (OUT_COMP) becomes high when V_CH is higher than V_REF. If OUT_COMP is low when COMP_CLK goes high, the value of C_BPF is decreased by the FTC and the calibration loop starts again. Conversely, if OUT_COMP is high when COMP_CLK goes high, the value of C_BPF is increased. In addition, if the OUT_COMP value changes from its value in the previous state, the calibration is finished to reduce the tuning time. After calibration, the determined value of C_BPF is applied to the capacitor arrays of the two second-order complex BPFs. The maximum tuning time is 150 µs for 32 cycles. When the tuning process is completed, the FTC is turned off to save power.
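The FTC calibration loop can be sketched behaviorally. This is a simplified model under stated assumptions, not the authors' circuit: the replica capacitor bank is modeled as a fixed capacitor plus a unit capacitor per code step, the charging time is C·V_REF/I_REF, and OUT_COMP is taken as high when V_CH reaches V_REF before the COMP_CLK edge. All component values and the 5-bit code width are hypothetical.

```python
from typing import Callable


def make_charge_time(process_scale: float) -> Callable[[int], float]:
    """Return charge time vs. C_BPF code for a given process capacitance scale."""
    c_fixed, c_unit = 1.0e-12, 0.1e-12   # hypothetical replica bank values (F)
    v_ref, i_ref = 0.5, 1.0e-6           # hypothetical reference values (V, A)

    def charge_time_s(code: int) -> float:
        cap = process_scale * (c_fixed + code * c_unit)
        return cap * v_ref / i_ref
    return charge_time_s


def ftc_calibrate(charge_time_s: Callable[[int], float], t_ref_s: float,
                  code: int = 16, code_max: int = 31,
                  max_cycles: int = 32) -> int:
    """Step C_BPF until the comparator decision flips or 32 cycles elapse."""
    previous = None
    for _ in range(max_cycles):
        out_comp = charge_time_s(code) <= t_ref_s  # high: V_CH reached V_REF in time
        if previous is not None and out_comp != previous:
            break                                  # decision flipped: calibration done
        # Low output: charging too slow, so decrease C_BPF; high: increase it.
        code = min(code + 1, code_max) if out_comp else max(code - 1, 0)
        previous = out_comp
    return code


T_REF_S = 1.02e-6  # hypothetical reference charge time
print(ftc_calibrate(make_charge_time(1.00), T_REF_S))  # nominal process
print(ftc_calibrate(make_charge_time(1.15), T_REF_S))  # capacitance 15% high
```

In this model a process corner with larger capacitance ends at a lower code, which is the compensation the text describes. Stopping at the first flip of OUT_COMP keeps the loop well inside the 32-cycle limit.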
Figure 11. Timing diagram of the RC filter tuning circuit.

Analog to a Digital Converter
Figure 12 shows the designed 6-bit fully differential SAR ADC structure. The modem requires an ADC resolution of 5 bits, but the SAR ADC was designed with a 1-bit margin. The input signals of the ADC are the differential pairs V_I, V_IB and V_Q, V_QB. Therefore, two parallel ADCs are used in the proposed receiver. Each ADC is composed of a comparator and two binary-weighted capacitor arrays. The SAR logic controls the switching sequence of these ADCs. The fully differential structure of the ADCs reduces the substrate and supply voltage noise and provides a good Common Mode Rejection Ratio (CMRR) [16].
The capacitor arrays of this ADC operate as sample-and-hold circuits and DACs. Significant power consumption of a SAR ADC may occur due to switching in the capacitor array. The switching sequence of the proposed structure is common-mode voltage (V_CM)-based and straightforward. Previous works have shown the V_CM-based straightforward ADC to be one of the most energy-efficient structures [17].
In V_CM-based switching, after sampling, one of the capacitors of the capacitor array switches in each cycle from V_CM to the reference voltage (V_REF) or to 0 V, according to the comparator decision. This switching voltage is half of that in conventional SAR ADC structures. Therefore, the switching power consumption is reduced significantly. The switching is straightforward, which means that only the next capacitor is switched; previously switched capacitors do not switch again until the current switching cycle finishes and the next switching cycle starts. This switching sequence minimizes the number of switching steps and reduces power consumption. As the voltages across the capacitors change only from V_CM to V_REF or from V_CM to 0 V, the charging and discharging times of the capacitors decrease, which is useful when the conversion speed of the ADC increases.
Figure 13 shows the dynamic latched comparator applied in the ADC. It is composed of a pre-amplifier and a dynamic latch to prevent kickback noise. The pre-amplifier has N-type and P-type differential input pairs for a rail-to-rail input range. The power efficiency of a conventional dynamic latched comparator is poor because static current flows even after the comparison operation. In this work, the APC logic blocks the static current consumption of the pre-amplifier after the pre-amplification and output decision, which improves the power efficiency of the ADC [17].
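The V_CM-based straightforward switching described above can be illustrated with a behavioral model of a 6-bit differential SAR conversion. This is a sketch under assumptions rather than the authors' design: the comparator is ideal, the capacitor array is perfectly binary weighted, V_CM equals V_REF/2, and the differential input range is −V_REF to +V_REF. Function and parameter names are hypothetical.

```python
def sar_convert(v_diff: float, v_ref: float = 1.0, n_bits: int = 6) -> int:
    """Behavioral V_CM-based SAR conversion of a differential input.

    The first decision is made directly on the sampled input. After each
    decision only the next capacitor pair is switched from V_CM to V_REF
    or 0 V, which shifts the differential residue by V_REF / 2**(k + 1).
    """
    residue = v_diff
    code = 0
    for k in range(n_bits):
        decision = 1 if residue >= 0.0 else 0
        code = (code << 1) | decision
        if k < n_bits - 1:
            step = v_ref / 2 ** (k + 1)
            # A switched capacitor is never revisited within this conversion.
            residue += -step if decision else step
    return code


for v in (-1.0, 0.0, 0.3, 0.999):
    print(v, sar_convert(v))
```

Each capacitor terminal moves only by V_CM, which is half of V_REF in this model, so the voltage swing per switching event is half that of a conventional array switched between 0 V and V_REF.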
Figure 14 shows the proposed PA with the SPDT. It consists of 16 PA unit cells with a Ramping Controller, and the SPDT. The proposed PA is implemented as a class-D type instead of a class-E type in consideration of the breakdown voltage of the device and its efficiency. When the PA is enabled or disabled abruptly for data transmission, undesired harmonic tones can be generated. Since these undesired harmonic tones can degrade the spurious spectral emission characteristics of the TX [8], the proposed PA is divided into 16 PA unit cells, and the Ramping Controller digitally ramps the output power when the PA is enabled or disabled. The SPDT is designed and integrated so that a single antenna can be connected to either the TX path or the RX path. The body-floating technique is adopted to reduce insertion loss and improve isolation [18].

DC-DC Buck Converter and LDO
Figure 15 shows a block diagram of the proposed triple-mode DC-DC buck converter. It is composed of a bandgap reference (BGR), power MOSFETs, a self-calibration negative current detector (SC-NCD), and the triple-mode (PWM, PFM, and retention) controller. Each mode has different characteristics to cover a wide load current range. The PWM mode controller is designed to operate in the active state and provides good regulation of V_OUT with low output ripple. The retention and PFM mode controllers are designed to reduce switching losses and internal current consumption, since they operate in the sleep or stand-by state [19]. The PWM mode operates at load currents above 2.5 mA, and the PFM mode operates at load currents between 0.5 mA and 2.5 mA. The retention mode is enabled to improve the efficiency when the load current is below 0.5 mA.
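The load-dependent mode selection of the triple-mode buck converter can be summarized as a threshold function. This is a minimal sketch that uses the 0.5 mA and 2.5 mA boundaries quoted in the text; the handling of the exact boundary values and the absence of hysteresis are assumptions, and the function name is hypothetical.

```python
PFM_THRESHOLD_A = 0.5e-3   # below this load current: retention mode
PWM_THRESHOLD_A = 2.5e-3   # above this load current: PWM mode


def select_buck_mode(load_current_a: float) -> str:
    """Pick the converter mode from the load current (no hysteresis modeled)."""
    if load_current_a > PWM_THRESHOLD_A:
        return "PWM"
    if load_current_a >= PFM_THRESHOLD_A:
        return "PFM"
    return "RETENTION"


for load in (0.1e-3, 1.0e-3, 5.0e-3):
    print(f"{load * 1e3:.1f} mA -> {select_buck_mode(load)}")
```

A practical controller would usually add hysteresis around each threshold to avoid chattering between modes, which is outside what the text describes.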
The DC-DC buck converter generates the output voltage of 1.2 V from the supply voltage of 1.5 V to 3.6 V. The operation is defined by the connection of the external 1 µH inductor (L 1 ) and the 1 µF capacitor (C 1 ). To obtain high conversion efficiency in a discontinuous conduction mode (DCM), the SC-NCD adjusting the NMOS switch (M N ) off-time is proposed. By controlling M N off-time, both diode conduction losses in power MOSFET are minimized effectively. The 1.2 V output of the DC-DC Buck converter supplies a BGR and four on-chip LDOs composed of three low-noise LDO with a capacitor and a capacitor-less LDO, each making a 1 V core supply voltage for the different blocks of the transceiver [20].  Figure 16 shows the block diagram of the proposed SC-NCD. It is a fully digital controller composed of UP/DN counter, duty controller, and D-FF. It is smaller than current consumption of conventional NCD. The proposed SC-NCD method does not use the comparator when it detects the point that the inductor current is 0 A. Therefore, the SC-NCD can improve efficiency by reducing control loss account for a large proportion of the DCM dc-dc converter in light load current conditions using the NCD for digital methods, and it is possible to prevent efficiency reduction due to NCD timing errors caused by the analog comparator offset. Figure 17a and 17b show timing diagram of operation of NCD. VX signal is sampled to digital bits by D-FF. The MN is turned off after 3 ns and 12 ns. The sampled bits, S1 and S2, feeds to logic including XOR, NOR, and the AND gate and output signal of that is feed to UP/DN counter to control MN off-time digitally. In the case where DRVN is under-duty, sampled bits, S1 and S2, are all high and converting the UP signal zero to high. If the DRVN signal is over duty, sampled bits are all-zero and converting the DN signal zero to high. Lastly, the STAY signal is set to high and duty of CKN is locked.  
Figure 18 shows a block diagram of the low-noise LDO with a fast settling technique (FST).
The low-noise LDO helps minimize the VCO phase noise and reduce the impact of VCO pushing [21]. It is composed of an LPF, a BGR, an LDO, and an NMOS for the FST. The LPF is used to reduce the output noise of the BGR.
The FST compensates for the settling time, which is slowed by the LPF of the BGR. When ENLDO is low in the initial state, the FST is enabled. MF1 and MF2 are turned on by SETFAST, so RLPF is shorted to achieve fast settling of VREF. The LDO output capacitor (CLDO) is initially charged using the bypass mode of the LDO during the fast settling time. Figure 19 shows the timing diagram of the FST. When the BGR is enabled at T1, RLPF is bypassed by turning on the switching MOSFET (MF). Therefore, the reference voltage of the LDO is applied without delay, CLDO is charged in the bypass mode, and the output settles rapidly. When the LDO is enabled at T2, the output voltage of the BGR is applied to the LDO through the LPF by turning off MF, and the noise of the LDO is reduced. The output voltage of the LDO changes from VDD (1.2 V) to 1 V.

Figure 20 shows a micro-photo of the proposed transceiver for the BLE application, designed in a 1P6M 55-nm CMOS process.
The die size of the transceiver is 1.44 × 1.14 mm² and that of the DC-DC buck converter is 0.45 × 0.33 mm², including the ESD protection pads. The S11 of the RX including the SPDT switch is below −10 dB across the BLE channels, as shown in Figure 23.

Figure 29a,b show the spurious emission mask of the TX and the demodulated TX frequency for the BLE packet. The BLE specification requires spurious emissions at offsets of 1 to 1.5 MHz from the carrier frequency to be at least 20 dB lower than the main power, and the adjacent channel power for channels more than 2 MHz from the carrier frequency should not exceed −20 dBm [2]. The TX output spectrum satisfies the reference emission mask in the specification. The average measured modulation deviation corresponding to the alternating "10" data pattern, which creates the most inter-symbol interference (ISI), is 207 kHz.
It guarantees a margin of 22 kHz above the specification of 185 kHz, as shown in Figure 29b [2].

Figure 30a,b show measured waveforms of the triple-mode DC-DC converter depending on load current in the PWM/PFM and retention modes, respectively. In the PFM mode, the switching frequency is 1 MHz and the duty is 45 ns at a load current of 2 mA. When the load current exceeds 2.5 mA, the operation mode of the DC-DC converter is automatically changed to the PWM mode. In the PWM mode, the switching frequency is 2.5 MHz and the duty is 120 ns at a load current of 5 mA. When the load current of the DC-DC converter is below 0.25 mA, the operation mode is automatically changed to the retention mode. The output is 1.2 V with a ripple of 20 mV in the retention mode. The output ripple has a frequency component of 32 Hz, which is the same as the switching frequency, and the switch turns on for 1/1024 of every cycle.

Figure 31 shows the efficiency of the DC-DC buck converter in the triple mode [20]. The DC-DC buck converter can operate at load currents from 1 µA to 10 mA and has a maximum efficiency of more than 89%. The mode is changed from the PFM mode to the PWM mode at 2.5 mA because the efficiency of the PWM mode is higher than that of the PFM mode at 2.5 mA. Figure 32a,b show the simulated output noise of the BGR and LDO, respectively.
The output noise of the BGR and LDO is 5.9 nV/√Hz and 23 nV/√Hz at a frequency of 100 kHz, respectively.

As shown in Figure 19, the reference voltage of the LDO is applied without delay, and the output capacitor of the LDO is charged in the bypass mode, so the output settles quickly. Therefore, the settling time of the BGR and LDO with the FST is 29 µs, which is 10 times faster than without it, as shown in Figure 34b.
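The effect of shorting the LPF resistor can be illustrated with a first-order RC estimate. The sketch below is an assumption-laden approximation, not the measured behavior: it treats the reference path as a single-pole low-pass filter, the function name `rc_settling_time` and the 1% tolerance are hypothetical choices, and the text gives no component values, so only the scaling with resistance and the figures quoted above (29 µs and a factor of 10) are used.

```python
import math


def rc_settling_time(r_ohm: float, c_farad: float, tolerance: float = 0.01) -> float:
    """Time for a first-order RC low-pass to settle within `tolerance`
    (as a fraction of the step) of its final value: t = R * C * ln(1 / tolerance)."""
    if r_ohm <= 0 or c_farad <= 0:
        raise ValueError("R and C must be positive")
    if not 0 < tolerance < 1:
        raise ValueError("tolerance must be between 0 and 1")
    return r_ohm * c_farad * math.log(1.0 / tolerance)


# Figures quoted in the text.
T_WITH_FST_S = 29e-6      # settling time of BGR and LDO with the FST
SPEEDUP = 10              # "10 times faster than without it"

# Implied settling time without the FST (about 290 us).
t_without_fst_s = T_WITH_FST_S * SPEEDUP

if __name__ == "__main__":
    print(f"implied settling time without FST: {t_without_fst_s * 1e6:.0f} us")
    # For a fixed capacitance, settling time scales linearly with resistance,
    # so reducing the effective resistance by 10x shortens settling by 10x.
    # The component values below are placeholders chosen only to show the ratio.
    ratio = rc_settling_time(1e6, 1e-10) / rc_settling_time(1e5, 1e-10)
    print(f"settling-time ratio for a 10x resistance reduction: {ratio:.1f}")
```

In the actual circuit the settling also depends on how quickly CLDO is charged in the bypass mode, so the single-pole model should be read as a rough guide to why bypassing RLPF helps.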
Table 2 shows the performance comparison with recent BLE transceivers. This work is implemented in a 55-nm CMOS process and includes the TRX switch and the DC-DC buck converter. The die area and power consumption of this work are smaller than those of References [3,4]. The power consumption is smaller than that of other works achieving similar RX performance (noise figure and sensitivity) and TX output power, even though the die area is larger than in Reference [21].

Conclusions
This paper presents a low-power GFSK TRX with an integrated SPDT switch and a high-efficiency power management unit for BLE applications. The RX combines an RF front-end with an inductor-less LNTA and 25% duty-cycle current-driven passive mixers, which reduces power consumption and area, with a low-IF analog baseband with a complex BPF, which improves the image rejection ratio. In the analog PLL, a low-power VCO is designed, and automatic bandwidth calibration (ABC) is proposed to optimize bandwidth, settling time, and phase noise by adjusting the charge pump current, VCO gain, and resistor and capacitor values of the loop filter. The current consumption of the ADC is reduced by adopting a straightforward architecture. GFSK modulation is implemented so that the proposed low-power transceiver can operate at a data rate of 1 Mbps. The DC-DC buck converter improves overall efficiency by automatically selecting the optimum mode among three modes (PWM, PFM, and retention) depending on the load current. The low-noise LDO is designed to improve the receiver sensitivity and the phase noise of the VCO. The transceiver is implemented using 1P6M 55-nm CMOS technology and the die area is 1.79 mm². The power consumption of the receiver and transmitter is 5 mW and 6 mW from a supply voltage of 3 V, respectively. The noise figure of the receiver is at most 6.5 dB across the channel frequencies. The measured sensitivity of the RX is −95 dBm at 2.44 GHz. The measured phase noise of the PLL is −87.1 and −112.2 dBc/Hz at 100 kHz and 1 MHz offset from 2.44 GHz in the receiver mode, respectively. The efficiency of the DC-DC buck converter is over 89% when the load current is higher than 2.5 mA in the PWM mode. The quiescent current consumption of the TRX is 400 nA from a supply voltage of 3 V in the retention mode.