A Pipeline-TDC-Based CMOS Temperature Sensor with a 48 fJ·K² Resolution FoM
Electronics 2021, 10, 1542

Abstract: An energy-efficient temperature sensor is important for temperature monitoring in biomedical Internet-of-Things (BIoT) applications. This article presents a time-domain temperature sensor with a pipeline time-to-digital converter (TDC). A programmable-gain time amplifier (PGTA) with high linearity and a wide linear range is proposed to improve the resolution of the sensor and to reduce the chip area. The conversion time of the sensor is reduced by the fast TDC, which needs only ~26 ns/conversion, so the sensor is suitable for BIoT applications that commonly use a duty-cycling mode. Fabricated in a 40 nm standard CMOS technology, the sensor consumes 7.6 µA at a 0.6 V supply and achieves a resolution of 90 mK and a sensitivity of 0.62%/°C in a 1.3 µs conversion time. This translates into a resolution figure-of-merit of 48 fJ·K². The sensor achieves an inaccuracy of 0.39 °C from −20 °C to 80 °C after two-point calibration. Duty cycling the sensor results in an even lower average power: ~18.6 nW at 10 conversions/s.


Introduction
A smart temperature sensor is one of the most commonly required components in Internet of Things (IoT) devices, where it monitors either the environment or the chip condition [1]. Temperature sensors that combine high energy efficiency with low cost can be used in many BIoT applications, for example, the preservation and transport of vaccines, medicines, blood samples, and other medical samples that must be kept within a certain temperature range. Duty cycling is a commonly used mode in a BIoT system: the sensors switch between the on and off states to extend the battery life. As shown in Figure 1a, an always-on temperature sensor consumes 4.6 µW, alongside other always-on modules that consume 300 nW. The processor and radio frequency module are activated to process and transmit data, which takes 1 ms at a 100 µW normalized power consumption. The temperature sensor then accounts for 92% of the total power consumption of 5 µW, which is unacceptable for most BIoT applications. When the temperature sensor works in a duty-cycling mode, it is activated every 100 ms to take a measurement that takes only 1.3 µs, with a 31 nA leakage current between measurements. The temperature sensor then accounts for only 0.02% of the total power consumption, which is reduced to 400 nW. Thus, the conversion time of a temperature sensor should be as short as possible in order to reduce its effective power contribution to a BIoT system. According to
$$P = C F V^{2}, \quad (1)$$
where C is the equivalent capacitance, F is the frequency, and V is the supply voltage, reducing the supply voltage is beneficial to further reduce the power consumption. It means that a temperature sensor that works at a near-threshold-voltage supply has an advantage over one that works at a normal-voltage supply.
In addition, a temperature sensor should be compact and compatible with the CMOS process to reduce the cost.
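The headline figures quoted in the abstract and in the duty-cycling discussion above can be cross-checked with a few lines of arithmetic. The sketch below is not taken from the paper: it assumes the resolution FoM is defined as energy per conversion multiplied by the square of the resolution, and it assumes the leakage current flows from the same 0.6 V supply. All variable names are illustrative.

```python
# Figures quoted in the text (supply, currents, timing, resolution).
SUPPLY_V = 0.6
ACTIVE_CURRENT_A = 7.6e-6
LEAKAGE_CURRENT_A = 31e-9
CONVERSION_TIME_S = 1.3e-6
RESOLUTION_K = 0.09
CONVERSIONS_PER_S = 10.0

# Power while a conversion is running.
active_power_w = SUPPLY_V * ACTIVE_CURRENT_A

# Energy spent on one conversion.
energy_per_conversion_j = active_power_w * CONVERSION_TIME_S

# Assumed FoM definition: energy per conversion times resolution squared.
resolution_fom_j_k2 = energy_per_conversion_j * RESOLUTION_K ** 2

# Duty-cycled operation: the active power is weighted by the on-time fraction,
# while leakage is assumed to flow essentially all the time.
duty_cycle = CONVERSION_TIME_S * CONVERSIONS_PER_S
average_active_power_w = active_power_w * duty_cycle
leakage_power_w = SUPPLY_V * LEAKAGE_CURRENT_A

print(f"active power          : {active_power_w * 1e6:.2f} uW")
print(f"energy per conversion : {energy_per_conversion_j * 1e12:.2f} pJ")
print(f"resolution FoM        : {resolution_fom_j_k2 * 1e15:.1f} fJ*K^2")
print(f"average active power  : {average_active_power_w * 1e12:.1f} pW")
print(f"leakage power         : {leakage_power_w * 1e9:.1f} nW")
```

Under these assumptions the FoM comes out at about 48 fJ·K², and the duty-cycled average is dominated by the leakage term of about 18.6 nW, which agrees with the numbers in the abstract.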
Previous research has proposed various kinds of temperature sensors based on different principles. A traditional voltage-domain temperature sensor used BJTs to generate proportional-to-absolute-temperature (PTAT) voltages and then converted the voltages into a digital temperature reading with a ΔΣ analog-to-digital converter (ADC) [2]. Implementing a high-resolution ΔΣ-ADC is both energy- and area-consuming, and the higher working voltage (>1 V) required by the BJT is not suitable for BIoT applications. In [3], a resistive-type temperature sensor achieved a high sensitivity of up to 1.09%/°C, but it is not easy to integrate, so it is not suitable for BIoT applications. An alternative method is to use frequency or phase information to express temperature information. In [4], researchers used two voltage-controlled oscillators (VCOs) with different temperature-sensitive characteristics to convert the temperature change into a frequency change, and thus into a temperature reading by a counter. With two oscillators operating at tens of MHz, they achieved a short conversion time of 6.5~22 µs, but at a high power consumption of about 154 µW. To solve the problem of power consumption, two MOSFETs operating in the sub-threshold region were used to sense the temperature variation and generate two reference currents. The ratio of the currents was then transformed into an output frequency difference between two VCOs working at tens of kHz. A counter counted the frequency difference and output a temperature reading. However, this low-frequency method resulted in a longer conversion time of 59 ms [5]. Resistance-based temperature sensors are usually implemented with RC filters [6], which output a temperature-related phase signal. This kind of temperature sensor can achieve high resolution, but it is not well suited to BIoT applications due to its high power consumption and long conversion time.
In addition, researchers have also expressed temperature information using time-domain signals. In [7], a delay line was designed to generate a temperature-dependent time signal, and a cyclic time-to-digital converter (TDC) output the temperature reading. In [8], a delay line was reused to sense the temperature in a sensing module and to measure the PTAT pulse in a TDC. Time-domain temperature sensors achieve lower power consumption and smaller area.
In this paper, we propose a MOSFET-based time-domain temperature sensor, which is suitable for BIoT applications due to its short conversion time and high energy efficiency. The architecture of this work is shown in Figure 2. The sensing module uses a delay line to convert the temperature information into a delay time. An offset compensator generates a delay signal with low temperature sensitivity, which compensates the offset part of the sensing module output. The output delay difference between the sensing module and the offset compensator is then translated into a corresponding digital code by a pipeline TDC. A programmable-gain time amplifier (PGTA) is proposed to achieve an integer time gain with low sensitivity to temperature variation. The PGTA has high linearity, a wide linear range, and a programmable time gain. Implemented in a 40 nm standard CMOS technology, the prototype sensor consumes 7.6 µA at a 0.6 V supply and achieves a 90 mK resolution from −20 °C to 80 °C at a short conversion time of 1.3 µs.
The remainder of this paper is organized as follows. Section 2 describes the operating principles and the theoretical analysis. Section 3 presents the circuit implementations with detailed analysis. Section 4 shows the measurement results of this work, and Section 5 presents the conclusions.

Principles of Operation and Theory Analysis
For time-domain temperature sensors, the most common method of temperature sensing is to use one or more delay lines to convert the temperature information into a time-domain signal. It is therefore necessary to analyze the temperature characteristics of the delay unit used in the sensing module and the TDC. According to [9], the transition of an inverter output from high to low and from low to high is equivalent to the load capacitor discharging and charging through the equivalent resistance of the MOSFET. If half of the supply voltage is taken as the reference point of the output voltage, the propagation delay can be expressed as
$$t_{PHL} = \ln 2 \cdot R_N C_L, \qquad t_{PLH} = \ln 2 \cdot R_P C_L, \quad (2)$$
where R_P and R_N are the equivalent resistances of the PMOS and NMOS, respectively, in the ON state, and C_L is the load capacitance of the inverter, which has a low temperature sensitivity [7,8]. The temperature characteristic of the delay time therefore depends on R_P and R_N, which are given by Equations (3) and (4) in terms of V_DD, V_DS, I_DS, and λ: the supply voltage, drain-source voltage, drain-source current, and channel-length modulation coefficient, respectively. Because the MOSFETs operate in the triode region while the output voltage moves from its initial value to V_DD/2, and ignoring other non-ideal factors, the drain current is given by Equation (5) in terms of µ, the carrier mobility; C_OX, the gate-oxide capacitance per unit area; W and L, the channel width and length of the MOSFET; and V_TH, the threshold voltage. According to the analysis in References [10,11], the temperature dependence of µ and V_TH is given by Equations (6) and (7), where T_0 is the reference temperature, µ_0 is the carrier mobility at the reference temperature, V_TH0 is the threshold voltage at the reference temperature, and T is the temperature. As the temperature rises, the mobility and the threshold voltage both decrease. The change of the drain current with temperature, Equation (8), is obtained by differentiating Equation (5) with respect to temperature.
Since dµ/dT < 0 and dV_TH/dT < 0, the temperature characteristic of the drain-source current is determined by V_DD − V_TH. Owing to the reverse (anti-) short-channel effect of the MOSFET, the threshold voltage decreases as the channel length increases. Therefore, when the channel is long, V_TH is much smaller than V_DD. In this case, the temperature characteristic of the drain current is dominated by the carrier mobility, that is, the thermal coefficient of the drain current is negative. Conversely, a shorter channel length makes the thermal coefficient of the drain current positive.
Substituting Equations (3)-(5) into Equation (2) gives the delay time of the inverter, Equation (9). The delay time of a delay unit composed of two inverters is given by Equation (10) and is inversely proportional to the drain current. According to Equation (8), by changing the channel length of the MOSFETs, the delay unit can generate a delay time that is positively or negatively correlated with temperature.
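The sign change of the drain-current temperature coefficient can be illustrated numerically. The sketch below does not use the paper's equations: it assumes a simplified current model I ∝ µ(T)·(V_DD − V_TH(T)), a power-law mobility µ ∝ (T/T_0)^−1.5, and a threshold voltage that falls linearly at 1 mV/K. These model forms, the parameter values, and the function names are all hypothetical and chosen only to show the mechanism.

```python
def drain_current(temp_k, v_dd=0.6, v_th0=0.3, t0=300.0,
                  mobility_exp=1.5, vth_slope=1e-3):
    """Normalized drain current I ~ mu(T) * (V_DD - V_TH(T)) (assumed model)."""
    mobility = (temp_k / t0) ** (-mobility_exp)   # mu / mu_0, falls with T
    v_th = v_th0 - vth_slope * (temp_k - t0)      # V_TH falls with T
    return mobility * (v_dd - v_th)


def relative_tc(temp_k, dt=0.01, **kwargs):
    """Relative temperature coefficient (1/I) dI/dT by central difference."""
    hi = drain_current(temp_k + dt, **kwargs)
    lo = drain_current(temp_k - dt, **kwargs)
    return (hi - lo) / (2.0 * dt) / drain_current(temp_k, **kwargs)


# Lower V_TH (long channel in the text): mobility dominates, current falls with T.
tc_low_vth = relative_tc(300.0, v_th0=0.3)
# Higher V_TH (short channel in the text): threshold dominates, current rises with T.
tc_high_vth = relative_tc(300.0, v_th0=0.5)

print(f"V_TH0 = 0.3 V: (1/I) dI/dT = {tc_low_vth * 100:+.3f} %/K")
print(f"V_TH0 = 0.5 V: (1/I) dI/dT = {tc_high_vth * 100:+.3f} %/K")
```

Because the delay is inversely proportional to the drain current, a negative current coefficient corresponds to a delay that grows with temperature, and a positive one to a delay that shrinks.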
Over the measured temperature range, the propagation delay of the sensing module contains an offset component. If this signal were quantized directly by a TDC, it would greatly increase the number of delay units in the TDC and lead to additional area and a longer conversion time, which would increase the effective power and the area of the sensor and would not suit BIoT applications. To solve this problem, an offset compensator generates a delay signal with low temperature sensitivity to compensate the offset part; its structure is shown in Figure 3.
To reduce the power consumption of the bias circuit in the offset compensator, the bias circuit works in the sub-threshold region. The drain current of a MOSFET operating in the sub-threshold region is given by Equation (11), where n is the sub-threshold slope factor, which is larger than one under normal circumstances, and V_T is the thermal voltage, whose relationship with temperature is
$$V_T = \frac{kT}{q}, \quad (12)$$
where k is the Boltzmann constant and q is the elementary charge. When V_DS ≥ 4V_T, Equation (11) can be simplified to Equation (13). As shown in Figure 3, the offset compensator consists of four MOSFETs and delay units. M_B1, M_B2, and M_B3 work in the sub-threshold region, and M_B2 and M_B3 have the same size. I_Bias and V_Bias follow from Equation (13). Within the same CMOS process, the temperature coefficients of V_TH1 and V_TH2 are approximately the same, so the temperature characteristic of V_Bias is determined by V_T. By sizing M_B1 and M_B2, a V_Bias that is proportional to temperature is generated. For the delay units in the offset compensator, choosing MOSFETs with a longer channel length increases the propagation time. According to Equation (8), this means that the current of the delay units in the offset compensator decreases as the temperature rises. Under the control of V_Bias, the drain current of M_B4 is positively correlated with temperature, which reduces the temperature sensitivity of the propagation delay of the offset compensator.
Note that the linearity of the offset compensator is more important than its temperature sensitivity, because the sensitivity is not required to be exactly zero. In this work, the slope of the delay-temperature curve is 0.141 ns/°C in the sensing module and 0.059 ns/°C in the offset compensator, which translates to a 0.081 ns/°C slope for the temperature sensor.
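To give a sense of scale for the V_DS ≥ 4V_T condition, the thermal voltage can be evaluated at the two ends of the sensor's temperature range. This is a minimal sketch using the standard relation V_T = kT/q with the SI values of the constants; the function name is illustrative.

```python
BOLTZMANN_J_PER_K = 1.380649e-23
ELEMENTARY_CHARGE_C = 1.602176634e-19


def thermal_voltage(temp_c):
    """Thermal voltage V_T = kT/q in volts for a temperature in degrees Celsius."""
    return BOLTZMANN_J_PER_K * (temp_c + 273.15) / ELEMENTARY_CHARGE_C


for temp_c in (-20.0, 80.0):
    v_t = thermal_voltage(temp_c)
    print(f"{temp_c:+.0f} C: V_T = {v_t * 1e3:.2f} mV, 4*V_T = {4 * v_t * 1e3:.1f} mV")
```

V_T rises from roughly 22 mV at −20 °C to roughly 30 mV at 80 °C, so the 4V_T threshold stays near or below about 0.12 V across the range.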

Sensing Module
The proposed temperature sensor uses a delay line as the sensing module. The structure of the delay unit in the sensing module is shown in Figure 4. After the CLK signal passes through the sensing module, a temperature-dependent delay is generated. For a sensing module with N delay units, the propagation delay is N times the delay of a single delay unit. As mentioned above, the temperature characteristics of the propagation delay depend on the carrier mobility and the threshold voltage. According to Equations (9) and (10), achieving a higher resolution requires a larger variation of the propagation delay with temperature, so it is necessary to increase the channel length of the MOSFETs. To meet the various demands of BIoT applications, such as moist heat sterilization and the preservation and transport of biomedical samples, the temperature range of the sensor is designed to be −20 °C to 80 °C. The sensing module is designed to generate a 60~73 ns propagation delay over the temperature range of −20~80 °C when L = 11 µm. The propagation delay is approximately proportional to the temperature. The delay difference between the sensing module and the offset compensator is 13~22 ns over the same range. Figure 5 shows the response of the temperature sensor, (t'_sensor − t_−20)/t_−20, as the temperature varies from −20 °C to 80 °C, where t'_sensor is the propagation delay of the sensing module after compensation and t_−20 is its value at −20 °C. Figure 5 shows that the compensated delay changes by around 62% over this range, which corresponds to a sensitivity of 0.62%/°C.
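The quoted sensitivity can be cross-checked from two numbers given in the text: the 0.081 ns/°C slope of the compensated delay and the roughly 13 ns compensated delay at −20 °C. The sketch below assumes the sensitivity is simply the slope normalized to the delay at −20 °C; variable names are illustrative.

```python
slope_ns_per_c = 0.081          # compensated delay slope quoted in the text
delay_at_minus20_ns = 13.0      # compensated delay at -20 C quoted in the text
temp_span_c = 80.0 - (-20.0)    # designed temperature range

# Relative sensitivity: slope normalized to the reference delay, in %/C.
sensitivity_pct_per_c = slope_ns_per_c / delay_at_minus20_ns * 100.0

# Total relative change of the compensated delay across the range, in %.
total_change_pct = sensitivity_pct_per_c * temp_span_c

print(f"sensitivity  : {sensitivity_pct_per_c:.2f} %/C")
print(f"total change : {total_change_pct:.1f} %")
```

The result is consistent with the stated 0.62%/°C and the roughly 62% total change; the rounded 13~22 ns end points alone would give a slightly larger figure.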

Pipeline TDC
After a CLK signal propagates through the sensing module and the compensator, a delay difference proportional to temperature is generated. A TDC is the most convenient way to quantize this delay difference. Taking into account the area and resolution requirements of BIoT applications, a pipeline structure is used in this work. The structure of the proposed TDC is shown in Figure 6; it is composed of a coarse TDC (CTDC), a PGTA, a fine TDC (FTDC), and an encoder.

CTDC
As shown in Figure 6, the CTDC is based on a traditional delay-line structure. The structure of the delay unit in the CTDC is shown in Figure 7a. To avoid introducing additional errors, the temperature sensitivity of a delay unit in the CTDC should be as low as possible. According to Equation (8), by carefully choosing the channel length of the MOSFETs to adjust V_TH, which decreases as the channel length increases, the propagation delay of a delay unit in the CTDC can maintain a low temperature sensitivity.
The simulated delay-unit variation in the CTDC at different process corners is shown in Figure 8. The resolution of the CTDC is 100.6~101.5 ps at the NMOS-typical, PMOS-typical (TT) corner when L = 170 nm. The worst case occurs at the NMOS-slow, PMOS-slow (SS) corner, which shows a delay variation of less than 3.8 ps over the temperature range of −20~80 °C. Compared with the resolution of 117~121.2 ps at the SS corner, the variation is about 2.54%. The effect of this variation on the quantization result is negligible.
The structure of the arbiter is shown in Figure 7b. Note that the OR gate output V_G exhibits only a narrow low-level pulse when the phase difference between Start and Stop is approximately π. This leaves M_A5 and M_A6 insufficient time to charge their drain terminals before the next detection. Therefore, a certain margin is required to guarantee the proper functioning of the arbiter circuit. However, this limitation has essentially no effect in this work. The time difference between Start and Stop is less than 22 ns, which implies that a clock frequency lower than 23 MHz should be used in the circuit. In this work, a real-time clock (32.768 kHz) is used as the input clock (CLK in Figure 2).
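The 23 MHz bound can be reproduced with a one-line calculation, assuming that a phase difference of π corresponds to half a clock period and that the largest Start-to-Stop time difference is the 22 ns quoted above. This reading of the constraint is an interpretation rather than a formula stated in the text.

```python
max_time_difference_s = 22e-9     # largest Start-to-Stop difference in the text
rtc_frequency_hz = 32.768e3       # real-time clock used as CLK

# A phase difference of pi equals half a period, so the half period must
# exceed the largest time difference: f < 1 / (2 * t_max).
max_clock_hz = 1.0 / (2.0 * max_time_difference_s)

# How far the real-time clock sits below that bound.
margin_ratio = max_clock_hz / rtc_frequency_hz

print(f"upper bound on clock : {max_clock_hz / 1e6:.1f} MHz")
print(f"margin at 32.768 kHz : about {margin_ratio:.0f}x")
```

The bound evaluates to about 22.7 MHz, which rounds to the 23 MHz given in the text, and the 32.768 kHz clock is several hundred times slower than that.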

PGTA
A time amplifier (TA) is a key module in a pipeline TDC: it amplifies the residue generated by the CTDC quantization and determines the measurement accuracy of the pipeline TDC. The SR-latch-based TA [12] and the cross-coupled structure [13] have a very narrow linear range due to the metastable operating region and the limited discharging time, respectively. A closed-loop TA [14] has a wider input linear range than the SR-latch structure, but its gain is not programmable. A promising TA [15] achieved time amplification by duplicating the residue in an OR gate, providing a wide input linear range and a programmable gain simultaneously. However, long delay lines were used in that TA to shift the residues; because the delay lines are temperature dependent, adjacent residues could overlap, which reduced the accuracy of the amplification. In addition, the long delay lines lengthened the conversion time.
In the existing works [12-14], the TAs amplified the residue in the stage where Stop catches up with Start; this position is defined as stage i, as shown in Figure 4. In fact, the residue exists in every stage of the delay line in the CTDC. For instance, the time interval between Start[i+1] and Stop is (Δt_CTDC − ε), where ε is the residue and Δt_CTDC is the delay time of a delay unit in the CTDC. In the proposed PGTA, time amplification is achieved by extracting the interval M times in M different stages, so the residue is amplified M times. As shown in Figure 9, the PGTA is composed of a delay line, a digital controller, an OR gate, and two groups of XOR gates. According to the PGTA gain M (M ≤ 8 in this work) and i, the signals from TI[i+10] to TI[i+9+M] are selected by the digital controller. The XOR gates in Group B extract pulse signals whose widths are integer multiples of Δt_CTDC, from 10 × Δt_CTDC to (9+M) × Δt_CTDC. The pulses TI selected by the digital controller and the outputs of the XOR gates in Group B are sent to the XOR gates in Group A, which extract the residues. To guarantee accurate pulse widths of the residues, each delay unit in the PGTA has exactly the same size as that in the CTDC. The OR gate combines the pulses (from ε[1] to ε[M]) into a serial signal composed of M residues, each with pulse width ε. Therefore, the residue is amplified M times. The simulation of the PGTA when M = 2 is shown in Figure 10. The ideal resolution of the temperature sensor can be expressed as
$$\text{Resolution} = \frac{T_{range}}{\Delta t_{temp}} \cdot \frac{t_{FTDC}}{M},$$
where Δt_temp is the range of the output delay difference between the compensator and the sensing module over temperature, t_FTDC is the propagation delay of the delay unit in the FTDC, and T_range denotes the temperature range (−20~80 °C) in this work.
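The coarse, amplify, fine flow described above can be sketched as an idealized behavioural model. This is not the authors' circuit: it ignores gate delays and mismatch, takes the roughly 101 ps CTDC step from the TT-corner simulation, and uses a hypothetical 60 ps FTDC step because the actual value is not given in this part of the text. The resolution helper assumes the ideal resolution is the temperature range divided by the number of effective fine steps across the delay range. All function and parameter names are illustrative.

```python
def pipeline_tdc(t_in_ps, t_ctdc_ps=101.0, t_ftdc_ps=60.0, gain=8):
    """Quantize a time interval with an idealized coarse/amplify/fine chain."""
    coarse = int(t_in_ps // t_ctdc_ps)          # whole CTDC stages covered
    residue_ps = t_in_ps - coarse * t_ctdc_ps   # epsilon left over by the CTDC
    amplified_ps = gain * residue_ps            # M copies of epsilon, combined
    fine = int(amplified_ps // t_ftdc_ps)       # FTDC counts the amplified residue
    return coarse, fine


def reconstruct_ps(coarse, fine, t_ctdc_ps=101.0, t_ftdc_ps=60.0, gain=8):
    """Map the coarse and fine codes back to an estimate of the input interval."""
    return coarse * t_ctdc_ps + fine * t_ftdc_ps / gain


def ideal_resolution_k(t_range_k, dt_temp_ps, t_ftdc_ps, gain):
    """Temperature span divided by the number of effective fine steps across it."""
    return t_range_k * (t_ftdc_ps / gain) / dt_temp_ps


coarse, fine = pipeline_tdc(13000.0)            # 13 ns input interval
estimate_ps = reconstruct_ps(coarse, fine)
resolution_k = ideal_resolution_k(100.0, 22000.0 - 13000.0, 60.0, 8)

print(f"coarse = {coarse}, fine = {fine}, estimate = {estimate_ps} ps")
print(f"ideal resolution with the assumed FTDC step: {resolution_k * 1e3:.1f} mK")
```

In this model the quantization error is bounded by the effective fine step t_FTDC/M, which is why a larger gain M improves the resolution without lengthening the coarse delay line.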

PGTA
Time amplifier (TA) is a key module in a pipeline TDC, which amplifies the residue generated in the CTDC quantization. A TA determines the measurement accuracy of a pipeline TDC. The SR-latch-based TA [12] and the cross-couple structure [13] had a very narrow linear range due to the metastable work region and limited discharging time, respectively. A closed-loop TA [14] had a wider input linear range than SR-latch structure, but the gain is not programmable. A promising TA [15] achieved time amplification by duplicating the residue in an OR gate, providing a wide input linear range and a programmable gain simultaneously. However, long delay lines were used in the TA to shift the residues, which caused possible overlaps between each two adjacent residues due to the temperature-dependent delay lines and thus resulted in reducing the accuracy of the amplification. In addition, the long delay lines also slowed down the conversion time.
In the existing works [12][13][14], the TAs amplified the residue in the stage where the stop catches up with the start, the position is defined as stage i, as shown in Figure 4. In fact, the residue exists in every stage of the delay line in CTDC. For instance, the time interval between Start [i+1] and Stop is (△tCTDC-ɛ), where ɛ is the residue and △tCTDC is the delay time of a delay unit in the CTDC. In the proposed PGTA, time amplification is achieved by extracting M times intervals in M different stages, and the residue is thereby amplified by M times. As shown in Figure 9, the PGTA is composed of a delay line, a digital controller, an OR gate and two groups of XOR gates. According to the PGTA gain (M) and i, the signals from TI[i+10] to TI[i+9+M] are selected by the digital controller. In this work, M is the gain of the PGTA, and M ≤ 8. The XOR gates in the group B are used to extract pulse signals, whose pulse width equal a replication of several times △tCTDC from 10×△tCTDC to (9+M) ×△tCTDC. The pulses TI selected by the digital controller and the outputs of the XOR gates in Group B are send into the XOR gates in Group A. Thus, the residuals are extracted. To guarantee accurate pulse widths of the residues, the size of each delay unit in the PGTA is exactly as same as that in the CTDC. The OR gate sums the pulses (from ɛ [1] to ɛ[M]) and generates a serial signal composed of M residues in which each single pulse width is ɛ. Therefore, the residue is amplified by M times. The simulation of the PGTA when M = 2 is shown in Figure 10. The ideal resolution of the temperature sensor can be expressed as where △ttemp is the range in which the output delay difference between the compensator and the sensor module with temperature variations, tFTDC is the propagation delay of the delay unit in FTDC, and Trange denotes the temperature range (−20~80 °C) in this work.  The structure of the XOR gate and the OR gate in the PGTA is shown in Figure 11. 
The matching between the rising edges and falling edges of the XOR gates and the OR gate determines the amplification accuracy of the PGTA. For CMOS logic gates, this matching can be ensured by adjusting the width-to-length ratios of the PMOS and NMOS transistors. As the temperature changes, the slopes of the rising edges and falling edges remain similar, so the PGTA has a low sensitivity to temperature variations. In the proposed PGTA, the pulse width variation of a single XOR output is less than 1 ps over the temperature range of −20~80 °C.
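The residue extraction described for the PGTA can be illustrated with a small behavioral model. This is a minimal sketch, not the authors' circuit: it assumes that the interval between Start[i+n] and Stop generalizes to (n×△tCTDC − ɛ), and that each Group A XOR gate outputs the difference between a Group B reference pulse and the selected TI pulse. The function names and the numeric values in the usage note are hypothetical.

```python
def ti_interval_ps(n, dt_ctdc_ps, residue_ps):
    """Interval between Start[i+n] and Stop (assumed: n unit delays minus the residue)."""
    return n * dt_ctdc_ps - residue_ps


def group_b_width_ps(n, dt_ctdc_ps):
    """Reference pulse from Group B: an integer multiple of the unit delay."""
    return n * dt_ctdc_ps


def pgta_pulses_ps(residue_ps, gain, dt_ctdc_ps):
    """Return the widths of the M serial pulses that leave the OR gate."""
    if not 1 <= gain <= 8:
        raise ValueError("the PGTA gain M must satisfy 1 <= M <= 8")
    if not 0 <= residue_ps <= dt_ctdc_ps:
        raise ValueError("the residue must lie within one CTDC unit delay")
    pulses = []
    for n in range(10, 10 + gain):  # stages i+10 .. i+9+M
        reference = group_b_width_ps(n, dt_ctdc_ps)
        interval = ti_interval_ps(n, dt_ctdc_ps, residue_ps)
        # Group A XOR: the two pulses differ only by the residue.
        pulses.append(reference - interval)
    return pulses


def amplified_residue_ps(residue_ps, gain, dt_ctdc_ps):
    """Total high time of the serial signal, i.e. M times the residue."""
    return sum(pgta_pulses_ps(residue_ps, gain, dt_ctdc_ps))
```

For example, with an illustrative unit delay of 100 ps and a residue of 37 ps, `pgta_pulses_ps(37.0, 8, 100.0)` yields eight pulses of 37 ps each, and `amplified_residue_ps` returns their sum, which is eight times the residue.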

FTDC
Since the PGTA outputs a series of discrete pulses, an FTDC based on a gate delay line (GDL) is used to quantize the amplified residues. The structure of the FTDC is shown in Figure 12. The PGTA output is used as an Enable signal to control the propagation of an LP signal in the GDL. The state of each delay unit in the GDL is read by the encoder as the output of the FTDC. Under the control of the LP and LP_n signals, MF9 and MF10 are used to reset the state of the FTDC after quantization. As described in Section 3.2.1, the sizes of the MOSFETs are carefully designed to ensure that the delay units in the FTDC maintain a low temperature sensitivity. The resolution of the FTDC is about 50 ps. Note that the quantization range of the FTDC should be larger than the total output pulse width of the PGTA. In this work, the FTDC is composed of 32 delay-unit stages, which provide a 1.6 ns quantization range.
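The FTDC figures quoted above can be checked with a short sketch. It models the gated delay line as an ideal quantizer whose code is the number of whole 50 ps steps covered by the total Enable-high time, which is an assumption rather than the documented circuit behavior. The `ideal_resolution_c` helper is also an assumption: it infers an expression from the stated definitions of △ttemp, tFTDC, and Trange, taking the effective time step to be tFTDC/M. All function names are hypothetical.

```python
FTDC_LSB_PS = 50.0  # FTDC resolution quoted in the text
FTDC_STAGES = 32    # number of delay-unit stages quoted in the text


def ftdc_range_ps(stages=FTDC_STAGES, lsb_ps=FTDC_LSB_PS):
    """Quantization range of the gated delay line."""
    return stages * lsb_ps


def ftdc_code(pulse_widths_ps, stages=FTDC_STAGES, lsb_ps=FTDC_LSB_PS):
    """Ideal code: the LP edge advances only while Enable (the PGTA output) is high."""
    total = sum(pulse_widths_ps)
    if total > stages * lsb_ps:
        raise ValueError("total PGTA pulse width exceeds the FTDC range")
    return int(total // lsb_ps)


def ideal_resolution_c(t_range_c, t_ftdc_ps, gain, dt_temp_ps):
    """Assumed form: temperature span divided by the number of fine steps in dt_temp."""
    return t_range_c * t_ftdc_ps / (gain * dt_temp_ps)
```

With these assumptions, `ftdc_range_ps()` reproduces the 1.6 ns range (32 × 50 ps), and a serial signal of eight 37 ps pulses (an illustrative value) maps to `ftdc_code([37.0] * 8) == 5`.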

Post-Simulation
One important design consideration is that the delay difference between the sensing module and the offset compensator should not exceed the quantization range of the pipeline TDC, which is determined by the CTDC. In the proposed TDC, the CTDC has 256 stages and the maximum gain of the PGTA is 8. To work properly when M = 8, eight stages after stage [i+10] in the CTDC are used to extract the residues, which guarantees sufficient processing time for the encoder. Thereby, the effective number of stages in the CTDC is 237. The maximum conversion time of the CTDC is determined by M and the stage i, and can be expressed as
$$t_{\mathrm{CTDC}} = (i + 9 + M) \times \Delta t_{\mathrm{CTDC}}, \quad 130 \le i \le 220.$$
According to the simulation results, the maximum i is 220, at 80 °C. Therefore, the conversion time of the CTDC is about 23.937 ns. The maximum possible conversion time of the CTDC is 25.856 ns when i = 237 and M = 8.
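The conversion-time expression can be evaluated numerically. The sketch below is illustrative only: the unit delay is not stated in this section, so it is inferred here from the quoted 23.937 ns at i = 220 and M = 8, which gives about 101 ps. The function name is hypothetical.

```python
def ctdc_conversion_time_ns(i, gain, dt_ctdc_ns):
    """Conversion time of the CTDC: (i + 9 + M) unit delays."""
    return (i + 9 + gain) * dt_ctdc_ns


# Unit delay inferred from the quoted simulation point (an assumption,
# not a value stated in the text): 23.937 ns at i = 220, M = 8.
DT_CTDC_NS = 23.937 / (220 + 9 + 8)  # about 0.101 ns

t_at_80c = ctdc_conversion_time_ns(220, 8, DT_CTDC_NS)  # about 23.937 ns
t_full_line = 256 * DT_CTDC_NS                          # about 25.856 ns
```

With this inferred unit delay, the quoted 25.856 ns equals 256 unit delays, which suggests that the maximum figure counts the full 256-stage delay line.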
The post-simulation output of the temperature sensor at different process corners is shown in Figure 13. Thanks to the low-temperature-sensitivity design of the pipeline TDC, the output has the best linearity at the TT corner. The non-linearity at the other process corners is mainly caused by variations in the temperature sensitivity of the delay units in the CTDC and the FTDC. The differential non-linearity (DNL) of the temperature sensor output, shown in Figure 14a, ranges from −0.6 LSB to 0.7 LSB. The integral non-linearity (INL), shown in Figure 14b, ranges from −1.3 LSB to 2.6 LSB.
The error after one-point calibration is shown in Figure 15a. Matlab® is used to find the best-fit lines for the five output curves at the different corners. As shown in Figure 8, the propagation delay of the delay unit is proportional to the temperature at the FF corner (NMOS-Fast and PMOS-Fast) and inversely proportional to the temperature at the SS corner; therefore, the second-order temperature coefficient of the temperature sensor output is positive at the FF corner and negative at the SS corner. The error curves at the other corners lie between those at the FF and SS corners. The result after two-point calibration is shown in Figure 15b. The worst case occurs at the SS corner, which shows an error of 0.9 °C. At the TT corner, the maximum error is less than 0.29 °C. The design parameters used in this work are summarized in Appendix A.
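Two-point calibration of a sensor output can be sketched as follows. This is a generic illustration, not the authors' Matlab procedure: it fits a straight line through two (code, temperature) calibration points and reports the residual error at other temperatures. The transfer curve and the calibration temperatures in the usage note are made up purely to show how residual curvature appears as error between the calibration points.

```python
def two_point_calibration(code_a, temp_a, code_b, temp_b):
    """Return a function mapping an output code to temperature via a two-point line."""
    if code_a == code_b:
        raise ValueError("calibration codes must differ")
    slope = (temp_b - temp_a) / (code_b - code_a)

    def to_temperature(code):
        return temp_a + slope * (code - code_a)

    return to_temperature


def calibration_errors(codes, temps, to_temperature):
    """Estimated temperature minus true temperature at each test point."""
    return [to_temperature(c) - t for c, t in zip(codes, temps)]


def synthetic_code(temp_c):
    """Hypothetical transfer curve with a small second-order term (not measured data)."""
    return 1000.0 + 10.0 * temp_c + 0.002 * temp_c * temp_c


temps = [float(t) for t in range(-20, 81, 10)]
codes = [synthetic_code(t) for t in temps]
convert = two_point_calibration(synthetic_code(0.0), 0.0, synthetic_code(60.0), 60.0)
errors = calibration_errors(codes, temps, convert)
```

In this synthetic case, the error is zero at the two calibration points and nonzero elsewhere, with a sign and magnitude set by the second-order coefficient of the assumed curve.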

Experimental Results
The proposed temperature sensor was fabricated in a TSMC 40-nm standard CMOS technology and operates at a 0.6 V supply voltage. The total area of the sensor is 0.05 mm², as shown in the die micrograph in Figure 16a. A photograph of the testing environment is shown in Figure 16b. The sensor was mounted on a test board and placed in a temperature chamber manufactured by GWS. The test board was connected to a power supply and a read-out circuit. A computer placed outside the temperature chamber was connected to the read-out circuit, and the output code was read every 10 °C.

Figure 17 shows the simulated and measured output results with different PGTA gains. With M = 8, 20 samples from one wafer were tested in the temperature chamber over the temperature range of −20~80 °C. The measured errors of the proposed temperature sensor with one-point and two-point calibration are shown in Figure 18a,b, respectively. With one-point calibration at 30 °C, the measured peak-to-peak error is ±1.3 °C. With two-point calibration at 0 °C and 60 °C, the measured peak-to-peak nonlinearity is ±0.39 °C. The sensor resolution is obtained by measuring the spread of the sensor error at 30 °C, as shown in Figure 19. The standard deviation of the measurement is about 0.09 °C.
Table 1 summarizes the measured temperature sensor performance and compares it with other state-of-the-art works. Reference [16] has the highest resolution and a high FoM at the cost of a long conversion time and a large area. Reference [17] has a resolution similarly high to that of Reference [16] and a small area, at the cost of high power and a long conversion time. This work achieves the best FoM of 0.048 pJ·K² thanks to the shortest conversion time of 1.3 µs.
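The reported FoM can be reproduced from the figures given in the paper. The sketch below assumes the usual resolution FoM definition, energy per conversion multiplied by the square of the resolution, and uses the 7.6 µA supply current and 0.6 V supply stated in the abstract. The function name is hypothetical.

```python
def resolution_fom_pj_k2(current_a, supply_v, t_conv_s, resolution_k):
    """Resolution FoM in pJ*K^2: energy per conversion times resolution squared."""
    energy_j = current_a * supply_v * t_conv_s
    return energy_j * resolution_k ** 2 * 1e12


# 7.6 uA at 0.6 V for 1.3 us with a 90 mK resolution.
fom = resolution_fom_pj_k2(7.6e-6, 0.6, 1.3e-6, 0.09)  # about 0.048 pJ*K^2
```

The energy per conversion here is about 5.9 pJ, so the short conversion time is the main reason the FoM stays low despite the moderate resolution.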

Conclusions
This paper proposed a CMOS time-domain temperature sensor. The relationship between the temperature characteristic of the propagation delay and the size of a delay unit was analyzed. The choice of the MOSFET channel length determines whether the propagation delay has a positive or negative temperature coefficient. Based on this principle, the temperature is converted into a PTAP time-domain signal, and a pipeline TDC is then used to quantize the signal. A PGTA is proposed to achieve a linear programmable gain. To avoid introducing additional errors and to improve the linearity of the TDC output, the circuits in the TDC are designed with low temperature sensitivity. Thus, this temperature sensor does not need any curvature correction.
Based on this analysis and design, a temperature sensor was fabricated in a 40-nm CMOS technology. A 90 mK resolution was achieved at a PGTA gain of eight. The conversion time is only 1.3 µs, and an FoM of 48 fJ·K² is obtained.
In most BIoT applications, the temperature does not change quickly; therefore, a low sampling rate of the temperature sensor can meet the demand of a system that works in a duty-cycling mode. In the sleep mode, the sensor draws only 31 nA at 40 °C. At 10 conversions/s and Tconv = 1.3 µs, the effective average power consumption is only 18.6 nW. To adapt this sensor to different applications, a sensing module with more stages can be used to increase the range over which the propagation delay varies with temperature and thus achieve a higher resolution.
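The duty-cycled average power can be estimated with a simple two-state model. This is a sketch under stated assumptions: the sensor is either converting (7.6 µA at 0.6 V, from the abstract) or sleeping (31 nA, assumed to be drawn from the same 0.6 V supply), and transition overheads are ignored. The function name is hypothetical.

```python
def average_power_w(active_current_a, sleep_current_a, supply_v, t_conv_s, rate_hz):
    """Time-weighted average of active and sleep power for a duty-cycled sensor."""
    duty = t_conv_s * rate_hz  # fraction of time spent converting
    if not 0.0 <= duty <= 1.0:
        raise ValueError("conversion time and rate give a duty cycle outside [0, 1]")
    active_power = active_current_a * supply_v
    sleep_power = sleep_current_a * supply_v
    return active_power * duty + sleep_power * (1.0 - duty)


# 10 conversions/s, 1.3 us per conversion.
p_avg = average_power_w(7.6e-6, 31e-9, 0.6, 1.3e-6, 10.0)  # about 18.66e-9 W
```

Under these assumptions, the active contribution is only about 0.06 nW, so the average is dominated by the sleep power of 31 nA × 0.6 V = 18.6 nW, consistent with the figure quoted above.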
