A 36 nW, 7 ppm/°C On-Chip Clock Source Platform for Near-Human-Body Temperature Applications

We propose a fully on-chip clock-source system in which an ultra-low-power diode-based temperature-uncompensated oscillator (OSCdiode) serves as the main clock source and frequency locks to a higher-power temperature-compensated oscillator (OSCcmp) that is disabled after each locking event to save power. The locking allows the stability of the uncompensated oscillator to stay within the stability bound of the compensated design. This paper demonstrates the functionality of a locking controller that uses a periodic (counter-based) scheme implemented on-chip and a prediction (temperature-drift-based) scheme. The flexible clock source platform is validated in a 130 nm CMOS technology. In the demonstrated system, it achieves an effective average temperature stability of 7 ppm/°C in the human body temperature range from 20 °C to 40 °C with a power consumption of 36 nW at 0.7 V. It achieves a frequency range of 12 kHz to 150 kHz at 0.7 V.


Introduction
Modern internet-of-things (IoT) devices are employed across a wide range of applications. In applications such as wearable body sensor node (BSN) technology, the sensor motes can alert people to various medical problems by sensing vital signs such as temperature and heart rate. They are meant to revolutionize healthcare using long-term monitoring in an unobtrusive fashion. These devices are designed to consume very low power so that they can extend battery life or operate from harvested energy using circuits such as [1]. The clock source is a critical component in such ultra-low-power (ULP) designs and must run continuously for time keeping and synchronization. It is essential for the clock source to consume low power at frequencies in the <1 MHz range and to have a small form-factor. For the smallest of these IoT devices, there is a need to provide a stable clock source that does not require external components.
Crystal (XTAL) oscillators with high temperature stability are conventionally used in ULP systems [2], but they require off-chip components. In a recent implementation of a self-powered IoT device [3], the total power consumption of the clock source including an off-chip XTAL, an integrated XTAL oscillator, and an all-digital phase-locked loop (ADPLL) was 300 nW at 187.5 kHz. A recent 32.768 kHz XTAL oscillator design [4] achieves a power consumption of 5.58 nW by lowering the oscillation swing, but the design uses multiple voltage domains (three power supplies and three grounds), requiring switched-cap networks. Another 32.768 kHz XTAL oscillator [5] consumes very low power of 1.89 nW at a low supply voltage of 0.15 V. In a more recent implementation, the XTAL oscillator consumes only 1.5 nW at a power supply of 0.3 V [6]. To generate higher frequencies in the range of 370 kHz to 3.8 MHz, a digitally controlled leakage-based oscillator along with a multiplier delay-locked loop [7] can be used, but it requires a clean reference clock such as an XTAL oscillator. Although such XTAL designs have achieved low power consumption recently, their biggest disadvantage for small form-factor IoT applications is that they require off-chip components, resulting in higher system volume and cost. Moreover, they need an on-chip circuit such as a phase-locked loop (PLL) to provide frequencies higher than 32.768 kHz. On-chip oscillators, on the other hand, do not require off-chip components and reduce the system volume. The wake-up timer in [8] consumes 5.8 nW of power at 11 Hz frequency. Another recent on-chip oscillator [9] consumes only 4.2 pW of power for an oscillation frequency of 18 Hz. These prior works tend to be very low frequency and better suited to long-term timekeeping than to providing a system clock. Several integrated oscillators in the kHz frequency range with high temperature stability have been proposed recently, such as the 33 kHz, 190 nW RC oscillator [10] with a ±0.21% frequency variation from −20 °C to +90 °C. The on-chip oscillator in [11] consumes 99.4 nW at 70.4 kHz with a high temperature stability of 27.4 ppm/°C in the −40 °C to 80 °C range.
In this paper, we propose a clock source platform for ULP BSN IoT devices, which utilizes a high stability clock for precise timing control, an ULP clock for lowering the overall system power, and a digital control system that combines the stable clock and the ULP clock into a programmable system clock. This clock source can be programmed and tuned according to the system's power and stability needs. The clock source is fully on-chip, which is desirable for IoT devices with a small form factor requirement. The clock source platform leverages two on-chip oscillators with different stability and power points. Since achieving higher stability costs substantially more power, a low power, uncompensated oscillator is implemented that runs continuously while periodically locking to a duty-cycled compensated oscillator that has higher stability but also higher power. This concept was presented in [12], but this paper implements a new, complete clock system based on the premise.
The clock source in this paper offers a flexible platform into which any oscillator can be integrated. We also propose an ULP diode-based temperature-uncompensated oscillator, a calibration scheme in a fast digital frequency locking circuit that can select a specific system frequency, and a digital controller to demonstrate multiple locking schemes. The demonstrated clock source system targets a lower temperature range that is compatible with BSN applications such as sensor patches that may be mounted on the skin and do not experience harsh environmental conditions. In this work, the design is implemented in a 130 nm CMOS technology and results are demonstrated at 0.7 V supply voltage. It achieves an effective average stability of 7 ppm/°C from 20 °C to 40 °C with a power consumption of 36 nW at 100 kHz and 0.7 V. The proposed clock system consumes the lowest energy per cycle and the lowest power compared to prior on-chip oscillators in the kHz oscillation frequency range (e.g., [10,11]). It also supports a wide frequency range of 12-150 kHz at 0.7 V and 30-600 kHz at 1.1 V.
Moreover, clock stability constraints in ULP chips vary with the system application, which provides an opportunity to save power. On the one hand, a stable clock ensures accurate timekeeping for synchronization in multi-node systems, accurate data conversion using analog to digital converters (ADCs), and precise sampling. On the other hand, a less stable clock is sufficient for many digital processing systems as long as the timing constraints are met. A clock source is very critical to the overall power and performance of an ULP system. A highly flexible clock source is needed for an ULP BSN system where the system power and performance needs can be traded off in a seamless fashion depending on the application. The proposed clocking solution can provide a flexible clocking platform for such ULP systems. This clocking platform supports different clock frequency and stability requirements and achieves power savings at lower frequencies and stabilities. In this paper, we analyze the clock system at 0.7 V, which minimizes the power consumption of the design while fitting with the trend of lowering the VDD for IoT chips. In Section 2, we discuss the operation of the system components. In Section 3, we present the measured results. Section 4 includes a comparison of the proposed system with the state-of-the-art clock sources, and finally, a conclusion is presented in Section 5.

Components of the Clock Source System
In this paper, we demonstrate a fully integrated clock source system as shown in Figure 1. The system consists of a high stability temperature-compensated digitally controlled oscillator (DCO) implemented in [12] (OSCcmp), a low-power temperature-uncompensated, diode-connected-transistor-based ULP DCO (OSCdiode) that is capable of being frequency locked to OSCcmp and acts as the system clock, and a digital block that can perform locking using a counter-based scheme implemented on-chip or a temperature-drift prediction-based mode that was verified off-chip. OSCdiode consumes lower power than the uncompensated DCO implemented in [12] (OSCucmp, which uses the leakage current of "off" low-threshold (LVT) transistors as the current source). OSCdiode also has improved temperature and voltage stability over OSCucmp. When OSCdiode is locked to OSCcmp often enough to compensate for the drift in the unstable clock, the clock stability of OSCdiode is within the stability bound of OSCcmp. We demonstrate the locking function in two different modes (counter-based locking and temperature drift-based locking), eliminating the need for high power PLLs. In this section, we describe the components and design techniques used in this clock source system.

Figure 1. The proposed on-chip clock-source system: An ULP temperature-uncompensated OSCdiode (system clock) locks to a duty-cycled higher-power stable OSCcmp. A reference clock may be used just for the initial calibration that can also be achieved by setting the calibration bits, making the system fully on-chip. We demonstrate counter-based and temperature drift-based locking schemes.
The locking principle [12] is shown in Figure 2. For a uniform rate of increasing temperature with time, the temperature-compensated DCO (OSCcmp) accumulates error at a slower rate than the uncompensated DCO (OSCdiode). OSCdiode is locked to OSCcmp at a rate that is fast relative to environmentally caused changes in the clock frequency so that its effective long-term stability stays within the stability bound of OSCcmp.
The clock source system output is derived from OSCdiode. The power savings in the system are obtained by powering down the high-power OSCcmp between locking events. The digital control of OSCcmp allows it to be turned off and turned on quickly while maintaining its frequency target, aiding in the duty cycling operation. Before deployment, OSCcmp can be initially calibrated by locking to a reference to achieve a desired initial frequency or by programming its calibration bits.
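The locking principle can be illustrated with a small numerical sketch. It assumes a linear temperature ramp and a linear frequency-versus-temperature model, which are simplifications and not the authors' measurement method. The function name, the ramp rate, and the re-lock intervals are hypothetical; the two temperature coefficients are the measured values reported in this paper (2.51%/°C for OSCdiode, 7 ppm/°C for OSCcmp at 0.7 V).

```python
def worst_case_error_ppm(tc_diode_ppm, tc_cmp_ppm, ramp_c_per_s,
                         relock_s, duration_s, step_s=0.01):
    """Worst frequency error (ppm) of a free-running oscillator that is
    re-locked to a compensated reference every relock_s seconds.

    Simplified linear model (an assumption): each oscillator's error grows
    in proportion to the temperature change it has seen.
    """
    steps = int(round(duration_s / step_s))
    relock_steps = max(1, int(round(relock_s / step_s)))
    temp_at_lock = 0.0   # temperature offset (°C) at the most recent lock
    worst = 0.0
    for i in range(1, steps + 1):
        temp = ramp_c_per_s * i * step_s        # offset from calibration point
        if i % relock_steps == 0:
            temp_at_lock = temp                 # lock: inherit the reference error
        ref_error = tc_cmp_ppm * temp_at_lock   # error of OSCcmp when it was copied
        drift = tc_diode_ppm * (temp - temp_at_lock)  # OSCdiode drift since then
        worst = max(worst, abs(ref_error + drift))
    return worst


if __name__ == "__main__":
    tc_diode = 25100.0  # 2.51 %/°C expressed in ppm/°C
    tc_cmp = 7.0        # ppm/°C
    for interval in (1000.0, 1.0, 0.1):  # hypothetical re-lock intervals in seconds
        err = worst_case_error_ppm(tc_diode, tc_cmp, 0.01, interval, 100.0)
        print(f"re-lock every {interval:7.1f} s -> worst error {err:10.2f} ppm")
```

Under these assumed inputs (a 1 °C rise over 100 s), re-locking once per second keeps the worst-case error roughly a hundred times smaller than the free-running case, and shortening the interval further moves the error toward the bound set by the compensated oscillator.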

Figure 2. Locking principle: Compensated oscillator OSCcmp achieves a specific system frequency initially by locking it to a reference clock [12] or by setting calibration bits, which are write-enabled in this design. The temperature-uncompensated oscillator OSCdiode is re-locked to OSCcmp often, thereby achieving an effective stability close to that of OSCcmp. © (2012) IEEE. Reproduced with permission from A. Shrivastava and B. H. Calhoun, A 150 nW, 5 ppm/°C, 100 kHz On-Chip clock source for ultra low power SoCs; published by Custom Integrated Circuits Conference (CICC), 2012 IEEE.

Temperature-Uncompensated Diode-Connected-Transistor Oscillator (OSCdiode)
In this paper, we propose an ULP temperature-uncompensated oscillator, the diode-connected transistor-based DCO, OSCdiode. Diode-connected MOS devices are used to generate a virtual power rail (VDD-VIRTUAL) from the oscillator power-supply (VDD). The oscillator is powered by VDD-VIRTUAL as shown in Figure 3. In this subsection, we will describe the design and oscillator stabilization techniques used in OSCdiode.
The diode strength is a function of the width of the diode transistor. Diode-connected transistor stacks sized in a binary-weighted fashion are turned on/off by a 23-bit control signal. This controls the value of VDD-VIRTUAL to obtain different frequencies. For a higher 23-bit value, VDD-VIRTUAL increases and hence raises the oscillation frequency. Thus, setting the 23 calibration bits tunes the oscillator to a specific frequency.
The stacked OSCdiode transistors have VGS (gate-to-source voltage) equal to VDS (drain-to-source voltage). However, VGS (= VDS) is less than VT (threshold voltage). Therefore, the transistors operate in the sub-threshold region. The drain current in the sub-threshold region is given by:
$$I_{DSUB} = I_o \exp\left(\frac{V_{GS} - V_T}{n\phi_t}\right)\left(1 - \exp\left(-\frac{V_{DS}}{\phi_t}\right)\right) \quad (1)$$
$$I_o = \mu_o C_{ox} \frac{W}{L} (n - 1) \phi_t^2 \quad (2)$$
where µo is the carrier mobility, Cox is the gate oxide capacitance, W and L are the channel width and length, n is the sub-VT slope factor, and φt is the thermal voltage. In the diode-connected transistors, VDS > 3φt and Equation (1) can be approximated as:
$$I_{DSUB} \approx I_o \exp\left(\frac{V_{GS} - V_T}{n\phi_t}\right) \quad (3)$$
This is the sub-VT MOSFET saturation region, in which the drain current becomes independent of VDS. A detailed analysis of the temperature coefficient (TC) of the different factors such as threshold voltage and carrier mobility is out of the scope of this paper. However, relevant equations to explain the temperature dependence are presented below.
The temperature dependence of the threshold voltage and the mobility is typically modeled as:
$$V_T(T) = V_{T0} - \kappa T \quad (4)$$
$$\mu(T) = \mu(T_0) \left(\frac{T}{T_0}\right)^{-m} \quad (5)$$
where VT0 is the threshold voltage at 0 K, κ is the TC of VT, T is the target temperature, T0 is a reference temperature, and m is the mobility temperature exponent [13].
The sub-threshold current TC can be derived as follows [14]:
$$TC = \frac{1}{I_{DSUB}} \frac{dI_{DSUB}}{dT} = \frac{2 - m}{T} + \frac{\kappa - (V_{GS} - V_T)/T}{n\phi_t} \quad (6)$$
From the above equation, we observe that as VGS is lowered (as the transistor goes into weaker inversion), the TC increases. In the diode-connected transistors in OSCdiode, VGS = VDS, whereas in the uncompensated oscillator OSCucmp in [12], VGS = 0. Therefore, transistors in OSCucmp are in a weaker inversion than the transistors in OSCdiode, and the TC for diode-connected transistors is lower. However, a completely direct comparison is not possible because the current source in OSCucmp [12] is the leakage current of "off" LVT transistors. The temperature dependence of VT also determines the effective TC for both the DCOs. The TC for LVT devices is lower than that of high-VT devices, which is favorable for OSCucmp in [12]. However, during design, we observed that a higher VGS has a stronger effect on lowering the TC than a lower VT. We verify this with the TC measurement of the DCOs toward the end of this section to conclude that OSCdiode has a lower TC than OSCucmp.
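The trend stated by Equation (6) can be checked numerically. The sketch below is only an illustration: the device parameters (n, κ, m, VT) are hypothetical round numbers chosen for readability, not extracted values for the 130 nm process, and the function name is an assumption.

```python
BOLTZMANN_OVER_Q = 8.617333e-5  # k/q in volts per kelvin


def subthreshold_tc(v_gs, v_t, temp_k, n=1.5, kappa=1e-3, m=1.5):
    """Fractional change of sub-threshold current per kelvin, Equation (6).

    n, kappa (V/K) and m are hypothetical illustrative values.
    """
    phi_t = BOLTZMANN_OVER_Q * temp_k  # thermal voltage at temp_k
    return (2.0 - m) / temp_k + (kappa - (v_gs - v_t) / temp_k) / (n * phi_t)


if __name__ == "__main__":
    v_t = 0.4        # hypothetical threshold voltage in volts
    temp_k = 300.0
    for v_gs in (0.0, 0.1, 0.2, 0.3):
        tc = subthreshold_tc(v_gs, v_t, temp_k)
        print(f"VGS = {v_gs:.1f} V -> TC = {100 * tc:.2f} %/K")
```

With all other parameters held fixed, the computed TC falls as VGS rises, which is the qualitative argument used above for preferring diode-connected devices (VGS = VDS) over "off" devices (VGS = 0). The absolute numbers depend entirely on the assumed parameters.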
OSCdiode uses the DCO architecture shown in Figure 4a [12] that comprises an oscillator, a locking circuit, and digital storage for the 23 calibration bits. OSCdiode can lock to the frequency of a reference clock (REF_CLK), which in this system is the temperature-compensated oscillator OSCcmp. The REF_CLK is divided by 16 to obtain REF. During locking, OSCdiode (DCO) is enabled when REF goes "high" (calibration time). The locking circuit consists of a frequency comparator and a successive approximation register (SAR) logic. The frequency comparator, which is implemented using a 5-bit counter, compares the frequency of OSCdiode and REF. It counts the number of OSCdiode cycles when REF is high. As shown in Figure 4b, the output of the comparator is "1" when the count is greater than 1, and "0" otherwise. When REF is low (settling time), the SAR logic sets the 23 configuration bits of the OSCdiode one after the other in the digital storage registers, depending on the output of the comparator (1 or 0). It takes 23 locking cycles to set all the calibration bits; however, only two such cycles are shown in Figure 4b. Once all the SAR configuration bits are set and locking is done, the OSCdiode will be frequency locked to REF_CLK as shown in Figure 4b.
During locking, the instantaneous frequency of OSCdiode is affected. Therefore, the re-locking can take place during the idle times of the sensor operation. The chip can also be designed to send an interrupt that halts its execution while calibrating the clock.
Two techniques are used to stabilize OSCdiode. First, OSCdiode requires sufficient time to stabilize after a change in the SAR bits. This is achieved by dividing REF_CLK by 16 (REF = REF_CLK/16) and allocating 1/16 of the REF period (1 REF_CLK cycle) for comparison and the other 15/16 of the REF period (15 REF_CLK cycles) to settle VDD-VIRTUAL. VDD-VIRTUAL takes more time to settle because of the diode charging it. Dividing REF_CLK by a number lower than 16 (such as dividing by 2 as in OSCucmp [12], by 4, or by 8) results in insufficient time to stabilize it. Dividing by 16 gives a longer time (15 REF_CLK cycles) for the VDD-VIRTUAL rail to settle before the next comparison process sets the next SAR bit.
Secondly, OSCdiode includes both a primary oscillator (OSCmain) and a dummy oscillator (OSCdummy), with the clock output derived from OSCmain. OSCdummy reduces the load mismatch on the VDD-VIRTUAL rail. When REF is high, OSCmain is enabled and consumes a specific amount of current. During its low state, OSCmain does not oscillate, and its current consumption reduces, causing VDD-VIRTUAL to increase as shown in Figure 5. This causes OSCdiode to finally settle at the wrong frequency. As a remedy, OSCdummy is enabled when OSCmain is disabled and vice-versa, which helps to maintain a roughly constant current draw from VDD-VIRTUAL in both high and low states of REF. This enables OSCmain to settle at the right frequency. A simulation of the above stabilization techniques is shown in Figure 5.
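The SAR locking loop described in this subsection can be sketched in software. This is a behavioral illustration, not the on-chip logic: the DCO transfer curve `dco_frequency_hz` is a hypothetical linear, monotonic model spanning the 12 kHz to 150 kHz range reported at 0.7 V, and the comparator is reduced to a direct frequency comparison standing in for the 5-bit counter decision.

```python
N_BITS = 23        # calibration bits set by the SAR logic
REF_DIVIDER = 16   # REF = REF_CLK / 16 (1 cycle compare, 15 cycles settle)


def dco_frequency_hz(code, f_min_hz=12e3, f_max_hz=150e3):
    """Hypothetical monotonic DCO curve: frequency rises linearly with code."""
    return f_min_hz + (f_max_hz - f_min_hz) * code / (2**N_BITS - 1)


def sar_lock(f_ref_clk_hz, freq_of_code=dco_frequency_hz):
    """Set the calibration bits one per locking cycle, MSB first.

    Each bit is tried as 1; it is kept only if the DCO is not faster than
    the reference (the comparator decision is modeled as an ideal compare).
    """
    code = 0
    for bit in reversed(range(N_BITS)):
        trial = code | (1 << bit)
        too_fast = freq_of_code(trial) > f_ref_clk_hz
        if not too_fast:
            code = trial
    return code


def lock_time_s(f_ref_clk_hz):
    """Time for one full lock: 23 locking cycles of 16 REF_CLK periods each."""
    return N_BITS * REF_DIVIDER / f_ref_clk_hz


if __name__ == "__main__":
    target = 100e3
    code = sar_lock(target)
    print(f"code = {code:#08x}, f = {dco_frequency_hz(code):.3f} Hz")
    print(f"lock time = {1e3 * lock_time_s(target):.2f} ms")
```

Under this model, one full lock at a 100 kHz reference takes 23 × 16 reference periods, which is consistent with the 3.68 ms minimum locking time quoted for the counter-based scheme.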
New measurements were made for OSCucmp, the uncompensated DCO from [12], at a lower supply voltage of 0.7 V, for comparison with OSCdiode. Firstly, the measured stability of OSCdiode is 2.51%/°C, which is better than that of OSCucmp, which has a stability of 3.42%/°C at 0.7 V. The previous discussion on the TC explains the above results. The temperature stability of 1.67%/°C stated in [12] for OSCucmp was measured from the test chip implemented in [12] at 1.1 V supply voltage. In OSCucmp [12], the most significant bit (MSB) for calibration (SAR bits) connects the oscillator delay element directly to VDD and only the remaining bits connect the oscillator to "off" leaking LVT transistors. Furthermore, the finer delay elements in OSCucmp are powered directly by VDD, which causes stability degradation in OSCucmp at lower VDD. Secondly, our new OSCdiode has a power consumption of 20 nW at 100 kHz frequency and 0.7 V supply voltage, which is lower than the power consumption of OSCucmp at 35 nW, measured at 0.7 V supply voltage, and 100 nW power consumption for OSCucmp at 1.1 V [12]. Finally, OSCdiode also has an improved voltage stability of 0.1%/mV as compared to OSCucmp, which has a voltage stability of 0.6%/mV. Supply sensitivity is discussed further in Section 3.6. Higher stability requires less frequent re-locking to the stable clock. The above factors make OSCdiode a better candidate for the temperature-uncompensated oscillator in the clock system. After accurate configuration, OSCdiode is used as the system clock, meeting the power goals of an ULP system.

Temperature-Compensated Oscillator (OSCcmp)
OSCcmp is a current-controlled DCO implemented in [12] that is used as a temperature-compensated oscillator in the system. In this paper, we discuss a summary of its key features. OSCcmp frequency is determined by a constant current source Io and the capacitance CL as shown in Figure 6a [12].
The constant current source Io is obtained by adding currents from a Proportional to Absolute Temperature (PTAT) source and a Complementary to Absolute Temperature (CTAT) source [12]. In the PTAT source, the current increases with an increase in temperature. In the CTAT, the current decreases with an increase in temperature. The sum current Io of PTAT and CTAT stays constant and it varies by only 1% over a 100 °C range across different process corners (SS, TT, FF, etc.), as shown in Figure 6b [12]. CL is a Metal-Insulator-Metal (MIM) cap and also has very small temperature variation. Process variation in the current source may cause either the PTAT or the CTAT to dominate the other, making the total current Io temperature dependent. To balance these currents, the resistance and subsequently the current of the PTAT circuit is configured using 5 bits of process control. To further compensate for the decrease in the period of oscillations at high temperature, a second-order compensation technique is employed. It consists of an off, low-threshold MOS transistor, as shown in Figure 6c, which forms a leakage pull-up path that adds charge to CL, thereby increasing the delay with temperature. This leakage current is controlled by a 6-bit switch and inverter control, thereby regulating the variation in the off-transistors of the second-order compensation technique. The 5-bit process trimming bits and 6-bit second-order compensation bits are set externally during calibration. Together with the second-order compensation, high temperature stability was achieved for OSCcmp.
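The idea of balancing PTAT and CTAT currents with a 5-bit trim can be illustrated with a toy first-order model. Everything numerical here is hypothetical: the slopes, the base current, and the way the trim code scales the PTAT slope are stand-ins for the trimmed resistance in the real circuit, and the second-order leakage compensation is not modeled.

```python
def total_current_na(temp_c, trim_code,
                     ptat_slope_na_per_c=0.05,
                     ctat_slope_na_per_c=-0.06,
                     base_na=10.0):
    """Sum of a PTAT and a CTAT current (nA) in a linear toy model.

    trim_code (0..31) scales the PTAT slope between 0.5x and 1.5x,
    a hypothetical stand-in for the 5-bit resistance trim.
    """
    scale = 0.5 + trim_code / 31.0
    delta = temp_c - 27.0                      # offset from a nominal 27 °C
    i_ptat = base_na / 2 + scale * ptat_slope_na_per_c * delta
    i_ctat = base_na / 2 + ctat_slope_na_per_c * delta
    return i_ptat + i_ctat


def spread_percent(trim_code, temps):
    """Peak-to-peak variation of the summed current, in percent of its mean."""
    currents = [total_current_na(t, trim_code) for t in temps]
    mean = sum(currents) / len(currents)
    return 100.0 * (max(currents) - min(currents)) / mean


def best_trim(temps):
    """Pick the 5-bit code that minimizes the variation over temps."""
    return min(range(32), key=lambda code: spread_percent(code, temps))


if __name__ == "__main__":
    temps = list(range(0, 101, 10))            # 0 °C to 100 °C
    code = best_trim(temps)
    print(f"best trim code = {code}, spread = {spread_percent(code, temps):.2f} %")
```

The sketch shows why a small trim range is enough: once the PTAT slope is scaled to cancel the CTAT slope, the residual variation is set by the trim resolution rather than by either source alone.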
Stability measured across 10 chips was 5 ppm/°C from 20 °C to 40 °C (14 ppm/°C from 20 °C to 70 °C) at 1.1 V supply voltage [12] and 7 ppm/°C from 20 °C to 40 °C at 0.7 V. This DCO is well suited for the human body application temperature range for which it was designed. One example of a device targeting the body temperature range is the batteryless RFID sensor in the wireless human body temperature monitoring system in [15]. This DCO is operational from 0.7 V to 1.1 V, assuming that once the supply voltage is chosen, it is kept stable. This makes it readily usable in ULP nodes employing a wide range of operating voltages such as [2].

Digital Control Block
In this work, a low-power digital control block was implemented to automate the locking of OSCdiode to OSCcmp. It controls the time interval between successive locks of OSCdiode to OSCcmp. The digital block is designed using a standard digital synthesis flow. We describe two locking modes: (a) a periodic (counter-based) locking scheme; and (b) a prediction (temperature-drift-based) locking scheme in which an algorithm is used to optimize the number of locks in the event of temperature drift. The periodic locking scheme was implemented on the prototype and the prediction locking scheme was verified off-chip. The two modes are described in the following subsections.

Counter-Based Locking Scheme
In the counter-based scheme, locking is achieved through a 32-bit programmable counter. After counting the number of cycles programmed in these registers, the digital block issues a signal to enable the locking of one DCO to another. This programmable counter controls the locking of OSCdiode to OSCcmp in a periodic fashion. A 32-bit count register implies the capability to count $2^{32}$ cycles. If the digital block is run at the same frequency as that of the clock source (e.g., 100 kHz), the interval between the locking of OSCdiode to OSCcmp can be programmed to any value between the minimum locking time (3.68 ms) and the maximum time possible (11.93 h), in steps of the clock period (10 µs). After locking, a power-down signal is asserted to disable OSCcmp and save power. Its SAR bits are retained to preserve calibration and frequency lock settings.
The start-up times of all the DCOs are in the range of a few microseconds, which has to be considered when powering up OSCcmp for the next locking event. OSCcmp must be powered on for a sufficient amount of time before the next round of locking starts to account for its settling time. The digital block takes this into account through a settle register in each counter. A power-up signal is issued a programmable number of cycles before the next lock begins. At a system clock frequency of 100 kHz, 1 settle bit is equivalent to 10 µs, which is sufficient time to start up OSCcmp. The programmable nature of the count and settle registers in the digital block makes this clock source system flexible enough to serve different application needs. As a result, another oscillator with lower power or otherwise better clock attributes can easily replace the DCOs described in this paper, provided it follows the DCO architecture described above.
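The arithmetic behind the programmable interval can be checked with a short sketch. The following Python helper is illustrative only: the function names are hypothetical and are not taken from the chip's register map. It assumes the digital block runs at the system clock frequency, so that one count (or one settle bit) equals one clock period.

```python
COUNTER_BITS = 32  # width of the programmable count register (from the text)


def lock_interval_s(count: int, f_clk_hz: float) -> float:
    """Time between locks for a given count value, in seconds."""
    if not 0 < count <= 2 ** COUNTER_BITS:
        raise ValueError("count does not fit in the 32-bit counter")
    return count / f_clk_hz


def count_for_interval(interval_s: float, f_clk_hz: float) -> int:
    """Count value (hypothetical register setting) for a desired interval."""
    count = round(interval_s * f_clk_hz)
    if not 0 < count <= 2 ** COUNTER_BITS:
        raise ValueError("interval is outside the programmable range")
    return count


def power_up_cycle(count: int, settle_cycles: int) -> int:
    """Cycle at which OSCcmp is powered up, settle_cycles before the lock."""
    if not 0 <= settle_cycles < count:
        raise ValueError("settle time must be shorter than the lock interval")
    return count - settle_cycles


if __name__ == "__main__":
    f_clk = 100e3  # 100 kHz system clock, as in the example above
    print(lock_interval_s(2 ** COUNTER_BITS, f_clk) / 3600.0)  # maximum, in hours
    print(count_for_interval(60.0, f_clk))  # count for a 1 min interval
```

At 100 kHz, the full 32-bit count corresponds to about 11.93 h, and a 1 min interval corresponds to a count of 6,000,000, which agrees with the range quoted above.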

Temperature-Drift-Based Locking Scheme
This locking scheme makes use of the temperature dependence of the SAR calibration bits. As temperature drifts, the frequency of the uncompensated oscillator drifts. When it is re-locked to the compensated clock, the difference between the current and previous values of the SAR bits of the uncompensated oscillator indicates the amount of drift in clock frequency and thereby serves as a proxy for the change in temperature since the last lock.
The SAR calibration bits are designed to be both readable and writable in this system. This makes it possible to read their values from successive locks. If, between successive locks, the SAR bits are unchanged or only the "fine" least significant bits (LSBs) have changed, the temperature has not changed significantly since the previous lock. In that case, re-locking, which involves powering up the high-power OSCcmp, leads to unnecessary power consumption. In this prediction-locking scheme, we monitor the bits and perform locks based on the history of temperature change and a prediction algorithm. The algorithm implemented for the temperature-drift-prediction-based locking scheme is described below.
Figure 7 shows the state diagram of the implemented algorithm. The bubbles (11, 10, 01, and 00) represent the different locking states. Each state has a corresponding programmable counter threshold, which represents the time interval between successive locks. The minimal threshold (corresponding to 11) is the shortest interval between successive locks and the high threshold (corresponding to 00) is the longest locking interval. The low and medium thresholds are intermediate locking intervals and correspond to states 10 and 01, respectively. The SAR bits are monitored every time locking is performed. The initial locking state is "11", corresponding to the minimal locking interval, because there is no information on the temperature drift at the start. In all successive locks, the difference between the current SAR bits and the previous SAR bits (diff) determines the locking interval. If diff exceeds a pre-programmed threshold value thresh, a significant temperature drift has occurred.
If diff is greater than thresh, a lock is initiated and the locking state is reset to "11", so that the next locking and monitoring take place after the minimal interval. If diff is less than thresh, the locking state is decremented by 1 and the next lock happens after a longer interval (low if the previous interval was minimal, medium if it was low, high if it was medium, and still high if it was already high). Whenever diff exceeds thresh, the locking state returns to "11", thereby resetting the state machine. The circuit implementation of the algorithm was produced using a digital synthesis flow. The functionality and power savings of the algorithm were verified in simulation, but the algorithm was not implemented in the prototype. However, the SAR-bit readout and the variable locking intervals used by this scheme were verified with chip measurements by implementing the algorithm in software.
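The state machine described above can be sketched in software as follows. This is a minimal illustration and not the synthesized implementation: the class name, the interval values, and the threshold are hypothetical. It also makes two assumptions that the description leaves open, namely that diff is the absolute difference of the SAR codes and that diff equal to thresh is treated as no significant drift.

```python
from dataclasses import dataclass
from typing import Dict, Optional

# Locking states as in Figure 7: "11" = minimal interval, "10" = low,
# "01" = medium, "00" = high (longest) interval.
STATE_ORDER = ["11", "10", "01", "00"]


@dataclass
class DriftLockController:
    intervals: Dict[str, int]  # state -> interval in clock cycles (hypothetical)
    thresh: int                # pre-programmed SAR-code threshold (hypothetical)
    state: str = "11"          # start at the minimal interval: no drift history
    prev_sar: Optional[int] = None

    def on_lock(self, sar_code: int) -> int:
        """Update the state after a lock; return cycles until the next lock."""
        if self.prev_sar is not None:
            # Assumption: diff is the absolute difference of the SAR codes.
            diff = abs(sar_code - self.prev_sar)
            if diff > self.thresh:
                # Significant drift: fall back to the minimal interval.
                self.state = "11"
            else:
                # Assumption: diff == thresh counts as no significant drift.
                # Step one state toward "00" and saturate there.
                idx = STATE_ORDER.index(self.state)
                self.state = STATE_ORDER[min(idx + 1, len(STATE_ORDER) - 1)]
        self.prev_sar = sar_code
        return self.intervals[self.state]


if __name__ == "__main__":
    ctrl = DriftLockController(
        intervals={"11": 1, "10": 2, "01": 4, "00": 8}, thresh=3
    )
    for code in [100, 101, 101, 102, 102, 120]:
        print(code, ctrl.on_lock(code), ctrl.state)
```

With a stable SAR code the interval steps from minimal to high and saturates there; a large jump in the code returns the controller to the minimal interval on the next decision.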
The digital block is operational at voltages ranging from 0.5 V to 1.1 V for a 100 kHz system frequency. For a count of 1 min, the measured power consumption of the digital block is 12 nW in the counter-based locking mode. The estimated power consumption of the drift-based locking mode is 5.1 nW at 100 kHz with a 0.7 V supply voltage; this value is obtained from simulations because the mode was not implemented in the prototype. Operating the digital block at a lower frequency using a divided clock can further reduce this power consumption. To achieve the same locking interval at a lower operating frequency, fewer count bits need to be set, thereby lowering the switching activity. Further, if the digital block is run at a divided clock of 50 kHz, it can operate even at a 0.4 V supply voltage. At 50 kHz and 0.4 V, its power consumption goes down to 730 pW for the same locking interval, and the power consumption of the overall clock source system in this case is 25 nW. Since only two signals (reset and power-down) interface the digital block to the oscillators and they toggle only once during each lock, they may be level-shifted to the oscillator domain without much loss of power. Hence, it is advantageous to operate the digital block in a sub-threshold voltage domain, which is usually available in ULP systems such as [2].
13. $T_{\mathrm{total}}$: Sum of $T_{\mathrm{lock}}$ and $T_{\mathrm{nolock}}$.
14. $T_{\mathrm{duration}}$: Time duration of each lock.
OSCdiode and the digital block are "ON" for the whole duration, while OSCcmp is "ON" only for the duration of the lock. The total system power consumption ($P_{\mathrm{total}}$) can be estimated to be:

$$P_{\mathrm{total}} = \frac{(P_{\mathrm{diode}} + P_{\mathrm{dig}} + P_{\mathrm{cmp\_leak}}) \times T_{\mathrm{nolock}} + (P_{\mathrm{diode\_lock}} + P_{\mathrm{dig}} + P_{\mathrm{cmp}}) \times T_{\mathrm{lock}} + P_{\mathrm{cmp}} \times T_{\mathrm{cmp\_settle}}}{T_{\mathrm{total}}} \tag{7}$$

The digital block controls $T_{\mathrm{interval}}$, which relates to $T_{\mathrm{lock}}$ in Equation (1) as:

$$T_{\mathrm{lock}} = \frac{T_{\mathrm{total}} \times T_{\mathrm{duration}}}{T_{\mathrm{interval}}} \tag{8}$$

Under no-lock conditions, the oscillator frequency drift is given by $D_{\mathrm{diode\_nolock}}$ for OSCdiode and $D_{\mathrm{cmp\_nolock}}$ for OSCcmp:

$$D_{\mathrm{diode\_nolock}} = S_{\mathrm{diode}} \times R_{\mathrm{temp}} \times T_{\mathrm{total}} \tag{9}$$

$$D_{\mathrm{cmp\_nolock}} = S_{\mathrm{cmp}} \times R_{\mathrm{temp}} \times T_{\mathrm{total}} \tag{10}$$

In the duration of $T_{\mathrm{total}}$, the total number of locks is $n = T_{\mathrm{total}}/T_{\mathrm{interval}}$, where $n$ is an integer. At the end of these locks (at time $n \times T_{\mathrm{interval}}$), OSCdiode would have drifted, in the worst case, only as much as OSCcmp. During the remaining time ($T_{\mathrm{total}} - n \times T_{\mathrm{interval}}$), the frequency drifts at the original rate for OSCdiode. The effective frequency drift during $T_{\mathrm{total}}$ can be estimated as:

$$D_{\mathrm{cmp\_lock}} = S_{\mathrm{cmp}} \times R_{\mathrm{temp}} \times n \times T_{\mathrm{interval}} + S_{\mathrm{diode}} \times R_{\mathrm{temp}} \times (T_{\mathrm{total}} - n \times T_{\mathrm{interval}}) \tag{11}$$

The maximum drift of frequency between locks ($\mathrm{MaxD}_{\mathrm{diode\_lock}}$) is one way to obtain the maximum frequency variation of the clock. It is equivalent to the net frequency drift during the time interval between locks ($T_{\mathrm{interval}}$):

$$\mathrm{MaxD}_{\mathrm{diode\_lock}} = S_{\mathrm{diode}} \times R_{\mathrm{temp}} \times T_{\mathrm{interval}} \tag{12}$$

From Equation (12), we observe that $\mathrm{MaxD}_{\mathrm{diode\_lock}}$ decreases proportionally with $T_{\mathrm{interval}}$. Using the above equations, we are able to calculate the maximum frequency drift for any power budget, or the power consumption for any stability budget. If the rate of locking is greater than the rate of temperature change, the system power increases with no additional improvement in stability. This is because the stability of OSCdiode cannot exceed the stability of the compensated oscillator OSCcmp.
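To explore the stability-power trade-off numerically, Equations (7), (8), (11), and (12) can be evaluated with a short script. The sketch below is illustrative: the function names are hypothetical, and every value in the example calls is a placeholder chosen for easy arithmetic rather than a measured result. It assumes that $R_{\mathrm{temp}}$ is the rate of temperature change, that $T_{\mathrm{cmp\_settle}}$ is the total settling time accumulated within $T_{\mathrm{total}}$, and that $n$ is obtained by rounding down.

```python
def lock_time(t_total, t_duration, t_interval):
    """Equation (8): cumulative time spent locking within t_total."""
    return t_total * t_duration / t_interval


def total_power(p_diode, p_diode_lock, p_dig, p_cmp, p_cmp_leak,
                t_total, t_duration, t_interval, t_cmp_settle):
    """Equation (7): time-weighted average system power."""
    t_lock = lock_time(t_total, t_duration, t_interval)
    t_nolock = t_total - t_lock  # T_total is the sum of T_lock and T_nolock
    energy = ((p_diode + p_dig + p_cmp_leak) * t_nolock
              + (p_diode_lock + p_dig + p_cmp) * t_lock
              + p_cmp * t_cmp_settle)
    return energy / t_total


def effective_drift(s_cmp, s_diode, r_temp, t_total, t_interval):
    """Equation (11), with n taken as floor(t_total / t_interval)."""
    n = int(t_total // t_interval)
    return (s_cmp * r_temp * n * t_interval
            + s_diode * r_temp * (t_total - n * t_interval))


def max_drift_between_locks(s_diode, r_temp, t_interval):
    """Equation (12): worst-case drift accumulated between two locks."""
    return s_diode * r_temp * t_interval


if __name__ == "__main__":
    # Placeholder values only (power in arbitrary units, time in minutes).
    print(total_power(p_diode=10, p_diode_lock=12, p_dig=5, p_cmp=100,
                      p_cmp_leak=1, t_total=100, t_duration=1,
                      t_interval=10, t_cmp_settle=0))
    print(max_drift_between_locks(s_diode=1000, r_temp=1, t_interval=4))
```

Sweeping `t_interval` in such a script shows the trade-off directly: a shorter interval lowers the maximum drift between locks but increases the share of time during which OSCcmp draws its full power.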

Results
The system was fabricated in a 130 nm CMOS technology. The annotated die photo of the clock source is shown in Figure 8a. The breakout dimensions are: 522 µm × 215 µm for OSCcmp, 372 µm × 238 µm for OSCdiode, and 179 µm × 383 µm for the digital block. To implement the drift-based locking scheme, the estimated area is approximately 150 µm × 150 µm. The uncompensated oscillator OSCucmp from [12] was also implemented for comparison with the proposed ULP oscillator OSCdiode. The design layout showing the different components is shown in Figure 8b. Results across several chips for different aspects of the clock source system, such as SAR calibration bits, frequency ranges, power, stability, jitter, and locking schemes, are elaborated in the following sections.
Figure 8. (a) Annotated die photo of the clock source. The breakout dimensions are: 522 µm × 215 µm for OSCcmp, 372 µm × 238 µm for OSCdiode, and 179 µm × 383 µm for the digital block. The total system design area is 0.269 mm². OSCucmp from [12] was also implemented in this chip for evaluation and comparison with the proposed ULP OSCdiode. (b) Design layout showing the breakdown components.

SAR Calibration Bits
To calibrate for a specific clock frequency during system start-up, the on-chip DCOs can be locked to an external stable clock source, such as an XTAL oscillator, that sets their SAR calibration registers. In this work, the calibration was performed at a supply voltage of 0.7 V and room temperature (~27 °C). The SAR bits can also be set manually to achieve an initial system frequency. The SAR calibration bits are retained after every lock to preserve the locked frequency state. This means that every time the oscillator is powered down to save power and then powered up for the next lock, it does not lose its calibrated frequency value. Hence, the external stable clock source is not required for subsequent calibrations, making the system completely integrated.

The DCO SAR registers are also read-enabled, which makes temperature-drift-based locking possible. We are able to read, store, and compare successive SAR words to decide on the locking interval according to the temperature-drift algorithm. The write-enabled SAR registers make it possible to write in various values to switch between different frequencies. The system frequency is therefore programmable, which makes it suitable for dynamic frequency scaling (DFS) to save SoC processing power. In this test chip, the bits to be written into the SAR registers must be scanned in serially through a scan chain. Therefore, the bits are scanned into the scan chain starting 23 cycles (corresponding to the 23 SAR bits) before the time at which the frequency switch is scheduled.

There is a directly proportional relationship between the SAR bits and the oscillator frequency. To achieve higher frequencies at a particular supply voltage, more SAR bits are set. There is an inverse relationship between the SAR bits and the supply voltage. To achieve a particular frequency at a lower supply voltage, more SAR bits are set.

Frequency Range
The frequency ranges of both the uncompensated DCO (clock output) and the compensated DCO (clock that is locked to) limit the frequency range of the clock source system. The standalone frequency range of OSCcmp at a 0.7 V supply voltage is 12 kHz to 150 kHz. Although the upper bound of OSCdiode goes up to 600 kHz, OSCcmp limits the system frequency range because of its lower upper-bound frequency. The clock source system is therefore capable of generating a clock in the range of 12 kHz to 150 kHz at a supply voltage of 0.7 V. At 1.1 V, the frequency range is 30 kHz to 600 kHz. This programmable frequency range is useful in DFS for SoCs.

Power
The average power consumption of the individual components of the proposed system at room temperature (~27 °C) is shown in Table 1. OSCcmp, OSCdiode, and the digital block have separate power supply pads, which make it possible to measure the current consumption of each individual block. A Keithley 2401 SourceMeter was used as the DC power supply to precisely monitor power consumption over multiple locks and test chips. OSCcmp oscillates only during locking, when we measure its total power; when it is disabled during the interval between two locks, we measure only its leakage power. The total power consumption of the system is 36 nW at a 0.7 V VDD, with the digital block operating in the counter-based locking scheme at a 1 min locking interval. The system power consumption will be even lower for sub-threshold operation of the digital block at a 0.4 V VDD. The 0.4 V supply can be generated from 0.7 V using low-power switched-capacitor converters; however, their design was not included in this test chip. To make such measurements at lower voltages possible, separate VDD pads were provided for the different blocks. In this case, for the same locking interval, the system power consumption is 25 nW when OSCdiode is locked to OSCcmp. The prediction mode was not realized in the prototype, and its power is taken from simulations.

Jitter
We express RMS jitter in unit intervals (UI). The measured jitter is 0.0023 UIrms for OSCcmp and 0.0027 UIrms for OSCdiode, which is the clock output. This is better than [7] at 0.025 UIrms and [16] at 0.024 UIrms, both of which were designed for sensor systems. The oscillator in [17] has a lower jitter of 0.0014 UIrms, but it operates at a sub-Hz frequency. It is important to understand the application space of a clock with jitter in the ns range. Digital blocks operating at kHz frequencies and in the sub-threshold or near-threshold voltage domain can be operated with the above clock when they are designed to meet the timing constraints. Jitter affects the clock path and can cause setup or hold violations in a digital system. Therefore, it is accounted for by specifying stringent constraints in the digital synthesis flow. Jitter is also crucial in data converter applications. For high-speed data converters operating at mega-samples per second (MSps), the clock jitter is required to be in the ps range. But in the ULP application space, such as wearable technology, signals such as ECG (with bandwidth of up to 100 Hz) are sampled at a few kilo-samples per second (kSps). The data converters are able to operate at acceptable signal-to-noise ratios with RMS jitter in the ns range (23 ns for OSCcmp and 27 ns for OSCdiode). In the next section, we describe the effect of jitter on the stability of the clock source system.
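The conversion between the two jitter units used above is a single multiplication by the clock period. The helper below is a hypothetical illustration; it assumes the 100 kHz clock frequency used elsewhere in this paper, at which one unit interval is 10 µs.

```python
def ui_to_ns(jitter_ui: float, f_clk_hz: float) -> float:
    """Convert RMS jitter from unit intervals to nanoseconds."""
    period_s = 1.0 / f_clk_hz  # one unit interval equals one clock period
    return jitter_ui * period_s * 1e9


if __name__ == "__main__":
    f_clk = 100e3  # assumed 100 kHz clock
    print(ui_to_ns(0.0023, f_clk))  # OSCcmp
    print(ui_to_ns(0.0027, f_clk))  # OSCdiode
```

Under this assumption, 0.0023 UIrms and 0.0027 UIrms correspond to about 23 ns and 27 ns, matching the values quoted in the text.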

Stability
The effective average stability of the clock source system equals the stability of OSCcmp, which is measured to be 7 ppm/°C in the BSN-compatible temperature range of 20 °C to 40 °C at 0.7 V. The demonstrated clock source system targets body sensor applications that experience only limited temperature variation, as noted above. The proposed system is flexible enough to allow the DCOs to be replaced by improved versions with high stability over a wider temperature range for other applications. In this section, we describe the dependence of the stability of the clock source system on the jitter of the temperature-compensated DCO. The jitter measurements are presented below.
The jitter in an oscillator circuit can be deterministic or random. Deterministic jitter is caused by identifiable sources such as switching power supplies and data dependencies, which can be recognized and resolved. Random jitter, on the other hand, is caused by the electrical noise within the oscillator devices and follows a Gaussian distribution. The jitter of the on-chip DCOs was measured using the histogram method on an oscilloscope. Figure 9 shows the reference falling edge of both OSCcmp and OSCdiode (level-shifted to 1.2 V at the output pad in the test chip) together with the jitter histogram at this edge. Note that the time axes showing 0 to 200 ns in the plot simply denote the time grid of the plot and are not a measurement of the jitter. As indicated in the previous section, the RMS jitter values for OSCcmp and OSCdiode are 23 ns and 27 ns, respectively. The histograms follow a Gaussian distribution with a mean jitter of 0 as a result of deviation from the ideal edge in both the positive and negative directions. This is a characteristic of random jitter. This jitter is critical for the locking setup, because the uncompensated oscillator locks to the compensated oscillator within the accuracy of the jitter on the compensated clock. The number of edges within a "REF" comparison cycle may vary because of the jitter and cause the locked clock to be slightly higher or lower than REF_CLK. Since the mean of random jitter is 0, the locked frequency of OSCdiode will on average match the input frequency (OSCcmp).
A measured plot of the average frequency variation with the number of locks is shown in Figure 10. Initially, OSCdiode locks to OSCcmp with an accuracy of ~200 ppm due to the effect of jitter, and the average frequency variation of OSCdiode from OSCcmp approaches 0 as the number of locks increases (as jitter averages to 0 over a larger number of samples). This achieves a high average long-term stability for the output clock in the targeted application space. From measurements, we found that after approximately 100 locks of OSCdiode to OSCcmp, the average locked OSCdiode frequency tends to match the OSCcmp frequency. With a locking interval of 1 min, the time taken for 100 locks is approximately 100 min. This time depends on the chosen locking interval.

Figure 10. Measured average frequency variation vs. number of locks: The average of jitter from the Gaussian distribution was measured to be 0. As a result, over a number of locking samples the effective average frequency variation tends to be 0. From measurement, we found that after roughly 100 locks of OSCdiode to OSCcmp, the average frequency of OSCdiode tends to match that of OSCcmp. As we further continue locking, the effective long-term average frequency variation remains 0.
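The averaging behavior can be illustrated with a simple statistical sketch. It is not a model of the measured data: it assumes that the frequency error of each lock is an independent, zero-mean Gaussian sample, and it treats the ~200 ppm initial accuracy as the standard deviation of that error. The function name and defaults are hypothetical.

```python
import random


def running_mean_error_ppm(n_locks, sigma_ppm=200.0, seed=0):
    """Running average of per-lock frequency error over successive locks."""
    rng = random.Random(seed)  # seeded so that repeated runs agree
    total = 0.0
    means = []
    for k in range(1, n_locks + 1):
        # Assumption: each lock error is zero-mean Gaussian and independent.
        total += rng.gauss(0.0, sigma_ppm)
        means.append(total / k)
    return means


if __name__ == "__main__":
    means = running_mean_error_ppm(10_000)
    for k in (1, 10, 100, 1_000, 10_000):
        print(k, round(means[k - 1], 2))
```

Under these assumptions the spread of the running average falls as $\sigma/\sqrt{N}$, so after about 100 locks it is roughly a tenth of the single-lock error, in line with the trend described for Figure 10.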


Locking Scheme
To demonstrate the locking scheme, the temperature profile of a thermal chamber enclosing the chip was set to vary by 1 °C/min. The locking interval of the digital block was set at 4 min. With a 4-min interval, the uncompensated oscillators drift away and then re-lock to the compensated oscillator. The locking pattern is shown in Figure 11, and it confirms the locking principle proposed in [12]. Thus, the frequent locking ensures that the effective long-term stability stays within the stability bound of the compensated OSCcmp.

Figure 11. Frequency vs. time/temperature plot: The uncompensated oscillator locks to the compensated oscillator at programmable intervals. Here the locking interval is set to 4 min. The temperature was increased at the rate of 1 °C/min. The result matches the locking principle shown in Figure 2. However, the re-locking is indicated by sloping lines (taking 1 min) because the measurement was done at a 1 min resolution. In reality, the frequency immediately reaches the reference frequency (100 kHz in this plot) as soon as it is locked, as shown in Figure 2.
The 32-bit counter value can be programmed in the range of microseconds to hours. This provides the flexibility to choose the locking rate according to the stability requirement of the application. A higher locking rate, however, implies that the compensated oscillator is powered on more often, and consequently the system consumes more power. The counter-based locking scheme is periodic in nature. When there is no significant drift in temperature and oscillator frequency, it unnecessarily consumes extra power during locking events. To address this, the temperature drift is predicted using an algorithm and the savings are analyzed. The algorithm was verified off-chip as it was not implemented in the prototype. However, the read-enabled SAR configuration bits indicate the temperature change in chip measurements, and the manual re-lock capability allowed us to emulate the algorithm and estimate its savings. Figure 12 shows a sample temperature profile over 75 min for temperatures between 20 °C and 30 °C. In this profile, the temperature does not increase linearly; it stays constant over certain periods of time.
The y-axis in Figure 12 indicates the decimal equivalent of the SAR calibration bits of OSCdiode. In the counter-based locking scheme, the locking is performed every minute; hence the number of locks is 74. With the drift-based algorithm, the number of locks is reduced to 41, because the uncompensated oscillator is not re-locked when there is no temperature drift. This implies ~1.8× fewer locks than the counter-based scheme. Hence, the high-power oscillator is turned on ~1.8× less often, implying further power savings in an energy-constrained system. The number of locks is optimized, and the savings depend heavily on the temperature change profile.

Power Supply Noise
The sensitivity of frequency to power supply variation of both OSCcmp (temperature-compensated clock) and OSCdiode (clock source system output clock) is 0.1%/mV. The supply sensitivity reported in this paper is measured without a voltage regulator and is therefore high compared to prior ULP on-chip clock sources. It is nevertheless comparable to the supply sensitivity of +0.08/−0.04%/mV in [18] (+4%/−2% at ±50 mV offset), where it was indicated that supply noise is a minor concern in this application domain because of the low switching activity of the sensor node. It is also comparable to the voltage stability of [7] (0.5% per 1% VDD at 600 mV, which is equivalent to 0.083%/mV). It is better than the 0.42%/mV in [19], but it likewise necessitates voltage regulation using an ultra-low-power voltage reference [20], as described in [19]. In [18], the line sensitivity was improved by 11.7× by regulating the local supply voltage.
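The unit conversions used in this comparison can be reproduced with two small helpers. The function names below are hypothetical; the inputs are the figures quoted above for [7] and [18].

```python
def pct_per_mv_from_relative(freq_pct: float, vdd_pct: float, vdd_mv: float) -> float:
    """Convert 'freq_pct % per vdd_pct % of VDD' at vdd_mv into %/mV."""
    delta_mv = vdd_mv * vdd_pct / 100.0  # supply change in mV
    return freq_pct / delta_mv


def pct_per_mv_from_offset(freq_pct: float, offset_mv: float) -> float:
    """Convert a frequency change at a given supply offset into %/mV."""
    return freq_pct / offset_mv


if __name__ == "__main__":
    print(pct_per_mv_from_relative(0.5, 1.0, 600.0))  # [7]: 0.5% per 1% VDD at 600 mV
    print(pct_per_mv_from_offset(4.0, 50.0))          # [18]: +4% at +50 mV
    print(pct_per_mv_from_offset(2.0, 50.0))          # [18]: -2% at -50 mV (magnitude)
```

A 1% change of a 600 mV supply is 6 mV, so 0.5% per 1% VDD is about 0.083%/mV; likewise, 4% and 2% over 50 mV give 0.08%/mV and 0.04%/mV.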
Similarly, recent on-chip oscillators with frequencies in the kHz range and power consumption in the nW range [10,11,21–23] have also exhibited lower line sensitivity, as shown in Table 2. In [10], the supply sensitivity is as low as 0.09%/V due to the regulated local supply. The regulator is a simple NMOS voltage follower circuit and a replica inverter that is flipped and biased by a reference current. This produces a local regulated supply. The proposed clock source similarly requires a very stable supply, and a regulated local supply voltage such as that in [10] would improve the line sensitivity of the system. Such a system can also be easily powered using a ULP unity-gain buffer, providing a very stable, low-noise supply. This technique is used in [24] to provide a stable band-gap reference voltage.

Comparison with State-of-the-Art On-chip Clock Sources
Table 2 shows the comparison of this work with state-of-the-art relaxation, ring, RC, and gate-leakage-based on-chip clock sources. The metrics can be improved by using more stable and lower-power DCOs in the clock source platform. The proposed system has a wide frequency range of 12 kHz to 150 kHz at 0.7 V, and reaches higher frequencies of up to 600 kHz at 1.1 V. A major advantage of this system compared to the prior sources is the capability to easily scale the operating frequency.
The clock system in this work is capable of achieving the average stability of OSCcmp, i.e., 7 ppm/°C between 20 °C and 40 °C. In this paper, a BSN-compatible temperature range as in [12] is used for stability analysis. For a wider temperature range, the inaccuracy is higher, on the order of a few hundred ppm/°C. Prior on-chip clock sources in the kHz frequency range such as [10,11] have demonstrated high stability over an extended temperature range. However, the stability in this work targets applications in environments that are not harsh, such as the human body.
The clock system's total power consumption is 36 nW at 0.7 V and 100 kHz. Figure 13 is a plot of inaccuracy vs. power consumption of the on-chip clock sources from Table 2. Among these, [9] has the lowest power consumption, but its operating frequency is in the Hz range. The proposed on-chip clock source has the lowest power consumption among the oscillators in the kHz range. Figure 14 is a plot of inaccuracy vs. energy per clock cycle. Our work consumes the least energy per clock cycle among kHz-range on-chip oscillators and is comparable to the Hz-range timer [9].
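Energy per clock cycle is the power consumption divided by the clock frequency. The short sketch below applies this to the operating point stated above (36 nW at 100 kHz) and, for comparison, to the figures quoted in the Introduction for [9] (4.2 pW at 18 Hz). The function name is hypothetical, and the values plotted in Figure 14 may have been derived differently.

```python
def energy_per_cycle_pj(power_w: float, f_clk_hz: float) -> float:
    """Energy consumed per clock cycle, in picojoules."""
    return power_w / f_clk_hz * 1e12


if __name__ == "__main__":
    print(energy_per_cycle_pj(36e-9, 100e3))  # this work: 36 nW at 100 kHz
    print(energy_per_cycle_pj(4.2e-12, 18.0))  # [9]: 4.2 pW at 18 Hz
```

This gives about 0.36 pJ per cycle for this work and about 0.23 pJ per cycle for [9], which is consistent with the two being described as comparable.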
The voltage stability of the system is 0.1%/mV.The line sensitivity can be improved by using a locally regulated voltage supply as discussed in Section 3.7.The area of the clock source system is 0.269 mm 2 (excluding bond pads) and it avoids the use of off-chip XTALs and parasitics.The area is comparable to the design in [11], but is higher than the rest of the designs in Table 2 and this metric is traded-off to gain in the other vital aspects of ULP IoT devices such as power and energy.
Besides being capable of acting as a stable clock source, the proposed work offers a flexible platform for clock sources. The stable and the ULP oscillators described in this paper may be easily replaced with better designs to obtain better metrics for the output clock. Improving the jitter of the stable clock can improve the locking and thereby the stability of the output clock. The proposed digital block offers the flexibility to use the counter-based or drift-based locking schemes.

Conclusions
The proposed work demonstrates a completely on-chip clock source platform suitable for ULP BSN IoT devices. With the DCOs used in this system, it achieves the lowest power and energy consumption per cycle compared to prior on-chip kHz frequency oscillators and a long-term average stability of 7 ppm/°C between 20 °C and 40 °C at 0.7 V supply voltage. In this work, we have designed an improved uncompensated diode-based oscillator, OSCdiode, which consumes lower power and has better temperature and voltage stability than the uncompensated oscillator OSCucmp proposed in prior work [12]. The clock source includes an integrated digital block that automates the locking scheme and can function at supply voltages from 0.4 V to 1.1 V. We also proposed two modes of locking: a periodic counter-based locking scheme and a temperature-drift-based scheme for additional power savings. A stability-power trade-off and a wide range of programmable frequencies for DFS can also be achieved using this system, providing opportunities for further power savings. The power consumption of the system is 36 nW. With the availability of a sub-threshold domain in the SoC, a lower power consumption of 25 nW may be achieved. This clock source system offers flexibility to replace the DCOs with improved power and stability versions that follow the programmable DCO architecture. There is also scope for using a temperature sensor in conjunction with this system to monitor the temperature changes for locking.
Figure 1. The proposed on-chip clock-source system: A ULP temperature-uncompensated OSCdiode (system clock) locks to a duty-cycled higher-power stable OSCcmp. A reference clock may be used just for the initial calibration, which can also be achieved by setting the calibration bits, making the system fully on-chip. We demonstrate counter-based and temperature-drift-based locking schemes. © (2012) IEEE. Adapted with permission from A. Shrivastava and B. H. Calhoun, A 150 nW, 5 ppm/°C, 100 kHz On-Chip clock source for ultra low power SoCs; published by Custom Integrated Circuits Conference (CICC), 2012 IEEE.

Figure 3. Diode-connected transistor-based OSCdiode: Diode-connected transistors produce a virtual power rail (VDD-VIRTUAL) that drives the oscillator from the supply (VDD). OSCmain draws current when the calibration clock is enabled (calibration time). Its current drops when the calibration clock is disabled (settling time), and during this time OSCdummy balances the current draw.
allocating 1/16 of the REF period (1 REF_CLK cycle) for comparison and the other 15/16 of the REF period (15 REF_CLK cycles) to settle VDD-VIRTUAL. VDD-VIRTUAL takes more time to settle because of the diode charging it. Dividing REF_CLK by a number lower than 16 (such as by 2 as in OSCucmp [12], by 4, or by 8) results in insufficient time to stabilize it. Dividing by 16 gives a longer time (15 REF_CLK cycles) for the VDD-VIRTUAL rail to settle before the next comparison sets the next SAR bit.
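The cost of the longer settling window can be estimated with a short calculation. The sketch below assumes that each SAR bit decision occupies exactly one period of the divided reference (one REF_CLK cycle for comparison plus the remaining cycles for settling). The function name and the 10-bit SAR width are hypothetical, chosen only for illustration.

```python
def sar_lock_time_s(ref_clk_hz: float, n_bits: int, divide_by: int = 16) -> float:
    """Time to set all SAR bits, assuming one divided-REF period per bit.

    Each bit uses 1 REF_CLK cycle for comparison and (divide_by - 1)
    REF_CLK cycles for the VDD-VIRTUAL rail to settle.
    """
    if ref_clk_hz <= 0 or n_bits <= 0 or divide_by < 1:
        raise ValueError("arguments must be positive")
    return n_bits * divide_by / ref_clk_hz


# Hypothetical example: 100 kHz reference, 10-bit SAR word.
t_div16 = sar_lock_time_s(100e3, n_bits=10, divide_by=16)  # settling-friendly
t_div2 = sar_lock_time_s(100e3, n_bits=10, divide_by=2)    # as in OSCucmp [12]
print(f"divide-by-16: {t_div16 * 1e3:.2f} ms, divide-by-2: {t_div2 * 1e3:.2f} ms")
```

Under these assumptions, dividing by 16 lengthens each calibration event eightfold relative to dividing by 2, which is the price paid for letting the rail settle before each bit decision.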

Figure 4. (a) DCO architecture [12]: The locking circuit compares the frequencies of DCO and REF and then sets the digital storage bits of the DCO using Successive Approximation Register (SAR) logic. After all the bits are set, the DCO frequency is locked to the REF_CLK frequency. (b) Frequency comparison [12]: A 5-bit counter counts the number of DCO cycles within a REF cycle. If the DCO frequency is greater than REF, the comparator output goes high; otherwise it goes low. © (2012) IEEE. Adapted with permission from A. Shrivastava and B. H. Calhoun, A 150 nW, 5 ppm/°C, 100 kHz On-Chip clock source for ultra low power SoCs; published by Custom Integrated Circuits Conference (CICC), 2012 IEEE.
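The SAR-based locking described in this caption can be sketched in software as a binary search over the DCO control word. This is a behavioral illustration only, not the on-chip implementation: the linear DCO model, its `f_min_hz` and `step_hz` values, the 8-bit word width, and all function names are assumptions, and the counter-based comparator is reduced to a direct frequency comparison.

```python
def dco_frequency(code: int, f_min_hz: float = 50e3, step_hz: float = 500.0) -> float:
    """Hypothetical monotonic DCO model: frequency rises linearly with the code."""
    return f_min_hz + code * step_hz


def dco_faster_than_ref(dco_hz: float, ref_hz: float) -> bool:
    """Stand-in for the counter-based comparator: high when DCO is faster than REF."""
    return dco_hz > ref_hz


def sar_lock(ref_hz: float, n_bits: int = 8) -> int:
    """Set the DCO bits from MSB to LSB, keeping a bit unless the DCO overshoots REF."""
    code = 0
    for bit in reversed(range(n_bits)):
        trial = code | (1 << bit)          # tentatively set this bit
        if not dco_faster_than_ref(dco_frequency(trial), ref_hz):
            code = trial                   # DCO not faster than REF: keep the bit
    return code


locked = sar_lock(100e3)
print(locked, dco_frequency(locked))  # 100 100000.0
```

With a monotonic DCO, the loop converges to the largest code whose frequency does not exceed REF, so the residual error is bounded by one frequency step of the model.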

Figure 5. VDD-VIRTUAL rail stabilization for OSCdiode: REF_CLK is divided by 16 to give the VDD-VIRTUAL rail sufficient time to settle before setting the next SAR bit. Without OSCdummy, VDD-VIRTUAL starts to increase during the "low" time of REF_CLK/16 (settling time), causing the oscillator frequency to settle to a wrong value. With OSCdummy, the frequency settles to the right value as VDD-VIRTUAL is stabilized by balancing the current draw.

Figure 6. Temperature-compensated OSCcmp [12]: (a) The frequency of the oscillations depends on the current Io and the capacitance CL. The constant current Io is the sum of PTAT and CTAT current sources [12]. MIM caps used for CL are also resistant to temperature variations. (b) Simulation result [12] shows that Io varies by only 1% over a 100 °C range. (c) Second-order compensation [12]: A pull-up path for adding charge to CL. © (2012) IEEE. Reproduced with permission from A. Shrivastava and B. H. Calhoun, A 150 nW, 5 ppm/°C, 100 kHz On-Chip clock source for ultra low power SoCs; published by Custom Integrated Circuits Conference (CICC), 2012 IEEE.

Figure 7. A temperature-drift-based locking algorithm: The difference between the current and the previous SAR bits indicates the temperature drift. The locking interval is set according to the above state machine.
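The idea behind the drift-based scheme can be sketched as an interval-update rule driven by the change in the SAR word between consecutive locks. The state machine in the figure is not reproduced here; the policy below (double the interval when the code is unchanged, hold it for a small change, fall back to the minimum for a large change), the threshold, and the interval limits are all hypothetical choices made for illustration.

```python
def next_lock_interval(prev_code: int, curr_code: int, interval_s: int,
                       min_s: int = 60, max_s: int = 480,
                       threshold: int = 1) -> int:
    """Hypothetical drift policy: relock sooner when the SAR code moved more."""
    drift = abs(curr_code - prev_code)     # proxy for temperature drift
    if drift > threshold:
        return min_s                       # fast drift: shortest interval
    if drift == 0:
        return min(interval_s * 2, max_s)  # no drift: back off, up to max_s
    return interval_s                      # small drift: keep the interval


# Example trace of SAR codes from successive locks (made-up values).
codes = [100, 100, 100, 101, 104, 104]
interval = 60
for prev, curr in zip(codes, codes[1:]):
    interval = next_lock_interval(prev, curr, interval)
    print(prev, curr, interval)
```

Any rule of this kind trades a bounded increase in frequency error between locks for fewer activations of the higher-power OSCcmp, which is where the power saving of the drift-based mode comes from.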

Figure 8. (a) Annotated chip die photo. The design was implemented as a part of a larger test chip. The breakout dimensions are: 522 µm × 215 µm for OSCcmp, 372 µm × 238 µm for OSCdiode, and 179 µm × 383 µm for the digital block. The total system design area is 0.269 mm². OSCucmp from [12] was also implemented in this chip for evaluation and comparison with the proposed ULP OSCdiode. (b) Design layout showing the breakdown components.

J. Low Power Electron. Appl. 2016, 6, 7

With a locking interval of 1 min, the time taken for 100 locks is approximately 100 min. This time depends on the chosen locking interval.

Figure 9. The histogram measurements of DCO jitter follow a Gaussian distribution with mean 0.

Figure 10. Measured average frequency variation vs. number of locks: The average of the jitter from the Gaussian distribution was measured to be 0. As a result, over a number of locking samples, the effective average frequency variation tends to 0. From measurement, we found that after roughly 100 locks of OSCdiode to OSCcmp, the average frequency of OSCdiode tends to match that of OSCcmp. As we continue locking, the effective long-term average frequency variation remains 0.
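The averaging effect described in this caption follows from basic statistics if the per-lock frequency errors are treated as independent, zero-mean Gaussian samples with a common standard deviation. That independence assumption, the function names, and the unit sigma used below are ours; the sketch only illustrates how the spread of the running average shrinks with the number of locks.

```python
import math
import random


def std_of_average(sigma: float, n_locks: int) -> float:
    """Standard deviation of the mean of n independent zero-mean errors."""
    if n_locks <= 0:
        raise ValueError("n_locks must be positive")
    return sigma / math.sqrt(n_locks)


def simulated_average_error(sigma: float, n_locks: int, seed: int = 0) -> float:
    """Monte Carlo average of n Gaussian per-lock errors (fixed seed for repeatability)."""
    rng = random.Random(seed)
    return sum(rng.gauss(0.0, sigma) for _ in range(n_locks)) / n_locks


# With sigma normalized to 1, averaging 100 locks shrinks the spread tenfold.
print(std_of_average(1.0, 100))
print(simulated_average_error(1.0, 100))
```

Under these assumptions the spread of the average falls as one over the square root of the number of locks, so about 100 locks reduce it by a factor of 10, in line with the observation that the average frequency of OSCdiode approaches that of OSCcmp after roughly 100 locks.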

Figure 12. A temperature profile lasting 75 min was set in the temperature chamber. The number of locks performed using the counter-based scheme was 74 (1 lock/min). With the drift-based locking scheme, the number of locks performed was reduced to 41. Fewer locks imply savings in power.

Figure 13. Plot of inaccuracy vs. power for state-of-the-art on-chip clocks: The proposed on-chip clock source system has the lowest power among on-chip oscillators in the kHz range.

Figure 14. Plot of inaccuracy vs. energy per cycle for state-of-the-art on-chip clocks: This work has the lowest energy per clock cycle among on-chip oscillators in the kHz range.

Table 1. Power consumption of components at room temperature (~27 °C).

Table 2. Comparison to prior state-of-the-art on-chip clock sources.

Average stability as the number of locks increases: The time taken for 100 locks with a 1-min locking interval is ~100 min. This time depends on the chosen locking interval.