Ultra-Low Power, Process-Tolerant 10T (PT10T) SRAM with Improved Read/Write Ability for Internet of Things (IoT) Applications

Abstract: In this paper, an ultra-low power (ULP) 10T static random access memory (SRAM) that operates at sub-threshold voltage is presented for Internet of Things (IoT) applications. The proposed SRAM can operate at low supply voltages with high static and dynamic noise margins. IoT applications require a battery-operated, low-leakage memory architecture in the subthreshold regime. Therefore, to reduce leakage power consumption and provide better cell stability, a power-gated robust 10T SRAM is presented. The proposed cell uses a power-gated p-MOS transistor to reduce the leakage (static) power in standby mode. Moreover, owing to the stacking of n-MOS transistors in the 10T SRAM latch and the separation of the read path from the latch, the static and dynamic noise margins in read and write operations show significant tolerance to variations in process, voltage, and temperature (PVT). The proposed SRAM shows significantly improved performance in terms of leakage power, read static noise margin (RSNM), write static noise margin (WSNM), write ability or write trip point (WTP), read–write energy, and dynamic read margin (DRM). Furthermore, these parameters of the proposed cell are observed on an 8-kilobit (Kb) SRAM and compared with existing SRAM architectures. From the Monte Carlo simulation results, it is observed that the leakage power of the proposed low-threshold-voltage (LVT) 10T SRAM is reduced by 98.76%, 98.6%, 6.7%, and 98.2% as compared to the LVT C6T, RD8T, LP9T, and ST10T SRAM, respectively, at 0.3 V VDD. Additionally, in the proposed 10T SRAM, RSNM, WSNM, WTP, and DRM are improved by 3×, 2×, 1.11×, and 1.32×, respectively, as compared to C6T SRAM. Similarly, the proposed 10T SRAM shows an improvement of 1.48×, 1.25×, and 1.1× in RSNM, WSNM, and WTP, respectively, as compared to RD8T SRAM at 0.3 V VDD.


Introduction
The high standby power of Internet of Things (IoT) devices has encouraged the development of ultra-low power subthreshold SRAM architectures [1,2]. Scaling down the supply voltage is the best way to reduce the total power consumption [3]. However, this reduces the on-to-off current ratio (I_ON/I_OFF), which in turn affects the read offset voltage required for sensing the bit stored in the SRAM cell. Moreover, in the sub-threshold region the drain current depends exponentially on the threshold voltage (V_th), which introduces numerous challenges such as high standby power, low read and hold stability, and read failure. Low read and write stability are also major factors that prevent conventional SRAM architectures from working in the sub-threshold region [4]. With the rapid growth of the Internet market, the IoT brings connectivity, communication, and data gathering to existing devices. The IoT allows countless devices to be connected and to communicate with each other to enrich the present lifestyle. Applications of the IoT range widely, from the traditional Internet and the industrial Internet to the consumer Internet [5].
The system-on-chip (SoC) block of an IoT device contains various sub-systems such as sensors, communication ports, security blocks, a high-end processor, and on-chip memories, which consist of read-only memory (ROM) and static random access memory (SRAM). SRAM is divided into two types, namely fast and low power, each of which has its own advantages, features, and applications. IoT devices that use SRAM as memory storage need it for either its high speed or its low power consumption, depending on the application. Nevertheless, there is high demand for high-performance devices with ultra-low power consumption that can execute composite operations while running on portable devices. This demand is mostly driven by new-generation medical devices, handheld devices, and communication systems, and is met by IoT devices. The development of the IoT is headed in two distinct directions, i.e., smart wearables and automation. Wearable devices use SRAM as a memory element with a small footprint and low power consumption. Thus, SRAMs with high speed, high stability, and low power consumption are of significant importance to IoT devices and their applications.
Moreover, portable IoT devices communicate with each other and thus require an enormous amount of memory to store and process data. The memory requirement in the IoT depends on the target application. For instance, storing a huge amount of data and retaining it for a long period of time requires high-density memory such as DRAM and Flash. On the other hand, systems with high data transfer rates require fast SRAM, where high-speed data transfer is essential for communication between IoT devices. Therefore, SRAM is preferred as cache memory because of its faster response. Moreover, power efficiency and the robustness of such memory systems to variations in the process, voltage, and temperature (PVT) of metal oxide semiconductor (MOS) devices are two of the most important design constraints [4]. According to the literature, more than 40% of the active energy in modern high-performance processors is consumed by leakage current [6,7]. Moreover, a large number of the SRAM cells in present-generation on-chip cache memory are in a holding state most of the time, where leakage power dominates over dynamic power. Thus, leakage reduction has become an imperative concern in SRAM design.
Leakage power in subthreshold SRAM arises from variations in process parameters such as channel length, gate oxide, and drain-induced barrier lowering (DIBL), while device performance becomes unpredictable and unstable owing to noise from high static bitline voltage and thermal noise from temperature variations. To overcome these issues, a differential process-tolerant 10T SRAM cell is proposed in this paper. The rest of the paper is organized as follows: related work and motivation are detailed in Section 2. The proposed 10T SRAM cell and its 64 × 128 (8-Kb) array are presented in Sections 3 and 4, respectively. The operation of the 8-Kb SRAM is explained in Section 5. The simulation results and a discussion of the proposed 10T SRAM are given in Section 6. Section 7 summarizes the results. Lastly, Section 8 concludes the work.
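The effect of supply scaling on the on-to-off current ratio can be illustrated with the textbook subthreshold current expression. The sketch below is only an illustration under stated assumptions: the prefactor `i0`, the slope factor `n`, and the threshold voltage of 0.4 V are hypothetical values, not parameters of the 65 nm devices used in this work, and the model is meaningful only for VDD at or below the threshold voltage.

```python
import math

K_B_OVER_Q = 8.617333262e-5  # Boltzmann constant over electron charge, in V/K


def thermal_voltage(temp_k=300.0):
    """Thermal voltage kT/q in volts."""
    return K_B_OVER_Q * temp_k


def subthreshold_current(v_gs, v_ds, v_th, i0=1e-7, n=1.5, temp_k=300.0):
    """Textbook subthreshold drain current in amperes.

    i0 (prefactor) and n (slope factor) are assumed, illustrative values.
    """
    v_t = thermal_voltage(temp_k)
    return i0 * math.exp((v_gs - v_th) / (n * v_t)) * (1.0 - math.exp(-v_ds / v_t))


def on_off_ratio(vdd, v_th=0.4, n=1.5, temp_k=300.0):
    """I_ON (V_GS = VDD) divided by I_OFF (V_GS = 0), both at V_DS = VDD."""
    i_on = subthreshold_current(vdd, vdd, v_th, n=n, temp_k=temp_k)
    i_off = subthreshold_current(0.0, vdd, v_th, n=n, temp_k=temp_k)
    return i_on / i_off


if __name__ == "__main__":
    for vdd in (0.2, 0.3, 0.4):
        print(f"VDD = {vdd:.1f} V -> I_ON/I_OFF ~ {on_off_ratio(vdd):.3g}")
```

In this simplified model the ratio reduces to exp(VDD / (n·kT/q)), so each step down in VDD shrinks I_ON/I_OFF exponentially, which is the sensing difficulty described above.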

Related Work and Motivation
Leakage power and cell stability are the key concerns in subthreshold SRAM architectures for improving the reliability, yield, and susceptibility of portable electronic devices. SRAM cells are in a static or holding state most of the time, which contributes to high leakage. Cell stability is also a foremost concern in the subthreshold region: noise generated from threshold variation, process variation, the half-select issue, and multiple bit errors reduces the stability of the SRAM cell. Consequently, various techniques have been developed to overcome the noise arising from these constraints, such as scaling the supply voltage using the process-variation-tolerant Schmitt trigger-based ST10T SRAM [8], a read static noise margin-free 7T SRAM [9], the differential data-aware power-supplied D2AP8T SRAM [10], and the low-leakage, variation-tolerant LP8T [11], LP9T [12], and LP10T SRAM [13]. Despite their advantages, these SRAM architectures have drawbacks such as larger area, high cost, low WTP, low WSNM, and a higher energy requirement in the subthreshold region. To address these factors while providing low leakage power and better cell stability, a differential ultra-low power, process-tolerant 10T SRAM is proposed in this paper.
The conventional 6T SRAM shown in Figure 1a tends to fail at subthreshold voltages because the read path is shared by M5 and M1. To improve RSNM, the 8T SRAM shown in Figure 1b reads information through a decoupled read path; however, the decoupled read increases the read access time. To improve reading speed with better RSNM, a Schmitt-triggered 10T (ST10T) cell, shown in Figure 1c, is presented in [8]. However, charge sharing still exists in ST10T during read and write, which degrades the read/write static noise margin and write ability. To overcome the problems of read delay and static noise margin, a low-leakage 10T (LP10T) SRAM is presented in [13], as shown in Figure 1d; it improves leakage power by introducing a virtual ground path and improves noise margins through a differential decoupled read. However, the stack transistor MN8 placed at the ground in [13] pushes the storage node toward meta-stability. Meta-stability is undesirable at subthreshold supply voltages, where a small rise in the storage node voltage can flip the state of the cell. To avoid meta-stability and reduce the leakage power in subthreshold operation, the proposed 10T cell (shown in Figure 1e) uses a power-gated p-MOS transistor. In addition, to improve the read and write static noise margins, a separate read path and a stacked n-MOS structure are used in the proposed 10T SRAM latch. The stacking of MN5-MN6 and MN7-MN8 results in a higher write trip point and WSNM. Furthermore, separating the read path (BL-MN2-MN_GND) from the latch completely isolates the storage nodes from the BLs, which improves the RSNM of the proposed 10T SRAM as compared to C6T, RD8T, and ST10T. However, improving these parameters requires more area and eventually raises the cost of the SRAM. The SRAM cell and the macroblock architecture with its area overhead are detailed in Sections 3 and 4, respectively.
J. Low Power Electron. Appl. 2017, 7, 24

Architecture of 10T SRAM Cell
Advances in technology and the reduction in IC development cost due to smaller die area have resulted in considerable improvement in digital electronic devices. If this development continues, it will result in a world where all digital electronic devices are linked to the Internet, known as the Internet of Things (IoT). Furthermore, millions of zettabytes of memory are required to handle the large-scale computations required by IoT devices. Thus, memory plays a vital role in future energy-efficient computing systems and, therefore, a subthreshold, low-leakage, high-stability SRAM is proposed in this section.
The proposed cell shown in Figure 1e has two write-access n-MOS transistors, MN1 and MN3, and two read-access transistors, MN2 and MN4. BL carries the single-bit information to be written, and BLB carries its complement. The layout of the proposed cell, along with those of the existing SRAM cells, is shown in Figure 2. The n-MOS transistors MN2 and MN4 are connected to the virtual ground n-MOS transistor MN_GND. BL and BLB are precharged to VDD before the read operation is performed. MP1-MP2 and MN5 to MN8 form a latch, where MP1 and MP2 are the pull-up transistors linked to virtual VDD (VVDD). In addition, MN5-MN6 and MN7-MN8 form stacks, which enhance the read and write noise margins. Furthermore, two transistors, MP_VDD and MN_GND, are shared by each row of the 8-Kb SRAM, as shown in Figure 3. MP_VDD is used as a power-gating p-MOS transistor, which disconnects the path between VDD and GND in the holding state to reduce the leakage power.

10T SRAM Cell-Based 8-Kb Macroblock and Area Overhead
To show the area overhead and give a better view of the functionality and placement of the SRAM cell and the various control blocks at array level, a simplified architecture of the 10T SRAM array is shown in Figure 3. The layout areas of the different SRAM architectures are shown in Figure 2, and the cell areas are compared in Table 1. The functioning of the 10T SRAM is explained in Table 2, which gives the truth table of the operations that take place in the proposed SRAM architecture. The SRAM array uses internal inputs derived from the RWL and WWL signals to control the power-gated p-MOS transistor. The proposed SRAM array in Figure 3 shows the complete correlation between the various control signals and illustrates the procedure of writing and reading using a high-speed differential current compensation sense amplifier (DCC-SA) [14][15][16]. It is observed that the proposed SRAM has a 17% and 27% reduction in footprint area as compared to ST10T [8] and LP10T [13] SRAM, respectively. In addition, leakage power is drastically reduced owing to power gating of the supply voltage VDD using the MP_VDD transistor. By combining the advantages of both ST10T (stacked n-MOS transistors) and LP10T (a separate read path), we achieve a significant improvement in static noise margin and leakage power at different process, voltage, and temperature (PVT) conditions at sub-threshold supply voltages. However, extra overhead is required at array level to improve leakage power and noise margins at different PVT conditions. Figure 3 shows the area overhead on the SRAM array introduced by two extra transistors in each row of the macroblock: MP_VDD (p-MOS) and MN_GND (n-MOS) are included in each row to improve leakage power and RSNM, respectively. An XNOR gate is also introduced in each row of the SRAM macro to provide a sleep input to the power-gated p-MOS transistor MP_VDD. The XNOR gate has four MOS transistors (two n-MOS and two p-MOS) integrated in each row of the array. These extra transistors have a channel width of 2 µm, which is 13.33× the width of the SRAM MOS transistors (150 nm). Table 3 compares the areas of the 8-Kb SRAM arrays built from C6T, RD8T, LP10T, ST10T, and PT10T SRAM cells. From the table it is observed that the proposed cell has 35% and 12.5% higher macroblock area as compared to C6T and RD8T, respectively. On the other hand, the proposed 10T SRAM has 17% and 27% less macroblock area as compared to ST10T and LP10T, respectively.
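The per-row overhead described above can be checked with a few lines of arithmetic. The sketch below counts devices only; it assumes that the 64 × 128 array is organized as 64 rows by 128 columns (the text does not state the orientation explicitly), and it does not estimate silicon area, since the overhead transistors are much wider than the cell transistors.

```python
ROWS, COLS = 64, 128          # assumed orientation of the 64 x 128 array
CELL_TRANSISTORS = 10         # transistors inside one PT10T cell
XNOR_TRANSISTORS = 4          # two n-MOS and two p-MOS per row
ROW_OVERHEAD = 2 + XNOR_TRANSISTORS  # MP_VDD + MN_GND + XNOR gate


def array_summary(rows=ROWS, cols=COLS):
    """Device counts for the cell array and the per-row overhead."""
    bits = rows * cols
    return {
        "bits": bits,
        "cell_devices": bits * CELL_TRANSISTORS,
        "overhead_devices": rows * ROW_OVERHEAD,
    }


def width_ratio(overhead_width_nm=2000.0, cell_width_nm=150.0):
    """Channel width of an overhead transistor relative to a cell transistor."""
    return overhead_width_nm / cell_width_nm


if __name__ == "__main__":
    summary = array_summary()
    print(summary)
    print(f"width ratio = {width_ratio():.2f}x")
```

Under the 64-row assumption, the overhead amounts to 384 devices against 81,920 cell transistors; the width ratio of 2 µm to 150 nm reproduces the 13.33× figure quoted in the text.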

Operations and Working of 10T SRAM
This section demonstrates the working principle of the proposed 10T SRAM in subthreshold operation. The operations are performed in an 8-Kb SRAM macroblock, and the read and write operations are carried out using a 64-bit column.

Read Operation
Before reading the information, BL and BLB are precharged to VDD using the precharge logic. The read operation is performed through an ultra-fast differential current compensation sense amplifier (DCC-SA) [14][15][16], with the read word line (RWL) HIGH, the write word line (WWL) LOW, and XNOR_I/P LOW. For a read '1' operation, logic 1 is stored at the storage node Q and RWL is kept HIGH, which turns ON MN_GND and MN2. This forms a discharging path across BL-MN2-MN_GND, and a voltage difference $\Delta V_{BL} = V_{BLB} - V_{BL} = V_{DD} - (V_{DD} - I_{read,BL} \times R_{MN2\text{-}GND})$ appears between the BL and BLB lines, which is sensed by the full-swing inverter sense amplifier [14][15][16]. Here, $I_{read,BL}$ is the cell current and $R_{MN2\text{-}GND}$ is the resistance through MN2 and MN_GND. The read access time is measured as the time between the activation of the RWL signal and the discharge of the bit line to the minimum potential required by the SA to read. The sensing voltage $\Delta V_{BL}$ required for the DCC-SA is 80 mV [14]. The read access time and power are measured across a 64-bit column of SRAM cells. The parasitic capacitance of the SRAM column is obtained after RC extraction of the Cadence layout in 65 nm UMC technology. The sense amplifier works well at subthreshold and near-threshold voltages ranging from 0.3 V to 0.7 V. The circuit topology of the DCC-SA is shown in Figure 4a and the simulated transient voltage at 0.7 V VDD is shown in Figure 4b. Figure 4b shows the operation of the SA: as soon as SEN and RWL are activated and chip select (CS) is HIGH, the SA detects the voltage difference (sensing voltage) between BL and BLB. The moment the voltage difference reaches the required sensing voltage, the output node SOB goes to logic HIGH, which reflects the logic '1' stored at storage node Q of the SRAM. The read operation of the proposed cell is shown in Figure 5a.
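The bit-line differential expression above, together with the 80 mV sensing requirement, can be turned into a rough numerical sketch. The values of the read current, path resistance, and bit-line capacitance used below are hypothetical placeholders, not extracted values from the 65 nm layout, and the constant-current estimate of the sensing time is an added simplification rather than the measurement procedure used in this work.

```python
def bitline_differential(vdd, i_read, r_path):
    """Delta V_BL = V_BLB - V_BL, with BLB held at VDD and BL pulled down
    by the read current through MN2 and MN_GND."""
    v_blb = vdd
    v_bl = vdd - i_read * r_path
    return v_blb - v_bl


def time_to_sense(c_bitline, i_read, v_sense=0.080):
    """Constant-current estimate of the time for BL to drop by v_sense volts.

    v_sense defaults to the 80 mV required by the DCC-SA.
    """
    return c_bitline * v_sense / i_read


if __name__ == "__main__":
    # Hypothetical operating point: 1 uA read current, 80 kOhm read path,
    # 20 fF bit-line capacitance, 0.3 V supply.
    dv = bitline_differential(vdd=0.3, i_read=1e-6, r_path=80e3)
    t = time_to_sense(c_bitline=20e-15, i_read=1e-6)
    print(f"Delta V_BL = {dv * 1e3:.1f} mV, time to sense = {t * 1e9:.2f} ns")
```

The second function makes the dependence explicit: for a fixed sensing voltage, the read access time scales with the bit-line capacitance and inversely with the read current.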

Write Operation
For the write operation, WWL is kept HIGH and RWL is kept LOW, which switches XNOR_OUT to a LOW value. Logic 1 is written to storage node Q through BL-MN1-Q, as shown in Figure 5b. The write '1' access time is measured from the instant the activated WWL signal reaches VDD/2 to the instant storage node Q reaches 90% of VDD. Similarly, the write '0' access time is measured from the instant the WWL signal reaches VDD/2 to the instant storage node Q falls to 10% of VDD. The half-select issue in the write operation is addressed by placing a WE signal at each column of the SRAM array, as shown in Figure 3. Separating the read and write paths using WE and REB separates the sensing and writing operations, which helps to negate the half-select read and write issue.
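The write access time definition above (WWL crossing VDD/2 to Q reaching 90% or 10% of VDD) can be applied to sampled transient waveforms as in the following sketch. The function names and the list-based waveform format are assumptions made for illustration; linear interpolation between samples is also an assumption and is not claimed to match the simulator's measurement routine.

```python
def first_crossing(times, values, level, rising=True):
    """Linearly interpolated time at which `values` first crosses `level`."""
    points = list(zip(times, values))
    for (t0, v0), (t1, v1) in zip(points, points[1:]):
        crossed = (v0 < level <= v1) if rising else (v0 > level >= v1)
        if crossed:
            return t0 + (level - v0) * (t1 - t0) / (v1 - v0)
    return None  # the waveform never crosses the level


def write_access_time(times, wwl, q, vdd, write_one=True):
    """Time from WWL reaching VDD/2 to Q reaching 90% (write '1')
    or 10% (write '0') of VDD; None if either event is missing."""
    t_start = first_crossing(times, wwl, 0.5 * vdd, rising=True)
    if write_one:
        t_end = first_crossing(times, q, 0.9 * vdd, rising=True)
    else:
        t_end = first_crossing(times, q, 0.1 * vdd, rising=False)
    if t_start is None or t_end is None:
        return None
    return t_end - t_start


if __name__ == "__main__":
    # Synthetic waveforms in arbitrary time units at VDD = 0.3 V.
    t = [0.0, 1.0, 2.0, 3.0, 4.0]
    wwl = [0.0, 0.3, 0.3, 0.3, 0.3]
    q = [0.0, 0.0, 0.15, 0.3, 0.3]
    print(write_access_time(t, wwl, q, vdd=0.3, write_one=True))
```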

Data Retention
In the proposed SRAM a coarse-grain power-gating technique is used, which turns OFF the power-gated p-MOS connected to VDD in standby mode and reduces the overall leakage power. However, during read and write operations in power-gated logic, the power switch at VDD must provide adequate supply voltage and current to the SRAM cells to maintain the hold stability and the bit information at the storage node. During standby, VDD is cut off by the power-gated p-MOS transistor (MP_VDD) and the SRAM cell is connected to VVDD. VVDD is set to a required value less than VDD that maintains the hold state, so that the SRAM can access its stored information in read operations.
In general, the memory cell remains in a static or holding state most of the time. Therefore, there is a very high possibility of increased leakage power in an SRAM cell under different PVT conditions. In the proposed PT10T SRAM, the control signal XNOR_OUT turns OFF MP_VDD, which reduces the leakage current by disconnecting the path of the latch from VDD to GND. In contrast, the C6T SRAM cell has no power-gating transistor such as MP_VDD to cut off this leakage path. In C6T and RD8T SRAM cells, there is a potential difference of VDD between the pull-up (PU) and pull-down (PD) transistors [18]. Therefore, the leakage current is higher in C6T and RD8T SRAM cells. In the proposed cell, owing to the power gating of VDD, the virtual node VVDD is set at a positive voltage [19] lower than VDD. This reduces the hold static noise margin (HSNM) but improves the leakage power, which is essential for IoT edge devices, as shown in Figure 6.
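The relationship between the word-line signals and the power-gating switch described in this section can be summarized as a small truth-table sketch. It assumes that the XNOR gate takes RWL and WWL directly as its two inputs and that the hold state corresponds to both word lines being LOW; the text states only that the control inputs are derived from RWL and WWL, so this mapping is an illustrative assumption rather than a reproduction of Table 2.

```python
def xnor(a, b):
    """Two-input XNOR: 1 when both inputs are equal, else 0."""
    return int(bool(a) == bool(b))


def cell_mode(rwl, wwl):
    """Operating mode implied by the word lines (assumed mapping)."""
    if rwl and wwl:
        raise ValueError("RWL and WWL are not asserted together in this sketch")
    if rwl:
        return "read"
    if wwl:
        return "write"
    return "hold"


def mp_vdd_is_on(rwl, wwl):
    """A p-MOS conducts when its gate is LOW, so MP_VDD is ON
    when XNOR_OUT is 0 and OFF (power gated) when XNOR_OUT is 1."""
    return xnor(rwl, wwl) == 0


if __name__ == "__main__":
    for rwl, wwl in ((0, 0), (1, 0), (0, 1)):
        state = "ON" if mp_vdd_is_on(rwl, wwl) else "OFF"
        print(f"RWL={rwl} WWL={wwl} mode={cell_mode(rwl, wwl):5s} "
              f"XNOR_OUT={xnor(rwl, wwl)} MP_VDD={state}")
```

With this mapping, XNOR_OUT is LOW during both read and write, so the latch is supplied from VDD, and HIGH in hold, so MP_VDD is turned OFF, which agrees with the behavior described above.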

Simulation Results of PT10T SRAM
The proposed 10T SRAM is simulated in 65 nm standard CMOS technology. Post-layout simulation results under 6-sigma (6σ) process variation are compared with those of C6T, RD8T, ST10T [8], and LP10T [13] SRAMs in terms of leakage power, read–write delay and power, power delay product (PDP), read static noise margin (RSNM), dynamic read margin (DRM), write static noise margin (WSNM), and write trip point (WTP).

Simulation Setup
The simulation setup consists of Monte Carlo post-layout simulations on an 8-Kb SRAM array. The read–write access time and power are measured by applying a write and a read operation to a single column of the array. The proposed SRAM macro has a 64-bit column coupled with a DCC-SA. The DCC-SA takes a small positive differential sensing voltage as input and produces an output equivalent to the information stored in the SRAM cell. To perform read and write operations, access transistors are used to separate the read and write paths. The row and column decoders are used to activate the WWL and RWL of the macro. All operations are observed at the worst-case process corner. The write and read operations are performed through BL and BLB. While reading the information from the SRAM, the bit-lines are pre-charged to VDD using a pre-charge signal. Furthermore, all observations are obtained from 1000 Monte Carlo simulations.
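The idea behind a 1000-run Monte Carlo analysis can be sketched in a few lines. The example below is not the post-layout flow used in this work: it samples only the threshold voltage from a Gaussian distribution and evaluates a simplified exponential leakage model, and the mean, sigma, slope factor, and current prefactor are all hypothetical values chosen for illustration.

```python
import math
import random
import statistics


def leakage_samples(n_runs=1000, vth_mean=0.40, vth_sigma=0.03,
                    n_factor=1.5, v_t=0.02585, i0=1e-7, seed=1):
    """Leakage current (A) per run for Gaussian V_th variation.

    Uses I_leak = i0 * exp(-V_th / (n * V_T)); every parameter is assumed.
    """
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    samples = []
    for _ in range(n_runs):
        v_th = rng.gauss(vth_mean, vth_sigma)
        samples.append(i0 * math.exp(-v_th / (n_factor * v_t)))
    return samples


def summarize(samples):
    """Mean, standard deviation, and worst case (largest) leakage."""
    return {
        "mean": statistics.mean(samples),
        "stdev": statistics.stdev(samples),
        "worst": max(samples),
    }


if __name__ == "__main__":
    stats = summarize(leakage_samples())
    for name, value in stats.items():
        print(f"{name}: {value:.3e} A")
```

Because leakage depends exponentially on the threshold voltage, a symmetric spread in V_th produces a skewed leakage distribution, which is why worst-case and mean values are both worth reporting.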

Write and Read Analysis
The write '1' access time is measured from the time the WWL signal is triggered to the time the storage node Q reaches 90% of VDD. Similarly, the write '0' access time is measured from the time the WWL signal is activated to the time the storage node Q reaches 10% of VDD. The write power is the product of the average current drawn and the supply voltage over the interval until the write completes. Figure 7a,b show the write '1' and write '0' delay, respectively, observed at different supply voltages in the worst-case (SS) process corner (PC). The delay of the LVT PT10T SRAM is compared with that of the existing LVT C6T, RD8T, LP10T [13], and LP9T [12] cells, which shows that the PT10T SRAM has a similar write access time. Similarly, Figure 8a,b show comparisons of write '0' and write '1' power, respectively, at different supply voltages in the fast-fast (FF) PC. The results show that the proposed LVT PT10T improves write '1' power by 8.5%, 12.6%, 19.35%, and 19.3% as compared to LVT C6T, RD8T, LP10T [13], and LP9T [12] SRAM, respectively, at 0.3 V VDD. The write '0' power is reduced by 5.2%, 1%, 10%, and 6.6% as compared to LVT C6T, RD8T, LP10T, and LP9T SRAM, respectively, at 0.3 V VDD.
Furthermore, Figure 9 shows comparison graphs of the energy consumption of various SRAM cells at different supply voltages in the SS and FF process corners for write '0' and write '1' operations. The results for the LVT PT10T SRAM show a reduction in write '1' energy of 1.23%, 14%, and 20% as compared to LVT RD8T, LP10T, and LP9T SRAM, respectively, at 0.3 V supply voltage. The write '0' energy is improved by 1%, 6.2%, and 5.2% as compared to RD8T, LP10T, and LP9T SRAM, respectively, at 0.3 V VDD. The read access time is measured from the activation of RWL until the pre-charged bit-lines (BLs) discharge and reach the minimum sensing voltage required by the SA [14][15][16]. From the DCC-SA architecture presented in [14], it is observed that a differential sensing voltage of 80 mV is required to read data from the SRAM. Figure 10a shows the simulation states of the SRAM at read time. The read access time of the proposed SRAM is compared with that of the existing SRAM cells in Figure 10b. In addition, the read power is measured until the differential voltage across the bit-lines reaches the sensing voltage. Figure 10c compares the read energy. It is observed that the read access of the proposed LVT 10T SRAM is 1.6× and 1.57× faster than C6T and ST10T [8], respectively, in the SS PC at 0.3 V VDD. In addition, the read energy of the proposed cell is reduced by 4% and 3% as compared to LVT C6T and LP10T SRAM, respectively, in the SS PC at 0.3 V VDD.
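The power and energy definitions used in this subsection can be written out as a short sketch. It assumes that the supply current is available as uniformly spaced samples over the operation window, and it takes the energy to be the product of average power and access time; both are simplifying assumptions for illustration, and the numeric values in the usage example are hypothetical.

```python
def average_power(current_samples, vdd):
    """Average power (W): mean of uniformly sampled supply current (A)
    over the operation window, multiplied by the supply voltage (V)."""
    return vdd * sum(current_samples) / len(current_samples)


def operation_energy(power, access_time):
    """Energy (J) of one operation as average power times access time."""
    return power * access_time


if __name__ == "__main__":
    # Hypothetical current samples during a write at VDD = 0.3 V.
    i_samples = [1e-6, 3e-6]
    p_write = average_power(i_samples, vdd=0.3)
    e_write = operation_energy(p_write, access_time=2e-9)
    print(f"write power = {p_write:.3e} W, write energy = {e_write:.3e} J")
```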

Write and Read Analysis at Different Threshold Voltage Transistors
The merits of the write and read operations of the proposed 10T SRAM and the other existing SRAMs are analyzed using high-threshold (HVT), standard-threshold (RVT), and low-threshold (LVT) MOS transistors. The bar graphs in Figure 11a,b compare the write access time and write energy, respectively, for the different threshold voltage transistors at 0.3 V VDD. Because of the stacked n-MOS transistors in the SRAM latch and the gated p-MOS transistor at VDD, Figure 11a shows a higher write access time for the proposed PT10T SRAM than for the other existing memory cells. However, the power gating at VDD reduces the write energy in the RVT- and HVT-based PT10T SRAM, as shown in Figure 11b, while the LVT SRAMs show similar write energy values. The write energy of the proposed HVT 10T SRAM is reduced by 43%, 46%, 44%, and 52% compared to the HVT C6T, RD8T, ST10T, and LP10T SRAM, respectively.
Further, Figure 12a,b illustrates the read access time and read energy, respectively, at 0.3 V VDD. Figure 12a shows that, across the different threshold voltage transistors, the proposed cell has a fast read access (similar to LP10T) compared to the other existing SRAMs, owing to the virtual-ground n-MOS transistor (MN_GND) with a large channel width (W = 2 µm). The read delay of the proposed RVT 10T SRAM is reduced by 61.6%, 69.5%, and 59% compared to the HVT C6T, RD8T, and ST10T SRAM, respectively. Similarly, the read delay of the proposed HVT 10T SRAM is reduced by 58.75%, 67%, and 58% compared to the HVT C6T, RD8T, and ST10T SRAM, respectively.

Standby or Leakage Power
Leakage current is measured while the SRAM cell is in the static (hold) condition. The static or leakage power is the power dissipated during hold. Figure 13a shows the leakage power versus supply voltage for the proposed LVT PT10T SRAM compared to the other existing LVT SRAMs. The proposed LVT 10T SRAM shows a significant improvement in leakage power (84 nW at 0.3 V VDD), which is reduced by 98.76%, 98.6%, 6.7%, and 98.2% compared to the LVT C6T, RD8T, LP9T, and ST10T SRAM, respectively, at 0.3 V VDD. Likewise, the PT10T SRAM using HVT transistors (Figure 14) shows a substantial reduction in leakage power per cell by 72%, 68.75%, 92.5%, and 57% as compared to the HVT C6T, RD8T, and ST10T SRAM, respectively. Furthermore, the leakage power is observed at temperatures ranging from 0 °C to 100 °C. Figure 13b shows that the leakage power of the proposed PT10T SRAM changes the least with increasing temperature. The results show that the PT10T SRAM has superior temperature tolerance compared to the existing SRAM architectures. This insensitivity of leakage power to temperature is well suited to battery-powered IoT devices, which are exposed to high junction temperatures and to external temperature variation due to changes in climate.

Read Static Noise Margin (RSNM) and Dynamic Read Margin (DRM)
Read static noise margin (RSNM) is measured by applying a DC noise voltage source at one of the storage nodes (Q or QB) and observing the effect on the other storage node. The RSNM is examined in the read operation, when RWL is HIGH and WWL is LOW. Because the read path (BL-MN2-MN_GND) is separated from the latch, a read does not disturb the storage nodes of the SRAM cell, which reduces the impact of static noise and, as a result, improves the RSNM. In Figure 15, the RSNM of the proposed LVT PT10T SRAM cell is measured as 107 mV, which is 3×, 1.48×, 1.48×, and 4% higher than that of the LVT C6T, RD8T, LP10T, and ST10T SRAM, respectively, at 0.3 V VDD. The RSNM values of the read-decoupled (RD) 8T, LP9T, and LP10T SRAM are identical to each other because of their similar read architecture.
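As a quick arithmetic check of the quoted RSNM factors, the sketch below divides the 107 mV of the proposed cell by the baseline values annotated in Figure 15 (36 mV for C6T and 72.3 mV for RD8T, LP9T, and LP10T). The dictionary and function names are hypothetical; only the millivolt values come from the paper.

```python
# RSNM values at 0.3 V VDD in mV: 107 mV is quoted in the text, while
# 36 mV and 72.3 mV are the annotations of Figure 15.
RSNM_MV = {
    "PT10T": 107.0,
    "C6T": 36.0,
    "RD8T": 72.3,
    "LP9T": 72.3,
    "LP10T": 72.3,
}


def improvement_factor(proposed_mv, baseline_mv):
    """Ratio of the proposed cell's margin to a baseline cell's margin."""
    return proposed_mv / baseline_mv


factors = {
    cell: improvement_factor(RSNM_MV["PT10T"], mv)
    for cell, mv in RSNM_MV.items()
    if cell != "PT10T"
}

for cell, factor in factors.items():
    print(f"PT10T vs {cell}: {factor:.2f}x ({(factor - 1) * 100:.0f}% higher)")
```

The ratio against C6T evaluates to about 2.97, which the text rounds to 3×, and the ratio against the read-decoupled cells is about 1.48, matching the quoted factor.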

Furthermore, the RSNM of the proposed 10T SRAM (Figure 16) is measured using HVT and RVT transistors. Similarly, the RSNM of ST10T and LP10T using HVT and LVT transistors is observed at 0.3 V VDD, as shown in Figure 17. From Figures 16a and 17a,c, it is observed that the HVT PT10T SRAM has 6% and 18.4% higher RSNM than the HVT ST10T and LP10T, respectively. Similarly, from Figures 16b and 17b,d, it is observed that the RVT PT10T SRAM has 6.4% and 27.7% higher RSNM than the RVT ST10T and LP10T, respectively. Furthermore, the butterfly curves for RSNM are observed at different temperature values for the C6T, RD8T, and PT10T SRAM, as shown in Figure 18a-c, respectively. In addition, the comparison of RSNM with respect to temperature is shown in Figure 18d. The figure shows that the RSNM of the proposed cell is considerably more tolerant to changes in temperature than that of the existing SRAMs.

Figure 15 annotations: RSNM of C6T = 36 mV; RSNM of LP10T, LP9T, and RD8T = 72.3 mV.

The read margin, in turn, is measured in a situation where no static DC noise disturbs the storage nodes. While measuring the read margin, RWL is driven HIGH and WWL is kept LOW. The read dynamic noise margin (RDNM) is observed when the bit-line reaches the sensing voltage: the voltage difference between the storage nodes Q and QB at that instant is defined as the RDNM. In our case, the RDNM is 290 mV, which is 7.5% better than that of the C6T SRAM, as shown in Figure 19a,b at 0.3 V VDD.
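The RDNM definition above (the Q to QB separation at the instant the bit-line reaches the sensing voltage) can be sketched as follows. The samples are synthetic and in millivolts, the function name `rdnm_at_sense` is hypothetical, and the bit-line signal is taken to be the differential voltage against an 80 mV sensing level, which is an assumption carried over from the read access time definition.

```python
def rdnm_at_sense(t, bl_diff, q, qb, v_sense):
    """Return (t_sense, |Q - QB|) at the first instant bl_diff reaches v_sense.

    All waveforms share the time axis `t`; values between samples are
    obtained by linear interpolation.
    """
    for k in range(1, len(t)):
        if bl_diff[k - 1] < v_sense <= bl_diff[k]:
            frac = (v_sense - bl_diff[k - 1]) / (bl_diff[k] - bl_diff[k - 1])
            t_sense = t[k - 1] + frac * (t[k] - t[k - 1])
            q_s = q[k - 1] + frac * (q[k] - q[k - 1])
            qb_s = qb[k - 1] + frac * (qb[k] - qb[k - 1])
            return t_sense, abs(q_s - qb_s)
    raise ValueError("bit-line differential never reaches the sensing voltage")


# Synthetic read at VDD = 300 mV (times in ns, voltages in mV).
t = [0, 1, 2, 3, 4, 5]
bl_diff = [0, 20, 40, 60, 80, 100]   # bit-line differential during the read
q = [300, 300, 300, 300, 300, 300]   # node storing '1'
qb = [0, 4, 8, 11, 12, 12]           # node storing '0', slightly disturbed

t_sense, rdnm = rdnm_at_sense(t, bl_diff, q, qb, v_sense=80)
print(f"sensing instant: {t_sense:.1f} ns, RDNM: {rdnm:.0f} mV")
```

The less the read disturbs the node holding '0', the closer the RDNM stays to VDD, which is the behavior a decoupled read path is meant to preserve.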

WSNM and Write Trip Point (WTP)
The write static noise margin (WSNM) is measured during the write operation by applying a linearly swept DC noise source at one of the storage nodes and observing its effect at the other node. WWL is kept HIGH, while RWL is LOW. The plots in Figures 20a,b and 21a,b show the WSNM of the C6T, LP9T, LP10T, and PT10T SRAMs, respectively, at the worst-case (SF) process corner. From Figure 20a,b, it is observed that the C6T and LP9T SRAM fail to write at 0.2 V, with a WSNM (20 mV) below the thermal voltage (28 mV). In addition, LP10T has a WSNM of 10 mV and 30 mV at 0.2 V and 0.3 V VDD, respectively. Therefore, it fails to work at 0.2 V and is only marginally above the thermal-voltage limit at 0.3 V (WSNM = 30 mV). Consequently, it is suggested to operate the LP10T SRAM above a 0.3 V supply voltage. The proposed 10T SRAM has a WSNM of 21 mV and 40 mV at 0.2 V and 0.3 V VDD, respectively, in the SF PC. From these results, operation at 0.2 V VDD is not recommended; at 0.3 V VDD, however, the cell retains an adequate write margin. The WSNM of PT10T is 2×, 2×, and 1.33× higher than that of the C6T, LP9T, and LP10T SRAM, respectively, at 0.3 V VDD.
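The pass/fail reasoning above compares each WSNM value against the thermal-voltage limit. A minimal sketch of that comparison is given below, using only the LP10T and PT10T values and the 28 mV limit quoted in the text; the names `THERMAL_LIMIT_MV`, `WSNM_MV`, and `write_margin` are hypothetical.

```python
THERMAL_LIMIT_MV = 28.0  # thermal-voltage limit as quoted in the text

# WSNM in mV at the worst-case (SF) corner, keyed by supply voltage in volts.
WSNM_MV = {
    "LP10T": {0.2: 10.0, 0.3: 30.0},
    "PT10T": {0.2: 21.0, 0.3: 40.0},
}


def write_margin(wsnm_mv, limit_mv=THERMAL_LIMIT_MV):
    """Return (passes, headroom_mv) for one WSNM value against the limit."""
    headroom = wsnm_mv - limit_mv
    return headroom > 0, headroom


results = {}
for cell, by_vdd in WSNM_MV.items():
    for vdd, wsnm in sorted(by_vdd.items()):
        passes, headroom = write_margin(wsnm)
        results[(cell, vdd)] = (passes, headroom)
        status = "writes" if passes else "fails"
        print(f"{cell} at {vdd:.1f} V: WSNM {wsnm:.0f} mV, {status}, "
              f"headroom {headroom:+.0f} mV")
```

With these numbers both cells fail at 0.2 V, LP10T clears the limit at 0.3 V by only 2 mV, and PT10T clears it by 12 mV, which mirrors the recommendation to run LP10T above 0.3 V.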

Moreover, Figure 22 shows the WSNM of the proposed 10T SRAM and the other SRAMs at different temperature values at 0.3 V. The WSNM of the proposed SRAM, measured at the worst-case process corner, ranges from 20 mV to 63 mV as the temperature changes. The figure shows that the proposed cell achieves the best WSNM among the compared SRAMs.
Additionally, the write trip point (WTP) is a key parameter of SRAM write ability. It is measured by linearly varying WWL from 0 to VDD while writing single-bit information through the bit-lines [20]. The WTP is taken at the crossover point of Q and QB, as shown in Figure 23a, as the difference between VDD and the voltage at the crossover. The WTP is compared at different supply voltages in the TT process corner, as shown in Figure 23b. The proposed SRAM has 11%, 9.75%, 3.3%, and 3.4% better WTP than the C6T, RD8T, LP10T, and LP9T SRAM, respectively, at 0.3 V VDD.
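The WTP measurement can be sketched as a search for the Q/QB crossover along the WWL sweep. The sweep data below are synthetic and in millivolts, the function name `write_trip_point` is hypothetical, and the "voltage at the crossover" is interpreted here as the WWL voltage at the instant Q and QB cross, which is an assumption about the wording of the definition.

```python
def write_trip_point(wwl, q, qb, vdd):
    """WTP = VDD minus the WWL voltage at which Q and QB cross.

    `wwl`, `q`, and `qb` are samples taken along a linear WWL sweep from
    0 to VDD; the crossover is located by linear interpolation.
    """
    for k in range(1, len(wwl)):
        d_prev = q[k - 1] - qb[k - 1]
        d_curr = q[k] - qb[k]
        if d_prev * d_curr <= 0:          # sign change: the cell has flipped
            if d_prev == d_curr:          # both zero, crossover at sample k-1
                return vdd - wwl[k - 1]
            frac = d_prev / (d_prev - d_curr)
            wwl_cross = wwl[k - 1] + frac * (wwl[k] - wwl[k - 1])
            return vdd - wwl_cross
    raise ValueError("Q and QB never cross: the write did not succeed")


# Synthetic sweep at VDD = 300 mV while writing '0' into Q (all values in mV).
VDD_MV = 300
wwl = [0, 50, 100, 150, 200, 250, 300]
q = [300, 300, 290, 250, 100, 10, 0]
qb = [0, 0, 10, 50, 200, 290, 300]

wtp = write_trip_point(wwl, q, qb, VDD_MV)
print(f"write trip point: {wtp:.1f} mV")
```

Under this interpretation, a cell that flips at a lower WWL voltage has a larger WTP and is therefore easier to write.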

Comparison to FinFET-Based SRAM
FinFETs are extensively used in sub-nanometer technologies to reduce the threshold variations seen in bulk CMOS [21,22]. The main advantage of the FinFET device is that it reduces random dopant fluctuation (RDF), which is the key cause of threshold voltage variation in ultra-low voltage (ULV) operation. Although FinFET devices have many benefits, a FinFET-based C6T SRAM does not provide the yield required in the subthreshold region, and it is not easy to improve its read and write stability in ULV operation. For further analysis of FinFET-based SRAM, Younghwi et al. [22] presented FinFET C6T, RD8T, and ST10T SRAM architectures in 14 nm technology. The results in Table 4 show that the FinFET-based SRAM designs have a much higher read and write access speed than the proposed bulk CMOS-based PT10T SRAM. However, the major constraints on today's IoT devices are leakage power and energy, not speed. From Table 4, it can be observed that bulk CMOS remains the technology of choice for low-power and low-energy devices.

Summary of Results
Table 4 summarizes the post-layout simulation results of the proposed PT10T 8-Kb SRAM at a 0.3 V power supply. The parameters are observed at 27 °C (room temperature) in the worst-case process corner. Table 4 shows that the leakage power of the proposed LVT 10T SRAM is reduced by 98.76%, 98.6%, 6.7%, and 98.2% compared to the LVT C6T, RD8T, LP9T, and ST10T SRAM, respectively, at 0.3 V VDD. The WSNM is improved by 2×, 25%, 2×, 33.33%, and 14.3% compared to the C6T, RD8T, LP9T, LP10T, and ST10T SRAM, respectively, at the worst-case (SF) process corner. The RSNM is enhanced by 3×, 1.48×, 1.48×, 1.48×, and 3.88% compared to the C6T, RD8T, LP9T, LP10T, and ST10T SRAM, respectively, at the TT process corner. The dynamic read margin is determined during the read operation and improves by 31.8% and 8.2% compared to the C6T and ST10T SRAM, respectively. Moreover, the WTP is improved by 11.11%, 9.75%, 3%, 3%, and 9.75% compared to the C6T, RD8T, LP9T, LP10T, and ST10T SRAM, respectively, at the TT process corner. The write '1' energy is reduced by 6%, 2%, and 6% compared to the LP9T, LP10T, and ST10T SRAM, respectively, at the worst-case (SS) process corner at 0.3 V VDD. The read '1' energy is reduced by 4.3%, 12%, 6.4%, 4%, and 1% compared to the C6T, RD8T, LP9T, LP10T, and ST10T SRAM, respectively, at the worst-case (SS) process corner at 0.3 V VDD. On the other hand, the footprint area of the proposed cell has 35% and 12.5% more overhead than the C6T and RD8T SRAM, respectively. In addition, the proposed 10T SRAM has 17% and 27% less macroblock area than ST10T and LP10T, respectively. Therefore, the proposed 10T SRAM is an attractive choice for today's battery-operated, IoT-enabled system-on-chip (SoC) applications, where leakage power consumption and cell stability are of primary concern.
Furthermore, Table 6 compares all of the fundamental parameters measured using the different threshold voltage transistors (HVT, RVT, and LVT). The inter-die and intra-die process/mismatch variations are considered at a 0.3 V power supply and room temperature. All of the parameters are measured at the worst-case process corner. From the table, it can be observed that the proposed 10T SRAM shows the best performance in terms of write power, read-write energy, leakage power, RSNM, and WSNM. The objective of this paper, a memory architecture for ULP IoT applications, is thus met by the low-leakage, high-stability, and low read-write energy PT10T SRAM.

Conclusions
This paper presents a robust ULP process-tolerant 10T SRAM cell for IoT applications. Low-voltage operation raises concern about process variations in subthreshold SRAM design. However, the low leakage and high read-write stability across PVT variations make the proposed cell a leading alternative to conventional memory cells for IoT applications. The proposed SRAM uses the power gating technique to reduce the standby power and a decoupled read path to improve the RSNM. The proposed cell also shows a substantial improvement in RSNM, WSNM, and WTP compared to the existing SRAMs.

Figure 3. Simplified array architecture of 10T SRAM, where the read is performed using a high-speed differential current compensation sense amplifier (DCC-SA) [14].

Figure 5. (a) State of read '1' operation in proposed 10T SRAM. (b) State of write '1' operation in proposed 10T SRAM. The power-gated PMOS is turned ON for read-write operation.

Figure 6. Schematic diagram of proposed 10T SRAM cell in standby mode.

Figure 10. (a) State of input variables in read operation, where micro (µ) denotes 10^−6, (b) comparison of read access time at SS process corner, and (c) comparison of read energy calculated at different supply voltages.

Figure 12. Comparison of (a) read delay and (b) read energy in different threshold voltage SRAMs.

Figure 13. (a) Leakage power compared at different supply voltages and (b) leakage power compared at different temperature values at 0.3 V VDD.

Figure 14. Comparison of leakage power/cell for different threshold voltage SRAMs at 0.3 V VDD.

Figure 21. WSNM of (a) LP10T SRAM and (b) PT10T SRAM at different supply voltages in worst-case PC.

Figure 22. Comparison of WSNM values observed at 0.3 V VDD using LVT transistors in worst-case (FS) PC.

Figure 23. (a) WTP measurement of C6T SRAM at 0.3 V in SF process corner. (b) Comparison of various WTP values at different VDD in TT process corner.

Table 1. Comparison of cell layout area at UMC 65 nm CMOS technology.

Table 2. Logic truth table for various operations in the proposed PT10T SRAM.

Table 3. Comparison of 8-Kb layout area at 65 nm standard CMOS technology.

Table 6. Summary of mean (µ) values of inter-/intra-die process mismatch threshold voltage (Vth) variation results of the proposed PT10T SRAM compared with other existing SRAM architectures using different RVT/HVT/LVT MOS transistors at 0.3 V VDD in room-temperature (27 °C) conditions.