Error-Tolerant Reconfigurable VDD 10T SRAM Architecture for IoT Applications

Abstract: This paper proposes an error-tolerant reconfigurable VDD (R-VDD) scaled SRAM architecture, which significantly reduces the read and hold power using the supply voltage scaling technique. The data-dependent low-power 10T (D²LP10T) SRAM cell is used for the R-VDD scaled architecture because of its improved stability and lower power consumption. The R-VDD scaled SRAM architecture is developed to avoid unessential read and hold power using VDD scaling. In this work, the cells are implemented and analyzed considering a technologically relevant 65 nm CMOS node. We analyze the failure probability during read, write, and hold mode, which shows that the proposed D²LP10T cell exhibits the lowest failure rate compared to other existing cells. Furthermore, the D²LP10T cell design offers 1.66×, 4.0×, and 1.15× higher write, read, and hold stability, respectively, as compared to the 6T cell. Moreover, leakage power, write power-delay-product (PDP), and read PDP are reduced by 89.96%, 80.52%, and 59.80%, respectively, compared to the 6T SRAM cell at 0.4 V supply voltage. The functional improvement becomes even more apparent when the quality factor (QF) is evaluated, which is 458× higher for the proposed design than the 6T SRAM cell at 0.4 V supply voltage. A significant reduction in power dissipation, i.e., 46.07% and 74.55%, can also be observed for the R-VDD scaled architecture compared to the conventional array for the read and hold operations, respectively, at 0.4 V supply voltage.


Introduction
There is a huge demand for the Internet of Things (IoT) throughout the world. IoT connects a multitude of portable, battery-operated gadgets using a wireless sensor network (WSN). The most crucial requirements of these devices are a low total power dissipation and a small chip area [1]. In a typical WSN, the devices are connected through wireless links to a base station with a dedicated server. Different sensor nodes communicate with the base station employing wireless transmission protocols such as Wi-Fi [2]. The base station collects data and transmits them to the server using a GPRS or satellite connection, as shown in Figure 1a. The architecture of the WSN processes data using a sensing, computing, and communication unit, as shown in Figure 1b. In an on-chip computing approach, memory occupies the major portion of the total chip design in an integrated circuit and system. The most common memory architecture is the SRAM cell architecture because of its speed and stability [3]. The leakage power of the overall chip increases due to the large number of cells in standby mode, whereas lower SRAM cell power can be achieved using the supply voltage scaling technique [4]. Still, the reduction in supply voltage drastically reduces the stability, which increases the occurrence of errors in read, write, and hold operations [5,6]. Therefore, the circuit must be designed with proper device constraints and aspect ratio to mitigate the failure rate [7]. A typical 6T SRAM cell is used for data processing at IoT nodes [8]. The 6T cell has a very compact structure with low area overhead. However, the 6T cell has several inherent drawbacks such as read data disturbance, stability reduction at minimum supply voltage, considerable data retention voltage, the half-select issue, and the conflict between read and write operations [9].
Researchers have demonstrated many cell-level or architecture-level design approaches to resolve these issues, such as the read decoupled scheme [10,11], feedback cutting [12], the single-ended approach [13,14], the write assist technique [15], data-dependent circuits [16,17], and stacking effects [18]. Hence, researchers aim to reduce the write power, read power, and leakage power while enhancing the read stability and write ability and resolving half-select issues [9].
Considering the above-suggested strategies in the literature, the read decoupled 8T SRAM cell [19] is the most popular approach to overcome read data disturbance, as shown in Figure 2a. In this 8T cell, a separate read path is established to improve the read stability and mitigate the read/write conflicts. However, the leakage current increases due to the read decoupled transistors. Therefore, the positive feedback controlled 10T (PFC10T) cell [12] has been proposed to resolve the 8T cell issues, as shown in Figure 2b. The PFC10T cell uses a stacking combination to reduce the leakage power, because the series connection of transistors increases the equivalent resistance and reduces the current flowing through the path. The failure probability for the PFC10T cell is still considerably high in the read, write, and hold modes. The above-mentioned issues are resolved by a single SRAM cell, the data-dependent low-power 10T SRAM cell [16,17], referred to as the "D²LP10T" cell. Some performance parameters are evaluated in [16,17], and some additional parameters are analyzed in this work against the reference cells. The D²LP10T cell can operate successfully even at a considerably lower supply voltage (VDD = 0.4 V) and has almost negligible leakage power dissipation. Thus, the D²LP10T cell is suitable for low-voltage and energy-efficient system-on-chip (SoC) IoT applications. In this paper, a novel reconfigurable VDD (R-VDD) scaled architecture is also proposed, which is based on the supply voltage scaling technique. Moreover, the following novelties are elaborated in this paper:
• An error-tolerant and energy-efficient D²LP10T cell is adopted from [16,17] for IoT applications, with improved stability and reduced leakage power consumption.
• A reconfigurable VDD (R-VDD) scaled architecture is proposed using the supply scaling technique, where the read power and hold power are significantly reduced.
• An algorithm describing the voltage controller and decision circuit is presented.
• We model the failure probability in read, write, and hold modes and perform 5000 Monte Carlo (MC) simulations to examine it.
• The read and hold power consumption is determined for the VDD scaled architecture using the proposed 10T cell and compared with a conventional architecture.
The organization of the paper is as follows: In Section 2, we discuss the key features of the proposed 10T cell design. The reconfigurable VDD (R-VDD) scaled architecture is described in Section 3, and the failure probability mechanism is examined in Section 4. The simulation results and discussion are given in Section 5, followed by conclusions in Section 6.

Proposed D²LP10T SRAM Cell Design
The proposed D²LP10T cell is designed using ten transistors with a data-dependent power supply, as shown in Figure 3, and the control signal table for different operations is shown in Table 1 [16]. The D²LP10T cell consists of a power controlling circuit (PCC) in the pull-up network, which cuts off the pull-down network from the bit lines and discharges the storage node rapidly.
The best aspects of the proposed D²LP10T cell are as follows:
• The power controlling circuit (PCC) isolates the circuit from the power supply and inherently reduces the write power of the cell.
• The isolated read path separates the read and write operations, which resolves the read/write trade-off and enhances the read stability.
• The write stability is enhanced with the support of a pull-up inverter pair, and the leakage power is reduced due to the stacking combination design.
• The half-select issue is resolved by enabling the bit-line select signal, which is powered by the power controlling circuit.
The detailed operation of the proposed D²LP10T cell is given in [16,17]. In the literature, various techniques are suggested to reduce leakage power, of which supply voltage scaling is the preferred technique [17]. However, every technique has its own drawbacks: VDD scaling also reduces the stability and increases the failure probability of the cell. Hence, VDD scaling is not easy to handle when all of the above parameters are considered simultaneously.

Proposed Reconfigurable VDD (R-VDD) Scaled Memory Architecture
In the conventional memory architecture design, the full supply voltage is used in the write, read, and hold operations [16], so the leakage power of the overall chip is increased [20]. Thus, an on-chip adaptive VDD scaled architecture is proposed in [17], which reduces the leakage power of all unselected rows using the supply selection circuit. Meanwhile, the supply voltage (VDD_S) can be scaled further during the hold and read modes without any data failure. Figure 4 shows the complete structure of the R-VDD scaled circuit for the SRAM cell, including a voltage controller (VC), decision circuit (DC), and memory controller block. The voltage controller (VC) block controls the supply voltage (VDD) and divides it into three sub-parts, from which the voltage is selected according to the need of the SRAM cell architecture. The variation in the supply voltage (VDD) proportionally affects the power dissipation, so any scaling of the supply voltage is directly reflected in the power. In an electrical voltage divider circuit, resistances are connected in series to generate different voltage levels. However, resistances are large and difficult to integrate on chip, whereas MOS devices serially arranged in a diode-connected manner generate different voltages and provide an alternative to the resistance [21]. The sizing of the transistors is adjusted in such a way that the voltage drop across each MOS device is equal, as shown in Figure 4 (voltage controller part).

R-VDD Scaled Circuit (Voltage Controller and Decision Circuit)
The decision circuit (DC) consists of MUXs and logic gates controlled by the RD and WR signals, as shown in Figure 4 (decision circuit part). When data are written to or read from the memory architecture, the WR or RD signal is enabled, respectively. The MUX selects the write, read, or hold mode using the EN1 and EN2 select signals and transmits the corresponding scaled supply to the memory controller block as VDD_S. The transient response of the circuit for all the conditions is shown in Figure 5, which illustrates the selection of read, write, and hold conditions according to the WR and RD signals.
Algorithm 1 shows the complete flow of the voltage controller (VC) and the decision circuit (DC). In Algorithm 1, the full VDD supply voltage is applied during the write operation to write data into the cell successfully. During read mode, the full supply is not needed, and a scaled VDD is used to reduce a considerable amount of read dynamic power. Further, in the hold condition, a much smaller proportion of the supply is required to hold the stored data. This indicates that supply scaling significantly reduces the power consumption. Therefore, the supply voltage is divided by 2 and 3 in read and hold operations, respectively.
Figure 6 shows the controller block for the generation of the WLA and WLB control signals. These signals are generated using memory array control signals such as the write enable (WR), read enable (RD), and data input (Din) signals. The WR, RD, and Din signals are used for write, read, and data input selection in the memory architecture design.
Figure 7 shows the complete memory architecture using the proposed D²LP10T cell. The proposed architecture is the same as the conventional array except for the R-VDD scaled circuit and controller block. The R-VDD scaled circuit generates the scaled supply VDD_S, which is given to all the connected D²LP10T cells. Therefore, the purpose of this scaled supply (VDD_S) is to reduce the power consumption of the proposed array during read and hold operations.
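The supply selection described above (full VDD for write, VDD/2 for read, VDD/3 for hold) can be sketched behaviorally as follows. This is only an illustrative model, not the circuit or Algorithm 1 itself: the names `Mode`, `decode_mode`, and `scaled_supply` are hypothetical, and rejecting a simultaneous WR and RD is an assumption, since the text does not say how that case is handled.

```python
from enum import Enum


class Mode(Enum):
    WRITE = "write"
    READ = "read"
    HOLD = "hold"


def decode_mode(wr: bool, rd: bool) -> Mode:
    """Map the WR/RD enables to an operating mode (neither enabled = hold)."""
    if wr and rd:
        # Assumption: the text does not define this case, so it is rejected.
        raise ValueError("WR and RD must not be enabled at the same time")
    if wr:
        return Mode.WRITE
    if rd:
        return Mode.READ
    return Mode.HOLD


def scaled_supply(vdd: float, wr: bool, rd: bool) -> float:
    """Return VDD_S: full VDD for write, VDD/2 for read, VDD/3 for hold."""
    mode = decode_mode(wr, rd)
    if mode is Mode.WRITE:
        return vdd
    if mode is Mode.READ:
        return vdd / 2
    return vdd / 3


if __name__ == "__main__":
    for wr, rd in [(True, False), (False, True), (False, False)]:
        print(decode_mode(wr, rd).value, scaled_supply(1.2, wr, rd))
```

The model captures only the mode-to-voltage mapping; the diode-connected MOS divider and the MUX-level implementation of Figure 4 are not represented.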

Mechanism of Failure in SRAM Cell
On-die variations in process parameters such as the threshold voltage, channel length, and channel width of the transistors result in a mismatch in the device strength of the different SRAM cell transistors. This device mismatch leads to different failure conditions [22][23][24], which are noted below:
• Read failure: Due to random fluctuations in the threshold voltage (V_t), if the strength of the access transistor increases (reduction in V_t) and the strength of the pull-down NMOS transistor is reduced (increase in V_t), the circuit is prone to read failure. In addition, reducing the strength of the pull-up PMOS transistors increases the chance of flipping the cell content, which causes read failure.
• Write failure: When a pull-up PMOS transistor is stronger and the access transistor is weaker, the discharging process is significantly degraded, which causes a write failure. Moreover, random variations in the device strength increase the write time of the cell, leading to the inability to write data into the memory.
• Hold failure: In standby mode, a reduction of the supply voltage increases the chance of disturbing the stored data, in which case the cell is said to have failed in hold mode. If the V_t of M1 decreases while that of M3 increases, or the V_t of M2 increases while that of M4 decreases, the possibility of data flipping in hold mode increases, as shown in Figure 2.
The failure probability of an SRAM cell is defined by integrating the normalized Gaussian function, which is related to the error function of the bit cell [25,26]:
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$$
where µ and σ are the mean and standard deviation of a random Gaussian variable. We define another function Q(x) given by:
$$Q(x) = \frac{1}{\sqrt{2\pi}}\int_x^{\infty} e^{-t^2/2}\,dt$$
The probability P(X > x) is also expressed through the Q-function in terms of the mean and standard deviation, which simplifies the calculation of the error function:
$$P(X > x) = Q\left(\frac{x-\mu}{\sigma}\right)$$
The value of the Q-function is related to the error function and gives the failure probability. For large values of x, it can be written in a simplified form:
$$P_{failure} = Q(x) = \frac{1}{2}\,\mathrm{erfc}\left(\frac{x}{\sqrt{2}}\right)$$
$$\mathrm{erfc}(x) \approx \frac{e^{-x^2}}{x\sqrt{\pi}}\sum_{n=0}^{N}(-1)^n\frac{(2n-1)!!}{(2x^2)^n} \tag{6}$$
where erfc(x) is the complementary error function and P_failure is the failure probability at value x. The symbol !! used in Equation (6) is called the double factorial, and is defined below:
$$(2n-1)!! = 1\cdot 3\cdot 5\cdots(2n-1)$$
The overall failure probability (P_F) in any memory architecture can also be defined by the union of individual parametric failures [27] and is given by:
$$P_F = P(W_F \cup R_F \cup H_F)$$
where W_F, R_F, and H_F denote the write, read, and hold failures, respectively.
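As a numerical illustration of the quantities in this section, the sketch below evaluates the Gaussian tail probability Q(x) = 0.5·erfc(x/√2), a truncated large-x series that uses the double factorial, and a union of write, read, and hold failures. The function names are hypothetical, the series is the standard textbook asymptotic expansion rather than a form confirmed by this paper, and treating the three failure mechanisms as independent in `overall_failure` is an assumption made only to obtain a closed form for the union.

```python
import math


def q_function(x: float) -> float:
    """Gaussian tail probability Q(x) = 0.5 * erfc(x / sqrt(2))."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def double_factorial(n: int) -> int:
    """n!! = n * (n-2) * (n-4) * ...; by convention (-1)!! = 0!! = 1."""
    result = 1
    while n > 1:
        result *= n
        n -= 2
    return result


def q_asymptotic(x: float, terms: int = 4) -> float:
    """Truncated large-x series for Q(x); only meaningful for x well above 1."""
    prefactor = math.exp(-x * x / 2.0) / (x * math.sqrt(2.0 * math.pi))
    series = sum(
        (-1) ** n * double_factorial(2 * n - 1) / x ** (2 * n)
        for n in range(terms)
    )
    return prefactor * series


def overall_failure(p_write: float, p_read: float, p_hold: float) -> float:
    """Probability of the union of the three failures.

    Assumption: the three mechanisms are statistically independent.
    """
    return 1.0 - (1.0 - p_write) * (1.0 - p_read) * (1.0 - p_hold)


if __name__ == "__main__":
    print(q_function(6.0), q_asymptotic(6.0))
    print(overall_failure(1e-9, 1e-9, 1e-9))
```

At x = 6 the exact and truncated values agree closely, and both are on the order of 1e-9, which is the 6σ failure target used later in the read failure analysis.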

Simulation Results and Discussion
The proposed D²LP10T cell is designed using industry-standard 65 nm CMOS technology, and the reference circuits are also redesigned and simulated using the same technology with the given transistor sizing. The performance parameters of the D²LP10T cell are compared with the conventional 6T, read decoupled 8T [19], and PFC10T [12] cells, as shown in Figure 2. The impact of process variation is significant in lower technology nodes [28]; therefore, we ran Monte Carlo (MC) simulations with 5000 samples to verify the impact of process variations. Further, various parameters are examined and compared across all considered cells, starting with the stability analysis discussed below:

Stability Analysis
Stability is defined as the side length of the largest square that fits inside the smaller lobe of the butterfly curve [29]. This side length is called the stability or static noise margin (SNM). Figure 8 shows the read static noise margin (RSNM), hold static noise margin (HSNM), and write static noise margin (WSNM) of different SRAM cells at 0.4 V supply voltage. The results demonstrate that the read, write, and hold stability of the D²LP10T cell is enhanced by 4.0×, 1.66×, and 1.15×, respectively, with respect to the conventional 6T cell. The read, write, and hold stability of the proposed D²LP10T cell is enhanced due to the read decoupling approach, the power controlling circuit (PCC), and the robust strength of the pull-up PMOS transistors, respectively [17]. Hence, the noise margin of the D²LP10T cell demonstrates its robustness, and the cell is well suited for IoT-based applications.

Dynamic Read Noise Margin
The dynamic read noise margin (DRNM) of the SRAM cell is defined as the minimum difference between the storage nodes (Q and QB) measured during the transient response of the read operation [30]. Figure 9a shows the transient response of the DRNM for the conventional 6T and proposed D²LP10T cells, with values of 381 and 396 mV, respectively. Figure 9b shows the DRNM at different supply voltages for different SRAM cells and demonstrates that the DRNM of the proposed D²LP10T cell is 1.04×, 0.99×, and 1.01× that of the 6T, 8T, and PFC10T cells, respectively, at 0.4 V supply voltage. Further, the DRNM distribution is examined using 5000 Monte Carlo simulations by varying both process and mismatch data with 6σ deviations, as illustrated in Figure 9c at 0.4 V supply voltage for the 6T and proposed D²LP10T cells. The result shows that the proposed cell has a higher DRNM and lower variability than the 6T cell. The higher DRNM directly indicates the higher stability of the proposed cell.
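The DRNM definition above (minimum separation between Q and QB during the read transient) can be illustrated with a small helper. This is a sketch only: the function name is hypothetical, and the waveform samples in the usage example are made-up illustrative values, not simulation data from this work.

```python
from typing import Sequence


def dynamic_read_noise_margin(q_wave: Sequence[float],
                              qb_wave: Sequence[float]) -> float:
    """Return the minimum |Q - QB| over the sampled read transient (volts)."""
    if len(q_wave) != len(qb_wave) or len(q_wave) == 0:
        raise ValueError("waveforms must be non-empty and of equal length")
    return min(abs(q - qb) for q, qb in zip(q_wave, qb_wave))


if __name__ == "__main__":
    # Hypothetical samples: Q dips slightly and QB rises during the read.
    q = [0.40, 0.39, 0.38, 0.40]
    qb = [0.00, 0.02, 0.03, 0.01]
    print(dynamic_read_noise_margin(q, qb))
```

Both waveforms are assumed to be sampled at the same time points; in practice they would come from the transient simulation output.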

Write Trip Point
Researchers have suggested the write trip point (WTP) as another important parameter to evaluate the write ability of the SRAM cell. The WTP is defined as the voltage difference between the supply voltage and the word line value at which the stored data of the cell flip during the write operation [4]. The WTP is presented in Figure 10 at different supply voltages. The result shows that the write trip point of the proposed cell is increased by 1.53×, 1.5×, and 2.5× as compared to the 6T, 8T, and PFC10T cells, respectively, at 0.4 V supply voltage. The WTP is higher in the proposed circuit because the PCC is powered by the bit lines and connected with the power controlling signals, which shows the robust behavior of the proposed D²LP10T cell.

Read Failure Probability
RSNM is used to quantify the read stability of the SRAM cells. A read failure mainly occurs because of improper transistor sizing of the SRAM cell, as discussed in Section 4. The read failure probability (P_RF) is estimated as P_RF = P(RSNM < kT/q): the stored data of the cell can be flipped by thermal noise when the RSNM is less than the thermal voltage (kT/q ≈ 26 mV at 300 K). The suitable threshold voltage criterion is determined at the 6σ read failure probability (i.e., P_RF = 1e−09) with 5000 MC simulations. As shown in Figure 11a, the proposed D²LP10T cell has a lower read failure probability than other existing cells. Therefore, the lower P_RF would translate into a lower read VDD for the proposed D²LP10T cell compared with the 6T/8T bit cells.
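One way to turn a set of Monte Carlo noise-margin samples into a failure probability of this kind is to fit a Gaussian to the samples and evaluate the tail below the thermal voltage. The sketch below assumes this Gaussian fit, in line with the Gaussian model of Section 4; the function name, the default threshold constant, and the sample values are hypothetical and are not taken from the simulations reported here.

```python
import math
import statistics
from typing import Sequence

THERMAL_VOLTAGE = 0.026  # volts, approximately kT/q at 300 K


def failure_probability(snm_samples: Sequence[float],
                        threshold: float = THERMAL_VOLTAGE) -> float:
    """Estimate P(SNM < threshold) from samples using a Gaussian fit."""
    mu = statistics.fmean(snm_samples)
    sigma = statistics.stdev(snm_samples)  # sample standard deviation
    if sigma == 0.0:
        raise ValueError("samples have zero spread; Gaussian fit is undefined")
    z = (mu - threshold) / sigma
    # P(X < threshold) for X ~ N(mu, sigma^2) equals Q(z)
    return 0.5 * math.erfc(z / math.sqrt(2.0))


if __name__ == "__main__":
    # Hypothetical noise-margin samples in volts
    samples = [0.10, 0.11, 0.12, 0.09, 0.10]
    print(failure_probability(samples))
```

The same helper applies unchanged to hold noise-margin samples. A Gaussian fit is used because 5000 samples cannot resolve a 1e-9 tail by direct counting.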

Hold Failure Probability
Similarly to read failure, the hold failure probability (P_HF) is defined by the hold SNM. The hold failure condition mainly depends on the threshold voltage of the pull-up and pull-down transistors, and the hold failure probability is given by P_HF = P(HSNM < kT/q): if the HSNM is lower than the thermal voltage (kT/q ≈ 26 mV at 300 K), the stored data of the cell can be corrupted by thermal noise. As the size of the cross-coupled inverter pair is similar in the 6T, 8T, and proposed D²LP10T cells, the results for P_HF are similar, as shown in Figure 11b. It is also observed that the proposed D²LP10T cell has a lower P_HF compared to the other SRAM cells considered in this work. The above discussion indicates that the proposed D²LP10T cell would yield a lower VDD than the 6T/8T/PFC10T cells.

Write Failure Probability
During the write operation, the main concern is to write data successfully without any chance of failure or distortion. The write trip point (WTP) is a useful parameter that indicates the ability to write data into the cell. Therefore, the write failure probability (P_WF) is defined in terms of the WTP. Figure 11c shows that the proposed D²LP10T cell gives the lowest write failure probability (P_WF) among all considered SRAM cells. This is due to the feedback approach and the series connection, which give a higher WTP than in the other considered cells, as shown in Figure 10. Hence, the feedback mechanism in the pull-up network improves the write ability and reduces the P_WF of the SRAM cell.

Leakage Power
The leakage power is a major issue, and strategies are required to control it. Cache memory is observed to be in idle mode most of the time, and the leakage power can be minimized by using VDD scaling. Figure 12 shows the leakage power of different SRAM cells as the temperature and supply voltage vary. The result shows that the leakage power of the proposed D²LP10T cell is reduced by 89.96%, 90.27%, and 68.30% at 0.4 V supply and 25 °C, and by 99.42%, 99.42%, and 77.16% at 1.2 V supply and 125 °C, compared to the 6T, 8T, and PFC10T SRAM cells, respectively. This is because the proposed D²LP10T cell involves a stacking combination between the bit line and ground line, which increases the equivalent resistance and consequently reduces the leakage current flowing through it. Hence, the large leakage power reduction indicates that the D²LP10T cell holds data very efficiently in embedded memories.

Write and Read PDP
The PDP (energy) is defined as the product of power and delay, so lower delay and power indicate lower energy consumption. Therefore, the PDP is reported directly instead of separate delay and power figures. Figure 13a shows the write 0 PDP of the proposed D²LP10T cell, which is reduced by 80.52%, 83.94%, and 65.37% as compared to the 6T, 8T, and PFC10T cells, respectively, at 0.4 V supply voltage. The write PDP is substantially reduced because the PCC is directly controlled by the bit lines rather than the power supply. Figure 13b shows the read PDP of different bitcell designs: the read PDP of the proposed D²LP10T cell is 0.4×, 1.54×, and 0.69× that of the 6T, 8T, and PFC10T cells, respectively, at 0.4 V supply voltage. The reduction in the read PDP depends on the isolated read decoupled path present in the cell. However, the read energy of the proposed cell is higher than that of the 8T cell, because the 8T cell also has an isolated read path but uses fewer transistors. Hence, we conclude that the proposed D²LP10T cell is efficient from the energy consumption perspective.
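The relation between a PDP ratio (such as 0.4×) and a percentage reduction can be made explicit with a few lines of arithmetic. The helper names below are hypothetical, and the power and delay values in the usage example are placeholders chosen for illustration, not measured results.

```python
def power_delay_product(power_w: float, delay_s: float) -> float:
    """PDP (energy in joules) = power (W) * delay (s)."""
    return power_w * delay_s


def percent_reduction(reference: float, proposed: float) -> float:
    """Percentage by which `proposed` is lower than `reference`."""
    return 100.0 * (reference - proposed) / reference


def ratio_to_reduction(ratio: float) -> float:
    """Convert a proposed/reference ratio into a percentage reduction.

    A negative result means the proposed value is higher than the reference.
    """
    return 100.0 * (1.0 - ratio)


if __name__ == "__main__":
    # Placeholder values, not measurements from this work
    ref = power_delay_product(2.0e-6, 5.0e-9)
    new = power_delay_product(1.0e-6, 4.0e-9)
    print(percent_reduction(ref, new))
    print(ratio_to_reduction(0.4), ratio_to_reduction(1.54))
```

Under this convention, a ratio of 0.4× corresponds to a reduction of about 60%, while a ratio of 1.54× corresponds to an increase of about 54%.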

Cell Area Comparison
For the SRAM cell design, the cell area is a major parameter to examine in achieving better performance [16]. Figure 14 shows the layout view of the conventional 6T cell and the proposed D²LP10T cell in 65 nm standard CMOS technology. The detailed description of the aspect ratio is given in [17]. The cell areas of the 6T and D²LP10T cells are 2.48 and 3.41 µm², respectively.

Quality Factor (QF)
The quality factor (QF) is used to show the utility and performance of the circuit. The quality factor of the cell is defined by performance parameters such as the stability, leakage power, PDP, failure probability, and estimated area [16]. Of these performance parameters, stability should be placed in the numerator, and the other parameters should be placed in the denominator of the QF formula, so that a higher QF indicates better performance of the SRAM cell. Further, failure probability and stability are the preferred criteria for energy-efficient and error-tolerant circuit design. Here, the novel quality factor (QF) is formed from RSNM_n, HSNM_n, WSNM_n, LP_n, PDP_n, FP_n, and A_n, which are the normalized values of read stability, hold stability, write stability, leakage power, power delay product (product of read and write PDP), failure probability (product of read and hold failure probability), and estimated area, respectively. All the values are normalized to the conventional 6T cell to determine the QF. Figure 15 shows the quality factor of all the considered cells at 0.4 V supply voltage, and it is observed that the QF of the D²LP10T cell is 458×, 31.05×, and 9.05× higher as compared to the 6T, 8T, and PFC10T cells, respectively. The QF uses the 0.4 V supply voltage for the calculation of the above parameters except for failure probability. The read and hold failure probabilities are taken at a 0.25 V supply voltage for a fair comparison, because the failure probability values are too small at 0.4 V to give a meaningful comparison. The result shows that the D²LP10T cell is best suited for high-performance, low-power circuits, at the cost of an area penalty.
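One plausible reading of the description above is a ratio in which the normalized stability terms are multiplied in the numerator and the remaining normalized terms are multiplied in the denominator. The sketch below implements that reading only; the product form and the function name are assumptions, the exact expression is not reproduced in this text, and the input values in the usage example are arbitrary placeholders.

```python
def quality_factor(rsnm_n: float, hsnm_n: float, wsnm_n: float,
                   lp_n: float, pdp_n: float, fp_n: float,
                   area_n: float) -> float:
    """Assumed form: QF = (RSNM_n*HSNM_n*WSNM_n) / (LP_n*PDP_n*FP_n*A_n).

    All inputs are normalized to the 6T cell, so the 6T cell itself has QF = 1.
    """
    denominator = lp_n * pdp_n * fp_n * area_n
    if denominator <= 0.0:
        raise ValueError("normalized cost terms must all be positive")
    return (rsnm_n * hsnm_n * wsnm_n) / denominator


if __name__ == "__main__":
    # The reference (6T) cell: every normalized term equals 1
    print(quality_factor(1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0))
    # Arbitrary placeholder values for a hypothetical cell
    print(quality_factor(2.0, 1.0, 1.0, 0.5, 1.0, 1.0, 1.0))
```

With this form, doubling a stability term or halving a cost term each doubles the QF, which matches the stated intent that higher stability and lower power, failure probability, and area all raise the figure of merit.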

Power Consumption in Memory Architecture
The read and hold power are calculated for the conventional architecture and the proposed R-VDD architecture using the proposed D²LP10T cell. For the R-VDD architecture, the scaled supply (VDD_S) is used to reduce the read and hold power in the read and hold operations. The simulation is performed using a 4 × 4 array, and the power reduction (in %) is shown in Tables 2 and 3 for different supply voltages. The read and hold power are decreased by 46.07% and 74.55%, respectively, compared to the conventional memory array at 0.4 V supply voltage. The energy is reduced accordingly, which makes the architecture suitable for IoT applications.

Conclusions
Supply voltage scaling is an effective way to achieve ultra-low-power operation. For that purpose, an error-tolerant, energy-efficient, data-dependent 10T SRAM cell is used, with improved read, write, and hold stability even at lower supply voltage. The reconfigurable VDD (R-VDD) scaled architecture significantly reduces the hold and read power at the lower supply voltage, but the scaled VDD may affect other parameters such as failure probability and stability. The proposed cell is designed through the data-dependent technique using a power controlling circuit rather than a direct power supply, which reduces the total power consumption and mitigates the failure rate. The series-connected transistors are responsible for the leakage power reduction, and the leakage power of the proposed cell is reduced by 89.96% as compared to the conventional 6T cell at 0.4 V supply voltage. Further, we have examined the write and read PDP, which are reduced by 80.52% and 59.79%, respectively, as compared to the 6T cell at 0.4 V supply voltage. We have also analyzed an analytical model of the failure mechanisms (read, write, and hold failures) for different SRAM cells. The proposed cell shows the lowest failure probability among all considered cells. Consequently, the quality factor was also examined for the proposed D²LP10T cell and is 458× higher than that of the 6T cell. Further, the R-VDD scaled architecture has also been simulated using the D²LP10T cell, and it consumes 46.07% and 74.55% less power than the conventional array during read and hold operations, respectively, at 0.4 V supply voltage. Hence, we conclude that the overall system performance is improved and the proposed R-VDD architecture is well suited for IoT applications. Moreover, the chip layout and tape-out of the proposed R-VDD architecture using the proposed D²LP10T cell will be carried out in future work. In that case, array area overhead and chip measurement results will be included in future publications.