Abstract
To improve the memory interface and speed up static RAM access in near-threshold operation, a stable local bit-line static random-access memory (SRAM) architecture is proposed together with a low-voltage pre-charge and negative local bit-line (NLBL) scheme. The low-voltage pre-charge and NLBL scheme is operated by the write bit-line column to resolve the write half-select condition. The proposed local bit-line SRAM design reduces variations, enhances read stability and write ability, and prevents bit-line leakage current, and the designed pre-charge circuit achieves an optimal pre-charge voltage during near-threshold operation. Compared to the conventional 6 T SRAM design, the optimal pre-charge voltage improves the read static noise margin (RSNM) by up to 15%, and the write delay of the proposed energy-efficient NLBL SRAM design improves by up to 22%. At a 400 mV supply voltage and a 25 MHz operating frequency, the read and write energy consumption is 0.22 pJ and 0.23 pJ, respectively. Compared with related works, the access average energy (AAE) is lower, and the proposed local bit-line SRAM achieves the highest figure of merit (FoM). The designed architecture has been implemented as a 1-Kb SRAM macro in the TSMC 40 nm GP process technology.
1. Introduction
Modern electronics are merging into smart technologies such as the Internet of Things (IoT), automotive electronics, biomedical electronics, and sensor devices. Integrated circuits for these applications therefore demand low power consumption, low leakage current, and compact area []. However, with the scaling of nanometer (nm) process technology, leakage current has become a major problem in the system on chip (SoC) []. Modern microprocessors require more embedded memory to meet system specifications while keeping area, energy, and power consumption low [,]. Because the SoC relies on a large memory for data storage, the embedded memory consumes most of the power, so circuit designers must pay attention to reducing its power consumption. However, the aggressive design and size constraints of memory make it much more difficult than in general logic circuits to lower the operating voltage [,,]. As a result, several designers have added assist circuits to decrease power consumption and increase operating stability at low supply voltages [].
The conventional (conv.) 6 T SRAM performs poorly at low supply voltages because of pseudo-read errors in the half-select condition. A read-decoupled 8 T SRAM [,] architecture was presented to solve the read error, but it only reduced the read error and increased the RSNM slightly, and the memory cell is still affected by read errors during the write operation. By adding stacked access transistors, several SRAM architectures, including the 9 T SRAM [,,], 10 T SRAM [,], and 12 T SRAM [], have been proposed for near-threshold/sub-threshold operation to address read errors in half-select conditions. However, the stacked access transistors weaken the write operation. The bit-line of an SRAM is also strongly affected by large parasitic capacitance during read and write operations. The Average-8 T SRAM [], consisting of local and global bit-lines, was proposed to solve the parasitic problem. However, its local and global bit-lines cannot achieve full swing, and the design suffers from poor write ability, slow operation, and high power consumption. The full-swing local bit-line SRAM architecture [] was proposed, but its write ability is still poor because two cascaded transistors control the bit cell. A 10 T SRAM [] that stacks pull-down transistors in the cross-coupled inverter with VGND technology was proposed to change the write path, but its write ability is still inefficient.
After analyzing the relationship between read noise and the pre-charge voltage of the bit-line pair, a new local bit-line 6 T SRAM architecture is proposed to improve the noise margin, capacitive density, read stability, and write ability of the SRAM.
2. Proposed Local Bit-Line SRAM Architecture
A new 6 T SRAM architecture is proposed to make the performance more robust; its block diagram is shown in Figure 1. The proposed local bit-line SRAM architecture consists of four 6 T cells and two assist circuits: the optimal pre-charge (OPC) circuit and the NLBL circuit. The OPC circuit is controlled by the block selection lines BLK [0] and BLKB [0], which are generated by the row address decoder. The global read bit-line pair (GRBL [0] and GRBLB [0]) reads data from LBL [00] and LBLB [00] via the two transistors T1 and T2. The global write bit-line pair (GWBL [0] and GWBLB [0]) drives strong data into the NLBL circuit, which passes it to the local bit-line pair (LBL [00] and LBLB [00]) and the SRAM cells.

Figure 1.
The proposed local bit-line 6 T SRAM block diagram.
2.1. Optimal Pre-Charge Circuit and Read Operation
To facilitate the read operation, an OPC circuit is included in the proposed SRAM design, as shown in Figure 2. The OPC circuit is controlled by the transistors LPL, LPR, NCL, NCR, and EQ. Initially, the block selection lines are set to BLKB [0] = 0 and BLK [0] = 1 to turn off the pre-charge circuit. When BLKB [0] = 1, the transistors LPL and LPR produce the optimal voltage (Vopt), which is determined by three quantities: Ids, the saturation current of the LPL or LPR transistor; Rds, the channel resistance of the same transistor; and VBLKB, the voltage on the block line BLKB [0].
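One form consistent with the three symbols defined above can be written as follows. It is a sketch and not an equation taken from the source: it assumes that the local bit-line settles one resistive drop, Ids·Rds, below the block-line voltage.

```latex
V_{\mathrm{opt}} \approx V_{\mathrm{BLKB}} - I_{\mathrm{ds}} \, R_{\mathrm{ds}}
```

Under this assumption Vopt never exceeds VBLKB, which agrees with the condition stated in this section that the block selection line voltage is not less than Vopt.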

Figure 2.
Proposed design of local bit-line 6 T SRAM Architecture.
Under near-threshold operation and process variation, the pre-charge voltages on the local bit-lines LBL [00] and LBLB [00] are not equal. The block line BLK [0] = 0 is therefore used to equalize the voltages of LBL [00] and LBLB [00] through the transistor EQ. Table 1 shows the OPC voltages for different supply voltages.

Table 1.
Optimal pre-charge voltages.
Four memory cells are connected to the local bit-line pair LBL [00] and LBLB [00], which minimizes the parasitic capacitance on the bit-line and reduces read errors. When a memory cell reads data "1", the local bit-line LBLB [00] is discharged into the cell through the transistor T4. Because the transistor T3 and the transistor NCL carry only a small current and only four memory cells share the local bit-line, the leakage currents decrease significantly. To increase the read stability and improve the RSNM, the global read bit-line pair (GRBL/GRBLB) is kept away from the memory cell leakage currents. The sense amplifier (SA) reads the data out as the global read bit-line GRBLB [0] is charged through transistor T2; the small parasitic capacitance on this path makes the read operation faster. The voltage of the block selection lines BLKB [0] and BLK [0] is not less than Vopt, so the low-voltage pre-charge circuit maintains the optimal voltage (Vopt) and ensures the stability of the read operation.
2.2. Negative Local Bit-Line Scheme and Write Operation
Figure 2 also shows the negative local bit-line scheme and the write operation. The global write bit-line pair GWBL [0] and GWBLB [0] writes data through the transistors T5 and T6. The negative virtual ground (NVGND) in the column direction is connected to LBL [00] or LBLB [00] when the transistor T5 or T6 is switched on, and the other side is then pulled to VDD via the cross-coupled transistors NCL and NCR. First, the block selection lines are set to BLKB [0] = ‘0’ and BLK [0] = ‘1’, and then GWBL [0] or GWBLB [0] is set to ‘1’ to switch on transistor T5 or T6. The other side of the local bit-line pair (LBL/LBLB) is connected to VDD through NCL/NCR, and the word-line WL [0] is set to ‘1’ to write the memory cell. Through charge sharing, the memory cell is discharged via VGND, which improves the write ability in near-threshold operation. For instance, the first cell discharges through T3 to LBL [00] to write data ‘0’, while LBLB [00] is held at VDD to supply the complementary data. Because LBL [00] is connected to the NLBL in this process, the memory cell data flips faster during the write operation, while RWL [0] still holds ‘0’. With this differential read-write function, the proposed local bit-line SRAM achieves high speed and reduced parasitic capacitance and resistance, which makes it energy-efficient.
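The write sequence described above can be sketched as a small behavioral model. This is an illustration under stated assumptions, not the authors' circuit: voltage levels are reduced to symbolic labels, the function names `local_bitlines` and `write_cell` are hypothetical, the polarity (GWBL = 1 pulls LBL low and writes Q = 0) follows the half-select example in Section 2.3, and the opposite case is assumed by symmetry.

```python
NVGND = "NVGND"  # symbolic label for the negative virtual ground level
VDD = "VDD"      # symbolic label for the supply level


def local_bitlines(blkb, blk, gwbl, gwblb):
    """Return the (LBL, LBLB) levels driven by the NLBL scheme, or None.

    None means the write path is not driving the local bit-lines, either
    because the pre-charge circuit is still on or because no write data
    is selected on the global write bit-line pair.
    """
    precharge_off = (blkb == 0 and blk == 1)
    if not precharge_off or gwbl == gwblb:
        return None
    if gwbl == 1:
        # T5 connects LBL to NVGND; the other side is pulled to VDD.
        return (NVGND, VDD)
    # Assumed by symmetry: T6 connects LBLB to NVGND.
    return (VDD, NVGND)


def write_cell(q, blkb, blk, gwbl, gwblb, wl):
    """Return the stored value Q after one write cycle."""
    levels = local_bitlines(blkb, blk, gwbl, gwblb)
    if levels is None or wl != 1:
        return q  # cell keeps its data
    lbl, _lblb = levels
    return 0 if lbl == NVGND else 1


# A cell holding 1 is flipped to 0 when GWBL = 1 and WL = 1.
flipped = write_cell(q=1, blkb=0, blk=1, gwbl=1, gwblb=0, wl=1)
```

The model only captures the ordering of the control signals; it says nothing about timing, charge sharing, or the actual negative voltage level.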
2.3. Half-Select Condition Operation
Figure 3 illustrates the half-select condition for the row blocks of the proposed local bit-line SRAM. During the write operation of BLOCK 0, the transistors T3 and T4 pass the data of Q and QB, as indicated in sky blue. Initially, the word-line WL0 = 1 is active (shown in red), Q stores ‘1’, and QB stores ‘0’. The global write bit-lines GWBL [0] = 1 and GWBLB [0] = 0 select the data and activate the transistors T5 and T6, so the local bit-line LBL [00] is discharged through the transistor T5 to the negative voltage of VGND [0]. After LBL [00] is discharged, the data in the memory cell flips to Q = 0 and QB = 1. In BLOCK 1, the half-selected cell in the same row is affected by the word-line WL0, which causes a pseudo-read on LBL [] and LBLB [], as indicated in orange. The read word-line RWL [0] remains at "0" and the row half-selected blocks are not charged by the global read bit-lines (GRBL/GRBLB), so the low-voltage pre-charge scheme reduces read disturb and data flips in the memory cell. The proposed structure injects less pre-charge into the cell and reduces the capacitance of the local bit-line, which lowers the leakage power consumption, as shown by the purple dotted line.

Figure 3.
Half-Select Condition operation of local bit-line SRAM.
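The selection states discussed in this section can be made concrete with a short classification sketch. The function name `classify_cell` and the index arguments are hypothetical; the sketch assumes a single active word-line and a single written column, and it only labels cells without modeling their electrical behavior.

```python
def classify_cell(row, column, active_row, written_column):
    """Label a cell during a write to (active_row, written_column)."""
    word_line_on = (row == active_row)
    column_written = (column == written_column)
    if word_line_on and column_written:
        return "selected"
    if word_line_on:
        # Word-line is active but the column is not written: pseudo-read risk.
        return "row half-selected"
    if column_written:
        # Column is driven but the word-line is off: leakage contributor.
        return "column half-selected"
    return "unselected"


# Writing row 0 of column 0: the cell in row 0 of column 1 is row half-selected.
state = classify_cell(row=0, column=1, active_row=0, written_column=0)
```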
3. Simulation Results
3.1. The Comparison of RSNM Simulation
Figure 4 shows the RSNM simulation results for the 6 T SRAM cell as a function of the bit-line pre-charge voltage at different supply voltages. Each curve reaches its maximum RSNM at a specific bit-line voltage that corresponds to the supply voltage. This result is used in the proposed local bit-line design.

Figure 4.
The RSNM simulation results of 6 T SRAM for pre-charge voltage adjustment.
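Picking the optimal pre-charge voltage from such a curve amounts to finding the sample with the largest RSNM. The sketch below shows that step only; the function name `optimal_precharge` is hypothetical and the sample values are placeholders chosen for illustration, not data read from Figure 4 or Table 1.

```python
def optimal_precharge(samples):
    """Return the (pre-charge voltage, RSNM) pair with the largest RSNM.

    samples: iterable of (v_precharge_mv, rsnm_mv) tuples for one supply voltage.
    """
    samples = list(samples)
    if not samples:
        raise ValueError("at least one sample is required")
    return max(samples, key=lambda pair: pair[1])


# Placeholder sweep for one supply voltage (illustrative numbers only).
placeholder_curve = [
    (200, 60.0),
    (250, 72.0),
    (300, 78.0),
    (350, 70.0),
    (400, 55.0),
]
best_voltage_mv, best_rsnm_mv = optimal_precharge(placeholder_curve)
```

In practice the sweep would come from circuit simulation at each supply voltage, and a finer voltage step would locate the peak more precisely.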
Figure 5 compares the simulated RSNM of the proposed local bit-line SRAM and the conv. 6 T SRAM at various operating voltages. The RSNM of the memory cell decreases at low voltages. The proposed local bit-line SRAM obtains a higher RSNM by using the low-voltage pre-charge scheme during the read operation.

Figure 5.
The RSNM comparison for optimal and (conv.) pre-charge.
3.2. The Comparison of Monte Carlo Simulation
For global variations, process-voltage-temperature (PVT) corners that combine the extreme cases of these variables are commonly used to verify performance. To show the improvement in read stability, 10,000 Monte Carlo post-simulation runs were performed. The result for the conventional 6 T SRAM cell at a 400 mV supply voltage is shown in Figure 6a. During the read operation, the charge on the bit-line pair of the conv. 6 T SRAM disturbed the data stored in the memory cell, which produced several read errors and data flips. In contrast, the proposed local bit-line SRAM uses the low-voltage pre-charge scheme to optimize the voltage and keeps the cell stable without any read errors at the same supply voltage, as shown in Figure 6b.

Figure 6.
10,000 times read Monte Carlo post-simulations at 400 mV (a) Result of (conv.) 6 T SRAM. (b) Result of proposed local bit-line SRAM.
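The bookkeeping behind such a Monte Carlo run, counting how many samples end in a read failure, can be sketched as follows. This is a toy statistical stand-in and not the HSPICE post-simulation used in the paper: it assumes, purely for illustration, that the read margin of each sample is Gaussian and that a negative margin means a read failure. The function name `count_read_failures` and its parameters are hypothetical.

```python
import random


def count_read_failures(mean_margin_mv, sigma_mv, runs=10_000, seed=0):
    """Count samples whose (assumed Gaussian) read margin falls below zero."""
    rng = random.Random(seed)  # fixed seed keeps the run reproducible
    failures = 0
    for _ in range(runs):
        margin_mv = rng.gauss(mean_margin_mv, sigma_mv)
        if margin_mv < 0.0:
            failures += 1
    return failures


# Illustrative parameters only; they do not come from the paper.
failures = count_read_failures(mean_margin_mv=40.0, sigma_mv=20.0)
failure_rate = failures / 10_000
```

A larger mean margin relative to its spread drives the failure count toward zero, which is the qualitative effect the comparison in Figure 6 is meant to show.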
3.3. The Comparison of Write Ability Simulation
Figure 7 compares the simulated write ability at different operating voltages. The differential write operation is evaluated by the write speed of data "0" and "1", which is affected by the transition point of the cross-coupled inverter. Because the proposed design includes the NLBL technique, the write ability of the memory cells improves. Compared to the conv. 6 T SRAM, the write speed of the proposed design improves by about 22% at the 400 mV operating voltage.

Figure 7.
Write ability comparison at different operating voltages.
3.4. The Comparison of Bit-Line Swing Simulation
Figure 8 shows the bit-line swing of the conv. 6 T SRAM and the proposed design at a 400 mV supply voltage. During the read operation, the voltage difference of the bit-line pair is affected by the leakage current of all column half-selected cells. The voltage difference of the conv. 6 T SRAM is less than 200 mV at 128 bits, so the bit-line swing is too small for the SA to sense reliably. The proposed local bit-line SRAM maintains a voltage difference of more than 300 mV even when the bit-line depth is increased to 256 bits.

Figure 8.
Comparison of bit-line swing at different bit-line depths.
3.5. The Comparison of Leakage Power
The static leakage power consumption of the memory cell is shown in Figure 9. Although the proposed architecture increases the number of metal–oxide–semiconductor field-effect transistors (MOSFETs), it effectively reduces the bit-line leakage current and the static leakage power consumption compared to the conv. 6 T SRAM.

Figure 9.
Comparison of leakage power at different operating voltages.
4. Chip Implementation and Result Comparison
The proposed local bit-line 6 T SRAM is constructed with both the read and write operations in mind: neither operation may destroy the logic state stored in the cell. The layout area and the cell size of the SRAM have a great impact on the proposed design. The comparison of memory cell size and area is shown in Table 2. The 1 kb conv. 6 T area is still smaller than that of the 8 T and the proposed 6 T, whereas the proposed 6 T SRAM cell performs better in the read and write operations. The proposed local bit-line SRAM area is 7.65 μm², as shown in Figure 10a, and the chip layout of the proposed design in the TSMC 40 nm GP process technology is shown in Figure 10b.

Table 2.
Comparison table of the SRAM area.

Figure 10.
(a) Proposed 4-bit block area (b) Proposed 1 kb SRAM chip layout.
The implemented 1 kb SRAM macro (128 rows × 8 columns) has 32 blocks in each column, and each block consists of 4 memory cells. The post-layout simulation waveform obtained with the Hspice EDA tool at 400 mV/25 MHz is shown in Figure 11. WRITE_EN is the write control signal: when WRITE_EN is "1", the GWBL/GWBLB decoder, the write driver circuit, and the NLBL circuits start to work. First, the input data is set to "1" to test the write operation. The selected memory cell stores data "1" during the write "1" cycle and is then read; the sense amplifier starts operating and pulls the SA output and latch up to "1", which means that both the write and read operations are successful. When WRITE_EN is "0", the read operation starts. Similarly, when the write data is "0", the selected memory cell stores data "0", and during the subsequent read the SA output and latch are pulled down to "0". These features allow the proposed local bit-line SRAM to operate faster in the near-threshold region by reducing read errors and strengthening the write ability.

Figure 11.
Post-layout simulation waveform for SRAM Chip @400 mV/25 MHz/TT corner.
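The macro organization given above (128 rows × 8 columns, 32 blocks per column, 4 cells per block) can be checked with a few lines of arithmetic. The mapping from a row index to a block and a cell within that block is an assumption made for illustration: it supposes that consecutive rows are grouped into the same block, which the text does not state explicitly. The function name `locate` is hypothetical.

```python
ROWS = 128
COLUMNS = 8
CELLS_PER_BLOCK = 4
BLOCKS_PER_COLUMN = ROWS // CELLS_PER_BLOCK  # 128 / 4 = 32 blocks
TOTAL_BITS = ROWS * COLUMNS                  # 1024 bits = 1 kb


def locate(row, column):
    """Map (row, column) to (block index, cell index in block, column).

    Assumes consecutive rows share a local bit-line block.
    """
    if not (0 <= row < ROWS and 0 <= column < COLUMNS):
        raise ValueError("address outside the 128 x 8 array")
    block, cell = divmod(row, CELLS_PER_BLOCK)
    return block, cell, column


# Row 5 of column 2 falls in block 1, second cell of that block.
example = locate(5, 2)
```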
Table 3 compares the performance of the proposed local bit-line SRAM architecture with previous works. Overall, the proposed local bit-line SRAM has a smaller average energy consumption than the MINI-Array [] and the highest FoM.

Table 3.
Comparison of previous and proposed SRAM based on the local bit-line structure.
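The energy figures reported for the proposed design (0.22 pJ per read and 0.23 pJ per write at 25 MHz) can be turned into an average energy and a rough power estimate. Two assumptions are made here that the text does not confirm: the average access energy is taken as the plain mean of the read and write energies, and one access is assumed per clock cycle. The function names are hypothetical.

```python
READ_ENERGY_PJ = 0.22   # reported read energy at 400 mV
WRITE_ENERGY_PJ = 0.23  # reported write energy at 400 mV
FREQUENCY_HZ = 25e6     # reported operating frequency


def mean_access_energy_pj(read_pj, write_pj):
    """Plain mean of read and write energy (assumed definition)."""
    return (read_pj + write_pj) / 2.0


def average_power_uw(energy_pj, frequency_hz):
    """Power in microwatts, assuming one access per clock cycle."""
    watts = energy_pj * 1e-12 * frequency_hz
    return watts * 1e6


mean_energy_pj = mean_access_energy_pj(READ_ENERGY_PJ, WRITE_ENERGY_PJ)
power_uw = average_power_uw(mean_energy_pj, FREQUENCY_HZ)
```

Under these assumptions the mean energy is about 0.225 pJ per access, which corresponds to roughly 5.6 µW of access power at 25 MHz; leakage power is not included.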
5. Conclusions
The proposed local bit-line 6 T SRAM architecture incorporates the low-voltage pre-charge and NLBL schemes and can operate in the near-threshold region. The low-voltage pre-charge circuit reduces the read error and improves the read stability and RSNM of the memory cells. Moreover, the NLBL scheme reduces the write error and improves the write ability in near-threshold operation. Furthermore, the pseudo-read error of half-selected cells is reduced in the half-select condition. Likewise, the proposed local bit-line SRAM architecture eliminates read failures induced by bit-line leakage. The proposed local bit-line 6 T SRAM has been implemented as a 1 kb SRAM macro fabricated in the TSMC 40 nm GP process technology. At a 400 mV supply voltage and a 25 MHz operating frequency, the write energy consumption is reduced by about 45.2% and the average energy consumption by about 52.9% compared to the MINI-Array. The proposed local bit-line 6 T SRAM is therefore well suited to low-power SoC chips.
Author Contributions
M.-H.S. and J.-F.L. proposed the idea and method; C.-M.T. and C.-J.Y. performed the simulations and experiments; S.-C.H. and C.-Y.C. analyzed the data; S.M.S.M. wrote the manuscript; M.-H.S. and Y.-H.H. reviewed the manuscript. All authors have read and agreed to publish this version of the manuscript.
Funding
This research is supported by the Ministry of Science & Technology, Taiwan, under contract No. 109-2221-E-224-050 and No. 109-2221-E-324-028.
Data Availability Statement
Required data is contained within the article.
Acknowledgments
The authors would like to acknowledge the Taiwan Semiconductor Research Institute for technical support with simulation and for Hspice EDA tool support for IC implementation.
Conflicts of Interest
The authors declare no conflict of interest.
References
- Atzori, L.; Iera, A.; Morabito, G. The Internet of Things: A survey. Comput. Netw. 2010, 54, 2787–2805.
- Frank, D.J.; Dennard, R.H.; Nowak, E.; Solomon, P.M.; Taur, Y.; Wong, H.-S.P. Device scaling limits of Si MOSFETs and their application dependencies. Proc. IEEE 2001, 89, 259–288.
- Do, A.-T.; Low, J.Y.S.; Low, J.Y.L.; Kong, Z.-H.; Tan, X.; Yeo, K.-S. An 8T differential SRAM with improved noise margin for bit-interleaving in 65 nm CMOS. IEEE Trans. Circuits Syst. I Regul. Pap. 2011, 58, 1252–1263.
- Song, T.; Rim, W.; Jung, J.; Yang, G.; Park, J.; Park, S.; Kim, Y.; Baek, K.-H.; Baek, S.; Oh, S.-K.; et al. A 14 nm FinFET 128 Mb SRAM with VMIN Enhancement Techniques for Low-Power Applications. IEEE J. Solid-State Circuits 2015, 50, 158–169.
- Chen, G.; Sylvester, D.; Blaauw, D.; Mudge, T.N. Yield Driven Near Threshold SRAM Design. IEEE Trans. Very Large Scale Integr. Syst. 2010, 18, 1590–1598.
- Sinangil, M.E.; Mair, H.; Chandrakasan, A.P. A 28nm high-density 6T SRAM with optimized peripheral-assist circuits for operation down to 0.6V. In Proceedings of the IEEE 2011 International Solid-State Circuits Conference (INSPEC Acc. No. 11930802), San Francisco, CA, USA, 20–24 February 2011; pp. 260–261.
- Choi, W.; Park, J. A charge-recycling assist technique for reliable and low power SRAM design. IEEE Trans. Circuits Syst. I Regul. Pap. 2016, 63, 1164–1175.
- Zhang, K.; Bhattacharya, U.; Chen, Z. A 3-GHz 70-Mb SRAM in 65-nm CMOS technology with integrated column-based dynamic power supply. IEEE J. Solid-State Circuits 2006, 41, 146–151.
- Chang, L.; Fried, D.M.; Hergenrother, J.; Sleight, J.W.; Dennard, R.H.; Montoye, R.K.; Sekaric, L.; McNab, S.J.; Topol, A.W.; Adams, C.D.; et al. Stable SRAM cell design for the 32 nm node and beyond. In Proceedings of the IEEE 2005 Symposium on VLSI Technology (INSPEC Acc. No. 8615693), Kyoto, Japan, 14–16 June 2005; pp. 128–129.
- Chang, L.; Montoye, R.K.; Nakamura, Y.; Batson, K.A.; Eickemeyer, R.J.; Dennard, R.H.; Haensch, W.; Jamsek, D. An 8T-SRAM for variability tolerance and low-voltage operation in high-performance caches. IEEE J. Solid-State Circuits 2008, 43, 956–963.
- Chang, M.-F.; Chang, S.-W.; Chou, P.-W.; Wu, W.-C. A 130 mV SRAM with Expanded Write and Read Margins for Subthreshold Applications. IEEE J. Solid-State Circuits 2011, 46, 520–529.
- Tu, M.-H.; Lin, J.-Y.; Tsai, M.-C.; Lu, C.-Y.; Lin, Y.-J.; Wang, M.-H.; Huang, H.-S.; Lee, K.-D.; Shih, W.-C.; Jou, S.-J.; et al. A Single-Ended Disturb-Free 9T Subthreshold SRAM With Cross-Point Data-Aware Write Word-Line Structure, Negative Bit-Line, and Adaptive Read Operation Timing Tracing. IEEE J. Solid-State Circuits 2012, 47, 1469–1482.
- Shin, K.; Choi, W.; Park, J. Half-Select Free and Bit-Line Sharing 9T SRAM for Reliable Supply Voltage Scaling. IEEE Trans. Circuits Syst. I Regul. Pap. 2017, 64, 2036–2048.
- Chang, I.J.; Kim, J.-J.; Park, S.P.; Roy, K. A 32 kb 10T sub-threshold SRAM array with bit-interleaving and differential read scheme in 90 nm CMOS. IEEE J. Solid-State Circuits 2009, 44, 650–658.
- Lo, C.-H.; Huang, S.-Y. P-P-N Based 10T SRAM Cell for Low-Leakage and Resilient Subthreshold Operation. IEEE J. Solid-State Circuits 2011, 46, 695–704.
- Chiu, Y.-W.; Hu, Y.-H.; Tu, M.-H.; Zhao, J.-K. 40 nm Bit-Interleaving 12T Subthreshold SRAM with Data-Aware Write-Assist. IEEE Trans. Circuits Syst. I Regul. Pap. 2014, 61, 2578–2585.
- Khayatzadeh, M.; Lian, Y. Average-8T Differential-Sensing Subthreshold SRAM With Bit Interleaving and 1k Bits Per Bitline. IEEE Trans. Very Large Scale Integr. (VLSI) Syst. 2014, 22, 971–982.
- Kang, K.; Jeong, H.; Yang, Y.; Park, J.; Kim, K.; Jung, S.-O. Full-Swing Local Bitline SRAM Architecture Based on the 22-nm FinFET Technology for Low-Voltage Operation. IEEE Trans. Very Large Scale Integr. (VLSI) Syst. 2016, 24, 1342–1350.
- Chien, Y.-C.; Wang, J.-S. A 0.2 V 32-Kb 10T SRAM with 41 nW Standby Power for IoT Applications. IEEE Trans. Circuits Syst. I Regul. Pap. 2018, 65, 2443–2454.
- Wu, S.-L.; Li, K.-Y.; Huang, P.-T.; Hwang, W.; Tu, M.-H.; Lung, S.-C.; Peng, W.-S.; Huang, H.-A.; Lee, K.-D.; Kao, Y.-S.; et al. A 0.5-V 28-nm 256-kb Mini-Array Based 6T SRAM with Vtrip-Tracking Write-Assist. IEEE Trans. Circuits Syst. I Regul. Pap. 2017, 64, 1791–1802.
- Oh, T.W.; Jeong, H.; Park, J.; Jung, S.-O. Pre-Charged Local Bit-Line Sharing SRAM Architecture for Near-Threshold Operation. IEEE Trans. Circuits Syst. I Regul. Pap. 2017, 64, 2737–2747.
- Chen, S.-Y.; Wang, C.-C. Single-ended disturb-free 5T loadless SRAM cell using 90 nm CMOS process. In Proceedings of the IEEE 2012 International Conference on IC Design & Technology (INSPEC Acc. No. 12851100), Austin, TX, USA, 30 May–1 June 2012; pp. 1–4.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).