Embedded Memories for Cryogenic Applications

Abstract: The ever-growing interest in cryogenic applications has prompted the investigation of energy-efficient and high-density memory technologies that can operate efficiently at extremely low temperatures. This work analyzes three appealing embedded memory technologies under cooling, from room temperature (300 K) down to cryogenic levels (77 K). As the temperature goes down to 77 K, six-transistor static random-access memory (6T-SRAM) presents slight improvements in static noise margin (SNM) during hold and read operations, while suffering from a lower (−16%) write SNM. Gain-cell embedded DRAM (GC-eDRAM) shows significant benefits under these conditions, with read voltage margins and data retention time improved by about 2× and 900×, respectively. Non-volatile spin-transfer torque magnetic random-access memory (STT-MRAM) based on single- or double-barrier magnetic tunnel junctions (MTJs) exhibits higher read voltage sensing margins (36% and 48%, respectively), at the cost of longer write access times (1.45× and 2.1×, respectively). These characteristics make the considered memory technologies attractive candidates not only for high-performance computing, but also for bridging the gap from room temperature to the realm of cryogenic applications that operate down to liquid helium temperatures and below.


Introduction
Cryogenic electronics is an emerging approach to improve computer performance and deal with the static power consumption issue resulting from transistor scaling towards the end of Moore's law [1-3]. MOS technology operating at cryogenic temperatures provides several benefits, such as a steeper subthreshold slope, increased carrier mobility, and increased saturation velocity, leading to semiconductor-based circuits with faster operation, reduced leakage, and improved energy efficiency [4,5]. As shown in Figure 1, MOS technologies operating at cryogenic temperatures are interesting for a wide spectrum of applications including high-performance computing [6,7], control systems for quantum processors [8,9], and aerospace applications [5,10,11]. The ability to operate at cryogenic temperatures has always been a sought-after feature of electronic devices for deep space applications; however, high-performance computing and especially quantum computing are now increasing the demand for processors and memories that can operate at very low temperatures. While quantum computing systems operate in the mK range, memory sub-systems capable of operating at the liquid nitrogen boiling point (77 K) and interfacing with systems operating at helium temperatures (4 K) are sought as more cost-effective solutions [2,12]. This potentially makes it possible to bridge the gap from room temperature to cryogenic applications that operate down to 4 K and below [8]. The benefits of cooling down processors and memory systems to cryogenic temperatures as low as 77 K have recently been demonstrated [2,13,14]. The studies reported in [11,13,15] mainly focus on traditional embedded memories based on six-transistor static random-access memory (6T-SRAM), which are shown to provide significant improvements in terms of performance.
However, the relatively large bitcell area of 6T-SRAM limits the overall on-chip memory density and the many leakage paths present in these memories limit the achievable power savings [16]. To improve on these issues, other memory technologies like Gain-Cell embedded DRAMs (GC-eDRAMs) and spin-transfer torque magnetic RAMs (STT-MRAMs) were recently proposed as promising candidates for cryogenic computing applications [5,14,16].
GC-eDRAM has recently been evaluated at 77 K [5,13], showing that it is a viable alternative to 6T-SRAM under cryogenic operation. In addition to the reduced cell area footprint, the refresh power of GC-eDRAMs is greatly reduced at 77 K thanks to the suppressed MOS transistor leakage current. This leads to overall static (retention) power savings as compared to 6T-SRAM. In particular, for a 2T mixed pMOS-nMOS GC-eDRAM, the data retention time is found to be in the range of ms, enabling considerable power savings as compared to the room temperature operating condition [5]. GC-eDRAMs based on a 3T topology have also been evaluated in [13], demonstrating that cache performance similar to 6T-SRAM can be obtained, while achieving higher density, comparable access speed, and lower power. A recent test chip of a 2T-based GC-eDRAM has been evaluated in the temperature range from 4 K to 300 K for various supply voltages [12]. The prototype shows outstanding improvements in data retention time, by about six orders of magnitude, when cooling down from 300 K to 4 K. Therefore, GC-eDRAMs can be considered a power-effective solution for building embedded memories operating at cryogenic temperatures.
STT-MRAM based on single-barrier magnetic tunnel junction (SMTJ) operating at 77 K has been demonstrated to be an energy-efficient solution for larger cache sizes [16]. However, due to relatively high switching currents, it suffers from longer write access than 6T-SRAM technology. In addition, for smaller cache sizes, SMTJ-based STT-MRAM exhibits latency and energy penalties under write access. To deal with this, and to further reduce the energy consumption of SMTJ-based STT-MRAMs operating at 77 K, STT-MRAM based on double-barrier MTJ (DMTJ) with two reference layers has also been proposed [17]. In addition, leveraging the thermal stability factor of MTJ devices has also been considered as a promising alternative to build reliable, energy-efficient, and high density STT-MRAMs at 77 K. While this was experimentally demonstrated for SMTJ-based STT-MRAM, as reported by Taiwan Semiconductor Manufacturing Company (TSMC) [14], a DMTJ-based STT-MRAM cryogenic simulation study with such an approach is also reported in [18], suggesting that, in contrast to conventional 6T-SRAM, better energy-efficiency can be achieved even for small-to-large cryogenic embedded memories.
In this paper, we present a comparative evaluation of GC-eDRAM, 6T-SRAM, and STT-MRAM memories operating at 77 K. The analysis is carried out based on a 65 nm commercial process design kit (PDK) calibrated for 77 K against silicon measurements. For simulating the STT-MRAMs, our study uses state-of-the-art SMTJ and DMTJ Verilog-A compact models [19,20]. The results presented within this study are based on comprehensive bitcell-level Monte Carlo simulations. As the main result of this work, we show the key figures of merit of the considered memory technologies when cooled down from 300 K to 77 K. We show that 6T-SRAM offers slight improvements (≈5%) in terms of hold and read static noise margins, while suffering from a lower write noise margin (−16%). GC-eDRAM shows larger read voltage margins and data retention time, by about 2× and 900×, respectively. SMTJ- and DMTJ-based STT-MRAMs benefit from higher read voltage sensing margins (36% and 48%, respectively), while exhibiting longer write access times (1.45× and 2.1×, respectively).
The rest of the paper is organized as follows: Section 2 presents a brief review of the considered memory technologies and their operating characteristics when operating at cryogenic temperatures. Section 3 presents the simulation results, with the comparison at 77 K discussed in Section 4. Section 5 concludes this work.

Background
The embedded memory technologies considered in this work are shown in Figure 2:
• 6T-SRAM: It is based on a pair of cross-coupled inverters that store the volatile data. The cell is accessed for write and read operations by asserting the wordline (WL), and either driving the bitline (BL) and the complementary bitline (BLB) to opposite logic values for a write, or pre-charging them for a read. Although this is the most mature embedded memory technology available on the market, it has barely been studied at cryogenic temperatures. Recently, 6T-SRAM was evaluated in [15], showing the different trade-offs in terms of static noise margins.
• GC-eDRAM: This circuit is most often constructed from two to four transistors, and the dynamic (volatile) data is stored as charge on a parasitic capacitance, commonly referred to as the storage node (SN). The 2T mixed nMOS-pMOS GC-eDRAM cell is chosen among the different topologies in light of its better performance at 77 K [5]. The write operation is done by asserting the write wordline (WWL) of the nMOS write port (NW) and driving the write bitline (WBL) to V_DD ('1') or ground ('0'), so that the charge is transferred to or from the SN. As for the read operation, first the read bitline (RBL) is precharged, and then the pMOS read port (PR) is enabled by asserting the read wordline (RWL). If the SN is holding a '1', the RBL is discharged to ground, and if it is a '0', the RBL is maintained at V_DD. A recent study experimentally demonstrates the GC-eDRAM capabilities when cooled down from room temperature to liquid helium temperature (4 K) [12].
• STT-MRAM: This bitcell consists of a MOS access transistor and an MTJ that stores the non-volatile information. The MTJ stack is built with a reference layer (RL) and a free layer (FL), sandwiching a thin oxide barrier (of thickness t_OX). This structure is known as an SMTJ, and presents relatively high switching currents, which impact the bitcell write operation [21]. To deal with this, a possible solution is to use a DMTJ with two reference layers (reference layer top (RL_T) and bottom (RL_B)) that enhance the total torque acting on the FL, leading to lower switching currents, albeit with increased resistance and reduced tunnel magnetoresistance (TMR) [22,23]. According to the relative orientation of the FL with respect to that of the RL (or RL_T in the case of the DMTJ), two states are possible: parallel (P) or antiparallel (AP). For more detailed information on the SMTJ and DMTJ structures, the reader is referred to our previous works [21,24].
STT-MRAM cells can be built from different topologies, which were previously evaluated in [21]. Among the different bitcell topologies, the most area-efficient are the 1TRC and 1TSC configurations for SMTJ and DMTJ, respectively, as shown in Figure 2c (1TRC and 1TSC denote one-transistor/one-MTJ cells in reverse connection (RC) and standard connection (SC), respectively).
Table 1 shows the expected impact of cryogenic temperatures on the considered embedded memory technologies. Conventional 6T-SRAM allows significant improvements in terms of performance and power, mainly due to the faster memory access and reduced leakage currents, albeit with reduced write static noise margin (WSNM). The GC-eDRAM also presents power and performance advantages, while also requiring fewer refresh operations at cryogenic temperatures, due to the reduced leakage currents. That being said, GC-eDRAM is still a dynamic memory technology, such that long data retention still requires refresh operations, which complicate the overall system design. When operating at cryogenic temperatures, the STT-MRAM is expected to provide orders of magnitude better endurance and an improved readout signal (due to the higher tunnel magnetoresistance), at the cost of higher write energy owing to the increased critical switching currents. Overall, all the considered memory technologies benefit from lower bitline resistance and faster peripheral circuitry when cooled down to cryogenic temperatures. Note that standalone (off-chip) dynamic random-access memory (DRAM) is also considered a good candidate for cryogenic computing [7]; however, this work focuses only on embedded memory technologies.

Simulation Analysis at Cryogenic Temperatures
The embedded memories taken into consideration within this study are designed using a commercial 65 nm CMOS technology, whose BSIM4.7 transistor models were calibrated at the operation temperature of 77 K. The calibrated models take into consideration the impact of cryogenic temperatures on different process corners, along with cryogenic-temperature dependent equations for different parameters like: leakage (e.g., GIDL), mobility, channel doping, body factor, series resistances, stress effects (on threshold voltage, mobility, body factor), etc. As reported in our previous work [5], while the cryogenic-aware calibrated model is roughly correspondent with the original PDK modeling for the operating point of 300 K, as temperature goes down to 77 K, the calibrated model tracks the silicon wafer measurements much more accurately.
The simulations of the STT-MRAMs use state-of-the-art Verilog-A SMTJ and DMTJ compact models [19,20], whose major device parameters are presented in Table 2. The STT-MRAM compact models are based on physical parameters, which were characterized with experimental prototypes at 300 K. The impact of the cryogenic temperature is taken into account according to the formulations provided in [17]. The simulation analysis reported below is based on extensive Monte Carlo circuit-level simulations of the considered memory technologies operating at 300 K and 77 K. These Monte Carlo simulations consider both CMOS and MTJ variability (σ/µ). In particular, for the MTJ devices, the Gaussian distributed variability is 5% for the cross-section area, and 1% for t_OX, t_OX,T, t_OX,B, and t_FL [21,25,26]. The variability of the CMOS devices is provided by the statistical models of the cryogenic PDK.
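The MTJ variability described above can be illustrated with a minimal Python sketch. It only shows how Gaussian samples with the stated σ/µ values could be drawn; the nominal values and all names below are hypothetical placeholders (they are not taken from Table 2), and the actual simulation flow relies on the Verilog-A compact models and the PDK statistical models.

```python
import numpy as np

# Relative variability (sigma/mu) as stated in the text.
REL_SIGMA = {"area": 0.05, "t_ox": 0.01, "t_ox_t": 0.01, "t_ox_b": 0.01, "t_fl": 0.01}

# Placeholder nominal values, chosen only for illustration (NOT from Table 2).
NOMINAL = {
    "area": 1.0e-15,    # MTJ cross-section area in m^2 (hypothetical)
    "t_ox": 0.85e-9,    # SMTJ barrier thickness in m (hypothetical)
    "t_ox_t": 0.85e-9,  # DMTJ top barrier thickness in m (hypothetical)
    "t_ox_b": 0.40e-9,  # DMTJ bottom barrier thickness in m (hypothetical)
    "t_fl": 1.0e-9,     # free layer thickness in m (hypothetical)
}

def sample_mtj_parameters(n_samples, seed=0):
    """Draw independent Gaussian samples for each MTJ parameter."""
    rng = np.random.default_rng(seed)
    return {
        name: rng.normal(loc=mu, scale=REL_SIGMA[name] * mu, size=n_samples)
        for name, mu in NOMINAL.items()
    }

samples = sample_mtj_parameters(200_000)
area_ratio = samples["area"].std() / samples["area"].mean()
print(f"sampled sigma/mu for area: {area_ratio:.4f}")
```

Each sampled parameter set would then be passed to one circuit simulation run; the sketch treats the parameters as uncorrelated, which is an assumption.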

Static Random-Access Memory (SRAM)
Stability is a crucial design metric in nano-scaled SRAM technologies. Figure 3 shows the statistical distributions of the static noise margin (SNM) for hold (HSNM), read (RSNM), and write (WSNM), when the SRAM is cooled down from 300 K to 77 K. The SNM metrics can be measured by the method proposed by Hill [27]. It consists of plotting the voltage transfer characteristics (VTC) of the SRAM inverters in order to find the noise margins that the SRAM cell can tolerate without disturbing its state. Note that the above method is not efficient for yield results in terms of stability. To deal with this, we used the most-accepted methodology, first proposed by Seevinck et al. [28]. This method efficiently measures the noise margin metrics with a DC sweep simulation. In particular, measuring noise margins (HSNM, RSNM, WSNM) distributions provides a reliable yield estimation, typically at 6σ, which is required for the design of high-density SRAM cells [29].
HSNM and RSNM measure the largest DC noise voltage that the SRAM cell can withstand without flipping its stored state. As for the WSNM, it is the minimum voltage required for the SRAM cell to be in a monostable state [29]. While the HSNM is measured with the WL tied to ground, for the RSNM and WSNM the WL is asserted while the bitlines are driven to V_DD and to opposite logic values, respectively. As compared to the 300 K operating temperature, at 77 K the HSNM and RSNM present a slight increase of about 7.66% and 3.31%, respectively. While this result is obtained at a nominal V_DD of 1.2 V, the HSNM and RSNM improvements are larger when operating at lower V_DD [15]. In particular, when operating in the subthreshold region (V_DD = 0.3 V), the HSNM and RSNM increase by about 2× and 4×, respectively. This is due to the steeper transition of the data storage nodes at 77 K [15]. As for the WSNM, it degrades by 15.6% at 77 K due to the increased strength of the SRAM cell pull-up network at low temperature.
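To make the noise margin extraction more concrete, the following Python sketch estimates an SNM from two sampled voltage transfer characteristics using the well-known 45° rotated-axes construction: the largest diagonal gap between the two butterfly curves, divided by √2, gives the side of the largest embedded square. The tanh-shaped inverter and its gain values are purely illustrative assumptions, not the transistor-level curves used in this work.

```python
import numpy as np

def static_noise_margin(vin, vout_a, vout_b):
    """SNM from two sampled, monotonically falling inverter VTCs.

    Curve A is plotted as (x, y) = (vin, vout_a) and curve B, mirrored,
    as (x, y) = (vout_b, vin). Axes are rotated by 45 degrees:
    u = (x - y)/sqrt(2), v = (x + y)/sqrt(2).
    """
    vin = np.asarray(vin, dtype=float)
    vout_a = np.asarray(vout_a, dtype=float)
    vout_b = np.asarray(vout_b, dtype=float)
    s = np.sqrt(2.0)
    u_a, v_a = (vin - vout_a) / s, (vin + vout_a) / s
    u_b, v_b = (vout_b - vin) / s, (vout_b + vin) / s
    # u_b decreases with vin; reverse so np.interp sees increasing abscissae.
    u_b, v_b = u_b[::-1], v_b[::-1]
    u = np.linspace(max(u_a[0], u_b[0]), min(u_a[-1], u_b[-1]), 4001)
    gap = np.interp(u, u_a, v_a) - np.interp(u, u_b, v_b)
    # One lobe has positive gap, the other negative; each maximum is a diagonal.
    diagonal = min(gap.max(), (-gap).max())
    return diagonal / s  # side of the largest square in the smaller lobe

def tanh_inverter(vin, vdd, gain):
    """Illustrative inverter VTC (an assumption, not a device model)."""
    return 0.5 * vdd * (1.0 - np.tanh(gain * (vin - 0.5 * vdd)))

vdd = 1.2
vin = np.linspace(0.0, vdd, 2001)
for gain in (10.0, 200.0):
    vtc = tanh_inverter(vin, vdd, gain)
    print(f"gain={gain:5.0f} 1/V -> SNM = {static_noise_margin(vin, vtc, vtc):.3f} V")
```

A steeper transfer characteristic yields a larger margin in this toy model, approaching V_DD/2 for an ideal inverter.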

Gain-Cell Embedded DRAM (GC-eDRAM)
As opposed to 6T-SRAM, which has one universally accepted bitcell topology, previous GC-eDRAM research has suggested a large variety of configurations, depending on the target specifications and technology node. In our previous work [5], we demonstrated that the mixed nMOS-pMOS configuration (2T NW-PR) of the GC-eDRAM cell represents the best solution at 77 K for the considered 65 nm technology. Note that the following GC-eDRAM results in terms of voltage margins and data retention capabilities were evaluated in the worst-case condition, that is, when the WBL is driven to V_DD or ground while the SN is '0' or '1', respectively. Figure 4a shows the statistical distributions of the read bitline voltages when the SN is holding a '1' (V_RBL('1')) or a '0' (V_RBL('0')) at an operating temperature of 77 K and considering a read pulse of 1 ns. The voltage margin V_M, which is defined as the difference between the RBL voltages for '1' and '0' (i.e., V_M = V_RBL('1') − V_RBL('0')), is measured 4 µs after writing into the SN. The figure shows V_M evaluated at both 3-sigma and 6-sigma, with considerably high margins when operating at 77 K. Moreover, Figure 4b shows the statistical distributions of the read voltage margin corresponding to the mean values measured in Figure 4a. In addition to V_M at 77 K, 300 K is also considered, to better show the benefits in terms of voltage margins when the GC-eDRAM operates under cryogenic conditions. In particular, V_M at 77 K is improved by 2× as compared to the 300 K simulation results.
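As an illustration of how a k-sigma voltage margin can be derived from two Monte Carlo distributions, the Python sketch below takes the worst-case separation between the two read bitline voltage populations. This is one common convention and an assumption here; the exact statistical procedure behind Figure 4a is not specified in the text, and the sample values are made up.

```python
import numpy as np

def k_sigma_margin(samples_a, samples_b, k):
    """Worst-case separation between two distributions at k sigma.

    The population with the higher mean is shifted down by k sigma and the
    one with the lower mean is shifted up by k sigma before subtracting.
    """
    a = np.asarray(samples_a, dtype=float)
    b = np.asarray(samples_b, dtype=float)
    hi, lo = (a, b) if a.mean() >= b.mean() else (b, a)
    return (hi.mean() - k * hi.std(ddof=1)) - (lo.mean() + k * lo.std(ddof=1))

# Made-up RBL voltage samples in volts, for illustration only.
v_rbl_high = [0.9, 1.0, 1.1]
v_rbl_low = [0.1, 0.2, 0.3]
print(k_sigma_margin(v_rbl_high, v_rbl_low, k=3))  # margin at 3 sigma
print(k_sigma_margin(v_rbl_high, v_rbl_low, k=6))  # margin at 6 sigma
```

A negative result at a given k would indicate that the two populations overlap at that confidence level.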
Because of the inherently dynamic characteristics of the GC-eDRAM, periodic refresh operations are required. However, since the subthreshold leakage is substantially suppressed at cryogenic temperatures, the memory retention time is expected to be considerably improved. Accordingly, we extended our analysis of the considered 2T GC-eDRAM cell to evaluate its data retention capabilities. Figure 5a,b show the worst-case storage node deterioration after writing '0'/'1' into the 2T GC-eDRAM when operating at 300 K and 77 K, respectively. While the blue curves show the degradation of a logic '1' level when the WBL is driven to ground, the red curves show the degradation of the logic '0' level with the WBL driven to V_DD. As compared to the 300 K simulations, the 77 K operating point shows improvements by orders of magnitude. In particular, it can be seen that the SN degradation at cryogenic temperatures occurs on the order of ms, which indicates that energy-wasting refresh operations could be considerably reduced. This is further emphasized by Figure 5c, which shows the statistical distribution of the data retention time (DRT) at 300 K and 77 K. Here, we define DRT as the time it takes for the difference between the '1' and '0' SN voltages, reported in Figure 5a,b, to fall to 200 mV [30]. At 77 K, the DRT is improved by 900× with respect to room temperature simulations, while also exhibiting 98% less variability.
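The DRT definition above can be sketched in Python as a threshold-crossing search on two sampled storage node waveforms. The exponential waveforms and the time constant below are illustrative assumptions and do not reproduce the simulated curves of Figure 5.

```python
import numpy as np

def data_retention_time(t, v_sn_one, v_sn_zero, threshold=0.2):
    """Time at which v_sn_one - v_sn_zero first falls to `threshold` (in V).

    Returns None if the difference never reaches the threshold within `t`.
    Linear interpolation is used between the two samples around the crossing.
    """
    t = np.asarray(t, dtype=float)
    diff = np.asarray(v_sn_one, dtype=float) - np.asarray(v_sn_zero, dtype=float)
    below = np.nonzero(diff <= threshold)[0]
    if below.size == 0:
        return None
    i = int(below[0])
    if i == 0:
        return float(t[0])
    frac = (diff[i - 1] - threshold) / (diff[i - 1] - diff[i])
    return float(t[i - 1] + frac * (t[i] - t[i - 1]))

# Illustrative waveforms: both levels drift towards 0.6 V with a 2 ms time constant.
tau = 2e-3
t = np.linspace(0.0, 10e-3, 10001)
v_one = 0.6 + 0.5 * np.exp(-t / tau)
v_zero = 0.6 - 0.5 * np.exp(-t / tau)
drt = data_retention_time(t, v_one, v_zero)
print(f"DRT = {drt * 1e3:.3f} ms")
```

For this toy model the difference decays as exp(−t/τ), so the result can be checked against the closed form τ·ln(1/0.2).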

Spin-Transfer Torque Magnetic RAM (STT-MRAM)
The 1TRC and 1TSC bitcell configurations were considered in this work since they are the most cost-effective solutions in terms of area and energy to build STT-MRAM embedded memories based on SMTJ and DMTJ, respectively [21]. Figure 2c shows the 1TRC and 1TSC configurations, where the RL (for the SMTJ) or RL_B (for the DMTJ) is connected to the BL.
Monte Carlo simulation results under write and read accesses are shown in Figures 6 and 7 for both SMTJ- and DMTJ-based STT-MRAM cells. The reported results refer to 300 K and 77 K simulations, while considering a write error rate (WER) and read disturbance rate (RDR) of 10^-7 and 10^-9, respectively [31]. In particular, Figure 6 shows the statistical distribution of the write pulse width (t_p), referred to the worst case between the AP→P and P→AP transitions. When cooling down to 77 K, we can observe a penalty of about 1.45× and 2.1× in terms of t_p for SMTJ- and DMTJ-based bitcells, respectively. This is because the critical switching current dramatically increases as the temperature goes down to cryogenic levels [16,17,32], with an adverse impact on energy and latency for the write operation. This can be counterbalanced by increasing the width of the access transistor. Note that t_p is evaluated at 6-sigma (t_p,6σ). Although SMTJ- and DMTJ-based STT-MRAM solutions present an increased t_p (by about 45% and 110%, respectively) as compared to the 300 K operating point, STT-MRAM based on DMTJ allows a t_p of a few ns, even at cryogenic temperatures.
To evaluate the reading performance of STT-MRAMs, we used the conventional voltage sensing (CVS) scheme [33], which consists of applying a fixed read current (I_read) to the BL of the bitcell, and then comparing the BL voltage (V_BL) with a reference voltage (V_REF) by means of a sense amplifier. I_read is set low enough not to disturb the stored data (RDR = 10^-9), and a read pulse width (t_read) of 1 ns is considered. Figure 7 shows the statistical distribution of the bitline voltages for SMTJ- and DMTJ-based STT-MRAM bitcells operating at 300 K and 77 K. From the two sensing operations, V_BL(P) and V_BL(AP), the voltage sensing margin (V_SM) is defined as V_SM = V_BL(AP) − V_BL(P). Due to the rise in the TMR, V_SM increases at cryogenic temperatures as compared to 300 K [14,32].
In particular, from Figure 7, we can observe V_SM improvements of about 36% and 48% for the SMTJ- and DMTJ-based STT-MRAMs, respectively. Note that the DMTJ-based bitcell has reduced sensing margins with respect to its SMTJ-based counterpart. In both cases, the sensing margins can be improved by adopting proper BL boosting design techniques [34,35].
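The dependence of the sensing margin on the TMR can be illustrated with a first-order Python sketch. It assumes an ideal read current source, a state-independent access transistor resistance, and the usual definition TMR = (R_AP − R_P)/R_P; the numerical values are hypothetical and do not come from Table 2 or from the compact models.

```python
def sensing_margin(i_read, r_p, tmr, r_access=0.0):
    """First-order V_SM = V_BL(AP) - V_BL(P) for a fixed read current.

    i_read   : read current in A (assumed ideal and identical for both states)
    r_p      : MTJ resistance in the parallel state, in ohm
    tmr      : tunnel magnetoresistance ratio, (R_AP - R_P) / R_P
    r_access : series resistance of the access transistor in ohm (assumed constant)
    """
    r_ap = r_p * (1.0 + tmr)
    v_bl_p = i_read * (r_p + r_access)
    v_bl_ap = i_read * (r_ap + r_access)
    return v_bl_ap - v_bl_p

# Hypothetical operating points: same current and R_P, two different TMR values.
vsm_low_tmr = sensing_margin(i_read=20e-6, r_p=5e3, tmr=1.0)
vsm_high_tmr = sensing_margin(i_read=20e-6, r_p=5e3, tmr=1.5)
print(f"V_SM = {vsm_low_tmr * 1e3:.1f} mV vs {vsm_high_tmr * 1e3:.1f} mV")
```

In this simplified model V_SM reduces to I_read·R_P·TMR, so a higher TMR translates directly into a larger margin. In practice I_read is itself constrained by the target RDR and the resistances are bias and temperature dependent, which the sketch ignores.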

Comparison Results
In order to complete our analysis and make a direct comparison between the different memory technologies, we have measured their main characteristics, including area, sensing margins, data retention capabilities, read/write access, and both power and energy consumption. Table 3 summarizes the simulation results for a 128-word memory bank operating at 77 K and considering a nominal V_DD. Note that the data reported in terms of area correspond to a standard full-custom design, i.e., not using "pushed rules".
From Table 3, the STT-MRAM configurations are the most area-efficient solutions, presenting a bitcell area footprint about 88% and 56% smaller than that of the 6T-SRAM and GC-eDRAM, respectively. This characteristic makes them very attractive for designing dense, non-volatile embedded memory banks. In terms of sensing margins, the GC-eDRAM is the technology that benefits the most from operating at 77 K, showing V_M improvements up to 90%. While the STT-MRAM cells also present improvements in terms of V_SM, the 6T-SRAM maintains comparable HSNM and RSNM, along with reduced WSNM. As for the data retention capabilities, while 6T-SRAM and STT-MRAM provide static and non-volatile behavior, respectively, GC-eDRAM exhibits DRTs on the order of ms, allowing fewer refresh operations than at the room temperature operating point.
From Table 3, the STT-MRAM is the most penalized in terms of access time, mainly due to the longer write time at 77 K. In particular, the DMTJ-based STT-MRAMs suffer from write times about 24× longer, on average, than those of 6T-SRAM and GC-eDRAM. In terms of write and read energy, the GC-eDRAM is the most energy-efficient solution, showing write and read improvements of about 63% and 70%, respectively, as compared to 6T-SRAM. However, unlike STT-MRAM and 6T-SRAM, GC-eDRAM still requires energy-consuming refresh operations, although far fewer than at room temperature. Finally, the leakage power is also measured in standby mode. While STT-MRAM technology presents almost zero leakage (the only leakage contribution is due to the periphery), 6T-SRAM presents 98% higher leakage power than GC-eDRAM.

Conclusions
In this work, we investigated the impact of cryogenic temperatures on different embedded memory technologies. Our study was carried out using a commercial 65 nm 1.2 V CMOS technology fully calibrated against silicon measurements at cryogenic temperatures. The obtained results demonstrate that embedded memory technologies benefit in different figures of merit when cooled down from 300 K to 77 K. Although the most commercially mature technology, 6T-SRAM, is faster and less leaky at cryogenic temperatures, its relatively large area footprint and reduced write static noise margin at 77 K are drawbacks. GC-eDRAM excels in most figures of merit, and even if refresh operations are required, the resulting refresh power is considerably lower than when operating at room temperature. In particular, GC-eDRAM benefits from improved read voltage margins and data retention time, by about 2× and 900×, respectively. STT-MRAMs based on SMTJ present a high write overhead due to increased switching currents at cryogenic temperatures. However, the DMTJ-based solution significantly reduces the write penalty (by 83%). Furthermore, the readout capability of STT-MRAMs is improved, enabling more reliable read operations at cryogenic temperatures. Overall, our evaluation points out that embedded memory technologies can be interesting for cryogenic applications, not only for high-performance computing, but also for bridging the gap from room temperature to the realm of cryogenic applications that operate down to liquid helium temperatures and below.

Conflicts of Interest:
The authors declare no conflict of interest.

Abbreviations
The following abbreviations are used in this manuscript: 6T-SRAM