Ultralow Voltage FinFET- Versus TFET-Based STT-MRAM Cells for IoT Applications

Abstract: Spin-transfer torque magnetic tunnel junction (STT-MTJ) based on double-barrier magnetic tunnel junction (DMTJ) has shown promising characteristics for the design of low-power non-volatile memories. This, in combination with tunnel FET (TFET) technology, could enable the design of ultralow-power/ultralow-energy STT magnetic RAMs (STT-MRAMs) for future Internet of Things (IoT) applications. This paper presents a comparison between FinFET- and TFET-based STT-MRAM bitcells operating at ultralow voltages. Our study is performed at the bitcell level by considering a DMTJ with two reference layers and exploiting either FinFET or TFET devices as cell selectors. Although ultralow-voltage operation occurs at the expense of reduced reading voltage sensing margins, simulation results show that TFET-based solutions are more resilient to process variations and can operate at ultralow voltages (<0.5 V), while showing energy savings of 50% and 60% faster write switching.


Introduction
Spin-transfer torque magnetic random access memory (STT-MRAM) is an attractive solution for on-chip non-volatile memories with zero standby power [1–7]. Thanks to its inherent non-volatility, compatibility with CMOS processes, relatively large endurance and, in particular, small area footprint and ability to operate at relatively low voltages, STT-MRAM has become a key memory candidate for future Internet of Things (IoT) applications, where energy efficiency is a highly sought-after feature [1,8]. Despite these favorable properties, a compatible technology is needed to realize STT-MRAMs working at ultralow operating voltages (i.e., below 0.5 V), as required for low-cost, tightly constrained IoT systems [9,10]. Unfortunately, conventional STT-MRAMs based on the single-barrier magnetic tunnel junction (SMTJ) present limited voltage scalability, requiring high switching currents for reliable write operations [4,11,12]. In addition, standard transistors based on conventional planar CMOS technologies feature too small on-currents (I_ON) when operated at reduced voltages. To mitigate these drawbacks, emerging FinFET [13,14] or tunnel-FET (TFET) [15–17] technologies, along with the double-barrier MTJ (DMTJ) device, can represent effective solutions to design ultralow-power/ultralow-energy STT-MRAMs.
Previous studies [18–21] have considered FinFET-based STT-MRAM as an alternative to deal with the energy-efficiency limitations of conventional CMOS technology, while also improving write access times in classical SMTJ-based STT-MRAMs. However, high writing currents are still required, and thus a relatively high operating voltage is needed, which limits the overall energy efficiency of the memory. The studies reported in [12,22] considered TFET technology for the access device of the STT-MRAM cell, showing that TFET-based cells are more energy efficient than FinFET-based cells. However, these single-cell studies were carried out under nominal simulations, without taking into account process variability. Another work, presented in [4], shows that STT-MRAM based on FinFET technology, along with DMTJ devices with two reference layers, enables lower operating voltage thanks to the reduced DMTJ switching current as compared to the conventional SMTJ, while maintaining sufficiently high thermal stability so as not to affect data retention time. To further increase the energy benefits of DMTJ-based STT-MRAM, advanced FinFET and TFET-based technologies can be exploited. The different sub-threshold conduction behavior of TFETs and FinFETs has attracted considerable attention from several research groups, which have proposed comparative benchmarks based on applications ranging from digital and arithmetic-logic circuits [23,24] to analog blocks [17,25,26] and static RAM memory cells [27,28], among others. Due to inherent device characteristics such as the steep subthreshold slope and the high ON/OFF ratios when operating at low voltages, the prevailing view in the research community is that TFETs have the potential to outperform FinFETs in applications requiring operating supply voltages (V_DD) below 0.4 V [15].
In the above context, this work investigates STT-MRAM cells based on DMTJ operating at ultralow voltages. In particular, our study was carried out at the memory-bitcell level, where TFET-based DMTJ STT-MRAM bitcells have been benchmarked against their FinFET-based counterparts. Our analysis exploits a state-of-the-art DMTJ Verilog-A compact model [29]. For the simulation of transistors, we used a complementary TFET technology [30] and a predictive technology model (PTM) of 10 nm node FinFETs [14], both operating in the sub-threshold voltage regime. All simulations are performed in the Cadence Virtuoso environment using the Spectre simulator.
As the main result of our analysis, we demonstrate the suitability of TFET-based STT-MRAM bitcells for the design of ultralow-power/ultralow-energy memory circuits. When powered at 0.4 V, TFET-based memory bitcells consume less energy (about −50%) and present better performance (about +60%) under write operation, as compared to the FinFET-based implementation. This is achieved while also ensuring higher robustness against process variability.
The remainder of the paper is organized as follows. Section 2 presents the considered device structures for STT-MRAM bitcells. Section 3 provides the simulation and benchmark analysis of FinFET- versus TFET-based STT-MRAM bitcells. Finally, Section 4 summarizes the main conclusions of this work.

Ultralow Voltage Transistors and STT-DMTJ
The geometrical structures and main device parameters used in this work, for both transistors (TFET and FinFET) and STT-DMTJ devices, are shown in Figure 1 and in Table 1.

Tunnel-FET (TFET) and FinFET Structures
The complementary III-V heterojunction TFET nanowires (NWs) proposed by the University of Bologna group [30], depicted in Figure 1a, and the complementary models for 10 nm node FinFETs released by Arizona State University [14], shown in Figure 1b, are both competitive devices capable of ultralow-voltage operation. In particular, the square cross-section AlGaSb/InAs NW TFETs (L_S = 7 nm, gate length L_G = 20 nm, see Table 1) and the PTM 10 nm node FinFETs (fin width t_fin = 8 nm, L_G = 14 nm) have the same footprint per device (i.e., 1 TFET NW or 1 FinFET, footprint ∼150 nm²), assuming a vertical architecture for the TFETs (as for the experimental TFET in [31]) and the standard horizontal architecture for the FinFETs. While FinFET models are available for SPICE simulations, the TFET models used in this work are based on I-V and C-V look-up tables, obtained by performing TCAD simulations of III-V TFET devices whose parameters were calibrated against the NEGF simulations performed in [30]. As for the electrical characteristics, the I_DS − V_GS curves are reported in Figure 2 for both technologies and operation types. The TFETs exhibit the advantage of a very steep transition due to their sub-60 mV/dec sub-threshold swing, but they suffer from unidirectional conduction [32], asymmetric p- and n-operation modes [30], and relatively low on-currents (I_ON) [16]. In particular, the p-type TFET has an I_ON about four times smaller than its n-type counterpart: 420 nA (p-mode) against 1.6 µA (n-mode) at V_DD = 400 mV. At the nominal V_DD of 750 mV, the n- and p-FinFETs feature a threshold voltage, V_th, of 425 mV and −428 mV, an I_ON of 44 µA and −39.5 µA, and an off-current (I_OFF) of 5.13 pA and −5.08 pA, respectively. From this perspective, the FinFETs clearly have a large advantage over the TFETs at the nominal supply voltage (750 mV). However, when operated at a V_DD close to 400 mV, their I_ON becomes comparable to that of the TFETs (I_ON of 650 nA and −500 nA for n-type and p-type, respectively), and the same absolute I_OFF of ∼2 pA is achieved. The comparable performance of nominal TFET and FinFET devices requires a deep analysis in the ultralow-voltage regime, as it is not obvious which is the best candidate to act as a selector for the ultralow-power STT-MTJ cell proposed in this work.
Because the I_ON reported for both TFETs and FinFETs is low compared with the DMTJ critical switching current (∼3 µA, see Table 1), a single device is not sufficient to act as a proper selector enabling program and erase operations of an STT-MTJ memory cell. Thus, to increase the current drive available to the magnetic device, several parallel nanowires (for the TFETs) or parallel FinFETs have been used to realize a single memory cell. The total multiplier factor (M) has been kept constant for both technologies in order to keep the comparison fair from both the area overhead and the I_OFF (i.e., static power consumption) perspectives. In contrast to most previous comparative studies, which have considered only nominal device characteristics when benchmarking TFETs and FinFETs, here we also include the device-to-device variability of the threshold voltage, which is a critical issue for circuits operated at extremely reduced V_DD levels [33]. The standard deviation (σ_vth) is in the 30-40 mV range for scaled-node FinFETs [34], according to Pelgrom's law [35], while no dependable data are available for TFETs (for instance, the data in [36] are reported for experimental devices with sizes of the order of hundreds of nanometers). For this reason, σ_vth has been kept as a free parameter for both technology options. In fact, our goal is to understand how variability in the different TFET and FinFET characteristics affects operation at the STT-MRAM bitcell level.
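The need for several parallel devices can be illustrated with a rough estimate. The sketch below is only a back-of-the-envelope calculation under stated assumptions: it takes the single-device I_ON values quoted above at V_DD = 400 mV, the critical switching current of ∼3 µA, and the I_write/I_c0 target of 3 adopted later in the paper, and it assumes each device delivers its full I_ON (i.e., it ignores the voltage drop across the DMTJ). The function name `min_parallel_devices` is hypothetical. The resulting counts are therefore idealized lower bounds, not the simulated design points.

```python
# Idealized estimate of how many parallel selectors are needed to reach a
# target write current. Currents are integers in nanoamperes so that the
# ceiling division below is exact.

I_C0_NA = 3000        # critical switching current, ~3 uA (from the text)
TARGET_RATIO = 3      # I_write / I_c0 target (design point chosen in the paper)

# Single-device on-current magnitudes at V_DD = 400 mV, as quoted in the text.
I_ON_NA = {
    "n-TFET": 1600,
    "p-TFET": 420,
    "n-FinFET": 650,
    "p-FinFET": 500,
}


def min_parallel_devices(i_on_na: int, i_target_na: int) -> int:
    """Smallest M such that M * i_on_na >= i_target_na (ceiling division)."""
    return -(-i_target_na // i_on_na)


i_target_na = TARGET_RATIO * I_C0_NA
ideal_m = {
    name: min_parallel_devices(i_on, i_target_na)
    for name, i_on in I_ON_NA.items()
}

for name, m in ideal_m.items():
    print(f"{name}: at least {m} parallel devices (ideal, no DMTJ voltage drop)")
```

Because the DMTJ sits in series with the selector, the real devices see less than the full supply voltage, so circuit simulation is expected to require more fingers than this ideal bound suggests.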

Double-Barrier Magnetic Tunnel Junction (DMTJ)
As shown in Figure 1c, a perpendicular magnetic anisotropy (PMA) DMTJ device consists of three stacked ferromagnetic (FM) layers separated by two MgO barriers with different thicknesses (t_OX,T and t_OX,B). The top and bottom FM layers, namely the reference layer top (RL_T) and the reference layer bottom (RL_B), have fixed magnetization orientations opposite to each other [29]. The remaining FM layer, known as the free layer (FL), has a variable magnetization orientation, i.e., parallel (P) or antiparallel (AP) with respect to that of the RL_T or RL_B layer. Thus, two different device states, which represent the stored data, are possible. Due to the thinner bottom barrier (t_OX,T > t_OX,B), as shown in Figure 1c, the two possible states correspond to two different equivalent resistance values, which derive from two series-connected resistances [4], each one associated with a single oxide barrier. Therefore, the DMTJ resistances in the high and low states (R_H and R_L) can be calculated as R_H = R_AP,T + R_P,B and R_L = R_P,T + R_AP,B, respectively. The low-to-high (R_L → R_H) and high-to-low (R_H → R_L) switching transitions are performed by injecting a current above the critical switching current (I_c0) into the DMTJ. In particular, as shown in Figure 1c, the R_L → R_H and R_H → R_L switching transitions arise depending on the direction of the injected current and thus of the electron flow.
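The series-resistance expressions for R_H and R_L can be checked numerically. The following is a minimal sketch, not the compact model of [29]: the per-barrier resistances are hypothetical placeholder values (the top barrier is made more resistive because it is thicker), chosen only so that the overall ratio (R_H − R_L)/R_L comes out at 150%. The function names are illustrative.

```python
# Two-state resistance of a double-barrier MTJ modeled as two series resistors,
# one per oxide barrier. All resistances are in ohms.


def dmtj_resistances(r_p_top, r_ap_top, r_p_bot, r_ap_bot):
    """Return (R_H, R_L) with R_H = R_AP,T + R_P,B and R_L = R_P,T + R_AP,B."""
    r_high = r_ap_top + r_p_bot
    r_low = r_p_top + r_ap_bot
    return r_high, r_low


def tmr_ratio(r_high, r_low):
    """Overall tunnel magnetoresistance ratio, (R_H - R_L) / R_L."""
    return (r_high - r_low) / r_low


# Hypothetical per-barrier values, NOT taken from Table 1.
R_P_TOP, R_AP_TOP = 10_000, 32_000   # thicker top barrier
R_P_BOT, R_AP_BOT = 1_000, 3_200     # thinner bottom barrier

r_high, r_low = dmtj_resistances(R_P_TOP, R_AP_TOP, R_P_BOT, R_AP_BOT)
tmr = tmr_ratio(r_high, r_low)

print(f"R_H = {r_high} ohm, R_L = {r_low} ohm, TMR = {tmr:.0%}")
```

Note that the thicker barrier dominates both states, which is why the state with the top barrier antiparallel is the high-resistance one.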
As reported in Table 1, the DMTJ parameters have been set to match experimental data in terms of I_c0 [11], while also maintaining a reasonable tunnel magnetoresistance (TMR) ratio at 0 V of about 150%. Moreover, to be compatible with the considered transistor devices, the resistance-area product (RA) was set below 10 Ω·µm², which is consistent with the trend reported in [7]. In the following analysis, we have also taken into account the effect of process variability on DMTJ devices by considering Gaussian-distributed variations, with σ/µ equal to 5% for the cross-section area and to 1% for both the FL thickness (t_FL) and the oxide barrier thicknesses (t_OX,T and t_OX,B) [37–39].
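The Gaussian variability described above can be sketched as a simple parameter sampler. This is only an illustration under stated assumptions: the nominal values in `NOMINAL` are hypothetical placeholders (they are not the Table 1 values), the four parameters are assumed to vary independently, and in the actual flow the sampled values would be passed to the Verilog-A compact model inside the circuit simulator rather than handled in Python.

```python
import random
import statistics

# Relative standard deviations (sigma/mu) quoted in the text.
SIGMA_OVER_MU = {"area": 0.05, "t_fl": 0.01, "t_ox_top": 0.01, "t_ox_bot": 0.01}

# Hypothetical nominal values (area in nm^2, thicknesses in nm); placeholders only.
NOMINAL = {"area": 700.0, "t_fl": 1.2, "t_ox_top": 0.85, "t_ox_bot": 0.4}


def sample_dmtj(rng: random.Random) -> dict:
    """Draw one DMTJ instance with independent Gaussian parameter variations."""
    return {
        name: rng.gauss(mu, SIGMA_OVER_MU[name] * mu)
        for name, mu in NOMINAL.items()
    }


rng = random.Random(0)  # fixed seed so the run is repeatable
samples = [sample_dmtj(rng) for _ in range(10_000)]

# Sanity check: the empirical sigma/mu of the area should be close to 5%.
areas = [s["area"] for s in samples]
rel_sigma_area = statistics.stdev(areas) / statistics.mean(areas)
print(f"empirical sigma/mu of area: {rel_sigma_area:.3f}")
```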
Figure 3 shows the electrical characteristics of the DMTJ device. More precisely, Figure 3a shows the typical DC resistance-voltage characteristic, where the R_L → R_H and R_H → R_L switching transitions are highlighted along with the TMR. Note that, thanks to the presence of two RLs, the critical current is symmetric across the R_L → R_H and R_H → R_L switching transitions [29]. This can be graphically appreciated in Figure 3b, which shows the switching behavior in terms of write pulse width (t_p) as a function of the write current that ensures a write-error-rate (WER) of 10

STT-MRAM BitCell Simulation and Benchmark
As shown in Figure 4, four bitcell configurations were considered in this work, two based on FinFETs and two on TFETs. All the memory bitcells are in the standard connection (SC) configuration, i.e., the RL_T of the DMTJ is connected to the access transistor(s). Such a bitcell configuration was demonstrated to be the best option in our previous FinFET-based evaluation [4], and it is also adopted here for the TFET-based bitcells, for the sake of comparison. As shown in Figure 4, the considered FinFET- and TFET-based configurations are: (a) one NMOS-one MTJ in SC (1T1MTJ-SC), (b) two complementary NMOS/PMOS transistors-one MTJ in SC (2T1MTJ-SC), (c) two n-type TFETs-one MTJ in SC (2nT1MTJ-SC), and (d) complementary n- and p-type TFETs-one MTJ in SC (2npT1MTJ-SC). Note that, owing to the unidirectional behavior of the TFETs, configurations based on a single transistor cannot accomplish the bidirectional writing operation (refer to Figure 1c). Hence, the inclusion of an additional transistor is needed to allow a bidirectional current flow.
As a first step of our analysis, we have evaluated the performance of the bitcells for a supply voltage V_DD of 0.4 V. Process variations were initially neglected, while the stochastic behavior of the DMTJ switching time was properly considered. Figure 5 shows the simulation results in terms of the ratio between the write current (I_write) and I_c0 as a function of the number of parallel-connected devices for both the R_L → R_H and R_H → R_L switching transitions. The results in Figure 5 refer only to the best-performing bitcell configurations, i.e., 2T1MTJ-SC and 2npT1MTJ-SC for FinFET- and TFET-based bitcells, as shown in Figure 5a,b, respectively. The same number of fingers was considered for both access transistor types. As highlighted in Figure 5a,b (refer to the circle with dashed line), to ensure a robust write operation we have chosen an I_write/I_c0 ratio of ≈3, which corresponds to a number of fingers of 20/20 and 17/23 for the n-type/p-type devices of the FinFET- and TFET-based bitcell configurations, respectively. Therefore, we considered a design point for FinFET- and TFET-based bitcells at parity of area and writing current. As for the 2npT1MTJ-SC configuration, we have simulated I_write/I_c0 as a function of the number of fingers per p-type (M_P) and n-type (M_N) TFET, while keeping M_N and M_P constant, respectively, as shown in Figure 5c. Here, the inherent unidirectional behavior of the TFET devices can be easily appreciated. By fixing M_N to a value that ensures the desired I_write/I_c0 ratio of 3 (refer to the design point in Figure 5b), we can ensure the same I_write for the R_H → R_L transition independently of M_P, as shown in the left part of Figure 5c; a similar behavior occurs when fixing M_P.
After choosing the design point in terms of the I_write/I_c0 ratio, we extended the TFET- versus FinFET-based bitcell comparison to different supply voltages, as shown in Figure 6. Figure 6a shows the I_write/I_c0 ratio for the worst case (i.e., the smaller ratio) between the two switching transitions: for V_DD above (below) ≈0.4 V, the FinFET-based (TFET-based) bitcell configuration provides the higher write current. Figure 6b shows that the FinFET-based memory cell is the faster solution for V_DD > 0.4 V thanks to its higher I_write/I_c0 (see Figure 6a). However, the TFET-based STT-MRAM cell can reliably work at voltages lower than 0.4 V, while achieving faster switching (at the same V_DD). Figure 6c shows the average write energy (E_write), where the TFET-based bitcell is more energy efficient over a large range of V_DD. Finally, Figure 6d shows the simulation results in the E_write-t_p plane, where the TFET-based alternative is clearly the most energy efficient at reduced V_DD and becomes the best option in applications where the time constraints can be relaxed.
To complete our analysis, we have evaluated write and read performance through extensive Monte Carlo simulations that consider device-to-device variability. For both TFET and FinFET devices, the variability of the threshold voltage is considered. To estimate the maximum deviation that can be accepted on each device, we have considered a wide range of σ_vth, from 5 to 55 mV.
Figure 7 shows the yield of the write operation (at a WER = 10^−7) for the fast switching regime (<10 ns), expressed in terms of error probability (P_error) as a function of the threshold voltage variability. The TFET-based bitcell solution shows a yield of ∼100% (i.e., 0% of errors) even for σ_vth larger than 35 mV. It is also clear that the FinFET-based bitcell is not as robust as its TFET-based counterpart, since it can only achieve a yield of 100% for σ_vth of 10 mV and below. In light of these results, and for the sake of a fair comparison, we have considered a σ_vth of 10 mV for both TFET- and FinFET-based bitcells in the following analysis. Figure 8 shows the Monte Carlo simulation results for the TFET- and FinFET-based STT-MRAM bitcells, considering the effect of device process variations, for write and read operations ensuring a WER and a read disturbance rate (RDR) of 10^−7, at V_DD = 0.4 V and σ_vth = 10 mV. For the write operation (see Figure 8a), the TFET-based bitcell improves the write switching time by about 60% with respect to the FinFET-based bitcell. As for the read operation, we considered a conventional voltage sensing scheme [40], where a fixed read current (I_read), which assures the target RDR, is forced from the bitline to the sourceline of the bitcell, and the corresponding bitcell/bitline voltage is measured. Figure 8b shows the statistical distribution of the bitcell voltage when the DMTJ is in the low and high resistance states (LRS and HRS, respectively). The voltage sensing margin (V_SM), which is defined as the difference between the bitcell voltages when the DMTJ is in LRS and HRS, is also reported. Although the mean and sigma of the bitcell voltage distributions differ between the considered bitcell configurations, the V_SM results are the same for both TFET- and FinFET-based bitcells. This is attributed to the constant I_read that feeds the bitcells.
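The observation that V_SM does not depend on the selector technology can be illustrated with a simplified model. The sketch below is an idealization and not the simulated read circuit: it assumes the bitcell voltage is the DMTJ drop plus a selector drop that is set by the forced I_read alone (and therefore identical in LRS and HRS). The read current, the resistances and the two selector drops are hypothetical placeholder values, and the function names are illustrative.

```python
import math


def bitcell_voltage(i_read, r_mtj, v_selector):
    """Voltage across the bitcell: DMTJ drop plus selector drop (volts)."""
    return i_read * r_mtj + v_selector


def sensing_margin(i_read, r_high, r_low, v_selector):
    """V_SM = V_cell(HRS) - V_cell(LRS) for a forced read current."""
    return (bitcell_voltage(i_read, r_high, v_selector)
            - bitcell_voltage(i_read, r_low, v_selector))


# Hypothetical values, chosen only for illustration.
I_READ = 1e-6                        # forced read current, amperes
R_HIGH, R_LOW = 33_000.0, 13_200.0   # DMTJ resistances, ohms

# Two different selector drops stand in for two different technologies.
vsm_a = sensing_margin(I_READ, R_HIGH, R_LOW, v_selector=0.05)
vsm_b = sensing_margin(I_READ, R_HIGH, R_LOW, v_selector=0.12)

print(f"V_SM (selector A) = {vsm_a * 1e3:.1f} mV")
print(f"V_SM (selector B) = {vsm_b * 1e3:.1f} mV")
```

In this model the selector drop shifts both bitcell voltages by the same amount, so it changes the mean of each distribution but cancels in the difference, leaving V_SM = I_read (R_H − R_L).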
Table 2 summarizes the writing and reading simulation results for TFET- and FinFET-based bitcells operating at a V_DD of 0.4 V. Note that the reported data refer to Monte Carlo simulations for two values of σ_vth, 10 mV and 35 mV. For σ_vth = 10 mV, the TFET-based alternative reduces E_write by about 50% with respect to the FinFET-based STT-MRAM bitcell, while also ensuring a faster write switching time (by about 60%) at the same bitcell area. This occurs at the cost of a worsened read operation in terms of the reading sensing margin, V_SM, with respect to STT-MRAM bitcells operating at higher V_DD. Although the considered configurations present a relatively small V_SM, this issue can be mitigated by adopting several techniques [40,41]. In Table 2, results for σ_vth = 35 mV are also reported, showing that the TFET-based bitcell has an increase of 54% and 57% in E_write and t_p,6σ, respectively, as compared to the case of σ_vth = 10 mV. FinFET-based bitcell results for σ_vth = 35 mV are not reported due to the presence of more than 20% write failures, as shown in Figure 7.

Conclusions
In this work, we explored the impact of using TFETs instead of FinFETs in DMTJ-based STT-MRAM cells. Our study was first performed under nominal conditions at different supply voltages within the subthreshold regime. Then, we extended our performance analysis through Monte Carlo simulations that take into account device-to-device process variations on both the DMTJ and the transistors. This simulation analysis demonstrated that TFET-based solutions can reliably operate at ultralow voltages (<0.5 V). Such benefits are obtained at the cost of reduced voltage sensing margins. In conclusion, the comparative study demonstrated that DMTJ STT-MRAM based on TFETs is the most promising candidate for ultralow-power/ultralow-voltage IoT applications, thanks to its potential to offer lower write energy and faster write switching, by about 50% and 60%, respectively, as compared to its FinFET-based counterparts.

Figure 1. Sketch of the device architecture for: n- and p-type (a) TFETs and (b) FinFETs, and (c) STT DMTJ with magnetization orientation at high (R_H) and low (R_L) resistance states, along with the electron flow for R_L → R_H and R_H → R_L switching.

Figure 3. (a) DC resistance-voltage characteristic, (b) pulse width (t_p) versus write current (I_write) for the DMTJ structure.
Figure 5. Ratio between write current (I_write) and I_c0 as a function of the number of fingers per device for (a) 2T1MTJ-SC and (b) 2npT1MTJ-SC configurations. The circle with dashed line in (a) and (b) marks the chosen I_write/I_c0 ratio of ≈3, which corresponds to a number of fingers of 20/20 and 17/23 for the n-type/p-type devices of the FinFET- and TFET-based bitcell configurations, respectively. (c) I_write/I_c0 as a function of the number of fingers per p-type (M_P) and n-type (M_N) TFET, while keeping M_N and M_P constant, respectively.

Figure 6. TFET- versus FinFET-based bitcell performance comparison in terms of: (a) I_write/I_c0 ratio, (b) write pulse width, t_p, at a write-error-rate of 10^−7, (c) average energy, E_write, and (d) E_write versus t_p for different supply voltages. The V_DD step is 10 mV.

Figure 7. Yield of the write operation for the fast switching regime (<10 ns), expressed in terms of error probability (P_error) as a function of the threshold voltage variability (σ_vth) of TFET and FinFET devices. Each point is the result of a Monte Carlo simulation with 1000 samples.

Table 1. Device parameters and characteristics.