The Cost of Energy-Efficiency in Digital Hardware: The Trade-off between Energy Dissipation, Energy-Delay Product and Reliability in Electronic and Magnetic Binary Switches

Featured Application: This work has applications in the benchmarking of binary switches for energy-efficient nanoelectronics.

Abstract: Binary switches, which are the primitive units of all digital computing and information processing hardware, are usually benchmarked on the basis of their ‘energy-delay product’, which is the product of the energy dissipated in completing the switching action and the time it takes to complete that action. The lower the energy-delay product, the better the switch (supposedly). This approach ignores the fact that lower energy dissipation and faster switching usually come at the cost of poorer reliability (i.e., higher switching error rate), and hence the energy-delay product alone cannot be a good metric for benchmarking switches. Here, we show the trade-off between energy dissipation, energy-delay product and error probability for both an electronic switch (a metal oxide semiconductor field effect transistor) and a magnetic switch (a magnetic tunnel junction switched with spin transfer torque). As expected, reducing energy dissipation and/or energy-delay product generally results in increased switching error probability and reduced reliability.


Introduction
The primitive element of all digital circuits (for computing, signal processing, etc.) is a "binary switch" which has two stable states encoding the binary bits 0 and 1. Computing and digital signal processing tasks are carried out by flipping such switches back and forth between the two states. As a result, for a given algorithm and a given computing architecture, the energy cost and speed of a digital computational task are determined by the energy dissipation and the switching delay of the switches. Therefore, it has become common practice to benchmark digital switches on the basis of their 'energy-delay product', which is the product of the energy dissipated during switching and the switching time [1].
However, lower energy dissipation and faster switching usually come at the cost of poorer reliability, so any saving in energy or computational time gained by employing switches with lower energy-delay product may be offset by the additional resources that would be needed for error correction. In this paper, we show the direct relation between energy dissipation and error-resilience with two examples: a field-effect transistor and a nanomagnetic switch flipped with current-induced spin-transfer torque [2].

Field-effect-transistor switch
A metal-oxide-semiconductor field-effect transistor (MOSFET) is the archetypal binary switch that encodes the bits 0 and 1 in its two conductance states: high (ON) and low (OFF). In the ON-state, charges flood into the channel, providing a conduction path between the source and the drain to turn the transistor on, while in the OFF-state these charges are driven out of the channel to disrupt the conduction path and turn the transistor off. Therefore, the two states are ultimately encoded in two different amounts of charge, $Q_1$ and $Q_2$, in the channel. The switching action changes the amount of charge from $Q_1$ to $Q_2$, or vice versa, resulting in the (time-averaged) flow of a current
$$I = \frac{\Delta Q}{\Delta t}, \qquad \Delta Q = |Q_1 - Q_2|, \qquad (1)$$
where $\Delta t$ is the amount of time it takes for the channel charge to change from $Q_1$ to $Q_2$, or vice versa. This current will cause energy dissipation of the amount
$$E_d = I^2 R \, \Delta t = \frac{(\Delta Q)^2 R}{\Delta t} = \Delta Q \, \Delta V, \qquad (2)$$
where $R$ is the resistance in the path of the current and $\Delta V = IR$. We can think of $\Delta V$ as the amount of voltage that must be imposed at the transistor's gate to change the charge in the channel by the amount $\Delta Q$. Note that the energy dissipation given in Equation (2) is not independent of the switching time, because $\Delta V$ depends on the switching time for fixed $\Delta Q$ and $R$: $\Delta V = IR = \Delta Q \, R / \Delta t$. This clearly shows that for fixed $\Delta Q$ and $R$, we will dissipate more energy if we switch faster (smaller $\Delta t$). Therefore, a more meaningful quantity to benchmark energy-efficiency is the energy-delay product, which is
$$E_d \, \Delta t = (\Delta Q)^2 R.$$
We can reduce this quantity by reducing $\Delta Q$, but that increasingly blurs the distinction between $Q_1$ and $Q_2$, thereby impairing our ability to distinguish between bits 0 and 1. If $\Delta Q$ is too small, then thermal generation and recombination can randomly change the amount of charge in the channel by an amount comparable to $\Delta Q$ and cause random switching.
Therefore, larger Q translates to stronger error-resilience and better reliability.This makes it obvious that there is a direct relation between reliability and energy-delay product; if we reduce the energy dissipation or energy-delay product by reducing Q, then we will invariably make the switch less reliable.We can make this argument a little more precise by noting that g Q C V    , where g C is the gate capacitance.The thermal voltage fluctuation at the gate terminal is given by g kT C , where kT is the thermal energy [3], and hence the thermal charge fluctuation in the channel is: This quantity must be much smaller than the Q one needs to switch the conductance state of the transistor, and hence Clearly, is a measure of the 'switching reliability'; the larger is its value, the more reliable is the switch.From Equation (2), we can now obtain that which immediately shows that we have to tolerate more energy dissipation Ed and larger energy-delay product Edt if we desire more reliability (i.e. a larger ) [4].
In some specific cases, such as a field-effect transistor, we may be able to derive a relation between the energy dissipation/energy-delay product and the error probability. Consider the conduction-band diagram in the channel of an n-channel field-effect transistor along the direction of drain current flow, as shown in Fig. 1. In the OFF-state, there is a potential barrier at the source-channel junction which prevents electrons in the source contact from entering the channel and turning the transistor ON. This barrier has to be lowered by the applied gate potential $\Delta V$ in order to allow electrons to enter the channel when the transistor has to be turned ON. Therefore, this barrier should be approximately equal to the quantity $q \Delta V$, where $q$ is the electronic charge. It is clear then that the transistor can spontaneously turn ON while in the non-conducting state (causing a switching error that results in a bit error) if electrons can enter the channel from the source by thermionic emission over the barrier. The probability of entering the channel in this fashion, which is roughly $\exp(-q \Delta V / kT)$, is then the switching error probability $p$, so that $\Delta V = (kT/q) \ln(1/p)$. From Equation (2), we then get that the energy dissipation can be written as
$$E_d = \Delta Q \, \Delta V = C_g (\Delta V)^2 = C_g \left(\frac{kT}{q}\right)^2 \left[\ln\frac{1}{p}\right]^2 \qquad (5)$$
and the energy-delay product can be written as
$$E_d \, \Delta t = (\Delta Q)^2 R = \tau \, C_g \left(\frac{kT}{q}\right)^2 \left[\ln\frac{1}{p}\right]^2, \qquad (6)$$
where $\tau = R C_g$ is the gate charging time. Equations (5) and (6) show the direct dependence of the energy dissipation and energy-delay product on the error probability $p$. These two equations clearly show that lower energy dissipation or lower energy-delay product is associated with higher switching error probability in a transistor switch.
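To illustrate how steeply the energy cost rises as the tolerated error probability falls, the sketch below evaluates the transistor relations numerically, taking the error probability as $p = \exp(-q\Delta V/kT)$, the energy dissipation as $C_g(\Delta V)^2$ and the gate charging time as $RC_g$. The gate capacitance and resistance are hypothetical values chosen only for illustration.

```python
import math

K_B = 1.380649e-23   # Boltzmann constant, J/K
Q_E = 1.602176634e-19  # electronic charge, C


def gate_swing(p_error, temperature=300.0):
    """Gate voltage swing dV = (kT/q) * ln(1/p), in volts."""
    return (K_B * temperature / Q_E) * math.log(1.0 / p_error)


def energy_dissipation(p_error, c_gate, temperature=300.0):
    """E_d = C_g * dV^2, in joules."""
    return c_gate * gate_swing(p_error, temperature) ** 2


def energy_delay_product(p_error, c_gate, resistance, temperature=300.0):
    """E_d * dt with dt = R * C_g (gate charging time), in joule-seconds."""
    tau = resistance * c_gate
    return tau * energy_dissipation(p_error, c_gate, temperature)


C_GATE = 1e-16      # hypothetical gate capacitance (0.1 fF)
RESISTANCE = 1e4    # hypothetical resistance in the current path (10 kOhm)

for p in (1e-3, 1e-6, 1e-9, 1e-12):
    e_d = energy_dissipation(p, C_GATE)
    edp = energy_delay_product(p, C_GATE, RESISTANCE)
    print(f"p = {p:.0e}: dV = {gate_swing(p) * 1e3:6.1f} mV, "
          f"E_d = {e_d:.2e} J, E_d * dt = {edp:.2e} J s")
```

Because the dependence on $p$ is logarithmic, each factor of 1000 improvement in error probability adds a fixed increment to the gate swing, while the energy grows as the square of that swing.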

Nanomagnetic switches
Next, we consider a magnetic switch. Unlike the electronic switch, this case is not amenable to any analytical treatment and hence we will resort to simulations. A bistable nanomagnetic switch can be fashioned out of a ferromagnetic elliptical disk where, because of the elliptical shape, the magnetization can point only along the major axis, either to the left or to the right, as shown in Fig. 2(a). This type of nanomagnet is said to possess in-plane magnetic anisotropy (IPA). In thinner nanomagnets, the surface anisotropy may be dominant and the magnetization can point perpendicular to the surface, either up or down. This type of nanomagnet is said to possess perpendicular magnetic anisotropy (PMA) [Fig. 2(b)]. Either type makes a binary switch if we encode the bit information in the magnetization orientation, which can point in just two directions. In this paper, we will consider only the IPA nanomagnetic switch, although the results will apply equally to PMA nanomagnets.
The IPA nanomagnet can be vertically integrated as the "soft" layer into a three-layer stack consisting of a "hard" ferromagnetic layer and an insulating (non-magnetic) spacer, to form a magnetic tunnel junction (MTJ), as shown in Fig. 2(c). The hard layer is permanently magnetized in one of its two stable directions. When the soft layer's magnetization is parallel to that of the hard layer, the MTJ resistance (measured between the two ferromagnetic layers) is low, while if the two magnetizations are antiparallel, the resistance is high. Thus, the MTJ acts as a binary switch, much like the transistor, whose two resistance states, high and low, encode the binary bits 0 and 1. The difference between the transistor and the MTJ is that the former is volatile (since charges leak out when the device is powered off), while the MTJ is non-volatile since the bit information is encoded in magnetization (spins) and not charge.
In order to make the magnetizations of the hard and soft layers mutually parallel (ON state), we can employ spin-transfer torque [2]. We apply a voltage across the MTJ with the negative polarity of the battery connected to the hard layer. This will inject spin-polarized electrons from the hard layer into the soft layer whose spins are mostly aligned along the magnetization orientation of the hard layer. These injected electrons will transfer their spin angular momenta to the resident electrons in the soft layer, whose spins will then gradually turn in the direction of the injected spins, and that will magnetize the soft layer in a direction parallel to the magnetization of the hard layer. This is how the MTJ is turned "on". In order to turn it "off", we reverse the polarity of the battery. That will inject electrons from the soft layer into the hard layer, but because of spin-dependent tunneling through the spacer, those electrons whose spin polarizations are parallel to the magnetization of the hard layer will be preferentially injected. As these spins exit the soft layer, their population is quickly depleted, leaving the opposite spins as the majority in the soft layer. That aligns the magnetization of the soft layer antiparallel to that of the hard layer. When that happens, the MTJ turns off.
For any given magnitude of injected current (with a given degree of spin polarization), we can calculate the switching error probability (at room temperature) associated with spin-transfer-torque switching of an MTJ by carrying out a Landau-Lifshitz-Gilbert-Langevin simulation, also known as a stochastic Landau-Lifshitz-Gilbert (s-LLG) simulation. To do this, we solve the s-LLG equation, Equation (7), for the magnetization of the soft layer. The last term on the right-hand side of Equation (7) is the field-like spin-transfer torque exerted by the injected current $I_s$, and the second-to-last term is the Slonczewski torque exerted by the same current. The coefficients $a$ and $b$ appearing in these terms depend on the device configuration, and we take their values from [5]. In Equation (7), $\mathbf{m}(t) = m_x(t)\hat{x} + m_y(t)\hat{y} + m_z(t)\hat{z}$ is the time-varying magnetization vector in the soft layer normalized to unity, where $m_x(t)$, $m_y(t)$ and $m_z(t)$ are its time-varying components along the x-, y- and z-axes, respectively (see Fig. 2 for the Cartesian axes), $\mathbf{H}_{\mathrm{demag}}$ is the demagnetizing field in the soft layer due to its elliptical shape and $\mathbf{H}_{\mathrm{thermal}}$ is the random magnetic field due to thermal noise [6]. The parameters entering these fields are: $\mu_0$, the magnetic permeability of free space; $M_s$, the saturation magnetization of the cobalt soft layer; $kT$, the thermal energy; $\Omega$, the volume of the soft layer, which is given by $\Omega = (\pi/4) a_1 a_2 a_3$ ($a_1$ = major axis, $a_2$ = minor axis and $a_3$ = thickness); $\Delta t$, the time step used in the simulation; and Gaussian random variables with zero mean and unit standard deviation [6]. The demagnetization factors along the three axes, which sum to 1, are calculated from the dimensions of the elliptical soft layer following the prescription of ref. [7].
The nanomagnet soft layer is assumed to be made of cobalt with saturation magnetization $M_s = 8 \times 10^5$ A/m and Gilbert damping constant $\alpha = 0.01$. Its major axis is 800 nm, its minor axis is 700 nm and its thickness is 2.2 nm. The injected current is assumed to have a fixed degree of spin polarization, which determines the spin current $I_s$ for a given charge current. Using a standard vector identity, we can recast Equation (7) in an explicit vector form [8], which can in turn be written as three coupled scalar equations in the three Cartesian components of the magnetization vector [8].
In our s-LLG simulations, we consider six different switching currents of 0.5, 1.0, 5.0, 10.0, 15.0 and 20.0 mA, corresponding to current densities of $1.14 \times 10^{9}$ A/m$^2$, $2.27 \times 10^{9}$ A/m$^2$, $1.14 \times 10^{10}$ A/m$^2$, $2.27 \times 10^{10}$ A/m$^2$, $3.41 \times 10^{10}$ A/m$^2$ and $4.55 \times 10^{10}$ A/m$^2$, respectively. We generate 1,000 switching trajectories for each current by solving Equation (9). We start with the magnetization almost along the $-y$ direction ($|m_y(0)| = 0.99$, with components of magnitude 0.1 along the x- and z-axes) and run each trajectory for 20 ns with a time step of 0.1 ps. After 20 ns, each trajectory ends with a value of $m_y$ either close to +1 (switching success) or -1 (switching failure). The error probability is the fraction of trajectories that result in failure. In Fig. 3, we plot the error probability (at room temperature) as a function of the current injected. Keeping in mind that the bulk of the energy dissipated is proportional to the square of the current, we see that the error probability decreases monotonically with increasing current or increasing energy dissipation. This shows that energy efficiency can only be purchased at the cost of reliability. In this respect, the magnetic switch shows the same trait as the electronic switch. In both cases, we have to expend more energy during switching if we wish to increase switching reliability.
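The current densities quoted for the six switching currents follow from the elliptical cross-section of the soft layer. The short sketch below reproduces them, assuming that the currents are in mA (consistent with the 10 mA quoted for Fig. 4) and that the current flows uniformly through the full ellipse of area $(\pi/4) \times$ major axis $\times$ minor axis. It also lists the dissipation relative to the smallest current, using the stated proportionality to the square of the current.

```python
import math


def ellipse_area(major_axis, minor_axis):
    """Area of an ellipse given its full major and minor axes, in m^2."""
    return math.pi / 4.0 * major_axis * minor_axis


def current_density(current, major_axis, minor_axis):
    """Current density in A/m^2, assuming uniform flow through the ellipse."""
    return current / ellipse_area(major_axis, minor_axis)


MAJOR_AXIS = 800e-9  # m
MINOR_AXIS = 700e-9  # m
CURRENTS_MA = [0.5, 1.0, 5.0, 10.0, 15.0, 20.0]  # assumed to be in mA

for i_ma in CURRENTS_MA:
    j = current_density(i_ma * 1e-3, MAJOR_AXIS, MINOR_AXIS)
    # Dissipation scales as I^2; normalize to the smallest current.
    relative_dissipation = (i_ma / CURRENTS_MA[0]) ** 2
    print(f"I = {i_ma:4.1f} mA: J = {j:.2e} A/m^2, "
          f"relative dissipation = {relative_dissipation:.0f}x")
```

Going from the smallest to the largest current in this set therefore raises the dissipation by a factor of 1600, which is the energy price paid for the lower error probability.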
In Fig. 4, we show the error probability as a function of switching time (pulse width of the injected current) for a fixed magnitude of the current. The current strength chosen for this plot was 10 mA. In this simulation, we turned off the current after different intervals of time and continued the simulation for 20 ns to see whether the value of $m_y$ ends up close to +1 (success) or -1 (failure). Again, the simulation duration of 20 ns was sufficient to ensure that for each simulated trajectory, $m_y$ ends up close to either +1 (success) or -1 (failure) at the end of the simulation. One thousand switching trajectories were generated for each pulse width and the error probability is the fraction of trajectories that result in failure. We observe that the error probability decreases with increasing current pulse width (longer passage of current, or slower switching), as expected.
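Since the error probability is estimated as a fraction of a finite number of trajectories, it carries a statistical uncertainty. The following sketch shows how such an estimate and its binomial standard error could be computed from the final values of the y-component of the magnetization. The endpoint values used here are hypothetical and are not simulation results from this work; classifying a trajectory as a failure when its final value is negative is likewise an assumption made for illustration.

```python
import math


def error_probability(final_my):
    """Fraction of trajectories whose final m_y is negative (failure)."""
    failures = sum(1 for m_y in final_my if m_y < 0.0)
    return failures / len(final_my)


def standard_error(p_hat, n_trajectories):
    """Binomial standard error of an estimated probability."""
    return math.sqrt(p_hat * (1.0 - p_hat) / n_trajectories)


# Hypothetical endpoints: 970 successes and 30 failures out of 1,000.
final_my = [0.99] * 970 + [-0.99] * 30

p_hat = error_probability(final_my)
se = standard_error(p_hat, len(final_my))
print(f"estimated error probability = {p_hat:.3f} +/- {se:.3f}")
```

With 1,000 trajectories, the smallest non-zero error probability that can be resolved is 1/1000, so error probabilities well below that level would require correspondingly more trajectories.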

Conclusions
In this article, we have shown the relationship between energy dissipation, switching delay and reliability for binary switches used in nanoelectronics. Typically, energy efficiency and faster switching come at the cost of reduced reliability. Consequently, it is not appropriate to benchmark switching devices only in terms of their energy-delay product, since a lower energy-delay product can always be purchased at the cost of error-resilience. This raises the question of whether there are computing and information processing paradigms that can tolerate high error probabilities and can therefore afford to be more energy-efficient. Boolean logic, which is at the heart of most arithmetic logic units in modern-day computers, demands a high degree of reliability [9] and therefore is not likely to be frugal in its use of energy. On the other hand, there are computing paradigms (e.g., neuromorphic, probabilistic, Bayesian) where the computational activity is often elicited from the collective activity of many devices (switches) working in unison. In those cases, a single device (or a few devices) being erratic does not impair overall circuit functionality [10]. Consequently, they can tolerate much higher error probabilities. Hardware platforms for these non-Boolean computing paradigms are therefore likely to be more energy-efficient than Boolean logic, and that has already motivated a great deal of interest in them [9].

Figure 1. Conduction band profile along the channel of a field effect transistor in the OFF-state (solid line) and ON-state (broken line).

Figure 2. A nanomagnet shaped like an elliptical disk has two stable magnetization orientations which can encode the binary bits 0 and 1. (a) In-plane magnetic anisotropy, and (b) perpendicular magnetic anisotropy. (c) A magnetic tunnel junction (MTJ) showing the high-resistance (OFF) and low-resistance (ON) states.

Figure 3. Switching error probability as a function of injected current magnitude. The energy dissipated is proportional to the square of the current. The current was kept on for the entire duration of the simulation, which is 20 ns.

Figure 4. Switching error probability as a function of current pulse width (i.e. the duration of spin transfer torque).The current strength was kept fixed at 10 mA.