Neuromorphic Computing Using Emerging Synaptic Devices: A Retrospective Summary and an Outlook

Abstract: In this paper, emerging memory devices are investigated as promising synaptic devices for neuromorphic computing. Because neuromorphic computing hardware requires high memory density, fast speed, and low power, as well as the unique ability to simulate the function of learning by imitating the processes of the human brain, memristor devices are considered a promising candidate because of their desirable characteristics. Among them, Phase-change RAM (PRAM), Resistive RAM (ReRAM), Magnetic RAM (MRAM), and the Atomic Switch Network (ASN) are selected for review. Although memristor devices show such characteristics, the inherent errors caused by their physical properties need to be resolved. This paper suggests adopting an approximate computing approach to deal with these errors without degrading the advantages of emerging memory devices.


Introduction
Artificial Intelligence (AI), also called machine intelligence, has been researched for decades, and AI has now become an essential part of industry [1][2][3]. Artificial intelligence refers to machines that imitate human cognitive functions such as learning and problem solving [1,3]. Similarly, neuromorphic computing, which imitates the way the human brain computes and stores data, has been researched [4][5][6]. Studies have been conducted to simulate human learning by mimicking the process through which the human brain learns, using the computational structure of a computer in the form of an artificial neural network. Neuromorphic computing now draws great attention from industry because it efficiently executes artificial intelligence algorithms by imitating the neural structure of the human brain.
Conventional von Neumann computing, which uses separate processors and memory systems, is not efficient for machine learning due to the processor-memory bottleneck [7][8][9][10]. Because machine learning has a unique workload that iterates simple computations over large amounts of data, it generates heavy data traffic between processors and memory subsystems. In contrast, a neuromorphic computing system consists of multiple neurons and synapses that compute and store data, and a neural network that connects them. Therefore, this computing system can execute simple iterations efficiently for machine learning training [7,8]. For such a computing system, a new type of hardware is being researched and developed using next-generation emerging memory devices that imitate neurons and synapses. In this paper, the neuromorphic computing system and promising emerging memory devices are reviewed, and the current issues and an outlook are discussed.

History of Neuromorphic Computing
In the late 1980s, Professor Carver Mead proposed the concept of neuromorphic computing [11,12]. In neuromorphic computing, the computation process of computer hardware built from transistors mimics the brain's computation process. Over the decades, the concept has been studied by researchers in various forms aiming to build machines that think and learn like humans. Studies have been conducted to simulate human learning using computer hardware in the form of an artificial neural network by mimicking the way the human brain learns and computes. The human brain has a very complex structure containing billions of neurons and trillions of synapses. A neuron consists of a cell body, an axon that transmits neural impulses, and dendrites that receive signals from other neurons, as illustrated in Figure 1. A synapse is a structure that permits a neuron to deliver an electrical signal to another neuron.

Figure 1. Labels: presynaptic neuron, postsynaptic neuron, synapse, neural impulse.

Neuromorphic hardware typically consists of neurons and synapses that imitate the nervous system of the human brain, as shown in Figure 2b. In neuromorphic hardware, each neuron is a core that processes data, and neurons are connected in parallel through synapses to transmit information [13][14][15]. There is no von Neumann bottleneck caused by a single signal bus in neuromorphic hardware. To implement this in a practical design, it is necessary to develop artificial synaptic devices that reflect the characteristics of biological synapses rather than conventional CMOS devices. Figure 2 shows block diagrams of the conventional von Neumann architecture and the emerging neuromorphic architecture.

An Outlook of Neuromorphic Computing
Neuromorphic computing has been researched for decades, and recently there have been significant advancements. The recent progress can be categorized into three major steps, as shown in Figure 3 [16][17][18]. The first step is a GPU-centric system that uses a graphics processing unit (GPU), optimized for parallel operation and mainly utilized in learning, to support artificial intelligence. The next step is an ASIC-centric system, which is now widely researched. This trend is expected to produce efficient, low-power application-specific integrated circuits (ASICs) for machine learning, and many semiconductor companies are accordingly developing ASIC chips [19][20][21][22]. However, it is predicted that neuromorphic computing will eventually evolve into neuromorphic hardware that enables ultra-low-power and ultra-high-performance computing to support general-purpose artificial intelligence.
Neuromorphic-centric hardware needs to achieve simultaneous parallel processing of large volumes of data with ultra-low power. Furthermore, a neuromorphic semiconductor chip requires faster computation speed than existing hardware built from conventional CMOS devices. This implies that developing an emerging synaptic device is key to the successful development of neuromorphic-centric hardware. The following sections describe promising emerging memory devices for neuromorphic-centric hardware that imitate neurons and synapses and can perform storage and computation simultaneously.

Synaptic Emerging Memory Devices
Novel memory device technologies hold significant promise in providing new capabilities and opportunities for synaptic devices in neuromorphic hardware. Synaptic memory devices are required to have high integration density, fast read speed, and low power. More importantly, a device imitating the characteristics of synapses needs to provide non-volatile storage, be capable of expressing multiple levels of synaptic strength, and make synaptic learning easy to implement.
Memristor devices have been widely researched because they have such desirable characteristics [16,[23][24][25]. This paper investigates memristor devices as promising synaptic devices. The term memristor combines memory and resistor: the device's resistance is changed by voltage pulses applied across its terminals, and it serves as a memory by retaining that resistance for a certain period. In 1971, Professor Leon Chua predicted a fourth fundamental circuit element, beyond the resistor, inductor, and capacitor, characterized by a nonlinear relationship between charge and magnetic flux [26]. Over the decades, various materials have demonstrated the properties of the memristor, such as Phase-change RAM (PRAM), Resistive RAM (ReRAM), and Magnetic RAM (MRAM) [27][28][29]. These devices implement synaptic learning using a resistance that varies according to voltage pulses. PRAM is a non-volatile memory device that utilizes the change in resistance that accompanies a change in the crystallinity of the material. ReRAM refers to a memory that utilizes analog resistance change, and MRAM exploits the resistance change arising from spin direction. These memory devices have been touted as promising candidates for a universal memory technology that may provide integration density close to DRAM, the non-volatility of Flash memory, read speed close to SRAM, and practically zero standby power.
Moreover, the memory devices have a great potential to be a promising synaptic device that can process vast amounts of data efficiently with an ultra-low power, which is essential for the development of artificial intelligence technologies. The following sub-sections describe the detailed analysis of these memory devices including the current issues and the outlook.

PRAM: Phase-Change Synaptic Devices
Phase-change memory (PRAM), also called PCM, is a type of non-volatile random-access memory. In 1969, Charles Sie published a dissertation that demonstrated the feasibility of a phase-change-memory device combining a chalcogenide film with a diode [30]. A following study in 1970 established that the phase-change-memory mechanism in chalcogenide glass involves electric-field-induced crystalline filament growth [31,32].
PRAM utilizes the difference in resistivity between the amorphous phase (high resistivity) and the crystalline phase (low resistivity) of phase-change materials. PRAM is typically structured with a phase-change material, called a T-cell, between two electrodes, as shown in Figure 4 [33][34][35]. Once a high voltage/current is applied to the electrodes, the phase of the material changes. To Set the device into the crystalline phase, a current pulse is applied to anneal the phase-change material so that it crystallizes. The Set operation of PRAM devices may be made progressive by applying multiple Set pulses that incrementally crystallize the high-resistance amorphous region. To Reset the device into the amorphous phase, the programming region is first melted and then quenched rapidly by applying a large current pulse for a relatively short time. Because of desirable characteristics such as high speed, multi-level capability, and low energy consumption, the device has been researched as a good candidate for an artificial synapse in implementing machine learning algorithms [36][37][38]. However, material quality and power consumption problems have prevented the wide adoption of PRAM technology. Specifically, resistance drift is a significant challenge, meaning that the resistance changes over time [39][40][41]. Resistance drift exists commonly in phase-change materials, and it destroys stability and greatly limits the development of PRAM. The drift phenomenon in amorphous chalcogenide materials has been explained in terms of structural relaxation, a thermally activated local rearrangement within the amorphous region occurring over short intervals [40,[42][43][44]. This means that the resistance can change over time, implying a fading memory somewhat like human memory.
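Resistance drift in amorphous chalcogenides is commonly described by an empirical power law, R(t) = R0 · (t/t0)^ν, where ν is the drift exponent. The short sketch below is our own illustration, not taken from the cited works; the values of R0 and ν are hypothetical:

```python
def drifted_resistance(r0, t, t0=1.0, nu=0.1):
    """Resistance after time t under power-law drift: R(t) = r0 * (t/t0)**nu.

    r0 is the resistance read at reference time t0; nu is the drift
    exponent (illustrative value; amorphous states drift far more than
    crystalline ones).
    """
    return r0 * (t / t0) ** nu

# An amorphous (Reset) cell read at 1 Mohm one second after programming:
r0 = 1e6
for t in (1.0, 10.0, 100.0, 1000.0):
    print(f"t = {t:7.1f} s -> R = {drifted_resistance(r0, t):.3e} ohm")
```

Because the drifted resistance keeps rising, the windows separating multi-level states blur over time, which is precisely why drift limits PRAM-based synaptic weight storage.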
Apart from the major reliability issue in amorphous chalcogenide materials, the high power consumption required to melt the phase-change material is also critical for a low-power neuromorphic device. To be a good synaptic device, PRAM technology needs to resolve these reliability and power issues.

ReRAM: Filament-Type Synaptic Devices
ReRAM (resistive random-access memory) is one of the most representative next-generation non-volatile memories; its resistance changes according to the applied voltage. ReRAM uses an insulator in which a voltage can create a filament or conductive path (Set operation), as shown in Figure 5 [45][46][47]. The conductive path is formed by two mechanisms: vacancy migration or metal defect migration. The filament can be removed by another voltage (Reset operation). Each of these two states exhibits a distinct resistance corresponding to storing a binary '0' or '1'. ReRAM has many advantages, such as good compatibility with conventional CMOS processes, which reduces development cost [46]. However, there are critical issues that need to be resolved for a memristor element. The filament formation in the ReRAM write process itself introduces variation and reliability problems. Specifically, the position and length of the formed filament are not controllable, making it difficult to adjust the resistance [47][48][49]. Therefore, the cell-to-cell variation in resistance is huge, resulting in poor resistance control. Furthermore, the resistance value changes over time, which means that resistance values cannot be maintained for long. While this behavior may be exploited to implement the short-term memory effects of synapses, it makes long-term memory difficult to implement. However, this technical barrier may be a universal issue in synaptic devices, as observed in PRAM, and the problem needs to be solved at the circuit and/or microarchitecture level.
The significant advantage of ReRAM for synaptic devices is that it can reduce the physical space of the memory used to express multiple resistances. Multi-level cell (MLC) ReRAM, which stores multiple bits in a single ReRAM cell, can further improve density [50][51][52][53]. Because the resistance is determined by conductive filaments involving metal ions and oxygen vacancies, a ReRAM device can express multiple resistances depending on the voltage polarity and magnitude used to create oxygen vacancies and/or metal ions. Figure 6 shows an MLC resistance distribution for a 2-bit example in which both a high-resistance state (HRS) and a low-resistance state (LRS) exist [53][54][55]. Because multiple levels of bit information can be represented as analog values in a single device, MLC ReRAM can significantly increase integration density compared to DRAMs, which use a capacitor to represent a single bit, or SRAMs, which use multiple transistors.
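To make the 2-bit MLC idea concrete, the sketch below decodes a read resistance into one of four levels. The thresholds are purely hypothetical; a real design would place them between the measured HRS/LRS distributions of Figure 6:

```python
# Hypothetical resistance windows for a 2-bit MLC ReRAM cell (ohm).
# The lowest resistance (strongest filament) encodes '11'; HRS encodes '00'.
LEVELS = [
    (5e3, "11"),            # LRS
    (20e3, "10"),
    (80e3, "01"),
    (float("inf"), "00"),   # HRS
]

def decode_cell(resistance_ohm):
    """Map a read resistance to its 2-bit symbol."""
    for upper_bound, symbol in LEVELS:
        if resistance_ohm <= upper_bound:
            return symbol

print(decode_cell(3e3), decode_cell(50e3), decode_cell(1e6))
```

In practice the resistance variation and drift discussed above shrink the margins between adjacent windows, which bounds how many levels per cell are reliable.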

MRAM: Spintronic Synaptic Devices
Magnetoresistive random-access memory (MRAM) is a type of non-volatile memory that stores data in magnetic material. MRAM uses a magnetic tunnel junction (MTJ), which consists of two ferromagnetic layers, a free layer (FL) and a pinned layer (PL), to store data, as shown in Figure 7. The spin of the FL can be switched from one orientation to its opposite by applying a current pulse through the MTJ, while the spin of the PL is fixed in one orientation. Each of these spin states exhibits a distinct resistance corresponding to storing a binary '0' or '1' [56][57][58][59].
Spin-Transfer Torque RAM (STT-RAM), also called STT-MRAM, is an advanced type of MRAM device. Because STT-RAM is scalable, it enables higher densities, lower power consumption, and reduced cost compared to regular MRAM devices; that is why STT-RAM currently constitutes the majority of MRAM devices. STT-RAM holds significant promise in providing new capabilities and opportunities for low-power systems. It has been touted as a candidate for a universal memory technology that may provide integration density close to DRAM, the non-volatility of flash memory, read speed close to SRAM, and practically zero standby power [57,60,61]. As shown in Figure 8, STT-RAM has higher endurance than PRAM and ReRAM because the spintronic device does not rely on melting or filament formation, which impact material stability. Moreover, the MRAM device can compute data as well as store it. This is called spin logic, or spin-based logic-in-memory. Spin logic has attracted significant interest because of its potential to enable new neuromorphic computing applications: the spintronic device can store and compute data like a synapse and a neuron. If one device can mimic both elements of the human brain, that would be efficient. This paper introduces two mainstream approaches to spin logic. The first approach is to use magnetic devices and additional circuits to perform a logic operation. In [62], a memory array and sense amplifiers with a variable reference are used for logic operations such as AND, OR, and XOR, as shown in Figure 9. This design re-utilizes existing memory peripheral circuits to perform logic, arithmetic, and complex vector operations. Depending on the logic function, such as XOR or OR, the sum of the bitcell currents is compared with a variable reference using a sense amplifier. The current of each bitcell represents the stored value of each bit, and the summed currents are examined against the reference.
Since the proposed spin logic re-utilizes the existing memory array, it can perform both storing and computing at the same time.
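The behavior of such a current-summing sense scheme can be sketched functionally. In this toy model, our own abstraction of the idea in [62] with made-up current values, a stored '1' draws a higher bitcell current than a '0', and the sense amplifier compares the summed column current against a reference selected per logic function:

```python
I_LOW, I_HIGH = 1.0, 3.0  # hypothetical bitcell currents (uA) for stored 0 / 1

def column_current(a, b):
    """Summed current of two bitcells storing bits a and b on one bitline."""
    return (I_HIGH if a else I_LOW) + (I_HIGH if b else I_LOW)

# References placed between the possible summed-current levels:
REF_OR = (2 * I_LOW + (I_LOW + I_HIGH)) / 2    # between zero and one stored 1
REF_AND = ((I_LOW + I_HIGH) + 2 * I_HIGH) / 2  # between one and two stored 1s

def sense_or(a, b):
    return int(column_current(a, b) > REF_OR)

def sense_and(a, b):
    return int(column_current(a, b) > REF_AND)

def sense_xor(a, b):
    # XOR = OR and not AND: two sense operations on the same column.
    return sense_or(a, b) & (1 - sense_and(a, b))
```

Only the reference changes between functions, which is why the same array and sense amplifiers can serve both storage and logic.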

Figure 9. Block diagram of a memory array and sense amplifiers with a variable reference to perform a logic function.
The other approach is to use a new spintronic device that performs a NAND operation. A NAND gate is generally classed as a universal gate because it can produce any other type of logic gate function; by connecting NAND gates in various combinations, any logic computation can be performed. In [63], a new spin device is proposed that uses an intrinsic property of spin-orbit heterostructures to form a logic gate in which the same magnetic contacts that hold the logic inputs simultaneously perform the logic operation and retain the result. This is in contrast to the structures proposed in [62], which require extra circuits. As shown in Figure 10, two MTJs are placed in series, and one channel current can control both MTJs. Depending on the input voltage pulses on the P and Q nodes, the MTJs' spin directions change at the same time, resulting in a two-input NAND operation. STT-RAM holds significant promise in providing new capabilities and opportunities for low-power neuromorphic systems, as mentioned above. Yet serious challenges remain for a synaptic device. One of the dominant challenges is the fundamental stochasticity of spin switching operations. The STT write process is inherently stochastic, and the actual time to complete a write varies dramatically, with the distribution having a very long tail. This stochasticity of switching time is temporal, leading to variation in transition time even for a single cell. This means that guaranteeing a reliable write is not feasible, resulting in high error rates. The stochastic switching issue stems from a physical property, so-called thermally activated switching. This stochastic behavior is even more critical than in PRAM and ReRAM because the temperature is typically not controllable.
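NAND universality is easy to verify functionally. The sketch below builds NOT, AND, OR, and XOR purely from a two-input NAND, which is what would let arrays of such spin devices implement arbitrary logic:

```python
def nand(a, b):
    """Two-input NAND on bits 0/1 -- the only primitive used below."""
    return 1 - (a & b)

def not_(a):
    return nand(a, a)

def and_(a, b):
    return not_(nand(a, b))

def or_(a, b):
    return nand(not_(a), not_(b))

def xor_(a, b):
    # Classic 4-NAND construction of XOR.
    n = nand(a, b)
    return nand(nand(a, n), nand(b, n))
```

Each derived gate costs only a handful of NAND devices, so a fabric of the spin-orbit gates of [63] could in principle realize any combinational function.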

To evaluate this stochastic behavior, a detailed micromagnetic numerical simulator, the Object Oriented MicroMagnetic Framework (OOMMF) (Version 1.2, National Institute of Standards and Technology, Gaithersburg, MD, USA), is used [64]. OOMMF is a solver of the Landau-Lifshitz-Gilbert equation [65], which describes magnetization dynamics in a solid. OOMMF is used to simulate the normalized magnetization along the Z-direction (Mz), which represents the macro-spin angle of the free layer, over time. Without thermal effects, Mz stays constant, whereas Mz fluctuates significantly over time when temperature is included, as shown in Figure 11. When all spins in the free layer point up, Mz is 1; when all spins point down, Mz is −1. This means the spin direction stays fixed without temperature but varies over time because of temperature. This fluctuation induces switching stochasticity, which is not controllable. This implies that eliminating the intrinsic error is not possible, even though various circuit and architecture techniques have been proposed to compensate for this stochasticity [56,66].
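A full micromagnetic simulation is beyond this review, but the long-tailed switching time can be illustrated with a far simpler thermal-activation (Néel-Arrhenius) sketch: switching is modeled as a Poisson event with mean time τ = τ0 · exp(Δ), where Δ is the thermal stability factor. All parameter values here are illustrative, not taken from [64]:

```python
import math
import random

def sample_switch_time(delta=20.0, tau0=1e-9, rng=random):
    """Draw one thermally activated switching time (seconds).

    delta: thermal stability factor E_b / (k_B * T) (illustrative value);
    tau0: attempt time, typically on the order of 1 ns.
    """
    tau = tau0 * math.exp(delta)  # mean time to a thermal switching event
    return rng.expovariate(1.0 / tau)

rng = random.Random(0)  # fixed seed so the sketch is repeatable
times = sorted(sample_switch_time(rng=rng) for _ in range(10_000))
median, p999 = times[5_000], times[9_990]
print(f"median = {median:.2e} s, 99.9th percentile = {p999:.2e} s")
# The heavy tail (p99.9 >> median) is why no fixed write pulse can
# guarantee completion for every cell on every write.
```

Even this crude model reproduces the qualitative problem: the slowest writes take many times the typical write, so pulse width alone cannot bound the error rate to zero.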

ASN (Atomic Switch Network): Network-Based Synaptic Device
ASN (atomic switch network), sometimes called a nanowire network, is a volatile memristor device that exhibits both short-term and long-term memory behavior [67][68][69][70]. The device can store information as short-term plasticity (STP) with a decay time, or as long-term potentiation (LTP) if a frequent stimulus exists. The device is typically composed of a network of interfacial atomic switches, which are self-assembled and randomly connected. An Ag2S metal-insulator-metal (MIM) interface is widely used to form the switches [71,72]. In recent studies, silver nanowires or nanoparticles have alternatively been used to construct random self-assembled networks.
This switch network exhibits a distinct conductance corresponding to stored data in response to a stimulus, which is a voltage in most cases. Once an input voltage is applied to the device, the conductance changes and is sustained for a short time after the voltage is removed. This is called STP, and the memory duration varies at random. Under certain conditions, most likely breakdown, the conductance changes abruptly and becomes stable. Repeated input pulses typically produce a continuous and stable conductance pathway, resulting in the final state, called LTP. Current research on this device focuses on imitating the complex network topology of neurons rather than on stability and controllability. However, this device has great potential to become the next-generation brain-like device because its conductance changes based on its operational history and it shows two distinct states, volatile short-term and non-volatile long-term memory behavior, like the human brain.
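The STP-to-LTP transition can be caricatured with a leaky-conductance model: each pulse increments a conductance that decays between pulses, and once the conductance crosses a threshold the switch locks into a stable (LTP) state. This is our own toy abstraction, not a fitted ASN model; all constants are arbitrary:

```python
import math

class AtomicSwitch:
    """Toy short-term/long-term plasticity model of one atomic switch."""

    def __init__(self, decay_tau=1.0, step=0.3, ltp_threshold=1.0):
        self.g = 0.0        # normalized conductance
        self.ltp = False    # True once a stable pathway has formed
        self.tau = decay_tau
        self.step = step
        self.threshold = ltp_threshold

    def pulse(self, dt_since_last):
        """Apply one voltage pulse dt_since_last seconds after the previous."""
        if self.ltp:
            return  # long-term state is stable; no further decay
        self.g *= math.exp(-dt_since_last / self.tau)  # STP decay
        self.g += self.step
        if self.g >= self.threshold:
            self.ltp = True

# Frequent stimulation drives the switch into LTP; sparse pulses decay away.
fast, slow = AtomicSwitch(), AtomicSwitch()
for _ in range(5):
    fast.pulse(0.1)
    slow.pulse(2.0)
print(fast.ltp, slow.ltp)
```

The model captures the qualitative dependence on stimulus frequency: closely spaced pulses accumulate conductance before it leaks away, while widely spaced pulses never reach the stable state.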

Approximate Computing Using Emerging Synaptic Devices
New memory devices exhibit entirely new physics-level issues that make older design principles obsolete. This sub-section focuses on the inherent error and on a design approach based on probabilistic principles to overcome the problem. As mentioned above, the new memory devices induce errors due to their physical properties. Specifically, the time needed for MRAM to switch is stochastic and the distribution is quite wide, resulting in high write error rates. Furthermore, the resistance variation of PRAM and ReRAM introduces errors over time. Various circuit- and microarchitecture-level design techniques have been developed, but the additional circuitry introduces non-negligible area and energy overheads.
Instead of reducing the error using such high-cost techniques, this paper suggests adopting approximate computing. Approximate computing is a computing paradigm that allows inaccurate results in exchange for improving other properties such as energy and computing time [73][74][75][76]. Because it is almost impossible to remove the inherent error of emerging memory devices entirely, it is better to sacrifice some accuracy to maintain low power and small area. Moreover, machine learning is not expected to reach 100% accuracy; it aims for fast computation with moderate accuracy. Therefore, approximate computing is suitable for both machine learning and synaptic devices. The error problem needs to be solved and implemented, but at the same time the solution must not degrade the advantages of the devices. Therefore, low-cost approximate computing is a promising solution to the errors of emerging devices, which are significant across all of neuromorphic engineering.
This review introduces an implementation of approximate computing using a synaptic device. Specifically, an approximate flip-flop using MTJ devices is reviewed [77]. The flip-flop is a hybrid of the conventional D flip-flop (D-FF) and non-volatile storage using MTJs for fine-grained power gating. The MTJs are used for temporary storage during power gating: data in the conventional D flip-flop is stored into the MTJs before power-off and restored after power is turned on again. However, as mentioned in the previous sub-section, the MTJ switching process is inherently stochastic, so a 100% successful write is not guaranteed and the backed-up data may contain errors. To guarantee a successful write, a longer write pulse is typically required, as shown in Figure 12; the write error probability decreases with pulse duration. However, this approach introduces a significant energy penalty.
Using approximate computing, the tradeoff between error rate and energy consumption can be resolved effectively. A key insight is that a high error rate may be acceptable in some modules of modern hardware systems. In other words, each module has a different importance based on its functionality. For example, a flip-flop in a controller is accessed on every computation and needs to be correct at all times, whereas a flip-flop in a datapath is only accessed when its data needs to be computed. In this case, data corruption in the datapath does not necessarily introduce errors at the outputs. Erroneous data is naturally removed if the data is not accessed for specific computations at various system layers, ranging from circuit- to architectural-level states; this is called the masking effect [78,79]. If the error probability of each flop can be controlled individually, energy can be reduced effectively. In [77], the error rate of the flip-flop is easily controlled by changing the pulse duration on a control node. Figure 13 shows a schematic view of the flip-flop and the inverse relation between error rate and energy consumption. Data in the D flip-flop is stored into the MTJs before power-off (storing) and is restored after power is turned on again (restoring). Consider the Q = 1 case. Once the SEN signal is asserted, a write current is driven through MTJ A, M1, and MTJ B. MTJ A is written to the AP state, and MTJ B becomes the P state because the current direction through it is reversed. Figure 13b shows the error rate and the expected energy of a flop at various pulse durations, which are controlled by the SEN signal. At 6.6 ns, the write error probability is 1.5 × 10^−13, and the energy for a storing and restoring operation is 0.2 pJ. For shorter pulses such as 3.3 ns, the computed error probability increases to 3.9 × 10^−7 while the energy consumption is reduced by half. Therefore, the error rate of the flip-flop is easily controlled by the pulse duration of a control signal.
For a high-priority module, a longer pulse duration is needed, whereas the pulse duration can be reduced to save energy for a low-priority module. This result indicates that approximate computing can maximize the advantage of synaptic devices by saving unnecessary energy while maintaining the required data quality.
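The quoted figures are consistent with a simple exponential write-error model, P_err(t) = exp(−t/τ). The sketch below back-fits τ from the 3.3 ns point and assumes (our assumption, not stated in [77]) that energy scales linearly with pulse width; halving the pulse then roughly squares the error probability:

```python
import math

# Back-fit the time constant from the quoted 3.3 ns point (P_err ~ 3.9e-7).
TAU_NS = 3.3 / -math.log(3.9e-7)  # ~0.22 ns
ENERGY_PER_NS_PJ = 0.2 / 6.6      # assumes energy is linear in pulse width

def write_error(pulse_ns):
    """Exponential model of MTJ write-error probability vs. pulse width."""
    return math.exp(-pulse_ns / TAU_NS)

def write_energy_pj(pulse_ns):
    return ENERGY_PER_NS_PJ * pulse_ns

for t in (3.3, 6.6):
    print(f"{t} ns pulse: P_err = {write_error(t):.1e}, "
          f"E = {write_energy_pj(t):.2f} pJ")
```

Under this model the 6.6 ns pulse yields an error probability near the quoted 1.5 × 10^−13 at twice the energy, so a priority-aware controller could pick a per-flop pulse width along this curve.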

Conclusions
Based on Gartner's projection, the neuromorphic computing market will expand from $1.2 billion in 2017 to $15.8 billion in 2022, a 55% compound annual growth rate (CAGR) [18]. Therefore, researchers need to prepare for the new market trend, because neuromorphic computing design is not a plug-and-play method that can be used as soon as a synaptic device is developed. The new computing system requires a design ecosystem around synaptic elements in order to achieve successful joint innovation: the development of technologies beyond current semiconductor devices, the establishment of specialized manufacturing processes, and the development of related equipment. Neuromorphic engineering is an area where learning from various fields must be studied together. It is also necessary to understand the learning, memory, and cognitive functions of the brain as studied in biology, and to incorporate that understanding into computational science.
More importantly, the development of a synaptic device is an essential part. Emerging memory devices for neuromorphic computing are reviewed in this paper. Specifically, PRAM, ReRAM, MRAM, and ASN are discussed as promising candidates for emerging synaptic devices that enable low-power, highly integrated neuromorphic systems providing cognitive functions that closely mimic the operating principles of the human brain, even though various issues remain. This paper also suggests implementing an approximate computing concept for the new neuromorphic hardware.

Conflicts of Interest:
The authors declare no conflict of interest.

Abbreviations
The following abbreviations are used in this manuscript: