Advances in Emerging Memory Technologies: From Data Storage to Artificial Intelligence

Abstract: This paper presents an overview of emerging memory technologies. It begins with the evolution of stand-alone and embedded memory technologies since the appearance of Flash memory in the 1980s. Then, the progress of emerging memory technologies (based on filamentary, phase change, magnetic, and ferroelectric mechanisms) is presented, with a review of the major demonstrations in the literature. The potential of these technologies for storage applications addressing various markets and products is discussed. Finally, we discuss how the rise of artificial intelligence and bio-inspired circuits offers an opportunity for emerging memory technologies, shifts the application from pure data storage to combined storage and computing tasks, and enlarges the range of required specifications at the device level due to the rapidly growing number of new systems and architectures.


Introduction
In the digital era, we are living through societal changes driven by major technological revolutions. The amount of generated data is growing exponentially; in 2025, about 175 zettabytes (175 billion terabytes) will be generated, 10 times the volume produced in 2015 [1]. This trend comes with a large increase in connected objects and smart components, and with the development of the data centers required to cover the needs of social networks, streaming, and video on demand. Indeed, it is forecast that in 2025 data centers will consume about 5% of the energy produced worldwide [2], which will cause further issues. Moreover, we are living through a paradigm shift in which the energy consumption of computing systems is limited more by data transfer than by computation itself, leading to the rise of memory-centric and in-memory computing systems. In this context, pressure is exerted on the memory component to fulfill the new specifications of these emerging computing system architectures. Figure 1 presents a brief summary of the main milestones of nonvolatile memory history. The history of Flash memory began in 1984, when Dr. Masuoka invented the NAND flash memory [3], leading to commercial success in 1989, when Toshiba's first NAND flash reached the market [4]. NAND technology was scaled for decades, retaining the same concept, stack, and architecture, with the memory density increasing exponentially over time [5]. In 2001, various Flash players announced and launched MLC (multilevel cell) NAND, enabling a capacity increase [6].
In 2007, Toshiba presented the first NAND integrated into a 3D architecture [7], while Samsung announced the first generation of 3D NAND in 2012.
After 2010, embedded memories began to reach the 28-nm node [8], where traditional memory cell concepts became limited in terms of area shrinkage capability and increasing complexity. Pressure was then exerted on emerging memory concepts to pursue scaling to more aggressive technology nodes, opening the path to a new class of embedded technologies.
In 2015, Intel and Micron announced the 3D XPoint technology, based on phase change memory (PCM). This new class of memory, often referred to as storage class memory, fills the latency gap between DRAM and Flash. It made possible new system memory architectures with improved performance, bringing innovation to the nonvolatile memory (NVM) arena. The technology has been available on the open market under the brand name Optane (Intel) since 2017 [9].
We are now in the "More than Moore" era, and we work toward new systems (including in-memory computing and non-von Neumann architectures) that emulate the human brain to achieve high energy efficiency, parallelism, and ability in cognitive tasks, such as object recognition, association, adaptation, and learning [10]. This offers a strong opportunity for new memory technologies.
This article presents the evolution of memory technologies since the invention of Flash memory, and describes how the exponential increase in new systems (essentially coming from in-memory computing architectures, artificial intelligence, and neuromorphic circuits) offers opportunities to back-end emerging memory technologies. Section 2 presents the memory technologies' market and trends for both stand-alone and embedded sectors. Section 3 presents an overview of emerging memories with key demonstrators presented in the literature. Finally, Section 4 analyzes how emerging memories can be used in new computing systems, for both von Neumann and non-von Neumann architectures.
The impressive growth of the NVM market was made possible by the advent of flash memories, NOR first and NAND later, and has been fueled by the development of battery-powered wearable electronics [11]. Each successive memory revolution was thus supported by a novel application that allowed the market to grow exponentially. Mobile phones, PDAs, MP3 players, and digital cameras were the drivers in the 1990s [11]. With the emergence of smartphones, tablets, USB drives, and SSDs (solid state drives), a new driving force led to a NAND market increase in the digital era of the 2000s [12]. More recently, artificial intelligence and other new applications offer a strong opportunity for emerging memories using new switching mechanisms. The next 10 years are expected to provide many growth opportunities for the semiconductor industry, with a continuous increase in technical and business challenges. In 2026, the NAND Flash market is predicted to be comparable to the current GDP of medium-sized countries. Competitive pressures within many segments of the semiconductor industry will increase significantly in the future, but semiconductor companies that have innovative businesses as well as product strategies are expected to achieve financial success [13].
Appl. Sci. 2021, 11, 11254

Market and Trends of Memory
The memory market can be divided into two categories: high-capacity standalone memories, and embedded memories, where the memory device is integrated into a core CMOS process flow. These two categories are described in the following sections.

Evolution of Standalone Nonvolatile Memory Technologies
Today, the Flash memory market is driven by high-capacity standalone memories for mass storage applications. High memory capacity is required by applications such as digital still and video cameras, MP3 players, solid state drives, mobile phones, and all manner of flash card products. For standalone applications, where density is doubling every 12 months [14], cell area is the key factor. Indeed, standalone memories need aggressive cell dimensions to achieve the high memory densities and low cost per bit required by the applications mentioned above.
For over 40 years, the evolution of nonvolatile memories was mostly based on the floating gate MOS transistor [1]. Scaling of the NAND technology was pursued below the 20-nm node [15]. Then, critical limits led to the appearance of 3D memory [16], which entered the market around 2015 and changed the conservative memory world. In 2020, vertical NAND stacked up to 96 layers, used QLC (quad-level cells, four bits per cell), and reached a density of about 10 Gb/mm^2, with typical memory capacities of one terabyte [17]. In 21 years, multilevel cell flash memory density has increased 10,000 times.
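The 10,000-fold density increase over 21 years quoted above can be converted into an average yearly growth factor and a doubling time. The following is a minimal sketch, assuming constant compound growth over the whole period (an assumption made here for illustration; the function names are hypothetical).

```python
import math

def annual_growth_factor(total_factor: float, years: float) -> float:
    """Constant yearly multiplier that compounds to total_factor over years."""
    return total_factor ** (1.0 / years)

def doubling_time_years(annual_factor: float) -> float:
    """Years needed to double at a constant yearly multiplier."""
    return math.log(2.0) / math.log(annual_factor)

# Figures quoted in the text: 10,000x density increase over 21 years.
factor = annual_growth_factor(10_000, 21)
doubling = doubling_time_years(factor)

print(f"average growth factor per year: {factor:.2f}")
print(f"implied doubling time: {doubling * 12:.0f} months")
```

Under this assumption, the long-run average is roughly 1.55 times per year, that is, a doubling time of about 19 months.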
NAND Flash has been adopted in solid state drives (SSD) over the past 10 years and now shares the data storage market with hard disk drives (HDD) thanks to its constantly decreasing bit cost. Three-dimensional NAND continues to improve and is expected to remain a leading technology for data storage [1]. In the case of 3D NAND, scaling is mainly achieved by stacking, making it possible to retain the same cell dimensions over successive nodes and thus maintain cell characteristics. Despite many challenges, Flash continues to benefit from constant technology, performance, architecture, and design improvements. In particular, scaling is pursued by stacking two arrays (to relax the aspect ratio), integrating the CMOS under the memory array (to increase density), and adopting more aggressive multilevel storage (with optimized programming algorithms) [18], with 96 stacked levels. A state-of-the-art multistacked memory hole process was presented by Toshiba in 2018 [19].

Trends in Embedded Nonvolatile Memory Technologies
Although the majority of the market is represented by high-density standalone memories, embedded nonvolatile memory is finding increasing use in a wide array of integrated circuits, with applications ranging from a few bits (analog trimming) to megabytes for data/code storage. In systems with embedded nonvolatile memories, the memory array occupies only a fraction of the total die area due to the relatively small number of embedded bit cells. For this reason, cell size shrink is not essential, and larger cell sizes can be adopted than in standalone memories [20]. However, high programming voltages can require high-voltage transistors in the periphery, which can impact the circuit size.
Three sectors can be distinguished for embedded memories, which are key enablers of today's wide variety of microcontroller products [21]:
- Consumer and industrial: these also represent a wide range of applications, and include any type of machine or equipment that can use memory.
Although smaller than the standalone memory market, the embedded market still represents tens of billions of dollars every year, with huge volume growth. Embedded specifications depend on the market and applications [22]. The most stringent market is automotive, with aggressive specifications, especially for grade 0 (−40 °C to +150 °C ambient operating temperature range).
In terms of technology, each competitor develops and uses its own memory cell concept. They all rely on charge trapping mechanisms (in a poly-Si floating gate or in a nitride charge trapping layer) and differ in cell architecture, from 1T (one transistor) and 1.5T (self-aligned control gate, select gate, and split gate, which is discussed further on) to 2T options (separated select gate and memory gate). Production is currently at the 40 nm node, while 28 nm technologies are increasing in volume production. The current main embedded NVM charge storage-based cell concepts in production are described in [20].

The Era of Big Data
We are living through the big data revolution. Big data mainly affects large companies, in particular the GAFAMI (Google, Apple, Facebook, Amazon, Microsoft, and IBM), but also changes our everyday life. Big data relies on the three Vs: volume, variety, and velocity:
- First, there is a tremendous volume of generated data. More than two trillion gigabytes (10^9 bytes) are created each day. Moreover, data generation follows an exponential growth, and extrapolations forecast 175 ZB of data in 2025 [1], four times more than what we use today. People are forecast to carry more than four mobile devices, and 75% of the worldwide population will be connected to the network [23].
This data deluge is accompanied by a constant improvement in memory technologies, with bit density and memory capacity increasing by a factor of 1.4 every year [19]. In 2020, a 1-mm^2 die can integrate 10 Gb of memory capacity [19].
Another important point to mention is that this data increase will be supported by all memory types: even if SSD has the highest increase, HDD and even the tape market will also grow [24]. The IDC (International Data Corporation, global provider of intelligence, advisory services, and events for the information technology, telecommunications, and consumer technology markets) forecasts that over 22 ZB of storage capacity must ship across all media types from 2018 to 2025 to keep up with storage demands.
- Apart from the high volume of data, there is also a large variety of data in various forms. It differs from application to application. Data come from photos, videos, audio recordings, email messages, documents, books, presentations, tweets, etc., and are generally unstructured. On a large scale and considering all the contributions, data traffic was expected to grow by a factor of 1455 from 2018 to 2020, following an exponential evolution, as reported by IBS [26]. In particular, video is expected to maintain a growth rate of 70-80% every year for the next decade [26].
According to Cisco, data traffic is forecast to increase 2.5 times over the period 2016 to 2021 [27]. The majority of this increase is due to video content delivery, projected to account for 82% of traffic by 2021 [27]. The devices that will be the major contributors are smartphones and TVs, with respective data traffic shares of 33% and 30% [28].
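To give a sense of scale for the volume figures quoted above, the sketch below converts the 175 ZB forecast into terabytes and estimates the memory array area it would occupy at the 2020 density of 10 Gb/mm^2. This is a thought experiment only: it assumes decimal (SI) prefixes, a single copy of the data, and no overhead for peripheral circuits, redundancy, or packaging.

```python
ZB = 10**21          # bytes in one zettabyte (decimal SI prefix)
TB = 10**12          # bytes in one terabyte
GBIT = 10**9         # bits in one gigabit

forecast_bytes = 175 * ZB            # 2025 forecast quoted in the text
density_bits_per_mm2 = 10 * GBIT     # 2020 density quoted in the text

terabytes = forecast_bytes // TB
area_mm2 = forecast_bytes * 8 / density_bits_per_mm2
area_km2 = area_mm2 / 1e12           # 1 km^2 = 10^12 mm^2

print(f"{terabytes:.3e} TB")
print(f"{area_km2:.0f} km^2 of memory array")
```

With these assumptions, the forecast corresponds to 175 billion terabytes, or about 140 km^2 of memory array at 2020 density.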
We are also in the era of IoT, with the exponential growth of connected objects in our everyday life. We have smartphones, smart watches, smart clothes, etc. The number of

Limitations of Computing Systems
The memory hierarchy of computing systems is organized in tiers, with the CPU at the top for computing, followed by SRAM, DRAM, storage memory, and HDD/tape at the bottom. Each memory tier has greater capacity than the preceding one but higher latency. Thus, there is a cost vs. performance tradeoff in the memory subsystem, with performance increasing closer to the processor and capacity increasing by roughly one order of magnitude at each subsequent level [31]. It is increasingly accepted by the community that the standard memory architecture (organized in a cache/memory/storage hierarchy) is no longer able to address the bottlenecks in current computing systems [32,33].
First, there is a latency gap between DRAM and NAND [32], leading to high energy consumption and long latency to move data within the system [34]. DRAM latency will be sustained and throughput will improve. On the other hand, NAND will maintain or improve performance with cost leadership. Thus, there is space for a new memory to enter the hierarchy, fill this gap between DRAM and NAND, and improve computing system performance [33]. This new class of memory is known as storage class memory (SCM). As it may be too difficult for one memory to fill the gap, two types of storage class memories can be distinguished: memory mapped and storage mapped [11]. Memory mapped SCM should combine read/write and endurance performance close to that of DRAM with improved retention time (without being a true nonvolatile memory) and lower cost. Storage mapped SCM should be nonvolatile, cost much less than DRAM, and offer improved performance (speed and endurance) with respect to Flash memories. In particular, 3D XPoint, which uses PCRAM, fits well between NAND and DRAM in the computing memory hierarchy [31].
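The latency gap described above can be illustrated with a small sketch that finds the largest latency ratio between adjacent tiers, before and after inserting a storage class memory. The latency values are order-of-magnitude assumptions chosen for illustration; they are not taken from this article, and the helper name `largest_gap` is hypothetical.

```python
# Order-of-magnitude access latencies in nanoseconds.
# ASSUMPTION: illustrative values only, not taken from the article.
tiers = [
    ("SRAM", 1.0),
    ("DRAM", 100.0),
    ("NAND", 100_000.0),
    ("HDD", 10_000_000.0),
]

def largest_gap(levels):
    """Return (upper, lower, ratio) for the adjacent pair with the largest latency ratio."""
    pairs = zip(levels, levels[1:])
    return max(((a[0], b[0], b[1] / a[1]) for a, b in pairs), key=lambda t: t[2])

upper, lower, ratio = largest_gap(tiers)
print(f"largest gap: {upper} -> {lower}, {ratio:.0f}x")

# Inserting a hypothetical storage class memory at ~1 us splits that gap.
with_scm = tiers[:2] + [("SCM", 1_000.0)] + tiers[2:]
print(largest_gap(with_scm))
```

With these assumed values, the widest step in the hierarchy is the one between DRAM and NAND, and adding the intermediate tier reduces the worst-case step between neighbors by one order of magnitude.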
Moreover, memory performance and speed have not kept pace with those of processors (memory latency remains almost constant), leading to a performance gap known as the "memory wall" [35,36]. This performance gap between processor and memory is widening with technology scaling [37]. In von Neumann systems, the separation of computing and memory becomes critical for performance and efficiency when data movement becomes dominant, as is the case in current systems. A memory access costs 1-3 orders of magnitude more energy than a computation [38], approximately 1000 times the energy of a complex addition [36]. Currently, data movement between the main memory and conventional computation units is a major contributor to the total system energy consumption in consumer devices. For instance, it was reported that data movement accounts for 62.7% of the total energy consumed by Google consumer workloads [39]. This is why new non-von Neumann architectures were proposed and developed to exploit locality and near-memory computing in order to reduce the cost and energy of data movement.
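The energy figures above can be combined in a simple two-term model, in which total energy is the sum of computation energy and data movement energy. The sketch below is a minimal illustration under that assumption; the workload mix (600 operations per memory access) is hypothetical and chosen only to land near the reported share.

```python
def movement_share(accesses: float, ops: float, energy_ratio: float) -> float:
    """Fraction of total energy spent on memory accesses.

    energy_ratio is E_access / E_op; energies are in units of one operation.
    """
    moving = accesses * energy_ratio
    return moving / (moving + ops)

def max_saving_factor(share: float) -> float:
    """Upper bound on energy reduction if data movement were removed entirely."""
    return 1.0 / (1.0 - share)

# Figure quoted in the text: 62.7% of energy goes to data movement.
print(f"best-case saving: {max_saving_factor(0.627):.2f}x")

# ASSUMPTION: hypothetical workload with 600 operations per memory access,
# each access costing 1000x one operation (the ratio quoted in the text).
print(f"movement share: {movement_share(1, 600, 1000):.1%}")
```

In this model, removing all data movement from a workload where it accounts for 62.7% of the energy would reduce total energy by a factor of about 2.7 at most, which bounds the gain that near-memory or in-memory computing alone could bring for such a workload.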

Development of New Computing Systems
The previous paragraphs highlighted the (1) current data exponential increase, (2) energy increase, and (3) performance limitations of current computing systems. For these reasons, new computing systems are currently being developed and used. In particular, near memory or in-memory computing are proposed to solve the memory wall and the high energy and time cost of data transfer in data centers. However, there is a tradeoff among the existing systems between flexibility and efficiency [1], and choosing the appropriate configuration for each purpose is key.
Computing systems can be classified as follows, depending on the application and in order of increasing energy efficiency [38]:
- CPU: they are fully programmable and versatile (can execute any function). There is a clear separation between logic and memory.
- GPU: there is a shared memory architecture with thousands of cores; they operate with highly parallel workloads.
- Near-memory processing: this system aims to supply high-performance and high-density memories as close as possible to the processing units. They are dedicated to data-intensive computing (AI, graph processing, and optimization processing). They combine local and shared memory. They are used for cloud and edge devices.
- In-memory processing: in this system, computing and memory are co-located, which eliminates the von Neumann bottleneck. Major changes in both memory and computing units must be achieved in order to couple data processing and storage. In-memory processing is used for vector processing (AI with limited model size) and is mostly used for edge devices.
- Neuromorphic dynamical systems: they are brain inspired and merge logic and memory. They are used for AI or autonomous systems.
Among all these systems, there are many artificial neural networks, and especially deep learning networks, that have matched or surpassed human-level capabilities. These "neuromorphic circuits" are a subject of strong interest for the scientific community.
Deep neural networks (DNNs), or large-scale layered networks of artificial neurons, have profoundly transformed the field of machine learning and represent the state of the art in a variety of video, image, audio, and text processing tasks. They are already deployed in many real-world applications such as Google's image and voice search, Apple's Siri, Facebook's DeepText and DeepFace, Microsoft's Skype Translator, and many others [40]. They should also enter a wider range of applications in the future, including autonomous vehicles, education, and healthcare. The rise of neuromorphic circuits offers unique opportunities for emerging memories. Among them, two-terminal back-end memories integrated in crossbar arrays could significantly improve the efficiency of deep neural networks, as will be discussed in the next section.
In conclusion, these new systems, offering significant gains in computing energy efficiency, are also very demanding in terms of memory reliability and require new memory technologies with excellent capacity, bandwidth, and performance. Research perspectives on emerging memories for new non-von Neumann systems are discussed in the next sections.
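As an illustration of why crossbar arrays of two-terminal memories are attractive for neural networks, the sketch below models the commonly described analog matrix-vector multiplication: each device contributes a current V × G (Ohm's law), and the currents sum along each column (Kirchhoff's current law). This scheme is not detailed in the present section, so the mapping is an assumption; the differential (G+, G-) weight encoding, the array values, and the function names are hypothetical.

```python
def crossbar_mvm(conductances, voltages):
    """Column currents of a crossbar: I_j = sum_i V_i * G[i][j].

    conductances: rows x cols matrix in siemens (one device per crosspoint).
    voltages: one read voltage per row, in volts.
    """
    cols = len(conductances[0])
    return [
        sum(v * row[j] for v, row in zip(voltages, conductances))
        for j in range(cols)
    ]

def signed_outputs(currents):
    """Combine adjacent columns (G+, G-) into one signed output each."""
    return [currents[k] - currents[k + 1] for k in range(0, len(currents), 2)]

# ASSUMPTION: hypothetical 2x4 array; each weight uses a (G+, G-) column pair.
G = [
    [50e-6, 10e-6, 20e-6, 40e-6],
    [30e-6, 30e-6, 60e-6, 10e-6],
]
V = [0.2, 0.1]  # volts applied to the two rows

column_currents = crossbar_mvm(G, V)
outputs = signed_outputs(column_currents)
print(outputs)
```

All multiply-accumulate operations of one layer are obtained in a single read step, without moving the weights out of the memory array, which is the source of the expected efficiency gain.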

Filamentary Memory
Filamentary memories (Figure 2), also known as resistive random access memories (RRAM), are based on the reversible formation and disruption of a conductive filament in an insulator sandwiched between two metal electrodes when an electric field is applied. This electrical behavior was first reported in the 1960s [41,42] and was widely studied until the early 1980s for memory device applications. In the 2000s, new classes of RRAM appeared and interest in this technology returned [43].
RRAM technology has many advantages. It is a low-cost two-terminal device, and it requires fewer integration steps than standard Flash. It also operates at low voltage: typical RRAM operating voltages are 1-3 V, much lower than the ~20 V of NAND Flash memories. In addition, RRAM is a fast memory, with a typical programming time of 100 ns. Programming times below 10 ns were also reported in the literature [46].
Resistive RAM has undergone strong improvement in performance and maturity. Demonstrations of macros are now reported down to the 22 nm node. A bitcell area of less than 0.05 µm^2 was achieved by Intel [47], and 20 nm cell size functionality was demonstrated by Panasonic with high performance [48]. The endurance of macros of more than one megabyte now reaches 10-100 kcycles, while the intrinsic endurance was demonstrated to be more than 10^10 cycles for single cells [49], showing that the technology's reliability can be further improved. In particular, Panasonic demonstrated a limited resistance distribution shift up to 100 kcycles with 10-ppm bitcell resolution [50], and Leti showed no failure after 10^5 cycles on 16 kb arrays on a 28 nm node [44].
High-temperature operation is affected by the temperature-assisted diffusion of oxygen vacancies (or metal), which limits RRAM retention. In particular, device stability becomes critical when low programming currents are used [51]. Renesas demonstrated that it was possible to reach a −6σ margin (with a one-bit ECC) after 40 min at 200 °C (equivalent to 10 years at 85 °C), which corresponds to a 0.1% chip failure rate for two-megabyte arrays [52].
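The equivalence between a short high-temperature bake and a long retention time at lower temperature is usually expressed through an acceleration factor. The sketch below assumes a single Arrhenius law (an assumption made here; the model used in [52] is not stated in this text) and computes the activation energy for which 40 min at 200 °C would be equivalent to 10 years at 85 °C. The function name is hypothetical.

```python
import math

K_B_EV = 8.617333e-5  # Boltzmann constant in eV/K

def implied_activation_energy(t_use_s, temp_use_c, t_stress_s, temp_stress_c):
    """Activation energy (eV) for which the two bake conditions are equivalent.

    ASSUMPTION: retention loss follows a single Arrhenius law, so that
    t_use / t_stress = exp(Ea/k * (1/T_use - 1/T_stress)).
    """
    t_use_k = temp_use_c + 273.15
    t_stress_k = temp_stress_c + 273.15
    acceleration = t_use_s / t_stress_s
    return K_B_EV * math.log(acceleration) / (1.0 / t_use_k - 1.0 / t_stress_k)

ten_years_s = 10 * 365.25 * 24 * 3600
forty_min_s = 40 * 60

ea = implied_activation_energy(ten_years_s, 85, forty_min_s, 200)
print(f"implied activation energy: {ea:.2f} eV")
```

Under this assumption, the quoted equivalence corresponds to an acceleration factor of about 1.3 × 10^5 and an activation energy of roughly 1.5 eV.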
As filamentary switching can be observed in most integrated layers, many elements were investigated until some consensus finally appeared on transition metal oxides (Ta2O5, HfO2, Al2O3, etc.). Depending on the integrated materials, various device performances are possible, enabling targeting of various applications.
Today the main application of RRAM is in embedded products. For classical embedded technologies, chip cost reduction will be very difficult beyond 2× nm nodes, due to their limited scalability and increasing integration complexity in sub-28 nm CMOS nodes [20]. For that reason, RRAM is a strong candidate for future nodes thanks to its simple bitcell structure and low process complexity. RRAM is widely studied for IoT because of its good density and low power consumption. Panasonic proposed the first commercially available implementation of RRAM, offering a microcontroller for portable healthcare, security equipment, or sensor processing applications. Renesas also investigated RRAM for low-power microcontroller units (MCUs) for IoT applications [52]. Intel [47,55] and TSMC [56] both demonstrated RRAM-based macros at 22 nm in a 1T1R configuration for embedded applications, with comparable features (10-year retention at 85 °C, 10 kcycles endurance). Intel studies mobile and RF applications, and TSMC targets eFlash, IoT, and smartcards. Adesto offers RRAM-based EEPROM as a compatible serial memory for discrete and embedded memory applications, targeting IoT and other energy-conscious applications. The startup Crossbar is also very active and provides RRAM for IoT system-on-chips as well as persistent memory solutions.
In the case of embedded applications, the memory is integrated in the BEOL above the logic. In particular, the memory cell is integrated above the select transistor in the 1T1R configuration. The bitcell area is limited by the transistor more than by the memory itself [57]. Thus, it is important to have low operating voltages. Ideally, the RRAM would use logic transistors, whose operating voltage is ~1 V. Due to the short RRAM programming time (~100 ns), the transistor could operate in overdrive mode [57]. This leads to a targeted RRAM programming voltage of ~1.5 V, compatible with logic CMOS, to reach the best bitcell density. For more advanced nodes (1× nm nodes), the memory could be integrated in a 1S1R configuration with a backend selector [58,59] to reduce the bitcell size and target higher capacities than the 1T1R architecture allows [60].
The main issue for RRAM is variability and the related spread of the resistance distributions of the high and low resistive states. Indeed, the conductive filament is composed of a finite number of atoms, and the memory operation is governed by stochastic phenomena [61,62]. Thus, from cycle to cycle, the conductive filament can take various shapes and comprise an uncontrolled number of atoms, leading to resistance variability. This resistance dispersion reduces the read window margin and limits the maximum memory capacity that can be achieved. In order to improve RRAM variability, various solutions have been investigated. The first is to improve the memory stack [63,64]. In particular, "subquantum" CBRAM, where filaments comprise a semiconductor or semimetal instead of a metal, was proposed [65] in order to achieve thicker filaments that are less affected by single events. The second approach is to adjust programming schemes and algorithms in order to reach sharper resistance distributions [66,67].
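The link between resistance dispersion and read window margin can be made concrete with a small sketch. It assumes that log10 of the resistance of each state is normally distributed (a common simplification, not stated in the text) and measures the gap between the tails of the two distributions at a given number of standard deviations. All numerical values and the function name are hypothetical.

```python
def read_window_decades(mu_lrs, sigma_lrs, mu_hrs, sigma_hrs, n_sigma):
    """Gap, in decades of resistance, between the LRS and HRS tails at n_sigma.

    ASSUMPTION: log10(R) of each state is normally distributed, with mean mu
    and standard deviation sigma expressed in decades.
    """
    lrs_upper_tail = mu_lrs + n_sigma * sigma_lrs
    hrs_lower_tail = mu_hrs - n_sigma * sigma_hrs
    return hrs_lower_tail - lrs_upper_tail

# Hypothetical distributions: LRS centred on 10 kOhm, HRS on 1 MOhm.
tight = read_window_decades(4.0, 0.05, 6.0, 0.20, n_sigma=3)
loose = read_window_decades(4.0, 0.10, 6.0, 0.50, n_sigma=3)

print(f"tight distributions: {tight:.2f} decades")
print(f"loose distributions: {loose:.2f} decades")
```

With the same median resistances, the wider distributions leave only a fraction of a decade between the states at three standard deviations, and the window shrinks further as the array size (and therefore the number of standard deviations to cover) increases.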
Once RRAM variability is improved, larger capacities can be envisaged and new fields can be targeted. In particular, storage class memories (SCM) can be envisaged, where the RRAM would be placed between the DRAM and storage memory in the hierarchy thanks to its high speed and good endurance. A few years ago, Micron presented a 16 Gb RRAM in a 27 nm node targeting SCM applications with excellent reliability, achieving 10^5 cycles with a bit error rate of <7 × 10^-5 thanks to optimized programming schemes [68]. No further work has been reported by Micron, but Sony also provides cross-point RRAM for storage class memory applications with an excellent window margin of two decades at 3σ [69]. Western Digital introduced RRAM in its roadmap [70] and announced that "RRAM SCM will close the gap in terms of per-GB cost with BiCS NAND and will thus widen the gap with DRAM, which will make it more economically feasible". RRAM also has the potential to enable analog neuromorphic computing features. This could allow low-power neuromorphic IP in embedded nonvolatile memory system-on-chips without additional process complexity [20]. More details are provided in the next section.
In summary, RRAM is a proven technology with very low cost and ease of integration in the backend, suitable for embedded (smart card and IoT) and neuromorphic applications. Solving the variability issue would open the application field to other domains such as storage class memories.

Phase Change Memory
Phase change memory (Figure 3) is based on the reversible transition between amorphous (insulating) and crystalline (conductive) states of a chalcogenide alloy. The phase transition is obtained by current-induced Joule heating. The most typical materials are Ge2Sb2Te5 (GST) or GeTe. Chalcogenides have long been studied and used as recording materials for optical disks, and chalcogenide memory is on the market, although its density is low. Phase change memories offer low voltage operation (<3 V), fast behavior (~100 ns switching), and nonvolatility. Very high endurance (10^12 cycles at the single cell level) has been demonstrated. It is a two-terminal device and can be integrated into the BEOL. Phase change memories do not require any initialization step (unlike RRAM, for example) and work in unipolar mode. Moreover, phase change memories can have an analog behavior, which is suitable for some neuromorphic applications. The device characteristics can be tuned through material engineering (doping, etc.). Thus, depending on the selected stack, they can address the high temperature retention required in embedded applications or the high speed required, for example, in storage class memory applications [72,73].
In terms of limitations, phase change memory technology suffers from resistance drift, which can affect its high temperature retention [74,75]. Optimized programming schemes (better than adjusting the detection threshold over time) were proposed to improve immunity to drift for multilevel operations [76]. Moreover, phase change memories generally need elevated currents (hundreds of µA) to operate, linked to the melting temperature of the material and to the current density needed to reach that temperature, which can be a limitation for power consumption and crossbar integration. This is balanced, nevertheless, by the fact that the programming current decreases as the cell size is scaled: a 20 nm confined cell can be reset at <100 µA [77,78]. However, at high density, thermal disturbance among neighboring cells can become critical and may require additional layers. In terms of speed, the quenching time can limit the programming speed. In terms of process and integration, PCRAM may also require complex alloys (ternary or quaternary materials) with accurate control of the layer composition. It can also need specific device structures, such as the wall architecture, to improve programming efficiency, with more technological steps and lithography levels than other emerging memories.
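To illustrate why drift is a concern for multilevel operation, the sketch below uses the empirical power law R(t) = R0 (t/t0)^ν that is commonly used to describe drift of the amorphous phase. The four programmed resistances, the fixed read thresholds, and the drift exponents are hypothetical values chosen for illustration only; they are not taken from [74-76].

```python
def drifted_resistance(r0_ohm, t_s, nu, t0_s=1.0):
    """Empirical power-law drift: R(t) = R0 * (t / t0) ** nu."""
    return r0_ohm * (t_s / t0_s) ** nu


def read_level(r_ohm, thresholds_ohm):
    """Level index = number of fixed thresholds at or below the cell resistance."""
    return sum(1 for th in thresholds_ohm if r_ohm >= th)


# Hypothetical 4-level cell (assumed values, for illustration only).
PROGRAMMED_OHM = [1e4, 1e5, 1e6, 1e7]
THRESHOLDS_OHM = [3e4, 3e5, 3e6]
# Assumed exponents: larger for the more amorphous (higher resistance) states.
DRIFT_EXPONENT = [0.0, 0.03, 0.07, 0.10]


def levels_after(t_s):
    """Levels read back with fixed thresholds after t_s seconds of drift."""
    return [
        read_level(drifted_resistance(r0, t_s, nu), THRESHOLDS_OHM)
        for r0, nu in zip(PROGRAMMED_OHM, DRIFT_EXPONENT)
    ]


print(levels_after(1.0))   # [0, 1, 2, 3]
print(levels_after(1e7))   # [0, 1, 3, 3]
```

With these assumed numbers, the third level crosses its upper threshold after about 10^7 s and is misread, which is the kind of error that threshold tracking or drift-tolerant programming schemes aim to avoid.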
PCRAM has a high maturity and products already use this technology. The best known is the 3D Crosspoint (3D XPoint) technology developed by Intel and Micron, which integrates PCRAM with a backend selector in crosspoint arrays. The technology is used as a storage class memory to fill the latency gap between DRAM and NAND in the memory hierarchy. Two types were initially envisaged, with two different locations in the system [1]. "Storage mapped" is part of the memory hierarchy with typical 128 Gb memory capacities; the memory is faster than NAND, has higher endurance, and is 10X more dense than conventional memories. "Memory mapped" is a shadow of DRAM; data in DRAM are copied to the 3D XPoint in order to expand the size of the main memory. Intel has proposed Optane: Intel Optane DC Persistent and SSD/Caches memories can achieve 100s of ns and 1s of µs, respectively, allowing significant improvement of computer architecture [79]. Micron also offered the X100 NVMe™ SSD cache memory based on 3D XPoint™ Technology. STMicroelectronics also provides PCRAM technology for automotive embedded applications for the 28 nm node and beyond [80,81]. The memory uses a chalcogenide ternary material and is integrated in the BEOL of 28 nm FDSOI technology, with a cell size of 0.036 µm². A bit error rate of <10^-8 was achieved after multiple bakes at 150 °C, and 10 k cycling of code storage memory was shown.
So far, phase change memory technology has followed the Gartner Hype Cycle, with R&D in the 1960s, followed by the first product generation (Samsung for mobile phones), negative press in the 2000s (reporting drift issue during retention), and a second product generation (3D-Xpoint) now in the market.
In summary, PCRAM is a mature technology showing high reliability, fast speed, and high endurance, which makes it a strong candidate for both automotive grade embedded applications and storage class memories.

Magnetic Memory
MRAM (magnetic RAM) is a memory that uses the magnetism of electron spin to provide non-volatility (Figure 4). MRAM stores information in magnetic material integrated with silicon circuitry.
Toggle MRAM uses a one transistor, one MTJ (magnetic tunnel junction) cell to provide a simple high-density memory. During a read, the pass transistor is activated and data are read by comparing the resistance of the cell to a reference device. During writes, the magnetic field from Write Line 1 and Write Line 2 writes the cell at the intersection of the two lines but does not disturb other cells on either line. Another MRAM technology uses a spin torque transfer property, which is the manipulation of the spin of electrons with a polarizing current, to establish the desired magnetic state of the free layer to program, or write, the bits in the memory array. Spin transfer torque MRAM (STT-MRAM) provides a significant reduction in switching energy compared to toggle MRAM and is highly scalable, enabling higher density memory products.
The advantage of MRAM is its fast switching speed compared to other nonvolatile memories, with ~1-10 ns read and write/erase times, and very good endurance (up to 10^15 cycles).
One major concern with MRAM is scalability, which is difficult due to the complexity of etching many layers with good conformity. Moreover, MRAM etching generally uses ion beam etching, which is not suitable for extensive scaling. The other generally reported issue is MRAM data retention. However, recent material and stack developments have significantly improved retention performance [83], which can be attributed to the adoption of a dual MgO interface instead of the standard single layer.
MRAM also suffers from a small ON/OFF current ratio compared to the other emerging memories: STT-MRAM has a small window and its TMR (tunnel magnetoresistance) decreases with temperature [83,84]. Multilevel operation is difficult to achieve and MRAM requires a good sense amplifier. A decade ago, magnetic memories were envisaged for SRAM replacement (last level cache) due to their high speed and endurance. Since then, the retention and reliability of MRAM have been highly improved, opening the range of applications to eDRAM (Samsung [85]), embedded (Samsung [85], GlobalFoundries [86], Intel [87]), industrial (Everspin [88]), and even automotive (TSMC) applications.
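The link between a small resistance window and the need for a good sense amplifier can be illustrated with a short sketch. It uses the usual definition TMR = (R_AP − R_P)/R_P and assumes a read scheme that compares the cell to a mid-point reference resistance. The function names, the 5 kΩ parallel resistance, and the 0.1 V read voltage are hypothetical and chosen only for illustration.

```python
def tmr_ratio(r_p_ohm, r_ap_ohm):
    """Tunnel magnetoresistance ratio, (R_AP - R_P) / R_P."""
    return (r_ap_ohm - r_p_ohm) / r_p_ohm


def read_bit(r_cell_ohm, r_p_ohm, r_ap_ohm):
    """Compare the cell to a mid-point reference; 1 means antiparallel state."""
    r_ref_ohm = 0.5 * (r_p_ohm + r_ap_ohm)
    return 1 if r_cell_ohm > r_ref_ohm else 0


def sense_margin_a(v_read_v, r_p_ohm, r_ap_ohm):
    """Smallest current difference between either state and the reference."""
    r_ref_ohm = 0.5 * (r_p_ohm + r_ap_ohm)
    i_p = v_read_v / r_p_ohm
    i_ap = v_read_v / r_ap_ohm
    i_ref = v_read_v / r_ref_ohm
    return min(i_p - i_ref, i_ref - i_ap)


# Assumed example: R_P = 5 kOhm, read at 0.1 V, TMR of 100% versus 50%.
margin_tmr_100 = sense_margin_a(0.1, 5e3, 10e3)
margin_tmr_50 = sense_margin_a(0.1, 5e3, 7.5e3)
print(f"{margin_tmr_100 * 1e6:.2f} uA, {margin_tmr_50 * 1e6:.2f} uA")
```

Under these assumptions the margin is only a few microamperes and shrinks as the TMR decreases, for example at elevated temperature, which is why the sense amplifier offset becomes a limiting factor.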
Intel recently presented significant improvements in MRAM technology [89,90] for embedded applications. STT-MRAM is also proposed by GF for Embedded, MCU, and IoT applications [86]. Avalanche is shipping perpendicular MRAM for SRAM (manufactured at partner foundries) but also announced the production in 2020 of a 22 nm MRAM for Flash replacement for nonvolatile embedded applications [84]. The limitation of spin memories was retention, but recent achievements by TSMC have shown that this technology can be envisaged for automotive applications in the near future [82].
Several products already exist in the market. Everspin offers STT-MRAM products for DDR3 and DDR4 (DRAM), as well as serial peripheral interface (SPI) and parallel interface MRAM based on toggle MRAM technology. Avalanche Technology offers the SPSRAM (a persistent SRAM using STT-MRAM technology with a serial peripheral interface) and SPNOR (perpendicular STT-MRAM for embedded Flash and embedded SRAM used in system-on-chips) memories.
In summary, MRAM offers excellent endurance suitable for DRAM and SRAM applications, but has also shown recent increased stability for embedded applications. Stack complexity is generally invoked as a device limitation. The next challenge will be to increase its scalability and capacity.

Ferroelectric Memory
Ferroelectric memories (FeRAM) are based on the polarization of ferroelectric materials (Figure 5). In an FeRAM, thin ferroelectric films are sandwiched between two metallic electrodes and used as capacitors. When an electric field is applied across the stack, the polarization aligns with the electric field, leading to a low-energy state. Analogously, the high-energy state is obtained when the polarization alignment is antiparallel to the electric field. Moving from the high-energy state to the low-energy state or vice versa releases energy as a charge, normally called the switching charge (Qs). Therefore, two Q(V) curves are obtained as a function of the polarity of the applied electric field. An access transistor is used to sense the state of the ferroelectric film, while the typical polarization depends on the electric field.

Ferroelectrics are theoretically an ideal solution for low write power nonvolatile memories. However, the complexity of ferroelectric perovskites has hindered the scaling of such devices to competitive feature sizes. The discovery of ferroelectricity in hafnium oxide solved this issue and, due to its CMOS compatibility, led to renewed interest by the scientific community in this concept for various applications [92,93,94].
Three types of ferroelectric-based memories can be distinguished:
-Ferroelectric FET (FeFET): the ferroelectric material is embedded in the gate stack of a transistor. The nonvolatile polarization of the material acts as a remanent control gate and leads to a threshold voltage shift of the transistor characteristics. This concept offers ultra-low power but is a three-terminal device and can thus hardly be envisaged for high-density applications. Moreover, the effect vanishes for thin layers, making the concept hardly scalable. Finally, the degradation of the interface layer between the ferroelectric and the semiconductor channel limits endurance, in particular due to trapped charge that affects the conduction of the FET below the ferroelectric [92]. For all of these reasons, FeFETs target Flash or EEPROM rather than DRAM replacement. In particular, FeFET is now seen as an alternative to Flash for ultra-low power applications [91], due to its 10 fJ/bit consumption and 5 ns programming speed. FeFET-based eNVM solutions were integrated into leading edge technologies: GlobalFoundries (GF) FeFET technology was embedded into the 28 nm gate-first HKMG low power CMOS platform, showing 6σ distribution, reasonable endurance, and stable data retention [95]. GF also demonstrated a 22 nm node on FDSOI CMOS technology [96].
-Capacitor-based ferroelectric RAM (FeRAM): in this case, the cell resembles a DRAM with the capacitor dielectric replaced by the ferroelectric. Recent reports verified antiferroelectric properties for pure ZrO2 dielectrics used in DRAM stacks. By employing electrodes with different work function values, a built-in bias is introduced within the antiferroelectric stack, thus creating two stable nonvolatile states [97]. It demonstrated 10^10 endurance and 10 ns speed combined with 100 °C retention, making this concept very promising for a dense (6F²) and new class of nonvolatile DRAM.
In FeRAM, reading is destructive, as it is performed by switching the ferroelectric into a specific direction and measuring the contrast between a switching and a nonswitching event. Thus, reprogramming is required after each read operation. In terms of FeRAM reliability challenges, trapped charges at the ferroelectric-electrode interface have to be controlled to improve retention (requiring careful interface engineering), while dielectric breakdown induced by the high coercive field has to be prevented to ensure high endurance [92].
-Ferroelectric tunneling junctions (FTJ): in this case, the memory is a two terminal device. In the FTJ, a very thin ferroelectric film is used that allows tunneling and the tunneling current is modulated by the polarization of the ferroelectric. A critical issue for this concept (which is more prospective than the previous ones) is the low read current.
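The destructive read of the capacitor-based FeRAM described above can be sketched as a small behavioral model. This is an illustrative abstraction, not a device model: the class name `FeRamCell` and the normalized charge values are hypothetical, and the read pulse is assumed to always drive the capacitor toward the negative polarization state.

```python
class FeRamCell:
    """Behavioral sketch of a one-capacitor FeRAM cell (assumed values)."""

    def __init__(self, bit=0, q_switch=1.0, q_nonswitch=0.1):
        self.q_switch = q_switch          # charge released when polarization flips
        self.q_nonswitch = q_nonswitch    # charge released when it does not flip
        self.polarization = +1 if bit else -1

    def write(self, bit):
        self.polarization = +1 if bit else -1

    def read_pulse(self):
        """Drive the cell to -1 and return the released charge (destructive)."""
        switched = self.polarization == +1
        self.polarization = -1
        return self.q_switch if switched else self.q_nonswitch

    def read(self):
        """Sense the charge, decide the bit, then write it back."""
        charge = self.read_pulse()
        threshold = 0.5 * (self.q_switch + self.q_nonswitch)
        bit = 1 if charge > threshold else 0
        self.write(bit)  # restore the state destroyed by the read pulse
        return bit


cell = FeRamCell(bit=1)
print(cell.read(), cell.polarization)  # 1 1
```

The write-back inside `read` is the reason each read also consumes an endurance cycle in this type of cell.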
In summary, FeRAM is a simple and low cost memory offering very low consumption (~10 fJ/bit), suitable for low power applications (IoT, etc). Its high endurance and nonvolatility also make it a promising technology for the future, including neuromorphic circuits.

Emerging Memory Benchmark
A tentative benchmark is proposed in Figure 6.
Appl. Sci. 2021, 11 Figure 6. Tentative benchmark of various emerging memory technologies for various applications.
In this table, two parts are reported for some technologies. Indeed, adjusting the materials and stacks, it is possible to tune the memory characteristics and target distinct applications. In particular, we consider standalone, embedded, cache, and DRAM replacement applications.
The features are questionable and can be debated; the table only gives general trends for various emerging technologies, while characteristics can vary depending on the materials and technological maturity. Nevertheless, this table can be used as a starting point to evaluate how these new technologies can solve current challenges.
Finally, Figure 7 focuses on macros for embedded applications. Most macros are integrated in 28 nm or 22 nm nodes. Cell size, limited by the select transistor, is always in the range of 0.04-0.05 µm². The best endurance has been demonstrated for MRAM with 10^6 cycles (10-100 kc for Resistive RAM, 100 kc for Phase Change RAM). Various applications are targeted, from smartcard and IoT to automotive grade MCU.

Evolution of Von Neumann Computing Systems
Emerging memory technologies can be used in several places in the memory hierarchy, each having its own advantages with respect to the current implementation:
-It can be used in the storage memory area, either as a replacement of current flash technology or, at least in the short term, as an intermediate step between the main memory and the flash or disks: storage class memories (SCM). In this case, the memory will be interfaced via an I/O-like interface, such as SATA, Ethernet, PCIe, or another interface that could emerge, and the transfer of data with the main memory could be managed by the OS (operating system) of the chip, as it is today for storage memory. The storage memory market being essentially cost-driven, it is likely that RRAM will only replace the now well-established flash technology when it is cost-competitive; as flash density continues to increase due to monolithic 3D integration, this will take time. In the meantime, the emerging memory could be used as an intermediate step between flash and the main memory, for instance as an ultra-fast SSD used to store frequently accessed data. The large difference in latency is likely to make a higher cost acceptable. Currently, the gap between the latency of the main memory, which is in the order of 30-50 ns, and that of flash-based storage, which is about 100 µs (much more in writing), is vast, and a technology enabling microsecond scale latencies would undoubtedly be an improvement for data centric applications. For these reasons, high density RRAM and PCRAM with 1 µs latency and 10^6 cycle endurance (possibly in vertical architectures) could be two strong candidates for this role.
-It can be used in the main memory area, either as a DRAM companion chip, located on the same memory bus, or as a replacement of the DRAM. Inserting the emerging memory on the main memory bus, alongside DRAM components, will present significant system benefits. It will improve data integrity management; data can be quickly secured locally, on a word-by-word basis, instead of using current cumbersome journaling or check-pointing schemes to protect data from events such as a loss of power supply. This can significantly reduce data traffic in a data center, as it is estimated that data integrity management can represent up to 80% of the file system usage in PetaFlop data centers. As the emerging memory should become more dense and less expensive than DRAM, it will enable a much greater main memory capacity; this is especially important for big data applications, where storing large multidimensional tables in the main memory enables a tremendous performance advantage, because these tables often need to be accessed in a different order than the one in which they were stored in the storage memory, which can only be accessed sequentially. It will also simplify atomic operations in transactional databases, as RRAM are byte addressable and nonvolatile. They offer a much more efficient solution than current NVDIMMs, which are composed of a mix of DRAM and Flash powered by a bulky supercapacitor during the data transfer from DRAM to Flash when the main power supply is lost. For these reasons, according to the specifications, fast (~100 ns) and high endurance (~10^9 cycles) RRAM and PCRAM could succeed.
-It can also be used as a last-level cache replacement or complement; it is unlikely that the memory technology could become fast enough to be used as a first-level cache. In this case, the cache capacity could be made much higher, diminishing the external bandwidth requirements. Here, the most stringent requirements would be speed (<30 ns) and endurance (>10^16 cycles). We cannot rely on wear-leveling in this case, as the cache capacity would not be sufficient with regard to its bandwidth, each bit being written frequently. A high endurance of 10^16 cycles is likely to reserve this application for STT-MRAM.
FeRAM could be placed between an SCM memory (memory type) and the DRAM due to its high endurance. Finally, the only technology that exhibits performance close to SRAM is SOT-MRAM. As far as cost and power consumption are concerned, the reference there is embedded DRAM (eDRAM); the RRAM has to be cheaper and consume less power than eDRAM to be competitive.
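The order of magnitude of the endurance requirement quoted above for a last-level cache can be checked with simple arithmetic. The sketch assumes a worst case in which a single bit is rewritten at every 30 ns access, without wear-leveling, over a 10-year product lifetime; the 10-year figure and the function names are assumptions introduced here for illustration.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600


def writes_over_lifetime(write_period_s, years):
    """Number of writes to one bit if it is rewritten every write_period_s."""
    return years * SECONDS_PER_YEAR / write_period_s


def lifetime_years(endurance_cycles, write_period_s):
    """Time to exhaust the endurance of one bit rewritten every write_period_s."""
    return endurance_cycles * write_period_s / SECONDS_PER_YEAR


# Worst case: one write every 30 ns for 10 years (assumed lifetime).
print(f"{writes_over_lifetime(30e-9, 10):.2e}")        # 1.05e+16
# Same stress applied to a cell with 10^12 cycles of endurance, in hours.
print(f"{lifetime_years(1e12, 30e-9) * 365 * 24:.1f}")  # 8.3
```

Under this assumption, about 10^16 writes accumulate over the product lifetime, whereas a cell limited to 10^12 cycles would wear out within hours, which is consistent with reserving this application for the highest endurance technology.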
A summary of the possible implementation of emerging memories in the memory hierarchy is presented in Figure 8. In this figure, storage class memory refers to a class of memory that stands between DRAM and disk storage in the data storage hierarchy. In other work, this class of memory has been defined as persistent memory [98]. The difference at the system level between persistent storage, nonpersistent DRAM extension, and persistent memory at the architecture level is not discussed in this paper.

Emerging Memories for Non von Neumann Systems
In the context of the development of new architectures, emerging memories could enable revolutionary novel functions and computing paradigms due to their specificities ( Figure 9). Thus, apart from von Neumann architecture evolutions, emerging memories offer new types of applications that can be classified as follows: novel functions, in/near memory computing, and neuromorphic architectures.

Novel Functions
A large number of works take advantage of multiple undesirable nanoscale OxRAM phenomena, such as RESET current stochastic fluctuation, random telegraph noise (RTN), and RESET state resistance variability, to realize security and computing circuits [99] such as random number generators (RNG) [100] or physical unclonable functions (PUF). However, some RRAM features do not follow fully random laws: some correlation exists among subsequent RRAM levels, for instance [101], and the filament retains some memory of its morphology in the previous cycles [54]. Moreover, the reliability of an RRAM PUF may degrade with retention loss, read instability, and thermal variation, while PUF uniqueness is maintained as long as the randomness in the RRAM resistance distribution is preserved [102]. Thus, the physics involved have to be clearly understood to ensure sufficient reliability of the circuit. More generally, as device physicists, our knowledge of the physics of new technologies helps us to understand how they can serve the emergence of new systems and architectures, which requires improving our knowledge of system aspects to be able to communicate with architects.

In/Near Memory Computing
In-memory computing uses nonvolatility and the ability to couple computing with data, such as through Ohm's law to perform multiplication. Artificial neural networks take advantage of this by reducing the amount of data movement compared to von Neumann architectures. The expected performance strongly depends on the application and targeted system. However, general guidelines can be drawn. First, co-integration and persistence of the memory are prerequisites, while endurance and capacity are the key expected features. In order to move computing tasks into the emerging memory, endurance has to remain as close as possible to that of SRAM: the higher the endurance, the larger the amount of computing that can be done in the emerging device. Then, increasing the memory capacity will enable improved system complexity and performance. Again, the gain in memory capacity has to be significant with respect to what can be achieved with SRAM. As endurance and capacity increase, the system efficiency and performance will be improved.
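As a minimal sketch of multiplication through Ohm's law, consider an idealized crossbar in which each cell of conductance G[i][j] receives the voltage V[i] of its row and contributes a current V[i]·G[i][j] to its column. Summing the currents on each column (Kirchhoff's current law) gives a vector-matrix product in a single read step. The function name and the example values are hypothetical, and wire resistance, selector behavior, and device variability are ignored.

```python
def crossbar_mac(voltages_v, conductances_s):
    """Ideal crossbar read: column current I[j] = sum_i V[i] * G[i][j]."""
    n_rows = len(conductances_s)
    n_cols = len(conductances_s[0])
    if len(voltages_v) != n_rows:
        raise ValueError("one input voltage per crossbar row is required")
    currents_a = [0.0] * n_cols
    for i, v in enumerate(voltages_v):
        for j in range(n_cols):
            # Ohm's law in the cell, summation of currents on the column wire.
            currents_a[j] += v * conductances_s[i][j]
    return currents_a


# Assumed 2x2 example: read voltages in volts, conductances in siemens.
V = [0.1, 0.2]
G = [[1e-6, 2e-6],
     [3e-6, 4e-6]]
print(crossbar_mac(V, G))  # approximately [7e-07, 1e-06] amperes
```

In this picture the stored conductances play the role of the weights, so the weights never move; only the input voltages and output currents cross the array boundary.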
Today, RRAM and PCRAM endurance is insufficient to perform computing in a device, and only specific computing tasks can be achieved with a limited number of cycles. On the contrary, MRAM appears to be a promising technology due to its excellent endurance. The challenge of this technology will be to improve its capacity, using more scalable spin-based concepts [103], in order to envisage the development of more complex and better performing systems.
Ternary content addressable memory (TCAM) can also be listed as a computing architecture. TCAM performs parallel searches by comparing input searched data with data stored in the memory and returning the data address when a match occurs. TCAMs provide a lookup response in a single clock cycle, making them faster and more energy-efficient than random access memory-based search systems. Emerging memories, and in particular RRAMs, are a promising solution to implement TCAMs and can offer better area and energy efficiency than static random access memory (SRAM)-based TCAMs [104].
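The search operation of a TCAM can be sketched functionally as follows. Each stored word is a string over '0', '1', and 'X', where 'X' is the "don't care" state that makes the memory ternary. The function name and the stored words are hypothetical; a hardware TCAM compares all entries in parallel, whereas this software emulation loops over them.

```python
def tcam_search(entries, key):
    """Return the addresses of all stored words that match the search key."""
    matches = []
    for address, word in enumerate(entries):
        if len(word) != len(key):
            continue
        # A stored 'X' matches both key values; other symbols must be equal.
        if all(stored in ("X", bit) for stored, bit in zip(word, key)):
            matches.append(address)
    return matches


stored_words = ["10X1", "0XX0", "1011"]
print(tcam_search(stored_words, "1011"))  # [0, 2]
print(tcam_search(stored_words, "0110"))  # [1]
```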

Neuromorphic Architectures
As stated before, the time and energy spent to move data between memory and processor (across the so-called von Neumann bottleneck) has become the main issue of computational systems, especially for data-centric applications such as real-time image recognition and natural language processing. One way to improve computing system efficiency and capabilities is to study the human brain. It is characterized by its complex parallel architecture connecting myriad low-power computing elements (neurons) and adaptive memory elements (synapses), which outperforms modern von Neumann processors on many tasks involving unstructured data classification and pattern recognition [105].
Various works report neuromorphic digital chips to improve the energy efficiency of the multiply accumulate (MAC) operation, which is the basic task required in deep neural networks. Among the large number of demonstrations and approaches of proposed neuromorphic architectures, a distinction can be made between digital neural network accelerators, analog deep learning accelerators, and spiking brain inspired neural networks. In 2016, a TrueNorth brain chip was proposed and presented by IBM [106] as an energy efficient bio-inspired circuit to implement AI. This is an asynchronous parallel-distributed modular scalable architecture, and for this reason, it is considered to be a non von Neumann architecture. At a prospective level, the MAC operation can be directly performed in the memory exploiting Kirchhoff's law [107]. In this context, advanced memory devices offering high capacity, nonvolatility, and also multilevel (ML) and analog behavior are perfectly suited for these new architectures. More details are presented in the following for the various neuromorphic architecture categories.

-Digital neural network
Research into custom digital accelerators primarily focuses on redesigning a GPU-like processor, explicitly designed for deep learning, using full ASIC designs or FPGAs.
In the approach followed by Stanford, emerging memories can be used to improve the efficiency of microcontrollers for various applications representative of machine learning (including convolutional neural networks). In particular, improved accuracy of neural network inference was shown in [108] with an RRAM and SRAM chip, with faster (lower latency) and lower energy transitions to shutdown than Flash can offer. The RRAM, used for inference, stores the neural network model weights (five levels). During active mode, instructions are read from the RRAM and executed in the microcontroller core. After the data are processed, the results are written back to the RRAM. Thus, nonvolatile memories, and in particular emerging backend memories, become an essential on-chip device due to their fast and low energy operation, suitable for a wide range of application domains, from edge nodes for the Internet of Things (IoT) to large computing clusters.
In a digital neural network accelerator, key requirements for emerging memories would be low latency (compared to Flash), nonvolatility, limited endurance (for offline learning), and high capacity. Thus, OxRAM and PCM can be good candidates for these systems.
-Analog deep learning accelerator
The core of any analog-based accelerator is a memory array that can store the values of the weight matrix in an analog fashion [105]. Analog computing is made possible with an emerging memory that exhibits analog behavior, which is the case with PCRAM in particular, and with RRAM to a certain extent (changing the programming current or using various RESET states). This can be utilized, for example, in constructing analog circuits that solve linear systems of equations in constant rather than polynomial time. These systems can operate in read mode, which reduces the endurance requirements of the memory. A typical example is the multiply accumulate (MAC) operation performed within large memory arrays [107,109,110], as demonstrated by IBM. Multiplication between a set of input voltages and the programmed conductances in an array is the dominant operation performed in modern neural networks. In IBM circuits, the emerging memory is coupled to another device to perform inference and training tasks: PCM is used for inference, while training, which requires extensive endurance, is ensured by a capacitor [111] or by SRAM [112].
These innovative architectures impose constraints and requirements on the memory technologies such as very high capacity, multilevel capabilities with high accuracy on device conductance and low variability, high endurance, and low power consumption. In this context, several techniques have been proposed in the literature to solve the imperfections of current technologies: in [113], a mixed hardware-software neural-network implementation combines long-term storage in PCM (for weight data), near-linear updates of volatile capacitors, and weight-data transfer with 'polarity inversion' to cancel out inherent device-to-device variations. In [114,115], RRAM are used in low precision binarized neural networks (BNN), to cope with device non-idealities. In this type of neural network, both synaptic weights and neuron activations are implemented by binary values (+1 or −1) after a network training process.
Panasonic developed neuromorphic computing based on analog RRAM, the resistive analog neuromorphic device (RAND), as a low power solution for edge applications [116]. The authors demonstrated MNIST recognition and a sensor application in which several networks could be configured at the same time.
Finally, multiple-bits-per-cell storage was demonstrated in OxRAM in a 1T4R configuration, enabled by gradual SET/RESET [117]. The authors claim that this makes the structure suitable for multiple deep learning applications and showed high inference accuracy, within 0.01% of ideal values. However, statistical cell-to-cell variability limits the maximum number of levels because of the overlap between adjacent resistance distributions [118], and OxRAM resistance relaxation after programming leads to overlap of the memory states, which can degrade circuit reliability [119].
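The trade-off between the number of levels and the overlap of adjacent distributions can be illustrated with a deliberately simplified model. The sketch below assumes that every level follows a Gaussian distribution with the same standard deviation, that levels are evenly spaced across a fixed window, and that the read threshold sits midway between neighbours. All of these are assumptions made for illustration; measured OxRAM distributions reported in [118,119] are not necessarily Gaussian, and the function name and values are hypothetical.

```python
import math


def adjacent_overlap_probability(window, n_levels, sigma):
    """Probability that a cell is read past the midpoint toward one neighbour.

    window   -- distance between the lowest and highest level (arbitrary units)
    n_levels -- number of evenly spaced levels (>= 2)
    sigma    -- standard deviation of each level, same units as window

    Assumes Gaussian levels with equal sigma (illustrative assumption).
    """
    if n_levels < 2:
        raise ValueError("at least two levels are required")
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    spacing = window / (n_levels - 1)
    # One-sided Gaussian tail beyond half the level spacing.
    return 0.5 * math.erfc(spacing / (2.0 * sigma * math.sqrt(2.0)))


# Hypothetical comparison: same window and spread, more levels.
p_4_levels = adjacent_overlap_probability(window=6.0, n_levels=4, sigma=1.0)
p_16_levels = adjacent_overlap_probability(window=6.0, n_levels=16, sigma=1.0)
```

Under these assumptions, adding levels within a fixed window shrinks the spacing and increases the misread probability, which is the mechanism by which variability caps the usable number of levels.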
- Spiking brain-inspired neural network
In spiking neural networks, the approach is to mimic the behavior of the human brain and to represent data with spikes. Spiking neural networks (SNN) with spike-timing-dependent plasticity (STDP) are capable of replicating bio-realistic online/unsupervised learning [120,121], which is not the case for deep learning architectures relying on supervised backpropagation. In SNNs, simultaneous spiking activity at two neurons can lead to a potentiation of the synapse connecting them, meaning that two neurons that are active in response to the same event should be linked by a relatively strong synaptic connection.
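A common textbook way to express this plasticity rule is the pair-based exponential STDP window, sketched below. The weight change depends on the delay between the presynaptic and postsynaptic spikes: near-coincident spikes with the presynaptic neuron firing first potentiate the synapse, while the reverse order depresses it. The amplitudes, time constants, weight bounds, and function names are hypothetical values chosen for illustration, and this form is not necessarily the rule implemented in [120,121].

```python
import math


def stdp_delta_w(dt_ms, a_plus=0.01, a_minus=0.012,
                 tau_plus=20.0, tau_minus=20.0):
    """Pair-based STDP weight change for dt_ms = t_post - t_pre (ms).

    dt_ms >= 0: pre fires before (or with) post -> potentiation.
    dt_ms <  0: post fires before pre           -> depression.
    All parameter values are illustrative assumptions.
    """
    if dt_ms >= 0:
        return a_plus * math.exp(-dt_ms / tau_plus)
    return -a_minus * math.exp(dt_ms / tau_minus)


def update_weight(w, dt_ms, w_min=0.0, w_max=1.0):
    """Apply the STDP change and clip to the allowed weight range."""
    return min(w_max, max(w_min, w + stdp_delta_w(dt_ms)))


# Hypothetical usage: a synapse seeing correlated, then anti-correlated spikes.
w = 0.5
w = update_weight(w, dt_ms=2.0)    # pre shortly before post: weight increases
w_after_potentiation = w
w = update_weight(w, dt_ms=-2.0)   # post shortly before pre: weight decreases
```

In a hardware synapse, the bounded weight corresponds to the conductance window of the memory device, and each update costs a programming cycle, which is why endurance becomes the limiting factor for online learning.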
In this field, various demonstrations have been proposed in the literature, aiming at implementing STDP in hardware synapses based on various classes of emerging memories (including RRAM, CBRAM, STT-MRAM, and PCM). In [122] and [123], OxRAM synapses based on HfO2 and SiOx resistive memory technologies were combined with analog neurons in spiking neural networks for MNIST digit classification. The OxRAM, coding the synaptic weights of the network, was thus used for inference in the DNN, which required a high number of read operations but only a very limited number of cycles.
As with analog neural networks, no emerging technology can yet sustain the high number of cycles required for continuous learning of the system. Thus, today, learning is performed offline, and the synaptic weights are then coded in the nonvolatile memories for inference.
In conclusion, the emergence of high-capacity memories with good endurance and, ideally, analog (or at least multilevel) behavior can significantly improve the efficiency of data transfer and allow the emergence of new computing paradigms and non-von Neumann systems.

Conclusions
The general context of nonvolatile memories is characterized by several key points. First, the era of big data in which we live implies a constant and tremendous increase in the volume of generated data, associated with the increase in the number of connected objects. The introduction of 3D-NAND into the market allowed the density increase to be pursued. At the system level, the memory hierarchy suffers from two limitations, the memory wall (between SRAM and DRAM) and the latency gap (between DRAM and Flash), offering opportunities for new technologies. The data deluge has also changed the paradigm of computing systems, which are limited today by data transfer more than by computation. This leads to the need for more efficient and specialized architectures, such as in-memory computing and neuromorphic circuits.
At the component level, there is currently renewed interest in emerging memory technologies (RRAM, PCRAM, MRAM, FeRAM, etc.), based on "old" concepts, due to the combination they offer of fast speed, high endurance, and nonvolatility. To date, no universal memory has been discovered, and we are moving towards the coexistence of various concepts, increasingly specialized for specific applications.
Currently, 3D-NAND is dominant in the world of standalone memories, and there is no clear need for a new concept to replace it. However, there is an opportunity for emerging memories to enter the memory hierarchy in new (von Neumann and non-von Neumann) computing systems to improve efficiency and performance. Innovation will thus consist of new architectures made possible by the advent of new, increasingly mature memory technologies. This will require strong collaboration and mutual understanding between device engineers and system architects. On the other hand, it is becoming more difficult and expensive to maintain current embedded charge-based technologies at new technology nodes. Various emerging technologies are thus called upon to enter the embedded memory market. The wide range of existing applications in this domain should result in the appearance of various technologies depending on the application. To meet these requirements, there is room for dedicated research to improve emerging memory performance, based on evolving concepts, new materials, and optimized programming schemes adapted to the physics of emerging devices.