Search Results (5)

Search Parameters:
Keywords = off-chip learning

31 pages, 3939 KiB  
Article
Effective 8T Reconfigurable SRAM for Data Integrity and Versatile In-Memory Computing-Based AI Acceleration
by Sreeja S. Kumar and Jagadish Nayak
Electronics 2025, 14(13), 2719; https://doi.org/10.3390/electronics14132719 - 5 Jul 2025
Viewed by 516
Abstract
For data-intensive applications like edge AI and image processing, we present a new reconfigurable 8T SRAM-based in-memory computing (IMC) macro designed for high-performance, energy-efficient operation. The architecture mitigates von Neumann limitations through several key innovations. An adjustable capacitance array substantially increases the accuracy of the multiply-and-accumulate (MAC) engine, which achieves 10–20 TOPS/W and >95% accuracy for 4–10-bit operations and is robust across PVT variations. A dual-mode inference engine further extends capability by supporting binary and ternary neural networks (BNN/TNN) with XNOR-and-accumulate logic; with sub-5 ns mode switching, it reaches up to 30 TOPS/W efficiency and >97% accuracy. In-memory Hamming error correction is implemented directly with integrated XOR circuitry, eliminating off-chip ECC while delivering >99% error correction and >98% MAC accuracy. Machine learning-aided co-optimization ensures sense-amplifier reliability. For CMOS compatibility, the macro can also perform Boolean logic operations using standard 8T SRAM cells. Comparative circuit-level simulations show a 31.54% energy-efficiency gain and a 74.81% delay reduction over other SRAM-based IMC solutions. These improvements make the macro well suited for real-time AI acceleration, cryptography, and next-generation edge computing.
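The XNOR-and-accumulate logic that underlies BNN inference in such macros can be sketched in software. The following is a minimal illustration of the arithmetic identity the hardware exploits, not the authors' circuit: with weights and activations binarized to {-1, +1}, a signed dot product reduces to a bitwise XNOR followed by a popcount.

```python
# Software sketch of the XNOR-and-accumulate operation used for binary
# neural network (BNN) inference in SRAM-based IMC macros (illustrative
# only; the paper's macro implements this in analog/mixed-signal circuitry).

def binarize(x):
    """Map a real-valued vector to {-1, +1}."""
    return [1 if v >= 0 else -1 for v in x]

def xnor_accumulate(acts, weights):
    """Dot product of two {-1,+1} vectors via XNOR + popcount.

    Encoding: +1 -> bit 1, -1 -> bit 0. XNOR of the bit encodings is 1
    exactly when the signed values agree, so the signed dot product
    equals 2 * popcount(XNOR) - n.
    """
    n = len(acts)
    a_bits = [1 if a == 1 else 0 for a in acts]
    w_bits = [1 if w == 1 else 0 for w in weights]
    popcount = sum(1 - (a ^ w) for a, w in zip(a_bits, w_bits))  # XNOR = NOT XOR
    return 2 * popcount - n

acts = binarize([0.3, -1.2, 0.7, -0.1])
wgts = binarize([0.5, -0.4, -0.9, 0.2])
# Reference: plain signed dot product on the binarized values.
ref = sum(a * w for a, w in zip(acts, wgts))
assert xnor_accumulate(acts, wgts) == ref
```

The popcount step is what makes this cheap in memory arrays: a column of XNOR results can be accumulated on a shared capacitance line rather than with a digital adder tree.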

25 pages, 13951 KiB  
Article
1D-CNN-Transformer for Radar Emitter Identification and Implemented on FPGA
by Xiangang Gao, Bin Wu, Peng Li and Zehuan Jing
Remote Sens. 2024, 16(16), 2962; https://doi.org/10.3390/rs16162962 - 12 Aug 2024
Viewed by 3800
Abstract
Deep learning has brought great advances to radar emitter identification, and specific emitter identification (SEI), as a branch of it, has benefited as well. However, the complexity of most deep learning algorithms makes it difficult to meet SEI's requirements for low-power, high-performance processing on embedded devices, so this article proposes solutions on both the software and hardware sides. On the software side, we design a Transformer variant, the lightweight convolutional Transformer (LW-CT), which supports parameter sharing. We then cascade convolutional neural networks (CNNs) with the LW-CT to construct a one-dimensional CNN-Transformer (1D-CNN-Transformer), a lightweight neural network model that captures the long-range dependencies of radar emitter signals while extracting spatial-domain signal features. On the hardware side, we design a low-power FPGA-based neural network accelerator for real-time recognition of radar emitter signals. The accelerator not only provides high-efficiency computing engines for the network but also introduces a reconfigurable buffer called "Ping-pong CBUF" and a two-level pipeline architecture for the convolution layer to alleviate the bottleneck caused by off-chip storage access bandwidth. Experimental results show that the algorithm achieves high SEI recognition performance with low computational overhead. In addition, the hardware acceleration platform not only meets the radar emitter recognition system's requirements for low-power, high-performance processing but also outperforms the accelerators in other papers in the energy-efficiency ratio of Transformer-layer processing.
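The CNN-then-Transformer cascade described above can be sketched in numpy: a 1D convolution extracts local (spatial-domain) features from the raw signal, and a self-attention head then models long-range dependencies across the resulting feature sequence. All shapes and sizes here are illustrative assumptions, not the LW-CT architecture itself.

```python
import numpy as np

rng = np.random.default_rng(0)

def conv1d(x, kernels):
    """Valid 1D convolution. x: (T,), kernels: (C, K) -> (T-K+1, C)."""
    C, K = kernels.shape
    T = len(x) - K + 1
    out = np.empty((T, C))
    for t in range(T):
        out[t] = kernels @ x[t:t + K]   # each channel = dot with a kernel
    return out

def self_attention(feats, Wq, Wk, Wv):
    """Single-head scaled dot-product self-attention over the time axis."""
    Q, K_, V = feats @ Wq, feats @ Wk, feats @ Wv
    scores = Q @ K_.T / np.sqrt(Q.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # row-wise softmax
    return weights @ V

signal = rng.standard_normal(32)          # one radar pulse, 32 samples (toy)
kernels = rng.standard_normal((4, 5))     # 4 channels, kernel width 5
d = 4
Wq, Wk, Wv = (rng.standard_normal((d, d)) for _ in range(3))

feats = conv1d(signal, kernels)           # (28, 4) local features
attended = self_attention(feats, Wq, Wk, Wv)
assert attended.shape == feats.shape      # attention preserves sequence shape
```

The design point the abstract highlights is exactly this split of labor: convolutions are cheap and local, attention is global; running them in cascade keeps the attention sequence short, which is what makes the model light enough for an FPGA accelerator.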

11 pages, 2671 KiB  
Article
Kernel Mapping Methods of Convolutional Neural Network in 3D NAND Flash Architecture
by Min Suk Song, Hwiho Hwang, Geun Ho Lee, Suhyeon Ahn, Sungmin Hwang and Hyungjin Kim
Electronics 2023, 12(23), 4796; https://doi.org/10.3390/electronics12234796 - 27 Nov 2023
Cited by 2 | Viewed by 2201
Abstract
Flash memory is a non-volatile memory with a large memory window, high cell density, and reliable switching characteristics, and it can serve as a synaptic device in a neuromorphic system based on 3D NAND flash architecture. We fabricated a TiN/Al2O3/Si3N4/SiO2/Si stack-based flash memory device with a polysilicon channel. The input/output signals and output values are binarized for accurate vector-matrix multiplication (VMM) operations in hardware. In addition, we propose two kernel mapping methods for convolutional neural networks (CNNs) in the neuromorphic system. The VMM operations of the two mapping schemes are verified through SPICE simulation. Finally, off-chip learning in the CNN structure is performed using the Modified National Institute of Standards and Technology (MNIST) dataset. We compare the two schemes across various parameters and discuss the advantages and disadvantages of each.
(This article belongs to the Section Semiconductor Devices)
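Mapping CNN kernels onto a synaptic array means expressing convolution as a vector-matrix multiplication. A generic software statement of that equivalence is im2col: unroll each input patch into a row vector and each kernel into a weight-matrix column. This is a hypothetical illustration of the general principle, not either of the paper's two mapping schemes.

```python
import numpy as np

def im2col(img, k):
    """Slide a k x k window over img and stack the flattened patches as rows."""
    H, W = img.shape
    rows = []
    for i in range(H - k + 1):
        for j in range(W - k + 1):
            rows.append(img[i:i + k, j:j + k].ravel())
    return np.array(rows)

def conv2d_direct(img, kernel):
    """Reference direct 2D valid convolution (cross-correlation convention)."""
    k = kernel.shape[0]
    out = np.empty((img.shape[0] - k + 1, img.shape[1] - k + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + k, j:j + k] * kernel)
    return out

rng = np.random.default_rng(1)
img = rng.standard_normal((6, 6))
kernel = rng.standard_normal((3, 3))

patches = im2col(img, 3)                   # (16, 9): one row per input patch
vmm_out = patches @ kernel.ravel()         # one VMM replaces the conv loops
assert np.allclose(vmm_out, conv2d_direct(img, kernel).ravel())
```

In a NAND-string array the flattened kernel becomes a column of stored conductances and each patch becomes an input voltage vector, so the choice of mapping scheme governs how patches and kernels are laid out across strings and word lines.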

9 pages, 2396 KiB  
Article
Investigation of Deep Spiking Neural Networks Utilizing Gated Schottky Diode as Synaptic Devices
by Sung-Tae Lee and Jong-Ho Bae
Micromachines 2022, 13(11), 1800; https://doi.org/10.3390/mi13111800 - 22 Oct 2022
Cited by 2 | Viewed by 1751
Abstract
Deep learning achieves remarkable performance in applications such as image classification and speech recognition. However, state-of-the-art deep neural networks require a large number of weights and enormous computational power, creating an efficiency bottleneck for edge-device applications. To address this, deep spiking neural networks (DSNNs) have been proposed, given specialized synapse and neuron hardware. In this work, a hardware neuromorphic system for DSNNs with gated Schottky diodes was investigated. Gated Schottky diodes have a near-linear conductance response, which makes quantized weights easy to implement in synaptic devices. Based on modeling of the synaptic devices, two-layer fully connected neural networks are trained by off-chip learning. Adapting the neuron threshold is proposed to reduce the accuracy degradation caused by converting analog neural networks (ANNs) to event-driven DSNNs. Using left-justified rate coding as the input encoding method enables low-latency classification. The effects of device variation and noisy images on classification accuracy are investigated. A time-to-first-spike (TTFS) scheme significantly reduces power consumption by reducing the number of firing spikes compared to a max-firing scheme.
(This article belongs to the Section D1: Semiconductor Devices)
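The ANN-to-SNN conversion idea behind this work can be shown with a toy integrate-and-fire (IF) neuron: a ReLU unit's analog activation is approximated by the firing rate of an IF neuron driven by a constant input over many timesteps, and the threshold choice trades saturation against quantization error, which is why threshold adaptation matters. This is a generic textbook-style sketch under those assumptions, not the paper's exact scheme.

```python
def if_neuron_rate(current, threshold=1.0, steps=1000):
    """Firing rate of an integrate-and-fire neuron with reset-by-subtraction.

    For 0 <= current <= threshold the rate approximates
    current / threshold, i.e. a scaled ReLU of the input.
    """
    v, spikes = 0.0, 0
    for _ in range(steps):
        v += current                # integrate constant input current
        if v >= threshold:
            v -= threshold          # soft reset preserves residual charge
            spikes += 1
    return spikes / steps

# Rate coding reproduces the ReLU response on [0, threshold]...
for c in (0.0, 0.25, 0.5, 0.75):
    assert abs(if_neuron_rate(c) - c) < 0.01
# ...and negative input never fires, matching ReLU's zero region.
assert if_neuron_rate(-0.3) == 0.0
```

If the threshold is set too low, inputs above it all saturate to the maximum rate; too high, and few spikes occur within the time window, so rates are coarsely quantized. Adapting the threshold per layer mitigates both effects.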

14 pages, 658 KiB  
Article
Securing Resource-Constrained IoT Nodes: Towards Intelligent Microcontroller-Based Attack Detection in Distributed Smart Applications
by Andrii Shalaginov and Muhammad Ajmal Azad
Future Internet 2021, 13(11), 272; https://doi.org/10.3390/fi13110272 - 27 Oct 2021
Cited by 8 | Viewed by 3733
Abstract
In recent years, Internet of Things (IoT) devices have become an inseparable part of our lives. With the growing demand for smart applications, it is clear that IoT will take routine automation and intelligent sensing to a new level, improving quality of life. The core component of the IoT ecosystem is data, which exists in various forms and formats; the collected data are later used to create context awareness and make meaningful decisions. Alongside the undoubtedly large number of advantages of IoT, there are numerous security challenges for connected objects that cannot be neglected if services are to remain uninterrupted. The Mirai botnet attack demonstrated that IoT systems are susceptible to many forms of cyberattack. While advanced data analytics and machine learning have proven effective in various cybersecurity applications, their applicability in the domain of resource-constrained IoT has not yet been sufficiently explored in the literature. Several architectures and frameworks have been proposed for analyzing such data, yet they mostly investigate off-chip analysis. In this contribution, we show how an artificial neural network model can be trained and deployed on trivial IoT nodes to detect intelligent similarity-based network attacks. The article proposes a concept of a resource-constrained intelligent system, as part of the IoT infrastructure, for hardening cybersecurity on microcontrollers. This work serves as a stepping stone for applying artificial intelligence on devices with limited computing capabilities, such as end-point IoT nodes.
(This article belongs to the Collection Information Systems Security)
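Deploying a trained network on a microcontroller-class node typically means exporting the weights as constant arrays and writing inference as a few fixed loops, with no ML runtime, so the whole model fits in a few hundred bytes of flash. The sketch below illustrates that pattern; the weights, feature layout, and threshold are all hypothetical placeholders, not the authors' trained model.

```python
import math

# Hypothetical 3-feature, 2-hidden-unit binary classifier for an IoT node
# (imagine features like packet rate, payload entropy, connection fan-out).
# In a real deployment these constants would be exported from off-device
# training; here they are arbitrary illustrative values.
W1 = [[0.8, -0.5, 0.3],
      [-0.2, 0.9, 0.4]]
B1 = [0.1, -0.3]
W2 = [1.2, -1.1]
B2 = 0.0

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def detect(features, threshold=0.5):
    """Return True if the sample is classified as an attack."""
    hidden = [sigmoid(sum(w * f for w, f in zip(row, features)) + b)
              for row, b in zip(W1, B1)]
    score = sigmoid(sum(w * h for w, h in zip(W2, hidden)) + B2)
    return score > threshold

assert isinstance(detect([0.9, 0.8, 0.1]), bool)
```

The same structure translates line-for-line to C for an actual microcontroller, with the weight tables placed in flash and fixed-point arithmetic substituted for `math.exp` where no FPU is available.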
