Search Results (49)

Search Parameters:
Keywords = on-chip learning

13 pages, 2717 KB  
Article
Learning Dynamics of Solitonic Optical Multichannel Neurons
by Alessandro Bile, Arif Nabizada, Abraham Murad Hamza and Eugenio Fazio
Biomimetics 2025, 10(10), 645; https://doi.org/10.3390/biomimetics10100645 - 24 Sep 2025
Viewed by 326
Abstract
This study provides an in-depth analysis of the learning dynamics of multichannel optical neurons based on spatial solitons generated in lithium niobate crystals. Single-node and multi-node configurations with different topological complexities (3 × 3, 4 × 4, and 5 × 5) were compared, assessing how the number of channels, geometry, and optical parameters affect the speed and efficiency of learning. The simulations indicate that single-node neurons achieve the desired imbalance more rapidly and with lower energy expenditure, whereas multi-node structures require higher intensities and longer timescales, yet yield a greater variety of responses, more accurately reproducing the functional diversity of biological neural tissues. The results highlight how the plasticity of these devices can be entirely modulated through optical parameters, paving the way for fully optical photonic neuromorphic networks in which memory and computation are co-localized, with potential applications in on-chip learning, adaptive routing, and distributed decision-making. Full article

21 pages, 3746 KB  
Article
DCP: Learning Accelerator Dataflow for Neural Networks via Propagation
by Peng Xu, Wenqi Shao and Ping Luo
Electronics 2025, 14(15), 3085; https://doi.org/10.3390/electronics14153085 - 1 Aug 2025
Cited by 1 | Viewed by 680
Abstract
Deep neural network (DNN) hardware (HW) accelerators have achieved great success in improving DNNs’ performance and efficiency. One key reason is the dataflow in executing a DNN layer, including on-chip data partitioning, computation parallelism, and scheduling policy, which have large impacts on latency and energy consumption. Unlike prior works that required considerable effort from HW engineers to design suitable dataflows for different DNNs, this work proposes an efficient data-centric approach, named Dataflow Code Propagation (DCP), to automatically find the optimal dataflow for DNN layers in seconds without human effort. It has several attractive benefits that prior studies lack, including the following: (i) We translate the HW dataflow configuration into a code representation in a unified dataflow coding space, which can be optimized by back-propagating gradients given a DNN layer or network. (ii) DCP learns a neural predictor to efficiently update the dataflow codes towards the desired gradient directions to minimize various optimization objectives, e.g., latency and energy. (iii) It can be easily generalized to unseen HW configurations in a zero-shot or few-shot learning manner, for example, without using additional training data. Extensive experiments on several representative models such as MobileNet, ResNet, and ViT show that DCP outperforms its counterparts in various settings. Full article
(This article belongs to the Special Issue Applied Machine Learning in Data Science)
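The gradient-driven search that DCP performs can be pictured with a short sketch: a learned cost predictor is treated as differentiable, and the dataflow code itself is updated by back-propagating the predicted cost. The predictor size, code dimension, and loss weights below are illustrative assumptions, not the paper's configuration.

```python
# Illustrative sketch of DCP-style optimization: back-propagate a learned cost
# model's gradients into a continuous "dataflow code" vector. All dimensions,
# names, and weights here are assumptions for illustration.
import torch
import torch.nn as nn

class CostPredictor(nn.Module):
    """Hypothetical neural predictor mapping (layer descriptor, dataflow code)
    to estimated [latency, energy]."""
    def __init__(self, layer_dim=8, code_dim=16):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(layer_dim + code_dim, 64), nn.ReLU(),
            nn.Linear(64, 2),
        )

    def forward(self, layer, code):
        return self.net(torch.cat([layer, code], dim=-1))

def optimize_dataflow(predictor, layer, steps=200, lr=0.05, w=(1.0, 0.5)):
    # Start from a random point in the unified dataflow coding space and update
    # the code itself by gradient descent on the predicted cost.
    code = torch.randn(16, requires_grad=True)
    opt = torch.optim.Adam([code], lr=lr)
    for _ in range(steps):
        latency, energy = predictor(layer, code).unbind(-1)
        loss = w[0] * latency + w[1] * energy
        opt.zero_grad()
        loss.backward()
        opt.step()
    return code.detach()  # would then be decoded into concrete tiling/parallelism choices

# best_code = optimize_dataflow(CostPredictor(), torch.randn(8))
```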

24 pages, 6840 KB  
Article
A Tree Crown Segmentation Approach for Unmanned Aerial Vehicle Remote Sensing Images on Field Programmable Gate Array (FPGA) Neural Network Accelerator
by Jiayi Ma, Lingxiao Yan, Baozhe Chen and Li Zhang
Sensors 2025, 25(9), 2729; https://doi.org/10.3390/s25092729 - 25 Apr 2025
Cited by 1 | Viewed by 893
Abstract
Tree crown detection in high-resolution UAV forest remote sensing images using computer vision has been widely studied over the last ten years. In forest resource inventory management based on remote sensing data, crown detection is the most essential step. Deep learning has achieved good results in tree crown segmentation and species classification, but its reliance on high-performance computing platforms means that edge computing and real-time processing cannot be realized. In this work, UAV images of coniferous Pinus tabuliformis and broad-leaved Salix matsudana collected at Jingyue Ecological Forest Farm in Changping District, Beijing, are used as the dataset, and a lightweight neural network, U-Net-Light, based on U-Net and VGG16 is designed and trained. At the same time, the IP core and SoC architecture of the neural network accelerator are designed and implemented on the Xilinx ZYNQ 7100 SoC platform. The results show that U-Net-Light uses only 1.56 MB of parameters to classify and segment the crown images of the two tree species, reaching an accuracy of 85%. The designed SoC architecture and accelerator IP core achieve a 31× speedup over the ZYNQ hard processor core and a 1.3× speedup over a high-end CPU (Intel Core i9-10900K). The hardware resource overhead is less than 20% of the total deployment platform, and the total on-chip power consumption is 2.127 W. The shorter prediction time and higher energy efficiency demonstrate the effectiveness of the architecture design and IP development. This work departs from conventional canopy segmentation methods that rely heavily on ground-based high-performance computing. Instead, it proposes a lightweight neural network model deployed on an FPGA for real-time inference on unmanned aerial vehicles (UAVs), thereby significantly lowering both latency and system resource consumption. The proposed approach provides a useful reference for automating forest resource monitoring and for the intelligent development of precision agriculture. Full article
(This article belongs to the Section Sensor Networks)
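For readers unfamiliar with the encoder-decoder pattern behind U-Net-Light, the sketch below shows a U-Net-style network with VGG-like 3×3 convolution blocks and a single skip connection; the channel counts and depth are placeholders and much smaller than the trained model described above.

```python
# Minimal U-Net-style segmentation sketch (an assumption-based toy, not the
# paper's U-Net-Light): VGG-like double-3x3 conv blocks, one skip connection,
# and a per-pixel two-class head for crown segmentation.
import torch
import torch.nn as nn

def conv_block(cin, cout):
    return nn.Sequential(
        nn.Conv2d(cin, cout, 3, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(cout, cout, 3, padding=1), nn.ReLU(inplace=True),
    )

class TinyUNet(nn.Module):
    def __init__(self, classes=2, base=16):
        super().__init__()
        self.enc1 = conv_block(3, base)
        self.enc2 = conv_block(base, base * 2)
        self.pool = nn.MaxPool2d(2)
        self.up = nn.ConvTranspose2d(base * 2, base, 2, stride=2)
        self.dec = conv_block(base * 2, base)
        self.head = nn.Conv2d(base, classes, 1)

    def forward(self, x):
        e1 = self.enc1(x)                     # kept for the skip connection
        e2 = self.enc2(self.pool(e1))
        d = self.dec(torch.cat([self.up(e2), e1], dim=1))
        return self.head(d)                   # per-pixel class logits

# logits = TinyUNet()(torch.randn(1, 3, 256, 256))   # -> (1, 2, 256, 256)
```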

19 pages, 4018 KB  
Article
Research on Weather Recognition Based on a Field Programmable Gate Array and Lightweight Convolutional Neural Network
by Liying Chen, Fan Luo, Fei Wang and Liangfu Lv
Electronics 2025, 14(9), 1740; https://doi.org/10.3390/electronics14091740 - 24 Apr 2025
Viewed by 608
Abstract
With the rapid development of deep learning, weather recognition has become a research hotspot in computer vision, and field programmable gate array (FPGA) acceleration of deep learning algorithms has received growing attention. Building on this, we propose a method to implement deep neural networks for weather recognition on a small-scale FPGA. First, we train a depthwise separable convolutional neural network model for weather recognition to reduce the number of parameters and speed up the hardware implementation. However, large-scale computation also brings excessive power consumption, which greatly limits the deployment of high-performance network models on mobile platforms. Therefore, we use a lightweight convolutional neural network approach to reduce the scale of computation; the main idea of the lightweighting is to store the weights with fewer bits. In addition, a hardware implementation of this model is proposed to speed up the operation and reduce on-chip resource consumption. Finally, the network model is deployed on a Xilinx ZYNQ xc7z020 FPGA to verify the accuracy of the recognition results, and the accelerated solution achieves a speed of 108 FPS at 3.256 W of power consumption. The purpose of this design is to recognize the weather accurately and deliver current weather information to UAV (unmanned aerial vehicle) pilots and other staff who need it, so that they can grasp the current conditions at any time, make correct judgments when the weather changes, keep the UAV flying safely, and avoid weather-related equipment damage and mission failure. Full article
(This article belongs to the Section Artificial Intelligence)
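The two lightweighting ideas named above, depthwise separable convolutions and storing weights with fewer bits, can be illustrated in a few lines; the 4-bit width and the layer sizes are assumptions for illustration, not the deployed model's settings.

```python
# Hedged sketch of the two techniques mentioned in the abstract: a depthwise
# separable convolution block and symmetric low-bit weight quantization.
import torch
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    def __init__(self, cin, cout, stride=1):
        super().__init__()
        self.depthwise = nn.Conv2d(cin, cin, 3, stride, 1, groups=cin, bias=False)
        self.pointwise = nn.Conv2d(cin, cout, 1, bias=False)
        self.bn = nn.BatchNorm2d(cout)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.act(self.bn(self.pointwise(self.depthwise(x))))

def quantize_weights(w: torch.Tensor, bits: int = 4) -> torch.Tensor:
    """Round weights onto a `bits`-bit symmetric grid (fewer bits per stored weight)."""
    qmax = 2 ** (bits - 1) - 1
    scale = w.abs().max() / qmax
    return torch.round(w / scale).clamp(-qmax, qmax) * scale

block = DepthwiseSeparableConv(16, 32)
block.pointwise.weight.data = quantize_weights(block.pointwise.weight.data)
```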

20 pages, 2239 KB  
Article
A Novel Lightweight Deep Learning Approach for Drivers’ Facial Expression Detection
by Jia Uddin
Designs 2025, 9(2), 45; https://doi.org/10.3390/designs9020045 - 3 Apr 2025
Cited by 1 | Viewed by 1550
Abstract
Drivers’ facial expression recognition systems play a pivotal role in Advanced Driver Assistance Systems (ADASs) by monitoring emotional states and detecting fatigue or distractions in real time. However, deploying such systems in resource-constrained environments like vehicles requires lightweight architectures to ensure real-time performance, efficient model updates, and compatibility with embedded hardware. Smaller models significantly reduce communication overhead in distributed training. For autonomous vehicles, lightweight architectures also minimize the data transfer required for over-the-air updates. Moreover, they are crucial for their deployability on hardware with limited on-chip memory. In this work, we propose a novel Dual Attention Lightweight Deep Learning (DALDL) approach for drivers’ facial expression recognition. The proposed approach combines the SqueezeNext architecture with a Dual Attention Convolution (DAC) block. Our DAC block integrates Hybrid Channel Attention (HCA) and Coordinate Space Attention (CSA) to enhance feature extraction efficiency while maintaining minimal parameter overhead. To evaluate the effectiveness of our architecture, we compare it against two baselines: (a) Vanilla SqueezeNet and (b) AlexNet. Compared with SqueezeNet, DALDL improves accuracy by 7.96% and F1-score by 7.95% on the KMU-FED dataset. On the CK+ dataset, it achieves 8.51% higher accuracy and 8.40% higher F1-score. Against AlexNet, DALDL improves accuracy by 4.34% and F1-score by 4.17% on KMU-FED. Lastly, on CK+, it provides a 5.36% boost in accuracy and a 7.24% increase in F1-score. These results demonstrate that DALDL is a promising solution for efficient and accurate emotion recognition in real-world automotive applications. Full article
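As a generic stand-in for the channel and spatial attention combined in the DAC block (not the paper's HCA or CSA definitions), the sketch below shows how attention-based reweighting can be added with only a small parameter overhead.

```python
# Illustrative attention modules (assumed designs, not the paper's DAC block):
# squeeze-and-excitation-style channel attention plus a simple spatial map.
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    def __init__(self, channels, reduction=8):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels), nn.Sigmoid(),
        )

    def forward(self, x):
        b, c, _, _ = x.shape
        w = self.fc(x.mean(dim=(2, 3)))        # squeeze: global average pooling
        return x * w.view(b, c, 1, 1)          # excite: per-channel reweighting

class SpatialAttention(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, 7, padding=3)

    def forward(self, x):
        s = torch.cat([x.mean(1, keepdim=True), x.amax(1, keepdim=True)], dim=1)
        return x * torch.sigmoid(self.conv(s))  # per-location reweighting

# y = SpatialAttention()(ChannelAttention(32)(torch.randn(1, 32, 48, 48)))
```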

16 pages, 1318 KB  
Article
Optimised Extension of an Ultra-Low-Power RISC-V Processor to Support Lightweight Neural Network Models
by Qiankun Liu and Sam Amiri
Chips 2025, 4(2), 13; https://doi.org/10.3390/chips4020013 - 3 Apr 2025
Viewed by 2637
Abstract
With the increasing demand for efficient deep learning models in resource-constrained environments, Binary Neural Networks (BNNs) have emerged as a promising solution due to their ability to significantly reduce computational complexity while maintaining accuracy. Their integration into embedded and edge computing systems is essential for enabling real-time AI applications in areas such as autonomous systems, industrial automation, and intelligent security. Deploying a BNN on an FPGA through a RISC-V core, rather than mapping the model directly to the FPGA fabric, sacrifices detection speed but generally reduces power consumption and on-chip resource usage. The AI-extended RISC-V core is also capable of handling tasks beyond BNN inference, providing greater flexibility. This work utilises the lightweight Zero-Riscy core to deploy a BNN on an FPGA. Three custom instructions are proposed for the convolution, pooling, and fully connected layers, integrating XNOR, POPCOUNT, and threshold operations. This reduces the number of instructions required per task, thereby decreasing the frequency of interactions between Zero-Riscy and the instruction memory. The proposed solution is evaluated on two case studies: MNIST dataset classification and an intrusion detection system (IDS) for in-vehicle networks. The results show that for MNIST inference, the hardware resources required are only 9% of those used by state-of-the-art solutions, though with a slight reduction in speed. For IDS-based inference, power consumption is reduced to just 13% of the baseline, while resource usage is only 20% of the baseline. Although some speed is sacrificed, the system still meets real-time monitoring requirements. Full article
(This article belongs to the Special Issue IC Design Techniques for Power/Energy-Constrained Applications)
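The fused XNOR, POPCOUNT, and threshold operations behind the proposed custom instructions amount to the bit-level computation sketched below; the bit packing and threshold are illustrative assumptions, not the paper's ISA encoding.

```python
# Sketch of a binarized neuron using XNOR + POPCOUNT + threshold. Activations and
# weights are +/-1 values packed one bit each into Python ints (an assumed packing).

def xnor_popcount_neuron(acts: int, weights: int, n_bits: int, threshold: int) -> int:
    mask = (1 << n_bits) - 1
    matches = ~(acts ^ weights) & mask        # XNOR: 1 wherever the bits agree
    popcount = bin(matches).count("1")        # POPCOUNT
    # For +/-1 binarization, the dot product equals 2 * popcount - n_bits.
    return 1 if (2 * popcount - n_bits) >= threshold else 0

# Example: an 8-input binary neuron with threshold 0
print(xnor_popcount_neuron(0b10110010, 0b10100110, 8, 0))
```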

20 pages, 3504 KB  
Article
Memristor-Based Neuromorphic System for Unsupervised Online Learning and Network Anomaly Detection on Edge Devices
by Md Shahanur Alam, Chris Yakopcic, Raqibul Hasan and Tarek M. Taha
Information 2025, 16(3), 222; https://doi.org/10.3390/info16030222 - 13 Mar 2025
Cited by 1 | Viewed by 1303
Abstract
An ultralow-power, high-performance online-learning and anomaly-detection system has been developed for edge security applications. Designed to support personalized learning without relying on cloud data processing, the system employs sample-wise learning, eliminating the need for storing entire datasets for training. Built using memristor-based analog neuromorphic and in-memory computing techniques, the system integrates two unsupervised autoencoder neural networks—one utilizing optimized crossbar weights and the other performing real-time learning to detect novel intrusions. Threshold optimization and anomaly detection are achieved through a fully analog Euclidean Distance (ED) computation circuit, eliminating the need for floating-point processing units. The system demonstrates 87% anomaly-detection accuracy; achieves a performance of 16.1 GOPS—774× faster than the ASUS Tinker Board edge processor; and delivers an energy efficiency of 783 GOPS/W, consuming only 20.5 mW during anomaly detection. Full article
(This article belongs to the Special Issue Intelligent Information Processing for Sensors and IoT Communications)
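A software analogue of the detection rule, an autoencoder whose reconstruction is compared with the input via a Euclidean distance threshold, looks roughly like the sketch below; the network size, learning rate, and threshold are assumed values, and the analog crossbar hardware itself is not modeled.

```python
# Conceptual sketch: sample-wise (online) autoencoder training and Euclidean-
# distance anomaly scoring. Sizes and the threshold are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

def euclidean_distance(x, x_hat):
    return float(np.sqrt(np.sum((x - x_hat) ** 2)))

class TinyAutoencoder:
    """One hidden layer, trained one sample at a time (no stored dataset)."""
    def __init__(self, n_in=16, n_hidden=4, lr=0.05):
        self.W1 = rng.normal(0.0, 0.1, (n_in, n_hidden))
        self.W2 = rng.normal(0.0, 0.1, (n_hidden, n_in))
        self.lr = lr

    def forward(self, x):
        h = np.tanh(x @ self.W1)
        return h, h @ self.W2

    def score(self, x):                        # anomaly score = reconstruction distance
        return euclidean_distance(x, self.forward(x)[1])

    def online_step(self, x):                  # one gradient step on a single sample
        h, x_hat = self.forward(x)
        err = x_hat - x
        grad_w1 = np.outer(x, (err @ self.W2.T) * (1.0 - h ** 2))
        self.W2 -= self.lr * np.outer(h, err)
        self.W1 -= self.lr * grad_w1

ae, threshold = TinyAutoencoder(), 2.0         # threshold would be tuned on normal traffic
for features in rng.normal(0.0, 1.0, (200, 16)):    # "normal" traffic features
    ae.online_step(features)
print("anomaly:", ae.score(rng.normal(5.0, 1.0, 16)) > threshold)
```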

17 pages, 432 KB  
Article
Efficient Modeling and Usage of Scratchpad Memory for Artificial Intelligence Accelerators
by Cagla Irmak Rumelili Köksal and Sıddıka Berna Örs Yalçın
Electronics 2025, 14(5), 1032; https://doi.org/10.3390/electronics14051032 - 5 Mar 2025
Cited by 1 | Viewed by 2442
Abstract
Deep learning accelerators play a crucial role in enhancing computation-intensive AI applications. Optimizing system resources—such as shared caches, on-chip SRAM, and data movement mechanisms—is essential for achieving peak performance and energy efficiency. This paper explores the trade-off between last-level cache (LLC) and scratchpad memory (SPM) usage in accelerator-based SoCs. To evaluate this trade-off, we introduce a high-speed simulator for estimating the timing performance of complex SoCs and demonstrate the benefits of SPM utilization. Our work shows that dynamic reconfiguration of the LLC into an SPM with prefetching capabilities reduces cache misses while improving resource utilization, performance, and energy efficiency. With SPM usage, we achieve up to 13× speedup and a 10% reduction in energy consumption for CNN backbones. Additionally, our simulator significantly outperforms state-of-the-art alternatives, running 3000× faster than gem5-SALAM for fixed-weight convolution computations and up to 64,000× faster as weight size increases. These results validate the effectiveness of both the proposed architecture and simulator in optimizing deep learning workloads. Full article
(This article belongs to the Special Issue Recent Advances in AI Hardware Design)

19 pages, 4184 KB  
Article
An Online Evaluation Method for Random Number Entropy Sources Based on Time-Frequency Feature Fusion
by Qian Sun, Kainan Ma, Yiheng Zhou, Zhaoyuxuan Wang, Chaoxing You and Ming Liu
Entropy 2025, 27(2), 136; https://doi.org/10.3390/e27020136 - 27 Jan 2025
Viewed by 1118
Abstract
Traditional entropy source evaluation methods rely on statistical analysis and are hard to deploy on-chip or online. However, online detection of entropy source quality is necessary in some applications with high encryption levels. To address these issues, this work assesses entropy source quality by predicting the next bit of a random sequence with neural networks: the experimental results demonstrate a significant negative correlation between minimum entropy values and prediction accuracy, with a Pearson correlation coefficient of −0.925 (p-value = 1.07 × 10⁻⁷), so prediction accuracy can serve as an online quality indicator. To further improve prediction capabilities, we also propose a novel deep learning architecture, the Fast Fourier Transform-Attention Mechanism-Long Short-Term Memory Network (FFT-ATT-LSTM), which integrates a simplified soft attention mechanism with the Fast Fourier Transform (FFT), enabling effective fusion of time-domain and frequency-domain features. FFT-ATT-LSTM improves prediction accuracy by 4.46% and 8% over baseline networks when predicting random numbers. Additionally, it maintains a compact parameter size of 33.90 KB, significantly smaller than Temporal Convolutional Networks (TCN) at 41.51 KB and Transformers at 61.51 KB, while retaining comparable prediction performance. This balance between accuracy and resource efficiency makes FFT-ATT-LSTM suitable for online deployment, demonstrating considerable application potential. Full article
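The two quantities whose correlation is reported, per-bit minimum entropy and next-bit prediction accuracy, can be computed as in the sketch below; the simple context-counting predictor is only a placeholder for the FFT-ATT-LSTM.

```python
# Worked sketch: most-common-value min-entropy estimate vs. next-bit prediction
# accuracy. The context-counting predictor is a stand-in, not FFT-ATT-LSTM.
import numpy as np
from collections import Counter

def min_entropy_per_bit(bits: np.ndarray) -> float:
    p_max = max(Counter(bits.tolist()).values()) / len(bits)
    return -np.log2(p_max)                      # H_min = -log2(p_max)

def next_bit_accuracy(bits: np.ndarray, order: int = 3) -> float:
    """Predict each bit as the majority outcome previously seen after the same context."""
    history, correct = {}, 0
    for i in range(order, len(bits)):
        counts = history.setdefault(tuple(bits[i - order:i]), [0, 0])
        correct += int(int(counts[1] >= counts[0]) == bits[i])
        counts[bits[i]] += 1
    return correct / (len(bits) - order)

rng = np.random.default_rng(0)
good = rng.integers(0, 2, 100_000)                   # near-ideal source
biased = (rng.random(100_000) < 0.8).astype(int)     # weak (biased) source
for src in (good, biased):
    print(min_entropy_per_bit(src), next_bit_accuracy(src))
# Lower min-entropy pairs with higher prediction accuracy (the reported negative correlation).
```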

22 pages, 3052 KB  
Article
A Low-Power General Matrix Multiplication Accelerator with Sparse Weight-and-Output Stationary Dataflow
by Peng Liu and Yu Wang
Micromachines 2025, 16(1), 101; https://doi.org/10.3390/mi16010101 - 16 Jan 2025
Viewed by 2543
Abstract
General matrix multiplication (GEMM) in machine learning involves massive computation and data movement, which restricts its deployment on resource-constrained devices. Although data reuse can reduce data movement during GEMM processing, current approaches fail to fully exploit its potential. This work introduces a sparse GEMM accelerator with a weight-and-output stationary (WOS) dataflow and a distributed buffer architecture. It processes GEMM in a compressed format and eliminates on-chip transfers of both weights and partial sums. Furthermore, to map the compressed GEMM of various sizes onto the accelerator, an adaptable mapping scheme is designed. However, the irregular sparsity of weight matrices makes it difficult to store them in local buffers with the compressed format; denser vectors can exceed the buffer capacity, while sparser vectors may lead to the underutilization of buffers. To address this complication, this work also proposes an offline sparsity-aware shuffle strategy for weights, which balances the utilization of distributed buffers and minimizes buffer waste. Finally, a low-cost sparse computing method is applied to the WOS dataflow with globally shared inputs to achieve high computing throughput. Experiments with an FPGA show that the proposed accelerator achieves 1.73× better computing efficiency and 1.36× higher energy efficiency than existing approaches. Full article
(This article belongs to the Section E:Engineering and Technology)
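The core idea of keeping weights in a compressed format and skipping zero operands can be shown in plain NumPy; this models only the arithmetic, not the WOS dataflow, the distributed buffers, or the shuffle strategy.

```python
# Minimal sparse-GEMM sketch: weights stored row-wise in a compressed
# (index, value) form so zero weights are neither stored nor multiplied.
import numpy as np

def compress_rows(w: np.ndarray):
    """Per output row, keep only the nonzero weights and their column indices."""
    return [(np.flatnonzero(row), row[np.flatnonzero(row)]) for row in w]

def sparse_gemm(compressed_w, x: np.ndarray) -> np.ndarray:
    """y[m] = sum_k w[m, k] * x[k], skipping zeros; inputs x are shared across rows."""
    out = np.zeros((len(compressed_w), x.shape[1]))
    for m, (cols, vals) in enumerate(compressed_w):
        if cols.size:                         # weights stay resident per row ("stationary")
            out[m] = vals @ x[cols]
    return out

rng = np.random.default_rng(0)
W = rng.normal(size=(8, 32)) * (rng.random((8, 32)) < 0.2)   # roughly 80% sparse
X = rng.normal(size=(32, 4))
assert np.allclose(sparse_gemm(compress_rows(W), X), W @ X)
```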

16 pages, 1400 KB  
Article
On-Chip Sensor Utilizing Concatenated Micro-Ring with Enhanced Temperature Invariance Using Deep Learning
by Thomas Mikhail, Sarah Shafaay and Mohamed Swillam
Photonics 2024, 11(12), 1198; https://doi.org/10.3390/photonics11121198 - 20 Dec 2024
Cited by 1 | Viewed by 992
Abstract
An approach to measuring chemical concentrations using a slotted micro-ring resonator (sMRR) is proposed that is robust to spectral shifts caused by temperature variations. Two 1-D convolutional neural network architectures, ResNet34 and VGG20, were trained for regression, achieving mean squared errors (MSEs) of 1.1251 × 10⁻⁴ and 1.2195 × 10⁻⁴, respectively. The models predict concentrations of water, ethanol, methanol, and propanol (0–100%) from the transmission spectra of a single-ring sMRR etched in heavily doped silicon, operating in the mid-infrared over a 290–310 K temperature range. Transfer learning adapted the models to datasets with different temperature ranges, analytes (e.g., butanol), and sMRR designs, achieving comparable accuracy. Variations in accuracy across these datasets are also explored. Full article
(This article belongs to the Section Lasers, Light Sources and Sensors)
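A bare-bones version of the regression setup, a small 1-D CNN mapping a transmission spectrum to the four analyte concentrations, is sketched below; the spectrum length and layer sizes are assumptions, and the network is far shallower than ResNet34 or VGG20.

```python
# Assumed toy regressor (not the paper's ResNet34/VGG20): a 1-D CNN that maps a
# transmission spectrum to concentration estimates for four analytes.
import torch
import torch.nn as nn

class SpectrumRegressor(nn.Module):
    def __init__(self, n_analytes=4):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(1, 16, 7, padding=3), nn.ReLU(), nn.MaxPool1d(4),
            nn.Conv1d(16, 32, 5, padding=2), nn.ReLU(), nn.MaxPool1d(4),
            nn.Conv1d(32, 64, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool1d(1),
        )
        self.head = nn.Sequential(nn.Flatten(), nn.Linear(64, n_analytes))

    def forward(self, spectrum):                    # spectrum: (batch, 1, n_points)
        return self.head(self.features(spectrum))   # predicted concentrations

# concentrations = SpectrumRegressor()(torch.randn(8, 1, 1024))   # -> (8, 4)
```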

21 pages, 5220 KB  
Article
A Closed-Loop Ear-Worn Wearable EEG System with Real-Time Passive Electrode Skin Impedance Measurement for Early Autism Detection
by Muhammad Sheeraz, Abdul Rehman Aslam, Emmanuel Mic Drakakis, Hadi Heidari, Muhammad Awais Bin Altaf and Wala Saadeh
Sensors 2024, 24(23), 7489; https://doi.org/10.3390/s24237489 - 24 Nov 2024
Cited by 2 | Viewed by 2474
Abstract
Autism spectrum disorder (ASD) is a chronic neurological disorder with the severity directly linked to the diagnosis age. The severity can be reduced if diagnosis and intervention are early (age < 2 years). This work presents a novel ear-worn wearable EEG system designed to aid in the early detection of ASD. Conventional EEG systems often suffer from bulky, wired electrodes, high power consumption, and a lack of real-time electrode–skin interface (ESI) impedance monitoring. To address these limitations, our system incorporates continuous, long-term EEG recording, on-chip machine learning for real-time ASD prediction, and a passive ESI evaluation system. The passive ESI methodology evaluates impedance using the root mean square voltage of the output signal, considering factors like pressure, electrode surface area, material, gel thickness, and duration. The on-chip machine learning processor, implemented in 180 nm CMOS, occupies a minimal 2.52 mm² of active area while consuming only 0.87 µJ of energy per classification. The performance of this ML processor is validated using the Old Dominion University ASD dataset. Full article
(This article belongs to the Section Biomedical Sensors)

25 pages, 9992 KB  
Article
Analog Implementation of a Spiking Neuron with Memristive Synapses for Deep Learning Processing
by Royce R. Ramirez-Morales, Victor H. Ponce-Ponce, Herón Molina-Lozano, Humberto Sossa-Azuela, Oscar Islas-García and Elsa Rubio-Espino
Mathematics 2024, 12(13), 2025; https://doi.org/10.3390/math12132025 - 29 Jun 2024
Cited by 2 | Viewed by 3531
Abstract
Analog neuromorphic prototyping is essential for designing and testing spiking neuron models that use memristive devices as synapses. These prototypes can have various circuit configurations, implying different response behaviors that custom silicon designs lack. The prototype’s behavior results can be optimized for a specific foundry node, which can be used to produce a customized on-chip parallel deep neural network. Spiking neurons mimic how the biological neurons in the brain communicate through electrical potentials. Doing so enables more powerful and efficient functionality than traditional artificial neural networks that run on von Neumann computers or graphic processing unit-based platforms. Therefore, on-chip parallel deep neural network technology can accelerate deep learning processing, aiming to exploit the brain’s unique features of asynchronous and event-driven processing by leveraging the neuromorphic hardware’s inherent parallelism and analog computation capabilities. This paper presents the design and implementation of a leaky integrate-and-fire (LIF) neuron prototype implemented with commercially available components on a PCB board. The simulations conducted in LTSpice agree well with the electrical test measurements. The results demonstrate that this design can be used to interconnect many boards to build layers of physical spiking neurons, with spike-timing-dependent plasticity as the primary learning algorithm, contributing to the realization of experiments in the early stage of adopting analog neuromorphic computing. Full article
(This article belongs to the Special Issue Deep Neural Networks: Theory, Algorithms and Applications)
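The behaviour the analog prototype realizes follows the standard leaky integrate-and-fire equation; the sketch below integrates it numerically with made-up R, C, and threshold values rather than the board's actual component values.

```python
# Behavioral LIF sketch: tau * dV/dt = -(V - V_rest) + R * I, with fire-and-reset.
# R, C, thresholds, and the input current are illustrative values only.
import numpy as np

def simulate_lif(i_in, dt=1e-4, r=1e6, c=1e-7, v_rest=0.0, v_th=1.0, v_reset=0.0):
    tau = r * c
    v, spikes, trace = v_rest, [], []
    for t, i in enumerate(i_in):
        v += dt * (-(v - v_rest) + r * i) / tau   # leaky integration (Euler step)
        if v >= v_th:
            spikes.append(t * dt)                 # fire...
            v = v_reset                           # ...and reset
        trace.append(v)
    return np.array(trace), spikes

current = np.full(2000, 1.5e-6)                   # 1.5 uA step input for 0.2 s
membrane, spike_times = simulate_lif(current)
print(len(spike_times), "spikes at", spike_times)
```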

16 pages, 5947 KB  
Data Descriptor
Stimulated Microcontroller Dataset for New IoT Device Identification Schemes through On-Chip Sensor Monitoring
by Alberto Ramos, Honorio Martín, Carmen Cámara and Pedro Peris-Lopez
Data 2024, 9(5), 62; https://doi.org/10.3390/data9050062 - 28 Apr 2024
Cited by 2 | Viewed by 2176
Abstract
Legitimate identification of devices is crucial to ensure the security of present and future IoT ecosystems. In this regard, AI-based systems that exploit intrinsic hardware variations have gained notable relevance. Within this context, on-chip sensors included for monitoring purposes in a wide range of SoCs remain almost unexplored, despite their potential as a valuable source of both information and variability. In this work, we introduce and release a dataset comprising data collected from the on-chip temperature and voltage sensors of 20 microcontroller-based boards from the STM32L family. These boards were stimulated with five different algorithms, as workloads to elicit diverse responses. The dataset consists of five acquisitions (1.3 billion readouts) that are spaced over time and were obtained under different configurations using an automated platform. The raw dataset is publicly available, along with metadata and scripts developed to generate pre-processed T–V sequence sets. Finally, a proof of concept consisting of training a simple model is presented to demonstrate the feasibility of the identification system based on these data. Full article

14 pages, 3288 KB  
Article
Design of Network-on-Chip-Based Restricted Coulomb Energy Neural Network Accelerator on FPGA Device
by Soongyu Kang, Seongjoo Lee and Yunho Jung
Sensors 2024, 24(6), 1891; https://doi.org/10.3390/s24061891 - 15 Mar 2024
Cited by 2 | Viewed by 1725
Abstract
Sensor applications in internet of things (IoT) systems, coupled with artificial intelligence (AI) technology, are becoming an increasingly significant part of modern life. For low-latency AI computation in IoT systems, there is a growing preference for edge-based computing over cloud-based alternatives. The restricted coulomb energy neural network (RCE-NN) is a machine learning algorithm well-suited for implementation on edge devices due to its simple learning and recognition scheme. In addition, because the RCE-NN generates neurons as needed, it is easy to adjust the network structure and learn additional data. Therefore, the RCE-NN can provide edge-based real-time processing for various sensor applications. However, previous RCE-NN accelerators have limited scalability when the number of neurons increases. In this paper, we propose a network-on-chip (NoC)-based RCE-NN accelerator and present the results of implementation on a field-programmable gate array (FPGA). NoC is an effective solution for managing massive interconnections. The proposed RCE-NN accelerator utilizes a hierarchical–star (H–star) topology, which efficiently handles a large number of neurons, along with routers specifically designed for the RCE-NN. These approaches result in only a slight decrease in the maximum operating frequency as the number of neurons increases. Consequently, the maximum operating frequency of the proposed RCE-NN accelerator with 512 neurons increased by 126.1% compared to a previous RCE-NN accelerator. This enhancement was verified with two datasets for gas and sign language recognition, achieving accelerations of up to 54.8% in learning time and up to 45.7% in recognition time. The NoC scheme of the proposed RCE-NN accelerator is an appropriate solution to ensure the scalability of the neural network while providing high-performance on-chip learning and recognition. Full article
(This article belongs to the Section Sensor Networks)
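The learning and recognition scheme of an RCE network, committing a new neuron when no correct-class neuron fires and shrinking the influence fields of wrong-class neurons, can be sketched as follows; the feature vectors, radii, and labels are illustrative, and the NoC mapping is not modeled.

```python
# Conceptual RCE-NN sketch: neurons are (center, radius, label) triples created on
# demand; wrong-class neurons shrink when they fire on a new training sample.
import numpy as np

class RCENetwork:
    def __init__(self, max_radius=1.0, min_radius=0.05):
        self.centers, self.radii, self.labels = [], [], []
        self.max_radius, self.min_radius = max_radius, min_radius

    def _firing(self, x):
        return [i for i, (c, r) in enumerate(zip(self.centers, self.radii))
                if np.linalg.norm(x - c) < r]

    def learn(self, x, label):
        fired = self._firing(x)
        if not any(self.labels[i] == label for i in fired):
            self.centers.append(np.asarray(x, float))    # commit a new neuron
            self.radii.append(self.max_radius)
            self.labels.append(label)
        for i in fired:                                   # shrink wrong-class neurons
            if self.labels[i] != label:
                self.radii[i] = max(self.min_radius, np.linalg.norm(x - self.centers[i]))

    def recognize(self, x):
        fired = self._firing(x)
        return self.labels[fired[0]] if fired else None   # None = "unknown"

net = RCENetwork()
for vec, lab in [([0.1, 0.2], "class_A"), ([0.8, 0.9], "class_B"), ([0.15, 0.25], "class_A")]:
    net.learn(np.array(vec), lab)
print(net.recognize(np.array([0.12, 0.22])))              # -> class_A
```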
