Search Results (35)

Search Parameters:
Keywords = binary neural network (BNN)

18 pages, 768 KB  
Article
Uncertainty-Aware Design of High-Entropy Alloys via Ensemble Thermodynamic Modeling and Search Space Pruning
by Roman Dębski, Władysław Gąsior, Wojciech Gierlotka and Adam Dębski
Appl. Sci. 2025, 15(16), 8991; https://doi.org/10.3390/app15168991 - 14 Aug 2025
Viewed by 482
Abstract
The discovery and design of high-entropy alloys (HEAs) faces significant challenges due to the vast combinatorial design space and uncertainties in thermodynamic data. This work presents a modular, uncertainty-aware computational framework with the primary objective of accelerating the discovery of solid-solution HEA candidates. The proposed pipeline integrates ensemble thermodynamic modeling, Monte Carlo-based estimation, and a structured three-phase pruning algorithm for efficient search space reduction. Key quantitative results are achieved in two main areas. First, for binary alloy thermodynamics, a Bayesian Neural Network (BNN) ensemble trained on domain-informed features predicts mixing enthalpies with high accuracy, yielding a mean absolute error (MAE) of 0.48 kJ/mol—substantially outperforming the classical Miedema model (MAE = 4.27 kJ/mol). These probabilistic predictions are propagated through Monte Carlo sampling to estimate multi-component thermodynamic descriptors, including ΔHmix and the Ω parameter, while capturing predictive uncertainty. Second, in a case study on the Al-Cu-Fe-Ni-Ti system, the framework reduces a 2.4 million (2.4 M) candidate pool to just 91 high-confidence compositions. Final selection is guided by an uncertainty-aware viability metric, P(HEA), and supported by interpretable radar plot visualizations for multi-objective assessment. The results demonstrate the framework’s ability to combine physical priors, probabilistic modeling, and design heuristics into a data-efficient and interpretable pipeline for materials discovery. This establishes a foundation for future HEA optimization, dataset refinement, and adaptive experimental design under uncertainty. Full article
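For readers who want to reproduce the general idea of propagating ensemble uncertainty into multi-component descriptors, the sketch below samples hypothetical binary-pair enthalpy predictions (a mean and standard deviation per pair, standing in for the BNN ensemble output) and pushes them through the standard ΔHmix and Ω formulas used in the HEA literature. The helper names, toy values, and melting-temperature estimate are illustrative assumptions, not the paper's code or data.

```python
import numpy as np

R = 8.314  # gas constant, J/(mol K)

def multicomponent_descriptors(c, pair_dH, T_m=1500.0):
    """Regular-solution-style estimates of dH_mix (kJ/mol), dS_mix (J/(mol K)), and Omega.

    c       : composition vector (mole fractions, sums to 1)
    pair_dH : dict {(i, j): binary mixing enthalpy at equiatomic composition, kJ/mol}
    T_m     : rough melting-temperature estimate in K (assumed value)
    """
    n = len(c)
    dH = sum(4.0 * pair_dH[(i, j)] * c[i] * c[j]
             for i in range(n) for j in range(i + 1, n))
    dS = -R * sum(ci * np.log(ci) for ci in c if ci > 0)
    omega = T_m * dS / (abs(dH) * 1000.0 + 1e-12)   # dH converted to J/mol
    return dH, dS, omega

def monte_carlo_omega(c, pair_mu, pair_sigma, n_samples=5000, seed=0):
    """Propagate per-pair predictive uncertainty (mean, std of each binary dH)
    through the descriptor formulas by Monte Carlo sampling."""
    rng = np.random.default_rng(seed)
    omegas = []
    for _ in range(n_samples):
        sampled = {k: rng.normal(pair_mu[k], pair_sigma[k]) for k in pair_mu}
        _, _, omega = multicomponent_descriptors(c, sampled)
        omegas.append(omega)
    omegas = np.asarray(omegas)
    return omegas.mean(), omegas.std()

# Hypothetical ensemble outputs for a 3-element toy system (not the paper's values).
pair_mu    = {(0, 1): -10.0, (0, 2): -4.0, (1, 2): 2.0}   # kJ/mol
pair_sigma = {(0, 1): 1.0,   (0, 2): 0.8,  (1, 2): 0.5}
print(monte_carlo_omega([1/3, 1/3, 1/3], pair_mu, pair_sigma))
```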

31 pages, 3939 KB  
Article
Effective 8T Reconfigurable SRAM for Data Integrity and Versatile In-Memory Computing-Based AI Acceleration
by Sreeja S. Kumar and Jagadish Nayak
Electronics 2025, 14(13), 2719; https://doi.org/10.3390/electronics14132719 - 5 Jul 2025
Viewed by 1771
Abstract
For data-intensive applications like edge AI and image processing, we present a new reconfigurable 8T SRAM-based in-memory computing (IMC) macro designed for high-performance and energy-efficient operation. This architecture mitigates von Neumann limitations through several key architectural innovations. We built a new architecture with an adjustable capacitance array to substantially increase the multiply-and-accumulate (MAC) engine’s accuracy. It achieves 10–20 TOPS/W and >95% accuracy for 4–10-bit operations and remains robust across PVT variations. By supporting binary and ternary neural networks (BNN/TNN) with XNOR-and-accumulate logic, a dual-mode inference engine further expands capabilities. With sub-5 ns mode switching, it can achieve up to 30 TOPS/W efficiency and >97% accuracy. In-memory Hamming error correction is implemented directly using integrated XOR circuitry. This technique eliminates off-chip ECC with >99% error correction and >98% MAC accuracy. Machine learning-aided co-optimization ensures sense-amplifier reliability. To ensure CMOS compatibility, the macro can also perform Boolean logic operations using standard 8T SRAM cells. Comparative circuit-level simulations show a 31.54% energy efficiency boost and a 74.81% delay reduction over other SRAM-based IMC solutions. These improvements make our macro ideal for real-time AI acceleration, cryptography, and next-generation edge computing, enabling advanced compute-in-memory systems. Full article
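The dual-mode BNN/TNN engine described above is built on XNOR-and-accumulate logic. The following minimal Python sketch illustrates the underlying identity — the dot product of two {−1, +1} vectors equals 2 × popcount(XNOR) − N — which is a generic BNN relation rather than a model of the macro's circuitry.

```python
import numpy as np

def packed_xnor_dot(a_bits: int, b_bits: int, n: int) -> int:
    """Dot product of two length-n {-1, +1} vectors stored as bit masks
    (bit = 1 encodes +1, bit = 0 encodes -1), via XNOR + popcount."""
    mask = (1 << n) - 1
    xnor = ~(a_bits ^ b_bits) & mask       # 1 wherever the signs agree
    matches = bin(xnor).count("1")         # popcount
    return 2 * matches - n                 # agreements minus disagreements

# Cross-check against the straightforward +/-1 dot product.
rng = np.random.default_rng(0)
a = rng.choice([-1, 1], size=16)
b = rng.choice([-1, 1], size=16)
a_bits = int("".join("1" if v > 0 else "0" for v in a), 2)
b_bits = int("".join("1" if v > 0 else "0" for v in b), 2)
assert packed_xnor_dot(a_bits, b_bits, 16) == int(np.dot(a, b))
```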

13 pages, 3561 KB  
Article
Attention-Based Batch Normalization for Binary Neural Networks
by Shan Gu, Guoyin Zhang, Chengwei Jia and Yanxia Wu
Entropy 2025, 27(6), 645; https://doi.org/10.3390/e27060645 - 17 Jun 2025
Cited by 1 | Viewed by 871
Abstract
Batch normalization (BN) is crucial for achieving state-of-the-art binary neural networks (BNNs). Unlike full-precision neural networks, BNNs restrict activations to the discrete values {−1, +1}, which calls for a renewed understanding of the role and significance of BN layers in BNNs. Many studies have noted this and tried to explain it. Inspired by these studies, we introduce the self-attention mechanism into BN and propose a novel Attention-Based Batch Normalization (ABN) for Binary Neural Networks. We also present an ablation study of parameter trade-offs in ABN, as well as an experimental analysis of the effect of ABN on BNNs. The analyses show that our ABN method helps to capture image features, provides additional activation-like functions, and increases the imbalance of the activation distribution, and these properties help to improve the performance of BNNs. Furthermore, we conduct image classification experiments on the CIFAR-10, CIFAR-100, and TinyImageNet datasets using BinaryNet and ResNet-18 network structures. The experimental results demonstrate that our ABN consistently outperforms the baseline BN across various benchmark datasets and models in terms of image classification accuracy. In addition, ABN exhibits less variance on the CIFAR datasets, which suggests that ABN can improve the stability and reliability of models. Full article
(This article belongs to the Section Information Theory, Probability and Statistics)
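The abstract's claim that BN shapes the imbalance of the binarized activation distribution can be made concrete with a small numeric experiment: shifting the BN offset β moves the effective threshold of the subsequent sign function and thus the fraction of +1 activations. The snippet below shows only this baseline BN-plus-sign behavior; it is not the ABN method itself.

```python
import numpy as np

def bn_then_sign(x, gamma=1.0, beta=0.0, eps=1e-5):
    """Per-channel batch norm followed by the sign binarizer used in BNNs."""
    x_hat = (x - x.mean()) / np.sqrt(x.var() + eps)
    y = gamma * x_hat + beta
    return np.where(y >= 0, 1, -1)

rng = np.random.default_rng(0)
pre_act = rng.normal(size=10_000)

for beta in (-1.0, 0.0, 1.0):
    act = bn_then_sign(pre_act, beta=beta)
    print(f"beta={beta:+.1f}  fraction of +1 activations: {(act == 1).mean():.2f}")
# Shifting beta moves the binarization threshold, changing the +1/-1 balance
# that the paper argues matters for BNN performance.
```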

22 pages, 3160 KB  
Article
HE-BiDet: A Hardware Efficient Binary Neural Network Accelerator for Object Detection in SAR Images
by Dezheng Zhang, Zehan Liang, Rui Cen, Zhihong Yan, Rui Wan and Dong Wang
Micromachines 2025, 16(5), 549; https://doi.org/10.3390/mi16050549 - 30 Apr 2025
Viewed by 842
Abstract
Convolutional Neural Network (CNN)-based Synthetic Aperture Radar (SAR) target detection eliminates manual feature engineering and improves robustness but suffers from high computational costs, hindering on-satellite deployment. To address this, we propose HE-BiDet, an ultra-lightweight Binary Neural Network (BNN) framework co-designed with hardware acceleration. First, we develop an ultra-lightweight SAR ship detection model. Second, we design a BNN accelerator leveraging four directions of parallelism and an on-chip data buffer with optimized addressing to feed the computing array efficiently. To accelerate post-processing, we introduce a hardware-based threshold filter that eliminates redundant anchor boxes early and a dedicated Non-Maximum Suppression (NMS) unit. Evaluated on SAR-Ship, AirSAR-Ship 2.0, and SSDD, our model achieves 91.3%, 71.0%, and 92.7% accuracy, respectively. Implemented on a Xilinx Virtex-XC7VX690T FPGA, the system achieves 189.3 FPS, demonstrating real-time capability for spaceborne deployment. Full article
(This article belongs to the Section E: Engineering and Technology)
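The post-processing path described above — an early score-threshold filter followed by a dedicated NMS unit — corresponds to the standard detection post-processing shown below in plain Python. This is a generic software reference, not the paper's RTL, and the threshold values are placeholders.

```python
import numpy as np

def iou(box, boxes):
    """IoU between one box and an array of boxes, each as [x1, y1, x2, y2]."""
    x1 = np.maximum(box[0], boxes[:, 0]); y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2]); y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area_a = (box[2] - box[0]) * (box[3] - box[1])
    area_b = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area_a + area_b - inter + 1e-9)

def postprocess(boxes, scores, score_thr=0.3, iou_thr=0.5):
    """Score-threshold filter first (cheap), then greedy NMS on the survivors."""
    keep_mask = scores >= score_thr            # the 'threshold filter' stage
    boxes, scores = boxes[keep_mask], scores[keep_mask]
    order = np.argsort(-scores)
    kept = []
    while order.size:
        i = order[0]
        kept.append(i)
        overlaps = iou(boxes[i], boxes[order[1:]])
        order = order[1:][overlaps < iou_thr]  # suppress highly overlapping boxes
    return boxes[kept], scores[kept]
```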

16 pages, 1318 KB  
Article
Optimised Extension of an Ultra-Low-Power RISC-V Processor to Support Lightweight Neural Network Models
by Qiankun Liu and Sam Amiri
Chips 2025, 4(2), 13; https://doi.org/10.3390/chips4020013 - 3 Apr 2025
Viewed by 2389
Abstract
With the increasing demand for efficient deep learning models in resource-constrained environments, Binary Neural Networks (BNNs) have emerged as a promising solution due to their ability to significantly reduce computational complexity while maintaining accuracy. Their integration into embedded and edge computing systems is essential for enabling real-time AI applications in areas such as autonomous systems, industrial automation, and intelligent security. Deploying a BNN on an FPGA through a RISC-V core, rather than mapping the model directly to FPGA logic, sacrifices detection speed but generally reduces power consumption and on-chip resource usage. The AI-extended RISC-V core is capable of handling tasks beyond BNN inference, providing greater flexibility. This work utilises the lightweight Zero-Riscy core to deploy a BNN on an FPGA. Three custom instructions are proposed for the convolution, pooling, and fully connected layers, integrating XNOR, POPCOUNT, and threshold operations. This reduces the number of instructions required per task, thereby decreasing the frequency of interactions between Zero-Riscy and the instruction memory. The proposed solution is evaluated on two case studies: MNIST dataset classification and an intrusion detection system (IDS) for in-vehicle networks. The results show that for MNIST inference, the hardware resources required are only 9% of those used by state-of-the-art solutions, though with a slight reduction in speed. For IDS-based inference, power consumption is reduced to just 13% of the original and resource usage to only 20%. Although some speed is sacrificed, the system still meets real-time monitoring requirements. Full article
(This article belongs to the Special Issue IC Design Techniques for Power/Energy-Constrained Applications)
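To see what the proposed custom instructions fuse, the sketch below spells out the scalar loop that one binary convolution output normally requires — XNOR and POPCOUNT over packed words followed by a threshold comparison (the fused batch-norm-plus-sign step). The word width and exact instruction semantics here are assumptions for illustration, not the paper's ISA extension.

```python
def binary_conv_output_bit(act_words, wgt_words, threshold, word_bits=32):
    """One output activation of a binary conv layer, computed the way a fused
    XNOR/POPCOUNT/threshold instruction would: accumulate sign agreements over
    packed words, then compare against a precomputed threshold."""
    mask = (1 << word_bits) - 1
    popcount_acc = 0
    for a, w in zip(act_words, wgt_words):
        popcount_acc += bin(~(a ^ w) & mask).count("1")   # XNOR + POPCOUNT
    return 1 if popcount_acc >= threshold else 0          # threshold = fused BN + sign

# Without the fused operation, each output bit needs separate XOR, NOT, popcount,
# add, and compare instructions per packed word; fusing them cuts the number of
# instructions fetched from memory per MAC, which is the effect the paper targets.
print(binary_conv_output_bit([0xFFFF0000, 0x0F0F0F0F], [0xFFFF00FF, 0x0F0F0FF0],
                             threshold=40))
```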

12 pages, 6631 KB  
Article
Fall Detection Based on Continuous Wave Radar Sensor Using Binarized Neural Networks
by Hyeongwon Cho, Soongyu Kang, Yunseong Sim, Seongjoo Lee and Yunho Jung
Appl. Sci. 2025, 15(2), 546; https://doi.org/10.3390/app15020546 - 8 Jan 2025
Cited by 1 | Viewed by 1391
Abstract
Accidents caused by falls among the elderly have become a significant social issue, making fall detection systems increasingly necessary. Fall detection systems such as internet of things (IoT) devices must be affordable and compact because they must be installed in various locations around the house, such as bedrooms, living rooms, and bathrooms. In this study, we propose a lightweight fall detection method using a continuous-wave (CW) radar sensor and a binarized neural network (BNN) to meet these requirements. We used a CW radar sensor, which is more affordable than other types of radar sensors, and employed a BNN with binarized features and parameters to reduce memory usage and make the system lighter. The proposed method distinguishes movements using micro-Doppler signatures, and the spectrogram is binarized before being fed to the BNN as input. The proposed method achieved 93.1% accuracy in binary classification of five fall actions and six non-fall actions. The memory requirements for storing parameters were reduced to 11.9 KB, representing a reduction of up to 99.9% compared with previous studies. Full article
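The abstract states that the micro-Doppler spectrogram is binarized before being passed to the BNN but does not specify the thresholding rule; the sketch below uses a per-frequency-bin median threshold purely as an illustrative choice.

```python
import numpy as np

def binarize_spectrogram(spec_db: np.ndarray) -> np.ndarray:
    """Binarize a (freq_bins x time_frames) spectrogram in dB to {-1, +1}.
    Thresholding each frequency bin at its own median is an assumed scheme,
    not necessarily the one used in the paper."""
    thresholds = np.median(spec_db, axis=1, keepdims=True)
    return np.where(spec_db >= thresholds, 1, -1).astype(np.int8)

spec = np.random.default_rng(0).normal(size=(64, 128))   # stand-in spectrogram
binary_input = binarize_spectrogram(spec)
print(binary_input.dtype, binary_input.nbytes, "bytes")  # 1 byte per element here;
# packing 8 values per byte would shrink it further, in line with the paper's
# goal of minimizing memory for the BNN input and parameters.
```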

24 pages, 17112 KB  
Article
Enhancing Binary Convolutional Neural Networks for Hyperspectral Image Classification
by Xuebin Tang, Ke Zhang, Xiaolei Zhou, Lingbin Zeng and Shan Huang
Remote Sens. 2024, 16(23), 4398; https://doi.org/10.3390/rs16234398 - 24 Nov 2024
Cited by 5 | Viewed by 1385
Abstract
Hyperspectral remote sensing technology is swiftly evolving, prioritizing affordability, enhanced portability, seamless integration, sophisticated intelligence, and immediate processing capabilities. The leading model for classifying hyperspectral images, which relies on convolutional neural networks (CNNs), has proven to be highly effective when run on advanced computing platforms. Nonetheless, the high degree of parameterization inherent in CNN models necessitates considerable computational and storage resources, posing challenges to their deployment on resource-limited processors such as those aboard drones and satellites. This paper focuses on advancing lightweight models for hyperspectral image classification and introduces EBCNN, a novel binary convolutional neural network. EBCNN is designed to effectively regulate backpropagation gradients and minimize gradient discrepancies to optimize BNN performance. EBCNN incorporates an adaptive gradient scaling module that utilizes a multi-scale pyramid squeeze attention (PSA) mechanism during the training phase, which can adjust training gradients flexibly and efficiently. Additionally, to address suboptimal training issues, EBCNN employs a dynamic curriculum learning strategy underpinned by a confidence-aware loss function, Superloss, enabling progressive binarization and enhancing its classification effectiveness. Extensive experimental evaluations conducted on five widely used public datasets confirm the effectiveness of EBCNN. These analyses highlight a significant enhancement in the classification accuracy of hyperspectral images, achieved without incurring additional memory or computational overheads during the inference process. Full article
(This article belongs to the Section AI Remote Sensing)
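EBCNN's adaptive gradient scaling acts on the backward pass of the binarization. As background, the snippet below shows the baseline straight-through estimator with a fixed scalar gradient scale; the paper's PSA-based, per-location scaling replaces this fixed scalar and is not reproduced here.

```python
import numpy as np

def sign_forward(x):
    """Forward pass of the binarizer: map real values to {-1, +1}."""
    return np.where(x >= 0, 1.0, -1.0)

def ste_backward(x, grad_out, scale=1.0):
    """Straight-through estimator: pass the upstream gradient where |x| <= 1,
    zero it elsewhere, optionally multiplied by a scalar 'scale'. EBCNN replaces
    this fixed scale with an attention-derived scaling (see the paper)."""
    return grad_out * (np.abs(x) <= 1.0) * scale

x = np.linspace(-2, 2, 9)
upstream = np.ones_like(x)
print(sign_forward(x))
print(ste_backward(x, upstream, scale=0.5))
```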

21 pages, 982 KB  
Article
Ponte: Represent Totally Binary Neural Network Toward Efficiency
by Jia Xu, Han Pu and Dong Wang
Sensors 2024, 24(20), 6726; https://doi.org/10.3390/s24206726 - 19 Oct 2024
Viewed by 1403
Abstract
In the quest for computational efficiency, binary neural networks (BNNs) have emerged as a promising paradigm, offering significant reductions in memory footprint and computational latency. In traditional BNN implementations, the first and last layers are typically kept at full precision, which increases logic usage in field-programmable gate array (FPGA) implementations. To solve these issues, we introduce a novel approach named Ponte (Represent Totally Binary Neural Network Toward Efficiency) that extends the binarization process to the first and last layers of BNNs. We challenge the convention by proposing a fully binary layer replacement that mitigates the computational overhead without compromising accuracy. Our method leverages a unique encoding technique, Ponte::encoding, together with channel duplication and sharing strategies, Ponte::dispatch and Ponte::sharing, to address the non-linearity and capacity constraints posed by binary layers. Surprisingly, all of them support back-propagation, which allows the approach to be applied to the last layer as well. Through extensive experimentation on benchmark datasets, including CIFAR-10 and ImageNet, we demonstrate that Ponte not only preserves the integrity of the input data but also enhances the representational capacity of BNNs. The proposed architecture achieves comparable, if not superior, performance metrics while significantly reducing the computational demands, thereby marking a step forward in the practical deployment of BNNs in resource-constrained environments. Full article
(This article belongs to the Section Sensor Networks)

17 pages, 2167 KB  
Article
LDF-BNN: A Real-Time and High-Accuracy Binary Neural Network Accelerator Based on the Improved BNext
by Rui Wan, Rui Cen, Dezheng Zhang and Dong Wang
Micromachines 2024, 15(10), 1265; https://doi.org/10.3390/mi15101265 - 17 Oct 2024
Cited by 1 | Viewed by 1934
Abstract
Significant progress has been made in industrial defect detection due to the powerful feature extraction capabilities of deep neural networks (DNNs). However, the high computational cost and memory requirements of DNNs pose a great challenge to deployment on industrial edge devices. Although traditional binary neural networks (BNNs) have the advantages of small storage requirements, high parallel computing capability, and low power consumption, the problem of significant accuracy degradation cannot be ignored. To tackle these challenges, this paper constructs a BNN with a layered data fusion mechanism (LDF-BNN) based on BNext. By introducing this mechanism, it strives to minimize bandwidth pressure while reducing the loss of accuracy. Furthermore, we have designed an efficient hardware accelerator architecture based on this mechanism, enhancing the performance of high-accuracy BNN models with complex network structures. Additionally, the introduction of multi-storage parallelism alleviates the limitations imposed by the internal transfer rate, thus improving the overall computational efficiency. The experimental results show that our proposed LDF-BNN outperforms other methods in the comprehensive comparison, achieving a high accuracy of 72.23%, an image processing rate of 72.6 frames per second (FPS), and 1826 giga operations per second (GOPS) on the ImageNet dataset. Meanwhile, LDF-BNN can also be applied effectively to the defect detection dataset Mixed WM-38, achieving a high accuracy of 98.70%. Full article

20 pages, 740 KB  
Article
A Variation-Aware Binary Neural Network Framework for Process Resilient In-Memory Computations
by Minh-Son Le, Thi-Nhan Pham, Thanh-Dat Nguyen and Ik-Joon Chang
Electronics 2024, 13(19), 3847; https://doi.org/10.3390/electronics13193847 - 28 Sep 2024
Viewed by 1582
Abstract
Binary neural networks (BNNs) that use 1-bit weights and activations have garnered interest as extreme quantization provides low power dissipation. By implementing BNNs as computation-in-memory (CIM), which computes multiplication and accumulations on memory arrays in an analog fashion, namely, analog CIM, we can further improve the energy efficiency to process neural networks. However, analog CIMs are susceptible to process variation, which refers to the variability in manufacturing that causes fluctuations in the electrical properties of transistors, resulting in significant degradation in BNN accuracy. Our Monte Carlo simulations demonstrate that in an SRAM-based analog CIM implementing the VGG-9 BNN model, the classification accuracy on the CIFAR-10 image dataset is degraded to below 50% under process variations in a 28 nm FD-SOI technology. To overcome this problem, we present a variation-aware BNN framework. The proposed framework is developed for SRAM-based BNN CIMs since SRAM is most widely used as on-chip memory; however, it is easily extensible to BNN CIMs based on other memories. Our extensive experimental results demonstrate that under process variation of 28 nm FD-SOI, with an SRAM array size of 128×128, our framework significantly enhances classification accuracies on both the MNIST hand-written digit dataset and the CIFAR-10 image dataset. Specifically, for the CONVNET BNN model on MNIST, accuracy improves from 60.24% to 92.33%, while for the VGG-9 BNN model on CIFAR-10, accuracy increases from 45.23% to 78.22%. Full article
(This article belongs to the Special Issue Research on Key Technologies for Hardware Acceleration)
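The accuracy collapse under process variation reported above can be emulated at a purely behavioral level by adding Gaussian disturbance to each column's analog MAC result and counting how often the output sign flips. The noise model and magnitudes below are illustrative assumptions, not the paper's 28 nm FD-SOI variation data.

```python
import numpy as np

def noisy_binary_mac(acts, weights, sigma, rng):
    """Behavioral stand-in for one analog CIM column: the ideal +/-1 dot product
    plus a Gaussian disturbance representing accumulated device variation."""
    ideal = float(acts @ weights)
    return ideal + rng.normal(0.0, sigma * np.sqrt(len(acts)))

def sign_flip_rate(n_rows=128, sigma=0.5, trials=2000, seed=0):
    """Fraction of columns whose output sign (the value a BNN actually uses)
    flips under the injected variation; ties (ideal == 0) are skipped."""
    rng = np.random.default_rng(seed)
    flips, valid = 0, 0
    for _ in range(trials):
        a = rng.choice([-1.0, 1.0], size=n_rows)
        w = rng.choice([-1.0, 1.0], size=n_rows)
        ideal = float(a @ w)
        if ideal == 0.0:
            continue
        valid += 1
        if np.sign(noisy_binary_mac(a, w, sigma, rng)) != np.sign(ideal):
            flips += 1
    return flips / valid

for s in (0.1, 0.5, 1.0):
    print(f"sigma={s}: sign-flip rate {sign_flip_rate(sigma=s):.3f}")
```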

15 pages, 1136 KB  
Article
Optimizing Data Flow in Binary Neural Networks
by Lorenzo Vorabbi, Davide Maltoni and Stefano Santi
Sensors 2024, 24(15), 4780; https://doi.org/10.3390/s24154780 - 23 Jul 2024
Cited by 3 | Viewed by 1339
Abstract
Binary neural networks (BNNs) can substantially accelerate a neural network’s inference time by substituting its costly floating-point arithmetic with bit-wise operations. Nevertheless, state-of-the-art approaches reduce the efficiency of the data flow in the BNN layers by introducing intermediate conversions from 1 to 16/32 bits. We propose a novel training scheme, denoted as BNN-Clip, that can increase the parallelism and data flow of the BNN pipeline; specifically, we introduce a clipping block that reduces the data width from 32 bits to 8 bits. Furthermore, we decrease the internal accumulator size of a binary layer, usually kept at 32 bits to prevent data overflow, with no accuracy loss. Moreover, we propose an optimization of the batch normalization layer that reduces latency and simplifies deployment. Finally, we present an optimized implementation of the binary direct convolution for ARM NEON instruction sets. Our experiments show a consistent inference latency speed-up (up to 1.3× and 2.4× compared to two state-of-the-art BNN frameworks) while reaching an accuracy comparable with state-of-the-art approaches on datasets like CIFAR-10, SVHN, and ImageNet. Full article
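The clipping block at the heart of BNN-Clip narrows the 32-bit popcount accumulations to 8 bits between layers. A minimal numpy illustration of that requantization step follows; the saturation range shown is simply the natural int8 range and may differ from the scheme used in the paper.

```python
import numpy as np

def clip_to_int8(acc32: np.ndarray) -> np.ndarray:
    """Narrow 32-bit popcount accumulations to signed 8-bit by saturation.
    The exact clipping range used by BNN-Clip may differ; [-128, 127] is just
    the natural int8 range, shown here for illustration."""
    return np.clip(acc32, -128, 127).astype(np.int8)

acc = np.array([-300, -5, 0, 90, 260], dtype=np.int32)   # toy accumulator values
print(clip_to_int8(acc))                                  # -> [-128   -5    0   90  127]
# Downstream batch normalization and the next sign() binarization then operate
# on 8-bit data, which is what lets the pipeline keep more values per vector
# register and avoid 32-bit intermediate conversions.
```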

21 pages, 1906 KB  
Article
BinVPR: Binary Neural Networks towards Real-Valued for Visual Place Recognition
by Junshuai Wang, Junyu Han, Ruifang Dong and Jiangming Kan
Sensors 2024, 24(13), 4130; https://doi.org/10.3390/s24134130 - 25 Jun 2024
Viewed by 1905
Abstract
Visual Place Recognition (VPR) aims to determine whether a robot or visual navigation system is located in a previously visited place using visual information. It is an essential technology and a challenging problem in the computer vision and robotics communities. Recently, numerous works have demonstrated that the performance of Convolutional Neural Network (CNN)-based VPR is superior to that of traditional methods. However, these CNN models have a huge number of parameters and require large memory storage, which is a great challenge for mobile robot platforms with limited resources. Fortunately, Binary Neural Networks (BNNs) can reduce memory consumption by converting weights and activation values from 32 bits to 1 bit. However, current BNNs often suffer from vanishing gradients and a marked drop in accuracy. Therefore, this work proposes a BinVPR model to handle these issues. The solution is twofold. Firstly, a feature restoration strategy was explored that adds features into the latter convolutional layers to further mitigate the gradient-vanishing problem during training. From this, we identified two principles for addressing gradient vanishing: restore basic features, and restore them from higher to lower layers. Secondly, considering that the marked drop in accuracy results from gradient mismatch during backpropagation, this work optimized the combination of binarized activation and binarized weight functions in the Larq framework, and the best combination was obtained. The performance of BinVPR was validated on public datasets. The experimental results show that it outperforms state-of-the-art BNN-based approaches and the full-precision networks AlexNet and ResNet in terms of both recognition accuracy and model size. It is worth mentioning that BinVPR achieves the same accuracy with only 1% and 4.6% of the model sizes of AlexNet and ResNet, respectively. Full article
(This article belongs to the Section Navigation and Positioning)
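The combination search mentioned above operates on Larq's quantizer options. Assuming Larq's Keras-style API (larq.layers.QuantConv2D with input_quantizer and kernel_quantizer arguments), the sketch below builds one possible pairing of binarized activation and weight functions; it is an example configuration, not necessarily the combination the authors found best.

```python
import larq as lq
import tensorflow as tf

def binary_block(filters):
    """One binarized conv block: 'approx_sign' activations with 'ste_sign' weights
    is shown only as an example pairing; BinVPR searches over such combinations."""
    return tf.keras.Sequential([
        lq.layers.QuantConv2D(
            filters, kernel_size=3, padding="same", use_bias=False,
            input_quantizer="approx_sign",      # binarized activation function
            kernel_quantizer="ste_sign",        # binarized weight function
            kernel_constraint="weight_clip",
        ),
        tf.keras.layers.BatchNormalization(),
    ])

inputs = tf.keras.Input(shape=(224, 224, 3))
x = binary_block(64)(inputs)
x = binary_block(128)(x)
x = tf.keras.layers.GlobalAveragePooling2D()(x)
outputs = tf.keras.layers.Dense(256)(x)        # descriptor head for place recognition
model = tf.keras.Model(inputs, outputs)
lq.models.summary(model)
```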

19 pages, 4470 KB  
Article
Deep Learning and Neural Architecture Search for Optimizing Binary Neural Network Image Super Resolution
by Yuanxin Su, Li-minn Ang, Kah Phooi Seng and Jeremy Smith
Biomimetics 2024, 9(6), 369; https://doi.org/10.3390/biomimetics9060369 - 18 Jun 2024
Cited by 1 | Viewed by 1839
Abstract
The evolution of super-resolution (SR) technology has seen significant advancements through the adoption of deep learning methods. However, the deployment of such models on resource-constrained devices necessitates models that not only perform efficiently but also conserve computational resources. Binary neural networks (BNNs) offer a promising solution by minimizing the data precision to binary levels, thus reducing the computational complexity and memory requirements. However, for BNNs, an effective architecture is essential due to their inherent limitations in representing information. Designing such architectures traditionally requires extensive computational resources and time. With the advancement of neural architecture search (NAS), differentiable NAS has emerged as an attractive solution for efficiently crafting network structures. In this paper, we introduce a novel and efficient binary network search method tailored for image super-resolution tasks. We adapt the search space specifically for super-resolution to ensure it is optimally suited to the requirements of such tasks. Furthermore, we incorporate Libra Parameter Binarization (Libra-PB) to maximize information retention during forward propagation. Our experimental results demonstrate that the network structures generated by our method require only a third of the parameters of conventional methods, yet deliver comparable performance. Full article
(This article belongs to the Special Issue New Insights into Bio-Inspired Neural Networks)
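Libra Parameter Binarization, cited above for information retention, balances and standardizes the real-valued weights before taking their sign. The snippet gives a deliberately simplified version of that idea (the original method's power-of-two scaling is reduced to a plain L1-based scalar here); consult the IR-Net paper for the exact formulation.

```python
import numpy as np

def libra_pb_binarize(w: np.ndarray) -> np.ndarray:
    """Simplified sketch of the Libra-PB idea: balance (zero-mean) and standardize
    the real-valued weights before the sign operation, so the resulting +1/-1
    codes carry more information. The scaling below is an assumed simplification,
    not the exact scheme of the original method."""
    w_bal = w - w.mean()                        # balance: zero-mean weights
    w_std = w_bal / (w_bal.std() + 1e-12)       # standardize
    scale = np.abs(w_std).mean()                # simplified layer-wise scale
    return np.sign(w_std) * scale

w = np.random.default_rng(0).normal(loc=0.3, scale=0.8, size=(3, 3, 16, 32))
print(libra_pb_binarize(w).reshape(-1)[:5])
```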

16 pages, 3611 KB  
Article
A Novel CNFET SRAM-Based Compute-In-Memory for BNN Considering Chirality and Nanotubes
by Youngbae Kim, Nader Alnatsheh, Nandakishor Yadav, Jaeik Cho, Heeyoung Jo and Kyuwon Ken Choi
Electronics 2024, 13(11), 2192; https://doi.org/10.3390/electronics13112192 - 4 Jun 2024
Cited by 3 | Viewed by 1739
Abstract
As AI models grow in complexity to enhance accuracy, supporting hardware encounters challenges such as heightened power consumption and diminished processing speed due to high throughput demands. Compute-in-memory (CIM) technology emerges as a promising solution, and carbon nanotube field-effect transistors (CNFETs) show significant potential for bolstering CIM technology. Despite advancements in silicon semiconductor technology, CNFETs remain formidable competitors, offering advantages in reliability, performance, and power efficiency; this is particularly pertinent given the ongoing challenges posed by the reduction in silicon feature size. We propose an ultra-low-power architecture leveraging CNFETs for Binary Neural Networks (BNNs), featuring a state-of-the-art 8T SRAM bit cell and CNFET model to optimize performance in intricate AI computations. We fine-tune the CNFET model by adjusting tube counts and chiral vectors, and by optimizing SRAM transistor ratios and nanotube diameters. SPICE simulation in a 32 nm CNFET technology facilitates the determination of optimal transistor ratios and chiral vectors across various nanotube diameters under a 0.9 V supply voltage. Comparative analysis with conventional FinFET-based CIM structures underscores the superior performance of our CNFET SRAM-based CIM design, boasting a 99% reduction in power consumption and a 91.2% decrease in delay compared to state-of-the-art designs. Full article
(This article belongs to the Section Microelectronics)

17 pages, 2074 KB  
Article
CBin-NN: An Inference Engine for Binarized Neural Networks
by Fouad Sakr, Riccardo Berta, Joseph Doyle, Alessio Capello, Ali Dabbous, Luca Lazzaroni and Francesco Bellotti
Electronics 2024, 13(9), 1624; https://doi.org/10.3390/electronics13091624 - 24 Apr 2024
Cited by 15 | Viewed by 2700
Abstract
Binarization is an extreme quantization technique that is attracting research in the Internet of Things (IoT) field, as it radically reduces the memory footprint of deep neural networks without a correspondingly significant accuracy drop. To support the effective deployment of Binarized Neural Networks (BNNs), we propose CBin-NN, a library of layer operators that allows the building of simple yet flexible convolutional neural networks (CNNs) with binary weights and activations. CBin-NN is platform-independent and is thus portable to virtually any software-programmable device. Experimental analysis on the CIFAR-10 dataset shows that our library, compared to a set of state-of-the-art inference engines, speeds up inference by 3.6 times and reduces the memory required to store model weights and activations by 7.5 times and 28 times, respectively, at the cost of slightly lower accuracy (2.5%). An ablation study stresses the importance of a Quantized Input Quantized Kernel Convolution layer to improve accuracy and reduce latency at the cost of a slight increase in model size. Full article
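The memory reductions reported for CBin-NN ultimately come from storing weights and activations at one bit each. The numpy sketch below shows the 32× shrink from float32 to bit-packed storage that motivates such numbers; CBin-NN's actual storage layout and operators are its own.

```python
import numpy as np

rng = np.random.default_rng(0)
weights = rng.choice([-1.0, 1.0], size=(128, 128)).astype(np.float32)

packed = np.packbits((weights > 0).astype(np.uint8))     # 1 bit per weight
print("float32 weights:", weights.nbytes, "bytes")        # 65536 bytes
print("bit-packed     :", packed.nbytes, "bytes")         # 2048 bytes (32x smaller)

# Unpacking recovers the original +/-1 values, so inference kernels can work
# directly on the packed form with XNOR/popcount and only unpack for debugging.
restored = np.unpackbits(packed).reshape(weights.shape).astype(np.float32) * 2 - 1
assert np.array_equal(restored, weights)
```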
