Emerging Computing Paradigms for Efficient Edge AI Acceleration

A special issue of Electronics (ISSN 2079-9292). This special issue belongs to the section "Circuit and Signal Processing".

Deadline for manuscript submissions: 15 June 2026 | Viewed by 3168

Special Issue Editors


Dr. Nikos Temenos
Guest Editor
1. School of Electrical and Computer Engineering, National Technical University of Athens, 15780 Athens, Greece
2. Institute of Communication and Computer Systems (ICCS), 15773 Zografou, Greece
Interests: circuits and systems; VLSI design; reconfigurable computing; AI applications; stochastic computing; arithmetic circuits; edge AI

Dr. Vasileios Ntinas
Guest Editor
Department of Electronic Systems, Aalborg University, 2450 Copenhagen, Denmark
Interests: circuits and systems; emerging technologies; memristors; nonlinear dynamical circuits; stochastic resonance; in-memory computing; unconventional computing; cellular automata; cellular neural networks

Special Issue Information

Dear Colleagues,

The continuous advancements in artificial intelligence (AI) algorithms, particularly deep learning models, have made their integration into modern applications essential. Many of these applications must operate “at the edge” and process data in real time to avoid the latency and bandwidth penalties of exchanging data with centralized servers. Computing at the edge, however, calls for hardware-efficient realizations of AI algorithms: devices must be compact, energy-efficient, and massively parallel, without sacrificing computational accuracy. With traditional computing and processing techniques pushing devices to their limits, new and emerging computing paradigms are being explored to balance the trade-off between computational accuracy and hardware efficiency. Driven by the requirements of resource-constrained devices, this Special Issue aims to advance innovative circuits, architectures, systems, and signal processing techniques that accelerate AI applications, with emphasis on approaches beyond conventional computing.

In this Special Issue, original research articles and reviews are welcome. Topics of interest include, but are not limited to, the theory, design, modeling, and application of the following:

  • Analog computing;
  • Approximate computing;
  • Hybrid computing techniques;
  • Hyperdimensional computing;
  • In-memory and near-memory computing;
  • Memristor-based devices and computing;
  • Neuromorphic computing;
  • Pruning techniques;
  • Quantization techniques;
  • Reservoir computing;
  • Stochastic computing;
  • Unconventional computing.

Dr. Nikos Temenos
Dr. Vasileios Ntinas
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the Special Issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 250 words) can be sent to the Editorial Office for assessment.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Electronics is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2400 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • hardware efficiency
  • in-memory computing
  • near-memory computing
  • pruning
  • quantization
  • unconventional computing
  • AI accelerators
  • circuits and systems
  • edge AI

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • Reprint: MDPI Books provides the opportunity to republish successful Special Issues in book format, both online and in print.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (4 papers)

Research

14 pages, 726 KB  
Article
PBBQ: Plug-In Balanced Binary Quantization for LLMs
by Zhangming Li, Weifan Guan, Zhengwei Chang, Linghao Zhang and Qinghao Hu
Electronics 2026, 15(4), 819; https://doi.org/10.3390/electronics15040819 - 13 Feb 2026
Abstract
In recent years, the expansion of large-model parameters has substantially increased storage and inference overhead. Consequently, post-training quantization has become a key technique for reducing model size and inference-time energy consumption. However, we observe that, under extremely low bit-width settings, mainstream error-compensation-based algorithms tend to overfit the calibration data. To mitigate this issue, we propose Plug-in Balanced Binary Quantization for LLMs (PBBQ), which reduces the excessive emphasis on subsequent channels via block-wise dropout and layer-wise reordering. PBBQ can be integrated into GPTQ-style frameworks and ultra-low-bit methods such as BiLLM and ARB-LLM. Experimental results show that PBBQ significantly improves the performance of multiple error-compensation quantization algorithms. When combined with the state-of-the-art methods BiLLM and ARB-LLM, the perplexity (ppl) on WikiText-2 is reduced by 21.46% (from 32.48 to 25.51) and 22.02% (from 16.44 to 12.82), respectively.
(This article belongs to the Special Issue Emerging Computing Paradigms for Efficient Edge AI Acceleration)
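As background for binary post-training quantization, the short Python sketch below shows only the basic per-channel binarization step (W ≈ α·sign(W)) that ultra-low-bit methods such as BiLLM build on; it is an illustrative toy under our own assumptions and does not implement PBBQ's block-wise dropout or layer-wise reordering.

```python
import numpy as np

def binarize_channelwise(W: np.ndarray):
    """Per-output-channel 1-bit weight quantization, W ~= alpha * sign(W).

    Only the generic binarization step that binary PTQ methods build on;
    PBBQ's error compensation, dropout, and reordering are not modelled.
    """
    # The L2-optimal per-channel scale for alpha * sign(W) is mean(|W|).
    alpha = np.abs(W).mean(axis=1, keepdims=True)   # shape: (out_channels, 1)
    B = np.where(W >= 0.0, 1.0, -1.0)               # 1-bit codes in {-1, +1}
    return alpha, B

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    W = rng.normal(size=(4, 8))
    alpha, B = binarize_channelwise(W)
    print("mean squared quantization error:", np.mean((W - alpha * B) ** 2))
```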
22 pages, 4772 KB  
Article
INVCAM: An Inverted Compressor-Based Approximate Multiplier
by Kimia Darabi, Sahand Divsalar, Shaghayegh Vahdat, Nima Amirafshar and Nima TaheriNejad
Electronics 2026, 15(1), 216; https://doi.org/10.3390/electronics15010216 - 2 Jan 2026
Cited by 1 | Viewed by 366
Abstract
In this paper, a novel 8-bit approximate multiplier, called INVCAM, is proposed in which the inverted partial products (PPs) are summed using approximate 4:2 compressors. This design allows for flexibility in applying approximations, enabling the multiplier to be tuned to the specific accuracy requirements of different applications. By adjusting the number of approximated bits, the multiplier can operate with a better balance between desirable hardware characteristics and acceptable levels of error. Our approach ensures that INVCAM is customizable for a wide range of applications. The results indicate that INVCAM reduces delay, power, and area by up to 21.5%, 70.0%, and 57.6%, respectively, compared to the state-of-the-art (SoTA) approximate multipliers within its mean relative error distance (MRED) range, and by 42.4%, 80.1%, and 68.0%, respectively, compared to an exact multiplier. The efficacy of INVCAM is evaluated in image processing and deep neural network (DNN) applications. The images processed by different configurations of INVCAM have PSNR and SSIM values greater than 28.9 dB and 0.81, respectively, which demonstrates the acceptable quality of the processed approximate images. In the DNN application, the classification accuracy of the models implemented using INVCAM(7) is within 0.6% of the original model accuracy. When the number of approximate bits is increased to nine, less than 5% accuracy reduction is observed compared to an exact model, while the power-delay-area product of the multiplier improves by 46%.
(This article belongs to the Special Issue Emerging Computing Paradigms for Efficient Edge AI Acceleration)
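As background for the compressor-based design above, the sketch below contrasts an exact 4:2 compressor (two cascaded full adders) with a simple generic approximation that drops the carry-in/carry-out path; the approximate variant is a hypothetical illustration, not the inverted-partial-product compressor proposed in the paper.

```python
from itertools import product

def exact_4to2(x1, x2, x3, x4, cin):
    """Exact 4:2 compressor (two cascaded full adders):
    x1 + x2 + x3 + x4 + cin == s + 2 * (carry + cout)."""
    s1 = x1 ^ x2 ^ x3
    cout = (x1 & x2) | (x1 & x3) | (x2 & x3)     # majority of x1, x2, x3
    s = s1 ^ x4 ^ cin
    carry = (s1 & x4) | (s1 & cin) | (x4 & cin)  # majority of s1, x4, cin
    return s, carry, cout

def approx_4to2(x1, x2, x3, x4):
    """A generic approximate 4:2 compressor: carry-in/out dropped, cheaper
    sum/carry logic. Illustrative only, not the INVCAM compressor."""
    s = (x1 ^ x2) | (x3 ^ x4)
    c = (x1 & x2) | (x3 & x4)
    return s, c

if __name__ == "__main__":
    # Error distribution of the approximation over all 16 input patterns.
    errors = {}
    for x1, x2, x3, x4 in product((0, 1), repeat=4):
        s, c = approx_4to2(x1, x2, x3, x4)
        err = (s + 2 * c) - (x1 + x2 + x3 + x4)
        errors[err] = errors.get(err, 0) + 1
    print(errors)
```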

22 pages, 3753 KB  
Article
A High-Precision Hybrid Floating-Point Compute-in-Memory Architecture for Complex Deep Learning
by Zizhao Ma, Chunshan Wang, Qi Chen, Yifan Wang and Yufeng Xie
Electronics 2025, 14(22), 4414; https://doi.org/10.3390/electronics14224414 - 13 Nov 2025
Viewed by 1393
Abstract
As artificial intelligence (AI) advances, deep learning models are shifting from convolutional architectures to transformer-based structures, highlighting the importance of accurate floating-point (FP) calculations. Compute-in-memory (CIM) enhances matrix multiplication performance by breaking away from the von Neumann architecture. However, many FP-CIM designs struggle to maintain high precision while achieving efficiency. This work proposes a high-precision hybrid floating-point compute-in-memory (Hy-FPCIM) architecture for the Vision Transformer (ViT) through post-alignment with two different CIM macros: the Bit-wise Exponent Macro (BEM) and the Booth Mantissa Macro (BMM). The high-parallelism BEM efficiently implements exponent calculations in-memory with the Bit-Separated Exponent Summation Unit (BSESU) and the routing-efficient Bit-wise Max Finder (BMF). The high-precision BMM achieves nearly lossless mantissa computation in-memory with efficient Booth-4 encoding and the sense-amplifier-free Flying Mantissa Lookup Table based on a 12T triple-port SRAM. The proposed Hy-FPCIM architecture achieves 23.7 TFLOPS/W energy efficiency and 0.754 TFLOPS/mm² area efficiency, with 617 Kb/mm² memory density in 28 nm technology. With its almost lossless architecture, the proposed Hy-FPCIM achieves an accuracy of 81.04% in recognition tasks on the ImageNet dataset using ViT, representing a 0.03% decrease compared to the software baseline. This research presents significant advantages in both accuracy and energy efficiency, providing critical technology for complex deep learning applications.
(This article belongs to the Special Issue Emerging Computing Paradigms for Efficient Edge AI Acceleration)
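To make the post-alignment idea concrete, here is a behavioural Python model of a floating-point dot product in which exponents and mantissas are processed separately and products are shifted to the maximum exponent only at accumulation time. The decomposition, bit widths, and rounding here are our own simplifying assumptions; the sketch does not model the BEM/BMM macros or the SRAM circuitry.

```python
import numpy as np

def decompose(x, frac_bits=10):
    """Split each value into (exponent, signed fixed-point mantissa)."""
    m, e = np.frexp(x)                        # x = m * 2**e with 0.5 <= |m| < 1
    return e.astype(np.int64), np.round(m * (1 << frac_bits)).astype(np.int64)

def fp_dot_post_alignment(a, b, frac_bits=10):
    """Behavioural post-alignment dot product: exponents are added and reduced
    to a maximum, mantissas are multiplied as integers, and products are
    aligned to the common exponent only before the final accumulation."""
    a_exp, a_man = decompose(a, frac_bits)
    b_exp, b_man = decompose(b, frac_bits)
    prod_exp = a_exp + b_exp                  # exponent path: additions
    prod_man = a_man * b_man                  # mantissa path: integer multiplies
    e_max = prod_exp.max()                    # exponent path: max finder
    aligned = prod_man >> (e_max - prod_exp)  # post-alignment right shifts
    return float(aligned.sum()) * 2.0 ** (e_max - 2 * frac_bits)

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    a, b = rng.normal(size=16), rng.normal(size=16)
    print(fp_dot_post_alignment(a, b), float(a @ b))  # should agree closely
```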

12 pages, 7323 KB  
Article
WinEdge: Low-Power Winograd CNN Execution with Transposed MRAM for Edge Devices
by Milad Ashtari Gargari, Sepehr Tabrizchi and Arman Roohi
Electronics 2025, 14(12), 2485; https://doi.org/10.3390/electronics14122485 - 19 Jun 2025
Viewed by 883
Abstract
This paper presents a novel transposed MRAM architecture (WinEdge) specifically optimized for Winograd convolution acceleration in edge computing devices. Leveraging Magnetic Tunnel Junctions (MTJs) with Spin Hall Effect (SHE)-assisted Spin-Transfer Torque (STT) writing, the proposed design enables a single SHE current to simultaneously write data to four MTJs, substantially reducing power consumption. Additionally, the integration of stacked MTJs significantly improves storage density. The proposed WinEdge efficiently supports both standard and transposed data access modes regardless of bit-width, achieving up to 36% lower power, 47% reduced energy consumption, and 28% faster processing speed compared to existing designs. Simulations conducted in 45 nm CMOS technology validate its superiority over conventional SRAM-based solutions for convolutional neural network (CNN) acceleration in resource-constrained edge environments.
(This article belongs to the Special Issue Emerging Computing Paradigms for Efficient Edge AI Acceleration)
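For context, the arithmetic that such Winograd accelerators feed can be sketched in a few lines. The snippet below implements the standard F(2,3) Winograd algorithm (Lavin–Gray transform matrices), producing two 1-D convolution outputs of a 3-tap filter with four multiplications instead of six; it only illustrates the Winograd math and assumes nothing about the paper's MRAM/MTJ circuit design.

```python
import numpy as np

# Standard Winograd F(2,3) transforms: two convolution outputs from a 3-tap
# filter using 4 multiplications instead of 6.
BT = np.array([[1,  0, -1,  0],
               [0,  1,  1,  0],
               [0, -1,  1,  0],
               [0,  1,  0, -1]], dtype=float)
G  = np.array([[1.0,  0.0, 0.0],
               [0.5,  0.5, 0.5],
               [0.5, -0.5, 0.5],
               [0.0,  0.0, 1.0]])
AT = np.array([[1, 1,  1,  0],
               [0, 1, -1, -1]], dtype=float)

def winograd_f23(d, g):
    """y = AT @ ((G @ g) * (BT @ d)) for a 4-sample input tile d and filter g."""
    return AT @ ((G @ g) * (BT @ d))

if __name__ == "__main__":
    d = np.array([1.0, 2.0, 3.0, 4.0])        # input tile
    g = np.array([0.5, -1.0, 0.25])           # filter taps
    direct = np.array([d[0:3] @ g, d[1:4] @ g])
    print(winograd_f23(d, g), direct)         # the two results should match
```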
