This is an early access version; the complete PDF, HTML, and XML versions will be available soon.
Article

Hardware Accelerator Design by Using RT-Level Power Optimization Techniques on FPGA for Future AI Mobile Applications

by Achyuth Gundrapally *,†, Yatrik Ashish Shah *,†, Sai Manohar Vemuri and Kyuwon (Ken) Choi
DA-Lab, Department of Electrical and Computer Engineering, Illinois Institute of Technology, 3301 South Dearborn Street, Chicago, IL 60616, USA
* Authors to whom correspondence should be addressed.
† These authors contributed equally to this work.
Electronics 2025, 14(16), 3317; https://doi.org/10.3390/electronics14163317
Submission received: 8 July 2025 / Revised: 18 August 2025 / Accepted: 19 August 2025 / Published: 20 August 2025
(This article belongs to the Special Issue Hardware Acceleration for Machine Learning)

Abstract

In resource-constrained edge environments such as mobile devices, IoT systems, and electric vehicles, energy-efficient Convolutional Neural Network (CNN) accelerators on mobile Field Programmable Gate Arrays (FPGAs) are gaining significant attention for real-time object detection tasks. This paper presents a low-power implementation of the Tiny YOLOv4 object detection model on the Xilinx ZCU104 FPGA platform using Register Transfer Level (RTL) optimization techniques. We propose three RTL techniques: (i) Local Explicit Clock Enable (LECE), (ii) operand isolation, and (iii) Enhanced Clock Gating (ECG). We also propose a novel low-power design for the Multiply-Accumulate (MAC) operation, one of the main components of the AI algorithm, to eliminate redundant signal switching activity. The Tiny YOLOv4 model, trained on the COCO dataset, was quantized and compiled using the Tensil toolchain for fixed-point inference deployment. Post-implementation evaluation using Vivado 2022.2 demonstrates a reduction of approximately 29.4% in total on-chip power. Our design supports real-time detection throughput while maintaining high accuracy, making it well suited to deployment in battery-constrained environments such as drones, surveillance systems, and autonomous vehicles. These results highlight the effectiveness of RTL power optimization for scalable and sustainable edge AI deployment.
Keywords: BRAM; CNN accelerator; CNN architecture; FPGA RT level design; high performance; operand isolation; low-power techniques; MAC; object detection; power consumption
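
The abstract names operand isolation as one way to remove redundant switching in the MAC unit but does not show the RTL. As a rough behavioral illustration only (not the authors' implementation), the Python sketch below assumes that isolation is modeled by holding the multiplier inputs at their previous values whenever an enable signal is low, and uses the number of input bit flips as a crude proxy for switching activity. The names `hamming`, `run_mac`, and `stream`, as well as the operand values, are hypothetical.

```python
def hamming(a: int, b: int) -> int:
    """Number of bit positions that differ between two non-negative ints."""
    return bin(a ^ b).count("1")


def run_mac(samples, isolate: bool):
    """Behavioral model of a multiply-accumulate unit (illustrative only).

    samples: iterable of (a, b, enable) tuples, one per clock cycle.
    isolate: if True, the multiplier inputs hold their previous values
             whenever enable is 0 (assumed model of operand isolation);
             if False, the raw operands always reach the multiplier.
    Returns (accumulator, toggles), where toggles counts bit flips seen
    at the multiplier inputs.
    """
    acc = 0
    prev_a = prev_b = 0
    toggles = 0
    for a, b, enable in samples:
        if isolate and not enable:
            in_a, in_b = prev_a, prev_b   # inputs frozen: no new switching
        else:
            in_a, in_b = a, b             # operands pass through unchanged
        toggles += hamming(prev_a, in_a) + hamming(prev_b, in_b)
        prev_a, prev_b = in_a, in_b
        if enable:                        # accumulator updates only when enabled
            acc += in_a * in_b
    return acc, toggles


# Hypothetical operand stream: cycles 2 and 3 carry data the MAC should ignore.
stream = [(3, 5, 1), (7, 2, 0), (1, 6, 0), (4, 4, 1)]

acc_plain, tog_plain = run_mac(stream, isolate=False)
acc_iso, tog_iso = run_mac(stream, isolate=True)
print(acc_plain, tog_plain)
print(acc_iso, tog_iso)
```

In this toy model both runs accumulate the same result, while the isolated run sees fewer input bit flips because the ignored operands never reach the multiplier. Real power savings depend on the synthesized netlist and measured switching activity, which this model does not capture.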
