Adaptive Energy–Accuracy Trade-Offs in Configurable MAC Architectures for AI Acceleration
Abstract
1. Introduction
2. Related Work
2.1. Problem Formulation
Algorithm 1: Energy-Aware MAC Configuration Algorithm (EAMCA)
2.2. Motivation and Contributions
- A novel configurable MAC (CM) architecture that supports run-time selection of approximation levels through internal bit-level logic compression, enabling dynamic energy–accuracy trade-offs without datapath duplication;
- Integration of the proposed configurable MAC into both Artificial Neural Network (ANN) and Convolutional Neural Network (CNN) architectures, with comprehensive evaluation across multiple datasets to quantify accuracy, energy consumption, and performance trade-offs;
- End-to-end validation of the proposed energy-adaptive neuron architecture through ASIC synthesis and FPGA implementation, demonstrating practical feasibility, scalability, and effectiveness beyond standalone arithmetic units;
- An Energy-Aware MAC Configuration Algorithm (EAMCA) that dynamically selects the optimal MAC operating mode based on instantaneous power or energy availability at run time.
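The mode-selection idea behind EAMCA can be sketched as a small budget-driven policy. The following is a minimal illustrative model, not the paper's actual pseudocode: the power figures are taken from the ASIC results table later in the article, while the function name and selection rule are assumptions for illustration.

```python
# Illustrative sketch of energy-aware mode selection (not the paper's
# exact EAMCA listing): choose the most accurate MAC mode whose power
# draw fits the instantaneous power budget, else fall back to LP-Mode.

MODES = [           # (mode name, power in uW), most accurate first
    ("HP", 70.2),
    ("MP", 43.6),
    ("LP", 29.9),
]

def select_mode(available_power_uw: float) -> str:
    """Return the most accurate mode affordable under the given budget;
    default to the lowest-power mode when even LP exceeds the budget."""
    for name, power in MODES:
        if power <= available_power_uw:
            return name
    return MODES[-1][0]  # LP-Mode as last resort
```

Because the modes share one datapath (no duplication), switching reduces to writing a mode register, so a policy this simple can run every decision interval with negligible overhead.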
3. Methodology
3.1. Energy Model and Definition
3.2. Numeric Representation and Training Configuration
4. Proposed Architectures and Implementations
4.1. Bit-Level Logic Compression in Configurable Multipliers
Run-Time Mode Switching and Transient Behavior
4.2. Configurable Neuron Architecture
- Internal precision adaptation. The neuron supports three internal operating modes: HP-Mode, MP-Mode, and LP-Mode. These modes determine the extent to which low-significance partial products are generated and reduced inside the multiplier. Higher-significance logic is always preserved to protect the dominant numerical contribution of the multiplication, while progressively lower-significance logic is suppressed as the operating mode shifts from HP to LP. The CLA consistently receives a full-width product and performs exact accumulation in all modes. As a result, approximation affects only the multiplication stage, whereas the accumulation behavior remains deterministic and mode-independent.
- Interface stability and run-time reconfiguration. Because operand formatting and datapath width remain unchanged, transitions between operating modes do not require datapath reconfiguration, operand rescaling, or retraining of the neural network. This internalized adaptation enables seamless run-time switching between modes without disrupting network-level operation or control flow.
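The significance-driven suppression described above can be illustrated with a functional (non-RTL) model: partial-product bits whose weight falls below a mode-dependent threshold are simply not generated, while all higher-significance bits are kept. The per-mode truncation widths and the function name here are illustrative assumptions, not the paper's exact parameters.

```python
# Functional model of significance-aware partial-product suppression
# (behavioral sketch only, not the proposed RTL). A partial-product bit
# at position i + j is dropped when its significance falls below the
# mode's threshold; HP generates the exact product.

TRUNC_BITS = {"HP": 0, "MP": 4, "LP": 8}  # hypothetical per-mode widths

def approx_multiply(a: int, b: int, mode: str, width: int = 8) -> int:
    """Shift-and-add multiply that skips partial-product bits whose
    significance (i + j) is below the mode's truncation threshold."""
    t = TRUNC_BITS[mode]
    result = 0
    for i in range(width):            # bit positions of operand a
        if not (a >> i) & 1:
            continue
        for j in range(width):        # bit positions of operand b
            if (b >> j) & 1 and (i + j) >= t:
                result += 1 << (i + j)
    return result
```

Note that the output width is identical in every mode, which is what makes mode transitions invisible at the interface: only the low-order bits of the product may deviate from the exact value.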
4.3. Trade-Offs for Power, Delay, and Area in ASIC Designs
4.3.1. Normalized ASIC Comparison
4.3.2. Implementation of ANNs
- Baseline: Reduced-Complexity Networks (Model Simplification).
4.3.3. Discussion of ASIC Results
4.3.4. Implementation and Reproducibility Details
4.3.5. Technology Scaling and Leakage Considerations
4.4. Area Reduction and Inference Accuracy in FPGA Implementations
5. Energy-Aware MAC Configuration Algorithm (EAMCA)
5.1. Control Overhead Analysis
5.2. Decision Granularity and Switching Behavior
6. Comparative Analysis of Configurable Approximate MAC Designs
7. Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Sekanina, L. Introduction to Approximate Computing: Embedded Tutorial. In Proceedings of the IEEE 19th International Symposium on Design and Diagnostics of Electronic Circuits & Systems (DDECS), Kosice, Slovakia; IEEE: Piscataway, NJ, USA, 2016; pp. 1–6. [Google Scholar]
- Mahdiani, H.; Ahmadi, A.; Fakhraie, S.M.; Lucas, C. Bio-Inspired Imprecise Computational Blocks for Efficient Digital Signal Processing. IEEE Trans. Circuits Syst. I Regul. Pap. 2010, 57, 850–862. [Google Scholar] [CrossRef]
- Shafik, R.; Yakovlev, A.; Das, S. Real-Power Computing. IEEE Trans. Comput. 2018, 67, 1445–1461. [Google Scholar] [CrossRef]
- Wang, E.; Davis, J.J.; Zhao, R.; Ng, H.-C.; Niu, X.; Luk, W.; Cheung, P.Y.K.; Constantinides, G.A. Deep Neural Network Approximation for Custom Hardware: Where We’ve Been, Where We’re Going. ACM Comput. Surv. 2019, 52, 40. [Google Scholar] [CrossRef]
- Masadeh, M.; Hasan, O.; Tahar, S. Input-Conscious Approximate Multiply–Accumulate (MAC) Unit for Energy Efficiency. IEEE Access 2019, 7, 147129–147142. [Google Scholar] [CrossRef]
- Pinos, M.; Mrazek, V.; Vaverka, F.; Vasicek, Z.; Sekanina, L. Acceleration Techniques for Automated Design of Approximate Convolutional Neural Networks. IEEE J. Emerg. Sel. Top. Circuits Syst. 2023, 13, 212–224. [Google Scholar] [CrossRef]
- Leon, V.; Paparouni, T.; Petrongonas, E.; Soudris, D.; Pekmestzi, K. Improving Power of DSP and CNN Hardware Accelerators Using Approximate Floating-Point Multipliers. ACM Trans. Embed. Comput. Syst. 2021, 20, 1–21. [Google Scholar] [CrossRef]
- Yin, P.; Wang, C.; Waris, H.; Liu, W.; Han, Y.; Lombardi, F. Design and Analysis of Energy-Efficient Dynamic Range Approximate Logarithmic Multipliers for Machine Learning. IEEE Trans. Sustain. Comput. 2020, 6, 612–625. [Google Scholar] [CrossRef]
- Mileiko, S.; Bunnam, T.; Xia, F.; Shafik, R.; Yakovlev, A.; Das, S. Neural Network Design for Energy-Autonomous Artificial Intelligence Applications Using Temporal Encoding. Philos. Trans. R. Soc. A 2020, 378, 20190166. [Google Scholar] [CrossRef] [PubMed]
- Zhang, T.; Niu, Z.; Han, J. A Brief Review of Logarithmic Multiplier Designs. In IEEE Latin American Test Symposium (LATS); IEEE: Piscataway, NJ, USA, 2022; pp. 1–4. [Google Scholar]
- Patel, S.K.; Singhal, S.K. An Area–Delay Efficient Single-Precision Floating-Point Multiplier for VLSI Systems. Microprocess. Microsyst. 2023, 98, 104798. [Google Scholar]
- Masadeh, M.; Hasan, O.; Tahar, S. Machine-Learning-Based Self-Tunable Design of Approximate Computing. IEEE Trans. Very Large Scale Integr. (VLSI) Syst. 2021, 29, 800–813. [Google Scholar] [CrossRef]
- Nakhaee, F.; Kamal, M.; Afzali-Kusha, A.; Pedram, M.; Fakhraie, S.M.; Dorosti, H. Lifetime Improvement by Exploiting Aggressive Voltage Scaling During Runtime of Error-Resilient Applications. Integration 2018, 61, 29–38. [Google Scholar] [CrossRef]
- Venkatachalam, S.; Adams, E.; Lee, H.J.; Ko, S.-B. Design and Analysis of Area and Power Efficient Approximate Booth Multipliers. IEEE Trans. Comput. 2019, 68, 1697–1703. [Google Scholar] [CrossRef]
- Strollo, A.G.M.; De Caro, D.; Napoli, E.; Petra, N.; Di Meo, G. Low-Power Approximate Multiplier with Error Recovery Using a New Approximate 4–2 Compressor. In IEEE International Symposium on Circuits and Systems (ISCAS); IEEE: Piscataway, NJ, USA, 2020; pp. 1–4. [Google Scholar]
- Burke, D.; Jenkus, D.; Qiqieh, I.; Shafik, R.; Das, S.; Yakovlev, A. Significance-Driven Adaptive Approximate Computing for Energy-Efficient Image Processing Applications. In International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS); IEEE: Piscataway, NJ, USA, 2017; pp. 1–2. [Google Scholar]
- Qaseem, Q. Investigation of Magnetic Bistability for a Wider Bandwidth in Vibro-Impact Triboelectric Energy Harvesters. Master’s Thesis, University of Texas at Tyler, Tyler, TX, USA, 2023. [Google Scholar]
- Qiqieh, I.; Shafik, R.; Tarawneh, G.; Sokolov, D.; Yakovlev, A. Energy-Efficient Approximate Multiplier Design Using Bit Significance-Driven Logic Compression. In Design, Automation & Test in Europe; IEEE: Piscataway, NJ, USA, 2017; pp. 7–12. [Google Scholar]
- Al-Maaitah, K.; Qiqieh, I.; Soltan, A.; Yakovlev, A. Configurable-Accuracy Approximate Adder Design with Lightweight Fast Convergence Error Recovery Circuit. In IEEE Jordan Conference on Applied Electrical Engineering and Computing Technologies (AEECT); IEEE: Piscataway, NJ, USA, 2017; pp. 1–6. [Google Scholar]
- Balasubramanian, P.; Mastorakis, N. Performance Comparison of Carry-Lookahead and Carry-Select Adders Based on Accurate and Approximate Additions. Electronics 2018, 7, 369. [Google Scholar] [CrossRef]
- Ellaithy, D.M.; El-Moursy, M.A.; Zaki, A.; Zekry, A. Dual-Channel Multiplier for Piecewise-Polynomial Function Evaluation for Low-Power 3-D Graphics. IEEE Trans. Very Large Scale Integr. (VLSI) Syst. 2019, 27, 790–798. [Google Scholar] [CrossRef]
- Ghabeli, H.; Molahosseini, A.S.; Zarandi, A.A.E. New Multiply–Accumulate Circuits Based on Variable Latency Speculative Architectures with Asynchronous Data Paths. Majlesi J. Electr. Eng. 2022, 16, 41–53. [Google Scholar]
- LeCun, Y.; Bottou, L.; Bengio, Y.; Haffner, P. Gradient-Based Learning Applied to Document Recognition. Proc. IEEE 1998, 86, 2278–2324. [Google Scholar] [CrossRef]
- Mrazek, V.; Sarwar, S.S.; Sekanina, L.; Vasicek, Z.; Roy, K. Design of Power-Efficient Approximate Multipliers for Approximate Artificial Neural Networks. In Proceedings of the IEEE/ACM International Conference on Computer-Aided Design (ICCAD); IEEE: Piscataway, NJ, USA, 2016; pp. 1–7. [Google Scholar]
- Ullah, S.; Rehman, S.; Shafique, M.; Kumar, A. High-Performance Accurate and Approximate Multipliers for FPGA-Based Hardware Accelerators. IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst. 2021, 41, 211–224. [Google Scholar] [CrossRef]
- Ultra96-V2 Development Board. Avnet Inc., Phoenix, AZ, USA, 2021. Available online: https://www.avnet.com/americas/products/avnet-boards/avnet-board-families/ultra96-v2/ (accessed on 13 December 2025).
- Paszke, A.; Gross, S.; Massa, F.; Lerer, A.; Bradbury, J.; Chanan, G.; Killeen, T.; Lin, Z.; Gimelshein, N.; Antiga, L.; et al. PyTorch: An Imperative Style, High-Performance Deep Learning Library. In Proceedings of the 33rd Conference on Neural Information Processing Systems (NeurIPS 2019), Vancouver, BC, Canada, 8–14 December 2019; pp. 8026–8037. [Google Scholar]
- Mitcheson, P.D.; Yeatman, E.M.; Rao, G.K.; Holmes, A.S.; Green, T.C. Energy Harvesting from Human and Machine Motion for Wireless Electronic Devices. Proc. IEEE 2008, 96, 1457–1486. [Google Scholar] [CrossRef]
- Vakili, S.; Vaziri, M.; Zarei, A.; Langlois, J.-P. DyReCMul: Fast and Low-Cost Approximate Multiplier for FPGAs Using Dynamic Reconfiguration. ACM Trans. Reconfigurable Technol. Syst. 2024, 18, 1–22. [Google Scholar] [CrossRef]
- Ma, J.; Reda, S. RUCA: Runtime Configurable Approximate Circuits with Self-Correcting Capability. In Asia and South Pacific Design Automation Conference (ASP-DAC); ACM: Kowloon, Hong Kong, China, 2023; pp. 140–145. [Google Scholar]
- Luo, H.; Cho, Y.; Demmel, J.W.; Kozachenko, I.; Li, X.S.; Liu, Y. Non-Smooth Bayesian Optimization in Tuning Scientific Applications. Int. J. High Perform. Comput. Appl. 2024, 38, 633–657. [Google Scholar] [CrossRef]
| MAC Mode | Area (μm²) | Delay (ns) | Power (μW) | PDP (fJ) |
|---|---|---|---|---|
| HP-Mode | 1644.4 | 2.66 | 70.2 | 186.7 |
| MP-Mode | 1098.6 | 2.88 | 43.6 | 125.6 |
| LP-Mode | 695.4 | 1.54 | 29.9 | 46.0 |
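The PDP column above follows directly from the Power and Delay columns, since μW × ns = fJ. A quick numerical check against the tabulated values:

```python
# Sanity check on the ASIC results table: PDP (fJ) = Power (uW) * Delay (ns).
# Row values are taken verbatim from the table; tolerance absorbs rounding.
rows = {
    "HP-Mode": (70.2, 2.66, 186.7),
    "MP-Mode": (43.6, 2.88, 125.6),
    "LP-Mode": (29.9, 1.54, 46.0),
}
for mode, (power_uw, delay_ns, pdp_fj) in rows.items():
    assert abs(power_uw * delay_ns - pdp_fj) < 0.2, mode
```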
| Dataset | Network Topology | Epochs | Activation |
|---|---|---|---|
| Noisy XOR | 2–H–1 | 20 | Sigmoid |
| Binary IRIS | 4–H–1 | 100 | Sigmoid |
| UCI Breast Cancer | 9–H–1 | 100 | Sigmoid |
| MNIST | 784–H–10 | 100 | ReLU (hidden)/Softmax (output) |
| Dataset | Mode | (%) | (%) | (%) |
|---|---|---|---|---|
| Noisy XOR | MP→HP Hybrid | −34 | −38 | −50 |
| Noisy XOR | MP-Mode | −37 | −41 | −52 |
| Noisy XOR | LP-Mode | −65 | −61 | −81 |
| Binary IRIS | MP→HP Hybrid | −36 | −36 | −48 |
| Binary IRIS | MP-Mode | −37 | −38 | −51 |
| Binary IRIS | LP-Mode | −65 | −61 | −80 |
| MNIST | MP→HP Hybrid | −36 | −37 | −49 |
| MNIST | MP-Mode | −37 | −38 | −52 |
| MNIST | LP-Mode | −65 | −60 | −80 |
| UCI Breast Cancer | MP→HP Hybrid | −35 | −37 | −50 |
| UCI Breast Cancer | MP-Mode | −38 | −40 | −53 |
| UCI Breast Cancer | LP-Mode | −66 | −62 | −81 |
| MAC (Dataset: MNIST) | Area Reduction (%) | Inference Accuracy (%) |
|---|---|---|
| Proposed HP-Mode | - | 98.2 |
| [25] | 4.02 | 64.1 |
| Proposed LP-Mode | 45.2 | 91.0 |
| Work | Platform | Approximation Mechanism | Run-Time Configurable | Energy-Aware Control | Interface Stability |
|---|---|---|---|---|---|
| Mrazek et al. [24] | ASIC | Truncation-based multiplier | No | No | No |
| Ullah et al. [25] | FPGA | Approximate adders/partial-product pruning | No | No | No |
| DyReCMul [29] | FPGA | LUT-based reconfigurable multiplier | Yes | No | Yes |
| RUCA [30] | ASIC | Quality-controlled circuit selection | Yes | No | Partial |
| AMG [31] | ASIC | Optimized approximate arithmetic generation | Limited | No | Partial |
| This work | ASIC/FPGA | Significance-aware internal logic suppression | Yes | Yes (EAMCA) | Yes |
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Share and Cite
Alnuayri, T.; Haddadi, I. Adaptive Energy–Accuracy Trade-Offs in Configurable MAC Architectures for AI Acceleration. Electronics 2026, 15, 1129. https://doi.org/10.3390/electronics15051129