A Q-Learning-Based Hierarchical Power Delivery Architecture for the Efficient Management of Heterogeneous Loads
Abstract
1. Introduction
2. Related Work
- Real-time optimization framework: We introduce a properly tailored Q-learning algorithm that adapts the system in real time to any combination of heterogeneous loads, determining an operating point that yields higher end-to-end power efficiency than current state-of-the-art methods.
- Adaptive hierarchical PDU support: By effectively coordinating disparate DC–DC converters and power gating, the proposed method achieves significantly higher total efficiency across different supply voltages compared to single-converter systems.
- Dynamic load adaptation: Our online training methodology allows the system to maintain peak performance under changing load conditions, such as the addition of new modules or fluctuations in the transmitter power of radio modules.
- Low-overhead implementation for ULP systems: By combining a lookup table with efficiency estimation, the methodology eliminates the need for complex, redundant measurement circuitry and keeps the computational runtime below 60 ms, an overhead suitable for battery-powered IoT nodes.
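
The table-based estimation mentioned in the contributions can be illustrated with a minimal sketch. It assumes that each regulator is an LDO whose input current equals its load current plus a pre-characterized quiescent current, so that efficiency follows from measured load currents and table data alone. The regulator names, voltages, and quiescent currents below are hypothetical placeholders, not values taken from this work.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LdoEntry:
    """One row of a hypothetical pre-characterized regulator table."""
    v_out: float  # regulated output voltage (V)
    i_q: float    # quiescent current (A)


# Hypothetical entries; real values would come from bench characterization.
LDO_TABLE = {
    "ldo_5v0": LdoEntry(v_out=5.0, i_q=0.003),
    "ldo_3v3": LdoEntry(v_out=3.3, i_q=0.001),
}


def ldo_efficiency(name: str, v_in: float, i_load: float) -> float:
    """Estimate LDO efficiency as P_out / P_in using only table data.

    Assumes the LDO input current is i_load + i_q.
    """
    entry = LDO_TABLE[name]
    if i_load <= 0.0:
        return 0.0  # no useful output power is delivered
    p_out = entry.v_out * i_load
    p_in = v_in * (i_load + entry.i_q)
    return p_out / p_in
```

Under these assumptions, only the load current has to be measured at run time; the input-side power is inferred from the table rather than sensed.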
3. Background and Theoretical Formulation
3.1. Q-Algorithm
- Hypothesis: A model-free reinforcement learning (RL) agent can autonomously learn to maximize the end-to-end efficiency of a multi-level PDU without requiring prior knowledge of converter topologies or load profiles;
- Input parameters: The system state space is defined by (1) real-time load current demands, (2) available voltage domains, and (3) battery status;
- Expected outcomes: An optimal connectivity policy that dynamically delivers power through the most efficient combination of converters, thereby maximizing the total system efficiency compared to static or heuristic control baselines.
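
As a rough illustration of the model-free agent described above, the following sketch shows a standard tabular Q-learning update with epsilon-greedy exploration. The integer state and action indices, the learning rate `alpha`, the discount factor `gamma`, and the exploration rate `epsilon` are illustrative assumptions, as is the idea of feeding an efficiency estimate in as the reward. This is not the paper's Algorithm 1.

```python
import random


class QAgent:
    """Minimal tabular Q-learning agent (illustrative, not the authors' code)."""

    def __init__(self, n_states, n_actions, alpha=0.1, gamma=0.9,
                 epsilon=0.1, seed=0):
        self.q = [[0.0] * n_actions for _ in range(n_states)]
        self.alpha = alpha      # learning rate (assumed value)
        self.gamma = gamma      # discount factor (assumed value)
        self.epsilon = epsilon  # exploration probability (assumed value)
        self.rng = random.Random(seed)

    def choose(self, state):
        """Epsilon-greedy: explore with probability epsilon, else exploit."""
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.q[state]))
        row = self.q[state]
        return row.index(max(row))

    def update(self, state, action, reward, next_state):
        """Apply Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
        target = reward + self.gamma * max(self.q[next_state])
        self.q[state][action] += self.alpha * (target - self.q[state][action])
```

In such a setup, the reward passed to `update` could be the estimated end-to-end efficiency of the configuration reached after the action, so that higher-efficiency power paths accumulate larger Q-values.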
3.2. Power Efficiency Decomposition
- The power efficiency of the available LDOs;
- The maximum total current that must be supplied by the system.
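
For reference, the textbook efficiency of a single LDO and of a series chain of cascaded stages can be written as follows. The symbols $V_{in}$, $V_{out}$, $I_{L}$, $I_{Q}$, and $\eta_{k}$ are notation assumed here for illustration and may differ from the notation used in this work. The product form applies to a single series path, where the output power of each stage is the input power of the next; it does not by itself describe a branching tree.

```latex
\eta_{\mathrm{LDO}} = \frac{P_{out}}{P_{in}}
  = \frac{V_{out}\, I_{L}}{V_{in}\,\left(I_{L} + I_{Q}\right)},
\qquad
\eta_{\mathrm{chain}} = \frac{P_{out,N}}{P_{in,1}} = \prod_{k=1}^{N} \eta_{k}
```

The first expression shows why light loads are costly: as $I_{L}$ approaches $I_{Q}$, the quiescent term dominates the denominator and the efficiency falls well below the ratio $V_{out}/V_{in}$.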
3.3. Power Efficiency of Hierarchical PMUs
3.4. Approximation Error Analysis
3.5. Practical Advantages of the Method
4. Power Delivery Method
4.1. Stability of the PDU
- The characteristics of each LDO comprising the PDU;
- The load conditions;
- The output capacitance of each LDO.
4.2. Q-Algorithm-Based Power Delivery
| Algorithm 1: Q-algorithm for optimum PM policy |
5. Experimental Results
- State: Physically represents the total power path of tree T, as determined by the enabled regulators and the selection bits of the analog multiplexers (AMUXs).
- Action: Corresponds to the digital control signals sent by the Q-agent (e.g., Arduino Due) to the PMOS transistors and AMUXs. These signals physically reconfigure the circuit topology by enabling or disabling a specific regulator, or by connecting a load to or disconnecting it from a regulator.
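
One possible way to represent such states and actions in software is sketched below. The bit layout (one enable bit per regulator, followed by a fixed-width AMUX select field per load) and all names are hypothetical; the source does not specify how the Q-agent encodes them.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class PduConfig:
    """Hypothetical PDU configuration: regulator enables and AMUX selects."""
    reg_enabled: Tuple[bool, ...]  # one flag per regulator (PMOS gate)
    amux_select: Tuple[int, ...]   # per load: index of the feeding regulator

    def to_state(self, select_bits: int = 2) -> int:
        """Pack the configuration into one integer usable as a Q-table index."""
        state = 0
        for i, on in enumerate(self.reg_enabled):
            state |= int(on) << i
        offset = len(self.reg_enabled)
        for j, sel in enumerate(self.amux_select):
            if not 0 <= sel < (1 << select_bits):
                raise ValueError("AMUX select does not fit in select_bits")
            state |= sel << (offset + j * select_bits)
        return state


def toggle_regulator(cfg: PduConfig, index: int) -> PduConfig:
    """Action sketch: enable or disable one regulator (power gating)."""
    flags = list(cfg.reg_enabled)
    flags[index] = not flags[index]
    return PduConfig(tuple(flags), cfg.amux_select)
```

For example, `PduConfig((True, False, True), (2, 0)).to_state()` packs three enable bits and two 2-bit select fields into a single index, and `toggle_regulator` returns a new configuration rather than mutating the old one, which keeps states hashable and easy to compare.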
5.1. Homogeneous Loads Case Study
5.2. Three-Level Hierarchy PDU
6. Conclusions
- Low overhead: By using a derived formula for power efficiency that relies solely on measured load currents and a pre-characterized regulator lookup table with quiescent current data, the system eliminates the need for continuous sensor polling. This approach significantly reduces the computational cost, cutting runtime from 208 ms (sensor polling) to 56 ms (table-based) in high-density-load scenarios. Such low overhead is well suited to portable systems.
- Stability: The integration of power gating with a bounded number of regulators ensures the stability of the parallel LDO configuration while preserving the transient response required for fast wake-up modes.
- Scalability: The model-free Q-agent can be deployed on different PDU architectures across applications, supporting diverse heterogeneous loads, from sensors to communication modules.
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Prasad, A.; Chawda, P. Power Management Factors and Techniques for IoT Design Devices. In Proceedings of the International Symposium on Quality Electronic Design, Santa Clara, CA, USA, 13–14 March 2018; pp. 364–369.
- Wei, K.; Ma, D.B. A 10-MHz DAB Hysteretic Control Switching Power Converter for 5G IoT Power Delivery. IEEE J. Solid-State Circuits 2021, 56, 2113–2122.
- Ababneh, M.M.; Ugweje, O.; Jaesim, A. Optimized Power Management Unit for IoT Applications. In Proceedings of the International Conference on Electronics, Computer and Computation, Abuja, Nigeria, 10–12 December 2019; pp. 1–4.
- László-Zsolt, T.; Géza, C.; Csenteri, B. Power Management In IoT Weather Station. In Proceedings of the International Conference and Exposition on Electrical And Power Engineering, Iasi, Romania, 18–19 October 2018; pp. 133–138.
- Carreon-Bautista, S.; Huang, L.; Sanchez-Sinencio, E. An Autonomous Energy Harvesting Power Management Unit With Digital Regulation for IoT Applications. IEEE J. Solid-State Circuits 2016, 51, 1457–1474.
- Triki, M.; Wang, Y.; Ammari, A.; Pedram, M. Hierarchical Power Management of a System with Autonomously Power-Managed Components Using Reinforcement Learning. Integration 2015, 48, 10–20.
- Kose, S.; Friedman, E.G. Distributed On-Chip Power Delivery. IEEE J. Emerg. Sel. Top. Circuits Syst. 2012, 2, 704–713.
- Jiang, H.; Marek-Sadowska, M.; Nassif, S. Benefits and Costs of Power-Gating Technique. In Proceedings of the International Conference on Computer Design, San Jose, CA, USA, 2–5 October 2005; pp. 559–566.
- Hattori, T.; Irita, T.; Ito, M.; Yamamoto, E.; Kato, H.; Sado, G.; Yamada, T.; Nishiyama, K.; Yagi, H.; Koike, T.; et al. Hierarchical Power Distribution and Power Management Scheme for a Single Chip Mobile Processor. In Proceedings of the ACM/IEEE Design Automation Conference, San Francisco, CA, USA, 24–28 July 2006; pp. 292–295.
- Brown, J.K.; Abdallah, D.; Boley, J.; Collins, N.; Craig, K.; Glennon, G.; Huang, K.K.; Lukas, C.J.; Moore, W.; Sawyer, R.K.; et al. A 65 nm Energy-Harvesting ULP SoC with 256 kB Cortex-M0 Enabling an 89.1 µW Continuous Machine Health Monitoring Wireless Self-Powered System. In Proceedings of the IEEE International Solid-State Circuits Conference, San Francisco, CA, USA, 16–20 February 2020; pp. 420–422.
- Kim, S.; Vaidya, V.; Schaef, C.; Lines, A.; Krishnamurthy, H.; Weng, S.; Liu, X.; Kurian, D.; Karnik, T.; Ravichandran, K.; et al. A Single-Stage, Single-Inductor, 6-Input 9-Output Multi-Modal Energy Harvesting Power Management IC for 100 µW–120 mW Battery-Powered IoT Edge Nodes. In Proceedings of the IEEE Symposium on VLSI Circuits, Honolulu, HI, USA, 18–22 June 2018; pp. 195–196.
- Benini, L.; De Micheli, G. Dynamic Power Management: Design Techniques and CAD Tools; Springer: Berlin/Heidelberg, Germany, 1998.
- Paleologo, G.; Benini, L.; Bogliolo, A.; De Micheli, G. Policy optimization for dynamic power management. In Proceedings of the Design and Automation Conference, 35th DAC, San Francisco, CA, USA, 15–19 June 1998; pp. 182–187.
- Benini, L.; Bogliolo, A.; Paleologo, G.; De Micheli, G. Policy optimization for dynamic power management. IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst. 1999, 18, 813–833.
- Ishihara, T.; Yasuura, H. Voltage scheduling problem for dynamically variable voltage processors. In Proceedings of the International Symposium on Low Power Electronics and Design, Monterey, CA, USA, 10–12 August 1998; pp. 197–202.
- Simunic, T.; Benini, L.; Acquaviva, A.; Glynn, P.; de Micheli, G. Dynamic voltage scaling and power management for portable systems. In Proceedings of the Design Automation Conference, Las Vegas, NV, USA, 22 June 2001; pp. 524–529.
- Kim, C.; Roy, K. Dynamic V_TH scaling scheme for active leakage power reduction. In Proceedings of the Design, Automation and Test in Europe Conference and Exhibition, Paris, France, 4–8 March 2002; pp. 163–167.
- Qiu, Q.; Pedram, M. Dynamic Power Management Based on Continuous-Time Markov Decision Processes. In Proceedings of the Design Automation Conference, New Orleans, LA, USA, 21–25 June 1999; pp. 555–561.
- Dhiman, G.; Rosing, T.S. Dynamic Power Management Using Machine Learning. In Proceedings of the IEEE/ACM International Conference on Computer Aided Design, San Jose, CA, USA, 5–9 November 2006; pp. 747–754.
- Yue, S.; Zhu, D.; Wang, Y.; Pedram, M. Reinforcement Learning Based Dynamic Power Management with a Hybrid Power Supply. In Proceedings of the IEEE International Conference on Computer Design, Montreal, QC, Canada, 30 September–3 October 2012; pp. 81–86.
- Liu, W.; Tan, Y.; Qiu, Q. Enhanced Q-learning algorithm for Dynamic Power Management with Performance Constraint. In Proceedings of the Design, Automation Test in Europe Conference Exhibition, Dresden, Germany, 8–12 March 2010; pp. 602–605.
- Debizet, Y.; Lallement, G.; Abouzeid, F.; Roche, P.; Autran, J.L. Q-learning-based Adaptive Power Management for IoT System-on-Chips with Embedded Power States. In Proceedings of the IEEE International Symposium on Circuits and Systems, Florence, Italy, 27–30 May 2018; pp. 1–5.
- Gupta, U.; Mandal, S.K.; Mao, M.; Chakrabarti, C.; Ogras, U.Y. A Deep Q-Learning Approach for Dynamic Management of Heterogeneous Processors. IEEE Comput. Archit. Lett. 2019, 18, 14–17.
- Li, H.; Tian, Z.; Xu, J.; Maeda, R.K.V.; Wang, Z.; Wang, Z. Chip-Specific Power Delivery and Consumption Co-Management for Process-Variation-Aware Manycore Systems Using Reinforcement Learning. IEEE Trans. Very Large Scale Integr. (VLSI) Syst. 2020, 28, 1150–1163.
- Vaisband, I.; Friedman, E.G. Energy Efficient Adaptive Clustering of On-Chip Power Delivery Systems. Integr. VLSI J. 2015, 48, 1–9.
- Tan, Y.; Liu, W.; Qiu, Q. Adaptive Power Management Using Reinforcement Learning. In Proceedings of the IEEE/ACM International Conference on Computer-Aided Design, San Jose, CA, USA, 2–5 November 2009; pp. 461–467.
- Sutton, R.S.; Barto, A.G. Reinforcement Learning; M-Press: Kansas City, MI, USA, 2014; Chapter 3; pp. 79–80.
- Chen, Z.; Marculescu, D. Distributed reinforcement learning for power limited many-core system performance optimization. In Proceedings of the 2015 Design, Automation Test in Europe Conference Exhibition (DATE), Grenoble, France, 9–13 March 2015; pp. 1521–1526.
- Ge, Y.; Qiu, Q. Dynamic thermal management for multimedia applications using machine learning. In Proceedings of the ACM/EDAC/IEEE Design Automation Conference, San Diego, CA, USA, 5–9 June 2011; pp. 95–100.
- ul Islam, F.M.M.; Lin, M. Hybrid DVFS Scheduling for Real-Time Systems Based on Reinforcement Learning. IEEE Syst. J. 2017, 11, 931–940.
- Chen, Y.M.; Chen, C.J. An Event-Driven Self-Clocked Digital Low-Dropout Regulator with Adaptive Frequency Control. Energies 2023, 16, 4749.
- Wang, D.; Gao, N.; Liu, D.; Li, J.; Lewis, F.L. Recent Progress in Reinforcement Learning and Adaptive Dynamic Programming for Advanced Control Applications. IEEE/CAA J. Autom. Sin. 2024, 11, 18–36.
- Xie, D.; Li, H. Deep Reinforcement Learning Based Collaborative Energy Management for Base Station and Microgrid. In Proceedings of the International Conference on Electronic Information Engineering and Computer Communication (EIECC), Wuhan, China, 27–29 December 2024; pp. 160–163.
- Wan, Z.; Huang, Y.; Wu, L.; Liu, C. ADPA Optimization for Real-Time Energy Management Using Deep Learning. Energies 2024, 17, 4821.
- Chen, T.; Dai, Z.; Shan, X.; Li, Z.; Hu, C.; Xue, Y.; Xu, K. Reactive Power Optimization Method of Power Network Based on Deep Reinforcement Learning Considering Topology Characteristics. Energies 2024, 17, 6454.
- Lee, Y.; Bang, S.; Lee, I.; Kim, Y.; Kim, G.; Ghaed, M.H.; Pannuto, P.; Dutta, P.; Sylvester, D.; Blaauw, D. A Modular 1 mm³ Die-Stacked Sensing Platform With Low Power I2C Inter-Die Communication and Multi-Modal Energy Harvesting. IEEE J. Solid-State Circuits 2013, 48, 229–243.
- Urgaonkar, R.; Kozat, U.C.; Igarashi, K.; Neely, M.J. Dynamic resource allocation and power management in virtualized data centers. In Proceedings of the IEEE Network Operations and Management Symposium, Osaka, Japan, 19–23 April 2010; pp. 479–486.
- Kwon, E.; Han, S.; Park, Y.; Kim, Y.H.; Kang, S. Late Breaking Results: Reinforcement Learning-based Power Management Policy for Mobile Device Systems. In Proceedings of the 2020 57th ACM/IEEE Design Automation Conference (DAC), San Francisco, CA, USA, 20–24 July 2020; pp. 1–2.
- Kwon, E.; Han, S.; Park, Y.; Yoon, J.; Kang, S. Reinforcement Learning-Based Power Management Policy for Mobile Device Systems. IEEE Trans. Circuits Syst. I Regul. Pap. 2021, 68, 4156–4169.
- Giardino, M.; Schwyn, D.; Ferri, B.; Ferri, A. Low-Overhead Reinforcement Learning-Based Power Management Using 2QoSM. J. Low Power Electron. Appl. 2022, 12, 29.
- Watkins, C.J.C.H.; Dayan, P. Q-learning. Mach. Learn. 1992, 8, 279–292.
- Oh, C.H.; Nakashima, T.; Ishibuchi, H. Initialization of Q-values by Fuzzy Rules for Accelerating Q-learning. In Proceedings of the IEEE International Joint Conference on Neural Networks, Anchorage, AK, USA, 4–9 May 1998; Volume 3, pp. 2051–2056.
- Song, Y.; Li, Y.B.; Li, C.H.; Zhang, G.F. An Efficient Initialization Approach of Q-learning for Mobile Robots. Int. J. Control. Autom. Syst. 2012, 10, 166–172.
- Mandal, S.K.; Bhat, G.; Doppa, J.R.; Pande, P.P.; Ogras, U.Y. An Energy-Aware Online Learning Framework for Resource Management in Heterogeneous Platforms. ACM Trans. Des. Autom. Electron. Syst. 2020, 25, 28.
- Ciprut, A.; Friedman, E.G. Stability of On-Chip Power Delivery Systems with Multiple Low-Dropout Regulators. IEEE Trans. Very Large Scale Integr. (VLSI) Syst. 2019, 27, 1779–1789.
- Chang, T.S.; Ramiah, H.; Jiang, Y.; Lim, C.C.; Lai, N.S.; Mak, P.I.; Martins, R.P. Design and Implementation of Hybrid DC-DC Converter: A Review. IEEE Access 2023, 11, 30498–30514.
- Abrar Akram, M.; Ali Wahla, I.; Kim, K.S.; Hwang, I.C. A Four-Phase Digital Buck Converter With MDLL-Based Adaptive Switching Frequency Compensation. IEEE Access 2024, 12, 180404–180414.
- TPS659037 Datasheet. Available online: https://www.ti.com/lit/ds/symlink/tps659037.pdf (accessed on 14 November 2025).
- Tsiougkos, A.; Pavlidis, V.F. A PWM-free DC-DC Boost Converter with 0.43 V Input for Extended Battery Use in IoT Applications. In Proceedings of the IEEE International Midwest Symposium on Circuits and Systems, Lansing, MI, USA, 9–11 August 2021; pp. 479–483.
- MAX17710 Datasheet. Available online: https://datasheets.maximintegrated.com/en/ds/MAX17710.pdf (accessed on 12 November 2025).
- SPV1050 Datasheet. Available online: https://www.st.com/resource/en/datasheet/spv1050.pdf (accessed on 17 November 2025).
- Chen, C.; Sun, M.; Wang, L.; Huang, T.; Xu, M. A Fast Transient Response Capacitor-Less LDO with Transient Enhancement Technology. Micromachines 2024, 15, 299.
- Zachos, N.; Gogolou, V.; Noulis, T. A Fully Integrated 1.8 V Low-Power LDO Regulator with Dynamic Transient Control for SoC Applications. Electronics 2024, 13, 4734.
- Arévalos, D.; Marin, J.; Herman, K.; Gomez, J.; Wallentowitz, S.; Rojas, C.A. A Topology-Independent and Scalable Methodology for Automated LDO Design Using Open PDKs. Electronics 2025, 14, 3448.
- Okamura, L.; Morishita, F.; Arimoto, K.; Yoshihara, T. High efficiency Autonomous Controlled Cascaded LDOs for Green Battery system. In Proceedings of the IEEE International Conference on ASIC, Changsha, China, 20–23 October 2009; pp. 336–339.
- Oh, T.J.; Hwang, I.C. A 110-nm CMOS 0.7-V Input Transient-Enhanced Digital Low-Dropout Regulator with 99.98% Current Efficiency at 80-mA Load. IEEE Trans. Very Large Scale Integr. (VLSI) Syst. 2015, 23, 1281–1286.
- Rincon-Mora, G.A. Current Efficient, Low Voltage, Low Drop-Out Regulators. Ph.D. Thesis, Department Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta, GA, USA, 1996.
- Ciprut, A.; Friedman, E.G. On the Stability of Distributed On-Chip Low Dropout Regulators. In Proceedings of the IEEE International Midwest Symposium on Circuits and Systems, Boston, MA, USA, 6–9 August 2017; pp. 217–220.
- ESP8266EX Datasheet. Available online: https://www.espressif.com/sites/default/files/documentation/0a-esp8266ex_datasheet_en.pdf (accessed on 17 November 2025).
- nRF52832 Product Specification. Available online: https://infocenter.nordicsemi.com/pdf/nRF52832_PS_v1.4.pdf (accessed on 14 November 2025).
- Rincón-Mora, G.A. Analog IC Design with Low-Dropout Regulators; McGraw-Hill: New York, NY, USA, 2009.
- Tijsma, A.D.; Drugan, M.M.; Wiering, M.A. Comparing Exploration Strategies for Q-learning in Random Stochastic Mazes. In Proceedings of the IEEE Symposium Series on Computational Intelligence, Athens, Greece, 6–9 December 2016; pp. 1–8.
- Chen, J.; Yin, M.; Duan, X.; Jiao, B. Q-Learning Based Selection Strategies for Load Balance and Energy Balance in Heterogeneous Networks. In Proceedings of the International Conference on Computer and Communication Systems, Shanghai, China, 15–18 May 2020; pp. 728–732.
- INA226 36V, 16-Bit, Ultra-Precise I2C Output Current, Voltage, and Power Monitor with Alert. Available online: https://www.ti.com/lit/ds/symlink/ina226.pdf?ts=1768878771120&ref_url=https%253A%252F%252Fwww.google.com%252F (accessed on 19 September 2025).
- L78L Series: Positive Voltage Regulators. Available online: https://www.tme.eu/Document/b4d1ce27007259e9e680165d5f11d754/l78l.pdf (accessed on 19 September 2025).
- LM2937 500mA Low Dropout Regulator. Available online: https://www.ti.com/lit/ds/symlink/lm2937.pdf (accessed on 19 September 2025).
- L78 Series: Positive Voltage Regulators. Available online: https://eu.mouser.com/datasheet/2/389/l78-1849632.pdf (accessed on 19 September 2025).
- L78S Series: 2A Positive Voltage Regulators. Available online: https://gr.mouser.com/datasheet/2/389/l78s-1849452.pdf (accessed on 19 September 2025).
- LD33V (LD1117 Series): Low Drop Fixed and Adjustable Positive Voltage Regulators. Available online: http://www.st.com/st-web-ui/static/active/en/resource/technical/document/datasheet/CD00000544.pdf (accessed on 19 September 2025).
- LM3940: 1-A Low Dropout Regulator for 5V to 3.3V Conversion. Available online: https://www.ti.com/lit/ds/symlink/lm3940.pdf (accessed on 19 September 2025).
- Huang, C.H.; Ma, Y.T.; Liao, W.C. Design of a Low-Voltage Low-Dropout Regulator. IEEE Trans. Very Large Scale Integr. (VLSI) Syst. 2014, 22, 1308–1313.
- Cheah, M.; Mandal, D.; Bakkaloglu, B.; Kiaei, S. A 100-mA, 99.11% Current Efficiency, 2-mVpp Ripple Digitally Controlled LDO With Active Ripple Suppression. IEEE Trans. Very Large Scale Integr. (VLSI) Syst. 2017, 25, 696–704.
- Laleni, N.; Tsiougkos, A.; Pavlidis, V. Hybrid Capacitor-less LDO with Switched-Mode Dead-Zone Control. In Proceedings of the International Conference on Synthesis, Modeling, Analysis and Simulation Methods, and Applications to Circuit Design, Online, 19–22 July 2021.
- Zarate-Roldan, J.; Wang, M.; Torres, J.; Sánchez-Sinencio, E. A Capacitor-Less LDO With High-Frequency PSR Suitable for a Wide Range of On-Chip Capacitive Loads. IEEE Trans. Very Large Scale Integr. (VLSI) Syst. 2016, 24, 2970–2982.
- Luria, K.; Shor, J.; Zelikson, M.; Lyakhov, A. 8.7 Dual-Use Low-Drop-Out Regulator/Power Gate with Linear and On-Off Conduction Modes for Microprocessor on-Die Supply Voltages in 14nm. In Proceedings of the IEEE International Solid-State Circuits Conference, San Francisco, CA, USA, 22–26 February 2015; pp. 1–3.
- Liu, X.; Krishnamurthy, H.K.; Barrera, C.P.; Han, J.; Bhatla, R.M.N.; Chiu, S.; Ahmed, Z.K.; Ravichandran, K.; Tschanz, J.W.; De, V. A Dual-Rail Hybrid Analog/Digital LDO with Dynamic Current Steering for Tunable High PSRR and High Efficiency. In Proceedings of the IEEE Symposium on VLSI Circuits, Honolulu, HI, USA, 16–19 June 2020; pp. 1–2.

| Term | Description |
|---|---|
| Cascaded | Describes systems with components connected in series, thus forming a chain. The input port of the first component and the output port of the last component are the input and output ports of the complete system, respectively. |
| Multi-level PDU | Describes a hierarchical PDU with multiple levels, including different cascaded DC–DC converters. |
| System | Refers to the multi-level PDU, along with the loads connected to the PDU and the control block, which implements the Q-algorithm, as shown in Figure 1. |
| | The power efficiency for every level or load connected to the system shown in Figure 1. |
| (Total) end-to-end power efficiency | The overall power efficiency for L loads connected to the system. |
| Quiescent current | The current drawn by an LDO when no load is connected. |
| Homogeneous loads | Refers to loads that exhibit similar power profiles over time and whose peak currents are close to each other. |
| Heterogeneous loads | Describes loads that draw considerably different currents. |
| Autonomous PDU | Refers to a PDU that operates without user intervention by utilizing a predefined power management scheme in order to achieve high end-to-end power efficiency [5]. |
| Power gating | Describes a technique that disables idle converters in order to reduce the overall power consumed by a PDU [8]. |
| Resonance frequency | Describes the frequency at which the input impedance of an electrical system is minimum or, equivalently, the amplitude of the output signal is maximum [45]. |
| | The aggregate current drawn by the specific set of heterogeneous IoT peripherals (DHT11, Servo, and LoRa). Loads include sensors (DHT11, PIR, and HC-SR04), actuators (DC Motor and Servo), and communication modules (BLE, WiFi, and LoRa) representing diverse power profiles. These loads have been selected to represent extreme dynamic range variance (from 1 μA sleep currents to 360 mA motor stall currents), creating the non-linear disturbance the Q-learning agent must manage. |

| Scenario | Application | DHT11 | PIR | HC-SR04 | DC Motor | SG90 Servo | HM10 BLE | ESP8266 WiFi | LoRa SX1278 |
|---|---|---|---|---|---|---|---|---|---|
| Min (mA) | | 0.1 | 0.105 | 2.5 | 10 | 15 | 0.004 | 0.01 | 0.001 |
| Max (mA) | | 0.5 | 65 | 15 | 256 | 360 | 106 | 170 | 87 |
| A | Automotive: Parking assistance and pedestrian detection | ✓ | ✓ | ✓ | ✓ | ||||
| B | Smart farming: Environmental monitoring and auto-watering | ✓ | ✓ | ✓ | |||||
| C | Home automation: Person detection, lights | ✓ | ✓ | ✓ | ✓ | ||||
| D | PV sun tracking: WiFi/LoRa for efficiency | ✓ | ✓ | ✓ | |||||
| E | Autonomous vehicle: Self-exploration/RC control | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| F | Window shutter control over Wi-Fi for temperature | ✓ | ✓ | ✓ |

| | 2019 [3] | 2016 [5] | TPS659037 [48] | This Work |
|---|---|---|---|---|
| Input Voltage | N/A | 250 mV–1.1 V | 1.75 V–5.25 V | 5.5 V |
| Output Voltage | 4 V | 1.8 V–2 V | 0.9 V–3.3 V | 1.2–5 V |
| Maximum Load Current | 2.5 mA | >1 mA | 1 A | >600 mA |
| Heterogeneous Loads | No | Yes | No | Yes |
| Power Efficiency * @ 66 μA (%) | 77% | 26% | 23% | 83% |
| Maximum Power Efficiency | 95% | 57% | >90% | >80% |
| Autonomous | No | Yes | No | Yes |

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Tsiougkos, A.; Amanatiadou, G.; Pavlidis, V.F. A Q-Learning-Based Hierarchical Power Delivery Architecture for the Efficient Management of Heterogeneous Loads. J. Low Power Electron. Appl. 2026, 16, 6. https://doi.org/10.3390/jlpea16010006