A Reinforcement Learning-Based Optimization Strategy for Noise Budget Management in Homomorphically Encrypted Deep Network Inference
Abstract
1. Introduction
- (1)
- To address the problem that existing HE schemes use fixed noise budget allocation strategies, which cannot adapt to the heterogeneous computational characteristics of different deep network layers, we propose a Deep Network-Aware Adaptive Noise-Budget Management (DN-ANM) mechanism. By constructing a layer-aware noise consumption prediction model, this mechanism achieves dynamic and precise allocation of the noise budget.
- (2)
- To solve the lack of global optimality and adaptability in traditional bootstrapping decision methods, we design a reinforcement learning-driven optimization algorithm for bootstrapping decisions. It uses a Deep Q-Network to learn the optimal timing for bootstrapping, minimizing the bootstrapping overhead while ensuring computational security.
- (3)
- Experimental validation on several typical deep networks demonstrates the effectiveness and scalability of the proposed mechanism. The results show that the mechanism can optimize the management strategy online according to the computational characteristics of the network. While guaranteeing security requirements, it significantly reduces the redundant overhead of bootstrapping operations compared to fixed strategies.
2. Related Work
2.1. Homomorphic Encryption in Privacy-Preserving Machine Learning
2.2. Noise Budget Management in HE Inference
3. Noise Model for HE Deep Networks
3.1. Problem Formulation
- Addition: Polynomial addition modulo q, performed coefficient-wise (or component-wise in the CRT representation).
- Multiplication: Negacyclic convolution modulo X^N + 1 and q, which is efficiently implemented using the NTT.
- KeyGen: Generates the public key, secret key, and evaluation key.
- Enc: Encrypts a plaintext vector m into a ciphertext ct, scaling m by the scaling factor Δ.
- Dec: Decrypts the ciphertext ct to obtain an approximate plaintext m′ ≈ m.
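The ring multiplication above can be made concrete with a toy example. The sketch below implements negacyclic convolution in Z_q[X]/(X^N + 1) directly in pure Python; the modulus Q and dimension N are illustrative toy values (far too small to be secure), and a real implementation would use the NTT for O(N log N) products as noted above.

```python
Q = 2**16 + 1   # toy ciphertext modulus (illustrative only, not secure)
N = 4           # toy ring dimension; practical schemes use N >= 2**13

def negacyclic_mul(a, b, q=Q, n=N):
    """Multiply a, b in Z_q[X]/(X^N + 1): wrap-around terms flip sign."""
    res = [0] * n
    for i in range(n):
        for j in range(n):
            k = i + j
            if k < n:
                res[k] = (res[k] + a[i] * b[j]) % q
            else:
                # X^N = -1 in this ring, so the coefficient wraps negated
                res[k - n] = (res[k - n] - a[i] * b[j]) % q
    return res

# X * X^3 = X^4 = -1 (mod X^4 + 1), i.e. constant coefficient Q - 1
print(negacyclic_mul([0, 1, 0, 0], [0, 0, 0, 1]))  # → [65536, 0, 0, 0]
```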
3.2. Deep Network-Aware Adaptive Noise-Budget Management (DN-ANM)
- Topological features: layer depth, the set of predecessor nodes, and the set of successor nodes.
- Operational features: the one-hot encoding of the operation type.
- Parameter features: weight dimension, weight norm, and multiplicative complexity.
- Dataflow features: input and output tensor shapes.
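A minimal sketch of how such a per-layer feature vector might be assembled. All field names and the operation set here are hypothetical stand-ins, not the paper's exact encoding:

```python
from dataclasses import dataclass
from typing import List, Tuple

OP_TYPES = ["conv", "linear", "square", "poly"]  # assumed operation set

@dataclass
class LayerNode:
    depth: int                 # topological: layer depth
    preds: List[int]           # topological: predecessor node ids
    succs: List[int]           # topological: successor node ids
    op_type: str               # operational: operation type
    weight_dim: int            # parameter: weight dimension
    weight_norm: float         # parameter: weight norm
    mult_depth: int            # parameter: multiplicative complexity
    in_shape: Tuple[int, ...]  # dataflow: input tensor shape
    out_shape: Tuple[int, ...] # dataflow: output tensor shape

def encode(node: LayerNode) -> List[float]:
    """Concatenate topological, operational, parameter, dataflow features."""
    one_hot = [1.0 if t == node.op_type else 0.0 for t in OP_TYPES]
    return (
        [float(node.depth), float(len(node.preds)), float(len(node.succs))]
        + one_hot
        + [float(node.weight_dim), node.weight_norm, float(node.mult_depth)]
        + [float(x) for x in node.in_shape + node.out_shape]
    )
```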
3.3. Layer-Aware Noise Prediction Model
3.3.1. Noise Modeling of Basic Homomorphic Operations
3.3.2. Noise Modeling of Neural Network Layer-Wise Operations
3.4. Cross-Layer Noise Propagation and Cumulative Prediction
4. Reinforcement Learning-Based Adaptive Bootstrapping Decision
4.1. MDP Problem Formulation
4.1.1. Multi-Dimensional State Space
4.1.2. Extended Action Space
- Continue: No bootstrapping; compute the next layer directly.
- Immediate Bootstrapping: Bootstrap first to reset the noise and distance state, then compute the next layer.
- Pre-scheduled Bootstrapping: Compute the next layer, then bootstrap immediately to prepare for subsequent layers.
- Formal transition compositions for these actions are provided in Equation (21).
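The effect of the three actions on the tracked noise can be sketched as follows. The function and its noise bookkeeping are illustrative stand-ins for the formal transition compositions, not a reproduction of Equation (21):

```python
from enum import IntEnum

class Action(IntEnum):
    CONTINUE = 0     # no bootstrapping; compute the next layer directly
    BOOT_NOW = 1     # immediate: bootstrap, then compute the layer
    BOOT_AFTER = 2   # pre-scheduled: compute the layer, then bootstrap

def step_noise(noise, layer_cost, action, fresh=0.0):
    """Return the tracked noise after applying `action` around one layer.

    `fresh` is the (assumed) residual noise left by bootstrapping.
    """
    if action == Action.BOOT_NOW:
        noise = fresh            # reset before computing the layer
    noise += layer_cost          # noise consumed by the layer itself
    if action == Action.BOOT_AFTER:
        noise = fresh            # reset after computing the layer
    return noise
```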
4.1.3. Multi-Objective Reward Function
4.2. Policy Optimization Based on Deep Q-Learning
4.2.1. Value Network Architecture
4.2.2. Temporal Difference Learning and Bellman Optimality
4.2.3. Experience Replay and Exploration Strategy
4.2.4. Integration with the DN-ANM Model
4.3. Algorithm Implementation
| Algorithm 1 DQN Training | |
| Require: Network G, Noise model, Episodes E | |
| Ensure: Trained Q-network parameters θ | |
| 1: Initialize Q-networks Q(·; θ) and Q(·; θ⁻), replay buffer D | |
| 2: ε ← ε₀ | |
| 3: for episode = 1, …, E do | |
| 4: s ← InitialState(G); done ← false | |
| 5: while not done do | |
| 6: if Random() < ε then | |
| 7: a ← UniformSample(A) | |
| 8: else | |
| 9: a ← argmax_{a′ ∈ A} Q(s, a′; θ) | |
| 10: end if | |
| 11: s′ ← Sim(s, a); n′ ← Noise(s′) | ▹ Sim is the DN-ANM simulation |
| 12: done ← Terminal(s′) ∨ (n′ > n_max) | ▹ Check environment termination conditions |
| 13: if n′ > n_max then | |
| 14: r ← −r_sec | ▹ Security penalty for a noise-budget violation |
| 15: else if Terminal(s′) then | |
| 16: r ← r_done | ▹ Completion reward |
| 17: else | |
| 18: r ← R(s, a, s′) | ▹ Per-step reward |
| 19: end if | |
| 20: D.Store(s, a, r, s′, done) | |
| 21: s ← s′ | |
| 22: if |D| ≥ B then | |
| 23: UpdateNetworks(θ, θ⁻, D) | ▹ Target-network updates with Huber loss |
| 24: end if | |
| 25: end while | |
| 26: ε ← max(ε_min, λ_ε · ε) | ▹ Decay ε |
| 27: end for | |
| 28: return θ | |
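As a dependency-free illustration of Algorithm 1's structure (epsilon-greedy exploration, a replay buffer, TD updates toward a periodically synced target, a security penalty for budget violations, and a completion reward), the sketch below substitutes a tabular Q-function for the paper's neural value network and a toy noise-budget chain for the DN-ANM simulation. All names and constants are illustrative, not the paper's settings:

```python
import random
from collections import deque

def train(episodes=200, n_layers=6, cost=0.3, budget=1.0, seed=0):
    rng = random.Random(seed)
    Q = {}                       # online value table: (state, action) -> Q
    target = {}                  # target table, synced once per episode
    buf = deque(maxlen=1000)     # replay buffer D
    eps, gamma, lr, batch = 1.0, 0.99, 0.1, 16

    def env_step(s, a):
        """One transition; a = 1 means bootstrap before the layer."""
        layer, noise = s
        if a == 1:
            noise = 0.0                      # bootstrapping resets the noise
        noise = round(noise + cost, 2)       # per-layer noise increment
        if noise > budget:
            return None, -1.0, True          # budget violated: security penalty
        r = -0.1 if a == 1 else 0.0          # bootstrapping cost penalty
        layer += 1
        done = layer == n_layers
        if done:
            r += 1.0                         # completion reward
        return (layer, noise), r, done

    for _ in range(episodes):
        s, done = (0, 0.0), False
        while not done:
            # epsilon-greedy over {0: continue, 1: bootstrap}
            if rng.random() < eps:
                a = rng.randrange(2)
            else:
                a = max((0, 1), key=lambda x: Q.get((s, x), 0.0))
            s2, r, done = env_step(s, a)
            buf.append((s, a, r, s2, done))
            # minibatch TD updates toward the frozen target table
            for st, at, rt, st2, dt in rng.sample(list(buf), min(batch, len(buf))):
                tgt = rt if dt else rt + gamma * max(
                    target.get((st2, x), 0.0) for x in (0, 1))
                Q[(st, at)] = Q.get((st, at), 0.0) + lr * (tgt - Q.get((st, at), 0.0))
            if not done:
                s = s2
        target = dict(Q)                     # sync target table
        eps = max(0.05, eps * 0.97)          # decay exploration rate
    return Q
```

With a per-layer cost of 0.3 and a budget of 1.0, the chain cannot finish six layers without bootstrapping, so the learned table must assign value to bootstrapping somewhere mid-network.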
| Algorithm 2 Online Adaptive Bootstrapping Decision-Making | |
| Require: Input ciphertext ct₀, Network G, Trained Q-network θ, DN-ANM model | |
| Ensure: Inference result ct, Total bootstrapping count K | |
| 1: G ← AnalyzeNetwork(G) | ▹ Offline pre-processing: analyze the network |
| 2: n₀ ← GetNoise(ct₀) | ▹ Get initial noise |
| 3: s ← InitializeState(G, n₀) | ▹ Initialize state |
| 4: K ← 0; i ← 1; ct ← ct₀ | |
| 5: while i ≤ L do | |
| 6: a ← argmax_{a′ ∈ A} Q(s, a′; θ) | ▹ Greedy action selection |
| 7: if a = Immediate then | ▹ Immediate bootstrap |
| 8: ct ← Boot(ct) | |
| 9: K ← K + 1 | |
| 10: end if | |
| 11: ct ← Layer_i(ct) | ▹ Execute homomorphic computation |
| 12: i ← i + 1 | |
| 13: if a = Pre-scheduled then | ▹ Pre-scheduled bootstrap |
| 14: ct ← Boot(ct) | ▹ Bootstrap the layer output |
| 15: K ← K + 1 | |
| 16: end if | |
| 17: n ← GetNoise(ct) | ▹ Update noise information |
| 18: s ← UpdateState(s, n, i) | ▹ Update state |
| 19: end while | |
| 20: return ct, K | |
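Algorithm 2's online loop can be summarized in a few lines of Python. Here `q_value`, `predict_cost`, `boot`, and the layer callables are hypothetical stand-ins for the trained Q-network, the DN-ANM noise predictor, the bootstrapping routine, and the homomorphic layers:

```python
def infer(layers, q_value, predict_cost, boot, ct, budget=1.0):
    """Run encrypted inference, choosing one bootstrapping action per layer."""
    noise, boots = 0.0, 0
    for i, layer in enumerate(layers):
        # greedy action over {0: continue, 1: immediate, 2: pre-scheduled}
        a = max((0, 1, 2), key=lambda x: q_value(i, noise, x))
        if a == 1:                        # immediate bootstrap (before layer)
            ct, noise, boots = boot(ct), 0.0, boots + 1
        ct = layer(ct)                    # homomorphic layer evaluation
        noise += predict_cost(i)          # DN-ANM noise-increment prediction
        assert noise <= budget, "noise budget exceeded: decryption would fail"
        if a == 2:                        # pre-scheduled bootstrap (after layer)
            ct, noise, boots = boot(ct), 0.0, boots + 1
    return ct, boots
```

A toy policy that bootstraps whenever the predicted increment would exceed the budget is enough to drive the loop end to end.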
Hyperparameter Configuration and Sensitivity Analysis
5. Experimental Analysis
5.1. Network Architectures and Evaluation Metrics
5.2. Experimental Results and Analysis
5.2.1. Noise Prediction Effectiveness
5.2.2. Bootstrapping Efficiency Analysis
5.2.3. Model Training Accuracy
5.2.4. Parameter Robustness
5.2.5. Ablation Study
6. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Liu, B.; Ding, M.; Shaham, S.; Rahayu, W.; Farokhi, F.; Lin, Z. When machine learning meets privacy: A survey and outlook. ACM Comput. Surv. (CSUR) 2021, 54, 1–36.
- Podschwadt, R.; Takabi, D.; Hu, P.; Rafiei, M.H.; Cai, Z. A survey of deep learning architectures for privacy-preserving machine learning with fully homomorphic encryption. IEEE Access 2022, 10, 117477–117500.
- Yuan, L.; Wang, Z.; Sun, L.; Yu, P.S.; Brinton, C.G. Decentralized federated learning: A survey and perspective. IEEE Internet Things J. 2024, 11, 34617–34638.
- Zhang, Q.; Xin, C.; Wu, H. Privacy-preserving deep learning based on multiparty secure computation: A survey. IEEE Internet Things J. 2021, 8, 10412–10429.
- Blanco-Justicia, A.; Sánchez, D.; Domingo-Ferrer, J.; Muralidhar, K. A critical review on the use (and misuse) of differential privacy in machine learning. ACM Comput. Surv. 2022, 55, 1–16.
- Marcolla, C.; Sucasas, V.; Manzano, M.; Bassoli, R.; Fitzek, F.H.; Aaraj, N. Survey on fully homomorphic encryption, theory, and applications. Proc. IEEE 2022, 110, 1572–1609.
- Falcetta, A.; Roveri, M. Privacy-preserving deep learning with homomorphic encryption: An introduction. IEEE Comput. Intell. Mag. 2022, 17, 14–25.
- Doan, T.V.T.; Messai, M.L.; Gavin, G.; Darmont, J. A survey on implementations of homomorphic encryption schemes. J. Supercomput. 2023, 79, 15098–15139.
- Zhang, Q.; Fu, Y.; Cui, J.; He, D.; Zhong, H. Efficient fine-grained data sharing based on proxy re-encryption in iiot. IEEE Trans. Dependable Secur. Comput. 2024, 21, 5797–5809.
- Kim, A.; Deryabin, M.; Eom, J.; Choi, R.; Lee, Y.; Ghang, W.; Yoo, D. General bootstrapping approach for RLWE-based homomorphic encryption. IEEE Trans. Comput. 2023, 73, 86–96.
- Disabato, S.; Falcetta, A.; Mongelluzzo, A.; Roveri, M. A privacy-preserving distributed architecture for deep-learning-as-a-service. In Proceedings of the 2020 International Joint Conference on Neural Networks (IJCNN), Glasgow, UK, 19–24 July 2020; IEEE: Piscataway, NJ, USA, 2020; pp. 1–8.
- Zhu, K.; Wang, Z.; Ding, D.; Dong, H.; Xu, C.Z. Secure state estimation for artificial neural networks with unknown-but-bounded noises: A homomorphic encryption scheme. IEEE Trans. Neural Netw. Learn. Syst. 2024, 36, 6780–6791.
- Lou, Q.; Jiang, L. Hemet: A homomorphic-encryption-friendly privacy-preserving mobile neural network architecture. In Proceedings of the International Conference on Machine Learning, Virtual, 18–24 July 2021; pp. 7102–7110.
- Schneider, T.; Wang, H.C.; Yalame, H. HE-SecureNet: An Efficient and Usable Framework for Model Training via Homomorphic Encryption. In Proceedings of the 24th Workshop on Privacy in the Electronic Society, Taipei, China, 13–17 October 2025.
- Mnih, V.; Kavukcuoglu, K.; Silver, D.; Rusu, A.A.; Veness, J.; Bellemare, M.G.; Graves, A.; Riedmiller, M.; Fidjeland, A.K.; Ostrovski, G.; et al. Human-level control through deep reinforcement learning. Nature 2015, 518, 529–533.
- Fan, J.; Vercauteren, F. Somewhat practical fully homomorphic encryption. In Cryptology ePrint Archive; Paper 2012/144; IACR: Bellevue, WA, USA, 2012.
- Cheon, J.H.; Kim, A.; Kim, M.; Song, Y. Homomorphic encryption for arithmetic of approximate numbers. In Proceedings of the International Conference on the Theory and Application of Cryptology and Information Security; Springer: Cham, Switzerland, 2017; pp. 409–437.
- Mouchet, C.; Troncoso-Pastoriza, J.; Bossuat, J.P.; Hubaux, J.P. Multiparty homomorphic encryption from ring-learning-with-errors. Proc. Priv. Enhancing Technol. 2021, 2021, 291–311.
- Lloret-Talavera, G.; Jorda, M.; Servat, H.; Boemer, F.; Chauhan, C.; Tomishima, S.; Shah, N.N.; Pena, A.J. Enabling homomorphically encrypted inference for large DNN models. IEEE Trans. Comput. 2021, 71, 1145–1155.
- Castro, F.; Impedovo, D.; Pirlo, G. An efficient and privacy-preserving federated learning approach based on homomorphic encryption. IEEE Open J. Comput. Soc. 2025, 6, 336–347.
- Mia, M.J.; Amini, M.H. QuanCrypt-FL: Quantized homomorphic encryption with pruning for secure federated learning. IEEE Trans. Artif. Intell. 2025. Early Access.
- Wu, L.; Wang, X.A.; Liu, J.; Su, Y.; Tu, Z.; Liu, W.; Lei, H.; Tang, D.; Cao, Y.; Zhang, J. Homomorphic Encryption for Machine Learning Applications with CKKS Algorithms: A Survey of Developments and Applications. Comput. Mater. Contin. 2025, 85, 89–119.
- Gentry, C. A Fully Homomorphic Encryption Scheme; Stanford University: Stanford, CA, USA, 2009; Volume 20.
- Gilad-Bachrach, R.; Dowlin, N.; Laine, K.; Lauter, K.; Naehrig, M.; Wernsing, J. Cryptonets: Applying neural networks to encrypted data with high throughput and accuracy. In Proceedings of the International Conference on Machine Learning, New York, NY, USA, 20–22 June 2016; pp. 201–210.
- Akram, A.; Khan, F.; Tahir, S.; Iqbal, A.; Shah, S.A.; Baz, A. Privacy preserving inference for deep neural networks: Optimizing homomorphic encryption for efficient and secure classification. IEEE Access 2024, 12, 15684–15695.
- Ha, J.; Kim, S.; Lee, B.; Lee, J.; Son, M. Rubato: Noisy ciphers for approximate homomorphic encryption. In Proceedings of the Annual International Conference on the Theory and Applications of Cryptographic Techniques; Springer: Cham, Switzerland, 2022; pp. 581–610.
- Dathathri, R.; Kostova, B.; Saarikivi, O.; Dai, W.; Laine, K.; Musuvathi, M. EVA: An encrypted vector arithmetic language and compiler for efficient homomorphic computation. In Proceedings of the 41st ACM SIGPLAN Conference on Programming Language Design and Implementation, London, UK, 15–20 June 2020; pp. 546–561.
- Lee, Y.; Cheon, S.; Kim, D.; Lee, D.; Kim, H. ELASM: Error-Latency-Aware scale management for fully homomorphic encryption. In Proceedings of the 32nd USENIX Security Symposium (USENIX Security 23), Anaheim, CA, USA, 9–11 August 2023; pp. 4697–4714.
- Bengio, Y.; Goodfellow, I.; Courville, A.; Bengio, Y. Deep Learning; MIT Press: Cambridge, MA, USA, 2017; Volume 1.
- Miettinen, K. Nonlinear Multiobjective Optimization; Springer Science & Business Media: New York, NY, USA, 1999; Volume 12.
- Dathathri, R.; Saarikivi, O.; Chen, H.; Laine, K.; Lauter, K.; Maleki, S.; Musuvathi, M.; Mytkowicz, T. CHET: An optimizing compiler for fully-homomorphic neural-network inferencing. In Proceedings of the 40th ACM SIGPLAN Conference on Programming Language Design and Implementation, Phoenix, AZ, USA, 22–26 June 2019; pp. 142–156.




| Symbol | Description |
|---|---|
| M | Deep Neural Network model |
| L | Total number of network layers |
| q | Ciphertext modulus |
| Δ | Scaling factor in the CKKS scheme |
| B₀ | Initial noise budget |
| n_max | Maximum noise threshold for decryption correctness |
| ‖e‖∞ | Infinity norm of the noise error |
| S | State space of the Markov Decision Process |
| A | Action space |
| s_t | State vector at step t |
| r_t | Reward received at step t |
| π* | Optimal bootstrapping policy |
| Q(s, a) | Action-value function |
| Layer Type | Key Parameters | Noise Increment Model |
|---|---|---|
| Convolutional | | |
| Square Activation | | |
| Poly Activation | | |
| Category | Parameter | Value |
|---|---|---|
| Training Settings | Episodes (E) | 2000 |
| | Replay Buffer Size (N) | |
| | Batch Size (B) | 32 |
| | Learning Rate (α) | |
| | Discount Factor (γ) | 0.99 |
| Reward Weights | Efficiency | 0.4 |
| | Security | 0.4 |
| | Completion | 0.2 |
| Thresholds | Safe Threshold | 0.80 |
| | Danger Threshold | 0.95 |
| | Lookahead Threshold | 0.15 |
| Penalties | Bootstrapping Cost | 0.1 |
| | Premature Boot. | 0.2 |
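As an illustration of how the tabulated weights, thresholds, and penalties could combine into a single scalar reward, the sketch below uses the table's numeric values; the functional form of each term (and the sub-weights inside the security term) is an assumption, not the paper's definition:

```python
# Weights and thresholds taken from the hyperparameter table above;
# the combination rule itself is an illustrative assumption.
W_EFF, W_SEC, W_DONE = 0.4, 0.4, 0.2      # efficiency / security / completion
C_BOOT, C_EARLY = 0.1, 0.2                # bootstrapping / premature penalties
SAFE, DANGER, LOOKAHEAD = 0.80, 0.95, 0.15

def reward(noise_ratio, bootstrapped, done):
    """Scalar reward for one step, given consumed-noise ratio in [0, 1]."""
    r = 0.0
    if bootstrapped:
        r -= W_EFF * C_BOOT               # efficiency: cost of bootstrapping
        if noise_ratio < LOOKAHEAD:       # bootstrapped far too early
            r -= W_EFF * C_EARLY
    if noise_ratio > DANGER:              # security: danger zone (full penalty)
        r -= W_SEC * 1.0
    elif noise_ratio > SAFE:              # security: above safe threshold
        r -= W_SEC * 0.5
    if done:
        r += W_DONE * 1.0                 # completion bonus
    return round(r, 4)
```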
| Type | Params | MN | DN-ANM | Worst-Case | LE |
|---|---|---|---|---|---|
| Conv2d | IC = 1, OC = 8 | 3.7559 | 7.0809 | 9.1703 | |
| Conv2d | IC = 8, OC = 16 | 4.2449 | 7.6795 | 1.1921 | |
| Square | — | 8.1627 | 7.6811 | 2.1458 | |
| Linear | In = 64, Out = 64 | 3.7899 | 7.8763 | 3.1472 | |
| Linear | In = 64, Out = 10 | 3.5029 | 8.0765 | 4.6159 | |
| Configuration | Avg Reenc. | Violations | Safe Rate(%) | Adjusted Reenc. * |
|---|---|---|---|---|
| Full DN-ANM | 18.67 | 0 | 76.2 | 19.08 |
| No Foresight | 18.97 | 0.16 | 52.4 | 19.75 |
| Binary Action | 5.85 | 0.24 | 38.1 | 7.06 |
| No Completion Reward | 18.40 | 0 | 76.2 | 18.82 |
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Share and Cite
Zhang, C.; Bai, F.; Wan, J.; Chen, Y. A Reinforcement Learning-Based Optimization Strategy for Noise Budget Management in Homomorphically Encrypted Deep Network Inference. Electronics 2026, 15, 275. https://doi.org/10.3390/electronics15020275

