Efficient and Privacy-Preserving Decision Tree Inference via Homomorphic Matrix Multiplication and Leaf Node Pruning
Abstract
1. Introduction
1.1. Related Work
- HE-based protocols:
- HE supports computation over encrypted data without requiring decryption. Akavia et al. [5] used low-degree polynomial approximations to support non-interactive inference with communication costs independent of tree depth. Frery et al. [6], Hao et al. [7], and Shin et al. [8] adopted the TFHE, BFV, and CKKS schemes, respectively, for efficient computation with low multiplicative depth. Cong et al. [9] reported ciphertext size comparisons and proposed homomorphic traversal algorithms across various commonly used HE schemes. Most existing HE-based PPDT protocols are designed for two-party settings, similar to the linear-function-based scheme proposed by Tai et al. [10].
- Protocols based on other cryptographic primitives:
- Zheng et al. [11] employed additive secret sharing in a two-server setting, ensuring that no single party gained access to both the model and the data, while maintaining low communication overhead. MPC-based approaches, such as those reported by Wu et al. [12], combine additive HE with oblivious transfer (OT), offering strong privacy guarantees but often suffering from scalability issues, particularly with deep trees. Differential privacy (DP), although more commonly used during training [13], can complement cryptographic inference methods by adding noise to outputs to mask sensitive patterns. However, DP typically compromises utility and operates orthogonally to encrypted inference techniques.
1.2. Our Contributions
- Homomorphic matrix multiplication-based inference: Departing from polynomial approximation and linear path evaluation methods [10], we introduce homomorphic matrix multiplication as the primary operation for encrypted path computation. This novel application supports structured and scalable evaluation of encrypted inputs.
- Leaf node pruning during inference: We propose leaf node pruning at inference time, a novel runtime optimization that reduces the number of nodes involved in computation, significantly improving performance. Unlike traditional model pruning during training, this technique operates during encrypted inference.
- Structure-hiding inference protocol: The decision tree structure—including internal nodes and branching conditions—is fully hidden from both the client and the server. By ensuring that all path computations are homomorphically encrypted, the protocol mitigates leakage risks present in prior PPDT methods.
- Semi-interactive three-party architecture: Our protocol requires only one round of interactive communication between the data holder (client) and the outsourced server. No interaction is required from the model holder during inference. This design enables low-latency, real-world deployment scenarios.
2. Preliminaries
- Prediction: We replace the linear function with a secure inner product/matrix multiplication so that prediction can be outsourced while the tree model is kept secret.
2.1. Existing Decision Tree Classification
2.1.1. Decision Tree
2.1.2. Decision Tree Classification via Linear Function [10]
2.2. Secure Comparison Protocol
2.2.1. -bit Integer Comparison [14]
2.2.2. Packing Method
2.2.3. Secure Comparison Protocol
- 1.
- Alice generates a secret–public key pair and sends the public key to Bob and the server.
- 2.
- Alice and Bob compute
- 3.
- The server computes
- 4.
- The server masks in encrypted form using random polynomial
- 5.
- Alice decrypts
- 6.
- The server unmasks d from as follows:
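The flow of Steps 3 to 6 can be illustrated in plaintext. The sketch below assumes a DGK-style bitwise test in the spirit of [14], in which some term is zero if and only if a < b. Plain integers stand in for ciphertexts, and the modulus `P` and all function names are hypothetical; the packed formula used by the protocol itself is not reproduced here.

```python
import secrets

P = 65537  # toy plaintext modulus (assumption, not a parameter from the paper)

def to_bits(value, width):
    """Little-endian bits: index 0 is the least significant bit."""
    return [(value >> i) & 1 for i in range(width)]

def comparison_terms(a, b, width):
    """c_i = a_i - b_i + 1 + sum_{j>i} (a_j XOR b_j); some c_i is 0 iff a < b."""
    a_bits, b_bits = to_bits(a, width), to_bits(b, width)
    terms = []
    for i in range(width):
        higher_diff = sum(a_bits[j] ^ b_bits[j] for j in range(i + 1, width))
        terms.append(a_bits[i] - b_bits[i] + 1 + higher_diff)
    return terms

def server_mask(terms):
    """Server adds a fresh random mask to every term (done under encryption)."""
    mask = [secrets.randbelow(P) for _ in terms]
    return [(t + g) % P for t, g in zip(terms, mask)], mask

def server_unmask(values, mask):
    """Server removes its own mask from the values returned by the key holder."""
    return [(v - g) % P for v, g in zip(values, mask)]

def less_than(a, b, width=8):
    terms = comparison_terms(a, b, width)   # homomorphic in the real protocol
    masked, mask = server_mask(terms)       # the key holder only sees masked values
    returned = masked                       # stand-in for decrypt-and-return
    return any(v == 0 for v in server_unmask(returned, mask))
```

Because the key holder only ever decrypts masked values, it sees uniformly random numbers modulo `P`, while the server recovers the terms and tests them for zero.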
2.3. Ring-LWE-Based Secure Matrix Multiplication
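To make the idea of packed matrix multiplication concrete, the following plaintext sketch multiplies two packed polynomials in Z[x]/(x^n + 1) and reads a matrix-vector product off selected coefficients. The packing layout (rows of the matrix at exponents k·cols + j, the vector at negated high exponents) is an assumption chosen for illustration, and the function names are hypothetical; in the protocol the same product would be computed on ciphertexts.

```python
def negacyclic_mul(f, g, n):
    """Multiply coefficient lists f and g in Z[x]/(x^n + 1)."""
    out = [0] * n
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            k = i + j
            if k >= n:
                out[k - n] -= fi * gj   # x^n = -1 flips the sign
            else:
                out[k] += fi * gj
    return out

def pack_matrix(rows, n):
    """Place entry (k, j) at exponent k * cols + j."""
    cols = len(rows[0])
    assert len(rows) * cols <= n, "matrix does not fit into one polynomial"
    poly = [0] * n
    for k, row in enumerate(rows):
        for j, value in enumerate(row):
            poly[k * cols + j] = value
    return poly

def pack_vector(vec, n):
    """Encode b_0 - sum_{j>=1} b_j x^(n-j), so that x^j * (-x^(n-j)) = 1."""
    poly = [0] * n
    poly[0] = vec[0]
    for j in range(1, len(vec)):
        poly[n - j] = -vec[j]
    return poly

def matvec_via_packing(rows, vec, n):
    """Row k of the product appears as the coefficient of x^(k * cols)."""
    cols = len(rows[0])
    product = negacyclic_mul(pack_matrix(rows, n), pack_vector(vec, n), n)
    return [product[k * cols] for k in range(len(rows))]

print(matvec_via_packing([[1, 2, 3], [4, 5, 6]], [7, 8, 9], 8))  # [50, 122]
```

A single polynomial multiplication yields all row-vector inner products at once, which is why the number of homomorphic multiplications does not grow with the number of rows as long as the matrix fits into one polynomial.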
3. PPDT Classification Model
3.1. Computation Model
- The client encrypts the feature vectors and sends them to the model holder, who then encrypts the information needed to calculate the decision tree’s path cost and threshold.
- The client’s encrypted feature vectors reach the server via the model holder; this relay conceals which element of the feature vector is compared against the threshold of each decision node.
- On behalf of the model holder, the server performs the computation required for the decision tree prediction and sends the encrypted classification result to the client.
- The client decrypts the information and obtains the classification results.
A Representative Application of Online Medical Diagnostics
3.2. Computation of Path Cost by Matrix × Vector Operation
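As a plaintext illustration of how path costs can be written as a matrix × vector product, the sketch below assumes the usual edge-cost convention: a node on the path contributes b if the path requires b = 0 and 1 − b if it requires b = 1, so a leaf has cost zero exactly when every comparison on its path agrees. The encoding (one extra constant column) and the names `build_path_matrix` and `path_costs` are assumptions for illustration, not the paper's exact matrices.

```python
def build_path_matrix(paths, num_nodes):
    """paths[k] lists (node_index, required_bit) pairs from the root to leaf k.

    Each row has num_nodes + 1 entries; the last one is a constant term that
    multiplies the trailing 1 of the augmented comparison vector.
    """
    matrix = []
    for path in paths:
        row = [0] * (num_nodes + 1)
        for node, required in path:
            if required == 0:
                row[node] += 1          # contributes b_node
            else:
                row[node] -= 1          # contributes 1 - b_node
                row[num_nodes] += 1
        matrix.append(row)
    return matrix

def path_costs(matrix, bits):
    """Matrix x vector product with the comparison bits augmented by 1."""
    augmented = list(bits) + [1]
    return [sum(p * b for p, b in zip(row, augmented)) for row in matrix]

# Toy tree: node 0 is the root; b0 = 1 leads to leaf 2, b0 = 0 leads to node 1,
# where b1 = 0 leads to leaf 0 and b1 = 1 leads to leaf 1.
toy_paths = [[(0, 0), (1, 0)], [(0, 0), (1, 1)], [(0, 1)]]
toy_matrix = build_path_matrix(toy_paths, num_nodes=2)
print(path_costs(toy_matrix, [0, 1]))  # [1, 0, 1]
```

For any assignment of comparison bits exactly one leaf has cost zero, which is the property the protocol relies on when it searches for a zero coefficient.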
3.3. Proposed Protocol
- Step 1 (client):
- The client generates a secret–public key pair and sends the public key to the model holder and the server.
- Step 2 (client):
- The client encrypts each element of a feature vector by packing it using Equation (6). For , the client sends ciphertexts to the model holder.
- Step 3 (model holder):
- For , the model holder encrypts threshold by packing it using Equation (6). For , the model holder generates path vector multiplied by a random number. For , path matrix and classification result matrix are encrypted as follows:
- Step 4 (server):
- The server calculates in Equation (10) as . The server masks in encrypted form using random polynomial and then sends to the client.
- Step 5 (client):
- The client decrypts and obtains , which is then returned to the server.
- Step 6 (server):
- The server obtains from Equation (13) and the comparison result vector . If any of the -th coefficients in is zero, then ; otherwise, .
- Step 7 (server):
- The server packs the comparison result vector B using Equation () to obtain . The server calculates
- Step 8 (client):
- The client obtains the number of path matrix using Equation (21), where . Then, for , it decrypts to obtain a polynomial with randomized non-zero coefficients for the path matrix , in which coefficients of are k-th path cost according to Equations (19) and (20). Since , then it is verified whether any -th term for is 0, which implies the corresponding path cost . If so, the corresponding leaf of is the classification result according to Equation (25), and the client decrypts and obtains from the coefficient of .
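The role of the random multipliers in Steps 3 and 8 can be seen in a small plaintext sketch. It assumes a prime plaintext modulus `P` larger than any path cost, so that multiplying a cost by a non-zero random number keeps a zero cost at zero and turns every other cost into a non-zero value. The modulus, the function names, and the label list are hypothetical.

```python
import secrets

P = 65537  # toy prime plaintext modulus (assumption)

def randomize_costs(costs):
    """Scale each path cost by an independent random factor in [1, P - 1]."""
    return [(c * (1 + secrets.randbelow(P - 1))) % P for c in costs]

def find_zero_leaf(randomized):
    """Return the index of the unique zero entry, or None if there is none."""
    zeros = [k for k, value in enumerate(randomized) if value == 0]
    return zeros[0] if len(zeros) == 1 else None

def classify(costs, labels):
    """Client-side view: only randomized costs are visible, then pick the label."""
    leaf = find_zero_leaf(randomize_costs(costs))
    return None if leaf is None else labels[leaf]
```

Since `P` is prime and each cost is smaller than `P`, a product is zero modulo `P` only when the cost itself is zero, so the client learns which leaf matched but nothing about the magnitude of the other costs.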
Algorithm 1 Efficient PPDT Inference via Homomorphic Matrix Multiplication
3.4. Leaf Node Pruning (LNP)
Algorithm 2 Efficiency-Enhanced PPDT with LNP (Algorithm 1+LNP)
- Binary classification tasks
- (e.g., disease detection, abnormality detection): Only the decision path for class 1 (positive or anomalous) is evaluated. If the result is 0, class 1 is returned; otherwise, class 0 (negative or normal) is output.
- Multi-class classification tasks:
- Only classes are evaluated. If the dataset exhibits class imbalance (e.g., ImageNet), the majority class is assigned as the default. In the absence of such imbalance (e.g., MNIST), one class is randomly selected to serve as the default (i.e., the class not evaluated).
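The pruning rule described above can be sketched in plaintext as follows. Leaves whose label equals the default class are dropped before any path cost is computed, and the default class is returned when no remaining leaf has cost zero. The leaf representation (a dictionary with `label` and `path`) and the function names are assumptions for illustration; in the protocol the costs are computed under encryption.

```python
def prune_leaves(leaves, default_class):
    """Keep only the leaves that do not output the default class."""
    return [leaf for leaf in leaves if leaf["label"] != default_class]

def path_cost(path, bits):
    """Sum of edge costs: b if the path requires 0, 1 - b if it requires 1."""
    return sum(bits[node] if required == 0 else 1 - bits[node]
               for node, required in path)

def classify_with_lnp(leaves, bits, default_class):
    """Evaluate pruned leaves only; fall back to the default class otherwise."""
    for leaf in prune_leaves(leaves, default_class):
        if path_cost(leaf["path"], bits) == 0:
            return leaf["label"]
    return default_class
```

For a binary tree with class 0 as the default, only the class 1 leaves are evaluated, so the number of rows in the path matrix shrinks to the number of class 1 leaves.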
- Server runs homomorphic operation to obtain polynomials of the comparison result and then adds a random polynomial using Equation (10) over a polynomial ring , which implies , as follows.
- Bob decrypts and sends back to the server. After removing the random polynomial γ from , the server can obtain comparison results from by checking whether there exists any coefficient of . If so, set ; otherwise, . For example, , which is the coefficient of , so . Similarly, ; therefore, .
- Bob decrypts and obtains the polynomial corresponding to a -dimension path matrix and then checks its coefficients of terms. In this example, , so Bob checks whether the coefficient of or is zero. In fact,
- the coefficient of is zero , and
- the coefficient of is zero ,
which links to leaf nodes and , respectively. As shown in Equation (38), since the coefficient of in the first polynomial is zero, if and only if , which means X is classified to leaf node . Then, the coefficient of in the second polynomial is the classification result; that is, .
3.5. Complexity
3.6. Security Analysis
- Client. The client can decrypt the ciphertext sent from the cloud server in Steps 5 and 7 using the secret key and obtain related to the comparison result and the decision tree structure. However, in Step 4, the noise from the cloud server is added for randomization. Therefore, the client receives a polynomial with completely random coefficients in . Similarly, is also a polynomial with random coefficients except zero, which leads to the classification ; that is, the coefficient corresponding to . Thus, the client can obtain the classification without knowing the decision tree model.
- Model holder. The model holder only obtains the ciphertext of data X in Step 2, and IND-CPA security guarantees the privacy and security of the data.
- Cloud server. The server honestly performs homomorphic operations for the system but is curious about the data provided by the client and the model holder. In Step 3, it receives only encrypted inputs using an SHE scheme that satisfies IND-CPA security, ensuring the confidentiality of both user data and model parameters. Even if the server obtains the comparison result vector from d sent in Step 5, it cannot recover the original data X nor infer meaningful information about the decision tree (threshold t, path matrix P, and classification result V). This is due not only to encryption but also to randomization of P using noise terms and in Step 3, which prevents linkage to specific nodes of .
4. Experiments
4.1. Experimental Setup
4.2. Performance Comparisons
4.2.1. Path Cost Calculation
4.2.2. Protocol Efficiency
- To evaluate the efficiency of different homomorphic encryption approaches, we also integrated the XCMP scheme into the comparison component of the three-party protocol described in Section 3.3. The path cost computation was performed following the procedure outlined in Steps 6–8 of Section 3.3.
- A naive two-party protocol using XCMP by Lu et al. computed the two-party comparison result (see Appendix B). We then computed the path cost over encrypted data using additive homomorphism.
4.3. Time Complexity
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Notations
Functions that return 1 if · is true and 0 if false | |
Maximum bit length of an integer | |
A | Vector of length ℓ |
d-bit subvector of A | |
-bit binary vector of integer a, where is the least significant bit of integer a. | |
N | Dimension of the feature vector |
X | Feature vector |
Decision node | |
Leaf node | |
Index of feature vectors to be compared by decision node | |
Threshold of decision node | |
Class to be output by leaf node | |
m | Number of decision nodes in the decision tree |
Number of leaf nodes in the decision tree |
n | An integer of power of 2 that denotes the degree of polynomial . Define a polynomial ring , |
q | An integer composed of ( is a prime number). Define a polynomial ring representing a ciphertext space , |
p | Define a plaintext space , where p and q are mutually prime natural numbers with the relation , |
Standard deviation of the discrete Gaussian distribution defining the secret key space . The elements of are polynomials on the ring . Each coefficient is independently sampled from the discrete Gaussian distribution of the variance . |
Appendix A. Ring-LWE-Based Homomorphic Encryption
- : input security parameter and output system parameter .
- : input system parameter and output public key and secret key .
- : input plaintext m and output ciphertext c.
- : input ciphertext c and output plaintext m.
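For readers who want to experiment with the four functions listed above, the following toy sketch implements a textbook Ring-LWE public-key scheme over Z_q[x]/(x^n + 1) with plaintext modulus p. It is an assumed BV-style construction with deliberately tiny, insecure parameters and ternary noise in place of the discrete Gaussian; the constants and function names are hypothetical and are not the parameters or the exact scheme of the paper.

```python
import secrets

N = 16        # ring degree n (power of two), toy size
Q = 1 << 30   # ciphertext modulus q, coprime to T
T = 257       # plaintext modulus p

def poly_mul(f, g):
    """Negacyclic product in Z_Q[x]/(x^N + 1)."""
    out = [0] * N
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            k = i + j
            if k >= N:
                out[k - N] -= fi * gj
            else:
                out[k] += fi * gj
    return [c % Q for c in out]

def poly_add(f, g):
    return [(a + b) % Q for a, b in zip(f, g)]

def small_poly():
    """Ternary noise in {-1, 0, 1}; a stand-in for discrete Gaussian sampling."""
    return [secrets.randbelow(3) - 1 for _ in range(N)]

def keygen():
    """Secret key s; public key (a0, a1) with a0 = -(a1 * s + T * e)."""
    s, e = small_poly(), small_poly()
    a1 = [secrets.randbelow(Q) for _ in range(N)]
    a0 = [(-(x + T * y)) % Q for x, y in zip(poly_mul(a1, s), e)]
    return s, (a0, a1)

def encrypt(pk, message):
    """c0 = a0 * u + T * g + m,  c1 = a1 * u + T * f."""
    a0, a1 = pk
    m = list(message) + [0] * (N - len(message))
    u, f, g = small_poly(), small_poly(), small_poly()
    c0 = [(x + T * y + z) % Q for x, y, z in zip(poly_mul(a0, u), g, m)]
    c1 = [(x + T * y) % Q for x, y in zip(poly_mul(a1, u), f)]
    return c0, c1

def decrypt(s, ct):
    """m = ((c0 + c1 * s) centered mod Q) mod T."""
    c0, c1 = ct
    raw = poly_add(c0, poly_mul(c1, s))
    centered = [c - Q if c > Q // 2 else c for c in raw]
    return [c % T for c in centered]

def add(ct_a, ct_b):
    """Homomorphic addition: component-wise sum of ciphertexts."""
    return poly_add(ct_a[0], ct_b[0]), poly_add(ct_a[1], ct_b[1])
```

Decryption works because c0 + c1·s equals m plus T times a small noise polynomial, which vanishes modulo T as long as the noise stays below Q/2; with these toy sizes that bound holds with a wide margin for a fresh ciphertext and for a sum of two.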
Appendix B. XCMP [15]
References
1. Konečný, J.; McMahan, H.B.; Ramage, D.; Richtárik, P. Federated Optimization: Distributed Machine Learning for On-Device Intelligence. arXiv 2016, arXiv:1610.02527.
2. Fredrikson, M.; Jha, S.; Ristenpart, T. Model Inversion Attacks that Exploit Confidence Information and Basic Countermeasures. In Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security, Denver, CO, USA, 12–16 October 2015; Ray, I., Li, N., Kruegel, C., Eds.; ACM: New York, NY, USA, 2015; pp. 1322–1333.
3. Rivest, R.L.; Dertouzos, M.L. On Data Banks and Privacy Homomorphisms. In Foundations of Secure Computation; DeMillo, R., Ed.; Academic Press: Cambridge, MA, USA, 1978; Volume 4, pp. 169–180.
4. Gentry, C. A Fully Homomorphic Encryption Scheme. Ph.D. Thesis, Stanford University, Stanford, CA, USA, 2009.
5. Akavia, A.; Leibovich, M.; Resheff, Y.S.; Ron, R.; Shahar, M.; Vald, M. Privacy-Preserving Decision Trees Training and Prediction. In Proceedings of the Machine Learning and Knowledge Discovery in Databases – European Conference, ECML PKDD 2020, Ghent, Belgium, 14–18 September 2020; Proceedings, Part I. Hutter, F., Kersting, K., Lijffijt, J., Valera, I., Eds.; Lecture Notes in Computer Science. Springer: Berlin/Heidelberg, Germany, 2020; Volume 12457, pp. 145–161.
6. Fréry, J.; Stoian, A.; Bredehoft, R.; Montero, L.; Kherfallah, C.; Chevallier-Mames, B.; Meyre, A. Privacy-Preserving Tree-Based Inference with TFHE. In Proceedings of the Mobile, Secure, and Programmable Networking – 9th International Conference, MSPN 2023, Paris, France, 26–27 October 2023; Revised Selected Papers. Bouzefrane, S., Banerjee, S., Mourlin, F., Boumerdassi, S., Renault, É., Eds.; Lecture Notes in Computer Science. Springer: Berlin/Heidelberg, Germany, 2023; Volume 14482, pp. 139–156.
7. Hao, Y.; Qin, B.; Sun, Y. Privacy-Preserving Decision-Tree Evaluation with Low Complexity for Communication. Sensors 2023, 23, 2624.
8. Shin, H.; Choi, J.; Lee, D.; Kim, K.; Lee, Y. Fully Homomorphic Training and Inference on Binary Decision Tree and Random Forest. In Proceedings of the Computer Security – ESORICS 2024 – 29th European Symposium on Research in Computer Security, Bydgoszcz, Poland, 16–20 September 2024; Proceedings, Part III. García-Alfaro, J., Kozik, R., Choras, M., Katsikas, S.K., Eds.; Lecture Notes in Computer Science. Springer: Berlin/Heidelberg, Germany, 2024; Volume 14984, pp. 217–237.
9. Cong, K.; Das, D.; Park, J.; Pereira, H.V.L. SortingHat: Efficient Private Decision Tree Evaluation via Homomorphic Encryption and Transciphering. In Proceedings of the 2022 ACM SIGSAC Conference on Computer and Communications Security, CCS 2022, Los Angeles, CA, USA, 7–11 November 2022; Yin, H., Stavrou, A., Cremers, C., Shi, E., Eds.; ACM: New York, NY, USA, 2022; pp. 563–577.
10. Tai, R.K.H.; Ma, J.P.K.; Zhao, Y.; Chow, S.S.M. Privacy-Preserving Decision Trees Evaluation via Linear Functions. In Proceedings of the Computer Security – ESORICS 2017 – 22nd European Symposium on Research in Computer Security, Oslo, Norway, 11–15 September 2017; Proceedings, Part II. Foley, S.N., Gollmann, D., Snekkenes, E., Eds.; Lecture Notes in Computer Science. Springer: Berlin/Heidelberg, Germany, 2017; Volume 10493, pp. 494–512.
11. Zheng, Y.; Duan, H.; Wang, C.; Wang, R.; Nepal, S. Securely and Efficiently Outsourcing Decision Tree Inference. IEEE Trans. Dependable Secur. Comput. 2022, 19, 1841–1855.
12. Wu, D.J.; Feng, T.; Naehrig, M.; Lauter, K.E. Privately Evaluating Decision Trees and Random Forests. Proc. Priv. Enhancing Technol. 2016, 2016, 335–355.
13. Maddock, S.; Cormode, G.; Wang, T.; Maple, C.; Jha, S. Federated Boosted Decision Trees with Differential Privacy. In Proceedings of the 2022 ACM SIGSAC Conference on Computer and Communications Security, CCS 2022, Los Angeles, CA, USA, 7–11 November 2022; Yin, H., Stavrou, A., Cremers, C., Shi, E., Eds.; ACM: New York, NY, USA, 2022; pp. 2249–2263.
14. Damgård, I.; Geisler, M.; Krøigaard, M. A correction to ’efficient and secure comparison for on-line auctions’. Int. J. Appl. Cryptogr. 2009, 1, 323–324.
15. Lu, W.; Zhou, J.; Sakuma, J. Non-interactive and Output Expressive Private Comparison from Homomorphic Encryption. In Proceedings of the 2018 on Asia Conference on Computer and Communications Security, AsiaCCS 2018, Incheon, Republic of Korea, 4–8 June 2018; Kim, J., Ahn, G., Kim, S., Kim, Y., López, J., Kim, T., Eds.; ACM: New York, NY, USA, 2018; pp. 67–74.
16. Saha, T.K.; Koshiba, T. An Efficient Privacy-Preserving Comparison Protocol. In Proceedings of the Advances in Network-Based Information Systems, The 20th International Conference on Network-Based Information Systems, NBiS 2017, Ryerson University, Toronto, ON, Canada, 24–26 August 2017; Barolli, L., Enokido, T., Takizawa, M., Eds.; Lecture Notes in Computer Science; Springer: Berlin/Heidelberg, Germany, 2018; Volume 7, pp. 553–565.
17. Wang, L.; Saha, T.K.; Aono, Y.; Koshiba, T.; Moriai, S. Enhanced Secure Comparison Schemes Using Homomorphic Encryption. In Proceedings of the Advances in Networked-Based Information Systems – The 23rd International Conference on Network-Based Information Systems, NBiS 2020, Victoria, BC, Canada, 31 August–2 September 2020; Barolli, L., Li, K.F., Enokido, T., Takizawa, M., Eds.; Advances in Intelligent Systems and Computing. Springer: Berlin/Heidelberg, Germany, 2021; Volume 1264, pp. 211–224.
18. Duong, D.H.; Mishra, P.K.; Yasuda, M. Efficient Secure Matrix Multiplication Over LWE-Based Homomorphic Encryption. Tatra Mt. Math. Publ. 2016, 67, 69–83.
19. Yasuda, M.; Shimoyama, T.; Kogure, J.; Yokoyama, K.; Koshiba, T. Practical Packing Method in Somewhat Homomorphic Encryption. In Proceedings of the Data Privacy Management and Autonomous Spontaneous Security – 8th International Workshop, DPM 2013, and 6th International Workshop, SETOP 2013, Revised Selected Papers, Egham, UK, 12–13 September 2013; García-Alfaro, J., Lioudakis, G.V., Cuppens-Boulahia, N., Foley, S.N., Fitzgerald, W.M., Eds.; Lecture Notes in Computer Science. Springer: Berlin/Heidelberg, Germany, 2014; Volume 8247, pp. 34–50.
20. Kelly, M.; Longjohn, R.; Nottingham, K. The UCI Machine Learning Repository. Available online: https://archive.ics.uci.edu (accessed on 31 March 2025).
21. Fan, J.; Vercauteren, F. Somewhat Practical Fully Homomorphic Encryption. IACR Cryptology ePrint Archive 2012. p. 144. Available online: https://eprint.iacr.org/2012/144 (accessed on 31 March 2025).
22. Microsoft SEAL (Release 3.3); Microsoft Research: Redmond, WA, USA, 2019; Available online: https://github.com/Microsoft/SEAL (accessed on 31 March 2025).
23. Albrecht, M.; Chase, M.; Chen, H.; Ding, J.; Goldwasser, S.; Gorbunov, S.; Halevi, S.; Hoffstein, J.; Laine, K.; Lauter, K.; et al. Homomorphic Encryption Security Standard; Technical report; HomomorphicEncryption.org: Toronto, ON, Canada, 2018.
24. Fukui, S.; Wang, L.; Hayashi, T.; Ozawa, S. Privacy-Preserving Decision Tree Classification Using Ring-LWE-Based Homomorphic Encryption. In Proceedings of the Computer Security Symposium 2019, Nagasaki, Japan, 21–24 October 2019; pp. 321–327.
Dataset | Dataset: # Data | Dataset: # Attributes | Dataset: # Classes | Model: # Nodes
---|---|---|---|---
BC | 569 | 30 | 2 | 8
ST | 43,500 | 9 | 7 | 24
CS | 653 | 15 | 2 | 26
HD | 720 | 13 | 5 | 35
NS | 12,960 | 8 | 5 | 49
SP | 4601 | 57 | 2 | 110
EGG | 14,980 | 14 | 2 | 724
BK | 45,211 | 16 | 2 | 1027
Dataset | Accuracy | Path Cost (ms): Matrix (+LNP) | Path Cost (ms): Inner Product | Total Time (ms): Matrix (+LNP) | Total Time (ms): Inner Product
---|---|---|---|---|---
BC | 96.4% | 0.25 (0.25) | 3.04 | 44.19 (43.92) | 63.58
ST | 99.8% | 0.25 (0.25) | 6.74 | 66.85 (66.83) | 117.51
CS | 90.8% | 0.25 (0.25) | 7.47 | 77.57 (76.12) | 134.64
HD | 61.1% | 0.25 (0.25) | 9.73 | 98.82 (98.11) | 174.34
NS | 98.6% | 0.50 (0.37) | 11.72 | 130.87 (129.07) | 235.04
SP | 90.4% | 1.72 (0.86) | 13.13 | 321.89 (302.72) | 490.04
EGG | 69.8% | 88.57 (46.20) | 179.31 | 3425.44 (2516.35) | 4442.00
BK | 88.6% | 247.63 (131.14) | 252.87 | 6485.30 (4394.58) | 6455.17
Dataset | Path Cost (ms): Three-Party | Path Cost (ms): Two-Party | Total Time (ms): Three-Party | Total Time (ms): Two-Party
---|---|---|---|---
BC | 2.17 | 1577.51 | 302.19 | 1958.13
ST | 2.17 | 11,745.70 | 647.51 | 12,724.20
CS | 2.18 | 14,905.70 | 736.95 | 16,037.90
HD | 2.18 | 27,041.40 | 961.19 | 28,298.90
NS | 2.18 | 51,772.50 | 1321.47 | 54,268.10
SP | 2.19 | 255,659.00 | 2996.92 | 263,008.00
EGG | 74.01 | 10,947,300.00 | 19,432.90 | 11,013,300.00
BK | 166.74 | 22,027,200.00 | 28,821.60 | 22,130,500.00
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Fukui, S.; Wang, L.; Ozawa, S. Efficient and Privacy-Preserving Decision Tree Inference via Homomorphic Matrix Multiplication and Leaf Node Pruning. Appl. Sci. 2025, 15, 5560. https://doi.org/10.3390/app15105560