VoteSim: Voting-Based Binary Code Similarity Detection for Vulnerability Identification in IoT Firmware
Abstract
1. Introduction
- We design a novel ensemble framework (VoteSim) that integrates diverse neural representations through a voting-based retrieval ranking mechanism.
- We evaluate VoteSim on real-world IoT firmware datasets, showing that it outperforms state-of-the-art binary code similarity detection (BCSD) techniques in both mean reciprocal rank (MRR) and recall, with recall improvements of up to 14.7%.
- We address a gap in the literature with a hybrid approach that combines the strengths of multiple neural networks, improving BCSD performance through the complementary insights of each model.
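As a rough illustration of the first stage of such a framework, the sketch below collects a top-K candidate list from each model independently. It assumes that every model embeds functions as vectors and that candidates are ranked by cosine similarity; the model names, function IDs, and embedding values are hypothetical and not taken from the paper.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def top_k(query_vec, pool, k):
    """Return the IDs of the k pool functions most similar to the query."""
    ranked = sorted(pool, key=lambda fid: cosine(query_vec, pool[fid]), reverse=True)
    return ranked[:k]

# Hypothetical embeddings of the same three firmware functions under two models.
pools = {
    "model_a": {"f1": [1.0, 0.0], "f2": [0.7, 0.7], "f3": [0.0, 1.0]},
    "model_b": {"f1": [0.2, 0.9], "f2": [0.9, 0.1], "f3": [0.5, 0.5]},
}
# Hypothetical embeddings of the vulnerable query function under each model.
queries = {"model_a": [1.0, 0.1], "model_b": [1.0, 0.0]}

# One ranked candidate list per model, best match first.
ranked_lists = {name: top_k(queries[name], pools[name], k=2) for name in pools}
```

The two models disagree on the best match here, which is the situation a voting step is meant to resolve.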
2. Motivation
3. Design of VoteSim
3.1. Retrieval Results Collection
3.2. Voting Score Computation
3.3. Reordering and Final Retrieval
3.4. Computational Complexity
4. Evaluation
4.1. Evaluation Setup
4.2. Evaluation Metrics
4.3. Baselines
4.4. Vulnerability Detection
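- On average, VoteSim outperforms every individual model in both Recall@10 (0.520 vs. 0.453 for jTrans, the strongest baseline on this metric) and MRR (0.213 vs. 0.195 for Gemini), showing the effectiveness of its aggregation strategy.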
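- For high-impact vulnerabilities such as CVE-2014-3511 and CVE-2008-1672, VoteSim stays close to the strongest individual model (Recall@10 of 0.412 vs. 0.412 for Gemini, and 0.600 vs. 0.700 for jTrans, respectively) while clearly outperforming the weaker ones, which indicates that aggregation preserves the stronger signals and highlights its flexibility and robustness.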
- The inverse average rank aggregation strategy significantly enhances the model’s ability to detect true positives while reducing the impact of false positives.
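A minimal sketch of inverse average rank aggregation follows, assuming each candidate is scored as the reciprocal of its mean rank across the models. The `miss_rank` penalty for candidates that a model did not retrieve is an assumption of this sketch, and the ranked lists are made-up examples rather than results from the paper.

```python
def inverse_average_rank(ranked_lists, miss_rank):
    """Score each candidate as 1 / (mean rank across models).

    ranked_lists: dict of model name -> list of candidate IDs, best first.
    miss_rank: rank assigned when a model did not retrieve a candidate
               (an assumption; the source does not specify this case).
    """
    candidates = {c for lst in ranked_lists.values() for c in lst}
    scores = {}
    for cand in candidates:
        ranks = [
            lst.index(cand) + 1 if cand in lst else miss_rank
            for lst in ranked_lists.values()
        ]
        scores[cand] = 1.0 / (sum(ranks) / len(ranks))
    # Highest score first; ties broken by candidate ID for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))

# Hypothetical top-3 lists from the three baseline detectors.
ranked_lists = {
    "gemini":   ["f_a", "f_b", "f_c"],
    "binaryai": ["f_b", "f_a", "f_d"],
    "jtrans":   ["f_b", "f_c", "f_a"],
}
reranked = inverse_average_rank(ranked_lists, miss_rank=4)
```

In this toy input, `f_b` has ranks 2, 1, and 1, so its mean rank of 4/3 gives the highest score, while `f_d`, retrieved by only one model, falls to the bottom. This is how agreement between models can suppress a false positive that a single model ranks highly.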
5. Discussion
6. Limitation and Future Work
7. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
1. Mehdipour, F. A Review of IoT Security Challenges and Solutions. In Proceedings of the 2020 8th International Japan-Africa Conference on Electronics, Communications, and Computations (JAC-ECC), Online, 14–15 December 2020; pp. 1–6.
2. Luo, Z.; Wang, P.; Xie, W.; Zhou, X.; Wang, B. IoTSim: Internet of things-oriented binary code similarity detection with multiple block relations. Sensors 2023, 23, 7789.
3. Yang, Y.; Wu, L.; Yin, G.; Li, L.; Zhao, H. A Survey on Security and Privacy Issues in Internet-of-Things. IEEE Internet Things J. 2017, 4, 1250–1258.
4. Feng, X.; Zhu, X.; Han, Q.L.; Zhou, W.; Wen, S.; Xiang, Y. Detecting vulnerability on IoT device firmware: A survey. IEEE/CAA J. Autom. Sin. 2022, 10, 25–41.
5. Wu, Y.; Wang, J.; Wang, Y.; Zhai, S.; Li, Z.; He, Y.; Sun, K.; Li, Q.; Zhang, N. Your firmware has arrived: A study of firmware update vulnerabilities. In Proceedings of the 33rd USENIX Security Symposium (USENIX Security 24), Philadelphia, PA, USA, 14–16 August 2024; pp. 5627–5644.
6. Sun, H.; Zhou, W.; Fei, M. A Survey On Graph Matching in Computer Vision. In Proceedings of the 2020 13th International Congress on Image and Signal Processing, BioMedical Engineering and Informatics (CISP-BMEI), IEEE, Chengdu, China, 17–19 October 2020; pp. 225–230.
7. Gao, D.; Reiter, M.K.; Song, D.X. BinHunt: Automatically finding semantic differences in binary programs. In Proceedings of the 10th International Conference on Information and Communications Security (ICICS), Birmingham, UK, 20–22 October 2008; Springer: Berlin/Heidelberg, Germany, 2008; pp. 238–255.
8. Dullien, T.; Rolles, R. BinDiff: Finding similarities in binary code. In Proceedings of the Conference on Reverse Engineering, Pittsburgh, PA, USA, 7–11 November 2005.
9. Rabin, M.O.; Broder, A.Z. Hashing for Similarity Search: An Empirical Evaluation. In Proceedings of the International Conference on Research and Development in Information Retrieval (SIGIR), New Orleans, LA, USA, 9–13 September 2001; ACM: New York, NY, USA, 2001; pp. 34–40.
10. Fu, L.; Ji, S.; Liu, C.; Liu, P.; Duan, F.; Wang, Z.; Chen, W.; Wang, T. Focus: Function clone identification on cross-platform. Int. J. Intell. Syst. 2022, 37, 5082–5112.
11. Gao, J.; Yang, X.; Fu, Y.; Jiang, Y.; Sun, J. VulSeeker: A Semantic Learning Based Vulnerability Seeker for Cross-Platform Binary. In Proceedings of the 2018 33rd IEEE/ACM International Conference on Automated Software Engineering (ASE), Montpellier, France, 3–7 September 2018; pp. 896–899.
12. Ling, X.; Wu, L.; Wang, S.; Ma, T.; Xu, F.; Liu, A.X.; Wu, C.; Ji, S. Multilevel graph matching networks for deep graph similarity learning. IEEE Trans. Neural Netw. Learn. Syst. 2021, 34, 799–813.
13. Massarelli, L.; Di Luna, G.A.; Petroni, F.; Querzoni, L.; Baldoni, R. Investigating graph embedding neural networks with unsupervised features extraction for binary analysis. In Proceedings of the 2nd Workshop on Binary Analysis Research (BAR), San Diego, CA, USA, 24 February 2019; pp. 1–11.
14. Yang, J.; Fu, C.; Liu, X.Y.; Yin, H.; Zhou, P. Codee: A tensor embedding scheme for binary code search. IEEE Trans. Softw. Eng. 2021, 48, 2224–2244.
15. Liu, B.; Huo, W.; Zhang, C.; Li, W.; Li, F.; Piao, A.; Zou, W. αDiff: Cross-version binary code similarity detection with DNN. In Proceedings of the 33rd ACM/IEEE International Conference on Automated Software Engineering, Montpellier, France, 3–7 September 2018; ACM: New York, NY, USA, 2018; pp. 667–678.
16. Yu, Z.; Cao, R.; Tang, Q.; Nie, S.; Huang, J.; Wu, S. Order matters: Semantic-aware neural networks for binary code similarity detection. In Proceedings of the AAAI Conference on Artificial Intelligence, New York, NY, USA, 7–12 February 2020; Volume 34, pp. 1145–1152.
17. Wang, H.; Qu, W.; Katz, G.; Zhu, W.; Gao, Z.; Qiu, H.; Zhuge, J.; Zhang, C. jTrans: Jump-Aware Transformer for Binary Code Similarity. arXiv 2022, arXiv:2205.12713.
18. Wang, H.; Gao, Z.; Zhang, C.; Sha, Z.; Sun, M.; Zhou, Y.; Zhu, W.; Sun, W.; Qiu, H.; Xiao, X. CLAP: Learning Transferable Binary Code Representations with Natural Language Supervision. In Proceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis (ISSTA 2024), Vienna, Austria, 16–20 September 2024; pp. 503–515.
19. Wang, H.; Gao, Z.; Zhang, C.; Sun, M.; Zhou, Y.; Qiu, H.; Xiao, X. CEBin: A Cost-Effective Framework for Large-Scale Binary Code Similarity Detection. arXiv 2024, arXiv:2402.18818.
20. Zhou, S.; Fu, L.; Liu, P.; Wang, W. BinEGA: Enhancing DNN-based Binary Code Similarity Detection through Efficient Graph Alignment. In Proceedings of the 2025 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER), Montreal, QC, Canada, 4–7 March 2025; pp. 488–499.
21. Fu, L.; Liu, P.; Meng, W.; Lu, K.; Zhou, S.; Zhang, X.; Chen, W.; Ji, S. Understanding the AI-powered Binary Code Similarity Detection. arXiv 2024, arXiv:2410.07537.
22. Dwork, C.; Kumar, R.; Naor, M.; Sivakumar, D. Rank aggregation methods for the web. In Proceedings of the 10th International Conference on World Wide Web, Hong Kong, China, 1–5 May 2001; ACM: New York, NY, USA, 2001; pp. 613–622.
23. Lin, S. Rank aggregation methods. Wiley Interdiscip. Rev. Comput. Stat. 2010, 2, 555–570.
24. Dietterich, T.G. Ensemble methods in machine learning. In Proceedings of the International Workshop on Multiple Classifier Systems, Cagliari, Italy, 21–23 June 2000; Springer: Berlin/Heidelberg, Germany, 2000; pp. 1–15.
25. Wang, H.; Ma, P.; Yuan, Y.; Liu, Z.; Wang, S.; Tang, Q.; Nie, S.; Wu, S. Enhancing DNN-Based Binary Code Function Search with Low-Cost Equivalence Checking. IEEE Trans. Softw. Eng. 2023, 49, 226–250.
26. Zhang, P.; Wu, C.; Wang, Z. BINCODEX: A comprehensive and multi-level dataset for evaluating binary code similarity detection techniques. BenchCouncil Trans. Benchmarks Stand. Eval. 2024, 4, 100163.
27. Xu, X.; Liu, C.; Feng, Q.; Yin, H.; Song, L.; Song, D. Neural Network-based Graph Embedding for Cross-Platform Binary Code Similarity Detection. In Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security, CCS’17, Dallas, TX, USA, 30 October–3 November 2017; pp. 363–376.
28. Schoonaert, B.; Kim, H.J.; Paek, Y.H. A Study on Binary Code Similarity Detection. In Proceedings of the Annual Conference of KIPS. Korea Information Processing Society, Jeju, Republic of Korea, 16–18 December 2024; pp. 216–219.
29. Yang, S.; Cheng, L.; Zeng, Y.; Lang, Z.; Zhu, H.; Shi, Z. Asteria: Deep learning-based AST-encoding for cross-platform binary code similarity detection. In Proceedings of the 2021 51st Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN), Taipei, Taiwan, 21–24 June 2021; pp. 224–236.
30. Halder, A.; Dalal, A.; Gharami, S.; Wozniak, M.; Ijaz, M.F.; Singh, P.K. A fuzzy rank-based deep ensemble methodology for multi-class skin cancer classification. Sci. Rep. 2025, 15, 6268.
31. Rajora, K.; Abdulhussein, N.S. Reviews research on applying machine learning techniques to reduce false positives for network intrusion detection systems. Babylon. J. Mach. Learn. 2023, 2023, 26–30.
32. da Silveira Lopes, R.; Duarte, J.C.; Goldschmidt, R.R. False Positive Identification in Intrusion Detection Using XAI. IEEE Lat. Am. Trans. 2023, 21, 745–751.
33. Jiang, V.S.; Kandula, H.; Thirumalaraju, P.; Kanakasabapathy, M.K.; Cherouveim, P.; Souter, I.; Dimitriadis, I.; Bormann, C.L.; Shafiee, H. The use of voting ensembles to improve the accuracy of deep neural networks as a non-invasive method to predict embryo ploidy status. J. Assist. Reprod. Genet. 2023, 40, 301–308.
34. Ragab, M.; Alshammari, S.M.; Al-Ghamdi, A.S. Modified Metaheuristics with Weighted Majority Voting Ensemble Deep Learning Model for Intrusion Detection System. Comput. Syst. Sci. Eng. 2023, 47, 2497–2512.
35. Khan, S.A.; Rehman, A.U.; Arshad, A.; Alqahtani, M.H.; Mahmoud, K.; Lehtonen, M. Effective Voting-Based Ensemble Learning for Segregated Load Forecasting with Low Sampling Data. IEEE Access 2024, 12, 84074–84087.
36. Fu, G.; Li, B.; Yang, Y.; Li, C. Re-ranking and TOPSIS-based ensemble feature selection with multi-stage aggregation for text categorization. Pattern Recognit. Lett. 2023, 168, 47–56.
Detector | Feature Extraction | Embedding Networks | Approach Granularities |
---|---|---|---|
jTrans [17] | AE | Transformer | Func |
Gemini [27] | MFE | PStructure2Vec | Func |
BinaryAI [16] | BERT | CNN, MPNN | Func |
CVE | Vulnerable Function | Confirmed |
---|---|---|
2015-1788 | BN_GF2m_mod_inv | 10 |
2008-1672 | ssl3_send_client_key_exchange | 10 |
2015-0286 | ASN1_TYPE_cmp | 11 |
2015-1789 | X509_cmp_time | 14 |
2022-0778 | BN_mod_sqrt | 19 |
2014-0224 | ssl3_do_change_cipher_spec | 10 |
2015-0287 | ASN1_item_ex_d2i | 22 |
2015-1791 | ssl_session_dup | 6 |
2015-0288 | X509_to_X509_REQ | 22 |
2014-3511 | ssl23_get_client_hello | 17 |
CVE | Gemini [27] Recall@10 | Gemini [27] MRR | BinaryAI [16] Recall@10 | BinaryAI [16] MRR | jTrans [17] Recall@10 | jTrans [17] MRR | VoteSim Recall@10 | VoteSim MRR |
---|---|---|---|---|---|---|---|---|
2015-1788 | 0.300 | 0.176 | 0.300 | 0.185 | 0.700 | 0.259 | 0.600 | 0.240 |
2008-1672 | 0.300 | 0.183 | 0.300 | 0.219 | 0.700 | 0.260 | 0.600 | 0.257 |
2015-0286 | 0.910 | 0.266 | 0.454 | 0.211 | 0.727 | 0.247 | 0.727 | 0.261 |
2015-1789 | 0.429 | 0.177 | 0.429 | 0.178 | 0.429 | 0.180 | 0.429 | 0.182 |
2022-0778 | 0.157 | 0.117 | 0.368 | 0.145 | 0.368 | 0.136 | 0.368 | 0.164 |
2014-0224 | 0.700 | 0.259 | 0.700 | 0.246 | 0.700 | 0.259 | 0.700 | 0.259 |
2015-0287 | 0.182 | 0.103 | 0.136 | 0.085 | 0.136 | 0.086 | 0.182 | 0.101 |
2015-1791 | 0.833 | 0.391 | 0.667 | 0.343 | 0.500 | 0.329 | 1.000 | 0.408 |
2015-0288 | 0.182 | 0.119 | 0.136 | 0.077 | 0.091 | 0.068 | 0.182 | 0.111 |
2014-3511 | 0.412 | 0.153 | 0.176 | 0.110 | 0.176 | 0.109 | 0.412 | 0.147 |
Average | 0.440 | 0.195 | 0.367 | 0.180 | 0.453 | 0.193 | 0.520 | 0.213 |
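The two metrics in the table can be sketched as follows. This assumes the common definitions: Recall@K is the fraction of confirmed vulnerable functions that appear in the top K results, and reciprocal rank is 1 divided by the rank of the first relevant result (MRR being its mean over queries). The paper may compute MRR differently, and the toy ranking and function IDs below are hypothetical. The last lines recompute two entries of the Average row from the per-CVE Recall@10 values in the table above.

```python
def recall_at_k(ranked, relevant, k=10):
    """Fraction of relevant items that appear in the top k of the ranking."""
    hits = sum(1 for item in ranked[:k] if item in relevant)
    return hits / len(relevant)

def reciprocal_rank(ranked, relevant):
    """1 / rank of the first relevant item, or 0.0 if none is retrieved."""
    for position, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / position
    return 0.0

def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)

# Toy query (hypothetical IDs): 3 confirmed vulnerable functions, 2 retrieved.
ranked = ["x1", "v1", "x2", "v2", "x3"]
relevant = {"v1", "v2", "v3"}
r = recall_at_k(ranked, relevant, k=10)   # 2 of 3 relevant items found
rr = reciprocal_rank(ranked, relevant)    # first hit is at rank 2

# Recall@10 columns copied from the table above, in row order.
votesim = [0.600, 0.600, 0.727, 0.429, 0.368, 0.700, 0.182, 1.000, 0.182, 0.412]
jtrans = [0.700, 0.700, 0.727, 0.429, 0.368, 0.700, 0.136, 0.500, 0.091, 0.176]
avg_votesim = round(mean(votesim), 3)
avg_jtrans = round(mean(jtrans), 3)
```

The recomputed averages agree with the reported Average row for VoteSim (0.520) and jTrans (0.453).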
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Sun, K.; Zhou, S.; Meng, Y.; Ruan, W.; Chen, L. VoteSim: Voting-Based Binary Code Similarity Detection for Vulnerability Identification in IoT Firmware. Appl. Sci. 2025, 15, 10093. https://doi.org/10.3390/app151810093