Data-Driven Transferable Modeling for Cross-Project Software Vulnerability Detection via Dual-Feature Stacking Ensemble
Abstract
1. Introduction
- A vulnerability detection method named Decpvd is presented for cross-project vulnerability detection. It builds on CSVD-TF, which pioneered expert–semantic feature fusion, and performs detection through three cooperating modules: the Code Representation Module, the Model Construction Module, and the Vulnerability Detection Module.
- A model fusion mechanism based on a stacking ensemble is designed to adaptively integrate two transfer learning models, one built on expert-metric features and the other on semantic-metric features. It replaces the fixed-weight fusion of CSVD-TF, lets the two types of metrics complement each other, and addresses the poor adaptability of fixed-weight fusion in cross-project scenarios, thereby improving detection performance.
- We conduct large-scale evaluation experiments for Decpvd on real-world software project datasets. The results show that Decpvd significantly outperforms current mainstream methods in cross-project vulnerability detection tasks and, owing to its adaptive stacking fusion strategy, adapts better than CSVD-TF.
2. Related Work
2.1. Representation Learning of Source Code in Software Vulnerability Detection
2.2. Cross-Project Software Vulnerability Detection
3. Approach
3.1. Code Representation Module
3.1.1. Expert Metrics
3.1.2. Semantic Metrics
1. Removing comment information from the code to eliminate potential interference from natural language text;
2. Uniformly mapping user-defined variable names to standardized ones to avoid generating irrelevant features due to diverse variable naming;
3. Uniformly mapping user-defined function names to standardized ones to reduce the impact caused by differences in function naming styles.
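The three normalization steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' preprocessing code: the `VAR`/`FUN` placeholder scheme, the short `RESERVED` list, and the rule that an identifier directly followed by `(` counts as a function name are all hypothetical choices made for the example. A production pipeline would more likely rely on a real parser.

```python
import re

# Assumption: a deliberately small list of C keywords and library calls that
# keep their names; a real pipeline would need a far more complete list.
RESERVED = {
    "int", "char", "void", "long", "short", "float", "double", "unsigned",
    "signed", "const", "static", "struct", "if", "else", "for", "while",
    "do", "switch", "case", "default", "break", "continue", "return",
    "sizeof", "NULL", "size_t", "memcpy", "malloc", "free", "strlen",
}

DQ_STRING = r'"(?:\\.|[^"\\])*"'   # double-quoted string literal
SQ_STRING = r"'(?:\\.|[^'\\])*'"   # character literal
COMMENT = r"//[^\n]*|/\*.*?\*/"    # line comment or block comment
TOKEN_RE = re.compile(
    rf"(?P<string>{DQ_STRING}|{SQ_STRING})"
    rf"|(?P<comment>{COMMENT})"
    r"|(?P<ident>\b[A-Za-z_]\w*)",
    re.DOTALL,
)
CALL_RE = re.compile(r"\s*\(")


def normalize_code(code):
    """Strip comments and map user-defined names to VAR1.., FUN1.. placeholders."""
    var_names, fun_names = {}, {}

    def replace(match):
        if match.group("comment") is not None:
            return " "                      # step 1: drop comment text
        if match.group("string") is not None:
            return match.group(0)           # leave literals untouched
        name = match.group("ident")
        if name in RESERVED:
            return name
        # Heuristic (assumption): a name directly followed by "(" is a function.
        is_function = CALL_RE.match(code, match.end()) is not None
        table, prefix = (fun_names, "FUN") if is_function else (var_names, "VAR")
        if name not in table:
            table[name] = f"{prefix}{len(table) + 1}"   # steps 2 and 3
        return table[name]

    return TOKEN_RE.sub(replace, code)


SAMPLE = """int copy_buf(char *dst, char *src, int n) { // copy bytes
    memcpy(dst, src, n); /* no bounds check */
    return n;
}"""
print(normalize_code(SAMPLE))
```

On the sample, `copy_buf` becomes `FUN1` and `dst`, `src`, `n` become `VAR1`, `VAR2`, `VAR3`, while `memcpy` is kept because it appears in the reserved list.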
3.2. Model Construction Module
3.2.1. Training of Transfer Learning Base Models Based on TrAdaboost
3.2.2. Model Fusion Based on Stacking
- Both the source domain expert feature data and semantic feature data are partitioned into five non-overlapping subsets.
- In each round of cross-validation, four subsets are selected as the training set, and the remaining subset serves as the validation set.
- The expert metrics-based model and the semantic metrics-based model are fine-tuned using the training set. Subsequently, the prediction probabilities of both models are obtained using the validation set.
- The prediction probabilities from the expert feature model and the semantic feature model on the validation set are concatenated to form a two-dimensional meta-feature vector. Meanwhile, the true labels of the validation set are collected as the meta-labels.
- A logistic regression meta-model is trained with the meta-feature vectors collected over all folds as inputs and the meta-labels as outputs.
- By learning the mapping relationship between the prediction probabilities of the two base models and the true labels, the meta-model adaptively adjusts the weights of the two base models. It ultimately outputs the fused prediction probabilities, achieving complementarity and enhancement among multiple feature models.
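The stacking procedure above can be sketched in plain Python. This is an illustrative sketch under stated assumptions, not the paper's implementation: `fit_centroid_model` is a hypothetical stand-in for the TrAdaBoost-trained base models, the data are synthetic, and the gradient-descent settings (`lr`, `epochs`) are arbitrary.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_centroid_model(X, y):
    """Stand-in base learner (assumption, not TrAdaBoost): scores a sample by
    how much closer it lies to the positive-class mean than to the negative one."""
    def mean(rows):
        return [sum(col) / len(rows) for col in zip(*rows)]

    mu_pos = mean([x for x, t in zip(X, y) if t == 1])
    mu_neg = mean([x for x, t in zip(X, y) if t == 0])

    def predict_proba(x):
        d_pos = sum((a - b) ** 2 for a, b in zip(x, mu_pos))
        d_neg = sum((a - b) ** 2 for a, b in zip(x, mu_neg))
        return sigmoid(d_neg - d_pos)

    return predict_proba


def build_meta_dataset(X_exp, X_sem, y, k=5, seed=0):
    """Collect out-of-fold probabilities of both base models as meta-features."""
    indices = list(range(len(y)))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]      # k non-overlapping subsets
    meta_X, meta_y = [], []
    for val in folds:
        held_out = set(val)
        train = [i for i in indices if i not in held_out]
        y_train = [y[i] for i in train]
        expert = fit_centroid_model([X_exp[i] for i in train], y_train)
        semantic = fit_centroid_model([X_sem[i] for i in train], y_train)
        for i in val:
            meta_X.append((expert(X_exp[i]), semantic(X_sem[i])))  # 2D meta-feature
            meta_y.append(y[i])                                    # meta-label
    return meta_X, meta_y


def fit_meta_model(meta_X, meta_y, lr=0.5, epochs=500):
    """Logistic regression meta-model trained by batch gradient descent."""
    w1 = w2 = b = 0.0
    n = len(meta_y)
    for _ in range(epochs):
        g1 = g2 = gb = 0.0
        for (p1, p2), t in zip(meta_X, meta_y):
            err = sigmoid(w1 * p1 + w2 * p2 + b) - t
            g1 += err * p1
            g2 += err * p2
            gb += err
        w1 -= lr * g1 / n
        w2 -= lr * g2 / n
        b -= lr * gb / n
    return w1, w2, b


# Synthetic toy data (assumption): one expert feature, two semantic features.
rng = random.Random(1)
y = [i % 2 for i in range(100)]
X_exp = [[rng.gauss(3.0 * t, 1.0)] for t in y]
X_sem = [[rng.gauss(1.0 * t, 1.0), rng.gauss(-1.0 * t, 1.0)] for t in y]
meta_X, meta_y = build_meta_dataset(X_exp, X_sem, y)
w1, w2, b = fit_meta_model(meta_X, meta_y)
```

The learned `w1` and `w2` play the role of the adaptive weights: instead of fixing the contribution of each base model in advance, the meta-model estimates it from the out-of-fold predictions.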
3.3. Vulnerability Detection Module
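| Algorithm 1 Decpvd Vulnerability Detection Algorithm (End-to-End) |
|---|
| 1: Initialize empty sets for the expert-feature probabilities, the semantic-feature probabilities, and the predicted labels |
| 2: # Step 1: Dual-feature model probability prediction for target domain |
| 3: for each target-domain sample in the expert feature set do |
| 4: Compute its positive-class probability with the expert feature model {Forward propagation for expert feature positive probability} |
| 5: Add the probability to the expert-feature probability set |
| 6: end for |
| 7: for each target-domain sample in the semantic feature set do |
| 8: Compute its positive-class probability with the semantic feature model {Forward propagation for semantic feature positive probability} |
| 9: Add the probability to the semantic-feature probability set |
| 10: end for |
| 11: # Step 2: Meta-feature concatenation |
| 12: for each target-domain sample do |
| 13: Concatenate its expert-feature and semantic-feature probabilities {Generate 2D meta-feature vector} |
| 14: end for |
| 15: # Step 3: Meta-model fusion to get final probability |
| 16: Feed the meta-feature vectors to the meta-model to obtain the final probabilities {Fusion of dual-feature probabilities} |
| 17: # Step 4: Binary label conversion by threshold |
| 18: for each final probability do |
| 19: if the probability passes the 0.5 threshold then |
| 20: Set the label to vulnerable {Classified as vulnerable} |
| 21: else |
| 22: Set the label to non-vulnerable {Classified as non-vulnerable} |
| 23: end if |
| 24: Add the label to the predicted label set |
| 25: end for |
| 26: return the predicted label set |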
- Expert Feature Model Prediction: Input the target-domain expert features into the trained expert feature transfer model, which outputs the positive class prediction probability for the target domain samples. (The closer this probability is to 1, the higher the confidence that the sample is classified as “vulnerable”.)
- Semantic Feature Model Prediction: Similarly, input the target-domain semantic features into the semantic feature transfer model to obtain the positive class prediction probability for the target domain samples.
- Meta-Feature Concatenation: Concatenate the expert-feature and semantic-feature probabilities of each target domain sample into a two-dimensional target domain meta-feature vector.
- Meta-Model Probability Output: Input the meta-feature vector into the trained stacking meta-model (logistic regression) to obtain the final positive class prediction probability.
- Binary Label Conversion: Use 0.5 as the classification threshold to convert the continuous probability into a discrete label, which is the final prediction label for the target domain samples.
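The five inference steps above can be summarized in a short sketch. The function name `detect`, the toy base models, and the meta-model weights are hypothetical values chosen for illustration. The sketch also assumes that a probability equal to the threshold is labelled vulnerable and that label 1 denotes "vulnerable", which the text above does not specify.

```python
import math


def detect(expert_model, semantic_model, meta_weights, X_exp, X_sem, threshold=0.5):
    """Return (fused probabilities, binary labels) for target-domain samples."""
    w_exp, w_sem, bias = meta_weights
    fused, labels = [], []
    for x_exp, x_sem in zip(X_exp, X_sem):
        p_exp = expert_model(x_exp)        # expert feature model prediction
        p_sem = semantic_model(x_sem)      # semantic feature model prediction
        meta_feature = (p_exp, p_sem)      # two-dimensional meta-feature vector
        z = w_exp * meta_feature[0] + w_sem * meta_feature[1] + bias
        p_final = 1.0 / (1.0 + math.exp(-z))   # logistic regression meta-model
        fused.append(p_final)
        # Assumption: ">=" at the threshold; 1 = vulnerable, 0 = non-vulnerable.
        labels.append(1 if p_final >= threshold else 0)
    return fused, labels


# Hypothetical stand-ins: each "model" simply returns a precomputed probability.
expert_model = lambda x: x[0]
semantic_model = lambda x: x[0]
META_WEIGHTS = (4.0, 4.0, -4.0)            # illustrative, not learned values

fused, labels = detect(
    expert_model, semantic_model, META_WEIGHTS,
    X_exp=[[0.9], [0.2], [0.6]],
    X_sem=[[0.8], [0.1], [0.3]],
)
print(labels)  # [1, 0, 0]
```

In the third sample the expert model alone would vote "vulnerable" (0.6), but the fused probability falls below 0.5 because the semantic model disagrees. This is the kind of interaction the meta-model is meant to capture.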
4. Experiment
4.1. Research Questions
- RQ1: How effective is Decpvd in cross-project software vulnerability detection?
- RQ2: Can the effective integration of two types of metrics enhance the performance of Decpvd?
- RQ3: How do the Gated Graph Neural Network and model ensemble synergistically affect the performance of Decpvd?
4.2. Dataset
4.3. Baseline Methods
4.4. Evaluation Metrics
4.5. Implementation Details
5. Results
5.1. RQ1: How Effective Is Decpvd in Cross-Project Software Vulnerability Detection?
5.2. RQ2: Can the Effective Integration of Two Types of Metrics Enhance the Performance of Decpvd?
5.3. RQ3: How Do the Gated Graph Neural Network and Model Ensemble Synergistically Affect the Performance of Decpvd?
6. Discussion
6.1. Impact of Feature Selection and Feature Importance Weighting Approaches
6.2. Impact of Key Hyperparameters
6.3. Efficiency Comparison Between Decpvd and Baseline Models
7. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Abbreviations
| GGNN | Gated Graph Neural Network |
| PDG | Program Dependence Graph |
| AST | Abstract Syntax Tree |
| GRU | Gated Recurrent Unit |
| GAN | Generative Adversarial Network |
| RNN | Recurrent Neural Network |
| Bi-LSTM | Bidirectional Long Short-Term Memory |
| LSTM | Long Short-Term Memory |
| CNN | Convolutional Neural Network |
| GAT | Graph Attention Network |
| CPG | Code Property Graph |
| MMD | Maximum Mean Discrepancy |
| AUC | Area Under the ROC Curve |
| MCC | Matthews Correlation Coefficient |
| TPR | True Positive Rate |
| FPR | False Positive Rate |
| TP | True Positive |
| TN | True Negative |
| FP | False Positive |
| FN | False Negative |
| NVD | National Vulnerability Database |
| CVE | Common Vulnerabilities and Exposures |








| Dimension | Metric Name |
|---|---|
| Code size | CountDeclClass |
| | CountDeclFunction |
| | CountLine |
| | CountLineBlank |
| | CountLineCode |
| | CountLineCodeDecl |
| | CountLineComment |
| | CountLineInactive |
| | CountLinePreprocessor |
| Complexity | AvgCyclomatic |
| | AvgCyclomaticModified |
| | AvgCyclomaticStrict |
| | AvgEssential |
| | MaxCyclomatic |
| | MaxCyclomaticModified |
| | MaxCyclomaticStrict |
| | MaxEssential |
| | MaxNesting |
| | SumCyclomatic |
| | SumCyclomaticModified |
| | SumCyclomaticStrict |
| | SumEssential |
| Readability | AvgLine |
| | AvgLineBlank |
| | AvgLineCode |
| | AvgLineComment |
| | AltAvgLineBlank |
| | AltAvgLineCode |
| | AltAvgLineComment |
| | AltCountLineBlank |
| | AltCountLineCode |
| | AltCountLineComment |
| | RatioCommentToCode |
| Maintainability | CountStmt |
| | CountStmtDecl |
| | CountStmtEmpty |
| | CountStmtExe |
| Performance | CountLineCodeExe |
| | CountSemicolon |
| Project | Vulnerable Functions | Non-Vulnerable Functions |
|---|---|---|
| FFmpeg | 806 | 4808 |
| LibTIFF | 79 | 418 |
| LibPNG | 30 | 370 |
| Model Module | Hyperparameter Name | Value | Selection Method |
|---|---|---|---|
| word2vec | Embedding Dim. | 256 | Grid search [128, 256, 512] |
| GGNN | Embedding Dim. | 256 | Grid search [128, 256, 512] |
| GGNN | Iteration Steps | 6 | Grid search [3, 6, 9] |
| GGNN-Adam | Learning Rate | | Fine-tuning |
| GGNN-Adam | Weight Decay | | Fine-tuning |
| GGNN | Loss Function | BCELoss | Ref. classic domain studies |
| XGBoost | Learning Rate | 0.1 | Grid search [0.01, 0.1, 0.2] |
| XGBoost | Max Tree Depth | 6 | Grid search [3, 6, 10] |
| XGBoost | L2 Regularization | 1 | Grid search [0, 1, 5] |
| XGBoost | Iterations | 200 | Grid search [100, 200, 300] |
| TrAdaBoost | Early Stopping | 50 | Ref. classic domain studies |
| Stacking | K-fold CV | 5 | Classic value |
| Stacking | Classification Threshold | 0.5 | Grid search [0.3, 0.5, 0.7] |
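
To give a sense of the size of the XGBoost search space listed in the table, the sketch below enumerates the candidate configurations. It is an illustration under stated assumptions: the table does not say whether the four hyperparameters were searched jointly or one at a time, so the full Cartesian grid is assumed, and the parameter names (`learning_rate`, `max_depth`, `reg_lambda`, `n_estimators`) are taken from XGBoost's scikit-learn interface and may differ from the names the authors used.

```python
from itertools import product

# Candidate values copied from the hyperparameter table; key names are assumed.
XGB_GRID = {
    "learning_rate": [0.01, 0.1, 0.2],
    "max_depth": [3, 6, 10],
    "reg_lambda": [0, 1, 5],          # L2 regularization
    "n_estimators": [100, 200, 300],  # "Iterations" in the table
}

# Values reported as selected in the table.
SELECTED = {"learning_rate": 0.1, "max_depth": 6, "reg_lambda": 1, "n_estimators": 200}

# Every combination of one value per hyperparameter (joint grid, assumed).
candidates = [dict(zip(XGB_GRID, values)) for values in product(*XGB_GRID.values())]
print(len(candidates))  # 81
```

Under the joint-grid assumption, each of the 81 candidates would be trained and scored once per validation split, and the reported values correspond to one point of that grid.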
| Source→Target | Decpvd | CSVD-TF | DAM2P | Dual-GD-DDAN | Devign | REGVD |
|---|---|---|---|---|---|---|
| FFmpeg→LibPNG | 0.866 | 0.783 | 0.669 | 0.595 | 0.775 | 0.842 |
| FFmpeg→LibTIFF | 0.791 | 0.741 | 0.665 | 0.667 | 0.716 | 0.718 |
| LibPNG→FFmpeg | 0.845 | 0.530 | 0.658 | 0.552 | 0.628 | 0.429 |
| LibPNG→LibTIFF | 0.747 | 0.614 | 0.595 | 0.650 | 0.550 | 0.367 |
| LibTIFF→FFmpeg | 0.848 | 0.789 | 0.631 | 0.559 | 0.568 | 0.660 |
| LibTIFF→LibPNG | 0.786 | 0.658 | 0.696 | 0.746 | 0.566 | 0.423 |
| Average | 0.814 | 0.686 | 0.652 | 0.628 | 0.634 | 0.573 |
| p-value | – | * | * | * | * | * |
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Liu, Y.; Liu, B.; Wang, S.; Hu, B.; Jin, Y. Data-Driven Transferable Modeling for Cross-Project Software Vulnerability Detection via Dual-Feature Stacking Ensemble. Mathematics 2026, 14, 780. https://doi.org/10.3390/math14050780

