DLF: A Deep Active Ensemble Learning Framework for Test Case Generation

Yaogang Lu; Yibo Peng; Dongqing Zhu

doi:10.3390/info16121109

¹ Beijing New Building Materials Public Limited Company, Beijing 102209, China

² School of Cyberspace Science and Technology, Beijing Jiaotong University, Beijing 100044, China

* Author to whom correspondence should be addressed.

Information 2025, 16(12), 1109; https://doi.org/10.3390/info16121109

Abstract

High-quality test cases are vital for ensuring software reliability and security. However, existing symbolic execution tools generally rely on single-path search strategies, have limited feature extraction capability, and exhibit unstable model predictions. These limitations make them prone to local optima in complex or cross-scenario tasks and hinder their ability to balance testing quality with execution efficiency. To address these challenges, this paper proposes a Deep Active Ensemble Learning Framework for symbolic execution path exploration. During training, the framework integrates active learning with ensemble learning to reduce annotation costs and improve model robustness, while constructing a heterogeneous model pool to leverage complementary model strengths. In the testing stage, a dynamic ensemble mechanism based on sample similarity adaptively selects the optimal predictive model to guide symbolic path exploration. In addition, a gated graph neural network is employed to extract structural and semantic features from the control flow graph, improving program behavior understanding. To balance efficiency and coverage, a dynamic sliding window mechanism based on branch density enables real-time window adjustment under path complexity awareness. Experimental results on multiple real-world benchmark programs show that the proposed framework detects up to 16 vulnerabilities and achieves a cumulative 27.5% increase in discovered execution paths in hybrid fuzzing. Furthermore, the dynamic sliding window mechanism raises the F1 score to 93%.

Keywords: symbolic execution; active ensemble learning; heterogeneous model pool; gated graph neural network; dynamic sliding window
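
The dynamic ensemble step summarized in the abstract (choosing a predictive model per test sample according to sample similarity) can be illustrated with a minimal sketch. It assumes a local-accuracy selection rule: pick the pool member that is most accurate on the k validation samples nearest to the query. This is one common form of dynamic model selection and is not necessarily the rule used in DLF. The names `select_model`, `val_x`, `val_y`, and the two constant toy models are hypothetical and stand in for the heterogeneous model pool and the feature vectors.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def select_model(models, val_x, val_y, query, k=3):
    """Return the index of the model with the highest accuracy on the
    k validation samples most similar to `query` (assumed selection rule)."""
    # Indices of the k validation samples closest to the query.
    neighbours = sorted(
        range(len(val_x)), key=lambda i: euclidean(val_x[i], query)
    )[:k]

    def local_accuracy(model):
        hits = sum(model(val_x[i]) == val_y[i] for i in neighbours)
        return hits / len(neighbours)

    scores = [local_accuracy(m) for m in models]
    # On ties, the model that appears first in the pool is chosen.
    return max(range(len(models)), key=scores.__getitem__)


# Hypothetical pool: two toy predictors standing in for trained models.
models = [lambda x: 0, lambda x: 1]

# Hypothetical labelled validation samples forming two clusters.
val_x = [(0.1, 0.9), (0.2, 0.8), (0.3, 0.7), (0.9, 0.1), (0.8, 0.2), (0.7, 0.3)]
val_y = [0, 0, 0, 1, 1, 1]

left_choice = select_model(models, val_x, val_y, (0.15, 0.85))
right_choice = select_model(models, val_x, val_y, (0.85, 0.15))
```

In this toy setup each model is accurate in only one region of the feature space, so the selected index changes with the neighbourhood of the query, which is the behaviour a similarity-based dynamic ensemble aims for.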
