Proceeding Paper

Boosting Software Fault Prediction Accuracy with Ensemble Learning †

1 Department of Computer Science & Engineering, Lovely Professional University, Phagwara 144411, India
2 Informatic Engineering, Nusa Putra University, Sukabumi 43152, West Java, Indonesia
* Author to whom correspondence should be addressed.
Presented at the 7th International Global Conference Series on ICT Integration in Technical Education & Smart Society, Aizuwakamatsu City, Japan, 20–26 January 2025.
Eng. Proc. 2025, 107(1), 63; https://doi.org/10.3390/engproc2025107063
Published: 27 August 2025

Abstract

Software defects are an inherent quality characteristic of software and are difficult to eliminate completely, even with concerted effort. This study applies Bayes Net, C4.5 Decision Tree, Multilayer Perceptron (MLP), and Random Forest (RF) classifiers to software fault prediction. Moreover, an ensemble strategy combining these four models is proposed to enhance prediction accuracy. Results from empirical evaluations indicate that the accuracy, precision, recall, and F1 score of this strategy are higher than those of any individual approach, providing strong evidence that the ensemble model is an effective method for improving defect prediction performance. The ensemble approach could be a promising pathway to bolster the software quality process, particularly in machine learning-based fault prediction.

1. Introduction

Errors are part of software engineering; even if we rigorously develop and test software, errors will not be completely eliminated [1]. Detecting and preventing defects early in the software development lifecycle are important to avoid adverse effects on the project schedule, costs, and end-user experience. The capacity to anticipate software flaws prior to their emergence in functional settings has attracted significant interest in both academia and business [2]. The goal of predictive models for software defect identification is to pinpoint possibly problematic parts or modules using a variety of software metrics, code features, and past defect information [3,4]. Through foreseeing software problems in high-risk locations, developers can optimize resource allocation towards testing, debugging, and quality assurance tasks.
The process of software fault prediction (SFP) can be broken down into several steps. The first step involves determining the basic abstract information about software, like code and the development process, and using relevant metric elements to establish the metric features of the software element.
However, the dataset acquired at this step can suffer from a number of problems, including class imbalance, outliers, and missing values. Therefore, the historical database needs to be pre-processed by removing missing values, detecting outliers, normalizing data, and so on [5]. By correctly cleaning and pre-processing the dataset, it can be used in prediction studies to anticipate defects early in the SDLC, enabling the software team to achieve better outcomes and higher software quality [3,4]. Figure 1 represents the software fault prediction process.
Machine learning algorithms have become a potential tool for software fault detection in recent years. By making use of patterns and relationships found in software data, these techniques create prediction models that can be used to discover software components that are faulty [6].
While individual machine learning algorithms lead to better insights into defect prediction, it has been shown that the usage of ensemble learning approaches can lead to increases in both prediction accuracy and robustness. Ensemble methods typically combine the predictions of multiple base models to produce a final prediction that is even more reliable and precise [7,8]. Ensemble methods can outperform individual models when they take into account the diversity of models and reduce their disadvantages.
This work focuses primarily on enhancing software quality assurance processes while extending the state of the art in software fault prediction methods. Through analyzing and comparing a number of prediction methods, this research aims to identify the optimal approaches to forecasting software defects [9].
It also aims to offer this insightful knowledge to the software engineering and quality assurance domains. This investigation aims to advance software engineering practice, with the overall mission of providing developers with effective tools and techniques for the early detection and mitigation of problems [10,11]. By automating defect prediction and thereby enhancing software quality assurance processes, customers can depend on software products that are more reliable, secure, and efficient, helping the company's products gain wider adoption in their respective fields.
This study proposes a new feature selection method designed specifically for software defect prediction problems, potentially improving the relevance and performance of the related predictive models on the automated tasks they aim to solve. Using multiple datasets and rigorous cross-validation techniques, the study takes a comprehensive approach to establish the robustness and generalizability of the results. Empirical findings reveal better prediction performance than existing methods, expose previously unnoticed trends in software defect data, and show that the approach scales across a variety of project types. These results inform software development practices, enabling practitioners to act on and eliminate defects proactively. Section 1 introduces software fault prediction (SFP) and lists popular machine learning algorithms for it. Section 2 presents a summary of previous studies in the SFP research area. Section 3 describes the research method employed in this work. Section 4 displays the empirical results of the experiments. The paper concludes with a summary of the findings and implications of the research in Section 5.

2. Literature Review

In order to foresee software issues, ref. [12] presented a Harmony Search-based Cost-Sensitive Decision Tree. The strategy improved the evaluated metrics and performed better than previous approaches. It ensured effective use of software project resources by correctly predicting errors and allocating resources for quality assurance.
Ref. [13] also introduced a program source code parsing model named Multi-Kernel Transfer Convolutional Neural Network (MKT-CNN). Using abstract syntax trees (ASTs) and a CNN, the model mines transferable semantic features; hand-crafted features are also used for cross-project defect prediction (CPDP).
Ref. [14] established a prediction system using graph convolutional neural networks (GCNNs) and an end-to-end (E2E) framework to detect software defects. The framework classifies modules into faulty and normal groups using abstract syntax trees extracted from source code. The evidence demonstrated that the framework achieved better results than previous approaches.
Ref. [15] created a hybrid machine learning prediction system for software defects using a combination of Genetic Algorithms and Decision Trees. The results demonstrated that this method achieved higher accuracy than standard methods according to the comparative examination provided.
Ref. [16] evaluated ensemble machine learning algorithms for malware detection on the Windows platform, because traditional signature-based and heuristic detection face various shortcomings. Hybrid models composed of Logistic Regression (LR), Decision Tree (DT), and Support Vector Machine (SVM) proved effective for enhancing accuracy.
Ref. [17] developed a framework that enhances software process improvement (SPI) efficiency through the integration of design thinking methods into software development. Ref. [18] developed a requirement mining framework that traces components in globally distributed development environments utilizing aspect mining.
A CNN-LSTM network based on Gray Wolf Optimization (GWO) for smart home energy usage prediction produced favorable results by addressing traditional energy management system drawbacks according to [19].
Ref. [20] developed an enhanced predictive heterogeneous ensemble model for breast cancer prediction by combining different ML techniques to achieve better diagnostic accuracy. The research findings demonstrate how ensemble learning enhances breast cancer diagnosis by improving clinical decision support systems.
Ref. [21] introduced a solution for software problem prediction that combines three stages of data pre-processing with correlation analysis and machine learning techniques. The strategy reached 98.7% diagnostic accuracy while lowering maintenance expenses and program complexity, and improved software quality by eliminating faults.

3. Research Methodology

The presented methodology is based on a number of algorithms: Random Forest (RF), Multilayer Perceptron (MLP), Bayes Net, and C4.5 Decision Tree. This work proposes an ensemble algorithm for predicting defects in software, integrating Bayes Net, C4.5 Decision Tree, MLP, and RF. The utilized algorithms are discussed in the following subsections:

3.1. Bayes Net

The probabilistic graphical model known as Bayes Net, or the Bayesian Network, represents a collection of random variables and their conditional dependencies using a directed acyclic graph (DAG). It models the probability distribution of these variables and infers probabilistic correlations between them by utilizing the Bayes theorem. From a mathematical perspective, a Bayes Net is made up of a collection of nodes that symbolize random variables (1) and directed edges (2) that reflect the probabilistic connections between them.
X_1, X_2, \ldots, X_n   (1)

P(X_1, X_2, \ldots, X_n)   (2)

Each node X_i has an associated conditional probability distribution P(X_i \mid \mathrm{parents}(X_i)), where \mathrm{parents}(X_i) denotes the set of parent nodes of X_i in the graph. The joint probability distribution of all variables in the network is computed as the product of the conditional probabilities, as shown in (3).

P(X_1, X_2, \ldots, X_n) = \prod_{i=1}^{n} P(X_i \mid \mathrm{parents}(X_i))   (3)
By employing techniques like variable elimination and belief propagation, Bayes Net facilitates the effective inference of the posterior distribution of variables given observed data.
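The factorization in (3) can be illustrated with a minimal sketch in plain Python. The two-node network below (a hypothetical Fault → TestFail structure, not taken from the paper's datasets) stores one conditional probability table per node and computes joint probabilities as their product:

```python
# Hypothetical two-node network: Fault -> TestFail, illustrating the
# factorization P(F, T) = P(F) * P(T | F).
p_fault = {True: 0.1, False: 0.9}                 # P(F)
p_testfail_given = {                              # P(T | F)
    True:  {True: 0.8, False: 0.2},
    False: {True: 0.05, False: 0.95},
}

def joint(fault, testfail):
    """Joint probability as the product of each node's CPT entry."""
    return p_fault[fault] * p_testfail_given[fault][testfail]

# The joint distribution sums to 1 over all assignments.
total = sum(joint(f, t) for f in (True, False) for t in (True, False))
```

Inference in a real Bayes Net generalizes this product over many nodes; libraries implement variable elimination and belief propagation on top of exactly this factorization.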

3.2. C4.5 Decision Tree

One popular method for building classification models that employ a tree structure to contain decision rules is Ross Quinlan’s revolutionary C4.5 Decision Tree algorithm. The feature space is recursively divided into subsets according to the input feature values, where a leaf node represents a class label and an internal node indicates a choice based on a feature. The optimal characteristic for splitting at each node is chosen by C4.5 using a heuristic method that usually maximizes information gain or the gain ratio, which gauges the decrease in entropy or impurity in the data. Mathematically, the information gain I G ( T , A ) for a given attribute A in a Decision Tree T is calculated as shown in (4).
IG(T, A) = \mathrm{Entropy}(T) - \sum_{v \in \mathrm{values}(A)} \frac{|T_v|}{|T|} \, \mathrm{Entropy}(T_v)   (4)
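As a rough illustration of (4), the following Python sketch computes entropy and information gain for a toy attribute split; the data and attribute values are hypothetical, not drawn from the paper's experiments:

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(labels, attribute_values):
    """IG(T, A): entropy reduction from partitioning labels by an attribute."""
    n = len(labels)
    partitions = {}
    for label, value in zip(labels, attribute_values):
        partitions.setdefault(value, []).append(label)
    remainder = sum(len(part) / n * entropy(part) for part in partitions.values())
    return entropy(labels) - remainder

# Toy data: the attribute perfectly separates the classes,
# so the gain equals the full entropy of the label set.
labels = ["faulty", "faulty", "clean", "clean"]
attr   = ["high",   "high",   "low",   "low"]
gain = information_gain(labels, attr)
```

C4.5 additionally divides this gain by the split information to obtain the gain ratio, which penalizes attributes with many distinct values.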

3.3. Multilayer Perceptron

This is a basic type of feed-forward network (FFN). This algorithm involves many perceptrons. The outcome generated from one perceptron is fed into the next one as an input. Moreover, a non-linear function is used to assess a neuron’s condition. The Multilayer Perceptron algorithm’s general framework is shown in Figure 2.
Each neuron in the hidden layer of a Multilayer Perceptron (MLP) integrates inputs from the preceding layer, modifies them with weights, adds a bias, and then applies an activation function to generate an output. From a mathematical perspective, this appears as (5) and (6) for every neuron in the hidden layer.
net_{h_i} = \sum (\mathrm{input} \times \mathrm{weight}) + \mathrm{bias}   (5)

a_{h_i} = \mathrm{activation}(net_{h_i})   (6)

Similarly, for each neuron in the output layer, this appears as (7) and (8).

net_{o_i} = \sum (\mathrm{hidden\ output} \times \mathrm{weight}) + \mathrm{bias}   (7)

a_{o_i} = \mathrm{activation}(net_{o_i})   (8)
In this case, the activation function gives the model non-linearity, bias permits flexibility, and weights provide the relative relevance of each input. The MLP learns tasks like classification or prediction by modifying these weights and biases through a procedure known as backpropagation, which better matches the intended output.
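Equations (5)–(8) amount to one weighted-sum-plus-activation step per layer. Below is a minimal forward-pass sketch in plain Python, with illustrative (not trained) weights for a hypothetical 2-3-1 network and a sigmoid as the activation function:

```python
import math

def activation(x):
    """Sigmoid non-linearity."""
    return 1.0 / (1.0 + math.exp(-x))

def layer_forward(inputs, weights, biases):
    """Compute net_i = sum(input * weight) + bias, then a_i = activation(net_i)."""
    outputs = []
    for neuron_w, b in zip(weights, biases):
        net = sum(x * w for x, w in zip(inputs, neuron_w)) + b
        outputs.append(activation(net))
    return outputs

# Hypothetical 2-3-1 network with illustrative weights and biases.
hidden_w = [[0.5, -0.4], [0.3, 0.8], [-0.6, 0.1]]
hidden_b = [0.1, -0.2, 0.05]
output_w = [[0.7, -0.3, 0.5]]
output_b = [0.2]

def mlp_forward(x):
    """Chain the hidden and output layers, as in (5)-(8)."""
    hidden = layer_forward(x, hidden_w, hidden_b)
    return layer_forward(hidden, output_w, output_b)[0]
```

Training would adjust `hidden_w`, `hidden_b`, `output_w`, and `output_b` via backpropagation; only the forward pass is shown here.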

3.4. Random Forest

Random Forest is a powerful ensemble learning method that generates more accurate regression or classification predictions by combining the results of multiple Decision Trees [22]. In order to increase diversity and decrease correlation between trees, each Decision Tree in the forest is constructed using a fraction of the training data and a random selection of characteristics at each node. The algorithm first chooses a selection of characteristics at random for the purpose of building the tree. It next chooses the optimum split among those features, usually maximizing information gain or Gini impurity. Mathematically, the Gini impurity IG(T) for a given node T in a Decision Tree is calculated as shown in (9).
I_G(T) = 1 - \sum_{i=1}^{c} p_i^2   (9)
where c is the number of classes and p_i is the proportion of instances of class i in the node. A Random Forest produces its final forecast by aggregating the predictions of the individual trees: voting for classification and averaging for regression. Using ensemble methods results in better generalization performance and a reduced tendency to overfit compared to standalone Decision Trees. Random Forest has become widely adopted across machine learning applications because of its ease of use combined with high scalability and strong performance.
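The Gini computation in (9) and the voting step can be sketched in a few lines of Python; this illustrates the two formulas only, not the full Random Forest training procedure:

```python
from collections import Counter

def gini_impurity(labels):
    """I_G(T) = 1 - sum_i p_i^2 over the class proportions at a node."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def forest_vote(tree_predictions):
    """Aggregate per-tree class predictions by majority vote."""
    return Counter(tree_predictions).most_common(1)[0][0]
```

A pure node has impurity 0, a maximally mixed binary node has impurity 0.5, and split selection picks the feature whose children minimize the weighted impurity.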

3.5. Ensemble Model

Ensemble models improve prediction outcomes by merging the forecasts of several base models. The idea behind ensemble learning is that multiple modeling approaches working jointly yield stronger generalization than any one model alone. Bagging is an ensemble method that trains each base model on a different subset of the training data and averages their predictions. Boosting instead trains weak learners sequentially, with each successive model focusing on correcting the mistakes of its predecessors [23,24]. The ensemble prediction f̂(x) is computed as the weighted sum of the base models' predictions f_1(x), f_2(x), …, f_n(x), as shown in (10).
\hat{f}(x) = \sum_{i=1}^{n} w_i f_i(x)   (10)
The weights w_i determine the contribution of each base model. Ensemble techniques, including Random Forest, AdaBoost, Gradient Boosting, and stacking, are now heavily relied upon in machine learning because they deliver strong performance results.
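Equation (10) is a plain weighted sum. A minimal sketch follows, with hypothetical base scorers standing in for trained models:

```python
def ensemble_predict(x, base_models, weights):
    """f_hat(x) = sum_i w_i * f_i(x): weighted combination of base models."""
    return sum(w * model(x) for model, w in zip(base_models, weights))

# Hypothetical base scorers standing in for trained models.
models  = [lambda x: 0.9, lambda x: 0.6, lambda x: 0.8]
weights = [0.5, 0.2, 0.3]
score = ensemble_predict(None, models, weights)   # 0.45 + 0.12 + 0.24 = 0.81
```

In practice each f_i would be a trained classifier's score for the input, and the weights could be uniform, validation-tuned, or learned as in the weighted majority scheme of Algorithm 1.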
The proposed methodology follows a pseudo-code presented in Algorithm 1. The proposed methodology comprises four different individual models, which can be seen in Figure 3.
Algorithm 1. Pseudo-code (weighted majority)
Input: a stream of pairs (x, y), parameter β ∈ (0, 1)
Output: a stream of predictions ŷ, one for each x
1. Initialize experts C_1, …, C_N with weight w_i = 1/N each
2. for each x in the stream do
3.    collect predictions C_1(x), …, C_N(x)
4.    P ← Σ_i w_i · C_i(x)
5.    ŷ ← sign(P − 1/2)
6.    for i ← 1 to N do
7.       if C_i(x) ≠ y then w_i ← β · w_i
8.    S ← Σ_i w_i
9.    for i ← 1 to N do
10.      w_i ← w_i / S
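A Python rendering of the weighted majority scheme in Algorithm 1, assuming binary experts that output labels in {0, 1}; the toy expert functions and the stream are hypothetical:

```python
def weighted_majority(stream, experts, beta=0.5):
    """Weighted majority over binary experts (predictions in {0, 1}).

    Each round: combine expert votes by weight, predict by thresholding
    the weighted vote at 1/2, then multiply the weight of every expert
    that was wrong by beta and renormalize so the weights sum to 1.
    """
    n = len(experts)
    w = [1.0 / n] * n
    predictions = []
    for x, y in stream:
        votes = [expert(x) for expert in experts]
        p = sum(wi * ci for wi, ci in zip(w, votes))
        predictions.append(1 if p > 0.5 else 0)
        # Penalize mistaken experts, then renormalize.
        w = [wi * beta if ci != y else wi for wi, ci in zip(w, votes)]
        s = sum(w)
        w = [wi / s for wi in w]
    return predictions, w

# Hypothetical experts: the first is always right on this stream,
# the second always wrong, so weight should shift toward the first.
experts = [lambda x: x % 2, lambda x: 1 - x % 2]
stream = [(x, x % 2) for x in range(6)]
preds, final_weights = weighted_majority(stream, experts)
```

In the proposed methodology, the experts would be the trained Bayes Net, C4.5, MLP, and RF models rather than these toy functions.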

4. Results and Discussion

This study evaluated five models for predicting software defects, with Model 5 (the ensemble) being the top performer, as shown in Table 1. It achieved the highest accuracy of 90.00%, correctly classifying 90.00% of instances. Models 4 and 5 showed exceptional precision, with scores of 84.20% and 88.00%, respectively.
Model 5 also had the best recall of 90.00%, demonstrating its efficacy in flaw detection, and the highest F1 score of 89.00%, indicating its superiority in forecasting software flaws [25]. The results suggest that integrating different models through ensemble techniques significantly enhances prediction accuracy and performance [25,26,27]. The weighted majority algorithm, as shown in Figure 3, is used to combine the predictions of the individual models. Figure 4 presents the comparative performance of the five models across the accuracy, precision, recall, and F1-score metrics.
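The weighted precision, recall, and F1 figures in Table 1 follow the standard support-weighted averaging scheme: compute the metric per class, then average weighted by class frequency. A sketch of the weighted F1 computation (the metric definition is standard, not specific to this paper):

```python
from collections import Counter

def weighted_f1(y_true, y_pred):
    """Support-weighted F1: per-class F1 scores averaged by class frequency."""
    support = Counter(y_true)
    total = len(y_true)
    score = 0.0
    for cls, count in support.items():
        tp = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
        fp = sum(t != cls and p == cls for t, p in zip(y_true, y_pred))
        fn = sum(t == cls and p != cls for t, p in zip(y_true, y_pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        score += (count / total) * f1
    return score
```

Weighted precision and recall follow the same pattern with the per-class precision or recall substituted for the per-class F1.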

5. Conclusions

Software flaws are a permanent aspect of software quality, persisting even after numerous attempts to eradicate them completely. Maintaining and improving software quality remains difficult despite prevention efforts, even with sufficient resources and time. This research examined Random Forest, MLP, C4.5 Decision Tree, and Bayes Net as techniques for predicting software bugs and, building on prior research, developed an ensemble model to boost the accuracy of defect prediction.
Empirical tests show that the ensemble model outperforms the individual models in terms of accuracy, precision, recall, and F1 score. The research demonstrates that consolidating various predictive modeling technologies yields improved fault detection ability. The ensemble method demonstrates outstanding utility in software quality assurance practice, since it enables improved detection and resolution of software defects early in development. The software industry should consider adopting the methodology, because it performs defect detection successfully and can thereby become an essential foundation for software improvement efforts. Future research will build on this work by studying new domains and uniting multiple data sources, helping to evaluate the enduring impact of defect detection strategies for better software quality assurance methods.

Author Contributions

Conceptualization, A.M. and I.B.; methodology, I.B.; software, A.M.; validation, A.M. and A.F.; formal analysis, I.B.; investigation, A.F.; resources, A.F.; data curation, A.M.; writing original draft preparation, I.B.; writing review and editing, A.F.; visualization, A.M.; supervision, I.B.; project administration, I.B. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data supporting the findings of this study are available from the corresponding author upon reasonable request.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Menzies, T.; Dekhtyar, A.; Distefano, J.; Greenwald, J. Problems with precision: A response to ‘Comments on “data mining static code attributes to learn defect predictors”’. TSE 2007, 33, 637–640. [Google Scholar] [CrossRef]
  2. Lin, J.S.; Huang, C.Y. Queueing-Based Simulation for Software Reliability Analysis. IEEE Access 2022, 10, 107729–107747. [Google Scholar] [CrossRef]
  3. Akintola, A.G.; Balogun, A.; Lafenwa-Balogun, F.B.; Mojeed, H.A. Comparative Analysis of Selected Heterogeneous Classifiers for Software Defects Prediction Using Filter-Based Feature Selection Methods. FUOYE J. Eng. Technol. 2018, 3, 1. [Google Scholar] [CrossRef]
  4. Li, Z.; Niu, J.; Jing, X.Y. Software defect prediction: Future directions and challenges. Autom. Softw. Eng. 2024, 31, 1. [Google Scholar] [CrossRef]
  5. Hall, T.; Beecham, S.; Bowes, D.; Gray, D.; Counsell, S. A systematic literature review on fault prediction performance in software engineering. TSE 2012, 38, 1276–1304. [Google Scholar] [CrossRef]
  6. Alsaeedi, A.; Khan, M.Z. Software Defect Prediction Using Supervised Machine Learning and Ensemble Techniques: A Comparative Study. J. Softw. Eng. Appl. 2019, 12, 85–100. [Google Scholar] [CrossRef]
  7. Matloob, F.; Ghazal, T.M.; Taleb, N.; Aftab, S.; Ahmad, M.; Khan, M.A.; Abbas, S.; Soomro, T.R. Software defect prediction using ensemble learning: A systematic literature review. IEEE Access 2021, 9, 98754–98771. [Google Scholar] [CrossRef]
  8. Malhotra, R.; Jain, A. Fault prediction using statistical and machine learning methods for improving software quality. J. Inf. Process. Syst. 2012, 8, 241–262. [Google Scholar] [CrossRef]
  9. Mehta, A.; Kaur, N.; Kaur, A. Addressing Class Imbalance in Software Fault Prediction using BVPC-SENN: A Hybrid Ensemble Approach. Int. J. Perform. Eng. 2025, 21, 94–103. [Google Scholar] [CrossRef]
  10. Chen, J.; Li, Z.; Pan, J.; Chen, G.; Zi, Y.; Yuan, J.; Chen, B.; He, Z. Wavelet transform based on inner product in fault diagnosis of rotating machinery: A review. Mech. Syst. Signal Process. 2016, 70–71, 1–35. [Google Scholar] [CrossRef]
  11. Arun, C.; Lakshmi, C. Genetic algorithm-based oversampling approach to prune the class imbalance issue in software defect prediction. Soft Comput. 2022, 26, 12915–12931. [Google Scholar] [CrossRef]
  12. Lee, S.Y.; Wong, W.E.; Li, Y.; Chu, W.C.C. Software Fault-Proneness Analysis based on Composite Developer-Module Networks. IEEE Access 2021, 9, 155314–155334. [Google Scholar] [CrossRef]
  13. Deng, J.; Lu, L.; Qiu, S.; Ou, Y. A suitable AST node granularity and multi-kernel transfer convolutional neural network for cross-project defect prediction. IEEE Access 2020, 8, 66647–66661. [Google Scholar] [CrossRef]
  14. Šikić, L.; Kurdija, A.; Vladimir, K.; Šilić, M. Graph neural network for source code defect prediction. IEEE Access 2022, 10, 10402–10415. [Google Scholar] [CrossRef]
  15. Chennappan, R.; Thulasiraman, V. An automated software failure prediction technique using hybrid Machine learning algorithms. J. Eng. Res. 2023, 11, 100002. [Google Scholar] [CrossRef]
  16. Verma, V.; Malik, A.; Batra, I. Analyzing and Classifying Malware Types on Windows Platform using an Ensemble Machine Learning Approach. Int. J. Comput. Sci. 2024, 20, 312. [Google Scholar]
  17. Khalid, A.; Hashmi, A.; Kiani, A. Integrating Design Thinking into Software Process Improvement. In Proceedings of the 2024 International Conference on Emerging Trends in Networks and Computer Communications (ETNCC), Windhoek, Namibia, 23–25 July 2024. [Google Scholar]
  18. Ali, S.; Hafeez, Y.; Humayun, M.; Jhanjhi, N. Towards aspect based requirements mining for trace retrieval of component-based software management process in globally distributed environment. Innov. Syst. Softw. Eng. 2022, 23, 151–165. [Google Scholar] [CrossRef]
  19. Singh, T.; Solanki, A.; Sharma, S.; Jhanjhi, N. Grey Wolf Optimization-Based CNN-LSTM Network for the Prediction of Energy Consumption in Smart Home Environment. IEEE Access 2023, 11, 114917–114935. [Google Scholar] [CrossRef]
  20. Nanglia, S.; Ahmad, M.; Khan, F.; Jhanjhi, N. An enhanced Predictive heterogeneous ensemble model for breast cancer prediction. Biomed. Signal Process. Control 2022, 72, 103279. [Google Scholar] [CrossRef]
  21. Rahim, A.; Hayat, Z.; Abbas, M.; Rahim, A. Software defect prediction with naïve Bayes classifier. In Proceedings of the 2021 International Bhurban Conference on Applied Sciences and Technologies (IBCAST), Islamabad, Pakistan, 12–16 January 2021; pp. 293–297. [Google Scholar]
  22. Ray, S.K.; Sinha, R.; Ray, S.K. A smartphone-based post-disaster management mechanism using WIFI tethering. In Proceedings of the 2015 IEEE 10th Conference on Industrial Electronics and Applications (ICIEA), Auckland, New Zealand, 15–17 June 2015; pp. 966–971. [Google Scholar] [CrossRef]
  23. Mehta, A.; Kaur, N.; Kaur, A. An Ensemble Voting Classification Approach for Software defects prediction. Int. J. Inf. Technol. 2025, 17, 1813–1820. [Google Scholar] [CrossRef]
  24. Mehta, A.; Kaur, A.; Kaur, N. Optimizing Software Fault Prediction using Voting Ensembles in Class Imbalance Scenarios. Int. J. Perform. Eng. 2024, 20, 676–687. [Google Scholar] [CrossRef]
  25. Lee, S.; Abdullah, A.; Jhanjhi, N.Z. A review on honeypot-based botnet detection models for smart factory. Int. J. Adv. Comput. Sci. Appl. 2020, 11, 418–435. [Google Scholar] [CrossRef]
  26. Azeem, M.; Ullah, A.; Ashraf, H.; Jhanjhi, N.; Humayun, M.; Aljahdali, S.; Tabbakh, T.A. FoG-Oriented Secure and Lightweight Data Aggregation in IoMT. IEEE Access 2021, 9, 111072–111082. [Google Scholar] [CrossRef]
  27. Aldughayfiq, B.; Ashfaq, F.; Jhanjhi, N.Z.; Humayun, M. YOLO-Based Deep Learning Model for Pressure Ulcer Detection and Classification. Healthcare 2023, 11, 1222. [Google Scholar] [CrossRef]
Figure 1. Software fault prediction process.
Figure 2. Multilayer Perceptron.
Figure 3. Proposed methodology.
Figure 4. Result comparison.
Table 1. Result analysis.

Algorithm             | Accuracy | Precision (Weighted) | Recall (Weighted) | F1 Score (Weighted)
Bayes Net             | 74.00%   | 0.8255               | 0.7400            | 0.7763
C4.5 Decision Tree    | 80.67%   | 0.8118               | 0.8067            | 0.8092
Multilayer Perceptron | 82.67%   | 0.8267               | 0.8267            | 0.8267
Random Forest         | 88.00%   | 0.8420               | 0.8800            | 0.8549
Ensemble model        | 90.00%   | 0.8800               | 0.9000            | 0.8900