Article

Estimating Software Development Efforts Using a Random Forest-Based Stacked Ensemble Approach

by Priya Varshini A G 1, Anitha Kumari K 2 and Vijayakumar Varadarajan 3,*

1 Department of Information Technology, Dr. Mahalingam College of Engineering and Technology, Pollachi, Coimbatore 642 003, India
2 Department of Information Technology, PSG College of Technology, Coimbatore 641 004, India
3 School of Computer Science and Engineering, University of New South Wales, Sydney, NSW 2052, Australia
* Author to whom correspondence should be addressed.
Electronics 2021, 10(10), 1195; https://doi.org/10.3390/electronics10101195
Submission received: 16 March 2021 / Revised: 13 May 2021 / Accepted: 13 May 2021 / Published: 17 May 2021
(This article belongs to the Section Artificial Intelligence)

Abstract:
Software project estimation is a challenging and important activity in developing software projects. It includes software time estimation, software resource estimation, software cost estimation, and software effort estimation. Software effort estimation focuses on predicting the amount of work (effort in terms of person-hours or person-months) required to develop or maintain a software application; effort is difficult to forecast during the initial stages of software development. Various machine learning and deep learning models have been developed to predict effort. In this paper, single model approaches and ensemble approaches were considered for estimation. Ensemble techniques combine several single models; those considered for estimation were averaging, weighted averaging, bagging, boosting, and stacking. The stacking models considered and evaluated were stacking using a generalized linear model, stacking using a decision tree, stacking using a support vector machine, and stacking using a random forest. The datasets considered for estimation were Albrecht, China, Desharnais, Kemerer, Kitchenham, Maxwell, and Cocomo81, and the evaluation measures used were mean absolute error, root mean squared error, and R-squared. The results showed that the proposed stacking using random forest provides the best results compared with single model approaches using machine or deep learning algorithms and with the other ensemble techniques.

1. Introduction

Software engineering follows a systematic and cyclic approach in developing and maintaining the software [1]. Software engineering solves problems related to the software life-cycle. The life cycle of software consists of the following phases:
Inception phase
Requirement phase
Design phase
Construction phase
Testing phase
Deployment phase
Maintenance phase
Figure 1 shows the schematic diagram of SDLC phases. During the inception phase, the following are the works carried out by the project team: project goal identification, carrying out various project estimations [2], and identification of the scope of the project. During the requirement or planning phase, user needs are analyzed, and functional and technical requirements are identified. During the design phase, the establishment of architecture is carried out by considering the requirements as input.
In the construction phase, the project is implemented. At the start of the construction phase, a prototype is developed; the prototype is later implemented as a working model. In the testing phase, bugs, errors, and defects are identified and, finally, the software is assessed for quality. After successful testing, the software moves to the deployment phase, wherein it is released into the environment for use by end-users. During the maintenance phase, feedback from the end-users is received and software enhancement is carried out by the developers.
Estimation, the process of finding an approximation or estimate, is performed during the initial stage with considerable uncertain and unstable data. Estimates are used as input for project planning, iteration planning, investment analysis, budget analysis, and so on [2]. Estimation identifies the size, as well as the amount of time, human effort and skill, money [3], and resources [4] required to build a system or product. Although several models have been developed over the past two decades, effort estimation remains a challenging task, as there are many uncertain and unstable data during the initial stages of software development. Software effort estimation [5] is expressed in terms of person-hours or person-months.
Software effort estimation is carried out using the following techniques [6]: expert judgment, analogy-based estimation, function point analysis, machine learning techniques comprising regression, classification, and clustering approaches, neural network and deep learning models, fuzzy-based approaches, and ensemble methods.
Initially, software effort estimation was performed based on expert judgment [7] rather than a model-based approach. Expert judgment is simple to apply and often produces realistic estimates. The Delphi technique and the work breakdown structure are the most prevalent expert judgment techniques used for estimation. In the Delphi technique, a meeting is conducted among the project experts, and the final estimation decision is reached from the arguments made during the meeting. In the work breakdown structure, the entire project is broken down into sub-projects or sub-tasks, and the process continues until the baseline activities are reached.
Analogy-based estimation predicts effort from similar past projects and produces accurate results because it is grounded in historical data. Function point analysis approaches estimate effort by considering the number of functions required to develop the software. Machine learning techniques such as regression, classification, and clustering approaches have had a large impact on predicting software effort. The regression techniques used for effort estimation include linear regression (single and multiple), logistic regression, elastic net regression, ridge regression, LASSO regression, and stepwise regression [8].
Linear regression fits the best straight line describing the relationship between the dependent and independent variables. The difference between the estimated value and the observed value is called the error, and it should generally be minimized. The estimated value is given by

$$Y' = b_0 + b_1 X \qquad (1)$$

In Equation (1), $b_0$ is the Y-intercept, $X$ is the independent variable, and $b_1$ is the slope of the line, which is given by Equation (2):

$$b_1 = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sum (X - \bar{X})^2} \qquad (2)$$
Multiple linear regression extends linear regression to $p$ independent (predictor) variables. The estimated value, denoted $Y'$, is given as follows:

$$Y' = b_0 + b_1 X_1 + \dots + b_p X_p + \varepsilon \qquad (3)$$

In Equation (3), $b_0, b_1, \dots, b_p$ denote the coefficients, $X_1, X_2, \dots, X_p$ denote the predictor variables, and $\varepsilon$ denotes the error.
Logistic regression produces solutions only to linear problems. The sigmoid function, also called the logistic function, is given as follows:

$$Y' = \frac{1}{1 + e^{-Y}} \qquad (4)$$

In Equation (4), $Y = b_0 + b_1 X_1 + \dots + b_p X_p + \varepsilon$, where $b_0, b_1, \dots, b_p$ denote the coefficients, $X_1, X_2, \dots, X_p$ denote the predictor variables, and $\varepsilon$ denotes the error. A short illustration of these regression forms follows.
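For illustration, the following minimal R sketch fits Equations (1), (3), and (4); the data frame and its columns are hypothetical placeholders, not the paper's datasets.

```r
# Invented toy data: project size, team size, and effort.
d <- data.frame(size = c(10, 25, 40, 60, 80),
                team = c(2, 3, 3, 5, 6),
                effort = c(120, 300, 480, 750, 980))

fit_slr <- lm(effort ~ size, data = d)         # Y' = b0 + b1*X (Equation (1))
coef(fit_slr)                                  # b0 (intercept) and b1 (slope)

fit_mlr <- lm(effort ~ size + team, data = d)  # Equation (3) with p = 2 predictors

# Logistic regression applies the sigmoid of Equation (4) to a linear
# predictor; `over_budget` is a hypothetical binary outcome.
d$over_budget <- c(0, 0, 1, 0, 1)
fit_log <- glm(over_budget ~ size, data = d, family = binomial)
predict(fit_log, type = "response")            # 1 / (1 + e^(-Y))
```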
In ridge regression, an L2 regularization term is used to minimize the error between the actual and predicted values; a large number of input variables can be used, but this may introduce high bias. LASSO (Least Absolute Shrinkage and Selection Operator) regression uses L1 regularization and avoids overfitting problems. The elastic net combines the LASSO and ridge regression penalties [9]. A sketch of the three penalized regressions follows.
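The following minimal R sketch assumes the glmnet package (the paper does not name its implementation); the alpha argument selects the penalty, and the simulated x and y are placeholders.

```r
library(glmnet)
set.seed(42)
x <- matrix(rnorm(100 * 5), ncol = 5)               # 5 hypothetical predictors
y <- drop(x %*% c(2, 0, -1, 0, 0.5)) + rnorm(100)   # only 3 truly matter

fit_ridge   <- glmnet(x, y, alpha = 0)    # L2 penalty: shrinks all coefficients
fit_lasso   <- glmnet(x, y, alpha = 1)    # L1 penalty: zeroes some coefficients out
fit_elastic <- glmnet(x, y, alpha = 0.5)  # elastic net: LASSO/ridge compromise

cv <- cv.glmnet(x, y, alpha = 1)          # cross-validate to pick lambda
coef(cv, s = "lambda.min")                # sparse coefficient vector
```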
In forward selection regression, the procedure starts with the most significant predictor variable and adds further predictors step by step. In backward elimination regression, all predictor variables are initially included and, at each step, the least significant predictor is removed.
The classification approaches used for estimation were the decision tree method, the random forest approach, the SVM classifier, the KNN algorithm, and the Naïve Bayes approach. The decision tree is a simple method but is prone to overfitting on smaller training datasets. Random forest is an extension of the decision tree; it is more accurate than a single decision tree and robust against overfitting. SVM [10] works well for linear, non-linear, structured, semi-structured, and unstructured data. KNN is a statistical approach that is sensitive to noise. The Naïve Bayes approach is based on Bayes' theorem and produces good results when the input variables are independent of one another. In clustering-based approaches, clusters group points with similar data; hierarchical clustering, K-means clustering, and subtractive clustering have been used for effort estimation.
Neural network [11] and deep learning models used for effort estimation include the multi-layer feed-forward neural network, the radial basis function neural network, the cascaded neural network [12], and Deepnet. Neural network models use a layered architecture trained with the backpropagation algorithm; they suit both linear and non-linear data, but overfitting can occur (a small sketch follows). Fuzzy-based approaches are sensitive to outlier data, so fuzzy logic models [13] are mostly combined with other machine learning models for software effort estimation.
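As an illustration of such a network, the minimal sketch below assumes the nnet package (the paper does not name its neural network implementation) and uses invented, pre-scaled data.

```r
library(nnet)
set.seed(7)
d <- data.frame(size = runif(60))                # hypothetical predictor in [0, 1]
d$effort <- 0.7 * d$size + rnorm(60, sd = 0.05)  # hypothetical linear relation + noise

# One hidden layer with 3 units; linout = TRUE requests a linear output unit
# for regression, and decay adds a weight penalty to limit the overfitting
# noted above.
fit <- nnet(effort ~ size, data = d, size = 3, linout = TRUE,
            decay = 0.01, maxit = 500, trace = FALSE)
head(predict(fit, d))
```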
Ensemble techniques build robust predictive models [14]. They provide more accurate results than individual machine or deep learning models. In an ensemble, multiple models (called base learners) are combined to produce better results. The ensemble techniques considered for estimation [15,16] are averaging, weighted averaging, bagging, boosting, and stacking:
  • Averaging: the predictions of the single models are averaged (see the sketch after this list).
  • Weighted averaging: different weights are applied to the single models' predictions based on their performance, and the weighted predictions are then averaged.
  • Bagging (bootstrap aggregation): a sampling technique in which multiple bootstrap samples are drawn from the original dataset; random forest is a well-known bagging method.
  • Boosting: a sequential method that reduces bias. Boosting algorithms include XGBoost (eXtreme Gradient Boosting), GBM (Gradient Boosting Machine), and AdaBoost (Adaptive Boosting).
  • Stacking: predictions from multiple models are used to build a novel model.
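For illustration, a toy R sketch of averaging and weighted averaging is given below; the prediction vectors and weights are hypothetical.

```r
# p_rf, p_svm, and p_dt stand for hypothetical normalized effort predictions
# from a random forest, an SVM, and a decision tree on the same test set.
p_rf  <- c(0.42, 0.18, 0.75)
p_svm <- c(0.40, 0.22, 0.70)
p_dt  <- c(0.50, 0.15, 0.80)

p_avg <- (p_rf + p_svm + p_dt) / 3   # averaging: plain mean of predictions

# Weighted averaging: the weights (hypothetical, summing to 1) reflect how
# much each model is trusted, e.g. based on its validation performance.
w <- c(0.5, 0.3, 0.2)
p_wavg <- w[1] * p_rf + w[2] * p_svm + w[3] * p_dt
```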

2. Related Work

Software effort estimations are mostly carried out using the following techniques: expert judgment, analogy-based estimation, function point analysis, various machine learning techniques, which include regression techniques, classification approaches and clustering methods, neural network and deep learning models [17], fuzzy-based approaches, and ensemble methods. Table 1 describes the survey of software effort estimation using various algorithms, datasets used for estimation, evaluation measures, and findings from each paper.

3. Mathematical Modeling

3.1. Software Effort Estimation Evaluation Metrics

3.1.1. Mean Absolute Error (MAE)

The MAE is the average of the absolute errors.

$$\text{Prediction error} = y_i - \hat{y}_i \qquad (5)$$

The absolute error is |prediction error|, and the MAE is the average of all absolute errors:

$$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| y_i - \hat{y}_i \right| \qquad (6)$$

In Equation (6), $n$ is the total number of data points, $y_i$ is the original value, and $\hat{y}_i$ is the predicted value.

3.1.2. Root Mean Square Error (RMSE)

The root mean square error measures the standard deviation of the prediction errors (residuals).

$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2} \qquad (7)$$

In Equation (7), $n$ is the total number of data points, $y_i$ is the original value, and $\hat{y}_i$ is the predicted value.

3.1.3. R-Squared

R-squared is a statistical measure of the proportion of variance in the dependent variable that is predicted from the independent variables. It is found by dividing the residual sum of squares (RSS) by the total sum of squares (TSS) and subtracting the ratio from 1, as given by Equation (8). RSS is the sum of squared errors between the original values $y_i$ and the predicted values $\hat{y}_i$; TSS is the sum of squared differences between the original values $y_i$ and their mean. The R-squared value ranges between 0 and 1, and a model is preferable if its R-squared value is near or equal to 1; a negative value indicates no correlation between the data and the model. R-squared is also known as the coefficient of determination.

$$R^2 = 1 - \frac{\mathrm{RSS}}{\mathrm{TSS}} \qquad (8)$$

where RSS is the residual sum of squares and TSS is the total sum of squares.
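The three metrics translate directly into R; in the sketch below, y and y_hat are hypothetical original and predicted values.

```r
y     <- c(1.2, 0.8, 1.5, 2.0, 1.1)   # hypothetical original values
y_hat <- c(1.0, 0.9, 1.4, 2.3, 1.0)   # hypothetical predicted values

mae  <- mean(abs(y - y_hat))          # Equation (6)
rmse <- sqrt(mean((y - y_hat)^2))     # Equation (7)

rss <- sum((y - y_hat)^2)             # residual sum of squares
tss <- sum((y - mean(y))^2)           # total sum of squares
r2  <- 1 - rss / tss                  # Equation (8)
```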

3.2. Proposed Stacking Using Random Forest for Estimation

Ensemble techniques create multiple models, termed base-level classifiers, which are combined to produce better predictions than single-level models. Several techniques fall under ensembling, namely averaging, weighted averaging, bagging, boosting, and stacking. The proposed stacking using random forest (S-RF) is compared with the other stacking techniques, stacking using a generalized linear model (S-GLM), stacking using a decision tree (S-DT), and stacking using a support vector machine (S-SVM), in Section 5.1. Various ensemble techniques and single-level models are compared with the proposed stacking using random forest for software effort prediction in Section 5.2. Algorithm 1 below gives the pseudocode of the proposed stacking using random forest.
Algorithm 1. Pseudocode of proposed stacking using random forest.
1. Input: Training data $D_{train} = \{(a_i, b_i)\}$, where $a_i$ = input attributes, $b_i$ = output attribute, $i = 1$ to $n$, $n$ = number of training examples, and $T$ = number of base classifiers.
2. Output: Ensemble classifier $E$
3. Step 1: Learn the base-level classifiers (apply the first-level classifiers). The base-level classifiers considered were svmRadial, decision tree (rpart), random forest (rf), and glmnet.
4. for $t = 1$ to $T$ do
5.  learn $e_t$ based on $D_{train}$
6. end for
7. Step 2: Build a new dataset of predictions from the outputs of the base classifiers
8. for $i = 1$ to $n$ do
9.  $D_{train}^{e} = \{(a'_i, b_i)\}$, where $a'_i = (e_1(a_i), \dots, e_T(a_i))$
10. end for
11. Step 3: Learn a meta-classifier by applying the second-level classifier to the new dataset. Random forest (rf) is applied over the stacked base classifiers svmRadial, rpart, rf, and glmnet (Figure 2).
12. Learn $E$ based on $D_{train}^{e}$
13. Return $E$ (the ensemble classifier)
14. The predicted software effort estimation evaluation metrics using the ensemble approach are as follows:
15. Mean absolute error: $\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} |y_i - \hat{y}_i|$, where $n$ is the total number of data points, $y_i$ is the original value, and $\hat{y}_i$ is the predicted value.
16. Root mean squared error: $\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2}$
17. R-squared: $R^2 = 1 - \frac{\mathrm{RSS}}{\mathrm{TSS}}$, where RSS is the residual sum of squares and TSS is the total sum of squares.
In Step 1, the required packages and dataset are loaded. The seed is set to ensure that the same result is obtained whenever the process is run with the same seed. As the numerical attributes have different scaling ranges and units, a statistical technique is used to normalize the data. The summary() command shows the differing scaling ranges of the attributes. To normalize the data, a preprocessing method with range as the factor is applied; after normalization, the values lie within 0 to 1. The trainControl() method specifies the cross-validation procedure used and, by setting the class probability option to true, generates probability values instead of directly forecasting the class.
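A minimal sketch of this setup, assuming the caret package in R; the file name "albrecht.csv" and the column names are hypothetical placeholders.

```r
library(caret)
set.seed(100)                               # reproducible resampling

data <- read.csv("albrecht.csv")            # hypothetical dataset file
summary(data)                               # inspect the differing attribute ranges

pp   <- preProcess(data, method = "range")  # rescale attributes to [0, 1]
data <- predict(pp, data)

# Cross-validation settings; saved predictions are needed for stacking later.
ctrl <- trainControl(method = "repeatedcv", number = 10, repeats = 3,
                     savePredictions = "final")
```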
The first-level classifiers considered were svmRadial, decision tree (rpart), random forest (rf), and glmnet, as their pairwise correlations were less than 0.85. Low correlation among the sub-models suggests that they are diverse enough to produce a better combined classifier. The resamples() method was used to evaluate the multiple machine learning models, and dotplot() produced a graphical representation of the results. In Step 2, a new dataset was obtained from the evaluations of the first-level classifiers, namely svmRadial, decision tree (rpart), random forest (rf), and glmnet.
In Step 3, the second-level classifier was applied. As Figure 2 shows, four different second-level stackings were performed: the generalized linear model, the decision tree, the support vector machine, and the random forest, each stacked over the baseline classifiers.
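Under the same assumptions, the two-level stacking of Steps 1-3 might look as follows with the caretEnsemble package (an assumption; the paper does not name its stacking package). Here, data, ctrl, and the Effort column are placeholders carried over from the setup sketch above.

```r
library(caretEnsemble)

# Step 1: train the four base-level learners on identical resamples.
base_models <- caretList(Effort ~ ., data = data, trControl = ctrl,
                         methodList = c("svmRadial", "rpart", "rf", "glmnet"))

results <- resamples(base_models)   # evaluate the base learners
modelCor(results)                   # check pairwise correlations (< 0.85)
dotplot(results)                    # graphical comparison of the results

# Steps 2-3: form the meta-dataset from base-learner predictions and fit a
# random forest as the second-level (meta) classifier: the proposed S-RF.
stack_rf <- caretStack(base_models, method = "rf")
head(predict(stack_rf, newdata = data))
```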

4. Data Preparation

Software effort was predicted using seven benchmark datasets. The datasets were checked for missing values, and feature selection was performed based on highly correlated attributes in each dataset. Single base learner algorithms were compared with ensemble techniques, and the proposed stacked ensemble using random forest was the most effective of the compared methods. The evaluation measures employed were mean absolute error, root mean square error, and R-squared.

Software Effort Estimation Datasets

The attributes and records of the datasets are summarized in Table 2. The datasets considered for software effort estimation were Albrecht, China, Desharnais, Kemerer, Kitchenham, Maxwell, and Cocomo81. The Albrecht dataset has 8 attributes and 24 records; China, 16 attributes and 499 records; Desharnais, 12 attributes and 81 records; Kemerer, 7 attributes and 15 records; Maxwell, 27 attributes and 62 records; Kitchenham, 9 attributes and 145 records; and Cocomo81, 17 attributes and 63 records. The output attributes of the Albrecht, Kemerer, and Cocomo81 datasets are in units of person-months; those of the China, Desharnais, Maxwell, and Kitchenham datasets are in units of person-hours.
Table 3 lists the original attributes, their descriptions, and the attributes considered after feature selection for the Albrecht, China, Desharnais, Kemerer, Maxwell, Kitchenham, and Cocomo81 datasets. Attributes with high correlation values were retained after feature selection (a small sketch of this step follows).
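A sketch of this correlation-based selection is given below; the 0.5 cutoff is a hypothetical illustration, since the paper does not state its exact threshold, and the Effort column name is a placeholder.

```r
num  <- data[, sapply(data, is.numeric)]           # numeric attributes only
cors <- abs(cor(num)[, "Effort"])                  # correlation with effort
keep <- setdiff(names(cors)[cors > 0.5], "Effort") # highly correlated attributes
data_fs <- num[, c(keep, "Effort")]                # reduced dataset
```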

5. Results and Discussion

For effort prediction, the datasets considered were Albrecht, China, Desharnais, Kemerer, Maxwell, Kitchenham, and Cocomo81, and the evaluation measures were mean absolute error (MAE), root mean square error (RMSE), and R-squared. Lower MAE and RMSE values indicate a better model, as does an R-squared value closer to 1. The ensemble techniques available were averaging, weighted averaging, bagging, boosting, and stacking. The single models considered for comparison were the random forest, SVM, decision tree, neural net, ridge, LASSO, elastic net, and deep net algorithms.

5.1. Stacking Models

Herein, stacking built a novel model from multiple classifiers. The stacking models considered for evaluation were stacking using a generalized linear model (S-GLM), stacking using a decision tree (S-DT), stacking using a support vector machine (S-SVM), and stacking using a random forest (S-RF).
For the stacking approaches, the first-level classifiers considered were svmRadial, decision tree (rpart), random forest (rf), and glmnet. A new dataset was obtained from the evaluations of the first-level classifiers. Four models, a generalized linear model, a decision tree, a support vector machine, and a random forest, were each applied individually as the second-level classifier over the baseline (first-level) classifiers. MAE, RMSE, and R-squared values were computed for all four stacked approaches. A stacking model was considered better if it produced smaller errors (MAE and RMSE) and an R-squared value nearer to 1.
As Figure 3, Figure 4, Figure 5, Figure 6, Figure 7, Figure 8 and Figure 9 show, stacking using random forest produced lower errors (MAE and RMSE) than the other stacking models, and its R-squared values were also closer to 1.
Figure 3 shows that, for the Albrecht dataset, stacking using RF produced the lowest MAE and RMSE values, 0.0288617 and 0.0370489, respectively. An R-squared value near 1 is preferred; compared with the other stacking algorithms, the proposed stacking using RF had the R-squared value nearest to 1 (0.9357274).
Figure 4 shows that, for the China dataset, stacking using RF produced lower errors, with an MAE of 0.004016189 and an RMSE of 0.01562433, than the other algorithms. For a better predictive model, the R-squared value should be closer to 1; for stacking using RF, the R-squared value (0.9839643) was closest to 1.
In the Desharnais dataset, stacking using RF produced the lowest MAE and RMSE values, 0.07027704 and 0.1072363, respectively, with an R-squared value of 0.6556170. As shown in Figure 5, the proposed algorithm provided better results than the other stacking algorithms.
Figure 6 shows that stacking using random forest produced lower MAE (0.07030604) and RMSE (0.1094053) values. The R-squared value (0.7520435) was also closer to 1.
Figure 7 shows that stacking using RF produced lower MAE (0.03566583) and RMSE (0.06379541) values than the other algorithms. The R-squared value (0.8120214) was also closer to 1.
In the Kitchenham dataset, stacking using RF produced the lowest MAE and RMSE values, 0.005384577 and 0.01505940, respectively, with an R-squared value of 0.9246614. Figure 8 shows that the proposed algorithm provided better results than the other algorithms.
Figure 9 shows that stacking using RF produced lower MAE and RMSE values, 0.02278088 and 0.04415005, respectively. Compared with the other algorithms, the R-squared value of the proposed stacking using RF was nearest to 1 (0.8667750).

5.2. Proposed Stacking Using Random Forest against Single Base Learners and Ensemble Techniques for Estimation

The single base learners considered for estimation were random forest (RF), support vector machine (SVM), decision tree (DT), neural net (NN), ridge regression (Ridge), LASSO regression (LASSO), elastic net regression (EN), and deep net (DN); the ensemble techniques were averaging (AVG), weighted averaging (WAVG), bagging (BA), boosting (BS), and stacking using RF (SRF). The software used for estimation was RStudio. The evaluation measures considered were mean absolute error (MAE), root mean square error (RMSE), and R-squared.
Table 4 shows the mean absolute error values of the base learners and ensemble approaches on the seven datasets. Of the 12 compared algorithms, the proposed stacking using random forest produced the lowest mean absolute error on all seven datasets: 0.0288617, 0.004016189, 0.07027704, 0.07030604, 0.03566583, 0.005384577, and 0.02278088 for Albrecht, China, Desharnais, Kemerer, Maxwell, Kitchenham, and Cocomo81, respectively. These values show that the proposed stacking using RF produced a lower MAE than the other base learner algorithms (random forest, SVM, decision tree, neural net, ridge regression, LASSO regression, elastic net, and deep net) and the other ensemble approaches (averaging, weighted averaging, bagging, and boosting).
Table 5 shows the root mean square error values of the base learners and ensemble approaches on the seven datasets. Of the 12 compared algorithms, the proposed stacking using random forest produced the lowest root mean square error on all seven datasets: 0.0370489, 0.01562433, 0.1072363, 0.1094053, 0.06379541, 0.01505940, and 0.04415005 for Albrecht, China, Desharnais, Kemerer, Maxwell, Kitchenham, and Cocomo81, respectively. Compared with the other base learner algorithms (random forest, SVM, decision tree, neural net, ridge regression, LASSO regression, elastic net, and deep net) and the other ensemble approaches (averaging, weighted averaging, bagging, and boosting), the proposed model produced lower RMSE values.
Table 6 shows the R-squared values of the base learners and ensemble approaches on the seven datasets. Of the 12 compared algorithms, the proposed stacking using random forest produced R-squared values nearest to 1 on all seven datasets: 0.9357274, 0.9839643, 0.6556170, 0.7520435, 0.8120214, 0.9246614, and 0.8667750 for Albrecht, China, Desharnais, Kemerer, Maxwell, Kitchenham, and Cocomo81, respectively. An R-squared value near 1 indicates a good correlation between the data and the model. Thus, ensemble approaches are preferred over single models for two reasons: better predictive performance, and robustness, which reduces the spread of the predictions.
Initially, the stacking models considered for evaluation were stacking using a generalized linear model (S-GLM), stacking using a decision tree (S-DT), stacking using a support vector machine (S-SVM), and stacking using a random forest (S-RF). Stacking using random forest was found to be the best prediction model across the seven datasets on the MAE, RMSE, and R-squared metrics. The proposed stacking using random forest was then compared with the single base learners random forest (RF), support vector machine (SVM), decision tree (DT), neural net (NN), ridge regression (Ridge), LASSO regression (LASSO), elastic net regression (EN), and deep net (DN) and with the ensemble techniques averaging (AVG), weighted averaging (WAVG), bagging (BA), and boosting (BS). Based on the data in Table 4, Table 5 and Table 6, the proposed stacking using RF provided better results in terms of MAE, RMSE, and R-squared than the single model approaches and the ensemble approaches of averaging, weighted averaging, bagging, and boosting.

6. Conclusions

This paper presented software effort estimation using ensemble techniques and machine and deep learning algorithms. The ensemble techniques compared were averaging, weighted averaging, bagging, boosting, and stacking. The stacking models considered for evaluation were stacking using a generalized linear model, stacking using a decision tree, stacking using a support vector machine, and stacking using a random forest. The proposed stacking using random forest was compared with the single models random forest, SVM, decision tree, ridge regression, LASSO regression, elastic net regression, neural net, and deep net on the Albrecht, China, Desharnais, Kemerer, Maxwell, Kitchenham, and Cocomo81 datasets, and it provided the best results. The evaluation metrics considered were mean absolute error (MAE), root mean square error (RMSE), and R-squared. Such estimates serve as input for the pricing process, project planning, iteration planning, budgeting, and investment analysis. In the future, a hybrid model will be developed for better prediction of software effort.

Author Contributions

Conceptualization, P.V.A.G. and A.K.K.; methodology, P.V.A.G.; software, P.V.A.G.; validation, P.V.A.G. and A.K.K.; formal analysis, A.K.K.; investigation, A.K.K.; resources, P.V.A.G.; data curation, P.V.A.G.; writing—original draft preparation, P.V.A.G.; writing—review and editing, P.V.A.G., A.K.K. and V.V.; visualization, P.V.A.G.; supervision, A.K.K. and V.V.; project administration, V.V. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

Data available in a publicly accessible repository that does not issue DOIs. Publicly available datasets were analyzed in this study. This data can be found here: [http://promise.site.uottawa.ca/SERepository].

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Sehra, S.K.; Brar, Y.S.; Kaur, N.; Sehra, S.S. Research patterns and trends in software Effort Estimation. Inf. Softw. Technol. 2017, 91, 1–21.
  2. Sharma, A.; Kushwaha, D.S. Estimation of Software Development Effort from Requirements Based Complexity. Procedia Technol. 2012, 4, 716–722.
  3. Silhavy, R.; Silhavy, P.; Prokopova, Z. Using Actors and Use Cases for Software Size Estimation. Electronics 2021, 10, 592.
  4. Denard, S.; Ertas, A.; Mengel, S.; Ekwaro-Osire, S. Development Cycle Modeling: Resource Estimation. Appl. Sci. 2020, 10, 5013.
  5. Park, B.K.; Kim, R. Effort Estimation Approach through Extracting Use Cases via Informal Requirement Specifications. Appl. Sci. 2020, 10, 3044.
  6. Priya Varshini, A.G.; Anitha Kumari, K. Predictive analytics approaches for software Effort Estimation: A review. Indian J. Sci. Technol. 2020, 13, 2094–2103.
  7. Jorgensen, M. Practical Guidelines for Expert-Judgment-Based Software Effort Estimation. IEEE Softw. 2005, 22, 57–63.
  8. Satapathy, S.M.; Rath, S.K.; Acharya, B.P. Early stage software Effort Estimation using random forest technique based on use case points. IET Softw. 2016, 10, 10–17.
  9. Anandhi, V.; Chezian, R.M. Regression Techniques in Software Effort Estimation Using COCOMO Dataset. In Proceedings of the International Conference on Intelligent Computing Applications, Coimbatore, India, 6–7 March 2014; pp. 353–357.
  10. García-Floriano, A.; López-Martín, C.; Yáñez-Márquez, C.; Abran, A. Support vector regression for predicting software enhancement effort. Inf. Softw. Technol. 2018, 97, 99–109.
  11. Nassif, A.B.; Ho, D.; Capretz, L.F. Towards an early software estimation using log-linear regression and a multilayer perceptron model. J. Syst. Softw. 2013, 86, 144–160.
  12. Baskeles, B.; Turhan, B.; Bener, A. Software Effort Estimation using machine learning methods. In Proceedings of the 22nd International Symposium on Computer and Information Sciences, Ankara, Turkey, 7–9 November 2007.
  13. Nassif, A.B.; Azzeh, M.; Idri, A.; Abran, A. Software Development Effort Estimation Using Regression Fuzzy Models. Comput. Intell. Neurosci. 2019, 2019.
  14. Idri, A.; Hosni, M.; Abran, A. Improved Estimation of Software Development Effort Using Classical and Fuzzy Analogy Ensembles. Appl. Soft Comput. J. 2016, 49, 990–1019.
  15. Hidmi, O.; Sakar, B.E. Software Development Effort Estimation Using Ensemble Machine Learning. Int. J. Comput. Commun. Instrum. Eng. 2017, 4, 1–5.
  16. Minku, L.L.; Yao, X. Ensembles and locality: Insight on improving software Effort Estimation. Inf. Softw. Technol. 2013, 55, 1512–1528.
  17. Varshini, A.G.P.; Kumari, K.A.; Janani, D.; Soundariya, S. Comparative analysis of Machine learning and Deep learning algorithms for Software Effort Estimation. J. Phys. Conf. Ser. 2021, 1767, 12019.
  18. Idri, A.; Amazal, F.a.; Abran, A. Analogy-based software development Effort Estimation: A systematic mapping and review. Inf. Softw. Technol. 2015, 58, 206–230.
  19. Kumar, P.S.; Behera, H.S.; Kumari, A.; Nayak, J.; Naik, B. Advancement from neural networks to deep learning in software Effort Estimation: Perspective of two decades. Comput. Sci. Rev. 2020, 38, 100288.
  20. Fedotova, O.; Teixeira, L.; Alvelos, H. Software Effort Estimation with Multiple Linear Regression: Review and Practical Application. J. Inf. Sci. Eng. 2013, 29, 925–945.
  21. Abdelali, Z.; Mustapha, H.; Abdelwahed, N. Investigating the use of random forest in software Effort Estimation. Procedia Comput. Sci. 2019, 148, 343–352.
  22. Nassif, A.B.; Azzeh, M.; Capretz, L.F.; Ho, D. A comparison between decision trees and decision tree forest models for software development Effort Estimation. In Proceedings of the Third International Conference on Communications and Information Technology, Beirut, Lebanon, 19–21 June 2013.
  23. Corazza, A.; Di Martino, S.; Ferrucci, F.; Gravino, C.; Mendes, E. Using Support Vector Regression for Web Development Effort Estimation. In International Workshop on Software Measurement; Abran, A., Braungarten, R., Dumke, R.R., Cuadrado-Gallego, J.J., Brunekreef, J., Eds.; Springer: Heidelberg, Germany, 2009; Volume 5891, pp. 255–271.
  24. Marapelli, B. Software Development Effort Duration and Cost Estimation using Linear Regression and K-Nearest Neighbors Machine Learning Algorithms. Int. J. Innov. Technol. Explor. Eng. 2019, 9, 2278–3075.
  25. Hudaib, A.; Zaghoul, F.A.L.; Widian, J.A.L. Investigation of Software Defects Prediction Based on Classifiers (NB, SVM, KNN and Decision Tree). J. Am. Sci. 2013, 9, 381–386.
  26. Wu, J.H.C.; Keung, J.W. Utilizing cluster quality in hierarchical clustering for analogy-based software Effort Estimation. In Proceedings of the 8th IEEE International Conference on Software Engineering and Service Science, Beijing, China, 20–22 November 2017; pp. 1–4.
  27. Sree, P.R.; Ramesh, S.N.S.V.S.C. Improving Efficiency of Fuzzy Models for Effort Estimation by Cascading & Clustering Techniques. Procedia Comput. Sci. 2016, 85, 278–285.
  28. Rijwani, P.; Jain, S. Enhanced Software Effort Estimation Using Multi Layered Feed Forward Artificial Neural Network Technique. Procedia Comput. Sci. 2016, 89, 307–312.
  29. Nassif, A.B.; Azzeh, M.; Capretz, L.F.; Ho, D. Neural network models for software development Effort Estimation: A comparative study. Neural Comput. Appl. 2015, 2369–2381.
  30. Pospieszny, P.; Czarnacka-Chrobot, B.; Kobylinski, A. An effective approach for software project effort and duration estimation with machine learning algorithms. J. Syst. Softw. 2018, 137, 184–196.
  31. Mensah, S.; Keung, J.; Bosu, M.F.; Bennin, K.E. Duplex output software Effort Estimation model with self-guided interpretation. Inf. Softw. Technol. 2018, 94, 1–13.
  32. Singala, P.; Kumari, A.C.; Sharma, P. Estimation of Software Development Effort: A Differential Evolution Approach. In Proceedings of the International Conference on Computational Intelligence and Data Science, Gurgaon, India, 6–7 September 2019.
Figure 1. Software Development Life Cycle (SDLC) phases.
Figure 2. Stacked random forest (SRF).
Figure 3. Error rate vs. stacking models in the Albrecht dataset.
Figure 4. Error rate vs. stacking models in the China dataset.
Figure 5. Error rate vs. stacking models in the Desharnais dataset.
Figure 6. Error rate vs. stacking models in the Kemerer dataset.
Figure 7. Error rate vs. stacking models in the Maxwell dataset.
Figure 8. Error rate vs. stacking models in the Kitchenham dataset.
Figure 9. Error rate vs. stacking models in the Cocomo81 dataset.
Table 1. Software effort estimation analysis.

Idri et al. [18]
Datasets: Desharnais, ISBSG (International Software Benchmarking Standards Group), Albrecht, COCOMO, Kemerer, Maxwell, Abran, and Telecom.
Algorithm: analogy-based software effort estimation (ASEE).
Evaluation measures: MMRE (mean magnitude of relative error), MdMRE (median magnitude of relative error), and Pred(25) (prediction percentage with an MRE less than or equal to 25%).
Findings: The authors compared ASEE with eight ML and non-ML techniques: the COCOMO model, regression, expert judgment, artificial neural networks, function point analysis, support vector regression, decision trees, and radial basis functions. ASEE outperformed the eight techniques and provided higher accuracy on the three evaluation measures MMRE, MdMRE, and Pred(25). Several techniques, such as fuzzy logic, genetic algorithms, expert judgment, and artificial neural networks, have been combined with ASEE methods; fuzzy logic and genetic algorithms combined with ASEE provided good results compared with the other combined techniques. Estimation accuracy increased when ML and non-ML techniques were combined with the ASEE technique.

Suresh Kumar et al. [19]
Datasets: commonly used datasets COCOMO, NASA, and ISBSG.
Algorithm: analysis of software effort estimation using artificial neural network (ANN) algorithms.
Evaluation measures: commonly used metrics MMRE (mean magnitude of relative error) and MRE (mean relative error).
Findings: This paper analyzed software effort estimation using various ANN algorithms, such as higher-order neural networks, basic neural networks, and deep learning networks. It focused on comparing the quantitative and qualitative analysis of papers on software effort estimation and also surveyed the most frequently used datasets, the most frequent hybrid algorithms, and the most-used evaluation measures, namely MMRE, MdMRE, and MRE.

Fedotova et al. [20]
Dataset: real-time data of a mid-level multinational software development company.
Algorithm: multiple linear regression (MLR).
Evaluation measures: MRE (mean relative error), MMRE (mean magnitude of relative error), and Pred(x) (prediction percentage with an MRE less than or equal to x%).
Findings: The project was carried out using a real-time dataset of a medium-sized multinational organization. MLR was compared with the expert judgment method and produced good results. The evaluation metrics considered were MRE, MMRE, and Pred(x).

Abdelali et al. [21]
Datasets: ISBSG R8, Tukutuku, and COCOMO.
Algorithm: random forest.
Evaluation measures: Pred(0.25), MMRE, and MdMRE.
Findings: The random forest algorithm was compared with a regression tree model. The authors first explored the impact of the number of trees and the number of attributes on accuracy and found accuracy to be sensitive to these parameters. The optimized random forest outperformed the regression tree algorithm.

Nassif et al. [22]
Datasets: ISBSG 10 and Desharnais.
Algorithms: decision tree, decision tree forest, and multiple linear regression.
Evaluation measures: Pred(0.25), MMRE, and MdMRE.
Findings: The decision tree forest model was compared with the traditional decision tree model and multiple linear regression on the ISBSG and Desharnais datasets. A decision tree uses a recursive partitioning approach, whereas a decision tree forest is a collection of decision trees grown in parallel. The decision tree forest outperformed both the decision tree model and multiple linear regression on MMRE, MdMRE, and Pred(0.25).

Corazza et al. [23]
Dataset: Tukutuku.
Algorithm: support vector regression (SVR).
Evaluation measures: Pred(25), MEMRE (mean magnitude of relative error relative to the estimate), and MdEMRE (median magnitude of relative error relative to the estimate).
Findings: The authors compared SVR with stepwise regression, case-based reasoning, and a Bayesian network; SVR outperformed the compared algorithms. Four SVR kernels were considered: linear, Gaussian, polynomial, and sigmoid. Two preprocessing methods, logarithmic transformation and normalization, were applied to the dataset. SVR with a linear kernel and logarithmic transformation provided the best results.

Bhaskar [24]
Datasets: COCOMO81, COCOMO NASA, and COCOMO NASA2.
Algorithms: linear regression and K-nearest neighbors (KNN).
Evaluation measures: mean squared error (MSE) and mean magnitude of relative error (MMRE).
Findings: The authors showed that linear regression and KNN forecast effort accurately. Linear regression trains quickly and produces good results when the output attribute is linear in the input attributes; KNN is preferred when little prior knowledge about the data is available.

Amjad et al. [25]
Dataset: NASA.
Algorithms: Naïve Bayes, support vector machine, K-nearest neighbors, and decision trees.
Evaluation measures: F1, precision, and recall.
Findings: Software defect prediction was carried out using Naïve Bayes, support vector machine, K-nearest neighbors (KNN), and decision trees. KNN is a simple, statistical model; Naïve Bayes is a probabilistic model based on Bayes' theorem; decision trees use recursive partitioning; and SVM handles both structured and unstructured data. On the NASA dataset, the Naïve Bayes algorithm outperformed SVM, KNN, and the decision tree.

Wu et al. [26]
Datasets: Desharnais, Cocomo81, Maxwell, China, Nasa93, and Kemerer.
Algorithm: hierarchical clustering for analogy-based software effort estimation (ABE).
Evaluation measure: MMRE (mean magnitude of relative error).
Findings: ABE uses case-based reasoning that depends on the K value (the K-th most similar completed past project). Hierarchical clustering was used to identify an optimized set of K values, and the proposed combination showed a clear improvement over plain ABE.

Sree et al. [27]
Dataset: NASA 93.
Algorithm: fuzzy model using subtractive clustering.
Evaluation measures: variance accounted for (VAF), mean absolute relative error (MARE), variance absolute relative error (VARE), mean balanced relative error (mean BRE), MMRE, and prediction.
Findings: The fuzzy model using subtractive clustering used three rules and provided better estimates than cascading the fuzzy logic controller. The computational time of the fuzzy logic controller was high because its rule base was quite large; cascading fuzzy logic controllers reduced the rule base, and a subtractive clustering method determined the correct number of cascades.

Rijwani et al. [28]
Dataset: COCOMO II.
Algorithm: multi-layered feed-forward artificial neural network.
Evaluation measures: mean squared error (MSE) and mean magnitude of relative error (MMRE).
Findings: Artificial neural networks can handle complex datasets with various dependent and independent variables. A multi-layered feed-forward ANN with backpropagation was employed and provided better results and accuracy in forecasting effort.

Nassif et al. [29]
Dataset: ISBSG (International Software Benchmarking Standards Group).
Algorithms: multilayer perceptron, general regression neural network, radial basis function neural network, and cascade correlation neural network.
Evaluation measure: mean absolute residual (MAR).
Findings: Of the four neural network models compared, three overestimated effort, and the cascade correlation neural network outperformed the other compared algorithms on the ISBSG dataset.

Pospieszny et al. [30]
Dataset: ISBSG (1192 projects, 13 attributes).
Algorithm: ensemble averaging of three ML models: SVM (support vector machines), MLP (multi-layer perceptron), and GLM (general linear model).
Evaluation measures: MAE, MSE, RMSE, MMRE, PRED, MMER (mean magnitude of relative error to estimate), and MBRE (mean of balanced relative error).
Findings: Ensemble averaging of three machine learning algorithms (SVM, MLP, and GLM) was used for estimation on the ISBSG dataset. In the ensemble, multiple base learner models were combined for effort estimation, and the ensemble model outperformed the single models.

Mensah et al. [31]
Datasets: GitHub (Albrecht, Telecom); PROMISE (China, Cocomo, Cocomonasa1, Cocomonasa, Cosmic, Desharnais, Kemerer, Kitchenham, Maxwell, Miyazaki); industry (php_projects).
Algorithms: regression-based effort estimation techniques: ordinary least squares regression (OLS), stepwise regression (SWR), ridge regression (RR), LASSO regression, and elastic net regression.
Evaluation measures: mean absolute error (MAE), balanced mean magnitude of relative error (BMMRE), and adjusted R2.
Findings: Software effort estimation models suffer from a drawback termed conclusion instability. In this paper, 14 datasets were considered for estimation; each dataset was first grouped by its effort attribute into high, low, and medium classes (the first output), and six regression models were then applied to the effort classes to predict accuracy (the second output). Elastic net regression outperformed the other compared algorithms.

Prerna et al. [32]
Datasets: cocomo81 and nasa93.
Algorithm: differential evolution (DE) approach.
Evaluation measure: MMRE.
Findings: The DE approach was applied to the COCOMO and COCOMO II models on datasets from the PROMISE repository. It exhibited low computational complexity and memory utilization, and the proposed DE-based COCOMO and COCOMO II provided better effort estimates than the original models.
Table 2. Dimensions of the datasets.

| Dataset Name | Source Repository | No. of Records | No. of Attributes | Output Attribute: Effort (Unit) |
|---|---|---|---|---|
| Albrecht | PROMISE | 24 | 8 | Person-months |
| China | PROMISE | 499 | 16 | Person-hours |
| Desharnais | GITHUB | 81 | 12 | Person-hours |
| Kemerer | GITHUB | 15 | 7 | Person-months |
| Maxwell | PROMISE | 62 | 27 | Person-hours |
| Kitchenham | GITHUB | 145 | 9 | Person-hours |
| Cocomo81 | GITHUB | 63 | 17 | Person-months |
Table 3. Datasets: original attributes (with descriptions) and attributes considered after feature selection.

Albrecht
Original attributes: Input (count of input functions), Output (count of output functions), Inquiry (count of query functions), File (count of file processing), FPAdj (function point), RawFPcouns (raw function points), Adjfp (adjusted function points), Effort (effort in person-months).
After feature selection: Output, Inquiry, RawFPcouns, Adjfp, Effort.

China
Original attributes: AFP (adjusted function points), Input (function points of input), Output (function points of external output), Enquiry (function points of external output enquiry), File (function points of internal logical files), Interface (function points of external interface added), Added (function points of added functions), Changed (function points of changed functions), PDR_AFP (productivity delivery rate, adjusted function points), PDR_UFP (productivity delivery rate, unadjusted function points), NPDR_AFP (normalized productivity delivery rate, adjusted function points), NPDU_UFP (productivity delivery rate, unadjusted function points), Resource (team type), Duration (total elapsed time for the project), N-Effort (normalized effort), Effort (summary work report).
After feature selection: AFP, Output, File, Interface, Added, PDR_AFP, NPDR_AFP, NPDU_UFP, N-Effort, Effort.

Desharnais
Original attributes: Project (project number), TeamExp (team experience in years), ManagerExp (project manager's experience in years), YearEnd (year of completion), Length (length of the project), Transactions (number of transactions processed), Entities (number of entities), PointsNonAdjust (unadjusted function points), Adjustment (adjustment factor), PointsAjust (adjusted function points), Language (programming language), Effort (measured in person-hours).
After feature selection: Transactions, PointsNonAdjust, PointsAjust, Effort.

Kemerer
Original attributes: Language (programming language), Hardware (hardware resources), Duration (duration of the project), KSLOC (kilo lines of code), AdjFP (adjusted function points), RawFP (unadjusted function points), Effort (measured in person-months).
After feature selection: Duration, KSLOC, AdjFP, RawFP, Effort.

Maxwell
Original attributes: Year (time), App (application type), Har (hardware platform), Dba (database), Ifc (user interface), Source (where developed), Telonuse (Telon use), Nlan (number of development languages), T01 (customer participation), T02 (development environment adequacy), T03 (staff availability), T04 (standards use), T05 (methods use), T06 (tools use), T07 (software logical complexity), T08 (requirements volatility), T09 (quality requirements), T10 (efficiency requirements), T11 (installation requirements), T12 (staff analysis skills), T13 (staff application knowledge), T14 (staff tool skills), T15 (staff team skills), Duration (duration in months), Size (application size in FP), Time (time taken), Effort (work carried out in person-hours).
After feature selection: Year, Source, Nlan, T05, T09, T15, Duration, Size, Time, Effort.

Kitchenham
Original attributes: Clientcode (client code {1,2,3,4,5,6}), Projecttype (project type {A,C,D,P,Pr,U}), Startdate (starting date of the project), Duration (duration of the project), Adjfp (adjusted function points), Completiondate (completion date of the project), Estimate (effort estimate), Estimate method (estimate method {A,C,CAE,D,EO,W}), Effort (work carried out in person-hours).
After feature selection: Duration, Adjfp, Estimate, Effort.

Cocomo81
Original attributes: Rely (required software reliability), Data (database size), Cplx (complexity of product), Time (time constraint), Stor (storage constraint), Virt (virtual machine volatility), Turn (computer turnaround time), Acap (analyst capability), Aexp (application experience), Pcap (programmer capability), Vexp (virtual machine experience), Lexp (programming language experience), Modp (modern programming practices), Tool (software tools use), Sced (development schedule), Loc (lines of code), Effort (work carried out in person-months).
After feature selection: Rely, Data, Time, Stor, Acap, Modp, Sced, Loc, Effort.
Table 4. Base learners and ensemble techniques vs. MAE of the 7 datasets.

Mean Absolute Error (MAE)
| Algorithm | Albrecht | China | Desharnais | Kemerer | Maxwell | Kitchenham | Cocomo81 |
|---|---|---|---|---|---|---|---|
| Random Forest | 0.1940703 | 0.03832486 | 0.1161946 | 0.2076936 | 0.2330224 | 0.1047124 | 0.06047924 |
| SVM | 0.271210 | 0.04872961 | 0.1052762 | 0.219517 | 0.311288 | 0.1189372 | 0.07500071 |
| Decision Tree | 0.2299442 | 0.0222997 | 0.1067227 | 0.3709271 | 0.2896955 | 0.1174008 | 0.08838886 |
| Neuralnet | 0.2208348 | 0.01775594 | 0.08394905 | 0.2179616 | 0.2245069 | 0.03921013 | 0.05387155 |
| Ridge | 0.2495593 | 0.01977087 | 0.08810373 | 0.1971335 | 0.2162983 | 0.04342892 | 0.07912894 |
| LASSO | 0.2672776 | 0.01344521 | 0.08731722 | 0.183967 | 0.2106617 | 0.03876806 | 0.08097285 |
| ElasticNet | 0.2522743 | 0.01406866 | 0.08874032 | 0.1890074 | 0.211274 | 0.03961122 | 0.07980254 |
| Deepnet | 0.2702398 | 0.09730134 | 0.1913845 | 0.4011895 | 0.2901784 | 0.1400501 | 0.2548906 |
| Averaging | 0.2133583 | 0.01996002 | 0.09603559 | 0.239746 | 0.09006789 | 0.01797183 | 0.06160017 |
| Weighted Averaging | 0.1658832 | 0.05308728 | 0.1378911 | 0.1811016 | 0.1211871 | 0.02715146 | 0.06493144 |
| Bagging | 0.1784421 | 0.01207605 | 0.1195285 | 0.2042914 | 0.2356023 | 0.009002502 | 0.08604002 |
| Boosting | 0.1237017 | 0.01059957 | 0.1072595 | 0.2345743 | 0.08203951 | 0.009002502 | 0.08095949 |
| Proposed Stacking using RF | 0.0288617 | 0.004016189 | 0.07027704 | 0.07030604 | 0.03566583 | 0.005384577 | 0.02278088 |
Table 5. Base learners and ensemble techniques vs. RMSE of the 7 datasets.

Root Mean Squared Error (RMSE)
| Algorithm | Albrecht | China | Desharnais | Kemerer | Maxwell | Kitchenham | Cocomo81 |
|---|---|---|---|---|---|---|---|
| Random Forest | 0.2273109 | 0.0651549 | 0.1744573 | 0.2357751 | 0.3098055 | 0.1715596 | 0.1402015 |
| SVM | 0.291869 | 0.109964 | 0.1993475 | 0.2635159 | 0.4006657 | 0.2379881 | 0.1976564 |
| Decision Tree | 0.3140069 | 0.05380968 | 0.1724028 | 0.3954624 | 0.3897197 | 0.2013018 | 0.1869692 |
| Neuralnet | 0.2676081 | 0.0439852 | 0.1508566 | 0.3219353 | 0.2914014 | 0.0739 | 0.09447618 |
| Ridge | 0.274339 | 0.03703651 | 0.1482106 | 0.2810334 | 0.2901234 | 0.08774596 | 0.1592412 |
| LASSO | 0.2952834 | 0.02381411 | 0.1455049 | 0.256731 | 0.2859893 | 0.07402251 | 0.1766792 |
| ElasticNet | 0.2771694 | 0.02462756 | 0.1463786 | 0.26093 | 0.2865395 | 0.07542022 | 0.1747383 |
| Deepnet | 0.3222748 | 0.1497018 | 0.2396687 | 0.4142278 | 0.3571974 | 0.2242306 | 0.285657 |
| Averaging | 0.2403676 | 0.06114447 | 0.1569477 | 0.2719905 | 0.1386882 | 0.04447122 | 0.1272475 |
| Weighted Averaging | 0.2789538 | 0.1214186 | 0.2275768 | 0.3253842 | 0.2096006 | 0.1059557 | 0.1831177 |
| Bagging | 0.2264107 | 0.04121619 | 0.1764452 | 0.2399795 | 0.3120326 | 0.02091779 | 0.176543 |
| Boosting | 0.184356 | 0.03618514 | 0.162724 | 0.2626449 | 0.1215997 | 0.02091779 | 0.1693169 |
| Proposed Stacking using RF | 0.0370489 | 0.01562433 | 0.1072363 | 0.1094053 | 0.06379541 | 0.01505940 | 0.04415005 |
Table 6. Base learners and ensemble techniques vs. R-squared of the seven datasets.

R-Squared
| Algorithm | Albrecht | China | Desharnais | Kemerer | Maxwell | Kitchenham | Cocomo81 |
|---|---|---|---|---|---|---|---|
| Random Forest | 0.4732279 | 0.8028527 | 0.38059 | 0.6032962 | 0.109088 | 0.3928102 | 0.6062928 |
| SVM | 0.1315232 | 0.4384377 | 0.1912367 | 0.5044536 | -0.4901189 | -0.1684357 | 0.2174898 |
| Decision Tree | 0.0052189 | 0.8655325 | 0.3950932 | -0.1160428 | -0.4098123 | 0.1640318 | 0.2998227 |
| Neuralnet | 0.2699029 | 0.9101517 | 0.5368423 | 0.2603814 | 0.2117937 | 0.8873365 | 0.8212226 |
| Ridge | 0.232714 | 0.9362975 | 0.5529472 | 0.4363801 | 0.2186921 | 0.8411641 | 0.4920992 |
| LASSO | 0.1110852 | 0.9736631 | 0.5691211 | 0.5296434 | 0.2407999 | 0.8869626 | 0.3747712 |
| ElasticNet | 0.2168 | 0.9718331 | 0.563931 | 0.5141316 | 0.2378761 | 0.8826536 | 0.3884329 |
| Deepnet | -0.05885081 | -0.0407606 | -0.169022 | -0.2244724 | -0.1843314 | -0.037252 | -0.6343981 |
| Averaging | 0.4109746 | 0.8263755 | 0.498686 | 0.4720681 | 0.5942589 | 0.9096699 | 0.6756852 |
| Weighted Averaging | 0.2066829 | 0.3153521 | -0.05403761 | 0.2444497 | 0.07326536 | 0.4872298 | 0.3283723 |
| Bagging | 0.4773922 | 0.9211081 | 0.3663932 | 0.5890217 | 0.09623293 | 0.9800149 | 0.3757349 |
| Boosting | 0.6855181 | 0.9478667 | 0.461106 | 0.5907376 | 0.6880857 | 0.9800149 | 0.425793 |
| Proposed Stacking using RF | 0.9357274 | 0.9839643 | 0.6556170 | 0.7520435 | 0.8120214 | 0.9246614 | 0.8667750 |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
