Particle Swarm Optimization for Predicting the Development Effort of Software Projects

Abstract: Software project planning includes, as one of its main activities, software development effort prediction (SDEP). Effort (measured in person-hours) is useful for budgeting and bidding projects, and it is one of the most frequently predicted variables; indeed, hundreds of studies on SDEP have been published. We propose the application of the Particle Swarm Optimization (PSO) metaheuristic for optimizing the parameters of statistical regression equations (SRE) applied to SDEP. Our proposal incorporates two elements into PSO: the selection of the SDEP model and the automatic adjustment of its parameters. The prediction accuracy of the SRE optimized through PSO (PSO-SRE) was compared to that of an SRE model. These models were trained and tested using eight data sets of new and enhancement software projects obtained from an international public repository of projects. Results based on statistical significance showed that the PSO-SRE was better than the SRE in six data sets at 99% confidence, in one data set at 95% confidence, and statistically equal to the SRE in the remaining data set. We conclude that PSO can be used for optimizing SDEP equations, taking into account the type of development, development platform, and programming language type of the projects.


Introduction
Software engineering management involves planning [1]. Software project planning includes software prediction, and the most commonly predicted variables have been size [2] (mainly measured in either source lines of code or function points [3]), effort (in person-hours or person-months [3]), duration (in months [4]), and quality (in defects [5]).
Software development effort prediction (SDEP), also termed effort estimation or cost estimation [6], is needed for managers to estimate the monetary cost of projects. As a reference, in the USA the cost per person-month (which is equivalent to 152 person-hours) is $8000 USD [7].
Unfortunately, projects that take more time (i.e., time overrun) cost more money (i.e., cost overrun) [8], and cost overrun has been identified as a chronic problem in most software projects [9]; with cost underrun, a portion of the budgeted money is not spent, and taxes then have to be paid on it. These cost-related issues are the reason a software project is often assessed by its ability to meet the budgeted cost [10,11].

The SRE models of the present study were generated through regression analysis and from data sets of projects selected from an international public repository of recent software projects (i.e., the International Software Benchmarking Standards Group, ISBSG release 2018). The software projects were selected based on their type of development (TD), development platform (DP), and programming language type (PLT), as suggested in the ISBSG guidelines [41]. The ISBSG has been widely used for SDEP models [42].
The size of a software project is a common variable used for SDEP [3]; therefore, our models use it as the independent variable. In our study, the size measure is function points, whose value is calculated from the nineteen variables mentioned in Section 4 of the present study (i.e., adjusted function points, AFP) [13].
The justification for the comparison between the prediction accuracy of our PSO-SRE with that obtained from SRE is based on the following issues related to SDEP: (a) The prediction accuracy of any new proposed model should at least outperform a SRE [43]. (b) SRE has been the model whose prediction accuracy has mostly been compared to other models such as those based on ML [44,45]. (c) The prediction accuracy of SRE has outperformed the accuracies obtained from ML models [44].
Since statistical analysis is needed for the validity of studies [46], the data preprocessing and our conclusions are based on statistical analysis involving the identification of outliers and the coefficients of correlation and determination of the data, as well as on a suitable statistical test for comparing the prediction accuracy of PSO-SRE and SRE.
A systematic literature review published in 2018, which analyzed studies published between 1981 and 2016 on SDEP models, recommends the use of the same data sets and the same prediction accuracy measure so that conclusions can be compared across studies [3]. This recommendation was made after its authors found it difficult to compare the performance of SDEP models owing to the wide diversity of data sets and accuracy measures used. Thus, in our study, the models were applied to the same data sets and evaluated with the same accuracy measure (i.e., absolute residual, AR). Moreover, they were trained and tested using the same validation method (i.e., leave-one-out cross-validation, LOOCV, which is recommended for software effort model evaluation [47]).
In the present study, the null (H0) and alternative (H1) hypotheses to be tested are the following:

H0. The prediction accuracy of the PSO-SRE is statistically equal to that of the SRE when the two models are applied to predict the development effort of software projects using the AFP as the independent variable.

H1. The prediction accuracy of the PSO-SRE is statistically different from that of the SRE when the two models are applied to predict the development effort of software projects using the AFP as the independent variable.
The remainder of the present study is organized as follows: Section 2 describes the related studies in which PSO has been applied to predict the development effort of software projects. Section 3 describes the Particle Swarm Optimization (PSO) metaheuristic and our proposal, the PSO-SRE. Section 4 presents the criteria applied to select the data sets of software projects following the ISBSG guidelines, as well as the data preprocessing. Section 5 presents the results of PSO-SRE and compares its prediction accuracy to that of SRE once the two models were trained and tested. Section 6 presents our conclusions. Finally, Section 7 is a discussion, including a comparison with previous studies, the limitations of our study, validity threats, and directions for future work.

Related Work
The proposed SDEP techniques have been systematically analyzed in several reviews [3,6,12,44,45,48-51]. They can be classified into those not based on models and those based on models. The first type is also termed expert judgment [48,52], whereas the latter can be classified into two categories: statistical [53] and ML models [44,45]. Table 1 presents an analysis of the ten studies identified in which PSO was applied to SDEP. It includes the data set(s) of software projects, the number of projects by data set, the prediction accuracy measure, the validation method, and whether the result was reported based on statistical significance (if so, the name of the statistical test is mentioned). A description of the proposal and results of each study follows. Table 1. Studies on SDEP based on PSO (AR: absolute residual, BRE: balanced relative error, IBRE: inverted balanced relative error, LSD: logarithmic standard deviation, LOOCV: leave-one-out cross validation, MRE: magnitude of relative error, r2: coefficient of determination, NS: not specified).

Azzeh et al. [33] use PSO to find the optimum solutions for variables related to multiple evaluation measures when applied to CBR. Their results show that CBR improves when all variables are taken into account together.
Bardsiri et al. [34] apply PSO to optimize the CBR weights. The PSO algorithm assigns weights to the features considered in the similarity function. The accuracy of their proposal is compared to those obtained from three types of CBR, as well as to those obtained from neural networks, classification and regression trees, and statistical regression models. Results show that the prediction accuracy of the CBR with PSO was better than that of all the mentioned models.
Bardsiri et al. [35] use PSO in combination with CBR to design a weighting system in which the project attributes of different clusters of software projects are given different weights. The performance of their proposal is better than the prediction accuracy obtained when neural networks, classification and regression trees, and statistical regression models are applied.
Chhabra and Singh [29] first compare the prediction accuracy of three models termed Regression-Based COCOMO, Fuzzy COCOMO, and PSO Optimized Fuzzy COCOMO. In the latter, they use PSO to optimize the fuzzy logic model parameters. They then also compare its performance to that of a GA Optimized Fuzzy COCOMO. Their results show that the PSO Optimized Fuzzy COCOMO has better prediction accuracy than the other three models. They conclude that PSO can be applied as an optimizer for a fuzzy logic model.
Hosni et al. [40] apply PSO for setting the ensemble parameters of four ML models. They compare the performance of PSO to that of grid search. They conclude that PSO and grid search show the same predictive capability when applied to k-nearest neighbor, support vector regression, neural networks, and decision trees.
Khuat and Le [25] propose an algorithm combining the PSO and ABC algorithms for optimizing the parameters of a SDEP formula. This formula is generated by using two independent variables obtained from agile software projects (i.e., final velocity, and story point). The accuracy results by applying this formula are compared to those obtained from four types of neural networks (i.e., general regression neural network, probabilistic neural network, group method of data handling polynomial neural network, and cascade correlation neural network). The performance of the algorithm based on PSO and ABC was better than those obtained from the four mentioned neural networks.
Sheta et al. [38] use PSO for optimizing the parameters of the COCOMO equation (termed PSO-COCOMO). They also build a fuzzy system. The PSO-COCOMO has a better performance than those obtained when the SDEP equations proposed by Halstead, Walston-Felix, Bailey-Basili, and Doty are applied.
Wu et al. [36] use PSO to optimize the CBR weights. They employ Euclidean, Manhattan, and grey relational grade distances as metrics to calculate the similarity measures. Their results show that the weighted CBR generates better prediction accuracy than unweighted CBR methods. They conclude that the combined method integrating PSO and CBR improves the performance for the three mentioned measures.
Wu et al. [37] use PSO in combination with six CBR methods. These methods differ by their type of distance measure (i.e., Euclidean, Manhattan, Minkowski, grey relational coefficient, Gaussian, and Mahalanobis). Results show that the combination of methods proposed by them has a better performance than independent methods, and that the weighted mean combination method has a better result.
Zare et al. [32] apply PSO to obtain the optimal updating coefficient of effort prediction, based on the concept of optimal control, by modifying the predicted value of a Bayesian belief network. Its performance is compared to that obtained when GA is applied. The results of their proposed model indicate that the optimal updating coefficient obtained by GA increases the prediction accuracy significantly in comparison with that obtained from PSO.
In accordance with Table 1, only two studies used an unbiased prediction accuracy measure (i.e., AR), only two used a deterministic validation method (i.e., LOOCV), half based their conclusions on statistical significance, and none involved a recent repository of software projects: Albrecht was published in 1983, Canadian organization in 1996, COCOMO in 1981, Desharnais in 1988, IBM in 1994, Kemerer in 1987, Maxwell in 1993, Miyazaki in 1994, Nasa in 1981, Telecom in 1997, and the most recent ISBSG release used was published in 2011. Regarding the data set of projects from six software organizations, it was published in 2012; however, it is small: 21 projects [25]. Finally, the year of the China projects was not reported in the studies that used them [33,40].
In the four studies where the ISBSG data set was used, the releases were 8 [40], 10 [33], and 11 [34,35], whose years of publication and sizes were 2003 with 2000 software projects, 2007 with 4000, and 2009 with 5052, respectively. When release 8 was used, the authors selected a data set of 148 projects based on the following ISBSG criteria: TD (new), quality rating ("A" and "B" categories), resource level with 1 as value, maximum number of people working on the project, number of business units, and IFPUG as the functional sizing method (FSM) type [40]. As for release 10, they selected a data set of 505 projects taking into account only one criterion suggested by the ISBSG: the quality rating ("A") [33]. As for release 11: (a) they selected a data set of 134 projects based on three ISBSG criteria: quality rating ("A" and "B"), normalized effort ratio of up to 1.2, and "Insurance" as the value of the organization type attribute [34], and (b) they selected a data set of 380 projects based on the quality rating attribute ("A" and "B"), TD, organization type, DP, normalized effort ratio of up to 1.2, resource level with 1 as value, and IFPUG as FSM [35]. That is, in all four studies, only one data set was selected per study, and the FSM type was not taken into account, so IFPUG versions were mixed when selecting the data set.
In accordance with the analysis of these ten studies, PSO has been used in three fundamental manners: (a) as a tool to support CBR [33-37,40], (b) for the selection of the SDEP model [38], and (c) for the optimization of the values of a SDEP model [25,29,32]. In our opinion, the manner in which PSO was used in these studies has the following disadvantages: (a) an increase in the computational cost inherent to CBR models by incorporating optimization techniques; (b) allowing the best SDEP model to be selected from a set of predefined models, but without automatic adjustment of the parameters of the selected model; and (c) defining the SDEP model a priori and only adjusting its parameters.
Taking into account these weaknesses, our proposal incorporates the following two elements in PSO: (1) The selection of the SDEP model, and (2) The automatic adjustment of the SDEP model parameters.
The analysis of Table 1 also allows us to emphasize our experimental design, which involves new and enhancement software projects selected based on their TD, DP, PLT, and FSM. The data of these projects are preprocessed through an outlier analysis and the calculation of two types of coefficients: correlation and determination. The models are trained and tested based on AR while a LOOCV is applied. Finally, the hypotheses of our study are statistically tested.

Particle Swarm Optimization

Particle Swarm Optimization (PSO) is an optimization model created in 1995 by Kennedy and Eberhart [54]. It assumes that there is a cloud of particles which "fly" in a D-dimensional space. This original idea was refined three years later with the introduction of memory into the particles [55]. Particles have access to two types of memory: individual memory (the best position occupied by the particle in space) and collective memory (the best position occupied by the cloud in space). The evolution of the original PSO has continuously been analyzed [15].
In PSO, the size of the particle cloud np (number of particles) is a user parameter. In the cloud, each particle i stores the following three real vectors of D dimensions: the current position vector x_i, the vector of the best position reached p_i, and the flight speed vector v_i. In addition, the cloud or swarm stores the best global position vector g_best.
The movement of the particles is defined as a change in their position obtained by adjusting a velocity vector, component by component. To do this, the particles use individual memory and collective memory. The j-th component of the velocity vector of the i-th particle is updated as:

v_ij = w · v_ij + c_1 · rand(0, 1) · (p_ij − x_ij) + c_2 · rand(0, 1) · (g_best,j − x_ij)

where w is the inertia weight, c_1 is the individual memory coefficient, and c_2 is the global memory coefficient. The function rand(0, 1) represents the generation of a random number in the [0, 1] interval. If the velocity components exceed the established limits, they are bounded, so that V_min ≤ v_ij ≤ V_max. Subsequently, the j-th component of the current position vector of the i-th particle is adjusted as:

x_ij = x_ij + v_ij

This adjustment of the particle positions is repeated until a stop condition is achieved, which is usually set as a number of algorithm iterations.
The pseudocode of the PSO algorithm described by Shi and Eberhart [55] is shown in Figure 1. It assumes that it is intended to minimize an objective function.
Mathematics 2020, 8, 1819
In terms of complexity and execution time, PSO has two nested outer loops: the generation loop and a loop through the entire population. In each loop, the optimization function is computed for each member of the population (particle). Considering k as the cost of computing the optimization function, the execution time of PSO is bounded by O(it · np · k), where it is the number of iterations and np the swarm size.
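To make the loop structure and the update rules concrete, the algorithm can be sketched as follows. This is a minimal illustration, not the paper's exact implementation: the objective function, search bounds, and default parameter values are ours.

```python
import random

def pso_minimize(f, dim, n_particles=30, iters=200,
                 w=0.1, c1=1.5, c2=1.5, vmin=-10.0, vmax=10.0,
                 lo=-100.0, hi=100.0, seed=42):
    """Minimal PSO with inertia weight, in the style of Shi and Eberhart."""
    rnd = random.Random(seed)
    x = [[rnd.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    v = [[0.0] * dim for _ in range(n_particles)]
    p = [xi[:] for xi in x]                       # individual memory: best positions
    p_val = [f(xi) for xi in x]                   # individual memory: best values
    g = min(range(n_particles), key=lambda i: p_val[i])
    g_best, g_val = p[g][:], p_val[g]             # collective memory
    for _ in range(iters):                        # generation loop
        for i in range(n_particles):              # population loop
            for j in range(dim):
                v[i][j] = (w * v[i][j]
                           + c1 * rnd.random() * (p[i][j] - x[i][j])
                           + c2 * rnd.random() * (g_best[j] - x[i][j]))
                v[i][j] = max(vmin, min(vmax, v[i][j]))  # bound the velocity
                x[i][j] += v[i][j]                # position update
            val = f(x[i])
            if val < p_val[i]:                    # update individual memory
                p[i], p_val[i] = x[i][:], val
                if val < g_val:                   # update collective memory
                    g_best, g_val = x[i][:], val
    return g_best, g_val
```

For example, `pso_minimize(lambda z: sum(t * t for t in z), dim=2)` drives the sphere function toward its minimum at the origin.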

PSO-SRE
We use a PSO design that considers an additional element: the SDEP model. Thus, the 20 functions detailed in Table 2 were taken into account. The independent variable x used in these 20 functions corresponds to the FSM (i.e., AFP). We add an additional component to each particle: an integer in the interval [1, 20] that represents the SDEP model to be used by the particle. Consequently, each particle has a different number of dimensions, which varies according to the coefficients of the selected model. This modification allows us to simultaneously optimize the SDEP model to be selected as well as its parameters. Figure 2 shows a swarm of five particles used by our proposed algorithm. The pseudocode of the proposal is shown in Figure 3. Table 2. Description of the analyzed models.

The analyzed models (Table 2) include: the linear equation y = a + bx; exponential decrease or increase between limits; double exponential decay to zero; power; asymptotic equation; asymptotic regression model; logarithmic; the "plateau" (Michaelis-Menten) curve y = ax/(b + x); yield-loss/density curves; logistic curves with additional parameters; a logistic curve with an offset on the y-axis; trigonometric functions, including y = sin(ax) + sin(bx); quadratic polynomial regression; and cubic polynomial regression y = a + bx + cx² + dx³ [57].

Figure 2. Swarm of five particles; the first value of each particle corresponds to the number of the model shown in Table 2, and the remaining values are the parameters to optimize of the selected model.

Based on Figure 2, the final value to be compared for each particle is obtained by taking the first value, which corresponds to the number of the model shown in Table 2, together with the following n values, which correspond to the model parameters. For example, for the particle [1, 0.24, 0.18], the first value (1) corresponds to model 1 of Table 2, and the next two values (0.24, 0.18) correspond to the parameters a and b of the selected model, y = a + bx.
We use the first dimension to determine the equation assigned to the particle. Thus, we use a dynamic codification, and the particles have different dimensions depending on the assigned equation. The dimensions of the particles range from two to five.
To update the velocity vector of a particle, if g_best has more dimensions than the particle, only the ones needed are used; if it has fewer, random numbers are used instead. The particles have in common that they use mathematical equations to optimize the prediction of the effort of software projects. A particle interacts with itself (updating its best position) and with the best particle of the swarm. Even if a particle and the best particle have different equations, using the coefficients of the best particle helps the particle move towards a global optimum. If we used random guessing, we would not consider the fitness results; by using the best particle, we do.
We consider that the proposed codification gives the PSO-SRE more search capacity (exploration) and allows it to escape local optima more easily. However, for other optimization problems, this increase in search capacity may fail to give the best results, owing to the decrease in the exploitation capacity of the proposal. The dimension and model are unique for each data set and, once the stop condition is attained, the test MAR value is calculated.
A crucial aspect of optimization algorithms such as PSO lies in the selection of the optimization function. In our research, two optimization functions (i.e., prediction accuracy measures) are evaluated: the AR and the MAR. The AR is calculated for the i-th project as follows [13]:

AR_i = |Actual Effort_i − Predicted Effort_i|

and the mean of the ARs as follows:

MAR = (1/n) · Σ AR_i, for i = 1, ..., n

The median of the ARs is denoted by MdAR. The accuracy of a prediction model is inversely proportional to the MAR or MdAR.
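A minimal sketch of these accuracy measures (function names are ours):

```python
def absolute_residual(actual, predicted):
    """AR_i = |actual effort_i - predicted effort_i|."""
    return abs(actual - predicted)

def mar(actuals, predictions):
    """Mean of the absolute residuals (MAR); lower means more accurate."""
    residuals = [absolute_residual(a, p) for a, p in zip(actuals, predictions)]
    return sum(residuals) / len(residuals)

def mdar(actuals, predictions):
    """Median of the absolute residuals (MdAR)."""
    residuals = sorted(absolute_residual(a, p) for a, p in zip(actuals, predictions))
    n = len(residuals)
    mid = n // 2
    return residuals[mid] if n % 2 else (residuals[mid - 1] + residuals[mid]) / 2
```

For example, `mar([10, 20], [12, 17])` returns 2.5, the mean of the residuals 2 and 3.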
The parameter values for the proposed PSO model are w = 0.1, V_min = −10, V_max = 10, and c_1 = c_2 = 1.5; this last value was chosen because it yielded better results in our experiments than the recommended standard value (i.e., c_1 = c_2 = 2 [54]). The swarm size np was evaluated between 50 and 750, whereas the iteration number was set between 250 and 1500.
As the optimization function, we use the MAR of the training set, considering a LOOCV for the corresponding model defined in Table 2.
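The training-set optimization function can be sketched as a LOOCV loop; `fit` and `predict` stand for whichever Table 2 model the particle encodes, and the mean-only model in the usage example is purely illustrative.

```python
def loocv_mar(fit, predict, xs, ys):
    """LOOCV: train on all projects but one, predict the held-out project,
    repeat for every project, and average the absolute residuals (MAR)."""
    residuals = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        model = fit(train_x, train_y)
        residuals.append(abs(ys[i] - predict(model, xs[i])))
    return sum(residuals) / len(residuals)

# Illustrative usage with a trivial "predict the training mean" model:
fit_mean = lambda xs, ys: sum(ys) / len(ys)
predict_mean = lambda m, x: m
```

For `ys = [1.0, 2.0, 3.0]` the held-out residuals are 1.5, 0.0, and 1.5, so the LOOCV MAR is 1.0.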

Data Sets of Software Projects
In the present study, the data sets used were obtained from ISBSG release 2018, an international public repository whose data on software projects developed between 1989 and 2016 were reported from 32 countries, among them Spain, the United States, the Netherlands, Finland, France, Australia, India, Japan, Canada, and Denmark [59]. The projects were selected following the ISBSG guidelines, taking into account the quality of the data, FSM, TD, DP, and PLT [41]. Table 3 describes the number of projects remaining after applying each criterion (the ISBSG classifies the data quality of projects from "A" to "D", and "A" and "B" are recommended for statistical analysis). Since IFPUG V4+ projects should not be mixed with earlier IFPUG versions [41], only those projects whose FSM corresponded to IFPUG 4+ were selected.

In classifying the final 2054 projects of Table 3 by TD, 618 of them were new, 1416 enhancement, and 20 re-development projects. The types of DP reported by the ISBSG are mainframe (MF), midrange (MR), multiplatform (Multi), personal computer (PC), and proprietary, whereas the PLTs are second (2GL), third (3GL), and fourth (4GL) generation, and application generator (ApG). As for the resource level, the ISBSG classifies it in accordance with how effort is quantified; level 1 corresponds to development team effort [41]. The new and enhancement data sets were selected since they are the larger ones.

The IFPUG V4+ FSM is reported in AFP, a composite value calculated from the following nineteen variables: internal logical files, external interface files, external inputs, external outputs, external inquiries, data communications, distributed data processing, performance, heavily used configuration, transaction rate, on-line data entry, end-user efficiency, on-line update, complex processing, reusability, installation ease, operational ease, multiple sites, and facilitate change [13].
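The sequential application of the selection criteria can be sketched as follows; the field names (`quality`, `fsm`, `dev_type`) are hypothetical placeholders, since the actual ISBSG attribute names differ.

```python
def select_projects(projects):
    """Apply selection criteria in sequence, as in Table 3: data quality,
    functional sizing method, and development type (hypothetical field names)."""
    kept = [p for p in projects if p["quality"] in ("A", "B")]
    kept = [p for p in kept if p["fsm"] == "IFPUG 4+"]
    kept = [p for p in kept if p["dev_type"] in ("New", "Enhancement")]
    return kept
```

Each filter mirrors one row of Table 3, so the intermediate list sizes reproduce the per-criterion project counts.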
Table 4 classifies the final 2034 new and enhancement projects in accordance with the criteria included in Table 3. Since the χ² statistical normality test to be applied in this study needs at least thirty data points, a scatter plot (Effort vs. AFP) was generated for each data set whose number of projects in Table 4 was thirty or higher (i.e., fifteen data sets). The scatter plots of these fifteen data sets showed skewness, heteroscedasticity, and the presence of outliers; therefore, in Table 5, four statistical normality tests are applied to the Effort and AFP variables. Table 5 shows that there is at least one p-value lower than 0.01 per data set. This means that the hypothesis that Effort and AFP come from a normal distribution can be rejected with 99% confidence for all data sets. Therefore, the data are normalized by applying the natural logarithm (ln), which ensures that the resulting model goes through the origin on the raw data scale [43]. As an example, Figures 4 and 5 depict the scatter plots corresponding to the data set of Table 4 having 133 new software projects; they show the raw and transformed data, respectively.

Outliers were identified based on studentized residuals greater than 2.5 in absolute value. The outliers, as well as the coefficients of correlation (r) and determination (r²) by data set, are included in Table 6. As for the number of acceptable outliers, 5% per data set was taken as a reference [60]. As for a minimum for the coefficient of determination, an r² value higher than 0.5 was required, since this threshold has been accepted for SDEP models [61]. Thus, in this study, eight of the fifteen data sets analyzed in Table 6 were selected to generate their corresponding PSO-SRE and SRE.
They were finally selected since three of the fifteen had an r² value lower than 0.5, three presented between 11% and 16.6% outliers, and one had r² = 0.4119 with 14.28% outliers. The model for the SRE is linear, having the form ln(Effort) = a + b · ln(AFP).
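A least-squares fit of this ln-ln model can be sketched as follows (function names are ours; in practice a statistical package would be used):

```python
import math

def fit_sre(afp, effort):
    """Ordinary least-squares fit of ln(Effort) = a + b * ln(AFP)."""
    xs = [math.log(v) for v in afp]
    ys = [math.log(v) for v in effort]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    a = my - b * mx
    return a, b

def predict_effort(a, b, afp_value):
    """Back-transform the ln-ln model to the raw effort scale."""
    return math.exp(a + b * math.log(afp_value))
```

For example, on synthetic data where Effort = 2 · AFP, the fit recovers b = 1 and a = ln 2, and `predict_effort(a, b, 50)` returns 100.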
Table 7 contains the SRE for each data set selected from Table 6. All equations agree with the usual assumption about development effort: the larger the size (i.e., AFP), the higher the effort. Table 7. SREs for predicting the effort of new and enhancement projects.
The proposed PSO-SRE algorithm was executed for each data set with different configurations, in a distributed manner and on a dedicated laboratory server, trying to keep the execution time as short as possible while obtaining the best possible result.
After applying the PSO-SRE, it is possible to detail the SDEP model selected (from those included in Table 2) for each data set. Table 8 includes the PSO-SRE configuration in terms of the number of iterations and the swarm size for the three types of tests described next (the values were selected from the configurations described in the previous paragraph):
1. Test 1: up to 500 iterations, and up to 250 individuals in the swarm;
2. Test 2: up to 1500 iterations, and up to 750 individuals in the swarm;
3. Test 3: up to 1000 iterations, and up to 500 individuals in the swarm.
As for velocity updates, Barrera et al. [62] address the issue of defining velocity limits iteratively. They show that, for some optimization functions, the velocity update reported by Shi and Eberhart [55] is susceptible to sub-optimal behavior. However, for predicting the effort of software projects, we obtained good results with the approach of Shi and Eberhart [55].
The number of iterations and the swarm size depend on the data set, since each data set converges under different conditions. Relating the swarm size and iteration number of the three tests of Table 8 to the prediction accuracy of Table 9 by data set, we can conclude that increasing the swarm size and the iteration number degrades the performance of the proposed PSO-SRE algorithm. In accordance with the Test 1 data, between 250 and 500 iterations and a swarm size between 50 and 250 are the values that can be suggested to generate better results. Table 9 includes the prediction accuracy obtained by model. It shows that PSO-SRE had a better MAR than SRE in seven of the eight data sets, and an equal MAR in the remaining one, for Test 1, that is, when the swarm size and number of iterations were lower than those of Test 2 and Test 3. In addition, the MARs of the Test 1 data sets were better than those of Test 2 and Test 3 for all data sets except one, in which the MAR was equal across the three tests (MAR = 0.61). Thus, the data obtained from Test 1 are used in the present study.
In Table 9, we also include a simple Random Search (RS) algorithm, which:
• has no memory of its own nor a search direction;
• repeats the random search of "the best particle" a number of times equal to the number of fitness evaluations in the proposed PSO-SRE;
• compares its best solution with the best solution yielded by PSO-SRE.
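A random-search baseline of this kind can be sketched as below. The fitness function and bounds here are illustrative placeholders, not the paper's SDEP fitness:

```python
import random

def random_search(fitness, bounds, n_evals, seed=None):
    """Random-search baseline: sample candidate parameter vectors uniformly
    within the given bounds and keep the best one, spending the same number
    of fitness evaluations as the PSO run it is compared against."""
    rng = random.Random(seed)
    best_x, best_f = None, float("inf")
    for _ in range(n_evals):
        x = [rng.uniform(lo, hi) for lo, hi in bounds]
        f = fitness(x)
        if f < best_f:
            best_x, best_f = x, f
    return best_x, best_f

# Illustrative fitness: distance of a two-parameter vector from a known optimum
def toy_fitness(params):
    a, b = params
    return abs(a - 2.0) + abs(b + 1.0)

best_x, best_f = random_search(toy_fitness, [(-5.0, 5.0), (-5.0, 5.0)], 2000, seed=1)
```

Matching the evaluation budget (rather than the iteration count) is what makes the comparison with PSO-SRE fair.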
Since a MAR is not sufficient to report results in studies on software effort prediction, a suitable statistical test is applied for comparing the accuracies of the two models [46]. The selection of this test should be based on the number of data sets to be compared, data dependence, and data distribution. In our study, two data sets are compared at a time, and they are dependent (because each model was applied to each project by data set). As for data distribution, firstly, a new data set is obtained from each of the eight data sets of Table 9; each new data set consists of the differences between the two ARs by project (an AR of SRE, and an AR of PSO-SRE). Secondly, four normality statistical tests are performed on each new data set. Thirdly, if any of their four p-values is lower than 0.05 or 0.01, then the data are non-normally distributed at 95% or 99% of confidence, respectively, and a Wilcoxon test should be applied (the medians of the models are compared to accept or reject the hypothesis); otherwise, a paired t-test should be performed (the means of the models are then compared) [63]. Table 10 shows that in only two cases the data resulted normally distributed; in the remaining fourteen cases, a Wilcoxon test was applied, and the medians were used for those fourteen comparisons. We executed the algorithm in a distributed manner (i.e., data sets in parallel), which reduced the execution time. However, we ran a sequential set of experiments (i.e., one data set at a time) to estimate the total time spent on each set of experiments, considering the LOOCV used. Table 11 shows the (sequential) time by data set. Its column "Prediction" refers to the time of using the proposed PSO-SRE to predict the effort of a software project for each of the data sets. As shown in Table 11, the proposed PSO-SRE is able to predict the effort of a software project in less than half a minute.
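The test-selection procedure above can be sketched as follows. For brevity, this sketch uses a single normality test (Shapiro-Wilk) where the paper applies four, and the absolute residuals are synthetic; it is an illustration of the decision logic, not the paper's implementation:

```python
import numpy as np
from scipy import stats

def compare_models(ar_sre, ar_pso, alpha=0.05):
    """Select and apply a significance test for two dependent samples.

    ar_sre, ar_pso: absolute residuals (AR) of the two models on the same
    projects, so the samples are paired (dependent).
    """
    diff = np.asarray(ar_sre) - np.asarray(ar_pso)
    _, p_norm = stats.shapiro(diff)        # one normality test as a stand-in for four
    if p_norm < alpha:                     # differences non-normally distributed
        _, p = stats.wilcoxon(ar_sre, ar_pso)
        test_name = "Wilcoxon"
    else:                                  # differences normally distributed
        _, p = stats.ttest_rel(ar_sre, ar_pso)
        test_name = "paired t-test"
    return test_name, p

# Synthetic paired ARs: PSO-SRE systematically closer to the actual effort
rng = np.random.default_rng(0)
ar_pso = rng.exponential(1.0, 40)
ar_sre = ar_pso + rng.exponential(0.5, 40)
test_name, p_value = compare_models(ar_sre, ar_pso)
```

The key point is that the distribution of the paired differences, not of the raw ARs, drives the choice between the nonparametric and the parametric test.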

Conclusions
The results shown in Tables 9 and 10 allow us to accept the following alternative hypothesis, formulated in Section 1 of our study, in favor of PSO-SRE for seven of the eight data sets (six of them at 99% of confidence, and the seventh one at 95% of confidence): Prediction accuracy of the PSO-SRE is statistically not equal to that of the SRE when these two models are applied to predict the development effort of software projects using the AFP as the independent variable.
Regarding the remaining data set, the following null hypothesis is accepted at 99% of confidence: Prediction accuracy of the PSO-SRE is statistically equal to that of the SRE when these two models are applied to predict the development effort of software projects using the AFP as the independent variable.
As for the comparison between the PSO-SRE and RS, the following hypothesis can be accepted in favor of the PSO-SRE for the eight data sets at 99% of confidence: Prediction accuracy of the PSO-SRE is statistically not equal to that of the RS when these two models are applied to predict the development effort of software projects using the AFP as the independent variable.
We can conclude that a software manager can apply the PSO-SRE for predicting the development effort of a software project, taking into account the TD, DP, and PLT of the project, when the AFP is used as the independent variable. Regarding PSO-SRE optimization, from a general perspective, the best prediction accuracy by data set was obtained when the number of iterations was between 250 and 500, and the swarm size between 50 and 250.

Discussion
In software prediction, one of the most commonly predicted software variables has been effort, usually measured in person-hours or person-months. SDEP is needed for managers to estimate the cost of projects and then for budgeting and bidding; indeed, its importance is shown by the hundreds of studies published in the last forty years. Thus, in the present study, PSO was applied for optimizing the parameters of SDEP equations. The prediction accuracy of the PSO-SRE was compared to that obtained from the SRE. Both types of models were generated based on eight data sets of software projects selected by observing the guidelines of the ISBSG.
In comparing our study with the ten identified studies where PSO has been applied to SDEP, described in Table 1, we identify the following issues:
• None of them generates its models using a recent repository of software projects.
• Regarding the four studies where the ISBSG is used, (1) their releases correspond to those published in 2007 and 2009, (2) all of them select only one data set from the ISBSG, with sizes between 134 and 505 projects, and (3) none of them takes the version of the FSM into account to select the data set; whereas in our study, (1) the ISBSG release 2018 was used, (2) eight data sets containing between 53 and 440 projects were selected, and (3) all of them followed the guidelines suggested by the ISBSG, including the type of FSM, that is, our data sets did not mix the IFPUG V4 type with the V4 and post-V4 ones.
• The majority of them base their conclusions on a biased prediction accuracy measure and on a nondeterministic validation method.
• Half of them base their conclusions on statistical significance.
We did not find any study having all of the following characteristics of ours when we proposed the PSO-SRE:
(1) the use of PSO incorporating an additional component that completes automatically, in a single step, both the selection of the SDEP model and the adjustment of its parameters;
(2) new and enhancement software projects obtained from the ISBSG release 2018;
(3) software projects selected taking into account the TD, DP, PLT, and FSM, as suggested by the ISBSG;
(4) preprocessing of the data sets through outlier analysis and correlation and determination coefficients;
(5) an unbiased prediction accuracy measure (i.e., the AR) to compare the performance of the PSO-SRE and SRE models;
(6) the use of a deterministic validation method for training and testing the models (i.e., LOOCV);
(7) selection of a suitable statistical test, based on the number of data sets to be compared, data dependence, and data distribution, for comparing the prediction accuracy of PSO-SRE and SRE by data set; and
(8) hypotheses tested for statistical significance.
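The LOOCV validation with the AR accuracy measure, items (5) and (6) above, can be sketched as follows. The model and data here are illustrative (a simple linear regression on synthetic points), not the paper's SRE or the ISBSG projects:

```python
import numpy as np

def loocv_mar(x, y, fit, predict):
    """Leave-one-out cross validation: train on all projects but one, predict
    the held-out project, and report the mean absolute residual (MAR)."""
    residuals = []
    n = len(x)
    for i in range(n):
        mask = np.arange(n) != i
        params = fit(x[mask], y[mask])           # train without project i
        pred = predict(x[i], params)             # predict the held-out project
        residuals.append(abs(y[i] - pred))       # AR of the held-out project
    return float(np.mean(residuals))

# Illustrative model: simple linear regression
def fit(x, y):
    b, a = np.polyfit(x, y, 1)
    return a, b

def predict(xi, params):
    a, b = params
    return a + b * xi

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8, 12.2])
mar = loocv_mar(x, y, fit, predict)
```

LOOCV is deterministic: given a fixed data set it always produces the same folds and therefore the same MAR, unlike random train/test splits.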
Our manuscript also followed all six guidelines for proposing a new SDEP model [43] by (1) confirming that our PSO-SRE algorithm outperforms a statistical model (i.e., the SRE), since PSO-SRE outperformed SRE in seven of the eight data sets with statistical significance; (2) taking into account the heteroscedasticity, skewness, and heterogeneity of the effort and size data of software projects; (3) using statistical tests to compare the performance of the prediction models; (4) explaining in detail how our PSO-SRE is applied; (5) justifying the selection of every statistical test we used; and (6) including the criteria followed for selecting the data sets of software projects from the ISBSG.
A first limitation of the present study, which reduces the generalization of our conclusions, is the number of data sets used: although the ISBSG contains more than eight thousand software projects, we could only select eight data sets observing the guidelines of the ISBSG. A second limitation is that only 20 prediction models based on simple SRE were considered (Table 2). Finally, a third limitation is that we did not consider other, more complex ML prediction models.
As for threats to external validity, the prediction accuracy of the PSO-SRE depends on an accurate estimate, by the practitioner, of the value of the independent variable (i.e., the size measured in AFP).
Another limitation regarding the use of heuristic algorithms (PSO in this paper) is that they prune the search space and can discard useful regions. We are aware that the proposed method, despite its good behavior for predicting the effort of software projects, can have a different performance for other optimization problems.
Future work will be related to the application of other metaheuristics for optimizing the parameters of the SRE. We intend to use a greater number of data sets whose data are current and reflect the heterogeneous evolution of these data. Alternative models will also be proposed to predict the effort of new and enhancement software projects, such as those based on classifiers [64,65]. Moreover, additional prediction accuracy criteria will be taken into account, such as standardized accuracy and effect size [66]. Finally, the algorithm can be modified to take duplicate values into account and to act similarly between them, as well as to apply alternative update mechanisms, as in [62], for the velocity update, and to test them against the one currently used. We will also explore newer and improved implementations of PSO.

Acknowledgments:
The authors would like to thank the Centro de Innovación y Desarrollo Tecnológico en Cómputo and the Centro de Investigación en Computación, both of the Instituto Politécnico Nacional, México; Universidad de Guadalajara, México, and the Consejo Nacional de Ciencia y Tecnología (CONACyT), México.

Conflicts of Interest:
The authors declare no conflict of interest.

Abbreviations
The following abbreviations are used in this manuscript:

2GL	Programming languages of second generation
3GL	Programming languages of third generation