Modelling Construction Site Cost Index Based on Neural Network Ensembles

Abstract: Construction site overhead costs are key components of cost estimation in construction projects. The estimates are expected to be accurate, but there is a growing demand to shorten the time necessary to deliver cost estimates. The need to balance (symmetry) the time of calculation against the reliability of the estimate was the reason for developing a new model for cost estimation in construction. This paper reports some results from the authors' broad research on the modelling processes in engineering related to estimation of construction costs using artificial intelligence tools. The aim of this work was to develop a model capable of predicting a construction site cost index that would benefit from combining several artificial neural networks into an ensemble. Combining selected neural networks into ensemble-based models balanced their strengths and weaknesses. Using training patterns collected from studies of completed construction projects, the authors investigated various types of neural networks in order to select the members of the ensemble. Finally, three models were proposed and assessed in terms of performance and prediction quality. The results revealed that the developed models based on ensemble averaging and stacked generalisation met the expectations of knowledge generalisation and accuracy of prediction of the site overhead cost index. The proposed models offer predictions of cost within an accepted error range and deliver better predictions than those based on single neural networks. The developed tools can be used in the decision-making process regarding construction cost estimation.


Introduction
Symmetry 2019, 11, 411
The success of a construction project is determined by achieving three fundamental goals: completion within the budget, completion within the planned time, and the expected quality of construction works. For the budget, cost estimation is a key process. On the one hand, the estimates are expected to be accurate; on the other hand, there is a growing demand to shorten the time necessary to deliver cost estimates. These needs justify attempts to employ various tools in fast cost analyses and modelling. The aim of this paper is to present the results of research on ensembles of artificial neural networks (ANNs) as artificial intelligence tools for fast analysis and prediction of site overhead costs. This research is a continuation and extension of previous studies, including prediction of these costs with the use of multilayer perceptron neural networks [1]. It is worth mentioning that mathematical tools, which are constantly being developed, are present in investigations of a broad variety of problems in the field of construction management and technology. Some interesting examples are applications of fuzzy sets theory and fuzzy logic in construction project risk [2][3][4], the evaluation of a construction safety management system [5], processes in a construction enterprise [6], the investigation of flow-shop scheduling problems [7], and the use of multiple criteria decision-making methods for supporting the decision process in construction and building technology [8][9][10]. There have also been a number of attempts to apply artificial neural networks in the management of construction projects: predicting the completion period of building contracts [11], analysing efficiency and productivity in construction projects [12,13], predicting the maintenance cost of construction equipment [14], supporting bidding decisions [15,16], and facilitating decision making [17][18][19]. A comprehensive discussion of innovative solutions in the construction industry can be found in Reference [20].
The solutions and models that support cost estimates in construction are explored in many scientific publications. The authors propose a variety of methods, for instance multivariate regression [21], analysis of the selected cost-effectiveness factors [22], a case-based reasoning method [23], fuzzy logic [24], and genetic algorithms [25]. In terms of ANNs, there have been attempts to apply these tools in the field of construction cost management. Some examples are forecasting costs of motorways in different aspects [26], predicting cost deviations in reconstruction, alteration, and rebuilding projects [27], estimating the costs of construction projects [28,29], cost estimates of residential buildings [30], prediction of overhead costs [31,32], cost estimates of buildings' floor structural frames as a higher level of aggregation elements of building information model [33], construction cost of sports fields [34], and shovel capital cost estimation [35].
According to the research presented in Reference [36], an improper calculation of overhead costs can have a significant negative effect on the financial situation of the contracting company. Generally, the building contractor's overhead costs are divided into two categories: site (project) overhead costs and company's (general) overhead costs [37]. The site (project) overhead costs include items that can be identified with a particular job, but not materials, labour, or production equipment. The company's overhead costs are items that represent the cost of doing business and are often considered fixed expenses that must be paid by the contractor. On the other hand, an overhead cost of a construction project can be defined as a cost that cannot be identified with or charged to a construction project or to a unit of construction production [38]. A new classification of construction companies into competitiveness classes according to the relative value of overhead costs was proposed in Reference [39]. As far as accuracy is concerned, it is more advantageous to calculate both components separately, as is done in Great Britain [40], the US, and Canada [41]. The unstable construction market makes it difficult for construction companies to decide on the optimum level of overhead costs [42].
A number of empirical studies relate to the determination of the project overhead costs. In Reference [43], it is indicated that the method of work is a critical factor affecting the amount spent on project overheads. In Reference [44], the authors pointed out that the location of the site could affect a number of project overhead items. In References [31,45,46], research carried out in different countries allowed for the identification of different factors that should be taken into account in site (project) overhead costs.
Studies on construction project overheads and factors that influence their estimates report that it is difficult to determine unambiguously which of the cost components are of the highest importance. Most attention is paid to a detailed calculation of site overheads; however, it is a time-consuming task to take into account all of the possible components of site overhead costs [36].
The aim of the authors' work was to develop a regression model based on ANN ensembles, capable of predicting the site overhead cost index and, thus, able to support the estimation of site overhead costs in construction projects. An additional research objective was to explore the capabilities of ANN ensembles in this problem. In the application of ANNs, a very common approach is to select one network to be the core of a developed model. The selection is preceded by training and performance assessment of numerous networks (compare, e.g., Reference [47]). As an alternative, employing a combination of networks, i.e., an ensemble of ANNs, offers significant capabilities. Despite their advantages, ANN ensembles are rarely reported in research papers on the prediction of broadly understood construction costs.
Site overhead costs can be estimated with the use of preliminaries (compare References [40,41]); such a method is accurate but time-consuming, as all of the cost items must be assessed separately. On the other hand, index methods (compare Reference [36]) allow for quick estimation of site overhead costs; however, the accuracy depends on the assumed value of the index. The novelty of the approach proposed in this paper relies on the use of knowledge and information from completed construction projects to train several neural networks, combine them into an ensemble, and assess the site overhead costs on the basis of the predictions produced by the ensemble of neural networks.
The paper content includes an introduction and review of the literature in the above section. Section 2 presents the theoretical background of the problem, and a discussion of the site overhead cost index prediction as a regression problem is presented in Section 3. In Section 4, the authors propose a methodology for the implementation of an ensemble of neural networks (with the use of ensemble averaging and stacked generalisation approaches) for prediction of site overhead cost index, present the results of the studies, and discuss the results. Section 5 includes a summary and conclusions.

Background of the Problem, Methods, and Main Assumptions
The development of the proposed model comes down to solving a regression problem and approximating the true regression function, which is the relationship between the site overhead costs index (as the dependent variable of the model) and a set of selected predictors (as independent variables of the model). The theory of ANNs is widely presented in the literature, for instance, in References [47][48][49]. ANNs, as mathematical tools applied in regression problems, offer an approximation of the true regression function $g(x_j)$ of multiple variables $x_j$, where $j = 1, \ldots, n$:
$$g(x_j) = f(x_j) + \varepsilon \qquad (1)$$
In Equation (1), the function $f(x_j)$, as an approximation of $g(x_j)$, is assumed to be implemented implicitly by a trained single ANN, selected from a number of trained candidate networks, and $\varepsilon$ denotes the error of approximation. There are two disadvantages of an approach based on the selection of a single ANN and discarding the rest of the candidate networks [47,48]. First, the effort required for the training and assessment of the candidate networks is wasted. Second, the generalisation performance of the chosen network is biased with respect to some part of the input space due to the selection of learning, testing, and validating subsets from the overall number of patterns available for the training process, the structure of the network, its parameters, and the conditions of training process initialisation. An alternative approach is to combine a number of different ANNs that share a common input $x_j$ and form an ensemble (the ANNs may differ in their structures, parameters, and way of training, and the ensemble may even include different kinds of networks). In this paper, the authors consider two alternative approaches that are based on ensembles of neural networks: the first is termed ensemble averaging and the second stacked generalisation (compare, e.g., References [47,48]). In the next three subsections, the authors systematically present the background of the research and the main assumptions of the model development process.

Ensemble Averaging
The main assumption of the ensemble averaging approach is that $g(x_j)$ is approximated with the use of a linear combination of $K$ trained ANNs. The formal notation is given by Equation (2):
$$g(x_j) = f_i(x_j) + \varepsilon_i(x_j) \qquad (2)$$
where $f_i(x_j)$ stands for the approximation and $\varepsilon_i$ denotes the error of approximation by the $i$-th neural network for $i = 1, \ldots, K$. Such a mechanism (compare Reference [48]), which does not involve input signals and in which the individual outputs of the ANNs are combined to produce an overall output, belongs to the class of static structures. The following assumptions can be made [47]. The sum-of-squares error for $f_i(x_j)$ can be given as:
$$E^{i}_{sos} = \mathcal{E}\left[\left(g(x_j) - f_i(x_j)\right)^2\right] = \mathcal{E}\left[\varepsilon_i^2\right] \qquad (3)$$
where $E^{i}_{sos}$ corresponds to an integration over $x_j$, weighted by the unconditional density $p(x_j)$:
$$E^{i}_{sos} = \int \varepsilon_i^2(x_j)\, p(x_j)\, dx_j \qquad (4)$$
The average error of the networks acting individually can be written as
$$E_{av} = \frac{1}{K} \sum_{i=1}^{K} E^{i}_{sos} = \frac{1}{K} \sum_{i=1}^{K} \mathcal{E}\left[\varepsilon_i^2\right] \qquad (5)$$
Supposing that the output of the ensemble is the average of the outputs of the $K$ networks that belong to the ensemble, the prediction of the ensemble $f_{ens}(x_j)$ is:
$$f_{ens}(x_j) = \frac{1}{K} \sum_{i=1}^{K} f_i(x_j) \qquad (6)$$
Under the assumption that the errors $\varepsilon_i(x_j)$ are uncorrelated and have zero mean, the relation of the ensemble error $E_{ens}$ to the average error of the networks working separately is:
$$E_{ens} = \frac{1}{K} E_{av} \qquad (7)$$
In practice, the errors $\varepsilon_i(x_j)$ are highly correlated and the reduction of the error is much smaller. Typically, some useful reduction of the error is obtained, as ensemble averaging cannot produce an increase in the expected error:
$$E_{ens} \le E_{av} \qquad (8)$$
The expectation is that differently trained networks converge to different local minima on the error surface, and the overall performance is improved by combining the outputs in some way [47]. The employment of neural network ensembles may lead to satisfactory results, especially when the number of training patterns is relatively low or the training data is noisy [47,50].
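
The relation between the ensemble error and the average error of the individual networks can be checked numerically. The following is a minimal sketch, not part of the original study: it assumes synthetic Gaussian errors with unit variance, and the function name `simulate_errors` and the correlation parameter `rho` are hypothetical names introduced only for illustration.

```python
import random

def simulate_errors(K=5, rho=0.0, n_points=20000, seed=0):
    """Estimate E_av and E_ens for K networks with zero-mean, unit-variance errors.

    rho is the pairwise correlation between the errors of any two networks
    (rho = 0 reproduces the uncorrelated case).
    """
    rng = random.Random(seed)
    sum_sq_single = 0.0    # accumulates eps_i^2 over all networks and points
    sum_sq_ensemble = 0.0  # accumulates (mean_i eps_i)^2 over all points
    for _ in range(n_points):
        shared = rng.gauss(0.0, 1.0)  # error component common to all networks
        eps = [rho ** 0.5 * shared + (1.0 - rho) ** 0.5 * rng.gauss(0.0, 1.0)
               for _ in range(K)]
        sum_sq_single += sum(e * e for e in eps)
        mean_eps = sum(eps) / K  # error of the averaged output
        sum_sq_ensemble += mean_eps * mean_eps
    e_av = sum_sq_single / (K * n_points)
    e_ens = sum_sq_ensemble / n_points
    return e_av, e_ens

for rho in (0.0, 0.8):
    e_av, e_ens = simulate_errors(K=5, rho=rho)
    print(f"rho = {rho}: E_av = {e_av:.3f}, E_ens = {e_ens:.3f}")
```

With uncorrelated errors the estimated ensemble error should be close to the average error divided by K; with strongly correlated errors the reduction is much smaller, yet the ensemble error does not exceed the average error.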

Stacked Generalisation
The stacked generalisation approach (compare Reference [47]) is based on combining several trained networks into a two-level model. The general expectation of such an approach is to improve on the generalisation capabilities of the networks acting in isolation. The two-step procedure includes training a set of K level-0 networks, whose outputs are then used to train a level-1 network. One can say that the level-0 networks form an ensemble, and the level-1 network acts as a combiner of the outputs of the networks belonging to the ensemble. The general idea of the approach is presented in Figure 1.
A stacked generalisation-based model combines the outputs of the level-0 networks trained with the $x_j$ inputs, which can be written down as $\hat{y}_i = f_i(x_j)$, with the use of the level-1 network to give the final output. Formally, the model can be given as
$$\hat{y}_{sg} = h\left(\hat{y}_1, \ldots, \hat{y}_K\right) = h\left(f_1(x_j), \ldots, f_K(x_j)\right) \qquad (9)$$
where $h$ denotes the mapping implemented by the level-1 network. Consequently, prediction on new data is also a two-step procedure. Predictions are made by presenting new input data to the level-0 networks and computing their outputs, which are then presented to the level-1 network, which computes the final output. The general suggestion for the stacked generalisation approach is that the ensemble of level-0 networks should consist of various networks that differ from each other, whilst the level-1 network should have a relatively simple structure [47].
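
The two-step procedure can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the study: the three level-0 models in `LEVEL0` are fixed hypothetical functions standing in for trained networks, the data are synthetic, and the level-1 combiner is a single linear neuron trained by gradient descent rather than a small MLP.

```python
import random

# Hypothetical stand-ins for K = 3 already trained level-0 networks f_i(x).
LEVEL0 = [
    lambda x: 0.9 * x + 0.05,
    lambda x: 1.1 * x - 0.02,
    lambda x: x * x + 0.10,
]

def level0_outputs(x):
    """Step 1: present the input to every level-0 model."""
    return [f(x) for f in LEVEL0]

def combine(weights, bias, z):
    """Step 2: level-1 mapping h, here a single linear neuron (an assumption)."""
    return sum(w * zi for w, zi in zip(weights, z)) + bias

def train_level1(z_rows, targets, epochs=500, lr=0.1):
    """Fit the level-1 neuron by batch gradient descent on the mean squared error."""
    k, n = len(z_rows[0]), len(z_rows)
    weights, bias = [1.0 / k] * k, 0.0  # initial state equals ensemble averaging
    for _ in range(epochs):
        errors = [combine(weights, bias, z) - y for z, y in zip(z_rows, targets)]
        grad_w = [2.0 * sum(e * z[i] for e, z in zip(errors, z_rows)) / n
                  for i in range(k)]
        grad_b = 2.0 * sum(errors) / n
        weights = [w - lr * g for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b
    return weights, bias

def mse(predictions, targets):
    return sum((p - y) ** 2 for p, y in zip(predictions, targets)) / len(targets)

rng = random.Random(1)
xs = [rng.random() for _ in range(100)]
ys = [x + rng.gauss(0.0, 0.02) for x in xs]   # synthetic targets
z_rows = [level0_outputs(x) for x in xs]      # level-0 outputs are level-1 inputs

weights, bias = train_level1(z_rows, ys)
k = len(LEVEL0)
mse_avg = mse([combine([1.0 / k] * k, 0.0, z) for z in z_rows], ys)
mse_sg = mse([combine(weights, bias, z) for z in z_rows], ys)
print(f"averaging MSE = {mse_avg:.5f}, stacked MSE = {mse_sg:.5f}")
```

Because the level-1 neuron starts from equal weights, its initial state coincides with plain averaging of the level-0 outputs, so the comparison shows how much a trained combiner gains on the training data.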

Construction Site Overhead Cost Index Prediction as a Regression Analysis Problem-Assumptions for Ensemble Averaging and Stacked Generalisation
The prediction of the site overhead cost index by the neural network ensemble and the ensemble averaging approach can be formally given with the following Equations (10) and (11):
$$\hat{y}_{ens} = \frac{1}{K} \sum_{i=1}^{K} f_i(x_j) \qquad (10)$$
$$y = f_i(x_j) + \varepsilon_i \qquad (11)$$
where $y$ denotes the real-life values of the site overhead cost index (dependent variable), $\hat{y}_{ens}$ the values of $y$ predicted by the ensemble of neural networks, $f_i$ the $i$-th mapping function implemented implicitly by the $i$-th neural network belonging to the ensemble, $x_j$ the independent variables, i.e., the input shared by all of the members of the ensemble, for $j = 1, \ldots, m$, and $\varepsilon_i$ the error of approximation by the $i$-th member of the ensemble for $i = 1, \ldots, K$.
On the other hand, the prediction by the neural network ensemble and the stacked generalisation approach is denoted with Equations (12) and (13):
$$\hat{y}_{sg} = h\left(f_1(x_j), \ldots, f_K(x_j)\right) \qquad (12)$$
$$y = \hat{y}_{sg} + \varepsilon_{sg} \qquad (13)$$
where $y$ is as in (11), $\hat{y}_{sg}$ denotes the values of $y$ predicted by the stacked generalisation-based two-level model, $h$ the mapping function implemented implicitly by the level-1 neural network, $f_i$ the $i$-th mapping function implemented implicitly by the $i$-th level-0 neural network, $x_j$ is as in (11), and $\varepsilon_{sg}$ denotes the error of approximation by the model. The relationship between the set of selected predictors and the site overhead cost index was investigated by the authors. Eleven independent variables of the model were selected on the basis of studies of the literature [28,31,46] and investigations of a number of projects completed in Poland.
The training data included samples of real-life values of the dependent variable, $y$, and corresponding vectors of independent variables, $x_j$. The value of the dependent variable in the $p$-th sample ($p = 1, \ldots, 143$) was calculated as follows:
$$SOC^{p}_{ind} = \frac{SOC^{p}}{LC^{p} + MC^{p} + EC^{p} + SC^{p}} \qquad (14)$$
where $SOC^{p}_{ind}$ is the site overhead costs index, $SOC^{p}$ the site overhead costs observed in reality, $LC^{p}$ the labour costs observed in reality, $MC^{p}$ the material costs observed in reality, $EC^{p}$ the equipment costs observed in reality, and $SC^{p}$ the subcontractors' costs observed in reality for the $p$-th observation (sample). Some exemplary data, including the cost components present in Equation (14), in thousands of euros, and the corresponding site overhead cost indexes, are presented in Table 1. Independent variables of the model were selected on the basis of studies of the literature and investigations of a number of projects completed in Poland. As a result, a set of selected independent variables was proposed; these variables were denoted as $x_j$, where $j = 1, \ldots, 11$.
Three variables brought to the model information about the types of work executed in the project:
• $x_1$: types of work, general construction works,
• $x_2$: types of work, installation works,
• $x_3$: types of work, engineering works.
Another four variables brought to the model information about the construction site location:
• $x_4$: construction site location, in the city centre,
• $x_5$: construction site location, outside the city centre,
• $x_6$: construction site location, non-urban spaces,
• $x_7$: distance between the construction site and the company's office.
One variable brought to the model information about the duration of construction works:
• $x_8$: overall duration of construction works.
Another two variables brought to the model information about the execution of works in winter and about the subcontracted works:
• $x_9$: ratio of the amount of works performed in winter to the total amount of works,
• $x_{10}$: ratio of the amount of works performed by subcontractors to the total amount of works.
The last variable brought to the model information about the main contractor:
• $x_{11}$: size and necessary potential of the main contractor.
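
A single training sample, i.e., the target value and the 11-element input vector, could be assembled as in the sketch below. This is an illustration under stated assumptions: the index is taken here as the plain ratio of site overhead costs to the sum of labour, material, equipment, and subcontractors' costs (the source may express it as a percentage), the function names, location labels, and the sample project are hypothetical, and $x_7$ to $x_{11}$ are assumed to be already scaled.

```python
LOCATIONS = ("city_centre", "outside_city_centre", "non_urban")  # x4, x5, x6

def soc_index(soc, lc, mc, ec, sc):
    """Site overhead cost index, assumed to be SOC / (LC + MC + EC + SC)."""
    return soc / (lc + mc + ec + sc)

def encode_inputs(general, installation, engineering, location,
                  x7, x8, x9, x10, x11):
    """Build the 11-element input vector [x1, ..., x11]."""
    if location not in LOCATIONS:
        raise ValueError(f"unknown location: {location!r}")
    # x1-x3: binary coding, several types of work may occur in one project.
    work_types = [float(bool(flag)) for flag in (general, installation, engineering)]
    # x4-x6: "1 of n" coding, exactly one location is set to 1.
    one_of_n = [1.0 if location == name else 0.0 for name in LOCATIONS]
    return work_types + one_of_n + [x7, x8, x9, x10, x11]

# Made-up project, costs in thousands of euros.
y = soc_index(soc=90.0, lc=300.0, mc=500.0, ec=100.0, sc=100.0)
x = encode_inputs(True, True, False, "outside_city_centre", 0.3, 0.5, 0.0, 0.5, 0.9)
print(y, x)
```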
(When compared to the authors' earlier studies on the problem [1,32], the set of ten independent variables has been expanded. A thorough review of the available data, which was collected in the earlier phases of the research, made it possible to select an additional variable that brings to the model information about the capabilities of the contractor, namely its size and potential.) Variables $x_1$-$x_6$ were of the nominal type. A binary method of coding was applied in the case of $x_1$, $x_2$, and $x_3$; each of them took the value 0 or 1. In the case of $x_4$, $x_5$, and $x_6$, a "1 of n" method of coding was applied; the values considered for the three variables altogether were 1, 0, 0 or 0, 1, 0 or 0, 0, 1. Variables $x_7$-$x_{10}$ were of the quantitative type, whereas $x_{11}$ was of the nominal type. A pseudo-fuzzy scaling method of coding was applied to transform the original values or information into numerical values in the range 0.1-0.9 in the case of the variables presented in Table 2, except for the variable $x_9$, for which the values were scaled into the range 0.0-1.0. The transformation for these variables is presented in Table 2. The rationale for the transformation was to provide a common scale for all of the variables.
Table 2 (excerpt).
Variable | Original value | Coded value
… | … | …
$x_9$ | … | 0.5
$x_9$ | between 60% and 80% | 0.7
$x_9$ | between 80% and 90% | 0.9
$x_9$ | more than 90% | 1
$x_{10}$, share of subcontractors in the total amount of works | up to 20% | 0.1
$x_{10}$ | between 20% and 50% | 0.5
$x_{10}$ | between 50% and 100% | 0.9
$x_{11}$, size and potential of the main contractor | low | 0.1
$x_{11}$ | average | 0.5
$x_{11}$ | high | 0.9
The database, which included 143 samples, was built on the basis of a survey completed by Polish contractors. The survey investigated the factors that influence site overhead costs and the scope and complexity of construction works for completed building projects. The study of the returned surveys resulted in gathering and ordering the data used in the process of ANN training. Table 3 presents exemplary records from the database as samples of the training data. The strategy of the models' development, as well as the assumptions about the training, testing, and performance analysis, are explained in the next section.
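
The pseudo-fuzzy scaling for $x_{10}$ and $x_{11}$ can be sketched as a simple lookup using the values given for these two variables in Table 2. The function names are hypothetical, and the treatment of values lying exactly on an interval boundary (20% or 50%) is an assumption, as the text does not specify it.

```python
X11_CODES = {"low": 0.1, "average": 0.5, "high": 0.9}

def scale_x10(share_percent):
    """Code the subcontractors' share of works (in percent) as in Table 2."""
    if not 0.0 <= share_percent <= 100.0:
        raise ValueError("share must be between 0 and 100 percent")
    if share_percent <= 20.0:   # "up to 20%"
        return 0.1
    if share_percent <= 50.0:   # "between 20% and 50%"
        return 0.5
    return 0.9                  # "between 50% and 100%"

def scale_x11(size_label):
    """Code the size and potential of the main contractor as in Table 2."""
    return X11_CODES[size_label]

print(scale_x10(35.0), scale_x11("high"))
```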

Models' Development Strategy
The strategy of the model development included conducting multiple training and testing of a number of different types of single ANNs as candidates to become members of the ensemble, forming the ensemble, and then investigating the two approaches discussed earlier. The strategy is presented schematically in the chart in Figure 2 and then discussed in detail. The whole set of collected data was divided into two main subsets used for training and testing purposes. The testing subset, later referred to as T, was selected carefully to be statistically representative of the whole data collection and included 20% of the samples from the whole set of collected data. The data belonging to this subset did not take part in the training of ANNs and was used for the purposes of examination of single ANNs, as well as the ensemble models built upon the ensemble averaging and stacked generalisation approaches. Samples belonging to the subset T play the role of new cases in prediction performance analysis as well.
The remaining data was used for training, i.e., for supervised learning and validation of the single ANNs that were candidates to become members of the ensemble. These subsets are later referred to as L and V, respectively, whilst the whole training subset is referred to as L&V. The strategy involved dividing the remaining data in the proportion L/V = 80%/20%, repeated five times, so that five folds of data were available for training purposes. Each of the samples belonging to the L&V subset took part in supervised learning in four folds and in validation in one fold, so the networks for each fold were trained with data that differed in the assignment of samples to the L and V subsets.
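
The division of the data can be sketched as follows. This is a simplified illustration: the testing subset is drawn at random here, whereas in the study it was selected carefully to be statistically representative, and the function name `split_data` and the rounding of subset sizes are assumptions.

```python
import random

def split_data(n_samples=143, test_share=0.2, n_folds=5, seed=0):
    """Return the test indices T and a list of (L, V) index pairs, one per fold."""
    rng = random.Random(seed)
    indices = list(range(n_samples))
    rng.shuffle(indices)
    n_test = round(n_samples * test_share)
    test, train = indices[:n_test], indices[n_test:]  # T and L&V
    folds = []
    for k in range(n_folds):
        validating = train[k::n_folds]  # every n_folds-th sample, offset by k
        validating_set = set(validating)
        learning = [i for i in train if i not in validating_set]
        folds.append((learning, validating))
    return test, folds

test, folds = split_data()
print(len(test), [(len(l), len(v)) for l, v in folds])
```

Each sample of the L&V subset falls into the validating subset of exactly one fold and into the learning subset of the remaining four, which matches the scheme described above.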
Another key assumption was to select one ANN for each of the folds of L and V subsets to become a member of the ensemble. The selection was made on the basis of a two-step analysis and assessment of the ANNs' performance within the sets of networks trained with the use of each fold of L and V subsets. The rationale for this assumption was not only to choose the best networks but also to minimise the risk that the prediction of the model is biased due to the sampling of the L and V subsets. The employed error function and the criteria of the trained networks' assessment are presented in Table 4. For the purposes of performance assessment and analysis of single ANNs, Pearson's correlation coefficient (15) and error measures (16)-(20) were calculated for the L, V, L&V, and T subsets. Selection of the ensemble members was preceded by an investigation of a number of various multilayer perceptron (MLP) ANNs with one hidden layer, whose structures included 11 neurons in the input layer, h neurons in the hidden layer, and 1 neuron in the output layer. The choice of the MLP networks relied on their applicability to regression problems (compare References [29,49]).
The networks varied in the number of neurons in the hidden layer (h ranged from 4 to 11), the types of activation functions employed in the neurons of both the hidden and output layers (sigmoid, hyperbolic tangent, exponential, and linear function), and the initial weights of the neurons at the beginning of the training process. The Broyden-Fletcher-Goldfarb-Shanno algorithm (BFGS) was used for training individual networks; details about the algorithm can be found in the literature, e.g., Reference [47]. The choice of the training algorithm was dictated by its availability in the software that was used for neural simulations. Of the three available algorithms, BFGS offered the fastest performance and best convergence of the training and testing processes for the investigated problem. A variety of combinations of activation functions and numbers of neurons in the hidden layer, making over 100 networks altogether, were trained for each of the five folds of L and V subsets.
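
The space of candidate structures can be enumerated with a short sketch. It assumes that every combination of hidden layer size and activation functions was considered, which the text does not state explicitly; the dictionary keys and the function name are hypothetical.

```python
from itertools import product

ACTIVATIONS = ("sigmoid", "tanh", "exponential", "linear")
HIDDEN_SIZES = range(4, 12)  # h = 4, ..., 11

def candidate_configurations(n_inputs=11, n_outputs=1):
    """List all MLP n_inputs-h-n_outputs structures with their activation functions."""
    configs = []
    for h, hidden_act, output_act in product(HIDDEN_SIZES, ACTIVATIONS, ACTIVATIONS):
        configs.append({
            "name": f"MLP {n_inputs}-{h}-{n_outputs}",
            "hidden_neurons": h,
            "hidden_activation": hidden_act,
            "output_activation": output_act,
        })
    return configs

configs = candidate_configurations()
print(len(configs), configs[0]["name"], configs[-1]["name"])
```

Under this assumption the grid contains 8 × 4 × 4 = 128 structures per fold, which is consistent with the reported figure of over 100 networks; different initial weights would add further variants.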
The first step of selection included an assessment of the correlation coefficient between the expected and predicted output and of the root mean squared error (RMSE) values. From the set of networks that fulfilled the conditions $R_L > 0.90$, $R_V > 0.90$, $R_{L\&V} > 0.90$, and $R_T > 0.90$, the authors initially selected 20 networks for which the differences between $RMSE_L$, $RMSE_V$, $RMSE_{L\&V}$, and $RMSE_T$ were the smallest.
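
The first selection step can be sketched as a filter followed by a ranking. Two points are assumptions made for illustration: the "difference" between the RMSE values is measured as the spread (maximum minus minimum) over the four subsets, and the candidate records, their names, and their values are made up.

```python
import math

SUBSETS = ("L", "V", "L&V", "T")

def pearson_r(expected, predicted):
    """Pearson's correlation coefficient between expected and predicted values."""
    n = len(expected)
    mean_e, mean_p = sum(expected) / n, sum(predicted) / n
    cov = sum((e - mean_e) * (p - mean_p) for e, p in zip(expected, predicted))
    var_e = sum((e - mean_e) ** 2 for e in expected)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_e * var_p)

def rmse(expected, predicted):
    """Root mean squared error."""
    n = len(expected)
    return math.sqrt(sum((e - p) ** 2 for e, p in zip(expected, predicted)) / n)

def first_step_selection(candidates, r_min=0.90, n_keep=20):
    """Keep networks with R > r_min on every subset, ranked by RMSE spread."""
    passed = [c for c in candidates if all(c["R"][s] > r_min for s in SUBSETS)]

    def rmse_spread(c):
        values = [c["RMSE"][s] for s in SUBSETS]
        return max(values) - min(values)

    return sorted(passed, key=rmse_spread)[:n_keep]

# Made-up candidate records for illustration only.
candidates = [
    {"name": "A", "R": dict.fromkeys(SUBSETS, 0.95),
     "RMSE": {"L": 0.010, "V": 0.014, "L&V": 0.011, "T": 0.015}},
    {"name": "B", "R": {"L": 0.97, "V": 0.88, "L&V": 0.95, "T": 0.93},
     "RMSE": {"L": 0.008, "V": 0.020, "L&V": 0.011, "T": 0.016}},
    {"name": "C", "R": dict.fromkeys(SUBSETS, 0.92),
     "RMSE": {"L": 0.012, "V": 0.013, "L&V": 0.012, "T": 0.013}},
]
selected = first_step_selection(candidates)
print([c["name"] for c in selected])
```

In this made-up example, network B is rejected because its correlation on the validating subset is below the threshold, and C is ranked ahead of A because its RMSE values are more uniform across the subsets.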
The second step of the selection relied on a thorough review of the initially selected networks for each of the five folds of L and V subsets. The authors carried out a residual analysis in terms of both the measures presented in Table 4 and the distributions, dispersions, and values of errors for the samples belonging to the training and testing subsets.

Results and Discussion
A review and comparison of the networks' performance, based on the methodical analysis, allowed for the final choice of five networks, one for each fold of L and V subsets. The five selected networks, later referred to as ANN1, ANN2, ANN3, ANN4, and ANN5, are presented in Table 5. Table 6 presents the results of training and testing of the five selected networks acting separately. The results in the table are given according to the criteria presented in Table 4. The results in Table 6 are satisfactory; however, one can easily see that there are some differences between the performances of the five networks.
Figure 3 presents the scatterplot of the expected and predicted values of $SOC_{ind}$, i.e., the points of coordinates $(y_p, \hat{y}_p)$, for the training and testing subsets, drawn for the five selected networks acting individually. One can see that, in terms of the criteria shown in Table 4 and according to the results presented in Table 5, the performance of the selected networks acting individually was similar and the errors were comparable. However, Figure 3a,b and the analysis of the location and distribution of the points in the graphs reveal that the predictions of $SOC_{ind}$ depended strongly on the choice of a single network acting separately. Although most of the points were distributed along the line of a perfect fit, some points (marked with the ellipses) were placed outside of the cone delimited by percentage errors equal to +25% and −25%. Table 7 presents the maximal values of absolute percentage errors (20) calculated for the five selected ANNs. The values in Table 7 reveal significant errors of predictions, which also justify the employment of ensembles of neural networks in the problem.
The five chosen networks were combined to form the ensemble. The rules presented earlier, Equations (10) and (11), were employed for implementation of the ensemble averaging approach, and the outputs of the model were computed as well as the errors and error measures. This model is later called ENS AV.
To complete the process of model development based on the stacked generalisation approach, the authors investigated a number of artificial neural networks as candidates to become the level-1 network. The investigated networks' structures included five neurons in the input layer (as a consequence of the selection of five ensemble member networks), h neurons in the hidden layer, and one neuron in the output layer. The number of neurons in the hidden layer, h, ranged from one to three, as the structure of the level-1 network was supposed to be simple (compare Section 2.2). The types of activation functions and the training algorithm were the same as in the case of training the ensemble candidate networks (as presented previously in Section 3). Training patterns that included the outputs of the five ensemble member networks as the inputs of level-1 networks, and the accompanying real-life values of $SOC_{ind}$ as the expected outputs, were divided randomly for each investigated network into the learning and validating subsets in the proportion L/V = 60%/40%. The investigated networks also varied in the initial weights of the neurons at the beginning of the training process. Altogether, around 100 networks were trained and examined. For the purposes of testing, the authors used the T subset, which was selected in the initial stage of the research (as presented previously in Section 3). The criteria of the two-step selection of the level-1 networks were similar to those applied to the ensemble candidate networks (as presented previously in Section 3). The final choice of two level-1 networks, namely MLP 5-2-1 and MLP 5-3-1, allowed for the introduction of two alternative stacked generalisation-based models. Both networks were retained, and both models are discussed further, because of the comparable quality of these models. These models are later called ENS SG1 and ENS SG2, respectively. The details of the selected level-1 networks are presented in Table 8.
All three proposed models based on the ensemble of networks, namely ENS AV, ENS SG1, and ENS SG2, were assessed in terms of performance and prediction quality. The overall results appear together in Table 9. For the purposes of performance assessment and analysis of the ensemble averaging and stacked generalisation-based models, Pearson's correlation coefficient (16) and error measures (17), (18), (19), and (20) were calculated for the L&V and T subsets. When the values in Table 9 are collated with the values in Tables 5 and 6, the improvements in error measures can be seen easily. The performance of all three models based on the ensembles of networks is better than the performance of the networks acting in isolation. The most evident improvement is achieved for $APE_{max}$.
Figures 4-6 depict scatterplots of the expected and predicted values of SOC_ind for the ENS AV, ENS SG1, and ENS SG2 models. They present the points of coordinates (y_p, ŷ_p^ens) for the training and testing subsets separately. When compared to Figure 3, these graphs show that combining the five selected ANNs compensated for the errors made by the ANNs acting in isolation, both for ENS AV and for ENS SG1 and ENS SG2. Although all three models bring an improvement, the best performance is provided by ENS SG2, where all of the points lie within the cone of acceptable errors. In the case of ENS AV and ENS SG1, single points are located outside of the cone.
The columns in Figures 7-9 show the percentage frequencies of the errors that fall into each of the six intervals (five intervals of 5% width and one for errors of 25% or more). The polylines show the distribution of the errors (cumulative frequencies according to the accepted order of intervals). In Figures 7-9, one can see that, in the case of ENS AV and ENS SG1, only a few APE_p errors (19) are greater than 25%, and in the case of ENS SG2, none of them fall into this range. In contrast, for the networks acting separately, a significant number of errors are greater than 25%. These results can be explained by analysing the APE_p errors of the networks acting separately. For these networks (ANN1, ANN2, ANN3, ANN4, ANN5), many of the APE_p errors belonging to interval 1 were relatively small and close to 0%. On the other hand, these small errors were accompanied by a significant number of errors APE_p ≥ 25% and by high values of APE_max (compare Table 7). In the ensemble-based models, these errors have been compensated through ensemble averaging (ENS AV) or stacked generalisation (ENS SG1, ENS SG2). The compensation concentrated most of the prediction errors in the first five intervals. One cost of this compensation is a decrease in the number of small errors, close to 0%, in the first interval. The benefit, however, is an improvement in overall prediction performance and better knowledge generalisation. As mentioned previously, the best performance is offered by the ENS SG2 model, as it produced no errors APE_p ≥ 25%.
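The frequency columns and cumulative polylines described above can be reproduced with a short routine. The sketch below assumes six intervals: five consecutive intervals of 5% width starting at 0%, and a final interval collecting all errors of 25% or more. The function name and the sample error values are hypothetical.

```python
def ape_interval_frequencies(ape_values, width=5.0, n_bounded=5):
    """Return percentage frequencies and cumulative frequencies of APE values.

    Intervals 1..n_bounded are [0, width), [width, 2*width), and so on;
    the last interval collects every error >= n_bounded * width.
    """
    counts = [0] * (n_bounded + 1)
    for error in ape_values:
        # Floor division gives the interval index; cap it at the open-ended bin.
        index = min(int(error // width), n_bounded)
        counts[index] += 1

    total = len(ape_values)
    frequencies = [100.0 * c / total for c in counts]

    cumulative = []
    running = 0.0
    for f in frequencies:
        running += f
        cumulative.append(running)
    return frequencies, cumulative


# Hypothetical APE values in percent (not results from the study).
sample_errors = [1.0, 4.9, 5.0, 12.0, 19.9, 24.0, 25.0, 40.0]
freq, cum = ape_interval_frequencies(sample_errors)
for i, (f, c) in enumerate(zip(freq, cum), start=1):
    print(f"interval {i}: {f:.1f}% of errors, cumulative {c:.1f}%")
```

The share of errors in the last interval is the quantity that distinguishes ENS SG2 from the other models in the discussion above, so it is convenient to read it directly from the final element of the frequency list.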
The analysis of the research results leads to the conclusion that employing only one of the five selected networks (as presented in Table 5) to support the prediction of SOC_ind would make the predictions dependent on the choice of network. This is confirmed by the distribution of the points that represent expected and predicted values (y_p, ŷ_p) in Figure 3.
On the other hand, combining these five networks to form an ensemble balances the strengths and weaknesses of the five ANNs: for some data, certain networks acting alone offered good predictions, whilst for other data their predictions were weak. Combining these networks into an ensemble allows for synergy. The decrease in APE_max and the more stable predictions are the main benefits of employing ensembles in the models. Furthermore, the risk of percentage errors exceeding the critical level of 25% is reduced. These benefits have been achieved at some cost, mainly due to the compensation of the very small and very high errors offered by certain networks acting separately for certain training and testing patterns. However, the compensation of the errors in the ensemble-based models reduces the unwanted oversensitivity of the networks acting separately to certain training patterns.
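The trade-off described above follows from a simple property of averaging: by the triangle inequality, the absolute error of the averaged output for any pattern cannot exceed the mean of the members' absolute errors for that pattern, so it never exceeds the worst member's error, although it can exceed the best member's error. The following minimal sketch illustrates this with three hypothetical member networks. The values are invented for illustration and are not results from the study.

```python
def ape(y, y_hat):
    """Absolute percentage error (in %) of a single prediction."""
    return abs(y - y_hat) / abs(y) * 100.0


def ensemble_average(member_outputs):
    """Average the member networks' outputs for one pattern."""
    return sum(member_outputs) / len(member_outputs)


# Hypothetical expected values and outputs of three member networks.
expected = [10.0, 10.0, 10.0]
member_outputs = [
    [13.0, 8.0, 9.0],    # pattern 1: members err in opposite directions
    [10.1, 9.9, 10.3],   # pattern 2: all members are close to the target
    [10.0, 12.5, 12.0],  # pattern 3: one member is exact, two overshoot
]

for y, outputs in zip(expected, member_outputs):
    member_apes = [ape(y, o) for o in outputs]
    ens_ape = ape(y, ensemble_average(outputs))
    # The ensemble error is bounded by the mean member error (small tolerance
    # added for floating-point rounding).
    assert ens_ape <= sum(member_apes) / len(member_apes) + 1e-9
    print(f"best member {min(member_apes):.1f}%, "
          f"worst member {max(member_apes):.1f}%, ensemble {ens_ape:.1f}%")
```

In the first pattern, large errors of opposite sign cancel. In the third, the ensemble is worse than the best single network, which corresponds to the loss of very small errors noted above.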

Summary and Conclusions
The authors developed three original models based on ensembles of neural networks aimed at predicting the site overhead cost index for construction projects. One of the models employed ensemble averaging and two employed stacked generalisation. The developed models are capable of predicting the site overhead cost index with satisfactory accuracy and, thus, of supporting estimates of site overhead costs. In the light of the presented research, the general conclusion is that models employing an ensemble of neural networks proved superior to the approach based on a single neural network. Moreover, the effort that is unavoidable in training, verifying, and selecting a number of networks of similar quality is not wasted. In practical terms, prediction using ensemble averaging is simple: it only requires averaging the outputs of the networks belonging to the ensemble. On the other hand, stacked

Figure 1. General idea of stacked generalisation approach.
Models' Development Strategy
The strategy of the model development included conducting multiple training and testing of a number of different types of single ANNs as candidates to become members of the ensemble, forming the ensemble, and then investigating the two approaches discussed earlier. The strategy is presented schematically in the chart in Figure 2 and then discussed in detail.
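The second of the two approaches, stacked generalisation, can be sketched as follows: the outputs of the already trained member (level-0) networks become the inputs of a level-1 model that learns how to combine them. In the paper the level-1 models are neural networks (Table 8). In the sketch below a single linear neuron trained by batch gradient descent is swapped in to keep the example short. The data, the learning rate, and the number of epochs are hypothetical, and fitting the level-1 model on the same patterns that were used for the level-0 networks is a simplification made here.

```python
def predict_level1(weights, bias, z):
    """Level-1 output for one pattern: weighted sum of the level-0 outputs."""
    return sum(w * v for w, v in zip(weights, z)) + bias


def mse(weights, bias, level0_outputs, targets):
    """Mean squared error of the level-1 model over all patterns."""
    n = len(targets)
    return sum((predict_level1(weights, bias, z) - t) ** 2
               for z, t in zip(level0_outputs, targets)) / n


def train_level1(level0_outputs, targets, lr=0.001, epochs=2000):
    """Fit a linear level-1 combiner by batch gradient descent on the MSE."""
    k = len(level0_outputs[0])
    n = len(targets)
    weights = [1.0 / k] * k  # starting point equals plain ensemble averaging
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * k
        grad_b = 0.0
        for z, t in zip(level0_outputs, targets):
            err = predict_level1(weights, bias, z) - t
            for j in range(k):
                grad_w[j] += 2.0 * err * z[j] / n
            grad_b += 2.0 * err / n
        weights = [w - lr * g for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b
    return weights, bias


# Hypothetical level-0 outputs: one row per pattern, one column per network.
level0_outputs = [
    [8.9, 7.6, 8.3],
    [10.8, 9.5, 10.4],
    [12.9, 11.2, 12.6],
    [9.7, 8.8, 9.1],
]
targets = [8.0, 10.0, 12.0, 9.0]

k = len(level0_outputs[0])
mse_averaging = mse([1.0 / k] * k, 0.0, level0_outputs, targets)
weights, bias = train_level1(level0_outputs, targets)
mse_stacked = mse(weights, bias, level0_outputs, targets)
print(f"MSE, ensemble averaging: {mse_averaging:.4f}")
print(f"MSE, linear level-1 combiner: {mse_stacked:.4f}")
```

Because the combiner starts from equal weights, its training error can only improve on that of plain averaging for a sufficiently small learning rate. Whether the improvement carries over to unseen data has to be checked on a separate testing subset, as is done for the models in the paper.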

Figure 2. Scheme of the strategy of the models' development.

Figure 3. Scatterplots of y and ŷ for the five selected neural networks acting separately: (a) scatterplot for samples belonging to the training subset, (b) scatterplot for samples belonging to the testing subset.

Figure 4. Scatterplot of y and ŷ_ens for the ensemble, ENS AV, performing ensemble averaging: (a) scatterplot for samples belonging to the training subset, (b) scatterplot for samples belonging to the testing subset.

Figure 5. Scatterplot of y and ŷ_sg for the ensemble, ENS SG1: (a) scatterplot for samples belonging to the training subset, (b) scatterplot for samples belonging to the testing subset.

Figure 6. Scatterplot of y and ŷ_sg for the ensemble, ENS SG2: (a) scatterplot for samples belonging to the training subset, (b) scatterplot for samples belonging to the testing subset.

Figures 7-9 depict frequencies and distributions of APE_p errors computed for the training and testing subsets for the models based on ensembles of networks. The errors have been accumulated and counted in five intervals, each 5% wide; a sixth interval accumulated errors of 25% or more:
• interval 1: 0% ≤ APE_p < 5%,
• interval 2: 5% ≤ APE_p < 10%,
• interval 3: 10% ≤ APE_p < 15%,
• interval 4: 15% ≤ APE_p < 20%,
• interval 5: 20% ≤ APE_p < 25%,
• interval 6: APE_p ≥ 25%.

Figure 7. Frequencies and distributions of absolute percentage errors for the ENS AV model computed for the training and testing subsets.

Figure 8. Frequencies and distributions of absolute percentage errors for the ENS SG1 model computed for the training and testing subsets.

Figure 9. Frequencies and distributions of absolute percentage errors for the ENS SG2 model computed for the training and testing subsets.

Table 1. Exemplary values of site overhead costs index.

Table 2. Transformation of the descriptive values into the numerical values for variables x_7-x_11.

Table 3. Exemplary samples of the training data.

Table 4. Error function and models' performance assessment criteria.

Table 5. Details of the five networks selected to be the members of the ensemble.

Table 6. Training results and performance of the selected networks.

Table 7. APE_max errors obtained for the five selected networks.

Table 8. Details of the two level-1 networks selected for the stacked generalisation-based models.

Table 9. Performance measures for the three developed models based on the ensembles of networks.