# Modelling Construction Site Cost Index Based on Neural Network Ensembles


## Abstract


## 1. Introduction

## 2. Background of The Problem, Methods, and Main Assumptions

The construction site overhead cost index can be treated as a function g(x_{j}) of multiple variables x_{j}, where j = 1,…,n:

$$y = g(x_{j}), \quad j = 1, \ldots, n$$

A function f(x_{j}), as an approximation of g(x_{j}), is assumed to be implemented implicitly by a trained single ANN, selected from a number of trained candidate networks:

$$y = f(x_{j}) + \varepsilon$$

where ε denotes an error of approximation. There are two disadvantages of an approach based on the selection of a single ANN and the discarding of the rest of the candidate networks [47,48]. First, the effort required for the training and assessment of the candidate networks is wasted. Second, the generalisation performance of the chosen network is biased with respect to some part of the input space, due to the selection of the learning, testing, and validating subsets from the overall pool of patterns available for the training process, the structure of the network, its parameters, and the conditions of training process initialisation. An alternative approach is to combine a number of different ANNs that share a common input x_{j} and form an ensemble (the ANNs may differ in their structures, parameters, and way of training; the ensemble may even include different kinds of networks). In this paper, the authors consider two alternative approaches based on ensembles of neural networks: the first is termed ensemble averaging, and the second stacked generalisation (compare, e.g., References [47,48]). In the next three subsections, the authors systematically present the background of the research and the main assumptions of the model development process.

#### 2.1. Ensemble Averaging

The approximation of g(x_{j}) is performed with the use of a linear combination of K trained ANNs. The formal notation is given by Equation (2):

$$f_{ens}(x_{j}) = \frac{1}{K}\sum_{i=1}^{K} f_i(x_{j}) \quad (2)$$

where f_{i}(x_{j}) stands for the approximation and ε_{i} denotes the error of approximation by the i-th neural network for i = 1,…,K, so that $f_i(x_j) = g(x_j) + \varepsilon_i(x_j)$. Such a mechanism (compare Reference [48]), which does not involve the input signals and in which the individual outputs of the ANNs are combined to produce an overall output, belongs to the class of static structures. The following assumptions can be made [47]. The sum-of-squares error for f_{i}(x_{j}) can be given as:

$$E_i^{sos} = \mathcal{E}\left[\left(f_i(x_j) - g(x_j)\right)^2\right] = \mathcal{E}\left[\varepsilon_i^2(x_j)\right]$$

where the expectation operator $\mathcal{E}[\cdot]$ in $E_i^{sos}$ corresponds to an integration over x_{j}, weighted by the unconditional density p(x_{j}):

$$\mathcal{E}\left[\varepsilon_i^2(x_j)\right] = \int \varepsilon_i^2(x_j)\, p(x_j)\, dx_j$$

The average error of the K networks working separately is $E_{av} = \frac{1}{K}\sum_{i=1}^{K} E_i^{sos}$, while the corresponding error of the ensemble output f_{ens}(x_{j}):

$$E_{ens} = \mathcal{E}\left[\left(\frac{1}{K}\sum_{i=1}^{K}\varepsilon_i(x_j)\right)^2\right]$$

If the errors ε_{i}(x_{j}) are uncorrelated and have zero mean, the relation of the ensemble error to the average error of the networks working separately is:

$$E_{ens} = \frac{1}{K} E_{av}$$

In practice, the errors ε_{i}(x_{j}) are highly correlated and the reduction of the error is much smaller. Typically, some useful reduction of the error is still obtained, as ensemble averaging cannot produce an increase in the expected error:

$$E_{ens} \leq E_{av}$$
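The error relations above can be checked numerically. The following sketch (a synthetic target and noise model, not the paper's data) simulates K networks whose errors are uncorrelated and zero-mean, and verifies that the ensemble error is close to $E_{av}/K$ and never exceeds $E_{av}$:

```python
import numpy as np

# Illustrative sketch: K regressors whose predictions equal the true
# function g(x) plus independent zero-mean noise, as assumed in the
# derivation above.  The ensemble output is the plain average.
rng = np.random.default_rng(0)
K, n = 5, 100_000

x = rng.uniform(0.0, 1.0, size=n)
g = np.sin(2.0 * np.pi * x)                   # hypothetical target g(x)
eps = rng.normal(0.0, 0.1, size=(K, n))       # uncorrelated, zero-mean errors
f = g + eps                                   # f_i(x) = g(x) + eps_i(x)

e_i = ((f - g) ** 2).mean(axis=1)             # per-network squared error
e_av = e_i.mean()                             # average error, E_av
e_ens = ((f.mean(axis=0) - g) ** 2).mean()    # error of the averaged ensemble

print(e_ens <= e_av)                  # averaging never worse on average
print(abs(e_ens - e_av / K) < 1e-3)   # ~ E_av / K for uncorrelated errors
```

With correlated errors (e.g., noise shared between the K models) the factor-of-K reduction disappears, while the inequality $E_{ens} \leq E_{av}$ still holds.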

#### 2.2. Stacked Generalisation

In the stacked generalisation approach, an ensemble of K level-0 networks shares the x_{j} inputs; the outputs of the level-0 networks can be written as ${\widehat{y}}_{i}$ = f_{i}(x_{j}). These outputs are then combined, with the use of the level-1 network, to give the final output. Formally, the model can be given as:

$$\widehat{y}_{sg} = f_{sg}\left(f_1(x_j), f_2(x_j), \ldots, f_K(x_j)\right)$$

where f_{sg} denotes the mapping implemented by the level-1 network.
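A minimal sketch of this two-level arrangement follows. The paper's level-1 combiner is an MLP (e.g., MLP 5-2-1); here, to keep the example self-contained, a linear least-squares combiner stands in for it, and the level-0 outputs are synthetic. `yhat_level0[:, i]` plays the role of f_{i}(x_{j}):

```python
import numpy as np

# Stacked-generalisation sketch with made-up data: five noisy "level-0
# predictions" of a common target y, combined by a level-1 linear model.
rng = np.random.default_rng(1)
n, K = 200, 5

y = rng.uniform(2.0, 15.0, size=n)                       # expected outputs
yhat_level0 = y[:, None] + rng.normal(0.0, 1.0, (n, K))  # level-0 predictions

# Level-1 model: weights w minimising ||yhat_level0 @ w - y||^2.
w, *_ = np.linalg.lstsq(yhat_level0, y, rcond=None)
yhat_sg = yhat_level0 @ w                                # final stacked output

mse_level0 = ((yhat_level0 - y[:, None]) ** 2).mean(axis=0)
mse_sg = ((yhat_sg - y) ** 2).mean()
print(mse_sg <= mse_level0.min() + 1e-9)   # combiner beats each member on fit
```

Because the least-squares search space contains every single-member selection, the fitted combination cannot be worse than the best individual model on the fitting data; a nonlinear level-1 MLP, as used in the paper, generalises this idea.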

## 3. Construction Site Overhead Cost Index Prediction as a Regression Analysis Problem—Assumptions for Ensemble Averaging and Stacked Generalisation

In the case of ensemble averaging, the model of the site overhead cost index takes the form of Equation (11), where f_{i} is the i-th mapping function implemented implicitly by the i-th neural network belonging to the ensemble, x_{j} are the independent variables (the input shared by all of the members of the ensemble) for j = 1,…,m, and ε_{i} is the error of approximation by the i-th member of the ensemble for i = 1,…,K.

In the case of stacked generalisation, the model takes the form of Equation (12), where f_{i} is the i-th mapping function implemented implicitly by the i-th level-0 neural network, x_{j} is as in (11), and ε_{sg} is the error of approximation by the model.

The model relates the site overhead cost index to the independent variables x_{j}. The value of the dependent variable in the p-th sample (p = 1,…,143) was calculated as follows:

$$SOC_{ind}^{p} = \frac{SOC^{p}}{LC^{p} + MC^{p} + EC^{p} + SC^{p}} \cdot 100\% \quad (13)$$

where $SOC_{ind}^{p}$ is the site overhead cost index, $SOC^{p}$ the site overhead costs observed in reality, $LC^{p}$ the labour costs observed in reality, $MC^{p}$ the material costs observed in reality, $EC^{p}$ the equipment costs observed in reality, and $SC^{p}$ the subcontractors’ costs observed in reality, all for the p-th observation (sample). Some exemplary data, including the cost components present in Equation (13), in thousands of Euros, and the corresponding site overhead cost indexes, are presented in Table 1.
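Equation (13) can be transcribed directly. The short sketch below (the helper name `soc_index` is ours, not the paper's) reproduces the exemplary rows of Table 1:

```python
# A direct transcription of Equation (13); cost arguments are in
# thousands of Euros, as in Table 1.
def soc_index(soc, lc, mc, ec, sc):
    """Site overhead cost index: SOC / (LC + MC + EC + SC) * 100%."""
    return 100.0 * soc / (lc + mc + ec + sc)

# Sample p = 11 from Table 1:
print(round(soc_index(450.00, 3828.60, 4183.50, 336.00, 1818.40), 1))  # 4.4
# Sample p = 37 from Table 1:
print(round(soc_index(289.00, 1693.00, 1564.00, 85.00, 0.00), 1))      # 8.6
```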

The model included eleven independent variables x_{j}, where j = 1,…,11. Three variables brought to the model information about the types of work executed in the project:

- x_{1}: types of work, general construction works,
- x_{2}: types of work, installation works,
- x_{3}: types of work, engineering works.

Four variables characterised the location of the construction site:

- x_{4}: construction site location, in the city centre,
- x_{5}: construction site location, outside the city centre,
- x_{6}: construction site location, non-urban spaces,
- x_{7}: distance between the construction site and the company’s office.

One variable described the time of execution of the project:

- x_{8}: overall duration of construction works.

Two variables described the shares of specific kinds of works:

- x_{9}: relationship of the amount of works performed in winter to the total amount of works,
- x_{10}: relationship of the amount of works performed by subcontractors to the total amount of works.

The last variable characterised the contractor:

- x_{11}: size and necessary potential of the main contractor.

The variables x_{1}–x_{6} were of the nominal type. A binary method of coding was applied in the case of x_{1}, x_{2}, and x_{3}: their range of values was 0 or 1. In the case of x_{4}, x_{5}, and x_{6}, a “1 of n” method of coding was applied: the range of values, considered for the three variables altogether, was 1, 0, 0 or 0, 1, 0 or 0, 0, 1.

The variables x_{7}–x_{10} were of the quantitative type, whereas x_{11} was of the nominal type. A pseudo-fuzzy scaling method of coding was applied to transform the original values or information into numerical values in the range 0.1–0.9 for the variables presented in Table 2; for the variable x_{9}, the values were scaled into the range 0.0–1.0. The transformation for these variables is presented in Table 2. The rationale for the transformation was to provide a common scale for all of the variables.
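As an illustration, the pseudo-fuzzy coding of variable x_{9} (share of works executed in winter) can be sketched from the ranges listed in Table 2. The handling of values falling exactly on an interval boundary is our assumption; the paper does not state which side such a value falls on:

```python
# Sketch of the Table 2 coding for x9; boundary handling is an assumption.
def encode_x9(winter_share_percent):
    """Map the winter-works share (in %) to its pseudo-fuzzy code, 0.0-1.0."""
    thresholds = [(10, 0.0), (20, 0.1), (40, 0.3), (60, 0.5), (80, 0.7), (90, 0.9)]
    for upper, code in thresholds:
        if winter_share_percent <= upper:
            return code
    return 1.0  # more than 90%

print(encode_x9(5))    # 0.0
print(encode_x9(35))   # 0.3
print(encode_x9(95))   # 1.0
```

The remaining variables x_{7}, x_{8}, x_{10}, and x_{11} follow the same pattern with the 0.1/0.5/0.9 codes of Table 2.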

## 4. Models’ Development, Results, and Discussion

#### 4.1. Models’ Development Strategy

Applying the criteria R_{L} > 0.90, R_{V} > 0.90, R_{L&V} > 0.90, and R_{T} > 0.90, the authors initially selected 20 networks; from these, the networks for which the differences between RMSE_{L}, RMSE_{V}, RMSE_{L&V}, and RMSE_{T} were the smallest were chosen.
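This two-step selection can be sketched as a filter followed by a ranking. The candidate records below are made-up placeholders, not the paper's actual candidates:

```python
# Hypothetical sketch of the two-step selection: keep candidates with
# R > 0.90 on every subset (L, V, L&V, T), then rank the survivors by the
# spread of their RMSE values across the subsets (smallest spread first).
candidates = [
    {"name": "MLP 11-10-1", "R": (0.99, 0.97, 0.98, 0.97), "RMSE": (0.011, 0.021, 0.013, 0.018)},
    {"name": "MLP 11-4-1",  "R": (0.95, 0.88, 0.93, 0.91), "RMSE": (0.020, 0.035, 0.024, 0.030)},
    {"name": "MLP 11-6-1",  "R": (0.98, 0.96, 0.98, 0.98), "RMSE": (0.015, 0.020, 0.016, 0.016)},
]

step1 = [c for c in candidates if all(r > 0.90 for r in c["R"])]
step2 = sorted(step1, key=lambda c: max(c["RMSE"]) - min(c["RMSE"]))
print([c["name"] for c in step2])
```

Here `MLP 11-4-1` is rejected in step 1 (R_{V} = 0.88), and `MLP 11-6-1` ranks first in step 2 with the smallest RMSE spread.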

#### 4.2. Results and Discussion

Figure 3a,b presents, for SOC_{ind}, the points of coordinates (y^{p}, ŷ^{p}) for the training and testing subsets, drawn for the five selected networks acting individually. One can see that, in terms of the criteria shown in Table 4 and according to the results presented in Table 5, the performance of the five networks acting individually was similar and the errors were comparable. However, Figure 3a,b and the analysis of the location and the distribution of the points in the graphs reveal that the predictions would depend strongly on the choice of a single network acting separately. Although most of the points were distributed along the line of a perfect fit, some points (marked with the ellipses) were placed outside of the cone delimited by percentage errors equal to +25% and −25%.

The training patterns, composed of the outputs of the five selected level-0 networks as inputs and SOC_{ind} as the expected outputs, were divided randomly for each investigated network into the learning and validating subsets in the proportion L/V = 60%/40%. The investigated networks also varied in the initial weights of the neurons at the beginning of the training process. Altogether, around 100 networks were trained and examined. For the purposes of testing, the authors used the T subset, which was selected in the initial stage of the research (as presented previously in Section 3). The criteria of the two-step selection of the level-1 networks were similar to those applied to the ensemble candidate networks. The final choice of two level-1 networks, namely MLP 5-2-1 and MLP 5-3-1, allowed for the introduction of two alternative stacked generalisation-based models; both networks were retained, and two alternative models are discussed further, because of the comparable quality of these models. These models are later called ENS SG1 and ENS SG2, respectively. The details of the selected level-1 networks are presented in Table 8.

The performance of the models was assessed with the criteria presented before, including the maximum of the absolute percentage error, APE_{max}.

The quality of prediction of SOC_{ind} was compared for the ENS AV, ENS SG1, and ENS SG2 models. Figure 4, Figure 5 and Figure 6 present the points of coordinates (y^{p}, ${{\widehat{y}}^{p}}_{ens}$) for the training and testing subsets separately. When compared to Figure 3, these graphs show that combining the five selected ANNs allowed for the compensation of the errors made by the ANNs acting in isolation, in the case of the ENS AV as well as the ENS SG1 and ENS SG2 models. Although an improvement has been achieved in the case of all three introduced models, one can see that the best performance is provided by ENS SG2, where all of the points are distributed within the cone of acceptable errors. In the case of ENS AV and ENS SG1, there are single points located outside of the cone.

Figure 7, Figure 8 and Figure 9 present the frequencies and distributions of the APE^{p} errors computed for the training and testing subsets for the models based on ensembles of networks. The errors have been accumulated and counted in five intervals whose ranges equal 5%, plus one interval accumulating errors greater than or equal to 25%:

- interval 1: 0% ≤ APE^{p} < 5%,
- interval 2: 5% ≤ APE^{p} < 10%,
- interval 3: 10% ≤ APE^{p} < 15%,
- interval 4: 15% ≤ APE^{p} < 20%,
- interval 5: 20% ≤ APE^{p} < 25%,
- interval 6: APE^{p} ≥ 25%.
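The binning of the errors into these intervals can be sketched as follows; the sample errors are illustrative, not taken from the paper:

```python
# Count APE errors in the six intervals defined above.
def ape_interval(ape_percent):
    """Return the interval index (1-6) for an absolute percentage error."""
    if ape_percent >= 25.0:
        return 6
    return int(ape_percent // 5) + 1

sample_apes = [1.2, 4.9, 7.0, 14.9, 22.5, 25.0, 40.0]  # illustrative values
counts = {i: 0 for i in range(1, 7)}
for ape in sample_apes:
    counts[ape_interval(ape)] += 1
print(counts)  # {1: 2, 2: 1, 3: 1, 4: 0, 5: 1, 6: 2}
```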

In the cases of the ENS AV and ENS SG1 models, only single APE^{p} errors (19) are greater than 25%, and in the case of ENS SG2, none of them fall into this range. By contrast, for the networks acting separately, a significant number of the errors are greater than 25%. These results can be explained through the analysis of the APE^{p} errors for the networks acting separately. For the networks acting separately (ANN1, ANN2, ANN3, ANN4, ANN5), many of the errors APE^{p} belonging to interval 1 were relatively small and close to 0%. On the other hand, these small errors were accompanied by a significant number of errors APE^{p} ≥ 25% and by high values of APE_{max} (compare Table 7). In the case of the ensemble-based models, these errors have been compensated due to the ensemble averaging (ENS AV) or stacked generalisation (ENS SG1, ENS SG2).

In consequence, the ensemble-based models reduce the risk of predictions burdened with errors APE^{p} ≥ 25%.

Relying on a single network for the prediction of SOC_{ind} would burden the predictions with the choice of the network; this is confirmed by the distribution of the points that represent the expected and predicted values (y^{p}, ${\widehat{y}}^{p}$) in Figure 3.

The lower values of MAPE and APE_{max}, as well as the more stable predictions, are the main benefits of employing the ensembles in the models. Furthermore, the risk of errors exceeding the critical level of 25%, in terms of percentage errors, is reduced. These benefits have been achieved at some cost, mainly due to the compensation of the very small and very high errors produced by certain networks acting separately for certain training and testing patterns. However, the compensation of the errors in the ensemble-based models reduces the unwanted oversensitivity of the networks acting separately to certain training patterns.

## 5. Summary and Conclusions

For the networks acting separately, values of APE_{max} for testing ranged between 26.6% and 76.1%. In the case of the proposed ensembles of networks, both the MAPE and APE_{max} errors for testing were lower; values of MAPE ranged between 6.3% and 9.2%, whereas values of APE_{max} ranged between 18.5% and 23.4%. The quality of the ensemble-based models is also visible in the distribution of the errors for each of them: more than 90% of the APE testing errors were smaller than 25%.

## Author Contributions

## Funding

## Conflicts of Interest

## References

- Leśniak, A.; Juszczyk, M. Prediction of site overhead costs with the use of artificial neural network based model. Arch. Civ. Mech. Eng. **2018**, 18, 973–982.
- Gajzler, M.; Zima, K. Evaluation of Planned Construction Projects Using Fuzzy Logic. Int. J. Civ. Eng. **2017**, 15, 641–652.
- Leśniak, A.; Plebankiewicz, E. Modeling the decision-making process concerning participation in construction bidding. J. Manag. Eng. **2013**, 31, 04014032.
- Skorupka, D. Identification and initial risk assessment of construction projects in Poland. J. Manag. Eng. **2008**, 24, 120–127.
- Tam, C.M.; Tong, T.K.; Chiu, G.C.; Fung, I.W. Non-structural fuzzy decision support system for evaluation of construction safety management system. Int. J. Proj. Manag. **2002**, 20, 303–313.
- Hoła, B. Identification and evaluation of processes in a construction enterprise. Arch. Civ. Mech. Eng. **2015**, 15, 419–426.
- Krzemiński, M. KASS v.2.2. Scheduling Software for Construction with Optimization Criteria Description. Acta Phys. Polon. A **2016**, 309, 1439–1442.
- Chatterjee, K.; Zavadskas, E.K.; Tamošaitienė, J.; Adhikary, K.; Kar, S. A hybrid MCDM technique for risk management in construction projects. Symmetry **2018**, 10, 46.
- Tamošaitienė, J.; Zavadskas, E.K.; Šileikaitė, I.; Turskis, Z. A novel hybrid MCDM approach for complicated supply chain management problems in construction. Procedia Eng. **2017**, 172, 1137–1145.
- Zavadskas, E.K.; Antucheviciene, J.; Vilutiene, T.; Adeli, H. Sustainable decision-making in civil engineering, construction and building technology. Sustainability **2018**, 10, 14.
- Anysz, H.; Zbiciak, A.; Ibadov, N. The Influence of Input Data Standardization Method on Prediction Accuracy of Artificial Neural Networks. Procedia Eng. **2016**, 153, 66–70.
- Dikmen, S.U.; Sonmez, M. An Artificial Neural Networks Model for the Estimation of Formwork Labour. J. Civ. Eng. Manag. **2011**, 17, 340–347.
- Schabowicz, K.; Hoła, B. Application of artificial neural networks in predicting earthmoving machinery effectiveness ratios. Arch. Civ. Mech. Eng. **2008**, 8, 73–84.
- Yip, H.L.; Fan, H.; Chiang, Y.H. Predicting the maintenance cost of construction equipment: Comparison between general regression neural network and Box–Jenkins time series models. Autom. Constr. **2014**, 38, 30–38.
- Leśniak, A. Supporting contractors’ bidding decision: RBF neural network application. AIP Conf. Proc. **2016**, 1738, 200002.
- Wanous, M.; Boussabaine, H.A.; Lewis, J. A neural network bid/no bid model: The case for contractors in Syria. Constr. Manag. Econ. **2003**, 21, 737–744.
- Ashraf, M.E. Classifying construction contractors using unsupervised-learning neural networks. J. Constr. Eng. Manag. **2006**, 132, 1242–1253.
- Mrówczyńska, M. Neural networks and neuro-fuzzy systems applied to the analysis of selected problems of geodesy. Comput. Assisted Mech. Eng. Sci. **2011**, 18, 161–173.
- Zavadskas, E.K.; Vilutienė, T.; Tamošaitienė, J. Harmonization of cyclical construction processes: A systematic review. Procedia Eng. **2017**, 208, 190–202.
- Kapliński, O. Innovative solutions in construction industry. Review of 2016–2018 events and trends. Eng. Struct. Technol. **2018**, 10, 27–33.
- Trost, S.M.; Oberlender, G.D. Predicting accuracy of early cost estimates using factor analysis and multivariate regression. J. Constr. Eng. Manag. **2003**, 129, 198–204.
- Belniak, S.; Leśniak, A.; Plebankiewicz, E.; Zima, K. The influence of the building shape on the costs of its construction. J. Financ. Manag. Prop. Constr. **2013**, 18, 90–102.
- Leśniak, A.; Zima, K. Cost calculation of construction projects including sustainability factors using the Case Based Reasoning (CBR) method. Sustainability **2018**, 10, 1608.
- El Sawalhi, N.I. Modelling the parametric construction project cost estimate using fuzzy logic. Int. J. Emerg. Technol. Adv. Eng. **2012**, 2, 2250–2459.
- Kim, K.J.; Kim, K. Preliminary cost estimation model using case-based reasoning and genetic algorithms. J. Comput. Civ. Eng. **2010**, 24, 499–505.
- Wilmot, C.G.; Mei, B. Neural network modeling of highway construction costs. J. Constr. Eng. Manag. **2005**, 31, 765–771.
- Attala, M.; Hegazy, T. Predicting cost deviation in reconstruction projects: Artificial neural networks versus regression. J. Constr. Eng. Manag. **2003**, 129, 405–411.
- El-Sawalhi, N.I.; Shehatto, O. Neural Network Model for Building Construction Projects Cost Estimating. J. Constr. Eng. Proj. Manag. **2014**, 4, 9–16.
- Juszczyk, M. The challenges of nonparametric cost estimation of construction works with the use of artificial intelligence tools. Procedia Eng. **2018**, 196, 415–422.
- Juszczyk, M. Application of committees of neural networks for conceptual cost estimation of residential buildings. AIP Conf. Proc. **2015**, 1648, 600008.
- El-Sawy, I.Y.; Hosny, H.E.; Razek, M.A. A Neural Network Model for Construction Projects Site Overhead Cost Estimating in Egypt. Int. J. Comput. Sci. Issues **2011**, 8, 273–283.
- Juszczyk, M.; Leśniak, A. Site Overhead Cost Index Prediction Using RBF Neural Networks. In Proceedings of the 3rd International Conference on Economics and Management (ICEM 2016), Suzhou, China, 2–3 July 2016; DEStech Publications Inc.: Lancaster, CA, USA, 2016; pp. 381–386.
- Juszczyk, M. Implementation of the ANNs ensembles in macro-BIM cost estimates of buildings’ floor structural frames. AIP Conf. Proc. **2018**, 1946, 020014.
- Juszczyk, M.; Leśniak, A.; Zima, K. ANN Based Approach for Estimation of Construction Costs of Sports Fields. Complexity **2018**, 2018, 7952434.
- Yazdani-Chamzini, A.; Zavadskas, E.K.; Antucheviciene, J.; Bausys, R. A Model for Shovel Capital Cost Estimation, Using a Hybrid Model of Multivariate Regression and Neural Networks. Symmetry **2017**, 9, 298.
- Plebankiewicz, E.; Leśniak, A. Overhead costs and profit calculation by Polish contractors. Technol. Econ. Dev. Econ. **2013**, 19, 141–161.
- Peurifoy, R.L.; Oberlender, G.D. Estimating Construction Costs, 4th ed.; McGraw Hill: New York, NY, USA, 1989.
- Coombs, W.E.; Palmer, W.J. Construction Accounting and Financial Management, 4th ed.; McGraw Hill: New York, NY, USA, 1989.
- Apanavičienė, R.; Daugėlienė, A. New Classification of Construction Companies: Overhead Costs Aspect. J. Civ. Eng. Manag. **2011**, 17, 457–466.
- Chartered Institute of Building. Project Overheads, in Code of Estimating Practice, 7th ed.; Wiley-Blackwell: Oxford, UK, 2009.
- Hegazy, T.; Moselhi, O. Elements of cost estimation: A survey in Canada and United States. Cost Eng. **1995**, 37, 27–30.
- Assaf, S.A.; Bubshait, A.A.; Atiyah, S.; Al-Shahri, M. The management of construction company overhead costs. Int. J. Proj. Manag. **2001**, 19, 295–303.
- Brook, M. Preliminaries in Estimating and Tendering for Construction Work; Butterworth-Heinemann: Oxford, UK, 1998.
- Cooke, B. Contract Planning and Contractual Procedures; Macmillan: Basingstoke, UK, 1981.
- Assaf, S.A.; Bubshait, A.A.; Atiyah, S.; Al-Shahri, M. Project overhead costs in Saudi Arabia. Cost Eng. **1999**, 41, 33–37.
- Chan, C.T.W. The principal factors affecting construction project overhead expenses: An exploratory factor analysis approach. Constr. Manag. Econ. **2012**, 30, 903–914.
- Bishop, C.M. Neural Networks for Pattern Recognition; Oxford University Press: Oxford, UK, 1995.
- Haykin, S. Neural Networks: A Comprehensive Foundation; Macmillan Publishing: New York, NY, USA, 1994.
- Rojas, R. Neural Networks: A Systematic Introduction; Springer Science & Business Media: New York, NY, USA, 2013.
- Krogh, A.; Vedelsby, J. Neural network ensembles, cross validation, and active learning. In Advances in Neural Information Processing Systems 7; Tesauro, G., Touretzky, D.S., Leen, T.K., Eds.; MIT Press: Cambridge, MA, USA, 1995; pp. 231–238.

**Figure 3.** Scatterplots of y and $\widehat{y}$ for the five selected neural networks acting separately: (**a**) scatterplot for samples belonging to the training subset; (**b**) scatterplot for samples belonging to the testing subset.

**Figure 4.** Scatterplot of y and ${\widehat{y}}_{ens}$ for the ensemble ENS AV, performing ensemble averaging: (**a**) scatterplot for samples belonging to the training subset; (**b**) scatterplot for samples belonging to the testing subset.

**Figure 5.** Scatterplot of y and ${\widehat{y}}_{sg}$ for the ensemble ENS SG1: (**a**) scatterplot for samples belonging to the training subset; (**b**) scatterplot for samples belonging to the testing subset.

**Figure 6.** Scatterplot of y and ${\widehat{y}}_{sg}$ for the ensemble ENS SG2: (**a**) scatterplot for samples belonging to the training subset; (**b**) scatterplot for samples belonging to the testing subset.

**Figure 7.** Frequencies and distributions of absolute percentage errors for the ENS AV model computed for the training and testing subsets.

**Figure 8.** Frequencies and distributions of absolute percentage errors for the ENS SG1 model computed for the training and testing subsets.

**Figure 9.** Frequencies and distributions of absolute percentage errors for the ENS SG2 model computed for the training and testing subsets.

**Table 1.** Exemplary data: cost components (in thousands of Euros) and the corresponding site overhead cost indexes.

p | SOC^{p} | LC^{p} | MC^{p} | EC^{p} | SC^{p} | SOC_{ind}^{p} |
---|---|---|---|---|---|---|
11 | 450.00 | 3828.60 | 4183.50 | 336.00 | 1818.40 | 4.4% |
37 | 289.00 | 1693.00 | 1564.00 | 85.00 | 0.00 | 8.6% |
72 | 812.54 | 3393.91 | 2893.45 | 564.30 | 5146.69 | 6.8% |
99 | 217.60 | 382.36 | 514.23 | 48.52 | 547.43 | 14.6% |

**Table 2.** Transformation of the descriptive values into the numerical values for variables x_{7}–x_{11}.

Variable | Description | Descriptive Values | Numerical Values |
---|---|---|---|
x_{7} | distance | up to 20 km | 0.1 |
 | | more than 20 km | 0.9 |
x_{8} | duration of construction works | up to 6 months | 0.1 |
 | | between 6 and 12 months | 0.5 |
 | | more than 12 months | 0.9 |
x_{9} | share of the amount of works executed in winter | up to 10% | 0 |
 | | between 10% and 20% | 0.1 |
 | | between 20% and 40% | 0.3 |
 | | between 40% and 60% | 0.5 |
 | | between 60% and 80% | 0.7 |
 | | between 80% and 90% | 0.9 |
 | | more than 90% | 1 |
x_{10} | share of subcontractors in the total amount of works | up to 20% | 0.1 |
 | | between 20% and 50% | 0.5 |
 | | between 50% and 100% | 0.9 |
x_{11} | size and potential of the main contractor | low | 0.1 |
 | | average | 0.5 |
 | | high | 0.9 |

p | x_{1} | x_{2} | x_{3} | x_{4} | x_{5} | x_{6} | x_{7} | x_{8} | x_{9} | x_{10} | x_{11} | y |
---|---|---|---|---|---|---|---|---|---|---|---|---|
7 | 1.00 | 0.00 | 0.00 | 0.00 | 1.00 | 0.00 | 0.90 | 0.50 | 0.00 | 0.10 | 0.50 | 8.2% |
29 | 1.00 | 1.00 | 1.00 | 0.00 | 0.00 | 1.00 | 0.90 | 0.10 | 0.90 | 0.10 | 0.90 | 12.8% |
53 | 1.00 | 1.00 | 1.00 | 0.00 | 1.00 | 0.00 | 0.10 | 0.50 | 0.30 | 0.90 | 0.50 | 4.9% |
73 | 1.00 | 0.00 | 0.00 | 1.00 | 0.00 | 0.00 | 0.90 | 0.50 | 0.10 | 0.10 | 0.10 | 4.4% |
82 | 1.00 | 1.00 | 1.00 | 0.00 | 1.00 | 0.00 | 0.10 | 0.10 | 0.10 | 0.10 | 0.10 | 6.1% |
105 | 1.00 | 0.00 | 0.00 | 0.00 | 1.00 | 0.00 | 0.10 | 0.10 | 0.10 | 0.50 | 0.10 | 4.2% |
117 | 0.00 | 0.00 | 1.00 | 0.00 | 1.00 | 0.00 | 0.10 | 0.10 | 0.50 | 0.50 | 0.50 | 9.7% |

Description | Equation | No. | Used As |
---|---|---|---|
sum-of-squares error function | ${E}_{sos}=\frac{1}{2}{\displaystyle \sum _{p\in L}}{\left({y}^{p}-{\widehat{y}}^{p}\right)}^{2}$ | (15) | error function |
Pearson’s correlation coefficient | $R=\frac{cov\left(y,\widehat{y}\right)}{{\sigma}_{y}{\sigma}_{\widehat{y}}}$ | (16) | criterion for general assessment of the trained ANN’s quality (calculated for the L, V, L&V, and T subsets separately; cov(y,ŷ) is the covariance between y and ŷ, σ_{y} the standard deviation of y, σ_{ŷ} the standard deviation of ŷ) |
root mean squared error | $RMSE=\sqrt{\frac{1}{c}{\displaystyle \sum _{p}}{\left({y}^{p}-{\widehat{y}}^{p}\right)}^{2}}$ | (17) | criteria for assessment of the quality and performance of both the trained ANNs and the developed ensemble-based models (calculated for the L, V, L&V, and T subsets separately; c stands for the subset cardinality, p for the index of a training pattern) |
mean absolute percentage error | $MAPE=\frac{100\%}{c}{\displaystyle \sum _{p}}\left|\frac{{y}^{p}-{\widehat{y}}^{p}}{{y}^{p}}\right|$ | (18) | |
absolute percentage error | $AP{E}^{p}=\left|\frac{{y}^{p}-{\widehat{y}}^{p}}{{y}^{p}}\right|\cdot 100\%$ | (19) | |
maximum of absolute percentage error | $AP{E}_{max}=max\left(\left|\frac{{y}^{p}-{\widehat{y}}^{p}}{{y}^{p}}\right|\cdot 100\%\right)$ | (20) | |
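The assessment measures of Equations (17), (18) and (20) can be implemented in a few lines. The sample values of `y` and `yhat` below are made up for illustration, not results from the paper:

```python
import math

# Straightforward implementations of RMSE (17), MAPE (18), and APE_max (20).
def rmse(y, yhat):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y, yhat)) / len(y))

def mape(y, yhat):
    return 100.0 / len(y) * sum(abs((a - b) / a) for a, b in zip(y, yhat))

def ape_max(y, yhat):
    return max(abs((a - b) / a) * 100.0 for a, b in zip(y, yhat))

y    = [4.4, 8.6, 6.8, 14.6]   # expected cost indexes (illustrative)
yhat = [4.0, 9.0, 6.8, 13.0]   # predicted cost indexes (illustrative)
print(round(mape(y, yhat), 2))     # 6.18
print(round(ape_max(y, yhat), 2))  # 10.96
```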

ANN | Structure | Number of Neurons in the Hidden Layer | Hidden Layer Activation Function | Output Layer Activation Function | Number of Training Epochs |
---|---|---|---|---|---|
ANN1 | MLP 11-10-1 | 10 | hyperbolic tangent | hyperbolic tangent | 146 |
ANN2 | MLP 11-10-1 | 10 | hyperbolic tangent | hyperbolic tangent | 61 |
ANN3 | MLP 11-6-1 | 6 | exponential | linear | 109 |
ANN4 | MLP 11-8-1 | 8 | hyperbolic tangent | exponential | 67 |
ANN5 | MLP 11-8-1 | 8 | logistic | hyperbolic tangent | 73 |

ANN | R_{L} | R_{V} | R_{L&V} | R_{T} | RMSE_{L} | RMSE_{V} | RMSE_{L&V} | RMSE_{T} | MAPE_{L} | MAPE_{V} | MAPE_{L&V} | MAPE_{T} |
---|---|---|---|---|---|---|---|---|---|---|---|---|
ANN1 | 0.9898 | 0.9662 | 0.9850 | 0.9731 | 0.0112 | 0.0206 | 0.01308 | 0.0181 | 6.0% | 18.3% | 8.4% | 10.18% |
ANN2 | 0.9861 | 0.9319 | 0.9761 | 0.9729 | 0.0131 | 0.0277 | 0.01643 | 0.0187 | 8.7% | 14.5% | 9.9% | 10.19% |
ANN3 | 0.9808 | 0.9645 | 0.9778 | 0.9804 | 0.0154 | 0.0198 | 0.01584 | 0.0159 | 10.4% | 9.6% | 11.3% | 12.70% |
ANN4 | 0.9868 | 0.9555 | 0.9737 | 0.9751 | 0.0128 | 0.0227 | 0.01482 | 0.0175 | 10.2% | 13.1% | 10.8% | 13.69% |
ANN5 | 0.9855 | 0.9278 | 0.9807 | 0.9881 | 0.0132 | 0.0296 | 0.01724 | 0.0123 | 11.1% | 17.8% | 12.5% | 9.15% |

(L, V, L&V: training; T: testing.)

ANN | APE_{max} (L) | APE_{max} (V) | APE_{max} (L&V) | APE_{max} (T) |
---|---|---|---|---|
ANN1 | 65.9% | 64.7% | 65.9% | 76.1% |
ANN2 | 80.7% | 45.2% | 80.7% | 33.4% |
ANN3 | 73.2% | 91.8% | 91.8% | 43.2% |
ANN4 | 50.6% | 64.4% | 64.4% | 70.1% |
ANN5 | 53.3% | 79.1% | 79.1% | 26.6% |

(L, V, L&V: training; T: testing.)

Model | Structure | Number of Neurons in the Hidden Layer | Hidden Layer Activation Function | Output Layer Activation Function | Number of Training Epochs |
---|---|---|---|---|---|
ENS SG1 | MLP 5-2-1 | 2 | exponential | linear | 40 |
ENS SG2 | MLP 5-3-1 | 3 | exponential | exponential | 51 |

Model | R (L&V) | R (T) | RMSE (L&V) | RMSE (T) | MAPE (L&V) | MAPE (T) | APE_{max} (L&V) | APE_{max} (T) |
---|---|---|---|---|---|---|---|---|
ENS AV | 0.9869 | 0.9899 | 0.0126 | 0.0112 | 8.3% | 7.1% | 42.6% | 23.4% |
ENS SG1 | 0.9853 | 0.9878 | 0.0135 | 0.0127 | 9.6% | 9.2% | 40.7% | 26.7% |
ENS SG2 | 0.9914 | 0.9922 | 0.0103 | 0.0098 | 7.3% | 6.3% | 23.2% | 18.5% |

(L&V: training; T: testing.)

© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Juszczyk, M.; Leśniak, A.
Modelling Construction Site Cost Index Based on Neural Network Ensembles. *Symmetry* **2019**, *11*, 411.
https://doi.org/10.3390/sym11030411
