Using ANN to Predict the Impact of Communication Factors on the Rework Cost in Construction Projects

The construction sector has a large impact on the environment and available resources. Natural resources and energy are consumed not only during the operation of a facility but also during its construction. Consumption increases further when work already completed requires rework. In such cases, resources and energy are used again and additional waste is generated. Many studies support the relationship between communication and project efficiency, which is expressed in the cost of rework. At present, there is no tool available to quantify this relationship. This study aims to fill this knowledge gap. The purpose of the article was to create ANNs (artificial neural networks) for assessing and predicting the impact of communication factors on rework costs in construction projects. During the data collection phase, 12 factors that influence communication were identified and assessed. The level of rework costs in 18 construction projects was also calculated. We used an ANN that is a two-layer feedforward network with a sigmoid transfer function in the hidden layer and a linear transfer function in the output layer. The input layer consists of 12 neurons, the hidden layer of 10 neurons, and the output layer of one neuron. The best mean square error and correlation results were obtained with the Levenberg-Marquardt algorithm. The proposed model can be used by project management as an integrated decision support tool aimed at decreasing the number of reworks and reducing energy and resource consumption in construction projects.


Introduction
The construction sector has a large impact on the environment and available resources. Recently, a growing body of research has attempted to integrate environmental aspects into the construction process. The integration methods are applied to the main sources of environmental impact: energy and resource consumption, pollution emissions, and waste management [1,2].
The researchers note that, in the construction sector, most of the energy is consumed during the stages of building operation, that is for heating, cooling, lighting and hot water [5,6].
At the same time, one should not underestimate the number of consumed resources, including energy, during the construction phase. In this context, the problem of reworks in construction projects becomes important. Apart from the obvious negative impact on the project result, which is expressed in an increase in costs and time, the reworks lead to additional use of resources. unqualified quality problems, deviations, or errors [29]. Forcada et al. [30] noted that any additional work resulting from order changes, design errors, and volume changes should also be considered rework. Ashford described reworks as the process by which an item is made to conform to the original requirement by completion or correction [31].
Rework is regarded as a non-value-adding symptom that affects productivity and performance in construction projects [32]. In general, rework can be defined as a process or activity caused by a discrepancy between the actual and planned state of an object, which must then be brought into compliance.
Next, we review studies that identify poor communication as a cause of rework.
Khalesi et al. [33] study opportunities to identify and reduce rework in construction projects that resulted in time delays. The authors identified and ranked 49 rework reasons and combined them into eight groups: categorization of rework causes, engineering and reviews, implementation of project, material and equipment supply, human resource capability, construction planning and scheduling, effective external causes, leadership and communication. Results showed that the most important causes were "design errors", "difference among plans and operational specifications", "non-compliance with specifications", "insufficient skill level", "unrealistic schedules", "poor communication" and "economic fluctuations".
Ye et al. [34] surveyed 277 Chinese construction practitioners and, based on factor analysis, identified 39 major causes of rework. Among the factors that are associated with communication management were: poor coordination of project team members (6th position in the ranking), poor way of passing instructions on the project (9th position in the ranking), poor communication with project participants (19th position in the ranking).
Liu et al. evaluated the rework costs in residential construction [35]. They analyzed six residential projects and identified 36 rework factors. One of them, "Lack of communication between the project users and the client", was the most important rework cause. To address this problem, the authors propose improving the interaction between users and customers during the construction process and the operation stage.
Based on 24 previous studies, the authors identified the five most common rework factors: unfinished design at the time of the tender, poor communication between the client and the contractor/design consultant, insufficient labor skills, the client's lack of knowledge and experience in the design and construction process, and insufficient contractor and subcontractor attention to quality [36]. The authors identified 87 reasons for rework, and the factor "Weak communication/coordination with end users" ranked 12th. They recommended developing a new approach to effective communication between the project members (client, consultant, contractor and end users) in order to minimize the changes that occur during project delivery.
In this study, the authors identified the rework causes in construction projects from the point of view of a client, a designer and a contractor [37]. Poor communication ranked first among 22 client-related factors and fourth among 15 factors from the designer's point of view.
To achieve the goals of their study, the authors identified 51 manageable rework reasons, three of which were associated with poor communication [38]. Then, 44 data from case studies were collected and analyzed. The authors concluded that developing team building reduces the rework costs caused by communication issues between stakeholders, contractors and designers.
To analyze the reasons for the rework, the authors collected data on 47 construction projects [39]. The study results showed that the main rework causes are errors and omissions, project scope changes, inadequate specification, lack of work skills, poor oversight. The reason "Poor communication between builders" was ranked 10th out of 19.
In summary, most researchers consider "Communication" to be one of the main causes of alterations in construction projects, although its rank and weight vary between studies.

Communication
Communication management and the correct dissemination of information among the participants are important factors in project success. Effective use of the communication system allows managers to better structure information flows and minimize the costs associated with low communication levels. Despite the importance of communication, the amount of research that determines the factors influencing project communication is limited.
To systematize the study of communication, we divided the variables that affect communication into two groups: the factors influencing communication system and the qualitative communication factors.

The Factors Influencing Communication System
The purpose of the article in [40] was to identify the factors influencing project communication and to cluster them. The study was based on an analysis of project managers' views on communication, in order to determine the factors influencing successful project delivery.
The factors influencing project communication were determined on the basis of previous research and interviews with experts. The authors selected and confirmed 18 main factors, which were then investigated using fuzzy analysis and interpretive structural modeling techniques. The results of the study showed that the number of stakeholders and the size of the organization are the most important factors influencing project communication.
Muszynska [41] in her research identified the communication management as one of the most important and complex elements of the project management, which is dependent on many factors: cultural differences, trust, communication support tools, IT infrastructure, geographic distance, time interval, stakeholders, monitoring, measurement and analysis, planning, continuous improvement, models and policies, and curriculum.
The authors reviewed the literature and identified 34 factors that influence the effectiveness of communication between construction participants [42]. The importance of the selected variables was confirmed by questionnaires, and the factors were combined into eight groups. Systematizing the factors yielded a conceptual model for improving communication between construction participants. It was found that the main factors affecting communication between construction participants are the project complexity, the communication schedule and construction timescale, the number of participating companies, the contribution of the project manager, and good spirit and trust between the parties.
The authors of this study used a variety of statistical methods to identify indicators of effective communication [43]. Factor analysis and structural equation modeling were then used to determine the relationships between the identified components. The results showed that the critical factors are the project purpose, clarity of content, the labor, the goals and constraints, and the availability of resources. The research findings can help project managers focus on the key communication drivers to reduce the risk of project failure.
Safapor et al. [44] analyzed 40 construction projects for the factors affecting the communication quality, dividing communication into three levels: between owners, between designers and between contractors. It was found that project objectives, bureaucracy, location and coordination affect the quality of internal communication between the owners. Design and technology, clarity of project scope, resources, delivery, construction management and project management have an impact on the quality of internal communication in designers. Skilled workforce, goals, constraints, materials and equipment quality, availability of qualified project managers have an impact on the quality of internal communication between contractors.

The Qualitative Communication Factors
In this article, the authors describe the efforts of the Construction Industry Institute (CII) [45] research team to identify and measure the critical communication variables during the phases of a construction project. The critical communication variables identified in the process are grouped into six manageable categories that form the basis for a communication improvement program. These categories in the importance order are accuracy, procedures, barriers, understanding, timeliness, and completeness. The authors summarized that the maximum benefit from this study can be obtained by using the assessment tools as part of a communications improvement program.
The authors used the insights and tools from the CII study evaluating communication factors in construction projects [46]. In addition to the CII data collection tools, this study included the semistructured interviews as a means of contextualizing the communication and decision-making. The paper presents the results of the benchmarking and highlights the important issues that the project team members need to improve in order to achieve the planned level of timeliness, quality and cost.
Xie et al. [47] studied multilateral communication in a construction project. The authors used accuracy, procedures, barriers, understanding, timeliness, completeness, information flow, overload, underload, distortion, gatekeeping as quality variables affecting communication.
Aubert, Hooper, and Schnepel [48] repeatedly cited that the communication quality is the key success factor in ERP implementation. They defined the article purpose to conduct a case study to determine the importance of communication quality. The authors identified the main factors affecting the communication quality: completeness, credibility, accuracy, purpose adequacy, timeliness, openness, audience adequacy, bidirectionality, balance of formality/informality.

Neural Network
Recently, artificial neural networks have found widespread use in construction. In particular, ANNs have been successfully applied in decision-making, cost forecasting, scheduling optimization, risk assessment, and dispute resolution. Classically, ANNs are used for problems that are difficult to solve using traditional mathematical and statistical methods.
In this article, the authors used the capabilities of an ANN to predict the organizational efficiency level of a construction company. A backpropagation multilayer neural network was developed and trained. The results showed that combining statistical analysis and an ANN on a real dataset makes it possible to achieve high prediction accuracy [49].
This article analyzed the key factors of construction management and the level of their impact on project management effectiveness. The authors identified 12 key factors in construction management that relate to planning, organizing and controlling a project, as well as to the project manager and team. Based on these factors, a construction project management efficiency model was created. The use of an ANN made it possible to build an evaluation model of construction project management effectiveness and to determine the key factors that affect the execution of the project budget [50].
The authors modeled three different types of ANNs to predict the road construction cost based on four indicators: road length, road width, planned construction duration and planned construction cost. The neural networks can be used during the initial design phase when there is usually a limited or incomplete data set for the cost analysis. The proposed method can give more accurate results with less estimation errors [51].
Alaloul et al. [52] aimed to develop an ANN model to assess the influence of coordination factors on construction project delivery. For this, the 16 coordination factors that affect construction projects most were identified. Three layered feedforward networks with Elman backpropagation and propagation algorithms were used to train, validate, and test the data. The mean square error (MSE) with a mean of 0.0231 confirmed the accuracy of the models.
Hong et al. [53] used ANN to analyze the benefits and costs of implementing Building Information Modeling (BIM). To determine the functions of costs and benefits, multicomponent and multiclass classifications were adopted. The neural network was developed from data collected from Australian and Chinese construction firms using a 7-point Likert questionnaire. The proposed model can be useful to decision makers and can help in choosing the best BIM implementation strategies.
Palaneeswaran et al. [54] used backpropagation and general regression neural networks to study the effect of rework reasons on various project performance indicators (cost and time overruns, contractual claims). The research results can be used to develop forecasting systems and corresponding intelligent decision support structures.
Anysz et al. [55] used ANN tools to predict the prices of residential real estate of the same type. The flat area, the number of rooms, the number of storeys, the technical condition of the house and the condition of nearby residential buildings were used as variables. The database includes 222 flats with transaction prices from the secondary real estate market. The authors used a hybrid approach in which a classification problem is solved at the first stage and a regression problem at the second. The average percentage error in the calculations presented in the article ranges from 4.4% to 7.8%. The use of ANN tools showed that automated appraisal of real estate of the same type can give price predictions with a fairly low error.
As can be seen, ANNs have been widely used to solve various forecasting and classification problems in construction and, thus, can be used to determine the relationship between communication factors and rework costs.
The purpose of this study was to develop an ANN to evaluate and predict the impact of communication factors on rework costs in construction projects.
The following tasks were identified for this research: (1) calculating the rework cost in construction projects,

The Method of Rework Cost Calculation
The goal of the research method is to identify rework and calculate its cost. In selecting the data, the authors tried to maximize accuracy and minimize subjectivity. Of all the data on changes in construction project cost, only those related to the rework cost were used. In total, 18 construction projects (schools, kindergartens, sports complexes) implemented from 2012 to 2020 were investigated.
Local government, namely the Regional State Administration, was the investor in the projects. The projects were supervised by the Department of Construction and Architecture, which is responsible for control and inspection on behalf of the Regional State Administration.
During the study, the following project documentation was analyzed: architectural, structural and MEP projects; acts of additional work; local estimate of additional work; initial and final sheets of the scope of work performed.
If alterations occur during construction project delivery, they must be recorded in the act of additional work (data on the amount of work performed) and in the local estimate of additional work (data on the costs of the work performed).
The changes are visible when comparing the final and initial sheets of the scope of work performed.
The rework costs were calculated as a percentage of the planned project cost.
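The calculation described above can be illustrated with a short Python sketch. The function name and the monetary figures are hypothetical and are not taken from the project data; the sketch only shows the percentage formula stated in the text.

```python
def rework_cost_percent(planned_cost: float, rework_cost: float) -> float:
    """Return the rework cost as a percentage of the planned project cost."""
    if planned_cost <= 0:
        raise ValueError("planned_cost must be positive")
    if rework_cost < 0:
        raise ValueError("rework_cost must not be negative")
    return 100.0 * rework_cost / planned_cost


# Hypothetical figures (same currency unit for both arguments), not study data.
example_percent = rework_cost_percent(planned_cost=20_000_000, rework_cost=900_000)
```

Expressing rework as a share of the planned cost makes projects of different sizes comparable, which is presumably why a relative rather than an absolute measure was used.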

The Evaluation Method of Factors Influencing Communication
After analyzing the qualitative communication factors, the authors decided to take as a basis a classical study [45], in which six variables were identified: accuracy, procedures, barriers, understanding, timeliness, completeness.
After consultation with experts, six variables were selected as factors influencing communication system: number of project members, location and coordination, complexity of the project, quality and clarity of the project, previous experience in joint project delivery, IT infrastructure and communication support tools.
The data required for this study were obtained using the questionnaire survey method. Four members from each project took part in the survey: the project manager, the general contractor, the project office and the supervision inspector (investor representative). The survey involved participants from 18 projects, for which the rework costs were previously calculated. Thus, the authors managed to form a dataset of 72 items (18 projects multiplied by four respondents from each project). The survey was carried out in electronic form, by sending the MS Office Excel file to the respondent's e-mail. The file contained a dropdown list of 12 reasons and 11 levels of influence of each factor on communication. The respondents were asked to determine the influence on communication of each of the 12 previously selected factors.
The influence level was determined on a scale from 0 to 10, where 0 means that the factor does not affect communication and 10 means that the factor has the maximum effect on communication.

ANN Modeling
ANNs are a part of artificial intelligence. Their architecture resembles the cells of the human brain and nervous system, and they can learn from their own experience [56].
Neural networks are built in such a way that they can solve various problems without the support of experts and programming.
Mathematical and statistical methods require prior knowledge of the nature of the interdependence between input and output data. ANNs can search for patterns and connections in fuzzy data even if the form of the relationship is unknown; they can learn from examples and generalize solutions.
A typical ANN is a supervised learning model in which learning occurs by presenting a set of input variables (predictors), each of which has an associated target output value.
The ANN learns by changing the connection weights between neurons in response to errors between target and actual outputs. At the end of the testing phase, the neural network is a model that should be able to predict the target value from the given predictors.
According to Fausett [57], each neural network consists of a network architecture, a learning algorithm, and an activation function.
A network architecture is a specific arrangement and connection of neurons in the form of a network. ANN can be represented by different architectures: multilayer perceptron (MLP), generalized feed-forward neural network (GFNN), support vector machine (SVM), generalized regression neural network (GRNN), radial basis function neural network (RBFNN), neuro-fuzzy and others.
A typical ANN architecture is shown in Figure 1. A learning algorithm is a method used to determine the connection weights between neurons. In this study, the network was trained using the error backpropagation algorithm. The collected data were divided into three sets: training, validation and testing.
The backpropagation learning algorithm has three steps: forwarding the input training signal, calculating and backpropagating the associated error, and adjusting the weights.
At the first stage, each neuron in the input layer receives an input signal and transmits this signal to each neuron in the hidden layer.
Then each hidden neuron calculates its activation and transmits its signal to each output neuron. For a neural network, the total input signal of a neuron is determined by the equation [58]:
$$net = \sum_{i=1}^{n} w_i x_i$$
where $x_i$ are the input variables, $w_i$ are the weights, and $n$ is the number of inputs. Each output neuron calculates its activation to form a network response to a given input signal.
The main task of the activation function is to determine whether the summed input signal produces an output signal. This function is associated with the neurons of the hidden layers. The sigmoid function, whose range is (0, 1), is often used to activate neurons and is defined as [58]:
$$f(x) = \frac{1}{1 + e^{-x}}$$
Each output neuron compares its actual value with its target value to calculate the error. The neural network remains in the learning process until it reaches the global minimum error value between the empirical and target data. The mean square error (MSE) is used as the error (loss) [59]:
$$MSE = \frac{1}{n}\sum_{i=1}^{n}\left(Y_i - \hat{Y}_i\right)^2$$
where $\hat{Y}_i$ is the empirically calculated output, $Y_i$ is the target output, and $n$ is the number of samples. Based on this error, an S-factor is calculated, which is used to distribute the error from the output layer back to the hidden and input layers. After all the S-factors have been calculated, the weights for all layers are adjusted at the same time.
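The three backpropagation steps described above (forward pass, error backpropagation, weight adjustment) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the implementation used in the study: it uses plain stochastic gradient descent, and the class name `TinyNet`, the bias terms, the learning rate, the weight initialization range and the toy data are all introduced here for the example only.

```python
import math
import random


def sigmoid(z: float) -> float:
    """Logistic sigmoid with range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


class TinyNet:
    """Illustrative two-layer net: sigmoid hidden layer, one linear output."""

    def __init__(self, n_in: int, n_hidden: int, seed: int = 0) -> None:
        rng = random.Random(seed)
        # Small random initial weights; the range is an assumption of this sketch.
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def forward(self, x):
        """Step 1: feed the input forward; return hidden activations and output."""
        hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
                  for row, b in zip(self.w1, self.b1)]
        output = sum(w * h for w, h in zip(self.w2, hidden)) + self.b2
        return hidden, output

    def train_step(self, x, target, lr: float = 0.1) -> None:
        hidden, output = self.forward(x)
        # Step 2: error term at the linear output, propagated back to hidden units.
        delta_out = output - target
        delta_hidden = [delta_out * w * h * (1.0 - h)
                        for w, h in zip(self.w2, hidden)]
        # Step 3: adjust all weights using the error terms computed above.
        for j, h in enumerate(hidden):
            self.w2[j] -= lr * delta_out * h
        self.b2 -= lr * delta_out
        for j, row in enumerate(self.w1):
            for i, xi in enumerate(x):
                row[i] -= lr * delta_hidden[j] * xi
            self.b1[j] -= lr * delta_hidden[j]


def mean_square_error(net: TinyNet, data) -> float:
    """MSE between the network outputs and the targets."""
    return sum((net.forward(x)[1] - t) ** 2 for x, t in data) / len(data)


# Toy data (not from the study): the target is a linear function of one input.
rng = random.Random(1)
data = []
for _ in range(20):
    x = [rng.random(), rng.random()]
    data.append((x, 0.2 + 0.6 * x[0]))

net = TinyNet(n_in=2, n_hidden=3)
mse_before = mean_square_error(net, data)
for _ in range(500):
    for x, t in data:
        net.train_step(x, t)
mse_after = mean_square_error(net, data)
```

All error terms are computed before any weight is changed, which mirrors the statement that the weights of all layers are adjusted at the same time.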
After completing the network training, the validation stage begins. Validation is critical to the successful training and subsequent use of the network. The goal of this phase is to ensure that the network is capable of generalizing data within the boundaries established during the training phase. If such characteristics are adequate, the model is considered valid [60].
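One common way of using a validation set is early stopping: training is halted once the validation error stops improving, and the weights from the best iteration are kept. The sketch below assumes this scheme for illustration; the function name, the `patience` parameter and the error values are hypothetical.

```python
from typing import Sequence


def best_epoch_with_early_stopping(val_errors: Sequence[float], patience: int) -> int:
    """Return the index of the epoch with the lowest validation error,
    scanning only until `patience` consecutive epochs fail to improve."""
    best_epoch = 0
    best_error = float("inf")
    failures = 0
    for epoch, error in enumerate(val_errors):
        if error < best_error:
            best_epoch, best_error, failures = epoch, error, 0
        else:
            failures += 1
            if failures >= patience:
                break  # validation error kept growing: stop training here
    return best_epoch


# Hypothetical validation errors: they fall, then start to grow.
errors = [0.50, 0.30, 0.20, 0.25, 0.26, 0.30, 0.10]
stop_at = best_epoch_with_early_stopping(errors, patience=3)
```

In this example the scan stops after three consecutive non-improving epochs, so the late value 0.10 is never reached and epoch 2 is selected.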
The final stage of testing is verifying how the network works with data that it has never seen before during its development [61].

Input and Output Data Collection
The analysis of construction projects showed that errors and omissions and, as a result, conflicts and collisions between the engineering systems caused massive rework.
In the analyzed projects, conflicts and collisions between structural elements were often encountered: wall displacement, demolition of beams, lack of technical holes in monolithic reinforced concrete, incorrect placement of stairs, incorrect placement of columns.
The second group is conflicts and collisions between elements of various disciplines: reassignment of electrical networks due to a conflict with a water supply system, reassignment of an air duct due to a conflict with a sewer pipe, door conflicts with plumbing.
The cost impact of each error or omission is highly dependent on the stage of the project at which it is identified. For example, an omission of information in the design documentation (missing numbers, lines, dimensions and elements) that is identified at the design stage leads to low costs. Missing elements or errors in the placement of building structures that are identified during the construction phase lead to very high costs. Figure 2a shows an example of a stair and column placement error that resulted in significant rework costs. Figure 2b shows the correct solution for the stair and column placement.
Table 1 shows the results of the analysis of 18 construction projects, namely planned project costs and rework costs. The costs are indicated in Ukrainian hryvnia (UAH). After processing the received questionnaires, a matrix was formed, which contained 72 rows and 13 columns (12 columns of input communication factors and 1 column of target data on rework costs). The relationship between the predictors and the target outputs is inversely proportional.
The effectiveness of using the model depends on the available data and the quality of their preparation. After collecting the data, a procedure for their normalization was carried out. Normalization is a pre-processing procedure for input data (training, test and validation sets), in which the values are reduced to a certain specified range, for example (0, 1) [62]. In our case, the predictors were normalized by dividing their values by 10, the target values were divided by 100.
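The normalization step can be sketched as follows. The function name and the sample values are hypothetical; the scaling constants (10 for the questionnaire scores, 100 for the rework cost percentage) follow the description above, and the sketch assumes that each row holds 12 scores on the 0 to 10 scale plus one target.

```python
from typing import List, Sequence

N_FACTORS = 12  # number of communication factors used as predictors


def normalized_row(scores: Sequence[int], rework_percent: float) -> List[float]:
    """Build one normalized data row: 12 predictors followed by the target."""
    if len(scores) != N_FACTORS:
        raise ValueError(f"expected {N_FACTORS} scores, got {len(scores)}")
    if any(not 0 <= s <= 10 for s in scores):
        raise ValueError("scores must lie on the 0 to 10 scale")
    predictors = [s / 10 for s in scores]  # questionnaire scores -> [0, 1]
    target = rework_percent / 100          # rework cost in percent -> fraction
    return predictors + [target]


# Hypothetical respondent answers and rework cost, not study data.
row = normalized_row([0, 5, 10, 7, 3, 8, 6, 4, 9, 2, 1, 5], rework_percent=4.5)
```

Each of the 72 respondents would contribute one such 13-element row to the data matrix.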

ANN Creation and Training
The neural network was created using the Neural Network Fitting library in MATLAB R2015b, which can be used to solve approximation and regression problems. The ANN is a two-layer feedforward network with a sigmoid transfer function in the hidden layer and a linear transfer function in the output layer. The number of variables in the input layer is 12, the hidden layer consists of 10 neurons, and the output layer consists of one neuron (Figure 3). The matrix of input variables was split into three datasets: 70% of the data is the training set, 15% is the validation set, and 15% is the test set.
The network was trained, validated and tested using three learning algorithms: Levenberg-Marquardt, Bayesian regularization and scaled conjugate gradient. The Levenberg-Marquardt algorithm is generally the fastest training algorithm for moderately sized networks. The Bayesian regularization algorithm is a mathematical method that transforms a nonlinear regression into a 'well-posed' statistical problem. The scaled conjugate gradient algorithm is the only conjugate gradient algorithm that needs no line search, which makes it well suited to general-purpose training.
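A random 70/15/15 partition of the 72 data rows can be sketched in Python as shown below. This is only an illustration: the function name, the seed and the rounding rule are assumptions, and the exact subset sizes produced by the MATLAB tool may differ slightly.

```python
import random
from typing import List, Tuple


def split_indices(n_rows: int, train: float = 0.70, val: float = 0.15,
                  seed: int = 0) -> Tuple[List[int], List[int], List[int]]:
    """Shuffle row indices and split them into training, validation and test sets."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    n_train = round(train * n_rows)
    n_val = round(val * n_rows)
    # The test set receives whatever remains after training and validation.
    return (indices[:n_train],
            indices[n_train:n_train + n_val],
            indices[n_train + n_val:])


train_idx, val_idx, test_idx = split_indices(72)
# With 72 rows and this rounding rule the subsets have 50, 11 and 11 rows.
```

With so few rows, the test set contains only about eleven samples, so the reported test metrics are sensitive to which rows happen to fall into it.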
The results of the algorithms on the test set are presented in Table 2. The MSE and R values show that the Levenberg-Marquardt algorithm performed best (the lowest mean square error and the highest regression value). The results that follow are reported for the Levenberg-Marquardt algorithm. Regression plots showing the correlation between the network outputs and the targets for the training, validation, and testing sets are shown in Figure 4. The regression coefficient R is a statistical measure of the overall fit of the predictive model; it shows the level of correlation between the network outputs and the targets. R values are in the range (0,1). The closer the value is to 1, the higher the correlation between the data. The regression value for all datasets is 0.96395, which is acceptable for this task.
The network performance is shown in Figure 5. Network training stopped at the 7th iteration, after the validation error began to grow. In this case, the final validation MSE is insignificant (0.00014754). The test and validation set errors have similar characteristics.
One of the main ways to compare the performance of ANN models is to calculate various indicators of the error function (mean square error, MSE; root mean square error, RMSE; mean absolute percentage error, MAPE) and the coefficients of regression (R) and determination (R2). Next, we compared our results with some previous studies. Alaloul et al. [52] calculated the value of MSE to be 0.0231 and R2 to be 0.77. Tijanić, Car-Pušić and Špera [51] obtained a MAPE of 0.13 and an R2 of 0.9595. Palaneeswaran et al. [54] calculated the value of RMSE to be 0.3301 and R2 to be 0.8974. Comparing these results with those obtained by us (MSE = 0.004084; R = 0.90097), we can conclude that the created neural network can be successfully used to solve the problem.
The next step is to export the ANN to the MATLAB workspace, where it can be applied to new sets of input variables to predict the possible level of rework costs in construction projects.
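For reference, the performance indicators named above can be computed as in the following Python sketch. The function names and sample values are hypothetical, the standard textbook definitions are assumed, and MAPE is returned in percent, which may differ from the convention of the cited studies.

```python
import math
from typing import Sequence


def mse(targets: Sequence[float], outputs: Sequence[float]) -> float:
    """Mean square error."""
    return sum((t - o) ** 2 for t, o in zip(targets, outputs)) / len(targets)


def rmse(targets: Sequence[float], outputs: Sequence[float]) -> float:
    """Root mean square error."""
    return math.sqrt(mse(targets, outputs))


def mape(targets: Sequence[float], outputs: Sequence[float]) -> float:
    """Mean absolute percentage error in percent; targets must be non-zero."""
    return 100.0 * sum(abs((t - o) / t) for t, o in zip(targets, outputs)) / len(targets)


def pearson_r(targets: Sequence[float], outputs: Sequence[float]) -> float:
    """Correlation coefficient R between targets and network outputs."""
    n = len(targets)
    mean_t = sum(targets) / n
    mean_o = sum(outputs) / n
    cov = sum((t - mean_t) * (o - mean_o) for t, o in zip(targets, outputs))
    var_t = sum((t - mean_t) ** 2 for t in targets)
    var_o = sum((o - mean_o) ** 2 for o in outputs)
    return cov / math.sqrt(var_t * var_o)


def r_squared(targets: Sequence[float], outputs: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_t = sum(targets) / len(targets)
    ss_res = sum((t - o) ** 2 for t, o in zip(targets, outputs))
    ss_tot = sum((t - mean_t) ** 2 for t in targets)
    return 1.0 - ss_res / ss_tot


# Hypothetical targets and predictions, not study data.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.1, 1.9, 3.2, 3.8]
metrics = {
    "MSE": mse(y_true, y_pred),
    "RMSE": rmse(y_true, y_pred),
    "MAPE": mape(y_true, y_pred),
    "R": pearson_r(y_true, y_pred),
    "R2": r_squared(y_true, y_pred),
}
```

These indicators are not directly interchangeable: MSE and RMSE depend on the scale of the target variable, so comparisons across studies with differently scaled targets should be read with caution.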
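Applying a trained model to a new project could look like the sketch below. The function names are hypothetical and `placeholder_model` is a stand-in formula, not the trained network; the sketch only shows the scaling of the inputs before prediction and the rescaling of the output to a percentage, consistent with the normalization described earlier.

```python
from typing import Callable, Sequence


def predict_rework_percent(model: Callable[[Sequence[float]], float],
                           scores: Sequence[int]) -> float:
    """Scale 12 questionnaire scores (0 to 10), run the model, return percent."""
    if len(scores) != 12:
        raise ValueError("expected scores for 12 communication factors")
    normalized = [s / 10 for s in scores]  # same scaling as the training data
    return 100.0 * model(normalized)       # undo the division of targets by 100


def placeholder_model(inputs: Sequence[float]) -> float:
    """Stand-in for the trained network: higher scores give lower rework cost."""
    return 0.1 * (1.0 - sum(inputs) / len(inputs))


# Hypothetical assessment in which every factor is scored 5.
predicted = predict_rework_percent(placeholder_model, [5] * 12)
```

Any callable that maps 12 normalized inputs to one normalized output can be passed as `model`, so the same wrapper would work for an exported network.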
The presented approach, however, has some limitations. Neural networks learn from input variables, so the quality and quantity of data affects the estimation error. This study used a relatively small amount of input data, which is associated with the complexity and high labor costs of estimating the rework cost. In the future, it is planned to expand the knowledge base of rework costs in construction projects, which will increase the number of predictors. The next important limitation is that the authors used one type of neural network architecture (two-layer feedforward network). The network was trained and tested with three algorithms, but according to the recommendations, it is worth trying several types of neural network architectures to get the most accurate prediction [51]. The authors will take into account the above limitations in their future work.
In addition, the authors understand and realize that project communication is an important factor influencing the rework cost, but still not the only one. The aim of this study was to develop ANN to assess and predict the impact of communication factors on rework costs in construction projects.
The results presented in this section give an idea of the structure of the factors affecting communications and the rework cost in construction projects. The data obtained can be used by stakeholders to manage the project communication system.

Conclusions
The construction sector consumes a large share of natural resources and energy. This consumption occurs not only during the operation of a facility but also during its construction.
When work already completed requires rework, resources and energy are consumed again. The purpose of this article was to create an ANN for assessing and predicting the impact of communication factors on rework costs in construction projects. During the data collection phase, 12 factors that influence communication were identified and assessed. The level of rework costs in 18 construction projects was also calculated. We used an ANN that is a two-layer feedforward network with a sigmoid transfer function in the hidden layer and a linear transfer function in the output layer. The input layer consists of 12 neurons, the hidden layer of 10 neurons, and the output layer of one neuron. The network was trained, validated and tested using three learning algorithms.
The best mean square error and correlation results were obtained with the Levenberg-Marquardt algorithm. The proposed model can be used by project management as a decision support tool. For example, once data on the communication factors have been collected at the early project stages, the proposed model can predict the level of rework costs.
Future research will focus on increasing the number of input variables by creating a larger knowledge base of rework costs, as well as on applying the created ANN to new sets of input variables to predict the possible level of rework costs in construction projects. In addition, we plan to test neural networks with different architectures, activation functions and loss function optimization methods.