Project Cost Overrun Risk Prediction Using Hidden Markov Chain Analysis

Construction project cost overrun is a common problem in the construction industry. The cost of construction projects is thought to have increased by approximately 33% on average. Several studies on construction project cost overrun have been conducted, and these generally rely on historical data. However, because each project has its own characteristics and cost trend, real-time project cost data are more reliable for forecasting that project's cost trend. This paper proposes a real-time hidden Markov chain (HMM) model to predict cost overrun risk based on project-owned cost performance data and the corrective actions, if adopted. The occurrence of cost overrun events in this model was assumed to follow a Poisson arrival pattern. A real-time HMM with a particle filter was used to run the simulation. One SRC building project in Taiwan was used for model validation and comparison. The posterior probabilities from the real-time HMM model were highly consistent with the cost overrun ratios of real construction projects. The proposed cost overrun prediction model could provide an early alert of cost overruns to the project manager. Based on the survey of cost overrun risk and significantly influential factors, we propose effective cost management plans to alleviate the frequency of project cost overrun.


Introduction
One common problem in the construction industry is project cost overrun. The cost of construction projects is thought to have increased by approximately 33% on average [1]. In the construction project cost domain, many studies have focused on developing methodologies that incorporate the effects of uncertainty on project cost overruns. Most of them rely heavily on historical data. Nevertheless, each project may have its own characteristics and cost trend. Historical data are best used as prior information at the beginning of the project. During project operation, real-time project cost data are more reliable for forecasting the project's own cost tendency. As explored in past research, one of the most important requirements of a cost system is to give a trustworthy warning of cost overruns as early as possible [2,3].
Many cost prediction techniques have been explored in the construction industry, such as regression, simulation, artificial neural network (ANN), and fuzzy sets [4-7]. The main difference is the input data for training and testing; i.e., whether the historical data are from the industry or the project itself. Additionally, in practice, the project manager needs to assess the effect of the corrective actions, if they are adopted, to minimize the expected variances from planned performance. It is more productive for a cost overrun prediction model to take the effect of corrective action into consideration.
Many previous studies have developed macro-level prediction models which need a lot of historical construction project data or questionnaire data as input for model construction. Few studies support the assessment of cost overrun based upon real-time project-owned data (micro-level). Additionally, the previous models seldom consider the effect of corrective action.

Literature Review
The prediction models can be generally classified into two categories: the causal model and the time-sequential model. The causal prediction model must collect, compare, and summarize the common significant causes to build the model. The time-sequential model mainly relies on the historical data of surveyed targets to build the model, such as EVM extrapolation. In previous studies, various statistical and artificial intelligence methods and tools have been used to solve the problem of predicting construction costs and cost overruns in construction projects, such as regressions, neural networks, machine learning, fuzzy logic, Bayesian networks, simulation, etc. [4,6-14]. These previous studies have mainly focused on the macro level for the overall assessment (e.g., early budget estimates). Using the comprehensive evaluation, macro-level factors are generally defined for the model construction. They can be project scope, project size, project duration, etc.
As stated above, because each project has its own characteristics and cost trend, real-time project-owned cost data are more reliable for forecasting that project's cost trend. This study plans to develop a prediction model based only on the project's own cost data. In addition to conventional EVM extrapolation methods (e.g., linear, exponential, and trend extrapolation), some deterministic and stochastic models were proposed in the past. Chen et al. [15] proposed a straightforward modeling method for improving the predictive power of the planned value (PV) so that the earned value (EV) and actual cost value (AC) could also be correspondingly improved. Acebes et al. [16] drew upon Monte Carlo simulation to obtain information about the expected behavior of the project and then used statistical learning methodologies to detect the project deviations. Sackey et al. [17] adopted linear regression and time series to predict duration at completion based on the actual time spent on each activity. Zhao and Zi [18] applied the exponential smoothing technique to forecast project costs in EVM. Yu et al. [19] proposed an active construction dynamic schedule management model based on fuzzy earned value management and a BP neural network to predict project duration under risk.
Based upon the survey mentioned above, the limitations of previous research are summarized as follows: (1) many previous researchers have developed macro-level models which require a lot of historical construction project data or questionnaire data as input for model construction, and few studies support the assessment of cost overrun based upon real-time project-owned data (micro-level); and (2) previous models have seldom considered the effect of corrective action. This research attempted to develop a prediction model that estimates the cost overrun probability from the project-owned cost performance data and the corrective action, if adopted, during the project execution. Furthermore, based upon the common definition of cost overrun factors for a project, the potential cause combinations with a high possibility of affecting the project cost overrun were supplementarily surveyed using sensitivity analysis inside HMM inference.
As stated above, the model proposed in this paper focused on the construction of the time-sequential prediction model using HMM with a particle filter approach. For the further identification of the potential causes and combinations that affect the project cost overruns using sensitivity analysis inside HMM inference, this paper surveyed the common classification of cost overrun factors. This classification definition was used to consistently categorize project-specific cost overrun causes. The classification of cost overrun factors varies with the research focuses and purposes [1,20-28]. For the overall assessment, macro-level factors are generally defined for the model construction. They can be project scope, project size, project duration, etc. As discussed above, it may be more reliable to adopt project-owned cost performance data to estimate and control project cost overrun during project execution. These factors belong to the project-specific level (micro level); i.e., they are generally assessed stepwise and recorded in the project cost reports based on the cost performance outcome and the corresponding influence factors. Yeo [27] claimed that scope and quantity increases, engineering and design changes, underestimation, and unforeseen conditions could cause cost overrun risks. Elinwa and Buba [1] summarized three influence factors of cost overrun: the cost of materials, management practices, and fluctuation in material prices. Based upon the study of Kaming et al. [29], inflationary increases in material cost, inaccurate material estimation, and project complexity were the three main cost overrun factors. Dissanayaka and Kumaraswamy [21] indicated the cost overrun factors to be the construction team, risk retained by a client, project complexity, and payment modality. Wang and Demsetz [25,26] summarized five significant cost overrun factors: approval delay, weather, material delivery, labor, and equipment. In the study of Elhag et al. [22], several external and internal cost overrun factors were summarized as client characteristics, consultant and design parameters, contractor attributes, project characteristics, contract procedures, and procurement methods, as well as external factors and market conditions. Aljohani et al. [30] intensively surveyed the causes of construction project overrun based on a literature review and summarized 173 causes of cost overrun in seventeen internal and external frameworks. Xie et al. [28] surveyed the critical influence factors in construction projects using fuzzy synthetic evaluation. There were 65 critical factors covered in the research, which were classified into four categories: project macro, project management, project environment, and core stakeholder.
The cost overrun factors clearly differed across the aforementioned studies. This research attempts to forecast cost overrun probability based on the project-owned cost performance data and corrective action, if adopted. By unifying the factors proposed in the previous studies, these attributes were re-classified based on their common characteristics. Five significant project-specific classification factors were defined: weather, productivity, material, equipment, and management. The project cost tends to overrun if these factors are in poor status during the construction project execution. The real-time status of these factors can be summarized and surveyed from the project reports and checklists. Based upon the performance data and the corrective actions input to the model, the cost overrun risk can be assessed in time and the effect of the corrective action can also be surveyed. Accordingly, the potential cause combinations with a high possibility of affecting the project cost overrun can be supplementarily surveyed using sensitivity analysis. Based on the survey of cost overrun risk and influence factors with high possibility, the project management division can establish proper and effective cost-risk treatment plans in a timely manner.

Real-Time Cost Overrun Prediction Method
To achieve the aforementioned objective, we propose a real-time HMM method to forecast the cost overrun probability based on the cost performance data and the adopted corrective actions. In the model, the Poisson process was used to simulate cost overrun occurrence events with unknown arrival rates and impacts. The effect of corrective action was also unknown and was defined as an unknown modeling parameter. An HMM algorithm using a particle filter was proposed to learn the unknown parameters and update the cost overrun probability in real time. The overall analysis process of the proposed model is illustrated in Figure 1. It is mainly composed of a Poisson cost overrun model and a real-time Bayesian updating model. Both are explained in detail below.

Poisson Cost Overrun Model
The proposed Poisson cost overrun model consists of three modules: (1) cost overrun events occurrence module; (2) corrective action module; and (3) cost status assessment module. They are described in detail in the following:

Cost Overrun Events Occurrence Module
This study modeled the cost overrun events over the project operation lifecycle as a stochastic process. Cost overruns can be regarded as discrete rare events, compared with regular cost conditions [9,31]. This paper followed Touran [9] in assuming a Poisson arrival pattern and independent random variables for the cost overrun events. A cost overrun event is described as a random event following a Poisson process with a mean rate of occurrence equal to µ per unit of time, and each occurrence contributes a cost overrun amount equal to λ. In most cases, λ and µ are unknown.
The discrete-time index k is defined to represent the time. Let $X_k$ be the accumulated number of cost overrun events at the discrete time k; therefore,
$$X_{k+1} = X_k + V_k, \quad k = 0, 1, \ldots, T-1 \tag{1}$$
where T means the total discrete-time duration of interest, and $V_k$ follows a Poisson distribution with a mean value $\mu \Delta t$, i.e.,
$$P(V_k = v) = \frac{(\mu \Delta t)^{v} e^{-\mu \Delta t}}{v!}, \quad v = 0, 1, 2, \ldots \tag{2}$$
Assume that there is no cost overrun at the beginning of the project; i.e., $X_0 = 0$. Since the Poisson process is memoryless, $V_0, V_1, \ldots, V_{T-1}$ are independent and identically distributed, so $X_0, X_1, \ldots, X_T$ form a Markov chain. The actual accumulated amount of cost overrun at time instant k is $\lambda X_k$, and the cost overrun probability at time k is $P(\lambda X_k > 1)$. If a previously known event $X_n$ (n < k) takes the value $x_n$, the cost overrun probability at time k is
$$P(\lambda X_k > 1 \mid X_n = x_n) = P\left(X_k - x_n > \frac{1}{\lambda} - x_n\right) = 1 - \sum_{i=0}^{\lceil 1/\lambda - x_n \rceil - 1} \frac{\left[(k-n)\mu \Delta t\right]^{i} e^{-(k-n)\mu \Delta t}}{i!} \tag{3}$$
where $\lceil \cdot \rceil$ represents the smallest integer greater than the enclosed real number. Here, we use the fact that $X_k - x_n$ follows a Poisson distribution with a mean equal to $(k-n)\mu \Delta t$.
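As a minimal sketch of the cost overrun probability described above, the following Python snippet evaluates P(λX_k > 1 | X_n = x_n) from the Poisson distribution of X_k − x_n. The function names and argument layout are illustrative assumptions, not part of the original model description. Because the event count is an integer, the strict inequality is implemented by summing the Poisson probabilities up to the floor of 1/λ − x_n.

```python
import math


def poisson_cdf(m: int, mean: float) -> float:
    """Return P(N <= m) for N ~ Poisson(mean); 0.0 when m is negative."""
    if m < 0:
        return 0.0
    term = math.exp(-mean)      # probability of zero events
    total = term
    for i in range(1, m + 1):
        term *= mean / i        # pmf(i) = pmf(i - 1) * mean / i
        total += term
    return min(total, 1.0)


def overrun_probability(lam, mu, dt, k, n=0, x_n=0):
    """P(lam * X_k > 1 | X_n = x_n), with X_k - x_n ~ Poisson((k - n) * mu * dt)."""
    threshold = 1.0 / lam - x_n         # an overrun needs X_k - x_n > threshold
    mean = (k - n) * mu * dt
    return 1.0 - poisson_cdf(math.floor(threshold), mean)
```

For instance, with λ = 0.5, µ = 1, and ∆t = 1, an overrun at k = 1 requires at least three events within one step, so the probability equals 1 − 2.5e^(−1), roughly 0.08.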

Corrective Action Module
The effect of corrective actions was further defined in the model to overcome the limitation of previous research, in which corrective action was not covered or assessed. It is assumed that if the cost is overrun at time instant m, corrective action needs to be taken at that time. In practice, it is hoped that the actual cost (AC) returns to the planned value (PV) after the corrective action is applied. However, because the performance improvement may fall short, the project AC may not return to PV even when the corrective action is taken. It is therefore reasonable to assume that, if the corrective action is taken at time instant m, $\lambda X_m$ will be set equal to a random number ν ∈ [0, 1] (i.e., $X_m = \nu/\lambda$).

Cost Status Assessment Module
The judgment of cost status may be affected by noisy information, such as incomplete progress data, subjective experience, etc. It is therefore necessary to define an assessment random variable at time instant i to judge whether the cost is overrun; i.e., to determine whether $\lambda X_i$ is greater than 1 or not. Given λ and $X_i = x_i$, the probability of cost overrun is given by Equation (4). It is assumed that the cost overrun status assessment data are $D_{1:k} = \{\hat{Y}_i : i = 1, \ldots, k\}$, where $\hat{Y}_i$ is the noisy assessment at time i: 1 means the cost is identified as overrun, −1 means the cost is identified as underrun, and 0 means the cost is identified as in budget. The variable α is an unknown parameter that characterizes the degree of uncertainty of the cost status assessment outcome. A large α means a more accurate assessment, and a small α represents a poor and noisy assessment.

Real-Time Bayesian Updating Model
As discussed above, the model parameters λ, µ, α, and ν are usually unknown. It is essential to determine their most probable values based on the actual performance data from the project report. This paper utilizes a Bayesian updating approach to estimate the probability distribution of the parameters from the project performance data. The overall data sampling process of the real-time Bayesian updating model based on the particle filter is depicted in Figure 2.

Real-Time Estimation and Prediction Algorithms
Given the assessment data $D_{1:k}$, samples of (λ, µ, α, ν, $X_k$) can be drawn from $f(\lambda, \mu, \alpha, \nu, x_k \mid D_{1:k})$ by the stochastic simulation methods discussed below. Let those samples be denoted by $\{\lambda^{(j)}, \mu^{(j)}, \alpha^{(j)}, \nu^{(j)}, X_k^{(j)} : j = 1, \ldots, N\}$, where N is the total number of samples. Once the initial samples are appropriately obtained, the real-time estimation algorithms are derived as follows.

Real-Time Cost Overrun Probability
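According to the Law of Large Numbers, the real-time cost overrun probability can be estimated as
$$P(\lambda X_k > 1 \mid D_{1:k}) \approx \frac{1}{N} \sum_{j=1}^{N} \mathbf{1}\left[\lambda^{(j)} X_k^{(j)} > 1\right] \tag{5}$$
where $\mathbf{1}[\cdot]$ is the indicator function and $\{\lambda^{(j)}, X_k^{(j)} : j = 1, \ldots, N\}$ are the samples drawn from $f(\lambda, \mu, \alpha, \nu, x_k \mid D_{1:k})$.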
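The sample-average estimate can be sketched in a few lines of Python. This is only an illustration under the assumption that each posterior sample is available as a pair (λ, X_k); the function name and the demonstration values are hypothetical and are not taken from the project data.

```python
from typing import Iterable, Tuple


def realtime_overrun_probability(samples: Iterable[Tuple[float, float]]) -> float:
    """Fraction of samples (lam, x_k) whose accumulated overrun lam * x_k exceeds 1."""
    samples = list(samples)
    if not samples:
        raise ValueError("at least one sample is required")
    hits = sum(1 for lam, x_k in samples if lam * x_k > 1.0)
    return hits / len(samples)


# Hypothetical posterior samples (lam, x_k), for illustration only.
demo = [(0.5, 3), (0.5, 1), (0.2, 6), (0.1, 4)]
print(realtime_overrun_probability(demo))  # 0.5
```

Two of the four illustrative samples satisfy λX_k > 1, so the estimate is 0.5. With N = 5000 samples the same average is taken over all particles.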

Future Cost Overrun Probability
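Moreover, if k > T, $P(\lambda X_k > 1 \mid D_{1:T})$ stands for the cost overrun probability at future time k given past data $D_{1:T}$. Based on the Law of Large Numbers,
$$P(\lambda X_k > 1 \mid D_{1:T}) \approx \frac{1}{N} \sum_{j=1}^{N} P\left(\lambda^{(j)} X_k > 1 \mid \lambda^{(j)}, \mu^{(j)}, X_T = X_T^{(j)}\right) \tag{6}$$
where $\{\lambda^{(j)}, \mu^{(j)}, X_T^{(j)} : j = 1, \ldots, N\}$ are samples from $f(\lambda, \mu, x_T \mid D_{1:T})$, each summand is evaluated from the Poisson distribution of $X_k - X_T$ with mean $(k-T)\mu^{(j)} \Delta t$, and the approximation holds under the condition that, conditional on $X_T$, $D_{1:T}$ and $X_k$ are independent.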
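A hedged sketch of the future-probability estimate follows: for every posterior sample, the Poisson tail probability of crossing the overrun threshold between T and k is computed, and the results are averaged. The helper names and the representation of samples as (λ, µ, X_T) triples are assumptions made for illustration.

```python
import math


def poisson_tail(threshold: float, mean: float) -> float:
    """Return P(N > threshold) for N ~ Poisson(mean) and a real-valued threshold."""
    m = math.floor(threshold)
    if m < 0:
        return 1.0                      # threshold already exceeded
    term = math.exp(-mean)
    cdf = term
    for i in range(1, m + 1):
        term *= mean / i
        cdf += term
    return max(0.0, 1.0 - cdf)


def future_overrun_probability(samples, k, T, dt):
    """Average over samples (lam, mu, x_T) of P(lam * X_k > 1 | lam, mu, X_T = x_T)."""
    total = 0.0
    for lam, mu, x_T in samples:
        total += poisson_tail(1.0 / lam - x_T, (k - T) * mu * dt)
    return total / len(samples)
```

When k equals T, no new events can arrive, so the estimate reduces to the fraction of samples that are already in overrun.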

Simulation Sampling
Drawing samples from $f(\lambda, \mu, \alpha, \nu, x_k \mid D_{1:k})$ is a prerequisite for computing all of the estimates above. It is therefore vital to find a real-time sample drawing mechanism, i.e., one in which the samples at the current time step can be acquired from those of the preceding step alone, with no reference to the results from earlier time steps. The real-time Bayesian updating algorithm utilizes a particle filter approach.

Model Revisited
Before the brief introduction of the particle filter algorithm, the model below is defined following Equation (1).
$$\begin{aligned} X_{k+1} &= X_k + V_k, \quad V_k \sim \mathrm{Poisson}(\mu_k \Delta t), \quad X_0 = 0 \\ \lambda_{k+1} &= \lambda_k, \quad \lambda_0 \sim f(\lambda_0) \\ \mu_{k+1} &= \mu_k, \quad \mu_0 \sim f(\mu_0) \\ \alpha_{k+1} &= \alpha_k, \quad \alpha_0 \sim f(\alpha_0) \\ \nu_{k+1} &= \nu_k, \quad \nu_0 \sim f(\nu_0) \end{aligned} \tag{7}$$
where $X_k$, $\lambda_k$, $\mu_k$, $\alpha_k$, and $\nu_k$ are the model "state variables". This model explicitly states the prior probability density functions (PDFs) for the uncertain variables λ, µ, α, and ν. Note that the values of the parameters $\lambda_k$, $\mu_k$, $\alpha_k$, and $\nu_k$ remain fixed over time. The model in Equation (7) depicts the evolution of the actual model state without the consideration of corrective actions. If a corrective action is conducted at time m, $X_m$ will be readjusted to $\nu_m/\lambda_m$, where $\nu_m = 0$ if AC returns to PV when the corrective action is taken and $\nu_m > 0$ if AC does not return to PV even under the corrective action. Notice that, although Equation (7) describes the state evolution, the real values of the state are undetermined since $\lambda_0$, $\mu_0$, $\alpha_0$, and $\nu_0$ are uncertain and $\{V_k : k = 0, \ldots, T-1\}$ are also uncertain.
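To make the state evolution concrete, the sketch below simulates one path of the accumulated event count with Poisson increments and resets it to ν/λ whenever λX exceeds 1. It is a minimal illustration, not the authors' implementation: the function names are hypothetical, the Poisson sampler is Knuth's multiplication method (chosen only to avoid external libraries), and the reset rule follows the ν/λ form stated with the state model.

```python
import math
import random


def sample_poisson(mean: float, rng: random.Random) -> int:
    """Draw a Poisson variate with Knuth's multiplication method (fine for small means)."""
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def simulate_path(lam, mu, nu, dt, T, rng):
    """Return (x_path, overrun_flags); x_path[k] is X_k after any corrective action."""
    x = 0.0                                   # X_0 = 0: no overrun at project start
    x_path, flags = [x], []
    for _ in range(T):
        x += sample_poisson(mu * dt, rng)     # X_{k+1} = X_k + V_k
        overrun = lam * x > 1.0
        flags.append(overrun)
        if overrun:
            x = nu / lam                      # corrective action: lam * X is reset to nu
        x_path.append(x)
    return x_path, flags


path, flags = simulate_path(lam=0.3, mu=0.4, nu=0.5, dt=1.0, T=50, rng=random.Random(0))
```

Because ν lies in [0, 1], every stored value satisfies λX ≤ 1 after the corrective action, which mirrors the assumption that an overrun is acted on immediately.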

Particle Filter Approach and Process
The values of $X_k$, $\lambda_k$, $\mu_k$, $\alpha_k$, and $\nu_k$ based on Equation (7) are further simulated from $f(\lambda, \mu, \alpha, \nu, x_k \mid D_{1:k})$ with the incorporation of the real-time assessment data $D_{1:k}$ using a particle filter algorithm. To simplify notation, the state at time k is defined as $Z_k$, i.e., $Z_k = \{\lambda_k, \mu_k, \alpha_k, \nu_k, X_k\}$. The simulation process using the particle filter algorithm is explained as follows.
Given samples $\{Z_k^{(j)} : j = 1, \ldots, N\}$ distributed as $f(z_k \mid D_{1:k})$ and the new assessment datum $\hat{Y}_{k+1}$, samples $\{Z_{k+1}^{(j)} : j = 1, \ldots, N\}$ distributed as $f(z_{k+1} \mid D_{1:k+1})$ can be obtained without referring to the results of earlier time instants. Once the initial samples are drawn from $f(z_0 \mid D_{1:0})$, it is straightforward to sample from $f(z_k \mid D_{1:k})$ at any time instant k using a particle filter algorithm. Note that the initial PDF $f(z_0 \mid D_{1:0})$ is simply the prior $f(\lambda_0, \mu_0, \alpha_0, \nu_0, x_0)$, which can be easily sampled. The subsequent steps of the particle filter algorithm are presented as pseudocode right after the derivations.
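Let $\{Z_k^{(j)} : j = 1, \ldots, N\}$ be the samples from $f(z_k \mid D_{1:k})$. By the Law of Large Numbers, $f(z_k \mid D_{1:k})$ can be approximated as:
$$f(z_k \mid D_{1:k}) \approx \frac{1}{N} \sum_{j=1}^{N} \delta\left(z_k - Z_k^{(j)}\right) \tag{8}$$
where δ is the Dirac delta function. According to Bayes' rule:
$$f(z_{k+1} \mid D_{1:k+1}) \propto f(\hat{Y}_{k+1} \mid z_{k+1}) \int f(z_{k+1} \mid z_k)\, f(z_k \mid D_{1:k})\, dz_k \approx \frac{1}{N} \sum_{j=1}^{N} f(\hat{Y}_{k+1} \mid z_{k+1})\, f\left(z_{k+1} \mid Z_k^{(j)}\right) \tag{9}$$
where the derivation assumes that, conditional on $Z_k$, $D_{1:k}$ and $Z_{k+1}$ are independent, and that, conditional on $Z_{k+1}$, $D_{1:k}$ and $\hat{Y}_{k+1}$ are also independent.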
Drawing samples from the mixture of N PDFs, each proportional to $f(\hat{Y}_{k+1} \mid z_{k+1})\, f(z_{k+1} \mid Z_k^{(j)})$ in Equation (9), is akin to drawing samples from $f(z_{k+1} \mid D_{1:k+1})$. There are several ways to draw N samples from the mixture PDF, and a simple one is sample-importance resampling (SIR), which is explained as follows. Given a previous sample $Z_k^{(j)}$, a candidate $Z_{k+1}^{C(j)}$ (C stands for the candidate) is first drawn from $f(z_{k+1} \mid Z_k^{(j)})$ for j = 1, ..., N. The above-mentioned drawing process is conducted under the condition of cost underrun. In this case, there is a cost overrun or in budget at time k + 1, i.e., $\hat{Y}_{k+1} = -1$, as these candidates are not distributed as $f(z_{k+1} \mid D_{1:k+1})$ since these samples need to include the new information $\hat{Y}_{k+1}$. The importance weights $\{w_{k+1}^{(j)} : j = 1, \ldots, N\}$ are therefore attached to the candidates:
$$w_{k+1}^{(j)} \propto f\left(\hat{Y}_{k+1} \mid Z_{k+1}^{C(j)}\right) \tag{11}$$
where $w_{k+1}^{(j)}$ reflects the relative plausibility of candidate $Z_{k+1}^{C(j)}$ given the new information $\hat{Y}_{k+1}$, with the likelihood taken from the assessment model. Once the weights are obtained, the samples of $f(z_{k+1} \mid D_{1:k+1})$, denoted by $\{Z_{k+1}^{(j)} : j = 1, \ldots, N\}$, can be obtained by resampling $\{Z_{k+1}^{C(j)} : j = 1, \ldots, N\}$ according to their weights $w_{k+1}^{(j)}$ for j = 1, ..., N. If a corrective action is taken and AC is assumed to return to PV, $X_{k+1}^{(j)}$ is set to 0 for j = 1, ..., N. The simulated samples $\{Z_k^{(j)} : j = 1, \ldots, N\}$ are distributed as $f(z_k \mid D_{1:k})$. The λ, µ, α, ν parts of the samples are distributed as $f(\lambda, \mu, \alpha, \nu \mid D_{1:k})$, and the $X_k$ parts of the samples are distributed as $f(x_k \mid D_{1:k})$. These samples can be further combined to estimate the cost overrun probability at every time instant in real time.
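One SIR update can be sketched as follows. This is an illustrative sketch under stated assumptions rather than the authors' algorithm: particles are stored as dictionaries with hypothetical keys, the Poisson sampler is Knuth's method, the corrective action resets X to ν/λ, and, most importantly, the assessment likelihood is an ASSUMED logistic function of α(λX − 1) that treats any assessment other than 1 as "not overrun". The exact likelihood used in the paper is not reproduced here.

```python
import math
import random


def sample_poisson(mean, rng):
    """Knuth's multiplication method; adequate for the small means used here."""
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def overrun_likelihood(y, particle):
    """ASSUMED logistic assessment model: P(y = 1) = sigmoid(alpha * (lam * x - 1))."""
    z = particle["alpha"] * (particle["lam"] * particle["x"] - 1.0)
    z = max(-700.0, min(700.0, z))            # keep math.exp within floating range
    p_overrun = 1.0 / (1.0 + math.exp(-z))
    return p_overrun if y == 1 else 1.0 - p_overrun


def sir_step(particles, y_new, dt, rng, corrective_action=False):
    """Propagate, weight by the new assessment, resample, then apply any correction."""
    # 1) Candidates: parameters stay fixed, X advances by a Poisson increment.
    candidates = [dict(p, x=p["x"] + sample_poisson(p["mu"] * dt, rng)) for p in particles]
    # 2) Importance weights from the (assumed) likelihood of the new assessment.
    weights = [overrun_likelihood(y_new, c) for c in candidates]
    if sum(weights) <= 0.0:
        weights = [1.0] * len(candidates)     # degenerate case: fall back to uniform
    # 3) Resample N candidates in proportion to their weights.
    drawn = rng.choices(candidates, weights=weights, k=len(candidates))
    resampled = [dict(c) for c in drawn]
    # 4) Corrective action resets the accumulated overrun of every particle.
    if corrective_action:
        for p in resampled:
            p["x"] = p["nu"] / p["lam"]
    return resampled
```

Resampling with replacement keeps the particle count at N while concentrating particles on parameter values that explain the latest assessment, which is why only the samples from the preceding step are needed at each update.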

Parameter Tuning
Before the validation against a real project case, the parameters of the real-time HMM cost overrun prediction model were tuned against a simulated example taken from Barraza et al. [32]. The simulated example is a bridge construction project consisting of a prestressed concrete girder bridge of three 30 m spans and a cast-in-situ deck, supported on two river piers and two abutments on level banks. The planned cost and the duration of the bridge project activities are presented in Table 1. The project duration and the budget were 289 days and USD 632,669, respectively.
The time interval ∆t is ten months and the assessment basis is taken as monthly. The actual evolution of $\{X_k : k = 0, \ldots, T\}$ is simulated according to Equation (7), and the assessment result $\hat{Y}_k$ at the k-th month is simulated according to Equation (4), where λ, µ, and α are prescribed real numbers. If a cost overrun is reported from the assessment in the k-th month, the project manager will immediately take corrective action, i.e., $X_k$ will be set to (1 − ν)/λ right after the assessment, where ν is a prescribed real number based upon the corrective action.
The assessment and the validation were conducted as follows. First, a blind examination was conducted; i.e., no prior knowledge of λ, µ, α, ν, and $\{X_k : k = 0, \ldots, T\}$ was assumed. The assessment result is denoted as $\hat{Y}_i$ and its value is defined as 1 for cost overrun, 0 for in budget, and −1 for under budget at time instant i. Because there is no prior information about the parameter λ, its prior PDF is assumed to be uniform over a relatively broad interval [0.0001, 1], and that of the parameter α is assumed to be uniform over [10, 31]. The prior PDF for ν is defined as uniform over [0.1, 1], and the prior PDF for µ as uniform over [0.001, 0.5]. The number of simulation samples N is defined as 5000.
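The initial particle set implied by these priors can be sketched as below. The interval bounds and N = 5000 are taken from the text; the dictionary layout, key names, and function name are assumptions made for illustration.

```python
import random

# Uniform prior bounds from the blind examination setup.
PRIOR_BOUNDS = {
    "lam": (0.0001, 1.0),
    "mu": (0.001, 0.5),
    "alpha": (10.0, 31.0),
    "nu": (0.1, 1.0),
}


def draw_prior_particles(n, rng):
    """Draw n initial particles; X_0 = 0 because no overrun exists at project start."""
    particles = []
    for _ in range(n):
        particle = {name: rng.uniform(low, high) for name, (low, high) in PRIOR_BOUNDS.items()}
        particle["x"] = 0.0
        particles.append(particle)
    return particles


initial_particles = draw_prior_particles(5000, random.Random(0))
```

These particles represent the state at k = 0 and are then updated one assessment at a time by the particle filter.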
Figure 3 shows the real-time samples of the unknown parameters, i.e., the samples drawn from $f(\lambda, \mu, \alpha, \nu \mid D_{1:k})$ for k = 0, 50, 100, and 200 for all factors. These ranges reflect the updated values of the unknown parameters. As shown in Figure 3, the {λ, µ} samples evolve with time and finally cluster around their actual values for both parameters, while the {α, ν} parameters seem unidentifiable from the assessment data. To compare the result from Barraza et al. [32] with our model on the same basis, the cost overrun probabilities in Barraza et al. [32] were computed from a normal distribution whose overrun mean and standard deviation were estimated from historical data.

Table 2 shows the comparison of the cost overrun probabilities between Barraza et al. [32] and our model. The cost overrun probabilities in Barraza et al. [32] tend to increase as the project duration becomes longer. In contrast, the cost overrun probabilities from our model fell after the cost overrun events were recorded. The basic reason is that the proposed model takes corrective action into consideration. Once the cost overrun was indicated (e.g., at day 50 of the project), the corrective action would be taken in practice. The cost overrun probability fell from day 50 to 100. The project cost control became poorer from day 50 to 289, as indicated by the probability increase. In the real cost records, the cost overrun occurred again at day 289. This means that, after the corrective action is taken, the project cost is generally under control and within budget. Nevertheless, there is still a chance of overrunning if control over the project cost is gradually lost. In practice, it is fair to state that, in addition to project duration, the cost overrun probability trend significantly depends on the status of the influencing factors (such as management and material). If they are under control, the cost overrun probability becomes lower.

SRC Building Project in Taiwan
An SRC building project with a comprehensive cost report was further employed to illustrate the use of the proposed method. This SRC building project is located in Taipei, Taiwan. It is a compound building composed of two towers (12F/2B) and six towers (5F/2B). The project duration and the budget are 35 months and NTD 956,912,592 (USD 29,903,520), respectively. The status of the cost overrun in the project is shown in Table 3 based on the project cost report. The project report recorded eight cost overrun events (in the second to eighth months and the tenth month), and Table 4 lists the status of the influential factors to be assessed within the project duration. The statuses were defined as 1 for cost overrun, 0 for in budget, and −1 for under budget at time instant i.
The assessment simulation was conducted on a monthly basis. The evolution of $X_k$ was simulated according to Equation (7), and the assessment result was simulated according to Equation (4) and the actual values of $X_k$. If a cost overrun is reported at time instant i, the corrective action is taken right after the cost overrun event; i.e., $X_k$ will be set to (1 − ν)/λ right after the assessment. Initially, the influence factors are assumed to occur individually and independently of each other for simplicity. Based upon the cost records of the project, i.e., how many times each factor affects the cost overrun in one month, their prior PDFs are assumed to be uniformly distributed over [0.001, 0.1], [0.001, 0.2], [0.001, 0.2], [0.001, 0.1], and [0.001, 0.25] for weather, productivity, material, equipment, and management, respectively. Because there is no such prior information about parameters λ, ν, and α, their prior PDFs were assumed to follow uniform distributions over the relatively broad intervals [0.00001, 1], [0.1, 1], and [10, 31], respectively. For all the assessments, the number of simulation samples N was selected to be 5000.
Figure 4 plots the real-time cost overrun probability trend for each influence factor in the SRC project. Asterisks in the figure indicate reported cost overrun events in the assessment. The dashed line represents the borderline between the assessment simulation (i.e., the 1st-10th months) and the prediction. It is found that if the assessment data and the parameter µ are similar for some factors, the trend plots look similar, e.g., productivity and material. Additionally, note that the cost overrun probability and the cost overrun rate decrease right after each reported cost overrun event because each reported cost overrun event is followed by corrective action.
In real project execution, many influence factors may simultaneously affect the project cost overruns. This paper further surveyed the potential combinations of the influence factors using sensitivity analysis to determine which combination best fits the real cost overrun trend. Table 5 shows the potential factor combinations. The threshold of the simulation outcome is set as 0.5. If the combined probability is less than 0.5, it is recorded as "U"; otherwise, it is recorded as "O". "O" means that a cost overrun is likely at the time instant, and "U" means cost underrun. Table 6 shows that combinations 5 and 7 match the real cost trend status better than the other combinations. This means that productivity and material may have a more significant impact on the project cost overrun. Finally, to compare the project cost based upon earned value management (EVM) with the real-time HMM assessment model on the same basis, the EVM predicted cost was converted into cost overrun probability values, which were computed from a normal distribution whose overrun mean and standard deviation were estimated from the SRC project cost data. The comparison of the accuracies of both is described in Table 7. The accuracy of EVM and our model is 77.2% and 82.9%, respectively. It is found that our model is more accurate than EVM. In addition, our model also considers the effect of corrective action, which is hardly considered in EVM.
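The thresholding and matching logic used in the sensitivity test can be sketched as follows. The 0.5 threshold and the "O"/"U" labels come from the text; the function names are hypothetical, and no project data are reproduced here.

```python
def classify(probabilities, threshold=0.5):
    """Label each combined probability: 'U' if below the threshold, otherwise 'O'."""
    return ["U" if p < threshold else "O" for p in probabilities]


def match_rate(predicted, actual):
    """Share of time instants at which the predicted label equals the recorded one."""
    if len(predicted) != len(actual):
        raise ValueError("label sequences must have the same length")
    if not actual:
        raise ValueError("at least one time instant is required")
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)
```

A factor combination with a higher match rate against the recorded monthly status reproduces the real cost trend more closely; a probability exactly equal to 0.5 is labeled "O" under the stated rule.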

Conclusions and Recommendations
This paper proposed a new method of predicting project cost overrun probability in real time. The applicability of the proposed model and algorithm was verified against an SRC building project in Taiwan. The posterior probabilities from the real-time HMM model were highly consistent with the cost overrun ratios of a real construction project. This model overcame several limitations of the classical project cost overrun prediction approaches. The proposed method is capable of providing a fast and timely estimate of cost overrun probability. It does not require the support of historical data from other projects, but only the latest data from the assessed project. This method also considers the effect of corrective action, which is rarely considered in past research. Furthermore, the potential cause combinations with a high possibility of affecting the project cost overrun were supplementarily surveyed using sensitivity analysis inside HMM inference. In practice, according to the analysis of cost overrun risks, the effect of corrective action, and the significant influence factors with high possibility, proper and effective cost management plans can be developed to alleviate the risk of construction project cost overrun.
The study is exploratory in nature; further research needs to continue in this area. In this model, cost overrun influence factors are assumed to be independent of each other. In some real construction projects, these factors may not be independent. Additionally, the method relies strongly on the validity of the Poisson arrival assumption, which may not be reasonable for some construction projects. Nonetheless, this method provides a realistic preliminary model to predict the real-time cost overrun probability of a project. In the future, the accuracy and the applicability of the model may be improved if the assumption of factor independence is relaxed. The model would also benefit from examining other possible distributions of cost overrun events, as well as more realistic corrective actions and cost status assessments.

Figure 1. Overall analysis process of proposed method.

Figure 2. Overall data sampling process of the real-time Bayesian updating model.

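Sample-importance resampling (SIR) steps.
Draw candidates: Given a previous sample $Z_k^{(j)}$, the main SIR task is to draw $Z_{k+1}^{C(j)}$ (C stands for the candidate) from $f(z_{k+1} \mid Z_k^{(j)})$. The candidates, $\{Z_{k+1}^{C(j)} : j = 1, \ldots, N\}$, are drawn first. Suppose the previous sample is $Z_k^{(j)} = \{\lambda_k^{(j)}, \mu_k^{(j)}, \alpha_k^{(j)}, \nu_k^{(j)}, X_k^{(j)}\}$; a sample from $f(z_{k+1} \mid Z_k^{(j)})$ can then be obtained by propagating $Z_k^{(j)}$ through Equation (7) for j = 1, ..., N.
Check for corrective actions: If a corrective action is executed for time instant k + 1 when the actual cost (AC) goes above the planned value (PV), let $X_{k+1}^{(j)}$ be readjusted for j = 1, ..., N.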

Buildings 2023, 17
Figure 4. Cost Overrun Probability Trend for Each Influence Factor at SRC Project.

Table 1. Three Span Bridge-Project Activity Data.

Table 2. Cost Overrun Probability Comparison between Barraza et al. [32] and Our Model.

Table 3. Status of Cost Overrun Report at SRC Project.

Table 4. Assessment Data at SRC Project.

Table 5. Potential Factor Combinations and Their Logic Gate.

Table 6. Sensitivity Test of Potential Influence Factor Combinations.

Table 7. Comparison between Our Model and EVM.