A Statistical Approach to Analyzing Engineering Estimates and Bids

This paper introduces a methodology to assess the accuracy of engineering estimates in relation to the final project cost. The objective of this assessment is to develop a comprehensive approach towards obtaining a more reliable estimate of the project cost. The approach relies on a synthesis of the literature, which provides a basis for determining key components in the estimation of the capital cost of a project, and on a systematic review of existing data for selected projects. The data cover a sample of public transportation projects intended to maintain existing infrastructure within a selected geographical location and a specified period. Statistical models were applied to the existing data to indicate potential measures for preventing errors in the estimate due to uncertainties in the time, cost, and method of construction. A comparison of the results with similar findings from past research shows the effectiveness of the presented methodology and identifies opportunities to enhance statistical analyses of bids and engineering estimates. The conclusions enable project managers to address uncertainties in the bidding process and enhance the financial sustainability of projects within specific programs.


Introduction
The process of bid analysis is a topic of interest for project managers. Bidding, in engineering terms, refers to the process of obtaining the most desirable proposal in response to a request by the project manager. The process generally includes a series of steps to arrive at the lowest qualifying bid for the project. The definition of a qualifying bid may vary with the legal and organizational environment; however, the accuracy of the bid in relation to the true final cost of the project is essential for the project manager regardless of the enterprise environment. Furthermore, the qualifying lowest bid may differ from the literal lowest bid because of qualification rules. Regardless of bid procedures, uncertainties in each project will influence the variation of bids. The breakdown of costs differs with the characteristics of the planned built infrastructure and generally involves land, labor, equipment, and material. Further, characteristics of the project and environmental parameters affect this general breakdown of costs, including size and complexity of the project, site accessibility, labor and equipment availability, contractual agreements, climate, and local culture [1,2]. The intent of this research is to develop a methodology to analyze the relationship between bids and final project costs within practical limits and with consideration of such characteristics. This process, as part of procurement management, influences the financial sustainability of built infrastructure and has substantial impact on the resilience of public infrastructure with respect to resourcefulness, recovery, and rapidity in response to hazards and risks throughout the lifecycle of the infrastructure [3,4].

Contingency Analysis
Existing literature includes substantial research on root causes of uncertainty in the cost estimate of projects. It is important to understand the factors resulting in fluctuations in project cost, to work towards reducing the existing gap between the bid value and the final project cost. Preexisting literature working towards developing such techniques is analyzed in the following paragraphs. As a result, a project manager may better assess uncertainties in the bidding process.
Unclear scope of the work and undefined means and methods of construction are common sources of the errors that are inherent to construction projects [5]. In many cases, it is believed that the project cost estimate is more accurate than the scope development when obtaining an estimate. Lagace (2006) reviewed 24 variables with 84 contractors and determined seven factors to be key in project cost estimation: project complexity, technological requirements, project information, project team requirements, contract requirement, project duration, and market requirements [6]. Rosenfeld (2014) provides an analysis of cost overruns and develops a methodology for ranking causes of cost overruns in accordance with local circumstances. The hypothesis behind that research was that cost overruns in construction projects are merely symptoms of a problem much greater in complexity. A method adopted by ASQ (American Society for Quality), known as "events analysis", is used to determine the root cause of a problem. It does so through the analysis of the factors, conditions, and sequence of events that enable the event to occur. Using this method, researchers were able to narrow the original list of 146 possible cost overrun causes to 15 universal "Root Causes" [7].
Much research has also been dedicated to developing a technique for bidders to determine a suitable bid price without facing a cost overrun. The amount to bid is a taxing decision for a contractor, with two possible outcomes: an opportunity for profit alongside the likelihood of a cost overrun. In a theoretical approach, Rothkopf (1969) presented a model to describe rational bids using extreme value theory [8]. In similar work, Singh, Shiramizu, and Gantam (2007) proposed a technique for developing a bid through the analysis of previous bids and bidding behavior, based on a triangular distribution model of the frequency distribution. This model considers low, medium, and high frequency levels and relates them to the commensurate optimistic, most likely, and pessimistic (OMP) prices [9]. An estimator's understanding of OMP gives the option of determining the price of materials based on experience as opposed to obtaining the deterministic market price of an item. Other probabilistic approaches may involve PERT or Monte Carlo analyses [10].
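The OMP idea can be illustrated with a minimal sketch that samples item prices from triangular distributions and compares the simulated total with the classical PERT expected value. The line items and dollar amounts below are hypothetical and are not taken from the cited studies; the function names are illustrative only.

```python
import random
import statistics

def simulate_total_cost(items, n_trials=10_000, seed=0):
    """Monte Carlo totals from (optimistic, most_likely, pessimistic) item prices."""
    rng = random.Random(seed)
    totals = []
    for _ in range(n_trials):
        # random.triangular takes its arguments as (low, high, mode)
        totals.append(sum(rng.triangular(o, p, m) for o, m, p in items))
    return totals

def pert_expected_total(items):
    """Classical PERT expected value, (O + 4M + P) / 6, summed over items."""
    return sum((o + 4 * m + p) / 6 for o, m, p in items)

# Hypothetical line items in USD: (optimistic, most likely, pessimistic)
items = [
    (90_000, 100_000, 130_000),
    (40_000, 50_000, 75_000),
    (18_000, 20_000, 26_000),
]

totals = simulate_total_cost(items)
mean_total = statistics.fmean(totals)
p80_total = statistics.quantiles(totals, n=10)[7]  # 80th percentile of the totals
pert_total = pert_expected_total(items)
```

In this sketch a contractor could read `p80_total` as a price with roughly a one-in-five chance of being exceeded by the simulated cost, whereas `mean_total` and `pert_total` are central estimates.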

Impacts on the Market
The competitive bidding system serves the purpose of creating a mechanism in which economic growth can be encouraged through free market competition. It can be identified as one of the contributing components in analyzing the bid process and as yet another factor resulting in a project cost different from that initially bid. Naturally, interest is generally directed towards the lowest bid proposed. However, as mentioned previously, the lowest bid is not always reflective of the final project cost.
The public sector often adopts competitive bidding to comply with government statutes [11]. However, it has been shown that, because of the competitive bidding system, the construction industry has faced issues such as abnormally low bids and poor project quality. This is believed to be related to characteristics specific to the construction industry as opposed to other industries. Thus, the lowest bid amount does not necessarily represent the true cost of construction [12]. Bid analysis can help project managers understand the relationship between bid amounts and the final cost of construction. The significance of the apparent lowest bid in such analyses is not necessarily substantial [13]. Using the "system dynamics" approach, Lo, Lin, and Yan (2007) suggested that a competitive bidding system proves to be effective only when a contractor's "opportunistic behavior" is restrained. The focus of this research is to show the impact of a contractor's opportunistic bidding behavior on market failure as well as on the construction industry. The bidding price is established after predicting the bid proposals likely to be made by competitors in the industry [14]. The relationship between the level of competition and a contractor's bidding price shows that the more competitors involved in the bidding process, the lower the profit [15]. Bidders adjust their proposed prices based on the perceived level of competition. As a result, the competition level is believed to be a key factor in representing the strength of the Price Competition Feedback structure.

The Prediction of Final Cost
Aretoulis (2006) provides a thorough analysis of a study conducted by the Department of Planning and Development showing that 9 out of 10 transportation infrastructure projects globally are underestimated. Aretoulis (2006) concludes that time and cost are two factors imperative to the production of a project [16]. Because of the uncertainties and risks involved in civil engineering projects, deriving a price margin for the final cost from the amount stated at the bidding stage requires a system approach. Variations made to the initial cost estimates generally increase the initially established cost. Further, projects that last longer are more susceptible to cost and time overruns. The key components in determining this final cost include indirect and direct costs as well as the markup. Developing a system approach requires several steps, including process definition, process modeling, database creation with prediction models of the processes' final cost, and the estimation of the total project bidding price with the application of the models.
There have been several research studies on the impact of bid competition and the influence of the number of bidders on the final cost. Carr (2005) provides a quantitative approach to analyze the influence of reduced competition on bid values. The study involved the assessment of 19 public works projects with 438 bids in the amount of USD 158 million in construction value. The projects evaluated were public educational facility projects in New York State. Through a correlation analysis of the relationship between the number of bidders and the bid price, it was determined that there was a 99.7% chance that the relationship between the two variables was not due to chance alone. The results suggested a negative correlation between the two variables: the higher the number of bidders involved in the bidding process, the lower the final bid value. A regression model indicated a 3.79% increase in the estimated project cost with the loss of each bidder. As a result, the study concluded that decreasing the number of bidders involved in a project results in an increase in project bid prices [17].
Tehrani (2016) considered 22 projects within a single portfolio with similar scope and size, advertised during the same period and within the same geographical location, to investigate the relationship between the engineer's estimate, bid amounts, and the final cost of construction [18]. The restrictive process to select projects allowed the analyses to focus on bidders' behavior and the inherent errors associated with the process, rather than on influences of scope, size, and other external parameters. The study revealed that the average value of bid amounts, excluding outliers, represented the true cost of the selected projects. Okere (2017) verified this finding for a larger sample of projects by indicating the poor relationship between the engineer's estimate and the true cost of a project [19].
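A correlation and regression analysis of this kind can be sketched in a few lines. The data below are invented for illustration and are not the data of Carr (2005); the helper names `pearson_r` and `ols_line` are likewise assumptions of this sketch.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def ols_line(x, y):
    """Ordinary least squares fit y = intercept + slope * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return slope, my - slope * mx

# Hypothetical observations: number of bidders and low bid / estimate ratio
bidders = [2, 3, 4, 5, 6, 8, 10]
bid_ratio = [1.12, 1.08, 1.05, 1.00, 0.97, 0.90, 0.83]

r = pearson_r(bidders, bid_ratio)
slope, intercept = ols_line(bidders, bid_ratio)
```

A negative `slope` corresponds to the reported pattern: each additional bidder lowers the expected bid ratio, so each bidder lost raises it.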

Significance of the Research
Interest in research on this topic is stimulated by the desire to develop a more accurate budget with consideration of uncertainty in the bidding process, which often causes cost overruns or underruns. Prevention of cost overruns or underruns also influences the credibility of the project management team, which is responsible for the development of project estimates throughout the planning and design stages. This responsibility has substantial impact in the public sector, where proper budget planning is the key to future funding potential. The current research expands the previous portfolio work in this area in light of new data, analyzing a program that includes a larger body of projects. To eliminate possible external parameters, projects were chosen on the basis of location, similarity in scope, and the period in which they were advertised for bid. This ultimately allowed for an analysis of the relationship between bid ratios and the final project cost.

Methodology
This paper follows the methodology proposed by Tehrani (2016) [18]. The analysis begins with the selection of projects for data mining from a portfolio of State of California Department of Transportation highway maintenance (HM) projects. Selected projects had similar scope (Pavement) and were advertised within a small geographical location (District 6) during a single fiscal year. The restrictive criteria limited the sample to 50 projects, but enabled the analyses to focus on the bidding process rather than on external parameters such as project type, funding source, economic environment, etc. The distribution of projects with respect to total value was not necessarily uniform or normal, which is natural for realistic portfolios that contain a large number of small projects and a few large projects. The total value of projects was nearly USD 220 million based on the engineer's estimate and USD 216 million based on the apparent lowest bids. The engineer's estimate values were obtained from the official records of projects. Each project received 2 to 10 bids, with an average of five bids per project and a total of 251 bids. The standard deviation and the coefficient of skewness of the number of bids were 2.1 and 0.43, respectively. The positive skew is apparent in Figure 1.

The engineer's estimate for this sample varied from nearly USD 0.24 million to USD 68 million. However, the average value was determined to be USD 4 million, as 86% of projects had an engineer's estimate of USD 5 million or less (Figure 2). Such a distribution is not uncommon in portfolio management of maintenance projects (Figure 3). Figure 4 indicates that the value of projects, as represented by the engineer's estimate, did not have a significant impact on the number of bids, given the negligible correlation; thus, the bid analysis is not biased by the distribution of the engineer's estimate.
The basis of bid analysis is the comparison between single bid amounts and the reference value of the project. The value of the project, also identified as the true project cost upon completion of construction, can be linked to several values such as the engineer's estimate, the apparent lowest bid, and the average bid. It should be noted that the apparent lowest bid is not necessarily the same as the qualifying lowest bid. In certain legal circumstances, the project is awarded to the lowest bid after applying certain factors, such as an allowance for small disadvantaged businesses. In this study, five projects were subject to such modifications. Further, obtaining the average bid often requires the application of other filters to remove outliers, as discussed by Tehrani (2016). Common filters average only a few bids, such as the three lowest bids, or average the bids after excluding extreme values, such as the highest and the lowest bids or bids beyond the gross mean plus or minus the gross standard deviation (trimming). Simpler techniques may involve selecting the second lowest bid instead of the lowest bid, or the median instead of the mean value.
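The filters described above can be sketched as small helper functions. This is an illustrative sketch rather than the implementation used in the study: the function names are hypothetical, the sample standard deviation is an assumed choice for the trimming rule, and the bid values are invented.

```python
import statistics

def second_lowest(bids):
    """Second lowest bid; requires at least two bids."""
    return sorted(bids)[1]

def mean_of_lowest(bids, k=3):
    """Average of the k lowest bids (all bids if fewer than k are available)."""
    return statistics.fmean(sorted(bids)[:k])

def mean_without_extremes(bids):
    """Average after dropping the single highest and single lowest bid."""
    ordered = sorted(bids)
    if len(ordered) < 3:
        return statistics.fmean(ordered)  # nothing would remain after trimming
    return statistics.fmean(ordered[1:-1])

def one_sigma_trimmed_mean(bids):
    """Average of bids within the gross mean +/- one gross standard deviation.

    Uses the sample standard deviation (an assumption); needs at least two bids.
    """
    mean = statistics.fmean(bids)
    sd = statistics.stdev(bids)
    kept = [b for b in bids if abs(b - mean) <= sd]
    return statistics.fmean(kept)

# Hypothetical bids for one project, in USD million
bids = [1.00, 1.05, 1.10, 1.20, 1.80]
candidates = {
    "lowest": min(bids),
    "second_lowest": second_lowest(bids),
    "mean_three_lowest": mean_of_lowest(bids),
    "mean_without_extremes": mean_without_extremes(bids),
    "one_sigma_trimmed": one_sigma_trimmed_mean(bids),
    "median": statistics.median(bids),
}
```

In this example the single high bid of 1.80 pulls the plain mean up to 1.23, while every filtered value stays between 1.05 and about 1.12.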

Project Estimates
The true cost of a project may not be determined by the contract cost, which is generally based on the value of the lowest bid. Rather, the true cost of the project is based on the official records of the project after the completion of construction. This value typically includes minor adjustments and corrections in the scope and quantities of the project, which are funded through contingencies. To reach a project estimate, bidders take into consideration various factors, including the availability of resources, location, labor, etc. Lower tenders may not always be feasible bids, since such estimates are at times driven by a firm's desire simply to obtain the work and may eventually result in an increase in the final cost over the course of the project. Larger overhead costs associated with the size of the firm and lack of experience in specified areas are also factors that contribute to high bid values.
As a result, project estimates prepared for bidding fluctuate with the differing factors that contribute to the final cost estimate. A statistical analysis of bids placed on similar projects in the same district allows the true cost of a project to be determined while taking into consideration outside factors, not directly related to the projects, that may influence the value of a bid. Table 1 summarizes the best mechanisms for estimating sample project costs based on the studied literature [8][9][10][11][12][13]. The results analyze each mechanism and its appropriateness for project, portfolio, and program managers who need to estimate the required budget for their financial unit. As displayed in Table 1, three error measures are presented for each hypothesis. The error reflects the differences among the proposed cost estimate, the reference value, and the engineer's estimate. The least square root method is a measure of the probable error in the system, taking into consideration both the differing nature of projects and the statistically independent error values for the reviewed projects. The weighted signed average incorporates both the cost and the monetary amount (in dollars) of the difference for each project. This measure has proven most useful because it takes into consideration not only overestimated bids but also underestimated bids. In total, three approaches are used to evaluate the low bid value, represented by the three columns of Table 1. The lowest bid column represents the gap between the value of the lowest bid and the value of the engineer's estimate. The lowest bid may come from bidders with differing motives, as discussed previously.
As a result, of the three approaches, using the lowest bid to determine the true cost of the project is believed to be the weaker approach. The 2nd lowest bid is a more reliable approach, not only because of the substantially lower error values shown in Table 1 but also because of the possible characteristics of the bidders providing the lowest offer. The third column represents the average of the three lowest bids. However, this approach is not useful for projects with fewer than four bids.
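The two error measures named above can be sketched as follows. The exact formulas behind Table 1 are not given in the text, so the definitions here are assumed interpretations: a root-sum-square of dollar errors normalized by the total reference value, and a value-weighted signed average in which over- and underestimates offset each other. All figures are hypothetical.

```python
from math import sqrt

def root_sum_square_error(estimates, references):
    """Square root of the summed squared dollar errors over total reference value.

    Assumed reading of the 'least square root' measure: project errors are
    treated as statistically independent and combined in quadrature.
    """
    squared = sum((e - r) ** 2 for e, r in zip(estimates, references))
    return sqrt(squared) / sum(references)

def weighted_signed_average_error(estimates, references):
    """Net signed dollar error over total reference value.

    Larger projects carry more weight; positive and negative errors cancel.
    """
    net = sum(e - r for e, r in zip(estimates, references))
    return net / sum(references)

# Hypothetical values in USD million: candidate estimates vs reference costs
estimates = [1.0, 2.2, 4.5]
references = [1.1, 2.0, 5.0]

rss_error = root_sum_square_error(estimates, references)
wsa_error = weighted_signed_average_error(estimates, references)
```

Here `wsa_error` is negative, meaning the candidate estimate understates the reference value for the portfolio as a whole, while `rss_error` is always non-negative and reports only the magnitude of the disagreement.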
According to the values in Table 1, the lowest bid provides the least square root value. However, using the lowest bid to determine the final cost can produce a faulty estimate because of the issues discussed previously. The average of the three lowest bids has the second least square root value. However, this approach may provide an inaccurate estimate if outlier bids are not accounted for, for example when the data contain a bid much higher or lower than the rest. To account for this issue, an alternative approach removes the highest and lowest bids before determining the average. Figure 5 displays the distribution of bids for the projects examined. The bid ratio is the ratio of each bid to the average of all bids for each project; the distribution corresponds to 88.83% confidence and indicates a tendency of bidders to offer lower-than-estimate bids. The distribution of bid ratios has an average of 1.0, a standard deviation of 0.135, a skew of 0.973, and a median of 0.993.
The data also show an average standard deviation of 12.12% for the distribution of project bids. This percentage measures the extent to which the values in the data set differ from the mean of the projects analyzed. Taking into consideration the ratio of the lowest bid to the average bid, an alternate distribution of data may be obtained. Figure 6 displays the distribution resulting from the comparison of the two factors.
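The bid ratio and its summary statistics can be computed as in the following sketch. The project bids are hypothetical, the function names are illustrative, and the moment-based skewness coefficient is an assumption, since the text does not state which skewness definition was used.

```python
import statistics

def bid_ratios(projects):
    """Ratio of each bid to the average of all bids in its own project, pooled."""
    ratios = []
    for bids in projects:
        avg = statistics.fmean(bids)
        ratios.extend(b / avg for b in bids)
    return ratios

def lowest_to_average(projects):
    """Ratio of the lowest bid to the average bid, one value per project."""
    return [min(bids) / statistics.fmean(bids) for bids in projects]

def skewness(values):
    """Moment-based skewness m3 / m2**1.5 (population form, an assumed choice)."""
    n = len(values)
    mean = statistics.fmean(values)
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Hypothetical bids (USD million) for three projects
projects = [
    [1.00, 1.05, 1.10, 1.20, 1.80],
    [2.0, 2.1, 2.4],
    [0.50, 0.70],
]

ratios = bid_ratios(projects)
summary = {
    "mean": statistics.fmean(ratios),
    "median": statistics.median(ratios),
    "stdev": statistics.stdev(ratios),
    "skew": skewness(ratios),
}
low_ratios = lowest_to_average(projects)
```

Because each project's ratios average to one by construction, the pooled mean is 1.0 regardless of project size; a positive skew then places the median below 1.0, which is the pattern of many bids slightly below the average and a few well above it.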
Figure 8 represents the ratio of the lowest bid to the average bid. The standard deviation of the data is as low as 0.08, with a skew value of −1.78. The median of the data is 0.89, with an average value of 0.87. A comparison between Figures 7 and 8 indicates skewness in two opposite directions, which is essential for understanding the bidding behavior. The skewness of the ratio of bids to the engineer's estimate clearly indicates the underestimation of the engineer's estimate in determining the bid value. This skewness might be a function of different parameters, including but not limited to market characteristics at the time of bidding interacting with the type and scope of projects within a small portfolio or large program. For instance, the wide spread of projects in this study with respect to scope and bidding period has changed the direction of skewness in comparison with the work presented by Tehrani (2016). In another observation, the opposite direction of skewness in Figure 8 represents the competitive behavior of bidders aiming to provide the lowest bid regardless of the engineer's estimate.

Conclusions
Analyzing cost estimates of similar projects within the same geographical district allows for a better understanding of the gap between the engineer's estimate and the final project cost upon completion of construction. Statistical analysis of the resulting bids and their distribution (Figure 4) indicates that the average of bids is an appropriate model for the analysis of bids.
In determining a mechanism for the estimation of sample project cost, analysis of the data allowed three approaches to be ranked, as shown in Table 1, following the studied literature. As shown by the error values in Table 1, the lowest bid seems to be a more reliable approach due to its lower error values. No single approach to estimation is necessarily suitable for all situations. For instance, the average of the three lowest bids has the second least square root value, even though it may result in an inaccurate estimate if outliers are not accounted for. An analysis of the bid ratio shows the tendency of bidders to offer lower-than-estimate bids. The data also indicate the extent to which individual bids differ from the mean of the projects analyzed. The standard deviations of the distributions indicate that many of the values obtained are relatively close to the average. This is the case for both the ratio of the lowest bid to the average bid and the ratio of the bids to the engineer's estimate. The skewness of the data is also an indication of where the mean of the observed data lies. The observed data indicated that the mean of the ratio of bids lies to the left of the histogram.
The numerical analysis performed in this research covers a limited scope of projects, confined to a distinct region and type of project. The analysis also relies on assumptions about the general nature of bidding. However, the presented methodology may be applied to a larger scope of projects to provide a more accurate analysis of an engineer's project estimate.