Cost Forecasting for Building Rebar under Uncertainty Conditions: Methodology and Practice

Abstract: As large-scale infrastructure construction projects conclude, the overall civil construction market shrinks, leading to increased competition among construction companies. Accordingly, construction companies are placing growing emphasis on project costs. Numerous studies have shown the impact of material costs on the overall project cost. However, sharp fluctuations in material prices have been observed in recent years due to various unstable factors in the market. Thus, accurate prediction of material prices facilitates the development of appropriate material procurement strategies to deal with market risks. This paper collects the rebar prices announced in the first half of 2023 in China's Guangdong Province, selects one type of rebar price as a representative, and analyzes the time series characteristics of the rebar price. Then, it checks whether the time series passes the white noise test, the grey correlation test, and the level ratio test. Prediction models are established based on the seasonal autoregressive integrated moving average (SARIMA) model and the grey model GM (1.1) to predict future rebar prices. According to the characteristics of the rebar price data in June 2023, the inverse residual method is used to combine the results predicted by the two models. The price in early and mid-July 2023 is then predicted using the newly constructed combined model. The results indicate that the combined model is more accurate than a single prediction model.


Introduction
With the completion of large-scale infrastructure construction in China, the scale of the construction market is gradually decreasing. Consequently, market competition among construction enterprises is becoming increasingly intense, leading to a yearly decline in profitability within the engineering construction industry. The cost of materials accounts for the majority of a project's total cost. Medium- and large-scale projects take several years to construct, and the fluctuation of material costs significantly influences the profitability of the enterprise. Construction enterprises therefore pay close attention to the prices of raw materials.
Accurate prediction of material prices can improve the accuracy of total project cost prediction, helping to ensure the project's profitability [1]. Construction material prices fluctuate throughout a project's life cycle, from the production of tender documents to the completion of the final account. As an uncertain risk factor affecting the entire project construction cycle, material price fluctuation significantly affects the deviation of a project's total cost [2]. Accurate prediction of the future price of materials helps the procurement department to arrange appropriate material procurement plans that minimize or avoid the risk of material price fluctuations [3][4][5]. Among the construction materials, concrete and rebar account for most of the construction costs, and rebar prices fluctuate significantly. The price of HRB400E ϕ25 rebar in Guangdong Province reached a high of RMB 4586.67 and a low of RMB 3860 in the first half of 2023, a range of RMB 726.67 over half a year. Accurately predicting rebar price fluctuation makes it possible to arrange purchasing appropriately and to reduce the impact of the fluctuation on the construction cost.
Many experts and scholars have established various forecasting models for steel price forecasting. Al-Hammad tracked the price pattern of rebar in Saudi Arabia from 2003 to 2005 and constructed a polynomial regression equation to predict the future price of rebar [6]. Lin established a new grey prediction model, MFGMn (1.1), which increased the trend-catching ability to obtain high-accuracy prediction results in steel price prediction in the case of system instability [7]. Zhang predicted the weekly shipbuilding steel price data through a variable-structure radial basis function (RBF) network model with a time-varying number of basis functions and input orders [8]. Kapl predicted the price of the hot rolled coil in the US market using the AMNIA and MSSA models, demonstrating the superiority of the MSSA model to the AMNIA model as the time series grows [9].
Wu adopted an adaptive radial basis function neural network (RBFNN) and an adaptive sliding window (ASW) to predict the weekly prices of eight steel products extracted from the Baoshan steel market in Shanghai, China, and the study demonstrated that the ASW model had the highest accuracy [10]. Yin utilized an adaptive RBFNN model, a backpropagation (BP) NN model, and a sliding window (SW) model to forecast the weekly prices of eight steel products from January 2011 to December 2011 in the Baoshan steel market in Shanghai, China, indicating the lowest mean absolute error (MAE) of the ASW model [11]. Using the multiple eigenvalue prediction method, Feng constructed a convolutional neural network to predict rebar price fluctuations [12]. Wang predicted the price of rolled round steel using triple exponential smoothing, the grey prediction model (GM (1.1)), the grey Verhulst model, and the polynomial fitting method. It was found that, although the GM (1.1) was the optimal method overall, the polynomial fitting method provided the best accuracy at specific local time points [13]. Faghih developed a vector error correction (VEC) model for predicting short- and long-term prices of construction materials, achieving excellent results in predicting the prices of asphalt, steel, and cement in the US market [14]. Shiha predicted steel bar prices in Egypt's construction industry over a 6-month period using an artificial neural network (ANN) [15]. Mir trained an ANN to generate prediction intervals directly using the optimal lower upper bound estimation (LUBE) method, achieving excellent results in predicting construction steel prices in the United States [16]. Xu established a Gaussian process regression model to forecast the prices of 10 steel products in the Chinese market [17]. Zhang improved the process of separating and retrieving the four components of the time series, separately utilized the improved multiplicative and additive models for forecasting, and employed the inverse variance method to combine the multiplicative and additive methods with reasonable weights, establishing an improved combined forecasting model for steel bar prices [18]. Mi established a VMD-EEMD-LSTM model and collected rebar futures data from 2009 to 2020 to predict the rebar futures price for the following 14 trading days, providing results superior to other single and combined models [19]. Liu predicted the short-term spot price trend of steel plates using an autoregressive integrated moving average (ARIMA) model, a long short-term memory (LSTM) model, and a combined LSTM-ARIMA model [20]. Sangkhiew made two improvements to the Holt-Winters (HW) model, combining it with artificial intelligence techniques to establish PSO-HW and GSA-HW models to forecast local stainless-steel prices [21].

Rationale for Model Selection
This paper focuses on time series forecasting of the rebar price. To this end, the price time series is analyzed to select an appropriate modeling method.

Price Chart
As there are many types of rebars, material price inspection websites typically refer to the price of a specific type of rebar as a representative sample. This section selects the price data of rebar of type "HRB400E ϕ25" published on the website of the China Guangdong Province Rebar Information Price Announcement (http://cbprov.gldjc.com/gd/homepage) in 2023 as a representative. Arithmetic series (linear) interpolation is used to supplement the price data that are missing due to holidays.
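The interpolation step above can be sketched as follows. This is a minimal illustration that assumes the published prices are held in a pandas Series indexed by date; the three prices and dates are hypothetical placeholders, not values from the paper's data set.

```python
import pandas as pd

# Hypothetical published prices (CNY); 21 and 22 January are non-working days
# with no announcement. These numbers are illustrative only.
published = pd.Series(
    [4100.0, 4110.0, 4140.0],
    index=pd.to_datetime(["2023-01-19", "2023-01-20", "2023-01-23"]),
)

# Reindex to a daily calendar; days without an announcement become NaN.
daily = published.asfreq("D")

# Fill the gaps so that the inserted values form an arithmetic progression
# between the neighbouring published prices (linear interpolation).
filled = daily.interpolate(method="linear")

print(filled)
```

With equally spaced days, linear interpolation and arithmetic series interpolation coincide, which is why `method="linear"` is used here.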

Stability Testing
This paper adopts the augmented Dickey-Fuller (ADF) test to verify the stationarity of the time series. The "adfuller" function in Python 3.12.1 gives a p value of 0.943 for the time series, which is greater than the significance level (α = 0.05). The unit-root null hypothesis therefore cannot be rejected, indicating that the time series is non-stationary.

Seasonal Testing
To identify the seasonality of the time series, this study utilizes the "seasonal_decompose" function in Python to decompose the time series into trend, seasonal, and residual components, with the seasonal cycle duration set at 30 days. Figure 2 illustrates the outcomes of the decomposition. The repeating pattern in the seasonal component suggests the presence of a notable seasonal effect.

White Noise Testing
A white noise test is used to assess whether the time series is suitable for the ARIMA model and its variants. The p value of the sequence after first-order differencing is 0.012, which is less than the significance level of 0.05. The first-order difference sequence is therefore not white noise and passes the white noise test, suggesting the suitability of this time series for ARIMA modeling and its associated statistical time series forecasting techniques.

Grey Correlation and Time Series Level Ratio Detection
The time series of rebar prices can be considered a single-factor time series in which past values are the features used to predict future values. Following the identification of seasonal variations in the time series, the grey correlation index is employed to assess the strength of correlation between different seasons of the time series, so that data from the preceding season can be used to model the subsequent season. Simultaneously, the modeling data undergo the level ratio test to validate the suitability of the GM model for numerical prediction.
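The absolute grey correlation $\varepsilon_{0i}$ is calculated as follows. Firstly, the price data for each month are uniformly converted to 31 days; months with fewer than 31 days of price data are complemented by arithmetic series interpolation.
Considering the series $X_0$ and $X_1$ (that is, $i = 1$), the specific calculation method can be described as follows:

$$X_0 = (x_0(1), x_0(2), \ldots, x_0(31)), \qquad X_1 = (x_1(1), x_1(2), \ldots, x_1(31))$$

$$|s_i| = \left| \sum_{k=2}^{30} \bigl(x_i(k) - x_i(1)\bigr) + \frac{1}{2}\bigl(x_i(31) - x_i(1)\bigr) \right|, \quad i = 0, 1 \qquad (1)$$

$$|s_1 - s_0| = \left| \sum_{k=2}^{30} \Bigl[\bigl(x_1(k) - x_1(1)\bigr) - \bigl(x_0(k) - x_0(1)\bigr)\Bigr] + \frac{1}{2}\Bigl[\bigl(x_1(31) - x_1(1)\bigr) - \bigl(x_0(31) - x_0(1)\bigr)\Bigr] \right| \qquad (2)$$

$$\varepsilon_{01} = \frac{1 + |s_0| + |s_1|}{1 + |s_0| + |s_1| + |s_1 - s_0|} \qquad (3)$$

As shown in Table 1, the grey absolute correlation between two adjacent months is greater than 0.6, so the previous month's data can be employed to establish a prediction model for the next month's data. The level ratio test is then performed on each month's modeling data to check whether it can be predicted with the grey prediction model. The level ratio test can be described as follows. The level ratio $\lambda(k)$ for the time series $X^{(0)} = (x^{(0)}(1), x^{(0)}(2), \ldots, x^{(0)}(n))$ can be calculated as follows:

$$\lambda(k) = \frac{x^{(0)}(k-1)}{x^{(0)}(k)}, \quad k = 2, 3, \ldots, n \qquad (4)$$

According to (4), the extremum values of the level ratios from January 2023 to June 2023 can be obtained, as shown in Table 2. For a series of length $n$, the level ratios should lie in the interval $(e^{-2/(n+1)}, e^{2/(n+1)})$. Table 3 shows the grey level ratio intervals for different time lengths. After the level ratio check, all of the rebar price data from January to June 2023 passed and could be processed using the GM (1.1) model.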
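The two checks described above can be sketched in a few lines of NumPy. The sketch assumes the standard textbook form of the absolute degree of grey correlation (zero-starting-point images of the two series) and the usual admissible interval $(e^{-2/(n+1)}, e^{2/(n+1)})$ for the level ratio; the function names and the sample data are hypothetical.

```python
import numpy as np

def absolute_grey_correlation(x0, x1):
    """Absolute grey correlation between two equal-length series."""
    x0 = np.asarray(x0, dtype=float)
    x1 = np.asarray(x1, dtype=float)
    if x0.shape != x1.shape:
        raise ValueError("series must have the same length")
    z0 = x0 - x0[0]  # zero-starting-point image of x0
    z1 = x1 - x1[0]  # zero-starting-point image of x1
    s0 = z0[1:-1].sum() + 0.5 * z0[-1]
    s1 = z1[1:-1].sum() + 0.5 * z1[-1]
    return (1 + abs(s0) + abs(s1)) / (1 + abs(s0) + abs(s1) + abs(s1 - s0))

def level_ratio_check(x):
    """Return the level ratios, the admissible interval, and a pass/fail flag."""
    x = np.asarray(x, dtype=float)
    n = x.size
    ratios = x[:-1] / x[1:]  # lambda(k) = x(k-1) / x(k), k = 2..n
    low, high = np.exp(-2 / (n + 1)), np.exp(2 / (n + 1))
    passed = bool(np.all((ratios > low) & (ratios < high)))
    return ratios, (low, high), passed

# Hypothetical 31-day price series for two adjacent months.
month_a = np.linspace(4100, 4000, 31)
month_b = np.linspace(4000, 3920, 31)

print(absolute_grey_correlation(month_a, month_b))
print(level_ratio_check(month_b)[1:])
```

A correlation above the 0.6 threshold used in the text, together with a passed level ratio check, would indicate that the preceding month can be used to model the next one with GM (1.1).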

Model Selection
The examination in the above steps indicates that the time series is non-stationary and exhibits seasonal effects. These characteristics can be utilized to determine a suitable modeling approach.
The SARIMA model is commonly employed in statistical modeling for non-stationary time series with seasonal effects and can achieve better forecasting results in long time series forecasting. The combination of forecasting models can enhance the precision of short- and medium-term forecasting.
The grey prediction model has good prediction accuracy in short- and medium-term time series, and the time series has passed the grey correlation test and the level ratio test. The GM (1.1) model can therefore predict the future values of this time series and can be integrated with the SARIMA model to enhance the accuracy of predictions.

GM
The grey system theory, proposed by Julong Deng in the 1980s, can describe and deal with uncertain information. It requires only a relatively small amount of data for modeling and makes no assumption about the data distribution [22,23]. After more than 40 years of development, the relevant theories and methods have matured further, completing the basic structure of the discipline system. The theory has many applications in image processing, time series prediction, system optimization, control, and decision making [24][25][26][27][28].
In the grey system theory, completely unknown and known information are called black and white information, respectively.Furthermore, partially known information is called grey information, establishing the grey system.The grey system is mainly employed for law discovery and mining historical behavioral data by classifying and processing the known information and establishing approximate differential equations, i.e., grey differential equations, for the explored laws.
Grey system forecasting is a grey theory component, employing the construction of shadow equations to discover data patterns and predict the future.The GM (1.1) model is suitable for predicting short-term data series rather than for predicting a wide range of volatile time series [29].Scholars have established many variants of the GM (1.1) model by combining it with other models and have employed them to forecast food products, securities, property markets, and energy products [27,[30][31][32][33][34][35].
Section 2.1.5 verified the correlation and level ratio of the time series from January to June 2023. The test results indicate that the GM (1.1) model is suitable for predicting the time series. The model can be established through the following process.
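Let the original time series $X^{(0)}$ in the system be as follows:

$$X^{(0)} = (x^{(0)}(1), x^{(0)}(2), \ldots, x^{(0)}(n))$$

The first-order accumulated generating operation (1-AGO) produces the sequence $X^{(1)}$:

$$x^{(1)}(k) = \sum_{i=1}^{k} x^{(0)}(i), \quad k = 1, 2, \ldots, n$$

The accumulation operation reduces the randomness of the sequence $X^{(0)}$, and the newly generated sequence $X^{(1)}$ follows a curve that approximates an exponential curve. The resulting 1-AGO sequence for any non-negative raw data sequence increases monotonically.
From $X^{(1)}$, the following sequence of immediate neighborhood means $Z^{(1)}$ can be obtained:

$$z^{(1)}(k) = \frac{1}{2}\bigl(x^{(1)}(k) + x^{(1)}(k-1)\bigr), \quad k = 2, 3, \ldots, n$$

Let the grey differential equation be as follows:

$$x^{(0)}(k) + a z^{(1)}(k) = u \qquad (5)$$

Equation (5) describes the mean form of the GM (1.1) model, which is essentially a difference equation, where a and u are the internal variables and the parameters to be identified. a represents the developmental dynamics of the grey system, called the developmental coefficient, and u indicates the developmental changes in the data relationship, called the grey role quantity.
The whitening equation (aka shadow equation) corresponding to the grey differential equation is as follows:

$$\frac{dx^{(1)}}{dt} + a x^{(1)} = u$$

Letting $\hat{a} = (a, u)^{T}$, the least squares method can be used to obtain the coefficients as

$$\hat{a} = (B^{T}B)^{-1}B^{T}Y_n$$

where $Y_n$ and $B$ can be calculated as follows:

$$Y_n = \begin{pmatrix} x^{(0)}(2) \\ x^{(0)}(3) \\ \vdots \\ x^{(0)}(n) \end{pmatrix}, \qquad B = \begin{pmatrix} -z^{(1)}(2) & 1 \\ -z^{(1)}(3) & 1 \\ \vdots & \vdots \\ -z^{(1)}(n) & 1 \end{pmatrix}$$

The time response function of the whitening equation can be described as follows:

$$\hat{x}^{(1)}(k+1) = \left(x^{(0)}(1) - \frac{u}{a}\right)e^{-ak} + \frac{u}{a}, \quad k = 0, 1, 2, \ldots$$

Finally, the predicted value of the original data series can be obtained by the cumulative reduction of the values of $\hat{x}^{(1)}(k+1)$:

$$\hat{x}^{(0)}(k+1) = \hat{x}^{(1)}(k+1) - \hat{x}^{(1)}(k) = (1 - e^{a})\left(x^{(0)}(1) - \frac{u}{a}\right)e^{-ak}, \quad k = 1, 2, \ldots$$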
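The GM (1.1) procedure above (accumulation, neighborhood means, least squares estimation of a and u, time response, and cumulative reduction) can be sketched with NumPy as follows. The function names and the sample series are hypothetical, and the sketch assumes the estimated a is non-zero.

```python
import numpy as np

def fit_gm11(x0):
    """Estimate the GM(1,1) parameters (a, u) by least squares."""
    x0 = np.asarray(x0, dtype=float)
    x1 = np.cumsum(x0)                       # 1-AGO sequence
    z1 = 0.5 * (x1[1:] + x1[:-1])            # immediate neighborhood means
    B = np.column_stack((-z1, np.ones_like(z1)))
    Y = x0[1:]
    (a, u), *_ = np.linalg.lstsq(B, Y, rcond=None)
    return a, u

def predict_gm11(x0_first, a, u, n_points):
    """Return fitted/predicted values of the original series for k = 1..n_points."""
    k = np.arange(n_points)
    # Time response of the whitening equation (assumes a != 0).
    x1_hat = (x0_first - u / a) * np.exp(-a * k) + u / a
    x0_hat = np.empty(n_points)
    x0_hat[0] = x0_first
    x0_hat[1:] = np.diff(x1_hat)             # cumulative reduction
    return x0_hat

# Hypothetical short series growing by about 0.5% per step.
series = 3900 * 1.005 ** np.arange(10)
a, u = fit_gm11(series)
forecast = predict_gm11(series[0], a, u, n_points=15)  # 10 fitted + 5 ahead
print(a, u)
print(forecast[-5:])
```

A negative a corresponds to a growing series and a positive a to a declining one, which matches the interpretation of a as the developmental coefficient.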

SARIMA
The autoregressive integrated moving average (ARIMA) model is one of the most classical models in time series analysis. This model can fit non-stationary time series containing a trend, and its main principle is to transform a non-stationary series into a stationary one using differencing. Nevertheless, certain non-stationary time series might encompass cyclical features, which the ARIMA model cannot extract. The seasonal autoregressive integrated moving average (SARIMA) model is applicable in such scenarios.
The SARIMA model has been applied to the price forecasting of crude oil, agricultural products, and electricity, attaining excellent results [36][37][38][39][40][41][42]. Based on its characteristics, the SARIMA model has been combined with other models to establish new prediction models [43][44][45]. As new time series forecasting models continue to emerge, combined models that incorporate the SARIMA model have been widely utilized [46,47].
The SARIMA model is based on d-order differencing of the series together with differencing of series values separated by S steps. The d-order and S-step differencing operations can eliminate the trend and the periodicity, respectively, thus making the series stationary. The model can then be expressed as SARIMA (p, d, q) (P, D, Q)S, where p denotes the non-seasonal autoregressive order, d denotes the non-seasonal difference order, q denotes the non-seasonal moving average order, P denotes the seasonal autoregressive order, D denotes the seasonal difference order, Q denotes the seasonal moving average order, and S denotes the length of the seasonal period. The SARIMA model is established on top of the ARIMA model, requiring both the d-order and the S-step differencing operations, expressed as follows:

$$\nabla^{d} = (1 - B)^{d}, \qquad \nabla_{S}^{D} = (1 - B^{S})^{D}$$

where $B$ is the backshift operator, $B X_t = X_{t-1}$. If a sequence becomes stationary through d-order and S-step differencing and an ARMA model is established for the differenced sequence, the model corresponding to this sequence is called a SARIMA model, which can be defined as follows. Let d and D be non-negative integers; a stationary sequence $W_t$ can be obtained from the original sequence $X_t$ by d-order differencing and D-order S-step differencing:

$$W_t = (1 - B)^{d}(1 - B^{S})^{D} X_t$$

For the stationary ARMA sequence, we have

$$\mathrm{AR}(B)\,\mathrm{SA}(B^{S})\,W_t = \mathrm{MA}(B)\,\mathrm{MSA}(B^{S})\,\varepsilon_t$$

A SARIMA process with period S can therefore be described as follows:

$$\mathrm{AR}(B)\,\mathrm{SA}(B^{S})\,(1 - B)^{d}(1 - B^{S})^{D} X_t = \mathrm{MA}(B)\,\mathrm{MSA}(B^{S})\,\varepsilon_t$$

The symbols AR and SA are, respectively, used to denote the non-seasonal and seasonal autoregressive polynomials (of orders p and P), while the symbols MA and MSA are, respectively, used to denote the non-seasonal and seasonal moving average polynomials (of orders q and Q). It should be noted that $\varepsilon_t$ is a white noise sequence. D is rarely greater than 1 in practice, while P and Q are generally less than 3.
Time series modeling and forecasting are divided into linear and nonlinear modeling. Linear modeling includes AR, MA, ARMA, ARIMA, and SARIMA models. The SARIMA model is an extension of the ARIMA model that can deal with periodic time series data, which the ARIMA model cannot fit. The SARIMA model fits non-stationary, periodic data well and is widely utilized in traffic flow forecasting and disease prediction.

Model Evaluation Indicators
To compare the predictive accuracy of the two models fairly, a consistent comparison framework is established. Both models are developed using price data collected from January to June 2023. The GM prediction model exhibits high accuracy in short- and medium-term numerical predictions but shows a notable bias in long-term predictions due to its inherent constraints. Hence, each month is treated as one season and is further divided into three intervals (early, middle, and late month). The models are used to forecast the market price of rebar in early and mid-July 2023. To assess the accuracy of the prediction results, this study introduces three evaluation indices: root mean square error (RMSE), mean absolute error (MAE), and mean absolute percentage error (MAPE).
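The three evaluation indices follow their usual definitions and can be sketched as below. The function names and the sample prices are hypothetical; MAPE is expressed in percent and assumes that no actual value is zero.

```python
import numpy as np

def rmse(actual, predicted):
    """Root mean square error."""
    actual, predicted = np.asarray(actual, float), np.asarray(predicted, float)
    return float(np.sqrt(np.mean((actual - predicted) ** 2)))

def mae(actual, predicted):
    """Mean absolute error."""
    actual, predicted = np.asarray(actual, float), np.asarray(predicted, float)
    return float(np.mean(np.abs(actual - predicted)))

def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be non-zero)."""
    actual, predicted = np.asarray(actual, float), np.asarray(predicted, float)
    return float(np.mean(np.abs((actual - predicted) / actual)) * 100)

# Hypothetical actual and predicted prices (CNY) for three days.
actual = [3900.0, 3910.0, 3925.0]
predicted = [3895.0, 3915.0, 3930.0]
print(rmse(actual, predicted), mae(actual, predicted), mape(actual, predicted))
```

Evaluating each index over the first 10, 20, and 31 forecast days reproduces the three comparison horizons used in the paper's evaluation tables.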

Methodology
The following four steps should be performed to establish the predictive models: investigating data, data preprocessing and method selection, model testing and validation, and combining baseline predictive models.

Survey Data
The information price of rebar in Guangdong Province, China, was collected for January to June 2023 (http://cbprov.gldjc.com/gd/homepage) and is presented in Appendix A. Furthermore, an interpolation method was employed to supplement the information prices that were not released on non-working days to facilitate the prediction model construction.

Identification and Estimation
This paper adopts two different forecasting models for combined forecasting of rebar market prices: SARIMA and GM (1.1). First, each forecasting model is established. Second, a model test is performed. The test results are then employed to combine the models.
The SARIMA model is implemented in Python. A traversal (grid search) over the parameter combinations is then utilized to derive the indicators for each combination, and the Akaike information criterion (AIC) is employed to select the best model parameters.
On the other hand, as the grey model predictions are suitable for short time forecasts, the GM (1.1) model employs data from June 2023 to construct the prediction equations considering the strong correlation between the two adjacent months detected in Section 2.

Model Validation
This paper chooses RMSE, MAE, and MAPE to validate the three models. The performances of the different models are described in the following.

Combination of Models
One research direction in time series forecasting is to select models with complementary characteristics according to the data to be forecasted and to combine them into new forecasting models. Proper construction and selection of combined forecasting models can achieve better accuracy in time series forecasting across various industries. A combined forecasting model established using the ARIMA and GARCH models has higher accuracy in the short-term forecasting of international oil prices than a single model [48]. Combined forecasting models also achieve higher accuracy for other time series. For stock prices, indices, and futures prices, a combined forecasting model established using ARIMA and LSTM obtained better forecasting results [49]. For the dynamic gas emission concentration in the coal mining face, a prediction method based on wavelet decomposition and GM-ARIMA was proposed, which improved the fitting effect and attained a higher prediction accuracy than the GM (1.1) model, the ARIMA model, and their direct combination, the GM-ARIMA prediction model [50]. A combined forecasting model can freely choose the number of models and the forecasting methods, which opens up a broad research field for establishing time series forecasting models [12].
The principle of the inverse residual method states that the higher the accuracy of a model's prediction, the smaller its residual value and the higher its weight value. If the actual value and the value predicted by the ith model at time t are denoted by $x(t)$ and $\hat{x}_i(t)$, respectively, the sum of squares of their differences over the different times, denoted by $Q_i$, can be calculated as follows:

$$Q_i = \sum_{t=1}^{n} \bigl(x(t) - \hat{x}_i(t)\bigr)^{2}$$

Let $w_i$ be the weight value of the prediction result of the ith model, i.e.,

$$w_i = \frac{Q_i^{-1}}{\sum_{j=1}^{m} Q_j^{-1}}$$

where m is the number of models. Let $\hat{x}_i(t)$ be the value predicted by the ith model at moment t. Now, $\hat{x}(t)$ is the combined predicted value at that moment, which is expressed as follows:

$$\hat{x}(t) = \sum_{i=1}^{m} \bigl(\hat{x}_i(t) \cdot w_i\bigr)$$
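The inverse residual weighting can be sketched as follows. The sketch assumes each model's weight is proportional to the reciprocal of its sum of squared residuals on a validation window, normalized so that the weights sum to one; the function names and numbers are hypothetical.

```python
import numpy as np

def inverse_residual_weights(actual, predictions):
    """Weights proportional to 1 / Q_i, where Q_i is model i's sum of squared residuals.

    predictions has shape (m, n): one row per model, one column per time step.
    """
    actual = np.asarray(actual, dtype=float)
    predictions = np.asarray(predictions, dtype=float)
    q = np.sum((actual - predictions) ** 2, axis=1)
    if np.any(q == 0):
        raise ValueError("a model with zero residuals would receive infinite weight")
    inv = 1.0 / q
    return inv / inv.sum()

def combine(predictions, weights):
    """Weighted combination of the models' forecasts at each time step."""
    return np.asarray(weights, dtype=float) @ np.asarray(predictions, dtype=float)

# Hypothetical validation window: actual prices and two models' fitted values.
actual = [3900.0, 3910.0, 3920.0]
fitted = [[3905.0, 3912.0, 3918.0],   # model 1 (for example, SARIMA)
          [3890.0, 3925.0, 3930.0]]   # model 2 (for example, GM (1.1))
w = inverse_residual_weights(actual, fitted)

# Hypothetical out-of-sample forecasts from the same two models.
future = [[3925.0, 3930.0], [3935.0, 3945.0]]
print(w, combine(future, w))
```

Computing the weights on one window (June in the paper) and applying them to later forecasts (July) keeps the weighting independent of the values being predicted.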

Data Series
For the first half of 2023, the median information price of rebar is CNY 4423.33, the mean is CNY 4282.42, the standard deviation is CNY 254.072, the lowest and highest prices are CNY 3860 and CNY 4586.67, respectively, and the range is CNY 726.67.

SARIMA
The SARIMA model can be represented as SARIMA (ps, d, qs) (Ps, D, Qs) [S], which has seven parameters and can be determined as follows.
First, the number of seasons S is determined. In the seasonality test of the time series, it is assumed that a season has 30 days. The seasonality test indicates that the time series has six seasons once the trend factor is removed and the residual sequence is stable. Thus, the value of S is determined as 6.
The time series tests in Section 2 indicate that the first-order differenced series passes the white noise test, so the value of the parameter d is set to 1.
The remaining parameters are chosen in an appropriate range, as shown in Table 4, and an iterative method is adopted to find the optimal AIC value. By traversing all of the parameter combinations, a model is constructed for each combination and its AIC value is calculated; the model with the minimum AIC value is chosen as the optimal model. Figures 3-7 and Table 5 show the final calculation results. According to the above calculations, the SARIMA model parameters were chosen as (1, 1, 1, 0, 1, 1, 6).

GM (1.1)
The model adopts the price data for June 2023 to create the shadow equation. Taking the parameters estimated from these data, the response function for this time series can be obtained, from which the prediction function for this time series is derived. Table 6 and Figures 8 and 9 show the prediction results obtained with the GM (1.1) model.

Evaluation of the Combined Model Results
The SARIMA and GM (1.1) models have been established for rebar price forecasting, and the methodology for combining these models has been determined. Accordingly, the weights for combining the models can be obtained with the inverse residual method. Table 7 shows the predicted results obtained based on this combination. As shown in Table 9, the accuracy of the combined prediction model is improved compared with that of the single prediction models, demonstrating the superiority of the combined prediction model in 20-day prediction. However, as shown in Tables 8 and 10, the SARIMA model achieved the best forecasts in the short-term 10-day price forecasts, and GM (1.1) achieved the best forecasts over the full 31-day July price forecasts.
Figure 10 compares the predicted results of the combined model with those of the GM (1.1) and SARIMA models. The charts show that the GM model successfully predicts the upward trend of the rebar price in early and mid-July, whereas the SARIMA model predicts the fluctuation of the rebar price. Although the timing of price changes in the prediction model deviates by 1-2 days, the predicted trend is compatible with the actual price trend. More accurate results can be obtained in short- and medium-term rebar price prediction when the GM prediction model is combined with the SARIMA model through the inverse residual method.

Conclusions
This study gathered data on the price of rebar in the first half of 2023 in Guangdong Province, China, with a focus on the price of HRB400E ϕ25 rebar as a representative sample. The data were cleaned and supplemented by interpolation before the characteristics of the series were tested. The tests indicated that the SARIMA and GM models, established through the mentioned data, are suitable for modeling and forecasting the rebar price. AIC indicators were adopted to determine the SARIMA model parameters. The level ratio test results and grey correlation test results of the rebar price data were employed to select the GM (1.1) model. The price data of the previous month were utilized as the basis to establish the prediction model for the next month's price data.
Taking the SARIMA and GM (1.1) models fitted to the price data in June 2023, the suitability of the SARIMA model for volatility prediction is combined with that of the GM (1.1) model for short-term trend prediction to establish a new model. Based on the magnitude of each model's residuals on the June price data, the inverse residual method was used to form a weighted combination of the two models' predicted prices for early and mid-July 2023. The weighted combination of the two forecasting models in early and mid-July 2023 improved the accuracy of short- and medium-term rebar price prediction compared with a single statistical forecasting model.
The model proposed in this paper is primarily employed in the preparation stage of the monthly material procurement plan during the construction phase of a project.By forecasting the price of rebar, it can provide invaluable decision support for planning, mitigate the risk of construction costs resulting from material price fluctuations, and reduce the impact of market price fluctuations on the project.
The model is not subject to any specific restrictions with regard to materials or regions. However, testing the data and analyzing its trend before applying the model will yield more precise results. The main limitation is whether the data meet the conditions of the SARIMA model and the GM (1.1) model. The analysis of rebar price data in the region indicates that, following a sustained decline in March-May, a period of stability can be expected to follow. Using the GM model to capture the trend of the change, complemented by the SARIMA model to capture price fluctuations, allows the combined forecast to generate accurate predictions.
This model also has some limitations. Firstly, the SARIMA model tracks prices with a certain mean shift, which results in a lag in predicting the trend. Secondly, for this model to be applicable, the data must pass the white noise test, the grey correlation test, and the level ratio test, which places high requirements on the original data.
This model offers a more precise representation of data trends than the traditional ARIMA series model, integrating trend and volatility forecasts to provide more accurate forecasting results under specific market price conditions.In the future, the deep learning model can be integrated into this model to replace the GM model and SARIMA model for combined modelling, which can then be used in steel bar price prediction.
The autocorrelation of the sequence is an important reason for the lag in predicting the trend, and to eliminate the autocorrelation one must perform the difference operation, that is, one can take the difference between the current moment and the previous moment as the regression target.This provides a research direction for further optimization of the model, with the objective of improving prediction accuracy in the future.
Figure 1 displays the price of rebar in Guangdong Province.

Figure 1. Price of rebar in Guangdong province.

Figure 6. The prediction results obtained with the SARIMA model.

Figure 8. The prediction results obtained with the GM (1.1) model.

Table 1. The grey correlation between two adjacent months from January 2023 to June 2023 obtained from Equation (3).

Table 2. The extremum values of the rebar price grey level ratio from January to June 2023.
As shown in Table 2, the level ratios are in the interval $(e^{-2/(n+1)}, e^{2/(n+1)})$.

Table 3. Grey level ratio intervals for different time lengths.

Table 4. Interval of values of the remaining parameters.

Table 5. The results of the SARIMA model prediction testing.

Table 7. Predicted results.
Tables 8-10 compare the accuracy of the prediction results of the three models based on the evaluation indices, including the RMSE, MAE, and MAPE.

Table 8. Evaluation indices for the predicted results in 10 days.

Table 9. Evaluation indices for the predicted results in 20 days.

Table 10. Evaluation indices for the predicted results in 31 days.