Buildings
  • Article
  • Open Access

25 December 2022

Application of GSM-SVM for Forecasting Construction Output: A Case Study of Hubei Province

1 School of Urban Construction, Yangtze University, Jingzhou 434023, China
2 School of Economics and Management, Yangtze University, Jingzhou 434023, China
3 Hunan Construction Engineering Real Estate Investment Co., Ltd., Changsha 410026, China
4 School of Architecture and Planning, Hunan University, Changsha 410082, China
This article belongs to the Section Building Energy, Physics, Environment, and Systems

Abstract

Scientific forecasting and quantitative analysis of construction output are of considerable significance. Among existing construction economic forecasting methods, neither time series models nor the BP neural network considers changes in the relevant influencing factors. This paper introduces a support vector machine (SVM) whose parameters are optimized by the grid search method (GSM) to address these problems. First, an index system of the factors influencing gross industrial output is constructed, and a grey relational method is adopted to verify the correlation between the eight factors and output. Second, an SVM forecast model of the gross output is constructed using the relevant datasets and influencing factors of the construction industry in Hubei from 2001 to 2016 as training samples, with the parameters optimized by the GSM. Then, the model is used to forecast and analyze the gross output from 2017 to 2020, and the errors are checked. Finally, a systematic comparison of three forecast models, namely the GSM-SVM model, the BP neural network, and grey GM (1,1), shows that the GSM-SVM forecast model possesses higher accuracy and generalization ability. The effectiveness and reliability of the proposed model in construction output forecasting are thus verified, and it can provide a more effective modeling and forecasting method for the gross output value of the construction industry.

1. Introduction

Construction output denotes the sum of construction industry products and services generated in a country or a region within a certain period. It is a specific manifestation of production scale, development speed, and operating performance in the construction industry [1,2]; it is essential for governments or companies to avoid risks, position the industry, and formulate rules and regulations. It is necessary to forecast the future economic development of the construction industry based on the intrinsic factors affecting the gross construction output. The high-quality development of the construction industry will be achieved by systematically evaluating and measuring the economic growth of the construction industry [3].
Traditional methods for forecasting construction output generally rely on establishing mathematical or physical models, mainly linear regression [4,5], time series [6,7], and other linear analysis methods. These methods use historical construction output data to extrapolate the future development trend of the construction economy, and time series analysis in particular requires the research object to exhibit a continuous, regular growth trend. However, forecasting gross construction output is a non-linear problem affected by many factors, so the above methods do not align with the objective development law of construction output. With the rapid development of artificial intelligence (AI) technology, economic forecasting methods based on AI algorithms have received widespread attention from researchers, mainly including grey forecasting models [8,9], BP neural networks [10,11], and integrated models combining different algorithms. However, the grey GM (1,1) model has low accuracy on irregular and unstable sample data, while the BP neural network requires large data samples during training and has poor learning ability and fault tolerance on small-sample time series. In general, current construction output forecasting algorithms suffer from neglecting the impact of relevant factors, slow convergence, and low fault tolerance.
According to the above discussion, a suitable forecasting model for construction output is urgently required. Firstly, the influencing factors of construction output need to be introduced into the forecasting model. Most existing studies directly use macroeconomic indicators, such as gross regional product and fixed asset investment, to forecast construction output, and thus fail to capture the impact of other factors. Therefore, an index system with a clearer hierarchical structure that integrates multiple considerations is necessary. Secondly, the support vector machine (SVM) is a data mining method based on statistical learning theory. Compared with other algorithms, it has unique advantages in handling small samples, non-linear problems, and high-dimensional pattern recognition. The grid search method (GSM) has a strong self-adaptive search ability and can search comprehensively for the optimal parameter combination of the SVM. To summarize, this paper uses a support vector machine optimized by the grid search method (GSM-SVM) to achieve a more accurate prediction of the construction industry’s gross output.
The contributions of this paper are summarized as follows:
(1)
Screening scientific and practical influencing-factor indicators is the basis for predictive analysis of construction output. Based on the literature and related research, and combined with the current development situation of the Chinese construction industry, an evaluation index system is proposed to screen out the more scientifically grounded influencing factors of construction output.
(2)
A grid search method is used to optimize the SVM algorithm by finding the optimal combination of the penalty factor C and the kernel function parameter g, improving the recognition accuracy and prediction performance of the SVM prediction model.
(3)
The SVM algorithm is applied to Chinese construction output forecasting, and related experiments verify that the GSM-SVM forecasting model has higher accuracy, making it an important extension of the economic forecasting methods of the construction industry.
The remainder of this paper is organized as follows. Section 2 introduces related work on the methods and influencing-factor indicators used to forecast gross construction output. Section 3 presents the support vector machine method and the construction of the index system. Section 4 constructs the GSM-SVM forecasting model and verifies its forecasts of gross construction output; to verify the model’s accuracy, its forecast errors are compared with those of the BP neural network and grey GM (1,1) models. Finally, Section 5 summarizes the feasibility of the GSM-SVM approach in construction output forecasting.

4. Results and Discussion

4.1. Data Sources and Data Pre-Processing

This paper selected construction output in Hubei over the 20 years from 2001 to 2020 as the research object. The eight indexes mentioned above were selected as the characteristic values of the prediction model. The data come from the “China Statistical Yearbook”, “Hubei Statistical Yearbook” and “China Construction Industry Statistical Yearbook” from 2001 to 2020. The original gross output data and relevant indexes are shown in Table 3.
Table 3. Raw data of gross output and relevant indexes of the construction industry in Hubei from 2001 to 2020.
In the SVM algorithm, unifying the magnitude of the original data significantly improves the SVM recognition rate and training efficiency and eliminates the forecast errors caused by large differences in index values [46]. The mapminmax function was used to normalize the data to [0, 1]. Let the training sample set be $\{x_i\}$ and the test sample set be $\{x_i'\}$, and let $x_{\max}$ and $x_{\min}$ be the maximum and minimum values of the indexes in the training sample set, respectively. A sample $x_i$ in the training sample set is normalized with the equation $\bar{x}_i = \frac{x_i - x_{\min}}{x_{\max} - x_{\min}}$. The test sample set $\{x_i'\}$ is normalized in the same way as $x_i$. The simulation results need to be de-normalized to obtain the final forecast values.
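The normalization and de-normalization steps can be illustrated with a minimal Python sketch. It is not the authors' implementation (the paper uses the mapminmax function); the helper names `fit_min_max`, `normalize`, and `denormalize` and the sample values are hypothetical and chosen only for illustration.

```python
def fit_min_max(train_column):
    """Return (x_min, x_max) of one index, taken from the training samples only."""
    return min(train_column), max(train_column)


def normalize(values, x_min, x_max):
    """Map values to the [0, 1] scale defined by the training minimum and maximum."""
    span = x_max - x_min
    return [(v - x_min) / span for v in values]


def denormalize(scaled, x_min, x_max):
    """Invert normalize() to recover values in the original units."""
    return [s * (x_max - x_min) + x_min for s in scaled]


# Made-up numbers for illustration, not data from Table 3.
train = [100.0, 250.0, 400.0]
test = [475.0]

x_min, x_max = fit_min_max(train)
train_scaled = normalize(train, x_min, x_max)   # [0.0, 0.5, 1.0]
test_scaled = normalize(test, x_min, x_max)     # [1.25]
restored = denormalize(test_scaled, x_min, x_max)
```

Because the minimum and maximum come from the training set, a test value larger than every training value maps above 1, as in this example.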

4.2. Model Fitting Based on GSM-SVM

After data sorting and preprocessing, the eight influencing factors and the gross output of the construction industry were selected as the input and output of the model, respectively. The time series data of construction output in Hubei from 2001 to 2016 were selected as the training samples, and the data from 2017 to 2020 were selected as the testing samples.
In this study, the radial basis function (RBF) kernel was selected to establish a predictive SVR model, which transformed the optimization problem into solving equations. Since the accuracy and performance of the SVM depend on the parameters C and g, the grid search method was used to optimize them. First, a large step length is used for a rough search, and then a small step length is used for a fine search. The accuracy of different parameter combinations is compared by 5-fold cross-validation, and the optimal SVM parameter set is finally determined.
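The two-stage search described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' code: the search range, the step lengths, and the function names are hypothetical, and `toy_cv_error` is an artificial stand-in for the 5-fold cross-validation error of an RBF-kernel SVR, with its minimum placed arbitrarily.

```python
import itertools
import math


def frange(start, stop, step):
    """Evenly spaced values from start to stop, both ends included."""
    n = int(round((stop - start) / step))
    return [start + i * step for i in range(n + 1)]


def grid_search(cv_error, log2c_values, log2g_values):
    """Return (error, log2C, log2g) with the smallest cv_error on the grid."""
    best = None
    for lc, lg in itertools.product(log2c_values, log2g_values):
        err = cv_error(2.0 ** lc, 2.0 ** lg)
        if best is None or err < best[0]:
            best = (err, lc, lg)
    return best


def coarse_to_fine(cv_error, lo=-8.0, hi=8.0, coarse_step=1.0, fine_step=0.25):
    """Rough search with a large step, then a fine search around the rough optimum."""
    coarse = frange(lo, hi, coarse_step)
    _, lc, lg = grid_search(cv_error, coarse, coarse)
    fine_c = frange(lc - coarse_step, lc + coarse_step, fine_step)
    fine_g = frange(lg - coarse_step, lg + coarse_step, fine_step)
    err, lc, lg = grid_search(cv_error, fine_c, fine_g)
    return 2.0 ** lc, 2.0 ** lg, err


def toy_cv_error(C, g):
    """Hypothetical error surface; a real run would return the 5-fold CV error."""
    return (math.log2(C) - 2.5) ** 2 + (math.log2(g) + 5.0) ** 2


best_c, best_g, best_err = coarse_to_fine(toy_cv_error)
```

Searching on a base-2 logarithmic grid keeps the number of evaluations small while covering several orders of magnitude of C and g. In practice, `cv_error` would train the SVR on four folds and score it on the fifth, averaging over the five splits.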
The process of parameter optimization is shown in Figure 3. The x-axis represents the base-2 logarithm of the parameter C, the y-axis represents the base-2 logarithm of the kernel parameter g, and the contours represent the accuracy obtained for each combination of C and g. The optimal parameters obtained for Support Vector Regression (SVR) were bestc = 11.3173 and bestg = 0.0078, with CVmse = 0.0073.
Figure 3. Diagram of parameter optimization process. (a) 3D view, (b) contour map.
The SVM network was trained with the optimal parameters C and g of regression analysis, and the network regression forecast was carried out on construction output in Hubei from 2017 to 2020. The output data were processed by de-normalization, and the final forecast value of the construction output in Hubei Province was obtained. The comparison between the value forecasted by the model and the actual value is shown in Figure 4.
Figure 4. Fitting results of GSM-SVM forecast model for the construction industry output in Hubei. (a) total samples’ absolute error diagram, (b) testing samples’ relative error diagram.
Figure 4 shows that the output value of Hubei’s construction industry forecasted by the GSM-SVM model fits the actual value well. The overall trends of the two curves are consistent, and the prediction accuracy of the overall sample is 93.90%. In the accuracy test of all samples from 2001 to 2020, the prediction results for 2008 and 2015 deviate slightly, while the remaining predicted values are consistent with the actual results. The absolute error diagram in Figure 4a shows more intuitively that the maximum absolute error of the overall sample is no more than 30 billion Yuan. Among the training samples, the prediction for 2004, 2012, and 2016 is excellent, with an absolute error of no more than 4 billion Yuan. The relative error diagram of the testing samples in Figure 4b shows that the relative error of the gross output forecasted by the GSM-SVM model over the four testing years (2017–2020) is relatively stable. The maximum relative error of the testing samples is no higher than 1.5%, which is a high level of accuracy for macroeconomic forecasts. In summary, the GSM-SVM model can be used to forecast the gross output of the construction industry, and its forecasts of construction output are highly credible.

4.3. Model Comparison and Analysis

To verify the reliability and practicability of the GSM-SVM model, it is compared with the BP neural network and the grey GM (1,1) model, which are commonly used to forecast the gross output value of the construction industry. Table 4 and Figure 5 compare the construction output in Hubei from 2017 to 2020 forecasted by the three models.
Table 4. Comparison between the gross output forecasted by the three models and the actual value of the construction industry.
Figure 5. Comparison of the predictive performance of the three models. (a) forecast values and targets for the gross output of construction, (b) relative error diagram.
From Table 4 and Figure 5, it is clear that the grey GM (1,1) forecast model has the worst performance. The values forecasted by the GSM-SVM model and the BP neural network model vary in a similar way to the actual construction output, indicating that both can effectively forecast the output of the industry. Of the two, the curve of the values forecasted by GSM-SVM is more consistent with the actual values, demonstrating its better forecast performance.
Furthermore, the mean absolute percentage error (MAPE) and the Theil inequality coefficient (TIC) [47] are used to analyze the errors of the three models. The comparison of the model forecast accuracy is shown in Table 5. The equation of MAPE is $\mathrm{MAPE} = \frac{1}{n}\sum_{i=1}^{n}\left|\frac{p_i - \hat{p}_i}{p_i}\right|$, where $\hat{p}_i$ is the forecasted construction industry output, $p_i$ is the actual value, and $n$ is the number of forecasted samples. MAPE reflects the overall closeness of the forecasted values to the actual values: the smaller the value, the higher the forecast accuracy of the model. The equation of Theil’s inequality coefficient, in its standard form, is $T = \frac{\sqrt{\frac{1}{n}\sum_{i=1}^{n}(p_i - \hat{p}_i)^2}}{\sqrt{\frac{1}{n}\sum_{i=1}^{n}(p_i)^2} + \sqrt{\frac{1}{n}\sum_{i=1}^{n}(\hat{p}_i)^2}}$, which reflects the difference between the real values and the simulation results. The smaller the value of $T$, the better the fit, and $T = 0$ means a perfect fit.
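As a hedged illustration of the two error measures, the following Python sketch computes MAPE and the standard form of Theil’s inequality coefficient. The function names and the sample values are hypothetical and are not taken from Table 4 or Table 5.

```python
import math


def mape(actual, forecast):
    """Mean absolute percentage error, returned as a fraction (0.075 means 7.5%)."""
    n = len(actual)
    return sum(abs((p - f) / p) for p, f in zip(actual, forecast)) / n


def theil_coefficient(actual, forecast):
    """Theil's inequality coefficient: RMSE divided by the sum of the two RMS levels."""
    n = len(actual)
    rmse = math.sqrt(sum((p - f) ** 2 for p, f in zip(actual, forecast)) / n)
    rms_actual = math.sqrt(sum(p ** 2 for p in actual) / n)
    rms_forecast = math.sqrt(sum(f ** 2 for f in forecast) / n)
    return rmse / (rms_actual + rms_forecast)


# Made-up values for illustration only.
actual = [100.0, 200.0]
forecast = [110.0, 190.0]

mape_value = mape(actual, forecast)                 # (0.10 + 0.05) / 2
theil_value = theil_coefficient(actual, forecast)   # lies between 0 and 1
```

Both functions return fractions, so multiplying by 100 gives percentages comparable in form to those reported for the three models.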
Table 5. Comparison of the forecast accuracy of the three models.
Comparing the errors in Table 5 shows that the MAPE and TIC of the GSM-SVM model are 0.823% and 0.413%, respectively, a remarkable prediction accuracy compared with 1.905% and 0.951% for the BP neural network model and 5.333% and 2.704% for the grey GM (1,1) forecast model. The correlation coefficient between the values forecasted by the GSM-SVM model and the actual values is also as high as 0.99612, and the model is better overall than the other two.
According to the above analysis and comparison, the proposed GSM-SVM model has the best performance in forecasting construction output in Hubei, while the grey GM (1,1) model has the worst. In conclusion, the GSM-SVM model forecasts construction output more accurately and is an efficient and reasonable forecast method. It could be applied to output value prediction at a larger scale and has significant promotion and application value. Additionally, it can provide a decision-making basis for China to formulate construction industry plans scientifically and effectively, thus boosting output through multiple channels.

5. Conclusions

This study presents the influencing factors of the gross output value of the construction industry and employs the GSM-SVM to establish a forecast model for the gross output of the construction industry in Hubei. Relevant industry data from 2001 to 2020 are used for simulation and forecasting. The maximum relative error of the test samples is no higher than 1.5%. In a systematic comparison of three forecast models, namely the GSM-SVM model, the BP neural network, and grey GM (1,1), the MAPE of the GSM-SVM forecast model is 0.823% and its TIC is 0.413%. Its prediction performance is superior to that of the BP neural network and grey GM (1,1) models, indicating better forecast and convergence performance for construction industry output.
This paper is only a tentative study that applies the SVM model to forecasting gross construction output in China and verifies the method’s feasibility. Gross construction output is also affected by various other complex factors, such as the level of technology and equipment and industry policies. Therefore, new explanatory variables need to be added in future work to further improve the model and build a more scientific and practical forecasting model.

Author Contributions

Conceptualization, M.L.; methodology, Y.F. and Y.H.; formal analysis, Y.H.; investigation, Y.H.; resources, Z.Q. and D.H.; data curation, Y.F. and Y.H.; writing—original draft preparation, Y.H.; writing—review and editing, Y.H. and Y.F.; visualization, Y.H. and L.C.; supervision, M.L.; project administration, D.W.; funding acquisition, D.H. and M.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Teaching and Research Project of Hubei Provincial Department of Education (grant no. 2018284).

Data Availability Statement

All necessary data are provided in the article.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Giang, D.T.; Pheng, L.S. Role of construction in economic development: Review of key concepts in the past 40 years. Habitat Int. 2011, 35, 118–125.
  2. Anaman, K.A.; Osei-Amponsah, C. Analysis of the causality links between the growth of the construction industry and the growth of the macro-economy in Ghana. Constr. Manag. Econ. 2007, 25, 951–961.
  3. Yang, X.; Cheng, J. Economic Forecasting: Characteristics and Quantitative Methods. Syst. Sci. Math. 2019, 39, 1553–1582.
  4. Lu, X. The Growth Path and Affecting Factors of Construction Industry in China. Doctoral Dissertation, Xi’an University of Architecture and Technology, Xi’an, China, 2003.
  5. Yi, Z. The Research of the Development Prediction and the Space of Construction Industry’s Growth-Based on the Analysis of Chongqing’s Construction Industry. Master’s Thesis, Chongqing University, Chongqing, China, 2014.
  6. Tang, J.J.; Liang, W.Z.; Hu, S.H.; Zhao, T.S. Present Situation Analysis and Development Trend Forecast of Construction Industry. J. Civ. Eng. Manag. 2012, 121, 84–88.
  7. Zhang, L.; Li, H. Application of ARIMA model to forecast the total output value of construction industry in China. Enterp. Econ. 2011, 11, 93–96.
  8. Li, X. Study on the Comprehensive Evaluation and Prediction of Construction Industry in Anhui Province. Master’s Thesis, Anhui University of Construction, Hefei, China, 2017.
  9. Liu, L.; Wu, L. Predicting the output value of assembled buildings based on grey mean model. Math. Pract. Underst. 2019, 15, 104–111.
  10. Li, H. Research on the Evaluation of the Development of Construction Industry in China Based on Multi-Technology. Master’s Thesis, Northeast Forestry University, Harbin, China, 2012.
  11. Guang, H. Research on Forecasting Value Added in Construction Industry Based on GA Optimized Grey Neural Network Model. China Constr. Met. Struct. 2021, 5, 12–13.
  12. Yang, S. The Empirical Research on Factors Affecting the Development of the Construction Industry in Anhui. Master’s Thesis, Anhui University of Finance and Economics, Bengbu, China, 2017.
  13. Zhang, Y. Research on the Economic Status Evaluation and Comparison of the Technical Efficiency for Construction Industry in Guangdong Province. Doctoral Dissertation, South China University of Technology, Guangzhou, China, 2015.
  14. Dai, Y.A.; Chen, C. Technical Efficiency in China’s Construction Industry and Its Influencing Factors. China Soft Sci. 2010, 1, 87–95.
  15. Hua, R. Research on Evaluation and Influencing Factors of High-Quality Development of China’s Construction Industry. Master’s Thesis, Anhui University of Construction, Hefei, China, 2021.
  16. Wang, X.; Wang, Z. Study on the Temporal and Spatial Difference of Construction Efficiency in the Yangtze River Economic Belt. Constr. Econ. 2021, 42, 14–18.
  17. Jiang, J. Research on the Evaluation System of High-Quality Development of Provincial Construction Industry in China. Master’s Thesis, Guangzhou University, Guangzhou, China, 2021.
  18. Peng, X. Research on the High-Quality Development of the Construction Industry in Anhui Province under the Background of the Integration of the Yangtze River Delta. Master’s Thesis, Anhui University of Construction, Hefei, China, 2022.
  19. Wang, Y.; Wu, X. Research on High-Quality Development Evaluation, Space–Time Characteristics and Driving Factors of China’s Construction Industry under Carbon Emission Constraints. Sustainability 2022, 14, 10729.
  20. Vapnik, V.N. The Nature of Statistical Learning Theory; Springer: New York, NY, USA, 1995.
  21. Behzad, M.; Asghari, K.; Eazi, M.; Palhang, M. Generalization performance of support vector machines and neural networks in runoff modeling. Expert Syst. Appl. 2009, 36, 7624–7629.
  22. Yuan, Y.; Wang, C.; Zhou, A. Prediction Model for Stability Classification of Roadway Surrounding Rock Based on Grid Search Method and Support Vector Machine. Saf. Coal Mines 2017, 48, 200–203.
  23. Zhang, C.; Tian, Y.; Deng, N. The new interpretation of support vector machines on statistical learning theory. Sci. China Math. 2010, 53, 151–164.
  24. Wang, L.; Zhang, S.; Li, J. Time Series Prediction Based on Support Vector Regression. Inf. Technol. J. 2006, 5, 353–357.
  25. Zhang, T. An Introduction to Support Vector Machines and Other Kernel-Based Learning Methods. AI Mag. Artif. Intell. 2001, 22, 103.
  26. Raghavendra, N.S.; Deka, P.C. Support vector machine applications in the field of hydrology: A review. Appl. Soft Comput. 2014, 19, 372–386.
  27. Virmani, J.; Kumar, V.; Kalra, N.; Khandelwal, N. SVM-Based Characterization of Liver Ultrasound Images Using Wavelet Packet Texture Descriptors. J. Digit. Imaging 2012, 26, 530–543.
  28. Wu, M. Parameter Optimization Method Research and Application of RBF Neural Network and SVM. Master’s Thesis, Central South University, Changsha, China, 2007.
  29. Scholkopf, B.; Sung, K.-K.; Burges, C.; Girosi, F.; Niyogi, P.; Poggio, T.; Vapnik, V. Comparing support vector machines with Gaussian kernels to radial basis function classifiers. IEEE Trans. Signal Process. 1997, 45, 2758–2765.
  30. Wang, Q.; Zheng, J. Study of network parameter model base on support vector machine. Mod. Electron. Tech. 2015, 12, 23–24+28.
  31. Boolchandani, D.; Ahmed, A.; Sahula, V. Efficient kernel functions for support vector machine regression model for analog circuits’ performance evaluation. Analog. Integr. Circuits Signal Process. 2011, 66, 117–128.
  32. Gao, Y. Research on Factors Affecting the Economic Growth of the Construction Industry in Henan Province. Master’s Thesis, Zhengzhou University, Zhengzhou, China, 2014.
  33. Hu, W.; Kong, D.; He, X. Analysis on Influencing Factors of Green Building Development Based on BP-WINGS. Soft Sci. 2020, 34, 75–81.
  34. Cui, X. Empirical Analysis on Influence Factor of Economic Growth in Henan Construction Industry. Constr. Econ. 2012, 3, 99–101.
  35. Ding, Z.; Fan, Z.; Tam, V.W.; Bian, Y.; Li, S.; Illankoon, I.C.S.; Moon, S. Green building evaluation system implementation. Build. Environ. 2018, 133, 32–40.
  36. Chen, C.; Cao, X.; Zhang, S.; Lei, Z.; Zhao, K. Dynamic Characteristic and Decoupling Relationship of Energy Consumption on China’s Construction Industry. Buildings 2022, 12, 1745.
  37. Li, H.; Han, Z.; Zhang, J.; Philbin, S.P.; Liu, D.; Ke, Y. Systematic Identification of the Influencing Factors for the Digital Transformation of the Construction Industry Based on LDA-DEMATEL-ANP. Buildings 2022, 12, 1409.
  38. Tian, M.; Liu, S.; Bu, Z. Review of research on grey relational degree algorithm model. Stat. Decis. 2008, 1, 24–27.
  39. Yin, K.; Xu, T.; Li, X.; Cao, Y. A study of the grey relational model of interval numbers for panel data. Grey Syst. Theory Appl. 2020, 11, 200–211.
  40. Cui, J.; Yuan, W. Optimization of Support Vector Machine Parameters Based on Intelligent Algorithm. J. Hebei Norm. Univ. Sci. Technol. 2017, 1, 34–38.
  41. Gaspar, P.; Carbonell, J.; Oliveira, J.L. On the parameter optimization of Support Vector Machines for binary classification. J. Integr. Bioinform. 2012, 9, 33–43.
  42. Xu, C.; Cao, H.; Zhao, X. Speaker Recognition Parameter Selection Method Based on SVM. Comput. Eng. 2012, 38, 175–177.
  43. Xu, W.; Liu, C. A Regression Model for Forecasting Regional Annual Water-consumed Quantity Based on GSM and SVM. J. Shenyang Agric. Univ. 2011, 42, 238–240.
  44. Tahyudin, I.; Nambo, H.; Goto, Y. An Optimization of the Autoregressive Model Using the Grid Search Method. Int. J. Eng. Technol. 2018, 7, 84–86.
  45. Wang, X.; Li, Z. Identifying the Parameters of the Kernel Function in Support Vector Machines Based on the Grid-Search Method. Period. Ocean. Univ. China 2005, 35, 859–862.
  46. Wang, S.; Jin, Z. Intrusion detection classification algorithm based on fuzzy SVM model. Comput. Appl. Res. 2020, 2, 501–504.
  47. Wang, Y. Research on the Methods of Combining Forecasts Based on Correlativity. Forecasting 2002, 21, 58–62.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
