Quantity Prediction of Construction and Demolition Waste Using Weighted Combined Grey Theory and Autoregressive Integrated Moving Average Model

Abstract: With rapid urban development, the “waste-free city” concept has emerged, and the accurate prediction of the amount of construction and demolition (C&D) waste has become highly important. However, many countries and regions, including China, have not yet established C&D waste databases or standard prediction methods. This study proposed a method that uses a weighted combination of the grey theory model (GM) and the autoregressive integrated moving average (ARIMA) model to predict the future quantity of urban C&D waste. Based on a case study in Guangzhou, this study compared the prediction results of three models, namely the GM, the ARIMA, and the proposed weighted combined model of the two (GM-ARIMA). The results showed that the proposed GM-ARIMA model had a better predictive performance than either individual model. The mean absolute percentage errors (MAPE) of the GM and ARIMA models were 12.11% and 14.26%, respectively, whereas the proposed GM-ARIMA model had a lower MAPE (8.5%). This study found that the generation of C&D waste in Guangzhou will continue to grow steadily. From 2024 to 2035, the quantity of C&D waste is expected to reach 850 million tons cumulatively, with an annual growth rate of 7.1%.


Introduction
Large-scale urban construction activities have resulted in a sharp increase in waste generation, exacerbating environmental pollution, resource depletion, and urban landscape destruction. According to Xu et al. [1], construction and demolition (C&D) waste makes up 25-34% of the total urban solid waste in developed countries, with an even higher proportion in developing countries. In Southeast Asian countries, the ratio of C&D waste generation to the added value of the construction industry is very high, indicating that construction activities impose a waste burden that is large relative to their contribution to the regional economy. Moreover, C&D waste management in these countries is inadequate, and there is a lack of statistical data on the volume and composition of C&D waste [2]. Globally, more than 10 billion tons of C&D waste are generated each year, including approximately 2300 million tons from China, 800 million tons from the European Union, and 700 million tons from the United States [3]. The United States has a mature C&D waste management system; therefore, its C&D waste recycling rate is higher than that of China [3]. Statistical data on C&D waste are systematically collected and regularly published in well-developed regions such as Australia, Europe, the UK, and Hong Kong. Elsewhere, however, such data are not always available. This is particularly true in China, which has no officially published statistics or standards for waste calculations and predictions. This increases the difficulty of implementing integrated waste management measures [4].
In waste-free cities, promoting the recycling and utilization of C&D waste can effectively reduce environmental pollution and improve the resource-based development of the construction industry. A number of previous studies on C&D waste generation have been conducted, including those on the annual waste amount, waste composition, and waste flow [5]. In many developing areas, including China, no official report provides accurate statistics or estimates of C&D waste generation. The generation of C&D waste is non-linear and dynamic, and it is affected by many factors, such as the population density, the GDP, the urban area, the fixed asset investment of the whole society, the climate, and the local environment. It is also spatially and temporally heterogeneous, so there are several uncertainties when estimating and forecasting waste quantity [6].
Knowing the quantity of urban C&D waste is crucial for managing C&D waste supply chains. With accurate statistical data on waste quantity, it is possible to formulate scientific waste management strategies and construct supporting facilities [7]. In addition, Menegaki and Damigos [8] investigated 10 cities in China to identify the challenges in construction waste recycling management. They concluded that an unstable construction waste source and inaccurate waste quantity estimation were the major challenges.
Therefore, this study aimed to develop an approach to predict the quantity of urban C&D waste scientifically and accurately in a situation where solid historical statistics are absent. This study first established a model to estimate the historical C&D waste quantity using the building area method. Then, it proposed a prediction model for future urban C&D waste quantity by combining the grey theory prediction model (GM) and the autoregressive integrated moving average (ARIMA) model. After that, a case study was conducted to demonstrate the prediction procedure and verify the prediction results.
The novelty and contributions of this study are as follows:

1. The proposed weighting method of the GM-ARIMA combined model was based on the principle of the mean absolute percentage error (MAPE). This weighted prediction method was, for the first time, applied in the field of urban construction waste management, which added value to the current study.

2. This study not only provided a new perspective in the research field, but is also of great significance in practice. The proposed method can be adopted anywhere, provided that local historical statistics are applied. In other words, this prediction method could be used globally, providing a practical solution for cities lacking statistical data on urban C&D waste.

3. The results of this study not only provided data support for construction waste management, but also built a solid basis for planning urban construction waste disposal facilities, which benefits the long-term sustainable development of cities.

Literature Review
Determining the accurate quantity of waste generation is essential for effective urban waste management. Because C&D waste generation forms a time series, the quantity of waste generated in the future is closely related to the amount generated in the past. Therefore, current research focuses on two issues, namely the estimation of historical data and the prediction of future data.

Methods of Estimating Historical C&D Waste Quantity
Before predicting the future amount of C&D waste, researchers often need to estimate the waste quantity over the last few years. There are several methods for estimating the historical quantity of C&D waste, including the building area estimation method [9], the building information model estimation method [10], the building life cycle analysis method [11], and material flow analysis [12].
The building area estimation method uses relevant historical data, such as the construction area, decoration area, and floor area, which can be found in national and provincial statistical yearbooks, to estimate the historical C&D waste quantity. This method considers the various lifecycle stages in which C&D waste arises, including construction, decoration, and demolition. Among the abovementioned estimation methods, the building area estimation method is relatively comprehensive and accurate, and it has become the mainstream method in the literature for estimating the historical production of C&D waste. Moreover, it has been shown that this method is appropriate for estimating the amount of historical C&D waste in developing countries where the economy grows rapidly [11].

Methods of Predicting Future C&D Waste Quantity
In recent years, researchers have conducted in-depth studies to predict the C&D waste quantity based on historical data. The short- and medium-term prediction methods include multiple linear regression models and time series prediction models. In terms of long-term prediction, machine learning, an emerging technology, is increasingly used to predict C&D waste quantity. Each method has advantages and disadvantages regarding the prediction process and accuracy.

Short-Term and Medium-Term Prediction Methods
The multiple linear regression model is a statistical model for studying the causal relationships between different variables. Using relevant variables, such as the urban population, the annual construction area, and the GDP, a regression equation can be established against the C&D waste amount over the years, from which the future waste quantity is predicted. However, this method involves numerous calculations and is prone to multicollinearity, which distorts the prediction results [13].
The time series prediction method is a generic term for a group of prediction methods based on regression prediction models. These methods describe past time series data, analyze the underlying rules, and predict the future; that is, they study the changing trend of the forecast target by processing its time series [14]. They include the exponential smoothing method, the weighted moving average method, the grey theory model, and the ARIMA method. The exponential smoothing method forecasts the dynamic trends of the data series based on historical data and an exponential weighting model. Qiao et al. [15] estimated the quantity of C&D waste based on floor area, and then used the quadratic exponential smoothing method to predict the trend of C&D waste quantity in Shandong Province over the next few years. Hu et al. [16] used a weighted combination model that combined the grey theory and the exponential smoothing method to predict the annual production of C&D waste over the next ten years. The above studies showed that the exponential smoothing method could accurately forecast the amount of C&D waste generated. However, its predictive accuracy was poor when the amount of sample data was small.
The GM is a method for predicting future trends using existing time series data. In the grey theory, the information in the prediction system is only partially known, and there is an uncertain relationship between the parameters in the system. An advantage of the GM is that its prediction accuracy does not depend on the data being normally distributed [17]. Several factors affect the amount of C&D waste generated; some of these factors are known, and some are unknown. Therefore, the GM is well suited to applications in this field. Currently, the GM(1,1) model, with an adequate predictive accuracy and a relatively simple calculation, is typically adopted to predict the generation of C&D waste [18]. Zhao et al. [19] applied the GM(1,1) model to predict the quantity of building decoration waste in Shenzhen, China. Wang et al. [20] proposed a model based on the GM(1,1) model to predict C&D waste generation in Shenyang, China, from 2005 to 2014. This research showed that the predictive performance of the GM(1,1) model was adequate [20]. However, the GM has several drawbacks. It cannot achieve the best approximation value, and the prediction error gradually increases in medium- and long-term predictions. In practice, it is necessary to constantly revise the model and supplement the data to obtain more accurate predictions [21].
The basic idea of the ARIMA model is to use historical data to predict future trends. The model comprises three parts, namely the autoregressive model (AR), the integration (differencing) process (I), and the moving average model (MA). Zhang et al. [14] adopted the ARIMA model to predict building demolition and completion areas in the Chaoyang District of Beijing over the next eight years. Hu et al. [9] coupled generation rate calculations with ARIMA models to predict the C&D waste quantity using a limited amount of historical data. This prediction method provides quantitative support for C&D waste management at the regional level. Ceylan et al. [22] compared the prediction performance of the GM, linear regression, support vector regression (SVR), and ARIMA models in a medical waste generation case study. Compared with the other time series forecasting methods, the ARIMA model requires more stable training data, but it offers clear advantages in processing data with complex time series changes [23].
Chhay et al. [24] compared three methods, namely the GM, the linear regression model, and the artificial neural network (ANN) model, to predict short-term municipal solid waste production. The results showed that the ANN model provided the most accurate prediction results among the three methods. However, the research also identified some defects of the ANN model, such as the overfitting of data and a poor generalization performance, which restricted the broad application of the ANN model [24].

Long-Term Prediction Methods
Several studies have used machine learning methods to build mathematical models for long-term prediction. To solve the problem of longer prediction periods resulting in a lower predictive accuracy, Sun et al. [25] proposed a time series prediction method based on a three-layer long short-term memory (LSTM) network using univariate time series data with a finite number of data points. Liu et al. [26] established a neural network model based on grey correlation analysis and an LSTM model. Song et al. [27] proposed an SVR model to refine the residual sequence of the GM and improve its prediction performance. Based on the proposed GM-SVR method, the future production, potential composition, and regional spatial distribution of construction waste were predicted. Guo et al. [28] built a GA-BP neural network model to forecast C&D waste amounts in the provinces of China. Their research showed that the optimized GA-BP neural network model had a better prediction performance than the BP neural network [28]. In summary, machine learning prediction methods are suitable for predicting data with complex internal features. However, their disadvantages include the requirement for a large amount of training data, the need to select suitable features properly, a network hierarchy that is difficult to build, and complex calculations.
Table 1 summarizes the features of the reviewed waste prediction methods. The abovementioned prediction models have different modeling mechanisms and data requirements, and each has its own strengths and limitations. Therefore, many researchers have proposed combining different prediction models to improve the overall performance [29]. Thus far, research on combined prediction models can generally be classified into two categories: constant-weighted and time-varying weighted combined prediction models.
In a constant-weighted combined prediction model, each participating model has a weight coefficient that is assumed to be constant. Zheng et al. [30] predicted the demand for lithium carbonate by adopting a constant-weighted combination of the ARIMA, GM(1,1), and BP neural network models. The research result indicated that the ARIMA and GM(1,1) models had the advantages and features of linear prediction. In contrast, the BP neural network model had the advantages and features of non-linear prediction. The combination of the above models was shown to improve the overall prediction accuracy. Yuan et al. [31] compared the predictive performances of the GM(1,1) model, the ARIMA model, and their combination model. Their research showed that the fitting value of the ARIMA model was affected by the long-term trend and less sensitive to fluctuations in the data series [31]. On the other hand, the corresponding fitting values of the GM(1,1) model were sensitive to fluctuations [31]. In short, the fitness and predictive accuracy of the GM-ARIMA combined prediction model were the best among the three models [31]. Li and Li [32] also demonstrated that the predictive accuracy of the GM-ARIMA model was better than that of each individual model. The GM-ARIMA combined model has thus been shown to be capable of prediction with a limited amount of data. In addition, the GM-ARIMA combined model uses differential equations to determine the parameters, which provides a better accuracy and makes it more practical for nonstationary sequences.
The other group of combined prediction models is the time-varying weighted combined models, in which the weight coefficients of the participating models change during the prediction process. Sun and Zhang [33] proposed a combined prediction model based on the multiple linear regression model, the GM(1,1) model, and the triple exponential smoothing model. The coefficients of each participating model were determined based on their prediction validity, which varied over time. They used this combined model to predict the output quantity of domestic waste in Harbin, China, and their results showed that the combination significantly mitigated the limitations of the single prediction models. The time-varying weighted combined method is more suitable for solving practical problems because the weight coefficient varies with time. The research focus of this type of combined model is to determine the weight coefficient of each participating model. However, because of the difficulty in determining the time-varying weight coefficients and the complexity of the calculation, there have been very few previous studies on this topic [34].

Summary of Literature Review
In previous studies, the building area estimation method has been deemed accurate and effective for estimating the historical C&D waste quantity. This estimation method offers accurate results and reduces the dependence on the volume of raw data, which provides a solid basis for predicting future C&D waste. Therefore, this study adopted the building area estimation method to estimate the historical C&D waste quantity.
Based on the reviewed studies, the most frequently used prediction models are the GM(1,1) and ARIMA models. As stated before, the GM(1,1) model has the advantage of requiring only a small sample of historical data, or even incomplete data, while providing a high prediction accuracy. However, it is only applicable to short-term forecasts and is not suitable for volatile data series. On the other hand, the ARIMA model has the merits of a simple modeling structure and a high prediction accuracy. It is good at dealing with data series with seasonal or cyclical changes. However, the ARIMA model performs less accurately when processing non-linear data series, and it is sensitive to outliers and to incomplete data series. The GM-ARIMA combined model makes full use of the advantages of the two individual models when processing data with different characteristics, while keeping modeling simple and fast. It can comprehensively consider both the irregularity and seasonal changes in data and capture the trend of the data accurately. The combined prediction model has strong adaptability in many situations, especially when limited data are available. In the case of missing or incomplete data, the combined prediction model can automatically fill in the missing data by adopting the grey theory. In addition, it can predict both long-term and short-term trends of data series, and it is able to capture the non-linear trend of data series. In the current research field, the weighted combination of the GM(1,1) and ARIMA models has not yet been applied to predict urban C&D waste generation.
Given that the combined prediction model can give full play to the strengths of all the participating models and, therefore, further reduce the prediction errors, this study aimed to establish a GM-ARIMA combined model to predict future regional C&D waste quantities.

Model Development
Historical statistics of C&D waste provide an essential basis for predicting future waste quantities. Owing to the unavailability of historical statistics, an estimation formula for historical waste quantity must first be built. Based on the literature reviewed above, this study adopted the building area method to estimate the past C&D waste quantities. With limited data available, this study chose the GM(1,1) and the ARIMA prediction models. A weighted combined GM-ARIMA prediction model was established by assigning appropriate weighting coefficients to the GM and the ARIMA prediction models. The weighting coefficients were determined based on the prediction accuracy of the two models; that is, the smaller the prediction error of a model, the greater its weighting coefficient.
This study adopted a case study to demonstrate the prediction procedure of the proposed GM-ARIMA model. Guangzhou was chosen as the case study city, and the quantity of urban C&D waste was predicted using the proposed models. This case study first estimated the historical urban C&D waste quantity in Guangzhou using the building area method. Relevant statistical data on building areas were obtained from government-published reports to estimate the quantity of C&D waste in Guangzhou during 2011-2022. Then, the estimated C&D waste quantity for 2011-2022 was used to build the GM(1,1) and the ARIMA prediction models. Subsequently, the MAPE of both models was calculated, and the weighting coefficients of the GM(1,1) and the ARIMA models were determined according to the rule that the smaller the prediction error of a model, the greater its weighting coefficient. The quantity of C&D waste in Guangzhou from 2024 to 2035 was predicted using the GM(1,1), the ARIMA, and the combined GM-ARIMA models. Finally, to verify the accuracy and precision of the proposed GM-ARIMA model, the prediction results were compared with those of the GM(1,1) and the ARIMA models.

Estimation of Historical C&D Waste Quantity Based on the Building Area
Before building the estimation model, it is necessary to define the scope of C&D waste in this study. Wu et al. [35] divided building-related waste into three categories: waste generated during the construction of a building's main structure, during decoration, and during demolition. In line with the definition of Wu et al. [35], the C&D waste in this study covered the construction waste, decoration waste, and demolition waste of buildings. These three types of C&D waste covered a wide range of materials, including bricks, tiles, concrete, metals, plastics, wood, and soil residue. This study did not consider waste generated from municipal works, such as roads, pipelines, trench works, urban landscaping, and earthworks, which is composed mainly of soil residue and slurry. The C&D waste quantity was estimated using the following formula:

$$Q = S_c\, i_c + S_d\, i_d + S_r\, i_r \tag{1}$$

where $Q$ is the total quantity of C&D waste generated annually in tons; $S_c$ is the area of newly constructed building main structures annually in m$^2$; $i_c$ is the waste rate of newly constructed areas in t/m$^2$; $S_d$ is the area of demolished buildings annually in m$^2$; $i_d$ is the waste rate of demolished building areas in t/m$^2$; $S_r$ is the area of decorated buildings annually in m$^2$; and $i_r$ is the waste rate of decorated building areas in t/m$^2$.
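As a minimal sketch of the building area method described above, the following Python snippet sums the three area-times-rate terms. The function name `estimate_cd_waste` and all numeric inputs are illustrative assumptions, not data from the study.

```python
def estimate_cd_waste(s_c, s_d, s_r, i_c, i_d, i_r):
    """Annual C&D waste in tons: construction + demolition + decoration.

    s_* are areas in m^2 and i_* are waste rates in t/m^2.
    """
    return s_c * i_c + s_d * i_d + s_r * i_r


# Hypothetical inputs for one year (illustrative only, not Guangzhou data).
q = estimate_cd_waste(
    s_c=1_000_000, s_d=200_000, s_r=500_000,  # areas in m^2
    i_c=0.03, i_d=1.35, i_r=0.15,             # waste rates in t/m^2
)
print(f"Estimated annual C&D waste: {q:,.0f} t")
```

In this hypothetical example, demolition contributes the largest term even though it has the smallest area, because its waste rate per square meter is far higher than the other two.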

GM(1,1) Prediction Model of C&D Waste
The grey theory model is typically expressed as GM(n,m), with n representing the order of the equation and m representing the number of variables. The GM(1,1), i.e., the first-order, single-variable differential equation model, is most often used. The GM(1,1) model starts from a random discrete original time series. The original time series data are then accumulated to obtain a more regular time series, and differential equations are established accordingly. Referring to the method proposed by Ceylan et al. [22], the GM(1,1) model was established in seven steps, at the end of which the grey predicted value of the original sequence is obtained as follows:

$$\hat{x}^{(1)}(k+1) = \left(x^{(0)}(1) - \frac{b}{a}\right) e^{-ak} + \frac{b}{a}$$

$$\hat{x}^{(0)}(k+1) = \hat{x}^{(1)}(k+1) - \hat{x}^{(1)}(k)$$

where $x^{(0)}(1)$ is the estimated quantity of C&D waste generated in the first year; $\hat{x}^{(1)}(k)$ is the cumulative sum of the estimated quantities of C&D waste up to year $k$; $\hat{x}^{(1)}(k+1)$ is the cumulative sum of the predicted quantities of C&D waste up to year $k+1$; $\hat{x}^{(0)}(k+1)$ is the predicted quantity of C&D waste in year $k+1$; $a$ is the development coefficient, controlling the development trend of the system; and $b$ is the grey action quantity (grey input), reflecting changes in the data. Both $a$ and $b$ are obtained by least-squares fitting.
Finally, it is necessary to check the accuracy of the grey model. In the posterior variance test, two criteria, namely the posterior error rate ($C$) and the small error probability ($p$), are used to test the accuracy of the GM(1,1) model. They are defined as follows:

$$C = \frac{s_2}{s_1}, \qquad p = P\left\{\left|\varepsilon(k) - \bar{\varepsilon}\right| < 0.6745\, s_1\right\}$$

where $s_1^2 = \frac{1}{n}\sum_{k=1}^{n}\left(x^{(0)}(k) - \bar{x}\right)^2$ is the variance of the actual data series, and $\bar{x} = \frac{1}{n}\sum_{k=1}^{n} x^{(0)}(k)$ is its mean; $s_2^2 = \frac{1}{n}\sum_{k=1}^{n}\left(\varepsilon(k) - \bar{\varepsilon}\right)^2$ is the variance of the residual sequence; $\varepsilon(k) = x^{(0)}(k) - \hat{x}^{(0)}(k)$ is the residual at moment $k$; $\bar{\varepsilon} = \frac{1}{n}\sum_{k=1}^{n}\varepsilon(k)$ is the residual mean; and $P\{\cdot\}$ denotes the probability that the stated condition holds.
According to the values of the posterior error rate (C) and small error probability (p), the tested model can be classified into different accuracy levels, as shown in Table 2.
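The GM(1,1) fitting and the posterior variance test can be sketched in plain Python as follows. This is a minimal illustration of the textbook GM(1,1) procedure (accumulation, mean-generated background values, least squares, time response), not the authors' implementation. The function names, the input series, and the 0.6745 threshold of the standard posterior variance test are assumptions made for illustration.

```python
import math


def gm11_fit(x0):
    """Estimate the GM(1,1) parameters (a, b) by least squares."""
    # Accumulated series x1(k) = x0(1) + ... + x0(k).
    x1, total = [], 0.0
    for v in x0:
        total += v
        x1.append(total)
    # Background values z(k) = 0.5 * (x1(k) + x1(k-1)).
    z = [0.5 * (x1[k] + x1[k - 1]) for k in range(1, len(x0))]
    y = x0[1:]
    # Ordinary least squares for the grey equation x0(k) = -a * z(k) + b.
    m = len(z)
    sz, sy = sum(z), sum(y)
    szz = sum(v * v for v in z)
    szy = sum(zi * yi for zi, yi in zip(z, y))
    slope = (m * szy - sz * sy) / (m * szz - sz * sz)
    b = (sy - slope * sz) / m
    return -slope, b


def gm11_predict(x0, a, b, steps=0):
    """Return fitted values for x0 followed by `steps` forecasts."""
    n = len(x0) + steps
    # Time response of the accumulated series.
    x1_hat = [(x0[0] - b / a) * math.exp(-a * k) + b / a for k in range(n)]
    # Inverse accumulation restores the original scale.
    return [x0[0]] + [x1_hat[k] - x1_hat[k - 1] for k in range(1, n)]


def posterior_check(x0, fitted):
    """Posterior error rate C and small error probability p."""
    n = len(x0)
    eps = [x - f for x, f in zip(x0, fitted)]
    x_bar, e_bar = sum(x0) / n, sum(eps) / n
    s1 = math.sqrt(sum((x - x_bar) ** 2 for x in x0) / n)
    s2 = math.sqrt(sum((e - e_bar) ** 2 for e in eps) / n)
    p = sum(abs(e - e_bar) < 0.6745 * s1 for e in eps) / n
    return s2 / s1, p


# Hypothetical, roughly exponential series (not data from the study).
x0 = [10.0, 11.0, 12.1, 13.31, 14.641]
a, b = gm11_fit(x0)
pred = gm11_predict(x0, a, b, steps=2)
c, p = posterior_check(x0, pred[:len(x0)])
print(f"a={a:.4f}, b={b:.4f}, C={c:.4f}, p={p:.2f}")
print("forecasts:", [round(v, 3) for v in pred[len(x0):]])
```

A negative development coefficient `a` corresponds to a growing series. The resulting `C` and `p` would then be compared against the accuracy levels in Table 2.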

ARIMA Prediction Model of C&D Waste
The ARIMA prediction model is typically expressed as ARIMA(p,d,q), where p is the autoregressive order, d is the differencing order, and q is the moving average order. The ARIMA model uses historical time series data to predict future trends. The ARIMA model development procedure was as follows.

• Step 1. Stationarity test of raw data. In this study, the Augmented Dickey-Fuller (ADF) test was used to test the stationarity of the raw data. If the time series is not stationary, it must be transformed into a stationary time series by differencing. Usually, nonstationary time series data become stationary after differencing of one or more orders. This step determined the differencing order of the ARIMA(p,d,q) model, that is, the value of d.

• Step 2. Model identification and order determination. In this step, the autocorrelation function (ACF) and partial autocorrelation function (PACF) were plotted to analyze the lag order of the ARIMA(p,d,q) model and thereby determine the values of p and q. The best lag order was determined according to the Akaike Information Criterion (AIC). A preliminarily fitted ARIMA model was obtained by repeated tests and comparisons.

• Step 3. Parameter estimation and fitting test. The residual series between the original sequence and the fitted data was tested. If the residual series met the requirements of a white noise series, the fitted model was considered effective; if it failed the white noise test, Step 2 was repeated and the model was modified until it passed the fitting test.
The above ARIMA modeling process is illustrated in Figure 1.
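Two building blocks of the steps above, differencing (Step 1) and the sample autocorrelation function (Step 2), can be sketched in plain Python as follows. This is only an illustration under stated assumptions. The helper names `difference` and `acf` and the input series are hypothetical. The ADF test, the PACF, and AIC-based model fitting would in practice come from a statistics package and are not reproduced here.

```python
def difference(series, d=1):
    """Apply d-th order differencing to a sequence."""
    out = list(series)
    for _ in range(d):
        out = [out[i] - out[i - 1] for i in range(1, len(out))]
    return out


def acf(series, max_lag):
    """Sample autocorrelation for lags 0..max_lag."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((v - mean) ** 2 for v in series)
    return [
        sum((series[t] - mean) * (series[t - k] - mean) for t in range(k, n)) / denom
        for k in range(max_lag + 1)
    ]


# Hypothetical trending series (not data from the study).
raw = [12.0, 13.5, 16.0, 17.2, 20.1, 22.4, 26.0, 29.3]
d1 = difference(raw, d=1)           # first-order differenced series
print([round(v, 2) for v in d1])
print([round(r, 3) for r in acf(d1, max_lag=3)])
```

A slowly decaying ACF on the raw series suggests nonstationarity, while the ACF of a suitably differenced series should drop off quickly. The d value that achieves this is the one carried into ARIMA(p,d,q).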

GM-ARIMA Prediction Model of C&D Waste
There are several methods to determine the weight coefficients of the participating models, such as the MAPE method, equal weighting method, optimal weighting method, inverse variance method, and error sum of squares method. Among these, MAPE is one of the most commonly used indicators for evaluating the prediction performance [23]. In this study, by adopting the MAPE weighting method, the weight coefficient of the GM model ($i_G$) and the weight coefficient of the ARIMA model ($i_A$) were calculated from the MAPE of the GM(1,1) and the ARIMA models, so that the model with the smaller error receives the larger weight:

$$\varepsilon_G = \frac{1}{n}\sum_{t=1}^{n}\left|\frac{y_t - G_t^{(0)}}{y_t}\right| \times 100\%, \qquad \varepsilon_A = \frac{1}{n}\sum_{t=1}^{n}\left|\frac{y_t - A_t^{(0)}}{y_t}\right| \times 100\%$$

$$i_G = \frac{\varepsilon_A}{\varepsilon_G + \varepsilon_A}, \qquad i_A = \frac{\varepsilon_G}{\varepsilon_G + \varepsilon_A}$$

where $\varepsilon_G$ and $\varepsilon_A$ are the MAPE of the GM(1,1) and the ARIMA models, respectively; $y_t$ is the historical C&D waste quantity in year $t$; $G_t^{(0)}$ is the fitting value of the historical C&D waste quantity in year $t$ in the GM(1,1) model; $G_t^{(1)}$ is the predicted C&D waste quantity in year $t$ in the GM(1,1) model; $A_t^{(0)}$ is the fitting value of the historical C&D waste quantity in year $t$ in the ARIMA model; $A_t^{(1)}$ is the predicted C&D waste quantity in year $t$ in the ARIMA model; and $i_G$ and $i_A$ are the weighting coefficients of the GM(1,1) and the ARIMA models, respectively.
Then, the GM-ARIMA combined prediction model is expressed as follows:

$$F_t = i_G\, G_t^{(1)} + i_A\, A_t^{(1)}$$

where $F_t$ is the predicted value of the C&D waste quantity in year $t$ in the weighted combined model.
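One reading of the MAPE weighting rule (the smaller a model's error, the larger its weight) is to give each model a weight proportional to the other model's MAPE, which for two models is the same as normalizing the inverse errors. The sketch below follows that assumption. The function names are hypothetical, the MAPE inputs are the values reported in the abstract, and the forecast values are invented for illustration.

```python
def mape_weights(mape_gm, mape_arima):
    """Return (i_G, i_A): each weight is proportional to the other model's MAPE."""
    total = mape_gm + mape_arima
    return mape_arima / total, mape_gm / total


def combine(gm_pred, arima_pred, i_g, i_a):
    """Weighted combination F_t = i_G * G_t + i_A * A_t for each year t."""
    return [i_g * g + i_a * a for g, a in zip(gm_pred, arima_pred)]


# MAPE values reported in the abstract (GM: 12.11%, ARIMA: 14.26%).
i_g, i_a = mape_weights(12.11, 14.26)
print(round(i_g, 4), round(i_a, 4))   # 0.5408 0.4592

# Hypothetical single-model forecasts in million tons (not study results).
gm_forecast = [50.0, 54.0, 58.5]
arima_forecast = [48.0, 51.0, 55.0]
print([round(v, 2) for v in combine(gm_forecast, arima_forecast, i_g, i_a)])
```

Because the weights sum to one, each combined forecast always lies between the two single-model forecasts for the same year.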

Validation of the Proposed Prediction Models
After the prediction models were built, their prediction accuracy was validated using different statistical metrics. This study used the mean absolute error (MAE), root mean square error (RMSE), and MAPE to validate the prediction results of the GM(1,1) model, the ARIMA model, and the GM-ARIMA model:

$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|y_i - \hat{y}_i\right|$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}$$

$$\mathrm{MAPE} = \frac{1}{n}\sum_{i=1}^{n}\left|\frac{y_i - \hat{y}_i}{y_i}\right| \times 100\%$$

where $n$ is the sample size; $y_i$ is the historical C&D waste quantity data series; and $\hat{y}_i$ is the predicted C&D waste quantity data series of the proposed model.
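The three validation metrics can be computed with a few lines of plain Python. The sketch below uses the standard definitions of MAE, RMSE, and MAPE. The function names and the sample values are illustrative assumptions.

```python
import math


def mae(y, y_hat):
    """Mean absolute error."""
    return sum(abs(a - b) for a, b in zip(y, y_hat)) / len(y)


def rmse(y, y_hat):
    """Root mean square error."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y, y_hat)) / len(y))


def mape(y, y_hat):
    """Mean absolute percentage error, in percent (requires nonzero y)."""
    return 100.0 * sum(abs((a - b) / a) for a, b in zip(y, y_hat)) / len(y)


# Hypothetical observed and predicted values (not study data).
observed = [100.0, 200.0]
predicted = [110.0, 180.0]
print(mae(observed, predicted), rmse(observed, predicted), mape(observed, predicted))
```

MAE and RMSE are in the units of the data (tons), whereas MAPE is scale-free, which is why it is convenient both for comparing models and for deriving the combination weights.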

Case Study

Case Selection and Raw Data Acquisition
Guangzhou, a modern metropolis in southern China, has experienced rapid development and is increasingly beset by C&D waste. Solving its C&D waste treatment problems is urgent, so accurately predicting the amount of C&D waste in Guangzhou is of great significance. Guangzhou was therefore chosen as the case study city. In this case study, data from 2011 to 2022 were selected to build the prediction model, because since 2011 the economy of Guangzhou has developed rapidly and the generation of C&D waste has grown quickly, which is in line with the expected future growth trend.
As mentioned before, this study limited its scope to building C&D waste, such as bricks, tiles, concrete, metals, plastics, wood, and soil residue, which has a greater recycling potential. Therefore, in this case study, the C&D waste generated by municipal works in Guangzhou, such as roads and underground utility tunnels, which is composed mainly of soil residue and slurry, was not considered. The statistics covered the building projects of both the private and public sectors in Guangzhou. To estimate the historical C&D waste quantity, the areas of new construction, demolition, and decoration ($S_c$, $S_d$, and $S_r$) and the waste rates of new construction, demolition, and decoration ($i_c$, $i_d$, and $i_r$) had to be determined. The area of new construction ($S_c$) in Guangzhou from 2011 to 2022 was obtained from the Guangzhou Statistical Yearbook [36]. In this paper, the construction waste rate ($i_c$) was assumed to be 0.03 t/m$^2$. This value follows the target that, by the end of 2025, construction waste should be no more than 300 t per 10,000 m$^2$ of newly constructed buildings [37].
According to research by the Building Energy Efficiency Research Center of Tsinghua University, the proportion of demolished building areas to total completed building areas in urban areas was approximately 23%. Considering that several urban renewal projects, such as the renovation of old villages, have been launched in Guangzhou since 2016, this study assumed the proportion to be 20%. According to relevant research [38], the building demolition waste rate ($i_d$) of different building types ranged from 0.907 t/m$^2$ to 1.773 t/m$^2$ (details are shown in Table 3). At present, the building demolition projects in Guangzhou can be classified into the following types: brick-concrete structure buildings (60%), reinforced concrete structure buildings (25%), and masonry-wood structure buildings (15%). Considering the proportion of each type of demolished building, this study took the overall demolition waste rate ($i_d$) as 1.35 t/m$^2$. Regarding the building decoration area ($S_r$), this study considered the decoration area to be equal to the newly completed building area, which could be obtained from the Guangzhou Statistical Yearbook. As for the waste rate of building decoration ($i_r$), this paper grouped decoration waste into two types, namely residential and public building decoration waste, whose rates were assumed to be 0.15 t/m$^2$ and 0.1 t/m$^2$, respectively, according to the study of Chen et al. [23].

Estimation of Historical C&D Waste Quantity in Guangzhou
According to Formula (1), the annual quantity of urban C&D waste in Guangzhou from 2011 to 2022 was estimated, as detailed in Table 4 and plotted in Figure 2. Figure 2 shows that the annual quantity of C&D waste grew steadily over 2011-2022, with fluctuations in some years. The newly constructed and demolished areas increased sharply in 2012-2013 and 2017-2018, resulting in significant growth in C&D waste. This is because the national government increased real estate investment and thus sped up old city transformation during the "12th Five-Year Plan" and "13th Five-Year Plan" periods. The results in Table 4 show that the total estimated quantity of C&D waste in Guangzhou was 12.917 million tons in 2011 and 46.454 million tons in 2022, more than tripling over 12 years. The average annual growth rate of C&D waste quantity was 8.7% from 2011 to 2016 and 22.7% from 2017 to 2022, indicating a significant increase in the growth rate in the latter period. The results also indicated that demolition waste accounted for 59.7% of the urban C&D waste in Guangzhou, making it the primary source of C&D waste. Construction and decoration waste accounted for 24.4% and 15.9% of the total amount, respectively.

Prediction Models of Future C&D Waste Quantity in Guangzhou
In this section, the GM(1,1) model, the ARIMA model, and the GM-ARIMA combined model were built based on the historical data estimated in Section 4.2. Ensuring the feasibility, i.e., the predictive accuracy, of each single prediction model is a prerequisite for establishing the combined model. Therefore, the individual GM and ARIMA models were built first, and their prediction accuracy was checked.

Establishing the GM(1,1) Model for C&D Waste Quantity in Guangzhou
As shown in Table 5, the ratios of consecutive terms in the translated time series all fall within the interval (0.857, 1.166), indicating that the translated data series was smooth and appropriate for the grey prediction model. For 2022, for example, the original value of 4645.39 × 10⁴ t has a ratio of 0.904, and the translated value of 13,936.39 has a ratio of 0.968. The parameters were introduced into the equation to obtain the development coefficient a (−0.0276) and the grey action quantity b (10,162.622). In accuracy verification, the posterior error rate (C) of the GM(1,1) model was 0.102, which was less than 0.35, and the small-error probability (p) was 1, which was greater than 0.95. According to the classification of the prediction accuracy grades in Table 2, the proposed GM(1,1) model had a high prediction accuracy. The time prediction series of the GM(1,1) model is expressed as follows:

$$\hat{x}^{(1)}(k+1) = \left(x^{(0)}(1) - \frac{b}{a}\right)e^{-ak} + \frac{b}{a} = \left(x^{(0)}(1) + 368.2\times10^{3}\right)e^{0.0276k} - 368.2\times10^{3}$$
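The GM(1,1) steps described here (ratio test, accumulation, least-squares estimation of a and b, and the time response) can be sketched as follows. This is a minimal textbook-style illustration under stated assumptions, not the authors' code: the function names are hypothetical, the demo series is synthetic, and the translation of the raw data before fitting is omitted.

```python
import math


def level_ratios(x):
    """Ratios of consecutive terms, x(k-1) / x(k)."""
    return [x[k - 1] / x[k] for k in range(1, len(x))]


def admissible_interval(n):
    """Interval the ratios should fall in for a series of length n."""
    return math.exp(-2 / (n + 1)), math.exp(2 / (n + 1))


def fit_gm11(x0):
    """Least-squares estimate of (a, b) in x0(k) + a * z(k) = b."""
    x1, total = [], 0.0
    for v in x0:  # first-order accumulation (cumulative sums)
        total += v
        x1.append(total)
    # Background values: means of neighbouring accumulated terms.
    z = [0.5 * (x1[k] + x1[k - 1]) for k in range(1, len(x0))]
    y = x0[1:]
    m = len(z)
    sz, sy = sum(z), sum(y)
    szz = sum(v * v for v in z)
    szy = sum(u * v for u, v in zip(z, y))
    slope = (m * szy - sz * sy) / (m * szz - sz * sz)  # slope of y on z is -a
    a = -slope
    b = (sy - slope * sz) / m
    return a, b


def gm11_response(x0_first, a, b, k):
    """Accumulated prediction x1_hat(k + 1)."""
    return (x0_first - b / a) * math.exp(-a * k) + b / a


def gm11_forecast(x0_first, a, b, k):
    """Restored prediction x0_hat(k + 1) for k >= 1."""
    return gm11_response(x0_first, a, b, k) - gm11_response(x0_first, a, b, k - 1)


# Synthetic series growing 5% per step, for illustration only.
demo = [100 * 1.05 ** k for k in range(8)]
a_demo, b_demo = fit_gm11(demo)

# Interval for the 12 annual values (2011-2022) used in the case study.
lo12, hi12 = admissible_interval(12)

# Ratio b / a implied by the coefficients reported in the text.
ratio_reported = 10162.622 / -0.0276
```

For a series of 12 values, `admissible_interval(12)` reproduces the interval (0.857, 1.166) quoted for Table 5. A negative development coefficient, as reported here, corresponds to a growing series.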

Establishing the ARIMA(0,1,0) Model for C&D Waste Quantity in Guangzhou
As mentioned, the ARIMA model requires a stationary data series; hence, the data should first be checked for stationarity. A time series chart of the C&D waste quantity from 2011 to 2022 was therefore plotted (the "original" curve in Figure 3). Figure 3 shows that the original data series fluctuates considerably. The ADF test was used to further check the stationarity of the original data series (Table 6). As shown in Table 6, the p-value of the original data series was 0.998, indicating that the original data were not stationary and needed to be preprocessed.
A differencing operation was carried out to make the data series stationary, i.e., to remove the trend from the original data series. The "1st-order difference" curve in Figure 3 shows that the stationarity of the data series improved significantly after first-order differencing. As shown in Table 6, the data series after first-order differencing had a p-value of 0.074, which confirmed that stationarity had been greatly improved. Subsequently, second-order differencing was performed to see whether stationarity could be improved further. The "second-order difference" curve in Figure 3 shows that the data series after second-order differencing was no more stationary than that after first-order differencing. This indicated that first-order differencing was optimal. As a result, the differencing order of the ARIMA(p,d,q) model, i.e., the d value, was 1.
Next, the optimal lag order was determined based on the AIC. After trial and error, the optimal p and q values of the ARIMA(p,d,q) model were 0 and 0, respectively. The goodness-of-fit R² of the ARIMA(0,1,0) model was 0.917, which indicated that the model performed well. Figure 4 shows a trailing phenomenon, indicating that the ARIMA(0,1,0) model is suitable for application to the case study.
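An ARIMA(0,1,0) model is a random walk on the original series, so its mechanics can be shown without a statistics package. The sketch below is illustrative only: the function names and the toy series are hypothetical, and the drift (constant) term is an assumption, since the text does not state whether the fitted model includes one. The ADF test itself is not reproduced; in practice it would come from a library routine such as `adfuller` in statsmodels.

```python
def difference(series, order=1):
    """Apply differencing `order` times: y(t) - y(t-1)."""
    out = list(series)
    for _ in range(order):
        out = [later - earlier for earlier, later in zip(out, out[1:])]
    return out


def estimate_drift(series):
    """Mean of the first differences (assumed drift of the random walk)."""
    diffs = difference(series, order=1)
    return sum(diffs) / len(diffs)


def forecast_random_walk(series, steps):
    """ARIMA(0,1,0)-with-drift forecast: last value plus h times the drift."""
    drift = estimate_drift(series)
    last = series[-1]
    return [last + drift * h for h in range(1, steps + 1)]


# Toy series, for illustration only.
toy = [10, 12, 15, 19]
first_diff = difference(toy, 1)
second_diff = difference(toy, 2)
toy_forecast = forecast_random_walk(toy, 2)
```

Because the forecast adds a constant drift at every step, it follows the long-run average change of the series rather than its most recent movements.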
The values of C&D waste quantity in Guangzhou predicted by the GM(1,1) and ARIMA(0,1,0) models are listed in Table 7. As shown in Table 7, the MAPEs of the GM(1,1) and ARIMA(0,1,0) models were 12.11% and 14.26%, respectively, so the prediction accuracy of the GM(1,1) model was higher than that of the ARIMA(0,1,0) model. The prediction errors of both models were less than 15%, indicating that both single prediction models had good accuracy and met the prerequisites for establishing a combined model.
In this study, the MAPE was used to determine the weighting coefficients of the combined model. According to Formula (7), the weighting coefficients i_G and i_A were calculated as follows:

$$i_G = \frac{14.26\%}{14.26\% + 12.11\%} = 0.54 \tag{13}$$

$$i_A = \frac{12.11\%}{14.26\% + 12.11\%} = 0.46 \tag{14}$$

As a result, the combined prediction model was expressed as the weighted sum of the GM(1,1) and ARIMA(0,1,0) predictions, with weights of 0.54 and 0.46, respectively.
Table 8 lists the values of C&D waste quantity in Guangzhou predicted by the combined GM-ARIMA model. As shown in Table 8, the MAPE of the GM-ARIMA combined model was 8.5%, which is less than those of the single prediction models. The fitting results of the GM(1,1) model, the ARIMA(0,1,0) model, and the GM-ARIMA combined model are shown in Figure 5. The fitting curve of the GM-ARIMA combined model was closer to the historical values, which indicated that its overall fitting effect was better than that of the GM(1,1) and ARIMA(0,1,0) models. In summary, the combined GM-ARIMA model had an excellent fitting effect, high prediction accuracy, and better stability compared to the single prediction models.
The quantity of C&D waste in Guangzhou in 2024-2035 was predicted by the developed GM(1,1), ARIMA(0,1,0), and GM-ARIMA models, as shown in Table 9 and plotted in Figure 6.
As shown in Figure 6, the GM(1,1), ARIMA(0,1,0), and GM-ARIMA models all provided noise-free, non-linear, and monotonically increasing prediction results. The grey curve in Figure 6 represents the C&D waste quantity in 2024-2035 predicted by the ARIMA(0,1,0) model. The fitting value of the ARIMA(0,1,0) model was constrained by its long-term trend; therefore, it was less sensitive to fluctuations in recent samples and showed a gentle slope in Figure 6. On the other hand, the GM(1,1) model was more sensitive to fluctuations in the recent sample data; hence, it showed a steeper slope (orange curve in Figure 6). The GM-ARIMA combined model offered moderate prediction results (green curve in Figure 6). As indicated by the prediction results of the GM-ARIMA model, the C&D waste quantity in Guangzhou will continue to grow rapidly. The total annual C&D waste quantity in Guangzhou is expected to reach 93.176 million tons in 2035.
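The MAPE-based weighting and the weighted combination used for the GM-ARIMA model can be checked with a short sketch. The function names are hypothetical, and the two prediction lists in the last lines are invented placeholders rather than values from Tables 7 to 9. The MAPE inputs are the ones reported in the text.

```python
def inverse_error_weights(mape_gm, mape_arima):
    """Give each model a weight proportional to the other model's MAPE."""
    total = mape_gm + mape_arima
    w_gm = mape_arima / total      # larger ARIMA error gives GM more weight
    w_arima = mape_gm / total
    return w_gm, w_arima


def combine(pred_gm, pred_arima, w_gm, w_arima):
    """Weighted sum of the two single-model predictions, year by year."""
    return [w_gm * g + w_arima * a for g, a in zip(pred_gm, pred_arima)]


w_gm, w_arima = inverse_error_weights(12.11, 14.26)
print(round(w_gm, 2), round(w_arima, 2))  # 0.54 0.46

# Invented single-model predictions, for illustration only.
combined = combine([100.0, 110.0], [90.0, 95.0], w_gm, w_arima)
```

The two weights sum to one, so each combined value always lies between the two single-model predictions, which is why the combined curve sits between the GM(1,1) and ARIMA(0,1,0) curves.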
The prediction results indicate that in 2024-2035, the cumulative C&D waste quantity in Guangzhou will reach 850 million tons, with an average annual growth rate of 7.1%. Compared to the average annual growth rate in 2011-2022 (21.9%), the growth rate of C&D waste quantity will drop significantly in 2024-2035. This suggests that the relevant waste reduction strategies of the Guangzhou government are beginning to pay off. Taking the bulk volume of C&D waste as 2.5 m³ per ton, the cumulative volume of C&D waste generated in Guangzhou in 2024-2035 will reach 2.13 billion m³. If the average landfill depth is 5 m, a landfill disposal area of 425 million m² is required, which is approximately 5.9% of the total area of Guangzhou city. There is an urgent need to develop sustainable strategies for C&D waste treatment and recycling to address the shortage of landfill areas and illegal dumping.
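The landfill estimate above is plain arithmetic and can be reproduced directly. The variable names are illustrative; the three inputs are the values given in the text.

```python
cumulative_waste_t = 850e6      # cumulative C&D waste in 2024-2035 (t)
volume_per_ton_m3 = 2.5         # assumed bulk volume per ton (m3/t)
landfill_depth_m = 5.0          # assumed average landfill depth (m)

# Volume occupied by the waste, then the footprint at the given depth.
waste_volume_m3 = cumulative_waste_t * volume_per_ton_m3
landfill_area_m2 = waste_volume_m3 / landfill_depth_m

print(waste_volume_m3 / 1e9)    # 2.125 (billion m3)
print(landfill_area_m2 / 1e6)   # 425.0 (million m2)
```

The required area scales inversely with the assumed depth, so a deeper landfill or a lower bulk volume per ton would reduce the estimate proportionally.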

Result Validation
The prediction accuracy of the GM(1,1), ARIMA(0,1,0), and GM-ARIMA models was validated using different statistical metrics, namely the MAE, RMSE, and MAPE. The results are listed in Table 10. Among the three prediction models, the GM-ARIMA model had the smallest MAE, RMSE, and MAPE values, which were 1.937 × 10², 3.213 × 10², and 8.5%, respectively. This indicated that the GM-ARIMA model had better prediction accuracy than the GM(1,1) and ARIMA(0,1,0) models.
Although the proposed GM-ARIMA combined model had apparent advantages, such as good adaptability to different data characteristics, more accurate prediction results, and simple and fast modeling, it does have some limitations.
First, the prediction performance of the proposed GM-ARIMA combined model is considerably affected by the sample size of the historical data. The GM model performs well with a small amount of sample data and tolerates missing data in the series, while the ARIMA model requires sufficient sample data to capture the time series patterns. Therefore, neither too many nor too few historical samples should be adopted; otherwise, the fitting effect of the two single models, and hence the accuracy of the GM-ARIMA combined model, will be affected. Selecting a properly sized sample to build the prediction models is thus of great importance.
Second, the accuracy of the proposed GM-ARIMA model is affected by the historical C&D waste production, which, in this study, was estimated using the building area estimation method. In fact, historical C&D waste production is affected by a wide range of factors, such as construction methods and technologies, regional economic development, the urban renewal and transformation rate, and the popularization of green construction. In this study, the estimation of historical C&D waste production depended only on the urban building area, which could not precisely reflect the real changes in urban C&D waste production. This affected the MAPEs of the GM and ARIMA models in the fitting process; the weighting coefficients of the two models changed accordingly, which led to deviations in the prediction results of the GM-ARIMA combined model.
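The three metrics used in the result validation (MAE, RMSE, and MAPE) can be computed as in the following minimal sketch. It uses the standard definitions of the metrics; the function names and the sample values are illustrative assumptions and are not taken from Table 10.

```python
import math


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error."""
    return math.sqrt(
        sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    )


def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(
        abs(a - p) / abs(a) for a, p in zip(actual, predicted)
    ) / len(actual)


# Invented values, for illustration only.
actual = [100.0, 200.0]
predicted = [110.0, 180.0]
scores = (mae(actual, predicted), rmse(actual, predicted), mape(actual, predicted))
```

MAE and RMSE are expressed in the units of the data, whereas MAPE is scale-free, which is why MAPE is convenient for deriving the combination weights.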

Conclusions and Future Works
The rapid increase in C&D waste hinders sustainable urban development. To satisfy the requirements of detailed waste management, it is essential to accurately predict the quantity of urban C&D waste. Therefore, this study proposed a prediction method for urban C&D waste quantity based on a weighted combination of the GM and ARIMA models. The proposed model was shown to combine the advantages of the two separate prediction models.
This study first used the building area estimation method to estimate the historical C&D waste quantity from 2011 to 2022. Then, the GM(1,1) and ARIMA(0,1,0) prediction models were built based on the historical statistics. After that, the MAPEs of the predicted results of the two models were calculated: 12.11% for the GM(1,1) model and 14.26% for the ARIMA(0,1,0) model. Based on these MAPE values, the weight coefficients of the GM-ARIMA combined model were calculated to be 0.54 (i_G) and 0.46 (i_A). The GM-ARIMA combined model was then built, and its MAPE was 8.5%, which was less than those of the GM(1,1) and ARIMA(0,1,0) models. The following conclusions can be drawn from this study:
(1) Demolition waste was the primary source of urban C&D waste in Guangzhou, accounting for 59.7% of the total quantity. Therefore, when the government formulates waste reduction strategies, special attention should be paid to the recycling and treatment of construction waste in urban renewal and demolition projects.
(2) The C&D waste quantity in Guangzhou will increase steadily over the next twelve years (2024-2035), with an annual growth rate of 7.1%. The cumulative C&D waste quantity in Guangzhou in 2024-2035 will reach 850 million tons.
(3) The combined GM-ARIMA model demonstrated better predictive performance, higher prediction accuracy, and better stability than the individual GM(1,1) and ARIMA(0,1,0) models. The GM-ARIMA model combines the advantages of the two individual prediction models. It is suitable for predicting urban C&D waste quantity when the available historical data are limited, while achieving proper reliability.
The GM-ARIMA prediction model proposed in this study has significant potential for practical applications in waste management in the construction industry.For instance, by accurately predicting the amount of C&D waste to be recycled, waste treatment companies can adjust their storage and operational plans to reduce the costs and working time.Governments can formulate relevant strategies, laws, and regulations to achieve sustainability.
Last but not least, this study provides potential directions for future research. As mentioned, the scope of the case study was limited to building C&D waste in Guangzhou, and municipal work C&D waste was excluded. However, municipal work C&D waste is a non-negligible part of urban waste management. Therefore, future studies should be carried out to cover this waste. In addition, when establishing prediction models, future research should consider other factors affecting urban C&D waste generation, such as the development of construction technologies, the population, property prices, environmental policies, the level of economic development, and the structure of the construction industry of the case study cities, to achieve higher prediction accuracy and applicability for local use. Moreover, future research should expand the scope of the location to include more cities and regions to enhance the robustness of the proposed model and ensure its adaptability to different urban environments. Further research should also compare the prediction results of developing regions with those of developed regions to explore ways and formulate policies to narrow the gap between them. Finally, a database containing data on the quantity and composition of C&D waste should be developed to better monitor and predict future waste generation.

Figure 3. Optimal differential sequence of data.

Figure 6. Prediction results of C&D waste quantity in Guangzhou in 2024-2035 by different prediction models.

Table 1. Comparison of waste prediction methods.

Table 3. Demolition waste type and demolition waste rate of different building types (t/m²).

Table 4. Estimation of historical C&D waste quantity in Guangzhou in 2011-2022.

Table 5. Testing the ratios of consecutive terms in raw data.
Columns: original data of C&D waste quantity (10⁴ t); ratio of consecutive terms of original data; translated data value; ratio of consecutive terms after translation.

Table 6. ADF test of C&D waste quantity data.

Table 8. GM-ARIMA combined model for Guangzhou C&D waste quantity.

Table 9. Prediction results of C&D waste quantity in Guangzhou in 2024-2035 by different prediction models (10⁴ t).