Article

The Prediction of Tea Production Using Dynamic Rolling Update Grey Model: A Case Study of China

1 Centre for Mathematical Sciences, Universiti Tunku Abdul Rahman, Bandar Sungai Long, Kajang 43000, Selangor, Malaysia
2 Faculty of Accountancy and Management, Universiti Tunku Abdul Rahman, Bandar Sungai Long, Kajang 43000, Selangor, Malaysia
* Authors to whom correspondence should be addressed.
Mathematics 2025, 13(19), 3056; https://doi.org/10.3390/math13193056
Submission received: 21 August 2025 / Revised: 14 September 2025 / Accepted: 18 September 2025 / Published: 23 September 2025

Abstract

China is one of the world’s largest tea-producing countries, and its fluctuations in production affect the international market and domestic economic stability. Existing research often uses limited predictive models at the local scale and lacks systematic national analysis. This study evaluated five models—autoregressive integrated moving average model (ARIMA), grey model (GM (1,1)), Markov chain grey model (Markov-GM (1,1)), particle swarm optimization Markov chain grey model (PSO-Markov-GM), and dynamic rolling update grey model (DRUGM (1,1))—using three stages of annual tea production data from China (2004–2023). The results indicate that DRUGM (1,1) has the lowest prediction error, demonstrating superior ability to capture production trends. The dynamic update mechanism of this model enhances its adaptability, providing an efficient and scalable framework for predicting the production level of tea and other crops. Accurate predictions are crucial for improving agricultural planning, optimizing resource allocation, and providing information for trade policy design. This study provides practical tools for sustainable agricultural decision-making, helping to strengthen rural economic stability and resilient food systems.

1. Introduction

China’s history of tea production spans more than 4000 years. China is also an important promoter of global tea culture and industry development. As the world’s largest tea producer, China’s annual output has accounted for over 45% of the global total for several consecutive years (https://www.fao.org/faostat/en/#data (accessed on 24 January 2025)). Tea has both significant economic value and widespread appeal for its anti-inflammatory effects and health benefits, such as alleviating metabolic syndrome [1,2,3]. In China, the tea industry has become an important pillar for poverty alleviation and wealth creation in many regions, supporting the employment of over 70 million people (https://data.stats.gov.cn, accessed on 24 January 2025). The tea industry’s development directly affects the rural economy and social stability. However, research predicts that by 2050, the area suitable for tea planting will have significantly decreased [4], posing a threat to the livelihoods of tea farmers and exacerbating industry risks; this issue has attracted much attention from the relevant government departments.
Meanwhile, the annual rate of change in China’s tea production is as high as 6.8% (https://data.stats.gov.cn, accessed on 24 January 2025), a fluctuation far larger than that of grain crops. Without high-precision prediction mechanisms, this volatility may lead to an imbalance between market supply and demand, reduced incomes for tea farmers, and weaker export performance. These fluctuations also diminish the effectiveness of policy subsidies and contribute to instability in employment. Therefore, it is necessary to establish a scientific and dynamic tea production forecasting system. Such a system could provide a foundation for improved government decision-making and market regulation, allow agricultural resources and labor to be arranged in advance, protect farmers’ incomes, and maintain market stability. At the same time, accurate forecasting will help companies manage their supply chains more effectively and improve efficiency. In addition, stable production is important for international trade and industry competitiveness.
There are various methods and related research results for predicting crop yield both domestically and internationally. Nurman et al. [5] and Noorunnahar et al. [6] used ARIMA models to predict rice yields in multiple regions. Devi et al. [7] and Chergui [8] conducted research on predicting wheat yields. Zhang et al. [9] and Wu et al. [10] predicted the demand and production of grain.
Despite tea being an important economic crop, there has been relatively little research on its yield compared to other crops. Saha et al. [11] employed an exponential growth model to evaluate tea planting during different time periods. Mila et al. [12] predicted and analyzed tea production, consumption, and export in Bangladesh from 2019 to 2028 using ARIMA. Jui et al. [13] established a new DRS-RF model for predicting tea yield, which improved prediction accuracy compared to other independent machine learning methods. Ryan et al. [14] also demonstrated that RF outperformed other algorithms in predicting tea production. Table 1 provides some prediction methods for tea and crop yields.
The tea production models reviewed above are mostly time-series and machine learning models, which rely on large samples to achieve reliable results. However, because of recent structural changes in the economy, data from recent years are more reliable for predicting tea production. Since the sample size in this study is relatively small, models designed for large samples may not be suitable. Deng Ju-long proposed grey theory in the 1980s to address the problem of “small information and small samples” [22]. GM (1,1) has since been adopted in many fields, including manufacturing [23], transportation [24,27], health [25], and energy [26].
Some scholars have conducted further research on GM (1,1) based on different application scenarios and problem characteristics. For example, Jia et al. [28] introduced Markov-GM to reduce errors in coal consumption prediction. Markov-GM has also produced reliable results in other fields, such as electricity production [29], ground monitoring [30], traffic accidents [31], and the economy [32].
Although the above methods can effectively reduce errors, they still have some limitations. In Markov-GM, scholars have generally divided state intervals using empirical judgment rather than a unified standard. Some scholars use optimization algorithms to solve this issue. Xu et al. [33] proposed a Markov chain state interval partitioning hybrid model, optimized using the GWO algorithm, which outperforms other models that use experience-based partitioning in predicting coal consumption. Zheng et al. [34] optimized the non-linear components of NGBM (1,1) through the PSO algorithm, finding that it proved effective in improving prediction performance.
These methods are either based on large sample sizes or only consider static historical data for prediction. They ignore the important role of newly acquired data in developing model trends. Therefore, the impact of small sample sizes and dynamically updating newly acquired data on the prediction results should be taken into consideration. Several researchers have developed rolling-based grey forecasting models, utilizing new information to improve modeling accuracy. Zhou et al. [35] proposed GRPM (1,1) for predicting carbon dioxide emissions in China. This model delivers improved forecasting performance. With the aim of predicting the number of confirmed COVID-19 patients, Zhao et al. [36] established a rolling grey Verhulst model.
A summary of grey prediction methods and derived models is given in Table 2.
A systematic review of the existing research methods indicates that, despite substantial progress in agricultural yield forecasting, particularly for rice and wheat, which benefit from mature and data-rich prediction systems, research on tea yield prediction is still relatively underdeveloped. Given the increasing global importance of tea as an economic crop and cultural commodity, as well as its role in supporting rural livelihoods in many tea-producing areas, this gap is important. With the expansion of tea gardens in multiple countries and the continuous growth of global demand, improving the accuracy of tea production forecasting is crucial for enhancing market stability, optimizing resource allocation, and supporting climate-adaptive agricultural planning.
To bridge this gap, this research seeks to develop and evaluate an accurate predictive model for tea production in China. Using annual tea production data from the past twenty years in China, we conducted a systematic performance comparison of five prediction models: ARIMA, GM (1,1), Markov-GM (1,1), PSO-Markov-GM, and DRUGM (1,1). DRUGM (1,1) combines rolling data updates with a dynamic error correction mechanism. The impact of outdated information is reduced by iteratively adjusting deviations. The main innovations of this study are reflected in three aspects: (1) In response to the dependence of traditional yield prediction methods on large-scale historical data and static modeling frameworks, small sample learning and dynamic update mechanisms are introduced to effectively improve the adaptability and timeliness of the model under limited data conditions. (2) The systematic expansion of yield prediction research associated with tea has filled the gap in this field. (3) The DRUGM (1,1) prediction model was constructed, which integrates rolling data updates and dynamic error correction mechanisms, significantly improving the stability and accuracy of tea yield prediction. By identifying the most accurate and robust models, this study provides methodological advancements and practical tools to support sustainable agricultural decision-making, rural economic stability, and the achievement of global sustainable development goals.

2. Materials and Methods

2.1. Data

In this study, the annual tea production data of China from 2004 to 2023, provided by the National Bureau of Statistics (http://www.stats.gov.cn/english/ (accessed on 24 January 2025)), were analyzed. In addition to the full series (2004–2023), the data were segmented into two sub-periods: 2004–2013 and 2014–2023. Table 3 lists the summary statistics of these three datasets. For each dataset, 80% of the observations were used as the training set and the remainder as the testing set.

2.2. GM (1,1) Model

GM (1,1) only requires a small dataset [22]. It is a first-order differential equation model with a single variable. Its modeling process is given below:
Step 1: Define the original sequence, $X^{(0)}$, as
$$X^{(0)} = \left(x^{(0)}(1),\ x^{(0)}(2),\ \ldots,\ x^{(0)}(n)\right)$$
where $x^{(0)}(k) \geq 0$, $k = 1, 2, \ldots, n$.
Step 2: Apply the first-order accumulated generating operation (1-AGO) to the original sequence to obtain
$$X^{(1)} = \left(x^{(1)}(1),\ x^{(1)}(2),\ \ldots,\ x^{(1)}(n)\right)$$
$$x^{(1)}(k) = \sum_{i=1}^{k} x^{(0)}(i),\quad k = 1, 2, \ldots, n.$$
Step 3: Adopt the grey differential equation
$$\frac{dx^{(1)}}{dt} + a x^{(1)} = b \tag{4}$$
where $a$ is the development coefficient and $b$ is the grey action quantity.
Discretize the differential equation
$$x^{(0)}(k) + a z^{(1)}(k) = b,\quad k = 2, 3, \ldots, n$$
and calculate the background value sequence, $z^{(1)}(k)$, via
$$z^{(1)}(k) = 0.5\, x^{(1)}(k) + 0.5\, x^{(1)}(k-1),\quad k = 2, 3, \ldots, n.$$
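Step 4: Use the least squares method to solve for $a$ and $b$:
$$\hat{\theta} = (a, b)^{T} = (B^{T} B)^{-1} B^{T} Y$$
$$B = \begin{bmatrix} -z^{(1)}(2) & 1 \\ -z^{(1)}(3) & 1 \\ \vdots & \vdots \\ -z^{(1)}(n) & 1 \end{bmatrix},\quad Y = \begin{bmatrix} x^{(0)}(2) \\ x^{(0)}(3) \\ \vdots \\ x^{(0)}(n) \end{bmatrix}$$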
Step 5: The initial condition is $\hat{x}^{(0)}(1) = \hat{x}^{(1)}(1)$. After the coefficients $a$ and $b$ are obtained, the solution of Equation (4) can be expressed as
$$\hat{x}^{(1)}(k) = \left(x^{(1)}(1) - \frac{b}{a}\right) e^{-a(k-1)} + \frac{b}{a},\quad k = 1, 2, \ldots, n.$$
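Step 6: Applying the inverse accumulated generating operator gives the forecasting equation:
$$\hat{x}^{(0)}(k) = \begin{cases} x^{(0)}(1), & k = 1 \\ \hat{x}^{(1)}(k) - \hat{x}^{(1)}(k-1) = \left(1 - e^{a}\right)\left(x^{(0)}(1) - \dfrac{b}{a}\right) e^{-a(k-1)}, & k = 2, 3, \ldots, n \end{cases}$$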
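Steps 1 to 6 can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' code: the function names `gm11_fit` and `gm11_predict` and the demonstration series are hypothetical, and the series is made up (smooth 5% growth), not the tea data.

```python
import numpy as np

def gm11_fit(x0):
    """Estimate (a, b) of GM(1,1) by least squares (Steps 2 to 4)."""
    x0 = np.asarray(x0, dtype=float)
    x1 = np.cumsum(x0)                       # 1-AGO: x1(k) = sum of x0(1..k)
    z1 = 0.5 * x1[1:] + 0.5 * x1[:-1]        # background values z1(k), k = 2..n
    B = np.column_stack((-z1, np.ones_like(z1)))
    Y = x0[1:]
    a, b = np.linalg.lstsq(B, Y, rcond=None)[0]
    return a, b

def gm11_predict(x0, steps=0):
    """Fitted values for k = 1..n followed by `steps` forecasts (Steps 5 and 6)."""
    x0 = np.asarray(x0, dtype=float)
    a, b = gm11_fit(x0)                      # note: a == 0 (flat series) is not handled
    k = np.arange(1, len(x0) + steps + 1)
    x1_hat = (x0[0] - b / a) * np.exp(-a * (k - 1)) + b / a
    x0_hat = np.empty_like(x1_hat)
    x0_hat[0] = x0[0]                        # first fitted value equals the first observation
    x0_hat[1:] = np.diff(x1_hat)             # inverse AGO
    return x0_hat

# Hypothetical series with smooth 5% growth, used only for illustration.
series = 100.0 * 1.05 ** np.arange(8)
a, b = gm11_fit(series)
fitted = gm11_predict(series, steps=2)       # 8 fitted values + 2 forecasts
```

A negative development coefficient `a` corresponds to a growing series, which is what this toy example produces.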

2.3. Markov-GM (1,1) Model

A Markov chain is a stochastic dynamic process whose state transitions have the “no after-effect” (memoryless) property: the next state depends only on the current state. Markov chains improve prediction accuracy by correcting the errors of grey models. The correction process [38] is as follows:
Step 1: Calculate the error:
$$e_r(k) = x^{(0)}(k) - \hat{x}^{(0)}(k)$$
The range of the interval state is
$$M = \left[\min e_r(k),\ \max e_r(k)\right]$$
Then, the interval is divided into $r$ states, expressed as $S_i = [L_i, H_i]$, $i = 1, 2, \ldots, r$, with $S_i \subseteq M$. To avoid the subjectivity of experience-based division and ensure a robust transition probability matrix, the equal probability division method [28,39] is employed. This method involves first sorting the error sequence $e_r(k)$ in ascending order and then partitioning it into $r$ intervals, each containing approximately the same number of data points (i.e., each state has an equal probability of occurrence, approximately $1/r$). Here, $L_i$ and $H_i$ represent the lower and upper boundaries of the $i$-th state interval, determined by specific quantiles of the sorted error sequence.
Step 2: Establish a state transition probability matrix
Let $p(x_t = i)$ denote the probability that the sequence $x_t$ is in state $S_i$ at time $t$, and let $p_{ij}$ denote the probability that it moves from state $S_i$ to state $S_j$ in the next step. The probability of transitioning from state $i$ at time $t$ to state $j$ at time $t + 1$ is then
$$p_{ij}(t) = \frac{p(x_t = i,\ x_{t+1} = j)}{p(x_t = i)}$$
where $p_{ij}(t)$ is the conditional probability of a one-step transition at time $t$.
Then, the transition probability matrix is
$$P = \begin{bmatrix} p_{11} & p_{12} & \cdots & p_{1r} \\ p_{21} & p_{22} & \cdots & p_{2r} \\ \vdots & \vdots & \ddots & \vdots \\ p_{r1} & p_{r2} & \cdots & p_{rr} \end{bmatrix}$$
Step 3: Obtain the predicted values
$$\hat{x}^{(0)}_{final}(k) = \hat{x}^{(0)}(k) + \Delta e(k) = \hat{x}^{(0)}(k) + \sum_{j=1}^{r} p_{ij} u_j$$
where $u_j = \dfrac{L_j + H_j}{2}$ is the central value of the $j$-th state.
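The three steps above can be sketched as follows. This is an illustrative sketch, not the authors' implementation: the function name `markov_correction`, the use of `np.quantile` for the equal-probability boundaries, the uniform fallback for states with no observed transitions, and the error values are all assumptions made for the example.

```python
import numpy as np

def markov_correction(errors, r=3):
    """Equal-probability states, transition matrix, and next-step error correction."""
    errors = np.asarray(errors, dtype=float)
    # Step 1: boundaries at the i/r quantiles, so each state holds about 1/r of the errors.
    edges = np.quantile(errors, np.linspace(0.0, 1.0, r + 1))
    centers = 0.5 * (edges[:-1] + edges[1:])           # u_j = (L_j + H_j) / 2
    # State index (0..r-1) of each error, using the interior boundaries only.
    states = np.searchsorted(edges[1:-1], errors, side="right")
    # Step 2: count one-step transitions i -> j and normalise each row.
    counts = np.zeros((r, r))
    for i, j in zip(states[:-1], states[1:]):
        counts[i, j] += 1
    row_sums = counts.sum(axis=1, keepdims=True)
    # Assumption: a state with no observed transitions gets a uniform row.
    P = np.divide(counts, row_sums, out=np.full((r, r), 1.0 / r), where=row_sums > 0)
    # Step 3: expected error for the step that follows the last observation.
    delta = P[states[-1]] @ centers
    return P, centers, states, delta

# Hypothetical GM(1,1) residuals that alternate in sign.
errors = [-2.0, 1.0, -1.0, 2.0, -1.5, 1.5, -0.5, 0.5]
P, centers, states, delta = markov_correction(errors, r=2)
```

The corrected forecast is then the GM (1,1) forecast plus `delta`. With so few observations per state, the estimated transition probabilities are coarse, which is one reason the number of states is usually kept small for short series.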

2.4. PSO-Markov-GM (1,1) Model

PSO initializes a swarm of particles at random positions, each representing a candidate solution of the optimization problem. It then iteratively updates the positions of all particles while tracking the best positions found, and finally returns the best solution. The Markov-GM (1,1) prediction process optimized by PSO is as follows:
Step 1: Initialize particle swarm and encoding method
Each particle $x_i = (b_1^i,\ b_2^i,\ \ldots,\ b_{r-1}^i)$ represents one scheme for dividing the residual states, where $r$ is the number of states, so $r - 1$ boundary values need to be determined. The boundaries must satisfy the following constraint:
$$\min(e_r(k)) < b_1^i < b_2^i < \cdots < b_{r-1}^i < \max(e_r(k))$$
Each particle is also equipped with a velocity vector, $v_i = (v_1,\ v_2,\ \ldots,\ v_{r-1})$, that is initially assigned a random value. The parameters for the PSO algorithm are set as follows: the swarm size is 30, and the maximum number of iterations is 100 [40,41].
Step 2: Construct fitness function
The prediction error is evaluated using the following fitness function:
$$f(x_i) = \frac{1}{n-1} \sum_{k=2}^{n} \left| \frac{x^{(0)}(k) - \hat{x}^{(0)}(k) - \sum_{j=1}^{r} p_{ij} u_j}{x^{(0)}(k)} \right|$$
where $\hat{x}^{(0)}(k)$ is the prediction result of GM (1,1).
Step 3: Particle position and velocity update
The formulas for updating velocity and position are
$$v_i^{t+1} = \omega v_i^{t} + c_1 \gamma_1 \left(p_i^{best} - x_i^{t}\right) + c_2 \gamma_2 \left(g^{best} - x_i^{t}\right)$$
$$x_i^{t+1} = x_i^{t} + v_i^{t+1}$$
where $x_i^{t}$ is the position of particle $i$ in iteration $t$, $v_i^{t}$ is the velocity of particle $i$, $p_i^{best}$ is the best position found by particle $i$, and $g^{best}$ is the best position found by the entire swarm. $\omega$ is the inertia factor, set to $\omega = 0.7$. $c_1$ and $c_2$ are the cognitive and social acceleration factors, set to $c_1 = c_2 = 2$. $\gamma_1$ and $\gamma_2$ are random numbers that follow the uniform distribution $U(0, 1)$ [40,41].
Step 4: Termination conditions and optimal solution output
Repeat Steps 2 and 3 until the maximum number of iterations is reached. Finally, the particle position $x$ with the smallest fitness $f(x_i)$ is selected to determine the optimal residual boundary division.
Step 5: Calculate the predicted values
Use the optimal boundary $x$, the corresponding center values $u_j$, and the transition matrix $P$ to correct the predicted values:
$$\hat{x}^{(0)}_{final}(k) = \hat{x}^{(0)}(k) + \Delta e(k) = \hat{x}^{(0)}(k) + \sum_{j=1}^{r} p_{ij} u_j$$
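The PSO loop in Steps 1 to 4 can be sketched as below, using the stated settings (swarm size 30, 100 iterations, ω = 0.7, c1 = c2 = 2). This is a hedged sketch rather than the authors' code: the function name `pso_boundaries` is hypothetical, the clipping and re-sorting used to keep the boundaries ordered is an assumption (the source does not say how the constraint is enforced), and the quadratic `toy_fitness` is only a stand-in for the MAPE-based fitness function.

```python
import numpy as np

def pso_boundaries(fitness, lo, hi, r, swarm_size=30, iters=100,
                   w=0.7, c1=2.0, c2=2.0, seed=0):
    """Search r-1 ordered boundaries in [lo, hi] that minimise `fitness`."""
    rng = np.random.default_rng(seed)
    dim = r - 1
    # Step 1: random positions; sorting keeps b_1 <= b_2 <= ... <= b_{r-1}.
    x = np.sort(rng.uniform(lo, hi, size=(swarm_size, dim)), axis=1)
    v = rng.uniform(-1.0, 1.0, size=(swarm_size, dim)) * 0.1 * (hi - lo)
    p_best = x.copy()
    p_val = np.array([fitness(p) for p in x])          # Step 2: evaluate fitness
    g_best = p_best[p_val.argmin()].copy()
    g_val = p_val.min()
    for _ in range(iters):
        g1 = rng.random((swarm_size, dim))             # gamma_1 ~ U(0, 1)
        g2 = rng.random((swarm_size, dim))             # gamma_2 ~ U(0, 1)
        # Step 3: velocity and position update.
        v = w * v + c1 * g1 * (p_best - x) + c2 * g2 * (g_best - x)
        # Assumption: clip to the error range and re-sort to respect the constraint.
        x = np.sort(np.clip(x + v, lo, hi), axis=1)
        vals = np.array([fitness(p) for p in x])
        better = vals < p_val
        p_best[better] = x[better]
        p_val[better] = vals[better]
        if p_val.min() < g_val:                        # track the swarm-wide best
            g_val = p_val.min()
            g_best = p_best[p_val.argmin()].copy()
    return g_best, g_val                               # Step 4: best boundaries found

def toy_fitness(b):
    """Stand-in objective (NOT the paper's fitness): distance to boundaries (-1, 1)."""
    return float(np.sum((b - np.array([-1.0, 1.0])) ** 2))

best, best_val = pso_boundaries(toy_fitness, lo=-3.0, hi=3.0, r=3)
```

In an actual PSO-Markov-GM run, `fitness` would rebuild the states, transition matrix, and corrected predictions for each candidate boundary vector and return the resulting mean absolute percentage error.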

2.5. Dynamic Rolling Update GM (1,1) (DRUGM) Model

The classic GM (1,1) is fitted to the entire dataset. Over time, however, the system development process may be affected by random interference factors. A rolling prediction mechanism continuously updates the input data, so that subsequent predictions are based on the latest information and can respond effectively to external disturbances. This study therefore introduces a dynamic rolling update mechanism into GM (1,1), yielding the DRUGM (1,1) model.
In the DRUGM (1,1) model, GM (1,1) is first fitted to $X^{(0)} = \left(x^{(0)}(1),\ x^{(0)}(2),\ \ldots,\ x^{(0)}(k)\right)$ to obtain the prediction $\hat{x}^{(0)}(k+1)$. The newly estimated value, $x^{(0)}(k+1)$, is then added to the sequence, and the oldest data point, $x^{(0)}(1)$, is removed. Consequently, $X^{(0)} = \left(x^{(0)}(2),\ x^{(0)}(3),\ \ldots,\ x^{(0)}(k+1)\right)$ is used to predict $\hat{x}^{(0)}(k+2)$. Figure 1 shows the workflow of DRUGM (1,1).
Assuming the rolling window is set to four, only the latest four time-series data points are used as modeling samples to establish a new GM (1,1), which then predicts the next point. The window is then slid forward by one point and the model is re-estimated, so that each modeling step again uses only the most recent four points. This process is repeated continuously, as illustrated in Figure 2.
The predicted values are compared with the actual values to obtain a residual sequence. When there is systematic bias in the model, using residuals to correct future predictions can solve the problem of “systematic bias” and increase the accuracy of modeling predictions.
Rolling updates ensure that the model can adapt to the latest trends in a timely manner, while dynamic error correction utilizes residuals to correct prediction biases. DRUGM (1,1) combines rolling data updates with a dynamic error correction mechanism; it can not only reduce the impact of outdated information, but also iteratively adjust deviations when new data are available, thereby maintaining higher stability and accuracy in the event of sudden fluctuations or trend changes.
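The rolling mechanism with a window of four can be sketched as follows. This is a sketch under explicit assumptions, not the authors' implementation: the source does not give the formula for the dynamic error correction, so the example simply adds the most recent one-step residual to the next forecast. The function names `gm11_next` and `rolling_forecast` and the demonstration series are hypothetical.

```python
import numpy as np

def gm11_next(window):
    """Fit GM(1,1) to `window` and return the one-step-ahead forecast."""
    x0 = np.asarray(window, dtype=float)
    x1 = np.cumsum(x0)                               # 1-AGO
    z1 = 0.5 * (x1[1:] + x1[:-1])                    # background values
    B = np.column_stack((-z1, np.ones_like(z1)))
    a, b = np.linalg.lstsq(B, x0[1:], rcond=None)[0]
    n = len(x0)
    # Forecasting equation evaluated at k = n + 1 (a == 0 is not handled).
    return (1.0 - np.exp(a)) * (x0[0] - b / a) * np.exp(-a * n)

def rolling_forecast(series, window=4, correct=True):
    """One-step-ahead forecasts for series[window:], refitting on each window."""
    series = np.asarray(series, dtype=float)
    preds = []
    last_residual = 0.0
    for t in range(window, len(series)):
        raw = gm11_next(series[t - window:t])        # only the latest `window` points
        # Assumed correction rule: shift the forecast by the previous residual.
        preds.append(raw + last_residual if correct else raw)
        # After the actual value arrives, record the uncorrected model's residual.
        last_residual = series[t] - raw
    return np.array(preds)

# Hypothetical series with smooth 5% growth, used only for illustration.
series = 100.0 * 1.05 ** np.arange(10)
preds = rolling_forecast(series, window=4)           # forecasts for positions 4..9
```

Because each forecast uses only the four most recent observations, older data stop influencing the model as the window slides forward. A very short window also makes the fit more sensitive to noise in any single year.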

2.6. Model Performance Evaluation Index

This study evaluates the accuracy of the models through four indicators: RMSE, MAE, APE, and MAPE. These values are calculated using Equations (20)–(23), respectively.
$$RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left(y_i - \hat{y}_i\right)^2} \tag{20}$$
$$MAE = \frac{1}{n} \sum_{i=1}^{n} \left| y_i - \hat{y}_i \right| \tag{21}$$
$$APE = \left| \frac{y_i - \hat{y}_i}{y_i} \right| \times 100\% \tag{22}$$
$$MAPE = \frac{1}{n} \sum_{i=1}^{n} \left| \frac{y_i - \hat{y}_i}{y_i} \right| \times 100\% \tag{23}$$
where $y_i$ represents the raw data, $\hat{y}_i$ represents the predicted value, and $n$ is the total sample size.
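For concreteness, the four indicators can be computed as in the short sketch below. The function names and the three-point example are hypothetical and chosen so the results are easy to check by hand.

```python
import numpy as np

def rmse(y, y_hat):
    """Root mean squared error."""
    y, y_hat = np.asarray(y, dtype=float), np.asarray(y_hat, dtype=float)
    return float(np.sqrt(np.mean((y - y_hat) ** 2)))

def mae(y, y_hat):
    """Mean absolute error."""
    y, y_hat = np.asarray(y, dtype=float), np.asarray(y_hat, dtype=float)
    return float(np.mean(np.abs(y - y_hat)))

def ape(y, y_hat):
    """Absolute percentage error of each observation, in percent."""
    y, y_hat = np.asarray(y, dtype=float), np.asarray(y_hat, dtype=float)
    return np.abs((y - y_hat) / y) * 100.0

def mape(y, y_hat):
    """Mean absolute percentage error, in percent."""
    return float(np.mean(ape(y, y_hat)))

# Made-up values: the errors are -10, 10, and 0.
y_true = [100.0, 200.0, 400.0]
y_pred = [110.0, 190.0, 400.0]
ape_values = ape(y_true, y_pred)      # 10%, 5%, 0%
mape_value = mape(y_true, y_pred)     # mean of the APEs: 5%
mae_value = mae(y_true, y_pred)       # 20 / 3
rmse_value = rmse(y_true, y_pred)     # sqrt(200 / 3)
```

RMSE and MAE are expressed in the units of the data, whereas APE and MAPE are scale-free, which is why the percentage measures are convenient for comparing series of different magnitudes.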

3. Results

We systematically compared the predictive performance of the following five models: ARIMA, GM (1,1), Markov-GM (1,1), PSO-Markov-GM (1,1), and DRUGM (1,1). To select the optimal model, fitting and prediction experiments were conducted on datasets from three different periods: 2004–2023, 2004–2013, and 2014–2023. Annual tea production is the variable being modeled and predicted. All analyses and computational procedures in this study were conducted using Python version 3.11 (Python Software Foundation, Wilmington, DE, USA).

3.1. Case 1: Tea Production from 2004 to 2023

For the annual tea production data from 2004 to 2023, data from 2004 to 2019 form the training set, and data from 2020 to 2023 form the testing set. To assess the models’ predictive capabilities, the APE and MAPE are calculated. The results are presented in Table 4.
For the training set, ARIMA has the lowest MAPE value (2.54%), indicating it has a better fitting effect than the other models. The second-lowest value was obtained for PSO-Markov-GM (1,1); this could be due to the advantages conferred by the PSO algorithm. Overall, when using data from 2004 to 2019, each model had small errors on the training set and strong fitting ability.
For the testing set, the MAPE of DRUGM (1,1) is only 0.44%, indicating excellent performance and strong prediction ability. PSO-Markov-GM (1,1) maintains strong stability, although its prediction accuracy is slightly lower than that of ARIMA. GM (1,1) has the highest MAPE value of 6.79%, indicating relatively low prediction accuracy.
We also calculated the total MAPE for the models, which is the sum of the MAPEs of the training and testing sets. A smaller value indicates better overall model performance.
To more intuitively reflect the difference between the simulated results and raw data, Figure 3 presents a comparison chart of the APEs of the five models, while Figure 4 shows a trend comparison chart of the simulated values and raw data.
The left-hand side of Figure 3 illustrates the variations in APE from 2004 to 2023, while the right-hand side summarizes the total MAPE value for each model. DRUGM (1,1) showed moderate error performance during the training stage and did not outperform the other models. During the testing phase (2020–2023), DRUGM (1,1) showed better prediction accuracy and error stability. The forecasting error of PSO-Markov-GM (1,1) exhibited a rising trend throughout the testing period, indicating a risk that its predictive performance will decline in the future. Based on total MAPE, DRUGM (1,1) has the lowest overall error (3.39%), performing well in both the training and testing stages. The next best is PSO-Markov-GM (1,1), with a total MAPE of 6.36%. Although PSO-Markov-GM (1,1) and Markov-GM (1,1) fit well during the training phase, their errors increase year by year during the testing phase, highlighting the declining reliability of their future production predictions. GM (1,1) has the highest total MAPE of 9.74%, indicating its low accuracy in long-term forecasting. Although ARIMA fits the training data well, its prediction error is relatively high in the testing stage, which weakens its overall prediction performance.
From Figure 4, we find that the ARIMA model exhibits good fitting at the beginning of the sequence, but its prediction capability in the later stages is relatively poor. GM (1,1) can reflect the overall trend but deviates from the raw data in the middle and later stages. Markov-GM (1,1) improved the fitting effect through residual correction, especially in the middle section, but performed poorly in dealing with short-term fluctuations. PSO-Markov-GM (1,1), which further optimizes the state partitioning, improved the overall fit. DRUGM (1,1) performs the best among these models, accurately capturing the overall trend.

3.2. Case 2: Tea Production from 2004 to 2013

The raw data, APEs, and MAPE for the annual tea production data from 2004 to 2013 are given in Table 5.
PSO-Markov-GM (1,1) has the lowest MAPE of 1.06% in the training set, indicating the highest fitting accuracy and strongest fitting ability. ARIMA has a high MAPE of 6.36% in the training set, which is the highest among all models. In 2005, the deviation between the predicted and actual values reached 34%, indicating that the model has weak adaptability to data trends and a poor fitting effect.
In terms of performance on the testing set, DRUGM (1,1) achieved an MAPE of 0.64%, which is lower than 1%, indicating a high predictive accuracy. In contrast, ARIMA had an MAPE of 2.80%, which was significantly higher than the other models, indicating insufficient adaptability and poor predictive performance.
Overall, both PSO-Markov-GM (1,1) and DRUGM (1,1) outperform the other models in terms of prediction accuracy. The APE values demonstrate that the two models have similar accuracy and perform better in different stages. To further demonstrate their fitting and predictive performance, the error data are visualized in Figure 5 and Figure 6, respectively. The APE and MAPE values of each model are compared, and their curves are presented to more comprehensively illustrate model performance.
Figure 5 shows that ARIMA has an APE value of 34.00% in 2005, a significantly higher error than for the other models, indicating that it may have significant deviations and poor stability in that year. On the other hand, the error fluctuations of the other models are relatively small, and their overall performance is more stable. DRUGM (1,1) has errors close to 0 for a few years, suggesting an excellent fitting effect. DRUGM (1,1) has the smallest total MAPE of only 1.85%. The total MAPE of ARIMA is the highest across the tested models, reaching 9.16%, which is about five times that of DRUGM (1,1). The other models’ total MAPE values are all around 2%, indicating similar performance accuracy for these models.
In Figure 6, ARIMA shows substantial deviation in 2005, with the fitted values deviating significantly from the actual data. Although there was some improvement after 2010, the overall fitting effect was still not ideal. In contrast, the GM (1,1) curve is generally smoother, but there is some deviation in certain years (such as 2007 and 2012). The fitting effect of Markov-GM (1,1) and PSO-Markov-GM (1,1) is superior, and their training set fitting values are very close to the original data, indicating good tracking ability. Notably, the DRUGM (1,1) curve closely fits the actual data, demonstrating high accuracy in historical data fitting and strong capability in predicting future trends, making it the best-performing model overall.

3.3. Case 3: Tea Production from 2014 to 2023

In the analysis of tea production data from 2014 to 2023, data from 2014 to 2021 are used as the training set. Table 6 shows the APE and MAPE values for the five prediction models using data from 2014 to 2023.
The MAPE of PSO-Markov-GM (1,1) is only 0.88% on the training set, indicating its strong fitting ability. ARIMA has high APEs, especially in 2015 and 2016, where values of 9.96% and 6.00% are reached, respectively. The model also has the highest MAPE value.
DRUGM (1,1) is the best-performing model on the testing set, with an MAPE value of only 0.26%, demonstrating high prediction accuracy. Figure 7 and Figure 8 show the comparison of APE values and data trends.
As can be seen in Figure 7, ARIMA was associated with notably larger prediction errors than the other models in 2015 and 2016, indicating that it is prone to failure in years with significant trend changes. In contrast, the error fluctuations of the other models are smaller, and their predictive performance is more stable. DRUGM (1,1) has small errors in the testing set, demonstrating its strong prediction accuracy. Its total MAPE is only 1.38%, which was the lowest among all models, indicating that its predictive performance is balanced on the training and testing sets and its overall performance is superior to the other models. The MAPE of PSO-Markov-GM (1,1) is smaller than that of Markov-GM (1,1), indicating that introducing the PSO algorithm is beneficial for model accuracy. Overall, the MAPE of ARIMA reaches 3.94%, which is almost double that of the other models.
To further assess their fitting performance, the prediction curves of the five models, using data from 2014 to 2023, are illustrated in Figure 8. From the graph, it can be seen that ARIMA has significant prediction bias in some years. The overall trends of GM (1,1) and Markov-GM (1,1) are relatively smooth, but there is a certain level of deviation from the raw data. In contrast, the curve for PSO-Markov-GM (1,1) is highly consistent and closely tracks the trend for the raw data. The DRUGM (1,1) curve is extremely consistent with the raw data, clearly demonstrating the excellent prediction ability of this model.
In this paper, the RMSE and MAE are included as performance assessment indicators to conduct a more comprehensive comparative analysis. They are shown in Table 7.
Table 7 presents the RMSE and MAE values for each prediction model based on three sample intervals (2004–2023, 2004–2013, and 2014–2023) of annual tea production data. ARIMA has the smallest RMSE and MAE values for the 2004–2023 data, while PSO-Markov-GM (1,1) performed well on the 2004–2013 and 2014–2023 data. Although DRUGM (1,1) does not achieve the lowest error values in the training set, its performance is comparable to that of PSO-Markov-GM (1,1). More importantly, DRUGM (1,1) achieved the minimum RMSE and MAE in all three testing sets, with error values all below 1.7. Notably, for the data from 2004 to 2023, the RMSE and MAE values for the other models generally exceed 10, with some even above 20. In contrast, the RMSE and MAE values of DRUGM (1,1) are only 1.65 and 1.37, respectively, indicating its significantly better performance.

3.4. Provincial-Level Tea Production from 2004 to 2023

The three cases above show that DRUGM (1,1) has extremely high accuracy in predicting China’s national tea production, indicating that the model has strong predictive ability at the macro level.
To further validate the superiority of DRUGM (1,1), we extended its predictive application to the provincial level. Table 8 shows the APE and MAPE values of DRUGM (1,1) in predicting tea production in various provinces from 2004 to 2023.
Table 8 shows that the overall performance of DRUGM (1,1) is satisfactory at the provincial level. The APE and MAPE values of all 18 major tea-producing provinces are below the threshold of 10%, indicating that DRUGM (1,1) has a relatively low overall error rate in predicting provincial tea production and has good accuracy in overall prediction.
More importantly, the model performs well in provinces with different production scales, climate conditions, and agricultural practices. The model achieved significant accuracy in traditional tea-planting provinces such as Fujian (MAPE = 1.34%), Yunnan (MAPE = 3.39%), Sichuan (MAPE = 2.53%), and Zhejiang (MAPE = 3.41%), demonstrating its effectiveness in stable and high-yield areas. The model maintained strong predictive ability in developing tea-planting regions such as Guangxi (MAPE = 2.78%), Shaanxi (MAPE = 4.09%), and Gansu (MAPE = 4.28%). This indicates that DRUGM (1,1) is not only suitable for stable time series but is also robust enough to handle the inherent growth trends and potential fluctuations in the developing agricultural sectors. Although the MAPE value for Hainan Province was 6.62%, the highest among all provinces, this may be related to its unique tropical climate conditions, especially extreme weather events such as typhoons or heavy rainfall, which may lead to unstable production fluctuations. Due to the insufficient consideration of these factors in the univariate model, the prediction error for Hainan is relatively large. Even so, the highest APE value in the province is still within an acceptable range, demonstrating the applicability and predictive reliability of the model in the face of atypical climatic conditions.

4. Discussion

The results show that DRUGM (1,1) performs well in predicting tea production at both national and provincial levels in China. PSO-Markov-GM (1,1) has certain advantages over DRUGM (1,1) in short-term predictions, but its error in the testing stage tends to increase, indicating that it has limited applicability in long-term prediction. GM (1,1) and Markov-GM (1,1) are more effective at capturing overall trends, but their performance in dealing with short-term fluctuations and structural changes is poor. Although ARIMA has outstanding fitting performance during the training phase, it exhibits strong instability during years of anomalous production.
Previous studies have shown that GM (1,1) is effective for limited datasets, as well as short- to medium-term predictions; however, it is prone to accumulating errors in long-term predictions [26]. The introduction of Markov-GM (1,1) has improved the residual distribution to some extent, but it is difficult to completely eliminate the influence of non-stationary data [28]. This study found that during the fitting stage, Markov-GM (1,1) integrated with the PSO algorithm demonstrated superior performance, which is consistent with the view that “intelligent optimization algorithms can improve the accuracy of grey prediction models” [42]. However, this study further demonstrates that DRUGM (1,1) is superior in overall predictive performance, with its low error rate and high stability indicating that the model maintains good adaptability in response to data fluctuations and trend changes. For example, in 2023, China’s actual tea production was 3.5411 million tons. The prediction error of GM (1,1) was 9.03%, while that of DRUGM (1,1) was only 0.44%. The difference between the two is 8.59 percentage points, corresponding to a difference in actual production of approximately 304,200 tons. Such a significant gap can lead to misjudgments in national agricultural policies and trade strategies, misallocation of enterprise production and inventory resources, and even have a direct impact on farmers’ incomes and market stability. This demonstrates that improving prediction accuracy, even by a few percentage points, can generate substantial economic and social benefits, further highlighting the practical importance of DRUGM (1,1).
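The tonnage gap quoted above follows directly from the two error rates and the 2023 production figure. As a quick check of the arithmetic, using only the numbers stated in the text:

$$(9.03\% - 0.44\%) \times 3.5411 \times 10^{6}\ \text{t} = 0.0859 \times 3{,}541{,}100\ \text{t} \approx 304{,}180\ \text{t}$$

which rounds to the approximately 304,200 tons reported.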
Moreover, compared with traditional rolling grey models such as GRPM [35] and RGVM [36], which mainly rely on updating sample windows to improve short-term adaptability, DRUGM (1,1) combines rolling data updates with dynamic error correction mechanisms. Not only can it reduce the impact of outdated information, but it can also iteratively adjust deviations when new data are available, enabling it to maintain higher stability and accuracy in the event of sudden fluctuations or trend changes.
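To make the rolling idea concrete, the following is a minimal sketch of a rolling-window GM (1,1) forecast in Python. It assumes the textbook GM (1,1) formulation, a window of four observations (as in Figure 2), and one-step-ahead prediction. It does not implement the dynamic error-correction step of DRUGM (1,1), and the function names `gm11_next` and `rolling_forecast` are hypothetical, so it should not be expected to reproduce the errors reported in the tables.

```python
import numpy as np

def gm11_next(window):
    """One-step-ahead GM(1,1) forecast from a short, positive series."""
    x0 = np.asarray(window, dtype=float)
    x1 = np.cumsum(x0)                       # accumulated generating sequence
    z1 = 0.5 * (x1[1:] + x1[:-1])            # background values
    B = np.column_stack((-z1, np.ones_like(z1)))
    # Least-squares estimate of the development coefficient a and grey input b.
    a, b = np.linalg.lstsq(B, x0[1:], rcond=None)[0]
    if abs(a) < 1e-12:                       # no growth: the fitted level is b
        return float(b)
    n = len(x0)

    def x1_hat(k):
        # Time response function of the whitening equation.
        return (x0[0] - b / a) * np.exp(-a * k) + b / a

    # Difference of accumulated values restores the original scale.
    return float(x1_hat(n) - x1_hat(n - 1))

def rolling_forecast(series, window=4):
    """Refit on the latest `window` observations before each forecast."""
    return [gm11_next(series[t - window:t]) for t in range(window, len(series))]

# National tea production, 2016-2023 (10,000 tons), taken from Table 4.
tea = [231.33, 246.04, 261.04, 277.72, 293.18, 316.40, 334.21, 354.11]
for year, pred in zip(range(2020, 2024), rolling_forecast(tea)):
    print(year, round(pred, 2))
```

Because each forecast is refitted on only the most recent observations, older data drop out of the model automatically, which is the property the paragraph above attributes to rolling updates.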
In terms of the research hypothesis, this study set out to verify whether a dynamic correction mechanism can significantly improve prediction accuracy. The experimental results support this hypothesis: in all scenarios, DRUGM (1,1) reduced prediction bias through its dynamic updating and correction mechanism, giving it superior performance in long-term trend prediction.
These findings have broader significance. On the one hand, they indicate that traditional time series methods such as ARIMA still have shortcomings in dealing with non-linear and abrupt data in agricultural production, while improved models based on grey system theory have more potential in such application scenarios. On the other hand, the results also suggest that in agricultural forecasting and decision-making research, more attention should be paid to models that can balance short-term fitting and long-term stability, in order to avoid policy or industry decision-making risks caused by unstable models.
The model developed in this study, DRUGM (1,1), is a univariate model that excludes external variables that may influence tea yield, such as climatic and economic conditions. Therefore, meteorological variables (e.g., precipitation and temperature) and economic factors (e.g., planting costs and market demand) could be introduced in subsequent research to build a multi-dimensional grey model with enhanced accuracy and stability under complex conditions. PSO-Markov-GM (1,1) also performed well overall, with a lower overall MAPE than GM (1,1) and Markov-GM (1,1), indicating that optimization algorithms can improve predictive performance. Future research could therefore attempt to combine optimization algorithms with rolling update mechanisms to further improve tea production prediction.
DRUGM (1,1) performed well not only at the national level but also in provincial forecasting. It accurately predicted production in major tea-producing provinces (such as Zhejiang), developing tea-producing provinces (such as Guangxi), and emerging tea-producing provinces (such as Hainan), demonstrating good adaptability to different data scales and indicating the model’s wide applicability in the agricultural field. By providing an accurate, adaptable, and computationally efficient predictive framework, the model offers a valuable decision-support tool for policymakers, market participants, and researchers, and it supports improved resource allocation, optimized agricultural planning, and more stable rural economies. DRUGM (1,1) also shows considerable potential in practical applications. Its rolling update mechanism can generate timely predictions that directly support climate change response strategies, such as adjusting crop planting plans, optimizing irrigation measures, and promoting agricultural insurance. From the perspective of policy regulation, more reliable predictions can provide a scientific basis for subsidy allocation, supply–demand balance regulation, and market stability measures, thereby helping to safeguard farmers’ incomes and maintain competitiveness in the global tea trade.

5. Conclusions

Agriculture supports the stability of the national economy, and the tea industry is strategically important for China’s economic growth, rural livelihoods, and social well-being. Its development speed and quality are not only crucial for agricultural modernization, but also an indispensable part of promoting the rural revitalization strategy.
This study analyzed Chinese tea production data and evaluated the predictive performance of five models (ARIMA, GM (1,1), Markov-GM (1,1), PSO-Markov-GM (1,1), and DRUGM (1,1)) using four performance indicators (APE, MAPE, RMSE, and MAE). The results indicate that DRUGM (1,1) consistently achieved the lowest test-stage prediction error across all three datasets (Tables 4–7), demonstrating the closest agreement with observed production. Integrating the dynamic rolling update process into the GM (1,1) framework improves accuracy while maintaining the simplicity of the model.
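For readers who wish to recompute the four indicators, a minimal Python sketch is given below. It assumes the standard definitions of APE, MAPE, RMSE, and MAE; the function name `forecast_errors` is hypothetical, and the study's own formulas in the methods section take precedence if they differ.

```python
import numpy as np

def forecast_errors(actual, predicted):
    """Return APE per point plus MAPE, RMSE, and MAE (standard definitions)."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    residual = actual - predicted
    ape = np.abs(residual / actual) * 100         # absolute percentage error, %
    return {
        "APE": ape,
        "MAPE": float(ape.mean()),                # mean of the APE values, %
        "RMSE": float(np.sqrt(np.mean(residual ** 2))),
        "MAE": float(np.abs(residual).mean()),
    }

# Toy numbers for illustration only; they are not data from the study.
print(forecast_errors([100.0, 200.0], [110.0, 190.0]))
```

As a consistency check on the reported values, averaging the four test-stage APE values of DRUGM (1,1) in Table 4 (0.74, 0.76, 0.15, and 0.11) gives 0.44, which matches the MAPE listed there.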
This study further verified the effectiveness of DRUGM (1,1) through the prediction and analysis of provincial tea production. Reliable results for tea production at the provincial level can provide a more accurate basis for agricultural resource and labor force allocation, further improving decision support.
The research results confirm that DRUGM (1,1) is an efficient, simple, and highly accurate method for predicting annual tea production in China. More importantly, the adaptability of the model makes it applicable at both the national and provincial levels, providing a valuable decision-support tool for policymakers, market participants, and researchers. Accurate predictions help improve resource allocation, optimize agricultural planning, and strengthen supply chain management, all of which can contribute to the stability of the rural economy and resilience of agricultural systems.

Author Contributions

Conceptualization, S.X., W.K.W., H.S.L. and K.S.K.; methodology, S.X. and W.K.W.; software, S.X.; validation, S.X., W.K.W., H.S.L. and K.S.K.; formal analysis, S.X. and H.S.L.; writing—original draft preparation, S.X.; writing—review and editing, S.X., W.K.W., H.S.L. and K.S.K.; visualization, S.X. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The data supporting the conclusions of this article are included within the article, specifically in Table 4. Further inquiries can be directed to the corresponding author.

Acknowledgments

The authors would like to thank the anonymous reviewers for their valuable comments that helped shape the key messages of the manuscript.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
ANN: Artificial neural network
APE: Absolute percentage error
ARIMA: Autoregressive integrated moving average
BGM: Box plot grey model
CNN: Convolutional neural networks
ConvLSTM: Convolutional long short-term memory
DCNN: Dimensional convolutional neural network
DGM: Discrete grey model
DNGM: Discrete nonlinear grey model
DRS-RF: Dragonfly optimization algorithm and support vector regression-random forest
DRUGM: Dynamic rolling update grey model
DT: Decision tree
ELNET: Elastic net
FANGBM: Fractional accumulating nonlinear grey Bernoulli model
FOANGBMKM: Fractional opposite-direction accumulating nonlinear grey Bernoulli Markov model
GA: Genetic algorithm
GBDT: Gradient-boosted decision tree
GM: Grey model
GP: Gaussian processor
GRPM: Grey rolling prediction model
GTWNN: Geographically and temporally weighted neural network
GTWR: Geographically and temporally weighted regression
GVM: General vector machine
GWO: Grey wolf optimizer
KGM: Kernel-based multivariate nonlinear grey model
LASSO: Least absolute shrinkage and selection operator
LR: Linear regression
LSTM: Long short-term memory
MAE: Mean absolute error
MAPE: Mean absolute percentage error
Markov-GM: Markov chain grey model
NGBM: Nonlinear grey Bernoulli model
PR: Polynomial regression
PSO: Particle swarm optimization
PSO-Markov-GM: Particle swarm optimization Markov chain grey model
RF: Random forest
RGVM: Rolling grey Verhulst model
RLR: Robust linear regression
RMSE: Root mean squared error
SARIMA: Seasonal ARIMA
SMLR: Stepwise multiple linear regression
SVM: Support vector machine
SVR: Support vector regression
UAVs: Unmanned aerial vehicles
XGBoost: Extreme gradient boosting

References

  1. Chen, C.; Lu, J.; Zhou, M.; Yi, J.; Liao, M.; Gao, Z. A YOLOv3-Based Computer Vision System for Identification of Tea Buds and the Picking Point. Comput. Electron. Agric. 2022, 198, 107116.
  2. Qu, Z.; Liu, A.; Li, P.; Liu, C.; Xiao, W.; Huang, J.; Liu, Z.; Zhang, S. Advances in Physiological Functions and Mechanisms of (−)-Epicatechin. Crit. Rev. Food Sci. Nutr. 2021, 61, 211–233.
  3. Zhou, J.; Ho, C.; Long, P.; Meng, Q.; Zhang, L.; Wan, X. Preventive Efficiency of Green Tea and Its Components on Nonalcoholic Fatty Liver Disease. J. Agric. Food Chem. 2019, 67, 5306–5317.
  4. Jayasinghe, S.L.; Kumar, L. Climate Change May Imperil Tea Production in the Four Major Tea Producers According to Climate Prediction Models. Agronomy 2020, 10, 1536.
  5. Nurman, S.; Nusrang, M.; Sudarmin. Analysis of Rice Production Forecast in Maros District Using the Box-Jenkins Method with the ARIMA Model. ARRUS J. Math. Appl. Sci. 2022, 2, 36–48.
  6. Noorunnahar, M.; Chowdhury, A.H.; Mila, F.A. A Tree Based EXtreme Gradient Boosting (XGBoost) Machine Learning Model to Forecast the Annual Rice Production in Bangladesh. PLoS ONE 2023, 18, e0283452.
  7. Devi, M.; Kumar, J.; Malik, D.P.; Mishra, P. Forecasting of Wheat Production in Haryana Using Hybrid Time Series Model. J. Agric. Food Res. 2021, 5, 100175.
  8. Chergui, N. Durum Wheat Yield Forecasting Using Machine Learning. Artif. Intell. Agric. 2022, 6, 156–166.
  9. Zhang, X.; Bao, J.; Xu, S.; Wang, Y.; Wang, S. Prediction of China’s Grain Consumption from the Perspective of Sustainable Development—Based on GM (1,1) Model. Sustainability 2022, 14, 10792.
  10. Wu, Y.; Zhou, R.; Yu, B.; Huang, X.; Li, B. Grain Yield Prediction Based on the Improved Unbiased Grey Markov Model. Discret. Dyn. Nat. Soc. 2025, 2025, 8282138.
  11. Saha, J.K.; Adnan, K.M.M.; Sarker, S.A.; Bunerjee, S. Analysis of Growth Trends in Area, Production and Yield of Tea in Bangladesh. J. Agric. Food Res. 2021, 4, 100136.
  12. Mila, F.A.; Noorunnahar, M.; Nahar, A.; Acharjee, D.C.; Parvin, M.T.; Culas, R.J. Modelling and Forecasting of Tea Production, Consumption and Export in Bangladesh. Curr. Appl. Sci. Technol. 2022, 22.
  13. Jui, S.J.J.; Ahmed, A.A.M.; Bose, A.; Raj, N.; Sharma, E.; Soar, J.; Chowdhury, M.W.I. Spatiotemporal Hybrid Random Forest Model for Tea Yield Prediction Using Satellite-Derived Variables. Remote Sens. 2022, 14, 805.
  14. Ryan, A.A.; Kh Shuvessa, S.; Mamun, S.; Arpita, H.D.; Ahamed, M.S. Forecasting Tea Production in the Context of Bangladesh Utilizing Machine Learning. In Proceedings of the 2023 14th International Conference on Computing Communication and Networking Technologies, ICCCNT 2023, Delhi, India, 6–8 July 2023; Institute of Electrical and Electronics Engineers Inc.: New York, NY, USA, 2023.
  15. Satpathi, A.; Setiya, P.; Das, B.; Nain, A.S.; Jha, P.K.; Singh, S.; Singh, S. Comparative Analysis of Statistical and Machine Learning Techniques for Rice Yield Forecasting for Chhattisgarh, India. Sustainability 2023, 15, 2786.
  16. Feng, L.; Wang, Y.; Zhang, Z.; Du, Q. Geographically and Temporally Weighted Neural Network for Winter Wheat Yield Prediction. Remote Sens. Environ. 2021, 262, 112514.
  17. Gavahi, K.; Abbaszadeh, P.; Moradkhani, H. DeepYield: A Combined Convolutional Neural Network with Long Short-Term Memory for Crop Yield Forecasting. Expert Syst. Appl. 2021, 184, 115511.
  18. Batool, D.; Shahbaz, M.; Shahzad Asif, H.; Shaukat, K.; Alam, T.M.; Hameed, I.A.; Ramzan, Z.; Waheed, A.; Aljuaid, H.; Luo, S. A Hybrid Approach to Tea Crop Yield Prediction Using Simulation Models and Machine Learning. Plants 2022, 11, 1925.
  19. Islam, M.A.; Sumy, M.S.A.; Uddin, M.A.; Hossain, M.S. Fitting ARIMA Model and Forecasting for the Tea Production, and Internal Consumption of Tea (per Year) and Export of Tea. Int. J. Mater. Math. Sci. 2020, 2, 8–15.
  20. Arigela, S.; Naidu, M.B.; Reddy, M.P.R.; Murali, K.; Rayalu, G.M. Tea Crop Yield Prediction Using Time Series Models in Kerala. Int. J. Food Nutr. Sci. 2022, 11, 761–773.
  21. Phan, P.; Chen, N.; Xu, L.; Chen, Z. Using Multi-Temporal MODIS NDVI Data to Monitor Tea Status and Forecast Yield: A Case Study at Tanuyen, Laichau, Vietnam. Remote Sens. 2020, 12, 1814.
  22. Deng, J. Control Problems of Grey Systems. Syst. Control Lett. 1982, 1, 288–294.
  23. Chang, C.J.; Li, D.C.; Huang, Y.H.; Chen, C.C. A Novel Gray Forecasting Model Based on the Box Plot for Small Manufacturing Data Sets. Appl. Math. Comput. 2015, 265, 400–408.
  24. Xie, M.; Wu, L.; Li, B.; Li, Z. A Novel Hybrid Multivariate Nonlinear Grey Model for Forecasting the Traffic-Related Emissions. Appl. Math. Model. 2020, 77, 1242–1254.
  25. Gao, J.; Li, J.; Wang, M. Time Series Analysis of Cumulative Incidences of Typhoid and Paratyphoid Fevers in China Using Both Grey and SARIMA Models. PLoS ONE 2020, 15, e0241217.
  26. Yuan, C.; Liu, S.; Fang, Z. Comparison of China’s Primary Energy Consumption Forecasting by Using ARIMA (the Autoregressive Integrated Moving Average) Model and GM (1,1) Model. Energy 2016, 100, 384–390.
  27. Liu, L.; Xie, A.; Ping, H. Research on Freight Development of Guangdong Province Based on Grey Theory Model. Math. Probl. Eng. 2021, 2021, 5401499.
  28. Jia, Z.Q.; Zhou, Z.F.; Zhang, H.J.; Li, B.; Zhang, Y.X. Forecast of Coal Consumption in Gansu Province Based on Grey-Markov Chain Model. Energy 2020, 199, 117444.
  29. Elgharbi, S.; Esghir, M.; Ibrihich, O.; Abarda, A.; El Hajji, S.; Elbernoussi, S. Grey-Markov Model for the Prediction of the Electricity Production and Consumption. In Big Data and Networks Technologies. BDNT 2019. Lecture Notes in Networks and Systems; Farhaoui, Y., Ed.; Springer: Berlin/Heidelberg, Germany, 2020; Volume 81, pp. 206–219.
  30. Yuan, D.; Geng, C.; Zhang, L.; Zhang, Z. Application of Gray-Markov Model to Land Subsidence Monitoring of a Mining Area. IEEE Access 2021, 9, 118716–118725.
  31. Jin, X.; Zheng, J.; Geng, X. Prediction of Road Traffic Accidents Based on Grey System Theory and Grey Markov Model. Int. J. Saf. Secur. Eng. 2020, 10, 263–268.
  32. Qiu, M.; Li, D.; Luo, Z.; Yu, X. Huizhou GDP Forecast Based on Fractional Opposite-Direction Accumulating Nonlinear Grey Bernoulli Markov Model. Electron. Res. Arch. 2023, 31, 947–960.
  33. Xu, Y.; Lin, T.; Du, P. A Hybrid Coal Prediction Model Based on Grey Markov Optimized by GWO—A Case Study of Hebei Province in China. Expert Syst. Appl. 2024, 235, 121194.
  34. Zheng, C.; Wu, W.; Xie, W.; Li, Q.; Zhang, T. Forecasting the Hydroelectricity Consumption of China by Using a Novel Unbiased Nonlinear Grey Bernoulli Model. J. Clean. Prod. 2021, 278, 123903.
  35. Zhou, W.; Zeng, B.; Wang, J.; Luo, X.; Liu, X. Forecasting Chinese Carbon Emissions Using a Novel Grey Rolling Prediction Model. Chaos Solitons Fractals 2021, 147, 110968.
  36. Zhao, Y.; Shou, M.; Wang, Z. Prediction of the Number of Patients Infected with COVID-19 Based on Rolling Grey Verhulst Models. Int. J. Environ. Res. Public Health 2020, 17, 4582.
  37. Zhang, H.; Chen, Y. Analysis and Application of Grey-Markov Chain Model in Tax Forecasting. J. Math. 2021, 2021, 9918411.
  38. Gonçalves, J.P.S.; Fruett, F.; Dalfré Filho, J.G.; Giesbrecht, M. Faults Detection and Classification in a Centrifugal Pump from Vibration Data Using Markov Parameters. Mech. Syst. Signal Process. 2021, 158, 107694.
  39. Yin, G.G.; Zhang, Q. Discrete-Time Markov Chains: Two-Time-Scale Methods and Applications; Springer: New York, NY, USA, 2005.
  40. Shi, Y.; Eberhart, R. A Modified Particle Swarm Optimizer. In Proceedings of the 1998 IEEE International Conference on Evolutionary Computation Proceedings, IEEE World Congress on Computational Intelligence (Cat. No.98TH8360), Anchorage, AK, USA, 4–9 May 1998.
  41. Twumasi, E.; Frimpong, E.A.; Kwegyir, D.; Folitse, D. Improvement of Grey System Model Using Particle Swarm Optimization. J. Electr. Syst. Inf. Technol. 2021, 8, 12.
  42. Castillo, M.; Soto, R.; Crawford, B.; Castro, C.; Olivares, R. A Knowledge-Based Hybrid Approach on Particle Swarm Optimization Using Hidden Markov Models. Mathematics 2021, 9, 1417.
Figure 1. The dynamic rolling update mechanism of the grey prediction model.
Figure 2. Prediction process with a scrolling window of four.
Figure 3. Comparison of APE and total MAPE of the five models using data from 2004 to 2023.
Figure 4. Comparison between raw data and model predictions using data from 2004 to 2023.
Figure 5. Comparison of APE and Total MAPE of the five models using data from 2004 to 2013.
Figure 6. Comparison between raw data and model predictions using data from 2004 to 2013.
Figure 7. Comparison of APE and total MAPE of the five models using data from 2014 to 2023.
Figure 8. Comparison between raw data and model predictions using data from 2014 to 2023.
Table 1. Summary of prediction studies for tea and crop yields.
| References | Models | Findings |
| Satpathi et al. [15] | SMLR, ANN, LASSO, ELNET, Ridge regression | Ensemble models achieved better results than single models. |
| Feng et al. [16] | GTWNN, ANN, GTWR, SVR | GTWNN has lower errors and can effectively solve spatial non-stationarity in the prediction modeling process. |
| Gavahi et al. [17] | 3DCNN, ConvLSTM, DeepYield | DeepYield’s performance is significantly better than all modeling techniques, including ConvLSTM and 3DCNN. |
| Batool et al. [18] | AquaCrop, Machine learning | The machine learning regression algorithm outperformed the simulation model with less data. |
| Islam et al. [19] | ARIMA | ARIMA (0,1,1) can predict the tea production in Bangladesh very well. |
| Arigela et al. [20] | ARIMA, SARIMA | The SARIMA model provides a deeper understanding of seasonal factors that affect yield. |
| Phan et al. [21] | SVM, RF, LR | The predictive performance of different models varies in different periods. |
Table 2. Summary of prediction studies using grey models and their derived models.
| References | Models | Findings |
| Chang et al. [23] | BGM (1,1), GM (1,1) | The BGM (1,1) model has better prediction performance and is a useful tool for manufacturing enterprises, which can be applied to other practical industrial cases in the future. |
| Xie et al. [24] | KGM (1, N), SVR, RLR | The results of KGM (1, N) are significantly better than those of RLR and SVR prediction models. |
| Gao et al. [25] | SARIMA, GM (1,1) | SARIMA (0,1,7) × (1,0,1)₁₂ performed better than GM (1,1), and children aged 0–4 have been a high-risk group in recent years. The three provinces in southwest China (Yunnan, Guizhou and Guangxi) have the highest incidence rate. |
| Yuan et al. [26] | ARIMA, GM (1,1), GM-ARIMA | The GM-ARIMA performed the best. |
| Liu et al. [27] | GM (1,1) | Analyze the impact of COVID-19 on Guangdong’s transportation and find out the related factors of Guangdong’s freight growth. |
| Jia et al. [28] | Markov-GM | The average relative error predicted by Markov-GM (1,1) is much lower than before correction. |
| Elgharbi et al. [29] | Markov-GM, GM (1,1) | Markov-GM has higher prediction accuracy and better prediction results. Electricity production and consumption will show significant growth in the future. |
| Yuan et al. [30] | Markov-GM, GM (1,1) | Compared with traditional grey models, grey Markov models can better reflect the volatility and practicality of subsidence data in mining areas. |
| Jin et al. [31] | Markov-GM, GM (1,1) | In terms of predicting road traffic accidents, the accuracy of Markov-GM is significantly higher than others. |
| Qiu et al. [32] | FOANGBMKM, FANGBM | The predictive performance of FOANGBMKM (1,1) is superior to the other four competing models, proving that this model has higher accuracy and efficiency. |
| Xu et al. [33] | Markov-DNGM, GWO-Markov-DNGM | The Markov chain state interval partitioning hybrid model optimized by GWO algorithm has higher reliability in predicting coal consumption compared to models that partition state intervals based on experience. |
| Zheng et al. [34] | PR(n), ARIMA, ANN, GVM (1,1), NGBM, PSO-Unbiased NGBM | Using PSO algorithm to optimize model parameters improves prediction accuracy and outperforms other algorithms. |
| Zhou et al. [35] | GM (1,1), DGM (1,1), GRPM (1,1), LR model | Compared with the other two classic prediction models that did not consider the priority of new information, GRPM (1,1) has higher stability. |
| Zhao et al. [36] | RGVM | RGVM and its derivative forms can effectively predict changes in patient populations. |
| Zhang and Chen [37] | GM (1,1), Markov chain, Markov-GM | Markov-GM can better perform tax analysis and prediction. |
Table 3. Summary measures of the dataset.
| Sample | Minimum (10,000 Tons) | Maximum (10,000 Tons) | Average (10,000 Tons) | Standard Deviation (10,000 Tons) |
| 2004–2023 | 83.52 | 354.11 | 203.80 | 81.70 |
| 2004–2013 | 83.52 | 188.72 | 132.93 | 33.39 |
| 2014–2023 | 204.93 | 354.11 | 274.66 | 46.80 |
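Table 4. The values of APE and MAPE for 2004–2023.
| Year | Raw Data (10,000 Tons) | ARIMA APE (%) | GM (1,1) APE (%) | Markov-GM (1,1) APE (%) | PSO-Markov-GM (1,1) APE (%) | DRUGM (1,1) APE (%) |
| 2004 | 83.52 | 0.00 | 0.00 | 1.11 | 2.04 | 0.00 |
| 2005 | 93.49 | 10.65 | 9.43 | 8.44 | 7.61 | 9.43 |
| 2006 | 102.81 | 0.35 | 7.13 | 6.23 | 5.47 | 7.13 |
| 2007 | 117.05 | 4.38 | 1.30 | 0.51 | 0.15 | 1.30 |
| 2008 | 125.48 | 1.91 | 1.73 | 0.99 | 0.38 | 1.73 |
| 2009 | 135.06 | 0.44 | 1.75 | 1.07 | 0.49 | 1.75 |
| 2010 | 146.25 | 0.80 | 1.16 | 0.53 | 0.00 | 1.16 |
| 2011 | 160.76 | 2.66 | 0.92 | 1.50 | 1.98 | 0.92 |
| 2012 | 176.15 | 2.47 | 2.65 | 3.18 | 3.62 | 2.65 |
| 2013 | 188.72 | 0.40 | 2.18 | 2.67 | 3.08 | 2.18 |
| 2014 | 204.93 | 2.09 | 3.02 | 3.47 | 3.85 | 3.02 |
| 2015 | 227.66 | 4.44 | 6.02 | 6.42 | 6.76 | 6.02 |
| 2016 | 231.33 | 4.59 | 0.43 | 0.82 | 1.16 | 0.43 |
| 2017 | 246.04 | 0.89 | 0.79 | 0.41 | 0.10 | 0.79 |
| 2018 | 261.04 | 0.82 | 2.27 | 1.92 | 1.62 | 2.27 |
| 2019 | 277.72 | 1.26 | 3.49 | 3.16 | 2.88 | 3.49 |
| MAPE | | 2.54 | 2.95 | 2.65 | 2.57 | 2.95 |
| 2020 | 293.18 | 0.59 | 5.54 | 3.23 | 2.23 | 0.74 |
| 2021 | 316.40 | 3.56 | 5.28 | 3.14 | 2.22 | 0.76 |
| 2022 | 334.21 | 4.60 | 7.31 | 5.28 | 4.40 | 0.15 |
| 2023 | 354.11 | 6.11 | 9.03 | 7.12 | 6.29 | 0.11 |
| MAPE | | 3.72 | 6.79 | 4.69 | 3.79 | 0.44 |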
Table 5. The values of APE and MAPE for 2004–2013.
| Year | Raw Data (10,000 Tons) | ARIMA APE (%) | GM (1,1) APE (%) | Markov-GM (1,1) APE (%) | PSO-Markov-GM (1,1) APE (%) | DRUGM (1,1) APE (%) |
| 2004 | 83.52 | 0.00 | 0.00 | 0.89 | 0.04 | 0.00 |
| 2005 | 93.49 | 34.00 | 2.11 | 2.91 | 2.08 | 2.11 |
| 2006 | 102.81 | 0.63 | 1.29 | 2.01 | 1.25 | 1.29 |
| 2007 | 117.05 | 3.93 | 2.96 | 2.32 | 2.99 | 2.96 |
| 2008 | 125.48 | 2.19 | 1.26 | 0.67 | 1.29 | 1.26 |
| 2009 | 135.06 | 0.67 | 0.06 | 0.61 | 0.04 | 0.06 |
| 2010 | 146.25 | 0.60 | 0.79 | 1.30 | 0.77 | 0.79 |
| 2011 | 160.76 | 2.52 | 0.02 | 0.48 | 0.00 | 0.02 |
| MAPE | | 6.36 | 1.21 | 1.40 | 1.06 | 1.21 |
| 2012 | 176.15 | 2.47 | 0.43 | 0.01 | 0.45 | 0.43 |
| 2013 | 188.72 | 3.12 | 1.37 | 1.77 | 1.36 | 0.85 |
| MAPE | | 2.80 | 0.90 | 0.89 | 0.90 | 0.64 |
Table 6. The values of APE and MAPE for 2014–2023.
| Year | Raw Data (10,000 Tons) | ARIMA APE (%) | GM (1,1) APE (%) | Markov-GM (1,1) APE (%) | PSO-Markov-GM (1,1) APE (%) | DRUGM (1,1) APE (%) |
| 2014 | 204.93 | 0.00 | 0.00 | 0.31 | 0.62 | 0.00 |
| 2015 | 227.66 | 9.96 | 2.83 | 3.11 | 1.58 | 2.83 |
| 2016 | 231.33 | 6.00 | 1.29 | 1.02 | 0.75 | 1.29 |
| 2017 | 246.04 | 1.70 | 0.88 | 0.62 | 2.03 | 0.88 |
| 2018 | 261.04 | 3.77 | 0.71 | 0.47 | 0.23 | 0.71 |
| 2019 | 277.72 | 0.93 | 0.27 | 0.04 | 1.29 | 0.27 |
| 2020 | 293.18 | 0.30 | 0.61 | 0.39 | 0.18 | 0.61 |
| 2021 | 316.40 | 2.36 | 1.26 | 1.46 | 0.36 | 1.26 |
| MAPE | | 3.57 | 1.12 | 0.93 | 0.88 | 1.12 |
| 2022 | 334.21 | 0.56 | 0.98 | 1.17 | 1.36 | 0.05 |
| 2023 | 354.11 | 0.19 | 1.01 | 1.19 | 0.21 | 0.48 |
| MAPE | | 0.37 | 1.00 | 1.18 | 0.78 | 0.26 |
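Table 7. The values of RMSE and MAE for different prediction models.
| Set | Year | Evaluation Index | ARIMA | GM (1,1) | Markov-GM (1,1) | PSO-Markov-GM (1,1) | DRUGM (1,1) |
| Training set | 2004–2023 | RMSE | 5.33 | 6.04 | 5.82 | 5.91 | 6.04 |
| Training set | 2004–2023 | MAE | 4.12 | 4.84 | 4.42 | 4.35 | 4.84 |
| Training set | 2004–2013 | RMSE | 12.29 | 1.75 | 1.78 | 1.64 | 1.75 |
| Training set | 2004–2013 | MAE | 6.52 | 1.38 | 1.58 | 1.19 | 1.38 |
| Training set | 2014–2023 | RMSE | 11.24 | 3.34 | 3.21 | 2.68 | 3.34 |
| Training set | 2014–2023 | MAE | 8.79 | 2.85 | 2.33 | 2.18 | 2.85 |
| Test set | 2004–2023 | RMSE | 14.45 | 23.25 | 16.85 | 14.19 | 1.65 |
| Test set | 2004–2023 | MAE | 12.50 | 22.34 | 15.57 | 12.64 | 1.37 |
| Test set | 2004–2013 | RMSE | 5.18 | 1.91 | 2.36 | 1.89 | 1.25 |
| Test set | 2004–2013 | MAE | 5.12 | 1.68 | 1.68 | 1.68 | 1.18 |
| Test set | 2014–2023 | RMSE | 1.41 | 3.44 | 4.07 | 3.25 | 1.21 |
| Test set | 2014–2023 | MAE | 1.27 | 3.43 | 4.06 | 2.64 | 0.93 |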
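Table 8. APE and MAPE values of tea-growing provinces from 2004 to 2023.
| Year | Anhui (%) | Fujian (%) | Gansu (%) | Guangdong (%) | Guangxi (%) | Guizhou (%) | Hainan (%) | Henan (%) | Hubei (%) | Hunan (%) | Jiangsu (%) | Jiangxi (%) | Shandong (%) | Shaanxi (%) | Sichuan (%) | Yunnan (%) | Zhejiang (%) | Chongqing (%) |
| 2004 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| 2005 | 9.62 | 5.77 | 3.68 | 8.96 | 4.30 | 8.99 | 7.24 | 7.04 | 8.03 | 4.23 | 7.90 | 2.60 | 5.27 | 8.28 | 9.36 | 3.06 | 7.03 | 3.43 |
| 2006 | 9.33 | 3.59 | 1.43 | 8.37 | 3.88 | 7.68 | 8.54 | 6.50 | 9.23 | 6.73 | 7.37 | 7.77 | 9.76 | 3.26 | 7.49 | 2.46 | 3.00 | 3.77 |
| 2007 | 3.15 | 0.48 | 8.60 | 5.16 | 6.48 | 6.89 | 7.46 | 4.81 | 5.55 | 1.22 | 3.42 | 9.32 | 9.01 | 5.04 | 1.24 | 0.27 | 0.48 | 1.09 |
| 2008 | 0.59 | 3.14 | 6.27 | 2.31 | 4.67 | 3.75 | 7.57 | 9.70 | 2.59 | 4.79 | 7.69 | 9.50 | 8.04 | 8.42 | 2.72 | 7.60 | 0.23 | 9.09 |
| 2009 | 2.66 | 3.08 | 6.63 | 2.95 | 3.43 | 6.70 | 2.11 | 5.89 | 2.09 | 6.06 | 8.77 | 5.16 | 6.75 | 6.66 | 0.04 | 3.91 | 1.76 | 5.73 |
| 2010 | 0.17 | 1.51 | 0.03 | 6.68 | 4.86 | 8.90 | 9.31 | 5.43 | 2.08 | 3.67 | 3.77 | 2.69 | 6.55 | 7.47 | 1.46 | 5.69 | 2.60 | 0.75 |
| 2011 | 0.42 | 0.49 | 4.86 | 1.88 | 0.43 | 3.64 | 6.09 | 2.19 | 2.97 | 7.27 | 1.70 | 3.15 | 5.93 | 8.29 | 3.76 | 0.18 | 0.04 | 2.87 |
| 2012 | 4.41 | 0.38 | 8.32 | 3.25 | 1.89 | 4.86 | 9.62 | 9.35 | 5.35 | 0.95 | 6.71 | 3.92 | 1.09 | 0.14 | 6.48 | 4.25 | 1.49 | 7.88 |
| 2013 | 5.51 | 0.95 | 1.83 | 0.23 | 2.50 | 0.69 | 8.03 | 6.41 | 2.29 | 0.32 | 3.46 | 4.91 | 0.57 | 1.07 | 3.57 | 6.05 | 3.79 | 9.72 |
| 2014 | 8.37 | 0.71 | 4.45 | 0.99 | 2.95 | 2.10 | 4.22 | 2.00 | 4.57 | 2.42 | 1.40 | 4.08 | 0.96 | 6.42 | 2.76 | 7.94 | 7.45 | 2.46 |
| 2015 | 7.71 | 1.29 | 6.22 | 0.76 | 2.59 | 3.50 | 6.36 | 5.28 | 2.10 | 2.28 | 0.62 | 4.05 | 1.65 | 4.26 | 0.11 | 7.99 | 5.25 | 0.29 |
| 2016 | 3.74 | 0.02 | 0.41 | 2.26 | 1.26 | 0.68 | 8.43 | 9.55 | 2.12 | 1.48 | 3.03 | 4.53 | 2.74 | 3.26 | 0.11 | 1.71 | 6.46 | 2.40 |
| 2017 | 5.64 | 0.16 | 7.51 | 0.62 | 0.61 | 6.11 | 8.54 | 4.14 | 1.70 | 1.46 | 3.88 | 1.45 | 6.87 | 1.28 | 2.69 | 1.55 | 4.43 | 3.14 |
| 2018 | 5.92 | 0.28 | 6.25 | 0.32 | 2.95 | 7.06 | 1.23 | 2.93 | 2.71 | 1.06 | 3.24 | 1.98 | 0.74 | 4.42 | 2.19 | 2.87 | 7.94 | 1.74 |
| 2019 | 2.02 | 1.12 | 1.40 | 3.02 | 1.30 | 3.67 | 6.34 | 7.78 | 5.52 | 0.84 | 1.18 | 1.06 | 3.22 | 1.78 | 1.73 | 4.53 | 8.39 | 1.84 |
| 2020 | 1.59 | 0.83 | 5.49 | 4.25 | 3.03 | 2.75 | 0.89 | 0.13 | 4.38 | 0.39 | 4.16 | 0.73 | 2.05 | 1.61 | 0.03 | 2.86 | 1.14 | 2.14 |
| 2021 | 2.01 | 0.54 | 2.75 | 2.62 | 3.27 | 3.44 | 9.54 | 3.68 | 0.55 | 3.77 | 2.56 | 0.43 | 3.86 | 0.22 | 1.51 | 0.13 | 0.15 | 1.01 |
| 2022 | 0.68 | 0.46 | 0.72 | 6.61 | 2.04 | 2.40 | 9.52 | 2.98 | 2.24 | 2.47 | 0.49 | 0.16 | 0.74 | 1.86 | 0.58 | 1.17 | 1.17 | 1.26 |
| 2023 | 2.48 | 0.60 | 4.41 | 7.31 | 0.38 | 1.20 | 4.80 | 0.83 | 2.11 | 3.44 | 1.38 | 2.46 | 1.41 | 3.95 | 0.18 | 0.16 | 2.05 | 1.42 |
| MAPE | 4.00 | 1.34 | 4.28 | 3.61 | 2.78 | 4.47 | 6.62 | 5.09 | 3.59 | 2.89 | 3.83 | 3.68 | 4.06 | 4.09 | 2.53 | 3.39 | 3.41 | 3.27 |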
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Xie, S.; Wong, W.K.; Lee, H.S.; Kuang, K.S. The Prediction of Tea Production Using Dynamic Rolling Update Grey Model: A Case Study of China. Mathematics 2025, 13, 3056. https://doi.org/10.3390/math13193056

AMA Style

Xie S, Wong WK, Lee HS, Kuang KS. The Prediction of Tea Production Using Dynamic Rolling Update Grey Model: A Case Study of China. Mathematics. 2025; 13(19):3056. https://doi.org/10.3390/math13193056

Chicago/Turabian Style

Xie, Suwen, Wai Kuan Wong, Hui Shan Lee, and Kee Seng Kuang. 2025. "The Prediction of Tea Production Using Dynamic Rolling Update Grey Model: A Case Study of China" Mathematics 13, no. 19: 3056. https://doi.org/10.3390/math13193056

APA Style

Xie, S., Wong, W. K., Lee, H. S., & Kuang, K. S. (2025). The Prediction of Tea Production Using Dynamic Rolling Update Grey Model: A Case Study of China. Mathematics, 13(19), 3056. https://doi.org/10.3390/math13193056
