1. Introduction
Rainfall is a key element of the hydrological cycle, and its spatiotemporal dynamics significantly influence agricultural production, water resource management, and flood-drought disaster warnings [1,2]. However, the rainfall process, affected by atmospheric circulation, topography, land–sea distribution, and other factors, exhibits marked non-stationarity, nonlinearity, and heteroskedasticity, making it challenging for traditional time series models to accurately capture its complex dynamics [3,4].
In the field of rainfall time series modeling, early research widely employed linear models such as the Autoregressive Integrated Moving Average (ARIMA) model [5,6] and its seasonal variant, SARIMA [7,8]. These models achieved some success in short-term weather forecasting but were limited in describing nonlinear phenomena such as extreme rainfall and seasonal abrupt changes [9]. With advancements in nonlinear theory, more flexible models have been introduced into meteorological research [10,11]. For instance, the Threshold Autoregressive (TAR) model [12] captures regime-switching behavior by setting transition thresholds, making it particularly suitable for describing rainfall state transitions under different weather systems [13]. However, the TAR model's transition at the threshold is discontinuous and non-differentiable, which contradicts the smooth transitions often observed in actual atmospheric processes.
To overcome this limitation, the Smooth Transition Autoregressive (STAR) model was developed and gradually applied to rainfall forecasting. By introducing continuous and differentiable transition functions (e.g., logistic or exponential functions) [14], the STAR model achieves smooth transitions between different climatic states, better aligning with the physical mechanisms of precipitation formation. Komorník [14] systematically elaborated the modeling framework of the STAR model, which subsequently demonstrated good performance in rainfall and temperature forecasting [15,16].
However, STAR-type models primarily focus on modeling the conditional mean and do not fully account for the volatility clustering and time-varying variance (i.e., heteroskedasticity) commonly found in rainfall series [17,18]. These characteristics often cause prediction intervals to deviate from the actual distribution, reducing the reliability of uncertainty assessment [19]. To improve interval forecasting accuracy, the Generalized Autoregressive Conditional Heteroskedasticity (GARCH) model [20] was introduced into hydro-meteorology to model the conditional variance of residual series. For example, Pandey et al. [21] combined SARIMA with GARCH to improve on the standalone SARIMA model. Han et al. [22] and Araya et al. [23] integrated deep learning with GARCH, achieving higher predictive accuracy and stability than traditional models.
In recent years, hybrid modeling frameworks have gained significant attention for their ability to simultaneously capture multiple characteristics of time series. Although STAR and GARCH models have shown excellent performance in their respective domains, research on their integration for rainfall forecasting remains limited, particularly in tropical monsoon regions of China. A hybrid LSTAR-GARCH framework employs LSTAR to capture nonlinear dynamics in the mean process and GARCH to model heteroskedasticity in the residuals, with the aim of enhancing the reliability of interval forecasts. Previous studies indicate the superiority of hybrid frameworks over individual models: Pandey et al. [21] combined linear SARIMA with GARCH and found its prediction intervals superior to those of standalone SARIMA. Han et al. [22] and Araya et al. [23] integrated deep learning with GARCH, achieving higher predictive stability. However, for rainfall series with pronounced regime-switching behaviors, linear mean models like SARIMA may still be inadequate. Compared to nonlinear machine learning models such as SVR, the LSTAR-GARCH model offers a more transparent probabilistic structure that naturally lends itself to uncertainty quantification. Furthermore, the GARCH model generally captures volatility persistence more effectively with fewer parameters than the simpler ARCH specification [20]. This advantage was verified in Guo et al. [19] for groundwater level forecasting, where GARCH-type models yielded more accurate and less conservative prediction intervals than ARCH.
Therefore, this study proposes a hybrid modeling framework based on Box–Cox transformed LSTAR-GARCH for point and interval forecasting of monthly rainfall in Hainan Province. The Box–Cox transformation is used to improve data normality, the LSTAR model captures nonlinear smooth transition behaviors in rainfall mechanisms, and the GARCH model describes the heteroskedastic structure in the residuals, ultimately achieving more reliable point and interval forecasts. This paper also compares the model with the LSTAR-ARCH model to verify the advantage of the GARCH structure in reducing predictive uncertainty. The findings aim to provide a new modeling approach for rainfall forecasting in tropical regions and offer a scientific basis for regional water resource risk management.
2. Study Area and Data
Hainan Province is located in the southernmost part of China, with geographical coordinates ranging from 18°10′ to 20°10′ N and 108°37′ to 111°03′ E, covering a total area of approximately 33,900 square kilometers. The region has a tropical monsoon marine climate, with an average annual temperature of 22.5–25.6 °C and annual precipitation generally between 1500 and 2500 mm. The western coastal areas receive less rainfall, about 1000 mm, with rainfall concentrated mainly in the summer. This study utilizes monthly rainfall data from four meteorological stations: Haikou (59758), Dongfang (59838), Danzhou (59845), and Qiongzhong (59849), spanning 252 months from January 1999 to December 2019. The four stations were selected deliberately to capture the primary precipitation modalities across Hainan Island, spanning coastal to inland areas and humid to semi-arid regimes, thereby providing a robust test for the proposed model. Haikou represents the northern coastal area, significantly influenced by the East Asian monsoon and tropical cyclones. Dongfang represents the western dry-hot zone, situated on the leeward slope with relatively low annual precipitation, serving as a key site for testing the model's performance in relatively dry conditions. Danzhou represents the northwestern region, characterized by a transitional climate between coastal and inland areas. Qiongzhong represents the central mountainous area, where notable orographic uplift effects make it one of the precipitation centers of the island. This "North–West–Northwest–Central" spatial arrangement (as shown in Figure 1) provides a basis for evaluating the model's adaptability under different climatic conditions. The data were obtained from the National Meteorological Science Data Center (http://data.cma.cn, accessed on 12 December 2024). To evaluate model performance, the data from each station were divided sequentially into a training set and a testing set, with the training set containing the first 80% of the data and the testing set the remaining 20%.
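The sequential 80/20 partition described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `sequential_split` is hypothetical, and rounding the training length down is an assumption, since the text does not state how the boundary month was chosen.

```python
def sequential_split(series, train_fraction=0.8):
    """Split a time-ordered sequence into a leading training part and a trailing testing part."""
    # Rounding down is an assumption; the source only says "first 80%".
    n_train = int(len(series) * train_fraction)
    return series[:n_train], series[n_train:]


# Placeholder index values standing in for 252 monthly observations.
monthly_index = list(range(252))
train, test = sequential_split(monthly_index)
```

No shuffling is applied, so every testing month lies after every training month, which is what a one-step-ahead evaluation requires.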
The time series of monthly rainfall at each station is shown in Figure 2, and the corresponding descriptive statistics are presented in Table 1. It can be observed that the statistical characteristics of the training and testing periods are generally consistent across stations, indicating a reasonable division of the dataset. This ensures that the model built during the training period can be effectively applied to the testing period for prediction.
3. Methods
This section details the hypothesis tests and data preprocessing methods used, reviews the fundamentals of the LSTAR and GARCH models, describes the process for point and interval forecasting using the LSTAR-GARCH model, and finally outlines the model performance evaluation metrics.
3.1. Hypothesis Testing
Before constructing the model, the KPSS test [24], the BDS test [20], and a skewness test are performed to examine the non-stationarity, nonlinearity, and skewness of the monthly precipitation series. The null hypothesis of the KPSS test is that the time series is stationary. The null hypothesis of the BDS test is that the data are linear. The skewness test assesses whether the time series follows a normal distribution by calculating the skewness coefficient of the distribution. For detailed descriptions of the KPSS and BDS tests, please refer to Shin et al. [24] and Brock et al. [20], respectively.
3.2. Data Partitioning and Transformation
The complete dataset was divided into a training set $\{x_t\}_{t=1}^{n}$ and a testing set $\{x_t\}_{t=n+1}^{n+m}$. If the time series exhibits skewness, a Box–Cox transformation is applied to the training set prior to modeling to approximate a Gaussian distribution. The Box–Cox transformation is defined as:
$$y = \begin{cases} \dfrac{x^{\lambda} - 1}{\lambda}, & \lambda \neq 0 \\ \ln x, & \lambda = 0 \end{cases} \tag{1}$$
where $x$ is the original variable, $y$ is the transformed variable, and $\lambda$ is the Box–Cox coefficient.
Applying Equation (1) yields the transformed training set $\{y_t\}_{t=1}^{n}$.
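As a hedged illustration of the transformation and its inverse (used later to return forecasts to the original scale), the following sketch implements the standard one-parameter Box–Cox form. The function names are hypothetical, and the handling of zero-rainfall months (the transform requires strictly positive input) is not described in the text, so the sketch simply rejects non-positive values.

```python
import math


def box_cox(x, lam):
    """Standard one-parameter Box-Cox transform of a strictly positive value."""
    if x <= 0:
        # Assumption: the caller shifts or filters zero-rainfall months beforehand.
        raise ValueError("Box-Cox requires x > 0")
    if lam == 0:
        return math.log(x)
    return (x ** lam - 1.0) / lam


def inv_box_cox(y, lam):
    """Inverse transform; valid when lam * y + 1 > 0."""
    if lam == 0:
        return math.exp(y)
    return (lam * y + 1.0) ** (1.0 / lam)


rainfall_mm = 120.5          # illustrative value, not from the paper's data
lam = 0.3                    # illustrative coefficient, not an estimate from the paper
transformed = box_cox(rainfall_mm, lam)
recovered = inv_box_cox(transformed, lam)
```

In practice the coefficient would be estimated from the training set only, so that no information from the testing period leaks into the transformation.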
3.3. The STAR Model
In the TAR model, regime switching occurs when a threshold variable exceeds or falls below a certain threshold. This switching is discontinuous. If the discontinuity at the threshold is replaced by a smooth function, the TAR model becomes a STAR model. Teräsvirta [12] systematically described the modeling strategy and application procedure for STAR models. The STAR model introduces a transition function to smooth the regime switching. This study employs a two-regime STAR model, as this configuration is generally sufficient to capture the nonlinear characteristics, such as regime switching, in most practical hydrological applications. For rainfall time series, the two regimes can correspond to "low-rainfall" and "high-rainfall" states, effectively capturing the smooth transition between dry and wet periods. From a theoretical perspective, Teräsvirta [12] demonstrated that the two-regime STAR model generally performs well in fitting hydro-meteorological data, while avoiding the over-parameterization and estimation complexities associated with multi-regime models. A two-regime STAR(p) model can be expressed as:
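$$y_t = \left(\phi_{1,0} + \sum_{i=1}^{p}\phi_{1,i}\,y_{t-i}\right)\left[1 - G(s_t;\gamma,c)\right] + \left(\phi_{2,0} + \sum_{i=1}^{p}\phi_{2,i}\,y_{t-i}\right)G(s_t;\gamma,c) + \varepsilon_t \tag{2}$$
where $\phi_{1,i}$ and $\phi_{2,i}$ are the autoregressive coefficients of the first and second regimes, respectively, $\varepsilon_t$ is the residual series, $G(s_t;\gamma,c)$ is the transition function, bounded between 0 and 1, $s_t$ is the transition variable, $\gamma$ is the smoothness parameter, and $c$ is the threshold parameter.
Alternatively, the two-regime STAR(p) model can be written as:
$$y_t = \phi_{1,0} + \sum_{i=1}^{p}\phi_{1,i}\,y_{t-i} + \left(\theta_{0} + \sum_{i=1}^{p}\theta_{i}\,y_{t-i}\right)G(s_t;\gamma,c) + \varepsilon_t \tag{3}$$
where $\theta_i = \phi_{2,i} - \phi_{1,i}$ for $i = 0, 1, \ldots, p$.
Different forms of the transition function correspond to different regime-switching behaviors. The two most common are the logistic function (Equation (4)) and the exponential function (Equation (5)), leading to LSTAR and ESTAR models, respectively.
$$G(s_t;\gamma,c) = \frac{1}{1 + \exp\left[-\gamma\,(s_t - c)\right]}, \quad \gamma > 0 \tag{4}$$
$$G(s_t;\gamma,c) = 1 - \exp\left[-\gamma\,(s_t - c)^2\right], \quad \gamma > 0 \tag{5}$$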
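To make the role of the transition function concrete, the sketch below evaluates the standard logistic and exponential transition functions and the resulting two-regime conditional mean. It is an illustration under stated assumptions rather than the authors' implementation: the function names and all numerical values are hypothetical, and coefficient lists are assumed to be ordered as intercept followed by lag coefficients.

```python
import math


def logistic_transition(s, gamma, c):
    """Logistic transition: increases from 0 to 1 and equals 0.5 at s == c."""
    z = gamma * (s - c)
    # Two algebraically equal forms, chosen to avoid overflow for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def exponential_transition(s, gamma, c):
    """Exponential transition: equals 0 at s == c and approaches 1 as |s - c| grows."""
    return 1.0 - math.exp(-gamma * (s - c) ** 2)


def star_mean(lags, phi1, phi2, g):
    """Two-regime STAR conditional mean.

    lags: [y_{t-1}, ..., y_{t-p}]; phi1, phi2: [intercept, coef_1, ..., coef_p];
    g: transition function value in [0, 1].
    """
    regime1 = phi1[0] + sum(a * y for a, y in zip(phi1[1:], lags))
    regime2 = phi2[0] + sum(a * y for a, y in zip(phi2[1:], lags))
    return regime1 * (1.0 - g) + regime2 * g


# Illustrative values only: one lag, transition variable equal to that lag.
g_mid = logistic_transition(5.0, gamma=2.0, c=5.0)
blend = star_mean([2.0], [1.0, 0.5], [3.0, 0.25], g_mid)
```

When the transition value is 0 the mean follows the first regime alone, and when it is 1 it follows the second; intermediate values blend the two, which is the smooth switching the text contrasts with the TAR model. A large smoothness parameter makes the logistic function approach a step, recovering TAR-like behavior.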
The linearity test for the STAR model uses an auxiliary regression, which also serves to determine the transition variable $s_t$ and the form of the transition function $G(\cdot)$.
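Additionally, the lag order $p$ of the STAR model is determined by the Akaike Information Criterion (AIC):
$$\mathrm{AIC} = n \ln \hat{\sigma}^2 + 2k \tag{6}$$
where $n$ and $\hat{\sigma}^2$ are the length of the data and the error variance, respectively, and $k$ is the number of estimated parameters, which increases with $p$.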
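The parameters $\psi = (\phi_1, \phi_2, \gamma, c)$ of the STAR model can be estimated using nonlinear least squares:
$$\hat{\psi} = \arg\min_{\psi} \sum_{t} \left[y_t - F(y_{t-1}, \ldots, y_{t-p}; \psi)\right]^2 \tag{7}$$
where $F(\cdot\,;\psi)$ is the conditional mean of the STAR model, i.e., the right-hand side of Equation (2) without $\varepsilon_t$.
A STAR model was fitted to the transformed training set $\{y_t\}$, producing simulated values $\hat{y}_t$. The residuals $\hat{\varepsilon}_t$ are calculated as:
$$\hat{\varepsilon}_t = y_t - \hat{y}_t \tag{8}$$
To validate the STAR model, the Ljung–Box test [25] was used to check the independence of the residuals. The null hypothesis of the Ljung–Box test is that the series is serially uncorrelated.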
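The residual check can be illustrated by computing the Ljung–Box Q statistic directly. The sketch below uses the standard definition of the statistic; the function names are hypothetical, and the p-value step (comparison with a chi-square distribution) is omitted so that the example stays within the Python standard library.

```python
def autocorrelation(x, k):
    """Sample autocorrelation of x at lag k."""
    n = len(x)
    mean = sum(x) / n
    denom = sum((v - mean) ** 2 for v in x)
    num = sum((x[t] - mean) * (x[t - k] - mean) for t in range(k, n))
    return num / denom


def ljung_box_q(x, max_lag):
    """Ljung-Box statistic Q = n (n + 2) * sum_k rho_k^2 / (n - k), k = 1..max_lag."""
    n = len(x)
    return n * (n + 2) * sum(
        autocorrelation(x, k) ** 2 / (n - k) for k in range(1, max_lag + 1)
    )


# A strongly alternating toy series: clearly not serially uncorrelated.
toy_residuals = [1.0, -1.0] * 10
q1 = ljung_box_q(toy_residuals, max_lag=1)
```

Under the null hypothesis of no serial correlation, Q is compared with a chi-square critical value; for one lag at the 5% level that value is about 3.84, so the toy series above would be rejected. For residuals from a fitted model, the degrees of freedom are usually reduced by the number of estimated parameters.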
3.4. The GARCH Model
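The GARCH$(P,Q)$ model expresses the conditional variance, $\sigma_t^2$, as a linear combination of $Q$ past squared residuals, $\varepsilon_{t-i}^2$, and the $P$ previous conditional variances, $\sigma_{t-j}^2$, which can be specified as follows:
$$\sigma_t^2 = \omega + \sum_{i=1}^{Q}\alpha_i\,\varepsilon_{t-i}^2 + \sum_{j=1}^{P}\beta_j\,\sigma_{t-j}^2 \tag{9}$$
$$\varepsilon_t \mid \mathcal{F}_{t-1} \sim N\left(0, \sigma_t^2\right) \tag{10}$$
where $\omega$ is a constant, $\alpha_i$ and $\beta_j$ are the coefficients of the GARCH$(P,Q)$ model, and $\mathcal{F}_{t-1}$ denotes the information available at time $t-1$.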
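For the GARCH model, the orders were determined by minimizing the AIC, and the parameter vector $\eta = (\omega, \alpha_1, \ldots, \alpha_Q, \beta_1, \ldots, \beta_P)$ was estimated by maximizing the log-likelihood function:
$$\ell(\eta) = \sum_{t=1}^{n}\ell_t(\eta) = -\frac{1}{2}\sum_{t=1}^{n}\left[\ln(2\pi) + \ln\sigma_t^2 + \frac{\varepsilon_t^2}{\sigma_t^2}\right] \tag{11}$$
To solve Equation (11), the BHHH [26] iterative method is used:
$$\eta^{(i+1)} = \eta^{(i)} + \kappa_i\left(\sum_{t=1}^{n}\frac{\partial \ell_t}{\partial \eta}\frac{\partial \ell_t}{\partial \eta^{\top}}\right)^{-1}\sum_{t=1}^{n}\frac{\partial \ell_t}{\partial \eta} \tag{12}$$
where $\eta^{(i)}$ is the estimate of $\eta$ in the $i$th iteration, $\kappa_i$ is a step length, and $n$ is the number of residuals. The update uses the first-order derivatives of $\ell_t$ with respect to $\eta$, evaluated at $\eta^{(i)}$; their outer product approximates the second-order derivatives. For more detail on parameter estimation in the GARCH model, see Bollerslev [20] and Engle [27]. Note that (a) $\omega > 0$, $\alpha_i \geq 0$, and $\beta_j \geq 0$ guarantee that the conditional variance of GARCH$(P,Q)$ is always positive, and (b) $\sum_{i=1}^{Q}\alpha_i + \sum_{j=1}^{P}\beta_j < 1$ suffices for wide-sense stationarity.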
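The variance recursion and the two parameter conditions above can be illustrated for the simplest GARCH(1,1) case. This is a minimal sketch, not the estimation procedure used in the study: the function name is hypothetical, the parameters are supplied rather than estimated, and starting the recursion from the unconditional variance is an assumption.

```python
def garch11_variance(residuals, omega, alpha, beta):
    """Conditional variances of a GARCH(1,1) model for a residual sequence.

    Returns a list one element longer than `residuals`: entry t is the variance
    attached to residuals[t], and the last entry is the one-step-ahead forecast.
    """
    if omega <= 0 or alpha < 0 or beta < 0:
        raise ValueError("positivity conditions violated")
    if alpha + beta >= 1:
        raise ValueError("stationarity condition alpha + beta < 1 violated")
    # Assumption: initialise with the unconditional variance omega / (1 - alpha - beta).
    sigma2 = [omega / (1.0 - alpha - beta)]
    for eps in residuals:
        sigma2.append(omega + alpha * eps ** 2 + beta * sigma2[-1])
    return sigma2


# Illustrative residuals and parameters, not values from the paper.
variances = garch11_variance([1.0, -2.0], omega=0.1, alpha=0.2, beta=0.7)
```

A large residual raises the next conditional variance through the alpha term, and the beta term carries that elevated variance forward, which is how the model represents volatility clustering. Setting beta to zero reduces the recursion to an ARCH(1) model.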
To check whether the residual series behaves as independent and identically distributed, the ARCH test was applied to the residuals of the LSTAR model to detect ARCH effects. The null hypothesis of the ARCH test is that no ARCH effects are present. If ARCH effects are found, a GARCH model is established to remove these effects and estimate the time-varying conditional variance, $\sigma_t^2$. The residuals of the resulting LSTAR-GARCH model should be tested again to confirm that the ARCH effects have been eliminated.
3.5. Model Forecasting
A one-step-ahead forecasting strategy was employed for the testing period. First, the fitted LSTAR-GARCH model was used to obtain the forecast mean, $\hat{y}_t$, and forecast variance, $\hat{\sigma}_t^2$, for the transformed testing data. These were then transformed back to the original scale using the inverse Box–Cox transformation to obtain the final forecasts, $\hat{x}_t$.
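In addition to point forecasts, interval forecasts were constructed using symmetric probability intervals, $[L_t, U_t]$. At the 95% confidence level, the upper and lower forecast bounds, $U_t$ and $L_t$, are given by:
$$L_t = F_t^{-1}(0.025) \tag{13}$$
$$U_t = F_t^{-1}(0.975) \tag{14}$$
where $F_t^{-1}(0.025)$ and $F_t^{-1}(0.975)$ are the 2.5% and 97.5% quantiles of the cumulative distribution function (CDF), respectively.
For the LSTAR-GARCH model, the intervals are based on the residuals. $U_t$ and $L_t$ are calculated as:
$$U_t = \hat{y}_t + z_{0.975}\,\hat{\sigma}_t \tag{15}$$
$$L_t = \hat{y}_t + z_{0.025}\,\hat{\sigma}_t \tag{16}$$
where $z_{0.025}$ and $z_{0.975}$ are the 2.5% and 97.5% quantiles of the standard normal distribution.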
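One way to turn a forecast mean and variance into a point forecast and a 95% interval on the original scale is sketched below. Several choices here are assumptions not confirmed by the text: the residuals are taken to be conditionally Gaussian, the bounds are formed on the transformed scale and then back-transformed, and a lower bound that falls outside the range of the inverse Box–Cox transformation is clipped to zero rainfall. The function names are hypothetical.

```python
import math
from statistics import NormalDist


def inv_box_cox(y, lam):
    """Inverse Box-Cox; values outside the transform's range are clipped to 0 (assumption)."""
    if lam == 0:
        return math.exp(y)
    base = lam * y + 1.0
    return base ** (1.0 / lam) if base > 0 else 0.0


def interval_forecast(mean_t, var_t, lam, level=0.95):
    """Return (point, lower, upper) on the original scale.

    mean_t, var_t: one-step-ahead forecast mean and variance on the transformed scale.
    """
    z = NormalDist().inv_cdf(0.5 + level / 2.0)   # about 1.96 for level = 0.95
    half_width = z * math.sqrt(var_t)
    point = inv_box_cox(mean_t, lam)
    lower = inv_box_cox(mean_t - half_width, lam)
    upper = inv_box_cox(mean_t + half_width, lam)
    return point, lower, upper


# Illustrative inputs: log transform (lam = 0), forecast centred on 100 mm.
point, lower, upper = interval_forecast(math.log(100.0), 0.04, lam=0.0)
```

Because the inverse transformation is nonlinear, an interval that is symmetric on the transformed scale becomes asymmetric on the original scale, with a longer upper tail. This suits a right-skewed variable such as monthly rainfall.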
3.6. Model Performance Evaluation
3.6.1. Deterministic Forecast Evaluation Metrics
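The Relative Error (RE) and Nash-Sutcliffe Efficiency (NSE) coefficient were selected as evaluation metrics for deterministic forecast accuracy. They are calculated as follows:
$$\mathrm{RE} = \frac{1}{n}\sum_{t=1}^{n}\frac{\left|\hat{x}_t - x_t\right|}{x_t} \times 100\% \tag{17}$$
$$\mathrm{NSE} = 1 - \frac{\sum_{t=1}^{n}\left(x_t - \hat{x}_t\right)^2}{\sum_{t=1}^{n}\left(x_t - \bar{x}\right)^2} \tag{18}$$
where $n$ is the series length, $x_t$ is the observed value, $\hat{x}_t$ is the forecasted value, and $\bar{x}$ is the mean of the observed series.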
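The two deterministic metrics can be computed as in the sketch below. The NSE follows its standard definition. The exact form of RE used in the study is not recoverable from the text, so the mean absolute relative error shown here is an assumption, as are the function names and the toy values.

```python
def nse(observed, forecast):
    """Nash-Sutcliffe Efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    mean_obs = sum(observed) / len(observed)
    squared_error = sum((o - f) ** 2 for o, f in zip(observed, forecast))
    variance_sum = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - squared_error / variance_sum


def mean_relative_error(observed, forecast):
    """Mean absolute relative error (assumed form of RE); requires non-zero observations."""
    return sum(abs(f - o) / o for o, f in zip(observed, forecast)) / len(observed)


# Toy values in mm, not taken from the paper.
obs = [100.0, 200.0, 300.0]
sim = [110.0, 190.0, 300.0]
nse_value = nse(obs, sim)
re_value = mean_relative_error(obs, sim)
```

An NSE above zero means the forecasts beat a constant forecast equal to the observed mean. Relative errors become unstable when observed rainfall is close to zero, which is worth keeping in mind for dry-season months.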
3.6.2. Probabilistic Interval Forecast Evaluation Metrics
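The Coverage Rate (CR) and Average Relative Interval Width (RIW) were selected as evaluation metrics for probabilistic interval forecasts. They are calculated as follows:
$$\mathrm{CR} = \frac{1}{n}\sum_{t=1}^{n}\mathbb{1}\left(L_t \leq x_t \leq U_t\right) \times 100\% \tag{19}$$
$$\mathrm{RIW} = \frac{1}{n}\sum_{t=1}^{n}\frac{U_t - L_t}{x_t} \tag{20}$$
where $n$ is the series length, $x_t$ is the observed value, $L_t$ and $U_t$ are the lower and upper bounds of the prediction interval, respectively, and $\mathbb{1}(\cdot)$ equals 1 when the condition holds and 0 otherwise.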
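The two interval metrics can be sketched as follows, taking CR as the fraction of observations inside their intervals and RIW as the average ratio of interval width to the observed value. The function names and toy values are hypothetical.

```python
def coverage_rate(observed, lower, upper):
    """Fraction of observations that fall inside their prediction interval."""
    hits = sum(1 for o, lo, up in zip(observed, lower, upper) if lo <= o <= up)
    return hits / len(observed)


def relative_interval_width(observed, lower, upper):
    """Average of (upper - lower) / observed; requires non-zero observations."""
    ratios = [(up - lo) / o for o, lo, up in zip(observed, lower, upper)]
    return sum(ratios) / len(ratios)


# Toy values in mm, not taken from the paper.
obs = [100.0, 200.0, 300.0, 400.0]
low = [90.0, 210.0, 280.0, 350.0]
high = [120.0, 260.0, 330.0, 450.0]
cr = coverage_rate(obs, low, high)
riw = relative_interval_width(obs, low, high)
```

The two metrics are read together: a coverage rate close to the nominal level with a smaller relative width indicates a sharper interval forecast, whereas a narrow interval with low coverage is overconfident.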
5. Discussions
This study constructed a hybrid LSTAR-GARCH model for point and interval forecasting of monthly rainfall series at four stations in Hainan Province and compared it with the LSTAR and LSTAR-ARCH models. The results show that the hybrid model performed excellently in interval forecasting but did not significantly outperform the standalone LSTAR model in point forecasting. The findings are discussed below in the context of relevant domestic and international research.
First, the point forecast results show that the LSTAR model achieved high NSE values (>0.75) at three stations (excluding Dongfang), indicating its ability to effectively capture the nonlinear characteristics of the rainfall series. This aligns with the conclusions of Teräsvirta [12] and Saha et al. [16], who found that STAR-type models have advantages in handling regime-switching time series. The poorer performance at Dongfang station can be attributed to its unique climatic and geographic conditions: located on the western leeward coast of Hainan, it lies in a rain shadow area with relatively low annual precipitation and is strongly influenced by the dry and hot foehn effect from the central mountains. Additionally, rainfall in this region is more susceptible to localized convective systems and tropical cyclone remnants, which are highly stochastic and less periodic, making them harder to capture with a univariate time-series model. These factors likely introduce higher unpredictability and structural breaks that the LSTAR model could not fully accommodate. This finding does not diminish the value of the model; rather, it delineates the conditions under which the model generalizes well and indicates where it needs further development. For typical tropical humid monsoon regions, the proposed framework shows direct potential for generalization.
Notably, incorporating the GARCH model did not improve point forecast accuracy, consistent with some previous findings (e.g., Guo et al. [19]). A possible reason is that while rainfall series exhibit volatility, it has minimal impact on the conditional mean. Since GARCH primarily models the conditional variance, its contribution to point forecast accuracy is limited. This suggests that for point forecasting tasks, optimization should focus more on the mean model. This explanation can be supported by the following short derivation.
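Consider the simplest STAR(1) form of Equation (2). The conditional mean of $y_t$ at the $t$-th time point, conditioned upon $y_{t-1}$ at the $(t-1)$-th time point, can be obtained as follows:
$$E\left(y_t \mid y_{t-1}\right) = E\left[\left(\phi_{1,0} + \phi_{1,1}y_{t-1}\right)\left(1 - G(s_t;\gamma,c)\right) + \left(\phi_{2,0} + \phi_{2,1}y_{t-1}\right)G(s_t;\gamma,c) + \varepsilon_t \,\middle|\, y_{t-1}\right] \tag{21}$$
From Equation (10), $\varepsilon_t$ is an independent random variable with mean 0 and time-varying variance $\sigma_t^2$. Therefore, the conditional mean can be expressed as follows:
$$E\left(y_t \mid y_{t-1}\right) = \left(\phi_{1,0} + \phi_{1,1}y_{t-1}\right)\left(1 - G(s_t;\gamma,c)\right) + \left(\phi_{2,0} + \phi_{2,1}y_{t-1}\right)G(s_t;\gamma,c) \tag{22}$$
From Equation (22), the conditional mean does not depend on $\sigma_t^2$. From Equation (9), $\sigma_t^2$ depends on the coefficients $\omega$, $\alpha_i$, and $\beta_j$, which are estimated by the GARCH model. Thus, the conditional mean is independent of the GARCH model. In other words, the GARCH model has no influence on the simulation and prediction of the mean behavior.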
In terms of interval forecasting, the LSTAR-GARCH model achieved high coverage rates (CR > 93%) and narrower relative interval widths (RIW) across all stations, markedly outperforming the LSTAR-ARCH model. This indicates that the GARCH model more effectively captures the volatility clustering in the residuals, leading to more precise prediction intervals. This result is consistent with the findings of Guo et al. [19] in runoff forecasting, further supporting the potential of GARCH-type models to improve uncertainty quantification in hydro-meteorological variables. The magnitude of the RIW obtained from the LSTAR-GARCH model (ranging from 0.065 to 0.130) can be justified from both methodological and practical perspectives. First, the RIW is a normalized metric calculated as the average ratio of the interval width to the actual observed rainfall. During dry seasons or at stations with lower mean rainfall (e.g., Dongfang), even a modest absolute interval width can result in a larger RIW value, as the denominator (observed rainfall) is small. Second, and more fundamentally, monthly rainfall in tropical monsoonal regions like Hainan is characterized by high volatility and a pronounced heavy-tailed distribution. Extreme rainfall events require prediction intervals wide enough to cover this substantial inherent uncertainty. The primary objective of interval forecasting is to achieve a high Coverage Rate (CR), and the model maintains CRs between 93.8% and 97.9% at the 95% confidence level. The resulting RIW values reflect the uncertainty required to attain this coverage and represent a trade-off between precision and reliability. Therefore, the RIWs generated by the LSTAR-GARCH model are justifiable and indicate that the model quantifies the variability in the rainfall series effectively.
Nevertheless, this study has some limitations. First, it used data from only four stations, limiting the sample representativeness. Second, the selection of transition variables and lag orders still relied on traditional methods; future work could incorporate machine learning techniques for automatic optimization. Additionally, external variables (e.g., ENSO, monsoon indices) were not considered, which could be an important direction for improving model performance. To address these limitations, future research should focus on the following directions: (1) Embedding physical mechanisms: Introducing exogenous strong signals such as sea surface temperature, wind fields, and humidity as transition variables or input features to enhance the physical consistency and predictive capability of the model in complex regions; (2) Spatially explicit modeling: Developing parameterized spatiotemporal STAR models to characterize the spatial heterogeneity of nonlinear precipitation dynamics in the form of continuous fields; (3) Hybrid framework integration: Exploring hybrid modeling approaches that combine the current framework with machine learning algorithms to overcome the limitations of a single model framework in addressing diverse climatic characteristics across entire basins.
In summary, the LSTAR-GARCH model shows promising application potential for interval forecasting of monthly rainfall, especially for series with significant heteroskedasticity. Future research could extend this model to a multivariate or spatial forecasting framework and incorporate more meteorological factors to further enhance forecast capability.
6. Conclusions
This study constructed a hybrid LSTAR-GARCH model for point and interval forecasting of monthly rainfall series and compared it with a hybrid LSTAR-ARCH model to validate the effectiveness of the LSTAR-GARCH approach. Based on the analysis of monthly rainfall data from Haikou, Dongfang, Danzhou, and Qiongzhong stations, the effectiveness of each model was evaluated, leading to the following main conclusions:
(1) For point forecasting, the LSTAR model performed well at all stations except Dongfang. A comparison of the point forecast performance between the hybrid LSTAR-GARCH model and the standalone LSTAR model showed that the evaluation metrics were identical, indicating that the GARCH component did not enhance the modeling and predictive performance of the LSTAR model.
(2) For interval forecasting at the 95% confidence level, compared to the LSTAR-ARCH model, the LSTAR-GARCH model produced much narrower relative interval widths while maintaining nearly identical coverage rates. Therefore, the LSTAR-GARCH model exhibits superior interval forecasting performance compared to the LSTAR-ARCH model, reducing the uncertainty of interval forecasts.
In conclusion, the results demonstrate that the LSTAR-GARCH model can achieve satisfactory point and interval forecast performance. We conclude that the GARCH model does not impact point forecasts but, when combined with the LSTAR model, yields better interval forecasting performance. Due to the non-stationary and nonlinear characteristics of hydrological series like rainfall, the model used in this study is highly recommended for other hydrological and climatic variables in changing environments. However, note that this study considered rainfall data from only four stations. Therefore, future research should consider applying these models to other study regions with different climates and backgrounds for further comparison and demonstration.