Prediction of Autumn Precipitation over the Middle and Lower Reaches of the Yangtze River Basin Based on Climate Indices

: Autumn precipitation (AP) has important impacts on agricultural production, water conservation, and water transportation in the middle and lower reaches of the Yangtze River Basin (MLYRB; 25 ◦ –35 ◦ N and 105 ◦ –122 ◦ E). We obtain the main empirical orthogonal function (EOF) modes of the interannual variation in AP based on daily precipitation data from 97 stations throughout the MLYRB during 1980–2015. The results show that the ﬁrst leading EOF mode accounts for 30.83% of the total variation. The spatial pattern shows uniform change over the whole region. The variance contribution of the second mode is 16.13%, and its spatial distribution function shows a north-south phase inversion. Based on previous research and the physical considerations discussed herein, we include 13 climate indices to reveal the major predictors. To obtain an acceptable prediction performance, we comprehensively rank the climate indices, which are sorted according to the values of the new standardized algorithm of information ﬂow (NIF, a causality-based approach) and correlation coe ﬃ cient (a traditional climate diagnostic tool). Finally, Tropical Indian Ocean Dipole (TIOD), Arctic Oscillation (AO), and other four indicators are chosen as the ﬁnal predictors a ﬀ ecting the ﬁrst mode of AP over the MLYRB; NINO3.4 SSTA (NINO3.4), Atlantic-European Circulation E Pattern (AECE), and other four indicators are the major predictors for the second mode. In the ﬁnal prediction experiment, considering the time series prediction of principal components (PCs) to be a small-sample problem, the Bayesian linear regression (BLR) model is used for the prediction. The experimental results reveal that the BLR model can e ﬀ ectively capture the time series trends of the ﬁrst two modes (the correlation coe ﬃ cients are greater than 0.5), and the overall performance is signiﬁcantly better than that of the multiple linear regression (MLR) model. The prediction factors and precipitation prediction results identiﬁed in this study can be referenced to rapidly obtain climatological information for AP over the MLYRB and improve the regional prediction of AP elsewhere, which will also help policymakers prepare appropriate adaptation and mitigation measures for future climate change.


Introduction
Over the past few years, changing precipitation patterns have been linked to climate change and attracted worldwide attention [1][2][3][4]. Studies have shown that extreme precipitation events have increased over many regions around the world, altering the hydrological cycle, ecological environment, and different society activities, such as agriculture and hydropower generation [5][6][7][8][9][10][11]. Even worse, according to numerical model simulations, these trends may continue into the next century [12].
For example, an analysis based on global-scale observation datasets shows that extreme precipitation has expanded significantly over many mid-latitude regions of the Northern Hemisphere and that areas affected by drought or severe drought have increased since the 1970s [13][14][15]. Therefore, it is important to analyze and predict changes in at different temporal and spatial scales around the world. Villarini et al. showed that increasing trends of heavy rainfall in the northern part of the central United States were related to expanded atmospheric moisture content due to rising temperatures [16]. Related research found no significant trends in annual precipitation from 1956 to 2013 over mainland China, but regional differences were noted. For instance, reduced precipitation was observed mainly in central, southeastern, southwestern and northeastern China, whereas increased precipitation was detected mainly in the middle and lower reaches of the Yangtze River Basin (MLYRB), the southeast coastal region, the Qinghai-Tibet Plateau, and northwestern China; at the same time, frequencies of extreme precipitation events in these regions have also increased significantly [17][18][19]. Studies have shown that the amount, intensity, and duration of extreme rainfall events in the MLYRB have all changed since the 1980s [17].
To improve the quality of predicting regional extreme precipitation events, it is necessary to evaluate the mechanisms responsible for these precipitation changes. Among the different aspects of these mechanisms, the influences of large-scale circulation modes, which have strong impacts on extreme precipitation events, are particularly important [20,21]. Cayan et al. suggested that the frequency distribution of daily precipitation in winter over the western United States shows a strong and systematic response to the EI Niño-Southern Oscillation (ENSO) phases [22]. Ning and Bradley also found that the ENSO phases have different influences on winter extreme precipitation events over the northeastern United States [23]. In China, many studies have pointed out that abnormal changes in precipitation are often affected by sea surface temperature (SST) anomalies. For example, the influence of ENSO on autumn precipitation (AP) is more significant than that in summer; during an El Niño year, the AP usually appears reduced over North China and intensified in southern regions [24]. In addition, during ENSO's decaying years, its impacts on China's climate have been promoted by the "capacitor effect" of the Indian Ocean SST [25]. Related studies have shown that the Tropical Indian Ocean Dipole (TIOD) negative phase contributes to strengthening of the Indo-Burma trough in autumn, which further promotes the transport of water vapor to the Indian Ocean and South China Sea [25,26]. Complex sea-air interactions can also affect AP in the Yangtze River Basin during autumn [27], for instance, the curvature of the summer Kuroshio Current axis, which is accompanied by large cold water masses in southern Japan. Correlation analysis has been extensively employed to assess the relationships between atmospheric circulation variability patterns and precipitation anomalies in China over the past 100 years [28]. The Southern Oscillation (SO), North Pacific Oscillation (NPO), and 500 hPa atmospheric circulation W pattern (AECW) are some examples of atmospheric circulation variability affecting mid-latitude and tropical weather conditions, including China precipitation [28]. Moreover, AP can be modulated by other low-frequency oscillations, such as the quasi-biennial oscillation (QBO) of the tropical stratosphere [29].
Although relevant research has revealed important relationships between different modes of climate variability and AP variations in China [25][26][27][28][29], the causal associations are still not fully understood. Furthermore, previous studies have focused on national or individual administrative divisions. In this study, we emphasize that there is more scientific and application value if assessing precipitation observations according to geographical characteristics. When examining the mechanisms modulating the large-scale circulation modes and related precipitation variability, physics-based regression modelling has been proposed to predict future precipitation events and effectively adopted to study summer precipitation forecasts in India, East Asia, and the MLYRB [30][31][32][33][34]. However, few related studies have used such methods to explore spatiotemporal changes in AP over the MLYRB or employed related models for quantitative predictions [34]. The MLYRB is not only a major agricultural base in China but also an area with strong societal and economic development [34]. Autumn is the season in which the East Asian summer circulation transitions to the winter circulation [35].
Many unstable factors may induce disasters, such as droughts, floods, and low temperatures [36]. Therefore, it is not only important to analyze the major drivers affecting AP over the MLYRB, but equally relevant to investigate the feasibility of predicting precipitation with an appropriate regression model. In this study, we analyze the spatiotemporal variations of AP over the MLYRB from 1980 to 2015. The information flow detection method is employed to extract potential preceding predictors and compared to time-lagged correlation analysis to assess the usefulness of the information flow algorithm. Finally, the Bayesian linear regression (BLR) model is used to predict AP over the MLYRB.

Data
(1) In this article, precipitation data were selected from 97 meteorological stations with data covering the period 1980-2015. These gauge stations were almost evenly distributed within and surrounding the MLYRB (25 • -35 • N and 105 • -122 • E, Figure 1). The meteorological data were provided by the National Climatic Data Center, National Meteorological Information Center, and the China Meteorological Administration; the quality of the precipitation data was evaluated before its release. Missing data accounted for less than 1% of the total data and were processed following Zhang et al. [37]: rainfall values are calculated as the weighted average of all gauge stations and the weight of each station is determined by kriging interpolation. The fractal dimension can well represent the regional heterogeneity of the network under investigation [38]. According to the relevant literature, the fractal dimension of the study area in this paper is about 1.5 [39], which is in the range progressively from 0 (when all stations are distributed on a single point or on isolated points) to 2 (when all stations are uniformly distributed) [38]. Therefore, according to the judgment standards proposed by Lovejoy et al., the characteristics of precipitation distribution over the MLYRB can be characterized by the data of these stations. In this study, the four seasons were defined as spring (March-May), summer (June-August), autumn (September-November), and winter (December-February). To better characterize seasonal precipitation variations, we computed the seasonal precipitation anomalies by removing the 36-year average seasonal climatology, corresponding to the 1980-2015 period, from seasonal raw values. Subsequent analysis is based on AP anomalies over the MLYRB.

Empirical Orthogonal Function (EOF) Analysis
EOF analysis was used to identify the leading modes of seasonal precipitation variability affecting the MLYRB. The EOF approach has been extensively adopted for extracting the modes  (2) A variety of monthly atmospheric circulation indices, such as the western Pacific subtropical high (WPSH) and QBO indices were sourced from the National Climate Center Climate System Diagnostic and Prediction Laboratory (NCC). Additionally, Niño 3.4 and Kuroshio Current SST (KCS) indices were calculated using the monthly Extended Reconstructed SST (ERSST) version 4 dataset [34] from the National Oceanic and Atmospheric Administration (NOAA) for the period 1980-2015 with a resolution of 2.5 • * 2.5 • . What needs to be emphasized here is that the indices and reanalysis data were analyzed at seasonal time-scale, and computed the seasonal anomalies for analysis.
(3) In the mechanism analysis section, monthly gridded data of 500 hPa geopotential height, 850 hPa wind horizontal components, and 700 hPa relative humidity were analyzed with a resolution of 2.5 • * 2.5 • from the National Centers for Environmental Prediction (NCEP) reanalysis dataset [40] for the period 1980-2015. We need to highlight that we computed the seasonal anomalies for reanalysis.

Empirical Orthogonal Function (EOF) Analysis
EOF analysis was used to identify the leading modes of seasonal precipitation variability affecting the MLYRB. The EOF approach has been extensively adopted for extracting the modes of variability explaining the largest portion of the total variance of the data [30][31][32][33]. Here, EOF analysis was conducted using AP anomalies over the MLYRB for the 1980-2015 period, resulting in a sample size of 36 cases.

Information Flow Causal Association Analysis Method
Pearson's correlation coefficient is a metric commonly used to evaluate the linear relationships between two variables, providing the direction of deviations [32]. However, high and low correlations do not indicate the existence or absence of causal association between elements, which should be a primary consideration in forecasting, modelling, and physical mechanism assessments. Recently, Liang showed that causality can be regulated by using a closed form solution (defined as information flow, IF) derived from maximum likelihood estimation [41][42][43][44]. IF not only has indicated the exchange of causal information in linear systems, but also shown superior characteristics to the Granger causality test and transfer entropy in the causal analysis of nonlinear systems [45]. To accurately measure the relative size of a detected causal relationship, Liang developed an IF standardization method. For two time series, x 2 and x 1 , the maximum likelihood estimator of the IF rate from x 2 to x 1 is [43]: Let . X j be the finite-difference approximation of dx j dt using the Euler forward scheme: .
where k = 1 or k = 2 and ∆t is the time step. The term C i,j in Equation (1) is the covariance between x i and x j . Ideally, if T 2→1 = 0, then x 2 does not cause x 1 ; otherwise, it is causal. In practice, a significance test needs to be performed. However, the results obtained in some cases would be very small (on the order of 0.001), which will not only make the contrast between the IF values of the variables inconspicuous, but also be inconsistent with our general intuitive understanding of standardized numerical ranges. Bai et al. proposed a standardization scheme with a new closed-form formula, which provided a reasonable solution to this limitation [46]. Moreover, Bai et al. verified the validity and robustness of this new standardized algorithm of information flow (NIF) through numerical simulation and forecasting experiments of tropical cyclone generation in the northwest Pacific Ocean. τ B 2→1 can be normalized as follows: τ B 2→1 measures the information flow from x 2 to x 1 in comparison to the stochastic processes. H noise 1 represents the role of random noise. For additional information regarding his calculation process see Bai et al. [46]. The range of the values for τ B 2→1 is (0,1). The larger the value, the sharper the effect of x 2 on x 1 and the stronger the causal association (note the directionality); for relatively small values, the weighed element is not a fundamental factor for the development of x 1 .

Bayesian Linear Regression Model
At present, climate predictions are usually produced and evaluated using physical and/or statistical models. The former are based on a set of differential partial equations and physical parameterizations representing different physical-dynamic processes in the atmosphere [47][48][49]. During recent years, such models have progressed in some aspects of precipitation predictions. However, owing to the nonlinearity of such physical models, errors generally increase rapidly for longer lead times, in particular for medium-and long-range forecasts [50]. To overcome the physical model's limitation, statistical forecasting has been widely used for developing climate precipitation predictions, for instance [51,52]. One of the most employed methods in this crescent field is the use of the multiple linear regression (MLR) model [31,33,34]. The output of the MLR model can be interpreted as the most likely point estimate given by the current dataset. Nevertheless, if we have only a small dataset, the credibility of the point estimate based on the maximum probability given by the current dataset will be very low. In this case, if we represent the estimate as a distribution of possible values and introduce a regression model based on Bayesian theory, we may reduce the error to some extent [53]. Considering that the time span of the seasonal data of the precipitation is not large, for example, the sample length in this article is only 36, which is a small-sample problem. The Bayesian linear regression (BLR) model in this paper can introduce both prior information and uncertainty when having a limited dataset or needing to use a priori knowledge in the model [54].

The Leading Modes of Autumn Precipitation Variability over the MLYRB
Changes in precipitation can indicate variations in the hydrological cycle on regional or global scale [37]. To evaluate the main characteristics of the AP variability over the MLYRB from 1980 to 2015, we employed an EOF analysis on the seasonal rainfall anomalies using data from the 97 observational stations over the MLYRB. Here, we focused on the two leading modes, which show distinct interannual variability patterns. A rule of thumb (North et al. [55]) was used to assess the degree of degeneracy among EOF modes.

First Leading Mode
The first leading mode accounts for 30.83% of the total variance of AP anomalies. According to North et al. [55], this mode is independent of higher order modes. The first eigenvector ( Figure 2a The first leading mode accounts for 30.83% of the total variance of AP anomalies. According to North et al. [55], this mode is independent of higher order modes. The first eigenvector ( Figure 2a

Second Leading Mode
The explained variance of the second mode is 16.13% and it is well separated from other EOF modes. AP anomalies exhibit a dipole-like pattern with a decreasing trend from southern to northern regions ( Figure 3a). The temporal evolution of the second principal component (PC2) shows a clear interannual variability throughout the analyzed period ( Figure 3b). The explained variance of the second mode is 16.13% and it is well separated from other EOF modes. AP anomalies exhibit a dipole-like pattern with a decreasing trend from southern to northern regions ( Figure 3a). The temporal evolution of the second principal component (PC2) shows a clear interannual variability throughout the analyzed period ( Figure 3b).
(a) The second eigenvector mode.

Extraction of the Main Modal Predictors of Autumn Precipitation Based on Information Flow Causality Detection
In this section, NIF is applied to the climate indices to extract the main climate predictors affecting the leading modes of AP variability over the MLYRB. Differently from a traditional statistical model, NIF focuses on identifying the most important predictors by understanding the causal relationships between potential predictors and predictands [46]. Both PCs (PC1 and PC2) are used to assess the most related climate indices that are causally associated with them. The correlation analysis (lead-lag correlation approach), which is commonly applied to

Extraction of the Main Modal Predictors of Autumn Precipitation Based on Information Flow Causality Detection
In this section, NIF is applied to the climate indices to extract the main climate predictors affecting the leading modes of AP variability over the MLYRB. Differently from a traditional statistical model, NIF focuses on identifying the most important predictors by understanding the causal relationships between potential predictors and predictands [46]. Both PCs (PC1 and PC2) are used to assess the most related climate indices that are causally associated with them. The correlation analysis (lead-lag correlation approach), which is commonly applied to evaluate climate predictions [33,34], is compared simultaneously to examine the effectiveness of the NIF method.

Potential Predictors for AP over the MLYRB
Based on previous studies, 13 climate indices (potential predictors) constitute the main scope for our investigation of predictors. They are classified as atmospheric circulation-, SST-and other slowly changing climate indices-based predictors.

Atmospheric Circulation Indices-Based Predictors
Many studies have shown that China precipitation anomalies are affected mainly by the western Pacific subtropical high (which controls warm and humid water vapor transport) and atmospheric circulation in the middle and high latitudes (which controls cold air transport from higher latitudes) [24,25,28,56,57]. In addition, atmospheric circulation conditions in the middle and high latitudes of the Northern Hemisphere can also be characterized by anomalies in the position and intensity of the 500 hPa ridges over the North Atlantic-Europe region (40 • -80 • N and 20 • W-70 • E). Five atmospheric circulation indices are selected to represent different tropical and extratropical circulation variability patterns. Table 1 shows the definition of each index analyzed here. It should be noted that the seasonal indices were analyzed only for summer season and obtained by computing 3-month mean using the corresponding monthly data.

Climate Index Definition
Atlantic-European Circulation W Pattern (AECW) Number of days in the current month showing W-type circulation [28] (the 500 hPa circulation pattern of the North Atlantic-Europe is divided into W, C, and E patterns according to the differences in the ridge position and intensity in the North Atlantic-Europe region; the W type is characterized by a straight westerly circulation and prevailing zonal circulation.)

Atlantic-European Circulation C Pattern (AECC)
Number of days in the current month showing C-type circulation (the C type is characterized by high-pressure ridges over the western coast of Europe, long wave troughs in the Ural Mountains and the development of a meridional circulation in Europe) [28].

Atlantic-European Circulation E Pattern (AECE)
Number of days in the current month showing E-type circulation. The characteristics of the E type are opposite to those of the C type. The Ural Mountains exhibit a high-pressure ridge, and the pressure increases with longitude over East Asia [28].

Arctic Oscillation (AO)
In the region of 20 • -90 • N and 0 • -360 • , this index signifies the normalized sequence of time coefficients of the first EOF mode of the 1000 hPa geopotential height anomaly field [28].

SST Indices-Based Predictors
Many studies have shown that the precipitation anomalies in China are often related to SST anomalies, especially in the tropical Pacific and Indian Oceans [25,27,28,56,57]. These studies have verified that El Niño, La Niña, the large cold water mass close to Japan (associated with the Kuroshio Current SST), Indian Ocean Dipole, and other climate phenomena affect China's precipitation through complex mechanisms. Among the six SST indices assessed here, two new indices were defined to clarify both the relationships between different SST anomalies distributions in the tropical Pacific Ocean and different China precipitation anomalous patterns. The first new SST index is the East-West index (EWI), which is the difference between the normalized Niño 3 index (defined as the regional average of monthly SST anomalies in the 5 • S-5 • N and 150 • -90 • W region, representing the tropical eastern Pacific) and the tropical western Pacific index (defined as the regional average of monthly SST anomalies in the 5 • S-5 • N and 110 • -160 • E region), reflecting the SST variability between the east and west tropical Pacific. The second new SST index is the East-Central index (ECI), which is the sum of the standardized Niño 3 index and the Niño 4 index (defined as the regional average of  Table 2 shows the definition of each SST index calculated from the ERSST dataset for summer season over the 1980-2015 period.

Potential Predictors of Low-Frequency Climate Signals
Some studies have shown that AP in China can be affected by other slowly changing climate indices, such as the QBO [29]. Other studies have also noted that abnormal solar activity may also affect precipitation in local areas [58,59]. Therefore, we included the QBO index (defined as the 30 hPa zonal wind average in the equatorial region) and total sunspot number index (TSN; relative number of sunspots) as the potential predictors to be also investigated. To eliminate subjective factors and irregular disturbances in the observations, a 12-month moving average of the relative number of sunspots is used; this dataset is also from the NCC.

Assessing the Predictors Affecting the First EOF Mode
In this section, we use the NIF to detect the causal relationship between each climate index and PC1. The reliability test follows the methodology proposed by Liang: if the standardized IF value is ≥ 0.01, the result is statistically significant with a 95% confidence level [43]. All results are listed in Tables 3-5, and it is necessary to indicate that the IF direction is transmitted from the column indicator to the line indicator.  [57]. The two new SST indices (EWI and ECI) have also a relatively large impact on PC1. By comparing the IF values of these indices, we can further verify that the difference between SST anomalies in the west and east tropical Pacific Ocean (EWI) have a more significant impact on the first EOF mode.
The same data are also evaluated using linear correlation analysis (Tables 6-8; correlation coefficient uses the F test [25,46]. Since the sample size of this article is 36, R ≥ 0.325 indicates statistically significant values at the 5% level.). With the exceptions of EWI, ECI, TIOD, and NINO3.4, most climate indices do not show statistically significant relationships, differing from previous studies. For example, Liu and Yu suggested that changes in AO have important influence on intensity and cold air flow path in autumn, and this cold air has a considerable impact on China AP [25]. Gu et al. verified that SIOD plays an important role in promoting the formation of TIOD [56]. However, correlation analysis cannot significantly reflect associations between these climate indices and PC1. In contrast, both indices are meaningful in the NIF assessment, which is in good agreement with previous studies. NIF can reflect time-lagged relationships between the equatorial Pacific SST anomalies and precipitation variability. For example, the IF from EWI to PC1 is 0.12 at the same season, which is smaller than the NIF value for one season later, as shown in Table 4. Moreover, this finding was not observed using correlation analysis (i.e., simultaneous correlation: 0.51; time-lagged correlation: 0.48).

Assessing the Predictors Affecting the Second EOF Mode
The results at Tables 9-14 show that NINO3.4, KCS, AECE, ECI, and QBO are the main climate indices affecting the second mode of AP variability over the MLYRB. Compared to the correlation coefficient assessment, the NIF can more accurately indicate which climate indices most affect PC2.
For example, when AECE is abnormal, the Ural Mountains exhibit a high-pressure ridge, and the pressure increases with longitude over East Asia, which usually indicates a cold air anomaly [28]; this may be an important reason for the second mode to show an antiphase north-south variation trend. Shi verified that SST variations in the Kuroshio Current can modulate AP over the Yangtze River Basin [27], which could explain the AP anomalies over northern regions from 29 • N.

Effectiveness of Predictors' Detection Based on Regression Analysis
To further verify the usefulness of the NIF, this section evaluates the two most significant climate indices related to PC1 variability (EWI, TIOD) by computing the linear regression between these indices and different dynamic and thermodynamic variables affecting China AP [57]. For comparison, similar regression analysis is also employed using the AECC index that showed weakest association with PC1. Regression coefficients are calculated using time-lagged anomalies with summer indices defined as predictors and AP as the predicted variable. In this article, we use the F test to test the significance of the regression coefficients. The F test is a test of the goodness of fit of the entire model, that is, the significance test of all variables to the explanatory variables [25]. Figures 4-6 show the variations in the regressed fields of the 700 hPa relative humidity, 850 hPa atmospheric circulation, and 500 hPa geopotential height with respect to the different climate indices. All of these regressed fields also support the conclusions provided by the NIF. As shown in Figure 4, due to the anomalous east-west distribution of the tropical Pacific SST, a strong westward airflow was stimulated over the tropical Pacific, and an easterly airflow occurred in the Indian Ocean at 850 hPa. After the eastward airflow crossed the Indo-China Peninsula, the airflow turned into a southwestern airstream and entered southwestern China (the dashed squares in Figures 4a and 5a). Furthermore, a large-scale anomalous southwesterly airflow over East China is observed. The 700 hPa relative humidity anomalies over the Indo-China Peninsula and Philippines (the dashed squares in Figures 4c and 5c) are verified to be most likely related to the 850 hPa atmospheric circulation anomalies. These mechanisms are relevant to produce AP anomalies over the MLYRB, especially in the southern region, which is in good agreement with the spatial distribution of the first EOF mode. In Figure 5, an easterly airflow was stimulated mainly over the tropical Indian Ocean owing to the TIOD. This easterly airflow was deflected to the north over the Indian Peninsula and then developed eastward, modulating a southwestern anomalous airflow over East China. Geopotential height anomalies were also enhanced in the India-Burma trough (the dashed squares in Figures 4b and 5b), but it must be highlighted that this field passed only the 90% confidence interval test and not the 95% confidence interval test. In Figure 6, the relative humidity and circulation fields showed weak regression coefficients, indicating that AECC has a negligible effect in modulating the first EOF mode. This finding is consistent with results presented in Table 3. Therefore, the proposed NIF assessment can accurately reveal the most important climate drivers affecting the leading mode of AP variability over the MLYRB.

Effectiveness of Predictors' Detection Based on Regression Analysis
To further verify the usefulness of the NIF, this section evaluates the two most significant climate indices related to PC1 variability (EWI, TIOD) by computing the linear regression between these indices and different dynamic and thermodynamic variables affecting China AP [57]. For comparison, similar regression analysis is also employed using the AECC index that showed weakest association with PC1. Regression coefficients are calculated using time-lagged anomalies with summer indices defined as predictors and AP as the predicted variable. In this article, we use the F test to test the significance of the regression coefficients. The F test is a test of the goodness of fit of the entire model, that is, the significance test of all variables to the explanatory variables [25]. Figures 4, 5 and 6 show the variations in the regressed fields of the 700 hPa relative humidity, 850 hPa atmospheric circulation, and 500 hPa geopotential height with respect to the different climate indices. All of these regressed fields also support the conclusions provided by the NIF. As shown in Figure 4, due to the anomalous east-west distribution of the tropical Pacific SST, a strong westward airflow was stimulated over the tropical Pacific, and an easterly airflow occurred in the Indian Ocean at 850 hPa. After the eastward airflow crossed the Indo-China Peninsula, the airflow turned into a southwestern airstream and entered southwestern China (the dashed squares in Figures 4a, 5a). Furthermore, a large-scale anomalous southwesterly airflow over East China is observed. The 700 hPa relative humidity anomalies over the Indo-China Peninsula and Philippines (the dashed squares in Figures 4c, 5c) are verified to be most likely related to the 850 hPa atmospheric circulation anomalies. These mechanisms are relevant to produce AP anomalies over the MLYRB, especially in the southern region, which is in good agreement with the spatial distribution of the first EOF mode. In Figure 5, an easterly airflow was stimulated mainly over the tropical Indian Ocean owing to the TIOD. This easterly airflow was deflected to the north over the Indian Peninsula and then developed eastward, modulating a southwestern anomalous airflow over East China. Geopotential height anomalies were also enhanced in the India-Burma trough (the dashed squares in Figures 4b, 5b), but it must be highlighted that this field passed only the 90% confidence interval test and not the 95% confidence interval test. In Figure 6, the relative humidity and circulation fields showed weak regression coefficients, indicating that AECC has a negligible effect in modulating the first EOF mode. This finding is consistent with results presented in Table 3. Therefore, the proposed NIF assessment can accurately reveal the most important climate drivers affecting the leading mode of AP variability over the MLYRB.    (b) 500 hPa geopotential height field.

Prediction Experiment of AP over MLYRB
To achieve an acceptable prediction performance, we first obtain two ranked lists of climate indices ordered according to the values of the NIF and correlation coefficients based on Tables 3-14. Then, as recommended by Song et al. [60], we selected the top k factors from these two ranking lists, where = √ * and is the number of potentially selected indices. In this study, the number of climate indices is 13, and k = 8. Thus, the top eight common indices in the two ranked lists are chosen as the final predictors. Following this procedure, KCS, TIOD, SIOD, EWI, WPSHA, and AO are the six most important predictors affecting PC1; NINO3.4, KCS, AECE, AECC, ECI, and QBO are the major predictors for PC2. Therefore, in this section, we analyze BLR prediction quality in coordination with these selected climatic factors and compare the BLR with the output of MLR, which is a common assessment in climate prediction. To reduce the impacts of the data from the current year on the forecasting results, when applying the MLR model, the regression forecast is performed with a cross-validation assessment excluding the target year. The BLR model can be simply expressed by Equation (4): where g is a constant term and SX is a set of predictors with a, b, c, d, e, and f being their corresponding parameters. The histograms of the regression parameters {a, b, c, d, e, f} are

Prediction Experiment of AP over MLYRB
To achieve an acceptable prediction performance, we first obtain two ranked lists of climate indices ordered according to the values of the NIF and correlation coefficients based on Tables 3-14. Then, as recommended by Song et al. [60], we selected the top k factors from these two ranking lists, where k = √ m * ln m and m is the number of potentially selected indices. In this study, the number of climate indices is 13, and k = 8. Thus, the top eight common indices in the two ranked lists are chosen as the final predictors. Following this procedure, KCS, TIOD, SIOD, EWI, WPSHA, and AO are the six most important predictors affecting PC1; NINO3.4, KCS, AECE, AECC, ECI, and QBO are the major predictors for PC2. Therefore, in this section, we analyze BLR prediction quality in coordination with these selected climatic factors and compare the BLR with the output of MLR, which is a common assessment in climate prediction. To reduce the impacts of the data from the current year on the forecasting results, when applying the MLR model, the regression forecast is performed with a cross-validation assessment excluding the target year. The BLR model can be simply expressed by Equation (4): where g is a constant term and SX is a set of predictors with a, b, c, d, e, and f being their corresponding parameters. The histograms of the regression parameters {a, b, c, d, e, f} are shown in Figures 7 and 8. The final regression results are shown in Figures 9 and 10, where the red area represents the 90% confidence interval predicted by BLR and the green solid line represents the interval average predicted by BLR. The multiple correlation coefficient (MCC) and root mean square error (RMSE) are introduced to measure the specific performance of the forecast results [46]. Regression assessment is calculated using time-lagged anomalies, which considers using summer indices to predict AP anomalies.
mean square error (RMSE) are introduced to measure the specific performance of the forecast results [46]. Regression assessment is calculated using time-lagged anomalies, which considers using summer indices to predict AP anomalies. According to the regression results for the time series of PC1 (Figure 9), the BLR model  Table 15, the prediction results for the first mode of the BLR model are more satisfactory than those of the MLR model; the MCC of the BLR model is 0.5299, and the RMSE is 0.8508, which are significantly better than those of the MLR model. The BLR model forecast quality, when evaluated by the MCC and RMSE scores, showed an improvement of around 40% compared to the MLR model. The related forecasting results also reflect how useful the NIF assessment is for extracting potential forecasting factors from the margins of the studied domain. Compared to the traditional scheme (MLR) [33,34], the BLR scheme better predicts the trend of interannual changes in AP over the MLYRB.
The prediction results for the PC2 time series (Figure 10) reveal that the BLR captured the upward trend in 1989-1991, the downward pattern in 1991-1992 and 2002-2003, the decreasing trend in 2013, and the increasing pattern in 2015. As seen from Table 16, the performance of the BLR is also satisfactory, with an MCC of 0.6727 and an RMSE of 0.7358. A comparison between BLR and MLR models shows that the former forecast quality, when evaluated by the MCC and RMSE scores, showed an improvement around 35% and 26% compared to the latter, respectively.     Figure 7Histograms of the BLR parameters of PC2 that map all uncertainty onto the parameter space (a-g refer to each regression coefficient).

Discussion
Although the prediction skills of AP over the MLYRB using the NIF-BLR model are superior, there are still some limitations. First, most of the climate indices proposed in this study were selected from previous research, which led to certain subjectivity in the evaluation process. As a future step, we will further develop research combining numerical forecasting model products to improve the quality of forecasts. Second, if the cycle of the regression model changes, this may lead to a decrease in the prediction skill. Thus, the prediction model  Table 15, the prediction results for the first mode of the BLR model are more satisfactory than those of the MLR model; the MCC of the BLR model is 0.5299, and the RMSE is 0.8508, which are significantly better than those of the MLR model. The BLR model forecast quality, when evaluated by the MCC and RMSE scores, showed an improvement of around 40% compared to the MLR model. The related forecasting results also reflect how useful the NIF assessment is for extracting potential forecasting factors from the margins of the studied domain. Compared to the traditional scheme (MLR) [33,34], the BLR scheme better predicts the trend of interannual changes in AP over the MLYRB. Table 15. Skill scores (in terms of the multiple correlation coefficient (MCC) and root mean square error (RMSE)) for the PC1 forecast by different models (bold font indicates the best performance). The prediction results for the PC2 time series (Figure 10) reveal that the BLR captured the upward trend in 1989-1991, the downward pattern in 1991-1992 and 2002-2003, the decreasing trend in 2013, and the increasing pattern in 2015. As seen from Table 16, the performance of the BLR is also satisfactory, with an MCC of 0.6727 and an RMSE of 0.7358. A comparison between BLR and MLR models shows that the former forecast quality, when evaluated by the MCC and RMSE scores, showed an improvement around 35% and 26% compared to the latter, respectively.

Discussion
Although the prediction skills of AP over the MLYRB using the NIF-BLR model are superior, there are still some limitations. First, most of the climate indices proposed in this study were selected from previous research, which led to certain subjectivity in the evaluation process. As a future step, we will further develop research combining numerical forecasting model products to improve the quality of forecasts. Second, if the cycle of the regression model changes, this may lead to a decrease in the prediction skill. Thus, the prediction model established in this study may not be fully applicable to other periods. Therefore, in the later period, we will carry out further research in combination with numerical forecasting models to improve the quality of forecasts. One way to undertake this evaluation is using the projection of the model's hindcast anomalies (we will select an appropriated climate model for providing hindcast data) onto the two observed leading eigenvectors (we already have this) to obtain the corresponding forecasted PC time series. Despite these potential limitations, the prediction factors and precipitation prediction results identified in this study will help us to better understand the variability in autumn precipitation over the MLYRB and improve the regional prediction of autumn precipitation elsewhere. These findings may also help policymakers and decision makers to prepare appropriate adaptation and mitigation measurements for future climate change.

Summary
This study investigated the characteristics of AP anomalies over the MLYRB by exploring the leading spatial-temporal modes and potential predictors driving the precipitation variability. The main conclusions are as follows: (1) Regarding EOF analysis, the MLYRB is a region with significantly varying AP. The contribution of the variance in the first leading mode is 30.83% and shows a monopole with different precipitation amounts over the whole region. The second mode explains 16.13% of the total variance, and its spatial distribution function is characterized by a meridional dipole. The time series of the first two PCs shows marked interannual variations, but weak interdecadal signals.
(2) To achieve an acceptable prediction performance, we firstly obtained two ranked lists of climate indices ordered according to the values of the NIF and correlation coefficients. Then, as recommended by Song et al. [60], we selected the top eight factors from the two ranked lists. Thus, the top eight common indices in the two ranked lists were chosen as the final predictors. Following this procedure, KCS, TIOD, SIOD, EWI, WPSHA, and AO are the six most important predictors affecting the first EOF mode of AP over the MLYRB, whereas NINO3.4, KCS, AECE, AECC, ECI, and QBO are the major predictors for the second mode.
(3) We considered the time series prediction of the first two PCs as a small-sample problem; therefore, the BLR model could be adopted. From the experimental results, BLR captured the PC1 and PC2 trends, and the overall performance was relatively satisfactory. Finally, the BLR demonstrates the ability to improve upon the MLR model.