Ensemble Forecasting Experiments Using the Breeding of Growing Modes with Perturbed Land Variables

: Although land surface inﬂuences atmospheric processes signiﬁcantly, insufﬁcient studies have been conducted on the ensemble forecasts using the breeding of growing modes (BGM) with perturbed land surface variables. To investigate the practicability of perturbed land variables for ensemble forecasting, we used the ARWv3 mesoscale model to generate ensembles for an event of 24 h heavy rainfall with perturbed atmospheric and land variables by the BGM method. Results show that both atmospheric and land variables can generate initial perturbations with BGM, except that they differ in time and saturation characteristics, e.g., saturation is generally achieved in approximately 30 h with a growth rate of ~1.30 for atmospheric variables versus 102 h and growth rate of 1.02 for land variables. With the increase in precipitation, the importance of the perturbations of land variables also increases as compared to those of atmospheric variables. Moreover, the inﬂuence of the perturbations of land variables on simulated precipitation is still relatively large, although smaller than that of atmospheric variables, e.g., the spreads of perturbed atmospheric and land subsets were 7.3 and 3.8 mm, respectively. The beneﬁts of perturbed initialisation can also be observed in terms of probability forecast. All ﬁndings indicate that the BGM method with perturbed land variables has the potential to ensemble forecasts for precipitation.


Introduction
One of the pioneering works of Lorenz [1] indicated the limit of deterministic prediction induced by the chaotic nature of the atmosphere; meanwhile, it was noted that this limit for the wintertime atmosphere was approximately two weeks, as determined from numerical studies of the general circulation models [2,3], in which the growth of small perturbations was investigated for various synoptic regimes.
Such experiments with perturbed initial conditions for numerical models have led to follow-up ensemble forecasting with various numerical models [4]. While Monte Carlo methods for generating initial conditions had been commonly used in the early stages of ensemble forecasting [5,6], substantial attention was later paid to methods for generating perturbations with dynamical features of the variable fields in the world-leading centres. For example, the method of breeding growing modes (BGM) [7][8][9][10][11][12] and the singular vector approach [13][14][15] were operationally applied in the National Centres for Environmental Prediction (NCEP) and Medium-Range Weather Forecasts (ECMWF), respectively.
Generally, control of the initial conditions in the ensemble runs is assumed to be the 'best' estimate of the state of the atmosphere, with the perturbed conditions representative of the analysis errors of the state. This process of generating perturbed conditions may induce a large amount of computational cost. While the BGM method is characterised by its relatively small computational expense (i.e., as compared with the singular vector approach), nevertheless, because the length of the breeding cycle is a critical factor, the determination of the growth rate and saturation time of the breeding mode are key elements. An experiment conducted at NCEP showed that initial perturbations can reach a growth rate of 1.4-1.6 times per day after 3-4 days of breeding, which is comparable to the growth rate of the actual forecast errors. However, Mitchell et al. [16] showed that the saturation time varies for different models and better models yield faster perturbation growth rates. Therefore, when using the BGM method, the time needed for the growth rate of a bred mode to reach saturation should be determined based on the type of the model. In fact, the saturation time can be used to assess the suitability of the BGM method for various perturbed variables.
With the BGM method being widely used in various global and regional numerical models [4,17], the perturbed variables (e.g., wind speed, air temperature, and relative humidity) have largely been focused on in combination with analysis of the saturation status of breeding cycles in the atmospheric context [17]; however, there has been an insufficient investigation on land surface variables (land surface temperatures and soil moistures).
Recently, some investigations produced new ensemble forecasts by combining atmospheric and land variables information [18][19][20][21][22]. For example, Beck et al. [18] presented a global 3-hourly precipitation dataset with a 0.25 spatial resolution, in which a correction for gauge under-catch and orographic effects was proposed by inferring precipitation from streamflow observations across the globe. Bhuiyan et al. [20] proposed a nonparametric statistical technique for combining precipitation datasets, in which a terrain elevation dataset was used as a part of inputs to quantile regression forests. Further, Bhuiyan et al. [21] evaluated a machine learning-based precipitation ensemble technique over three mountainous regions, in which global satellite precipitation datasets and an atmospheric re-analysis precipitation product with terrain information were integrated. Although these ensemble forecasts are generally based on statistical methods, corresponding results suggest the significance of using land information for atmospheric and hydrologic forecasts based on dynamic approaches (e.g., numerical modelling using general circulation models or regional weather models).
As land surface can also play an important role in affecting the atmospheric processes [23][24][25], the suitability of the land variables for generating perturbed initial conditions using the BGM method should be considered.
Hence, with the BGM perturbations of land surface variables included in a regional forecast system, the system would be characterised by its less computational expense and full consideration of both the atmospheric and land surface variables, compared with other forecast systems (e.g., those with only atmospheric variables perturbed using BGM or those using the singular vector approach). Further, more ensemble members could be generated to form a more representative ensemble for weather forecasts.
This paper aims to validate perturbed land surface variables for initial conditions using the BGM method and further explore the ensemble forecasts with the regional numerical model, ARWv3. In Section 2, we present the methods and data used in this study. In Section 3, we provide the results and discuss the ensemble of perturbed initial conditions. Finally, the conclusions are drawn in Section 4.

Weather Event Description
The weather event selected for this study is a heavy rainstorm that occurred in the middle and lower reaches of the Yangtze River in Southern China from 31 May to 1 June 2007. The interaction of cold and warm air masses coupled with the effect of a low vortex and a shear line resulted in a heavy rainfall process, which advanced from north to south along the middle and lower reaches of the Yangtze River from 30 May to 3 June 2007. From 31 May to 1 June, under the effect of a frontal cyclone with a convective cloud cluster at its tail end, several parts of southern Huanghuai and Jianghuai, and central and northern parts of Jiangnan experienced heavy to violent rain, with the heaviest precipitation occurring in the middle and lower reaches of the Yangtze River, where the convection was relatively strong. A large-scale 6 h precipitation event of more than 10 mm occurred in this region, with multiple effects as follows: (1) in the northern part of Chongqing, the 6 h precipitation reached 60 mm; (2) in Anqing City, Anhui Province, the precipitation reached 163 mm from the evening of 31 June to 8:00 on 1 June (Beijing time); and (3) Jiangxi Province experienced the heaviest precipitation since the beginning of the summer, with violent rain occurring in eight counties and torrential rain in five counties.

Experimental Design
In this study, the Weather Research and Forecasting (WRF)-Advanced Research WRF (ARW) V3 model was used along with the following parameterisation schemes (www.wrf-model.org, accessed on 26 November 2021): the WRF Single Moment class-3 (WSM3) microphysics, Rapid Radiative Transfer Model (RRTM) longwave radiation, Dudhia shortwave radiation, Yonsei University Planetary Boundary Layer (YSU PBL), Betts-Miller-Janjic cumulus parameterisation schemes, and the NOAH land surface model (NOAH LSM). In this attempt to use the breeding method for perturbing land surface variables, no domain nesting was used in the simulation. The simulation area consisted of 130 × 95 grid cells with a horizontal spacing of 15 km, centred at 30 • N and 113.5 • E ( Figure 1). The model, consisting of 31 uneven layers with the top-level at 50 hPa, used initial and boundary conditions extracted from the FNL analysis data (https://rda.ucar.edu/datasets/ds083.2, accessed on 26 November 2021), and the integration time step was set at 90 s with an hourly output for simulated results. along the middle and lower reaches of the Yangtze River from 30 May to 3 June 2007. From 31 May to 1 June, under the effect of a frontal cyclone with a convective cloud cluster at its tail end, several parts of southern Huanghuai and Jianghuai, and central and northern parts of Jiangnan experienced heavy to violent rain, with the heaviest precipitation occurring in the middle and lower reaches of the Yangtze River, where the convection was relatively strong. A large-scale 6 h precipitation event of more than 10 mm occurred in this region, with multiple effects as follows: (1) in the northern part of Chongqing, the 6 h precipitation reached 60 mm; (2) in Anqing City, Anhui Province, the precipitation reached 163 mm from the evening of 31 June to 8:00 on 1 June (Beijing time); and (3) Jiangxi Province experienced the heaviest precipitation since the beginning of the summer, with violent rain occurring in eight counties and torrential rain in five counties.

Experimental Design
In this study, the Weather Research and Forecasting (WRF)-Advanced Research WRF (ARW) V3 model was used along with the following parameterisation schemes (www.wrf-model.org, accessed on 26 November 2021): the WRF Single Moment class-3 (WSM3) microphysics, Rapid Radiative Transfer Model (RRTM) longwave radiation, Dudhia shortwave radiation, Yonsei University Planetary Boundary Layer (YSU PBL), Betts-Miller-Janjic cumulus parameterisation schemes, and the NOAH land surface model (NOAH LSM). In this attempt to use the breeding method for perturbing land surface variables, no domain nesting was used in the simulation. The simulation area consisted of 130 × 95 grid cells with a horizontal spacing of 15 km, centred at 30° N and 113.5° E ( Figure 1). The model, consisting of 31 uneven layers with the top-level at 50 hPa, used initial and boundary conditions extracted from the FNL analysis data (https://rda.ucar.edu/datasets/ds083.2, accessed on 26 November 2021), and the integration time step was set at 90 s with an hourly output for simulated results.

Perturbation Saturation Characteristics
In this study, the dynamic adjustment method of the BGM was adopted, i.e., during the breeding process, the magnitude of the output mode was adjusted after each cycle, and the adjusted mode was used as the input for the next breeding cycle. This dynamic adjustment process has been described in other studies [7][8][9][10][11][12]17] and can be summarised as follows: (1) superimpose a small perturbation on the initial analysis field; (2) integrate the model for 6 h from the perturbed and the unperturbed initial field (control forecast); (3) subtract the control forecast from the perturbed forecast to obtain the difference field caused by the perturbation; (4) scale down the obtained difference field to maintain the

Perturbation Saturation Characteristics
In this study, the dynamic adjustment method of the BGM was adopted, i.e., during the breeding process, the magnitude of the output mode was adjusted after each cycle, and the adjusted mode was used as the input for the next breeding cycle. This dynamic adjustment process has been described in other studies [7][8][9][10][11][12]17] and can be summarised as follows: (1) superimpose a small perturbation on the initial analysis field; (2) integrate the model for 6 h from the perturbed and the unperturbed initial field (control forecast); (3) subtract the control forecast from the perturbed forecast to obtain the difference field caused by the perturbation; (4) scale down the obtained difference field to maintain the same magnitude as that of the initial perturbation with respect to the root-mean-square error (RMSE); (5) superimpose the difference field on the next 6 h analysis field as described in (1); and (6) repeat steps (2)- (5). After a certain breeding period, when the perturbation growth rate is comparable to the actual forecast error growth, the initial perturbation field is obtained for the ensemble forecast.
The basic concept of the BGM method is that an arbitrary random perturbation is added to the analysis field so that after a breeding period, the initial perturbation field for the ensemble forecast can be obtained. Figure 2 is a schematic diagram for the BGM process, which differs from previous studies [7][8][9][10][11][12] in the land surface perturbations. same magnitude as that of the initial perturbation with respect to the root-mean-square error (RMSE); (5) superimpose the difference field on the next 6 h analysis field as described in (1); and (6) repeat steps (2)- (5). After a certain breeding period, when the perturbation growth rate is comparable to the actual forecast error growth, the initial perturbation field is obtained for the ensemble forecast.
The basic concept of the BGM method is that an arbitrary random perturbation is added to the analysis field so that after a breeding period, the initial perturbation field for the ensemble forecast can be obtained. Figure 2 is a schematic diagram for the BGM process, which differs from previous studies [7][8][9][10][11][12] in the land surface perturbations Therefore, determining when the perturbation has reached saturation is a critical step in the BGM method. The saturation level of a perturbation is generally determined based on its growth rate [7][8][9][10]. The concept of growth rate is briefly described below.
At the beginning of the breeding cycle, assuming that the initial perturbation and its magnitude with respect to the root mean square error (RMSE) is used as the magnitude of the breeding mode, the initial breeding mode is as follows: (1) where N denotes the number of grid points in the simulated area.
The output mode at the end of the tth breeding cycle is as follows: where p t f is the perturbation field and c t f is the control field.
The dynamic adjustment is applied in each step of the breeding cycle; hence, the magnitude of the input mode throughout the entire breeding process is the same as the magnitude of the initial mode with respect to the RMSE; therefore, the growth rate of the breeding mode E(Mt) in step t is as follows: The dynamic adjustment coefficient is defined as: The input mode of the next cycle is as follows: Therefore, determining when the perturbation has reached saturation is a critical step in the BGM method. The saturation level of a perturbation is generally determined based on its growth rate [7][8][9][10]. The concept of growth rate is briefly described below.
At the beginning of the breeding cycle, assuming that the initial perturbation and its magnitude with respect to the root mean square error (RMSE) is used as the magnitude of the breeding mode, the initial breeding mode is as follows: where N denotes the number of grid points in the simulated area.
The output mode at the end of the tth breeding cycle is as follows: where f p t is the perturbation field and f c t is the control field. The dynamic adjustment is applied in each step of the breeding cycle; hence, the magnitude of the input mode throughout the entire breeding process is the same as the magnitude of the initial mode with respect to the RMSE; therefore, the growth rate of the breeding mode E(M t ) in step t is as follows: The dynamic adjustment coefficient is defined as: The input mode of the next cycle is as follows: In the present experiment, the breeding process was initiated at 0 h (UTC) on 26 May 2007; during the 120 h breeding process, and the growth mode was adjusted every 6 h.
In the BGM method, only variables can be bred to find perturbations that contribute to the forecast; therefore, constant parameters were not used. In the present experiment, we Atmosphere 2021, 12, 1578 5 of 18 selected the following atmospheric variables: zonal wind field (u), meridional wind field (v), water vapour mixing ratio (q v ), and atmospheric temperature field (T), as well as the following land surface variables: total soil moisture (SMO) and surface skin temperature (TSK).
The initial random error fields (initial mode) of the atmospheric and land-surface variables are calculated as: where p(m x , m y , m l ) is an initial perturbed value at the beginning of the growth cycle for the grid point (m x , m y , m l ), E(m l ) is the RMSE of each level of the 24 h control forecast, and ω is the control coefficient that controls the magnitude of the initial mode. As the present model consisted of four soil layers of the Noah Land Surface Model (Noah LSM) and 31 vertical levels of the atmospheric model, the values of the maximum of m l for the land surface variables SMO and TSK were 4 and 1, respectively, and for m l of the atmospheric variables, the maximum was set to 31. The expression random(m x , m y , m l ) represents a set of random numbers distributed uniformly between [−1,1] when the variables are initially perturbed. Both the atmospheric and land surface variables should be limited to a range of values that have a physical meaning. When the time reaches the forecast starting point after several rounds of the breeding cycle, the breeding process is over, and the simulated state is regarded as the initial state for the model forecast, i.e., the model can then be integrated over a continuous period of the lead time for the forecast (i.e., 24 h in this study).
Multiple studies have investigated various methods of selecting ω [17] and concluded that if the selected ω is too large, the perturbations will not be able to grow, making it difficult to determine when the growth rate has reached saturation. Likewise, if ω is too low, the growing mode will take too long to reach saturation. This study aims to determine whether slowly evolving variables, such as land surface variables, are suitable for the BGM method. Therefore, the effect of different ω values on the perturbation saturation characteristics is not discussed here. Based on other studies [8][9][10]17], we conducted a large number of experiments. These experiments revealed that for the atmospheric variables, when the RMSE (ω = 1) of the 24 h forecast is used directly for perturbation, the breeding mode can rapidly reach saturation with evident saturation characteristics. For the land surface variables, the RMSE of the 24 h forecast is also suitable for the TSK; while for the SMO, the RMSE of the 24 h forecast should be multiplied by 0.25 (ω = 0.25).

Ensemble Forecast Experiment
Based on our analysis (see Section 3), it was found that the initial perturbations of the atmospheric and the land surface variables can reach saturation after 96 h of breeding. Hence, in this experiment, the perturbations obtained after 96 h of breeding were superimposed on the initial field at 00:00 h on 31 May 2007 for a 24 h integration, which led to the following ensemble forecasts. Figure 3 presents a schematic representation of the WRF forecast process for this study, where the forecasts of perturbing atmospheric/land variables and the forecast with unperturbed variables are included (corresponding to the ATM/LSM subset and the control test, respectively). The atmospheric variables u, v, T, and q v were simultaneously superimposed with the initial perturbations and subjected to 17 independent breeding cycles of 96 h. The resulting 17 perturbations were then added to the initial field at 00:00 h on 31 May 2007, for a 24 h integration in 17 individual tests. The results of this portion of the experiment belong to the ATM subset (i.e., the set of 17 individual experiments with the atmospheric variables perturbed for the 24 h integrations); for example, ATM17 represents the 17th test of the initial perturbation of the atmospheric variables. Similarly, the initial perturbations added to the land surface variables SMO and TSK were bred 17 times and then subjected to a 24 h integration in 17 independent tests, which belong to the LSM subset (i.e., the set of 17 individual experiments with the land surface variables perturbed for the 24 h integrations); for example, LSM10 represents the 10th test of the initial perturbation of the land surface variables. In addition, a 24 h integration with unperturbed initial values was run from 00:00 h on 31 May to 00:00 h on 1 June as the control test. Table 1 lists the designed ensemble subsets as compared with the control test. Section 3 analyses the simulation results from 00:00 h on 31 May to 00:00 h on 1 June. As no domain nesting was used in the model, a smaller area (25.5 • N-34.5 • N, 105.5 • E-122.5 • E) was used in the simulation (i.e., D 2 ; Figure 1) to facilitate analysis. Unless otherwise specified, the selected area was used in the following analysis. the atmospheric variables. Similarly, the initial perturbations added to the land surface variables SMO and TSK were bred 17 times and then subjected to a 24 h integration in 17 independent tests, which belong to the LSM subset (i.e., the set of 17 individual experiments with the land surface variables perturbed for the 24 h integrations); for example, LSM10 represents the 10th test of the initial perturbation of the land surface variables. In addition, a 24 h integration with unperturbed initial values was run from 00:00 h on 31 May to 00:00 h on 1 June as the control test. Table 1 lists the designed ensemble subsets as compared with the control test. Section 3 analyses the simulation results from 00:00 h on 31 May to 00:00 h on 1 June. As no domain nesting was used in the model, a smaller area (25.5° N-34.5° N, 105.5° E-122.5° E) was used in the simulation (i.e., D2; Figure 1) to facilitate analysis. Unless otherwise specified, the selected area was used in the following analysis.    In this study, the ensemble spread was used to quantitatively evaluate the differences between the ensemble members. The ensemble spread can be defined as [13]:

Perturbations at Model Levels
Atmosphere 2021, 12, 1578 where K is the number of ensemble members, f i represents an individual perturbed forecast, and f 0 is the reference field, which is generally the control forecast or the ensemble mean forecast. In this study, the ensemble spread is considered with respect to the ensemble mean forecast.

Equitable Threat Score (ETS)
To evaluate the impact of each perturbed variable (parameter) on the spatial distribution of precipitation, we use 24 h precipitation data of 379 active weather stations located within the simulation area D2 to obtain the Equitable Threat Score (ETS) of the precipitation forecast [25]. The ETS is calculated as follows: where , N A is the number of stations where precipitation was both simulated and actually occurred, N B is the number of stations where precipitation was simulated but no actual precipitation occurred, N C is the number of stations where precipitation occurred but was not simulated, and N D is the number of stations where no precipitation was simulated and no precipitation occurred.
The ETS considers the probability of correct forecasts due to random chance, and ETS values can range from 0.333 to 1; the ideal ETS value is 1, whereas a value of 0 indicates no forecast skill. In this study, we use the following four precipitation intensity levels: light rain (0.1-9.9 mm), moderate rain (10-24.9 mm), heavy rain (25-49.9 mm), and violent rain (50 mm and above) [26]. The ETS values were calculated for each of the above precipitation intensity levels.

Forecast Errors and Spatial Correlation Coefficient
To evaluate the overall effect of each perturbed variable (parameter) on the 24 h cumulative precipitation, we use the RMSE and the spatial correlation coefficient (CORR) to assess the model performance and the effectiveness of the ensemble forecast [27]. The RMSE and the CORR are obtained, respectively, using the following calculation formulas: where N is the total number of stations, M i is the simulated precipitation at the ith station, O i is the observed precipitation at the ith station, M is the mean simulated precipitation, and O is the mean observed precipitation.

Results and Discussion
In this section, for the perturbed experimental results, the saturation characteristics of the perturbed atmospheric variables are described before those of the land surface variables, because previous studies have investigated only atmospheric variables by the BGM method in the land-atmosphere coupled models [7][8][9][10][11][12]. Then, we present the results of ensemble forecasts based on the perturbed atmospheric and the land surface variables.

Atmospheric Variables
The ARW model consists of 31 vertical atmospheric levels. Hence, to show the growth and saturation characteristics of the bred modes more clearly, the following three levels (representing the bottom, middle, and upper levels, respectively) were used to analyse the perturbed atmospheric variables: η = 6 (L6;~1.5 km above ground level or AGL), η = 16 (L16;~5 km AGL), and η = 26 (L26;~8 km AGL).
Notably, the perturbation growth rate of both the zonal and the meridional wind field exceeded 1 after 12 h of breeding (Figure 4a,b). After each 6 h integration, the growth rate of the perturbed zonal wind field at L6, L16, and L26 oscillated around 1.3, 1.5, and 1.2, respectively. For the meridional wind speed field, the growth rate of the bred growing modes at L6, L16, and L26 oscillated around 1.5, 1.3, and 1.3, respectively. Although the perturbation growth rate of the wind speed field at each level exceeded 1 after 12 h of breeding, the initial perturbation of the wind field could be considered as saturated at this time, which is consistent with the result of Yu et al. [17]. water vapour mixing ratio field can reach saturation in the middle and lower layers, it is difficult to determine whether saturation had been reached at the upper levels (Level 21 and above). However, the magnitude of the water vapour mixing ratio at the upper level was relatively small (between 10-5 and 10-6), resulting in its low effect on precipitation. Therefore, it can be considered that the perturbation of the water vapour mixing ratio field reached saturation after 30 h of breeding.   Figure 5 shows the growth rates of the perturbed land surface variables SMO and TSK in each of the four soil layers of the Noah LSM. It is evident that in the third and fourth layers, the growth rate of the SMO increased to more than one and remained stable after 18 h of breeding, and then oscillated slightly around 1.02 and 1.04, respectively (Figure 5a). Although the growth rate of the breeding mode was relatively low, it was comparable with the value of 1.05 for the geopotential height field obtained by Guan and Zhang [28], using the T63L9 model. Therefore, we consider the perturbed SMO as saturated after 18 h of breeding. On the other hand, it is also evident that in the first and second layers, the initial perturbation of the SMO required more time to reach saturation;    Figure 4c shows the growth rate of the perturbed air temperature field. The growth rate of the perturbed variable for L6 gradually increased to more than 1 after 30 h of breeding and then oscillated around 1.1. The growth rate for both L16 and L26 exceeded 1 after 12 h of breeding and then oscillated around 1.3 and 1.4, respectively. The perturbed temperature field in the middle and upper layers generally reached saturation faster than that in the lower layer. After 30 h of breeding, the growth rate in each layer exceeded 1;

Land Surface Variables
Atmosphere 2021, 12, 1578 9 of 18 therefore, it can be considered that the time required for the perturbed air temperature field to reach saturation is 30 h. Figure 4d shows the growth rate of the perturbed water vapour mixing ratio field. The growth rate of the breeding mode for L6 gradually increased to more than 1 after 30 h of breeding and then oscillated around 1.2; for L16, the growth rate exceeded 1 after 24 h of breeding and then oscillated around 1.3; and in L26, the growth rate exceeded 1 after 30 h of breeding and then oscillated slightly above 1, making it difficult to determine whether saturation had been reached. Although after 30 h of breeding, the perturbed water vapour mixing ratio field can reach saturation in the middle and lower layers, it is difficult to determine whether saturation had been reached at the upper levels (Level 21 and above). However, the magnitude of the water vapour mixing ratio at the upper level was relatively small (between 10-5 and 10-6), resulting in its low effect on precipitation.
Therefore, it can be considered that the perturbation of the water vapour mixing ratio field reached saturation after 30 h of breeding. Figure 5 shows the growth rates of the perturbed land surface variables SMO and TSK in each of the four soil layers of the Noah LSM. It is evident that in the third and fourth layers, the growth rate of the SMO increased to more than one and remained stable after 18 h of breeding, and then oscillated slightly around 1.02 and 1.04, respectively (Figure 5a). Although the growth rate of the breeding mode was relatively low, it was comparable with the value of 1.05 for the geopotential height field obtained by Guan and Zhang [28], using the T63L9 model. Therefore, we consider the perturbed SMO as saturated after 18 h of breeding. On the other hand, it is also evident that in the first and second layers, the initial perturbation of the SMO required more time to reach saturation; the growth rate of the breeding mode required 96 h to stabilise above 1, then oscillated around 1.2 and 1.1, respectively. Therefore, we consider that the initial perturbation of the land variable SMO requires 96 h of breeding to reach saturation. Figure 5b shows that the growth rate of the TSK breeding mode required 72 h of breeding to exceed 1, after which it oscillated around 1.05, at which time it could be considered as saturated. The initial perturbations of the atmospheric variables could reach saturation relatively quickly, after 30 h of breeding, in the ARW model, and the perturbation growth rate was evident. The BGM method could additionally be used to generate the initial perturbations of land surface variables; however, the perturbations of land surface variables required more time to reach saturation than the atmospheric variables, generally requiring 102 h of breeding in this case (Figure 5a).

Ensemble Forecast Experiment
In this study, the BGM method was used to obtain 17 initial perturbation fields of the atmospheric variables (i.e., the ATM subset) and 17 initial perturbation fields of the land surface variables (i.e., the LSM subset), such that 34 total perturbations were su- The initial perturbations of the atmospheric variables could reach saturation relatively quickly, after 30 h of breeding, in the ARW model, and the perturbation growth rate was evident. The BGM method could additionally be used to generate the initial perturbations of land surface variables; however, the perturbations of land surface variables required more time to reach saturation than the atmospheric variables, generally requiring 102 h of breeding in this case (Figure 5a).

Ensemble Forecast Experiment
In this study, the BGM method was used to obtain 17 initial perturbation fields of the atmospheric variables (i.e., the ATM subset) and 17 initial perturbation fields of the land surface variables (i.e., the LSM subset), such that 34 total perturbations were superimposed on the initial field for a 24 h integration; a corresponding 24 h integration with no initial perturbation was used as a control experiment. Results show that in all experiments, the spatial distribution of the predicted precipitation was generally in agreement with the observed precipitation; e.g., the heavy precipitation zone in the middle and lower reaches of the Yangtze River was simulated well, except that the simulated rain belt extended further south compared with the observed (Figure S1; Supplementary Materials). At the same time, we found obvious differences in the location of the heavy precipitation centre and the regional characteristics of the simulated precipitation, suggesting that the 24 h cumulative precipitation forecast is sensitive to the initial perturbations of the land surface variables in the spatial distribution ( Figure S1), precipitation area ( Figure S2), and regional mean precipitation ( Figure S3). Theoretically, these perturbations of the land surface variables lead to the changes in latent and sensible heat fluxes (larger than 50 W m −2 over a large area, which is comparable to the differences induced by different land surface schemes by previous studies [26]; Figure S4), which further contribute to precipitation via the direct and indirect mechanisms, respectively, i.e., direct transportation of surface evaporation (which is proportional to latent heat flux) contributes to condensed water in the vertical, while indirect effect associated with the horizontal convergence of water vapour can provide even more moisture for precipitation [26], which is largely caused by the change in sensible heat flux [29].
To more concisely present the results of the experiment, in the following sections we limit our description to the parameter-based evaluation and the probability forecast results. Figure 6 shows that the spatial distribution of the ensemble spread is similar to the spatial distribution of the precipitation zone ( Figure S1), which means that the ensemble structure is reasonable to some extent, and the ensemble spread correctly reflects the uncertainty of the precipitation forecast. The high-value centre of the spread corresponds to the high-value centre of the 24 h precipitation forecast, suggesting that the forecast results are less satisfactory for the heavy-precipitation centre. In addition, the spread of the ATM subset is greater than that of the LSM subset, both in the magnitude of the precipitation centre and the range of the different precipitation intensity levels. As listed in Table 2 for evaluation results of the region-averaged precipitation, the spreads of the ATM and the LSM subsets were 7.32 and 3.82 mm, respectively, which indicates that the perturbed atmospheric variables have a greater effect on precipitation forecast than the perturbed land surface variables; however, the effect of the latter is also significant. This conclusion is consistent with the results of the preceding spatial distribution of precipitation (Figures S1-S3; Supplementary Materials) and is similar to the findings of other studies [30]. Additionally listed in Table 2, for the D 2 domain, there is an area about 68.6% (55.3%) of the total D 2 area that has significant differences between the control test and the ATM (LSM) subset according to the Student's t-tests (p = 0.10) with the time series of hourly precipitation (i.e., for each grid point, apart from 24 hourly precipitations of the control test, the time series are generated by the ensemble averages from 17 perturbations of the ATM or LSM subset with an hourly interval over the 24 h period), while the number is 52.7% of the total D 2 area for the differences between the ATM and LSM subsets. Considering the area that has little change in simulated precipitation, these values indicate quite large differences due to the perturbations. As perturbing atmospheric variables are commonly used in precipitation ensemble forecast, these results demonstrate that perturbing land variable is potentially significant for the ensemble forecast. control test, the time series are generated by the ensemble averages from 17 perturbations of the ATM or LSM subset with an hourly interval over the 24 h period), while the number is 52.7% of the total D2 area for the differences between the ATM and LSM subsets. Considering the area that has little change in simulated precipitation, these values indicate quite large differences due to the perturbations. As perturbing atmospheric variables are commonly used in precipitation ensemble forecast, these results demonstrate that perturbing land variable is potentially significant for the ensemble forecast.     Figure 7 shows the ETS values and their relative standard deviation (RSD) for different precipitation thresholds. Evidently, the initial perturbations of the atmospheric and land surface variables can cause large variations in the ETS values for different intensity-level precipitation. For precipitation above 0.1 mm, the maximum difference between the ETS scores of the tests in the ATM subset was 0.115 (ATM3 vs. ATM5), the maximum percentage difference was 22.53% (percentage of the control test ETS), and RSD of the ETS values was 0.051; and in the LSM subset, the maximum difference between the ETS values was 0.037 (LSM3 vs. LSM16), maximum percentage difference was 7.25%, and RSD was 0.020.

Evaluation Parameters Ensemble Spread
For the precipitation threshold of 10 mm, in the ATM subset, the maximum difference was 0.062 (ATM16 vs. ATM13), percentage difference was 28.3%, and RSD was 0.078; and in the LSM subset, the maximum difference was 0.036 (LSM5 vs. LSM3), percentage difference was 16.3%, and RSD was 0.047. For the precipitation threshold of 25 mm, in the ATM subset, the maximum difference was 0.095 (ATM16 vs. ATM9), percentage difference was 53.5%, and RSD was 0.127; and in the LSM subset, the maximum difference was 0.073 (LSM15 vs. LSM3), maximum percentage difference was 41.1%, and RSD was 0.097. For the precipitation threshold of 50 mm, in the ATM subset, the maximum difference was 0.083 (ATM13 vs. ATM12), maximum percentage difference was 75.6%, and RSD was 0.278; and in the LSM subset, the maximum difference was 0.100 (LSM10 vs. LSM6), maximum percentage difference was 91.1%, and RSD was 0.232.
These results indicate that the initial perturbations of the selected variables greatly affected the ETS scores of the precipitation forecast; as the precipitation intensity increased, the effect of the initial perturbations on the forecast ETS score increased. This implies that the initial perturbations of the selected variables have a relatively strong effect on the simulated precipitation, and their effect increases with the precipitation intensity. Notably, at all intensity thresholds, the initial perturbations had a greater effect on the ETS values of the ATM subset than those of the LSM subset, as is consistent with the above-mentioned subset spreads. However, as the precipitation intensity increased, the ETS RSD values of the LSM subset approached those of the ATM subset, indicating that as the precipitation intensity increases, the initial perturbations of the land surface variables become comparably important to the atmospheric variables. Figure 7 shows the ETS values and their relative standard deviation (RSD) for different precipitation thresholds. Evidently, the initial perturbations of the atmospheric and land surface variables can cause large variations in the ETS values for different intensity-level precipitation. For precipitation above 0.1 mm, the maximum difference between the ETS scores of the tests in the ATM subset was 0.115 (ATM3 vs. ATM5), the maximum percentage difference was 22.53% (percentage of the control test ETS), and RSD of the ETS values was 0.051; and in the LSM subset, the maximum difference between the ETS values was 0.037 (LSM3 vs. LSM16), maximum percentage difference was 7.25%, and RSD was 0.020. For the precipitation threshold of 10 mm, in the ATM subset, the maximum difference was 0.062 (ATM16 vs. ATM13), percentage difference was 28.3%, and RSD was 0.078; and in the LSM subset, the maximum difference was 0.036 (LSM5 vs. LSM3), percentage difference was 16.3%, and RSD was 0.047. For the precipitation threshold of 25 mm, in the ATM subset, the maximum difference was 0.095 (ATM16 vs. ATM9), percentage difference was 53.5%, and RSD was 0.127; and in the LSM subset, the maximum difference was 0.073 (LSM15 vs. LSM3), maximum percentage difference was 41.1%, and RSD was 0.097. For the precipitation threshold of 50 mm, in the ATM subset, the maximum difference was 0.083 (ATM13 vs. ATM12), maximum percentage difference was 75.6%, and RSD was 0.278; and in the LSM subset, the maximum difference was 0.100 (LSM10 vs. LSM6), maximum percentage difference was 91.1%, and RSD was 0.232. With respect to the ensemble forecast skill, Figure 7a shows that, in the ATM subset, when the precipitation threshold was 0.1 mm, the ensemble mean (test no. 18) ETS was 0.525, which is larger than that of 15 ensemble members; when the precipitation threshold was 10 mm, the ensemble mean ETS was 0.244, exceeding the ETS of 15 ensemble members; when the precipitation threshold was 25 mm, the ETS was 0.221, which is larger than that of 13 ensemble members; and when the precipitation threshold was 50 mm, the ETS was 0.050, exceeding the ETS of only one ensemble member. Although the ETS values of the ensemble mean in each subset were not superior for all precipitation thresholds, they were superior in most cases, which suggests that the results of the ensemble forecast are more stable than that of a deterministic forecast [28,30,31]. Figure 7b shows that, in the LSM subset, the ETS values of the ensemble mean (test no. 18) were also superior for most precipitation intensity thresholds, indicating that using perturbed land surface variables for ensemble forecast can improve the forecast skill.

RMSE and CORR
As shown in Table 2 for error analysis, region-averaged systematic errors (random errors) ranged between 32% and 38%, with the values of both subsets lower than that of the control test and the LSM value slightly higher than the ATM one, showing that ensemblemean forecast is better than the control one and that the LSM subset is comparable with the ATM subset. Correspondingly, normalised RMSE values indicate a slightly higher random error of the LSM subset than that of the ATM subset. Figure 8 shows more about the RMSE and the CORR values of the ensemble forecast of 24 h cumulative precipitation. As shown in Figure 8a,c, the maximum difference in the RMSE between the individual tests in the ATM subset was 1.76 mm (ATM13 vs. ATM16) and the RSD of the RMSE was 0.018; the maximum difference in the RMSE between the individual tests in the LSM subset was 1.41 mm (LSM2 vs. LSM13) and the RSD of the RSME was 0.015. Figure 8b,c shows that the maximum difference in the CORR values of the tests in the ATM subset was 0.059 (ATM17 vs. ATM11) and the RSD was 0.049; the maximum difference between the CORR values in the LSM subset was 0.072 (LSM10 vs. LSM2), and the RSD was 0.046. Similar to the ETS evaluation, the RSME and CORR evaluation results show that the initial perturbations of both the atmospheric and land surface variables can have a great impact on the precipitation forecast skill; however, the impact of the atmospheric variable perturbations is greater than those of the land surface variables.
Considering the RMSE and the CORR values of the 24 h cumulative precipitation, the RMSE of the ensemble mean of the ATM subset was 24.16 mm, which is generally smaller than that of all of its 17 members, and the ensemble mean CORR value was 0.426, which is generally greater than that of all of its 17 members. The RMSE of the ensemble mean of the LSM subset was 24.37 mm, which is smaller than that of 15 of its members, and the ensemble CORR value was 0.441, which is greater than that of 13 of its members. The RMSE and the CORR of the ensemble mean forecast of both subsets were superior to Similar to the ETS evaluation, the RSME and CORR evaluation results show that the initial perturbations of both the atmospheric and land surface variables can have a great impact on the precipitation forecast skill; however, the impact of the atmospheric variable perturbations is greater than those of the land surface variables.
Considering the RMSE and the CORR values of the 24 h cumulative precipitation, the RMSE of the ensemble mean of the ATM subset was 24.16 mm, which is generally smaller than that of all of its 17 members, and the ensemble mean CORR value was 0.426, which is generally greater than that of all of its 17 members. The RMSE of the ensemble mean of the LSM subset was 24.37 mm, which is smaller than that of 15 of its members, and the ensemble CORR value was 0.441, which is greater than that of 13 of its members. The RMSE and the CORR of the ensemble mean forecast of both subsets were superior to those of the individual ensemble member forecasts, suggesting that using the perturbed the atmospheric and the land surface variables to generate ensemble forecasts can improve the ensemble forecast skill. Figure 9 shows that the total area of the forecast probability >10% of the ATM subset is greater than that of the LSM subset for each intensity threshold, and the high-probability areas of the ATM subset are smaller than those of the LSM subset for each intensity threshold. The greater the perturbation of predicted precipitation, the less predictable with a deterministic forecast it becomes, suggesting that the precipitation probability will decrease, but its area will increase. Moreover, this shows that the influence of the perturbed atmospheric variables on the 24 h cumulative precipitation is greater than those of the land surface variables. The probability forecast map shows that in areas where the observed precipitation is >0.1 mm, the probability of each subset exceeds 90%. The probability forecast reflects the general area of precipitation, but the forecast area exceeds the observed precipitation area, and hence, the ensemble mean artificially expands the precipitation area. The areas with observed precipitation of >10 mm coincide well with the probability regions of >90% in both subsets; however, the predicted precipitation area is greater than the observed precipitation area. The areas with the observed precipitation of >25 mm and the regions with the probability of >70% are consistent in both subsets. Simultaneously, in both subsets, the regions with a probability of >90% and the corresponding observed precipitation areas are consistent; hence, the probability forecast for the precipitation intensity of >25 mm is effective. The areas with the observed precipitation of >50 mm and the regions with a probability of >30% are similar for both subsets; hence, these areas merit more attention in the actual forecast. In both subsets, the precipitation intensity in the regions with a probability of >70% was consistent with the corresponding observed precipitation. The above analysis shows that the probability forecast is more indicative of high-intensity precipitation events than those of lower intensity. It was also found that the probability forecast can be improved by using the initial perturbations of the land surface variables.

Precipitation Probability Forecast
The results of this study are consistent with those obtained by Wang et al. [30], both leading to the same conclusions with respect to the influence of the two methods on generating initial perturbations (initial conditions vs. various physics options) on precipitation and their relative importance. This shows that the findings of this study are not unique but have a certain commonality with respect to the influence of perturbed land surface variables (parameters) on precipitation and the ensemble forecast. It is noteworthy that, as discussed by Keil et al. [31], soil moisture heterogeneity appears to have the greatest influence on convection (and therefore precipitation) during weak synoptic forcing, which implies that our case study during strong synoptic forcing in the early summer is unlikely to overestimate the impact of soil moisture perturbation. similar for both subsets; hence, these areas merit more attention in the actual forecast. In both subsets, the precipitation intensity in the regions with a probability of >70% was consistent with the corresponding observed precipitation. The above analysis shows that the probability forecast is more indicative of high-intensity precipitation events than those of lower intensity. It was also found that the probability forecast can be improved by using the initial perturbations of the land surface variables.  The results of this study are consistent with those obtained by Wang et al. [30], both leading to the same conclusions with respect to the influence of the two methods on generating initial perturbations (initial conditions vs. various physics options) on precipitation and their relative importance. This shows that the findings of this study are not unique but have a certain commonality with respect to the influence of perturbed land surface variables (parameters) on precipitation and the ensemble forecast. It is noteworthy that, as discussed by Keil et al. [31], soil moisture heterogeneity appears to have the greatest influence on convection (and therefore precipitation) during weak synoptic forcing, which implies that our case study during strong synoptic forcing in the early summer is unlikely to overestimate the impact of soil moisture perturbation.

Conclusions
This study aimed to investigate whether slowly evolving land surface variables are suitable for generating the initial perturbations for ensemble forecasts using the BGM method. The ARWv3 model was used to breed the selected atmospheric and land surface variables using the dynamic adjustment method of the BGM and to conduct an ensemble forecast experiment.
Results show that atmospheric and land surface variables can be used to generate the initial perturbations by the BGM method; however, the initial perturbations of these two types of variables have different times and characteristics with respect to saturation. Generally, atmospheric variables can reach saturation after 30 h of breeding and their saturation characteristics are simpler to distinguish, whereas land surface variables can reach saturation after 102 h of breeding and their saturation characteristics are not as evident as those of atmospheric variables.
Among the atmospheric variables, the perturbed wind speed field could reach the saturation level only after 12 h of breeding with an evident saturation characteristic; after saturation is reached, the perturbation growth rate varies from level to level, in this case, ranging from 1.2 to 1.5 per 6 h. The saturation times of the perturbed air temperature field varied at different levels, with the growth rate reaching saturation faster at the middle and upper levels than at the lower. The perturbed water vapour mixing ratio required 30 h of breeding to reach saturation at the middle and lower levels; at the upper levels, the saturation characteristics of this variable were not evident. In general, the initial perturbations of land surface variables require more time to reach saturation than those of atmospheric variables. For example, initial SMO perturbations could easily reach the saturation level at the two lower levels of the land surface model after 18 h of breeding, whereas at the upper two levels, which were closer to the surface, the perturbations reached saturation after 102 h of breeding. According to our results, TSK perturbations can reach saturation after 72 h of breeding.
With respect to precipitation perturbations, perturbed land surface variables have a weaker effect on precipitation than perturbed atmospheric variables; however, the effect of perturbed land surface variables is still relatively large. The effect of both types of perturbed variables on precipitation increases with precipitation intensity. Therefore, as precipitation intensity increases, the effect of perturbed land surface variables becomes comparable to that of atmospheric variables. The influence of initial perturbations of selected variables on the precipitation forecast tends to increase with the simulation time, and simultaneously, the importance of perturbed land surface variables increases relative to that of atmospheric variables. With respect to the time scale of the precipitation forecast, it was found that the initial perturbations of the atmospheric variables had a significant effect on precipitation even in the first hour of simulation; during this time, the maximum RSD of the regional mean hourly precipitation in the ATM subset of experiments was 41.76%. The initial perturbations of the land surface variables had a relatively small effect on the precipitation forecast during the first 7 h of simulation; however, within 8 h of simulation, the maximum RSD of the hourly regional mean precipitation was 12.7% of the maximum standard deviation of the ATM subset; simultaneously, the standard deviation of the ATM subset was 25.4%. Therefore, it can be concluded that the perturbed variables had a significant effect on the precipitation. Considering the ensemble forecast results, the use of the initial perturbations of land surface variables and atmospheric variables, to a certain extent, improved the skill of the generated ensemble forecast, which demonstrates that the use of the perturbed land variables to generate ensemble forecasts is feasible.
Based on the results of the probability forecast, it was found that for small-intensity precipitation events (light and moderate rain), the ensemble mean artificially expanded the forecast area, and for high-intensity precipitation (heavy and violent rain), the ensemble mean could eliminate false precipitation in the control forecast. Therefore, the probability forecast of precipitation is more effective for high-intensity precipitation.
The comparison of our results with previous results shows that the conclusions of this study on the effect of perturbed land surface variables on simulated precipitation and ensemble forecasting are not unique; rather they have a measure of commonality in terms of the characteristics and relative importance of the results. In addition to previous studies [32][33][34][35][36] that have investigated perturbations of atmospheric variables, model physics parameterisations and lateral boundary conditions, this study demonstrates that the perturbation of land variables is also important for regional ensemble forecasting. Nevertheless, because some features of ensemble forecasting are geographically dependent [37], our follow-up studies would investigate more cases for various regions.