Simulation of an Extreme Precipitation Event Using Ensemble-Based WRF Model in the Southeastern Coastal Region of China

Abstract: Extreme weather events have increased significantly in the past decades due to global warming. As robust tools for forecasting and monitoring extreme weather events, regional climate models have been widely applied on local scales. This study presents a simulation of an extreme precipitation event in the Southeastern Coastal Region of China (SEC), where floods, typhoons, and mountain torrents occur frequently, using the Weather Research and Forecast model (WRF) driven by Global Ensemble Forecast System (GEFS) ensemble members (one control run and 20 ensemble members) from 01 UTC 14 June to 18 UTC 16 June 2010. Hourly precipitation records from 68 meteorological stations in the SEC were used to validate the WRF ensemble simulations with respect to 3-hourly cumulative precipitation (3hP), 6-hourly cumulative precipitation (6hP) and total cumulative precipitation (TCP). The results showed that the 20 WRF ensemble outputs could capture the extreme precipitation event fairly well, with the Pearson correlation coefficient ranging from 0.01 to 0.82 for 3hP and from 0.16 to 0.89 for 6hP. The normalized root mean square error was comparable between the control run and the 20 ensembles for 3hP (0.67 vs. 0.63) and 6hP (0.51 vs. 0.53). In general, WRF underestimated the observations for TCP. The control run (En00) modeled 28.1% less precipitation, while the 20 ensembles modeled 3.9% to 55.5% less precipitation than observations. Ensemble member 12 (En12) showed the best TCP simulation with the smallest bias. The average of the 20 ensembles simulated 31.7% less precipitation than observations. The spatial pattern of total precipitation was not captured by WRF, with a bias that ranged from −203.1 to 112.3 mm. The storm centers were generally not captured by WRF in this case study. WRF ensembles underestimated the observations in central Fujian Province while overestimating them in northern and southern Fujian Province. Although the average of ensembles can reduce the uncertainty to a certain extent, an individual ensemble member (e.g., En12) may be more reliable on local scales.


Introduction
Extreme precipitation tends to trigger natural hazards such as floods, landslides, and debris flows that cause large human and economic losses [1]. The global increase in the frequency and intensity of extreme precipitation events has been observed in numerous studies [2][3][4][5]. Many previous studies showed that China experiences a significant increase in extreme precipitation, in particular northwestern and southeastern China, during the rainy season from April to September [6][7][8][9]. Fujian Province, located in the Southeastern Coastal Region of China (SEC), has a typical subtropical monsoon climate. It is an economically developed area with a high population density and a high concentration of wealth. However, natural disasters, such as floods, typhoons, and mountain torrents, often occur in this region. Based on preliminary statistics, more than 182 extreme precipitation events occurred in the Fujian Province, which caused more than 2000 deaths from 1950 to 2000 [10].
Quantitative precipitation forecast (QPF) plays a significant role in the prevention of flood and rainstorm disasters. Numerical weather forecasts have been commonly used to provide QPF products in recent years. However, global numerical weather forecast models cannot meet the requirement of high spatial and temporal resolution for a specific region with complicated topography because of limited computing resources [11]. Many regional numerical weather forecast models, such as the Regional Atmospheric Modeling System (RAMS), the Pennsylvania State University-National Center for Atmospheric Research (PSU/NCAR) mesoscale model (MM5) and the Weather Research and Forecast model (WRF), have been developed as dynamic downscaling tools to generate QPF products with higher spatial and temporal resolution on the basis of global numerical weather forecasts. Numerous studies have indicated that regional numerical weather forecast models are more capable of predicting precipitation over complicated topography than global models, as higher resolution helps to reveal local circulations and orographic forcing [11,12].
As a next-generation mesoscale numerical weather prediction system, the WRF model, designed for both atmospheric research and operational forecasting needs, has been widely used in more than 150 countries [11][12][13][14][15]. For researchers, WRF can produce simulations based on actual atmospheric conditions (i.e., from observations and analyses) or idealized conditions. For operational forecasting, WRF offers a flexible and computationally efficient platform while reflecting recent advances in physics, numerics, and data assimilation contributed by developers from the expansive research community [14].
It is well-known that the occurrence and development of heavy rain is not only related to large-scale weather conditions and sufficient water vapor transport but also to nonuniform underlying surface processes [13,16]. The land surface scheme is a basic physical and chemical process in the atmospheric circulation system, which affects the atmospheric circulation and climate change modeling [15,17]. Over the past several decades, many studies have shown that land-surface processes play an important role in climate models, atmospheric circulation models, and mesoscale numerical prediction models [17]. Land surface schemes such as CLM4, Noah and Noah-MP in WRF could enhance the ability of precipitation prediction [17].
However, numerical prediction models such as WRF are not perfect because of measurement errors, analysis errors and model bias. Atmospheric variables cannot be measured to an infinite degree of accuracy or precision [14,15,17], so a model's initial state never matches the real atmosphere. Initial condition errors grow with model integration time, most rapidly at smaller scales [17,18]. In addition, the model equations do not fully represent all of the processes in the atmosphere. Ensemble techniques were developed to address these limitations. A single forecast (control/deterministic run) from one forecast model or method uses a single set of initial conditions. In contrast, an ensemble is a collection of "member" forecasts verified at the same time, created from different but equally viable initial conditions or different forecasting methods that (ideally) statistically represent nearly all forecast possibilities [18]. Although deterministic runs usually have more skill than any individual ensemble member due to superior resolution, the ensemble mean usually has at least as much skill as an equal-resolution control run. The ensemble mean can also be more skillful than a higher-resolution deterministic run, especially beyond 3 days [15,17,18].
Until now, few studies have been carried out on extreme precipitation events in the Southeastern Coastal Region of China (SEC), especially based on the WRF-ensemble approach. It is necessary to evaluate the reliability of WRF in extreme precipitation prediction for disaster reduction in the SEC. The performance of an ensemble-based model for short-range (1-3 days) precipitation forecasts is also important to investigate. Thus, the objectives of the present study were to determine (1) whether WRF and its ensembles can capture the spatiotemporal characteristics of extreme precipitation in the SEC, (2) which is more reliable, the WRF control run or the ensemble simulations, and (3) whether ensemble-based WRF can be used as a short-range forecasting tool in the SEC. This study provides a reference for further improving weather and flood forecasting in China, especially for the SEC, where typhoons and floods often occur.

Observation
The observations from 68 meteorological stations (mainly in Fujian Province) were provided by the Fujian Climate Center (Figure 1, Table 1). These stations are evenly distributed, with elevations ranging from 7 m (No. 32 Lianjiang) to 900 m (No. 10 Zhouning). Generally, these stations represent the weather characteristics of cities, valleys, and hills. The hourly precipitation records over three days (14 to 16 June 2010) were extracted for this case study. During these days, the entire Fujian Province experienced a heavy storm, which led to a serious flood in the Jinjiang River basin and urban waterlogging in Nanping City. This storm was triggered by a frontal system and lasted almost two weeks. The peak discharge of the Jinjiang River basin on 15 June 2010 was three times more than the long-term average runoff.
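The validation in this study uses 3-hourly (3hP), 6-hourly (6hP) and total cumulative precipitation (TCP) derived from hourly station records. The following is a minimal sketch of that aggregation, assuming consecutive, non-overlapping windows; the function name `accumulate` and the 42-value sample series are hypothetical and are not data from the study.

```python
def accumulate(hourly_mm, window_h):
    """Sum consecutive, non-overlapping windows of `window_h` hourly values."""
    if len(hourly_mm) % window_h != 0:
        raise ValueError("series length must be a multiple of the window length")
    return [sum(hourly_mm[i:i + window_h])
            for i in range(0, len(hourly_mm), window_h)]

# Hypothetical hourly record (mm per hour) for one station; not real data.
hourly = [0.5 * (h % 6) for h in range(42)]

p3 = accumulate(hourly, 3)   # 3-hourly cumulative precipitation (3hP)
p6 = accumulate(hourly, 6)   # 6-hourly cumulative precipitation (6hP)
tcp = sum(hourly)            # total cumulative precipitation (TCP)
```

Whatever window is chosen, the windowed sums add up to the same TCP, which gives a quick consistency check on the aggregation.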

GEFS Ensembles
In this study, the ensembles were derived from the Global Ensemble Forecast System (GEFS, https://www.ncdc.noaa.gov/data-access/model-data/model-datasets/global-ensemble-forecast-system-gefs, accessed on 22 December 2021), which was previously known as the GFS Global ENSemble (GENS). GEFS is a weather forecast model made up of 21 separate forecasts (one control and 20 ensemble members). The National Centers for Environmental Prediction (NCEP) started the GEFS to address the uncertainty in the weather observations that are used to initialize weather forecast models. The GEFS attempts to quantify the amount of uncertainty in a forecast by generating an ensemble of multiple forecasts, each minutely different, or perturbed, from the original observations. The Global Forecast System (GFS) model 2015 version was used as the initial condition for the GEFS control run. The ensemble Kalman filter (EnKF) was applied to generate the initial perturbations for ensemble members by adding the 6-hourly EnKF forecast perturbations to the GFS 2015 version analysis [19][20][21]. The gridded GEFS-003 version ensemble was applied for model initialization. GEFS-003 includes one control run (labeled En00) and 20 ensemble members (labeled En01 to En20 in order) on a 1-degree latitude-longitude grid at 6-hourly time steps (00, 06, 12, and 18 UTC). GEFS data for the same period as the observations (14 to 16 June 2010) were collected for the WRF simulation.

WRF Model Setup
WRF is continuously updated, and the latest version at the time of writing was 4.2.1, released in 2020. Version 3.8.1 of WRF was used in this study. Three domains with grid spacings of 27 km (d01), 9 km (d02), and 3 km (d03) and domain sizes of 288 × 165, 211 × 208, and 220 × 232 grid points, respectively, were configured for the WRF simulation (Figure 2). The time step in WRF was 90 s. The lateral boundary conditions for the outermost domain (d01) also used the 6-hourly GEFS data, whereas the lateral boundary conditions of the two inner domains (d02 and d03) were provided by the model output of their respective outer domains (d01 and d02). The main physical parameterization schemes for the simulations, selected according to previous studies [15,17], are listed in Table 2. It should be noted that the period from 00 UTC to 24 UTC 14 June was used for the WRF warm-up. Thus, the outputs from 01 UTC 15 June to 18 UTC 16 June were analyzed for the model evaluation. The WRF total cumulative precipitation (TCP) was validated against the observations from the 68 stations. Because the spatial resolution of d03 was 3 km, the nine WRF grid cells surrounding each meteorological station were averaged. Three statistical criteria (NRMSE, CSI and PCC) were used to evaluate the performance of the 3-hourly and 6-hourly cumulative precipitation simulations (3hP and 6hP). The definition, formula and optimal value for the indices are listed in Table 3.
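As a minimal sketch of the evaluation step described above, the following Python code averages the nine grid cells around a station and computes NRMSE, PCC and CSI. The normalization of the RMSE by the observed mean, the 0.1 mm rain/no-rain threshold for CSI, and all function names and sample values are assumptions made for illustration; the definitions actually used in this study are those listed in Table 3.

```python
import math

def mean(values):
    return sum(values) / len(values)

def neighborhood_mean(grid, row, col):
    """Average the 3 x 3 block of cells centred on an interior cell (row, col)."""
    if not (1 <= row < len(grid) - 1 and 1 <= col < len(grid[0]) - 1):
        raise ValueError("station cell must not lie on the grid edge")
    return mean([grid[r][c]
                 for r in range(row - 1, row + 2)
                 for c in range(col - 1, col + 2)])

def nrmse(sim, obs):
    """RMSE divided by the observed mean (this normalization is an assumption)."""
    rmse = math.sqrt(mean([(s - o) ** 2 for s, o in zip(sim, obs)]))
    return rmse / mean(obs)

def pcc(sim, obs):
    """Pearson correlation coefficient between simulations and observations."""
    ms, mo = mean(sim), mean(obs)
    cov = sum((s - ms) * (o - mo) for s, o in zip(sim, obs))
    var_s = sum((s - ms) ** 2 for s in sim)
    var_o = sum((o - mo) ** 2 for o in obs)
    return cov / math.sqrt(var_s * var_o)

def csi(sim, obs, threshold=0.1):
    """Critical Success Index: hits / (hits + false alarms + misses).

    The rain/no-rain threshold (mm) is an assumed value.
    """
    pairs = list(zip(sim, obs))
    hits = sum(1 for s, o in pairs if s >= threshold and o >= threshold)
    false_alarms = sum(1 for s, o in pairs if s >= threshold and o < threshold)
    misses = sum(1 for s, o in pairs if s < threshold and o >= threshold)
    total = hits + false_alarms + misses
    return hits / total if total else float("nan")

# Hypothetical values, not data from the study.
grid = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
station_value = neighborhood_mean(grid, 1, 1)

obs = [0.0, 2.0, 4.0, 6.0]
sim = [1.0, 1.0, 3.0, 0.0]
scores = {"NRMSE": nrmse(sim, obs), "PCC": pcc(sim, obs), "CSI": csi(sim, obs)}
```

In this sample, two of the four intervals are hits, one is a false alarm and one is a miss, so the CSI is 0.5. Because CSI depends only on rain occurrence, it can stay unchanged while NRMSE and PCC differ.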
Table 3 (excerpt). CSI, Critical Success Index: $\mathrm{CSI} = \frac{n_{11}}{n_{11} + n_{12} + n_{21}}$; optimal value: 1.
Notes: $n$ is the number of samples; $\bar{E}$ ($\bar{O}$) is the mean value of simulations (observations); $E_i$ ($O_i$) are the simulations (observations). $n_{11}$ represents the ratio of accurate simulation, $n_{12}$ represents the ratio of fake simulation (false alarms), and $n_{21}$ represents the ratio of missing simulation (misses).

The 3-Hourly and 6-Hourly Cumulative Precipitation Simulations by WRF Ensembles
Table 4 shows the performance of the 3-hourly and 6-hourly cumulative precipitation simulations (3hP and 6hP) over the whole region by the WRF ensembles. For 3hP, the control run (En00) had an NRMSE of 0.67 and a low PCC of 0.28. The CSI was 0.56 for both En00 and Enmean. The performance of the ensemble members varied. The NRMSE ranged from 0.49 to 0.81 with an average value of 0.63, which was slightly better than En00. The PCC ranged from 0.01 to 0.82 with a much better average (0.43) than En00 (0.28). En06 had the smallest NRMSE (0.49). En07, En12 and En19 had large NRMSEs and weak PCCs. En03, En04, and En13 showed the best PCC, with values higher than 0.7. For 6hP, the evaluation criteria improved compared to 3hP. The NRMSE of Enmean was slightly worse than that of En00, but the difference was small. Enmean was much better than En00 in PCC. The accuracy in predicting rain occurrence (CSI) was equal for Enmean and En00. The NRMSE ranged from 0.41 to 0.69, while the PCC ranged from 0.16 to 0.89. Seven ensemble members had a PCC higher than 0.7, with En04 and En13 the highest.
In general, the ensembles could reduce the bias compared to En00. Sixteen ensemble members were better than En00 in NRMSE for 3hP. However, only eight members were better than En00 in NRMSE for 6hP. Interestingly, CSI improved from 3hP to 6hP, but for 6hP the NRMSE of Enmean was slightly worse than that of En00. Figure 3 shows the total cumulative precipitation (all 42 h) of the WRF ensembles at the 68 stations. The performance varied significantly among stations, both within the same ensemble member and among different members. For example, En05, En07 and En12 had interquartile ranges (25% to 75%) comparable to the observations, whereas En03, En13 and En17 had smaller ranges. En00 and all ensemble members had many outliers and underestimated the median value compared to the observations. However, the bias between En12 and the observations was the smallest with respect to the average value. En05 and En12 had a whole range similar to that of the observations. The median values of eight ensemble members were greater than that of Enmean. Clearly, Enmean was not optimal, although ensemble estimations can reduce the uncertainty in general. In terms of the bias for the total cumulative precipitation, En12 was the best performer. The performances for TCP, 3hP and 6hP were also very different. This showed that the ability of the WRF model to simulate storms varies greatly among regions.
Although the values of En00, En02, En07, En15, and En19 were higher than the observation (294.3 mm) at different time points, the others were much smaller. Thus, in general, the TCP of the WRF runs was less than the observations. In terms of bias, the control run (En00) modeled 28.1% less precipitation than observed. The 20 ensemble members modeled around 3.9% to 55.3% less precipitation than observed. En12 modeled the best TCP compared to the observations (−3.9%). The performance of En03 was the worst (−55.3%). In addition, En00 (−28.1%) was slightly better than Enmean (−31.7%).
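The percentage biases quoted above compare simulated and observed TCP. The following is a minimal sketch of that comparison, assuming the relative bias is defined as (simulated − observed) / observed; the member totals below are hypothetical and are chosen only to show how a single member can have a smaller bias than the ensemble mean.

```python
def percent_bias(sim_total, obs_total):
    """Relative bias in percent; negative values mean underestimation."""
    return 100.0 * (sim_total - obs_total) / obs_total

# Hypothetical totals in mm; not values from the study.
obs_tcp = 200.0
member_tcp = {"En01": 150.0, "En02": 210.0, "En03": 90.0, "En04": 196.0}

bias = {name: percent_bias(total, obs_tcp) for name, total in member_tcp.items()}

# Ensemble mean (Enmean) of the member totals and its bias.
enmean_tcp = sum(member_tcp.values()) / len(member_tcp)
enmean_bias = percent_bias(enmean_tcp, obs_tcp)

# Member with the smallest absolute bias.
best_member = min(bias, key=lambda name: abs(bias[name]))
```

With these sample numbers, the ensemble mean inherits the overall dry bias of the members, while the best single member is closer to the observation.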

Spatial Validation of WRF Ensemble Simulations
The spatial distributions of the interpolated total cumulative precipitation from the 68 stations and the 21 WRF runs are presented in Figure 5. The observed TCP ranged from 3.7 to 186.6 mm. The storm center (high-value area) was located in the middle of Fujian Province, especially in the mountain areas (Daiyun Mountains) and the Jinjiang River basin. The areas with less TCP were mainly distributed in northern and southern Fujian (Figure 5a). The WRF-modeled TCP of the control run (En00) ranged from 0 to 302.1 mm, a much wider range than that of the observations (Figure 5b). However, the distribution pattern was quite different from the observations. The areas with high TCP were mainly located in the southeast of Fujian Province. The observed storm center was not captured by En00. Figure 5c shows the spatial distribution of Enmean. The TCP ranged from 0 to 121.1 mm, which was much less than the observations. The distribution pattern was also different from the observations. The areas with a simulated storm center (high TCP) were mainly located in southern Fujian Province. In contrast, less precipitation was found in the northwestern inland areas. Figure 5d shows the bias distribution of the interpolated Enmean compared to the observations (Enmean minus observation). Thus, a negative value indicates that WRF underestimated the observation, and a positive value represents an overestimation. The bias ranged from −203.1 to 112.3 mm. The largest underestimation was found in the central part of the province (Sanming City and Nanping City, which are in the Daiyun Mountains). Precipitation in northern and southern Fujian Province was overestimated by the WRF ensembles, whereas precipitation in the central part of the province was underestimated because of the complex terrain.
All distribution figures of the interpolated TCP (Figure S1) are provided in the Supplementary Materials. Except for En08, the simulated ensemble members show consistent characteristics with low values. A high-value center was found in southern Fujian for most ensemble members, which was contrary to the observations.
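The sign convention of the bias map (Enmean minus observation) can be illustrated with a short sketch. The station labels and values below are hypothetical, and the interpolation step used to produce the maps is not reproduced here.

```python
# Hypothetical (observed, Enmean) TCP pairs in mm; not data from the study.
stations = {
    "A": (120.0, 60.0),
    "B": (40.0, 75.0),
    "C": (200.0, 95.0),
}

# Bias = simulation minus observation: negative means underestimation.
bias = {name: sim - obs for name, (obs, sim) in stations.items()}

underestimated = sorted(name for name, b in bias.items() if b < 0)
overestimated = sorted(name for name, b in bias.items() if b > 0)
bias_range = (min(bias.values()), max(bias.values()))
```

Reporting the minimum and maximum of the signed bias, as done for Figure 5d, shows both the largest underestimation and the largest overestimation in a single range.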
The locations of the best models are inconsistent. For example, stations Nos. 1, 2, 15, 24, 34, 42 and 45 performed quite well. Among them, Nos. 42 and 45 are located in the Jinjiang River basin. This suggests that WRF worked well in the basin, where an extreme runoff occurred at 06 UTC 15 June 2010. In short, the WRF model has a large potential for improvement in the ability to capture the rainstorm centers.

Discussion
Although extreme precipitation event simulations and forecasts were hot spots in recent decades, there was no uniform definition of an extreme precipitation event. According to the regulations of the China Meteorological Administration, a rainstorm can be identified when the daily precipitation exceeds 50 mm or the precipitation exceeds 25 mm within 12 h. However, many different definitions have been applied to local studies. For example, many studies defined an extreme precipitation event when the daily precipitation exceeded the 90th or 95th percentile of a long-term precipitation series (e.g., [22,23]). Some studies preferred to use the series of climate indices provided by the Expert Team on Climate Change Detection Monitoring and Indices (ETCCDMI) (e.g., [24]). The present study selected the typical extreme precipitation event that leads to a serious flood disaster. This was a representative extreme precipitation event caused by a frontal system in the first-flood season in this region. In terms of the precipitation amount, the total cumulative precipitation in the study area reached the rainstorm level (the average daily precipitation was around 90 mm).
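The two kinds of definition mentioned above can be written as simple checks. This is a minimal sketch: the fixed thresholds follow the figures quoted in the text (more than 50 mm per day, or more than 25 mm within 12 h), while the use of a sliding 12-hour window, the linear-interpolation percentile and all function names are assumptions made for illustration.

```python
import math

def is_rainstorm(hourly_mm):
    """Check one day of 24 hourly values against the fixed thresholds."""
    if len(hourly_mm) != 24:
        raise ValueError("expected 24 hourly values")
    daily_total = sum(hourly_mm)
    # Largest total over any 12 consecutive hours (sliding window, assumed).
    max_12h = max(sum(hourly_mm[i:i + 12]) for i in range(13))
    return daily_total > 50.0 or max_12h > 25.0

def percentile(values, q):
    """Percentile with linear interpolation between sorted values (q in 0..100)."""
    xs = sorted(values)
    pos = (len(xs) - 1) * q / 100
    lo, hi = math.floor(pos), math.ceil(pos)
    return xs[lo] + (xs[hi] - xs[lo]) * (pos - lo)

def is_extreme(daily_mm, long_term_daily_mm, q=95):
    """Percentile-based definition: exceeds the q-th percentile of a long series."""
    return daily_mm > percentile(long_term_daily_mm, q)

# Hypothetical samples; not data from the study.
steady_day = [2.0] * 24                 # 48 mm per day, 24 mm in any 12 h
burst_day = [0.0] * 5 + [30.0] + [0.0] * 18
long_series = [float(v) for v in range(101)]
```

A day can fail the daily threshold yet still qualify through the 12-hour criterion, as the `burst_day` sample does. This is one reason why the choice of definition matters when comparing studies.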
Many simulations of extreme precipitation events or storms in South China have been carried out based on WRF and ensemble approaches. Compared with previous studies, the current study shows both similarities and significant differences. Li and Tang [25] simulated the precipitation from 9 May to 24 June 2010 in southeastern China using a WRF ensemble model. They found that the WRF model had a fairly good precipitation forecasting ability, which was consistent with our findings. However, they claimed that the mean of all ensembles was always better than a single ensemble member, which differed from the present study. Zhang et al. [26] found that the GEFS ensemble could reduce the uncertainty of the control forecast in the first-flood season of southern China. They also found that the WRF ensembles tended to underestimate the storms, which was consistent with this study. However, the forecast skill was dominated by the initial conditions and multi-physical schemes. Huang and Gao [27] compared different driving conditions for the WRF model and found that ERA-Interim was much better than the National Centers for Environmental Prediction (NCEP) Global Final Analysis (FNL) in precipitation simulation in China. More studies have focused on the impacts of different physical processes on the precipitation simulation. Yang et al. [15] and Dai et al. [28] evaluated the performances of different WRF physics options on catchment scales. The cumulus parameterizations, planetary boundary layer physics and land surface physics were the most sensitive processes for storm simulation [15,25,26,28]. Although this study adopted fixed physical parameterization schemes, these were basically consistent with the best physical schemes in previous studies [15,26,27,28].

Conclusions
An extreme precipitation event (00 UTC 14 June to 18 UTC 16 June 2010) in the Southeastern Coastal Region of China (SEC) was simulated based on a WRF model driven by GEFS ensemble members. The GEFS data provided a control run and 20 ensemble members. In general, the 21 simulation results showed dissimilar spatial patterns of total cumulative precipitation over the SEC, except for a few ensemble members. The amounts of TCP and the storm centers were also significantly different. The 20 ensemble members showed different distributions for different TCP levels. The control run had a pattern and amount range similar to those of the ensemble mean averaged from the 20 members (Enmean).
The TCP of 21 ensemble runs was validated by 68 meteorological stations over the SEC in time series and spatial patterns. For 3hP, the NRMSE for 20 ensembles ranged from 0.49 to 0.81 with an average value of 0.63, which was better than the control run (En00, 0.67). The CSI was equal (0.56) both for En00 and Enmean. The PCC ranged from 0.01 to 0.82 with a much better average (0.43) for Enmean than En00 (0.28). For 6hP, Enmean was much better than En00 in PCC but with a worse NRMSE. Although the CSI was enhanced from 3 to 6hP, the NRMSE of Enmean could possibly be worse than the control run (En00). For TCP, En00 and Enmean, as well as all ensembles, underestimated the observation in general. However, the performance of TCP was even worse than 3 and 6hP in different regions, which indicated the ability differences of WRF simulation on local scales. En12 had the smallest bias among all ensembles with respect to TCP.
All WRF ensembles underestimated the observations with respect to TCP except from 22 UTC 15 June to 03 UTC 16 June 2010. The performances were significantly different among the 20 ensemble members. The spatial comparison showed that the observed pattern of TCP was not captured by the WRF runs. The spatial bias ranged from −203.1 to 112.3 mm. WRF ensembles underestimated the observations in central Fujian Province (especially Sanming City and Nanping City with complex terrain) while overestimating them in northern and southern Fujian Province. The southwest and northeast parts were overestimated by WRF. On local scales, WRF worked well in some areas, for example in the Jinjiang River basin, where an extreme runoff occurred at 06 UTC 15 June 2010.
In terms of temporal and spatial comparison, all ensembles behaved differently. The average of ensembles (Enmean) can reduce the uncertainty to a certain extent. However, it must be judged carefully, since an individual run (e.g., En12, which had the smallest TCP simulation bias) was sometimes more indicative for forecasting according to actual needs. Ensemble-based WRF may be more reliable than a single deterministic forecast for long-range (beyond 3 days) precipitation forecasts.
However, this study used only the 20 ensemble members to drive WRF, and the precipitation process is known to be very complex. Under different driving conditions, different domain scales and different terrains, the influence mechanisms of the physical parameterization schemes are significantly different. Therefore, in practical forecast applications, the physical parameterization schemes need to be optimized. More simulations using different physical schemes and different ensemble driving conditions should be carried out to reduce the uncertainty of WRF in the future. The sensitivity of parameterizations should also be explored for more accurate forecasting of extreme weather events on local scales, especially in complex terrain.
Data Availability Statement: The ensemble driving data GEFS-003 were provided by the Global Ensemble Forecast System (GEFS, https://www.ncdc.noaa.gov/data-access/model-data/model-datasets/global-ensemble-forecast-system-gefs, accessed on 23 December 2021). The data from the ensemble modeling that support the findings of this study are stored at the Institute of Geography, Fujian Normal University and are available from the corresponding author upon request.

Conflicts of Interest:
The authors declare no conflict of interest.