Evaluation of Heavy Precipitation Simulated by the WRF Model Using 4 D-Var Data Assimilation with TRMM 3 B 42 and GPM IMERG over the Huaihe River Basin , China

To obtain independent, consecutive, and high-resolution precipitation data, the four-dimensional variational (4D-Var) method was applied to directly assimilate satellite precipitation products into the Weather Research and Forecasting (WRF) model. The precipitation products of the Tropical Rainfall Measuring Mission 3B42 (TRMM 3B42) and its successor, the Integrated Multi-satellitE Retrievals for Global Precipitation Measurement (GPM IMERG) were assimilated in this study. Two heavy precipitation events that occurred over the Huaihe River basin in eastern China were studied. Before assimilation, the WRF model simulations were first performed with different forcing data to select more suitable forcing data and determine the control experiments for the subsequent assimilation experiments. Then, TRMM 3B42 and GPM IMERG were separately assimilated into the WRF. The simulated precipitation results in the outer domain (D01), with a 27-km resolution, and the inner domain (D02), with a 9-km resolution, were evaluated in detail. The assessments showed that (1) 4D-Var with TRMM 3B42 or GPM IMERG could both significantly improve WRF precipitation predictions at a time interval of approximately 12 h; (2) the WRF simulated precipitation assimilated with GPM IMERG outperformed the one with TRMM 3B42; (3) for the WRF output precipitation assimilated with GPM IMERG over D02, which has spatiotemporal resolutions of 9 km and 50 s, the correlation coefficients of the studied events in August and November were 0.74 and 0.51, respectively, at the point and daily scales, and the mean Heidke skill scores for the two studied events both reached 0.31 at the grid and hourly scales. This study can provide references for the assimilation of TRMM 3B42 or GPM IMERG into the WRF model using 4D-Var, which is especially valuable for hydrological applications of GPM IMERG during the transition period from the TRMM era into the GPM era.


Introduction
Precipitation is a basic and vital component of the global water and energy cycles [1].A robust knowledge of precipitation processes at finer spatiotemporal resolutions has become increasingly important for hydrological modeling, flood monitoring, soil moisture estimation, and water resource management [2][3][4][5][6].Currently, there are generally three mainstream methods to obtain precipitation information: traditional in situ observations, estimations from remote sensing, and numerical weather prediction (NWP) [7][8][9].
In situ precipitation observations are generally obtained from conventional ground rain gauge stations.This type of data is generally considered to be the most accurate measurements, and served as the grounds for true precipitation values [10,11].However, the application of these data in the hydrological field is severely limited by poor point-to-area representativeness, incomplete opening to the public, and several well-recognized issues of the station network, such as poor spatial distribution and wind-induced deviation [12,13].As remote sensing techniques developed, various satellite precipitation products based on visible, infrared, and microwave wavelengths have emerged during the last few decades, such as the Global Precipitation Climatology Project (GPCP) [14], the Climate Prediction Center Morphing technique (CMORPH) [15], the Tropical Rainfall Measuring Mission (TRMM) [16], and its ongoing replacement Global Precipitation Measurement project (GPM) [17].These products not only cover a nearly global area, they also are available to the public free of charge.Nevertheless, they still have coarse spatiotemporal resolutions, which are incapable of representing consecutive precipitation process and have difficulty in detecting extreme events at high latitudes [18].Since the NWP model is built on precise physical governing equations, it can resolve the inherent dynamics of precipitation and nearly represent the entire spatial pattern of the precipitation process [19,20].However, when solving the equations and initialization errors, approximations due to incomplete observations often induce many uncertainties to the model outputs [21,22].In terms of simulated area ranges, the NWP model is usually divided into general circulation models (GCMs), which cover the global or continental scales, and regional climate models (RCMs), which cover the regional scale or mesoscale.Compared with GCMs, RCMs can better simulate the exact distribution of a climatic field since they have a finer spatial resolution; this enables them to resolve finer details of land surface characteristics, such as topography, land cover, and surface vegetation [23,24].Frequently used RCMs include the National Meteorological Center (NMC) forecast model [25][26][27], the next-generation Weather Research and Forecasting (WRF) model [28], the operational Japan Meteorological Agency (JMA) mesoscale model [29], and the European Center for Medium-Range Weather Forecasts (ECMWF) model [30].
Considering the merits and drawbacks of these three types of precipitation data, we attempted to integrate two of them to obtain a type of precipitation data.It is not only independent from in situ observations, it can also reflect the precipitation consecutiveness at higher spatiotemporal resolutions [31,32].Therefore, we used the data assimilation (DA) method to integrate freely downloaded remote sensing precipitation estimations with precipitation prediction from a more physically realistic NWP model.Among the various DA algorithms, such as the polynomial interpolation method [33], optimum interpolation [34], three-dimensional variational (3D-Var) assimilation [35], four-dimensional variational (4D-Var) assimilation [36], and the Kalman filter [37,38], 4D-Var is particularly appropriate for assimilating synoptic satellite data due to its advantages regarding a definite theoretical basis, simple formulation, and no limitations on the type of assimilated data that is utilized [39].
At present, there have been extensive studies regarding integrating various precipitation data with the NWP model via the 4D-Var data assimilation method.Zupanski and Mesinger [40] first carried out a 4D-Var experiment with 24-hour accumulated precipitation data and the NMC forecast model in the United States of America (USA) and demonstrated its improvement in precipitation forecasting.Koizumi et al. [21] used the JMA mesoscale 4D-Var system to assimilate one-hour radar-based precipitation data at a spatial resolution of 20 km over the islands of Japan and demonstrated improvements in precipitation forecasts for an 18-hour forecast time.Lopez [41] assimilated the National Centers for Environmental Prediction (NCEP) stage IV gauge-corrected radar precipitation data into the ECMWF Global Integrated Forecasting System over the eastern USA, and found a substantial improvement in short-term (i.e., up to 12 h) precipitation forecasts.Lin et al. [42] assimilated NCEP stage IV rainfall data into the WRF model with the 4D-Var method in the USA, and they successfully downscaled a six-hour precipitation product with a 20-km resolution to an hourly precipitation product with a resolution of less than 10 km.These studies were mainly carried out to resolve the problems such as the highly nonlinear and discontinuous in cumulus convection parameterization [7,41], the sensitivity of the different global datasets for the initial and boundary conditions [21,41,43], and the effectiveness of applying different observational and background error covariance matrices [44,45].We will not go into the many details of 4D-Var techniques, but rather will investigate its potential in hydrological applications.
From an application perspective, a majority of these existing studies were performed and evaluated at the mesoscale, whereas a limited number of studies focused on the basin scale evaluation, even though basin is the most commonly used unit in hydrological studies.Moreover, since the GPM was just released in 2014, there are very few studies on the feasibility and efficiency of the GPM application in 4D-Var data assimilation, and the discrepancies in assimilation effectiveness between assimilating GPM and assimilating TRMM are less investigated.Therefore, we assimilated the Integrated Multi-satellitE Retrievals for GPM (GPM IMERG) and the TRMM Multi-satellite Precipitation Analysis 3B42 (TRMM 3B42) with the 4D-Var method into the NWP model of WRF, and assessed their performances in simulating two heavy precipitation events that occurred over the Huaihe River basin (HRB) in eastern China.Before assimilation, we first drove the WRF model with two different forcing data to choose a more suitable forcing datum and determine the control experiment for the subsequent assimilation work.Then, DA experiments were carried out with different assimilation data and for different precipitation events.Finally, we evaluated the experimental precipitation results with the daily in situ observations and the hourly merged CMORPH estimations at different spatial and temporal scales in detail.The manuscript is organized as follows: Section 2 introduces the study area, study events, and data.Section 3 introduces the WRF configuration, 4D-Var methodology, experimental design, and evaluation metrics.Section 4 shows the evaluations of the simulated precipitation from the WRF and the WRF 4D-Var that is assimilated with TRMM 3B42 and GPM IMERG.Section 5 discusses the WRF sensitivity to different rainfall events, forcing data, and spatial resolutions; examines the effectiveness of WRF 4D-Var at different thresholds and time; and compares the 4D-Var performances assimilated with TRMM 3B42 and GPM IMERG.The conclusions are drawn in Section 6.

Study Area and Events
The HRB is one of seven major river basins in China; it is located between 110 • -122 • E and 31 • -37 • N (Figure 1b), and covers an area of 270,000 km 2 .Most of the HRB region is vast plains, except for some mountains and foothills located at the western, southern, and northeastern HRB (as shown in Figure 1b).The mountain altitudes are normally 1000-2000 meters above the sea level (a.s.l.).The HRB is in the transitional zone between the abundant rainfall area of southern China and the arid area of northern China [46], belonging to the warm temperate and semi-humid monsoon region with an average temperature of 11-16 degrees centigrade and an average annual rainfall about 910 mm.The precipitation distribution within a year is very uneven.In the flood season (June-September), the total precipitation accounts for 50-80% of annual precipitation [47].The heavy rainfall events that occur in the summer frequently cause disaster in this area.Across the basin, there is a gradual gradient in average annual precipitation from about 1000 mm in the southeast to less than 600 mm in the northwest, while the highest precipitation occurs in the inner mountain areas; this variation is a result of the topography [48].A robust knowledge of the precipitation processes in the HRB is vital for local flood monitoring and water resource management.To select the studied precipitation events, daily precipitation data from 30 meteorological stations in the HRB during 2015 were obtained from the China Meteorological Administration (CMA) and analyzed (Figure 2).Based on these CMA observations, the contributions of accumulated daily precipitation to the annual precipitation amounts at each meteorological station were calculated and sorted in decreasing order.As shown in Figure 2a, all 30 CMA stations received half of the total annual precipitation within seven days (Dingtao station) or 13 days (Huoshan station).This suggests that short-term heavy precipitation events are the main contributors to the total annual precipitation amount in the HRB.Therefore, we focused our studies on short-term heavy precipitation events.The mean monthly precipitation over the HRB in 2015 was also calculated, and is illustrated in Figure 2b.The results showed that the precipitation amount during the flood season of the HRB composed 58.23% of the annual amount at all of the stations.The mean monthly precipitation in June comprised the highest percentage (21.6%),and August had the second largest percentage (18.33%).However, because the WRF forcing data of the NCEP final analyses (FNL) ds083.3 was issued in July 2015, we selected one event from August.As shown in Figure 2c, a heavy rainfall event occurred on 9 August, when the daily precipitation at many CMA stations exceeded 50 mm, and the highest exceeded 160 mm; thus, we chose the rainfall event from 0000 UTC on 9 August to 1200 UTC on 11 August (hereafter referenced as event A) as our first case study.As shown in Figure 2b, the mean monthly precipitation in November was evidently the highest during the non-flood season.Therefore, we chose the rainfall event that occurred from 0000 UTC on 5 November to 1200 UTC on 7 November (hereafter referenced as event N) as our second case study.During event N, the average daily precipitation of the basin reached 22.1 mm, and that of a single CMA station (Rizhao) reached 66.6 mm.The two case studies (event A and event N; each spanned 60 h) represent the type of convection dominant precipitation events during a flood season, and extratropical cyclone-associated precipitation events during the non-flood season, respectively.To select the studied precipitation events, daily precipitation data from 30 meteorological stations in the HRB during 2015 were obtained from the China Meteorological Administration (CMA) and analyzed (Figure 2).Based on these CMA observations, the contributions of accumulated daily precipitation to the annual precipitation amounts at each meteorological station were calculated and sorted in decreasing order.As shown in Figure 2a, all 30 CMA stations received half of the total annual precipitation within seven days (Dingtao station) or 13 days (Huoshan station).This suggests that short-term heavy precipitation events are the main contributors to the total annual precipitation amount in the HRB.Therefore, we focused our studies on short-term heavy precipitation events.The mean monthly precipitation over the HRB in 2015 was also calculated, and is illustrated in Figure 2b.The results showed that the precipitation amount during the flood season of the HRB composed 58.23% of the annual amount at all of the stations.The mean monthly precipitation in June comprised the highest percentage (21.6%),and August had the second largest percentage (18.33%).However, because the WRF forcing data of the NCEP final analyses (FNL) ds083.3 was issued in July 2015, we selected one event from August.As shown in Figure 2c, a heavy rainfall event occurred on 9 August, when the daily precipitation at many CMA stations exceeded 50 mm, and the highest exceeded 160 mm; thus, we chose the rainfall event from 0000 UTC on 9 August to 1200 UTC on 11 August (hereafter referenced as event A) as our first case study.As shown in Figure 2b, the mean monthly precipitation in November was evidently the highest during the non-flood season.Therefore, we chose the rainfall event that occurred from 0000 UTC on 5 November to 1200 UTC on 7 November (hereafter referenced as event N) as our second case study.During event N, the average daily precipitation of the basin reached 22.1 mm, and that of a single CMA station (Rizhao) reached 66.6 mm.The two case studies (event A and event N; each spanned 60 h) represent the type of convection dominant precipitation events during a flood season, and extratropical cyclone-associated precipitation events during the non-flood season, respectively.

Satellite Precipitation Products for Assimilation
In this study, two satellite-based precipitation data, including the TRMM 3B42 (version 7) and the GPM IMERG (final run), were applied as observation operators.Both were processed to collect six-hour accumulated precipitation data for assimilation into the WRF model.
TRMM was jointly launched by the National Aeronautics and Space Administration (NASA) and the Japan Aerospace Exploration Agency (JAXA) in 1997.TRMM 3B42 is one product of the TRMM Multi-satellite Precipitation Analysis (TMPA); it combines remote sensing data from various microwave and infrared sensors with the monthly gauge analysis from the Global Precipitation Climatology Centre (GPCC).TRMM 3B42 covers ±50 °N/S at spatiotemporal resolutions of 0.25° and 3 h [49].After over 17 years of productive data acquisition, the observation instruments on TRMM were turned off on 8 April 2015.However, the actual termination of TRMM was not a substantive issue for its multi-satellite products in TMPA; therefore, the TRMM 3B42 data during our study periods are still available.Tang et al. [50] compared TRMM 3B42 with the gauge observations and found that the correlation coefficient (CC) over Mainland China reached 0.42 and 0.68 at its original three-hour and daily timescales, respectively; for the region encompassing the lower reaches of the Huaihe River, the CCs reached 0.43 and 0.71 at the three-hour and daily timescales, respectively.
As the replacement of TRMM, GPM has been providing next-generation global observations of rain and snow for more than three years.The observation system on the GPM comprises one core satellite and several constellation satellites, with a dual-frequency precipitation radar and a suite of microwave radiometers.The core observatory of the GPM was designed as an extension of TRMM's highly successful rain-sensing package, which primarily focused on heavy to moderate rain over

Satellite Precipitation Products for Assimilation
In this study, two satellite-based precipitation data, including the TRMM 3B42 (version 7) and the GPM IMERG (final run), were applied as observation operators.Both were processed to collect six-hour accumulated precipitation data for assimilation into the WRF model.
TRMM was jointly launched by the National Aeronautics and Space Administration (NASA) and the Japan Aerospace Exploration Agency (JAXA) in 1997.TRMM 3B42 is one product of the TRMM Multi-satellite Precipitation Analysis (TMPA); it combines remote sensing data from various microwave and infrared sensors with the monthly gauge analysis from the Global Precipitation Climatology Centre (GPCC).TRMM 3B42 covers ±50 • N/S at spatiotemporal resolutions of 0.25 • and 3 h [49].After over 17 years of productive data acquisition, the observation instruments on TRMM were turned off on 8 April 2015.However, the actual termination of TRMM was not a substantive issue for its multi-satellite products in TMPA; therefore, the TRMM 3B42 data during our study periods are still available.Tang et al. [50] compared TRMM 3B42 with the gauge observations and found that the correlation coefficient (CC) over Mainland China reached 0.42 and 0.68 at its original three-hour and daily timescales, respectively; for the region encompassing the lower reaches of the Huaihe River, the CCs reached 0.43 and 0.71 at the three-hour and daily timescales, respectively.
As the replacement of TRMM, GPM has been providing next-generation global observations of rain and snow for more than three years.The observation system on the GPM comprises one core satellite and several constellation satellites, with a dual-frequency precipitation radar and a suite of microwave radiometers.The core observatory of the GPM was designed as an extension of TRMM's highly successful rain-sensing package, which primarily focused on heavy to moderate rain over tropical and subtropical oceans [51].A key advancement of the GPM over the TRMM is the extended capability to measure light rain (≤0.5 mm/hour) and falling snow, since the two types of precipitation account for significant proportions of precipitation at mid and high latitudes.GPM IMERG is the third-level precipitation product of GPM, which covers an area of ±60 • N/S with unprecedented resolutions of 0.1 • and 30 min.Tang et al. [50] reported that when compared with the gauge observations, the CCs of GPM IMERG over Mainland China reached 0.53 and 0.71 at the hourly and daily timescales, respectively; for the region encompassing the lower reaches of the Huaihe River, the CCs reached 0.55 and 0.72 at the hourly and daily timescales, respectively.

Forcing Data for the WRF Model
To simulate the two heavy precipitation events over the HRB, the NCEP FNL datasets were applied to provide the WRF model with the initial states and lateral boundary conditions.Among the various NCEP FNL datasets, the commonly used 1.0 • and six-hour FNL ds083.2 and the newly released 0.25 • and six-hour FNL ds083.3datasets were employed and compared.The one that performed better with the WRF model was used in the subsequent 4D-Var experiments.These two forcing data were composed of an underlying Global Forecast System (GFS) that was obtained from the Global Data Assimilation System (GDAS), which continuously collects observational data from the Global Telecommunications System (GTS) and other sources for many related analyses.The archived time series of the ds083.2and ds083.3datasets started on 30 July 1999 and 8 July 2015, respectively.Both of these datasets extended to a near-current date.They can be downloaded from the Research Data Archive at the National Center for Atmospheric Research (https://rda.ucar.edu/datasets/).

In Situ and Gauge-Corrected Data for Evaluation
To evaluate the precipitation results simulated by the WRF model and the WRF 4D-Var with TRMM 3B42 and GPM IMERG, two types of precipitation datasets were used as reference data.One was the in situ gauge observation dataset for daily precipitation; this data was obtained from the 30 meteorological stations in the HRB (Figure 1b) and provided by the CMA (hereafter referenced as CMA data).This dataset was collected as a benchmark for the point-scale evaluations.The accuracy of the CMA data was officially claimed to be approximately 100%, as before its release, it experienced a series of quality control assessments including a climate limit value check, station extreme value check, spatiotemporal consistency check, and other related checks (http://data.cma.cn/data/cdcdetail/dataCode/SURF_CLI_CHN_MUL_DAY_V3.0.html).When comparing with the CMA data, the simulated precipitation results from the WRF and the WRF 4D-Var models were changed to Beijing time, which is used in the CMA data.The other dataset was the gridded merged CMORPH data, which was used for the evaluation at the grid scale [52].It is also released by the CMA, with spatiotemporal resolutions of 0.1 • and hourly.The CMA generated the merged CMORPH data by applying the probability density function matching method and the optimal interpolation method.This dataset integrates with the following two data sources: (1) gauged hourly precipitation from more than 30,000-40,000 automatic weather stations in China after quality control; and (2) inverse precipitation products from the global CMORPH satellite, with resolutions of 30 min and 8 km provided by the Climate Prediction Center (USA).The merged CMORPH data have been opened to the public since 1 January 2008.This dataset covers mainland China, and its applied time zone was the same as that used in the WRF model, i.e., the UTC time zone.Its general error remains under 10%, and the errors in strong precipitation over sparsely-gauged areas are less than 20%.Its quality is better than that of other similar products in China (http://data.cma.cn/).When used for evaluation, the merged CMORPH data were resampled to the same spatial resolutions as those of the WRF domains (i.e., 27 km and 9 km) by the nearest neighborhood interpolation method.Moreover, to ensure evaluation quality, the coordinate system of the merged CMORPH data was transformed to be the same projected coordinate system as that used in the WRF model.It should be noted that for the grid-scale evaluation, assessments were only carried out over intersecting regions that encompassed the two WRF domains and the merged CMORPH data region since the latter only covered mainland China.

WRF Configuration
To obtain precipitation at a finer resolution and reduce the computing quantity, a nested domain (Figure 1a) was applied in our WRF configuration.The outer domain (D01) was set around the HRB to be as large as possible to cover the important weather features of the HRB, which are probably caused by the Indian southwestern monsoon, the East Asian subtropical monsoon, and the Asian monsoon that originates from the Siberian high.However, D01 cannot be set to an unlimited large domain, as a large domain means more grids for calculation, which requires more computational resources.Therefore, we finally define D01 with grids of 180 * 155 and a 27-km resolution and set the inner domain (D02) to cover the whole HRB, with 226 * 175 grids and a 9-km resolution.
The physical configuration of the WRF model and its related dominant parameters were determined by obeying the rule that the configuration should incorporate the experiences obtained from comparable numerical modeling studies as much as possible, especially studies performed over the HRB.Moreover, a one-way nesting strategy was employed, since a nesting strategy was not decisive of the overall performance [53], and was not our research priority.The main configuration of the WRF model is listed in Table 1.

4D-Var Methodology
The incremental 4D-Var formulation, which is commonly used in operational systems [60][61][62][63], was utilized in the WRF 4D-Var data assimilation system.The incremental approach was designed to determine an analysis increment that minimized the cost function, which was defined as a function of the analysis increment instead of the analysis itself [64].Mathematically, the WRF 4D-Var minimizes cost function J: which includes quadratic measures of distance to the background, observation, and balanced solution.
The background cost function term J b is: where the superscripts −1 and T denote the inverse and the adjoint of a matrix or a linear operator, respectively.The final analysis of WRF 4D-Var (i.e., after the last (n) outer loop occurs) is denoted as x n .
The background x b has typically been a short-range forecast in previous analyses.B represents the background error covariance matrix.The observation cost function term represents the quadratic measure of the distances between the analysis x n and the forecast model M k and the observation operator H k and the observations y k : When calculating the observation cost function J o , a linear approximation is made, and the entire assimilation time window is split into K observation windows.The nonlinear observation operator H k is approximately transformed into a tangent linear observation operator, and the nonlinear model M k is also transformed into a tangent linear model.x n denotes the guess vector, and x n − x n−1 denotes the analysis increment.R represents the observation error covariance matrix.
The balancing cost function J c measures the quadratic distance between the analysis and a balanced state, and it is expressed as follows:

Evaluation Metrics
To evaluate the simulated precipitation results of the CTL and DA experiments, we applied two categories of metrics, which are shown in Table 3.The first category of metrics contained the mean error (ME), the relative error (RE), the root mean square error (RMSE), and the CC, which were used to describe the errors, deviations, and correlations between the simulated precipitation and the reference precipitation at the point scale.The second category of metrics contained skill scores commonly used in meteorological studies, including the bias score (BIAS), the false alarm ratio (FAR), the probability of detection (POD), the probability of false detection (POFD), and the Heidke skill score (HSS).These skill scores were constructed based on a contingency table [70] and applied in the evaluation at the grid scale.BIAS is an indicator of how well the model predicts the number of occurrences of an event.FAR indicates the fraction of forecasts detected by the WRF model that turns out to be wrong.POD represents the ratio of correct forecasts to the number of events that occurred, which is commonly known as the hit rate.POFD represents the faction of false alarms to the total number of events that did not occur.HSS is one of the most frequently used and comprehensive skill scores for summarizing square contingency tables and combining the characteristics of hints and random detections.Considering that the detection accuracy of the CMA ground observation was 0.1 mm/day, the rain/no rain threshold was set to 0.1 mm in this study.

Evaluation of Simulated Precipitation in the CTL Experiments at the Point Scale
To evaluate the precipitation simulated by the WRF model at the point scale, the error indices of ME, RE, CC, and RMSE were used.Precipitation values in D01 and D02 were both extracted from grids located closest to the CMA stations, then accumulated into daily values and compared against the daily in situ CMA data.As delineated in Figure 3, for both event A and event N, the values of ME and RE in D01 and D02 were positive, which indicated that the WRF-predicted precipitation results were generally overestimated.Although the CC values of event N were lower than those of event A, the points that represented the relationship between the station precipitation and the predicted precipitation was evidently much closer to the 1:1 line for event N (Figure 3b) than that for event A (Figure 3a).Moreover, the values of ME, RE, and RMSE for event N were also lower than those for event A. This means that the predicted precipitation results for event N showed better agreement with the CMA observations than those for event A. In addition, the error metrics for the simulated precipitation from D02 were generally comparable to those from D01, which indicated that the resolution increase for precipitation from 27 km (D01) to 9 km (D02) was realized through the nested domain at a very slight accuracy reduction cost at the point and daily scales.For the variations made by different forcing data, there appeared to be some slight negative changes in the ME, RE, CC, and RMSE values from CTL4 to CTL3, but the CCs of CTL 2 were only 0.01 lower than those of CTL1, and the ME, RE, and RMSE values of CTL2 were evidently much lower than those of CTL1.Therefore, we conclude that the WRF performance driven by the FNL ds083.3dataset outperformed that driven by the ds083.2dataset at the daily and point scales.To evaluate the precipitation simulated by the WRF model at the point scale, the error indices of ME, RE, CC, and RMSE were used.Precipitation values in D01 and D02 were both extracted from grids located closest to the CMA stations, then accumulated into daily values and compared against the daily in situ CMA data.As delineated in Figure 3, for both event A and event N, the values of ME and RE in D01 and D02 were positive, which indicated that the WRF-predicted precipitation results were generally overestimated.Although the CC values of event N were lower than those of event A, the points that represented the relationship between the station precipitation and the predicted precipitation was evidently much closer to the 1:1 line for event N (Figure 3b) than that for event A (Figure 3a).Moreover, the values of ME, RE, and RMSE for event N were also lower than those for event A. This means that the predicted precipitation results for event N showed better agreement with the CMA observations than those for event A. In addition, the error metrics for the simulated precipitation from D02 were generally comparable to those from D01, which indicated that the resolution increase for precipitation from 27 km (D01) to 9 km (D02) was realized through the nested domain at a very slight accuracy reduction cost at the point and daily scales.For the variations made by different forcing data, there appeared to be some slight negative changes in the ME, RE, CC, and RMSE values from CTL4 to CTL3, but the CCs of CTL 2 were only 0.01 lower than those of CTL1, and the ME, RE, and RMSE values of CTL2 were evidently much lower than those of CTL1.Therefore, we conclude that the WRF performance driven by the FNL ds083.3dataset outperformed that driven by the ds083.2dataset at the daily and point scales.

Evaluation of the Simulated Precipitation in the CTL experiments at the Grid Scale
To make the evaluations of the simulated precipitation more complete, we also compared the simulated precipitation from the CTL experiments with the hourly merged CMORPH data.The skill scores of BIAS, FAR, POD, POFD, and HSS were used to quantify the assessment.The evaluated results are displayed in Figure 4.It is observed that the box lengths of BIAS, FAR, and POFD for event N were much shorter than those for event A, which indicated lower deviations and better performances for the simulation of event N.This finding was consistent with the evaluated results at the daily and point scales.Thus, we conclude that the WRF performance for event N was

Evaluation of the Simulated Precipitation in the CTL experiments at the Grid Scale
To make the evaluations of the simulated precipitation more complete, we also compared the simulated precipitation from the CTL experiments with the hourly merged CMORPH data.The skill scores of BIAS, FAR, POD, POFD, and HSS were used to quantify the assessment.The evaluated results are displayed in Figure 4.It is observed that the box lengths of BIAS, FAR, and POFD for event N were much shorter than those for event A, which indicated lower deviations and better performances for the simulation of event N.This finding was consistent with the evaluated results at the daily and point scales.Thus, we conclude that the WRF performance for event N was better than that for event A. This was probably attributed to the over or underestimation of the WRF model for localized extreme precipitation intensities when simulating strong convective precipitation events.
By comparing the evaluation results between D01 and D02, it is found for event A that the values of BIAS over D02 were much closer to 1 than those for event N; the mean values of FAR and POFD over D02 were much lower, and the maximum and mean values of HSS over D02 were higher.For the rainfall in event N, although the comprehensive index values of HSS over D02 were slightly lower than those over D01, the BIAS values over D02 were much closer to 1 than those over D01, and the FAR values also decreased more over D02 than those over D01.Thus, advantages in the WRF improvement regarding spatial resolution were conveyed both in convective and non-convective rainfall events at the grid and hourly scales.
Based on the above evaluations, the forcing data of NCEP FNL ds083.3dataset were selected to drive the following WRF 4D-Var experiments.The CTL2 experiment was utilized as the reference for DA1 and DA2 (event A), and the CTL4 experiment was determined as the reference for DA3 and DA4 (event N).
Remote Sens. 2018, 10, 646 11 of 25 better than that for event A. This was probably attributed to the over or underestimation of the WRF model for localized extreme precipitation intensities when simulating strong convective precipitation events.By comparing the evaluation results between D01 and D02, it is found for event A that the values of BIAS over D02 were much closer to 1 than those for event N; the mean values of FAR and POFD over D02 were much lower, and the maximum and mean values of HSS over D02 were higher.For the rainfall in event N, although the comprehensive index values of HSS over D02 were slightly lower than those over D01, the BIAS values over D02 were much closer to 1 than those over D01, and the FAR values also decreased more over D02 than those over D01.Thus, advantages in the WRF improvement regarding spatial resolution were conveyed both in convective and non-convective rainfall events at the grid and hourly scales.Based on the above evaluations, the forcing data of NCEP FNL ds083.3dataset were selected to drive the following WRF 4D-Var experiments.The CTL2 experiment was utilized as the reference for DA1 and DA2 (event A), and the CTL4 experiment was determined as the reference for DA3 and DA4 (event N).

Evaluation of the Simulated Precipitation in the DA Experiments
The DA experiments output precipitation values with spatiotemporal resolutions of 27 km and 150 s in D01, and 9 km and 50 s in D02. Figure 5 shows the spatial patterns of the 12-hour accumulated precipitation extracted from the reference CTL experiments, the DA experiments and the merged CMORPH dataset over a subset of the D01 domain.It is clear in Figure 5 that the CTL2 experiment captures the shift in precipitation during event A over the HRB from the southwest to the northeast, and the CTL4 experiment also captures the shift in precipitation during event N over the HRB from the west to the east of the HRB.However, compared with the merged CMORPH data, which served as the true measurements at the grid scale, the spatial distributions and the amounts of simulated precipitation in CTL2 and CTL4 were much wider and larger than those reflected in the merged CMORPH dataset.For each 24-hour period, these discrepancies between the simulated precipitation and the merged CMROPH data became more evident in the second 12-hour period than those in the first 12-hour period, which may be related to the error accumulations when the WRF model was running.It can also be found in Figure 5 that the simulated precipitation after assimilation with TRMM 3B42 and GPM IMERG was more agreeable with the merged CMORPH dataset.

Evaluation of the Simulated Precipitation in the DA Experiments
The DA experiments output precipitation values with spatiotemporal resolutions of 27 km and 150 s in D01, and 9 km and 50 s in D02. Figure 5 shows the spatial patterns of the 12-hour accumulated precipitation extracted from the reference CTL experiments, the DA experiments and the merged CMORPH dataset over a subset of the D01 domain.It is clear in Figure 5 that the CTL2 experiment captures the shift in precipitation during event A over the HRB from the southwest to the northeast, and the CTL4 experiment also captures the shift in precipitation during event N over the HRB from the west to the east of the HRB.However, compared with the merged CMORPH data, which served as the true measurements at the grid scale, the spatial distributions and the amounts of simulated precipitation in CTL2 and CTL4 were much wider and larger than those reflected in the merged CMORPH dataset.For each 24-hour period, these discrepancies between the simulated precipitation and the merged CMROPH data became more evident in the second 12-hour period than those in the first 12-hour period, which may be related to the error accumulations when the WRF model was running.It can also be found in Figure 5 that the simulated precipitation after assimilation with TRMM 3B42 and GPM IMERG was more agreeable with the merged CMORPH dataset.

Evaluation of Simulated Precipitation in the DA Experiments at the Point Scale
To evaluate the simulated precipitation in the DA experiments at the point scale, precipitation values were extracted from the grids closest to the CMA stations from the DA simulations, which were accumulated into daily values and compared against the daily CMA data.The evaluated error indices of ME, RE, CC, and RMSE for the control experiments and the DA experiments are displayed in Figure 6.As shown in Figure 6a, the application of 4D-Var to event A had an obvious change in simulated precipitation via CLT2 from over-forecasting (above the 1:1 line) to under-forecasting (below the 1:1 line).This variation can also be found in the numerical changes in ME from positive values to negative values.For event N (Figure 6b), the MEs changed differently.The MEs for DA3 were still positive (indicating over-forecasting), and the MEs for DA4 were negative (indicating under-forecasting).This means that after assimilating GPM IMERG, WRF precipitation outputs tend to be lower than those from the corresponding CTL experiments for both event A and event N. By assimilating with the remotely sensed precipitation products, the RMSEs of all of the DA experiments all significantly decreased, which indicated reduced deviations in the precipitation outputs of the WRF 4D-Var.Moreover, the CCs of the DA experiments all significantly increased.The biggest CC increase for event A was 0.39 (from CTL2-D01 to DA2-D01) and 0.22 for event N (from CTL4-D01 to DA4-D01).These positive variations manifested improvements in the precipitation simulations due to the assimilation of TRMM 3B42 and GPM IMERG onto WRF at the daily and point scales.

Evaluation of Simulated Precipitation in the DA Experiments at the Point Scale
To evaluate the simulated precipitation in the DA experiments at the point scale, precipitation values were extracted from the grids closest to the CMA stations from the DA simulations, which were accumulated into daily values and compared against the daily CMA data.The evaluated error indices of ME, RE, CC, and RMSE for the control experiments and the DA experiments are displayed in Figure 6.As shown in Figure 6a, the application of 4D-Var to event A had an obvious change in simulated precipitation via CLT2 from over-forecasting (above the 1:1 line) to under-forecasting (below the 1:1 line).This variation can also be found in the numerical changes in ME from positive values to negative values.For event N (Figure 6b), the MEs changed differently.The MEs for DA3 were still positive (indicating over-forecasting), and the MEs for DA4 were negative (indicating under-forecasting).This means that after assimilating GPM IMERG, WRF precipitation outputs tend to be lower than those from the corresponding CTL experiments for both event A and event N. By assimilating with the remotely sensed precipitation products, the RMSEs of all of the DA experiments all significantly decreased, which indicated reduced deviations in the precipitation outputs of the WRF 4D-Var.Moreover, the CCs of the DA experiments all significantly increased.The biggest CC increase for event A was 0.39 (from CTL2-D01 to DA2-D01) and 0.22 for event N (from CTL4-D01 to DA4-D01).These positive variations manifested improvements in the precipitation simulations due to the assimilation of TRMM 3B42 and GPM IMERG onto WRF at the daily and point scales.

Evaluation of Simulated Precipitation in the DA experiments at the Grid Scale
The simulated precipitation results of the DA experiments in D01 and D02 were processed into hourly values and evaluated with the grid data (i.e., the hourly merged CMORPH data).The evaluations were quantified by the skill scores of BIAS, FAR, POD, POFD, and HSS.These skill scores for the hourly precipitation from the CTL2, CTL4, and DA experiments changed with time during the 48 h in each experiment (Figure 7).On the whole, for the reason of error accumulation during the simulations, the FAR values showed increasing tendency with time, the values of POD and HSS presented decreasing tendency with time, the values of BIAS and POFD fluctuated with time, and the amplitudes were weakened by the application of 4D-Var in DA experiments.Looking

Evaluation of Simulated Precipitation in the DA experiments at the Grid Scale
The simulated precipitation results of the DA experiments in D01 and D02 were processed into hourly values and evaluated with the grid data (i.e., the hourly merged CMORPH data).The evaluations were quantified by the skill scores of BIAS, FAR, POD, POFD, and HSS.These skill scores for the hourly precipitation from the CTL2, CTL4, and DA experiments changed with time during the 48 h in each experiment (Figure 7).On the whole, for the reason of error accumulation during the simulations, the FAR values showed increasing tendency with time, the values of POD and HSS presented decreasing tendency with time, the values of BIAS and POFD fluctuated with time, and the amplitudes were weakened by the application of 4D-Var in DA experiments.Looking into the HSS variation, for the reason of 4D-Var data assimilation, the HSS values for the hourly precipitation predictions increased at first, but as time passed, they decreased because of the error accumulation during the simulations.Moreover, there appear abrupt changes between the 24th hour and the 25th hour in Figure 7a-e.These sudden variations were caused by the newly initial and boundary conditions accompany with the start of the second 24-hour forecast in each DA experiment.
Remote Sens. 2018, 10, 646 14 of 25 into the HSS variation, for the reason of 4D-Var data assimilation, the HSS values for the hourly precipitation predictions increased at first, but as time passed, they decreased because of the error accumulation during the simulations.Moreover, there appear abrupt changes between the 24 th hour and the 25 th hour in Figure 7a-e.These sudden variations were caused by the newly initial and boundary conditions accompany with the start of the second 24-hour forecast in each DA experiment.Besides, it is also seen in Figure 7 that in the same domains, all of the values of BIAS, FAR, and POFD for the DA experiments are primarily lower than those in the CTL experiments.This suggests that the 4D-Var-assimilated TRMM 3B42 or GPM IMERG can effectively reduce several false alarms regarding rainfall occurrences.It is noteworthy that the variations in hit rates, which were indicated by the PODs, were different between event A and event N (Figure 7c).When comparing CTL2-D01 and CTL2-D02, the POD average values of DA1-D01, DA1-D02, DA2-D01, and DA2-D02 were reduced by 0.02, 0.01, 0.04, and 0.01, respectively.For event N, the PODs of DA3 and DA4 all remarkably improved compared with those of CTL4.These different variations in POD were probably attributed to the different precipitation mechanisms of event A and N and the imperfection of WRF in predicting extreme local precipitation.Regardless, the comprehensive scores of HSS for the DAs finally improved.In comparison with the mean hourly HSS values for the reference CTL experiments, the HSS mean values of DA1-D01, DA1-D02, DA2-D01, DA2-D02, DA3-D01, DA3-D02, DA4-D01, and DA4-D02 increased by 0.05 (0.28), 0.05 (0.32), 0.04 (0.28), 0.04 (0.31), 0.05 (0.34), 0.09 (0.31), 0.06 (0.34), and 0.09 (0.31), respectively (the HSS values are in parentheses).This definitely demonstrates the positive improvements in the precipitation simulations that were made by the 4D-Var assimilation with TRMM 3B42 and GPM IMERG via the WRF model at the grid and hourly scales.Moreover, it can be concluded that the improvements in WRF 4D-Var for event A mainly benefited from the corrections of the false alarms for non-occurrences, since the POD values for DA1 and DA2 decreased, but their BIASs, FARs, and POFDs also reduced, which finally improved their HSSs.For event N, the reductions in BIASs, FARs, and POFDs were minor, but the PODs experienced evident improvements, and the HSSs finally became better than those of CTL4.
Considering precipitation applications in a hydrological basin, the simulated precipitation results of the DA experiments in the HRB were specifically evaluated; they were extracted, accumulated and spatially averaged.Figure 8 shows the hourly mean precipitation values of the basin via the CTL2, CTL4, and DA experiments during the studied 48 h.As shown in Figure 8, the agreements between the DAs' simulated precipitation and the merged CMORPH estimations were better than those between the CTL simulated results and the merged CMORPH estimations.Nevertheless, it should be noted that the impact of the WRF 4D-Var was not always consistent throughout the whole forecast period.For each 24-hour simulation, the mean precipitation of the basin via the DAs had the greatest agreement with the merged CMORPH estimations at the beginning (about the first 12 h).During the next 6 h, the mean precipitation values of the basin also maintained relatively good consistency with the merged CMORPH data.However, in the following hours, the predictions showed evident deviations from the merged CMORPH estimations.The reduction in the positive impact on the precipitation simulations suggested a time interval for the efficiency of the WRF 4D-Var.This may be related to the inevitable errors that accumulate while the WRF 4D-Var DA system is running [53].Moreover, when we applied the cycling mode, the initial conditions of the subsequent 24-hour forecast were provided by the previous 24-hour forecast; this caused error accumulation from the first 24-hour simulation, and as a result, the WRF 4D-Var performance for the subsequent 24 h became generally worse than that for the first 24 h.

WRF Sensitivity to Different Rainfall Events, Forcing Data, and Spatial Resolutions
To quantitatively assess the WRF sensitivity to different rainfall events, different forcing data and spatial resolutions, the error indices of ME, RE, CC, and RMSE and the skill scores of BIAS, FAR, POD, POFD, and HSS for the CTL experiments were inter-compared and are displayed in Figure 9a and 9b, respectively.Both the sharp decreases in ME, RE, and RMSE from CTL1-2 to CTL3-4 (Figure 9a) and the evident BIAS value much closer to 1 (Figure 9b) demonstrated a better WRF performance for event N than that for event A. This finding is consistent with that concluded by Lin et al. [42], where they found that the WRF model was relatively harder to use when forecasting convective and dominant precipitation events than precipitation events caused by extratropical cyclones.The impacts of different forcing data on the WRF precipitation simulations were well reflected in the error indices at the daily and point scales.As shown in Figure 9a, the CCs of CTL2 over D01 and D02 were both 0.01 lower than the CCs of CTL1.However, the ME, RE, and RMSE values of CTL2 significantly decreased by 22.61%, 22.09%, and 12.63% compared with those of CTL1_D01, respectively, and decreased by 15.53%, 14.94% and 11.81% compared with those of CTL1_D02,

WRF Sensitivity to Different Rainfall Events, Forcing Data, and Spatial Resolutions
To quantitatively assess the WRF sensitivity to different rainfall events, different forcing data and spatial resolutions, the error indices of ME, RE, CC, and RMSE and the skill scores of BIAS, FAR, POD, POFD, and HSS for the CTL experiments were inter-compared and are displayed in Figure 9a,b, respectively.Both the sharp decreases in ME, RE, and RMSE from CTL1-2 to CTL3-4 (Figure 9a) and the evident BIAS value much closer to 1 (Figure 9b) demonstrated a better WRF performance for event N than that for event A. This finding is consistent with that concluded by Lin et al., [42], where they found that the WRF model was relatively harder to use when forecasting convective and dominant precipitation events than precipitation events caused by extratropical cyclones.

WRF Sensitivity to Different Rainfall Events, Forcing Data, and Spatial Resolutions
To quantitatively assess the WRF sensitivity to different rainfall events, different forcing data and spatial resolutions, the error indices of ME, RE, CC, and RMSE and the skill scores of BIAS, FAR, POD, POFD, and HSS for the CTL experiments were inter-compared and are displayed in Figure 9a and 9b, respectively.Both the sharp decreases in ME, RE, and RMSE from CTL1-2 to CTL3-4 (Figure 9a) and the evident BIAS value much closer to 1 (Figure 9b) demonstrated a better WRF performance for event N than that for event A. This finding is consistent with that concluded by Lin et al. [42], where they found that the WRF model was relatively harder to use when forecasting convective and dominant precipitation events than precipitation events caused by extratropical cyclones.The impacts of different forcing data on the WRF precipitation simulations were well reflected in the error indices at the daily and point scales.As shown in Figure 9a, the CCs of CTL2 over D01 and D02 were both 0.01 lower than the CCs of CTL1.However, the ME, RE, and RMSE values of CTL2 significantly decreased by 22.61%, 22.09%, and 12.63% compared with those of CTL1_D01, respectively, and decreased by 15.53%, 14.94% and 11.81% compared with those of CTL1_D02, The impacts of different forcing data on the WRF precipitation simulations were well reflected in the error indices at the daily and point scales.As shown in Figure 9a, the CCs of CTL2 over D01 and D02 were both 0.01 lower than the CCs of CTL1.However, the ME, RE, and RMSE values of CTL2 significantly decreased by 22.61%, 22.09%, and 12.63% compared with those of CTL1_D01, respectively, and decreased by 15.53%, 14.94% and 11.81% compared with those of CTL1_D02, respectively.This indicated that significant improvements were made by using forcing data ds083.3for event A. For event N, contrasted with CTL3, the CCs of CTL4 decreased by 0.03 in D01 and 0.04 in D02.The ME, RE, and RMSE values of CTL4_D01 slightly increased by 1.03 mm, 0.06 mm, and 2.31 mm respectively, compared with those of CTL3_D01.The values of CTL4-D02 increased by −0.07 mm, 0.00 mm, and 0.83 mm, respectively, compared to those of CTL3_D02.These results indicated that slightly negative impacts of ds083.3 were found when predicting non-convective precipitation.These differences in the error indices caused by different forcing data suggested that the WRF model is very sensitive to its initial and lateral boundary conditions, especially for convective precipitation.Considering the notable positive improvements caused by ds083.3 for the simulation of convective precipitation, the slightly negative impacts of ds083.3 on the simulation of non-convective precipitation were omitted from this study.Therefore, we considered forcing data ds083.3 to be better than ds083.2 to drive the WRF model.
As shown in Figure 9b, the mean hourly BIASs and FARs of the CTLs over D02 showed more evident decreases than those over D01, and these reductions of CTL1 and CTL2 were especially remarkable.The POD values over D02 were slightly lower than those over D01 for event A, but the mean hourly PODs for event N over D01 and D02 were nearly the same.For event A, the POFDs in D02 were visually lower than those in D01, and the comprehensive HSSs in D02 were higher than those in D01.For event N, the HSSs in D02 were lower than those in D01, while clearly lower values of the BIASs and FARs over D02 indicated some improvements in the reduction of false predictions.Hence, we conclude that the improvement of spatial resolution (from 27 km to 9 km) in the WRF model has an evident positive impact on precipitation predictions for strong convection-dominated rainfall events, and slightly negative impacts on the predictions of rainfall events associated with extratropical cyclones.Considering the practical need for finer spatial resolutions and the overall better performances over the D02, we recognize that the WRF performance is better over D02 than that over D01.

The Effectiveness of WRF 4D-Var at Different Thresholds and Time
To examine the WRF 4D-Var effectiveness in detail, the simulated daily precipitation from the CTL2, CLT4, and DA experiments at different thresholds ranging from 1 mm/day to 70 mm/day were evaluated with the daily merged CMORPH data.The assessments were quantified with the skill scores of BIAS, FAR, POD, POFD, and HSS.As shown in Figure 10, the performances of the WRF and WRF 4D-Var models decreased as the precipitation threshold increased.The predictions for strong precipitation events greater than 50 mm/day were severely over-forecasted in CTL2 and CTL4, as their BIASs mostly surpassed 2 and even reached 12 (Figure 10c).After assimilating the satellite precipitation products, over-forecasting was controlled, as the values of BIAS and FAR for the strong precipitation event obviously decreased (Figure 10a-d).As shown in Figure 10q,s, the HSSs of DA1-D02 and DA3-D01 for precipitation exceeding 40 mm/day were evidently much larger than those in other experiments, which suggested that the resultant heavy precipitation event assimilated with TRMM 3B42 was more accurate than that assimilated with GPM IMERG during the first day of each simulation.For the second day, these two simulated precipitation results tended to be over-forecasted.For light rain, the simulated precipitation results of DA2 and DA4 were better than those of DA1 and DA3.The BIASs for both DA2 and DA4 were generally much closer to 1, and they had lower values of FAR and POFD and higher values of POD and HSS.This was mainly attributed to the key advancement of the GPM over the TRMM, because the GPM has a better capability of measuring light rain; this is because its core observatory carried the first spaceborne Ku/Ka band dual-frequency precipitation radar, which is more sensitive to light rain rates [50]. is represented with red lines.The PODs for event A reduced, but those for event N increased.In Figure 11e, most of the HSS increments in the first and third 12-hour periods were higher than those in the second and fourth 12-hour periods.Thus, we conclude that substantial improvements via the WRF 4D-Var with TRMM 3B42 and GPM IMERG can be sustained for approximately 12 h.

Comparison of the 4D-Var Performance Assimilated with TRMM 3B42 and GPM IMERG
To comprehensively compare the 4D-Var performance after assimilation with TRMM 3B42 and GPM IMERG, normalized Taylor diagrams [71] were used to compare the spatial patterns of daily and 48-hour (i.e., the whole study period) precipitation data between the simulations of the DA, CTL experiments, and the merged CMORPH data.The Taylor diagrams and are shown in Figure 12.
For the first day (Figure 12a), DA2 agrees the best with the merged CMORPH data, as the two points delegated DA2_D01 and DA2_D02 are the closest to the point labeled as REF in the figure, and the two points also show the highest correlation coefficients.Simultaneously, they almost have the same standard deviations as those of the merged CMORPH estimations, since they are located on the bold dashed line.For DA2-D01 and DA2-D02, their correlation coefficients are 0.692 and 0.663, respectively, and both their RMSEs occur at approximately 0.75 mm.DA3 and DA4 slightly differ in performance, as their representative points are very close to each other.For the performances on the second day (shown in Figure 12b), CTL2 exhibits a remarkable departure from the REF point, as simulation errors accumulated from the first day to the second day during the

Comparison of the 4D-Var Performance Assimilated with TRMM 3B42 and GPM IMERG
To comprehensively compare the 4D-Var performance after assimilation with TRMM 3B42 and GPM IMERG, normalized Taylor diagrams [71] were used to compare the spatial patterns of daily and 48-hour (i.e., the whole study period) precipitation data between the simulations of the DA, CTL experiments, and the merged CMORPH data.The Taylor diagrams and are shown in Figure 12.
almost the same.It is clear in Figure 12c that DA2 evidently outperforms DA1.Thus, we conclude that the WRF 4D-Var with GPM IMERG generally outperforms the WRF 4D-Var with TRMM 3B42.

Conclusions
To reduce data acquisition difficulty for precipitation in hydrological studies and obtain independent, consecutive, and high-resolution precipitation data, we used a 4D-Var data assimilation method to assimilate the remotely sensed precipitation products of the TRMM 3B42 and GPM IMERG into the atmospheric WRF model.By focusing on two heavy precipitation events that occurred during the flood and non-flood seasons over the HRB in 2015, CTL experiments were first carried out to choose the best forcing data for the WRF model and determine the control experiment for the subsequent DA experiments.Then, DA experiments were carried out to investigate the feasibility and efficiency of the GPM IMERG to be assimilated into the WRF model with the 4D-Var method, and the 4D-Var performances assimilating with the GPM IMERG and the TRMM 3B42 were compared as well.All of the simulated precipitation values from the CTL experiment and the DA experiment were evaluated with in situ CMA observations and hourly merged CMORPH data.
CTL experiments were performed based on the WRF model with different forcing data and for different events.The assessment of the simulated precipitation in the CTL experiments found that when predicting heavy rainfall events over the HRB, the WRF performance for event N, which represented non-convective precipitation, outperformed the performance for event A, which represented convective precipitation.Moreover, the simulated precipitation generated by forcing For the first day (Figure 12a), DA2 agrees the best with the merged CMORPH data, as the two points delegated DA2_D01 and DA2_D02 are the closest to the point labeled as REF in the figure, and the two points also show the highest correlation coefficients.Simultaneously, they almost have the same standard deviations as those of the merged CMORPH estimations, since they are located on the bold dashed line.For DA2-D01 and DA2-D02, their correlation coefficients are 0.692 and 0.663, respectively, and both their RMSEs occur at approximately 0.75 mm.DA3 and DA4 slightly differ in performance, as their representative points are very close to each other.For the performances on the second day (shown in Figure 12b), CTL2 exhibits a remarkable departure from the REF point, as simulation errors accumulated from the first day to the second day during the WRF was running.These performances rank from best to worst for event A as follows: DA2-D02, DA1-D02, DA1-D01, and DA2-D01.For event N, the results of DA3 and DA4 over D01 and D02 are almost the same.It is clear in Figure 12c that DA2 evidently outperforms DA1.Thus, we conclude that the WRF 4D-Var with GPM IMERG generally outperforms the WRF 4D-Var with TRMM 3B42.

Conclusions
To reduce data acquisition difficulty for precipitation in hydrological studies and obtain independent, consecutive, and high-resolution precipitation data, we used a 4D-Var data assimilation method to assimilate the remotely sensed precipitation products of the TRMM 3B42 and GPM IMERG into the atmospheric WRF model.By focusing on two heavy precipitation events that occurred during

Figure 1 .
Figure 1.(a) Outer domain (D01, 27-km resolution) and inner domain (D02, 9-km resolution) defined in the Weather Research and Forecasting (WRF) model; (b) location and meteorological stations of the Huaihe River basin.

Figure 1 .
Figure 1.(a) Outer domain (D01, 27-km resolution) and inner domain (D02, 9-km resolution) defined in the Weather Research and Forecasting (WRF) model; (b) location and meteorological stations of the Huaihe River basin.

Figure 2 .
Figure 2. (a) Contribution (%) of accumulated daily precipitation to the total annual precipitation for 30 CMA stations across the HRB in 2015 (sorted in decreasing order); (b) mean monthly precipitation of the HRB in 2015; (c) daily precipitation at the 30 CMA stations and the mean daily precipitation of the basin (HRB).

Figure 2 .
Figure 2. (a) Contribution (%) of accumulated daily precipitation to the total annual precipitation for 30 CMA stations across the HRB in 2015 (sorted in decreasing order); (b) mean monthly precipitation of the HRB in 2015; (c) daily precipitation at the 30 CMA stations and the mean daily precipitation of the basin (HRB).

2 1*
(HSS) HSS = (s−S Ref ) (SPerf−S Ref ) S = (A+D) N S Ref = [(A+B)(A+C)+(B+D)(C+D)] N P P,i and P O,i denote the predicted and observed precipitation values of the i grid, where Ṕ P,i and Ṕ O,i are their means, respectively.A represents the precipitation predicted by the WRF model and observed by the reference data; B represents the precipitation predicted by the WRF model but not observed by the reference data; C represents the precipitation not predicted by the WRF model but observed by the reference data; and D represents the precipitation not predicted by the WRF model and not observed by the reference data.Remote Sens. 2018, 10, 646 10 of 25

4 . 1 .
Evaluation of Simulated Precipitation in the CTL Experiments 4.1.1.Evaluation of Simulated Precipitation in the CTL Experiments at the Point Scale

Figure 3 .
Figure 3. Scatter plots of daily precipitation (mm/day) observed by the China Meteorological Administration (CMA) meteorological stations and predicted by the WRF control experiments: CTL1 and CTL2 (a) and CTL3 and CTL4 (b).

Figure 3 .
Figure 3. Scatter plots of daily precipitation (mm/day) observed by the China Meteorological Administration (CMA) meteorological stations and predicted by the WRF control experiments: CTL1 and CTL2 (a) and CTL3 and CTL4 (b).

Figure 4 .
Figure 4. Box plots * of evaluation scores BIAS (a), FAR (b), POD (c), POFD (d), and HSS (e) for hourly precipitation (exceeding 0.1 mm/hour) simulated by the WRF CTL experiments and estimated by the merged Climate Prediction Center Morphing technique (CMORPH) data.* The lower and upper edges of the central box represent the first and third quartiles (25% and 75%, respectively), and the band and the circle inside the box represent the 50 th percentiles and the mean values, respectively.The ends of the outliers represent the minimum and maximum values of the score distributions.The asterisks represent several possible alternative values.

HSSFigure 4 .
Figure 4. Box plots * of evaluation scores BIAS (a), FAR (b), POD (c), POFD (d), and HSS (e) for hourly precipitation (exceeding 0.1 mm/h) simulated by the WRF CTL experiments and estimated by the merged Climate Prediction Center Morphing technique (CMORPH) data.* The lower and upper edges of the central box represent the first and third quartiles (25% and 75%, respectively), and the band and the circle inside the box represent the 50th percentiles and the mean values, respectively.The ends of the outliers represent the minimum and maximum values of the score distributions.The asterisks represent several possible alternative values.

Figure 6 .
Figure 6.Scatter plots of daily precipitation (mm/day) observed by CMA meteorological stations and simulated by the CTL and DA experiments for rainfall events in August (a) and November (b) of 2015.

Figure 6 .
Figure 6.Scatter plots of daily precipitation (mm/day) observed by CMA meteorological stations and simulated by the CTL and DA experiments for rainfall events in August (a) and November (b) of 2015.

Figure 7 .
Figure 7. Skill scores of BIAS (a), FAR (b), POD (c), POFD (d), and HSS (e) for the hourly precipitation (48 h in all during the study period for each event) obtained from the CLT2, CTL4, and DA experiments. 0

Figure 8 .
Figure 8. Mean hourly precipitation (mm/hour) of the basin via the merged CMORPH estimates and the CTL2, CTL4, and DA experiments for rainfall events in August (a) and November (b).

Figure 9 .
Figure 9. (a) Error indices for daily precipitation between the CTL experiment simulations and the CMA observations; (b) skill score averages of hourly precipitation between the CTL experiment simulations and the merged CMORPH estimations.

Figure 8 .
Figure 8. Mean hourly precipitation (mm/h) of the basin via the merged CMORPH estimates and the CTL2, CTL4, and DA experiments for rainfall events in August (a) and November (b).

Remote Sens. 2018, 10 , 646 16 of 25 Figure 8 .
Figure 8. Mean hourly precipitation (mm/hour) of the basin via the merged CMORPH estimates and the CTL2, CTL4, and DA experiments for rainfall events in August (a) and November (b).

Figure 9 .
Figure 9. (a) Error indices for daily precipitation between the CTL experiment simulations and the CMA observations; (b) skill score averages of hourly precipitation between the CTL experiment simulations and the merged CMORPH estimations.

Figure 9 .
Figure 9. (a) Error indices for daily precipitation between the CTL experiment simulations and the CMA observations; (b) skill score averages of hourly precipitation between the CTL experiment simulations and the merged CMORPH estimations.

Remote Sens. 2018, 10 , 646 19 of 25 Figure 11 .
Figure 11.Increments of skill scores of BIAS (a), FAR (b), POD (c), POFD (d), and HSS (e) in the DA experiments compared to their corresponding CTL experiments.The table in each graph lists the evaluation scores of CTL2 and CTL4.The x-axis represents the first, second, third, and fourth 12-hour periods in the overall study period.

Figure 11 .
Figure 11.Increments of skill scores of BIAS (a), FAR (b), POD (c), POFD (d), and HSS (e) in the DA experiments compared to their corresponding CTL experiments.The table in each graph lists the evaluation scores of CTL2 and CTL4.The x-axis represents the first, second, third, and fourth 12-hour periods in the overall study period.

Figure 12 .
Figure 12.Normalized Taylor diagrams of the precipitation simulated by the CTL2, CTL4, and DA1-4 experiments on the first day (a), second day (b), and over the whole 48 h (c).

Figure 12 .
Figure 12.Normalized Taylor diagrams of the precipitation simulated by the CTL2, CTL4, and DA1-4 experiments on the first day (a), second day (b), and over the whole 48 h (c).

Table 1 .
The main configuration of the Weather Research and Forecasting (WRF) model.

Table 2 .
Designs of the WRF control (CTL) experiments and the WRF data assimilation (DA) experiments.

Table 3 .
Statistical metrics applied in the evaluations *.