Optimize Short-Term Rainfall Forecast with Combination of Ensemble Precipitation Nowcasts by Lagrangian Extrapolation

: The rainfall forecasts currently available in Korea are not su ﬃ ciently accurate to be directly applied to the ﬂash ﬂood warning system or urban ﬂood warning system. As the lead time increases, the quality becomes even lower. In order to overcome this problem, this study proposes an ensemble forecasting method. The proposed method considers all available rainfall forecasts as ensemble members at the target time. The ensemble members are combined based on the weighted average method, where the weights are determined by applying the two conditions of the unbiasedness and minimum error variance. The proposed method is tested with McGill Algorithm for Precipitation Nowcasting by Lagrangian Extrapolation (MAPLE) rainfall forecasts for four storm events that occurred during the summers of 2016 and 2017 in Korea. In Korea, rainfall forecasts are generated every 10 min up to six hours, i.e., there are always a total of 36 sets of rainfall forecasts. As a result, it is found that just six ensemble members is su ﬃ cient to make the ensemble forecast. Considering additional ensemble members beyond six does not signiﬁcantly improve the quality of the ensemble forecast. The quality of the ensemble forecast is also found to be better than that of the single forecast, and the weighted average method is found to be better than the simple arithmetic average method.


Introduction
Recently, there have been growing concerns of urban flooding and flash flooding that appear to be related to the rapid population growth and climate change. More of the population has been living in some specific urban areas [1][2][3][4][5], and people are leaving these areas for outdoor activities such as camping in very scenic mountain valleys, particularly during the hot summer season [6][7][8]. Global warming is believed to change the climate continuously. Very localized and severe rainfall events are also believed to be increasing [9][10][11][12][13]. The combination of all of these changes unfortunately increases the risk of urban flooding and flash flooding, which is also expected to worsen in the future. Although the definitions of urban floods and flash floods differ, they share some unique characteristics, such as a localized area, short response time, high risk, and so forth.
Urban and/or flash floods occur within one or two hours, sometimes shorter, following severe rainfall [14][15][16][17][18]. Thus, in this case, the rainfall measurements on the ground may not be applicable for flood warnings. Put simply, sufficient lead time cannot be secured to warn people in the pruned area of such flooding. This is an important reason to use the rainfall forecast for this purpose. For example, the Flash Flood Guidance System (FFGS) in the US that uses rainfall forecasts based on both satellite and radar observation is used for flash flood warning [19]. The European Flood Awareness System (EFAS) also uses ensemble rainfall forecasts for flood warning [20]. The flash flood warning system in Korea also uses the rainfall forecast as input [21,22].

Method of Averaging
The ensemble forecast is made by averaging the ensemble members. That is, where R t is an ensemble forecast made at the target time t, w i is a weight for the ensemble member M i , and n is the number of ensemble members. To date, various averaging methods have been proposed. A relatively simple averaging method is the simple average method (SA method). In this method, an equal weight is simply applied to each ensemble member. That is, the weight w i in Equation (1) is determined to be 1/n. Although it is very simple without any deep theoretical background, the SA method has been applied in various studies [77,80,94]. The weighted average method (WA method) is a statistical method which can consider the uncertainty of each ensemble member [95][96][97]. The weight is generally determined to be inversely proportional to its uncertainty. The application of a linear regression is another method that can be used to determine the weights of ensemble members [98]. In this method, a linear function is derived by considering the multiple ensemble members as independent variables. A nonlinear function approach such as the artificial neural network (ANN) has also been proposed [99]. Additionally, a so-called time-varying mergence method (TV method) has been proposed in an attempt to consider the temporal change of weights [97,100,101]. Among the TV methods, a time-varying sum of square root error method (TVSSE method), first proposed by Granger and Newbold (1977), may be the most popular [97].
Among many averaging methods, this study selected the weighted average method (WA method). The WA method is believed to be one which can consider the statistical characteristics of ensemble members. Specifically, this study adopted the variance-covariance method [95], which combines the unbiased ensemble members based on the concept of minimum error variance of the ensemble forecast. As an example, in the case of considering two ensemble members F 1 and F 2 generated for the true value of F, where Z 1 and Z 2 represent the stochastic errors involved in the generation of ensemble members. Here, the two stochastic errors are assumed to follow the Gaussian distribution, but they are not assumed to be independent of each other. Based on the linear assumption, the optimal estimate can be expressed as follows.
where c 1 and c 2 are the optimal weights to be determined by considering the uncertainty of the ensemble members of F 1 and F 2 . Generally, the weights are estimated by applying the two conditions of the unbiasedness and minimum error variance. That is, the bias of the estimateF should be zero, and the error variance of the estimateF should be minimized as well [98]. By applying these two conditions, one can derive the weights as follows. where σ 2 1 and σ 2 2 are the variances of the two forecasts and σ 12 is the covariance between the two forecasts. If the two forecasts are independent, the covariance becomes zero. However, if they are dependent on each other, the covariance should be considered to determine the weights.
A generalization of the above result to the case of considering m forecasts is as follows.
whereŴ is a column vector of the weights of m ensemble members and u is a unit column vector (i.e., u = [1, 1, · · · , 1] T ). Σ is an m × m covariance matrix consisting of the forecasting errors. The weight for each ensemble member can be calculated using the covariance matrix of forecasts, as given in Equation (7). Further, the variance of the ensemble forecast can be derived as follows.
This variance represents the quality of the ensemble forecasts, which generally becomes smaller as the number of ensemble members increases.

Optimal Number of Ensemble Members and Quality of Ensemble Forecast
In this study, the number of ensemble members was determined based on the quality of the ensemble forecast. Basically, as the number of the ensemble members increases, the forecasting quality improves. However, the forecasting quality can also become stagnated if the number of ensemble members exceeds a certain limit. In this study, the optimal number of ensemble members was determined as the minimum number of ensemble members that significantly improved the quality of the ensemble forecast. The difference between the ensemble forecast and the observed data, as well as the variance of the ensemble forecast were considered to determine the optimal number of ensemble members.
Once the optimal number of ensemble members has been determined, the ensemble forecast can be produced by applying the weight to each ensemble member. The forecasting quality of the produced ensemble forecast is then evaluated by comparing it with the observed data. Two evaluation factors are used to evaluate the forecasting quality: the root mean square error (RMSE) represents the mean difference between the ensemble forecast and the observed data, while the pattern correlation coefficient R is used to quantify the similarity of the ensemble forecast and the observed data. The pattern correlation coefficient R is calculated using Equation (9).
where P m,n and O m,n are the ensemble forecast and observed data, respectively, at each grid point (m, n). Further, P and O are the mean values of the ensemble forecast and observed data, respectively. Similar to the ordinary correlation coefficient, the pattern correlation is high when the value of R is near to 1 or −1.

MAPLE (McGill Algorithm for Precipitation
Nowcasting by Lagrangian Extrapolation) is a very short term rainfall forecasting model. The basic concept of the MAPLE rainfall forecasting was first Water 2019, 11, 1752 6 of 20 described by Germann and Zawadzki (2002) [28]. The MAPLE provides the forecasts of precipitation echoes, which are made using storm moving vectors. The storm moving vectors are derived from past and current radar reflectivity images by applying the variational echo tracking (VET) technique [102]. In order to generate nowcasts over several hours, the MAPLE uses a semi-Lagrangian method, which is based on the cross-correlation of the rainfall patterns at different times. The MAPLE is known to have the ability to forecast with consideration of the life cycles of various rainstorms. It has also been revised consistently to overcome various limitations and problems [29,[103][104][105].
The MAPLE rainfall forecasting system operated by KMA has two limitations: First, as it is a nondynamic forecasting model, it does not involve any process to consider the growth and decay of precipitation echoes. Aside from the probabilistic analysis of the rainstorm size, the initial information of precipitation echoes does not change. Second, the initial echo moving vectors are applied to every lead time. If the precipitation echo moving vectors change abruptly, the forecasting quality can be decreased significantly. This limitation is particularly relevant when extending the lead time; if the lead time is longer than three hours, the forecasting quality of the rainfall forecasts is known to be decreased abruptly [106].
The MAPLE rainfall forecast is generated as a form of a composite radar map, showing the entire Korean Peninsula and the surrounding region ( Figure 1). Its lead time ranges from 10 min to 360 min (i.e., six hours). That is, each MAPLE rainfall forecast contains one set of observed data and 36 forecasts with a 10 min interval. It is composed of 1024 × 1024 grids with a resolution of 1 km × 1 km, as a type of matrix. The original format of the MAPLE rainfall forecast is a binary gzip, but for use in this study, it is changed into the ASCII format using the Geographic Information System (GIS) program.
Water 2019, 11, x FOR PEER REVIEW 6 of 20 known to have the ability to forecast with consideration of the life cycles of various rainstorms. It has also been revised consistently to overcome various limitations and problems [29,[103][104][105].
The MAPLE rainfall forecasting system operated by KMA has two limitations: First, as it is a nondynamic forecasting model, it does not involve any process to consider the growth and decay of precipitation echoes. Aside from the probabilistic analysis of the rainstorm size, the initial information of precipitation echoes does not change. Second, the initial echo moving vectors are applied to every lead time. If the precipitation echo moving vectors change abruptly, the forecasting quality can be decreased significantly. This limitation is particularly relevant when extending the lead time; if the lead time is longer than three hours, the forecasting quality of the rainfall forecasts is known to be decreased abruptly [106].
The MAPLE rainfall forecast is generated as a form of a composite radar map, showing the entire Korean Peninsula and the surrounding region ( Figure 1). Its lead time ranges from 10 min to 360 min (i.e., six hours). That is, each MAPLE rainfall forecast contains one set of observed data and 36 forecasts with a 10 min interval. It is composed of 1024 × 1024 grids with a resolution of 1 km × 1 km, as a type of matrix. The original format of the MAPLE rainfall forecast is a binary gzip, but for use in this study, it is changed into the ASCII format using the Geographic Information System (GIS) program.

Storm Events
This study used the MAPLE rainfall forecasts of four major storm events that occurred in 2016 and 2017. Figure 2 shows radar images of the selected storm events, which are available at the weather radar center portal (www://radar.kma.go.kr). These radar images show basic information about the size, intensity, and movement of these storm events.

Storm Events
This study used the MAPLE rainfall forecasts of four major storm events that occurred in 2016 and 2017. Figure 2 shows radar images of the selected storm events, which are available at the weather radar center portal (www://radar.kma.go.kr). These radar images show basic information about the size, intensity, and movement of these storm events. The selected major storm events have different characteristics (Table 1). For a specific example, consider the second event, Typhoon Chaba. Typhoon Chaba caused severe damage in several cities located in the southern part of the Korean Peninsula. This storm event was recorded as the worst typhoon that occurred in October among those that landed in the Korean Peninsula. The maximum rainfall intensity, 104.2 mm/hr and total rainfall depth of 266.0 mm were recorded at the Ulsan rain gauge station [107]. The fourth event was Typhoon Talim. This storm event landed on the Korean Peninsula on 10 September 2017 with a lot of rainfall, and decayed the next day at the Jeolla Province. The first and third events are those that occurred during the monsoon seasons in 2016 and 2017, respectively. In this study, 10 target times were selected in total to generate the weighted average ensemble forecasts. Each storm event contains two or three target times, which were those at which The selected major storm events have different characteristics (Table 1). For a specific example, consider the second event, Typhoon Chaba. Typhoon Chaba caused severe damage in several cities located in the southern part of the Korean Peninsula. This storm event was recorded as the worst typhoon that occurred in October among those that landed in the Korean Peninsula. The maximum rainfall intensity, 104.2 mm/hr and total rainfall depth of 266.0 mm were recorded at the Ulsan rain gauge station [107]. The fourth event was Typhoon Talim. This storm event landed on the Korean Peninsula on 10 September 2017 with a lot of rainfall, and decayed the next day at the Jeolla Province. The first and third events are those that occurred during the monsoon seasons in 2016 and 2017, respectively. In this study, 10 target times were selected in total to generate the weighted average ensemble forecasts. Each storm event contains two or three target times, which were those at which relatively high rainfall intensity occurred. Table 1 summarizes detailed information about the rainfall that occurred at these 10 target times.

Weighted Average Ensemble Forecast
The structure of the covariance is the key factor for determining the weight for each forecast. First, the cross-correlation coefficient was estimated to consider the difference in lead time among ensemble members. In this estimation process, the unit time lag was assumed to be 10 min. For example, the time lag between the 30 min forecast and the 90 min forecast corresponds to lag-6, and the time lag between the 60 min forecast and the 120 min forecast also corresponds to lag-6. As there were many cases corresponding to a given time lag, their average was assumed to be the represented value. Figure 3 shows the change in the cross-correlation coefficient in terms of the time lag, derived from the forecasts collected at the 10 target times. Further, the solid thick line represents their average. As can be expected, the cross-correlation coefficient decays slowly and stagnates after some time lag. It is obvious that a higher cross-correlation coefficient is expected for short time lags, but a lower one is expected for long time lags. This also indicates that the diagonal components of the covariance matrix are the highest. relatively high rainfall intensity occurred. Table 1 summarizes detailed information about the rainfall that occurred at these 10 target times.

Weighted Average Ensemble Forecast
The structure of the covariance is the key factor for determining the weight for each forecast. First, the cross-correlation coefficient was estimated to consider the difference in lead time among ensemble members. In this estimation process, the unit time lag was assumed to be 10 min. For example, the time lag between the 30 min forecast and the 90 min forecast corresponds to lag-6, and the time lag between the 60 min forecast and the 120 min forecast also corresponds to lag-6. As there were many cases corresponding to a given time lag, their average was assumed to be the represented value. Figure 3 shows the change in the cross-correlation coefficient in terms of the time lag, derived from the forecasts collected at the 10 target times. Further, the solid thick line represents their average. As can be expected, the cross-correlation coefficient decays slowly and stagnates after some time lag. It is obvious that a higher cross-correlation coefficient is expected for short time lags, but a lower one is expected for long time lags. This also indicates that the diagonal components of the covariance matrix are the highest.   The covariance matrix was derived separately for each storm event. For example, Figure 4 shows the covariance matrix derived from all of the available rainfall forecasts on 04:30, 11 September 2017. This figure includes the results for cases with ensemble members 2, 3, 6, 9, 18, and 36. Regardless of which ensemble members were considered, the diagonal components of the covariance matrix were always higher than the other components. This indicates that the quality of the rainfall forecast drops as the lead time increases. The correlation also becomes lower as the lead time increases.
Water 2019, 11, x FOR PEER REVIEW 9 of 20 This figure includes the results for cases with ensemble members 2, 3, 6, 9, 18, and 36. Regardless of which ensemble members were considered, the diagonal components of the covariance matrix were always higher than the other components. This indicates that the quality of the rainfall forecast drops as the lead time increases. The correlation also becomes lower as the lead time increases. As explained previously, the weights to be applied to the ensemble members (i.e., the rainfall forecasts available at a given target time) are derived by analyzing the covariance matrix. This study considered three different forecasting times; one, two, and three hours. For example, the case with the forecasting time of one hour considered all of the rainfall forecasts with a lead time longer than or equal to one hour. This procedure was repeated 10 times by considering all of the 10 different target times listed in Table 1. The weights were estimated independently for each case, and the final weights were determined as their arithmetic average. Figure 5 compares the final weights determined for the cases with the following numbers of ensemble members: 2, 3, 4, 6, and 9. The solid line represents the final weights determined for each case.
The characteristics of the determined weights were as expected. For example, the weight of the most recently made forecast was higher than that of the older one. The difference among weights was larger when a smaller number of ensemble members were considered. As the number of ensemble members considered increased, the weights became more similar to each other. That is, the simple arithmetic averaging could be a valid method when considering a large number of ensemble members. For example, Figure 5a shows the determined weights for the case considering four ensemble members. The weight of the first ensemble member (i.e., rainfall forecast with a lead time of one hour) was determined to be 0.4235, while that of the second one (i.e., rainfall forecast with a lead time of one hour and 10 min) was 0.2337. The third and fourth weights were determined to be much smaller at 0.1739 and 0.1689, respectively. However, the above characteristics become rather weakened when a longer forecasting time is considered. For example, the weight of the first ensemble member for the forecasting time of three hours (see Figure 5c) was determined to be substantially smaller than that for the forecasting time of one hour. On the other hand, the weights of the other ensemble members became slightly higher. The case for the forecasting time of two hours is in between the two forecasting times of one hour and As explained previously, the weights to be applied to the ensemble members (i.e., the rainfall forecasts available at a given target time) are derived by analyzing the covariance matrix. This study considered three different forecasting times; one, two, and three hours. For example, the case with the forecasting time of one hour considered all of the rainfall forecasts with a lead time longer than or equal to one hour. This procedure was repeated 10 times by considering all of the 10 different target times listed in Table 1. The weights were estimated independently for each case, and the final weights were determined as their arithmetic average. Figure 5 compares the final weights determined for the cases with the following numbers of ensemble members: 2, 3, 4, 6, and 9. The solid line represents the final weights determined for each case.
The characteristics of the determined weights were as expected. For example, the weight of the most recently made forecast was higher than that of the older one. The difference among weights was larger when a smaller number of ensemble members were considered. As the number of ensemble members considered increased, the weights became more similar to each other. That is, the simple arithmetic averaging could be a valid method when considering a large number of ensemble members. For example, Figure 5a shows the determined weights for the case considering four ensemble members. The weight of the first ensemble member (i.e., rainfall forecast with a lead time of one hour) was determined to be 0.4235, while that of the second one (i.e., rainfall forecast with a lead time of one hour and 10 min) was 0.2337. The third and fourth weights were determined to be much smaller at 0.1739 and 0.1689, respectively. However, the above characteristics become rather weakened when a longer forecasting time is considered. For example, the weight of the first ensemble member for the forecasting time of three hours (see Figure 5c) was determined to be substantially smaller than that for the forecasting time of one hour. On the other hand, the weights of the other ensemble members became slightly higher. The case for the forecasting time of two hours is in between the two forecasting times of one hour and three hours. This indicates independence among the ensemble members, which becomes stronger when considering a longer forecasting time. This result is also consistent with the characteristics of the covariance matrix. The cross-correlation coefficient was found to be negligible when the difference in lead time was longer than 150 min.
Water 2019, 11, x FOR PEER REVIEW 10 of 20 three hours. This indicates independence among the ensemble members, which becomes stronger when considering a longer forecasting time. This result is also consistent with the characteristics of the covariance matrix. The cross-correlation coefficient was found to be negligible when the difference in lead time was longer than 150 min.  In general, the quality of the ensemble forecast increases as more ensemble members are considered. However, it is not economical to simply consider as many ensembles as possible. The optimal number of ensembles is determined as the minimum number of ensembles which guarantees a similar quality of ensemble forecast. The uncertainty (or variance) of the ensemble forecast is analyzed to determine the optimal number of ensemble members. In this study, for the forecasting In general, the quality of the ensemble forecast increases as more ensemble members are considered. However, it is not economical to simply consider as many ensembles as possible. The optimal number of ensembles is determined as the minimum number of ensembles which guarantees a similar quality of ensemble forecast. The uncertainty (or variance) of the ensemble forecast is analyzed to determine the optimal number of ensemble members. In this study, for the forecasting times of one, two, and three hours, the behavior of the RMSE of the ensemble forecast was examined in terms of the number of ensemble members. In order to consider together all of the RMSEs estimated for each of the 10 target times evaluated in this study, the RMSE estimated for each target time was also normalized by the RMSE for the case of considering just one ensemble member (i.e., the biggest RMSE). The optimal number of ensemble members was determined as that near the inflection point of the RMSE curve. Figure 6 shows the behavior of the RMSEs estimated for the 10 target times. The solid thick line represents their average. The two red dashed lines represent the tangent lines from the beginning and from the end. These two lines meet at the inflection point. As shown in this figure, the optimal number of ensemble members is around six. This result was the same regardless of the forecasting time. The ensemble forecast hereafter is made by considering a total of six ensemble members. times of one, two, and three hours, the behavior of the RMSE of the ensemble forecast was examined in terms of the number of ensemble members. In order to consider together all of the RMSEs estimated for each of the 10 target times evaluated in this study, the RMSE estimated for each target time was also normalized by the RMSE for the case of considering just one ensemble member (i.e., the biggest RMSE). The optimal number of ensemble members was determined as that near the inflection point of the RMSE curve. Figure 6 shows the behavior of the RMSEs estimated for the 10 target times. The solid thick line represents their average. The two red dashed lines represent the tangent lines from the beginning and from the end. These two lines meet at the inflection point. As shown in this figure, the optimal number of ensemble members is around six. This result was the same regardless of the forecasting time. The ensemble forecast hereafter is made by considering a total of six ensemble members.

Quality of Weighted Average Ensemble Forecast
In the previous section, the optimal number of ensemble members was determined to be six. For the case of a forecasting time of one hour, as an example, six rainfall forecasts made from the prior 60 min to the prior 110 min were used to make the ensemble forecast. For the case of a forecasting time of two hours, six rainfall forecasts made from the prior 120 min to the prior 170 min were used, and for the case of a forecasting time of three hours, six rainfall forecasts made from the prior 180 min to the prior 230 min were used to make the ensemble forecast. Table 2 summarizes the weights to be used to make the weighted average (i.e., ensemble forecast based on the weighted average method) with these six ensemble members. The weight for the first ensemble member (the most recent rainfall forecast among the ensemble members considered) was the highest, and the weight dropped as the time gap increased. The quality of the MAPLE rainfall forecast decreases with longer lead time. Figure 7 shows six rainfall forecasts made at 09:00 8 July 2017 for the lead times from 60 min to 110 min. On this basis of these rainfall forecasts, the storm center was predicted to move to the bottom of the right-hand side. These forecasts also indicate that the forecasting quality decreases with increasing lead time, which is also a crucial factor for the practical use of rainfall forecasts.

Quality of Weighted Average Ensemble Forecast
In the previous section, the optimal number of ensemble members was determined to be six. For the case of a forecasting time of one hour, as an example, six rainfall forecasts made from the prior 60 min to the prior 110 min were used to make the ensemble forecast. For the case of a forecasting time of two hours, six rainfall forecasts made from the prior 120 min to the prior 170 min were used, and for the case of a forecasting time of three hours, six rainfall forecasts made from the prior 180 min to the prior 230 min were used to make the ensemble forecast. Table 2 summarizes the weights to be used to make the weighted average (i.e., ensemble forecast based on the weighted average method) with these six ensemble members. The weight for the first ensemble member (the most recent rainfall forecast among the ensemble members considered) was the highest, and the weight dropped as the time gap increased. Table 2. Weights to be used to make the ensemble forecast with six ensemble members.

No.
Forecasting The quality of the MAPLE rainfall forecast decreases with longer lead time. Figure 7 shows six rainfall forecasts made at 09:00 8 July 2017 for the lead times from 60 min to 110 min. On this basis of these rainfall forecasts, the storm center was predicted to move to the bottom of the right-hand side. These forecasts also indicate that the forecasting quality decreases with increasing lead time, which is also a crucial factor for the practical use of rainfall forecasts.  The quality of the ensemble forecast was first evaluated at the target time by comparing it with the observed rainfall field (Figure 8). Comparisons with the ensemble forecast based on the simple arithmetic average method and the single forecast at the target time (i.e., the case with just one ensemble member) were also conducted in order to emphasize the usefulness of the ensemble forecast. As shown in Figure 8, the storm center of the observed rainfall field is located around 36 • 10 N~36 • 20 N and 128 • 30 E~128 • 40 E. A similar pattern could also be found in the ensemble forecasts based on the weighted average method as well as the simple average method. Both of the results were similar. On the other hand, in the single forecast, the overall pattern was found to be somewhat different from the observed result. Two storm centers were found: one around 36 • 0 N~36 • 10 N and 128 •  The quality of the ensemble forecast was first evaluated at the target time by comparing it with the observed rainfall field (Figure 8). Comparisons with the ensemble forecast based on the simple arithmetic average method and the single forecast at the target time (i.e., the case with just one ensemble member) were also conducted in order to emphasize the usefulness of the ensemble forecast. As shown in Figure 8, the storm center of the observed rainfall field is located around 36°10′ N~36°20′ N and 128°30′ E~128°40′ E. A similar pattern could also be found in the ensemble forecasts based on the weighted average method as well as the simple average method. Both of the results were similar. On the other hand, in the single forecast, the overall pattern was found to be somewhat different from the observed result. Two storm centers were found: one around 36°0′ N~36°10′ N and 128°10′ E~128°25′ E and another around 36°20′ N~36°30′ N and 128°40′ E~128°50′ E. The quality of the ensemble forecast was also quantitatively evaluated according to the pattern correlation coefficient R and the root mean square error RMSE. These measures of the ensemble forecast were also compared with those of the single forecast and the ensemble forecast based on the simple arithmetic average method. For example, the evaluation result at the target time of 09:00 8 July 2017 shows that the ensemble forecast based on the weighted average method is most similar to the observed. For the case of a forecasting time of one hour, R and RMSE of ensemble forecast based on the weighted average method were 0.42 and 6.85, respectively. On the other hand, those of the single forecast were estimated to be 0.25 and 9.66, while those of the ensemble forecast based on simple arithmetic averaging were 0.39 and 7.14, respectively. Overall, the ensemble forecast was found to be The quality of the ensemble forecast was also quantitatively evaluated according to the pattern correlation coefficient R and the root mean square error RMSE. These measures of the ensemble forecast were also compared with those of the single forecast and the ensemble forecast based on the simple arithmetic average method. For example, the evaluation result at the target time of 09:00 8 July 2017 shows that the ensemble forecast based on the weighted average method is most similar to the observed. For the case of a forecasting time of one hour, R and RMSE of ensemble forecast based on the weighted average method were 0.42 and 6.85, respectively. On the other hand, those of the single forecast were estimated to be 0.25 and 9.66, while those of the ensemble forecast based on simple arithmetic averaging were 0.39 and 7.14, respectively. Overall, the ensemble forecast was found to be better than the single forecast, and the weighted average method was found to be better than the simple arithmetic average method. Similar results were derived for the forecasting times of two and three hours, as summarized in Table 3. Table 3. Evaluation of the single forecast, the ensemble forecast based on the simple arithmetic averaging, and the ensemble forecast based on the weighted averaging at 09:00 July 8, 2017. The above evaluation process was repeated for all 10 target times considered in this study, and the results are summarized in Table 4. Basically, the same result was derived for all 10 target times. That is, the ensemble forecast was found to be better than the single forecast, and the weighted average method was found to be better than the simple arithmetic average method. In addition, the evaluation result was found to be slightly deteriorated as the forecasting time increased. That is, the evaluation result for the case of a forecasting time of one hour was mostly better than that of the forecasting time of two or three hours. Further, the evaluation result for the case of a forecasting time of two hours was mostly better than that of a forecasting time of three hours. There were some exceptions, but this trend was seen in all three different forecasts: the ensemble forecast based on the weighted average method, the ensemble forecast based on the simple arithmetic average method, and the single forecast. It should also be noted that, as the forecasting time increased, the evaluation result of the single forecast became substantially worse than that of the ensemble forecast. This result seems to be based on the fact that the ensemble forecast was made by considering the uncertainty of rainfall propagation, i.e., by considering more feasible cases of rainfall propagation. Also, this limitation, which is the quality of three hour forecast is far lower than that of one hour and two hours can be found in the previous studies in Korea [108][109][110]. Table 4. Evaluation of the single forecast, the ensemble forecast based on the simple arithmetic averaging, and the ensemble forecast based on the weighted averaging at 10 target times.

Summary and Conclusions
This study proposed an ensemble forecasting method based on the rainfall forecast generated every hour. The ensemble members (i.e., the available rainfall forecasts at the target time) were combined using the weighted average method. Three issues were raised with this ensemble forecasting method: the first involved the optimal number of ensemble members, the second was concerned with the determination of the weights, and the last one involved the quality of the ensemble forecast. One by one, this study showed possible solutions to these issues, with example applications to four storm events that occurred during the summers of 2016 and 2017 in Korea.
First, the weights were determined by analyzing the covariance matrix of the rainfall forecasts under the two conditions of the unbiasedness and minimum error variance. As expected, the weight of the most recently made forecast was higher than that of older ones. The difference among weights was also larger when a smaller number of ensemble members were considered. As the number of ensemble members increased, the weights became similar to each other. That is, the simple arithmetic averaging could be a valid method when considering a large number of ensemble members. However, this trend was rather weakened when considering a longer forecasting time. For example, the weight of the first ensemble member for the forecasting time of three hours was determined to be substantially smaller than that for the forecasting time of one hour. On the other hand, the weights of the other ensemble members became slightly higher. This indicates independence among the ensemble members, which becomes stronger when considering a longer forecasting time.
Second, the optimal number of ensembles was determined to be the minimum number of ensembles which guarantee a similar quality of ensemble forecast. The uncertainty (or variance) of the ensemble forecast was analyzed, and the optimal number of ensemble members was determined as that near the inflection point of the RMSE curve. In this study, the optimal number of ensemble members was determined to be around six. This result was the same regardless of the forecasting time.
Third, the quality of the ensemble forecast was quantitatively evaluated by the pattern correlation coefficient R and the root mean square error RMSE. These measures of the ensemble forecast were also compared with those of the single forecast and the ensemble forecast based on the simple arithmetic average method. The results indicated that the ensemble forecast was better than the single forecast, and that the weighted average method was better than the simple arithmetic average method.
Finally, it was also found that the evaluation result was slightly deteriorated as the forecasting time increased. That is, the evaluation result for the case of a forecasting time of one hour was mostly better than that of a forecasting time of two or three hours. However, it should be mentioned that, as the forecasting time increased, the evaluation result of the single forecast became substantially worse than that of the ensemble forecast. This result underscores the usefulness of the ensemble forecast. In fact, this is a fundamental problem which cannot be solved immediately. However, even though the results for the lead times of two or three were confusing, the rainfall prediction based on the weighted average method was superior to that based on the simple arithmetic average method.
The ensemble forecasting method proposed in this study may help increase the usefulness of the rainfall forecast one step further. The rainfall forecasts that are currently available in Korea are not sufficiently accurate to be directly applied to the flash flood warning system or urban flood warning system. The quality becomes even lower as the lead time increases. However, by introducing the ensemble technique, this problem could be alleviated a bit. The usefulness of the proposed ensemble forecasting method can be even higher in the near future if more accurate rainfall forecasts become available.