1. Introduction
An increasing number of satellite-based rainfall estimates, with ever finer resolution, are becoming available. They are particularly valuable in Africa, where the gauge network is not dense enough to represent the high variability of the rainfall during the monsoon season. Many satellite-based estimates include additional sources of data such as radar or gauge measurements. Gauge data are often used for bias correction, but can also be used for calibration or be merged with other estimates. These methods mostly focus on the intensity of the rainfall. However, a rainfall event is also characterized by its position and timing. Thus, we wanted to investigate whether a rainfall estimate can be gauge-adjusted with respect to the position and timing of the event instead of its intensity.
Rainfall estimates are mainly evaluated with respect to intensity or occurrence. However, part of the discrepancies among the estimates could be explained by position or timing errors. This is especially true for localized rainfall events such as the convective rainstorms occurring during the rainy season in sub-Saharan Africa. Timing errors have been studied in the field of hydrological modeling because of the large impact temporal patterns of rainfall can have on the models’ outputs [1,2,3]. However, timing errors have rarely been studied in the validation or comparison of rainfall estimates. There are some exceptions. For example, Reference [4] evaluated the capacity of several satellite-based estimates to reproduce the daily cycle at two sites in West Africa. They showed that the rainfall peaks (of the diurnal cycle) can be delayed by up to 2 h.
There are several techniques to study timing differences between time series, such as cross-correlation and dynamic time warping (DTW). Cross-correlation only allows for a fixed time lag between time series, whereas DTW is more flexible. DTW was developed for speech recognition [5,6] and is now used in many fields such as data mining, robotics, manufacturing, and medicine (see [7] for references). It has also been applied to precipitation in several studies. Reference [8] used DTW (with a combination of six indices) to compare rainfall events and select events that were similar. References [9,10] applied DTW in combination with a clustering method to classify time series into rainfall regimes. Reference [10] used it in the framework of rainfall estimate validation, while the goal of [9] was to derive precipitation estimates from cloud top temperature. Reference [11] developed a multiscale DTW and used it as a dissimilarity measure. They applied it to yearly time series in Paris to study the impact of climate change on rainfall variability [12].
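The difference between a fixed lag and a flexible alignment can be illustrated with a minimal sketch. It is not the method used in this article: the function names `best_fixed_lag` and `dtw_distance` and the two short series are hypothetical, and the DTW shown is the textbook formulation with an absolute-difference cost.

```python
import numpy as np


def best_fixed_lag(a, b, max_lag):
    """Lag (in time steps) of b relative to a that maximizes their cross-correlation."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    n = len(a)  # both series are assumed to have the same length

    def score(lag):
        # Unnormalized cross-correlation on the overlapping part of the series
        if lag >= 0:
            return float(np.dot(a[:n - lag], b[lag:]))
        return float(np.dot(a[-lag:], b[:n + lag]))

    return max(range(-max_lag, max_lag + 1), key=score)


def dtw_distance(a, b):
    """Textbook dynamic time warping distance with an absolute-difference cost."""
    n, m = len(a), len(b)
    # cost[i, j]: minimal cumulative cost of aligning a[:i] with b[:j]
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return float(cost[n, m])


# Hypothetical series (mm/h): the same event, observed two time steps later
gauge = [0, 0, 1, 5, 2, 0, 0, 0]
satellite = [0, 0, 0, 0, 1, 5, 2, 0]

lag = best_fixed_lag(gauge, satellite, max_lag=3)
pointwise = float(np.sum(np.abs(np.subtract(gauge, satellite))))
dtw = dtw_distance(gauge, satellite)
```

In this toy example a single lag explains the whole difference, so both approaches agree. DTW becomes more useful when the delay varies along the series, which a single lag cannot represent.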
Position errors have been taken into account in the field of forecast verification. The spatial verification methods can be divided into four categories [13,14]: the neighborhood, the scale-decomposition, the object-based, and the field deformation methods. These methods are rarely used for the evaluation of rainfall estimates other than forecasts from numerical models, even though the position of a rainfall event can be as important as its intensity. This is especially true for applications such as hydrological modeling or data assimilation in a numerical model. The field deformation methods have also been used for position correction in the framework of data assimilation of several weather-related variables [15,16,17,18,19,20]. Field deformation methods can also be applied to estimates other than weather simulations. For example, in [21], such a method was used to correct a satellite-based rainfall estimate with ground-based radar data.
In this article, we focus on image warping, a field-deformation method originating from image processing. Image warping is now used in many fields including data assimilation [22,23]. Reference [24] described a framework to assimilate rainfall data into a weather model by combining a warping method with an ensemble Kalman filter. However, they did not implement it or apply it to actual rainfall data. In a previous article [25], we used a similar warping method to correct the position error in rainfall estimates and applied it to a satellite-based estimate.
The goal of this article is to investigate the use of warping to correct timing and position errors. That is, we apply warping to gauge-adjust a satellite-based estimate with respect to the timing or the position of the rainfall events. For the correction of the position error, we built upon the spatial-warping approach described in [25] and extended it to take into account the temporal dimension of the event. Instead of processing one time step at a time, we considered all the time steps contributing to the event together, so that they can influence each other. For the time warping, we adapted the spatial warping method to operate on (1D) time series instead of (2D) rainfall fields. Using the same method allowed us to compare the space and time warping more fairly. It also leaves open the possibility of combining them at a later stage in order to correct both the position and timing errors at the same time. However, that would also be very computationally expensive. Here, as an intermediate step, we considered the time warping separately from the spatial warping. Still, the connection between space and time is not completely ignored in either method. As mentioned above, in the space warping, the time steps can influence each other. Similarly, we took the spatial dimension into account in the time warping by processing the time series of several gauges at once, so that nearby gauges could influence each other.
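To make the idea of warping a (1D) time series concrete, the following minimal sketch applies a given time mapping to a series by linear interpolation. It only illustrates the warping step, not the automatic registration that derives the mapping. The function name `warp_time_series`, the representation of the mapping as a per-step time shift, and the example values are assumptions made for illustration.

```python
import numpy as np


def warp_time_series(values, shift):
    """Return the warped series: warped[t] = values evaluated at time t + shift[t].

    `shift` is a hypothetical representation of the mapping, expressed in
    time steps; non-integer shifts are handled by linear interpolation.
    """
    t = np.arange(len(values), dtype=float)
    source_time = t + np.asarray(shift, dtype=float)
    # np.interp holds the end values constant outside the time window
    return np.interp(source_time, t, values)


# Hypothetical satellite series (mm/h) peaking two steps too late
satellite = [0, 0, 0, 1, 5, 2, 0, 0]
shift = np.full(len(satellite), 2.0)  # assumed mapping: read two steps ahead

warped = warp_time_series(satellite, shift)
```

Here the peak moves from the fifth to the third time step while its intensity is unchanged, which mirrors the fact that the warping corrects timing rather than intensity.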
5. Discussion
The spatial and time warping methods were evaluated in terms of the position and timing of the rainfall peak and of the intensity of the event for two case studies. The spatial warping method targets the position error and successfully reduced it. The position error was significantly decreased after spatial warping in both cases (see Table 2 and Table 5). The timing error also decreased after the spatial warping (see Table 3 and Table 6). It is hard to distinguish between a time delay and a spatial shift. Part of the timing error can be due to a position error and vice versa. Therefore, by correcting the position, the spatial warping also reduced the timing error. The time warping explicitly corrects the timing error. The timing of the event was significantly improved by the time warping for both cases (see Table 3 and Table 6). However, the time warping had a mixed impact on the position error, especially in the southern Ghana case. In the synthetic case, the time warping had a limited, but positive impact on the position. The position error was decreased by about 20 km (which is equivalent to about two grid cells). In the second case study, time warping decreased the position error at certain time steps, but increased it at others. However, the position error has to be interpreted with caution for this case study. There were several time steps with low and scattered precipitation for which the rainfall peak was not well defined because of several local maxima.
By modifying the position or the timing of the rainfall event, the warping methods also impacted the continuous statistics that reflect the intensity error. The MAE, the RMSE, and the correlation were significantly improved after warping in both cases (see Table 1 and Table 4). The warping methods did not modify the intensity of the event, but they removed the double-penalty part of the error by making the events in the two estimates (here, IMERG-Late and TAHMO) fit better in terms of position and timing. This can be seen by the increase of the correlation after warping.
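The double-penalty effect can be reproduced with a toy example. In the sketch below, the same event is compared with a reference once with a delay and once perfectly aligned. The series are hypothetical, and the helper `continuous_stats` is only an illustration of the MAE, RMSE, and correlation, not the validation code used in this study.

```python
import numpy as np


def continuous_stats(estimate, reference):
    """Return (MAE, RMSE, Pearson correlation) between two series."""
    estimate = np.asarray(estimate, dtype=float)
    reference = np.asarray(reference, dtype=float)
    err = estimate - reference
    mae = float(np.mean(np.abs(err)))
    rmse = float(np.sqrt(np.mean(err ** 2)))
    corr = float(np.corrcoef(estimate, reference)[0, 1])
    return mae, rmse, corr


# Hypothetical reference series (mm/h)
reference = np.array([0, 0, 1, 5, 2, 0, 0, 0, 0, 0], dtype=float)
# Same event and same intensity, but three time steps late
shifted = np.roll(reference, 3)
# Same event after an ideal timing correction
aligned = reference.copy()

mae_shifted, rmse_shifted, corr_shifted = continuous_stats(shifted, reference)
mae_aligned, rmse_aligned, corr_aligned = continuous_stats(aligned, reference)
```

The delayed event is penalized twice: once where rain is estimated but not observed, and once where rain is observed but not estimated. Removing the delay removes both contributions without changing any rainfall value.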
The warping methods seemed to benefit the large rainfall values. This was not due directly to the warping methods, but to the registration methods. The automatic registration is based on the minimization of a cost function. The larger rainfall values had more weight than the low ones in the cost function, and so were corrected with priority during the minimization. The correction of the high rainfall values can be to the detriment of the lower ones. A similar observation can be made for the timing error after time warping. This characteristic can be a drawback for applications for which lower rainfall amounts are important. It also shows a limitation of our validation method, since the timing and position errors are defined with respect to the peak (i.e., the maximum rainfall). They do not take into account the spatial or temporal variation of the lower rainfall. Thus, they can be biased and have to be considered with caution.
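A minimal sketch of peak-based error measures illustrates this limitation. The functions `timing_error` and `position_error`, the 0.5 h time step, the 10 km grid spacing, and the use of a Euclidean distance on the grid are assumptions made for illustration; they are not the exact definitions used in the evaluation.

```python
import numpy as np


def timing_error(estimate, reference, dt_hours=0.5):
    """Absolute difference (h) between the peak times of two series."""
    steps = abs(int(np.argmax(estimate)) - int(np.argmax(reference)))
    return steps * dt_hours


def position_error(estimate, reference, cell_km=10.0):
    """Distance (km) between the grid cells holding the maximum of two 2D fields."""
    ie = np.unravel_index(np.argmax(estimate), estimate.shape)
    ir = np.unravel_index(np.argmax(reference), reference.shape)
    return float(np.hypot(ie[0] - ir[0], ie[1] - ir[1])) * cell_km


# Hypothetical series (mm/h): the estimate peaks two steps late
reference_series = np.array([0, 1, 6, 2, 0, 0, 0, 0], dtype=float)
estimate_series = np.array([0, 0, 0, 1, 6, 2, 0, 0], dtype=float)
t_err = timing_error(estimate_series, reference_series)

# Hypothetical fields (mm/h) with a single peak each
reference_field = np.zeros((8, 8))
reference_field[2, 3] = 10.0
estimate_field = np.zeros((8, 8))
estimate_field[5, 7] = 10.0
p_err = position_error(estimate_field, reference_field)

# Light rain added away from the peak leaves the peak-based error unchanged
estimate_with_drizzle = estimate_field.copy()
estimate_with_drizzle[0, 0] = 1.0
p_err_drizzle = position_error(estimate_with_drizzle, reference_field)
```

The last comparison shows why these measures have to be read with caution: any displacement of the lower rainfall is invisible to them.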
The spatial and time automatic registration methods have the advantage of not needing any manual selection. However, this also means that the inputs have to be similar enough for them to perform correctly. This condition is the main limitation of the warping methods. We do not have clear criteria to determine beforehand if the inputs are similar enough. However, there are some minimum conditions, such as having the same number of events. So far, we have used a visual inspection of the input data to assess their similarity. A next step will be to apply this method to other cases, involving different rainfall regimes. More case studies, including extreme ones, are needed to determine the boundaries within which the automatic registration succeeds and to determine “feasibility” thresholds.
The spatial warping method used in this paper built upon previous work in [25]. The main difference was the processing of several time steps at the same time in the automatic registration. In [25], the automatic registration was applied to one time step. To warp the entire rainfall event, one would need to apply the registration to each time step separately. Processing several time steps at the same time allowed us to add an assumption on the relationships of the mappings through time. This was done by adding a regulation term to the cost function. This term links the mappings through time, ensuring their temporal consistency.
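One possible form of such a term is sketched below. It penalizes the squared difference between the displacement fields of consecutive time steps. This quadratic form, the function name `temporal_penalty`, the array layout, and the coefficient `coeff` are assumptions made for illustration; the sketch does not reproduce the term actually used in the cost function.

```python
import numpy as np


def temporal_penalty(displacements, coeff=1.0):
    """Penalty on changes of the displacement field between consecutive time steps.

    displacements: array of shape (n_times, ny, nx, 2) holding, for each time
    step and grid node, a hypothetical (dy, dx) displacement.
    """
    d = np.asarray(displacements, dtype=float)
    # np.diff along axis 0 gives the change from one time step to the next
    return coeff * float(np.sum(np.diff(d, axis=0) ** 2))


# Same displacement at every time step: fully consistent in time
consistent = np.ones((3, 4, 4, 2))
# Middle time step displaced differently: a jump in time
jumpy = consistent.copy()
jumpy[1] += 2.0

penalty_consistent = temporal_penalty(consistent)
penalty_jumpy = temporal_penalty(jumpy)
```

A term of this kind is zero when all time steps share the same mapping and grows when the mapping jumps between consecutive time steps, so minimizing it favors mappings that evolve gradually.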
The spatial automatic registration used in [25] and in this article was based on the one described in [23,24]. Reference [24] described a data assimilation method based on morphing in order to assimilate radar precipitation data into a numerical weather model. However, they did not implement it or apply it to actual rainfall data. We modified and extended their automatic registration function. The first difference is the type of observation data. We used nongridded gauge measurements as observations instead of gridded radar data. Using nongridded data introduces additional uncertainties since we have to interpolate. The cost function was modified accordingly, so that it was not too influenced by areas without gauges. The second difference is the optimization method: they solved the minimization problem iteratively for one grid point at a time, while we solved it for all grid points together. The third main difference is the extension to take into account several time steps at a time.
As mentioned in the Introduction, time or spatial warping methods have been applied to rainfall data in previous works, but not for position and timing correction. For example, dynamic time warping has been used to classify rainfall time series or to measure dissimilarities between them [8,9,10,11,12]. Spatial warping or similar methods have been used for position correction in the framework of data assimilation into numerical weather models. However, they have not been applied to rainfall, but to other weather-related variables [15,16,17,18,19,20]. An exception is [21], in which a feature calibration and alignment method, similar to spatial warping, was used to adjust a satellite-based rainfall estimate with respect to radar observations. There are three main differences between their method and the spatial warping described in this article. First, they corrected both the intensity and the position, while we only corrected the position. Second, they processed each time step individually, while we processed all the time steps in the time window at once. This allowed us to take into account the temporal relationship between the displacement fields of the different time steps. Third, they used radar data that were gridded, while we used nongridded gauge observations.
5.1. Parameters Influencing the Warping
The mappings were derived from the automatic registration, which can be tuned by several parameters. These parameters thus impacted the mappings and then the warped fields. The background term of the cost function consists of several regulation terms corresponding to properties characterizing the “optimal” mapping. These terms are weighted by the regulation coefficients
,
,
,
, or
. The impact of the coefficients on the mappings and warped fields was examined for the synthetic case (not shown here; see [
31]). The regulation terms mainly affect the area or times with no or low rainfall, and so have a limited impact on the warped fields in terms of continuous statistics and of the position and timing errors. Very large regulation coefficients are needed for them to have an effect on larger rainfall values (i.e., more than 5 mm/h). Nevertheless, they had a visible and valuable effect on the mappings. For example, the “smoothness” property controlled by the coefficient
prevents nonphysical discontinuity in space (time) for the spatial (time) method. For the spatial warping, the low precipitation is removed in the preprocessing step; thus, this property ensures that it is moved along with the higher rainfall values.
Another parameter influencing the automatic registration is the number of steps I. This number controls two things: the smoothing of the input data and the resolution of the mappings. Thus, a higher number of steps I means that more details are taken into account and that mappings themselves have more details. On the other hand, increasing I also increases the computational cost, since the number of variables in the minimization problem increases exponentially with I. For example, for the spatial registration, the first step (i.e., ) took 20 s, the second one more than 2 min, the third one 17 min, and the fourth one 47 min. Similarly, for the time registration, the first step ran in less than 6 s, while the third one needed almost a minute. In our case studies, with 65 stations, the computational aspect was not a problem for the time registration, but it could become more critical for cases with more stations.
When using nongridded data, there is another important step that influences the results: the interpolation. In the case of the spatial warping, the measurements from the stations have to be interpolated onto a regular grid before being used in the automatic registration. This step introduces some interpolation errors, especially in the areas far from the stations. Depending on its position, the rainfall event was more or less well captured by the stations and so represented by the interpolated field. This can be seen by comparing the “LOOV” and “All” experiments in the southern Ghana case. The standard deviation of the mappings (across the LOOV) was very low; however, some members did show a large deviation from the mean. That is, some stations had a bigger impact than others on the accuracy of the interpolation and, thereby, on the automatic registration. In the two cases studied here, the rainfall event was relatively well captured by the gauges. In practice, that would not always be the case, depending on the position of the event and on the network configuration. In the case of the time warping, the interpolation occurred after the registration. The mappings at the station locations had to be interpolated on the regular grid of the satellite-based estimates. This step was necessary to be able to warp the entire domain. We chose to use an ordinary kriging method, because it can be used for interpolation and extrapolation and can be linked to the spatial regulation term. The same variogram was used to define the influence function in this regulation term and for the ordinary kriging. However, other interpolation methods could have been used. The study of the impact of other interpolation methods is left for future work.
5.2. Computational Cost and Alternative Methods
Once we have the mappings, applying them to the first guess is straightforward. The computational cost of the warping methods comes from the automatic registration, and more specifically from the minimization. The cost of the minimization increases with the number of variables. For the spatial registration, the number of variables to be determined depends on the number of steps I and the number of time steps. The number I controls the resolution of the mapping, that is, the number of grid points. Thus, the choice of the number of steps is a trade-off between the resolution and the computational cost. The number of variables also increases with the number of time steps to be processed at the same time. Therefore, a way to reduce the computational cost would be to reduce the time window. If this is not possible due to the length of the rainfall event, an alternative would be to use a moving time window. We would then run several smaller minimization problems instead of one large one. Another possibility would be to modify the assumption on the time dimension of the mappings. In the present method, the time steps are processed together so that they can be linked through time. An alternative approach is to assume that the mappings at the different times are independent (as was done in [25]). Then, we have several smaller minimization problems that can be run in parallel. A downside of this approach is that the mappings of two consecutive time steps can be very different. This can cause some nonphysical discontinuities in time, such as a sudden jump of the rainfall position. Another alternative approach is to still process all the time steps together, but to assume they have the same position error and so the same mapping. We would then solve the minimization problem for only one mapping. This would be the least expensive approach, since the number of variables would no longer depend on the number of time steps. However, it also relies on a more rigid assumption. The further the data are from this assumption, the less efficient this approach will be.
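The moving time window mentioned above can be sketched as a simple partition of the time steps. The function name `moving_windows` and its parameters are hypothetical. The overlap between consecutive windows is an assumption added here so that neighboring windows share some time steps; it is not a feature of the method described in this article.

```python
def moving_windows(n_steps, window, overlap=0):
    """Return (start, stop) index pairs covering range(n_steps).

    Each pair delimits one smaller registration problem; `stop` is exclusive.
    """
    if n_steps < 1:
        raise ValueError("n_steps must be at least 1")
    if not 0 <= overlap < window:
        raise ValueError("overlap must satisfy 0 <= overlap < window")
    stride = window - overlap
    windows = []
    start = 0
    while True:
        stop = min(start + window, n_steps)
        windows.append((start, stop))
        if stop == n_steps:
            break
        start += stride
    return windows


# Ten time steps split into windows of four steps sharing one step
windows = moving_windows(10, window=4, overlap=1)
# An event shorter than the window yields a single problem
short_event = moving_windows(3, window=4, overlap=1)
```

Each window would then be passed to the registration separately, so the number of variables per minimization depends on the window length rather than on the full duration of the event.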
Similarly, for the time registration, the number of variables, and so the computational cost, depends on the number of steps I and on the number of stations processed at once. In our two case studies, with 65 stations, the computational cost was not a problem, but could become one for larger cases with more stations. As for the spatial registration, we could then consider alternative approaches. The first one would be to assume that the timing errors of the stations are independent and apply the registration to each station separately. Solving several small minimization problems is faster than solving one large one. Moreover, they could be run in parallel since the stations are independent. However, this approach could lead to nonphysical spatial distortion if the mappings of neighboring stations are very different. The second approach is to assume that all stations have the same timing error and so need the same mapping. This approach would be less computationally expensive, but also more rigid. It could also lead to a more limited improvement because of the rigidity of the assumption.
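The two alternatives can be contrasted with a deliberately simplified sketch in which the flexible time mapping is replaced by a single constant lag per series, chosen by cross-correlation. This simplification, the function names, and the three hypothetical stations are assumptions made for illustration only.

```python
import numpy as np


def lag_score(est, ref, lag):
    """Unnormalized cross-correlation of `est` delayed by `lag` steps relative to `ref`."""
    n = len(ref)
    if lag >= 0:
        return float(np.dot(ref[:n - lag], est[lag:]))
    return float(np.dot(ref[-lag:], est[:n + lag]))


def independent_lags(est, ref, max_lag):
    """First alternative: one lag per station, each chosen separately."""
    lags = range(-max_lag, max_lag + 1)
    return [max(lags, key=lambda k: lag_score(e, r, k)) for e, r in zip(est, ref)]


def common_lag(est, ref, max_lag):
    """Second alternative: a single lag shared by all stations."""
    lags = range(-max_lag, max_lag + 1)
    return max(lags, key=lambda k: sum(lag_score(e, r, k) for e, r in zip(est, ref)))


# Hypothetical gauge series (mm/h), identical at three stations
event = [0, 1, 5, 2, 0, 0, 0, 0]
ref = np.array([event, event, event], dtype=float)
# Hypothetical estimates delayed by 1, 2, and 2 time steps
est = np.array([np.roll(event, 1), np.roll(event, 2), np.roll(event, 2)], dtype=float)

per_station = independent_lags(est, ref, max_lag=3)
shared = common_lag(est, ref, max_lag=3)
```

The per-station problems do not depend on each other and could be dispatched to separate processes. The shared lag is cheaper to estimate, but it leaves a residual delay at the first station, which illustrates the cost of the more rigid assumption.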
6. Conclusions
The use of warping to correct position and timing errors in rainfall estimates was tested on two case studies. The first one was a synthetic case where the rainfall events were represented by ellipses. It was used to evaluate the warping in a case for which the “truth” was known. The second case was a convective rainfall event over southern Ghana and allowed us to test the warping methods on real datasets.
Both the spatial and the time warping had a positive impact on the rainfall estimates:
The continuous statistics were significantly improved after the warping either in time or in space, e.g., the correlation went from 0.2 to about 0.6 after either warping for the southern Ghana case;
The timing error was considerably decreased by both types of warping, sometimes more by the spatial one than by the time one. In the southern Ghana case, the spatial and time warpings decreased the average timing error from 1.1 h to 0.40 h and 0.20 h, respectively. In the synthetic case, the error was reduced from 2.21 h to 0.50 h by the spatial warping, but to 0.77 h by the time warping;
The position error was decreased significantly by the spatial warping; however, the time warping only had a limited impact on it. The average position error was reduced by about 45 km after spatial warping in the southern Ghana case.
Hence, although both the spatial and the time warping are able to improve the rainfall estimates (in terms of continuous statistics), the spatial warping is more attractive because of its positive impact on both the position and timing errors. However, the spatial warping is also more computationally expensive.
The main drawback of the warping method comes from the automatic registration method. The first guess and the target (or truth) have to be similar enough for the automatic registration to produce meaningful mappings. In the two cases we studied here, the rainfall events were clearly defined and had a unique peak, which made them very suitable for the automatic registration. However, the registration can fail if the rainfall event from the satellite-based estimate and the one from the measurements are too dissimilar. For example, this happens if they have a different number of peaks. Moreover, the events have to be well captured by the gauge networks. This is particularly important for the spatial warping, which relies on the interpolated field. If the peak of the event is not captured by the gauges, the interpolated field will not be able to represent the event accurately (e.g., the center is not at a good position). In turn, the registration assumes that the interpolated field is the truth, which can lead to errors.
The two rainfall events studied in this paper showed the potential of the spatial and time warping methods to improve rainfall estimates. However, more cases are needed to better assess the performance of the warping methods and their limits. Warping should be tested on more cases with different rainfall regimes. This would allow us to better determine the limits of the method and to derive clear feasibility criteria. As mentioned above, the main limitation of the warping is that the first guess and the observation fields have to be similar enough. The feasibility criteria would quantify this similarity in order to know beforehand whether the automatic registration would succeed. More cases would also allow us to investigate further the sensitivity of the automatic registration (with respect to the regulation coefficients or to the number of steps I).