New Algorithm for Rain Cell Identiﬁcation and Tracking in Rainfall Event Analysis

: This study proposes a new algorithm termed rain cell identiﬁcation and tracking (RCIT) to identify and track rain cells from high resolution weather radar data. Previous algorithms have limitations when tracking non-consequent rain cells owing to their use of maximum correlation coe ﬃ cient methods and their lack of an alternative way to handle the variation stages of rain cells during their life cycles. To address these deﬁciencies, various methods are implemented in the new algorithm. These include the particle image velocimetry (PIV) method for motion estimation and the rain cell matching rule to obtain the stage changes of rain cells. High resolution (5 min and 1 km) radar data from three rainy days over the German federal state North Rhine Westphalia (NRW) are used in this study. The performance of the identiﬁcation module for the new algorithm is accessed by two object-oriented veriﬁcation methods: structure–amplitude–location (SAL) and geometric index, while the performance of the tracking module is compared with TREC and SCOUT tracking algorithms and evaluated by the contingency table veriﬁcation approach. Results suggest that the performance of the new algorithm is better than reference tracking method. Application of the RCIT algorithm to the selected cases shows that the inner structure of rainfall events in the experimental region present extreme value distributions, with most rainfall events having a short duration with less intensity. The new algorithm can e ﬀ ectively capture the stage changes of rain cells during their life cycles. The proposed algorithm can serve as the basis for further hydro-meteorological applications such as spatial and temporal analysis of rainfall events and short-term ﬂood forecasting.


Introduction
Precipitation is a key process in Earth's water circle. Acquiring explicit knowledge about its inner behavior is critical to assisting us in understanding its interaction with hydrological processes. Rainfall events are characterized by several elements, such as duration, intensity, velocity, and spatial and temporal variability [1]. The variability of rainfall events can be defined as "the variability derived from having multiple spatially-distributed rainfall fields for a given point in time" [2]. In hydro-meteorological applications, elements of rainfall events (e.g., duration, covering area, intensity, velocity) always vary over the life cycle of events. As a consequence, the responses of hydrology models are sensitive to this variation. Modeling rainfall events and analyzing their spatial and temporal information is necessary, in order to improve the accuracy of short-term rainfall forecasting and to lower the uncertainties of hydrological modeling from the variability of rainfall inputs.
For rainfall event monitoring, intensity and cumulative value are the two most common indices and they are usually measured using a rain gauge, which is the standard instrument for providing direct observations. Nevertheless, a rain gauge cannot directly detect variability in rainfall events, it is also subject to errors owing to topography and wind effects. As a possible alternative, weather radars have played a major role in recent years owing to their high spatial and temporal resolution. This is advantageous in terms of (i) acquiring spatial and temporal patterns when modeling rainfall events and (ii) undertaking short-term rainfall forecasting at fine scales. Identifying and tracking rainfall events is a common task in radar-based meteorological and hydrological applications [3][4][5][6].
Broadly speaking, the corresponding algorithms for radar-based rainfall event identification and tracking can be classified into pixel-and object-based approaches. The pixel-based approaches are also referred to as advection field tracking. These are pattern matching approaches for extracting global motion vectors by searching for the maximum correlation coefficient of rain cells in two consecutive radar images. Pixel-based algorithms are more effective when they are applied in convective rainfall nowcasting, which is usually found in frontal systems [7]. Algorithms of this kind are capable of distinguishing between large scale convective and stratiform rainfall, although they are not so effective for individual rain cell tracking. Herein, a rain cell is defined as the closed contours over which rainfall intensity exceeds a given threshold during one rainfall event [8]. Representative algorithms can be listed as follows: TREC [9], TITAN [10], COTREC [11], and SWIRLS [12]. Object-based approaches (also termed cell tracking) include (i) a detecting algorithm for searching a discrete rain cell's properties (e.g., centroid, area, echo-tops, and vertical-integrated) in consecutive radar images and (ii) a matching algorithm for tracking cell motion and stage changes (e.g., merging and splitting). Representative algorithms can be listed as follows: SCOUT [13], SCIT [14], Trace3D [15], and PERsiann-ForeCAST [16].
Despite the thorough application of these rain cell tracking algorithms, they have the following deficiencies. For pixel-based approaches, they cannot identify and analyze the properties of rain cells, and motion estimates are mostly based on the maximum correlation coefficient, which may yield non-continuous results when fast decay of rainfall occurs. Object-based approaches can quantify the characteristics of rain cells and reflect their dynamics, however, motion estimates obtained from the rain cell center of mass may lack accuracy owing to the random center of mass displacement problem. This problem occurs when the shape of rain cells changes rapidly. As a consequence of motion estimate inaccuracies, these algorithms may also encounter difficulties when handling merging and splitting scenarios [17], whereas a rain cell can begin its life cycle by simply emerging at a location with no rain. For handling false merger and splits issues in rain cell tracking, some recently developed tracking algorithms use object-oriented discriminant methods based on some judging parameters (e.g., overlaps of rain cells in successive radar images, intensity difference of rain cells, maximum matching distance or search radius factors) [18][19][20][21].
This study attempts to make a further step to solve the issues of non-continuous motion estimation and false rain cell stage assignments in rain cell tracking, and a new algorithm: Rain Cell Identification and Tracking (RICT) is proposed. This paper is organized as follows. The RCIT algorithm is illustrated in Section 2. An introduction to the study area and radar data is given in Section 3. Section 4 presents application results of the RCIT algorithm to selected rainfall events that occurred in North Rhine Westphalia (NRW). The discussion is given in Section 5. In Section 6, the main conclusions and further expectations of this work are given.

Methodology
The aim of the RCIT is to analyze rainfall events by fully utilizing the merits of weather radar. It involves two modules: rain cell identification and tracking, as presented in Figure 1. The new algorithm combines pixel and object tracking approaches. However, it uses instantaneous flow velocity measuring methods to estimate global motion vectors instead of the maximum correlation coefficient method. This can avoid the non-continuous motion estimation results. A child rain cell matching rule is also defined, which uses rain cell features (i.e., overlap, difference between centroid and average global motion vector, and area difference) to get rain cell tracking results. The inputs to the proposed Atmosphere 2019, 10, 532 3 of 18 algorithm are radar reflectivity images and the outputs are the properties of rain cells such as area, intensity, and trajectory. algorithm combines pixel and object tracking approaches. However, it uses instantaneous flow velocity measuring methods to estimate global motion vectors instead of the maximum correlation coefficient method. This can avoid the non-continuous motion estimation results. A child rain cell matching rule is also defined, which uses rain cell features (i.e., overlap, difference between centroid and average global motion vector, and area difference) to get rain cell tracking results. The inputs to the proposed algorithm are radar reflectivity images and the outputs are the properties of rain cells such as area, intensity, and trajectory.

Rain Cell Identification Module
Similar to most object-based tracking algorithms, the rain cell identification module of the RCIT is based on discerning a connected domain above a given threshold. As presented in the left portion of Figure 1, each selected radar reflectivity image is initially filtered by the median filtering method [22] to remove noisy pixels (pixels with abnormally high reflectivity). Then, pixels above a given reflectivity threshold are assigned the value one, with the remainder assigned the value zero. A segmenting process is implemented to assemble and cluster pixels sharing the same reflectivity threshold into a connected area. In the segmenting process, the following rules suggested by [23] are obeyed: (i) if the reflectivity of a rainy pixel is lower than a given threshold Rt, then it is set to null. (ii) For each rainy pixel and its eight neighbors, if more than five of them are null, then it is set to null. (iii) If the pixel is spurred, then it is set to null. Herein, spur pixels are those isolated pixels whose reflectivity is different to others along the horizontal and vertical directions in the labeled binary image. (iv) If the area of a connected region is smaller than 9 km 2 , then it is ignored. All the segmented regions are then labeled and fitted with an ellipse shape. Their properties are extracted and stored in a relational database. The extracted properties are as follows: • Area (km 2 )-sum value for the number of pixels contained in one rain cell. Step illustration of RCIT algorithm.

Rain Cell Identification Module
Similar to most object-based tracking algorithms, the rain cell identification module of the RCIT is based on discerning a connected domain above a given threshold. As presented in the left portion of Figure 1, each selected radar reflectivity image is initially filtered by the median filtering method [22] to remove noisy pixels (pixels with abnormally high reflectivity). Then, pixels above a given reflectivity threshold are assigned the value one, with the remainder assigned the value zero. A segmenting process is implemented to assemble and cluster pixels sharing the same reflectivity threshold into a connected area. In the segmenting process, the following rules suggested by [23] are obeyed: (i) if the reflectivity of a rainy pixel is lower than a given threshold R t , then it is set to null. (ii) For each rainy pixel and its eight neighbors, if more than five of them are null, then it is set to null. (iii) If the pixel is spurred, then it is set to null. Herein, spur pixels are those isolated pixels whose reflectivity is different to others along the horizontal and vertical directions in the labeled binary image. (iv) If the area of a connected region is smaller than 9 km 2 , then it is ignored. All the segmented regions are then labeled and fitted with an ellipse shape. Their properties are extracted and stored in a relational database. The extracted properties are as follows: • Area (km 2 )-sum value for the number of pixels contained in one rain cell.

•
Areal rainfall depth (mm)-cumulative precipitation of one rain cell over a 5-min interval.

•
Areal mean rainfall depth (mm·km −2 )-ratio of the areal rainfall depth and area.

•
Eccentricity-ratio of minor and major axes, which are acquired from the fitted eclipse. Used to describe the shape of one rain cell with a value range from 0 to 1. • Center of mass (km)-center of mass of a rain cell, which is weighted by the reflectivity of rainy pixels.
Property calculation: Areal rainfall depth, maximum intensity, and areal mean rainfall depth are based on the reflectivity (Z) and rain rate (R) converting function: Z = aR b .

Rain Cell Tracking Module
The rain cell tracking module is established based on a hybrid approach, as illustrated in the right portion of Figure 1. In the first procedure, motion vectors are estimated by implementing the particle image velocimetry (PIV). This is an optical method of flow visualization that is used to obtain instantaneous velocity measurements and related properties of fluids [24][25][26]. It consists of a class of flow measuring mechanisms that are characterized by recording the displacement of small particles embedded in a region of fluid. Figure 2 shows a PIV application in motion vector estimation. In the first step, a window box of size r × r is initially defined, which divides radar images into several sub-blocks. In the second step, a searching distance, d = 2 × v max + 1, is defined, where v max is the preset maximum velocity. The minimum quadric difference (MQD), as suggested by [27], is employed in searching the optimal grid points at time t + ∆t, as in Equation (1): where R 1 (X i ,Y j ) and R 2 (X i ,Y j ) are the reflectivity of grid points contained within the window boxes of radar images at time t and t + ∆t, respectively; ∆x and ∆y (∆x, ∆y ∈ d) are the locations of minimum reflectivity difference in the east-west and north-south directions separately. The minimum reflectivity differences of grid points within the window boxes are reversed to simplify the calculation. In order to guarantee that the solitary peak locations can be calculated, ∆x and ∆y are corrected separately in the east-west and north-south directions by fitting a second-order polynomial to the logarithm of the maximum reflectivity of the grid point and its three direct neighbors, as in Equation (2). In this way, the optimal grid points at time t + ∆t are identified, with their locations presented as (x + ∆x − d+1 2 , y + ∆y − d+1 2 ). In the final step, the calculated motion vectors are smoothed by the median filter algorithm.
In the second procedure, rain cells at time t and t + ∆t are identified. Finally, a child rain cell matching rule is defined in this work for identifying the most-matched rain cells. Before introduction of the child rain cell matching rule, certain definitions were identified: (i) for two radar images at time t and t + ∆t, rain cells identified from radar images at t are termed parent cells and (ii) rain cells identified from radar images at t + ∆t are termed child cells. The child rain cell matching scheme considers the stage changes of rain cells between successive radar images (e.g., merge, split), using certain indexes for determination such as overlap, area diversification, distance, and angle difference for center of mass. The matching rule can be depicted as follows: • A boundary box of a parent cell is defined, with a horizontal length of (10 + max(pos x ), min(pos x ) − 10) and vertical length of (10 + max(pos y ), min(pos y ) − 10), where pos x and pos y are Cartesian coordinates of pixels in the parent cell.

•
The number of child cells falling into the boundary box is determined and their properties, e.g., area, areal rainfall depth, max intensity, areal mean rainfall depth, and center of mass, are selected.

•
If only one child cell is searched in the boundary box and it overlaps with a parent cell, then it is the most-matched rain cell. If this child cell does not overlap with a parent cell and the distance and angle difference for the center of mass between it and the parent cell are less than 3 × mean (V motion_vector ) and 3 × θ motion_vector , it is also the most-matched rain cell, where mean (V motion_vector ) and θ motion_vector are the mean value of velocity and the prevailing direction of the motion vector, respectively. • If two or more child cells fall into the boundary box without overlapping a parent cell, the matching rule is changeless; however, one extra condition is included, i.e., child cells whose areas have minimum absolute differences with the parent cell are the most-matched rain cells.
• A boundary box of a parent cell is defined, with a horizontal length of (10 + max(posx), min(posx) − 10) and vertical length of (10 + max(posy), min(posy) − 10), where posx and posy are Cartesian coordinates of pixels in the parent cell.

•
The number of child cells falling into the boundary box is determined and their properties, e.g., area, areal rainfall depth, max intensity, areal mean rainfall depth, and center of mass, are selected.

•
If only one child cell is searched in the boundary box and it overlaps with a parent cell, then it is the most-matched rain cell. If this child cell does not overlap with a parent cell and the distance and angle difference for the center of mass between it and the parent cell are less than 3 × mean (Vmotion_vector) and 3 × θmotion_vector, it is also the most-matched rain cell, where mean (Vmotion_vector) and θmotion_vector are the mean value of velocity and the prevailing direction of the motion vector, respectively.
• If two or more child cells fall into the boundary box without overlapping a parent cell, the matching rule is changeless; however, one extra condition is included, i.e., child cells whose areas have minimum absolute differences with the parent cell are the most-matched rain cells. Step one: window boxes with the area of r × r are defined (rectangular with red color); step two: for any grid point in the window box at a previous timepoint (red block), the minimum quadric difference (MQD) algorithm is applied to deduce the minimum reflectivity differences, and grid points with minimum value in the next window box are identified (blue blocks); step three: the solitary locations of reversed MQD value, Δx and Δy, are corrected by applying Equation (2). The locations of the optimal grid point at the next timepoint are calculated using the second-order polynomial function, and the global motion vectors are extracted and smoothed by the median filter method. Step one: window boxes with the area of r × r are defined (rectangular with red color); step two: for any grid point in the window box at a previous timepoint (red block), the minimum quadric difference (MQD) algorithm is applied to deduce the minimum reflectivity differences, and grid points with minimum value in the next window box are identified (blue blocks); step three: the solitary locations of reversed MQD value, ∆x and ∆y, are corrected by applying Equation (2). The locations of the optimal grid point at the next timepoint are calculated using the second-order polynomial function, and the global motion vectors are extracted and smoothed by the median filter method.

Study Area and Data
The study area is the federal state of NRW ( Figure 3). Two main types of landscape can be found in NRW, namely the North German lowlands, with elevations just a few meters above sea level, and the North German low mountain range, with elevations of up to 850 m. The lowland areas comprise the Rhine-Ruhr area, which is one of the largest metropolitan areas, with a population of approximately 10 million. The circulation pattern of NRW is mainly affected by the air mass from the Atlantic Ocean along the direction toward the mainland. When arriving at the southern high mountain regions, the air mass stops and rises; this leads to more cloudiness and precipitation. On the eastern side of the mountain regions, drier air masses result in less cloudiness and less precipitation.
velocity, together with intensity volume scans (frequency: 500 Hz, maximum range: 256 km) and precipitation scans (frequency: 600 Hz, maximum range: 150 km) every 5 min for the precipitation echo, with a high spatial resolution of 1 km in range and 1° in azimuth. For this study, the Essen radar provided precipitation scans (time step: 5 min) with an elevation of 0.8° and a range of 128 km. The output reflectivity was selected with a plan position indicator (PPI) display type. As radar measures precipitation in an indirect manner, the quality of radar data must be carefully checked. The sources that affect the quality of the Essen radar data include ground clutter and speckle, beam blockage, and attenuation. The corresponding quality correction methods in this Radar data were obtained from the Essen radar deployed in Essen City, NRW. The Essen radar has been deployed over the study area and is a part of the radar network of the German weather service (DWD). The Essen radar is a dual-polarimetric C-band Doppler radar. It delivers radar volume scans (frequency: 800/1200 Hz, maximum range: 124 km) every 15 min for the Doppler velocity, together with intensity volume scans (frequency: 500 Hz, maximum range: 256 km) and precipitation scans (frequency: 600 Hz, maximum range: 150 km) every 5 min for the precipitation echo, with a high spatial resolution of 1 km in range and 1 • in azimuth. For this study, the Essen radar provided precipitation scans (time step: 5 min) with an elevation of 0.8 • and a range of 128 km. The output reflectivity was selected with a plan position indicator (PPI) display type.
As radar measures precipitation in an indirect manner, the quality of radar data must be carefully checked. The sources that affect the quality of the Essen radar data include ground clutter and speckle, beam blockage, and attenuation. The corresponding quality correction methods in this work follow [28]. After quality checking, an open source package "Wradlib" [29] was applied to project the raw radar image onto a 256 × 256 km 2 Cartesian map with 1-km resolution. In order to access the performance of the RCIT algorithm, selected rainfall events for testing should be those with fast growing and decaying intensities and distinct moving directions. In total, 864 radar reflectivity images for three rainy days

Results
In the algorithm application process, the reflectivity threshold for rain cell identification was set to 19 dBZ, which was based on the classification of DWD presented by [30], as in Table 1. For the German weather radar system, two common Z-R relationships were used by [30]: one is categorized for the RADOLAN product and the other uses constants a and b with values of 256 and 1.42, respectively. Although the DWD has stated that the categorized relationship statistically shows better results over long time periods, the standard relationship can be more compatible with local cases when a correction factor is added [31]. Based on the above considerations, we applied the DWD standard Z-R relationship to radar reflectivity-rain rate conversion in the application cases. There were 10,346 rain cells identified from the selected radar reflectivity images. Descriptive statistics of their properties are given in Table 2. It was found that a high standard deviation existed for the areas of these rain cells, with values ranging from 9 to 18,734 km 2 (most were less than 38 km 2 ). For areal rainfall depth, values ranged from 0.36 to 8861 mm and a high standard deviation again existed. For the max intensity property, a high standard deviation (34.08 mm·h −1 ) was also found; the median was 2.83 mm·h −1 and the range of values was 0.48 to 397.75 mm·h −1 . Areal mean rainfall depth was from 0.04 to 4.4 mm·km −2 . Eccentricity ranged from 0 to 1, with a median over 0.5. Inner structures of the selected events were described by statistically analyzing the physical and geometric properties of the rain cells. RCIT simulation results indicated that the properties (e.g., area, areal rainfall depth, max intensity, and areal mean rainfall depth) of the identified rain cells presented a wide range of values. The shape of the rain cells was somewhat elliptical, with a median value over 0.5. Histograms of the log 10 -transformed rain cell properties are given as in Figure 5. To determine the best theoretical distributions describing the empirical distributions, a multi-goodness of fit testing (GOF) approach combined with the Akaike information criterion (AIC), the Bayesian information criterion (BIC), and the Kolmogorov-Smirnov (K-S) test methods was applied (see Appendix A). Figure 6 shows their fitted cumulative distributions. Empirical distributions of the log 10 -transformed properties (area, areal rainfall depth, maximum intensity, and areal mean rainfall depth) could be fitted with the generalized Pareto distribution (GPD) presented in Equation (3), and the extreme value distribution (EVD) was found to fit the eccentricity property shown in Equation (4).
where k is the shape parameter, and µ and α are location and scale parameters, respectively. For µ < 0, k is above zero, and for µ < x < α, k is below zero. At the limit for k = 0, the GPD is the exponential distribution.
For 1 + k( x−µ α ), when k > 0, the generalized EVD is the Frchet distribution; k < 0 corresponds to the Weibull distribution; at the limit for k = 0, it is the Gumbel distribution.
A total of 1107 rain cell trajectories were exploited. Histograms of their duration and motion vectors are presented in Figure 7. All rain cells held a mean duration of 40 min. For all the identified rain cells, the median value of their life cycles was 15 min, with an average moving speed of 11.59 m·s −1 . The moving directions of the rain cells were consistently toward the motion direction observed in the radar images.
Atmosphere 2019, 10, 532 9 of 18 A total of 1107 rain cell trajectories were exploited. Histograms of their duration and motion vectors are presented in Figure 7. All rain cells held a mean duration of 40 min. For all the identified rain cells, the median value of their life cycles was 15 min, with an average moving speed of 11.59 m·s −1 . The moving directions of the rain cells were consistently toward the motion direction observed in the radar images.  A total of 1107 rain cell trajectories were exploited. Histograms of their duration and motion vectors are presented in Figure 7. All rain cells held a mean duration of 40 min. For all the identified rain cells, the median value of their life cycles was 15 min, with an average moving speed of 11.59 m·s −1 . The moving directions of the rain cells were consistently toward the motion direction observed in the radar images.   These results were in agreement with the study of [32], in which the structures of convective rainfall events were analyzed and presented an extreme value distribution statistically. Statistical analysis of the rain cell properties suggests that the inner structures of the selected rainy days can be expressed by the extreme value distribution. This suggests that most rainfall events had a limited covering area with less intensity and short duration; rainfall events with a long duration had a large covering area and high intensity.
During the life cycle of a rainfall event, the physical and geometrical features of rain cells continually change. Three common stages reflect these variations: cumulus, mature, and dissipating [33]. In fact, the stage changes of rain cells are not only associated with its internal growth and decay but also with other rain cells (e.g., merging or splitting). Five rain cell life stages could be confirmed by references; their definitions are listed as follows (Figure 8a-e): • Initial: A rain cell having no parent cell is termed an initial rain cell.
• Tracking: A rain cell with only one parent cell and having no interaction with other rain cells during its life cycle is termed a tracking rain cell.
• Merge: A rain cell with at least two parent cells is termed a merged rain cell.
• Split: A rain cell with only one parent cell but at least two child cells is termed a split rain cell.
• Dissipate: A rain cell with at least one parent cell but no child cells is termed a dissipate rain cell.
Additionally, other two stages named '5-min life cycle' and 'Complex stage' were firstly defined in this study (Figure 8f,g). A rain cell with a life cycle of only 5 min is termed 5-min life cycle stage. A rain cell for which merging and splitting simultaneously exist during its life cycle is termed complex stage. Table 3. Rain cells with "5-min life cycle" and "tracking" stages were dominant. The "merging", "splitting", and "complex stage" cells occurred in isolated cases, indicating a stable inner structure of the identified cells. For the cases of 19 July 2008 and 26 July 2008, the number of rain cells in the "tracking" stage was even greater, indicating that the rainfall events occurring on these two days had long duration. The number of rain cells in the "merging" and "splitting" stages was greater in the 26 July These results were in agreement with the study of [32], in which the structures of convective rainfall events were analyzed and presented an extreme value distribution statistically. Statistical analysis of the rain cell properties suggests that the inner structures of the selected rainy days can be expressed by the extreme value distribution. This suggests that most rainfall events had a limited covering area with less intensity and short duration; rainfall events with a long duration had a large covering area and high intensity.

The number of rain cells with different stages over their life cycles is summarized in
During the life cycle of a rainfall event, the physical and geometrical features of rain cells continually change. Three common stages reflect these variations: cumulus, mature, and dissipating [33]. In fact, the stage changes of rain cells are not only associated with its internal growth and decay but also with other rain cells (e.g., merging or splitting). Five rain cell life stages could be confirmed by references; their definitions are listed as follows (Figure 8a-e): • Initial: A rain cell having no parent cell is termed an initial rain cell.

•
Tracking: A rain cell with only one parent cell and having no interaction with other rain cells during its life cycle is termed a tracking rain cell.

•
Merge: A rain cell with at least two parent cells is termed a merged rain cell.

•
Split: A rain cell with only one parent cell but at least two child cells is termed a split rain cell.

•
Dissipate: A rain cell with at least one parent cell but no child cells is termed a dissipate rain cell.
Additionally, other two stages named '5-min life cycle' and 'Complex stage' were firstly defined in this study (Figure 8f,g). A rain cell with a life cycle of only 5 min is termed 5-min life cycle stage. A rain cell for which merging and splitting simultaneously exist during its life cycle is termed complex stage.
The number of rain cells with different stages over their life cycles is summarized in Table 3. Rain cells with "5-min life cycle" and "tracking" stages were dominant. The "merging", "splitting", and "complex stage" cells occurred in isolated cases, indicating a stable inner structure of the identified cells. For the cases of 19 July 2008 and 26 July 2008, the number of rain cells in the "tracking" stage was even greater, indicating that the rainfall events occurring on these two days had long duration. The number of rain cells in the "merging" and "splitting" stages was greater in the 26 July 2008 case. This suggests that there were more convective rainfall events on that day since rain cells merge or split more frequently under such conditions. 2008 case. This suggests that there were more convective rainfall events on that day since rain cells merge or split more frequently under such conditions.

Discussion
Assessing the performance of the RCIT algorithm involves three parts: evaluating the performance of the rain cell identification module, assessing the performance of motion estimation against the TREC tracking algorithm, and rain cell merging and splitting evaluation against the SCOUT algorithm.

Evaluation of Rain Cell Identification Module by Feature-Based Verification Methods
Grid-point related error measurement is problematic for the rain cell tracking algorithm. A classic example illustrating this problem is the well-known "Double Penalty" problem [34], in which

Discussion
Assessing the performance of the RCIT algorithm involves three parts: evaluating the performance of the rain cell identification module, assessing the performance of motion estimation against the TREC tracking algorithm, and rain cell merging and splitting evaluation against the SCOUT algorithm.

Evaluation of Rain Cell Identification Module by Feature-Based Verification Methods
Grid-point related error measurement is problematic for the rain cell tracking algorithm. A classic example illustrating this problem is the well-known "Double Penalty" problem [34], in which prediction of a precipitation object at the correct size and structure might yield very poor verification scores. For example, one rain cell is displaced slightly in space, but the categorical verification scores heavily penalize such a situation. In traditional verification methods, a displacement simply leads to a false alarm, and it is also very poorly rated owing to its large root mean squared error [35]. On the other hand, despite a great deal of effort in the statistical validation of grid-based rainfall estimated results, verification associated with the geometry patterns of rain cells has not been well researched or applied.
As an alternative, feature-based verification methods have been built upon the idea of identifying rainfall events as "objects". With this perspective, simulated and observed rainfalls are not compared directly at the same location; rather, objects of interest are extracted from simulated and observed data and then compared together so that verification statistics are obtained. A number of spatial verification methods have been proposed [36,37]. In the present work two feature-based verification methods, structure-amplitude-location (SAL) [38] and geometric index [39], were implemented to evaluate the identification module of the RCIT algorithm. A detailed introduction to the SAL and geometric index methods is presented in Appendix B. The inputs for these two methods are radar reflectivity images (termed obs_ref) and reflectivity images of pixel sets for rain cells, these kind images are generated from the RCIT algorithm (termed sim_ref). Figure 9 shows the SAL verification results, which are arranged based on the three selected rainy days. For each SAL plot presented in Figure 9, the vertical axis denotes the A component, the horizontal axis denoted the S component, the dots represent values of the S and A components, and the color scale of the dots denotes the L component. Median values of the SAL components are presented as dashed lines. It can be seen that all S and A values are concentrated close to zero, as are most L component values. Table 4 gives three geometric index values calculates from the sim_ref and obs_ref data sets respectively. All values were again organized based on the three selected rainy days. It is evident that the index differences between the sim_ref and obs_ref data sets were less than 0.05.
Atmosphere 2018, 9, x FOR PEER REVIEW  12 of 19 other hand, despite a great deal of effort in the statistical validation of grid-based rainfall estimated results, verification associated with the geometry patterns of rain cells has not been well researched or applied.
As an alternative, feature-based verification methods have been built upon the idea of identifying rainfall events as "objects". With this perspective, simulated and observed rainfalls are not compared directly at the same location; rather, objects of interest are extracted from simulated and observed data and then compared together so that verification statistics are obtained. A number of spatial verification methods have been proposed [36,37]. In the present work two feature-based verification methods, structure-amplitude-location (SAL) [38] and geometric index [39], were implemented to evaluate the identification module of the RCIT algorithm. A detailed introduction to the SAL and geometric index methods is presented in Appendix B. The inputs for these two methods are radar reflectivity images (termed obs_ref) and reflectivity images of pixel sets for rain cells, these kind images are generated from the RCIT algorithm (termed sim_ref).  Figure 9 shows the SAL verification results, which are arranged based on the three selected rainy days. For each SAL plot presented in Figure 9, the vertical axis denotes the A component, the horizontal axis denoted the S component, the dots represent values of the S and A components, and the color scale of the dots denotes the L component. Median values of the SAL components are presented as dashed lines. It can be seen that all S and A values are concentrated close to zero, as are most L component values. Table 4 gives three geometric index values calculates from the sim_ref and obs_ref data sets respectively. All values were again organized based on the three selected rainy days. It is evident that the index differences between the sim_ref and obs_ref data sets were less than 0.05.   According to the definition of SAL, values of S and A components range from −2 to 2, and 0 denotes a very good match for the simulated data set to the observed data set. The SAL verification results suggested that rain cells from the RCIT algorithm can be well matched to rainfall fields as manually observed from radar reflectivity images. It was also suggested that there were no rain cells (area threshold > 9 km 2 ) not identified from the RCIT algorithm. Geometric index verification results indicated that spatial pattern difference between rain cells from the RCIT algorithm and manually observed rainfall fields from radar reflectivity images were not obvious. In general, the rain cell identification module for the RCIT algorithm performed well based on the two feature-based verification results.

Assessing the Performance of Motion Estimation against the TREC Algorithm
The performance of the rain cell tracking module is compared to the pixel-based tracking algorithm: TREC by operating a deterministic persistence forecasting [17]. The forecasting performance can reflect the motion estimation accuracy by neglecting growth and decay process [40]. In this work, eight short-term rainfall events were selected from three rainy days, as listed in Table 5. The forecasting is done as follows: motion vectors at time t -∆t and t were generated by RCIT and TREC algorithm separately, then radar reflectivity images at time t were extrapolated with a leading time up to 2 h by using the motion vectors derived from their corresponding tracking schemes. The reflectivity threshold for the forecasting is 19 dBZ, which is the same with the reflectivity threshold used in the rain cell identification module. Forecasting performance is accessed by contingency table approach [10], which calculates three skill scores, i.e., the probability of detection (POD), the false alarm rate (FAR), and the critical success index (CSI). If rainfall appears on both of the forecast results and radar images, they are recorded as hit.
If it only appears on radar images, it is marked as miss, and if it only appears on forecast results, it is regarded as false. By counting all the hits, misses, and false, POD, FAR, and CSI can be calculated according to the following formula, respectively. The higher value of POD and CSI, or the lower value of FAR, the better the tracking algorithm. Figure 10 gives the averaged value of these three skill scores for forecasting results. When the leading time was less than 30 min, skill score value differences were not obvious for forecasting results Atmosphere 2019, 10, 532 14 of 18 from both tracking algorithms, while the higher POD and CSI but lower FAR scores were found for the RCIT algorithm compared to the TREC algorithm as leading time increased. It indicated that identified motion vectors from the RCIT algorithm were more credible.  Figure 10 gives the averaged value of these three skill scores for forecasting results. When the leading time was less than 30 min, skill score value differences were not obvious for forecasting results from both tracking algorithms, while the higher POD and CSI but lower FAR scores were found for the RCIT algorithm compared to the TREC algorithm as leading time increased. It indicated that identified motion vectors from the RCIT algorithm were more credible.

Rain Cell Merging and Splitting Evaluation against the SCOUT Algorithm
Performance of rain cell merging and splitting assignment for the new algorithm is compared with the object-based tracking approach: SCOUT. The performance is also evaluated by contingency table, which follows the accessing method listed in [17]. We firstly acquired tracks of rain cells manually from selected eight rainfall events, then tracks from RCIT and SCOUT approaches were also acquired. These tracks are compared to those manually acquired, number of hit, miss, and false were counted and POD, FAR, and CSI were calculated for RCIT and SCOUT algorithms respectively. Table 6 shows three skill scores for two tracking algorithms. A look at the results in Table 6 indicated a good fit between the visual inspection and what the new algorithm captures for rain cells. The higher POD and CSI values for rain cells with 'Merge' and 'Split' obtained by the new algorithm reflected an improvement in the RCIT algorithm with respect to the SCOUT tracking approach. Moreover, lower FAR values for the new algorithm also suggested better capability to avoid wrong rain cell assignment for the RCIT algorithm.

Rain Cell Merging and Splitting Evaluation against the SCOUT Algorithm
Performance of rain cell merging and splitting assignment for the new algorithm is compared with the object-based tracking approach: SCOUT. The performance is also evaluated by contingency table, which follows the accessing method listed in [17]. We firstly acquired tracks of rain cells manually from selected eight rainfall events, then tracks from RCIT and SCOUT approaches were also acquired. These tracks are compared to those manually acquired, number of hit, miss, and false were counted and POD, FAR, and CSI were calculated for RCIT and SCOUT algorithms respectively. Table 6 shows three skill scores for two tracking algorithms. A look at the results in Table 6 indicated a good fit between the visual inspection and what the new algorithm captures for rain cells. The higher POD and CSI values for rain cells with 'Merge' and 'Split' obtained by the new algorithm reflected an improvement in the RCIT algorithm with respect to the SCOUT tracking approach. Moreover, lower FAR values for the new algorithm also suggested better capability to avoid wrong rain cell assignment for the RCIT algorithm.

Conclusions
This study develops a new algorithm, RCIT, which utilizes the advantages of high resolution weather radar data. The proposed algorithm provides the following improvements:

•
It uses the PIV method in rain cell motion estimation. Rain cell motion estimation by past algorithms is mainly based on the maximum correlation coefficient method, which may lead to nonconsecutive motion when the shape and volume of a rain cell change rapidly. The PIV method avoids this situation. • A rain cell matching rule is proposed to discern the life cycle and stage change of rain cells. Some other algorithms focus mainly on analyzing feature changes of rain cells (e.g., overlap, area, intensity) by some object methods in dealing with cell merge and splits. The proposed rain cell matching rule put global motion vector as one judging parameter in rain cell stage analysis, this can be easily operated, and various stages of rain cells can also be discerned effectively.
The performance of rain cell identification for the RCIT algorithm is accessed by using two feature-based verification methods, SAL and geometric index; performance of rain cell tracking for the new algorithm is compared to pixel and object based tracking algorithms (TREC and SCOUT) by using the contingency table method. It is shown that the new algorithm improves the capabilities of motion estimation and rain cell tracking. Practical applications of the RCIT algorithm in analyzing the inner structure of historical rainfall events that occurred in the NRW is presented. Results show that the properties of rain cells in this region presented an extreme value distribution, indicating that the selected rainfall events had a short duration with low intensity. Long duration events with high intensity are rarely found and the stage changes of rain cells vary between events.
It should be noted that inputs for the proposed algorithm are not limited to radar data; other 2D remote sensing data will also be used as the algorithm inputs, suggesting the versatility of the proposed algorithm. In future application, it is intended that this algorithm will analyze the spatial-temporal variation of rainfall in small regions; this will lead to the determination of rainfall inputs with proper spatial and temporal scales for hydro-meteorological applications. The proposed algorithm will also be applied to rainfall nowcasting. However, it must be noted that the new algorithm is designed without considering the internal growth and decay of rainfall events. This will limit the forecasting accuracy in some convective scenarios (e.g., tropical cyclone). In addition, the features of the rain cell output from this algorithm can be used in sensitivity analyses of urban runoff in relation to short-term rainfall events, which will improve flood forecasting precision in small-medium catchments.
• AIC [41] is based on the use of Kullback-Leible information as the discrepancy measure between the true distribution and the approximating distributions: M i = g i (x, p 1 , p 2 , . . . , p n ). The AIC for the ith candidate distribution can be computed as: where (θ) stands for the maximum log-likelihood of the sample of the data set, p is the parameter's number of candidate distributions when the sample size n is small with respect to the number of the estimated parameter P i . The smaller the value of AIC, the better fitting the result for the candidate distribution is.
• BIC [42] serves as an asymptotic approximation to a transformation of the Bayesian posterior probability of a candidate model. It is based on the empirical log-likelihood and does not require the specification of priors. BIC is defined as: where the symbols are the same as those in Equation (A2). A small value of BIC means that the candidate distribution fits well with the empirical distribution.

Appendix B.
Two feature-based verification methods, structure-amplitude-location (SAL) and geometric index, were applied to evaluate the performance of the RCIT algorithm and nowcasting methods. In SAL, term structure denotes the similarity in the shapes of modeled and observed rain cells; its value range is from −2 to 2. Amplitude denotes the similarity in the total precipitation values of modeled and observed rain cells; its value range is from −2 to 2. Location denotes the similarity of the center of mass for the modeled and observed rain cells; its value varies from 0 to 2. The accuracy of nowcasting methods can be evaluated based on the value of the three SAL components and a perfect nowcasting is confirmed by S, A, and L values of 0. More details on the SAL method can be found in [39].
Geometric index is a quantitative assessment method for the spatial patterns of rain cells [40]. It compares the geometric features of modeled and observed rain cells via three indexes:

•
Connectivity index: This is defined to compare simulated rain cells with respect to a reference object (e.g., observed rain cells). Its value is calculated based on the number of rain cells (NC) and the total number of non-zero pixels or pixels above a given threshold (NP), as in Equation (A4).
where C index is the connectivity index, NP is the number of rainy pixels above a given threshold, and NC is the number of rain cells.

•
Shape index: This index is introduced to quantitatively describe the shape discrepancy of rain cells, as in Equation (A5).
where S index is the single index, P min is the theoretical minimum perimeter, and P is the actual perimeter of the rain cell.

•
Area index: This is defined to depict the dispersiveness between the modeled and observed rain cells. Its value is the ratio of the area of its convex hull (the boundary of the minimal convex set containing a finite set of points in the rain cell), as in Equation (A6).