1. Introduction
Storm tracking techniques were originally designed to track storms in radar-based observations [1,2,3]. Recent decades have witnessed the rapid development of convection-allowing models (CAMs) [4,5,6,7]. The ability of CAMs to realistically simulate convective storms has motivated the adaptation of observation-based storm tracking techniques to track CAM-simulated storms [8]. A reliable CAM-based storm tracking technique is essential for accurately characterizing the evolution of simulated convective storms, including key attributes such as convection initiation (CI), storm motion, duration, and maximum size during the lifetime of a simulated storm. It provides a framework for systematically evaluating crucial aspects of CAM-simulated storms. Such a framework advances our understanding of the strengths and limitations of CAM forecasts so that they can be used more appropriately for severe weather forecasting.
The object-based method, which identifies objects to represent the storms present in a spatial variable at time t, is one of the most widely used methods for convective storm tracking and CAM forecast verification [9]. Assuming minimal changes in certain characteristics such as motion, size, and intensity, and/or in the changes of size and intensity, the object-based tracking of a storm from t to t + ∆t is generally a “predict and match” process [2,10,11]. Specifically, the storm object is first extrapolated to t + ∆t based on its status at t (i.e., location, size, intensity) and/or its attributes from t − ∆t to t (e.g., motion, changes in intensity/size). Then, the object(s) present at t + ∆t with attributes most comparable to those of the extrapolated object are associated with the original storm object.
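The “predict and match” idea can be illustrated with a minimal sketch. It assumes, purely for illustration, that each object is reduced to a centroid and a velocity and that "most comparable" means the smallest centroid distance; the dictionary keys and the function names `extrapolate` and `match` are hypothetical and not taken from any of the cited methods.

```python
import math

def extrapolate(obj, dt=1.0):
    """Predict the centroid at t + dt from the centroid and velocity at t."""
    x, y = obj["centroid"]
    u, v = obj["velocity"]
    return (x + u * dt, y + v * dt)

def match(obj, candidates, dt=1.0):
    """Return the candidate at t + dt whose centroid is closest to the prediction.

    Assumption: similarity is measured by centroid distance only; real methods
    also compare size, intensity, and other attributes.
    """
    px, py = extrapolate(obj, dt)
    return min(
        candidates,
        key=lambda c: math.hypot(c["centroid"][0] - px, c["centroid"][1] - py),
    )

# Toy example: positions in km, velocity in km per time step.
storm = {"centroid": (0.0, 0.0), "velocity": (30.0, 10.0)}
later = [
    {"id": "a", "centroid": (28.0, 12.0)},
    {"id": "b", "centroid": (5.0, -20.0)},
]
best = match(storm, later)
print(best["id"])
```

In this sketch the quality of the match depends entirely on the quality of the velocity estimate, which is the weakness the following paragraphs discuss.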
Although this extrapolation-based approach has shown promise, it is unclear to what extent an object’s status at t and its attributes from t − ∆t to t can be reliably used to predict its counterpart at t + ∆t. If too much confidence is given to a poorly predicted object, the derived track associating the object with some object(s) at t + ∆t may be false. False tracks derived from t to t + ∆t may in turn lead to more unreliable tracks from t + ∆t to t + 2∆t and so on. Although this caveat has been noted by many studies tracking both observed and CAM-simulated storms [10,12,13], we have not found any existing object-based tracking method that quantifies, or at least studies, the uncertainty associated with the attributes used for extrapolation.
The extrapolation-based approach is especially ill-suited to current CAM forecasts because it cannot accurately track spatially well-resolved but “short-lived” storms (i.e., storms seen on fewer than three contiguous time frames) without a high-quality “first-guess motion”. This is not a serious concern if the number of such storms is limited. However, the output frequency of current CAM forecasts is generally much coarser than their spatial resolution (e.g., ∆t = 1 h versus kilometer-scale horizontal grid spacing [14]), giving rise to a considerable number of such storms. The abundance of these storms greatly increases the likelihood of random, false connections among them and would inevitably affect the tracking reliability of nearby, long-lived storms, posing a serious challenge to the overall tracking quality of CAM-simulated storms [15]. We are not aware of any existing CAM-based tracking method that addresses this challenge associated with hourly CAM outputs.
To overcome the above limitations and improve the accuracy of CAM-based convective storm tracking, this paper proposes a new object-based method with two new features. First, the uncertainties associated with the attributes used in object extrapolation are explicitly quantified to guide the derivation of tracks. Second, object size is incorporated to help identify objects not well resolved at the given temporal resolution; tracks associated with objects smaller than a certain threshold size must undergo additional quality control steps before being accepted. The parameters used in the new features were derived from a resolution-dependent study of generic storm evolution properties using 2-min Multi-Radar Multi-Sensor (MRMS) [16] data and hourly CAM forecasts produced by the University of Oklahoma (OU) Multiscale data Assimilation and Predictability Laboratory (MAP) during May 2019.
The structure of the paper is as follows. Section 2 presents the independent, resolution-dependent study of convective storm evolution using 2-min, 1 km MRMS observations and hourly, 3 km CAM forecasts produced by OU MAP during May 2019. The new object-based tracking algorithm, with parameters derived from that study, is proposed in Section 3. Section 4 demonstrates the performance of the new tracking algorithm with examples of complex storm evolution in hourly MRMS observations and MAP forecasts during May 2018, and with systematic evaluations of 608 hourly MRMS tracks and 123 distinct CIs derived with the new method. Section 5 summarizes the study.
3. Materials and Methods
This section presents the new object-based storm tracking algorithm. The output of the algorithm is a track-object network serving as an abstract representation of the temporal evolutions and interactions of all storms resolvable by the given data during a specified period. Once the track-object network is constructed, a new space-time object describing the evolution of each distinct storm from initiation to dissipation can be identified.
Resolution-dependent parameters used in the proposed algorithm were derived in Section 2 and can be applied to any data with the same resolution (i.e., hourly and 3 km). To demonstrate the general applicability of the algorithm, we use hourly MRMS observations and MAP forecasts from May 2018, a period different from the one used to derive the baseline parameters, to illustrate and evaluate the proposed algorithm. The MAP forecasts produced in May 2018 used the same configuration as those produced in May 2019 [21].
Data preprocessing and object identification were the same as in Section 2. Assuming each identified object describes one physically well-defined storm, the object-based tracking problem from t to t + ∆t is defined as deriving the tracks between the object(s) identified at t and the object(s) identified at t + ∆t that most accurately describe the physical evolution of storms. There are four types of tracks we could derive:
A one-to-one track is the association of an object at t to another object at t + ∆t. This is the most common type of track we derive, describing the independent evolution of a storm from t to t + ∆t;
A many-to-one track is the association of multiple objects at t to one object at t + ∆t. It describes the merging of several storms at t into one storm at t + ∆t;
A one-to-many track is the association of one object at t to multiple objects at t + ∆t. It describes the splitting of a storm at t into several storms at t + ∆t;
A many-to-many track is the association of multiple objects at t to multiple objects at t + ∆t. It describes the group tracking of several storms at t to several storms at t + ∆t. Because our goal is to separately track the evolution of individual storms, this type of track is not considered in our solution.
Here is used to describe one-to-one tracks while describes tracks in general, including one-to-one and composite (i.e., many-to-one or one-to-many) tracks. For simplicity, we will use one-to-one tracks to define most quantities encountered in the algorithm, but they can be straightforwardly extended for composite tracks.
Our general tracking strategy is extrapolation. However, instead of extrapolating in the forward direction only, we start with a forward extrapolation from the earliest time of the data to the latest time of the data. Once it finishes, we extrapolate backwards from the latest time to the earliest time (i.e., a hindcast) and update the tracking solution. The procedure is repeated until the tracking solution converges (i.e., the solution changes little from the previous iteration to the current one). The iterative design generally improves the tracking accuracy following the second iteration, because each probable track from t to t + ∆t can be determined more reliably using not only the past track of its object at t (from t − ∆t to t) but also the future track of its object at t + ∆t (from t + ∆t to t + 2∆t).
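The alternating forward and backward sweeps can be sketched with a toy one-dimensional example. Everything here is an illustrative assumption rather than the paper's implementation: objects are reduced to scalar positions, a stationary first guess is used when no neighbouring track exists, a simple displacement mismatch replaces the probability-based scoring, and the names `step_tracks` and `iterate_tracking` are hypothetical.

```python
def step_tracks(frames, k, solution):
    """Match each object in frame k to one object in frame k + 1 (toy version).

    Uses the past track of the earlier object and the future track of the later
    object when they exist in `solution`; otherwise assumes zero motion.
    """
    now, nxt = frames[k], frames[k + 1]
    past = solution.get(k - 1, {})    # tracks from frame k - 1 to k
    future = solution.get(k + 1, {})  # tracks from frame k + 1 to k + 2
    tracks = {}
    for i, xi in enumerate(now):
        # Past velocities of object i (one per earlier object tracked to it).
        v_past = [xi - frames[k - 1][p] for p, q in past.items() if q == i]
        best_j, best_cost = None, float("inf")
        for j, xj in enumerate(nxt):
            guesses = list(v_past)
            if j in future:  # future velocity of object j
                guesses.append(frames[k + 2][future[j]] - xj)
            if not guesses:
                guesses = [0.0]  # first-guess motion: stationary (assumption)
            cost = sum(abs((xj - xi) - v) for v in guesses) / len(guesses)
            if cost < best_cost:
                best_j, best_cost = j, cost
        tracks[i] = best_j
    return tracks

def iterate_tracking(frames, max_iter=10):
    """Alternate forward and backward sweeps until the solution stops changing."""
    solution = {}
    steps = range(len(frames) - 1)
    for n_iter in range(1, max_iter + 1):
        previous = {k: dict(v) for k, v in solution.items()}
        for k in steps:            # forward sweep
            solution[k] = step_tracks(frames, k, solution)
        for k in reversed(steps):  # backward sweep
            solution[k] = step_tracks(frames, k, solution)
        if solution == previous:
            break
    return solution, n_iter

# Storm A moves +10 per step; storm B moves -2 per step.
frames = [[0.0, -6.0], [10.0, -8.0], [20.0, -10.0]]
solution, n_iter = iterate_tracking(frames)
print(solution, n_iter)
```

In this toy case the first forward sweep mis-assigns storm A at the first step because only the stationary first guess is available, and the backward sweep corrects it using the future motion, which is the motivation for the iterative design.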
As the iterative design suggests, the likelihood of a probable one-to-one track can be estimated by two conditional probabilities. The first is the conditional probability that the track exists given the past motion of its object at t; it is estimated from the probabilities of the direction and speed of the track conditioned on the past direction and speed of that object. Similarly, the second is the conditional probability that the track exists given the future motion of its object at t + ∆t; it is estimated from the probabilities of the direction and speed of the track conditioned on the future direction and speed of that object.
The past motion of the object at t (or the future motion of the object at t + ∆t), on which the above probabilities are conditioned, is estimated using either (a) the past track of the object at t (or the future track of the object at t + ∆t) with extrapolation-based PDFs (derived from Figure 4) if the track exists, or (b) a “first-guess motion” with generic PDFs (in the case of MRMS observations, derived from Figure 3) or with the mean motion predicted by the MLP (in the case of CAM forecasts, detailed in Section 2.4). The two sets of PDFs, as well as the other parameters used in the tracking algorithm, are listed in Table 2.
Given the past (or future) motion, the conditional probability of a one-to-one track is estimated by first determining the speed and direction of the track itself and then reading off their probabilities from the PDFs on which the track is conditioned. With the two conditional probabilities, we can analyze the likelihood of each one-to-one track from both sides and derive our solution, the set of most likely tracks from t to t + ∆t, following a specific order guided by the conditional probabilities. The hierarchy of decision-making starts with a set of “base tracks”, which are determined first. A track is identified as a base track if and only if it is the highest-probability track among all probable one-to-one tracks of both of its objects (the one at t and the one at t + ∆t); it thus describes the most likely one-to-one tracking solution. The second-tier one-to-one tracks to be decided are those that are most likely according to one of the two conditional probabilities. All other one-to-one tracks are then decided according to their relationships with the base tracks and the accepted second-tier tracks.
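The base-track condition amounts to a mutual best match, which the following sketch illustrates. It assumes the two conditional probabilities are already available as nested lists indexed by (object at t, object at t + ∆t); the names `p_past`, `p_future`, `allowed`, and `base_tracks` are hypothetical, and the pairing of the past-motion-based probability with objects at t and the future-motion-based probability with objects at t + ∆t is an interpretation of the text.

```python
def base_tracks(p_past, p_future, allowed=None):
    """Return (i, j) pairs that are the top-ranked track from both sides.

    p_past[i][j]:   probability of track i -> j given the past motion of object i.
    p_future[i][j]: probability of track i -> j given the future motion of object j.
    allowed[i][j]:  optional mask for tracks passing an extra likelihood criterion.
    """
    n_start, n_end = len(p_past), len(p_past[0])
    base = []
    for i in range(n_start):
        # Most likely track of object i according to the past-motion probability.
        j = max(range(n_end), key=lambda c: p_past[i][c])
        # Most likely track of object j according to the future-motion probability.
        i_best = max(range(n_start), key=lambda r: p_future[r][j])
        if i_best == i and (allowed is None or allowed[i][j]):
            base.append((i, j))
    return base

# Toy probabilities (made up for illustration only).
p_past = [[0.8, 0.1], [0.6, 0.3]]
p_future = [[0.7, 0.2], [0.2, 0.5]]
print(base_tracks(p_past, p_future))
```

Here both objects at t prefer the first object at t + ∆t, but only one of them is also that object's preferred partner, so a single base track is returned and the other object is left for the later tiers.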
During the multi-tiered decision-making process, the decision to accept or reject each one-to-one track is made by diagnosing its relation to all the tracks that have already been accepted from t to t + ∆t. Figure 7 schematically illustrates the three scenarios that arise when deciding whether to accept a new one-to-one track (shown as a dashed red line). The already accepted tracks are shown as solid red lines. In Figure 7a, the new track is accepted because it is independent of the already accepted track. In Figure 7b, the new track is rejected because it conflicts with the already accepted tracks: accepting it would combine it with the two already accepted tracks into a many-to-many track, a type of track not allowed in our solution. Figure 7c shows the final scenario, in which a composite track would be formed if the new track were accepted. In this situation, the new track shares an object with one of the already accepted tracks, and a quantitative evaluation is needed to make the decision. The evaluation examines whether combining the new track with the already accepted track into a many-to-one track would improve the probability of the track. A detailed description of the quantitative evaluation is given in Step 4a of the algorithm (below). If the probability is improved, the new track is accepted and incorporated into the already accepted track as a composite track.
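The three scenarios can be distinguished with a small connectivity check, sketched here under stated assumptions: each track is a hypothetical `(start, end)` pair of object indices at t and t + ∆t, and the function name `classify` and its return labels are illustrative. The sketch only sorts a candidate into a scenario; the quantitative probability evaluation for the composite case is not reproduced.

```python
def classify(new, accepted):
    """Classify a candidate one-to-one track against already accepted tracks.

    Returns "independent" (no shared object), "conflict" (accepting it would
    create a many-to-many group), or "composite-candidate" (it would form a
    many-to-one or one-to-many group that needs a quantitative evaluation).
    """
    i, j = new
    if not any(a == i or b == j for a, b in accepted):
        return "independent"
    # Grow the group of tracks connected to the candidate via shared objects.
    group = {new}
    changed = True
    while changed:
        changed = False
        starts = {a for a, _ in group}
        ends = {b for _, b in group}
        for t in accepted:
            if t not in group and (t[0] in starts or t[1] in ends):
                group.add(t)
                changed = True
    starts = {a for a, _ in group}
    ends = {b for _, b in group}
    if len(starts) > 1 and len(ends) > 1:
        return "conflict"
    return "composite-candidate"

print(classify((1, 1), [(0, 0)]))          # no shared object
print(classify((0, 1), [(0, 0), (1, 1)]))  # links two accepted tracks
print(classify((1, 0), [(0, 0)]))          # shares the later object
```

Checking the whole connected group, rather than only the directly touching tracks, is a design choice of this sketch so that a candidate attached to an existing composite track is also caught when it would turn that track into a many-to-many one.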
To avoid false connections associated with storms that are not temporally well resolved, each track derived during the process is further evaluated with a likelihood criterion based on (a) the sizes of the objects within the track or (b) whether steady motion is well maintained. The likelihood criterion is implemented in Step 2 (for one-to-one tracks) and Step 6 (for composite tracks) of the algorithm (below). If the track is derived from first-guess motions only, the criterion evaluates the sizes of the objects within the track: each side of the track must include at least one object whose size exceeds a size threshold (with an equivalent diameter derived from Figure 5) to pass the criterion. If the track is derived from past and/or future motions, the criterion evaluates whether steady motion is well maintained: the change in moving direction between the current track and its immediately preceding (or following) track must be smaller than a direction-change threshold to pass the criterion. For slow-moving storms with large structural deformations, translation-based rather than centroid-based motion estimates are used for the steady motion evaluation. Specifically, we attempt to translate the object at t along its past moving direction(s) until the translated object maximally overlaps with the object at t + ∆t, as indicated by the overlap ratio. The same procedure is applied to the object at t + ∆t with its future moving direction(s). The maximum overlap ratio of the track derived from past and/or future motion must be greater than 0.5 to pass the criterion.
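The size and steady-motion parts of the likelihood criterion can be sketched as follows. The threshold values are not reproduced here, so `min_size` and `max_turn` are hypothetical parameters that the caller must supply, and the numbers in the example are made up. The translation-based overlap test for slow-moving storms is not included in this sketch.

```python
def direction_change(deg_a, deg_b):
    """Smallest absolute angle in degrees between two headings (wraps at 360)."""
    diff = abs(deg_a - deg_b) % 360.0
    return min(diff, 360.0 - diff)

def passes_criterion(sizes_start, sizes_end, headings, min_size, max_turn):
    """Evaluate a track with the size or the steady-motion test.

    sizes_start / sizes_end: object sizes on each side of the track.
    headings: None if the track came from first-guess motion only, otherwise
              (heading of current track, heading of neighbouring track) in degrees.
    min_size, max_turn: hypothetical thresholds supplied by the caller.
    """
    if headings is None:
        # Size test: each side needs at least one sufficiently large object.
        return max(sizes_start) > min_size and max(sizes_end) > min_size
    current, neighbour = headings
    # Steady-motion test: the moving direction must not change too much.
    return direction_change(current, neighbour) < max_turn

# Illustrative values only (arbitrary size units and degrees).
print(passes_criterion([50.0], [40.0, 5.0], None, min_size=30.0, max_turn=45.0))
print(passes_criterion([50.0], [10.0], None, min_size=30.0, max_turn=45.0))
print(passes_criterion([1.0], [1.0], (350.0, 10.0), min_size=30.0, max_turn=45.0))
```

The wrap-around handling matters for headings near north, where 350° and 10° differ by 20° rather than 340°.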
The step-by-step description of the algorithm is as follows. To facilitate understanding of the algorithm, we also provide an MRMS example featuring the hourly development of three closely spaced storms in Figure 8 and a step-by-step breakdown of how the algorithm derives the hourly tracks in Figure 9. The example is discussed further after the algorithm description.
Step 1: Estimate the two conditional probabilities (past-motion-based and future-motion-based) for each probable one-to-one track.
Step 2: To avoid false tracks connecting storms that are not temporally well resolved, evaluate each probable one-to-one track using the likelihood criterion defined above.
Step 3: Derive base tracks. A base track satisfies the following three conditions: (a) it is the most likely track among all one-to-one tracks of its object at t according to the past-motion-based probability, (b) it is the most likely track among all one-to-one tracks of its object at t + ∆t according to the future-motion-based probability, and (c) it satisfies the likelihood criterion of step 2.
Step 4 contains two independent and recursive steps, 4a and 4b, which decide the acceptance or rejection of each probable but undecided one-to-one track in the order determined by the past-motion-based and future-motion-based probabilities, respectively.
Step 4a (recursive): For every object at t that is not already associated with a track from steps 3 and 4a, select its most likely and not yet rejected one-to-one track based on the past-motion-based probability. Accept each selected track if it is unrelated to any existing track derived from steps 3 and 4a, and reject it if it conflicts with existing tracks derived from steps 3 and 4a. If the selected track shares an object with an existing track, decide whether to accept it by evaluating whether the track likelihood improves when the two are combined. For example, given an already accepted track, the decision on whether to accept a new track ending at the same object is made by evaluating whether a merger of the two objects at t is a more likely scenario than the already accepted one-to-one evolution. The track is accepted if and only if (a) the probability () of the composite track conditioned on  is larger than  and (b) the probability of  conditioned on  is larger than . If accepted, the track is incorporated into the existing track as a composite track. If rejected, search for the next most likely and not yet rejected one-to-one track of the object according to the past-motion-based probability and repeat 4a until at least one track has been accepted (or all have been rejected) for every object at t. The track derived from 4a is either one-to-one or many-to-one.
Step 4b (recursive): For every object at t + ∆t that is not already associated with a track from steps 3 and 4b, select its most likely and not yet rejected one-to-one track based on the future-motion-based probability and decide whether to accept it based on its relationship with the already accepted tracks (from steps 3 and 4b, via a decision-making flow similar to that of 4a). If the most likely one-to-one track of an object is rejected, search for its next most likely and not yet rejected one-to-one track according to the future-motion-based probability and repeat 4b until at least one track has been accepted (or all have been rejected) for every object at t + ∆t. The track derived from 4b is either one-to-one or one-to-many.
Step 5 (recursive): Examine and resolve conflicting tracks from 4a and 4b. Track(s) from 4a conflict with track(s) from 4b if they share some, but not all, objects. The conflict must be resolved because accepting both may lead to an ambiguous many-to-many tracking. To resolve the conflict, we estimate the probabilities of the conflicting 4a and 4b tracks conditioned on all the objects involved and derive two object-area-weighted average probabilities, one for all the involved 4a tracks and the other for all the involved 4b tracks. The set of tracks with the higher (lower) average probability is accepted (rejected). For each rejected track, we remove the non-base part of the track and repeat step 4 until the object associated with the removed track is assigned a new one-to-one track. This step is also recursive: it finishes when the tracks from 4a and 4b have no conflicts. The track derived from step 5 may be one-to-one, one-to-many, or many-to-one.
Step 6: The same as step 2 but applied to tracks identified in step 5. Tracks that fail the likelihood criterion are deleted.
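The conflict resolution of step 5 compares two object-area-weighted average probabilities. A minimal sketch of that comparison is given below, assuming each involved track has already been assigned a probability and a weighting area; which object's area serves as the weight is not specified in the text, so the `(probability, area)` pairing, the function names, and the example numbers are assumptions.

```python
def area_weighted_mean(tracks):
    """Area-weighted mean probability of a list of (probability, area) pairs."""
    total_area = sum(area for _, area in tracks)
    return sum(p * area for p, area in tracks) / total_area

def resolve_conflict(tracks_4a, tracks_4b):
    """Return which conflicting set of tracks to keep: "4a" or "4b"."""
    mean_4a = area_weighted_mean(tracks_4a)
    mean_4b = area_weighted_mean(tracks_4b)
    return "4a" if mean_4a > mean_4b else "4b"

# Made-up example: two 4a tracks versus one 4b track.
tracks_4a = [(0.6, 100.0), (0.2, 300.0)]  # weighted mean = 0.3
tracks_4b = [(0.5, 200.0)]                # weighted mean = 0.5
print(resolve_conflict(tracks_4a, tracks_4b))
```

Weighting by area lets tracks involving large objects dominate the decision, so a high-probability track between small objects does not override a moderately likely track between large ones.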
Figure 9a shows the step-by-step derivations leading to the final tracking results for the Figure 8 case during the first round of extrapolation. The tracks identified by the proposed algorithm indicate that the bottom cell at 22 z continued as a cell, the middle line storm split into two cell storms, and the top cell dissipated. These results are consistent with the actual storm evolution shown in Figure 8e, except for the top cell. The track for the top cell was correctly derived in step 5 but deleted in step 6 because it was derived from first-guess motions only and the object sizes within the track were too small to satisfy the likelihood criterion. Because the storm dissipated shortly after 23 z, no future track existed with which to re-evaluate the deleted track, so this track would not be identified even after multiple rounds of extrapolation. In other words, the algorithm is designed not to identify tracks associated with such short-lived storms because they are not well resolved by the hourly data.
After we apply the tracking algorithm to all consecutive time frames of a selected case, a space-time network of track-connected objects is derived. This track-object network serves as an abstract representation of convective storm evolutions and interactions well-resolved by the given data. Because the track-object network resolves storm interactions such as merging and splitting, a separate identification for each storm trajectory is not always straightforward without rules stipulating what constitutes a distinct storm trajectory and how storm interactions would be handled as a result.
Here we formally present the rules for the identification of distinct storm trajectories from the track-object network. Each separately identified storm trajectory, which we will refer to as an object trajectory (OT) hereafter, is a track-connected object time series describing the complete and distinct evolution of a storm from initiation to dissipation. There are different ways to identify OTs depending on how one specifies the distinctness of a storm trajectory. As an example of OT identification focusing on the distinctness of convection initiation (CI), here we consider a storm trajectory to be distinct if and only if its CI is unique. In other words, the number of OTs we identify should be the same as the number of resolvable individual CI objects that exist. What does this mean for storms that underwent merging and splitting? (a) For N individually evolving storms that later merged, N OTs would be identified, each with a unique initial portion and a shared portion after the merger. (b) For a storm that later split, all of its split storms would be collectively identified as the same OT as before the split, unless (c) one of the split storms later merges with another distinct storm. In the special case of (c), the split storm is treated as a new storm starting at the split time, as was similarly done in [15].
Figure 10 shows a schematic track-object network with all three scenarios included. Under the proposed rules, three OTs (blue shades, green shades, and black contours) would be identified in Figure 10, each representing a storm trajectory with a unique CI marked by a solid triangle.
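Rules (a) and (b) can be sketched as a graph traversal over the track-object network. The sketch assumes the network is given as a list of hypothetical `(earlier_object, later_object)` edges, treats every object without an incoming track as a CI, and collects each OT as everything reachable from its CI. Rule (c), which starts a new storm when a split storm later merges with another distinct storm, is deliberately not handled, and objects without any track do not appear in the result.

```python
from collections import defaultdict

def object_trajectories(tracks):
    """Map each CI object to the set of objects in its object trajectory (OT).

    tracks: iterable of (earlier_object, later_object) edges.
    Covers merging (shared portion after the merge) and simple splitting
    (all split storms stay in one OT); rule (c) is not implemented.
    """
    children = defaultdict(list)
    nodes, has_parent = set(), set()
    for earlier, later in tracks:
        children[earlier].append(later)
        nodes.update((earlier, later))
        has_parent.add(later)
    ots = {}
    for ci in sorted(nodes - has_parent):  # objects with no incoming track
        seen, stack = set(), [ci]
        while stack:
            node = stack.pop()
            if node not in seen:
                seen.add(node)
                stack.extend(children[node])
        ots[ci] = seen
    return ots

# Two storms (A1, B1) merge into A2, which later splits into C4 and D4.
tracks = [("A1", "A2"), ("B1", "A2"), ("A2", "A3"), ("A3", "C4"), ("A3", "D4")]
ots = object_trajectories(tracks)
print(sorted(ots))
```

In this example two OTs are returned, one per CI, each with a unique initial object and a shared portion after the merge that includes both split storms.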
4. Results and Discussion
Figure 11 demonstrates the proposed tracking method with a complex storm evolution example from May 2018. Panels a and c of Figure 11 show hourly snapshots of composite reflectivity from MRMS and simulated reflectivity from the 0–5 h MAP forecast during 00–05 z, 11 May 2018; the objects and hourly OTs identified during the period are shown in panels b and d. Identified objects are illustrated by color-shaded regions with contours, with blue (gray) denoting storms larger (smaller) than the size threshold. The track-object network is depicted by piecewise linear curves, with each linear segment representing one hourly track. At each hourly time frame, the evolution of each well-resolved storm is displayed in detail, including the current storm location (blue dot), the past-hour location and track (gray dots and lines), and the future locations and tracks until dissipation (black dots and lines). Storm merging is illustrated by two or more linear segments arriving at the same point (e.g., in Figure 11(b3,d6)). A black circle around the blue dot indicates that the object is the unique CI object of a distinct OT. In this example, we identified four observed CIs (at 03 z and 05 z) and two forecast CIs (at 02 z and 05 z).
The MRMS reflectivity snapshots in Figure 11a indicate a convectively active region with an upscale-growing mesoscale convective system (MCS) moving towards the east and new cells emerging from the south and west that quickly merged with the MCS. The storm evolution simulated by the 0–5 h MAP forecast in Figure 11c shared key characteristics with the observed system, including the general moving direction and speed of the MCS, the merging of the MCS with smaller cells to its south, and the subsequent emergence of new cells to the west of the MCS. What differs in the forecast is the strength of the cells to the southwest of the MCS and the timings and locations of the new CIs. The OT-based demonstration in Figure 11b,d suggests that the proposed tracking method can reliably identify these key evolution characteristics in both datasets. In particular, the tracking method correctly captured the distinct evolution patterns of the two cells to the southwest of the MCS from 00 z to 02 z. The two cells were smaller than the size threshold in both datasets, suggesting they were likely too small to be temporally well resolved. The subsequent evolution confirms that the observed cells did dissipate shortly after initiation, whereas the forecast cells grew and later merged with the MCS at 05 z; both behaviors were accurately described by the derived tracks. Recognizing such subtle differences may be key to the diagnosis of subsequent forecast degradation; automating such a time-consuming task with a reliable tracking method is vital for the systematic evaluation of CAM forecasts.
To quantitatively evaluate the performance of the proposed tracking algorithm, four severe weather events of May 2018 (May 01–04, from 12 z to 12 z, according to the SPC severe weather event archive) were selected. These events offer 608 hourly MRMS tracks derived with the proposed method to be evaluated against validation tracks independently obtained using 2-min MRMS data and the method of Section 2. As discussed before, hourly tracks derived with the method of Section 2 may be many-to-many if the storm evolution within the hour was complex at the 2-min level (i.e., both merging and splitting occurred). In situations where the validation track associated with an object is many-to-many, we conduct manual checks to determine the correctness of the derived track, since many-to-many tracks are excluded from the solution of our tracking method. Approximately 70% of the 608 derived tracks were objectively evaluated and the remaining 30% underwent manual checks. To the authors’ knowledge, this is the first time an independent, objective evaluation has been conducted for algorithm-derived storm tracks. This is possible because the proposed method allows explicit tracking of storm merging and splitting. The evaluation results are given in Table 3. The overall accuracy rate of 99% suggests the algorithm performs reasonably well. The algorithm was especially robust in complex cases where individual storms were located very close to each other relative to their spatial extents and moving speeds but maintained relatively consistent motions (see Figure S3). Upon examining the 1% of tracks that were erroneous, we found that they were derived mostly in situations where storms underwent (at the current step) or had undergone (at the previous step) complex deformations, such as splitting followed shortly by merging with another storm, new cells forming nearby and quickly merging with an existing storm, and large shape deformation. In these situations, the derived tracks were less reliable mainly because the object-based estimates of storm motion were less accurate.
As discussed in Section 3, a unique feature of the proposed tracking technique is its ability to identify distinct CIs, each being the initial object of an identified OT. Object-based CI identification is an active research area with many CAM-based applications [15,22,23]. However, the accuracy of existing methodologies has rarely been evaluated. To systematically evaluate the accuracy of the CIs identified by the proposed method, we examined the tracks associated with all algorithm-identified CIs in the following two aspects: (a) whether the object was a CI (i.e., it was not associated with a past validation track) and (b) whether the future track associated with the CI object was consistent with validation. The special case in which an identified CI is not a new cell but a split of an existing storm was manually evaluated; it was considered correct if the continued track(s) of the existing storm via its other split(s) were accurately identified by the algorithm, as was similarly done in [15].
The CI evaluation results for the four severe weather events are presented in Table 4. The overall accuracy rate of CI identification is 95.9% for a total of 123 algorithm-identified CIs. If the identified CIs missed by 1 h are considered accurate, the overall CI identification accuracy rate reaches 98.4%. These numbers suggest that the algorithm performs reasonably well in terms of accurate CI identification.
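As a quick consistency check, the reported rates can be converted into implied counts. This is a sketch under the assumption that both rates are pooled over all 123 CIs rather than averaged across events; the counts it produces are inferred from the percentages, not read from Table 4, and `implied_count` is a hypothetical helper.

```python
def implied_count(rate, total):
    """Nearest whole number of cases implied by a reported rate."""
    return round(rate * total)

total_cis = 123
strict = implied_count(0.959, total_cis)   # correct CIs under the strict rule
relaxed = implied_count(0.984, total_cis)  # correct CIs allowing a 1 h miss

for count in (strict, relaxed):
    # Recompute the percentage to confirm it rounds back to the reported rate.
    print(count, round(100 * count / total_cis, 1))
```

Under this assumption, the gap between the two implied counts corresponds to the CIs that were identified with a 1 h timing error.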