A Random Forest Method to Forecast Downbursts Based on Dual-Polarization Radar Signatures

The United States Air Force’s 45th Weather Squadron provides wind warnings, including those for downbursts, at the Cape Canaveral Air Force Station and Kennedy Space Center (CCAFS/KSC). This study aims to provide a Random Forest model that classifies thunderstorms’ downburst and null events using a 35-knot wind threshold to separate these two categories. The downburst occurrence was assessed using a dense network of wind observations around CCAFS/KSC. Eight dual-polarization radar signatures that are hypothesized to have physical implications for downbursts at the surface were automatically calculated for 209 storms and ingested into the Random Forest model. The Random Forest model predicted null events more correctly than downburst events, with a True Skill Statistic of 0.40. Strong downburst events were better classified than those with weaker wind magnitudes. The most important radar signatures were found to be the maximum vertically integrated ice and the peak reflectivity. The Random Forest model presented a more reliable performance than an automated prediction method based on thresholds of single radar signatures. Based on these results, the Random Forest method is suggested for continued operational development and testing.


Introduction
A downburst is characterized by the occurrence of divergent intense winds at or near the surface, which are produced by a thunderstorm's downdraft [1,2]. This phenomenon can produce substantial surface damage, often similar to that of tornadoes [3]. A number of observational [4-9] and modeling [10-14] studies have been conducted to reveal the structure, dynamics, microphysics, and environmental conditions associated with a variety of convective downbursts. Precipitation microphysical processes such as precipitation loading [10], melting hailstones [6,12,15], and evaporation of raindrops [10,14,16] are important for downburst generation. Based on this understanding, automated Doppler radar algorithms for downburst detection have been developed in prior studies [17,18]. Recently, [19] used radar and environmental variables as input to different machine learning techniques to predict surface straight-line convective winds.
In addition to Doppler radar and environmental observations of downbursts, dual-polarization meteorological radar characteristics for downbursts have been described in recent decades. For example, the differential reflectivity (Zdr)-hole [6] is caused by melting hail within a downdraft and is characterized by a region of near-zero dB Zdr and high reflectivity (Zh) that is surrounded by [...]
[...] the mean wind direction during the 5-min period was used to help identify the convective cell that produced the downburst. A wind observation recorded by the Cape WINDS network was assumed to occur at a median time of 2.5 min after the start of the reporting period.
Remote Sens. 2019, 11, x FOR PEER REVIEW
Data from KXMR soundings launched at the CCAFS, typically at 00:00, 10:00, and 15:00 UTC every day, were available for this study. This dataset was primarily used to extract specific isotherm heights, such as 0 °C, −10 °C, and −40 °C, which were used in the implementation of some radar parameters, as discussed in Section 2.4. For a given storm, the considered isotherm heights were from the sounding nearest to the majority of the storm's life cycle.

C-Band Radar and Processing
A Radtec Titan Doppler Radar, officially named Weather Surveillance Radar (herein 45WS-WSR), is a C-band dual-polarization radar operated by the 45WS to provide weather support to the CCAFS/KSC complex. It operates with a 0.95° beamwidth, 5.33 cm wavelength, 24 samples per pulse, and peak transmitted power of 250 kW [31]. The radar is located about 42 km southwest of the CCAFS/KSC launch towers, which leads to a horizontal beam width of approximately 600 m and a peak vertical gap between radar beams of roughly 700 m over the CCAFS/KSC complex [31] (Figure 1). Thirteen elevation angles ranging from 0.2° to 28.3° comprise a volume scan, which takes 2.65 min to complete [32]. Quality control, such as differential attenuation correction, was applied to the raw data prior to their acquisition for this study.
The raw radar data were gridded to a Cartesian coordinate system with a 500 m grid resolution, a 1 km constant radius of influence, and a Cressman weighting function [33] using the Python ARM Radar Toolkit [34]. The gridding was performed on linear Zh and Zdr, which were then converted back to logarithmic Zh and Zdr. The data were gridded out to 100 km north, south, east, and west of the 45WS-WSR and 17 km in the vertical direction. These gridding attributes were selected based on the radar beam width and vertical spacing between radar beams over CCAFS/KSC, and through an empirical analysis using different gridding techniques performed by [31].
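The weighting step can be illustrated with a minimal sketch. It does not call the Python ARM Radar Toolkit; it assumes the standard Cressman weight, (R² − d²)/(R² + d²), with the 1 km radius of influence from the text, and the function names and the list-of-gates input are hypothetical.

```python
import math

def dbz_to_linear(dbz):
    """Convert a logarithmic value (dBZ or dB) to linear units."""
    return 10.0 ** (dbz / 10.0)

def linear_to_dbz(z):
    """Convert a linear value back to logarithmic units."""
    return 10.0 * math.log10(z)

def cressman_weight(distance_m, radius_m=1000.0):
    """Cressman weight: 1 at the grid point, 0 at or beyond the radius of influence."""
    if distance_m >= radius_m:
        return 0.0
    r2, d2 = radius_m ** 2, distance_m ** 2
    return (r2 - d2) / (r2 + d2)

def grid_point_dbz(gates, radius_m=1000.0):
    """Weighted mean for one grid point.

    gates: list of (distance_m, dbz) pairs for nearby radar gates (hypothetical input).
    The average is taken in linear units and converted back to dBZ.
    Returns None when no gate falls inside the radius of influence.
    """
    num = 0.0
    den = 0.0
    for distance_m, dbz in gates:
        w = cressman_weight(distance_m, radius_m)
        num += w * dbz_to_linear(dbz)
        den += w
    return linear_to_dbz(num / den) if den > 0 else None

# Two equally distant gates at 30 and 40 dBZ
example = grid_point_dbz([(200.0, 30.0), (200.0, 40.0)])
```

Averaging in linear units gives about 37.4 dBZ for this example rather than the 35 dBZ that a direct average of the logarithmic values would give, which is why the order of conversion and gridding matters.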
The radar variables used in this study were Zh and Zdr. An evident reduction in the ρhv values is typically observed from this radar, possibly because of the low number of samples per pulse within 45WS-WSR operations. Values of ρhv were often below 0.80 in mixed-phase precipitation and below 0.60 in very heterogeneous mixtures of precipitation [31]. For these reasons, ρhv data were not used in this study.


Wind and Null Events
The 2015 and 2016 warm seasons (May through September) were the period used in this study. In order to identify the convective cells that caused winds ≥ 35 kt, hereafter 'wind events', the Cape WINDS towers were first analyzed to identify wind observations at or above that threshold. It is important to note that the 45WS considers the wind value of 35 kt as a hard threshold for its warnings, even with the sensors' accuracy of 0.58 kt for the range of 0-39 kt. Therefore, we also use this hard threshold in this study. The timing of the wind observation was then compared to the radar data timing. The time of a radar volume scan was considered to be the median value within the volume scan's 2.65 min duration (i.e., approximately 1 min and 20 s after the volume scan initiation time). Next, each wind observation was associated with a single radar volume scan. The wind direction was used to help determine which convective cell was associated with an observed downburst. The convective cell had to be located within 10 km of the Cape WINDS tower that observed the wind ≥ 35 kt at the moment the downburst occurred. If these requirements were all met, the cell was manually tracked backward in time, and the track had to last at least 30 min. A box was subjectively defined around the cell throughout its life cycle, ignoring its history after the downburst time. If the cell's 40 dBZ reflectivity contour merged with that of another cell at any height level, both storms were considered as one. These cells were tracked until their initiation or until a radar range of 67 km (Figure 1), because vertical gaps in the gridded data become significant at this distance [31]. An example of a wind event is shown in Figure 2, with a red box representing the cell's spatial definition, which resulted from manual storm tracking. High winds associated with hurricanes, and consistently high-biased values from a single instrument that were not verified by neighboring sensors, were discarded.
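The timing conventions above can be sketched as follows. The 2.65 min volume duration and the 5 min wind reporting period come from the text; choosing the volume scan whose mid-scan time is nearest to the wind observation time is an assumption, since the text only states that each observation was associated with a single volume scan. The function names and the example times are hypothetical.

```python
from datetime import datetime, timedelta

VOLUME_DURATION = timedelta(minutes=2.65)  # volume scan duration, from the text
WIND_PERIOD = timedelta(minutes=5)         # Cape WINDS reporting period, from the text

def volume_mid_time(volume_start):
    """Median time of a volume scan (about 1 min 20 s after initiation)."""
    return volume_start + VOLUME_DURATION / 2

def wind_obs_time(period_start):
    """Median time of a wind report (2.5 min after the reporting period starts)."""
    return period_start + WIND_PERIOD / 2

def nearest_volume(period_start, volume_starts):
    """Start time of the volume scan closest in time to the wind observation.

    Nearest-in-time matching is an assumption made for this sketch.
    """
    t_obs = wind_obs_time(period_start)
    return min(volume_starts, key=lambda s: abs(volume_mid_time(s) - t_obs))

# Hypothetical times on the date of the example wind event (09 June 2015)
volumes = [
    datetime(2015, 6, 9, 20, 0, 0),
    datetime(2015, 6, 9, 20, 2, 39),
    datetime(2015, 6, 9, 20, 5, 18),
]
matched = nearest_volume(datetime(2015, 6, 9, 20, 0, 0), volumes)
mid_offset_s = (volume_mid_time(volumes[0]) - volumes[0]).total_seconds()
```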
Convective cells that did not produce such high winds (i.e., < 35 kt; hereafter 'null events') were also obtained in order to differentiate them from the wind events and to be used to train the Random Forest model. Null cases were identified by selecting convective cells that passed through the Cape WINDS area (within 15 km of any tower) and did not produce a Cape WINDS wind observation ≥ 35 kt. The entire life cycle of each null event was considered, and it had to last at least 25 min. Another requirement for null event identification was that a Zh of 40 dBZ had to be observed at any altitude for a minimum period of 10 min.


Dual-Polarization Radar Signatures
Once the radar data were gridded and the convective cells were identified and tracked, a large number of radar parameters (i.e., signatures) were calculated for every wind and null case. This method can be referred to as 'semi-automated analysis', since storms were manually tracked and radar signatures were automatically and objectively calculated for all storms. About 50 signatures were initially considered, all with a physical process hypothesized to be directly or indirectly related to a future occurrence of a downburst, as reviewed in Section 1. A considerable fraction of the parameters represented the same process, with variations in the radar threshold being the only difference. As an example, a signature that uses both Zh and Zdr data for identification of precipitation ice was tested using different thresholds of Zh. Then, in an attempt to reduce the amount of redundant information among the numerous signatures, a correlation analysis was performed. For large correlations (i.e., 0.70 or higher) between two radar signatures, only one signature was kept for further study, which was the signature that had the lowest correlation values with all other radar signatures examined. After this first reduction process, a Principal Component Analysis (PCA) [35] was performed to identify the variables that explained the most variance. The signatures with relatively large correlation (i.e., 0.60 or higher) with the first PCA level, which explains the most variance in the dataset, were selected as the final radar signatures. The number of radar signatures was ultimately reduced to eight, all based on radar variables Zh and/or Zdr. The parameters are listed in Table 1 and described in detail below.
Signature #1 implies that the storm's updraft lifts a significant amount of liquid hydrometeors, such as raindrops, above the 0 °C level, creating a column of Zdr ≥ 1 dB at sufficient reflectivity (Zh ≥ 30 dBZ). A Zdr column's height is associated with updraft strength and storm intensity [36-38]. The freezing of these hydrometeors at sub-freezing environmental temperatures eventually produces ice particles, which may contribute to downburst formation. After identifying the 0 °C isotherm height using the KXMR sounding data, each gridded column was checked for continuous Zdr values ≥ 1 dB from this height upward. The maximum column top height was recorded as the storm's Zdr column height. A 30 dBZ Zh filter was applied to avoid erroneous updraft identification at the edges of storms, where positive Zdr values are also common. It is hypothesized that a higher maximum Zdr column height would lead to a greater potential for precipitation ice production and hence downburst occurrence at the surface through melting and loading of these hydrometeors.
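The Zdr column search described above might be sketched as follows for gridded data. The thresholds (Zdr ≥ 1 dB, Zh ≥ 30 dBZ) come from the text; the function names, the per-column list inputs, and the choice to start at the first grid level at or above the 0 °C height are assumptions for illustration.

```python
def zdr_column_top(heights_m, zdr_db, zh_dbz, freezing_level_m,
                   zdr_min=1.0, zh_min=30.0):
    """Top height (m) of the continuous run of Zdr >= 1 dB and Zh >= 30 dBZ
    that starts at the first grid level at or above the 0 degC height.
    Returns None when the run does not start at that level."""
    top = None
    for h, zdr, zh in zip(heights_m, zdr_db, zh_dbz):
        if h < freezing_level_m:
            continue  # only levels at or above the 0 degC isotherm count
        if zdr >= zdr_min and zh >= zh_min:
            top = h   # column continues upward
        else:
            break     # continuity is broken, stop searching
    return top

def storm_zdr_column_height(columns, freezing_level_m):
    """Maximum column top over all gridded columns inside the storm box.

    columns: list of (heights_m, zdr_db, zh_dbz) tuples (hypothetical layout).
    """
    tops = [zdr_column_top(h, zdr, zh, freezing_level_m) for h, zdr, zh in columns]
    tops = [t for t in tops if t is not None]
    return max(tops) if tops else None

# Hypothetical columns on a 500 m vertical grid, 0 degC level at 4200 m
heights = [3500, 4000, 4500, 5000, 5500, 6000]
col_a = (heights, [2.0, 2.0, 1.5, 1.2, 0.3, 1.5], [45, 45, 40, 35, 35, 35])
col_b = (heights, [2.0, 2.0, 0.5, 1.2, 1.3, 1.5], [45, 45, 40, 35, 35, 35])
storm_top = storm_zdr_column_height([col_a, col_b], freezing_level_m=4200)
```

In this example the first column is continuous up to 5000 m, while the second fails at the first level above the 0 °C height and contributes no column.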
The lifted liquid hydrometeors eventually freeze in the Zdr column's upper boundary, serving as embryos that can produce precipitation ice, such as graupel and hail [39]. The increase in precipitation ice amount above the 0 °C level is represented by both Signatures #2 and #3. Signature #2, also called the precipitation ice signature [31], is the maximum height at which −1 dB ≤ Zdr ≤ +1 dB is co-located with Zh ≥ 30 dBZ [38,40]. Signature #3 is the maximum vertically integrated ice (VII), which is a reflectivity-integrated signature to estimate the amount of precipitation ice between the −10 °C and −40 °C isotherms in units of kg m⁻² [41,42]. It is hypothesized that a higher vertical extent of precipitation ice and a larger amount of reflectivity-integrated ice would indicate sufficient precipitation ice growth in both size and quantity, as well as an increase in hydrometeor loading and negative buoyancy. The VII expression is shown in Equation (1).
where ρi is the density of ice and N0 is the intercept parameter, assumed to be equal to 917 kg m⁻³ and 4 × 10⁶ m⁻⁴, respectively, zh is the linear reflectivity (in mm⁶ m⁻³), and h is the height of the specified isotherms in meters [41,42].
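A minimal sketch of a VII calculation for a single gridded column follows. It assumes the commonly cited form VII = π ρi N0^(3/7) (5.28 × 10⁻¹⁸ / 720)^(4/7) ∫ zh^(4/7) dh, integrated between the −10 °C and −40 °C isotherm heights. The coefficient and exponents are taken from that general formulation and are an assumption here, as are the function names and the trapezoidal integration; only ρi, N0, and the units come from the text.

```python
import math

RHO_ICE = 917.0  # density of ice, kg m^-3 (from the text)
N0 = 4.0e6       # intercept parameter, m^-4 (from the text)

def ice_mass_content(dbz):
    """Ice mass content (kg m^-3) for one reflectivity value.
    The reflectivity-mass relation used here is an assumed, commonly cited form."""
    z_linear = 10.0 ** (dbz / 10.0)  # mm^6 m^-3
    return (math.pi * RHO_ICE * N0 ** (3.0 / 7.0)
            * (5.28e-18 * z_linear / 720.0) ** (4.0 / 7.0))

def vii(heights_m, dbz_profile, h_minus10_m, h_minus40_m):
    """Trapezoidal integral (kg m^-2) of ice mass content over the grid levels
    lying between the -10 degC and -40 degC isotherm heights."""
    levels = [(h, ice_mass_content(d))
              for h, d in zip(heights_m, dbz_profile)
              if h_minus10_m <= h <= h_minus40_m]
    return sum(0.5 * (m0 + m1) * (h1 - h0)
               for (h0, m0), (h1, m1) in zip(levels, levels[1:]))

# Hypothetical column: uniform 50 dBZ between isotherms at 6 km and 10 km
heights = list(range(0, 12001, 500))
profile = [50.0] * len(heights)
vii_example = vii(heights, profile, h_minus10_m=6000, h_minus40_m=10000)
```

With heights in meters and ρi in kg m⁻³, the result is in kg m⁻², matching the units stated in the text. The storm-level signature would then be the maximum of this quantity over all columns and times of the tracked cell.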
Signatures #4 and #5 are indirectly related to the ice calculation. A higher altitude of the peak Zh (Signature #4) and the peak Zh value above the 0 °C isotherm (Signature #5) are associated with the number and concentration of hydrometeors at high levels, which are usually associated with precipitation ice loading that may produce negative buoyancy [23].
Signatures #6-#8 are reflectivity-based parameters that consider the entire storm in their calculations. The number and concentration of all hydrometeor types are considered at all height levels for these signatures. A larger value for these three signatures is likely related to larger hydrometeor loading and an increased likelihood of downburst generation. Signature #6 is the peak Zh in the storm, which can be at any height level, even below the 0 °C level. Similarly to Signature #3, the VIL signature (Signature #7) is an integration of zh through the storm's depth, as shown in Equation (2), in units of kg m⁻² [43].
Signature #8 is the Density of VIL (DVIL) in units of g m⁻³, which is simply VIL/echotop, with echotop being defined as the storm's maximum 18 dBZ Zh height in km [44].
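The VIL and DVIL signatures can be sketched for a single column as follows. The sketch assumes the widely used relation VIL = 3.44 × 10⁻⁶ ∫ zh^(4/7) dh (zh in mm⁶ m⁻³, h in m), which is taken from the general literature and is an assumption here; the 18 dBZ echo-top definition and the units come from the text. Function names and the trapezoidal integration are illustrative choices.

```python
def vil(heights_m, dbz_profile):
    """VIL (kg m^-2) by trapezoidal integration of 3.44e-6 * z_h^(4/7)."""
    lwc = [3.44e-6 * (10.0 ** (d / 10.0)) ** (4.0 / 7.0) for d in dbz_profile]
    return sum(0.5 * (lwc[i] + lwc[i + 1]) * (heights_m[i + 1] - heights_m[i])
               for i in range(len(lwc) - 1))

def echo_top_km(heights_m, dbz_profile, threshold_dbz=18.0):
    """Highest grid level (km) with Zh at or above 18 dBZ, or None if absent."""
    tops = [h for h, d in zip(heights_m, dbz_profile) if d >= threshold_dbz]
    return max(tops) / 1000.0 if tops else None

def dvil(heights_m, dbz_profile):
    """DVIL (g m^-3): VIL in kg m^-2 divided by the echo top in km."""
    top = echo_top_km(heights_m, dbz_profile)
    if not top:
        return None  # no 18 dBZ echo, or echo top at the surface
    return vil(heights_m, dbz_profile) / top

# Hypothetical column: uniform 50 dBZ from the surface to 10 km
heights = list(range(0, 10001, 500))
profile = [50.0] * len(heights)
vil_example = vil(heights, profile)
dvil_example = dvil(heights, profile)
```

Dividing kg m⁻² by km gives g m⁻³ directly, so no additional unit conversion is needed for DVIL.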
Figure 3 highlights most of the aforementioned radar signatures for a wind event that occurred on 09 June 2015. It consists of a Zdr vertical cross-section plot at the location marked with a black line in Figure 2. A Zdr column (Signature #1) can be seen as warm colors about 10 km east of the radar center, extending approximately 1.5 km above the 0 °C isotherm height, which is marked as a blue horizontal line. The precipitation ice signature (Signature #2) can be seen as Zdr ~ 0 dB (denoted by gray colors) co-located with Zh ≥ 30 dBZ, shown as black contours. This signature reaches its maximum height at 8.5 km AGL about 11 km east of the radar. Other signatures, such as peak Zh and its height above ground level, can also be inferred from this plot.

Random Forest
This study uses a Random Forest model for training and forecasting of wind events. Random Forest is a tree-based method that combines multiple Decision Trees [45-48]. Decision Trees consist of a series of splitting rules that stratify observations into nodes, using the predictors that best split the observations. In our study, the radar signatures' maximum values through a tracked storm's life cycle are used as inputs for the model, and classification trees are used to discriminate wind and null events. Random Forests build hundreds of Decision Trees, each taking a different storm sample (about two-thirds) from the total storm data set. Each Decision Tree built is a separate model, and the predictions of all trees are averaged to reduce variance, which is high for a single Decision Tree; the averaging is effective because the trees are not highly correlated. In addition, Random Forest uses only a small sample of predictors as split candidates at every tree node. Using a limited number of predictors as split candidates usually yields smaller errors than considering all predictors (the so-called bagged trees), because it further decorrelates the trees, so averaging them leads to an even larger reduction in variance.
In order to implement the Random Forest model, the R package randomForest was used [49], where 500 trees were built using the entire set of storms as the training dataset. Two predictors were used as split candidates, consistent with the Random Forest default setting of using approximately the square root of the total number of predictors available [46]. No separate testing dataset was defined because it is possible to obtain the model's error through the set of storms not used for each tree's construction, called out-of-bag (OOB) storms. As previously mentioned, each tree uses approximately two-thirds of the storm sample, which are randomly chosen. As a result, each storm was out-of-bag for approximately one-third of the trees. All trees' predictions for a given OOB storm are counted, and the majority vote among these trees is taken as the single Random Forest prediction for that storm. For example, a vote equal to 0.6 for a given storm means that 60% of trees predicted that storm to be a wind event, while the other 40% predicted it to be a null event; the wind/null classification is whichever class receives a vote greater than 0.5. This way, every storm has a wind/null prediction based on a model that used the entire storm dataset for training, without the need for a testing dataset. It is shown in Section 2.5 that this methodology is relevant and equivalent to an approach that applies a model using separate training and testing datasets.
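The out-of-bag bookkeeping described above can be illustrated with a toy ensemble. This is not the R randomForest implementation used in the study: each "tree" here is a one-split stump fitted on a bootstrap sample, the data are synthetic, and all function names are hypothetical. The sketch only shows how OOB votes arise when each tree is evaluated on the storms it did not see.

```python
import random

def fit_stump(X, y, rows, features):
    """Pick the (feature, threshold) with the fewest errors on the bootstrap rows.
    The stump predicts a wind event (1) when the feature value exceeds the threshold."""
    best = None
    for f in features:
        for t in sorted({X[i][f] for i in rows}):
            errors = sum((X[i][f] > t) != (y[i] == 1) for i in rows)
            if best is None or errors < best[0]:
                best = (errors, f, t)
    return best[1], best[2]

def oob_votes(X, y, n_trees=500, mtry=2, seed=0):
    """OOB vote (fraction of trees predicting 'wind') for every storm, plus the
    mean fraction of trees for which a storm was out-of-bag."""
    rng = random.Random(seed)
    n, n_features = len(X), len(X[0])
    wind = [0] * n
    counted = [0] * n
    for _ in range(n_trees):
        rows = [rng.randrange(n) for _ in range(n)]  # bootstrap, with replacement
        in_bag = set(rows)
        f, t = fit_stump(X, y, rows, rng.sample(range(n_features), mtry))
        for i in range(n):
            if i not in in_bag:  # only trees that never saw storm i may vote on it
                counted[i] += 1
                wind[i] += int(X[i][f] > t)
    votes = [w / c if c else None for w, c in zip(wind, counted)]
    return votes, sum(counted) / (n * n_trees)

# Synthetic data: 60 storms, 8 predictors, wind events shifted toward larger values
data_rng = random.Random(1)
y = [i % 2 for i in range(60)]
X = [[data_rng.gauss(2.0 * label, 1.0) for _ in range(8)] for label in y]
votes, oob_fraction = oob_votes(X, y, n_trees=200)
predictions = [int(v > 0.5) for v in votes]  # majority vote, as in the text
```

Bootstrap sampling with replacement leaves each storm out of roughly one-third of the trees, which is the fraction `oob_fraction` estimates.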
A classification prediction is obtained for each storm, and a summary of all storm predictions can be displayed in a simple contingency table or confusion matrix, from which performance metrics can be calculated [50]. The most intuitive metric for wind event predictability is the Probability of Detection (POD), which is the number of correct wind event forecasts divided by the total number of wind event observations. The Probability of False Alarm (POFA, same as false alarm ratio) is also used in this study, which is the number of incorrect wind forecasts divided by the total number of wind forecasts. The False Alarm Rate (F) is the number of incorrect wind forecasts divided by the total number of null observations. F is important to define because it is an analog to the POD: it is the fraction of null events forecast incorrectly, while POD is the fraction of wind events forecast correctly. For that reason, the True Skill Statistic (TSS) is the main metric used in this study to evaluate the predictability of a model, since its formula can be simplified to the difference between POD and F. Thus, TSS is a simple and relevant measure of model performance because it balances the wind and null events' predictability equally within the model, independent of the size of each dataset. A secondary metric used in this study for Random Forest predictability is the OOB estimate of error rate, which is the number of incorrect wind and null predictions divided by the total number of events. This is equivalent to 1 − PC, where PC is the Proportion Correct, or the sum of the number of correct wind and null predictions divided by the total number of events. This metric differs from TSS, since each event, wind or null, is equally considered in its computation. Because of this, if one class (wind or null) is larger than the other, that class is weighted more heavily in the OOB estimate of error rate (or 1 − PC).
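The metric definitions above translate directly into a few lines of arithmetic. As a worked example, the sketch below uses the counts implied by the out-of-bag results reported for the Random Forest in this paper (49 of 84 wind events and 102 of 125 null events predicted correctly); the function name and dictionary keys are illustrative.

```python
def forecast_metrics(hits, misses, false_alarms, correct_negatives):
    """Contingency-table metrics as defined in the text."""
    wind_obs = hits + misses
    null_obs = false_alarms + correct_negatives
    total = wind_obs + null_obs
    pod = hits / wind_obs
    f = false_alarms / null_obs
    return {
        "POD": pod,                                    # correct wind forecasts / wind observations
        "F": f,                                        # incorrect wind forecasts / null observations
        "POFA": false_alarms / (hits + false_alarms),  # incorrect wind forecasts / wind forecasts
        "TSS": pod - f,
        "PC": (hits + correct_negatives) / total,
        "error_rate": (misses + false_alarms) / total,  # equals 1 - PC
    }

# 49 hits, 84 - 49 = 35 misses, 125 - 102 = 23 false alarms, 102 correct negatives
metrics = forecast_metrics(hits=49, misses=35, false_alarms=23, correct_negatives=102)
```

Rounded to two decimals, these counts give POD = 0.58, F = 0.18, POFA = 0.32, TSS = 0.40, and an error rate of 0.28, in agreement with the values quoted in the results.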

Mean Decrease Accuracy and Mean Decrease Gini
Since Random Forest is a method that builds hundreds of trees for its model development, it is not easy to determine which signatures contributed most to the model performance. However, two methods that quantify the signatures' importance over all trees are available when running the model [46]. The Mean Decrease Accuracy (MDA) is obtained by recording the OOB observation error for a given tree, and then recording it again after randomly permuting the values of each signature in that tree's OOB observations. The difference between the two results is calculated, and the differences for all trees are averaged and normalized by the standard deviation of the differences. A large MDA value indicates that there was a significant decrease in model accuracy once the signature's values were permuted, indicating an important signature.
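The core of the permutation step can be sketched for a generic classifier. This is a simplified illustration under stated assumptions: it permutes one predictor across a single evaluation set and reports the drop in accuracy, whereas the MDA described above repeats this per tree on OOB observations and normalizes by the standard deviation across trees. The function names and the toy classifier are hypothetical.

```python
import random

def accuracy(predict, X, y):
    """Fraction of rows for which the classifier matches the label."""
    return sum(predict(row) == label for row, label in zip(X, y)) / len(y)

def permutation_accuracy_drop(predict, X, y, feature, seed=0):
    """Accuracy before minus accuracy after shuffling one predictor's values."""
    rng = random.Random(seed)
    column = [row[feature] for row in X]
    rng.shuffle(column)  # break the link between this predictor and the labels
    X_perm = [row[:feature] + [v] + row[feature + 1:]
              for row, v in zip(X, column)]
    return accuracy(predict, X, y) - accuracy(predict, X_perm, y)

def toy_classifier(row):
    """Hypothetical classifier that only looks at predictor 0."""
    return int(row[0] >= 10.0)

# Toy data: predictor 0 determines the class, predictor 1 is irrelevant
X = [[float(i), float(i % 3)] for i in range(20)]
y = [int(i >= 10) for i in range(20)]
drop_used = permutation_accuracy_drop(toy_classifier, X, y, feature=0)
drop_unused = permutation_accuracy_drop(toy_classifier, X, y, feature=1)
```

Permuting the predictor the classifier ignores leaves accuracy unchanged, so its drop is zero, while permuting the predictor it relies on cannot improve on the perfect baseline.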
The Mean Decrease Gini (MDG) is the second method to obtain the signatures' importance. The Gini index is a measure of node purity, being small for a node with a dominant class (i.e., when either wind or null events predominate among the events at that tree node). MDG is the sum of the decreases in the Gini index produced by splits on a given signature within a tree, averaged over all trees. Similar to the MDA, a large MDG value indicates an important predictor. Both variable importance methods were calculated in order to evaluate the most important signatures for the Random Forest model.
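For a two-class problem the Gini index at a node reduces to 2p(1 − p), where p is the fraction of wind events at the node. The sketch below computes it and the decrease produced by one split; weighting the child nodes by their share of the parent's events is an assumption of this sketch, and software packages may scale the decrease differently.

```python
def gini(labels):
    """Two-class Gini index of a node; labels are 1 (wind) or 0 (null)."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 2.0 * p * (1.0 - p)

def gini_decrease(parent, left, right):
    """Parent impurity minus the size-weighted impurity of the two child nodes."""
    n = len(parent)
    return (gini(parent)
            - (len(left) / n) * gini(left)
            - (len(right) / n) * gini(right))

pure_split = gini_decrease([0, 0, 1, 1], [0, 0], [1, 1])     # perfectly separates classes
useless_split = gini_decrease([0, 0, 1, 1], [0, 1], [0, 1])  # children as mixed as parent
```

A split that isolates each class removes all of the parent's impurity, while a split that leaves both children equally mixed removes none.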

Single Signature Predictability
A simple method to determine the predictability of each individual radar signature was applied in order to compare with the Random Forest model results. The predictability of each signature in Table 1 was tested by applying different thresholds to each signature and testing them for all wind and null events. For each threshold, it was checked whether the threshold was reached before the downburst time for wind events and at any time during the life cycle for null events. From these checks, a contingency table was built and performance metrics were calculated. The performance metrics calculated were the same as presented in Section 2.5, with TSS being the primary metric used for comparison of results between the single signatures and the Random Forest.
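A threshold scan of this kind might look like the following sketch. It assumes that a storm is forecast as a wind event when its signature value is at or above the threshold, and it uses made-up signature values; the function names and the tie-breaking rule (the larger threshold wins on equal TSS) are illustrative choices.

```python
def tss(predicted, observed):
    """True Skill Statistic, POD minus F, from paired boolean forecasts and observations.
    Assumes at least one wind event and one null event are present."""
    hits = sum(p and o for p, o in zip(predicted, observed))
    misses = sum((not p) and o for p, o in zip(predicted, observed))
    false_alarms = sum(p and (not o) for p, o in zip(predicted, observed))
    correct_negatives = sum((not p) and (not o) for p, o in zip(predicted, observed))
    pod = hits / (hits + misses)
    f = false_alarms / (false_alarms + correct_negatives)
    return pod - f

def best_threshold(values, is_wind, thresholds):
    """Return (TSS, threshold) for the candidate threshold with the highest TSS."""
    scored = []
    for t in thresholds:
        predicted = [v >= t for v in values]  # forecast 'wind' when threshold is reached
        scored.append((tss(predicted, is_wind), t))
    return max(scored)

# Hypothetical peak signature values for six storms
values = [2.0, 3.0, 8.0, 9.0, 4.0, 7.0]
is_wind = [False, False, True, True, False, True]
best = best_threshold(values, is_wind, thresholds=[1.0, 5.0, 8.5])
```

In this toy example the lowest threshold flags every storm (TSS of 0), the highest misses two of three wind events, and the middle threshold separates the classes perfectly.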

Random Forest
Using the methods described in Section 2.3, a total of 84 wind events and 125 null events were identified from the 2015 and 2016 warm seasons. Table 2 presents the Random Forest's out-of-bag confusion matrix, or contingency table, showing the number of correct and incorrect predictions for all wind and null events. For wind events, the Random Forest model predicted 49 out of the 84 events correctly, leading to a POD of 58%. For null events, the model correctly determined 102 out of 125 events. This means that 82% of null events were correctly depicted, or an F of 18% (note that this is not the same as POFA). The Random Forest prediction of null events is noticeably better than the prediction of wind events. In total, 58 out of all 209 events were incorrectly predicted, or an OOB estimate of error rate of 28%. The POFA for the model is 32%. The resultant TSS for the Random Forest model is 0.40, which is in the range of TSS values that are considered marginal for operational utility by the 45WS (i.e., 0.3 to 0.5) [24].
The OOB votes for each storm can also be accessed from the Random Forest model. A vote is the fraction of trees that predicted a given storm as a wind event, counting only the trees that did not use that storm for training. In a classification Random Forest, a storm with a vote greater than 0.5 is considered a wind event. In this way, votes may be interpreted as a qualitative 'probability' for a storm to become a wind event. Figure 4 shows every storm's Random Forest vote as a function of its maximum wind magnitude measured by the Cape WINDS network. The vertical line depicts the wind event threshold of 35 kt, separating wind events to the right and null events to the left of the chart. The horizontal line at a vote equal to 0.5 separates the Random Forest's wind and null predictions, above and below the line, respectively. The upper-right and lower-left portions of the plot represent the Random Forest's correct predictions, in the same manner as Table 2. The upper-left and lower-right sections of Figure 4 represent the model's false alarms and misses, respectively, i.e., the Random Forest's incorrect predictions. In the lower-left quadrant, the correct negative events are numerous and spread out over most of the quadrant area. Few null events were incorrectly identified by the Random Forest as wind events, as can be seen in the upper-left quadrant. A considerable number of storms produced peak winds around 35 kt, which is near the wind magnitude threshold that separated wind events from null events. The Random Forest model struggled to predict those borderline events as either wind or null, as evidenced by the wide range of vote values. Among events that produced peak winds between 35 kt and 40 kt, 38 out of 66 (58%) were correctly identified as wind events. Storms with a maximum wind magnitude greater than 40 kt were less numerous, but the Random Forest model classified them more correctly than events with peak winds between 35 kt and 40 kt. Eighteen storms had winds greater than 40 kt, and the Random Forest model correctly classified 11 of these (61%) as wind events. Based on these results, it seems that the POD of wind events increased with increasing downburst strength. This is consistent with a tendency for Random Forest votes to increase with wind magnitude, despite some outlier events.
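The quadrant reading of Figure 4 can be illustrated with a minimal Python sketch. The function names and the sample storms are hypothetical, and treating a peak wind of exactly 35 kt as a wind event is an assumption; only the 35 kt and 0.5 thresholds come from the text.

```python
WIND_THRESHOLD_KT = 35.0  # wind event threshold used in this study
VOTE_THRESHOLD = 0.5      # a vote greater than 0.5 predicts a wind event


def quadrant(peak_wind_kt, vote):
    """Return the Figure 4 quadrant label for one storm."""
    # Assumption: a peak wind of exactly 35 kt counts as a wind event.
    observed_wind = peak_wind_kt >= WIND_THRESHOLD_KT
    predicted_wind = vote > VOTE_THRESHOLD
    if observed_wind and predicted_wind:
        return "hit"               # upper right
    if observed_wind:
        return "miss"              # lower right
    if predicted_wind:
        return "false alarm"       # upper left
    return "correct negative"      # lower left


def pod_in_wind_bin(storms, low_kt, high_kt):
    """POD among wind events with low_kt <= peak wind < high_kt."""
    votes = [v for w, v in storms
             if low_kt <= w < high_kt and w >= WIND_THRESHOLD_KT]
    if not votes:
        return None
    return sum(v > VOTE_THRESHOLD for v in votes) / len(votes)


# Made-up (peak wind in kt, OOB vote) pairs, for illustration only.
storms = [(28.0, 0.20), (33.0, 0.62), (36.0, 0.45), (37.5, 0.58), (44.0, 0.81)]
labels = [quadrant(w, v) for w, v in storms]
```

Applying `pod_in_wind_bin` to the 35-40 kt and >40 kt ranges of a real vote table would reproduce the kind of comparison reported above (58% versus 61%).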
The Mean Decrease Accuracy (MDA) and the Mean Decrease Gini (MDG) values for each signature are shown in Table 3. Large MDA and MDG values indicate a high importance of the radar signature for the Random Forest. The two most important signatures were VII and peak Zh over the entire cell. VII is the signature with the highest MDG and second-highest MDA, while peak Zh over the entire convective cell is the signature with the highest MDA and second-highest MDG. The two signatures with the lowest MDA and MDG are the height of precipitation ice and the height of peak Zh, with the latter yielding a negative MDA.
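As background for the MDG values, the following sketch shows the Gini impurity decrease produced by a single tree split. In common Random Forest implementations, MDG accumulates such decreases over all splits that use a given signature and averages them over the trees; the exact weighting used by the software in this study is not stated here, so this should be read as an illustration under that assumption. The labels and the split are made up.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def gini_decrease(parent, left, right):
    """Impurity decrease when `parent` is split into `left` and `right`."""
    n = len(parent)
    children = (len(left) / n) * gini(left) + (len(right) / n) * gini(right)
    return gini(parent) - children


# Hypothetical node with 4 wind and 6 null storms, split by one signature.
parent = ["wind"] * 4 + ["null"] * 6
left = ["wind"] * 3 + ["null"] * 1    # storms above the split value
right = ["wind"] * 1 + ["null"] * 5   # storms below the split value
decrease = gini_decrease(parent, left, right)
```

MDA is instead usually obtained by permuting one signature's values among the OOB storms and measuring the loss in accuracy, so a negative MDA means the permuted signature did slightly better than the real one.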

Single Signatures
The individual predictability of each of the eight signatures was computed by defining thresholds for each signature and verifying whether a signature value greater than that threshold occurred at least once before a wind event's downburst time, or at any time during a null event's life cycle. This procedure was applied to all 209 storms, the same dataset used in the Random Forest simulation. From these predictions of the wind and null events, a number of performance metrics were obtained to evaluate each signature's predictability over a range of physically realistic thresholds. The main metric used for comparisons with the Random Forest simulations was TSS. The calculation of 1-PC was also performed because it is equivalent to the Random Forest's OOB estimate of error rate. Lastly, the well-known POD and POFA were also calculated.
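A minimal sketch of this single-signature verification is given below. It assumes that each storm's maximum signature value has already been extracted over the appropriate period (before the downburst for wind events, over the whole life cycle for null events). The function names and sample values are hypothetical, and POFA is taken here as b/(a+b), which appears consistent with the values reported in this section but should be checked against the metric definitions given earlier in the paper.

```python
def contingency(max_values, is_wind, threshold):
    """Count hits (a), false alarms (b), misses (c), correct negatives (d)."""
    a = b = c = d = 0
    for value, wind in zip(max_values, is_wind):
        predicted = value > threshold  # forecast a wind event above threshold
        if predicted and wind:
            a += 1
        elif predicted:
            b += 1
        elif wind:
            c += 1
        else:
            d += 1
    return a, b, c, d


def scores(a, b, c, d):
    """Performance metrics from the 2x2 contingency table."""
    pod = a / (a + c)                       # wind events correctly predicted
    f = b / (b + d)                         # null events predicted as wind
    pofa = b / (a + b) if a + b else 0.0    # assumed definition of POFA
    return {"POD": pod, "F": f, "POFA": pofa, "TSS": pod - f,
            "1-PC": (b + c) / (a + b + c + d)}


# Made-up peak Zh values (dBZ) for three wind and five null storms.
peak_zh = [55, 53, 50, 48, 54, 51, 47, 45]
is_wind = [True, True, True, False, False, False, False, False]
table = contingency(peak_zh, is_wind, threshold=52)
result = scores(*table)
```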
Figure 5 shows the performance metrics for different thresholds for all eight radar signatures. As expected, POD and POFA generally decrease as the signatures' thresholds increase. The maximum TSS observed for each signature was between 0.35 and 0.40 for six out of the eight signatures. The highest TSS among all signatures and thresholds tested is 0.43, which was observed for a threshold of 52 dBZ for the peak storm Zh at any height (Signature #6, Figure 5f). This specific signature's threshold presented POD, POFA, and 1-PC values equal to 0.83, 0.42, and 0.31, respectively. The signature that presented the smallest maximum TSS was the height of peak Zh (Signature #4, Figure 5d), which was 0.29 at a threshold of 1250 m above the 0 °C isotherm height.
In general, the curves for 1-PC in Figure 5 are approximately negatively correlated with the TSS curves, since a lower 1-PC value means a better prediction, while for TSS a larger value indicates a better prediction. For signatures S#1 and S#8, the minimum 1-PC is found at the same signature threshold as the maximum TSS. For the Zdr column signature (S#1), the maximum TSS and minimum 1-PC occur for a threshold of 2750 m (TSS of 0.36 and 1-PC of 0.27), but this threshold presented an undesirable POD smaller than 50% (POD of 0.43). The DVIL signature (S#8) has a maximum TSS of 0.39 and a minimum 1-PC of 0.26 for a threshold of 1.9 kg m−2, but its POD is also lower than 50% (POD of 0.49).
The VII signature (S#3) presents its maximum TSS and minimum 1-PC at the same threshold of 4 kg m−2, for which TSS is 0.40 and 1-PC is 0.29. The thresholds of 4.5 and 5.5 kg m−2 have the exact same minimum 1-PC, but they have lower TSS, POD, and POFA (Figure 5c). For the other five signatures (S#2, S#4-S#7), the minimum 1-PC occurs at higher thresholds than the maximum TSS, which resulted in lower TSS, POD, and POFA for the thresholds with the minimum 1-PC. In addition, VIL (S#7) presented more than one threshold with the same minimum 1-PC value, with 16 and 17 kg m−2 having 1-PC equal to 0.27.
The maximum TSS for each signature is shown in Figure 6, which is organized in terms of POD, POFA, and TSS. TSS increases toward the top left of the plot and is negative (i.e., worse than a random forecast) to the right of POFA equal to 0.6. As previously mentioned, two signatures had maximum TSS for thresholds with POD of less than 0.5. The other six signatures presented a maximum TSS for thresholds with POD of greater than 0.5, but with a relatively high POFA around 0.4.
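The threshold selection behind Figures 5 and 6 can be sketched as a simple sweep. The example below uses made-up signature values with more null than wind storms, and it shows how the threshold with the minimum 1-PC can sit above the threshold with the maximum TSS, as reported for five of the signatures. All names and data are hypothetical.

```python
def table(values, is_wind, threshold):
    """2x2 contingency counts (a, b, c, d) for one threshold."""
    a = sum(v > threshold for v, w in zip(values, is_wind) if w)      # hits
    b = sum(v > threshold for v, w in zip(values, is_wind) if not w)  # false alarms
    c = sum(is_wind) - a                                              # misses
    d = len(values) - sum(is_wind) - b                                # correct negatives
    return a, b, c, d


def tss(a, b, c, d):
    return a / (a + c) - b / (b + d)


def one_minus_pc(a, b, c, d):
    return (b + c) / (a + b + c + d)


# Made-up signature values: three wind storms followed by seven null storms.
values = [49, 52, 56, 44, 45, 46, 47, 50, 51, 53]
is_wind = [True] * 3 + [False] * 7
thresholds = range(43, 57)

# max/min return the first threshold reaching the extreme value.
best_tss = max(thresholds, key=lambda t: tss(*table(values, is_wind, t)))
best_pc = min(thresholds, key=lambda t: one_minus_pc(*table(values, is_wind, t)))
```

In this toy case the maximum TSS occurs at a lower threshold (with every wind storm detected), while the minimum 1-PC occurs at a higher threshold that misses one wind storm but removes most false alarms.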

Discussion
The Random Forest OOB prediction for wind and null events presents better performance metrics than most of the single signatures' predictions, as described in Section 3.2. The Random Forest correctly predicted 58% of wind events and 82% of null events, leading to an overall correct prediction of 72% for all events. In this study, the main performance metric used for predictability analysis is the TSS, which weighs each storm category (winds and nulls) equally. In the TSS equation, half of its formulation comes from the wind events' predictability (a/(a+c); see Table 2), while the other half comes from the null events' predictability, through the false alarm rate F (b/(b+d)). In this way, the TSS is independent of how much larger one category is than the other. The other performance metric used in this study is the Random Forest's OOB estimate of error rate, or 1-PC for single signature predictions. Both are computed as the number of incorrectly predicted storms divided by the total number of events. This means that every storm is considered equally, regardless of whether it is a wind or a null event. In this study, since the null dataset comprises almost 60% of the entire dataset, the TSS weighs wind events more heavily in its calculation than the OOB estimate of error rate does.
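The different weighting of the two metrics can be checked numerically. The counts below are illustrative values chosen to be roughly consistent with the percentages quoted in the text (58% of wind events, about 82% of null events, 209 storms); they are an assumption and not the Table 2 values. Inflating the null sample leaves the TSS unchanged but changes the error rate.

```python
def tss(a, b, c, d):
    """True Skill Statistic: POD minus false alarm rate F."""
    return a / (a + c) - b / (b + d)


def error_rate(a, b, c, d):
    """Fraction of all storms predicted incorrectly (1-PC)."""
    return (b + c) / (a + b + c + d)


# Illustrative counts (assumed): a hits, b false alarms, c misses,
# d correct negatives, for 84 wind and 125 null storms.
a, b, c, d = 49, 23, 35, 102

tss_original = tss(a, b, c, d)
err_original = error_rate(a, b, c, d)

# Hypothetical dataset with five times as many null storms and the same
# per-category skill: only the null counts are scaled.
tss_scaled = tss(a, 5 * b, c, 5 * d)
err_scaled = error_rate(a, 5 * b, c, 5 * d)
```

With more null events at the same per-category skill, the error rate moves toward the null category's misclassification fraction, while the TSS stays the same.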
The Random Forest's TSS of 0.40 is larger than most of the single signatures' best TSS. The only single signature threshold that had a larger TSS than the Random Forest OOB estimate is the maximum Zh over the entire storm (Signature #6) using the 52 dBZ threshold. This signature's threshold presented a TSS equal to 0.43 due to its relatively high wind event predictability (POD of 0.83). However, its null event predictability is worse than that of the Random Forest model, since it only predicted 60% of these events correctly. Therefore, the F and the POFA were 0.40 and 0.42, respectively. Thresholds smaller than 52 dBZ showed higher F, while thresholds greater than 52 dBZ presented smaller POD, with both patterns leading to smaller TSS, as shown in Figure 5f. In contrast to this single radar signature, the Random Forest model results show much better prediction for null events but a poorer wind event prediction, leading to a slightly lower TSS. The single parameter approach is simpler to apply operationally, but it does not separate null events from wind events as well as the multi-parameter Random Forest model. Also, a variation of 1 dB from this signature threshold, which is within the Zh measurement error, leads to a lower TSS than the Random Forest results. Hence, the Random Forest model is preferred because it is more robust than the simpler single signature approach. However, the user should consider whether a high wind event detection rate or a low F (i.e., few null events incorrectly predicted as wind events) is more important for operational applications.
A VII threshold of 4 kg m−2 presented the exact same TSS as that of the Random Forest multi-parameter model results. However, this signature's POD and POFA are slightly larger (0.63 and 0.35) than those of the Random Forest. Similar to the Signature #6 case, a small variation of only 0.5 kg m−2 in the VII threshold produces a poorer TSS than the Random Forest model. The other six signatures present lower TSS values than the Random Forest, which indicates a worse balance between wind detection and F. As shown in Figure 6, these signatures have high POFA (greater than 0.39) or low POD (lower than 0.49).
The Random Forest OOB estimate of error rate is 28%, which is the percentage of total events (winds and nulls) incorrectly predicted. As stated previously, this metric weighs null events' performance more than wind events' simply because null events comprise a larger percentage of the total dataset than wind events. The Random Forest model predicted null events with greater skill than wind events; therefore, this metric generally presents better results than single signature predictions. As shown in Figure 5, single signatures present their minimum 1-PC at higher thresholds than their maximum TSS. This is due to the low F these thresholds present, which is related to the fact that the null events' predictability has greater importance for this performance metric. The signature threshold associated with this minimum 1-PC also presents lower POD, since 1-PC weighs wind event predictability less than TSS does. This is the primary reason why the Random Forest OOB estimate of error rate has better results (i.e., a lower value) than the best 1-PC threshold of five single signatures. The five signatures with a poorer 1-PC than the Random Forest model are S#2-S#6. The three signatures that presented better 1-PC values than the Random Forest model yielded their lowest 1-PC value at a threshold that also presented a POD lower than 50%, which is undesirable.
The MDA and MDG calculated for all radar signatures (Table 3) indicated that VII and peak Zh were the most important signatures for the Random Forest model. Most of the other signatures also presented positive values, indicating they contributed to an improved discrimination between wind and null classes. The height of the peak Zh (Signature #4) was the only signature that presented a negative MDA. To examine potential effects this signature may have on the performance of the Random Forest model, an additional Random Forest run was performed using only seven of the original signatures, removing Signature #4. The resulting predictions showed slightly worse performance metrics than the original model run, with POD, POFA, and TSS equal to 0.57, 0.34, and 0.37, respectively, and positive MDA and MDG for all signatures. This implies that removing signatures is not required and even causes a reduction in Random Forest model performance.
An earlier study [31] explored downbursts at CCAFS/KSC using the same Cape WINDS tower data and some of the same storms used in this study, but with a smaller dataset. They used similar signatures and analyzed performance metrics from signature thresholds by visual, subjective analysis, in contrast to this study, which used a semi-automated objective analysis (i.e., storms were manually tracked and radar signatures were calculated automatically). The prior study [31] assessed five dual-polarization radar signatures, three of which are shared with this study: height of the Zdr column, height of the precipitation ice signature, and peak Zh. The results from the Random Forest and objective single signature analyses herein are compared with the results from the subjective single signature analyses in [31] in the following paragraphs.
The Zdr column signature visually identified in [31] presents better results than the semi-automated single signature method and Random Forest model herein. For any given threshold, [31] shows larger POD and TSS and smaller POFA than the semi-automated single signature approach. For example, for 2000 m above the 0 °C level, the POD, POFA, and TSS values in [31] are 0.84, 0.21, and 0.63, respectively, while for the semi-automated single signature analysis, these performance metrics are 0.63, 0.40, and 0.34, respectively. In [31], the Zdr column threshold with the highest TSS is 2500 m, while for the semi-automated single signature analysis the threshold with the highest TSS is 2750 m. Signature threshold resolutions are different between these two studies for this signature, being 500 m for [31] and 250 m for this study, which may have contributed to some of the differences in these results.
A similar behavior can be seen for the other two signatures common to these two studies. For the precipitation ice signature, subjective visual analysis in [31] yielded much better results, with the best TSS being 0.75 for the thresholds of 4500 and 5000 m, while for the semi-automated single signature analysis the maximum TSS was 0.37 for the threshold of 6500 m. This signature was observed at high altitudes within null events more often using the semi-automated analysis than in the visual analysis, where it was rarely observed. For example, in this study, about 31% of null events had this signature for the threshold of 6500 m above the 0 °C level. This difference is speculated to be due to the expanded null event definition used in this study. The study in [31] only considered a single updraft-downdraft cycle for null events, while this study used the entire null event life cycle.
For the maximum Zh signature, the subjective visual analysis in [31] had better performance metrics than the semi-automated analysis for any given threshold. For the 50 dBZ threshold, the visual analysis in [31] had POD, POFA, and TSS values of 0.94, 0.33, and 0.47, respectively, while the semi-automated analysis results herein are 0.91, 0.52, and 0.24, respectively. For the 55 dBZ threshold, the POD, POFA, and TSS in [31] were 0.47, 0.06, and 0.44, respectively, while for the semi-automated analysis, the same metrics are 0.45, 0.25, and 0.34, respectively. As with the Zdr columns discussed above, the resolution used for the maximum Zh signature was different between these studies, being 5 dBZ in [31] and 1 dBZ for this study. The threshold with the highest TSS was 50 dBZ in [31], with visual analysis yielding a TSS of 0.47, while the threshold with the highest TSS for the semi-automated analysis is 52 dBZ, with a TSS of 0.43. Interestingly, the wind event detection is roughly the same for both methods, since POD is similar. However, more null events exceed these thresholds in the semi-automated method than in the visual method. As mentioned previously for the precipitation ice signature, the main reason for this difference is likely the different null event definitions used in these studies.
The Random Forest OOB approach is suitable for analysis since it presents results that are comparable to a method that uses one dataset to train the Random Forest model and a separate dataset to test it. To simulate this, one storm was removed from the original dataset, and the Random Forest model was trained using the remaining 208 storms. Then, the model was applied to the removed storm, which became the single test storm. The same procedure was repeated for all storms, and the output from each Random Forest run was compared to the storm's observed category, wind or null. These results were then summarized using the same performance metrics used throughout this study. The results for this approach were very similar to those of the OOB approach (i.e., an approach that does not require splitting the dataset between training and testing), with a POD of 0.58, POFA of 0.32, TSS of 0.39, and an error rate of 0.28. In an operational setting, this approach would be suitable because a Random Forest can be trained in advance and then applied in a straightforward way to an ongoing convective cell. In addition, the OOB method used in this study generally agrees with the aforementioned operational approach, attesting to its suitability for use in operations.
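The leave-one-storm-out procedure described above can be sketched generically. In the example below, `fit` and `predict` are placeholders for training and applying a Random Forest; a trivial one-feature midpoint classifier stands in for it so that the sketch runs with the standard library alone. All names and data are hypothetical.

```python
def leave_one_out(features, labels, fit, predict):
    """Train on all storms but one, predict the held-out storm, repeat."""
    predictions = []
    for i in range(len(features)):
        train_x = features[:i] + features[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        model = fit(train_x, train_y)
        predictions.append(predict(model, features[i]))
    return predictions


def fit_midpoint(xs, ys):
    """Stand-in model (not a Random Forest): midpoint of the class means."""
    wind = [x for x, y in zip(xs, ys) if y]
    null = [x for x, y in zip(xs, ys) if not y]
    return (sum(wind) / len(wind) + sum(null) / len(null)) / 2.0


def predict_midpoint(cutoff, x):
    """Predict a wind event when the feature exceeds the learned cutoff."""
    return x > cutoff


# Made-up single-feature dataset: three null storms, then three wind storms.
features = [3.0, 3.5, 4.5, 5.0, 5.5, 6.0]
labels = [False, False, False, True, True, True]
predictions = leave_one_out(features, labels, fit_midpoint, predict_midpoint)
```

The resulting list of per-storm predictions can be compared with the observed categories using the same POD, POFA, TSS, and error rate calculations used elsewhere in this study.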

Conclusions
This study presented a Random Forest classification method for downburst forecasting around CCAFS/KSC. The parameters ingested into the Random Forest model are based on dual-polarization radar signatures that have physical implications for downdraft intensification and the occurrence of a strong downburst at the surface. The data from the high-density Cape WINDS wind towers provided unique quantitative wind observations, in contrast to wind reports based on surface damage that are frequently used where such observations are not available. A Random Forest consists of hundreds of decision trees, each trained using about two-thirds of the total storm dataset. For each tree node, only two signatures are candidates to be used for the split, and one signature is ultimately used. This procedure results in lower variance, and hence better results, than a single decision tree. Then, the OOB method is used to obtain a prediction result for each storm, avoiding the need for separate training and testing datasets.
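The "about two-thirds" figure follows from bootstrap sampling with replacement, as the short sketch below illustrates. The number of trees (500) and the random seed are assumptions; the 209 storms, the eight signatures, and the two candidate signatures per node come from the text. The function names are hypothetical.

```python
import random


def bootstrap_split(n_storms, rng):
    """Draw one bootstrap sample; return in-bag draws and OOB storm indices."""
    in_bag = [rng.randrange(n_storms) for _ in range(n_storms)]
    oob = sorted(set(range(n_storms)) - set(in_bag))
    return in_bag, oob


def candidate_signatures(rng, n_signatures=8, n_candidates=2):
    """Pick the signatures that compete for the split at one tree node."""
    return rng.sample(range(1, n_signatures + 1), n_candidates)


rng = random.Random(0)   # fixed seed (assumed) for reproducibility
n_storms = 209
n_trees = 500            # assumed; the text only says "hundreds"

fractions = []
for _ in range(n_trees):
    in_bag, oob = bootstrap_split(n_storms, rng)
    fractions.append(len(set(in_bag)) / n_storms)

# Average share of distinct storms seen by a tree, close to 1 - 1/e.
mean_in_bag_fraction = sum(fractions) / n_trees
node_candidates = candidate_signatures(rng)
```

The storms left out of a tree's bootstrap sample (roughly one-third) are that tree's OOB storms, which is what allows each storm to receive a vote from trees that never saw it during training.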
The Random Forest model predicted null events better than wind events. The POD for the stronger downbursts was higher than for downbursts with maximum wind magnitude close to the wind event threshold of 35 kt. This is consistent with an expected tendency for wind detection to increase as the wind magnitude increases, as shown in Figure 4.
When compared to a threshold-based method for each single signature, the Random Forest model is preferred because of its robustness. Some single signature thresholds presented better TSS than the Random Forest model. However, they had poorer performance for thresholds close enough to be within the radar measurement error. Also, some single signatures with high TSS or low 1-PC metrics occurred at thresholds with POD lower than 0.5 or relatively high POFA.
The Random Forest OOB method was equivalent to an approach where a storm is separated from the model to be used as testing data. The latter approach, which had similar results to the OOB method, is suitable for adaptation in an operational forecast office. The 45WS and other users can decide among the methods presented in this study depending on whether better wind event detection or fewer false alarms is desirable. However, given its robust performance, the aforementioned Random Forest approach is recommended for continued investigation and operational testing. Before operational implementation and testing, future work should include a storm identification and tracking algorithm such as [51,52] in order to make the proposed Random Forest method fully objective and automated.

Figure 2. Zh at 5 km AGL on 06/09/2015 at 1915 UTC. The spatial definition of a cell associated with a wind event is highlighted as a red box, and the gray 'X's show Cape WINDS tower locations. The solid black line indicates the plane of the vertical cross-section shown in Figure 3.

Figure 3. Vertical cross-section of Zdr (shaded) and Zh (black contour every 10 dBZ, from 10 dBZ to 50 dBZ) at the location shown as a black line in Figure 2. The horizontal blue line indicates the 0 °C isotherm height.

Figure 4. Random Forest vote for all events as a function of the observed maximum wind magnitude in kt. The vertical line depicts the wind event threshold of 35 kt. The horizontal line at a vote of 0.5 specifies the minimum vote value necessary for the Random Forest to predict a storm as a wind event. As such, the upper (lower) left quadrant can be interpreted as encompassing the incorrectly (correctly) forecasted null events. Similarly, the upper (lower) right quadrant can be interpreted as including the correctly (incorrectly) forecasted wind events. More details can be found in the main text.

Figure 5. POD, POFA, TSS, and 1-PC for the single signatures prediction for different thresholds applied. The optimal value for POD and TSS is 1, and for POFA and 1-PC is 0. Radar signatures are: (a) Zdr column maximum height; (b) Precipitation ice signature maximum height; (c) VII; (d) Height of peak Zh above the 0 °C isotherm level; (e) Peak Zh above the 0 °C isotherm level; (f) Peak Zh within the storm; (g) VIL; (h) DVIL.

Figure 6. TSS for the radar signatures' threshold with maximum TSS (contours), presented in terms of POD and POFA. Radar signatures are S#1: Zdr column maximum height; S#2: Precipitation ice signature maximum height; S#3: VII; S#4: Height of peak Zh above the 0 °C isotherm level; S#5: Peak Zh above the 0 °C isotherm level; S#6: Peak Zh within the storm; S#7: VIL; S#8: DVIL.

Table 1. Radar signature numbers, physical descriptions, and units.

Table 2. Random Forest out-of-bag confusion matrix.

Table 3. Random Forest's Mean Decrease Accuracy and Mean Decrease Gini for all radar signatures.