To provide more useful and general results, the analysis of a single severe storm in Section 3.1 is repeated and expanded here on a dataset that combines the entire sample of 33 storms (515 radar volumes), which is summarized in Table 1. To characterize the thunderstorm dataset further, histograms of mean and maximum flash rate are provided in Figure 5. The storms exhibit a wide range of flash rates, with storm maximum flash rates ranging from 1.1 min⁻¹ to 101.7 min⁻¹. The modal (most common) storm has mean and maximum flash rates ≤10 min⁻¹, which is not uncommon for non-severe storms or low-topped supercells in Alabama. In fact, half of the Alabama storm sample has a storm average flash rate <10 min⁻¹ and a storm maximum flash rate <21 min⁻¹. This distribution of flash rates is typical for Alabama but lower than Colorado’s high mean flash rates [13,14], especially compared to a recent Colorado data sample dominated by high flash rate storms [48].
Because the severe storm analyzed in Section 3.1 has mean (91st percentile) and maximum (97th percentile) flash rates that are much larger than those of most storms in the overall study sample, it is important to revisit the correlation between lightning, kinematic and microphysical parameters, as revealed by ρ of the overall time series data (Figure 6), before moving on to regressing linear relationships between them. A comparison of Figure 3 and Figure 6 reveals both similarities and differences. As for the severe QLCS storm, the microphysical parameters in the overall dataset are somewhat better correlated with each other than with kinematic parameters, and vice versa, although there are a few minor exceptions. With the exception of maximum updraft, the values of ρ between lightning flash rate and the various radar-inferred kinematic and microphysical parameters are lower in the overall dataset (0.69–0.76) than in the severe QLCS storm (0.87–0.98). Interestingly, the ρ between lightning and maximum updraft is slightly higher in the overall dataset (0.60) than in the severe QLCS storm (0.50) but is still the lowest of all the storm-parameter–lightning relationships. The values of ρ between overall flash rates and storm parameters in Figure 6 are generally lower than in prior studies of single storms [8,9], including the Alabama severe QLCS in Figure 3, and of small samples of storms [13,14,48]. Although the microphysical parameters again have higher correlation with flash rate than the kinematic parameters in the overall dataset, the differences are smaller than for the severe QLCS storm. Based on Figure 6, the error performance metrics of the various flash rate parameterizations derived from the overall dataset are likely to be fairly similar, with the possible exception of maximum updraft and updraft volume >10 m s⁻¹, which are likely to be slightly worse. This suggestion is explored in more detail in the following sub-sections, along with an overall assessment of lightning–radar relationships and a comparison to prior studies.
3.2.1. Overall Dataset with Zero Y-intercept
Using the datasets and methods outlined in Section 2, linear equations estimating lightning flash rate from various radar-inferred microphysical and kinematic parameters are regressed with a zero y-intercept and provided in Table 2 using data from all 33 storms and 515 samples (Table 1). It is worth noting that the coefficient of determination (R²) is purposefully omitted from Table 2 since an R² for a regression solution forced through the origin does not have the same physical interpretation (i.e., % variance explained) and cannot be directly compared to an R² when the regression solution is not forced through the origin. Scatterplots of lightning flash rates versus radar parameters for all thunderstorms are shown in Figure 7 with the corresponding best fit lines from Table 2. The scatter between flash rate and radar-inferred microphysical and kinematic parameters is clearly much larger for the full sample of 33 Alabama storms (Figure 7), including low and high flash rate storms of varying types and severity, than for one high flash rate severe storm (Figure 4). In particular, the scatter in Figure 7 is much larger at low flash rates (i.e., <10 min⁻¹) than it is at moderate-to-high rates. A similar trend of large scatter in lightning–radar relations at flash rates <10 min⁻¹ can be gleaned from the smaller samples of earlier studies [13,14,48], although the fraction of low flash rates is much larger in this study. As noted earlier, a little over half (17) of the 33 storms in this study have a storm average flash rate <10 min⁻¹ (Figure 5). In fact, the median (mean) lightning flash rate for all 515 analyzed samples in this study is 4.6 min⁻¹ (12.5 min⁻¹) and fully 63% of the flash rates (i.e., during a radar sample volume time) are <10 min⁻¹. These low flash rates occur in a variety of storm types, including non-severe multicellular storms and low-topped supercells, and are very common in Alabama. The amount of scatter qualitatively evident in the lightning flash rate versus radar parameters in Figure 7 is similar for each of the different microphysical and kinematic properties evaluated, including the increased scatter at low flash rates. Quantitative assessment of estimation error in Table 3 confirms that RMSE (NRMSE) is similar for the microphysical parameters and most kinematic parameters, at 13–14 min⁻¹ (13–14%) for the linear equations in Table 2 and Figure 7. For updraft volume >10 m s⁻¹, RMSE (NRMSE) is larger at 18 min⁻¹ (17%). Mean bias errors (Table 3) for the relations in Table 2 are small, ranging from −0.9% to 0.8%, which is to be expected since the lines are derived using WLS linear regression on the Alabama data in Figure 7.
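The zero-intercept fit and the error metrics quoted above can be illustrated with a minimal sketch. It assumes that NRMSE and NMBE are normalized by the range of the observed flash rates and that the WLS weights are supplied by the caller; neither detail is restated in this section, and the function names and sample values are hypothetical rather than taken from the study.

```python
import math


def fit_through_origin(x, y, weights=None):
    """Slope b of y = b * x minimizing sum(w * (y - b * x)**2).

    With weights=None this reduces to ordinary least squares through
    the origin. The weighting scheme is an assumption left to the caller.
    """
    if weights is None:
        weights = [1.0] * len(x)
    num = sum(w * xi * yi for w, xi, yi in zip(weights, x, y))
    den = sum(w * xi * xi for w, xi in zip(weights, x))
    return num / den


def error_metrics(observed, estimated):
    """Return MBE, RMSE and their normalized (%) forms.

    Normalization by the observed range (max - min) is an assumption.
    """
    n = len(observed)
    errors = [e - o for o, e in zip(observed, estimated)]
    mbe = sum(errors) / n
    rmse = math.sqrt(sum(d * d for d in errors) / n)
    span = max(observed) - min(observed)
    return {
        "mbe": mbe,
        "rmse": rmse,
        "nmbe": 100.0 * mbe / span,
        "nrmse": 100.0 * rmse / span,
    }


# Hypothetical radar parameter values and flash rates (min^-1).
radar_parameter = [0.5, 1.0, 2.0, 4.0, 8.0]
flash_rate = [1.0, 3.0, 4.0, 9.0, 15.0]

slope = fit_through_origin(radar_parameter, flash_rate)
predicted = [slope * v for v in radar_parameter]
print(slope, error_metrics(flash_rate, predicted))
```

Because the line is fitted to the same data on which it is evaluated, the mean bias is small by construction, in line with the small mean bias errors reported for the relations in Table 2.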
When flash rate parameterization equations derived in earlier studies from smaller Alabama and Colorado [13,14] or Colorado-only [48] observational datasets are applied to the larger sample of independently observed Alabama thunderstorms in this study, the RMSE and MBE increase for all radar parameters (Table 3), sometimes significantly, compared to the new Alabama relations developed in this study. The equations from prior studies that perform reasonably well, with relatively low error, when applied to the Alabama dataset herein are those based on the graupel volume (NRMSE = 16% and NMBE = 4%) and maximum updraft velocity (NRMSE = 17% and NMBE = 2%) observed in Colorado thunderstorms [48]. Relations from prior studies [13,14,48] based on graupel mass, 35 dBZ echo volume, updraft volume >5 m s⁻¹, and updraft volume >10 m s⁻¹ perform poorly on the Alabama dataset in this study, exhibiting much larger NMBE, NRMSE or both (Table 3).
All of the flash rate parameterization relations derived from Colorado thunderstorms only [48] overestimate flash rates when applied to the Alabama thunderstorms observed in this study. As can be seen in Table 3 and Figure 8, the positive bias (NMBE) and associated scatter error (NRMSE) for the Colorado-only relations [48] increase from graupel volume (NMBE = 4% and NRMSE = 16%) to 35 dBZ echo volume (NMBE = 11% and NRMSE = 27%) to graupel mass (NMBE = 14% and NRMSE = 34%) to updraft volume >5 m s⁻¹ (NMBE = 28% and NRMSE = 55%).
One possible explanation for the positive bias and increased RMSE of the Colorado-only graupel mass relation for estimating flash rate is a difference in methodology: in that study [48], the predictor variable is strictly precipitation ice mass, which may include the ice mass from both graupel and hail. However, in an earlier study of both Alabama and Colorado storms [13], hail contributed little to the overall precipitation ice mass, such that it was very similar to graupel mass. Although not shown here, a similar outcome is found in this study, which is why only graupel mass is utilized. It is possible that the thunderstorm sample in the Colorado-only study [48] has more hail that contributed significantly to the precipitation ice mass. However, combining hail and graupel into precipitation ice mass in this study and using the Colorado-only relation [48] would only increase the positive biases in estimated flash rates for Alabama storms.
The fairly large positive bias and NRMSE associated with estimating flash rates from the 35 dBZ echo volume relation are surprising given that the Colorado-only study [48] found superior performance for this relation, including when it was applied to Alabama storms. One possible explanation for the discrepancy is that the earlier Colorado-only study [48] excludes non-isolated storms, while this study does not exclude them given their ubiquitous presence in Alabama. In this study, any errors in automated cell tracking associated with non-isolated, multicellular storms are manually corrected.
The large positive bias and RMSE in flash rates estimated from the Colorado-only [48] relations based on updraft volume >5 m s⁻¹ and >10 m s⁻¹ (Table 3) make these relations unsuitable for use in Alabama storms. An additional challenge for flash rate parameterization equations based on updraft volume >10 m s⁻¹ is that Alabama thunderstorms with non-zero flash rates often have little or no updraft >10 m s⁻¹ (Figure 9), which is why the relation from this study in Table 2 with a zero y-intercept predicts low or zero flash rates in these situations, resulting in a negative bias (Table 3). By comparison, the Colorado-only flash rate relation based on updraft volume >10 m s⁻¹ [48] has a positive y-intercept of 8.8 min⁻¹, thus predicting a flash rate of 8.8 min⁻¹ even when the >10 m s⁻¹ updraft volume is zero (Figure 9).
Conversely, the Colorado-only flash rate relation based on maximum updraft velocity [48] has a negative y-intercept, which frequently (i.e., 21% of the time; Table 4) results in a prediction of unphysical negative flash rates when applied to the Alabama thunderstorms in this study (Figure 9). Despite the negative flash rates at small values of maximum updraft, the Colorado-only maximum updraft velocity relation [48] over-estimates flash rate in the mean when applied to the Alabama data (Table 3), especially if the unphysical negative flash rates are replaced by zero before error estimation (Table 4). This positive bias despite negative flash rates is likely due to the Colorado-only relation [48] over-estimating flash rate at moderate-to-high values of maximum updraft velocity compared to the relation in this study derived from the Alabama data (Table 2, Figure 10). In fact, all of the Colorado-only relations [48] over-estimate flash rate in nearly all kinematic and microphysical conditions compared to the Alabama-only relations derived in this study, as can be seen clearly in Figure 10. Possible factors underlying the differences in relations are discussed in Section 4.
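The effect of a negative y-intercept on predicted flash rates, and of replacing negative predictions by zero before computing the bias, can be sketched as follows. The slope, intercept, and sample values are hypothetical and chosen only for illustration; they are not coefficients from Table 2, Table 5, or any cited study.

```python
def apply_linear_relation(x, slope, intercept):
    """Estimate flash rate from a radar parameter with a linear relation."""
    return [slope * xi + intercept for xi in x]


def negative_fraction(estimated):
    """Fraction of samples with an unphysical negative flash rate."""
    return sum(1 for e in estimated if e < 0.0) / len(estimated)


def clip_to_zero(estimated):
    """Replace negative flash rate estimates by zero."""
    return [max(0.0, e) for e in estimated]


def mean_bias(observed, estimated):
    """Mean bias error (estimated minus observed)."""
    return sum(e - o for o, e in zip(observed, estimated)) / len(observed)


# Hypothetical maximum updraft values (m s^-1) and observed flash rates (min^-1).
max_updraft = [0.0, 5.0, 10.0, 20.0]
observed = [1.0, 1.0, 10.0, 30.0]

# Hypothetical relation with a negative y-intercept.
raw = apply_linear_relation(max_updraft, slope=2.0, intercept=-8.0)
clipped = clip_to_zero(raw)

print(negative_fraction(raw))
print(mean_bias(observed, raw), mean_bias(observed, clipped))
```

In this toy case the relation over-estimates at larger updraft values, so clipping the single negative estimate to zero moves the mean bias upward, which is the same qualitative behavior described for Table 3 versus Table 4.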
Similarly, the flash rate parameterization equations derived in earlier studies of Alabama-only or Alabama and Colorado storms combined [13,14] also have negative y-intercepts and frequently produce physically unrealistic negative flash rates when applied to the Alabama thunderstorms in this study. In fact, negative flash rates are estimated in 49% to 50% of the samples when the updraft volume >5 m s⁻¹ relation from the earlier study [14] is applied to the radar samples in this study (Table 4). Negative flash rates are predicted even more frequently (66–90%) when the graupel (or precipitation ice) mass relations are applied to the radar data in this study (Table 4). When negative flash rates are included in error estimation (Table 3), the bias errors for relations based on updraft volume >5 m s⁻¹ and graupel mass from the earlier studies [13,14], as applied to radar data in this study, are negative and accompanied by large RMSE, making them generally unsuitable. Similar results and conclusions regarding frequently predicted negative flash rates were also reached by a recent study [48] that tested these earlier relations [13,14] on Colorado-only data. If the negative flash rates are replaced by zero before error estimation (Table 4), the bias errors for relations in the earlier studies [13,14] become less negative for graupel mass and even slightly positive for updraft volume >5 m s⁻¹ when applied to radar samples in this study. The switch in bias from negative to positive when substituting zero for unphysical negative flash rates is related, in part, to the earlier updraft volume >5 m s⁻¹ relation [14] over-estimating flash rate at moderate-to-large values of updraft volume, as seen in a comparison (Figure 10) of the flash rate parameterization equation for Alabama storms in [14] to those derived from the Alabama data in this study (Table 2).
Flash rates estimated from relations based on maximum updraft velocity that were derived in earlier radar studies [14,86] of Alabama-only or Alabama and Colorado storms are also frequently (24–43%) negative when applied to the radar observations of Alabama storms in this study, resulting in large NRMSE (Table 3 and Table 4). Interestingly, the NMBE is large and positive despite the negative flash rates (Table 3) and only increases if negative flash rates are replaced by zero (Table 4), because the earlier relations [14,86] grossly over-estimate flash rate at larger maximum updrafts when compared to the relations for Alabama storms in this study (Figure 10). Regardless of how negative flash rates are treated in error estimation, relations that produce them frequently are likely not well suited for general use. It is unclear why the earlier flash rate parameterizations [13,14,86] based on graupel mass, updraft volume >5 m s⁻¹ and maximum updraft velocity derived from Alabama and Colorado (or Alabama only) storms frequently predict unrealistic negative flash rates when applied to the Alabama storms in this study or other Colorado storms [48]. Differences in data and methodology seem less likely to be important since the same radar and lightning networks and most of the same methods used in the earlier studies [13,14,86] are used herein.
3.2.2. Overall Dataset with Non-zero Y-intercept
Given that it is standard statistical practice [84] and that other studies have used unforced linear regression with a non-zero y-intercept in all [13,14] or some [48] of their flash rate parameterization equations, the analysis in Section 3.2.1 is briefly repeated here using non-zero y-intercepts. The R² values for the regressed equations in Table 5 with non-zero y-intercepts suggest that parameterizations based on radar-inferred microphysical parameters can explain about 60% of the variance in flash rate, while radar-inferred kinematic parameters can explain from about 50% (based on updraft volume) to about 40% (based on maximum updraft velocity). The R² and explained variance of flash rates by radar parameters in this study are lower than in prior studies [13,14,48], although it is worth noting that the recent Colorado study [48] also found lower R² for relations based on kinematic quantities compared to microphysical ones. The higher R² in the prior studies is likely the result of smaller samples and larger mean flash rates, although there could be other possible explanations (e.g., differences in storm dynamics, microphysics, observational error, and conceptual model error).
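As a minimal sketch of what an unforced fit and its R² involve, the following computes an ordinary least-squares line with a free y-intercept and the fraction of variance explained. It uses unweighted least squares for simplicity, which is an assumption and may differ from the weighting used in this study; the function names and sample values are hypothetical.

```python
def fit_with_intercept(x, y):
    """Ordinary least-squares slope and y-intercept of y = a * x + b."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(observed, estimated):
    """Coefficient of determination, 1 - SS_res / SS_tot.

    SS_tot is taken about the mean of the observations, which is why the
    value reads as 'fraction of variance explained' only for a fit that
    includes a free intercept.
    """
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - e) ** 2 for o, e in zip(observed, estimated))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical radar parameter values and flash rates (min^-1).
radar_parameter = [0.0, 1.0, 2.0, 3.0]
flash_rate = [1.0, 2.0, 2.0, 3.0]

a, b = fit_with_intercept(radar_parameter, flash_rate)
fitted = [a * v + b for v in radar_parameter]
print(a, b, r_squared(flash_rate, fitted))
```

For a line forced through the origin, the same formula is no longer bounded below by zero, which is one way to see why such a value cannot be compared directly with the R² of an unforced fit.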
The flash rate parameterization equations in this study have positive y-intercepts, except for the maximum updraft velocity relation, whose y-intercept is negative (Table 5). With the exception of maximum updraft velocity, this differs from the earlier studies of Alabama-only or Alabama and Colorado combined storms [13,14], which all had negative y-intercepts (Figure 10). On the other hand, for those relations in the recent Colorado-only study [48] with non-zero y-intercepts (i.e., updraft volume >10 m s⁻¹ and maximum updraft velocity), the signs of the y-intercepts (i.e., positive for the former and negative for the latter) are the same as in this study. A consistent difference between the relations in Table 5 and all prior studies is that the y-intercept parameters tend to have smaller magnitudes in this study (0.3 to 4.2 min⁻¹) compared to the Colorado relations [48] (8.8 to 16.7 min⁻¹) and the earlier studies of Alabama and Colorado (or Alabama only) storms [13,14,86] (5.1 to 44.4 min⁻¹). As is also evident in Figure 10, the flash rate parameterization relations in this study with zero y-intercepts (Table 2) are very similar to those with non-zero y-intercepts (Table 5), which is expected since the y-intercepts in Table 5 are small (or not far from zero). As such, the error performance of the two sets of relations in this study is fairly similar, with the non-zero y-intercept relations in Table 5 having slightly smaller magnitudes of MBE and RMSE (Table 3 and Table 4).
Given these results, when flash rate parameterization equations derived in one study perform poorly on data from another study, the cause is more likely to be differences between the studies in storm characteristics, sample sizes, observational errors and/or radar and lightning analysis methods than the choice of whether the linear regression is forced through the origin (zero y-intercept) or not (non-zero y-intercept). In Section 4, we argue that these differences between radar-lightning studies are likely what causes variance in the flash rate parameterization equations, including y-intercepts and slopes, in the first place.