Using Logistic Regression to Identify the Key Hydrologic Controls of Ice-Jam Flooding near the Peace–Athabasca Delta: Assessment of Uncertainty and Linkage with Physical Process Understanding

: The Peace–Athabasca Delta (PAD) in northern Alberta is one of the world’s largest inland freshwater deltas and is home to many species of ﬁsh, mammals, and birds. Over the past ﬁve decades, the PAD has experienced prolonged dry periods in between rare ﬂoods, accompanied by a reduction in the area comprised of lakes and ponds that provide a habitat for aquatic life. In the Peace sector of the PAD, this likely resulted from a reduced frequency of spring ﬂooding caused by major ice jams that form in the lower Peace River. There is debate in the literature regarding the factors that promote or inhibit the formation of such ice jams, deriving from physical process studies, paleolimnological studies, and—recently—statistical analysis founded in logistic regression. Logistic regression attempts to quantify ice-jam ﬂood (IJF) probability, given the values of assumed explanatory variables, involve considerable uncertainty. Herein, different sources of uncertainty are examined and their effects on statistical inferences are evaluated. It is shown that epistemic uncertainty can be addressed by selecting direct explanatory variables, such as breakup ﬂow and ice cover thickness, rather than through more convenient, albeit weak, proxies that rely on winter precipitation and degree-days of frost. Structural uncertainty, which derives from the unknown mathematical relationship between IJF probability and the selected explanatory variables, leads to different probability predictions for different assumed relationships but does not modify assessments of statistical signiﬁcance. The uncertainty associated with the relatively small sample size (number of years of record) may be complicated by known physical constraints on IJF occurrence. Overall, logistic regression corroborates physical understanding that points to breakup ﬂow and freezeup level as primary controls of IJF occurrence. Additional inﬂuences, related to the thermal decay of the ice cover and the ﬂow gradient during the advance of the breakup front towards the PAD, are difﬁcult to quantify at present. Progress requires increased monitoring of processes and an enhanced numerical modelling capability.


Introduction
The Peace-Athabasca Delta (PAD) in northern Alberta (Figure 1) is one of the world's largest inland freshwater deltas and is a homeland for the Indigenous Peoples of the region.It has been designated a Ramsar wetland of international importance and is largely located within the Wood Buffalo National Park (WBNP), itself being a UNESCO World Heritage Site [1].During the past five decades, this complex and dynamic region has, in between rare overland floods, experienced prolonged dry periods and a considerable reduction in the area covered by lakes and ponds that provide a habitat for aquatic life [2][3][4][5][6][7][8][9][10].The drying trend coincides with the regulation of Peace River, which began with the construction (1968), reservoir-filling (1968)(1969)(1970)(1971), and operation (1972 onwards) of the W.A.C. Bennett hydroelectric dam in British Columbia, located some 1200 km upstream of the PAD (Figure 1).As noted in a related conference paper [11], there is debate in the scientific literature regarding the possible effect of regulation on the drying of the PAD (see [12] for the relevant bibliography).
Water 2023, 15, x FOR PEER REVIEW 2 of 16 construction (1968), reservoir-filling (1968)(1969)(1970)(1971), and operation (1972 onwards) of the W.A.C. Bennett hydroelectric dam in British Columbia, located some 1200 km upstream of the PAD (Figure 1).As noted in a related conference paper [11], there is debate in the scientific literature regarding the possible effect of regulation on the drying of the PAD (see [12] for the relevant bibliography).From [13], with changes.
Concern over the long-term health and sustenance of PAD ecosystems is underscored by climate change and the ongoing/future construction of more dams [13].As a result of a UNESCO Reactive Monitoring Mission report [14], prompted by a petition from Indigenous Peoples, Canadian federal and provincial authorities commissioned a strategic assessment [15] of WBNP.This assessment culminated in the development of the WBNP Action Plan [1], which incorporated Indigenous knowledge, to address several recommendations towards preserving the ecological integrity of this important World Heritage Site.The success of the Action Plan depends on a sound understanding of the processes and hydroclimatic variables that control the supply of water, sediment, and nutrients to the PAD basins.Ongoing concerns prompted a second Monitoring Mission in 2022, which resulted in further recommendations [16].
The recharge of the delta depends, in part, on the formation of extensive spring ice jams that can trigger overland inundation.Ice jamming in the lower Peace River is the primary mechanism that can replenish the high-elevation-or "perched"-lakes and ponds of the Peace Sector of the PAD [17].The Historical and Traditional Knowledge record of ice-jam floods (IJFs) includes events of varying magnitude, labelled 1, 2, and 3 [18,19], or small, moderate, and large, respectively [20].It is the large, or magnitude 3, variety that can generate the extensive overland flooding that is necessary to replenish perched basins.In what follows, the term IJF is used to denote large events (see also Section 6).During the 20th and 21st centuries, IJFs have been reported for the years 1900,1904,1920,1932,1933,1934,1942,1943,1948 (may have been of magnitude 2 and is not usually included in the IJF series), 1958, 1963, 1965, 1972, 1974, 1996, 1997, and 2014.Unlike past physics-based research [17,21], Lamontagne et al. [22] applied logistic regression to identify controlling variables and calculate IJF probability.This research avenue is uncommon in the study of river ice processes and therefore merits careful Concern over the long-term health and sustenance of PAD ecosystems is underscored by climate change and the ongoing/future construction of more dams [13].As a result of a UNESCO Reactive Monitoring Mission report [14], prompted by a petition from Indigenous Peoples, Canadian federal and provincial authorities commissioned a strategic assessment [15] of WBNP.This assessment culminated in the development of the WBNP Action Plan [1], which incorporated Indigenous knowledge, to address several recommendations towards preserving the ecological integrity of this important World Heritage Site.The success of the Action Plan depends on a sound understanding of the processes and hydroclimatic variables that control the supply of water, sediment, and nutrients to the PAD basins.Ongoing concerns prompted a second Monitoring Mission in 2022, which resulted in further recommendations [16].
The recharge of the delta depends, in part, on the formation of extensive spring ice jams that can trigger overland inundation.Ice jamming in the lower Peace River is the primary mechanism that can replenish the high-elevation-or "perched"-lakes and ponds of the Peace Sector of the PAD [17].The Historical and Traditional Knowledge record of ice-jam floods (IJFs) includes events of varying magnitude, labelled 1, 2, and 3 [18,19], or small, moderate, and large, respectively [20].It is the large, or magnitude 3, variety that can generate the extensive overland flooding that is necessary to replenish perched basins.In what follows, the term IJF is used to denote large events (see also Section 6).During the 20th and 21st centuries, IJFs have been reported for the years 1900,1904,1920,1932,1933,1934,1942,1943,1948 (may have been of magnitude 2 and is not usually included in the IJF series), 1958, 1963, 1965, 1972, 1974, 1996, 1997, and 2014.Unlike past physics-based research [17,21], Lamontagne et al. [22] applied logistic regression to identify controlling variables and calculate IJF probability.This research avenue is uncommon in the study of river ice processes and therefore merits careful examination and assessment.The primary objective of this paper is to explore the uncertainties that were identified in [22], assess their implications for the identification of controlling variables, and delineate what can, or cannot, be inferred with some confidence from the logistic regression Water 2023, 15, 3825 3 of 15 results.A secondary objective is to examine whether the results of the logistic regression agree with the known physics of ice breakup processes, with a focus on the lower Peace River.

Background Information
Logistic regression on binary outcomes, such as flood/no flood, during the spring breakup of any given year, attempts to quantify the probability of flood occurrence (P), given a set of known or hypothesized explanatory variables.It assigns respective values of 1 and 0 to flood and non-flood years and assumes that the probability of flood occurrence is expressed as: P{IJF/(x 1 , x 2 , x 3 , . . .
in which x 1 , x 2 , and x 3 are the explanatory variables and with b 0 , b 1 , b 2 , and b 3 being coefficients, of which the numerical values are furnished by the regression.The form of Equation ( 1) ensures that P always varies between 0 and 1 (as it should), regardless of the value of X, while Equation ( 2) assumes that the effects of the explanatory variables are linear and additive.Together, Equations ( 1) and ( 2) imply that X = ln{P/(1 − P)} = natural logarithm of the "odds ratio" and is called the "logit" in the statistical literature.To those who are accustomed to physics-based research, considerable serendipity appears to be needed to come across linear and additive logits (see also Section 4).This is a key point because the numerical values of b 0 , b 1 , b 2 , . . .and their associated p-values depend on the structure of Equation (2).A primary consideration in logistic regression is the sample size, which, in the present case, is the number of years of record.A common, though not universally accepted, empirical requirement is that the sample should contain at least 10 events per variable [23].For applications to IJFs in the lower Peace River (frequency ~0.1), the use of two variables would require sample sizes of at least 2*10/0.1 = 200 years (300 years in the case of three variables, 400 years for four variables, and so on).By contrast, the available hydrometric data span a period of <60 years.
Lamontagne et al. [22] noted that the bias introduced by relatively short records is successfully reduced by maximizing a penalized likelihood function (Firth [24]) and applied the Firth logistic regression to known IJFs using potentially relevant explanatory variables.They adopted the total winter (November to April) precipitation, WP, at Grande Prairie (supplemented, as needed, with data from nearby Beaverlodge) as a proxy for breakup flow, which is the primary factor that drives dynamic breakup events, as well as the accumulated degree-days of frost at Fort Vermilion (DDF) as a proxy for ice thickness, which is a resisting factor that promotes ice jamming.Grande Prairie, located ~150 km SW of the town of Peace River (Figure 1), is considered a good index station for snowmelt runoff in the key Smoky River sub-basin of the Peace [17].The freezeup level, HF, a resistance factor, was also considered initially but eventually discarded in [22] as it was not statistically significant (p-value > 0.05).Other factors, such as November flows and the rapidity of spring melt were also considered but did not seem to improve upon the ultimately adopted logistic model, which was solely based on the aforementioned proxy variables WP and DDF.This model predicts increasing IJF probability with increasing breakup flow potential, as reflected in the winter precipitation, and with increasing resistance to breakup, as reflected in the coldness of the winter.In the present context, the word "model" signifies an assumed version of Equation ( 2), following a statistical determination of the equation's coefficients.
It was noted in [22] that none of the examined models had strong predictive power and that "the best we can do is speak to what is more or less likely".This feature is visually illustrated in Figure 2: the probability estimates resulting from the adopted model for each one of the years of the examined record produced a wide overlap range (~0.2 to 0.7), containing five of the seven flood events of the instrumental record and six non-flood events.
It was noted in [22] that none of the examined models had strong predictive power and that "the best we can do is speak to what is more or less likely".This feature is visually illustrated in Figure 2: the probability estimates resulting from the adopted model for each one of the years of the examined record produced a wide overlap range (~ 0.2 to 0.7), containing five of the seven flood events of the instrumental record and six non-flood events.
With reference to "discarded" variables, [22] cautioned that: "It is important to note that just because a factor is not statistically significant or the model AICc is not competitive, does not necessarily mean that the factor is unimportant to generating large ice-jam floods.It could be that the factor's relationship to flood generation is different than assumed by the models we tested (structural uncertainty), or that the sample size is too small to precisely estimate the effect of the factor on ice-jam floods (parametric uncertainty).It could also be that the factors used as proxies for the physical drivers are not good (epistemic uncertainty)" [22].
This statement is particularly relevant to the dismissal of HF as an explanatory variable, contrary to what physical understanding and experience indicate for the Peace and other rivers [17,25,26].For instance, a simple counting of IJF and non-flood occurrences within successive, one-metre-wide ranges of HF indicates empirical probabilities that rapidly decrease as HF increases [27].The discrepancy between empirical evidence and logistic regression is explored in the following sections by assessing how uncertainty influences the degree of confidence that can be attached to statistical inferences regarding IJF occurrence in the lower Peace River.
Herein, as in previous related publications, HF is defined as the highest seven-day running average water level during ice-cover conditions in late fall and early winter.It can be determined from the records of hydrometric gauges located along the Peace River and operated by the Water Survey of Canada (WSC).Details of HF determination are provided in [26].The freezeup level varies along the river but positive correlations between successive pairs of gauges suggest that, in any one year, HF is generally low, moderate, or high throughout at least the lower 600 km of Peace River [28].In what follows, HF data are derived from a gauge located near Peace Point (Figure 1).This gauge is the closest flow-measuring gauge to the PAD, in essence capturing the entire river discharge that Figure 2. IJF occurrence (binary variable = 1) or non-occurrence (binary variable = 0) versus predicted probability using the regression model of [22].Note the large overlap range that includes 5 IJFs and 6 non-floods.
With reference to "discarded" variables, [22] cautioned that: "It is important to note that just because a factor is not statistically significant or the model AICc is not competitive, does not necessarily mean that the factor is unimportant to generating large ice-jam floods.It could be that the factor's relationship to flood generation is different than assumed by the models we tested (structural uncertainty), or that the sample size is too small to precisely estimate the effect of the factor on ice-jam floods (parametric uncertainty).It could also be that the factors used as proxies for the physical drivers are not good (epistemic uncertainty)" [22].
This statement is particularly relevant to the dismissal of HF as an explanatory variable, contrary to what physical understanding and experience indicate for the Peace and other rivers [17,25,26].For instance, a simple counting of IJF and non-flood occurrences within successive, one-metre-wide ranges of HF indicates empirical probabilities that rapidly decrease as HF increases [27].The discrepancy between empirical evidence and logistic regression is explored in the following sections by assessing how uncertainty influences the degree of confidence that can be attached to statistical inferences regarding IJF occurrence in the lower Peace River.
Herein, as in previous related publications, HF is defined as the highest seven-day running average water level during ice-cover conditions in late fall and early winter.It can be determined from the records of hydrometric gauges located along the Peace River and operated by the Water Survey of Canada (WSC).Details of HF determination are provided in [26].The freezeup level varies along the river but positive correlations between successive pairs of gauges suggest that, in any one year, HF is generally low, moderate, or high throughout at least the lower 600 km of Peace River [28].In what follows, HF data are derived from a gauge located near Peace Point (Figure 1).This gauge is the closest flow-measuring gauge to the PAD, in essence capturing the entire river discharge that eventually arrives at the delta reach.Moreover, this gauge has the longest historical record within the lower portion of the river.

Epistemic Uncertainty
In addition to freezeup levels, daily mean breakup flow and sporadic ice thickness data are available for the studied period of record (1962 to 2020, minus the reservoir-filling years 1968-1971).To characterize breakup flow using a single variable, one may simply select the maximum spring flow during the period when the gauge height is influenced by ice, which is flagged by the symbol B in the WSC records.To capture aspects of both the magnitude and duration of high flows, one may calculate the peak value of the running average flow over several days during the B period.Herein, the peak seven-day average (Q7) is used; the resulting conclusions are similar when using maximum breakup flow instead of Q7.
As illustrated in Figures 3 and 4, neither WP nor DDF are robust proxies for breakup flow and ice thickness, respectively.A common feature in both figures is the large scatter, which means that any given value of WP or DDF corresponds to a wide range of breakup flow or ice thickness, respectively.In the case of WP, the scatter in Figure 3 likely results from the interannual variability of the pre-breakup rates of melt and from a reliance on a single meteorological station to represent the snowpack of the entire basin.
within the lower portion of the river.

Epistemic Uncertainty
In addition to freezeup levels, daily mean breakup flow and sporadic ice thickness data are available for the studied period of record (1962 to 2020, minus the reservoir-filling years 1968-1971).To characterize breakup flow using a single variable, one may simply select the maximum spring flow during the period when the gauge height is influenced by ice, which is flagged by the symbol B in the WSC records.To capture aspects of both the magnitude and duration of high flows, one may calculate the peak value of the running average flow over several days during the B period.Herein, the peak seven-day average (Q7) is used; the resulting conclusions are similar when using maximum breakup flow instead of Q7.
As illustrated in Figures 3 and 4, neither WP nor DDF are robust proxies for breakup flow and ice thickness, respectively.A common feature in both figures is the large scatter, which means that any given value of WP or DDF corresponds to a wide range of breakup flow or ice thickness, respectively.In the case of WP, the scatter in Figure 3 likely results from the interannual variability of the pre-breakup rates of melt and from a reliance on a single meteorological station to represent the snowpack of the entire basin.With reference to Figure 4, the coldness of the winter, which is quantified by the accumulated degree-days of frost, or DDF, is occasionally considered a surrogate for river ice thickness.This assumption has merit if one is comparing multi-year average thickness values at different river sites: on average, we can reasonably expect to find thicker ice at river sites where the winters are generally colder than at sites where the winters are milder.However, this assumption breaks down when we compare ice thickness values from different years at the same site (as in Figure 4).Comparable thickness data, albeit less voluminous, from the records of the Fort Vermilion gauge exhibit a similar pattern when plotted against local values of DDF [28].This kind of scatter has been noticed by others in the past and has been explained in terms of snowfall conditions: "…the largest variations from year to year at a given site are associated with the thickness of the snow on the ice, and secondarily by variations in the coldness of the winter period of With reference to Figure 4, the coldness of the winter, which is quantified by the accumulated degree-days of frost, or DDF, is occasionally considered a surrogate for river ice thickness.This assumption has merit if one is comparing multi-year average thickness values at different river sites: on average, we can reasonably expect to find thicker ice at river sites where the winters are generally colder than at sites where the winters are milder.However, this assumption breaks down when we compare ice thickness values from different years at the same site (as in Figure 4).Comparable thickness data, albeit less voluminous, from the records of the Fort Vermilion gauge exhibit a similar pattern when plotted against local values of DDF [28].This kind of scatter has been noticed by others in the past and has been explained in terms of snowfall conditions: ". ..the largest variations from year to year at a given site are associated with the thickness of the snow on the ice, and secondarily by variations in the coldness of the winter period of thickening" [29].This quote is reproduced from a seminal paper written by the late G. D. Ashton, one of the world's foremost experts in river and lake ice growth.
An additional contributing factor to the epistemic uncertainty is the fact that WP and DDF are correlated to each other (colder winters generally produce more snow in the studied area), which contradicts one of the assumptions adopted in logistic regression.Lamontagne et al. [22] described the degree of correlation as "moderate" (Pearson's r ≈ 0.60) and pointed out that some of the information in the record is therefore redundant.Correlated variables could hinder the correct interpretation of the statistical results.For example, the apparent positive influence of the DDF on the chances of IJF occurrence could largely reflect the positive effect of a deep snowpack.
Water 2023, 15, x FOR PEER REVIEW 6 of 16 thickening" [29].This quote is reproduced from a seminal paper written by the late G. D. Ashton, one of the world's foremost experts in river and lake ice growth.An additional contributing factor to the epistemic uncertainty is the fact that WP and DDF are correlated to each other (colder winters generally produce more snow in the studied area), which contradicts one of the assumptions adopted in logistic regression.Lamontagne et al. [22] described the degree of correlation as "moderate" (Pearson's r ≈ 0.60) and pointed out that some of the information in the record is therefore redundant.Correlated variables could hinder the correct interpretation of the statistical results.For example, the apparent positive influence of the DDF on the chances of IJF occurrence could largely reflect the positive effect of a deep snowpack.
Logistic regressions on Q7 alone, HF alone, and end-of-winter ice thickness (hio) alone indicate statistical significance (at the 0.05 p-value threshold) for Q7, near-significance for HF, and a lack of significance for hio (Table 1).Simultaneous regressions of (Q7, HF) and (Q7, HF, hio) indicate significance for Q7 and HF but a lack of significance for hio.Regardless of the assumed model, the regressions indicate that Q7 and HF have positive and negative effects on the chances of an IJF, respectively.However, the model of hio alone points to a negative effect (b1 < 0), while the model that combines hio with Q7 and HF indicates a positive effect (b3 > 0).Logistic regressions on Q7 alone, HF alone, and end-of-winter ice thickness (h io ) alone indicate statistical significance (at the 0.05 p-value threshold) for Q7, near-significance for HF, and a lack of significance for h io (Table 1).Simultaneous regressions of (Q7, HF) and (Q7, HF, h io ) indicate significance for Q7 and HF but a lack of significance for h io .Regardless of the assumed model, the regressions indicate that Q7 and HF have positive and negative effects on the chances of an IJF, respectively.However, the model of h io alone points to a negative effect (b 1 < 0), while the model that combines h io with Q7 and HF indicates a positive effect (b 3 > 0).A fourth, potentially relevant, variable is the degree-days of thaw (S5) above a base temperature of −5 • C [30], accumulated up to the day when breakup is initiated at Peace Point.It is intended to index the degree of thermal decay that the ice cover is subjected to prior to being dislodged and mobilized.When regressed alone, S5 appears to be significant, though the associated constant does not (Table 1).It becomes insignificant when it is regressed along with other variables (p-value ~0.4), possibly because it correlates with both Q7 and HF (Pearson's r ≈ −0.32, 0.30, respectively).Given the lack of clear significance for h io and S5, the remainder of this paper focuses on the primary variables Q7 and HF, which are practically uncorrelated (Pearson's r ≈ 0.07).
To visualize how IJF probability varies as a function of breakup flow and freezeup level, the (Q7, HF) model in Table 1 is selected, and the indicated model coefficients are entered in Equations ( 1) and ( 2).The results are illustrated in Figure 5: for any given breakup flow, the probability P(IJF) decreases as HF increases; for any given HF, P(IJF) increases as Q7 increases.cient data); Q7 and HF may have been underestimated in 2018 and 2020, respectively; their p-values decrease slightly if either one is increased.
A fourth, potentially relevant, variable is the degree-days of thaw (S5) above a base temperature of −5 °C [30], accumulated up to the day when breakup is initiated at Peace Point.It is intended to index the degree of thermal decay that the ice cover is subjected to prior to being dislodged and mobilized.When regressed alone, S5 appears to be significant, though the associated constant does not (Table 1).It becomes insignificant when it is regressed along with other variables (p-value ~ 0.4), possibly because it correlates with both Q7 and HF (Pearson's r ≈ −0.32, 0.30, respectively).Given the lack of clear significance for hio and S5, the remainder of this paper focuses on the primary variables Q7 and HF, which are practically uncorrelated (Pearson's r ≈ 0.07).
To visualize how IJF probability varies as a function of breakup flow and freezeup level, the (Q7, HF) model in Table 1 is selected, and the indicated model coefficients are entered in Equations ( 1) and ( 2).The results are illustrated in Figure 5: for any given breakup flow, the probability P(IJF) decreases as HF increases; for any given HF, P(IJF) increases as Q7 increases.The preceding results remove the epistemic uncertainty associated with the proxy variables WP and DDF but two other types of uncertainty remain.Therefore, there is no guarantee that the probabilities indicated in Figure 5 are accurate; their main value is in qualitatively illustrating the effects of Q7 and HF on the chances of ice-jam flooding in any one year.

Structural Uncertainty
As intimated in Section 2, the linearity assumed in the logit formulation (Equation ( 2)) is most likely erroneous because complex physical processes are seldom quantified in terms of the simple linear sums of their controlling factors.Different powers and exponents, products, or logarithms of the respective variables are also possible.The "true" logit function is unknown, but helpful insights can be gleaned by varying the formulation of The preceding results remove the epistemic uncertainty associated with the proxy variables WP and DDF but two other types of uncertainty remain.Therefore, there is no guarantee that the probabilities indicated in Figure 5 are accurate; their main value is in qualitatively illustrating the effects of Q7 and HF on the chances of ice-jam flooding in any one year.

Structural Uncertainty
As intimated in Section 2, the linearity assumed in the logit formulation (Equation ( 2)) is most likely erroneous because complex physical processes are seldom quantified in terms of the simple linear sums of their controlling factors.Different powers and exponents, products, or logarithms of the respective variables are also possible.The "true" logit function is unknown, but helpful insights can be gleaned by varying the formulation of Equation ( 2) and repeating the regression.Noting that, at Peace Point, the resistance to ice cover mobilization depends directly on the difference 216-HF (per Section 6), the following expressions were tried: X = b 0 + b 1 ln(Q7) + b 2 ln(DF) (5) in which DF = 216-HF.The elevation 216 m (CGVD28) is slightly higher than the highest known HF of 215.9 m; therefore, DF is always positive.Equation ( 3) is equivalent to the conventional linear expression for the logit, while Equation (4) sums the arbitrarily selected powers of Q7 and DF; Equation (5) effectively involves the product (Q7) b1 (DF) b2 .Table 2 summarizes the results of logistic regressions with these models and shows that all p-values are lower or slightly higher than the conventional significance limit of 0.05.The same applies to various other models that were also tried, each involving different functions of Q7 and DF.It is concluded that these two variables are indeed statistically significant controls of IJF occurrence.The predictions generated by the three models shown in Table 2 are compared in Figure 6 in terms of their respective isolines for P(IJF) = 0.50.As can be expected from Equations ( 1) and ( 2), the isoline of the linear model (second row of Table 2) is straight.This is not the case for the nonlinear models, which generate concave or convex curves, depending on the assumed formulation.Isolines for other values of P(IJF) have similar shapes as the corresponding lines of Figure 6 and move up or down as P(IJF) increases above, or decreases below, 0.50; for the simple linear model, the isolines are parallel to each other.Five of the seven flood events that occurred during the period from 1962 to 2020 plot above the 0.5 isoline.2. The data points represent observed annual values of Q7 and HF at Peace Point for the period from 1962 to 2020, excluding the reservoir-filling years 1968-1971; the red circles mark IJFs.For 2018, the flow may have been higher than shown; for 2020, the freezeup level is only known to be within a range of elevations.
The predictive power of the Table 2 models was visually assessed using graphs similar to that of Figure 2. The best performance was exhibited by the logarithmic equation, which points to a product, rather than a sum, of the explanatory variables.As shown in  2. The data points represent observed annual values of Q7 and HF at Peace Point for the period from 1962 to 2020, excluding the reservoir-filling years 1968-1971; the red circles mark IJFs.For 2018, the flow may have been higher than shown; for 2020, the freezeup level is only known to be within a range of elevations.
The predictive power of the Table 2 models was visually assessed using graphs similar to that of Figure 2. The best performance was exhibited by the logarithmic equation, which points to a product, rather than a sum, of the explanatory variables.As shown in Figure 7, the overlap range now only involves two non-flood points.The rightmost of these points applies to 1967, a year for which the historical record mentions a flood, but without details as to the cause [20].Numerical modelling has indicated that a 1967 flood would require the presence of an ice jam in the lower Peace River [32].If an IJF did occur in 1967, the overlap range would be considerably reduced and contain a single non-flood occurrence.This potential reduction in the overlap range does not apply to Figure 2, where the rightmost non-flood data point derives from the year 1962.
Water 2023, 15, x FOR PEER REVIEW 1 Figure 7. IJF occurrence (binary variable = 1) or non-occurrence (binary variable = 0) versus pre probability using the third model of Table 2, which essentially involves the product of the p of the explanatory variables.
It is concluded that structural uncertainty does not influence assessments of th tistical significance of the explanatory variables.At the same time, it can lead to diff predictions of IJF probability, depending on the structure of the assumed model.findings apply to the present case study and may or may not be of general validity ever, they demonstrate that the exploration of different structural formulations f logit can enhance confidence and clarity in associated inferences.

Other Sources of Uncertainty
Recalling the quotation from [22] in Section 2, a third type of uncertainty derive samples that are too small for logistic regression (parametric uncertainty).This is addressed using the Firth [24] bias-reducing approach.Nevertheless, the events-pe able requirement suggests that each additional variable can generate further uncert potentially straining the limits of the Firth approach.Therefore, it is advisable to ke number of explanatory variables to a minimum when performing logistical regress small samples.
Independently of statistical theory, it has been determined by physics-based n ical modelling that a necessary, albeit not sufficient, condition for an IJF is that the flow at Peace Point should exceed 4000 m 3 /s [33].Consequently, P(IJF) is nil if Q7 than, or equal to, 4000 m 3 /s.However, all Q7-related regressions that have been perfo so far predict nonzero probabilities for all values of Q7.It is not known how to ac for this discrepancy in the logistic regression.A possible option could be to disca years in which Q7 was less than 4000 m 3 /s.However, this approach would severely r the sample size, exacerbating the parametric uncertainty.2, which essentially involves the product of the powers of the explanatory variables.
It is concluded that structural uncertainty does not influence assessments of the statistical significance of the explanatory variables.At the same time, it can lead to different predictions of IJF probability, depending on the structure of the assumed model.These findings apply to the present case study and may or may not be of general validity; however, they demonstrate that the exploration of different structural formulations for the logit can enhance confidence and clarity in associated inferences.

Other Sources of Uncertainty
Recalling the quotation from [22] in Section 2, a third type of uncertainty derives from samples that are too small for logistic regression (parametric uncertainty).This issue is addressed using the Firth [24] bias-reducing approach.Nevertheless, the events-pervariable requirement suggests that each additional variable can generate further uncertainty, potentially straining the limits of the Firth approach.Therefore, it is advisable to keep the number of explanatory variables to a minimum when performing logistical regression on small samples.Independently of statistical theory, it has been determined by physics-based numerical modelling that a necessary, albeit not sufficient, condition for an IJF is that the river flow at Peace Point should exceed 4000 m 3 /s [33].Consequently, P(IJF) is nil if Q7 is less than, or equal to, 4000 m 3 /s.However, all Q7-related regressions that have been performed so far predict nonzero probabilities for all values of Q7.It is not known how to account for this discrepancy in the logistic regression.A possible option could be to discard the years in which Q7 was less than 4000 m 3 /s.However, this approach would severely reduce the sample size, exacerbating the parametric uncertainty.

Physics of the Spring Breakup Process in the Lower Peace River
Downstream of the W.A.C. Bennett Dam (Figure 1; dam completed in 1968) and the Peace Canyon Dam (Figure 1; dam completed in 1980), the spring breakup initially proceeds in a thermal fashion, responding to the heat delivered by the increasingly longer and warmer open water reach upstream of the ice cover front.Local conditions typically involve low or moderate flow, increased solar radiation, and positive air temperatures.Once the front passes the town of Peace River (TPR), the breakup may continue to advance thermally or it may revert to a mechanical, and even highly dynamic, mode.In the former case, the rate of advance of the breakup front is relatively low (up to a few tens of km per day).While the front is advancing in this manner, the ice cover in the last ~100 km of Peace River decays thermally, and minimal, if any, jamming can occur (e.g., 2023 breakup).This is primarily caused by the low local river slope (~0.05 m/km; [13]): the tractive forces that are applied on the ice cover by the gradually increasing, runoff-generated flow are not enough to mechanically dislodge the winter ice cover, except for the improbable combinations of extreme flow and freezeup level [34].
A change to mechanical and even dynamic breakup conditions is occasionally facilitated by the arrival of sharp waves generated by ice-jam releases (also termed "javes") and accompanying ice runs from the Smoky River, a major tributary that joins the Peace just upstream of TPR [35][36][37][38].A sizeable ice jam may then form in Peace River, often near Sunny Valley (~km 490 in Figure 1); upon release of this jam, the resulting jave dislodges, mobilizes, and breaks the ice cover for a certain distance before the ensuing ice run is arrested, forming a new jam (javes are known to considerably amplify the tractive forces that are applied on the ice cover by the flow [34,39]).The process then repeats, and it can eventually deliver large volumes of ice rubble to the PAD reach of Peace River, where resulting major jams can generate ecologically beneficial overbank flooding (e.g., 2014 event).On occasion, javes may fail to dislodge a particularly resistant ice cover segment but resume their ice-breaking action farther downstream; this may then result in the simultaneous presence of more than one ice jam along the river.There can be variations to this general pattern, as evidenced by the unusual breakup of 2022 (see details in the Supplementary Materials).
The occurrence of an IJF depends on the overall celerity of breakup advance, CB.For events that become dynamic hundreds of km above the delta, CB is controlled by the number and "residence times" of the various jams that form along the way.High flows enhance the water levels caused by these jams and can then augment the power of subsequent javes to dislodge the winter ice cover, potentially enhancing CB.On the other hand, a large resistance of the ice cover to dislodgment and mobilization promotes low CB values.When the advance of the breakup front is slow, the volume of ice rubble that arrives at the delta reach (~last 50 km) of Peace River may be too small to produce a major jam, owing to prolonged ablation.A good example is the extreme-flow 2020 breakup, which became dynamic near Sunny Valley on April 22, but the front did not reach the mouth of Peace River until late on May 5 (travel time ~13 days).A short (~10 km) and brief (~1.5 days) jam that formed at a frequent jamming site in the Slave River did not raise water levels above the riverbanks [28].By contrast, the documented IJF events of 1996, 1997, and 2014 involved front travel times of ~3, 6, and 5 days, while the PAD jams lasted for ~8, 7, and 8 days, respectively [36,[40][41][42], as shown in Alberta Government ice observation reports and archives https://rivers.alberta.ca/,accessed on 28 October 2023].Similar information is not available for earlier IJF events; it is only known that the extreme IJF of 1974 lasted for 10-14 days [6] and that of 1963 lasted for about a week [18].
Apart from morphological and thermal factors, ice cover resistance to dislodgment is influenced by HF and h io .For the vicinity of the Peace Point gauge, the effects of these two variables are combined in a dimensionless quantity, R tf , which can be closely approximated using the following expression [26]: R tf ≈ h io 1/2 f(DF) (6) in which DF = 216-HF, while the function f decreases as DF increases (or as HF increases).
Because the range of h io at Peace Point is limited (~0.7 to 1.2 m), the variability of Rtf is dominated by HF.This is illustrated in Figure 8, where Rtf is seen to vary almost uniquely with HF; the relatively small scatter is caused by the variability of h io .The influence of resistance is evident in Table 3, which summarizes the breakup travel times for the four notable events mentioned in the previous paragraph.Despite the extreme 2020 breakup flow, the advance of the front was relatively slow and an IJF did not materialize.
Water 2023, 15, x FOR PEER REVIEW 12 in which DF = 216-HF, while the function f decreases as DF increases (or as HF increa Because the range of hio at Peace Point is limited (~0.7 to 1.2 m), the variability of R dominated by HF.This is illustrated in Figure 8, where Rtf is seen to vary almost uniq with HF; the relatively small scatter is caused by the variability of hio.The influen resistance is evident in Table 3, which summarizes the breakup travel times for the notable events mentioned in the previous paragraph.Despite the extreme 2020 brea flow, the advance of the front was relatively slow and an IJF did not materialize.While these findings demonstrate that the epistemically sound regression aligns the physics of the breakup process, it is emphasized that additional factors affect th currence of IJFs, including:

−
Weather conditions before and during the advance of the front influence the com tence of the downstream ice cover and its capacity to cause ice jams or add rubb them.Internal ice structure, air temperature, solar radiation, and melt rate o snow cover are relevant weather-related variables [43]; − The temporal gradient of the flow hydrograph as the breakup front advances a the river: increasing flow facilitates the release of ice jams that may form alon way, reducing residence times and enhancing celerity.The opposite will apply  While these findings demonstrate that the epistemically sound regression aligns with the physics of the breakup process, it is emphasized that additional factors affect the occurrence of IJFs, including: -Weather conditions before and during the advance of the front influence the competence of the downstream ice cover and its capacity to cause ice jams or add rubble to them.Internal ice structure, air temperature, solar radiation, and melt rate of the snow cover are relevant weather-related variables [43]; -The temporal gradient of the flow hydrograph as the breakup front advances along the river: increasing flow facilitates the release of ice jams that may form along the way, reducing residence times and enhancing celerity.The opposite will apply if the flow peak "overtakes" the front, which will then advance under a negative flow gradient.
Such complexities cannot be quantified at present and therefore add a random dimension to the breakup process, which the logistic regression attempts to quantify.For practical purposes, the plot of Figure 6 suggests that there is a good chance of an IJF when the applicable Q7 and HF define a data point that lies within the "northwest" portion of the graph.Moreover, the regression indicates (Figure 5) that P(IJF) will increase as the data point moves in a "northwesterly" direction (lower HF, higher Q7).

Discussion
The preceding analysis has highlighted some of the advantages and limitations of applying logistic regression to identify the factors that control the occurrence of IJFs in the lower Peace River and compute IJF probability as a function of these factors.The main advantage is that the statistical significance or non-significance of any one factor does not depend on the type of equation used to represent the logit once weak and correlated proxies are replaced by direct and uncorrelated variables.The main disadvantage is the lack of confidence in the predicted probabilities, which change with logit structure and are demonstrably in error when the breakup flow is less than a physics-derived threshold.Therefore, predicted IJF probabilities, regardless of logit structure, are only helpful as qualitative indications.Their use in quantitative mode to, for example, generate climaterelated projections of future IJF frequencies can lead to unrealistic results.
One of the proposed activities in the Wood Buffalo Action Plan is the development of a Strategic Flow Release (SFR) Protocol to increase the flooding potential of "promising" breakup events, i.e., events like those shown in Table 3, for which the early phases suggest that the occurrence of an IJF is possible [1,5].Of course, an enhanced release from the Peace Canyon Dam (or from the Site C Dam in the near future) must be timed in a way that will minimize the flooding potential of various riverside communities along the way (see also Home page | Site C (sitecproject.com, accessed on 29 October 2023)).Key factors in deciding whether to implement a flow release are the estimated time when the breakup front will be arriving at the PAD and the anticipated duration of a jam that may form in that area.Regardless of the various uncertainties, the logistic regression and current physical process understanding robustly point to breakup flow and freezeup level as important controls of IJF occurrence in the delta reach of Peace River.At the time when an SFR would be considered (typically in April), the freezeup level would be known but the breakup flow at Peace Point could only be projected from known flow conditions at upstream gauges like TPR and Fort Vermilion.Such projections could be improved by combining river ice models with snowmelt runoff models [44].
As noted in Section 1, the type of spring breakup flood considered in the logistic regression is the "magnitude 3" or "large" event, which can generate the extensive overland flooding that is necessary to replenish perched basins."Small" and "moderate" events can help recharge non-perched basins directly or via connecting channels, but their inclusion in the flood series examined by the logistic regression could be problematic owing to (a) the non-binary nature of the composite series and (b) gaps in the historical record of small and moderate events.For the 20th century, the record compiled in [20] mentions such events every few years or so, up to the year 1941.For the period from 1942 to 2006, the record contains no small, and only two moderate, floods.At the same time, the Fort Chipewyan oral history, which spanned the period 1950-1992, revealed that "Minor events were apparently common but could not be recalled in detail sufficient to record date or impact" [19].The frequency of reported non-large floods appears to recover after 2006.Small Peace River floods are mentioned [20] in 2007 and 2008, while intensive monitoring in the past 10 years or so indicates that non-large floods occurred in 2018, 2020, and 2022.These events have not yet been assigned a small or moderate label.In a personal communication (2020), M. Peterson (the lead author of the 1992 and 1995 historical flood reports [18,19]) indicated that there is uncertainty about the designations of flood magnitudes 1 and 2; therefore, some events assigned magnitude 1 might have been events of magnitude 2 and vice versa.However, there is considerable confidence that magnitude 3 events were indeed major ice-jam floods.

Summary and Conclusions
Single-station winter precipitation accumulated over the period from November to April and the coldness of the winter (indexed by degree-days of frost) are shown to be weak proxies for breakup flow and ice thickness, respectively.This exacerbates epistemic uncertainty, which is reduced by the use of direct variables in logistic regression.The latter approach demonstrates the statistical significance of the freezeup level and its negative effect on the probability of ice-jam flooding near the PAD during the spring breakup of Peace River.It is further pointed out that the simple linear sum of variables that is conventionally assumed to represent the structure of the logit is likely an erroneous quantification of complex physical processes (structural uncertainty).Trial regressions with different structural formulations of the logit indicate that the assessment of the significance of key explanatory variables does not change across different models.However, the calculated probability of an IJF does vary from one model to the next, and its value is therefore of qualitative utility.A comparison of probability isolines to actual occurrences on the breakup flow-freezeup level plane suggests that the correct, albeit unknown, logit likely involves a product, rather than a sum, of the functions of these two variables.The present regression results align with our physical understanding of the breakup process in the Peace River, where the PAD flood/no flood outcome depends on the rate of the advance of the breakup front during its progression to the PAD.This rate is enhanced by large breakup flow but is reduced by a high freezeup level; at the same time, it retains a degree of randomness, owing to additional, not-yet quantified influences related to thermally induced ice decay and flow hydrograph gradient during the advance of the breakup front.

Figure 1 .
Figure 1.Plan view of Peace River and Peace-Athabasca Delta (showing only the northern portion of Athabasca River).From [13], with changes.

Figure 1 .
Figure 1.Plan view of Peace River and Peace-Athabasca Delta (showing only the northern portion of Athabasca River).From [13], with changes.

Figure 4 .
Figure 4. Cross-sectional average ice thickness near the WSC Peace Point gauge, plotted versus accumulated degree-days of frost at Fort Chipewyan starting on October 1 of the year preceding the breakup year (Fort Chipewyan is located on the north shore of Lake Athabasca, ~90 km SE of Peace Point).The red circles mark IJF years.Adapted from[28].Data range: 1962-2020.Thickness data source: WSC archives for the Peace Point gauge.

Figure 4 .
Figure 4. Cross-sectional average ice thickness near the WSC Peace Point gauge, plotted versus accumulated degree-days of frost at Fort Chipewyan starting on October 1 of the year preceding the breakup year (Fort Chipewyan is located on the north shore of Lake Athabasca, ~90 km SE of Peace Point).The red circles mark IJF years.Adapted from[28].Data range: 1962-2020.Thickness data source: WSC archives for the Peace Point gauge.

Figure 5 .
Figure 5. Variation in the probability of an ice-jam flood (PIJF) with Peace Point freezeup level and breakup flow, as calculated by logistic regression on Q7 and HF.Logistic regression was performed using the Firth bias-reduced algorithm supplied by Wessa[31].

Figure 5 .
Figure 5. Variation in the probability of an ice-jam flood (PIJF) with Peace Point freezeup level and breakup flow, as calculated by logistic regression on Q7 and HF.Logistic regression was performed using the Firth bias-reduced algorithm supplied by Wessa[31].

Water 2023 , 16 Figure 6 .
Figure 6.Isolines for P(IJF) = 0.50, as generated by the three models in Table2.The data points represent observed annual values of Q7 and HF at Peace Point for the period from 1962 to 2020, excluding the reservoir-filling years 1968-1971; the red circles mark IJFs.For 2018, the flow may have been higher than shown; for 2020, the freezeup level is only known to be within a range of elevations.

2018 Figure 6 .
Figure 6.Isolines for P(IJF) = 0.50, as generated by the three models in Table2.The data points represent observed annual values of Q7 and HF at Peace Point for the period from 1962 to 2020, excluding the reservoir-filling years 1968-1971; the red circles mark IJFs.For 2018, the flow may have been higher than shown; for 2020, the freezeup level is only known to be within a range of elevations.

Figure 7 .
Figure 7. IJF occurrence (binary variable = 1) or non-occurrence (binary variable = 0) versus predicted probability using the third model of Table2, which essentially involves the product of the powers of the explanatory variables.

Figure 8 .
Figure 8. Variation in the resistance component Rtf with the freezeup level for the vicinity o Peace Point hydrometric gauge and for the years 1962 to 2020.Reservoir filling years are incl because the flow hydrograph does not influence the relationship between Rtf and the variable and hio.

Figure 8 .
Figure 8. Variation in the resistance component Rtf with the freezeup level for the vicinity of the Peace Point hydrometric gauge and for the years 1962 to 2020.Reservoir filling years are included because the flow hydrograph does not influence the relationship between Rtf and the variables HF and h io .

Table 1 .
Coefficients and p-values associated with logistic regression on different combinations of explanatory variables.Relevant hydrometric and climatic data are summarized in a supplement to this paper.

Table 1 .
Coefficients and p-values associated with logistic regression on different combinations of explanatory variables.Relevant hydrometric and climatic data are summarized in a supplement to this paper.

Model Coefficients (and p-Values)
excluded; the year 2006 is excluded when the regression involves Q7 (insufficient data); Q7 and HF may have been underestimated in 2018 and 2020, respectively; their p-values decrease slightly if either one is increased.

Table 2 .
Exploring structural uncertainty for models that utilize breakup flow and freezeup level as the explanatory variables.

Table 3 .
Breakup front travel times and relevant hydroclimatic controls for the years 1996, 2014, and 2020.Tbf = travel time of breakup front between ~Sunny Valley and ~mouth of Peace R Tjam = duration of PAD jam.

Table 3 .
Breakup front travel times and relevant hydroclimatic controls for the years 1996, 1997, 2014, and 2020.T bf = travel time of breakup front between ~Sunny Valley and ~mouth of Peace River; T jam = duration of PAD jam.