1. Introduction
There are growing concerns about the impacts of climate change on flooding. The recent 2022 Nigerian floods, which impacted several regions of the country, were attributed to climate change. In response, the Federal Government of Nigeria established a Presidential Committee on Flooding with a task to develop a comprehensive flood management plan for Nigeria within 90 days. Variations in climatic conditions result in significant fluctuations in runoff [1]. Although Nigeria is accustomed to seasonal flooding, the 2022 floods were the most severe in the country since the 2012 flood [2]. Incidences of flood disasters are increasing globally, with both regional and sub-regional flood intensities also on the rise. In 2022, persistent flood events reached a peak in Nigeria, affecting 19 states [2].
Flooding is caused by several factors, particularly runoff. Runoff refers to the flow of rainwater over the land surface and occurs when precipitation exceeds the infiltration capacity of the soil. The relationship between rainfall and runoff is the most critical aspect of any watershed. This relationship is influenced by various factors such as the characteristics of rainfall, runoff, and infiltration. Despite the considerable impact of these factors on the magnitude of runoff, a consistent and more accurate correlation between rainfall and runoff allows local authorities to make better and more confident decisions in a timely manner [3].
Runoff-producing floods cause great damage to communities worldwide and are believed to be one of the most expensive types of natural disasters in human and economic terms [4]. Hydraulic and flood control structures are designed to withstand damage from the natural forces expected to act on them [4]. In circumstances where considerable risks to life exist and significant economic losses from failed public infrastructure (e.g., dams) are envisaged, it is necessary to design hydraulic structures against the worst possible flood situation [5]. Slight rises in climate extremes are said to have the potential to cause substantial damage to existing flood control structures [6]. Regional climate models provide evidence of an increase in flood risk in several regions of the world [7]. The reports of the Intergovernmental Panel on Climate Change (IPCC) indicated increased intensity of rainfall events, particularly in the proportion of total rainfall that falls during heavy events, and this proportion is expected to continue to increase in the future. The IPCC also observed that greater rainfall intensity will likely escalate the risk of flooding globally [8].
Sectors that are susceptible to changes in climate conditions, such as the water resources sub-sector, are predicted to suffer greater impacts from extreme climate events. There is general acceptance that changes in climate conditions negatively affect water resources management systems [9]. In most advanced economies, climate change has forced decision-makers to revise already instituted flood protection plans and to consider the impacts of climate change in future flood control planning [4]. This includes risk assessments, which are required to evaluate the impacts on public infrastructure and flood protection plans [10]. In Africa, varying rainfall patterns, rising temperatures, rising sea levels, and more extreme weather are increasing threats to human health and well-being, water and food security, and socioeconomic development [11]. The effect of climate change on the African continent is rising and is hardest on the most susceptible communities, leading to food insecurity and population displacement due to frequent flooding and stress on water resources [11]. Nigeria, including the Cross River Basin, is no exception to these negative impacts. Climate change is prevalent, rapid, and escalating [12]. The changing climate in Nigeria is mostly evidenced by extreme events like droughts in far northern Nigeria and floods in some parts of southern Nigeria, predominantly in the coastal regions.
In view of the observed impacts of climate change and the predicted future effects, especially on flooding, it is believed that Africa is likely to be more afflicted than other parts of the world, and Nigeria is one of the nations on the continent that suffer such effects [13]. In contemporary Nigeria, there are strong indications of continual change in the climate. These include variable precipitation, rise in temperature, increase in sea level, intense drought, increasing desertification, land degradation, flooding, depletion of freshwater resources, loss of biodiversity, and, above all, more frequent extreme weather events [14,15].
In Nigeria today, variability in climate conditions has resulted in an increase in the intensity and duration of precipitation, which has led to large runoffs and resultant flooding in many areas [16]. Changes in precipitation in Nigeria are expected to continue to increase. It is predicted that in southern Nigeria, rainfall will likely increase with an attendant rise in sea levels, resulting in the flooding and consequent submergence of low-altitude lands [15,17]. Droughts of different magnitudes are also being experienced in Nigeria and are most likely to continue, especially in northern Nigeria, because of decreasing rainfall and increasing surface temperature in the zone [18,19]. As a consequence of the negative impact of climate change in Nigeria, Lake Chad and other lakes in Nigeria are drying up at unanticipated rates and are currently at risk of extinction [14,20]. Since the 1980s, there has been a significant rise in the average annual temperature in Nigeria [16], and the forecast for future years in all ecological and hydrologic zones in the country is a considerable rise in temperature [17].
There are expectations that climate change will amplify the global hydrological cycle and that this intensification will likely bring key changes to regional water resources [21]. Precipitation patterns and changes in precipitation intensity, frequency, and total amount are correlated with the extent and timing of runoff and the intensity of floods and droughts. However, at present, specific sub-regional effects are uncertain [22].
An increase in global temperature will give rise to a warmer climate, with a resultant increase in the probability of flood risk [23]. So far, only a few studies (e.g., [24]) have predicted variations in floods on a global scale, and multiple climate models were neither applied nor relied upon over the years. On a global scale, some researchers (e.g., [25]) have begun looking at the potential risk and vulnerability of people exposed to flooding, but further attempts at relating climate parameters, warming, and flooding are still in progress. Yukiko et al. [23] analysed 11 climate models to predict global flood risk in this century. In their study, river discharge in flood-prone areas was estimated using a global river-routing model. The results showed a significant increase in the frequency of flooding in eastern Africa, the northern half of the Andes, Peninsular India, and Southeast Asia. The study, however, projected a reduction in the frequency of floods in some parts of the world [23]. The most significant sectoral concern from climate change is widely acknowledged to be the increased danger of flooding and other extreme weather occurrences. This has created a public debate on the observed rise in rainfall intensities and the apparent increase in the frequency of extreme events [26]. Odekunle [27] and Ologunorisa [28,29] concurred that flooding had increased owing to climate change and identified three flood risk zones from their assessment of flood vulnerability in the Niger Delta region of Nigeria: high, moderate, and low flood risk areas.
Ekwueme and Agunwamba [
30] investigated the influence of meteorological variables on runoff in a tropical watershed. They developed a climate-flood model for the Adada River at Enugu State, Nigeria, where runoff was linked with climatic conditions. It was concluded that the climate had an effect on runoff in the catchment area.
The most up-to-date information on the variability and trends in climatic variables like rainfall, runoff/flooding, drought, and temperature is vital for ensuring proper planning and sustainable management of a sub-region’s water resources, particularly with regard to climate change, economic growth, and population increase [
31]. By regularly analysing hydro-meteorological data, strategies for effective water management, agricultural productivity, and ideal environmental protection can be developed. This is the main motivation for this study.
There has been inadequate basin-based information relating runoff to major climatic parameters and the attendant effects on flooding at the sub-regional level of the Cross River Basin. This study intends to fill this gap.
Apart from the MLR modelling technique, other models have been used to simulate and analyse hydrological events. These include models such as artificial neural networks (ANNs) (e.g., [
32,
33]), the Soil and Water Assessment Tool (SWAT) (e.g., [
33,
34,
35,
36]), Hydrologic Engineering Centre–Hydrologic Modelling System (HEC–HMS) (e.g., [
32]), and long short-term memory (LSTM) (e.g., [
34,
35]). ANN models can be used for a wide range of hydrology-based applications, including the modelling of rainfall–runoff relationships, flooding, streamflow, groundwater and water quality, erosion, and sediment transport. They are effective where there are substantial nonlinear relationships but are data-driven and based on machine learning. This implies a requirement for an extensive set of data, which can be a challenge to obtain in hydrology, where high-quality data are not readily available. SWAT is mainly designed to model the effect of land management practices on soil and water resources, such as sediment transport, water quality, and land contamination. It can also be used to model the impact of weather patterns on soil and water conditions. It requires a wide range of detailed input data pertaining to, for instance, soil and climatic conditions, and so is not usually ideal where data are limited. MLR models can also be used to simulate a wide range of hydrological processes, including rainfall–runoff relationships, flooding, flood risk assessment, streamflow, groundwater flow, water quality, erosion and sediment transport, and evapotranspiration. MLR is less data-intensive and computationally less demanding than the other models, but the rationale for applying this statistical technique in this study lies in its interpretability and its ability to determine clear relationships between each dependent variable and the independent variables. The beta coefficients are useful because they reveal how each dependent variable is affected within the system. Furthermore, unlike other techniques such as SWAT, MLR models do not rely on the physical characteristics of the catchment and are therefore less cumbersome to implement without an in-depth understanding of the geographical features of the watershed. LSTM is a recurrent neural network (RNN) technique that is architecturally similar to ANN but applicable to sequential data such as time series.
These are analogous in form to climatic data; however, as with ANN, RNN-based models are data-driven and data-intensive and, thus, dependent on big data for accurate predictions and analyses.
This study is based on the presumption that variation in runoff is predominantly based on climatic changes, which overlooks the possible contributions from land use, soil conditions, and human activities. The future direction of research should incorporate these factors to ascertain the extent of their individual and overall influence on runoff events within Cross River Basin.
As a statistical approach, the multiple linear regression (MLR) model is proven to be very reliable for climate-runoff modelling [
37,
38]. In many studies, emphasis has been on the temporal relationship between runoff and the climate. Arora et al. [
39] developed a multiple linear regression model for hourly runoff in the Waingang River sub-basin, China. Their findings showed that as the lead time increased across all three models developed, the accuracy of the models decreased, except for the four-hourly forecast. Also, when Singh et al. [
40] applied an MLR model in a highly glacierized Himalayan basin to examine the correlation between runoff and meteorological variables, the results showed that, between June and September, discharge was more strongly correlated with precipitation in July (r = 0.402), followed by August (r = 0.350), with the weakest correlation observed in June (r = 0.041).
The aim of this research is to examine the impact of climate change on runoff and subsequent flooding at the basin scale, specifically in the Cross River Basin catchment. A multiple regression approach is used to model the influence of climatic parameters on runoff. The term runoff, herein, is used interchangeably with streamflow because of the direct relationship between the two parameters. The same usage has been adopted in a number of other studies (e.g., [30,32,41]). There are two types of runoff: surface runoff, which is the unabsorbed flow that moves toward streams, and channel runoff, which is streamflow. Surface runoff is a major component of channel runoff (streamflow), and there is often a direct relationship between the two processes. In the strict sense, runoff in this study refers to channel runoff.
3. Results and Discussion
3.1. Climate-Flood Model for Calabar
The estimates of regression coefficients were substituted in Equation (20) to obtain the fitted climate-flood model of the Calabar River, given as
where R is rainfall, S is sunshine hours, T is temperature, E is evaporation, RH is relative humidity, SR is solar radiation, T_s is soil temperature, and W is wind speed.
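As an illustration of how a model of this form can be fitted, the following is a minimal sketch of ordinary least squares regression of runoff on the eight climatic predictors named above. The data are synthetic, the coefficient values are arbitrary, and the helper name `fit_mlr` is hypothetical; none of these come from the study, and the sketch is not the authors' implementation.

```python
import numpy as np

# Hypothetical synthetic data: 24 annual records and 8 climatic predictors
# in the order R, S, T, E, RH, SR, Ts, W. These are NOT the study's data.
rng = np.random.default_rng(0)
names = ["R", "S", "T", "E", "RH", "SR", "Ts", "W"]
n_years = 24
X = rng.normal(size=(n_years, len(names)))

# Arbitrary illustrative coefficients (assumption), with rainfall dominant.
true_coef = np.array([1.0, -0.1, 0.05, -0.2, -0.05, -0.1, -0.02, 0.03])
runoff = 0.5 + X @ true_coef + rng.normal(scale=0.05, size=n_years)


def fit_mlr(X, y):
    """Ordinary least squares with an intercept.

    Returns (intercept, coefficients), one coefficient per column of X.
    """
    design = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta[0], beta[1:]


intercept, coef = fit_mlr(X, runoff)
for name, b in zip(names, coef):
    print(f"{name:>2}: {b:+.3f}")
```

Each fitted coefficient estimates the change in runoff per unit change in one predictor with the others held fixed, which is the sense in which the regression coefficients are interpreted in the text.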
The parameter estimates of the model specified for the Calabar River are given in Table 2, along with their standard errors and 95% confidence intervals. Based on the information in the table and Equation (25), the regression coefficient for annual rainfall depth has the highest estimated value, 1.000. The model therefore indicates that, among the climatic parameters, only annual rainfall has a significant impact on runoff at the 95% confidence level in the Calabar River Basin, and that runoff increases as rainfall increases. Additionally, the coefficients for evaporation (−0.001), sunshine hours (−0.001), relative humidity (−0.004), solar radiation (−0.002), and soil temperature (−0.001) are negative, so decreases in these variables are associated with increases in runoff (discharge) in the Calabar River.
Table 3 explains the adequacy of the model for the Calabar station, with an R² value of 1.000, meaning the model was able to explain 100% of the variation in the dependent variable.
The study of the Cross River Basin also revealed that climatic variables accounted for 100% of the yearly changes in runoff. This shows that the climate has a major impact on runoff and flooding in Calabar and that changes in climate can lead to changes in runoff and flooding.
The ANOVA table is used to assess the model adequacy of a multiple regression analysis. In this context, it helps to determine whether the regression model is a good fit for the data and whether the independent variables collectively explain a significant amount of the variance in the dependent variable. The ANOVA table is explained as follows:
Source: This column indicates the sources of variance in the analysis. An ANOVA table for regression typically has three sources:
- (a) Model: This source represents the variance explained by the regression model, which includes all the independent variables. In Table 3, the model sum of squares is 3782.633.
- (b) Residual: This source represents the unexplained or error variance. It is the variance that the model does not account for, and it includes random noise in the data. In Table 3, the residual sum of squares is 0.000127.
- (c) Uncorrected Total: This is the total variance in the dependent variable before any modeling. It is the sum of the model and residual sources, which is 3782.633 in Table 3.
Sum of Squares (SS): This is a measure of the variation in the dependent variable attributed to each source. It is crucial because it shows how much of the trend in the dependent variable is captured by the independent variables.
Degrees of Freedom (DF): DF indicates the number of degrees of freedom associated with each source. In regression analysis with a mean-corrected total, the DF for the model is the number of estimated parameters (including the intercept) minus 1; with an uncorrected total, as in Table 3, it equals the number of estimated parameters. The DF for the residual is the total number of data points minus the number of parameters in the model. In Table 3, DF for the model is 10, DF for the residuals is 14, and DF for the uncorrected total is 24.
Mean Squares (MS): Mean squares are obtained by dividing the sum of squares by the degrees of freedom. They represent the variance explained or unexplained per degree of freedom. In Table 3, MS is 378.263 (3782.633/10) for the model and 9.07143 × 10⁻⁶ (0.000127/14) for the residuals.
R² (R-squared): R², often called the coefficient of determination, represents the proportion of variance in the dependent variable that is explained by the independent variables in the model. In Table 3, R² is 1.0, which means that approximately 100% of the variance in the dependent variable is explained by the independent variables.
Interpretation:
The high R² value (1.0) suggests that the regression model is a very good fit for the data. It explains almost 100% of the variance in the dependent variable.
The model’s sum of squares (3782.633) is much larger than the residual sum of squares (0.000127), indicating that the model explains a significant amount of the variance in the dependent variable.
The F-statistic can be calculated by dividing the model MS by the residual MS. An F-statistic much larger than 1 suggests that the model is a good fit for the data. A hypothesis test using the F-statistic is typically performed to determine whether the model is statistically significant.
In summary, this ANOVA table provides evidence that the regression model is a good fit for the data and that the independent variables have a significant impact on explaining the variance in the dependent variable.
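The arithmetic described above can be checked with a short sketch. It uses only the sums of squares and degrees of freedom quoted from Table 3; the variable names are chosen here for illustration, and the F-statistic itself is not reported in the text, so the value computed below is derived rather than quoted.

```python
# Values quoted in the text from Table 3 (Calabar station).
ss_model, df_model = 3782.633, 10
ss_resid, df_resid = 0.000127, 14

ms_model = ss_model / df_model   # mean square for the model
ms_resid = ss_resid / df_resid   # mean square for the residual
f_stat = ms_model / ms_resid     # F-statistic = MS_model / MS_residual

print(f"MS model    = {ms_model:.4f}")
print(f"MS residual = {ms_resid:.5e}")
print(f"F           = {f_stat:.3e}")
```

With these inputs the F-statistic is on the order of 4 × 10⁷. A p-value would come from comparing it with an F distribution with 10 and 14 degrees of freedom, which is left out here to keep the sketch free of external dependencies.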
The R-squared (R²) value, which represents the proportion of variance in the dependent variable explained by the independent variables in a regression model, is computed using the values from the ANOVA table (Table 3). From Table 3, R² is given as 1.000, which indicates that approximately 100% of the variance in the dependent variable is explained by the independent variables in the model.
R² can be calculated manually as the ratio of the model sum of squares (SS) to the uncorrected total sum of squares (SS).
From the ANOVA table, the model SS is 3782.633 and the uncorrected total SS is 3782.633, so R² is 3782.633/3782.633 ≈ 1.000. This matches the value in the ANOVA table and means that approximately 100% of the variance in the dependent variable is explained by the independent variables in the multiple regression model.
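For reference, the ratio described above can be written alongside the more common mean-corrected form of the coefficient of determination. The symbols y_i (observed runoff) and ȳ (its mean) are notation introduced here and do not appear in the source; the corrected form is shown only for comparison, since the corrected total sum of squares is not quoted in the text.

```latex
R^2_{\text{uncorrected}}
  = \frac{SS_{\text{model}}}{SS_{\text{uncorrected total}}}
  = \frac{3782.633}{3782.633} \approx 1.000,
\qquad
R^2_{\text{corrected}}
  = 1 - \frac{SS_{\text{residual}}}{\sum_{i}\left(y_i - \bar{y}\right)^2}.
```

The two quantities can differ because the uncorrected total also includes the contribution of the mean of the dependent variable.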
Values of the coefficient of determination R² are given at the bottom of each ANOVA table. These were determined at a 5% significance level (p = 0.05). The p-value for all tests is lower than 0.05, suggesting that the observed results, including the F-statistic, are unlikely to have occurred by chance.
3.2. Climate-Flood Model for Uyo
The estimates of regression coefficients were substituted into Equation (20) to obtain the fitted climate-flood model of the Uyo River as
The strongest predictor of runoff variability in the Uyo region, as revealed by the climate-flood model (Equation (27)), is rainfall, with a coefficient of 0.995. The parameter estimates of the model specified for the Uyo River, presented in Table 4, include their standard errors and 95% confidence intervals. This result implies that as rainfall increases, runoff also increases. Furthermore, the table shows that as temperature, evaporation, and solar radiation decrease, there is a significant increase in runoff. For example, the evaporation coefficient of −0.011 indicates that a decrease in evaporation predicts an increase in runoff in Uyo.
The model’s ability to explain the variation in the dependent variable for the Uyo River is shown in Table 5, with an R² value of 1.000. This means that the model explained 100% of the variation.
Furthermore, the results of the study of the Uyo station showed that the climatic variables accounted for 100% of the annual variation in runoff. This indicates that a change in the climate conditions influences runoff and flooding in Uyo.
3.3. Climate-Flood Model for Ogoja
The estimates of regression coefficients were substituted into Equation (20) to obtain the fitted climate-flood model of the Ogoja River as
According to the climate-flood model for Ogoja (Equation (28) and Table 6), rainfall depth, with a coefficient of 0.896, is the most significant climatic variable in predicting the variability of runoff for Ogoja. The parameter estimates for the model specified for the Ogoja station, presented in Table 6, were calculated with standard errors and 95% confidence intervals. The most significant predictor of increased flooding in Ogoja is the increase in rainfall depth. According to the analysis, decreases in temperature (coefficient −1.571) and solar radiation (coefficient −0.530) increase runoff at this station.
As shown in Table 7, the results of the study of the Ogoja station revealed that climatic variables accounted for 98.8% of the yearly changes in runoff, while non-climatic variables accounted for only 1.2%. This shows that the climate has a significant impact on runoff and flooding in Ogoja.
3.4. Climate-Flood Model for Eket
The estimates of regression coefficients were substituted into Equation (20) to obtain the fitted climate-flood model of the Eket River as
The climate-flood model for Eket (Equation (29) and Table 8) indicates that rainfall, with a coefficient of 0.988, is the dominant factor in determining runoff variability. The parameter estimates of the Eket station model, with standard errors and 95% confidence intervals, demonstrate that an increase in rainfall leads to a corresponding increase in flooding. The study also found that decreases in evaporation (coefficient −0.041) and temperature (coefficient −0.308) have a significant effect on increasing runoff in Eket. Additionally, relative humidity and solar radiation were found to play a significant role in influencing runoff in the Eket region.
The results of the study of the Eket station (Table 9) indicate that the climatic variables accounted for 99.9% of the annual variation in runoff, whereas the non-climatic variables accounted for 0.1%. This demonstrates that the climate influences runoff and flooding in Eket.
In summary, the model output in this study suggests that changes in climate have an impact on runoff and flooding. This aligns with the results of Babatola and Akinnubi [41], where 54.4% of the annual runoff variation in the River Niger was attributed to climatic factors. It is also in line with the previous study of Ekwueme and Agunwamba [30] on the influence of meteorological variables on runoff in a tropical watershed, where climatic variables accounted for 66.1% of the runoff variation in the Adada River.
Among the analyzed models, precipitation (rainfall) has the largest effect on flow/runoff in the catchments. For instance, for Calabar station, the regression coefficient for annual rainfall depth has the highest estimated value, 1.000. The model therefore indicates that, among the climatic parameters, only annual rainfall has a significant impact on runoff at the 95% confidence level at the Calabar River station, and that runoff increases as rainfall increases.
From the climate-flood models, a rise in temperature reduces runoff at Uyo station, with a coefficient of −0.069. Increases in temperature also reduce runoff/flooding in Eket (coefficient −0.308) and in Ogoja (coefficient −1.571). Only in Calabar is the reverse the case, with a coefficient of 0.002. However, this value is trivial and is considered to have a negligible impact on runoff, especially in comparison to the effect of rainfall in the Calabar catchment area. The implication is that changes in rainfall, believed to be governed by climate change, are the major determinant of flooding in the entire study area.
3.5. Multicollinearity Test for Model Adequacy
Multicollinearity is a statistical phenomenon that occurs when two or more independent variables in a regression model are strongly correlated [69]; that is, the predictor variables have a strong linear relationship with one another. When collinearity is present, it may be difficult to ascertain precisely the unique effect of each independent variable on the dependent variable. Coefficient estimates can become unstable and unreliable, which makes it challenging to interpret the findings and derive meaningful inferences from the model. For regression models to be reliable and valid, multicollinearity must therefore be identified and resolved.
The strength of correlation between the independent variables is assessed using variance inflation factors (VIFs). The VIF of a variable is obtained by regressing that variable against every other independent variable. It measures how much collinearity has inflated the variance (or standard error) of the estimated regression coefficient. The variance inflation factor is expressed as VIF = 1/(1 − R²), where R² is the coefficient of determination of that regression.
Examining the range of R² (0 ≤ R² ≤ 1), R² = 0 implies that the model is unable to explain the variance in the dependent variable, while R² = 1 suggests the opposite; that is, the model explains all the variance in the dependent variable and there is goodness of fit.
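The VIF calculation described above can be sketched as follows. This is a minimal illustration on synthetic data, assuming the standard definition VIF = 1/(1 − R²) with R² taken from regressing each predictor on the others. The function name `variance_inflation_factors` and the demonstration data are hypothetical, and the threshold of 5 is the one used in this section.

```python
import numpy as np


def variance_inflation_factors(X):
    """Return VIF_j = 1 / (1 - R_j^2) for each column j of X.

    R_j^2 is the coefficient of determination from regressing column j
    on all the other columns (with an intercept).
    """
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    vifs = np.empty(p)
    for j in range(p):
        y = X[:, j]
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        beta, *_ = np.linalg.lstsq(others, y, rcond=None)
        resid = y - others @ beta
        r2 = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
        vifs[j] = 1.0 / (1.0 - r2)
    return vifs


# Hypothetical data: columns 0 and 1 are nearly collinear, column 2 is not.
rng = np.random.default_rng(1)
a = rng.normal(size=200)
X_demo = np.column_stack(
    [a, a + 0.1 * rng.normal(size=200), rng.normal(size=200)]
)
vif_demo = variance_inflation_factors(X_demo)
flagged = [j for j, v in enumerate(vif_demo) if v > 5]  # VIF > 5 rule
print(vif_demo, flagged)
```

In this synthetic case the two nearly collinear columns are flagged and the independent column is not, which mirrors the way rainfall and maximum temperature are singled out for Ogoja station.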
3.5.1. Multicollinearity Test for Uyo Station
Table 10 presents the multiple linear regression results for Uyo station, with the test of multicollinearity carried out using all the variables. The VIF values (Table 10) are all less than 5, which shows that there is no multicollinearity problem among the variables at Uyo station.
3.5.2. Multicollinearity Test for Calabar Station
Table 11 presents the multiple linear regression results for Calabar station, with the test of multicollinearity carried out using all the variables. The VIF values (Table 11) are all less than 5, which shows that there is no multicollinearity problem among the variables at Calabar station.
3.5.3. Multicollinearity Test for Eket Station
Table 12 presents the multiple linear regression results for Eket station, with the test of multicollinearity carried out using all the variables. The VIF values (Table 12) are all less than 5, which shows that there is no multicollinearity problem among the variables at Eket station.
3.5.4. Multicollinearity Test for Ogoja
Table 13 presents the multicollinearity test for the linear regression model for Ogoja station. According to the results, a unit increase in rainfall produces an equal increase in runoff: an increase in rainfall of 0.0030 increases runoff and flooding by 0.0030 in Ogoja, given that R² = 1.00 in Ogoja. Rainfall and maximum temperature, with VIFs (variance inflation factors) greater than 5, are the parameters that can be eliminated in Ogoja because of multicollinearity. Within the entire Cross River Basin catchment, the multicollinearity problem arises only at Ogoja station, although it does not invalidate the model.
To overcome the effect of multicollinearity at Ogoja station, a multiple linear regression analysis was carried out using composite variables (variables that did not show multicollinearity, e.g., minimum temperature, solar radiation, sunshine hours, relative humidity, wind speed, soil temperature, and evaporation) (Table 14 and Table 15). This is similar to the approach adopted by Chinelo and Ozlem [70], who investigated the effect of multicollinearity on climate variables. In their work, multiple linear regression analysis revealed strong multicollinearity among the variables used, with VIF > 5. To overcome this effect, a principal component regression analysis was introduced. The principal component regression analysis of the retained variables yielded a preferable result, with a VIF value of 1.039, which was much lower than when all the variables were applied.
Table 14 presents the multiple linear regression analysis results for Ogoja station, excluding variables with VIF > 5. In this analysis, rainfall and maximum temperature are excluded to account for multicollinearity. With rainfall and maximum temperature eliminated, R² reduces to 0.597, which implies a reduction in model adequacy.
In Figure 3a, standardized coefficients were used to plot the feature importance for Calabar station using all the variables. The VIF values indicate no multicollinearity. In Figure 3b, standardized coefficients were used to plot the feature importance for Eket station using all the variables. There is no multicollinearity in Eket station based on the VIF values. In Figure 3c, standardized coefficients were used to plot the feature importance for Uyo station using all the variables. There is no multicollinearity in Uyo station based on the VIF values. Likewise, in Figure 3d, standardized coefficients were used to plot the feature importance for Ogoja station using all the variables. The no-multicollinearity assumption failed at Ogoja station based on the VIFs.
For Ogoja, there is a considerable change in the feature importance when all variables with VIF > 5.0 are excluded (Figure 4), indicating an increase in dominance of variables that were previously less consequential.
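One common way to obtain standardized coefficients for a feature-importance plot is to z-score the predictors and the response before fitting. The sketch below shows that approach on synthetic data. It is an assumption about the general technique, not a description of how Figures 3 and 4 were produced, and the function name `standardized_coefficients` is hypothetical.

```python
import numpy as np


def standardized_coefficients(X, y):
    """Fit OLS on z-scored predictors and response.

    The resulting slopes are unit-free and can be compared across
    predictors measured on different scales.
    """
    Xz = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    yz = (y - y.mean()) / y.std(ddof=1)
    # After centring both sides, the OLS intercept is zero, so it is omitted.
    beta, *_ = np.linalg.lstsq(Xz, yz, rcond=None)
    return beta


# Hypothetical data (not the study's records): three predictors on very
# different scales, the third having no effect on the response.
rng = np.random.default_rng(2)
X = rng.normal(size=(24, 3)) * np.array([100.0, 1.0, 5.0])
y = 0.9 * X[:, 0] / 100.0 - 0.3 * X[:, 1] + rng.normal(scale=0.1, size=24)

betas = standardized_coefficients(X, y)
ranking = np.argsort(-np.abs(betas))  # indices from most to least important
print(betas, ranking)
```

Ranking by absolute standardized coefficient is what makes a change in dominance visible when some predictors are removed and the model is refitted.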
3.6. Model Validation
As mentioned in Section 3.1, the coefficient of determination, R², is a measure of the portion of variance in the runoff that is explained by the predictor (climatic) variables incorporated in the models. High R² values are recorded based on the ANOVAs, which indicates the extent of the collective influence of the climatic variables on streamflow (channel runoff). The set of high R² values also shows that the flood models are a good fit for the recorded data, with minimal influence from other factors outside the boundaries of the model. Large magnitudes of R² and R for calibrated and validated flood models indicate high precision in their ability to reproduce and/or predict discharge values. High values of R² and R, above 0.900, which denote the accuracy of flood models, are realizable. These have been achieved by others while constructing other flood models, as shown, for instance, by Addison-Atkinson et al. [71] and Zarei et al. [72]. The former evaluated the performance of flood models by simulating surface flood events, including surface runoff, and their interactions with underground sewer flow systems, while the latter compared the capabilities of the Hydrologic Engineering Center’s Hydrologic Modelling System (HEC–HMS) used in conjunction with machine learning models. On the one hand, HEC–HMS was combined with a long short-term memory (LSTM) network model (HEC–HMS–LSTM), while on the other hand, it was linked with a gated recurrent unit (GRU) model (HEC–HMS–GRU). Both hybrid models were used to examine the effect of climate change on flood events.
Whereas R² shows the percentage of the overall variance described by the regression model (including how much of the linear variation in the observed values is explained by the variation in the predicted values) [73], model validation is a vital stage in assessing model performance on unseen data to ensure reliability and accuracy. One of the most popular methods for assessing the accuracy of model predictions is the scatter plot of predicted versus observed (or vice versa) values. The slope and the intercept of the line fitted to these data provide indices for evaluating, and increasing confidence in, model performance: they reflect the consistency and the bias of the model, respectively. An alternative approach is to determine the coefficient of correlation (R), which indicates the strength and direction of the linear relationship between the predicted and observed values. R ranges from −1 to +1. Values of |R| closer to 1 suggest a strong correlation between the two variables. Positive values imply a direct linear relationship, and negative values imply an inverse one. The concept of R is adopted herein to test the validity of the models developed in this study.
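The slope, intercept, and R described above can be computed directly from paired observed and predicted values. The following is a minimal sketch, not the procedure used to produce Table 15: the function name `fit_line_and_r` and the discharge values are hypothetical and serve only to illustrate the calculation.

```python
import math


def fit_line_and_r(observed, predicted):
    """Fit predicted = slope * observed + intercept and return (slope, intercept, r).

    Hypothetical helper for illustration; r is the Pearson correlation coefficient.
    """
    n = len(observed)
    if n != len(predicted) or n < 2:
        raise ValueError("need two equal-length series with at least two points")
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    # Sums of squares and cross-products about the means
    s_oo = sum((o - mean_o) ** 2 for o in observed)
    s_pp = sum((p - mean_p) ** 2 for p in predicted)
    s_op = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    slope = s_op / s_oo
    intercept = mean_p - slope * mean_o
    r = s_op / math.sqrt(s_oo * s_pp)
    return slope, intercept, r


# Hypothetical annual discharge values (NOT taken from Table 15)
observed = [100.0, 120.0, 90.0, 150.0, 130.0, 110.0]
predicted = [102.0, 118.0, 91.0, 149.0, 131.0, 109.0]

slope, intercept, r = fit_line_and_r(observed, predicted)
print(f"slope={slope:.3f}, intercept={intercept:.3f}, R={r:.3f}")
```

A slope near 1 and an intercept near 0 would indicate consistent, unbiased predictions, while R summarizes how tightly the points cluster around the fitted line.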
In total, 80% of the data (1992–2015) were used to train the models and were therefore excluded from the main validation process. Model validation was carried out using the remaining 20% of the dataset (2016–2021). The four models were tested by comparing the observed discharge with the discharge predicted for the same period (2016–2021). Coefficients of correlation were calculated at the 95% confidence level (p = 0.05) (Table 15 and Figure 5). A high correlation coefficient (R) indicates that the predicted and observed flows are close.
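The chronological 80/20 split described above can be sketched as follows. This is an illustrative outline only: the record layout (a dictionary with a `year` key) and the function name `chronological_split` are assumptions, and the real records would also carry the climatic variables and runoff for each year.

```python
def chronological_split(records, last_training_year):
    """Split yearly records into training and validation sets by year.

    Hypothetical helper: no shuffling is applied, so validation years
    always follow the training years.
    """
    train = [rec for rec in records if rec["year"] <= last_training_year]
    test = [rec for rec in records if rec["year"] > last_training_year]
    return train, test


# Placeholder records for the 30-year period 1992-2021 (year only)
records = [{"year": year} for year in range(1992, 2022)]

train, test = chronological_split(records, last_training_year=2015)
print(len(train), len(test), len(train) / len(records))  # 24 6 0.8
```

Splitting by year rather than at random keeps the validation period (2016–2021) strictly after the training period, so the models are assessed on data they have not seen.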
A visual inspection of the scatter plots is also necessary. In Figure 5a, the prediction of the multiple linear regression model (climate-flood model) for Calabar (Equation (25)) is accurate because almost all the points lie on a straight line. The closer the points are to the straight line, the more accurate the prediction. The predicted values are the runoff values computed with the model (Equation (25)). Equation (25) and Figure 5a therefore confirm that an increase in rainfall depth increases runoff by a factor of 1.000 at Calabar station and its environs.
Figure 5b shows that the prediction of the climate-flood model for Eket station (Equation (29)) is likewise accurate, because almost all the points are on the straight line. The same pattern is depicted in Figure 5c, which is the plot of residuals versus predicted values for Uyo station. The points lie on a straight line, indicating that the fit of the model (Equation (27)) is adequate.
Figure 5d shows a slight disparity between the predicted and observed values for Ogoja, as indicated by the offsets of the data points. This does not undermine the accuracy of the model (Equation (28)), since the differences between the predicted and observed values, as shown in Table 15, are trivial.
In addition, a set of one-way ANOVAs was included to complement the model validation process. These were conducted using the predicted and observed runoff values from 2016 to 2021 (Table 15). Results for the four catchment areas are given in Table 16, Table 17, Table 18 and Table 19. In this context, ANOVA serves to assess the fit of the model to the unseen observed data and to determine whether the agreement is statistically significant. The F-statistic values for all gauge stations are many times larger than 1, which suggests that the model results closely match the unseen observed data. In other words, the model can reproduce the observed data with minimal errors. At the 95% confidence level (significance level, α = 0.05), with DF = 1 and DF = 4, the critical F-value is 7.71. The calculated F-statistic values for all gauge stations are greater than the critical value. Therefore, at the 0.05 significance level, the predicted values are in statistically significant agreement with the observed values.
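The comparison against the critical F-value can be illustrated with a short sketch. It assumes that the ANOVA is of the regression type, with one model degree of freedom and n − 2 residual degrees of freedom, which is consistent with DF = 1 and DF = 4 for six validation years. Under that assumption, F follows directly from the correlation coefficient. The function name and the value R = 0.95 are hypothetical and are not taken from Tables 15–19.

```python
def regression_f_statistic(r, n):
    """F-statistic of a simple linear regression from its correlation r and sample size n.

    Assumes one predictor, so DF_model = 1 and DF_residual = n - 2.
    """
    if n < 3:
        raise ValueError("need at least three observations")
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    df_model, df_resid = 1, n - 2
    r_squared = r * r
    f_value = (r_squared / df_model) / ((1.0 - r_squared) / df_resid)
    return f_value, df_model, df_resid


F_CRITICAL = 7.71  # critical value quoted in the text for alpha = 0.05, DF = (1, 4)

# Hypothetical correlation for six validation years (2016-2021)
f_value, df_model, df_resid = regression_f_statistic(r=0.95, n=6)
print(f"F = {f_value:.2f} with DF = ({df_model}, {df_resid}); "
      f"exceeds critical value: {f_value > F_CRITICAL}")
```

Under these assumptions, F exceeds 7.71 only when |R| is above roughly 0.81, so the high correlations reported for the four stations would comfortably pass the test.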
In summary, multicollinearity tests were used to check the presence of any significant correlation between the predictor variables used in the model. In this case, the variance inflation factor (VIF) is used as an indicator. A VIF greater than 5 is considered large enough to suggest collinearity between predictor variables, while a VIF less than 5 indicates the contrary. Multicollinearity was only observed in Ogoja station; however, this did not compromise the accuracy of the model built with the climatic variables from that station.
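The VIF screening summarized above can be sketched as follows. This sketch uses the standard identity that the VIF of predictor j equals the j-th diagonal element of the inverse of the predictor correlation matrix, which is equivalent to 1/(1 − R_j²). All function names and data values are hypothetical, and the sketch is not the software routine used in the study.

```python
import math


def correlation_matrix(columns):
    """Pearson correlation matrix for a list of equal-length predictor columns."""
    n = len(columns[0])
    centered = []
    for col in columns:
        mean = sum(col) / n
        centered.append([v - mean for v in col])
    norms = [math.sqrt(sum(v * v for v in col)) for col in centered]
    if any(norm == 0.0 for norm in norms):
        raise ValueError("a constant predictor has no defined correlation")
    k = len(columns)
    return [[sum(a * b for a, b in zip(centered[i], centered[j])) / (norms[i] * norms[j])
             for j in range(k)] for i in range(k)]


def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    k = len(matrix)
    aug = [row[:] + [1.0 if i == j else 0.0 for j in range(k)]
           for i, row in enumerate(matrix)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("predictors are perfectly collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        for r in range(k):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[k:] for row in aug]


def variance_inflation_factors(columns):
    """Return one VIF per predictor column."""
    inverse = invert(correlation_matrix(columns))
    return [inverse[j][j] for j in range(len(columns))]


# Hypothetical annual values for three predictors (NOT the study data)
names = ["rainfall", "sunshine_hours", "solar_radiation"]
columns = [
    [2100.0, 2500.0, 1900.0, 2300.0, 2700.0, 2000.0],
    [4.1, 3.6, 4.6, 3.9, 3.4, 4.4],
    [15.2, 14.1, 16.3, 14.8, 13.9, 15.9],
]

for name, vif in zip(names, variance_inflation_factors(columns)):
    flag = "collinear (VIF > 5)" if vif > 5.0 else "acceptable"
    print(f"{name}: VIF = {vif:.2f} -> {flag}")
```

A VIF of 5 corresponds to R_j² = 0.8, meaning that 80% of the variance of that predictor is explained by the other predictors.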
There may be relationships between sunshine hours and solar radiation, and between evaporation and temperature or sunshine hours, but these are not considered strong enough to affect the estimation of their unique effects on the dependent variable in the regression analysis. Other forms of correlation analysis were not included because they are considered outside the scope of this article. They are recommended for future investigations to determine the strength of association between the climatic predictor variables.
The primary aim of the study is to investigate the impact of climatic variables on runoff flow over a given time frame, which was carried out using a 30-year data set. The models have been verified for goodness of fit and checked against multicollinearity. To verify the model, its performance was evaluated using the coefficient of determination as an indicator to determine the extent to which the variance in the dependent variable is accounted for by the predictor variables. The model validation was successful, wherein predicted values were shown to match actual values observed at all hydro stations (Ogoja, Uyo, Calabar, and Eket).
3.7. Summary of Model Findings and Implications
The least squares estimates of the regression parameters are calculated from Equation (19). The coefficients are then substituted back into the original equation (Equation (22)) to obtain the regression model, also referred to as the climate-flood model. The yearly runoff at each station is predicted with the climate-flood model using the average annual climatic variables for that station as predictors. Importantly, Equation (22) yields the runoff produced by climatic variables only, as established in this work, without other variables such as soil saturation, anthropogenic activities, land use, and changes in lifestyle. There are changes in climate in the catchment area, and these are driving fluctuations in flooding at the different stations. For instance, at the Calabar station, the regression coefficient for annual rainfall depth has the highest estimated value of 1.000. Our model, therefore, indicates that among the climatic parameters, only annual rainfall has a significant impact on runoff at the 95% confidence level in the Calabar River basin. It also shows that as rainfall increases, runoff increases.
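The two steps described above, estimating the coefficients and substituting them back to predict runoff, can be sketched as follows. Equations (19) and (22) are not reproduced in this section, so the sketch assumes the standard ordinary least squares normal equations, (XᵀX)β = Xᵀy, with an intercept term. All function names and data are hypothetical and synthetic.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(a)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system: predictors are perfectly collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def fit_ols(predictor_rows, response):
    """Return [intercept, b1, b2, ...] from the normal equations (X'X) b = X'y."""
    design = [[1.0] + list(row) for row in predictor_rows]
    k = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(design, response)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict(coefficients, predictor_row):
    """Substitute the fitted coefficients back into the linear model."""
    return coefficients[0] + sum(c * v for c, v in zip(coefficients[1:], predictor_row))


# Synthetic example: response generated exactly as 2 + 3*x1 - 1*x2 (NOT study data)
rows = [(1.0, 2.0), (2.0, 1.0), (3.0, 5.0), (4.0, 3.0), (5.0, 8.0)]
runoff = [2.0 + 3.0 * x1 - 1.0 * x2 for x1, x2 in rows]

coefficients = fit_ols(rows, runoff)
print("coefficients:", [round(c, 6) for c in coefficients])
print("prediction for (6, 4):", round(predict(coefficients, (6.0, 4.0)), 6))
```

In practice, each station would have its own coefficient vector, with one column per climatic variable retained after the multicollinearity check.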
Following our analysis, a new set of climate-flood models (Equations (25) and (27)–(29)) was developed for Calabar, Ogoja, Uyo, and Eket, which together fairly represent the catchment of the Cross River Basin (Akwa Ibom State and Cross River State). The findings suggest that, at the 95% confidence level, the climate-flood model was effective in forecasting the annual runoff at all the stations. The findings also show that the climatic parameters were responsible for 100% of the runoff variability in Calabar (R² = 1.000), 100% in Uyo (R² = 1.000), 98.8% in Ogoja (R² = 0.988), and 99.9% in Eket (R² = 0.999). From the model, annual rainfall depth is the only climate parameter that significantly predicts runoff at the 95% confidence level at Calabar station, while in Ogoja, rainfall depth, temperature, and evaporation significantly predict runoff. In Eket, rainfall depth, relative humidity, solar radiation, and soil temperature are significant predictors of runoff. The model also reveals that rainfall depth and evaporation are significant predictors of runoff in Uyo. The implications of the model findings are as follows:
1. Calabar: Climate parameters are responsible for 100% of the runoff variability in Calabar and its environs. This result is in line with the findings of Ekweme and Agunwamba [30], where climatic variables were responsible for 64.73% of the annual variability of runoff at Adada River in Enugu State, Nigeria. Specifically, rainfall was modelled to be the only climatic parameter that significantly increases runoff. From the model in this study, a rise in rainfall increases runoff by a factor of 1.000. The implication is that changes in rainfall are the major determinant of flooding in the Calabar catchment. With respect to the impact of climate change, this study shows that a decline in rainfall will ultimately lead to less runoff and a consequent reduction in flooding in Calabar. Additionally, the model coefficients for sunshine hours (0.001), evaporation (−0.001), relative humidity (−0.004), soil temperature (−0.001), and solar radiation (−0.002) are small in magnitude, so changes in these variables alter runoff only marginally.
2. Uyo: From our results, climate parameters were responsible for 100% of the runoff variation in Uyo and its environs. Within the model, this means that non-climate predictors do not contribute to runoff. It therefore implies that climate change has an overriding impact on flooding in Uyo. This is consistent with the findings of Babatolu and Akinnubi [41], where 54.4% of the annual runoff variation in the Niger River was attributed to climatic factors. In Uyo, results from this study indicate that an increase in rainfall will increase runoff by a factor of 0.995, while reductions in temperature, evaporation, and solar radiation will increase runoff and flooding in Uyo and its environs.
3. Eket: The climate-flood model for Eket shows that, at the 95% confidence level, climatic variables account for 99.9% of runoff variation in Eket and its environs, while non-climatic factors contribute 0.1% to the runoff and flooding at the station. This aligns with the study by Xu et al. [74], where impacts of climate change and variations in climatic variables accounted for a 26.9% decrease in runoff. Due to the dynamics of climate change around Eket station, an increase in rainfall raises runoff and flooding by a factor of 0.988, while decreases in temperature, evaporation, relative humidity, and solar radiation increase runoff and flooding, with corresponding coefficients of −0.308, −0.041, −0.031, and −0.049, respectively.
4. Ogoja: Findings from our validated model show that at Ogoja, climatic variables account for 98.8% of runoff variation, while non-climatic factors account for 1.2%. Rainfall is the main significant predictor, increasing runoff by a factor of 0.896. Decreases in solar radiation (coefficient −0.530) and temperature (coefficient −1.571) also increase runoff.
The outcome of the study suggests that climate change has impacted runoff and flooding within the Cross River Basin. Average annual flow or channel runoff (streamflow) is a key component of flood hazard assessment. Although complex, there is a relationship between average annual flow and flood hazard, because the average annual flow can be used to determine the probability of occurrence of a flood event and its severity. This can be understood through concepts such as the return period (recurrence interval) and the annual exceedance probability (AEP). The return period is the average time in years between recurrences of a flood of a given magnitude, whereas the AEP is the probability of a flood of a given magnitude occurring in any given year. Generally, a higher average annual flow increases the potential for a flood event because the flow is more likely to overtop the river channel banks. When the relationship between flood frequency and average annual flow is established, the information can be used to carry out a flood risk assessment, which would help identify the areas that are most vulnerable. An illustration of the application of average annual flow in flood risk assessments is shown in Ibrahim et al. [75].
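The link between return period and AEP can be made concrete with a short sketch. It uses the standard relation AEP = 1/T and, assuming that floods in different years are independent, the probability of at least one exceedance in n years, 1 − (1 − AEP)ⁿ. The function names and the 100-year, 30-year example are illustrative and are not results of this study.

```python
def annual_exceedance_probability(return_period_years):
    """AEP for a flood with the given return period T (AEP = 1 / T)."""
    if return_period_years < 1:
        raise ValueError("return period must be at least 1 year")
    return 1.0 / return_period_years


def exceedance_risk(return_period_years, horizon_years):
    """Probability of at least one exceedance within the planning horizon.

    Assumes that flood occurrences in different years are independent.
    """
    aep = annual_exceedance_probability(return_period_years)
    return 1.0 - (1.0 - aep) ** horizon_years


# Illustrative case: a 100-year flood considered over a 30-year horizon
aep = annual_exceedance_probability(100)
risk = exceedance_risk(100, 30)
print(f"AEP = {aep:.2%}, risk over 30 years = {risk:.1%}")  # AEP = 1.00%, risk over 30 years = 26.0%
```

The example shows why a flood with a long return period still matters for planning: a 1% annual chance accumulates to roughly a one-in-four chance over three decades.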
This study presents protocols that can be applied to predict average annual runoff (streamflow) within the defined catchment areas, thus providing one of the vital pieces of information necessary for flood hazard identification and management. Basin-scale empirical relationships between the influencing climatic parameters and flow are derived. Previous research by Agbiji et al. [76] shows that trends in climatic variables are determined by climate change. It is the assertion of this study that if climatic parameters are affected by climate change, then flow will be affected as well, based on the relationships that have been established. It is, hence, possible to predict fluctuations in flow, which is useful in water resource planning, including flood risk management.
4. Conclusions
In this study, the influence of climate change on flooding in the Cross River Basin sub-region was modeled by examining the effects of climatic variables such as rainfall, evaporation, solar radiation, wind speed, relative humidity, and sunshine hours. The results showed that, at the 95% confidence level, these climatic variables accounted for 98.8% of the annual variation in runoff, and hence flooding, in Ogoja, 99.9% in Eket, 100% in Uyo, and 100% in Calabar. According to the multiple regression analysis, the climate-flood models for the Calabar, Uyo, Ogoja, and Eket stations proved statistically significant in predicting yearly runoff and flooding in the basin. The statistical significance results equally demonstrated that the impacts of soil temperature, relative humidity, evaporation, and temperature on yearly runoff were significant. However, solar radiation, sunshine hours, and wind speed were not statistically significant and had only minimal effects on annual runoff/flooding.
These findings provide valuable insights for the Presidential Panel on Climate Change, the Presidential Committee on Flooding, private investors, policymakers, civil engineers, and all other stakeholders in the water sector in developing a basin- or sub-regional-level approach to mitigating the impact of climate change. They demonstrate the potential for runoff alterations brought on by climate change that would necessitate large-scale planning responses. The study also reveals the importance of studies on floods and the hydrological effects they have on water supplies, which are crucial when weighing the costs and other implications of projected climate change.
The selected parameters represent a broad range of influencing climatic variables that are generally observed at hydro stations, and a key objective of the study was to investigate the predominance of these variables in the runoff process. The set of climatic variables used in this research has been the subject of inquiry in other related studies (e.g., [30,32,33,34,36,41,77,78]), either individually or collectively, and has been shown to affect runoff to various extents, thereby forming the basis for their inclusion in this study. The outcome of this research reveals that although a number of climatic variables contribute to runoff in the designated catchment areas, rainfall is by far the most impactful. This study can be extended to cover other input parameters and catchment properties such as land use, topography, infiltration, and soil type to further calibrate the climate-flood model. This is a potential area for future research.